-
Notifications
You must be signed in to change notification settings - Fork 15
/
Copy pathimport.qmd
133 lines (97 loc) · 5.06 KB
/
import.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
---
title: "Import Data"
date-modified: 'today'
date-format: long
license: CC BY-NC
bibliography: references.bib
---
::: callout-warning
## Did you start with a Project?
Reproducibility starts at the [foundation](proj.html#reproducibility)! Always begin by opening a project. See [New Projects](proj.html) for tips.
We **recommend** entering your code in **code-chunks** within [**coding notebooks**](quarto.html#notebooks---recomended).
:::
In this workshop we recommend the [Tidyverse](https://tidyverse.org) approach to learning and using R
Below are some of the core [*tidyverse*](https://tidyverse.tidyverse.org/) packages that are loaded with the function: `library(tidyverse)`.
| package | use | package | use |
|--------------|------------------------|--------------|---------------------|
| `dplyr` | data wrangling | `forcats` | categorical data / factors |
| `ggplot2` | visualization | `lubridate` | dates and times |
| `readr` | import CSV | `stringr` | regular expressions / strings |
| `purrr` | iteration / functional programing | `tidyr` | pivot data |
| `readxl` | import Excel files | `haven` | import SPSS/Stata/SAS |
## Data import wizard
The data import wizard is a quick and easy way to import your data
[![Import dataset](images/data_import.jpg){fig-alt="import dataset"}](images/data_import.jpg)
It's actually way **better** to follow the **reproducible steps** -- and hardly any more effort -- below...
## Load library packages
Open a Quarto document, Insert a code-chunk (Ctrl-Alt-I) and copy the following code. Then execute the code. You may first have to [install the tidyverse-package](packages.html#install-packages) if you have not already[^1].
[^1]: In R, a package is a collection of R functions, and/or data, and/or documentation. R users find and install packages via centralized package-hubs (e.g. [Metacran](https://www.r-pkg.org/), CRAN, [Bioconductor](https://bioconductor.org/), [R-universe](https://r-universe.dev/search/), Github) to aid in the specialization and efficiency of R coding.
```{r}
#| warning: false
#| message: false
library(tidyverse)
```
## Import data
In RStudio, in the Files quadrant and tab, click the `data` folder, then left-click the `brodhead_center.csv` file. Using the context menu, choose the *Import Dataset...* option. Once inside the data wizard, you can copy the code int he code-preview window, then paste the code into the code chunk of your quarto document or r script.
::: column-margin
```{=html}
<iframe width="300" height="200" src="https://www.youtube.com/embed/BKDkj7I4L-Y" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
```
:::
```{r}
#| label: import-data
#| warning: false
#| message: false
# library(readr)
brodhead_center <- read_csv("data/brodhead_center.csv")
# view(brodheadCenter)
```
::: callout-tip
## Composing the data import code...
Writing the import data function can be tricky. Try the import wizard pictured above. THEN, paste the code from the **Code Preview** section into your script.
[![Easily write import data function](images/data_import_code.jpg){fig-alt="Easily write import data function" width="800"}](images/data_import_code.jpg)
:::
## Excel, SPSS, SAS, etc.
The data import wizard will help you find the proper package for importing your data. For example, use...
- `library(readxl)` for Excel data
- `library(haven)` for SPSS, SAS, Stata
- `lirary(readr)` for CSV or other delimeters
Just start with `File > Import Dataset` to get started composing that code, then paste your code into a script.
## Look at the data object
Now that you've [assigned](index.htm#pipes-and-assignment "Assign and object with <-") the output from the `read_csv` function to the name `brodhead_center`, simply call that object name in a code chunk.
```{r}
brodhead_center
```
## Visualize your data with {[ggplot2](https://ggplot2.tidyverse.org)}
Here's a quick teaser on visualizing data. Read more in the [visualization chapter](viz.html).
```{r}
#| warning: false
#| message: false
brodhead_center |>
ggplot(aes(x = name, y = cost)) +
geom_boxplot()
```
```{r}
#| warning: false
#| message: false
brodhead_center |>
filter(name != "Tandoor") |>
ggplot(aes(x = rating, y = cost)) +
geom_jitter(aes(color = name))
```
```{r}
#| code-fold: true
#| code-summary: "Show the code"
brodhead_center |>
drop_na(rating, cost, name) |>
filter(name != "Tandoor") |>
ggplot(aes(x = factor(rating), y = cost)) +
geom_tile(aes(fill = name), alpha = .3) +
scale_y_continuous(label = scales::dollar) +
scale_fill_brewer(palette = "Dark2") +
labs(x = "rating", y = NULL, title = "Heatmap: cost over ratings",
caption = "Source: https://github.com/data-and-visualization/Intro2R",
fill = "Restaurant name") +
theme_classic() +
theme(plot.title.position = "plot")
```