---
title: "Lecture 3: Making choices"
questions:
- "How do I make choices using `if` and `else` statements?"
- "How do I compare values?"
- "How do I save my plots to a PDF file?"
objectives:
- "Save plot(s) in a PDF file."
- "Write conditional statements with `if` and `else`."
- "Correctly evaluate expressions containing `&&` ('and') and `||` ('or')."
keypoints:
- "Save a plot in a pdf file using `pdf(\"name.pdf\")` and stop writing to the pdf file with `dev.off()`."
- "Use `if (condition)` to start a conditional statement, `else if (condition)` to provide additional tests, and `else` to provide a default."
- "The bodies of conditional statements must be surrounded by curly braces `{ }`."
- "Use `==` to test for equality."
- "`X && Y` is only true if both X and Y are true."
- "`X || Y` is true if either X or Y, or both, are true."
source: Rmd
---
<!-- ```{r, include = FALSE}
source("../bin/chunk-options.R")
knitr_fig_path("04-cond-")
``` -->
```{r setup, include=FALSE}
library(knitr)
knitr::opts_chunk$set(echo = TRUE)
```
### Conditionals
To update our function so that it can decide whether or not to save its plots, we need to write code that automatically chooses between multiple options. The computer makes these decisions through logical comparisons.
```{r}
num <- 37
num > 100
```
As 37 is not greater than 100, this returns a `FALSE` object. And as you likely guessed, the opposite of `FALSE` is `TRUE`.
```{r}
num < 100
```
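The result of a comparison is an ordinary value that can be stored and reused. The sketch below uses an invented variable, `temperature`, purely for illustration.

```{r}
temperature <- 37              # hypothetical example value
is_high <- temperature > 100   # store the result of the comparison
is_high                        # FALSE
class(is_high)                 # "logical"
```

Storing the result this way can make a later test easier to read, because the name says what is being checked.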
We pair these logical comparison tools with what R calls a **conditional statement**, and it looks like this:
```{r, results='hold'}
num <- 37
if (num > 100) {
  print("greater")
} else {
  print("not greater")
}
print("done")
```
The second line of this code uses an `if` statement to tell R that we want to make a choice.
If the following test is `TRUE`, the body of the `if` (i.e., the lines in the curly braces underneath it) is executed.
If the test is `FALSE`, the body of the `else` is executed instead.
Only one or the other is ever executed:
<!-- <img src="../figures/python-flowchart-conditional.svg" alt="Executing a Conditional" /> -->
![Conditional programming](figures/python-flowchart-conditional.svg)
In the example above, the test `num > 100` returns the value `FALSE`, which is why the code inside the `if` block was skipped and the code inside the `else` statement was run instead.
```{r}
num > 100
```
Conditional statements don't have to include an `else`.
If there isn't one, R simply does nothing if the test is false:
```{r}
num <- 53
if (num > 100) {
  print("num is greater than 100")
}
```
We can also chain several tests together when there are more than two options.
This makes it simple to write a function that returns the sign of a number:
```{r}
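# Note: this definition masks R's built-in sign() while it is defined.
sign <- function(num) {
  # Returns 1 for positive numbers, 0 for zero, and -1 for negative numbers.
  if (num > 0) {
    return(1)
  } else if (num == 0) {
    return(0)
  } else {
    return(-1)
  }
}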
sign(-3)
sign(0)
sign(2/3)
```
Note that in an `else if` branch, the `if` part still needs its own condition in parentheses. A plain `else` never takes a condition: its body runs only when none of the preceding conditions is `TRUE`.
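When several tests are chained, R checks them from top to bottom and runs only the first branch whose test is `TRUE`. The sketch below shows this with an invented function, `size_label`, whose cut-offs are arbitrary.

```{r}
size_label <- function(n) {
  if (n > 100) {
    return("large")
  } else if (n > 10) {
    # reached only when n > 100 was FALSE
    return("medium")
  } else {
    return("small")
  }
}
size_label(500)   # "large": n > 10 is also TRUE, but that test is never reached
size_label(50)    # "medium"
size_label(5)     # "small"
```

Because only the first matching branch runs, the order of the tests matters: putting `n > 10` first would label 500 as "medium".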
Note that the test for equality uses two equal signs, `==`.
## Other Comparisons
Other tests include:
* greater than or equal to (`>=`),
* less than or equal to (`<=`), and
* not equal to (`!=`).
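
As a quick illustration of these operators, the sketch below applies each one to a made-up value, `dose`, which is not part of the lesson data.

```{r}
dose <- 100    # hypothetical value
dose >= 100    # TRUE: equal counts as "greater than or equal to"
dose <= 99     # FALSE
dose != 100    # FALSE: dose is equal to 100
dose != 50     # TRUE
```
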
We can also combine tests:
* two ampersands, `&&`, symbolize "and", and
* two vertical bars, `||`, symbolize "or".
`&&` is only true if both parts are true:
```{r}
if (1 > 0 && -1 > 0) {
  print("both parts are true")
} else {
  print("at least one part is not true")
}
```
while `||` is true if either part is true:
```{r}
if (1 > 0 || -1 > 0) {
  print("at least one part is true")
} else {
  print("neither part is true")
}
```
In this case, "either" means "either or both", not "either one or the other but not both".
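A common use of `&&` is a range check. The following sketch defines a hypothetical helper, `in_range` (a name invented for this example), and then lists the three distinct input combinations for `||` to show that "or" is inclusive.

```{r}
# Hypothetical helper: is x between low and high (inclusive)?
in_range <- function(x, low, high) {
  x >= low && x <= high
}
in_range(5, 0, 10)    # TRUE: both parts are true
in_range(15, 0, 10)   # FALSE: the second part is false

TRUE || TRUE      # TRUE
TRUE || FALSE     # TRUE
FALSE || FALSE    # FALSE
```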
### Saving Automatically Generated Figures
```{r, results='hide'}
analyze <- function(filename) {
  # Plots the average, min, and max inflammation over time.
  # Input is character string of a csv file.
  dat <- read.csv(file = filename, header = FALSE)
  avg_day_inflammation <- apply(dat, 2, mean)
  plot(avg_day_inflammation)
  max_day_inflammation <- apply(dat, 2, max)
  plot(max_day_inflammation)
  min_day_inflammation <- apply(dat, 2, min)
  plot(min_day_inflammation)
}
```
To save these plots to a file instead of displaying them, open a PDF file with `pdf()`, run the plotting code, and then stop writing to the file with `dev.off()`:
```{r eval = FALSE}
pdf("inflammation-01.pdf")
analyze("data/inflammation-01.csv")
dev.off()
```
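The same pattern can be tried without the inflammation data. This is a minimal sketch that writes to a temporary file (an assumption made here so that the working directory stays clean); the name `demo_pdf` and the plotted values are invented for illustration.

```{r}
demo_pdf <- tempfile(fileext = ".pdf")   # hypothetical output file
pdf(demo_pdf)           # open the PDF file; plots are now written to it
plot(1:10)              # first page of the PDF
plot((1:10)^2)          # a second plot becomes a second page
dev.off()               # stop writing to the PDF file
file.exists(demo_pdf)   # TRUE
```

Every plot made between `pdf()` and `dev.off()` goes into the file, which is why forgetting `dev.off()` can leave later plots missing from the screen.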
Now that we know how to have R make decisions based on input values,
let's update `analyze`:
```{r analyze-save}
analyze <- function(filename, output = NULL) {
  # Plots the average, min, and max inflammation over time.
  # Input:
  #    filename: character string of a csv file
  #    output: character string of pdf file for saving
  if (!is.null(output)) {
    pdf(output)
  }
  dat <- read.csv(file = filename, header = FALSE)
  avg_day_inflammation <- apply(dat, 2, mean)
  plot(avg_day_inflammation)
  max_day_inflammation <- apply(dat, 2, max)
  plot(max_day_inflammation)
  min_day_inflammation <- apply(dat, 2, min)
  plot(min_day_inflammation)
  if (!is.null(output)) {
    dev.off()
  }
}
```
We added an argument, `output`, that is set to `NULL` by default.
An `if` statement at the beginning checks `output` to decide whether to open a PDF file for the plots, and a matching `if` at the end closes that file with `dev.off()`.
Let's break it down.
The function `is.null` returns `TRUE` if a variable is `NULL` and `FALSE` otherwise.
The exclamation mark, `!`, stands for "not".
Therefore the line in the `if` block is only executed if `output` is "not null".
```{r}
output <- NULL
is.null(output)
!is.null(output)
```
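The same `NULL`-default pattern works for any optional argument. Below is a small sketch using a made-up function, `describe`, with an optional `label` argument; neither name comes from the lesson.

```{r}
describe <- function(value, label = NULL) {
  if (!is.null(label)) {
    # runs only when the caller supplied a label
    return(paste(label, value))
  }
  return(paste("value:", value))
}
describe(3)                  # "value: 3"
describe(3, label = "day")   # "day 3"
```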
Now we can use `analyze` interactively, as before,
```{r inflammation-01}
analyze("data/inflammation-01.csv")
```
but also use it to save plots,
```{r results='hide'}
analyze("data/inflammation-01.csv", output = "inflammation-01.pdf")
```
Before going further, we will create a directory `results` for saving our plots.
It is [good practice](http://swcarpentry.github.io/good-enough-practices-in-scientific-computing/) in data analysis projects to save all output to a directory separate from the data and analysis code.
```{r warning = FALSE}
# create a new folder using R
dir.create("results")
```
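A conditional can also guard the directory creation so that no warning is raised when the directory already exists. This sketch works inside R's temporary directory (an assumption, to avoid touching the project folder); `demo_dir` is a name invented here.

```{r}
demo_dir <- file.path(tempdir(), "results-demo")   # hypothetical location
if (!dir.exists(demo_dir)) {
  # only create the directory when it is missing
  dir.create(demo_dir)
}
dir.exists(demo_dir)   # TRUE
```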
Now run `analyze` and save the plot in the `results` directory,
```{r results='hide'}
analyze("data/inflammation-01.csv", output = "results/inflammation-01.pdf")
```
This now works well when we want to process one data file at a time, but how can we specify the output file in `analyze_all`?
We need to do two things:
1. Substitute the filename ending "csv" with "pdf".
2. Save the plot to the `results` directory.
To change the extension to "pdf", we will use the function `sub`,
```{r}
f <- "inflammation-01.csv"
sub("csv", "pdf", f)
```
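`sub` replaces the first match of its pattern wherever it occurs, which is fine for the inflammation files. As a hedged aside, if a file name happened to contain "csv" earlier in the name, anchoring the pattern to the end of the string would be safer. The file name below is invented to show the difference.

```{r}
tricky_name <- "csv-export-01.csv"       # hypothetical file name
sub("csv", "pdf", tricky_name)           # "pdf-export-01.csv": the first match is replaced
sub("\\.csv$", ".pdf", tricky_name)      # "csv-export-01.pdf": only the extension changes
```

In the second call, `\\.` matches a literal dot and `$` matches the end of the string.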
To add the "results" directory to the filename, use the function `file.path`,
```{r}
file.path("results", sub("csv", "pdf", f))
```
Now let's update `analyze_all`:
```{r analyze_all-save}
analyze_all <- function(pattern) {
  # Directory name containing the data
  data_dir <- "data"
  # Directory name for results
  results_dir <- "results"
  # Runs the function analyze for each file in data_dir
  # whose name matches the given pattern.
  filenames <- list.files(path = data_dir, pattern = pattern)
  for (f in filenames) {
    pdf_name <- file.path(results_dir, sub("csv", "pdf", f))
    analyze(file.path(data_dir, f), output = pdf_name)
  }
}
```
Now we can save all of the results with just one line of code:
```{r}
analyze_all("inflammation.*csv")
```
Now if we need to make any changes to our analysis, we can edit the `analyze` function and quickly regenerate all the figures with `analyze_all`.
## Exercise: Changing the Behavior of the Plot Command
One of your collaborators asks if you can recreate the figures with lines instead of points.
Find the relevant argument to `plot` by reading the documentation (`?plot`),
update `analyze`, and then recreate all the figures with `analyze_all`.
```{r knit_exit, include=F, echo=F}
knit_exit()
```