---
title: "Lecture 3: Making choices"
questions:
- "How do I make choices using `if` and `else` statements?"
- "How do I compare values?"
- "How do I save my plots to a PDF file?"
objectives:
- "Save plot(s) in a PDF file."
- "Write conditional statements with `if` and `else`."
- "Correctly evaluate expressions containing `&&` ('and') and `||` ('or')."
keypoints:
- "Save a plot in a pdf file using `pdf(\"name.pdf\")` and stop writing to the pdf file with `dev.off()`."
- "Use `if (condition)` to start a conditional statement, `else if (condition)` to provide additional tests, and `else` to provide a default."
- "The bodies of conditional statements must be surrounded by curly braces `{ }`."
- "Use `==` to test for equality."
- "`X && Y` is only true if both X and Y are true."
- "`X || Y` is true if either X or Y, or both, are true."
source: Rmd
---

<!-- ```{r, include = FALSE}
source("../bin/chunk-options.R")
knitr_fig_path("04-cond-")
``` -->


```{r setup, include=FALSE}
library(knitr)
knitr::opts_chunk$set(echo = TRUE)
```

### Conditionals

To update our function so that it decides whether or not to save its plots, we need to write code that automatically chooses between multiple options. The computer can make these decisions through logical comparisons.

```{r}
num <- 37
num > 100
```

As 37 is not greater than 100, this returns a `FALSE` object. And as you likely guessed, the opposite of `FALSE` is `TRUE`.

```{r}
num < 100
```
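
A comparison produces an ordinary logical value, so it can be stored in a variable and inspected like any other object. A small illustration (the variable name `is_large` is made up for this example):

```{r}
num <- 37
# Store the result of a comparison; `is_large` is a hypothetical name
is_large <- num > 100
is_large
# Comparisons return values of class "logical"
class(is_large)
```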

We pair these logical comparison tools with what R calls a **conditional statement**, and it looks like this:

```{r, results='hold'}
num <- 37
if (num > 100) {
  print("greater")
} else {
  print("not greater")
}
print("done")
```

The second line of this code uses an `if` statement to tell R that we want to make a choice.
If the test that follows is `TRUE`, the body of the `if` (i.e., the lines in the curly braces underneath it) is executed.
If the test is `FALSE`, the body of the `else` is executed instead.
Only one or the other is ever executed:

<!-- <img src="../figures/python-flowchart-conditional.svg" alt="Executing a Conditional" /> -->
![Conditional programming](figures/python-flowchart-conditional.svg)

In the example above, the test `num > 100` returns the value `FALSE`, which is why the code inside the `if` block was skipped and the code inside the `else` statement was run instead.

```{r}
num > 100
```

Conditional statements don't have to include an `else`.
If there isn't one, R simply does nothing if the test is false:

```{r}
num <- 53
if (num > 100) {
  print("num is greater than 100")
}
```

We can also chain several tests together when there are more than two options.
This makes it simple to write a function that returns the sign of a number:

```{r}
sign <- function(num) {
  if (num > 0) {
    return(1)
  } else if (num == 0) {
    return(0)
  } else {
    return(-1)
  }
}

sign(-3)
sign(0)
sign(2/3)
```

Note that in an `else if` statement, the `if` part still needs its own condition in parentheses. A plain `else` never takes a condition: its body runs only when none of the preceding tests is `TRUE`.
Note also that the test for equality uses two equal signs, `==`.
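
The tests in a chain are checked from top to bottom, and only the body of the first test that is `TRUE` is run. The sketch below illustrates this with a made-up function, `size_label`, which is not part of the lesson's code:

```{r}
# `size_label` is a hypothetical helper used only for illustration
size_label <- function(num) {
  if (num > 100) {
    return("large")
  } else if (num > 10) {
    # Reached only when num <= 100, because the first test failed
    return("medium")
  } else {
    return("small")
  }
}

size_label(250)
size_label(37)
size_label(-5)
```

Because the second test is only reached when the first one fails, it does not need to repeat the check `num <= 100`.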

## Other Comparisons
Other tests include:

* greater than or equal to (`>=`),

* less than or equal to (`<=`), and

* not equal to (`!=`).
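
As a quick illustration of these operators, here they are applied to the value 37 used earlier (the values compared against are arbitrary):

```{r}
num <- 37
num >= 37   # greater than or equal to
num <= 36   # less than or equal to
num != 37   # not equal to
num == 37   # equal to
```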

We can also combine tests:

* two ampersands, `&&`, symbolize "and", and

* two vertical bars, `||`, symbolize "or".

`&&` is only true if both parts are true:

```{r}
if (1 > 0 && -1 > 0) {
  print("both parts are true")
} else {
  print("at least one part is not true")
}
```

while `||` is true if either part is true:

```{r}
if (1 > 0 || -1 > 0) {
  print("at least one part is true")
} else {
  print("neither part is true")
}
```

In this case, "either" means "either or both", not "either one or the other but not both".
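
Combined tests are often used to check whether a value lies within a range. The following is a minimal sketch; the function `in_range` and its argument names are invented for this example:

```{r}
# `in_range` is a hypothetical helper used only for illustration
in_range <- function(num, lower, upper) {
  # TRUE only when both comparisons are TRUE
  num >= lower && num <= upper
}

in_range(37, 0, 100)
in_range(137, 0, 100)

# "or" is inclusive: the result is TRUE when both parts are TRUE
TRUE || TRUE
```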

### Saving Automatically Generated Figures

Recall our `analyze` function, which draws three plots for a single data file:

```{r, results='hide'}
analyze <- function(filename) {
  # Plots the average, min, and max inflammation over time.
  # Input is character string of a csv file.
  dat <- read.csv(file = filename, header = FALSE)
  avg_day_inflammation <- apply(dat, 2, mean)
  plot(avg_day_inflammation)
  max_day_inflammation <- apply(dat, 2, max)
  plot(max_day_inflammation)
  min_day_inflammation <- apply(dat, 2, min)
  plot(min_day_inflammation)
}
```

To save these plots in a PDF file, open a PDF device with `pdf()` before calling `analyze`, and close it with `dev.off()` when finished:

```{r eval = FALSE}
pdf("inflammation-01.pdf")
analyze("data/inflammation-01.csv")
dev.off()
```
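
The same pattern works for any plotting code. The sketch below does not need the inflammation data: it writes two simple plots to a temporary file (the name `tmp_pdf` is chosen just for this example) and then checks that the file was created:

```{r}
# Write to a temporary file so the example does not clutter the project;
# `tmp_pdf` is a hypothetical name
tmp_pdf <- tempfile(fileext = ".pdf")

pdf(tmp_pdf)       # open a PDF device: plots now go to the file
plot(1:10)         # drawn in the file, not on screen
plot((1:10)^2)     # each new plot starts a new page
dev.off()          # close the device so the file is complete

file.exists(tmp_pdf)
```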

Now that we know how to have R make decisions based on input values,
let's update `analyze`:

```{r analyze-save}
analyze <- function(filename, output = NULL) {
  # Plots the average, min, and max inflammation over time.
  # Input:
  #    filename: character string of a csv file
  #    output: character string of pdf file for saving
  if (!is.null(output)) {
    pdf(output)
  }
  dat <- read.csv(file = filename, header = FALSE)
  avg_day_inflammation <- apply(dat, 2, mean)
  plot(avg_day_inflammation)
  max_day_inflammation <- apply(dat, 2, max)
  plot(max_day_inflammation)
  min_day_inflammation <- apply(dat, 2, min)
  plot(min_day_inflammation)
  if (!is.null(output)) {
    dev.off()
  }
}
```

We added an argument, `output`, that is set to `NULL` by default.
An `if` statement at the beginning checks `output` to decide whether or not to save the plots to a PDF file, and a matching `if` statement at the end closes the PDF device with `dev.off()` if one was opened.
Let's break it down.
The function `is.null` returns `TRUE` if a variable is `NULL` and `FALSE` otherwise.
The exclamation mark, `!`, stands for "not".
Therefore the body of each `if` statement is only executed if `output` is "not null".

```{r}
output <- NULL
is.null(output)
!is.null(output)
```
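
An argument that defaults to `NULL`, combined with `!is.null()`, is a common way to make part of a function optional. A minimal sketch of the idea outside of plotting, using a made-up function `report`:

```{r}
# `report` is a hypothetical function used only for illustration
report <- function(value, label = NULL) {
  if (!is.null(label)) {
    # Runs only when the caller supplied a label
    return(paste(label, value))
  }
  paste("value:", value)
}

report(37)
report(37, label = "num =")
```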

Now we can use `analyze` interactively, as before,

```{r inflammation-01}
analyze("data/inflammation-01.csv")
```

but also use it to save plots,

```{r results='hide'}
analyze("data/inflammation-01.csv", output = "inflammation-01.pdf")
```
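
One possible refinement, not used in the rest of this lecture, is to register `dev.off()` with `on.exit()` right after opening the device. The device is then closed even if the plotting code stops with an error. The sketch below uses a hypothetical helper, `plot_to_file`, and a temporary file rather than the inflammation data:

```{r}
# `plot_to_file` is a hypothetical helper, not part of the lesson's code
plot_to_file <- function(values, output = NULL) {
  if (!is.null(output)) {
    pdf(output)
    # Registered now, run when the function exits, even after an error
    on.exit(dev.off())
  }
  plot(values)
}

tmp_pdf <- tempfile(fileext = ".pdf")
plot_to_file(1:10, output = tmp_pdf)
file.exists(tmp_pdf)
```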

Before going further, we will create a directory `results` for saving our plots.
It is [good practice](http://swcarpentry.github.io/good-enough-practices-in-scientific-computing/) in data analysis projects to save all output to a directory separate from the data and analysis code.
```{r warning = FALSE}
# create a new folder using R
dir.create("results")
```
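
`dir.create` issues a warning when the directory already exists. A conditional statement can skip the call in that case. The sketch below assumes a temporary location, `demo_dir`, as a stand-in for the project's `results` folder:

```{r}
# A temporary location stands in for the project folder in this sketch;
# `demo_dir` is a hypothetical name
demo_dir <- file.path(tempdir(), "results-demo")

# Create the directory only if it is not already there
if (!dir.exists(demo_dir)) {
  dir.create(demo_dir)
}
dir.exists(demo_dir)
```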

Now run `analyze` and save the plots in the `results` directory,
```{r results='hide'}
analyze("data/inflammation-01.csv", output = "results/inflammation-01.pdf")
```

This now works well when we want to process one data file at a time, but how can we specify the output file in `analyze_all`?
We need to do two things:

1. Substitute the filename ending "csv" with "pdf".
2. Save the plot to the `results` directory.

To change the extension to "pdf", we will use the function `sub`,
```{r}
f <- "inflammation-01.csv"
sub("csv", "pdf", f)
```
To add the "results" directory to the filename, use the function `file.path`,
```{r}
file.path("results", sub("csv", "pdf", f))
```
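
Note that `sub("csv", "pdf", f)` replaces the first occurrence of "csv" anywhere in the name. That is fine for the inflammation files, but a name that contains "csv" elsewhere would be changed in the wrong place. A more defensive variant, shown here with a made-up filename, anchors the pattern to the end of the name:

```{r}
# A made-up filename in which "csv" also appears before the extension
f2 <- "csv-export-01.csv"

sub("csv", "pdf", f2)        # replaces the first "csv" it finds
sub("\\.csv$", ".pdf", f2)   # replaces only a trailing ".csv"
```

In the pattern `"\\.csv$"`, `\\.` matches a literal dot and `$` matches the end of the string.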

Now let's update `analyze_all`:

```{r analyze_all-save}
analyze_all <- function(pattern) {
  # Directory name containing the data
  data_dir <- "data"
  # Directory name for results
  results_dir <- "results"
  # Runs the function analyze for each file in the data directory
  # whose name matches the given pattern.
  filenames <- list.files(path = data_dir, pattern = pattern)
  for (f in filenames) {
    pdf_name <- file.path(results_dir, sub("csv", "pdf", f))
    analyze(file.path(data_dir, f), output = pdf_name)
  }
}
```
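
To see what the loop does without producing any plots, the name-building step can be run on its own. This sketch assumes two example filenames in place of the real output of `list.files`:

```{r}
# Two example filenames standing in for the output of list.files()
example_files <- c("inflammation-01.csv", "inflammation-02.csv")

pdf_names <- character(0)
for (f in example_files) {
  # Same expression as inside analyze_all
  pdf_names <- c(pdf_names, file.path("results", sub("csv", "pdf", f)))
}
pdf_names
```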

Now we can save all of the results with just one line of code:

```{r}
analyze_all("inflammation.*csv")
```

Now if we need to make any changes to our analysis, we can edit the `analyze` function and quickly regenerate all the figures with `analyze_all`.

## Exercise: Changing the Behavior of the Plot Command

One of your collaborators asks if you can recreate the figures with lines instead of points.
Find the relevant argument to `plot` by reading the documentation (`?plot`),
update `analyze`, and then recreate all the figures with `analyze_all`.

```{r knit_exit, include=F, echo=F}
knit_exit()
```