From be663866c52143cecbbdaac559fb86dfebe7c855 Mon Sep 17 00:00:00 2001
From: wangyubo <1179041704@qq.com>
Date: Tue, 26 Sep 2023 10:36:39 -0700
Subject: [PATCH 1/2] fix exercise6 bugs

---
 collaborative-group24.Rproj |  13 ---
 troubleshooting-2.Rmd       | 186 ++++++++++++++++++++++++++++++++++++
 2 files changed, 186 insertions(+), 13 deletions(-)
 delete mode 100644 collaborative-group24.Rproj
 create mode 100644 troubleshooting-2.Rmd

diff --git a/collaborative-group24.Rproj b/collaborative-group24.Rproj
deleted file mode 100644
index 8e3c2eb..0000000
--- a/collaborative-group24.Rproj
+++ /dev/null
@@ -1,13 +0,0 @@
-Version: 1.0
-
-RestoreWorkspace: Default
-SaveWorkspace: Default
-AlwaysSaveHistory: Default
-
-EnableCodeIndexing: Yes
-UseSpacesForTab: Yes
-NumSpacesForTab: 2
-Encoding: UTF-8
-
-RnwWeave: Sweave
-LaTeX: pdfLaTeX
diff --git a/troubleshooting-2.Rmd b/troubleshooting-2.Rmd
new file mode 100644
index 0000000..c14d323
--- /dev/null
+++ b/troubleshooting-2.Rmd
@@ -0,0 +1,186 @@
+---
+title: "Team Troubleshooting Deliverable 2"
+output: github_document
+---
+
+```{r include = FALSE}
+knitr::opts_chunk$set(error = TRUE)
+```
+
+There are **11 code chunks with errors** in this Rmd. Your objective is to fix all of the errors in this worksheet. For the purpose of grading, each erroneous code chunk is equally weighted.
+
+Note that errors are not all syntactic (i.e., broken code)! Some are logical errors as well (i.e., code that does not do what it was intended to do).
+
+## Exercise 1: Exploring with `select()` and `filter()`
+
+[MovieLens](https://dl.acm.org/doi/10.1145/2827872) are a series of datasets widely used in education, that describe movie ratings from the MovieLens [website](https://movielens.org/). There are several MovieLens datasets, collected by the [GroupLens Research Project](https://grouplens.org/datasets/movielens/) at the University of Minnesota. Here, we load the MovieLens 100K dataset from Rafael Irizarry and Amy Gill's R package, [dslabs](https://cran.r-project.org/web/packages/dslabs/dslabs.pdf), which contains datasets useful for data analysis practice, homework, and projects in data science courses and workshops. We'll also load other required packages.
+
+```{r}
+### ERROR HERE ###
+load.packages(dslabs)
+load.packages(tidyverse)
+load.packages(stringr)
+install.packages("devtools") # Do not run this if you already have this package installed! 
+devtools::install_github("JoeyBernhardt/singer")
+load.packages(gapminder)
+```
+
+Let's have a look at the dataset! My goal is to:
+
+-   Find out the "class" of the dataset.
+-   If it isn't a tibble already, coerce it into a tibble and store it in the variable "movieLens".
+-   Have a quick look at the tibble, using a *dplyr function*.
+
+```{r}
+### ERROR HERE ###
+class(dslabs::movielens)
+movieLens <- as_tibble(dslabs::movielens)
+dim(movieLens)
+```
+
+Now that we've had a quick look at the dataset, it would be interesting to explore the rows (observations) in some more detail. I'd like to consider the movie entries that...
+
+-   belong *exclusively* to the genre *"Drama"*;
+-   don't belong *exclusively* to the genre *"Drama"*;
+-   were filmed *after* the year 2000;
+-   were filmed in 1999 *or* 2000;
+-   have *more than* 4.5 stars, and were filmed *before* 1995.
+
+```{r}
+### ERROR HERE ###
+filter(movieLens, genres == "Drama")
+filter(movieLens, !genres == "Drama")
+filter(movieLens, year >= 2000)
+filter(movieLens, year == 1999 | month == 2000)
+filter(movieLens, rating > 4.5, year < 1995)
+```
+
+While filtering for *all movies that do not belong to the genre drama* above, I noticed something interesting. I want to filter for the same thing again, this time selecting variables **title and genres first,** and then *everything else*. But I want to do this in a robust way, so that (for example) if I end up changing `movieLens` to contain more or less columns some time in the future, the code will still work. Hint: there is a function to select "everything else"...
+
+```{r}
+### ERROR HERE ###
+movieLens %>%
+  filter(!genres == "Drama") %>%
+  select(title, genres, year, rating, timestamp)
+```
+
+## Exercise 2: Calculating with `mutate()`-like functions
+
+Some of the variables in the `movieLens` dataset are in *camelCase* (in fact, *movieLens* is in camelCase). Let's clean these two variables to use *snake_case* instead, and assign our post-rename object back to "movieLens".
+
+```{r}
+### ERROR HERE ###
+movieLens <- movieLens %>%
+  rename(user_id == userId,
+         movie_id == movieId)
+```
+
+As you already know, `mutate()` defines and inserts new variables into a tibble. There is *another mystery function similar to `mutate()`* that adds the new variable, but also drops existing ones. I wanted to create an `average_rating` column that takes the `mean(rating)` across all entries, and I only want to see that variable (i.e drop all others!) but I forgot what that mystery function is. Can you remember?
+
+```{r}
+### ERROR HERE ### 
+mutate(movieLens,
+       average_rating = mean(rating))
+```
+
+## Exercise 3: Calculating with `summarise()`-like functions
+
+Alone, `tally()` is a short form of `summarise()`. `count()` is short-hand for `group_by()` and `tally()`.
+
+Each entry of the movieLens table corresponds to a movie rating by a user. Therefore, if more than one user rated the same movie, there will be several entries for the same movie. I want to find out how many times each movie has been reviewed, or in other words, how many times each movie title appears in the dataset.
+
+```{r}
+movieLens %>%
+  group_by(title) %>%
+  tally()
+```
+
+Without using `group_by()`, I want to find out how many movie reviews there have been for each year.
+
+```{r}
+### ERROR HERE ###
+movieLens %>%
+  tally(year)
+```
+
+Both `count()` and `tally()` can be grouped by multiple columns. Below, I want to count the number of movie reviews by title and rating, and sort the results.
+
+```{r}
+### ERROR HERE ###
+movieLens %>%
+  count(c(title, rating), sort = TRUE)
+```
+
+Not only do `count()` and `tally()` quickly allow you to count items within your dataset, `add_tally()` and `add_count()` are handy shortcuts that add an additional column to your tibble, rather than collapsing each group.
+
+## Exercise 4: Calculating with `group_by()`
+
+We can calculate the mean rating by year, and store it in a new column called `avg_rating`:
+
+```{r}
+movieLens %>%
+  group_by(year) %>%
+  summarize(avg_rating = mean(rating))
+```
+
+Using `summarize()`, we can find the minimum and the maximum rating by title, stored under columns named `min_rating`, and `max_rating`, respectively.
+
+```{r}
+### ERROR HERE ###
+movieLens %>%
+  mutate(min_rating = min(rating), 
+         max_rating = max(rating))
+```
+
+## Exercise 5: Scoped variants with `across()`
+
+`across()` is a newer dplyr function (`dplyr` 1.0.0) that allows you to apply a transformation to multiple variables selected with the `select()` and `rename()` syntax. For this section, we will use the `starwars` dataset, which ships with `dplyr`. First, let's transform it into a tibble and store it under the variable `starWars`.
+
+```{r}
+starWars <- as_tibble(starwars)
+```
+
+We can find the mean for all columns that are numeric, ignoring the missing values:
+
+```{r}
+starWars %>%
+  summarise(across(where(is.numeric), function(x) mean(x, na.rm=TRUE)))
+```
+
+We can find the minimum height and mass within each species, ignoring the missing values: 
+
+```{r}
+### ERROR HERE ###
+starWars %>%
+  group_by(species) %>%
+  summarise(across("height", "mass", function(x) min(x, na.rm=TRUE)))
+```
+
+Note that here R has taken the convention that, once missing values are removed, the minimum of a group containing only `NA`s is `Inf` (R also emits a warning in this case).
+
+## Exercise 6: Making tibbles
+
+Manually create a tibble with 4 columns:
+
+-   `birth_year` should contain years 1998 to 2005 (inclusive);
+-   `birth_weight` should take the `birth_year` column, subtract 1995, and multiply by 0.45;
+-   `birth_location` should contain three locations (Liverpool, Seattle, and New York).
+
+```{r}
+### ERROR HERE ###
+fakeStarWars <- tribble(
+  ~name,            ~birth_weight,  ~birth_year, ~birth_location
+  "Luke Skywalker",  1.35      ,   1998        ,  Liverpool, England,
+  "C-3PO"         ,  1.80      ,   1999        ,  Liverpool, England,
+  "R2-D2"         ,  2.25      ,   2000        ,  Seattle, WA,
+  "Darth Vader"   ,  2.70      ,   2001        ,  Liverpool, England,
+  "Leia Organa"   ,  3.15      ,   2002        ,  New York, NY,
+  "Owen Lars"     ,  3.60      ,   2003        ,  Seattle, WA,
+  "Beru Whitesun Iars", 4.05   ,   2004        ,  Liverpool, England,
+  "R5-D4"         ,  4.50      ,   2005        ,  New York, NY,
+)
+```
+
+## Attributions
+
+Thanks to Icíar Fernández-Boyano for writing most of this document, and Albina Gibadullina, Diana Lin, Yulia Egorova, and Vincenzo Coia for their edits.
\ No newline at end of file

From 284d2e05da46d1b2b7a31359d035f629fffed42d Mon Sep 17 00:00:00 2001
From: wangyubo <1179041704@qq.com>
Date: Wed, 27 Sep 2023 17:02:12 -0700
Subject: [PATCH 2/2] do exercise 6 with knit, add md file and remove ###ERROR
 HERE### headers

---
 troubleshooting-2.Rmd |  26 ++--
 troubleshooting-2.md  | 327 ++++++++++++++++++++++++++++++------------
 2 files changed, 250 insertions(+), 103 deletions(-)

diff --git a/troubleshooting-2.Rmd b/troubleshooting-2.Rmd
index 5e33f64..37d3f14 100644
--- a/troubleshooting-2.Rmd
+++ b/troubleshooting-2.Rmd
@@ -15,7 +15,7 @@ Note that errors are not all syntactic (i.e., broken code)! Some are logical err
 
 [MovieLens](https://dl.acm.org/doi/10.1145/2827872) are a series of datasets widely used in education, that describe movie ratings from the MovieLens [website](https://movielens.org/). There are several MovieLens datasets, collected by the [GroupLens Research Project](https://grouplens.org/datasets/movielens/) at the University of Minnesota. Here, we load the MovieLens 100K dataset from Rafael Irizarry and Amy Gill's R package, [dslabs](https://cran.r-project.org/web/packages/dslabs/dslabs.pdf), which contains datasets useful for data analysis practice, homework, and projects in data science courses and workshops. We'll also load other required packages.
 
-```{r}
+```{r eval=FALSE, include=FALSE}
 # Changed previous install.packages to have ""
 install.packages("dslabs")
 install.packages("tidyverse")
@@ -24,6 +24,12 @@ install.packages("devtools") # Do not run this if you already have this package
 devtools::install_github("JoeyBernhardt/singer")
 install.packages("gapminder")
 ```
+```{r message=FALSE, warning=FALSE}
+library("dslabs")
+library("tidyverse")
+library("stringr")
+library("gapminder")
+```
 
 Let's have a look at the dataset! My goal is to:
 
@@ -32,7 +38,7 @@ Let's have a look at the dataset! My goal is to:
 -   Have a quick look at the tibble, using a *dplyr function*.
 
 ```{r}
-### ERROR HERE ###
+
 class(dslabs::movielens)
 movieLens <- as_tibble(dslabs::movielens)
 dim(movieLens)
@@ -50,7 +56,7 @@ Now that we've had a quick look at the dataset, it would be interesting to explo
 -   have *more than* 4.5 stars, and were filmed *before* 1995.
 
 ```{r}
-### ERROR HERE ###
+
 filter(movieLens, genres == "Drama")
 # Changed this 
 #filter(movieLens, !genres == "Drama")
@@ -73,7 +79,7 @@ filter(movieLens, rating > 4.5, year < 1995)
 While filtering for *all movies that do not belong to the genre drama* above, I noticed something interesting. I want to filter for the same thing again, this time selecting variables **title and genres first,** and then *everything else*. But I want to do this in a robust way, so that (for example) if I end up changing `movieLens` to contain more or less columns some time in the future, the code will still work. Hint: there is a function to select "everything else"...
 
 ```{r}
-### ERROR HERE ###
+
 movieLens %>%
   # Changed this
   #filter(!genres == "Drama") %>%
@@ -90,7 +96,7 @@ movieLens %>%
 Some of the variables in the `movieLens` dataset are in *camelCase* (in fact, *movieLens* is in camelCase). Let's clean these two variables to use *snake_case* instead, and assign our post-rename object back to "movieLens".
 
 ```{r}
-### ERROR HERE ###
+
 movieLens <- movieLens %>%
   # Changed this
   #rename(user_id == userId,
@@ -106,7 +112,7 @@ head(movielens)
 As you already know, `mutate()` defines and inserts new variables into a tibble. There is *another mystery function similar to `mutate()`* that adds the new variable, but also drops existing ones. I wanted to create an `average_rating` column that takes the `mean(rating)` across all entries, and I only want to see that variable (i.e drop all others!) but I forgot what that mystery function is. Can you remember?
 
 ```{r}
-### ERROR HERE ### 
+ 
 # Most likely, the prompt of the question refers to transmute, which "creates a new data frame containing only the specified computations"
 # (see: https://dplyr.tidyverse.org/reference/transmute.html)
 # Changed this
@@ -132,7 +138,7 @@ movieLens %>%
 Without using `group_by()`, I want to find out how many movie reviews there have been for each year.
 
 ```{r}
-### ERROR HERE ###
+
 #movieLens %>%
 #  tally(year)
 #Changed to 
@@ -143,7 +149,7 @@ movieLens %>%   # Tally is used for grouped data, count is a short-hand for grou
 Both `count()` and `tally()` can be grouped by multiple columns. Below, I want to count the number of movie reviews by title and rating, and sort the results.
 
 ```{r}
-### ERROR HERE ###
+
 #movieLens %>%
 #  count(c(title, rating), sort = TRUE)
 # changed to: 
@@ -166,7 +172,7 @@ movieLens %>%
 Using `summarize()`, we can find the minimum and the maximum rating by title, stored under columns named `min_rating`, and `max_rating`, respectively.
 
 ```{r}
-### ERROR HERE ###
+
 #movieLens %>%
 #  mutate(min_rating = min(rating), 
 #         max_rating = max(rating))
@@ -214,7 +220,7 @@ Manually create a tibble with 4 columns:
 -   Modification: add *,* after `birth_location`, add *""* for birth_location value.
 
 ```{r}
-### ERROR HERE ###
+
 fakeStarWars <- tribble(
   ~name,            ~birth_weight,  ~birth_year, ~birth_location,
   "Luke Skywalker",  1.35      ,   1998        ,  "Liverpool, England",
diff --git a/troubleshooting-2.md b/troubleshooting-2.md
index 73f5a44..f1ed73e 100644
--- a/troubleshooting-2.md
+++ b/troubleshooting-2.md
@@ -24,57 +24,12 @@ projects in data science courses and workshops. We’ll also load other
 required packages.
 
 ``` r
-# Changed previous install.packages to have ""
-install.packages("dslabs")
+library("dslabs")
+library("tidyverse")
+library("stringr")
+library("gapminder")
 ```
 
-    ## Installing package into '/opt/homebrew/lib/R/4.2/site-library'
-    ## (as 'lib' is unspecified)
-
-    ## Error in contrib.url(repos, type): trying to use CRAN without setting a mirror
-
-``` r
-install.packages("tidyverse")
-```
-
-    ## Installing package into '/opt/homebrew/lib/R/4.2/site-library'
-    ## (as 'lib' is unspecified)
-
-    ## Error in contrib.url(repos, type): trying to use CRAN without setting a mirror
-
-``` r
-install.packages("stringr")
-```
-
-    ## Installing package into '/opt/homebrew/lib/R/4.2/site-library'
-    ## (as 'lib' is unspecified)
-
-    ## Error in contrib.url(repos, type): trying to use CRAN without setting a mirror
-
-``` r
-install.packages("devtools") # Do not run this if you already have this package installed! 
-```
-
-    ## Installing package into '/opt/homebrew/lib/R/4.2/site-library'
-    ## (as 'lib' is unspecified)
-
-    ## Error in contrib.url(repos, type): trying to use CRAN without setting a mirror
-
-``` r
-devtools::install_github("JoeyBernhardt/singer")
-```
-
-    ## Error in loadNamespace(x): there is no package called 'devtools'
-
-``` r
-install.packages("gapminder")
-```
-
-    ## Installing package into '/opt/homebrew/lib/R/4.2/site-library'
-    ## (as 'lib' is unspecified)
-
-    ## Error in contrib.url(repos, type): trying to use CRAN without setting a mirror
-
 Let’s have a look at the dataset! My goal is to:
 
 - Find out the “class” of the dataset.
@@ -83,23 +38,17 @@ Let’s have a look at the dataset! My goal is to:
 - Have a quick look at the tibble, using a *dplyr function*.
 
 ``` r
-### ERROR HERE ###
 class(dslabs::movielens)
 ```
 
-    ## Error in loadNamespace(x): there is no package called 'dslabs'
+    ## [1] "data.frame"
 
 ``` r
 movieLens <- as_tibble(dslabs::movielens)
-```
-
-    ## Error in as_tibble(dslabs::movielens): could not find function "as_tibble"
-
-``` r
 dim(movieLens)
 ```
 
-    ## Error in eval(expr, envir, enclos): object 'movieLens' not found
+    ## [1] 100004      7
 
 ``` r
     # In addition to dim() (which is a part of base R), I used the dplyr function glipmse() 
@@ -107,7 +56,15 @@ dim(movieLens)
 glimpse(movieLens)
 ```
 
-    ## Error in glimpse(movieLens): could not find function "glimpse"
+    ## Rows: 100,004
+    ## Columns: 7
+    ## $ movieId   <int> 31, 1029, 1061, 1129, 1172, 1263, 1287, 1293, 1339, 1343, 13…
+    ## $ title     <chr> "Dangerous Minds", "Dumbo", "Sleepers", "Escape from New Yor…
+    ## $ year      <int> 1995, 1941, 1996, 1981, 1989, 1978, 1959, 1982, 1992, 1991, …
+    ## $ genres    <fct> Drama, Animation|Children|Drama|Musical, Thriller, Action|Ad…
+    ## $ userId    <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
+    ## $ rating    <dbl> 2.5, 3.0, 3.0, 2.0, 4.0, 2.0, 2.0, 2.0, 3.5, 2.0, 2.5, 1.0, …
+    ## $ timestamp <int> 1260759144, 1260759179, 1260759182, 1260759185, 1260759205, …
 
 Now that we’ve had a quick look at the dataset, it would be interesting
 to explore the rows (observations) in some more detail. I’d like to
@@ -120,11 +77,23 @@ consider the movie entries that…
 - have *more than* 4.5 stars, and were filmed *before* 1995.
 
 ``` r
-### ERROR HERE ###
 filter(movieLens, genres == "Drama")
 ```
 
-    ## Error in as.ts(x): object 'movieLens' not found
+    ## # A tibble: 7,757 × 7
+    ##    movieId title                             year genres userId rating timestamp
+    ##      <int> <chr>                            <int> <fct>   <int>  <dbl>     <int>
+    ##  1      31 Dangerous Minds                   1995 Drama       1    2.5    1.26e9
+    ##  2    1172 Cinema Paradiso (Nuovo cinema P…  1989 Drama       1    4      1.26e9
+    ##  3    1293 Gandhi                            1982 Drama       1    2      1.26e9
+    ##  4      62 Mr. Holland's Opus                1995 Drama       2    3      8.35e8
+    ##  5     261 Little Women                      1994 Drama       2    4      8.35e8
+    ##  6     300 Quiz Show                         1994 Drama       2    3      8.35e8
+    ##  7     508 Philadelphia                      1993 Drama       2    4      8.35e8
+    ##  8     537 Sirens                            1994 Drama       2    4      8.35e8
+    ##  9    2702 Summer of Sam                     1999 Drama       3    3.5    1.30e9
+    ## 10    3949 Requiem for a Dream               2000 Drama       3    5      1.30e9
+    ## # ℹ 7,747 more rows
 
 ``` r
 # Changed this 
@@ -133,7 +102,20 @@ filter(movieLens, genres == "Drama")
 filter(movieLens, genres != "Drama")
 ```
 
-    ## Error in as.ts(x): object 'movieLens' not found
+    ## # A tibble: 92,247 × 7
+    ##    movieId title                            year genres  userId rating timestamp
+    ##      <int> <chr>                           <int> <fct>    <int>  <dbl>     <int>
+    ##  1    1029 Dumbo                            1941 Animat…      1    3      1.26e9
+    ##  2    1061 Sleepers                         1996 Thrill…      1    3      1.26e9
+    ##  3    1129 Escape from New York             1981 Action…      1    2      1.26e9
+    ##  4    1263 Deer Hunter, The                 1978 Drama|…      1    2      1.26e9
+    ##  5    1287 Ben-Hur                          1959 Action…      1    2      1.26e9
+    ##  6    1339 Dracula (Bram Stoker's Dracula)  1992 Fantas…      1    3.5    1.26e9
+    ##  7    1343 Cape Fear                        1991 Thrill…      1    2      1.26e9
+    ##  8    1371 Star Trek: The Motion Picture    1979 Advent…      1    2.5    1.26e9
+    ##  9    1405 Beavis and Butt-Head Do America  1996 Advent…      1    1      1.26e9
+    ## 10    1953 French Connection, The           1971 Action…      1    4      1.26e9
+    ## # ℹ 92,237 more rows
 
 ``` r
 # Changed this
@@ -142,7 +124,20 @@ filter(movieLens, genres != "Drama")
 filter(movieLens, year > 2000)
 ```
 
-    ## Error in as.ts(x): object 'movieLens' not found
+    ## # A tibble: 25,481 × 7
+    ##    movieId title                             year genres userId rating timestamp
+    ##      <int> <chr>                            <int> <fct>   <int>  <dbl>     <int>
+    ##  1    5349 Spider-Man                        2002 Actio…      3    3      1.30e9
+    ##  2    5669 Bowling for Columbine             2002 Docum…      3    3.5    1.30e9
+    ##  3    6377 Finding Nemo                      2003 Adven…      3    3      1.30e9
+    ##  4    7153 Lord of the Rings: The Return o…  2003 Actio…      3    2.5    1.30e9
+    ##  5    7361 Eternal Sunshine of the Spotles…  2004 Drama…      3    3      1.30e9
+    ##  6    8622 Fahrenheit 9/11                   2004 Docum…      3    3.5    1.30e9
+    ##  7    8636 Spider-Man 2                      2004 Actio…      3    3      1.30e9
+    ##  8   44191 V for Vendetta                    2006 Actio…      3    3.5    1.30e9
+    ##  9   48783 Flags of Our Fathers              2006 Drama…      3    4.5    1.30e9
+    ## 10   50068 Letters from Iwo Jima             2006 Drama…      3    4.5    1.30e9
+    ## # ℹ 25,471 more rows
 
 ``` r
 # Changed this
@@ -151,13 +146,39 @@ filter(movieLens, year > 2000)
 filter(movieLens, year == 1999 | year == 2000)
 ```
 
-    ## Error in as.ts(x): object 'movieLens' not found
+    ## # A tibble: 9,088 × 7
+    ##    movieId title                             year genres userId rating timestamp
+    ##      <int> <chr>                            <int> <fct>   <int>  <dbl>     <int>
+    ##  1    2694 Big Daddy                         1999 Comedy      3    3      1.30e9
+    ##  2    2702 Summer of Sam                     1999 Drama       3    3.5    1.30e9
+    ##  3    2762 Sixth Sense, The                  1999 Drama…      3    3.5    1.30e9
+    ##  4    2841 Stir of Echoes                    1999 Horro…      3    4      1.30e9
+    ##  5    2858 American Beauty                   1999 Drama…      3    4      1.30e9
+    ##  6    2959 Fight Club                        1999 Actio…      3    5      1.30e9
+    ##  7    3510 Frequency                         2000 Drama…      3    4      1.30e9
+    ##  8    3949 Requiem for a Dream               2000 Drama       3    5      1.30e9
+    ##  9   27369 Daria: Is It Fall Yet?            2000 Anima…      3    3.5    1.30e9
+    ## 10    2628 Star Wars: Episode I - The Phan…  1999 Actio…      4    5      9.50e8
+    ## # ℹ 9,078 more rows
 
 ``` r
 filter(movieLens, rating > 4.5, year < 1995)
 ```
 
-    ## Error in match.arg(method): object 'year' not found
+    ## # A tibble: 8,386 × 7
+    ##    movieId title                             year genres userId rating timestamp
+    ##      <int> <chr>                            <int> <fct>   <int>  <dbl>     <int>
+    ##  1     265 Like Water for Chocolate (Como …  1992 Drama…      2      5    8.35e8
+    ##  2     266 Legends of the Fall               1994 Drama…      2      5    8.35e8
+    ##  3     551 Nightmare Before Christmas, The   1993 Anima…      2      5    8.35e8
+    ##  4     589 Terminator 2: Judgment Day        1991 Actio…      2      5    8.35e8
+    ##  5     590 Dances with Wolves                1990 Adven…      2      5    8.35e8
+    ##  6     592 Batman                            1989 Actio…      2      5    8.35e8
+    ##  7     318 Shawshank Redemption, The         1994 Crime…      3      5    1.30e9
+    ##  8     356 Forrest Gump                      1994 Comed…      3      5    1.30e9
+    ##  9    1197 Princess Bride, The               1987 Actio…      3      5    1.30e9
+    ## 10     260 Star Wars: Episode IV - A New H…  1977 Actio…      4      5    9.50e8
+    ## # ℹ 8,376 more rows
 
 While filtering for *all movies that do not belong to the genre drama*
 above, I noticed something interesting. I want to filter for the same
@@ -168,7 +189,6 @@ less columns some time in the future, the code will still work. Hint:
 there is a function to select “everything else”…
 
 ``` r
-### ERROR HERE ###
 movieLens %>%
   # Changed this
   #filter(!genres == "Drama") %>%
@@ -180,7 +200,20 @@ movieLens %>%
   select(title, genres, everything())
 ```
 
-    ## Error in movieLens %>% filter(genres != "Drama") %>% select(title, genres, : could not find function "%>%"
+    ## # A tibble: 92,247 × 7
+    ##    title                           genres  movieId  year userId rating timestamp
+    ##    <chr>                           <fct>     <int> <int>  <int>  <dbl>     <int>
+    ##  1 Dumbo                           Animat…    1029  1941      1    3      1.26e9
+    ##  2 Sleepers                        Thrill…    1061  1996      1    3      1.26e9
+    ##  3 Escape from New York            Action…    1129  1981      1    2      1.26e9
+    ##  4 Deer Hunter, The                Drama|…    1263  1978      1    2      1.26e9
+    ##  5 Ben-Hur                         Action…    1287  1959      1    2      1.26e9
+    ##  6 Dracula (Bram Stoker's Dracula) Fantas…    1339  1992      1    3.5    1.26e9
+    ##  7 Cape Fear                       Thrill…    1343  1991      1    2      1.26e9
+    ##  8 Star Trek: The Motion Picture   Advent…    1371  1979      1    2.5    1.26e9
+    ##  9 Beavis and Butt-Head Do America Advent…    1405  1996      1    1      1.26e9
+    ## 10 French Connection, The          Action…    1953  1971      1    4      1.26e9
+    ## # ℹ 92,237 more rows
 
 ## Exercise 2: Calculating with `mutate()`-like functions
 
@@ -190,7 +223,6 @@ use *snake_case* instead, and assign our post-rename object back to
 “movieLens”.
 
 ``` r
-### ERROR HERE ###
 movieLens <- movieLens %>%
   # Changed this
   #rename(user_id == userId,
@@ -198,15 +230,24 @@ movieLens <- movieLens %>%
   # To this
   rename(user_id = userId,
          movie_id = movieId)
-```
-
-    ## Error in movieLens %>% rename(user_id = userId, movie_id = movieId): could not find function "%>%"
 
-``` r
 head(movielens)
 ```
 
-    ## Error in head(movielens): object 'movielens' not found
+    ##   movieId                                   title year
+    ## 1      31                         Dangerous Minds 1995
+    ## 2    1029                                   Dumbo 1941
+    ## 3    1061                                Sleepers 1996
+    ## 4    1129                    Escape from New York 1981
+    ## 5    1172 Cinema Paradiso (Nuovo cinema Paradiso) 1989
+    ## 6    1263                        Deer Hunter, The 1978
+    ##                             genres userId rating  timestamp
+    ## 1                            Drama      1    2.5 1260759144
+    ## 2 Animation|Children|Drama|Musical      1    3.0 1260759179
+    ## 3                         Thriller      1    3.0 1260759182
+    ## 4 Action|Adventure|Sci-Fi|Thriller      1    2.0 1260759185
+    ## 5                            Drama      1    4.0 1260759205
+    ## 6                        Drama|War      1    2.0 1260759151
 
 As you already know, `mutate()` defines and inserts new variables into a
 tibble. There is *another mystery function similar to `mutate()`* that
@@ -216,7 +257,6 @@ entries, and I only want to see that variable (i.e drop all others!) but
 I forgot what that mystery function is. Can you remember?
 
 ``` r
-### ERROR HERE ### 
 # Most likely, the prompt of the question refers to transmute, which "creates a new data frame containing only the specified computations"
 # (see: https://dplyr.tidyverse.org/reference/transmute.html)
 # Changed this
@@ -227,7 +267,20 @@ transmute(movieLens,
        average_rating = mean(rating))
 ```
 
-    ## Error in transmute(movieLens, average_rating = mean(rating)): could not find function "transmute"
+    ## # A tibble: 100,004 × 1
+    ##    average_rating
+    ##             <dbl>
+    ##  1           3.54
+    ##  2           3.54
+    ##  3           3.54
+    ##  4           3.54
+    ##  5           3.54
+    ##  6           3.54
+    ##  7           3.54
+    ##  8           3.54
+    ##  9           3.54
+    ## 10           3.54
+    ## # ℹ 99,994 more rows
 
 ## Exercise 3: Calculating with `summarise()`-like functions
 
@@ -246,13 +299,25 @@ movieLens %>%
   tally()
 ```
 
-    ## Error in movieLens %>% group_by(title) %>% tally(): could not find function "%>%"
+    ## # A tibble: 8,832 × 2
+    ##    title                                  n
+    ##    <chr>                              <int>
+    ##  1 "\"Great Performances\" Cats"          2
+    ##  2 "$9.99"                                3
+    ##  3 "'Hellboy': The Seeds of Creation"     1
+    ##  4 "'Neath the Arizona Skies"             1
+    ##  5 "'Round Midnight"                      2
+    ##  6 "'Salem's Lot"                         1
+    ##  7 "'Til There Was You"                   4
+    ##  8 "'burbs, The"                         19
+    ##  9 "'night Mother"                        3
+    ## 10 "(500) Days of Summer"                45
+    ## # ℹ 8,822 more rows
 
 Without using `group_by()`, I want to find out how many movie reviews
 there have been for each year.
 
 ``` r
-### ERROR HERE ###
 #movieLens %>%
 #  tally(year)
 #Changed to 
@@ -260,14 +325,26 @@ movieLens %>%   # Tally is used for grouped data, count is a short-hand for grou
   count(year)
 ```
 
-    ## Error in movieLens %>% count(year): could not find function "%>%"
+    ## # A tibble: 104 × 2
+    ##     year     n
+    ##    <int> <int>
+    ##  1  1902     6
+    ##  2  1915     2
+    ##  3  1916     1
+    ##  4  1917     2
+    ##  5  1918     2
+    ##  6  1919     1
+    ##  7  1920    15
+    ##  8  1921    12
+    ##  9  1922    28
+    ## 10  1923     3
+    ## # ℹ 94 more rows
 
 Both `count()` and `tally()` can be grouped by multiple columns. Below,
 I want to count the number of movie reviews by title and rating, and
 sort the results.
 
 ``` r
-### ERROR HERE ###
 #movieLens %>%
 #  count(c(title, rating), sort = TRUE)
 # changed to: 
@@ -275,7 +352,20 @@ movieLens %>%
   count(title, rating, sort = TRUE) # c() function call should not be passed into count()
 ```
 
-    ## Error in movieLens %>% count(title, rating, sort = TRUE): could not find function "%>%"
+    ## # A tibble: 28,297 × 3
+    ##    title                              rating     n
+    ##    <chr>                               <dbl> <int>
+    ##  1 Shawshank Redemption, The               5   170
+    ##  2 Pulp Fiction                            5   138
+    ##  3 Star Wars: Episode IV - A New Hope      5   122
+    ##  4 Forrest Gump                            4   113
+    ##  5 Schindler's List                        5   109
+    ##  6 Godfather, The                          5   107
+    ##  7 Forrest Gump                            5   102
+    ##  8 Silence of the Lambs, The               4   102
+    ##  9 Fargo                                   5   100
+    ## 10 Silence of the Lambs, The               5   100
+    ## # ℹ 28,287 more rows
 
 Not only do `count()` and `tally()` quickly allow you to count items
 within your dataset, `add_tally()` and `add_count()` are handy shortcuts
@@ -293,14 +383,26 @@ movieLens %>%
   summarize(avg_rating = mean(rating))
 ```
 
-    ## Error in movieLens %>% group_by(year) %>% summarize(avg_rating = mean(rating)): could not find function "%>%"
+    ## # A tibble: 104 × 2
+    ##     year avg_rating
+    ##    <int>      <dbl>
+    ##  1  1902       4.33
+    ##  2  1915       3   
+    ##  3  1916       3.5 
+    ##  4  1917       4.25
+    ##  5  1918       4.25
+    ##  6  1919       3   
+    ##  7  1920       3.7 
+    ##  8  1921       4.42
+    ##  9  1922       3.80
+    ## 10  1923       4.17
+    ## # ℹ 94 more rows
 
 Using `summarize()`, we can find the minimum and maximum rating for each
 title, stored in columns named `min_rating` and `max_rating`,
 respectively.
 
 ``` r
-### ERROR HERE ###
 #movieLens %>%
 #  mutate(min_rating = min(rating), 
 #         max_rating = max(rating))
@@ -310,7 +412,20 @@ movieLens %>%
   summarize(min_rating = min(rating, na.rm = TRUE), max_rating = max(rating, na.rm = TRUE)) 
 ```
 
-    ## Error in movieLens %>% group_by(title) %>% summarize(min_rating = min(rating, : could not find function "%>%"
+    ## # A tibble: 8,832 × 3
+    ##    title                              min_rating max_rating
+    ##    <chr>                                   <dbl>      <dbl>
+    ##  1 "\"Great Performances\" Cats"             0.5        3  
+    ##  2 "$9.99"                                   2.5        4.5
+    ##  3 "'Hellboy': The Seeds of Creation"        2          2  
+    ##  4 "'Neath the Arizona Skies"                0.5        0.5
+    ##  5 "'Round Midnight"                         0.5        4  
+    ##  6 "'Salem's Lot"                            3.5        3.5
+    ##  7 "'Til There Was You"                      0.5        4  
+    ##  8 "'burbs, The"                             1.5        4.5
+    ##  9 "'night Mother"                           5          5  
+    ## 10 "(500) Days of Summer"                    0.5        5  
+    ## # ℹ 8,822 more rows
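 
 To see why the fix needs `group_by()` plus `summarize()` rather than
 `mutate()`, here is a small sketch on a made-up `ratings` tibble (the
 data and object names are assumptions for illustration only).
 
 ``` r
 library(dplyr)
 
 # Hypothetical toy data, standing in for movieLens
 ratings <- tibble(
   title  = c("A", "A", "B", "B", "B"),
   rating = c(2, 4.5, 1, 3, NA)
 )
 
 # mutate() keeps all five rows and repeats the overall extremes on each row
 with_mutate <- ratings %>%
   mutate(min_rating = min(rating, na.rm = TRUE),
          max_rating = max(rating, na.rm = TRUE))
 
 # group_by() + summarize() collapses the data to one row per title
 per_title <- ratings %>%
   group_by(title) %>%
   summarize(min_rating = min(rating, na.rm = TRUE),
             max_rating = max(rating, na.rm = TRUE))
 ```
 
 Here `with_mutate` still has one row per review, while `per_title` has
 one row per title, which is what the exercise asks for.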
 
 ## Exercise 5: Scoped variants with `across()`
 
@@ -324,8 +439,6 @@ into a tibble and store it under the variable `starWars`.
 starWars <- as_tibble(starwars)
 ```
 
-    ## Error in as_tibble(starwars): could not find function "as_tibble"
-
 We can find the mean for all columns that are numeric, ignoring the
 missing values:
 
@@ -334,7 +447,10 @@ starWars %>%
   summarise(across(where(is.numeric), function(x) mean(x, na.rm=TRUE)))
 ```
 
-    ## Error in starWars %>% summarise(across(where(is.numeric), function(x) mean(x, : could not find function "%>%"
+    ## # A tibble: 1 × 3
+    ##   height  mass birth_year
+    ##    <dbl> <dbl>      <dbl>
+    ## 1   174.  97.3       87.6
 
 We can find the minimum height and mass within each species, ignoring
 the missing values:
@@ -346,7 +462,28 @@ starWars %>%
   summarise(across(c("height", "mass"), function(x) min(x, na.rm=TRUE)))
 ```
 
-    ## Error in starWars %>% group_by(species) %>% summarise(across(c("height", : could not find function "%>%"
+    ## Warning: There were 6 warnings in `summarise()`.
+    ## The first warning was:
+    ## ℹ In argument: `across(c("height", "mass"), function(x) min(x, na.rm = TRUE))`.
+    ## ℹ In group 4: `species = "Chagrian"`.
+    ## Caused by warning in `min()`:
+    ## ! no non-missing arguments to min; returning Inf
+    ## ℹ Run `dplyr::last_dplyr_warnings()` to see the 5 remaining warnings.
+
+    ## # A tibble: 38 × 3
+    ##    species   height  mass
+    ##    <chr>      <int> <dbl>
+    ##  1 Aleena        79    15
+    ##  2 Besalisk     198   102
+    ##  3 Cerean       198    82
+    ##  4 Chagrian     196   Inf
+    ##  5 Clawdite     168    55
+    ##  6 Droid         96    32
+    ##  7 Dug          112    40
+    ##  8 Ewok          88    20
+    ##  9 Geonosian    183    80
+    ## 10 Gungan       196    66
+    ## # ℹ 28 more rows
 
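 Note that R follows the convention that the minimum of an empty set is
 `Inf`: when `na.rm = TRUE` drops every `NA` in a group, `min()` has
 nothing left to compare, so it returns `Inf` and issues a warning.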
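 
 The `Inf` in the output above can be reproduced in base R. The sketch
 below also shows one possible way to get `NA` instead; `safe_min()` is a
 hypothetical helper written for this illustration, not a dplyr or base R
 function.
 
 ``` r
 # With every value missing, na.rm = TRUE leaves an empty vector
 all_missing <- c(NA_real_, NA_real_)
 
 # min() of nothing is Inf; the warning is suppressed here for brevity
 raw_min <- suppressWarnings(min(all_missing, na.rm = TRUE))
 
 # Hypothetical helper: return NA when there is nothing left to compare
 safe_min <- function(x) {
   if (all(is.na(x))) NA_real_ else min(x, na.rm = TRUE)
 }
 
 safe_min(all_missing)   # all values missing
 safe_min(c(3, NA, 1))   # at least one value present
 ```
 
 Whether `Inf` or `NA` is the better placeholder depends on the analysis;
 `NA` makes the missing group harder to mistake for a real measurement.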
@@ -364,7 +501,6 @@ Manually create a tibble with 4 columns:
   birth_location value.
 
 ``` r
-### ERROR HERE ###
 fakeStarWars <- tribble(
   ~name,            ~birth_weight,  ~birth_year, ~birth_location,
   "Luke Skywalker",  1.35      ,   1998        ,  "Liverpool, England",
@@ -376,15 +512,20 @@ fakeStarWars <- tribble(
   "Beru Whitesun Iars", 4.05   ,   2004        ,  "Liverpool, England",
   "R5-D4"         ,  4.50      ,   2005        ,  "New York, NY"
 )
-```
-
-    ## Error in tribble(~name, ~birth_weight, ~birth_year, ~birth_location, "Luke Skywalker", : could not find function "tribble"
-
-``` r
 fakeStarWars
 ```
 
-    ## Error in eval(expr, envir, enclos): object 'fakeStarWars' not found
+    ## # A tibble: 8 × 4
+    ##   name               birth_weight birth_year birth_location    
+    ##   <chr>                     <dbl>      <dbl> <chr>             
+    ## 1 Luke Skywalker             1.35       1998 Liverpool, England
+    ## 2 C-3PO                      1.8        1999 Liverpool, England
+    ## 3 R2-D2                      2.25       2000 Seattle, WA       
+    ## 4 Darth Vader                2.7        2001 Liverpool, England
+    ## 5 Leia Organa                3.15       2002 New York, NY      
+    ## 6 Owen Lars                  3.6        2003 Seattle, WA       
+    ## 7 Beru Whitesun Iars         4.05       2004 Liverpool, England
+    ## 8 R5-D4                      4.5        2005 New York, NY
 
 ## Attributions