Attempting recovery of lost edits.
aphalo committed Mar 14, 2016
1 parent 1d227d3 commit 1d7e059
Showing 9 changed files with 5,595 additions and 27,146 deletions.
11 changes: 8 additions & 3 deletions R.more.plotting.Rnw
@@ -1,4 +1,4 @@
% !Rnw root = appendix.main.Rnw
% !Rnw root = using-r.main.Rnw

<<echo=FALSE, include=FALSE>>=
opts_chunk$set(opts_fig_wide)
@@ -103,14 +103,14 @@ A very simple stat named \code{stat\_debug()} can save the work of adding print

<<>>=
ggplot(lynx.df, aes(year, lynx)) + geom_line() +
stat_debug(alpha = 0.8)
stat_debug_group(alpha = 0.8)
@

<<>>=
lynx.df$century <- ifelse(lynx.df$year >= 1900, "XX", "XIX")
ggplot(lynx.df, aes(year, lynx, color = century)) +
geom_line() +
stat_debug(alpha = 0.8, size = rel(2.5))
stat_debug_group(alpha = 0.8, size = rel(2.5))
@

\section[ggrepel]{\ggrepel}
@@ -119,6 +119,11 @@ ggplot(lynx.df, aes(year, lynx, color = century)) +

Package \ggrepel provides two new geoms: \code{geom\_text\_repel} and \code{geom\_label\_repel}. They are used similarly to \code{geom\_text} and \code{geom\_label} but the text or labels ``repel'' each other so that they rarely overlap unless the plot is very crowded.

<<>>=
ggplot(lynx.df, aes(year, lynx)) +
geom_line() +
stat_peaks(geom = "label_repel", nudge_y = 500)
@

<<>>=
try(detach(package:ggpmisc))
76 changes: 55 additions & 21 deletions R.scripts.Rnw
@@ -128,7 +128,7 @@ We can assign to a variable defined outside a function with operator \texttt{<<-

Now we will define a useful function: a function for calculating the standard error of the mean from a numeric vector.

<<fun-1>>=
<<fun-1NN>>=
SEM <- function(x){sqrt(var(x)/length(x))}
a <- c(1, 2, 3, -5)
a.na <- c(a, NA)
@@ -193,7 +193,7 @@ The R function \texttt{lm} is used next to fit a linear regression.

<<models-1>>=
fm1 <- lm(dist ~ speed, data=cars) # we fit a model, and then save the result
plot(fm1) # we produce diagnosis plots
plot(fm1, which = 2) # we produce diagnosis plots
summary(fm1) # we inspect the results from the fit
anova(fm1) # we calculate an ANOVA
@
@@ -202,7 +202,7 @@ Let's look at each step separately: \texttt{dist ~ speed} is the specification o

<<models-2>>=
fm2 <- lm(dist ~ speed - 1, data=cars) # we fit a model, and then save the result
plot(fm2) # we produce diagnosis plots
plot(fm2, which = 2) # we produce diagnosis plots
summary(fm2) # we inspect the results from the fit
anova(fm2) # we calculate an ANOVA
@
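
As a hedged illustration of what the \code{- 1} term in the model formula does, the sketch below refits both models and compares their coefficients. It uses only base R and the built-in \code{cars} data; the object names simply mirror those used in the chunks above.

```r
# Fit the same regression with and without an intercept
fm1 <- lm(dist ~ speed, data = cars)      # intercept and slope
fm2 <- lm(dist ~ speed - 1, data = cars)  # slope only: the line is forced through the origin

coef(fm1)  # two coefficients, named "(Intercept)" and "speed"
coef(fm2)  # a single coefficient, named "speed"
```

Comparing the names of the two coefficient vectors is a quick way to check which terms a formula has actually included in the fit.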
@@ -211,12 +211,21 @@ We now fit a second degree polynomial.

<<models-3>>=
fm3 <- lm(dist ~ speed + I(speed^2), data=cars) # we fit a model, and then save the result
plot(fm3) # we produce diagnosis plots
plot(fm3, which = 3) # we produce diagnosis plots
summary(fm3) # we inspect the results from the fit
anova(fm3) # we calculate an ANOVA
@

We can also compare the two models.
The ``same'' fit using an orthogonal polynomial.

<<models-3a>>=
fm3a <- lm(dist ~ poly(speed, 2), data=cars) # we fit a model, and then save the result
plot(fm3a, which = 3) # we produce diagnosis plots
summary(fm3a) # we inspect the results from the fit
anova(fm3a) # we calculate an ANOVA
@
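
The quotation marks around ``same'' can be made concrete with a small check. The sketch below, which assumes only base R and the built-in \code{cars} data, compares the raw and the orthogonal parameterizations: the fitted values should agree, while the coefficient estimates differ. The object \code{fm3b} and the two logical flags are names introduced here only for illustration.

```r
fm3  <- lm(dist ~ speed + I(speed^2), data = cars)  # raw polynomial terms
fm3a <- lm(dist ~ poly(speed, 2), data = cars)      # orthogonal polynomial terms

# The fitted values agree up to numerical tolerance ...
same.fit <- isTRUE(all.equal(fitted(fm3), fitted(fm3a)))
same.fit

# ... but the coefficients are expressed on different bases
coef(fm3)
coef(fm3a)

# With raw = TRUE, poly() builds the raw terms, so the estimates match fm3
fm3b <- lm(dist ~ poly(speed, 2, raw = TRUE), data = cars)
same.coef <- isTRUE(all.equal(unname(coef(fm3)), unname(coef(fm3b))))
same.coef
```

The orthogonal form is usually preferred numerically because its columns are uncorrelated, whereas the raw form gives coefficients that are easier to read as terms of the polynomial.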

We can also compare two models.

<<models-4>>=
anova(fm2, fm1)
@@ -225,11 +234,23 @@ anova(fm2, fm1)
Or three or more models. But be careful, as the order of the arguments matters.

<<models-5>>=
anova(fm2, fm1, fm3)
anova(fm2, fm1, fm3, fm3a)
@

We can use different criteria to choose the best model: significance based on $P$-values or information criteria (AIC, BIC) that penalize the result based on the number of parameters in the fitted model. In the case of AIC and BIC, a smaller value is better, and values returned can be either positive or negative, in which case more negative is better.

<<>>=
BIC(fm2, fm1, fm3, fm3a)
AIC(fm2, fm1, fm3, fm3a)
@

One can see above that these three criteria ``select'' different models as best.
\begin{description}
\item[anova] \code{fm1}
\item[BIC] \code{fm1}
\item[AIC] \code{fm3}
\end{description}
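
To see how the penalty for the number of parameters enters the information criteria, the sketch below recomputes AIC and BIC for one model from its log-likelihood, using the usual definitions $\mathrm{AIC} = -2 \log L + 2k$ and $\mathrm{BIC} = -2 \log L + k \log n$. It assumes only base R and the built-in \code{cars} data; the names \code{aic.by.hand}, \code{bic.by.hand} and \code{aic.table} are introduced here for illustration.

```r
fm1 <- lm(dist ~ speed, data = cars)
fm3 <- lm(dist ~ speed + I(speed^2), data = cars)

ll <- logLik(fm1)
k  <- attr(ll, "df")  # number of estimated parameters, residual variance included
n  <- nobs(fm1)       # number of observations used in the fit

aic.by.hand <- -2 * as.numeric(ll) + 2 * k
bic.by.hand <- -2 * as.numeric(ll) + log(n) * k

all.equal(aic.by.hand, AIC(fm1))
all.equal(bic.by.hand, BIC(fm1))

# AIC() called with several models returns a data frame, one row per model;
# the row with the smallest value is the model "selected" by this criterion
aic.table <- AIC(fm1, fm3)
rownames(aic.table)[which.min(aic.table$AIC)]
```

Because BIC multiplies the number of parameters by $\log n$ rather than by 2, it penalizes additional parameters more heavily than AIC whenever $n > e^2 \approx 7.4$, which is one reason the two criteria can disagree.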

\section{Control of execution flow}

\subsection{Conditional execution}
@@ -271,28 +292,50 @@ As you can see above the statement immediately following \texttt{else} is execut
Do you still remember the rules about continuation lines?

<<auxiliary, echo=FALSE, eval=TRUE>>=
show.results <- TRUE
if (show.results) eval.if.4 <- c(1:4) else eval.if.4 <- FALSE
eval.if.4
show.results <- FALSE
if (show.results) eval.if.4 <- c(1:4) else eval.if.4 <- FALSE
eval.if.4
@

<<if-4, eval=eval.if.4>>=
<<if-4>>=
# 1
a <- 1
if (a < 0.0)
print("'a' is negative") else
print("'a' is not negative")
@

Why does the statement below (not here) trigger an error?

<<if-4a, eval=FALSE>>=
# 2 (not evaluated here)
if (a < 0.0) print("'a' is negative")
else print("'a' is not negative")
@

Why does only the second example above trigger an error?
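
One way to explore this question without stopping a script is to hand the problematic code to the parser as a character string. The sketch below is only an illustration under that assumption: \code{bad.code} and \code{res} are names introduced here, and the braces form shown afterwards is one common way of avoiding the problem, not the only one.

```r
a <- 1

# At top level the first line is already a complete statement, so a new line
# starting with 'else' cannot be parsed as its continuation.
bad.code <- "if (a < 0.0) print(\"'a' is negative\")\nelse print(\"'a' is not negative\")"
res <- try(parse(text = bad.code), silent = TRUE)
inherits(res, "try-error")  # was a parse error raised?

# Keeping 'else' on the same line as the closing brace of the 'if' branch
# tells the parser that the statement is not yet complete.
if (a < 0.0) {
  print("'a' is negative")
} else {
  print("'a' is not negative")
}
```

Inside a function body or any other block enclosed in braces the parser keeps reading until the closing brace, so there an \texttt{else} at the start of a line is accepted.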

Play with the use of conditional execution, with both simple and compound statements, and also think how to combine \texttt{if} and \texttt{else} to select among more than two options.

There is in R a \texttt{switch} statement, that we will not describe here, that can be used to select among ``cases'', or several alternative statements, based on an expression evaluating to a number or a character string.
There is in R a \texttt{switch} statement, which we describe here, that can be used to select among ``cases'', or several alternative statements, based on an expression evaluating to a number or a character string. The \texttt{switch} statement returns a value: the value returned by the code corresponding to the matching case or, if there is no match and a default has been included in the code, the default. Either character or numeric values can be used.

<<>>=
my.object <- "two"
b <- switch(my.object,
one = 1,
two = 1 / 2,
three = 1 / 4,
0
)
b
@

Do play with the use of the switch statement.
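
As a starting point for such play, the sketch below shows three behaviours of \texttt{switch} that are easy to miss: selection by position with a numeric selector, ``fall-through'' when a case is left empty, and the value returned when nothing matches and no default is given. The function \code{size.class} and its labels are hypothetical, introduced here only for illustration.

```r
# A numeric selector picks an alternative by position
switch(2, "first", "second", "third")

# A character selector picks by name; an empty case falls through to the
# next non-empty one, and a final unnamed argument acts as the default
size.class <- function(x) {
  switch(x,
         tiny = ,
         small = "S",
         medium = "M",
         large = "L",
         "unknown")
}
size.class("tiny")   # falls through to the value given for 'small'
size.class("huge")   # no match, so the default is returned

# Without a default, an unmatched selector yields NULL (invisibly)
is.null(switch("four", one = 1, two = 2))
```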

\subsubsection{Vectorized}

The vectorized conditional execution is coded by means of a \textbf{function} called \texttt{ifelse} (one word). This function takes three arguments: a logical vector, a result vector for TRUE, a result vector for FALSE. All three can be any construct giving the necessary argument as their result. In the case of result vectors, recycling will apply if they are not of the correct length. \textcolor{red}{The length of the result is determined by the length of the logical vector in the first argument!}.
Vectorized conditional execution is coded by means of a \textbf{function} called \texttt{ifelse} (one word). This function takes three arguments: a logical vector, a result vector for TRUE, a result vector for FALSE. All three can be any construct giving the necessary argument as their return value. In the case of result vectors, recycling will apply if they are not of the correct length. \textcolor{red}{The length of the result is determined by the length of the logical vector in the first argument!}.
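
The remark about the length of the result can be checked directly. In the sketch below, which uses only base R, \code{size.labels} and \code{short} are names introduced here for illustration: the first call recycles two scalars to the length of the test, the second shows that longer result vectors do not lengthen the value returned.

```r
a <- 1:10

# Scalars "high" and "low" are recycled to the length of the logical vector
size.labels <- ifelse(a > 5, "high", "low")
size.labels

# The test has length 2, so only the first two positions of each
# result vector can ever be used
short <- ifelse(c(TRUE, FALSE), 1:10, -(1:10))
short
length(short)
```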

<<ifelse-1>>=
a <- 1:10
@@ -523,20 +566,11 @@ In R speak `library' is the location where `packages' are installed. Packages ar
library(graphics)
@

Currently there are thousands of packages available. The most reliable source of packages is CRAN, as only packages that pass strict tests and are actively maintained are included. In some cases you may need or want to install less stable code, and this is also possible.
Currently there are thousands of packages available. The most reliable source of packages is CRAN, as only packages that pass strict tests and are actively maintained are included. In some cases you may need or want to install less stable code, and this is also possible. With package \code{devtools} it is even possible to install packages directly from GitHub, Bitbucket and a few other repositories. These latter installations are always installations from source (see below).

R packages can be installed either from source, or from already built `binaries'. Installing from sources, depending on the package, may require quite a lot of additional software to be available. Under MS-Windows, the needed shell, commands and compilers are very rarely already available. Installing them is not too difficult (you will need RTools, and MiKTeX). For this reason it is the norm to install packages from binary .zip files. Under Linux most tools will be available, or very easy to install, so it is not unusual to install from sources. For OS X (Mac) the situation is somewhere in-between. If the tools are available, packages can be very easily installed from source from within RStudio.

The development of packages is beyond the scope of the current course, but it is still interesting to know a few things about packages. Using RStudio it is relatively easy to develop your own packages. Packages can be of very different sizes. Packages use a relatively rigid structure of folders for storing the different types of files, and there is a built-in help system that one needs to use, so that the package documentation gets linked to the R help system when the package is loaded. In addition to R code, packages can call C, C++, FORTRAN, Java, etc. functions and routines, but some kind of `glue' is needed, as data is stored differently. At least for C++, the recently developed Rcpp R package makes the gluing extremely easy.

In addition to some packages from CRAN, later in the course we will use a suite of packages for photobiology that I have developed during the last couple of years. Some of the functions in these packages are very simple, and others more complex. In one of the packages, I included some C++ functions to improve performance. Replacing some R for loops with C++ for loops and iterators, resulted in a huge speed increase. The reason for this is that R is an interpreted language and C++ is compiled into machine code. Recent versions of R allow byte-compilation which can give some speed improvement, without need to switch to another language.

The source code for the photobiology and many other packages is freely available, so if you are interested you can study it. For any function defined in R, typing at the command prompt the name of the function without the parentheses lists the code.

<<packages-2>>=
length # a function defined in C within R itself
SEM # the function we defined earlier
@

One good way of learning how R works is by experimenting with it and, whenever using a certain function, looking at its help page to check all the available options.

17 changes: 7 additions & 10 deletions appendixes.prj
@@ -4,13 +4,13 @@
1
1
using-r.main.Rnw
104
103
13
5

using-r.main.Rnw
TeX:RNW:UTF-8
152055803 0 -1 3385 -1 3387 25 25 1393 519 0 1 527 336 -1 -1 0 0 86 -1 -1 86 1 0 3387 -1 0 -1 0
152055803 1 -1 4806 -1 4819 25 25 1393 519 0 1 158 1040 -1 -1 0 0 86 -1 -1 86 1 0 4819 -1 0 -1 0
abbrev.sty
TeX:STY
1060850 0 20 1 18 50 200 200 1593 649 0 0 417 306 -1 -1 0 0 2 0 0 2 1 0 50 18 0 0 0
@@ -19,19 +19,19 @@ TeX:STY
1060850 0 10 38 10 38 0 0 1691 494 0 0 321 170 -1 -1 0 0 1 17 17 1 1 0 38 10 0 0 0
R.as.calculator.Rnw
TeX:RNW
1060859 0 -1 215 -1 1222 175 175 1543 669 0 1 716 320 -1 -1 0 0 0 -1 -1 0 1 0 1222 -1 0 -1 0
1060859 0 -1 215 -1 1222 175 175 1543 669 0 1 716 48 -1 -1 0 0 0 -1 -1 0 1 0 1222 -1 0 -1 0
R.data.Rnw
TeX:RNW
1060859 0 -1 170 -1 174 25 25 1343 589 0 1 383 112 -1 -1 0 0 26 -1 -1 26 1 0 174 -1 0 -1 0
R.more.plotting.Rnw
TeX:RNW
286273531 0 -1 3695 -1 3715 200 200 1568 694 1 1 607 304 -1 -1 0 0 26 -1 -1 26 1 0 3715 -1 0 -1 0
286273531 0 -1 4112 -1 4117 200 200 1568 694 1 1 256 368 -1 -1 0 0 26 -1 -1 26 1 0 4117 -1 0 -1 0
:\aphalo\Documents\R bits and pieces\polynomials\polynomials.Rmd
DATA
5243122 0 0 1 0 1 0 0 1418 354 1 0 78 0 -1 -1 0 0 4 0 0 4 1 0 1 0 0 0 0
:\aphalo\Documents\RPackages\ggpmisc\vignettes\examples.Rmd
DATA
5243122 1 176 1 181 1 25 25 1443 379 1 0 78 496 -1 -1 0 0 4 0 0 4 1 0 1 181 0 0 0
5243122 1 176 1 181 1 25 25 1443 379 1 0 78 480 -1 -1 0 0 4 0 0 4 1 0 1 181 0 0 0
R.plotting.Rnw
TeX:RNW
17838075 0 -1 34385 -1 34385 200 200 1568 694 1 1 94 448 -1 -1 0 0 26 -1 -1 26 1 0 34385 -1 0 -1 0
@@ -46,10 +46,10 @@ TeX:RNW
17838075 0 -1 48 -1 10454 150 150 1543 599 0 1 41 512 -1 -1 0 0 53 -1 -1 53 1 0 10454 -1 0 -1 0
R.scripts.Rnw
TeX:RNW
1060859 2 -1 28549 -1 28556 225 225 1593 719 0 1 1247 512 -1 -1 0 0 153 -1 -1 153 1 0 28556 -1 0 -1 0
1060859 0 -1 28549 -1 26870 225 225 1593 719 0 1 194 256 -1 -1 0 0 153 -1 -1 153 1 0 26870 -1 0 -1 0
using-r.main.tex
TeX:UTF-8
286273531 8 -1 10808 -1 10794 0 0 1393 449 0 1 230 256 -1 -1 0 0 104 -1 -1 104 1 0 10794 -1 0 -1 0
286273531 8 -1 1876 -1 1874 0 0 1393 449 0 1 41 512 -1 -1 0 0 104 -1 -1 104 1 0 1874 -1 0 -1 0
r4photobiology.sty
TeX:STY
269496306 0 53 1 16 1 0 0 1691 494 0 0 25 272 -1 -1 0 0 1 2 2 1 1 0 1 16 0 0 0
@@ -68,9 +68,6 @@ TeX:RNW
C:\ProgramData\MiKTeX\2.9\miktex\log\update.log
DATA
273678578 0 0 1 0 1 25 25 1468 385 1 0 78 -10482 -1 -1 0 0 -1 -1 -1 -1 1 0 1 0 0 0 0
:\aphalo\Documents\RPackages\photobiologyInOut\vignettes\userguidepbio.tex
TeX:UTF-8
286273531 8 -1 17499 -1 17485 50 50 1493 410 0 1 41 240 -1 -1 0 0 -1 -1 -1 -1 1 0 17485 -1 0 -1 0
:\aphalo\Documents\Instrumentation\Department suggestions\shopping_list2.txt
ASCII
273688443 0 0 1 0 1 25 25 1468 385 1 0 73 0 -1 -1 0 0 -1 -1 -1 -1 1 0 1 0 0 0 0
2 changes: 2 additions & 0 deletions my.first.script.r
@@ -0,0 +1,2 @@
# this is my first R script
print(3+4)
6 changes: 3 additions & 3 deletions using-r.main.Rnw
@@ -111,8 +111,8 @@ This series of Notes cover different aspects of the use of R. They are meant to
are short and terse. We do not discuss here statistics, just R as a tool and language for data manipulation and display. The idea is a bit like how children learn a language: they work out what the rules are simply by listening to people speak. I do give some explanations and comments, but the idea of these notes is mainly for you to use the numerous examples to find out by yourself the overall patterns and coding philosophy behind the R language.

This is work-in-progress. I will appreciate suggestions for further examples, notification of errors and unclear things and any bigger contributions. Many of the examples here have been collected from diverse sources over many years and because of this not all sources are acknowledged. If you recognize any example as yours or someone else's please let me know so that I can add a proper acknowledgement.
<<child-r-calc, child='R.as.calculator.Rnw', eval=TRUE>>=

<<child-r-calc, child='R.as.calculator.Rnw', eval=FALSE>>=
@

<<child-r-scripts, child='R.scripts.Rnw', eval=FALSE>>=
@@ -124,7 +124,7 @@ This is work-in-progress. I will appreciate suggestions for further examples, no
<<child-r-plotting, child='R.plotting.Rnw', eval=FALSE>>=
@

<<child-r-more-plotting, child='R.more.plotting.Rnw', eval=FALSE>>=
<<child-r-more-plotting, child='R.more.plotting.Rnw', eval=TRUE>>=
@

<<child-r-maps, child='R.maps.Rnw', eval=FALSE>>=
Binary file modified using-r.main.pdf
Binary file not shown.