Significant additions only to stats and as.calculator chapters, other chapters \citetitle -> \citebooktitle
aphalo committed Jul 14, 2018
1 parent 07b0309 commit 7f5d1c5
Showing 74 changed files with 338,287 additions and 381,400 deletions.
184 changes: 142 additions & 42 deletions R.as.calculator.Rnw

Large diffs are not rendered by default.

159 changes: 73 additions & 86 deletions R.data.Rnw

Large diffs are not rendered by default.

6 changes: 3 additions & 3 deletions R.friends.Rnw
@@ -18,7 +18,7 @@ Improving the efficiency of your S functions can be well worth some effort.\ \ld

\section{Aims of this chapter}

In this final chapter I highlight what in my opinion are limitations and advantages of using \langname{R} as a scripting and programming language for data analysis, briefly describing alternative approaches that can help overcome performance bottle necks in R code. A few books cover this subject from different perspectives, including \citetitle{Wickham2015} \autocite{Wickham2015}, \citetitle{Wickham2014} \autocite{Wickham2014}, \citetitle{Matloff2011} \autocite{Matloff2011}, \citetitle{Chambers2016} \autocite{Chambers2016} from a practical programming perspective, and, \citetitle{Burns1998} \autocite{Burns1998} and \citetitle{Burns2011} \autocite{Burns2011} from a more conceptual and language design oriented perspective.
In this final chapter I highlight what in my opinion are limitations and advantages of using \langname{R} as a scripting and programming language for data analysis, briefly describing alternative approaches that can help overcome performance bottle necks in R code. A few books cover this subject from different perspectives, including \citebooktitle{Wickham2015} \autocite{Wickham2015}, \citebooktitle{Wickham2014} \autocite{Wickham2014}, \citebooktitle{Matloff2011} \autocite{Matloff2011}, \citebooktitle{Chambers2016} \autocite{Chambers2016} from a practical programming perspective, and, \citebooktitle{Burns1998} \autocite{Burns1998} and \citebooktitle{Burns2011} \autocite{Burns2011} from a more conceptual and language design oriented perspective.

\section{Packages used in this chapter}

@@ -60,8 +60,8 @@ You may ask, how can I know, where in the code is the performance bottleneck\ind

There are some rules of style\index{code!writing style}, and common sense, that should be always applied, to develop good quality program code. However, as in some cases, high performance comes at the cost of a more complex program or algorithm, optimizations should be applied only to the sections of the code that are limiting overall performance. Usually even when the requirement of high performance is known in advance, it is best to start with a simple implementation of a simple algorithm. Get this first solution working reliably, and use this as a reference both for performance and accuracy of returned results while attempting performance optimization.

The book \citetitle{Matloff2011} \autocite{Matloff2011} is very good at presenting the use of R language and how to profit from its peculiar features to write concise and efficient code.
Studying the book \citetitle{Wickham2014advanced} \autocite{Wickham2014advanced} will give you a deep understanding of the R language, its limitations and good and bad approaches to its use. If you aim at writing R packages, then \citetitle{Wickham2015} \autocite{Wickham2015} will guide you on how to write your own packages, using modern tools. Finally, any piece of software, benefits from thorough and consistent testing, and R packages and scripts are no exception. Building a set of test cases simplifies enormously code maintenance, as they help detect unintended changes in program behaviour \autocite{Wickham2015,Cotton2016}.
The book \citebooktitle{Matloff2011} \autocite{Matloff2011} is very good at presenting the use of R language and how to profit from its peculiar features to write concise and efficient code.
Studying the book \citebooktitle{Wickham2014advanced} \autocite{Wickham2014advanced} will give you a deep understanding of the R language, its limitations and good and bad approaches to its use. If you aim at writing R packages, then \citebooktitle{Wickham2015} \autocite{Wickham2015} will guide you on how to write your own packages, using modern tools. Finally, any piece of software, benefits from thorough and consistent testing, and R packages and scripts are no exception. Building a set of test cases simplifies enormously code maintenance, as they help detect unintended changes in program behaviour \autocite{Wickham2015,Cotton2016}.

\section{Measuring and improving performance}\label{sec:performance:tuning}

10 changes: 5 additions & 5 deletions R.functions.Rnw
@@ -17,7 +17,7 @@ Computer Science is a science of abstraction---creating the right model for a pr

\section{Aims of this chapter}

In earlier chapters we have only used base \Rlang features. In this chapter you will learn how to expand the range of features available. In the first part of the chapter I will focus on using existing packages and how they expand the functionality of \Rlang. In the second part you will learn how to define new functions and classes by yourself. We will not consider the important, but more advanced question of packaging functions and classes into new packages. The development of packages is thoroughly described in the book \citetitle{Wickham2015} \autocite{Wickham2015}.
In earlier chapters we have only used base \Rlang features. In this chapter you will learn how to expand the range of features available. In the first part of the chapter I will focus on using existing packages and how they expand the functionality of \Rlang. In the second part you will learn how to define new functions and classes by yourself. We will not consider the important, but more advanced question of packaging functions and classes into new packages. The development of packages is thoroughly described in the book \citebooktitle{Wickham2015} \autocite{Wickham2015}.

\section{Packages}\label{sec:script:packages}

@@ -34,9 +34,9 @@ Currently there are many thousands of packages available. The most reliable sour

\Rpgrm packages can be installed either from source, or from already built `binaries'. Installing from sources, depending on the package, may require quite a lot of additional software to be available. Under \pgrmname{MS-Windows}, very rarely the needed shell, commands and compilers are already available. Installing them is not too difficult (you will need \pgrmname{RTools}, and \pgrmname{\hologo{MiKTeX}}). However, for this reason it is the norm to install packages from binary \texttt{.zip} files under \pgrmname{MS-Windows}. Under Linux most tools will be available, or very easy to install, so it is usual to install packages from sources. For \pgrmname{OS X} (Apple Mac) the situation is somewhere in-between. If the tools are available, packages can be very easily installed from sources from within \RStudio. However, binaries are for most packages also readily available.

The development of packages is beyond the scope of the current book, and very well explained in the book \citetitle{Wickham2015} \autocite{Wickham2015}. However, it is still worthwhile mentioning a few things about the development of \Rpgrm packages. Using \RStudio it is relatively easy to develop your own packages. Packages can be of very different sizes. Packages use a relatively rigid structure of folders for storing the different types of files, and there is a built-in help system, that one needs to use, so that the package documentation gets linked to the R help system when the package is loaded. In addition to \langname{R} code, packages can call \langname{C}, \langname{C++}, \langname{FORTRAN}, \langname{Java}, etc. functions and routines, but some kind of `glue' is needed, as function call conventions and \emph{name mangling} depend on the programming language, and in many cases also on the compiler used. At least for \langname{C++}, the recently developed \pkgname{Rcpp} \langname{R} package makes the ``gluing'' extremely easy. See Chapter \ref{chap:R:performance} starting on page \pageref{chap:R:performance} for more information on performance-related and other limitations of \pgrmname{R} and how to solve possible bottlenecks.
The development of packages is beyond the scope of the current book, and very well explained in the book \citebooktitle{Wickham2015} \autocite{Wickham2015}. However, it is still worthwhile mentioning a few things about the development of \Rpgrm packages. Using \RStudio it is relatively easy to develop your own packages. Packages can be of very different sizes. Packages use a relatively rigid structure of folders for storing the different types of files, and there is a built-in help system, that one needs to use, so that the package documentation gets linked to the R help system when the package is loaded. In addition to \langname{R} code, packages can call \langname{C}, \langname{C++}, \langname{FORTRAN}, \langname{Java}, etc. functions and routines, but some kind of `glue' is needed, as function call conventions and \emph{name mangling} depend on the programming language, and in many cases also on the compiler used. At least for \langname{C++}, the recently developed \pkgname{Rcpp} \langname{R} package makes the ``gluing'' extremely easy. See Chapter \ref{chap:R:performance} starting on page \pageref{chap:R:performance} for more information on performance-related and other limitations of \pgrmname{R} and how to solve possible bottlenecks.

One good way of learning how R works, is by experimenting with it, and whenever using a certain function looking at its help, to check what are all the available options. How much documentation is included with packages varies a lot, but many packages include comprehensive user guides or examples as \emph{vignettes} in addition to the help pages for individual functions or data sets. It is not unusual to decide which package to use from a set of alternatives based on the available documentation. In the case of some packages adding many new capabilities, packages may be documented in depth in a whole book. Well known examples are \citetitle{Pinheiro2000} \autocite{Pinheiro2000}, \citetitle{Sarkar2008} \autocite{Sarkar2008} and \citetitle{Wickham2016} \autocite{Wickham2016}.
One good way of learning how R works, is by experimenting with it, and whenever using a certain function looking at its help, to check what are all the available options. How much documentation is included with packages varies a lot, but many packages include comprehensive user guides or examples as \emph{vignettes} in addition to the help pages for individual functions or data sets. It is not unusual to decide which package to use from a set of alternatives based on the available documentation. In the case of some packages adding many new capabilities, packages may be documented in depth in a whole book. Well known examples are \citebooktitle{Pinheiro2000} \autocite{Pinheiro2000}, \citebooktitle{Sarkar2008} \autocite{Sarkar2008} and \citebooktitle{Wickham2016} \autocite{Wickham2016}.

\begin{warningbox}
\textbf{Naming conflicts} When two objects with the same name are present in the search path used by R to match names to stored objects, conflicts can occur. Not all names belong to the same namespace, and consequently, in many situations different objects with identical names can coexist in the R environment. It is important that you realize that in such cases all these objects remain within reach of our code, but that the one nearest to the top of the search path will be returned when a ``plain'' name is used. The name of the namespace can be prepended to that of the object, separated by double colons (\code{::}). In recent versions of R, every package is required to use a namespace to isolate the names used, and even names used in base R are in their own namespace. This greatly facilitates the resolution of naming conflicts.
@@ -159,9 +159,9 @@ Create some additional vectors containing \code{NA}s or not. Use them to test fu

\section{Objects, classes and methods}\label{sec:script:objects:classes:methods}

An\index{objects}\index{classes}\index{methods} in-depth discussion of object oriented programming in \langname{R} is outside the scope of this book. Several books describe in detail the different class systems available and how to take best advantage of them when developing packages extending R. For the non-programmer user, a basic understanding can be useful, even if he or she do not intend to create new classes. This basic knowledge is what we intend to convey in this section. For an in-depth treatment of the subject please consult the recently published book \citetitle{Wickham2014} \autocite{Wickham2014}.
An\index{objects}\index{classes}\index{methods} in-depth discussion of object oriented programming in \langname{R} is outside the scope of this book. Several books describe in detail the different class systems available and how to take best advantage of them when developing packages extending R. For the non-programmer user, a basic understanding can be useful, even if he or she do not intend to create new classes. This basic knowledge is what we intend to convey in this section. For an in-depth treatment of the subject please consult the recently published book \citebooktitle{Wickham2014} \autocite{Wickham2014}.

We start with a quotation form \citetitle{Burns1998} \autocite[][, page 13]{Burns1998}.
We start with a quotation form \citebooktitle{Burns1998} \autocite[][, page 13]{Burns1998}.
\begin{quotation}
The idea of object-oriented programming is simple, but carries a lot of weight.
Here's the whole thing: if you told a group of people ``dress for work'', then