Ch 3, reviewer comments
aphalo committed Sep 7, 2023
1 parent 3524f05 commit 12bd8f4
Showing 27 changed files with 390,691 additions and 389,174 deletions.
1,000 changes: 542 additions & 458 deletions R.as.calculator.Rnw

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions R.data.containers.Rnw
@@ -878,7 +878,7 @@ View(cars)
edit(cars)
@

\section{Attributes of R objects}\label{sec:calc:attributes}
\section{Attributes of \Rlang objects}\label{sec:calc:attributes}
\index{attributes|(}

\Rlang objects can have attributes. Attributes are named slots normally used to store ancillary data such as object properties. There are no restrictions on the class of what is assigned to an attribute. They are used by \Rlang itself to store things like column names in data frames and labels of factor levels. All these attributes are visible to user code, and user code can read and write objects' attributes. However, they are rarely displayed explicitly when an object is printed. They can also be used to store metadata accompanying the data stored in an object, which is important for reproducible research and data sharing.
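
As a minimal sketch of reading and writing attributes, consider the chunk below. The object name \code{a.vector}, the attribute name \code{"units"} and the chunk label are made up for illustration; \Rfunction{attr()} and \Rfunction{attributes()} are base \Rlang functions.

<<attributes-sketch-01>>=
# a numeric vector to which we attach metadata (hypothetical example data)
a.vector <- c(10.2, 11.5, 9.8)
# write a user-defined attribute
attr(a.vector, "units") <- "cm"
# read back a single attribute
attr(a.vector, "units")
# names of the attributes set by R itself on a data frame
names(attributes(cars))
@

The last statement lists the attributes that \Rlang uses to store, among other things, the column names of data frame \code{cars}.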
@@ -954,7 +954,7 @@ rm(list = setdiff(ls(pattern="*"), to.keep))

\section{Saving and loading data}

\subsection{Data sets in R and packages}
\subsection{Data sets in \Rlang and packages}
\index{data!loading data sets|(}
To be able to present more meaningful examples, we need some real data. Here we use \code{cars}, one of the many data sets included in base \Rpgrm. Function \Rfunction{data()} is used to load data objects that are included in \Rlang or contained in packages. It is also possible to import data saved in files with \textit{foreign} formats, defined by other software or commonly used for data exchange. Package \pkgname{foreign}, included in the \Rlang distribution, as well as contributed packages make available functions capable of reading and decoding various foreign formats. How to read or import ``foreign'' data is discussed in \Rlang documentation in \emph{R Data Import/Export}, and in this book, in chapter \ref{chap:R:data:io} on page \pageref{chap:R:data:io}. It is also good to keep in mind that in \Rlang, URLs (Uniform Resource Locators) are accepted as arguments to the \code{file} or \code{path} parameter of many functions (see section \ref{sec:files:remote} on page \pageref{sec:files:remote}).
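
A short sketch of loading and inspecting a data set with \Rfunction{data()} could look as follows. Only functions from base \Rlang are used; the chunk label is an arbitrary choice.

<<data-sketch-01>>=
# load a data set included in R into the workspace
data(cars)
# inspect its structure
str(cars)
# display the first three rows
head(cars, n = 3)
@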

2 changes: 1 addition & 1 deletion R.data.io.Rnw
@@ -842,7 +842,7 @@ hyper_tibble(meteo_data.tnc,
select(.data = ., -time)
@

In this second example, we extract data for all grid points along latitudes. To achieve this we need only to omit the test for \code{lat} from the chuck above. The tibble is assembled automatically and columns for the active dimensions added. The decoding of the months remains unchanged.
In this second example, we extract data for all grid points along latitudes. To achieve this we need only to omit the test for \code{lat} from the chunk above. The tibble is assembled automatically and columns for the active dimensions added. The decoding of the months remains unchanged.

<<tidync-03>>=
hyper_tibble(meteo_data.tnc,
12 changes: 6 additions & 6 deletions R.intro.Rnw
@@ -6,7 +6,7 @@ opts_knit$set(unnamed.chunk.label = 'intro-chunk')
opts_knit$set(concordance=TRUE)
@

\chapter{R: the Language and the Program}\label{chap:R:introduction}
\chapter{\Rlang: the Language and the Program}\label{chap:R:introduction}

\begin{VF}
In a world of \ldots\ relentless pressure for more of everything, one can lose sight of the basic principles---simplicity, clarity, generality---that form the bedrock of good software.
@@ -22,7 +22,7 @@ You will also learn how to interact with \Rpgrm when sitting at a computer. You

I describe the steps taken in a typical scientific or technical study, including the data analysis work flow and the roles that \Rpgrm can play in it. I share my views on the advantages and disadvantages of textual command languages such as \Rlang compared to menu-driven user interfaces, frequently used in other statistics software. I discuss the role of textual languages and \emph{literate programming} in the very important question of reproducibility of data analyses and mention how I have used them while writing and typesetting this book.

\section{What is R?}
\section{What is \Rlang?}

\subsection{R as a language}
\index{R as a language@{\Rlang as a language}}
@@ -61,7 +61,7 @@ The \Rpgrm program does not have a full-fledged graphical user interface (GUI),
As \Rpgrm is essentially a command-line application, it can be used on what nowadays are frugal computing resources, equivalent to a personal computer of three decades ago. \Rpgrm can run even on the Raspberry Pi\index{Raspberry Pi}, a single-board computer with the processing power of a modest smartphone (see \url{https://r4pi.org/}). At the other end of the spectrum, on really powerful servers, \Rpgrm can be used for the analysis of big data sets with millions of observations. How powerful a computer is needed for a given data analysis task depends on the size of the data sets, on how patient one is, on the ability to select efficient algorithms and on writing ``good'' code.
\end{explainbox}

\section{Using R}\label{sec:intro:using:R}
\section{Using \Rlang}\label{sec:intro:using:R}

\subsection{Editors and IDEs}
Integrated Development Environments (IDEs)\index{integrated development environment}\index{IDE|see{integrated development environment}} are normally used when developing computer programs. IDEs provide a centralized user interface from within which the different tools used to create and test a computer program can be accessed and used in coordination. Most IDEs include a dedicated editor capable of syntax highlighting (automatically colouring ``code words'' based on their role in the programming language), and even able to report some mistakes in advance of running the code. One could describe such an editor as the equivalent of a word processor with spelling and grammar checking, except that it flags spelling and syntax errors in a computer language like \Rlang instead of in a natural language like English. IDEs frequently add other features that help with navigating the program source code and make access to documentation easy.
@@ -92,15 +92,15 @@ The ``desktop'' version of \RStudio that one installs and uses locally, runs on
In this book I provide only a minimum of guidance on the use of \RStudio, and no guidance for other IDEs. To learn more about \RStudio, please, read the documentation available through \RStudio's help menu and keep at hand a printed copy of the \RStudio cheat sheet while learning how to use it. This and other useful \Rlang-related cheatsheets can be downloaded at \url{https://posit.co/resources/cheatsheets/}. Additional instructions on the use of \RStudio, including a video, are available through the Resources menu entry at the book website at \url{https://www.learnr-book.info/}.
\end{infobox}

\subsection{R sessions and workspaces}\label{sec:R:workspace}
\subsection{\Rlang sessions and workspaces}\label{sec:R:workspace}

We use \emph{session} to describe the interactive execution from start to finish of one running instance of the \Rpgrm program. We use \emph{workspace} to name the imaginary space where all objects currently available in an \Rpgrm session are stored. In \Rpgrm the whole workspace can be stored in a single file on disk at the end of, or during, a session and restored later into another session, possibly on a different computer. Usually when working with \Rpgrm we dedicate a folder in disk storage to store all files from a given data analysis project. We normally keep in this folder files with data to read in, scripts, a file storing the whole contents of the workspace, named by default \code{.RData}, and a text file with the history of commands entered interactively, named by default \code{.Rhistory}. The user's files within this folder can be located in nested folders. There are no strict rules on how the files should be organised or on their number. The recommended practice is to avoid crowded folders and folders containing unrelated files. It is a good idea to keep in a given folder and workspace the work in progress for a single data-analysis project or experiment, so that the workspace can be saved and restored easily between sessions and work continued from where one left it independently of work done in other workspaces. The folder where files are currently read and saved is in \Rpgrm documentation called the \emph{current working directory}. When opening an \code{.RData} file the current working directory is automatically set to the location where the \code{.RData} file was read from.
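
The ideas of workspace and current working directory can be explored with a few statements from base \Rlang, as in the sketch below. The variable name \code{ws.file} and the chunk label are assumptions made for this illustration, and a temporary file is used so that the default workspace file is not overwritten.

<<workspace-sketch-01>>=
# folder where files are currently read and saved
getwd()
# names of the objects currently stored in the workspace
ls()
# save the whole workspace to a temporary file
ws.file <- tempfile(fileext = ".rda")
save.image(file = ws.file)
# restore the saved objects, possibly in a later session
load(ws.file)
@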

\begin{warningbox}
\RStudio projects are implemented as a folder with a name ending in \code{.Rprj}, located under the same folder where scripts, data, \code{.RData} and \code{.Rhistory} are stored. This folder is managed by \RStudio and should not be modified or deleted by the user. Only in the very rare case of its corruption should it be deleted, and the \RStudio project created again from scratch. Files \code{.RData} and \code{.Rhistory} should not be deleted by the user, except to reset the \Rlang workspace. However, this is unnecessary as it can also be easily achieved from within \Rpgrm.
\end{warningbox}

\subsection{Using R interactively}
\subsection{Using \Rlang interactively}

Decades ago users communicated with computers through a physical terminal (keyboard plus text-only screen) that was frequently called a \emph{console}\index{console}. A text-only interface to a computer program, in most cases a window or a pane within a graphical user interface, is still called a console. In our case, this is the \Rpgrm console (Figure \ref{fig:intro:console}), the native user interface of \Rpgrm.

@@ -152,7 +152,7 @@ When trying to access help related to \Rlang extension packages through \Rlang's

When using \RStudio there are easier ways of navigating to a help page than calling function \Rfunction{help()} by typing its name. For example, with the cursor on the name of a function in the editor or console, pressing the \textsf{F1} key opens the corresponding help page in the help pane. Letting the cursor hover for a few seconds over the name of a function at the \Rpgrm console will open ``bubble help'' for it. If the function is defined in a script or another file that is open in the editor pane, one can directly navigate from the line where the function is called to where it is defined. In \RStudio one can also search for help through the graphical interface. The \Rlang manuals are most easily accessed through the Help menu in \RStudio or \pgrmname{RGUI}.

\subsection{Using R in a ``batch job''}
\subsection{Using \Rlang in a ``batch job''}

To run a script\index{scripts}\index{batch job} we first need to prepare it in a text editor. Figure \ref{fig:intro:script} shows the console immediately after running the script file shown in the text editor. As before, the red text, the command \code{source("my-script.R")}, was typed by the user, and the blue text in the console is what was displayed by \Rpgrm as a result of this action. The title bar of the console shows ``R-console,'' while the title bar of the editor shows the \emph{path} to the script file that is open and ready to be edited, followed by ``R-editor.''
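
The same steps can be tried without a text editor, as in the sketch below. Here a two-line script is written to a temporary file that stands in for \code{my-script.R}; the variable name \code{script.file}, the contents of the script and the chunk label are made up for illustration.

<<batch-sketch-01>>=
# write a two-line script to a temporary file
script.file <- tempfile(fileext = ".R")
writeLines(c("x <- 1:10", "print(mean(x))"), con = script.file)
# run all the statements in the file, in the order they appear
source(script.file)
@

By default, \Rfunction{source()} evaluates the statements in the user's workspace, so \code{x} remains available after the script has run.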

12 changes: 6 additions & 6 deletions R.learning.Rnw
@@ -14,7 +14,7 @@ In this chapter I describe how I imagine the book can be used most effectively t

\section{Approach and structure}

Depending on previous experience, reading \emph{Learn R: As a Language} will be about exploring a new world or revisiting a familiar one. In both cases \emph{Learn R: As a Language} aims to be a travel guide, neither a traveler's account, nor a cookbook of \Rlang recipes. It can be used as a course book, supplementary reading or for self instruction, and also as a reference.\vspace{1ex}
Depending on previous experience, reading \emph{Learn R: As a Language} will be about exploring a new world or revisiting a familiar one. In both cases \emph{Learn R: As a Language} aims to be a travel guide, neither a traveler's account, nor a cookbook of \Rlang recipes. It can be used as a course book, supplementary reading or for self instruction, and also as a reference.\vspace{1ex} My hope is that as a guide to the use of \Rlang, this book will remain useful to readers as they gain experience and develop skills.

\noindent
\emph{I encourage readers to approach \Rlang like a child approaches his or her mother tongue when learning to speak: do not struggle, just play, and fool around with \Rlang! If the going gets difficult and frustrating, take a break! If you get a new insight, take a break to enjoy the victory!\vspace{1ex}
@@ -34,18 +34,18 @@ Readers already familiar with \Rlang will be able to read the chapters in the bo

I expect \emph{Learn R: As a Language} to remain useful as a reference to those readers who use it to learn \Rlang. It will also be useful as a reference to readers already familiar with \Rlang. To support the use of the book as a reference, I have been thorough with indexing, including many carefully chosen terms, their synonyms and the names of all \Rlang objects and constructs discussed, collecting them in three alphabetical indexes: \emph{General index}, \emph{Index of R names by category}, and \emph{Alphabetic index of R names} starting at pages \pageref{idx:general}, \pageref{idx:rcats} and \pageref{idx:rindex}, respectively. I have also included back and forward cross references linking related sections throughout the whole book.

\section{Typographic and Naming Conventions}
\section{Typographic and naming conventions}

\subsection{Call outs}

Marginal bars and icons are used in the book to inform about what content is advanced or included with a specific aim. The following icons and colours are used.

\begin{infobox}
Signals ancillary information.
Signals ancillary information, in most cases unrelated to \Rlang as a language.
\end{infobox}

\begin{explainbox}
Signals in-depth explanations of specific \Rlang features or general programming concepts, which can be skipped on first reading, but to which you should return without hurry, preferably with a cup of coffee or tea.
Signals in-depth explanations of specific \Rlang features or general programming concepts. Several of these explanations make reference to programming concepts or features of the \Rlang language that are explained later in the book. Readers new to \Rlang and computer programming can safely skip these call outs on first reading of the book. To become proficient in the use of \Rlang these readers are expected to return to these call outs at a later time, without hurry, preferably with a cup of coffee or tea. Readers with more experience, like those possibly reading individual chapters or using the book as a reference, will find these in-depth explanations useful.
\end{explainbox}

\begin{warningbox}
@@ -64,7 +64,7 @@ Signals \emph{playgrounds} containing open-ended exercises---ideas and pieces of
Signals \emph{advanced playgrounds} sections that require more time to play with before grasping concepts than regular \emph{playground} sections. Numbered by chapter together with other playgrounds.
\end{advplayground}

\subsection{Code Conventions and Syntax Highlighting}
\subsection{Code conventions and syntax highlighting}

Small sections of program code interspersed within the main text receive the name of \emph{code chunks}. In this book \Rlang code chunks are typeset in a typewriter font, using colour to highlight the different elements of the syntax, such as variables, functions, constant values, etc. \Rlang code elements embedded in the text are similarly typeset but always in black. For example, in the code chunk below \code{mean()} and \code{print()} are functions; 1, 5 and 3 are constant numeric values, and \code{z} is the name of a variable where the result of the computation done in the first line of code is stored. The line starting with \code{\#\# } shows what is printed or shown when executing the second statement: \code{[1] 1}. In the book \code{\#\# } is used as a marker to signal output from \Rlang; it is not part of the output.

@@ -73,7 +73,7 @@ z <- mean(1, 5, 3)
print(z)
@

When naming objects (or variables) when explaing general concepts I use short abstract names, while for real-life examples I use meaningful names. Although not required, for clarity, I use names ending in an abbreviation that hint at the type of object stored, unless the whole name fulfils the role.
When explaining general concepts I give objects (or variables) short abstract names, while for real-life examples I use meaningful names. Although not required, for clarity, I use names hinting at the structure of the objects stored, such as \code{mat1} for a matrix.

Code in playgrounds does not modify objects created by code examples listed outside playgrounds, and is self-contained in the sense that if earlier code is used, this is mentioned in the text of the playground. The code outside playgrounds does reuse objects created earlier in the same chapter, but is independent from code or data used in earlier chapters.
