
chapter corrections
nhejazi committed Mar 11, 2021
1 parent 7a5244d commit 80269a4
Showing 7 changed files with 646 additions and 442 deletions.
2 changes: 1 addition & 1 deletion .travis.yml
@@ -1,6 +1,6 @@
branches:
  only:
    - master

env:
  global:
4 changes: 2 additions & 2 deletions 03-tlverse.Rmd
@@ -85,8 +85,8 @@ https://github.com/tlverse, not yet on [CRAN](https://CRAN.R-project.org/). You
can use the [`devtools` package](https://devtools.r-lib.org/) to install them:

```{r installation, eval=FALSE}
install.packages("usethis")
usethis::install_github("tlverse/tlverse")
install.packages("devtools")
devtools::install_github("tlverse/tlverse")
```
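
The same approach extends to the individual packages of the ecosystem. As a
small example (assuming you only want `sl3`, which lives at `tlverse/sl3` on
GitHub):

```{r installation-sl3, eval=FALSE}
devtools::install_github("tlverse/sl3")
```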

The `tlverse` depends on a large number of other packages that are also hosted
666 changes: 338 additions & 328 deletions 06-sl3.Rmd

Large diffs are not rendered by default.

114 changes: 84 additions & 30 deletions 07-tmle3.Rmd
@@ -13,60 +13,114 @@ Based on the [`tmle3` `R` package](https://github.com/tlverse/tmle3).

## Introduction

In the previous chapter on `sl3` we learned how to estimate a regression
function like $\mathbb{E}[Y \mid X]$ from data. That's an important first step
in learning from data, but how can we use this predictive model to estimate
statistical and causal effects?

Going back to the roadmap in Chapter 1, suppose we'd like to estimate the effect
of a treatment variable $A$ on an outcome $Y$. As discussed, one potential
parameter that characterizes that effect is the Average Treatment Effect (ATE),
defined as $\psi_0 = \mathbb{E}_W[\mathbb{E}[Y \mid A=1,W] - \mathbb{E}[Y \mid A=0,W]]$
and interpreted as the difference in mean outcome under treatment ($A=1$) and
control ($A=0$), averaging over the distribution of covariates $W$. We'll
illustrate several potential estimators for this parameter, and motivate the
use of TMLE, using the following example data:

```{r tmle_fig1, results="asis", echo = FALSE}
knitr::include_graphics("img/misc/tmle_sim/schematic_1_truedgd.png")
```

The small ticks on the right indicate the mean outcomes (averaging over $W$)
under $A=1$ and $A=0$, respectively, so their difference is the quantity we'd
like to estimate.

While we hope to motivate the application of TMLE in this chapter, we refer the
interested reader to the two Targeted Learning books and associated works for
full technical details.

### Substitution Estimators

We can use `sl3` to fit a Super Learner or other regression model to estimate
the function $\mathbb{E}_0[Y \mid A,W]$. We refer to this function as
$\bar{Q}_0(A,W)$ and our estimate of it as $\bar{Q}_n(A,W)$. We can then
directly "plug in" that estimate to obtain an estimate of the ATE:
$\hat{\psi}_n = \frac{1}{n}\sum_{i=1}^n \big(\bar{Q}_n(1,W_i) - \bar{Q}_n(0,W_i)\big)$.
This kind of estimator is called a plug-in or substitution estimator, as we
substitute our estimate $\bar{Q}_n(A,W)$ of the function $\bar{Q}_0(A,W)$ for
the function itself.
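
To make the plug-in idea concrete, here is a minimal sketch of a substitution
estimator of the ATE using `sl3`. It assumes a `data.table` named `data` with
binary treatment `A`, covariates `W1` and `W2`, and outcome `Y`; these names
are hypothetical placeholders, not objects defined elsewhere in this chapter:

```{r substitution-sketch, eval=FALSE}
library(data.table)
library(sl3)

# a task for the outcome regression E[Y | A, W]
covars <- c("A", "W1", "W2")
task <- sl3_Task$new(data, covariates = covars, outcome = "Y")

# fit an initial estimate of Q_0 (a Super Learner would work the same way)
Q_fit <- make_learner(Lrnr_glm)$train(task)

# predict under the counterfactual assignments A = 1 and A = 0
data_A1 <- copy(data)[, A := 1]
data_A0 <- copy(data)[, A := 0]
Q1W <- Q_fit$predict(sl3_Task$new(data_A1, covariates = covars, outcome = "Y"))
Q0W <- Q_fit$predict(sl3_Task$new(data_A0, covariates = covars, outcome = "Y"))

# plug-in (substitution) estimate of the ATE
psi_hat <- mean(Q1W - Q0W)
```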

Applying `sl3` to estimate the outcome regression in our example, we can see
that it fits the data quite well:

```{r tmle_fig2, results="asis", echo = FALSE}
knitr::include_graphics("img/misc/tmle_sim/schematic_2b_sllik.png")
```

The solid lines indicate the `sl3` estimate of the regression function, with the
dotted lines indicating the `tmle3` update described below.

While substitution estimators are intuitive, naively using this approach with a
Super Learner estimate of $\bar{Q}_0(A,W)$ has several limitations. First,
Super Learner selects learner weights to minimize risk across the entire
regression function, instead of "targeting" the ATE parameter we hope to
estimate, leading to biased estimation. That is, `sl3` is trying to do well on
the full regression curve on the left, instead of focusing on the small ticks
on the right. What's more, this estimator is not asymptotically linear, and
therefore valid inference is not possible.

We can see these limitations illustrated in the estimates generated for the
example data:

```{r tmle_fig3, results="asis", echo = FALSE}
knitr::include_graphics("img/misc/tmle_sim/schematic_3_effects.png")
```

We see that Super Learner estimates the true parameter value (indicated by the
dashed vertical line) more accurately than GLM. However, it is still less
accurate than TMLE, and valid inference is not possible. In contrast, TMLE
achieves a less biased estimator and valid inference.

## TMLE

TMLE takes an initial estimate $\bar{Q}_n(A,W)$, as well as an estimate
$g_n(A \mid W)$ of the propensity score $g_0(A \mid W) = \mathbb{P}(A \mid W)$,
and produces an updated estimate $\bar{Q}^{\star}_n(A,W)$ that is "targeted" to
the parameter of interest. TMLE keeps the benefits of substitution estimators
(it is one), but augments the original estimates to correct for bias while also
yielding an asymptotically linear (and thus normally-distributed) estimator
with consistent Wald-style confidence intervals.

There are different types of TMLE, sometimes for the same set of parameters,
but below is an example of the algorithm for estimating the ATE.
$\bar{Q}^{\star}_n(A,W)$ is the TMLE-augmented estimate
$f(\bar{Q}^{\star}_n(A,W)) = f(\bar{Q}_n(A,W)) + \epsilon_n \cdot h_n(A,W)$,
where $f(\cdot)$ is the appropriate link function (e.g., logit), $\epsilon_n$ is
an estimated coefficient, and $h_n(A,W)$ is a "clever covariate". In this case,
$h_n(A,W) = \frac{A}{g_n(1 \mid W)} - \frac{1-A}{1-g_n(1 \mid W)}$, with
$g_n(1 \mid W) = \mathbb{P}(A=1 \mid W)$ being the estimated (also by SL)
propensity score, so the estimator depends both on the initial SL fit of the
outcome regression ($\bar{Q}_n$) and on an SL fit of the propensity score
($g_n$).
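
To make the targeting step concrete, here is a minimal sketch of the TMLE
update for the ATE with a binary outcome, written in base `R` rather than
`tmle3`. It assumes vectors `Q1W`, `Q0W`, and `QAW` (initial predictions under
$A=1$, $A=0$, and the observed $A$), `gW` (the estimated
$\mathbb{P}(A=1 \mid W)$), and the observed `A` and `Y`; all of these object
names are hypothetical:

```{r targeting-sketch, eval=FALSE}
# clever covariate evaluated at the observed treatment
H_AW <- A / gW - (1 - A) / (1 - gW)

# estimate the fluctuation parameter epsilon by logistic regression,
# with the initial fit entering as an offset on the logit scale
eps <- coef(glm(Y ~ -1 + H_AW, offset = qlogis(QAW), family = binomial()))

# targeted (updated) estimates under A = 1 and A = 0
Q1W_star <- plogis(qlogis(Q1W) + eps / gW)
Q0W_star <- plogis(qlogis(Q0W) - eps / (1 - gW))

# TMLE of the ATE
psi_tmle <- mean(Q1W_star - Q0W_star)
```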

There are further robust augmentations that are used in the `tlverse`, such as
an added layer of cross-validation to avoid over-fitting bias (CV-TMLE), as
well as methods that can more robustly estimate several parameters
simultaneously (e.g., the points on a survival curve).

### Inference

Because TMLE yields an **asymptotically linear** estimator, obtaining inference
is trivial. Each TMLE is associated with an **influence function** that
describes its asymptotic distribution, and Wald-style inference can be obtained
by plugging our estimates $\bar{Q}^{\star}_n$ and $g_n$ into this function and
taking the sample standard error.
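
Continuing the hand-rolled sketch from the previous section (and reusing the
hypothetical `A`, `Y`, `H_AW`, `Q1W_star`, `Q0W_star`, and `psi_tmle` objects
defined there), a Wald-style 95% confidence interval for the ATE follows from
the estimated influence function:

```{r inference-sketch, eval=FALSE}
# targeted predictions at the observed treatment
QAW_star <- ifelse(A == 1, Q1W_star, Q0W_star)

# estimated (efficient) influence function for the ATE
IC <- H_AW * (Y - QAW_star) + (Q1W_star - Q0W_star) - psi_tmle

# Wald-style 95% confidence interval
se <- sd(IC) / sqrt(length(Y))
ci <- psi_tmle + c(-1.96, 1.96) * se
```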

The following sections describe both a simple and more detailed way of
specifying and estimating a TMLE in the `tlverse`. In designing `tmle3`, we
sought to replicate as closely as possible the very general estimation framework
of TMLE, and so each theoretical object relevant to TMLE is encoded in a
corresponding software object. First, we will present the simple application of
`tmle3` to the WASH Benefits example, and then go on to describe the underlying
objects in more detail.

## Easy-Bake Example: `tmle3` for ATE

@@ -121,7 +175,7 @@ Currently, missingness in `tmle3` is handled in a fairly simple way:

* Missing covariates are median-imputed (for continuous variables) or
  mode-imputed (for discrete variables), and additional covariates indicating
  imputation are generated (see the sketch below)
* Observations missing the treatment variable are excluded.
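
As a rough illustration of the imputation behavior (a hand-rolled sketch, not
`tmle3`'s internal code; `data` is a hypothetical `data.frame` with a
continuous covariate `W1`):

```{r imputation-sketch, eval=FALSE}
# record which observations were imputed, then median-impute the covariate
data$delta_W1 <- as.numeric(is.na(data$W1))
data$W1[is.na(data$W1)] <- median(data$W1, na.rm = TRUE)
```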

We implemented IPCW-TMLE to more efficiently handle missingness in the outcome
variable, and we plan to implement an IPCW-TMLE to handle missingness in the
@@ -140,8 +194,8 @@ node_list <- processed$node_list
`tmle3` is general, and allows most components of the TMLE procedure to be
specified in a modular way. However, most end-users will not be interested in
manually specifying all of these components. Therefore, `tmle3` implements a
`tmle3_Spec` object that bundles a set of components into a _specification_
("Spec") that, with minimal additional detail, can be run by an end-user.

We'll start by using one of the specs, and then work our way down into the
internals of `tmle3`.
@@ -164,7 +218,7 @@ to be estimated with `sl3`:
```{r tmle3-learner-list}
# choose base learners
lrnr_mean <- make_learner(Lrnr_mean)
lrnr_rf <- make_learner(Lrnr_ranger)
# define metalearners appropriate to data types
ls_metalearner <- make_learner(Lrnr_nnls)
@@ -173,11 +227,11 @@ mn_metalearner <- make_learner(
  loss_loglik_multinomial
)
sl_Y <- Lrnr_sl$new(
  learners = list(lrnr_mean, lrnr_rf),
  metalearner = ls_metalearner
)
sl_A <- Lrnr_sl$new(
  learners = list(lrnr_mean, lrnr_rf),
  metalearner = mn_metalearner
)
learner_list <- list(A = sl_A, Y = sl_Y)
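
# With the learner list assembled, a Spec can be run end-to-end. As a minimal
# sketch (assuming the `data` and `node_list` objects constructed above, and
# placeholder treatment/control levels for the hypothetical binary treatment):
ate_spec <- tmle_ATE(treatment_level = 1, control_level = 0)
tmle_fit <- tmle3(ate_spec, data, node_list, learner_list)
print(tmle_fit)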
163 changes: 88 additions & 75 deletions DESCRIPTION
@@ -1,82 +1,95 @@
Package: tlversehandbook
Title: Targeted Learning in R with the 'tlverse'
Version: 0.1.0
Authors@R:
    c(person(given = "Jeremy",
             family = "Coyle",
             role = "aut",
             email = "[email protected]",
             comment = c(ORCID = "0000-0002-9874-6649")),
      person(given = "Nima",
             family = "Hejazi",
             role = c("aut", "cre", "cph"),
             email = "[email protected]",
             comment = c(ORCID = "0000-0002-7127-2789")),
      person(given = "Ivana",
             family = "Malenica",
             role = "aut",
             email = "[email protected]",
             comment = c(ORCID = "0000-0002-7404-8088")),
      person(given = "Rachael",
             family = "Phillips",
             role = "aut",
             email = "[email protected]",
             comment = c(ORCID = "0000-0002-8474-591X")),
      person(given = "Alan",
             family = "Hubbard",
             role = c("aut", "ths"),
             email = "[email protected]",
             comment = c(ORCID = "0000-0002-3769-0127")),
      person(given = "Mark",
             family = "van der Laan",
             role = c("aut", "ths"),
             email = "[email protected]",
             comment = c(ORCID = "0000-0003-1432-5511")))
Maintainer: Nima Hejazi <[email protected]>
Description: An open source reproducible handbook for causal machine
    learning and data science with the targeted learning methodology, with
    an emphasis on practical examples and tutorials using the 'tlverse'
    ecosystem of packages.
URL: https://github.com/tlverse/tlverse-workshop,
    https://tlverse.org/tlverse-handbook
BugReports: https://github.com/tlverse/tlverse-workshop/issues
Depends:
    R (>= 3.6.0)
Imports:
    bookdown,
    bslib,
    dagitty,
    data.table,
    delayed,
    downlit,
    dplyr,
    forecast,
    ggdag,
    ggfortify,
    ggplot2,
    kableExtra,
    knitr,
    mvtnorm,
    origami,
    randomForest,
    readr,
    rmarkdown,
    skimr,
    sl3,
    stringr,
    tibble,
    tidyr,
    tmle3,
    tmle3mopttx,
    tmle3shift
Suggests:
    arm,
    e1071,
    gam,
    glmnet,
    hal9001,
    haldensify,
    nnls,
    polspline,
    ranger,
    Rsolnp,
    speedglm,
    SuperLearner,
    xgboost
Remotes:
    github::nhejazi/haldensify@f0de4b5,
    github::rstudio/bookdown,
    github::rstudio/bslib,
    github::tlverse/sl3@devel,
    github::tlverse/tmle3@master,
    github::tlverse/tmle3mopttx@5ba5f65,
    github::tlverse/tmle3shift@master
Encoding: UTF-8
RoxygenNote: 7.1.1
12 changes: 6 additions & 6 deletions index.Rmd
@@ -149,10 +149,10 @@ network) and adaptive sequential designs.
### Rachael Phillips {-}

Rachael Phillips is a PhD student in biostatistics, advised by Alan Hubbard and
Mark van der Laan. She has an MA in Biostatistics, BS in Biology, and BA in
Mathematics. A student of targeted learning and causal inference, she pursues
research integrating personalized medicine, human-computer interaction,
experimental design, and regulatory policy.

### Alan Hubbard {-}

@@ -219,8 +219,8 @@ introductory resources:

For a general introduction to causal inference, we recommend

* [Miguel A. Hernán and James M. Robins' _Causal Inference: What If_,
2021](https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/)
* [Jason A. Roy's _A Crash Course in Causality: Inferring Causal Effects from
Observational Data_ on
Coursera](https://www.coursera.org/learn/crash-course-in-causality)
