-
Notifications
You must be signed in to change notification settings - Fork 8
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #329 from Merck/289-vignette-of-the-information-un…
…der-the-null-and-alternative-hypothesis 289 vignette of the information under the null and alternative hypothesis
- Loading branch information
Showing
2 changed files
with
161 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,160 @@ | ||
--- | ||
title: "Statistical information under null and alternative hypothesis" | ||
output: | ||
rmarkdown::html_document: | ||
toc: true | ||
toc_float: true | ||
toc_depth: 2 | ||
number_sections: true | ||
highlight: "textmate" | ||
css: "custom.css" | ||
bibliography: gsDesign2.bib | ||
vignette: | | ||
%\VignetteIndexEntry{Statistical information under null and alternative hypothesis} | ||
%\VignetteEncoding{UTF-8} | ||
%\VignetteEngine{knitr::rmarkdown} | ||
--- | ||
|
||
```{r setup, include=FALSE} | ||
knitr::opts_chunk$set( | ||
collapse = TRUE, | ||
comment = "#>" | ||
) | ||
``` | ||
|
||
In a group sequential design of the $k$-th analysis, the Z-score is | ||
$$ | ||
Z_k = \delta_k / \sqrt{\text{Var}(\delta_k | H_0)} | ||
$$ | ||
The statistical information $\mathcal I_k$ is defined as the inverse of the variance of $\delta_k$, i.e., | ||
$$ | ||
\mathcal I_k = 1 / \text{Var}(\delta_k). | ||
$$ | ||
|
||
# Continuous outcomes | ||
|
||
Imagine a trial with a continuous outcome. Let $X_{0, i} \sim N(\mu_0, \sigma^2)$ for subjects $i = 1, \ldots, n_0$ in the control arm and $X_{1,i} \sim N(\mu_1, \sigma^2)$ for patient $i$ ($i = 1, \ldots, n_1$) in the experimental arm. | ||
|
||
For a superiority design, the tested hypothesis is | ||
$$ | ||
H_0: \; \mu_0 = \mu_1 \;\;\; \text{vs.} \;\;\; H_1:\; \mu_1 > \mu_0. | ||
$$ | ||
Suppose at the $k$-th analysis, there are $n_{0k}$ subjects in the control arm, and there are $n_{1k}$ subjects in the experimental arm. We have $\delta_k$ as the difference of group means, i.e., | ||
$$ | ||
\delta_k | ||
= | ||
\frac{\sum_{i=1}^{n_{1k}} X_{i,1}}{n_{1k}} | ||
- | ||
\frac{\sum_{i=1}^{n_{0k}} X_{i,0}}{n_{0k}}. | ||
$$ | ||
It can be estimated as $\widehat\delta_k = \frac{\sum_{i=1}^{n_{1k}} x_{i,1}}{n_{1k}} - \frac{\sum_{i=1}^{n_{0k}} x_{i,0}}{n_{0k}}$, where $x_{i,j}$ are the observation of $X_{i,j}$ of subject $i$ in arm $j$. | ||
|
||
The statistical information $\mathcal I_k$ is | ||
$$ | ||
\mathcal I_k^{-1} | ||
= | ||
\text{Var}(\delta_k | H_0) | ||
= | ||
\sigma^2 (1 / n_{1k} + 1 / n_{0k}), | ||
$$ | ||
under both $H_0$ and $H_1$, which can be estimated as $\mathcal I_k = \widehat\sigma^2 (1 / n_{1k} + 1 / n_{0k})$. | ||
|
||
# Binary outcomes | ||
|
||
Imagine a trial with a binary outcome. Let $X_{0, i} \sim B(p_0)$ for patient $i = 1, \ldots, n_0$ and $X_{1,i} \sim B(p_1)$ for patient $i$ ($i = 1, \ldots, n_1$), where $p_0$ and $p_1$ are failure rate probability. Suppose at the $k$-th analysis, there are $n_{0k}$ subjects in the control arm, and there are $n_{1k}$ subjects in the experimental arm. For a superiority design, the null and alternative hypothesis is | ||
$$ | ||
H_0: \; p_0 = p_1 = p \;\;\; \text{vs.} \;\;\; H_1:\; p_0 > p_1. | ||
$$ | ||
The nature-scale treatment effect is | ||
$$ | ||
\delta_k | ||
= | ||
\frac{\sum_{i=1}^{n_{1k}} X_{i,1}}{n_{1k}} | ||
- | ||
\frac{\sum_{i=1}^{n_{0k}} X_{i,0}}{n_{0k}}, | ||
$$ | ||
It can be estimated as $\widehat\delta_k = \frac{\sum_{i=1}^{n_{1k}} x_{i,1}}{n_{1k}} - \frac{\sum_{i=1}^{n_{0k}} x_{i,0}}{n_{0k}}$, where $x_{i,j}$ are the observation of $X_{i,j}$ of subject $i$ in arm $j$. | ||
|
||
The statistical information is | ||
$$ | ||
\mathcal I_k^{-1} | ||
= | ||
\text{Var}(\delta_k) | ||
= | ||
\left\{ | ||
\begin{array}{ll} | ||
p(1-p)/n_{1k} + p(1-p)/n_{0k} & \text{under } H_0\\ | ||
p_1(1-p_1)/n_{1k} + p_0(1-p_0)/n_{0k} & \text{under } H_1\\ | ||
\end{array} | ||
\right.. | ||
$$ | ||
Its estimation is | ||
$$ | ||
\widehat{\mathcal I}_k^{-1} | ||
= | ||
\left\{ | ||
\begin{array}{ll} | ||
\bar p(1 - \bar p) / n_{1k} + \bar p(1 - \bar p) / n_{0k} & \text{under } H_0\\ | ||
\widehat p_1(1-p_1) / n_{1k} + \widehat p_0(1 - \widehat p_0)/n_{0k} & \text{under } H_1\\ | ||
\end{array} | ||
\right., | ||
$$ | ||
where $\bar p = \frac{\sum_{i=1}^{n_{1k}}x_{i1} + \sum_{i=1}^{n_{0k}}x_{i0}}{n_{1k} + n_{0k}}$, $\widehat p_j = \frac{\sum_{i=1}^{n_{jk}}x_{ij}}{n_{jk}}$ for $j = 0, 1$. | ||
|
||
|
||
# Survival outcome | ||
|
||
In many clinical trials, the outcome is the time to some event. For simplicity, assume the event is death so that each person can only have one event; the same ideas apply for events that can recur, but in those cases we restrict attention to the first event for each patients. We use the logrank statistics to compare the treatment and control arms. | ||
If we assume there are $N_k$ total number of deaths at analysis $k$. The numerator of the logrank statistics at analysis $k$ is [@proschan2006statistical] | ||
$$ | ||
\sum_{i=1}^{N_k} D_i, | ||
$$ | ||
where $D_i = O_i - E_i$ with $O_i$ is the indicator that the $i$th death occurred in a treatment patient, and $E_i = m_{1i} / (m_{0i} + m_{1i})$ as the null expectation of $O_i$ given the respective numbers, $m_{0i}$ and $m_{1i}$, of control and treatment patients at risk just prior to the $i$th death. | ||
|
||
|
||
Conditioned on $m_{0i}$ and $m_{1i}$, the $O_i$ has a Bernoulli distribution with parameter $E_i$. The null conditional mean and variance of $D_i$ are 0 and $V_i = E_i(1 − E_i)$, respectively. | ||
|
||
Unconditionally, the $D_i$ are uncorrelated, mean 0 random variables with variance $E(V_i)$ under the null hypothesis. | ||
|
||
Thus, conditioned on $N_k$, we have | ||
$$ | ||
\begin{array}{ccl} | ||
\mathcal I_k^{-1} | ||
& = & | ||
\text{Var}(\delta_k) | ||
= | ||
\sum_{i=1}^{N_k} \text{Var}(D_i) | ||
= | ||
\sum_{i=1}^{N_k} E(V_i) | ||
= | ||
E \left( \sum_{i=1}^{N_k} V_i \right) | ||
= | ||
E \left( \sum_{i=1}^{N_k} E_i(1 − E_i) \right) \\ | ||
& = & | ||
\left\{ | ||
\begin{array}{ll} | ||
E\left(\sum_{i=1}^{N_k} \frac{r}{1+r} \frac{1}{1+r} \right) | ||
& \text{under } H_0\\ | ||
E\left(\sum_{i=1}^{N_k} \frac{m_{1i}}{(m_{0i} + m_{1i})} \frac{m_{0i}}{(m_{0i} + m_{1i})}\right) | ||
& \text{under } H_1 | ||
\end{array} | ||
\right., | ||
\end{array} | ||
$$ | ||
where $r$ is the randomization ratio. Its estimation is | ||
$$ | ||
\begin{array}{ccl} | ||
\widehat{\mathcal I}_k^{-1} | ||
& = & | ||
\left\{ | ||
\begin{array}{ll} | ||
\sum_{i=1}^{N_k} \frac{r}{1+r} \frac{1}{1+r} | ||
& \text{under } H_0\\ | ||
\sum_{i=1}^{N_k} \frac{m_{1i}}{(m_{0i} + m_{1i})} \frac{m_{0i}}{(m_{0i} + m_{1i})} | ||
& \text{under } H_1 | ||
\end{array} | ||
\right.. | ||
\end{array} | ||
$$ | ||
|
||
# References |