Merge pull request #329 from Merck/289-vignette-of-the-information-un…
…der-the-null-and-alternative-hypothesis

289 vignette of the information under the null and alternative hypothesis
nanxstats authored Feb 5, 2024
2 parents eb1d509 + a81ead5 commit 91e4e4d
Showing 2 changed files with 161 additions and 0 deletions.
1 change: 1 addition & 0 deletions _pkgdown.yml
@@ -157,6 +157,7 @@ articles:
- articles/story-compute-expected-events
- articles/story-summarize-designs
- articles/story-canonical-h0-h1
- articles/story-info-formula

- title: "Design for binary endpoints"
desc: >
160 changes: 160 additions & 0 deletions vignettes/articles/story-info-formula.Rmd
@@ -0,0 +1,160 @@
---
title: "Statistical information under null and alternative hypothesis"
output:
  rmarkdown::html_document:
    toc: true
    toc_float: true
    toc_depth: 2
    number_sections: true
    highlight: "textmate"
    css: "custom.css"
bibliography: gsDesign2.bib
vignette: |
  %\VignetteIndexEntry{Statistical information under null and alternative hypothesis}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

At the $k$-th analysis of a group sequential design, let $\delta_k$ denote the treatment effect. The Z-score is
$$
Z_k = \delta_k / \sqrt{\text{Var}(\delta_k | H_0)}
$$
The statistical information $\mathcal I_k$ is defined as the inverse of the variance of $\delta_k$, i.e.,
$$
\mathcal I_k = 1 / \text{Var}(\delta_k).
$$
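As a minimal numerical sketch of these two definitions, the snippet below computes the information and the Z-score from a treatment effect and its null variance. The values of `delta_k` and `var_h0` are made up for illustration and do not come from any trial.

```r
# Hypothetical inputs: a treatment effect and its variance under H0
delta_k <- 0.5
var_h0 <- 0.04

# Statistical information is the inverse of the variance
info_k <- 1 / var_h0

# Z-score: effect divided by its null standard deviation,
# or equivalently effect times the square root of the information
z_k <- delta_k / sqrt(var_h0)
z_k_alt <- delta_k * sqrt(info_k)
```

Both expressions for the Z-score agree because $1 / \sqrt{\text{Var}(\delta_k | H_0)} = \sqrt{\mathcal I_k}$ when the information is evaluated under $H_0$.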

# Continuous outcomes

Imagine a trial with a continuous outcome. Let $X_{i,0} \sim N(\mu_0, \sigma^2)$ for subject $i = 1, \ldots, n_0$ in the control arm and $X_{i,1} \sim N(\mu_1, \sigma^2)$ for subject $i = 1, \ldots, n_1$ in the experimental arm.

For a superiority design, the tested hypothesis is
$$
H_0: \; \mu_0 = \mu_1 \;\;\; \text{vs.} \;\;\; H_1:\; \mu_1 > \mu_0.
$$
Suppose that at the $k$-th analysis there are $n_{0k}$ subjects in the control arm and $n_{1k}$ subjects in the experimental arm. The treatment effect $\delta_k$ is the difference of group means, i.e.,
$$
\delta_k
=
\frac{\sum_{i=1}^{n_{1k}} X_{i,1}}{n_{1k}}
-
\frac{\sum_{i=1}^{n_{0k}} X_{i,0}}{n_{0k}}.
$$
It can be estimated as $\widehat\delta_k = \frac{\sum_{i=1}^{n_{1k}} x_{i,1}}{n_{1k}} - \frac{\sum_{i=1}^{n_{0k}} x_{i,0}}{n_{0k}}$, where $x_{i,j}$ is the observed value of $X_{i,j}$ for subject $i$ in arm $j$.

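The statistical information $\mathcal I_k$ satisfies
$$
\mathcal I_k^{-1}
=
\text{Var}(\delta_k)
=
\sigma^2 (1 / n_{1k} + 1 / n_{0k}),
$$
under both $H_0$ and $H_1$, because the common variance $\sigma^2$ does not depend on the treatment effect. It can be estimated as $\widehat{\mathcal I}_k = \left[\widehat\sigma^2 (1 / n_{1k} + 1 / n_{0k})\right]^{-1}$, where $\widehat\sigma^2$ is an estimate of $\sigma^2$.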
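The calculation for a continuous outcome can be sketched on simulated data as follows. The sample sizes, means, and standard deviation are hypothetical, and the pooled variance estimator is one common choice for $\widehat\sigma^2$ that is assumed here rather than prescribed by the text.

```r
set.seed(123)

# Hypothetical design values, chosen only for illustration
n0k <- 40
n1k <- 60
mu0 <- 0
mu1 <- 0.5
sigma <- 1

x0 <- rnorm(n0k, mean = mu0, sd = sigma) # control arm
x1 <- rnorm(n1k, mean = mu1, sd = sigma) # experimental arm

# Estimated treatment effect: difference of arm means
delta_hat <- mean(x1) - mean(x0)

# Pooled variance estimate (an assumed choice of estimator)
sigma2_hat <- ((n1k - 1) * var(x1) + (n0k - 1) * var(x0)) / (n1k + n0k - 2)

# Estimated variance of the treatment effect and the matching information
var_hat <- sigma2_hat * (1 / n1k + 1 / n0k)
info_hat <- 1 / var_hat

# Information based on the true sigma, for comparison
info_true <- 1 / (sigma^2 * (1 / n1k + 1 / n0k))
```

With these sample sizes and $\sigma = 1$, the true information is $1 / (1/60 + 1/40) = 24$, and `info_hat` should land near that value.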

# Binary outcomes

Imagine a trial with a binary outcome. Let $X_{i,0} \sim B(p_0)$ for subject $i = 1, \ldots, n_0$ in the control arm and $X_{i,1} \sim B(p_1)$ for subject $i = 1, \ldots, n_1$ in the experimental arm, where $p_0$ and $p_1$ are the failure probabilities. Suppose that at the $k$-th analysis there are $n_{0k}$ subjects in the control arm and $n_{1k}$ subjects in the experimental arm. For a superiority design, the null and alternative hypotheses are
$$
H_0: \; p_0 = p_1 = p \;\;\; \text{vs.} \;\;\; H_1:\; p_0 > p_1.
$$
The natural-scale treatment effect is
$$
\delta_k
=
\frac{\sum_{i=1}^{n_{1k}} X_{i,1}}{n_{1k}}
-
\frac{\sum_{i=1}^{n_{0k}} X_{i,0}}{n_{0k}}.
$$
It can be estimated as $\widehat\delta_k = \frac{\sum_{i=1}^{n_{1k}} x_{i,1}}{n_{1k}} - \frac{\sum_{i=1}^{n_{0k}} x_{i,0}}{n_{0k}}$, where $x_{i,j}$ is the observed value of $X_{i,j}$ for subject $i$ in arm $j$.

The statistical information is
$$
\mathcal I_k^{-1}
=
\text{Var}(\delta_k)
=
\left\{
\begin{array}{ll}
p(1-p)/n_{1k} + p(1-p)/n_{0k} & \text{under } H_0\\
p_1(1-p_1)/n_{1k} + p_0(1-p_0)/n_{0k} & \text{under } H_1\\
\end{array}
\right..
$$
Its estimate is
$$
\widehat{\mathcal I}_k^{-1}
=
\left\{
\begin{array}{ll}
\bar p(1 - \bar p) / n_{1k} + \bar p(1 - \bar p) / n_{0k} & \text{under } H_0\\
\widehat p_1(1 - \widehat p_1) / n_{1k} + \widehat p_0(1 - \widehat p_0)/n_{0k} & \text{under } H_1\\
\end{array}
\right.,
$$
where $\bar p = \frac{\sum_{i=1}^{n_{1k}}x_{i,1} + \sum_{i=1}^{n_{0k}}x_{i,0}}{n_{1k} + n_{0k}}$ is the pooled failure rate and $\widehat p_j = \frac{\sum_{i=1}^{n_{jk}}x_{i,j}}{n_{jk}}$ for $j = 0, 1$.
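The two estimates above can be sketched in a few lines of R. The function name `info_binary()`, its argument names, and the event counts are hypothetical and serve only to illustrate the formulas.

```r
# Estimated information for a difference in proportions.
# x1, x0: numbers of failures; n1, n0: numbers of subjects at analysis k
# (function and argument names are hypothetical)
info_binary <- function(x1, n1, x0, n0) {
  p1_hat <- x1 / n1
  p0_hat <- x0 / n0
  p_bar <- (x1 + x0) / (n1 + n0) # pooled failure rate, used under H0

  var_h0 <- p_bar * (1 - p_bar) / n1 + p_bar * (1 - p_bar) / n0
  var_h1 <- p1_hat * (1 - p1_hat) / n1 + p0_hat * (1 - p0_hat) / n0

  # Information is the inverse of each estimated variance
  c(h0 = 1 / var_h0, h1 = 1 / var_h1)
}

# Made-up counts: 20/100 failures on experimental, 30/100 on control
info <- info_binary(x1 = 20, n1 = 100, x0 = 30, n0 = 100)
```

In this example the pooled rate is $\bar p = 0.25$, so the estimated variance is $0.00375$ under $H_0$ and $0.0037$ under $H_1$. The two information estimates differ slightly even though the data are the same.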


# Survival outcomes

In many clinical trials, the outcome is the time to some event. For simplicity, assume the event is death so that each person can have only one event; the same ideas apply to events that can recur, but in those cases we restrict attention to the first event for each patient. We use the logrank statistic to compare the treatment and control arms.
Suppose there are $N_k$ deaths in total at analysis $k$. The numerator of the logrank statistic at analysis $k$ is [@proschan2006statistical]
$$
\sum_{i=1}^{N_k} D_i,
$$
where $D_i = O_i - E_i$. Here $O_i$ is the indicator that the $i$-th death occurred in a treatment patient, and $E_i = m_{1i} / (m_{0i} + m_{1i})$ is the null expectation of $O_i$ given the respective numbers, $m_{0i}$ and $m_{1i}$, of control and treatment patients at risk just prior to the $i$-th death.


Conditioned on $m_{0i}$ and $m_{1i}$, $O_i$ has a Bernoulli distribution with parameter $E_i$ under the null hypothesis. The null conditional mean and variance of $D_i$ are 0 and $V_i = E_i(1 - E_i)$, respectively.

Unconditionally, the $D_i$ are uncorrelated, mean 0 random variables with variance $E(V_i)$ under the null hypothesis.
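The quantities $O_i$, $E_i$, $D_i$, and $V_i$ can be computed directly from the risk sets. The sketch below uses a made-up data set of six patients with no tied event times; the variable names are hypothetical, and a real analysis would need to handle ties.

```r
# Toy data without tied event times (all values are made up)
time <- c(2, 3, 5, 7, 8, 11)
status <- c(1, 1, 0, 1, 1, 0) # 1 = death, 0 = censored
arm <- c(1, 0, 1, 0, 1, 0) # 1 = treatment, 0 = control

death_times <- sort(time[status == 1])

components <- do.call(rbind, lapply(death_times, function(u) {
  at_risk <- time >= u # at risk just prior to the death at time u
  m1 <- sum(at_risk & arm == 1)
  m0 <- sum(at_risk & arm == 0)
  o <- arm[time == u & status == 1] # O_i: 1 if the death is in the treatment arm
  e <- m1 / (m0 + m1) # E_i: null expectation of O_i
  data.frame(
    time = u, m0 = m0, m1 = m1,
    O = o, E = e, D = o - e, V = e * (1 - e)
  )
}))

numerator <- sum(components$D) # logrank numerator
sum_v <- sum(components$V) # sum of the null conditional variances V_i
```

Each row of `components` corresponds to one death. The risk-set sizes `m0` and `m1` shrink over time as patients die or are censored.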

Thus, taking $\delta_k = \sum_{i=1}^{N_k} D_i$ and conditioning on $N_k$, we have
$$
\begin{array}{ccl}
\mathcal I_k^{-1}
& = &
\text{Var}(\delta_k)
=
\sum_{i=1}^{N_k} \text{Var}(D_i)
=
\sum_{i=1}^{N_k} E(V_i)
=
E \left( \sum_{i=1}^{N_k} V_i \right)
=
E \left( \sum_{i=1}^{N_k} E_i(1 - E_i) \right) \\
& = &
\left\{
\begin{array}{ll}
E\left(\sum_{i=1}^{N_k} \frac{r}{1+r} \frac{1}{1+r} \right)
& \text{under } H_0\\
E\left(\sum_{i=1}^{N_k} \frac{m_{1i}}{(m_{0i} + m_{1i})} \frac{m_{0i}}{(m_{0i} + m_{1i})}\right)
& \text{under } H_1
\end{array}
\right.,
\end{array}
$$
where $r$ is the randomization ratio (experimental to control), so that $E_i$ is approximately $r / (1 + r)$ under $H_0$. Its estimate is
$$
\begin{array}{ccl}
\widehat{\mathcal I}_k^{-1}
& = &
\left\{
\begin{array}{ll}
\sum_{i=1}^{N_k} \frac{r}{1+r} \frac{1}{1+r}
& \text{under } H_0\\
\sum_{i=1}^{N_k} \frac{m_{1i}}{(m_{0i} + m_{1i})} \frac{m_{0i}}{(m_{0i} + m_{1i})}
& \text{under } H_1
\end{array}
\right..
\end{array}
$$
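The two sums in the estimate above can be compared numerically. In this sketch the at-risk counts `m0` and `m1` are hypothetical, and `r` is assumed to be the experimental-to-control randomization ratio.

```r
# Hypothetical numbers at risk just before each of N_k = 4 deaths
m0 <- c(3, 3, 2, 1) # control
m1 <- c(3, 2, 1, 1) # treatment
r <- 1 # randomization ratio (assumed experimental:control)

n_k <- length(m0) # number of deaths at analysis k

# Under H0: every death contributes r / (1 + r) * 1 / (1 + r)
est_h0 <- n_k * (r / (1 + r)) * (1 / (1 + r))

# Under H1: every death contributes E_i * (1 - E_i) from its own risk set
e_i <- m1 / (m0 + m1)
est_h1 <- sum(e_i * (1 - e_i))
```

With 1:1 randomization the $H_0$ version reduces to $N_k / 4$. The $H_1$ version can never exceed this, because $E_i(1 - E_i) \le 1/4$ for every $i$, and it falls below it whenever the risk sets become unbalanced.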

# References
