forked from MathiasHarrer/Doing-Meta-Analysis-in-R
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path15-power_analysis.Rmd
377 lines (274 loc) · 19.9 KB
/
15-power_analysis.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
# Power Analysis
![](_figs/poweranalysis.jpg)
A big asset (and probably one of the reasons why meta-analysis can be helpful in practical research) of meta-analyses is that **they allow for large data to be combined** to attain a more precise pooled effect. **Lack of statistical power**, however, may still play an important role, even in meta-analysis. This is particularly true for the clinical field, where it is often the case that only **few studies are available for synthesis**. The median number of included studies in the *Cochrane Database for Systematic Reviews*, for example, is six [@borenstein2011]. This is even more grave once we consider that (1) many meta-analysts will also want to perform **subgroup analyses and meta-regression**, for which even more power is required, and (2) many meta-analyses have **high heterogeneity**, which reduces our precision, and thus our power.
Power is directly related to the **Type II error level** ($\beta$) we defined: $Power = 1- \beta$. It is common practice to set our **Type I error level** ($\alpha$) to $\alpha=0.05$, and thus to assume that the **Type I error is four times as grave as the Type II error** (i.e., falsely finding an effect while there is no effect in reality is four times as bad as not finding an effect while there is one in reality). The **Type II error** is therefore set at $\beta=0.20$, and the power should thus be $1-\beta=1-0.20=80\%$.
<br><br>
```{block,type='rmdinfo'}
**What assumptions should i make for my meta-analysis?**
While researchers conducting primary studies can **plan the size of their sample based on the effect size they want to find**, the situation is a little different in meta-analysis, where we can only work with the published material. However, we have some **control over the number of studies we want to include in our meta-analysis** (e.g., through more leniently or strictly defined inclusion criteria). Therefore, we can change our power to some extent by including more or less studies into the meta-analysis. There are **four things we have to make assumptions about when assessing the power of our meta-analysis a priori**.
* The **number of included or includable studies** $k$
* The **overall size of the studies we want to include** (are the studies in the field rather small or large?)
* The **effect size we want to determine**. This is particularly important, as we have to make assumptions about how **big an effect size has to be to still be clinically meaningful**. One study calculated that for interventions against depression, even effects as small as $SMD=0.24$ may still be meaningful for patients [@cuijpers2014threshold]. If we want to study **negative effects of an intervention** (e.g., death or symptom deterioration), even **very small effect sizes are extremely important and should be detected**.
* The **heterogeneity** of our studies' effect sizes, as this also affects the precision of our meta-analysis, and thus its potential to find significant effects.
Besides these parameters, it is also important to think about other analyses, such as the **subgroup analyses** we want to conduct. How many studies are there for each subgroup, and what effects do we want to find in the subgroups? This is particularly important if we **hypothesize that an intervention is not effective in a subgroup of patients**, because we do not want to falsely find a treatment to be ineffective simply because the power was insufficient.
```
```{block,type='rmdachtung'}
**Post-hoc power tests: the abuse of power**
Please note that power analyses should always be conducted **a priori**, meaning *before* you perform the meta-analysis.
Power analyses conducted *after* an analysis ("post hoc") are fundamentally flawed [@hoenig2001abuse], as they suffer from the so-called **"power approach paradox"**, in which an analysis yielding no significant effect is thought to show more evidence that the null hypothesis is true when the p-value is smaller, since then, the power to detect a true effect would be higher.
```
---
## Fixed-Effect Model {#fixed.power}
To determine the **power** of a meta-analysis **under the fixed-effect model**, we have to assume the **true value of a distribution when the alternative hypothesis is correct** (i.e., when there is an effect). For power analysis in a conventional study, this distribution is $Z$. Follwing Borenstein et al. [@borenstein2011], we will call the true value $\lambda$ here to make clear that we are dealing with a meta-analysis, and not a primary study. $\lambda$ is defined as:
$$\lambda=\frac{\delta}{\sqrt{V_{\delta}}}$$
Where $\delta$ is the **true effect size** and $V_{\delta}$ its variance.
$V_{\delta}$ can be calculated for meta-analysis using the fixed-effect model with this formula:
$$V_{\delta}=\frac{\frac{n_1+n_2}{n_1xn_2}+\frac{d^2}{2(n_1+n_2)}}{k}$$
Where $k$ are all the included studies, and $n_1$ and $n_2$ are the **average sample sizes in each trial arm we assume across our studies**.
Assuming a normal distribution and using $\lambda$, we can calculate the Power:
$$Power = 1- \beta$$
$$Power = 1- \Phi(c_{\alpha}-\lambda)+\Phi(-c_{\alpha}-\lambda) $$
Where $c_{\alpha}$ is the critical value of a $Z$-distribution. $\Phi$ is the **standard normal density function**, which we we need to calcuate the power using this equation:
$$\Phi(Z)=\frac{1}{\sqrt {2\pi}}e^{-\frac{Z^2}{2}}$$
Luckily, you don't have too think about these statistical details too much, as we have prepared a **function** for you with which you can easily conduct a **power analysis** using the fixed-effect model yourself. The function is called `power.analysis`. The function is part of the [`dmetar`](#dmetar) package. If you have the package installed already, you have to load it into your library first.
```{r, eval=FALSE}
library(dmetar)
```
If you don't want to use the `dmetar` package, you can find the source code for this function [here](https://raw.githubusercontent.com/MathiasHarrer/dmetar/master/R/power.analysis.R). In this case, *R* doesn't know this function yet, so we have to let *R* learn it by **copying and pasting** the code **in its entirety** into the **console** in the bottom left pane of RStudio, and then hit **Enter ⏎**. The function then requires the `ggplot2` package to work.
```{r, echo=FALSE}
source("dmetar/power.analysis.R")
power.analysis.fixed<-function(d,k,n1,n2,p){
n1<-n1
n2<-n2
d<-d
k<-k
p<-p
title<-c("Power for a fixed-effect meta-analysis:")
v.d<-((n1+n2)/(n1*n2))+((d*d)/(2*(n1+n2)))
v.m<-v.d/k
lambda<-(d/sqrt(v.m))
plevel<-1-(p/2)
zval<-qnorm(p=plevel, 0,1)
power<-1-(pnorm(zval-lambda))+(pnorm(-zval-lambda))
return(power)
}
```
---
**For this function, we have to specify the following parameters:**
```{r,echo=FALSE}
library(kableExtra)
Package<-c("d","OR", "k", "n1", "n2", "p", "heterogeneity")
Description<-c("The hypothesized, or plausible overall effect size of a treatment/intervention under study compared to control, expressed as the standardized mean difference (SMD). Effect sizes must be positive numerics (i.e., expressed as positive effect sizes).",
"The hypthesized, or plausible overall effect size of a treatment/intervention under study compared to control, expressed as the Odds Ratio (OR). If both d and OR are specified, results will only be computed for the value of d.",
"The expected number of studies to be included in the meta-analysis.",
"The expected, or plausible mean sample size of the treatment group in the studies to be included in the meta-analysis.",
"The expected, or plausible mean sample size of the control group in the studies to be included in the meta-analysis.",
"The alpha level to be used for the power computation. Default is alpha=0.05",
"Which level of between-study heterogeneity to assume for the meta-analysis. Can be either 'fixed' for no heterogeneity/a fixed-effect model, 'low' for low heterogeneity, 'moderate' for moderate-sized heterogeneity or 'high' for high levels of heterogeneity. Default is 'fixed'.")
m<-data.frame(Package,Description)
names<-c("Parameter", "Description")
colnames(m)<-names
kable(m)
```
Now, let's give an example. I assume that an effect of $d=0.30$ is likely and meaningful for the field of my meta-analysis. I also assume that on average, the studies in my analysis will be rather small, with 25 participants in each trial arm, and that there will be 10 studies in my analysis. I will set the $\alpha$-level to 0.05, as is convention.
```{r, eval=FALSE}
power.analysis.(d=0.30,
k=10,
n1=25,
n2=25,
p=0.05)
```
The output of the function is:
```{r,echo=FALSE, fig.align="center"}
power.analysis(d=0.30,k=10,n1=25,n2=25,p=0.05)
```
Meaning that my power is 92%. This is more than the desired 80%, so given that my assumptions are remotely true, my meta-analysis will have **sufficient power using the fixed-effect model to detect a clinically relevant effect if it exists**.
So, if i assume an effect of $d = 0.30$ in this example, i am lucky. If we play around with the effect size a little, however, while holding the other parameters constant, this can look very different.
```{r,fig.width=5,fig.align='center',echo=FALSE,message=FALSE}
library(ggplot2)
library(reshape)
k <- seq(0, 50, length=1000)
pow.vals01<-lapply(k,function(k) power.analysis.fixed(d=0.10,k=k,n1=25,n2=25,p=0.05))
pow.vals02<-lapply(k,function(k) power.analysis.fixed(d=0.20,k=k,n1=25,n2=25,p=0.05))
pow.vals03<-lapply(k,function(k) power.analysis.fixed(d=0.30,k=k,n1=25,n2=25,p=0.05))
pow.vals01<-as.numeric(pow.vals01)
pow.vals02<-as.numeric(pow.vals02)
pow.vals03<-as.numeric(pow.vals03)
data<-data.frame(k,pow.vals01,pow.vals02,pow.vals03)
ggplot()+
geom_line(data = data, aes(x = k, y = pow.vals01), color = "blue",size=2) +
geom_line(data = data, aes(x = k, y = pow.vals02), color = "red",size=2) +
geom_line(data = data, aes(x = k, y = pow.vals03), color = "green",size=2) +
xlab('Number of Studies') +
ylab('Power')+
scale_y_continuous(labels = scales::percent)+
theme(
axis.line= element_line(color = "black",size = 1,linetype = "solid"),
legend.position = "bottom",
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
legend.background = element_rect(linetype="solid",
colour ="black"),
legend.title = element_blank(),
legend.key.size = unit(0.75,"cm"),
legend.text=element_text(size=14))+
annotate("text", x = 2, y = 0.7, label = "d = 0.3",size=5)+
annotate("text", x = 14, y = 0.6, label = "d = 0.2",size=5)+
annotate("text", x = 20, y = 0.24, label = "d = 0.1",size=5)+
geom_hline(yintercept=0.8,linetype="dashed")
```
As you can see from this plot, sufficient power (see the **dashed line**) is soon reached for $d=0.30$, even if only few studies are included. If i assume a smaller effect size of $d=0.10$, however, **even 50 studies will not be sufficient to find a true effect**.
---
## Random-Effects Model
For power analyses under the **random-effects model**, the formula to calculate the variance of my true mean effect looks slightly different:
$$V_{\delta}^*=\frac{V_Y+\tau^2}{k}$$
We see that again, $tau^2$ has to be included to take the **between-study heterogeneity into account** (see [Chapter 4.2](#random) for more details). However, i **do not know the between-study heterogeneity of my analysis** before i perform it, so what value should i assume?
According to Hedges and Pigott [@hedges2004power], the follwing formulae may be used to calculate the power in the random-effect model assuming **small, moderate or high heterogeneity**:
**Small heterogeneity:**
$$V_{\delta}^*=1.33\times\frac{V_Y}{k}$$
**Moderate heterogeneity:**
$$V_{\delta}^*=1.67\times\frac{V_Y}{k}$$
**Large heterogeneity:**
$$V_{\delta}^*=2\times\frac{V_Y}{k}$$
Again, you don't have to worry about the statistical details here. We have put the entire calculations into the `power.analysis` function, which can already introduced [before](#fixed.power).
```{r, echo=FALSE}
power.analysis.random<-function(d,k,n1,n2,p,heterogeneity){
n1<-n1
n2<-n2
d<-d
k<-k
p<-p
heterogeneity<-heterogeneity
if(heterogeneity=="low"){
v.d<-((n1+n2)/(n1*n2))+((d*d)/(2*(n1+n2)))
v.m<-v.d/k
v.m<-1.33*v.m
lambda<-(d/sqrt(v.m))
plevel<-1-(p/2)
zval<-qnorm(p=plevel, 0,1)
power<-1-(pnorm(zval-lambda))+(pnorm(-zval-lambda))
return(power)
}
if(heterogeneity=="moderate"){
v.d<-((n1+n2)/(n1*n2))+((d*d)/(2*(n1+n2)))
v.m<-v.d/k
v.m<-1.67*v.m
lambda<-(d/sqrt(v.m))
plevel<-1-(p/2)
zval<-qnorm(p=plevel, 0,1)
power<-1-(pnorm(zval-lambda))+(pnorm(-zval-lambda))
return(power)
}
if(heterogeneity=="high"){
v.d<-((n1+n2)/(n1*n2))+((d*d)/(2*(n1+n2)))
v.m<-v.d/k
v.m<-2*v.m
lambda<-(d/sqrt(v.m))
plevel<-1-(p/2)
zval<-qnorm(p=plevel, 0,1)
power<-1-(pnorm(zval-lambda))+(pnorm(-zval-lambda))
return(power)
}
}
```
I will assume the same parameters used for the [fixed-effect model power analysis](#fixed.power), but this time i will also have to specify the `heterogeneity` in the function, which can take the values `"low"`, `"moderate"` and `"high"`. I will choose `"moderate"` for this example.
```{r,eval=FALSE}
power.analysis(d=0.30,
k=10,
n1=25,
n2=25,
p=0.05,
heterogeneity = "moderate")
```
The output i get is:
```{r,echo=FALSE, fig.align="center"}
power.analysis(d=0.30,k=10,n1=25,n2=25,p=0.05,heterogeneity = "moderate")
```
Interestingly, we see that this value is 73%, which is smaller than the value of 91% which was calculated using the **fixed-effect model**. The value is also below 80%, meaning that i would not have optimal power to find the desired effect of $d=0.30$ to be statistically significant if it exists.
This has to do with the **larger heterogeneity** i assume in this simulation, which decreases the precision of my effect size estimate, and thus increases my need for statistical power.
**The graph below visualizes this relationship:**
```{r,fig.width=5,fig.align='center',echo=FALSE,fig.cap="Power in the random-effects-model. Darker colors indicate higher heterogeneity",message=FALSE}
library(ggplot2)
library(reshape)
k <- seq(0, 50, length=1000)
pow.vals01<-lapply(k,function(k) power.analysis.random(d=0.10,k=k,n1=25,n2=25,p=0.05,heterogeneity = "moderate"))
pow.vals02<-lapply(k,function(k) power.analysis.random(d=0.20,k=k,n1=25,n2=25,p=0.05,heterogeneity = "moderate"))
pow.vals03<-lapply(k,function(k) power.analysis.random(d=0.30,k=k,n1=25,n2=25,p=0.05,heterogeneity = "moderate"))
pow.vals01<-as.numeric(pow.vals01)
pow.vals02<-as.numeric(pow.vals02)
pow.vals03<-as.numeric(pow.vals03)
data1<-data.frame(k,pow.vals01,pow.vals02,pow.vals03)
k <- seq(0, 50, length=1000)
pow.vals01<-lapply(k,function(k) power.analysis.random(d=0.10,k=k,n1=25,n2=25,p=0.05,heterogeneity = "low"))
pow.vals02<-lapply(k,function(k) power.analysis.random(d=0.20,k=k,n1=25,n2=25,p=0.05,heterogeneity = "low"))
pow.vals03<-lapply(k,function(k) power.analysis.random(d=0.30,k=k,n1=25,n2=25,p=0.05,heterogeneity = "low"))
pow.vals01<-as.numeric(pow.vals01)
pow.vals02<-as.numeric(pow.vals02)
pow.vals03<-as.numeric(pow.vals03)
data2<-data.frame(k,pow.vals01,pow.vals02,pow.vals03)
k <- seq(0, 50, length=1000)
pow.vals01<-lapply(k,function(k) power.analysis.random(d=0.10,k=k,n1=25,n2=25,p=0.05,heterogeneity = "high"))
pow.vals02<-lapply(k,function(k) power.analysis.random(d=0.20,k=k,n1=25,n2=25,p=0.05,heterogeneity = "high"))
pow.vals03<-lapply(k,function(k) power.analysis.random(d=0.30,k=k,n1=25,n2=25,p=0.05,heterogeneity = "high"))
pow.vals01<-as.numeric(pow.vals01)
pow.vals02<-as.numeric(pow.vals02)
pow.vals03<-as.numeric(pow.vals03)
data3<-data.frame(k,pow.vals01,pow.vals02,pow.vals03)
ggplot()+
geom_line(data = data1, aes(x = k, y = pow.vals01), color = "blue",size=2) +
geom_line(data = data1, aes(x = k, y = pow.vals02), color = "#ff0000",size=2) +
geom_line(data = data1, aes(x = k, y = pow.vals03), color = "#00cc00",size=2) +
geom_line(data = data2, aes(x = k, y = pow.vals01), color = "lightblue",size=2) +
geom_line(data = data2, aes(x = k, y = pow.vals02), color = "#ff4d4d",size=2) +
geom_line(data = data2, aes(x = k, y = pow.vals03), color = "#1aff1a",size=2) +
geom_line(data = data3, aes(x = k, y = pow.vals01), color = "darkblue",size=2) +
geom_line(data = data3, aes(x = k, y = pow.vals02), color = "#b30000",size=2) +
geom_line(data = data3, aes(x = k, y = pow.vals03), color = "#008000",size=2) +
xlab('Number of Studies') +
ylab('Power')+
scale_y_continuous(labels = scales::percent)+
theme(
axis.line= element_line(color = "black",size = 1,linetype = "solid"),
legend.position = "bottom",
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
legend.background = element_rect(linetype="solid",
colour ="black"),
legend.title = element_blank(),
legend.key.size = unit(0.75,"cm"),
legend.text=element_text(size=14))+
annotate("text", x = 3, y = 0.7, label = "d = 0.3",size=5)+
annotate("text", x = 25, y = 0.6, label = "d = 0.2",size=5)+
annotate("text", x = 20, y = 0.13, label = "d = 0.1",size=5)+
geom_hline(yintercept=0.8,linetype="dashed")
```
---
## Subgroup Analyses
Often when conducting a test for subgroup differences within a meta-analysis, we might be interested which effect size difference is needed between the two subgroups for the difference to become significant given the data material at hand. To evaluate this, a **power analysis for subgroup differences** may be helpful.
We have created a function called `power.analysis.subgroup`, which calculates the power of a subgroup difference test (using only two subgroups) by implementing the formulae described by @hedges2001power. The function is part of the [`dmetar`](#dmetar) package. If you have the package installed already, you have to load it into your library first.
```{r, eval=FALSE}
library(dmetar)
```
If you don't want to use the `dmetar` package, you can find the source code for this function [here](https://raw.githubusercontent.com/MathiasHarrer/dmetar/master/R/power.analysis.subgroup.R). In this case, *R* doesn't know this function yet, so we have to let *R* learn it by **copying and pasting** the code **in its entirety** into the **console** in the bottom left pane of RStudio, and then hit **Enter ⏎**. The function then requires the `ggplot2` package to work.
```{r, echo=FALSE}
source("dmetar/power.analysis.subgroup.R")
```
Let's assume we have a subgroup analysis in which the pooled effect size of subgroup 1 is $g=0.30$, with a standard error of $0.13$, and an effect size of $g=0.66$ for subgroup 2, with a standard error of $0.14$. We can then use the function like this:
```{r, fig.align="center"}
power.analysis.subgroup(TE1=0.30,
TE2=0.66,
seTE1=0.13,
seTE2=0.14)
```
From the output, we can see that the power for our subgroup test (47%) is **not sufficient**. We also see that the effect size difference must be at least 0.536, leaving all other parameters the same, to reach sufficient power.
---
## Power Calculator Tool
If you're feeling lazy, or if you want to quickly **check for the power of your meta-analysis under varying assumptions**, you might need a tool which makes it easier for you to calculate the power without having to run the R functions we described before each time.
We therefore built a online **Power Calculator Tool**, which you can find below. The calculations are based on the formulae and functions we described in the previous chapters.
[View in full page mode](https://mathiasharrer.shinyapps.io/power_calculator_meta_analysis/)
```{r chunk5,echo=FALSE}
knitr::include_app("https://mathiasharrer.shinyapps.io/power_calculator_meta_analysis/", height = "1500px")
```
---