jupyter_PowerAnalysis

1 Introduction
2 Statistical test
- 2.1 $\alpha$ and $\beta$
- 2.2 Reporting
  - 2.2.1 measurement
  - 2.2.2 P-value
3 Effect size
- 3.1 Type of effect size
4 Power analysis
5 Reference

Introduction¶

This note is intended to illustrate the connection between:

Statistical test
Effect size
Power analysis

Statistical test¶

A statistical test converts a measurement (difference, proportion, etc.) into a standardized test statistic, which allows the measurement to be compared against a standardized distribution.

A statistical test is determined by:

The distribution assumption, based on the type of measurement
A significance level, $\alpha$
The null hypothesis $H_0$, which forms the null hypothesis distribution with zero mean
The alternative hypothesis $H_a$, which forms the alternative hypothesis distribution with the standardized score (t, z, etc.) as its mean
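
The standardization step can be sketched as follows. This is a minimal illustration that assumes a one-sided, one-sample z-test with a known population standard deviation; the function name `z_statistic` and all numbers are made up for the example and do not come from the original note.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean, null_mean, sigma, n):
    """Express the distance between the sample mean and the null mean in standard errors."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - null_mean) / standard_error

alpha = 0.05  # significance level

# Hypothetical measurement: mean of 103 from 100 observations, null mean of 100.
z = z_statistic(sample_mean=103.0, null_mean=100.0, sigma=15.0, n=100)

# One-sided critical value on the standardized (z) scale.
critical_value = NormalDist().inv_cdf(1 - alpha)

reject_null = z > critical_value
print(f"z = {z:.2f}, critical value = {critical_value:.3f}, reject H0: {reject_null}")
```

Once the measurement is on the z scale, the same critical value applies regardless of the original unit or sample size.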

[Figure: $H_0$ and $H_a$ distributions on the native scale]

$\alpha$ and $\beta$¶

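$\alpha$: the probability of rejecting the null hypothesis when it is true (Type I error, false positive)
$\beta$: the probability of not rejecting the null hypothesis when the alternative hypothesis is true (Type II error, false negative)
$1-\beta$: the probability of correctly rejecting the null hypothesis. "1 − β" is also known as the power of the test.
$1-\alpha$: the probability of correctly not rejecting the null hypothesis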
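
To make the two quantities concrete, the sketch below places a null and an alternative distribution on the native scale and reads $\alpha$ and $\beta$ off as tail areas. It assumes both distributions are normal with the same standard error and that the test is one-sided; the numbers are hypothetical.

```python
from statistics import NormalDist

# Hypothetical values on the native scale of the difference measurement.
standard_error = 1.0    # spread of the sampling distribution
true_difference = 2.5   # mean of the alternative (H_a) distribution
alpha = 0.05

h0 = NormalDist(mu=0.0, sigma=standard_error)              # null distribution
ha = NormalDist(mu=true_difference, sigma=standard_error)  # alternative distribution

# Rejection threshold on the native scale: cuts off the upper alpha tail of H0.
threshold = h0.inv_cdf(1 - alpha)

beta = ha.cdf(threshold)  # H_a is true but the measurement falls below the threshold
power = 1 - beta          # H_a is true and H0 is correctly rejected

print(f"threshold = {threshold:.3f}, beta = {beta:.3f}, power = {power:.3f}")
```

Moving the threshold to the right lowers $\alpha$ but raises $\beta$, which is the trade-off a power analysis has to balance.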

Reporting¶

Measurement¶

The general form is: $$\text{CI}_{1-\alpha} = \overline{X} \pm \text{E} \text{ with } \text{E} = CV_\alpha \times \sqrt{\frac{\sigma^2}{n}}$$

where:

$\alpha$: the significance level (the threshold for statistical significance), typically 5%, 10%, or 20%
$\text{CI}$: the $1-\alpha$ confidence interval
$\overline{X}$: the mean value of the measurement
$\text{E}$: the margin of error for the mean measurement
- $1-\alpha$: the confidence level
- $CV_{\alpha}$: the critical value, i.e., the number of standard errors on the standardized scale, determined by $\alpha$
- $\sigma$: standard deviation
- $n$: sample size
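
As a worked instance of the formula above, the following sketch computes a two-sided confidence interval. It assumes a known $\sigma$ and a normal (z) critical value; the helper name `confidence_interval` and the inputs are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(mean, sigma, n, alpha=0.05):
    """Two-sided (1 - alpha) confidence interval for a mean with known sigma."""
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)   # CV_alpha on the z scale
    margin_of_error = critical_value * sqrt(sigma**2 / n)  # E
    return mean - margin_of_error, mean + margin_of_error

low, high = confidence_interval(mean=50.0, sigma=10.0, n=100, alpha=0.05)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

With a small sample and an estimated standard deviation, a t critical value would normally replace the z critical value used here.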

P-value¶

The P-value is a probability: assuming the null hypothesis is true, it measures how likely it is to observe, by chance alone, a difference between groups at least as large as the one measured. A small p-value therefore limits the risk of a false positive. The p-value operates on the scale of the difference measurement. The general form is:

$$\text{P-value} = Pr(X > \overline{X} \mid H_0) = Pr(\text{SS} > \text{CV} \mid H_0)$$

where:

$X$: the native scale of the difference measurement
- $\text{SS}$: the standardized scale of the difference measurement, i.e., z, t, etc.
$\overline{X}$: the mean value of the difference measurement, calculated from samples
- $\text{CV}$: the cut-off on the standardized scale of the difference measurement, obtained by standardizing $\overline{X}$, i.e., $z^*$, $t^*$, etc.

However, the p-value cannot describe:

the degree of correlation,
the size of the difference,
whether the result was affected by "p-value manipulation".
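
The second limitation can be illustrated numerically. The sketch below assumes a one-sided, one-sample z-test with a known standard deviation (the function name and numbers are hypothetical) and evaluates the same observed difference at two sample sizes.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_p_value(observed_difference, sigma, n):
    """Upper-tail probability of the standardized difference under H0."""
    z = observed_difference / (sigma / sqrt(n))
    return 1 - NormalDist().cdf(z)

# Same observed difference and spread; only the sample size changes.
p_small = one_sided_p_value(observed_difference=1.0, sigma=10.0, n=25)
p_large = one_sided_p_value(observed_difference=1.0, sigma=10.0, n=2500)

print(f"n = 25:   p = {p_small:.4f}")
print(f"n = 2500: p = {p_large:.2e}")
```

The difference is identical in both cases, yet the p-value changes by several orders of magnitude, so the p-value alone says little about how large the difference is.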

Effect size¶

To overcome the disadvantages of the p-value, effect size is an alternative way of reporting significance. The general form is:

$$\text{ES} = \frac{\Delta\mu}{\sigma}$$

where:

$\text{ES}$: the effect size
$\Delta\mu$: the mean difference between two groups. (this is different from margin of error)
$\sigma$: the standard deviation

Effect size is a measure of the magnitude of an effect. Its basic properties are that it is:

independent of the measurement unit,
monotonic,
unaffected by the sample size.
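
The unit-independence property can be checked directly. The sketch below assumes two independent groups and uses the pooled standard deviation for $\sigma$; the data are made up, and `effect_size` is an illustrative helper rather than a library function.

```python
from math import sqrt
from statistics import mean, variance

def effect_size(group_a, group_b):
    """Standardized mean difference: (mean_a - mean_b) / pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)

# Made-up measurements in metres.
treated_m = [1.72, 1.80, 1.76, 1.84, 1.78]
control_m = [1.70, 1.74, 1.69, 1.77, 1.75]

# The same measurements expressed in centimetres.
treated_cm = [100 * x for x in treated_m]
control_cm = [100 * x for x in control_m]

es_m = effect_size(treated_m, control_m)
es_cm = effect_size(treated_cm, control_cm)
print(f"ES in metres = {es_m:.3f}, ES in centimetres = {es_cm:.3f}")
```

Rescaling multiplies both the mean difference and the standard deviation by the same factor, so the ratio is unchanged.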

Type of effect size¶

Depending on the properties of the experimental design, effect sizes with different adjustments are selected for different needs, including:

type of measurement:
- difference, correlation, group-overlap
type of outcome:
- Continuous vs Dichotomous
sample size:
- small vs large
experimental design:
- balanced vs unbalanced

Comparing difference¶

Distribution	standard deviation	Margin of error	Sample size required
Continuous Outcome
Dichotomous Outcome

Adjustment:

Cohen’s d
Hedges’ g
Glass's Delta
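
As an illustration of what such an adjustment does, the sketch below applies the commonly used approximate small-sample correction that turns Cohen's d into Hedges' g. The function names are hypothetical and the value of d is an arbitrary example.

```python
def hedges_correction(n_a, n_b):
    """Approximate small-sample correction factor: 1 - 3 / (4 * (n_a + n_b) - 9)."""
    return 1 - 3 / (4 * (n_a + n_b) - 9)

def hedges_g(cohens_d, n_a, n_b):
    """Hedges' g: Cohen's d shrunk by the small-sample correction factor."""
    return cohens_d * hedges_correction(n_a, n_b)

d = 0.8  # arbitrary Cohen's d used for illustration
for n_per_group in (5, 20, 100):
    g = hedges_g(d, n_per_group, n_per_group)
    print(f"n per group = {n_per_group:3d}: d = {d:.3f}, g = {g:.3f}")
```

The correction matters mainly for small groups; as the sample grows, g approaches d.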

Comparing correlation¶

Distribution	standard deviation	Margin of error	Sample size required

Adjustment:

Comparing group-overlap¶

Distribution	standard deviation	Margin of error	Sample size required

Adjustment:

Power analysis¶

A power analysis is the calculation used to estimate the smallest sample size needed for an experiment, given a required significance level, statistical power, and effect size. It helps to ensure that an experiment or survey is large enough to distinguish a genuine effect from chance variation.

Sample size requirement¶

The minimum sample size required to observe the population distribution is usually viewed in terms of effect size and critical value. The general form is (derived from 2.2.1-2):

$$\hat{n} \ge \left(\frac{CV_\alpha \times \sigma}{\overline{E}}\right)^2 = \left(\frac{CV_\alpha}{ES}\right)^2$$

where:

$\hat{n}$: the sample size required to achieve the designated significance level.
$\alpha$: the significance level expected to be achieved in the statistical test after data collection, usually 0.05, 0.10, or 0.20
$CV_\alpha$: the critical value on the standardized scale, which is determined by the significance level $\alpha$, i.e., $z^*$, $t^*$, etc.
$ES$: the effect size, defined here as $\overline{E}/\sigma$, where:
- $\sigma$: the pooled standard deviation, which is usually assumed to be the same as in a previous study.
- $\overline{E}$: the designated margin of error, which is usually determined by the investigators during experiment design.
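
The formula can be evaluated directly. The sketch below assumes a two-sided z critical value and rounds up to the next whole observation; `required_sample_size` and the inputs are illustrative, not part of the original note.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, margin_of_error, alpha=0.05):
    """Smallest integer n with n >= (CV_alpha / ES)^2, where ES = E / sigma."""
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided z critical value
    effect_size = margin_of_error / sigma
    return ceil((critical_value / effect_size) ** 2)

# Hypothetical design: sigma taken from a previous study, E chosen by the investigators.
n_hat = required_sample_size(sigma=15.0, margin_of_error=3.0, alpha=0.05)
print(f"required sample size: {n_hat}")

# Halving sigma doubles the effect size and cuts the requirement to roughly a quarter.
n_hat_half_sigma = required_sample_size(sigma=7.5, margin_of_error=3.0, alpha=0.05)
print(f"required sample size with half the standard deviation: {n_hat_half_sigma}")
```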

Note:

Important application: if we can increase the effect size by decreasing the standard deviation, a smaller sample size is required to reach the critical value at the same significance level.
The sample size requirement can be interpreted with the central limit theorem (CLT), where $\sigma_{\text{sample}}$ denotes the standard deviation of the sample mean:

$$\text{CLT: } \mu_{\bar{x}_{\text{sample}}} = \mu$$

$$\bar{X}_{\text{sample}} \sim N\left(\mu, \dfrac{\sigma^2_{\text{population}}}{n}\right)$$

$$X_{\text{population}} \sim N\left(\mu, \sigma^2_{\text{population}}\right)$$

$$\therefore \sqrt{n} = \dfrac{\sigma_{\text{population}}}{\sigma_{\text{sample}}} = \dfrac{\Delta\mu}{\sigma_{\text{sample}}} \times \dfrac{\sigma_{\text{population}}}{\Delta\mu} = \frac{CV_\alpha}{ES}$$
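
The relation between the population spread and the spread of the sample mean can be checked with a small simulation. This is a sketch under assumed values (normal population, fixed random seed, arbitrary mean and standard deviation), not a proof.

```python
import random
from math import sqrt
from statistics import stdev

random.seed(0)  # fixed seed so the sketch is reproducible

population_mu, population_sigma = 10.0, 4.0
n = 25             # observations per sample
replicates = 5000  # number of simulated samples

# Draw many samples of size n and record each sample mean.
sample_means = [
    sum(random.gauss(population_mu, population_sigma) for _ in range(n)) / n
    for _ in range(replicates)
]

observed_se = stdev(sample_means)         # empirical spread of the sample mean
expected_se = population_sigma / sqrt(n)  # sigma_population / sqrt(n)

print(f"observed = {observed_se:.3f}, expected = {expected_se:.3f}")
print(f"sigma_population / observed = {population_sigma / observed_se:.2f}, sqrt(n) = {sqrt(n):.2f}")
```

The ratio of the population standard deviation to the spread of the sample mean should come out close to $\sqrt{n}$.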

Power estimation¶

The power of a binary hypothesis test is the probability that the test correctly rejects the null hypothesis when a specific alternative hypothesis is true. Like the p-value, power operates on the scale of the difference measurement. The general form is:

$$\text{Power} = 1-\beta = Pr(X > \overline{E} \mid H_a) = Pr(\text{SS} > \text{CV} \mid H_a)$$

where:

$X$: the native scale of the difference measurement.
- $\text{SS}$: the standardized scale of the difference measurement, i.e., z, t, etc.
$\overline{E}$: the designated margin of error, which is pre-determined by the investigators during experiment design and is expected to be similar to the value calculated from the collected data afterward.
- $\text{CV}$: the critical value on the standardized scale, which is obtained by standardizing $\overline{E}$, i.e., $z^*$, $t^*$, etc.
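
A minimal numerical sketch of this tail probability is given below. It assumes a one-sided, one-sample z-test, under which the standardized statistic is normal with mean $ES \times \sqrt{n}$ and unit variance when the alternative is true; the function names and inputs are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-sided, one-sample z-test: upper-tail probability beyond CV under H_a."""
    critical_value = NormalDist().inv_cdf(1 - alpha)
    # Under H_a the standardized statistic is centred at effect_size * sqrt(n).
    alternative = NormalDist(mu=effect_size * sqrt(n), sigma=1.0)
    return 1 - alternative.cdf(critical_value)

def smallest_n_for_power(effect_size, target_power=0.8, alpha=0.05):
    """Increase n one observation at a time until the target power is reached."""
    n = 2
    while z_test_power(effect_size, n, alpha) < target_power:
        n += 1
    return n

power_at_30 = z_test_power(effect_size=0.5, n=30)
n_for_80 = smallest_n_for_power(effect_size=0.5, target_power=0.8)
print(f"power at n = 30: {power_at_30:.3f}")
print(f"smallest n for 80% power: {n_for_80}")
```

A t-test has slightly lower power than this normal approximation at small n, because the critical value has to account for the estimated standard deviation.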

Available power analysis classes in 'statsmodels.stats.power':

statsmodels.stats.power.TTestIndPower
statsmodels.stats.power.TTestPower (see example)
statsmodels.stats.power.GofChisquarePower
statsmodels.stats.power.NormalIndPower
statsmodels.stats.power.FTestAnovaPower
statsmodels.stats.power.FTestPower
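
These classes share a `solve_power` method that solves for whichever argument is left as `None`. The sketch below uses `TTestIndPower` for a two-sample t-test; the effect size, significance level and target power are arbitrary example values, and it assumes `statsmodels` is installed.

```python
from math import ceil
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Solve for the sample size of the first group; the unknown is passed as None.
n_per_group = float(analysis.solve_power(effect_size=0.5, nobs1=None, alpha=0.05,
                                         power=0.8, ratio=1.0,
                                         alternative='two-sided'))
print(f"required observations per group: {ceil(n_per_group)}")

# Solve for power instead, given a fixed number of observations per group.
achieved_power = float(analysis.solve_power(effect_size=0.5, nobs1=64, alpha=0.05,
                                            power=None, ratio=1.0,
                                            alternative='two-sided'))
print(f"power with 64 observations per group: {achieved_power:.3f}")
```

Here `ratio=1.0` assumes a balanced design with equally sized groups.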

Example code¶

import numpy as np
import statsmodels.stats.power as smpwr
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(10, 10))

# Top panel: power as a function of sample size, one curve per effect size.
ax = fig.add_subplot(2, 1, 1)
fig = smpwr.TTestPower().plot_power(dep_var='nobs',  # variable on the x-axis: 'nobs', 'effect_size', or 'alpha'
                                    nobs=np.arange(2, 200),
                                    effect_size=np.array([0.1, 0.2, 0.3, 0.5, 1, 1.5, 2]),
                                    alternative='larger',  # one-sided test
                                    ax=ax, title='Power of t-Test')

# Bottom panel: power as a function of effect size, one curve per sample size.
ax = fig.add_subplot(2, 1, 2)
fig = smpwr.TTestPower().plot_power(dep_var='es',
                                    nobs=np.array([10, 20, 30, 50, 70, 100]),
                                    effect_size=np.linspace(0.01, 2, 51),
                                    alternative='larger',
                                    ax=ax, title='')  # suppress the title
plt.show()

Reference¶


Power Analysis

Weiquan Luo

2023-06-05
