This note is intended to illustrate the connection between:
A statistical test produces a standardized test statistic, which allows the measurement (difference, proportion, etc.) to be compared against a standardized distribution.
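As a minimal sketch of this standardization, assuming a one-sample z-test with a known population standard deviation (the function name and the numbers below are illustrative, not taken from this note):

```python
import math


def standardized_statistic(sample_mean, null_mean, sigma, n):
    """Return z = (sample_mean - null_mean) / (sigma / sqrt(n)).

    Assumes a one-sample z-test with known sigma (an illustrative choice).
    """
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error


# Made-up numbers: observed mean 103 against a null mean of 100.
z_stat = standardized_statistic(sample_mean=103.0, null_mean=100.0, sigma=15.0, n=100)
print(f"z = {z_stat:.2f}")  # z = 2.00
```

Dividing by the standard error puts the raw difference (3 units here) on the scale of the standard normal distribution, where it can be compared with a critical value.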
A statistical test is determined by:
Illustration of the H0 and H1 distributions at native scale:
The general form is: $$\text{CI}_{1-\alpha} = \overline{X} \pm \text{E} \text{ with } \text{E} =CV_\gamma \times \sqrt{\frac{\sigma^2}{n}}$$
where:
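A short numeric sketch of the interval above. It assumes that $CV_\gamma$ is the two-sided standard normal quantile at $\gamma = 1 - \alpha/2$ and that $\sigma$ is known; both are assumptions made for illustration, as are the function name and the input values.

```python
import math
from statistics import NormalDist


def confidence_interval(sample_mean, sigma, n, alpha=0.05):
    """Return (lower, upper, margin) for a z-based confidence interval.

    Assumes CV is the two-sided normal quantile at 1 - alpha/2.
    """
    cv = NormalDist().inv_cdf(1 - alpha / 2)   # critical value, about 1.96 for alpha = 0.05
    margin = cv * math.sqrt(sigma ** 2 / n)    # E = CV * sqrt(sigma^2 / n)
    return sample_mean - margin, sample_mean + margin, margin


# Made-up inputs: mean 103, sigma 15, n 100.
ci_low, ci_high, margin = confidence_interval(sample_mean=103.0, sigma=15.0, n=100)
print(f"95% CI: [{ci_low:.2f}, {ci_high:.2f}], margin of error = {margin:.2f}")
```

Because the margin of error scales with $1/\sqrt{n}$, quadrupling the sample size halves the width of the interval.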
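The p-value is a probability: assuming the null hypothesis is true, it measures how likely it is to observe, purely by chance, a difference between groups at least as large as the one actually observed. A small p-value therefore limits the risk of a false positive. The p-value operates on the scale of the difference measurement. The general form is: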
$$\text{P-value} = Pr(X > \overline{X}) = Pr(\text{SS} > \text{CV})$$
where:
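A minimal sketch of the calculation, assuming the standardized statistic follows a standard normal distribution under the null hypothesis and the test is one-sided (upper tail). The function name is hypothetical.

```python
from statistics import NormalDist


def one_sided_p_value(z_stat):
    """Upper-tail probability of a standard normal at z_stat.

    Assumes the statistic is standard normal under H0 (illustrative choice).
    """
    return 1.0 - NormalDist().cdf(z_stat)


p_value = one_sided_p_value(2.0)
print(f"p-value = {p_value:.4f}")
```

The same tail probability is obtained whether the comparison is made on the native scale or on the standardized scale, which is what the two forms of the equation express.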
However, it cannot describe:
To overcome the disadvantages of the p-value, effect size is an alternative way of reporting significance. The general form is:
$$\text{ES} = \frac{\Delta\mu}{\sigma}$$
where:
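One common instance of this ratio is Cohen's d, sketched below under the assumption that $\Delta\mu$ is the difference between two group means and $\sigma$ is the pooled sample standard deviation. The function name and the data are made up for illustration.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled standard deviation.

    The pooled-SD choice is an assumption; other adjustments exist.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Made-up data: both groups have sample variance 4, means differ by 1.
effect = cohens_d([2, 4, 6], [1, 3, 5])
print(f"Cohen's d = {effect:.2f}")  # Cohen's d = 0.50
```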
Effect size measures the magnitude of an effect. Its basic properties include being independent of
Depending on the experimental design, effect sizes with different adjustments are selected for different needs, including:
Distribution | standard deviation | Margin of error | Sample size required |
---|---|---|---|
Continuous Outcome | | | |
Dichotomous Outcome | | | |
Adjustment:
Distribution | standard deviation | Margin of error | Sample size required |
---|---|---|---|
Adjustment:
Distribution | standard deviation | Margin of error | Sample size required |
---|---|---|---|
Adjustment:
A power analysis is the calculation used to estimate the smallest sample size needed for an experiment, given a required significance level, statistical power, and effect size. It helps to ensure that the experiment or survey is large enough to distinguish a genuine effect from chance variation.
The minimum sample size required to observe the population distribution is usually expressed in terms of effect size and critical value. The general form is (derived from 2.2.1-2):
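A sketch of such a calculation with `statsmodels`, assuming a one-sample, one-sided t-test. The effect size, significance level, and target power below are illustrative values, not recommendations.

```python
import math

from statsmodels.stats.power import TTestPower

analysis = TTestPower()

# Leave nobs as None so that solve_power solves for the sample size.
required_n = float(
    analysis.solve_power(
        effect_size=0.5, nobs=None, alpha=0.05, power=0.8, alternative='larger'
    )
)

# Round up to a whole number of observations and check the power achieved.
n_rounded = math.ceil(required_n)
achieved_power = analysis.power(
    effect_size=0.5, nobs=n_rounded, alpha=0.05, alternative='larger'
)
print(f"required n = {n_rounded}, achieved power = {achieved_power:.3f}")
```

`solve_power` solves for whichever of `effect_size`, `nobs`, `alpha`, or `power` is left as `None`, so the same call can also answer "what power do I get with this sample size?".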
$$\hat{n} \ge \left(\frac{CV_\alpha \times \sigma}{\overline{E}}\right)^2 = \left(\frac{CV_\alpha}{ES}\right)^2$$
where:
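The inequality can be evaluated directly. The sketch below assumes $CV_\alpha$ is the two-sided standard normal quantile at $1 - \alpha/2$ and rounds up to the next whole observation; the function name and inputs are illustrative.

```python
import math
from statistics import NormalDist


def min_sample_size(effect_size, alpha=0.05):
    """Smallest integer n with n >= (CV_alpha / ES)^2.

    Assumes a two-sided normal critical value (illustrative choice).
    """
    cv = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((cv / effect_size) ** 2)


# Equivalent inputs: sigma = 15 and margin of error E = 3 give ES = 3 / 15 = 0.2.
n_small_effect = min_sample_size(effect_size=3.0 / 15.0)
n_medium_effect = min_sample_size(effect_size=0.5)
print(n_small_effect, n_medium_effect)  # 97 16
```

Because the effect size enters squared in the denominator, halving the effect size roughly quadruples the required sample size.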
Note:
The power of a binary hypothesis test is the probability that the test correctly rejects the null hypothesis when a specific alternative hypothesis is true. Power operates on the scale of the difference measurement. The general form is:
$$\text{Power} = 1-\beta= Pr(\text{X}>\overline{E}) = Pr(\text{SS}>CV)$$
Available power analysis functions in 'statsmodels.stats.power':
```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.stats.power as smpwr

fig = plt.figure(figsize=(10, 10))

# Top panel: power as a function of sample size, one curve per effect size.
ax = fig.add_subplot(2, 1, 1)
smpwr.TTestPower().plot_power(
    dep_var='nobs',  # variable on the x-axis: 'nobs', 'effect_size' or 'alpha'
    nobs=np.arange(2, 200),
    effect_size=np.array([0.1, 0.2, 0.3, 0.5, 1, 1.5, 2]),
    alternative='larger',
    ax=ax,
    title='Power of t-Test',
)

# Bottom panel: power as a function of effect size, one curve per sample size.
ax = fig.add_subplot(2, 1, 2)
smpwr.TTestPower().plot_power(
    dep_var='es',  # 'es' is accepted as shorthand for 'effect_size'
    nobs=np.array([10, 20, 30, 50, 70, 100]),
    effect_size=np.linspace(0.01, 2, 51),
    alternative='larger',
    ax=ax,
    title='',  # suppress the title
)
```
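The power curves can be cross-checked by hand with a normal approximation. The sketch below assumes a one-sided ('larger') one-sample z-test, so it will sit slightly above the t-test power for small samples; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def z_test_power(effect_size, nobs, alpha=0.05):
    """Power of a one-sided one-sample z-test (normal approximation).

    Under H1 the standardized statistic is shifted by effect_size * sqrt(nobs),
    so power is the probability that it still exceeds the critical value.
    """
    std_normal = NormalDist()
    cv = std_normal.inv_cdf(1 - alpha)  # one-sided critical value
    return 1.0 - std_normal.cdf(cv - effect_size * math.sqrt(nobs))


for n in (10, 30, 100):
    print(f"nobs = {n:3d}, power = {z_test_power(effect_size=0.5, nobs=n):.3f}")
```

With an effect size of zero the function returns the significance level itself, which is a quick sanity check that the critical value and the shift are combined correctly.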