1. Introduction

Experimental design involves:

The main concerns in experimental design are the establishment of validity, reliability, and replicability.

Related concerns include achieving appropriate levels of statistical power and sensitivity.

1.1 Elements of Experimental Design

  • Group
  • Pre-test
  • Treatment
  • Post-test

2. Different Designs of Experiments

A methodology for designing experiments was proposed by Ronald Fisher in his paper The Arrangement of Field Experiments (1926) and his book The Design of Experiments (1935). Much of his pioneering work dealt with agricultural applications of statistical methods. As a mundane example, he described how to test the lady tasting tea hypothesis: that a certain lady could distinguish by flavour alone whether the milk or the tea was first placed in the cup. These methods have been broadly adapted in the physical and social sciences and are still used in agricultural engineering; they differ from the design and analysis of computer experiments.

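In the design tables below, each row is a group and the columns are Pre-test, Treatment, and Post-test; O denotes an observation (measurement) and X denotes the treatment.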

2.1 Non-Factorization: Observational study

This is a research design in which a single group is observed on a single occasion after experiencing some event, treatment, or intervention.

  • Risks: Because there is no control group against which to make comparisons, it is a weak design; any changes noted are merely presumed to have been caused by the event.

2.1.1 Cross-sectional study: Non-Control, Post-test

No control group. This design has virtually no internal or external validity.

One-Shot Case Study

| Group | Pre-test | Treatment | Post-test |
|---|---|---|---|
| Single group | | X | O |

2.1.2 Longitudinal Study: Pre-test, Post-test

Minimal control. There is somewhat more structure: a single selected group is under observation, with a careful measurement taken before the experimental treatment is applied and another taken afterwards. This design has minimal internal validity, controlling only for selection of subjects and experimental mortality. It has no external validity.

One-Group Pre-test, Post-test

| Group | Pre-test | Treatment | Post-test |
|---|---|---|---|
| Single group | O | X | O |

2.2 Factorization: Orthogonal Design of Experiments

An experimental design is orthogonal if each factor can be evaluated independently of all the other factors. Orthogonal designs for factors with two levels can be fit using least squares. Orthogonality guarantees that the effect of one factor or interaction can be estimated independently of the effect of any other factor or interaction in the model.

Orthogonality concerns the forms of comparison (contrasts) that can be legitimately and efficiently carried out. Contrasts can be represented by vectors, and sets of orthogonal contrasts are uncorrelated and, if the data are normal, independently distributed. Because of this independence, each orthogonal contrast provides different information from the others. If there are T treatments and T – 1 orthogonal contrasts, all the information that can be captured from the experiment is obtainable from the set of contrasts.
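The idea of orthogonal contrasts can be checked numerically. The following is a minimal sketch, assuming equal group sizes so that orthogonality reduces to a zero dot product; the contrast vectors and the helper names `dot`, `is_contrast`, and `are_orthogonal` are illustrative assumptions, not part of the source.

```python
from itertools import combinations

def dot(u, v):
    """Dot product of two contrast vectors."""
    return sum(a * b for a, b in zip(u, v))

def is_contrast(c):
    """A contrast has coefficients that sum to zero."""
    return sum(c) == 0

def are_orthogonal(contrasts):
    """True if every pair of contrast vectors has a zero dot product."""
    return all(dot(u, v) == 0 for u, v in combinations(contrasts, 2))

# T = 3 treatments, so T - 1 = 2 orthogonal contrasts (illustrative choice).
contrasts = [
    (1, -1, 0),   # treatment 1 versus treatment 2
    (1, 1, -2),   # mean of treatments 1 and 2 versus treatment 3
]

print(all(is_contrast(c) for c in contrasts))
print(are_orthogonal(contrasts))
```

By comparison, the pair (1, −1, 0) and (1, 0, −1) are both valid contrasts but are not orthogonal, so they would carry overlapping information.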

2.2.1 Controlled Experiments: NonRandom selection, Pre-test, Post-test

A scientific control often serves as the contrast (the baseline for comparison) in the design of an experiment.

The main weakness of this research design is that its internal validity is open to question because of interactions between variables such as selection and maturation, or selection and testing. In the absence of randomization, the possibility always exists that some critical difference, not reflected in the pre-test, is operating to contaminate the post-test data. For example, if the experimental group consists of volunteers, they may be more highly motivated, or they may have a different background of experience that affects how they interact with the experimental treatment. Such factors, rather than X by itself, may account for the differences.

Controlled, Non-Random Selection, Pre-test, Post-test

| Group | Pre-test | Treatment | Post-test |
|---|---|---|---|
| Experimental group (non-randomly selected) | O | X | O |
| Control group (non-randomly selected) | O | | O |

2.2.2 \(2^k\) Factorial Design

Factorial designs: a k-way ANOVA design in which each factor has two levels, low = −1 and high = 1.

Factorial experiments are used instead of the one-factor-at-a-time method. They are efficient at evaluating the effects and possible interactions of several factors (independent variables). The analysis of an experimental design is built on the foundation of the analysis of variance, a collection of models that partition the observed variance into components according to the factors the experiment must estimate or test.

Factorial design is a useful technique for investigating the main and interaction effects of the variables chosen in any design of experiment. It uses factorial crossing to compare the effects (main effects, pairwise interactions, …, k-fold interaction) of the k factors.

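\((\Delta K_{IC})_{var}=\frac{\sum{Y_{var=1}}}{n}-\frac{\sum{Y_{var=-1}}}{n}\)

Here \(Y_{var=1}\) and \(Y_{var=-1}\) are the responses of the runs with the variable at its high and low level, and n is the number of runs at each level, so the effect is the difference between the two mean responses.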

The table of contrasts for a \(2^3\) design is below:

Figure: orthogonal factorial design
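A table of contrasts for a \(2^3\) design can also be generated programmatically. The sketch below builds the eight runs, derives the interaction columns as products of the factor columns, and estimates each effect as the mean response at the high level minus the mean response at the low level. The factor names A, B, C, the helper names, and the response values (computed from an assumed model) are made up for illustration.

```python
from itertools import product

# All 8 runs of a 2^3 design; -1 = low level, +1 = high level.
runs = list(product((-1, 1), repeat=3))

def contrast_row(a, b, c):
    """Main-effect and interaction contrast columns for one run."""
    return {"A": a, "B": b, "C": c,
            "AB": a * b, "AC": a * c, "BC": b * c, "ABC": a * b * c}

table = [contrast_row(*run) for run in runs]

def effect(table, y, name):
    """Mean response at the high level minus mean response at the low level."""
    high = [yi for row, yi in zip(table, y) if row[name] == 1]
    low = [yi for row, yi in zip(table, y) if row[name] == -1]
    return sum(high) / len(high) - sum(low) / len(low)

# Made-up responses from an assumed model, for illustration only.
y = [10 + 2 * row["A"] - row["C"] + 0.5 * row["AB"] for row in table]

names = list(table[0])
print("  ".join(f"{n:>3}" for n in names))
for row in table:
    print("  ".join(f"{row[n]:>3}" for n in names))
for n in names:
    print(n, effect(table, y, n))
```

Because the columns are mutually orthogonal, each estimated effect is twice the corresponding coefficient of the assumed model (for example, 4.0 for A), and the effects of terms absent from the model come out as zero.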

2.2.3 \(2_R^{k-f}\) Fractional Factorial Design

\(2_R^{k-f}\) fractional factorial designs: the design has k factors and takes \(m \times 2^{k-f}\) runs, where the number of replications m is usually 1. The design is orthogonal, and each factor has two levels, low = −1 and high = 1. R is the resolution of the design.

  • Advantage: the amount of testing is reduced. The downside is that some interactions and factors are aliased.

  • Limitation: the number of runs is always a power of two, \(2^{k-f}\). If there are more than a few factors, this leads to large gaps in the available options (4, 8, 16, 32, 64, 128, etc.).

  • Solutions:

    • Plackett-Burman designs
    • Response Surface Methods
    • Taguchi designs
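Aliasing can be made concrete with the smallest case. The sketch below builds a \(2^{3-1}\) half fraction, assuming the generator C = AB (a common textbook choice, not stated in the source), and shows that the column for C is identical to the AB interaction column, so the two effects cannot be separated. The factor names and the variable names are illustrative.

```python
from itertools import product

# Full factorial in A and B; the third factor is set by the assumed generator C = AB.
design = [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

col_a = [a for a, b, c in design]
col_c = [c for a, b, c in design]
col_ab = [a * b for a, b, c in design]
col_bc = [b * c for a, b, c in design]

for run in design:
    print(run)

# 4 runs instead of the 8 needed by the full 2^3 design.
print(len(design))
# C is aliased with AB, and A is aliased with BC.
print(col_c == col_ab, col_a == col_bc)
```

In this half fraction every main effect is aliased with a two-factor interaction, which is what a resolution III design implies.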

2.2.4 Plackett-Burman Designs

Robin L. Plackett and J. P. Burman set out to find experimental designs for investigating the dependence of some measured quantity on a number of independent variables (factors), each taking L levels, in such a way as to minimize the variance of the estimates of these dependencies using a limited number of experiments. Interactions between the factors were considered negligible. The solution to this problem is to find an experimental design in which each combination of levels for any pair of factors appears the same number of times throughout all the experimental runs (refer to table). A complete factorial design would satisfy this criterion, but the idea was to find smaller designs.

Plackett-Burman designs identify the most important factors early in the experimentation phase. These designs are usually resolution III, 2-level designs. In a resolution III design, main effects are aliased with 2-way interactions. Therefore, you should only use these designs when you are willing to assume that 2-way interactions are negligible.

Each design is defined by its number of runs, from 12 to 48, which is always a multiple of 4. The number of factors must be less than the number of runs. For example, a design with 20 runs lets you estimate the main effects for up to 19 factors.

(source: https://support.minitab.com/en-us/minitab/18/help-and-how-to/modeling-statistics/doe/supporting-topics/factorial-and-screening-designs/plackett-burman-designs/)
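As an illustration of the smallest case, the sketch below builds a 12-run, 11-factor two-level design by cyclically shifting a generator row and appending a row of all low levels, then checks the balance property described above. The generator row is the one commonly tabulated for 12 runs and is an assumption here, not taken from the source; the function names are hypothetical.

```python
# Commonly tabulated generator row for the 12-run design (assumed, not from the source).
GENERATOR_12 = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]

def plackett_burman_12():
    """11 cyclic shifts of the generator row plus one row of all -1."""
    n = len(GENERATOR_12)
    rows = [[GENERATOR_12[(j - i) % n] for j in range(n)] for i in range(n)]
    rows.append([-1] * n)
    return rows

def columns_orthogonal(design):
    """True if every pair of factor columns has a zero dot product."""
    cols = list(zip(*design))
    return all(
        sum(a * b for a, b in zip(cols[i], cols[j])) == 0
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
    )

design = plackett_burman_12()
for row in design:
    print(" ".join(f"{x:+d}" for x in row))
print(columns_orthogonal(design))
```

Orthogonal, balanced columns mean that for any pair of factors each of the four level combinations appears three times across the 12 runs.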

2.3 Randomization

Random assignment is the process of assigning individuals at random to groups, or to different conditions within a group, in an experiment, so that each individual has the same chance of being placed in any given group. The random assignment of individuals to groups (or conditions within a group) distinguishes a rigorous, “true” experiment from an observational study or “quasi-experiment”.

  • Risks: random allocation can produce a serious imbalance in a key characteristic between a treatment group and a control group.
  • Solutions: these risks are calculable and hence can be managed down to an acceptable level by using enough experimental units.
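Random assignment can be sketched in a few lines. The example below shuffles the units and deals them out to the groups in turn, which keeps the group sizes as equal as possible. The function name `randomly_assign`, the group labels, and the seed are assumptions for illustration.

```python
import random

def randomly_assign(units, groups, seed=None):
    """Shuffle the units, then deal them out to the groups round-robin."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    assignment = {g: [] for g in groups}
    for i, unit in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(unit)
    return assignment

# 20 hypothetical subjects split into a treatment and a control group.
result = randomly_assign(range(20), ["treatment", "control"], seed=1)
for group, members in result.items():
    print(group, sorted(members))
```

Fixing the seed makes the allocation reproducible for auditing; in a real study the seed would be chosen and recorded before any unit is enrolled.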

2.3.1 Randomized Design: Random selection, Post-test

Two Group, Post-test

| Group | Pre-test | Treatment | Post-test |
|---|---|---|---|
| Randomly selected group | | \(X_1\) | O |
| Randomly selected group | | | O |
| Randomly selected group | | \(X_n\) | O |
| Randomly selected group | | | O |

The main advantage of this design is randomization. The post-test comparison with randomized subjects controls for the main effects of history, maturation, and pre-testing; because no pre-test is used, there can be no interaction effect of pre-test and X. Another advantage of this design is that it can be extended to include more than two groups if necessary.

Figure: Completely Randomized Design

2.3.2 Randomized Controlled Experiments: Random selection, Pre-test, Post-test

The advantage here is the randomization, so that any differences that appear in the post-test should be the result of the experimental variable rather than of possible differences between the groups to start with. This is the classical type of experimental design and has good internal validity. The external validity or generalizability of the study is limited by the possible effect of pre-testing. The Solomon Four-Group Design accounts for this.

Controlled, Random Selection, Pre-test, Post-test

| Group | Pre-test | Treatment | Post-test |
|---|---|---|---|
| Experimental group (randomly selected) | O | \(X_1\) | O |
| Experimental group (randomly selected) | O | | O |
| Experimental group (randomly selected) | O | \(X_n\) | O |
| Control group (randomly selected) | O | | O |

2.3.3 Solomon Four-Group Design

This design contains two extra control groups, which serve to reduce the influence of confounding variables and allow the researcher to test whether the pre-test itself has an effect on the subjects.

Solomon Four-Group Design

| Group | Pre-test | Treatment | Post-test |
|---|---|---|---|
| Experimental group (randomly selected) | O | X | O |
| Control group (randomly selected) | O | | O |
| Unpre-tested experimental group (randomly selected) | | X | O |
| Unpre-tested control group (randomly selected) | | | O |
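One simple way to read the four post-test results is to compare group means. The sketch below, assuming post-test scores are available for each of the four groups, estimates the treatment effect with and without a pre-test, the effect of the pre-test alone, and their interaction. The function name, the dictionary keys, and the scores are made up for illustration; a full analysis would normally use a two-way ANOVA rather than raw differences of means.

```python
from statistics import mean

def solomon_effects(pre_treated, pre_control, unpre_treated, unpre_control):
    """Differences of post-test means between the four Solomon groups."""
    g1, g2 = mean(pre_treated), mean(pre_control)
    g3, g4 = mean(unpre_treated), mean(unpre_control)
    return {
        "treatment_effect_pretested": g1 - g2,
        "treatment_effect_unpretested": g3 - g4,
        "pretest_effect_in_controls": g2 - g4,
        "pretest_by_treatment_interaction": (g1 - g2) - (g3 - g4),
    }

# Made-up post-test scores for illustration only.
effects = solomon_effects(
    pre_treated=[14, 16],
    pre_control=[9, 11],
    unpre_treated=[12, 14],
    unpre_control=[8, 10],
)
for name, value in effects.items():
    print(name, value)
```

A non-zero interaction term suggests that taking the pre-test changed how subjects responded to the treatment.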

2.3.4 Placebo-Controlled Randomized Trials

Randomized trials: the subjects are randomly allocated to their test groups.

Placebo-controlled studies are a way of testing a medical therapy in which, in addition to a group of subjects that receives the treatment to be evaluated, a separate control group receives a sham “placebo” treatment which is specifically designed to have no real effect.

Blinded trials: subjects do not know whether they are receiving real or placebo treatment. Blinding is the withholding of information from participants which may influence them in some way until after the experiment is complete. Good blinding may reduce or eliminate experimental biases such as confirmation bias, the placebo effect, the observer effect, and others. A blind can be imposed on any participant of an experiment, including subjects, researchers, technicians, data analysts, and evaluators.

Evaluation: The outcomes within the active drug group (A), the placebo group (P), and the natural history group (NH) are observed and compared with each other, allowing us to measure:

  • The efficacy of the active drug’s treatment: the difference between A and NH (i.e., A-NH).
  • The efficacy of the active drug’s active ingredient: the difference between A and P (i.e., A-P).
  • The magnitude of the placebo response: the difference between P and NH (i.e., P-NH).

It is a matter of interpretation whether the value of P-NH indicates the efficacy of the entire treatment process or the magnitude of the “placebo response”. The results of these comparisons then determine whether or not a particular drug is considered efficacious.
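The three comparisons above are simple differences of group outcomes. The sketch below computes them from per-subject outcome scores, assuming the group mean is the summary of interest; the function name and the scores are made up for illustration.

```python
from statistics import mean

def placebo_comparisons(active, placebo, natural_history):
    """Differences of mean outcomes between the A, P, and NH groups."""
    a, p, nh = mean(active), mean(placebo), mean(natural_history)
    return {"A-NH": a - nh, "A-P": a - p, "P-NH": p - nh}

# Made-up improvement scores for illustration only.
comparisons = placebo_comparisons(
    active=[8, 9, 10],
    placebo=[4, 5, 6],
    natural_history=[1, 2, 3],
)
print(comparisons)
```

Note that A-NH always equals the sum of A-P and P-NH, so the overall treatment effect splits into the contribution of the active ingredient and the placebo response.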

Placebo-controlled study

| Group | Pre-test | Treatment | Post-test |
|---|---|---|---|
| Active drug group (A) | O | X (active drug) | O |
| Placebo drug group (P) | O | X (placebo) | O |
| Natural history group (NH) | O | | O |

2.4 Nesting and Blocking

Nearly all experiments (other than regression designs) have one level of nesting (replicates are nested in treatment), but by convention it is only termed a nested design if there are at least two levels of nesting. Nested designs may result in pseudoreplication if the evaluation units are wrongly treated as experimental units in the analysis. Similar errors of analysis can also occur in partially nested designs which we consider below after looking at blocked and factorial designs.

Blocking is the non-random arrangement of experimental units into groups (blocks/lots) consisting of units that are similar to one another. Blocking reduces known but irrelevant sources of variation between units (block Interaction A:B) and thus allows greater precision in the estimation of the source of variation under study.

2.4.1 Randomized Block Design

Controlled, Random Selection, Pre-test, Post-test

| Group | Pre-test | Treatment | Post-test |
|---|---|---|---|
| Experimental group (randomly selected from block A) | O | X | O |
| Control group (randomly selected from block A) | O | | O |
| Experimental group (randomly selected from block B) | O | X | O |
| Control group (randomly selected from block B) | O | | O |

Figure: Randomized Block Design

2.4.2 Randomized Complete Block Design (RCBD)

The RCBD is the standard design for agricultural experiments where similar experimental units are grouped into blocks or replicates.

  • The number of blocks is the number of replications.
  • Treatments are assigned at random within blocks of adjacent subjects, each treatment once per block.
  • Any treatment can be adjacent to any other treatment, but not to the same treatment within the block.

Figure: Randomized Complete Block Design
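The rules above amount to drawing an independent random permutation of the treatments for each block. A minimal sketch, with the function name `rcbd_layout`, the treatment labels, and the seed as assumptions for illustration:

```python
import random

def rcbd_layout(treatments, n_blocks, seed=None):
    """One independent random ordering of all treatments per block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)  # each treatment appears exactly once in the block
        layout.append(order)
    return layout

treatments = ["T1", "T2", "T3", "T4"]
layout = rcbd_layout(treatments, n_blocks=3, seed=0)
for i, block in enumerate(layout, start=1):
    print(f"block {i}:", block)
```

Since each treatment occurs once per block, no treatment can sit next to itself within a block, and the number of blocks equals the number of replications.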

2.5 Statistical Replication

Replication: Measurements are usually subject to variation and measurement uncertainty; thus they are repeated and full experiments are replicated to help identify the sources of variation, to better estimate the true effects of treatments, to further strengthen the experiment’s reliability and validity, and to add to the existing knowledge of the topic.

Genuine run replicates need to be used. A common error is to take m measurements per run and act as if the m measurements are from m runs. If as a data analyst you encounter this error, average the m measurements into a single value of the response.
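The correction described above can be sketched as follows: repeated measurements taken within one run are collapsed to their mean, so each run contributes a single response value. The function name, run labels, and measurement values are made up for illustration.

```python
from statistics import mean

def collapse_repeated_measurements(measurements_by_run):
    """Average the m measurements of each run into one response per run."""
    return {run: mean(values) for run, values in measurements_by_run.items()}

# Two runs with m = 3 measurements each (made-up values).
raw = {
    "run_1": [10.0, 11.0, 12.0],
    "run_2": [19.0, 20.0, 21.0],
}
responses = collapse_repeated_measurements(raw)
print(responses)
print(len(responses))  # the analysis now sees 2 runs, not 6
```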

For orthogonal designs, randomly assign units to the \(m \times 2^k\) runs of a \(2^k\) factorial design, or to the \(m \times 2^{k-f}\) runs of a \(2_R^{k-f}\) fractional factorial design. Often the units are time slots. If possible, perform the runs in random order.
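A randomized run order for a replicated \(2^k\) design can be produced by listing every level combination m times and shuffling the list. This is a minimal sketch; the function name `randomized_run_order` and the seed are assumptions for illustration.

```python
import random
from itertools import product

def randomized_run_order(k, m=1, seed=None):
    """All m * 2^k runs of a replicated 2^k design, in random order."""
    rng = random.Random(seed)
    runs = [levels for levels in product((-1, 1), repeat=k) for _ in range(m)]
    rng.shuffle(runs)
    return runs

order = randomized_run_order(k=3, m=2, seed=42)
for slot, levels in enumerate(order, start=1):
    print(f"time slot {slot}: {levels}")
```

Each position in the shuffled list can be read as a time slot, which matches the case where the experimental units are time slots.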