9.3 One-Factor Analysis of Variance



A. Designed Experiments
Definition 9.3.1: Observational vs. Designed Experiments

• Observational experiments are experiments in which the researcher has little or no control over the variables being studied; the researcher merely observes their values.

• Designed experiments are experiments in which the researcher attempts to control the levels of one or more variables to determine their effect on a variable of interest.

Definition 9.3.2: Independent vs. Dependent Variables

• A response variable is a dependent variable of interest that is measured in the experiment.

• A factor is an independent variable whose effect on the response is of interest to the researcher. Factor levels are the values of the factor utilized in the experiment.

Definition 9.3.3: Treatments

The treatments of an experiment are the factor-level combinations utilized.
B. Single Factor ANOVA
Remark 9.3.4: Consider an experiment that involves a single factor with m treatments. The
completely randomized design is a design in which the m treatments are randomly assigned to the
experimental units or in which independent random samples of experimental units are selected
for each treatment.

Remark 9.3.5: Analysis of variance (ANOVA) attempts to analyze the variation in a set of
responses and assign portions of this variation to each variable in a set of independent variables.

1. Let X1, X2, ..., Xm be independent random variables with unknown means µ1, µ2, ..., µm and common unknown variance σ².

2. Hypotheses:

   • Null Hypothesis: µ1 = µ2 = ... = µm

   • Alternative Hypothesis: At least two treatment means differ.

3. Input: n1 random sample values for X1 , n2 random sample values for X2 ,..., nm random
sample values for Xm such that

n1 + n2 + ... + nm = n

and

   Samples for X1:   X11   X12   ...   X1n1
   Samples for X2:   X21   X22   ...   X2n2
       ⋮
   Samples for Xm:   Xm1   Xm2   ...   Xmnm

4. For each i ∈ {1, 2, ..., m}, we define

   • X̄_i := (1/n_i) Σ_{j=1}^{n_i} X_{ij}, the sample mean of the i-th treatment;

   • X̄ := (1/n) Σ_{i=1}^{m} Σ_{j=1}^{n_i} X_{ij} = (1/n) Σ_{i=1}^{m} n_i X̄_i, the grand mean of all n observations.

Definition 9.3.6: Sums of Squares

• The total sum of squares (TSS) represents the total variation of the response measurements in our samples:

  TSS := Σ_{i=1}^{m} Σ_{j=1}^{n_i} (X_{ij} − X̄)².

• The sum of squares for error (SSE) represents the variability around the treatment means that is attributed to sampling error:

  SSE := Σ_{i=1}^{m} Σ_{j=1}^{n_i} (X_{ij} − X̄_i)².

• The sum of squares between treatments (SST) is the variation between the treatment means:

  SST := Σ_{i=1}^{m} n_i (X̄_i − X̄)².

Note: TSS = SSE + SST.
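As a quick numerical check of this decomposition, the following minimal Python sketch (assuming NumPy is available; the group values are made-up illustrative numbers) computes TSS, SSE, and SST for unequal group sizes and confirms that TSS = SSE + SST.

import numpy as np

# Made-up samples for m = 3 treatments with unequal sizes n_i (illustration only).
groups = [
    np.array([4.1, 5.0, 4.6]),
    np.array([6.2, 5.8, 6.0, 6.4]),
    np.array([5.1, 4.9, 5.3]),
]

n = sum(len(g) for g in groups)                    # total number of observations
grand_mean = sum(g.sum() for g in groups) / n      # grand mean over all n observations

# Sums of squares from Definition 9.3.6
tss = sum(((g - grand_mean) ** 2).sum() for g in groups)
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)
sst = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

print(f"TSS = {tss:.4f}, SSE = {sse:.4f}, SST = {sst:.4f}")
print("TSS = SSE + SST?", np.isclose(tss, sse + sst))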


Definition 9.3.7: Mean Squares
To make the two measurements of variability comparable, we divide each by its number of degrees of freedom to yield the following:

• The mean square for treatments (MST) measures the variability among the treatment means:

  MST := SST / (m − 1),

  where the number of degrees of freedom for the m treatments is m − 1.

• The mean square for error (MSE) measures the sampling variability within the treatments:

  MSE := SSE / (n − m),

  where the number of degrees of freedom for error is n − m.

• The F-statistic is given by the ratio of MST to MSE:

  F := MST / MSE.

Remark 9.3.8: A typical ANOVA table is given below.

Source      Sum of Squares   Degrees of Freedom   Mean Square           F ratio
Treatment   SST              m − 1                MST = SST / (m − 1)   F = MST / MSE
Error       SSE              n − m                MSE = SSE / (n − m)
Total       TSS              n − 1
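The same table can be assembled programmatically once SST and SSE are in hand; here is a minimal sketch in plain Python (the values of SST, SSE, m, and n below are made-up placeholders) that fills in the remaining columns from the formulas above.

# Made-up inputs for illustration: sums of squares, number of treatments, total sample size.
sst, sse = 42.0, 96.0
m, n = 4, 20

tss = sst + sse                 # total sum of squares
mst = sst / (m - 1)             # mean square for treatments
mse = sse / (n - m)             # mean square for error
f_ratio = mst / mse             # F-statistic

print(f"{'Source':<10}{'SS':>10}{'df':>6}{'MS':>10}{'F':>8}")
print(f"{'Treatment':<10}{sst:>10.2f}{m - 1:>6}{mst:>10.3f}{f_ratio:>8.3f}")
print(f"{'Error':<10}{sse:>10.2f}{n - m:>6}{mse:>10.3f}")
print(f"{'Total':<10}{tss:>10.2f}{n - 1:>6}")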

Properties 9.3.9: Criteria for Rejecting H0

Consider the hypotheses

H0 : µ1 = µ2 = ... = µm
H1 : At least two treatment means differ,

where α is the level of significance picked in advance. We reject H0 if

• Critical Region: The test statistic Fc = MST / MSE satisfies Fc ≥ Fα(m − 1, n − m), where m − 1 and n − m are the degrees of freedom of MST and MSE, respectively; the critical value can be found in Table VII (or computed numerically, as in the sketch below).

• p-value: The p-value P(F > Fc) is smaller than α.
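For the numerical route mentioned above, SciPy's F distribution gives both the critical value and the p-value; a minimal sketch, assuming SciPy is installed (the values of m, n, α, and Fc below are made-up placeholders):

from scipy.stats import f

m, n = 4, 20          # hypothetical number of treatments and total sample size
alpha = 0.05          # chosen level of significance
f_c = 3.10            # hypothetical observed test statistic MST / MSE

critical_value = f.ppf(1 - alpha, m - 1, n - m)   # F_alpha(m - 1, n - m)
p_value = f.sf(f_c, m - 1, n - m)                 # P(F > F_c)

print(f"Critical value F_alpha({m - 1}, {n - m}) = {critical_value:.3f}")
print(f"p-value = {p_value:.4f}")
print("Reject H0 (critical region)?", f_c >= critical_value)
print("Reject H0 (p-value)?        ", p_value < alpha)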


Properties 9.3.10: Necessary Conditions for a Valid ANOVA F-Test
1. The samples are randomly selected in an independent manner from the m treatment
populations.

2. All m sampled populations have distributions that are approximately normal.

3. The m population variances are equal.

Example 9.3.11: Suppose we would like to estimate the favorability index of four different brands of widgets by checking the reviews of three different customers. Let Xi (for i ∈ {1, 2, 3, 4}) be the (random) favorability index of Widget i, which is a normal random variable with mean µi and variance σ² (all widgets share the same variance). The review scores from the three customers are given in the table below.

Widget     Customer A   Customer B   Customer C   Mean
Widget 1   13           8            9            X̄_1 = 10
Widget 2   15           11           13           X̄_2 = 13
Widget 3   8            12           7            X̄_3 = 9
Widget 4   11           15           10           X̄_4 = 12

a. Test if at least two of the four average customer responses to the widgets are different at
α = 0.05.
Source Sum of Squares Degrees of Freedom Mean Square F ratio

Treatment

Error

Total

b. Estimate the bounds for the p-value of the test.
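As a way to check the hand computation in parts a and b, here is a minimal Python sketch (assuming NumPy and SciPy are available) that applies the formulas from Definitions 9.3.6 and 9.3.7 to the widget data and reports the exact p-value alongside the critical value from the F distribution.

import numpy as np
from scipy.stats import f

# Review scores from the three customers for each widget (from the table above).
widgets = [
    np.array([13.0, 8.0, 9.0]),     # Widget 1
    np.array([15.0, 11.0, 13.0]),   # Widget 2
    np.array([8.0, 12.0, 7.0]),     # Widget 3
    np.array([11.0, 15.0, 10.0]),   # Widget 4
]

m = len(widgets)                                   # number of treatments
n = sum(len(x) for x in widgets)                   # total number of observations
grand_mean = sum(x.sum() for x in widgets) / n     # grand mean

sst = sum(len(x) * (x.mean() - grand_mean) ** 2 for x in widgets)
sse = sum(((x - x.mean()) ** 2).sum() for x in widgets)
mst, mse = sst / (m - 1), sse / (n - m)
f_c = mst / mse

alpha = 0.05
print(f"SST = {sst:.2f}, SSE = {sse:.2f}, F = {f_c:.3f}")
print(f"Critical value F_0.05({m - 1}, {n - m}) = {f.ppf(1 - alpha, m - 1, n - m):.3f}")
print(f"p-value = {f.sf(f_c, m - 1, n - m):.4f}")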


Remark 9.3.12: Let's use Excel for the next example.

Example 9.3.13: Fifteen different patients are subjected to three drugs, five patients per drug. Let Xi represent the recovery score for Drug i, which is a normal random variable with mean µi and variance σ². A higher recovery score indicates a better outcome than a lower score.

Drug 1   5.90   5.92   5.91   5.89   5.88   X̄_1 = 5.90
Drug 2   5.51   5.50   5.50   5.49   5.50   X̄_2 = 5.50
Drug 3   5.01   5.00   4.99   4.98   5.02   X̄_3 = 5.00

a. Test if at least two of the three average patient responses to the drugs are different at
α = 0.05.

Source Sum of Squares Degrees of Freedom Mean Square F ratio

Treatment

Error

Total

b. Find the p-value of the test. What can we conclude?
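If Excel is not at hand, SciPy offers an equivalent route; the sketch below (assuming SciPy is installed) runs the one-way ANOVA on the drug data with scipy.stats.f_oneway, which returns the F statistic and the p-value directly.

from scipy.stats import f_oneway

# Recovery scores from the table above, one list per drug.
drug1 = [5.90, 5.92, 5.91, 5.89, 5.88]
drug2 = [5.51, 5.50, 5.50, 5.49, 5.50]
drug3 = [5.01, 5.00, 4.99, 4.98, 5.02]

result = f_oneway(drug1, drug2, drug3)   # one-way ANOVA across the three drugs
print(f"F = {result.statistic:.2f}, p-value = {result.pvalue:.3g}")
print("Reject H0 at alpha = 0.05?", result.pvalue < 0.05)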
