ClassWork 03 Multi Way ANOVA - Sol

ClassWork 03 Two Way ANOVA
Exercise 1
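Let us prove that: $SS_{TOT} = SS_A + SS_B + SS_{AB} + SS_E$
Solution
Each observation can be written in terms of the estimated effects and of the residual:
$y_{ijk} = \hat{\mu} + \hat{\tau}_i + \hat{\beta}_j + \widehat{(\tau\beta)}_{ij} + e_{ijk}$
$y_{ijk} - \hat{\mu} = \hat{\tau}_i + \hat{\beta}_j + \widehat{(\tau\beta)}_{ij} + e_{ijk}$
Squaring and summing over all the observations:
$\sum_{ijk} (y_{ijk} - \hat{\mu})^2 = \sum_{ijk} \left( e_{ijk} + \hat{\tau}_i + \hat{\beta}_j + \widehat{(\tau\beta)}_{ij} \right)^2$
Since the cross-product terms are zero, we get
$\sum_{ijk} (y_{ijk} - \hat{\mu})^2 = \sum_{ijk} \hat{\tau}_i^2 + \sum_{ijk} \hat{\beta}_j^2 + \sum_{ijk} \widehat{(\tau\beta)}_{ij}^2 + \sum_{ijk} e_{ijk}^2$
where, with $\hat{\mu} = \bar{y}_{...}$:
$SS_{TOT} = \sum_{ijk} (y_{ijk} - \bar{y}_{...})^2$
$SS_A = \sum_{ijk} \hat{\tau}_i^2 = bn \sum_i \hat{\tau}_i^2$
$SS_B = \sum_{ijk} \hat{\beta}_j^2 = an \sum_j \hat{\beta}_j^2$
$SS_{AB} = \sum_{ijk} \widehat{(\tau\beta)}_{ij}^2 = n \sum_{ij} \widehat{(\tau\beta)}_{ij}^2$
$SS_E = \sum_{ijk} e_{ijk}^2$
Let us prove that the cross-product terms are zero; we do it for just one of them:
$2 \sum_{ijk} \hat{\tau}_i \widehat{(\tau\beta)}_{ij} = 2n \sum_i \hat{\tau}_i \left( \sum_j \widehat{(\tau\beta)}_{ij} \right) = 0$
Remember that one of the constraints used to solve the normal equations is $\sum_j \widehat{(\tau\beta)}_{ij} = 0$ for every $i$.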
Exercise 2 [M]
The yield of a chemical process is being studied. The two most important variables are thought to be
the pressure and the temperature. Three levels of each factor are selected, and a factorial experiment
with two replicates is performed. The yield data follows:
Pressure (psi)
T (°C) 200 215 230
150 90.4; 90.2 90.7; 90.6 90.2; 90.4
160 90.1; 90.3 90.5; 90.6 89.9; 90.1
170 90.5; 90.7 90.8; 90.9 90.4; 90.1
a) Analyze the data. Use α=0.05 and comment on the model's adequacy;
b) Calculate the residual for the data (1,2,1) = 90.7;
c) Under what conditions would you operate this process? Evaluate by hand the three constants
and the critical value, then use Minitab.
Solution
a) Analyze the data. Use α=0.05 and comment on the model's adequacy
Since the experiment is replicated, we start with an individual value plot (this plot should be used
only when replicates are present).
[Figure: Individual Value Plot of Yield. Vertical axis: Yield; horizontal axis: Pressure (200, 215, 230) within Temperature (150, 160, 170).]
The graph indicates that the variability appears uniform.
Two more graphs are plotted to better understand the influence of each factor and of their interaction
on the response variable. The Minitab commands are:
• Stat > ANOVA > Main Effects Plot
• Stat > ANOVA > Interaction Plot
[Figure: Main Effects Plot for Yield (data means, by Temperature and by Pressure) and Interaction Plot for Yield (data means, Pressure on the horizontal axis, one line per Temperature).]
The main effects plot displays the response means for each factor level in sorted order. A horizontal
line is drawn at the grand mean. The effects are the differences between the means and the reference
line. Analyzing the graph, both the factors seem to be relevant and the pressure seems to have a
greater influence on the response than the temperature. Instead, the interaction between pressure and
temperature seems fairly small, as shown by the similar shape of the three curves.
Let us do the analysis by hand:

$SS_A = \frac{1}{bn} \sum_i y_{i..}^2 - \frac{y_{...}^2}{abn} = 0.30111$
$SS_B = \frac{1}{an} \sum_j y_{.j.}^2 - \frac{y_{...}^2}{abn} = 0.76778$
$SS_{AB} = \frac{1}{n} \sum_{ij} y_{ij.}^2 - \frac{y_{...}^2}{abn} - SS_A - SS_B = 0.06889$
$SS_{TOT} = \sum_{ijk} y_{ijk}^2 - \frac{y_{...}^2}{abn} = 1.29778$
$SS_E = SS_{TOT} - SS_A - SS_B - SS_{AB} = 0.16$
The ANOVA table is:

Source SS df MS F0
A 0.30111 2 0.151 8.47*
B 0.76778 2 0.384 21.59*
AB 0.06889 4 0.017 0.969
Error 0.16 9 0.018
Total 1.29778 17
* significant at 5%
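The hand calculation can be cross-checked with a short script. The sketch below is only an illustration: it recomputes the sums of squares of a balanced two-factor design in the deviation form used in Exercise 1 (which is algebraically equivalent to the totals form), using the yield data of this exercise. The function and variable names are hypothetical and are not part of the original solution.

```python
# Yield data from Exercise 2: data[temperature][pressure] -> two replicates
data = {
    150: {200: [90.4, 90.2], 215: [90.7, 90.6], 230: [90.2, 90.4]},
    160: {200: [90.1, 90.3], 215: [90.5, 90.6], 230: [89.9, 90.1]},
    170: {200: [90.5, 90.7], 215: [90.8, 90.9], 230: [90.4, 90.1]},
}


def mean(values):
    return sum(values) / len(values)


def two_way_anova(data):
    """Sums of squares, df, mean squares and F0 for a balanced two-factor design."""
    rows = list(data)
    cols = list(data[rows[0]])
    a, b, n = len(rows), len(cols), len(data[rows[0]][cols[0]])

    grand = mean([y for i in rows for j in cols for y in data[i][j]])
    row_mean = {i: mean([y for j in cols for y in data[i][j]]) for i in rows}
    col_mean = {j: mean([y for i in rows for y in data[i][j]]) for j in cols}
    cell_mean = {(i, j): mean(data[i][j]) for i in rows for j in cols}

    ss = {
        "A": b * n * sum((row_mean[i] - grand) ** 2 for i in rows),
        "B": a * n * sum((col_mean[j] - grand) ** 2 for j in cols),
        "AB": n * sum(
            (cell_mean[i, j] - row_mean[i] - col_mean[j] + grand) ** 2
            for i in rows for j in cols
        ),
        "E": sum(
            (y - cell_mean[i, j]) ** 2
            for i in rows for j in cols for y in data[i][j]
        ),
    }
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "E": a * b * (n - 1)}
    ms = {term: ss[term] / df[term] for term in ss}
    f0 = {term: ms[term] / ms["E"] for term in ("A", "B", "AB")}
    return ss, df, ms, f0


ss, df, ms, f0 = two_way_anova(data)
for term in ("A", "B", "AB", "E"):
    print(term, round(ss[term], 5), df[term], round(ms[term], 5), f0.get(term))
print("Total", round(sum(ss.values()), 5), sum(df.values()))
```

Here factor A is the temperature and factor B is the pressure. The critical F values are not computed, since they come from tables or from Minitab.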
The analysis in Minitab is done using the command Stat > ANOVA > General Linear Model
General Linear Model: Yield versus Temperature; Pressure

Factor Information
Factor Type Levels Values
Temperature Fixed 3 150; 160; 170
Pressure Fixed 3 200; 215; 230
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Temperature 2 0,30111 0,15056 8,47 0,009
Pressure 2 0,76778 0,38389 21,59 0,000
Temperature*Pressure 4 0,06889 0,01722 0,97 0,470
Error 9 0,16000 0,01778
Total 17 1,29778
Model Summary
S R-sq R-sq(adj) R-sq(pred)
0,133333 87,67% 76,71% 50,68%
Check the residual assumptions:

[Figure: Scatterplots of SRES1 vs FITS1, Temperature and Pressure; Normal Probability Plot of SRES1 (N = 18, AD = 1,039, P-Value = 0,007).]
The hypothesis of normality is rejected (the p-value is lower than 0.05). The probability plot
suggests an overfitting problem. We therefore address the rejection of the normality assumption by
reducing the model (i.e., eliminating the non-significant terms) rather than by applying a Box-Cox
transformation. In general, if the normality assumption is not verified and the model contains
non-significant terms, it is recommended to reduce the model before transforming the data.
Consequently, let us refit the model with the significant terms only (purely additive model).
General Linear Model: Yield versus Temperature; Pressure
Factor Information
Factor Type Levels Values
Temperature Fixed 3 150; 160; 170
Pressure Fixed 3 200; 215; 230
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Temperature 2 0,30111 0,15056 8,55 0,004
Pressure 2 0,76778 0,38389 21,80 0,000
Error 13 0,22889 0,01761
Lack-of-Fit 4 0,06889 0,01722 0,97 0,470
Pure Error 9 0,16000 0,01778
Total 17 1,29778
Model Summary
S R-sq R-sq(adj) R-sq(pred)
0,132691 82,36% 76,94% 66,19%
The Lack of Fit (LOF) test and the pure error will be covered in ClassWork 08 – regression. For
now, you should only remember that p-value of the LOF test should be >0.05.
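The numbers of the reduced table can be recovered from the full-model table: dropping the interaction moves its sum of squares and its degrees of freedom into the error term. The following sketch, with hypothetical variable names, shows the arithmetic; the observation that the Lack-of-Fit row coincides with the dropped interaction row is read directly from the two tables above.

```python
# Sums of squares and degrees of freedom from the full-model ANOVA table
ss_a, ss_b, ss_ab, ss_e = 0.30111, 0.76778, 0.06889, 0.16000
df_a, df_b, df_ab, df_e = 2, 2, 4, 9

# Dropping AB pools its SS and df with the (pure) error
ss_e_reduced = ss_e + ss_ab
df_e_reduced = df_e + df_ab
ms_e_reduced = ss_e_reduced / df_e_reduced

# F statistics of the main effects against the pooled error
f_a = (ss_a / df_a) / ms_e_reduced
f_b = (ss_b / df_b) / ms_e_reduced

# The lack-of-fit F is the dropped interaction tested against the pure error
f_lof = (ss_ab / df_ab) / (ss_e / df_e)

print(round(ss_e_reduced, 5), df_e_reduced, round(ms_e_reduced, 5))
print(round(f_a, 2), round(f_b, 2), round(f_lof, 2))
```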
Before drawing the conclusion, we check the residual assumptions.
[Figure: Scatterplots of SRES2 vs FITS2, Temperature and Pressure; Normal Probability Plot of SRES2 (N = 18, AD = 0,189, P-Value = 0,888).]
Test for Equal Variances: SRES2 vs Temperature; Pressure
[Figure: 95% Bonferroni Confidence Intervals for StDevs, by Temperature and Pressure. Bartlett's Test P-Value = 0,990.]
All the standardized residuals lie in the interval (-3; +3), so no outliers are detected; the residuals
also appear independent of the predicted response and of the factors. The hypotheses of normality and
homogeneous variance cannot be rejected.
In conclusion, both the temperature and the pressure influence the response, and the additive model is
significant.
b) Calculate the residual for the data (1,2,1)=90.7;
Let us calculate the residual $e_{121}$ by hand.

Using the full model, the residual would be:
$e_{121} = y_{121} - \hat{y}_{12} = y_{121} - \bar{y}_{12.} = 90.70 - 90.65 = 0.05$
Using the reduced model instead (this is the correct way to estimate the residual, since it is the model we retained):
$\hat{y}_{12} = \hat{\mu} + \hat{\tau}_1 + \hat{\beta}_2 = \bar{y}_{...} + (\bar{y}_{1..} - \bar{y}_{...}) + (\bar{y}_{.2.} - \bar{y}_{...}) = \bar{y}_{1..} + \bar{y}_{.2.} - \bar{y}_{...} = \frac{542.5}{6} + \frac{544.1}{6} - \frac{1627.4}{18} = 90.689$
$e_{121} = y_{121} - \hat{y}_{12} = 90.7 - 90.689 = 0.011$
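The two residuals can be reproduced numerically. The sketch below is an illustration with hypothetical variable names: it takes the observation with temperature 150, pressure 215, first replicate, and compares the fit of the full model (the cell mean) with the fit of the additive model (row mean plus column mean minus grand mean).

```python
# Yield data from Exercise 2: data[temperature][pressure] -> two replicates
data = {
    150: {200: [90.4, 90.2], 215: [90.7, 90.6], 230: [90.2, 90.4]},
    160: {200: [90.1, 90.3], 215: [90.5, 90.6], 230: [89.9, 90.1]},
    170: {200: [90.5, 90.7], 215: [90.8, 90.9], 230: [90.4, 90.1]},
}


def mean(values):
    return sum(values) / len(values)


# Observation (1,2,1): first temperature, second pressure, first replicate
temperature, pressure, replicate = 150, 215, 0
y_121 = data[temperature][pressure][replicate]

# Full model: the fitted value is the cell mean
fit_full = mean(data[temperature][pressure])
e_full = y_121 - fit_full

# Additive model: row mean + column mean - grand mean
row_mean = mean([y for obs in data[temperature].values() for y in obs])
col_mean = mean([y for row in data.values() for y in row[pressure]])
grand_mean = mean([y for row in data.values() for obs in row.values() for y in obs])
fit_reduced = row_mean + col_mean - grand_mean
e_reduced = y_121 - fit_reduced

print(round(fit_full, 3), round(e_full, 3))
print(round(fit_reduced, 3), round(e_reduced, 3))
```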
c) Under what conditions would you operate this process?
The model is additive: we can pick the best level of temperature independently of the pressure
and vice versa. There are therefore two families of comparisons, so $\alpha_{FAM} = 0.05 \Rightarrow \alpha = \frac{\alpha_{FAM}}{2} = 0.025$.
Since both factors have the same number of levels, we can calculate the three constants for one factor only.
The analysis for the factor temperature, with $r_A = \frac{a(a-1)}{2} = \frac{3(3-1)}{2} = 3$ and $df_E = 13$:
$B_\alpha = t_{\alpha/(2 r_A)}(df_E) = t_{0.025/6}(13) = 3.107$
$T_\alpha = \frac{1}{\sqrt{2}} q_\alpha(a, df_E) = \frac{1}{\sqrt{2}} q_{0.025}(3, 13) = \frac{4.296}{\sqrt{2}} = 3.04$
$S_\alpha = \sqrt{(a-1) F_\alpha(a-1, df_E)} = \sqrt{2 F_{0.025}(2, 13)} = \sqrt{2 \cdot 4.9653} = 3.151$
$S_{\alpha_{FAM}} = \sqrt{(a+b-2) F_{\alpha_{FAM}}(a+b-2, df_E)} = \sqrt{4 F_{0.05}(4, 13)} = \sqrt{4 \cdot 3.179} = 3.57$
The critical value is $T_\alpha \sqrt{\frac{2 MS_E}{bn}} = 3.04 \sqrt{\frac{2 \cdot 0.01761}{6}} = 0.2329$
(For demonstration purposes, we build the table of differences by hand.)
Let us build the matrix of the differences between the temperature means:
Temperature 150 170
160 0.167 0.317
150 0.15
Ordering the levels by increasing mean, the conclusion is:
160 150 170
The temperatures 160° and 150° are not statistically different, and the same conclusion is drawn for
the temperatures 150° and 170°; only 160° and 170° differ.
The same analysis is made for the factor pressure. The constants are the same because a=b, and
the critical value is still 0.233. The matrix of the differences is:
Pressure 200 215
230 0.183 0.5
200 0.317
230 200 215
In conclusion, the pressures 230 psi and 200 psi are not statistically different, while 215 psi differs from both.
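The pairwise comparisons above can be sketched in a few lines. This is only an illustration under stated assumptions: the studentized range quantile q = 4.296 and MS_E = 0.01761 are copied from the text (they are not computed here), and all names are hypothetical. Because the unrounded Tukey constant is used, the critical value differs from the hand value in the fourth decimal.

```python
import math
from itertools import combinations

# Yield data from Exercise 2: data[temperature][pressure] -> two replicates
data = {
    150: {200: [90.4, 90.2], 215: [90.7, 90.6], 230: [90.2, 90.4]},
    160: {200: [90.1, 90.3], 215: [90.5, 90.6], 230: [89.9, 90.1]},
    170: {200: [90.5, 90.7], 215: [90.8, 90.9], 230: [90.4, 90.1]},
}

Q_TAB = 4.296      # q_0.025(3, 13), tabulated value quoted in the text
MS_E = 0.01761     # error mean square of the additive model
OBS_PER_MEAN = 6   # b*n (= a*n here) observations behind each level mean

# Tukey constant T = q / sqrt(2), critical value T * sqrt(2 * MS_E / (b*n))
critical = Q_TAB / math.sqrt(2) * math.sqrt(2 * MS_E / OBS_PER_MEAN)


def mean(values):
    return sum(values) / len(values)


temperature_means = {
    t: mean([y for obs in row.values() for y in obs]) for t, row in data.items()
}
pressures = list(next(iter(data.values())))
pressure_means = {
    p: mean([y for row in data.values() for y in row[p]]) for p in pressures
}


def significant_pairs(means, critical):
    """Pairs of levels whose means differ by more than the critical value."""
    return {
        (u, v): round(abs(means[u] - means[v]), 3)
        for u, v in combinations(sorted(means), 2)
        if abs(means[u] - means[v]) > critical
    }


sig_temperature = significant_pairs(temperature_means, critical)
sig_pressure = significant_pairs(pressure_means, critical)
print(round(critical, 4), sig_temperature, sig_pressure)
```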
If we wish to maximize the yield of the chemical process, we select a pressure of 215 psi. For the
temperature, instead, the levels 150° and 170° are not statistically different. If we consider that
a higher temperature corresponds to a higher energy cost, we would choose the temperature 150°.
Alternatively, if the budget allows for more experiments, we could focus on these two
levels of temperature to better understand which one gives the higher yield.
With Minitab:
Stat > ANOVA > Comparison; Options: confidence level 97.5
Exercise 3 [M]
An article describes an experiment to investigate the effect of the type of glass and the type of
phosphor on the brightness of a television tube. The response variable is the current necessary (in
microamps) to obtain a specified brightness level. The data are as follows:
Phosphor
Glass 1 2 3
1 280; 290; 285 300; 310; 295 290; 285; 290
2 230; 235; 240 260; 240; 235 220; 225; 230
a) Analyze the data and draw conclusions. Use α=0.05;

b) Use Tukey’s test with Minitab to determine the best condition (α=0.05).
Solution

We start by using the individual value plot since we have three replicates for each condition.
[Figure: Individual Value Plot of Brightness. Vertical axis: Brightness; horizontal axis: Phosphor (1, 2, 3) within Glass (1, 2).]
The individual value plot shows no evident outliers, and the variability appears uniform.
[Figure: Main Effects Plot for Brightness (by Glass and by Phosphor) and Interaction Plot for Brightness (Phosphor on the horizontal axis, one line per Glass).]
From the Main Effect plot, the type of glass seems to influence the response more than the type of
phosphor. In the Interaction plot, the lines are approximately parallel, indicating a probable lack of
interaction between factors glass and phosphor.
General Linear Model: Brightness versus Glass; Phosphor
Factor Information
Factor Type Levels Values
Glass Fixed 2 1; 2
Phosphor Fixed 3 1; 2; 3
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Glass 1 14450,0 14450,0 273,79 0,000
Phosphor 2 933,3 466,7 8,84 0,004
Glass*Phosphor 2 133,3 66,7 1,26 0,318
Error 12 633,3 52,8
Total 17 16150,0
Model Summary
S R-sq R-sq(adj) R-sq(pred)
7,26483 96,08% 94,44% 91,18%
Check for residual assumptions

[Figure: Scatterplots of SRES1 vs FITS1, Glass and Phosphor; Normal Probability Plot of SRES1 (N = 18, AD = 0,359, P-Value = 0,411).]
Test for Equal Variances: SRES1 vs Glass; Phosphor
[Figure: Test for Equal Variances plot, by Glass and Phosphor. Bartlett's Test P-Value = 0,458.]
There are no outliers. The hypothesis of normality cannot be rejected, and the same is true for the
hypothesis of equal variances, so the residual assumptions are verified. In conclusion, the type of glass
and the type of phosphor are significant, whereas their interaction is not.
b) Use Tukey’s test to determine the best condition (α=0.05).

In this case we have two families, Glass and Phosphor: both have p-values < 0.05 and their
interaction is not significant.
The Minitab command is: Stat > ANOVA > General Linear Model > Comparison
Options: confidence level 97,5%
Grouping Information Using the Tukey Method and 97,5% Confidence
Glass N Mean Grouping
1 9 291,667 A
2 9 235,000 B
Means that do not share a letter are significantly different.

Phosphor N Mean Grouping
2 6 273,333 A
1 6 260,000 B
3 6 256,667 B
If we wish to decrease the current required, we would recommend glass type 2 and
either phosphor type 1 or phosphor type 3.
Exercise 4 [M]
Johnson and Leone describe an experiment to investigate warping of copper plates. The two factors
studied were the temperature and the copper content of the plates. The response variable was a
measure of the amount of warping. The data were as follows:
Copper Content (%)

T (°C) 40 60 80 100
50 17; 20 16; 21 24; 22 28; 27
75 12; 9 18; 13 17; 12 27; 31
100 16; 12 18; 21 25; 23 30; 23
125 21; 17 23; 21 23; 22 29; 31

b) If low warping is desirable, what level of copper content would you specify? Use Tukey with
Minitab (no manual calculations are required).
c) Suppose that temperature cannot be easily controlled in the environment in which the copper
plates are to be used. Does this change your previous answer?
Solution

We have replicated conditions, so we start by using the individual value plot.
[Figure: Individual Value Plot of Warping. Vertical axis: Warping; horizontal axis: Copper (40, 60, 80, 100) within Temperature (50, 75, 100, 125).]
The variability appears uniform and no evident outliers are shown in the graph.
[Figure: Main Effects Plot for Warping (by Temperature and by Copper) and Interaction Plot for Warping (Copper on the horizontal axis, one line per Temperature).]
The factor copper seems to affect the response variable more than the temperature does. The interaction
between the temperature and the copper does not seem particularly relevant.
General Linear Model: Warping versus Temperature; Copper
Factor Information
Factor Type Levels Values
Temperature Fixed 4 50; 75; 100; 125
Copper Fixed 4 40; 60; 80; 100
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Temperature 3 156,1 52,031 7,67 0,002
Copper 3 698,3 232,781 34,33 0,000
Temperature*Copper 9 113,8 12,642 1,86 0,133
Error 16 108,5 6,781
Total 31 1076,7
Model Summary
S R-sq R-sq(adj) R-sq(pred)
2,60408 89,92% 80,48% 59,69%
Before drawing the conclusion, let us check the residual assumptions.
[Figure: Scatterplots of SRES1 vs FITS1, Temperature and Copper; Normal Probability Plot of SRES1 (N = 32, AD = 0,666, P-Value = 0,074).]
Test for Equal Variances: SRES1 vs Temperature; Copper
[Figure: Test for Equal Variances plot, by Temperature and Copper. Bartlett's Test P-Value = 0,984.]
No outliers appear in the graph: all the standardized residuals lie in the interval (-3,+3).
Moreover, the residuals appear independent of the predicted response and of both factors. The
hypotheses of normality and homogeneous variance cannot be rejected.
b) If low warping is desirable, what level of copper content would you specify?
The Minitab command is: Stat > ANOVA > General Linear Model > Comparison
Options: confidence level 95%
Grouping Information Using the Tukey Method and 95% Confidence
Copper N Mean Grouping
100 8 28,250 A
80 8 21,000 B
60 8 18,875 B C
40 8 15,500 C
If low warping is desired, there is no significant difference between the copper levels 60 and 40. A further
experimental campaign focused on these two levels would be desirable.
c) Suppose that temperature cannot be easily controlled in the environment in which the copper
plates are to be used. Does this change your previous answer?
No, it does not. The model is purely additive; this means that we can optimize separately the
temperature and the copper.
Exercise 5 [M]
The quality control department of a fabric finishing plant is studying the effect of several factors on
the dyeing of cotton-synthetic cloth. Three operators, three cycle times, and two temperatures were
selected. Three small specimens of cloth were dyed under each set of conditions. The finished cloth
was compared to a standard, and a numerical score was assigned. The results follow:
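Temperature 300 (Operator 1, 2, 3) and Temperature 350 (Operator 1, 2, 3); three replicates per cell
Time 300/Op.1 300/Op.2 300/Op.3 350/Op.1 350/Op.2 350/Op.3
40 23; 24; 25 27; 28; 26 31; 32; 29 24; 23; 28 38; 36; 35 34; 36; 39
50 36; 35; 36 34; 38; 39 33; 34; 35 37; 39; 35 34; 38; 36 34; 36; 31
60 28; 24; 27 35; 35; 34 26; 27; 25 26; 29; 25 36; 37; 34 28; 26; 24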
a) How should the experiment be conducted?

b) Analyze the data. Use α=0.05.
Solution
a) How should the experiment be conducted?
The experiment should be conducted randomizing the order of the experiments.
b) Analyze the data. Use α=0.05
[Figure: Individual Value Plot of Score. Vertical axis: Score; horizontal axis: Operator (1, 2, 3) within Time (40, 50, 60) within Temperature (300, 350).]
The individual value plot indicates that the variability appears uniform. However, some observations could be
outliers: for instance, the observation (300, 50, 2, 1) is somewhat far from the other two replicates, and the
same can be said for the observations (300, 60, 2, 2) and (350, 40, 1, 3).
[Figure: Main Effects Plot for Score (by Temperature, Time and Operator) and Interaction Plot for Score (two-way panels for Temperature, Time and Operator).]
The factors time and operator seem to influence the response more than the factor temperature. With
three or more factors, the interaction plot shows a separate two-way interaction plot for each pair of
factors. The factors time and operator seem to interact, given the lack of parallelism of
the lines.
General Linear Model: Score versus Temperature; Time; Operator

Factor Information
Factor Type Levels Values
Temperature Fixed 2 300; 350
Time Fixed 3 40; 50; 60
Operator Fixed 3 1; 2; 3
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Temperature 1 50,07 50,074 15,28 0,000
Time 2 436,00 218,000 66,51 0,000
Operator 2 261,33 130,667 39,86 0,000
Temperature*Time 2 78,81 39,407 12,02 0,000
Temperature*Operator 2 11,26 5,630 1,72 0,194
Time*Operator 4 355,67 88,917 27,13 0,000
Temperature*Time*Operator 4 46,19 11,546 3,52 0,016
Error 36 118,00 3,278
Total 53 1357,33
Model Summary
S R-sq R-sq(adj) R-sq(pred)
1,81046 91,31% 87,20% 80,44%
We have to check the residual assumptions before drawing the conclusions.
[Figure: Scatterplots of SRES1 vs FITS1, Temperature, Time and Operator; Normal Probability Plot of SRES1 (N = 54, AD = 0,376, P-Value = 0,400).]
Test for Equal Variances: SRES1 vs Temperature; Time; Operator
[Figure: Test for Equal Variances plot, by Temperature, Time and Operator. Bartlett's Test P-Value = 0,867.]
From the scatterplots, no outliers appear. The hypotheses of normality and equal variance cannot be
rejected. Thus, the model assumptions are verified. In conclusion, the three factors are significant, as
are the interactions time*operator and temperature*time and the three-way interaction.
If we wish to estimate the model parameters, one of the possible outputs of the GLM command is:
Coefficients
Term Coef SE Coef T-Value P-Value VIF
Constant 31,556 0,246 128,08 0,000
Temperature
300 -0,963 0,246 -3,91 0,000 1,00
Time
40 -1,667 0,348 -4,78 0,000 1,33
50 4,000 0,348 11,48 0,000 1,33
Operator
1 -2,444 0,348 -7,02 0,000 1,33
2 2,889 0,348 8,29 0,000 1,33
Temperature*Time
300 40 -1,704 0,348 -4,89 0,000 1,33
300 50 0,963 0,348 2,76 0,009 1,33
Temperature*Operator
300 1 0,519 0,348 1,49 0,145 1,33
300 2 -0,593 0,348 -1,70 0,098 1,33
Time*Operator
40 1 -2,944 0,493 -5,98 0,000 1,78
40 2 -1,111 0,493 -2,25 0,030 1,78
50 1 3,222 0,493 6,54 0,000 1,78
50 2 -1,944 0,493 -3,95 0,000 1,78
Temperature*Time*Operator
300 40 1 1,648 0,493 3,34 0,002 1,78
300 40 2 -1,407 0,493 -2,86 0,007 1,78
300 50 1 -1,185 0,493 -2,41 0,021 1,78
300 50 2 1,093 0,493 2,22 0,033 1,78
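The table lists one coefficient fewer than the number of levels of each factor. Assuming the effects are coded with a sum-to-zero constraint (the usual constraint for solving the normal equations, and an assumption here rather than something stated in the output), the coefficient of the omitted level is minus the sum of the listed ones. The sketch below, with hypothetical names, recovers the omitted main-effect coefficients and checks the Time = 60 case against the raw data.

```python
# Main-effect coefficients copied from the Coefficients table
constant = 31.556
temperature_coef = {300: -0.963}
time_coef = {40: -1.667, 50: 4.000}
operator_coef = {1: -2.444, 2: 2.889}


def complete(coefs, omitted_level):
    """Add the omitted level, assuming the effects sum to zero."""
    full = dict(coefs)
    full[omitted_level] = -sum(coefs.values())
    return full


temperature_full = complete(temperature_coef, 350)
time_full = complete(time_coef, 60)
operator_full = complete(operator_coef, 3)

# Cross-check: all 18 observations taken at Time = 60 (both temperatures)
time60_obs = [28, 24, 27, 35, 35, 34, 26, 27, 25,
              26, 29, 25, 36, 37, 34, 28, 26, 24]
time60_mean = sum(time60_obs) / len(time60_obs)

print(temperature_full, time_full, operator_full)
print(round(constant + time_full[60], 3), round(time60_mean, 3))
```

In a balanced design the constant plus a main-effect coefficient reproduces the corresponding level mean, which is what the last line compares.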
Exercise 6 [M]
Consider the three-factor model:
$y_{ijk} = \mu + \tau_i + \beta_j + \gamma_k + (\tau\beta)_{ij} + (\beta\gamma)_{jk} + \varepsilon_{ijk}$, with $i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, b$; $k = 1, 2, \ldots, c$
Notice that there is only one replicate. Assuming all the factors are fixed, write down the analysis of
variance table, including the expected mean squares (use the Montgomery’s tables concerning the
complete model).
Solution
The model is $y_{ijk} = \mu + \tau_i + \beta_j + \gamma_k + (\tau\beta)_{ij} + (\beta\gamma)_{jk} + \varepsilon_{ijk}$

Because there are no replicates, the error is estimated using the terms that do not appear
in the model: $SS_{AC}$ and $SS_{ABC}$.
We are asked to write down the ANOVA table and the expressions of E(MS) using Montgomery's
table. We can simply delete the rows of the terms that are not included in the model because the
design is orthogonal: the remaining estimates do not change when some rows are deleted. The table is:
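Source SS df MS E(MS)
A $SS_A$ $a-1$ $SS_A/(a-1)$ $\sigma^2 + \frac{bc}{a-1} \sum_i \tau_i^2$
B $SS_B$ $b-1$ $SS_B/(b-1)$ $\sigma^2 + \frac{ac}{b-1} \sum_j \beta_j^2$
C $SS_C$ $c-1$ $SS_C/(c-1)$ $\sigma^2 + \frac{ab}{c-1} \sum_k \gamma_k^2$
AB $SS_{AB}$ $(a-1)(b-1)$ $SS_{AB}/[(a-1)(b-1)]$ $\sigma^2 + \frac{c}{(a-1)(b-1)} \sum_{ij} (\tau\beta)_{ij}^2$
BC $SS_{BC}$ $(b-1)(c-1)$ $SS_{BC}/[(b-1)(c-1)]$ $\sigma^2 + \frac{a}{(b-1)(c-1)} \sum_{jk} (\beta\gamma)_{jk}^2$
Error $SS_E$ $df_E$ $SS_E/df_E$ $E(MS_E)$
Total $SS_{TOT}$ $abc-1$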
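Let us calculate the degrees of freedom of the error:
$SS_E = SS_{AC} + SS_{ABC}$
$df_E = df_{AC} + df_{ABC} = (a-1)(c-1) + (a-1)(b-1)(c-1) = b(a-1)(c-1)$
Let us calculate the expected mean square of the error:
$MS_E = \frac{SS_{AC} + SS_{ABC}}{df_{AC} + df_{ABC}} = \frac{df_{AC}}{df_{AC} + df_{ABC}} \frac{SS_{AC}}{df_{AC}} + \frac{df_{ABC}}{df_{AC} + df_{ABC}} \frac{SS_{ABC}}{df_{ABC}} = \frac{df_{AC}}{df_{AC} + df_{ABC}} MS_{AC} + \frac{df_{ABC}}{df_{AC} + df_{ABC}} MS_{ABC}$
$E(MS_E) = \frac{df_{AC}}{df_{AC} + df_{ABC}} E(MS_{AC}) + \frac{df_{ABC}}{df_{AC} + df_{ABC}} E(MS_{ABC})$
$= \frac{df_{AC}}{df_{AC} + df_{ABC}} \left[ \sigma^2 + \frac{b}{(a-1)(c-1)} \sum_{ik} (\tau\gamma)_{ik}^2 \right] + \frac{df_{ABC}}{df_{AC} + df_{ABC}} \left[ \sigma^2 + \frac{1}{(a-1)(b-1)(c-1)} \sum_{ijk} (\tau\beta\gamma)_{ijk}^2 \right]$
$= \sigma^2 + \frac{df_{AC}}{df_{AC} + df_{ABC}} \frac{b}{(a-1)(c-1)} \sum_{ik} (\tau\gamma)_{ik}^2 + \frac{df_{ABC}}{df_{AC} + df_{ABC}} \frac{1}{(a-1)(b-1)(c-1)} \sum_{ijk} (\tau\beta\gamma)_{ijk}^2$
If the assumed model holds, the terms $(\tau\gamma)_{ik}$ and $(\tau\beta\gamma)_{ijk}$ are zero and $E(MS_E) = \sigma^2$.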
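The degrees-of-freedom bookkeeping can be checked numerically. The sketch below is only an illustration: the level counts are arbitrary (hypothetical) examples, and the function name is not part of the original solution. It verifies that the pooled error has b(a-1)(c-1) degrees of freedom, that all the rows add up to abc-1, and that the two weights appearing in E(MS_E) add up to one.

```python
def degrees_of_freedom(a, b, c):
    """Degrees of freedom of the reduced three-factor model without replicates."""
    df = {
        "A": a - 1,
        "B": b - 1,
        "C": c - 1,
        "AB": (a - 1) * (b - 1),
        "BC": (b - 1) * (c - 1),
    }
    df_ac = (a - 1) * (c - 1)
    df_abc = (a - 1) * (b - 1) * (c - 1)
    df["Error"] = df_ac + df_abc   # AC and ABC are pooled into the error
    return df, df_ac, df_abc


checks = []
for a, b, c in [(2, 3, 4), (3, 3, 3), (4, 2, 5)]:   # arbitrary example sizes
    df, df_ac, df_abc = degrees_of_freedom(a, b, c)
    weight_ac = df_ac / df["Error"]      # weight of MS_AC in MS_E
    weight_abc = df_abc / df["Error"]    # weight of MS_ABC in MS_E
    checks.append((
        df["Error"] == b * (a - 1) * (c - 1),
        sum(df.values()) == a * b * c - 1,
        abs(weight_ac + weight_abc - 1.0) < 1e-12,
    ))
print(checks)
```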
18
Exercise 7
An experiment was conducted in order to measure the amount of the product y. Two factors were
studied (A and B). The levels chosen were coded to the (-1,1). The levels of A were -1 and +1 and
the levels of B were -1, 0 and +1. The data were as follows:
B
A -1 0 1
-1 4.70; 3.55; 4.01; 2.38 8.08; 9.47; 9.12; 9.51 17.29; 15.92; 15.75; 16.86
1 20.30; 21.51; 19.89; 21.04 21.47; 18.79; 21.29; 18.39 21.61; 16.28; 18.08; 18.03
a) Analyze the data. Use α=0.05;

b) Under what conditions would you operate this process? Evaluate by hand the three constants
and the critical value, then use Minitab.
Solution
[Figure: Individual Value Plot of Productivity. Vertical axis: Productivity; horizontal axis: B (-1, 0, 1) within A (-1, 1).]
Attention should be paid to the observations in the cells (1,0) and (1,1): some of them appear as potential outliers.
[Figure: Main Effects Plot for Productivity (by A and by B) and Interaction Plot for Productivity (B on the horizontal axis, one line per level of A).]
The factor A seems to have an influence on the response greater than the factor B. The interaction
seems to be fairly relevant.
General Linear Model: Productivity versus A; B

Factor Information
Factor Type Levels Values
A Fixed 2 -1; 1
B Fixed 3 -1; 0; 1
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
A 1 600,40 600,400 356,70 0,000
B 2 113,08 56,542 33,59 0,000
A*B 2 227,03 113,516 67,44 0,000
Error 18 30,30 1,683
Total 23 970,81
Model Summary
S R-sq R-sq(adj) R-sq(pred)
1,29739 96,88% 96,01% 94,45%
Before drawing the conclusion, let us check the residual assumptions

[Figure: Scatterplots of SRES1 vs FITS1, A and B; Normal Probability Plot of SRES1 (N = 24, AD = 0,216, P-Value = 0,827).]
Test for Equal Variances: SRES1 vs A; B
[Figure: Test for Equal Variances plot, by A and B. Bartlett's Test P-Value = 0,218.]
From the graphs, we can observe that one residual in the cell (1,1) behaves differently from
the others, but its standardized value is lower than 3 in absolute value, so it cannot be classified as an
outlier. The hypothesis of normality and the hypothesis of equal variance cannot be rejected.
In conclusion, both factors are significant as well as their interaction.
b) Under what conditions would you operate this process?

The complete model is significant; we have to choose the combination of factor levels that maximizes
the response. A multiple comparison is required. First of all, we calculate the Bonferroni, Scheffé
and Tukey constants, with $a = 2$, $b = 3$, $df_E = 18$ and $r = \frac{ab(ab-1)}{2} = \frac{6(6-1)}{2} = 15$:
Bonferroni: $B_{0.05} = t_{0.05/30}(18) = 3.38$
Tukey: $T_{0.05} = \frac{1}{\sqrt{2}} q_{0.05}(ab, df_E) = \frac{1}{\sqrt{2}} q_{0.05}(6, 18) = \frac{4.49}{\sqrt{2}} = 3.18$
Scheffé: $S_{0.05} = \sqrt{(ab-1) F_{0.05}(ab-1, df_E)} = \sqrt{5 F_{0.05}(5, 18)} = \sqrt{13.864} = 3.72$
The Tukey constant is the smallest of the three.
The critical value is: $T_{0.05} \sqrt{\frac{2 MS_E}{n}} = 3.18 \sqrt{\frac{2 \cdot 1.68}{4}} = 2.915$
The mean of each cell is:
A B Cell mean
-1 -1 3.660
-1 0 9.045
-1 1 16.455
1 1 18.500
1 0 19.985
1 -1 20.685
If we compare each cell mean with the others (absolute differences; the first row gives the mean of each column cell):

Cell Mean (-1,0) (-1,1) (1,1) (1,0) (1,-1)
Mean 9.045 16.455 18.500 19.985 20.685
(-1,-1) 3.660 5.385 12.795 14.840 16.325 17.025
(-1,0) 9.045 7.410 9.455 10.940 11.640
(-1,1) 16.455 2.045 3.530 4.230
(1,1) 18.500 1.485 2.185
(1,0) 19.985 0.700
In the table, the significant differences are those larger than the critical value. The table of the
differences is depicted in the next graph. In conclusion, the combinations (1,-1), (1,1) and (1,0) are not statistically different.
(-1,-1) (-1,0) (-1,1) (1,1) (1,0) (1,-1)

A*B N Mean Grouping
1 -1 4 20,685 A
1 0 4 19,985 A
1 1 4 18,500 A B
-1 1 4 16,455 B
-1 0 4 9,045 C
-1 -1 4 3,660 D
To maximize the response, we can choose among (A,B) = (1,-1), (1,0) or (1,1). These
conditions are statistically equivalent.
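The choice of the best cells can be reproduced from the raw data. The sketch below is an illustration under stated assumptions: the studentized range quantile q = 4.49 and MS_E = 1.683 are copied from the text and from the ANOVA table (they are not computed here), and all names are hypothetical. It computes the cell means, the Tukey critical value for cell means based on n = 4 replicates, and the cells that are not significantly below the best one.

```python
import math

# Productivity data from Exercise 7: cells[(A, B)] -> four replicates
cells = {
    (-1, -1): [4.70, 3.55, 4.01, 2.38],
    (-1, 0): [8.08, 9.47, 9.12, 9.51],
    (-1, 1): [17.29, 15.92, 15.75, 16.86],
    (1, -1): [20.30, 21.51, 19.89, 21.04],
    (1, 0): [21.47, 18.79, 21.29, 18.39],
    (1, 1): [21.61, 16.28, 18.08, 18.03],
}

Q_TAB = 4.49   # q_0.05(6, 18), tabulated value quoted in the text
MS_E = 1.683   # error mean square from the ANOVA table
N_REP = 4      # replicates behind each cell mean

# Tukey constant T = q / sqrt(2), critical value T * sqrt(2 * MS_E / n)
critical = Q_TAB / math.sqrt(2) * math.sqrt(2 * MS_E / N_REP)

means = {cell: sum(obs) / len(obs) for cell, obs in cells.items()}
best = max(means, key=means.get)

# Cells whose mean is within the critical value of the best cell
equivalent_to_best = sorted(
    cell for cell in means if means[best] - means[cell] <= critical
)
print(round(critical, 3), best, equivalent_to_best)
```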
Exercise 8 [February 28th 2012]
An experiment, presented in the paper “A Systematic Approach to the Analysis of Means” (E. G.
Schilling, Journal of Quality Technology, 1973), investigated the washing power of a solution as
measured by the reflectance of pieces of cotton cloth after washing. Pieces of cloth were soiled with
colloidal graphite and liquid paraffin and then washed for 20 minutes at 60° followed by two rinses
at 40° and 30°, respectively. The three factors in the washing solution of interest were:
• “sodium carbonate” (Factor A, levels 0%, 0.05%, and 0.1%);
• “detergent” (Factor B, levels 0.05%, 0.1%, and 0.2%);
• “sodium carboxymethyl cellulose” (Factor C, levels 0%, 0.025%, 0.05%).
One observation was taken per treatment combination, and the responses are shown in the next table:
Cellulose
Carbonate Detergent 1 2 3
1 1 10,6 14,9 18,2
1 2 19,8 24,3 23,2
1 3 27 31,5 34
2 1 19,7 25,5 25,9
2 2 32,9 36,4 38,9
2 3 36,1 39 40,6
3 1 22,3 29,4 29,7
3 2 32 41 41,6
3 3 32,1 41,5 38,7

b) If you wish to increase the quality of the washing power of a solution, under what conditions
would you operate this process? Use Tukey with α=5%; calculate manually the Tukey’s
constant and the critical value.
Solution
[Figure: Main Effects Plot for Washing (by Carbonate, Detergent and Cellulose) and Interaction Plot for Washing (two-way panels for Carbonate, Detergent and Cellulose).]
The detergent and the carbonate seem relevant and their effects are large compared to the effect of
the cellulose. The carbonate and detergent show nonparallel lines, indicating a probable interaction.
Instead, the interaction lines of detergent and cellulose are parallel; we do not suspect any interaction.
The same appears to hold for the interaction between carbonate and cellulose.
General Linear Model: Washing versus Carbonate; Detergent; Cellulose

Factor Information
Factor Type Levels Values
Carbonate Fixed 3 1; 2; 3
Detergent Fixed 3 1; 2; 3
Cellulose Fixed 3 1; 2; 3
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Carbonate 2 723,41 361,707 207,34 0,000
Detergent 2 933,03 466,516 267,42 0,000
Cellulose 2 224,19 112,096 64,26 0,000
Carbonate*Detergent 4 73,37 18,342 10,51 0,003
Carbonate*Cellulose 4 18,23 4,557 2,61 0,115
Detergent*Cellulose 4 1,01 0,253 0,14 0,960
Error 8 13,96 1,745
Total 26 1987,20
Model Summary
S R-sq R-sq(adj) R-sq(pred)
1,32081 99,30% 97,72% 92,00%
Let us check the residual assumptions before drawing the conclusions.

[Figure: Scatterplots of SRES1 vs FITS1, Carbonate, Detergent and Cellulose; Normal Probability Plot of SRES1 (N = 27, AD = 0,960, P-Value = 0,013).]
The hypothesis of normality is not verified. Looking at the ANOVA table, we can observe that the
interactions Carbonate*Cellulose and Detergent*Cellulose are not relevant. Thus, we delete them
from the model.
General Linear Model: Washing versus Carbonate; Detergent; Cellulose

Factor Information
Factor Type Levels Values
Carbonate Fixed 3 1; 2; 3
Detergent Fixed 3 1; 2; 3
Cellulose Fixed 3 1; 2; 3
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Carbonate 2 723,41 361,707 174,34 0,000
Detergent 2 933,03 466,516 224,86 0,000
Cellulose 2 224,19 112,096 54,03 0,000
Carbonate*Detergent 4 73,37 18,342 8,84 0,001
Error 16 33,19 2,075
Total 26 1987,20
Model Summary
S R-sq R-sq(adj) R-sq(pred)
1,44037 98,33% 97,29% 95,24%
Let us check the residual assumptions

[Figure: Scatterplots of SRES2 vs FITS2, Carbonate, Detergent and Cellulose; Normal Probability Plot of SRES2 (N = 27, AD = 0,302, P-Value = 0,552).]
The normality hypothesis cannot be rejected. Looking at the scatterplot, no outliers appear and the
variance appears homogeneous among the factor levels. The assumptions are verified. In conclusion,
the three factors influence the response as well as the interaction Carbonate*Detergent.
b) If you wish to increase the quality of the washing power of a solution, under what conditions
would you operate this process? Use Tukey with α=5% and calculate manually the Tukey’s
constant.
To choose the level of each factor that increases the washing power, a multiple
comparison has to be done. The factor cellulose does not interact with the other factors, so its best level
can be chosen independently. The effect of the factor carbonate, instead, is not independent of the
effect of the detergent: their interaction is significant. We have two families
(Carbonate*Detergent and Cellulose), thus: $\alpha_{FAM} = 0.05 \Rightarrow \alpha = \frac{\alpha_{FAM}}{2} = 0.025$
The Tukey constants are:
Tukey (AB): $T_{0.025} = \frac{1}{\sqrt{2}} q_{0.025}(ab, df_E) = \frac{1}{\sqrt{2}} q_{0.025}(9, 16) = \frac{5.550}{\sqrt{2}} = 3.92$
Tukey (C): $T_{0.025} = \frac{1}{\sqrt{2}} q_{0.025}(c, df_E) = \frac{1}{\sqrt{2}} q_{0.025}(3, 16) = \frac{4.148}{\sqrt{2}} = 2.93$
The critical values are:
AB: $T_\alpha \sqrt{\frac{2 MS_E}{cn}} = 3.92 \sqrt{\frac{2 \cdot 2.07}{3}} = 4.605$
C: $T_\alpha \sqrt{\frac{2 MS_E}{abn}} = 2.93 \sqrt{\frac{2 \cdot 2.07}{9}} = 1.987$
Cellulose N Mean Grouping
3 9 32,3111 A
2 9 31,5000 A
1 9 25,8333 B
Carbonate*Detergent N Mean Grouping
2 3 3 38,5667 A
3 2 3 38,2000 A
3 3 3 37,4333 A
2 2 3 36,0667 A
1 3 3 30,8333 B
3 1 3 27,1333 B C
2 1 3 23,7000 C D
1 2 3 22,4333 D
1 1 3 14,5667 E
If we wish to maximize the washing power, concerning the carbonate and the detergent there is
no significant difference among the conditions (2,3), (3,2), (3,3) and (2,2). Concerning the cellulose, the
levels 2 and 3 are not statistically different.
Exercise 9
We want to determine whether different hardening methods (A) and processing times (B) affect
external hardness. Five different hardening methods and three different processing times are used.
Suppose that we use specimens with a rectangular section and replicate the experiment four
times.
We measure the specimen hardness in the center of the largest face.
Let us calculate the power using the direct method, if we are interested in:
a) a ratio d/σ=3.5 concerning the factor A. Verify the results with Minitab;
b) a difference greater than or equal to 4 among the levels of B (σ2=20);
c) a difference greater than 2.5σ concerning the interaction AB.
Solution
From the text: a=5, b=3, n=4.

$df_A = a - 1 = 4$, $df_B = b - 1 = 2$, $df_{AB} = (a-1)(b-1) = 8$, $df_E = ab(n-1) = 45$
a) Let us calculate the power using the direct method, if we are interested in a ratio d/σ=3.5
concerning the factor A. Verify the results with Minitab
$Power = 1 - \beta = \mathrm{Prob}\{ F(df_A, df_E, \delta) > F_\alpha(df_A, df_E) \}$
• Let us calculate $F_\alpha(df_A, df_E)$
Inverse Cumulative Distribution Function
F distribution with 4 DF in numerator and 45 DF in denominator
P(X<=x) x
0,95 2,57874
• Let us calculate the noncentrality parameter
$\delta_A = \frac{b \cdot n}{2} \left( \frac{d}{\sigma} \right)^2 = \frac{3 \cdot 4}{2} \cdot 3.5^2 = 73.5$
• Let us calculate β
Cumulative Distribution Function
F distribution with 4 DF in numerator and 45 DF in denominator and
noncentrality parameter 73,5
x P(X<=x)
2,57874 0,0000000
• The power is: Power = 1- β = 1-0,0 = 1
Minitab command: Stat > Power and Sample Size > General Full Factorial Design
General Full Factorial Design

α = 0,05 Assumed standard deviation = 4,47214
Factors: 2 Number of levels: 5; 3
Include terms in the model up through order: 2
Not including blocks in model.
Maximum            Total
Difference  Reps   Runs   Power
15,6525     4      60     1,00000
[Figure: Power Curve for General Full Factorial. Power versus Maximum Difference for Reps = 4. Assumptions: α = 0,05; StDev = 4,47214; 2 factors with 5 and 3 levels; terms up to order 2; no blocks.]
b) Let us calculate the power using the direct method, if we are interested in a difference greater
than or equal to 4 among the levels of B;
$\text{Power} = 1 - \beta = \Pr\{ F(df_B, df_E, \delta) > F_\alpha(df_B, df_E) \}$

• Let us calculate Fα (df B , df E )
P(X<=x) x
0,95 3,20432
• Let us calculate the noncentrality parameter
$\delta_B = \frac{a \cdot n}{2} \left( \frac{d}{\sigma} \right)^2 = \frac{5 \cdot 4}{2} \left( \frac{4}{2\sqrt{5}} \right)^2 = 8$
noncentrality parameter 8
x P(X<=x)
3,20432 0,313553
• The power is: Power = 1- β = 1-0,313553 = 0.686447
We cannot verify this result directly with Minitab because: “Minitab performs the calculation based on the
main effect with the largest number of levels to provide conservative results”.
Let us nevertheless enter the new difference d = 4 in Minitab. The output is:
General Full Factorial Design

α = 0,05 Assumed standard deviation = 4,47214
Factors: 2 Number of levels: 5; 3
Include terms in the model up through order: 2
Not including blocks in model.
Maximum            Total
Difference  Reps   Runs   Power
4           4      60     0,345190
The power reported by Minitab is lower because it refers to the most conservative case, i.e., the
main effect with the largest number of levels (here factor A, with five levels).
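A plausible reading of the Minitab note, stated here as an assumption and not taken from the Minitab documentation, is that the same noncentrality formula is applied to the factor with five levels. The multiplier then becomes b·n instead of a·n, and the numerator degrees of freedom become 4 instead of 2:

```latex
\delta = \frac{b \cdot n}{2} \left( \frac{d}{\sigma} \right)^2
       = \frac{3 \cdot 4}{2} \cdot \frac{4^2}{20} = 4.8,
\qquad
\text{Power} = \Pr\{ F(4, 45, \delta = 4.8) > F_{0.05}(4, 45) = 2.57874 \}
```

A smaller noncentrality parameter (4.8 instead of 8) combined with more numerator degrees of freedom goes in the direction of a lower power, which is consistent with the gap between 0,345190 and 0.686447.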
c) Let us calculate the power using the direct method, if we are interested in a difference greater
than 2.5σ concerning the interaction AB.
$\text{Power} = 1 - \beta = \Pr\{ F(df_{AB}, df_E, \delta) > F_\alpha(df_{AB}, df_E) \}$

• Let us calculate Fα (df AB , df E )
P(X<=x) x
0,95 2,15213
• Let us calculate the noncentrality parameter
$\delta_{AB} = \frac{n}{2} \left( \frac{d}{\sigma} \right)^2 = \frac{4}{2} \cdot 2.5^2 = 12.5$
noncentrality parameter 12.5
x P(X<=x)
2,15213 0,381248
• The power is: Power = 1- β = 1-0,381248 = 0.618752
Minitab is not able to do this calculation.
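Since Minitab cannot be used for every case, the three powers can be cross-checked numerically. The sketch below is only an illustration: it takes the critical values quoted in the text as given, and it evaluates the noncentral F distribution as a Poisson mixture of incomplete beta functions integrated with Simpson's rule. This is a numerical technique chosen here for self-containedness, not the routine used by Minitab, and the function names are hypothetical.

```python
import math

def reg_inc_beta(x, p, q, steps=2000):
    """Regularized incomplete beta I_x(p, q) by Simpson's rule (assumes p >= 1, q >= 1)."""
    log_norm = math.lgamma(p + q) - math.lgamma(p) - math.lgamma(q)

    def density(t):
        if t <= 0.0:
            return math.exp(log_norm) if p == 1 else 0.0
        return math.exp(log_norm + (p - 1) * math.log(t) + (q - 1) * math.log1p(-t))

    h = x / steps
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3

def ncf_cdf(x, dfn, dfd, nc, terms=150):
    """CDF of the noncentral F: Poisson(nc/2) mixture of incomplete beta functions."""
    y = dfn * x / (dfn * x + dfd)
    half = nc / 2
    cdf = 0.0
    for k in range(terms):
        log_weight = -half + k * math.log(half) - math.lgamma(k + 1)
        cdf += math.exp(log_weight) * reg_inc_beta(y, dfn / 2 + k, dfd / 2)
    return cdf

# effect: (numerator df, critical value quoted in the text, noncentrality parameter)
cases = {"A": (4, 2.57874, 73.5), "B": (2, 3.20432, 8.0), "AB": (8, 2.15213, 12.5)}
df_E = 45

power = {name: 1 - ncf_cdf(f_crit, dfn, df_E, nc)
         for name, (dfn, f_crit, nc) in cases.items()}

for name, value in power.items():
    print(f"Power for {name}: {value:.4f}")
```

If SciPy is available, `scipy.stats.ncf.cdf` could replace the hand-written CDF; the values should agree with those obtained above up to rounding.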
Exercise 10 [July 10th 2013]
Diet affects weight gain. We wish to compare nine diets; these diets are the factor-level combinations
of protein source (beef, pork, and grain) and number of calories (low, medium, and high). There are
eighteen test animals that were randomly assigned to the nine diets, two animals per diet. The mean
responses (weight gain) are:
Weight      Calories
Protein     Low     Medium  High
Beef        76      86,8    101,8
Pork        78,3    89,5    98,2
Grain       78,8    83,5    86,2
a) Analyze the data with α=0.05;

b) If we wish to reduce the weight gain, what would you recommend? Evaluate the Tukey
constant and the critical value, then use Minitab.
Solution
a) Analyze the data with α=0.05
[Figures: Main Effects Plot for Weight (mean versus the levels of Protein and Calories); Interaction Plot for Weight (mean versus Calories, one line per Protein level).]
The factor Calories seems to have a larger influence on the weight gain than the factor Protein. The
interaction between the factors seems relevant, but its significance cannot be tested directly: with a
single value per cell (no replicates), the interaction cannot be separated from the error.
General Linear Model: Weight versus Protein; Calories

Factor Information
Factor    Type   Levels  Values
Protein   Fixed  3       1; 2; 3
Calories  Fixed  3       1; 2; 3

Analysis of Variance
Source    DF  SS      MS      F      P
Protein   2   63,05   31,52   1,36   0,355
Calories  2   469,94  234,97  10,12  0,027
Error     4   92,91   23,23
Total     8   625,90

Model Summary
S        R-sq    R-sq(adj)  R-sq(pred)
4,81958  85,16%  70,31%     24,85%
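The sums of squares in this table can be reproduced from the nine cell means. The following Python sketch is a minimal illustration of the two-way decomposition with one value per cell, where the residual term collects both interaction and error; the variable names are illustrative.

```python
# Cell means: rows = Protein (Beef, Pork, Grain), columns = Calories (Low, Medium, High)
y = [
    [76.0, 86.8, 101.8],
    [78.3, 89.5, 98.2],
    [78.8, 83.5, 86.2],
]
a, b = len(y), len(y[0])

grand_mean = sum(map(sum, y)) / (a * b)
row_means = [sum(row) / b for row in y]
col_means = [sum(y[i][j] for i in range(a)) / a for j in range(b)]

ss_total = sum((v - grand_mean) ** 2 for row in y for v in row)
ss_protein = b * sum((m - grand_mean) ** 2 for m in row_means)
ss_calories = a * sum((m - grand_mean) ** 2 for m in col_means)
ss_error = ss_total - ss_protein - ss_calories  # interaction and error are confounded

df_protein, df_calories = a - 1, b - 1
df_error = df_protein * df_calories
ms_error = ss_error / df_error

f_protein = (ss_protein / df_protein) / ms_error
f_calories = (ss_calories / df_calories) / ms_error

print(f"Protein : SS = {ss_protein:.2f}, F = {f_protein:.2f}")
print(f"Calories: SS = {ss_calories:.2f}, F = {f_calories:.2f}")
print(f"Error   : SS = {ss_error:.2f}, MS = {ms_error:.2f}")
print(f"Total   : SS = {ss_total:.2f}")
```

The P-values are not computed here; they would require the F distribution with (2, 4) degrees of freedom.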
Before drawing the conclusions, let us check the residual assumptions.
[Figures: Scatterplot of SRES1 vs FITS1; Protein; Calories. Normal Probability Plot of SRES1 (Mean 4,687608E-16; StDev 1,061; N 9; AD 0,291; P-Value 0,525).]
We can observe that all the standardized residuals fall within the interval (-3;+3), so no outliers can be
pointed out. The hypothesis of normality cannot be rejected (P-Value 0,525). The hypothesis of homogeneous
variance is accepted by inspecting the scatterplots. In conclusion, only the factor Calories affects the
weight gain.
b) If we wish to reduce the weight gain, what would you recommend?
In order to reduce the weight gain, a multiple comparison is necessary. The only significant factor
is Calories. Its Tukey's constant is
$T_{0.05} = \frac{1}{\sqrt{2}} q_{0.05}(a, df_E) = \frac{1}{\sqrt{2}} q_{0.05}(3, 4) = \frac{5}{\sqrt{2}} = 3.536$
and the critical value is:
$T_\alpha \sqrt{\frac{2 MS_E}{b}} = 3.536 \sqrt{\frac{2 \cdot 23.23}{3}} = 13.92$
Calories N Mean Grouping
3 3 95,4 A
2 3 86,6 A B
1 3 77,7 B
In order to reduce the weight gain, as expected, it is better to choose a diet with a low or medium
amount of calories: these two levels are not statistically different from each other.
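The grouping can be reproduced by comparing each pairwise difference of the Calories means with the critical value. The sketch below takes the Tukey constant 3.536 and MS_E = 23.23 from the text as given; the variable names are illustrative.

```python
import math
from itertools import combinations

# Calories level means from the grouping table
means = {"Low": 77.7, "Medium": 86.6, "High": 95.4}
ms_error = 23.23        # MS_E from the ANOVA table
obs_per_level = 3       # each Calories mean averages three cell values
tukey_constant = 3.536  # T_0.05 as quoted in the text

critical_value = tukey_constant * math.sqrt(2 * ms_error / obs_per_level)

# A pair is declared different when its absolute difference exceeds the critical value
significant = [
    (first, second)
    for first, second in combinations(means, 2)
    if abs(means[first] - means[second]) > critical_value
]

print(f"critical value = {critical_value:.2f}")
print("significantly different pairs:", significant)
```

Only the pair Low-High exceeds the critical value, which is why Medium shares a group with both of the other levels.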
Exercise 11 [February 5th 2014]
Derive, showing all the steps, the expression of the expected mean square of factor B, E(MS_B),
for a two-factor analysis with one observation per cell.
Solution
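$E(MS_B) = \frac{E(SS_B)}{b - 1}$
The model with 2 factors and one observation per cell is: $y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}$, with the usual constraints $\tau_\bullet = \sum_i \tau_i = 0$ and $\beta_\bullet = \sum_j \beta_j = 0$.
$E(SS_B) = E\left[ \frac{1}{an} \sum_{j=1}^{b} y_{\bullet j \bullet}^2 - \frac{1}{abn} y_{\bullet\bullet\bullet}^2 \right] = E\left[ \frac{1}{a} \sum_{j=1}^{b} y_{\bullet j}^2 \right] - E\left[ \frac{1}{ab} y_{\bullet\bullet}^2 \right] \quad \text{for } n = 1$
$E(SS_B) = E\left[ \frac{1}{a} \sum_{j=1}^{b} \left( \sum_{i=1}^{a} (\mu + \tau_i + \beta_j + \varepsilon_{ij}) \right)^2 \right] - E\left[ \frac{1}{ab} \left( \sum_{i=1}^{a} \sum_{j=1}^{b} (\mu + \tau_i + \beta_j + \varepsilon_{ij}) \right)^2 \right]$
$= E\left[ \frac{1}{a} \sum_{j=1}^{b} (a\mu + \tau_\bullet + a\beta_j + \varepsilon_{\bullet j})^2 \right] - E\left[ \frac{1}{ab} (ab\mu + b\tau_\bullet + a\beta_\bullet + \varepsilon_{\bullet\bullet})^2 \right]$
$= E\left[ \frac{1}{a} \sum_{j=1}^{b} (a^2\mu^2 + a^2\beta_j^2 + \varepsilon_{\bullet j}^2 + 2a^2\mu\beta_j + 2a\mu\varepsilon_{\bullet j} + 2a\beta_j\varepsilon_{\bullet j}) \right] - E\left[ \frac{1}{ab} (a^2b^2\mu^2 + \varepsilon_{\bullet\bullet}^2 + 2ab\mu\varepsilon_{\bullet\bullet}) \right]$
The term in $\mu\beta_j$ vanishes because $\sum_j \beta_j = 0$, and the term in $\beta_j\varepsilon_{\bullet j}$ has zero expectation, so
$= ab\mu^2 + a \sum_{j=1}^{b} \beta_j^2 + \frac{1}{a} \sum_{j=1}^{b} E(\varepsilon_{\bullet j}^2) + 2\mu E(\varepsilon_{\bullet\bullet}) - ab\mu^2 - \frac{1}{ab} E(\varepsilon_{\bullet\bullet}^2) - 2\mu E(\varepsilon_{\bullet\bullet})$
$= a \sum_{j=1}^{b} \beta_j^2 + \frac{1}{a} \sum_{j=1}^{b} E(\varepsilon_{\bullet j}^2) - \frac{1}{ab} E(\varepsilon_{\bullet\bullet}^2) = a \sum_{j=1}^{b} \beta_j^2 + \frac{ab}{a}\sigma^2 - \frac{ab}{ab}\sigma^2 = a \sum_{j=1}^{b} \beta_j^2 + (b - 1)\sigma^2$
$E(MS_B) = \frac{E(SS_B)}{b - 1} = \sigma^2 + \frac{a}{b - 1} \sum_{j=1}^{b} \beta_j^2$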
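The result can be checked numerically with a small Monte Carlo simulation. The sketch below is an illustration only: the values a = 4, β = (-1, 0, 1) and σ = 1 are arbitrary choices made for the check, not taken from the exercise, and the function name is hypothetical.

```python
import random

def simulate_ms_b(a, betas, sigma, reps, seed=0):
    """Average MS_B over simulated two-factor layouts with one observation per cell."""
    rng = random.Random(seed)
    b = len(betas)
    total = 0.0
    for _rep in range(reps):
        # mu and tau_i add the same constant to every column total,
        # so they cancel in SS_B and are set to zero here
        col_sums = [sum(beta + rng.gauss(0.0, sigma) for _ in range(a))
                    for beta in betas]
        grand = sum(col_sums)
        ss_b = sum(s * s for s in col_sums) / a - grand * grand / (a * b)
        total += ss_b / (b - 1)
    return total / reps

a = 4
betas = [-1.0, 0.0, 1.0]   # effects of B, summing to zero
sigma = 1.0

expected = sigma ** 2 + a / (len(betas) - 1) * sum(beta ** 2 for beta in betas)
estimate = simulate_ms_b(a, betas, sigma, reps=20000)

print(f"theoretical E(MS_B) = {expected:.3f}, simulated average = {estimate:.3f}")
```

With these settings the formula gives E(MS_B) = 1 + (4/2)·2 = 5, and the simulated average should fall close to this value, within Monte Carlo error.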