ClassWork 03 Multi Way ANOVA - Sol
Exercise 1
Solution
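$$y_{ijk} = \hat{\mu} + \hat{\tau}_i + \hat{\beta}_j + \widehat{\tau\beta}_{ij} + e_{ijk}$$
$$y_{ijk} - \hat{\mu} = \hat{\tau}_i + \hat{\beta}_j + \widehat{\tau\beta}_{ij} + e_{ijk}$$
$$\sum_{ijk} (y_{ijk} - \hat{\mu})^2 = \sum_{ijk} \left( \hat{\tau}_i + \hat{\beta}_j + \widehat{\tau\beta}_{ij} + e_{ijk} \right)^2$$
$$\sum_{ijk} (y_{ijk} - \hat{\mu})^2 = \sum_{ijk} \hat{\tau}_i^2 + \sum_{ijk} \hat{\beta}_j^2 + \sum_{ijk} \widehat{\tau\beta}_{ij}^2 + \sum_{ijk} e_{ijk}^2$$
$$SS_{TOT} = \sum_{ijk} (y_{ijk} - \bar{y})^2$$
$$SS_A = \sum_{ijk} \hat{\tau}_i^2 = bn \sum_i \hat{\tau}_i^2 \qquad SS_B = \sum_{ijk} \hat{\beta}_j^2 = an \sum_j \hat{\beta}_j^2$$
$$SS_{AB} = \sum_{ijk} \widehat{\tau\beta}_{ij}^2 = n \sum_{ij} \widehat{\tau\beta}_{ij}^2 \qquad SS_E = \sum_{ijk} e_{ijk}^2$$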
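Let us prove that the cross-product terms are zero. We do so for just one of them:
$$2 \sum_{ijk} \hat{\tau}_i \, \widehat{\tau\beta}_{ij} = 2n \sum_i \hat{\tau}_i \sum_j \widehat{\tau\beta}_{ij} = 0$$
Remember that one of the constraints used to solve the normal equations is $\sum_j \widehat{\tau\beta}_{ij} = 0$.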
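The decomposition can also be checked numerically. The sketch below uses a small balanced data set whose numbers are hypothetical (they are not part of the exercise) and the usual least-squares estimates under sum-to-zero constraints; it verifies that the four sums of squares add up to the total and that the cross-product term treated above vanishes.

```python
# Balanced two-way layout y[i][j][k]: a levels of A, b levels of B, n replicates.
# The numbers are hypothetical and only illustrate the identity.
y = [
    [[4.0, 6.0], [7.0, 9.0], [5.0, 5.0]],
    [[8.0, 6.0], [12.0, 10.0], [3.0, 5.0]],
]
a, b, n = len(y), len(y[0]), len(y[0][0])


def mean(values):
    values = list(values)
    return sum(values) / len(values)


grand = mean(v for row in y for cell in row for v in cell)
row_mean = [mean(v for cell in y[i] for v in cell) for i in range(a)]
col_mean = [mean(v for i in range(a) for v in y[i][j]) for j in range(b)]
cell_mean = [[mean(y[i][j]) for j in range(b)] for i in range(a)]

# Least-squares estimates of the effects (sum-to-zero constraints).
tau = [row_mean[i] - grand for i in range(a)]
beta = [col_mean[j] - grand for j in range(b)]
tau_beta = [
    [cell_mean[i][j] - row_mean[i] - col_mean[j] + grand for j in range(b)]
    for i in range(a)
]

ss_a = b * n * sum(t ** 2 for t in tau)
ss_b = a * n * sum(bj ** 2 for bj in beta)
ss_ab = n * sum(tb ** 2 for row in tau_beta for tb in row)
ss_e = sum(
    (y[i][j][k] - cell_mean[i][j]) ** 2
    for i in range(a) for j in range(b) for k in range(n)
)
ss_tot = sum((v - grand) ** 2 for row in y for cell in row for v in cell)

# Cross-product term 2 * sum_ijk tau_i * taubeta_ij = 2n * sum_i tau_i * sum_j taubeta_ij.
cross = 2 * n * sum(tau[i] * sum(tau_beta[i]) for i in range(a))

print(ss_tot, ss_a + ss_b + ss_ab + ss_e, cross)
```

The inner sum `sum(tau_beta[i])` is zero for every `i`, which is exactly the constraint recalled above.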
Exercise 2 [M]
The yield of a chemical process is being studied. The two most important variables are thought to be
the pressure and the temperature. Three levels of each factor are selected, and a factorial experiment
with two replicates is performed. The yield data follows:
            Pressure (psi)
T (°C)      200            215            230
150         90.4; 90.2     90.7; 90.6     90.2; 90.4
160         90.1; 90.3     90.5; 90.6     89.9; 90.1
170         90.5; 90.7     90.8; 90.9     90.4; 90.1
a) Analyze the data. Use α=0.05 and comment on the model’s adequacy;
b) Calculate the residual for the data (1,2,1) = 90.7;
c) Under what conditions would you operate this process? Evaluate by hand the three constants
and the critical value, then use Minitab.
Solution
a) Analyze the data. Use α=0.05 and comment on the model’s adequacy
Since the experiment is replicated, the individual value plot is used. The individual value plot
should be used only when replicates are present.
[Figure: individual value plot of Yield by Temperature (150, 160, 170) and Pressure (200, 215, 230)]
Two more graphs are plotted to better understand the influence of each factor and of their interaction
on the response variable. The Minitab commands are:
• Stat → ANOVA → Main Effects Plot
• Stat → ANOVA → Interaction Plot
[Figure: Main Effects Plot for Yield (data means) for Temperature and Pressure; Interaction Plot for Yield (data means), Temperature by Pressure]
The main effects plot displays the response means for each factor level in sorted order. A horizontal
line is drawn at the grand mean. The effects are the differences between the means and the reference
line. From the graph, both factors seem relevant, and the pressure seems to have a
greater influence on the response than the temperature. The interaction between pressure and
temperature, by contrast, seems fairly small, as shown by the similar shape of the three curves.
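$$SS_B = \frac{1}{an} \sum_j y_{\cdot j \cdot}^2 - \frac{y_{\cdots}^2}{abn} = 0.76778$$
$$SS_{AB} = \frac{1}{n} \sum_{ij} y_{ij\cdot}^2 - \frac{y_{\cdots}^2}{abn} - SS_A - SS_B = 0.06889$$
$$SS_{TOT} = \sum_{ijk} y_{ijk}^2 - \frac{y_{\cdots}^2}{abn} = 1.29778$$
$$SS_E = SS_{TOT} - SS_A - SS_B - SS_{AB} = 0.16$$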
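The hand calculation can be reproduced with a few lines of plain Python. This is a minimal sketch, assuming the data layout of the exercise table (factor A = temperature, factor B = pressure, two replicates per cell); the variable names are illustrative only.

```python
# y[i][j] holds the two replicates for temperature level i (150, 160, 170)
# and pressure level j (200, 215, 230), as in the exercise table.
y = [
    [[90.4, 90.2], [90.7, 90.6], [90.2, 90.4]],
    [[90.1, 90.3], [90.5, 90.6], [89.9, 90.1]],
    [[90.5, 90.7], [90.8, 90.9], [90.4, 90.1]],
]
a, b, n = 3, 3, 2

grand_total = sum(v for row in y for cell in row for v in cell)
correction = grand_total ** 2 / (a * b * n)  # y...^2 / (abn)

row_totals = [sum(v for cell in y[i] for v in cell) for i in range(a)]
col_totals = [sum(v for i in range(a) for v in y[i][j]) for j in range(b)]
cell_totals = [[sum(y[i][j]) for j in range(b)] for i in range(a)]

ss_a = sum(t ** 2 for t in row_totals) / (b * n) - correction
ss_b = sum(t ** 2 for t in col_totals) / (a * n) - correction
ss_ab = sum(t ** 2 for row in cell_totals for t in row) / n - correction - ss_a - ss_b
ss_tot = sum(v ** 2 for row in y for cell in row for v in cell) - correction
ss_e = ss_tot - ss_a - ss_b - ss_ab

for name, value in [("SS_A", ss_a), ("SS_B", ss_b), ("SS_AB", ss_ab),
                    ("SS_E", ss_e), ("SS_TOT", ss_tot)]:
    print(f"{name:7s} {value:.5f}")
```

The values agree with the Adj SS column of the Minitab output for Temperature, Pressure, the interaction, the error and the total.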
The analysis in Minitab is done using the command Stat → ANOVA → General Linear Model
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Temperature 2 0,30111 0,15056 8,47 0,009
Pressure 2 0,76778 0,38389 21,59 0,000
Temperature*Pressure 4 0,06889 0,01722 0,97 0,470
Error 9 0,16000 0,01778
Total 17 1,29778
Model Summary
S R-sq R-sq(adj) R-sq(pred)
0,133333 87,67% 76,71% 50,68%
[Figure: scatterplots of SRES1 vs. fitted values, Temperature, and Pressure; normal probability plot of SRES1]
The hypothesis of normality is rejected (the p-value is lower than 0.05). The probability plot clearly
shows an overfitting problem. So, we try to address the rejection of the normality assumption by
reducing the model (i.e., eliminating the non-significant terms) rather than by using a Box-Cox
transformation. In general, if the normality assumption is not verified and there are non-significant
terms in the model, it is recommended to reduce the model before transforming the data.
Consequently, let us focus only on the model with the significant terms (purely additive).
Factor Information
Factor Type Levels Values
Temperature Fixed 3 150; 160; 170
Pressure Fixed 3 200; 215; 230
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Temperature 2 0,30111 0,15056 8,55 0,004
Pressure 2 0,76778 0,38389 21,80 0,000
Error 13 0,22889 0,01761
Lack-of-Fit 4 0,06889 0,01722 0,97 0,470
Pure Error 9 0,16000 0,01778
Total 17 1,29778
Model Summary
S R-sq R-sq(adj) R-sq(pred)
0,132691 82,36% 76,94% 66,19%
The Lack-of-Fit (LOF) test and the pure error will be covered in ClassWork 08 – regression. For
now, you should only remember that the p-value of the LOF test should be > 0.05.
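The reduced-model table can be obtained from the full-model one by pooling the interaction into the error term. The following sketch starts from the sums of squares and degrees of freedom printed in the full-model ANOVA table; the dictionary keys are illustrative names, not Minitab identifiers.

```python
# Sums of squares and degrees of freedom from the full-model ANOVA table.
ss = {"Temperature": 0.30111, "Pressure": 0.76778, "Interaction": 0.06889, "Error": 0.16000}
df = {"Temperature": 2, "Pressure": 2, "Interaction": 4, "Error": 9}
ss_total = 1.29778

# Pool the non-significant interaction into the error term.
ss_error = ss["Interaction"] + ss["Error"]
df_error = df["Interaction"] + df["Error"]
ms_error = ss_error / df_error

# F statistics of the main effects in the reduced (additive) model.
f_values = {
    name: (ss[name] / df[name]) / ms_error
    for name in ("Temperature", "Pressure")
}
r_squared = 1 - ss_error / ss_total

# The former interaction line becomes the lack-of-fit line,
# tested against the pure error of the replicates.
f_lack_of_fit = (ss["Interaction"] / df["Interaction"]) / (ss["Error"] / df["Error"])

print(df_error, round(ms_error, 5), f_values, round(r_squared, 4), round(f_lack_of_fit, 2))
```

This shows why the Lack-of-Fit line of the reduced model carries the same F-value and p-value as the Temperature*Pressure line of the full model.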
Before drawing the conclusion, we check the residual assumptions.
[Figure: Scatterplot of SRES2 vs FITS2; Temperature; Pressure]
[Figure: Probability Plot of SRES2 (Normal). Mean -2,10079E-13; StDev 1,029; N 18; AD 0,189; P-Value 0,888]
[Figure: test for equal variances by Temperature and Pressure. Bartlett’s Test: P-Value 0,990]
All the standardized residuals lie in the interval (-3;+3), and the residuals appear independent of the
predicted response and of the factors. The hypotheses of normality and homogeneous
variance cannot be rejected.
In conclusion, the temperature and the pressure influence the response. The additive model is
significant.
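Using instead the reduced model (this is the correct way to estimate the residual):
$$\hat{y}_{12} = \hat{\mu} + \hat{\tau}_1 + \hat{\beta}_2 = \bar{y}_{\cdots} + (\bar{y}_{1\cdot\cdot} - \bar{y}_{\cdots}) + (\bar{y}_{\cdot 2 \cdot} - \bar{y}_{\cdots}) = \bar{y}_{1\cdot\cdot} + \bar{y}_{\cdot 2 \cdot} - \bar{y}_{\cdots} = \frac{542.5}{6} + \frac{544.1}{6} - \frac{1627.4}{18} = 90.689$$
$$e_{121} = y_{121} - \hat{y}_{12} = 90.7 - 90.689 = 0.011$$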
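A short sketch of the same arithmetic follows. It uses the totals quoted in the formula and, for comparison only, also evaluates the residual under the full model, where the fitted value is the cell mean of the two replicates at 150 °C and 215 psi; the variable names are illustrative.

```python
# Totals quoted in the text: temperature 150 (6 obs.), pressure 215 (6 obs.), all data (18 obs.).
total_temp_150 = 542.5
total_press_215 = 544.1
grand_total = 1627.4
y_121 = 90.7

# Additive (reduced) model: row mean + column mean - grand mean.
fitted_additive = total_temp_150 / 6 + total_press_215 / 6 - grand_total / 18
residual_additive = y_121 - fitted_additive

# Full model with interaction: the fitted value is the cell mean.
cell = [90.7, 90.6]  # replicates at (150 °C, 215 psi)
fitted_full = sum(cell) / len(cell)
residual_full = y_121 - fitted_full

print(round(fitted_additive, 3), round(residual_additive, 3))
print(round(fitted_full, 3), round(residual_full, 3))
```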
The model is additive: we can pick the best level of temperature independently of the pressure
and vice versa. Thus, $\alpha_{FAM} = 0.05 \Rightarrow \alpha = \alpha_{FAM}/2 = 0.025$. Since both factors have the same
number of levels, we can calculate the three constants for only one factor.
$$r_A = \frac{a(a-1)}{2} = \frac{3(3-1)}{2} = 3 \qquad df_E = 13$$
$$B_\alpha = t_{\alpha/(2 r_A)}(df_E) = t_{0.025/6}(13) = 3.107$$
$$T_\alpha = \frac{1}{\sqrt{2}} q_\alpha(a, df_E) = \frac{1}{\sqrt{2}} q_{0.025}(3, 13) = \frac{4.296}{\sqrt{2}} = 3.04$$
$$S_{\alpha_{FAM}} = \sqrt{(a+b-2) F_{\alpha_{FAM}}(a+b-2, df_E)} = \sqrt{4 F_{0.05}(4, 13)} = \sqrt{4 \cdot 3.179} = 3.57$$
The critical value is
$$T_\alpha \sqrt{\frac{2 MS_E}{bn}} = 3.04 \sqrt{\frac{2 \cdot 0.01761}{6}} = 0.2329$$
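The constants and the critical value can be evaluated as in the sketch below. The three quantiles are the tabulated values quoted in the text and are entered by hand rather than recomputed; picking the smallest constant is the selection rule assumed here.

```python
import math

# Tabulated quantiles quoted in the text (entered by hand, not recomputed).
t_quantile = 3.107  # t_{0.025/6}(13)
q_quantile = 4.296  # q_{0.025}(3, 13)
f_quantile = 3.179  # F_{0.05}(4, 13)

a, b, n = 3, 3, 2
ms_error = 0.01761  # MS_E of the additive model, 13 degrees of freedom

constants = {
    "Bonferroni": t_quantile,
    "Tukey": q_quantile / math.sqrt(2),
    "Scheffe": math.sqrt((a + b - 2) * f_quantile),
}

# Assumed selection rule: the smallest constant gives the smallest critical value.
method, constant = min(constants.items(), key=lambda item: item[1])

# Each factor-level mean is an average of b * n = 6 observations.
critical_value = constant * math.sqrt(2 * ms_error / (b * n))

print(method, round(constant, 2), round(critical_value, 3))
```

Two level means of the same factor are declared different when the absolute value of their difference exceeds `critical_value`.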
(For demonstration purposes, we built the Table of Differences by hand.)
Let us build the matrix of the differences:
Temperature    150      170
160            0.167    0.317
150                     0.15
The temperatures 160° and 150° are not statistically different, and the same conclusion is drawn for
the temperatures 150° and 170°.
The same analysis is made for the factor pressure. The constant values are the same because a=b and
the critical value is still 0.2329. The matrix of the differences is:
Pressure    200      215
230         0.183    0.5
200                  0.317
In conclusion, the pressures 230 psi and 200 psi are not different from a statistical point of view.
If we wish to maximize the yield of the chemical process, we select a pressure equal to 215 psi.
For the temperature, instead, the levels 150° and 170° are not statistically different. If we consider that
a higher temperature corresponds to a higher energy cost, we would choose the temperature 150°.
If, however, we have enough money to do more experiments, we could focus our attention on these two
levels of temperature to better understand which one allows for a higher yield.
With Minitab:
Stat → ANOVA → Comparison, Options: confidence level 97.5%
Exercise 3 [M]
An article describes an experiment to investigate the effect of the type of glass and the type of
phosphor on the brightness of a television tube. The response variable is the current necessary (in
microamps) to obtain a specified brightness level. The data are as follows:
            Phosphor
Glass       1       2       3
1           280     300     290
            290     310     285
            285     295     290
2           230     260     220
            235     240     225
            240     235     230
Solution
[Figure: individual value plot of Brightness by Glass (1, 2) and Phosphor (1, 2, 3)]
The individual value plot shows no evident outliers, and the variability is uniform.
[Figure: main effects plot and interaction plot for Glass and Phosphor; y-axis: Mean]
From the Main Effect plot, the type of glass seems to influence the response more than the type of
phosphor. In the Interaction plot, the lines are approximately parallel, indicating a probable lack of
interaction between factors glass and phosphor.
General Linear Model: Brightness versus Glass; Phosphor
Factor Information
Factor Type Levels Values
Glass Fixed 2 1; 2
Phosphor Fixed 3 1; 2; 3
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Glass 1 14450,0 14450,0 273,79 0,000
Phosphor 2 933,3 466,7 8,84 0,004
Glass*Phosphor 2 133,3 66,7 1,26 0,318
Error 12 633,3 52,8
Total 17 16150,0
Model Summary
S R-sq R-sq(adj) R-sq(pred)
7,26483 96,08% 94,44% 91,18%
[Figure: scatterplots of SRES1 vs. fitted values, Glass, and Phosphor; normal probability plot of SRES1]
[Figure: test for equal variances, 95% Bonferroni confidence intervals for StDevs. Bartlett’s Test: P-Value 0,458]
There are no outliers. The hypothesis of normality cannot be rejected, and the same is true for the test of
equal variance. The residual assumptions are verified. In conclusion, the type of glass and the type of
phosphor are significant, whereas their interaction is not.
Grouping Information Using the Tukey Method and 97,5% Confidence
Glass N Mean Grouping
1 9 291,667 A
2 9 235,000 B
Means that do not share a letter are significantly different.
If we wish to decrease the current necessary, we would recommend using glass number 2 and
one of the phosphors among types 1 and 3.
Exercise 4 [M]
Johnson and Leone describe an experiment to investigate warping of copper plates. The two factors
studied were the temperature and the copper content of the plates. The response variable was a
measure of the amount of warping. The data were as follows:
Solution
[Figure: individual value plot of Warping]
The variability appears uniform and no evident outliers are shown in the graph.
[Figure: main effects plot and interaction plot for Temperature (50, 75, 100, 125) and Copper (40, 60, 80, 100); y-axis: Mean]
The factor copper seems to affect the response variable more than the temperature does. The interaction
between the temperature and the copper does not seem particularly relevant.
General Linear Model: Warping versus Temperature; Copper
Factor Information
Factor Type Levels Values
Temperature Fixed 4 50; 75; 100; 125
Copper Fixed 4 40; 60; 80; 100
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Temperature 3 156,1 52,031 7,67 0,002
Copper 3 698,3 232,781 34,33 0,000
Temperature*Copper 9 113,8 12,642 1,86 0,133
Error 16 108,5 6,781
Total 31 1076,7
Model Summary
S R-sq R-sq(adj) R-sq(pred)
2,60408 89,92% 80,48% 59,69%
[Figure: scatterplots of SRES1 vs. fitted values, Temperature, and Copper; normal probability plot of SRES1]
[Figure: test for equal variances by Temperature and Copper. Bartlett’s Test: P-Value 0,984]
No outliers appear in the graph; in fact, all the standardized residuals lie in the interval (-3,+3).
Moreover, the residuals appear independent of the predicted response and of both factors. The
hypotheses of normality and homogeneous variance cannot be rejected.
b) If low warping is desirable, what level of copper content would you specify?
The Minitab command is: Stat → ANOVA → General Linear Model → Comparison
Options: confidence level 95%
Grouping Information Using the Tukey Method and 95% Confidence
Copper N Mean Grouping
100 8 28,250 A
80 8 21,000 B
60 8 18,875 B C
40 8 15,500 C
Means that do not share a letter are significantly different.
If we wish low warping, there is no significant difference between the copper levels 60 and 40. A further
experimental campaign focused on these two levels is desirable.
c) Suppose that temperature cannot be easily controlled in the environment in which the copper
plates are to be used. Does this change your previous answer?
No, it does not. The model is purely additive; this means that we can optimize separately the
temperature and the copper.
Exercise 5 [M]
The quality control department of a fabric finishing plant is studying the effect of several factors on
the dyeing of cotton-synthetic cloth. Three operators, three cycle times, and two temperatures were
selected. Three small specimens of cloth were dyed under each set of conditions. The finished cloth
was compared to a standard, and a numerical score was assigned. The results follow:
            Temperature 300              Temperature 350
            Operator                     Operator
Time        1       2       3            1       2       3
40          23      27      31           24      38      34
            24      28      32           23      36      36
            25      26      29           28      35      39
50          36      34      33           37      34      34
            35      38      34           39      38      36
            36      39      35           35      36      31
60          28      35      26           26      36      28
            24      35      27           29      37      26
            27      34      25           25      34      24
Solution
[Figure: individual value plot of Score by Temperature (300, 350), Time (40, 50, 60) and Operator (1, 2, 3)]
The individual value plot indicates that the variability appears uniform. Some observations, however, could be
outliers: for instance, the observation (300, 50, 2, 1) is somewhat far from the other two replicates, and the
same can be said for the observations (300, 60, 2, 2) and (350, 40, 1, 3).
[Figure: Main Effects Plot for Score (data means) for Temperature, Time and Operator; Interaction Plot for Score (data means) for Temperature, Time and Operator]
The factors time and operator seem to influence the response more than the factor temperature.
An interaction plot with three or more factors shows separate two-way interaction plots for all
two-factor combinations. The factors time and operator seem to interact, given the lack of parallelism of
the lines.
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Temperature 1 50,07 50,074 15,28 0,000
Time 2 436,00 218,000 66,51 0,000
Operator 2 261,33 130,667 39,86 0,000
Temperature*Time 2 78,81 39,407 12,02 0,000
Temperature*Operator 2 11,26 5,630 1,72 0,194
Time*Operator 4 355,67 88,917 27,13 0,000
Temperature*Time*Operator 4 46,19 11,546 3,52 0,016
Error 36 118,00 3,278
Total 53 1357,33
Model Summary
S R-sq R-sq(adj) R-sq(pred)
1,81046 91,31% 87,20% 80,44%
[Figure: scatterplots of SRES1 vs. fitted values, Temperature, Time, and Operator; normal probability plot of SRES1]
[Figure: Test for Equal Variances: SRES1 vs Temperature; Time; Operator, 95% Bonferroni confidence intervals for StDevs. Bartlett’s Test: P-Value 0,867]
From the scatterplot, no outliers appear. The hypotheses of normality and equal variance cannot be
rejected. Thus, the model assumptions are verified. In conclusion, the three factors are significant, as
are the interactions time*operator and time*temperature and the three-way interaction.
If we wish to estimate the model parameters, one of the possible outputs of the GLM command is:
Coefficients
Term Coef SE Coef T-Value P-Value VIF
Constant 31,556 0,246 128,08 0,000
Temperature
300 -0,963 0,246 -3,91 0,000 1,00
Time
40 -1,667 0,348 -4,78 0,000 1,33
50 4,000 0,348 11,48 0,000 1,33
Operator
1 -2,444 0,348 -7,02 0,000 1,33
2 2,889 0,348 8,29 0,000 1,33
Temperature*Time
300 40 -1,704 0,348 -4,89 0,000 1,33
300 50 0,963 0,348 2,76 0,009 1,33
Temperature*Operator
300 1 0,519 0,348 1,49 0,145 1,33
300 2 -0,593 0,348 -1,70 0,098 1,33
Time*Operator
40 1 -2,944 0,493 -5,98 0,000 1,78
40 2 -1,111 0,493 -2,25 0,030 1,78
50 1 3,222 0,493 6,54 0,000 1,78
50 2 -1,944 0,493 -3,95 0,000 1,78
Temperature*Time*Operator
300 40 1 1,648 0,493 3,34 0,002 1,78
300 40 2 -1,407 0,493 -2,86 0,007 1,78
300 50 1 -1,185 0,493 -2,41 0,021 1,78
300 50 2 1,093 0,493 2,22 0,033 1,78
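As an illustration of how these coefficients are read, the sketch below rebuilds the fitted value of the cell (Temperature 300, Time 40, Operator 1) by summing the relevant terms and compares it with the mean of the three observations in that cell. It assumes the sum-to-zero coding that the output suggests, under which the coefficient of an unlisted level is minus the sum of the listed ones.

```python
# Coefficients copied from the GLM output above.
constant = 31.556
temperature_300 = -0.963
time_40 = -1.667
time_50 = 4.000
operator_1 = -2.444
operator_2 = 2.889
temperature_time = -1.704      # 300 x 40
temperature_operator = 0.519   # 300 x 1
time_operator = -2.944         # 40 x 1
three_way = 1.648              # 300 x 40 x 1

fitted = (constant + temperature_300 + time_40 + operator_1
          + temperature_time + temperature_operator + time_operator + three_way)

# Observations of the cell (300, 40, operator 1) in the data table.
observations = [23, 24, 25]
cell_mean = sum(observations) / len(observations)

# Assumed sum-to-zero coding: unlisted levels are minus the sum of the listed ones.
temperature_350 = -temperature_300
time_60 = -(time_40 + time_50)
operator_3 = -(operator_1 + operator_2)

print(round(fitted, 3), cell_mean)
print(round(temperature_350, 3), round(time_60, 3), round(operator_3, 3))
```

In the full factorial model the fitted value of a cell coincides with its cell mean, up to the rounding of the printed coefficients.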
Exercise 6 [M]
$$y_{ijk} = \mu + \tau_i + \beta_j + \gamma_k + (\tau\beta)_{ij} + (\beta\gamma)_{jk} + \varepsilon_{ijk} \qquad \text{with } i = 1, 2, \dots, a; \; j = 1, 2, \dots, b; \; k = 1, 2, \dots, c$$
Notice that there is only one replicate. Assuming all the factors are fixed, write down the analysis of
variance table, including the expected mean squares (use Montgomery’s tables for the
complete model).
Solution
Source    SS     df       MS           E(MS)
A         SSA    (a-1)    SSA/(a-1)    $\sigma^2 + \frac{bc}{a-1} \sum_i \tau_i^2$
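$$MS_E = \frac{SS_{AC} + SS_{ABC}}{df_{AC} + df_{ABC}} = \frac{df_{AC}}{df_{AC} + df_{ABC}} \frac{SS_{AC}}{df_{AC}} + \frac{df_{ABC}}{df_{AC} + df_{ABC}} \frac{SS_{ABC}}{df_{ABC}} = \frac{df_{AC}}{df_{AC} + df_{ABC}} MS_{AC} + \frac{df_{ABC}}{df_{AC} + df_{ABC}} MS_{ABC}$$
$$E(MS_E) = \frac{df_{AC}}{df_{AC} + df_{ABC}} E(MS_{AC}) + \frac{df_{ABC}}{df_{AC} + df_{ABC}} E(MS_{ABC})$$
$$= \frac{df_{AC}}{df_{AC} + df_{ABC}} \left( \sigma^2 + \frac{b}{(a-1)(c-1)} \sum_{ik} (\tau\gamma)_{ik}^2 \right) + \frac{df_{ABC}}{df_{AC} + df_{ABC}} \left( \sigma^2 + \frac{1}{(a-1)(b-1)(c-1)} \sum_{ijk} (\tau\beta\gamma)_{ijk}^2 \right)$$
$$= \sigma^2 + \frac{df_{AC}}{df_{AC} + df_{ABC}} \frac{b}{(a-1)(c-1)} \sum_{ik} (\tau\gamma)_{ik}^2 + \frac{df_{ABC}}{df_{AC} + df_{ABC}} \frac{1}{(a-1)(b-1)(c-1)} \sum_{ijk} (\tau\beta\gamma)_{ijk}^2$$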
Exercise 7
An experiment was conducted to measure the amount of the product y. Two factors were
studied (A and B). The levels were coded to the interval (-1, 1): the levels of A were -1 and +1, and
the levels of B were -1, 0 and +1. The data were as follows:
         B
A        -1                 0                  1
-1       4.7      3.55      8.08     9.47      17.29    15.92
         4.01     2.38      9.12     9.51      15.75    16.86
1        20.30    21.51     21.47    18.79     21.61    16.28
         19.89    21.04     21.29    18.39     18.08    18.03
Solution
[Figure: individual value plot of Productivity by A (-1, 1) and B (-1, 0, 1)]
Attention should be paid to the data in cells (1,0) and (1,1): they contain potential outliers.
[Figure: main effects plot and interaction plot for A and B; y-axis: Mean]
Factor A seems to have a greater influence on the response than factor B. The interaction
seems to be fairly relevant.
A Fixed 2 -1; 1
B Fixed 3 -1; 0; 1
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
A 1 600,40 600,400 356,70 0,000
B 2 113,08 56,542 33,59 0,000
A*B 2 227,03 113,516 67,44 0,000
Error 18 30,30 1,683
Total 23 970,81
Model Summary
S R-sq R-sq(adj) R-sq(pred)
1,29739 96,88% 96,01% 94,45%
[Figure: scatterplots of SRES1 vs. fitted values, A, and B; normal probability plot of SRES1]
[Figure: test for equal variances, 95% Bonferroni confidence intervals for StDevs. Bartlett’s Test: P-Value 0,218]
From the graphs, we can observe that the residual of the observation (1,1) behaves differently from
the others, but its value is lower than 3, so it cannot be classified as an outlier. The
hypothesis of normality and the hypothesis of equal variance cannot be rejected.
In conclusion, both factors are significant as well as their interaction.
$$r = \frac{ab(ab-1)}{2} = \frac{6(6-1)}{2} = 15 \qquad df_E = 18 \qquad a = 2 \qquad b = 3$$
Bonferroni: $$B_{0.05} = t_{0.05/30}(18) = 3.38$$
Tukey: $$T_{0.05} = \frac{1}{\sqrt{2}} q_{0.05}(ab, df_E) = \frac{1}{\sqrt{2}} q_{0.05}(6, 18) = \frac{4.49}{\sqrt{2}} = 3.18$$
Scheffé: $$S_{0.05} = \sqrt{(ab-1) F_{0.05}(ab-1, df_E)} = \sqrt{5 F_{0.05}(5, 18)} = \sqrt{13.864} = 3.72$$
The Tukey constant has the lowest value.
The critical value is:
$$T_{0.05} \sqrt{\frac{2 MS_E}{n}} = 3.18 \sqrt{\frac{2 \cdot 1.68}{4}} = 2.915$$
The mean of each cell is:
A B yij
-1 -1 3.66
-1 0 9.045
-1 1 16.455
1 1 18.500
1 0 19.985
1 -1 20.685
In the table, the significant differences are highlighted. The table of the differences is depicted in the
next graph. In conclusion, the combinations (1,-1), (1,1) and (1,0) are not statistically different.
To maximize the response, we can choose between (A,B) = (1,-1), (1,0) or (1,1). These
conditions are all equivalent.
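The pairwise comparison of the six cell means can be sketched as follows. The studentized-range quantile is the tabulated value quoted in the text and is entered by hand; the critical value is obtained by evaluating the formula given above, with MS_E taken from the ANOVA table.

```python
import math
from itertools import combinations

# Cell means from the table above, keyed by (A, B).
cell_means = {
    (-1, -1): 3.66, (-1, 0): 9.045, (-1, 1): 16.455,
    (1, 1): 18.500, (1, 0): 19.985, (1, -1): 20.685,
}

q_quantile = 4.49  # q_{0.05}(6, 18), tabulated value quoted in the text
ms_error = 1.683   # MS_E from the ANOVA table
n = 4              # replicates per cell

tukey_constant = q_quantile / math.sqrt(2)
critical_value = tukey_constant * math.sqrt(2 * ms_error / n)

# Pairs of cells whose means differ by no more than the critical value.
not_different = [
    (first, second)
    for first, second in combinations(cell_means, 2)
    if abs(cell_means[first] - cell_means[second]) <= critical_value
]

print(round(critical_value, 3))
for first, second in not_different:
    print(first, second, round(abs(cell_means[first] - cell_means[second]), 3))
```

All three pairs formed by the cells with A = 1 appear in the list, so those three conditions cannot be separated at this level.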
Exercise 8
An experiment, presented in the paper “A Systematic Approach to the Analysis of Means” (E. G.
Schilling, Journal of Quality Technology, 1973), investigated the washing power of a solution as
measured by the reflectance of pieces of cotton cloth after washing. Pieces of cloth were soiled with
colloidal graphite and liquid paraffin and then washed for 20 minutes at 60° followed by two rinses
at 40° and 30°, respectively. The three factors in the washing solution of interest were:
• “sodium carbonate” (Factor A, levels 0%, 0.05%, and 0.1%);
• “detergent” (Factor B, levels 0.05%, 0.1%, and 0.2%);
• “sodium carboxymethyl cellulose” (Factor C, levels 0%, 0.025%, 0.05%).
One observation was taken per treatment combination, and the responses are shown in the next table:
                          Cellulose
Carbonate    Detergent    1       2       3
1            1            10,6    14,9    18,2
1            2            19,8    24,3    23,2
1            3            27      31,5    34
2            1            19,7    25,5    25,9
2            2            32,9    36,4    38,9
2            3            36,1    39      40,6
3            1            22,3    29,4    29,7
3            2            32      41      41,6
3            3            32,1    41,5    38,7
Solution
[Figure: main effects plot and interaction plot for Carbonate, Detergent and Cellulose; y-axis: Mean]
The detergent and the carbonate seem relevant, and their effects are large compared to the effect of
the cellulose. The carbonate and detergent show nonparallel lines, indicating a probable interaction.
The interaction lines of detergent and cellulose, instead, are parallel, so we do not suspect any interaction.
The same appears to hold for the interaction between carbonate and cellulose.
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Carbonate 2 723,41 361,707 207,34 0,000
Detergent 2 933,03 466,516 267,42 0,000
Cellulose 2 224,19 112,096 64,26 0,000
Carbonate*Detergent 4 73,37 18,342 10,51 0,003
Carbonate*Cellulose 4 18,23 4,557 2,61 0,115
Detergent*Cellulose 4 1,01 0,253 0,14 0,960
Error 8 13,96 1,745
Total 26 1987,20
Model Summary
S R-sq R-sq(adj) R-sq(pred)
1,32081 99,30% 97,72% 92,00%
[Figure: scatterplots of SRES1 vs. fitted values and factors; normal probability plot of SRES1]
The hypothesis of normality is not verified. Looking at the ANOVA table, we can observe that the
interactions Carbonate*Cellulose and Detergent*Cellulose are not relevant. Thus, we delete them
from the model.
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Carbonate 2 723,41 361,707 174,34 0,000
Detergent 2 933,03 466,516 224,86 0,000
Cellulose 2 224,19 112,096 54,03 0,000
Carbonate*Detergent 4 73,37 18,342 8,84 0,001
Error 16 33,19 2,075
Total 26 1987,20
Model Summary
S R-sq R-sq(adj) R-sq(pred)
1,44037 98,33% 97,29% 95,24%
[Figure: scatterplots of SRES2 vs. fitted values and factors; normal probability plot of SRES2]
The normality hypothesis cannot be rejected. Looking at the scatterplot, no outliers appear and the
variance appears homogeneous among the factor levels. The assumptions are verified. In conclusion,
the three factors influence the response as well as the interaction Carbonate*Detergent.
b) If you wish to increase the quality of the washing power of a solution, under what conditions
would you operate this process? Use Tukey with α=5% and calculate manually the Tukey’s
constant.
To choose the level of each factor that increases the quality of the washing power, a multiple
comparison has to be done. The factor cellulose is independent of the other factors, so its best level
can be chosen independently. The effect of the factor Carbonate, instead, is not independent of the
effect of the detergent; in fact, their interaction is significant. We have two families
(Carbonate*Detergent and Cellulose), thus: $\alpha_{FAM} = 0.05 \Rightarrow \alpha = \alpha_{FAM}/2 = 0.025$
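The Tukey’s constants are:
Tukey (AB): $$T_{0.025} = \frac{1}{\sqrt{2}} q_{0.025}(ab, df_E) = \frac{1}{\sqrt{2}} q_{0.025}(9, 16) = \frac{5.550}{\sqrt{2}} = 3.92$$
Tukey (C): $$T_{0.025} = \frac{1}{\sqrt{2}} q_{0.025}(c, df_E) = \frac{1}{\sqrt{2}} q_{0.025}(3, 16) = \frac{4.148}{\sqrt{2}} = 2.93$$
The critical values are:
AB: $$T_\alpha \sqrt{\frac{2 MS_E}{cn}} = 3.92 \sqrt{\frac{2 \cdot 2.07}{3}} = 4.605$$
C: $$T_\alpha \sqrt{\frac{2 MS_E}{abn}} = 2.93 \sqrt{\frac{2 \cdot 2.07}{9}} = 1.987$$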
Grouping Information Using the Tukey Method and 97,5% Confidence
Cellulose N Mean Grouping
3 9 32,3111 A
2 9 31,5000 A
1 9 25,8333 B
Means that do not share a letter are significantly different.
Grouping Information Using the Tukey Method and 97,5% Confidence
Carbonate*Detergent N Mean Grouping
2 3 3 38,5667 A
3 2 3 38,2000 A
3 3 3 37,4333 A
2 2 3 36,0667 A
1 3 3 30,8333 B
3 1 3 27,1333 B C
2 1 3 23,7000 C D
1 2 3 22,4333 D
1 1 3 14,5667 E
Means that do not share a letter are significantly different.
If we wish to maximize the quality of the washing, concerning the Carbonate and Detergent, there is
no difference among the conditions (2,3), (3,2), (3,3) and (2,2). Concerning the cellulose, instead, the
levels 2 and 3 are not statistically different.
Exercise 9
We want to determine whether different hardening methods (A) and processing times (B) affect
external hardness. Five different hardening methods and three different processing times are used.
Suppose that we utilize some specimens with a rectangular section and replicate the experiment four
times.
We measure the specimen hardness in the center of the largest face.
Let us calculate the power using the direct method, if we are interested in:
a) a ratio d/σ=3.5 concerning the factor A. Verify the results with Minitab;
b) a difference greater or equal to 4 among the levels of B (σ2=20) ;
c) a difference greater than 2.5σ concerning the interaction AB.
Solution
Maximum                   Total
Difference      Reps      Runs      Power
15,6525         4         60        1,00000

[Figure: power curve, Power vs. Maximum Difference. Assumptions: α 0,05; StDev 4,47214; # Factors 2; # Levels 5; 3. Terms Included In Model: Blocks No; Term Order 2]
b) Let us calculate the power using the direct method, if we are interested in a difference greater
or equal to 4 among the levels of B;
We cannot verify this result with Minitab because: “Minitab performs the calculation based on the
main effect with the largest number of levels to provide conservative results”.
Let us try to calculate the new d and to use Minitab. The output is:
The power value given by Minitab is lower because it refers to the most critical factor (i.e., the one with the
maximum number of levels).
c) Let us calculate the power using the direct method, if we are interested in a difference greater
than 2.5σ concerning the interaction AB.
Exercise 10 [July 10th 2013]
Diet affects weight gain. We wish to compare nine diets; these diets are the factor-level combinations
of protein source (beef, pork, and grain) and number of calories (low, medium, and high). There are
eighteen test animals that were randomly assigned to the nine diets, two animals per diet. The mean
responses (weight gain) are:
Weight Calories
Protein Low Medium High
Beef 76 86,8 101,8
Pork 78,3 89,5 98,2
Grain 78,8 83,5 86,2
Solution
[Figure: main effects plot and interaction plot for Protein and Calories; y-axis: Mean]
The factor Calories seems to have a large influence on the weight compared to the factor protein. The
interaction between the factors seems relevant but we cannot verify directly its significance because
of the lack of replicates.
Analysis of Variance
Source DF Adj SS Adj MS F-Value P-Value
Protein 2 63,05 31,52 1,36 0,355
Calories 2 469,94 234,97 10,12 0,027
Error 4 92,91 23,23
Total 8 625,90
Model Summary
S R-sq R-sq(adj) R-sq(pred)
4,81958 85,16% 70,31% 24,85%
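The ANOVA table above can be reproduced from the nine cell means treated as single observations, so that the interaction cannot be separated from the error. A minimal sketch in plain Python, with illustrative variable names:

```python
# Mean weight gains: rows = protein (beef, pork, grain),
# columns = calories (low, medium, high), one value per cell.
y = [
    [76.0, 86.8, 101.8],
    [78.3, 89.5, 98.2],
    [78.8, 83.5, 86.2],
]
a, b = len(y), len(y[0])

grand_total = sum(sum(row) for row in y)
correction = grand_total ** 2 / (a * b)

ss_protein = sum(sum(row) ** 2 for row in y) / b - correction
ss_calories = sum(sum(y[i][j] for i in range(a)) ** 2 for j in range(b)) / a - correction
ss_total = sum(v ** 2 for row in y for v in row) - correction

# With one observation per cell, the residual term has (a-1)(b-1) degrees of freedom.
ss_error = ss_total - ss_protein - ss_calories
df_error = (a - 1) * (b - 1)
ms_error = ss_error / df_error

f_protein = (ss_protein / (a - 1)) / ms_error
f_calories = (ss_calories / (b - 1)) / ms_error

print(round(ss_protein, 2), round(ss_calories, 2), round(ss_error, 2), round(ss_total, 2))
print(round(f_protein, 2), round(f_calories, 2))
```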
Before drawing the conclusions, let us check the residual assumptions.
[Figure: Scatterplot of SRES1 vs FITS1; Protein; Calories]
[Figure: Probability Plot of SRES1 (Normal). Mean 4,687608E-16; StDev 1,061; N 9; AD 0,291; P-Value 0,525]
We can observe that all the standardized residuals lie in the interval (-3;+3); no outliers can be
pointed out. The hypothesis of normality cannot be rejected. The hypothesis of homogeneous
variance, instead, is assumed by looking at the scatterplot. In conclusion, only the factor Calories affects the
weight.
To reduce the weight gain, a multiple comparison is necessary. The only significant factor
is Calories. Its Tukey’s constant is
$$T_{0.05} = \frac{1}{\sqrt{2}} q_{0.05}(a, df_E) = \frac{1}{\sqrt{2}} q_{0.05}(3, 4) = \frac{5}{\sqrt{2}} = 3.536$$
and the critical value is:
$$T_\alpha \sqrt{\frac{2 MS_E}{b}} = 3.536 \sqrt{\frac{2 \cdot 23.23}{3}} = 13.92$$
Grouping Information Using the Tukey Method and 95% Confidence
Calories N Mean Grouping
3 3 95,4 A
2 3 86,6 A B
1 3 77,7 B
Means that do not share a letter are significantly different.
In order to reduce the weight gain, as expected, it is better to eat foods with a medium or low
amount of calories.
Exercise 11 [February 5th 2014]
Derive, showing all the steps, the expression of the expected mean square of the factor B, $E(MS_B)$,
for a two-factor analysis with one observation per cell.
Solution
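$$E(MS_B) = \frac{E(SS_B)}{b-1}$$
The model with 2 factors and one observation per cell is: $y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}$
$$E(SS_B) = E\left[ \frac{1}{an} \sum_{j=1}^{b} y_{\cdot j \cdot}^2 \right] - E\left[ \frac{1}{abn} y_{\cdots}^2 \right] = \frac{1}{a} E\left[ \sum_{j=1}^{b} y_{\cdot j}^2 \right] - \frac{1}{ab} E\left[ y_{\cdot\cdot}^2 \right] \qquad \text{for } n = 1$$
$$E(SS_B) = \frac{1}{a} E\left[ \sum_{j=1}^{b} \left( \sum_{i=1}^{a} (\mu + \tau_i + \beta_j + \varepsilon_{ij}) \right)^2 \right] - \frac{1}{ab} E\left[ \left( \sum_{i=1}^{a} \sum_{j=1}^{b} (\mu + \tau_i + \beta_j + \varepsilon_{ij}) \right)^2 \right]$$
$$= \frac{1}{a} E\left[ \sum_{j=1}^{b} (a\mu + \tau_{\cdot} + a\beta_j + \varepsilon_{\cdot j})^2 \right] - \frac{1}{ab} E\left[ (ab\mu + b\tau_{\cdot} + a\beta_{\cdot} + \varepsilon_{\cdot\cdot})^2 \right]$$
Since $\tau_{\cdot} = 0$ and $\beta_{\cdot} = 0$:
$$= \frac{1}{a} E\left[ \sum_{j=1}^{b} \left( a^2\mu^2 + a^2\beta_j^2 + \varepsilon_{\cdot j}^2 + 2a^2\mu\beta_j + 2a\mu\varepsilon_{\cdot j} + 2a\beta_j\varepsilon_{\cdot j} \right) \right] - \frac{1}{ab} E\left[ a^2b^2\mu^2 + \varepsilon_{\cdot\cdot}^2 + 2ab\mu\varepsilon_{\cdot\cdot} \right]$$
$$= ab\mu^2 + a \sum_{j=1}^{b} \beta_j^2 + \frac{1}{a} \sum_{j=1}^{b} E(\varepsilon_{\cdot j}^2) + 2\mu E(\varepsilon_{\cdot\cdot}) - ab\mu^2 - \frac{1}{ab} E(\varepsilon_{\cdot\cdot}^2) - 2\mu E(\varepsilon_{\cdot\cdot})$$
$$= a \sum_{j=1}^{b} \beta_j^2 + \frac{1}{a} \sum_{j=1}^{b} E(\varepsilon_{\cdot j}^2) - \frac{1}{ab} E(\varepsilon_{\cdot\cdot}^2) = a \sum_{j=1}^{b} \beta_j^2 + \frac{ab}{a}\sigma^2 - \frac{ab}{ab}\sigma^2 = a \sum_{j=1}^{b} \beta_j^2 + (b-1)\sigma^2$$
$$E(MS_B) = \frac{E(SS_B)}{b-1} = \sigma^2 + \frac{a}{b-1} \sum_{j=1}^{b} \beta_j^2$$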
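The result can be checked by simulation. The sketch below is a Monte Carlo check under assumed, purely hypothetical parameter values (mu, sigma, the tau and beta effects, and the number of replications are not part of the exercise); it compares the average simulated MS_B with the derived expression.

```python
import random

random.seed(0)

# Hypothetical parameter values, chosen only for the check.
mu, sigma = 10.0, 1.0
tau = [-1.0, 0.0, 0.5, 0.5]   # effects of A, summing to zero
beta = [-2.0, 0.5, 1.5]       # effects of B, summing to zero
a, b = len(tau), len(beta)


def simulate_ms_b():
    """Generate one a x b table with one observation per cell and return MS_B."""
    y = [
        [mu + tau[i] + beta[j] + random.gauss(0.0, sigma) for j in range(b)]
        for i in range(a)
    ]
    col_totals = [sum(y[i][j] for i in range(a)) for j in range(b)]
    grand_total = sum(col_totals)
    ss_b = sum(t ** 2 for t in col_totals) / a - grand_total ** 2 / (a * b)
    return ss_b / (b - 1)


replications = 20000
average_ms_b = sum(simulate_ms_b() for _ in range(replications)) / replications

# Derived expression: sigma^2 + a / (b - 1) * sum_j beta_j^2
expected_ms_b = sigma ** 2 + a * sum(bj ** 2 for bj in beta) / (b - 1)

print(round(average_ms_b, 2), expected_ms_b)
```

With these values the derived expression equals 14, and the simulated average should fall close to it, with the gap shrinking as the number of replications grows.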