ESGC6110 - Lecture 5 Multifactor Designs (Chapter 8, PG 221)

ESGC6110 – Lecture 5

Multifactor Designs (Chapter 8, pg 221)

5.1 Introduction

When more than 2 factors are under study, the number of possible treatment
combinations grows exponentially. E.g. with 3 factors at 5 levels each, there are
$5^3 = 125$ possible combinations. It is rare to carry out an experiment with 125
different treatment combinations, because the management effort and the
money required would be great.

The cost involved in running the experiment is high. It is roughly
proportional to the number of data values obtained, although sometimes
replicates of a given treatment combination are relatively cheap.

Managing the collection of data is also an issue. E.g. when gathering sales data for a
supermarket, the assistant may forget to count how many of a particular item were
sold that day!

There are several ways to manage the size of experiments involving several
factors. To design both effectively and efficiently an experiment that has many
factors, we must make informed compromises.

5.2 Latin-Square Designs

3 factors at 3 levels give $3^3 = 27$ possible treatment combinations of factor levels:


Table 5.1: Three factors, three levels
B1 B2 B3
A1 C1 C1 C1
A2 C1 C1 C1
A3 C1 C1 C1
A1 C2 C2 C2
A2 C2 C2 C2
A3 C2 C2 C2
A1 C3 C3 C3
A2 C3 C3 C3
A3 C3 C3 C3

For any of the 27 cells, the row designates the level of A, the column designates
the level of B, and the subscript inside the cell designates the level of C. For
example, the cell in the fourth row and third column has A at level 1, B at level 3 and
C at level 2.
A is the row factor, B the column factor and C the "inside" factor. There is no advantage
or disadvantage to being a row, column or inside factor.

Suppose we run only 9 of these combinations, those shown in Table 5.2.


Table 5.2 Nine of 27 Possibilities
B1 B2 B3
A1 C1 C2 C3
A2 C2 C3 C1
A3 C3 C1 C2

Table 5.2 is a Latin-square design. It is balanced: each factor is at each level the
same number of times (three), and each level of a factor is used in combination with
each level of every other factor the same number of times (once).

The balance in the set of nine treatment combinations in Table 5.2 guarantees the
unbiasedness of the main effects of the factors.
- When we look at differences in the row means (i.e. the mean for each
level of A), row one includes exactly one data value at each level of B,
and exactly one data value at each level of C. The same holds for rows 2
and 3. Hence each row mean is on equal footing with respect to the levels of
factors B and C.
- The differences among the row means can legitimately be examined in
the traditional F-test, ANOVA way.

In Table 5.3, by contrast, all data values at A1 have factor C at C1, all data values
at A2 have factor C at C2, and so on, so row differences cannot be attributed solely to
the impact of factor A. Table 5.3 is a poor design even though it does provide an
unbiased evaluation of the effect of factor B.

Table 5.3 A Poor Choice of Nine


B1 B2 B3
A1 C1 C1 C1
A2 C2 C2 C2
A3 C3 C3 C3

In Table 5.3, differences in the row means can be due to the level of factor A or to
the level of factor C; the two effects cannot be separated. When the impact of
one factor cannot be separated from that of a second, the effects are said to be confounded.
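
The contrast between Table 5.2 and Table 5.3 can be checked mechanically. The sketch below is only an illustration (the function names `pair_counts` and `is_balanced` are made up for this example): it counts how often each pair of levels occurs together for every pair of factors, so a balanced design shows every pair exactly once and a confounded pair of factors shows up as repeated pairs.

```python
from collections import Counter
from itertools import combinations

# Each design is a list of (A, B, C) level triples, one per treatment combination run.
# Table 5.2: the Latin square.
latin = [
    (1, 1, 1), (1, 2, 2), (1, 3, 3),
    (2, 1, 2), (2, 2, 3), (2, 3, 1),
    (3, 1, 3), (3, 2, 1), (3, 3, 2),
]
# Table 5.3: the poor choice, where the level of C always equals the level of A.
poor = [(a, b, a) for a in (1, 2, 3) for b in (1, 2, 3)]

FACTORS = "ABC"

def pair_counts(design):
    """For each pair of factors, count how often each pair of levels occurs together."""
    counts = {}
    for f, g in combinations(range(3), 2):
        counts[FACTORS[f] + FACTORS[g]] = Counter((run[f], run[g]) for run in design)
    return counts

def is_balanced(design, m=3):
    """True if, for every pair of factors, every pair of levels occurs exactly once."""
    return all(
        len(c) == m * m and set(c.values()) == {1}
        for c in pair_counts(design).values()
    )

print(is_balanced(latin))   # True
print(is_balanced(poor))    # False
print(sorted(pair_counts(poor)["AC"].items()))
```

For the poor design, the last line shows that only the pairs (A1, C1), (A2, C2) and (A3, C3) ever occur, three times each, which is exactly the confounding of A with C described above.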

For each level of each factor to be on equal footing in a design such as Table 5.2, we
must be willing to assume that there is no interaction among the factors.
If there is interaction among the factors, the estimated main effects
may not be valid. In Table 5.2, row A1 includes a data value at levels (B1, C1),
but rows A2 and A3 do not. If the (B1, C1) combination greatly
increases the yield beyond the average effect of B1 alone plus the average effect
of C1 alone, then the mean of row one may be much higher than the other row
means solely because it includes the (B1, C1) combination, and not at all because
the level of A is A1.
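
A small noise-free illustration of this point follows. The numbers are hypothetical (a base yield of 100 and a bonus of 9 when B1 and C1 occur together) and are chosen only to make the bias visible: factor A has no effect at all, yet the row means of the Table 5.2 design differ.

```python
# Latin square of Table 5.2 as (A, B, C) level triples.
latin = [
    (1, 1, 1), (1, 2, 2), (1, 3, 3),
    (2, 1, 2), (2, 2, 3), (2, 3, 1),
    (3, 1, 3), (3, 2, 1), (3, 3, 2),
]

BASE_YIELD = 100.0  # hypothetical overall mean
BC_BONUS = 9.0      # hypothetical extra yield when B1 and C1 occur together

def true_yield(a, b, c):
    """Noise-free yield: no main effects at all, only a (B1, C1) interaction."""
    return BASE_YIELD + (BC_BONUS if (b, c) == (1, 1) else 0.0)

def row_means(design):
    """Mean yield at each level of A over the runs in the design."""
    means = {}
    for level in (1, 2, 3):
        ys = [true_yield(a, b, c) for a, b, c in design if a == level]
        means[level] = sum(ys) / len(ys)
    return means

print(row_means(latin))  # {1: 103.0, 2: 100.0, 3: 100.0}
```

Row A1 looks better than rows A2 and A3 only because it happens to contain the (B1, C1) cell, so an analysis that assumes no interaction would wrongly credit factor A.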

However, the Latin-square design for three factors at three levels each uses only
9 combinations, the same number needed for just 2 factors at three levels each.
It is a 1/3 replicate (only 9 of the 27 combinations are run), which cuts the cost
of running the experiment.

Latin-square designs require that all three factors have the same number of levels,
and they assume that there is no interaction among the factors. A Latin-square
design may be replicated or unreplicated.

Example: an unreplicated four-level Latin-square design (Table 5.4).

Dependent variable = the number of new car sales over a specified period
Independent variables:
A = service policy (e.g. all scheduled services up to 30,000 km are free)
B = hours open for business (e.g. open all day Sundays)
C = ancillary amenities (e.g. free car wash anytime, even if the car is not
brought in for service that day)

Table 5.4: Number of new vehicles sold (each cell: level of C, then sales)

      B1        B2        B3        B4
A1    C4 855    C3 877    C2 890    C1 997
A2    C1 962    C2 817    C3 845    C4 776
A3    C3 848    C4 841    C1 784    C2 776
A4    C2 831    C1 952    C4 806    C3 871

Table 5.4 depicts only 16 of the 64 possible combinations. This unreplicated
design (each of the 16 combinations is run only once) is a quarter replicate; it
uses only a fourth of the 64 possible combinations.

The Latin-Square Model and ANOVA

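$Y_{ijk} = \mu + \rho_i + \tau_j + \gamma_k + \varepsilon_{ijk}$

i, j and k take on values 1, 2, 3, ..., m; i.e. we have 3 factors, each at m levels.

$Y_{ijk} = \bar{Y}_{...} + (\bar{Y}_{i..} - \bar{Y}_{...}) + (\bar{Y}_{.j.} - \bar{Y}_{...}) + (\bar{Y}_{..k} - \bar{Y}_{...}) + R$

R is a catchall term (the "remainder" that makes the equation an equality).

$(Y_{ijk} - \bar{Y}_{...}) = (\bar{Y}_{i..} - \bar{Y}_{...}) + (\bar{Y}_{.j.} - \bar{Y}_{...}) + (\bar{Y}_{..k} - \bar{Y}_{...}) + R$

$(Y_{ijk} - \bar{Y}_{...})$ is the total variability among yields
$(\bar{Y}_{i..} - \bar{Y}_{...})$ is the variability among yields associated with $\rho$, the row factor
$(\bar{Y}_{.j.} - \bar{Y}_{...})$ is the variability among yields associated with $\tau$, the column factor
$(\bar{Y}_{..k} - \bar{Y}_{...})$ is the variability among yields associated with $\gamma$, the inside factor, and

$R = Y_{ijk} - \bar{Y}_{i..} - \bar{Y}_{.j.} - \bar{Y}_{..k} + 2\bar{Y}_{...}$
$R = (Y_{ijk} - \bar{Y}_{...}) - [(\bar{Y}_{i..} - \bar{Y}_{...}) + (\bar{Y}_{.j.} - \bar{Y}_{...}) + (\bar{Y}_{..k} - \bar{Y}_{...})]$

TSS = SSB_r (row) + SSB_c (column) + SSB_inside-factor + SSW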

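ANOVA Table for m-level Latin Square

Source of        Sum of Squares                                        df            Expected value
Variability                                                                          of MSQ
Rows             $m \sum_i (\bar{Y}_{i..} - \bar{Y}_{...})^2$          m-1           $\sigma^2 + V_{rows}$
Columns          $m \sum_j (\bar{Y}_{.j.} - \bar{Y}_{...})^2$          m-1           $\sigma^2 + V_{columns}$
Inside factor    $m \sum_k (\bar{Y}_{..k} - \bar{Y}_{...})^2$          m-1           $\sigma^2 + V_{inside factor}$
Error            By subtraction                                        (m-1)(m-2)    $\sigma^2$
Total            $\sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{...})^2$    $m^2-1$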

ANOVA Table for Car Dealership Example

Source of         Sum of      df    Mean      Fcalc    p Value
Variability       Squares           Square
Service policy    17566.5     3     5855.5    2.173    0.192
Hours open         4678.5     3     1559.5    0.579    0.650
Amenities         26722.5     3     8907.5    3.306    0.099
Error             16164.5     6     2694.1
Total             65132.0    15

None of the 3 factors is significant at α = 0.05, so the data do not provide
evidence that service policy, hours open, or amenity level affects sales. However,
amenities, with a p-value of 0.099, would be significant at α = 0.10.
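
The sums of squares for the car dealership example can be recomputed directly from the data in Table 5.4. The following is a minimal sketch in plain Python, not the SPSS procedure referred to in the text; the function name `latin_square_anova` and the data layout are choices made for this illustration. It follows the decomposition above: each factor's sum of squares is m times the sum of squared deviations of its level means from the grand mean, and the error sum of squares is obtained by subtraction.

```python
# Table 5.4 as rows A1..A4; each cell is (level of C, number of vehicles sold),
# listed in column order B1..B4.
table = [
    [(4, 855), (3, 877), (2, 890), (1, 997)],
    [(1, 962), (2, 817), (3, 845), (4, 776)],
    [(3, 848), (4, 841), (1, 784), (2, 776)],
    [(2, 831), (1, 952), (4, 806), (3, 871)],
]

def latin_square_anova(table):
    """Sums of squares, df, mean squares and F ratios for an unreplicated Latin square."""
    m = len(table)
    # Flatten to (row level, column level, inside level, yield) records.
    runs = [(i, j, c, y) for i, row in enumerate(table)
            for j, (c, y) in enumerate(row)]
    grand = sum(y for *_, y in runs) / (m * m)

    def ss_between(position):
        """m * sum over levels of (level mean - grand mean)^2 for one factor."""
        totals = {}
        for run in runs:
            totals[run[position]] = totals.get(run[position], 0.0) + run[3]
        return m * sum((t / m - grand) ** 2 for t in totals.values())

    ss = {"rows": ss_between(0), "columns": ss_between(1), "inside": ss_between(2)}
    tss = sum((y - grand) ** 2 for *_, y in runs)
    ss["error"] = tss - sum(ss.values())  # error by subtraction
    df = {"rows": m - 1, "columns": m - 1, "inside": m - 1,
          "error": (m - 1) * (m - 2)}
    ms = {k: ss[k] / df[k] for k in ss}
    f = {k: ms[k] / ms["error"] for k in ("rows", "columns", "inside")}
    return ss, df, ms, f, tss

ss, df, ms, f, tss = latin_square_anova(table)
for source in ("rows", "columns", "inside", "error"):
    print(source, round(ss[source], 1), df[source], round(ms[source], 1))
print("total", round(tss, 1), {k: round(v, 3) for k, v in f.items()})
```

Here "rows" is service policy, "columns" is hours open and "inside" is amenities. The p-values are not computed because that would need an F-distribution routine from a statistics library.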

If the Latin-square design has replication, the model is similar, except that there
is an additional, explicit term for error.
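
ANOVA Table for m-level Latin Square with n replicates

Source of        Sum of Squares                                        df                           Expected value
Variability                                                                                         of MSQ
Rows             $nm \sum_i (\bar{Y}_{i..} - \bar{Y}_{...})^2$         m-1                          $\sigma^2 + V_{rows}$
Columns          $nm \sum_j (\bar{Y}_{.j.} - \bar{Y}_{...})^2$         m-1                          $\sigma^2 + V_{columns}$
Inside factor    $nm \sum_k (\bar{Y}_{..k} - \bar{Y}_{...})^2$         m-1                          $\sigma^2 + V_{inside factor}$
Error            By subtraction                                        $(nm^2-1) - 3(m-1)$          $\sigma^2$
                                                                       $= nm^2 - 3m + 2$
Total            $\sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{...})^2$    $nm^2-1$
                 (summed over all $nm^2$ data values)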
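
As a quick check on the error degrees of freedom, the subtraction can be written out and tried on sample values. The values m = 4 and n = 2 below are hypothetical, chosen only for illustration; the last line confirms that n = 1 recovers the unreplicated result (m-1)(m-2).

```latex
\begin{align*}
df_{\text{error}} &= (nm^2 - 1) - 3(m - 1) = nm^2 - 3m + 2, \\
m = 4,\ n = 2: \quad df_{\text{error}} &= (2 \cdot 16 - 1) - 3 \cdot 3 = 31 - 9 = 22, \\
m = 4,\ n = 1: \quad df_{\text{error}} &= 16 - 12 + 2 = 6 = (m - 1)(m - 2).
\end{align*}
```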

Example: Latin-square Analysis of Valet-Parking Use (Pg. 230)


Aim: to determine how best to accommodate the needs of patients efficiently.

Factors:
- Cost of the valet-parking service (row)
- Quantity of handicapped parking spaces in the parking lot closest to the
clinic entrance (column)
- Number of valet-parking attendants on duty (inside factor, indirect
indicator of waiting time)

Each factor is studied at four levels.
- Number of spaces: levels 4, 3, 2, 1 are, respectively, ten, eight, six and four
spaces.
- Cost: levels 1, 2, 3, 4 are, respectively, $3, $4, $5, $6.
- Number of attendants: the levels are the actual values (1, 2, 3, 4).

Use of Valet Parking (each cell: number of attendants, then number of patients using valet parking)

                        Number of Handicapped Spaces (level)
Cost to Park (level)    4         3         2         1
1                       2  29     4  44     3  54     1  71
2                       3  22     1  22     2  59     4 100
3                       4  38     3  31     1  40     2  79
4                       1  29     2  27     4  83     3 100

ANOVA Table for Valet-Parking Study

Source of            Sum of      df    Mean      Fcalc
Variability          Squares           Square
Cost                   370.5     3      123.5     2.9
No. of spaces         9025.0     3     3008.3    71.1
No. of attendants     1389.5     3      463.2    10.9
Error                  254.0     6       42.3
Total                11039.0    15

At α = 0.05 and df = (3, 6), the critical value is c = 4.76. We conclude that the
number of handicapped parking spaces nearby affects how many people use the
valet-parking service, p < 0.001. The number of valet-parking attendants available,
which is a surrogate for the amount of time a patient must wait to use the valet
service, is also significant, p < 0.01. The cost of the valet service does not seem
to affect its use: p > 0.10.
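
The comparison with the critical value can be spelled out from the sums of squares in the valet-parking ANOVA table. This is only the arithmetic behind the Fcalc column, with each mean square taken as its sum of squares divided by its df:

```latex
\begin{align*}
MSW &= \frac{254}{6} \approx 42.33, \\
F_{\text{cost}} &= \frac{370.5/3}{42.33} = \frac{123.5}{42.33} \approx 2.92 < 4.76, \\
F_{\text{spaces}} &= \frac{9025.0/3}{42.33} = \frac{3008.33}{42.33} \approx 71.06 > 4.76, \\
F_{\text{attendants}} &= \frac{1389.5/3}{42.33} = \frac{463.17}{42.33} \approx 10.94 > 4.76.
\end{align*}
```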

The most significant factor is the number of handicapped spaces; as this number
decreases from ten to four, the average demand increases. As the number of
attendants increases from one to four, the average demand increases from 40.5
to 48.5, to 51.75, to 66.25. The results indicate the importance of the number of
handicapped spaces as well as of the number of valet parking attendants.

Table: Mean number of patients using valet parking

Number of handicapped spaces    4       6       8       10
Mean                            87.5    59      31      29.5

Number of attendants            1       2       3       4
Mean                            40.5    48.5    51.75   66.25

Using SPSS (pg 232)

5.3 Graeco-Latin-Square Designs

A Latin square accommodates only 3 factors. Designs involving more than 3
factors are called Graeco-Latin squares. Example of a Graeco-Latin square:

Table 5.3.1
      B1       B2       B3
A1    C1 D1    C2 D2    C3 D3
A2    C2 D3    C3 D1    C1 D2
A3    C3 D2    C1 D3    C2 D1

Each level of A is on equal footing:
- each level of A is paired once with each level of B;
- each level of A is paired once with each level of C;
- each level of A is paired once with each level of D.

The same holds for factors B, C and D, all assuming that there are no
interaction effects among the factors.

A necessary, but not sufficient, condition for the treatment combinations of
factors A, B, C, and D (with C and D as inside factors) to form a Graeco-Latin
square is that factors A, B and C form a Latin square, and that factors A, B and
D form a Latin square.

Table 5.3.2: A Non-Graeco-Latin Square
      B1       B2       B3
A1    C1 D1    C2 D2    C3 D3
A2    C2 D2    C3 D3    C1 D1
A3    C3 D3    C1 D1    C2 D2

In Table 5.3.2, factors A, B, and C form a Latin square, and so do factors A, B,
and D. However, factors C and D are confounded: each level of C always appears
with the same level of D.
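
The difference between the two tables can be verified with a short check. The sketch below uses illustrative names (`is_latin`, `is_graeco_latin`) and tests both the necessary condition (C and D each form a Latin square with A and B) and the extra requirement that every (C, D) pair of levels occurs exactly once.

```python
# Each cell is a (C level, D level) pair; rows are A1..A3, columns are B1..B3.
table_5_3_1 = [
    [(1, 1), (2, 2), (3, 3)],
    [(2, 3), (3, 1), (1, 2)],
    [(3, 2), (1, 3), (2, 1)],
]
table_5_3_2 = [
    [(1, 1), (2, 2), (3, 3)],
    [(2, 2), (3, 3), (1, 1)],
    [(3, 3), (1, 1), (2, 2)],
]

def is_latin(square):
    """True if every level appears exactly once in each row and each column."""
    m = len(square)
    levels = set(range(1, m + 1))
    rows_ok = all(set(row) == levels for row in square)
    cols_ok = all({square[i][j] for i in range(m)} == levels for j in range(m))
    return rows_ok and cols_ok

def split_inside(table):
    """Separate the table into the square for C and the square for D."""
    c_square = [[cell[0] for cell in row] for row in table]
    d_square = [[cell[1] for cell in row] for row in table]
    return c_square, d_square

def is_graeco_latin(table):
    """True if C and D each form a Latin square and every (C, D) pair occurs once."""
    m = len(table)
    c_square, d_square = split_inside(table)
    pairs = {cell for row in table for cell in row}
    return is_latin(c_square) and is_latin(d_square) and len(pairs) == m * m

print(is_graeco_latin(table_5_3_1))  # True
print(is_graeco_latin(table_5_3_2))  # False
```

Table 5.3.2 passes both Latin-square checks but contains only three distinct (C, D) pairs, which is why the condition is necessary but not sufficient.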

The Graeco-Latin square in Table 5.3.1 is a complete Graeco-Latin square.
This means that the full capacity of the square (the maximum number of factors
that can be included) is used. The relationship that defines this maximum
involves the number of levels, m, and the resultant number of df. In an m-level
Graeco-Latin square without replication, irrespective of the number of factors in
the study, the number of data points is $m^2$ and the total df is $(m^2 - 1)$. The
number of df associated with each factor is (m-1), therefore the maximum
number of factors that can be accommodated is

$(m^2 - 1)/(m - 1) = m + 1$

If m = 3, (m+1) = 4. In Table 5.3.1, the nine treatment combinations represent
a one-ninth replicate of the possible $3^4 = 81$ treatment combinations.

Table 5.3.3: A complete Graeco-Latin square with five levels and 6 factors

      B1             B2             B3             B4             B5
A1    C1 D1 E1 F1    C2 D2 E2 F2    C3 D3 E3 F3    C4 D4 E4 F4    C5 D5 E5 F5
A2    C2 D3 E4 F5    C3 D4 E5 F1    C4 D5 E1 F2    C5 D1 E2 F3    C1 D2 E3 F4
A3    C3 D5 E2 F4    C4 D1 E3 F5    C5 D2 E4 F1    C1 D3 E5 F2    C2 D4 E1 F3
A4    C4 D2 E5 F3    C5 D3 E1 F4    C1 D4 E2 F5    C2 D5 E3 F1    C3 D1 E4 F2
A5    C5 D4 E3 F2    C1 D5 E4 F3    C2 D1 E5 F4    C3 D2 E1 F5    C4 D3 E2 F1
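
The entries of Table 5.3.3 follow a regular pattern: with rows and columns numbered from 0, the level of the k-th inside factor in cell (i, j) is ((k*i + j) mod 5) + 1. The sketch below regenerates the table from that rule and checks that the four inside squares are mutually orthogonal. This is a standard construction for a prime number of levels, but it is an observation about the table made here, not something stated in the notes; the function names are illustrative and non-prime m is not handled.

```python
from itertools import combinations

def inside_squares(m):
    """Build m - 1 squares L_k[i][j] = ((k*i + j) mod m) + 1, for k = 1..m-1.

    For prime m these squares are mutually orthogonal; this sketch does not
    handle non-prime m.
    """
    return [
        [[(k * i + j) % m + 1 for j in range(m)] for i in range(m)]
        for k in range(1, m)
    ]

def mutually_orthogonal(squares):
    """True if every pair of squares yields all m*m distinct pairs of levels."""
    m = len(squares[0])
    for s, t in combinations(squares, 2):
        pairs = {(s[i][j], t[i][j]) for i in range(m) for j in range(m)}
        if len(pairs) != m * m:
            return False
    return True

squares = inside_squares(5)  # inside factors C, D, E, F
print(mutually_orthogonal(squares))
for i in range(5):
    cells = [
        " ".join(f"{name}{sq[i][j]}" for name, sq in zip("CDEF", squares))
        for j in range(5)
    ]
    print(f"A{i + 1}  " + " | ".join(cells))
```

Dropping the last two squares (E and F) from the generated list leaves a four-factor design with the same C and D pattern.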

The flaw of a complete Graeco-Latin square design: all the df are used
to estimate the effects and, in an unreplicated design, none are left to allow the
assessment of error. That is, if there are (m+1) factors each "using up" (m-1) df,
all $(m^2 - 1)$ df are utilized; the SSW would come out zero, and the MSW would
be 0/0 (SSW divided by its df), an "indeterminate form", so it is not possible to
perform significance testing.

If we decide against replication, so-called incomplete Graeco-Latin squares may be
used. Table 5.3.4 shows a Graeco-Latin square that has m levels (five) of
each factor but fewer than (m+1) factors (four).

Table 5.3.4: An incomplete Graeco-Latin square (five levels, four factors)
      B1       B2       B3       B4       B5
A1    C1 D1    C2 D2    C3 D3    C4 D4    C5 D5
A2    C2 D3    C3 D4    C4 D5    C5 D1    C1 D2
A3    C3 D5    C4 D1    C5 D2    C1 D3    C2 D4
A4    C4 D2    C5 D3    C1 D4    C2 D5    C3 D1
A5    C5 D4    C1 D5    C2 D1    C3 D2    C4 D3

The ANOVA table illustrates the testing of the four hypotheses related to the
significance of each of factors A, B, C, D. The value of SSW is determined by
subtracting the other SSB terms from the TSS. The SSB for each factor has 4
df, corresponding to 5 levels.

Source of      Sum of Squares          df    Mean Square    Fcalc
Variability
A              SSBA                    4
B              SSBB                    4
C              SSBC                    4
D              SSBD                    4
Error          SSW (by subtraction)    8
Total          TSS                     24
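
The error df in this table follow from the same accounting used for the complete square. Writing q for the number of factors (a symbol introduced here only for this check), the unreplicated design leaves:

```latex
\begin{align*}
df_{\text{error}} &= (m^2 - 1) - q(m - 1), \\
m = 5,\ q = 4: \quad df_{\text{error}} &= 24 - 4 \cdot 4 = 8, \\
m = 5,\ q = 6: \quad df_{\text{error}} &= 24 - 6 \cdot 4 = 0.
\end{align*}
```

The second case is the complete square of Table 5.3.3, where no df remain for error.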

Exercises:

Pg 240-244: Questions 1, 2, 5, 12, 13.
