Bautista Et Al 1997 A Cluster-Based Approach To Means Separation
International Biometric Society is collaborating with JSTOR to digitize, preserve and extend access to Journal
of Agricultural, Biological, and Environmental Statistics.
Key Words: Means separation; Multiple range test; Post hoc test.
Maria G. Bautista is Agricultural Statistician, Colorado Agricultural Statistics Service, 645 Parfet St.,
Lakewood, CO 80215. David W. Smith is Associate Professor and Robert L. Steiner is Assistant Professor,
University Statistics Center, Box 3CQ, New Mexico State University, Las Cruces, NM 88003 (E-mail:
estatiO3@nmsuvm1.nmsu.edu and estatu28@nmsuvm1.nmsu.edu, respectively).
Gabriel's method (Gabriel 1978), Hochberg's GT2 method (Hochberg 1974), Studentized
Maximum Modulus (Stoline 1978; Stoline and Ury 1979), Sidak T test (Sidak
1967), Tukey's Gap Test (Tukey 1949), Welsch's step-up procedure (Welsch 1977), and
a cluster-based procedure (Scott and Knott 1974). Gupta (1965) has provided results
addressing ranking and selection. Chew (1976) provided a nice review.
4. EXAMPLE
On page 140 of Steel and Torrie (1980), there is an example having six treatments.
Implementation of the proposed procedure is illustrated in Table 2. First, there is a
difference among treatments at Stage 0. The two closest estimated means are 3DOk7 and
COMP, with 19.31 for a combined mean. For Stage 1 these two are grouped. Note that
there is a difference among groups (p < .0001), but that there is no difference between
3DOk7 and COMP (p = .5794). One should now proceed to Stage 2.
Using Stage 1 to obtain Stage 2, the two closest estimated means are 3DOk4 and
3DOk13. Therefore, for Stage 2, 3DOk4 and 3DOk13 are grouped, giving four groups.
Again, there is a difference among groups (p < .0001) but there are no differences
within the groups (p = .7015).
The process continues until Stage 4 is reached and it is discovered that there is a
difference within the group of treatments comprising COMP, 3DOk4, 3DOk5, 3DOk7,
and 3DOk13 (p = .0004). Therefore, the grouping selected is defined by Stage 3, which
yields a group consisting of 3DOk1 alone; a second group containing 3DOk5, 3DOk7,
and COMP; and a third group of 3DOk4 and 3DOk13.
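The stages of this example can be traced numerically. The following Python sketch is illustrative only and is not the authors' SAS macro. The function names (`f_upper_tail`, `cluster_means`) are hypothetical; the stopping rule (stop at the first stage whose pooled within-group F test is significant, then keep the previous stage) and the level alpha = .05 are assumptions inferred from the description above; the treatment means, SSE = 282.928, and dfe = 24 were computed from the data listed with the macro example in Section 6.

```python
import math

def _betacf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c, d = 1.0, 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        for num in (m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m)),
                    -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def _betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def f_upper_tail(f, d1, d2):
    """P(F > f) for an F distribution with (d1, d2) degrees of freedom."""
    if f <= 0.0:
        return 1.0
    return _betainc(d2 / 2.0, d1 / 2.0, d2 / (d2 + d1 * f))

def cluster_means(means, n, mse, dfe, alpha=0.05):
    """Merge the two closest adjacent groups stage by stage (balanced data)."""
    groups = [[name] for name in sorted(means, key=means.get)]
    grand = sum(means.values()) / len(means)
    ss_trt = n * sum((m - grand) ** 2 for m in means.values())

    def gmean(group):
        return sum(means[name] for name in group) / len(group)

    history = []
    while len(groups) > 1:
        i = min(range(len(groups) - 1),
                key=lambda j: gmean(groups[j + 1]) - gmean(groups[j]))
        merged = groups[:i] + [groups[i] + groups[i + 1]] + groups[i + 2:]
        # Within-group SS: treatment means around their group mean.
        ss_win = n * sum((means[k] - gmean(g)) ** 2 for g in merged for k in g)
        df_win = sum(len(g) - 1 for g in merged)
        df_bet = len(merged) - 1
        p_win = f_upper_tail(ss_win / df_win / mse, df_win, dfe)
        p_bet = (f_upper_tail((ss_trt - ss_win) / df_bet / mse, df_bet, dfe)
                 if df_bet > 0 else None)
        history.append((merged, p_win, p_bet))
        if p_win < alpha:          # assumed stopping rule: keep previous stage
            break
        groups = merged
    return groups, history

means = {"3DOk1": 28.82, "3DOk5": 23.98, "3DOk4": 14.64,
         "3DOk7": 19.92, "3DOk13": 13.26, "COMP": 18.70}
n, sse, dfe = 5, 282.928, 24
final, history = cluster_means(means, n, sse / dfe, dfe)
for stage, (grouping, p_within, p_between) in enumerate(history, start=1):
    print(stage, grouping, round(p_within, 4), p_between)
print("selected grouping:", final)
```

Under these assumptions the sketch stops at Stage 4 and returns the Stage 3 grouping, and its within-group p-values for Stages 1 and 2 come out close to the .5794 and .7015 reported above.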
5. SIMULATION
A simulation study using SAS was conducted to compare the performance of the new
method (NEW) to the existing methods: LSD, HSD, SNK, and DUN. The study was
composed of two main sets of simulations, the first being based on five treatments and
the latter being based on nine treatments. Each main set is composed of three simulations
(n = 3, n = 6, and n = 12). The second main set has the same configuration except
that there are nine treatments being considered. Both main simulations assume that the
standard error of the mean is equal to 1.
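As an illustration of this design, one simulated "experiment" can be generated as below. This is a hedged Python sketch, not the SAS program used in the study: the function name `simulate_experiment`, the use of normally distributed errors, and the seed are assumptions, since the surviving text does not describe how the data were generated. The per-observation standard deviation is set to sqrt(n) so that each treatment mean has standard error 1.

```python
import random

def simulate_experiment(true_means, n, rng, se_mean=1.0):
    """Draw n observations per treatment; return means, SSE, dfe, overall F."""
    sigma = se_mean * n ** 0.5      # so that sigma / sqrt(n) equals se_mean
    data = [[rng.gauss(mu, sigma) for _ in range(n)] for mu in true_means]
    means = [sum(obs) / n for obs in data]
    sse = sum((y - m) ** 2 for obs, m in zip(data, means) for y in obs)
    dfe = len(true_means) * (n - 1)
    grand = sum(means) / len(means)
    ms_trt = n * sum((m - grand) ** 2 for m in means) / (len(means) - 1)
    return means, sse, dfe, ms_trt / (sse / dfe)

rng = random.Random(1997)           # arbitrary seed, for repeatability only
sample_means, sse, dfe, f_overall = simulate_experiment(
    [19.0, 19.0, 20.0, 21.0, 21.0], n=6, rng=rng)
print(sample_means, dfe, round(f_overall, 2))
```

The overall F is returned so that a caller can keep only those experiments with a significant overall test before applying a means-separation method.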
The simulations evaluate the accuracy of the five means-separation techniques
by tracking the average number of Type I, Type II, and total errors committed. A Type I
error is made when a method fails to group a pair of identical means, and a Type II error
occurs when a method incorrectly groups a pair of dissimilar means. This error counting
is possible since the true treatment means are known in advance. Tables 3 through 8
summarize the results for the main simulations.
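The pairwise error definitions above can be made concrete with a short Python sketch. The function name `count_errors` and the representation of a method's result as one group label per treatment are assumptions made for illustration; for methods that produce overlapping lines, "grouped" would instead mean "not declared different".

```python
from itertools import combinations

def count_errors(true_means, labels):
    """Count Type I, Type II, and total pairwise errors for one grouping."""
    type1 = type2 = 0
    for i, j in combinations(range(len(true_means)), 2):
        same_truth = true_means[i] == true_means[j]
        same_group = labels[i] == labels[j]
        if same_truth and not same_group:
            type1 += 1      # identical means that were not grouped
        elif same_group and not same_truth:
            type2 += 1      # dissimilar means that were grouped
    return type1, type2, type1 + type2

truth = [19.0, 19.0, 20.0, 21.0, 21.0]
print(count_errors(truth, [1, 1, 1, 2, 2]))
print(count_errors(truth, [1, 2, 2, 3, 3]))
```

With five treatments there are ten pairs, so the total number of errors for a single experiment lies between 0 and 10.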
To illustrate, in Table 3 there are various sets of actual population means which
were used to test the performance of the NEW method. The simulation generated 10,000
"experimental results" with significant difference for each of the different sets of true
[Tables 3 and 4. True treatment means (μ1 to μ5) for Sets 1 to 16, with the average total number of errors, the average number of Type I errors, and the average number of Type II errors for LSD, HSD, SNK, DUN, and NEW; T = 5, σ² of the mean = 1, with n = 3 (dfe = 10) and n = 6 (dfe = 25). The table entries are illegible in this copy.]
[Tables 5 and 6. True treatment means for each set, with the average total number of errors, the average number of Type I errors, and the average number of Type II errors for LSD, HSD, SNK, DUN, and NEW; σ² of the mean = 1. One table has T = 5, n = 12 (dfe = 55), Sets 1 to 16; the other has T = 9, n = 3 (dfe = 18), Sets 1 to 9. The error entries are illegible in this copy; the true treatment means for T = 9 are:]

Set   μ1  μ2  μ3  μ4  μ5  μ6  μ7  μ8  μ9
1     20  20  20  20  20  20  20  20  20
2     19  20  20  20  20  20  20  20  21
3     18  20  20  20  20  20  20  20  22
4     18  18  20  20  20  20  20  22  22
5     16  16  16  16  20  24  24  24  24
6     16  16  16  20  20  20  24  24  24
7     16  16  18  20  20  20  22  24  24
8     18  18  19  19  20  21  21  22  22
9     16  17  18  19  20  21  22  23  24
[Tables 7 and 8. True treatment means (μ1 to μ9) for Sets 1 to 9, with the average total number of errors, the average number of Type I errors, and the average number of Type II errors for LSD, HSD, SNK, DUN, and NEW; T = 9, σ² of the mean = 1, with n = 6 (dfe = 45) and n = 12 (dfe = 99). The table entries are illegible in this copy.]
[Table 9. True treatment means (μ1 to μ5) for Sets 1 to 16, with the average total number of errors, the average number of Type I errors, and the average number of Type II errors for LSD, HSD, SNK, DUN, and NEW; T = 5, variances of the five treatment means equal to .6, .8, 1.0, 1.2, and 1.4, n = 6, dfe = 25. The table entries are illegible in this copy.]
Set   μ1     μ2     μ3     μ4     μ5     LSD   HSD   SNK   DUN   NEW
 1    -      -      -      -      -      -     -     -     -     -
 2    19.75  20.00  20.00  20.00  20.25  05.2  07.0  06.4  05.4  03.9
 3    19.50  20.00  20.00  20.00  20.50  04.9  06.7  06.2  05.1  03.7
 4    19.00  20.00  20.00  20.00  21.00  04.2  06.2  05.6  04.5  03.2
 5    19.00  19.50  20.00  20.50  21.00  11.1  16.1  14.6  11.8  08.3
 6    19.00  19.00  20.00  21.00  21.00  06.0  09.3  08.4  06.5  04.3
 7    18.00  20.00  20.00  20.00  22.00  03.4  05.3  04.5  03.6  02.9
 8    18.00  19.50  20.00  20.50  22.00  09.8  14.3  12.6  10.4  08.1
 9    18.00  19.00  20.00  21.00  22.00  09.0  13.9  12.2  09.6  07.1
10    18.00  18.50  20.00  21.50  22.00  08.3  13.5  11.6  08.9  06.3
11    18.00  18.00  20.00  22.00  22.00  03.9  07.2  05.9  04.3  02.8
12    17.00  20.00  20.00  20.00  23.00  02.5  04.5  03.5  02.7  02.4
13    17.00  19.00  20.00  21.00  23.00  07.5  11.8  09.8  08.0  06.8
14    17.00  18.00  20.00  22.00  23.00  05.9  00.9  07.8  06.3  05.2
15    17.00  17.00  20.00  23.00  23.00  02.0  00.0  02.7  02.2  01.9
16    20.00  20.00  20.50  22.00  24.00  06.7  10.4  08.8  07.1  05.5
In Table 10 there are smaller average Type II errors for the NEW method in all of
the eight sets. Note that the same ordering of methods occurs within a set for Tables 4 and
10. In other words, the ranking of the methods does not change with the weighting. The
only effect of the scheme is to increase the number of Type II errors in relation to the
Table 4 results.
A final set of simulations examines the effect of heterogeneous variances on the
means separation techniques. The standard errors of the means are √.6, √.8, √1.0, √1.2,
and √1.4 for groups 1, 2, 3, 4, and 5, respectively. The results from these simulations
are summarized in Table 9. A comparison of results to Table 4 shows no real effect from
this departure from homogeneity.
6. COMPUTATION
The NEW procedure can be implemented in the Statistical Analysis System (SAS
Institute 1990) with the macro included in the appendix. This macro should be invoked
after PROC ANOVA or PROC GLM indicates that there is a detectable difference between
the treatment means. An example is given below in order to illustrate the use of
the code. In this example the macro is called after the dataset inputdat is created.
This particular dataset has a response variable denoted as NITRO and a classification
variable called CULTURES.
Execution of the macro requires six parameters: the dataset, the response variable,
the classification variable, alpha, error sum of squares, and error degrees of freedom.
be thought of as the default values). If the user wants other values for SSE and
dfe, then these may be entered directly into the macro for ss and dff respectively.
4. The macro should be used when there is an equal number of observations per
treatment group. Directions for executing the procedure when this condition is
violated are included after the balanced example.
5. If the macro is invoked with a dataset that is unbalanced or is lacking an overall
significant F, an error message is displayed.
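The quantities the macro relies on (balance of the design, the error sum of squares, the error degrees of freedom, and the overall F) can be checked independently of SAS. The following Python sketch is only an illustration under the usual one-way ANOVA formulas; the variable names are hypothetical, and the data are the six-treatment example listed with the macro call.

```python
raw = """
19.4 3DOk1 32.6 3DOk1 27.0 3DOk1 32.1 3DOk1 33.0 3DOk1
17.7 3DOk5 24.8 3DOk5 27.9 3DOk5 25.2 3DOk5 24.3 3DOk5
17.0 3DOk4 19.4 3DOk4 9.1 3DOk4 11.9 3DOk4 15.8 3DOk4
20.7 3DOk7 21.0 3DOk7 20.5 3DOk7 18.8 3DOk7 18.6 3DOk7
14.3 3DOk13 14.4 3DOk13 11.8 3DOk13 11.6 3DOk13 14.2 3DOk13
17.3 COMP 19.4 COMP 19.1 COMP 16.9 COMP 20.8 COMP
"""
tokens = raw.split()
obs = {}
for value, group in zip(tokens[0::2], tokens[1::2]):
    obs.setdefault(group, []).append(float(value))

sizes = {g: len(v) for g, v in obs.items()}
balanced = len(set(sizes.values())) == 1          # equal n per treatment?
means = {g: sum(v) / len(v) for g, v in obs.items()}

# Error (within-treatment) sum of squares and degrees of freedom.
sse = sum((y - means[g]) ** 2 for g, v in obs.items() for y in v)
dfe = sum(sizes.values()) - len(obs)

# Treatment sum of squares and overall F statistic.
total_n = sum(sizes.values())
grand = sum(y for v in obs.values() for y in v) / total_n
ss_trt = sum(sizes[g] * (means[g] - grand) ** 2 for g in obs)
f_overall = (ss_trt / (len(obs) - 1)) / (sse / dfe)

print(balanced, round(sse, 3), dfe, round(f_overall, 2))
```

The values of `sse` and `dfe` obtained this way are what a user would pass for the error sum of squares and error degrees of freedom.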
The output begins with a display of the group means, the original groups, and
the designated groups. This table (Table 11) allows the user to see the correspondence
between the groups in the dataset and the groups assigned by the macro. In the example,
3DOk1 is assigned group 1, 3DOk13 is assigned group 2, 3DOk4 is assigned group 3,
3DOk5 is assigned group 4, 3DOk7 is assigned group 5, and COMP is assigned group 6.
The new group names allow the macro to compare and display a large number of means.
After the means are shown, the macro displays the individual iterations, each including
the iteration number, the within group F test, the between group F test, and
the current grouping of means. The final grouping of means is given in the matrix
SIMILAR. The following example illustrates the interpretation of this matrix. Cultures
2 and 3 are not different, and hence form a larger group. Cultures 4, 5, and 6 are also
not different, forming another group. Culture 1 is different from all the other cultures.
SIMILAR
23
456
The above result corresponds to the following conclusion using the familiar line notation:
1 23 456.
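The mapping from a final grouping to line notation is mechanical. As a hedged illustration in Python (the function name `line_notation` and the input format, one group label per designated treatment number, are assumptions and not part of the macro):

```python
def line_notation(labels):
    """Join treatment numbers that share a label; separate groups by spaces."""
    groups = {}
    for treatment, label in enumerate(labels, start=1):
        groups.setdefault(label, []).append(str(treatment))
    return " ".join("".join(members) for members in groups.values())

# Treatment 1 alone, treatments 2-3 together, treatments 4-6 together.
print(line_notation(["a", "b", "b", "c", "c", "c"]))
```

Concatenating the numbers in this way is unambiguous only for up to nine treatments; with more, a separator within groups would be needed.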
The researcher can also use the NEW procedure on unbalanced data. After finding
a significant overall F, the means are sorted, and the closest two means are located. The
observations from these two groups are combined to form a larger group. A new variable,
denoted as G1, is added to the original dataset indicating the new grouping. Proc GLM
is then run with the following model statement: model resp = G1 group(G1).
The within group F test and between group F test are evaluated using the ANOVA
table. Continue the process of forming new groups (and adding new group variables to
the dataset) until the stopping conditions are satisfied.
As with most statistical procedures, the goal of the analysis is important. If the
researcher is interested in nonoverlapping groups, then the NEW procedure could prove
to be quite useful. On the other hand, if the treatment means were 25, 30, 35, 40, 45,
and 50, then the NEW procedure might not be so useful.
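For unbalanced data, the two sums of squares that the nested model separates (between groups, and treatments within groups) can be sketched directly. The Python below is a minimal illustration, assuming sequential sums of squares for a one-way layout; the function name `partition_ss` and the small data set are hypothetical and are not taken from the paper.

```python
def partition_ss(groups):
    """groups: list of groups; each group is a list of treatments;
    each treatment is a list of observations (sample sizes may differ)."""
    all_obs = [y for g in groups for t in g for y in t]
    grand = sum(all_obs) / len(all_obs)
    ss_between = ss_within = 0.0
    for g in groups:
        g_obs = [y for t in g for y in t]
        g_mean = sum(g_obs) / len(g_obs)        # weighted group mean
        ss_between += len(g_obs) * (g_mean - grand) ** 2
        for t in g:
            t_mean = sum(t) / len(t)
            ss_within += len(t) * (t_mean - g_mean) ** 2
    df_between = len(groups) - 1
    df_within = sum(len(g) - 1 for g in groups)
    return ss_between, df_between, ss_within, df_within

# Hypothetical unbalanced example: treatments A and B merged, C alone.
a, b, c = [10.0, 12.0, 11.0], [11.0, 13.0], [20.0, 19.0, 21.0, 22.0]
ss_b, df_b, ss_w, df_w = partition_ss([[a, b], [c]])
print(round(ss_b, 3), df_b, round(ss_w, 3), df_w)
```

Each sum of squares divided by its degrees of freedom, and then by the error mean square from the original analysis, gives the corresponding F statistic; the two pieces add up to the treatment sum of squares.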
Data inputdat;
input resp group $ @@;
cards;
19.4 3DOk1 32.6 3DOk1 27.0 3DOk1 32.1 3DOk1 33.0 3DOk1
17.7 3DOk5 24.8 3DOk5 27.9 3DOk5 25.2 3DOk5 24.3 3DOk5
17.0 3DOk4 19.4 3DOk4 9.1 3DOk4 11.9 3DOk4 15.8 3DOk4
20.7 3DOk7 21.0 3DOk7 20.5 3DOk7 18.8 3DOk7 18.6 3DOk7
14.3 3DOk13 14.4 3DOk13 11.8 3DOk13 11.6 3DOk13 14.2 3DOk13
17.3 COMP 19.4 COMP 19.1 COMP 16.9 COMP 20.8 COMP
7. CONCLUSION
The NEW procedure unambiguously defines the structure among treatment means
by gathering like means into groups. There are no treatment means designated to more
than one group. Treatment means within a group are considered to be homogeneous.
The NEW method performs favorably within the present set of simulations against well-
known multiple range or multiple comparison procedures.
/* S T A R T I M L */
proc iml;
/* S O R T */
HOLD1=J(1, ncol(x),.);
HOLD2=J(1,ncol(sec),.);
DO J = 1 TO Nrow(X)-1;
DO I = 1 TO Nrow(X) -1;
finish sortit;
/* M E A N S */
do i = 2 to nrow(x);
if z(|i|) ^= z(|i-1|) then do;
zum =0;
do j = goo to i-1;
zum=zum+ x(|j|);
end;
ave1(|MARKER|) = zum/(i-goo);
ave2(|MARKER|)=z(|GOO|);
goo = i;
MARKER = MARKER+1;
end;
end;
else do;
zum =0;
do j = goo to NROW(X);
zum=zum+ x(|j|);
end;
END;
z=compress(ave2);
finish;
/* V A R I A N C E */
do i = 2 to nrow(x);
if Z(|i,COLL|) ^= Z(|i-1,COLL|) then do;
zum =0;
zum2=0;
do j = goo to i-1;
zum=zum+ x(|j|);
zum2=zum2+ x(|j|)*x(|j|);
end;
if i - goo = 1 then ave(|marker,1|) = 0; else
else do;
zum =0;
zum2=0;
do j = goo to NROW(X);
zum=zum+ x(|j|);
zum2=zum2+ x(|j|)*x(|j|);
end;
END;
finish;
/* M E T A S */
diff = j(1,1,4000000);
mindiff = diff;
place = diff;
upper = nrow(x);
do i = 2 to upper;
diff = x(|i|) - x(|i-1|);
if diff < mindiff then do;
mindiff = diff;
place = i;
end;
end;
loc = j(1,1,'
loc = outdat(|place|);
FINISH;
/* M E R G E */
start merg( x, y
dumm = j( nrow(x) 1,'
do i = 1 to nrow(x);
do j = 1 to nrow(y);
if x(|i, ncol(x)|) = y(|j, 1|)
then dumm(|i|) = y(|j,2|);
end;
end;
x=x||compress(dumm);
finish;
/* S T A R T M A I N */
start main;
KEEPIT = J ( NROW(X) 1
chars={''''''''''''''''''''''''''''''g,
keepit=chars[ 1:nrow(x) ];
/* c h e c k f o r e q u al n a n d s i g n i f f*/
then do ; * O K T O R U N M E T A G R O U P S;
n = x[1,2];
x=x[ , 1];
dftrt=aov[2,1];
sstrt=aov[2,2];
/* b e g i n t h e m e t a p r o c e s s*/
CALL SORTIT(x,keepit );
XORIG=X;
* p r o d u c e m a t r i x o f m e t a g r o u p s;
indx=1;
z= keepit[ , ncol(keepit) ];
call sortit(xorig,z);
call meanit(xorig,x,z);
call sortit(x,z);
call metas(x,z);
call merg(keepit,z);
z= z[ , 2];
indx=indx+1;
end; * o f w h i l e;
indx = 1;
do until ( sigbet = 1 | sigwin = 1 | indx=NCOL(KEEPIT));
sigbet=0; sigwin = 0;
dfwin=0;
sswin=0;
indx = indx+1;
DUMIT = J( NROW(XORIG) 1 ,
CALL SORTIT( XORIG, DUMIT );
do i = 1 to nrow(vari );
dfwin = dfwin + vari(|i,2|)-1;
sswin = sswin + vari(|i,1|)*(vari(|i,2|)-1);
end;
sswin = n*sswin;
DFBET=DFTRT-DFWIN;
SSBET=SSTRT-SSWIN;
PRINT '-------------------------------------------------';
IT=INDX-1;
PRINT 'I T E R A T I O N' IT;
PRINT ' ';
PRINT 'F TEST FOR WITHIN SUM OF SQUARES';
PRINT DFWIN SSWIN OSLWIN;
PRINT ' ';
PRINT 'F TEST FOR BETWEEN SUM OF SQUARES';
PRINT DFBET SSBET OSLBET;
PRINT ' ';
PRINT DFERR SSERR SSTRT;
PRINT
PRINT
SIMILAR =UNIQUE( KEEPIT[ ,INDX]);
SIMILAR=SIMILAR';
PRINT 'MEAN GROUPS';
PRINT SIMILAR;
PRINT '-------------------------------------------------';
end; * o f r e p e a t ;
PRINT '-------------------------------------------------';
PRINT
PRINT
SIMILAR = UNIQUE(KEEPIT[ ,INDX]);
SIMILAR = SIMILAR';
PRINT ' ';
PRINT 'F I N A L G R O U P I N G O F M E A N S';
PRINT ' ';
PRINT SIMILAR;
PRINT '-------------------------------------------------';
end;
else print 'Sample sizes not equal or overall F not significant';
* C O N D I T I O N S NO T S A T I S F
%mend; * EN D OF M A C R O;
data inputdat;
input resp group $ @@;
cards;
REFERENCES
Chew, V. (1976), "Comparing Treatment Means: A Compendium," HortScience, 11, 348-357.
Duncan, D. B. (1955), "Multiple Range and Multiple F Tests," Biometrics, 11, 1-42.
Federer, W. T. (1955), Experimental Design Theory and Application, New York: Macmillan.
Fisher, R. A. (1951), The Design of Experiments (6th ed.), London: Oliver and Boyd.
Gabriel, K. R. (1978), "A Simple Method of Multiple Comparisons of Means," Journal of the American
Statistical Association, 73, 724-729.
Gupta, S. S. (1965), "On Some Multiple Decision (Selection and Ranking) Rules," Technometrics, 7, 225-245.
Harter, H. L. (1960), "Tables of Range and Studentized Range," Annals of Mathematical Statistics, 31,
1122-1147.
Hochberg, Y. (1974), "Some Generalizations of the T-Method in Simultaneous Inferences," Journal of
Multivariate Analysis, 4, 224-234.
Keuls, M. (1952), "The Use of the 'Studentized Range' in Connection With an Analysis of Variance," Euphytica,
1, 112-122.
Newman, D. (1939), "The Distribution of Range in Samples From a Normal Population, Expressed in Terms
of an Independent Estimate of Standard Deviation," Biometrika, 31, 20-30.
SAS Institute (1990), SAS/STAT User's Guide Volume 2, GLM-VARCOMP, Version 6 (4th ed.), North Carolina:
Author.
Scheffe, H. (1959), The Analysis of Variance (1st ed.), New York: Wiley.
Scott, A. J., and Knott, M. (1974), "A Cluster Analysis Method for Grouping Means in the Analysis of
Variance," Biometrics, 30, 507-512.
Sidak, Z. (1967), "Rectangular Confidence Regions for the Means of Multivariate Normal Distributions,"
Journal of the American Statistical Association, 62, 626-633.
Steel, R., and Torrie, J. (1980), Principles and Procedures of Statistics: A Biometrical Approach (2nd ed.), San
Francisco: McGraw-Hill.