Experimental Designs I (STAT 162)
This lecture syllabus is a compilation of lecture notes written over decades for the
course Experimental Design I (currently with course no. STAT 162) at the Institute of
Statistics (INSTAT), University of the Philippines Los Baños. The materials used were
collected from the writer's files, amassed from designing and analyzing experimental
data while he was working at the International Rice Research Institute (IRRI), the
Philippine Sugar Commission (then the Philippine Sugar Institute and now the Sugar
Regulatory Administration) and the Statistical Consulting Group of INSTAT.
Parts of the lecture notes were first distributed as handouts. Others were written in
longhand on transparencies and shown on overhead projectors. These notes were refined
as lecture papers used in the numerous computer-based training workshops on statistics
conducted by the INSTAT.
Computations illustrating the statistical analyses were done in SAS (SAS Institute, Inc.)
and SPSS (SPSS Inc.) as well as with a calculator. This lecture syllabus is
supplemented with exercises during the laboratory sessions of the course. These
sessions are computer-based and use statistical software to carry out the computations.
This lecture syllabus has not yet been edited by a competent editor. Many
typographical, grammatical, syntax and other editorial errors may be found. These errors
will be addressed later; the immediate concern is to reproduce this syllabus as a
handy reference for the course.
Although this syllabus is intended for STAT 162, it may also come in handy for
researchers and students who are conducting experiments for their research.
Other INSTAT faculty who have contributed to this syllabus are: Dr. Arturo Y.
Pacificador, Jr., Prof. Lina A. Catahan, Dr. Concorcia E. Reaño and Prof. Mae Solivas
Cababasay. The writer also thanks Dr. Santiago M. Alviar, from whom he first learned the
design and analysis of experiments.
Gratitude is given to Ms. Lita O. Averion and Mr. Sonny E. Nerpio who took care of all
Objective: discovery of new facts and revision of accepted theories or laws in the
light of newly discovered facts.
[Figure: flow of the research process: data collection, data examination, data analysis]
There is an element of random error in the result of an experiment; hence, perfect
generalization of the result cannot be made.
However, if the experiment is performed using appropriate principles, design and
techniques of experimentation, then the degree of uncertainty can be measured in
terms of probability, and so probabilistic statements and conclusions can be made.
• Experimental design - the set of rules, plans and course of action taken in the
conduct of an experiment.
It is needed to:
• Ensure cost-effective collection of appropriate data
Sampling error is the measure of variation among sampling units within an
experimental unit.
Chapter 1. Introductory Concepts
Functions of replication
Precision required.
Number of treatments.
Experimental design.
Time allotment for the experiment.
Functions of randomization:
1. The eu's are sufficiently homogeneous (like dishes of culture medium) and
2. Effective local control is assured (as those in laboratories, greenhouse).
Suppose there are t = 3 treatments, T1, T2 and T3, which are replicated r1 = 2, r2 = 3
and r3 = 4 times, respectively; hence, the number of eu's required is n = r1 + r2 + r3 = 9. The
randomization (using the random number generator key on a calculator) may be done as follows:
1. Label the eu's consecutively from 1 to n = 9.
2. Obtain a sequence of n = 9 random numbers. Rank the numbers in increasing
order. Using the sequence of ranks as a randomization of the eu's, assign the first r1
= 2 eu's to T1, the next r2 = 3 eu's to T2 and the last r3 = 4 eu's to T3.
Random no.:      .739  .218  .781  .396  .527  .324  .463  .064  .942
Rank (eu no.):     7     2     8     4     6     3     5     1     9
Layout (eu no.: assigned treatment):
1: T3   2: T1   3: T3
4: T2   5: T3   6: T2
7: T1   8: T2   9: T3
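The ranking procedure above can be sketched in Python. This is only an illustration: the function names `ranks_of` and `assign_crd` are made up for this sketch, and the nine random numbers are the ones printed in the example rather than freshly generated.

```python
def ranks_of(numbers):
    """Return the rank (1 = smallest) of each number, in the original order."""
    order = sorted(range(len(numbers)), key=lambda k: numbers[k])
    ranks = [0] * len(numbers)
    for rank, k in enumerate(order, start=1):
        ranks[k] = rank
    return ranks


def assign_crd(numbers, reps):
    """Use the sequence of ranks as the eu sequence and hand out treatments in turn.

    reps maps a treatment label to its number of replications (hypothetical layout).
    """
    eu_sequence = ranks_of(numbers)
    layout = {}
    pos = 0
    for treatment, r in reps.items():
        for eu in eu_sequence[pos:pos + r]:
            layout[eu] = treatment
        pos += r
    return eu_sequence, layout


# Random numbers and replications from the example in the text
numbers = [0.739, 0.218, 0.781, 0.396, 0.527, 0.324, 0.463, 0.064, 0.942]
reps = {"T1": 2, "T2": 3, "T3": 4}

eu_sequence, layout = assign_crd(numbers, reps)
for eu in sorted(layout):
    print(f"eu {eu}: {layout[eu]}")
```

In an actual experiment the list `numbers` would be drawn fresh, for example with `random.random()` from the standard library.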
Linear Model:
Let $Y_{ij}$ = observation on the response variable from the jth experimental unit with
the ith treatment.
$Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$;
then compute
$CF = (42.3)^2 / 12 = 149.11$
$TSS = [\ldots + 3.5^2] - 149.11 = 36.50$
$TrSS = [\ldots 18.9^2 \ldots] - 149.11 = 11.16$
$ESS = TSS - TrSS = 36.50 - 11.16 = 25.34$
$CV = \frac{\sqrt{MSE}}{\bar{Y}_{..}} \times 100 = \frac{\sqrt{2.82}}{3.52} \times 100 = 47.60\%$
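The arithmetic above can be re-traced with a short Python sketch. The grand total, n, TSS and TrSS are the values printed in the text; t = 3 treatments is an assumption, inferred from MSE = 25.34 / 9 = 2.82.

```python
import math

grand_total = 42.3   # from the text
n = 12               # total number of observations, from the text
t = 3                # ASSUMED number of treatments (error df = n - t = 9)
tss = 36.50          # from the text
trss = 11.16         # from the text

cf = grand_total ** 2 / n           # correction factor
ess = tss - trss                    # error sum of squares
mse = ess / (n - t)                 # error mean square
grand_mean = grand_total / n
cv = math.sqrt(mse) / grand_mean * 100

print(f"CF = {cf:.2f}, ESS = {ess:.2f}, MSE = {mse:.2f}, CV = {cv:.2f}%")
```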
2. Standard error of a treatment mean, s.e.($\bar{Y}_{i.}$), is the measure of the average error in
estimating the true treatment mean. It measures the degree of precision of $\bar{Y}_{i.}$ as
the estimate of the true treatment mean.
$s.e.(\bar{Y}_{i.}) = \sqrt{MSE / r_i}$
3. Standard error of the difference between two treatment means, s.e.($\bar{Y}_{i.} - \bar{Y}_{i'.}$), is a
measure of the average error in estimating the difference between two treatment
means. It measures the degree of precision of ($\bar{Y}_{i.} - \bar{Y}_{i'.}$) as the estimate of the difference
between the true means of treatment i and treatment i'.
$s.e.(\bar{Y}_{i.} - \bar{Y}_{i'.}) = \sqrt{MSE \left( \frac{1}{r_i} + \frac{1}{r_{i'}} \right)}$
CHAPTER 3. ASSUMPTIONS UNDERLYING THE ANALYSIS OF
VARIANCE
Inferences based on the ANOVA are meaningful only when the data set satisfies certain
characteristics, called the assumptions underlying the ANOVA. These are:
30-1)
c. Decision rule: Reject Ho (and accept Ha) if U > the tabular value or Prob(U) < α, where α is the
prescribed level of significance.
Other procedures.
Additivity of Effects
A common departure from this assumption in research is one where the treatment
and the environmental effects are multiplicative. Multiplicative effects are encountered
in disease studies because the effects of disease organisms are usually in multiples of
the numbers present.
This is appropriate for data whose variance is proportional to the mean, as in
data consisting of small whole numbers like counts of rare events (e.g. Poisson data) and
frequency count data.
The transformation is
$Y'_{ij} = \sqrt{Y_{ij} + 0.5}$
The 0.5 is added when the data contain small values (say less than 10) or when zeros
are present.
3. Arcsine transformation
The transformation is
$Y'_{ij} = \arcsin \sqrt{Y_{ij}}$, where $Y_{ij}$ is expressed as a proportion.
After transformation the values are checked for possible improvement. The ANOVA
and subsequent analyses will be done on the transformed values.
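A minimal Python sketch of the two transformations follows. The function names and sample values are hypothetical, and the arcsine form shown (arcsin of the square root of a proportion, reported in degrees) is the usual textbook convention, assumed here because the printed formula did not survive in these notes.

```python
import math


def sqrt_transform(y, add_half=False):
    """Square root transformation; add 0.5 when counts are small or zeros occur."""
    return math.sqrt(y + 0.5) if add_half else math.sqrt(y)


def arcsine_transform(p):
    """Arcsine (angular) transformation of a proportion p in [0, 1], in degrees.

    ASSUMPTION: the degree scale is a common convention, not taken from the notes.
    """
    return math.degrees(math.asin(math.sqrt(p)))


counts = [0, 3, 7, 12]             # hypothetical count data
proportions = [0.05, 0.50, 0.95]   # hypothetical proportions

print([round(sqrt_transform(y, add_half=True), 3) for y in counts])
print([round(arcsine_transform(p), 2) for p in proportions])
```

The ANOVA would then be run on the transformed values in place of the original observations.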
Ho: $\mu_i - \mu_{i'} = 0$ (the treatment means $\mu_i$ and $\mu_{i'}$ are not different)
Ha: $\mu_i - \mu_{i'} \neq 0$ (the treatment means $\mu_i$ and $\mu_{i'}$ are different)
For illustration, consider the results from an experiment conducted to measure the
effects of six organic fertilizer treatments and a control treatment on the yield of
soybean.
ANOVA
SV          df   SS          MS        F.05
Treatment    6   5,587,174   931,196   2.57
Error       21   1,990,238    94,773
TOTAL       27   7,577,412
CV = 15.1%; $se(\bar{Y}_i)$ = 153.9; se(d) = 217.7
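The mean squares and standard errors in this table can be re-derived from the sums of squares. The sketch below assumes r = 4 replications per treatment, which is inferred from the total df (7 treatments x 4 = 28 observations) and is not stated in the text.

```python
import math

t = 7                # treatments: six organic fertilizers plus the control
r = 4                # ASSUMED replications per treatment (total df = tr - 1 = 27)
trss = 5_587_174     # treatment SS from the ANOVA table
ess = 1_990_238      # error SS from the ANOVA table

mstr = trss / (t - 1)
mse = ess / (t * (r - 1))
fc = mstr / mse                      # computed F, to be compared with F.05 = 2.57
se_mean = math.sqrt(mse / r)         # s.e. of a treatment mean
se_diff = math.sqrt(2 * mse / r)     # s.e. of a difference between two means

print(f"MSTr = {mstr:.0f}, MSE = {mse:.0f}, Fc = {fc:.2f}")
print(f"se(mean) = {se_mean:.1f}, se(d) = {se_diff:.1f}")
```

Under these assumptions the computed F exceeds the tabular value of 2.57, so the treatment means would be declared different at the 5% level.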
Terry S. Solivas, INSTAT-CAS, UP Los Baños
Chapter 4. Multiple Comparisons Among Means
Treatment means:
• The least significant difference (LSD) test
2. Decision rule:
Reject Ho: $\mu_i - \mu_{i'} = 0$ (and accept Ha: $\mu_i - \mu_{i'} \neq 0$) if $|\bar{Y}_i - \bar{Y}_{i'}| \geq LSD$
3. Perform the pairwise comparisons of the means
a. Sort the means by arranging them from largest to smallest.
b. Compare the largest mean to each of the smaller means starting from
the next smaller mean. Connect nonsignificant comparisons by a sideline.
c. Compare the next largest mean as in (b).
d. Repeat (b) and (c) until all the means have been compared.
Sorted Treatment   Means
Org C              2,678
Org D              2,552
Org E              2,128
Org A              2,127
Org B              1,796
Org F              1,681
Control            1,316
Treatment   Means*
Org A       2,127 b
Org B       1,796 c
Org C       2,678 a
Org D       2,552 ab
Org E       2,128 bc
Org F       1,681 cd
Control     1,316 d
* Any two means having a common letter(s) are not
significantly different; otherwise, they are significantly
different (at 5% level using LSD test).
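The pairwise LSD comparisons can be sketched in Python as below. The treatment means and se(d) = 217.7 come from the example; the tabular t value of 2.080 (two-tailed 5%, 21 error df) is taken from a standard t table and is an assumption of this sketch, since the LSD value itself is not printed in these notes.

```python
from itertools import combinations

means = {
    "Org A": 2127, "Org B": 1796, "Org C": 2678, "Org D": 2552,
    "Org E": 2128, "Org F": 1681, "Control": 1316,
}
se_d = 217.7     # s.e. of a difference between two means, from the example
t_tab = 2.080    # ASSUMED tabular t (alpha = 5%, 21 error df)

lsd = t_tab * se_d

# Sort from largest to smallest, then compare every pair of means
ranked = sorted(means, key=means.get, reverse=True)
significant = {
    (a, b): (means[a] - means[b]) >= lsd
    for a, b in combinations(ranked, 2)
}

print(f"LSD = {lsd:.1f}")
for (a, b), sig in significant.items():
    diff = means[a] - means[b]
    print(f"{a} - {b} = {diff:4d}  {'significant' if sig else 'ns'}")
```

Pairs flagged `ns` are the ones that would be joined by a sideline or share a letter in the summary table.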
1. The DMRT is more sensitive than the F-test in the ANOVA. Thus it may still be
used even if the F-test is not significant.
2. It is used to compare differences between all possible pairs of means.
3. It is a sequential test; that is, it requires a series of values of the test statistic,
each corresponding to a specific set of pair comparisons.
4. It is used when the experiment has equal replications.
5. It uses the new studentized range table.
Test procedure:
1. Test statistic
Let p = range of the number of means in a particular comparison where
counting starts from the higher mean to the lower mean; p = 2, 3, ..., t
$R_p = r_p \times se(\bar{Y}_i)$
Org E       2,128
Org A       2,127
Org B       1,796
Org F       1,681
Control     1,316
Treatment   Means*
Org A       2,127 b
Org B       1,796 c
Org C       2,678 a
Org D       2,552 ab
Test procedure:
1. Test statistic:
$W_p = q_p \times se(\bar{Y}_i)$
• In the illustrative example: Using α = 5%
p      2     3     4     5     6     7
q_p    2.95  3.58  3.96  4.23  4.45  4.62
W_p    454   551   609   651   685   711
3. Decision rule:
(Assignment)
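As a check on the tabulated critical ranges, the following sketch multiplies each tabular q value by the standard error of a treatment mean. All numbers are copied from the illustrative example; only the variable names are made up.

```python
se_mean = 153.9   # s.e. of a treatment mean, from the example

# Tabular studentized range values q_p as printed in the example (alpha = 5%)
q = {2: 2.95, 3: 3.58, 4: 3.96, 5: 4.23, 6: 4.45, 7: 4.62}

# Critical range for a comparison spanning p means
w = {p: round(q_p * se_mean) for p, q_p in q.items()}

for p in sorted(w):
    print(f"p = {p}: q = {q[p]:.2f}, W = {w[p]}")
```

The entry for p = 7 (the full number of treatments) is the single critical value that a test based on the whole range of means would use.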
Test procedure:
1. Test statistic:
• Let $q_p$ be the tabular value of the studentized range table at the α level of
significance, error df and p = t.
Compute the value of the test statistic HSD as:
$HSD = q_p \times se(\bar{Y}_i)$
• In the illustrative example: At α = 5%
HSD = 4.62 × 153.9 = 711.0
2. Decision rule:
$\sqrt{(t-1) F_t} \times se(d)$
• In the illustrative example: At α = 5%
… × (217.7) = 841
2. Decision rule:
• Linear comparison. Let $\mu_1, \mu_2, \ldots, \mu_t$ be t population means. The linear function expressed as
$L = \sum_{i=1}^{t} c_i \mu_i$ is a linear comparison of the means.
• Set of orthogonal contrasts. The set of t-1 contrasts, $L_1, L_2, \ldots, L_{t-1}$, is a set of orthogonal
contrasts if they are all pairwise orthogonal.
Illustration:
Consider the following results from an experiment conducted to determine the
yield (tons/ha) response of a certain variety of rice to different kinds of fertilizer.
Trt   Fertilizer and application                  Reps   Total   Means
T1    Control (no fertilizer application)          4     15.32   3.83
T2    Complete fertilizer (14-14-14)               4     31.20   7.80
T3    Organic fertilizer A, single application     4     26.84   6.71
T4    Organic fertilizer A, split application      4     29.20   7.30
T5    Organic fertilizer B, single application     4     16.12   4.03
T6    Organic fertilizer B, split application      4     19.68   4.92
ANOVA
SV           df   SS      MS      Fc
Treatments    5   59.45   11.89   49.54**
Error        18    4.30    0.24
TOTAL        23   63.75
Since the ANOVA shows that the treatment means are significantly different, we may
compare groups of these means.
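1. Comparing the mean of the control with the means of all treated. This comparison is
defined by specifying the $c_i$'s as
$L_1 = 5\bar{Y}_1 - (\bar{Y}_2 + \bar{Y}_3 + \bar{Y}_4 + \bar{Y}_5 + \bar{Y}_6)$   control vs treated
2. Comparing the mean of complete fertilizer with the means of organic fertilizers.
$L_2 = 4\bar{Y}_2 - (\bar{Y}_3 + \bar{Y}_4 + \bar{Y}_5 + \bar{Y}_6)$   complete vs organic
3. Comparing the means of organic fertilizer A with the means of organic fertilizer B.
$L_3 = \bar{Y}_3 + \bar{Y}_4 - \bar{Y}_5 - \bar{Y}_6$   A vs B
4. Comparing organic fertilizer A single with A split.
$L_4 = \bar{Y}_3 - \bar{Y}_4$   A single vs A split
5. Comparing organic fertilizer B single with B split.
$L_5 = \bar{Y}_5 - \bar{Y}_6$   B single vs B split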
Other meaningful contrasts are:
Ho: L = 0 vs Ha: L ≠ 0
The test statistic is given by
$F_c = SSL / MSE$
Example:
$SSL_1 = \ldots = 17.97$
$F_c = 17.97 / 0.24 = 74.88$
Conclusion: There is a significant difference between the control mean and the treated
means.
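The sum of squares of a contrast can be sketched in Python from the treatment totals. The coefficient sets below are one reading of the five comparisons described in this illustration, and the formula SS = (sum of c_i x total_i)^2 / (r x sum of c_i^2) is the usual equal-replication form, assumed here because the printed formula is not legible in these notes.

```python
from itertools import combinations

totals = [15.32, 31.20, 26.84, 29.20, 16.12, 19.68]   # treatment totals T1..T6
r = 4                                                 # replications per treatment
mse = 0.24                                            # error mean square from the ANOVA

# ASSUMED coefficient sets for the five comparisons described in the text
contrasts = {
    "control vs treated":     [5, -1, -1, -1, -1, -1],
    "complete vs organic":    [0, 4, -1, -1, -1, -1],
    "organic A vs organic B": [0, 0, 1, 1, -1, -1],
    "A single vs A split":    [0, 0, 1, -1, 0, 0],
    "B single vs B split":    [0, 0, 0, 0, 1, -1],
}


def contrast_ss(coeffs, totals, r):
    """Sum of squares of a contrast computed from treatment totals."""
    value = sum(c * tot for c, tot in zip(coeffs, totals))
    return value ** 2 / (r * sum(c * c for c in coeffs))


def is_orthogonal(c1, c2):
    """Two contrasts are orthogonal when the sum of cross-products is zero."""
    return sum(a * b for a, b in zip(c1, c2)) == 0


ss = {name: contrast_ss(c, totals, r) for name, c in contrasts.items()}
all_orthogonal = all(
    is_orthogonal(contrasts[a], contrasts[b]) for a, b in combinations(contrasts, 2)
)

for name, value in ss.items():
    print(f"{name:24s} SS = {value:6.2f}  Fc = {value / mse:7.2f}")
print("pairwise orthogonal:", all_orthogonal)
print(f"sum of contrast SS = {sum(ss.values()):.2f}")
```

Because the five contrasts are pairwise orthogonal, their sums of squares add up to the treatment SS of the ANOVA, apart from rounding in the printed totals.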
The tests for the other contrasts in the orthogonal set are summarized as follows:
Assignment:
1. Complete the ANOVA for another set of orthogonal contrasts for the example.
SV                       df   SS      MS      Fc
Treatments                5   59.45   11.89   49.54**
  Control vs treated      1   17.97   17.97   74.88**
  Complete vs organic     1   13.58   13.58
  Single vs split         1
  A single vs B single    1
  A split vs B split      1
Error                    18    4.30    0.24
TOTAL                    23   63.75
2. Discuss the results.
We consider only the case where the quantitative levels of the treatments are equally
spaced.
Illustration:
Consider the results from an experiment conducted in CRD to determine the yield response
of corn to a newly formulated organic fertilizer:
The responses to fit in the trend comparisons are called the orthogonal polynomials of
different degrees: Linear, quadratic, cubic, quartic, quintic and other higher degree
polynomials.
The polynomial responses are defined in terms of contrast where the coefficients of the
contrast are given in the following table:
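Ho: L = 0; Ha: L ≠ 0
Test statistic: $F_c = SSL / MSE$,
where $SSL = \left( \sum c_i \bar{Y}_i \right)^2 / \sum (c_i^2 / r_i)$ for unequal replications
Decision rule: Reject Ho (and accept Ha) if $F_c \geq F_t$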
In the illustrative example, the sequential tests for the different degree polynomial
(linear, quadratic, cubic, etc.) comparisons are as follows:
L - 4.07
SSL=
= -2.92
The next step of the analysis is to fit a quadratic regression equation of the fertilizer
levels on the treatment means.
CRD with equal subsampling: We have an experiment laid out in CRD with t
treatments and r replicates per treatment (i.e. equal replications). If we make more
than one observation, say s observations, on each experimental unit, then we have
CRD with equal subsampling.
Data presentation:
Let Yijk = kth observation on the response variable from the jth experimental unit
with the ith treatment.
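Linear Model:
$Y_{ijk} = \mu + \tau_i + \varepsilon_{ij} + \delta_{ijk}$;  i = 1, 2, ..., t; j = 1, 2, ..., r; k = 1, 2, ..., s
where: $\mu$ = general mean of all possible observations,
$\tau_i$ = the effect of the ith treatment,
$\varepsilon_{ij}$ = random error associated with the jth eu with the ith treatment,
$\delta_{ijk}$ = random error associated with the kth su in the jth eu with the ith trt,
$\mu_i = \mu + \tau_i$, the true mean of the ith treatment.
Assumptions: 1. $\varepsilon_{ij} \sim NID(0, \sigma^2_\varepsilon)$ and $\delta_{ijk} \sim NID(0, \sigma^2_\delta)$
2. The $\varepsilon_{ij}$'s are independent of the $\delta_{ijk}$'s
Estimation of parameters
The estimates of the parameters:
$\hat{\mu} = \bar{Y}_{...}$; $\hat{\mu}_i = \bar{Y}_{i..}$; and $\hat{\tau}_i = \bar{Y}_{i..} - \bar{Y}_{...}$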
Model of Effects
Experiments may be classified according to the kind of their effects:
Analysis of Data
1. Perform diagnostic checking of the satisfaction of the ANOVA assumptions.
2. Testing of hypothesis
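a. Construct the ANOVA table as outlined below:
ANOVA
SV               df        SS     MS     Expected Mean Squares
                                         Model I                                                    Model II
Treatments       t-1       TrSS   MSTr   $\sigma^2_\delta + s\sigma^2_\varepsilon + rs\sum\tau_i^2/(t-1)$   $\sigma^2_\delta + s\sigma^2_\varepsilon + rs\sigma^2_\tau$
Expt'l error     t(r-1)    ESS    MSE    $\sigma^2_\delta + s\sigma^2_\varepsilon$                          $\sigma^2_\delta + s\sigma^2_\varepsilon$
Sampling error   tr(s-1)   DSS    MSD    $\sigma^2_\delta$                                                  $\sigma^2_\delta$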
Test statistic:
$F_c = MSE / MSD$ vs $F_t = F_\alpha(t(r-1), tr(s-1))$
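Measures of precision:
1. The standard error of a treatment mean
$s.e.(\bar{Y}_{i..}) = \sqrt{MSE / (rs)}$
2. The standard error of the difference between two treatment means
$s.e.(\bar{Y}_{i..} - \bar{Y}_{i'..}) = \sqrt{2 MSE / (rs)}$
3. The CV of the experiment.
$CV = \frac{\sqrt{MSE}}{\bar{Y}_{...}} \times 100$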
= MSD,
Illustrative example:
An experiment was conducted to determine the levels of phosphorus (ppm) of 3
types of soil: A = Maahas clay loam, B = Luisiana clay loam and C = Lipa clay
loam. For each soil type, 4 locations were selected at random and within each
location, two samples were taken. The phosphorus concentration was determined
for each sample. The following data were collected:
Soil       Soil               Locations                Soil type   Soil type
type       samples      1      2      3      4         total       mean
Maahas     1            9.1    7.3    7.3   10.7
           2            7.3    9.0    8.9   12.7
           Sub-total   16.4   16.3   16.2   23.4       72.3        9.04
Luisiana   1           12.6    9.1   10.9    8.0
           2           14.5   10.8   12.8    9.8
           Sub-total   27.1   19.9   23.7   17.8       88.5       11.06
Lipa       1            7.3    6.6    5.2    5.3
           2            9.0    8.4    6.9    6.8
           Sub-total   16.3   15.0   12.1   12.1       55.5        6.94
Grand total and mean                                   216.3        9.01
Analysis:
Identify the treatments, experimental units, sampling units and response variable.
• Treatments - soil types: A, B, C
• Expt'l units - locations within soil types, r = 4
• Sampl'g units - samples within locations, s = 2
• Response var. - phosphorus concentration (ppm)
Computations:
$CF = \frac{(216.3)^2}{trs} = \frac{(216.3)^2}{3 \cdot 4 \cdot 2} = 1949.40$
$\sum\sum\sum Y_{ijk}^2 = 2087.21$
Locations within soil SS $= \frac{1}{2}[16.4^2 + 16.3^2 + \ldots + 12.1^2] - 2017.49 = 51.08$
Samples within location SS $= 2087.21 - 2068.56 = 18.65$
ANOVA of the phosphorus concentrations (ppm)
SV                        df   SS       MS      Fc     F.05
Among soil types           2    68.09   34.04   5.99   4.26
Locations within soil      9    51.08    5.68   3.66   2.86
Samples within location   12    18.65    1.55
TOTAL                     23   137.81
CV = 26.45%; $se(\bar{Y}_{i..})$ = 0.84; se(d) = 1.19
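The sums of squares for this CRD with subsampling can be re-computed from the raw data with the sketch below. One assumption is made loudly: the second sample at Lipa location 3 is entered as 6.9, because that value agrees with the printed sub-total (12.1) and the printed total sum of squares (2087.21). Results agree with the ANOVA table to within rounding.

```python
import math

# data[soil type] = list of (sample 1, sample 2) pairs, one pair per location
data = {
    "Maahas":   [(9.1, 7.3), (7.3, 9.0), (7.3, 8.9), (10.7, 12.7)],
    "Luisiana": [(12.6, 14.5), (9.1, 10.8), (10.9, 12.8), (8.0, 9.8)],
    "Lipa":     [(7.3, 9.0), (6.6, 8.4), (5.2, 6.9), (5.3, 6.8)],  # 6.9 is ASSUMED
}
t, r, s = len(data), 4, 2

all_obs = [y for locs in data.values() for loc in locs for y in loc]
cf = sum(all_obs) ** 2 / (t * r * s)

tss = sum(y * y for y in all_obs) - cf
soil_term = sum(sum(sum(loc) for loc in locs) ** 2 for locs in data.values()) / (r * s)
loc_term = sum(sum(loc) ** 2 for locs in data.values() for loc in locs) / s

trss = soil_term - cf          # among soil types
ess = loc_term - soil_term     # locations within soil types
dss = tss - trss - ess         # samples within locations

mstr = trss / (t - 1)
mse = ess / (t * (r - 1))
msd = dss / (t * r * (s - 1))

f_soil = mstr / mse            # soil types tested against locations within soil
f_loc = mse / msd              # locations tested against samples within locations
se_mean = math.sqrt(mse / (r * s))
cv = math.sqrt(mse) / (sum(all_obs) / len(all_obs)) * 100

print(f"CF = {cf:.2f}, TSS = {tss:.2f}")
print(f"TrSS = {trss:.2f}, ESS = {ess:.2f}, DSS = {dss:.2f}")
print(f"F(soil) = {f_soil:.2f}, F(locations) = {f_loc:.2f}")
print(f"se(mean) = {se_mean:.2f}, CV = {cv:.2f}%")
```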
2. Test if phosphorus concentration varies among different locations within soil
types.
Statistical hypotheses
Hypotheses:
Test statistic:
$F_c = \frac{MSTr}{MSE} = \frac{34.04}{5.68} = 5.99$ vs $F_t = F_{.05}(2, 9) = 4.26$
Conclusion: At least two soil types have significantly different mean phosphorus
concentrations.
(Assignment)
The use of one-way classification or CRD requires that the eu's are
homogeneous. Sometimes this requirement is not satisfied since the eu's are
markedly heterogeneous with respect to some criteria of classification. For
instance, plots differ in fertility, trees differ in age or height, analysts differ in
efficiency, etc.
The differences among the eu's are a major source of experimental error. When the
eu's are heterogeneous, the appropriate design must account for this
heterogeneity. These designs are the randomized complete block design (RCBD)
and the Latin square design.
The eu's are grouped into r blocks in such a way that the differences between the
units among different blocks are greater than the differences between the units
within each block.
Likewise, the blocking should be done in such a way that the blocks cut across or
are perpendicular to the direction of the eu's gradient.
This way, if there are differences among the blocks, the variability is removed from the
experimental error thereby improving the precision of the experiment.
Ex: For block 1, draw sequence: 1 2 3 4 = treatment no.
                 drawn numbers: 2 4 1 3 = eu number
    For block 2, draw sequence: 1 2 3 4 = treatment no.
                 drawn numbers: 4 1 3 2 = eu number
    For block 3, draw sequence: 1 2 3 4 = treatment no.
                 drawn numbers: 3 1 2 4 = eu number
The corresponding layout is:
eu no.   Block 1   Block 2   Block 3
1        T3        T2        T2
2        T1        T4        T3
3        T4        T3        T1
4        T2        T1        T4
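A randomization of this kind can be sketched in Python. Note the swap in technique: the sketch shuffles the treatment list independently within each block with the standard library's `random.shuffle` rather than drawing numbers by hand; the function name `randomize_rcbd` and the seed are hypothetical.

```python
import random


def randomize_rcbd(treatments, n_blocks, seed=None):
    """Return one independent random ordering of the treatments per block.

    layout[b][k] is the treatment assigned to eu k + 1 of block b + 1.
    """
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)          # a fresh randomization for every block
        layout.append(order)
    return layout


layout = randomize_rcbd([1, 2, 3, 4], n_blocks=3, seed=162)
for b, order in enumerate(layout, start=1):
    print(f"Block {b}: " + "  ".join(f"eu {k}: T{trt}" for k, trt in enumerate(order, start=1)))
```

Every treatment appears exactly once in each block, which is what makes the blocks complete.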
Linear Model:
$Y_{ij} = \mu + \tau_i + \rho_j + \varepsilon_{ij}$
To illustrate the analysis of results, we consider the following data obtained from
an experiment conducted to study the effects of four varieties of mongo. The
experiment was laid out in RCBD with five farms as the blocks. In each farm, four
uniform plots were chosen on which the four varieties were randomly assigned.
Varieties              Farms                      Variety
            1      2      3      4      5        Totals
A          32.3   34.0   34.3   35.0   36.5      172.1
B          33.3   33.0   36.3   36.8   34.5      173.9
C          30.8   34.3   35.3   32.3   35.8      168.5
D          27.0   26.0   28.9   28.0   28.8      138.7
Totals    123.4  127.3  134.8  132.1  135.6      653.2
At α = 5%, test if the different varieties have different yields.
Statistical hypotheses
Ho: All the variety means are not different.
Test statistic: $F_c = \frac{MSTr}{MSE}$; critical value: $F_t = F_{.05}(3, 12) = 3.49$
Decision rule: Reject Ho (and accept Ha) if $F_c \geq F_t$; else, accept Ho.
ANOVA
SV           df           SS     MS     Fc
Blocks       r-1          BSS    MSB
Treatments   t-1          TrSS   MSTr   MSTr/MSE
Error        (r-1)(t-1)   ESS    MSE
TOTAL        tr-1         TSS
Computation:
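The computation can be sketched in Python from the data table, using the usual RCBD sums of squares (correction factor, block SS, treatment SS, and error SS by subtraction). The variety labels A to D follow the table rows; the results are this sketch's own arithmetic, not figures copied from the notes.

```python
yields = {                       # rows: varieties; columns: farms 1 to 5
    "A": [32.3, 34.0, 34.3, 35.0, 36.5],
    "B": [33.3, 33.0, 36.3, 36.8, 34.5],
    "C": [30.8, 34.3, 35.3, 32.3, 35.8],
    "D": [27.0, 26.0, 28.9, 28.0, 28.8],
}
t = len(yields)                  # treatments (varieties)
r = 5                            # blocks (farms)

variety_totals = [sum(row) for row in yields.values()]
farm_totals = [sum(row[j] for row in yields.values()) for j in range(r)]
grand_total = sum(variety_totals)

cf = grand_total ** 2 / (t * r)
tss = sum(y * y for row in yields.values() for y in row) - cf
trss = sum(v * v for v in variety_totals) / r - cf
bss = sum(b * b for b in farm_totals) / t - cf
ess = tss - trss - bss

mstr = trss / (t - 1)
mse = ess / ((r - 1) * (t - 1))
fc = mstr / mse
f_tab = 3.49                     # F.05(3, 12), as given in the text

print(f"CF = {cf:.3f}")
print(f"BSS = {bss:.2f}, TrSS = {trss:.2f}, ESS = {ess:.2f}, TSS = {tss:.2f}")
print(f"Fc = {fc:.2f} vs Ft = {f_tab}: {'reject Ho' if fc >= f_tab else 'accept Ho'}")
```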