Experimental Design: Some Basic Design Concepts
Roger E. Kirk
5. Determination of the statistical analysis that will be performed (Kirk, 1995: 1–2).

In summary, an experimental design identifies the independent, dependent, and nuisance variables and indicates the way in which the randomization and statistical aspects of an experiment are to be carried out. The primary goal of an experimental design is to establish a causal connection between the independent and dependent variables. A secondary goal is to extract the maximum amount of information with the minimum expenditure of resources.

Randomization

The seminal ideas for experimental design can be traced to Sir Ronald Fisher. The publication of Fisher's Statistical methods for research workers in 1925 and The design of experiments in 1935 gradually led to the acceptance of what today is considered the cornerstone of good experimental design: randomization. Prior to Fisher's pioneering work, most researchers used systematic schemes rather than randomization to assign participants to the levels of a treatment.

Random assignment has three purposes. It helps to distribute the idiosyncratic characteristics of participants over the treatment levels so that they do not selectively bias the outcome of the experiment. Also, random assignment permits the computation of an unbiased estimate of error effects (those effects not attributable to the manipulation of the independent variable) and it helps to ensure that the error effects are statistically independent. Through random assignment, a researcher creates two or more groups of participants that at the time of assignment are probabilistically similar on the average.

Quasi-experimental design

Sometimes, for practical or ethical reasons, participants cannot be randomly assigned to treatment levels. For example, it would be unethical to expose people to a disease to evaluate the efficacy of a treatment. In such cases it may be possible to find preexisting or naturally occurring experimental units who have been exposed to the disease. If the research has all of the features of an experiment except random assignment, it is called a quasi-experiment. Unfortunately, the interpretation of quasi-experiments is often ambiguous. In the absence of random assignment, it is difficult to rule out all variables other than the independent variable as explanations for an observed result. In general, the difficulty of unambiguously interpreting the outcome of research varies inversely with the degree of control that a researcher is able to exercise over randomization.

Replication and local control

Fisher popularized two other principles of good experimentation: replication and local control or blocking. Replication is the observation of two or more experimental units under the same conditions. Replication enables a researcher to estimate error effects and obtain a more precise estimate of treatment effects. Blocking, on the other hand, is an experimental procedure for isolating variation attributable to a nuisance variable. Nuisance variables are undesired sources of variation that can affect the dependent variable. Three experimental approaches are used to deal with nuisance variables:

1. Hold the variable constant.
2. Assign experimental units randomly to the treatment levels so that known and unsuspected sources of variation among the units are distributed over the entire experiment and do not affect just one or a limited number of treatment levels.
3. Include the nuisance variable as one of the factors in the experiment.

The latter approach is called local control or blocking. Many statistical tests can be thought of as a ratio of error effects and treatment effects as follows:

Test statistic = [f(error effects) + f(treatment effects)] / f(error effects)    (1)
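Random assignment of the kind described above is straightforward to carry out in software. The following sketch is illustrative (the participant count and level names are assumptions, not from the text): shuffle the pool of participant IDs and deal them into equal-sized groups.

```python
import random

# Illustrative example: randomly assign 30 participants to p = 3 treatment levels.
random.seed(42)                      # fixed seed so the assignment is reproducible
participants = list(range(1, 31))    # participant IDs 1..30
random.shuffle(participants)

# Deal the shuffled IDs into three equal groups of 10.
levels = ["a1", "a2", "a3"]
groups = {level: sorted(participants[i * 10:(i + 1) * 10])
          for i, level in enumerate(levels)}

for level in levels:
    print(level, groups[level])
```

Because every ordering of the pool is equally likely, each participant's idiosyncratic characteristics have the same chance of landing in any treatment level.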
Figure 2.7 Layout for analyzing the Solomon four-group design. Three contrasts can be tested: effects of treatment A (Ȳ Treatment versus Ȳ No Treatment), effects of treatment B (Ȳ Pretest versus Ȳ No Pretest), and the interaction of treatments A and B (Ȳ.5 + Ȳ.4) versus (Ȳ.2 + Ȳ.6). [Participant-by-cell layout omitted. Rows: a1 (treatment) with cell means Ȳ.5 (no pretest) and Ȳ.2 (pretest); a2 (no treatment) with cell means Ȳ.6 (no pretest) and Ȳ.4 (pretest).]
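The three contrasts in the caption translate directly into arithmetic on the four cell means. A minimal sketch with hypothetical cell means (the numerical values are illustrative, not data from the text):

```python
# Hypothetical cell means for the Solomon four-group layout in Figure 2.7:
# a1 = treatment, a2 = no treatment; b1 = no pretest, b2 = pretest.
y_bar_5, y_bar_2 = 24.0, 26.0   # a1: no pretest (Y-bar.5), pretest (Y-bar.2)
y_bar_6, y_bar_4 = 18.0, 21.0   # a2: no pretest (Y-bar.6), pretest (Y-bar.4)

# The three contrasts described in the caption.
effect_A = (y_bar_5 + y_bar_2) / 2 - (y_bar_6 + y_bar_4) / 2   # treatment A
effect_B = (y_bar_2 + y_bar_4) / 2 - (y_bar_5 + y_bar_6) / 2   # pretest (treatment B)
interaction = (y_bar_5 + y_bar_4) - (y_bar_2 + y_bar_6)        # A x B interaction

print(effect_A, effect_B, interaction)
```

A nonzero interaction contrast would suggest that the effect of the treatment depends on whether a pretest was administered.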
factorial design is shown in Figure 2.7. The null hypotheses are as follows:

H0: µTreatment = µNo Treatment (treatment A population means are equal)
H0: µPretest = µNo Pretest (treatment B population means are equal)
H0: A × B interaction = 0 (treatments A and B do not interact)

The null hypotheses are tested with the following F statistics:

F = MSA/MSWCELL, F = MSB/MSWCELL, and F = MSA × B/MSWCELL    (10)

where MSWCELL denotes the within-cell mean square. Insight into the effects of administering a pretest is obtained by examining the sample contrasts of Ȳ.2 versus Ȳ.5 and Ȳ.4 versus Ȳ.6. Similarly, insight into the effects of maturation and history is obtained by examining the sample contrasts of Ȳ.1 versus Ȳ.6 and Ȳ.3 versus Ȳ.6.

The inclusion of one or more control groups in a design helps to ameliorate the threats to internal validity mentioned earlier. However, as I describe next, a researcher must still deal with other threats to internal validity when the experimental units are people.

THREATS TO INTERNAL VALIDITY WHEN THE EXPERIMENTAL UNITS ARE PEOPLE

Demand characteristics

Doing research with people poses special threats to the internal validity of a design. Because of space restrictions, I will mention only three threats: demand characteristics, participant-predisposition effects, and experimenter-expectancy effects. According to Orne (1962), demand characteristics result from cues in the experimental environment or procedure that lead participants to make inferences about the purpose of an experiment
and to respond in accordance with (or in some cases, contrary to) the perceived purpose. People are inveterate problem solvers. When they are told to perform a task, the majority will try to figure out what is expected of them and perform accordingly. Demand characteristics can result from rumors about an experiment, what participants are told when they sign up for an experiment, the laboratory environment, or the communication that occurs during the course of an experiment. Demand characteristics influence a participant's perceptions of what is appropriate or expected and, hence, their behavior.

Participant-predisposition effects

Participant-predisposition effects can also affect the internal validity of an experiment. Because of past experiences, personality, and so on, participants come to experiments with a predisposition to respond in a particular way. I will mention three kinds of participants. The first group of participants is mainly concerned with pleasing the researcher and being 'good subjects.' They try, consciously or unconsciously, to provide data that support the researcher's hypothesis. This participant predisposition is called the cooperative-participant effect.

A second group of participants tends to be uncooperative and may even try to sabotage the experiment. Masling (1966) has called this predisposition the 'screw you effect.' The effect is often seen when research participation is part of a college course requirement. The predisposition can result from resentment over being required to participate in an experiment, from a bad experience in a previous experiment such as being deceived or made to feel inadequate, or from a dislike for the course or the professor associated with the course. Uncooperative participants may try, consciously or unconsciously, to provide data that do not support the researcher's hypothesis.

A third group of participants is apprehensive about being evaluated. Participants with evaluation apprehension (Rosenberg, 1965) aren't interested in the experimenter's hypothesis, much less in sabotaging the experiment. Instead, their primary concern is in gaining a positive evaluation from the researcher. The data they provide are colored by a desire to appear intelligent, well adjusted, and so on, and to avoid revealing characteristics that they consider undesirable. Clearly, these participant-predisposition effects can affect the internal validity of an experiment.

Experimenter-expectancy effects

Experimenter-expectancy effects can also affect the internal validity of an experiment. Experiments with human participants are social situations in which one person behaves under the scrutiny of another. The researcher requests a behavior and the participant behaves. The researcher's overt request may be accompanied by other more subtle requests and messages. For example, body language, tone of voice, and facial expressions can communicate the researcher's expectations and desires concerning the outcome of an experiment. Such communications can affect a subject's performance.

Rosenthal (1963) has documented numerous examples of how experimenter-expectancy effects can affect the internal validity of an experiment. He found that researchers tend to obtain from their subjects, whether human or animal, the data they want or expect to obtain. A researcher's expectations and desires also can influence the way he or she records, analyses, and interprets data. According to Rosenthal (1969, 1978), observational or recording errors are usually small and unintentional. However, when such errors occur, more often than not they are in the direction of supporting the researcher's hypothesis. Sheridan (1976) reported that researchers are much more likely to recompute and double-check results that conflict with their hypotheses than results that support their hypotheses.

Kirk (1995: 22–24) described a variety of ways of minimizing these kinds of threats to internal validity. It is common practice, for example, to use a single-blind procedure in
which participants are not informed about the nature of their treatment and, when feasible, the purpose of the experiment. A single-blind procedure helps to minimize the effects of demand characteristics.

A double-blind procedure, in which neither the participant nor the researcher knows which treatment level is administered, is even more effective. For example, in the smoking experiment the patch with the medication and the placebo can be coded so that those administering the treatment cannot identify the condition that is administered. A double-blind procedure helps to minimize both experimenter-expectancy effects and demand characteristics. Often, the nature of the treatment levels is easily identified. In this case, a partial-blind procedure can be used in which the researcher does not know until just before administering the treatment level which level will be administered. In this way, experimenter-expectancy effects are minimized up until the administration of the treatment level.

The relatively simple designs described so far provide varying degrees of control of threats to internal validity. For a description of other simple designs such as the longitudinal-overlapping design, time-lag design, time-series design, and single-subject design, the reader is referred to Kirk (1995: 12–16). The designs described next are all analyzed using the analysis of variance and provide excellent control of threats to internal validity.

ANALYSIS OF VARIANCE DESIGNS WITH ONE TREATMENT

Completely randomized design

The simplest analysis of variance (ANOVA) design (a completely randomized design) involves randomly assigning participants to a treatment with two or more levels. The design shares many of the features of an independent-samples t-statistic design. Again, consider the smoking experiment and suppose that the researcher wants to evaluate the effectiveness of three treatment levels: medication delivered by a patch, cognitive-behavioral therapy, and a patch without medication (placebo). In this example and those that follow, a fixed-effects model is assumed, that is, the experiment contains all of the treatment levels of interest to the researcher. The null and alternative hypotheses for the smoking experiment are, respectively:

H0: µ1 = µ2 = µ3    (11)
H1: µj ≠ µj′ for some j and j′    (12)

Assume that 30 smokers are available to participate in the experiment. The smokers are randomly assigned to the three treatment levels with 10 smokers assigned to each level. The layout for the experiment is shown in Figure 2.8. Comparison of the layout in this figure with that in Figure 2.4 for an independent-samples t-statistic design reveals that they are the same except that the completely randomized design has three treatment levels.

I have identified the null hypothesis that the researcher wants to test, µ1 = µ2 = µ3, and described the manner in which the participants are assigned to the three treatment levels. Space limitations prevent me from describing the computational procedures or the assumptions associated with the design. For this information, the reader is referred to the many excellent books on experimental design (Anderson, 2001; Dean and Voss, 1999; Giesbrecht and Gumpertz, 2004; Kirk, 1995; Maxwell and Delaney, 2004; Ryan, 2007).

The partition of the total sum of squares, SSTOTAL, and total degrees of freedom, np − 1, for a completely randomized design are as follows:

SSTOTAL = SSBG + SSWG    (13)
np − 1 = (p − 1) + p(n − 1)    (14)

where SSBG denotes the between-groups sum of squares and SSWG denotes the within-groups sum of squares. The null hypothesis is tested with the following F statistic:

F = [SSBG/(p − 1)] / [SSWG/[p(n − 1)]] = MSBG/MSWG    (15)
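The partition in Equations (13)–(15) can be computed in a few lines of code. A minimal sketch with hypothetical scores for the three treatment levels (the data are illustrative, not from the chapter):

```python
# Hypothetical post-treatment scores for the three treatment levels of the
# completely randomized design; n = 10 participants per level.
groups = [
    [10, 12, 9, 11, 13, 8, 10, 12, 11, 9],     # a1: medication patch
    [14, 13, 15, 12, 16, 14, 13, 15, 14, 12],  # a2: cognitive-behavioral therapy
    [17, 16, 18, 15, 19, 17, 16, 18, 17, 15],  # a3: placebo patch
]
p, n = len(groups), len(groups[0])
means = [sum(g) / n for g in groups]
grand_mean = sum(sum(g) for g in groups) / (n * p)

ssbg = n * sum((m - grand_mean) ** 2 for m in means)                # between groups
sswg = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)  # within groups

# Degrees-of-freedom partition, Equation (14): np - 1 = (p - 1) + p(n - 1)
assert n * p - 1 == (p - 1) + p * (n - 1)

msbg = ssbg / (p - 1)          # between-groups mean square, df = p - 1
mswg = sswg / (p * (n - 1))    # within-groups mean square, df = p(n - 1)
F = msbg / mswg                # Equation (15)
print(round(F, 2))
```

A large F indicates that the variation among the group means is large relative to the within-group (error) variation, which is the ratio-of-effects logic described earlier.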
Figure 2.9 Layout for a randomized block design with p = 3 treatment levels (Treat. Level)
and n = 10 blocks. In the smoking experiment, the p participants in each block were randomly
assigned to the treatment levels. The means of treatment A are denoted by Ȳ .1 , Ȳ .2 , and Ȳ .3 ;
the means of the blocks are denoted by Ȳ 1. , … , Ȳ 10. .
I prefer to use the designation randomized block design for all three cases. The total sum of squares and total degrees of freedom are partitioned as follows:

SSTOTAL = SSA + SSBLOCKS + SSRESIDUAL    (16)
np − 1 = (p − 1) + (n − 1) + (n − 1)(p − 1)    (17)

where SSA denotes the treatment A sum of squares and SSBLOCKS denotes the block sum of squares. The SSRESIDUAL is the interaction between treatment A and blocks; it is used to estimate error effects. Two null hypotheses can be tested:

H0: µ.1 = µ.2 = µ.3 (treatment A population means are equal)    (18)
H0: µ1. = µ2. = · · · = µ10. (block, BL, population means are equal)    (19)

where µij denotes the population mean for the ith block and the jth level of treatment A. The F statistics are:

F = [SSA/(p − 1)] / [SSRESIDUAL/[(n − 1)(p − 1)]] = MSA/MSRESIDUAL    (20)
F = [SSBL/(n − 1)] / [SSRESIDUAL/[(n − 1)(p − 1)]] = MSBL/MSRESIDUAL    (21)

The test of the block null hypothesis is of little interest because the blocks represent a nuisance variable that a researcher wants to isolate so that it does not appear in the estimators of the error effects.

The advantages of this design are: (1) simplicity in the statistical analysis; and (2) the ability to isolate a nuisance variable so as to obtain greater power to reject a false null hypothesis. The disadvantages of the design include: (1) the difficulty of forming homogeneous blocks or observing participants p times when p is large; and (2) the restrictive assumptions (sphericity and additivity) of the design. For a description of these assumptions, see Kirk (1995: 271–282).

Latin square design

The Latin square design is appropriate to experiments that have one treatment, denoted by A, with p ≥ 2 levels and two nuisance variables, denoted by B and C, each with p levels. The design gets its name from an ancient puzzle that was concerned with the number of ways that Latin letters can be arranged in a square matrix so that each letter appears once in each row and once in each column. A 3 × 3 Latin square is shown in Figure 2.10.

The randomized block design enables a researcher to isolate one nuisance variable: variation among blocks. A Latin square design extends this procedure to two nuisance
variables: variation associated with the rows (B) of the Latin square and variation associated with the columns of the square (C). As a result, the Latin square design is generally more powerful than the randomized block design. The layout for a Latin square design with three levels of treatment A is shown in Figure 2.11 and is based on the aj bk cl combinations in Figure 2.10.

      c1   c2   c3
b1    a1   a2   a3
b2    a2   a3   a1
b3    a3   a1   a2

Figure 2.10 Three-by-three Latin square, where aj denotes one of the j = 1, …, p levels of treatment A, bk denotes one of the k = 1, …, p levels of nuisance variable B, and cl denotes one of the l = 1, …, p levels of nuisance variable C. Each level of treatment A appears once in each row and once in each column as required for a Latin square.

Figure 2.11 Layout for a Latin square design that is based on the Latin square in Figure 2.10. Treatment A and the two nuisance variables, B and C, each have p = 3 levels. (Dep. Var., dependent variable; Treat. Comb., treatment combination.) [Participant-by-group layout omitted; each of the nine groups receives one aj bk cl combination from Figure 2.10.]

The total sum of squares and total degrees of freedom are partitioned as follows:

SSTOTAL = SSA + SSB + SSC + SSRESIDUAL + SSWCELL
np² − 1 = (p − 1) + (p − 1) + (p − 1) + (p − 1)(p − 2) + p²(n − 1)

where SSA denotes the treatment sum of squares, SSB denotes the row sum of squares, and SSC denotes the column sum of squares. SSWCELL denotes the within-cell sum of squares and estimates error effects. Four null hypotheses can be tested:

H0: µ1.. = µ2.. = µ3.. (treatment A population means are equal)
H0: µ.1. = µ.2. = µ.3. (row, B, population means are equal)
H0: µ..1 = µ..2 = µ..3 (column, C, population means are equal)
H0: interaction components = 0 (selected A × B, A × C, B × C, and A × B × C interaction components equal zero)

where µjkl denotes a population mean for the jth treatment level, kth row, and lth column. The F statistics are:

F = [SSA/(p − 1)] / [SSWCELL/[p²(n − 1)]] = MSA/MSWCELL
F = [SSB/(p − 1)] / [SSWCELL/[p²(n − 1)]] = MSB/MSWCELL
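The defining property of a Latin square is easy to generate and check programmatically. A minimal sketch that builds a cyclic square matching the arrangement in Figure 2.10 and verifies that each treatment level appears exactly once in every row and column:

```python
# Build a p x p cyclic Latin square like the one in Figure 2.10 and verify
# the Latin square property: each treatment level appears exactly once in
# every row (levels of B) and every column (levels of C).
p = 3
square = [[(row + col) % p + 1 for col in range(p)] for row in range(p)]
for row in square:
    print(row)   # entries 1..3 correspond to treatment levels a1..a3

levels = set(range(1, p + 1))
assert all(set(row) == levels for row in square)                          # rows
assert all({square[r][c] for r in range(p)} == levels for c in range(p))  # columns
```

The cyclic construction generalizes to any p, which is one reason the design accommodates any treatment with p ≥ 2 levels.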
where SSA × B denotes the interaction sum of squares for treatments A and B. Three null hypotheses can be tested:

H0: µ1. = µ2. = · · · = µp. (treatment A population means are equal)
H0: µ.1 = µ.2 = · · · = µ.q (treatment B population means are equal)
H0: A × B interaction = 0 (treatments A and B do not interact),

where µjk denotes a population mean for the jth level of treatment A and the kth level of treatment B. The F statistics for testing the null hypotheses are as follows:

F = [SSA/(p − 1)] / [SSWCELL/[pq(n − 1)]] = MSA/MSWCELL
F = [SSB/(q − 1)] / [SSWCELL/[pq(n − 1)]] = MSB/MSWCELL
F = [SSA × B/[(p − 1)(q − 1)]] / [SSWCELL/[pq(n − 1)]] = MSA × B/MSWCELL.

The advantages of a completely randomized factorial design are as follows: (1) All participants are used in simultaneously evaluating the effects of two or more treatments. The effects of each treatment are evaluated with the same precision as if the entire experiment had been devoted to that treatment alone. Thus, the design permits efficient use of resources. (2) A researcher can determine whether the treatments interact. The disadvantages of the design are as follows: (1) If numerous treatments are included in the experiment, the number of participants required can be prohibitive. (2) A factorial design lacks simplicity in the interpretation of results if interaction effects are present. Unfortunately, interactions among variables in the behavioral sciences and education are common. (3) The use of a factorial design commits a researcher to a relatively large experiment. A series of small experiments permits greater freedom in pursuing unanticipated promising lines of investigation.

Randomized block factorial design

A two-treatment, randomized block factorial design is constructed by crossing p levels of one randomized block design with q levels of a second randomized block design. This procedure produces p × q treatment combinations: a1b1, a1b2, …, apbq. The design uses the blocking technique described in connection with a randomized block design to isolate variation attributable to a nuisance variable while simultaneously evaluating two or more treatments and associated interactions.

A two-treatment, randomized block factorial design has blocks of size pq. If a block consists of matched participants, n blocks of pq matched participants must be formed. The participants in each block are randomly assigned to the pq treatment combinations. Alternatively, if repeated measures are obtained, each participant is observed pq times. For this case, the order in which the treatment combinations are administered is randomized independently for each block, assuming that the nature of the treatments and research hypotheses permit this. The layout for the design with p = 2 and q = 2 treatment levels is shown in Figure 2.15.

The total sum of squares and total degrees of freedom for a randomized block factorial design are partitioned as follows:

SSTOTAL = SSBL + SSA + SSB + SSA × B + SSRESIDUAL
npq − 1 = (n − 1) + (p − 1) + (q − 1) + (p − 1)(q − 1) + (n − 1)(pq − 1).

Four null hypotheses can be tested:

H0: µ1.. = µ2.. = · · · = µn.. (block population means are equal)
H0: µ.1. = µ.2. = · · · = µ.p. (treatment A population means are equal)
H0: µ..1 = µ..2 = · · · = µ..q (treatment B population means are equal)
H0: A × B interaction = 0 (treatments A and B do not interact),

where µijk denotes a population mean for the ith block, jth level of treatment A, and kth level of treatment B.
Figure 2.15 Layout for a two-treatment, randomized block factorial design where four matched participants are randomly assigned to the pq = 2 × 2 = 4 treatment combinations in each block.
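The randomized block factorial partition above can be verified numerically. A minimal sketch with hypothetical scores (the data, and the choice of n = 3 blocks, are assumptions for illustration) for the pq = 4 treatment combinations of Figure 2.15:

```python
# Hypothetical scores for a randomized block factorial design with p = q = 2:
# each row is one of n = 3 blocks; columns are the treatment combinations
# a1b1, a1b2, a2b1, a2b2 (as in Figure 2.15).
data = [
    [12, 14, 15, 20],
    [10, 13, 14, 18],
    [11, 12, 16, 19],
]
n, p, q = len(data), 2, 2
N = n * p * q
grand = sum(map(sum, data)) / N

ssbl = p * q * sum((sum(row) / (p * q) - grand) ** 2 for row in data)
cell_means = [sum(col) / n for col in zip(*data)]        # a1b1, a1b2, a2b1, a2b2
a_means = [(cell_means[0] + cell_means[1]) / 2,          # a1
           (cell_means[2] + cell_means[3]) / 2]          # a2
b_means = [(cell_means[0] + cell_means[2]) / 2,          # b1
           (cell_means[1] + cell_means[3]) / 2]          # b2

ssa = n * q * sum((m - grand) ** 2 for m in a_means)
ssb = n * p * sum((m - grand) ** 2 for m in b_means)
ssab = n * sum((m - grand) ** 2 for m in cell_means) - ssa - ssb
sstotal = sum((y - grand) ** 2 for row in data for y in row)
ssres = sstotal - ssbl - ssa - ssb - ssab                # block x treatment residual

# Degrees-of-freedom partition:
# npq - 1 = (n-1) + (p-1) + (q-1) + (p-1)(q-1) + (n-1)(pq-1)
assert N - 1 == (n - 1) + (p - 1) + (q - 1) + (p - 1) * (q - 1) + (n - 1) * (p * q - 1)

F_A = (ssa / (p - 1)) / (ssres / ((n - 1) * (p * q - 1)))   # test of treatment A
print(round(ssa, 1), round(ssb, 1), round(ssab, 1), round(F_A, 2))
```

Note that MSRESIDUAL serves as the single error term for all three treatment tests, which is the property contrasted with the split-plot design below.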
where SSBL(A) denotes the sum of squares for blocks within treatment A. Three null hypotheses can be tested:

H0: µ1. = µ2. = · · · = µp. (treatment A population means are equal)
H0: µ.1 = µ.2 = · · · = µ.q (treatment B population means are equal)
H0: A × B interaction = 0 (treatments A and B do not interact),

where µijk denotes the ith block, jth level of treatment A, and kth level of treatment B. The F statistics are:

F = [SSA/(p − 1)] / [SSBL(A)/[p(n − 1)]] = MSA/MSBL(A)
F = [SSB/(q − 1)] / [SSRESIDUAL/[p(n − 1)(q − 1)]] = MSB/MSRESIDUAL
F = [SSA × B/[(p − 1)(q − 1)]] / [SSRESIDUAL/[p(n − 1)(q − 1)]] = MSA × B/MSRESIDUAL

Notice that the split-plot factorial design uses two error terms: MSBL(A) is used to test treatment A; a different and usually much smaller error term, MSRESIDUAL, is used to test treatment B and the A × B interaction. The statistic F = MSA/MSBL(A) is like the F statistic in a completely randomized design. Similarly, the statistic F = MSB/MSRESIDUAL is like the F statistic for treatment B in a randomized block design. Because MSRESIDUAL is generally smaller than MSBL(A), the power of the tests of treatment B and the A × B interaction is greater than that for treatment A. A randomized block factorial design, on the other hand, uses MSRESIDUAL to test all three null hypotheses. As a result, the power of the test of treatment A, for example, is the same as that for treatment B. When tests of treatments A and B and the A × B interaction are of equal interest, a randomized block factorial design is a better design choice than a split-plot factorial design. However, if the larger block size is not acceptable and a researcher is more interested in treatment B and the A × B interaction than in treatment A, a split-plot factorial design is a good design choice.

Alternatively, if a large block size is not acceptable and the researcher is primarily interested in treatments A and B, a confounded-factorial design is a good choice. This design, which is described next, achieves a reduction in block size by confounding groups of blocks with the A × B interaction. As a result, tests of treatments A and B are more powerful than the test of the A × B interaction.

Confounded factorial designs

Confounded factorial designs are constructed from either randomized block designs or Latin square designs. One of the simplest completely confounded factorial designs is constructed from two randomized block designs. The layout for the design with p = q = 2 is shown in Figure 2.17. The design confounds the A × B interaction with groups of blocks and thereby reduces the block size to two. The power of the tests of the two treatments is usually much greater than the power of the test of the A × B interaction. Hence, the completely confounded factorial design is a good design choice if a small block size is required and the researcher is primarily interested in tests of treatments A and B.

Fractional factorial design

Confounded factorial designs reduce the number of treatment combinations that appear in each block. Fractional factorial designs use treatment-interaction confounding to reduce the number of treatment combinations that appear in an experiment. For example, the number of treatment combinations in an experiment can be reduced to some fraction (1/2, 1/3, 1/4, 1/8, 1/9, and so on) of the total number of treatment combinations in an unconfounded factorial design. Unfortunately, a researcher pays a price for using only a fraction of the treatment combinations: ambiguity in interpreting the outcome of the experiment. Each sum of squares has two or
Figure 2.17 Layout for a two-treatment, randomized block, confounded factorial design.
The A × B interaction is confounded with groups. Treatments A and B are within-blocks
treatments.
more labels called aliases. For example, two labels for the same sum of squares might be treatment A and the B × C × D interaction. You may wonder why anyone would use such a design; after all, experiments are supposed to help us resolve ambiguity, not create it. Fractional factorial designs are typically used in exploratory research situations where a researcher is interested in six or more treatments and can perform follow-up experiments if necessary. The design enables a researcher to efficiently investigate a large number of treatments in an initial experiment, with subsequent experiments designed to focus on the most promising lines of investigation or to clarify the interpretation of the original analysis. The designs are used infrequently in the behavioral sciences and education.

HIERARCHICAL DESIGNS

of treatment B appears with only one level of treatment A. A hierarchical design has at least one nested treatment; the remaining treatments are either nested or crossed. Hierarchical designs are constructed from two or more completely randomized designs, randomized block designs, or a combination of the two. For example, a researcher who is interested in the effects of two types of radiation on learning in rats could assign 30 rats randomly to six cages. The six cages are then randomly assigned to the two types of radiation. The cages restrict the movement of the rats and ensure that all rats in a cage receive the same amount of radiation. In this example, the cages are nested in the two types of radiation. The layout for the design is shown in Figure 2.18.

There is a wide array of hierarchical designs. For a description of these designs, the reader is referred to the extensive treatment in Chapter 11 of Kirk (1995).
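The nested assignment in the rat example can be sketched in a few lines of code (the seed and rat IDs are illustrative): rats are randomized to cages, and cages, not individual rats, are randomized to radiation types, so each cage appears under exactly one treatment level.

```python
import random

# Sketch of the nested (hierarchical) assignment described above: 30 rats are
# randomly assigned to six cages (five per cage), and the six cages are then
# randomly assigned to the two types of radiation, so cages are nested in
# radiation type.
random.seed(7)
rats = list(range(1, 31))
random.shuffle(rats)
cages = [rats[i * 5:(i + 1) * 5] for i in range(6)]   # five rats per cage

cage_ids = list(range(6))
random.shuffle(cage_ids)
radiation = {"type 1": cage_ids[:3], "type 2": cage_ids[3:]}  # three cages per type

for rtype, ids in radiation.items():
    print(rtype, [sorted(cages[i]) for i in ids])
```

Because no cage is observed under both radiation types, cage effects cannot be separated from the cage-by-radiation interaction, which is exactly what distinguishes nesting from crossing.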
participants to treatment levels or to form blocks of matched participants. The advantage of statistical control is that it can be used after data collection has begun. Its disadvantage is that it involves a number of assumptions, such as a linear relationship between the dependent variable and the concomitant variable, that may prove untenable.

REFERENCES

Anderson, N.H. (2001) Empirical Direction in Design and Analysis. Mahwah, NJ: Lawrence Erlbaum Associates.
Dean, A. and Voss, D. (1999) Design and Analysis of Experiments. New York: Springer-Verlag.
Finney, D.J. (1945) 'The fractional replication of factorial arrangements', Annals of Eugenics, 12: 291–301.
Finney, D.J. (1946) 'Recent developments in the design of field experiments. III. Fractional replication', Journal of Agricultural Science, 36: 184–191.
Fisher, R.A. (1925) Statistical Methods for Research Workers. Edinburgh: Oliver and Boyd.
Fisher, R.A. (1935) The Design of Experiments. Edinburgh and London: Oliver and Boyd.
Giesbrecht, F.G. and Gumpertz, M.L. (2004) Planning, Construction, and Statistical Analysis of Comparative Experiments. Hoboken, NJ: Wiley.
Hacking, I. (1983) Representing and Intervening: Introductory Topics in the Philosophy of Natural Science. Cambridge, England: Cambridge University Press.
Kempthorne, O. (1947) 'A simple approach to confounding and fractional replication in factorial experiments', Biometrika, 34: 255–272.
Kirk, R.E. (1995) Experimental Design: Procedures for the Behavioral Sciences (3rd edn.). Pacific Grove, CA: Brooks/Cole.
Kirk, R.E. (2005) 'Analysis of variance: Classification', in B. Everitt and D. Howell (eds.), Encyclopedia of Statistics in Behavioral Science, 1: 66–83. New York: Wiley.
Masling, J. (1966) 'Role-related behavior of the subject and psychologist and its effects upon psychological data', in D. Levine (ed.), Nebraska Symposium on Motivation. Volume 14. Lincoln: University of Nebraska Press.
Maxwell, S.E. and Delaney, H.D. (2004) Designing Experiments and Analyzing Data (2nd edn.). Mahwah, NJ: Lawrence Erlbaum Associates.
Orne, M.T. (1962) 'On the social psychology of the psychological experiment: With particular reference to demand characteristics and their implications', American Psychologist, 17: 358–372.
Rosenberg, M.J. (1965) 'When dissonance fails: on eliminating evaluation apprehension from attitude measurement', Journal of Personality and Social Psychology, 1: 18–42.
Rosenthal, R. (1963) 'On the social psychology of the psychological experiment: The experimenter's hypothesis as an unintended determinant of experimental results', American Scientist, 51: 268–283.
Rosenthal, R. (1969) 'Interpersonal expectations: Effects of the experimenter's hypothesis', in R. Rosenthal and R.L. Rosnow (eds.), Artifact in Behavioral Research. New York: Academic Press. pp. 181–277.
Rosenthal, R. (1978) 'How often are our numbers wrong?', American Psychologist, 33: 1005–1008.
Ryan, T.P. (2007) Modern Experimental Design. Hoboken, NJ: Wiley.
Shadish, W.R., Cook, T.D., and Campbell, D.T. (2002) Experimental and Quasi-experimental Designs for Generalized Causal Inference. Boston: Houghton Mifflin.
Sheridan, C.L. (1976) Fundamentals of Experimental Psychology (2nd edn.). Fort Worth, TX: Holt, Rinehart and Winston.
Yates, F. (1937) 'The design and analysis of factorial experiments', Imperial Bureau of Soil Science Technical Communication No. 35, Harpenden, UK.