Babbie Ch8 Experiments
Experiments
Chapter Overview

An experiment is a mode of observation that [...] relationships. Many experiments in social research are conducted under the controlled conditions of a laboratory, but experimenters can also take advantage of natural occurrences to study the effects of events in the social world.

Introduction
Topics Appropriate to Experiments
The Classical Experiment
    Independent and Dependent Variables
    Pretesting and Posttesting
    Experimental and Control Groups
    The Double-Blind Experiment
Selecting Subjects
    Probability Sampling
    Randomization
    Matching
    Matching or Randomization?
Variations on Experimental Design
    Preexperimental Research Designs
    Validity Issues in Experimental Research
An Illustration of Experimentation
"Natural" Experiments
Strengths and Weaknesses of the Experimental Method
MAIN POINTS
KEY TERMS
REVIEW QUESTIONS AND EXERCISES
ADDITIONAL READINGS
RESOURCES ON THE INTERNET

Introduction

This chapter addresses the research method most commonly associated with structured science in general: the experiment. Here we'll focus on the experiment as a mode of scientific observation in social research. At base, experiments involve (1) taking action and (2) observing the consequences of that action. Social researchers typically select a group of subjects, do something to them, and observe the effect of what was done. In this chapter, we'll examine [...] scientific experiments.

It's worth noting at the outset that we often use experiments in nonscientific inquiry. In preparing a stew, for example, we add salt, taste, add more salt, and taste again. In defusing a bomb, we clip the red wire, observe whether the bomb explodes, clip another, and ....

We also experiment copiously in our attempts to develop generalized understandings about the world we live in. All skills are learned through experimentation: eating, walking, talking, riding a bicycle, swimming, and so forth. Through experimentation, students discover how much studying is required for academic success. Through experimentation, professors learn how much preparation is required for successful lectures. This chapter discusses how social researchers use experiments to develop generalized understandings. We'll see that, like other methods available to the social researcher, experimenting has its special strengths and weaknesses.

Topics Appropriate to Experiments

Experiments are more appropriate for some topics and research purposes than others. Experiments are especially well suited to research projects involving relatively limited and well-defined concepts and propositions. In terms of the traditional image of science, discussed earlier in this book, the experimental model is especially appropriate for hypothesis testing. Because experiments focus on determining causation, they're also better suited to explanatory than to descriptive purposes.

Let's assume, for example, that we want to discover ways of reducing prejudice against African Americans. We hypothesize that learning about the contribution of African Americans to U.S. history will reduce prejudice, and we decide to test this hypothesis experimentally. To begin, we might test a group of experimental subjects to determine their levels of prejudice against African Americans. Next, we might show them a documentary film depicting the many important ways African Americans have contributed to the scientific, literary, political, and [...] measure our subjects' levels of prejudice against African Americans to determine whether the film has actually reduced prejudice.

Experimentation has also been successful in the study of small group interaction. Thus, we might bring together a small group of experimental subjects and assign them a task, such as making recommendations for popularizing car pools. We observe, then, how the group organizes itself and deals with the problem. Over the course of several such experiments, we might systematically vary the nature of the task or the rewards for handling the task successfully. By observing differences in the way groups organize themselves and operate under these varying conditions, we can learn a great deal about the nature of small group interaction and the factors that influence it. For example, attorneys sometimes present evidence in different ways to different mock juries, to see which method is the most effective.

We typically think of experiments as being conducted in laboratories. Indeed, most of the examples in this chapter involve such a setting. This need not be the case, however. Social researchers often study what are called natural experiments: "experiments" that occur in the regular course of social events. The latter portion of this chapter deals with such research.

The Classical Experiment

In both the natural and the social sciences, the most conventional type of experiment involves three major pairs of components: (1) independent and dependent variables, (2) pretesting and posttesting,
and (3) experimental and control groups. This section looks at each of these components and the way they're put together in the execution of the experiment.

Independent and Dependent Variables

Essentially, an experiment examines the effect of an independent variable on a dependent variable. Typically, the independent variable takes the form of an experimental stimulus, which is either present or absent. That is, the stimulus is a dichotomous variable, having two attributes, present or not present. In this typical model, the experimenter compares what happens when the stimulus is present to what happens when it is not.

In the example concerning prejudice against African Americans, prejudice is the dependent variable and exposure to African-American history is the independent variable. The researcher's hypothesis suggests that prejudice depends, in part, on a lack of knowledge of African-American history. The purpose of the experiment is to test the validity of this hypothesis by presenting some subjects with an appropriate stimulus, such as a documentary film. In other terms, the independent variable is the cause and the dependent variable is the effect. Thus, we might say that watching the film caused a change in prejudice or that reduced prejudice was an effect of watching the film.

The independent and dependent variables appropriate to experimentation are nearly limitless. Moreover, a given variable might serve as an independent variable in one experiment and as a dependent variable in another. For example, prejudice is the dependent variable in our example, but it might be the independent variable in an experiment examining the effect of prejudice on voting behavior.

To be used in an experiment, both independent and dependent variables must be operationally defined. Such operational definitions might involve a variety of observation methods. Responses to a questionnaire, for example, might be the basis for defining prejudice. Speaking to or ignoring African Americans, or agreeing or disagreeing with them, might be elements in the operational definition of interaction with African Americans in a small group setting.

Conventionally, in the experimental model, dependent and independent variables must be operationally defined before the experiment begins. However, as you'll see in connection with survey research and other methods, it's sometimes appropriate to make a wide variety of observations during data collection and then determine the most useful operational definitions of variables during later analyses. Ultimately, however, experimentation, like other quantitative methods, requires specific and standardized measurements and observations.

Pretesting and Posttesting

In the simplest experimental design, subjects are measured in terms of a dependent variable (pretesting), exposed to a stimulus representing an independent variable, and then remeasured in terms of the dependent variable (posttesting). Any differences between the first and last measurements on the dependent variable are then attributed to the independent variable.

In the example of prejudice and exposure to African-American history, we'd begin by pretesting the extent of prejudice among our experimental subjects. Using a questionnaire asking about attitudes toward African Americans, for example, we could measure both the extent of prejudice exhibited by each individual subject and the average prejudice level of the whole group. After exposing the subjects to the African-American history film, we could administer the same questionnaire again. Responses given in this posttest would permit us to measure the later extent of prejudice for each subject and the average prejudice level of the group as a whole. If we discovered a lower level of prejudice during the second administration of the questionnaire, we might conclude that the film had indeed reduced prejudice.

In the experimental examination of attitudes such as prejudice, we face a special practical problem relating to validity. As you may already have imagined, the subjects may respond differently to the questionnaires the second time even if their attitudes remain unchanged. During the first administration of the questionnaire, the subjects may be unaware of its purpose. By the second measurement, they may have figured out that we're interested in measuring their prejudice. Because no one wishes to seem prejudiced, the subjects may "clean up" their answers the second time around. Thus, the film will seem to have reduced prejudice although, in fact, it has not.

This is an example of a more general problem that plagues many forms of social research: The very act of studying something may change it. The techniques for dealing with this problem in the context of experimentation will be discussed in various places throughout the chapter. The first technique involves the use of control groups.

pretesting: The measurement of a dependent variable among subjects.

posttesting: The remeasurement of a dependent variable among subjects after they've been exposed to an independent variable.

Experimental and Control Groups

Laboratory experiments seldom, if ever, involve only the observation of an experimental group to which a stimulus has been administered. In addition, the researchers also observe a control group, which does not receive the experimental stimulus.

experimental group: In experimentation, a group of subjects to whom an experimental stimulus is administered.

control group: In experimentation, a group of subjects to whom no experimental stimulus is administered and who should resemble the experimental group in all other respects. The comparison of the control group and the experimental group at the end of the experiment points to the effect of the experimental stimulus.

In the example of prejudice and African-American history, we might examine two groups of subjects. To begin, we give each group a questionnaire designed to measure their prejudice against African Americans. Then we show the film only to the experimental group. Finally, we administer a posttest of prejudice to both groups. Figure 8-1 illustrates this basic experimental design.

Using a control group allows the researcher to detect any effects of the experiment itself. If the posttest shows that the overall level of prejudice exhibited by the control group has dropped as much as that of the experimental group, then the apparent reduction in prejudice must be a function of the experiment or of some external factor rather than a function of the film. If, on the other hand, prejudice is reduced only in the experimental group, this reduction would seem to be a consequence of exposure to the film, because that's the only difference between the two groups. Alternatively, if prejudice is reduced in both groups but to a greater degree in the experimental group than in the control group, that, too, would be grounds for assuming that the film reduced prejudice.

The need for control groups in social research became clear in connection with a series of studies of employee satisfaction conducted by F. J. Roethlisberger and W. J. Dickson (1939) in the late 1920s and early 1930s. These two researchers were interested in discovering what changes in working conditions would improve employee satisfaction and productivity. To pursue this objective, they studied working conditions in the telephone "bank wiring room" of the Western Electric Works in the Chicago suburb of Hawthorne, Illinois.

To the researchers' great satisfaction, they discovered that making working conditions better increased satisfaction and productivity consistently. As the workroom was brightened up through better lighting, for example, productivity went up. When lighting was further improved, productivity went up again.

To further substantiate their scientific conclusion, the researchers then dimmed the lights. Whoops: productivity improved again!

At this point it became evident that the wiring-room workers were responding more to the attention given them by the researchers than to improved working conditions. As a result of this phenomenon, often called the Hawthorne effect, social researchers have become more sensitive to and cautious about the possible effects of experiments themselves. In
the wiring-room study, the use of a proper control group (one that was studied intensively without any other changes in the working conditions) would have pointed to the existence of this effect.

FIGURE 8-1
Diagram of Basic Experimental Design

            Experimental Group                         Control Group                   Comparison
Pretest     Measure dependent variable                 Measure dependent variable      Compare: Same?
Stimulus    Administer experimental stimulus (film)    (no stimulus)
Posttest    Remeasure dependent variable               Remeasure dependent variable    Compare: Different?

The need for control groups in experimentation has been nowhere more evident than in medical research. Time and again, patients who participate in medical experiments have appeared to improve, and it has been unclear how much of the improvement has come from the experimental treatment and how much from the experiment. In testing the effects of new drugs, then, medical researchers frequently administer a placebo (a "drug" with no relevant effect, such as sugar pills) to a control group. Thus, the control-group patients believe that they, like the experimental group, are receiving an experimental drug. Often, they improve. If the new drug is effective, however, those receiving the actual drug will improve more than those receiving the placebo.

In social scientific experiments, control groups guard against not only the effects of the experiments themselves but also the effects of any events outside the laboratory during the experiments. In the example of the study of prejudice, suppose that a popular African-American leader is assassinated in the middle of, say, a week-long experiment. Such an event may very well horrify the experimental subjects, requiring them to examine their own attitudes toward African Americans, with the result of reduced prejudice. Because such an effect should happen about equally for members of the control and experimental groups, a greater reduction of prejudice among the experimental group would, again, point to the impact of the experimental stimulus: the documentary film.

Sometimes an experimental design requires more than one experimental or control group. In the case of the documentary film, for example, we might also want to examine the impact of reading a book on African-American history. In that case, we might have one group see the film and read the book, another group only see the movie, still another group only read the book, and the control group do neither. With this kind of design, we could determine the impact of each stimulus separately, as well as their combined effect.

The Double-Blind Experiment

Like patients who improve when they merely think they're receiving a new drug, sometimes experimenters tend to prejudge results. In medical research, the experimenters may be more likely to "observe" improvements among patients receiving the experimental drug than among those receiving the placebo. (This would be most likely, perhaps, for the researcher who developed the drug.) A double-blind experiment eliminates this possibility, because in this design neither the subjects nor the experimenters know which is the experimental group and which is the control. In the medical case, those researchers who were responsible for administering the drug and for noting improvements would not be told which subjects were receiving the drug and which the placebo. Conversely, the researcher who knew which subjects were in which group would not administer the experiment.

In social scientific experiments, as in medical experiments, the danger of experimenter bias is further reduced to the extent that the operational definitions of the dependent variables are clear and precise. Thus, medical researchers would be less likely to unconsciously bias their reading of a patient's temperature than they would be to bias their assessment of how lethargic the patient was. For the same reason, the small group researcher would be less likely to misperceive which subject spoke, or to whom he or she spoke, than whether the subject's comments sounded cooperative or competitive, a more subjective judgment that's difficult to define in precise behavioral terms.

As I've indicated several times, seldom can we devise operational definitions and measurements that are wholly precise and unambiguous. This is another reason why it can be appropriate to employ a double-blind design in social research experiments.

double-blind experiment: An experimental design in which neither the subjects nor the experimenters know which is the experimental group and which is the control.

Selecting Subjects

In Chapter 7 we discussed the logic of sampling, which involves selecting a sample that is representative of some population. Similar considerations apply to experiments. Because most social researchers work in colleges and universities, it seems likely that research laboratory experiments would be conducted with college undergraduates as subjects. Typically, the experimenter asks students enrolled in his or her classes to participate in experiments or advertises for subjects in a college newspaper. Subjects may or may not be paid for participating in such experiments (recall also from Chapter 3 the ethical issues involved in asking students to participate in such studies).

In relation to the norm of generalizability in science, this tendency clearly represents a potential defect in social research. Simply put, college undergraduates are not typical of the public at large. There is a danger, therefore, that we may learn much about the attitudes and actions of college undergraduates but not about social attitudes and actions in general.

However, this potential defect is less significant in explanatory research than in descriptive research. True, having noted the level of prejudice among a group of college undergraduates in our pretesting, we would have little confidence that the same level existed among the public at large. On the other hand, if we found that a documentary film reduced whatever level of prejudice existed among those undergraduates, we would have more confidence, without being certain, that it would have a comparable effect in the community at large. Social processes and patterns of causal relationships appear to be more generalizable and more stable than specific characteristics such as an individual's level of prejudice.

Aside from the question of generalizability, the cardinal rule of subject selection in experimentation concerns the comparability of experimental and control groups. Ideally, the control group represents what the experimental group would be like if it had not been exposed to the experimental stimulus. The logic of experiments requires, therefore, that experimental and control groups be as similar as possible. There are several ways to accomplish this.
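One of the ways of making the two groups comparable is randomization, which this chapter defines as assigning experimental subjects to experimental and control groups randomly. The following is a minimal Python sketch of that idea, not a procedure taken from the text: the subject pool, its `age` and `pretest` fields, the group sizes, and the function names are all hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def randomize(subjects, seed=None):
    """Randomly split subjects into experimental and control groups.

    The pool is shuffled and cut in half, so every subject has the same
    chance of landing in either group.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def group_mean(group, key):
    """Average value of one characteristic within a group."""
    return mean(subject[key] for subject in group)


# Hypothetical pool of 200 subjects with made-up characteristics.
pool_rng = random.Random(0)
pool = [
    {"id": i, "age": pool_rng.randint(18, 25), "pretest": pool_rng.gauss(50, 10)}
    for i in range(200)
]

experimental, control = randomize(pool, seed=42)

# Comparability check: with a reasonably large pool, the group averages
# should be close to each other on characteristics we did not control.
for key in ("age", "pretest"):
    print(key, round(group_mean(experimental, key), 1), round(group_mean(control, key), 1))
```

Comparing group averages after assignment is only a rough check on comparability. It shows why a fairly large pool matters: with only a handful of subjects, chance alone can leave the two groups noticeably different.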
randomization: A technique for assigning experimental subjects to experimental and control groups randomly.

matching: In connection with experiments, the procedure whereby pairs of subjects are matched on the basis of their similarities on one or more variables, and one member of the pair is assigned to the experimental group and the other to the control group.

[...] efficiently achieved through the creation of a quota matrix constructed of all the most relevant characteristics. Figure 8-2 provides a simplified illustration of such a matrix. In this example, the experimenter has decided that the relevant characteristics are race, age, and gender. Ideally, the quota matrix is constructed to result in an even number of subjects in each cell of the matrix. Then, half the subjects in [...]

[...] gender composition, the same racial composition, and so forth. This test of comparability should be used whether the two groups are created through probability sampling or through randomization.

Thus far I have referred to the "relevant" variables without saying clearly what those variables are. Of course, these variables cannot be specified in any definite way, any more than I could specify [...]

[...] to assure ourselves that the two groups exhibit the same overall level of prejudice.

Matching or Randomization?

When assigning subjects to the experimental and control groups, you should be aware of two arguments in favor of randomization over matching. First, you may not be in a position to know in advance which variables will be relevant for the matching process. Second, most of the statistics used to analyze the results of experiments assume randomization. Failure to design your experiment that way, then, makes your later use of those statistics less meaningful.

On the other hand, randomization only makes sense if you have a fairly large pool of subjects, so that the laws of probability sampling apply. With only a few subjects, matching would be a better procedure.

Sometimes researchers can combine matching and randomization. When conducting an experiment on the educational enrichment of young adolescents, for example, J. Milton Yinger and his colleagues (1977) needed to assign a large number of students, aged 13 and 14, to several different experimental and control groups to ensure the comparability of students composing each of the groups. They achieved this goal by the following method.

Beginning with a pool of subjects, the researchers first created strata of students nearly identical to one another in terms of some 15 variables. From each of the strata, students were randomly assigned to the different experimental and control groups. In this fashion, the researchers actually improved on conventional randomization. Essentially, they had used a stratified sampling procedure (Chapter 7), except that they had employed far more stratification variables than are typically used in, say, survey sampling.

Thus far I've described the classical experiment, the experimental design that best represents the logic of causal analysis in the laboratory. In practice, however, social researchers use a great variety of experimental designs. Let's look at some now.

Preexperimental Research Designs

To begin, Campbell and Stanley discuss three "preexperimental" designs, not to recommend them but because they're frequently used in less-than-professional research. These designs are called "preexperimental" to indicate that they do not meet the scientific standards of experimental designs. In the first such design, the one-shot case study, the researcher measures a single group of subjects on a dependent variable following the administration of some experimental stimulus. Suppose, for example, that we show the African-American history film mentioned earlier to a group of people and then administer a questionnaire that seems to measure prejudice against African Americans. Suppose further that the answers given to the questionnaire seem to represent a low level of prejudice. We might be tempted to conclude that the film reduced prejudice. Lacking a pretest, however, we can't be sure. Perhaps the questionnaire doesn't really represent a very sensitive measure of prejudice, or perhaps the group we're studying was low in prejudice to begin with. In either case, the film might have made no difference, though our experimental results might have misled us into thinking it did.

The second preexperimental design discussed by Campbell and Stanley adds a pretest for the experimental group but lacks a control group. This design, which the authors call the one-group pretest-posttest design, suffers from the possibility that some factor other than the independent variable might cause a change between the pretest and posttest results, such as the assassination of a respected African-American leader. Thus, although we can see that prejudice has been reduced, we can't be sure that it was the film that caused that reduction.

To round out the possibilities for preexperimental designs, Campbell and Stanley point out that some research is based on experimental and control [...]

FIGURE 8-3
Three Preexperimental Research Designs
(Diagram, with a time axis marked Time 1, Time 2, Time 3.)
One-Shot Case Study: A man who exercises is observed to be in trim shape. Comparison: some intuitive standard of what constitutes a trim shape.
One-Group Pretest-Posttest Design: An overweight man who exercises is later observed to be in trim shape.
Static-Group Comparison: A man who exercises is observed to be in trim shape while one who doesn't is observed to be overweight.

[...] have no way of knowing that the two groups had [...]

[...] review the three preexperimental designs in this [...]
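The weakness of the one-group pretest-posttest design can be illustrated with a little arithmetic. The sketch below is only an illustration under stated assumptions: the prejudice scores are made up, and the function names are hypothetical. It compares the change the one-group design would attribute to the film with the change that remains once a control group's change (for example, the effect of an outside event) is subtracted.

```python
from statistics import mean


def mean_change(pre, post):
    """Average posttest-minus-pretest change for one group."""
    return mean(after - before for before, after in zip(pre, post))


def compare_designs(exp_pre, exp_post, ctl_pre, ctl_post):
    """Contrast the one-group estimate with the control-group estimate."""
    exp_change = mean_change(exp_pre, exp_post)
    ctl_change = mean_change(ctl_pre, ctl_post)
    return {
        # What a one-group pretest-posttest design would credit to the film.
        "one_group_estimate": exp_change,
        # Change that occurred without the film (history, testing, and so on).
        "control_change": ctl_change,
        # Change left over after removing what happened to everyone.
        "controlled_estimate": exp_change - ctl_change,
    }


# Made-up prejudice scores (higher means more prejudiced).
exp_pre, exp_post = [60, 55, 70, 65], [50, 47, 58, 55]
ctl_pre, ctl_post = [62, 54, 68, 66], [58, 50, 64, 62]

result = compare_designs(exp_pre, exp_post, ctl_pre, ctl_post)
print(result)
```

In these invented numbers the experimental group's scores fall by 10 points on average, but the control group's scores fall by 4 without seeing the film. Only the remaining 6 points can reasonably be credited to the stimulus, and the one-group design has no way to make that subtraction.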
[...] beginning to exercise. Or perhaps he became thin for some other reason, like eating less or getting sick. The observations shown in the diagram do not guard against these other possibilities. Moreover, the observation that the man in the diagram is in trim shape depends on our intuitive idea of what constitutes trim and overweight body shapes. All told, this is very weak evidence for testing the relationship between exercise and weight loss.

The one-group pretest-posttest design offers somewhat better evidence that exercise produces weight loss. Specifically, we have ruled out the possibility that the man was thin before beginning to exercise. However, we still have no assurance that it was his exercising that caused him to lose weight.

Finally, the static-group comparison eliminates the problem of our questionable definition of what constitutes trim or overweight body shapes. In this case, we can compare the shapes of the man who exercises and the one who does not. This design, however, reopens the possibility that the man who exercises was thin to begin with.

Validity Issues in Experimental Research

At this point I want to present in a more systematic way the factors that affect the validity of experimental research. First we'll look at what Campbell and Stanley call the sources of internal invalidity, reviewed and expanded in a follow-up book by Thomas Cook and Donald Campbell (1979). Then we'll consider the problem of generalizing experimental results to the "real" world, referred to as external invalidity. Having examined these, we'll be in a position to appreciate the advantages of some of the more sophisticated experimental and quasi-experimental designs social science researchers sometimes use.

Sources of Internal Invalidity

The problem of internal invalidity refers to the possibility that the conclusions drawn from experimental [...] gone on in the experiment itself. The threat of internal invalidity is present whenever anything other than the experimental stimulus can affect the dependent variable.

Campbell and Stanley (1963:5-6) and Cook and Campbell (1979:51-55) point to several sources of internal invalidity. Here are twelve:

1. History. During the course of the experiment, historical events may occur that will confound the experimental results. The assassination of an African-American leader during the course of an experiment on reducing anti-African-American prejudice is one example; the arrest of an African-American leader for some heinous crime, which might increase prejudice, is another.

2. Maturation. People are continually growing and changing, and such changes can affect the results of the experiment. In a long-term experiment, the fact that the subjects grow older (and wiser?) may have an effect. In shorter experiments, they may grow tired, sleepy, bored, or hungry, or change in other ways that affect their behavior in the experiment.

3. Testing. As we have seen, often the process of testing and retesting influences people's behavior, thereby confounding the experimental results. Suppose we administer a questionnaire to a group as a way of measuring their prejudice. Then we administer an experimental stimulus and remeasure their prejudice. By the time we conduct the posttest, the subjects will probably have become more sensitive to the issue of prejudice and will be more thoughtful in their answers. In fact, they may have figured out that we're trying to find out how prejudiced they are, and, because few people like to appear prejudiced, they may give answers that they think we want or that will make them look good.

4. Instrumentation. The process of measurement in pretesting and posttesting brings in some of the issues of conceptualization and operationalization discussed earlier in the book. If we use different measures of the dependent variable in the pretest and [...] standards or their abilities may change over the course of the experiment.

5. Statistical regression. Sometimes it's appropriate to conduct experiments on subjects who start out with extreme scores on the dependent variable. If you were testing a new method for teaching math to hardcore failures in math, you'd want to conduct your experiment on people who previously have done extremely poorly in math. But consider for a minute what's likely to happen to the math achievement of such people over time without any experimental interference. They're starting out so low that they can only stay at the bottom or improve: They can't get worse. Even without any experimental stimulus, then, the group as a whole is likely to show some improvement over time. Referring to a regression to the mean, statisticians often point out that extremely tall people as a group are likely to have children shorter than themselves, and extremely short people as a group are likely to have children taller than themselves. There is a danger, then, that changes occurring by virtue of subjects starting out in extreme positions will be attributed erroneously to the effects of the experimental stimulus.

6. Selection biases. We discussed selection bias earlier when we examined different ways of selecting subjects for experiments and assigning them to experimental and control groups. Comparisons don't have any meaning unless the groups are comparable at the start of an experiment.

7. Experimental mortality. Although some social experiments could, I suppose, kill subjects, experimental mortality refers to a more general and less extreme problem. Often, experimental subjects will drop out of the experiment before it's completed, and this can affect statistical comparisons and conclusions. In the classical experiment involving an experimental and a control group, each with a pretest and posttest, suppose that the bigots in the experimental group are so offended by the African-American history film that they tell the experimenter to forget it, and they leave. Those subjects [...]

[...] experimental stimulus and the dependent variable can arise. Whenever this occurs, the research conclusion that the stimulus caused the dependent variable can be challenged with the explanation that the "dependent" variable actually caused changes in the stimulus.

9. Diffusion or imitation of treatments. When experimental and control-group subjects can communicate with each other, experimental subjects may pass on some elements of the experimental stimulus to the control group. For example, suppose there's a lapse of time between our showing of the African-American history film and the posttest administration of the questionnaire. Members of the experimental group might tell control-group subjects about the film. In that case, the control group becomes affected by the stimulus and is not a real control. Sometimes we speak of the control group as having been "contaminated."

10. Compensation. As you'll see in Chapter 12, in experiments in real-life situations, such as a special educational program, subjects in the control group are often deprived of something considered to be of value. In such cases, there may be pressures to offer some form of compensation. For example, hospital staff might feel sorry for control-group patients and give them extra "tender loving care." In such a situation, the control group is no longer a genuine control group.

11. Compensatory rivalry. In real-life experiments, the subjects deprived of the experimental stimulus may try to compensate for the missing stimulus by working harder. Suppose an experimental math program is the experimental stimulus; the control group may work harder than before on their math in an attempt to beat the "special" experimental subjects.

12. Demoralization. On the other hand, feelings of deprivation within the control group may result in their giving up. In educational experiments, demoralized control-group subjects may stop studying, act up, or get angry.
mental results may not accurately reflect what has posttest (say, different questionnaires about preju-
sticking around for the posttest will have been less These, then, are some of the sources of internal
dice), how can we be sure they're comparable to
prejudiced to start with, so the group results will invalidity in experiments. Aware of these, experi-
each other? Perhaps prejudice will seem to decrease
Internal invalidity Refers to the possibility that the reflect a substantial "decrease" in prejudice. menters have devised designs aimed at handling
simply because the pretest measure was more sen-
conclusions drawn from experimental results may not
8. Causal time order. Though rare in social research, them. The classical experiment, if coupled with
accurately reflect what went onn in the experiment itself. sitive than the posttest measure. Or if the measure-
ments are being made by the experimenters, their ambiguity about the ti me order of the experimen- proper subject selection and assignment, addresses
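The experimental mortality scenario in item 7 can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: the 0-100 prejudice scale, the normal distribution of scores, and the dropout cutoff of 70 are all hypothetical, and the stimulus is deliberately given no effect at all, so any apparent "decrease" comes from attrition alone.

```python
import random
import statistics


def simulate_mortality(n=1000, dropout_cutoff=70, seed=0):
    """Simulate an experimental group whose most prejudiced members quit.

    Assumption: the stimulus has NO real effect, so each subject's
    posttest score equals his or her pretest score.
    """
    rng = random.Random(seed)
    # Hypothetical prejudice scores, clipped to a 0-100 scale.
    pretest = [min(100.0, max(0.0, rng.gauss(50, 15))) for _ in range(n)]
    # Subjects scoring above the cutoff leave before the posttest.
    stayers = [score for score in pretest if score <= dropout_cutoff]
    return {
        "pretest_mean_all": statistics.mean(pretest),
        "pretest_mean_stayers": statistics.mean(stayers),
        "posttest_mean_stayers": statistics.mean(stayers),
        "dropouts": n - len(stayers),
    }


result = simulate_mortality()
# Naive comparison: everyone's pretest versus the stayers' posttest.
naive_change = result["posttest_mean_stayers"] - result["pretest_mean_all"]
# Fairer comparison: the same people before and after.
matched_change = result["posttest_mean_stayers"] - result["pretest_mean_stayers"]
print(f"dropouts: {result['dropouts']}")
print(f"naive change: {naive_change:.2f}, matched change: {matched_change:.2f}")
```

The naive comparison shows a drop in mean prejudice even though no individual changed, whereas comparing only the subjects who completed both tests shows no change.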
[Figure 8-5: the Solomon four-group design. Surviving labels: Group 4, posttest; horizontal axis: time.]
trol group.

This design also guards against the problem of history in that anything occurring outside the experiment that might affect the experimental group should also affect the control group. Consequently, there should still be a difference in the two posttest results. The same comparison guards against problems of maturation as long as the subjects have been randomly assigned to the two groups. Testing and instrumentation can't be problems, because both the experimental and control groups are subject to the same tests and experimenter effects. If the subjects have been assigned to the two groups randomly, statistical regression should affect both

even easier to manage.

The remaining five problems of internal invalidity are avoided through the careful administration of a controlled experimental design. The experimental design we've been discussing facilitates the clear specification of independent and dependent variables. Experimental and control subjects can be kept separate, reducing the possibility of diffusion or imitation of treatments. Administrative controls can avoid compensations given to the control group, and compensatory rivalry can be watched for and taken into account in evaluating the results of the experiment, as can the problem of demoralization.

periment, do they really tell us anything about life in the wilds of society?

Campbell and Stanley describe four forms of this problem; I'll present one as an illustration. The generalizability of experimental findings is jeopardized, as the authors point out, if there's an interaction between the testing situation and the experimental stimulus (1963:18). Here's an example of what they mean.

Staying with the study of prejudice and the African-American history film, let's suppose that our experimental group, in the classical experiment, has less prejudice in its posttest than in its pretest and that its posttest shows less prejudice than that

External invalidity: Refers to the possibility that conclusions drawn from experimental results may not be generalizable to the "real" world.

experimental design cannot control for that possibility. Fortunately, experimenters have devised other designs that can.

The Solomon four-group design (D. Campbell and Stanley 1963:24-25) addresses the problem of testing interaction with the stimulus. As the name suggests, it involves four groups of subjects, assigned randomly from a pool. Figure 8-5 presents this design graphically.
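As a rough illustration of the Solomon four-group design, the sketch below deals a pool of subjects randomly into four groups and then computes the four mean differences the design calls for. The function names, the score labels such as `g1_pre`, and the prejudice scores are all made up for the example; a real analysis would also apply a statistical test rather than compare raw means.

```python
import random
import statistics


def assign_four_groups(subjects, seed=0):
    """Shuffle the pool, then deal subjects into Groups 1-4 in turn."""
    rng = random.Random(seed)
    pool = list(subjects)
    rng.shuffle(pool)
    return {group: pool[group - 1::4] for group in (1, 2, 3, 4)}


def solomon_comparisons(scores):
    """Return the four mean differences the design is built to examine.

    `scores` maps hypothetical labels such as "g1_pre" to lists of
    prejudice scores (higher means more prejudiced).
    """
    m = {label: statistics.mean(values) for label, values in scores.items()}
    return {
        # Finding 1: Group 1 posttest below its pretest (expect < 0).
        "g1_post_minus_g1_pre": m["g1_post"] - m["g1_pre"],
        # Finding 2: Group 2 pretest and posttest about equal (expect ~ 0).
        "g2_post_minus_g2_pre": m["g2_post"] - m["g2_pre"],
        # Finding 3: Group 1 posttest below Group 2 posttest (expect < 0).
        "g1_post_minus_g2_post": m["g1_post"] - m["g2_post"],
        # Finding 4: Group 3 posttest below Group 4 posttest (expect < 0).
        "g3_post_minus_g4_post": m["g3_post"] - m["g4_post"],
    }


groups = assign_four_groups(range(40))

# Made-up scores in which the film lowers prejudice by about 10 points.
example_scores = {
    "g1_pre": [60, 55, 65], "g1_post": [50, 45, 55],
    "g2_pre": [60, 56, 64], "g2_post": [60, 56, 64],
    "g3_post": [51, 46, 56], "g4_post": [61, 55, 64],
}
differences = solomon_comparisons(example_scores)
for name, value in differences.items():
    print(name, value)
```

Groups 3 and 4 are never pretested, so a negative difference between their posttests cannot be produced by an interaction between testing and the stimulus.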
Notice that Groups 1 and 2 in Figure 8-5 compose the classical experiment, with Group 2 being the control group. Group 3 is administered the experimental stimulus without a pretest, and Group 4 is only posttested. This experimental design permits four meaningful comparisons, which are described in the figure. If the African-American history film really reduces prejudice (unaccounted for by the problem of internal validity and unaccounted for by an interaction between the testing and the stimulus), we should expect four findings:

1. In Group 1, posttest prejudice should be less than pretest prejudice.
2. The Group 2 pretest and posttest should show the same degree of prejudice.
3. There should be less prejudice evident in the Group 1 posttest than in the Group 2 posttest.
4. The Group 3 posttest should show less prejudice than the Group 4 posttest.

Notice that findings (3) and (4) rule out any interaction between the testing and the stimulus. And remember that these comparisons are meaningful only if subjects have been assigned randomly to the different groups, thereby providing groups of equal prejudice initially, even though their preexperimental prejudice is only measured in Groups 1 and 2.

There is a side benefit to this research design, as the authors point out. Not only does the Solomon four-group design rule out interactions between testing and the stimulus, it also provides data for comparisons that will reveal how much of this interaction has occurred in a classical experiment. This knowledge allows a researcher to review and evaluate the value of any prior research that used the simpler design.

The last experimental design I'll mention here is what Campbell and Stanley (1963:25-26) call the posttest-only control group design; it consists of the second half (Groups 3 and 4) of the Solomon design. As the authors argue persuasively, with proper randomization, only Groups 3 and 4 are needed for a true experiment that controls for the problems of internal invalidity as well as for the interaction between testing and stimulus. With randomized assignment to experimental and control groups (which distinguishes this design from the static-group comparison discussed earlier), the subjects will be initially comparable on the dependent variable, comparable enough to satisfy the conventional statistical tests used to evaluate the results, so it's not necessary to measure them. Indeed, Campbell and Stanley suggest that the only justification for pretesting in this situation is tradition. Experimenters have simply grown accustomed to pretesting and feel more secure with research designs that include it. Be clear, however, that this point applies only to experiments in which subjects have been assigned to experimental and control groups randomly, because that's what justifies the assumption that the groups are equivalent without actually measuring them to find out.

This discussion has introduced the intricacies of experimental design, its problems, and some solutions. There are, of course, a great many other experimental designs in use. Some involve more than one stimulus and combinations of stimuli. Others involve several tests of the dependent variable over time and the administration of the stimulus at different times for different groups. If you're interested in pursuing this topic, you might look at the Campbell and Stanley book.

An Illustration of Experimentation

Experiments have been used to study a wide variety of topics in the social sciences. Some experiments have been conducted within laboratory situations; others occur out in the "real world." The following discussion provides a glimpse of both.

Let's begin with a "real world" example. In George Bernard Shaw's well-loved play, Pygmalion (the basis of the long-running Broadway musical My Fair Lady), Eliza Doolittle speaks of the powers others have in determining our social identity. Here's how she distinguishes the way she's treated by her tutor, Professor Higgins, and by Higgins's friend, Colonel Pickering:

    You see, really and truly, apart from the things anyone can pick up (the dressing and the proper way of speaking, and so on), the difference between a lady and a flower girl is not how she behaves, but how she's treated. I shall always be a flower girl to Professor Higgins, because he always treats me as a flower girl, and always will, but I know I can be a lady to you, because you always treat me as a lady, and always will.
    (Act V)

The sentiment Eliza expresses here is basic social science, addressed more formally by sociologists such as Charles Horton Cooley (the "looking-glass self") and George Herbert Mead ("the generalized other"). The basic point is that who we think we are (our self-concept) and how we behave are largely a function of how others see and treat us. Related to this, the way others perceive us is largely conditioned by expectations they have in advance. If they've been told we're stupid, for example, they're likely to see us that way, and we may come to see ourselves that way and actually act stupidly. This phenomenon has generally been called the Pygmalion effect, and it's nicely suited to controlled experiments.

In one of the best-known experimental investigations of the Pygmalion effect, Robert Rosenthal and Lenore Jacobson (1968) administered what they called a "Harvard Test of Inflected Acquisition" to students in a West Coast school. Subsequently, they met with the students' teachers to present the results of the test. In particular, Rosenthal and Jacobson identified certain students as very likely to exhibit a sudden spurt in academic abilities during the coming year, based on the results of the test.

When IQ test scores were compared later, the researchers' predictions proved accurate. The students identified as "spurters" far exceeded their classmates during the following year, suggesting that the predictive test was a powerful one. In fact, the test was a hoax! The researchers had made their predictions randomly among both good and poor students. What they told the teachers did not really reflect students' test scores at all. The progress made by the "spurters" was simply a result of the teachers expecting the improvement and paying more attention to those students, encouraging them, and rewarding them for achievements. (Notice the similarity between this situation and the Hawthorne effect discussed earlier in this chapter.)

The Rosenthal-Jacobson study attracted a great deal of popular as well as scientific attention. Subsequent experiments have focused on specific aspects of what has become known as the attribution process, or the expectations communication model. This research, largely conducted by psychologists, parallels research primarily by sociologists, which takes a slightly different focus and is often gathered under the label expectations-states theory. Psychological studies focus on situations in which the expectations of a dominant individual affect the performance of subordinates, as in the case of a teacher and students, or a boss and employees. The sociological research has tended to focus more on the role of expectations among equals in small, task-oriented groups. In a jury, for example, how do jurors initially evaluate each other, and how do those initial assessments affect their later interactions? (You can learn more about this phenomenon, including attempts to find practical applications, by searching the Web for "The Pygmalion Effect.")

Here's an example of an experiment conducted to examine the way our perceptions of our abilities and those of others affect our willingness to accept the other person's ideas. Martha Foschi, G. Keith Warriner, and Stephen Hart (1985) were particularly interested in the role "standards" play in that respect:

    In general terms, by "standards" we mean how well or how poorly a person has to perform in order for an ability to be attributed or denied him/her. In our view, standards are a key variable affecting how evaluations are processed and what expectations result. For example, depending on the standards used, the same level of success may be interpreted as a major accomplishment or dismissed as unimportant.
    (1985:108-9)

To begin examining the role of standards, the researchers designed an experiment involving four experimental groups and a control. Subjects were told that the experiment involved something called "pattern recognition ability," which was an innate ability some people had and others didn't. The researchers said subjects would be working in pairs on pattern recognition problems.

In fact, of course, there's no such thing as pattern recognition ability. The object of the experiment
was to determine how information about this supposed ability affected subjects' subsequent behavior.

The first stage of the experiment was to "test" each subject's pattern recognition abilities. If you had been a subject in the experiment, you would have been shown a geometrical pattern for 8 seconds, followed by two more patterns, each of which

1. You are definitely better at pattern recognition than your partner.
2. You are possibly better than your partner.
3. You are possibly worse than your partner.
4. You are definitely worse than your partner.

The control group for this experiment was told nothing about their own abilities or their partners'. In other words, they had no expectations.

The final step in the experiment was to set the

In more detailed analyses, it was found that the same basic pattern held for both men and women, though it was somewhat clearer for women than for men. Here are the actual data:

Mean Number of Switches
Control group       7.95
Possibly worse      9.23
Definitely worse    9.28

ral function.

Imagine, for example, that a hurricane has struck a particular town. Some residents of the town suffer severe financial damages, and others escape relatively lightly. What, we might ask, are the behavioral consequences of suffering a natural disaster? Are those who suffer most more likely to take precautions against future disasters than are those who suffer least? To answer these questions, we might interview residents of the town some time after the hurricane. We might question them

The foundation of this study was a survey of the people who had been working at Three Mile Island on March 28, 1979, when the cooling system failed in the number 2 reactor and began melting the uranium core. The survey was conducted five to six months after the accident. Among other things, the survey questionnaire measured workers' attitudes toward working at nuclear power plants. If they had measured only the TMI workers' attitudes after the accident, the researchers would have had no idea whether attitudes had changed as

The TMI example points to both the special problems involved in natural experiments and the possibility for taking those problems into account. Social research generally requires ingenuity and insight; natural experiments call for a little more than the average.

Earlier in this chapter, we used a hypothetical example of studying whether an African-American history film reduced prejudice. Sandra Ball-

evaluation research involves taking the logic of experimentation into the field to observe and evaluate the effects of stimuli in real life. Because this is an increasingly important form of social research, an entire chapter is devoted to it.

Strengths and Weaknesses

as much of a problem, of course, for natural experiments as for those conducted in the laboratory.

In discussing several of the sources of internal and external invalidity mentioned by Campbell, Stanley, and Cook, we saw that we can create experimental designs that logically control such problems. This possibility points to one of the great advantages of experiments: They lend themselves to a logical rigor that is often much more difficult to

dom assignment of subjects guards against each of these 12 sources of internal invalidity.
• Experiments also face problems of external invalidity: Experimental findings may not reflect real life.
• The interaction of testing and stimulus is an example of external invalidity that the classical experiment does not guard against.
• The Solomon four-group design and other variations on the classical experiment can safeguard against external invalidity.
• Campbell and Stanley suggest that, given proper randomization in the assignment of subjects to the experimental and control groups, there is no need for pretesting in experiments.
• Natural experiments often occur in the course of social life in the real world, and social researchers can implement them in somewhat the same way they would design and conduct laboratory experiments.
• Like all research methods, experiments have strengths and weaknesses. Their primary weakness is artificiality: What happens in an experiment may not reflect what happens in the outside world. Their strengths include the isolation of the independent variable, which permits causal inferences; the relative ease of replication; and scientific rigor.

KEY TERMS

The following terms are defined in context in the chapter and at the bottom of the page where the term is introduced, as well as in the comprehensive glossary at the back of the book.

pretesting
posttesting
experimental group
control group
double-blind experiment
randomization
matching
internal invalidity
external invalidity

REVIEW QUESTIONS AND EXERCISES

1. In the library or on the Web, locate a research report of an experiment. Identify the dependent variable and the stimulus.
2. Pick 6 of the 12 sources of internal invalidity discussed in this chapter and make up examples (not discussed in the chapter) to illustrate each.
3. Create a hypothetical experimental design that illustrates one of the problems of external invalidity.
4. Think of a recent natural disaster you've witnessed or read about. Frame a research question that might be studied by treating that disaster as a natural experiment. In two or three paragraphs, outline how the study might be done.
5. In this chapter, we looked briefly at the problem of "placebo effects." On the Web, find a study in which the placebo effect figured importantly. Write a brief report on the study, including the source of your information. (Hint: you might want to do a search on "placebo.")

ADDITIONAL READINGS

Campbell, Donald, and Julian Stanley. 1963. Experimental and Quasi-Experimental Designs for Research. Chicago: Rand McNally. An excellent analysis of the logic and methods of experimentation in social research. This book is especially useful in its application of the logic of experiments to other social research methods. Though fairly old, this book has attained the status of a classic and is still frequently cited.

Cook, Thomas D., and Donald T. Campbell. 1979. Quasi-Experimentation: Design and Analysis Issues for Field Settings. Chicago: Rand McNally. An expanded and updated version of Campbell and Stanley.

Jones, Stephen R. G. 1990. "Worker Independence and Output: The Hawthorne Studies Reevaluated." American Sociological Review 55:176-90. This article reviews these classical studies and questions the traditional interpretation (which was presented in this chapter).

Martin, David W. 1996. Doing Psychology Experiments. 4th ed. Monterey, CA: Brooks/Cole. With thorough explanations of the logic behind research methods, often in a humorous style, this book emphasizes ideas of particular importance to the beginning researcher, such as getting an idea for an experiment or reviewing the literature.

Ray, William J. 2000. Methods toward a Science of Behavior and Experience. 6th ed. Belmont, CA: Wadsworth. A comprehensive examination of social science research methods, with a special emphasis on experimentation. This book is especially strong in the philosophy of science.

RESOURCES ON THE INTERNET

VIRTUAL SOCIETY'S COMPANION WEB SITE FOR THE PRACTICE OF SOCIAL RESEARCH, 10TH EDITION

http://www.wadsworth.com/sociology

Once at the Virtual Society, click on "Find Companion Sites" from the left navigation bar, click on "Research Methods and Statistics," and then click on your book cover. On the companion site, you will find useful learning resources for your course. Some of those resources include Tutorial Quizzes with feedback, Internet Exercises, Flashcards, and Chapter Tutorials for every chapter, as well as Extended Projects, Social Research in Cyberspace, and Primers for using various data analysis software such as SPSS and NVivo.

WEB LINKS FOR THIS CHAPTER

Please realize that the Internet is an evolving entity, subject to change. Nevertheless, these few Web sites should be fairly stable.

Yahoo! Directory: Tests and Experiments
http://dir.yahoo.com/Social_Science/Psychology/Research/Tests-and-Experiments/
Here you'll find an extensive list of Web sites relating to various kinds of social science experiments. Some tell you about past or ongoing experiments, and some invite you to participate, either as a subject or as an experimenter.

William D. Hacker Social Science Experimental Laboratory
http://ssel.caltech.edu/
This laboratory at the California Institute of Technology gives students the opportunity to participate as subjects in experiments online, for pay.

Stanford Prison Experiment
http://www.prisonexp.org/
This Web site provides a slide show relating a famous social science experiment that reveals some of the problems that can occur in this type of study.

INFOTRAC COLLEGE EDITION

http://www.infotrac-college.com/wadsworth/access.html

Access the latest news and journal articles with InfoTrac College Edition, an easy-to-use online database of reliable, full-length articles from hundreds of top academic journals. Conduct an electronic search using the following terms:

Double-blind experiment
Experiment AND control group
Experiment AND matching
Experiment AND placebo
Experiment AND randomization
Experiment AND stimulus
Mock jury
Natural experiment
Pretest AND posttest