
BARBARA C. PERDUE and JOHN O. SUMMERS*

The authors discuss several issues in the timing, construction, and analysis of manipulation and confounding checks in marketing experiments. A review of 34 experiments involving latent independent variables reported in the Journal of Marketing Research over the past decade suggests that most researchers are familiar with the concept of manipulation checks but few systematically evaluate potential sources of confounding in experimental manipulations. Three alternative approaches for assessing the construct validity of experimental manipulations also are discussed.


Checking the Success of Manipulations in Marketing Experiments

*Barbara C. Perdue is Assistant Professor of Marketing, University of Georgia. John O. Summers is Professor of Marketing, Indiana University. The authors acknowledge the many helpful suggestions of three anonymous JMR reviewers.

Journal of Marketing Research, Vol. XXIII (November 1986), 317-26

The identification of cause and effect relationships is the raison d'être of experimentation. In experimental research the investigator attempts to discover the causal relationship between two variables by (1) the manipulation (systematic variation) of the independent variable (also referred to as a factor or a treatment variable) and (2) the subsequent measurement of the dependent variable. If differences are observed in the dependent variable, the investigator would like to conclude that the independent variable of interest was responsible. Random assignment of the test units to treatment conditions certainly facilitates causal interpretation by eliminating potential systematic differences across treatment conditions due to extraneous factors associated with characteristics of the test units (Keppel 1982). What if, however, the manipulations themselves are confounded (i.e., manipulations that are meant to represent a particular independent variable can be interpreted plausibly in terms of more than one construct, each at the same level of reduction)? In such a situation confidence in the investigator's causal explanation (expressed in theoretical terms) of the experimental results is greatly reduced because the construct validity of the manipulations as operationalizations of the intended independent variables would be questionable.

Cook and Campbell (1979, p. 60) have noted that fitting the constructs of interest begins with a "careful preexperimental explication of constructs so that the definitions are clear and in conformity with public understanding of the words being used." Failure to adhere to this principle can lead to the manipulation of the wrong "thing" and/or a redefining of the independent variable to match whatever the manipulation appears to represent. For example, source credibility has been defined as a function of both expertise and trustworthiness (Hovland, Irving, and Kelley 1953). Without a preexperimental recognition of this definition, the researcher may conclude that source credibility has been manipulated successfully when the treatment groups differ significantly on perceptions of the source's expertise and attractiveness but not on perceptions of the source's trustworthiness.

When all of the independent variables to be manipulated in a particular experiment are concrete, observable variables (e.g., price and/or advertising expenditures in dollars, color, number of sales calls), it is relatively simple to confirm that the independent variables were manipulated as intended. In addition, inadvertent confounding of the manipulations often can be avoided by maintaining ceteris paribus conditions across treatments. However, experimental studies in marketing frequently involve "higher order," unobservable independent variables (e.g., perceptions of a salesperson's expertise, fear arousal, group cohesiveness). Because latent variables such as these cannot be altered directly, the researcher attempts to manipulate them indirectly by changing selected aspects of the subject's environment. Unfortunately, it is rarely safe to assume that the operations used
to manipulate psychological and sociological variables will represent the precise concepts the researcher has in mind (cf. Fromkin and Streufert 1976). It is therefore usually prudent for the researcher to perform manipulation checks for these types of independent variables. Festinger (1953) was among the first to emphasize the systematic use of manipulation checks; however, there is evidence the notion was used in experimental research at least as early as the 1930s (cf. Farnsworth and Misumi 1931; Saadi and Farnsworth 1934). It seems likely that the concept developed from a number of independent sources.

Ideally, the experimenter would like to be able to demonstrate that (1) the treatment manipulations are related to "direct" measures of the latent variables they were designed to alter and (2) the manipulations did not produce changes in measures of related but different constructs. The first condition relates to Cook and Campbell's (1979) idea of assessing the convergence of measures and manipulations of the same "thing" and is the most widely reported form of manipulation check. For example, in an experiment on the effects of a salesperson's expertise on purchasing agents' buying intentions, the researcher might be expected to determine whether subjects in the "high expertise" condition rate the salesperson as more knowledgeable about various aspects of the presumed area of expertise than do those in the "low expertise" condition. This procedure is often referred to as assessing the "take" of the independent variable and serves to establish a sort of convergent validity (or lack thereof) between the manipulation and an effort to measure directly the independent variable of interest. The latter condition incorporates Cook and Campbell's suggestion of testing for a divergence of measures and manipulations of related but distinct "things." For example, in the preceding selling experiment, it might be desirable for the experimenter to investigate whether the manipulation designed to alter expertise inadvertently varied perceptions of the salesperson's trustworthiness or friendliness rather than, or in addition to, expertise. This special type of manipulation check, which serves to assess discriminant validity, is referred to as a confounding check (Wetzel 1977). Of course, this procedure allows increased confidence in the labeling of the manipulation as an operationalization of a particular construct only when these checks measure the most plausible rival interpretations of what other constructs the manipulation might be varying. In summary, both types of checks can provide important evidence about the construct validity of the putative cause (i.e., the independent variable) in an experiment.

Manipulation and confounding checks also can be used to investigate the plausibility of demand characteristics. In a study of the effects of source credibility, Dholakia and Sternthal (1977) used manipulation checks for trustworthiness and expertise (the dimensions of credibility) along with confounding checks for attractiveness, dynamic/not dynamic, and aggressiveness to investigate the validity of their manipulation. Only the treatment effects for their manipulation checks were statistically significant. This finding suggested that source credibility was manipulated successfully and that the manipulation was not contaminated by any of the three confounding variables considered. Later, in a reply to Stanley (1978), Sternthal and Dholakia (1978) argued that Stanley's demand characteristic explanation for their experimental results would also predict similar treatment effects for both the manipulation and confounding checks. Because this did not occur, they rejected the demand characteristics explanation.

Finally, reliable manipulation checks are required for conducting internal analyses (e.g., within-cell analyses of the relationships among the independent, confounding, and dependent variable measures). The case for a particular interpretation of the experimental results often can be strengthened by this form of analysis. Biehal and Chakravarti (1986) provide an excellent example of this approach in their study of consumers' use of memory and external information in choice behavior. Of interest was whether subjects used prior brand evaluations to guide choice instead of attempting to retrieve specific brand-attribute information as suggested by the authors' accessibility explanation. To resolve this issue they first determined which subjects in their high and low memory accessibility conditions attempted to retrieve target brand feature information and which were successful. Then they related the successful retrieval of this information to choice of the target brand. The case for the accessibility explanation was supported by the findings that (1) most subjects within each condition attempted to retrieve the target brand-attribute information and (2) successful retrieval was associated strongly with choice of the target brand within each condition.

A review of 34 experiments reported in the Journal of Marketing Research from 1975 to 1984 and involving latent independent variables was conducted to gain a perspective on the use of manipulation and confounding checks in marketing experiments.¹ That many of these articles contain a separate section on manipulation checks and the majority report an attempt to demonstrate a convergence of the manipulation with an independent effort to measure directly the independent variable suggests most researchers in marketing are aware of manipulation checks. However, that only two of these articles report the use of confounding checks, together with the tendency for authors to provide insufficient information for the reader to determine exactly when and how their manipulation and confounding checks were conducted, raises doubts about the adequacy with which these checks are being conducted in marketing experiments.

¹The theory on which a study was based, the hypotheses to be tested, and statements about what was being intentionally varied by the manipulations were used to determine whether a particular study was manipulating latent variables. All studies that could be identified as involving at least one latent independent variable were included in the analysis. A tabular summary of the results is available from the authors.

In the following sections we discuss several issues pertaining to the appropriate timing, construction, analysis, and reporting of manipulation and confounding checks. Though our focus is mainly on checks related to manipulations of latent independent variables, most of the issues discussed are also relevant to checks on hypothesized intervening processes that have a critical role in the theory being tested. As Aronson and Carlsmith (1968, p. 17) suggest, "To get a better idea of whether some conceptual variable is producing the observed results, one can sometimes measure the intervening processes involved." In this context, they use the term "measure" to refer to an attempt to obtain "some direct indication" of whether the hypothesized process is actually occurring. For example, mental states that are believed to be a function of the process of interest often can be assessed.

TIMING OF MANIPULATION AND CONFOUNDING CHECKS

Pretest Versus the Main Experiment

Manipulation and confounding checks appear to have their greatest value during the pretest and/or pilot-testing phases of an experiment when an inadequately designed manipulation can still be modified and the main experiment saved (Aronson and Carlsmith 1968; Wetzel 1977).² Furthermore, the cost to the researcher associated with a negative result at this stage (i.e., the time and effort involved in refining the manipulation and running an additional pretest or pilot test) is relatively small. In contrast, the investigator has much to lose should manipulation and/or confounding checks in the main experiment reflect unfavorably on the manipulation. Furthermore, as we discuss subsequently, including these checks in the main experiment can present problems independent of whether they come before or after the dependent variable measures. Fortunately, extensive testing of the manipulations in the pretest and/or pilot-testing phases will lessen the need for manipulation and confounding checks in the main experiment. Of course, when one is extrapolating manipulation and confounding check results from a pilot test, it is important that this testing be conducted with the same procedures, experimental instruments, subject types, etc., as the main experiment. In summary, it seems advisable that the major experiment be run only after the checks included in a pilot test have suggested the manipulation is successful.

²The term "pretest" is used to refer to those activities designed to assess the appropriateness of selected parts of the experimental procedures and/or instruments. The term "pilot test" applies to those procedures involved in exposing subjects to the total experimental experience under conditions like those of the main experiment, with the possible exception of the measurement of the dependent variables.

Pretest and pilot test subjects should be interviewed immediately after exposure to the manipulation (Aronson and Carlsmith 1968). Waiting until after the dependent variables have been assessed may reduce the subjects' abilities to describe fully their reactions to the manipulation and could bias their reports. For example, in certain cases the desired mental or emotional state (e.g., anxiety, source credibility) to be produced by a particular manipulation may be very temporary and/or might be altered by the process associated with measuring the dependent variables.

Usually, the experimenter's initial design of the manipulation will require several revisions before the main experiment is run. In addition to signaling that something is wrong, an effective pretest should suggest corrective changes to be made in the manipulation. This is particularly important early in the pretesting stage when substantial revisions are most likely to occur. Aronson and Carlsmith's (1968) suggestion that long, probing interviews be conducted with subjects after their exposure to the manipulation seems appropriate at this point in the study. In addition, experimenters might want to consider alternative qualitative research techniques such as collecting and analyzing concurrent verbal protocols, particularly when the theory being tested postulates the occurrence of an intervening process. These methods can give the investigator a better perspective of how, if at all, the manipulation is operating to create the desired variance in the intended independent variable and/or permit the investigation of any theoretically important intervening processes.

Perhaps more importantly, the preceding methods can contribute to the identification of potential confounding variables before the main experiment is conducted (if an additional highly plausible confounding variable is identified after the main experiment has been run, the researcher might also find it useful to perform a confounding check on a separate pool of subjects). However, that so few of the experimental studies reviewed included the systematic use of confounding checks suggests many experimenters may not be sufficiently sensitive to these types of problems. Notable exceptions to this observation are studies conducted by Holbrook (1978) and Miniard and Cohen (1979). In an experiment designed to study the effects of the "factualness/evaluativeness" of a persuasive message on beliefs and affect, Holbrook (1978) assessed the "perceived message favorability" (a plausible confounding variable) as well as the factualness/evaluativeness of his "advertising-like messages." The manipulation checks for factualness/evaluativeness but not the confounding checks for message favorability were statistically significant.

Miniard and Cohen (1979) used an experimental method to examine aspects of the construct validity of the Fishbein behavioral intentions model. In addition to including multiple checks within their main experiment to assess the validity of their manipulations of "manipulative intent" and "agree-disagree" (i.e., the consistency between the hypothetical person's attitude and her belief about what the referent thinks the person should do), the authors used separate subject pools to investigate possible confounding variables. One such group was used to demonstrate that the agree-disagree manipulation did not inadvertently affect the perceived attractiveness of the referent.

In spite of the apparent advantages of establishing the credibility of the manipulation during the pretest and/or pilot-testing phases, authors reported doing so in only a third of the articles reviewed. However, it seems likely that such tests may be underreported because of space limitations imposed on authors.

Checks Within the Main Experiment

Potential problems. In reference to the inclusion of manipulation and confounding checks in the main experiment, Wetzel (1977, p. 89) states, "One of the cardinal rules of experimentation is to measure the major dependent variables first." This position is based on the potential for these checks, particularly those involving self-reports or other forms of obtrusive measurement, to introduce demand characteristics when they precede the dependent variable measures. However, measuring the dependent variables before conducting an assessment of the success of the manipulation presents its own set of potentially serious problems. First, as noted before, the manipulation may cause only a temporary change in the level of the independent variable and/or any confounding variables. Hence, when the manipulation and confounding checks follow the dependent measures, important effects of the manipulation may already have dissipated substantially. Also troublesome is the possibility that the subjects' own responses to the dependent measures may bias their reactions to the subsequent manipulation and confounding checks, particularly when these checks involve self-report measures (Kidd 1976). However, the possibility that the process of measuring the dependent variables affected the later assessment of the success of the manipulations seems less plausible whenever both manipulation and confounding checks are used and only the manipulation checks demonstrate significant treatment effects. The extent to which the experimenter can exclude such an explanation under the conditions described will depend on the degree to which the prior assessment of the dependent variables is expected to affect the confounding checks at least as much as the manipulation checks. Unfortunately, as noted before, use of confounding checks has been very limited in marketing experiments.

Alternative solutions. At least three approaches have been proposed for coping with the potential problems in the ordering of manipulation and confounding checks and dependent variable measures. First, Kidd (1976) suggests the creation of manipulation check groups (one for each treatment condition) whose sole purpose is the assessment of the success of the manipulation. The dependent variables are not assessed in the manipulation check groups, and the manipulation and confounding checks are omitted from the experimental groups. This approach avoids the potential for one set of measurements (i.e., those associated with either the checks or the dependent variables) to bias the other at the cost of increasing the number of test units required.

Though Kidd (1976) criticizes a counterbalancing approach, in which half of the subjects receive the checks before the dependent measures and the other half afterward, as not eliminating the problem but only permitting its detection, it is not clear that his position is justified. The manipulation check group approach is identical to counterbalancing with the last set of measurements (i.e., the checks or the dependent measures, depending on which come last) deleted. Because a measurement cannot bias responses to other measures that precede it, one can achieve the same benefits with the counterbalancing approach as with manipulation check groups by merely ignoring the final set of measurements when counterbalancing is used. Furthermore, when the results obtained from counterbalancing suggest there are no order-bias problems, one can combine the two sets of groups for the purpose of testing the theoretical hypotheses of interest, thereby doubling the effective sample size and increasing the power of the related statistical tests. In this regard, the researcher should compare the means and variances of all measures (those relating to both the checks and the dependent variables), as well as the intercorrelations among these measures, across the two sets of treatment groups.

Wetzel (1977) proposed that the subjects in a final pilot test serve the function of Kidd's (1976) manipulation check groups. Implicit in this suggestion is that the subjects, manipulations, and environmental setting of this pilot test be as similar as possible to those of the main experiment. This alternative is consistent with the position that the main experiment should be conducted only after checks in a pilot test have suggested that the manipulation is successful and no additional changes in the experimental procedures appear to be required. The total number of subjects required may be smaller than in the preceding two approaches, but only if the final pilot test would have been conducted for reasons other than the need to conduct manipulation and confounding checks. One potentially important disadvantage of running a final pilot test that includes the manipulation checks is the difficulty of ensuring the equivalence of the pilot test and the main experiment on subject and setting factors. The final pilot test typically is run before the main experiment, and subjects usually are not assigned randomly to the two groups.

The studies reviewed failed to reveal a general sensitivity to the timing of manipulation and confounding checks involving self-reports. In the majority of cases it was not possible to determine whether the checks or the dependent measures were taken first. Furthermore, in three experiments the manipulation checks were reported as preceding part or all of the measures for the dependent
variables. None of the authors reported using either the manipulation check group or counterbalancing techniques suggested by Kidd (1976), and it was not evident that any of the experimenters consciously attempted to employ Wetzel's (1977) "pilot-test adaptation" of Kidd's manipulation check group approach.

SCALES FOR THE MANIPULATION AND CONFOUNDING CHECKS

The measures for the manipulation and confounding checks should be constructed with the same care as those for the dependent variables. It is important to establish that the manipulation produced a large enough variance in the intended independent variable to provide for a meaningful test of the hypotheses of interest. Furthermore, Aronson and Carlsmith (1968, p. 46) contend that ". . . it is extremely important for all subjects to be brought to the identical point by the manipulation. . . ." An assessment of the degree to which these conditions are met in a particular experiment obviously requires a valid measure of the independent variable of interest. Furthermore, valid measures of plausible confounding variables are needed to permit a clear evaluation of whether and to what extent the manipulation is confounded. The finding of nonsignificant results for the confounding checks contributes little, if anything, to the credibility of a manipulation when the measures used are of questionable validity.

The construction of manipulation and confounding checks as well as measures for the dependent variables in an experiment involves basically the same set of general issues as the measurement of any concept (see Churchill 1979 for an extensive discussion of the various procedures for developing better measures of marketing constructs). However, when the data used to assess the quality of these measures are collected solely from experimental subjects (i.e., those individuals exposed to the experimental treatments) rather than from a random sample of individuals from some well-defined population (as in a survey), some unique problems arise, particularly in the interpretability of reliability estimates and the meaningfulness of estimated factor structures. These problems derive from the fact that the "across-treatments" covariance matrix for the items included in these measures will necessarily be a function of the strength of the manipulations designed by the experimenter.

With respect to reliability, strong (weak) manipulations will tend to produce high (low) reliability estimates. As Peter (1979, p. 7) observes, the reliability coefficient "is nothing more than the ratio of true variance to observed variance." A strong (weak) manipulation will produce a large (small) true variance for the associated manipulation check while, in most cases, not materially affecting the error variance, thus producing a high (low) reliability coefficient. Hence, "reliability estimates" based on the preceding covariance matrix should not be interpreted as reflecting the inherent properties of the related measure.

Whenever a large portion of the variance in a particular manipulation check is "explained" by the related manipulation, the interpretability issue decreases in importance. In this case it is apparent that the manipulation check was sufficiently reliable to detect that a meaningful variance in the intended independent variable was achieved. However, the use of "reliability estimates," based on data from experimental subjects, outside the context of the particular study in which these estimates were derived seems highly questionable.

It is when the experimenter fails to find a satisfactory convergence between a manipulation and its associated manipulation check that the reliability issue becomes most prominent. In particular, one may have difficulty determining whether this lack of convergence is due to a weak manipulation, an "unreliable manipulation" (i.e., one for which the within-cell variance of the intended independent variable is high), an unreliable manipulation check measure, or some combination of these factors (note that the within-cell variance of a manipulation check measure will be a function of both the reliability of the manipulation and the reliability of the manipulation check). Hence, it will be difficult to assess the relative impact of these three factors without an estimate of the reliability of the manipulation check that is independent of the strength and reliability of the manipulation itself.

A factor analysis of the across-treatments covariance matrix seems likely to produce misleading results because the covariances will be a function of the strength of the experimental manipulations. One can choose instead to factor the pooled within-treatments covariance matrix. This approach avoids allowing the manipulations to affect the factor structure and therefore appears to be a more defensible procedure. Calder and Sternthal (1980), in a study of television commercial wearout, successfully analyzed the factor structure of their dependent measures using this within-treatments approach. It should be noted, however, that the success of this approach depends on the presence of a substantial amount of within-cell variance on the underlying constructs of interest. When the within-cell covariance matrix is dominated by random measurement error the results will tend to be unstable and difficult to interpret.

As noted before, the use of self-reports for both the checks and the dependent variables can result in biased responses to whichever measures are taken last. Furthermore, Aronson and Carlsmith (1968, p. 50) suggest that ". . . too often subjects are unable or unwilling to explain just what the effects of some manipulation have been," and suggest that ". . . the best solution is to observe some other behavior which we expect to covary directly with our theoretical variable and see whether it does. . . ." Most frequently, this can be done best in a pretest or pilot test, particularly if these measures require some special intervention by the experimenter. Whenever such behavioral checks are included in the main experiment, it seems important that the behaviors in question occur naturally as a function of the manipulation itself and that the observation be unobtrusive. Subjects can use their memories of their past behaviors as a basis for their responses to the dependent measures and are most likely to do so when those behaviors are made salient (cf. Sherman et al. 1978). Though a few experimenters have been able to develop such behavioral measures for their particular studies (cf. Reingen and Kernan 1977), most apparently find this a difficult if not impossible task, and self-reports are the dominant method for assessing the "take" of the independent variable in marketing experiments.

ANALYSIS OF MANIPULATION AND CONFOUNDING CHECKS

Conducting a statistical test to determine whether the manipulation had either no effect or the wrong effect on an independent variable can be relatively straightforward for single-factor designs. However, when multiple factors are involved, directional t-tests and/or one-way ANOVA followed by multiple contrasts may not be sufficient for adequately analyzing the manipulation and confounding checks. As Wetzel (1977, p. 88) observes,

    However, in some situations we may be able to falsify a manipulation in the sense of demonstrating that the manipulation was properly carried out. For example, in a 2 × 2 (AB) design, manipulation A must be independent of manipulation B. If the researcher discovers that manipulation A significantly influences the manipulation checks for independent variable B, then the null hypothesis that the two manipulations are independent can be rejected, and manipulation A can be considered to be falsified.

The practical significance of Wetzel's remarks is that they indicate an adequate analysis of a manipulation check for a given factor (manipulation) within a multiple-factor design requires the use of the full-factorial ANOVA model whenever it is plausible that one manipulation may have inadvertently affected an independent variable associated with a different manipulation. Furthermore, researchers must be concerned with the statistical significance of all main and interaction effects, not just those involving the factor corresponding to the manipulation check measure being analyzed. A statistically significant main effect for the manipulation (factor) corresponding to the manipulation check being analyzed provides evidence in favor of the convergent validity of that particular manipulation. To the extent that other main and/or interaction effects are statistically significant, the discriminant validity of the associated manipulations becomes suspect. Ideally, only one effect, the main effect of the factor (manipulation) of interest, will be statistically significant. If effects associated with other manipulations prove to be statistically significant, these manipulations will have been "falsified" in the sense that they have not had their intended effects. That is, these manipulations have had an effect on an independent variable other than the one they were individually intended to manipulate, and the independent variables, as manipulated, will not be orthogonal. Under these conditions, misleading results can occur when the researcher analyzes the dependent measures with the ANOVA model corresponding to the planned orthogonal research design.

Whenever confounding checks are analyzed, the only condition strictly favorable to the construct validity of the manipulations is that of statistically nonsignificant results for all main and interaction effects. Any significant main or interaction effects for the confounding checks reflect negatively on the discriminant validity of the associated manipulation(s). However, even when confounding is present, it may still be possible in some cases to detect whether the intended independent variable has had its hypothesized directional effects, for example, when the pattern of the confounding cannot plausibly explain the results for the dependent measures.
Consider an experiment in which videotaped sales pre-
sentations are used to manipulate the perceived expertise
(factor A) and friendliness (factor B) of a salesperson in
'Though Bagozzi's (1977) structural equation approach to analyzing a (2 x 2) AB factorial design. A confounding check for
experimental data enables the experimenter both to assess error in the
measures of the independent and dependent variables and to estimate trustworthiness is included along with manipulation checks
the relationships among these variables within the context of a single for both expertise and friendliness. If when analyzing the
comprehensive model, it has yet to be widely adopted by marketing manipulation check for expertise (A) one finds statisti-
researchers. Only two of the studies reviewed here (Bearden and Shimp cally significant results for the main effect of Band/or
1982; Churchill and Surprenant 1982) utilized structural equation the AB interaction, one must conclude that the friendli-
models. Two factors may partially account for this situation. First,
Bagozzi's approach requires that multiple manipulation checks and ness manipulation had an unintended effect on perceived
multiple measures of the dependent variables be taken on the same expertise. In this case, the treatment cells labeled "high
subjects during the main experiment, which can create methodological expertise" would not contain the same level of expertise.
problems. Furthermore, researchers often find it difficult to develop In the analysis of the confounding check for trustwor-
a single reliable and valid measure for each of their variables, much
less multiple measures. Second, many, if not most, experimental stud-
thiness, a statistically significant main effect for A (B)
ies in marketing involve factorial designs, and interactions between would suggest that the manipulation for expertise
factors are difficult to handle with USREL (Joreskog and Sorbom (friendliness) is confounded, and a significant AB inter-
1981), a particularly popular program for analyzing structural models. action would suggest a more complex confounding of
Neither of the studies reviewed that used structural equation models both manipulations.
included interaction effects in those models. ANDYA appears likely
to remain the dominant approach for the analysis of intervally scaled It is important to demonstrate not only that the in-
dependent variables in experimental studies for the immediate future tended effects of the manipulations did occur, but also
and much of our discussion implicitly assumes its use. that these effects are of sufficient magnitude to provide
for a meaningful test of the hypotheses of interest. What constitutes an acceptably large intended effect for a given manipulation check will depend on such factors as the desired or predicted effect size for the dependent variable and the expected strength of the relationship between the independent and dependent variables. Furthermore, if the significance tests suggest that the manipulations are confounded, the researcher should evaluate whether the degree of confounding present is serious enough to impair an unambiguous evaluation of the results of the main experiment.

In the context of the aforementioned ANOVA model, an appropriate indicator of effect size might be ω² (Sawyer and Ball 1981; Sawyer and Peter 1983).4 Omega squared represents the proportion of variance in the "dependent" variable (in this case, the particular manipulation or confounding check measure being analyzed) accounted for by a given main or interaction effect.5 Ideally, for any given manipulation check measure, the experimenter would like to find a sufficiently large ω² associated with the main effect of the factor (manipulation) corresponding to the manipulation check measure being analyzed and a near-zero ω² for each of the other main and interaction effects.6 For any given confounding check measure, the desired result is that the ω²'s for all main and interaction effects be close to zero.

Miniard and Cohen (1979) provide an excellent example of the proper analysis of manipulation checks. They used the appropriate ANOVA model in analyzing each of several manipulation checks and compared the size of the intended and unintended effects within each of their analyses whenever the latter were statistically significant (also see Baker and Churchill 1977; Burnkrant and Howard 1984).

If, in analyzing the results of the manipulation check for A, the experimenter obtains a statistically significant main effect for A but an insufficiently high associated ω², the revision of A should be contemplated if the main experiment has not yet been conducted. However, it is also possible that the manipulation check measure itself is confounded and/or unreliable. If the experiment already has been completed and A's manipulation check demonstrated to be reliable, some form of internal analysis might be conducted if there is sufficient within-cell variance; however, causal statements are no longer warranted (Aronson and Carlsmith 1968).

When in the analysis of the manipulation check for A the effect sizes for B and AB are much smaller than that for A, their statistical significance probably should not be of great concern. However, whenever the total of the ω²'s for B and AB is of the same order of magnitude as the ω² for A, analyzing the dependent measures in the main experiment with the ANOVA model corresponding to the planned orthogonal design is likely to produce distorted results (e.g., unexpected interaction effects that are difficult to explain in theoretical terms). In this situation a statistical model that does not assume orthogonality (e.g., some form of the general linear model) might be used to analyze the dependent variable measures.

When, in comparison with the independent variables of interest, a confounding variable is considered to have a substantially greater theoretical impact on the primary dependent variable, even a small effect size for the corresponding check measure can indicate serious problems with confounding. Conversely, moderate effect sizes for the confounding checks can be tolerated when the corresponding confounding variables are thought to have only a very slight impact on the dependent variable. However, checks seem most likely to be included for those potentially confounding variables that are thought to have a moderate to strong causal link to the primary dependent variable. In those cases where the experimenter is uncertain about the relative theoretical impact of the confounding and independent variables, an internal analysis may help resolve the issue when within-cell variance is moderate to high and the manipulation and confounding checks have high reliability.

If the confounding checks suggest that the manipulations are seriously confounded, it may be difficult to separate satisfactorily the main and interaction effects of the intended independent variables from those of the confounding variables. However, the researcher might wish to test whether the manipulation checks explain a substantial additional portion of the variance in the dependent measures beyond the part that can be accounted for by the confounding checks alone. The models comparison approach (cf. Chapter 2 of Green 1978) is one method that might be used for this purpose. However, analyses of this type are likely to be very effective only when the confounding checks are reliable and moderately correlated with the manipulation checks. Even in this case, the results should be interpreted with caution because the manipulations designed to alter the intended independent variables also produce changes in the confounding variables.

Another simpler, though perhaps not as demanding, test of the confounding variables involves entering the confounding checks as covariates in the ANOVA model used to analyze the dependent measures. Burnkrant and Howard (1984) used this approach successfully in investigating whether the effects of their grammatical form manipulation on "total thoughts" could be explained by the reduced confidence (the confounding variable) detected in their "question" condition. Because the confidence covariate was nonsignificant in this analysis and the statistical significance of the main effect of the grammatical form manipulation was not noticeably affected, they rejected the confidence explanation.

4 It would also be desirable to examine the distribution of manipulation check scores within treatments to ensure that what appears to be a "successful" manipulation is not primarily due to a few subjects showing a very large effect of the treatment.
5 Omega squared will be a function of the reliability of the measure employed, the heterogeneity of the subjects, and the heterogeneity of the treatment implementation as well as the differences between cell means (LaTour 1981).
6 When this is the case, ω² for the intended effect will be roughly equivalent to the partial ω² for this same effect (Keren and Lewis 1979).

JOURNAL OF MARKETING RESEARCH, NOVEMBER 1986

ARE MANIPULATION AND CONFOUNDING CHECKS NECESSARY?

The use of manipulation and confounding checks is only one of several methods for assessing the construct validity of the experimental manipulations. Given the potential problems with conducting these checks within the main experiment, it is worthwhile to consider the available options. One alternative approach involves including multiple dependent variables in the experimental study. Should the pattern of results be completely consistent with the broad theory underlying the study, one might claim that evidence for the construct validity of the manipulation has been provided by demonstrating a degree of nomological validity. As Campbell (1960, p. 547) suggests, nomological validity relates to "the possibility of validating tests by using the scores from a test as interpretations of a certain term in a formal theoretical network and, through this, to generate predictions which would be validating if confirmed when interpreted as still other operations and scores" (also see Peter 1981; Cronbach and Meehl 1955). However, the strength of this claim would depend on the absence of other plausible rival interpretations of the manipulation, as well as demand artifacts, which might also explain the results. Furthermore, when the expected pattern of results is not fully achieved, the experimenter is in the awkward position of having to decide whether some part of the theory (e.g., the relationship between the intended independent variable and a particular dependent variable), the manipulation itself, or some other aspect of the experimental procedures (e.g., the dependent variable measures) is at fault. Without manipulation and confounding checks, finding a convincing solution to this potential dilemma would be difficult.

Much of the analysis performed by Miniard and Cohen (1979) involved tests of theoretical linkages among the independent variables they attempted to manipulate and other constructs associated with the Fishbein behavioral intentions model. Therefore their study could be interpreted as employing the preceding "nomological validity" approach.

A second possibility involves the use of what Garner, Hake, and Eriksen (1956, p. 150) refer to as "converging operations," that is, "any set of two or more experimental operations which allow the selection or elimination of alternative hypotheses or concepts which could explain an experimental result." Their approach is not unlike the "technique of purification" proposed by Aronson and Carlsmith (1968). This alternative represents an application of Platt's (1964) "strong inference" approach in that it suggests the systematic devising of alternative hypotheses and the development and execution of crucial experiments. It therefore seems appropriate for investigating hypothesized intervening processes as well as the construct validity of the manipulation as an operationalization of the intended independent variable. The practical value of this method is heavily dependent on the number of plausible alternative hypotheses as well as the experimenter's ability to develop credible converging operations.

The study by Kisielius and Sternthal (1984) on the effects of vividness might be interpreted as employing convergent operations. In a pilot study the authors found that their verbal message condition produced more favorable product judgments than their picture condition. Though these results could be interpreted in availability-valence terms, other interpretations (e.g., that information unique to the picture condition caused the judgment effect) were plausible. The authors conducted a series of three experiments to rule out alternative explanations and/or to otherwise strengthen and clarify their availability-valence explanation. For example, the first of these experiments was designed to distinguish between two alternative explanations for the initial results, (1) that the effects were "due to the fact that cognitive elaboration in response to the pictures reduced persuasion" and (2) that the effects were "due to intertreatment differences in stimulus information."

A third alternative, suggested by Kidd (1976) among others, is to rely on the emergence of convergent findings from a large number of replications of an experiment to strengthen the case for a particular interpretation of the manipulation. When the design of the manipulation varies in important ways across experiments, different sets of plausible interpretations will be relevant for the various studies. If a particular alternative interpretation of the manipulation is reasonable in some studies but not in others, it may be possible to eliminate this interpretation for the experimental results in much the same manner as if convergent operations had been purposely employed.

In their review of dissonance theory, Brehm and Cohen (1962, p. 311) discuss the significance of the convergent results found across dissonance experiments. In particular, they claim that, "While each experiment or at least a number of experiments can be explained in alternative ways, there appears to be no theory that at present can more easily explain the phenomena dealt with by all dissonance experiments."

It is when a series of replications contain different irrelevancies with respect to both the manipulations and measurement methods employed that convincing evidence is provided about the construct validity of the putative cause (Cook and Campbell 1979). Unfortunately, as the heterogeneity of the irrelevancies contained in a series of replications increases, more divergent findings can be expected. Without effective manipulation and confounding checks, resolving these potential inconsistencies is difficult.

Manipulation checks also may be useful in explaining
an apparent "lack of convergence" within an experiment. For example, if when analyzing a key dependent variable one found that a presumed theoretically irrelevant background variable (e.g., age or sex) interacted with the manipulation, one might wish to use manipulation checks to determine whether the strength of the manipulation varied as a function of the background variable (e.g., whether the manipulation caused a greater change with respect to the independent variable of interest for some age groups than for others). To the extent the manipulation-background variable interaction effect on the dependent variable can be explained by the manner in which the strength of the manipulation varied with the background variable, the original theory escapes damage even though the interaction was not predicted initially. Hence, omitting manipulation and confounding checks and relying entirely on experimental replications appears to be satisfactory only in those seemingly rare cases when the results of several widely diverse replications are in agreement and no unexpected manipulation-background variable interactions are encountered.

Though each of the alternative approaches for assessing construct validity might be sufficient in particular situations, one can rarely be certain of this outcome in advance. Furthermore, the case for a particular interpretation of a manipulation is always more credible when supported by empirical evidence developed through a variety of methodological approaches. Hence, manipulation and confounding checks can have an important role in establishing the construct validity of the manipulation even when these alternative approaches are employed.

DISCUSSION AND CONCLUSIONS

The design of an experiment involving one or more latent independent variables is one of the most difficult tasks a researcher can undertake. It requires substantial methodological training and an in-depth knowledge of the theory to be tested. Creativity is also critical because often no obvious credible alternatives are available for manipulating the independent variables and, consequently, experimenters must develop their own manipulations. It is frequently difficult and time-consuming to develop even one apparently reasonable procedure for manipulating a particular independent variable, and when several variables are to be manipulated within a single vehicle (e.g., expertise and friendliness within a sales presentation), the complexity of the experimenter's task can increase geometrically. Many things can go wrong with an experimental design: (1) the manipulations may not be strong enough to allow for a meaningful test of the theory, (2) a manipulation designed to vary one particular independent variable may also affect other independent variables in the study, (3) other latent variables (i.e., confounding variables) may be inadvertently affected by the manipulations, (4) the experimental procedures might create demand artifacts (Sawyer 1975), and (5) the design of the manipulations may suggest that the generalizability of the findings to real-world marketing situations, that is, the study's external validity (Cook and Campbell 1979; Lynch 1982), is highly questionable.

The appropriate time to look for problems in an experimental design is in the pretest and/or pilot stages, not after the main experiment has been conducted and has failed to produce the expected results. An aggressive effort should be made in the early pretest stages to identify plausible problems with the manipulations. This step is difficult, not only because there are so many possibilities, but also because it places experimenters in the role of critiquing the results of their own efforts in designing the initial manipulations. Critical reviews of the proposed manipulations by other researchers working in the same or related areas, the use of unstructured interviews with pretest subjects, the collection of concurrent verbal protocols during pretests, and other such qualitative methods can be helpful. However, such approaches do not ensure that all, or even most, of the potential problems will be identified.

Insufficient attention to assessing rigorously the success of experimental manipulations, particularly in the pretest and pilot-testing phases, is likely to result in lengthy discussions of alternative post hoc explanations for unexpected experimental findings and/or a series of experiments that provide little solid evidence about the research hypotheses of interest. The appropriate design, execution, analysis, and reporting of manipulation and confounding checks are frequently essential to achieving a convincing interpretation of the results of experimental studies involving latent independent variables.

REFERENCES

Aronson, Elliot and J. Merrill Carlsmith (1968), "Experimentation in Social Psychology," in The Handbook of Social Psychology, 2nd ed., Vol. 2, Gardner Lindzey and Elliot Aronson, eds. Reading, MA: Addison-Wesley Publishing Company, 1-79.
Bagozzi, Richard P. (1977), "Structural Equation Models in Experimental Research," Journal of Marketing Research, 14 (May), 209-26.
Baker, Michael J. and Gilbert A. Churchill, Jr. (1977), "The Impact of Physically Attractive Models on Advertising Evaluations," Journal of Marketing Research, 14 (November), 538-55.
Bearden, William O. and Terence A. Shimp (1982), "The Use of Extrinsic Cues to Facilitate Product Adoption," Journal of Marketing Research, 19 (May), 229-39.
Biehal, Gabriel and Dipankar Chakravarti (1986), "Consumers' Use of Memory and External Information in Choice: Macro and Micro Perspectives," Journal of Consumer Research, 12 (March), 382-405.
Brehm, Jack W. and Arthur R. Cohen (1962), Exploration in Cognitive Dissonance. New York: John Wiley & Sons, Inc.
Burnkrant, Robert E. and Daniel J. Howard (1984), "Effects of the Use of Introductory Rhetorical Questions Versus Statements on Information Processing," Journal of Personality and Social Psychology, 47 (December), 1218-30.
Calder, Bobby J. and Brian Sternthal (1980), "Television Commercial Wearout: An Information Processing View," Journal of Marketing Research, 17 (May), 173-86.
Campbell, Donald T. (1960), "Recommendations for APA Test Standards Regarding Construct, Trait, or Discriminant Validity," American Psychologist, 15 (August), 546-53.
Churchill, Gilbert A., Jr. (1979), "A Paradigm for Developing Better Measures of Marketing Constructs," Journal of Marketing Research, 16 (February), 64-73.
Churchill, Gilbert A., Jr. and Carol Surprenant (1982), "An Investigation Into the Determinants of Customer Satisfaction," Journal of Marketing Research, 19 (November), 491-504.
Cook, Thomas D. and Donald T. Campbell (1979), Quasi-Experimentation: Design and Analysis Issues for Field Settings. Boston: Houghton-Mifflin Company.
Cronbach, Lee J. and Paul E. Meehl (1955), "Construct Validity in Psychological Tests," Psychological Bulletin, 52 (July), 281-302.
Dholakia, Ruby and Brian Sternthal (1977), "Highly Credible Sources: Persuasive Facilitators or Persuasive Liabilities?" Journal of Consumer Research, 3 (March), 223-32.
Farnsworth, Paul R. and Issei Misumi (1931), "Further Data on Suggestion in Pictures," American Journal of Psychology, 43 (October), 632.
Festinger, Leon (1953), "Laboratory Experiments," in Research Methods in the Behavioral Sciences, Leon Festinger and Daniel Katz, eds. New York: Holt, Rinehart and Winston, 136-72.
Fromkin, Howard L. and Siegfried Streufert (1976), "Laboratory Experimentation," in Handbook of Industrial and Organizational Psychology, Marvin D. Dunnette, ed. Chicago: Rand McNally, 415-65.
Garner, Wendell R., Harold W. Hake, and Charles W. Eriksen (1956), "Operationism and the Concept of Perception," Psychological Review, 63 (May), 149-59.
Green, Paul E. (1978), Analyzing Multivariate Data. Hinsdale, IL: The Dryden Press.
Holbrook, Morris B. (1978), "Beyond Attitude Structure: Toward the Informational Determinants of Attitude," Journal of Marketing Research, 15 (November), 545-56.
Hovland, Carl I., Janis L. Irving, and Harold H. Kelley (1953), Communication and Persuasion: Psychological Studies of Opinion Change. New Haven, CT: Yale University Press.
Joreskog, Karl G. and Dag Sorbom (1981), LISREL V. Chicago: National Educational Resources, Inc.
Keppel, Geoffrey (1982), Design and Analysis: A Researcher's Handbook, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, Inc.
Keren, Gideon and Charles Lewis (1979), "Partial Omega Squared for ANOVA Designs," Educational and Psychological Measurement, 39 (Spring), 119-28.
Kidd, Robert F. (1976), "Manipulation Checks: Advantage or Disadvantage," Representative Research in Social Psychology, 7 (2), 160-5.
Kisielius, Jolita and Brian Sternthal (1984), "Detecting and Explaining Vividness Effects in Attitudinal Judgments," Journal of Marketing Research, 21 (February), 54-64.
LaTour, Stephen A. (1981), "Variance Explained: It Measures Neither Importance nor Effect Size," Decision Sciences, 12 (January), 150-60.
Lynch, John G., Jr. (1982), "On the External Validity of Experiments in Consumer Research," Journal of Consumer Research, 9 (December), 225-39.
Miniard, Paul W. and Joel B. Cohen (1979), "Isolating Attitudinal and Normative Influences in Behavioral Intentions Models," Journal of Marketing Research, 16 (February), 102-10.
Peter, J. Paul (1979), "Reliability: A Review of Psychometric Basics and Recent Marketing Practices," Journal of Marketing Research, 16 (February), 6-17.
Peter, J. Paul (1981), "Construct Validity: A Review of Basic Issues and Marketing Practices," Journal of Marketing Research, 18 (May), 133-45.
Platt, John R. (1964), "Strong Inference," Science, 146 (October), 347-53.
Reingen, Peter H. and Jerome B. Kernan (1977), "Compliance with an Interview Request: A Foot-in-the-Door, Self-Perception Interpretation," Journal of Marketing Research, 14 (August), 365-9.
Saadi, Mitchel and Paul R. Farnsworth (1934), "The Degrees of Acceptance of Dogmatic Statements and Preferences for Their Supposed Makers," Journal of Abnormal and Social Psychology, 29 (2), 143-50.
Sawyer, Alan G. (1975), "Demand Artifacts in Laboratory Experiments in Consumer Research," Journal of Consumer Research, 1 (March), 20-30.
Sawyer, Alan G. and A. Dwayne Ball (1981), "Statistical Power and Effect Size in Marketing Research," Journal of Marketing Research, 18 (August), 275-90.
Sawyer, Alan G. and J. Paul Peter (1983), "The Significance of Statistical Significance Tests in Marketing Research," Journal of Marketing Research, 20 (May), 122-33.
Sherman, Steven J., Karin Ahlm, Leonard Berman, and Steven Lynn (1978), "Contrast Effects and Their Relationship to Subsequent Behavior," Journal of Experimental Social Psychology, 14 (July), 340-50.
Stanley, Thomas J. (1978), "Are Highly Credible Sources Persuasive?" Journal of Consumer Research, 5 (June), 66-7.
Sternthal, Brian and Ruby Roy Dholakia (1978), "Rejoinder," Journal of Consumer Research, 5 (June), 67-9.
Wetzel, Christopher G. (1977), "Manipulation Checks: A Reply to Kidd," Representative Research in Social Psychology, 8 (2), 88-93.
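
The effect-size comparison described in the section on the analysis of manipulation and confounding checks can be illustrated with a minimal sketch. The Python code below computes the F ratio and omega squared for each effect when a single check measure is analyzed with the full-factorial ANOVA model in a balanced, between-subjects two-factor design. The function name, the dictionary layout, and the scores are hypothetical and are not taken from any study cited here. Omega squared is estimated with the common fixed-effects formula, (SS_effect - df_effect x MS_error) / (SS_total + MS_error), and negative estimates are reported as zero; both choices are assumptions of this sketch rather than procedures specified in the text.

```python
from statistics import mean


def omega_squared_factorial(cells):
    """Return F ratios and omega squared for a balanced two-factor ANOVA.

    `cells` maps (a_level, b_level) -> list of check scores. The design is
    assumed to be fully crossed with the same number of scores per cell.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    sizes = {len(scores) for scores in cells.values()}
    if len(sizes) != 1 or len(cells) != len(a_levels) * len(b_levels):
        raise ValueError("a fully crossed, balanced design is assumed")
    n = sizes.pop()
    if n < 2:
        raise ValueError("at least two scores per cell are needed")

    grand = mean(x for scores in cells.values() for x in scores)
    cell_mean = {key: mean(scores) for key, scores in cells.items()}
    a_mean = {a: mean(cell_mean[(a, b)] for b in b_levels) for a in a_levels}
    b_mean = {b: mean(cell_mean[(a, b)] for a in a_levels) for b in b_levels}

    # Sums of squares for a balanced design.
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_cells = n * sum((m - grand) ** 2 for m in cell_mean.values())
    ss_ab = ss_cells - ss_a - ss_b
    ss_error = sum(
        (x - cell_mean[key]) ** 2 for key, scores in cells.items() for x in scores
    )
    ss_total = ss_cells + ss_error

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_error = len(a_levels) * len(b_levels) * (n - 1)
    ms_error = ss_error / df_error  # must be nonzero for the F ratios below

    results = {}
    for name, ss, df in (("A", ss_a, df_a), ("B", ss_b, df_b), ("AB", ss_ab, df_a * df_b)):
        omega_sq = (ss - df * ms_error) / (ss_total + ms_error)
        results[name] = {
            "F": (ss / df) / ms_error,
            "omega_sq": max(0.0, omega_sq),  # negative estimates reported as zero
        }
    return results


# Hypothetical expertise-check scores, keyed by (expertise, friendliness) level.
example_cells = {
    ("low", "low"): [2, 3, 4],
    ("low", "high"): [3, 4, 5],
    ("high", "low"): [6, 7, 8],
    ("high", "high"): [7, 8, 9],
}

if __name__ == "__main__":
    for effect, stats in omega_squared_factorial(example_cells).items():
        print(effect, round(stats["F"], 2), round(stats["omega_sq"], 3))
```

With these illustrative scores, omega squared is roughly .78 for the intended effect (A), roughly .03 for B, and zero for AB, which is the kind of pattern described above as favorable to the expertise manipulation. The same function could be applied to a confounding check measure, where the desired result is that all three values be close to zero.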
