Understanding Preference Shifts - A Review and Alternate Explanation of Within-Trial Contrast and State-Dependent Valuation

The Behavior Analyst 2012, 35, 179–195 No. 2 (Fall)
James N. Meindl
The University of Memphis
Stimuli that precede aversive events are typically less preferred than stimuli that precede
nonaversive events. It has recently been demonstrated that stimuli that follow less preferred
events may become favored more than stimuli that follow more preferred events. This
phenomenon has been investigated under a variety of names, most commonly, within-trial
contrast and state-dependent valuation. Although this effect has been replicated, there have been
several failures to replicate and it is still little understood. This paper reviews and summarizes
the literature on within-trial contrast and state-dependent valuation. Procedural variations
across studies are identified and discussed. The two current models that explain the phenomenon
are then outlined and the limitations of each model are described. A third explanation is offered
that incorporates the concept of motivating operations. Last, the predictions of all three models
are compared.
Key words: conditioning preference, within-trial contrast, contrast, motivating operations,
state-dependent valuation
Stimuli that are followed by aversive events may become less preferred than stimuli that are followed by nonaversive events. It is also possible, however, for stimulus preference to be altered by the events that precede the presentation of the stimulus. This phenomenon has been investigated under a variety of names, most commonly, within-trial contrast (WTC) and state-dependent valuation (SDV). Both WTC and SDV are conceptual models that essentially describe an effect wherein exposure to a less preferred event increases preference for stimuli that follow that event. Although each model invokes different mechanisms to explain the phenomenon and thus makes different predictions (discussed later), in general both predict that organisms will demonstrate a preference for stimuli that follow less preferred events relative to stimuli that follow more preferred events.

In one of the first studies examining this phenomenon, Clement, Feltus, Kaiser, and Zentall (2000) exposed eight pigeons to two chain-schedule conditions that differed by the amount of effort required.¹ In the first condition (see Figure 1), pigeons were exposed to a fixed-ratio (FR) 1 schedule with a lit center key (initial
I thank Jonathan Ivy, Nancy Neef, and Manish Vaidya for their thoughtful comments and suggestions regarding this paper. The contents of this article were developed under a grant from the U.S. Department of Education, OSEP (H325DO60032) (N. A. Neef, Principal Investigator). These contents do not necessarily represent the policy of the U.S. Department of Education, OSEP, and no endorsement by the federal government should be assumed.
Correspondence concerning this article should be addressed to the author at the University of Memphis, 400A Ball Hall, Memphis, Tennessee 38152 (e-mail: jnmeindl@memphis.edu).

¹ Throughout this paper, chain schedules will be described as consisting of three components (an initial component, middle component, and terminal component). The term initial component will refer to the beginning of the chain, middle component will refer to the event manipulated by the researchers (e.g., response requirement, effort, delay), and terminal component will refer to the conditions that follow the manipulated event. Although the usage is unconventional (cf. Clement et al., 2000), explicit identification of a middle component will allow discussion of a larger variety of events than is possible using conventional chain schedule descriptions.
Figure 1. Training trials adapted from Clement et al. (2000). A key peck to Circ produced
either an FR 1 or FR 20 schedule followed by discrimination trials. A key peck to S+ resulted in
food reinforcement. A key peck to S− resulted in no food reinforcement.
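The two training conditions shown in Figure 1 can also be summarized as a small data structure. The following Python sketch is only an illustration of the procedure as it is described in the text; the class and function names (ChainCondition, preference_test_pairs, pecks_before_terminal) are hypothetical and do not come from Clement et al. (2000).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainCondition:
    """One training condition: initial, middle, and terminal components."""
    label: str
    initial_fr: int   # pecks required on the lit center key (initial component)
    middle_fr: int    # manipulated response requirement (middle component)
    s_plus: str       # key color followed by food in the terminal component
    s_minus: str      # key color followed by no food in the terminal component


# Values taken from the description of the low- and high-effort conditions.
LOW_EFFORT = ChainCondition("FR 1", initial_fr=1, middle_fr=1,
                            s_plus="red", s_minus="yellow")
HIGH_EFFORT = ChainCondition("FR 20", initial_fr=1, middle_fr=20,
                             s_plus="green", s_minus="blue")


def preference_test_pairs(low, high):
    """Pairs compared after training: S+ against S+, and S- against S-."""
    return [(low.s_plus, high.s_plus), (low.s_minus, high.s_minus)]


def pecks_before_terminal(condition):
    """Total pecks required before the terminal component is presented."""
    return condition.initial_fr + condition.middle_fr


if __name__ == "__main__":
    for pair in preference_test_pairs(LOW_EFFORT, HIGH_EFFORT):
        print(pair)
    print(pecks_before_terminal(LOW_EFFORT), pecks_before_terminal(HIGH_EFFORT))
```

Laid out this way, the two conditions differ only in the middle component, so any difference in preference between the paired terminal stimuli is attributed to that component.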
component). Completion of this component led to another FR 1 schedule (middle component) followed by a two-key discrimination task (terminal component). In the terminal component, a peck to the red key (S+FR1) resulted in food reinforcement whereas a peck to the yellow key (S−FR1) resulted in no food reinforcement. The second condition consisted of an identical initial component, followed by an FR 20 in the middle component, followed by a similar two-key discrimination task in which a peck to the green key (S+FR20) resulted in food reinforcement whereas a peck to the blue key (S−FR20) resulted in no food reinforcement. After repeated exposure to both conditions, preference for the various terminal component stimuli was assessed via a paired-stimulus preference assessment that compared S+FR1 against S+FR20 and S−FR1 against S−FR20. Results indicated a preference for the S+FR20 and the S−FR20 (both stimuli that had previously followed the high-effort condition).

Although there have been multiple replications, the finding is not entirely reliable, and there have been numerous failures to replicate. Further, there are still lingering questions as to the cause of the change in preference and whether any middle-component events can produce the effect. As a result, two competing models have been offered to explain the phenomenon: WTC and SDV.

The purpose of this paper is threefold. First, the literature on WTC and SDV will be reviewed and the results will be discussed in terms of subject type and initial, middle, and terminal components. Differences in procedures will be delineated, as will inconsistencies in findings. Second, the current models that explain the phenomenon will be contrasted, and their limitations will be described. Third, an alternative explanation of the findings will be presented, and the predictions of all three models will be evaluated.

METHOD

Two literature searches were conducted using the databases PsycINFO and PubMed. The key words used were within-trial contrast for the first search and state-dependent valuation for the second. All resulting peer-reviewed articles were retained. Of these articles, a search of the references was conducted and studies that investigated WTC or SDV or used similar procedures were included in the list of articles. Excluded were articles that focused only on cognitive dissonance or justification of effort, because these studies typically employed different procedures
or did not directly measure preference but rather relied on preference rating scales. Also excluded was research on the Concorde fallacy and sunk cost effects (see Arkes & Ayton, 1999, for discussion). Although these concepts are somewhat related to WTC and SDV, the procedures are quite dissimilar and may not examine the same behavior-change phenomenon.

RESULTS AND DISCUSSION

Generality of the Findings

In total, 23 articles were identified containing a total of 38 separate experiments. Of these 38 experiments, 22 (across 18 separate articles) demonstrated an increase in preference for a stimulus that followed a less rather than more preferred event, and 16 (across 8 separate articles) failed to replicate the phenomenon. Note that some articles contained both replications and failures to replicate. Of the 23 total articles, six appeared to interpret the phenomenon from the conceptual perspective of SDV (Aw, Holbrook, Burt de Perera, & Kacelnik, 2009; Kacelnik & Marsh, 2002; Marsh, Schuck-Paim, & Kacelnik, 2004; Pompilio & Kacelnik, 2005; Pompilio, Kacelnik, & Behmer, 2006; Waite & Passino, 2006), and 15 interpreted the phenomenon from the perspective of WTC (Alessandri, Darcheville, Delevoye-Turrell, & Zentall, 2008; Alessandri, Darcheville, & Zentall, 2008; Arantes & Grace, 2008; Clement et al., 2000; Clement & Zentall, 2002; DiGian, Friedrich, & Zentall, 2004; Friedrich, Clement, & Zentall, 2005; Friedrich & Zentall, 2004; Gipson, Miller, Alessandri, & Zentall, 2009; Klein, Bhatt, & Zentall, 2005; O’Daly, Meyer, & Fantino, 2005; Singer, Berry, & Zentall, 2007; Vasconcelos & Urcuioli, 2008, 2009; Vasconcelos, Urcuioli, & Lionello-DeNolf, 2007). One study directly compared empirical predictions from both conceptual frameworks (Aw, Vasconcelos, & Kacelnik, 2011). Finally, a clear conceptual perspective was not identifiable for one study (Armus, 2001). Because the procedures used in these studies were quite similar despite different conceptual underpinnings, the articles will be collapsed and discussed together. (See Appendix A for a breakdown of each article and to examine effect by individual experiment.)

Subject types. Of the 22 experiments that successfully demonstrated the effect, 18 were conducted with various nonhumans including pigeons, locusts, grasshoppers, rats, banded tetras, and starlings. The remaining four experiments demonstrated the effect with humans, both children and adults. Of the 16 experiments that were unable to replicate the findings, all were conducted with nonhuman organisms including starlings, pigeons, or rats. These data suggest that the phenomenon is general and not a species-specific characteristic.

Middle components. Across the 22 experiments that successfully demonstrated the effect, a variety of events have been programmed for or required in the middle component. For example, the effect has been documented when the preceding events were high versus low effort (Alessandri, Darcheville, Delevoye-Turrell, & Zentall, 2008; Aw et al., 2011; Clement et al., 2000; Clement & Zentall, 2002; Friedrich & Zentall, 2004; Kacelnik & Marsh, 2002; Klein et al., 2005), high versus low probabilities of reinforcement (Clement & Zentall, 2002; Gipson et al., 2009), short versus long delays to the terminal component (Alessandri, Darcheville, Delevoye-Turrell, et al., 2008; Alessandri, Darcheville, & Zentall, 2008; Clement et al., 2000; DiGian et al., 2004; O’Daly et al., 2005), preferred versus less preferred schedules of reinforcement (Singer et al., 2007), the absence or presence of reinforcement (Friedrich et al., 2005), and low versus high states of food deprivation (Aw et al., 2009;
Marsh et al., 2004; Pompilio & Kacelnik, 2005; Pompilio et al., 2006; Vasconcelos & Urcuioli, 2008). Although the effect has been most robustly documented with food deprivation, in each of the above cases, preference was increased for stimuli that followed the less preferred event. Thus, the effect does not appear to be limited to conditions that involve differential response requirements (or more generally, effort) alone.

Although the effect has been produced with a variety of events in middle components, there have also been many failures to replicate the effect. Studies that have failed to produce the effect have examined middle components such as high versus low effort (Arantes & Grace, 2008; Armus, 2001; Friedrich & Zentall, 2004; Vasconcelos & Urcuioli, 2009; Vasconcelos et al., 2007; Waite & Passino, 2006), long versus short delay (Aw et al., 2011), and high versus low states of food deprivation (Vasconcelos & Urcuioli, 2008).

Explanations for replication failures. Explanations for these failures to replicate the phenomenon have been numerous. Singer et al. (2007) suggested that the effect is slow to develop. Zentall (2008) suggested a variety of possible reasons for failure to replicate, including that overtraining is required, that the terminal component must be contiguous with the middle component, and that prior exposure to lean schedules influences the effect. Finally, Aw et al. (2011) suggested that the effect occurs only when the middle component requires energy expenditure.

For each suggested explanation, however, there is at least one study that demonstrates that explanation to be insufficient. First, Arantes and Grace (2008) provided an amount of training equal to that used in other successful studies and were unable to reproduce the results. Further, Vasconcelos and Urcuioli (2009) provided extensive overtraining, but although they noted a tendency towards preference change, this tendency was not statistically significant. Second, an effect has been found when using a delay as the less preferred event (e.g., Alessandri, Darcheville, & Zentall, 2008), which would seem to counter both the argument concerning contiguity between middle and terminal components and the argument that energy expenditure is required. Finally, researchers have used experimentally naive pigeons and employed overtraining but still failed to replicate previous findings (e.g., Vasconcelos & Urcuioli, 2009).

There are other possible explanations for some replication failures. Armus (2001), for example, conducted experiments using two differently flavored pellets (grape and bacon) as terminal components following different amounts of effort. Repeated exposure to different food items may increase preference (Wardle, Herrera, Cooke, & Gibson, 2003), and it is possible that this repeated exposure masked or influenced the results of the experiment.

Procedural Variations Across Experiments

There are a variety of procedural differences across the various experiments that have examined the phenomenon. It is possible that these variations contribute to the differences in experimental findings. The following sections describe several procedural variations regarding the presentation of the initial and terminal components across the reviewed studies.

Initial component. A stimulus that precedes a less preferred event may become a conditioned aversive stimulus and function as a punisher (e.g., Vorndran & Lerman, 2006). Because a change in preference for stimuli presented in the terminal component is presumed to be directly related to preference for the event in the middle component, it is necessary to be able to
attribute the preference change to those two components alone. If the initial component is a less preferred or aversive event, this may confound interpretations by either adding or subtracting from the effect. It is therefore important to distinguish between studies that used distinct initial components (in which differential preference could develop) and those that used identical initial components.

Of the experiments that successfully demonstrated the phenomenon, 10 used distinct initial components (Alessandri, Darcheville, Delevoye-Turrell, et al., 2008, Phases 1 and 2; Alessandri, Darcheville, & Zentall, 2008; Aw et al., 2011, Experiment 3; Clement & Zentall, 2002, Experiments 1, 2, and 3; Friedrich et al., 2005; O’Daly et al., 2005, Experiment 2; Singer et al., 2007), 4 used identical initial components (Clement et al., 2000; Friedrich & Zentall, 2004, Experiment 1; Gipson et al., 2009; Klein et al., 2005), 2 used both distinct and identical initial components for different groups (DiGian et al., 2004; O’Daly et al., 2005, Experiment 1), and 6 did not include initial components (Aw et al., 2009; Kacelnik & Marsh, 2002; Marsh et al., 2004; Pompilio & Kacelnik, 2005; Pompilio et al., 2006; Vasconcelos & Urcuioli, 2008, Experiment 1). Those that did not include initial components investigated the effects of different levels of food deprivation, and an initial component was not feasible.

Of the 16 experiments that were unable to replicate the phenomenon, 3 used distinct initial components (Aw et al., 2011, Experiment 2; Vasconcelos & Urcuioli, 2009, Experiments 1 and 2), 9 used identical initial components (Arantes & Grace, 2008, Experiments 1 and 2; Aw et al., 2011, Experiment 1; Friedrich & Zentall, 2004, Experiment 2; Vasconcelos et al., 2007, Experiments 1 through 5), 1 used both distinct and identical initial components (Vasconcelos et al., 2007, Experiment 6), and 3 did not include initial components (Armus, 2001; Vasconcelos & Urcuioli, 2008, Experiment 2; Waite & Passino, 2006).

Of the 13 experiments that used identical stimuli in the initial components, four produced an effect and nine did not. Of the 13 experiments that used different stimuli in the initial components, 10 produced an effect and three did not. Thus, whether the stimuli used in the initial component influence the contrast effect is currently unclear, but the data suggest the possibility that they do. Studies directly examining this possibility include, for example, DiGian et al. (2004), who used different initial components (vertical or horizontal lines) for one group of pigeons and identical initial components (white keys) for a different group of pigeons. Although both groups significantly preferred the stimulus that followed longer delays, the first group (different initial components) displayed a greater degree of preference change than the second group (identical initial components). This may indicate that the initial component exerts some effect in the conditioning process, although such a conclusion is speculative at this point.

Terminal component. The presentation of terminal component stimuli varied among studies primarily based on whether the researchers presented a single stimulus after each condition or presented two stimuli together in a discrimination task. In 12 of the 22 experiments that demonstrated an effect, a discrimination task followed the middle component (Alessandri, Darcheville, Delevoye-Turrell, et al., 2008, Phases 1 and 2; Alessandri, Darcheville, & Zentall, 2008; Clement et al., 2000; Clement & Zentall, 2002, Experiments 1 through 3; DiGian et al., 2004; Friedrich et al., 2005; Gipson et al., 2009; Klein et al., 2005; Singer et al., 2007, Experiment 2). The remaining 10 studies presented a single stimulus after each condition. Of the 16 experiments that
failed to replicate the effect, 11 employed a discrimination task following the middle component (Arantes & Grace, 2008, Experiments 1 and 2; Vasconcelos & Urcuioli, 2008, Experiment 2; Vasconcelos & Urcuioli, 2009, Experiments 1 and 2; Vasconcelos et al., 2007, Experiments 1 through 6). The remaining five studies presented a single stimulus after each condition. In summary, of the 23 studies that presented a discrimination task in the terminal component, 12 successfully produced the effect. Of the 15 studies that presented a single stimulus in the terminal component, 10 produced the effect. Thus it appears as though the phenomenon is more frequently seen when terminal components involve single stimulus presentation rather than discrimination tasks.

Experiments that employed a discrimination task in the terminal component have produced conflicting results. Of the studies that replicated the phenomenon, some have shown that both the S+ and S− following the less preferred event (S+LP; S−LP) are equally likely to become preferred relative to the S+ and S− following the more preferred event (Clement & Zentall, 2002; Gipson et al., 2009; Singer et al., 2007), and Clement et al. (2000) found an even greater preference for S−LP. These findings suggest that relative preference for the preceding event is responsible for the effect rather than the events that follow the terminal component (i.e., reinforcement or no reinforcement). Other research, however, has demonstrated that the effect is weaker with the S−LP than the S+LP (Friedrich et al., 2005) or has found no effect with the S−LP (Alessandri, Darcheville, & Zentall, 2008; Clement & Zentall, 2002; DiGian et al., 2004; Klein et al., 2005). These results are difficult to interpret and may suggest that the reinforcement that follows the S+ somehow influences the effect.

An alternative explanation, however, might be that in order for preference to change, the organism must respond to the stimulus that follows the middle component. Arantes and Grace (2008) and Vasconcelos et al. (2007) propose that because the S+ and S− were presented in a discrimination task, the organism learned to consistently select the S+ stimulus and avoid the S−. They suggest that the presentation of stimuli in a discrimination task may produce an effect with the S+ but may inhibit the effect with the other stimulus. Currently, it is unclear whether this is the case or whether the addition of reinforcement after a terminal component somehow influences the effect.

More straightforward examples of the effect are evident in the 10 successful experiments that did not impose a discrimination task in the terminal component (Aw et al., 2009; Aw et al., 2011, Experiment 3; Friedrich & Zentall, 2004, Experiment 1; Kacelnik & Marsh, 2002; Marsh et al., 2004; O’Daly et al., 2005, Experiments 1 and 2; Pompilio & Kacelnik, 2005; Pompilio et al., 2006; Vasconcelos & Urcuioli, 2008, Experiment 1). In these studies the terminal component was a single stimulus (e.g., differently colored key, specific arm of a Y maze, etc.). After training, preference was measured by presenting both terminal components simultaneously and recording the organism’s choice between components. In each of these studies, a significant preference was found for the stimulus that had previously followed less preferred events, with the exception of Aw et al. (2011), who found an effect only when the middle component involved effort rather than delay.

Procedural Problems with Reviewed Studies

Because changes in preference for terminal component stimuli are purportedly related to differential preference for the middle component
stimuli, it seems important that baseline preference for both components be established prior to the onset of an experiment. Surprisingly, however, of the 38 experiments reviewed (including both replications and failures to replicate) only three (Alessandri, Darcheville, Delevoye-Turrell, et al., 2008, Experiments 1 and 2; Singer et al., 2007) measured preference for the middle components. Friedrich and Zentall (2004) did measure middle-component preference, but only after training was complete.

It may be tempting to assume that more effort or longer delays will always be less preferred; however, this may not always be true. For example, Alessandri, Darcheville, Delevoye-Turrell, et al. (2008) employed middle components that required participants to press a button with varying amounts of force and for varying lengths of time. For two participants, the less preferred condition was not the least effort/shortest delay condition. Without prior measurement, results for these two participants might have been erroneously interpreted.

Another concern is that only two of the 38 experiments assessed preference for terminal component stimuli prior to training (Friedrich & Zentall, 2004, Experiments 1 and 2). Without knowing initial preference, it is difficult to determine the impact of training. If preference for a stimulus was initially low and then shifted to high, this would indicate a very strong effect. If, on the other hand, preference for the stimulus was initially somewhat high and then shifted higher, this would indicate a fairly weak effect. Without measuring preference for middle and terminal components prior to training, it is difficult to draw strong conclusions regarding the phenomenon. If we are to suggest that preference for terminal components is influenced by preference for middle components, it is of utmost importance that we establish these preferences before exposure to training.

DIFFERENT MODELS TO EXPLAIN THE EFFECTS

The Within-Trial Contrast Model

The WTC model (see Figure 2) was first proposed by Zentall (2005) and generally assumes that when an organism is exposed to a less preferred event there is a negative change in the organism’s hedonic state (H − ΔH). This negative change is directly proportional to the degree to which the event is less preferred. When a reinforcing stimulus is presented following this negative shift, there is a positive shift in the hedonic state of the organism. The contrast, then, is between the organism’s hedonic state before and after the presentation of the reinforcing stimulus. The degree to which preference is changed depends on the amount of positive shift, and therefore also on the degree to which one event is preferred more or less than another (cf. relative values in Figure 2).

Limitations of the within-trial contrast model. There are several problems with the WTC model. Aw et al. (2011) outline many similar limitations; however, they deserve mention here in order to contrast the various models. The first limitation is that the model assumes a hedonic state and uses changes in this state (both absolute and relative) as the primary events responsible for contrast. Hedonic state, however, is a term that only vaguely refers to an organism’s well-being, and there is currently no agreed-upon definition (Pompilio & Kacelnik, 2005). Hedonic state seems to function much the same as a hypothetical construct, and there is no method of objectively determining an organism’s hedonic state, much less detecting changes in that state. Although the term hedonic state, and assumptions of changes in this state, may be convenient, it is
Figure 2. “A model based on change in relative hedonic value, proposed to account for within-trial contrast effects. According to the model, trials begin with a relative hedonic state, H; key pecking results in a negative change in hedonic state, H − ΔH1 for FR 1 and H − ΔH20 for FR 20; obtaining a reinforcer results in a positive change in hedonic state, H + ΔHRf; the net change in hedonic state depends on the difference between H + ΔHRf and H − ΔH1 on an FR 1 and between H + ΔHRf and H − ΔH20 on an FR 20 trial.” (Zentall, 2005, p. 280).
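Written out, the relations quoted in the Figure 2 caption take the following form. This is a sketch under two assumptions that are not stated in Zentall (2005): that each ΔH is a positive magnitude and that the changes combine additively. The symbol C (for contrast) is introduced here only for illustration.

```latex
\begin{align*}
C_{1}  &= (H + \Delta H_{Rf}) - (H - \Delta H_{1})  = \Delta H_{Rf} + \Delta H_{1} \\
C_{20} &= (H + \Delta H_{Rf}) - (H - \Delta H_{20}) = \Delta H_{Rf} + \Delta H_{20} \\
C_{20} - C_{1} &= \Delta H_{20} - \Delta H_{1} > 0 \quad \text{if } \Delta H_{20} > \Delta H_{1}
\end{align*}
```

Under these assumptions the contrast is larger on an FR 20 trial than on an FR 1 trial, which is the sense in which the model predicts a preference for stimuli that follow the less preferred event.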
unclear whether hedonic state refers to any measurable state of being.

A second problem with this model is that it describes an upward shift in hedonic state as the result of the presentation of a reinforcer after a less preferred event. Recall, however, that several researchers used a discrimination task as the terminal component and found an effect with both the S+ and S− (Clement et al., 2000; Clement & Zentall, 2002; Gipson et al., 2009; Singer et al., 2007). Because the S− was never associated with a reinforcer, there should be no increase in hedonic state and hence no shift in preference for the S− stimulus.

Due to these limitations, the current WTC model appears to be an inadequate explanation. Although the WTC model affords some prediction, the model is less than ideal at providing a scientific explanation of the phenomenon.

The State-Dependent Valuation Model

Kacelnik and Marsh (2002) proposed the SDV as an alternative model that hypothesizes that the relative value of a reinforcer is directly related to the energetic state or fitness of the organism at the point of reinforcer delivery. A reinforcer delivered at a point of low energy reserves is presumed to be relatively more valuable than that same reinforcer delivered at a point of higher energy reserves. The model further assumes that with repeated exposures to a food item under states of low energy reserves, the utility value of the food item is somehow represented in memory (Pompilio & Kacelnik, 2005).

Limitations of the state-dependent valuation model. One limitation of the SDV model is that it assumes that the function of the middle components is to differentially depress energy reserves. It is unclear whether this is always the case, however, because the effect has been successfully demonstrated with middle components that would not be expected to differentially depress energy reserves, such as delay lengths (DiGian et al., 2004), different schedules of reinforcement (Singer et al., 2007), or anticipated or unanticipated effort (Clement & Zentall, 2002). If energy reserves were not
differentially depressed, however, the terminal components would not have differential fitness value, and the effect would not be predicted.

A second limitation is that several studies have shown changes in preference when the terminal components were followed by stimuli with presumably no capacity to increase energy reserves. Alessandri, Darcheville, and Zentall (2008), for example, provided children with short song segments or segments of a cartoon after successful discrimination in the terminal component. Similarly, Klein et al. (2005) had a computer screen display the words “correct” or “incorrect” during discrimination. Furthermore, as noted in the limitations of the WTC model, an increase in preference has been observed with S− that was not associated with any reinforcement. Thus, it appears that for some of the successful demonstrations, the SDV model provides an insufficient explanation.

ALTERNATIVE EXPLANATION: MOTIVATING OPERATIONS AND THE FUNCTION-ALTERING EFFECT

As currently conceptualized, motivating operations (MOs) are stimulus conditions or events that produce a momentary change in the reinforcing effectiveness of other stimuli (the value-altering effect) and a momentary change in the frequency of behaviors (the behavior-altering effect) that have functioned to produce those stimuli (Michael, 2004). Thus, an MO is not associated with differential availability of consequences, but rather produces a temporary change in stimulus value and the probability of behavior that produces that stimulus.

The function-altering effect describes a conditioning process whereby, due to a particular learning history, the function of specific stimuli is altered in the presence of other stimuli (Schlinger & Blakely, 1994). If, for example, food is provided for key pecking under a state of food deprivation (an establishing operation), key pecking is more likely in the future because the evocative effect of the establishing operation has been altered. The function-altering effect has been to modify the function of the establishing operation. In this case, the evocative effect of food deprivation is to evoke key pecking.

Let us now apply the concept of MOs and function-altering effects to the typical presentation of the initial, middle, and terminal components in the reviewed studies. In a typical study on effort, for example, a pigeon might be exposed to both of the following conditions in an alternating fashion:

Condition 1: Initial Component 1 (FR 1) → Middle Component 1 (FR 5) → Terminal Component 1 (red key followed by food reinforcement)
Condition 2: Initial Component 2 (FR 1) → Middle Component 2 (FR 30) → Terminal Component 2 (blue key followed by food reinforcement)

In both conditions, engaging in the initial component produces differential effort outcomes: low effort (FR 5) or high effort (FR 30). Assuming that the pigeon would prefer not to engage in any effort expenditure, both Middle Components 1 and 2 may be considered MOs in that they both function to increase the value of stimuli that terminate the middle component. With repeated exposure to training, the function of these establishing operations is altered. However, due to differences in the magnitude of the required effort in both events, Middle Component 2 (FR 30) is less preferred than Middle Component 1 (FR 5). Therefore, although the stimuli that follow both Middle Components 1 and 2 should function as conditioned reinforcers, the stimuli associated with the termination of Middle Component 2 should be valued more than the stimuli associated with the termination of Middle Component 1. The
entire procedure, therefore, has served to expose the pigeon to a less preferred stimulus condition (the middle component) and to associate the termination of that condition with the presentation of another stimulus (the terminal component).

It has previously been noted that the terminal component of a chain schedule may become a conditioned reinforcer because it is consistently contiguous with the delivery of a primary reinforcer at the end of the chain (Fantino, 1977). The explanation described here asserts that the same effect occurs in these chain schedules except that the terminal component becomes a conditioned reinforcer not through contiguity with the positive reinforcer but through contiguity with the termination of the middle component. In essence, a stimulus that terminates a less preferred event is a reinforcer, and the less preferred the event, the more valuable the stimulus.

The MO/function-altering explanation is more parsimonious than the SDV model because it does not assume that some events (e.g., 6-s delays; differential reinforcement of other behavior vs. fixed-interval schedules) function to decrease an organism’s energy reserves. Nor does it assume that the presentation of cartoons or affirmations functions to increase an organism’s biological fitness. The MO interpretation is also more parsimonious than the WTC model because it is able to specify the precise mechanism that is responsible for altering preference (i.e., MOs) as opposed to relying on unmeasured changes in a hypothetical and ill-defined construct (i.e., hedonic state).

In addition, the MO interpretation could help to explain some of the inconsistent findings regarding the effect. The most reliable demonstration of the effect has been when levels of food deprivation were manipulated, whereas the least reliable demonstrations were those that used events such as delay. The MO explanation would predict that changes in preference for the terminal components are directly related to the degree to which preference for the middle components differs. If one middle component was highly aversive and the other only mildly nonpreferred, a large preference change would be expected, and this preference change would diminish as the two middle components became equally aversive. It is possible that there is simply a larger discrepancy in preferences for levels of food deprivation compared to different lengths of delay.

Further, this explanation is able to account for findings regarding increased preference for the S−LP. As noted earlier, the WTC and SDV models predict an increase in preference when a reinforcer is delivered that differentially increases hedonic states or energetic reserves. Because the S− was never associated with reinforcement, neither model would predict preference change. On the other hand, the MO explanation would predict preference change because the S−LP was associated with the termination of a relatively less preferred event.

DIFFERENT PREDICTIONS YIELDED BY EACH MODEL

If an organism was exposed to two training conditions with different middle components (long delay and short delay, both followed by some stimulus), the different models would yield different predictions. The WTC model would predict preference for the stimulus that followed the long delay relative to the stimulus that followed the short delay. The SDV model would not predict differentiated preference because neither delay condition resulted in energetic expenditure. According to the MO interpretation, a prediction could only be made if the two delay conditions were demonstrated to be differentially preferred. That is, if the long delay was less preferred than the short
delay, preference should shift towards the stimulus following the long delay. If the short delay was less preferred than the long delay, the opposite prediction would be made.

Furthermore, the MO interpretation could make novel predictions to which the WTC and SDV models could not as easily speak. For example, if instead of identical middle components differing by one dimension (e.g., 5-s delay vs. 20-s delay), the middle components were a less preferred condition involving energy expenditure (e.g., running) versus a less preferred condition involving energy gain (e.g., eating an unpleasant-tasting food item), the predictions of the WTC and SDV models are unclear. The WTC model would not necessarily be able to quantify which condition resulted in a greater decrease in hedonic state and could only make this determination after the fact. The SDV model is predicated on the expectance of energy expenditure, so the unpleasant-tasting food item would not fit with the model. In contrast, the MO explanation would merely require that one condition be shown to be less preferred relative to the other to make a prediction as to preference change.

Finally, the MO explanation may be extended to situations in which one middle component is nonpreferred and the other is preferred. If, for example, one condition involved an unpleasant tone and the other involved watching a preferred movie segment, the MO explanation would suggest increased preference for the stimulus following the unpleasant tone and decreased preference for the stimulus following the preferred movie clip (being associated with the termination of a preferred event). It is unclear what outcome would be predicted by the WTC model, and the SDV model would predict no preference change because no energy was expended in either condition.

CONCLUSION

The precise circumstances necessary to produce the preference-change phenomenon described in this paper are not fully understood at this time. The effect has been replicated multiple times across a variety of species and middle components, yet there have also been numerous failures to replicate. Some of the discrepancies in findings may be due to procedural differences across studies (e.g., the presentation of a discrimination task vs. single stimulus presentation; identical vs. distinct initial components). It is also possible that these conflicts are due to procedural problems seen across most of the studies reviewed (e.g., a failure to measure preference for middle and terminal components prior to training). Further complicating our understanding of the phenomenon is that the two models currently offered to explain the findings both make different predictions and are unable to account for all of the research findings. This paper offers an alternative explanation to the WTC and SDV models. This alternative explanation relies on the concept of MOs and function-altering effects and suggests that the terminal components are conditioned as reinforcers through contiguity with the termination of the middle components. Although this account remains speculative at this point, it is conceptually systematic, is able to account for inconsistent research findings, and is able to make novel predictions.

The studies reviewed in this paper, along with the alternative explanation, have important implications for behavior analysis. One implication concerns the design of behavioral interventions. When these programs are created, reinforcers are selected and then typically programmed to be delivered on some schedule. Although much attention is given to the effect a reinforcer has on an organism’s engagement with a schedule, very little attention is given to the effect the schedule, and corresponding work requirements, have on the value of the reinforcer. The studies reviewed here indicate that the work requirement may alter the value of the corresponding reinforcer. Care, therefore, should be exercised when selecting and delivering reinforcers because the schedule requirements may either increase or decrease the value of those stimuli.

A second implication is related to the current understanding of the interaction between MOs and the function-altering effect. The value-altering effect of an establishing operation is understood as “an increase in the current [italics added] effectiveness of some stimulus, object, or event as reinforcement” (Cooper, Heron, & Heward, 2007, p. 376). Food deprivation is presumed to momentarily increase the effectiveness of food as a reinforcer. When the level of food deprivation subsides, the effectiveness of food as a reinforcer diminishes. The research reviewed here, however, suggests that the function-altering effect interacts with MOs. That is, the food is made more valuable when it is presented during food deprivation, and that increase in value persists into the future.

The finding that less preferred events alter preference for subsequent stimuli warrants further inquiry. Future research into this phenomenon may help explain how stimulus preferences develop with greater precision than is currently available, and may allow a greater understanding of how the environment and behavior interact and affect one another.

REFERENCES

Alessandri, J., Darcheville, J.-C., Delevoye-Turrell, Y., & Zentall, T. R. (2008). Preference for rewards that follow greater effort and greater delay. Learning & Behavior, 36, 352–358.
Alessandri, J., Darcheville, J.-C., & Zentall, T. R. (2008). Cognitive dissonance in children: Justification of effort or contrast? Psychonomic Bulletin & Review, 15, 673–677.
Arantes, J., & Grace, R. C. (2008). Failure to obtain value enhancement by within-trial contrast in simultaneous and successive discriminations. Learning & Behavior, 36, 1–11.
Arkes, H. R., & Ayton, P. (1999). The sunk cost and Concorde effects: Are humans less rational than lower animals? Psychological Bulletin, 125, 591–600.
Armus, H. L. (2001). Effect of response effort on the reward value of distinctively flavored food pellets. Psychological Reports, 88, 1031–1034.
Aw, J. M., Holbrook, R. I., Burt de Perera, T., & Kacelnik, A. (2009). State-dependent valuation learning in fish: Banded tetras prefer stimuli associated with greater past deprivation. Behavioral Processes, 81, 333–336.
Aw, J. M., Vasconcelos, M., & Kacelnik, A. (2011). How costs affect preferences: Experiments on state dependence, hedonic state and within-trial contrast in starlings. Animal Behaviour, 81, 1117–1128.
Clement, T. S., Feltus, J. R., Kaiser, D. H., & Zentall, T. R. (2000). “Work ethic” in pigeons: Reward value is directly related to the effort or time required to obtain the reward. Psychonomic Bulletin & Review, 7, 100–106.
Clement, T. S., & Zentall, T. R. (2002). Second-order contrast based on the expectation of effort and reinforcement. Journal of Experimental Psychology, 28, 64–74.
Cooper, J. O., Heron, T. E., & Heward, W. L. (2007). Applied behavior analysis (2nd ed.). Upper Saddle River, NJ: Pearson Education.
DiGian, K. A., Friedrich, A. M., & Zentall, T. R. (2004). Discriminative stimuli that follow a delay have added value for pigeons. Psychonomic Bulletin & Review, 11, 889–895.
Fantino, E. (1977). Conditioned reinforcement: Choice and information. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of operant behavior (pp. 313–363). Englewood Cliffs, NJ: Prentice Hall.
Friedrich, A. M., Clement, T. S., & Zentall, T. R. (2005). Discriminative stimuli that follow the absence of reinforcement are preferred by pigeons over those that follow reinforcement. Learning & Behavior, 33, 337–342.
Friedrich, A. M., & Zentall, T. R. (2004). Pigeons shift their preference toward locations of food that take more effort to obtain. Behavioral Processes, 67, 405–415.
Gipson, C. D., Miller, H. C., Alessandri, J. J. D., & Zentall, T. R. (2009). Within-trial contrast: The effect of probability of reinforcement in training. Behavioral Processes, 82, 126–132.
Kacelnik, A., & Marsh, B. (2002). Cost can increase preference in starlings. Animal Behavior, 63, 245–250.
Klein, E. D., Bhatt, R. S., & Zentall, T. R. (2005). Contrast and justification of effort. Psychonomic Bulletin & Review, 12, 335–339.
Marsh, B., Schuck-Paim, C., & Kacelnik, A. (2004). Energetic state during learning affects foraging choices in starlings. Behavioral Ecology, 15, 396–399.
Michael, J. (2004). Concepts and principles of behavior analysis. Kalamazoo, MI: Association for Behavior Analysis.
O’Daly, M., Meyer, S., & Fantino, E. (2005). Value of conditioned reinforcers as a function of temporal context. Learning and Motivation, 36, 42–59.
Pompilio, L., & Kacelnik, A. (2005). State-dependent learning and suboptimal choice: When starlings prefer long over short delays to food. Animal Behavior, 70, 571–578.
Pompilio, L., Kacelnik, A., & Behmer, S. T. (2006). State-dependent learned valuation drives choice in an invertebrate. Science, 311, 1613–1615.
Schlinger, H. D., & Blakely, E. (1994). A descriptive taxonomy of environmental operations and its implications for behavior analysis. The Behavior Analyst, 17, 43–57.
Singer, R. A., Berry, L. M., & Zentall, T. R. (2007). Preference for a stimulus that follows a relatively aversive event: Contrast or delay reduction. Journal of the Experimental Analysis of Behavior, 87, 275–285.
Vasconcelos, M., & Urcuioli, P. J. (2008). Deprivation level and choice in pigeons: A test of within-trial contrast. Learning & Behavior, 36, 12–18.
Vasconcelos, M., & Urcuioli, P. J. (2009). Extensive training is insufficient to produce the work-ethic effect in pigeons. Journal of the Experimental Analysis of Behavior, 91, 143–152.
Vasconcelos, M., Urcuioli, P. J., & Lionello-DeNolf, K. M. (2007). Failure to replicate the “work ethic” effect in pigeons. Journal of the Experimental Analysis of Behavior, 87, 383–399.
Vorndran, C. M., & Lerman, D. C. (2006). Establishing and maintaining treatment effects with less intrusive consequences via a pairing procedure. Journal of Applied Behavior Analysis, 39, 35–48.
Waite, T. A., & Passino, K. M. (2006). Paradoxical preferences when options are identical. Behavioral Ecology and Sociobiology, 59, 777–785.
Wardle, J., Herrera, M. L., Cooke, L., & Gibson, E. L. (2003). Modifying children’s food preferences: The effects of exposure and reward on acceptance of an unfamiliar vegetable. European Journal of Clinical Nutrition, 57, 341–348.
Zentall, T. R. (2005). A within-trial contrast effect and its implications for several social psychological phenomena. International Journal of Comparative Psychology, 18, 273–297.
Zentall, T. R. (2008). Within-trial contrast: When you see it and when you don’t. Learning & Behavior, 36, 19–22.
APPENDIX

| Author | Population | Initial component | Middle component |
|---|---|---|---|
| Alessandri, Darcheville, Delevoye-Turrell, and Zentall, 2008 | Phases 2–3: N = 30 undergraduate students | Phase 2: different stimuli | Preferred vs. nonpreferred effort |
| | | Phase 3: different stimuli | 10-s delay vs. nonpreferred effort |
| Alessandri, Darcheville, and Zentall, 2008 | N = 42 children age 7–8 years | Different stimuli | No delay vs. 8-s delay |
| Arantes and Grace, 2008 | Experiment 1: N = 10 pigeons | Experiment 1: identical stimuli | FR 1 vs. FR 20 |
| | Experiment 2: N = 12 pigeons | Experiment 2: identical stimuli | FR 1 vs. FR 20 |
| Armus, 2001 | N = 7 rats | No initial component | 5-g effort vs. 75-g effort lever press |
| Aw, Holbrook, Burt de Perera, and Kacelnik, 2009 | N = 13 banded tetras | No initial component | Food deprived vs. prefed |
| Aw, Vasconcelos, and Kacelnik, 2011 | Experiment 1: N = 7 wild-caught starlings | Experiment 1: identical stimuli | 3-s delay vs. 18-s delay |
| | Experiment 2: N = 12 wild-caught starlings | Experiment 2: different stimuli (randomly alternated) | 3-s delay vs. 18-s delay |
| | Experiment 3: N = 12 wild-caught starlings | Experiment 3: different stimuli | 4-flight effort vs. 12-flight effort vs. 24-flight effort |
| Clement, Feltus, Kaiser, and Zentall, 2000 | N = 8 pigeons | Identical stimuli | FR 1 vs. FR 20 |
| Clement and Zentall, 2002 | Experiment 1: N = 8 pigeons | Experiment 1: different stimuli | Differential effort vs. anticipated differential effort |
| | Experiment 2: N = 8 pigeons | Experiment 2: different stimuli | Expected vs. unexpected food |
| | Experiment 3: N = 16 pigeons | Experiment 3: different stimuli | Assessed positive and negative contrast; high-probability food vs. low-probability food |
| DiGian, Friedrich, and Zentall, 2004 | N = 16 pigeons | Identical stimuli; different stimuli | 0-s delay vs. 6-s delay |
| Friedrich and Zentall, 2004 | Experiment 1: N = 12 pigeons | Identical stimuli | FR 1 vs. FR 30 |
| | Experiment 2: N = 6 pigeons | Identical stimuli | FR 1 vs. FR 30; uncorrelated with specific terminal component |
| Friedrich, Clement, and Zentall, 2005 | N = 8 pigeons | Different stimuli | Food vs. no food |
APPENDIX Extended

| MC initial preference assessed? | Terminal component discrimination task? | TC initial pref assessed? | Effect found? |
|---|---|---|---|
| Yes | Yes; simultaneous | No | Yes for S+ |
| Yes | Yes; simultaneous | No | Yes for S+ |
| No | Yes; simultaneous | No | Yes for S+; No for S− |
| No | Yes; simultaneous or successive. Tested with initiating event | No | No; dependent on initiating event |
| No | Yes; simultaneous | No | No; dependent on initiating event |
| No | No; grape- or bacon-flavored pellet delivered | No | No |
| No | No; food delivered in different arms of Y maze | No | Yes |
| No | No; colored key presented | No | No |
| No | No; colored key presented | No | No |
| No | No; colored key presented | No | Yes |
| No | Yes; simultaneous | No | Yes |
| No | Yes; simultaneous | No | Yes for S+; Yes for S− |
| No | Yes; simultaneous | No | Yes for S+; Yes for S− |
| No | Yes; simultaneous | No | Yes for S+ and S− in positive and negative contrast group; No for S+ and S− in negative contrast group |
| No | Yes; simultaneous | No | Yes for S+ in identical stimuli group; No for S+ in different stimuli group; No for S− in both groups |
| No | No; left or right feeder available | Yes | Yes |
| No | No; alternation between left or right feeders | Yes | No |
| No | Yes; simultaneous | No | Yes for S+; Yes for S− |
APPENDIX Continued

| Author | Population | Initial component | Middle component |
|---|---|---|---|
| Gipson, Miller, Alessandri, and Zentall, 2009 | N = 16 pigeons | Identical stimuli | FR 1 vs. FR 30; 50% or 100% reinforcement |
| Kacelnik and Marsh, 2002 | N = 12 wild-caught starlings | No initial component | 4-flight effort vs. 16-flight effort |
| Klein, Bhatt, and Zentall, 2005 | N = 32 undergrad students | Identical stimuli | FR 1 vs. FR 20 or FR 30 |
| Marsh, Schuck-Paim, and Kacelnik, 2004 | N = 12 wild-caught starlings | No initial component | Food deprived vs. prefed |
| O’Daly, Meyer, and Fantino, 2005 | Experiment 1: N = 20 pigeons | Identical stimuli; different stimuli | VI 100 s vs. VI 10 s |
| | Experiment 2: N = 8 pigeons | Different stimuli | VI 100 s vs. VI 10 s |
| Pompilio and Kacelnik, 2005 | N = 6 wild-caught starlings | No initial component | Food deprived vs. prefed and delay: FI 10, 12.5, 15, and 17.5 s |
| Pompilio, Kacelnik, and Behmer, 2006 | N unknown; grasshoppers | No initial component | Low nutritional state vs. high nutritional state |
| Singer, Berry, and Zentall, 2007 | Experiment 2: N = 8 pigeons | Different stimuli | Preferred vs. nonpreferred schedule: DRO or FI |
| Vasconcelos, Urcuioli, and Lionello-DeNolf, 2007 | Experiments 1–3, 5: N = 8 pigeons; Experiment 4: N = 4 pigeons | Experiments 1–5: identical stimuli | Experiments 1–4: FR 1 vs. FR 20; Experiment 5: FR 1 vs. FR 40 |
| | Experiment 6: N = 16 pigeons | Experiment 6: identical stimuli; different stimuli | Experiment 6: FR 1 vs. FR 0 |
| Vasconcelos and Urcuioli, 2008 | Experiment 1: N = 4 pigeons | No initial component | High deprivation vs. low deprivation |
| | Experiment 2: N = 6 pigeons | No initial component | High deprivation vs. low deprivation |
| Vasconcelos and Urcuioli, 2009 | Experiment 1: N = 6 pigeons | Different stimuli | FR 1 vs. FR 30 |
| | Experiment 2: N = 4 pigeons | Different stimuli | 4 runs vs. 16 runs between panels |
| Waite and Passino, 2006 | N = 11 semitame gray jays | No initial component | 60-cm (high effort) vs. 1.9-cm (low cost) distance |
Note. This table summarizes research by subject type, initial component, middle component (MC), terminal component (TC), and effect. Effects were reported based on the individual article. If researchers reported a statistical measure, that was used to determine effect. If researchers reported individual subject data, that was used to determine effect.
APPENDIX Extended, Continued

| MC initial preference assessed? | Terminal component discrimination task? | TC initial pref assessed? | Effect found? |
|---|---|---|---|
| No | Yes; simultaneous | No | Yes for S+ with 100% reinforcement; No for S+ with 50% reinforcement; Yes for S− for both 50% and 100% reinforcement |
| No | No; colored key presented | No | Yes for 10 of 12 starlings |
| | | | No for S− |
| No | No | No | Yes |
| No | No | No | Yes for 15 of 17 pigeons |
| No | No | No | Yes for 5 of 8 pigeons |
| No | No | No | Yes for state (deprived vs. prefed) |
| No | No; lemon grass or peppermint odor presented | No | Yes |
| Yes | Yes; simultaneous | No | Yes for S+; S− untested |
| No | Experiments 1–5: Yes; simultaneous | No | No |
| No | Yes; simultaneous | No | No |
| No | No | No | Yes |
| No | Yes; simultaneous | No | No for S+; No for S− |
| No | Yes; simultaneous | No | No for S+; No for S− |
| No | Yes; simultaneous | No | No |
| No | No; color-coded foraging opportunities | No | No |
