3 - Page, S., - Neuringer, A. (1985) - Variability Is An Operant
Variability Is an Operant
Pigeons were rewarded if their pattern of eight pecks to left and right response keys during the current trial differed from the patterns in each of the last n trials. Experiments 1 and 2 compared Schwartz's (1980, 1982a) negative findings (variability was not controlled by reinforcement) with the present positive results and explained the difference. Experiment 3 manipulated n and found that the pigeons generated highly variable patterns even when the current response sequence had to differ from each of the last 50 sequences. Experiment 4 manipulated the number of responses per trial; variability increased with increasing responses per trial, indicating that the pigeons were acting as quasi-random generators. Experiment 5 showed that for high levels of variability to be engendered, reinforcement had to be contingent on response variability. In a yoked condition, where variability was permitted but not required, little response variability was observed. Experiment 6 demonstrated stimulus control: Under red lights the pigeons generated variable patterns, and under blue lights they repeated a particular fixed pattern. We concluded that behavioral variability is an operant dimension of behavior controlled by contingent reinforcement.

This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
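The contingency summarized above (reinforce the current eight-peck sequence only if it differs from each of the last n sequences) can be sketched in a few lines. This is a hypothetical reconstruction for illustration, not the authors' original program; the function name, the coin-flip responder, and the parameter defaults are our assumptions:

```python
import random

def lag_n_session(lag=5, trials=50, seq_len=8, seed=1):
    """Simulate a coin-flipping responder under a Lag-n variability
    contingency: a trial is reinforced only if its sequence of left (L)
    and right (R) pecks differs from each of the last `lag` sequences."""
    rng = random.Random(seed)
    history = []                       # every emitted sequence, oldest first
    reinforced = 0
    for _ in range(trials):
        seq = "".join(rng.choice("LR") for _ in range(seq_len))
        if seq not in history[-lag:]:  # look back over the last `lag` trials
            reinforced += 1
        history.append(seq)
    return reinforced / trials
```

With only 256 possible eight-peck sequences, a random responder rarely repeats one of its last 5 sequences, so it earns reinforcement on nearly every trial; this is the logic behind the random-generator baselines used throughout the article.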
429
430 SUZANNE PAGE AND ALLEN NEURINGER
reinforced novel behaviors in two porpoises. Reinforcers were delivered for any movement that in the opinion of the trainer did not constitute normal swimming motions and that had not been previously reinforced. Different types of behaviors emerged, including several that had never before been observed in animals of that species. In a more controlled setting, Schoenfeld, Harris, and Farmer (1966) delivered rewards to rats only if successive interresponse times fell into different temporal class intervals. This procedure produced a low level of variability in the rat's rate of bar pressing. In the most sophisticated and demanding experiment to date, Blough (1966) reinforced the least frequent of a set of interresponse times and obtained interresponse time distributions in three pigeons that approximated a random exponential distribution. Similarly, Bryant and Church (1974) rewarded rats 75% of the time for alternating between two levers and 25% of the time for repeating a response on the same lever and found that for 3 of 4 subjects, the resulting choice behaviors could not be distinguished from chance. People have also been rewarded for behaving variably in more natural environments (Holman, Goetz, & Baer, 1977).

Although the above studies indicate that variability can be reinforced, the conclusion is not secure. When the response in question is a complex behavior, such as the responses studied by Pryor et al. (1969) and Holman et al. (1977), the observer plays an important role in defining a novel response, and the novelty observed might be as much a function of the observer as of the subject. Moreover, Schwartz (1982b) has argued that the Pryor et al. findings with porpoises can be attributed to the effects of repeated extinction and reconditioning, with extinction occurring increasingly rapidly in progressive instances, and not to the direct reinforcement of novelty. Because extinction increases variability (e.g., Antonitis, 1951), the variability observed may have been a by-product of the reinforcement schedule. A similar argument can be made with respect to both Blough (1966) and Bryant and Church (1974). Although unlikely, it is possible that the reinforcement schedules somehow elicited interresponse time variability in the Blough experiment and stay-alternate variability in the work of Bryant and Church. Here, too, the argument is that variability was not reinforced (variability is not an operant) but the schedules of reinforcement engendered variability as a respondent by-product. The control experiments necessary to distinguish between the operant and respondent alternatives were not performed.

The most serious evidence against variability as an operant dimension of behavior comes from Schwartz (1980, 1982a). Pigeons were required to generate on two response keys a pattern of eight responses that differed from the last sequence of eight responses. A 5 × 5 matrix of lights provided the pigeons with continual feedback concerning their distribution of pecks. The light started in the upper left-hand corner; each left-key peck (L) moved the light one column to the right and each right-key peck (R) moved the light down one row. In preliminary training, pigeons were simply required to move the light from upper left to lower right. There was no requirement to vary the pattern. If the birds responded more than four times on either key (thereby moving the light off the matrix), the trial terminated without reinforcement. The pigeons were highly successful, obtaining 70% to 90% of available reinforcers by generating repetitive and energy-conserving patterns (e.g., LLLLRRRR). In the experiments of most concern to the present work (Schwartz, 1980, Experiment 4; 1982a, Experiment 1), in addition to requiring four pecks on each key, the schedule demanded that each eight-response sequence differ from the immediately preceding sequence. Now, when the pigeons had to vary their response patterns to be rewarded, only 40% of the available reinforcers were earned. Number of different sequences emitted, an index of learned variability, did not appreciably increase.

The Schwartz findings, though apparently robust, appeared to conflict with those of Pryor et al. (1969), Blough (1966), and Bryant and Church (1974). Furthermore, Neuringer (1984, 1985) showed that when provided with feedback, people learned to generate highly variable response sequences, indeed sequences so variable that they could be described as random. To explore the apparent disagreement between Schwartz's work and these
The rationale for Experiment 1 derives from the query, How frequently would a random generator be rewarded under the Schwartz (1980, Experiment 4; 1982a, Experiment 1) contingencies where a sequence of eight responses had to differ from the previous sequence for reinforcement? If the Schwartz contingencies reinforced response variability, as was intended, many reinforcers would be given to a random response generator. However, we found that a computer-based random number generator, programmed to respond randomly left or right under Schwartz's contingencies, was rewarded on only 29% of the trials. This random generator was successful on somewhat fewer trials than were Schwartz's pigeons. The random generator's performance can be explained on the following theoretical grounds. The number of distinct eight-response sequences, given two alternatives, is 256, or 2⁸. The number of such sequences meeting the requirement of no more than four responses on each key is 70. Therefore, the probability that a sequence meets the no-more-than-four requirement is 0.27, or approximately the percentage of times that the random generator was successful. According to this reasoning, the no-more-than-four requirement was responsible for the random simulator's low success rate. To determine whether these theoretical considerations can explain Schwartz's findings, the present experiment compared Schwartz's schedule with one in which the requirement of no more than four responses per key was omitted.

Method

Subjects

Three experimentally naive pigeons (Nos. 28, 68, and 71) and 1 with previous experience in conditioning

high. The front and rear walls were aluminum, one side wall was pressed board, the other was Plexiglas, and the floor and ceiling were wire mesh. The chamber was housed in a sound-attenuating box equipped with a fan for ventilation and masking noise. A one-way mirror in the outer box, parallel to the Plexiglas chamber wall, permitted observation.

The front wall contained two Gerbrands pigeon keys, each 2 cm in diameter, that were operated by withdrawal of an applied force. The keys were 21.5 cm above the floor; one key was directly above a 4.5-cm × 5.5-cm hopper opening that was itself 7.5 cm above the floor and centered on the panel, and the second key was 4.5 cm (center to center) to the right of the first. A Gerbrands food magazine delivered mixed pigeon grain through the hopper opening. Each key could be transilluminated with a 7.5-W white light, and a 7.5-W white house light was located above the mesh ceiling, directly above the center key. The house light was continuously illuminated except during reinforcement, when the hopper was lighted by a 7.5-W white light.

A 5 × 5 matrix of yellow 5-W cue lights, each 0.75 cm in diameter, was located along the wall to the left of the response keys. The last column of the matrix was 4 cm from the front wall. Columns were 2.5 cm, center to center, and rows were separated by 2 cm, center to center. Each of the 25 lights could be separately illuminated. Stimulus events were controlled and responses were recorded by a Commodore VIC-20 computer through an integrated circuit and relay interface. Data were recorded on cassette tape at the end of each session and were later transferred to a Digital Equipment PDP-11/70 computer for analysis.

Procedure

Pretraining. All pigeons were trained to peck the keys under a modified autoshape procedure derived from Schwartz (1980). After variable intertrial intervals (keys dark) averaging 40 s, one of the two keys, randomly selected, was illuminated for 6 s, followed by presentation of grain reinforcement for 4 s. Pecks to a lighted key immediately produced the reinforcer and darkened the key. Autoshape training continued for four to six sessions after the first session in which a pigeon pecked both lighted keys. Each session terminated after the 60th reinforcer.

A comparison of two main conditions followed. In the variability (V) condition, a sequence of eight left and right responses had to differ from the previous sequence for reinforcement. This was compared with a variability-plus-constraint (VC) condition, where, in addition, exactly
four responses had to occur on each key, a condition analogous to Schwartz (1980, Experiment 4; 1982a, Experiment 1).

Variability: Lags 1 and 5. Each trial consisted of eight pecks distributed in any manner over the two keys; each session comprised 50 trials. At the start of each trial, both keys were illuminated with white light. A peck to either key immediately darkened both key lights and initiated a 0.5-s interpeck interval. Pecks during the interpeck interval reset the interval but were not counted and had no other consequence. The eighth peck to a lighted key terminated the trial with either a reinforcer (3.5 s of access to mixed grain) or a 3.5-s time-out, during which the keys were darkened but the house light remained illuminated. Pecks during the time-out reset the 3.5-s timer but had no other consequence. The only difference between time-out and interpeck interval was the duration of these events. Under the Lag 1 condition, whether a trial ended in reinforcement or time-out depended on whether the sequence of eight pecks on that trial differed from the last sequence of eight pecks, that is, whether the response pattern on trial n differed from trial n - 1. If the sequences on the two trials differed, a reinforcer was given; if the sequences were identical, time-out occurred. Thus, for example, if on Trial 10 the bird pecked LRLLLRRL, Trial 11 would be reinforced unless the bird repeated that sequence. The three experimentally naive birds (Nos. 28, 68, and 71) received 5 to 13 sessions of this Lag 1 training. Because most trials ended with a reinforcer (the pigeons successfully varied their sequences), greater variability was demanded by increasing the look-back value to Lag 5. Now, for reinforcement, the response sequence on trial n had to differ from each of the sequences on trials n - 1 through n - 5, inclusive; otherwise time-out occurred. The one previously experienced bird (No. 58) began the experiment with Lag 5 contingencies. Subjects received 15 to 18 sessions of Lag 5 until the percentage of reinforced trials remained stable over at least 5 sessions. The matrix lights were not used under these variability procedures.

Variability-plus-constraint: Lag 1. Except where noted, the contingencies and parameters here were the same as in the Lag 1 variability condition. As in Schwartz (1980, Experiment 4; 1982a, Experiment 1), to be reinforced, the birds had to peck exactly four times on each key with a sequence that differed from the last trial. Thus, there were two ways for a subject to produce time-out: (a) Respond four times on the left and four times on the right with a sequence that was identical to the last sequence, or (b) respond more than four times on either of the keys. The major difference between the V and the VC conditions was that the eight responses could be distributed in any manner under the former schedule, whereas exactly four responses per key were required under the latter.

The 5 × 5 cue light matrix functioned as in Schwartz. At the start of each trial, the top left matrix light was illuminated. A peck to the left key darkened that light and lit the one to its right. Pecks to the right key darkened the presently illuminated light and lit the one immediately below it. A fifth peck to either key moved the matrix light off the board, darkening the currently illuminated light and initiating time-out. The VC procedure was continued until the percentage of reinforced trials for each subject was stable over 5 sessions, or 22 to 31 sessions.

Return to variability: Lag 5. The Lag 5 variability contingencies used in the first phase were reinstated for 6 to 20 sessions. The eight responses could again be distributed in any manner; that is, the requirement of no more than four responses per key was removed. This variability phase differed from the original in that during the first five trials of each session, the bird's sequences had to differ from the last five sequences, including sequences emitted during the last five trials from the previous session. That is, the comparison set was made continuous over sessions so that, for example, the fourth trial in a session would be reinforced only if the fourth sequence differed from the preceding three sequences of that session and the last two sequences of the previous session.

Owing to a programming error, the timing of experimental events (reinforcement, time-out, and interpeck interval) was altered when the comparison set was made continuous over sessions: All timed events were lengthened by a factor of 10/6; reinforcement and time-out were 5.83 s and interpeck interval was 0.83 s. The effect of these changes was examined in this experiment and more directly in Experiment 5, below, which shows that these timing parameter changes had no discernible effect on sequence variability.

After completion of the research with pigeons, the VIC-20 computer's random number generator was used to generate left and right responses under both V and VC conditions identical to those experienced by the pigeons. This simulation by a random generator permitted comparison of the pigeon's sequence variability with a random standard.¹

Results

Two basic measures were derived from Schwartz (1980, 1982a). First, percentage of reinforced trials per session (number of reinforced trials divided by total trials) indicated how often the pigeons met the contingency requirements, or the probability that the sequence was correct. The second measure, which provided a more direct index of variability, was the percentage of different sequences per session (the number of distinct eight-response patterns divided by the total number of trials). A sequence was termed distinct if it differed in any way from all previous sequences in a given session. Note that subjects could demonstrate mastery of the contingency requirement (percentage of reinforced trials could be very high) even

¹ Computer-based random number generators are often referred to as pseudorandom or quasi random because the output is generated from an equation. For ease of presentation we shall refer simply to the random generator.
though the percentage of different sequences remained low. Under Lag 1 conditions, for example, where a sequence had to differ only from the last trial, if a bird alternated between two sequences, all trials would be reinforced. However, percentage different would be 2/50, or 4%, a low value.

Figure 1 shows the percentage of reinforced trials for each of the four subjects averaged over the last five sessions of each condition. for the 3 subjects that experienced Lag 1 training was a similarly high 94%.)

Figure 1. Percentage of reinforced trials per session during variability (V), variability-plus-constraint (VC), and replication of variability (V) conditions. (Filled points = arithmetic averages over the final five sessions; bars = medians of the pigeons' performances; open diamonds = simulated performance from a computer-based random number generator.)

Performances under the original and replicated V conditions did not differ statistically, and for purposes of further statistical analyses, the initial and replicated V data were averaged and then compared with VC data. The percentage of reinforced trials under V was significantly higher than the percentage of reinforced trials under VC, t(3) = 14.44,

Figure 1 also shows that the performance of a simulated random generator (open diamonds) yielded percentage reinforcement values of 96% under the V condition and 29% under VC. In terms of successfully meeting the schedule requirements, the pigeons and random generator were similarly affected by the conditions of this experiment.

Figure 2 shows that during the last five sessions of V and VC conditions, respectively, more than 70% of response patterns differed from all previous sequences in the session. The two conditions did not differ statistically. Two pigeons showed a decrease in percentage of different sequences from initial V to VC conditions and a subsequent increase when V was reinstated; all 4 pigeons showed an increase of percentage difference scores from VC to the replicated V condition. This slight tendency for there to be a greater number of different sequences under the V condition was also observed in the simulated bird's performance, shown by the open diamonds. Because the output of the random generator could not have been affected by the reinforcers, the only explanation is that there were fewer possible sequences under VC (where trials were terminated by a fifth response to either key) than under V (where all trials required eight responses).

Discussion

Under variability-plus-constraint conditions, only 42% of trials were reinforced, a finding that essentially replicates Schwartz (1980), in which 36% of trials ended in reinforcement. In sharp contrast, under the variability condition, 90% of trials were reinforced. The main question for discussion is why V and VC conditions yielded such different success rates. The procedural differences between the two conditions complicate interpretation: Matrix lights were present under VC and not V; sequences had to differ from the last trial under VC and from the last five trials under V; a total of 70 sequences were potentially reinforceable under VC (all sequences of eight responses having exactly
cies, only approximately one third of the trials would end with a reinforcer. On the other hand, the same random responses would be reinforced on more than 99% of trials under an analogous Lag 1 contingency where the constraint of no more than four responses per key was absent. An additional analysis of the computer-simulated data showed that during 250 simulated trials under the VC condition, whenever a trial was not reinforced,

[Figure 2: percentage of different sequences per session]

Results

The left portion of Figure 3 shows average percentage of reinforced trials per session over the last five sessions in each condition. These results essentially replicate Experiment 1: When pigeons were permitted to distribute their behaviors on the two keys in any fashion, they achieved reinforcement on more than 80% of the trials. However, when a four-response-per-key constraint was added, percentage reinforcements fell to slightly higher than 40%, a significant decrease, t(3) = 6.667, p < .01.

The right side of Figure 3 shows the percentage of sequences that differed from every previous sequence in each session averaged over the last five sessions. The percentage different scores under the VC condition were higher than under V in three of the four pigeons. That is, the birds were more variable under VC than they were under V. This difference was not statistically significant.

Figure 4 shows why the relatively high variability under VC resulted in relatively infrequent rewards. The percent of nonreinforced trials under VC (left bar) is divided into two categories, shown by the two right bars. One cause for nonreinforcement was termination of a trial by a fifth response on either key. The other cause was repetition of the immediately preceding eight-response sequence. Terminated trials accounted for 53% of all trials; repetitions accounted for less than 2%.

Figure 3. Percentage of reinforced trials (left two bars) and percentage of different sequences (right bars) in the variability (V) and variability-plus-constraint (VC) conditions. (Filled points = averages over the final five sessions; bars = medians of the pigeons' performances.)

Discussion

The contingencies and parameters in the present VC condition were identical to those used by Schwartz (1982a). The contingencies under V were, as far as possible, identical to the VC condition with one important exception: Under V there were no constraints on the distribution of responses. We conclude that constraining the response distribution to four responses per key in VC and in Schwartz (1980, 1982a) caused the low frequencies of obtained rewards.

The pigeons generated fewer different sequences under V than under VC. As noted above, little variability was required by a Lag 1 contingency (one successful strategy being to alternate between only two sequences). Thus, the variability engendered by the V contingencies appeared to be sensitive to the degree of variability required (see below for confirmation). Furthermore, despite the relatively high variability demonstrated by the birds in the VC condition, reinforcement frequency remained low because of the response-distribution requirement.

Figure 4. Percentage of trials that were not reinforced and division of nonreinforced trials into terminated (more than four responses emitted on a key) and repeated (a given eight-response sequence repeated on two consecutive trials). (Filled points = averages from final five sessions under the variability-plus-constraint condition; bars = medians of the four pigeons' performances.)

Although the percentage of reinforced trials under the present VC contingencies approximated those obtained by Schwartz (1980, 1982a), one result differed. Schwartz found that birds emitted few different sequences, that one modal sequence came to dominate,

decreased with training (see Experiment 5, below, for the opposite conclusion), it is possible that with continued training under VC, response stereotypies would have developed.

Experiment 3: How Variable? Lag as Parameter

Variability is a continuum. The two previous experiments showed that pigeons earned frequent rewards under Lag 1 and Lag 5 look-back contingencies when there were no additional constraints on response distribution. This experiment asked whether pigeons could maintain high success rates when the variability requirement became increasingly stringent, that is, when the look back was increased. By the end of this experiment, to be rewarded, a pigeon had to generate a sequence that differed from every one of its last 50 trials.

Method

Subjects and Apparatus

Two experimentally naive pigeons (Nos. 70 and 73) and two previously experienced pigeons (Nos. 59 and 61) were housed and maintained as in Experiment 1. The apparatus used in this experiment was the same as in Experiment 1, but the matrix lights were not used.
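The effect of raising the look-back value on a purely random responder can be previewed numerically. The sketch below is our illustration, not the authors' VIC-20 simulation; it estimates, for each lag, the proportion of trials on which a fair-coin responder would be reinforced, analogous to the broken-line simulations shown in Figures 5 and 6:

```python
import random

def lag_sweep(lags=(1, 5, 10, 15, 25, 50), trials=2000, seq_len=8, seed=42):
    """For each look-back value, estimate how often a fair-coin responder
    emits an eight-peck sequence differing from each of its previous
    `lag` sequences."""
    rates = {}
    for lag in lags:
        rng = random.Random(seed)
        history, wins = [], 0
        for _ in range(trials):
            seq = "".join(rng.choice("LR") for _ in range(seq_len))
            if seq not in history[-lag:]:  # the Lag-n contingency
                wins += 1
            history.append(seq)
        rates[lag] = wins / trials
    return rates
```

Because only 2⁸ = 256 sequences exist, the chance that the current sequence matches one of the last 50 is roughly 50/256, so some decline in reinforcement at Lag 50 is expected even for an ideal random generator.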
lag requirement was increased to 5: For reinforcement, sequences had to differ from those in each of the last 5 trials. The two experimentally sophisticated pigeons began the experiment at this Lag 5 value. There followed 10 to 21 sessions under Lag 10, 8 to 25 sessions under Lag 15, 10 to 38 sessions under Lag 25 (No. 59 remained at Lag 25 until the end of the experiment because its performance never reached stability), and (for the 3 remaining subjects) 23 to 45 sessions under Lag 50. Throughout the experiment, the lag value was changed only after a subject's percentage of reinforced trials had become stable over at least 5 sessions. Midway in the Lag 10 condition, the procedure was changed (as described in Experiment 1) so that the comparison set was made continuous with the previous session. Thus, for example, under the Lag 50 condition, if the subject were responding on its 11th trial in a session, for a reward to be presented, the current sequence had to differ from the 10 trials already completed in the present session and the last 40 trials of the previous session. Because of the same programming error described in Experiment 1, from the midpoint of Lag 10 through the end of the experiment, all timed events were increased by a factor of 10/6: Reinforcement and time-out were 5.83 s rather than the original 3.5 s, and interpeck interval was 0.83 s rather than the original 0.5 s.

Results and Discussion

Figure 5 shows average percentage of reinforced trials over the last five sessions at each lag, or look-back value. The solid line connects the medians of the four pigeons, and the broken line shows a random generator simulation under identical conditions. From Lags 5 through 25, more than 85% of the pigeons' sequences met the variability requirements and were reinforced. At Lag 50, there was a decrease to 67%. This same decrease in percentage reinforced was seen in the random generator's simulated data. Thus, the pigeon's data again paralleled the data of a random generator.

Figure 5. Percentage of reinforced trials per session as a function of lag (look back) value. (Filled points = averages over final five sessions at each lag value for each subject; solid line = medians; broken line = performance of a simulating random generator under identical conditions.)

To obtain the frequencies of reinforcement shown in Figure 5, the pigeons must have generated highly variable response sequences. One index of this high variability is shown in Figure 6. As the lag requirement increased from 5 to 25, the percentage of sequences that differed from all previously emitted sequences in a session increased from 66% to 87%. As discussed above, to maintain a high frequency of reinforcement under a low lag requirement, the bird did not have to emit very many different sequences. As the length of the look back increased, however, increasingly greater numbers of different sequences were demanded. Sensitivity to these changing requirements is indicated by the increasing function from Lag 5 to Lag 25. The small decrease to 81% different at the
Lag 50 value may be a respondent effect, correlated with the lowered frequencies of obtained reinforcements. As shown by the performance of the random generator in Figure 5, no matter how variable the behavior, as lag requirement continued to increase, a decrease in the number of rewards gained resulted.

Figure 6. Percentage of different sequences per session as a function of lag value. (Filled points = averages over final five sessions at each lag value for each subject; solid line = medians; broken line = performance of a simulating random generator under identical conditions.)

More detailed indices of overall response variability are given in Figure 7, which shows three measures of variability (U values, or average uncertainty) as functions of lag. The U values were computed according to the following equations:

U1 = -(Σ pi log2 pi) / log2(2)

U2 = -(Σ pi log2 pi) / log2(4)

U3 = -(Σ pi log2 pi) / log2(8)

where, for U1, pi equals the probabilities of L and R responses; for U2, pi equals the probabilities of LL, LR, RL, and RR response pairs; and for U3, pi equals the probabilities of LLL, LLR, LRL, LRR, RLL, RLR, RRL, and RRR triplets. The U measure, derived from information theory (Miller & Frick, 1949), varies between 0.0 and 1.0, with 0.0 indicating that all responses are perfectly predictable or ordered and 1.0 indicating maximum uncertainty. The U values were calculated by concatenating all responses without regard to reinforcement or time-out and computing the relative frequencies of left and right responses (U1), pairs of responses (U2), and triplets of responses (U3). When left and right were approximately equal, U1 approached 1.0; when all possible pairs of responses approached equality, U2 approached 1.0; and when all possible sequences of responses taken three at a time approached equality, U3 approached 1.0. Figure 7 shows that as lag values increased (i.e., as requirement for variability increased) the averages of the 4 pigeons' U1, U2, and U3 values increased. (These averages well represent the individual functions and are presented to save space.) At the lag value of 25, the average of the 4 pigeons' U values approximated the U value of the random number generator (shown by the open diamonds; only a single line is drawn because U1, U2, and U3 values were approximately the same for the random generator).
OPERANT VARIABILITY 439
Figure 7. Average sequence variability as functions of lag. (Each of the lines connects the medians of the pigeons' average performances over final five sessions of each lag value. U1 = uncertainty for responses taken one at a time; U2 = uncertainty for responses considered in pairs; U3 = uncertainty for response triplets. Open diamonds = analogous data for simulating random number generator, where, because the three U values were almost identical, a single point indicates the values.)
Once again, there was a slight tendency for variability to decrease at Lag 50. The closeness of the three U functions to one another indicates high variability and absence of higher order biases or stereotypies. (Note that we also examined U4 through U8, and these contained similar information to that shown in U1 through U3.)

Two possibly confounding influences were present in Experiment 3. First, the timing parameter changed during the Lag 10 condition, and therefore the Lag 15 through Lag 50 conditions contained longer reinforcement and time-out durations than the Lag 5 and Lag 10 conditions. However, Experiment 1 showed that there was no statistically significant effect of these timing differences, a result supported in Experiment 5, below. Second, the lag requirements increased from low values to high, and therefore the form of the obtained function may partly be due to the order of experience. We thought that pigeons could not tolerate high lag requirements before they experienced training under lower requirements, a hypothesis shown to be incorrect in Experiment 5. The general form of the subjects' percentage reinforced function paralleled that of the simulated random function; this finding was again consistent with the hypothesis that the pigeons were generating quasi-random sequences. Although the variability of the computer-based random generator was, of course, unaffected by the reinforcement schedule, the pigeons' variability appeared to be controlled by the reinforcers. When the schedule demanded relatively little variability (Lag 5), variability was relatively low. As the variability requirement increased to Lag 25, so too did variability of performance. However, at Lag 50, when the obtained frequency of reinforcement decreased despite random performance (as indicated by the simulating random generator), again the birds' variability decreased. Thus, high variability was engendered only when it was differentially reinforced.

Experiment 4: Quasi-Random Versus Memory Strategies: Number of Responses as Parameter

The present experiment asked how the pigeons generated their variable response sequences: What mechanism or strategy accounts for such highly variable performance? The previous experiments alluded to one possible strategy, that of a quasi-random generator, but alternative strategies involving remembering previous sequences would also do. For example, a subject could learn a long random response sequence (see Popper, 1968) or utilize a rational system (e.g., first emit 8 left responses; then 7 left and 1 right; then 6 left, 1 right, and 1 left; etc.). It is not here being suggested that pigeons can count in binary but rather that their behavior could be described as systematic. Any systematic strategy would involve a rather large memory load, for the pigeon would have to remember where it was in its system. The present experiment attempted to contrast the quasi-random and memory hypotheses in the following way. If the number of responses required for each sequence were increased, performance based on a memory strategy should be adversely affected, for it is easier to remember a four-response sequence than an eight-response sequence. On the other hand, if the bird were acting as a quasi-random generator, success rates should improve with increasing responses per trial, because by the laws of chance, a random generator would be more likely to repeat sequences comprising four responses (1 chance in 16 under Lag 1) than eight responses (1 chance in 256). Thus, the memory and quasi-random generator hypotheses make opposite predictions. If the subject's rewards increased with increasing responses per trial, the quasi-random hypothesis would be supported; if the rewards decreased with increasing responses per trial, a memory strategy would be supported.

Method

Subjects and Apparatus

The subjects and apparatus were the same as in Experiment 1.

Procedure

The procedure was identical to that in Experiment 3, except the lag requirement was kept constant at Lag 5 and the number of responses per trial, or sequence length, varied in an ABCB format. The pigeons were first given 24 to 29 sessions under a six-responses-per-trial condition (there were 64 possible sequences); then 12 to 38 sessions under eight responses per trial (identical to the Lag 5 variability conditions in Experiments 1 and 3 where there were 256 possible sequences); 9 to 23 sessions under four responses per trial (16 possible sequences); and another 6 to 9 sessions under eight responses per trial. The reinforcement and time-out intervals were 5.83 s throughout, and the interpeck interval was 0.83 s.

Results and Discussion

Figure 8 shows the average percentage of the 50 available rewards obtained over the last five sessions at each responses-per-trial value. The eight-response value represents the mean of the two eight-response phases, which were statistically indistinguishable from one another. For all subjects, as number of responses per sequence increased, percentage of reinforced trials increased monotonically. An analysis of variance with repeated measures showed an overall significant difference among the three conditions, F(2, 6) = 38.21, p < .001, and analytical pairwise comparisons showed that each condition differed from every other: four versus eight responses, F(1,

Figure 8. Percentage of reinforced trials as a function of number of responses per trial. (Filled points = averages over final five sessions at each condition for each of the pigeons; solid line = medians; broken line = data from simulating random generator under identical conditions.)
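The chance argument behind the quasi-random prediction can be made concrete with a small simulation. This is a hypothetical sketch, not the authors' code; `simulate` and its parameters are our own names, and a fair coin stands in for the random generator:

```python
import random

def simulate(n_responses, lag, trials=2000, seed=1):
    """Reinforcement rate of a random L/R generator under a lag-n 'must differ' rule."""
    rng = random.Random(seed)
    history, reinforced = [], 0
    for _ in range(trials):
        seq = tuple(rng.randrange(2) for _ in range(n_responses))
        if seq not in history[-lag:]:  # reinforce only if unlike each of the last `lag` trials
            reinforced += 1
        history.append(seq)
    return reinforced / trials

# Analytically, the chance of differing from each of the last 5 trials is about
# (15/16)**5, roughly .72, with four responses, and (255/256)**5, roughly .98,
# with eight responses, so success should rise with sequence length.
print(simulate(4, 5), simulate(8, 5))
```

The simulation reproduces the ordering the text predicts: a random generator earns more rewards with eight responses per trial than with four.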
diameter Gerbrands response keys with their centers 7.5 cm from each other and from the side walls and 21.5 cm above the mesh floor. Keys could be transilluminated with 7.5-W blue bulbs. A response registered when applied force was withdrawn. The middle of the three keys was covered with black tape and could not be operated. Directly below this middle key was a round hopper opening, 5 cm in diameter, with its midpoint 10 cm from the floor, through which a Gerbrands magazine could provide mixed pigeon grain reinforcement illuminated with a 7.5-W white bulb. House light was provided by two 7.5-W white bulbs above the wire mesh ceiling. As described in Experiment 1, VIC-20 computers controlled the experiment. Each pigeon was arbitrarily assigned to one of the two experimental chambers.

Procedure

The four subjects were first given autoshaping training as described in Experiment 1. The experimental procedure then followed an ABAA'B design, with A and A' representing Lag 50 variability contingencies and B representing a yoked variable ratio (yoked-VR) contingency in which variable sequences were not required. The variability procedure was identical to that in Experiment 1 except as follows. A Lag 50 requirement was present from the outset. As in the latter phases of Experiment 1, the comparison criterion was continuous over sessions. For example, for Trial 10 to be reinforced, the sequence of responses in that trial had to differ from each of the sequences in the 9 previous trials of the current session and the last 41 completed trials of the preceding session. The initial Lag 50 phase continued for 26 to 38 sessions or until each pigeon maintained a stable percentage of reinforced trials for 5 or more sessions. Throughout the present experiment, sessions terminated after the 50th reinforcer, or after 100 trials, whichever occurred first.

At the start of the B phase, the contingencies of reinforcement were changed so that the pigeons were reinforced on a yoked variable-ratio schedule derived from their individual performances under Lag 50. Under this yoked-VR, eight responses again constituted a trial, and trials were again sometimes followed by grain reinforcers and sometimes by time-out, but reinforcement and time-out presentation were now independent of sequence variability. Each pigeon's last 6 sessions under the Lag 50 variability contingencies were used to create its schedule of reinforcement under yoked-VR. Thus, each subject was yoked to itself, and the schedule of reinforced and nonreinforced trials under yoked-VR replicated the pattern of reinforcers and time-outs obtained under Lag 50 variability. The yoked reinforcement schedule lasted for 6 consecutive sessions and then was repeated. To illustrate, if Subject 44 had been rewarded after Trials 2, 5, 6, and 8, and so on, in the last session under the Lag 50 condition, then in yoked-VR Sessions 6, 12, 18, and so on, Subject 44 would be rewarded after Trials 2, 5, 6, and 8, and so on, regardless of which eight-response sequence was emitted. Trials 1, 3, 4, and 7, and so on, would be terminated by time-out. The yoked-VR contingencies continued until performance was stable, from 24 to 31 sessions, whereupon Lag 50 variability contingencies were reinstated and maintained for 17 or 18 sessions. Due to the programming error described in Experiment 1, the duration of events (reinforcement, time-out, and interpeck interval) was shorter (by a factor of 10/6) in yoked-VR than in either the preceding or the following Lag 50 phases. The reinforcer and time-out were 3.5 s in yoked-VR, as opposed to 5.83 s in Lag 50. The interpeck interval was 0.83 s in Lag 50 phases but 0.5 s in the yoked-VR. To compare directly the effects of the different time values on sequence variability, after each pigeon had reached stability on Lag 50 (second A phase containing the long times), the durations of the reinforcer, time-out, and interpeck interval were changed to those under yoked-VR with everything else held constant, thus permitting comparison of responding under Lag 50 long times with Lag 50 short times. After stable performances were reached in 10 to 20 sessions, the yoked-VR contingencies were reinstated for 17 to 32 sessions, thereby permitting comparison of Lag 50 and yoked-VR when all time parameters were identical. Schedules of reinforcers and time-outs in this second yoked-VR phase were derived from performances of each pigeon in the last 6 sessions of the A' Lag 50 (short times) phase.

Results and Discussion

The main results were that (a) variability was significantly higher under Lag 50 than yoked-VR, thereby demonstrating that the variability depended on contingent reinforcement; (b) experimentally naive pigeons placed directly on the Lag 50 schedule very quickly learned to vary, thereby indicating that variability was easy to condition; and (c) the 10/6 differences in reinforcer, time-out, and interpeck interval times had no discernible effect on performance, thereby indicating the robustness of the reinforcement-of-variability effect.

The left two bars of Figure 10 show mean percentages of different sequences per session for the first five (open bar) and last five (striped bar) sessions under the first Lag 50 (long times) schedule. Over the first five sessions, more than 50% of sequences differed from all previous sequences in the same session, and this value increased to more than 75% by the last five sessions. By the end of this Lag 50 phase, the pigeons were being rewarded after approximately 70% of their trials, an increase from 50% during the first five sessions. The increases from first to last five sessions in both percentage different trials, t(3) = 6.68, p < .01, and percentage reinforced trials, t(3) = 5.11, p < .025, were statistically significant.

The second set of bars, representing yoked-VR, shows that when the variability contingencies were removed, percentage of different sequences fell immediately until fewer than 20% of the sequences were different from previous sequences in the session. The difference between the last five sessions of the Lag 50 variability contingencies and the last five sessions of the yoked-VR was significant, t(3) = 7.46, p < .005.

Upon reintroduction of the Lag 50 contingencies, percentages of different sequences rose immediately. The leftmost of the four bars shows percentage different during the first five sessions of Lag 50 following yoked-VR, and the second of the four bars shows the last five sessions under this phase. Because of the differences in timing values (long times in Lag 50 and short times in yoked-VR), the observed effects might have been confounded. The reinforcement, time-out, and interpeck interval times were therefore changed under Lag 50 so that these times were now identical to those under yoked-VR. The third and fourth bars above the replicated Lag 50 show percentage different sequences during the first five and last five sessions under short time values. There was essentially no change in percentage of different sequences due to the different time values.
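The self-yoking logic of the B phases can be sketched as follows. This is an illustrative reconstruction under our own naming, not the original VIC-20 program: reinforcement in the yoked phase replays the trial-by-trial outcome pattern earned under Lag 50, independent of the sequence the bird actually emits.

```python
def lag50_outcomes(sequences, lag=50):
    """True where a trial's sequence differs from each of the last `lag` sequences."""
    outcomes, history = [], []
    for seq in sequences:
        outcomes.append(seq not in history[-lag:])
        history.append(seq)
    return outcomes

def yoked_outcomes(n_trials, recorded):
    """Yoked-VR: replay a recorded outcome pattern by trial position,
    ignoring which sequence is emitted on the current trial."""
    return [recorded[i % len(recorded)] for i in range(n_trials)]

# A bird repeating one fixed sequence earns only its first reward under Lag 50 ...
print(lag50_outcomes([("L",) * 8] * 4))
# ... but keeps its previously earned reward pattern under the yoked schedule.
print(yoked_outcomes(4, [True, False, True, True]))
```

The contrast in the two printed outcome lists is the point of the design: only the first schedule makes reward contingent on varying.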
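The "percentage of different sequences per session" measure plotted in Figure 10 can be computed as below. This is a sketch under our own naming; the authors' exact tallying rules may differ:

```python
def percent_different(sequences):
    """Percentage of trials in a session whose peck sequence differs
    from every sequence emitted earlier in that session."""
    seen, different = set(), 0
    for seq in sequences:
        if seq not in seen:
            different += 1
            seen.add(seq)
    return 100.0 * different / len(sequences)

# Two of these four trials repeat an earlier sequence, so 50% are "different".
print(percent_different(["LRLR", "LRLR", "RRLL", "LRLR"]))
```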
Upon return to the yoked-VR condition, variability of responding again decreased immediately, and once again the difference between the last five sessions of Lag 50 (now with short times) and last five sessions of yoked-VR (with identical short times) was significant, t(3) = 6.771, p < .01.

The fact that response variability depended on contingent reinforcement is shown in Figure 11 by the percentage of modal sequences, defined as the single sequence emitted more frequently than any other in a session. The ordinate shows the percentage of trials per session in which the modal sequence occurred. By the last five sessions of the first phase of Lag 50, the modal sequences accounted for about 4% of the sequences emitted per session. On the other hand, by the end of the first yoked-VR phase, modal sequences accounted for almost 50% of the sequences. Absence of significance between Lag 50 and yoked-VR, t(3) = 2.493, p = .0873, was due to the large spread of the individual subjects' data under the yoked-VR condition. When the schedule demanded high variability (Lag 50), all subjects emitted very few repetitions of any given sequence; when the schedule permitted variability but did not require it (yoked-VR), there were large intersubject differences. These same patterns of modal frequencies were replicated with return to Lag 50 (only the short time phase of Lag 50 is shown in the figure) and then to yoked-VR, with the difference during the last five sessions of these conditions being statistically significant, t(3) = 3.46, p < .05. In almost all cases, the sequence defined as modal under yoked-VR represented exclusive responding on one or the other key (e.g., eight left responses or eight right responses). Because reinforcement did not depend on any particular sequence, the final behavior was
stereotyped responding. Following an informal exploration of the effects of parameter manipulations, the second phase attempted to equalize the number of responses per trial in the two conditions and to generate approximately equal and intermediate levels of percentage of reinforced trials. A reversal of stimulus conditions followed.

Phase 1: Acquisition of stimulus control. Following autoshaping, the four pigeons were put on a multiple variability-stereotypy schedule in which the two components alternated after every 10th reinforcement. In the variability (V) component, both key lights were blue, and subjects had to meet a Lag 5 variability criterion identical to that described in Experiment 1, where reinforcement was contingent on a sequence of eight pecks on the two keys that differed from each of the preceding five sequences. The Lag 5 criterion was continuous with respect to other V components in the same session and across sessions.

In the stereotypy (S) component, both key lights were red and the pigeons had to emit an arbitrarily defined three-response sequence, namely, LRR, in that order. The first two correct responses in the sequence produced the same interpeck interval as in the V component, and the third correct response produced the reinforcer. An error during an S trial (e.g., if the second response were to the left key) immediately produced the same time-out as in V and reset the sequence to the beginning. Whereas the time-out in V occurred only after the eighth response, the time-out in S immediately followed any incorrect response in the sequence.

The V and S components continued to alternate after every 10th reinforcement until the bird earned a total of 60 rewards or until a maximum of 120 trials was reached in either component, whichever occurred first. After initial longer times, the durations of reinforcement and time-out were reduced to 5.0 s in both components (times for Subject 38 were decreased to 4.2 s after five sessions owing to its weight gain). Pecks in both V and S were separated by a 0.83-s interpeck interval.

Because most of the pigeons were at first unable to attain 10 rewards in S, the first few sessions began with the V component. After the 8th session, when responding in S had improved, the V and S components alternately began each session. In an attempt to further improve performances during S, the salience of time-out was increased by flashing the house light on for 0.33 s and off for 0.08 s during time-out in both the V and S components. Phase 1 training continued for 12, 16, 24, and 24 sessions for each of the 4 pigeons, respectively.

Phase 2: Equalization of responses. An attempt was made to equalize the number of responses required in V and S components and to approximately equalize the percentages of correct responses at an intermediate level of proficiency to avoid ceiling and floor effects. At the end of this phase, which lasted approximately 50 sessions, the schedule in the V component was six responses, Lag 10. The schedule requirement in S was the fixed pattern LRRLL. Reinforcement and time-out in both components lasted for 3 s.

Phase 3: Stimulus reversal. At the start of Phase 3, the number of responses in V was reduced to five to equal the number of responses in S. Otherwise, the contingencies in effect at the end of Phase 2 were maintained. Birds 29, 31, 38, and 39 received 10, 14, 12, and 22 sessions, respectively. The key light colors signaling V and S were then reversed, with red key lights now associated with V and blue key lights with S. No other changes were made. Two pigeons (Nos. 29 and 38) began the first session of reversed color in the S component, and the other two began in V. There were 18 to 24 sessions of reversal.

Results

Figure 13 shows percentages of reinforced trials in variability and stereotypy components separately for 1 subject (No. 31) during each session of Phases 1 and 3. The performance shown is representative of all birds. During acquisition of performance under the first multiple schedule, shown in the left panel, all birds initially received a higher percentage of reinforcements in the variability component (eight responses, Lag 5) than in the stereotypy component (three responses). Over sessions, performances in both components improved in accuracy, indicating that all subjects acquired the variability-stereotypy discrimination.

Figure 14 shows average U-value measures (U1 showing variability for responses taken one at a time, U2 for responses taken two at a time, and U3 for responses taken three at a time) in each of the components. The left set of bars indicates the median performance over the last five sessions in the initial condition. As would be expected if the pigeons were performing variably in V and emitting a fixed sequence in S, there were large and significant differences between the U values in the two components: For U1, t(3) = 15.515, p < .001; for U2, t(3) = 15.617, p < .001; and for U3, t(3) = 11.533, p < .005.

During the exploratory phase that is not shown, attempts were made to increase the stereotypy component to eight responses (equal to the variability component) and to change the contingencies in S so that time-outs for incorrect sequences occurred only at the end of the sequence (as was the case in V). But both of these attempts failed. Therefore, to equalize performances under V and S, a five-response sequence was used in each component and the lag criterion was increased to 10 in the V component. Performances under these conditions are shown in the middle panels of Figures 13 and 14.

Figure 13. Percentage of reinforced trials per session in variability (V) and stereotypy (S) components of the multiple schedule for Pigeon 31. (Left panel = initial acquisition where the contingencies in V were eight responses, Lag 5, and the S contingencies reinforced left-right-right patterns; middle panel = performance under a V schedule of five responses, Lag 10, and an S schedule of left-right-right-left-left; right panel = reversal of the middle schedule conditions [key light colors were reversed].)

When the stimulus conditions were reversed (blue key lights now signifying the S component
and red key lights signifying the V component) performances immediately deteriorated but then improved (right panels). At the end of both these phases, U values in S and V differed significantly at the .01 level or better. Thus, we conclude that stimulus control was established over variable and stereotyped responding.

General Discussion

Experiments 1 and 2 showed that when hungry pigeons were given grain for generating sequences of eight responses that differed from their last sequence, they successfully varied their sequences. When, in addition to meeting this same variability requirement, the pigeons had to peck exactly four times on each key (as in Schwartz, 1980, Experiment 4; 1982a, Experiment 1), success rates fell significantly. We concluded that the inability of pigeons to gain high rates of reward under the Schwartz procedure was due to an artifact of the four-response-per-key requirement. This conclusion was strengthened by comparison of the pigeon's performance with a computer-based random number simulation.

Experiment 3 increased the look back, or the number of prior sequences of eight responses from which the current sequence had to differ. Eventually, to be reinforced, the pigeon had to respond with a sequence that differed from each of its last 50 sequences. This look back included sequences from the previous session. The subjects generated highly variable sequences, with more than 80% of the patterns differing from all previous patterns in a session. Probabilities of correct sequences again paralleled the probability of correct sequences of the simulating random generator.

Experiment 4 compared two possible accounts of this variability. The memory hypothesis was that the pigeons learned a long sequence of responses or used a rational strategy to meet the schedule requirements. The variability hypothesis was that the pigeons behaved as a quasi-random generator. The
first hypothesis predicts that increasing the number of responses per trial would be correlated with a lowered success rate, for it is easier to remember fewer responses than more. The quasi-random hypothesis made the opposite prediction: By chance, given few responses per trial, consecutive trials would repeat one another; given many responses per trial, there would be few repetitions by chance. When the required number of responses per trial was increased from four to eight, success rates improved significantly, thereby supporting the quasi-random generator hypothesis. Once again, the pigeons' performances paralleled the performance of a simulating random generator.

The first four experiments generated high behavioral variability. However, neither these nor any previously published experiments demonstrated that response variability depended on the contingency between variability and reinforcement. The observed variability
Figure 14. Average response variability under three phases of Experiment 6, from left to right: V = 8 responses, Lag 5, with S = LRR; V = 5 responses, Lag 10, with S = LRRLL; and the reversal of that condition (V = 5 responses, Lag 10; S = LRRLL). (U1 = responses taken one at a time; U2 = responses taken in pairs; U3 = responses taken in triplets.)
could have been elicited by the reinforcement schedule (a respondent effect) rather than directly reinforced (an operant effect). Experiment 5 tested these alternatives by comparing the performance of pigeons under two identical schedules, except that one schedule required variability whereas the other permitted it. In the first, a pigeon had to respond eight times with a sequence that differed from each of its last 50 sequences. The patterns and frequencies of rewards under this condition were duplicated to form a self-yoked schedule where eight responses were required for reward but sequence variability was no longer necessary. The results showed that sequence variability was generated only when it was required. Under the yoked schedule, variability decreased significantly and the pigeons responded with highly repetitive patterns. Variability is therefore an operant dimension of behavior.

Experiment 6 demonstrated discriminative control over behavioral variability. Pigeons learned to respond variably in the presence of key lights of one color and with a fixed, stereotyped sequence in the presence of a second color. When the stimulus conditions were reversed, performances reversed. Thus, the variability dimension of behavior is controlled by environmental stimuli in much the same manner as other operant dimensions are.

The present series of experiments therefore conclusively demonstrates the existence and strength of operant variability, variability that is engendered and maintained because presentation of a reinforcer depends on the variability. This conclusion is consistent with Blough (1966), Pryor et al. (1969), and Bryant and Church (1974), among others.

Previous studies have examined respondent variability. Different schedules of reinforcement reliably engender differing degrees of behavioral variability with no contingency between the variability and the reinforcement schedules. For example, if identical and independent fixed-ratio 5 (FR 5) schedules are programmed on each of two response keys, most pigeons peck exclusively on one or the other key. If the schedules are changed to FR 150, there is considerable switching between keys. This observation from our laboratory is a clear example of respondent variability caused by reinforcement schedules. Variability was neither required nor differentially reinforced. The state of the environment, as well as the contingencies in the environment, influences behavioral variability, the former through respondent effects and the latter through operant effects.

Both respondent and operant variability may be adaptive. When reinforcers are infrequent or absent, variability increases the likelihood that the animal will improve its lot: learn a new strategy for obtaining reinforcement or change its environment. Even when reinforcement densities are relatively high, variability may improve the schedule or provide knowledge of the environment in anticipation of possible future decrements in reinforcement. Variability is an adaptive response to a changing or potentially changing environment.

Operant variability has unique adaptive functions not shared by respondent variability. Whenever an animal is operantly conditioned to generate a new response, whether the conditioning is through the process of shaping (successive approximations to some desired goal response) or trial and error (Thorndikian conditioning), it is adaptive for the animal to vary its behaviors. Reinforcement works through selection of some previously emitted behavior. If the to-be-selected behavior does not occur, reinforcement cannot select. On the other hand, in an environment where an operant response has previously been learned and is now being maintained by an acceptable schedule of reinforcement, high variability may not be functional, for variation may require relatively high energy expenditure and sometimes result in less frequent reinforcement. It is advantageous for an animal to discriminate situations in which new responses must be learned from those in which previously learned behaviors must be repeated. We hypothesize that this discrimination is based on the reinforcement of diverse responses and response classes in the former case versus reinforcement of fixed, or stereotyped, responses and response classes in the latter. (The discrimination is in some ways analogous to that between contingent and noncontingent reinforcement; Killeen, 1978.)

When an animal is differentially rewarded for a variety of responses, it generates variable behaviors. We posit that this describes all
operant learning (as opposed to operant maintaining) situations. Explicit reinforcement of variable behaviors prior to initiation of operant learning, or shaping, procedures might speed the learning process. This hypothesis should be tested.

There are other instances where operant control of variability is adaptive. Some environments punish variability (e.g., many school classrooms), and most children can discriminate these from other situations. Operant conditioning studies have emphasized highly controlled acts (see Schwartz, Schuldenfrei, & Lacey, 1978), but a complete analysis of behavior must include analyses of the variability of behavior, variability maintained through respondent influences as well as that directly engendered and maintained by reinforcing consequences. It may be impossible to predict or control the next instance of a variable behavior, but lack of prediction and control should not rule out study. When behaviors are variable, experimental analyses can determine whether the variability is noise (i.e., experimental or observational error), under respondent control, or under operant control. Experiments can define the class or classes from which the individual instances of behavior are selected, the conditions under

When we attempted to train a single, fixed eight-response sequence in the stereotypy component of Experiment 6, we were unsuccessful. Despite hundreds of reinforcements over many sessions, the pigeons failed to learn. It therefore seems unlikely that they could learn an eight-response sequence after a single reinforcement. Furthermore, the present results did not show persistence of previously reinforced sequences. A second hypothesis is that the pigeons

food, clothing, and company), behavior may still be highly constrained. Contingencies that explicitly reinforce behavioral variability are necessary to maximize freedom.

References

Antonitis, J. J. (1951). Response variability in the white rat during conditioning, extinction, and reconditioning. Journal of Experimental Psychology, 42, 273-281.

Blough, D. S. (1966). The reinforcement of least frequent interresponse times. Journal of the Experimental Analysis of Behavior, 9, 581-591.

Bryant, D., & Church, R. M. (1974). The determinants of random choice. Animal Learning & Behavior, 2, 245-248.

Carpenter, F. (1974). The Skinner primer: Behind freedom and dignity. New York: Macmillan.

Catania, A. C. (1980). Freedom of choice: A behavioral analysis. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 14, pp. 97-145). New
which the variability will occur or not, the York: Academic Press.
Crossman, E. K., & Nichols, M. B. (1981). Response
variables controlling onset and offset of the location as a function of reinforcement location and
variability, and the functions served by the frequency. Behavior Analysis Letters, J, 207-215.
variability. Operant analysis must not limit Eckerman, D. A., & Lanson, R. N. (1969). Variability of
itself to predictable and controllable behav- response location for pigeons responding under contin-
uous reinforcement, intermittent reinforcement, and
iors. Doing so ignores an essential character-
extinction. Journal of the Experimental Analysis of
istic of operant behavior. Behavior, 12. 73-80.
Finally, we note one sociopolitical impli- Herrnstein, R. J. (1961). Stereotypy and intermittent
cation. Freedom (cf. Carpenter, 1974; Lamont, reinforcement. Science, 133. 2067-2069.
1967) often means the absence of constraining Holman, J., Goetz, E. M., & Baer, D. M. (1977). The
training of creativity as an operant and an examination
contingencies (no gun to the head) and the of its generalization characteristics. In B. Etzel, J. Le
presence of a noncontingent benign environ- Blanc, & D. Baer (Eds.), New developments in behavioral
ment, one where adequate food, shelter, and research: Theory, method and application. Hillsdale,
so forth, are available independent of any NJ: Erlbaum.
Humphries, D. A., & Driver, P. M. (1970). Protean
particular behavior. However, operant contin-
defense by prey animals. Oecologia, 5, 285-302.
gencies may also be crucial. To maximize Killeen, P. R. (1978). Superstition: A matter of bias, not
freedom, an animal or person must have a detectability. Science, 199, 88-90.
wide variety of experiences (Catania, 1980; Lachter, G. D., & Corey, J. R. (1982). Variability of the
Rachlin, 1980). If, for example, to obtain duration of an operant. Behavior Analysis Letters, 2,
97-102.
food, an animal has always entered the same Lamont, C. (1967). Freedom of choice affirmed. New
1 of 10 cubicles, each cubicle containing a York: Horizon Press.
different type of food, and therefore has never Miller, G. A., & Frick, F. C. (1949). Statistical behavior-
experienced any but the 1 food, it makes istics and sequences of responses. Psychological Review.
56, 311-324.
little sense to say that there is free choice
Miller, N. E. (1978). Biofeedback and visceral learning.
among the 10 foods. The present results Annual Review of Psychology, 29, 373-404.
suggest that diversity of choice is controlled Neuringer, A. (1970). Superstitious key pecking after
by reinforcers contingent on diversity. Absence three peck-produced reinforcements. Journal of the
of variability-maintaining contingencies, such Experimental Analysis of Behavior, 13, 127-134.
Neuringer, A. (1984). Melioration and self-experimenta-
as in the yoked condition of Experiment 5, tion. Journal of the Experimental Analysis of Behavior,
increases stereotyped behaviors and therefore 42, 397-406.
limits experiences. If these speculations are Neuringer, A. (1985). "Random" performance can he
correct, a laissez-faire environment will not learned. Manuscript submitted for publication.
Notterman, J. M. (1959). Force emission during bar
secure freedom of choice. Despite the absence
pressing. Journal of Experimental Psychology, 58, 341-
of aversive constraints and the presence of 347.
positive respondent influences (e.g., good Piscaretta, R. (1982). Some factors that influence the
452 SUZANNE PAGE AND ALLEN NEURINGER
acquisition of complex stereotyped response sequences ability with reinforcement. Journal of the Experimental
in pigeons. Journal of the Experimental Analysis of Analysis of Behavior, 37, 171-181.
Behavior, 37, 359-369. Schwartz, B. (1982b). Reinforcement-induced stereotypy:
Popper, K. R. (1968). The logic of scientific discovery. How not to teach people to discover rules. Journal of
New York: Harper & Row. Experimental Psychology: General, 111, 23-59.
Pryor, K. W., Haag, R., & O'Reilly, J. (1969). The Schwartz, B., Schuldenfrei, R., & Lacey, H. (1978).
creative porpoise: Training for novel behavior. Journal Operant psychology as factory psychology. Behaviorism,
of the Experimental Analysis of Behavior. 12, 653- 6. 229-254.
661. Serpell, J. A. (1982). Factors influencing fighting and
Rachlin, H. (1980). Behaviorism in everyday life. Engle- threat in the parrot genus Trichoglossus. Animal Be-
wood Cliffs, NJ: Prentice-Hall. havior, 30, 1244-1251.
Schoenfeld, W. N., Harris, A. H., & Farmer, J. (1966). Skinner, B. F. (1938). The behavior of organisms. New
York: Appleton-Century.
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.