Gibbon, Berryman, Thompson (1974)

JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR 1974, 21, 585-605 NUMBER 3 (MAY)
CONTINGENCY SPACES AND MEASURES IN

CLASSICAL AND INSTRUMENTAL CONDITIONING1
JOHN GIBBON, ROBERT BERRYMAN, AND ROBERT L. THOMPSON
NEW YORK STATE PSYCHIATRIC INSTITUTE, UNIVERSITY OF MISSISSIPPI,
AND HUNTER COLLEGE OF THE CITY UNIVERSITY OF NEW YORK
The contingency between conditional and unconditional stimuli in classical conditioning

paradigms, and between responses and consequences in instrumental conditioning par-
adigms, is analyzed. The results are represented in two- and three-dimensional spaces in
which points correspond to procedures, or procedures and outcomes. Traditional statistical
and psychological mieasures of association are applied to data in classical conditioning.
Root mean square contingency, 0, is proposed as a measure of contingency characterizing
classical conditioning effects at asymptote. In instrumental tlaining procedures, traditional
ineasures of association are inappropriate, since one degree of freedom-response prob-
ability-is yielded to the subject. Further analysis of instrumental contingencies yields a
surprising result. The well established "Matching Law" in free-operant concurrent sched-
ules subsumes the "Probability Matching" finding of Iiiathematical learning theory, and
both are equivalent to zero contingency between responses and consequences.
been distinguished from negative contin-

TABLE OF CONTENTS gencies, in which the CS reliably predicts the
1. INTRODUCTION absence of the US. The lack of a contingency
II. CLASSICAL CONTINGENCIES is then the intermediate condition in which
A. Trial and Probability Definitions the CS and US are completely uncorrelated
B. Contingency Measures ("truly random control"). Rescorla argued
C. Interpretation of 02 that excitatory conditioning results from posi-
III. INSTRUMENTAL CONTINGENCIES tive contingencies and inhibitory conditioning
A. Conditional Probabilities: from negative contingencies, so that only the
Contingency Square non-contingent control procedure is appropri-
B. Absolute Probabilities: Contingency ate for assessing the size and direction of the
Tetrahedron training effects. Much experimental work fol-
1. Surface of Independence lowing Rescorla's lead investigated parameters
2. Matching Surfaces of the non-contingent condition. Several re-
(a) Conditional Matching searchers found it produced no differential
(b) Probability Matching conditioning (Rescorla, 1968; Bull and Over-
(c) Relative Reinforcement mier, 1968; Davis and McIntyre, 1969; Ayres
Matching and Quincy, 1970). Others found strong gen-
3. Probability Matching Equals Rela- eralized effects under certain conditions (Selig-
tive Reinforcement Matching man, 1968; Seligman, Maier, and Solomon,
1971); some transitory conditioning to the CS
has also been reported (Kremer and Kamin,
1. INTRODUCTION 1971; Quincy, 1971; Benedict and Ayres, 1972).
A series of seminal papers by Rescorla (1967,
1968, 1969) argued for the importance of the 'The first author is particularly indebted to Professor
contingency between conditional and uncon- J. A. Nevin for formative discussions about instrumental
ditional stimuli in classical conditioning contingencies. Preparation of this work was supported
paradigms. The term "contingency" has not by grants MH18092-01 from the NIH, and GB 34095
been quantitatively defined in the literature. from the NSF J. Gibbon, Principal Investigator), and a
Rather, positive contingencies, in which the City University of New York Faculty Research Grant
(R. L. Thompson, Principal Investigator). Reprints may
conditional stimulus (CS) is a reliable predic- be obtained from J. Gibbon, N.Y.S. Psychiatric Insti-
tor of the unconditional stimulus (US), have tute, 722 West 168th Street, New York, New York 10032.
585
5)86 JOHN GIBBON, ROBERT BERRYMAN, and ROBERT L. THOMPSON
These studies were procedurally similar, in The present paper defines contingency in a
that CS and US presentations were scheduled relatively narrow statistical context and ex-
independently, but differed with respect to the plores its implications with an eye to parallels
frequency of CS and US events as well as or asymmetries between classical and instru-
several other parameters, including the de- mental training procedures. Classical condi-
pendent measures. One purpose of the present tioning paradigms are considered first, and an
paper is to examine the range of possible argument is advanced for a metric specification
procedural variations in contingency manipu- of the degree of contingency that results in the
lations in classical conditioning and to relate association statistic, 0, as a measure of the
some data to an appropriate metric. control exerted by the training contingencies.
In instrumental conditioning paradigms, the Instrumental conditioning paradigms are then
term contingency has had a more varied in- examined, and their contingencies are shown
terpretation. For example, early definitions of to differ fundamentally from those in classical
the relation between a response and a positive conditioning. This difference results from the
reinforcer often referred to the stimulus con- different degrees of freedom in the two para-
sequence as "contingent upon" the response. digms. In the classical conditioning case, con-
While it was clear that the intent of such a tingencies are completely controlled by the
usage was "follows immediately after", it was experimenter, while in the instrumental case,
seldom explicit what consequences, if any, the contingency itself is in some sense a depen-
were "contingent upon" not responding. More dent variable. This distinction is mirrored in
recent work sought to break the contingency the metric representation.
implicit in positive schedules of reinforcement In instrumental conditioning, as in classical
by scheduling "free" reinforcers independently conditioning, non-contingent procedures are
of responses, or for non-response periods (e.g., the appropriate conditions for assessing the
Neuringer, 1969, 1970; Rescorla and Skucy, importance of contingencies. Contingent pro-
1969). In aversive conditioning procedures, a cedures in the instrumental case, however, are
variety of meanings is also evident. For ex- not uniquely specified by the experimenter
ample, the role of the contingency between a and several cases of general interest, the so-
response and a shock occupied experimental called "Matching Laws", are analyzed for a
attention from the outset of work on punish- fundamental property. It is argued that both
ment (e.g., Estes, 1944; Azrin, 1956). Yet, while probability matching and free-operant match-
investigators agreed on what constituted a ing are basically the same, and are equivalent
contingent punishment procedure, they often to a statement of zero statistical association be-
differed on the appropriate non-contingent tween alternatives and consequences.
control. Frequently, non-contingent delivery
of a stimulus meant delivery without the ex-
perimenter knowing whether a response was II. CLASSICAL CONTINGENCIES
occurring or not. Indeed, the non-contingent Rescorla's (1968) adaptation of the classical
delivery has even included a stricture on some pairing procedure consisted of scheduling
minimum delay elapsing between a preceding shock with some probability in the presence
response and the "non-contingent" shock (e.g., of the CS and never in its absence (~CS). This
Estes, 1944; Hunt and Brady, 1955). More re- contingent group was then compared with
cent work compared strict contingent punish- another, the truly random control group, for
ment with non-contingent controls that equate which shock probability was the same in the
response and non-response shock probabilities presence and absence of the conditional stim-
(Church, 1969; Church, Wooten, and Math- uilus. Possible procedures of this sort are speci-
ews, 1970; Gibbon, 1967). fied by plotting US probabilities in the pres-
In avoidance conditioning, contingency ence and absence of the signal against each
variations have received less explicit experi- other. The resulting contingency square is
mental attention, but work on shock density shown in Figure 1. Such representations have
(Herrnstein and Hineline, 1966) and on "free" appeared several times in recent literature
shock (Jones, 1969; Sidman, Herrnstein, and (Catania, 1971; Church, 1969; Gibbon, Berry-
Conrad, 1957) is in the spirit of the non-contin- man, and Thompson, 1970; Seligman et al.,
gent control investigations of other paradigms. 1971). US probability in the presence of the
CONTINGENCIES IN CONDITIONING
Cs > us
I.Qr
*AU
*X RESCORLA 1968
0 RESCORLA 1969
U)
a-
CLII)
o: U) U)
0 US CS 1.0 ri
I
Po=P(USI^-vCS)
Fig. 1. Contingency square presenting possible combinations of coniditional probability of unconditioned stim-
tilus (US) delivery in the presence or absence of the conditioned stimulus (CS). Non-contingent procedures are
represenited along the diagonal.
CS, P1 = P(USICS), is plotted on the verti- ules that has not been investigated. Along this
cal axis and US probability in ~CS, P0, is on upper edge of the square, CS implies US but
the horizontal axis. The diagonal passing USs occur with some probability in ~CS also.
through the origin is the locus of all non- The filled points represent Rescorla's (1968)
contingent procedures, in which these two training procedures, which yielded increasing
probabilities are equal. The upper-left corner suppression in a Conditioned Emotional Re-
represents the traditional pairing procedure, in sponse (CER) paradigm with increasing dis-
whiclh the US is always delivered in the CS and tance from the diagonal. Along the diagonal,
never in ~CS, and the left edge represents the differential conditioning was not observed.
class of partial reinforcement procedures in The findling of no conditioning for non-
whiclh the US occurs only in the presence of contingent procedures is not universal. These
the CS, but with probability less than one. The procedures evidently produce considerable
distinction between partial and consistent conditioning to the entire stimulus complex
pairing procedures at the edlges and corners of ("learned helplessness": Seligman, 1970; Selig-
the square has a parallel in the logic of impli- man et al., 1971). When differential condition-
cation. The upper-left corner represents an ing to a CS is observed here, the effects are
"if and only if", or double implication. The generally short-lived, and appear to depend
CS is a necessary and sufficient condition for on early contingent samples of the training
the US. The left edge, the locus of traditional sequence (Quincy, 1971; Benedict and Ayres,
partial schedules, represents the "US implies 1972). Our concern here will be with asymp-
CS" case. The upper edge represents the "CS totic conditioning levels.
implies US" implication. This second impli- The formal symmetry of this space suggests
cation represents another case of partial sched- that conditioning to the CS produced by pro-
588 JOHN GIBBON, ROBERT BERRYMAN, and ROBERT L. THOMPSON
cedures represented above the diagonal might cedures that produce equivalent asymptotic
be matched by conditioning to ~CS below the levels of conditioning. Suclh procedures would
diagonal. Thus, excitatory conditioning to the define "iso-contingency contours" along whiclh
CS might equally well be described as in- some measure of conditioning effects remains
hibitory conditioning to ~CS, and vice versa constant. To approach that goal, we consider
for the space below the diagonal. This for- below several candidates for a contingency
mal symmetry, of course, does not guarantee metric. It will be seen that data exist in
symmetric contingency effects. Stimulus sufficient quantity to discriminate among some
dynamism (Kamin, 1965), "salience" (Wagner, alternatives. Before pursuing that develop-
1969; Rescorla and Wagner, 1972), "prepared- ment, a more concrete specification of trials
ness" (Seligman, 1970), as well as signal dis- and probabilities is required.
criminability and choice of dependent mea-
sure, may all be expected to modulate the A. TRIAL AND PROBABILITY DEFINITIONS
manner in which contingency exerts control. Consider the trial sequence diagrammed in
An ideal theoretical account would specify Figure 2A. Each CS terminates witlh a brief
the mechanism whereby P1 and PO combine to shock and two of the four "intertrial intervals"
produce varying degrees of association between (~CSs) do also. Such a sequence might occur,
conditional and unconditional stimuli. A for example, for the schedule point P1 = 1,
promising model has been proposed by Res- PO = 0.5. A procedure of this sort is close to
corla and Wagner (1972). A somewhat less the original CER paradigm and seems a
ambitious goal is to specify contingent pro- natural way to specify trials and probabilities.
CSs
us A
III 1~ ~
Cs
-CS
ITI T-
B
us I I I
m m
Cs
~ CS C
- TIMET-I
Fig. 2. Trial specification for three possible procedures. In A, CS and -CS alternate and shocks are scheduled
to occur at the end of some proportion of each kind of trial. In B, a "true" intertrial interval separates successive
trials, so that CS and ~CS trials are not forced to alternate. In C, the time axis has been divided into discrete in-
tervals to accommodate the delivery of multiple shocks in a given CS or -CS period. The discrete trial unit time
for this analysis must be less than or equal to the shortest CS duration, or intershock time.
CONTINGENCIES IN CONDITIONING 589
A somewhat more flexible alternative results An alternative is to regard the level of con-
from inserting a third stimulus condition at ditioning as a function of the statistical associ-
the end of each CS and ~CS trial. The third ation between the conditional and uncondi-
condition would then act as a "true" intertrial tional stimuli. For example, some measure of
interval. This procedure is indicated in Figure the correlation between CS and US would be
2B. Note that the "on-off" specification for CS appropriate on this view. In either case, any
and -CS is an arbitrary convention. Func- contingency measure is referable to the basic
tional CS and -CS trials may be cued by any contingency table for absolute frequencies of
(preferably very different) stimulus conditions. the four possible joint events:
Equal length CS and -CS trials have been
presented in Figure 2 simply for convenience. Table I
We will later consider unequal durations.
The simple paradigm of Figure 2A is not ~us us I
Rescorla's trial and probability specification CS a b a+b P1=b/(a+b)
for the interior-point procedures of Figure 1. -CS c d c+d Po=d/(c+d)
Instead, CS and ~CS durations were divided
into smaller time periods (2 min), each of n
which met the shock probabilities indicated
by the points in Figure 1. This situation is Trials are assumed to continue over a suf-
represented schematically in Figure 2C. Each ficiently large number of sessions to produce
CS and -CS period contains several "trials" asymptotic conditioning. Thus, if the above
in a row. A trial specification of this sort, cross classification refers to the frequencies per
which is not differentially cued, poses difficul- session, continued training amounts to multi-
ties. For example, the same sequence of shocks plying all entries by the (large) number of
might be produced by P1 and P0 values that sessions. If there exists a measure of contin-
were reduced by 1/2 if the trial "unit" times gency that corresponds to asymptotic con-
were also reduced by 1/2. Indeed, any number ditioning levels, then the measure must be in-
of units is possible so long as they are small variant over multiplication of the contingency
enough to restrict the number of shocks within table by a constant. The measures considered
a trial unit to one. We will consider later how here have this property.
this arbitrary specification affects the analysis Several traditional measures also have the
of contingency metrics. For the present pur- additional property (first suggested by Yule,
poses, the specification of Figure 1 is unique 1912, discussed by Kendall and Stuart, 1967,
only up to a multiplicative constant; i.e., the pp. 546-546) that they are invariant with mul-
probability values P1 and P0 have a unique tiplication of any row or column by a constant.
ratio only (as do the trial frequencies). This property is a very strong one, and mea-
sures of association possessing it are not fea-
B. CONTINGENCY MEASURES sible candidates for organizing contingency
One approach to the metric problem regards effects. The reason is that such measures regard
the level of conditioning as a function of the all values along the edges of the contingency
discriminability of the difference between CS square as equivalent. Consider points on the
and ~CS conditions. In the present context, left edge of the square. They have values of
this discrimination is between the two shock d in Table I equal to zero, since the US is
rates or probabilities. On this view, Weber delivered only in the presence of the CS. But
fractions based upon the two shock prob- then all such tables may be obtained one
abilities might be an appropriate measure. from another by multiplying the US column
Considerable latitude must be recognized re- by an appropriate constant. Thus, a measure
garding which of several alternative Weber of contingency that is invariant with column
fractions is appropriate. For example, if we set multiplication would not be able to describe
AP = P1 - PO = (1 - PO) - (1 - P) = Qo - Qi' graded effects obtainable along this edge of
then Weber fractions based on delivery prob- the square. Rescorla (1968) found partial
abilities are given by AP/Pi(i = 1,0), and reinforcement effects along this edge. In any
Weber fractions based on omission probabili- case, one would expect a priori the level of
ties are given byAP/Qi, where Q1 = 1 - Pi. conditioning to decrease as the extinction con-
.90 JOHN GIBBON, ROBERT BERRYMAAN, and ROBERT L. THOMPSON
dition (P1 = P0 = 0) is approached. Tlhus, a In the above form, it may be shown that for
measure that hopes to describe these effects constant T1/To ratios, 0 is monotone increas-
must not obey Yule's invariance property. ing witlh increasing P1 and witlh decreasing P0.
This invariance property is shared by many These properties hold for Rescorla's findings
traditional measures of association between for suippression (Figure 1), and his data have
two binary variables. For example, Yule's co- been replotted in Figure 3 as a function of 0.
efficient of association, Q, and coefficient of The data represent suppression ratios obtained
colligation, Y, are disqualified by this prop- on the first day of extinction in a CER par-
erty, as are some more recent measures, no- adigm after training at the schedule values
tably Goodman and Kruskal's measures of indicated in Figure 1. The relation seemed
predictive association, Xa, Xb, X (Goodman sufficiently linear to warrant the straight line
and Kruskal, 1954, 1959). A similar exclusion function13.
applies to functions of the cross ratio, bc/ad,
and functions of the ratio coefficient, Po/Pl Po
(Edwards, 1963). .5
0 -
Exclusion of the ratio coefficient means that 2ox .1 A
Weber fractions based on delivery probabili- .4 .
.2 U
ties are also excluded, since these Weber frac- a:
.4 X
tions may be represented as functions of .3
z
Po/Pl. Weber fractions based on delivery or 0
cnuC
omission probabilities define lines in the con- .2
tingency square radiating from the origin or
the upper-right corner. Tlhus, if Weber frac- a-
tions were an appropriate contingency mea- Cl)
sure, then at least two different fractions
would be required to describe data that showed .1 .2 .3 .4 .5 .6 .7
graded effects along all the edges of the square.
A classical measure that is not invariant
over row o- column multiplication is the Fig. 3. Suppression ratio as a function of 0 for Res-
familiar phi coefficient (root mean square con- corla's (1968) study. The suppression ratio is response
tingency), 0 = X/n`. 0 has two important rate in the CS, divided by response rate in the CS plus
response rate in the -CS. When these two rates are
properties: (1) it attains its maximum of unity equal (no suppression) the suppression ratio is 0.5.
only at the opposing maximally contingent Note that 0.5 is approximately the value for all non-
corners of the contingency square, and thus is contingent points (0 = 0). Increasing suppression is in-
a potential measure for intermittency effects dicated by smaller suppression ratio values as 0 in-
along any edge of the square: and, (2) if trials creases.
for CS and ~CS have different frequencies
in the sense of Figure 2C, then 0 is sensitive to Another reason for choosing 0 is that it has
changes in the "On/Off" duration ratio (Stein, a chance of describing "On/Off" (T1/To)
Sidman, and Brady, 1958). Note that for trial effects. Stein et al. (1958) found a strong in-
"units" of duration t (Figure 2C), if the verse relation between amount of suppression
duration of CS and ~CS are T1, To, respec- and T1/To. Their experimental paradigm was
tively, then T1/t = a + b and To/t = c + d. as diagrammed in Figure 2A, except that shock
With these conventions and restricting P1, was scheduled at the end of every CS and
PO #& 0, 1, 0 may be written2, never occurred in -CS. Under these circum-
stances, the trial unit time is unspecified. An-
0= P,-Po other complication is introduced by temporal
x/( l-Po + (II- P)Tj / To)(PI + PoTo/Tj)
(1) 3The linear fit is intended to describe these data over
the intermediate range of 0 values. It does not imply a
2Equation (1) may be obtained from the standard model. Clearly, the rate of approach to asymptote and
frequency formulation for x2: x2 = lad - bcl2/[(a + b) the level of suppression at asymptote may be a func-
(c + d) (b + d) (a + c)], by dividing numerator and de- tion of several paramieters, independent of 0 values.
nominator by [(a + b) (c + d)]2 and rearranging terms, Evidently, the present choice of parameters produces
noting that T,/To = (a + b)/(c + d). linearity of suppression with 0.
discriminations of CS onset and shock onset at limits are in the appropriate direction for the
the end of the CS that are occasionally evident On/Off ratio effect. The Stein, Sidman, and
in their data. Nevertheless, a rough approach Brady data have been recalculated in sup-
to specifying 0 for these data is possible, and pression-ratio form and are plotted against 0
seems worthwhile in the absence of more rein Figure 4. Despite the considerable variance,
cent data on the temporal parameters of CER. there is a clear trend toward smaller suppres-
The trial unit time and P1 may be specified by sion ratios with increasing 0. Unfortunately,
using the smallest CS or ~CS duration in the under our construction of the trial unit, in-
subject's preceding training history, and decreasing relative duration, T1/T0, generally
fining shock probability as P1 = (smallest pre- also decreases P1, so that the observed trend
ceding T1 or T0)/(current T1). Such a specifi- is not uniquely ascribable to the TI/To
cation regards large CS or -CS durations as variable.
successive trials-in-a-row, where each trial lasts An assessment of the 0 coefficient for a
only as long as the shortest CS or -CS trial the description of temporal parameters requires
subject has ever experienced. For example, the an experimental paradigm with a "true" inter-
first pair of T1, To values that a subject is trial interval bracketing each CS and ~CS
exposed to is associated with P1 = 1.0, when trial (as described in Figure 2B). The T1/To
T.< To. Since shocks never occur in ~CS, Po variable is then the ratio of trial frequencies
is always zero. For this case, equation (1) be- (a + b)/(c + d), and may be experimentally
comes, manipulated.
C. INTERPRETATION OF 02
0 = l+<1 + (I -P1)T1/To (2) Mean square contingency, 02, has suffered
Evidently 0 0 as (T1/To) -- 00, and 0 attains
-* from obscurity of interpretation in the classical
the maximum, 0 = y'_jas (T1/To) -e 0. These statistical literature. In the present context, 0
STEIN, SIDMAN, a BRADY 1958
.5 AM-I 0
0 AM-2 *
4 .4 A AM-3 A
0@ AM-4 A
0 0
z .3 AM-5 °
0
C')LU .2 0
cn 0
k 0
.1 0
Cl)
a I . a. 0. Alk
v .~A I a I n aA
v
.1 .2 .3 4 .5 .6 .7 .8 .9 1.0
Fig. 4. Suppression ratio as a function of 0 for data from Stein, Sidman, and Brady's (1958) study. Their data
were reported in the form of the ratio of rate in the signal to rate out of the signal, and have been recalculated
in suppression ratio form to make them comparable to the data in Figure 3.
is sensitive to two features of associative learn- Badia and Culbertson's (1972) data on prefer-
ing paradigms that are essential for a descrip- ence for warned shock support such a view. In
tion of the effects: both the relative and ab- fact, the mechanism urndlerlying conditioned
solute values of the shock schedules in the suppression is often assumed to be preparatory,
two signals appear to be important to the or an anticipatory fear reaction incompatible
associative learning that results. However, with ongoing behavior. This latter view also
while 0 is sensitive to these features, it is not suggests a cost interpretation for the second
at all clear why belhavior ought to be sensitive kind of prediction error: anticipation of shock
to 0. That is, it is not clear whether some when it is not presented. If anticipation of
more fundamental relation is implied by the shock effectively interrupts responding on the
linearity in Figure 3. appetitive baseline schedule, then incorrect
We do not have a detailed description of anticipations reduce positive reinforcement
how behavior makes contact with the shock density, so to say, unnecessarily. Both Stein's
schedules, but an approach to this problem et al. (1958) "reinforcements missed" analysis,
is implicit in the following probabilistic in- and Lyon's (1964) data, implicate appetitive
terpretation of 02. Goodman and Kruskal reinforcement density as an important param-
(1954) noted that a proportional prediction eter of conditioned suppression.
scheme for predicting the second classification These two sorts of errors evidently combine
in a two-way polytomy reduces to 02 in the to produce orderly relative suppression data
two-by-two case. Their prediction scheme may as a function of 0. The strong generalized
be reinterpreted in terms of the classical con- suppression found by Seligman (1968, 1969)
ditioning paradigm. Imagine that in the con- for the non-contingent procedures suggests that
tingency table a trial is chosen at random an attentional process may be involved. Non-
from the population of trials, and subjects contingent shock may produce considerable
guess whether it is shocked or unshocked overall suppression, but no differential sup-
when: pression, since attending to the trial cues does
(1) they are unaware of which sort of trial not reduce anticipation errors. Suppression is
(CS or ~CS) it is, or distributed differentially between the two
(2) they know which sort of trial it is. stimuli only when attending to these stimuli
Sequentially, case (1) is a guess about shock on lowers the frequency of "unnecessarily missed"
the next trial or "in general", and case (2) is a reinforcements and "unanticipated" shocks.
guess about shock on the present trial. The In sum, 0, or some monotone function of 0,
proportional prediction scheme predicts shock appears a reasonable measure of the power of
in case (1) with probability, (b + d)/n, and in the training contingencies at asymptote in the
case (2) with probabilities, P1 for CS and PO conditioned suppression paradigm. Other mea-
for -CS trials. Goodman and Kruskal show sures fail to describe partial reinforcement
that this proportional prediction scheme im- effects and temporal duration effects. Although
plies that the relative reduction in errors of the 0 interpretation in the latter case is some-
prediction going from case 1 to case 2 is 02. what strained, sharp experimental tests are
If this interpretation is taken seriously in the easily constructed by scheduling "true" inter-
CER situation, it means that the suppression trial intervals following CS and -CS trials.
ratio reflects the relative reduction in error Such an experimental arrangement is of in-
probability produced by the CS. However, it is terest in its own right becauise it allows control
not clear how errors in prediction might con- of trial frequency independently of P1 and Po.
tact behavior, even if one construes "predic-
tion" of shock as "anticipation" or "fear" of
shock. The 02 interpretation suggests that it III. INSTRUMENTAL CONTINGENCIES
is reduction in uncertainty that counts, rather Instrumental contingencies arrange for re-
than increases in P1 or decreases in PO per se. lations between responses and consequences
Possibly, the two kinds of incorrect anticipa- that share some features, and not others, of
tions produce two different kinds of "cost" to the relations between conditional and uncon-
the subject. Failing to anticipate shock when ditional stimuli in classical conditioning. In
it does occur may result in a functional in- the classical case, the contingency is between
crease in aversiveness. Lockard's (1963) and two stimuli, both under experimental control,
whereas in the instrumental case, the experi- cedures respectively. Under strict punishment,
menter loses one degree of freedom to the a response is the necessary and sufficient con-
subject. This fundamental difference reappears dition for shock, whereas under strict avoid-
in the following treatment in different forms. ance, the lack of a response is the necessary
It leads to two different ways of specifying and sufficient condition for shock.
instruimental contingencies: one is "ratio-like", The partial schedules parallel those for
and parallels the contingency specifications classical conditioning. Partial punishment
above; the otlher is "interval-like", and leads to schedules, in which shocks imply responses
a different metric specification. but not all responses are shocked, are repre-
sented on the left edge. Partial avoidance, in
A. CONDITIONAL PROBABILITIES: which shock trials mean that no response has
CONTINGENCY SQUARE occurred but not all such trials are shocked, is
The conditional probability specification represented on the lower edge. Partial punish-
parallels that described above (Figure 1) ex- ment procedures do not seem to have been in-
cept that the role of CS and ~CS is replaced vestigated in discrete trials, but a graded effect
by response alternatives. Figure 5 presents the is well documented in free-operant work (e.g.,
contingency square for an aversive stimulus Azrin and Holz, 1966). The schedule on the
(Catania, 1971; Church, 1969; Gibbon et al., upper edge might be thought of as partial
1970). The ordinate represents the probability punishment of not responding: responses al-
of shock given a response (R), the abscissa the ways produce shocks (R implies S), but not-
probability of shock given no response (~R). responding is also shocked occasionally. Gib-
The strict or double implication points at the bon (1967) studied the two extremes of this
upper-left and lower-right corners represent condition, and the extinction point, at which
traditional punishment and avoidance pro- shock is never delivered (filled circles in
C-3
1.0
O,0 HERRNSTEIN
&, HINELINE 1966
* GIBBON 1967
* NEFFINGER 1972
040 cc .cn
a- Cf) 4
-a I 'll
0 S 9^ R 1.0 Ct
Po=P(SI-R)
Fig. 5. Contingency square for presentation of an aversive stimulus.
Figure 5). Two different trial signals cued the havior may be controlled by shock density
two conditions on the upper edge, and re- alone (Byrd, 1969; Kelleher, Riddle, and Cook,
sponding was supported by an appetitive base- 1963; McKearney, 1968, 1969; Morse, Mead,
line in intertrial intervals. In both kinds of and Kelleher, 1967; Stretch, Orloff, and Dal-
trials, responses terminated the trial and rymple, 1968). One subject in Gibbon's (1967)
produced a shock (P1 = 1.0). For strict punish- punishment study maintained responding un-
ment, not-responding resulted in trial termina- der the non-contingent condition. Also, recent
tion without shock (P0 = 0), while on non- work by Neffinger (1972) in this laboratory
contingent trials (upper-right corner) shocks showed that a substantial proportion of sub-
occurred at the end of every trial whether the jects trained under the maximal avoidance
subject responded or not (P0 = 1.0). The result contingency maintained responding when
was that responding was more frequent in the studied subsequently under non-contingent
non-contingent condition. When shocks were procedures. Analyzing behavior not under
inevitable, subjects responded more than when contingency control is beyond the scope of
they were avoidable, and so it seems likely this paper, but it should be clear that the pro-
that graded effects may be obtained for inter- cedures outlined provide a means for studying
mediate points along this edge also. that control, because they allow parametric
Data from partial contingencies have not variation in non-contingent frequency of the
been collected in discrete trials, although un- unconditioned stimulus.
cued procedures in the "punislhment-like" Neffinger also studied procedures repre-
upper-left half of the square are beginning to sented by points on the right and bottom
receive experimental attention (Kadden, 1971; edges in Figure 5. After considerable training,
Kop and Snapper, 1970). subjects that showed control by the degree of
"Avoidance-like" procedures in the lower contingency alone (subjects not showing the
half of the square have received somewhat non-associative responding under the control
more attention. Graded contingency effects of shock density) eventually ceased to respond
have been found for free-operant procedures as non-contingent conditions were approached.
that correspond rouglhly to interior points. For the bottom edge, most of the drop in re-
Herrnstein and Hineline (1966) studied rats sponding took place close to the zero point
on schedules comparable to those indicated (traditional extinction). The right edge (pun-
by the open circles and squares. P0 values rep- ishment of avoidance), produced more varia-
resent the probability of shock delivery at 2-sec bility across subjects, but after sufficient ex-
intervals when no response was made. The posure most subjects showed a substantial drop
first post-shock response changed this prob- only close to the 100% punishment condition.
ability to P1. Subsequent responses had no Thus, for those subjects sensitive to a change
effect until the next shock, which changed in contingency, most of the control exerted
shock probability back to P0. Their experiment by the conditional probability of shock occurs
focussed on shock density effects, and discrete close to the non-contingent values. The drop
trials were not explicitly scheduled. However, as traditional extinction is approached is some-
the placement in Figure 5 is unique up to a what steeper than the drop produced by
multiplicative constant, as with the probability punislhment. This asymmetry is consonant
specifications in the classical case (Figure 2C). with a discrimination argument proposed be-
Under the non-contingent procedures along low for data of this sort.
the diagonal, responding eventually ceased and If one substitutes food for shock in this
response rate for contingent procedures in- scheme, the corresponding contingency square
creased with increasing distance from the has continuous reinforcement in the upper-
diagonal. Orderly graded effects are probably left corner, and "food avoidance" in the lower-
obtainable from these interior "avoidance- right corner. The latter condition recently
like" points, with response strength increasing received attention (Williams and Williams,
with proximity to the strict avoidance point. 1969) in assessing stimulus-stimulus contin-
The procedures represented by diagonal gencies in autoshaping (Brown and Jenkins,
points do not invariably make responding 1968). The partial schedules represented by the
cease. Studies of non-contingent shock in the left edge then correspond to random-ratio
free-operant literature suggest that some be- food delivery schedules (e.g., Sidley and
Schoenfeld, 1964). Partial contingencies other very large) number of trials over the course
than random-ratio schedules seem not to have of many sessions of training. For example, the
been sttudied in the appetitive case. In terms of strict avoidance contingency maintains Poo =
the logic of the contingency analysis, they plu = 0, and subjects distribute response or
would appear to repay investigation. non-response alternatives by changing the. two
In both appetitive and aversive instrumen- remaining probability values. An efficient
tal situations, measures of behavior corre- avoidance performance has plO close to 1.0. A
sponding to measures of partial contingencies discrete-trial analog of continuous reinforce-
are not readily available. The reason is funda- ment is represented by maintaining pol =
mental and derives from the loss of one plo = 0, and of course in this case, subjects
degree of freedom to the subject. In the classi- would be expected to maintain pi, close to 1.0.
cal contingency case, a measure of partial con- Both the strict implications may be represented
tingency such as 0, required a specification of by either the absolute probability specification
the On/Off ratio, or the frequency of CS to of Table II, or the contingency-square space.
-CS trials. In the instrumental case, the cor- For "interval-like" schedules of reinforcement,
responding parameter is the ratio of response however, the absolute probability values are
to non-response trials, and so a measure such as the appropriate representation, and the con-
0 is calculable only on a post hoc basis. One tingency square is inadequate. Consider a dis-
cannot then argue that partial contingencies crete-trial analog of a variable-interval (VI)
control behavior when behavior in turn defines schedule, arranging, say, one reinforced re-
the degree of contingency. This fact is not sponse for every 12 trials on the average. If
apparent in the conditional probability speci- trials were scheduled at very short intervals
fication (Figures 1 and 5), since only two de- and occupied about 5 sec each, then such a
grees of freedom, both experimentally con- schedule would correspond to a VI 1-min free-
trolled, are represented there. The next section operant schedule. The "unlimited hold"
develops a more general representation of feature of such schedules ensures that as long
contingencies that includes this additional as one response per minute is made, the total
degree of freedom. number of reinforcers in a given amount of
time or, in our case, a given number of trials,
B. ABSOLUTE PROBABILITIES: CONTINGENCY remains constant.4 That means that pi, is fixed
TETRAHEDRON by the schedule. In this example plu = 1/12.
The more general representation arises from Not responding is never reinforced, so that
considering absolute rather than conditional pol = 0, and again subjects distribute response
probabilities. Absolute probabilities are repre- alternatives between the remaining two prob-
sented in Table II below. For the moment, S ability values. Thus, interval schedules on this
construction are analogous to holding absolute
Table II probabilities constant. Note that the condi-
-S S tional probabilities may be obtained from
R Pio Pl plo + pl = P(R)
the absolute probabilities, as indicated below
the table. Note also that the dependent var-
-R POO poi pon+ po = P(-R) iable, response probability, is simply the sum
I P(QS) P(S) 1.0 = P(R) + P(-R) of the two absolute probabilities for the top
row.
The table entries must perforce sum to
P, =
Piu
P(SjR) = P(R) unity, since one of the four joint events must
P0 = P(S I-.R) = P(-.R)
P(R
4Strictly speaking, VI schedules in practice rarely hold
reinforcements per unit time constant. Generally, VI
is intended to represent presentation of a timing stops once reinforcement is available so that if
"biologically relevant" stimulus, either positive response rate is low, availability time can represent a
or negative (e.g., food or shock). Cell entries substantial contribution to the schedule and lower the
represent long-term asymptotic training levels overall reinforcement rate in a session. The assumption
that VI schedules hold reinforcements per unit time
and are estimated by the frequencies of the constant or, for this construction, absolute probabilities
conjoint events divided by the (presumably constant, is an approximation.
occur on any trial. This restriction constrains face representing Pij = 0. These tetrahedra are
contingency representations to three degrees easily constructed from tri-coordinate paper
of freedom. In the classical case, all three are (Style 359-32, Keuffel and Esser Co., New York,
experimenter controlled; in the present case, New York) for each face. On the right the
two degrees of freedom are available to the P0, P1, P(R) coordinate system is indicated.
experimenter and one is the dependent vari- P0 is represented by angular displacement be-
able, P(R). The conditional probabilities rep- tween the rear face (pol = 0 = Po) and the
resent one of several ways of specifying two right face (poo = 0, PO = 1). Similarly, P1 is
of the three dimensions available in any con- represented by angular displacement between
tingency table. A more general representation, the bottom face (pl = 0 = P1) and the left-
which subsumes the conditional probability front face (plo = 0, P1 = 1). Plane sections have
specification, is obtained by considering the been drawn in the tetrahedron corresponding
space of all possible values of the four absolute to P0 = /4, and P1 = %2. The variable, P (R),
probabilities under the constraint that they corresponds to orthogonal displacement along
sum to one. This space is represented by an the line connecting the midpoint of the lower-
equilateral tetrahedron (three-dimensional left edge and the midpoint of the upper-right
simplex). edge. Plane sections corresponding to P(R) =#
Feinberg and Gilbert (1970) studied this 0,1 form rectangles in the tetrahedron with
space in a statistical context; the reader is re- horizontal and vertical axes representing P0, P1
ferred to their work for a discussion of the respectively. The contingency square in the
relevant mathematics (see also Bush, 1955, for tetrahedron becomes a rectangle with sides in
a two-dimensional representation in a different the ratio P(~R)/P(R). The rectangle corre-
context). The four absolute probabilities are sponding to P(R) = 2% is drawn in the figure.
related to the conditional probabilities in ways The maximally contingent corners of the
that may be illustrated by the geometry of the contingency square become edges in the tetra-
tetrahedral space. In Figure 6, two tetrahedra hedron. For example, in the instrumental con-
are shown. In the left figure, the pij coordinate ditioning paradigms for negative reinforce-
system is shown, with each vertex representing ment, the strict punishment contingency is
some pj = 1 and the opposing triangular represented by the upper-left edge (pol = 0 =
Poil-
Fig. 6. Tetrahedra representing all possible combinations of absolute probabilities in a contingency table. The
left tetrahedron shows the absolute probability coordinates, p,. The right tetrahedron shows the P,, PO, P(R) co-
ordinate system. Planes in the tetrahedron show some specific values of these three coordinates. The plane sec-
tions within the tetrahedron are represented as opaque, thus occluding, for example, the lower-rear edge of the
tetrahedron, which is shown on the left figure.
plo) and the maximal avoidance contingency cur. The frequency of shock is so low that sub-
is represented by the bottom right front edge jects, so to say, never "discover" whether or not
(Pll = 0 = Poo). shock is scheduled for not responding. This is
represented in the tetrahedral volume by the
1. Surface of Independence fact that when plo is close to 1.0, the strict
A case of special interest is the locus of avoidance edge is close to the non-contingent
points corresponding to non-contingent proce- surface. Thus, the distance between contingent
dures. In the contingency square, the non- and non-contingent points in the tetrahedron
contingent cases are represented by points has an analog in the discriminability of the
along the P1 = PO diagonal. The generalization difference between reinforcement conditions
of this locus for the tetrahedron is the surface under contingent and non-contingent presenta-
generated by this diagonal as response prob- tions. At the response extremes of no respond-
ability moves from 0 to 1. In Figure 7, the ing or responding on every trial, there is no
non-contingent surface is slhown along with difference between these conditions. There is
the rectangle corresponding to a response a sense, then, in which a subject's response
probability of 1/4. Response probabilities less probability specifies the "distance" between
than this value are thus represented below the conditions experienced under contingent
this plane toward the front edge and response and non-contingent procedures. The next sec-
probabilities higher than 1/4 are represented tion develops three different surfaces in this
above and behind the plane. For example, space that correspond to three different modes
subjects that are sensitive to a break in contin- of behavior that might result from different
gency and stop responding under non-contin- contingencies.
gent procedures produce data falling on the
left lower-front edge. Subjects responding to 2. Matching Surfaces
the density of reinforcement under non-contin- (a) Conditional matching. A recent proposal
gent presentation produce points internal to for organizing effects attributable to shock
the edges on the surface of independence. This density variations in avoidance conditioning
surface provides a geometric representation of (Gibbon, 1972) regards subjects as discriminat-
a fact often noted in studies of avoidance. ing whether or not responding results in a
When subjects are responding maximally it is "worthwhile" improvement in shock density.
difficult for a discrimination between con- The complete theoretical proposal is more de-
tingent and non-contingent conditions to oc- tailed than suits our present purpose, but a
special case of unbiased discrimination is in-
structive as a contrasting approach to the so-
,pII called "Matching Laws" considered next. The
theory treats response strength as directly re-
lated to the ease with which response-produced
and non-response-produced conditions may be
discriminated. Suppose that subjects maintain
"in memory" a sample of intershock times
generated by sequences of response-terminated
trials. Under the maximal avoidance contin-
gency, when shocks are delivered only for not
responding, such sequences are approximately
geometrically distributed, with a mean pro-
portional to the inverse of the probability of
failing to respond. When this value is small,
intershock times are quite long. Intershock
times produced by a non-response series are
equal to the time between the end of one trial
Po0 and the end of the next, as long as PO = 1.0
Fig. 7. Surface of independence in the contingency
tetrahedron. The rulings shown are those generated by (maximal avoidance). If the shock delivery
the non-contingent diagonal in the rectangles corre- schedule is relaxed by reducing PO, intershock
sponding to the contingency square. intervals produced by not responding become
geometrically distributed, witlh a mean propor- If reinforcements were substituted for

tional to the inverse of PO. An idealized con- shocks in this scheme, and the more favorable
ceptualization regards a subject as sampling interreinforcement time governed responding,
from these two distributions and responding at then the role of P0 and P1 would be inter-
its maximum avoidance probability whenever clhanged in Equation (3). This results in a
intershock time for not responding is shorter different relation among the absolute prob-
than that for responding. As long as P1 is abilities, and it is readily shown that in this
maintained equal to zero, PO must be very low case, response probability is given by
before the non-response intershock interval
distribution approaclhes the intershock times i
produced by responding (for good avoidance P(R) = (6)
performances). Thus, the theory predicts that WPu+ N/-Pi
behavior should be insensitive to PO until PO is
close to zero. Similarly, when PO is maintained The square root relation, Equation (6), fol-
at 1.0 and punishment at the end of response lows directly from reversing PO and P1 in the
trials is introduced, the discrimination hy- preceding argument (see also Herrnstein,
pothesis regards subjects as sampling from the 19705).
intershock interval distribution on response In sum, a discrimination mechanism based
trials, which is now geometric and inversely on interreinforcement intervals produces
proportional (approximately) to P1. P1, sim- matching of relative conditional probabilities.
ilarly, may be expected to exert its effect most For the aversive case, this means equality of
strongly close to 1.0, since it is in this range absolute reinforcement rates for the response
that the distribution of intershock times for alternatives, and in the appetitive case, it
responding approaches that for not respond- means matching of the relative square roots of
ing. These two features of the discrimination absolute reinforcement rates. The two rela-
hypothesis are consonant with Neffinger's data tions define surfaces in the tetrahedron, which
described earlier. It may be readily shown that are shown on the left of Figure 8. The equal-
when 0 < (PO, P1) < 1, the discrimination of shock-rate relation (Equation 5) is the plane
improvement in shock density associated with bisecting the rising front edge of the tetra-
responding is given approximately by relative hedron, and the square root relation (Equation
conditional probability. That is, if subjects 6) is the (singly ruled) surface indicated by the
respond whenever the response sample for rays from the lower rear edge. The two sur-
intershock time is longer than the nonresponse faces intersect along the non-contingent di-
sample, then response probability is approxi- agonal for the contingency square correspond-
mately ing to a probability of response of %. This is
because the interreinforcement interval distri-
P(R) =pP0 p. (3) butions producing response indifference must
be completely indiscriminable, and this occurs
Buit then, only when PO = P1.
(b) Probability matching. Conditional
P1 _ P(~R) (4) matching does not appear to result from ap-
PO P(R) petitive discrete trial experiments when rein-
and, by the definition of Pi, we have forcement is scheduled in the "ratio-like"
fashion prescribed by the PO, P1 specification.
pii = Poi (5) Rather, subjects appear to maximize reinforce-
Thus, matching of relative conditional prob- ment density by choosing the more favorable
ability in the aversive case is equivalent to alternative nearly all the time. However, an
matching absolute shock rates for the two re-
sponse alternatives. If subjects adjust respond- 5Herrnstein's derivation of Equation 6 for the
ing to the discriminability of improvement in appetitive case is correct for conditional revard prob-
intershock intervals without asymmetries in abilities only. He mistakenly describes this relation as
to the probability matching phenomenon
response preferences, they match absolute appropriate
in the mathematical learning literature. As will be
punishment frequencies for the two response seen later, that paradigm controls absolute rather than
alternatives. conditional probability of reinforcement.
CONDITIONAL RELATIVE REINFORCEMENT

MATCHING MATCHING
P'1
NEGATIVE P (S).
s4
Pro Po0
PROBABILITY
Pot MATCHING PoI
PI
P'O
PoI
EQUAL ERROR
Poo P1o
Fig. 8. Surfaces corresponding to three different kinds of matching. OIn the left, surfaces for the negative rein-
forcement and positive reinforcement cases are those generated by a discrimination mechanism, which equates
response probability to the relative conditional reinforcement probability. The center tetrahedron shows the plane
correspondinig to probability matching on choice trials wvith a correction procedure. The right-hand tetrahedron
shows the surface of independence corresponding to relative reinforcement matching in concurrent free-operant
experimiients. The rulings on this surface are those generated by the diagonal in the rectangular section corre-
sponding to reinforcement probability. (Note: Negative reinforcement refers to avoidance-punishment, Figure 5.)
important phenomenon emerges from proce- dinsky, and Candland, 1958), and occurs un-
dures that include a "correction". der only some circumstances in the pigeon
The intent of the correction is to ensure (Graf, Bullock, and Bitterman, 1964). Shimp
that subjects experience all scheduled rein- (1966, 1973) has shown that probability match-
forcements for the less-frequently reinforced ing for the pigeon depends heavily on the
alternative. The early result with these pro- amount of training and on the kind of correc-
cedures was that response probability to a tion procedure used.
given alternative approximately equalled the We will consider first the traditional pro-
reinforcement probability scheduled for that cedure in which reinforcement is available on
alternative. The phenomenon originally arose every trial, and correction responses are forced
in the context of mathematical theories of to the reinforced alternative when the choice
learning (e.g., Bush and Mosteller, 1955, pp. response was "incorrect". A good example is
310-328; Estes, 1959). The early data (Bruns- provided by the experiment on probability
wik, 1939; Bush and Wilson, 1956; Lauer and matching in the fish by Behrend and Bitter-
Estes, 1954) were not very comprehensive, man (1961) or by the "guidance" procedure
and later workers questioned the probability with pigeons studied by Graf et al. (1964). In
matching phenomenon in some contexts. Bit- Table III below, choice trials and correction
terman and his colleagues maintain that prob- trials are presented separately. Table entries
ability matching with correction procedures refer to absolute frequencies pooled over some
does occur in the fish (Behrend and Bitterman, considerable period of training. The lower
1961), does not in the rat (Bitterman, Wo- row has been labelled Ro to indicate that in
Table III The surfaces corresponding to conditional
Choice Trials matching and error matching differ funda-
Correction Trials
-S S ,S S mentally. The subjects' control of the degree
of freedom corresponding to response prob-
R1 a .b R1 0 c ability results in different matching surfaces
Ro c id Ro 0 aa when the conditioning paradigms control the
n remaining two degrees of freedom in different
ways. In the conditional matching case, P0 and
these experiments, an explicit two-alternative P1 are under experimenter control, while in
response situation is generally used, rather the probability matching case, r and n are
than the non-response alternative used in the under experimenter control.
aversive conditioning studies. The entries for (c) Relative reinforcement matching. The
free-operant concurrent experiment has estab-
correction trials are strictly determined by the
choice behavior because every unreinforced lished still another sort of matching: matching
of relative response rate on two operanda to
choice is followed by a reinforced correction.
For example, unreinforced choices of R1 (a relative reinforcement rate on those operanda.
many) are all followed by a reinforced forcedThe finding requires that variable-interval re-
inforcements be scheduled on at least one of
choice of Ro and appear in the Ro, S cell of the
the operanda, but otherwise it is quite general
correction trials table. The conditional prob-
(Herrnstein, 1970). To analyze this phenome-
abilities P0, P1 for choice trials are not fixed
here. Instead, the total number of reinforce-non in contingency terms we must regard the
ments (n) and the proportion of this total session as divided into small periods of time
eventually delivered for each alternative are(as in Figure 2C) and cross-classify those inter-
fixed. Letting this proportion be denoted by vals containing reinforcement or no reinforce-
= (b + c)/n, the probability matching rela-ment with responding to one or the other of
tion is the finding that the asymptotic prob-the operanda or not responding. A cross-
ability of R1 on choice trials equals Equiv-
I.
classification of this sort results in the 3 x 2
alently, contingency Table IV below. The entries rep-
resent frequencies divided by total number of
P(R) a+b "trials" produced by the assumption of a dis-
n
crete time trial unit, so that they are estimates
or
of long-term probabilities of the joint events.
The VI schedules ensure that all scheduled re-
a c
7 inforcers are eventually received by the subject
= Poo .
Pio
n n
(7) as long as response rates are not very close to
zero. This means that the joint event probabil-
Thus, the "probability" matching relation is ities for S and Ro or R1 (p'o1 or p'll) are held
an "error" matching relation. Subjects dis- constant by the experimenter.
tribute RO, R, alternatives in such a way that
their unreinforced frequencies are equal. This Table IV
condition defines a plane in the contingency
_S SO
=
S -S1
tetrahedron shown in the center panel of
Figure 8. ~(RO or RI) |p 20 P p21
Later variants of this procedure (Graf et al., RI Pplo plli
1964; Shimp, 1966) allowed repetitive errors
on correction trials. The "correct" response Ro plP0oo pOi
was not forced and so the contingency table for
correction trials contained non-zero entries in The cross-classification assumes that the
the ~S column. We will consider a variant of session may be regarded as a long sequence of
this sort later. In either case, response prob- "trials-in-a-row" with no intertrial intervals.
ability is defined in these experiments for This assumption may be less critical in the
choice trials only, and so the equal-error rela- present case than in the classical paradigm
tion is the appropriate description for prob- (Figure 2C) because our purpose here is to
ability matching with both kinds of correction. obtain from Table IV a conditional table for
R1 and Ro entries alone, ignoring nonresponse Subjects distribute responses across alterna-
occurrences6. Thus, any trial definition suf- tives in such a way as to produce no statistical
ficiently sliort to prohibit multiple responses association between their behavior and its con-
will suffice. Slhorter trial definitions simply sequences. Thus, there is a sense in which the
increase the nonresponse row in Table IV and "Law of Effect" is the "Law of No Effect" or
do not alter the conditional probabilities zero contingency between responding and re-
defined by inforcement. Raclhlin (1971) argued a similar
point, but from a priori grounds unrelated to
Pii = P(RiSj IRo or RI) = p'i
P ,i, j, = 0, 1 the independence argument.
This seemingly paradoxical equivalence gen-
k, 1=0 erates more surprise at first bluslh than it de-
(8) .serves on consideration. The matching relation
where RiS denotes the joint occurrence of re- is an asymptotic one and requires averaging
sponse RI and consequence Sj. Then, we may over a large sample. Subsamples of contingency
collapse the 3 x 2 table into a 2 x 2 table as in tables obtained for small segments of a session
Table II, by identifying pij as the conditional are likely to show some degree of contingency
probability of response i and consequence j attribuitable to local rates of reinforcement.
given a -esponse on one of the two operanda. It is often pointed out that in concurrent
The relative rate of response is then schedules, time spent responding to one of the
two operanda increases the probability of rein-
RI forcement on the otlher. Tlhus, the matching
Ro + R, relation may resuilt from subject's responding
so as to "track" local changes in reinforcement
P'io + P'iu plo +
pll P(R1JRo or
R1). probability (Shimp, 1969). A mechanism of
X p'ii this sort has an analogue in the probabilistic
I,j=GO interpretation of 02 outlined for the classical
(9) conditioning situation. Recall that 02 was in-
Similarly, the relative rate of reinforcement terpreted as a relative reduction in error
for R1 is probability going from the a priori to the
a posteriori situation. 02 = O when the overall
R1S1 PP'u _ Pil "error" probability, I - P(S1), equals the con-
ROS + R1S1 P 'o + P 11 -Pol + P)11 (litional error probabilities, 1 - Pi, i = 0,1. Or,
(10) equivalently, 02 = 0 when P(S1) = P0 = P1 =
The well-established matching relation equates
P(R,S1)/P(R1), which is equivalent to the inde-
pendence relation, Equation 12. Thus, a mech-
these two relative rates. In terms of the prob- anism that avoids the alternative with the
abilities, pij, momentarily higher error probability results
in the asymptotic equivalence of conditional
Pio
+ P11 Pii
= Poi +Pii
and overall error rates, or statistical indepen-
dence.
or In the right-hand tetralhedron of Figure 8,
the surface of independence corresponding to
[P(R1)f [P(S,)] = P(R15S). (12) relative reinforcement matching is generated
Buit the probability form of the relation, Equa- in a way that exhibits the other set of rulings
tion 12, is the definition of independence be- not shown in Figure 7. Analogously to P(R),
tween response and reinforcement variables! P(SI) values correspond to rectangular plane
sections orthogonal to the line connecting the
6This formulation ignores "timiie matching" (e.g., midpoints of the front rising edge (plo = 0 =
Killeen, 1972; Rachlin, 1973), since non-response oc- poo, or P(S1) = 1) and the rear bottom edge
currences are ignored. Time matching may be derived (pol = 0 = P'1, or P(S1) = 0). The condition
from the present formulation with additional assump- that P(S1) equal P1 means that the plane cor-
tions about non-responding. For our present purposes, responding to P1 intersects
however, interest centers on the correspondence be- the P(S1) rectangle
tween free-operant and probability matching, and thus along the lower-left to upper-right diagonal.
the time matching development is not pursued here. The surface generated by these diagonals along
with the planes corresponding to P(S1) = P1 = choice trials only, while in the free-operant
¼ is drawn in the figure. situation, there is no distinction between
Comparing the left and right tetrahedra of choice and correction trials.
Figure 8 shows that the surfaces corresponding We wish to propose a discrete-trial analogue
to conditional matching for positive reinforce- of the concurrent schedule in whiclh it is rea-
ment and relative reinforcement matching are sonable to suppose that relative reinforcement
not very different at several points in the tetra-
matclhing wotuld be obtained. Suppose that
hedral volume. At probability of response trials are cued by the onset of choice stimuli
values near either 0, 1, or 0.5, a discriminationand end with responses. Reinforcement is
between these two rules for behavior would available on only some proportion of the
be difficuilt to make. It is only for P(R) valuestrials, and once available, is held until col-
near 0.25 and 0.75 that the conditional matclh- lected by the appropriate response. If inter-
ing surface for positive reinforcement deviates trial intervals are set very short, the central
substantially from the independence surface. difference between such a procedure and the
Thus, it is at these response levels that a discretizedl time analysis of the concurrent sit-
discrimination between the two kinds of uation (Table IV) is the assumption of a re-
matching may be made. sponse on every trial.
Nevin (1969, Experiment I) studied a par-
3. Probability Matching Equals Relative adigm of this sort, but with a 6-sec intertrial
Reinforcement Matching interval, and fouindI matching of response prob-
As indicated above, the discrete-trial cor- ability to relative reinforcement. Shimp (1966,
rection procedure results in equality of the Experiment III) studied a situation close to
absolute error probabilities (poo = plo) on this witlh a zero intertrial interval, but with an
choice trials. In the free-operant concurrent observing response required to initiate choice
situation, matching results in equality of the trials. Again, matching to relative reinforce-
conditional error probabilities (1-P0 = ment was confirmed.
1 - P1). We believe that the distinction is We may analyze this paradigm from the
superficial rather than fundamental, and re- probability matclhing approaclh by distinguish-
sults from different data-reporting conventions, ing three sorts of trials in Table V below.
rather than different kinds of behavior. The Trials in whiclh reinforcement is first available,
argument is simple but important, and rests correction trials in whiclh reinforcement is
upon a similarity between procedures in the available and was available but not collectedl
two situations. The correction procedure lhas on the preceding trial, and choice trials in
an analogue in the unlimited hold feature of which reinforcement is not available are indi-
VI schedules. Once reinforcement is assigned, cated separately.
it is held until collected by the subject. The The two left tables are the same as those for
two situations differ in data-reporting con- the probability matching experiment (Table
ventions. In the probability matching experi- III) with the addition of unreinforced correc-
ment, response probability is reported for tion responses, a' and c'. The probability
Tab le V
Choice Correction Choice
So S1 So so S1
R1
a b R1 a' c R1 a" 0
R, c d Ro c' a Ro C" 0
I n
b+c
n = a + b + c + d, =
reinforcement available reinforcemiient

not available
Table V'I
So S, Y.
R, A n7r = b + c A+
n_
R,, (~ A) n(Il-r) = a + d ((I-7T) (-+ n)
X A nA7 + n
matclhing finding has been riestricted in the probability matching are equivalent on this
prece(ling analysis to trials wlhen reinforcement constrtuction represents a marriage of sorts be-
was available, witlh the restult that errors in tween (liscrete-trials andl free-operant proce-
the choice table were approximately eqtual. dtures. The free-operant case generalizes the
The extension of the probability matclhing discrete-trial case by allowing three response
performance to all tlhree tables constittutes otur alternatives andl unreinforced trials. That rela-
proposal for the correspondence between prob- tive-reinforcement matching subsumes prob-
ability matclhing and rielative reinforcement ability matching is thus possibly not surprising.
matching. If subjects matclh their probability But that they are both equal to a zero con-
of response to IT = (b + c)/n, and if they do so tingency between belhavior and consequences
for all three tables, then pooling the three remains puizzling. The arguments for a con-
kinds of trials into a single contingency table tinual adjustment away from dependence
results in Table VI above, wlhere A = a + a' + a". towards independence, ouitlined above, leave
Response probability, P(R1), is of course equal us witlh some residual tuneasiness. Certainly
to 7r in the pooled table, but ir now represents the (letails of suclh a dynamic equilibrium
the proportion of the total reinforcements deserve study.
aissigned and collected for R1. That is, dividing
all entries by the total, A/r + n, one obtains
the asymptotic relative frequencies pij for the REFERENCES
joint events, and Ayres, J. J. B. and Quinsey, V. L. Between groups in-
centive effects on conditioned suppression. Psycho-
Pi nomic Scietice, 1970, 21, 294-296.
P(R1) = IT Azrin, N. H. Some effects of twvo intermiiittent sched-
Poi + Pii uiles of iimmediate and non-immediate punishment.
whicli is the relative reinforcement matching Journal of Comparative and Physiological Psy-
or independence relation, Equation 10. chology, 1956, 42, 3-21.
Azrin, N. H. and Holz, W. C. Punishlmient. In W. K.
On this construction then, the unlimited Honig (Ed.), Operant behavior: areas of research
hold property of variable-interval schedtules and application. Newv York: Appleton-Century-
represents a sort of generalized correction pro- Ciofts, 1966. Pp. 380-447.
cedure in which corrections are not cued and Badia, P. and Culbertson, S. The relative aversiveness
of signalled vs. unsignalled escapable and inescap-
not forced. It seems likely that botlh these able shock. Journal of the Experimental Analysis of
features, as well as tinreinforced clhoice trials, Behavior, 1972, 17, 463-471.
are important to matching. For example, Behrend, E. R. and Bitterman, M. E. Probability-
Shimp (1966, Experiment II) found that matching in the fish. A merican Journal of Psy-
pigeons maximized wlhen correction trials were chology, 1961, 74, 542-551.
Benedict, J. 0. and Ayres, J. J. B. Negative con-
differentially cued. When reinforcement was tingency discrimination: differentiation by rats be-
scheduled for all choice trials, but corrections tween safe and random stimuli. Journal of Com-
were not differentially cued, behavior was in- parative and Physiological Psychology, 1972, 82,
termediate between matching and maximizing 323-330.
(Experiment I). Under these conditions, while Bittermiian, M. E., Wodinsky, J., and Candland, D. K.
Sonme comparative psychology. A merican Journal of
corrections are not explicitly cued, non-rein- Psychology, 1958, 71, 94-110.
forcement may become a cue for switching. Brown, P. L. and Jenkins, H. M. Auto-shaping of the
Only when correction trials were completely pigeon's key peck. Journal of the Experimental
indiscriminable from choice trials (Experiment Analysis of Behavior, 1968, 11, 1-8.
III) was matclhing obtained. Brunswvik, E. Probability as a determiner of rat be-
havior. Journal of Experimental Psychology, 1939,
That relative-reinforcement matching and 25, 175-197.
Bull, J. A. and Overmier, J. B. Additive and sub- Journal of the Experimental Analysis of Behavior,
tractive properties of excitation and inhibition. 1964, 7, 151-157.
Journal of Comparative and Pysiological Psychology, Herrnstein, R. J. On the law of effect. Jouirnal of the
1968, 66, 511-514. Experimental Analysis of Behavior, 1970, 13, 243-266.
Bush, R. R. Some problems in stochastic learning Herrnstein, R. J. and Hineline, P. N. Negative rein-
models with three or more responses. Mathematical forceinent as shock-frequency reduction. Journal of
models of human behavior: proceedings of a the Experimental Analysis of Behavior, 1966, 9,
symposium. Darien, Connecticut: Dunlap & Associ- 421-430.
ates, Inc., 1955. Pp. 22-24. Hunt, H. F. and Brady, J. V. Some effects of punish-
Bush, R. R. and Mosteller, F. Stochastic models for ment and intercurrent "anxiety" on a simple oper-
learning. New York: Wiley, 1955. ant. Journal of Comparative and Physiological
Bush, R. R. and Wilson, T. R. Two-choice behavior Psychology, 1955, 48, 305-310.
of paradise fish. Journal of Experimental Psychology, Jones, R. R. The effects of rate of delivery of re-
1956, 51, 315-322. sponse-independent shocks upon avoidance respond-
Byrd, L. D. Responding in the cat maintained under ing. Joutrnal of the Experimental Analysis of Be-
response-independent electric shock and response- havior, 1969, 12, 1023-1028.
produced electric shock. Journal of the Experimental Kadden, R. M. Response-dependence and predict-
Analysis of Behavior, 1969, 12, 1-10. ability of shock as continuous variables. Proceedings
Catania, A. C. Elicitation, reinforcement, and stimulus of the American Psychological Association, 1971, 6,
control. In R. Glazer (Ed.), The nature of reinforce- 707-708.
ment. New York: Academic Press, 1971. Pp. 196-220. Kamin, L. J. Temporal and intensity characteristics of
Church, R. M. Response suppression. In B. A. Camp- the conditioned stimulus. In W. F. Prokasy (Ed.)
bell and R. M. Church (Eds.), Punishment and Classical conditioning, New York: Appleton-Century-
aversive behavior. New York: Appleton-Century- Crofts, 1965. Pp. 118-147.
Crofts, 1969. Pp. 111-156. Kelleher, R. R., Riddle, W. C., and Cook, L. Per-
Church, R. M., Wooten, C. L., and Matthews, T. J. sistent behavior maintained by unavoidable shocks.
Contingency between a response and an aversive Journal of the Experimental Analysis of Behavior,
event in the rat. Journal of Comparative and Physio- 1963, 6, 507-517.
logical Psychology, 1970, 72, 476-485. Kendall, M. G. and Stuart, A. The advanced theory
Davis, H. and McIntire, R. W. Conditioned sup- of statistics, Vol. 2, New York: Hafner, 1967.
pression under positive, negative, and no contingency Killeen, P. A yoked-chamber comparison of concurrent
between conditioned and unconditioned stimuli. and multiple schedules. Journal of the Experimental
Journal of the Experimental Analysis of Behavior, Analysis of Behavior, 1972, 18, 13-22.
1969, 12, 633-640. Kop, P. F. M. and Snapper, A. G. Responding and
Edwards, A. W. F. The measure of association in a heart rate under temporally defined schedules of
two-by-two table. Journal of the Royal Statistical signalled electric shock. Proceedings of the Ameri-
Society, 1963, A, 126, 109. can Psychological Association, 1971, 6, 703-704.
Estes, W. K. An experimental study of punishment.
Psychological Monographs, 1944, 57 (3, Whole No. Kremer, E. F. and Kamin, L. J. The truly randoimi
control procedure: associative or non-associative
263). effects in the rat. Journal of Comparative and
Estes, W. K. The statistical approach to learning Physiological Psychology, 1971, 74, 203-210.
theory. In S. Koch (Ed.), Psychology: a study of a
science. Vol. 2, New York: McGraw-Hill, 1959. Lauer, D. W. and Estes, W. K. Observed and predicted
Pp. 380-491. terminal distributions of response probability under
Feinberg, S. and Gilbert, J. P. The geometry of a two- two conditions of random reinforcement. American
by-two contingency table. Journal of the American Psychologist, 1954, 9, 413. (Abstract)
Statistical Association, 1970, 65, 694-701. Lockard, J. S. Choice of a warning signal or no signal
Gibbon, J. Discriminated punishment: avoidable and in an unavo-dable shock situation. Journal of Com-
unavoidable shock. Journal of the Experimental parative and Physiological Psychology, 1963, 56, 526-
Analysis of Behavior, 1967, 10, 451-460. 530.
Gibbon, J. Timing and discrimination of shock density Lyon, D. 0. Some notes on conditional suppression
in avoidance. Psychological Review, 1972, 79, 68-92. and reinforcenment schedules. Journal of the Experi-
Gibbon, J., Berryman, R., and Thompson, R. L. Con- mental Analysis of Behavior, 1964, 7, 289-291.
tingency spaces and random controls in classical McKearney, J. W. Maintenance of responding under
and instrumental conditioning. Paper presented at a fixed-interval schedule of electric shock presenta-
the 41st annual convention of the Eastern Psycho- tion. Science, 1968, 160, 1249-1251.
logical Association. Atlantic City, N.J., April, 1970. McKearney, J. W. Fixed-interval schedules of electric
Goodman, L. A. and Kruskal, W. H. Measures of shock presentation: extinction and recovery of per-
association for cross classifications. Journal of the formance under different shock intensities and fixed-
American Statistical Association, 1954, 49, 732-764. interval duration. Journal of the Experimnental
Goodman, L. A. and Kruskal, W. H. Measures of Analysis of Behavior, 1969, 12, 301-313.
association for cross classifications, II: Further dis- Morse, W. H., Meade, R. N., and Kelleher, R. T.
cussion and references. Journal of the American Modulation of elicited behavior by a fixed interval
Statistical Association, 1959, 54, 123-163. schedule of electric shock presentation. Science,
Graf, V., Bullock, D. H., and Bitterman, M. E. Further 1967, 157, 215-217.
experiments on probability-matching in the pigeon. Neffinger, G. G. Effects of contingency variations upon
avoidance responding in rats. Unpublished doctoral Seligman, M. E. P., Maier, S. F., and Solomon, R. L.
dissertation, Columbia University, 1972. Unpredictable and uncontrollable aversive events. In
Neuringer, A. J. Animals respond for food in the F. R. Brush (Ed.), Aversive conditioning and learn-
presence of free food. Science, 1969, 166, 399-401. ing, New York: Academic Press, 1971. Pp. 347-400.
Neuringer, A. J. Many responses per foo(d reward Shimp, C. P. Probabilistically reinforced choice be-
with free food present. Science, 1970, 169, 503-504. havior in pigeons. Journal of the Experimental
Nevin, J. A. Interval reinforcement of choice behavior Analysis of Behavior, 1966, 9, 443-455.
in discrete trials. Journal of the Experimental Shimp, C. P. Optimal behavior in free-operant experi-
Analysis of Behavior, 1969, 12, 875-887. ments. Psychological Review, 1969, 76, 97-113.
Quinsey, V. L. Conditioned suppression with no CS-US Shimp, C. P. Probabilistic discrimination learning in
contingency in the rat. Canadian Journal of Psy- the pigeon. Journal of Experimental Psychology,
chology, 1971, 25, 69-82. 1973, 97, 292-304.
Rachlin, H. On the tautology of the matching lawv. Sidley, N. A. and Schoenfeld, W. N. Behavior stability
Journal of the Experimtlental Analysis of Behavior, and response rate as functions of reinforcement
1971, 15, 249-251. probability on "randoin ratio" schedules. Journal of
Rachlin, H. Contrast and matching. Psychological the Experimnental Analysis of Behavior, 1964, 7,
Review, 1973, 80, 217-234. 281-283.
Rescorla, R. A. Pavlovian conditioning and its proper Sidman, M., Herrnstein, R. J., and Conrad, D. G.
control procedures. Psychological Review, 1967, 74, Maintenance of avoidance behavior by unavoidable
71-80. shocks. Journal of Comparative and Physiological
Rescorla, R. A. Probability of shock in the presence Psychology, 1957, 50, 553-557.
and absence of CS in fear conditioning. Journal of Stein, L., Sidman, M., and Brady, J. V. Some effects
Comparative and Physiological Psychology, 1968, 66, of two temporal variables on conditioned suppres-
1-5. sion. Journal of the Experimental Analysis of Be-
Rescorla, R. A. Conditioned inhibition of fear result- havior, 1958, 1, 153-162.
ing from negative CS-US contingencies. Journal of Stretch, R., Orloff, E. R., and Dalrymple, S. D. Main-
Comparative and Physiological Psychology, 1969, 67, tenance of responding by fixed-interval schedule of
504-509. electric shock presentation in squirrel monkeys.
Rescorla, R. A. and Skucy, J. C. Effect of response- Science, 1968, 162, 583-586.
independent reinforcers during extinction. Journal Wagner, A. R. Stimulus validity and stimulus selec-
of Comparative and Physiological Psychology, 1969, tion in associative learning. In N. J. Mackintosh and
67, 381-389. W. K. Honig (Eds.), Fundamental issues in associa-
Rescorla, R. A. and Wagner, A. R. A theory of Pav- tive learning, Halifax, Nova Scotia, Canada: Dal-
lovian conditioning: variations in the effectiveness housie University Press, 1969. Pp. 90-122.
of reinforcement and non-reinforcement. In A. Black Williams, D. R. and Williams, H. Auto-maintenance
an(d W. Prokasy (Eds.), Classical conditioning II: in the pigeon: sustained pecking despite contingent
Current research and theory. New York: Appleton- non-reinforcement. Journal of the Experimental
Century-Crofts, 1972. Pp. 64-99. Analysis of Behavior, 1969, 12, 511-520.
Seligman, M. E. P. Chronic fear produced by un- Yule, G. U. On the methods of measuring association
predictable shock. Journal of Comparative and between two attributes. Journal of the Royal Sta-
Physiological Psychology, 1968, 66, 402-411. tistical Society, 1912, 75, 759.
Seligman, M. E. P. Control group and conditioning: a
comment on operationism. Psychological Review,
1969, 76, 484-491. Received 15 March 1973.
Seligman, M. E. P. On the generality of the laws of (Final Acceptance 19 November 1973.)
learning. Psychological Review, 1970, 77, 406-418.

Gibbon, Berryman, Thompson (1974)

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Gibbon, Berryman, Thompson (1974)

Uploaded by

Copyright:

Available Formats

JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR 1974, 21, 585-605 NUMBER 3 (MAY)

CONTINGENCY SPACES AND MEASURES IN

The contingency between conditional and unconditional stimuli in classical conditioning

been distinguished from negative contin-

STEIN, SIDMAN, a BRADY 1958

geometrically distributed, witlh a mean propor- If reinforcements were substituted for

CONDITIONAL RELATIVE REINFORCEMENT

reinforcement available reinforcemiient

You might also like