The Indo-European root sekw-, to follow, links consequence to sign and designate (from the Latin signum, something that one follows) and to social and association (from the Latin socius, a companion or follower). It shares its prefix, com-, with conditioning, contingency and contiguity. Conditioning, from the Indo-European root deik-, to show or pronounce, has many relatives: diction, from the Latin dicere, to say; teach, from the Old English taecan, to show or instruct; judgment, from the Latin judex, he who pronounces the law; and paradigm, from the Greek para, beside, plus deiknunai, to show. Contingency, from the Latin contingere, to touch on all sides or to happen, has varied meanings: a possibility, a condition of depending on chance, something incidental to something else. Like contact, it combines the roots com-, together, and tangere, to touch. Contiguity, the condition of touching or being in contact, has the same origins. Contact, contingency and contiguity are easily confused in learning: contingency, in its technical sense, stresses how the likelihood of one event is raised or lowered by other events, whereas contiguity emphasizes the juxtaposition of events in space or time without regard to contingency.

Reinforcement
Mazes and Learning Curves
Experimental Chambers and Cumulative Records
The Vocabulary of Reinforcement
Extinction
Extinction Versus Inhibition
Response-Reinforcer Contingencies and Reinforcer Deliveries
Side Effects of Extinction
Extinction Versus Free or Noncontingent Reinforcement
Addendum A: Extinction and Superstition

KEY TERMS: Reinforce; Reinforcer; Reinforcement; Contingency; Extinction; Spontaneous Recovery; Free or Noncontingent Reinforcement; Superstition; Side Effects

First, a reminder. Now and for a while we'll concentrate only on behavior without words. We'll consider human examples, but mainly in cases where our human role is irrelevant, and for that reason it is very easy not to notice the human examples. They are embedded among examples involving nonhuman organisms such as pigeons and rats, and readers often miss the use of human behavior to illustrate the relevance of findings from nonhuman research. For example, one exam question I used for several semesters of teaching these topics asked how many instances of human behavior were mentioned in the text through the chapters corresponding to the present Chapter 8. The four choices were: a) none; b) fewer than 10; c) about 20; d) more than 40. By actual count, more than 50 human examples appeared in those chapters in the first and second editions (there are even more now than there were then). The modal answer, however, was typically a or b. In other words, some students thought there weren't any at all, and most of the rest didn't notice more than ten of them. If you don't think d is the correct answer after reading those chapters, try actually counting the human examples.

The advantage of starting with nonhuman organisms is precisely that they don't talk: unlike us, they are nonverbal creatures. We must resist the temptation to treat what they do as if they talk to themselves about it the way we talk to ourselves. The absence of talk makes them so different from us that we would be well advised to take nothing about them for granted and to deal with them as if we had just come upon them as aliens, which in a way is what they are relative to us. Yet all of our behavior that involves talk is built upon nonverbal foundations that are covered over by the talk, and nonverbal organisms provide our best path to understanding those foundations.
When we get to our human learning with words, we'll be in a far better position to deal with what is special about it because we first surveyed learning without words.

Behavior has consequences, and an important property of behavior is that it can be affected by its consequences. We study this phenomenon by arranging consequences for behavior, but doing so involves more than just presenting stimuli. The stimuli must occur in some relation to behavior. We have to arrange the environment so that responses make something happen.

Natural environments already provide consequences for behavior. Even before we intervene, organisms change their environments by doing things or by going from one place to another. But we can better study how consequences affect behavior by arranging consequential environments in the laboratory. For example, I can build a maze in which a water-deprived rat finds water after making an appropriate sequence of turns, or a chamber in which a food-deprived pigeon produces food by pecking a key on the wall. Then I can find out whether water affects the turns the rat takes as it runs through the maze or food affects how often the pigeon pecks the key.

In this and the next chapter we'll explore the historical development of research on the consequences of behavior, and we'll see how reinforcement is relevant not only to behavior maintained by physiologically significant consequences such as food and water but also to behavior with more subtle consequences, as when, in sensory-motor interactions, my eye movements affect what I see.

REINFORCEMENT

Chapter 2 introduced Thorndike's experiments, in which animals learned to escape from problem boxes by operating a device that released the door. Typically, a food-deprived animal was placed inside the box with food available outside. In its varied activity, the animal sooner or later operated the device and was free to leave the box. At first this was a low-probability response, but because it opened the door its probability went up over repeated trials.

Thorndike described how the consequences of responding affected later responding with a principle he called the Law of Effect. The law went through many revisions, but its essence was that response probability could be raised by some consequences and lowered by others. In language closer to Thorndike's, responses with satisfying effects were stamped in whereas those with annoying effects were stamped out. The earliest version of Thorndike's law was called the strong Law of Effect. Later, he repudiated the second half of the law, keeping the probability increase or stamping in but discarding the probability decrease or stamping out. What remained was then called the weak Law of Effect. This historical point will be relevant later, when we deal with punishment.

Figure 5-1. A learning curve: a cat's time to escape from a problem box as a function of trials. (From Thorndike, 1898, Figure 1.)

Figure 5-1 shows data from one of Thorndike's cats. To escape from its box, this cat had to pull a string running from a wire loop at the front of the box to a bolt holding the door. The first time in the box the cat took 160 seconds to escape (from here on, seconds will often be abbreviated s).
Its time decreased gradually and irregularly over successive trials until, during the last few trials, the cat reliably escaped in less than 10 s. This gradual decrease in the time taken to complete a task came to be called trial-and-error learning. Köhler later contrasted this gradual change with the sudden or insightful solutions he observed with chimpanzees.

In later years, trial-and-error learning was studied with many different organisms in many different situations. Experimenters believed that the intelligence of different species could be compared by seeing how rapidly learning proceeded in problem boxes, mazes, runways and other apparatuses (e.g., Hilgard, 1951). Apparatus design began to be dictated by theoretical questions: Does learning take place in discrete steps, on an all-or-none basis, or instead gradually and continuously? Do organisms learn movements (response learning) or properties of the environment (stimulus learning)? Do the consequences of responding lead directly to learning, or do they only make the organism perform so as to show what it had learned in other ways?

A common feature of these experiments was that responding became more probable when it had certain consequences. The change in probability was measured differently depending on the apparatus and the experimental aims. Graphs showing how behavior changed during an experiment were called learning curves: time to complete a response as a function of number of trials (e.g., Figure 5-1); percentage of correct responses; proportion of animals reaching some criterion of successful performance. Sometimes these measures were transformed to ease comparisons among them. When rats ran through a maze, for example, time to run from startbox to goalbox usually decreased, whereas percentage of correct turns and proportion of rats making errorless runs increased. Converting time to run the maze to speed (the reciprocal of running time) made all three measures increase with learning. But the shapes of learning curves depended so much on apparatuses and on the measures taken that the progress of learning couldn't be described in any unitary way.
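To make the time-to-speed transformation concrete, here is a minimal sketch in Python using hypothetical trial times (the numbers are invented for illustration, not Thorndike's data); converting each running time to its reciprocal turns a decreasing measure into an increasing one, so it can be compared directly with measures such as percentage of correct turns.

```python
# Hypothetical running times (s) from startbox to goalbox over successive trials.
trial_times = [160.0, 95.0, 60.0, 42.0, 30.0, 21.0, 15.0, 12.0, 10.0, 9.0]

# Speed is the reciprocal of running time: as times fall, speeds rise.
speeds = [1.0 / t for t in trial_times]

for trial, (t, v) in enumerate(zip(trial_times, speeds), start=1):
    print(f"Trial {trial:2d}: time = {t:6.1f} s, speed = {v:.4f} runs/s")
```

Plotted against trials, the time measure traces a falling learning curve while the speed measure traces a rising one, even though both describe the same behavior.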
The problem was that these experiments produced complicated behavior. For example, measuring the time course over which a rat stopped entering blind alleys as it learned its way through a maze didn't show how learning proceeded at a single choicepoint. This consideration led to the gradual simplification of mazes, as illustrated in Figure 5-2.

Diagram A shows the plan of the earliest maze used to study animal learning (Small, 1899-1900), a 6-by-8-foot modification of the hedge maze at Hampton Court in England. (Curiously, such mazes also provided settings for the hydraulically operated statues that inspired Descartes to formulate his concept of the reflex; cf. Chapter 4.) When a cage door at the start was lifted, rats could enter the maze; food was in the goal area at the center. As their experience grew they reached the goal area more rapidly and with fewer wrong turns along the way. But it was difficult to examine learning at any given choicepoint. Learning might have been quicker at choicepoint 1 in diagram A than at 7 either because 1 was earlier in the maze than 7 or because their floor plans differed. Learning might have been quicker at 4 than at 5 either because 4 could be approached in two ways, from 3 or from 5, or because a rat that usually went directly from 3 to 4 only rarely encountered 5.

Gradually, mazes evolved into more systematic forms, as in diagram B. In this maze, sometimes called a U-maze after the form of its successive units, the choicepoints were essentially the same as the rat approached each one; they differed only in where they were in the sequence and in whether left or right turns were correct. This kind of systematic arrangement made it easy to specify the correct sequence (in B, right-left-right-left-left-right) and to keep track of errors. Even here, however, position and sequence interactions complicated the analysis. For example, was the rat's choice of left at 4 affected by the preceding right turn at 3 or the following left turn at 5? Would it matter if the rat approached 4 after coming back from the blind alley at 3, having made an error there, instead of after a correct right turn at 3? Does it matter that 4 is in the middle rather than near the beginning or end?

It was perhaps inevitable that the maze would be reduced to a single choicepoint, as in the T-maze shown with a goalbox on the right in diagram C of Figure 5-2. Here, when leaving the startbox the rat has only to make a single choice of right or left. But complications were still possible. For example, suppose one rat on its first trial in this maze turns right while a second rat turns left. Should the second one be allowed to retrace its steps after reaching the empty box at the end of the left arm? If it returned to the startbox instead, should it be guided to the goalbox by blocking the left arm, to give it time in the goalbox equal to the first rat's?

The next logical step was to eliminate choicepoints completely, leaving nothing but a simple runway, as in diagram D. Now no errors are possible, and measures of behavior can be reduced to just the speed of the rat's run from startbox to goalbox.

Figure 5-2. Stages in the evolution of mazes in studies of animal learning: A, the Hampton Court maze adaptation (after Small, 1899-1900); B, a U-maze with six choicepoints; C, a single-choicepoint T-maze; and D, the runway or straight alley.

But the T-maze and the runway raised other problems. Average measures of the performance of a group didn't necessarily represent the performances of the individuals in the group. For example, suppose each rat running in a T-maze changes abruptly from making frequent errors to making consistently correct choices, but the change occurs on different trials for different rats. Within a large group of rats, 65% might make correct turns by trial 5, 72% by trial 6, 79% by trial 7, and so on, until performance stabilizes at 98% to 100% by trial 20. This average performance, giving the appearance of a gradual increase in correct choices, would completely obscure the abrupt change in errors for the individual rats (Sidman, 1952).
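The averaging artifact that Sidman pointed to is easy to reproduce in a short simulation. The sketch below (plain Python, with invented parameters) gives each simulated rat an abrupt switch from chance-level choices to consistently correct choices on a randomly chosen trial; the group average nevertheless rises smoothly, hiding the abrupt individual change.

```python
import random

random.seed(1)
N_RATS, N_TRIALS = 50, 20

def rat_curve():
    """One simulated rat: about 50% correct before its switch trial, 100% after."""
    switch = random.randint(3, 12)   # trial on which this rat abruptly 'gets it'
    return [1.0 if t >= switch else (1.0 if random.random() < 0.5 else 0.0)
            for t in range(1, N_TRIALS + 1)]

curves = [rat_curve() for _ in range(N_RATS)]

# Group average per trial: a smooth, gradual-looking learning curve.
for t in range(N_TRIALS):
    mean_correct = sum(c[t] for c in curves) / N_RATS
    print(f"Trial {t + 1:2d}: {100 * mean_correct:5.1f}% correct on average")
```

Each individual record jumps from roughly 50% to 100% in a single trial, yet the printed group averages climb gradually, much like the hypothetical percentages described above.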
Even the simple runway wasn't an adequate solution, because the speed of a rat's running down an alley can be affected by many variables. If trials began with the opening of a startbox door, the measured time depended on which way the rat was facing when the door opened. Other factors included the experimenter's handling of the rat when moving it back from goalbox to startbox, odor trails left by other rats, and even whether the goalbox had enough room to allow a running rat to slow down before banging its head against the goalbox wall (Killeen & Amsel, 1987).

With mazes or runways, the experimenter had to return the rat from goalbox to startbox to start a new trial. Thus, the experimenter rather than the organism determined when behavior occurred. Furthermore, measuring how long the rat took didn't specify what it was actually doing during that time. Two experimental innovations helped to solve these problems. The first was an apparatus designed so that the organism could repeatedly emit an easily specified response without the experimenter's intervention; the second was a recording method based directly on the rate or frequency of responding rather than on indirect measures derived from response sequences or from groups of organisms. These innovations, inspired partly by an interest in reducing the handling of the organism and thereby simplifying the experimental work, were important features of a line of research initiated by B. F. Skinner (1930, 1938, 1950); see especially Skinner (1956) for a history of these developments.

Experimental Chambers and Cumulative Records

Figure 5-3 illustrates two representative apparatuses: on the left a standard rat chamber with a single lever, and on the right a three-key pigeon chamber. They share response devices, mechanisms for delivering reinforcers such as food or water, and stimulus sources.

In a typical arrangement, a rat that has been food-deprived is placed in the chamber. A lever protrudes from one wall. Near the lever is a food cup into which food pellets can be dispensed from a delivery system on the other side of the wall. A distinctive sound accompanies each pellet delivery. The houselight provides general illumination, and noise can be broadcast from the speaker to mask sounds from outside the chamber.

The first step is feeder training. Pellets are delivered into the food cup. Sooner or later, the rat finds and eats them. Once this happens, pellet deliveries continue until the rat comes quickly to the food cup from any location upon each delivery; 40 or so pellets are usually enough. Once feeder training is done, the apparatus is changed so that pellet deliveries depend on lever presses. Eventually the rat presses the lever, the press produces a pellet, and the pellet occasions eating. The rat will then probably go back to the lever and press again. (We'll consider alternatives to waiting for the lever press in a later chapter, on shaping.) The outcome of interest is the rate at which the rat presses the lever. If it increases, we call the food pellet a reinforcer. In the type of chamber shown in Figure 5-3, other kinds of reinforcers can be substituted. For example, the pellet dispenser could be replaced by a dipper that delivers small amounts of water or milk.
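The contingency just described, in which a pellet is delivered when and only when the lever is pressed, can be sketched as a tiny event loop. The following Python sketch is purely a simulation (the press probability and session length are invented; no real chamber hardware or interface is implied); it only shows the logic of making a reinforcer depend on a response and logging both.

```python
import random

random.seed(0)
SESSION_STEPS = 600          # simulated seconds in the session
PRESS_PROBABILITY = 0.05     # chance of a lever press in any given second (invented)

press_times, pellet_times = [], []

for second in range(SESSION_STEPS):
    lever_pressed = random.random() < PRESS_PROBABILITY   # stand-in for the real switch closure
    if lever_pressed:
        press_times.append(second)
        pellet_times.append(second)   # the contingency: every press produces a pellet

rate_per_min = 60.0 * len(press_times) / SESSION_STEPS
print(f"{len(press_times)} presses, {len(pellet_times)} pellets, "
      f"about {rate_per_min:.1f} presses/min")
```

In a real experiment the interesting question is whether the press rate rises once this contingency is in place; in the simulation the rate is fixed, so the sketch illustrates only the bookkeeping, not the behavioral effect.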
A pigeon chamber differs from one for rats in that keys substitute for levers and the feeder works by bringing a tray of pigeon food (mixed grain or commercially available pellets) within the pigeon's reach for a few seconds. The feeder opening is centered below the keys. It is common practice to light the feeder and turn off all other lights whenever the feeder is operated. The chamber typically includes other features, such as a houselight for dim general illumination, masking noise or other auditory stimuli, and so on.

Figure 5-3. A rat chamber (left) and a three-key pigeon chamber (right). The rat chamber includes a lever (A), a food cup and pellet delivery tube (B), a speaker (C), and a lamp or houselight (D); some rat chambers include a grid floor through which shock can be delivered (E). The pigeon chamber includes three keys (F, G, H) and the opening to a food hopper (I); lamps or projectors behind each key allow colors or patterns to be displayed on them.

A pigeon key is a piece of plastic mounted behind a hole in the chamber wall. It is attached to a switch that records the pigeon's pecks if they're forceful enough (keys are routinely sensitive to forces of less than 0.1 N, or Newtons, about 10 g or one-third of an ounce). The plastic is usually translucent or transparent, so that lamps or miniature projectors or a computer monitor behind the key can present patterns or colors on it. The chamber in Figure 5-3 contains three keys, arranged horizontally about 23 cm (9 in) above the chamber floor. A given experiment might use only one key, some combination of two, or all three. Keys are typically lit when they're in use. As with the rat, if a food-deprived pigeon's key pecks produce food, the rate at which the pigeon pecks the key will ordinarily increase.

The rat and the pigeon are common laboratory organisms. Each has idiosyncratic species-specific behavior patterns that must be taken into account, and we mustn't assume that what we observe with rats or pigeons can be generalized to other organisms (though it often can be). Nevertheless, the diet, housing, susceptibility to disease and other characteristics of these animals are reasonably well understood, and their size, relatively long lifespan and economy make them particularly convenient. Thus, they have often served in research on the consequences of responding.

Responding in apparatuses like those in Figure 5-3 has sometimes been called free-operant responding: free because the organism is free to emit the response at any time rather than waiting for the experimenter (the rat in a goalbox can't run through the maze again until it is placed back in the startbox and the experimenter opens the startbox door), and operant because the response operates on the environment.

Free-operant responding lends itself to a recording method, the cumulative record, that gives a convenient, detailed picture of how responding changes over time. Most contemporary cumulative records are produced by computer, but in the original cumulative recorder, illustrated in Figure 5-4, a roll of paper was threaded around a roller. A motor drove the roller at a constant speed, feeding out the paper. A pen or other writing device rested on the paper as it passed over the roller, and each response (e.g., a pigeon's key peck) moved the pen a small distance at right angles to the movement of the paper. Thus, at any time during a session the record shows the total responses accumulated.

Figure 5-4. Main components of a cumulative recorder. A roller drives the paper at a constant speed, and each response moves the pen a fixed distance across the paper. Paper speed and step size per response vary with the behavior under study. A common scale is about 1 cm/min (about 2.5 min/in) and 1100 responses across the paper width (about 200 responses/in). At this scale, a slope of 45° represents a rate of about 40 responses/min. When the pen moves near the top of the paper, it resets automatically to its starting position near the bottom.
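A cumulative record is simply a running count of responses plotted against time, so it is easy to construct one from a list of response times. The sketch below (Python; the response times are invented) builds the cumulative count and estimates the overall rate, which is the quantity the record's slope displays; on the paper scale described in the Figure 5-4 caption, a 45° slope corresponds to roughly 40 responses per minute.

```python
# Invented response times, in seconds from the start of the session.
response_times = [5, 9, 12, 20, 21, 23, 40, 41, 44, 47, 60, 75, 76, 78, 90]

# Cumulative record: after each response, the count (pen position) steps up by one.
cumulative = [(t, i + 1) for i, t in enumerate(response_times)]

session_length_s = 90.0
overall_rate = 60.0 * len(response_times) / session_length_s   # responses per minute

for t, count in cumulative:
    print(f"{t:3d} s  cumulative responses = {count}")
print(f"Overall rate is about {overall_rate:.1f} responses/min (the record's average slope)")
```

Between responses the count stays flat, which is why pauses appear as horizontal segments in a cumulative record.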
Even so different patterns of responding are casly distinguished. For example, response rates in E and F are roughly the same, but E is steplike whereas F is relatively smooth, ‘Thismeans that E.was produced by short high ate bursts of responding (step segments) separated by pruses (fat segments), whereas P was prochced Pon Direction of Figure 5.4 ain components ofa cumulative recorder, A roller drives the paper at @ constant speed and each Fesponse moves the pen a fixed distance across ihe paper. Paper speed and step-size per responso vay with the bbehavior understudy. A common scale is abou 1 cmimin (about 25 minin) and T100 responses across the paper wih (about 200 responsesin). At his scala, a sor of 45" represents arate of about 40 responsesimnin, When the pen moves near the top of the paper, it resets automaticaly tits starting postion near te baton. 64 + PARTI: LEARNING WITHOUT WORDS i i 1 10 Mites Figure $5 Sample comulate records. A.45* slope represen chout 20 reeponsasimin. Records A and B fer Inatnly in response rate higher in & than in B, Rate zero through most of C; a magnified segment of C with afew Fesponses is shown in i relation 1 an event record, Records Eand F are about equal in rate but show diferent patterns of responding: Es slepie, showing periods of responding aerating with pauses; the smoother.grained shows relatively steady responding. Records G and H show rales that change overtime, decreasing in G (neg tive acceleration) and increasing in H (positive acceleration). by mote uniform responding. This property i sometimes called gi of the two records Eas rougher gan than F Records G and H ilustate other properties ‘of behavior made visible in cumulative records In G, the rate begins at oughly 25 zesponses/ min bt gradually decreases as time passes: in H, it changes ia the opposite direction, increasing fom aelatively low tate to roughly 30 responses/ min (records in which slopes decrease over ime are called mgt) asa those in which they increase are called pase alae Figure 5.6 shows other features sometimes incorporated into cumulative records. Records A. and E show how pen displacements can indicate other events besides responses. Here only some respoises produced food, iregulaly in A (as at 4 band and regolasy in B (as at d and 4. The ‘epeated concave patter in B, as berween d and 4, is sometimes called scalping Ta C, responding that kegan at f produced food at gs indicated by the pen displacement. The pen then reset to hand fhe sequence repeated at # to f and co on. This ‘ype of record makes successive segments easy to (CHAPTER 5, CONSEQUENCES OF RESPONDING: REINFORCEMENT + 65 Figure 5-6 Adatonal features of cumulative reads. A and amin To Minutes 200 Responses 1m daplaccionts cuperonpose 2 racoud of pe ther evens, such 25 fod delverie, on cumulative responses (as at a through e), In, pen resets make easy to Compare successive seoments ofthe record (Ft 10 nD, sustained pen displacements distnguish respond fag during a steulus (tn) and nonresponting in ts absence (at k,n, 0). A slope of 45" represents about 40 Fesponsesimin (the scale difers from that in Figure 55) ‘compare (eg, the segment ending at ¢ contains ‘more responses than the one ending at ). Record D shows how sustained pen displacements can istinguish different conditions Here responding ‘occasionally produced food only in the presence ff a tone; the pen stayed in its normal position ating the tone, in sepments j, /and m, bot was clsplaced downoad in its absence, in segments A, rand 0. 
With this treatment of free-operant behavior and cumulative records, we've explored part of the technology of the science of behavior. Before we can effectively consider the findings made available through such analyses, we must attend to some properties of the language of behavior.

The Vocabulary of Reinforcement

Lever pressing becomes more probable when a water-deprived rat's lever presses produce water than when they don't. Key pecking becomes more probable when a food-deprived pigeon's key pecks produce food than when they don't. And perhaps a child's cries become more probable when they produce a parent's attention than when they don't. These cases illustrate reinforcement: Responding increases when it produces reinforcers. The principle is simple, but as it evolved from Thorndike's initial versions of the Law of Effect to its contemporary status it carried problems of language and logic with it. Reinforcement is not a theory. It is something that happens in behavior, and we must learn to spot it when it happens and not to confuse it with other things that happen that might seem superficially related to it.

Table 5-1 summarizes some properties of the contemporary vocabulary of reinforcement. This vocabulary includes the term reinforcer as a stimulus and the terms reinforce and reinforcement as either operation or outcome. For example, when a rat's lever presses produce food pellets and lever pressing increases, we say that the pellets are reinforcers and that the lever presses are reinforced with pellets. The response that increases must be the one that produces the consequences. For example, if a rat's lever press produces shock and only the rat's jumping increases, it would be inappropriate to speak of either pressing or jumping as reinforced.

A reinforcer is a type of stimulus, but reinforcement is neither stimulus nor response. As a procedure or operation, reinforcement is presenting a reinforcer when a response occurs; it is carried out on responses, and so we speak of reinforcing responses rather than organisms. We say that food reinforced a rat's lever press or that a pigeon's key peck was reinforced with water, but not that food reinforced the rat or that the pigeon was reinforced for pecking or that a child was reinforced. The main reason for this restriction is illustrated in the last examples: When we speak of reinforcing organisms, it is too easy to omit the response or the reinforcer or both. The restriction forces us to be explicit about what is reinforced by what. Nor must we omit the organism; we can always say whose response it was (e.g., the child's crying).

TABLE 5-1  The Vocabulary of Reinforcement

This vocabulary is appropriate if and only if three conditions exist: (1) a response produces consequences; (2) the response occurs more often than when it doesn't produce those consequences; and (3) the increased responding occurs because the response has those consequences.

reinforcer (noun): a stimulus. Example: Food pellets were used as reinforcers for the rat's lever presses.
reinforcing (adjective): a property of a stimulus. Example: The reinforcing stimulus was produced more often than the other, nonreinforcing, stimulus.
reinforcement (noun): as an operation, the delivery of consequences when a response occurs; as a process, the increase in responding that results from the reinforcement operation. Example: The fixed-ratio schedule of reinforcement ...
to reinforce (verb): as an operation, to deliver consequences when a response occurs (responses are reinforced, not organisms); as a process, to increase responding through the reinforcement operation.
Figure 5-7. Hypothetical cumulative records of extinction of a rat's lever presses after food reinforcement. Either A or B might be said to demonstrate greater resistance to extinction, depending on whether it is measured by the time taken until 2 min pass without a response or by total responses during the session.

Response rate decreases over time in both records (negative acceleration), but depending on the extinction criterion either could represent more resistance to extinction. If the criterion is the time until 2 min go by without a response, then A shows more resistance to extinction than B; A doesn't include 2 min without a response, but such a period begins halfway through B. If instead the criterion is total responses emitted, then resistance to extinction is greater for B than for A. Because its definition permitted such ambiguities, resistance to extinction lost favor as a measure of the effects of extinction.

But resistance to change, of which resistance to extinction is a special case, remains a significant property of behavior (Nevin, 1992). Two responses might each be maintained at similar rates, but one might be persistent in the face of changes in other events, such as the introduction of reinforcement arranged for different responses, whereas the other might be fragile. For example, arithmetic facts, spelling and other academic skills are said to be fluent when they have been practiced and reinforced to the point where they occur with high accuracy and short latency (Johnson & Layng, 1992). Once such skills become fluent, they also become less likely to be disrupted by changes in settings or other distractions. Thus, fluent skills are far more resistant to change than those that have not been learned to a fluency criterion.

As a rough approximation, we can say that the total accumulation of responding over its reinforcement history determines its resistance to change, whereas its rate is determined by its context, including such current variables as the reinforcement of other response classes (cf. Nevin, McLean, & Grace, 2001b). Whatever its starting point, extinguished responding tends to decrease by a constant proportion over equivalent times, roughly according to the equation

    log R(t) = log R(0) - kt,

where R is the response rate, t is the time in extinction, and k is a dimensional constant (e.g., Catania, 2005; Catania & Keller, 1981; Katz & Catania, 2005; Nevin & Grace, 2005). We will see later, however, that the course of extinction can be influenced by schedules of reinforcement (Chapter 15).
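As a quick check on what "a constant proportion over equivalent times" means, the sketch below evaluates the equation numerically (the starting rate and the constant k are invented for illustration): the ratio of rates separated by a fixed interval stays the same wherever in extinction you look.

```python
R0 = 60.0        # invented starting rate, responses/min
k = 0.02         # invented decay constant, per min

def rate(t):
    """Response rate after t minutes of extinction: log R(t) = log R(0) - k*t."""
    return R0 * 10 ** (-k * t)

for t in (0, 10, 20, 30, 40):
    print(f"t = {t:2d} min  R = {rate(t):6.2f}  R(t+10)/R(t) = {rate(t + 10) / rate(t):.3f}")
```

Each 10-minute step multiplies the rate by the same factor (here about 0.63), which is just another way of stating the log-linear form given above.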
Extinction Versus Inhibition

If extinction didn't occur, the effects of reinforcement would be permanent. Any responding engendered by reinforcement would last through a lifetime. Clearly that cannot be generally so. For example, if you wear a watch you probably often turn your wrist to look at it; the consequence of looking is seeing the time. But if you stop wearing the watch for some reason, you'll eventually stop looking; seeing a bare wrist isn't an effective reinforcer.

The history of the concept of extinction, however, wasn't so simple. It was long assumed that extinction actively suppressed responding. Extinction was said to have inhibitory effects, in contrast to the assumed excitatory effects of reinforcement. This treatment went back to a language that had been applied to data from Pavlov's conditioning experiments (cf. Chapter 17). The effectiveness of Skinner's operant language depended in part on escaping from the implications of those earlier usages (Skinner, 1938, pp. 96-102). Once those usages had carried over to the language of consequences, they were kept because they seemed consistent with other effects that often accompanied extinction. Thus, texts on learning tended to devote separate chapters to reinforcement and extinction rather than treating them as two aspects of a single phenomenon.

Consider spontaneous recovery. In a typical extinction session, responding decreases as the session continues. But the rate at the beginning of the next extinction session is usually higher than it was at the end of the last one. Some hypothetical cumulative records illustrating spontaneous recovery are shown in Figure 5-8. Responding at the start of a session was said to have recovered spontaneously from inhibition built up by the end of the last session; it was assumed that this inhibition, actively suppressing responding, increased within sessions but dissipated between sessions.

Figure 5-8. Spontaneous recovery in hypothetical cumulative records of a rat's lever presses in sessions of extinction after food reinforcement. The response rate at the start of session 2 is higher than it was at the end of session 1; similarly, the rate at the start of session 3 is higher than it was at the end of session 2.

Phenomena such as spontaneous recovery were taken to mean that responding that had been reduced by extinction was somehow "there all the time but inhibited" (R. L. Reid, 1958). Various accounts of extinction were formulated in terms of inferred processes such as inhibition, frustration, interference or fatigue (Kimble, 1961), but when a response was said to be inhibited in extinction, there was no way to measure what was doing the inhibiting. Later we'll consider a different variety of inhibition in which both what is inhibited and what does the inhibiting are clearly specified.

It wasn't necessary to assume that extinction required active suppression. For example, the effects of presession conditions such as handling may make the start of a session different from later times. If so, effects of extinction late in one session might not transfer to the start of the next session. On these grounds, Kendall (1965) reasoned that the usual pattern of response rates in extinction sessions could be reversed under the right conditions. The key pecks of three pigeons were first reinforced in 1-hr sessions. Repeated 1-min sessions of extinction followed. The first long extinction session came only after responding had reliably decreased to zero in the brief sessions. Within a few minutes of that long session, each pigeon began to respond. Until this session, responding had never been extinguished at times later than 1 min into a session; responding occurred at these later times when the opportunity was finally available. In a sense, Kendall had demonstrated spontaneous recovery within a session rather than at its start.

Another example of recovery of extinguished responding has been called regression or resurgence (Epstein & Skinner, 1980; Keller & Schoenfeld, 1950, pp. 81-82). Suppose a rat's chain pulls are extinguished and then its lever presses are reinforced. If the lever presses are later extinguished, the previously extinguished chain pulls are likely to reappear. By analogy to clinical terminology, the phenomenon suggests regression from current behavior (lever presses) to older behavior that was once effective (chain pulls).
It is of considerable practical significance. For example, in training show dogs it is sometimes necessary during a show to withhold reinforcers that had been available during training, but this must be done carefully, to prevent regression to behavior that had been common earlier but would now spoil the dog's performance.

Response-Reinforcer Contingencies and Reinforcer Deliveries

Contingencies expressed in terms of probability relations between responses and their consequences can be summarized in graphic form, much like those between stimuli and the responses they elicit (Figure 4-2). The coordinate system is illustrated in Figure 5-9. The y-axis shows the probability of a stimulus given a response, p(S/R); the x-axis shows its probability given no response, p(S/no R). Relative to Figure 4-2, the S and R terms have been reversed. The earlier figure showed effects of stimuli on responses; this one shows effects of responses on stimuli.

At A, the probability of the stimulus is high given a response but is otherwise low, as when a rat's lever presses produce food. At B, stimulus probability is independent of responses, as when food is delivered without regard to lever presses. At C, stimulus probability is zero whether or not a response has occurred, as when food is discontinued in extinction. We'll later consider other kinds of contingencies in other contexts. For example, cases in which responses reduce stimulus probability, as at D, illustrate avoidance (Chapter 8), and those in which responses produce a stimulus with a probability of less than 1.0, as at E, illustrate reinforcement schedules (Chapter 15).

Figure 5-9. Response-stimulus contingencies represented in terms of stimulus probability given a response, p(S/R), and stimulus probability given no response, p(S/no R). The graph includes reliable production of stimuli by responses (A), response-independent stimuli (B), extinction (C), prevention of stimuli by responses, as in avoidance (D; see Chapter 8), and intermittent production of stimuli by responses, as in reinforcement schedules (E; see Chapter 15). Cf. Figure 4-2.

Discontinuing reinforcement has not one but two effects: (1) the contingency between responses and reinforcers ends, and (2) reinforcers are no longer delivered. In this context, the term contingency simply describes the consequences of responding; here it is the effect of a response on stimulus probability. For example, in a contingency in which a rat receives food if and only if it presses a lever, a lever press raises the probability of food from zero to 1.0, but in a contingency in which lever presses do nothing, the probability of food is independent of lever presses. (Strictly, a response-stimulus contingency is virtually always part of a three-term contingency, but we needn't address that issue here; cf. Chapter 11.)
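The coordinate system of Figure 5-9 amounts to describing each procedure by a pair of probabilities. The sketch below (Python; the labels and probability values are illustrative, chosen to match the verbal descriptions above rather than read from the figure) classifies a few such pairs.

```python
# Each procedure is a pair: (p(S | response), p(S | no response)).
procedures = {
    "A: presses produce food":       (1.0, 0.0),
    "B: free (noncontingent) food":  (0.5, 0.5),
    "C: extinction":                 (0.0, 0.0),
    "D: avoidance":                  (0.0, 0.5),
    "E: intermittent reinforcement": (0.3, 0.0),
}

def describe(p_given_r, p_given_no_r):
    if p_given_r > p_given_no_r:
        return "responding raises stimulus probability (a positive contingency)"
    if p_given_r < p_given_no_r:
        return "responding lowers stimulus probability (a negative contingency)"
    return "stimulus probability is independent of responding (no contingency)"

for name, (p_r, p_no_r) in procedures.items():
    print(f"{name:31s} -> {describe(p_r, p_no_r)}")
```

On this description, extinction and free food both involve no response-stimulus contingency; they differ only in whether the stimulus is ever delivered, which is exactly the distinction the next section turns on.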
Side Effects of Extinction

If the standard extinction procedure terminates both a contingency and reinforcer deliveries, how much of what happens depends on the end of the contingency and how much on the end of food deliveries? In extinction, the rat's responses no longer produce anything, but the rat is also no longer eating. Taking away reinforcers will affect more than just the reinforced response. If food is suddenly taken away from a food-deprived rat that has been eating, for example, the rat becomes more active and perhaps urinates or defecates. If food was produced by lever presses, the rat might bite the lever (Mowrer & Jones, 1943). If other organisms are in the chamber, the rat might attack them (Azrin, Hutchinson, & Hake, 1966). And the opportunity to engage in such aggressive responses may reinforce other responses. For example, an organism might pull a chain if chain pulls produce something it can sink its teeth into (Azrin, Hutchinson, & McLaughlin, 1965). These effects, though observed in extinction, aren't results of terminating the reinforcement contingency. They occur when response-independent or free food deliveries stop, as well as during extinction. We call them side effects because they are indirect products of the change in contingencies. Whether after free food or food produced by responses, a rat that has been eating can no longer do so. In extinction, these side effects get superimposed on decreases in the previously reinforced responding, because the termination of reinforcers is necessarily a part of extinction.

These side effects had been thought to show that extinction was more than evidence for the temporary effects of reinforcement. Yet many, such as the aggressive responding generated by terminating reinforcer deliveries, could have been observed in situations that removed reinforcers but didn't involve response consequences. These observations have practical implications for the application of reinforcement and extinction procedures. For example, those who work with children sometimes use free or response-independent reinforcers rather than extinction to avoid the side effects of terminating reinforcer deliveries (e.g., Hart et al., 1968, on the social reinforcement of a child's cooperative play).

Extinction was long regarded as the most appropriate way to get rid of problematic behavior, but as the side effects of extinction were
