Operant Conditioning Theory

Operant Conditioning Theory by B.F. Skinner
• Prof. B.F. Skinner (b. 1904) began his research on behavior
while he was a graduate student in the Department of
Psychology at Harvard University. In 1931 he wrote his
thesis, “The Concept of the Reflex in the Description of
Behavior”. Skinner was an experimental psychologist who
conducted many experiments on rats and pigeons.
• His important publications include ‘The Behavior of
Organisms’ (1938), ‘Science and Human Behavior’ (1953),
‘Verbal Behavior’ (1957), ‘Cumulative Record’ (1959),
‘Beyond Freedom and Dignity’ (1971) and ‘About
Behaviorism’ (1974).
Introduction to Operant Conditioning
• According to Skinner, there are two types of behavior:
respondent behavior and operant behavior.
• You blink your eye in response to a flash of light. This
reflexive behavior is elicited directly by the environment, so
it is respondent behavior: an automatic response to a stimulus.
• Most of our behaviors, however, are not generated so simply by
the environment. You are not forced by the environment to look
at a book, to talk, to sing, or to eat. These behaviors are
emitted by you, the individual. Through such behaviors, you
operate upon the environment. These are called operant
behaviors.
The Operant Experiment
• Skinner designed a box, now known as the ‘Skinner box’, and
placed a hungry rat inside.
• The box contained a lever which, when pressed, triggered a
mechanism that delivered a pellet of food to the rat.
• Initially, the rat engaged in a number of random behaviors
such as walking, sniffing and scratching. None of these helped
it get the food.
• At some point, the rat accidentally hit the lever and the
food was delivered. For the semi-starved rat, this was a big
reward.
• Skinner observed that after a few accidental presses of the
lever, the rat started spending more time near the lever, and
then deliberately pressed the lever whenever it was hungry.
• Pressing the lever had thus become a new operant for the rat.
Skinner further noted that if pressing the lever no longer
delivered food, the operant behavior decreased and gradually
stopped altogether.
• This is known as experimental extinction of operant
conditioning.
• For experiments with pigeons, Skinner used another
apparatus, called the “Pigeon Box”.
• In this experiment, a pigeon had to peck at a lighted
plastic key mounted on the wall at head height and was then
rewarded with grain.
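The acquisition and extinction pattern in the lever-pressing experiment can be illustrated with a toy simulation. This is only a sketch, not Skinner's procedure: it assumes a simple linear update rule in which the probability of a lever press moves toward 1.0 after each reinforced press and back toward a low baseline after each unreinforced press. The function name `simulate_lever_pressing`, the learning rate and the baseline value are hypothetical choices made for illustration.

```python
import random


def simulate_lever_pressing(n_acquisition=200, n_extinction=200,
                            baseline=0.05, learning_rate=0.1, seed=0):
    """Return the press probability after each time step.

    Assumption (not from the source): a linear update rule. Food is
    delivered for presses during the first n_acquisition steps and
    withheld during the following n_extinction steps.
    """
    rng = random.Random(seed)
    p = baseline                      # chance of an accidental press
    history = []
    for step in range(n_acquisition + n_extinction):
        food_available = step < n_acquisition
        pressed = rng.random() < p
        if pressed and food_available:
            # Reinforced press: the operant becomes more likely.
            p += learning_rate * (1.0 - p)
        elif pressed:
            # Unreinforced press: the operant weakens toward baseline.
            p += learning_rate * (baseline - p)
        history.append(p)
    return history


history = simulate_lever_pressing()
print(f"press probability at start:          {history[0]:.2f}")
print(f"press probability after acquisition: {history[199]:.2f}")
print(f"press probability after extinction:  {history[-1]:.2f}")
```

In this model the press probability can only rise while food is delivered and can only fall once it is withheld, which mirrors the rise and gradual disappearance of lever pressing described in the experiment.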
Measuring Operant Behavior
• Quantification of operant behavior was crucial to Skinner’s
work. He needed to demonstrate that through appropriate use
of reward and punishment you can actually increase the
probability of occurrence of a conditioned operant behavior.
• Skinner therefore introduced the rate of occurrence of the
target behavior as the measure of operant conditioning.
• He simply counted how many times the learned behavior
occurred within a given time. In fact, he used the cumulative
frequency of the operant behavior as the final indicator.
• Plotted as a graph, the cumulative frequency readily shows
whether the probability of that behavior has actually
increased over time.
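The two measures described here, response rate and cumulative frequency, can be computed from a list of response times. The sketch below is illustrative only: the function names `response_rate` and `cumulative_record`, the bin width and the sample press times are hypothetical and are not Skinner's data.

```python
from bisect import bisect_right


def response_rate(response_times, session_length):
    """Responses per unit of time over the whole session."""
    return len(response_times) / session_length


def cumulative_record(response_times, session_length, bin_width):
    """Cumulative number of responses at the end of each time bin."""
    times = sorted(response_times)
    n_bins = int(session_length // bin_width)
    # bisect_right counts how many responses occurred at or before t.
    return [bisect_right(times, i * bin_width) for i in range(1, n_bins + 1)]


# Hypothetical lever-press times (seconds) in a 60-second session.
presses = [12, 25, 33, 38, 42, 45, 48, 50, 52, 54, 56, 58]
print(response_rate(presses, 60))          # 0.2 presses per second
print(cumulative_record(presses, 60, 10))  # [0, 1, 2, 4, 8, 12]
```

A steeper rise between successive entries of the cumulative record means a higher response rate in that interval, which is what the graph of cumulative frequency makes visible.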
Operations in Operant Conditioning
1. Shaping
2. Extinction
3. Spontaneous Recovery
4. Concept of Reinforcement
1. Shaping
• Shaping is an extremely important concept in operant
conditioning.
• Shaping means modifying the organism’s behavior toward the
experimenter’s desired end.
• It takes place only through ‘successive approximations’.
• Suppose you are trying to modify a child’s behavior by
selectively rewarding the response you desire. Before the
ultimate desired behavior is enacted, the child usually
engages in numerous other behaviors that may be considered
steps toward the final behavior. They are close to the target,
but not the target per se. If these approximate behaviors are
rewarded, shaping is facilitated.
• Skinner discovered this principle of successive
approximation rather accidentally.
• He was conditioning a pigeon to swipe a ball with its beak,
a movement which in turn would operate a food magazine.
• The pigeon did not hit on the response by chance. After
waiting a long time for an accidental success, Skinner grew
bored. So, just casually, he decided to reward any behavior
that might lead toward the target behavior.
• As these approximate behaviors were successively rewarded,
to Skinner’s surprise, the whole process was quickened.
• Very soon ‘the ball was caroming off the walls of the box as
if the pigeon had been a champion squash player’ (Skinner,
1938, p. 38). Rewarding each simpler step led automatically
to the next higher step, and so on. This is successive
approximation.
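The contrast between waiting for an accidental success and rewarding successive approximations can be sketched with a toy model. Everything in it is an assumption made for illustration rather than a description of Skinner's procedure: a response is represented as a number, the animal's responses scatter around its current typical response, and a rewarded response becomes the new typical response. The function name `shape_behavior` and all parameter values are hypothetical.

```python
import random


def shape_behavior(target=10.0, tolerance=0.5, max_trials=5000,
                   reward_approximations=True, seed=1):
    """Return the trial on which the target response first occurs.

    Returns None if the target is not reached within max_trials.
    Toy assumptions: responses are drawn around the current typical
    response, and a rewarded response becomes the new typical one.
    """
    rng = random.Random(seed)
    typical = 0.0                       # starting behavior, far from target
    criterion = abs(target - typical)   # distance that still earns a reward
    for trial in range(1, max_trials + 1):
        response = rng.gauss(typical, 1.0)
        distance = abs(target - response)
        if distance <= tolerance:
            return trial                # target behavior performed
        if reward_approximations and distance < criterion:
            # Reward the closer approximation, then raise the bar.
            typical = response
            criterion = distance
    return None


print("with shaping:   ", shape_behavior())
print("without shaping:", shape_behavior(reward_approximations=False))
```

Without rewards for approximations the typical response never moves, so the target is effectively never reached by accident; with them, each rewarded step brings the next step within reach.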
Principles involved in Shaping
1. Generalization
2. Habit Competition
3. Chaining
1. Generalization
Using the experience and knowledge of one situation in another
situation. It has two forms:
a) Response generalization refers to the spreading of the
effects of a behavior-strengthening contingency to other
responses that are similar to the target response that
resulted in the behavior-strengthening consequence.
b) Stimulus generalization is the ability to behave in a new
situation in a way that has been learned in other similar
situations.
2. Habit Competition
• At each point of the chain, the correct habit must attain
dominance over competing habits. This is accomplished by
reinforcing the correct habit alone.
3. Chaining
• Cues produced by one response must be linked with the
succeeding response.
• Example: if we want to train a pigeon to move in a circle,
then we must reinforce every correct turn.
2. Extinction
• In the operant conditioning paradigm, extinction refers to
the process of no longer providing the reinforcement that has
been maintaining a behavior.
• Operant extinction differs from forgetting in that the
latter refers to a decrease in the strength of a behavior
over time when it has not been emitted.
3. Spontaneous Recovery
• Spontaneous recovery is a phenomenon in which a behavior
that was thought to be extinguished suddenly reappears.
• This can apply to responses that have been formed through
both classical and operant conditioning.
4. The Concept of Reinforcement
• Reinforcement is defined as a consequence that follows an
operant response and increases (or is intended to increase)
the likelihood of that response occurring in the future.
• In operant conditioning, the concepts of the conditioned
stimulus (CS) and unconditioned stimulus (UCS) are not
applicable, as we are concerned with the shaping of a target
behavior.
• Here reinforcement comes separately, as a consequence of the
desirable behavior. It simply serves to strengthen the
response. The food pellet emerges only if the lever is
pressed, and not otherwise. So it is contingent upon the
operant behavior and strengthens it.
Types of Reinforcement Schedule
1. Continuous
2. Partial
a) Fixed ratio schedule
b) Variable ratio schedule
c) Fixed interval schedule
d) Variable interval schedule
1. Continuous Reinforcement
• In continuous reinforcement, the desired behavior is
reinforced every single time it occurs.
• This schedule is best used during the initial stages of
learning in order to create a strong association between the
behavior and its reinforcement.
• Once the response is firmly established, reinforcement is
usually switched to a partial reinforcement schedule.
2. Partial Reinforcement
• In partial reinforcement, the response is reinforced only
part of the time.
• Learned behaviors are acquired more slowly with partial
reinforcement, but the response is more resistant to
extinction.
• A) Fixed-ratio schedules are those where a response is
reinforced only after a specified number of responses. This
schedule produces a high, steady rate of responding with only
a brief pause after the delivery of the reinforcer. An example
of a fixed-ratio schedule would be delivering a food pellet to
a rat after it presses a bar five times.
• B) Variable-ratio schedules occur when a response is
reinforced after an unpredictable number of responses. This
schedule creates a high, steady rate of responding. Gambling
and lottery games are good examples of a reward based on a
variable-ratio schedule. In a lab setting, this might involve
delivering a food pellet to a rat after one bar press, again
after four bar presses, and a third pellet after two bar
presses.
• C) Fixed-interval schedules are those where the first
response is rewarded only after a specified amount of time has
elapsed. This schedule causes high amounts of responding near
the end of the interval but much slower responding immediately
after the delivery of the reinforcer.
• D) Variable-interval schedules occur when a response is
rewarded after an unpredictable amount of time has passed.
This schedule produces a slow, steady rate of response. An
example would be delivering a food pellet to a rat after the
first bar press following a one-minute interval, another
pellet for the first response following a five-minute
interval, and a third pellet for the first response following
a three-minute interval.
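The four partial schedules differ only in the rule that decides whether a given response earns a reinforcer, so each can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not a standard laboratory implementation: the function names are hypothetical, the variable schedules draw their requirements from a uniform distribution, and each interval is timed from the last reinforced response.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0

    def respond(_time):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses.

    Assumption: the requirement is uniform on 1..2*mean_n-1,
    which averages mean_n responses per reinforcer.
    """
    count = 0
    required = rng.randint(1, 2 * mean_n - 1)

    def respond(_time):
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)
            return True
        return False
    return respond


def fixed_interval(interval):
    """Reinforce the first response after `interval` has elapsed."""
    available_at = interval

    def respond(time):
        nonlocal available_at
        if time >= available_at:
            available_at = time + interval
            return True
        return False
    return respond


def variable_interval(mean_interval, rng):
    """Reinforce the first response after an unpredictable interval.

    Assumption: intervals are uniform on 0..2*mean_interval.
    """
    available_at = rng.uniform(0, 2 * mean_interval)

    def respond(time):
        nonlocal available_at
        if time >= available_at:
            available_at = time + rng.uniform(0, 2 * mean_interval)
            return True
        return False
    return respond


# One bar press per second for 20 seconds, five presses per pellet.
fr5 = fixed_ratio(5)
print([t for t in range(1, 21) if fr5(t)])   # [5, 10, 15, 20]

# The same presses under a fixed interval of 6 seconds.
fi6 = fixed_interval(6)
print([t for t in range(1, 21) if fi6(t)])   # [6, 12, 18]
```

Under the ratio schedules, pressing faster earns pellets sooner; under the interval schedules, extra presses before the interval ends earn nothing, which fits the slower responding described for interval schedules.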
Educational Implications
• Identifies the root cause of a behavior.
• Eliminates negative behavior: operant conditioning theory
involves the use of negative reinforcement, which strengthens
a behavior by removing an unpleasant stimulus.
• By building operant conditioning techniques into lesson
plans, it is possible to teach children useful skills as
well as good behaviors.
• The use of reinforcement in the form of rewards motivates
children to keep learning and perform better.
