PERCEPTION

 The process by which we recognise, interpret or give meaning to the information provided by
sense organs is called perception.
 In interpreting stimuli or events, individuals often construct them in their own ways.
 Thus, perception is not merely an interpretation of objects or events of the external or internal
world as they exist; rather, it is also a construction of those objects and events from one's own
point of view.

Processing Approaches in perception

 Psychologists studying perception distinguish between bottom-up and top-down processes.


 The term bottom-up (or data-driven) essentially means that the perceiver starts with small bits
of information from the environment and combines them in various ways to form a percept.
 A bottom-up model of perception and pattern recognition might describe your seeing edges,
rectangular and other shapes, and certain lighted regions and putting this information together
to “conclude” you are seeing the scene outside your window. That is, you would form a
perception from only the information in the distal stimulus.
 In top-down (also called theory-driven or conceptually driven) processing, the perceiver’s
expectations, theories, or concepts guide the selection and combination of the information in
the pattern-recognition process.
 For example, a “top-down” description of the scene-outside-your-window example might go
something like this: You knew you were in your dorm room and knew from past experience
approximately how close to the window the various trees, shrubs, and other objects were. When
you looked in that direction, you expected to see trees, shrubs, walkways with people on them,
a street with cars going by, and so on. These expectations guided where you looked, what you
looked at, and how you put the information together.

THE BOTTOM–UP THEORIES OF PERCEPTION

 The characteristic feature of bottom–up theories of perception is the fact that the content and
quality of sensory input play a determinative role in influencing the final percept.
 Sensory input, in their view, represents the cornerstone of cognition and by its own nature it
determines further sensory data processing.
This is therefore called data-driven processing, because perception originates with the stimulation of the sensory receptors.

Gibson’s theory of direct perception

Psychologist James Gibson believed that our cognitive apparatus was created and shaped by the long evolutionary influence of the external environment, which is apparent in its structure and abilities.
 Gibson opposed the top-down model and argued that perception is direct.
 He stated that sensation is perception and there is no need for extra interpretation, as there is
enough information in our environment to make sense of the world in a direct way.
 His theory is sometimes known as the "ecological theory" because of the claim that perception can
be explained solely in terms of the environment.
An example of bottom-up processing is a flower presented at the center of a person's visual field: the sight of the flower and all the information about the stimulus are carried from the retina to the visual cortex in the brain, with the signal travelling in one direction.
Gibson claimed that perception is, in an important sense, direct; his work during World War II on problems of pilot selection and testing led him to this view.
The leading proponent, James Gibson (1966, 1979), and his followers at Cornell University stated:

–“Direct perception assumes that the richness of the optic array just matches the richness of the
world”.

 In his early work on aviation he discovered what he called 'optic flow patterns'. When pilots
approach a landing strip the point towards which the pilot is moving appears motionless, with
the rest of the visual environment apparently moving away from that point.
 According to Gibson such optic flow patterns can provide pilots with unambiguous information
about their direction, speed and altitude.
 Three important components of Gibson's Theory are 1. Optic Flow Patterns; 2. Invariant
Features; and 3. Affordances.

1. Light and the Environment - Optic Flow Patterns

 Optic Array is the pattern of light reaching the eye, which is thought to contain all the visual
information available on the retina.
 Changes in the flow of the optic array contain important information about what type of
movement is taking place. For example:
Any flow in the optic array means that the perceiver is moving; if there is no flow, the perceiver is static.
 The flow of the optic array will either be coming from a particular point or moving towards one.
The centre of that movement indicates the direction in which the perceiver is moving.
 If a flow seems to be coming out from a particular point, this means the perceiver is moving
towards that point; but if the flow seems to be moving towards that point, then the perceiver is
moving away.
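The logic of these flow cues can be captured in a small simulation. The sketch below is a toy illustration only: it assumes a set of 2D image points with motion vectors and a candidate focus of expansion, and the function name and threshold are invented for the example.

```python
import numpy as np

def classify_optic_flow(positions, flow_vectors, focus):
    """Toy classifier for a radial optic-flow field.

    positions:    (N, 2) array of image points
    flow_vectors: (N, 2) array of motion vectors at those points
    focus:        (2,) candidate focus of expansion/contraction
    Returns 'approaching', 'receding', or 'static'.
    """
    speeds = np.linalg.norm(flow_vectors, axis=1)
    if np.all(speeds < 1e-6):          # no flow anywhere -> perceiver is static
        return "static"

    radial = positions - focus                      # direction away from the focus
    radial /= np.linalg.norm(radial, axis=1, keepdims=True)
    outward = np.sum(radial * flow_vectors, axis=1) # positive if flow points away from focus

    # Flow streaming outward from the focus signals movement towards that point;
    # flow converging on it signals movement away.
    return "approaching" if outward.mean() > 0 else "receding"

# Example: points drifting directly away from the image centre (expansion).
pts = np.array([[1.0, 0.0], [0.0, 2.0], [-1.5, -1.5]])
vecs = 0.1 * pts                                    # purely radial outward flow
print(classify_optic_flow(pts, vecs, focus=np.zeros(2)))  # -> approaching
```

If the field instead converged on the focus, the mean outward component would be negative and the same function would report that the perceiver is moving away from that point.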

2. The role of Invariants in perception

 We rarely see a static view of an object or scene. When we move our head and eyes or walk
around our environment, things move in and out of our viewing fields.
 Textures expand as you approach an object and contract as you move away. There is a pattern
or structure available in such texture gradients which provides a source of information about the
environment.
 This flow of texture is INVARIANT, ie it always occurs in the same way as we move around our
environment and, according to Gibson, is an important direct cue to depth. Two good examples
of invariants are texture and linear perspective.

3. Affordances

Affordances are, in short, cues in the environment that aid perception by signalling the possibilities for action that the environment offers. Important cues in the environment include:

 OPTICAL ARRAY The patterns of light that reach the eye from the environment.
 RELATIVE BRIGHTNESS Objects with brighter, clearer images are perceived as closer.
 TEXTURE GRADIENT The grain of texture gets smaller as the object recedes. Gives the impression
of surfaces receding into the distance.
 RELATIVE SIZE When an object moves further away from the eye the image gets smaller. Objects
with smaller images are seen as more distant.
 SUPERIMPOSITION If the image of one object blocks the image of another, the first object is seen
as closer.
 HEIGHT IN THE VISUAL FIELD Objects further away are generally higher in the visual field.

EVALUATION OF GIBSON'S DIRECT APPROACH TO PERCEPTION

Visual Illusions

 Gibson's emphasis on DIRECT perception provides an explanation for the (generally) fast and
accurate perception of the environment.
However, his theory cannot explain why perceptions are sometimes inaccurate, e.g. in illusions. He claimed the illusions used in experimental work constituted extremely artificial perceptual situations unlikely to be encountered in the real world; however, this dismissal cannot realistically be applied to all illusions.
 For example, Gibson's theory cannot account for perceptual errors like the general tendency for
people to overestimate vertical extents relative to horizontal ones.
 Neither can Gibson's theory explain naturally occurring illusions. For example if you stare for
some time at a waterfall and then transfer your gaze to a stationary object, the object appears
to move in the opposite direction.

TEMPLATE MATCHING

 Template matching theory describes the most basic approach to human pattern recognition. It is
a theory that assumes every perceived object is stored as a "template" into long-term memory.
 Incoming information is compared to these templates to find an exact match. In other words, all
sensory input is compared to multiple representations of an object to form one single
conceptual understanding.
 The theory defines perception as a fundamentally recognition-based process. It assumes that
everything we see, we understand only through past exposure, which then informs our future
perception of the external world.
For example, many different printed and handwritten forms of the letter A are all recognized as A, but not as B. This viewpoint is limited, however, in explaining how new experiences can be understood without being compared to an internal memory template.
As the simplest theoretical account of pattern recognition, the Theory of Template holds that people store miniature copies of external patterns, formed through past experience, in long-term memory. These copies, called templates, correspond one-to-one with external stimulus patterns.
When a stimulus acts on the sense organs, the stimulus information is first encoded, then compared and matched against the patterns stored in the brain, and identified as the stored pattern that matches it best.
This produces the pattern-recognition effect; without a matching template, the stimulus cannot be distinguished and recognized. Because every template is linked to particular meanings and other information, the recognized pattern can then be interpreted and processed further.
Examples of template matching can also be found in daily life: by comparing input with stored templates, machines can rapidly recognize the seals printed on paycheques.
Although it can explain some human pattern recognition, the Theory of Template has some obvious limitations.
According to the Theory of Template, people have to store an appropriate template before they can recognize a pattern.
Even if a pre-processing stage is added, these templates are still numerous, which not only places a heavy burden on memory but also makes pattern recognition less flexible and more rigid.
The Theory of Template does not entirely explain the process of human pattern recognition, but the template and template matching cannot be entirely dismissed.
As one aspect or link in the process of human pattern recognition, the template still plays a part.
Mechanisms similar to template matching also appear in some other models of pattern recognition.
 It is apparent that the template-matching model won’t work, because a huge number of
different templates would be needed just to recognize one letter.
 When we multiply this by how many objects there are in the environment, the number becomes
astronomical.
 First, for such a model to provide a complete explanation, we would need to have stored an
impossibly large number of templates.
 Second, as technology develops and our experiences change, we become capable of recognizing
new objects such as DVDs, laptop computers, and smartphones. Template-matching models
thus have to explain how and when templates are created and how we keep track of an ever-
growing number of templates.
 A third problem is that people recognize many patterns as more or less the same thing, even
when the stimulus patterns differ greatly.
 Template matching works only with relatively clean stimuli when we know ahead of time what
templates may be relevant.
 It does not adequately explain how we perceive as effectively as we typically do the “noisy”
patterns and objects—blurred or faint letters, partially blocked objects, sounds against a
background of other sounds—that we encounter every day.
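The core claim of template matching, that recognition means finding a near-exact match between the input and stored whole-pattern copies, can be sketched in a few lines of code. This is a minimal illustration rather than any specific published model; the 3x3 binary letter grids, the overlap score and the 0.9 threshold are all assumptions chosen for simplicity.

```python
import numpy as np

# Hypothetical stored templates: whole-pattern copies as 3x3 binary grids.
TEMPLATES = {
    "T": np.array([[1, 1, 1],
                   [0, 1, 0],
                   [0, 1, 0]]),
    "L": np.array([[1, 0, 0],
                   [1, 0, 0],
                   [1, 1, 1]]),
}

def recognize_by_template(stimulus, min_score=0.9):
    """Return the label of the template that best matches the stimulus,
    or None if no template matches closely enough (the theory's weak spot)."""
    scores = {label: np.mean(template == stimulus)   # proportion of matching cells
              for label, template in TEMPLATES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_score else None

clean_T = np.array([[1, 1, 1],
                    [0, 1, 0],
                    [0, 1, 0]])
noisy_T = clean_T.copy()
noisy_T[2, 2] = 1                      # one stray "ink" cell
print(recognize_by_template(clean_T))  # -> T
print(recognize_by_template(noisy_T))  # -> None: a near-exact match is required
```

The noisy input fails even though a human would read it effortlessly as a T, which is exactly the inflexibility the criticisms above point to.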

The Theory of Prototype

 Another kind of perceptual model, one that attempts to correct some of the shortcomings of
both template-matching and featural analysis models, is known as prototype matching.
 Such models explain perception in terms of matching an input to a stored representation of
information, as do template models.
 In this case, however, the stored representation, instead of being a whole pattern that must be
matched exactly or closely (as in template-matching models), is a prototype, an idealized
representation of some class of objects or events—the letter R, a cup, a VCR, a collie, and so
forth.
The distinguishing feature of the Theory of Prototype, also called the Theory of Prototype Matching, is that memory stores not templates that match outside patterns one-to-one, but prototypes.
A prototype is not an internal copy of a particular pattern but an internal representation of a kind of object: the abstracted characteristics shared by all individuals of a given type or category.
A prototype thus captures the basic features of a type of object. For instance, people know many kinds of airplanes, but a long cylinder with two wings can serve as the prototype of an airplane. According to the Theory of Prototype, in pattern recognition an outside stimulus only needs to be compared with the prototype, and the perception of an object arises from the match between the input information and the prototype.
Once the incoming stimulus information best matches a certain prototype in the brain, it is assigned to that prototype's category and recognized.
To a certain extent, template matching is subsumed within the Theory of Prototype, which is more flexible and elastic. However, this model also has drawbacks: it relies on top-down processing alone and lacks bottom-up processing, which is sometimes more important for prototype matching in human perception.
 Prototype-matching models describe perceptual processes as follows. When a sensory device
registers a new stimulus, the device compares it with previously stored prototypes. An exact
match is not required; in fact, only an approximate match is expected. Prototype-matching
models thus allow for discrepancies between the input and the prototype, giving prototype
models a lot more flexibility than template models. An object is “perceived” when a match is
found.
 Prototype models differ from template and featural analysis models in that they do not require
that an object contain any one specific feature or set of features to be recognized. Instead, the
more features a particular object shares with a prototype, the higher the probability of a match.
Moreover, prototype models take into account not only an object’s features or parts but also the
relationships among them.
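To make the contrast with template matching concrete, here is a minimal sketch under the same toy assumptions as the earlier template example: the stored representation is an averaged prototype, and only an approximate match is required. The exemplars, similarity measure and threshold are hypothetical.

```python
import numpy as np

def build_prototype(examples):
    """A prototype as the average of previously seen category members."""
    return np.mean(examples, axis=0)

def recognize_by_prototype(stimulus, prototypes, min_similarity=0.6):
    """Assign the stimulus to the most similar prototype; an exact match is not required."""
    sims = {label: 1.0 - np.mean(np.abs(proto - stimulus))  # 1.0 = identical grids
            for label, proto in prototypes.items()}
    best = max(sims, key=sims.get)
    return best if sims[best] >= min_similarity else None

# Hypothetical "letter T" exemplars (slightly different 3x3 binary grids).
t_examples = [
    np.array([[1, 1, 1], [0, 1, 0], [0, 1, 0]]),
    np.array([[1, 1, 1], [0, 1, 0], [0, 1, 1]]),
    np.array([[1, 1, 1], [1, 1, 0], [0, 1, 0]]),
]
prototypes = {"T": build_prototype(t_examples)}

noisy_T = np.array([[1, 1, 1], [0, 1, 0], [1, 1, 0]])
print(recognize_by_prototype(noisy_T, prototypes))  # -> T, despite the distortion
```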

FEATURAL ANALYSIS

 Some psychologists believe that the analysis of a whole into its parts underlies the basic
processes used in perception.
 Instead of processing stimuli as whole units, we might instead break them down into their
components, using our recognition of those parts to infer what the whole represents.
 The parts searched for and recognized are called features. Recognition of a whole object, in this
model, thus depends on recognition of its features.
The Theory of Feature is another theory explaining pattern perception and shape perception.
According to this theory, people try to match the features of a pattern with those stored in memory, rather than matching the entire pattern against a template or prototype.
This is currently the most attractive model, and feature analysis has been applied widely in computer pattern recognition. However, it is purely a bottom-up processing model, lacking top-down processing, and therefore it still has some drawbacks.
 Feature detection theory proposes that the nervous system sorts and filters incoming stimuli to
allow the human (or animal) to make sense of the information.
 In the organism, this system is made up of feature detectors, which are individual neurons, or
groups of neurons, that encode specific perceptual features.
 The theory proposes an increasing complexity in the relationship between detectors and the
perceptual feature. The most basic feature detectors respond to simple properties of the stimuli.
 Further along the perceptual pathway, higher organized feature detectors are able to respond to
more complex and specific stimuli properties.
 When features repeat or occur in a meaningful sequence, we are able to identify these patterns
because of our feature detection system.
 One source of evidence for feature matching comes from Hubel and Wiesel's research, which
found that the visual cortex of cats contains neurons that only respond to specific features (e.g.
one type of neuron might fire when a vertical line is presented, another type of neuron might
fire if a horizontal line moving in a particular direction is shown).
PANDEMONIUM THEORY: The theory was developed by the artificial intelligence pioneer Oliver Selfridge in 1959. It describes the process of object recognition as a hierarchical system of detection and association carried out by a metaphorical set of "demons" sending signals to each other. This model is now recognized as an important basis for theories of visual perception in cognitive science.
Pandemonium (Selfridge, 1959) is a data-driven, bottom-up recognition model based on feature analysis: objects are recognised from an analysis of their component features.
Pandemonium is composed of four types of recognition units (demons):

Stage 1 – Image demon: Records the image that is received on the retina.
Stage 2 – Feature demons: There are many feature demons, each representing a specific feature. For example, there is a feature demon for short straight lines, another for curved lines, and so forth. Each feature demon's job is to "yell" if it detects the feature it corresponds to. Note that feature demons are not meant to represent any specific neurons, but rather a group of neurons with similar functions; for example, the vertical-line feature demon represents the neurons that respond to vertical lines in the retinal image.
Stage 3 – Cognitive demons: Watch the "yelling" from the feature demons. Each cognitive demon is responsible for a specific pattern (e.g., a letter of the alphabet), and its "yelling" is based on how much of its pattern was detected by the feature demons: the more corresponding features a cognitive demon finds, the louder it "yells". For example, if the curved, long straight and short angled line feature demons are yelling loudly, the letter R cognitive demon might get very excited and the letter P cognitive demon somewhat excited, but the letter Z cognitive demon is likely to stay quiet.
Stage 4 – Decision demon: Represents the final stage of processing. It listens to the "yelling" produced by the cognitive demons and selects the loudest one; the demon that is selected becomes our conscious perception. Continuing the example, the R cognitive demon would be loudest, seconded by P, so we perceive R; but if we make a mistake because of poor viewing conditions (e.g., letters flashed quickly or partly occluded), it is likely to be P.
Note that the "pandemonium" simply refers to the cumulative "yelling" produced by the system.

Criticism
A major criticism of the pandemonium architecture is that it adopts completely bottom-up processing: recognition is entirely driven by the physical characteristics of the target stimulus. This means that it is unable to account for any top-down processing effects, such as context effects (e.g., pareidolia), where contextual cues facilitate processing (e.g., the word superiority effect: it is easier to identify a letter when it is part of a word than in isolation).
However, this is not a fatal criticism of the overall architecture, because it is relatively easy to add a group of contextual demons to work alongside the cognitive demons to account for these context effects.

Recognition-by-Components Theory

First proposed by Irving Biederman (1987), this theory states that humans recognize objects by breaking them down into their basic 3D geometric shapes, called geons (an abbreviation of "geometric ions"), the basic components, such as cylinders, cubes and cones, that are combined in object recognition.
 An example is how we break down a common item like a coffee cup: we
recognize the hollow cylinder that holds the liquid and a curved handle off the
side that allows us to hold it. Even though not every coffee cup is exactly the
same, these basic components help us to recognize the consistency across
examples (or pattern).
 This also works for more complex objects, which in turn are made up of a
larger number of geons. Perceived geons are then compared with objects in
our stored memory to identify what it is we are looking at.
 RBC suggests that there are fewer than 36 unique geons that when combined
can form a virtually unlimited number of objects.
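The claim that a small alphabet of geons yields a virtually unlimited set of objects is a simple combinatorial point. The numbers below are illustrative assumptions (36 geons and an invented count of 10 spatial relations), not Biederman's exact figures.

```python
# Illustrative arithmetic (assumed numbers, not Biederman's exact figures):
# with 36 geons and, say, 10 qualitatively different spatial relations
# between a pair of geons, the number of distinct two-geon arrangements is
geons, relations = 36, 10
two_geon_objects = geons * geons * relations
print(two_geon_objects)                      # 12,960 two-geon objects
print(two_geon_objects * geons * relations)  # ~4.7 million once a third geon is added
```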

Edges and Concavities


 To parse and dissect an object, RBC proposes we attend to two specific
features: edges and concavities.
 Edges enable the observer to maintain a consistent representation of the
object regardless of the viewing angle and lighting conditions.
 Concavities are where two edges meet and enable the observer to perceive
where one geon ends and another begins.

Analogy between speech and objects

In his proposal of RBC, Biederman makes an analogy to the composition of speech and objects that helps support his theory.
 The idea is that about 44 individual phonemes or "units of sound" are needed
to make up every word in the English language, and only about 55 are
needed to make up every word in all languages.
 Though small differences may exist between these phonemes, there is still a
discrete number that make up all languages.
 A similar system may be used to describe how objects are perceived.
 Biederman suggests that in the same way speech is made up by phonemes,
objects are made up by geons, and as there are a great variance of
phonemes, there is also a great variance of geons.
 It is more easily understood how 36 geons can compose the sum of all
objects, when the sum of all language and human speech is made up of only
55 phonemes.

Viewpoint invariance

One of the most defining factors of the recognition-by-components theory is that it enables us to recognize objects regardless of viewing angle; this is known as viewpoint invariance. It is proposed that the reason for this effect is the invariant edge properties of geons.
 The invariant edge properties are as follows:
a. Curvature (various points of a curve)
b. Parallel lines (two or more points which follow the same direction)
c. Co-termination (the point at which two edges meet and therefore cease to continue)
d. Symmetry and asymmetry
e. Co-linearity (points lying along a common line)
 Our knowledge of these properties means that when viewing an object or
geon, we can perceive it from almost any angle.
 For example, when viewing a brick we will be able to see horizontal sets of
parallel lines and vertical ones, and when considering where these points
meet (co-termination) we are able to perceive the object.
 Two other properties of geons are discriminability and resistance to visual
noise. Discriminability means that each geon can be distinguished from the
others from almost all viewpoints. Resistance to visual noise means we
can still perceive geons under “noisy” conditions such as might occur under
conditions of low light or fog.
 The basic message of recognition-by-components theory is that if
enough information is available to enable us to identify an object’s
basic geons, we will be able to identify the object.

TOP-DOWN

 Richard Gregory introduced the concept of top-down processing in 1970.


 In this approach, perceptions begin with the most general and move toward
the more specific.
These perceptions are heavily influenced by our expectations and prior knowledge.
Simply put, our brain applies what it already knows to fill in the blanks and anticipate what it expects to perceive.
 Processing information from the top down allows us to make sense of
information that has already been brought in by the senses, working
downward from initial impressions down to practical details.
 Example; if half of a tree is covered, you usually have an idea what it looks
like, even though half is not being shown. This is because you know what
trees look like from prior knowledge.
Top-down processing is a constructive approach: the perceiver builds a cognitive understanding of the stimulus.
He or she uses sensory information to build up the perception.
This viewpoint is also known as intelligent perception because it states that higher-order thinking plays an important role in perception.

Navon approach

 Forty years ago, David Navon tried to tackle a central problem concerning the
course of perceptual processing: “Do we perceive a visual scene feature-by-
feature? Or is the process instantaneous and simultaneous as some Gestalt
psychologists believed? Or is it somewhere in between?”.
 To examine this, Navon developed a now classical paradigm, which involved
the presentation of compound stimuli; a large letter (global level) composed
of smaller letters (local level) in which the global and the local letters could be
the same (consistent) or different (inconsistent).
 Images and other stimuli contain both local features (details, parts) and
global features (the whole).
 Precedence refers to the level of processing (global or local) to which
attention is first directed. Global precedence occurs when an individual
more readily identifies the global feature when presented with a stimulus
containing both global and local features.
 The global aspect of an object embodies the larger, overall image as a whole,
whereas the local aspect consists of the individual features that make up this
larger whole.
 Global processing is the act of processing a visual stimulus holistically.
 Although global precedence is generally more prevalent than local
precedence, local precedence also occurs under certain circumstances and
for certain individuals.
 Global precedence was first studied using the Navon figure, where many
small letters are arranged to form a larger letter that either does or does not
match.
 Variations of the original Navon figure include both shapes and objects.
 Individuals presented with a Navon figure will be given one of two tasks. In
one type of task, participants are told before the presentation of the stimulus
whether to focus on a global or local level, and their accuracy and reaction
times are recorded.
 In another type of task, participants are first presented with a target stimulus,
and later presented with two different visuals.
 One of the visuals matches the target stimulus on the global level, while the
other visual matches the target stimulus on the local level. In this condition,
experimenters note which of the two visuals, the global or local, is chosen to
match the target stimulus.
 He found two effects which he argued supported “..the notion that global
processing is a necessary stage of perception prior to more fine-grained
analysis” : (i) responses to the global level were faster than responses to the
local level, and (ii) when the levels were inconsistent, information at the
global level interfered with (slowed down) responses to the local level, but not
the other way around.
Additionally, the global interference effect, which occurs when the global aspect is automatically processed even when attention is directed locally, slows reaction times.
Navon's study of global precedence and his stimuli, or variations of them, are still used in nearly all global precedence experiments.

Navon Effect
 A Navon figure is made of a larger recognisable shape, such as a letter,
composed of copies of a smaller different shape. Navon figures are used in
tests of visual neglect.
 Reading Navon figures has been found to affect a range of tasks.
 It has been shown that just 5 minutes reading out the small letters of Navon
figures has a detrimental effect on face recognition.
 The size of the Navon effect has been found to be influenced by the
properties of the image.
The effect is short-lived (lasting less than a couple of minutes).
The Navon effect has also been found in other tasks such as golf putting, where reading the small Navon letters leads to poorer putting performance.
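A compound (Navon) stimulus is easy to generate: a global letter drawn out of copies of a different local letter. The sketch below is a toy text rendering with an assumed 5x5 shape for the global letter, included only to make the global/local distinction concrete.

```python
# Render a large "H" (global level) built from small "s" characters (local level),
# the kind of compound stimulus used in Navon's global-precedence experiments.
GLOBAL_SHAPES = {
    "H": ["X...X",
          "X...X",
          "XXXXX",
          "X...X",
          "X...X"],
}

def navon_figure(global_letter, local_letter):
    """Return a text grid: the global letter's shape filled with the local letter."""
    rows = GLOBAL_SHAPES[global_letter]
    return "\n".join(
        "".join(local_letter if cell == "X" else " " for cell in row)
        for row in rows
    )

print(navon_figure("H", "s"))   # inconsistent trial: global H, local s
# s   s
# s   s
# sssss
# s   s
# s   s
```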

CONTEXT EFFECT
A context effect describes the influence of environmental factors on one's perception of a stimulus.
The impact of context effects is considered to be part of the top-down approach.
 The concept is supported by the theoretical approach to perception known
as constructive perception.
 Context effects can impact our daily lives in many ways such as word
recognition, learning abilities, memory, and object recognition.
 It can have an extensive effect on marketing and consumer decisions. For
example, research has shown that the comfort level of the floor that shoppers
are standing on while reviewing products can affect their assessments of
product's quality, leading to higher assessments if the floor is comfortable
and lower ratings if it is uncomfortable. Because of effects such as this,
context effects are currently studied predominantly in marketing.

Cognitive principles of context effects

 Context effects employ top-down design when analyzing information.


 Top down design fuels understanding of an image by using prior experiences
and knowledge to interpret a stimulus. This process helps us analyze familiar
scenes and objects when encountering them.
 During perception of any kind, people generally use either sensory data
(bottom-up design) or prior knowledge of the stimulus (top-down design)
when analyzing the stimulus. Individuals generally use both types of
processing to examine stimuli.
The use of both sensory data and prior knowledge to reach a conclusion is a feature of optimal probabilistic reasoning, known as Bayesian inference; cognitive scientists have shown mathematically how context effects can emerge from the Bayesian inference process (a minimal numerical sketch is given after this list).
 When context effects occur, individuals are using environmental cues
perceived while examining the stimuli in order to help analyze it. In other
words, individuals often make relative decisions that are influenced by the
environment or previous exposure to objects.
 These decisions may be greatly influenced by these external forces and alter
the way individuals view an object. For example, research has shown that
people rank television commercials as either good or bad in relation to their
enjoyment levels of the show during which the commercials are presented.
 The more they like or dislike the show the more likely they are to rate the
commercials shown during the show more positively or negatively
(respectively).
Another example: in sound recognition, other sounds in the environment can change the way we categorize a given sound.
Context effects can come in several forms, including the configural superiority effect, which demonstrates varying degrees of spatial recognition depending on whether stimuli are presented in an organized configuration or in isolation.
For example, one may recognize a fully composed object faster than its individual parts (object-superiority effect).
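As noted in the list above, context effects can be modelled as Bayesian inference. Here is a minimal numerical sketch of how a prior supplied by context combines with ambiguous sensory evidence under Bayes' rule; the letters, likelihoods and prior values are made up for illustration (the classic ambiguous A/H mark read in "CAT" versus "THE").

```python
# Toy Bayesian account of a context effect in letter recognition.
# An ambiguous smudge is equally consistent with "A" and "H" (equal likelihoods),
# but the surrounding word context shifts the prior and hence the percept.

def posterior(prior_a, likelihood_a=0.5, likelihood_h=0.5):
    """P(A | evidence) when the ambiguous mark could be an A or an H."""
    num = prior_a * likelihood_a
    return num / (num + (1 - prior_a) * likelihood_h)

# Context "C_T": "CAT" is a word, "CHT" is not -> prior strongly favours A.
print(posterior(prior_a=0.9))   # ~0.90 -> we tend to see "A"

# Context "T_E": "THE" is a word, "TAE" is not -> prior strongly favours H.
print(posterior(prior_a=0.1))   # ~0.10 -> we tend to see "H"
```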

Impact

 Context effects can have a wide range of impacts in daily life. In reading
difficult handwriting context effects are used to determine what letters make
up a word.
 This helps us analyze potentially ambiguous messages and decipher them
correctly. It can also affect our perception of unknown sounds based on the
noise in the environment.
 For example, we may fill in a word we cannot make out in a sentence based
on the other words we could understand. Context can prime our attitudes and
beliefs about certain topics based on current environmental factors and our
previous experiences with them.
 Context effects also affect memory. We are often better able to recall
information in the location in which we learned it or studied it.
 For example, while studying for a test it is better to study in the environment
that the test will be taken in (i.e. classroom) than in a location where the
information was not learned and will not need to be recalled. This
phenomenon is called transfer-appropriate processing.

Configural-Superiority Effect
A type of context effect by which objects presented in certain configurations are
easier to recognize than the objects presented in isolation, even if the objects in
the configuration are more complex than those in isolation.

MARR’S COMPUTATIONAL THEORY

David Marr approached perception as problem solving. According to him, to find a solution it is important to analyze what the visual system should do in order to make perception successful.
Marr called this level computational, since it assumes that each function (perception is a function) can be understood as a computational operation, consisting of sequenced steps, that leads to a desired outcome. A fundamental feature of this sequence of steps is that it contains hidden analytic, computational processes, and the aim of computational analysis is to describe the strategy by which the result is achieved.
Marr's second level specifies a representational system together with the algorithms that transform inputs into representations. This second level of solving a problem is a detailed analysis of the specific operations that must be carried out when transforming physical stimuli into mental representations. At this algorithmic (and representational) level we study the formulas and algorithms, as well as the representations, that enable us to achieve the result.
The third level is the analysis of the means that enable us to carry out a specific operation. This is called the hardware level (Rookes & Willson, 2000, p. 34) or implementation level. In the case of living systems it involves analysis of neural networks; in the case of AI (artificial intelligence) it is the description of functional connections in the language of a specific material base.
 Marr proposes that perception proceeds in terms of several different, special-
purpose computational mechanisms, such as a module to analyze color,
another to analyze motion, and so on. Each operates autonomously, without
regard to the input from or output to any other module, and without regard to
real-world knowledge. Thus, they are bottom-up processes.
 Marr believes that visual perception proceeds by constructing three different
mental representations, or sketches.
 The first, called a primal sketch, depicts areas of relative brightness and
darkness in a two-dimensional image as well as localized geometric structure.
This allows the viewer to detect boundaries between areas but not to “know”
what the visual information “means.”
 Once a primal sketch is created, the viewer uses it to create a more complex
representation, called a 2½-D (two-and-a-half-dimensional) sketch. Using
cues such as shading, texture, edges, and others, the viewer derives
information about what the surfaces are and how they are positioned in depth
relative to the viewer’s own vantage point at that moment.
 Marr proposes that both the primal sketch and the 2½-D sketch rely almost
exclusively on bottom-up processes.
 Information from real-world knowledge or specific expectations (that is, top-
down knowledge) is incorporated when the viewer constructs the final, 3-D
sketch of the visual scene. This sketch involves both recognition of what the
objects are and understanding of the “meaning” of the visual scene.
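As a toy illustration of the kind of bottom-up computation the primal sketch involves, the snippet below marks boundaries between areas of relative brightness and darkness in a small intensity grid using finite differences. It is only a sketch of the idea of detecting intensity boundaries, not Marr's actual algorithm, and the image values and threshold are invented.

```python
import numpy as np

# A small image: dark region (0) on the left, bright region (9) on the right.
image = np.array([
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
])

# Horizontal finite differences: large values mark boundaries between
# areas of relative darkness and brightness, as in a crude "primal sketch".
dx = np.abs(np.diff(image.astype(float), axis=1))
edges = dx > 4          # threshold chosen arbitrarily for this toy example
print(edges.astype(int))
# Each row reads 0 0 1 0: the boundary sits between the dark and bright areas,
# telling us where regions meet but not what the scene "means".
```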

ATTENTION

A state in which cognitive resources are focused on certain aspects of the environment rather than on others, and the central nervous system is in a state of readiness to respond to stimuli.
 Because it has been presumed that human beings do not have an infinite
capacity to attend to everything—focusing on certain items at the expense of
others—much of the research in this field has been devoted to discerning
which factors influence attention and to understanding the neural
mechanisms that are involved in the selective processing of information.
 For example, past experience affects perceptual experience (we notice things
that have meaning for us), and some activities (e.g., reading) require
conscious participation (i.e., voluntary attention). However, attention can also
be captured (i.e., directed involuntarily) by qualities of stimuli in the
environment, such as intensity, movement, repetition, contrast, and novelty.

Selective attention

Concentration on certain stimuli in the environment and not on others, enabling important stimuli to be distinguished from peripheral or incidental ones.
 Selective attention is typically measured by instructing participants to attend
to some sources of information but to ignore others at the same time and
then determining their effectiveness in doing this. Also called controlled
attention; directed attention; executive attention.
Colin Cherry described a phenomenon called the cocktail party effect: the process of tracking one conversation in the face of the distraction of other conversations.
 He observed that cocktail parties are often settings in which selective
attention is salient.
 Cherry did not actually hang out at numerous cocktail parties to study
conversations. He studied selective attention in a more carefully controlled
experimental setting. He devised a task known as shadowing.
 In shadowing, you listen to two different messages. Cherry presented a
separate message to each ear, known as dichotic presentation.
 You are required to repeat back only one of the messages as soon as possible
after you hear it. In other words, you are to follow one message (think of a
detective “shadowing” a suspect) but ignore the other.
 Cherry’s participants were quite successful in shadowing distinct messages in
dichotic-listening tasks, although such shadowing required a significant
amount of concentration.
 The participants were also able to notice physical, sensory changes in the
unattended message—for example, when the message was changed to a
tone or the voice changed from a male to a female speaker.
 However, they did not notice semantic changes in the unattended message.
They failed to notice even when the unattended message shifted from English
to German or was played backward.
 Conversely, about one third of people, when their name is presented during
these situations, will switch their attention to their name. Some researchers
have noted that those who hear their name in the unattended message tend
to have limited working-memory capacity.
 Three factors help you to selectively attend only to the message of the target
speaker to whom you wish to listen:
1. Distinctive sensory characteristics of the target’s speech. Examples of such
characteristics are high versus low pitch, pacing, and rhythmicity.
2. Sound intensity (loudness).
3. Location of the sound source
Sustained Attention
 While selective attention is mainly concerned with the selection of stimuli,
sustained attention is concerned with concentration. It refers to our ability to
maintain attention on an object or event for longer durations. It is also known
as “vigilance”.
 Sometimes people have to concentrate on a particular task for many hours.
Air traffic controllers and radar readers provide us with good examples of this
phenomenon. They have to constantly watch and monitor signals on screens.
 The occurrence of signals in such situations is usually unpredictable, and
errors in detecting signals may be fatal. Hence, a great deal of vigilance is
required in those situations.
Factors Influencing Sustained Attention
Several factors can facilitate or inhibit an individual's performance on tasks of sustained attention.
 Sensory modality is one of them. Performance is found to be superior when
the stimuli (called signals) are auditory than when they are visual.
 Clarity of stimuli is another factor. Intense and long lasting stimuli facilitate
sustained attention and result in better performance.
 Temporal uncertainty is a third factor. When stimuli appear at regular
intervals of time they are attended better than when they appear at irregular
intervals.
 Spatial uncertainty is a fourth factor. Stimuli that appear at a fixed place are
readily attended, whereas those that appear at random locations are difficult
to attend.

Divided Attention
Attention to two or more channels of information at the same time, so that two
or more tasks may be performed concurrently. It may involve the use of just one
sense (e.g., hearing) or two or more senses (e.g., hearing and vision).
Investigating Divided Attention in the Lab
 Early work in the area of divided attention had participants view a videotape
in which the display of a basketball game was superimposed on the display of
a handslapping game.
 Participants could successfully monitor one activity and ignore the other.
However, they had great difficulty in monitoring both activities at once, even
if the basketball game was viewed by one eye and the hand-slapping game
was watched separately by the other eye (Neisser & Becklen, 1975).
 Neisser and Becklen hypothesized that improvements in performance
eventually would have occurred as a result of practice. They also
hypothesized that the performance of multiple tasks was based on skill
resulting from practice. They believed it not to be based on special cognitive
mechanisms.
 The following year, investigators used a dual-task paradigm to study divided
attention during the simultaneous performance of two activities: reading
short stories and writing down dictated words (Spelke, Hirst, & Neisser, 1976).
 The researchers would compare and contrast the response time (latency) and
accuracy of performance in each of the three conditions. Of course, higher
latencies mean slower responses.
 As expected, initial performance was quite poor for the two tasks when the
tasks had to be performed at the same time. However, Spelke and her
colleagues had their participants practice to perform these two tasks 5 days a
week for many weeks (85 sessions in all). To the surprise of many, given
enough practice, the participants’ performance improved on both tasks.
 They showed improvements in their speed of reading and accuracy of reading
comprehension, as measured by comprehension tests. They also showed
increases in their recognition memory for words they had written during
dictation. Eventually, participants’ performance on both tasks reached the
same levels that the participants previously had shown for each task alone.
 When the dictated words were related in some way (e.g., they rhymed or
formed a sentence), participants first did not notice the relationship. After
repeated practice, however, the participants started to notice that the words
were related to each other in various ways. They soon could perform both
tasks at the same time without a loss in performance.
 Spelke and her colleagues suggested that these findings showed that
controlled tasks can be automatized so that they consume fewer attentional
resources. Furthermore, two discrete controlled tasks may be automatized to
function together as a unit. The tasks do not, however, become fully
automatic.
 For one thing, they continue to be intentional and conscious. For another,
they involve relatively high levels of cognitive processing.
Alternating attention
 Alternating attention is the ability to shift the focus of attention and move
between two or more activities with different cognitive requirements.
 Mental flexibility is thereby required to enable the switch and to perform the
different tasks efficiently, without the cognitive load of one task limiting the
performance of the others, or task switching itself altering concentration.
MODELS OF ATTENTION
BOTTLENECK MODELS OF ATTENTION
 A bottleneck restricts the rate of flow, as, say, in the narrow neck of a milk
bottle. The narrower the bottleneck, the lower the rate of flow.
 Broadbent's, Treisman's and Deutsch and Deutsch Models of Attention are all
bottleneck models because they predict we cannot consciously attend to all
of our sensory input at the same time.
 This limited capacity for paying attention is therefore a bottleneck and the
models each try to explain how the material that passes through the
bottleneck is selected.
BROADBENT’S FILTER MODEL
 Donald Broadbent is recognised as one of the major contributors to the
information processing approach, which started with his work with air traffic
controllers during the war.
 In that situation a number of competing messages from departing and
incoming aircraft are arriving continuously, all requiring attention. The air
traffic controller finds s/he can deal effectively with only one message at a
time and so has to decide which is the most important.
Broadbent designed an experiment (dichotic listening) to investigate the processes involved in switching attention, which are presumed to be going on internally in our heads.
 Broadbent argued that information from all of the stimuli presented at any
given time enters a sensory buffer. One of the inputs is then selected on the
basis of its physical characteristics for further processing by being allowed to
pass through a filter.
 Because we have only a limited capacity to process information, this filter is
designed to prevent the information-processing system from becoming
overloaded.
 The inputs not initially selected by the filter remain briefly in the sensory
buffer, and if they are not processed they decay rapidly. Broadbent assumed
that the filter rejected the non-shadowed or unattended message at an early
stage of processing.
 Broadbent (1958) looked at air-traffic control type problems in a laboratory.
 Broadbent wanted to see how people were able to focus their attention
(selectively attend), and to do this he deliberately overloaded them with
stimuli - they had too many signals, too much information to process at the
same time.
 One of the ways Broadbent achieved this was by simultaneously sending one
message (a 3-digit number) to a person's right ear and a different message (a
different 3-digit number) to their left ear.
Participants were asked to listen to both messages at the same time and repeat what they heard; this is known as a 'dichotic listening task'.
 In the example above the participant hears 3 digits in their right ear (7,5,6)
and 3 digits in their left ear (4,8,3). Broadbent was interested in how these
would be repeated back. Would the participant repeat the digits back in the
order that they were heard (order of presentation), or repeat back what was
heard in one ear followed by the other ear (ear-by-ear)?
 He actually found that people made fewer mistakes repeating back ear by ear
and would usually repeat back this way.
SINGLE CHANNEL MODEL
 Results from this research led Broadbent to produce his 'filter' model of how
selective attention operates. Broadbent concluded that we can pay attention
to only one channel at a time - so his is a single channel model.
 In the dichotic listening task each ear is a channel. We can listen either to the
right ear (that's one channel) or the left ear (that's another channel).
Broadbent also discovered that it is difficult to switch channels more than
twice a second.
 So you can only pay attention to the message in one ear at a time - the
message in the other ear is lost, though you may be able to repeat back a few
items from the unattended ear. This could be explained by the short-term
memory store which holds onto information in the unattended ear for a short
time.
 Broadbent thought that the filter, which selects one channel for attention,
does this only on the basis of PHYSICAL CHARACTERISTICS of the information
coming in: for example, which particular ear the information was coming to,
or the type of voice.
 According to Broadbent the meaning of any of the messages is not taken into
account at all by the filter. All SEMANTIC PROCESSING (processing the
information to decode the meaning, in other words understand what is said)
is carried out after the filter has selected the channel to pay attention to.
So whatever message is sent to the unattended ear is not understood.
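The early-selection idea, that a single channel is picked on purely physical characteristics before any semantic analysis, can be sketched as follows. The channel structure, the attribute used for selection and the function names are invented for illustration; this is not Broadbent's own formalism.

```python
# Toy early-selection filter in the spirit of Broadbent's model:
# many channels enter a sensory buffer, one is selected purely on a
# physical characteristic (here, which ear), and only that channel
# goes forward for semantic processing.

sensory_buffer = [
    {"ear": "left",  "voice": "male",   "content": "7 5 6"},
    {"ear": "right", "voice": "female", "content": "4 8 3"},
]

def early_filter(buffer, select_by, value):
    """Pass through only the channel whose physical feature matches; the rest decay."""
    return [ch for ch in buffer if ch[select_by] == value]

def semantic_processing(channels):
    """Meaning is extracted only AFTER the filter, and only for the attended channel."""
    return [f"understood: {ch['content']}" for ch in channels]

attended = early_filter(sensory_buffer, select_by="ear", value="right")
print(semantic_processing(attended))   # only the right-ear message is understood
```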
EVALUATION OF BROADBENT'S MODEL
(1) Broadbent's dichotic listening experiments have been criticised because:
(a) The early studies all used people who were unfamiliar with shadowing and so
found it very difficult and demanding. Eysenck & Keane (1990) claim that the
inability of naive participants to shadow successfully is due to their unfamiliarity
with the shadowing task rather than an inability of the attentional system.
(b) Participants reported after the entire message had been played - it is possible
that the unattended message is analysed thoroughly but participants forget.
(c) Analysis of the unattended message might occur below the level of conscious
awareness. For example, research by von Wright et al (1975) indicated analysis
of the unattended message in a shadowing task. A word was first presented to
participants with a mild electric shock. When the same word was later presented
to the unattended channel, participants registered an increase in GSR (indicative
of emotional arousal and analysis of the word in the unattended channel).

ANNE TREISMAN’S (1964) ATTENUATION MODEL


 Selective attention requires that stimuli are filtered so that attention is
directed. Broadbent's model suggests that the selection of material to attend
to (that is, the filtering) is made early, before semantic analysis.
 Treisman's model retains this early filter which works on physical features of
the message only. The crucial difference is that Treisman's filter
ATTENUATES rather than eliminates the unattended material.
 Attenuation is like turning down the volume so that if you have 4 sources of
sound in one room (TV, radio, people talking, baby crying) you can turn down
or attenuate 3 in order to attend to the fourth.
The result is almost the same as turning them off: the unattended material appears lost. But if an unattended channel includes your name, for example, there is a chance you will hear it, because the material is still there.
 Treisman agreed with Broadbent that there was a bottleneck, but disagreed
with the location.
 Treisman carried out experiments using the speech shadowing method.
Typically, in this method participants are asked to simultaneously repeat
aloud speech played into one ear (called the attended ear) whilst another
message is spoken to the other ear.
 In one shadowing experiment, identical messages were presented to two ears
but with a slight delay between them. If this delay was too long, then
participants did not notice that the same material was played to both ears.
When the unattended message was ahead of the shadowed message by up to 2 seconds, participants noticed the similarity. If it is assumed the unattended material is held in a temporary buffer store, then these results would indicate that the duration of material held in the sensory buffer store is about 2 seconds.
 In an experiment with bilingual participants, Treisman presented the attended
message in English and the unattended message in a French translation.
When the French version lagged only slightly behind the English version,
participants could report that both messages had the same meaning. Clearly, then, the unattended message was being processed for meaning.
Broadbent's Filter Model, where the filter selected on the basis of physical characteristics only, could not explain these findings. The evidence suggests that Broadbent's Filter Model is not adequate: it does not allow for meaning being taken into account.
 Treisman's ATTENUATION THEORY, in which the unattended message is
processed less thoroughly than the attended one, suggests processing of the
unattended message is attenuated or reduced to a greater or lesser extent
depending on the demands on the limited capacity processing system.
 Treisman suggested messages are processed in a systematic way, beginning
with analysis of physical characteristics, syllabic pattern, and individual
words. After that, grammatical structure and meaning are processed. It will
often happen that there is insufficient processing capacity to permit a full
analysis of unattended stimuli.
 In that case, later analyses will be omitted. This theory neatly predicts that it
will usually be the physical characteristics of unattended inputs which are
remembered rather than their meaning.
To be analysed, items have to reach a certain threshold of intensity. All the attended/selected material will reach this threshold, but only some of the attenuated items.
 Some items will retain a permanently reduced threshold, for example your
own name or words/phrases like 'help' and 'fire'.
 Other items will have a reduced threshold at a particular moment if they have
some relevance to the main attended message.
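The difference from Broadbent's all-or-none filter can be shown by adapting the sketch above: unattended channels are not blocked but attenuated, and an item still breaks through to awareness if its (possibly permanently lowered) threshold is exceeded. The channel gains and thresholds below are invented numbers, not values Treisman proposed.

```python
# Toy attenuation model: the unattended channel is turned down, not switched off.
ATTENUATION = {"attended": 1.0, "unattended": 0.2}   # gain applied to each channel

# Items with permanently lowered thresholds (own name, "fire", "help") can break
# through even from an attenuated channel; ordinary words need a stronger signal.
THRESHOLDS = {"your_name": 0.1, "fire": 0.1, "help": 0.1}
DEFAULT_THRESHOLD = 0.5

def reaches_awareness(word, channel, signal_strength=1.0):
    """A word is consciously registered if its attenuated signal exceeds its threshold."""
    effective = signal_strength * ATTENUATION[channel]
    return effective >= THRESHOLDS.get(word, DEFAULT_THRESHOLD)

print(reaches_awareness("holiday", "attended"))     # True: full-strength channel
print(reaches_awareness("holiday", "unattended"))   # False: attenuated below threshold
print(reaches_awareness("your_name", "unattended")) # True: lowered threshold breaks through
```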
EVALUATION OF TREISMAN'S ATTENUATION MODEL
1. Treisman's Model overcomes some of the problems associated with
Broadbent's Filter Model, e.g. the Attenuation Model can account for the 'Cocktail
Party Syndrome'.
2. Treisman's model does not explain how exactly semantic analysis works.
3. The nature of the attenuation process has never been precisely specified.
4. A problem with all dichotic listening experiments is that you can never be sure
that the participants have not actually switched attention to the so called
unattended channel.

KAHNEMAN'S CAPACITY THEORY OF ATTENTION

Kahneman (1973) proposed that there is a certain amount of ATTENTIONAL CAPACITY available which has to be allocated among the various demands made on it.
 On the capacity side, when someone is aroused and alert, they have more
attentional resources available than when they are lethargic. On the demand
side, the attention demanded by a particular activity is defined in terms of
MENTAL EFFORT; the more skilled an individual the less mental effort is
required, and so less attention needs to be allocated to that activity.
 If a person is both motivated (which increases attentional capacity) and
skilled (which decreases the amount of attention needed), he or she will have
some attentional capacity left over.
 People can attend to more than one thing at a time as long as the total
mental effort required does not exceed the total capacity available. In
Kahneman's model allocation of attentional resources depends on a
CENTRAL ALLOCATION POLICY for dividing available attention between
competing demands.
 Once a task has become automatic it requires little mental effort and
therefore we can attend to more than one automatic task at any one time,
e.g. driving and talking.
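A minimal sketch of the capacity idea: available capacity rises with arousal, each task demands mental effort that falls with skill and automatisation, and tasks can be performed together only while total demand stays within capacity. All the numbers and scaling functions below are invented for illustration; Kahneman did not specify quantitative values.

```python
# Toy capacity model in the spirit of Kahneman (1973).
def available_capacity(arousal):
    """More arousal/alertness -> more attentional resources (invented scaling)."""
    return 50 + 50 * arousal            # arousal in [0, 1]

def effort(task_difficulty, skill):
    """Mental effort a task demands; practice/skill reduces it (invented scaling)."""
    return task_difficulty * (1.0 - skill)   # skill in [0, 1]

def can_perform_together(tasks, arousal):
    """Central allocation succeeds only if total demanded effort fits within capacity."""
    demand = sum(effort(difficulty, skill) for difficulty, skill in tasks)
    return demand <= available_capacity(arousal)

driving = (80, 0.9)            # difficult task, but highly practised -> little effort
talking = (40, 0.8)
mental_arithmetic = (90, 0.2)  # unpractised, effortful task

print(can_perform_together([driving, talking], arousal=0.5))            # True
print(can_perform_together([driving, mental_arithmetic], arousal=0.5))  # False
```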
Kahneman's Capacity Model of Attention
(1) Attention is a central dynamic process rather than the result of automatic
filtering of perceptual input.
(2) Attention is largely a top-down process, as opposed to the Filter Models, which suggest a bottom-up process.
(3) The focus of interest is the way the central allocation policy is operated so as
to share appropriate amounts of attention between skilled automatic tasks and
more difficult tasks which require a lot of mental effort.
(4) Rather than a one-way flow of information from input through to responses,
attention involves constant perceptual evaluation of the demands required to
produce appropriate responses.
EVALUATION OF KAHNEMAN'S MODEL
1. Cheng (1985) points out that when tasks have been learnt we change the way
we process and organise them, but this is not necessarily 'automaticity'. For
example, if asked to add ten two's you could add 2 and 2 to make 4, add 4 and 2
to make 6, add 6 and 2 to make 8, add 8 and 2 to make 10 etc. Indeed young
children when first learning arithmetic would do just this.
When we have more arithmetical knowledge and realise that adding ten two's is
the same as multiplying 2 x 10, the solution can be produced in one step. The
answer is quicker because we have processed the information differently, using
different operations, not because we have added ten two's 'automatically'.
2. A MAJOR PROBLEM with Kahneman's theory is that it does not explain how the allocation system decides on policies for allocating attentional resources to tasks.
