Human Pattern Recognition

Perception, 1994, volume 23, pages 411-427

Human pattern recognition: parallel processing and perceptual learning
Manfred Fahle
Section for Visual Science, Department of Neurophthalmology, University Eye Clinic, Röntgenweg 11,
D-72076 Tübingen, Germany
Received 12 July 1993, in revised form 23 December 1993

Abstract. A new theory of visual object recognition by Poggio et al that is based on multidimensional interpolation between stored templates requires fast, stimulus-specific learning in the
visual cortex. Indeed, performance in a number of perceptual tasks improves as a result of
practice. We distinguish between two phases of learning a vernier-acuity task, a fast one that
takes place within less than 20 min and a slow phase that continues over 10 h of training and
probably beyond. The improvement is specific for relatively 'simple' features, such as the orientation of the stimulus presented during training, for the position in the visual field, and for the
eye through which learning occurred. Some of these results are simulated by means of a
computer model that relies on object recognition by multidimensional interpolation between
stored templates. Orientation specificity of learning is also found in a jump-displacement task.
In a manner parallel to the improvement in performance, cortical potentials evoked by the jump
displacement tend to decrease in latency and to increase in amplitude as a result of training.
The distribution of potentials over the brain changes significantly as a result of repeated
exposure to the same stimulus. The results both of psychophysical and of electrophysiological
experiments indicate that some form of perceptual learning might occur very early during
cortical information processing. The hypothesis that vernier breaks are detected 'early' during
pattern recognition is supported by the fact that reaction times for the detection of verniers
depend hardly at all on the number of stimuli presented simultaneously. Hence, vernier breaks
can be detected in parallel at different locations in the visual field, indicating that deviation
from straightness is an elementary feature for visual pattern recognition in humans that is
detected at an early stage of pattern recognition. Several results obtained during the last few
years are reviewed, some new results are presented, and all these results are discussed with
regard to their implications for models of pattern recognition.
1 Introduction
Visual object recognition relies on the comparison between an actual retinal image of
an object and stored examples of previously experienced (or described) visual objects.
Neuronal mechanisms in the brain then try to identify the actually presented object by
matching it with stored object descriptions (cf eg Boucart et al 1994). The description
most similar to the actually presented object will be regarded as the object (eg
a zebra) corresponding to the actual patch of black and white stripes on the retina.
The task of correlating the actual with the stored images is far from trivial, as was
experienced by researchers in artificial intelligence trying to teach computers how to
see. Objects will rarely reappear at exactly the same distance, hence the size of their
image varies. They might be at a different visual field position, though foveation, ie
looking towards the object, will often compensate for this problem. Objects might
appear under very varying illumination, as regards both luminance and spectral
composition, and, most important of all, they might be rotated relative to the previous
appearance, or might have changed their form, as in the case of a running versus a
lying zebra. Somehow, our brain is able to cope with these problems much better
than computer programs are up to now, and to extract invariances from the visual
image (cf Van Gool et al 1994).
One possible explanation is that our brain makes use of a powerful strategy to
identify objects even from novel views: multidimensional interpolation between stored
examples (Poggio 1990). This hypothesis assumes that templates or views of an object
are stored on first appearance, and that the brain is able to interpolate between these
templates in order to recognise the object even after rotation, translation, or slight
changes of shape or form. Pattern recognition would then rely not on the formation
of a complex three-dimensional model of the object, requiring quite high-level
computational processes, but on a heavily memory-based strategy with far less computation required. Learning of relatively simple feature combinations plays a key role
here, and the model requires that fast and specific perceptual learning occurs even in
adults.
It has been known for some time that recognition of visually presented objects
indeed improves through practice. For instance, to all students of histology most
sections look quite similar, be they taken from liver, lung, or kidney. But after some
time, shorter for some than for others, the more advanced observer wonders why he
or she was ever able to miss the difference. It has also been known for a couple of
decades that performance improves as a function of training in much less complex
tasks, such as vernier discrimination, line-orientation discrimination, or stereoscopic
depth perception (Bennett and Westheimer 1991; Fendick and Westheimer 1983;
McKee and Westheimer 1978; Shiu and Pashler 1991; Vogels and Orban 1985).
Most researchers even take precautions against learning effects contaminating their
results, usually by starting experiments with a lengthy training phase for observers to
arrive at their 'baseline' performance, and/or by counterbalancing the order of
measurements between observers. Discrimination of spatial phase in complex
luminance-modulated gratings improves relatively fast but 'returns' to baseline after
rotation of the stimulus by 90° (Fiorentini and Berardi 1980, 1981). This was a first
indication that learning in visual perception might be quite specific for the exact
features of the stimulus presented. Learning does transfer from one eye to the
partner eye in monocular learning experiments but is specific for visual field position
(Fiorentini and Berardi 1981; Ramachandran and Braddick 1973), as well as for
orientation in a motion detection task (Ball and Sekuler 1987). Even in flies, learning
of visual patterns is specific for visual field position (Dill et al 1993). Learning also
plays an important role in the detection and discrimination of textures. Discrimination between figure and ground is specific for the eye and the orientation of the
stimulus elements, but not of the elements of the surround (Karni and Sagi 1991;
cf also Sagi and Polat 1992), and learning depends on which feature of the stimulus is
attended to (Ahissar and Hochstein 1993).
In the following, I will review contributions to the investigation of 'early' perceptual
learning. Learning in this context is defined as an observable modification of
behaviour, namely improvement in perceptual performance, as a result of training.
The first part of the paper (section 2) will deal with computational considerations and
models for fast perceptual learning. Section 3 will be devoted to the fast and the slow
phase of perceptual learning in vernier and stereoscopic depth discrimination and a
specific computer simulation; in sections 4 and 5 I will present further evidence for
the specificity of the learning, and then present correlations between psychophysical
and electrophysiological results in a jump-displacement task in humans. In section 6 I
will review the finding that vernier breaks are detected in parallel over the visual field,
indicating a relatively early stage of pattern recognition and of perceptual learning.
2 A model of perceptual learning
Visual object recognition can be considered as establishing more or less unique
relations between retinal images and cerebral representations or concepts of objects.
One might speculate that a number of views of an object are stored in visual memory
for all objects that can be recognised, and at the time of presentation of a visual
object the memory bank is searched through for the most similar view stored. If the
view, or a very similar one, has been stored, the corresponding 'basis function' will
provide a large output. If no sufficiently similar view is found, no 'basis function' or
'receptive field' will provide a significant output, and the new view would be added to
the views already stored. Such a procedure requires relatively little computational
power, but a large amount of (visual) storage capacity. Therefore, the new model of
object recognition requires relatively fast perceptual learning to store new views upon
request in the part of the brain that deals with visual perception.
It is important to note that the model does not require the actual view of an object
to be identical with a stored view, in the way a look-up table would. On the contrary,
the model assumes hyperradial basis functions (HBF) to be spread out through an
n-dimensional vector space. The basis functions serve as a kind of 'fuzzy' templates
that identify classes of features, rather than individual features. In this sense, the HBF
model postulates a form of a blurred look-up table as the basis for visual pattern
recognition. In our simplistic model, we feed the input of idealised photoreceptors
into an HBF network. When the orientation of the stimulus changes by 90°, a different
subset of receptors is excited and the input to the HBF network changes. Hence, the
model would predict that learning does not generalise across an orientation change
of 90°. Details of the model can be found in Poggio (1990), Poggio and Girosi (1990),
Poggio et al (1992a), and Weiss et al (1993).
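As a rough illustration of the memory-based scheme described above, the following Python sketch stores views as the centres of Gaussian basis functions and adds a new view only when no stored one responds strongly. The class name, the width `sigma`, the novelty threshold, and the 3 x 3 toy 'retina' are assumptions made for illustration; they are not taken from the cited model papers, and the sketch omits the output weights and any adjustment of centres that a full network would have.

```python
import math


class TemplateMemory:
    """Toy store of views, each acting as the centre of a Gaussian basis function."""

    def __init__(self, sigma=1.0, novelty_threshold=0.5):
        self.sigma = sigma                          # assumed width of each basis function
        self.novelty_threshold = novelty_threshold  # assumed cut-off for "similar enough"
        self.centres = []                           # stored views (receptor activity vectors)

    def activation(self, view, centre):
        # Gaussian of the squared Euclidean distance between input and stored view.
        d2 = sum((v - c) ** 2 for v, c in zip(view, centre))
        return math.exp(-d2 / (2.0 * self.sigma ** 2))

    def best_response(self, view):
        # Largest output over all stored basis functions (0.0 if nothing is stored).
        return max((self.activation(view, c) for c in self.centres), default=0.0)

    def present(self, view):
        # Add the view as a new template only if no stored one responds strongly.
        if self.best_response(view) < self.novelty_threshold:
            self.centres.append(list(view))
            return True
        return False


def rotate90(view, n):
    # Rotate a flattened n x n receptor grid by 90 degrees clockwise.
    return [view[(n - 1 - c) * n + r] for r in range(n) for c in range(n)]


# Toy 3 x 3 'retina', row by row; 1 marks an excited receptor (hypothetical stimulus).
vertical_vernier = [0, 1, 0,
                    0, 1, 0,
                    0, 0, 1]
rotated_vernier = rotate90(vertical_vernier, 3)

memory = TemplateMemory()
memory.present(vertical_vernier)                     # first exposure: stored
trained = memory.best_response(vertical_vernier)     # identical view: maximal output
untrained = memory.best_response(rotated_vernier)    # different receptors: weak output
```

In this toy setting the rotated stimulus excites a different subset of receptors, so the stored template responds only weakly and a new view has to be stored, which mirrors the prediction that learning does not generalise across a 90° rotation.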
3 Orientation specificity of the fast and slow phase of learning
We were especially interested in the time course of perceptual learning, and in the
question of whether a fast phase of perceptual learning for vernier discrimination
exists in addition to the slow phase described in the literature (McKee and Westheimer
1978). We indeed found such a fast phase of vernier learning that takes place within
some tens of minutes whereas a second, slow phase requires hours to weeks. Both
phases are quite specific for the stimulus used for learning: there is hardly any transfer
of learning when the stimulus is rotated by 90°. This orientation specificity indicates
that improvement is not mainly caused by the fact that observers concentrate better
on the task or develop a better general strategy to solve hyperacuity experiments in
general.
3.1 Methods
Stimuli were presented on a CRT screen (Tektronix 608 or HP 1336) under computer
control. A vernier target was presented for 100 ms. It was 10 min arc long and 2 min
arc wide, with a luminance around 400 cd m⁻² on a surround of 25 cd m⁻². Viewing
distance was 2.0 m. Observers were paid students of Tübingen University. They had
normal or corrected-to-normal visual acuity, were naive as to the aim of the study, and
had not previously participated in psychophysical experiments. Stimuli were oriented
either horizontally or vertically. Observers had to decide, in a modified two-alternative
forced-choice task, whether the lower segment of the stimulus (or the right-hand one)
was offset to the left or to the right (or up or down) relative to its partner segment
and to indicate their decision by pressing the appropriate one of two push buttons.
The computer provided auditory feedback on the correctness of their response.
In the experiments on the slow phase of learning, an adaptive-staircase procedure
was used to measure thresholds (PEST; Taylor and Creelman 1967), but the higher
temporal resolution required to measure the fast phase did not allow the reliable
calculation of thresholds. Here, percentages of incorrect responses were measured for
a fixed vernier offset, usually of 15 s arc.
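To give a sense of scale, the angular sizes quoted above can be converted into physical sizes on the screen at the stated viewing distance of 2.0 m. The helper below is an illustrative calculation only and not part of the original methods; the function name is hypothetical.

```python
import math


def size_on_screen_mm(angle_arcsec, distance_m):
    """Physical extent (in mm) that subtends a given visual angle at a given distance."""
    angle_rad = math.radians(angle_arcsec / 3600.0)
    return 2.0 * distance_m * 1000.0 * math.tan(angle_rad / 2.0)


offset_mm = size_on_screen_mm(15, 2.0)        # fixed vernier offset of 15 s arc
length_mm = size_on_screen_mm(10 * 60, 2.0)   # target length of 10 min arc
```

Under these assumptions the 15 s arc offset corresponds to roughly 0.15 mm on the screen, and the 10 min arc target to roughly 5.8 mm.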

3.2 Results
Observers quickly improved performance as a result of practice. Mean performance
of twelve inexperienced observers improved from 26% to around 15% incorrect
responses within less than 30 min (figure 1). Half of the observers were trained with
vertical stimulus orientation, the other half with horizontally oriented verniers. After
1 h, stimulus orientation was rotated by 90°. Performance deteriorated dramatically
as a result of stimulus rotation, to pretraining levels or below.

[Figure 1: plot not reproduced; horizontal axis: Time/h]
Figure 1. Fast perceptual learning of vernier acuity. Performance [means and standard errors
(vertical bars) of percentage of incorrect responses] of twelve observers. Six observers started
with vertical orientation of the verniers, six started with horizontal orientation. After 1 h of
training, stimulus orientation was changed by 90° (broken vertical line), and the experiment
continued for another hour. In spite of large interindividual variation, there was a highly significant improvement of performance as a result of training that was highly specific for stimulus
orientation, ie that did not transfer between orientations (after Poggio et al 1992b).
Similar results were obtained with another twelve inexperienced observers in a
long-term learning experiment. Here, thresholds rather than percentages of correct
responses were measured, and the total training time per observer was 10 h, rather than
2 h as in the previous experiment. All average thresholds were clearly below 30 s arc,
ie in the hyperacuity range, below the diameter of foveal photoreceptors (figure 2).
The fast phase of learning is partly masked in this latter experiment; since each
data point represents 240 presentations, rather than 60 presentations as in the
previous experiment, temporal resolution is much poorer. As is evident from figure 2,
learning of the vernier-discrimination task continues over at least 5 h, and if stimulus
orientation changes from vertical to horizontal or vice versa after 5 h thresholds
increase to and beyond pretraining levels. The improvement is significant both for
the first and for the second part of the curve (p < 0.01). In additional experiments
with constant stimulus orientation, improvement of thresholds continued throughout
the complete experiment, which lasted for 10 h, split into 10 daily 1 h sessions (Fahle
and Edelman 1993).
3.3 Discussion
The results on orientation specificity of vernier learning show that there are two
distinct phases of perceptual learning, a fast one that improves performance within a
few tens of minutes, and a slow one that continues to improve thresholds over hours.

[Figure 2: plot not reproduced; horizontal axis: Time/h, from 0.0 to 10.0]
Figure 2. Long-term perceptual learning of vernier acuity. In contrast to figure 1, thresholds
rather than percentages of incorrect responses were measured in another twelve inexperienced
observers. Each data point represents mean values for 240 presentations per observer, ie
almost 3000 presentations; vertical bars indicate standard errors. After 5 h of training, stimulus
orientation was rotated by 90° (broken vertical line). Again, there were large interindividual
variations, but a clear improvement of thresholds as a result of practice. Improvement did not
transfer after rotation of the stimulus. Quite to the contrary, mean thresholds for the new orientation were higher than for completely inexperienced observers (after Fahle and Edelman 1993).
The fast phase of perceptual learning immediately improves performance, as in the
case of grating discrimination (Fiorentini and Berardi 1980, 1981) but unlike in
texture-discrimination tasks (Karni and Sagi 1991). Of course, the different slopes of
the learning curve might be distinct phases of the same process, eg one that can be
described by an exponential function. The results on learning in grating discrimination (Fiorentini and Berardi 1980, 1981) and on texture segregation (Karni and Sagi
1991; Sagi and Karni 1993) indicate that learning may take place at quite different
levels of visual pattern recognition, with different time constants (not just 'fast' and
'slow'), with varying degree of eye specificity (transfer versus no transfer between the
eyes), and with or without the need of a phase of consolidation or rest. Informal
experiments indicate that the slow phase of learning might continue, at a very low
speed, for weeks or even months. These findings indicate that several levels of
perceptual learning might exist, not just one. Moreover, more cognitive factors, such
as insight into and adaptation to the test, might play a certain role as well. Both the
slow and the fast learning process are astonishingly specific: improvement does not
transfer to an identical stimulus rotated by 90°. This result argues against the
assumption that observers learn mostly how to perform optimally in psychophysical
experiments in general, ie to concentrate, fixate constantly, and push the correct button.
Quite to the contrary, observers seem to learn specific features of the stimulus, features that are so specific that they are of no use (or might even be disadvantageous) when the stimulus is rotated by 90°. The learning of stimulus-specific features
is a prerequisite for the model of pattern recognition based on radial basis functions,
as discussed above. Computer simulations, based on a model that uses radial basis
functions, were in excellent agreement with the experimental data on the orientation
specificity of perceptual learning (Poggio et al 1992b).
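The remark above that the different slopes of the learning curve might belong to one process 'described by an exponential function' can be made concrete. The sketch below contrasts a single exponential with a sum of a fast and a slow exponential; all parameter values (asymptote, amplitudes, time constants) are invented for illustration and are not fitted to the data of figures 1 or 2.

```python
import math


def single_exponential(t_h, asymptote, amplitude, tau_h):
    """Performance measure decaying towards an asymptote with one time constant."""
    return asymptote + amplitude * math.exp(-t_h / tau_h)


def double_exponential(t_h, asymptote, amp_fast, tau_fast_h, amp_slow, tau_slow_h):
    """Sum of a fast and a slow exponential component."""
    return (asymptote
            + amp_fast * math.exp(-t_h / tau_fast_h)
            + amp_slow * math.exp(-t_h / tau_slow_h))


# Hypothetical parameters: time in hours, performance in arbitrary units.
times_h = [0.0, 0.25, 0.5, 1.0, 5.0, 10.0]
curve = [double_exponential(t, 10.0, 8.0, 0.2, 6.0, 8.0) for t in times_h]
```

With one short and one long time constant, such a curve drops steeply within the first half hour and then continues to fall slowly; whether measured data require two components or only one would have to be decided by fitting both forms.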

4 Specificity for visual field position and for the eye used for learning
The results of the first experiment indicate that perceptual learning of vernier acuity
operates on a level in the visual system where neurons are orientation specific, since,
otherwise, transfer of training effects would be expected between orientations. The
next question is whether the learning is also specific for the visual field position where
the stimulus was learned and for the eye used during monocular learning.
4.1 Methods
As in the previous experiment, stimuli appeared on a CRT screen under computer
control. Presentation time, stimulus size, luminance, contrast, and viewing distance
were as before. A new group of eight students participated in the first part of the
experiment, in which specificity for visual field position was tested. The main difference from the previous experiment was that fixation was not on the stimulus but on a
fixation point located at different positions around the monitor, such that stimuli were
presented for 150 ms (too short for voluntary saccades to the stimulus) at an
eccentricity of 10 deg. A video camera and display monitored observers' stability of
fixation. In eight observers, eight visual field positions of equal eccentricity, as indicated in figure 3, were tested in fixed order. During the first four changes of visual
field position, stimulus orientation changed simultaneously with position, ie when a
new visual field position was tested, the stimulus not only moved to this new position
but also changed orientation by 90°. The later transitions, however, were pure
changes of position with constant orientation. Vernier offsets varied between 50 and
90 s arc for the peripheral tests, but were constant for each individual observer.
Percentages of incorrect responses, rather than thresholds, were measured for a fixed
vernier offset that was slightly below the initial threshold for the individual observer
and visual field location.
In the second part of the experiment, fixation was again to the stimulus, ie central.
To investigate whether learning of a standard line vernier in the centre of gaze was
monocular, four observers were trained on vernier discrimination with the right eye
for 5 h and were then tested with the left eye; the sequence was reversed for the
remaining four observers. Here, thresholds rather than percentages of correct responses
were measured.
4.2 Results
When the fixation point was above the monitor, the lower visual field was tested
(positions 1, 2, and 8 in figure 3a), whereas the upper visual field was tested during
fixation on positions 4-6. At most visual field positions, mean performance of all
observers improved during the 1 h periods during which this eccentric visual field
position was used for training (figure 3b). Performance improved, on average, by
7.09% (standard error 1.4%; p = 0.0002, paired t-test), but decreased by 7.1%
(standard error 3.4%) at the transition to a new visual field position. This decrease of
performance was significant both for all transitions (p = 0.03, paired t-test) and for
the last three transitions in figure 3, where stimulus orientation was constant. Further
experiments with a pseudorandom order of testing at constant stimulus orientation
(Fahle et al 1994) fully confirmed the specificity of learning for visual field position.
The results shown in figure 3, moreover, clearly indicate that performance is better in
the lower than in the upper visual hemifield.
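The paired t-tests reported above compare each observer's performance before and after a change. A minimal sketch of the statistic follows; the function name and the example numbers are made up for illustration and are not the observers' data.

```python
import math


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired samples."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance of the differences
    se = math.sqrt(var / n)                                # standard error of the mean difference
    if se == 0.0:
        raise ValueError("all differences are identical; t is undefined")
    return mean / se, n - 1


# Hypothetical percentages of incorrect responses for four observers.
t_value, dof = paired_t([20, 25, 30, 22], [15, 20, 22, 18])
```

The p value then follows from the t distribution with the returned degrees of freedom, which a statistics library or table would supply.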
When testing was monocular, with another four observers being trained with the
right eye and four with the left eye, thresholds improved as a result of training (figure 4).
But when testing was with the opposite eye, performance returned to pretraining
levels or even above. The results show the same overshoot that was present in long-term learning after change of orientation. Learning in the second eye did not interfere
with learning in the first eye: a retest through the first eye after learning through the
second eye showed neither significant decrease nor increase of performance.
[Figure 3: plots not reproduced; panel (a) horizontal axis: positions of fixation point; panel (b) categories: position constant, position change]
Figure 3. (a) Performance of eight inexperienced observers. Means and standard errors (vertical
bars) of percentage of incorrect responses for eight different positions in the visual field. The
fixation point moved to a new position after every hour of training. The positions of the fixation points in relation to the screen bearing the stimulus are shown at the top of the figure, and
the corresponding numbers are indicated below the data points. For nearly all positions,
performance improved during the training, but deteriorated after the change of position. For
the first four changes of positions, as marked by the heavy dashed line, orientation changed
simultaneously with position. Positions in the upper visual field (fixation below stimulus, ie
positions 4, 5, and 6) tended to yield poorer results than positions in the lower visual field. The
first position was retested at the end of the experiment. (b) Average change of performance for
individual observers and mean change and standard error (vertical bars) during training at each
of the eight positions in the visual field ('position constant'), and at the transition between visual
field positions ('position change'). Mean improvement within positions was 7.1% ± 1.4%, and
mean increase of errors at change of position was 7.1% ± 3.4%.

[Figure 4: plot not reproduced; sections labelled by the eye tested; horizontal axis: Time/h]
Figure 4. Transfer of learning between the two eyes. Monocular thresholds were tested in
another eight observers for 10 h per observer. Four observers started training with the left eye
and four with the right eye. Testing was with the opposite eye; its start is indicated by the
broken vertical line. Data points are means for eight subjects, and vertical bars indicate standard
errors. The results do not show any transfer of learning between the eyes, but an overshoot of
thresholds similar to that after the transition between orientations.
4.3 Discussion
The results indicate that perceptual learning of hyperacuity is specific for the visual
field position and for the eye used during learning. These findings, together with the
orientation specificity of learning as present in the first experiment, effectively
constrain the possible localisation, in the visual system, of perceptual learning. The
orientation specificity we found requires that the neurons that learn are orientation
specific, since neurons that are not orientation specific would be trained by all orientations and learning would transfer between orientations. On the other hand, the fact
that learning is mostly eye specific suggests that the neurons that learn are mostly
monocular. (This result is by no means trivial since most other learning results,
which probably concern more complex functions, show transfer between the eyes, as
outlined in the introduction.) Last, the position specificity of learning indicates that
the underlying neuronal processes occur in a cortical area where position invariance
has not yet been achieved. These results suggest area V1 as the most probable
candidate for learning of visual hyperacuity: neurons there are orientation specific,
retinotopically organised, and, at least in layer 4, mostly monocular.
5 Learning in motion perception and a physiological correlate
Training of vernier acuity considerably improves perceptual thresholds, and we have
just speculated about a possible location in the brain where this learning might occur.
One possible way to find out more about the neuronal mechanisms underlying perceptual learning would be to use electrophysiological methods, ie, in humans, visually
evoked cortical potentials. It is indeed possible to evoke cortical potentials by introducing vernier breaks in previously straight lines (Steinman et al 1985). We used similar
stimuli to investigate the neuronal mechanisms of perceptual learning in humans.

5.1 Methods
The stimulus was a straight line consisting of three elements replicated five times, as
indicated in figure 6c. The middle portions of these lines were displaced in one step
either to the left or to the right. Thresholds for the discrimination between jump displacements to the right and those to the left were measured by means of the usual
adaptive-staircase procedure, and percentages of incorrect responses were subsequently
measured as a function of training for a constant jump displacement roughly corresponding to the detection threshold. Five observers started with this vertical stimulus
orientation and five started with a similar stimulus, but in a horizontal orientation, ie
jump displacements either up or down. Block size was 100 presentations of 150 ms
each. After around 30 min stimuli were rotated by 90°. Therefore each data point in
the graph relies upon identical numbers of horizontal and vertical stimulus presentations. Stimulus luminance and contrast corresponded to the ones in the preceding
experiments. No feedback about the observer's response was provided.
In order to evoke cortical potentials sufficiently large to be clearly identified and
compared before and after training, the same stimulus was used for the electrophysiological experiments. We recorded scalp potentials at sixteen positions over the occipital
skull in another ten naive observers. Observers fixated the monitor while their brain
potentials were recorded. The middle portions of the five parallel vernier targets
jumped to the right (or up) and back at a frequency of 0.8 Hz. The responses evoked
both by the jump displacement and by the jump back were averaged over blocks of
600 cycles each. Short breaks were made after each 100 cycles. After 1200 cycles,
stimuli were rotated by 90°, either from vertical to horizontal or vice versa. Jump
size was constant at 45. We analysed the evoked potentials of all sixteen positions
with regard to amplitude and latency of the P100 component as well as to the spatial
distribution of the potentials over time. Means of the first 600 responses were
compared with the means of the second 600 responses for both stimulus orientations.
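Averaging over many stimulus cycles is what makes such small potentials measurable: activity that is not time-locked to the stimulus tends to cancel, while the evoked response remains. The sketch below averages epochs sample by sample and finds the largest value within a latency window; the function names, the window, and the tiny synthetic epochs are assumptions for illustration and do not describe the recording pipeline used in the study.

```python
def average_epochs(epochs):
    """Sample-by-sample mean of equally long epochs (lists of voltages)."""
    length = len(epochs[0])
    if any(len(e) != length for e in epochs):
        raise ValueError("all epochs must have the same length")
    n = len(epochs)
    return [sum(e[i] for e in epochs) / n for i in range(length)]


def peak_in_window(trace, sample_interval_ms, start_ms, end_ms):
    """Latency (ms) and amplitude of the largest value within [start_ms, end_ms]."""
    first = int(start_ms / sample_interval_ms)
    last = min(int(end_ms / sample_interval_ms), len(trace) - 1)
    best = max(range(first, last + 1), key=lambda i: trace[i])
    return best * sample_interval_ms, trace[best]


# Two synthetic epochs sampled every 50 ms (hypothetical values, arbitrary units).
epochs = [[0, 1, 3, 1, 0],
          [0, -1, 5, 1, 0]]
mean_trace = average_epochs(epochs)
latency_ms, amplitude = peak_in_window(mean_trace, 50, 50, 200)
```

Comparing the latency and amplitude obtained from the first block of responses with those from the second block, for each electrode, is the kind of before-versus-after comparison described above.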
5.2 Results
The psychophysical experiment yields results that agree well with those on orientation
dependence of perceptual learning in vernier acuity. The number of incorrect
responses decreased on average by almost 10% within less than 30 min. After rotation of the stimulus by 90°, the number of incorrect responses increased to more than
pretraining levels, and decreased continuously by more than 10% as a function of
training thereafter. Training to the second orientation did not interfere with performance in the stimulus orientation trained first, as is obvious from the rightmost data
point of figure 5. The improvement of performance during both the first and the
second half of the experiment is significant (p = 0.05, analysis of variance). Also, the
increase in incorrect responses after the change of orientation is highly significant
(p = 0.005, paired t-test; p = 0.02, Mann-Whitney U-test).
The electrophysiological experiment showed a significant decrease in the latencies
of the so-called P100 component of the visually evoked response from 117 to 104 ms
for the vertical stimuli and from 125 to 115 ms for the horizontal stimuli. These
differences are significant at the level p = 0.025 and p = 0.02, respectively. At the
same time, mean amplitudes of the P100 increased as a function of learning
(figure 6a). However, the most significant change caused by the training concerns the
spatial distribution of potentials over the occipital pole. There were highly significant
differences in this distribution between the first and second 600 stimulus presentations (figure 6b) for several latencies, especially at around 80 ms after stimulus onset
above V1, as well as at 250 ms over more temporal and parietal areas (p = 0.01,
t-test; cf Fahle and Skrandies 1994).

[Figure 5: plot not reproduced; horizontal axis: Time/h, from 0.00 to 1.25]
Figure 5. Lack of transfer of learning between different stimulus orientations in a jump-displacement task. Five observers were trained for 30 min with a vertical stimulus orientation,
another four observers with a horizontal orientation. Stimuli were rotated by 90° (as indicated
by the broken vertical lines) after 30 min, and testing and learning of the new orientation
followed. Performance decreased significantly after the transition between orientations. The
rightmost data point represents a return to the original stimulus orientation (after Fahle and
Skrandies 1994).
5.3 Discussion
The orientation specificity of perceptual learning found previously for vernier stimuli
occurs also in a jump-displacement task. The same orientation specificity is also found
for stereoscopic depth perception both with random-dot stereograms (Ramachandran
and Braddick 1973) and for two-dot stimuli (Fahle et al 1994). All these results
corroborate the hypothesis that perceptual learning can indeed occur at a relatively
'early' level of visual information processing, where rotation invariance has not yet
been achieved. It is reassuring that learning of the stimulus in a new orientation (or
with the partner eye, see above), does not interfere with performance in the previously learned orientation (or eye). Therefore, the improvement in performance
cannot be caused by short-term allocation of 'neuronal resources' (pools of cells) to
one task or the other. Quite to the contrary, the specific task seems to be learned.
Unsystematic pilot studies on the long-term behaviour of perceptual learning
completely agree with this view: three observers who were retested
more than 1 year after the experiment proper achieved a performance close to the
level they had reached at the end of the experiment, much better than the pretraining
levels.
The electrophysiological experiment demonstrates, to my knowledge for the first time,
a direct, objective correlate of perceptual learning in humans. The
cortical potentials are evoked by a displacement close to the hyperacuity range, below
the size corresponding to a Snellen acuity of 20/20. As is to be expected, such a small
displacement evokes rather small cortical potentials. Nevertheless, there was a significant decrease in latencies and an increase in amplitudes of the evoked responses.
These changes are opposite to those one would expect as a result of habituation
and fatigue: those processes would increase latencies and decrease amplitudes.
The change of distribution of potential over the occipital pole is another, highly
significant correlate of perceptual learning. It is reassuring that the only significant


[Figure 6. Panels (a), (b), (c); axis: Latency/ms; label: After training.]
Figure 6. (a) Amplitudes of cortical potentials evoked by a jump-displacement stimulus are
higher after training (lower trace) than before training (upper trace) whereas latencies decrease.
(b) The distribution of potentials over the occipital pole changes significantly between before
and after training. The numbers at the top left of each distribution refer to the latency, in ms.
Black areas indicate negative potentials, white areas positive potentials. Isopotential lines indicate steps of 0.1 µV each. (c) Schematic view of the stimulus configuration: left, lines straight;
right, lines offset.


difference between the distribution of potential resulting from the first versus the second
half of presentations is localised, for latencies below 100 ms, over the primary visual
cortex (V1), and that differences over other brain areas occur only after longer latencies.
Hence, the electrophysiological results are compatible with the hypothesis that
specific perceptual learning might occur in the primary visual cortex, but indicate that
there might be additional changes in other, 'higher' brain areas, though analysis of
evoked potentials through the skull is not a safe way to localise activity in the brain.
In the model of visual object recognition and perceptual learning based on radial
basis functions that was outlined above, no assumptions are made regarding the exact
nature of the neuronal mechanisms that operate during learning. Several theories
aim to explain these mechanisms; they are based, for example,
on a basic assumption put forward by Hebb (1949; cf von der Malsburg and Singer
1988; Palm 1982). Hebb postulated that the effectiveness of a synapse is increased
every time it is able to activate the postsynaptic neuron. Neurons in the visual cortex
receive inputs not only from the eye (via the lateral geniculate body) but mostly from
other cortical and subcortical neurons. Therefore, it is possible to increase the probability that an input activates a neuron by increasing the simultaneous input to this
neuron from other parts of the brain. The repeated presentation of the stimulus (while
it is attended to) might increase the effectiveness of the synapses for this specific
visual input. The increased effectiveness of synapses as a result of visual training
might lead to an increase in the amplitude of the evoked responses and a decrease in
the latency of spiking, but, of course, other mechanisms are conceivable.
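The Hebbian account given in the preceding paragraph can be illustrated with a toy simulation. The following Python sketch is not part of the model or of the experiments reported here: the function name hebbian_update, the two-synapse unit, the threshold, the learning rate, and all numerical values are assumptions chosen only for illustration. It shows how simultaneous input from a second source allows a weak visual synapse to take part in firing the unit, so that the synapse is strengthened until the visual input alone suffices.

```python
def hebbian_update(weights, inputs, threshold=1.0, rate=0.1):
    """Return (new_weights, fired) after one stimulus presentation.

    The postsynaptic unit fires when the weighted sum of its inputs
    reaches the threshold. Only when it fires are the synapses that
    were active during the presentation strengthened.
    """
    drive = sum(w * x for w, x in zip(weights, inputs))
    fired = drive >= threshold
    if not fired:
        return list(weights), False
    return [w + rate * x for w, x in zip(weights, inputs)], True


# Synapse 0 carries the visual input; synapse 1 carries simultaneous input
# from other parts of the brain (illustrative values, not measured ones).
weights = [0.4, 0.8]

# Visual input alone stays below threshold, so nothing is learned.
weights, fired_alone = hebbian_update(weights, [1.0, 0.0])

# Paired with the second input, the unit fires and both active synapses grow.
for _ in range(7):
    weights, _ = hebbian_update(weights, [1.0, 1.0])

# After repeated pairing, the visual input can drive the unit on its own.
_, fired_after = hebbian_update(weights, [1.0, 0.0])
```

In this toy setting the visual synapse grows from 0.4 to about 1.1 over seven paired presentations, which is enough to cross the assumed threshold of 1.0 without the additional input.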
6 Parallel processing of vernier acuity
To test the hypothesis that vernier breaks are detected 'early' during pattern recognition I measured reaction times for the detection of a vernier target as a function of
the number of straight stimuli presented simultaneously. It is generally assumed that
those features that are detected at a constant reaction time irrespective of the number
of distractors presented simultaneously can be processed in parallel over the visual
field. Examples of such elementary features are colour, brightness, line orientation
and length, as well as line terminators (Julesz 1981, 1984; Treisman and Gormican
1988). Obviously, it requires a high number of cortical neurons to process a given
feature in parallel over the visual field. While the parallel versus serial discrimination
of relations between figures might be far less straightforward than originally thought
(Humphreys et al 1994), the concept of parallel versus serial processing seems to be
still valid for single features as opposed to relations between features. Hence, one
might speculate that only those features can be processed in parallel that are of
prominent importance for visual pattern recognition and that represent the elementary
building blocks of visual perception, extracted at the early stage of visual pattern
processing in the human brain. If vernier breaks could indeed be detected in parallel
at different locations in the visual field, this would indicate that deviation from
straightness is one of these elementary features for visual pattern recognition, probably
detected at one of the first stages of cortical pattern analysis.
6.1 Methods
Vernier stimuli were presented on the same experimental setup as before. Between 2
and 16 stimuli were presented simultaneously at an eccentricity of 4.5 deg. A central
cross served as a fixation aid. Each of the stimuli was 2.5 min wide and 85 min high,
except for observer UK (41 min), and vernier offset was 5 min, slightly above two-point resolution at 4.5 deg (Levi et al 1985; Westheimer 1982). Stimulus luminance
was 450 cd m⁻², background luminance was 20 cd m⁻², and observation distance was
0.5 m. The presentation of each stimulus ended when the observer responded.
Acoustic feedback followed after incorrect responses. In part of the experiment, an
eye tracker monitored eye position to discriminate whether the subjects fixated or
scanned the stimuli.
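To relate the angular stimulus dimensions given above to physical sizes on the display, the conversion can be sketched as follows. The sketch assumes a flat screen viewed perpendicularly at the stated observation distance of 0.5 m, and it assumes that 'min' denotes minutes of arc; the function name extent_mm is hypothetical.

```python
import math


def extent_mm(angle_arcmin, distance_mm=500.0):
    """Physical extent (mm) subtending a given visual angle.

    Uses size = 2 * d * tan(angle / 2) for an object centred on the
    line of sight at viewing distance d.
    """
    half_angle = math.radians(angle_arcmin / 60.0) / 2.0
    return 2.0 * distance_mm * math.tan(half_angle)


offset_mm = extent_mm(5.0)    # vernier offset of 5 min
width_mm = extent_mm(2.5)     # line width of 2.5 min
height_mm = extent_mm(85.0)   # line height of 85 min
```

Under these assumptions the 5 min offset corresponds to roughly 0.73 mm on the screen and the 85 min line height to roughly 12.4 mm.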
In a two-alternative forced-choice task, three basic conditions were tested:
(a) identification of the offset target among straight distractors; (b) identification of
the straight target among offset distractors; (c) identification of the target offset in the
direction opposite to that of the distractors. Only half of the presentations contained
a target. The observers had to indicate whether or not a target was present in the
display and to push the appropriate one of two push-buttons. Reaction times represent the average of both positive (target present) and negative (target absent)
presentations and are based on at least 120 responses per condition and observer.
Before the experiments proper, the four observers underwent an ophthalmological
examination as well as a training period with more than 1000 stimulus presentations.
Three of the observers had previously participated in similar experiments. All
observers had normal or corrected-to-normal visual acuity, and, with the exception of
the author, were unaware of the purpose of the experiment.
6.2 Results
Detection of a single offset vernier was almost independent of the number of distractors: reaction times were nearly constant for up to at least eight stimulus elements
presented simultaneously (figure 7a; slope 9.5 ± 3.5 ms per distractor). Even if the
size of the vertical gap of the verniers varied randomly by up to 5 min arc, precluding
the detection of the offset target on the basis of its larger gap (figure 7b), reaction
times increased hardly at all with the number of distractors, and the same was true for
the detection of a vernier offset to the left among distractors offset to the right, at
least for constant orientation of the stimuli (figure 7d). On the other hand, reaction
times increased sharply if observers had to find a straight line among offset distractors (figure 7c). Reaction times tended to be shorter for presentations with a target
than for those without target.
We repeated the experiments, varying the orientation of the stimuli independently
and at random by up to 20°, to investigate a possible influence of the implicit orientation information that is present in an offset vernier (right-hand column of figure 7).
Detection of a single vernier among distractors was almost as fast as at fixed orientation. Detecting a target offset in a direction opposite to that of the distractors, on the
other hand, was almost independent of the number of distractors at fixed vertical
orientation (figure 7d), but reaction times increased dramatically with the
number of distractors if orientation varied.
6.3 Discussion
Reaction times for the detection of a single vernier target among straight distractors
increased by between 1.5 and 9.5 ms per distractor. This is far less than the 30 to
50 ms required for serial search (Jonides 1983; Krose and Julesz 1989; Treisman and
Gormican 1988), and lies well within the range accepted for parallel search. Moreover,
a vernier target was detected in a 150 ms presentation among straight distractors,
even if the vernier offset was smaller than the diameter of the foveal photoreceptors
(Fahle 1990; 1991). Detection of a vernier offset in a 150 ms flash at 0.2 deg
eccentricity often required a minimal displacement between 20 and 30 s arc even if a
mask followed immediately after the stimulus presentation (Fahle 1991). To obtain
such performance, observers had to test whether each of the stimulus elements was
straight or offset. The correct decision could be taken even after presentation times
that were too short for a serial search. Instead, all stimuli seemed to be probed in
parallel.
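The comparison between the measured increase per distractor and the values reported for serial search amounts to estimating the slope of reaction time against the number of elements. A minimal sketch of such an analysis is given below. The reaction times are invented for illustration and are not data from this study, the function names are hypothetical, and the single cut-off at 30 ms per element is a simplifying assumption taken from the lower end of the range quoted above.

```python
def search_slope(set_sizes, reaction_times_ms):
    """Least-squares slope of reaction time against set size (ms per element)."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(reaction_times_ms) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(set_sizes, reaction_times_ms))
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    return sxy / sxx


def classify(slope_ms, serial_min_ms=30.0):
    """Label a slope as 'serial' if it reaches the assumed cut-off."""
    return "serial" if slope_ms >= serial_min_ms else "parallel"


set_sizes = [2, 4, 8, 16]

# Hypothetical reaction times (ms): one shallow and one steep search function.
flat_slope = search_slope(set_sizes, [600, 610, 640, 700])
steep_slope = search_slope(set_sizes, [600, 700, 900, 1300])

flat_label = classify(flat_slope)
steep_label = classify(steep_slope)
```

With these invented values the shallow function has a slope of about 7 ms per element and is labelled parallel, whereas the steep function rises by 50 ms per element and is labelled serial.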


[Figure 7. Panels (a) to (d) in two columns; observers MF, HW, AH, UK; horizontal axis: Number of targets (2, 3, 4, 6, 8, 12, 16).]

Figure 7. Parallel processing of vernier stimuli: reaction times for four subjects as a function of
the number of elements (distractors or target plus distractors) presented simultaneously. (a) One
offset target may be hidden among straight distractors. (b) One offset target may be present among
straight distractors, as in (a), and vertical gap size varies by up to 5 min. (c) One straight target
may be hidden among offset distractors. (d) The target is offset in a direction opposite to that
of the distractors. Left-hand column: reaction times for vertical orientation of stimuli. Right-hand column: results for variable orientation of stimuli (after Fahle 1991). T target, D distractor.


It has been previously reported that an orientation cue is used in the detection of
vernier offsets (Andrews et al 1973; Watt et al 1983; Watt and Campbell 1985). But
detection was parallel in our experiment even if no absolute orientation cues were
available. With these stimuli of variable orientation, the underlying neuronal mechanism could not use the implicit orientation difference between a straight stimulus and
an offset one since absolute orientations varied. Moreover, the gap sizes of all stimuli
varied independently, hence the terminators present in an offset vernier could not be
used as the discriminating feature. I conclude that the feature which discriminates
between offset stimuli and straight ones must rely on deviation from straightness, not
on an absolute orientation cue (cf Treisman and Gormican 1988). The claim that
deviation from straightness is an elementary feature of visual perception was further
supported by the finding that a figure can immediately be discriminated from its
surround if the elements of the figure are bent whereas the elements of the surround
are straight, ie the figure 'pops out' (Wolfe et al 1992).
The detection of a vernier target offset in one direction among distractors offset in
the opposite direction can be made in parallel only as long as all stimuli share a
common orientation. When orientation varies, reaction times for this task, which might
seem as easy as the detection of an offset, increase steeply with the number of distractors. The results
show that if the orientation cue is masked by variable orientations of all stimulus
elements, detection of the target with the opposite offset requires serial search.
7 Conclusions
I found that vernier offsets are detected in parallel at different positions of the visual
field and that deviation from straightness probably represents an elementary feature
of vision. Improvement in detecting a vernier offset is stimulus specific: performance
for vernier acuity, three-dot acuity, and a jump-displacement task improves as a function of training. Since this improvement depends on personal experience, it is called
learning. Two phases of learning can be discriminated, a fast one in the minute range
and a slow phase that continues at least over 10 h. Learning is specific for the orientation of the stimulus, for its position in the visual field, as well as for the eye used
during the training phase. This early perceptual learning is a prerequisite for a recent
model of human visual pattern recognition. In this model it is assumed that pattern
recognition might be achieved by some form of 'fuzzy' template matching. This is to
say that object recognition is considered as a process much more based on memory
and less on computation than previously thought. If the model were true, perceptual
learning would be a central factor for pattern recognition even in adults: only those
objects can be recognised that are similar to objects that have been seen previously
from various angles and whose views have been stored in memory. It is important to
realise in this context that the perceptual task tested here, namely to detect a break in
a straight line, is one that humans are faced with in everyday situations and that might
be a first step in the process of analysing visual patterns. Therefore, it is not surprising that even 'naive' observers will have undergone some training during their lives
and that the effects of learning during the experiment are both slow and relatively
moderate in extent. Informal pilot studies, as well as eg the results of Fiorentini and
Berardi (1980, 1981), suggest that the extent and speed of perceptual learning
increase for unfamiliar tasks such as the discrimination of phase relations in complex
grating patterns. Moreover, many studies show that visual performance of children
improves considerably with age (eg Zanker et al 1992) and that this improvement
requires visual experience, as is evident from deprivation studies. The results
reviewed here, on the other hand, show that learning is possible even for relatively
familiar features. It is surprising that learning occurs for a feature that is detected in
parallel, hence is an elementary feature of vision, and that is extracted during the first
steps of visual pattern recognition, while cortical areas such as V1 have often been
considered to be relatively 'hard wired' in adults. The present results suggest that
area V1 might be more modifiable even in adults than was previously assumed.
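The notion of 'fuzzy' template matching invoked in the paragraph above, and its link to the orientation specificity of learning, can be made concrete with a small sketch. The code below computes a weighted sum of Gaussian radial basis functions centred on stored templates, with the coefficients simply set to the stored labels rather than learned; this is a simplification for illustration and not the networks described by Poggio and Girosi (1990) or the simulations reported earlier in this paper. The two-dimensional feature space (offset, orientation), the template values, the width sigma, and the function name rbf_response are all assumptions.

```python
import math


def rbf_response(stimulus, templates, labels, sigma=1.0):
    """Sum of Gaussian basis functions centred on the stored templates.

    Each template contributes its label, weighted by a Gaussian that
    falls off with the squared distance between stimulus and template.
    """
    response = 0.0
    for template, label in zip(templates, labels):
        dist_sq = sum((s - t) ** 2 for s, t in zip(stimulus, template))
        response += label * math.exp(-dist_sq / (2.0 * sigma ** 2))
    return response


# Hypothetical stored views: (offset, orientation) in arbitrary units,
# both learned at orientation 0. Label -1 = offset left, +1 = offset right.
templates = [(-1.0, 0.0), (1.0, 0.0)]
labels = [-1.0, 1.0]

# A rightward offset at the trained orientation gives a clear positive signal.
trained = rbf_response((0.5, 0.0), templates, labels)

# The same offset at a very different orientation is far from every stored
# template, so the response is close to zero and carries little information.
rotated = rbf_response((0.5, 5.0), templates, labels)
```

In this sketch a stimulus is discriminated only when it lies close to a stored template, so a change of orientation removes the benefit of the stored views, in line with the orientation specificity of learning described above.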
Cortical potentials evoked by the jump displacement of parallel vernier lines had
significantly shorter latencies and larger amplitudes after training than before training.
At the same time, the distribution of potentials over the posterior pole of the brain
changed in a highly significant way. The results both of psychophysical and of electrophysiological experiments can be taken as evidence that some form of perceptual
learning might occur very early during cortical processing. A next step will be to
repeat the behavioural and especially the electrophysiological experiments in
monkeys, where a direct recording from the surface of the cortex is possible, with a
much better signal-to-noise ratio than is possible in recordings from the skull of
humans. The final hope is to detect the neuronal basis of perceptual learning on a
cellular level in the visual cortex of mammals.
Acknowledgements. This research was supported by the Deutsche Forschungsgemeinschaft
(Fa 119/5-2; SFB 307, TP A6), the von Humboldt Society, and the Max-Planck Society. I wish
to thank all the observers for their participation, Mrs H Weller for technical and secretarial
help, and Dipl. Ing. M Repnow for writing the computer programs. Part of this work has been
published as part of the four publications indicated in the captions of figures 1, 2, 5, and 7.
References
Ahissar M, Hochstein S, 1993 "Attentional control of early perceptual learning" Proceedings of the National Academy of Sciences of the United States of America 90 5718-5722
Andrews D P, Butcher A K, Buckley B R, 1973 "Acuities for spatial arrangement in line figures: human and ideal observer compared" Vision Research 13 599-620
Ball K, Sekuler R, 1987 "Direction-specific improvement in motion discrimination" Vision Research 27 953-965
Bennett R G, Westheimer G, 1991 "The effect of training on visual alignment discrimination and grating resolution" Perception & Psychophysics 49 541-546
Boucart M, Delord S, Giersch A, 1994 "The computation of contour information in complex objects" Perception 23 399-409
Dill M, Wolf R, Heisenberg M, 1993 "Visual pattern recognition in Drosophila involves retinotopic matching" Nature (London) 365 751-753
Fahle M, 1990 "Parallel, semi-parallel, and serial processing of visual hyperacuity" in Human Vision and Electronic Imaging: Models, Methods, and Applications, SPIE 1249 147-159
Fahle M, 1991 "Parallel perception of vernier offsets, curvature, and chevrons in humans" Vision Research 31 2149-2184
Fahle M, Edelman S, 1993 "Long term learning in vernier acuity: Effects of stimulus orientation, range and of feedback" Vision Research 33 397-412
Fahle M, Skrandies W, 1994 "An electrophysiological correlate of learning in motion perception" German Journal of Ophthalmology in the press
Fahle M, Edelman S, Poggio T, 1994 "Short-term learning in vernier acuity" Vision Research (submitted)
Fendick M, Westheimer G, 1983 "Effects of practice and the separation of test targets on foveal and peripheral stereoacuity" Vision Research 23 145-150
Fiorentini A, Berardi N, 1980 "Perceptual learning specific for orientation and spatial frequency" Nature (London) 287 43-44
Fiorentini A, Berardi N, 1981 "Learning in grating waveform discrimination: Specificity for orientation and spatial frequency" Vision Research 21 1149-1158
Hebb D O, 1949 Organization of Behavior (New York: John Wiley)
Humphreys G W, Keulers N, Donnelly N, 1994 "Parallel visual coding in three dimensions" Perception 23 453-470
Jonides J, 1983 "Further toward a model of the mind's eye's movement" Bulletin of the Psychonomic Society 21 247-250
Julesz B, 1981 "Textons, the elements of texture perception, and their interactions" Nature (London) 290 91-97
Julesz B, 1984 "A brief outline of the texton theory of human vision" Trends in Neurosciences 7 41-45
Karni A, Sagi D, 1991 "Where practice makes perfect in texture discrimination: Evidence for primary visual cortex plasticity" Proceedings of the National Academy of Sciences of the United States of America 88 4966-4970
Krose B J A, Julesz B, 1989 "The control and speed of shifts of attention" Vision Research 29 1607-1619
Levi D M, Klein S A, Aitsebaomo P, 1985 "Vernier acuity, crowding and cortical magnification" Vision Research 25 963-977
McKee S P, Westheimer G, 1978 "Improvement in vernier acuity with practice" Perception & Psychophysics 24 258-262
Malsburg C von der, Singer W, 1988 "Principles of cortical network organization" in Neurobiology of Neocortex; Dahlem Konferenzen Eds P Rakic, W Singer (New York: John Wiley) pp 69-99
Palm G, 1982 Neural Assemblies (Berlin: Springer)
Poggio T, 1990 "A theory of how the brain might work" in Cold Spring Harbor Symposia on Quantitative Biology LV 899-910
Poggio T, Girosi F, 1990 "Regularization algorithms for learning that are equivalent to multilayer networks" Science 247 978-982
Poggio T, Edelman S, Fahle M, 1992a "Learning of visual modules from examples: A framework for understanding adaptive visual performance" Computer Vision, Graphics & Image Processing: Image Understanding 56 22-30
Poggio T, Fahle M, Edelman S, 1992b "Fast perceptual learning in visual hyperacuity" Science 256 1018-1021
Ramachandran V S, Braddick O, 1973 "Orientation-specific learning in stereopsis" Perception 2 371-376
Sagi D, Karni A, 1993 "The time course of learning a visual skill" Nature (London) 365 250-252
Sagi D, Polat U, 1992 "Perceptual learning increases the range of inhibitory connections between spatial filters" Perception 21 Supplement 2, 69
Shiu L-P, Pashler H, 1991 "Improvement in line orientation discrimination is retinally local but dependent on cognitive set" Investigative Ophthalmology and Visual Science 32 1041
Steinman S B, Levi D M, Klein S A, Manny R E, 1985 "Selectivity of the evoked potential for vernier offsets" Vision Research 25 951-961
Taylor M M, Creelman C D, 1967 "PEST: Efficient estimates on probability functions" The Journal of the Acoustical Society of America 41 782-787
Treisman A, Gormican S, 1988 "Feature analysis in early vision: Evidence from search asymmetries" Psychological Review 95 15-48
Van Gool L J, Moons T, Pauwels E, Wagemans J, 1994 "Invariance from the Euclidean geometer's perspective" Perception 23 547-561
Vogels R, Orban G A, 1985 "The effect of practice on the oblique effect in line orientation judgments" Vision Research 25 1679-1687
Watt R J, Campbell F W, 1985 "Vernier acuity: interactions between length effects and gaps when orientation cues are eliminated" Spatial Vision 1 31-38
Watt R J, Morgan M J, Ward R M, 1983 "The use of different cues in vernier acuity" Vision Research 23 991-995
Weiss Y, Edelman S, Fahle M, 1993 "Models of perceptual learning in vernier hyperacuity" Neural Computation 5 695-718
Westheimer G, 1982 "The spatial grain of the perifoveal field" Vision Research 22 157-162
Wolfe J M, Yee A, Friedman-Hill S R, 1992 "Curvature is a basic feature for visual search tasks" Perception 21 465-480
Zanker J, Mohn G, Weber U, Zeitler-Driess K, Fahle M, 1992 "The development of vernier acuity in human infants" Vision Research 32 1557-1564
