Perceptual Organization: Perfecto Herrera
Introductory sound examples
Perceptual Organization
Attention
Some terms
Source – the physical entity that gives rise to the sound
pressure waves (e.g. a violin being played)
Stream – the percept of a group of successive and/or
simultaneous sounds as a coherent whole appearing to come
from a single source (e.g., the brass section)
The sounds we hear at any one time usually come from a
number of different sources.
In most cases we can hear and identify each of the different
sound sources as having its own pitch, timbre, loudness and
location (stream = source). In other cases several sources are
processed as a single stream because their features are not
different enough to be heard as “distinct” (e.g., a string section).
In other, more exotic, cases a single source may yield several streams.
Auditory Scene Analysis
A computational theory of hearing is required, together with a
functional explanation of the information-processing
problems that the auditory system must solve in order
to make sense of the acoustic environment
A computational theory of hearing deals with the
questions: what is the purpose of hearing? What are
the constraints and regularities hearing can exploit?
Work in computer vision has benefited from a
computational theory since the late 1970s, thanks to
the work of David Marr
A similar foundation for hearing was developed by
Albert Bregman at McGill University in Montreal and is
known as auditory scene analysis
Auditory Scene Analysis
ASA can be conceptualized as a two-stage process:
1. The mixture of sounds is decomposed into a
collection of sensory elements (onsets, pitch
trajectories, modulations, spectral tracks, etc.)
2. Elements that are likely to have arisen from the
same event are grouped to form a perceptual
structure (stream) which can be interpreted by
higher centers in the brain
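The two stages above can be illustrated with a minimal sketch. It assumes stage 1 has already produced a list of sensory elements, written here by hand with hypothetical values, and it implements stage 2 with a single cue (common onset time). The names `Element`, `group_by_onset` and `tolerance_s` are illustrative assumptions, not part of any ASA model or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A sensory element from stage 1 (here: one spectral track)."""
    freq_hz: float  # centre frequency of the track
    onset_s: float  # time at which the track starts


def group_by_onset(elements, tolerance_s=0.03):
    """Toy stage 2: a track joins the current stream if its onset lies
    within tolerance_s of the first onset of that stream."""
    streams = []
    for el in sorted(elements, key=lambda e: e.onset_s):
        if streams and el.onset_s - streams[-1][0].onset_s <= tolerance_s:
            streams[-1].append(el)
        else:
            streams.append([el])
    return streams


# Stage 1 output, written by hand for illustration (hypothetical values).
elements = [
    Element(220.0, 0.00), Element(440.0, 0.01), Element(660.0, 0.00),
    Element(300.0, 0.50), Element(600.0, 0.51),
]

streams = group_by_onset(elements)
stream_freqs = [sorted(e.freq_hz for e in s) for s in streams]
print(stream_freqs)
```

Tracks that start together end up in the same stream. A fuller model would combine several cues and would extract the elements from the signal itself.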
Good Continuation
Disjoint Allocation (Belongingness)
Closure
Proximity
Top Down: Plastic, Learned (schema-driven)
[Figure: time-frequency plot. Axes: Frequency (Hz), 50 to 8000; Time (s), 0 to 2.5]
Some cues:
Fundamental Frequency and Spectral Regularity
Onset Timing
Sound Location
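As a rough illustration of the fundamental-frequency cue, the sketch below assigns each partial to the candidate fundamental whose harmonic series it fits best. It assumes the candidate fundamentals are already known, which a real system would have to estimate. The function name `group_by_f0`, the tolerance `rel_tol` and all frequency values are hypothetical.

```python
def group_by_f0(partials_hz, candidate_f0s_hz, rel_tol=0.03):
    """Assign each partial to the candidate F0 whose nearest harmonic is
    closest, provided the relative mistuning is below rel_tol."""
    groups = {f0: [] for f0 in candidate_f0s_hz}
    unassigned = []
    for p in partials_hz:
        best_f0, best_err = None, rel_tol
        for f0 in candidate_f0s_hz:
            n = max(1, round(p / f0))             # nearest harmonic number
            err = abs(p - n * f0) / (n * f0)      # relative mistuning
            if err < best_err:
                best_f0, best_err = f0, err
        if best_f0 is None:
            unassigned.append(p)                  # fits neither series
        else:
            groups[best_f0].append(p)
    return groups, unassigned


# Hypothetical mixture: two harmonic sources plus one stray partial.
partials = [200, 400, 600, 310, 620, 930, 1090]
groups, unassigned = group_by_f0(partials, candidate_f0s_hz=[200, 310])
print(groups)
print(unassigned)
```

Partials that share a fundamental are grouped together, and the partial that fits neither harmonic series is left ungrouped.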