Critical Bandwidth and Consonance

Hearing Research, 54 (1991) 209-246
© 1991 Elsevier Science Publishers B.V. 0378-5955/91/$03.50
HEARES 01588
Critical bandwidth and consonance: Their operational definitions in relation to cochlear nonlinearity and combination tones
Donald D. Greenwood
School of Audiology and Speech Sciences, University of British Columbia, Vancouver, B.C., Canada
(Received 3 October 1990; accepted 30 January 1991)
A recent paper (Greenwood, 1990) reviewed cochlear coordinates in several species in relation to empirical frequency-position
functions (Greenwood, 1961b, 1974b), one of which well fits the Békésy-Skarstein human cochlear map (Békésy, 1960; Kringlebotn
et al, 1979). This increased the independence of the human function from the psychoacoustic data originally used to construct it
and encouraged a second assessment of the relations of similar psychoacoustically significant bandwidths to distance and position
on the cochlear map. The companion paper (Greenwood, 1991, this issue) found that, among such bandwidths, ‘classical’ critical
bandwidth, and also ‘consonant interval’, estimates in man correspond to equal distances to a closer extent than generally
recognized, and over large parts of the frequency range they conform also to an exponential function of distance, as do most of the
ERB estimates. This correspondence to almost constant and similar distances facilitates, and forms a part of, an explanation of the
operational definitions of critical bandwidth in different experiments. The present account recapitulates the basic explanation of
critical bandwidth and consonance offered in Greenwood (1971, 1972b, 1973b, 1974b) and Greenwood et al. (1976): by adding
schematic details to the earlier account of critical bandwidth measurements in pure tone masking (the masker-notch interval),
two-tone masking, narrow-band masking, and two-tone dissonance-consonance judgments and by outlining its applicability to AM
and Quasi-FM detection and to two-band (nominally notched-noise) masking experiments. The measured bandwidths derive from
approximately uniform dimensions of traveling wave envelopes in the peak region and from the effects of the resulting spatial
pattern of nonlinear interference among primary components. In this account, critical bandwidth in man corresponds to a distance
of about 1 or 1.25 mm, depending upon the direction the interval projects from the stimulus frequency to which it is referenced. It
is identified with the apical segment of the traveling wave displacement envelope, which in guinea pig and squirrel monkey appears
to be about 2/3rds and 3/4ths of a millimeter, respectively, and would be about 1.25 mm in man if these distances were scaled
(Greenwood, 1962) among these three species (Greenwood, 1974b, 1977a). When reflected also in the basal direction, the upper
end of the frequency interval, at a 1.065 mm distance, makes a total two-critical-band distance, which corresponds with the region
of nonlinear input-output functions that extends in both directions from the envelope peak and hence also with the frequency-dis-
persive region of accelerated phase accumulation (Greenwood, 1974b, 1977a). Thus the critical bandwidths measured in these
experiments reflect a common pattern of nonlinear effects around the peak of the displacement envelope and are co-determined
(a) by combination tones that elicit the relevant and apically distributed neural effects determining the criterial experimental
outcomes (Greenwood, 1971, 1972b,c) and (b) by an asymmetrical gain control jointly controlled by the primary stimulus
components (Greenwood, 1986a,b,c; Greenwood, 1988), which imposes a relative disadvantage on the higher primaries (of a pair or
set), a ‘suppressive’ effect (Greenwood and Goldberg, 1970; Greenwood, 1971), which (a) strongly and negatively affects their direct
detectability when they serve as the ‘signals’ on which the experimenter may be focussed but on which the system is not, and (b)
gives to combination tones a more important role generally.
Critical bandwidth; Masking; Consonance; Cochlear coordinates; Nonlinearity; Combination tones; Gain control; Dominance;
Suppression
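
The link stated in the abstract between equal distances and an exponential function of distance can be made explicit with a short worked derivation. This is only a sketch: it assumes a frequency-position function of the exponential form published in Greenwood (1990), F(x) = A(10^{ax} - k), with x the distance from the apex in mm; the human constants A = 165.4 Hz, a = 0.06 per mm and k = 0.88 are taken from that paper and are not stated in this section. The symbols d_a and d_b (apical and basal distances) are introduced here for illustration.

```latex
\begin{align*}
F(x) &= A\left(10^{a x} - k\right) \\
\Delta f_{\mathrm{apical}}(x) &= F(x) - F(x - d_a) = A\,10^{a x}\left(1 - 10^{-a d_a}\right) \\
\Delta f_{\mathrm{basal}}(x)  &= F(x + d_b) - F(x) = A\,10^{a x}\left(10^{a d_b} - 1\right) \\
\text{With } a = 0.06\ \mathrm{mm^{-1}}:\quad
  & 1 - 10^{-0.06 \times 1.25} \approx 0.1586, \qquad 10^{0.06 \times 1.065} - 1 \approx 0.1585 \\
\text{Since } A\,10^{a x} = F + A k:\quad
  & \Delta f \approx 0.1586\,(F + A k)
\end{align*}
```

Under these assumptions a fixed distance yields a bandwidth proportional to 10^{ax}, that is, an exponential function of distance, and an apical distance of 1.25 mm subtends very nearly the same frequency interval as a basal distance of 1.065 mm. The constant k cancels in both differences, so the result does not depend on its exact value.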
Introduction: The equal-distance hypothesis

Correspondence to: Donald D. Greenwood, School of Audiology and Speech Sciences, 5804 Fairview Crescent, University
of British Columbia, Vancouver, B.C., Canada V6T 1W5.

The previous paper in this issue (Greenwood, 1991) has reported that within several sets of psychophysically
significant bandwidths the greater part of the estimates correspond to equal distances on Békésy’s cochlear map
(Békésy, 1960; Greenwood, 1961b), including the ‘classical’ critical bandwidths. They correspond to equal distances
to a closer extent than previously indicated and above about 300 or 400 Hz conform reasonably well to an
exponential function of distance (Greenwood, 1961b), as do most of the ‘consonant interval’ estimates from 100 Hz
to 3 kHz and ERB estimates from 100 Hz to 10 kHz. It will be argued here that the correspondence, within a given
set of bandwidths and in several instances across sets, to a common distance facilitates, and forms a part of, an
explanation of the operational definitions of critical bandwidth in several experiments. If critical bandwidth
corresponds to equal distances in the ‘spacing’ of cochlear events, that might depend, as noted in the previous
paper, upon a certain degree of constancy in the dimensions of displacement envelopes. A review of those
dimensions and of other evidence of a relation between critical bandwidth and mechanical data follows, succeeded
by an analysis of the critical band experiments that have provided the estimates corresponding to equal distances.

Comments on Critical Bandwidth Relative to Dimensions of the Displacement Envelope

Earlier papers have sought to associate critical bandwidth with the dimensions of the traveling wave envelope and
with the region of cochlear nonlinearity revealed in basilar membrane measurements (Greenwood, 1973a,b; 1974b;
1977a; 1991). With the former aim as a background purpose, a still earlier assessment was made (Greenwood, 1962)
of the dimensions of traveling wave displacement envelopes in three mammalian species and the chicken, which was
based on Békésy’s (1960) observations of cadaver cochleas. Frequency-position functions (Greenwood, 1961, 1990)
were used to obtain the approximate dimensions of displacement envelopes from sets of amplitude-response curves.

The dimensions of the envelopes in the region of the peaks, for example, of the apical segment extending from
envelope peak to foot, were rather large. However, the dimensions were about the same fraction of cochlear
partition length among the four species and over appreciable proportions of the cochleas from which they were
obtained. Although the data covered only the approximate apical halves of the cochleas, this approximate constancy
over that portion was an interesting observation, in part at least, since critical bandwidth had been hypothesized
to correspond to equal distances (Fletcher, 1953; Zwicker et al., 1957; Greenwood, 1961b) and was expected to be
related to frequency resolution. The equal-distance hypothesis, referred to earlier, and the further hypothesis
that critical bandwidth might be an exponential function of distance along the cochlea, had just received support
from some ‘direct’ measures of critical bandwidth and almost all estimates of the ‘smallest consonant interval’
(Greenwood, 1961b), which a second test has now confirmed and extended (Greenwood, 1991).

In 1961, no hypothesis as to what critical bandwidth in man might correspond to, either as an aspect of mechanical
or neural response, had been offered (Greenwood, 1961b), since no physical correlate of critical bandwidth was
then readily apparent as a candidate. Nevertheless, the apical segment was this author’s candidate for a potential
association with critical bandwidth even though this dimension of the displacement envelope was much larger in
dead cochleas than the 1 mm distance associated with critical bandwidth or similar bandwidths. However, in the
late ’60s and early ’70s data from living preparations began to appear (Johnstone and Boyle, 1967; Kohllöffel,
1971, 1972a,b,c; and Rhode, 1971) which indicated that the displacement envelope was much sharper, and hence the
apical segment was much shorter, than in dead cochleas. The apical segments of displacement envelopes extracted
from these data were found in guinea pig and squirrel monkey to correspond to distances of less than a millimeter
(Greenwood, 1973b, 1974b, 1977a, 1980).

In this same period, critical bandwidth had been associated with the low-frequency side of pure tone masked
audiograms (Greenwood, 1969/70, 1971, 1972a,b,c), defining a frequency interval between the masking tone and the
foot of the masked audiogram that corresponded to about 1.25 mm. Signals presented on the high
frequency side of the masking tone within an equal frequency interval, corresponding to about 1 mm on that side,
were detected because of the more audible effects of combination tones (created by signal and masker) whose
frequencies fell between the masking tone frequency and the low-frequency foot of the audiogram. If signal
frequency was more than about 1 mm above the masking tone, the combination tone 2f_1 - f_2 began to fall beyond
the audiogram foot and, at moderate to low masker levels, fairly quickly became too weak to play the role of
detected event (also Smoorenburg, 1972). These experiments are represented in Fig. 4 and in the preceding paper
(Greenwood, 1991) and will be discussed later.

It was suggested that the low-frequency side of the pure-tone masked audiogram represented the apical segment of
the displacement envelope, and an association was therefore also suggested between critical bandwidth, as a
distance, and the apical segment of the displacement envelope (Greenwood, 1971, 1974b). The length of the apical
segment in man was, of course, unmeasured and will remain so. However, as above noted, the apical segment had
appeared to be approximately scale-related among Békésy’s dead cochleas (Greenwood, 1962) if we assume them
‘comparably’ dead (see Kohllöffel, 1972a,b - on preparation ‘aging’). If it remained scaled among living species,
in man it would appear to be in the neighborhood of 1.25 mm, when scaled from the values of about 0.66 mm and at
least 0.7 to 0.75 mm in guinea pig and squirrel monkey, respectively (Greenwood, 1974b, 1977, 1980, 1991). This
distance of 1.25 mm was closely similar to the distance subtended by the corresponding portion of the masked
audiogram and hence by the critical bandwidth frequency interval when projected apically (towards low
frequencies) from the masking frequency.

Critical bandwidth in relation to the region of non-linear input-output functions

As indicated above, the identification of critical bandwidth in pure-tone masking with the low-frequency slope of
the audiogram was not exclusive. It was identified also with the distance in the basal (or high frequency)
direction from the peak of the audiogram to the low point of the notch that appears in such audiograms as a
consequence of combination tone detection in lieu of the signal, as discussed above and in the preceding companion
paper. When the signal is at this point, where is it with respect to the masking tone’s displacement envelope? We
now switch our attention back to displacement envelopes and the mechanical data to examine both sides of the peak
region of the amplitude response curve recorded at a point and its relation to the cumulative phase curve
associated with it.

In Rhode’s (1971) plots of basilar membrane amplitude at a single point versus stimulus tone frequency, the
frequency interval extending from the peak of the amplitude response curve to its high-frequency foot corresponds
to the apical segment of the displacement envelope. Of course, the same frequency interval extends also to the
start of the phase plateau. If we now determine what frequency interval extends basally from the peak of the
response curve to the frequency at the point of ‘inflection’ of the cumulative phase curve, we find it is again
about the same frequency interval, corresponding to a somewhat shorter cochlear distance than when projected in
the apical direction to the phase plateau. Fig. 1 illustrates the cumulative phase curve and its transformation to
a curve plotting the logarithm of the time interval equivalent to the observed phase lags (Greenwood, 1973a,b,
1977, 1980).

Figures like Fig. 1 (eleven in all) were plotted in order to extract the average dimensions of the region in these
data over which the traveling-wave cumulative phase curve began to accumulate more rapidly, following a different
function (i.e. over which its velocity decrease obeyed a stronger function) (Greenwood, 1974b, 1977a, 1980) and
also to determine the region over which nonlinear input-output amplitude functions were obtained. It has always
been evident from Rhode’s data that these regions are the same. The position of maximum amplitude (not shown in
Fig. 1) provides the division into basal and apical nonlinear segments. Fig. 2 illustrates the spatial equivalent
of this curve, with the division indicated, from the data of Kohllöffel (1972b), as fit by empirical equations
developed from Rhode’s squirrel monkey data (Greenwood, 1977a, 1980).
Fig. 3 illustrates the same two segments in the phase travel-time curve associated with a tone of 3.6 kHz in the
cat (Greenwood, 1977a) and plotted as a function of cochlear distance. In short, the distances subtended by the
basal and apical segments each convert to the same frequency interval; and the total distance extends apically
from the upper limit of the region of nonlinear input-output functions, which is also the same as the upper limit
of the frequency-dispersive part of the phase curve, to the apical foot of the displacement envelope and the start
of the phase plateau. The length of the basal segment seems likely to be less firmly indicated by the data of
Figs. 1 and 2. The extent of the basal shift of the displacement envelope peak with increasing stimulus intensity
(towards lower frequencies when observations are made at a single point), would seem to indicate that nonlinear
input-output functions may at high levels be found somewhat further towards the base, relative to the position of
the maximum at low-level (Robles et al., 1990). However, the ‘flex-point’ of the cumulative phase-curves certainly
identifies a frequency (or point if a spatial phase curve is examined) where input-output functions are closely
approaching linearity.

The purpose of determining the dimensions of the important nonlinear region in animals was, in part, to infer what
the corresponding nonlinear region, with its basal and apical portions, might be in humans. If we scale the
distances extracted from the squirrel monkey data to human, we find that they are very similar to the pair of
numbers we have been dealing with in relation to the psychoacoustic data on pure tone masking. As noted, the
apical segment length scales to a distance of about 1.25 mm in humans. The basal segment length (from envelope
peak to the more basal point at which linear input-output functions are obtained and at which phase velocity
begins to decrease following a stronger function) scales to about 1 mm.

Fig. 1. Taken from Greenwood (1974b, 1977a). A phase curve measured at a given point. The more rapidly accumulating portion of
the phase curve, above about 5 kHz and extending to the start of the phase plateau, coincides with the region of nonlinear
input-output functions obtained by Rhode (1970, 1971). It brackets the position of maximum amplitude, extending on average
about 3/4ths of a millimeter to the foot of the apical segment of the envelope (start of the phase plateau) and about 2/3rds of a
millimeter basally to the bend in the phase (or phase travel time) curve, at about 5 kHz in this individual case. Conversions of the
phase curves, like this figure, were made for all of Rhode’s phase curves in 1971 (Greenwood, 1973a,b, 1977), in part to extract the
average dimensions of the nonlinear region and its two segments. The identification of the corresponding apical and basal segments
of the displacement envelope, in man, with critical bandwidth, when reflected, respectively, towards the low or the high frequencies
from the frequency of a masking tone, would make the nonlinear region correspond to about two critical bandwidths (at moderate
to low levels - it may stretch towards the base at higher levels). Left panel: Phase differences, Δφ, between the motion of the
basilar membrane and the motion of the malleus. Negative numbers signify that the motion of the basilar membrane lags the
motion of the malleus. At low frequencies the curve approaches a phase lead of about π/2 rad. Data from animal 69-460. Curve
adapted from Rhode (1970). Right panel: Phase travel time from a point on the membrane at its basal end to the recording
location, calculated from the phase measurements as corrected by π/2 rad, except that times corresponding to the phase plateau
approximately represent phase travel times to the progressively more basal positions at which the displacement envelope of a
higher frequency tone reaches its apical foot (and causes more apical parts of the partition to vibrate in phase). The points on the
graph are not Rhode’s original data points. They represent the points of intersection of the original phase curve, as drawn by
Rhode, with a system of grid lines superimposed on his curves to extract numerical data from the graph.

If we compare these distances to psychoacoustic measures of critical bandwidth as in a previous paper (Greenwood,
1974b) using the frequency-
position function discussed also in Greenwood (1990, 1991), one critical bandwidth extending apically from a pure
tone frequency corresponds to 1.25 mm; one critical bandwidth basally from the same frequency extends 1.065 mm.
Expressed as though we were spacing three tones along the cochlea, a tone, S_H, whose traveling wave maximum is
1.065 mm basal to the maximum of a lower frequency tone, S_M, is at the same frequency separation (one critical
bandwidth) from S_M as is a tone, S_L, whose maximum is 1.25 mm apical to tone S_M’s maximum. Tone S_L’s maximum is
at the apical foot of the nonlinear portion of S_M’s envelope, whereas tone S_H’s maximum is at the basal end of
the nonlinear portion of S_M’s displacement envelope, where S_M’s ‘tail’ becomes essentially linear. If S_H is
then raised in frequency, its nonlinear apical segment begins progressively (peak first) to disengage from the
basal nonlinear segment of S_M. Thus, in man, the total nonlinear region, from apical foot to envelope peak and
then to the basal end-point of the nonlinear region, would be about 2.32 mm, at least up to moderate stimulus
levels, and it corresponds to about two critical bandwidths. At higher levels, nonlinearity might well be observed
at points further from the peak on the basal side.

Fig. 2. Taken from Greenwood (1973a, 1977). A spatial phase curve of Kohllöffel’s (1972b), showing the nonlinear region,
directly on a millimeter scale (abscissa: mm from stapes), over which the phase curve follows, in effect, a different
function than at more basal locations. The position of the maximum of the envelope is indicated by the long tic marks on
top and bottom abscissas. The basal and apical nonlinear segments are in general shorter than in squirrel monkey (or, by
inference, in man), but of the same relative proportions. The curve through the data is not Kohllöffel’s, though very
similar; it is calculated from functions based on squirrel monkey data and scaled to the guinea pig. The data are
relative phase measurements (bars) versus position on the basilar membrane in guinea pig (dead) for a 7 kHz tone
(Kohllöffel, 1972a,b). There is no phase ambiguity along the partition using Kohllöffel’s stroboscopic observational
technique (open and closed bars represent strobing at phases separated by 90 degrees), but the curve as a whole lacks
an ordinate reference point. Hence calculations (curve) are relative phase on an arbitrary ordinate.

This review of estimated dimensions and proposed relationships indicates that there are close correspondences
between, on the one hand, the distances represented by critical bandwidth and, on the other hand, the probable
dimensions in man of the two portions of the region of non-linear input-output functions bracketing the peak of
the displacement envelope.

In addition, critical bandwidth has been linked to the physical distance and frequency interval required (a) to
shift combination tone masking, or detection, away from most of the apical-side masking influence of the lower of
two primary tones (Greenwood, 1971, 1972b), (b) to shift a combination-tone far enough apically to begin to drive
primary-like neurons of the AVCN free from the excitatory effects of the lower of the two primary tones (Greenwood
et al., 1976), and (c) to shift a higher frequency tone, basally away from a CF-tone, to the tip of the
rate-suppression area as conventionally defined (Greenwood et al., 1976).

What Determines the Measured Values of Critical Bandwidth Obtained in the Experiments Considered?

To revert to the earlier thread of argument in the Introduction: given the nonlinear properties of the system, as
revealed in the particular interactions or interferences of stimuli in any one of the experiments thought to
provide ‘measurements’ of critical bandwidth, it may well not be the case, and cannot with any confidence
generally be expected, that the critical bandwidth emerging from a given operational definition will be an
estimate of a conventional bandwidth of
the auditory filter or that it will be centered on the signal. There are several experimental contexts included
among those considered in this paper in which the Δf resulting from the particular operational definition does not
directly reflect a conventional bandwidth of a filter. Critical bandwidth measurement in experiments on pure tone
masking (Wegel and Lane, 1924; Small, 1959; Ehmer, 1959; Greenwood, 1961a,b, 1971); two-tone masking (Zwicker,
1954; Green, 1965; Greenwood, 1961a,b, 1971, 1972a,b,c, 1974a; Nelson, 1974, 1979); two-tone consonance
(Helmholtz, 1877; Mayer, 1874, 1875, 1894; Cross and Goodwin, 1893; Kaestner, 1909; Guthrie and Morrill, 1928;
Plomp and Levelt, 1965; Plomp and Steeneken, 1968; Plomp, 1976); AM and QFM detection (Zwicker, 1952; Goldstein,
1967a; Schorer, 1986); and narrow-band masking (Greenwood, 1961a,b, 1972) will be discussed in turn. A discussion
of masking by the combinational aggregate generated by a masking band and of the important relationship of the
frequency distributions of signal-masker generated combination components to those of the aggregate (Greenwood,
1971, 1972a,b,c) will then follow and introduce consideration of notched-noise (two-band) masking (Patterson,
1976; Weber, 1977; Peters and Moore, 1989; Moore et al., 1990; Shailer et al., 1990).

Single pure tone masker and single pure tone signal - simplest paradigm

To review those experiments (Greenwood, 1969/70, 1971; Smoorenburg, 1972): when a single masking tone (f_m) at low
to moderate levels is used to mask a nearby higher frequency signal (f_s), the lower frequency odd-order
combination tones (2f_m - f_s and 3f_m - 2f_s) produced by masker and signal can permit detection of the presence
of the signal because of the combination tones’
Fig. 3. Taken from Greenwood (1973a, 1977). A spatial phase travel time curve for a 3600 Hz tone, labeled ‘Time - 3600 Hz’,
shows the spatial extent of the same nonlinear peak region as in Figs. 1 and 2, divided into basal and apical segments.
Superimposed on this summary graph are several other curves that are not specifically germane to this paper, but that put this
curve into a framework of calculations. The time curve indicates the phase travel time required for a tone reaching its maximum at
any point on the partition to reach that position. Note that it crosses the 3600 Hz curve at the position of maximum amplitude
which divides the straight line segment of increasing log-time into two segments. The time curve, when one considers what is the
frequency of the tone reaching maximum at a given point, converts to the phase curve. The frequency curve merely indicates the
position of maximum amplitude versus frequency; and the period curve is its reciprocal. The latter two curves have been modified
by about 9% on the ordinate scale by more recent data than those on which they were based (Liberman, 1982), but this has little
quantitative effect on the other curves and no effect on the relationships shown.
more apical neural effects (on primary fibers responding to the masker and/or to the combination tones). This
occurs at signal levels at which the signal per se is masked - via dominance or ‘suppression’, which is maximal
when components are very close (and that diminishes with separation), i.e. when the masker drives neurons tuned to
the signal, which the signal may reclaim if raised in level (Greenwood and Goldberg, 1970, Fig. 11). As the signal
increases in frequency the odd-order combination tones decrease in frequency and will eventually, as they are
shifted more apically (at a given signal level), no longer affect the fibers also responding to the masker and
will either excite primary fibers beyond the zone of excitatory effect of the masker or become incapable of doing
so, depending on their intra-cochlear level (Greenwood et al., 1976; Smoorenburg et al., 1976). In either case,
the combination tone will thus determine a higher frequency signal’s threshold until the signal becomes better
able to excite primary fibers on the basal side of the masker (Greenwood, 1969/70, 1971; Smoorenburg, 1972; Sinex
and Havey, 1984) than is the combination tone on the apical side. It is, of course, recognized that the amplitude
distribution of a combination tone in the region of the primaries’ overlap (where it is generated) will not
duplicate the envelope of a tone of external origin, but, as the primaries separate, in their region of
non-overlap (apical to the higher primary combination tone), combination tone effects on primary neurons
increasingly resemble those of tones of external origin added to the lower primary in the absence of the upper
(Kim et al., 1980).

Fig. 4. Taken from Greenwood (1971). Widths of the notch in pure tone masked audiograms, at low to moderate levels, on
their high frequency side, which result from the determination of signal threshold by the detection of combination tones
generated by signal and masker. The frequency interval plotted on the ordinate is the separation between the masking
frequency and the low point of the notch on the high frequency side, when the masking tone was at levels at and below
about 60 dB SL. These frequency intervals equal critical bandwidth and correspond to an approximately constant distance
(Greenwood, 1961a,b, 1971). The solid circles with horizontal bars are based on Small’s (1959) data and represent the
means of six subjects. The remaining symbols represent individual subjects in 1971 (see original paper for explanation).
The solid line indicates all frequency intervals that correspond to an equal distance of 1.065 mm, when the interval
concerned is between masker and notch low point. The same frequency interval when reflected about the masker in the low
frequency direction represents also 1.25 mm, if the frequency interval of interest is between the masker and the
corresponding frequency of the cubic combination tone generated by the masker and a signal at the low point of the notch
in the masked audiogram.

When the signal is at the low point of the notch in the masked audiogram, the combination tone’s maximum is at the
apical foot of the masker’s zone of masking effect. If we call the frequency interval measured by this operation
critical bandwidth, then indeed this critical bandwidth is a frequency-resolution measure related to the filtering
properties of the system, but it is one that in this experiment is related more to the extent of the zone of
interaction of the combination tone with the masker on the apical frequency side, an interaction made criterial by
the masker’s suppression of the signal on the basal side. Estimates of the frequency interval between the masking
tone and the low point of the notch are shown in Fig. 4. The foregoing description applies when maskers are low to
moderate in level and the notch on the upper side has not yet appeared (the slope is nonetheless determined by
combination tone detection) or is relatively narrow. Of course, as the masker’s level exceeds this range, the
combination tones on the low side (of even-order, f_s - f_m, as well as odd-order type - Krammer and Greenwood,
1973) continue to determine the measured thresholds of signals on the high frequency side over a wider separation
of masker and signal (ratios > 1.5), since the odd-
order combination tones’ displacement envelope masker are separated by about one critical band,
maxima occur at appreciable magnitude well be- here 1.065 mm. The combination tone is the same
yond the foot of the masker’s envelope (while the frequency interval below the masking tone, or at
even-order combination tones encroach upon it)
and since at higher levels the masker becomes
more effective in masking the signal per se, via
Fig. S. Schematic to illustrate primary and combination com-
suppression (Sachs and Kiang, 1968; Greenwood
ponent relationships in five experiments in which .critical
bandwidth is operationally defined. All panels: Slanting solid line represents apical segment of
the displacement envelope of primary tone to its right and apical zone of main masking effect by
that tone, extending from peak to foot of the masked audiogram. Top panel: Pure tone masker and
signal are separated by one critical band, which (at moderate levels) places signal at low point
of a notch in the masked audiogram (see text); combination tone is at same frequency interval
below the masker at about the foot of the masked audiogram. Second panel: Two pure tone maskers
are separated by one critical band (separation at which a dip appears in masked audiogram) and a
signal is at their arithmetic mean. Combination tone produced by masking tones is shown as
vertical dashed-dot line at foot of slanting line, where it forms a resultant with $3f_l - 2f_s$,
shown by slanting dashed line (and, less importantly, with $4f_s - 3f_h$). Combination tone
produced by lower masking tone and signal, $2f_l - f_s$, vertical dotted line, forms a resultant
with $3f_s - 2f_h$, also a slanting dashed line. Third panel: Two primaries separated by one
critical band, interval corresponding to 'smallest consonant interval' (see text), are shown as
solid lines and their combination tone, $2f_l - f_h$, is shown as a dash-dot line. Fourth panel:
Two signals (side-bands, $f_{ls}$ and $f_{hs}$), short solid lines, are masked by a masker
(carrier, $f_c$), tall solid line, in either AM or QFM phase relations with masker. At signal
(side-band) separations wider than shown, side-band level required to detect modulation is equal
in AM and QFM. At narrower separations, signals in AM relation can be detected at lower levels
than in QFM. Combination tone $2f_{ls} - f_c$ is shown as vertical dashed line and forms a
resultant with combination tone $3f_c - 2f_{hs}$, shown as a dotted line. Combination tone
$2f_c - f_{hs}$ is also shown as a dotted line and forms a resultant with lower signal. Fifth
panel: A critical band of noise masks a signal 2/5ths of band width ($n/(2n+1)$ths) from lower
limit. The combination bands formed by signal and lower masker components ($(n+1)f_l - nf_s$) and
the combination bands formed by signal and higher masker components ($(n+1)f_s - nf_h$) are shown
as pairs of dotted and dashed horizontal lines ($n = 1$ and 2, respectively). Signals that are
1/3 to 1/2 of the width of the masking band place these combination bands entirely within the
region of cubic combination components generated by masking band and hence within main apical
zone of masking effect by the masking band and its combinational aggregate. Combinational
aggregate bands $2f_l - f_h$ and $3f_l - 2f_h$, generated by the masking band, are effectively
part of the masker and are shown by dashed-dot and dash-dot-dot lines terminated at apical limit
by verticals. Shift of the signal in either direction will extend one or the other set of
signal-masker combination bands outside the region of the major (the cubic) components of the
combinational aggregate.

and Goldberg, 1970, Fig. 11; Kiang and Moxon, 1974), with the consequence that the notch becomes
wider and deeper.
However, the point we are seeking to emphasize here is the fact that, for signal levels that are
low with respect to the masker and at signal frequencies close above the masker (within a ratio
of about 1.2), combination tones are able to combine with, and contribute detectably to, the
discharge patterns occurring in response to the lower masker when the combination tones' and
masker's effects on primary fibers overlap. When the signal is at a greater frequency interval
above the masker, the combination tone begins to excite primary neurons receiving their input
from points beyond the zone of excitatory effect of the masking tone (Greenwood et al., 1976).
This is illustrated in the top panel of Fig. 5, where signal and
a distance of 1.25 mm apical to the masker's position of maximum amplitude. This demonstrated
way in which combination tones detectably combine their effects with those of the masker or
exert them apical to the masker (depending on separation and level, and in either case as
signal-proxies determine the signal level at which a listener is able to detect a signal's
occurrence) clearly warrants something more than their designation and seeming dismissal as
'contaminants' which stand in the way of estimating a filter distributed about the signal. Among
other implications, their role as a signal-induced 'anything', which alters the sensation
produced by the masker alone by adding to the latter's excitatory effects on its low frequency
side, tells us a great deal about the spatial 'filtering' on the apical side of the masker and is
consistent also with what is known about the masker's stronger and different effects on signals
at frequencies above it, as opposed to its effects on either signals or combination tones below.
This asymmetry of effect of the masker on signals at frequencies above and below it is key to an
understanding of the role taken by the combination tones and of the particular values of the
frequency separations empirically defined as critical in these experiments.
At low signal levels a slightly lower-frequency masker recruits and becomes the dominant driver
of primary units with CFs at the signal frequency. At higher signal levels, the signal will
become effective in reclaiming and driving them, overcoming the suppressive effect of the lower
frequency masker on its way to its own eventual dominance of that basilar locus and those fibers
(Greenwood and Goldberg, 1970, Fig. 11). When it does, it has achieved nearly the intracochlear
level and neural representation at that locus that it would have if presented alone since its
power now dominates the gain control that we envisage to be centered at that locus and that
either or both tones can control given the right frequency and intensity relations (Greenwood,
1986a,b,c; 1988). When in sufficient proximity to each other, not only may both tones be
reflected in the driven discharge or either tone dominate the period histogram and average rate,
but their relative levels as reflected in neural discharge may fairly closely reflect their
external amplitude ratio (Rose et al., 1974), since both operate and are affected by the gain
control. (This is why the word 'suppression', which implies an increase of the internal over the
external ratio, has been a poor term to use (Greenwood, 1986a,b) to describe the conspicuous
interchange of dominance achieved by the gain control when both tones are close and can still
maintain a fairly close approximation in the period histogram to the components' external ratio.
As they separate sufficiently, the situation changes.) In contrast, even at low signal levels a
higher frequency masker whose envelope peak is only about 1.25 mm (by inference from animal
data, Greenwood et al., 1976) more basal than that of the signal has already lost most of its
capacity to recruit a primary fiber whose CF is at the signal frequency, though at this
separation it still has the capacity to exert an influence on the gain control at the neuron's
locus and thereby to reduce its rate of fire, which at wider separations it rapidly loses
(Greenwood et al., 1976). A lower frequency masker has an advantage over a higher one, and at
much greater separations than 1.25 mm can recruit higher-CF neurons and more rapidly dominate
with increase in level.
The major role played by the combination tones, as producers of the detected neural events when
the signal lies above the masker, is made more intelligible by the large Fourier amplitudes of
the combination tones in their neural representations (relative to those associated with the
primaries) calculated from the period histograms of the primary fiber population responding to
two-tone stimuli (Kim et al., 1980). If the auditory filter is the subject of the search in
masking experiments, evidence bearing on the spatial extent of the apical influence of the masker
and on the effectiveness of a combination tone in driving neurons just beyond the primary
envelope's apical foot, with little influence by the primary (Greenwood et al., 1976), is highly
relevant. The spatial extent of basal influence of the masker is not determinable in
psychoacoustical experiments unless combination tone detection in lieu of high frequency signals
is precluded (Greenwood, 1971), and the extent is characterized by an asymmetry that is a marked
function of level, in agreement with the physiological evidence. From both the psychoacoustic
and physiological experiments, the
critical involvement of combination tones in almost all masking experiments, including
two-component masking (and AM and QFM detection, which involve two 'signals'), and in the
determination of consonant intervals can be directly inferred, as will be outlined below.

Two-tone maskers
In the interpretation of two-component masking data it should be remembered that the lower
masker, with both signal and upper masker, will generate combination tones on the low frequency
side of the lower masker. These relations between primaries and combination tones are
illustrated in Fig. 5, second panel from the top. As the masking tones separate, a dip in the
masked audiogram occurs between the tones (Zwicker, 1954; Greenwood, 1961a,b), but as just
described above, the detection of a signal just above a masking tone, hence also when centered
between two tones, is not the result of neural events occurring at a basilar locus corresponding
to the signal's position above the lower tone (and hence between the tones) but rather occurs
because of the detection of combination tones whose neural effects coincide with, or extend
beyond, those of the lower tone on the low frequency or apical side of the lower masking tone
(Greenwood, 1971 (Fig. 25), 1974a; Nelson, 1974, 1979). If the combination (cubic and simple
difference) tones are masked, the dip in the masked audiogram between the two masking tones -
being normally the consequence of the greater audibility of the combination tones than the
signal - does not appear until a wider separation allows signal-generated events to be detected
above the lower masked tone. This occurs when the signal is more nearly free of the threshold
elevating (suppressive) effects of the masking tones (chiefly the lower, Greenwood and Goldberg,
1970, Fig. 11; Sachs and Kiang, 1968; Kiang and Moxon, 1974). Recall that the earliest data
(Zwicker, 1954) showed that the upper masker's initial effect in keeping threshold high quickly
fades to negligible, leaving the interaction of the signal with the lower tone as the
determinant of threshold.
To introduce later discussion, this involvement of combination tones in the two-tone masking
experiment can usefully be described here in a more detailed account of why the measured
'bandwidth' is what it is and why it is not explained by the exit of the two masking tones from
a filter pass-band centered on the signal between them, but by the mechanical and neural events
caused by combination tones occurring on the apical side, and ultimately at and beyond the foot,
of the mechanical and neural excitation pattern associated with the lower masking tone
(Greenwood, 1971, 1972b, 1976). Why does the dip in the two-tone masked audiogram begin to occur
when the two masking tones are about 1 mm apart? At this separation, they are, of course, also
at the masker-notch interval and at the smallest consonant interval as well; the combination
tone $2f_l - f_h$ is generated by the two maskers, with its frequency 1.25 mm below that of the
lower masker. As discussed above, detection of the presence of a signal near mid-frequency, if
presented alone with only the lower masker, will be dependent on the detection of combination
tones, $2f_l - f_s$ and $3f_l - 2f_s$, whose maxima at this primary component separation will
occur on the apical side with maxima within 1.25 mm of the lower masker and whose effects
presumably generate the first indication of the dissonance expected at this separation of signal
and masker. In what follows, a signal at the arithmetic mid-frequency rather than at geometric
mid-frequency is assumed, and combination tones above the primaries (such as $2f_h - f_l$) are
left out of this and later descriptions. When both masking tones are present, the two maskers'
combination tone, $2f_l - f_h$, coincides (to form a resultant) with the
signal-and-masker-generated combination tone of the same frequency, $3f_l - 2f_s$. The latter (of
lesser level) should in principle be expected to affect the resultant's level as the signal's
level (hence its own) is raised, but since it will be considerably smaller than $2f_l - f_h$, the
effect will be slight and no doubt it is $2f_l - f_s$ (generated by signal and lower masker) that
is principally responsible for the subject's detection of the signal's presence (although it is
also combined to form a resultant with a smaller $3f_s - 2f_h$). (If a geometrically-centered
signal were used (Nelson, 1979), these resultants would not be formed; their constituents would
instead be nearby beating components, which would be expected to somewhat
alter the empirical thresholds obtained, as well as this description.)
As the maskers exceed a separation of about 1 mm and as the relatively large resultant, formed
by their combination tone $2f_l - f_h$ (which is a part of the masking stimulus) and a minor
$3f_l - 2f_s$, moves apically away from the lower masker and its zone of effect, the
lesser-amplitude combination tone $2f_l - f_s$ is then more easily detected, and signal threshold
begins to decrease to cause a dip in the masked audiogram. (Recall that low-pass noise
(Greenwood, 1971; 1974a; Nelson, 1974, 1979) or a noise floor (Green, 1965) will prevent a
signal's presence from being detected until wider masker separations are reached.) Only at
higher signal levels, when the signal has sufficiently overcome suppressive effects by the
maskers, will the signal itself begin to contribute directly to the neural effects (at cochlear
locations between the maskers) that the listener may hear as a modification of the maskers' and
combination tones' ongoing perceptual effects or, at sufficient masker separations, as an
identifiable component.

Two-tone consonance
The phenomena of consonance and dissonance observed in two-tone experiments are clearly
fundamental perceptual indicators of mechanical resolution and interference. Two tones beating
is a mechanical interference, but these beats would not be heard if the two beating tones were
not reflected in neural response, most easily conceived as the response of some primary neurons
to both tones. As the two tones separate, detectable beats increase in rate and reach maximal
dissonance at separations of about 0.3 to 0.4 mm, but begin abruptly to diminish in
detectability as the envelope maxima exceed a separation of only about 0.5 mm and complete their
minimization at about 1 mm, at the shoulder of the consonance curve (Figs. 3 to 7 in companion
paper, Greenwood 1991). This frequency relation is shown in Fig. 5, middle panel. We may imagine
that those portions of the receptor surfaces providing neural input to the system and responding
to both tones are no longer sufficiently beating to exceed some criterial extent; that the major
stimulus components have each acquired a personal following (Greenwood, 1971, 1972b; Greenwood
et al., 1976; Greenberg et al., 1986). As both the primary tones, usually set equal in level,
separate, so do the combination tones. The latter's intracochlear physical levels at their
positions of maximum amplitude are not known, but their amplitudes are not necessarily lower
than the primaries in terms of component amplitudes derived from period histograms nor less
extensive than the upper primary in the overlap of their effects upon those of the lower primary
(Kim et al., 1980). But since the separations of both primaries and combination tones are
co-varying, what can we say about their relative roles? Recall that when the upper primary
(called the signal in earlier masking contexts) is at a much lesser level, combination tones are
the first detected components and provide the evidence of 'interference' that we hear at that
low signal level, evidence that is derived from excitation coinciding with the lower primary's
excitation, or past it, on the low frequency side. Is it likely that these effects of the
combination tones disappear, or become insignificant, as the signal itself (upper primary) is
raised in level, simply because the signal begins (within the gain control centered about its
frequency) to overcome the domination by the masker (lower primary) to which it is subject and
begins to contribute to excitation of neurons in its own right? The answer is 'no', since
combination tones continue to increase in level as the primaries approach equality in level.
Consider the primary neurons tuned and firing to a single pure tone, our masker or lower
primary. What will the response of these neurons, whose CF is the masker frequency, be to a
signal alone at a frequency just above the masker frequency? First, they will not respond until
the signal is at a higher level than would be required of a lower frequency tone at their CF.
Second, their rate of fire will increase much more slowly with signal level. These two well
known facts, derivable from any tuning curve and set of spike-count-vs-level functions, are
consistent with the fact that it is combination tones on the low frequency side of a pure tone
masker that produce the first detected events when a signal is presented just above the masking
tone, the listener's task being simply to determine when he is listening to something other than
the masker.
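
The distances quoted in the two-tone discussion above (primaries about 1 mm apart, cubic combination tone roughly 1.25 mm apical to the lower primary) can be checked against a frequency-position map. The following is a minimal sketch, not the author's procedure: it assumes a Greenwood-type human map F = A(10^(ax) - k) with A = 165.4 Hz, a = 0.06 per mm and k = 0.88, constants that are not stated in this section, and the function names are hypothetical.

```python
import math

# Assumed constants of a Greenwood-type human frequency-position map,
# F = A * (10**(a * x) - k), with x in mm from the apex.
# These values are an assumption of this sketch, not given in this section.
A_HZ, A_PER_MM, K = 165.4, 0.06, 0.88


def freq_to_mm(f_hz):
    """Distance from the apex (mm) of the place tuned to f_hz."""
    return math.log10(f_hz / A_HZ + K) / A_PER_MM


def mm_to_freq(x_mm):
    """Frequency (Hz) mapped to the place x_mm from the apex."""
    return A_HZ * (10 ** (A_PER_MM * x_mm) - K)


def cubic_tone_offset_mm(f_low, primary_sep_mm):
    """Apical distance of 2*f_low - f_high from f_low, with the upper
    primary placed primary_sep_mm basal to the lower primary."""
    x_low = freq_to_mm(f_low)
    f_high = mm_to_freq(x_low + primary_sep_mm)
    f_cubic = 2.0 * f_low - f_high
    return x_low - freq_to_mm(f_cubic)


def primary_sep_for_offset_mm(f_low, cubic_offset_mm):
    """Primary separation (mm) that puts 2*f_low - f_high at
    cubic_offset_mm apical to the lower primary."""
    x_low = freq_to_mm(f_low)
    f_cubic = mm_to_freq(x_low - cubic_offset_mm)
    f_high = 2.0 * f_low - f_cubic
    return freq_to_mm(f_high) - x_low


for f_low in (500.0, 1000.0, 4000.0):
    print(f_low,
          round(cubic_tone_offset_mm(f_low, 1.0), 2),
          round(primary_sep_for_offset_mm(f_low, 1.25), 2))
```

On this assumed map the two quantities do not depend on the lower primary's frequency: primaries 1 mm apart place the cubic tone a little under 1.2 mm apical to the lower primary, and a cubic tone 1.25 mm apical corresponds to primaries a little over 1 mm apart, in line with the 'about 1 mm' and '1.25 mm' figures in the text.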
What are those primary fibers, for which the signal is CF, doing when the lower masker alone is
present? They are either recruited by the masker or will be with a relatively small increase in
masker level. When they are, their firing rate will rise about as rapidly with the level of the
masker as if the masker were their CF tone. The signal at CF is 'suppressed' as the lower
frequency masker comes to dominate the combination of stimuli (Greenwood and Goldberg, Fig. 11
and text, 1970; Greenwood, 1986a,b) and drive the neurons tuned to the signal at the average
rate and with the temporal pattern appropriate to the masker, whether it be noise or tone
(Goldberg and Greenwood, 1966). These facts also fit well with the finding that the presence of
a signal (itself inaudible) just above the masker is first detected psychophysically when and
because a combination tone interferes with and adds to the firing produced by the population of
primary fibers otherwise driven by the masking tone, chiefly on the apical side. This apical
excitation is something that the combination tone remains capable of producing as the masker and
signal separate, causing the combination tone's displacement envelope to shift further
apicalward. Thus, it may continue to drive primary fibers after its peak and apical segment
begin to separate from the cochlear zone responding to the masker, eventually to affect
primarily a population of cells beyond the foot of the masker's displacement envelope (although,
as earlier explained, the combination tone's apical excitatory effects may cease to determine
the level at which the signal's presence is detected if and when the latter becomes relatively
more audible at higher signal frequencies owing to its own basal effects).
When at narrow primary (masker-signal) separations the signal is raised in level sufficiently to
contribute, now jointly with the masker, to the activity of the primary fibers it would normally
excite if alone, the combination tones are, of course, still significant contributors to the
discharge pattern of the primary fibers that they and the lower primary excite. As the
primaries, now imagined to be equal in level, separate further in frequency, the upper primary,
since it is less effective in exciting primary fibers with lower CF, may well be the component
that passes most rapidly - and just before a 1 mm separation is reached - into lesser
significance as a determiner of detected interference. The combination tones, eventually only
the cubic, may be able to exert their effects on the discharge of units firing both to the
masker and themselves over a possibly greater primary separation (they certainly can do so at
lower levels of the upper primary, i.e. signal, when the latter is inaudible), until the portion
of the cubic's envelope that is effective in producing neural excitation no longer sufficiently
overlaps that of the lower primary for the overlapping region to reflect interference or
dissonance (Greenwood, 1972b; Greenwood et al., 1976). Even if the combination tone and the
upper primary were not in the same order of importance in their contribution to dissonance when
primary levels were equal (as they are at lower levels of the upper primary when combination
tones are the sole or more important contributors), if at equal primary levels the combination
tones merely co-contribute to the detected interference, the essential point is made.
In short, we see that we have grounds to attribute an important part of the detected
interference of two tones to the combination tones that are generated and that probably still
combine perceptibly with the lower of the two tones over the final stages of separation of the
primaries before the smallest consonant (non-interfering) interval is reached. The importance of
this role will be reinforced when we later consider phase effects that are detected owing to
combination tones. It is important to keep in mind, from Figs. 3 to 7 (companion paper), how
rapid is that transition from maximum dissonance to the consonance or non-interference seen at
the curve shoulders near 1 mm intervals. If, indeed, the combination tones importantly
contribute to the prolongation and detection of interference in the final stages of separation,
then anything that would obscure their contribution to the waveform reflected in the temporal
pattern of discharge of active primary fibers should reduce the consonant interval that is
measured, since it is believed that among some fibers this pattern must reflect more than one
tone, that is, must beat, in order that interference be detected.
Low level noise could be expected to exert
such an effect on the detectability of the combination tone contribution to the sensation of
interference or dissonance. A second factor tending to reduce the effectiveness of the
combination tone's effects may lie in the reduction in phase locking with increasing frequency.
In a frequency range of lesser phase-locking, the contribution of combination components to the
complex waveform presented to the receptors will be reflected to a diminished extent in the
temporal pattern of discharge of that group of single neurons that are responding also to the
pair of primary tones, just as will the primaries' contributions also diminish. Therefore, the
combination tones may disappear sooner than they would otherwise, as a function of primary
frequency separation. In addition, and perhaps of greater importance, since evidence exists that
combination tones possess lesser amplitudes in the higher frequency region relative to the
primaries (Goldstein, 1967b; Greenwood, 1971; Hall, 1972a,b; Smoorenburg, 1972), their expected
contribution must diminish. There is, of course, an additional possible factor (whose importance
would not become manifest until higher frequencies) that an upper limit might be imposed by
neural factors on the perception of beats (Plomp and Steeneken, 1968). However, the neurons with
CFs of more than 3 kHz phase-lock well to pure tones of frequencies higher than the beat-rates
in question (if they behave similarly to cat primary fibers), so that such a limitation would
seem dependent on the high frequencies of the primaries in conjunction, rather than simply on
the beat-rate itself.
To sum up, the foregoing paragraphs have illustrated that the role of odd-order combination
tones in determining the masker-notch interval in pure-tone masking (Greenwood, 1971), and the
similar role of combination bands or tones in displacing the conventional interpretation of the
two-tone masking measurement of critical bandwidth (as the measurement of a filter bandwidth
around a centered signal), can be extended to another type of experiment. The two-tone
consonance experiments are interpretable (Greenwood, 1972b; Greenwood et al., 1976) in terms of
the interaction of both primaries and combination tones - with the final stages of separation
probably being importantly contributed to, if not actually determined by the latter, as the two
tones pass rapidly from the maximally dissonant relation at a separation of about 0.4 mm to
consonance in about 0.6 mm more. This emphasizes, in a direct way, how critical bandwidth, as
operationally defined in all three types of experiment, is related to a specific portion and
dimension of the mechanical displacement patterns and interference patterns. It also provides a
similar line of reasoning to apply to critical band experiments of at least some other kinds.

Differential detectability of AM and Quasi-FM
The line of reasoning above has been recapitulated from the 1970s. It is applied here to a
fourth type of critical band experiment. The ability of subjects to detect the presence of
amplitude modulation at a lesser side-band level than is required to detect a QFM stimulus with
the same spectral components appears to suggest a similar role for combination tones as in the
other critical band experiments above. The AM-QFM experiments of Zwicker (1952) reported that
the level of the side-bands required to distinguish either an AM or a QFM stimulus from a pure
tone became equal when the side-bands were separated by an interval that appears to correspond
to a distance of about 1.15 mm (Greenwood, 1991, Fig. 13). Similar experiments by Goldstein
(1967a) replicated the essential results, as have experiments from Zwicker's laboratory by
Schorer (1986).
The far greater peakiness of the AM waveform provides a plausible intuitive reason for its
greater detectability at low modulation rates (narrow side-band separations), when first slow,
then rougher, beating will be heard. But AM's greater detectability disappears when the
side-bands reach separations from the carrier that are only somewhat wider (about 0.58 mm) than
the two-tone intervals yielding maximum dissonance (about 0.4 mm). The expected advantage of AM
may be abetted in a nonlinear system by the generation of combination tones - and in a way that
may suggest why the advantage disappears while the side-bands still remain near the carrier. At
separations approaching those required for maximum dissonance it may be useful to view this
experiment as a masking situation, in which the
side-bands are a joint signal and in which a joint side-band threshold is measured - in effect,
to regard it as the reverse of a two-tone masking experiment. Maskers and signal trade roles and
relative levels, and all components are locked tones. When the two signals ($f_{ls}$ and
$f_{hs}$, the side-bands), presented with the masker ($f_c$, the carrier), exceed a separation
of about 1.15 mm from each other and half that from the masker, component overlap seems enough
reduced that the phase relations among the three components no longer influence the
detectability of change, whereas if they are closer their phase relations do exert an influence
on the required signal (side-band) levels. Fig. 5 shows the relation of these primaries and
their combination tones to the apical segment of the masker's displacement envelope when the
signals are separated by 1.15 mm.
We know from the single pure tone masking studies that if the upper side-band ($f_{hs}$) were
presented at this separation without the lower and were raised in level from below threshold, it
would reach detection because of the effect of combination tones, chiefly $2f_c - f_{hs}$, on
the low frequency side of the masker ($f_c$, carrier) and not itself be detected until presented
at a higher level. If we had (to some degree) cancelled (or reinforced) the combination tone
$2f_c - f_{hs}$ with a weak third primary of the same frequency, we obviously could influence
'signal' (upper side-band, $f_{hs}$) threshold. In effect, this is, of course, what is being
done in the AM-QFM experiments. When the lower side-band ($f_{ls}$) is added, we know that in
both types of modulation the combination tone generated by upper side-band and masker
($2f_c - f_{hs}$) will now coincide and form a mixed resultant (external tone and combination
tone) with the lower side-band ($f_{ls}$), but with a different phase and amplitude in the two
types of modulation, since they differ only in the phase relations of carrier and side-bands. If
in AM the resultant's components add partly in-phase, in QFM they will not. An unmixed
combination tone resultant (composed of $3f_c - 2f_{hs}$ and $2f_{ls} - f_c$) also appears and
will similarly differ in phase and amplitude in AM and QFM, at a frequency below the lower
side-band and at an interval from the carrier corresponding to 1.25 mm. (A similar pattern of
components above the carrier we assume to be less relevant.) The measured side-band masked
thresholds seem to suggest that the differences (between AM and QFM) in these apically situated
resultants make it easier to detect the side-bands in AM so long as the pair of resultants are
within about 1.25 mm below the masker (carrier). Thus, up to this separation three components
contribute to waveforms and neural discharge patterns, which depend on phase and amplitude
relations, in the main apical zone of the carrier's excitatory effects.
Note that it is the resultant of the two combination tones below the lower side-band
($f_{ls}$), whose maximum is situated at 1.25 mm from the carrier, that will move first beyond
the carrier's apical zone of excitatory effect when modulation frequency increases (and no
longer affects detectability). As this resultant leaves this zone, the two remaining components,
carrier and lower side-band resultant, will drive primary neurons of the region with a more
nearly two-tone waveform. Whether it is in fact the resultant of the two combination tones below
the lower side-band that is mainly responsible for the disappearance of the difference in
detectability of AM and QFM, as it begins its exit from the carrier's excitatory zone, or
whether the separation of the lower side-band from the carrier and upper side-band is more
important is not certain. But at wider separations this zone will become increasingly a two-tone
area (and if side-bands are further above threshold consonance is reached when the lower is 1.25
mm from the carrier.) We also know generally that the intracochlear spectra of AM and QFM
complexes, given nonlinearity, must differ in amplitude and phase (hence waveform) because of
the combination tones that will form resultants with side-bands and with each other, apical to
the maximum amplitude of the dominant primary, $f_c$. The modification of the intracochlear
spectrum of a complex stimulus by such resultants (coincident combination tones and/or
coincident primaries and combination tones) has been reported in other experiments (Lewis and
Larsen, 1937; Goldstein, 1969/70; Greenwood, 1972b,c; Hall, 1972b; Buunen and Bilsen, 1974;
Buunen et al., 1974, 1976; Buunen, 1975; Greenwood et al., 1976), as will be described below.
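
The frequency coincidences and phase differences described above for the AM and QFM complexes can be illustrated with a few lines of arithmetic. This is a minimal sketch under loudly stated assumptions: the combination tone is taken to come from a memoryless cubic term, so that its phase is 2*phi_c - phi_hs (real cochlear combination-tone phases also depend on level and frequency), QFM is modelled simply as a 90 degree shift of the carrier relative to the AM case, and all function names are hypothetical.

```python
import cmath
import math


def sideband_resultants(f_c, d):
    """Frequencies of the coinciding components for a carrier f_c with
    side-bands at f_c - d and f_c + d."""
    f_ls, f_hs = f_c - d, f_c + d
    return {
        # combination tone 2f_c - f_hs falls on the lower side-band
        "mixed": (2 * f_c - f_hs, f_ls),
        # two combination tones fall on each other, below the lower side-band
        "unmixed": (3 * f_c - 2 * f_hs, 2 * f_ls - f_c),
    }


def mixed_resultant_phase_gap(phi_c, phi_ls, phi_hs):
    """Phase gap (0..pi) between the lower side-band and the cubic tone
    2f_c - f_hs, assuming the cubic tone has phase 2*phi_c - phi_hs."""
    gap = (2 * phi_c - phi_hs - phi_ls) % (2 * math.pi)
    return min(gap, 2 * math.pi - gap)


def envelope_range(m, carrier_phase, n=3600):
    """Minimum and maximum envelope of the purely linear sum of a unit
    carrier and two side-bands of amplitude m/2 each (no nonlinearity)."""
    c = cmath.exp(1j * carrier_phase)
    values = []
    for i in range(n):
        theta = 2 * math.pi * i / n
        side = (m / 2) * (cmath.exp(1j * theta) + cmath.exp(-1j * theta))
        values.append(abs(c + side))
    return min(values), max(values)


print(sideband_resultants(1000.0, 50.0))
print("AM  phase gap:", mixed_resultant_phase_gap(0.0, 0.0, 0.0))
print("QFM phase gap:", mixed_resultant_phase_gap(math.pi / 2, 0.0, 0.0))
print("AM  envelope:", envelope_range(0.5, 0.0))
print("QFM envelope:", envelope_range(0.5, math.pi / 2))
```

Under these assumptions the cubic tone adds in phase with the lower side-band in the AM case and in opposite phase in the QFM case, and the linear AM envelope swings between 1 - m and 1 + m while the QFM envelope stays between 1 and sqrt(1 + m^2). The sketch says nothing about where along the cochlea these components interact, which is the point the text goes on to develop.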
The point here is not that the operational definition of 'critical bandwidth' has been fully
'explained' in this or preceding experiments, but rather that in this additional
'threshold-class' experiment an eventual explanation must again take into account combination
tones and their interactions with the primaries. They are the determinants of the criterial
change in the stimulus to the primary fibers, again on the low frequency side of the carrier,
thus implicating the same critical distance or zone, and showing why the experiment yields the
'measure' of critical separation obtained.
This point is strengthened further by the example of two other three-tone experiments, which are
similar to the two-tone studies by Lewis and Larsen (1937) and Hall (1972b) in which the two
combination tones, $2f_1 - f_2$ and $f_2 - f_1$ where $f_2 = (3/2)f_1$, alternately cancelled
and reinforced each other in their vector sum as the phase of one primary was rotated. By using
three equidistant and locked primaries ($f_1$, $f_2$ and $f_3$) but at wider separations and
different amplitude and phase relations, we know additionally that chiefly by altering primary
phase relations (a) we can augment, or eliminate, the perceived presence of either one or the
other, alternately, of the first two odd-order combination-tone resultants (one composed of
$2f_1 - f_2$ and $3f_2 - 2f_3$, the other of $3f_1 - 2f_2$ and $4f_2 - 3f_3$) below the primaries
(Goldstein, 1969/70) and (b) we can, in the physiological domain, correspondingly augment or
diminish the ability of these combination tone resultants to drive single units located apically
beyond the excitatory zone of the primary tones (Greenwood et al., 1976). Obviously, the same
kind of formation of resultants, with their relative amplitudes necessarily influenced by phase,
still occurs at the narrower primary separations that will cause combination tone resultants and
mixed-origin resultants to overlap their zones of excitation, thus merging their mechanical
effects to determine the discharge of a set of single neurons responding in common to that
complex of components (Greenwood, 1988).
In sum, we have seen in the AM-QFM experiments that when the side-bands are within a 1.15
tone resultant 1.25 mm apically from the carrier, the AM compound is more easily discriminated
from a pure tone than the QFM complex which requires larger side-bands to make a similar
discrimination. If we focus instead on discrimination of AM from QFM (Goldstein, 1967a; carriers
from 0.25 to 8 kHz), they cease to be discriminable from each other when the side-bands are on
average about 1.25 mm from each other (see Greenwood, 1991), which means that the carrier and
upper side-band are about 0.6 mm from each other, at the start of the rapid transition from
dissonance to consonance if they were alone. (If the side-bands alone were present, they would
be beyond the separation needed for consonance). If we focus now on consonance, when the
side-bands are separated by about 2.32 mm (1.25 mm from lower side-band to carrier, 1.065 mm
from carrier to upper side-band), an AM complex is reported as just smooth or uniform
(consonant) by Mayer (1894), which is quite compatible with the approximately 1.0 mm distance
required for two-tone consonance.

Critical bands and combination tones in other experiments involving noise stimuli

The experiments discussed above can all be described and conceived in terms of pure tone
components, although some used a narrow band of noise as a signal. They also were all directly
concerned with measuring critical bandwidth. In order to introduce other experiments directed to
the task of inferring the filtering properties of the cochlea from psychoacoustic experiments, a
discussion of the generation and effects of combination noise follows. The use of a noise-band
primary ensures that there will be interactions among the noise components themselves, as well
as with any pure tone components. These interactions may have a profound influence on the
results of an experiment. Though the presence and sometimes the perceptual importance of
combination tones have been recognized in several contexts, the described roles of combination
tones (Greenwood, 1971, 1972b,c), further outlined above, have not much modified the
interpretation
mm distance of each other (0.6 mm and 0.55 mm of some other masking data nor been taken into
from the carrier), which places a combination consideration in the conduct of certain experi-
224
ments. Rather, combination tone detection in a significant role in shaping the masked audio-
masking has sometimes seemed to be regarded as gram (Greenwood, 1971). When the skirts of the
merely an example of ‘off-frequency’ listening primary band of noise are sufficiently steep that
and lumped with the rather prosaic detection of the primary component distribution does not it-
spectral splatter when a tone is turned on self determine audiogram slope, there is a pro-
abruptly. Combination tones, at least in the con- gressive change, with a reduction of steepness, on
text of masking, sometimes have been regarded the low frequency side of the audiogram as a
only as contaminants, a methodological nuisance narrow band is widened and as the odd-order
providing undesired cues, perhaps even as a kind combination bands begin to ‘stretch out’ towards
of ‘static’ that does not preclude reception. In the apex. They first add to the low frequency side
other experiments where it will be shown that of the primary band, determining the joint, limit-
their role could be expected to be critical, they ing, intracochlear low-frequency slope of primary
are not taken into consideration explicitly at all. band and combination components, as the latter
However, a combination tone may not only serve extend downward with increasing bandwidth.
as the immediate signal as defined by the audi- They hence will have also determined the limiting
tory system rather than by the experimenter, they slope of the masked audiogram by the time that
may serve equally as maskers of external signals the primary band reaches about critical width
and of other combination components. In order since the primary band’s cubic combination band
to discuss other psychoacoustically significant extends the same frequency interval apically from
bandwidths, such as estimated rectangular band- the primary band’s lower limit, for a distance of
widths (ERB), considered to be related to critical about 1.25 mm. (This does not imply that the
bandwidth, it is necessary to consider combina- limiting masked audiogram slope is the same as
tion tones in some additional contexts. the intracochlear mechanical slope, only that, be-
The demonstrated role of combination tones ing functionally related, they both are reached at
in producing the masker-notch interval in pure the same bandwidth.) The combination compo-
tone and narrow band masking (Greenwood, nents then stretch further into the lower frequen-
1971) and their maskability with low-pass noise cies, extending the foot of the masked audiogram
carried with it the implication that combination and eventually producing ‘remote’ masking if the
tones or bands produced by any stimulus pair, or band is of sufficient level (Greenwood, 1971).
by any band-pass noise, would produce both exci- The even-order combination bands stretching up
tation and masking in their own right. Production from the low frequencies also contribute impor-
of such effects was easily demonstrated, as local tantly to remote masking. The massed and over-
maxima of masking by odd and even-order nar- lapping odd- and even-order combination bands
row bands of combination components at fre- produced by continuous spectra were called the
quencies lower than the primaries. Also, in the combinational aggregate and described, with sev-
case of band-pass and high-pass noise, combina- eral figures devoted to the odd-order portion
tion tones accounted for extensive uniform re- (Greenwood, 1971, Fig. 25). The relationships
mote masking once the primaries exceeded a among the combination bands generated by a
moderate level of about 40 to 50 dB SPL. Re- tone and noise and the combinational aggregate
mote masking had been earlier observed by Bil- generated by the noise alone are shown in Fig. 6.
ger and Hirsh (1954) and had correctly been which reprints the earlier Fig. 25, and are rele-
attributed to an intra-cochlear source of distor- vant to following discussion. Having considered
tion by Deatherage et al. (19571, and had also the presence and masking effects of the combina-
been specifically attributed to combination bands tional aggregate on the low frequency side of the
by Spieth (1957) in the same journal and issue. primary band, let us turn to the high side.
Thus, on the low frequency side of the masked The same demonstration that sufficiently nar-
audiogram produced by a band of noise, the row bands of noise would, like a pure tone masker,
odd-order combination components, chiefly the result in a notch on the high frequency side of the
cubic, generated by the band were shown to play audiogram showed also (Greenwood, 1969/70;
[Fig. 6 graphic; abscissa: Frequency in kHz]
Fig. 6. Taken from Greenwood (1971, Fig. 25). Graphs represent the frequency distribution of combination components of the
form $((n + 1)f_1 - nf_2)$ potentially generated by the indicated primaries. Whether they will be present or effective will be
dependent on the nonlinearity in question and the relative and absolute levels of the primaries. Left panel (A): At the top of the
figure a pure tone at 1 kHz is represented by a vertical solid line and the critical bandwidth extending downward from it is
represented by a short horizontal bar. A dashed vertical line at 1.2 kHz represents a second tone, potentially a signal. On the low
frequency side of 1 kHz are the first three combination tones of the type above (n = 1, 2, and 3) at 0.8, 0.6, and 0.4 kHz. Below
are diagrams in which bands of noise, shown as solid brackets of increasing width, and an upper frequency limit of 1 kHz, are
substituted for the 1 kHz tone. The critical-band bar extends from the noise lower limit. The combination bands generated by band
and a constant 1.2 kHz signal are indicated by dashed lines. The corresponding first three bands of the combinational aggregate
that are generated by the primary band itself are indicated by solid lines. Right panel (B): At the top of the panel, a band of
noise of critical width is represented by a solid bracket with its upper limit at 1 kHz. (The same noise and its combinational
aggregate are represented by a diagram that is inset on the left, and below it is a similar diagram for a slightly wider band of
noise.) At the top and on the low side of the band at the top right is a pure tone signal, represented by a solid vertical line,
and the combination bands generated by the tone and band of noise. Below are diagrams in which the tone is shifted upward with
respect to the band of noise, which remains fixed. When the signal enters the band of noise, two sets of combination bands are
generated and are represented one above the other. The upper set, $((n + 1)f_s - nf_{s\text{-}h})$, is generated by the signal and
noise components above it in frequency. The lower set, $((n + 1)f_{l\text{-}s} - nf_s)$, is formed by the signal and noise
components below it. Near the center of the vertical column of diagrams, and on the right, the signal is within 1/3rd to 1/2 of the
width of the band from the lower limit. (In inset diagram to the right, the signal is 3/7ths of the width through the band, the
third pair of combination bands is aligned, and the critical band below the band is indicated by a vertical dashed line.) For this
short range of frequencies the first two, and largest amplitude, combination bands (n = 1 and 2) of both sets are entirely within
one critical band below the noise band and simultaneously within the limits of the first (n = 1) combination band of the aggregate,
the largest amplitude band within the aggregate. (By the time a rectangular noise band has reached critical width, the
combinational aggregate has ‘filled’ this frequency interval and determined the limiting low frequency slope of the noise band’s
masked audiogram.) So long as the signal is within the primary band, the signal-masker generated combination bands are necessarily
contained within the frequency limits of the combination bands of corresponding n in the combinational aggregate of the noise.
Since the band and hence its aggregate are constant in extent, the combination bands formed when the signal exceeds the band’s
upper limit progressively extend beyond the aggregate, where they are more audible than is the signal itself at frequencies above
the band, up to a frequency of at least about 1.4 times the band’s upper limit. This situation obtains up to bandwidths of at least
about two critical bandwidths (see Fig. 7).
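
The combination-tone frequencies quoted in the caption of Fig. 6 (left panel) can be checked with a few lines of arithmetic. The following is a minimal sketch only; the function name `combination_tones` is hypothetical and is not taken from the original papers, and it computes frequencies alone, saying nothing about whether a given component is present or audible.

```python
# Illustrative sketch (not from the original paper): frequencies of the
# odd-order combination tones of the form (n + 1)*f1 - n*f2 for two
# primaries f1 < f2, as described in the caption of Fig. 6.

def combination_tones(f1_hz, f2_hz, orders=(1, 2, 3)):
    """Return {n: (n + 1)*f1 - n*f2} for the requested orders n."""
    if not f1_hz < f2_hz:
        raise ValueError("expected f1 < f2")
    return {n: (n + 1) * f1_hz - n * f2_hz for n in orders}


# Fig. 6, left panel: a 1 kHz tone and a second tone at 1.2 kHz.
tones = combination_tones(1000, 1200)
for n, freq in tones.items():
    print(f"n = {n}: {freq} Hz")
```

With the two primaries of the caption, the three values are 800, 600, and 400 Hz, that is, the 0.8, 0.6, and 0.4 kHz tones shown below the 1 kHz tone.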
1971, Figs. 21-22, 25 and text pp. 530, 538-540) that the detection of combination bands produced
by signal and masker on the apical side of a masking band determined signal thresholds on the
entire high frequency slope (signals within about 1.4 times band’s upper limit) of the masked
audiogram after the band was wide enough that the notch feature was lost. In Figs. 21 and 22 the
high frequency slope was still determined when bandwidths were at least 0.23 times their upper
frequency limits of 2600 and 1300 Hz. This could occur since signals of still higher frequency are
able to generate, with wider maskers, even more extensive combination bands than with narrower
maskers - combination bands that are hence potentially detectable further into the low
frequencies.

[Fig. 7 graphic; abscissa: Frequency in kHz]
Fig. 7. Taken from Greenwood (1971, Fig. 21). Two masked audiograms produced by bands of noise of
200 and 600 Hz in width, respectively, whose lower frequency limit was 2000 Hz and whose overall
level was 73 dB SPL (see original publications for other details). The 200 Hz band is subcritical
and yields the notched form that results from detection of signal-masker combination bands below
the masking band when the signal is in a limited region immediately above the band. The 600 Hz band
is almost two adjacent critical widths (1.8) and the entire high frequency slope of the masked
audiogram is determined by detection of signal-masker combination components. With the upper limits
aligned, the high frequency side of the masked audiogram produced by the wider band still lies well
below the thresholds measured for the narrower band, by about 11.7 dB, after allowance is made for
the approximately 1.X dB difference in spectrum level between the 200 Hz band and a band of
critical width. The lower thresholds can be attributed to the more extensive distribution of
signal-masker generated combination bands below the masking band and their changed relationship to
the combinational aggregate, resulting in their detection over a more extensive range of signal
frequency.

In the present paper, Fig. 7 reprints Fig. 21 from 1971. It was then noted that the progression of
changes from the notched to the unnotched form produced by a wider band ‘can appear rather unusual
- as an inspection of the masked audiograms [in Fig. 7] will suggest when the upper frequency
limits are conceptually aligned - and will depend on the exact frequency and amplitude relations
between, on the one hand, the band plus its combinational aggregate [the total masking stimulus]
and, on the other hand, the combination bands produced by the band and the signal [a part of the
signal from the ‘point of view’ of the auditory system]’ (Greenwood, 1971, p. 540). Thus, the notch
region itself disappears very rapidly, with increases in bandwidth, as it extends upwards to
comprise the entire upper slope (see Fig. 7), and an effort to determine more precisely the width
at which the notch disappears was not made. However, certainly when a band is of critical width
(and in Fig. 7 here and Fig. 22 of 1971 they are about 1.8 adjacent critical bandwidths), signal
threshold is determined over the whole falling portion of the high frequency slope of the masked
audiogram and not merely close to the upper cut-off frequency in the former notch region. The
finding that the high-frequency slope of the masked audiogram produced by a critical band of noise
was determined by combination components has also been reported by Zwicker (1968). [If a further
examination of this issue includes bands in the lower frequency region, what may be considered
critical width in the lower frequencies may well be more accurately given by the exponential
critical band function of the previous paper (Greenwood, 1991). For example, for a band centered at
400 Hz a 1 mm distance would correspond to as little as 75 Hz, which is consistent with the
notchless (monotonic decreasing) and only slightly irregular high frequency slope of the masked
audiogram produced by a band of 90 Hz centered at 410 Hz (Egan and Hake, 1950).]

In other words, at least one major reason (there could well be others) for the different extent of
the upward spread of masking in pure
tone and noise band masking is the detection, when a signal is presented above a band of noise, of
the combination bands produced by the signal and masker which are detected more apically in the
cochlea. They are detected there to the extent that they are of sufficient magnitude and not masked
there by both the masking noise and the combinational aggregate of lower frequency components of
odd and even orders produced by the band of noise itself - the combinational aggregate that
produces more extensive remote masking if the primary band is presented at sufficient level. Thus,
combination bands at lower frequencies generated by the primary band and a higher frequency signal
of decreasing level will determine signal threshold on the high frequency side unless and until the
combination bands generated by masker and signal become inaudible (by virtue of insufficient level
or masking by the primary band and/or the combinational aggregate the band has created) before
signal produced neural events on the high frequency side become undetectable. At some sufficiently
high frequency above the masking band, quite likely depending on masker level, a signal will remain
audible longer as its level is lowered than do any combination components created by itself and the
masker. This is the same basic situation as described above in the masking by a pure tone of a pure
tone, or a noise band, signal.

Thus, the fact that the high frequency slope of a masked audiogram may not change after masker
bandwidth increases beyond the width (at least approximately critical) required to eliminate an
actual notch does not establish that signal threshold is determined by events detected on the high
side rather than by combination bands on the low side. In fact, it strongly suggests that the
previous situation has not changed. In addition, recall that as a band of noise is widened into the
low frequencies enroute to becoming a low-pass noise, it becomes a true or uniform low-pass noise
only outside the ear. Within the cochlea, plotted as intracochlear level, it has a falling
amplitude spectrum on the low side that reflects rising threshold in the low frequencies. The
combinational aggregate and signal-noise generated combination components suffer no such direct
middle ear attenuations. Although that part of the combinational aggregate generated by the primary
components in the lower frequencies may itself be of lesser amplitude because the lower frequency
primary components have been subject to middle ear attenuations (however, recall evidence that
nonlinearity increases near the apex in Hall, 1972a, 1972b), the components of the aggregate and of
the signal-masker combination bands produced by higher frequency portions of the input spectrum
will not be indirectly affected in this way. The signal-masker produced bands, as noted above, also
become more extensive as masker bandwidth increases, always keeping a ‘lead’ on the expanding
aggregate in spatial extent (see Fig. 6) if the signal is outside the band, and will fall off in
amplitude in the low frequencies following a similar slope as the aggregate does, which will tend
to preserve the amplitude relationship between themselves and the combinational aggregate as they
both expand with masker bandwidth.

Critical bandwidth measurement in narrow-band masking

The findings discussed above accounted for the results of another, and earlier, type of critical
band experiment (Greenwood, 1961a,b). In 1971, from the above-cited results and diagrams, the
conclusion was drawn ‘that within given portions of the frequency region below the primary band
(for example, in the critical bandwidth interval immediately below it), the resultant amplitude
distribution of combination components will not change materially once a numerically equal primary
bandwidth is exceeded. This conclusion has clear implications for all classical paradigms of
masking, . . . ’ (Greenwood, 1971, p. 535). Later in that paper (p. 540 and Fig. 25) it was shown
that all signal-masker combination bands would be concentrated within and close to the masking band
once the signal entered the band - ‘. . . as the signal enters the band, it will form combination
bands (1) with the noise components lying above it and (2) also with those lying below it.’ The
center-right diagrams of Fig. 25 (Fig. 6 here) showed that as the signal progressed through the
masking band, each of the two sets of odd-order (n = 1, 2, 3) signal-masker combination bands
[$(n + 1)f_s - nf_{s\text{-}h}$ and $(n + 1)f_{l\text{-}s} - nf_s$; where s = signal, l = lower
limit of band, h = upper limit, $f_l \le f_{l\text{-}s} \le f_s \le f_{s\text{-}h} \le f_h$]
changed in extent in opposite directions (expanded or contracted). When the signal lay at a
frequency 3/7ths of the way through the masking band, $n(f_h - f_l)/(2n + 1)$, the lower limits of
the third pair (n = 3) of combination bands coincided. When the signal lay 1/3rd of the way through
the band, the lower limits of the first pair (n = 1) of bands coincided. Most importantly, ‘Note
that it is in this range of signal frequencies that all components of significant amplitude in the
combination bands formed by the signal and the noise fall close to the low-frequency side of the
band of noise and hence are more greatly masked’; clearly they would not be so readily masked if
the signal were not in this range. Specifically the $2f_1 - f_2$ and $3f_1 - 2f_2$ bands fell
entirely within one critical bandwidth of the lower limit of the noise, as shown in the right
middle panel of Fig. 6 (also in the bottom panel of Fig. 5). They therefore fell entirely within
the limits of the first ($2f_1 - f_2$) and largest-amplitude component band of the combinational
aggregate which, when the masking band reaches the same critical interval in width, has determined
the limiting slope of the masked audiogram on the low frequency side. There the aggregate can add
no further to the masking of signals, nor signal-masker combination components, falling between the
masked audiogram’s apical foot and its peak. As shown by the diagrams on the lower right of Fig. 6,
when a signal was advanced above the center of the band, the signal-masker combination bands
extended into lower frequencies and away from effective masking by the combinational aggregate,
where they were able to determine signal threshold.

In short, the text and diagrams of the 1971 paper provided the basic reasons why the band widening
experiments of Greenwood (1961a, b) would yield the critical band estimates obtained. An expanded
explanation follows here, with the addition of the concept of a mechanical gain control centered at
every cochlear locus (Greenwood, 1986a,b,c; 1988). Briefly, in narrow band masking experiments,
thresholds of signals on the low side of a masking band are probably determined by signal-based
mechanical, and resulting neural, events on the low side of the masking band; as signal level is
lowered toward threshold, the signal-masker generated combination tones will drop out of audibility
first, although they should be playing an increasing role (in the detection of the signal at
threshold) by the time the signal reaches the peak of the audiogram. For signals coincident with
the lower limit of a band, the signal-masker generated combination components will be coincident
with the combinational aggregate. For signals at about 2/5ths, $n/(2n + 1)$, of the width through
the masking band, both pairs of the major signal-masker combination bands ($2f_1 - f_2$ and
$3f_1 - 2f_2$) will be concentrated entirely within the limits of the combinational aggregate’s
cubic, and largest amplitude, component band on the low side of the masking band (and the lower
limits of the $3f_1 - 2f_2$ pair will coincide). This will be true for any bandwidth. However, for
subcritical bands, the cubic band of the combinational aggregate will not yet have fully filled out
the 1.25 mm distance below the lower band-limit (the low-frequency slope of the audiogram), where
it contributes its masking effects on signals (and on signal-masker generated combination bands
formed by a signal advancing in frequency past the foot of the audiogram and towards the masking
band).

Let us assume the band has now been widened to critical width. The top panel of Fig. 8 illustrates
the combinational aggregate and the double set of signal-masker combination bands generated when a
signal is located 2/5ths of the way through a critical band of noise, indicated in Fig. 8 as
$S_L$. As noted above, the aggregate itself will have now become as effective as a masker on the
apical slope as it can, since by the time the masking band has reached critical bandwidth
(corresponding to about 1 mm), the cubic (largest amplitude) component-band of the aggregate will
fill the 1.25 mm interval below the band’s lower cut-off. Thus, the signal and the signal-masker
combination bands would, for near-central signal frequencies, be jointly as hard to hear as they
could be, i.e. maximally masked, because they are concentrated in the central region of the masker
band and its combinational aggregate, with no significant components extending beyond the latter.
So long as the power of components is
summed at any given locus with increasing bandwidth, the signal threshold at that locus rises with
bandwidth. Noise would otherwise dominate the combined stimulus (of masker and signal) within the
gain control centered at that locus and would progressively constitute a larger proportion of the
total power relative to the signal as bandwidth increases. This the signal can only redress by
being increased in level. By critical bandwidth further widening becomes irrelevant.

However, if the signal is moved to a higher frequency within the band, one set of signal-masker
combination bands, $(n + 1)f_{l\text{-}s} - nf_s$, would at once begin to extend apically out of
the masking reach of the combinational aggregate, as the signal moved upward from the central
frequency. Although the signal of rising frequency, as it exited the band and the successive gain
controls centered at frequencies within the band, would become a relatively larger contributor to
the gain
[Fig. 8 graphic: top, middle, and bottom graphs; abscissa: Cochlear frequency scale in Hz (ticks at
100, 1000, and 10000). Legend: combination bands generated by signal and lower masker components;
combination bands generated by signal and higher masker components; combinational aggregate
generated by masking band. Annotations: Critical width masking band produces peaked masked
audiogram. Increasing masking band by Δf produces broader masked audiogram; also has result that
lower limit of ONE signal-masker combination band decreases. $S_H = S_L + \Delta f$. Shifting
signal higher by Δf shifts lower limit of OTHER signal-masker combination band to same lower limit
frequency. Between $S_L$ and $S_H$ signal threshold is nearly flat. At higher frequencies ‘signal
threshold’ is determined by detection of combination bands in region below masker.]
Fig. 8. Schematic to illustrate primary and combination band distributions when a pure tone is masked by a critical or a
supercritical band of noise, based on Greenwood (1971 - Fig. 25, right panel and pp. 538-540 in text). Top graph: Signal, $S_L$,
is placed 2/5ths of the width of a critical band from its lower limit. As in Fig. 5, the combination bands formed by signal and
lower masker components $((n + 1)f_{l\text{-}s} - nf_s)$ and the combination bands formed by signal and higher masker components
$((n + 1)f_s - nf_{s\text{-}h})$ are concentrated within one critical bandwidth (and the main zone of masking influence) below the
lower limit of the masking band, and therefore also within the limits of the cubic component of the combinational aggregate
generated by the masking band. At about $S_L$ the signal’s masked threshold is maximal. At lower signal frequencies than $S_L$,
signal threshold is apparently determined by detection of neural effects of the signal itself (signal-masker generated combination
bands would tend to disappear from audibility first as signal level decreases). At higher signal frequencies than $S_L$ (to about
$1.4 f_h$) signal threshold is determined by neural events caused by signal-generated combination bands
($(n + 1)f_{l\text{-}s} - nf_s$ or, if $f_s > f_h$, $(n + 1)f_{l\text{-}h} - nf_s$); that is, the signal disappears from
audibility first as its level is lowered. Middle graph: Masking band is increased by Δf, and combination bands
$((n + 1)f_s - nf_{s\text{-}h})$ extend to lower frequencies but not as far as does the combinational aggregate,
$((n + 1)f_l - nf_h)$. Signal is not effectively any more masked by additional higher frequency primary components in the added
Δf-region, and signal-masker generated components do not become relatively more audible since their relationship to combinational
aggregate does not change. Bottom graph: Signal is raised by Δf to $S_H$ without significant change in threshold, and combination
bands $((n + 1)f_{l\text{-}s} - nf_s)$ extend into the low frequencies without expectation of changes in audibility since
combination band $(3f_{l\text{-}s} - 2f_s)$ merely reaches same lower limit as did $(3f_s - 2f_{s\text{-}h})$ when signal was at
$S_L$. But at signal frequencies higher than $S_H$ (to about $1.4 f_h$), signal threshold is, as in top graph, determined by
greater audibility and detection of signal-masker combination bands, probably chiefly those among $(2f_{l\text{-}h} - f_s)$ that
are at lower frequencies than $(2f_l - f_h)$, the lower limit of the cubic band of the combinational aggregate.
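
The band-limit arithmetic behind Fig. 8 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the band limits (2000 and 2300 Hz) and the widening Δf (100 Hz) are hypothetical values chosen for convenience, and the helper names `lower_limits` and `aligned_signal` are not from the original papers. It evaluates the lower limits of the two n-th order signal-masker combination bands, $(n + 1)f_l - nf_s$ and $(n + 1)f_s - nf_h$, shows that they coincide when the signal lies a fraction $n/(2n + 1)$ of the way through the band, and then solves $3f_l - 2f_{s(\mathrm{new})} = 3f_{s(\mathrm{initial})} - 2(f_h + \Delta f)$ for the new signal frequency.

```python
from fractions import Fraction


def lower_limits(f_l, f_h, f_s, n):
    """Lower limits of the two n-th order signal-masker combination bands
    for a signal f_s lying inside a noise band [f_l, f_h]."""
    with_lower = (n + 1) * f_l - n * f_s   # signal with masker components below it
    with_higher = (n + 1) * f_s - n * f_h  # signal with masker components above it
    return with_lower, with_higher


def aligned_signal(f_l, f_h, n):
    """Signal frequency at which the two n-th order lower limits coincide,
    i.e. a fraction n/(2n + 1) of the way through the band."""
    return f_l + Fraction(n, 2 * n + 1) * (f_h - f_l)


# Hypothetical band limits and widening, chosen only for illustration.
f_l, f_h, delta_f = Fraction(2000), Fraction(2300), Fraction(100)

# Fractions of the bandwidth at which the n = 1, 2, 3 pairs align.
fractions_through_band = {n: Fraction(n, 2 * n + 1) for n in (1, 2, 3)}

# Signal at the '2/5ths' frequency (n = 2 pair aligned).
s_low = aligned_signal(f_l, f_h, n=2)

# Widen the band by delta_f: new lower limit of the band formed with higher components.
limit_after_widening = 3 * s_low - 2 * (f_h + delta_f)

# Solve 3*f_l - 2*f_s_new = limit_after_widening for f_s_new.
s_high = (3 * f_l - limit_after_widening) / 2

print("alignment fractions:", fractions_through_band)
print("initial signal:", s_low, "Hz; raised signal:", s_high, "Hz")
print("shift equals delta_f:", s_high - s_low == delta_f)
```

Exact rational arithmetic (`Fraction`) is used so that the equality checks are not disturbed by floating-point rounding; with these assumed values the raised signal lies exactly Δf above the initial one, as the bracketed derivation in the text requires.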
control centered on itself (i.e. would become less ‘suppressed’ by the diminishing noise components
still dominating the control), the ‘signal’ threshold nevertheless does not ‘track’ the upper arm
of ‘the filter’. Rather, over a potentially appreciable range of higher frequencies the signal
ceases to be audible, as it is lowered in level, before the signal-masker combination components on
the low frequency side become inaudible. What is happening is that signal-masker combination
components, as they enter lower frequency regions, are increasing in relative level, in respect to
the combinational aggregate which remains unaltered, faster than the signal is becoming more
effective at higher frequency loci.

If instead, the signal remains fixed at the initial 2/5ths frequency and the masking band is now
widened beyond critical width (Δf added) towards higher frequencies, that set of signal-masker
combination bands formed by the signal and the masker components of higher frequency,
$(n + 1)f_{s(\mathrm{initial})} - nf_{s\text{-}h}$, would rapidly extend apically to a new lower
frequency limit (given by $(n + 1)f_{s(\mathrm{initial})} - n(f_{h(\mathrm{initial})} + \Delta f)$
to which we will allude again below). The other set formed by the signal, $S_L$ in Fig. 8, and
masker components below the signal, $(n + 1)f_{l\text{-}s} - nf_{s(\mathrm{initial})}$, would
remain unchanged. However, the now extended signal-masker combination band would not be expected to
become audible in the low frequencies, since the still more extensive combinational aggregate by
expanding also keeps ahead of it (as Fig. 6 shows that it must when the signal is within the
primary masker band). Consequently, if the ratio (of signal-masker generated combination bands over
combinational aggregate) remains the same, no reduction of signal threshold would be expected by
reason of signal-masker combination-band extension. Signal threshold would also not be expected to
be raised, since the added masker components above the former upper band-limit are already
sufficiently distant to have become negligible components in the output of the gain control
centered on the signal (and they have become less able to control it from a distance).

Note that the same estimate of critical bandwidth is operationally defined as in the various
two-tone paradigms, where the detected neural events are occurring on the apical side of the
masker. The present instance focuses attention on the consistent fact that the low-frequency slope
of the noise audiogram will reach a limiting value due to the combinational aggregate at the same
bandwidth at which the audiogram’s peak stops rising and starts to broaden. This suggests that the
most favorable loci for the detection of the signal-linked events, as in the other experiments, are
still those on the apical side of the masking stimulus, namely precisely those loci that are
contributed to by the combinational aggregate. We have already seen that combination tones or bands
excite primary neurons, and mask signals, like components of external origin. The conclusion was
also drawn in 1971 that, in acting like external tones, the combinational aggregate would exert a
masking influence on signal-masker combination tones at any loci where both exist.

If signal frequency, as shown in the bottom panel of Fig. 8, is now raised from its original
position, while the already widened band is held constant, the set of signal-masker combination
bands extended by widening, $(n + 1)f_s - nf_{s\text{-}h}$, will contract and the other set of
signal-masker combination components generated by signal and masker components below the signal in
frequency, $(n + 1)f_{l\text{-}s} - nf_s$, will extend downward in frequency. When the signal has
been raised to $f_{s(\mathrm{new})}$ ($S_H$ in Fig. 8), a frequency equal to the earlier ‘2/5ths’
frequency plus the added Δf [i.e. $f_{s(\mathrm{new})} = f_{s(\mathrm{initial})} + \Delta f$; let
$3f_l - 2f_{s(\mathrm{new})} = 3f_{s(\mathrm{initial})} - 2(f_{h(\mathrm{initial})} + \Delta f)$
and solve for $f_{s(\mathrm{new})}$], the lower limits of these newly extended bands (generated by
the signal and masking components below it) will reach the same lower frequency limits from which
the initially extended band has retreated as signal frequency was raised. Hence they will also
remain within the unchanged combinational aggregate and should remain masked. Therefore, between
these two frequencies we would not expect signal threshold to change significantly because of any
changes in the audibility of the signal-masker combination components (which should not occur),
since, although their band-limits shift about with signal frequency, they will not become free of
the joint masking influences of the masker and combinational aggregate components, which occur in
the same regions (although with a sufficient increase in bandwidth
threshold should increase because of increases in power/mm due to the distribution of frequency along the cochlea). However, as signal frequency exceeds the upper frequency of this range, the signal-masker combination components, $(n+1)f_l - nf_{s(rising)}$, will quickly extend apically past those portions of the combinational aggregate of corresponding n, to regions where they are detected in lieu of the signal as it is lowered in level.

To summarize: for a bandwidth less than or equal to critical width, one expects to obtain a relatively simple and peaked masked audiogram, with a visible combination-tone notch on the high frequency side if the band is sufficiently narrow and above the lowest levels. With a wider bandwidth, one expects a broader peak region to be seen, possessing a rather flat portion comparable in extent to the $\Delta f$ added beyond critical width, an old expectation. But the masked thresholds at higher frequencies begin to turn down for the reason, as discussed above, that higher frequency signals result in the apical extension of signal-masker combination bands beyond the masking effects of the combinational aggregate on the low frequency side of the masker where their detection consequently determines the measured masked thresholds of signals on the high frequency slope of the audiogram. Thus, in the cases of both sub- and supercritical bands, the signal-masker combination bands will be audible longer on the low side than will signals of decreasing level on the high frequency slope of the masked audiogram, where the masker is better able to dominate mechanical events and reduce the capacity of the signal to contribute to neural discharge than the masker is able to do on its low frequency side. Hence both sides of the audiogram are determined, in somewhat different ways, by the detection of events on the apical side; mapping the high frequency side of the masked audiogram does not map the high frequency side of ‘the auditory filter’, as shown long ago. That the critical width is reached (and that peak threshold ceases to increase) when the combinational aggregate fills the 1.25 mm apical distance immediately below the masking-band lower frequency limit, is interpretable as evidence that it is there that the neural signal-related events (triggered by signal and/or signal-masker combination bands in varying proportion depending on signal frequency) are being detected when signal frequencies lie between the low frequency foot of the masked audiogram and at least the peak of the audiogram. But when signals sufficiently near the upper band-limit are presented (Fig. 8), the zone of detection shifts lower (or apically) in frequency as the signal moves out of the band on its high frequency side. This continues to occur for signal frequencies up to about 1.35 times the masking band’s upper frequency limit (Greenwood, 1971, Figs. 21 and 22). This description applies at low to moderate levels and for masker bandwidths of at least 0.23 times their upper frequency limit, based on past data. It would apparently apply up to bandwidths at least as wide as 0.4 times upper limit, judging from the data to be discussed in the next section.

Implications for notched-noise (two-band) estimates of the slopes of the auditory filter and its equivalent rectangular bandwidth (ERB)

The report in 1971 that the high frequency slopes of the one-band masked audiograms were still determined by combination band detection even when a notch was not visible had clear implications relevant to later masking experiments seeking to measure the slopes and bandwidth of ‘the auditory filter’, using noise band stimuli, although space was lacking to discuss fully the implications of the diagrams which showed the distribution of combination components in masking experiments. Papers in 1972 (Greenwood, 1972a,b,c) went further in showing the magnitude and distribution of the masking exerted by combination bands below the primaries, but further papers to discuss these implications were deferred. Most succinctly now, combination components are ubiquitous. A system that has evolved with such components can be expected to have ‘utilized’ them to the full, especially as primary neural firing in response to them in regions apical to the primaries will not be expected to saturate until well after response to the primaries has saturated in the region of generation. The point that the combination components produced by a band of noise and a signal
above it may determine detection of the signal’s presence has been more frequently appreciated (in the simplistic form of contamination) than has the importance of the two corollary points that the band itself will produce a combinational aggregate, acting as a part of the masking stimulus, and that the relations between both sets of combination components will become important factors in determining the distribution and magnitude of the relevant ‘signal/masker’ ratios in the low frequency region when a signal just above the band is lowered to ‘signal’ threshold.

In notched-noise masking of a centered signal - actually simply two-band masking experiments as they have been conducted - we have a paradigm exactly like the two-tone masking experiments described in earlier sections of this paper, which depended upon the detection of combination tones on the low frequency side of the masking pair of tones (Greenwood, 1971, 1974a; Nelson, 1974, 1979). And Fig. 7 (Fig. 21 in 1971) indicates that for a single band of noise almost twice critical width the whole high frequency slope of the masked audiogram is still determined by combination band detection. Should we not therefore ‘suspect’ the involvement of combination bands in determining at least the initial fall in signal threshold as the notch is widened (consider in schematic Fig. 6 a signal just above a band of noise, as it generates (with the masker) combination bands at frequencies below the band)? Of course, as the bands on either side of the signal become more distant and the lower band eventually advances into the lowest frequencies, the signal would be expected eventually to be detected by virtue of neural events occurring within the basilar region corresponding to the now-wide notch. And it is also true that two-band masking is more complex in view of the addition by both bands of their combination bands (functional parts of the masker) and of the addition by the higher band of noise of its combinational aggregate (a further part of the masking stimulus) and its signal-masker combination bands to those of the lower band, as the upper tone did in two-tone masking.

But unless low-pass noise (of sufficient level in the lower frequencies) has replaced the lower band-pass noise, why should it be the case in pure tone masking, single narrow-to-moderate-band masking, and two-tone masking that combination tone, or band, detection in the low frequency region determines signal threshold (for a signal just above the masker) but is not at all important in two-band masking experiments? Given also that when detection of combination tones is prevented, so they may play no role, in two-tone masking, wider tone separations are needed to produce a threshold dip between the tones, should we expect when masking with two bands of noise that smaller separations should yield a dip without any involvement of combination tones, smaller even than when combination tones are allowed to play a role in two-tone experiments? In short, why should two noise bands yield somewhat better ‘frequency resolution’ than two tones, even though tacitly assumed to be unaided by ‘cues’ of the same kind known to be responsible and necessary for a similar ‘good’ performance in two-tone experiments? The point of these rhetorical questions is that the corresponding propositions are not plausible: (a) that combination tones play no significant part in determining results in two-band masking experiments although they do in one-tone, one-band, and two-tone masking experiments and (b) that smaller separations should characterize two-band masking results, even without the ‘aid’ of combination component detection, than are required in two-tone masking experiments where combination tones ‘improve’ results.

We consider further here illustrative details of the generation of the signal-masker combination components and the combinational aggregate in connection with a current example. In the notched-noise paradigm developed by Patterson (1976) (later used by others, see the ERB portion of Greenwood, 1991), a tonal signal was placed between two bands of noise of moderate width, 0.2, 0.3, or 0.4 times the signal frequencies, $f_c$, which were 0.5, 1, and 2 kHz. The lower of the two bands was considered functionally equivalent to a low-pass noise because its lowest components were too far from the signal to contribute to its masking. The signal was centered, or somewhat off-center, between the bands to accomplish two purposes. One was to ensure detection of the presence of the signal by virtue of neural events
in the part of the cochlea near the signal, by precluding the detection of events at more basal locations, which was thought to have occurred in earlier one-band experiments. The upper band was thus intended to preclude possible ‘off-frequency’ listening, restricting any residuum narrowly to the notch region around the signal. But this was seen as important chiefly to accomplish the related main purpose which was the determination of the slopes and bandwidth of the auditory filter considered to be centered around the signal, from whose skirts the lower and higher bands of noise were conceived to be progressively withdrawn by increasing their separation.

However, since single band results like those of Fig. 7 had indicated that the relevant events had been detected below the band of noise rather than above, it is appropriate to precede discussion of the notched-noise experiments by a brief consideration of those earlier results of Patterson (1974). Patterson used a paradigm employing a single band of noise to map the high frequency slope when the masking noise was moved apically away from a signal tone (like the present paper’s Fig. 7, high side) and to map the low frequency slope when the band of noise was above the signal tone and moved basally (like Fig. 7, low side). The slopes obtained were nearly the same on a linear frequency scale (which were interpreted as an almost symmetrical filter) and from these were derived similar filter widths. [These widths incidentally correspond to nearly constant distances and are about a third the width of intervals that correspond to 1 mm according to the frequency-position function (Greenwood, 1990)]. In the pure tone results of Greenwood (1971) and Smoorenburg (1972), we similarly see that, at comparable levels to Patterson’s, the low and high frequency slopes of a masked audiogram (to the low point of the high-side notch) were comparably steep. We also saw that this would not have been possible without the detection of combination tones when signals were on the immediate high frequency side of the masking tone. When detection of combination tones was prevented, the high frequency slopes were not comparable to the lower slope, but instead shallower and bridged the notch. As already noted, the high frequency slopes of bands of noise were also shown to be determined by detection of combination components, to the notch or over the whole steep portion on high frequency side, for masker bandwidths ranging from subcritical to at least about twice critical width, respectively. Thus, the same explanation applied previously would be expected to apply to Patterson’s 1974 results.

The addition of an upper band by Patterson (1976) will not restrict the projection of combination components by signal and lower masker into the low frequencies nor will the upper band plausibly play an essentially different nor necessarily more effective role on the high frequency side than played by the upper tone in two-tone masking (except for its capacity to produce remote masking when levels are high, which will thereby potentially contribute to the threshold of the signal per se when the notch is wide and the signal threshold has bottomed out). As has been noted, the upper band will contribute to the distribution of more signal-masker generated combination tones in the low frequencies (as in the two-tone paradigm in Fig. 5) and also add there the combination bands generated by both maskers themselves (simply separated portions of that combinational aggregate that would be created by a single band of noise), but it is implausible a priori that the wider spread of the latter components could entirely preclude the role played by signal-masker combination components in the other situations discussed. It would certainly be necessary that these complications (added masker-combination bands) be shown to preclude completely in the two-band case the demonstrated role played by the detection of signal-masker combination components in two-tone and one-band experiments before firm conclusions could be drawn that the signal is detected at a cochlear locus between the two bands. (The Appendix addresses other aspects of the notched-noise paradigm, on the assumption that low-pass noise were to be used).

The basic objections to the concept of the two-band paradigm and purpose have been stated. Specific examples will illustrate why a low-pass noise should be used on the low frequency side of the signal rather than band-pass noise, where signal-masker generated combination bands may
escape the masking influence of the combinational aggregate. If bandwidth is 0.3 times $f_c$, then for bands symmetric about 1 kHz and not yet separated, the lower limit of the lower band will be at 700 Hz, only 2.15 mm away from the position of 1 kHz and about 12 mm from the apex. Components at 700 Hz will not, of course, add significantly to components near 1 kHz in masking the 1 kHz signal itself, but that is not the issue. A wide expanse of apical cochlea remains open for signal-masker combination components to occupy. Of course, initially the main combination components produced by masker and signal will be concentrated within the limits of the combinational aggregate formed by the two continuous (not yet separated) masker bands, as in Fig. 8. This will cease to be true when the bands are separated, at which time the combinational aggregate created by the masker bands will develop regions of lesser density and/or gaps below the lower masker. Consider that the total combinational aggregate when two bands are used can be regarded as composed of the individual combinational aggregates generated by each masking band and of the combination bands generated by the two bands jointly. A plot of their distributions will reveal that the sum of the two is less dense in some regions than the combinational aggregate created by a single band of the same overall width (i.e. that lacks the notch) and can exhibit gaps, below the lower band, in the distribution of components of a given n, gaps whose presence, locus, and width depend on masker bandwidths and notch width. This occurs since the primary components that are lacking in the notch region do not supply their contribution of combination components to the aggregate. In these low-density or gap regions of the aggregate on the low frequency side of the lower masker the signal-masker combination bands will be subject to less masking by the aggregate. See Fig. 23 in Greenwood (1971) for a similar effect, on component distribution, owing to a ‘notch’ introduced between a masking band of noise and a masking tone fixed at the band’s initial lower limit, which is shifted away from the tone while keeping the noise upper limit fixed. As the notch is introduced, signal threshold below the masking tone immediately drops. In that figure the masking tone is in principle the lower band (though with 0 Hz bandwidth) and the distribution of combination components is consequently simplified; and it is an external signal on the low side of the masker whose threshold is affected, but the audibility of a signal-masker combination band in the same region, formed when the signal is on the high side of the band, will be similarly affected.

Let us assume initially and instead that we use only a single lower band. For example, a single band of masking noise 300 Hz wide whose upper limit at 900 Hz falls 100 Hz below a 1 kHz tone will have a lower limit at 600 Hz. Its own band of cubic combination components - a part of the aggregate which must be conceived as part of the masker - will stretch from its upper limit down to 300 Hz, a frequency about 3.6 mm below 600 Hz and about 7.5 mm from the apex. However, the band of cubic combination tones created by the lower noise and the 1000 Hz signal will stretch from 800 Hz down to 200 Hz - to a point about 5.7 mm from the apex, which is 100 Hz and 1.75 mm further than the $2f_1 - f_2$ aggregate band that extends, and constitutes part of, the lower noise masker. Every downward shift of the masker, or increase of the signal frequency above the masker, will increase this disparity until the signal-masker cubic combination bands reach zero frequency. At higher levels, the masking band itself will produce the $f_2 - f_1$ combination band (an even-order member of the aggregate) at significant levels from 0 to 300 Hz (which will function as a separated masker), but the signal and the band will produce an $f_2 - f_1$ band from 100 to 400 Hz, 100 Hz further into the higher frequencies. If we now add the upper masking band, a signal-upper-masker combination band, $2f_1 - f_2$, will coincide with the lower masking band and probably be unnoticeable at low signal levels, but the signal-masker combination band, $3f_1 - 2f_2$, produced by the signal and upper band will lie between 800 and 200 Hz, exactly coinciding with the $2f_1 - f_2$ band produced by the signal and lower band. From 600 to 200 Hz, especially from 300 to 200 Hz (1.75 mm), these two coincident signal-masker combination bands, with n’s of 1 and 2, respectively, will have to contend with the lesser masking influence of an aggregate band of lesser density than if the notch in the masker were absent.
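The arithmetic of the single lower band example can be checked with a short sketch. It assumes a human frequency-position function of the form F = A(10^(ax) - k), with A = 165.4 Hz, a = 0.06 per mm, k = 1, and x in mm from the apex. These constants are not stated in this section and are adopted here only because they reproduce the distances quoted. The upper band of 1100 to 1400 Hz is likewise an assumption (the symmetric counterpart of the 600 to 900 Hz lower band), and all function names are illustrative.

```python
import math

# Assumed constants (not given in this section): human frequency-position
# function F = A * (10**(a * x) - k), x in mm from the apex, with k = 1.
A_HZ, SLOPE_PER_MM, K = 165.4, 0.06, 1.0


def position_mm(freq_hz):
    """Distance from the apex (mm) of the place tuned to freq_hz."""
    return math.log10(freq_hz / A_HZ + K) / SLOPE_PER_MM


def combination_band(f1_band, f2_band, n=1):
    """Limits (low, high) in Hz of the (n+1)*f1 - n*f2 combination band.

    f1_band must lie wholly below f2_band; a tone is a band of zero width.
    Limits that would fall below 0 Hz are clipped to 0.
    """
    lo1, hi1 = f1_band
    lo2, hi2 = f2_band
    if hi1 > lo2:
        raise ValueError("f1_band must lie below f2_band")
    low = max((n + 1) * lo1 - n * hi2, 0)
    high = max((n + 1) * hi1 - n * lo2, 0)
    return low, high


def aggregate_lower_limit(band, n=1):
    """Lowest frequency (Hz) of the band's own (n+1)*f1 - n*f2 aggregate."""
    lo, hi = band
    return max((n + 1) * lo - n * hi, 0)


lower_band = (600, 900)      # Hz, single lower masker of the example
signal = (1000, 1000)        # Hz, tone treated as a zero-width band
upper_band = (1100, 1400)    # Hz, assumed symmetric upper masker

print(aggregate_lower_limit(lower_band))              # masker's own cubic limit
print(combination_band(lower_band, signal))           # signal-masker cubic band
print(combination_band(signal, upper_band, n=2))      # 3f1 - 2f2 with upper band
print(round(position_mm(300) - position_mm(200), 2))  # extra apical reach, mm
```

Under these assumptions the signal-masker cubic band reaches 100 Hz, and about 1.75 mm, further apically than the masker's own cubic aggregate, in line with the figures quoted in the text.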
We will not complicate what follows with inclusion of the even-order combination bands, but they have been shown to have a strong and equal role at higher masker levels in determining signal threshold in pure tone masking (Krammer and Greenwood, 1973; confirming and extending Wegel and Lane, 1924). *

* Such examples as above are more striking for an $f_c$ of 8 kHz and a lower noise bandwidth of $0.4f_c$, or 3.2 kHz, extending from 7.2 kHz to 4 kHz, or a distance of only 4.1 mm (see Shailer et al., 1990). The cubic component of the band’s combinational aggregate will extend apically for 10.6 mm from 4 kHz to 0.8 kHz - a point about 12.8 mm from the apex. But the signal at 8 kHz, the same log interval above 7.2 kHz (previous example’s ratio of 1.11), yields a signal-masker cubic combination band extending from 6.4 kHz to 0 Hz at the apex, thus to a point 12.8 mm more apical than the cubic portion of the combinational aggregate, ending at 0.8 kHz (see Fig. 6). The upper band results in other signal-masker combination bands and combinational aggregates in this region. The 8 kHz signal and band components from 6 to 5.67 kHz yield $2f_1 - f_2$ components of 4 to 3.34 kHz, 0 to 1.25 mm below the band. Will all signal-masker combination components be masked before the signal is?

Clearly, the frequency ranges of the signal-masker generated combination bands are such that parts of the bands are in a position (somewhere below 0.8 kHz) to become audible (above a diminished combinational aggregate), given a sufficiently high signal level. Will this signal level be reached before the signal itself becomes audible on the high frequency side of the band of noise? The answer was ‘yes’ in preceding paragraphs and in 1971 when it was shown, in the case of both pure tone and narrow band maskers, that it applied to signals located in the masker-notch interval, or on the entire falling high frequency slope of the masked audiogram when noise bandwidth reached and exceeded critical width (up to at least a factor of about two). If the masker band in the single band example above is broadened 50 Hz further in the low frequency direction, its own cubic combination band in the aggregate now extends to 200 Hz, but the signal-masker cubic combination band extends to 100 Hz. Is signal threshold above the band expected to change? If it does not, can one conclude that the signal-masker combination components were not responsible for signal threshold in the first instance? or that they are still responsible? To have been sure of the first answer, comparisons of signal thresholds on the high frequency side would have had to be made as narrow bands were progressively widened from sub- to very wide super-critical widths into the low frequencies, as suggested in Greenwood (1971). Unless one uses a low-pass noise, and preferably one that is low-pass (flat) within the cochlea, the second answer is the more probable since it has experimental support so far as that support has been carried. Certainly one cannot a priori reject the second answer in the two-band paradigm given its prior support in the one-band case, and we will consider further the reasons and evidence favoring the second answer given in 1971 and above.

It is true that the intracochlear amplitudes of combination tones produced either by a noise band itself or generated by band and signal can be expected to decrease at wider primary separations, hence at lower combination tone frequencies but not as steeply as cancellation tone amplitudes diminish in studies of combination tones (Greenwood, 1977b - comments based in part on neural data of Smoorenburg et al., 1976). In the case of a single masking band, the basilar region where the detection of signal-masker combination tones may determine the thresholds of signals above the band, lies below the band’s lower cut-off and quite likely below the lower cut-off of the cubic (higher level) portion of the band’s combinational aggregate (where the signal-masker cubic components still extend). As noted above, when the upper masking band is introduced, regions of lesser combinational aggregate density, by reason of the absence of masker components in the notch, will occur in the now extended aggregate. The result is that signal-masker combination bands in the frequency region below the masker will continue to compete (after the addition of an upper band) with a less effective masking stimulus in a potentially appreciable portion of that lower region. Imagine, at a given moderate separation of a masker band and signal, that the signal is introduced at a level sufficient to produce a distribution of signal-masker combination components that is at a higher level than the masking band’s own combinational aggregate over
a considerable frequency region. The signal-masker combination components of a given n and order will presumably have an intracochlear spectrum in the low frequencies of similar slope to that of their counterpart components in the combinational aggregate (Greenwood, 1971, 1972b).

Consequently, the ratio of levels of signal-masker combination components over masker combinational aggregate components of the same n and order may be similar over considerable portions of the low frequency region even as the levels themselves, in numerator and denominator, may vary together with frequency (although recall that any variation in density with frequency as a result of using a notched masker will affect the local level of the denominator). Also ratios will remain similar in adjacent low frequency regions although they simultaneously diminish with reductions in signal level towards ‘signal’ threshold (that reduce the level of the signal-masker generated bands in the numerator of the ratio). Where the cubic band (n = 1) generated by the signal and masker extends beyond, or into a gap in, the cubic aggregate band generated by the masker alone, it will have an advantage (being a higher-level type band) over a remaining aggregate band of higher n (in that frequency region) such as the $3f_1 - 2f_2$ band (n = 2). Widening masker bandwidth downward in frequency without changing signal frequency will, it is true, cause the masker itself to encroach on both signal-masker and aggregate distributions of combination components, but if the ratios of the two distributions in the frequency region that the masker’s lower limit has not yet overridden remain essentially unchanged and similar, as presumed above, then the detection of some signal-masker generated combination components in lieu of the signal may continue to occur at the same signal level at low frequency locations. It is the ratio of signal-masker combination components to the masker’s combinational-aggregate components, wherever the components in the numerator still exist and the local ratio is highest, that is in a position to determine the level required to detect signals above the masking band when the signal itself is not yet capable of producing directly signal-linked neural events that are large enough to be detected (Greenwood, 1971, Fig. 25 and p. 540). In short, the suggestion is that this ratio may be comparably high over a low frequency region of some width and that only a part of the region may be sufficient to permit this ratio to determine ‘signal’ threshold when other parts of the region are encroached upon by the masker. Moreover, a given ratio of signal-masker combination components to masker-generated combinational aggregate may be more effective in determining detection where primary neuron firing is less saturated, a condition satisfied near an extremum of the active neural population. *

Thus the data and details of the last few paragraphs and associated footnotes provide further support for the determination of the upper slope of a noise band’s masked audiogram by virtue of combination tone detection below it. As noted at the outset, the upper band added by Patterson (1976) will not preclude the projection of combination components by signal and lower masker into the low frequencies nor plausibly play an essentially different or greater role on the high frequency side than played by the upper tone in two-tone masking. Since the higher masker band cannot ‘repeal’ the signal interactions with the lower masker that create low frequency combination tones, then only if it could ensure that the additional combinational aggregate (consisting of the combination bands created by both maskers and its own combinational aggregate) totally masked them (as a single masker’s combinational aggregate does not), could it ensure that the thresholds of a signal midway between the two maskers were determined by neural events at that locus within the cochlea. In short, the weight of evidence since 1971 has indicated that thresholds at signal frequencies just above the lower of two bands of noise are determined, just as in one-tone, one-band, and two-tone masking experiments, by events detected on the low side of the lower band.

Final comments on the auditory filter in relation to the two-band (notched-noise) data and the ERB estimates

In respect to the expected asymmetry of influence of low-pass and high-pass noise upon a signal between them, it has long been known - from the extensive neurophysiological data on suppression (Galambos and Davis, 1944; Greenwood and Maruyama, 1965; Greenwood and Goldberg, 1970; Sachs and Kiang, 1968; Sachs, 1969; Kiang and Moxon, 1974; Javel et al., 1978; and many other papers) and from the psychoacoustic evidence that combination tones determine the slope of signal thresholds immediately on the high frequency side of a tonal masker and, more extensively, above a noise band masker (Wegel and Lane, 1924; Greenwood, 1971; Zwicker, 1968) - that the influence of a tone in the cochlea is significantly asymmetric, even at relatively low levels. The weight of evidence from earliest days has thus been that the auditory filter, so far as masking of the signal per se is concerned, must be both significantly asymmetric, and asymmetric with differing characteristics from the neurophysiological excitation that the masker produces by itself. Any apparent evidence of symmetry in a more-than-one-component situation is thus itself evidence that in the interpretation of the data ‘hidden’ or forgotten factors must be found to reconcile the data, and the appearance of symmetry, to the known asymmetries. These include the facts of suppression and the role of combination component detection. These comments are in no way intended to suggest that the two-band data are not reliable and of considerable value either in support of the interpretation as offered here, or in eventual support, as part of the data archive, of some other amended interpretation that will also have to relate the data to well established evidence of asymmetry.

* Further support for the detectability of combination components in lieu of signals is found not only in experiments which focus on masking by an external masker, but in those which focus on masking by combination bands (Greenwood, 1972b,c). In general, a combination band that is low in level but sufficient to mask external signals may well be sufficient in level to be detected as well when the higher frequency primary is not, a primary that we may alternatively regard as a signal. Note again that in Greenwood (1971, Figs. 21 and 22) bands that were in width 0.23 times their upper limit produced masked audiograms whose high frequency slopes were determined by detection of combination bands for signals up to a ratio of 1.3, or more, times band upper limit (i.e. over the whole steep section, which is complete by about 1.4). The bandwidths used by Patterson (1976) at 0.5 and 1 kHz (0.2 and 0.3 times $f_c$) were nearly the same widths relative to the signal as were the bands cited above and also in Greenwood (1972b, Fig. 7) in relation to a signal at 1.35 kHz (i.e. $0.26 f_c$), which will be used in examples of combination band masking below. In Fig. 7 of Greenwood (1972b) note also that two primaries (the lower a tone of 1 kHz and the upper a narrow band of noise) at a ratio of 1.35 can produce a combination band at a frequency ratio of 0.65 that is capable of producing about 4 and 10 dB of masking when the upper primary is 40 and 30 dB less, respectively, than the lower primary (of moderate level at about 60 dB SL). If instead we regarded the upper primary as a signal, the combination components that were shown to mask other signals at 650 Hz may also be expected to be detectable there, potentially in lieu of the upper primary at 1.35 kHz. This conclusion is also indicated directly by Smoorenburg’s (1972, Figs. 1 and 3) determinations of the audibility of combination tones at various ratios and, absolute and relative, primary levels. If instead the upper primary were a tone, $f_c$, at 1.35 kHz and the lower primary were a noise band which was widened downward from 1 kHz to a bandwidth 0.35 times its upper limit (or $0.26 f_c$), its lower limit would lie at 650 Hz and the cubic combination band within its aggregate would extend to 300 Hz. Likewise, the upper limit of the $2f_1 - f_2$ combination band generated by the two primaries would still fall at 650 Hz while the band extended downward to 0 Hz; the combination component of this band falling at 300 Hz would be generated by the signal at 1.35 kHz and a component of the noise primary at 0.825 kHz. Although the lower primary would now have a greater masking effect near and above 650 Hz, and although signal-masker combination components of lower frequency would diminish in amplitude, so also would the components of the masker’s combinational aggregate, the only part of the total masking stimulus available in their region to mask them. If now the upper primary were again regarded as a ‘signal’ and adjusted in level, some signal-masker combination components lower than 650 Hz might still be capable of determining signal threshold even when the upper primary was at 1.35 kHz. This would be approximately consistent with Figs. 21 or 22 in 1971 where, for signals at about 1.3 times the upper-masker-limit, the ‘signal’ thresholds (determined by signal-masker combination bands) were only about 30 dB less than for signals at band-center and were at or nearing the bottom of the falling curve. At higher levels there will be in general a considerable expansion of the lower frequency range over which combination tones will be audible
(Greenwood, 1972b,c), with an increasing role played by
Moreover, all the considerations above do not
even-order combination tones (Greenwood, 1971; Krammer necessarily affect adversely the comparisons of
and Greenwood, 1973). the ERBs (estimated in the two-band paradigm)
among themselves (in different parts of the frequency range) or the comparisons made in the preceding paper (Greenwood, 1991) with equal distances along the cochlear partition. On the contrary. The bandwidths are all obtained from the application of the same procedures and mathematical treatment in different parts of the frequency range and may well be sufficiently comparable over the frequency range to inspire reasonable confidence in their relative values, whatever meaning is attached to those values. So far as the comparison made here to equal distances is concerned, the main comment above has simply been that owing to omission of a low-pass noise the detection of a signal’s presence by virtue of signal-masker combination bands was not ruled out, and has in fact been indicated by substantial data (Greenwood, 1971, 1972a,b; 1974a; Nelson, 1974, 1979). The detection of such bands, instead of the signal, at a more apical cochlear locus would have two consequences. One would be that other problems described in the Appendix that would affect true notched-noise experiments are deprived of most of their importance in the present instance since a low-pass noise was not used. The second more important consequence is that, if the signal-masker combination components do play the role described above, the consistency of the data, in their functional relation to distance or center frequency, with much of the other data considered here would thus be expected, as were Patterson’s (1974) single band results.

Thus the import of the comments above, arguing the probable determination of the initial slope of threshold reduction by the detection of combination components, is to link the ERB results to the same deus ex machina, that is, to the same fundamental determinants that operate to account for the results obtained in the other experiments. If the two-band experiments were included, combination tones, partial suppression of relevant signal components lying higher in frequency than other (usually higher level) components, and the detection - on the low frequency side of the dominant components - of the salient neural events that determine the outcome of the operational definitions we employ would characterize all the experiments considered.

Summary

Significance of the relation of critical bandwidth estimates to non-linear phenomena

To sum up, in several types of critical-band experiments indicating that very comparable frequency intervals are critical to the observation of one or another ‘inflection’, ‘dip’, ‘notch’, ‘break point’, or ‘shoulder’ in the plot of a given dependent variable versus frequency separation, we see that a crucial combination-tone contribution is involved in accounting for the phenomena and the occurrence of the various criterial changes at comparable frequency separations. All of these experiments imply the occurrence, at the criterial separations, of certain changes in the detection of the relevant neural events on the apical side of the dominant primary component. However, though it seems clear that combination tones are essential to the explanation of these experimental outcomes, the explanation, of course, is incomplete without consideration of why they may be detected when a primary component is not. Nevertheless, the direction of an explanation is indicated, and it is not adequately conceived in these experiments as the specification of the bandwidth of the auditory filter, nor properly fixated only on the signal and its frequency locus.

The experiments above hinge on the detection of certain neural events. Why these events and why their locus? It is also clear that all of these operational definitions of critical bandwidth depend on the existence of cochlear nonlinearity and will change with its reduction or elimination. The nonlinearity not only involves the creation of combination tones, which then play their roles in the detection of the presence of signals or contribute to the detection of dissonance and its gradation into consonance, but it produces the phenomenon of ‘suppression’ of one primary component by the other, that, in the contexts considered, appears to be the agent that shifts the ‘responsibility’ for the production of the detected neural events from signal to combination tone. Suppression in these experiments, conceived as the consequence of a gain control (Greenwood, 1986a,b,c; 1988) operating at each cochlear point, contributes to the asymmetry in
the relative capacities of the two primaries to stimulate primary nerve fibers and thus contributes to the conditions under which the combination tones play important roles in the determination of the critical bandwidths that are empirically defined. The same critical separations that place the primaries and combination tones at certain distances that are found empirically to be psychoacoustically and physiologically significant in producing certain events will also be among the determinants (with absolute frequency and relative and absolute levels) of whatever mutual suppressive influences are simultaneously being exerted by the primaries on each other (through their differential control of the gain control).

In all these comparisons, the numbers of 1 and 1.25 mm have been used, but not to maintain an illusory precision nor an exact identity in various experiments that in fact may differ somewhat in the critical bandwidths that have emerged. Rather, and although the bandwidths reported did indeed correspond to approximately these same distances in different experiments, we note again that these numbers express the distances subtended by the same frequency interval extending in both basal and apical directions from a given point on the cochlear map (Greenwood, 1974b). Thus, considering once more the simplest masking experiment discussed above, that is, one masker and one signal, the same critical frequency interval, or Δf, reflected in both directions from the masking tone - to the approximate foot of the masked audiogram on the low side and to the low point of the notch on the high - corresponds to 1.25 mm on the apical side and 1.065 mm on the basal side. At the outset, the apical distance was associated with the dimensions of the apical segment of the displacement envelope and the basal distance with the distance from the envelope peak to the more basal point at which the cumulative phase curve inflects and takes a steeper course. Figs. 1 to 3 indicated, between these two points on either side of the envelope peak, that phase accumulates more rapidly than at more basal locations and that the input-output amplitude functions are markedly non-linear (Greenwood, 1974b, 1980, using the data of Rhode, 1971, 1978). Precisely because of this nonlinearity and the associated shift of the position of maximum amplitude towards the base, these two distances from the peak are expected not to be constant. But they may approximately characterize the non-linear peak region from quasi-linear lower levels to moderate levels.

Thus, a frequency interval approximately twice critical bandwidth in man, and centered about a given tone frequency, is envisaged to correspond to the frequency interval that in the animal preparations subtends the region of major nonlinearity centered about the displacement envelope peak. It extends from the apical foot of the envelope to the point basal to the peak at which input-output functions become essentially almost linear and at which the spatial phase curve reaches the relatively straight portion of lesser slope basal to its inflection point. New information indicates that at higher stimulus levels the basal portion of the non-linear input-output region of the displacement envelope may extend a distance increasing to an octave above the CF of a given cochlear point (Robles, Ruggero, and Rich, 1990).

These combined views - that the critical bandwidth estimates obtained in several experiments are determined (a) by the existence and dimensions of the zone of mechanical nonlinearity, (b) by the generation of combination tones that then play important roles as part of the intracochlear spectrum, and (c) by a gain control producing the asymmetrical ‘suppression’ (Greenwood, 1986a,b,c; 1988) that helps to cast combination tones in the role we have seen they play - immediately suggest some consequences. When or if cochlear nonlinearity is reduced or eliminated, as by pathologies, or is ‘modulated’, as potentially it may be by the efferent system or by experimental interference with the efferent system (Pickles and Comis, 1973; Pickles, 1976; Mountain, 1980; Siegel and Kim, 1982) then critical bandwidth, at least in the pathologies, may be best described not as ‘enlarged’ but, more accurately, as ‘abolished’, since in the ‘linearized’ cochlea combination tones may be diminished to the point of effective non-existence (hence unable to play their role in the discriminations and experiments providing the operational measures of critical bandwidth) and the suppression mechanism that governs relations between the primaries may be expected to be
significantly reduced in degree and spatial dimensions. Then, although the experiments may be performed in the same way, they would no longer lead to the measures of bandwidth with the same significance, the new estimates being attributable to a reduced set of factors that would remain to operate in a linear, rather than a nonlinear, cochlea, requiring subjects to make discriminations on the basis both of different criteria and of events originating from different cochlear loci.

A further point is that although it appears clear that combination tones are instrumental in the estimation of ‘critical bandwidths’ and indicate in different experiments a comparable interval required to alter interference to some criterial point indicated on plots of the dependent variable, all estimates are by definition made in situations of two or more interacting simultaneous components. The ‘measured’ bandwidths therefore may indeed indicate near constancy in the dimensions of envelope nonlinear regions and of the spatial relations along the cochlea between the separate components and the events associated with them, but they do not necessarily specify, independent of the interaction, a ‘bandwidth dimension’ of the neural response to a single component. The presumed zone of neural effect of a pure tone masker seen in central masking (Zwislocki et al., 1968) is more restricted on both sides of the masker’s maximum effect than in monaural masking. If non-simultaneous stimulus paradigms also occur to the reader, it must be recalled that some are non-simultaneous only in respect to one of three stimuli and remain measures of interaction of the other two (Greenwood et al., 1976). However, the forward masking by a pure tone, just as in central masking, also indicates more restricted effects on both sides of the masker than in simultaneous masking. Plomp has made a similar point in discussing the differences between Houtgast’s (1977) estimates of effective bandwidth in simultaneous and pulsation threshold experiments using in each case a noise masker with the same sinusoidally shaped spectrum (Plomp, 1976).

Final comments on psychophysically significant bandwidths that may not correspond to equal distances

An example of a body of data in which one or more non-spatial factors may combine with factors more directly linked to distance is critical ratio measurements. Critical ratio measurements, among the ‘indirect’ measures long known not to be exactly proportional (for whatever reasons) to the classical critical band curve in the lower frequencies (Sharf, 1970), correspond much less closely to a constant distance in that region. Their departure from the ‘classical’ critical band data is greater than from the curve, which itself deviates from the classical data in the same direction. Critical ratio is also divergent from equal distances in the higher frequency regions, judging from existing published estimates of critical ratio and its approximate parallelism there to the ‘classical’ critical band curve, which is steeper (Greenwood, 1991, Figs. 10 to 13) than the actual classical critical band data above 3 kHz. The factors responsible for the differences of critical ratio from a constant distance remain to be fully accounted for and may well differ considerably in low and high frequency regions.

However, when varying noise bandwidths, to the extent that the fluctuations of narrow bands, as well as their power, may influence a given response measure, different response measures may be differentially affected as bandwidth varies. They may thus indicate different critical widths as a function of center frequency, especially when the low frequency region is entered. In such experiments where the critical ratio is measured, the power within a critical band has long been believed not to be the only factor contributing to threshold (Hamilton, 1957; Zwicker et al., 1957; Greenwood, 1961a,b; de Boer, 1962; Bos and de Boer, 1966).

The ‘salience’ of a signal (Hamilton, 1957) presented in noise diminishes as bandwidth decreases in band narrowing experiments. This is especially noticeable when bandwidth is narrower than critical and was one of the cited factors that determined the choice of a 3/second signal - to provide an identifying signal characteristic - in a series of masking experiments (Greenwood, 1961a,b; 1971). To detect a signal in noise, not only must the signal induce a change that is at least comparable in magnitude to those among the ongoing fluctuations of intensity or quality of the noise that are sufficient to be heard, moment by moment, as noise fluctuations by a listener
when no signal is present, but the signal-induced changes must be identified as such, that is, discriminated from the other ongoing changes that are also large enough to register as perceived changes. One effect of presenting a 3/s signal may be to label it with a periodicity cue as well as to increase the probability of its detection owing to multiple presentations. This was the impression motivating this experimenter in the use of a 3/s signal in 1961 (Greenwood, 1961a). It was also the spontaneously volunteered introspection of one of his more perspicacious (and still naive) subjects that the periodicity of the signal was what enabled him to distinguish the signal from the narrower noises, especially certain signals he designated (i.e. position in the session), which the subject did not know were signals nearest the lower limit of the noise band (and that he said most closely matched its pitch). If this view as to periodicity is correct, such labeling may in some conditions permit the effects of changes in band power per se, integrated within a critical distance, to be seen in better isolation by reducing the effects of salience loss as the band comes increasingly to possess the temporal and perceptual characteristics of an irregular tonal signal. Since frequency intervals corresponding to a given cochlear distance become more narrow in the low frequencies (apical cochlea), temporal factors may increasingly combine with the effects of power to influence threshold in full-spectrum noise experiments as signal frequency decreases, as well as in band-limiting experiments at a given signal frequency.

Again with reference to bandwidth and distance, it should also be noted that still other psychophysically significant bandwidths that may also seem to correspond to constant distances, such as, for example, the Q-10 bandwidths of forward or central (Q-3?) masking audiograms, will be initially dependent on the same basic ‘geometry’ of envelope shape and frequency-vs-place but not accounted for by the same set or interrelation of factors. In forward or central masking, respectively, tones do not interact simultaneously or in the periphery; neither mechanical signal suppression nor combination tones are playing a role. The bandwidths may instead be in some other way related to the same basic cochlear ‘geometry’ that serves as the sub-stratum for the nonlinear interactions that appear to be involved in the other particular experiments singled out for discussion earlier. In short, other experimental bandwidths may reflect the same cochlear spatial properties, but perhaps different distances, because related in different ways to the dimensions of cochlear events and neural excitation patterns. Moreover, still other bandwidths may not obviously or directly reflect a relation to cochlear distance if non-spatial factors are relatively more important among their determinants.

Appendix

Further difficulties with the notch-noise masking paradigm, as presently conceived, would remain even if low-pass noise replaced the lower band, which we will assume here has been done.

The combinational aggregate of the higher band of noise

The high-pass noise used in these experiments, even if steep enough that the slope of the primary band components does not itself determine the low frequency slope of the masking that it will produce, nevertheless will generate a combinational aggregate that will actually determine the falling slope of the intracochlear high-pass noise and will extend as well beyond the low frequency foot of the noise and into the notch between the two noises. It will contribute eventually, of course, to the ‘remote’ masking of the signal at sufficient masker levels and notch widths. But it is more important to note that, at any levels, the masking contributed on the low-frequency slope of the high-pass section of the noise would be determined by the odd order components of the combinational aggregate if the high-pass section were presented alone and would determine the foot section of the falling signal threshold curve as the high-pass section was moved basally. Therefore, the slope measured would presumably not be acceptable as the auditory filter slope that is sought, since, in principle, bands of the combinational aggregate (part of the masker) that influence the obtained results have no different status as masking components than do the primary masking components. The reason a very steep
band of noise is used in the first place is to preclude the skirt of the noise from determining the outcome; that is, it is to justify the characterization of the noise as a step function (Patterson, 1974). Seekers of the filter have in mind to estimate a slope unconfounded by the characteristics of the calibrating stimulus. The combinational aggregate of the high-pass noise would constitute a skirt and, in terms of the paradigm, the aggregate would be expected to widen ERB estimates relative to those expected when using a step-function noise, if such were possible. Of course, the ERB estimates referred to here are those that would be obtained if a low-pass section of noise were to replace the lower band-pass noise. Of course, if that were done, the influence of the low-pass noise would also be present at the same time, and could be expected, from all that is known about the asymmetrical growth of suppression, to be a marked function of level, so that the masking of the signal would not, at all separations, be necessarily nor probably attributable only to the high-pass section and its combinational aggregate.

This fact that the low-pass section’s influence would be combined to an unknown degree with that of the high-pass section presents some possible difficulties of a different sort. When the notch is widened arithmetically, the low-pass cut-off frequency will recede from the signal at a different rate in millimeters than the high-pass cut-off because frequency is laid out approximately logarithmically along the cochlea. Thus, if the masking effects of the noise above and below the signal change, as expected, more nearly as a function of physical distance than linear frequency, the balance of the influences will change as notch width increases asymmetrically in mm. This course of change will also be different at different masker levels, with an expectation that the low-pass section will have a larger share of influence and hold on to it longer with increasing notch width at higher levels, because of anticipated growing asymmetry with increasing noise level. In Fig. 9 the growing width of the notch below the signal is shown as the circles and the width of the notch above the signal as the squares, using the widths specified by Shailer et al. (1990) and calculating at a notch center frequency of 8 kHz.

Off-center signals in the notch

The advantage of widening the notch arithmetically would lie, of course, in the fact that it ‘optimally’ determines the frequency location of the combination components created by the signal and the masking noise. Let us consider the high-pass section of the noise here. It is clear that the cubic combination component generated by the signal and the noise component at the cut-off frequency of the high-pass noise will fall precisely at the cut-off frequency of the low-pass noise that is equidistant from the signal frequency, and all other odd-order combination tones of lower frequencies will be lower than the cut-off and simply fall within the low-pass noise itself (if a low-pass rather than a band-pass noise is used). However, if the signal is not at the center of the notch in all cases but is placed to one side or the other in an effort to permit either the low-pass noise or the high-pass noise to exert the greater masking influence on signal threshold (and thus, in the terms of the paradigm, to permit an optimizing shift of the auditory filter), then when the signal is closer to the high-pass section, a portion of the combination components created by the signal and high-pass section (the nearer and larger amplitude components) will fall in the notch itself just above the low-pass cut-off frequency. Whether or not this would have an effect on signal threshold is another question. Perhaps not, but to consider the possibility might be warranted. It would not do simply to say that such an eventuality would be an example of the ‘off-frequency’ listening exemplifying the shift of the filter which the arrangement is designed to promote, since the shift actually envisaged is one that optimizes the signal to noise ratio by varying the filter locus (and thus the noise within the filter skirts) relative to a signal distribution that does not change; it is not a shift that occurs because the relevant signal power concentration moves to a new locus by creation of new ‘signal’ components, namely the signal-masker combination components.
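
The placement argument for the cubic combination component can be checked with simple arithmetic. The Python sketch below is illustrative only: the function names, the 8 kHz signal, and the fractional notch widths are assumptions chosen for the example, not conditions taken from the experiments discussed. It computes the cubic combination component 2f_s - f_n produced by a signal at f_s and a higher noise component at f_n, and reports where the component generated with the high-pass cut-off falls relative to the low-pass cut-off.

```python
def cubic_combination(f_lower, f_upper):
    """Frequency (Hz) of the cubic combination component 2*f_lower - f_upper."""
    return 2.0 * f_lower - f_upper


def placement(f_signal, delta_low, delta_high, tol=1e-6):
    """Locate the cubic component made by the signal and the high-pass cut-off.

    delta_low and delta_high are the notch widths below and above the signal,
    expressed as fractions of the signal frequency (hypothetical parameters
    introduced for this sketch).
    """
    f_lp = f_signal * (1.0 - delta_low)    # low-pass cut-off frequency
    f_hp = f_signal * (1.0 + delta_high)   # high-pass cut-off frequency
    f_ct = cubic_combination(f_signal, f_hp)
    if abs(f_ct - f_lp) <= tol * f_signal:
        where = "at the low-pass cut-off"
    elif f_ct > f_lp:
        where = "in the notch"
    else:
        where = "within the low-pass noise"
    return f_ct, where


if __name__ == "__main__":
    # Centered signal, signal nearer the high-pass noise, signal nearer the
    # low-pass noise (illustrative values only).
    for d_low, d_high in [(0.2, 0.2), (0.3, 0.1), (0.1, 0.3)]:
        f_ct, where = placement(8000.0, d_low, d_high)
        print(f"below {d_low:.1f}, above {d_high:.1f}: {f_ct:.0f} Hz, {where}")
```

The same helper applies to the footnote example of a 1.35 kHz upper primary: cubic_combination(1000.0, 1350.0) and cubic_combination(825.0, 1350.0) give 650 Hz and 300 Hz, the two values quoted there.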
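
The unequal widening of the notch in millimeters can also be sketched numerically with a frequency-position function of the form F = A(10^(ax) - k). This section does not restate the constants, so the values below (A = 165.4, a = 0.06 per mm, k = 0.88, as given for man in Greenwood, 1990) are assumptions of the sketch, as are the function names, the Delta grid, and the 0.2 offset used for the off-center case; they are not the conditions of Shailer et al. (1990).

```python
import math

# Assumed constants of the human frequency-position function
# F = A * (10**(a*x) - k), with x in mm from the cochlear apex.
A_HZ = 165.4
A_PER_MM = 0.06
K = 0.88


def position_mm(freq_hz):
    """Distance from the apex (mm) of the place mapped to freq_hz."""
    return math.log10(freq_hz / A_HZ + K) / A_PER_MM


def frequency_hz(x_mm):
    """Inverse map: frequency (Hz) at x_mm from the apex."""
    return A_HZ * (10.0 ** (A_PER_MM * x_mm) - K)


def notch_widths_mm(f_signal, delta_low, delta_high):
    """Notch widths in mm below and above the signal.

    delta_low and delta_high are fractions of the signal frequency.
    """
    x_s = position_mm(f_signal)
    below = x_s - position_mm(f_signal * (1.0 - delta_low))
    above = position_mm(f_signal * (1.0 + delta_high)) - x_s
    return below, above


def basal_span_mm(f_center, apical_mm):
    """Basal distance subtended by the frequency interval that
    subtends apical_mm on the apical side of f_center."""
    x_c = position_mm(f_center)
    delta_f = f_center - frequency_hz(x_c - apical_mm)
    return position_mm(f_center + delta_f) - x_c


if __name__ == "__main__":
    f_s = 8000.0
    offset = 0.2  # hypothetical extra notch width on the upper side
    for delta in (0.05, 0.1, 0.2, 0.3, 0.4, 0.5):
        c_below, c_above = notch_widths_mm(f_s, delta, delta)
        o_below, o_above = notch_widths_mm(f_s, delta, delta + offset)
        print(f"Delta {delta:.2f}: centered {c_below:.2f}/{c_above:.2f} mm, "
              f"off-center {o_below:.2f}/{o_above:.2f} mm")
    print(f"basal span for 1.25 mm apical: {basal_span_mm(f_s, 1.25):.3f} mm")
```

Under these assumptions the centered notch is always wider in millimeters below the signal than above it. With the assumed 0.2 offset, the widths on the two sides cross between Delta = 0.3 and 0.4, at a little over 3 mm, which is in line with the reversal described for the middle panel of Fig. 9. The last line gives a basal span close to the 1.065 mm quoted in the Summary for a 1.25 mm apical span.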
[Figure 9: three panels (top, middle, bottom). Abscissa: Delta (as proportion of Fc). Ordinate: notch width in mm.]

Fig. 9. In notched-noise masking (or two-band noise masking) of a pure tone signal in the notch between the lower and upper bands of noise, when the notch is widened arithmetically, the low-pass cut-off frequency recedes in millimeters from the signal at a different rate than the high-pass cut-off because frequency is laid out approximately logarithmically in the cochlea. This differential widening of the notch on the two sides of the signal is complicated if the signal is not centered in the notch. In the panels, Delta on the abscissa designates the frequency interval between the signal frequency and the nearer noise cut-off frequency, but is normalized by dividing it by the signal frequency. Delta on the ordinate is the notch width converted to millimeters, of either the frequency interval between signal and lower noise cut-off or between the signal and upper noise cut-off, depending on the symbol used. The growing width of the notch below the signal is shown as the circles and the width of the notch above the signal as the squares, using the Delta values specified by Shailer et al. (1990) and a signal frequency at 8 kHz. Bottom panel: The signal is arithmetically centered at 8 kHz in the notch. The widths on high and low sides of the signal are initially nearly equal in millimeters but quickly diverge. Results will be similar for all cut-off frequencies above about 500 Hz. Middle panel: If the signal is initially off-center and below the arithmetic center of the notch, then notch width will start out narrower by almost 1.5 mm (Delta = 0.05 fc) on the lower side of the signal but the relative widths in millimeters on the two sides of the signal will go through a reversal when the width has reached a little more than 3 mm on each side (again calculating for a signal at 8 kHz). Top panel: If the signal is initially off-center and above the arithmetic center of the notch, then notch width starts out wider on the lower side of the signal by more than 1.5 mm and its relative width increases monotonically. The lower portion of noise progressively shifts away from the signal following a course very different from the way, in the middle panel, the higher frequency noise is initially the furthest away in distance from the signal but ultimately the closest. Of course, the relative influences of the upper and lower bands cannot be expected to be necessarily related in a simple way to the distances of their cut-off frequencies from the signal nor to be independent of stimulus level (see text), owing to mechanical-envelope, suppressive, and neural-pattern asymmetries.

In addition, there is a further problem posed by off-center signals. If the signal is initially off-center and below the arithmetic center of the notch, then notch width will start out narrower on the lower side of the signal but the relative widths in mm on the two sides of the signal will go through a reversal when it has reached a little more than 3 mm on each side (calculating in the 8 kHz region) and will in any case possess rather different slopes, that could be presumed to effect a change in the balance of influences of the low-pass and high-pass sections on the signal’s threshold. These changes in notch width are shown in the middle and top graphs of Fig. 9, which show the effects of placing the signal below the notch center (middle graph) and above the notch center (top graph). These are further considerations that would need to be taken into account in any calculations, using data obtained with off-center signals, of the hypothetical high and low frequency slopes of the auditory filter. In the case of a signal centered below notch center the purpose is to map out the influence of the low-pass noise in order to estimate one side of the hypothetical filter, but the influence of the
low-pass noise relative to the high-pass section will diminish progressively as notch widths on the two sides of the signal approach, and reach, equality. But their reversal at that point means that later in the widening of the notch the influence of the high-pass section (or of its combinational aggregate) may increase in importance, perhaps to predominance. The problem is compounded, of course, in that although the relative influences will change in this direction, they are not necessarily, nor in general likely to be, equal when notch width is equal. As masker level increases neither will the influences remain in balance at a constant position within the notch.

Acknowledgements

I would like to thank Dr. J.E. Hind and the faculty at the Department of Neurophysiology at the University of Wisconsin, where this paper was begun in 1985, for their hospitality and provision of a congenial place of work. I would like to thank Dr. Dietrich Schwarz, Marianne McCormack, and the reviewers, whose comments have improved the manuscript, and John Nicol for essential computer assistance. Work supported by NSERC, Canada.

References

Békésy, G. von (1960) Experiments in Hearing (McGraw-Hill, New York).
Bilger, R.C. and Hirsh, I.J. (1956) Masking of tones by bands of noise. J. Acoust. Soc. Am. 28, 623-630.
de Boer, E. (1962) Note on the critical bandwidth. J. Acoust. Soc. Am. 34, 985-986.
Bos, C.E. and de Boer, E. (1966) Masking and discrimination. J. Acoust. Soc. Am. 39, 708-715.
Buunen, T.J.F. (1975) Two hypotheses on monaural phase effects. Acustica, 98-105.
Buunen, T.J.F. and Bilsen, F.A. (1974) Subjective phase effects and combination tones. In: E. Zwicker and E. Terhardt (Eds.), Facts and Models in Hearing, Springer-Verlag, Berlin-Heidelberg, F.R.G., pp. 344-352.
Buunen, T.J.F., Festen, J.M., Bilsen, F.A. and v.d. Brink, G. (1974) Phase effects in a three-component signal. J. Acoust. Soc. Am. 55, 297-303.
Buunen, T.J.F., ten Kate, J.H., Raatgever, J. and Bilsen, F.A. (1976) Combined psychophysical and electrophysiological study on the role of combination tones in the perception of phase changes. J. Acoust. Soc. Am. 61, 508-519.
Cross, C.R. and Goodwin, H.M. (1893) Some considerations regarding Helmholtz’s theory of consonance. Proc. Am. Acad. Sci. New Ser. 19, 1-12.
Deatherage, B.H., Davis, H. and Eldredge, D.H. (1957) Physiological evidence for the masking of low frequencies by high. J. Acoust. Soc. Am. 29, 132-137.
Deatherage, B.H., Bilger, R.C. and Eldredge, D.H. (1957) Remote masking in selected frequency regions. J. Acoust. Soc. Am. 29, 512-514.
Egan, J.P. and Hake, H.W. (1950) On the masking pattern of a simple auditory stimulus. J. Acoust. Soc. Am. 22, 622-630.
Ehmer, R.H. (1959) Masking by tones vs. noise bands. J. Acoust. Soc. Am. 31, 1253-1256.
Fidell, S., Horonjeff, R., Teffeteller, S. and Green, D. (1983) Effective masking bandwidths at low frequencies. J. Acoust. Soc. Am. 73, 628-638.
Fletcher, H. (1940) Auditory Patterns. Rev. Mod. Phys. 12, 47-65.
Fletcher, H. (1953) Speech and Hearing in Communication. Van Nostrand, New York.
Galambos, R. and Davis, H. (1944) Inhibition of activity in single auditory nerve fibers by acoustic stimulation. J. Neurophysiol. 7, 287-303.
Goldberg, J.M. and Greenwood, D.D. (1966) Response of the neurons of the dorsal and posteroventral cochlear nuclei of the cat to acoustic stimuli of long duration. J. Neurophysiol. 29, 72-93.
Goldstein, J.L. (1967a) Auditory spectral filtering and monaural phase perception. J. Acoust. Soc. Am. 41, 458-479.
Goldstein, J.L. (1967b) Auditory nonlinearity. J. Acoust. Soc. Am. 41, 676-689.
Goldstein, J.L. (1969/70) Aural combination tones. In: R. Plomp and G. Smoorenburg (Eds.), Symposium on Frequency Analysis and Periodicity Detection in Hearing. Sijthoff, Leiden, The Netherlands, pp. 230-242.
Green, D.M. (1965) Masking with two tones. J. Acoust. Soc. Am. 37, 802-813.
Greenberg, S., Geisler, C.D. and Deng, L. (1986) Frequency selectivity of single cochlear-nerve fibers based on the temporal response pattern to two-tone signals. J. Acoust. Soc. Am. 79, 1010-1019.
Greenwood, D.D. (1961a) Auditory masking and the critical band. J. Acoust. Soc. Am. 33, 484-502.
Greenwood, D.D. (1961b) Critical bandwidth and the frequency coordinates of the basilar membrane. J. Acoust. Soc. Am. 33, 1344-1356.
Greenwood, D.D. (1962) Approximate calculation of the dimensions of traveling wave envelopes in four species. J. Acoust. Soc. Am. 34, 1364-1369.
Greenwood, D.D. (1969/70) Discussion comments. In: R. Plomp and G. Smoorenburg (Eds.), Symposium on Frequency Analysis and Periodicity Detection in Hearing. Sijthoff, Leiden, The Netherlands, pp. 436-444.
Greenwood, D.D. (1971) Aural combination tones and auditory masking. J. Acoust. Soc. Am. 50, 502-543.
Greenwood, D.D. (1972a) Masking by narrow bands of noise in proximity to more intense pure tones of higher frequencies. J. Acoust. Soc. Am. 52, 1137-1143.
Greenwood, D.D. (1972b) Masking by combination bands: estimations of the levels of the combination bands (n + 1)fl - nfh. J. Acoust. Soc. Am. 52, 1144-1154.
Greenwood, D.D. (1972c) Combination bands of even order: Masking effects and estimations of level of the difference bands (fh - fl) and 2(fh - fl). J. Acoust. Soc. Am. 52, 1155-1167.
Greenwood, D.D. (1973a) Travel time functions on the basilar membrane. J. Acoust. Soc. Am. 55, 432(A).
Greenwood, D.D. (1973b) Critical bandwidth in other species. J. Acoust. Soc. Am. 55, 432(A).
Greenwood, D.D. (1974a) Comment from the floor (see also Nelson, D.A.), Session R (Masking and Signal Processing), 88th Meeting of the Acoustical Society of America.
Greenwood, D.D. (1974b) Critical bandwidth in man and some other species. In: A.R. Moskowitz, B. Scharf and J.C. Stevens (Eds.), Sensation and Measurement: Papers in Honor of S.S. Stevens, Reidel, Dordrecht, The Netherlands, pp. 231-239.
Greenwood, D.D. (1977a) Empirical travel time functions on the basilar membrane. In: E.F. Evans and J.P. Wilson (Eds.), Psychophysics and Physiology of Hearing, Academic, New York, pp. 43-53.
Greenwood, D.D. (1977b) Discussion following Rhode. In: E.F. Evans and J.P. Wilson (Eds.), Psychophysics and Physiology of Hearing, Academic Press, New York, p. 40.
Greenwood, D.D. (1980) Empirical relations among cochlear phase data. Laboratory Report No. 2.
Greenwood, D.D. (1986a) What is synchrony suppression? J. Acoust. Soc. Am. 79, 1857-1872.
Greenwood, D.D. (1986b) Synchronization and suppression in primary auditory fibers. In: B.C.J. Moore and R.D. Patterson (Eds.), Auditory Frequency Selectivity, Plenum Publishing, New York, pp. 217-228.
Greenwood, D.D. (1986c) Two-tone phenomena: dominance, rate suppression, compression. In: Advances in Auditory Neuroscience, IUPS Satellite Symposium on Hearing, San Francisco, p. 39.
Greenwood, D.D. (1988) Cochlear nonlinearity and gain control as determinants of the response of primary auditory neurons to harmonic complexes. Hear. Res. 32, 207-253.
Greenwood, D.D. (1990) A cochlear frequency-position function for several species - 29 years later. J. Acoust. Soc. Am. 87, 2592-2605.
Greenwood, D.D. (1991) Critical bandwidth and consonance in relation to cochlear frequency-position coordinates. Hear. Res. 54, 164-208.
Greenwood, D.D. and Maruyama, N. (1965) Excitatory and inhibitory response areas of auditory neurons in the cochlear nucleus. J. Neurophysiol. 28, 863-892.
Greenwood, D.D. and Goldberg, J.M. (1970) Response of neurons in the cochlear nuclei to variations in noise bandwidth and to tone-noise combinations. J. Acoust. Soc. Am. 47, 1022-1040.
Greenwood, D.D., Merzenich, M.M. and Roth, G.L. (1976) Some preliminary observations on the interrelations between two-tone suppression and combination-tone driving in the anteroventral cochlear nucleus of the cat. J. Acoust. Soc. Am. 59, 607-633.
Guthrie, E.R. and Merrill, H. (1928) The fusion of non-musical intervals. Am. J. Psychol. 40, 624-625.
Helmholtz, H.L.F. (1877) Die Lehre von den Tonempfindungen als physiologische Grundlage für die Theorie der Musik, 4th Ed., 2nd Eng. Ed. Trans. (1885) A.J. Ellis (Ed.), On the Sensations of Tone, Dover, New York, 1954.
Kaestner, G. (1909) Untersuchungen über den Gefühlseindruck unanalysierter Zweiklänge. Psychol. Studien 4, 473-504.
Hall, J.L. (1972a) Auditory distortion products f2-f1 and 2f1-f2. J. Acoust. Soc. Am. 51, 1863-1871.
Hall, J.L. (1972b) Monaural phase effect: cancellation and reinforcement of distortion products f2-f1 and 2f1-f2. J. Acoust. Soc. Am. 51, 1872-1881.
Hamilton, P.M. (1957) Noise-masked thresholds as a function of tonal duration and masking noise bandwidth. J. Acoust. Soc. Am. 29, 506-511.
Houtgast, T. (1977) Auditory filter characteristics derived from direct-masking data and pulsation-threshold data with a rippled-noise masker. J. Acoust. Soc. Am. 62, 409-415.
Javel, E., Geisler, C.D. and Ravindran, A. (1978) Two-tone suppression in auditory nerve of the cat: Rate intensity and temporal analyses. J. Acoust. Soc. Am. 63, 1093.
Johnstone, B.M. and Boyle, A.J.F. (1967) Basilar membrane vibration examined with the Mössbauer technique. Science 158, 389-390.
Krammer, F. and Greenwood, D.D. (1973) Role of the difference tone fh-fl in masking. J. Acoust. Soc. Am. 55, 402(A).
Kiang, N.Y.-S. and Moxon, E.C. (1974) Tails of tuning curves of auditory nerve fibers. J. Acoust. Soc. Am. 55, 620-630.
Kim, D.O., Molnar, C.E. and Matthews, J.W. (1980) Cochlear mechanics: nonlinear behavior in two-tone responses as reflected in cochlear nerve fibers and in ear-canal sound pressure. J. Acoust. Soc. Am. 67, 1704-1721.
Kohllöffel, L.U.E. (1971) Studies of the distribution of cochlear potentials along the basilar membrane. Acta Otolaryngol. Suppl. 288, 1-66.
Kohllöffel, L.U.E. (1972a) A study of basilar membrane vibrations I. Fuzziness-detection: A new method for the analysis of microvibrations with laser light. Acustica 27, 49-65.
Kohllöffel, L.U.E. (1972b) A study of basilar membrane vibrations II. The vibratory amplitude and phase pattern along the basilar membrane (post-mortem). Acustica 27, 66-81.
Kohllöffel, L.U.E. (1972c) A study of basilar membrane vibrations III. The basilar membrane frequency response curve in the living guinea pig. Acustica 27, 82-89.
Kringlebotn, M., Gundersen, T., Krokstad, A. and Skarstein, O. (1979) Noise induced hearing losses. Acta Otolaryngol. Suppl. 360, 98-101.
Lewis, D. and Larsen, M.J. (1937) The cancellation, reinforcement and measurement of subjective tones. Proc. Nat. Acad. Sci. 23, 415-421.
Liberman, M.C. (1982) The cochlear frequency map for the cat: Labeling auditory-nerve fibers of known characteristic frequency. J. Acoust. Soc. Am. 72, 1441-1449.
Mayer, A.M. (1874) Art. XXI. Researches in acoustics. No. 6. Am. Jour. Sci., 3rd Series 8, 242-255.
Mayer, A.M. (1875) Researches in Acoustics, No. VI. Phil. Mag., 4th Series 49, 352-365; or Art. XXVIII. A redetermination of the constants of the law connecting the pitch of a sound with the duration of its residual sensation. Am. Jour. Sci., 3rd Series 9, 267-269.
Mayer, A.M. (1894) Researches in acoustics - IX. Phil. Mag., 5th Series 37, 259-288; or (1894) Art. I. - Researches in acoustics. Am. J. Sci., 3rd Series 47, 1-28.
Moore, B.C.J., Peters, R.W. and Glasberg, B.R. (1990) Auditory filter shapes at low center frequencies. J. Acoust. Soc. Am. 88, 132-140.
Mountain, D. (1980) Changes in endolymphatic potential and crossed olivocochlear bundle stimulation alter cochlear mechanics. Science 210, 71-72.
Nelson, D.A. (1974) Comment from the floor (see also Greenwood, D.D.), Session R (Masking and Signal Processing), 88th Meeting of the Acoustical Society of America.
Nelson, D.A. (1979) Two-tone masking and auditory critical bandwidths. Audiology 18, 279-306.
Patterson, R.D. (1974) Auditory filter shape. J. Acoust. Soc. Am. 55, 802-809.
Patterson, R.D. (1976) Auditory filter shapes derived in simultaneous and forward masking. J. Acoust. Soc. Am. 70, 1003-1014.
Peters, R.W. and Moore, B.C.J. (1989) Auditory filter shapes at low frequencies. J. Acoust. Soc. Am. 85 Suppl. 1, S108.
Pickles, J.O. (1975) Normal critical bands in the cat. Acta Otolaryngol. 80, 245-254.
Pickles, J.O. (1976) Role of centrifugal pathways to cochlear nucleus in determination of critical bandwidth. J. Neurophysiol. 39, 394-400.
Pickles, J.O. and Comis, S.D. (1973) Role of centrifugal pathways to cochlear nucleus in detection of signals in noise. J. Neurophysiol. 36, 1131-1137.
Plomp, R. (1965) Detectability threshold for combination tones. J. Acoust. Soc. Am. 37, 1110-1123.
Plomp, R. (1976) Aspects of Tone Sensation (Academic Press, London).
Plomp, R. and Levelt, W.J.M. (1962) Musical consonance and critical bandwidth. Proc. IV Int. Congr. Acoust., Paper P55. Harlang and Toksvig, Copenhagen.
Plomp, R. and Levelt, W.J.M. (1965) Tonal consonance and critical bandwidth. J. Acoust. Soc. Am. 37, 548-560.
Plomp, R. and Steeneken, H.J.M. (1968) Interference between two simple tones. J. Acoust. Soc. Am. 43, 883-884.
Robles, L., Ruggero, M.A. and Rich, N.C. (1990) Two-tone distortion products in the basilar membrane of the chinchilla. In: Mechanics and Biophysics of Hearing. Conference held at University of Wisconsin, Madison.
Rose, J.E., Kitzes, L.M., Gibson, M.M. and Hind, J.E. (1974) Observations on phase sensitive neurons of anteroventral cochlear nucleus of the cat: Nonlinearity of cochlear output. J. Neurophysiol. 37, 218-253.
Sachs, M.B. (1969) Stimulus-response relation for auditory nerve fibers: Two-tone stimuli. J. Acoust. Soc. Am. 45, 1025-1036.
Sachs, M.B. and Kiang, N.Y.-S. (1968) Two-tone inhibition in auditory nerve fibers. J. Acoust. Soc. Am. 43, 1120-1128.
Scharf, B. (1970) Critical bands. In: J. Tobias (Ed.), Foundations of Modern Auditory Theory, Vol. I. Academic Press, New York.
Shailer, M.J. and Moore, B.C.J. (1983) Gap detection as a function of frequency, bandwidth and level. J. Acoust. Soc. Am. 74, 467-473.
Shailer, M.J., Moore, B.C.J., Glasberg, B.R., Watson, N. and Harris, S. (1990) Auditory filter shapes at 8 and 10 kHz. J. Acoust. Soc. Am. 88, 141-148.
Siegel, J.H. and Kim, D.O. (1982) Efferent neural control of cochlear mechanics? Olivocochlear bundle stimulation affects cochlear biomechanical nonlinearity. Hear. Res. 6, 171-182.
Sinex, D. and Havey, D.C. (1984) Correlates of tone-on-tone masked thresholds in the chinchilla auditory nerve. Hear. Res. 13, 285-292.
Small, A. (1959) Pure-tone masking. J. Acoust. Soc. Am. 31, 1619-1625.
Smoorenburg, G.F. (1972) Audibility region of combination tones. J. Acoust. Soc. Am. 52, 603-614.
Smoorenburg, G.F., Gibson, M.M., Kitzes, L.M., Rose, J.E. and Hind, J.E. (1976) Correlates of combination tones observed in the response of neurons in the anteroventral cochlear nucleus of the cat. J. Acoust. Soc. Am. 59, 945-962.
Spieth, W. (1957) Downward spread of masking. J. Acoust. Soc. Am. 29, 502-505.
Weber, D.L. (1977) Growth of masking and the auditory filter. J. Acoust. Soc. Am. 62, 424-429.
Wegel, R.L. and Lane, C.E. (1924) The auditory masking of one pure tone by another and its probable relation to the dynamics of the inner ear. Phys. Rev. 23, 266-285.
Zwicker, E. (1952) Die Grenzen der Hörbarkeit der Amplitudenmodulation und der Frequenzmodulation eines Tones. Acustica 3, 125-133.
Zwicker, E. (1954) Die Verdeckung von Schmalbandgeräuschen durch Sinustöne. Acustica 4, 415-420.
Zwicker, E., Flottorp, G. and Stevens, S.S. (1957) Critical bandwidth in loudness summation. J. Acoust. Soc. Am. 29, 548-557.
Zwicker, E. (1968) Der kubische Differenzton und die Erregung des Gehörs. Acustica 20, 206-209.
Zwislocki, J.J., Buining, E. and Glantz, J. (1968) Frequency distribution of central masking. J. Acoust. Soc. Am. 43, 1267-1271.

