Resonance in the Perception of Musical Pulse

Leon van Noorden and Dirk Moelants

IPEM - Department of Musicology, University of Ghent, Belgium

ABSTRACT

A number of phenomena related to the perception of isochronous tone sequences peak at a certain rate (or tempo) and taper off at both slower and faster rates. In the present paper we start from the hypothesis that the peaking finds its origin in the presence of a damped resonating oscillator in the perceptual-motor system. We assume that for pulse perception only the `effective' resonance curve matters, i.e., the enhancement of the amplitude of the oscillator beyond that of the critically damped oscillator.
On the basis of the effective resonance curve, analyses have been made of data of Vos (1973) on subjective rhythmization and of data on tapping along isochronous tone sequences (Parncutt, 1994) and polyrhythmic sequences (Handel & Oshinsky, 1981). The results show that these data can be very well approximated with the proposed model. The best results are obtained with a resonance period of 500-550 ms and a width at half height of about 400-800 ms. A comparison is made with a number of other tempo related phenomena.
In the second part a preliminary effort is made to determine the distribution of perceived tempi of musical pieces heard on the radio and in recordings of several styles, by having a number of listeners tap along with these pieces. The resonance curve appears to be a good tool to characterize these distributions.

INTRODUCTION

When we watch a dancing crowd, people seem to agree on the tempo of the music. There appears to be a range of tempi at which dancing is a very natural behaviour. At much faster tempi dancing becomes impossible: it is stopped by the dynamical limitations of the human body. But also at much slower tempi, although motorically possible, dancing can hardly be called a spontaneous activity: it feels more like moving slowly from one position to another and one really has to concentrate and count to keep synchronized. Apparently there is an optimum tempo and an upper and lower threshold for repeating bodily movements elicited by music. The same phenomenon can be seen when musicians tap their feet to the music, or when a director sways his hands to synchronize an orchestra. Even when we just listen to music and concentrate on the tempo without performing any movement at all, we seem to select an internal `pulse' from a relatively narrow range of frequencies. In this paper we want to determine more precisely which tempi are possible, which tempi are preferred, and to develop a theory about the nature of this distribution. Periodic movement like dancing, jumping up and down or just foot tapping can to a certain extent be compared to a resonating oscillator. It is as if we are resonating at the tempo of the music (Van Noorden, 1991).
In the field of rhythm perception, oscillators have been introduced in models dealing with real-time tempo tracking. Some of these models use frequency-adaptive oscillators (e.g., Large & Kolen, 1994; McAuley, 1995; Toiviainen, 1998) to track
periodicities and to be able to cope with a certain amount of expressive deviation in the time domain. Other models use a large number of oscillators with fixed frequency (e.g., Miller, Scarborough, & Jones, 1992; Langer & Kopiez, 1995; Scheirer, 1997). The resonance characteristics of these oscillators make it possible to detect periodicities within a narrow band of frequencies. More theoretical models of pulse perception (e.g., Povel & Essens, 1985; Lee, 1991; Desain, 1992; Miller et al., 1992; Rosenthal, 1992) generally don't give the tempo of the pulse a prominent position. They primarily look for the processes that determine the relative strength of different periodicities of interval sequences in a normalized tempo, focussing on the basic metrical structure and phase.
We believe that the determination of the pulse strongly depends on its tempo. We want to investigate if we can establish a global view on the preference for certain pulse rates (or tempi) starting from a resonator model. This model can then be used as a filter that follows upon an analysis of the different periodicities present in a sound stimulus. Both Parncutt (1987, 1994) and Todd (1995, this volume) have used such a function in their model. However, these filters are only a small part of more complex models, and their main use is to improve the performance of the models. The prime aim of these authors was thus not to determine the characteristics and shape of the filters. This paper, in contrast, tries to understand the nature and parameters of this filter by analysing different sets of experimental data. The results of these analyses will then be compared with elements of physiology, psychology and music theory, and finally with actual tempo perception of music. First, however, we will establish the basic model with which we are going to approach the data.

RESONANCE AND PULSE PERCEPTION

The resonance phenomenon can be roughly described as: the increase in amplitude of oscillation in a physical system exposed to a periodic external force of which the driving frequency (or one of its component frequencies) is equal or very close to a natural frequency of the system. One can compare this to pushing a child on a swing: to do this efficiently you have to time your pushes to the time the swing needs to move back and forth. By continuously adding energy to the swing in small amounts, the amplitude of the swing's movement gradually builds up. Well-known examples of resonance phenomena in the world of music are: glasses `exploding' when exposed to a high tone of an opera singer, strings that spontaneously start to vibrate when exposed to a tone with a specific pitch, the vibrations of the body of a violin, the peculiar amplification of certain tones in some concert-halls, ...
One of the simplest models to approach the phenomenon `resonance' is that of the damped harmonic oscillator. Physical systems of which the behaviour can be approached by a harmonic oscillator are a mass on a spring or a pendulum with small excursion. The damped harmonic oscillator is a linear model: this means that the displacement out of the equilibrium position is proportional to the size of the force that acts upon it. The harmonic oscillator is characterized by two constants: its natural (or characteristic) frequency (f0) and a damping constant (β), that quantifies the loss of energy during the vibration. If the damping is relatively weak and we move the harmonic oscillator away from its point of rest and release it, it will typically perform sine-shaped oscillations (with frequency f0) of decreasing amplitude, and finally return to its rest position. The higher the damping, the faster the decrease in amplitude and the sooner it will return to its rest position. At a certain degree of damping, the oscillator will no longer oscillate, but return to its rest position without oscillation. The smallest amount of damping at which this occurs is called the `critical damping' (β_cr).
Resonance will occur in a harmonic oscillator when a periodic external force acts on it, and when the frequency of the external force (fext) is close to f0. If the frequency of this external force is low with respect to the natural frequency of the oscillator, the maximum amplitude of the oscillator's movement will be the same as the amplitude of the external force. If the frequency of the external force is high with respect to the natural frequency of the oscillator, the inertia of the oscillator decreases the amplitude. In the neighborhood of
the characteristic period the amplitude of the oscillator is enhanced as long as the damping factor is smaller than the critical damping. When we plot the amplitude of the damped harmonic oscillator as a function of the period of the external force we obtain the so-called `resonance curves'. In the upper half of Figure 1, resonance curves are displayed at three different degrees of damping: low damping, relatively high damping and critical damping. With low damping, the amplitude rises to high values around the natural frequency; at low frequencies (= long periods) the amplitude levels towards a fixed maximum value, equal to the amplitude of the forcing motion. When the frequency goes above the resonance frequency (= short periods), there is a gradual decrease in amplitude until the mass can no longer follow the forcing motion and the amplitude goes towards zero. The same properties are present with higher damping, the main difference being the strength of the amplitude enhancement. With critical damping there is no longer amplitude enhancement: the amplitude simply evolves from 0 to the amplitude of the external force when the period of the external force augments.
The basic formula (1) describing the amplitude (A) of the harmonic oscillator as a function of the frequency of the external force is described by two parameters: the resonance frequency of the oscillator and the damping constant (Kneubühl, 1997).

$$ A = \frac{1}{\sqrt{(f_0^2 - f_{ext}^2)^2 + \beta f_{ext}^2}} \tag{1} $$

We assume that for pulse perception the `effective' resonance is important, i.e., the amplitude enhancement of the oscillator that goes beyond the amplitude of the critically damped oscillator with the same f0. The effective resonance curve is then given by the portion of the resonance curve above the critically damped resonance curve with the same f0. To obtain this, we subtract the amplitude function of the critically damped harmonic oscillator from the amplitude function of the less than critically damped oscillator. In other words, the values obtained with formula (1) are reduced by the values of the critically damped resonance curve at the same frequency. The critical damping constant (β_cr) is determined by the formula:
$$ \beta_{cr} = 2 f_0^2 \tag{2} $$

This results in (3) as the formula for effective resonance amplitude (Ae). In the bottom half of Figure 1, two examples of curves representing effective resonance amplitude as a function of the period of the external force are displayed. The resonance frequency and damping constants are the same as for the first two curves in the top panel of Figure 1.

Fig. 1. Examples of resonance curves. In the top half genuine resonance curves with f0 = 2 (or a period of 500 ms), the grey line represents the amplitude enhancement with a low damping (β = 0.2), the black line that with a relatively strong damping (β = 2.0). The dotted line represents the resonance curve with a critical damping (β = 8.0 at f0 = 2). The Y-axis represents the amplitude of the oscillator as a function of the amplitude of the external force. In the bottom half `effective resonance curves', i.e., the genuine resonance curves minus the amplitude at critical damping, are displayed.
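As an illustration of the subtraction described above, the following minimal Python sketch evaluates the amplitude of formula (1), the critical damping of formula (2), and their difference, the effective resonance. It assumes frequencies in Hz (so f0 = 2 corresponds to a period of 500 ms, as in Figure 1); the function names are hypothetical and are not taken from the paper.

```python
import math


def amplitude(f_ext, f0, beta):
    """Formula (1): amplitude of the driven, damped harmonic oscillator."""
    return 1.0 / math.sqrt((f0**2 - f_ext**2) ** 2 + beta * f_ext**2)


def critical_damping(f0):
    """Formula (2): critical damping constant for natural frequency f0."""
    return 2.0 * f0**2


def effective_resonance(f_ext, f0, beta):
    """Amplitude minus the critically damped amplitude at the same f0."""
    return amplitude(f_ext, f0, beta) - amplitude(f_ext, f0, critical_damping(f0))


if __name__ == "__main__":
    f0 = 2.0  # Hz; assumption: frequencies in Hz, periods converted from ms
    for period_ms in (250, 500, 1000, 2000):
        f_ext = 1000.0 / period_ms
        # Low damping (beta = 0.2), one of the curves described for Figure 1
        print(period_ms, effective_resonance(f_ext, f0, beta=0.2))
```

With the critical damping inserted for beta, the two terms cancel and the effective resonance is zero at every driving frequency, which is the property the text relies on.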

$$ A_e = \frac{1}{\sqrt{(f_0^2 - f_{ext}^2)^2 + \beta f_{ext}^2}} - \frac{1}{\sqrt{f_0^4 + f_{ext}^4}} \tag{3} $$

In the following section we assume that human tempo perception has characteristics that are similar to those of a driven harmonic oscillator. This `resonance model' should be able to explain the existence of a preferred tempo for the pulse (Fraisse, 1982) and the limits of the existence region of pulse perception (Parncutt, 1994) (cf. infra, section 3).
The existing literature offers studies of a number of perceptual phenomena that show a dependency on the tempo of tone sequences. Among them are the phenomenon of subjective rhythmization (Vos, 1973), tapping on simple tone sequences (Parncutt, 1994) and the preferences for tapping in polyrhythmic sequences (Handel & Oshinsky, 1981). We will analyze the results of the three studies mentioned in terms of our model. If there is a correspondence between these experimental data and the basic properties of our model, the results of these studies can provide us with the necessary information to determine the parameter values (f0 and β).

MODELLING EXISTING EXPERIMENTAL DATA SETS

Subjective rhythmization

Subjective rhythmization is a phenomenon that occurs when one listens to a series of isochronous, identical sounds. When presented at an appropriate rate, one will hear them with a pattern of accentuation that causes the formation of groups of sounds, especially after prolonged listening. A well-known example is the interpretation of identical clock ticks as `ticks' and `tocks'. The phenomenon of subjective rhythmization has interested psychologists from the end of the 19th century onward (Bolton, 1894; Meumann, 1894) as it is a unique example of how the perception process can subjectively change the appearance of identical elements in perception. If one hears an isochronous tone sequence in subjective groups, the grouping period must be imposed by a periodic fluctuation in the perceptual system. Our hypothesis is that the grouping fluctuation has a characteristic period. One could call this the resonance period for hearing subjective rhythmization. The relation between subjective rhythmization and the resonance model can be investigated by analysing the dependence of this type of grouping on the tempo of the tone sequence. For this investigation we used data collected by Vos (1973). He presented isochronous tone sequences at various tempi; after each presentation the listeners reported the size of the groups in which they interpreted the sequence. Vos' results are presented in Table 1.

Table 1. Overview of the results of Vos' experiment on subjective rhythmization. In the first row the interstimulus interval of the sequences is given (in ms). The first column gives the number of elements perceived in one group. The numbers in the central panel represent the number of subjects that heard a specific grouping size at that interstimulus interval. The last two columns show the number of responses in each row and the corresponding grouping strength resulting from the optimization of the model parameters to fit the data. Both numbers indicate 2, 4 and 8 as the most commonly perceived grouping sizes.

Perceived               ISI in ms
group size    150   200   300   400   800    tot.    Sg
2               0     2     1     6    10      19   .443
3               0     1     0     2     1       4   .053
4               5     7    12     7     3      34   .373
5               1     0     1     0     1       3   .017
6               1     1     1     0     0       3   .022
7               2     0     0     0     0       2   .019
8               6     4     0     0     0      10   .072
tot.           15    15    15    15    15      75

The number of responses in each category depends on two elements: the tempo and the group size. From inspecting the table we see a clear tendency to prefer group sizes 2, 4 and 8. Group sizes 3, 5, 6 and 7 are much more rarely reported. This effect seems independent of the presentation rate of the stimuli. The role of tempo can be derived from the period at which the number of responses in each category reaches a maximum. In category `4', for example, a maximum is reached at a tempo with a period of 300 ms (or at a group length of 1200 ms). The model applied to analyze the data proposes that the relative strength of a response (Sr) is a product of two factors: the relative strength of its group period (St(T), with T representing
group period, i.e., ISI * group size), and the relative strength of the group size (Sg(n), with n representing group size):

$$ S_r = S_t(T) \times S_g(n) \tag{4} $$

To compute the group period strength, the effective resonance curve (cf. supra) is used, and applied to the group period. The number of parameters that should be estimated from the data is eight: 2 for the effective resonance curve (T0 and damping) and n - 1 (7 - 1 = 6) for the group size strengths. The value of the parameters is computed with the `Solver' tool in MS-Excel by minimizing the total sum of squares between the data and the model. The parameters found are: T0 = 1100 ms and β = 0.22 for the resonance curve; the parameter values for group size strength are given in the last column of Table 1. Note the relatively high strength of the `binary' grouping sizes 2, 4 and (to a lesser extent) 8, corresponding to the observed preference for powers of 2. The sequence of decreasing group strength, 2-4-8-3-6-7-5, is close to what one would expect from a music theoretical viewpoint. In Figure 2 the correspondence between the model and the data is given: points represent the results found by Vos (1973), lines the predictions of the model. On the Y-axis the probability of perceiving the group size is given, on the X-axis the interstimulus interval. It can clearly be seen that the different curves reach a maximum at consecutive interval lengths, each of them corresponding to the resonance period for grouping divided by the group size. The correlation between model and data is 0.976. The resonance curve for grouping is represented in Figure 11 (thick dotted line).

Tapping on isochronous sequences

Parncutt (1994) has also presented isochronous tone sequences with various tempi to participants in a listening experiment. The difference with Vos is that he did not ask participants to report the perceived grouping but to tap along with the tone sequence in a regular way. The results were again certain groupings as a function of the tempo. The results are represented in Table 2.
In these data it can be seen that, again, the various groupings have maxima at different periods, following basically the same schema as in the previous paragraph, with a dependency on both group size and tempo. The data are modelled in first instance in exactly the same way as the previous
data set. The parameters that minimize the sum of squares between the model and the data are: T0 = 558 ms and β = 6.42 for the resonance curve; the grouping size strengths are given in Table 2 (under 1 comp.). The correlation between data and model is 0.991.
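The one-component model can be sketched in a few lines of Python using the counts and the `1 comp.' strengths reported in Table 2 together with the fitted T0 and β. This is only a sketch under stated assumptions: the paper does not spell out how response strengths are turned into probabilities, so the strengths at each ISI are here normalised to sum to one, frequencies are taken in Hz, and all function and variable names are hypothetical.

```python
import math

# Data and fitted values as reported in Table 2 and in the text
ISIS_MS = [150, 227, 345, 522, 792, 1200]
COUNTS = {  # tapped group size -> number of responses at each ISI
    1: [2, 3, 9, 15, 18, 22],
    2: [7, 14, 12, 7, 4, 0],
    3: [2, 1, 0, 0, 0, 0],
    4: [10, 4, 1, 0, 0, 0],
    8: [1, 0, 0, 0, 0, 0],
}
SG = {1: 0.620, 2: 0.279, 3: 0.017, 4: 0.070, 8: 0.015}  # "1 comp." column
T0_MS, BETA = 558.0, 6.42
N_PARTICIPANTS = 22


def effective_resonance(f_ext, f0, beta):
    """Formula (3): resonance amplitude minus the critically damped amplitude."""
    a = 1.0 / math.sqrt((f0**2 - f_ext**2) ** 2 + beta * f_ext**2)
    a_crit = 1.0 / math.sqrt(f0**4 + f_ext**4)
    return a - a_crit


def predict(isi_ms):
    """Formula (4) per group size, normalised over group sizes (assumption)."""
    f0 = 1000.0 / T0_MS  # use the exact value; beta is very close to critical
    raw = {n: effective_resonance(1000.0 / (isi_ms * n), f0, BETA) * SG[n] for n in SG}
    total = sum(raw.values())
    return {n: s / total for n, s in raw.items()}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fit_correlation():
    """Correlation between observed and predicted response proportions."""
    observed, predicted = [], []
    for i, isi in enumerate(ISIS_MS):
        p = predict(isi)
        for n in SG:
            observed.append(COUNTS[n][i] / N_PARTICIPANTS)
            predicted.append(p[n])
    return pearson(observed, predicted)


if __name__ == "__main__":
    print(fit_correlation())
```

Because the fitted β lies just below the critical value, the effective resonance values are very small; only their ratios matter after normalisation, which is why f0 is computed from T0 without rounding.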
In general the fit is good as can be seen in the top panel of Figure 3. However, the almost critical damping ($\beta_{cr} = 2 f_0^2 = 2(1.79)^2 \approx 6.42$) found is an indication that the current model might not be the ideal way to approximate the data. Probably the curve in this model tends to be very broad, in order to cope with a seemingly broader range of periodicities, and thus uses a β that is as large as possible. A systematic deviation between model and experimental data only occurs at the longest
intervals. This might indicate that the high damping constant is necessary to deal with longer periods but that it is not possible to give a good approximation for the longest ones, while keeping its resonance frequency near the optimum. A possible explanation for the results at the longest ISIs, is

Fig. 2. Analysis of Vos' (1973) data, using resonance curves. Lines represent the predictions of the model, points the results found by Vos. Group size 2: thick solid line, black diamonds; 3: thin solid line, open triangles; 4: dashed line, crosses; 8: dotted line, circles. The weak grouping sizes 5, 6 and 7 are omitted.

Table 2. Overview of the results of Parncutt's experiment on tapping to isochronous sequences, giving the responses for all 22 participants. In the first row the interstimulus interval is given. The first column gives the number of elements between two successive taps. At the right the number of responses in each row and the corresponding group strengths as used in the modelling (both the one-component and the two-component model) are given: the most common grouping sizes are 1, 2 and 4.

`Tapped'                  ISI in ms                        tot.   1 comp.     2 comp.
group size    150   227   345   522   792  1200                  Sg       Sg    Sg(2f)
1               2     3     9    15    18    22      69    .620     .234    .595
2               7    14    12     7     4     0      44    .279     .038    .115
3               2     1     0     0     0     0       3    .017     .001    .000
4              10     4     1     0     0     0      15    .070     .003    .009
8               1     0     0     0     0     0       1    .015     .001    .000
tot.           22    22    22    22    22    22     132
that the listeners internally subdivided the period to tap accurately on the occurrence of the tones separated by such long intervals. One can think of it as if the subject counts one two one two... and only taps on the ones. We incorporate this in the model by adding to the strength of the actually chosen period that of the period with the double frequency. This means that another category of parameters is introduced, representing the Sg of the double frequency of every grouping size n mentioned (Sg(2f)(n)). Thus the total grouping strength is distributed between 10 (5 + 5) parameters. The optimal values found are given in Table 2. The result is shown in the bottom half of Figure 3. Note that the probability of tapping at a given interval is found by summing the probabilities of hearing the main interval and its double tempo. The correlation between model and data is now almost one: 0.999. The resonance period is slightly shortened to 550 ms, and the damping constant is reduced to a value of 0.45.
When we compare the fitted models of Vos and Parncutt we find as the most important difference that the resonance period of Vos is the double of the one found for Parncutt. However, both the relative strength of the group sizes and the damping constants are comparable. It is very tempting to suppose that they represent the same resonance phenomenon. There seems to be a basic resonance for tapping to simple sequences around 550 ms. The resonance for grouping is found at the point where 2 intervals of that length are grouped. This is not unexpected, since grouping occurs between non-adjacent elements and a binary division is dominant. Thus we can easily model both sets of data using the same resonance frequency.

Tapping on polyrhythmic sequences

Handel and Oshinsky (1981) have collected experimental data on how people tap along with two isochronous tone sequences presented simultaneously. The tempi of these sequences were related according to a low number ratio: 2:3, 3:4, 2:5, 3:5 or 4:5, thus forming polyrhythmic patterns. The length of the compound `measure', i.e., the length between the coinciding tones of the two component sequences, was varied between 400 and 3000 ms (11 distinct values). The participants were instructed to tap regularly. This approach provides us with excellent data to study the preferred rates for tapping. Every response in the experiment involves in fact a choice between 3 different tempi: the two components of the polyrhythm (a faster and a slower tempo) and the compound measure (henceforth called `1'). The data of Handel and Oshinsky have been collected from the diagrams in their paper. The different conditions, presented separately in their paper, were combined into one set of data for each frequency ratio. In each set of data 27 observers participated. An overview of the results is given in Table 3.
As in the previous sections, we suppose that the preference rate at which people tap can be described by a resonance curve with fixed reso-
nance frequency and damping. Figure 4 illustrates how this principle can be applied to a polyrhythmic sequence. In this example, the three possible pulses from a 2 against 3 polyrhythm with a compound measure period of 1200 ms are shown in relation to the resonance curve. The height of the vertical lines indicates that the strength of the slow component (2) is higher than that of the fast component (3), which in its turn is higher than the strength of the compound measure (1). In unbiased conditions this would mean that more people would have tapped along sequence 2 than on sequences 3 and 1. However, the situation is not as simple as that: some independent factors, such as the preference to tap along the faster of two sequences, force us, again, to assign different strengths to each of the sequences: fast, slow and compound. Moreover, (binary) hierarchical structures can still have an influence on the responses. The observer can, e.g., count `1 2' and only tap on every `1' (i.e., the participant's perceived pulse is at the double tempo of the produced sequence; cf. the previous section). Also the opposite might happen: he can tap `1 2' but only think of every `1' as a beat (i.e., perceived pulse at half the tempo of the produced sequence). These strategies will occur most often in extremely slow and in extremely fast sequences respectively. Although people could do even more complicated things we assume that halving and doubling are the most dominant of all division/multiplication strategies.

Downloaded by [Staffordshire University] at 05:43 06 October 2014

Fig. 3. Analysis of Parncutt's data, using resonance curves. At the top only the basic group sizes are used, at the bottom the modelling including the double tempo is shown. Lines represent the predictions of the model, points the results found by Parncutt. Tapping on every beat: thick solid line, black diamonds; every 2 beats: dashed line, crosses; every 3 beats: thin solid line, open triangles; every 4 beats: dotted line, circles; every 8 beats: thin solid line, squares.

In analyzing a set of data attention should be given to the number of parameters that will be estimated. If one takes too few parameters, interesting aspects of the data may be lost. If one takes too many, one may find effects that may be just due to random fluctuations. In order not to fall into this trap we made several approximations with different numbers of parameters. The main tendencies could already be modelled with only six parameters. However, it was clear from the fitting of the data that minimizing the number of parameters led to some systematic errors, as was the case for the slowest tempi in the modelling of Parncutt's data. Here, the model could, e.g., not be fitted very well at large durations, especially in the 2:3 and 2:5 polyrhythms. This led us to the conclusion that the data could be better approached by taking into account more tempo components, or more precisely by taking into account the half and double tempi as described before. Therefore we gradually expanded the number of parameters until no more systematic errors could be found.
In total, 42 parameters have been adapted: 2 for the resonance curve + 5 * 9 for data sets and components - 5, since the sum of weights is 1 for each data set. The total residual sum of squares is

Table 3. Summary of the results of Handel and Oshinsky's (1981) experiments on tapping along polyrhythmic sequences. In the left column the polyrhythm is indicated, in the top row the length of the compound measure (in ms). In each block three rows are given, listing the percentage of responses following the compound measure (`1'), the slower component and the faster component of the polyrhythm. In the right column the total percentage of responses in each row is given for every polyrhythm.

            400   600   800  1000  1200  1400  1600  1800  2000  2400  3000   tot.

      1    47.7  33.9  22.4  18.4   9.0   0.0   0.0   0.0   0.0   0.0   0.0   12.0
2:3   2    46.9  32.2  37.3  20.1  12.7   9.9   3.2   8.2   2.4   2.7   8.1   16.7
      3     5.4  33.9  40.2  61.4  78.2  90.1  96.8  91.8  97.6  97.3  91.9   71.3

      1    61.1  33.1  17.6   6.4   0.0   0.0   0.0   0.0   0.0   0.0   0.0   10.7
3:4   3    34.1  55.0  70.1  46.5  51.7  49.2  41.6  31.4  23.6  35.1  58.0   45.1
      4     4.8  11.9  12.4  47.1  48.3  50.8  58.4  68.6  76.4  64.9  42.0   44.2

      1    16.5   3.3   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    1.8
2:5   2    79.9  62.7  79.0  65.8  55.9  25.7  33.4  29.0  20.1  14.7   0.0   42.4
      5     3.8  33.9  21.0  34.2  44.1  74.3  66.6  71.0  79.9  85.3   100   55.8

      1    35.5  22.0   8.8   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0    6.0
3:5   3    54.9  74.4  69.7  66.7  48.3  48.5  46.2  45.8  40.2  31.7  18.8   49.6
      5     9.6   3.7  21.6  33.3  51.7  51.5  53.8  54.2  59.8  68.3  81.3   44.4

      1    83.7  31.3  11.9   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   11.5
4:5   4     3.3  50.1  78.9  73.3  59.7  59.0  61.7  48.5  54.9  36.5  41.1   52.0
      5     8.0  18.6   9.1  26.7  40.3  41.0  38.3  51.5  45.1  63.5  58.9   36.5

now 0.239, which corresponds to correlations between 0.987 (for the 2:5 polyrhythm) and 0.995 (for the 2:3 polyrhythm). The adapted parameters for a minimum sum of squares between model and data, as found with the solver of Excel, are given in Table 4. In Figure 5, the predictions of the proposed model with 42 parameters are given together with the original data. The structure is basically the same as in Figures 2 and 3. The relative strengths of the three possible responses are calculated and plotted against the complete cycle length. It must be noted that the shape of the curves is somewhat
more complicated than in the earlier examples: not all of them are characterized by one single maximum. This can be explained by the addition of the half and double tempo in the calculation of each curve, leaving the possibility for one, two or three (local) maxima.
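The way half and double tempi enter each curve can be sketched in Python for the 2:3 polyrhythm, using the weights, T0 and β reported in Table 4. This is a sketch under assumptions rather than the authors' exact procedure: each response strength is taken as the weighted sum of the effective resonance at the tapped period, at half that period (double tempo) and at twice that period (half tempo), and the three strengths are normalised to sum to one. All names are hypothetical.

```python
import math

T0_MS, BETA = 500.0, 1.12  # resonance period and damping reported with Table 4

# Weights for the 2:3 polyrhythm from Table 4, ordered as
# (double tempo, tapped tempo, half tempo) for each possible response.
WEIGHTS_2_3 = {
    "1": (0.047, 0.021, 0.002),     # compound measure
    "slow": (0.004, 0.076, 0.019),  # 2 pulses per measure
    "fast": (0.0, 0.830, 0.0),      # 3 pulses per measure
}
PULSES_PER_MEASURE = {"1": 1, "slow": 2, "fast": 3}


def effective_resonance(period_ms):
    """Formula (3) evaluated at a period in ms (frequencies in Hz)."""
    f0 = 1000.0 / T0_MS
    f = 1000.0 / period_ms
    a = 1.0 / math.sqrt((f0**2 - f**2) ** 2 + BETA * f**2)
    a_crit = 1.0 / math.sqrt(f0**4 + f**4)
    return a - a_crit


def response_shares(measure_ms):
    """Relative strength of the three responses at one compound-measure length."""
    raw = {}
    for name, (w_double, w_main, w_half) in WEIGHTS_2_3.items():
        period = measure_ms / PULSES_PER_MEASURE[name]
        raw[name] = (
            w_double * effective_resonance(period / 2.0)  # double tempo
            + w_main * effective_resonance(period)        # tapped tempo
            + w_half * effective_resonance(period * 2.0)  # half tempo
        )
    total = sum(raw.values())  # normalisation is an assumption of this sketch
    return {name: s / total for name, s in raw.items()}


if __name__ == "__main__":
    for measure_ms in (400, 1200, 3000):
        print(measure_ms, response_shares(measure_ms))
```

Since each curve is a sum of up to three resonance curves shifted along the period axis, it can show more than one local maximum, as noted above.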
These results show that the data of Handel and
Oshinsky can be explained on the basis of a reso-
nance model. The resonance parameters corre-
spond with the resonance curves found in the data
of Vos and Parncutt, the resonance period is about
10% shorter (500 instead of 550 ms). A weakness of
the model is that it doesn't provide an explanation
for the distribution of the weights in the different
polyrhythms. The weights are calculated in order to obtain the best fit, while maintaining the same resonance curve. They should be able to tell us something about the way people approach the task. A model that explains the optimal weights is

Fig. 4. Relative resonance strengths of the periods 1200, 600 and 400 ms as present in the polyrhythm 2:3 at pattern period 1200 ms. At a somewhat longer pattern period, the strength of the division in 3 will become dominant, as indicated by the arrows.

Table 4. Settings of the parameters for each of the polyrhythms used by Handel and Oshinsky (1981). For each polyrhythm the weights for tapping on the whole pattern (`1'), on the slow component and on the fast component are shown together with those of their halves (slower) and doubles (faster). The remaining parameters T0 and β are put at 500 ms and 1.12 respectively.

Polyrhythm    double    1       half     double   slow     half     double   fast     half
2:3           0.047     0.021   0.002    0.004    0.076    0.019    0        0.830    0
3:4           0.003     0.040   0        0.258    0.113    0.118    0        0.404    0.064
2:5           0         0       0        0.007    0.003    0        0.974    0.016    0
3:5           0         0.001   0        0.033    0.01     0.009    0.938    0        0.009
4:5           0         0.003   0.023    0.342    0.175    0.058    0        0.369    0.029

Fig. 5. Analysis of Handel and Oshinsky's data, using resonance curves. Results of the experiment and predictions of the model are shown for each of the 5 polyrhythms. In each graph tapping to the smaller integer value in the polyrhythm is represented by the dotted line (model) and crosses (data), to the larger value by the dashed line (model) and full diamonds (data) and to the complete cycle by the solid line (model) and open circles (data).

Downloaded by [Staffordshire University] at 05:43 06 October 2014

not yet available, but a few trends are clear: the weight of the `1' (the tap on the common tone)
is always relatively low, while the fastest beat period in the polyrhythm is the strongest. The
bigger the relative distance between the two beat periods, the stronger the fast component. An
explanation might be that, apart from the enhancement due to resonance, each pair of events
contributes to the salience of a beat pattern, as suggested by Parncutt (1987). According to this
theory, the more events of a specific period are present, the more salient it is, and the larger
the difference in the number of events, the larger the salience of the fastest beat.

RESONANCE CURVES AND TEMPO PHENOMENA

We will now compare the characteristics of the resonance curve for pulse perception, as obtained
in the previous chapter, with some known facts and concepts from music theory and music
perception research.

Categories and range of musical tempo
In handbooks on music theory (see, e.g., Willemze, 1971), we find indications for the following
tempo ranges: Fast (MM 160–216), Moderate (MM 90–120) and Slow (MM 40–60), with transition
regions in between (see Fig. 11). The limits of the total range of tempi are also found in the
traditional range of metronomes: MM 40–208 (which equals a period range of 288–1500 ms).
The peak of our resonance curve, around 500 ms, corresponds to the moderate tempo category; the
slow and the fast categories are located at the outskirts of the resonance curve. The resonance
curve nicely fits the range of musical tempi and could provide an explanation for it. Although all
the tempi used for counting the time of music lie within the range of effective resonance, fast
and slow tempi are located at ranges where the resonance curve has a relatively low amplitude.
The stability of these tempi may be linked to the resonance frequency by halving or doubling the
period, a strategy that we also incorporated in the modelling of existing experimental data. These
observations establish a link between music practice and music theory as it has spontaneously
developed through history on the one hand, and a scientifically investigated perception
phenomenon on the other hand.

`Spontaneous tempo' and related phenomena
The peak and extent of the resonance curve can be compared with a number of other tempo
phenomena that have been studied in the past: a) spontaneous tempo, b) preferred tempo and c) the
indifference interval.
Spontaneous tempo is defined as the tempo people choose when asked to tap with minimal
instructions. It is important to note that one cannot eliminate the influence of environment and
instructions. For example, the instruction ``relax and tap at a comfortable rate'' could yield
significantly different results compared to the instruction ``tap as fast as you want''.
According to Fraisse (1982), spontaneous tempo has a preferred range between 380 and 880 ms, while
it is best represented by an interval of 600 ms. However, he cites enormous inter-individual
variability, with results ranging from 200 to 1400 ms, probably a result of differences in
interpretation among the participants. This shows that a simple measurement of spontaneous tempo
is unreliable. In more carefully controlled experiments, Kay and his colleagues (Kay, Kelso,
Saltzman, & Schöner, 1987) found as spontaneous tempo for single hand wrist movements a rate with
a period of 490 ms.
A slightly different concept is that of `preferred tempo': this is found by asking subjects what
metronome tempo they consider most natural. Answers cluster between 500 and 700 ms (Fraisse,
1982). It should be noted, however, that here as well the problem of instruction-dependency has
an influence.
A third element is the so-called `indifference interval': the point at which intervals are
reproduced without systematic errors, assuming that short intervals are lengthened and long
intervals are shortened. In older studies (see Woodrow, 1951), the indifference interval was fixed
around 600 ms. However, this number seems to be an artefact of the method used. Experiments with
different ranges of durations show that the indifference interval tends to be found around the
geometric mean of all durations in the experiment, e.g., 1150 ms has been found for durations
between 200 and

1500 ms and 3650 ms for durations between 360 and 12000 ms (Fraisse, 1963). In general, data of
tapping experiments show considerable differences, and a strong proof for the existence of a real
indifference interval has not been given. Vos and Ellerman (1989), e.g., found that longer
intervals (> 600 ms) tend to be shortened in a continuation task, but found no lengthening of
shorter intervals (> 200 ms), while Pöppel (1978) found an indifference interval around 3000 ms.
A related issue is that of magnitude scaling of estimations of short durations, studied by Michon
(1967). The function relating the estimated duration to the stimulus duration is less steep below
500 ms ($t_{est} = 0.55\,t^{0.6}$) than above this duration ($t_{est} = 0.77\,t^{1.1}$). From his
data it can be concluded that there is an overestimation of durations below 200 ms and an
underestimation above this value; above 500 ms the underestimation gradually decreases.
It is clear that the ranges of these phenomena roughly correspond to the passband of the
resonance filters we used (half height values around 400 and 800 ms). We think the interval of
600 ms, generally referred to as `preferred tempo', should not be considered as an exact value. It
is one point in a range of periods favored in perception. The findings reported in the previous
two sections, based on the analysis of data obtained by musical perception tasks, suggest a
somewhat faster period as the point of gravity of that range: 500–550 ms. The 600 ms value was
probably found by averaging the data, not taking into account the positive skewness of their
distribution.

Temporal coherence and fission, considered as pulse perception
The perceptual difference between tempi of different frequency can also be related to research on
the formation of auditory streams, or the linking of sequential tones in perception (Van Noorden,
1975). Generally, the perception process links successive events together unless they follow each
other too quickly or too slowly. When they follow each other too quickly, it will be difficult to
determine their temporal order or even to distinguish them as individual elements (loss of
temporal acuity). When they are too far apart in time they will constitute single independent
events that can no longer be vividly connected. The results of an experiment measuring the
temporal coherence and fission boundaries in a tone sequence ABAB... are represented in Figure 6.
The temporal coherence boundary is determined by the largest pitch interval between the tones A
and B where one can still hear the alternation ABAB in one single stream. The fission boundary
corresponds to the smallest pitch interval where one is able to hear two independent streams:
A-A-A- and -B-B-B. As Figure 6 illustrates, the linking process varies between different tempo
regions. In the range T = 200–800 ms, it depends largely on the listening strategy of the
listener, who can choose between one single or two separate streams. This corresponds to a choice
between two different pulses within the existence region of musical pulse: the pulse determined
by the rate of all the tones (ABAB...; ISI: T) or the pulse related to the alternating tones
(A-A-A- or -B-B-B; ISI: 2T). Below T = 200 ms it becomes

Fig. 6. Temporal coherence boundary (TCB) and fission boundary (FB), as determined by two observers
(Van Noorden, 1975).

gradually more difficult to follow the pulse of the single stream: the alternation ABAB... can be
followed over smaller and smaller pitch distances, until it becomes inevitable to hear the
fission. At slow rates (beyond T = 1600 ms), on the contrary, it becomes difficult to keep the
pulse of the alternating tones. Therefore a larger tone interval is necessary to keep the tones
A-A- apart from the tones -B-B.

The limit at fast tempi
The resonance theory of pulse perception incorporates the fact that at fast tempi the ability to
follow the tempo of a tone sequence is limited by inertia (cf. supra). This can be compared with
the inertia of the motor system. Whereas the limits of tapping speed have been fixed at an average
of about 126 ms (Fraisse, 1956), some studies report problems in performing controlled tapping at
intervals slightly smaller than 200 ms (Peters, 1989; Wing & Kristofferson, 1973). In these
studies participants were asked to tap in synchrony with a series of equidistant tones. They were
able to reproduce the fast rates, but the synchronization with the pacing tone was very bad, and
the participants didn't seem to notice that their movements were badly synchronized. This shows
that periods shorter than 200 ms are unlikely to be used as a basis for tempo perception, but
rather represent a `forced' type of movement. A possible basis for the weak synchronisation
ability below 200 ms may be found in some principles of auditory perception: temporal summation
and masking seem to be effective up to the 200 ms threshold.
The perceived loudness of a tone is not only determined by its amplitude, but also by its length.
This is caused by the temporal summation (or integration) of energy. Experiments in the
determination of the audibility threshold (e.g., Plomp & Bouman, 1959; Campbell & Counter, 1969;
Watson & Gengel, 1969) show that the ear integrates energy over time within a time frame of
roughly 200 ms. Loudness increases if a tone gets longer until the threshold of 200 ms is reached
(Gelfand, 1981). This implies that we can only speak about a fully developed tone if it lasts at
least 200 ms.
Similarly the audibility threshold in forward masking (when the probe follows the masker),
reaches the threshold in quiet from a distance of 200 ms on (Durrant & Lovrinic, 1977; Zwicker &
Fastl, 1990). So essentially no masking occurs beyond this interval regardless of the intensity
of the masker. This shows that an interval of approximately 200 ms is needed for two tones to be
perceived as fully independent, without influencing each other's loudness.
Both elements show the importance of the 200 ms threshold in auditory perception. It seems likely
that for tapping the tempo the events need to be fully developed in perception.
The transition between two distinct perception strategies around 200 ms is corroborated by
research on tempo discrimination (Friberg & Sundberg, 1995), where a transition from holistic
(based on the complete pattern) to analytic (interval to interval) perception is found around
250 ms. This is reflected by a change from a constant absolute JND (ca. 6 ms) to a constant
relative JND (ca. 2.5%). The somewhat higher threshold found here and in other studies shows that
not everything changes suddenly at 200 ms: the zone between 200 and 300 ms can be regarded as a
transition zone in which controlled tapping is possible, but still relying on more or less
automated processes, without the possibility of anticipation and immediate adjustment. The range
of tempi where a constant relative JND is found extends to about 1000 ms.

The limit at slow tempi
To give an explanation for the 1500 ms boundary, the resonance theory of pulse perception can be
compared with the integration period of memory processes. In order to link two successive tones
we need to be able to keep them simultaneously in our perceptual working memory, sometimes called
the perceptual (or subjective) present. However, the exact range of this perceptual present is
difficult to establish. Fraisse (1957) found a maximum of about 5 seconds, but also stated that
our perceptual present only seldom exceeds 2 or 3 sec. More recently, converging evidence from
different fields points to a window of temporal integration working up to 3 seconds (Pöppel,
1996). Only if different acoustic stimuli occur within this integration interval, the listener
will be able to establish a tempo

(Pöppel, 1989). This means that the slowest possible tempo is determined by a period with half
the length of the perceptual present (or 1500 ms if the perceptual present equals 3000 ms). The
evidence cited to fix the extent of the perceptual present at 3 seconds includes the location of
a temporal indifference point (cf. supra); the decrease of accuracy in the judgement of intensity
difference and interval duration when the interval goes beyond 3 sec.; shifts of attention in the
perception of ambiguous figures or sound sequences; the limits of subjective rhythmization (cf.
supra) and synchronized motor response; limitations of working memory; the duration of
spontaneous intentional acts; and find-

Fig. 7. Comparison between the resonance curve developed by modelling Parncutt's (1994) experiment
on tapping to isochronous sequences and the band pass filter used in Parncutt's own model.
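
The band-pass filter attributed to Parncutt (1994) in Figure 7 is described in the following section as a normal distribution on a logarithmic (base 10) period scale, with `typical' values of 700 ms for the moderate pulse period and 0.2 for the standard deviation. A minimal sketch of such a curve is given below. The function name and the normalisation to a peak value of 1 are assumptions made for illustration; they are not taken from Parncutt's model or from the present paper.

```python
import math


def parncutt_weight(period_ms, mu_ms=700.0, sigma=0.2):
    """Gaussian on a log10 period axis, scaled so that the peak equals 1.

    mu_ms : 'moderate pulse period' in ms (typical value quoted in the text)
    sigma : standard deviation in log10 units (typical value quoted in the text)
    The peak normalisation is an assumption of this sketch.
    """
    z = math.log10(period_ms / mu_ms) / sigma
    return math.exp(-0.5 * z * z)


# Print the relative weight of a few pulse periods (in ms).
for period in (300, 500, 700, 1000, 1500, 3000):
    print(period, round(parncutt_weight(period), 3))
```

Because the curve is symmetrical on the logarithmic axis, a period of 350 ms receives the same weight as one of 1400 ms. With these typical parameters the weight at 3000 ms falls below 1% of the peak, in line with the remark that this filter leaves little room for very slow tempi.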

Comparison with Parncutt's pulse salience filter
The approach to estimating a function for periodicity filtering that is closest to our method is
the one used by Parncutt (1994). His model includes a module which functions in a similar way to
our resonance curve: a band-pass filter that allows only a specific range of periods to be
considered as the pulse. A problem in making a comparison between his and our model is that both
are part of a larger system: Parncutt's filter function is part of a global model of pulse
salience and metric accents, using other modules as well. Our modelling cannot be performed
successfully without adding the grouping strengths. Nevertheless it is interesting to compare
both curves and look for similarities and differences.
Parncutt uses a filter function that has the shape of a normal distribution on a logarithmic
periodicity scale. It can be characterized by two parameters: the logarithm (base 10) of the
`moderate pulse period' (a value comparable to our `resonance period') and its standard deviation,
defining the width of the curve and thus comparable to our damping constant. The magnitude of
both parameters was also found with an optimization method. Interestingly, his results also show
differences between the data sets to which the model was applied. The center frequency (expressed
as a period) varies between 660 and 760 ms, the standard deviation between 0.14 and 0.23. As
`typical' values Parncutt indicates 700 ms and 0.2. In Figure 7 the shape of this curve is
compared with our resonance curve. The `fast' side of the curves is very similar, but Parncutt's
curve is broader, and peaks at a somewhat slower tempo. Thus it gives considerably more weight to
periods between 700 and 1500 ms. A possible explanation is already mentioned in section 2.2,
where the initial modelling tended to produce a very broad curve. We solved this problem by
introducing the possibility to use the double tempo as internal pulse. Since Parncutt does not add
the possibility of an alternative tapping strategy, he has to use a somewhat broader curve to
deal with these longer periodicities. Another difference between the two curves is that
Parncutt's band-pass filter leaves no place for very slow tempi (T > 3000 ms). In our model they
are considered to be possible, which corresponds to the physiological reality, but unlikely to be
perceived as the pulse. The role and shape of our resonance filter and Parncutt's pulse salience
filter seem rather similar. The difference is that the resonance curve is based upon a plausible
model, while the salience filter is just a symmetrical curve around a maximum that improves the
model and roughly corresponds to perception.

TEMPO DISTRIBUTION OF MUSICAL PIECES

Until now we have dealt with regular patterns, consisting of only one or of two independent
layers of identical, equally spaced sounds. When we want to investigate tempo in `real' music, we
are confronted with more complicated structures and interactions.

In music, usually a number of different isochronous beats that fall within the existence region
of musical pulse occur simultaneously. Variations in pitch, timing, dynamics and timbre,
prescribed in the score or added by the performer (`expressive'), may influence the listener's
choice of the pulse.
If the resonance model presented above is valid in these more complex situations, the curve
describing the distribution of tempi perceived in music should be characterized by the following
two properties. First, the periodicities close to the resonance period should considerably more
often be chosen as the actual tempo than their multipliers and divisors. Or, from a different
point of [...] away from the tempo associated with the resonance frequency (i.e., very slow or
very fast), the music (or the performer(s)) should provide strong cues to impose this particular,
`less natural', tempo. Techniques to do this include the avoidance of regular division of a slow
pulse and strong accentuation of every beat of a fast pulse. Second, tempi beyond or below the
limits of the effective resonance region of musical pulse should not occur. This should result in
a distribution of tempi that has a similar shape and position as the resonance curve. If the
resonance frequency is universal, as the agreement between participants in the experiments
analysed in chapter 2 suggests, inter-individual differences should be minimal. However,
differences between different samples of music might occur, depending on the character of the
music they contain. The perceived tempi in a collection of lullabies, e.g., can be expected to
have a distribution in which slow tempi occur somewhat more often than in a random sample.
The distribution of tempi commonly perceived in music was measured in two ways: data were
collected from an experiment in which participants had to tap along with the perceived beat of
random musical fragments heard on the radio or from recordings of music of a specific style, and
an analysis of existing data describing the tempo of dance music was done.

Dance music tempi
Contemporary dance music is considered as particularly interesting in the study of preferences
for musical tempo. Since its primary purpose is to elicit bodily movements it tends, more than
other music, to invoke the perception of a strong pulse. So-called `bpm-lists' giving listings of
metronome numbers (or `beats per minute') are compiled by DJ's to help their colleagues in
meeting the high demands the contemporary dance scene makes [...] Professional DJ's are required
to produce faultless, unnoticeable mixes and have to speed up and slow down the tempo in a fluent
and controlled way. Lists giving an exact measurement of the tempo can be of great help. An
interesting aspect concerning these lists is that they are necessarily perceptual, not based on
indications in scores. Of course one reason for that is that scores of this type of music are
scarce. But more important is that only the effect on the audience is relevant for the purposes
of the DJ's, since it is the tempo that people will most probably choose as basis for their dance
that has to be listed. In general the bpm-lists used are made in a similar way as our tapping
experiment (cf. infra), although using other programs to calculate the beats per minute (for an
overview see, e.g., tune/bpm counting.html).
Six lists were collected from the internet[1], each containing between 1185 and 3035 entries,
with a total number of 12148. They were compiled in 6 different countries (Finland, Canada, U.K.,
Germany, France and Sweden). Each of them had a similar shape with means between 446.9 ms and
489.8 ms; paired correlations of the histograms (as presented in the figures) were all highly
significant (r > .80, with p < 0.001). Overlaps between the different lists

resulted in some entries occurring several times in the final list; given the large amount of
data, these are not likely to affect the results. Moreover it can be argued that the pieces that
occur several times are also the most popular pieces and/or the most widely distributed, and are
as such allowed to have more weight in the final representation.
The results of the observed tempi are shown in Figure 8, plotted in blocks of 20 ms from 200 ms
to 1500 ms. The mean tempo is found at 463 ms, the median at 451 ms and the mode is situated at
the interval 440–460 ms, with 20.8% of the entries; the standard deviation is 75 ms. The range
between 400 ms and 500 ms clearly stands out, with each of the five segments representing more
than 9% and the whole representing 72.5% of all entries. All segments containing more than 1% of
all entries are located between 340 ms and 640 ms, with a total of 96% of all tempi located
between these two pulse periods.

The tapping experiment
A measurement of tempo by letting participants tap along with the music seems to be the best way
to collect the subjective data of pulse perception that we need. Other methods, like stopwatch
timing (Collier & Collier, 1994), tape measurement (see Epstein, 1995) and onset detection
(Moelants & Rampazzo, 1997), are restricted to the measurement of time-spans. They don't
necessarily include the possibility to decide which of the regular patterns detected is actually
perceived as the pulse. Therefore measurements of tempo done with these methods mostly rely on a
theoretical account of what is supposed to be the pulse. In general the denominator of the time
signature is chosen as the pulse. Using this value might work in most cases, but, e.g., the
numerous pieces written in a fast 6/8 meter, and counted in two, are evident examples of the
impossibility to generalize this procedure. To solve this problem, a large amount of research has
been done in the field of computational modeling of beat induction (see Desain & Honing, 1995). A
ready to use model, dealing with random fragments of music, does not yet seem to exist.
Data were collected with a computer program specifically designed for this purpose. Intervals
were entered by pressing a key at the perceived rate. The program automatically stops recording
after 40 entries, and averages them to determine global tempo. After each run the user has the
possibility to enter some informative data, then everything is

Fig. 8. Distribution of tempi in contemporary dance music.
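
The kind of processing described around this figure (turning key presses into inter-tap intervals, averaging up to 40 entries into a global tempo, and summarising a set of periods in 20 ms blocks) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the bins are aligned to multiples of 20 ms, and the sample standard deviation is used. It is not the program used in the study.

```python
from collections import Counter
from statistics import mean, median, stdev


def inter_tap_intervals(tap_times_ms):
    """Differences between successive key-press times (ms)."""
    return [b - a for a, b in zip(tap_times_ms, tap_times_ms[1:])]


def global_tempo(intervals_ms, max_entries=40):
    """Average the first `max_entries` intervals; return (period in ms, bpm)."""
    period = mean(intervals_ms[:max_entries])
    return period, 60000.0 / period


def summarize(periods_ms, bin_ms=20):
    """Mean, median, SD and modal 20 ms block of a list of periods (ms)."""
    # Assumption: blocks start at multiples of bin_ms (e.g. 440-460 ms).
    bins = Counter(int(p // bin_ms) * bin_ms for p in periods_ms)
    lo, count = max(bins.items(), key=lambda item: item[1])
    return {
        "mean": mean(periods_ms),
        "median": median(periods_ms),
        "sd": stdev(periods_ms),  # assumption: sample standard deviation
        "mode_bin": (lo, lo + bin_ms),
        "mode_share": count / len(periods_ms),
    }


# Hypothetical key-press times in ms, for illustration only.
taps = [0, 500, 1000, 1490, 2000]
intervals = inter_tap_intervals(taps)
print(global_tempo(intervals))
print(summarize(intervals))
```

Keeping every inter-tap interval, rather than only the average, is what makes the second summary possible; a counter that stores only the resulting mean could not produce a histogram.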


stored and a new measurement can be started. The program has the advantage that every intertap
interval is registered and stored in a database file, while other existing bpm-counters only
record the resulting averages. The latter might be enough to deal with the constant tempi found
in dance music, but when dealing with tempo perception in music with a fluctuating pulse we
should be able to record abrupt as well as gradual tempo changes.

Radio
In a first experiment 4 musically skilled participants were asked to tap along with the perceived
pulse of music broadcast on the radio, randomly [...] this way a general image of the perceived
tempi of the music present in our culture can be established. The number of pieces to which each
participant tapped varied between 125 and 250. In the analysis the duration of each inter-tap
interval was used separately.
Correlations between the results obtained by the four participants are all highly significant
(correlations between .83 and .94, p < 0.001). This allows us to sum them; this summation is
shown in Figure 9. For these data, the mean is at 566 ms, the median at 509 ms and the mode at
460–480 ms (6.8% of the entries); the standard deviation is 195 ms. The range 420–520 ms contains
32.6% of all entries and is the most important 100 ms interval. The smallest interval containing
75% of the entries is found between 320 and 720 ms.

Styles
In addition to this, using the same design, selective listening tests were done, tapping along
with specific selections of music, representing divergent styles:
1. 16th century Flemish polyphony: religious and [...] Adriaan Willaert, Orlandus Lassus,
   Philippus de Monte and contemporaries.
2. French baroque music from the first half of the 18th century: predominantly chamber music of
   François Couperin, Marin Marais, Jean-Philippe Rameau, Jean-Marie Leclair and contemporaries.
3. Romantic piano music composed by Franz Schubert, Robert Schumann, Frédéric Chopin and Franz
   Liszt.

Fig. 9. Distribution of tempi in radio music.


Table 5. Overview of the results of the tapping experiments with music of different styles. Each
row gives, for one style: the number of pieces included in the experiment (N); the median and the
mean; the mode (in 20 ms intervals); the standard deviation (giving an idea of the spread); the
100 ms interval with the highest number of entries and the percentage of entries it represents;
and the smallest interval that includes 75% of all entries. All tempo values are periods in ms.

Style          N    Median  Mean  Mode     St. Dev.  Highest 100 ms   75% int.
1. Polyphony   193  574     616   540–560  200       500–600 (34.4%)  380–800
2. Baroque     285  530     589   500–520  231       420–520 (24.2%)  280–740
3. Romantic    251  552     600   460–480  228       480–580 (20.7%)  300–780
4. Jazz        306  451     490   320–340  183       420–520 (26.9%)  260–580
5. Charts      145  488     537   440–460  144       400–500 (49.5%)  380–620

4. `Classic' jazz music from the late thirties to the early sixties, or from Glenn Miller to John
   Coltrane.
5. Contemporary hit parade music, based on the top 30 of the Flemish national radio 2 and the top
   20 of the `alternative' charts of the youth channel of the Flemish national radio `Studio
   Brussels', collected in the middle of 1998.
The data were collected using the same procedure as for the tapping experiment on radio music.
However, the data analysed in this chapter were collected by one single subject (author DM). This
is justified by the lack of significant intersubject differences in the `radio' experiment. An
overview of the results is given in Table 5.

Analysis and Discussion
The different measurements of tempo distribution in music show a considerable similarity as well
as some interesting differences. As a measure of similarity the correlations between the
(categorized) results can be used; an overview is shown in Table 6. All these correlations are
highly significant (p < 0.001), except the correlation between polyphony and dance music
(p = 0.012). This shows the large agreement between the shapes of the different tempo
distributions. The statistical averages show that the distribution of the answers is concentrated
around the same region: means vary between 463 ms (dance) and 616 ms (polyphony) with an average
of 545 ms, medians vary between 451 ms (dance and jazz) and 574 ms with an average of 508 ms, and
the mode varies between 320–340 ms (jazz) and 540–560 ms (polyphony) with an average of 461 ms
(middle point of the means of the category boundaries). The differences between these numbers
already give some indications about the shape of the distributions: they all have a positive
skewness (between 3.5 for the dance music and 0.7 for the radio tapping experiment), which means
that their mode is at a lower value (i.e., a faster tempo) than the mean. Both the location of
the modes and averages and the general outline of the shape of the different tempo distribution
plots, with their steep slope at the side of the fast tempi and a gradual descent at the side of
the slow tempi, resemble the theoretical resonance curves described in the previous chapter.
However, each of the categories has individual properties that discriminate it from the rest:

Table 6. Correlations between the tempo distributions in the different measurements.

            Dance  Radio  Polyphony  Baroque  Romantic  Jazz   Charts
Dance       1      .786   .307       .683     .581      .655   .930
Radio              1      .802       .940     .894      .792   .909
Polyphony                 1          .811     .818      .567   .572
Baroque                              1        .959      .877   .801
Romantic                                      1         .850   .719
Jazz                                                    1      .678
Charts                                                         1
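
As a small worked check on Table 6, the mean correlation of each data set with the six others can be computed directly from the upper triangle of the table. The sketch below uses only the tabulated values; the variable and function names are illustrative.

```python
labels = ["Dance", "Radio", "Polyphony", "Baroque", "Romantic", "Jazz", "Charts"]

# Upper triangle of Table 6, row by row, without the diagonal.
upper = [
    [0.786, 0.307, 0.683, 0.581, 0.655, 0.930],  # Dance vs Radio ... Charts
    [0.802, 0.940, 0.894, 0.792, 0.909],         # Radio vs Polyphony ... Charts
    [0.811, 0.818, 0.567, 0.572],                # Polyphony vs Baroque ... Charts
    [0.959, 0.877, 0.801],                       # Baroque vs Romantic ... Charts
    [0.850, 0.719],                              # Romantic vs Jazz, Charts
    [0.678],                                     # Jazz vs Charts
]

# Build the full symmetric matrix.
n = len(labels)
r = [[1.0] * n for _ in range(n)]
for i, row in enumerate(upper):
    for k, value in enumerate(row):
        j = i + 1 + k
        r[i][j] = r[j][i] = value


def mean_correlation(i):
    """Mean correlation of data set i with all other data sets."""
    others = [r[i][j] for j in range(n) if j != i]
    return sum(others) / len(others)


for i, name in enumerate(labels):
    print(f"{name:10s} {mean_correlation(i):.3f}")
```

Run on the values above, this gives .854 for `Radio', .657 for dance music and .646 for polyphony, the averages quoted in the discussion of the radio listening experiment.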

- The dance music tempi (Fig. 8) are in general somewhat faster than the average, but more
  striking is the sharp peak around 450 ms. The vast majority of the entries lies within a very
  small range, which is reflected in a high kurtosis (12.5) and a low standard deviation (75 ms).
  This indicates a clear preference for a narrow range of rather fast tempi, tempi that allow a
  free, exciting movement, without too high demands on the dancers.
- The same characteristics are, to a lesser extent, found in the tempi of the music in the charts
  (cf. the high correlation between them). Of course there is a considerable amount of typical
  dance music included in these data, but also pop songs, ballads, rock, local schlager singers
  and different `alternative' styles (from heavy-metal to trip-hop). This broader field is
  reflected in a larger amount of relatively slow tempi. Whereas the mode is the same as in the
  previous category, median and mean are considerably slower, the standard deviation is higher
  and the kurtosis lower (7.76). However, in all of the other style categories the spread (st.
  dev.) is much larger. Remarkable is also the lack of really slow tempi, none above 1200 ms and
  only 2.3% of the entries above 900 ms. Clearly slower tempi do not seem to be a good recipe to
  score a hit!
- At the other end of the spectrum we find the 16th century polyphony. This category clearly
  stands out because of its larger amount of moderately slow tempi, having consistently the
  highest value for mean, median and mode. The tempi for this category are mainly collected from
  religious music, a type of music that is generally characterized as `contemplative' or
  `relaxing'. The inclusion of secular chansons and madrigals and especially that of some dance
  music augmented the entries at higher tempi. Nevertheless this category clearly stands against
  the previous two, emphasizing relatively slow, relaxing tempi. Note that the highest values in
  this category are close to the traditional measurement of preferred or moderate tempo (cf.
  supra).
- The results of the tempo analysis of baroque and romantic music show a big similarity. In both
  categories the highest number of entries is situated around 500 ms. They also show a large
  spread of tempi (standard deviations 231 and 228 ms respectively), with a considerable amount
  of both fast (3.8 and 3.3% < 300 ms; 19.8 and 19.9% < 400 ms) and slow tempi (2.6 and 2.1%
  > 1200 ms; 10.3 and 11% > 900 ms). This wide distribution of tempi seems to be related to the
  strongly `expressive' character of both styles, in which movements expressing sadness and
  melancholy contrast with virtuosity and energetic dance movements. This differentiation
  between different characters is also indicated by a tendency to bimodality in the data of the
  romantic piano music, with a secondary peak just below 400 ms.
- This bimodality is much stronger in the jazz music data, where we clearly see two peaks: one at
  460–480 ms and one at a fast tempo (320–340 ms). This clear preference for faster tempi
  confirms the characterization of jazz as `fast' music. However, the extremely fast tempi
  (< 200 ms), measured by Collier and Collier (1994), are not present here and seem to be a
  product of the method employed: they always attribute the tempo to the quarter note, while it
  seems reasonable that it shifts to the half note at very high speed. Apart from the dominance
  of fast tempi, the amount of entries at tempi slower than 600 ms is still considerable,
  pointing to the presence of more slowly perceived jazz ballads.
- Finally the data of the radio listening experiment (see Fig. 9), including many styles of
  music. Here we see a clear peak at the moderate tempi. The typical shape of the resonance curve
  with its steep slope at the fast side and a long, gradual decrease at the slow side is clearly
  present in these data. The tempo distribution of random music fragments, as heard on the radio,
  seems to be a good average of the previous results. The individual characteristics of the
  different styles are not exactly reflected in the radio tapping data, but neither are they
  contradicted by them. `Radio' has the highest average correlation (.854), while the extremes
  (dance music and polyphony) have the lowest ones (.657 and .646 respectively). It should also
  be noted that the sum of the results obtained for the six individual musical styles is highly
  similar to the results of the radio listening experiment, with a correlation of .972. Scaled
  distributions of both are shown in Figure 11.
Both the distribution of tempi in the radio tapping and in the `sum of styles' data should
represent a good approximation of tempi perceived in music. Compared to the resonance curve
obtained by the analysis of Handel and Oshinsky's data (black solid line in Fig. 11), we see that
both distributions are somewhat faster and have a sharper peak. However the deviation of the
resonance frequency is not dra-
Fig. 10. Distribution of tempi in different styles of music. Top row, left: 16th century Flemish
polyphony; right: early 18th century French baroque music. Middle row, left: romantic piano
music; right: classic jazz. Bottom row, left: contemporary charts; right: sum of all 5 styles. The
graphs of the individual styles were scaled by equalizing the mode with 100 and adapting the other
data proportionally. The same distribution is used in the sum of the five previous graphs and the
distributions of the dance music tempi and the `radio' experiment, with every number reduced to a
seventh of its value.

Fig. 11. Distributions of tempi found in the radio listening experiment (dotted line) and the sum of tempi found in the
six separate style categories (thin solid line) and the resonance curve model for tempo in music (grey solid
line). For comparison the three curves used to model the experimental data in chapter 2 are given, they are re-
presented by thick black lines. The thick black solid line gives the resonance curve found by analyzing Handel
and Oshinsky's data (f0 = 2 Hz. (T0 = 500 ms) and ˆ 1:12), the dashed line that used for Parncutt (f0 =
1.818 Hz. (T0 = 550 ms) and ˆ 0:45) and the dotted the one found for Vos' data (f0 = 0.909 Hz. (T0 =
1100 ms) and ˆ 0:22). At the top the range of `tempo categories' as found in music theory (Willemze,
1971) is given.

matic: the peak falls at a point where both the dou- the same parameters as the curve of the model.
ble (228 ms) and half (912 ms) tempo still have a This method allows a quantification of the
considerably lower probability. We can interpret observed differences between the data sets and
this as the result of an overall preference for moder- between each set and the theoretical resonance
ately fast tempi in music. curve, using a minimal amount of variables. For
the whole set of the tapping data a precise approxi-
The resonance curve as a musicological tool for mation can be given using a curve with parameters
tempo analysis f0 = 2.193 Hz. (T0 = 456 ms) and ˆ 0:5 (grey
Another way to approach these data, is to model solid line in Fig. 11). This curve has correlations
each of the categories separately by changing the .988 and .971 with the radio tapping and sum of
parameters of the theoretical resonance curve. The styles data respectively. The results for the individ-
resonance curve is not used here to model pulse ual styles are shown in Table 7, their position in
perception, but as a musicological tool to charac- relation to the model (with T0 = 500 ms and
terize the tempo characteristics of different styles ˆ 1:12) is plotted in Figure 12. It can be seen
of music. A similar method of optimalization as that the values of the parameters are in agreement
used to find the characteristics of the resonance with the stylistic characteristics discussed. First
model is used to maximize the match between the look at the damping constant, the variable that
histograms describing the distribution of tempi in determines the width of the resonance curve. It is
different styles, and a flexible curve determined by clear that the categories with a large diversity of

Table 7. Characterization of the different styles using a variable (normalized) resonance curve. The curves are characterized by the period of their resonance frequency and the damping constant. In the bottom row the correlation between data and model is given.

              Dance   Radio   Polyphony   Baroque   Romantic   Jazz   Charts
T0 (ms)       433     455     525         418       400        332    440
β             0.01    0.43    0.12        1.10      3.18       2.59   0.07
correlation   .946    .988    .979        .946      .975       .948   .981
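The periods in Table 7 can be related to the frequencies quoted in the text with simple arithmetic. The following is a minimal sketch of that conversion; the function names and the beats-per-minute output are illustrative additions and do not come from the original analysis. It reproduces the pairing of T0 = 456 ms with f0 = 2.193 Hz and the double and half tempo periods of 228 ms and 912 ms mentioned above.

```python
# Resonance periods T0 in milliseconds, copied from Table 7.
T0_MS = {
    "Dance": 433,
    "Radio": 455,
    "Polyphony": 525,
    "Baroque": 418,
    "Romantic": 400,
    "Jazz": 332,
    "Charts": 440,
}


def period_to_hz(t0_ms):
    """Convert a period in milliseconds to a frequency in Hz."""
    return 1000.0 / t0_ms


def period_to_bpm(t0_ms):
    """Convert a period in milliseconds to beats per minute."""
    return 60000.0 / t0_ms


def describe(t0_ms):
    """Collect the derived quantities for one resonance period (hypothetical helper)."""
    return {
        "f0_hz": period_to_hz(t0_ms),
        "bpm": period_to_bpm(t0_ms),
        "double_tempo_ms": t0_ms / 2,  # twice as fast means half the period
        "half_tempo_ms": t0_ms * 2,    # half as fast means twice the period
    }


if __name__ == "__main__":
    for style, t0 in T0_MS.items():
        d = describe(t0)
        print(f"{style:<10} T0 = {t0} ms  f0 = {d['f0_hz']:.3f} Hz  {d['bpm']:.1f} BPM")
```

For example, `describe(456)` gives the double and half tempo periods quoted for the peak of the tapping data.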

tempi (baroque, romantic, jazz) have a relatively high β, while the curves for the dance and charts categories have a very low damping. The latter indicates that a high number of responses is concentrated within a narrow band. The T0 values give a similar view as the mode of the data sets: generally the periods are shorter than the resonance period. Only the characteristic period of the polyphony falls between 500 and 550 ms. The others are around 100 ms shorter; only the T0 found for the jazz data stands out as very low. Further experiments have to determine whether the resonance curve can indeed be used to compare the tempo characteristics of different samples of music using only a minimum of parameters.

GENERAL DISCUSSION

The main hypothesis of this paper is that a resonating structure underlies pulse perception. The resonance curve provides us with a useful model to explain existing experimental data on grouping and tapping along isochronous and polyrhythmic tone sequences. Three sets of data were modelled using slightly different resonance curves. At first sight this diminishes the strength of the model. However, as one can see in Figure 11, the curves used to model Parncutt's and Handel and Oshinsky's data are quite similar (correlation 0.95). And, as explained before, the third curve, used to model the perception of grouping, can easily be related to

Fig. 12. Comparison of the tempo characteristics of different musical styles, in relation to the resonance curve developed through modelling. The x-axis represents the difference in resonance frequency between the model and the best fitting resonance curve for the tapping experiments: the further to the right, the larger the characteristic frequency. The y-axis corresponds to the difference in damping (based on the logarithm base 10): the higher, the broader the distribution.
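The axes of Fig. 12 can be approximated from the values in Table 7. The sketch below is not the authors' procedure: it assumes that the x-coordinate is the style's resonance frequency minus that of the reference model (T0 = 500 ms), and that the y-coordinate is the difference of the base-10 logarithms of the damping constants (reference value 1.12). The function and variable names are hypothetical.

```python
import math

# Reference model quoted in the text: T0 = 500 ms, damping constant 1.12.
MODEL_T0_MS = 500.0
MODEL_DAMPING = 1.12

# (T0 in ms, damping constant) per style, copied from Table 7.
TABLE_7 = {
    "Dance": (433, 0.01),
    "Radio": (455, 0.43),
    "Polyphony": (525, 0.12),
    "Baroque": (418, 1.10),
    "Romantic": (400, 3.18),
    "Jazz": (332, 2.59),
    "Charts": (440, 0.07),
}


def fig12_position(t0_ms, damping):
    """Return (x, y) under the assumed axis definitions.

    x: resonance frequency of the style minus that of the model, in Hz.
    y: log10 of the style's damping minus log10 of the model's damping.
    """
    x = 1000.0 / t0_ms - 1000.0 / MODEL_T0_MS
    y = math.log10(damping) - math.log10(MODEL_DAMPING)
    return x, y


positions = {style: fig12_position(t0, d) for style, (t0, d) in TABLE_7.items()}

if __name__ == "__main__":
    for style, (x, y) in positions.items():
        print(f"{style:<10} x = {x:+.3f} Hz   y = {y:+.3f}")
```

Under these assumptions the model itself sits at the origin, styles with a shorter period than 500 ms fall to the right, and styles with a narrow tempo distribution (dance, charts) fall well below the horizontal axis.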

the other two, as its resonance frequency is about half the value of the other two. Moreover, it must be noted that several factors could influence the exact location of the resonance frequency, such as the subjects involved in the experiment and the nature of the experiment. The latter also explains the use of different sets of parameters in the modelling of the different experiments. These parameters do not have a direct relationship with tempo preference. They are necessary to infer the resonance curve from data in which elements other than the pure preference for certain tempi also have an influence. The value of these parameters can provide us with information about the strategies used by the listeners to succeed in their task.

The resonance curve is also a good model to characterize the tempo distribution in music. In principle it should be the distribution of repetition levels in music, filtered by the theoretical resonance curve. Further work is necessary to find out which elements in music determine this sequence of events that attract the tapping of the subjects. In any case, just by varying the two variables T0 and β, the results of tapping experiments with music of different styles can be modelled. These numbers then give an idea of the tempo characteristics of the style. They can be used for a comparison between different sets of data and to determine the position of the data set in relation to preferred tempo as derived from the modelling of the experimental data. In a similar way the model can be the basis of further research on the dynamics of tempo perception, e.g., whether people differ in their resonance frequency, or whether the resonance frequency depends upon the excitation state of people (depending, e.g., on the type of music, on their age, etc.).

If we compare the characteristic frequencies of the different resonance curves we see that the more complex the stimuli are, the shorter the period of the resonance frequency (music: 456 ms; polyrhythms: 500 ms; simple pulse trains: 550 (1100) ms). Therefore our hypothesis is that more complex patterns `speed up' the resonance frequency. To construct an exactly determined theoretical resonance curve, reflecting `normal' pulse perception, we would need to analyze more material using the same method. The main problem in finding a `universal' curve using this method is the specific characteristics of each experiment and of the participants. These inevitably influence the results, so that small differences in the parameters of the resonance curves can be expected. For further applications, we propose to use, as a first approximation, the resonance curve found by modelling the experiment of tapping on polyrhythmic sequences (T0 = 500 ms and β = 1.12).

Further experiments are necessary to explain the nature of the resonating system. Todd (this volume) considers human locomotion as the origin of tempo perception; in particular, the tempo of human walking (range of 434–613 ms for females and 444–662 ms for males between 18 and 50) seems to be closely related to the resonance frequency. The analogy with human movement has been used in other types of modelling, e.g., the structure of the final ritard (Sundberg & Verrillo, 1980; Kronman & Sundberg, 1987) and modelling of rubato (Todd, 1995).

Apart from the determination of limits and shape of human pulse perception, the resonance model can be used in other applications. It could be useful as an element of a tempo tracking algorithm. When used as a band pass filter it can make a choice between the different periodicities found in the music and give them a `probability weight'. Other applications can be found in musicology: apart from its use as a tool to classify and compare tempi in different styles, it can provide an explanation for the choice of performers for a particular tempo and for (conscious or unconscious) tempo drift.

REFERENCES

Bolton, T.L. (1894). Rhythm. American Journal of Psychology, 6, 145–238.
Campbell, R., & Counter, S. (1969). Temporal integration and periodicity pitch. The Journal of the Acoustical Society of America, 45, 691–693.
Collier, G.L., & Collier, J.L. (1994). An exploration of the use of tempo in jazz. Music Perception, 11(3), 219–242.
Desain, P. (1992). A (de)composable theory of rhythm perception. Music Perception, 9(4), 439–454.
Desain, P., & Honing, H. (1995). Computationeel modelleren van beat-inductie. In Van frictie tot wetenschap – Jaarboek 1994–1995, Vereniging van Academie-onderzoekers (p. 83–95). Amsterdam.

Durrant, J.D., & Lovrinic, J.H. (1977). Bases of hearing science. Baltimore, MD: Williams & Wilkins.
Epstein, D. (1995). Shaping time. New York: Schirmer Books.
Fraisse, P. (1956). Les structures rythmiques. Leuven: Publications Universitaires de Louvain.
Fraisse, P. (1957). Psychologie du temps. Paris: Presses Universitaires de France.
Fraisse, P. (1963). Perception et estimation du temps. In P. Fraisse & J. Piaget (Eds.), Traité de psychologie expérimentale: VI. La perception (p. 59–95). Paris: Presses Universitaires de France.
Fraisse, P. (1982). Rhythm and tempo. In D. Deutsch (Ed.), The psychology of music (p. 149–180). New York: Academic Press.
Friberg, A., & Sundberg, J. (1995). Time discrimination in a monotonic, isochronous sequence. The Journal of the Acoustical Society of America, 98, 2524–2531.
Gelfand, S.A. (1981). Hearing – an introduction to psychological and physiological acoustics. New York: Marcel Dekker, Inc.
Green, D.M. (1973). Temporal acuity as a function of frequency. The Journal of the Acoustical Society of America, 54, 373–379.
Handel, S., & Oshinsky, J.S. (1981). The meter of syncopated auditory polyrhythms. Perception and Psychophysics, 30(1), 1–9.
Hirsch, I.J. (1959). Auditory perception of temporal order. The Journal of the Acoustical Society of America, 31(6), 759–767.
Kay, B.A., Kelso, J.A.S., Saltzman, E.L., & Schöner, G. (1987). Space-time behavior of single and bi-manual rhythmical movements: data and limit cycle model. Journal of Experimental Psychology: Human Perception and Performance, 13(2), 178–192.
Kneubühl, F.K. (1997). Oscillations and waves. Berlin: Springer.
Kronman, U., & Sundberg, J. (1987). Is the musical ritard an allusion to physical motion? In A. Gabrielsson (Ed.), Action and perception in rhythm and music (p. 57–68). Publications issued by the Royal Swedish Academy of Music.
Langer, J., & Kopiez, R. (1995). Entwurf einer neuen methode der performanceanalyse auf grundlage einer theorie oszillierender systeme (tos). Musik Psychologie – Jahrbuch der Deutschen Gesellschaft für Musikpsychologie, 12, 9–27.
Large, E.W., & Kolen, J.F. (1994). Resonance and the perception of musical meter. Connection Science, 6(2/3), 177–208.
Lee, C.S. (1991). The perception of metrical structure: Experimental evidence and a model. In P. Howell, R. West, & I. Cross (Eds.), Musical structure and cognition (p. 59–128). London: Academic Press.
McAuley, J.D. (1995). Perception of time as phase: Toward an adaptive-oscillator model of rhythmic pattern processing. Unpublished doctoral dissertation.
Meumann, E. (1894). Untersuchungen zur Psychologie u. Aesthetik d. Rhythmus. Philosophische Studien, 10, 249–322 and 393–430.
Michon, J.A. (1967). Magnitude scaling of short durations with closely spaced stimuli. Psychonomic Science, 9(6), 359–360.
Miller, B., Scarborough, D., & Jones, J. (1992). On the perception of meter. In M. Balaban, K. Ebcioglu, & O. Laske (Eds.), Understanding music with AI: Perspectives on music cognition (p. 428–447). Menlo Park: The AAAI Press.
Moelants, D., & Rampazzo, C. (1997). A computer system for the automatic detection of perceptual onsets in a musical signal. In A. Camurri (Ed.), Kansei – the technology of emotion (p. 141–146). Genova.
Parncutt, R. (1987). The perception of pulse in musical rhythm. In A. Gabrielsson (Ed.), Action and perception in rhythm and music (p. 127–138). Publications issued by the Royal Swedish Academy of Music.
Parncutt, R. (1994). A perceptual model of pulse salience and metrical accent in musical rhythms. Music Perception, 11(4), 409–464.
Penner, M.J. (1977). Detection of temporal gaps in noise as a measure of the decay of auditory sensation. The Journal of the Acoustical Society of America, 61, 552–557.
Peters, M. (1989). The relationship between variability of intertap intervals and interval duration. Psychological Research, 51, 38–42.
Plomp, R., & Bouman, M.A. (1959). Relation between hearing threshold and duration for pure tones. The Journal of the Acoustical Society of America, 31, 749–758.
Pöppel, E. (1978). Time perception. In R. Held, H.W. Leibowitz, & H. Teuber (Eds.), Handbook of sensory physiology, Vol. 8: Perception (p. 713–729). Berlin: Springer Verlag.
Pöppel, E. (1989). The measurement of music and the cerebral clock: A new theory. Leonardo, 22(1), 83–89.
Pöppel, E. (1996). Reconstruction of subjective time on the basis of hierarchically organized processing system. In M.A. Pastor & J. Artieda (Eds.), Time, internal clocks and movement (p. 165–185). Amsterdam: Elsevier Science.
Povel, D., & Essens, P. (1985). Perception of temporal patterns. Music Perception, 2(4), 411–440.
Rosenthal, D. (1992). Emulation of human rhythm perception. Computer Music Journal, 16(1), 64–76.
Scheirer, E.D. (1997). Tempo and beat analysis of acoustic musical signals. Journal of the Acoustical Society of America, 103(1), 288–301.
Sundberg, J., & Verrillo, V. (1980). On the anatomy of the ritard: A study of timing in music. Journal of the Acoustical Society of America, 68, 772–779.
Todd, N.P.M. (1995). The kinematics of musical expression. Journal of the Acoustical Society of America, 97(3), 1940–1949.
Todd, N.P.M. (this volume). A sensory-motor theory of rhythm, time perception and beat induction. Journal of New Music Research, 28(1).
Toiviainen, P. (1998). An interactive MIDI accompanist. Computer Music Journal, 22(4), 63–75.
Van Noorden, L. (1975). Temporal coherence in the perception of tone sequences. Unpublished doctoral dissertation.
Van Noorden, L. (1991). Temporele relaties in de waarneming van toonreeksen. In G. Ten Hoopen, P.J.G. Keuss, & A.A.J. Mannaerts (Eds.), Muziekwaarneming: Psychonomische publicaties 3. Amsterdam: Swets and Zeitlinger.
Vos, P. (1973). Waarneming van metrische toonreeksen. Nijmegen: Stichting Studentenpers.

Vos, P.G., & Ellerman, H.H. (1989). Precision and accuracy in the reproduction of simple tone sequences. Journal of Experimental Psychology: Human Perception and Performance, 15(1), 179–187.
Warren, R.M. (1993). Perception of acoustic sequences: Global integration versus temporal resolution. In S. McAdams & E. Bigand (Eds.), Thinking in sound: The cognitive psychology of human audition (p. 37–68). New York: Oxford University Press.
Watson, C.S., & Gengel, R.W. (1969). Signal duration and signal frequency in relation to auditory sensitivity. The Journal of the Acoustical Society of America, 46,
Willemze, T. (1971). Algemene muziekleer. Utrecht/Antwerpen: Het Spectrum.
Wing, A.M., & Kristofferson, A.B. (1973). Response delays and the timing of discrete motor responses. Perception and Psychophysics, 14, 5–12.
Woodrow, H. (1951). Time perception. In S.S. Stevens (Ed.),