To cite this article: Maarten Grachten & Gerhard Widmer (2012) Linear Basis Models for Prediction and Analysis of Musical
Expression, Journal of New Music Research, 41:4, 311-322, DOI: 10.1080/09298215.2012.731071
Journal of New Music Research
2012, Vol. 41, No. 4, pp. 311–322
Abstract

The quest for understanding how pianists interpret notated music to turn it into a lively musical experience has led to numerous models of musical expression. Several models exist that explain expressive variations over the course of a performance, for example in terms of phrase structure, or musical accent. Often, however, expressive markings are written explicitly in the score to guide performers. We present a modelling framework for musical expression that is especially suited to model the influence of such markings, along with any other information from the musical score. In two separate experiments, we demonstrate the modelling framework for both predictive and explanatory modelling. Together with the results of these experiments, we discuss our perspective on computational modelling of musical expression in relation to musical creativity.

1. Introduction and related work

When a musician performs a piece of notated music, the performed music typically shows large variations in expressive parameters like tempo, dynamics, articulation, and, depending on the nature of the instrument, further dimensions such as timbre and note attack. It is generally acknowledged that one of the primary goals of such variations is to convey an expressive interpretation of the music to the listener. This interpretation may contain affective elements, and also elements that convey musical structure (Clarke, 1988; Palmer, 1997).

These insights have led to numerous models of musical expression. The aim of these models is to explain the variations in expressive parameters as a function of the performer's interpretation of the music, and most of them can roughly be classified as focusing either on affective aspects of the interpretation, or on structural aspects. An example of the former is the model of Canazza, De Poli, Drioli, Rodà, and Vidolin (2004), and Canazza, De Poli, and Rodà (2002), which associates perceptual dimensions of performance with physical sound attributes. They identify sensorial and affective descriptions of performances along these dimensions. Furthermore, the rule-based model of musical expression used by Bresin and Friberg (2000) allows for modelling both structure- and affect-related expression. For the latter, they use the notion of rule palettes to model different emotional interpretations of music. These palettes determine the strength of each of a set of predefined rules on how to perform the music.

A clearly structure-oriented approach is the model by Todd (1992), in which tempo and dynamics are (arch-shaped) functions of the phrase structure of the piece. Another example is Parncutt's (2003) model of musical accent, which states that expression is a function of the musical salience of the constituents of a piece. Timmers, Ashley, Desain, Honing, and Windsor (2002) propose a model for the timing of grace notes. Lastly, Tobudic and Widmer (2003) use a combination of case-based reasoning and inductive logic programming to predict dynamics and tempo of classical piano performances, based on a phrase analysis of the piece, and local rules learnt from data.

A structural aspect of the music that has been remarkably absent in models of musical expression is the expressive markings written in the score. Many musical
Correspondence: Maarten Grachten, Johannes Kepler University, Computational Perception, Altenberger Str. 69, Linz, 4040 Austria.
E-mail: maarten.grachten@gmail.com
scores, especially those by composers from the early romantic era, include instructions for the interpretation of the notated music. Common instructions concerning the dynamics of the performance include forte (f) and piano (p), indicating loud and soft passages, respectively, and crescendo/decrescendo, for a gradual increase and decrease in loudness, respectively. Some less common markings prescribe a dynamic evolution in the form of a metaphor, such as calando ('growing silent'). These metaphoric markings may pertain to variations in one or more expressive parameters simultaneously.

At first sight, the lack of expressive markings as a (partial) basis for modelling musical expression might be explained by the fact that, since the markings appear to prescribe the expressive interpretation explicitly, model- [...]

[...] consists of a number of individual factors that jointly determine what the performance of a musical piece sounds like (Palmer, 1996). With this framework, expressive information from human performances can be decomposed into (for now predefined) components, by fitting the parameters of a linear model to those performances. We will refer to this as the linear basis modelling (LBM) framework.¹

Learnt models serve both predictive and explanatory purposes. As a predictive tool, models find practical application in tasks such as automatic score-following and accompaniment. In an explanatory setting, a model fitted to data reveals how much of the variance in an expressive parameter is explained by each of the basis functions (representing structural aspects of the musical [...]
As the name suggests, expressive parameters are modelled as depending linearly on score information.

Before we describe the notion of basis functions representing score features, and the LBM framework, we provide an illustration in Figure 1, to clarify the general idea. The figure shows a fragment of notated music with dynamics markings. The first four curves below the notated music represent each of the four markings as basis functions. The basis functions evaluate to zero where the curves are at their lowest, and to one where they are at their highest. Note that each of the basis functions is only non-zero over a limited range of time, namely the time where the corresponding dynamics marking takes effect in the music. The bottom-most curve is a weighted sum of the basis functions φ₁ to φ₄ (using unspecified weights w), that represents an expressive parameter, in this case the note dynamics.

2.1 Representation of score information as basis functions

In the past, the MIDI format has often been used as a representation scheme for musical scores. Being intended as a real-time communication protocol, however, this scheme is not very suitable for describing structural information beyond the pitches and onsets of notes. By now, the more descriptive MusicXML format (Good, 2001) is widely used for distributing music. The types of information we will refer to in this subsection are all contained in a typical MusicXML representation of a musical piece.

We define a musical score as a sequence of elements that hold information, and may also refer to other elements. For our purposes, it is relevant to distinguish between note elements and non-note elements. Note elements hold local information about individual notes, such as pitch, onset time, and duration information, but also any further annotations that describe the note, such as whether the note has a staccato sign, an accent, a fermata, and whether it is a grace note. Non-note elements represent score information that is not local to a specific note. Examples are expressive markings for dynamics (p, f, crescendo, etcetera), tempo (lento, ritardando, presto, etcetera), but possibly also time and key signatures, and slurs.

The purpose of basis functions for modelling expression is to capture some structural aspect of the score, and express the relation of each score note to that aspect, as a real number between 0 and 1. If we denote the set of all note elements by X, then a basis function has the form φ : X → [0, 1]. Although this suggests that the evaluation of a basis function on a note element only depends on that element, many interesting types of basis function take into account the context of the note. Therefore, it is convenient to think of a note element as holding a reference to its context in the piece it occurs in (for example, to determine whether it occurs inside the scope of a crescendo sign).

Note that defining basis functions as functions of score notes, rather than as functions of score time, increases the modelling power considerably. It allows for modelling several forms of musical expression related to simultaneity of musical events. Examples are: the micro-timing of note onsets in a chord (chord spread), an expressive device that has hardly been studied empirically; and the accentuation of the melody voice with respect to accompanying voices by playing it louder, and slightly earlier (melody lead) (Goebl, 2001).

In the following, we will propose several types of basis functions.
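As a concrete illustration of the mapping φ : X → [0, 1] and of context-dependent evaluation, consider the following sketch. The Note and Score classes and the crescendo-scope test are our own simplified stand-ins for illustration, not the authors' implementation:

```python
# A minimal note element that carries a back-reference to its score,
# so that basis functions can inspect the note's context.
class Note:
    def __init__(self, onset, pitch):
        self.onset, self.pitch = onset, pitch
        self.score = None  # set by Score when the note is registered

class Score:
    def __init__(self, notes, crescendo_spans=()):
        self.notes = list(notes)
        # Scopes of crescendo signs as (start, end) onset-time intervals.
        self.crescendo_spans = list(crescendo_spans)
        for note in self.notes:
            note.score = self

def phi_in_crescendo(note):
    # Context-dependent basis function: evaluates to 1.0 if the note
    # falls inside the scope of a crescendo sign, 0.0 otherwise.
    return 1.0 if any(start <= note.onset < end
                      for start, end in note.score.crescendo_spans) else 0.0

score = Score([Note(0.0, 60), Note(1.0, 64), Note(2.0, 67)],
              crescendo_spans=[(1.0, 3.0)])
values = [phi_in_crescendo(n) for n in score.notes]
```

Here the second and third notes lie inside the crescendo scope, so the basis function evaluates to [0.0, 1.0, 1.0] over the three notes.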
2.1.1 Indicator basis functions for note attributes

The simplest type of basis function is an indicator function that evaluates to one wherever a specific characteristic occurs, and to zero otherwise. For example, we can define a function φ_staccato for notes that have staccato annotations, and a function φ_grace for grace notes. Both types of functions are illustrated in Figure 2. By including φ_grace as a basis function for dynamics, it can account for any systematic deviations in dynamics of performed grace notes. Similarly, φ_staccato can reasonably be expected to account for part of the variance in note articulation.
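Such indicator functions can be sketched as follows (our own illustration; note attributes are represented as plain dictionaries rather than score elements):

```python
# Notes as dictionaries of attributes; an indicator basis function maps
# each note to 1.0 or 0.0 depending on a single attribute.
notes = [
    {"pitch": 60, "staccato": True,  "grace": False},
    {"pitch": 64, "staccato": False, "grace": True},
    {"pitch": 67, "staccato": False, "grace": False},
]

def phi_staccato(note):
    return 1.0 if note["staccato"] else 0.0

def phi_grace(note):
    return 1.0 if note["grace"] else 0.0

# Evaluating each indicator over all notes yields one basis vector.
staccato_basis = [phi_staccato(n) for n in notes]
grace_basis = [phi_grace(n) for n in notes]
```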
Fig. 2. A score fragment illustrating various kinds of basis functions (see text for explanation).
Fig. 3. Dependency of dynamics and pitch in Magaloff’s performances of Chopin’s piano works (left); and Batik’s performances of
Mozart’s piano sonatas (right). Although the displayed lines were fitted on complete data sets, only a subset of the data points are
plotted, for convenience.
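The three basis-function shapes listed in Table 1 below (step, impulse, and ramp followed by a step) can be sketched numerically as follows. This is our own illustration over a grid of score positions, not the authors' code:

```python
import numpy as np

t = np.arange(10, dtype=float)  # score positions, e.g. in beats

def step(t, start):
    # Constant markings (f, p, dolce, ...): 1 from the marking onward.
    return (t >= start).astype(float)

def impulse(t, at):
    # Impulsive markings (fz, fp): 1 only at the marked position.
    return (t == at).astype(float)

def ramp_step(t, start, end):
    # Gradual markings (crescendo, ...): a linear ramp over the extent
    # of the sign, after which the reached level is sustained (step).
    return np.clip((t - start) / (end - start), 0.0, 1.0)

forte = step(t, start=6.0)
sforzato = impulse(t, at=4.0)
crescendo = ramp_step(t, start=2.0, end=6.0)
```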
Table 1. Three categories of dynamics markings.

Category    Examples                            Basis function
Constant    f, ff, p, dolce, agitato            step
Impulsive   fz, fp                              impulse
Gradual     crescendo, diminuendo, perdendosi   ramp + step

[...] 'closure' occurring at each note.⁴ Closure can occur, for example, due to metrical position, completion of a rhythmic or motivic pattern, or resolution of dissonance into consonance. We use an automatic melody parser that detects metric and rhythmic causes of closure (Grachten, 2006), and represent the degree of closure at each note in a basis function.

⁴ Narmour's concept of closure is subtly different from the common notion of musical closure, in the sense that the latter refers to 'ending' whereas the former refers to the inhibition of the listener's expectation of how the melody will continue.

2.1.5 Global and local bases

An important decision in the design of basis functions is whether a basis function corresponds to a feature in general, or to a single instance of that feature. In the case of a grace note basis function, for example, the first approach results in a single function that evaluates to one for all grace notes. We call such basis functions global. In contrast, the second approach results in one basis function for every grace note, evaluating to one only on the corresponding grace note. We refer to these as local basis functions.

Whether one approach is to be preferred over the other may depend, among other things, on the type of feature, on the purpose of the model, and on the amount of data available for fitting the model. Nevertheless, the choice has some general implications. First of all, a local basis modelling approach will lead to more basis functions, and thus to models with more parameters. This provides more flexibility, and will result in better approximations of the expressive target to be modelled. But the larger number of parameters also makes models more prone to overfitting. Apart from that, using local basis functions in general leads to the situation that different pieces are represented by a different number of basis functions, depending on the (number of) annotations present in the score. This makes prediction slightly more complicated (we deal with this in Subsection 2.3.1). Nevertheless, it makes sense in some cases to choose local basis functions, for example when the interpretation of features is expected to include outliers, or to vary strongly from one instance to the other.

2.2 Model description

As mentioned in the introduction, each of the expressive parameters (y) is modelled separately. LBM is independent of the interpretation of y. For example, in Grachten and Widmer (2011) it is used to model dynamics, whereas in Krebs and Grachten (2012), y represents expressive tempo. In this subsection, we will use y without a specific interpretation, and we will refer to it as the target.

The central idea behind LBM is that it provides a way to determine the optimal influence of each of a set of basis functions, in the approximation of the target. The influence of a basis function is expressed by a weight w. This leads to the following formalisation: given a musical score, represented as a list of N notes x = (x₁, ..., x_N), and a set of K predefined basis functions φ = (φ₁, ..., φ_K), the sequence of N target values y is
modelled as a weighted sum of the basis functions plus noise ε:

    y = f(x, w) + ε = φ(x) w + ε,        (1)

where we use the notation φ(x) to denote the N × K matrix with elements φ_{i,k} = φ_k(x_i), and where w is a vector of K weights.

2.3 Learning basis function weights from data

Given performances in the form (x, y), we use the model in Equation 1 to estimate the weights w, which is a straightforward linear regression problem. Under the [...]

[...] local basis functions are 'instantiations' of a basis function 'class'. For example, a score may give rise to three bases of the class crescendo, one for each of three crescendo signs occurring in the score. In this way, we can associate each estimated weight ŵ_j with the type of its corresponding basis function c_j. This leads to a set of pairs (c_j, ŵ_j), to be used as training data for any regression algorithm, to predict a weight w for a given basis function type c. This approach allows for arbitrarily rich descriptions of basis functions: rather than characterizing a basis function just as a crescendo, it might for example be characterized as a crescendo following a piano, in a minor-key context.
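Estimating w in Equation 1, and predicting targets for a new score as in Equation 3 below, amount to ordinary least squares followed by a matrix-vector product. A sketch with synthetic data (our own illustration of the linear regression step, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 200, 4  # N notes, K basis functions

# Basis matrix phi(x): element [i, k] is phi_k(x_i), a value in [0, 1].
Phi = rng.random((N, K))
w_true = np.array([0.5, -1.0, 2.0, 0.25])
y = Phi @ w_true + 0.01 * rng.standard_normal(N)  # targets plus noise (Eq. 1)

# Least-squares estimate of the weight vector.
w_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)

# Prediction for a new, unseen score: y_hat = phi(x) w_hat (Eq. 3).
Phi_new = rng.random((5, K))
y_hat = Phi_new @ w_hat
```

With little noise and many more notes than basis functions, the estimated weights closely recover the generating weights.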
2.3.1 Prediction with local and global bases

In the case of global bases, once a weight vector ŵ_D has been learnt from a data set D, predictions for a new score x can be made easily, using Equation 1, leaving out the noise term ε. First, we construct the matrix of basis functions φ(x) from x, and subsequently we apply the dot product of the matrix with the learnt weights ŵ_D:

    ŷ_D = f(x, ŵ_D) = φ(x) ŵ_D.        (3)

In the case of local bases, weight estimation is realized for each piece in D individually, leading to a set of weight vectors (ŵ₁, ..., ŵ_L). As described in Subsection 2.1.5, [...]

3.1 Experiment 1: Representation and prediction of expressive dynamics

The objective of this experiment is to assess how accurately expressive dynamics can be represented and predicted using LBM. In particular, we are interested in which (combination) of the basis functions described above is most useful for representation and prediction.

3.1.1 Method

We use the following abbreviations to refer to the different kinds of features: DYN: dynamics annotations. These annotations are represented by one basis function
for each marking in Table 1, plus one basis function for accented notes; PIT: a third-order polynomial pitch model (three basis functions); GR: the grace note indicator basis; IR: two basis functions, one indicating the degree of closure, and another representing the squared distance from the nearest position where closure occurs. The latter feature forms arch-like parabolic structures reminiscent of Todd's (1992) model of dynamics.

We employ two different modelling approaches. In the first, all features are represented by global basis functions (see Subsection 2.1.5), and the weights for the bases are estimated all at once, by concatenating all performances in the training set, as described in Subsection 2.3. In the second scenario, we use global bases to represent the features PIT, GR, and IR, and local bases for DYN. In this case, we use the second weight estimation procedure described in Subsection 2.3, in which weights are learnt per piece/performance pair. For the prediction of expressive dynamics for unseen data, support vector regression (Schölkopf & Smola, 2002) was used to estimate weights, based on the estimated weights from the training data (cf. Subsection 2.3.1).

The total number of model parameters in the global scenario is thus 30 (DYN) + 3 (PIT) + 1 (GR) + 2 (IR) + 1 (constant basis) = 37, or less, depending on the subset of features that we choose. In the local scenario, the number of parameters can be either larger or smaller than in the global scenario, depending on the number of dynamics markings that appear in the piece. In the evaluation, we omit the feature combinations that consist of only GR and IR, since we expect their influence on dynamics to be marginal with respect to the features DYN and PIT.

[...] It is likely that Magaloff used manuscripts as scores, but we are uncertain as to the exact version. To obtain dynamics markings from the scores, we have used the Henle Urtext Edition wherever possible, which explicitly states its intention to stay faithful to Chopin's original manuscripts. The dynamics markings are obtained by optical music recognition from the scanned musical scores (Flossmann et al., 2010).

3.1.3 Goodness-of-fit of the dynamics representation

Table 2 shows a comparison of the observed expressive dynamics with the optimal fit of the model. The goodness-of-fit is expressed in two quantities: r is the Pearson product-moment correlation coefficient, denoting how strongly the observed dynamics and the dynamics values of the fitted model correlate. The quantity R² is the coefficient of determination, which expresses the proportion of variance accounted for by the model. Averages and standard deviations (in parentheses) of the r and R² values over the 154 musical pieces are listed in Table 2.

The results show that both the strongest correlation and the highest coefficient of determination are achieved when using local bases for dynamics markings, and including all features. This is unsurprising, since in the global setting a single weight vector is used to fit all pieces, whereas in the local setting each piece has its own weight vector. Furthermore, since adding features increases the number of parameters in the model, it will also increase the goodness-of-fit. We note, however, that the r and R² values are averaged over pieces, with a considerable standard deviation, and that differences between outcomes may not all be significant.
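The two goodness-of-fit quantities can be computed as follows (our own sketch, with illustrative values in place of measured dynamics):

```python
import numpy as np

y_obs = np.array([0.2, 0.5, 0.9, 0.4, 0.7])     # observed dynamics (illustrative)
y_fit = np.array([0.25, 0.45, 0.85, 0.5, 0.65])  # values of the fitted model

# Pearson product-moment correlation coefficient r.
r = np.corrcoef(y_obs, y_fit)[0, 1]

# Coefficient of determination R^2: proportion of variance accounted for.
ss_res = np.sum((y_obs - y_fit) ** 2)
ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
R2 = 1.0 - ss_res / ss_tot
```

Note that R² can be negative for a poor fit, whereas r is confined to [-1, 1]; for a least-squares fit with a constant basis, R² equals the squared correlation of observation and fit.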
[...] setting, increasing the complexity of the model does not affect its predictive accuracy, whereas in the local setting, maximal predictive accuracy is achieved for models of moderate complexity (including dynamics, pitch, and grace note information). The decrease of accuracy for more complex models is likely to be caused by overfitting.

Interestingly, the highest proportion of explained variance (R² = 0.19) is achieved by the predictions of the local model with all available features (DYN + PIT + IR + GR). Note, however, that the standard deviation of R² is large in most cases.

Table 3. Predictive accuracy of the model in a leave-one-out scenario; see Section 3.1 for abbreviations.

                        r                R²
                    avg.   (std.)    avg.   (std.)
Basis (global)
DYN                 0.192  (0.173)   0.020  (0.100)
PIT                 0.422  (0.129)   0.147  (0.111)
DYN+PIT             0.462  (0.125)   0.161  (0.156)
DYN+PIT+GR          0.462  (0.125)   0.161  (0.156)
DYN+PIT+IR          0.462  (0.124)   0.162  (0.155)
DYN+PIT+GR+IR       0.462  (0.124)   0.162  (0.154)
Basis (local)
DYN                 0.192  (0.179)   0.024  (0.109)
PIT                 0.415  (0.137)   0.149  (0.149)
DYN+PIT             0.459  (0.126)   0.151  (0.220)
DYN+PIT+GR          0.459  (0.123)   0.153  (0.195)
DYN+PIT+IR          0.455  (0.130)   0.141  (0.231)
DYN+PIT+IR+GR       0.457  (0.123)   0.188  (0.126)

3.2 Experiment 2: Analysis of expressive dynamics in commercial recordings

The above experiments are all done using the performances of a single performer. In the next experiment, we [...]

[...] We use a local basis fitting approach, in which each piece/performance pair is fitted individually.

3.2.2 Data

Data as precisely measured as that of the Magaloff corpus is not available for other performers – at least not for several performers playing the same pieces of music. We use loudness measurements from commercial CD recordings as an alternative. The data for this experiment was used earlier by Langner and Goebl (2003). Loudness from the PCM data was calculated using an implementation of Zwicker and Fastl's (2001) psycho-acoustic model, by Pampalk, Rauber, and Merkl (2002). To reduce the effect of different recording levels, the data was transformed to have zero mean and unit standard deviation per piece, as in Repp (1999). In addition to the loudness computation, beat-tracking was performed semi-automatically, using BeatRoot (Dixon, 2001).

The data set includes multiple performances of four Chopin piano pieces: a Ballade, a Prelude, and two Nocturnes. The performances for each piece are listed in Table 4.

3.2.3 Results

In columns 2 and 3 of Table 5 (under meas. vs. fit), R² and r measures are shown per piece, averaged over all performers. Analysis of variance shows that both R² and r differ significantly across pieces: F(3, 8) = 30.15, p < 0.001, and F(3, 8) = 26.51, p < 0.001, respectively.⁵ No effect of performer on goodness-of-fit measures was found.

⁵ To meet the assumptions of ANOVA, the data set was restricted to the pianists for which performances of all four pieces are available, namely Pollini, Rubinstein, and Ashkenazy.

Columns 4 (measurement) and 5 (residual) of Table 5 summarize the correlations between the loudness curves of performers. The r values in column 4 are computed on the measured loudness curves. Column 5 contains the r
values computed from the residual loudness curves, after the model fit has been subtracted.

Table 4. Performances of Op. 28 (17): Sokolov 1990, Arrau 1973, Harasiewicz 1963, Pogorelich 1989, Argerich 1975, Ashkenazy 1985, Rubinstein 1946, Pires 1992, Kissin 1999, Pollini 1975.

Table 5. Mean and standard deviation of R² and r per piece.

              meas. vs. fit                 measurement     residual
Piece         R²             r              r               r
Op. (No.)     mean (sd)      mean (sd)      mean (sd)       mean (sd)
15 (1)        0.90 (0.04)    0.95 (0.02)    0.88 (0.03)     0.50 (0.11)
27 (2)        0.76 (0.07)    0.87 (0.04)    0.80 (0.03)     0.56 (0.07)
28 (17)       0.66 (0.08)    0.81 (0.05)    0.76 (0.06)     0.61 (0.05)
52            0.86 (0.05)    0.93 (0.03)    0.82 (0.06)     0.48 (0.08)

The variance of coefficients across pieces appears to be too large to reveal any direct relationships between performers and coefficients, independent of the piece. Within pieces, however, significant effects of performer on coefficients are present for the coefficients of some loudness directives. For example, in Op. 52 there is an effect of performer on ff coefficients (F(7, 28) = 3.90, p < 0.005), and in Op. 28 (No. 17), an effect of performer on fz coefficients (F(9, 90) = 25.75, p < 0.0001).

4. Discussion

In this section we discuss the results of both experiments described in the previous section. We conclude the section with a brief discussion on how we see the LBM framework in relation to creative aspects of expressive music performance.

[...] effect on dynamics is not a matter of SPL compensation. Although the data set contains many performances, it is important to realize that the results are derived from performances of a single performer, performing the music of a single composer. The importance of pitch as a predictor for dynamics may be different for other performers, composers, and musical genres. Specifically, we hypothesize that the effect of pitch on dynamics is a consequence of melody lead. This phenomenon, which has been the subject of extensive study (see Repp, 1996; Goebl, 2001), consists of the consistent tendency of pianists to play melody notes both louder and slightly earlier than the accompaniment. This makes the melody more clearly recognizable to the listener, and may improve the sensation of a coherent musical structure. In many musical genres, the main melody of the music is expressed in the highest voice, which explains the relationship between pitch and dynamics.

This effect is clearly visible in Figure 4, which displays observed, fitted, and predicted dynamics for the final measures of Chopin's Prelude in B major (Opus 28, No. 11). In this plot, the velocity of simultaneous notes is plotted at different (adjacent) positions on the horizontal axis, for ease of interpretation. Melody notes are indicated with dotted vertical lines. It is easily verified by eye that the velocity of melody notes is substantially higher than the velocity of non-melody notes. This effect is very prominent in the predictions of the model as well.⁶

⁶ See http://www.cp.jku.at/research/TRP109-N23/BasisMixer/midis.html for sound examples.

Although observed and predicted dynamics are visibly correlated, Figure 4 shows that the variance of the prediction is substantially lower than that of the observation, meaning that expressive effects in the predicted performance are less pronounced. The lower variance is most likely caused by the fact that the model parameters have been optimized to performances of a wide range of different pieces, preventing the model from accurately capturing dynamics in individual performances.
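The per-piece standardization of Subsection 3.2.2 and the residual correlations of Table 5 (columns 4 and 5) can be sketched together as follows. This is our own illustration with synthetic loudness curves; the shared sine shape merely stands in for the model fit:

```python
import numpy as np

rng = np.random.default_rng(2)

# Two performers' loudness curves for one piece: a shared expressive
# shape plus individual noise, on different recording levels.
shape = np.sin(np.linspace(0.0, 3.0, 50))
perf_a = 60.0 + 5.0 * (shape + 0.15 * rng.standard_normal(50))
perf_b = 72.0 + 4.0 * (shape + 0.15 * rng.standard_normal(50))

def standardize(x):
    # Zero mean, unit standard deviation per piece (as in Repp, 1999).
    return (x - x.mean()) / x.std()

a, b = standardize(perf_a), standardize(perf_b)

# Correlation of the measured curves (Table 5, column 4) ...
r_measured = np.corrcoef(a, b)[0, 1]

def residual(x):
    # Subtract a least-squares fit of the shared shape (with intercept).
    X = np.column_stack([np.ones_like(shape), shape])
    coef, *_ = np.linalg.lstsq(X, x, rcond=None)
    return x - X @ coef

# ... and of the residual curves after subtracting the fit (column 5):
# what performers still share beyond what the model explains.
r_residual = np.corrcoef(residual(a), residual(b))[0, 1]
```

In this synthetic setting the measured curves correlate strongly because of the shared shape, while the residual correlation drops toward zero; in the paper's data, the remaining residual correlation indicates commonalities between performers that the model does not capture.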
Fig. 4. Observed, fitted, and predicted note-by-note dynamics of Chopin's Prelude in B major (Opus 28, No. 11), from measure 16 onwards; fitting and prediction were done using the global bases DYN+PIT+GR+IR (see Section 3.1). Vertical dotted lines indicate melody notes.
[...] constraints must be imposed to avoid the model being under-determined.

It should be stated clearly, however, that the LBM framework is intended as a tool for analysing artifacts, rather than the process that led to these artifacts. Analogously, the process of creating a new performance by using LBM should not be seen as modelling a cognitive process, let alone a creative process. Whether a performance created by LBM could be regarded as creative by a human listener is a philosophical question. We adhere to the view stated by Widmer, Flossmann, and Grachten (2009), that creativity is in the eye of the beholder. It is even conceivable that the creativity is not just a (subjective) characteristic of the performance, but also of the listener's interpretation, by which she construes a novel and unconventional performance as an enjoyable one.

5. Conclusions and future work

The work presented in this paper corroborates the growing insight in music performance research that, even if musical expression is a highly complex phenomenon, it is by no means fully unsystematic. We have described a linear basis modelling framework to account for expressive variations in music performance. Several types of basis functions were discussed. Using a relatively small set of basis functions, it is possible to account for over 45% of the variance in the dynamics of Magaloff's performances of Chopin piano works. Prediction of performances using a model trained on performance data unsurprisingly yields lower values, but substantial positive correlations are still observed.

As an analytical tool, we have used the framework to quantify performance differences between performers and pieces. Results indicate that the variance across pieces is too large to identify performer-specific expressive style, but within pieces, some performer-specific expressive effects were identified.

The LBM framework can be extended in two important ways. Firstly, we believe the framework is well suited to a probabilistic approach, in which prior information on the distribution of weights is combined with estimates obtained from performance data. Secondly, a strong limitation of the current model is that basis functions must be defined manually. Dictionary learning techniques developed in the field of sparse coding may be used to learn basis functions from performances.

Acknowledgements

This research is supported by the Austrian Research Fund (FWF, Z159 'Wittgenstein Award'). We are indebted to Mme. Irene Magaloff for her generous permission to use the data of her late husband's performances for our research. For this research, we have made extensive use of free software.

References

Bresin, R., & Friberg, A. (2000). Emotional coloring of computer-controlled music performances. Computer Music Journal, 24(4), 44–63.

Canazza, S., De Poli, G., Drioli, C., Rodà, A., & Vidolin, A. (2004). Modeling and control of expressiveness in music performance. Proceedings of the IEEE, 92(4), 686–701.

Canazza, S., De Poli, G., & Rodà, A. (2002). Analysis of expressive intentions in piano performance. Journal of ITC Sangeet Research Academy, 16, 23–62.

Clarke, E.F. (1988). Generative principles in music. In J. Sloboda (Ed.), Generative Processes in Music: The Psychology of Performance, Improvisation, and Composition. Oxford: Oxford University Press.

Dixon, S. (2001). Automatic extraction of tempo and beat from expressive performances. Journal of New Music Research, 30(1), 39–58.

Flossmann, S., Goebl, W., Grachten, M., Niedermayer, B., & Widmer, G. (2010). The Magaloff Project: An interim report. Journal of New Music Research, 39(4), 369–377.

Goebl, W. (2001). Melody lead in piano performance: expressive device or artifact? Journal of the Acoustical Society of America, 110(1), 563–572.

Goebl, W., & Bresin, R. (2003). Measurement and reproduction accuracy of computer controlled grand pianos. Journal of the Acoustical Society of America, 114(4), 2273–2283.

Good, M. (2001). MusicXML for notation and analysis. In W.B. Hewlett & E. Selfridge-Field (Eds.), The Virtual Score: Representation, Retrieval, Restoration (Vol. 12, pp. 113–124). Cambridge, MA: MIT Press.

Grachten, M. (2006). Expressivity-aware tempo transformations of music performances using case based reasoning (PhD thesis). Pompeu Fabra University, Barcelona, Spain.

Grachten, M., & Widmer, G. (2009). Who is who in the end? Recognizing pianists by their final ritardandi. In Proceedings of the Tenth International Society for Music Information Retrieval Conference, Kobe, Japan, pp. 51–56.

Grachten, M., & Widmer, G. (2011). Explaining expressive dynamics as a mixture of basis functions. In Proceedings of the Eighth Sound and Music Computing Conference (SMC), Padua, Italy.

Krebs, F., & Grachten, M. (2012). Combining score and filter based models to predict tempo fluctuations in expressive music performances. In Proceedings of the Ninth Sound and Music Computing Conference (SMC), Copenhagen, Denmark, pp. 358–363.

Langner, J., & Goebl, W. (2003). Visualizing expressive performance in tempo-loudness space. Computer Music Journal, 27(4), 69–83.

Moog, R.A., & Rhea, T.L. (1990). Evolution of the keyboard interface: The Bösendorfer 290 SE Recording Piano and the Moog Multiply-Touch-Sensitive keyboards. Computer Music Journal, 14(2), 52–60.

Narmour, E. (1990). The Analysis and Cognition of Basic Melodic Structures: The Implication-Realization Model. Chicago: University of Chicago Press.

Palmer, C. (1996). Anatomy of a performance: Sources of musical expression. Music Perception, 13(3), 433–453.

Palmer, C. (1997). Music performance. Annual Review of Psychology, 48, 115–138.

Pampalk, E., Rauber, A., & Merkl, D. (2002). Content-based organization and visualization of music archives. In Proceedings of the 10th ACM International Conference on Multimedia, Juan les Pins, France, pp. 570–579.

Parncutt, R. (2003). Perspektiven und Methoden einer Systemischen Musikwissenschaft (pp. 163–185). Germany: Peter Lang.

Repp, B.H. (1996). Patterns of note onset asynchronies in [...]

Rosenblum, S.P. (1988). Performance Practices in Classic Piano Music: Their Principles and Applications. Bloomington, IN: Indiana University Press.

Schölkopf, B., & Smola, A.J. (2002). Learning with Kernels. Cambridge, MA: MIT Press.

Timmers, R., Ashley, R., Desain, P., Honing, H., & Windsor, L. (2002). Timing of ornaments in the theme of Beethoven's Paisiello Variations: Empirical data and a model. Music Perception, 20(1), 3–33.

Tobudic, A., & Widmer, G. (2003). Playing Mozart phrase by phrase. In Proceedings of the Fifth International Conference on Case-Based Reasoning (ICCBR-03), pp. 552–566. Berlin: Springer-Verlag.

Todd, N. (1992). The dynamics of dynamics: A model of musical expression. Journal of the Acoustical Society of America.