
2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)

An investigation on the automatic generation of music and its application into video games
Germán Ruiz Marcos
School of Computing and Communications
The Open University
Milton Keynes, UK
german.ruiz-marcos@open.ac.uk

Abstract—This paper presents a description of the author’s PhD research plan and its progress to date. By way of introduction, some gaps and challenges are pointed out concerning algorithmic composition and its literature. Motivated by these, a set of research questions is given, which explore the possibility of generating music matching tension and its applications in video games. To give a brief overview of the background, the most relevant models of tension in music are introduced, as well as the most recent pieces of related work. The research approach is then presented as a summary of the scope of the problem, according to the gaps motivating the project and the challenges that come from the related work, and the appropriate methodology to explore the research questions. A brief review of the work to date is included, emphasising the design of an automatic music generator and the empirical study carried out to test its capabilities. To conclude, the steps presented in the methodology are transformed into a future plan and some research contributions.

Index Terms—algorithmic composition, tension, video games, tonal music

I. INTRODUCTION

Algorithmic composition refers to the process of combining musical elements into a whole composition according to a sequence of rules [1]. During the past decades, algorithmic composition has been widely studied within the field of artificial intelligence [2]–[4]. However, the music generated by most of these studies lacks specific intention and so can sound meaningless [2]. To address this issue, Papadopoulos and Wiggins [2] proposed that, in future, models should evaluate and refer to a specific feature, such as musical tension¹ or expectancy. This would allow human composition to be simulated more closely.

A. Motivations

The generation of music in terms of tension can be used in many applications. For instance, in video games music plays an essential role, both as a medium of affective immersion and as support for the interaction between the player and the narrative of the game. Therefore, procedural music, “composition that evolves in real time according to a specific set of rules or control logics” [6], which is in constant demand in markets such as those of films or video games, would benefit from a tension-based approach.

The process of automatically generating music has also been aided by improvements in memory consumption. It is attractive as a source of real-time, original, endless material [7], either as a final product or to provide inspiration [8], and to reduce effort and costs [9].

B. Research questions

Motivated by this, my research explores the following research question: how can music that matches the flow of tension within narratives be generated automatically, and how can it be applied in video games?

This question can be split into four sub-questions, which determine my research methodology. These questions are:

a) How to model (tonal) tension in order to implement an automatic algorithmic composition system?
b) How to generate harmonic, melodic and rhythmic content matching (tonal) tension?
c) Which features influence the generation, evolution and perception of tension within video games and their narratives?
d) How to map the musical model of tension to that generated by the game?

II. BACKGROUND AND RELATED WORK

In the literature, musical tension has been studied as a matter of expectation within hierarchies, as is the case of tonality, but also as depending on the psychological impact of sensory features, such as loudness, rhythmic patterns or dissonances, among others [10]–[12]. These latter studies investigate tension by focusing on a single feature each, which precludes the generalization of a theory of tonal tension. On the other hand, hierarchical models attempt to describe the general role of tension in a given context, which fits best with the purpose of this investigation.

A. Musical context

Tonality is one of the main idioms in the Western music tradition². It can be defined as a hierarchical system that organizes musical events according to their level of importance with regard to a reference, known as the centre of the tonality. Those relationships are based on the nature of sound itself,

¹ Musical tension is commonly used for evoking emotions in the literature concerning affective computing [5].
² The term “tonal” is used in this paper as being in the context of tonality.

978-1-7281-3891-6/19/$31.00 ©2019 Crown


but also on the implicit knowledge of tonality, as the result of long-term exposure to this context during most of the history of the Western music tradition.

Notes grouped together and played simultaneously are known as chords. Within tonality, chords have a specific role or function given by the individual importance of the notes shaping each chord. These ratings of chords can be considered implicit in listeners [13]. As a result, a note or a chord generates an expectation in the listener of what will be played next, based on its level of importance within tonality [14]. Thus, the expectation of a given note or chord depends on its tonal function, but also on the distance to the previous note or chord: the more important the note or chord is within a tonality and the closer it is to its preceding note or chord, the more it is expected to be played, and vice versa.

In this sense, musical tension can be described as a matter of expectation, following a hierarchical organization [11]. If a note or chord is played but was not expected to follow its previous note or chord, the change from one to the other will be perceived as tense. If, on the contrary, the output note or chord was expected, the transition will probably be felt as non-tense.

B. Modelling tonal tension

There have been different approaches to modelling tension between chords in the literature. However, some of them have not been empirically tested [15], do not give a complete perspective of the problem [16], or fail to consider features such as melodic contour and rhythm [12]. In 2007, Lerdahl and Krumhansl [17] presented a quantitative theory to model tonal tension³, based on the structural analysis of music and the role of notes and chords within tonality. This theory is supported by empirical evidence both statistically [17] and at the neuroscientific level [20].

The relationship between melody and tension is studied in the literature as a matter of melodic expectancy. Margulis [14] succeeded in providing quantitative predictions of melodic expectancy. Given a note in a melodic line, Margulis’ model calculates the expectancy values of the note to follow based on the role of the chord of the original note [20], the distance of the original note to its preceding note, and the continuation or reversal of the direction of the preceding melody.

Concerning the rhythmic perspective of tension, some considerations had already been taken into account in Lerdahl and Krumhansl’s model [17] (as it is based on the GTTM). Others refer to the musicological behaviour of chords within tonality, as going from a more stable chord to a less stable one is perceived as tense, and to the speed of change of the notes between chords: the faster the change, the tenser it is perceived [21].

³ Based on the well-known Generative Theory of Tonal Music [18] (GTTM) and the Tonal Pitch Space [19] (TPS).

C. Tension and game scoring

Dynamic music is music that is able to react and adapt to gameplay [22]. It can support narratives by pointing out and describing emotions, characterizing settings and physical activities, communicating values, directing attention and focusing on detail or, on the contrary, masking [23].

There have been some contributions to this matter in recent years. AudioInSpace [24] presented a creative fusion of generative audio and narratives, implemented through multiple Artificial Neural Networks. The Sonancia [25] project carried out sonification of game levels according to probabilistic transitions, which were weighted according to pre-annotated levels of suspense. ARTMG [26] developed a music generator capable of composing music in real time in the style of a training piece, using Hierarchical and Hidden Markov models. In FATE [27], adaptive music is generated in order to set interactive experiences, according to probabilistic transitions based on context-sensitive grammars. Metacompose [28] proved capable of generating affective music in real time by using genetic algorithms. Finally, Escape point [29] presented a template-based chord generator that supports a dynamically changing emotional narrative.

III. RESEARCH APPROACH

A. Scope of the problem

According to the literature, the lack of intention of music generators within algorithmic composition stands out as an essential gap to be filled. This gap, which motivates my research, sets the starting point: finding a suitable definition of tension in the context of music applicable to video games. To approach this task, an interpretation and transformation of the theoretical descriptions of tension in music and games is needed, to support an implementation of a generative strategy.

New challenges also arise from the limitations of the models introduced as part of the related work. Since game narratives are non-linear, game music has to adapt dynamically in real time. However, the adaptation to gameplay has to be enjoyable, avoiding boredom. Likewise, most of the music generators developed to date generate game music as variations over pre-composed material, which reduces the chances of exploring new and more creative ideas, as opposed to generating entire compositions from new material.

B. Methodology

Taking into account the scope of the problem, the steps to be followed to answer the research questions could be summarised as follows:

a) How to model (tonal) tension in order to implement an automatic algorithmic composition system?
   • to carry out a critical review of the literature concerning theories of tension in music.
   • to combine the most suitable theories of tension in music into a single model, keeping in mind the empirical essence of the desired output.
b) How to generate harmonic, melodic and rhythmic content matching (tonal) tension?
   • to transform the theoretical model from analytical into generative.
   • to define and set the value of the parameters of the generative theory based on theoretical aspects.
   • to test those decisions taking into account the relationship between the generative model and what listeners actually perceive.
   • to hypothesize about ideas of improvement on the generative model based on empirical data.
c) Which features influence the generation, evolution and perception of tension within video games and their narratives?
   • to explore the role of the different features that relate to tension within game narratives.
   • to design a theoretical framework concerning the implication of these features in the generation of tension through game narratives.
   • to test the theoretical framework, qualitatively, with experts in the field of Game Design and Game Music.
d) How to map the musical model of tension to that generated by the game?
   • to propose a mapping system based on the musical and gaming tension frameworks.
   • to implement the final model into a video game.
   • to test the influence of the generated game music in terms of immersion.
   • to test the correlation between the tension extracted from the game narratives, the tension theorized by the music generator, and the tension perceived by the players.
   • to come up with relevant conclusions and prospects for future work.

IV. WORK TO DATE

A. Design of a music generator

I have designed and implemented a music generative system whose output matches tonal tension. It is divided into three layers, each corresponding to one of three basic elements of music: harmony (sequence of chords), rhythm (arrangement of durations of chords and notes) and melody (sequence of notes).

Based on Lerdahl and Krumhansl’s [17] quantitative model of tonal tension, it is possible, given a chord, to decide which chords could be played next to match a specific level of tension. To deal with all the possible transitions between chords, the harmonic generator is implemented as a Markov chain. A Markov chain is a stochastic model in which each possible next event is given a probability of following the sequence that depends only on the previous event. Those transition probabilities are set according to the tension values. That is to say, in a high-tension scenario, the tenser the transition, the higher its probability of occurring; and, in a low-tension scenario, the tenser the transition, the lower the probability. Thus, following the transition probabilities, a chord sequence is generated matching a given level of tension.

The durations of those chords also affect musical tension. According to Swain’s theory of rhythmic tension [21], transitions from longer to shorter durations are perceived as an increase in tension, and vice versa. Likewise, the harmonic and rhythmic influence on tension must be congruent. Accordingly, chord durations are again implemented as a probabilistic model, taking into account the level of tension of the transition between chords: if the transition goes to a more important chord in the tonality, tension decreases, so a longer chord is more probable; if the transition goes to a less important chord, tension increases, so a shorter chord is more probable. When the level of tension remains the same, repeating the previous duration is more probable.

A melody can be played on top of the generated harmonic sequence, again matching tonal tension. Margulis’ model of melodic expectancy [14] was used to determine transition probabilities between notes in order to stochastically generate a melody.

B. Empirical study

I carried out an experiment to test the power and capability of the described system. Ten musicians and ten non-musicians took part in the study, which was done face-to-face and lasted between thirty and fifty minutes per participant. An audiovisual interface (see Fig. 1) was specifically designed for the experiment, in which three areas were shown. Each area was associated with a different tension level: minimum, medium or maximum. Participants were asked to listen to the music generated in each area and select, from the three possible areas, which sequence they perceived as least tense (minimum) and as most tense (maximum). Each participant listened to and labelled five chord sequences, five melodies, and five melodic and harmonic sequences, all of them automatically generated by the described model.

Fig. 1. Interface used in the empirical study.

The system’s and participants’ tension labels were compared for each tension level (minimum, medium, maximum), in each experimental phase (harmonic, melodic, accompanied melodies). The agreement between a participant’s and the system’s labelling was defined as equal to 1 if the system’s tension label was the same as the participant’s, and 0 otherwise⁴. A summary of the average degree of agreement is shown in Tables I and II.

⁴ Cohen’s kappa coefficient was not used because its overall random agreement, p_e, would be affected by the conditional probabilities.
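As a rough illustration of Section IV, the sketch below shows one way a tension-weighted Markov chain for chords and the binary agreement score could be written. It is not the system evaluated in the study. The chord set, the tension values in `TENSION`, the linear weighting in `transition_probs` and all function names are assumptions made for the example; in particular, the tension values are placeholders rather than outputs of Lerdahl and Krumhansl's model [17].

```python
import random

# Hypothetical tension values for moving from one chord (outer key) to the
# next (inner key). Placeholders only, NOT outputs of the model in [17].
TENSION = {
    "I":    {"I": 0.0, "IV": 5.0, "V": 5.0, "viio": 8.0},
    "IV":   {"I": 5.0, "IV": 0.0, "V": 6.0, "viio": 9.0},
    "V":    {"I": 5.0, "IV": 6.0, "V": 0.0, "viio": 7.0},
    "viio": {"I": 8.0, "IV": 9.0, "V": 7.0, "viio": 0.0},
}


def transition_probs(current, scenario):
    """Map the tension of each candidate transition to a probability.

    scenario "high": the tenser the transition, the more probable it is.
    scenario "low":  the tenser the transition, the less probable it is.
    The linear weighting (+1 keeps every weight positive) is an assumption.
    """
    row = TENSION[current]
    top = max(row.values())
    if scenario == "high":
        weights = {chord: t + 1.0 for chord, t in row.items()}
    elif scenario == "low":
        weights = {chord: top - t + 1.0 for chord, t in row.items()}
    else:
        raise ValueError("scenario must be 'high' or 'low'")
    total = sum(weights.values())
    return {chord: w / total for chord, w in weights.items()}


def generate_chords(start, length, scenario, rng):
    """Sample a chord sequence; each step depends only on the previous chord."""
    sequence = [start]
    while len(sequence) < length:
        probs = transition_probs(sequence[-1], scenario)
        chords = list(probs)
        weights = [probs[c] for c in chords]
        sequence.append(rng.choices(chords, weights=weights)[0])
    return sequence


def agreement(system_labels, participant_labels):
    """Average of per-item scores: 1 if the two labels match, 0 otherwise."""
    if len(system_labels) != len(participant_labels):
        raise ValueError("label lists must have the same length")
    if not system_labels:
        raise ValueError("label lists must not be empty")
    matches = [int(s == p) for s, p in zip(system_labels, participant_labels)]
    return sum(matches) / len(matches)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    print(generate_chords("I", 8, "high", rng))
    print(generate_chords("I", 8, "low", rng))
    print(agreement(["min", "med", "max"], ["min", "max", "max"]))
```

The duration and melody layers could follow the same pattern, with a separate probability table conditioned on whether tension rises, falls or stays the same; they are left out here to keep the sketch short.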
TABLE I
AVERAGE DEGREE OF AGREEMENT, %, BETWEEN THE PARTICIPANTS’ AND THE SYSTEM’S TENSION LABELLINGS IN THE CASE OF MUSICIANS.

            minimum   medium   maximum   AVERAGE
harmonic      84%      70%       92%       82%
melodic       42%      40%       92%       58%
acc. mel.     68%      64%       94%       75%
AVERAGE       65%      68%       93%

TABLE II
AVERAGE DEGREE OF AGREEMENT, %, BETWEEN THE PARTICIPANTS’ AND THE SYSTEM’S TENSION LABELLINGS IN THE CASE OF NON-MUSICIANS.

            minimum   medium   maximum   AVERAGE
harmonic      52%      40%       54%       49%
melodic       46%      50%       76%       57%
acc. mel.     60%      58%       78%       65%
AVERAGE       53%      49%       69%

C. Discussion

To test the randomness of the data, the probabilities of matching the system’s tension evaluations by chance were calculated using contingency tables. The degrees of agreement proved to be above the by-chance probability threshold except in one case. The only exception is the non-musicians’ harmonic phase, at medium tension. The probability of the data being random in this case is 44%, whereas the empirical agreement is 40%.

The average agreement in the case of accompanied melodies is around 75% from musicians’ responses and 65% from non-musicians’, which falls within the range of state-of-the-art results in similar generative tasks in the field of Music Computing.

In the case of musicians, the average agreement for the harmonic phase is 82%. However, it falls to 58% in the melodic phase. This might happen because of the possible harmonies inferred by the participants when listening to the melodies. This suggests there might be two different listening approaches, with non-musicians being less focused on the flow of the harmonic content. For instance, non-musicians’ melodic degree of agreement is actually greater than their harmonic degree of agreement, unlike in the case of musicians. Likewise, the agreement between the model and participants is higher in the case of musicians. This raises the question of how different the perception of tension is between musicians and non-musicians. It would be of interest in future projects to explore different types of musical tension, such as those described by Margulis [14].

The phases with the lowest agreement are the melodic minimum and medium. All the generated melodic transitions were annotated and analysed. The maximum case differed from the rest by more than half of the transitions, but the minimum and medium generated melodies used nearly the same melodic transitions. To improve the melodic generation in the future, looking into harmonic and melodic attraction [17] and the differences between these and expectancy seems to be a good starting point.

A harmonic rhythm transition analysis was also carried out, similar to the one concerning melodic transitions, but no significant correlations were found among the disagreements between participants and the system.

Most of the harmonic mislabellings could be explained as a matter of separation from the current tonality, which suggests the need to implement a tonality-confirmation strategy.

D. Future work

Based on the steps shown in the methodology and the findings of the empirical study, the next step in my PhD plan concerns the improvement of the music generator. An exploration of the defined parameters would be carried out, as well as a refinement of the stochastic algorithms involved in the selection of transitions. The improved system would be tested again against listeners’ perceptions. With the final application of the generator within video games in mind, a new experimental methodology should be used, such as asking participants to annotate the tension they perceive while listening to music [17], [30], or measuring participants’ physiological response to music [31].

Once the final music generator is completed, a deeper literature review would be carried out concerning Game Music, focusing on the role of tension within game narratives. From this literature review a theoretical framework would be designed, exploring the relationship between game features and tension in narratives. The framework would be qualitatively tested based on the contributions of experts in the field of game design and game scoring.

The final step of this research project would be the implementation of both tension frameworks, that is, the music generator and the narratives analyser, into a video game. A final experiment would be carried out to test the mapping strategy, but also to test the influence of the generated game music in terms of immersion. A summary of the future plan is shown in Table III.

TABLE III
FUTURE PLAN

Timeline columns: mid 2019 | late 2019 | early 2020 | mid 2020 | late 2020
Tasks:
Improvements on the generative system
Melodic expectancy VS melodic attraction
Game narrative and tension review
Game features extraction
Design game tension theoretical framework
Interview study with game experts
Design game-music tension mapping system
Implement mapping strategy into video game
Final experiment
Write up thesis

V. CONTRIBUTIONS

The research contributions to date include a critical review of the literature concerning models of tonal tension, and the music generative model. The data gathered in the empirical study also constitute useful findings, by suggesting strategies for future improvement.

According to the future plan, some expected contributions of the PhD work would include the theoretical framework of tension in games, the mapping strategy between the musical and the game tension approaches, and a video game where procedural music adapts in real time to dynamically changing narratives.
REFERENCES

[1] D. Cope, “Position Paper for the Second Panel on Algorithmic Music”, in Proceedings of the International Computer Music Conference, pp. 23–28, 1993.
[2] G. Papadopoulos and G. Wiggins, “AI methods for algorithmic composition: A survey, a critical view and future prospects”, in AISB Symposium on Musical Creativity, Vol. 124, pp. 110–117, April 1999.
[3] G. Nierhaus, “Algorithmic composition: paradigms of automated music generation”, Springer Science & Business Media, 2009.
[4] J. D. Fernández and F. Vico, “AI methods in algorithmic composition: A comprehensive survey”, in Journal of Artificial Intelligence Research, 48, pp. 513–582, 2013.
[5] D. Herremans, C. H. Chuan, and E. Chew, “A functional taxonomy of music generation systems”, ACM Computing Surveys (CSUR), 50(5), 69, 2017.
[6] K. Collins, “An introduction to procedural music in video games”, in Contemporary Music Review, 28(1), pp. 5–15, 2009.
[7] J. Togelius, G. N. Yannakakis, K. O. Stanley, and C. Browne, “Search-based procedural content generation: A taxonomy and survey”, in IEEE Transactions on Computational Intelligence and AI in Games, 3(3), pp. 172–186, 2011.
[8] A. M. Smith and M. Mateas, “Variations forever: Flexibly generating rulesets from a sculptable design space of mini-games”, in Proceedings of the 2010 IEEE Conference on Computational Intelligence and Games, pp. 273–280, August 2010.
[9] C. Remo, “MIGS: Far Cry 2’s Guay on the importance of procedural content”, Gamasutra, November 2008.
[10] E. Bigand, R. Parncutt, and F. Lerdahl, “Perception of musical tension in short chord sequences: The influence of harmonic function, sensory dissonance, horizontal motion, and musical training”, in Perception & Psychophysics, 58(1), pp. 125–141, 1996.
[11] R. Y. Granot and Z. Eitan, “Musical tension and the interaction of dynamic auditory parameters”, in Music Perception: An Interdisciplinary Journal, 28(3), pp. 219–246, 2011.
[12] D. Herremans and E. Chew, “Tension ribbons: Quantifying and visualising tonal tension”, 2016.
[13] F. Lerdahl, “Tonal pitch space”, in Music Perception: An Interdisciplinary Journal, 5(3), pp. 315–349, 1988.
[14] E. H. Margulis, “A model of melodic expectation”, in Music Perception: An Interdisciplinary Journal, 22(4), pp. 663–714, 2005.
[15] H. Schenker, “Free Composition” (1935), trans. and ed. E. Oster, New York, 1979.
[16] R. N. Shepard, “Geometrical approximations to the structure of musical pitch”, in Psychological Review, American Psychological Association, 89(4), 1982.
[17] F. Lerdahl and C. L. Krumhansl, “Modeling tonal tension”, in Music Perception: An Interdisciplinary Journal, University of California Press Journals, 24(4), pp. 329–366, 2007.
[18] F. Lerdahl and R. S. Jackendoff, “A generative theory of tonal music”, MIT Press, 1985.
[19] F. Lerdahl, “Tonal pitch space”, Oxford University Press, 2004.
[20] S. Koelsch, M. Rohrmeier, R. Torrecuso, and S. Jentschke, “Processing of hierarchical syntactic structure in music”, in Proceedings of the National Academy of Sciences, 110(38), pp. 15443–15448, 2013.
[21] J. P. Swain, “Dimensions of Harmonic Rhythm”, in Music Theory Spectrum, University of California Press, 20(1), pp. 48–71, 1998.
[22] K. Collins, “An introduction to the participatory and non-linear aspects of video games audio”, in Essays on sound and vision, pp. 263–298, 2007.
[23] J. Wingstedt, “Narrative functions of film music in a relational perspective”, in 26th International Society for Music Education World Conference, 2004.
[24] A. K. Hoover, W. Cachia, A. Liapis, and G. N. Yannakakis, “Audioinspace: Exploring the creative fusion of generative audio, visuals and gameplay”, in International Conference on Evolutionary and Biologically Inspired Music and Art, pp. 101–112, April 2015.
[25] P. Lopes, A. Liapis, and G. N. Yannakakis, “Sonancia: Sonification of procedurally generated game levels”, in Proceedings of the 1st computational creativity and games workshop, 2015.
[26] S. Engels, T. Tong, and F. Chan, “Automatic real-time music generation for games”, in Eleventh Artificial Intelligence and Interactive Digital Entertainment Conference, September 2015.
[27] C. Aspromallis and N. E. Gold, “Form-Aware, Real-Time Adaptive Music Generation for Interactive Experiences”, in Sound and Music Computing, 2016.
[28] M. Scirea, J. Togelius, P. Eklund, and S. Risi, “Metacompose: A compositional evolutionary music composer”, in International Conference on Computational Intelligence in Music, Sound, Art and Design, pp. 202–217, March 2016.
[29] A. Prechtl, “Adaptive music generation for computer games”, 2016.
[30] M. M. Farbood, “A parametric, temporal model of musical tension”, in Music Perception: An Interdisciplinary Journal, University of California Press Journals, 29(4), pp. 387–428, 2012.
[31] C. L. Krumhansl, “An exploratory study of musical emotions and psychophysiology”, in Canadian Journal of Experimental Psychology/Revue canadienne de psychologie expérimentale, 51(4), 336, 1997.
