MUSIC VIDEO AND THE SEMIOTICS OF POPULAR MUSIC
ALF BJÖRNBERG (1992)
Originally published in
Studi e testi dal Secondo Convegno Europeo di Analisi Musicale
(atti a cura di Rossana Dalmonte e Mario Baroni),
Università di Trento, 1992: 379-388

Introduction

During the last decade, music video has developed into one of the dominating
means for the dissemination of popular music, especially in the Anglo-American
pop and rock music market. In contrast with traditional film music, music video
is characterised by the fact that the visual dimension is governed by the music’s
syntax and verbal lyrics. The structure of visual narration in music video is thus
related to narrative processes in the lyrics, and to the analogue of narrative
processes in the music. The object of this paper is to give a theoretical outline
of the relationships between musical and visual structure specific to
contemporary music video, and to illustrate how these relationships work in
practice.

Music and meaning

In discussions of signification and meaning in music, the
non-referentiality of music is often stressed: musical structures, as distinguished
from verbal language, are void of denotations, and musical meaning arises from
the connotations or associations which the music evokes in the listener.

Middleton [1990, 220ff] criticises analyses which one-sidedly emphasise this
connotative aspect of musical meaning, and proposes, instead of «denotation»
and «connotation», the terms «primary signification» and «secondary
signification». One common characteristic of the models for primary signification
in music discussed by Middleton is their emphasis on the structure of music, as
opposed to its content, which is left undefined. Musical meaning originates from
{379-380} the relationships between elements ordered according to syntactic
rules; music «offers a means of thinking relationships» [ibid. 223] which is
related to more general cognitive structures, and also to physiological and
motorial processes. Music may thus be said to denote structural relationships
between objects, rather than the objects themselves.

Secondary signification in music may arise in a number of ways; Middleton
[1990, 232] quotes from Stefani an extensive list of various types of such
musical meaning, related to different structural levels ranging from single
elements in a particular piece of music to entire musical styles. As is the case
with intramusical structural principles, such connotation fields vary from culture
to culture [Tagg 1990]; furthermore, they vary from individual to individual within
a culture, due to, among other factors, the units of signification in music not
being well-defined. Although music thus does not possess referential precision
to the extent that verbal language does, there nevertheless exists a certain
degree of intersubjective consistency in these connotation fields; music may be
said to have a «conditional referentiality». Both these types of musical
signification are active in music video. At the level of visual content, various
kinds of musical secondary signification are visualised, often in combination
with the visualisation of the semantic content of song lyrics on a micro- and/or
(narrative) macro-level. One characteristic specific to music video, however, is
that visual processes are closely interrelated with musical structures.

This means that the visual dimension also refers to the music’s primary
signification; if music «offers a means of thinking relationships», then the
images of music video offer a visual representation of these relationships,
visual homologues to musical structures. How this process of representation is
shaped depends to a large extent on the syntactical characteristics of the music
typically visualised in music video, i.e. contemporary rock and pop music.

Popular music syntax and narrativity

Compared to other modes of communication, music is, in general, remarkably
repetitiously structured. The segmentation of music, which forms the basis both
of music analysis and of «structural listening» in general, is in fact based on
repeated structures involving one or several musical parameters. One essential
difference, however, between the segmentation undertaken by non-analysts in
everyday listening and the one used in theory-based analysis, is that the former
operates on larger units (motifs, phrases, symmetrically constructed periods)
{380-1} than the latter [Stefani 1987]. This is also reflected in the syntactic
principles typical of popular music: phrases of equal length are combined into
symmetrical periods, and on higher structural levels relatively fixed and distinct
principles of form dominate (e.g. the alternation of «verse» and «chorus»
sections).

In post-1945 popular music, however, a clearly discernible tendency can be
observed towards a less distinct segmentation on higher levels (section,
period), which means that segmentation on lower levels (phrase, motif, metric
unit) is increasingly emphasised. This tendency is connected with the general
stylistic development within popular music since the 1950s. Musical processes
may be divided into those based on «digital selection», i.e. choices among a
finite number of alternatives, and those based on «analogue selection», i.e.
continuously variable parameters [Middleton 1983]. As Middleton points out,
«extensionally» structured music, such as Western European art music
[Chester 1970], is characterised by digital selection on the micro-level and
analogue selection on the macro-level: small constant units (notes, motifs) are
combined into complex constructions which cannot be immediately reduced to
simple formulaic structures. In the «intensionally» structured music typical of
many folk and popular music traditions, the condition is rather the reverse: on
the macro-level, selection is digital (e.g. the alternation of a small number of
formal sections, such as verse and chorus), while analogue selection is applied
on the micro-level (inflection of small units concerning pitch, rhythm or
timbre).#1 Generally speaking, contemporary popular music, mainly under the
influences of Afro-American and Anglo-American musics, is increasingly
characterised by intensional modes of construction.

The extensional/intensional dichotomy is determinant for the degree of
«narrativity» that may be attributed to different types of musical structure.
Musical structures possessing a distinct narrative quality are primarily found in
extensionally constructed functional tonal music, the object-lesson being the
thoroughly structured «musical drama» of sonata-allegro form. Such narrative
musical processes, which primarily involve melody and harmony and the
{381-2} relationships between these, are characterised by a tonal logic based on an
irreversible directionality: melodic/harmonic processes of tension and release
produce an onward-directed «movement» in the music.#2 These processes are
based both on digital selection on higher levels (e.g. the alternation of major
and minor tonality or of different «tonal levels»), and on analogue selection in
the movement between these levels (i.e. a varying degree of melodic/harmonic
tension).

Furthermore, such processes also imply various punctuational functions, for
example by tonally «opening» or «closing» phrase construction. Contemporary
popular music, however, is to a large extent based on modal structural
principles, in which such tonal processes play a considerably less prominent
part, and which give the music a tonally more static quality [Björnberg 1989].
The more modally conceived the music is, the less it possesses such functional
tonal narrativity; instead, inflection concerning other parameters is foregrounded
in a micro-perspective, in addition to possible tonal/modal level shifts in a
macro-perspective.

Probably these are the characteristics of pop and rock music referred to by Frith
when writing, in a discussion of the aesthetics of music video, that «Rock music
[...] doesn’t seem to have the necessary density to take on interesting or
complex imagery» [Frith 1988, 219]. «Density» may here arguably be taken to
denote «information density» in the strict sense of information theory; however,
musical communication contains more than one type of information. Moles
[1968] divides the information content of music into «semantic» information
(roughly, the information contained within the notatable structures of music) and
«aesthetic» information (the information which is added in the music as actual
sounding performance). Because of the important part played by repetitivity in
musical syntax, all music possesses a higher degree of structural semantic
redundancy (i.e. a lower information density) than verbal language, but in return
it contains a larger amount of aesthetic information. Whether this aesthetic
information bias generally applies more to popular music than to art music may
be disputed; however, the semantic information density is unquestionably lower
in contemporary pop/rock music than in, for example, the Western European art
{382-3} music tradition and Hollywood film music based on this tradition. This is
related to the differences discussed above concerning «narrative» tonal
structures.#3

This semantic redundancy is essential for certain specifically musical functions
of the music, as may be explained from a physiological as well as a
psychological perspective [Booth 1981; Middleton 1983; Stefani 1987];
however, it forms an oppositional relationship with logical and directional
narrativity. This is an important explanation of the «post-modern» character of
music video often commented upon: visual processes homologous to the types
of musical syntax typical of contemporary popular music have a high degree of
structural narrative redundancy, while, on the other hand, visualisation of the
semantically more ambiguous dimension of aesthetic information allows for a
great freedom of choice as regards actual visual content [Straw 1988, 258].
Slightly overstating the matter, music in general, and pop/rock music in
particular, may simply be said to constitute a more «post-modern» mode of
communication than verbal language or classical dramaturgy.

The fact that in music video specifically popular musical forms are visualised
implies an important distinction compared to traditional Hollywood film music. In
film music, the demands made by visual narration mean that specifically
musical structural principles are undermined and have to be modified [Gorbman
1987, 13], while in music video the relationship is reversed, i.e. the determinant
role played by musical syntax renders coherent narration difficult. Parallels to
this situation exist in opera and musical film, where the introduction of
«autonomous» musical forms (aria, TPA-type popular song) interrupts the
narrative flow. The visualisation of typical «narrative» musical structures, on the
other hand, is more easily adaptable to a traditional film/film music relationship;
a typical example would be the visualisation of Beethoven’s Pastoral Symphony
in Disney’s Fantasia.

The aesthetic information dimension prominent in pop/rock music includes as
important experiential qualities those denoted, in everyday discussion of music,
by the terms beat and sound [Abrahamsen 1988; Fornäs 1980; Lilliestam 1984].
From a reception-psychological perspective, the accentuation of these qualities
implies an emphasis on musical «primary processes», related to the preverbal,
{383-4} «irrational» structural levels of music, at the expense of «secondary
processes» related to rationally logical surface structures such as melody and
harmony [Kohut 1957]. The capacity of music for activating, by means of its
beat and sound, such pre-verbal psychological levels is in a «normal» usage-
context largely dependent on acoustical qualities such as high volume and
specific frequency characteristics, which cannot be properly reproduced through
the medium of television. In return, however, the rapid motions and high cutting
density of music video may function as a «translation» of such acoustical
factors into visual expression, as will be illustrated below.#4

Music video as visualisation of musical structures

In the following, I intend to illustrate and discuss some aspects of the
relationship between music and visuals in music video. The discussion does
not, however, enter into detailed analysis of individual videos; my object here is
primarily to outline and exemplify a few general principles.

The interaction in music video between sound and image is particularly manifest
with regard to the dimension of time: the structuring of temporal flow effected by
the music determines the shaping of visual content, both on the macro- and the
micro-level. The most fundamental temporal determinant of music video is the
total duration of the song in question. The conventionalised restriction of most
pop/rock songs to a time span of some four minutes imposes obvious limits as
to what may be represented visually; extended dramatic/narrative processes
are excluded or have to be represented in a very concentrated and elliptical
way. Exceptions such as the video for Michael Jackson’s Thriller only serve to
underline this fact: in this case, the ambition to reproduce a more complex
narrative results in the duration of the video substantially exceeding that of the
song.

Also within this total duration visuals are structured by the music. A great
majority of contemporary pop/rock songs are based on one variant or other of
the verse-chorus form [Björnberg 1987, 55, 69f], which means that a song
(including introduction, possible solo sections and coda) normally consists of 8-9
fairly {384-5} distinctly delimited sections. In most music videos, this formal
organisation (i.e. the alternation of verse and chorus sections etc.) determines,
to a greater or lesser degree, the organisation of visual content. Musical form is
often visualised by means of general changes of scene, e.g. from depiction of
the artist or group (or, when applicable, the soloist) in chorus sections to a more
or less fragmentary narrative in verse sections (see, for example, the videos for
Kraftwerk’s The Model and Down Under by Men at Work). Musically as well as
lyrically, verse-chorus form may be characterised as a «multiple centripetal
process» (cf. footnote 1): musically by means of the cadential effect of the
chorus section, lyrically by means of the motion from
concretisation/problematisation in verse sections to generalisation/confirmation
in the chorus [ibid. 189]. The domination of this mode of formal organisation in
popular music since the mid-19th century implies that it may be regarded as a
deeply ingrained musical «archetype» to the contemporary Western listener.
However, because of the repeated return to a «position of rest» or «centre»
implied by the verse-chorus form, it forms an oppositional relationship with
linear narrativity.#5 The cutting-up of visual narrative into short sections (of the
order of 30 seconds) effected by adaptation to musical form also appears rather
arbitrary and irrelevant from a dramatic/narrative point of view.

In the relatively few cases where the disposition of the visuals is not adapted to
musical form, a narrative development without clearly marked segmentation
may extend over the entire video (examples of this may be found in many of ZZ
Top’s videos); another possibility is a total domination of diffusely structured
«dreamlike visuals» [Kinder 1984] void of narrative elements (one example
among many is the video for New Order’s Blue Monday). Segmentation is also,
naturally enough, less distinct in visualisations of songs belonging to musical
styles where the delimitation of formal sections is less clear, such as hip-hop or
house music; see, for example, the videos for Young MC’s I Come Off and
Marrs’ Pump Up The Volume.

The most obvious connection between visual organisation and the time
structure of the music, however, is situated at a temporal micro-level: in {385-6}
practically all music videos, the motions depicted, camera movements and
cutting are all synchronised with the basic beat and/or short rhythmic units
congruent with the music’s meter or basic rhythmic gestures. Frith’s thesis, that
«montage is the video-maker’s basic tool simply because it is the visual
equivalent of music built up out of studio sound layers» [Frith 1988, 219], seems
to be based on a technical analogy without obvious experiential correlates;
rather «montage» is one of several conceivable visual equivalents of music
characterised more by a strongly emphasised beat than by tonal processes of
tension and release. Thus, the close connection between music and visuals
which in film music contexts is somewhat contemptuously described as
«mickey-mousing» [Schmidt 1982, 48f] is essential, especially with regard to the
rhythmic dimension, to the functions of music video as a visualisation of
specifically musical experiential qualities. These rhythmic homologies constitute
one of the most significant characteristics of music video.

Many videos also provide instances of an incongruous or complementary
relationship between musical and visual rhythm, where visual homologues to
musical beat and meter are almost totally absent (see, for example, the videos
for Eurythmics’ There must be an angel playing with my heart and Rock The
Casbah by The Clash). From an experiential perspective, the relationship
between visual rhythm and musical beat in such cases constitutes a homology
to the contrast, generally typical of Afro-American influenced musical styles,
between the beat of the rhythm section and a rhythmically free and independent
soloist [Björnberg 1987, 99ff].

While the rhythmic character of the music thus plays a significant part in the
shaping of music video, manifest visualisation of tonal processes of tension and
release (affecting the melodic and harmonic parameters) is considerably less
frequent. The close connection between tonal and visual processes that Ruud
discerns in the video for Paul Simon’s Rene and Georgette Magritte With Their
Dog After The War [Ruud 1988, 56ff] is rare in music video contexts, a fact
which is related to the general tonal characteristics of contemporary pop/rock
music described above. The kind of network of differentiated tonal relations
playing a prominent part in Paul Simon’s song is exceptional in these genres;
more often, songs are based on static «modal fields» of a relatively constant
affective character [Björnberg 1989], which in music video are rather illustrated on
a general «mood» level comparable to genre-specific types of secondary
signification.

As a rule, greater emphasis is laid in music video on more directly sound-related
musical parameters, such as timbre and dynamics, by way of various
{386-7} types of synaesthetic visualisation of timbral qualities and dynamic
changes. Such illustrations often take the form of «mickey-mousing» effects of
short duration; a closely related type of visualisation is the use of musical
sounds (often percussion sounds, but also others) as «filmic» sound effects, i.e.
simulations of diegetic sounds implied by the visuals. Several examples of
these types of visualisation can be found in the videos for Aerosmith’s Janie’s
Got A Gun (the association of synthesiser crescendos with spotlights, guitar
solo with breaking glass, and snare drum beats with pistol shots) and Vienna by
Ultravox (the association of filtered synthesiser sounds with foggy nocturnal
scenery, snare drum beats with flashlights, and crash cymbal beats with pistol
shots). Like the rhythmic homologies discussed above, such effects contribute
to music video’s emulation of specifically musical experiential dimensions.

Besides musical structure, visualisation in music video is often also based on
the song’s verbal lyrics. This visualisation may either relate to phonetic and
paralinguistic characteristics of the lyrics as sung, in which case it approaches
the visualisation of corresponding musical parameters such as pitch, timbre and
dynamics, or to the semantic content of the lyrics on various structural levels.
Music videos presenting a coherent visualisation of the most manifest level of
verbal signification (the situations or the narrative related in the lyrics) are
relatively rare.#6 However, the technique of concretely illustrating individual
words or concepts in the lyrics is frequently used; the «cut up» montage
aesthetics typical of music video enables visual interjections on the micro-level
of the lyrics, without any necessary connection to the surrounding context. Such
«vocable visualisation» may serve to underline details of the lyrics, as, for
example, in the video for Midnight Oil’s Blue Sky Mine. At times it dominates the
visuals over longer stretches of time; a striking example is the video for Peter
Gabriel’s Sledgehammer, which to a large extent is based on concrete
illustration of the metaphors in the lyrics, to the effect of a «mickey-mousing»
relationship between lyrics and images. Another frequent lyric visualisation
practice is the display of written words from the lyrics (a graphic illustration of
this can be found in the video for Need You Tonight/Mediate by INXS).#7
Because of their fragmentary {387-8} character, these types of lyric visualisation
paradoxically often seem to direct attention to musical (i.e. phonetic and
paralinguistic) aspects of the lyrics, rather than to their semantic meaning and
narrative context.

Conclusion

In this paper I have argued that the visualisation of the structural meaning of
music, its primary signification, constitutes an important, perhaps primary,
function for music video. Visual processes are determined by musical structure
in an effort to complement specifically musical experiential qualities with visual
homologues, which partly precede and work independently of the referential
substance of the images. The dimension of visual «content» in music video may
correspondingly be regarded as a visualised representation of (part of) the
potential connotative meaning of the music, its secondary signification. This
dimension presents a «play» of signifiers, referring to film history, televisual
conventions, advertising, the visual arts, subcultural styles etc., the interpretation
of which has been given much attention in the literature on music video, mainly
from the perspective of film and television theory. The meanings of music video
are produced, however, in a continuous interplay of musical and visual
signification on a number of levels. The study of music video is therefore
musicologically important because of its potential contributions to the semiotics
of popular music; however, the musical-analytical perspective also constitutes
an important element in the multidisciplinary approach which a thorough
analysis of this multidimensional signifying practice requires, but which so far
has largely been missing from contemporary research.

Bibliography

Abrahamsen, P (1988) Sound. En diskussion af termen sounds relevans for
populærmusikanalyse set i et socialpsykologisk perspektiv. Ålborg: Department
of Music and Music Therapy, Aalborg University Centre.
Björnberg, A (1987) En liten sång som alla andra. Melodifestivalen 1959-1983.
Göteborg: Musikvetenskapliga institutionen vid Göteborgs universitet.
Björnberg, A (1989) On aeolian harmony in contemporary popular music. Göteborg:
IASPM - Nordic Branch Working Papers, no. DK 1.
Booth, M W (1981) The Experience of Songs. New Haven & London.
Chester, A (1970) ‘Second Thoughts on a Rock Aesthetic’. New Left Review,
62: 75-82.
Fornäs, J (1980) Socialisationsteori för musikvetare. Stencilled Papers from
Göteborgs Univ Musicology Dept, 8005.
Forsman, M (1986) ‘Det eviga nuet’. Filmhäftet, 54.
Frith, S (1988) Music for Pleasure. Essays in the sociology of pop. Cambridge
University Press.
Gorbman, C (1987) Unheard Melodies: Narrative Film Music. Bloomington &
London: Indiana University Press / BFI Publishing.
Kinder, M (1984) ‘Music video and the spectator. Television, ideology and
dream’. Film Quarterly, 38/1.
Kohut, H (1957) ‘Observations on the psychological functions of music’. Journal
of the American Psychoanalytical Association, 5.
Larsen, P (1987) ‘Bortom berättelsen’. Filmhäftet, 56-57.
Lilliestam, L (1984) ‘Syntarnas intåg eller från melodi och harmonik till klang
och rytm: 10 teser’. Tvärspel - 31 artiklar om musik. Festskrift till Jan Ling.
Göteborg: Skrifter från Musikvetenskapliga institutionen, 9: 352-370.
Middleton, R (1983) ‘"Play it again, Sam": on the productivity of repetition’.
Popular Music, 3: 235-271.
Middleton, R (1990) Studying Popular Music. Buckingham: Open University Press.
Ruud, E (1988) Musikk for Øyet. Om musikvideo. Oslo.
Schmidt, H-C (1982) Filmmusik für die Sekundär- und Studienstufe. Kassel:
Bärenreiter.
Stefani, G (1987) ‘Melody: a popular perspective’. Popular Music, 6/1: 21-36.
Stockfelt, O (1988) Musik som lyssnandets konst. En analys av W A Mozarts
symfoni N° 40, g moll K.550. Göteborg: Musikvetenskapliga institutionen vid
Göteborgs universitet.
Straw, W (1988) ‘Music video in its contexts: popular music and post-modernism
in the 1980s’. Popular Music, 7/3: 247-266.
Ström, G (1989) Musikvideo. Oslo.
Tagg, P (1979) Kojak: 50 Seconds of Television Music. Göteborg:
Musikvetenskapliga institutionen vid Göteborgs universitet.
Tagg, P (1990) ‘"Universal" Music and the Case of Death’. La musica come
linguaggio universale, ed. R Pozzi. Firenze: Leo S. Olschki: 227-266.

Endnotes

1. Tagg [1979, 217ff] proposes a general division of musical processes into
«centripetal» (returning to the point of departure) and «centrifugal» processes
(ending «away from home»); however, this classification seems most applicable
in cases of binary digital selection concerning general parameters such as
tonality, periodicity, pitch range etc. In cases with more than two alternatives or
analogue selection, hierarchic ordering in terms of musical «distance» may be
problematic, as is also indicated by Tagg’s application of this model of analysis.
2. Middleton [1983, 238] relates the distinction between extensional and
intensional construction to a distinction between «discursive» and «musematic»
repetition, where the former is characterised by repetition in one musical
parameter being combined with changes in other parameters (a typical example
being melodic/harmonic sequence), while the latter is characterised by the
repetition of smaller, more constant units.
3. In addition, the attitudes and «modes of listening» [Stockfelt 1988] typical of
the uses of contemporary popular music probably assign a greater importance
to the aesthetic information dimension in music than the «structural listening»
constituting the ideal of the conservatory tradition does.
4. Forsman [1986] presents an interesting discussion of these functions of
music video, viewed in the light of the German socialisation scholars’ theories of
a «new socialisation type», characterised by a weak ego and narcissistic need
structures.
5. An example of the segmentation of a visual process being determined by a
musical verse-chorus form is presented in Larsen’s [1987] analysis of the video
for Phil Collins’ Against All Odds. Larsen also demonstrates a homological
relationship between the narrative structure represented (fragmentarily) in the
visuals and an arch-shaped intensity process in the music (primarily affecting
the parameters of instrumentation and dynamics); he does not, however,
discuss to what extent the formal construction of the music reinforces or
counteracts this relationship.
6. This fact may be explained both by video makers’ ambition to avoid fixed and
unambiguous meanings in order to enhance the spectator’s opportunities for
individual interpretation [Ström 1989, 88], and by the ambition to underline the
artist’s distance towards, and control over, the protagonist of the lyrics [Frith
1988, 217].
7. This practice is carried to its ultimate consequence in the video for Prince’s
Sign O’ The Times, which in its entirety is based on the written lyrics of the
song.
