Technology and Discourse Intonation

DOROTHY M. CHUN

Introduction
One of the first applied linguists to use the term "discourse intonation" was David Brazil,
who proposed a theory of how intonation contributes to the communicative value of speech
in British English, based on the work of Halliday (1970). In the introduction to his theory,
Brazil stated that he was unashamedly concerned with function and described intonation
as a set of speaker-options formulated without explicit reference to grammar (Brazil, 1975,
pp. 1–2). What this means is that when a speaker chooses a particular intonation
pattern, the choice is not based on common intonational patterns for statements,
questions, and other grammatical structures, but rather depends on the surrounding
discourse in the conversation, in particular on what came before (in terms of common
knowledge, topic, assumptions) and on expectations and intentions for the future. Although
most work to date involves English, this appears to be true for other languages as well;
that is, there are no set intonational patterns for statements or questions, for example,
because different intonational patterns are used in different contexts and within a specific
discourse (see Molholt & Hwu, 2008, for examples in Mandarin Chinese).
Brazil believed that there is a small, finite number of functionally contrastive pitch
configurations in English and that each of these configurations has its own meaning.
"Meaning" in this context does not refer to attitudinal notions such as "expectant" or
"surprised," or to grammatically derived concepts such as "interrogative" or "declarative."
Rather, what matters is the speaker's continuous assessment of the discourse and the
choice of one intonation pattern over another for the purpose of achieving coherence and
cohesion in the discourse, in other words, the interactional significance of intonation.
Brazil's theory thus differed from previous theories of intonation not in proposing a
different set of components, but rather in ascribing different meanings and functions (ones
that derive from usage in discourse) to more or less traditional components. For work by
other linguists in this subfield, see Brown, Currie, and Kenworthy (1980); Coulthard and
Brazil (1981); Johns-Lewis (1986); and Ford and Thompson (1996), to name but a few.
A common belief in discourse intonation studies is that both theoretical and pedagogical
models should be based on naturally and authentically occurring speech, interactions, and
conversations. As Couper-Kuhlen and Selting (1996) suggested, intonation must be viewed
and interpreted within the context in which it occurs, that is, in which it is spoken. It has a signaling
function within the discourse and is part of a real-time, ongoing process of interaction,
where speakers react to the way in which their interlocutor is using intonation (pitch,
rhythm, timing) and conform to it or break away from it. Speakers and hearers cooperate
to avoid conflict and to resolve conflict when necessary. The focus is on the reconstruction
of patterns as cognitively and interactionally relevant categories which real-life interactants
can be shown to orient to (Couper-Kuhlen & Selting, 1996, p. 46).
An example of an intonational choice in context was provided by Bradford (1996):
Although a falling intonational contour might be expected at the end of a statement, a
speaker might choose to use a high rising terminal intonational contour (or upspeak) as
a bonding technique which upspeakers use to promote a sense of solidarity between
themselves and their interlocutors (p. 23). This rising tone serves a participatory function
by encouraging the hearer's continued involvement in the exchange.

The Encyclopedia of Applied Linguistics, Edited by Carol A. Chapelle.
© 2013 Blackwell Publishing Ltd. Published 2013 by Blackwell Publishing Ltd.
DOI: 10.1002/9781405198431.wbeal1180

The pedagogical implications of discourse intonation for language teaching have been
delineated by Brazil, Coulthard, and Johns (1980), who suggested the kind of teaching
syllabus that could be derived from Brazil's seminal model. Other applied linguists who
have also taken discourse approaches to the teaching of intonation include Clennell (1997),
who focused on equipping international students to communicate effectively in
native-English-speaking universities, and Pickering (2001, 2004), who investigated tone choice
by international teaching assistants (ITAs) and showed that their inability to exploit tone
choice and intonational paragraphs to highlight local vs. global information structure
contributed to communication failures between ITAs and their students.
Since the 1980s, the rapid development of technologies for quantifying intonational
features and the increasing use of computers to visually display intonation have led
researchers and teachers to investigate how technology can be used for the teaching and
learning of discourse intonation (Chun, 1998, 2002). For excellent reviews of work on
computer technology for teaching and researching pronunciation, see O'Brien (2006) and
Levis (2007). This entry discusses specifically how technology can (a) illuminate the discourse intonation features of a language, (b) provide a visual comparison of first language
(L1) and second language (L2) intonational patterns, and (c) be used effectively in software
for teaching discourse intonation to L2 learners.

Using Technology to Quantify Discourse Intonation


The importance of technology for teaching discourse intonation was confirmed in a
state-of-the-art chapter on research in teaching pronunciation and intonation by Jenkins (2004),
who stated, "Of the recent findings of pronunciation research, the most influential in terms
of pedagogic developments fall into two main groupings: those concerned with issues of
context and those that relate to technological advances" (pp. 109–10). In English-language
teaching, intonation has been considered to be a key to effectiveness in spoken language
for some time, but the use of visualization technology has been a crucial recent advance
in the teaching of intonation.
Levis and Pickering (2004) showed how teaching can be enhanced by connecting technology to an understanding of how intonation functions in discourse. Their study examined
two discourse-level uses of intonation, namely the use of intonation paragraph markers
and the distribution of tonal patterns, and they demonstrated convincingly that speech
visualization technology provides a more sophisticated understanding of how pitch functions systematically in discourse. Without technology, learners simply had to rely on their
auditory perceptions of pitch changes and movement. But, according to Levis and Pickering
(2004), the use of visualization technology has made intonation practice reliably available
to even those without the ability to confidently identify pitch changes (p. 517), thus
removing one of the most important obstacles to the teaching of intonation.
Sentence-level practice is insufficient for teaching how intonation is used in connected
speech; what is needed is software that presents longer stretches of speech, along with
explanations of the systematic meanings of discourse pitch movement. Specifically, by
expanding the context of computer-based practice material to the discourse level, learners
can be shown and can practice pitch patterns that mirror authentic patterns in contextualized speech. They can learn, for example, how to signal contrastive information structure
by making the relevant syllable prominent, or how to choose an appropriate pitch height
in order to indicate a boundary marker for a paragraph.
Molholt and Hwu (2008) described how technology can illuminate features of discourse
intonation in Chinese by contrasting two versions of the Mandarin Chinese utterance "Ni
zai gan shen me?" ("What are you doing?") with the neutral pronunciation, which would
use the sequence of lexical tones for each of the words in the question. The visual acoustic
representations of the question asked in different contexts revealed that (a) when
expressing strong curiosity in a friendly way to a child, the high pitches of the word tones are
even higher than in the neutral utterance, and the final syllable ends in a rise (as opposed
to the slight fall that would be used in the neutral utterance); and (b) when asking an adult
the same question in a demanding way, the high pitches are much higher than in (a) and
the fall on the final syllable is much more abrupt than in the neutral
utterance (Molholt & Hwu, 2008, p. 117, Figure 5.26).

Figure 1 Different renditions of the question "You know why, don't you?" (Chun, 2002;
John Benjamins)

Chun (2002, pp. 220–1) illustrated how intonation in English can signal the discourse
expectations of the speaker. The left-hand portion of Figure 1 shows the utterance "You
know why, don't you?" with rising intonation at the end of the first clause, "You know why,"
and falling intonation at the end of the second clause, "don't you." Here the questioner
assumes that the listener does know why and wants to confirm this fact. The rising intonation
on the first clause may indicate a sort of reminder to the listener, who may know why but
may not be willing to accept the reason.
The right-hand portion of Figure 1 shows the same words, "You know why, don't you?",
but this rendition shows falling intonation at the end of the first clause and rising
intonation at the end of the second. This pattern is used when the speaker thinks or expects
that the listener knows why, but then has doubts and wishes to confirm this fact. These
examples illustrate the contextualized nature of intonation in discourse: The utterances
are marked intonationally not to signal that they are questions or statements, but rather
to signal underlying assumptions and expectations about the response. Marking shared
knowledge and presuppositions is a function of intonation at the discourse level. The
important principle illustrated here is that there is not a one-to-one correspondence between
syntactic type and intonation pattern, but that different combinations of these structures
occur, depending on pragmatic intentions and meanings.

Comparing Discourse Intonation of L1 and L2


In addition to comparing the same utterance said in different contexts within a language,
another approach to teaching L2 discourse intonation is to first compare the intonation
features of L1 and L2 (or of two varieties of a language, as in the case of different varieties
of English). For example, Pickering and Wiltshire (2000) studied the phonetic correlates
of accent and stress that distinguish Indian English from American English dialects, and
found that in the discourse of university teaching assistants lexically accented syllables
are often realized in Indian English with a relative drop in frequency (pitch) and without
a reliable increase in amplitude (loudness), whereas lexically accented syllables in American
English reliably increase in both frequency and amplitude. The quantification of the phonetic
correlates involved recording classroom discourse and acoustically analyzing the speech
using the Computerized Speech Lab (CSL) software from Kay Elemetrics (now KayPENTAX,
www.kayelemetrics.com).
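The two cues contrasted above, the change in frequency and the change in amplitude on an accented syllable, can be quantified syllable by syllable. The following Python sketch is illustrative only: the function names, the choice of the preceding syllable as the point of comparison, and all input values are assumptions made here, and it does not reproduce the procedure or data of Pickering and Wiltshire (2000) or of the CSL software.

```python
import math

def rms_amplitude_db(samples):
    """RMS amplitude of a syllable's waveform samples, in dB relative to 1.0.
    Assumes the samples are not all zero (log of zero is undefined)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)

def accent_correlates(f0_prev_hz, f0_accented_hz, samples_prev, samples_accented):
    """Return (pitch change in semitones, amplitude change in dB), measured
    from the preceding syllable to the accented syllable."""
    pitch_change = 12 * math.log2(f0_accented_hz / f0_prev_hz)
    amp_change = rms_amplitude_db(samples_accented) - rms_amplitude_db(samples_prev)
    return pitch_change, amp_change

# Made-up values for illustration only (not measurements from any study).
quiet = [0.1, -0.1] * 100   # hypothetical unaccented syllable
loud = [0.2, -0.2] * 100    # hypothetical accented syllable, twice the amplitude

# Accent realized with a rise in both pitch and amplitude.
rise_pitch, rise_amp = accent_correlates(180.0, 220.0, quiet, loud)
# Accent realized with a pitch drop and no change in amplitude.
drop_pitch, drop_amp = accent_correlates(200.0, 170.0, quiet, quiet)

print(f"{rise_pitch:+.2f} st, {rise_amp:+.2f} dB")
print(f"{drop_pitch:+.2f} st, {drop_amp:+.2f} dB")
```

Expressing the pitch change in semitones rather than hertz makes the values comparable across speakers with different pitch ranges.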
In a similar vein, Wennerstrom (1998) compared four aspects of intonation that contribute
to achieving cohesion in the lectures of native speakers of English and of Mandarin Chinese
speakers lecturing in English. She found that native speakers of English (a) used higher
pitch to distinguish content words from function words, (b) often used utterance-medial
nonfalling boundary tones, (c) made a large distinction between contrasting and given
lexical items, and (d) differentiated rhetorical units through increased pitch range at major
organizational junctures. In contrast, the Chinese speakers with lower levels of English
proficiency who lectured in English (a) did not use as great a pitch difference to distinguish
newly introduced content words and function words, (b) often used low boundary tones
in mid-utterance position, (c) did not consistently use pitch to distinguish contrasting items
from given items, and (d) did not receive high speaking scores when they failed to use
increased pitch range to mark topic shifts. After quantifying specific features of discourse
intonation in different languages or by speakers of a language with different L1s, technology
can then be used to train L2 learners; this is discussed further in the next section.

Software for Teaching and Learning L2 Discourse Intonation


An excellent example of commercially available discourse-oriented intonation materials
is Streaming Speech (Cauldwell, 2002), which goes beyond sentence-level practice. Based
on Brazil's theory of discourse intonation, it is targeted at advanced learners of British
or American English. All of the speech samples are unscripted narratives that have been
meticulously and extensively repurposed for pedagogical use. All of the recordings of
natural spontaneous speech have been analyzed in terms of both a traditional syllabus
that deals with vowels and consonants, and, more importantly, a discourse syllabus that
deals with the choices that speakers make in terms of pitch and stress and the strategies
they use to communicate effectively in real time. The discourse syllabus introduces the
notion of the speech unit, which is described as a stretch of speech with its own rhythm,
tones, and other features that make it stream-like. Learners are made aware not only of
different tones (e.g., falling, level, and rising) but of their relative frequency of occurrence
in English as well. Learners are taught about the use of high and low key, and how to
pause in ways that are acceptable and comprehensible to listeners. There are also opportunities for learners to practice dealing with common occurrences in spontaneous speech,
such as restarting after mistakes, self-correcting, and repeating themselves.
Streaming Speech provides speech unit transcriptions with notations (see Figure 2). The
notations indicate the speech unit number, speech unit boundaries, tones (arrows), where
the tone begins (underlined), prominent syllables (capitalized), and speed (words per
minute). Learners can listen to an entire recording and can also play back selected speech
units at the click of a button.

Figure 2 Screenshot of Streaming Speech unit transcription (www.speechinaction.com;
Streaming Speech)

In Figure 3, the graphics (arrows and underlining), audio, and animation of the pitch
movement allow learners to observe the direction of the pitch change while simultaneously
hearing the utterance spoken. The use of authentic speech is in stark contrast to the vast
majority of programs for pronunciation, which use stilted, unnatural-sounding recordings,
usually not based on authentic speech. The program does not incorporate automatic speech
recognition (ASR) and pronunciation evaluation, which many other commercial packages
purportedly offer and which will be discussed further below.

Figure 3 Screenshot of Streaming Speech pitch animation (http://www.speechinaction.com/;
Streaming Speech)

Two limitations of the software are (a) that actual fundamental frequency representations
are not included, which would allow learners to see, for example, the degree of fall or rise
of the pitch (this, incidentally, could easily be done using an open source program such
as Praat, http://www.fon.hum.uva.nl/praat); and (b) that learners do not receive any kind
of feedback (which could be remedied perhaps by allowing learners to record their utterances to an online voice board, e.g., Audacity, audacity.sourceforge.net, so that an instructor could provide individual feedback).
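To make concrete what a fundamental frequency representation adds, namely the size of a fall or rise and not only its direction, the following Python sketch estimates F0 for two frames and expresses the movement between them in semitones. It is a deliberately simplified autocorrelation estimator written for illustration; it is not Praat's algorithm, which is considerably more refined. The function names, the 75 to 500 Hz search range, and the synthetic sine waves standing in for recorded speech are all assumptions made here.

```python
import math

def estimate_f0(frame, sample_rate, fmin=75.0, fmax=500.0):
    """Estimate the F0 (Hz) of one voiced frame from its autocorrelation peak.
    Returns 0.0 if no lag in the search range has positive correlation."""
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0

def semitone_change(f_start_hz, f_end_hz):
    """Size of a pitch movement in semitones (negative means a fall)."""
    return 12 * math.log2(f_end_hz / f_start_hz)

def sine_frame(freq_hz, sample_rate, n_samples):
    """Synthetic test signal standing in for a recorded voiced frame."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]

SAMPLE_RATE = 16000  # assumed sampling rate in Hz
# Two 40 ms frames: one near the start and one near the end of a falling tone.
start = estimate_f0(sine_frame(200.0, SAMPLE_RATE, 640), SAMPLE_RATE)
end = estimate_f0(sine_frame(100.0, SAMPLE_RATE, 640), SAMPLE_RATE)
print(start, end, semitone_change(start, end))
```

Real speech is noisier than a sine wave, so a practical tool would also need voicing detection and protection against octave errors, which this sketch omits.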
To date, there is a paucity of studies evaluating the effectiveness of using technology
to teach discourse intonation, but an excellent model for such work is provided by Hardison
(2005), who employed contextualized input in prosody training for L2 speakers of English
whose L1 was Chinese. The learners received training input using Anvil, a free Web-based
video annotation tool that displays both videos of a speech event and visual displays of
the pitch contour (imported from phonetic tools such as Praat). Each of the participants
gave a series of oral presentations, which provided the source for his or her training
materials. Two groups received training input that involved both selected video segments
of their presentations and displays of their pitch contours; they then practiced with the
CSL software. Two other groups received training input that consisted only of the visual
display of their pitch contours (without the video), which they then practiced with the
CSL software. Within each of the pairs of groups, one group trained with discourse-level
segments from their presentations, while the other group trained with only individual
sentences. Results showed that the prosody or intonation of all four groups improved
after the training, but the discourse-level input produced better transfer to novel natural
discourse. The findings of Hardison's study using technology-based training, namely that
meaningful contextualized input is valuable in prosody training when the measurement
is at the level of extended connected speech typical of natural discourse (2005, p. 175),
provide strong support for the effectiveness of technology in teaching and training discourse
intonation.
As mentioned above, ASR technology has become prevalent in commercial products for
language learning. But one of the biggest obstacles to developing effective software is that
prosody is a very complex phenomenon involving a number of components, including
intonation (loudness and pitch), stress, accent, and rhythm (duration and pauses). Holland
and Fisher (2008) acknowledged that prosody fluctuates much more than basic vowel and
consonant sounds, depending on the surrounding semantic and pragmatic context of an
utterance. They also noted that the enormous variation in how native speakers realize
intonation and rhythm is a great challenge for developing reliable algorithms for prosodic
analysis, suggesting that "One way to contend with individual variation is to express
prosodic measures not in absolute but in relative terms, such as the duration of one
syllable compared to an adjacent one. CALL prosody prototypes have been developed
on this principle to detect learners' departures from acceptable native values of given
utterances" (p. 13). For example, one difficulty that becomes obvious when using ASR
software is reconciling the speed of speech of the native speaker model and the L2 learner.
If the learner speaks at a slower or faster rate than the native speaker model, the
waveforms and pitch contours of the two utterances will not line up unless this difference
is correctly identified and adjusted for by the software. But the mere difference in speed is not
necessarily problematic in terms of how native-like or acceptable an utterance sounds.
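The two ideas in this paragraph, expressing prosodic measures in relative terms and adjusting for differences in speaking rate, can be sketched in a few lines of Python. The sketch below is a minimal illustration under stated assumptions, not the method of any product or prototype mentioned here: it stretches both pitch contours onto the same number of points by linear interpolation (a real system would more likely align syllables or use a technique such as dynamic time warping), converts each contour to semitones relative to the speaker's own mean, and reports the root-mean-square difference. All function names and contour values are hypothetical.

```python
import math

def resample(contour, n_points):
    """Linearly interpolate a contour onto n_points evenly spaced positions,
    which removes differences in overall duration (speaking rate)."""
    if n_points < 2 or len(contour) < 2:
        raise ValueError("need at least two points")
    step = (len(contour) - 1) / (n_points - 1)
    out = []
    for k in range(n_points):
        pos = k * step
        lo = min(int(pos), len(contour) - 2)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[lo + 1] * frac)
    return out

def to_relative_semitones(contour_hz):
    """Express each F0 value in semitones relative to the speaker's own
    mean (log-scale) F0, which removes differences in overall pitch level."""
    mean_log = sum(math.log2(f) for f in contour_hz) / len(contour_hz)
    return [12 * (math.log2(f) - mean_log) for f in contour_hz]

def contour_distance(model_hz, learner_hz, n_points=20):
    """RMS difference (in semitones) between two normalized pitch contours."""
    a = to_relative_semitones(resample(model_hz, n_points))
    b = to_relative_semitones(resample(learner_hz, n_points))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / n_points)

# Hypothetical F0 values in Hz, one per analysis frame (voiced frames only).
model = [200.0, 220.0, 240.0, 200.0, 160.0]
# Same melodic shape, but an octave lower and spoken more slowly (more frames).
slower_lower = resample([f / 2 for f in model], 9)

print(contour_distance(model, slower_lower))
print(contour_distance([200.0, 180.0, 160.0], [160.0, 180.0, 200.0]))
```

With this normalization, a learner who speaks more slowly and at a lower pitch than the model but with the same melodic shape is not penalized, whereas a fall produced where the model has a rise yields a clearly larger distance.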

Conclusion
Although the use of intonation in discourse has been studied linguistically in many languages, this knowledge has not been applied extensively to language teaching. There are
several reasons for this, the first being that prosody is a complex aspect of spoken language.
Second, even though software for acoustic analyses is freely available and can display
visualizations of pitch contours, most language teachers do not themselves understand
the concepts of discourse intonation, much less feel able to teach them to their students.
Third, although commercial software for learning pronunciation contains ASR components,
the feedback that is provided to learners does not explain precisely how the learner's
utterance differs from the native speaker's model utterance, nor does it quantify what must
be changed in order for the utterance to be comprehensible or appropriate for the discourse.
For the future, linguists, language teachers, and acoustic software engineers must collaborate in order to provide meaningful, easily interpretable feedback to learners.

SEE ALSO: Brazil, David; Computer-Assisted Pronunciation Teaching; Suprasegmentals:
Discourse Intonation; Suprasegmentals: Intonation

References
Bradford, B. (1996). Upspeak. Speak Out! Newsletter of the IATEFL Pronunciation Special Interest
Group, 18, 22–4.
Brazil, D. (1975). Discourse intonation I. Birmingham, England: English Language Research
Monographs.
Brazil, D., Coulthard, M., & Johns, C. (1980). Discourse intonation and language teaching. London,
England: Longman.
Brown, G., Currie, K. L., & Kenworthy, J. (1980). Questions of intonation. London, England: Croom
Helm.
Cauldwell, R. (2002). Streaming speech: Listening and pronunciation for advanced learners of English
[CD-ROM for Windows]. Birmingham, England: speechinaction.
Chun, D. M. (1998). Signal analysis software for teaching discourse intonation. Language Learning
and Technology, 2, 61–77.
Chun, D. M. (2002). Discourse intonation in L2: From theory and research to practice. Amsterdam,
Netherlands: John Benjamins.
Clennell, C. (1997). Raising the pedagogic status of discourse intonation teaching. Language
Teaching Journal, 51(2), 117–34.
Coulthard, M., & Brazil, D. (1981). The place of intonation in the description of interaction. In
D. Tannen (Ed.), Analyzing discourse: Text and talk (pp. 94–112). Washington, DC: Georgetown
University Press.
Couper-Kuhlen, E., & Selting, M. (Eds.). (1996). Prosody in conversation. Cambridge, England:
Cambridge University Press.
Ford, C. E., & Thompson, S. A. (1996). Interactional units in conversation: Syntactic, intonational,
and pragmatic resources for the management of turns. In E. Ochs, E. A. Schegloff, &
S. A. Thompson (Eds.), Interaction and grammar (pp. 134–84). New York, NY: Cambridge
University Press.
Halliday, M. (1970). A course in spoken English: Intonation. Oxford, England: Oxford University
Press.
Hardison, D. M. (2005). Contextualized computer-based L2 prosody training: Evaluating the
effects of discourse context and video input. CALICO Journal, 22, 175–90.
Holland, V. M., & Fisher, F. P. (Eds.). (2008). The path of speech technologies in computer assisted
language learning. New York, NY: Routledge.
Jenkins, J. (2004). Research in teaching pronunciation and intonation. Annual Review of Applied
Linguistics, 24, 109–25.
Johns-Lewis, C. (Ed.). (1986). Intonation in discourse. San Diego, CA: College Hill Press.
Levis, J. (2007). Computer technology in teaching and researching pronunciation. Annual Review
of Applied Linguistics, 27, 184–202.
Levis, J., & Pickering, L. (2004). Teaching intonation in discourse using speech visualization
technology. System, 32(4), 505–24.
Molholt, G., & Hwu, F. (2008). Visualization of speech patterns for language learning. In
V. M. Holland & F. P. Fisher (Eds.), The path of speech technologies in computer assisted language
learning (pp. 91–122). New York, NY: Routledge.
O'Brien, M. (2006). Teaching pronunciation and intonation with computer technology. In
L. Ducate & N. Arnold (Eds.), Calling on CALL: From theory and research to new directions in
foreign language teaching (pp. 127–48). San Marcos, TX: CALICO.
Pickering, L. (2001). The role of tone choice in improving ITA communication in the classroom.
TESOL Quarterly, 35, 233–55.
Pickering, L. (2004). The structure and function of intonational paragraphs in native and
nonnative speaker instructional discourse. English for Specific Purposes, 23, 19–43.
Pickering, L., & Wiltshire, C. (2000). Pitch accent in Indian-English teaching discourse. World
Englishes, 19(2), 173–83.
Wennerstrom, A. (1998). Intonation as cohesion in academic discourse. Studies in Second Language
Acquisition, 20, 1–25.

Suggested Readings
Bolinger, D. (1989). Intonation and its uses: Melody in grammar and discourse. Stanford, CA: Stanford
University Press.
Bradford, B. (1988). Intonation in context. Cambridge, England: Cambridge University Press.
Brazil, D. (1997). The communicative value of intonation in English. Cambridge, England: Cambridge
University Press.
Wennerstrom, A. (2001). The music of everyday speech: Prosody and discourse analysis. Oxford,
England: Oxford University Press.
