The Routledge Handbook of Second Language Acquisition and Speaking
This Handbook is a comprehensive volume outlining the foremost issues regarding research
and teaching of second language speaking, examining such diverse topics as cognitive
processing, articulation, knowledge of pragmatics, instruction in sub-components of
speaking (e.g., grammar, pronunciation, and vocabulary) and the attrition of the first
language. Outstanding academics have contributed chapters to provide an integrated and
inclusive perspective on oral language skills. Specialized contexts for speaking are also
explored (e.g., English as a Lingua Franca, workplace, and interpreting). The Routledge
Handbook of Second Language Acquisition and Speaking will be an indispensable resource for
students and scholars in applied linguistics, cognitive psychology, linguistics, and education.
DOI: 10.4324/9781003022497
Contents
List of Figures
List of Tables
Contributors
Preface
Acknowledgements
Editors’ Introduction
PART I
Theoretical Foundations and Processes Underlying Speaking
6 Language Anxiety
Małgorzata Baran-Łucarz
PART II
Research Issues
PART III
Core Topics
13 Fluency
Jimin Kahng
PART IV
Teaching Speaking
PART V
Emerging Issues
Index
CONTRIBUTORS
Szilvia Bátyi is a Lecturer at the University of Pannonia, Veszprém, Hungary. Her research
interests include first language attrition, L1 and L2 speech production, and linguistic
landscape. She is the area editor of Hungarian in the Linguistic Minorities in Europe Online
peer-reviewed reference resource at Mouton de Gruyter.
Trang Le Diem Bui has a doctorate from the Victoria University of Wellington and is a
Senior Lecturer in the Faculty of Foreign Languages, An Giang University, Vietnam
National University, Ho Chi Minh City, Vietnam.
Laura Collins is a Professor Emeritus at Concordia University, Montréal. Her research has
examined the impact on learning of different distributions of instructional time as well as the
input factors and practice opportunities that may facilitate/constrain classroom learning. She
is the Past-President of the American Association for Applied Linguistics.
Kees de Bot is a Professor at the University of Pannonia. His interests range from bilingual
processing to language attrition, language development over the lifespan and language and
aging. He published a book on the history of Applied Linguistics with Routledge in 2015.
Tracey M. Derwing has extensively researched L2 pronunciation and fluency, especially the
relationships among intelligibility, comprehensibility, and accent. She has also investigated
native speakers’ speech modifications for L2 speakers and has conducted workplace studies
involving pragmatics and pronunciation.
Patricia Duff is a Professor of Applied Linguistics and an Associate Dean, Research, in the
Faculty of Education at the University of British Columbia. Her research, teaching, and
scholarship focus on sociocultural approaches to the teaching, learning, and use of languages
in transnational, multilingual contexts.
Jim Hlavac is a Senior Lecturer in Translation and Interpreting Studies, Monash University,
Melbourne and a certified and practising interpreter/translator. He has published widely in
the field of Translation and Interpreting Studies and in multilingualism, contact linguistics,
sociolinguistics, intercultural communication, pragmatics, and heritage/minority language
maintenance.
Talia Isaacs is an Associate Professor of Applied Linguistics and TESOL and Programme
Leader for the MA TESOL In-Service at the UCL Centre for Applied Linguistics, UCL
Institute of Education, University College London, UK. Her research centers on
pronunciation assessment, including understanding constructs and scoring systems using a
mixed methods approach.
Sara Kennedy is a teacher and researcher at Concordia University in Montreal. Her research
interests include the teaching, learning, and assessment of second language speech, and
English and French as a lingua franca.
John M. Levis is a Professor in the Applied Linguistics and Technology program at Iowa
State University. His research interests are in L2 pronunciation teaching and speech
intelligibility. He authored Intelligibility, oral communication, and the teaching of
pronunciation (2018, Cambridge) and is an editor of the Journal of Second Language
Pronunciation.
Enric Llurda is a Professor of Applied Linguistics at the University of Lleida. His research
interests include non-native language teachers, English as a lingua franca, language attitudes,
multilingualism, translanguaging, internationalization and language education and policy in
higher education institutions. He is currently working on the development of disciplinary
literacies in English at university.
Wander Lowie holds a PhD in Applied Linguistics from the University of Groningen and is
the chair of Applied Linguistics at this university. His main research interest lies in the
application of Dynamic Systems Theory to second language. He has published more than 50
articles and book chapters and (co-)authored six books in the field of Applied Linguistics. He is
an associate editor of The Modern Language Journal.
Peggy Mok is an Associate Professor at the Chinese University of Hong Kong. She is
interested in both speech production and perception, particularly with cross-linguistic and
psycholinguistic perspectives. In recent years, she has focused more on speech prosody.
Speech acquisition in different contexts is an important theme of her research.
Charlie Nagle is an Associate Professor of Spanish and Applied Linguistics at Iowa State
University. His primary research area is second language pronunciation. He has published
on topics such as the perception–production link, individual differences in pronunciation
learning, and dynamic and interactive approaches to listener-based ratings.
Jonathan Newton is an Associate Professor and Programme Director for the MA in Applied
Linguistics/TESOL Programmes at the School of Linguistics and Applied Language Studies
(LALS), Victoria University of Wellington, New Zealand.
Bao Trang Thi Nguyen has a doctorate from Victoria University of Wellington and is a
lecturer at the Faculty of English, Hue University of Foreign Languages, Vietnam.
Elke Peters is an Associate Professor at KU Leuven. Her research interests involve deliberate
and incidental FL vocabulary learning inside and outside of the classroom and how different
types of input can contribute to vocabulary learning. She has published her research in
Language Learning, Studies in Second Language Acquisition, and TESOL Quarterly.
June Ruivivar holds a PhD in Education, with a specialization in applied linguistics, from
Concordia University, Montréal. Her research explores the acquisition of sociolinguistic
competence, the learning and teaching of spoken grammar and vernacular varieties, and
socio-affective issues in second language acquisition.
Monika Schmid is the Head of the Department of Language and Linguistics at the University
of Essex. She obtained her PhD in English Linguistics in 2000 from the Heinrich-Heine
Universität Düsseldorf with a thesis on first language attrition among German Jews. She has
since held positions at the Vrije Universiteit Amsterdam and at the Rijksuniversiteit
Groningen.
Alif O. Silpachai is a graduate student in the Applied Linguistics and Technology Program in
the English Department at Iowa State University. His research interests include the
production and the perception of suprasegmentals, especially lexical tones.
Thi Phuong Thao Tran has developed her interest and research in intercultural language teaching
and learning and intercultural understanding through her teaching at Can Tho University in
Vietnam, her PhD journey at Victoria University of Wellington in New Zealand and her
work supporting migrants in Australia.
Aki Tsunemoto is currently a PhD candidate at Concordia University, Montréal. She earned
her MA in TESOL at UCL Institute of Education, University College London, UK. Her
current research interests are second language speech assessment, psycholinguistic aspects of
speech interaction and second language pronunciation teaching.
Duy Van Vu is a PhD researcher in Linguistics at KU Leuven. His research focuses on second
language vocabulary acquisition and use. He is interested in how vocabulary is addressed in
second language textbooks and classrooms as well as how vocabulary can be acquired from
different modes of input.
Yanjiao Zhu is a Lecturer in the School of Foreign Languages at the University of Electronic
Science and Technology of China. Her current research focuses on the acquisition of third
language speech, exploring the ways prior linguistic knowledge influences the development of
a new sound system.
PREFACE
This Handbook is a comprehensive volume outlining the foremost issues regarding research
and teaching of second language speaking, examining such diverse topics as cognitive
processing, articulation, knowledge of pragmatics, instruction in sub-components of
speaking (e.g., grammar, pronunciation, and vocabulary) and the attrition of the first
language. Outstanding academics have contributed chapters to provide an integrated and
inclusive perspective on oral language skills. Specialized contexts for speaking are also
explored (e.g., English as a Lingua Franca, workplace, clinical settings, and interpretation).
It is our hope that The Routledge Handbook of Second Language Acquisition and Speaking
will become an indispensable resource for students and scholars in Applied Linguistics,
Cognitive Psychology, Linguistics, and Education.
On completion of an environmental scan, we determined that there are very few resources
that adequately address the breadth of research on second language (L2) speaking. Research
studies appear in disparate journals and edited volumes. This Handbook constitutes a
thorough treatment of speaking-related topics by leading experts in the field. Parts of the
Handbook will appeal to instructors of courses on cognitive processes underlying SLA; the
teaching of speaking; and L2 pronunciation and pragmatics.
Most chapters follow the same outline, in that they first introduce key definitions, followed
by descriptions of historical conceptualizations of a given topic, critical issues in the field,
current issues, research methods commonly used to probe these issues, recommendations for
practice and promising new directions. The authors have also provided some additional
references which they believe are some of the best, more extensive explorations into their
topics.
We three editors agree that we learned a lot in reading these contributions, so we
encourage you to dip into areas that you might not ordinarily seek out. The chapters are
fascinating and may trigger new ideas.
ACKNOWLEDGEMENTS
First, we thank the authors of the chapters here. When we invited them to contribute to this
volume, we and they had no idea that all of our lives would be so disrupted a few months
later. The original due date for the chapters was the end of March, 2020. Suddenly, several
contributors were homeschooling multiple grades as well as converting their university
classes to an online platform, together with other adjustments associated with lockdown.
Many were burdened with additional administrative loads as a result of the pandemic.
Others had their daily routines disrupted in other ways. It is a tribute to each of our
contributors that they all sent excellent chapters our way, albeit with some delay (and thanks
to the Series Editors and Routledge for their understanding).
We are most grateful to the scholars who provided chapter reviews. These individuals
offered extremely helpful feedback to the contributors. They were also affected by the
adverse conditions brought on by the pandemic, but they nonetheless offered superb advice
in a timely manner. The reviewers are listed here alphabetically:
Susan Ballinger
Kathleen Bardovi-Harlig
Bill Crawford
Remi van Compernolle
Isabelle Darcy
Esther DeLeeuw
Jean-Marc DeWaele
Roger Gilabert
Talia Isaacs
Eva Karchava
Judit Kormos
Andrew Lee
John M. Levis
Elena Nicoladis
Mary Grantham O’Brien
Ron Smyth
Stuart Webb
David Wood
EDITORS’ INTRODUCTION
Background
In March of 2017, we were approached by Susan Gass with an invitation to edit a Handbook
on Speaking in the Routledge SLA series edited by Sue and Alison Mackey. We agreed, but
only if we could start the project in July of 2019, with the understanding that once we started,
we would need to adhere to strict timelines. (All that went out the window in March of 2020.)
We were somewhat surprised to be approached to edit a volume on speaking, since our own
primary expertise, second language pronunciation, lies within a subset of speaking. We were
thus slightly reluctant to take this on, but on closer investigation, we realized that many
researchers are in the same position as we are, in that their focus lies within one or two
components of speaking, rather than a broader thrust. We reached out to experts across the
spectrum, and the result is this comprehensive snapshot of L2 speaking in SLA research.
We have divided the Handbook into five parts, starting with Theoretical Foundations and
Processes Underlying Speaking, followed by Research Issues, then Core Topics, Teaching
Speaking, and Emerging Issues.
DOI: 10.4324/9781003022497-1
completion tasks. However, L2 speech requires close attention to a wide array of phenomena
including fluency and pausing, and prosodic features such as stress and intonation. Jimin Kahng
(Chapter 13) details L2 fluency research, while Peggy Mok and Yanjiao Zhu (Chapter 14) focus
on the role of prosody across languages. Fluency, appropriate use of pragmatics, and prosody all
contribute to comprehensibility in the sense of “ease of understanding.” In Chapter 12, Pavel
Trofimovich and colleagues have examined the combination of factors that contribute to a given
speaker’s comprehensibility. It is now clear from several studies that not every feature of an L2
accent has an impact on comprehensibility or intelligibility – the listener’s actual understanding
of the L2 speaker’s intended message. John Levis and Alif Silpachai demonstrate this in their
discussion of speech intelligibility (Chapter 11).
As we noted earlier, pronunciation is a subset of speaking, but it is one that has elicited
significant attention in the past two decades. Clearly, if a listener cannot understand an L2
speaker’s output, communication has failed. In some cases that failure may be attributed to
aspects of the speaker’s pronunciation, while in others, the attitudes and choices of the listener
also play a role. Throughout the 1970s and 1980s, it was presumed that pronunciation
would develop on its own with sufficient input. That turned out not to be the case. Tracey
Derwing and Murray Munro (Chapter 10) outline the course of pronunciation in L2 teaching
and research over the past several years.
accessing university students studying a second language. In Chapter 29, Johanne Paradis
explores issues related to child L2 speakers with language and communication disorders.
Another under-represented group in SLA speaking research is adult migrants in the
workplace, but clearly communication in the workplace is an important societal concern. In
Chapter 25, Lynda Yates highlights what we have learned about workplace communication
thus far and makes recommendations for future studies.
Ron Thomson considers the relationship between L2 speech perception and production in
Chapter 26, pointing out that although these two aspects of language are equally important,
the focus in language classrooms has been unduly weighted on production.
In Chapter 27, Marianne Gullberg notes that speaking is multimodal. Not only does it
require manipulations of the articulators, but also speakers employ gestures with their hands,
their eyebrows, movements of the head and so on. She points out that most SLA research on
speaking ignores gestural components and offers compelling arguments for examining gestures
when investigating the development of L2 speech.
Second language researchers and teachers are not the only professional group with a
vested interest in L2 speakers. As Marie Nader (Chapter 28) describes, many speech language
pathologists have made L2 accent modification a component of their practices, yet all
too often they are unfamiliar with aspects of SLA research that could inform them of how
best to help their clients.
Exceptionally talented language learners can eventually master another language to the
point of becoming professional interpreters, perhaps the most difficult of all linguistic
undertakings. In Chapter 30, Jim Hlavac outlines the nascent research in this area.
Most of this volume is dedicated to the examination of processes and products of the
learning and teaching of a second language, but in Chapter 31, Monika Schmid considers the
effects of the acquisition of another language on a person’s first language speech, noting that
they are more far-reaching than previously believed.
Conclusion
Overall, this volume demonstrates the many directions that SLA research can take in the
investigation of L2 speaking. Each chapter follows a similar trajectory, defining important
terms, outlining a historical overview, highlighting critical issues and topics, examining current
contributions and research, discussing main research methods and making recommendations
for practice. Students looking for ideas for a thesis topic would do well to consult the future
directions section of each chapter, which provides a wealth of ideas for new contributions to our
understanding of L2 speaking. Spoken communication lies at the heart of our very beings. As
Derwing and Munro (2015) observed, the withdrawal of the opportunity to talk with others,
from a child’s “time out” to solitary confinement in prison, is a punishment. Humans are
social; the more we know about communication, and particularly communicating in languages
other than our mother tongues, the better. We hope that some of the topics here inspire new
research and new ways of approaching the instruction of L2 speaking.
References
Derwing, T. M. & Munro, M. J. (2015). Pronunciation fundamentals: Evidence-based perspectives for L2
teaching and research. Amsterdam: John Benjamins.
Gass, S. M. & Varonis, E. M. (1994). Input, interaction, and second language production. Studies in
Second Language Acquisition, 16(3), 283–302.
Krashen, S. D. (1982). Principles and practice in second language acquisition. Oxford: Pergamon Press.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Long, M. H. (1983). Native speaker/non-native speaker conversation and the negotiation of
comprehensible input. Applied Linguistics, 4(2), 126–141.
Mackey, A. & Gass, S. M. (2016). Second language research: Methodology and design (2nd edn).
New York: Routledge.
Swain, M. (1993). The output hypothesis: Just speaking and writing aren’t enough. Canadian Modern
Language Review, 50(1), 158–164.
PART I
1 Introduction
In this contribution, we discuss bilingual language production and some of the theoretical
concepts related to this. Our starting point will be the 1989 Speaking Model by Willem
Levelt. Despite the fact that it is now more than 30 years old, it still stands as the most
elaborate and empirically founded model for language production. The original model did
not aim to elucidate how bilingual production differs from monolingual production. In the
past few decades, however, a number of bilingual variants have been developed, although
these variants do not alter the fundamental characteristics of the model: it is lexically based,
modular in type, incremental, and skills oriented. It is not a model of change, but parts of the
model can change due to use and learning. One of the major issues in applying the model is
that it is unclear how individual differences such as motivation, attitudes, and anxiety can be
built into the blueprint.
Bilingual models have been developed on the basis of the Levelt model. They have the
same characteristics but should help in understanding phenomena such as code-switching
and cross-linguistic influence (CLI).
2 Historical Perspectives
The question is: is the default model a monolingual or a bilingual one? On various occasions,
Levelt has indicated that in his view bilingualism is a fascinating topic, but not one he wants
to work on, claiming that “monolingualism is already complex enough.” Several authors have
argued that taking a monolingual model as a starting point does not do justice to the fact
that an overwhelming majority of the world’s population is bilingual and that therefore the
default model should be a bilingual one (Grosjean, 2008, 2010). First of all, it has to be
assessed to what extent the current model can deal with various aspects of multilingual
processing. Bilingual and multilingual speech production models are usually derivations of
Levelt’s speaking model (Figure 1.1) or at least they borrow some elements from it.
Consequently, bilingual speaking models cannot be discussed without mentioning the
Speaking model. In the following parts, we describe the main components and processing
mechanism of the Levelt model followed by a discussion of bilingual versions of the model
and a brief outline of how code-switching is accounted for by the different models.
DOI: 10.4324/9781003022497-3
Kees de Bot and Szilvia Bátyi
Bilingual Models of Speaking
A full description of the Speaking Model would take more space than allowed for the
present contribution. Here, we offer a condensed version of the main characteristics.
Production starts in the conceptualizer where a communicative intention is turned into
lexical concepts. In the generation of the message, information about the conversational
setting and the discourse model is taken into account, which includes the selection of a
linguistic register. Hesitation markers (e.g., silent or filled pauses) are often taken as
indicators of the amount of mental activity going on: the expression of new or complex ideas
is often preceded by greater hesitancy manifested in pauses. That is, more attention is
directed to the planning stage, and the resources available to execute the act of speaking are
therefore limited. Conceptualization involves macroplanning and microplanning, including the
rendering of ideas in the right order (linearization) and the plan of achieving communication
goals (instrumentality). Within the conceptualizer, the message is generated and monitored
internally to check whether the (preverbal) plan coincides with the intended message. Finally, the
series of lexical concepts is turned into the preverbal message (see Figure 1.2 and Table 1.1 as
an example), which is fed forward to the formulator.
Here, the essential process of turning lexical concepts into a surface structure takes place,
which is done by matching lexical concepts with lemmas in the lexicon. Lexical items consist
of two parts: the lemma in which the entry’s meaning and syntax are represented, and the
lexeme that contains the morphosyntactic and phonological information. The matching of a
lemma with a lexical concept also leads to the activation of the syntactic procedures that are
part of the lemma. For instance, if a transitive verb (e.g., caught) is activated, it will start the
syntactic procedures for the generation of a direct object (e.g., ball). The
activation of the lexical item also leads to the lexeme becoming available. This process is not
always successful, as the well-known tip-of-the-tongue phenomenon shows; sometimes the
lexical item is activated through the lemma, but the lexeme part does not come up in time.
Interestingly, some properties of the intended word form do become available; speakers often
know how many syllables the word has and what the rhythmic pattern is. The selection of the
lemmas and lexemes also leads to the formation of a surface structure. While the surface
structure is being formed, the morpho-phonological information belonging to the lemma is
activated and encoded. The output of the formulator is the input of the articulator, which
converts the speech plan into actual speech. There are two feedback loops, one internal that
checks the inner speech, and one external that checks the overt speech. Syllables are the
building blocks of speech. The outputs of the articulator are motor-plans to execute the
assembling of syllables into running speech. The two feedback loops monitor the speech and
articulation.
As this short description may show, speaking is primarily lexically rather than syntactically
based. It is modular in nature, though Levelt has always carefully avoided calling the
processing components modules, since that would imply that these components are modular
and therefore innate. In later publications, Levelt followed the view that the modular
character of these components is emergent: it is the result of use, not the origin. Several
commentators (e.g., de Bot et al., 2007) have argued that the modularity of the model is one
of its main characteristics and also one of its weaker points, because a strict modular view
does not allow for a view on language that is more dynamic in nature.
The system works “from left to right,” that is, the information enters the system and is
processed from intention to articulation without feedback or feedforward. It is only at the
level of the phonetic plan that internal speech is monitored and corrections can be made.
This means that errors in speaking can only be detected fairly late in the process. If, for
instance, the wrong lexical item has been activated, this can only be detected through the
internal feedback loop that monitors the internal speech. This also means that there are no
in-between mechanisms that can correct the error. Basically, error correction is redoing the
same procedures and hoping that this time the intended meaning is actually expressed
correctly.
Component: Conceptualizer
  Process: Concepts are chosen.
  Example: Catching (of something by someone); dog (the entity carrying out this action); ball (the entity on which the action is carried out).

Component: Formulator
  Process: Lemmas are accessed and retrieved from the mental lexicon.
  Example: {catch}; {dog}; {ball}
  Process: Grammatical roles are given to the lemmas.
  Example: VERB = {catch}; SUBJECT = {dog}: singular, definite; OBJECT = {ball}: singular, definite; TIME = past
  Process: The selected set of lemmas is organized into an ordered string.
  Example: (DETERMINER) {dog} [singular; definite] {catch} [past] (DETERMINER) {ball} [singular; definite]
  Process: The lexemes or word-forms are made available via links with the lemmas.
  Example: e.g., {dog} is linked in the mental lexicon both to the written form <dog> and to the spoken form /dɒɡ/

Component: Articulator
  Process: The utterance is pronounced.
  Example: The dog caught the ball.
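The staged flow summarized in the table can be illustrated with a short program. What follows is only a toy sketch under strong simplifying assumptions, not an implementation of Levelt's model: the miniature lexicon, the PreverbalMessage fields, and the function names are hypothetical and chosen purely for illustration, and the determiner is hard-coded as definite.

```python
from dataclasses import dataclass

# Hypothetical miniature lexicon: each lemma is linked to its word forms (lexemes).
LEXICON = {
    "catch": {"past": "caught", "present": "catches"},
    "dog": {"singular": "dog"},
    "ball": {"singular": "ball"},
}


@dataclass
class PreverbalMessage:
    action: str  # lexical concept for the event
    agent: str   # entity carrying out the action
    theme: str   # entity on which the action is carried out
    time: str    # "past" or "present"


def conceptualize(intention):
    """Conceptualizer: turn a communicative intention into lexical concepts."""
    return PreverbalMessage(**intention)


def formulate(message):
    """Formulator: retrieve lemmas, assign grammatical roles, order them,
    and make the corresponding word forms available."""
    roles = {"SUBJECT": message.agent, "VERB": message.action, "OBJECT": message.theme}
    for lemma in roles.values():
        if lemma not in LEXICON:
            raise KeyError(f"no lemma found for concept {lemma!r}")
    # Ordered string: determiner + subject, verb, determiner + object.
    return [
        "the", LEXICON[roles["SUBJECT"]]["singular"],
        LEXICON[roles["VERB"]][message.time],
        "the", LEXICON[roles["OBJECT"]]["singular"],
    ]


def articulate(surface):
    """Articulator: here simply renders the ordered word forms as a sentence."""
    words = " ".join(surface)
    return words[0].upper() + words[1:] + "."


def speak(intention):
    # Strictly left to right: no stage feeds information back to an earlier one.
    return articulate(formulate(conceptualize(intention)))


print(speak({"action": "catch", "agent": "dog", "theme": "ball", "time": "past"}))
```

Each function consumes only the output of the previous one, which mirrors the incremental, one-directional character of the model; the two monitoring loops are deliberately left out of the sketch.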
One of the latest comprehensive models of bilingual speech production was proposed by
Kormos (2006), which is also based on Levelt’s (1999) monolingual framework. Based on
memory research, this model highlights the role of knowledge stores which are shared between
L1 and L2 with an additional L2 store: the declarative knowledge of syntactic and
phonological rules. At the conceptual level, a language cue is added to each concept separately,
allowing for code-switching at later stages. The model also accounts for the use of
formulaic expressions (apologizing, requesting, etc.) and these are activated as single chunks
at the conceptual level. Activation of the conceptual chunks is spread to the corresponding
linguistic chunks. In the formulator, lemmas from both languages are activated and they
compete for selection. Both syntactic and phonological encoding allow for the cascading of
activation; however, backward flow between the levels is not assumed.
In all bilingual versions of the speaking model, code-switching, the alternate use of two or
more languages in the same utterance or conversation (Auer, 2005; Poplack, 1980; Stavans &
Swisher, 2006), is addressed. A well-known model accounting for code-switching is
Myers-Scotton’s (2002) Matrix Language Frame model, which is primarily aimed at analyzing different
types of code-switching (i.e., intrasentential and insertional code-switching). She argues that, as
in the Speaking model, a number of syntactic frames are activated in bilingual production. A
crucial characteristic of her model is the selection of a so-called matrix language: the language
that provides the frame for code-switched utterances and to which the speaker typically
returns. Elements from the other language are inserted into the dominant/matrix language
according to the speaker’s proficiency. The model proposes that the matrix language can be
identified by calculating the proportion of lexical elements from each of the languages used in
the code-switching setting (the most frequently used language is the matrix language).
Myers-Scotton’s ideas are only concerned with code-switching and provide no additional insights into
bilingual production beyond the Speaking model.
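The frequency criterion described above can be made concrete with a small calculation. The sketch below is an illustration only, assuming that every lexical element has already been tagged by hand with a language label; the function name, the tagging scheme, and the example utterance are hypothetical and are not taken from Myers-Scotton's work.

```python
from collections import Counter


def matrix_language(tagged_tokens):
    """Return (language, proportion) for the language that contributes the
    largest share of lexical elements in a code-switched stretch of speech.

    tagged_tokens is a list of (word, language) pairs. Ties are not resolved
    in any principled way: the language encountered first wins.
    """
    counts = Counter(language for _, language in tagged_tokens)
    if not counts:
        raise ValueError("no tokens to analyze")
    language, n = counts.most_common(1)[0]
    return language, n / sum(counts.values())


# Invented English-Spanish example with one inserted Spanish noun.
utterance = [
    ("I", "en"), ("will", "en"), ("go", "en"), ("to", "en"),
    ("the", "en"), ("tienda", "es"), ("tomorrow", "en"),
]

language, share = matrix_language(utterance)
print(language, round(share, 2))
```

A count of this kind captures only the proportional criterion mentioned in the text; it says nothing about which language supplies the syntactic frame.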
As we have seen, lexical items from both languages are activated in bilingual production
due to spreading activation. For successful communication, bilinguals have to inhibit elements
from the non-target language to avoid interference. In bilingual production, research
into this process has been linked with language dominance and accounted for by the
Inhibitory Control Model (ICM) proposed by Green (1986, 1993, 1998), which explains
switching cost, that is, the cost of reactivating a language after inhibition. The model postulates
that language dominance governs the amount of inhibition directed at the non-target language:
the stronger language (L1) has to be suppressed by a greater magnitude of inhibition
than the weaker language (L2). As a consequence, the reactivation of the language depends
on the strength of the inhibition and the reactivation cost (i.e., switching cost) will be larger
for the L1. The model was later expanded to account for symmetrical switching costs
(Schwieter & Sunderman, 2008; Fink & Goldrick, 2015) when the dominance of the two
languages is similar.
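The asymmetry predicted by the ICM follows from a simple chain of reasoning, which can be sketched numerically. The code below is a toy illustration, not Green's formalization: it assumes that inhibition is directly proportional to a language's strength and that reactivation cost grows linearly with inhibition. The strength values, the function name, and the scaling constant are arbitrary assumptions.

```python
def switch_costs(strengths, ms_per_unit=100.0):
    """Return the cost of switching INTO each language.

    strengths maps a language label to its relative strength (dominance).
    While a language is the non-target language it is inhibited in proportion
    to its strength, and overcoming that inhibition is what a switch back
    into the language costs.
    """
    costs = {}
    for language, strength in strengths.items():
        inhibition = strength  # assumption: stronger language, more inhibition
        costs[language] = ms_per_unit * inhibition
    return costs


# Unbalanced bilingual: switching back into the dominant L1 is costlier.
unbalanced = switch_costs({"L1": 1.0, "L2": 0.6})
print(unbalanced["L1"] > unbalanced["L2"])

# Balanced bilingual: similar dominance yields symmetrical switching costs.
balanced = switch_costs({"L1": 0.8, "L2": 0.8})
print(balanced["L1"] == balanced["L2"])
```

The linear relation is chosen only for simplicity; any monotonically increasing relation between dominance and inhibition would give the same qualitative pattern.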
several factors have been identified to play a role in the process, which are now suggested to
be used in monolingual production models too [e.g., language selection in bilinguals is
compared to selection between registers in monolinguals (La Heij, 2005)]. The main questions
of bilingual speech production concern language selection, the locus of language selection
in the planning process, and factors that influence bilingual production. These issues
will be briefly addressed in the following.
One of the questions that has led to considerable research is whether language production
is selective or non-selective (Colomé, 2001; Costa et al., 1999; Grosjean, 2013; Hermans et al.,
1998; Jared & Kroll, 2001). Language selection seems to be the most prominent question of
bilingual speech production models (these models explain lexical selection) and two general
alternatives have been proposed: language-selective models claim that bilinguals are able to
speak one language alone and prevent or ignore the activation of lexical items from the other
language, while competition-for-selection models assume that candidates from both languages
compete for selection. In the early days of research on the bilingual lexicon, it was assumed
that there was an input switch and an output switch which allowed for the use of either
language at will productively and perceptually (Macnamara & Kushnir, 1971). The idea was
that there is a mechanism that acts as a sieve, allowing only elements from one language to
pass through. Similarly, in production only elements from one language are selected and used.
Although the question is still a source of inspiration for many researchers, Kroll et al. note
that “bilinguals cannot switch off one of the two languages at will. When they listen to
speech, read, or prepare to speak in only one of their two languages, information about the
language not in use is also active and influences performance” (2012, pp. 231–232).
According to La Heij (2005), a supporter of the language selective view, language selection
in bilingual production is either viewed as “complex access, simple selection,” or “simple ac-
cess, complex selection.” He (along with Poulisse & Bongaerts, 1994) claims that “Access is
complex in the sense that the preverbal message contains all the relevant information, including
the intended language” and “Lexical selection is a simple, local process that is only based on
the activation levels of words” (2005, p. 302). This suggests that language selection happens
early in the speech planning process, more specifically at the conceptual level. Furthermore, in
bilinguals the outcome of conceptualization is similar to the content of the preverbal message
of the monolingual speaker as modelled by Levelt, with the difference that besides information
about which register to use, the language (L1 or L2) is also selected (de Bot, 1992; La Heij,
2005). Due to the co-activation of semantic neighbours, it is “reasonable to assume that during
lexical access words that also appear in the nonresponse language are activated to some extent”
(La Heij, 2005, p. 301). Other language selective models assume a more complex selection
process, for example, Costa et al. (1999) assume the presence of language tags in lexical access
indicating whether a word belongs to L1 or L2.
Language tags also feature as components of competition for selection models, where they are
accompanied by an inhibition mechanism. Green’s Inhibitory Control Model (ICM, 1986,
1993, 1998) assumes that each individual lexical representation has a language tag (L1 or L2)
and non-target lexical nodes can be suppressed in a particular communicative context.
According to the ICM, selection is mediated via inhibitory processes at the lemma level
(contrary to La Heij’s suggestion) and the amount of inhibition is proportional to the acti-
vation level of the non-target language items (Finkbeiner et al., 2006). L2, in general, receives
less inhibition because it is usually less highly activated than the L1. Empirical evidence
supporting the ICM comes from studies investigating switching cost. It has been found that
switching cost varies according to proficiency: less-proficient bilinguals experience greater
switching cost in the L2–L1 direction (Meuter & Allport, 1999), while switching cost is symmetric
for highly proficient and close to balanced bilinguals (Costa & Santesteban, 2004). Hermans
et al. (1998), in two picture–word interference tasks, demonstrated that bilinguals cannot
prevent interference from their L1 at the initial stages of lexical access in their L2: the L1
lemma also becomes activated.
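The logic behind the asymmetric switching cost can be made concrete with a deliberately simplified sketch. It is not an implementation of the ICM: the activation values, the `gain` parameter, and the assumption that the reactivation cost simply equals the inhibition previously applied are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Language:
    name: str
    activation: float  # assumed resting activation; higher = more dominant


def inhibition_needed(non_target: Language, gain: float = 1.0) -> float:
    """Inhibition applied to the non-target language, proportional to its activation."""
    return gain * non_target.activation


def switch_cost(switch_into: Language, gain: float = 1.0) -> float:
    """Toy cost of switching INTO a language that has just been suppressed.

    While the other language was in use, `switch_into` was the non-target
    language and was inhibited in proportion to its activation. As a
    simplifying assumption, the reactivation cost equals that inhibition.
    """
    return inhibition_needed(switch_into, gain)


# Hypothetical unbalanced bilingual: dominant L1, weaker L2.
l1 = Language("L1", activation=0.9)
l2 = Language("L2", activation=0.4)

cost_into_l1 = switch_cost(l1)  # L2 -> L1 switch
cost_into_l2 = switch_cost(l2)  # L1 -> L2 switch
asymmetry = cost_into_l1 - cost_into_l2  # positive: returning to L1 costs more

# Hypothetical balanced bilingual: equal activation, hence no asymmetry.
balanced_asymmetry = switch_cost(Language("L1", 0.8)) - switch_cost(Language("L2", 0.8))
```

Under these assumptions the cost of returning to the dominant L1 exceeds the cost of switching into the L2, and the asymmetry disappears when the two activation levels are equal, which mirrors the pattern reported for unbalanced and close to balanced bilinguals.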
As opposed to previous views on the locus of language selection [at the conceptual level
(La Heij, 2005), at the lexical level (Costa et al., 1999), and at the semantic level (Green,
1998)], Kroll et al. (2006) suggest that parallel activation of both languages can persist into
the execution of speech, which makes the system “generally nonselective and open to these
cross-language interactions” (p. 127). They acknowledge that “although there are circum-
stances that allow bilinguals to plan spoken utterances exclusively in one language without
the influence of the other language, those circumstances are the exception, not the rule,
particularly when speaking the L2” (p. 127).
Several factors have been proposed to influence the activation level of the languages and
the place of language selection in the bilingual speech production process. Grosjean (2001)
introduced the language mode concept which relates to the context of language use and
proposes two endpoints of the continuum: the monolingual mode where one language is
active and used and the bilingual mode in which both languages are highly active. He
proposed that the state of activation of the bilingual’s languages depends on where bilinguals find
themselves on the continuum (Grosjean, 2013). Numerous factors determine movement on
the continuum and thus the activation of languages (e.g., languages involved, interlocutors,
topic, stimuli, and experimental task). Some studies have attempted to experimentally vali-
date the language mode model but arrived at mixed results (Jared & Kroll, 2001; Van Hell &
Dijkstra, 2002; Navracsics, 2004).
Kroll et al. (2006) identified several factors that can influence the locus of language selection,
though the empirical evidence is still scarce and needs further research. These factors are the
following: language proficiency, language dominance, context of acquisition, processing demands
associated with the (experimental) task, the nature of concepts to be expressed, and activation of
the two languages. In addition to these, qualitative studies add such affective factors as emotions.
Navracsics (2014) conducted a longitudinal process study with English–Hungarian–Persian chil-
dren and found that CLI was often caused by the least frequently used language (Persian), which
was also the language of emotions in parent–child communication.
Bilingual production models rely on existing evidence in the domains of lexical processing
and sentence processing. Most studies on production focus on the former, while research on
the latter is scarcer and often yields contradictory results. Hartsuiker and Pickering (2007)
review the evidence from bilingual sentence production studies to test the predictions of three
models: de Bot’s bilingual production model (1992) (strong and weak version), Ullman’s
declarative/procedural model (2001), and Hartsuiker et al.’s integrated model (2004). The
general question of the study is “To what extent are processes used in sentence production
integrated between the different languages of a bilingual and to what extent are they kept
separate?” (p. 479). All three models (we consider the weak version of de Bot’s model)
propose CLI but there is no consensus on the determining factors. de Bot assumes that the
degree of interaction could be a function of linguistic distance and between-language effects
should be stronger with closely related languages. Ullman is unclear about the effect of
linguistic distance but proposes a positive relationship between proficiency and the extent of
CLI which contradicts de Bot’s assumption.
The authors review recent behavioural evidence from language production experiments,
more specifically on conceptual number effects, syntactic transfer, syntactic priming (across
languages, strength of within- and between-language, the effect of linguistic distance and
proficiency), and they conclude that findings support the predictions of Hartsuiker et al.’s
model that neither proficiency nor linguistic distance has an effect on CLI and there is robust
between-language priming.
Code-Switching
One of the most prominent features of bilingual production is code-switching (CS), the
change of language during speaking. The most frequent type is switching between utterances,
but it happens at all linguistic levels (phonological, morphosyntactic, and lexical-semantic), and
between sign languages and even sign and non-sign languages (Meier et al., 2002). Code-switching
has been studied extensively and there is a large body of publications on the subject. Current
code-switching research is interested in the role of cognates in code-switching, in the cognitive
mechanisms involved (in particular, whether switching between languages takes time
and effort), and in switching between modalities.
Broersma and colleagues (Broersma & de Bot, 2006; Broersma et al., 2020) carried out a
number of studies on triggered code-switching – code-switching facilitated by the occurrence
of cognates – and found a strong effect of cognates in conversations from a large corpus of
Welsh-English conversational speech. The data showed that producing cognates facilitated
code-switching, and that speakers who use more cognates tend to switch more. Interestingly,
hearing rather than producing cognates did not facilitate code-switching. In terms of a
production model, this suggests that lexical activation can have an impact on language
choice and vice-versa.
Mosca and de Bot (2017) studied Dutch-English bilinguals with small differences in
dominance and found that while in recognition, the switching cost was associated with
language dominance, in production, no such pattern was found. On the contrary, they found
a paradoxical language effect (faster responses in the L2 than in the L1) in production.
They concluded that “language control is a much more flexible mechanism than previously
believed and that because of its malleable nature it is difficult to circumscribe it within a
specific model” (p. 16).
Declerck et al. (2019) looked at control mechanisms in code-switching between registers
(formal/informal speech in French) versus code-switching between languages (French and
English). Similar switching costs were found for register/language switching. Making the cue-
to-stimulus interval longer led to a reduction of switching costs for the language switches but
not for the register switches. This suggests that the control mechanisms for the two types of
switching were not completely identical but partially shared.
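In switching experiments of this kind, the switching cost is conventionally computed as the difference between mean response times on switch trials and on repeat trials. The sketch below shows that arithmetic only. The function name and the reaction times are hypothetical and invented for illustration; they are not data from Declerck et al. (2019) or any other study.

```python
from statistics import mean


def switch_cost_ms(trials):
    """Return mean RT on switch trials minus mean RT on repeat trials.

    `trials` is a list of (trial_type, rt_ms) pairs, where trial_type is
    either "switch" or "repeat".
    """
    switch = [rt for kind, rt in trials if kind == "switch"]
    repeat = [rt for kind, rt in trials if kind == "repeat"]
    if not switch or not repeat:
        raise ValueError("need at least one switch trial and one repeat trial")
    return mean(switch) - mean(repeat)


# Invented reaction times (ms) for a short and a long cue-to-stimulus interval.
short_interval = [("repeat", 700), ("switch", 780), ("repeat", 720), ("switch", 800)]
long_interval = [("repeat", 690), ("switch", 720), ("repeat", 710), ("switch", 740)]

cost_short = switch_cost_ms(short_interval)  # 790 - 710
cost_long = switch_cost_ms(long_interval)    # 730 - 700
```

With these made-up values, a longer preparation interval yields a smaller cost, which is the kind of reduction described above for language switches.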
There is a growing literature on switching between modalities (i.e., sign/non-sign; Tang &
Sze, 2018). Bimodal bilinguals are individuals proficient in both spoken and signed language(s).
Code-blending, the production of spoken and signed elements simultaneously, is more frequent
in this community than code-switching and it has attracted more research interest (see
Emmorey et al., 2008). Research in this domain is still scarce but it seems that the cognitive
control exercised by unimodal bilinguals (Bialystok et al., 2009) does not occur for bimodal
bilinguals since the spoken and the signed language are fully or partially active when they are
producing speech, that is, no inhibition occurs (Pichler et al., 2019; Emmorey et al., 2008).
However, CLI has been observed in several studies (Emmorey et al., 2008; Morford et al.,
2011), which calls into question the assumption that CLI is the consequence of speaking two
phonologically similar languages.
6 Future Directions
As is clear from the earlier discussion of bilingual speaking models and questions
related to their components and mechanisms, there are countless directions for future research.
There is much more to be done in terms of research and understanding of what factors
influence the activation level of the languages and the place of language selection in the
bilingual speech production process. There is a dire need to conduct more empirical work to
find out the role of factors in language activation identified by Grosjean (2013) (e.g., languages
involved, interlocutors, topic, stimuli, and experimental task) and Kroll et al. (2006) (lan-
guage proficiency, dominance, context of acquisition, processing demands, etc.). In addition
to these factors, models of bilingual speech production could benefit immensely from the
integration of the role of individual differences. Research in this area is very limited and
apart from WM and anxiety, none of the IDs could be associated directly with any levels or
mechanisms.
This chapter has relied heavily on lexical processing, as most research on bilingual production
has focused on this level and there is considerably less work on bilingual syntactic processing.
Further research is needed to understand sentence production in a second language
and to see to what extent syntactic representations and processes are shared in bilingual
production (e.g., in the case of translators and simultaneous interpreters).
A promising direction could be the role of gestures, which has become a research field in
itself, and there is a growing awareness that non-verbal parts of language use are at least as
important as verbal ones. In the original blueprint, only verbal and linguistic information
was included. Through the work of De Ruyter (2000), non-verbal aspects were included
too. He argues for the addition of a gestuary, the collection of gestures a speaker uses.
There is no simple one-to-one match between certain gestures and meaning. Nor is there a
grammar of gesture use. Modelling gestures is very complex and the transcription of gestures
is tedious and labour intensive. At the same time, it is obvious that non-verbal behaviour is
an essential part of language production and that gestures have an impact on meaning
conveyance. Gullberg (2012) has suggested a link between gestures and intonation, because
both extend over longer parts of utterances and they carry meaning by themselves, often
related to the verbal content of the sentence. The role of gestures and other non-verbal
information in language production is still to be studied.
Further Reading
Grosjean, F. (2013). Speech production. In F. Grosjean & P. Li (Eds.), The psycholinguistics of
bilingualism (pp. 50–69). Malden, MA & Oxford: Wiley-Blackwell.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah: Lawrence Erlbaum.
La Heij, W. (2005). Selection processes in monolingual and bilingual lexical access. In J. F. Kroll &
A. de Groot (Eds.), Handbook of bilingualism (pp. 289–307). New York: Oxford University Press.
References
Auer, P. (2005). A postscript: Code-switching and social identity. Journal of Pragmatics, 37(3), 403–410.
Bialystok, E., Craik, F. I. M., Green, D. W., & Gollan, T. H. (2009). Bilingual minds. Psychological
Science in the Public Interest, 10(3), 89–129.
Bobb, S., & Wodniecka, Z. (2013). Language switching in picture naming: What asymmetric switch
costs (do not) tell us about inhibition in bilingual speech planning. Journal of Cognitive Psychology,
25, 568–585.
Broersma, M. & de Bot, K. (2006). Triggered codeswitching: A corpus-based evaluation of the original
triggering hypothesis and a new alternative. Bilingualism: Language and Cognition, 9, 1–13.
Broersma, M., Carter, D., Donnelly, K., & Konopka, A. (2020). Triggered codeswitching: Lexical
processing and conversational dynamics. Bilingualism: Language and Cognition, 23(2), 295–308.
Colomé, A. (2001). Lexical activation in bilinguals’ speech production: Language-specific or language-
independent? Journal of Memory and Language, 45, 721–736.
Costa, A., & Santesteban, M. (2004). Lexical access in bilingual speech production: Evidence from
language switching in highly proficient bilinguals and L2 learners. Journal of Memory and Language,
50, 491–511.
Costa, A., Miozzo, M., & Caramazza, A. (1999). Lexical selection in bilinguals: Do words in the
bilingual’s two lexicons compete for selection? Journal of Memory and Language, 41, 365–397.
de Bot, K. (1992). A bilingual production model: Levelt’s speaking model adapted. Applied Linguistics,
13, 1–24.
de Bot, K. (2004). The multilingual lexicon: Modeling selection and control. The International Journal
of Multilingualism, 1(1), 17–32.
de Bot, K., Lowie, W. M., & Verspoor, M. H. (2007). A dynamic systems theory approach to second
language acquisition. Bilingualism: Language and Cognition, 10(1), 7–21. doi: 10.1017/S136672
8906002732
de Bot, K., & Schreuder, R. (1993). Word production and the bilingual lexicon. In R. Schreuder &
B. Weltens (Eds.), The bilingual lexicon (pp. 191–214). Amsterdam: Benjamins.
De Ruyter, J. P. (2000). The production of gesture and speech. In D. McNeill (Ed.), Language and
gesture (pp. 284–311). Cambridge: Cambridge University Press.
Declerck, M., Ivanova, I., Grainger, J., & Duñabeitia, J. A. (2019). Are similar control processes im-
plemented during single and dual language production? Evidence from switching between speech
registers and languages. Bilingualism: Language and Cognition, 23(3), 694–701. doi: 10.1017/S136672
8919000695
Dörnyei, Z. (2009). The psychology of second language acquisition. Oxford: Oxford University Press.
Emmorey, K., Borinstein, H. B., Thompson, R. & Gollan, T. H. (2008). Bimodal bilingualism.
Bilingualism: Language and Cognition, 11(1), 43–61.
Fink, A., & Goldrick, M. (2015). Pervasive benefits of preparation in language switching. Psychonomic
Bulletin & Review, 22, 808–814.
Finkbeiner, M., Almeida, J., Janssen, N., & Caramazza, A. (2006). Lexical selection in bilingual speech
production does not involve language suppression. Journal of Experimental Psychology: Learning,
Memory, and Cognition, 32, 1075–1089.
Green, D. (1986). Control, activation and resource. Brain and Language, 27(2), 210–223.
Green, D. (1993). Towards a model of L2 comprehension and production. In R. Schreuder &
B. Weltens (Eds.), The bilingual lexicon (pp. 249–277). Amsterdam: Benjamins.
Green, D. (1998). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and
Cognition, 1, 67–81.
Grosjean, F. (2001). The bilingual’s language modes. In J. L. Nicol (Ed.), One mind, two languages:
Bilingual language processing (pp. 1–22). Oxford, England: Blackwell.
Grosjean, F. (2008). Studying bilinguals. Oxford: Oxford University Press.
Grosjean, F. (2010). Bilingual: Life and reality. Cambridge, MA: Harvard University Press.
Grosjean, F. (2013). Speech production. In F. Grosjean & P. Li (Eds.), The psycholinguistics of bi-
lingualism (pp. 50–69). Malden, MA & Oxford: Wiley-Blackwell.
Gullberg, M. (2012). Bilingualism and gesture. In T. K. Bhatia & W. C. Ritchie (Eds.), The handbook of
bilingualism and multilingualism (2nd edn, pp. 417–437). Malden, MA: Wiley-Blackwell.
Hartsuiker, R., & Pickering, M. (2007). Language integration in bilingual sentence production. Acta
Psychologica, 128(3), 479–489.
Hartsuiker, R., Pickering, M., & Veltkamp, E. (2004). Is syntax separate or shared between languages?
Cross-linguistic syntactic priming in Spanish-English bilinguals. Psychological Science, 15(6), 409–414.
Hermans, D., Bongaerts, T., de Bot, K., & Schreuder, R. (1998). Producing words in a foreign lan-
guage: Can speakers prevent interference from their first language? Bilingualism: Language and
Cognition, 1, 213–229.
Horwitz, E. K., Horwitz, M. B., & Cope, J. A. (1986). Foreign language classroom anxiety. Modern
Language Journal, 70(2), 125–132.
Hoshino, N., & Kroll, J. F. (2008). Cognate effects in picture naming: Does cross-language activation
survive a change of script? Cognition, 106, 501–511.
Indefrey, P. (2007). Brain imaging studies of language production. In G. Gaskell (Ed.), Oxford handbook
of psycholinguistics (pp. 547–564). Oxford: Oxford University Press.
Indefrey, P., & Levelt, W. J. M. (2004). The spatial and temporal signatures of word production
components. Cognition, 92, 101–144.
Jared, D., & Kroll, J. F. (2001). Do bilinguals activate phonological representations in one or both of
their languages when naming words? Journal of Memory and Language, 44, 2–31.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Erlbaum.
Kormos, J. (2015). Individual differences in second language speech production. In J. W. Schwieter
(Ed.), The Cambridge handbook of bilingual processing (pp. 369–388). Cambridge: Cambridge
University Press.
Kroll, J. F., Dussias, P. E., Bogulski, C. A., & Valdes-Kroff, J. (2012). Juggling two languages in one
mind: What bilinguals tell us about language processing and its consequences for cognition. In B.
Ross (Ed.), The psychology of learning and motivation (Vol. 56, pp. 229–262). San Diego: Academic
Press.
Kroll, J. F., Gerfen, C. & Dussias, P. E. (2008). Laboratory designs and paradigms: Words, sounds,
and sentences. In L. Wei & M. G. Moyer (Eds.), The Blackwell guide to research methods in bi-
lingualism and multilingualism (pp. 108–131). Oxford: Blackwell Publishing Ltd.
Kroll, J., Bobb, S., & Wodniecka, Z. (2006). Language selectivity is the exception, not the rule:
Arguments against a fixed locus of language selection in bilingual speech. Bilingualism: Language
and Cognition, 9(2), 119–135.
La Heij, W. (2005). Selection processes in monolingual and bilingual lexical access. In J. F. Kroll &
A. M. B. de Groot (Eds.), Handbook of bilingualism: Psycholinguistic approaches (pp. 289–307).
New York: Oxford University Press.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Levelt, W. J. M. (1992). Accessing words in speech production: Stages, processes and representations.
Cognition, 42, 1–22.
Levelt, W. J. M. (1993). Language use in normal speakers and its disorders. In G. Blanken, J.
Dittmann, H. Grimm, J. C. Marshall & C-W. Wallesch (Eds.), Linguistic disorders and pathologies
(pp. 1–15). Berlin: de Gruyter.
Levelt, W. J. M. (1995). The ability to speak: From intentions to spoken words. European Review,
3, 13–23.
Levelt, W. J. M. (1999). Language production: A blueprint of the speaker. In C. Brown & P. Hagoort
(Eds.), Neurocognition of language (pp. 83–122). Oxford, England: Oxford University Press.
Linck, J. A., Osthus, P., Koeth, J. T. & Bunting, M. F. (2014). Working memory and second language
comprehension and production: A meta-analysis. Psychonomic Bulletin & Review, 21(4), 861–883.
Macnamara, J., & Kushnir, S. L. (1971). Linguistic independence of bilinguals: The input switch.
Journal of Verbal Learning and Verbal Behavior, 10(5), 480–487.
Meier, R., Cormier, K., & Quinto-Pozos, D. (Eds.). (2002). Modality and structure in signed and spoken
languages. Cambridge: Cambridge University Press.
Meuter, R. F. I., & Allport, A. (1999). Bilingual language switching in naming: Asymmetrical costs of
language selection. Journal of Memory and Language, 40, 25–40.
Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000). The
unity and diversity of executive functions and their contributions to complex “frontal lobe” tasks: A
latent variable analysis. Cognitive Psychology, 41, 49–100.
Morford, J. P., Wilkinson, E., Villwock, A., Piñar, P., & Kroll, J. F. (2011). When deaf signers read
English: Do written words activate their sign translations? Cognition, 118(2), 286–292.
Mosca, M., & de Bot, K. (2017). Bilingual language switching: Production vs. recognition. Frontiers in
Psychology, 8, 934. doi: 10.3389/fpsyg.2017.00934
Myers-Scotton, C. (2002). Contact linguistics: Bilingual encounters and grammatical outcomes.
Oxford: Oxford University Press.
Navracsics, J. (2004). The question of control in bilingual speech production in different language
modes. Grazer Linguistische Studien, 62, 95–111.
Navracsics, J. (2014). Input or intimacy. Studies in Second Language Learning and Teaching, 4(3),
485–506.
Pichler, C. D., Reynolds, W., & Palmer, L. J. (2019). Multilingualism in signing communities. In S.
Montanari & S. Quay (Eds.), Multidisciplinary perspectives on multilingualism (pp. 175–202). Berlin,
Boston: De Gruyter Mouton.
Poplack, S. (1980). ‘Sometimes I’ll start a sentence in English y termino en español’: Toward a typology
2
PSYCHOLINGUISTIC PROCESSES
IN L2 ORAL PRODUCTION
Daphnée Simard
DOI: 10.4324/9781003022497-4
1 Introduction/Definitions
Oral production (OP), defined here as the oral expression of language encompassing any form
of language production, from the most spontaneous forms, which occur during informal discussions,
to entirely planned ones, such as lectures or reading papers out loud, has often been examined
from the point of view of communicative skills in second language (L2; first language, L1)
research (e.g., Bygate, 2008). In such cases, it is commonly referred to as speaking. It has also
been researched as a psycholinguistic process, which is the focus of this chapter: it offers a
description of the linguistic and cognitive processes involved in L2 OP. Generally speaking, OP
entails that L2 speakers mobilize simultaneously and in real time their cognitive resources
along with their linguistic knowledge (pragmatic, semantic, morphosyntactic, and phonological),
mastered to varying degrees. A distinction will be made between psycholinguistic processes,
that is, the language processing mechanisms such as grammatical and phonological
encodings, and cognitive resources, specifically, memory and attention.
Since the psycholinguistic processes underlying OP, whether in L1 or L2, are considered to
be largely the same, despite some differences (e.g., Kormos, 2006, 2011), the discussion will
focus on OP in general psycholinguistic terms, beginning with Levelt’s speech production model
(e.g., 1989, 1999a, 2000, 2001), believed to represent a “consensus view of the linguistic,
psycholinguistic, and cognitive issues underlying the act of speaking” (Segalowitz, 2010, p. 8).
Levelt’s model, which was originally formulated for L1 production, has been widely used in L2
OP research (e.g., Dörnyei & Kormos, 1998; Izumi, 2003; Kormos, 1999a, 2006; de Bot &
Bátyi, this volume). Additionally, the roles played by working memory and attention in the
model are addressed, as they constitute an important source of variation observed between L1
and L2 performance. Studies examining the mediating role of working memory and attention in
the psycholinguistic processes involved in L2 OP are then addressed. Finally, future directions
for research and classroom implications are presented.
2 Historical Perspectives
Over the past 40 years, several models have been proposed to explain how language is
produced. Two broad types of models exist: modular and non-modular, with modularity referring
to encapsulated, specialized, and independent (i.e., that do not interact amongst themselves)
modules (Fodor, 1983). On the one hand, modular models put forth the idea of existing
modules through which OP proceeds (e.g., Garrett, 1984; Levelt, 1989, 1993; Levelt et al.,
1999). In this context, modules are considered to be functionally driven domain-specific pro-
cessors which operate on linguistic information (e.g., semantic, lexical, and phonetic). On the
other hand, non-modular models describe OP as a process during which linguistic information
is activated at different levels simultaneously (Dell, 1986; Trueswell et al., 1994).
According to Kormos (2006), to date, the most detailed account of OP and arguably the
most influential model is Levelt’s Blueprint of the Speaker (e.g., 1989, 1999). Levelt’s
modular model describes the psychological processing components in operation during
production and comprehension; in his view, they are closely intertwined, as speakers do both
when interacting orally (Levelt, 2000, p. 154). Therefore, the production and comprehension
systems will be addressed in the presentation of the model.
Production System
In all versions of the model, the first stage of OP is the conceptualization of what will be said.
However, in a more recent version, this first stage is fed by the speaker’s intention of
communication (Levelt, 2000), rather than being part of the conceptualizer itself (e.g., Levelt,
1989). Therefore, the first step of message conceptualization is to consider the speaker’s
intentions for communicating. To do so, speakers rely on their knowledge of discourse models
and Theory of Mind, which allows for the creation of “complex knowledge of structures of
social environment” (Levelt, 1999a, p. 89). Next, the conceptual generation of the message
takes place through two processes, that is, macroplanning and microplanning (Levelt,
1999a). Macroplanning is the process by which the speaker decides what to say next by
managing the discourse focus, that is, directing attention to the object of the production, and
shifting from one object to another, as the production evolves. Next, through microplanning,
the speaker determines which concepts to include in the emerging utterance, and how to
spatially and temporally represent them. The selection of lexical concepts in the mental
lexicon is referred to as perspective taking (Levelt, 1996). In the case of more complex
communicative intents, such as a narration, decisions about the order of events must be
made. This process is called linearization (Levelt, 1996). Citing Slobin’s (1987) work on
“Thinking-for-Speaking,” Levelt (1999a) states that microplanning processes are language
dependent, contrary to macroplanning processes, which are language independent (p. 94).
As they become available, segments of the propositional form of the message, also called
preverbal plans, are passed onto the formulator, for lexically driven encoding. The lexical
information has to be retrieved from the speaker’s mental lexicon – a structured network in
which lexical information is stored – originally in the form of lemmas and lexemes. Lemmas
were first defined by Kempen and Hoenkamp (1987) as the smallest units of grammatical
encoding containing semantic and syntactic properties of lexical items. In more recent ver-
sions, Levelt (1999a, 1999b; Levelt et al., 1999) attributes to lemmas only the syntactic
properties of lexical units, since an additional conceptual level has been added in the mental
lexicon (see Figure 2.1). Therefore, after having selected concepts in the mental lexicon for
[Figure 2.1 Adapted version of Levelt’s model (1983, 1989, 1999a, 2000). Labels recoverable from the figure: Conceptualizer (conceptual generation of speech: macroplanning, microplanning; monitoring; pragmatic & discourse processing); Formulator (grammatical encoding, surface structure, morphophonological encoding, phonological score (internal speech), phonetic encoding, articulatory score (phonetic plan)); Mental Lexicon (concepts, lemmas, lexemes); Syllabary; Parser (morphophonological decoding & word recognition, phonetic decoding, prelexical representation); Speech; Overt speech.]
the preverbal plan, according to the intended message, the syntactic properties of the lemmas
associated with the concepts are activated in the form of syntactic procedures. For instance,
if a noun is selected, a noun phrase will be initiated. This process corresponds to
grammatical encoding and generates the surface structure, that is, a sequence of lemmas
organized into phrases (Levelt, 1989, p. 11). Then, this surface structure is further processed
through phonological encoding by accessing the information stored in the lexemes (i.e.,
morphological and phonological properties of lexical items) and transformed into a phonological
score, that is, the syllabified and prosodified words (and groups of words) (Levelt,
2001). The phonological score can remain in the form of internal speech or be transmitted
further in the production process to be articulated (Levelt, 1999a; Levelt et al., 1999).
A word about internal speech is necessary here. Although its nature is not entirely un-
derstood, it is thought to be phonological (Jackendoff, 1987; Levelt, 2000). This is the po-
sition taken by Levelt in his latest work (Levelt, 2000). In earlier versions of the model, it was
believed to be the result of phonetic encoding (e.g., Levelt, 1989). In any case, according to
him “whatever it is, we can attend to it and parse it just as we parse what is said to us by
others” (Levelt, 2000, p. 156).
The last phase of phonological encoding is phonetic encoding, which corresponds to the
retrieval, from the mental syllabary, of “articulatory gestural scores” or “motor instructions,”
allowing for their articulation (Levelt & Wheeldon, 1994). This mental syllabary, which is
believed to contain articulatory information for all syllables in a given language, accounts for
the various pronunciations of a given lexical item, depending on its use (e.g., Levelt, 1992,
1995, 1999a; Levelt & Wheeldon, 1994). The output of phonetic encoding is called the
articulatory score. The articulatory score, also known as the phonetic plan, is turned into actual
speech in the articulator. Articulation is the last phase of speech production in Levelt’s model
(1989), during which phonological and phonetic processes translate the emerging message into
overt speech.
Comprehension System
The comprehension system allows for the analysis of internal speech, following phonological
encoding, and overtly produced speech, after it has gone through the acoustic processor. The
information initially analyzed by the acoustic processor creates a prelexical representation
(probably built of contrastive information on vowels and consonants or even syllables; however,
still of unclear nature) to be processed by the parser (Levelt, 2000). In both these scenarios,
speech, internal or overt, ends up being analyzed by the comprehension system, more
recently called the parser (e.g., Levelt, 2000). The parser contains all the “procedures available
to a language user for understanding spoken language” (Levelt, 1983, p. 49), and has access to
the information contained in the mental lexicon. Finally, to interpret the intended message, the
listener relies on pragmatic and discourse processing located in the conceptualizer, which interacts
with grammatical decoding in the parser (Levelt, 2000). After being parsed, the message
is sent back to the conceptualizer to be self-monitored for possible feedback.
To explain self-monitoring, Levelt adopts what he calls a perceptual theory (as opposed to
a production theory) (1983, p. 46) according to which a monitor is located in the conceptualizer
and is fed from the parser to allow self-monitoring (Özdemir et al., 2007). This
self-monitoring occurs through three monitors, or feedback loops (Levelt, 1983). The first
monitor loop verifies the conformity of the preverbal plan with the intended message, the
second, also called covert monitoring, checks the conformity of the internal speech, and the
last loop verifies the overt speech produced in the articulation phase. Attention must be
deployed at each stage of production to detect mismatches between speakers’ intentions and
the production outcomes.
The manifestation of speakers’ self-monitoring of their own OPs is self-repairs (Levelt,
1983, 1989). Self-repairs correspond to a modification (or an intention to modify) of what is
perceived as being a problem in one’s speech, as observed by an interruption of speech. More
specifically, a self-repair sequence consists of a reparandum (i.e., the element being the object
of a modification), an editing phase and the repair proper (i.e., the new formulation), also
called reparatum, as depicted in Figure 2.2.
Figure 2.2 Conceptualization of self-repair adapted from Levelt (1983). Example from Simard et al. (2017). [Diagram not reproduced; recoverable label: moment of interruption.]
In the example provided in Figure 2.2, the reparandum corresponds to “the sui” and the
reparatum, to “his suitcase”, separated by an editing term represented by /euh/, which is not
necessarily present in self-repair sequences.
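The three-part structure of a self-repair sequence can be made concrete with a small data structure. The following is only an illustrative sketch: the class name `SelfRepair`, its field names, and the use of `|` to mark the moment of interruption are assumptions introduced here for illustration, not notation used by Levelt (1983) or Simard et al. (2017).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SelfRepair:
    """Hypothetical record of one self-repair sequence."""
    reparandum: str                     # the element being modified
    reparatum: str                      # the repair proper (new formulation)
    editing_term: Optional[str] = None  # e.g. "euh"; not necessarily present

    def transcript(self) -> str:
        """Return the sequence in spoken order; '|' marks the interruption."""
        parts = [self.reparandum, "|"]
        if self.editing_term is not None:
            parts.append(f"/{self.editing_term}/")
        parts.append(self.reparatum)
        return " ".join(parts)


# The example discussed above: "the sui" /euh/ "his suitcase".
example = SelfRepair(reparandum="the sui", reparatum="his suitcase", editing_term="euh")
print(example.transcript())  # the sui | /euh/ his suitcase
```

Making `editing_term` optional mirrors the point that an editing term need not occur in every self-repair sequence.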
The perceived problems that trigger self-repairs can vary. They include speech errors,
syntactic flaws, or conceptual mismatches between the original intention and the emerging message
(see Levelt, 1983, for self-repair categories; see also Postma, 2000). In this sense, self-repairs do not
always target errors, and when they do, the repair proper is not necessarily correct (Levelt, 1983).
Furthermore, Levelt (1983, 1989) identifies two broad categories of self-repairs: Covert self-repairs,
which consist of false starts, hesitations, and pauses and occur when the message is checked before
being articulated; and overt self-repairs, that is, verbalized reformulations, which occur when
learners perceive an element that they wish to change in their productions.
Levelt’s model accounts for various aspects of OP, from the formulation of communicative
intentions to self-regulation, all of which rely on working memory and attention.
to be responsible for phonological memory, that is, “the ability to recognize and remember
phonological sequences in the order in which they occur” (Baddeley et al., 1998). The visuospatial
sketchpad is responsible for the short-term storage of visuospatial information. Finally, the
episodic buffer (Baddeley, 2000) regulates short-term storage of information from the other
two subsystems and the creation of multimodal representations by binding both visuospatial
and verbal information (Baddeley, 2003). It also allows interaction between the subsystems
and long-term memory (Baddeley, 2000, 2010).
The conceptualization of the message, including the generation of its communicative intention,
exerts the greatest demand on the speaker’s working memory as:
Speakers do not have a small, fixed set of intentions that they have learned to realize
in speech. Communicative intentions can vary in infinite ways, and for each of these
ways the speaker will have to find new means of expression. This requires much
attention. (Levelt, 1989, p. 21)
In other words, conceptualization relies on controlled processes which are slower, involve
attention and are constrained by working memory limits (Shiffrin & Schneider, 1977).
Conversely, the processes involved in the formulator and articulator are highly automatized
and consequently put lower demands on working memory, explaining the speed with which
language can be produced (Levelt, 1989). Highly automatized processes operate quickly and
without conscious control (Shiffrin & Schneider, 1977).
As the aforementioned quote indicates, attention is fundamental in the Blueprint of the
Speaker and is assumed to be closely related to working memory. Indeed, in Baddeley’s
working memory model the central executive is similar to Norman and Shallice’s (1986)
supervisory attentional system (Baddeley, 1986, 1996). Levelt (1989, 1999a) uses two types of
representation when talking specifically about attention, one that emphasizes characteristics
of attention and another that focuses on the functions of attention. More specifically, attention
is considered to be selective, to shift from one object to another and to fluctuate
during OP (Levelt, 1989, p. 498). Selectivity is a characteristic of attention (e.g., Filter
Theory, Broadbent, 1958; Treisman, 1960): attention is seen as limited, selective,
and effortful. The shifting and fluctuating aspects of attention are functions: attention
shifting refers to a change in the focus of attention, whereas fluctuating is related to a change in the
intensity of attention focalization (Levelt, 1989). A few years later, Levelt (1999a) equated
attention shift with attention management, specifying that it is mainly solicited during message
planning, when speakers must shift their attentional resources from one process to
another (e.g., between macroplanning and microplanning), and that attention has to be
selective during self-monitoring. Therefore, two aspects of attention, shifting and selectivity,
are utilized during OP.
Since Levelt’s model of L1 production, researchers have compared non-native speakers’ production
with that of native speakers. This led to the adaptation of the model to look more specifically
for similarities and differences between L1 and L2 productions and to characterize in
psycholinguistic terms the various aspects of L2 OP. Similarities and differences are accounted for
to different extents in adaptations of the model for speakers using more than one language in their
OPs (e.g., de Bot, 1992; de Bot & Bátyi, this volume; Kormos, 2006, 2011; Segalowitz, 2010).
An often-cited adaptation is the Bilingual Production Model (de Bot,
1992), which shows how one language will be selected over others, among speakers with
balanced or non-balanced levels of proficiency. De Bot argues that since the knowledge of a
situation necessary to convey a message is only available in the conceptualizer, the language
used for expressing that message has to be selected during that OP phase. More specifically,
the language of a given utterance is selected during macroplanning on the basis of information
derived from the discourse model. However, it is only during microplanning that
the encoding specific to the language selected occurs (p. 8). This ensures that the preverbal
message will contain the necessary information for appropriate lexicalization in the
formulator.
The formulator is entirely language specific in the Bilingual Production Model. The
grammatical and phonological encodings specific to the selected language are triggered by
selecting lexical items in the mental lexicon subset (de Bot & Schreuder, 1993). Drawing on
the work of researchers such as Paradis (1981) and Green (1986), de Bot (1992) and de Bot
and Schreuder (1993) argued that lexical items are organized in subsets according to the
languages activated. For Poulisse and Bongaerts (1994), lexical items are tagged with information
about the language to which they belong (e.g., Green, 1986), instead of being part
of subsets. Lexical items are organized in a large network which can be partly activated.
Therefore, as soon as a language is selected in the conceptualizer, all lexical items tagged as
belonging to that language subset can be activated. This activation is mediated by word
frequency. Even though an L2 has been selected by the speaker, if the lemma chosen is more
frequently used in the L1, it might be produced in the L1. Finally, with regard to the articulator,
de Bot (1992) suggests that there is only one, independent of the number of
languages known to the speaker, as exemplified by the persistent foreign accents observed
among L2 speakers.
In her adaptation of Levelt’s model, Kormos (2006, 2011) proposed that the knowledge
necessary for L1 or L2 OP in long-term memory is divided into episodic memory, semantic
memory, and declarative memory. Episodic memory is responsible for storing temporally
organized events or episodes experienced by the speaker (Kormos, 2011, p. 41, see also
Tulving, 1972). Semantic memory contains the mental lexicon, with conceptual information,
lemmas and lexemes. Finally, the declarative memory, added by Kormos in her adaptation of
Levelt’s model, accounts for L2 syntactic and phonological rules known to the speaker that
are not proceduralized to the extent they are in L1. L2 vocabulary knowledge could also be
added to the declarative memory as defined by Kormos. In this regard, Segalowitz (2010), in his
adaptation of de Bot (1992) and Levelt (1999a) for the explanation of L2 fluency, made a
distinction between the lexicon and vocabulary. Citing Paradis’ work, he distinguished the
lexicon which corresponds to the implicit knowledge of the meaning of words, their use, and
their syntactic properties, from vocabulary knowledge that consists of the explicit knowledge
of words. Therefore, in a model for L2 OP, there should be a possible storage for vocabulary
knowledge because it is likely to be an important part of the L2 speaker’s source of
knowledge for production.
Although there seems to be agreement that Levelt’s model can be used to describe L2
production (e.g., de Bot, 1992; Poulisse, 1997; Segalowitz, 2010), several differences between
L1 and L2 productions have been suggested. Crucially, L1 OP is considered mainly subconscious
and automatized (e.g., Levelt, 1989), unlike L2 OP, which relies heavily on working
memory and attention (e.g., Dörnyei & Kormos, 1998; Kormos, 2006; Segalowitz, 2010).
The next part presents a discussion of the mediating role of working memory and attention in
the psycholinguistic processes involved in L2 OP.
Figure 2.3 Working Memory Measurement. Reprinted from Simard et al. (2020) with permission. [Diagram not reproduced; recoverable labels: central executive, visuo-spatial sketchpad, episodic buffer, phonological loop.]
recently, attentional shift and attentional control (e.g., Segalowitz, 2007; Simard & Wong,
2001; Tomlin & Villa, 1994). A plethora of instruments, such as the Trail Making Test
(Reitan, 1958), the D2 test of attention (Brickenkamp & Zillmer, 1998), and the eye-tracker
paradigm, have been used to measure attention in L2 research. This variety of tests, each
measuring one or more aspects of attention, reflects the lack of consensus that still prevails
around the nature of attention (Kormos, 2011; see Robinson, 2003, for an in-depth discussion).
That being said, the role played by working memory and attention in L2 OP has been
extensively studied. Research reveals that both the executive aspect of working memory (and,
by extension, attentional control) and the storage subsystems enhance L2 production and
comprehension success (see Linck et al., 2014, for a meta-analysis; Wen et al., 2013).
However, their role in the different psycholinguistic processes, as described in Levelt’s
model – conceptualizer, formulator, articulator, and self-monitoring – during L2 OP has
been somewhat neglected (e.g., Fortkamp, 1999; Kormos, 2006). Nevertheless, the psycholinguistic
processes involved in OP should be closely considered in regard to variation in
working memory and attention, as they are automatized in L1 but not in L2, especially
among speakers with lower levels of proficiency (de Bot, 1992; Dörnyei & Kormos, 1998;
Kormos, 2011).
Despite the difficulties in investigating each psycholinguistic process individually during
OP, some proposals have been made. In her study, Fortkamp (1999) examined the relationship
between the executive aspect of working memory, as measured by a speaking span
task (administered in Portuguese L1 and English L2), and fluency in conceptualizing a
message through a speech generation task (picture description), as well as fluency in articulation
through an oral reading task and an oral slip task (aimed at eliciting spoonerism errors, i.e.,
speech errors involving phoneme exchanges). She found that individuals with greater
working memory spans (measured in English, the participants’ L2) were better at the speech
generation task in English L2, confirming, according to the author, the reliance of the conceptualizer
on working memory. However, no association was found between either of the articulation
tasks and the speaking span test. This study’s results provide insights into the
differential impact of working memory on psycholinguistic processes, and as Levelt stated,
conceptualization seems to rely more heavily on working memory than articulation does.
Instead of focusing on tasks targeting each psycholinguistic process, one might manipulate
task conditions, to ease cognitive demand at different phases of OP (see Skehan, 2015, for a
detailed discussion). These manipulations include presence or absence of preplanning time,
presence or absence of on-line planning time and repetition of the same task (Skehan, 2015).
For instance, preplanning time, along with as much time as needed to produce language (i.e.,
on-line planning) should ease conceptualization, and consequently, allow the allocation of
more resources to formulation, which is believed to be taxing because of the grammatical and
phonological encodings that are based on partial L2 knowledge (e.g., de Bot, 1992). Conversely,
if L2 speakers are given as much time as needed to produce, but are not given preplanning
time, it is assumed that the formulator will be eased, but not the conceptualizer. Finally, one
might assume that the repetition of the same oral task would prime knowledge in semantic and
episodic memory (see Kormos, 2006, 2011), and therefore lessen demands on both working
memory and attention, facilitating grammatical and phonological encoding in the formulator.
Many L2 studies have focused exclusively on self-monitoring during L2 OP. Typically, self-monitoring
has been investigated by looking at its observable manifestation, that is, self-repairs.¹
Some L2 studies categorized self-repairs according to the classification originally proposed by
Levelt (1983), and others created their own, based on Levelt (e.g., Bange & Kern, 1996; Kormos,
1999a), making comparisons across studies difficult. However, in general, two broad categories
representing two general levels of psycholinguistic processing are present in these classifications.
Self-repairs either target (1) discourse-level elements (at the conceptualizer level) by changing
words or groups of words, to modify the informational structure of the message, or (2) form-level
elements (at the formulator level), by modifying a form perceived as inaccurate (Simard et al.,
2011; Simard et al., 2017; Zuniga, 2015; Zuniga & Simard, 2019). Interestingly, results from
factorial analyses show that self-repairs in the two categories load onto different factors,
confirming their independence in the production process (see Simard et al., 2016).
Some of the self-repair studies specifically examined the relationship of self-repairs with the
central executive aspect of working memory, the phonological loop, and various aspects of attention.
In most cases, a relationship was found between the central executive, as measured by complex
memory tasks, and the production of self-repairs: a larger working memory span was
associated with fewer self-repairs targeting choice of words or groups of words² (e.g.,
Ahmadian, 2015; Mojavezi & Ahmadian, 2013; Simard et al., 2020). One study, however, did not
find any such relationship (Georgiadou & Roehr-Brackin, 2017). It could be argued that since the
complex memory task used in Georgiadou and Roehr-Brackin (2017) was administered in the
participants’ L2, their results differed from those of other studies, in which the working memory
tasks were administered in the participants’ L1. This result is contrary to Fortkamp (1999), who
observed a significant correlation only between the working memory task administered in the
participants’ L2 and the L2 speech generation task. However, it must be noted that her participants’ OPs
were coded subjectively by two judges for fluency on a scale from 1 to 5. More studies are
needed to clarify the role of the executive aspect of working memory in self-monitoring.
Regarding working memory subcomponents, only the phonological loop has been investigated
in relation to self-repairs. Simard and her colleagues (2016) found that a better
score on the non-word repetition task was associated with fewer self-repairs targeting language
forms among their intermediate proficiency participants. The phonological loop was
not associated with self-repairs targeting discourse-level elements.
Different aspects of the relationship between self-repairs and attention have been investigated
(e.g., Simard et al., 2011; Zuniga, 2015; Zuniga & Simard, 2019). For instance,
Simard and her colleagues (2011) examined attentional capacity, that is, the capacity to
maintain concentration of attention across time. No relationship was found between self-repairs
and the limited-capacity characteristic of attention, as measured by the attentional capacity D2 test.
The authors argued that attention shift is necessary for monitoring. This claim was
later verified and supported (Simard et al., 2016; Zuniga, 2015; Zuniga & Simard, 2019).
A closer look at the results obtained from the studies presented earlier reveals that the
complex relationship between monitoring, attention shift, phonological memory, and the executive
aspect of working memory is a function of the language elements targeted for monitoring.
Indeed, phonological memory was exclusively and positively associated with self-repairs
targeting forms (Simard et al., 2016), while the executive aspect of working memory
and attention shift were negatively associated with self-repairs targeting discourse-level conceptual
elements (Ahmadian, 2015; Mojavezi & Ahmadian, 2013; Simard et al., 2020). Given
that attention must be controlled to shift it from one stimulus to another, and the executive
aspect of working memory is an attention–control system, it is no surprise that both measures
led to similar results (Simard et al., 2016).
6 Future Directions
Cognitive resources (i.e., working memory and attention) interact in a distinct manner with
the psycholinguistic processes (i.e., the language processing mechanisms such as grammatical
and phonological encodings) involved during L2 OP. Although we have some information
regarding how working memory and attentional resources interact with self-monitoring
during L2 OP, much less information is available regarding their interaction with conceptualization,
formulation, and articulation. More research is needed. For instance, a research
programme should examine the interaction between working memory (executive and
storage aspects), attention (characteristics and functions), and a systematic manipulation of
task conditions among various L2 speaker populations (e.g., different age groups, levels of
proficiency, combinations of languages spoken). Additionally, the idea that OP tasks
themselves can target specific psycholinguistic processes should be further investigated (see
Fortkamp, 1999), and more specifically whether a given task can really isolate formulation.
Van Moere (2012) suggested that elicited repetition provides information regarding phonological
and grammatical encoding accuracy, which are both psycholinguistic processes
occurring during the formulation phase. This is an interesting path to examine in relation to
other measures of OP and cognitive resources.
Finally, proposals formulated in the adaptations of Levelt’s model for L2 OP should be
tested. Among others, L2 research on memory has focused on Baddeley’s model operationalization.
However, it would be interesting to investigate the capacity to access the
additional storage of information in long-term memory (see Kormos, 2006) using episodic
and semantic memory tasks (e.g., Vallet et al., 2017) to verify how these measures interact
with the psycholinguistic processes involved during L2 production. The same could be done
for the relationship with a declarative memory measure.
Notes
1 In the field of L2 acquisition research, self-repairs have been analyzed from different angles. Among
other things, their nature, frequency, and distribution in L2 OP (i.e., characteristics of self-repairs)
have been described in relation to the language features being repaired (e.g., Kormos, 1998), the
level of proficiency of the L2 speakers (e.g., Gilabert, 2007; Kormos, 1999b, 2000a, 2000b), or the
development of their language proficiency (e.g., Griggs, 1997; Kormos, 1999a, 2000a, 2000b), and
with the degree of complexity of different narrative tasks (i.e., contextual characteristics of OP) (e.g.,
Gilabert, 2007; Kormos, 1999b).
2 In their work, Mojavezi and Ahmadian (2013) and Ahmadian (2015) refer to discourse change,
which is similar in definition to what Simard and colleagues call choice of words or groups of words.
Further Reading
Kormos, J. (2006). Speech production and second language acquisition. Mahwah: Lawrence Erlbaum
Associates.
Segalowitz, N. (2010). Cognitive bases of second language fluency. New York: Routledge.
References
Ahmadian, M. J. (2015). Working memory, online planning and L2 self-repair behaviour. In Z. Wen,
M. B. Mota & A. McNeill (Eds.), Working memory in second language acquisition and processing
(pp. 160–174). Bristol, UK: Multilingual Matters.
Allport, D. A. (1987). Selection for action: Some behavioral and neurophysiological considerations of
attention and action. In H. Heuer & A. F. Sanders (Eds.), Perspectives on perception and action
(pp. 395–419). Hillsdale, NJ: Lawrence Erlbaum Associates.
Baddeley, A. D. (2015). Working memory in second language learning. In Z. Wen, M. B. Mota & A.
McNeill (Eds.), Working memory in second language acquisition and processing (pp. 17–28). Bristol:
Multilingual Matters.
Baddeley, A. D. (2010). Long-term and working memory: How do they interact? In L. Bäckman & L.
Nyberg (Eds.), Memory, aging and the brain: A festschrift in honour of Lars-Göran Nilsson
(pp. 18–30). Hove, UK: Psychology Press.
Baddeley, A. D. (2003). Working memory and language: An overview. Journal of Communication
Disorders, 36, 189–208.
Baddeley, A. D. (2000). The episodic buffer: A new component of working memory? Trends in
Cognitive Sciences, 4, 417–423.
Baddeley, A. D. (1996). Exploring the central executive. The Quarterly Journal of Experimental
Psychology: Section A, 49, 5–28.
Baddeley, A. D. (1986). Working memory. Oxford: Oxford University Press.
Baddeley, A. D., Gathercole, S., & Papagno, C. (1998). The phonological loop as a language learning
device. Psychological Review, 105, 158–173.
Baddeley, A. D., & Hitch, G. (1974). Working memory. In G. H. Bower (Ed.), The psychology of
learning and motivation: Advances in research and theory (Vol. 8, pp. 47–89). New York: Academic
Press.
Bange, P., & Kern, S. (1996). La régulation du discours en L1 et en L2. Études Romanes, 35, 69–103.
Brickenkamp, R., & Zillmer, E. (1998). The d2 Test of Attention. Seattle, WA: Hogrefe & Huber.
Broadbent, D. E. (1958). Perception and communication. New York: Pergamon.
Bygate, M. (2008). Oral second language abilities as expertise. In K. Johnson (Ed.), Expertise in second
language learning and teaching (pp. 104–127). New York: Palgrave Macmillan.
de Bot, K. (1992). A bilingual production model: Levelt’s “speaking” model adapted. Applied
Linguistics, 13, 1–24.
de Bot, K., & Schreuder, R. (1993). Word production and the bilingual lexicon. In R. Schreuder & B.
Weltens (Eds.), The bilingual lexicon (pp. 191–214). Amsterdam: John Benjamins.
Dell, G. (1986). A spreading-activation theory of retrieval in sentence production. Psychological
Review, 93, 283–321.
Dörnyei, Z., & Kormos, J. (1998). Problem-solving mechanisms in L2 communication: A psycholinguistic
perspective. Studies in Second Language Acquisition, 20, 349–385.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Fortkamp, M. B. M. (1999). Working memory capacity and aspects of L2 speech production.
Communication and Cognition, 32, 259–296.
Garrett, M. F. (1984). The organization of processing structure for language production. Applications
to aphasic speech. In D. Caplan, A. R. Lecours & A. Smith (Eds.), Biological perspectives on language
(pp. 172–193). Cambridge, MA: MIT Press.
Georgiadou, E., & Roehr-Brackin, K. (2017). Investigating executive working memory and phonological
short-term memory in relation to fluency and self-repair behaviour in L2 speech. Journal of
Psycholinguistic Research, 46, 877–895.
Gilabert, R. (2007). Effects of manipulating task complexity on self-repairs during L2 oral production.
International Review of Applied Linguistics in Language Teaching, 45, 215–240.
Golonka, E. (2006). Predictors revised: Linguistic knowledge and metalinguistic awareness in second
language gain in Russian. Modern Language Journal, 90, 496–505.
Green, D. W. (1986). Control, activation and resource: A framework and a model for the control of
speech in bilinguals. Brain and Language, 27, 210–223.
Griggs, P. (1997). Metalinguistic work and the development of language use in communicative pair-work
activities involving second language learners. In L. Diaz & C. Pérez (Eds.), Views on the acquisition and
the use of second languages (pp. 403–415). Barcelona, Spain: Universitat Pompeu Fabra.
Izumi, S. (2003). Comprehension and production processes in second language learning: In search of
the psycholinguistic rationale for the output hypothesis. Applied Linguistics, 24, 168–196.
Jackendoff, R. (1987). Consciousness and the computational mind. Cambridge, MA: MIT Press.
Kempen, G., & Hoenkamp, E. (1987). An incremental procedural grammar for sentence formulation.
Cognitive Science, 11, 201–258.
Kormos, J. (2011). Speech production and the cognition hypothesis. In P. Robinson (Ed.), Second
language task complexity: Researching the Cognition Hypothesis of language learning and performance
(pp. 39–60). Philadelphia: John Benjamins.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Lawrence Erlbaum
Associates.
Kormos, J. (2000a). The timing of self-repairs in second language speech production. Studies in Second
Language Acquisition, 22, 145–167.
Kormos, J. (2000b). The role of attention in monitoring second language speech production. Language
Learning, 50, 343–384.
Kormos, J. (1999a). Monitoring and self-repair in L2. Language Learning, 49, 303–342.
Kormos, J. (1999b). The effect of speaker variables on the self-correction behaviour of L2 learners.
System, 27, 207–221.
Kormos, J. (1998). A new psycholinguistic taxonomy of self-repairs in L2: A qualitative analysis with
retrospection. Even Yearbook, ELTE SEAS Working Papers in Linguistics, 3, 43–68.
Levelt, W. J. M. (2001). Relations between speech production and speech perception: Some behavioral
and neurological observations. In E. Dupoux (Ed.), Language, brain and cognitive development:
Essays in honour of Jacques Mehler (pp. 241–256). Cambridge, MA: MIT Press.
Levelt, W. J. M. (2000). Psychology of language. In K. Pawlik & M. R. Rosenzweig (Eds.),
International handbook of psychology (pp. 151–167). London: SAGE publications.
Levelt, W. J. M. (1999a). Language production: A blueprint of the speaker. In C. Brown & P. Hagoort
(Eds.), Neurocognition of language (pp. 83–122). Oxford, England: Oxford University Press.
Levelt, W. J. M. (1999b). Models of word production. Trends in Cognitive Sciences, 3, 223–232.
Levelt, W. J. M. (1996). Perspective taking and ellipsis in spatial descriptions. In P. Bloom, M. A.
Peterson, M. F. Garrett & L. Nadel (Eds.), Language and space (pp. 77–107). Cambridge, MA: MIT
Press.
Levelt, W. J. M. (1995). The ability to speak: From intentions to spoken words. European Review,
3, 13–23.
Levelt, W. J. M. (1993). Psycholinguistics. In A. Colman (Ed.), Companion encyclopedia of psychology
(Vol. 1, pp. 319–337). London: Routledge.
Levelt, W. J. M. (1992). Accessing words in speech production: Stages, processes and representations.
Cognition, 42, 1–22.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Levelt, W. J. M. (1983). Monitoring and self-repair in speech. Cognition, 14, 41–104.
Levelt, W. J. M., & Wheeldon, L. (1994). Do speakers have access to a mental syllabary? Cognition,
50(1–3), 239–269.
Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production.
Behavioral and Brain Sciences, 22, 1–38.
Linck, J. A., Osthus, P., Koeth, J. T., & Bunting, M. F. (2014). Working memory and second language
comprehension and production: A meta-analysis. Psychonomic Bulletin & Review, 21, 861–883.
Martin, K. I., & Ellis, N. C. (2012). The roles of phonological short-term memory and working memory
in L2 grammar and vocabulary learning. Studies in Second Language Acquisition, 34, 379–413.
Mojavezi, A., & Ahmadian, M. J. (2013). Working memory capacity and self-repair behaviour in first
and second language oral production. Journal of Psycholinguistic Research, 43, 289–297.
Miyake, A., & Shah, P. (Eds.). (1999). Models of working memory: Mechanisms of active maintenance
and executive control. New York: Cambridge University Press.
Norman, D. A., & Shallice, T. (1986). Attention to action: Willed and automatic control of behaviour.
In R. J. Davidson, G. E. Schwartz & D. Shapiro (Eds.), Consciousness and self-regulation. Advances
in research and theory (pp. 1–18). New York: Plenum Press.
Özdemir, R., Roelofs, A., & Levelt, W. J. M. (2007). Perceptual uniqueness point effects in monitoring
internal speech. Cognition, 105, 457–465.
Paradis, M. (1981). Neurolinguistic organization of a bilingual’s two languages. In J. E. Copeland & P.
W. Davis (Eds.), The seventh LACUS forum (pp. 486–494). Columbia, SC: Hornbeam Press.
Pawlak, M. (2011). Instructed acquisition of speaking: Reconciling theory and practice. In M. Pawlak,
E. Waniek-Klimczak & J. Majer (Eds.), Speaking and instructed foreign language acquisition (pp. 3–23).
Toronto, Canada: Multilingual Matters.
Postma, A. (2000). Detection of errors during speech production: A review of speech monitoring
models. Cognition, 77, 97–131.
Poulisse, N. (1997). Language production in bilinguals. In A. de Groot & J. Kroll (Eds.), Tutorials in
bilingualism: Psycholinguistic perspectives (pp. 201–224). Hillsdale, NJ: Lawrence Erlbaum.
Poulisse, N., & Bongaerts, T. (1994). First language use in second language production. Applied
Linguistics, 15, 36–57.
Reitan, R. (1958). Validity of the Trail Making Test as an indicator of organic brain damage.
Perceptual and Motor Skills, 8, 271–276.
Robinson, P. (2003). Attention and memory during SLA. In C. J. Doughty & M. H. Long (Eds.), The
handbook of second language acquisition (pp. 631–678). London: Blackwell Publishing.
Segalowitz, N. (2010). Cognitive bases of second language fluency. New York: Routledge.
Segalowitz, N. (2007). Access fluidity, attention control, and the acquisition of fluency in a second
language. TESOL Quarterly, 41, 181–186.
Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing: II.
Perceptual learning, automatic attending, and a general theory. Psychological Review, 84, 127–190.
Skehan, P. (2015). Working memory and second language performance. In Z. Wen, M. Mota & A.
McNeill (Eds.), Working memory in second language acquisition and processing (pp. 189–201).
Bristol: Multilingual Matters.
Simard, D., Bergeron, A., Liu, Y.-G., Nader, M., & Redmond, L. (2016). Production
d’autoreformulations autoamorcées en langue seconde: rôle de l’attention et de la mémoire phonologique.
Revue canadienne des langues vivantes/Canadian Modern Language Review, 72, 183–210.
Simard, D., Fortier, V., & Zuniga, M. (2011). Attention et production d’autoreformulations
autoamorcées en français langue seconde, quelle relation? Journal of French Language Studies, 21,
417–436.
Simard, D., French, L., & Zuniga, M. (2017). Evolution of L2 self-repair behavior over time among
adult learners of French. Revue canadienne de linguistique appliquée/ Canadian Journal of Applied
Linguistics, 20, 71–89.
Simard, D., Molokopeeva, T., & Zhang, Q. Y. (2020). The contribution of working memory to L2
French pronunciation among adult language learners. Canadian Modern Language Review/Revue
canadienne des langues vivantes, 76, 50–69.
Simard, D., & Wong, W. (2001). Alertness, orientation, and detection: The conceptualization of
attentional functions in SLA. Studies in Second Language Acquisition, 23, 103–124.
Slobin, D. I. (1987). Thinking for speaking. Proceedings of the thirteenth annual meeting of the Berkeley
linguistics society (pp. 435–444). Berkeley, CA: Berkeley Linguistics Society.
Tomlin, R., & Villa, V. (1994). Attention in cognitive science and second language acquisition. Studies
in Second Language Acquisition, 16, 183–203.
Towell, R., & Dewaele, J.-M. (2005). The role of psycholinguistic factors in the development of fluency
amongst advanced learners of French. In J.-M. Dewaele (Ed.), Focus on French as a foreign
language: Multidisciplinary approaches (pp. 210–239). Toronto, Canada: Multilingual Matters.
Treisman, A. (1960). Contextual cues in selective listening. Quarterly Journal of Experimental
Psychology, 12, 242–248.
Trueswell, J. C., Tanenhaus, M. K., & Garnsey, S. M. (1994). Semantic influences on parsing: Use of
thematic role information in syntactic ambiguity resolution. Journal of Memory and Language, 33,
285–318.
Tulving, E. (1972). Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization
of memory (pp. 381–403). New York: Academic Press.
Vallet, G. T., Hudon, C., Bier, N., Macoir, J., Versace, R., & Simard, M. (2017). A SEMantic and
EPisodic Memory Test (SEMEP) developed within the embodied cognition framework: Application
to normal aging, Alzheimer’s disease and semantic dementia. Frontiers in Psychology, 8, 1493.
van Hest, E. (1996). Self-repair in L1 and L2 production. Tilburg, Netherlands: Tilburg University Press.
Van Moere, A. (2012). A psycholinguistic approach to oral language assessment. Language Testing, 29,
325–344.
Wen, Z. (2016). Working memory and second language learning: An integrated approach. Bristol, UK:
Multilingual Matters.
Wen, Z., Mota, M. B., & McNeill, A. (2013). Working memory and SLA: Towards an integrated
theory. Asian Journal of English Language Teaching, 23, 1–18.
Zuniga, M. (2015). The role of attention in L2 speech production. Thèse de doctorat inédite, Québec,
Canada: Université du Québec à Montréal.
Zuniga, M., & Simard, D. (2019). Factors influencing L2 self-repair behavior: The role of L2
proficiency, attentional control and L1. Journal of Psycholinguistic Research, 48, 43–59.
3
A COMPLEX DYNAMIC SYSTEMS
THEORY PERSPECTIVE ON
SPEAKING IN SECOND LANGUAGE
DEVELOPMENT
Wander Lowie and Marjolijn Verspoor
1 Introduction/Definitions
Complex Dynamic Systems Theory (CDST) addresses the process of language development
over time, rather than the outcomes of a process. The process is commonly described in terms
of patterns of change, which include stages of development in variability, stabilization, and
destabilization. The general goal of CDST-inspired studies is to understand how the complex
interaction of numerous forces leads to behavioural changes, that is, how development comes
about. If there is one conclusion that we can safely draw from about 20 years of research into
second language development from a CDST perspective, it is that language development is a
highly individual process: it is not predetermined, but emerges from the interaction and
coordination of subsystems, a process referred to as “self-organization” (Smith & Thelen, 2003).
The term “systems” is central in systems theories like CDST. A system is a conglomeration
of connected elements that form a coherent whole (Bertalanffy, 1995). The elements in CDST
are referred to as subsystems, which are themselves systems that may in turn consist of
subsystems. For instance, the language system is embedded in the larger system of cognition,
which is embedded in the larger system of the human being, which is in turn embedded in the
larger system of a speech community. The language system itself consists of several embedded
subsystems, like phonology and vocabulary, with language-specific subsystems for
multilinguals.
Subsystems are open and all these systems are connected regardless of the degree of
embeddedness. The changes in coordinated and interdependent subsystems form the foundation
of the dynamic and non-linear nature of development. As changes in any of the
subsystems may lead to changes in other subsystems, development is characterized as an
iterative process in which each stage in development is based on the system’s preceding state.
And since the combination of subsystems and the nature and timing of their interaction is
essentially unique for each person, this leads to an iterative developmental process that is
strongly individual and cannot be predetermined. The logical consequence of this is that
CDST-inspired studies tend to be longitudinal case studies that focus on the process of
development.
DOI: 10.4324/9781003022497-5
A growing number of CDST-inspired studies have shown that although the steps in
language development may be globally similar among learners, the timing and magnitude
of development depend strongly on individual differences and on changes in the
interacting factors that contribute to the learner’s development, including the learner’s context
or environment. CDST studies are characterized by longitudinal observations of individual
learners with dense measurements, allowing for reflections on the individual process of
development rather than focusing on generalized products of learning for groups of learners at
one moment in time.
So far, most CDST studies in second language development have focused on the development
of writing for learners at different levels in various contexts, but a small number of
studies have also focused on speaking, which is particularly complex. The number of interacting
subsystems relevant for speaking is relatively large, as speaking is generally less controlled
than writing, and the contexts in which speaking is used are generally natural and ecological.
For oral production, the operation of the skills is coupled with context, including the dyad.
This is clearly shown by the occurrence of alignment and convergence during speech. People
have a tendency to adjust to the context, and likewise the context will then adjust to the
speaker, leading to an active form of coordination of the relevant subsystems. The short-term
developmental process of the speaker is coupled with that of other speakers and can be found
in all or several of the relevant subsystems, from timing and articulation in the speaker’s
pronunciation to the use of non-verbal gestures.
Another typical characteristic of dynamic systems is the self-similar nature of embedded
subsystems, also referred to as fractals. Each time we zoom in on a dynamic structure,
similar patterns can be perceived. The repeated patterns are clearly illustrated in
Mandelbrot sets, but can also be seen in many naturally occurring phenomena like
cauliflowers and trees. The pattern of the skeleton of the tree is repeated in increasingly smaller
structures, from branches to leaves. Fractals have also been identified in the time domain
of dynamic systems, when a certain pattern of variability is repeated at smaller timescales
(Rhea et al., 2014). Most evidence for fractal structures in the time domain has been
found at the short timescale of language processing. During speech comprehension, a
hierarchy of linguistic structures has been identified in neural tracking at different
timescales (Zhang & Ding, 2017). Moreover, a fractal dimension has been found in the
variability of speech production during simple naming tasks in the L1 (Holden et al., 2009),
as well as in second language naming tasks (Plat et al., 2018). A fractal structure has also
been identified in the diachronic development of syntactic complexity (Evans &
Larsen-Freeman, 2020).
In the following parts, we will contextualize CDST research in a historical framework and
discuss critical issues. We will then discuss the rapidly developing methods for process-based
research relevant for speaking and will mention some challenges and future directions
for the CDST framework.
2 Historical Perspectives
CDST is founded on well-accepted dynamic theories of physics, mathematics, and demography.
Over the past three decades, applications of CDST to cognition and psychology have
been very influential (Thelen & Smith, 1994) and applications to language development and
second language development have caused a major turn in applied linguistics. Two of the
most recent turns that have influenced our thinking about second language development
today occurred in the late 1990s, when psycholinguistic and neurolinguistic experimental
methods became an accepted line of research. The two shifts have been referred to as the
Social Turn and the Dynamic Turn.
The Social Turn (see Block, 2003 for a detailed critical review), initiated by the seminal
and controversial paper by Firth and Wagner (1997), was a strong negative reaction to the
idea that language learning can be investigated through controlled experiments. A theoretical
perspective closely associated with the social turn is Sociocultural Theory (SCT), linking
society to individual development. For second language learning, the central premise of SCT
is that any form of human cognitive development is essentially mediated by cultural artefacts
(Lantolf & Thorne, 2006). Consequently, language development cannot be studied outside its
authentic communicative context. Levine (2020) argues that SCT and CDST are commensurable
and complementary frameworks.
The second major paradigm shift, the Dynamic Turn (de Bot, 2015), shares several of its
assumptions with the Social Turn and its theoretical impacts. Like SCT, dynamic theories
do not consider language development an isolated activity in the cognitive domain. Unlike
SCT, however, dynamic theories do not emphasize the opposition between cognition
and sociocultural artefacts, but stress their integration. Both theories consider language
development as the ongoing, emerging process of an integrated holistic system, which
includes a wide range of connected, embedded and embodied subsystems. The onset of this
development dates back to 1997, when Diane Larsen-Freeman published a groundbreaking
paper on Complex Adaptive Systems in Applied Linguistics (Larsen-Freeman, 1997).
The paper emphasized the dynamic nature of language development, which is described as a
journey with no end state. Years later, the dynamic turn was reinforced by a number of
papers and books, such as Herdina and Jessner (2002), de Bot et al. (2005, 2007),
Larsen-Freeman and Cameron (2008), Dörnyei (2009), and Verspoor et al. (2011). Some authors
used the term Dynamic Systems Theory, while others used the term Complex System, but the
main theoretical implications were the same. Therefore, it was decided to use the combined
term Complex Dynamic Systems Theory (CDST) (de Bot, 2017).
Today, an increasing number of scholars are doing CDST-inspired research into second
language development. The focus of their studies is diverse: very fundamental studies
on the self-organizing nature of language use in real time (Plat et al., 2018); theoretical
considerations about CDST research (Hiver & Al-Hoorie, 2016); studies that focus on the
identification of stages of development as they emerge over time (Baba & Nitta, 2014);
process-based research that focuses on the development of accuracy and complexity by
studying variability in second language development (Spoelman & Verspoor, 2010);
and pedagogical implications of a CDST framework (Levine, 2020). For practical reasons,
most studies focus on writing, though recently some work on different aspects of the
dynamic development of speaking has been published (Hepford, 2017; Lowie et al., 2018;
Polat & Kim, 2014; Roehr-Brackin, 2014; Yu & Lowie, 2019). We will elaborate on these
contributions in Part 4.
The dynamic nature of acquisition has also been addressed from a theoretical angle.
Following Browman and Goldstein (1992), Lima Júnior (2013) proposes that a child’s first
words may not be stored and accessed as separate phonemes, but as “holistic patterns of
articulatory routines” (Browman & Goldstein, 1992, p. 39). The frequent repetition of microlevel
elements leads to the emergence of macrolevel patterns. That is, the pre-linguistic units
gradually develop into gestural units of contrast. During acquisition, the child distinguishes
and adjusts his/her emerging gestures and, simultaneously, learns how to coordinate them, as
the development of the ability to produce all the gestures of a word requires their coordination.
Lima Júnior (2013) concludes that such a CDST and experiential perspective on acquisition
may not only alter our view of first language acquisition, but also have implications for L2
acquisition. Because of the iterative processes and eventual entrenchment of patterns (or
attractor states in CDST terms), L2 learners associate unknown patterns of the L2 with L1
sound patterns, which is reminiscent of Flege’s Speech Learning Model (1995, 1999). Flege argues
that since adult L2 learners often fail to distinguish a certain L2 sound from a close L1 sound,
they may classify the L2 sound under one of their entrenched (prototypical) L1 phonological
categories.
points. Unfortunately, the relevance of timescales and the fractal nature of language
comprehension and production, especially in the second language, remain largely
underexplored.
Finally, a critical issue is the choice of measures of (written or spoken) language
production in time series. When repeated measurements of language production are taken every
hour, day, week, month, or year, depending on the timescale, we must be sure that the
measures are representative of the production at that moment. Measures that have typically
been used in CDST research are holistic assessment by trained raters or analytic assessment
of language using measures of Complexity, Accuracy, and Fluency (CAF). The introduction
of analytic measures of CAF has been a step toward the objective evaluation of language
development (Ortega, 2003), and analyses of linguistic complexity and lexical sophistication
can conveniently be run on (transcribed) samples using Natural Language Processing (NLP)
tools (https://www.linguisticanalysistools.org). Suitable PRAAT scripts are available for the
automatic analysis of aspects of Fluency (De Jong & Wempe, 2009). However, the consistent
application of suitable measures, especially for clausal, phrasal and lexical complexity, has
been a point of concern (Norris & Ortega, 2009). Different measures tend to be sensitive to
specific levels and types of development. The substantial literature on this topic is still
growing, and the choice of suitable measures is becoming more and more sophisticated (see for
instance Housen et al., 2019; Kyle et al., 2020). An important concern from a CDST
perspective is that probably no single measure can adequately represent an L2 proficiency level
(for a discussion see Lowie et al., 2017).
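To make the notion of an analytic measure concrete, the following minimal Python sketch computes one simple index of lexical diversity, a moving-average type-token ratio, for a transcribed sample. It is offered only as an illustration: the function names, the window size, the example sentence, and the whitespace tokenization are assumptions made here, and it is not the procedure implemented in the NLP tools mentioned above.

```python
def type_token_ratio(tokens):
    """Number of distinct word forms divided by the number of tokens."""
    if not tokens:
        raise ValueError("the sample contains no tokens")
    return len(set(tokens)) / len(tokens)


def moving_average_ttr(tokens, window=50):
    """Mean type-token ratio over all windows of `window` consecutive tokens.

    Samples shorter than one window fall back to the plain ratio.
    """
    if len(tokens) < window:
        return type_token_ratio(tokens)
    ratios = [
        len(set(tokens[start:start + window])) / window
        for start in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


# Invented sample; a real study would use a full transcript per measurement.
tokens = "the cat sat on the mat and the dog sat on the rug".split()
print(type_token_ratio(tokens))
print(moving_average_ttr(tokens, window=5))
```

Computing such an index for every sample in a time series yields one value per measurement moment. In line with the concern raised above, a single index of this kind should not be read as a stand-in for overall proficiency.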
Figure 3.1 Moving correlations between syntactic complexity and accuracy for participants (A) and
(B). Reprinted from Yu and Lowie (2020) with permission
findings reported in Dykstra-Pruim (2003). However, for one of the learners this shifted to a
reverse effect over time and the complexity in that twin’s writing was higher than in spoken
language. These data show that, in an iterative, process-based analysis, second language
development appears as the effect of the dynamic interaction of changing variables. Even minute
differences at one point in time may lead to large differences over time in a non-linear
developmental trajectory.
A recent study by Yu and Lowie (2019) describes the dynamic paths of the development of
speaking skills of Chinese speakers of English in a longitudinal case study focusing on
complexity and accuracy over a period of 4 months. The study shows much variability and
tentatively points to a relationship between the amount of variability and the degree of
development: increased variability tends to coincide with developmental jumps in speaking
(as measured by lexical diversity as well as accuracy). However, the most notable
finding in this study was that complexity and accuracy show a strong dynamic
relationship. Although a competitive relationship was found between complexity and accuracy
during the early stages of development, this shifted toward a supportive relationship in the
course of the data collection period, as illustrated in Figure 3.1. The figure shows a moving
window of correlations between the variables, illustrating how the correlation between
them changes over time. Initially, the learners may have been slowed down by a
limited availability of resources (particularly apparent in participant B), while at later stages
of development they managed to find a balance among different linguistic subsystems. This
finding contrasts with findings from a single case study by Polat and Kim (2014), who found
that their participant made progress on complexity, but not on accuracy. A possible
explanation of the difference between the studies could well be that Polat and Kim investigated
an untutored learner, while the participants in the study by Yu and Lowie received formal
instruction.
In addition to complexity and accuracy, Hepford (2017) included fluency in a single case
study of oral L2 development in a naturalistic setting over a period of 15 months. Eliciting
language production using a rich variety of speaking tasks, she investigated the interaction of
the CAF dimensions in combination with global proficiency measures and motivation. She found
clear non-linear and self-organizing development as a result of interconnected subsystems like the
learner’s motivation. While the relationship among most complexity measures remained
relatively constant over time, the fluency measures varied as an effect of the amount of
cognitive strain the learner experienced. This detailed case study, including a wide variety of
changing factors over a considerable time, shows that for oral production, fluency tends to be
the dimension of CAF that is most sensitive to contextual changes.
Figure 3.2 Vowel diagram for English sounds, with the estimated Dutch rounded front vowel added in
the top left corner (When pronounced in context most productions by Dutch speakers will
be somewhat more central.)
A practical longitudinal study of L2 phonological development (Verspoor et al.,
forthcoming) illustrates the significance of variability in language production. This example
focuses on the development of the phonological system of a 5-year-old American boy (B) who
learned Dutch in a naturalistic setting. His Dutch pronunciation was traced for about a year
(Lowie, 2013) by means of weekly measurements of speech production. The current example
concentrates on the development of the Dutch close front rounded vowel /y/. In English, all
rounded vowels (/u/, /ʊ/, /o/, /ɔ/) are back vowels, and there are no rounded front vowels. In
Dutch, most rounded vowels are also back vowels, but /y/ is one of the exceptions. For
English learners of Dutch, the production of /y/ poses a major challenge, as it requires a
new combination of entrenched English articulatory gestures (see Flege, 1999). See Figure 3.2
for an illustration of these options.
We observed that there was a seemingly random variation between variants of /i/ and
variants of /u/, with the occasional combination of the two as a diphthong /ui/. The
development is far from linear and shows a highly variable developmental trajectory (see
Figure 3.3). In the first few weeks, B varies between the rounded /u/ and non-rounded /i/,
with the occasional diphthong /iu/, but later on, his productions tend towards the target /y/
much more often. We argue that this type of variability is not intentional and not caused by
any particular factor, but shows that the learner is aiming for a target sound and is constantly
trying it out until he approaches native-like productions. The variability is functional in that
without trying and experimenting to aim for a target form, there would be no change.
Figure 3.3 Longitudinal measurement of the production of the participant’s Dutch front rounded
vowel /y/, represented by the first three formants: F1 (darkest shade), F2 (medium shade)
and F3 (lightest shade). Series: Prod_F1, Prod_F2, Prod_F3; vertical axis 0 to 4000;
horizontal axis 0 to 14
after a fixed number of years of exposure by comparing a group of early starters to late
starters, or the effect of the learner’s L1 on the quality of L2 pronunciation by comparing
representative groups of learners (Derwing & Munro, 2013). Conversely, a process-based
study can investigate how the development evolves over time, and can, for instance, explore
how the vowel production of an L2 learner changes over time, as it initially switches
between L1 and L2 realizations, and gradually develops in the direction of the target
realizations after overshooting the values of matched native speaker controls for some time
(Verspoor et al., 2021).
In CDST studies of second language development, several methods have been used to
explore the development in longitudinal case studies. Two methods have been dominant,
each related to different types of research questions: the analysis of variability over time and
the analysis of relationships of subsystems over time. Both analyses use moving windows
techniques to enable observing stepwise change over time while reducing the effect of local
peaks. Although many different techniques are available to answer a variety of research
questions, we will mention only one to illustrate the type of analysis.
For the analysis of variability over time especially, graphical tools are used to visualize
development. One such instrument is that of min–max graphs, in which a moving window of
minimum, mean, and maximum is used to identify changes in the amount of variability
(Van Dijk et al., 2011). This is illustrated in Figure 3.4. In these analyses it has frequently
been observed that increased variability tends to coincide with a jump in development (see
for instance Yu & Lowie, 2019). The significance of the jumps and the changes in the amount
of variability is commonly tested by using Monte Carlo simulations. These are permutation
tests in which the data are resampled a large number of times (for instance 10,000) to
determine how likely it is that the observed pattern arose by chance. Similar simulation
analyses to determine the significance of changes over time have been done using Change
Point Analyses (Baba & Nitta, 2014).
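The two steps described above can be sketched in a few lines of Python. The sketch below is a minimal illustration under stated assumptions, not the procedure of the studies cited: the function names, the window size, the invented scores, and the choice of test statistic (the largest rise in the mean from one window to the next) are all assumptions made here. It first computes the moving minimum, mean, and maximum, and then shuffles the series repeatedly to estimate how often a jump of the observed size would occur if the order of the measurements were arbitrary.

```python
import random


def moving_min_max(scores, window=5):
    """Return one (minimum, mean, maximum) triple per moving window."""
    bands = []
    for start in range(len(scores) - window + 1):
        chunk = scores[start:start + window]
        bands.append((min(chunk), sum(chunk) / window, max(chunk)))
    return bands


def largest_jump(scores, window=5):
    """Largest rise in the mean from one window to the window that follows it."""
    if len(scores) < 2 * window:
        raise ValueError("need at least two full windows of data")
    rises = []
    for start in range(len(scores) - 2 * window + 1):
        before = sum(scores[start:start + window])
        after = sum(scores[start + window:start + 2 * window])
        rises.append((after - before) / window)
    return max(rises)


def permutation_p_value(scores, window=5, n_resamples=10_000, seed=1):
    """Proportion of shuffled series whose largest jump matches or exceeds the observed one."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = largest_jump(scores, window)
    shuffled = list(scores)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)
        if largest_jump(shuffled, window) >= observed:
            hits += 1
    return hits / n_resamples


# Invented weekly scores for one learner: a stable phase followed by a jump.
scores = [3, 4, 3, 5, 4, 3, 4, 5, 21, 22, 21, 23, 22, 21, 22, 23]
bands = moving_min_max(scores)
p_value = permutation_p_value(scores)
print(bands[0], largest_jump(scores), p_value)
```

Plotting the three values in each triple against time gives a min–max graph: a widening band signals a period of increased variability. A small proportion from the permutation step suggests that the observed jump is unlikely to be a product of chance ordering alone.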
Figure 3.4 Moving min–max graph illustrating changes in the amount of variability in a child’s use of
spatial prepositions (vertical axis: spatial prepositions; horizontal axis: observation date,
24-Feb-98 to 9-Feb-99; series: MIN, MAX, average score). Reprinted from Van Geert and
Van Dijk (2002) with permission
Several methods have been used for the analysis of relationships of subsystems over time.
After smoothing the data to eliminate extreme peaks and after detrending the data, the
relationship among the subsystems is analyzed using growth models like precursor models
(Lowie et al., 2011). Using these models, the complex dynamic interrelationship of several
subsystems has been analyzed. For instance, Caspi (2010) analyzed the relationship of four
dimensions of receptive and productive vocabulary in second language use. Because growth
models tend to be rather advanced techniques, more straightforward analyses have also been
used, like moving correlations (Verspoor & Van Dijk, 2011). In these analyses, a moving
window of correlation between two subsystems at a time is used. Moving correlations may
show that even when the overall correlation between two subsystems is low, this may be due
to a change over time from a negative to a positive correlation, as is shown for the
relationship between the finite form ratio (words/FB) and sentence structure
(simple/compound) in Figure 3.5 (from Verspoor & Van Dijk, 2011).
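The point about moving correlations can be illustrated with a minimal Python sketch. It assumes two equally long series of scores for one learner. The function names, the window size, and the invented data are assumptions made here, and no smoothing or detrending is applied. In the example, the overall correlation is zero even though the two measures move from a perfectly negative to a perfectly positive relationship.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long lists (nan if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def moving_correlation(xs, ys, window=6):
    """Correlation within each window of `window` consecutive measurements."""
    if len(xs) != len(ys):
        raise ValueError("both series need the same number of measurements")
    return [
        pearson(xs[start:start + window], ys[start:start + window])
        for start in range(len(xs) - window + 1)
    ]


# Invented scores: competition in the first six sessions, support in the last six.
complexity = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
accuracy = [6, 5, 4, 3, 2, 1, 1, 2, 3, 4, 5, 6]

overall = pearson(complexity, accuracy)
moving = moving_correlation(complexity, accuracy)
print(overall, moving[0], moving[-1])
```

A window that is too short makes each coefficient unstable, while a window that is too long hides the change. The window size is therefore a choice that should be reported along with the results.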
The techniques mentioned here are only a small portion of the available analyses for
longitudinal data with dense measurements. After an initial discussion of CDST methods by
Larsen-Freeman and Cameron (2008), a practical guide for CDST methodologies by
Verspoor et al. (2011), and guidelines for data collection by Lowie (2017) and by Murakami
(2020), a very comprehensive overview of techniques was compiled recently by Hiver (2019).
Although most CDST-inspired papers have used quantitative analyses, several valuable
longitudinal case studies have used qualitative analyses (Lesonen et al., 2017; Roehr-Brackin,
2014). Both studies provide detailed qualitative analyses of the development over time of a
specific linguistic construction or way to express meaning in the L2. Since CDST is a
relatively young line of second language research, the methods of analysis can be expected to
advance in the years to come.
Figure 3.5 Moving window of correlation (vertical axis: correlation, -1 to 0.8; horizontal
axis: 0 to 17). Reprinted from Verspoor and Van Dijk (2011) with permission
(differences over time within a learner), sometimes with strong ups and downs, sometimes
with subsystems becoming more stable. This has important implications for researchers,
teachers, and learners.
For researchers, it means that if we want to investigate the process of development, we
must collect longitudinal data from individual learners, whether a single learner or a small
group of learners in a similar situation. We must know what we are looking for and why:
there should first be some theoretical motivation for the investigation, so that we know the
time needed and the measures we might trace. For example, if an instructional intervention
gives feedback on particular L2 sounds because cross-sectional studies have found positive
effects, then we need to trace the effects of the intervention not only while it takes place,
but also at several measuring moments afterwards. In such cases, we could take dense
measurements in several post-test sessions, for example, for 1 week about a month after the
intervention and for another week 2 months after the intervention. In making these decisions,
the researcher should estimate how long it takes to acquire a certain skill and when it may
be assumed to become rather stable.
For teachers, it means that they need to recognize that learning is a process of trial and
error and that language use and language development cannot be distinguished. Also,
some subsystems may need to be in place before others can develop. The conclusion that
language learning is strongly individually determined may not be good news for teachers
or school administrators, but it explains the need for personalized learning. The variable
nature of the individual learning trajectory also illustrates the need for different
approaches to assessment, in which tracking development over time using portfolios
may be more suitable than summative assessment at one moment. Moreover, it is
important to realize that there is no monocausality in (language) development. When applied
consistently to teaching, CDST requires a strongly ecological and holistic
framework of second language pedagogy. A fully worked out application of CDST
pedagogy in an ecological framework is found in Glenn Levine’s recent MLJ Monograph
(Levine, 2020).
7 Future Directions
Despite the growing number of CDST-inspired studies of second language development and
despite the advancement of methods and analyses, additional methodological innovations are
required for further development of the field. One of the challenges is the paradox of the research
dimensions. There is a continuous desire for generalizations about the process of language
development. However, on the one hand, it is impossible to generalize individual data, while on the
other hand, groups of learners cannot be followed over time due to ergodicity constraints. One
possible way around this problem is to use cluster analyses to identify ergodic ensembles of
learners showing similar behaviour over time. The first steps in this direction have been made
(Peng et al., 2020), but there is still a long way to go. The recent book by Hiver and Al-Hoorie
(2019) mentions several other promising methods for future development.
The missing link for speaking research is the CDST analysis of interaction over time
during conversation. Promising developments have shown the application of GridWare
(Hollenstein, 2013) to create dynamic state space grids that analyze the attractor states in
interaction. The work of Smit et al. (2017) on student–teacher interaction in the classroom
setting is a promising step in this direction.
Further Reading
Hiver, P., & Al-Hoorie, A. H. (2016). A dynamic ensemble for second language research: Putting
complexity theory into practice. The Modern Language Journal, 100(4), 741–756.
In this contribution, Hiver and Al-Hoorie review CDST research and sketch new directions for
investigating second language development within this framework. The authors provide a
template for methodological considerations for scholars who aspire to carry out CDST-inspired
research.
Larsen-Freeman, D., & Cameron, L. (2008). Complex systems and applied linguistics. Oxford: Oxford
University Press.
This is a comprehensive and very accessible overview of all aspects of CDST applications to research
into second language development. A must-read for people interested in this framework.
Levine, G. (2020). A human ecological language pedagogy. The Modern Language Journal, 104(S).
Levine has written a very comprehensive monograph on ecological language pedagogy using the CDST
framework as a starting point. He works out all implications of complexity in an up-to-date discussion
of language teaching in the ecological context of world readiness.
Lowie, W. M., Verspoor, M. H., & Van Dijk, M. (2018). The acquisition of L2 speaking: A dynamic
perspective. In R. Alonso Alonso (Ed.), Speaking in a second language (pp. 106–125). Amsterdam/
Philadelphia: John Benjamins.
This is a study that specifically focuses on studying oral skills from a CDST-perspective, partly in
contrast to writing skills.
References
Baba, K., & Nitta, R. (2014). Phase transitions in development of writing fluency from a complex
dynamic systems perspective. Language Learning, 64(1), 1–35. doi: 10.1111/lang.12033
Bertalanffy, L. von. (1995). General system theory: foundations, development, applications (Rev. ed.).
Braziller. https://rug.on.worldcat.org/oclc/36200371
Block, D. (2003). The social turn in second language acquisition. Edinburgh: Edinburgh University
Press.
Browman, C. P., & Goldstein, L. (1992). Articulatory phonology: An overview. Phonetica, 49(3–4),
155–180. doi: 10.1159/000261913
Caspi, T. (2010). A dynamic perspective on second language development [PhD dissertation]. Groningen:
University of Groningen.
Chan, H., Verspoor, M. H., & Vahtrick, L. (2015). Dynamic development in speaking versus writing in
identical twins. Language Learning, 65(2), 298–325. doi: 10.1111/lang.12107
de Bot, K. (2017). Complexity theory and dynamic systems theory: Same or different? In Complexity
theory and language development: In celebration of Diane Larsen-Freeman (pp. 51–58). Amsterdam:
John Benjamins.
de Bot, K. (2015). A history of applied linguistics: From 1980 to the present. London:
Routledge. http://public.ebookcentral.proquest.com/choice/publicfullrecord.aspx?p=1983433
de Bot, K., Lowie, W. M., & Verspoor, M. H. (2005). Second language acquisition, an advanced resource
book. London: Routledge.
de Bot, K., Lowie, W. M., & Verspoor, M. H. (2007). A dynamic systems theory approach to second
language acquisition. Bilingualism: Language and Cognition, 10(1), 7–21. doi: 10.1017/S1366728906002732
De Jong, N. H., & Wempe, T. (2009). Praat script to detect syllable nuclei and measure speech rate
automatically. Behavior Research Methods, 41(2), 385–390. doi: 10.3758/BRM.41.2.385
Derwing, T. M., & Munro, M. J. (2013). The development of L2 oral language skills in two L1 groups:
A 7-year study. Language Learning, 63(2), 163–185. doi: 10.1111/lang.12000
Dörnyei, Z. (2009). Individual differences: Interplay of learner characteristics and learning environment.
Language Learning, 59(Suppl. 1), 230–248. http://www.scopus.com/inward/record.url?eid=2-s2.0-73149118934&partnerID=40&md5=e9f751871baa290aa2b63306b9f8468d
Dykstra-Pruim, P. (2003). Speaking, writing, and explicit-rule knowledge: Toward an understanding of
how they interrelate. Foreign Language Annals, 36(1), 66–76.
Evans, D. R., & Larsen-Freeman, D. (2020). Bifurcations and the emergence of L2 syntactic structures
in a complex dynamic system. Frontiers in Psychology, 11, 2823. doi: 10.3389/fpsyg.2020.574603
Firth, A., & Wagner, J. (1997). SLA property: No trespassing! Modern Language Journal, 82(1), 91–94.
Flege, J. E. (1995). Second language speech learning. Theory, findings and problems. In W. Strange
(Ed.), Speech perception and linguistic experience (pp. 233–277). Baltimore: York Press.
Flege, J. E. (1999). Age of learning and second language speech. In D. Birdsong (Ed.), Second language
acquisition and the critical period hypothesis (pp. 101–131). Mahwah, NJ: Lawrence Erlbaum.
Hepford, E. A. (2017). Dynamic second language development: the interaction of complexity, accuracy,
and fluency in a naturalistic learning context. Philadelphia, PA: Temple University.
Herdina, P., & Jessner, U. (2002). A dynamic model of multilingualism. Perspective of change in psy-
cholinguistics. Clevedon: Multilingual Matters.
Hiver, P., & Al-Hoorie, A. H. (2016). A dynamic ensemble for second language research: Putting
complexity theory into practice. The Modern Language Journal, 100(4), 741–756. doi: 10.1111/
modl.12347
Hiver, P., & Al-Hoorie, A. H. (2019). Research methods for complexity theory in applied linguistics.
Clevedon: Multilingual Matters. doi: 10.21832/9781788925754. https://rug.on.worldcat.org/oclc/1138500858
Holden, J. G., Van Orden, G. C., & Turvey, M. T. (2009). Dispersion of response times reveals cog-
nitive dynamics. Psychological Review, 116(2), 318–342. doi: 10.1037/a0014849
Hollenstein, T. (2013). State space grids: depicting dynamics across development. New York:
Springer. doi: 10.1007/978-1-4614-5007-8. https://rug.on.worldcat.org/oclc/858887575
Housen, A., De Clercq, B., Kuiken, F., & Vedder, I. (2019). Multiple approaches to complexity in
second language research. Second Language Research, 35(1), 3–21. doi: 10.1177/0267658318809765
Hulstijn, J. (2020). Proximate and ultimate explanations of individual differences in language use and
language acquisition. Dutch Journal of Applied Linguistics. doi: 10.1075/dujal.19027.hul
Kyle, K., Crossley, S., & Verspoor, M. (2020). Measuring longitudinal writing development using indices
of syntactic complexity and sophistication. Studies in Second Language Acquisition, 1–32. doi: 10.1017/s0272263120000546
Lantolf, J., & Thorne, S. (2006). Sociocultural theory and the genesis of second language development.
Oxford: Oxford University Press.
Larsen-Freeman, D., & Cameron, L. (2008). Complex systems and applied linguistics. Oxford University
Press.
Larsen-Freeman, D. (1997). Chaos/complexity science and second language acquisition. Applied
Linguistics, 18(2), 141–165. http://www.scopus.com/inward/record.url?eid=2-s2.0-0040151244&partnerID=40&md5=68465fc5cd8f0bc3db4bf803ce3ce4ce
Larsen-Freeman, D. (2015). Ten ‘lessons’ from complex dynamic systems theory: What is on offer. In
Z. Dörnyei, P. D. MacIntyre, & A. Henry (Eds.), Motivational dynamics in language learning
(pp. 11–19). Clevedon: Multilingual Matters. doi: 10.21832/9781783092574-004
Larsen-Freeman, D., & Cameron, L. (2008). Research methodology on language development from a
complex systems perspective. The Modern Language Journal, 92(2), 200–213.
Lesonen, S., Suni, M., Steinkrauss, R., & Verspoor, M. (2017). From conceptualization to construc-
tions in Finnish as an L2. Pragmatics & Cognition, 24(2), 212–262. doi: 10.1075/pc.17016.les
Levine, G. (2020). A human ecological language pedagogy. The Modern Language Journal, 104(S).
Lima Júnior, R. M. (2013). Complexity in second language phonology acquisition. Revista Brasileira de
Linguística Aplicada, 13(2), 549–576. doi: 10.1590/s1984-63982013005000006
Lowie, W. M. (2013). L2 phonological development: a plea for a dynamic, process-based methodology.
Presentation at the international symposium on the acquisition of second language speech (New Sounds
2013), Montreal, Canada, 17–19 May 2013.
Lowie, W. M. (2017). Lost in state space? Methodological considerations in Complex Dynamic Theory
approaches to second language development research. In L. Ortega & Z. Han (Eds.), Complexity
theory and language development in celebration of Diane Larsen-Freeman (pp. 123–141). Amsterdam:
John Benjamins Publishing Company. doi: 10.1075/lllt.48.07low
Lowie, W. M., Caspi, T., Van Geert, P., & Steenbeek, H. (2011). Modeling development and change. In
M. H. Verspoor, K. De Bot, & W. Lowie (Eds.), A dynamic approach to second language develop-
ment: methods and techniques (pp. 22–122). Amsterdam: John Benjamins.
Lowie, W. M., Van Dijk, M., Chan, H., & Verspoor, M. H. (2017). Finding the key to successful L2
learning in groups and individuals. Journal of Language Teaching and Learning, 7(1), 127–148. doi: 10.14746/ssllt.2017.7.1.7
Lowie, W. M., & Verspoor, M. H. (2015). Variability and variation in second language acquisition
orders: A dynamic reevaluation. Language Learning, 65(1), 63–88. doi: 10.1111/lang.12093
Lowie, W. M., Verspoor, M. H., & Van Dijk, M. (2018). The acquisition of L2 speaking: A dynamic
perspective. In R. Alonso Alonso (Ed.), Speaking in a second language (pp. 106–125). Amsterdam:
John Benjamins.
Lowie, W. M., & Verspoor, M. H. (2019). Individual differences and the ergodicity problem. Language
Learning, 69(S1), 184–206. doi: 10.1111/lang.12324
Molenaar, P. C. M. (2015). On the relation between person-oriented and subject-specific approaches.
Journal for Person-Oriented Research, 1(1–2), 34–41. doi: 10.17505/jpor.2015.04
Murakami, A. (2020). On the sample size required to identify the longitudinal L2 development of
complexity and accuracy indices. In W. M. Lowie, M. Michel, A. Rousse-Malpat, M. Keijzer, & R.
Steinkrauss (Eds.), Usage-based dynamics in second language development (pp. 20–49). Clevedon:
Multilingual Matters.
Norris, J. M., & Ortega, L. (2009). Towards an organic approach to investigating CAF in instructed
SLA: The case of complexity. Applied Linguistics, 30(4), 555–578. doi: 10.1093/applin/amp044
Ortega, L. (2003). Syntactic complexity measures and their relation to L2 proficiency: a research synthesis
of college-level L2 writing. Applied Linguistics, 24(4), 492–518.
Peng, H., Jager, S., Thorne, S. L., & Lowie, W. (2020). A holistic person-centred approach to mobile-
assisted language learning. In W. M. Lowie, M. Michel, A. Rousse-Malpat, M. Keijzer, & R.
Steinkrauss (Eds.), Usage-based dynamics in second language development (pp. 87–106). Clevedon:
Multilingual Matters. doi: 10.21832/9781788925259-007
Plat, R., Lowie, W., & de Bot, K. (2018). Word naming in the L1 and L2: A dynamic perspective on
automatization and the degree of semantic involvement in naming. Frontiers in Psychology, 8, 2256.
doi: 10.3389/fpsyg.2017.02256
Polat, B., & Kim, Y. (2014). Dynamics of complexity and accuracy: A longitudinal case study of
advanced untutored development. Applied Linguistics, 35(2), 184–207. doi: 10.1093/applin/amt013
Rhea, C. K., Kiefer, A. W., Wittstein, M. W., Leonard, K. B., MacPherson, R. P., Wright, W. G., &
Haran, F. J. (2014). Fractal gait patterns are retained after entrainment to a fractal stimulus. PLoS
One, 9(9), e106755. doi: 10.1371/journal.pone.0106755
Roehr-Brackin, K. (2014). Explicit knowledge and processes from a usage-based perspective: The de-
velopmental trajectory of an instructed L2 learner. Language Learning, 64(4), 771–808. doi: 10.1111/
lang.12081. https://rug.on.worldcat.org/oclc/5694457243
Smit, N., van de Grift, W., de Bot, K., & Jansen, E. (2017). A classroom observation tool for scaf-
folding reading comprehension. System, 65, 117–129.
Smit, N., Van Dijk, M., De Bot, K., & Lowie, W. M. (in press). The complex dynamics of adaptive
teaching. International Review of Applied Linguistics.
Smith, L. B., & Thelen, E. (2003). Development as a dynamic system. Trends in Cognitive Sciences, 7(8),
343–348.
Spoelman, M., & Verspoor, M. H. (2010). Dynamic patterns in the development of accuracy and
complexity: a longitudinal case study on the acquisition of Finnish. Applied Linguistics, 31(4),
532–553.
Thelen, E., & Smith, L. B. (1994). A dynamic systems approach to the development of cognition and
action. Cambridge, MA: MIT Press.
Van Dijk, M., Verspoor, M. H., & Lowie, W. M. (2011). Variability and DST. In M. Verspoor, K. De
Bot, & W. Lowie (Eds.), A dynamic approach to second language development: methods and techni-
ques (pp. 55–84). Amsterdam: John Benjamins.
Van Geert, P., & Van Dijk, M. (2002). Focus on variability: New tools to study intra-individual
variability in developmental data. Infant Behavior and Development, 25(4), 340–375.
Verspoor, M. H., De Bot, K., & Lowie, W. M. (2011). A dynamic systems approach to second language
development: methods and techniques. In N. Spada & N. Van Deusen-Scholl (Eds.), Language
learning & language teaching 29. Amsterdam: John Benjamins.
Verspoor, M. H., Lowie, W. M., & de Bot, K. (2021). Variability as normal as apple pie. Linguistics
Vanguard, 7(s2).
Verspoor, M. H., & Van Dijk, M. (2011). Visualizing interactions between variables. In M. H.
Verspoor, K. De Bot, & W. Lowie (Eds.), A dynamic approach to second language development:
methods and techniques (pp. 85–98). Amsterdam: John Benjamins.
Waninge, F., Dörnyei, Z., & de Bot, K. (2014). Motivational dynamics in language learning: Change,
stability, and context. The Modern Language Journal, 98(3), 704–723.
Yu, H., & Lowie, W. (2019). Dynamic paths of complexity and accuracy in second language speech: A
longitudinal case study of Chinese learners. Applied Linguistics, 41(6), 855–877. doi: 10.1093/applin/
amz040
Yuan, F., & Ellis, R. (2003). The effects of pre-task planning and on-line planning on fluency, com-
plexity and accuracy in L2 monologic oral production. Applied Linguistics, 24(1). doi: 10.1093/
applin/24.1.1
Zhang, W., & Ding, N. (2017). Time-domain analysis of neural tracking of hierarchical linguistic
structures. NeuroImage, 146, 333–340. doi: 10.1016/j.neuroimage.2016.11.016
4
SOCIOCULTURAL APPROACHES
TO SPEAKING IN SLA
Victoria Surtees and Patricia Duff
1 Introduction/Definitions
Speaking is a powerful mode of human communication, learning, and sociality. It has
therefore been central to much sociocultural and sociolinguistic research on first and second
language (L2) development, as well as in studies of bilingualism, multilingualism, and other
kinds of learning. Many second language acquisition (SLA) scholars interested in oral lan-
guage development (e.g., see other chapters in this volume) view speaking through a psy-
chological, cognitive, or linguistic lens, breaking L2 oral production down into measurable
components such as pronunciation, fluency, accuracy, and comprehensibility. From a so-
ciocultural stance, speaking is seen as one observable (audible, performed) aspect of inter-
action through which meanings are constructed. In addition, speaking is viewed as a means
of potential socialization into linguistic, cultural, and other perspectives and practices as
speakers convey aspects of their identities and interests or positionalities. From this per-
spective, features of speech, such as prosody, lexical choice, or interaction (e.g., questioning
or turn-taking patterns), must be understood in terms of participants’ ability to achieve
mutual understanding, index their social meanings and identities, accomplish goals, and
participate effectively in recognizable and legitimate activities within a community.
Sociocultural theorizing in SLA is broad, multifaceted, and interdisciplinary. For that
reason, we use the plural form when describing sociocultural “theories.” Some sociocultural
work has been deeply informed by Vygotskian theory, which examines the development of
cognition or mental processes through experiences of mediated social interaction (often
spoken interaction) and intersubjectivity (for an overview of Vygotskian-informed socio-
cultural approaches, see Lantolf et al., 2018). By intersubjectivity, we mean coming to shared
understandings and alignments with one another’s views and ways of speaking often within
the context of a particular activity.
Other sociocultural work, including our own research, is informed by language sociali-
zation theory, which tends to be less explicitly Vygotskian and instead draws on sociology,
sociolinguistics, cultural psychology, and linguistic anthropology to a greater extent (e.g.,
Duff, 2007). Increasingly foregrounded in language socialization are social constructs such as
identity, power relations, and intersections among race, gender, and other social categories
that may position language learners and their ways of speaking (or their silences) in ad-
vantageous or disadvantageous ways (Duff, 2019). In addition, this sociocultural work
DOI: 10.4324/9781003022497-6
(unlike most Vygotskian L2 studies) often examines language ideologies (e.g., perceived
status of languages, language varieties, or “accents,” and speakers of those languages),
communities or networks of practice, forms of capital (economic, social, and cultural) at
speakers’ disposal, and relationships between social structures and human agency (e.g.,
Bourdieu, 1991; Darvin & Norton, 2015). Sociocultural approaches may also examine fac-
tors affecting individuals’ and groups’ access to opportunities to learn and use languages, to
take turns in interactions, and to receive meaningful feedback on their contributions. Finally,
the research often examines learners’ trajectories and forms of participation in their various
communities over time.
Situated sociocultural processes work in tandem with, but clearly at different levels or
through different systems from, cognitive processes of perceiving, internalizing, mastering,
and producing particular linguistic constructions. These processes also occur across different
scales of time and space, and can be realized in, or carry traces of those settings and histories in,
even a single utterance, linguistic form, or interaction, such as the exclamation He’s so woke!
(i.e., sociopolitically aware or sensitized). Similarly, the correction of certain forms of speech
by others (e.g., speakers, teachers) conveys their identities and language ideologies and not
simply phonological accuracy – for example, when a particular phonetic form common in a
non-standard dialect is disallowed in classrooms where pronunciation conforming to a more
standard variety is required (Duff, 2019; Friedman, 2010). In summary, in this chapter we
describe sociocultural approaches to the study of speaking in SLA that analyze not only the
production of L2 speech by learners, but also their emerging communicative repertoires,
networks, and communities in transnational, intercultural contexts.
2 Historical Perspectives
Early sociolinguistic research on the ethnography of communication and L2 pragmatics (see
Yates, this volume) emphasized that speaking – or participating in particular speech com-
munities and their oral cultural practices – requires learning how to communicate by using
particular linguistic forms and genres while taking into account social and situational vari-
ables. Such variables might include registers, social status, roles, power differentials, social
distance from speakers, and ways of displaying affective stances such as excitement or
gravitas (Hymes, 1964). These forms of knowledge or competence, such as ways of making
situation-appropriate requests, develop over time through observation, experience, and
mediation (i.e., socialization) by others. One example is Li’s (2000) case study of women
learning ESL/skills in workplace programmes in the United States where workers developed
strategies to successfully make requests in English through explicit and implicit socialization
via coursework and other social encounters. This socialization generally occurs through
meaningful, scaffolded social interaction and sometimes explicit instruction. However, not all
modelling provided by “expert” speakers is taken up by learners or novices, who may de-
velop their own strategies or preferences (Duff & Talmy, 2011).
Another important theoretical framework, alluded to earlier, that gives prominence to the
role of social interaction in learning draws on Vygotsky and Activity Theory (Lantolf &
Thorne, 2006; Lantolf et al., 2018). Vygotskian theory and research in the area of L2
speaking first appeared in the 1990s (e.g., Lantolf, 1994) and has gained considerable ground
since. This work focuses on aspects of mediation, scaffolding, inner speech, and inter-
subjectivity in learning, and how social, embodied experience (including the use of gestures)
facilitates the internalization of knowledge. It also investigates the development of mental
processes and concepts, among which are linguistic ones. Through these mediated or scaf-
folded learning experiences, many of which involve speech, individuals become better able,
over time, to regulate their own learning and use of language according to their own pur-
poses (Lantolf et al., 2018). Two intersecting areas of current Vygotskian sociocultural re-
search on L2 speaking focus on the utility of (1) concept-based instruction (e.g., around such
linguistic concepts as tense, aspect, mood, and voice, often taught explicitly through sche-
matic diagrams representing the concepts and relationships among them) and (2) meta-
linguistic talk about language (also known as languaging) that learners engage in while
gaining a deeper understanding of linguistic concepts and usage (see examples in Lantolf
et al., 2018).
In recent years, a greater emphasis across sociocultural approaches has been placed not
only on speech and one’s primary speech communities – important though those are – but also
on the kinds of multimodal communicative or semiotic repertoires that accompany or
sometimes replace speech (e.g., Early et al., 2015). Therefore, included in contemporary
theorization within sociocultural SLA are embodied expressions of meaning, such as gestures,
facial expressions, gaze, images, and written texts (e.g., Martin-Beltrán, 2010), and the role of
silence in addition to speech (e.g., Morita, 2004). Furthermore, multilingual resources that
facilitate the comprehension and production of meanings have become more central to SLA
theory, research, and pedagogies drawing from sociocultural approaches to SLA. The focus
on repertoires and multilingual resources is also salient in the Douglas Fir Group’s (2016)
article on transdisciplinary, multiscalar approaches to SLA and is found
in a burgeoning area of research dealing with the development of interactional competence in
SLA (e.g., Hall et al., 2011). The Douglas Fir Group (DFG) (2016) proposal could be broadly
construed as a “dynamic sociocultural/sociocognitive systems” framework that seeks to
contextualize and interpret the development and use of “speaking” abilities in another language
across both local and wider social contexts and scales (see Duff, 2019). The DFG framework
and sociocultural principles contained within it encourage the examination of speaking
(input or exposure, opportunities, interactions, experience, performance, impact, feedback,
change) to the degree possible or relevant within and across macro (societal), meso (in-
stitutional), and micro (social–interactional, linguistic–indexical) levels and the larger spec-
trum of valued social–semiotic repertoires referred to earlier.
Thus, when conducting research on people engaged in a classroom L2 learning task or on
study-abroad students’ L2 interactions outside of class, sociocultural researchers consider
elements of the activity, the social context, the materials and spaces (virtual, face-to-face,
coffee shop, lab) involved, the division of labour among participants and activity objectives,
the emotions involved that may mitigate learning or performance, and the negotiation of
particular meanings in discourse and how these change over time. Not all of these areas of
potential enquiry will be relevant in all socioculturally oriented studies of L2 speaking de-
velopment. However, there has been a growing recognition over the past two decades that
various extralinguistic contextual factors combined with linguistic ones contribute to lear-
ners’ experiences, motivations, and performance in important ways.
may have limited access to opportunities to use the L2 or to receive timely and valuable
assistance supporting their learning. Thus, sociocultural research on speaking requires at-
tention to the broader social contexts of learning and using language, and then actual
practice.
Diao (2016) examined L2 Chinese learners’ use of sentence-final particles in Mandarin (a/ya,
la, me, o, eh/ye), forms associated with “girl talk” among China’s urban youth. She
compared the developmental trajectories of three undergraduate American SA students, Tuzi
and Mac (both male) and Ellen (female). Drawing on recorded data of informal conversa-
tions between roommates over a semester, Diao analyzed patterns in participants’ use of the
particles as well as the thematic content of participants’ conversations. Findings show how
participants’ particle use shifted following conversations with roommates about the link
between femininity and frequently used stance particles. Ellen’s particle use increased to align
more closely with that of her female roommate, while Tuzi’s use decreased as he became
aware that overuse could be interpreted as an “effeminate” speech style. Mac, who did not
explicitly talk about gender, love, or relationships with his roommate, continued to omit the
particles from his speech entirely. Diao’s findings point to the importance of metalinguistic
talk about the social meanings of language forms. Other SA studies have observed a similar
role for explicit discussions of social meanings. For example, Surtees (2018) found that
Japanese SA students in Canada developed a range of discursive strategies related to asking
for language help following their peers’ repeated offers to help with English “when asked.”
scaffolded or mediated through various forms of oral interaction. For example, Al Masaeed
(2016) examines how Arabic learners successfully scaffold their accomplishment of a paired
discussion task in the L2 by using English to resolve linguistic difficulties.
Other work, particularly in English-medium university settings, adopts an academic dis-
course socialization perspective to examine how university students take up and participate
in oral academic tasks (Kobayashi et al., 2017). For example, Kobayashi (2016) described
how one undergraduate international student, Otome, was socialized into delivering oral
academic presentations in English during her one-year sojourn at a Canadian university.
Kobayashi’s findings highlight how the teachers’ criteria for “good presentations” centred
around content organization, such as the inclusion of critical and comparative talk, and
effective use of paralinguistic cues, such as eye contact. He highlighted how Otome became
sensitized to the expectations of the instructor through interactions with the instructor and
her peers. For instance, Otome observed how the instructor used textbooks and research to
support his claims during lectures and decided during her second presentation to use evi-
dence from the textbook. Thus, as with most sociocultural work, speaking development was
conceptualized with reference to local criteria for success – criteria that went beyond issues of
pronunciation or grammatical accuracy to incorporate key features of the genre, including
content selection and audience rapport.
2005; van Compernolle, 2014). Typically, the researcher is present while recording and taking
field notes as the interaction unfolds. However, adult or university-level learners may in some
cases be asked to record their own interactions with mobile phones or handheld recorders and
to submit them to researchers later. This practice of remote recording gives participants more
discretion over the data collected and shared and, at the same time, the process is less intrusive
than having the researcher present. Data are typically interpreted from a participant-relevant
(or emic) perspective to examine how participants themselves orient to and co-construct local
meanings. An advantage of this emic approach is that it resists the use of external standards or
benchmarks to evaluate learners’ oral production and thus allows researchers to capture and
conceptualize the value of non-standard communicative practices, such as translanguaging
(freely intermixing languages or dialects) and language play (e.g., Martin-Beltrán, 2010; Al
Masaeed, 2016) or even transgressive, oppositional behaviours and speech (e.g., Talmy, 2008).
Findings from sociocultural studies are often reported as case studies to provide sufficient
contextualization and depth when describing individuals or small groups of learners.
Typically, cases in sociocultural research are interpretive – they focus on a phenomenon as it
occurs in a specific context. A case can include the experiences and language use of a single
focal learner, or the shared practices of a larger institution such as a school, or the nesting of
cases within cases. Often, researchers will select several focal cases within a research project
to illustrate contrasting findings. For example, in the Kinginger et al. (2014) study on SA
students’ mealtime conversations with Chinese host families, findings were presented for two
participants with significantly differing initial proficiencies, thus providing an example of
contrasting experiences. Alternatively, focal participants may be selected because they re-
present a “typical” developmental trajectory, or conversely, because they are exceptional in
some way. Explaining the rationale for recruitment and selection is important when con-
textualizing cases within the larger sample of learners and within broader societal issues and
theoretical questions. A key potential strength of case study research (and some other ap-
proaches) is that it leads to new understandings, awareness, and (possibly) empathy, and to
the realization that we are dealing with whole people with histories and aspirations, and not
simply data. In this sense, the research is intended to be transformative for the reader as well
as (potentially) the field (Duff, 2014).
Since case studies often involve multiple data sources, sociocultural work may make use
of multiple methods of data analysis including narrative analysis, descriptive and thematic
analysis (often for the purposes of contextualization or examining meso/macro-level phe-
nomena), and some form of linguistic or discourse analysis to investigate the linguistic re-
sources used (lexical, grammatical, pragmatic, etc.). In her study on racialized identities in
SA learning of L2 Portuguese by American students in Brazil, Anya (2016) presents all of the
following analyses: a thematic analysis of participants’ perceptions of race, a descriptive
discourse analysis of students’ overt references to identity categories in talk, and a critical
discourse analysis of participants’ interactions focusing on genres, speech acts, interactive
frames, and stance-taking practices. By employing multiple analytic methods, researchers
such as Anya can make clearer connections between phenomena at the macro and meso
levels (e.g., discourses about race and racial categories) and speaking practices at the micro
level (e.g., stance-taking practices that index racial identities). While only the micro-level
analyses are about speaking performance in the strict sense, macro-level analyses answer
important questions about factors that shape interlocutors’ linguistic choices, willingness to
engage in particular oral interactions, and the reactions of other interlocutors to learners’
language use.
The design of sociocultural research can vary widely in its epistemological commitments,
and may take up cognitive, interactionist, or critical orientations. Vygotskian-inspired L2
Victoria Surtees and Patricia Duff
studies that focus on concept-based instruction are quite cognitivist in the sense that they aim
to understand changing conceptions and control over particular linguistic forms (e.g., tense-
aspect marking). In contrast, work that takes a critical perspective, such as De Costa’s (2014)
study, which adopts a language socialization framework, focuses on macro-systems of power
and their relationship to micro-level events. De Costa’s research involved a year-long
ethnographic study of high school students studying abroad in Singapore in which he conducted
extensive field observations and became highly involved in the everyday life of the
school. His research questions reflect his commitment to critical enquiry and focus on how
macro-ideologies about cosmopolitanism were enacted at the level of interaction and how
those interactions impacted “students’ development of a cosmopolitan outlook and set of
linguistic practices” (p. 13). Informed by the work of post-structural scholars such as
Bourdieu (1991), De Costa collected and analyzed policy documents from the school to
demonstrate macro-level discourses. Then, using insights from interactional sociolinguistics
and conversation analysis, he examined how those ideologies were reproduced and trans-
formed in actual instances of classroom interaction with students and teachers. His findings
demonstrate how changes in students’ oral language use are linked to external societal
structures and values. As this example illustrates, the types of information that a sociocultural
researcher chooses to gather and the ways they are analyzed largely depend on their
epistemological commitments and initial research questions. Critical readers of sociocultural
research should reflect carefully on the ways in which a researcher’s chosen methods align
with their theorizations of the learner, language, learning, power, and society.
Sociocultural Approaches to Speaking
competence – learners’ abilities to jointly negotiate meaning – are increasingly common (e.g.,
Galaczi, 2014). This form of assessment evaluates the range of discursive moves and turn-
taking strategies learners are able to employ in their L2. Other sociocultural researchers such
as Poehner and Lantolf (2005) advocate dynamic assessment, in which examiners interact
with examinees during oral exams to better understand the way learners process language,
and to lead them to new cognitive stages of development (see also Lantolf et al., 2018). This
approach to assessment prioritizes the sociocultural notion that language learning is ne-
cessarily a mediated process and draws on the Vygotskian notion of the zone of proximal
development. Dynamic assessment has been incorporated successfully into assessments of
oral pragmatic knowledge. For example, Van Compernolle (2014) engaged learners of
French in a task in which they judged the appropriateness of the pronominal address forms
tu and vous; however, rather than completing these tasks independently, the learners
engaged in cooperative dialogue with a tutor and explained their choices and thinking as they
completed the task.
The connections that sociocultural research has shown between language use and identity
also have implications for how practitioners select content and design curricula. Most so-
ciocultural work highlights the importance of teaching material that allows students to
cultivate an L2 identity that they value. For example, Lin and Man (2011) drew on findings
from sociocultural research related to identity to create a rap-based extra-curricular English
programme for youth in Hong Kong. The authors explain that they designed the curriculum
around the identity of the young emcee “with the idea that many students would find in ELT
RAP a space to reconcile their mixed feelings about English” (p. 205). Rap and hip hop were
particularly appealing to the students in the programme as a medium through which youth
worldwide speak out against social injustice. The rhyming elements of rap also afforded
opportunities for phonetic development, and writing “good” rap required students to engage
with current events to expand their active vocabulary.
Sociocultural research on speaking does not usually compare pedagogical methods or
evaluate performance based on external criteria; it is conducted in very specific learning
contexts. Whereas some studies are unlikely to provide teachers with explicit guidance about
how to teach specific features of oral language in their classrooms, others have developed
elaborate methods for raising learners’ awareness of linguistic concepts, meanings, and re-
lationships. Implications of this research focus on expanding the awareness of researchers,
teachers, learners, and programmes of the complex contextual and linguistic factors that
influence oral language use and learning. Furthermore, sociocultural approaches urge
practitioners and researchers to not view speaking as a context-independent skill but rather
to consider for what purpose and with whom students will be using oral language. They
encourage practitioners to think critically about the learning processes in their own contexts
by asking questions such as:
• What social meanings are attributed to the language and linguistic forms I am teaching
by my learners, the school/institution, and society?
• How will awareness of those social meanings impact learners’ ability to participate in
their desired communities?
• What messages does my teacher-talk communicate to students about their identities and
abilities as learners or as multilingual speakers and how does it scaffold their learning?
• How do different modes of meaning-making (e.g., visual, gestural, textual; and in-
structional vs. experiential) work together with oral language to produce optimal con-
ditions for language learning in my specific learning space?
This approach advocates for pedagogies that raise students’ awareness of these questions as
well as teaching that provides students with different options for expressing social meanings
in particular contexts. In sum, findings from sociocultural work challenge the input/output
conceptualization of language acquisition, which dominated much cognitive–interactionist
theory and pedagogy in SLA, in favour of dialogic/mediated understandings of speaking that
conceptualize all communication as interaction.
7 Future Directions
Writing this chapter during a unique and tumultuous period of COVID-19 self-isolation, on
the one hand, and anti-racist protests around the world, on the other, we offer some final
remarks on future directions for SLA research. These two acute circumstances are relevant to
sociocultural research on L2 speaking and learning in several ways. First, in relation to
COVID-19 constraints, more speaking and learning is taking place online than ever before,
typically mediated by synchronous video communication platforms (e.g., Zoom). However,
such platforms transform the nature of interaction and learning, due to limited screen size,
bandwidth, lag time or overlap between turns, the potential for concurrent chatting through
sidebar conversations, and other features (e.g., eliminating sound or video images through
muting buttons, or forgetting to unmute them when starting to speak, causing unexpected
delays in taking turns). It remains to be seen how this critical epidemiological phenomenon
and rapidly emerging new tools and forms of learning and communicating will change
pedagogical and participatory structures for L2 learning and speaking in either the near
term or longer term. Nor is it clear how SLA research designs and pedagogies in relation to
speaking will adapt accordingly. This is an area with substantial intriguing possibilities for
research, theory, and practice.
Second, the current global surge of social unrest confronting racism and violence,
coupled with urgent calls for decolonization and Indigenous language revitalization, requires
a collective reckoning regarding our priorities and practices as applied linguists,
educators, and citizens. We are urged to re-examine the curriculum, modes of access to and
participation in education and in intercultural interactions as well as in civic society. We
must critically examine representations of speakers, contexts, and positionalities in learning
materials, in research, in the range of L2s (and multilingual repertoires) examined, and in
our research practices that reproduce the marginalization of certain people, cultures, and
languages. Although marginalization and oppression are historically-rooted sociological
and political processes, they continue to impact public and private perceptions of the value
of certain populations and experiences, heritage/Indigenous languages, locally and inter-
nationally valued ways of learning and participating, and people’s sense of legitimacy,
safety, and purpose in society. All of these factors affect people’s engagements with
learning and their future trajectories and well-being. Chapters in Burdelski and Howard
(2020) provide examples of how alienation occurs through speech activities (despite directives
by teachers insisting on inclusive, respectful practices) as well as other behaviours,
attitudes, and groupings of participants, such as who will or will not dance or play with
minoritized children in an elementary school based on perceptions of the inferiority of the
Other. These larger discourses are highly relevant in sociocultural studies of SLA as they
may reproduce widespread historical and contemporary processes of exclusion and neglect
rather than foster greater inclusion, social participation, and opportunities for upward
mobility.
Further Reading
Douglas Fir Group. (2016). A transdisciplinary framework for SLA in a multilingual world. The
Modern Language Journal, 100, 19–47.
An interdisciplinary framework and ten principles for considering SLA across macro, meso, and micro
levels of analysis, integrating sociocultural and other compatible theories.
Duff, P., & May, S. (Eds.). (2017). Language socialization. Encyclopedia of language and education
(3rd edn). Cham, Switzerland: Springer.
This edited volume highlights one sociocultural approach to language learning: language socialization.
The authors examine the learning and use of a variety of oral and signed languages across a range of
learning contexts and across the lifespan.
Lantolf, J. P., Poehner, M., & Swain, M. (Eds.). (2018). The Routledge handbook of sociocultural theory
and second language development. New York: Routledge.
An authoritative overview of current Vygotskian-inspired sociocultural theory and concepts in SLA
with pedagogical implications.
References
Al Masaeed, K. (2016). Judicious use of L1 in L2 Arabic speaking practice sessions. Foreign Language
Annals, 49(4), 716–728. doi:10.1111/flan.12223
Anya, U. (2016). Racialized identities in second language learning: Speaking blackness in Brazil.
New York: Routledge.
Artemeva, N., & Fox, J. (2011). The writing’s on the board: The global and the local in teaching
undergraduate mathematics through chalk talk. Written Communication, 28(4), 345–379.
doi:10.1177/0741088311419630
Avni, S. (2012). Translation as a site of language policy negotiation in Jewish day school education.
Current Issues in Language Planning, 13(2), 77–89. doi:10.1080/14664208.2012.678976
Bourdieu, P. (1991). Language and symbolic power. [G. Raymond & M. Adamson, Trans.]. Cambridge,
MA: Harvard University Press.
Burdelski, M. J., & Howard, K. M. (Eds.). (2020). Language socialization in classrooms: Culture, in-
teraction, and language development. Cambridge: Cambridge University Press.
Darvin, R., & Norton, B. (2015). Identity and a model of investment in applied linguistics. Annual
Review of Applied Linguistics, 35, 36–56. doi:10.1017/S0267190514000191
De Costa, P. (2014). Reconceptualizing cosmopolitanism in language and literacy education: Insights
from a Singapore school. Research in the Teaching of English, 49(1), 9–30. https://www.jstor.org/
stable/24398662
Diao, W. (2016). Peer socialization into gendered L2 Mandarin practices in a study abroad context:
Talk in the dorm. Applied Linguistics, 37(5), 599–620. doi:10.1093/applin/amu053
Dings, A. (2014). Interactional competence and the development of alignment activity. The Modern
Language Journal, 98(3), 742–756. doi:10.1111/j.1540-4781.2014.12120.x
Douglas Fir Group. (2016). A transdisciplinary framework for SLA in a multilingual world. The
Modern Language Journal, 100, 19–47. doi:10.1111/modl.12301
Duff, P. A. (2007). Second language socialization as sociocultural theory: Insights and issues. Language
Teaching, 40(4), 309–319. doi:10.1017/S0261444807004508
Duff, P. A. (2014). Case study research on language learning and use. Annual Review of Applied
Linguistics, 34, 233–255. doi:10.1017/S0267190514000051
Duff, P. A. (2019). Social dimensions and processes in second language acquisition: Multilingual so-
cialization in transnational contexts. The Modern Language Journal, 103, 6–22. doi:10.1111/
modl.12534
Duff, P. A., & Talmy, S. (2011). Language socialization approaches to second language acquisition. In
D. Atkinson (Ed.), Alternative approaches to second language acquisition (pp. 95–116). New York:
Routledge. doi:10.4324/9780203830932
Duff, P. A., Wong, P., & Early, M. (2000). Learning language for work and life: The linguistic so-
cialization of immigrant Canadians seeking careers in healthcare. Canadian Modern Language
Review, 57(1), 9–57. doi:10.3138/cmlr.57.1.9
DuFon, M. A. (2006). Socialization of taste during study abroad in Indonesia. In M. A. DuFon & E.
Churchill (Eds.), Language learners in study abroad contexts (pp. 91–119). Clevedon: Multilingual
Matters.
Early, M., Kendrick, M., & Potts, D. (2015). Multimodality: Out from the margins of English language
teaching. TESOL Quarterly, 49, 447–460. doi:10.1002/tesq.246
Eriks‐Brophy, A., & Crago, M. (2003). Variation in instructional discourse features: Cultural or lin-
guistic? Evidence from Inuit and non‐Inuit teachers of Nunavik. Anthropology & Education
Quarterly, 34(4), 396–419. doi:10.1525/aeq.2003.34.4.396
Friedman, D. (2010). Speaking correctly: Error correction as a language socialization practice in a
Ukrainian classroom. Applied Linguistics, 31(3), 346–367. doi:10.1093/applin/amp037
Galaczi, E. D. (2014). Interactional competence across proficiency levels: How do learners manage
interaction in paired speaking tests? Applied Linguistics, 35(5), 553–574. doi:10.1093/applin/amt017
Hall, J. K. (2019). Essentials of SLA for L2 teachers: A transdisciplinary framework. New York:
Routledge.
Hall, J. K., Hellermann, J., & Pekarek Doehler, S. (Eds.). (2011). L2 interactional competence and
development. Bristol, UK: Multilingual Matters.
Hymes, D. (1964). Introduction: Toward ethnographies of communication. American Anthropologist,
66(6), 1–34. doi:10.1525/aa.1964.66.suppl_3.02a00010
Iino, M. (2006). Norms of interaction in a Japanese homestay setting: Toward a two-way flow of
linguistic and cultural resources. In M. A. DuFon & E. Churchill (Eds.), Language learners in study
abroad contexts (pp. 151–202). Clevedon: Multilingual Matters.
Kinginger, C. (2008). Language learning in study abroad: Case studies of Americans in France. The
Modern Language Journal, 92, 1–124. doi:10.1111/j.1540-4781.2008.00821.x
Kinginger, C., & Belz, J. A. (2005). Socio-cultural perspectives on pragmatic development in foreign
language learning: Microgenetic case studies from telecollaboration and residence abroad.
Intercultural Pragmatics, 2(4), 369–421. doi:10.1515/iprg.2005.2.4.369
Kinginger, C., Lee, S.-H., Wu, Q., & Tan, D. (2014). Contextualized language practices as sites for
learning: Mealtime talk in short-term Chinese homestays. Applied Linguistics, 1–26. doi:10.1093/
applin/amu061
Kinginger, C., & Wu, Q. (2018). Learning Chinese through contextualized language practices in study
abroad residence halls: Two case studies. Annual Review of Applied Linguistics, 38, 102–121.
doi:10.1017/S0267190518000077
Kobayashi, M. (2016). L2 academic discourse socialization through oral presentations: An under-
graduate student’s learning trajectory in study abroad. Canadian Modern Language Review, 72(1),
95–121. doi:10.3138/cmlr.2494
Kobayashi, M., Zappa-Hollman, S., & Duff, P. (2017). Academic discourse socialization. In P. Duff &
S. May (Eds.), Language socialization. Encyclopedia of language and education (3rd edn,
pp. 239–254). Cham, Switzerland: Springer.
Lantolf, J. P. (1994). Sociocultural theory and second language learning. Modern Language Journal,
78(4), 418–420. doi:10.2307/328580
Lantolf, J. P., Poehner, M., & Swain, M. (Eds.). (2018). The Routledge handbook of sociocultural theory
and second language development. New York: Routledge.
Lantolf, J. P., & Thorne, S. L. (2006). Sociocultural theory and the genesis of second language devel-
opment. Oxford: Oxford University Press.
Li, D. (2000). The pragmatics of making requests in the L2 workplace: A case study of language
socialization. Canadian Modern Language Review, 57(1), 58–87. doi:10.3138/cmlr.57.1.58
Lin, A., & Man, E. (2011). Doing-hip-hop in the transformation of youth identities: Social class, ha-
bitus, and cultural capital. In C. Higgins (Ed.), Negotiating the self in a second language: Identity
formation and cross-cultural adaptation in a globalizing world (pp. 201–220). London: Equinox.
Martin–Beltrán, M. (2010). The two‐way language bridge: Co‐constructing bilingual language learning
opportunities. The Modern Language Journal, 94(2), 254–277. doi:10.1111/j.1540-4781.2010.01020.x
Moore, L. C. (2013). Qur’anic school sermons as a site for sacred and second language socialisation. Journal
of Multilingual and Multicultural Development, 34(5), 445–458. doi:10.1080/01434632.2013.783036
Morita, N. (2004). Negotiating participation and identity in second language academic communities.
TESOL Quarterly, 38(4), 573–603. doi:10.2307/3588281
Norton, B. (2013). Identity and language learning: Extending the conversation. Bristol, UK: Multilingual
Matters.
Poehner, M. E., & Lantolf, J. P. (2005). Dynamic assessment in the language classroom. Language
Teaching Research, 9(3), 233–265. doi:10.1191/1362168805lr166oa
Ro, E., & Burch, A. R. (2020). ‘Willingness to communicate/participate’ in action: A case study of
changes in a recipient’s practices in an L2 book club. Linguistics and Education, 58, 100821.
doi:10.1016/j.linged.2020.100821
Shively, R. L. (2013). Learning to be funny in Spanish during study abroad: L2 humor development.
The Modern Language Journal, 97(4), 930–946. doi:10.1111/j.1540-4781.2013.12043.x
Shively, R. L. (2018). Language socialisation during study abroad: Researching interactions outside the
classroom. In S. Coffey & U. Wingate (Eds.), New directions for research in foreign language edu-
cation (pp. 97–112). New York: Routledge.
Siegal, M. (1996). The role of learner subjectivity in second language sociolinguistic competency:
Western women learning Japanese. Applied Linguistics, 17(3), 356–382. doi:10.1093/applin/17.3.356
Surtees, V. (2018). Peer language socialization in an internationalized study abroad context: Norms for
talking about language (Doctoral dissertation). University of British Columbia.
Talmy, S. (2008). The cultural productions of the ESL student at Tradewinds High: Contingency,
multidirectionality, and identity in L2 socialization. Applied Linguistics, 29(4), 619–644. doi:10.1093/
applin/amn011
Theodórsdóttir, G. (2018). L2 teaching in the wild: A closer look at correction and explanation
practices in everyday L2 interaction. The Modern Language Journal, 102, 30–45. doi:10.1111/
modl.12457
Van Compernolle, R. A. (2014). Sociocultural theory and L2 instructional pragmatics. Bristol:
Multilingual Matters.
Van Compernolle, R. A. (2015). Interaction and second language development: A Vygotskian perspective.
Philadelphia: John Benjamins.
Zappa-Hollman, S. (2007). Academic presentations across post-secondary contexts: The discourse
socialization of non-native English speakers. Canadian Modern Language Review, 63(4), 455–485.
doi:10.3138/cmlr.63.4.455
Zuengler, J., & Miller, E. R. (2006). Cognitive and sociocultural perspectives: Two parallel SLA
worlds? TESOL Quarterly, 40(1), 35–58. doi:10.2307/40264510
5
APTITUDE AND INDIVIDUAL DIFFERENCES
Joan C. Mora
1 Introduction/Definitions
Speaking in a second language is very challenging. Besides difficulties in finding the right
words and using the L2 grammar appropriately, learners often struggle to produce un-
familiar sounds. Their utterances often contain hesitations, inappropriate silences, and
mispronounced words, all of which make speech difficult to understand.
L2 pronunciation research is concerned with learners’ development and acquisition of the
sound system of a second language (L2 phonology), and with pronunciation instruction and
assessment (Derwing & Munro, 2015; this volume). Acquiring an L2 sound system involves
learning to perceive and produce individual sounds (vowels and consonants), knowing how
they combine to form syllables and words (phonotactics), and learning the appropriate
rhythm and intonation patterns (prosody), all of which shape the way utterances are pro-
duced and perceived. Perceptual dimensions of L2 speech, such as accentedness (speech
nativelikeness measured as degree of perceived foreign accent), perceived fluency, compre-
hensibility (ease of understanding), and intelligibility (actual amount of speech understood)
reflect L2 speech learning and development and are therefore also sensitive to sources of
individual differences.
Current L2 speech acquisition models such as the Speech Learning Model (SLM; Flege,
1995) and its revised version (SLM-r; Flege & Bohn, 2021) and the Perceptual Assimilation
Model of L2 speech learning (PAM-L2; Best & Tyler, 2007) aim to account for L2 learners’
speech development in naturalistic second language acquisition (SLA). These models focus
mainly on the perception and production of individual sounds and their mental re-
presentations (phonetic categories), in particular sound contrasts that are functionally im-
portant in language because they distinguish meaning (e.g., the vowels in cat and cut).
L2 fluency research investigates the properties and development of speaking fluency
(smooth delivery of speech), comprising three dimensions of speech production: utterance
fluency, cognitive fluency, and perceived fluency (Kahng, this volume; Segalowitz, 2010,
2016). Utterance fluency can be measured in terms of temporal properties (e.g., speech rate),
breakdowns (e.g., pauses and hesitations) and repairs (e.g., repetitions, false starts and re-
formulations that allow learners to overcome breakdowns). Cognitive fluency is the efficient
use of mechanisms and processes such as fast retrieval of words from memory, while per-
ceived fluency is the listeners’ perception of a speaker’s output. L2 fluency research has
DOI: 10.4324/9781003022497-7
investigated the relationships among utterance fluency measures (e.g., speech rate), listeners’
judgements (Suzuki & Kormos, 2020), and speakers’ cognitive fluency (Kahng, 2020), as well
as the predictability of L2 utterance fluency from parallel L1 measurement (de Jong
et al., 2015).
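The three families of utterance fluency measures named above (speed, breakdown, repair) can be illustrated with a minimal sketch. It assumes a hand-annotated speech sample; the `SpeechSample` structure, its field names, and the 0.25-second pause threshold are illustrative assumptions rather than part of any study cited in this chapter.

```python
from dataclasses import dataclass
from typing import List

# Assumption: silences of at least 0.25 s count as pauses. The chapter does not
# specify a threshold; adjust this value to match the study design.
PAUSE_THRESHOLD_S = 0.25


@dataclass
class SpeechSample:
    """Hypothetical hand-annotated speech sample (all fields are illustrative)."""
    syllables: int            # number of syllables produced
    duration_s: float         # total time in seconds, silences included
    silences_s: List[float]   # duration of each silent interval, in seconds
    repairs: int              # repetitions, false starts, and reformulations


def speech_rate(sample: SpeechSample) -> float:
    """Speed measure: syllables per minute of total time."""
    return sample.syllables * 60 / sample.duration_s


def pauses_per_minute(sample: SpeechSample) -> float:
    """Breakdown measure: silences at or above the threshold, per minute."""
    n_pauses = sum(1 for s in sample.silences_s if s >= PAUSE_THRESHOLD_S)
    return n_pauses * 60 / sample.duration_s


def repairs_per_100_syllables(sample: SpeechSample) -> float:
    """Repair measure: repairs normalized by the amount of speech produced."""
    return sample.repairs * 100 / sample.syllables


# Invented one-minute sample, for illustration only.
sample = SpeechSample(
    syllables=220,
    duration_s=60.0,
    silences_s=[0.4, 0.6, 0.3, 0.2, 0.5],
    repairs=4,
)
print(speech_rate(sample), pauses_per_minute(sample), repairs_per_100_syllables(sample))
```

Normalizing pauses by time and repairs by syllables is one common convention; other normalizations would serve equally well as long as they are applied consistently across speakers.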
At all levels of proficiency, learners vary in their success in resolving L2 speech production
difficulties and consequently in their comprehensibility. Even for L2 learners with similar L2
learning histories, variability in speech is huge. For example, whereas some advanced L1-
Spanish learners of English may effectively distinguish the English word cat /kæt/ from cut
/kʌt/ in production, many others do not, pronouncing both words with their perceptually
closest L1 equivalent /a/. Similarly, some learners may speak English at near-native rates
(approximately 240 syllables per minute), whereas others are much slower, ranging from 110
to 230 syllables per minute (Mora & Valls-Ferrer, 2012). Thus, some L2 learners’ pro-
nunciation may be easy to understand while that of others may be detrimental to intellig-
ibility; some speak fluently without apparent difficulties, while others pause frequently,
compromising comprehension. Individual differences research aims to understand and de-
scribe the sources of this inter-learner variability and seeks to identify the factors that may
explain how well L2 learners speak, especially at ultimate attainment levels, that is, when the
L2 learner has reached a relatively advanced level of L2 competence and use that is no longer
progressing substantially. Such factors have been investigated from a variety of approaches
and perspectives, and conceptualizations of aptitude and individual differences abound (e.g.,
Doughty, 2019; Dörnyei, 2005; Kormos, 2013; Robinson, 2012).
In this chapter, individual differences are understood to comprise a number of predictors
of ultimate attainment in L2 speech that can be categorized as age- and experience-related
factors (age of onset of L2 learning, amount and quality of L2 exposure and use, L1
background, learning contexts), sociopsychological factors (motivation, personality, anxiety,
learning strategies, willingness to communicate), and cognitive and aptitude-related factors
(working memory, acoustic and phonological memory, attention, inhibition, and auditory
processing skills).
analysis also function automatically. However, in the L2 these mechanisms function less
efficiently because representations are less accurately defined and the processes to access
them less automatized.
As a consequence of the processing difficulties outlined earlier, L2 speech is usually less
accurate and less complex grammatically, lexically and phonologically, temporally less
fluent, and much more variable than L1 speech, which makes it less comprehensible and
harder to process for native and non-native listeners (Munro & Derwing, 1995b).
Understanding the sources of this variability is important for pedagogical, scientific, and
sociopsychological reasons. Pedagogically, it will help to improve instruction through tai-
loring of classroom tasks to individual learner characteristics, as well as assessment methods.
From a scientific point of view, it will help us to gain a better understanding of the me-
chanisms and processes involved in the acquisition of L2 speech, and it will help to develop
and extend current models of L2 speech learning that still do not take individual differences
into account (but see Flege & Bohn, 2021). Finally, it will help us to better understand the
psychological and social dimension of L2 speaking, such as the impact of learners’ speech on
their self-confidence and social integration (Segalowitz, 2016).
2 Historical Perspectives
The study of language learning aptitude and individual differences has a long tradition in the
field of SLA (Dörnyei & Ryan, 2015; Skehan, 2014) and is primarily motivated by the ob-
servation that language learners vary in their ability to master an L2 and the notion that
identifying learner characteristics leading to successful language learning will benefit L2
learning and teaching.
Early research on individual differences focused on identifying language aptitude
components. Carroll and Sapon’s (1959) Modern Language Aptitude Test (MLAT) identified
four main components of language learning aptitude: phonemic coding ability (sound
coding involving sound retention, retrieval, and recognition), grammatical sensitivity
(capacity to identify grammatical relationships), inductive language learning skill (ability
to extract syntactic and morphological patterns and use them for further language pro-
cessing), and associative memory (ability to establish memory links between L1 and L2
vocabulary items). Of these components, sound sequence recognition (for lexicalization
processes) and phonemic coding ability (for input processing) seem obvious sources of
individual differences. A more recent test battery, LLAMA (Meara, 2005), has been used
extensively to assess language learning aptitude. In particular, LLAMA-E (phonemic
coding ability) has been associated with segmental and suprasegmental pronunciation
accuracy (Saito, 2019a), and LLAMA-D (sound sequence recognition) has been found to
be positively related to L2 learners’ development of comprehensibility and speed and
breakdown fluency (Saito et al., 2019). Another test battery, specially designed to predict
very high levels of L2 proficiency, is the High-Level Language Aptitude Battery (Hi-LAB;
Doughty, 2019). It extends the MLAT by incorporating several working memory com-
ponents, but its speech-related components (auditory perceptual acuity: phonemic dis-
crimination and categorization) have not been investigated extensively yet as predictors of
L2 speech learning. Granena (2019) recently combined a measure of sound recognition
(LLAMA-D) and one of facilitation of lexical processing (Hi-LAB’s ALTM synonym) into
a single implicit memory ability predictor, and found it to predict L2 speech rate
(see Part 4). In general, the amount of variance explained by aptitude sub-tests in outcome
speech measures appears to be very modest (8%–10%; Granena, 2019; Saito, 2019b; Saito
et al., 2019).
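As a rough gauge of what 8%–10% of explained variance implies, the arithmetic below assumes a single aptitude predictor in a simple linear regression, where the proportion of variance explained equals the squared Pearson correlation. This is an assumption made for illustration; the cited studies may have used other models.

```latex
R^2 = r^2 \;\Rightarrow\; r = \sqrt{R^2}, \qquad \sqrt{0.08} \approx 0.28, \qquad \sqrt{0.10} \approx 0.32
```

Under that assumption, the reported range corresponds to correlations of roughly 0.28 to 0.32 between an aptitude sub-test and a speech measure.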
Extensive research in the 1980s–1990s from L2-immersion settings focused on age- and
experience-related factors and demonstrated their role in determining ultimate attainment in
L2 speech learning, often with a focus on L2 pronunciation accuracy and accentedness. Such
factors include the extent to which the L1 and the L2 differ phonetically (L1 background),
the age of onset of L2 learning (AOL) or age of first extensive exposure to the L2, L2
experience (often operationalized as length of residence in an L2 speaking environment
(LoR) or as amount and quality of L2 input received) and frequency and amount of L2 use
(for a review see Bohn & Munro, 2007). However, the large number and variety of factors
examined make it difficult to determine which contribute the most to L2 speech learning.
Whereas some studies concluded that the strongest predictors of L2 pronunciation accuracy
were L1 background and motivation to speak the L2 well, others found AOL, LoR, and
amount of L1 and L2 use to be the most important predictors (e.g., Flege et al., 1995). A
similar picture emerges with regard to L2 speaking fluency. L2 use factors such as amount of
interaction with native speakers and socializing in the L2 appear to be important predictors
of attainment in immigrant populations (Derwing & Munro, 2013), but other studies have
found L2 experience to be less predictive of speaking fluency than AOL (Trofimovich &
Baker, 2006). The outcome of this research, however, suggests that at least in immersion
settings an early start is better, and L2 use and exposure impact L2 speech development
positively. This contrasts sharply with findings from research conducted in foreign language
classroom contexts, where age- and experience-related factors appear to contribute little to
L2 speech acquisition due to the limited exposure and L2 use conditions in FL classrooms. In
this context, learner characteristics other than those related to L2 exposure and use, such as
aptitude (Saito et al., 2019) or motivation (Saito et al., 2017), may explain a considerable
amount of inter-learner variability in L2 speech learning.
Research on individual differences in L2 speech learning over the past 20 years has
experienced a shift of focus towards cognitive aptitude skills (e.g., memory, attention, and
inhibition) and psychological factors (e.g., motivation and anxiety). Unlike psychological
predictors, cognitive language aptitude is unique in that it is componential (different
components underlie different aspects of performance), relatively stable in adulthood, and does
not change substantially with experience (Doughty, 2019). However, its contribution to L2
speech is difficult to assess, as the implication of different aptitude components in language
performance varies in strength, depending on the linguistic dimension investigated (Li, 2016).
Joan C. Mora
more confident in speaking the L2. Similarly, L2 learners’ ability to inhibit their L1 when
speaking the L2 may minimize L1-to-L2 interference in phonology (Darcy et al., 2016), but
the amount of interference is likely to be modulated by the amount of L1 and L2 exposure
and use. In sum, determining which factors influence (and to what extent) specific aspects of
L2 speaking performance and development is challenging because studies can include only a
limited set of factors and their effects on L2 speech are difficult to isolate. In addition,
interactions between experiential, psychological-, cognitive-, and aptitude-related sources of
individual differences in L2 speech (the black arrows in Figure 5.1) are under-researched.
Currently, questions addressed by most individual differences research mainly fall within one
of the following three categories:
1. To what extent does a given factor (e.g., L2 vocabulary size) affect L2 performance (e.g.,
pause frequency in an oral narrative task)?
2. To what extent does a given factor (e.g., auditory selective attention) affect a given L2
learning outcome (e.g., gains in the production of an L2 vowel contrast after phonetic
training)?
3. Which of a set of factors (e.g., phonemic coding ability, motivation, and hours of
instruction) contributes the most to a given L2 learning outcome or to L2 performance
(e.g., pronunciation accuracy, comprehensibility, or speaking fluency)?
Most research addressing these questions has examined the speech performance of L2
learners differing in age, experience, or aptitude profiles at a single point in time and
cross-sectionally rather than longitudinally (but see Derwing & Munro, 2013). The outcome of this
research indicates that L2 input quantity and quality are fundamental in L2 speech learning,
whereas aptitude factors have a more modest influence. However, few researchers to date
have examined the relative contribution of experiential and aptitude factors together in a
single study, and those who have report a complex relationship between aptitude, experience
and L2 speech learning. For example, Saito et al. (2019) found classroom experience to be
associated with improvement in comprehensibility, while phonemic coding ability was
associated with the development of fluency and prosody. Saito et al. (2020) found segmental
pronunciation accuracy to be influenced by phonemic coding ability but experience factors
(in-class and out-of-class L2 use) influenced word stress, intonation, and speaking fluency. A
similar picture emerges from recent research on cognitive skills and L2 speech learning (see
part 4 below). Current findings therefore suggest that L2 speech learning is very complex,
with multiple factors contributing to the development of different dimensions of L2 speech
and to varying degrees at different stages in acquisition.
L2 Speech Measures
L2 speech measures are based on perception and production data (analyzable in terms of
segmental, suprasegmental, and temporal properties) obtained through tasks that use a variety
of presentation formats and elicitation techniques (controlled or spontaneous) (Nagle et al., this
volume). For example, L2 learners’ perceptual sensitivity to the quality differences between the
vowels in the lexical contrast cat-cut can be measured through discrimination (same-different)
or identification tasks, whereas their discriminability in production can be measured through
acoustic analysis of the formant frequencies, perceptual judgements of pronunciation accuracy
(e.g., listeners’ judgements of accentedness on a Likert scale), or intelligibility tests involving
listeners’ transcriptions of auditorily presented materials (e.g., Saito & Plonsky, 2019).
However, segmental production accuracy data obtained through acoustic analyses and
perceptual judgements need not match perfectly, as a small acoustic difference may have a
substantial impact on listeners’ perception, and vice versa (Munro, 1993; Pérez-Ramón et al.,
2020). Measures of L2 learners’ speaking fluency (as well as measures of comprehensibility)
require more than word- or sentence-long stretches of speech and are typically obtained from
20–30 seconds of speech elicited through a picture-based oral narrative. Recorded speech
samples are then analysed in terms of their temporal, lexical, and grammatical properties with
the help of speech analysis software. Measures of speech rate (syllables per second including
pausing time), articulation rate (mean syllable duration computed by dividing speaking time
excluding pause time by the total number of syllables produced), pause frequency and duration
within and at clause boundaries, and number of dysfluencies per minute (repetitions and repairs)
are among the most commonly used measures reflecting the speed, breakdown, and repair
characteristics of L2 learners’ utterance fluency.
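The temporal measures listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular study's procedure: the `SpeechSample` structure and the function name are hypothetical, the pause list is assumed to contain only silent pauses already identified by the annotator (the choice of minimum pause length is left open), and articulation rate is reported both as mean syllable duration, as described above, and as its reciprocal in syllables per second.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SpeechSample:
    """Annotated summary of one recorded narrative (hypothetical structure)."""
    total_time_s: float              # full sample duration, pauses included
    syllables: int                   # number of syllables produced
    pause_durations_s: list[float]   # silent pauses already identified by the annotator
    dysfluencies: int                # repetitions plus repairs


def fluency_measures(sample: SpeechSample) -> dict[str, float]:
    """Compute speed, breakdown, and repair measures of utterance fluency."""
    pauses = sample.pause_durations_s
    pause_time = sum(pauses)
    speaking_time = sample.total_time_s - pause_time  # time spent articulating
    minutes = sample.total_time_s / 60
    return {
        # Speed: syllables per second, pausing time included
        "speech_rate": sample.syllables / sample.total_time_s,
        # Speed: speaking time (pauses excluded) divided by syllables produced
        "mean_syllable_duration": speaking_time / sample.syllables,
        # Speed: the reciprocal view, syllables per second of speaking time
        "articulation_rate": sample.syllables / speaking_time,
        # Breakdown: pause frequency and mean pause duration
        "pauses_per_minute": len(pauses) / minutes,
        "mean_pause_duration": pause_time / len(pauses) if pauses else 0.0,
        # Repair: repetitions and repairs per minute
        "dysfluencies_per_minute": sample.dysfluencies / minutes,
    }


sample = SpeechSample(total_time_s=30.0, syllables=90,
                      pause_durations_s=[0.5, 1.0, 1.5], dysfluencies=4)
for name, value in fluency_measures(sample).items():
    print(f"{name}: {value:.2f}")
```

Separating pauses within clauses from pauses at clause boundaries would require clause-level annotation, which this sketch does not model.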
estimates of participants’ learning histories and of L2 exposure and use, whereas
psychological factors are often measured through adaptations of well-established questionnaire
instruments (e.g., the Foreign Language Classroom Anxiety questionnaire by Horwitz et al.,
1986). Similarly, cognitive psychology offers SLA researchers a wide array of cognitive tests,
even for measuring a single cognitive skill. For example, inhibitory control has been
measured in SLA research through retrieval-induced forgetting (RIF) tasks (Darcy et al., 2016),
which measure participants’ capacity to suppress the activation of a lexical item to the point of
forgetting it by increasing the activation of competing lexical items; a Stroop task (Lev-Ari &
Peperkamp, 2014), in which participants need to inhibit colour names (e.g., “red”) when
asked to name the congruent or incongruent ink colour in which they are visually presented
(e.g., the word red presented in blue ink); or a Simon task (Linck et al., 2012), in which
participants are presented with red or blue boxes appearing on the right or left of the screen
and are instructed to press a right or left key associated with the box colour regardless of its
spatial position. The RIF and Stroop tasks are domain-specific because they are based on
lexical activation and involve linguistic interference, whereas the Simon task is
domain-general rather than language-oriented. Still, they are all meant to measure the same cognitive
skill. This is an important methodological aspect to consider when assessing cognitive
aptitude. Domain-general tasks allow researchers to obtain measures of cognitive individual
differences that are language independent, both in terms of the materials used and the
participants tested, whereas domain-specific linguistic-oriented tasks provide a cognitive
measure in a testing context that resembles the language use context where the cognitive skill
is required (e.g., speech production). It would therefore seem advisable to use speech-based
(rather than domain-general) tasks to obtain cognitive control measures used to predict
individual gains in L2 speaking performance.
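As an illustration of how a domain-general inhibitory control score might be derived from a Simon task, the sketch below computes an interference score as the difference in mean response time between incongruent and congruent trials, using correct responses only. This scoring rule is one common convention and is an assumption here, not a procedure taken from the studies cited above; the function name and the trial format are hypothetical.

```python
from statistics import mean


def simon_effect(trials):
    """Return mean RT on incongruent trials minus mean RT on congruent trials.

    `trials` is an iterable of (is_congruent, is_correct, rt_ms) tuples
    (a hypothetical format). Only correct responses are scored. A smaller
    value is read as more efficient inhibition of the irrelevant spatial cue.
    """
    trials = list(trials)
    congruent = [rt for is_congruent, is_correct, rt in trials
                 if is_correct and is_congruent]
    incongruent = [rt for is_congruent, is_correct, rt in trials
                   if is_correct and not is_congruent]
    return mean(incongruent) - mean(congruent)


example_trials = [
    (True, True, 400.0),    # box colour and screen side agree
    (True, True, 420.0),
    (False, True, 470.0),   # box appears on the side opposite its response key
    (False, True, 490.0),
    (False, False, 900.0),  # error trial, excluded from scoring
]
print(simon_effect(example_trials))
```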
Another methodological issue concerns distinguishing L1-based from L2-based sources of
individual differences. The speech production mechanisms and the cognitive processes that
support them (e.g., working memory) are essentially the same in L1 and L2 (Kormos, 2013)
and are not easy to disentangle. For example, pause frequency within clauses reflects
problems in formulation (lexical, syntactic, and phonological encoding) both in L1 and L2, so
that measures of pause frequency in the L2 may also reflect L1 pausing behaviour (de Jong
et al., 2015; Kahng, 2020). This does not preclude certain features of utterance fluency and
their underlying cognitive processes from reflecting L2-specific individual differences. For example,
clause-internal pauses are longer and more frequent in L2 than in L1 (de Jong, 2016) and
speed of lexical retrieval (Kahng, 2020) is slower in L2 than in L1. Thus, capturing the
individual variability in L2 utterance fluency uniquely attributable to cognitive processes
operating in L2 speech production requires L2-specific measures of utterance and cognitive
fluency. These can be obtained by using L1 data as a baseline, that is, by residualizing L2
utterance and cognitive measures against their corresponding L1 measures (e.g., Kahng,
2020; Segalowitz, 2016). This would help identify the cognitive predictors of individual
differences in L2 utterance fluency (e.g., speed of lexical access, articulatory skill, and
working memory) underlying L2-specific speech processing.
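Residualizing an L2 measure against its L1 counterpart can be sketched as follows. This is a minimal illustration that assumes a simple linear regression of the L2 measure on the L1 measure across speakers; the function name and the example values are hypothetical, and the cited studies may use different models. Each speaker's residual is the part of the L2 score that the L1 score does not predict.

```python
from statistics import mean


def residualize(l2_scores, l1_scores):
    """Regress L2 scores on L1 scores (ordinary least squares) and return residuals.

    Both arguments are equal-length sequences, one value per speaker.
    A positive residual means the speaker's L2 value is higher than
    their L1 behaviour alone would predict.
    """
    if len(l2_scores) != len(l1_scores):
        raise ValueError("Each speaker needs one L1 and one L2 score.")
    mean_l1, mean_l2 = mean(l1_scores), mean(l2_scores)
    sxx = sum((x - mean_l1) ** 2 for x in l1_scores)
    if sxx == 0:
        raise ValueError("L1 scores must vary across speakers.")
    sxy = sum((x - mean_l1) * (y - mean_l2)
              for x, y in zip(l1_scores, l2_scores))
    slope = sxy / sxx
    intercept = mean_l2 - slope * mean_l1
    return [y - (intercept + slope * x) for x, y in zip(l1_scores, l2_scores)]


# Hypothetical clause-internal pauses per minute for four speakers
l1_pauses = [2.0, 4.0, 6.0, 8.0]
l2_pauses = [5.0, 9.0, 10.0, 16.0]
print(residualize(l2_pauses, l1_pauses))
```

The residuals, rather than the raw L2 values, would then serve as the outcome to be predicted from cognitive measures.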
7 Future Directions
The key methodological issues outlined in Parts 3, 5, and 6 suggest that future research on
aptitude and individual differences in L2 speech learning would benefit substantially from
investing efforts in extending current methodological approaches. In particular, future
research should consider the following: investigating the sources of individual differences
dynamically and longitudinally (Nagle, 2018), increasing L2 speech assessment consistency
and homogeneity through the use of common research-informed measures and testing
methods (Saito & Plonsky, 2019), including or controlling for inter-related experiential,
psychological and cognitive variables, and using both domain-general and speech-based
tasks in the assessment of cognitive aptitude constructs. In addition, the benefits of
pronunciation instruction (Lee et al., 2015) and pedagogical interventions aiming at developing
L2 speaking fluency (Tavakoli et al., 2016) remain under-researched from an individual
differences perspective.
Further Reading
Andringa, S., & Dąbrowska, E. (Eds.) (2019). Individual differences in first and second language
ultimate attainment and their causes. Language Learning, 69(S1).
An edited collection of articles on individual differences in language acquisition.
Granena, G., Jackson, D. O., & Yilmaz, Y. (Eds.) (2016). Cognitive individual differences in second
language processing and acquisition. Amsterdam: John Benjamins.
Covers research on cognitive individual differences in second language acquisition and ultimate
attainment.
Hansen Edwards, J. G. (2017). Pronunciation and individual differences. In O. Kang, R. Thomson, &
J. Murphy (Eds.), The Routledge handbook of contemporary English pronunciation (pp. 385–398).
New York: Routledge.
A review of research on individual differences in L2 pronunciation including discussion of aptitude,
affective variables, gender and identity.
Kormos, J. (2015). Individual differences in second language speech production. In J. W. Schwieter
(Ed.), The Cambridge handbook of bilingual processing (pp. 369–388). Cambridge: Cambridge
University Press.
Reviews recent research on individual differences in second language speech production focusing on the
role of working memory, attention and affective variables like anxiety and willingness to communicate.
References
Astheimer, L. B., Berkes, M., & Bialystok, E. (2016). Differential allocation of attention during speech
perception in monolingual and bilingual listeners. Language, Cognition and Neuroscience, 31,
196–205.
Baddeley, A. (1986). Working memory. Oxford: Clarendon Press.
Baker Smemoe, W. & Haslam, N. (2013). The effect of language learning aptitude, strategy use and
learning context on L2 pronunciation learning. Applied Linguistics, 34, 435–456.
Best, C. & Tyler, M. (2007). Nonnative and second-language speech perception: Commonalities and
complementarities. In O.-S. Bohn & M. J. Munro (Eds.), Language experience in second language
speech learning: In honor of James Emil Flege (pp. 13–34). Amsterdam: John Benjamins.
Bohn, O.-S., & Munro, M. J. (Eds.) (2007). Language experience in second language speech learning.
Amsterdam: John Benjamins.
Carroll, J. B. & Sapon, S. (1959). Modern Language Aptitude Test: Form A. New York: Psychological
Corporation.
Darcy, I., & Mora, J. C. (2016). Executive control and phonological processing in language acquisition:
The role of early bilingual experience in learning an additional language. In G. Granena, D. O.
Jackson, & Y. Yilmaz (Eds.), Cognitive individual differences in L2 processing and acquisition
(pp. 247–275). John Benjamins.
Darcy, I., Mora, J. C., & Daidone, D. (2016). The role of inhibitory control in second language
phonological processing. Language Learning, 66(4), 741–773.
Darcy, I., Park, H., & Yang, C.-L. (2015). Individual differences in L2 acquisition of English
phonology: The relation between cognitive abilities and phonological processing. Learning and
Individual Differences, 40, 63–72.
de Jong, N. H. (2016). Predicting pauses in L1 and L2 speech: The effects of utterance boundaries and
word frequency. International Review of Applied Linguistics in Language Teaching, 54, 113–132.
de Jong, N. H., Groenhout, R., Schoonen, R., & Hulstijn, J. H. (2015). Second language fluency:
Speaking style or proficiency? Correcting measures of second language fluency for first language
behavior. Applied Psycholinguistics, 36(2), 223–243.
Derwing, T. M., & Munro, M. J. (2013). The development of L2 oral language skills in two L1 groups:
A 7‐year study. Language Learning, 63, 163–185.
Derwing, T. M., & Munro, M. J. (2015). Pronunciation fundamentals. Evidence-based perspectives for L2
teaching and research. Amsterdam: John Benjamins.
Dörnyei, Z. (2005). The psychology of the language learner: Individual differences in second language
acquisition. Mahwah, NJ: Lawrence Erlbaum.
Dörnyei, Z. (2006). Individual differences in second language acquisition. AILA Review, 19, 42–68.
Dörnyei, Z., & Ryan, S. (2015). The psychology of the language learner revisited. New York: Routledge.
Dörnyei, Z., & Ushioda, E. (2013). Teaching and researching motivation. Abingdon, Oxfordshire, UK:
Routledge.
Doughty, C. J. (2019). Cognitive language aptitude. Language Learning, 69, 101–126.
Flege, J. E. (1995). Second-language speech learning: Theory, findings, and problems. In Strange, W.
(Ed.), Speech perception and linguistic experience: Issues in cross-language research (pp. 229–273).
Timonium, MD: York Press.
Flege, J. E. (2009). Give input a chance! In T. Piske & M. Young-Scholten (Eds.), Input matters in SLA
(pp. 175–190). Bristol, UK: Multilingual Matters.
Flege, J. E., & Bohn, O.-S. (2021). The revised Speech Learning Model (SLM-r). In R. Wayland (Ed.),
Second language speech learning: Theoretical and empirical progress (pp. 3–83). Cambridge:
Cambridge University Press.
Flege, J. E., Munro, M. J., & MacKay, I. R. (1995). Factors affecting strength of perceived foreign
accent in a second language. The Journal of the Acoustical Society of America, 97(5), 3125–3134.
Granena, G. (2019). Cognitive aptitudes and L2 speaking proficiency: Links between LLAMA and
Hi-LAB. Studies in Second Language Acquisition, 41(2), 313–336.
Hu, X., Ackermann, H., Martin, J. A., Erb, M., Winkler, S., & Reiterer, S. M. (2013). Language
aptitude for pronunciation in advanced second language (L2) learners: behavioural predictors and
neural substrates. Brain and Language, 127, 366–376.
Jaeggi, S. M., Buschkuehl, M., Jonides, J., & Shah, P. (2011). Short- and long-term benefits of
cognitive training. Proceedings of the National Academy of Sciences, 108, 10081–10086.
Juffs, A., & Harrington, M. (2011). Aspects of working memory in L2 learning. Language Teaching, 44,
137–166.
Kahng, J. (2014). Exploring utterance and cognitive fluency of L1 and L2 English speakers: Temporal
measures and stimulated recall. Language Learning, 64(4), 809–854.
Kahng, J. (2020). Explaining second language utterance fluency: Contribution of cognitive fluency and
first language utterance fluency. Applied Psycholinguistics, 41, 457–480.
Kissling, E. M. (2014). What predicts the effectiveness of foreign-language pronunciation instruction?
Investigating the role of perception and other individual differences. Canadian Modern Language
Review, 70, 532–558.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah (NJ): Lawrence
Erlbaum.
Kormos, J. (2013). New conceptualizations of language aptitude in second language attainment. In G.
Granena & M. H. Long (Eds.), Sensitive periods, language aptitude, and ultimate L2 attainment.
(pp. 131–152). Amsterdam: John Benjamins.
Kormos, J. (2015). Individual differences in second language speech production. In J. W. Schwieter
(Ed.), The Cambridge handbook of bilingual processing (pp. 369–388). Cambridge: Cambridge
University Press.
Lee, J., Jang, J., & Plonsky, L. (2015). The effectiveness of second language pronunciation instruction:
A meta-analysis. Applied Linguistics, 36, 345–366.
Lengeris, A., & Hazan, V. (2010). The effect of native vowel processing ability and frequency
discrimination acuity on the phonetic training of English vowels for native speakers of Greek. Journal
of the Acoustical Society of America, 128, 3757–3768.
Lev-Ari, S., & Peperkamp, S. (2014). The influence of inhibitory skill on phonological representations
in production and perception. Journal of Phonetics, 47, 36–46.
Li, S. (2016). The construct validity of language aptitude: A meta-analysis. Studies in Second Language
Acquisition, 38, 801–842.
Linck, J. A., Kroll, J. F., & Sunderman, G. (2009). Losing access to the native language while immersed
in a second language: Evidence for the role of inhibition in second-language learning. Psychological
Science, 20, 1507–1515.
Linck, J. A., Schwieter, J. W., & Sunderman, G. (2012). Inhibitory control predicts language switching
performance in trilingual speech production. Bilingualism, 15(3), 651.
Meara, P. (2005). LLAMA language aptitude tests. Swansea: Lognostics.
Miyake, A., & Friedman, N. P. (2012). The nature and organization of individual differences in
executive functions: Four general conclusions. Current Directions in Psychological Science, 21, 8–14.
Mora, J. C., & Valls-Ferrer, M. (2012). Oral fluency, accuracy and complexity in formal instruction and
study abroad learning contexts. TESOL Quarterly, 46, 610–641.
Moyer, A. (2014). Exceptional outcomes in L2 phonology: The critical factors of learner engagement
and self-regulation. Applied Linguistics, 35, 418–440.
Muñoz, C. (2014). Contrasting effects of starting age and input on the oral performance of foreign
language learners. Applied Linguistics, 35(4), 463–482.
Munro, M. J. (1993). Productions of English vowels by native speakers of Arabic: Acoustic
measurements and accentedness ratings. Language and Speech, 36, 39–66.
Munro, M. J., & Derwing, T. M. (1995a). Foreign accent, comprehensibility, and intelligibility in the
speech of second language learners. Language Learning, 45(1), 73–97.
Munro, M. J., & Derwing, T. M. (1995b). Processing time, accent, and comprehensibility in the
perception of native and foreign-accented speech. Language and Speech, 38(3), 289–306.
Nagle, C. (2018). Motivation, comprehensibility, and accentedness in L2 Spanish: Investigating
motivation as a time-varying predictor of pronunciation development. The Modern Language Journal,
102, 199–217.
O’Brien, I., Segalowitz, N., Freed, B., & Collentine, J. (2007). Phonological memory predicts second
language oral fluency gains in adults. Studies in Second Language Acquisition, 29, 557–581.
Pérez Castillejo, S. (2019). The role of foreign language anxiety on L2 utterance fluency during a final
exam. Language Testing, 36, 327–345.
Pérez-Ramón, R., Cooke, M., & Lecumberri, M. L. G. (2020). Is segmental foreign accent perceived
categorically? Speech Communication, 117, 28–37.
Robinson, P. (2012). Individual differences, aptitude complexes, SLA processes, and aptitude test
development. In P. Robinson & M. Pawlak (Eds.), New perspectives on individual differences in
language learning and teaching (pp. 57–75). Berlin, Heidelberg: Springer.
Saito, K. (2019a). Individual differences in second language speech learning in classroom settings: Roles
of awareness in the longitudinal development of Japanese learners’ English /ɹ/ pronunciation. Second
Language Research, 35, 149–172.
Saito, K. (2019b). The role of aptitude in second language segmental learning: The case of Japanese
learners’ English /ɹ/ pronunciation attainment in classroom settings. Applied Psycholinguistics, 40(1),
183–204.
Saito, K., & Hanzawa, K. (2016). Developing second language oral ability in foreign language
classrooms: The role of the length and focus of instruction and individual differences. Applied
Psycholinguistics, 37(4), 813–840.
Saito, K., & Plonsky, L. (2019). Effects of second language pronunciation teaching revisited: A
proposed measurement framework and meta-analysis. Language Learning, 69(3), 652–708.
Saito, K., Dewaele, J. M., & Hanzawa, K. (2017). A longitudinal investigation of the relationship
between motivation and late second language speech learning in classroom settings. Language and
Speech, 60, 614–632.
Saito, K., Kachlicka, M., Sun, H., & Tierney, A. (2020). Domain-general auditory processing as an
anchor of post-pubertal L2 pronunciation learning: Behavioural and neurophysiological
investigations of perceptual acuity, age, experience, development, and attainment. Journal of Memory
and Language, 115, 104168.
Saito, K., Sun, H., & Tierney, A. (2020). Domain-general auditory processing determines success in
second language pronunciation learning in adulthood: A longitudinal study. Applied Psycholinguistics,
41(5), 1083–1112.
Saito, K., Suzukida, Y., & Sun, H. (2019). Aptitude, experience, and second language pronunciation
proficiency development in classroom settings: A longitudinal study. Studies in Second Language
Acquisition, 41, 201–225.
Sardegna, V. G., Lee, J., & Kusey, C. (2018). Self‐efficacy, attitudes, and choice of strategies for English
pronunciation learning. Language Learning, 68, 83–114.
Segalowitz, N. (2010). Cognitive bases of second language fluency. New York: Routledge.
Segalowitz, N. (2016). Second language fluency and its underlying cognitive and social determinants.
International Review of Applied Linguistics in Language Teaching, 54, 79–95.
Skehan, P. (2014). Individual differences in second language learning. Abingdon, Oxfordshire, UK:
Routledge.
Suzuki, S., & Kormos, J. (2020). Linguistic dimensions of comprehensibility and perceived fluency: An
investigation of complexity, accuracy, and fluency in second language argumentative speech. Studies
in Second Language Acquisition, 42(1), 143–167.
Tavakoli, P., Campbell, C., & McCormack, J. (2016). Development of speech fluency over a short
period of time: Effects of pedagogic intervention. TESOL Quarterly, 50, 447–471.
Teimouri, Y., Goetze, J., & Plonsky, L. (2019). Second language anxiety and achievement: A
meta-analysis. Studies in Second Language Acquisition, 41(2), 363–387.
Trofimovich, P., & Baker, W. (2006). Learning second language suprasegmentals: Effect of L2
experience on prosody and fluency characteristics of L2 speech. Studies in Second Language
Acquisition, 28(1), 1–30.
Xie, Y., Chen, Y., & Ryder, L. H. (2021). Effects of using mobile-based virtual reality on Chinese L2
students’ oral proficiency. Computer Assisted Language Learning, 34(3), 225–245.
6
LANGUAGE ANXIETY
Małgorzata Baran-Łucarz
1 Introduction/Definitions
Language anxiety (LA), one of the affective individual differences (IDs), has attracted the
attention of second language acquisition (SLA) researchers for at least four decades. In
SLA, the construct has been referred to as Foreign Language Classroom Anxiety, Foreign/
Second Language Anxiety or simply Language Anxiety. Here, the last term is used.
Horwitz et al. (1986) define it as “a distinct complex of self-perceptions, beliefs, feelings,
and behaviours related to classroom learning arising from the uniqueness of the language
learning process” (p. 128). Gardner and MacIntyre (1993) add that LA may not only
accompany second language (L2) learning but also appear “when a situation requires the
use of a second language with which the individual is not fully proficient” (p. 5).
Numerous studies have examined its nature, causes, correlates and, most importantly, the
effect it has on L2/foreign language (FL) learning and use. (L2 is an umbrella term which
encompasses FL but here the terms are used interchangeably.) MacIntyre and Gardner
(1991) considered LA “one of the best predictors of success” in FL learning (p. 96).
Meta-analyses conducted recently (Botes et al., 2020; Teimouri et al., 2019) provide firm
evidence for the detrimental effect of LA on FL achievement. Several early (e.g., Horwitz
et al., 1986; Phillips, 1992) and more contemporary studies (Gkonou, 2017;
Piechurska-Kuciel, 2008; Tóth, 2012) have suggested that among the skills most frequently generating
anxiety in L2 users is speaking.
The conceptualization of LA has changed over the years. Since the construct concerns a
specific kind of anxiety, the general phenomenon of anxiety is first discussed from a
psychological perspective. Then, the construct of LA is introduced in its various phases of
development, along with its causes and correlates. What follows is an overview of studies
examining the impact of LA specifically on oral performance, which simultaneously reveals
the approaches and instruments typically deployed to investigate this matter. Then, a brief
report is presented on results of meta-analyses, a construct of pronunciation anxiety, and
examples of studies following the current trend to explore LA from a Dynamic Systems
Theory (DST) point of view. Finally, practical hints on how to lower LA in the FL classroom
are delineated, along with areas for future LA research.
DOI: 10.4324/9781003022497-8
2 Historical Perspectives
Phases of LA Development
Explaining how the conceptualization of anxiety in SLA has changed over time, MacIntyre
(2017) suggests three phases: the Confounded Phase, Specialised Phase, and Dynamic Phase. In
the first phase, attempts were made to define and measure LA, by borrowing concepts and
instruments directly from psychology, without taking into account the specificity of the FL
learning process and context. According to Scovel (1978), some concepts were also misperceived,
which led to introducing overgeneralizations and chaos into SLA theory and research. Among
them were facilitative anxiety and debilitative anxiety, which were originally suggested to be
independent constructs, measured with the use of different scales, rather than two ends of one
continuum. The misinterpretation of concepts and misuse of instruments led to ill-formed
conclusions, such as the facilitative nature of anxiety in L2 learning. As Horwitz (2017) pointed
out, Scovel’s warning against applying anxiety measures directly from psychology to FL learning
situations indicated that a language-specific conceptualization of LA and an instrument to
measure it were needed. This conviction was further reinforced by the emergence of the
socio-educational model and development of an instrument – the Attitude Motivation Test Battery –
whose elements focused on anxiety experienced specifically when learning an FL (Gardner et al.,
1976). Data gathered with its use lent consistent support to the detrimental effect of anxiety on
L2 achievement and performance (MacIntyre & Gardner, 1991).
Scovel’s cautions and Gardner’s work contributed to the development of the construct of
anxiety referring specifically to L2 learning and use – Foreign/Second Language Anxiety – and
an instrument to measure it, introduced by Horwitz et al. (1986) in a pioneering article. This
marked the beginning of the Specialised Phase (MacIntyre, 2017), whose name derives from
the fact that L2 anxiety was treated as a language-specific construct. It is a phase dominated by
cross-sectional studies, focusing on exploring the correlates and causes of LA and its effects on
L2 achievements. Important developments of this stage are referenced in the next part.
Finally, a new approach – the Dynamic Approach – to researching LA has emerged in the
so-called Dynamic Phase (MacIntyre, 2017). The arrival of the new trend results from the
popularity of the Complex Dynamic Systems Theory (Larsen-Freeman, 1997). Based on its
main principles, LA is considered to fluctuate over time, rising and falling not only within
years and months, but also within minutes. This is because it is connected to several
contextual variables and IDs of L2 learners. The assumption has led to the application of more
classroom-oriented research methodologies, with longitudinal preferred over cross-sectional,
ethnographic research, design-based research, mixed-method research, and action research.
Examples of such studies are presented in further subsections.
When conceptualizing LA, Horwitz and her colleagues (1986) considered American
students’ comments regarding anxiety-generating experiences in their Spanish and French
classes. The participants explained that “speaking the language aloud, frequent testing
and fear of being negatively evaluated by their teachers and peers” were particularly
stressful (Horwitz, 2017, p. 15). Consequently, it was hypothesized that LA may be related
to such types of anxiety identified earlier by psychologists as communication
apprehension, test anxiety, and fear of negative evaluation. Communication apprehension refers to
the general fear of communicating with others (mainly orally), that is, transferring a
message effectively to the interlocutor, making oneself understood (McCroskey, 1984).
Test anxiety denotes high levels of stress experienced before, during, or after test taking,
with the typical physiological and behavioural symptoms of anxiety and its cognitive
limitations (Sarason, 1981). Finally, fear of negative evaluation refers to threat-related
thoughts caused by assuming that one’s performance will be negatively assessed by others
(Watson & Friend, 1969). It is important to stress that Horwitz et al. (1986) in their
seminal paper did not suggest LA is directly composed of these three anxiety types in
different proportions; instead, it was claimed that LA is “only analogous” to them
(Horwitz, 2017, p. 33).
To measure levels of LA, a 33-item self-report questionnaire, the Foreign Language
Classroom Anxiety Scale (FLCAS), was designed (Horwitz et al., 1986), with items referring
to apprehension experienced specifically while learning and using an L2, “evidenced by
negative performance expectancies and social comparisons, psycho-physiological symptoms,
and avoidance behaviours” (Horwitz et al., 1986, p. 559). It is this instrument, and adapted
shorter versions, that has been most frequently employed by SLA researchers exploring LA
(Botes et al., 2020; Teimouri et al., 2019). Studies have shown that LA is not simply a sum of
the three anxiety types mentioned above (e.g., Horwitz et al., 1986; Aida, 1994). More
recently, Park’s (2014) research led to the conclusion that communication apprehension either
in isolation or working in tandem with other factors (e.g., confidence) constitutes the “core
component of the FLCAS” (Horwitz, 2017, p. 36). As Horwitz claims, the data at our disposal
seem to support neither the tripartite model nor a unitary one, leaving the nature of LA
unresolved, which attests to its high complexity and ambiguity.
oral performance and LA. To examine the link, LA was most often measured with the
FLCAS or a modified version, while speaking ability was assessed with tasks such as
role-plays, discussions or spontaneous speech. Quantitative outcomes were often supple-
mented with qualitative data gathered via surveys or semi-structured interviews. Sometimes
triangulation was used (e.g., Liu, 2006), that is, data were gathered with a written ques-
tionnaire filled out by students, verified by teacher’s reflections and comments, and further
supplemented with observations of anxious and non-anxious students in the FL classroom.
As anxiety influences all stages of information processing, it can have a particularly
detrimental effect on L2 pronunciation. Data corroborate this assumption. Horwitz et al.
(1986) reported that high anxiety students complained about having problems with
“discriminating the sounds (…) of a target language” (p. 126). Participants in Derwing and
Rossiter’s (2002) study observed that their pronunciation deteriorated when they felt
stressed. Young (1991) concluded that students were aware their pronunciation was impaired
when talking in front of the teacher. These self-observations support Scovel’s (1978) claim
that “high language anxiety experienced when speaking causes stiffness of muscles, which in
turn results in a learner’s poor pronunciation” (Szyszka, 2017, p. 83). Moreover, students’
awareness of poor pronunciation may make them try even harder, generating more anxiety
and leading to poorer articulation, and eventually to a vicious circle (see Gregersen &
Horwitz, 2002).
Further evidence of the connection between LA and pronunciation was provided by
Szyszka (2011) and Baran-Łucarz (2011), who found moderate negative correlations between
learners’ self-perceived pronunciation competence and their levels of LA. Both studies revealed
medium effect sizes, explaining 21%–24% of variance in LA. Baran-Łucarz (2013) observed a
moderate negative relationship between scores on a pronunciation attainment test and results
of a skill-specific Phonetics Learning Anxiety Scale. The connection was verified further by a
t-test showing that the pronunciation of more anxious students was significantly worse than
that of less anxious individuals. Finally, in a more recent study, Szyszka (2017) observed that
high and low LA students differ significantly in the range of pronunciation learning strategies
and tactics they deploy.
The level of LA also affects other aspects of oral performance. Phillips (1992) recorded
young adult learners of French as an FL on two tasks – free speech and a role-play. Their
performance was transcribed, assessed and correlated with their level of LA measured with
the FLCAS. Measures were made of the average length of communication units (CUs)
(indicating syntactic complexity), and the percentage of words in CUs (indicating lexical
proficiency). The study showed a negative correlation of moderate strength between the
degree of LA and general level of oral performance, which can be considered a medium effect
size, accounting for 16% of variance in LA. The outcomes corroborate the hypothesis that
anxious individuals have limited working memory capacity and difficulties retrieving
information from long-term memory. Further qualitative data, namely the reflections of highly
anxious participants, supported this hypothesis: they “reported feeling frustrated,
panicked, and apprehensive” when they could not recall, while speaking, words they had
learned (Szyszka, 2017, p. 101). The research was replicated in two later studies
(Stephenson, 2006; Hewitt & Stephenson, 2012) with similar results.
A negative relationship of moderate strength was also observed by Park and Lee (2005),
who designed their own measure, adapting the instrument of Aida (1994) and the FLCAS, in
which some items represented anxiety, while others tested L2 confidence and motivation. The
study revealed that the more anxious the students were, the less rich their lexis and grammar
and the less fluent their speech. Moreover, the participants with higher levels of LA used
fewer communication strategies and had more limited social skills.
Rich data appear in Tóth (2012). By applying a translated version of the FLCAS, she
distinguished between high and low LA Hungarian adults, who performed three oral tasks
with a native speaker: an introductory interview, a conversation about a controversial topic,
and an interpretation of a picture. Mann–Whitney U tests showed statistically significant
differences between the oral performance of low and high LA participants in task perfor-
mance, communication effectiveness, grammar accuracy and range, lexical correctness and
range, fluency, and appropriate pronunciation/intonation use.
Finally, it is worth considering a study conducted by Piechurska-Kuciel (2008), who
found a strong negative relationship between participants’ L2 self-perceptions and their le-
vels of LA, measured with a Polish translation of the FLCAS. The effect size was the highest
(large) in the case of self-assessed speaking skills. Similarly, Kitano (2001) and Subaşi (2010),
using a Japanese version of the FLCAS and three self-rating scales, found that the higher the
LA, the lower the students’ self-perceived level of English in comparison to that of their
classmates. The correlation was particularly high, revealing a large effect size, in the case of
self-rated pronunciation. The vast body of research reported in this subsection leaves no
doubt that LA affects L2 oral performance negatively.
LA and L2 Achievement
As noted earlier, among important recent contributions to the literature on LA are two
meta-analyses, Teimouri et al. (2019) and Botes et al. (2020), which investigate the strength of
evidence for the connection between LA and FL achievement. The results of both confirm the
negative relationship between the two variables, with the correlations achieved in numerous
studies revealing a mean of r = −.36 (k = 105; N = 19,933) and r = −.39 (k = 59;
N = 12,585), respectively, and LA accounting for 13%–15% of variance in learners’ L2
achievement. Comparing outcomes of meta-analyses of other correlates of success in FL
learning, Teimouri et al. (2019) found aptitude and motivation to have more impact on per-
formance than LA, which was, in turn, followed by working memory. Together the four IDs
explained 58% of the variance in FL achievement. Interestingly, although participants in
numerous studies (Gkonou, 2017; Horwitz et al., 1986; Phillips, 1992; Piechurska-Kuciel, 2008;
Tóth, 2012) reported that speaking is the most anxiety-generating L2 skill, in neither
meta-analysis did oral production yield the highest effect size. More specifically, Teimouri et al.
(2019, p. 15) found that “listening and writing anxiety showed much larger effects than reading
or speaking anxiety.” Similarly, Botes et al. (2020) reported moderately large correlations
between FLCAS scores and L2 writing and listening achievement and moderate correlations in
the case of reading. In this case, speaking achievement had the lowest mean effect size
(r = −.26; k = 16; N = 1,745). The researchers, however, draw attention to the high
heterogeneity of correlations found in the studies focusing on speaking, which may indicate
that the relationship is “exacerbated or impeded by other factors such as general public
speaking anxiety” (p. 46). The high dispersion of effect sizes in the case of oral production
may also be explained by the varied criteria used to assess this FL skill.
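The step from a mean correlation to a share of variance explained is simple arithmetic: squaring r gives the coefficient of determination. The following minimal Python sketch uses only the two mean correlations quoted above and shows where the 13%–15% range comes from. The helper name `variance_explained` is introduced here purely for illustration.

```python
def variance_explained(r: float) -> float:
    """Return the share of variance accounted for by a correlation r (r squared)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return r ** 2


# Mean correlations between LA and FL achievement quoted in the text.
mean_correlations = [-0.36, -0.39]

for r in mean_correlations:
    share = variance_explained(r)
    # The sign of r is lost when squaring: r^2 describes strength, not direction.
    print(f"r = {r:+.2f} -> r^2 = {share:.4f} ({share:.0%} of variance)")
```

Applied to the mean correlation reported for speaking achievement (r = −.26), the same calculation gives roughly 7% of variance.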
is often acknowledged by students to be one of the most anxiety-provoking skills, and LA
frequently stems from the belief that one has “a terrible accent” (Price, 1991,
p. 105), it was only recently that the construct of Pronunciation Anxiety (PA) was proposed
(Baran-Łucarz, 2014a). In conceptualizing this anxiety type, I relied on my observations
from teaching pronunciation, comments of my students regarding the sources of their ap-
prehension related to L2 pronunciation and its learning (Baran-Łucarz, 2011; 2013), and
earlier studies showing a connection between L2 pronunciation and identity (e.g., Guiora,
1972; Walker, 2011). I defined PA as a “multidimensional construct referring to the feeling of
apprehension and worry experienced by non-native speakers in oral-communicative situa-
tions […] in the classroom and/or natural contexts, deriving from their negative/low self-
perceptions, beliefs and fears related specifically to pronunciation,” whose occurrence is
evidenced by typical cognitive, physiological, and behavioural symptoms of anxiety
(Baran-Łucarz, 2017, p. 109). I further proposed that the construct has four antecedents that interact
dynamically. These are (1) fear of being negatively viewed by interlocutors, classmates, or
teachers due to pronunciation; (2) pronunciation self-efficacy and self-assessment based on
comparisons made to classmates or interlocutors; (3) pronunciation self-image – the con-
ception of one’s aural and visual appearance while speaking the target language (TL) and
one’s readiness to accept the image; and (4) beliefs related to TL pronunciation concerning
the difficulty of the TL phonological system for speakers of a particular L1, the importance
of pronunciation for communication, and attitudes towards the general sound/particular
aspects of TL pronunciation.
Using an instrument designed to measure PA in the English as an FL classroom setting in Poland
(Baran-Łucarz, 2014a), I found PA to be strongly negatively correlated (r = −.60; p < .0005)
with L2 WTC. The results of this analysis were further verified by a t-test, showing a statistically
significant difference between the general level of WTC of high and low PA students
(t = −7.828; p < .0005) (Baran-Łucarz, 2014b), with the former being less eager to speak than
the latter. The effect sizes were large for the correlation and very large for the t-test
(Hedges’ g = 1.51) (Cohen, 1988).
Further studies conducted with a newer version of the Measure of Pronunciation Anxiety, in
both classroom and naturalistic settings (Baran-Łucarz, 2017), revealed a moderate negative
relationship between PA and motivation, represented by ideal pronunciation L2 self. This
study also lent support to the construct validity of the PA concept, demonstrating that low and
high PA students differ in their attitudes towards aspects of the TL sound system. For ex-
ample, in pictures indicating associations evoked by interdentals and post-alveolars, low PA
students drew positive images, such as dandelions flying in the wind. In contrast, a learner
displaying the highest PA level sketched images of a gravestone with an inscription “RiP -
Me,” a tongue with explanations “trapped between the teeth” and “dead tongue,” a crying face
with an inscription “tears,” and a sad face with a label “dead eyes” (Baran-Łucarz, 2017,
p. 121). The feelings displayed via pictures were further supported by expressions related to the
aforementioned sounds, with the low PA participants providing adjectives such as “delicate,”
“subtle,” “refined,” and “warm” (Baran-Łucarz, 2017, p. 123), and the high PA students
writing “heavy,” “strange,” “stupid,” “very unnatural,” “difficult,” “nobody speaks like that,”
“crippled,” and “childish” (Baran-Łucarz, 2017, p. 121). PA thus appears to be a highly
complex construct whose nature and role in FL learning are worth further exploration.
(2014). Following an idea from an earlier qualitative study (Gregersen & Horwitz, 2002), a few
high and low LA teacher trainees were video-recorded while delivering a presentation. They
wore a heart rate monitor recording their peaks and falls in anxiety. Then, they viewed their
video-recorded performance and rated their levels of anxiety while speaking, with the use of
MacIntyre’s (2012) free access idiodynamic programme. Finally, in interviews, the participants
explained the sources of their anxiety falls and peaks. Thus, it was possible not only to observe
that anxiety fluctuated during oral performance, but also to identify its causes. Furthermore,
the study showed that spikes of anxiety can be traced among low LA students. It turned out
that the anxiety of such learners might result from the convergence of several factors, which
stimulate physiological, cognitive, emotional and behavioural systems of anxiety.
The dynamic perspective in researching LA was also explored by Gregersen et al. (2017).
Here, the anxiety self-ratings of the same low and high LA participants from Gregersen et al.
(2014) were compared to anxiety detections of a peer and an expert on non-verbal behaviour
and emotional intelligence, who used the same idiodynamic procedure as the speakers.
Additionally, three conditions for detecting LA in the participants’ performance were
used, that is, visual only, audio only, and combined audio and visual channels. The good
news for anxious learners, who fear their anxiety will be easily decoded by interlocutors,
classmates or teachers, is that not all the spikes of anxiety were identified by the observers.
The most salient cues were “their hands, faces, eyes, bodies and voices” (Gregersen
et al., 2017, p. 129), which speakers can learn to control. The research is also valuable for its
pedagogical implications. Since tracing LA is not as simple as suggested earlier, it appears
necessary to train FL teachers to recognize anxiety in their learners, as well as anxiety-
generating situations, which could, in turn, help in developing remedies.
Another project rooted in the dynamic perspective is that of Gkonou (2017). This study
examines the causes of LA by referring to Bronfenbrenner’s (1993) nested ecosystems model.
First, the researcher used the FLCAS to identify highly anxious students, who then wrote
weekly diary entries and were subsequently interviewed. The data revealed that the LA experienced
by learners in actual FL classrooms (microsystem) was interrelated with prior learning
experiences (mesosystem), shaped by local attitudes towards learning English represented by
teachers (exosystem), which were in turn determined by the Greek FL education system
(macrosystem). The researcher concluded that it is important for teachers to be aware that
their students are not tabulae rasae and that their negative emotions may be rooted in wider
contexts, shaped by several external factors to which they were exposed earlier.
support or individual strategies relieving stress and helping control emotional reactions.
Şimşek and Dörnyei (2017) recommend that students verbalize their negative emotions in
constructive narratives and that explicit discussions of negative emotions and ways of
dealing with them be introduced in FL classrooms.
Although more tangible data are needed on the effectiveness of FL teaching approaches,
procedures, and techniques intended to keep anxiety low, it should be helpful to rely on the
information above regarding the nature and correlates of LA, and particularly its externally
driven, classroom-based causes. First and foremost, however, it is vital to follow the most basic
recommendations proposed by all LA specialists, that is, acknowledge the existence of LA (e.g.,
Gkonou, 2017), and make the learning environment as friendly, supportive, and encouraging as
possible (e.g., Horwitz, 2017). As Gregersen et al. (2017) put it, “it is better to assume the
presence of anxiety and build in classroom ‘safety nets’ such as supportive interactive
environments and effective error correction, than to miss negative affect when it is present and risk
its negative effects” (p. 131). Moreover, besides making lessons as pleasant as possible, it is
important to foster positive self-perceptions of FL learners and to adjust their often unrealistic
FL learning expectations, such as achieving nativelike pronunciation.
Remembering the interplay among internal and external factors which determine anxiety,
teachers should be aware that not all remedies work equally well for all anxious students in
all contexts. However, showing students understanding, acceptance, concern, and readiness
to help overcome their fears is always a good starting point in eliminating negative feelings,
which debilitate performance and discourage learning and risk-taking in L2 usage.
6 Future Directions
Although LA has attracted the attention of SLA researchers for decades, there are
still several theoretical and practical questions to be answered. What requires deeper ex-
ploration is the very nature of LA, with its antecedents and correlates (Horwitz, 2017; Şimşek
& Dörnyei, 2017). Although cross-sectional studies are needed, they should be supplemented
with qualitative research designs. Qualitative data can be gathered through narratives (e.g.,
Şimşek & Dörnyei, 2017), interviews, and classroom observations, supported with the use of
idiodynamic software (MacIntyre, 2012). The latter can be particularly helpful in shedding
light on LA as a dynamic phenomenon experienced by learners of different cognitive, affective,
and personality profiles and in determining which strategies and external remedies are more
effective for which learners. More experimental designs are needed to help verify the “causal
connections between language anxiety and performance” (MacIntyre, 2017, p. 23). Moreover,
our understanding of language-specific anxieties is still rather tentative. It would be interesting to
examine which learner profiles are more prone to experience which language-specific
anxieties. Also underexplored are the role of LA at different stages of learning, its effect across age
groups, including young learners and older adults, and differences in intensity, causes and
effects on FL learning and use among different cultural groups. To better know how to control
LA, further investigations are needed of the relationships and dynamics of LA with regard to
identity, motivation, WTC, mindfulness, enjoyment, engagement, autonomy, boredom, and
silence. Crucially, classroom-based research should be extended to naturalistic settings. Studies
on fluctuations of LA among learners of different profiles and cultures who use the L2 in authentic
communicative situations are clearly missing. Finally, future research could focus on teachers’
anxiety, its causes and relationship with burnout and teachers’ well-being, and the potential
connection between teacher and learner anxiety in the FL classroom (Gkonou et al., 2017).
Further Reading
Gkonou, C., Daubney, M., & Dewaele, J.-M. (Eds.). (2017). New insights into language anxiety: Theory,
research and educational implications. Bristol: Multilingual Matters.
This book contains chapters authored by well-known experts on LA, who revisit the concept, report on
their studies, and offer pedagogical interventions.
Gregersen, T., & MacIntyre, P. (2014). Capitalizing on language learners’ individuality: From premise to
practice. Bristol: Multilingual Matters.
The first chapter introduces readers to the theoretical underpinnings of LA and several practical
classroom activities to lessen its negative effects on L2 learning.
Szyszka, M. (2017). Pronunciation learning strategies and language anxiety. Cham, Switzerland:
Springer International Publishing.
This book explores the relationship between LA and pronunciation learning strategies. It offers a
comprehensive overview of empirical studies of LA, with a special focus on its effect on speaking and
pronunciation.
References
Aida, Y. (1994). Examination of Horwitz, Horwitz, and Cope’s construct of foreign language anxiety:
The case of students of Japanese. Modern Language Journal, 78(2), 155–168.
Baran-Łucarz, M. (2011). The relationship between language anxiety and the actual and perceived
levels of FL pronunciation. Studies in Second Language Learning and Teaching, 1(4), 491–514.
Baran-Łucarz, M. (2013). Phonetics learning anxiety – Results of a preliminary study. Research in
Language, 11(1), 57–79. doi:10.2478/v10015-012-0005-9
Baran-Łucarz, M. (2014a). The link between pronunciation anxiety and willingness to communicate in
the foreign-language classroom: The Polish EFL context. Canadian Modern Language Review, 70(4),
445–473. doi:10.3138/cmlr.2666
Baran-Łucarz, M. (2014b, June). The link between pronunciation anxiety and willingness to commu-
nicate in and outside the FL classroom. Paper presented at Psychology and Language Learning
Conference, Graz.
Baran-Łucarz, M. (2016). Conceptualizing and measuring the construct of pronunciation anxiety.
Results of a pilot study. In M. Pawlak (Ed.), Classroom-oriented research (pp. 39–56). Berlin,
Heidelberg: Springer. doi:10.1007/978-3-319-30373-4_3
Baran-Łucarz, M. (2017). FL pronunciation anxiety and motivation: Results of a preliminary
mixed-method study. In E. Szymańska-Czaplak, M. Szyszka & E. Piechurska-Kuciel (Eds.), At the
crossroads: Challenges in FL learning (pp. 107–133). Heidelberg & New York: Springer International
Publishing. doi:10.1007/978-3-319-55155-5_7
Botes, E., Dewaele, J.-M., & Greiff, S. (2020). The foreign language classroom anxiety scale and
academic achievement: An overview of the prevailing literature and a meta-analysis. Journal for the
Psychology of Language Learning, 2, 25–56.
Bronfenbrenner, U. (1993). Ecological models of human development. In M. Gauvain & M. Cole (Eds.),
Readings on the development of children (pp. 37–43). New York: Freeman.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. New York: Routledge.
Derwing, T. M., & Rossiter, M. J. (2002). ESL learners’ perceptions of their pronunciation needs and
strategies. System, 30, 155–166.
Dewaele, J.-M. (2002). Psychological and sociodemographic correlates of communicative anxiety in L2
and L3 production. The International Journal of Bilingualism, 6, 23–39.
Dewaele, J.-M. (2007). The effect of multilingualism, sociobiographical, and situational factors on
communication anxiety and foreign language anxiety of mature language learners. The International
Journal of Bilingualism, 11, 391–409.
Dewaele, J.-M. (2013). The link between foreign language classroom anxiety and psychoticism,
extraversion, and neuroticism among adult bi- and multilinguals. The Modern Language Journal,
97(3), 670–684. doi:10.1111/j.1540-4781.2013.12036.x
Dewaele, J.-M. (2017). Are perfectionists more anxious foreign language learners and users? In C.
Gkonou, M. Daubney & J.-M. Dewaele (Eds.), New insights into language anxiety: Theory, research
and educational implications (pp. 70–90). Bristol: Multilingual Matters.
Dewaele, J.-M., & Dewaele, L. (2017). The dynamic interactions in foreign language classroom anxiety
and foreign language enjoyment of pupils aged 12 to 18. A pseudo-longitudinal investigation.
Journal of the European Second Language Association, 1(1), 12–22.
Gardner, R. C., Smythe, P. C., Clément, R., & Gliksman, L. (1976). Second language learning: A social
psychological perspective. The Canadian Modern Language Review, 32(3), 198–213.
Gardner, R. C., & MacIntyre, P. D. (1993). A student’s contribution to second-language learning.
Part II: Affective variables. Language Teaching, 26, 1–11.
Gkonou, C. (2013). A diary study on the causes of English language classroom anxiety. International
Journal of English Studies, 13(1), 51–68.
Gkonou, C. (2017). Towards an ecological understanding of language anxiety. In C. Gkonou, M.
Daubney & J.-M. Dewaele (Eds.), New insights into language anxiety: Theory, research and educa-
tional implications (pp. 135–155). Bristol: Multilingual Matters.
Gkonou, C., Dewaele, J.-M., & Daubney, M. (Eds.). (2017). New insights into language anxiety: Theory,
research and educational implications (pp. 217–223). Bristol: Multilingual Matters.
Gregersen, T., & Horwitz, E. (2002). Language learning and perfectionism: Anxious and non-anxious
language learners’ reactions to their own oral performance. Modern Language Journal, 86(4),
562–570. doi:10.1111/1540-4781.00161
Gregersen, T., MacIntyre, P., & Meza, M. (2014). The motion of emotion: Idiodynamic case studies of
learners’ foreign language anxiety. Modern Language Journal, 98(2), 574–588.
doi:10.1111/j.1540-4781.2014.12084.x
Gregersen, T., MacIntyre, P., & Olson, T. (2017). Do you see what I feel? An idiodynamic assessment of
expert and peer’s reading of nonverbal language anxiety cues. In C. Gkonou, M. Daubney & J.-M.
Dewaele (Eds.), New insights into language anxiety: Theory, research and educational implications
(pp. 110–134). Bristol: Multilingual Matters.
Guiora, A. (1972). Construct validity and transpositional research: Toward an empirical study of
psychoanalytic concepts. Comprehensive Psychiatry, 13(2), 139–150.
Hewitt, E., & Stephenson, J. (2012). Foreign language anxiety and oral exam performance: A
replication of Phillips’s MLJ study. The Modern Language Journal, 96(2), 170–189.
doi:10.1111/j.1540-4781.2011.01174.x
Horwitz, E. K. (2000). It ain’t over ’til it’s over: On foreign language anxiety, first language deficits, and
the confounding of variables. The Modern Language Journal, 84(2), 256–259.
doi:10.1111/0026-7902.00067
Horwitz, E. K. (2017). On the misreading of Horwitz, Horwitz and Cope (1986) and the need to balance
anxiety research and the experiences of anxious language learners. In C. Gkonou, M. Daubney &
J.-M. Dewaele (Eds.), New insights into language anxiety: Theory, research and educational im-
plications (pp. 31–47). Bristol: Multilingual Matters.
Horwitz, E. K., Horwitz, M., & Cope, J. A. (1986). Foreign language classroom anxiety. Modern
Language Journal, 70(2), 125–132.
King, J., & Smith, L. (2017). Social anxiety and silence in Japan’s tertiary foreign language classroom.
In C. Gkonou, M. Daubney & J.-M. Dewaele (Eds.), New insights into language anxiety: Theory,
research and educational implications (pp. 91–109). Bristol: Multilingual Matters.
Kitano, K. (2001). Anxiety in the college Japanese language classroom. The Modern Language Journal,
85(4), 549–566.
Larsen-Freeman, D. (1997). Chaos/complexity science and second language acquisition. Applied
Linguistics, 18, 141–165.
Liebert, R. M., & Morris, L. W. (1967). Cognitive and emotional components of test anxiety: A
distinction and some initial data. Psychological Reports, 20, 975–978. doi:10.2466/pr0.1967.20.3.975
Liu, M. L. (2006). Anxiety in Chinese EFL students at different proficiency levels. System, 34, 301–316.
MacIntyre, P. D. (1995). How does anxiety affect second language learning? A reply to Sparks and
Ganschow. The Modern Language Journal, 79(1), 90–99. doi:10.1111/j.1540-4781.1995.tb05418.x
MacIntyre, P. D. (2012). The idiodynamic method: A closer look at the dynamics of communication
traits. Communication Research Reports, 29(4), 361–367.
MacIntyre, P. D. (2017). An overview of language anxiety research and trends in its development. In C.
Gkonou, M. Daubney & J.-M. Dewaele (Eds.), New insights into language anxiety: Theory, research
and educational implications (pp. 11–30). Bristol: Multilingual Matters.
MacIntyre, P. D., & Gardner, R. C. (1989). Anxiety and second language learning: Towards a theo-
retical clarification. Language Learning, 39(2), 251–275.
MacIntyre, P. D., & Gardner, R. C. (1991). Methods and results in the study of anxiety in language
learning: A review of the literature. Language Learning, 41, 85–117.
MacIntyre, P. D., Clément, R., Dörnyei, Z., & Noels, K. A. (1998). Conceptualizing willingness to
communicate in a L2: A situational model of L2 confidence and affiliation. Modern Language
Journal, 82(4), 545–562.
McCroskey, J. C. (1984). Communication competence. The elusive construct. In R. N. Bostrom (Ed.),
Competence in communication: A multidisciplinary approach (pp. 259–268). Beverly Hills, CA: SAGE
Publications.
Onwuegbuzie, A. J., Bailey, P., & Daley, C. E. (1999). Relationship between anxiety and achievement at
three stages of learning a foreign language. Perceptual and Motor Skills, 88, 1085–1093.
Oxford, R. (2017). Anxious language learners can change their minds: Ideas and strategies from tra-
ditional psychology and positive psychology. In C. Gkonou, M. Daubney & J.-M. Dewaele (Eds.),
New insights into language anxiety: Theory, research and educational implications (pp. 177–197).
Bristol: Multilingual Matters.
Park, G. P. (2014). Factor analysis of the foreign language classroom anxiety scale in Korean learners
of English as a foreign language. Psychological Reports, 115, 261–275.
Park, H., & Lee, A. R. (2005). L2 learners’ anxiety; self-confidence and oral performance. In
Proceedings of the 10th conference of the Pan-Pacific association of applied linguistics (pp. 107–208).
Edinburgh University, August 2005. Retrieved from http://www.paaljapan.org/resources/
proceedings/PAAL10/pdfs/hyesook.pdf
Pekrun, R. (1992). Expectancy-value theory of anxiety: Overview and implications. In D. Forgays &
T. Sosnowski (Eds.), Anxiety: Recent developments in cognitive, psychological and health research
(pp. 23–39). Washington, DC: Hemisphere.
Phillips, E. (1992). The effects of language anxiety on students’ oral test performance and attitudes. The
Modern Language Journal, 76(1), 14–26.
Piechurska-Kuciel, E. (2008). Language anxiety in secondary grammar school students. Opole:
Wydawnictwo Uniwersytetu Opolskiego.
Price, M. L. (1991). The subjective experience of foreign language anxiety: Interviews with highly
anxious students. In E. K. Horwitz & D. J. Young (Eds.), Language anxiety: From theory and
research to classroom implications (pp. 101–108). Upper Saddle River, NJ: Prentice Hall.
Rubio-Alcala, F. D. (2017). The links between self-esteem and language anxiety and implications
for the classroom. In C. Gkonou, M. Daubney & J.-M. Dewaele (Eds.), New insights into language
anxiety: Theory, research and educational implications (pp. 198–216). Bristol: Multilingual Matters.
Sarason, I. G. (1981). Test anxiety, stress, and social support. Journal of Personality, 49, 101–114.
Scovel, T. (1978). The effect of affect on foreign language learning: A review of the anxiety research.
Language Learning, 28(1), 129–142.
Şimşek, E., & Dörnyei, Z. (2017). Anxiety and L2 self-images: The ‘anxious self’. In C. Gkonou, M.
Daubney & J.-M. Dewaele (Eds.), New insights into language anxiety: Theory, research and educa-
tional implications (pp. 51–69). Bristol: Multilingual Matters.
Sparks, R. L., & Ganschow, L. (1991). Foreign language learning differences: Affective or native
language aptitude differences? Modern Language Journal, 75(1), 3–16.
Stephenson Wilson, J. T. (2006). Anxiety in learning English as a foreign language: Its associations with
student variables, with overall proficiency, and with performance on an oral test. Doctoral dis-
sertation, University of Granada. Retrieved from http://hera.ugr.es/tesisugr/16235290.pdf
Stroud, C., & Wee, L. (2006). Anxiety and identity in the language classroom. Regional Language
Centre Journal, 37(3), 299–307.
Subaşi, S. (2010). What are the main sources of Turkish EFL students’ anxiety in oral practice? Turkish Online Journal
of Qualitative Inquiry, 1(2), 29–49. Retrieved from https://core.ac.uk/download/pdf/27171558.pdf
Szyszka, M. (2011). Foreign language anxiety and self-perceived English pronunciation
competence. Studies in Second Language Learning and Teaching, 1(2), 283–300.
https://doi.org/10.14746/ssllt.2011.1.2.7
Teimouri, Y., Goetze, J., & Plonsky, L. (2019). Second language anxiety and achievement: A meta-
analysis. Studies in Second Language Acquisition, 41(2), 489.
Tóth, Z. (2012). Foreign language anxiety and oral performance: Differences between high- vs. low-
anxious EFL students. Language, 10(5), 1166–1178.
Vasa, R. A., & Pine, D. S. (2004). Neurobiology in anxiety disorders in children and adolescents. In
T. R. Morris & J. S. March (Eds.), Anxiety Disorders in Children and Adolescents (pp. 3–26). New
York: Guilford Press.
Walker, R. (2011). Teaching the pronunciation of English as a lingua franca. Oxford: Oxford University
Press.
Watson, D., & Friend, R. (1969). Measurement of social-evaluative anxiety. Journal of Consulting and
Clinical Psychology, 33, 448–457.
Young, D. J. (1990). An investigation of students’ perspectives on anxiety and speaking. Foreign
Language Annals, 23, 539–553.
Young, D. J. (1991). Creating a low-anxiety classroom environment: What does language anxiety
research suggest? The Modern Language Journal, 75, 426–439.
PART II
Research Issues
7
SPEAKING RESEARCH
METHODOLOGIES
Charles Nagle, Tracey M. Derwing, and Murray J. Munro
1 Introduction/Definitions
Speaking is an intricate activity which involves cognitive skills (e.g., memory, lexical
retrieval; see De Bot, this volume), articulation skills, interaction skills, and culturally
determined pragmatic knowledge, all in real time. It is even more complicated in a second
language (L2) because speakers may experience interference from their first language (L1),
and learners, at least, have gaps in knowledge that present them with additional
communication challenges. Given this complexity, researchers from several disciplines, including
linguistics, psychology, and applied linguistics, have examined different aspects of speaking,
each employing methodology from their own fields of study. In all instances, however, the
key issues to consider when posing a research question investigating the nature of speaking
are the type of data collection to employ and the analysis of the resulting data.
The overall design of a study can be characterized along several dichotomies. The first is
quantitative versus qualitative. Second language acquisition research was predominantly
quantitative in the early stages; researchers looked for patterns in acquisition which might
later inform teaching (see Dulay & Burt, 1974). Quantitative approaches have developed
considerably and now often involve highly sophisticated statistical techniques. More recently,
researchers have adopted qualitative methods, which document linguistic experiences in a
non-numerical way (e.g., sociocultural approaches, see Surtees & Duff, this volume). Another design dichotomy is
observation versus intervention (see the discussion of observation tasks in the Historical
Perspectives section). Observation allows researchers to gauge the degree to which an L2
speaker has mastered a given speech variable. In longitudinal studies, systematic observation
can record naturalistic development. Intervention tasks typically involve measuring learners’
facility with an aspect of speech, providing instruction over a set period of time, and then re-
measuring their performance. To determine whether changes are a result of instruction, an
uninstructed control group (learners who share similar performance on the speech variable in
question at the outset of the study) is included, and the performance of the two groups is
compared after the intervention. If the experimental group’s speech is significantly improved
over that of the control group, it is reasonable to infer that the instruction was effective.
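The comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a procedure taken from any of the studies cited: the ratings are invented, the function names are hypothetical, and the choice of gain scores, Cohen's d, and Welch's t statistic is one assumption among several defensible ways of comparing an instructed group with a control group.

```python
from math import sqrt
from statistics import mean, stdev


def gain_scores(pre, post):
    """Return post-minus-pre gains, one per learner."""
    return [after - before for before, after in zip(pre, post)]


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def welch_t(group_a, group_b):
    """Welch's t statistic, which does not assume equal group variances."""
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    return (mean(group_a) - mean(group_b)) / sqrt(
        var_a / len(group_a) + var_b / len(group_b)
    )


# Hypothetical comprehensibility ratings (9-point scale), invented for illustration.
instructed_pre = [4.0, 5.0, 3.5, 4.5, 5.5]
instructed_post = [5.5, 6.0, 5.0, 5.5, 6.5]
control_pre = [4.5, 5.0, 3.5, 4.0, 5.5]
control_post = [4.5, 5.5, 3.5, 4.5, 5.5]

instructed_gain = gain_scores(instructed_pre, instructed_post)
control_gain = gain_scores(control_pre, control_post)

print("Mean gain (instructed):", mean(instructed_gain))
print("Mean gain (control):", mean(control_gain))
print("Cohen's d:", round(cohens_d(instructed_gain, control_gain), 2))
print("Welch's t:", round(welch_t(instructed_gain, control_gain), 2))
```

Comparing gains rather than post-test scores alone takes each learner's starting point into account, which matters when the two groups are only approximately matched at the outset.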
DOI: 10.4324/9781003022497-10
A further methodological dichotomy is cross-sectional versus longitudinal studies. Cross-
sectional research offers a snapshot of speaking performance at a specific point in time. For
example, Baker & Trofimovich (2005) examined age effects on the acquisition of English
vowels by Korean speakers by sampling vowels from early and late bilingual speakers and
comparing the two. In contrast, a longitudinal study follows the linguistic development of
speakers over an extended period of time. It therefore allows close inspection of differences in
individual learning trajectories, rather than just comparisons of group means. Of course,
even longitudinal studies that focus solely on means do not capture individual differences.
Some longitudinal studies involve groups of learners who perform tasks over time (e.g.,
Derwing & Munro, 2013), and others take the form of introspective, single subject studies,
such as Leopold’s (1949) diary of the development of English and German in his daughter,
Hildegard.
Spoken corpora provide another source of data available to researchers (see Huensch &
Staples, this volume). Despite their limitations, existing corpora can be utilized to examine
grammatical, lexical and phonological features of speech, sometimes along with pragmatics.
Once research questions have been determined, the researcher must decide on the most
appropriate data to address those questions. Sometimes an existing corpus will suffice, but
often researchers design or adapt tasks to elicit spoken language samples. At this point, it is
important to look for congruence among the questions, the tasks, and the resulting data. For
instance, if fluency (in the sense of the flow of language, extent and placement of pausing,
and hesitation forms) is the dimension of interest, a read-aloud task is not an appropriate
choice, because it does not reflect the work that a speaker must do in natural speech, such as
retrieving suitable vocabulary items, making grammatical decisions, and determining fitting
prosody. This work is already done for the speaker who is presented with a passage to read;
thus, these speaking elements, which all affect fluency, are eliminated, making such data of
limited interest.
Once the data are collected, they must be coded in some way. In early studies, for instance,
the occurrences of particular features such as grammatical morphemes, interactional
strategies, and politeness cues were categorized and counted. In pronunciation studies, listeners
are often employed to give scalar ratings of speech samples for speech dimensions such as
comprehensibility, fluency, and accentedness. More recently, listeners have been asked to
rate speakers dynamically, reacting to elements of a speech sample as they occur, rather than
providing a single scalar rating (Nagle et al., 2019). In a novel approach to measuring L2
anxiety, Gregersen et al. (2014) conducted a study in which learners wore heart monitors and
rated their anxiety in real time, with follow-up retrospective interviews.
2 Historical Perspectives
The earliest research on L2 speaking was often based on findings from first language (L1)
acquisition studies. In a classic, longitudinal, observational enquiry of three children, Brown
(1973) proposed that children shared similar patterns in the L1 development of English
grammatical morphemes. Dulay and Burt (1974) created the Bilingual Syntax Measure
(BSM), a conversational assessment tool, to elicit grammatical morphemes from L1 Spanish
speakers learning English to compare L2 learners with Brown’s L1 data. Using a cross-
sectional design with three groups of learners at different proficiency levels, they elicited
language from children by showing them cartoons and asking questions, the correct answers
to which required the use of particular morphemes. As researchers adopted the BSM for use
in other populations (e.g., adult learners), the methodology of contrasting L1 longitudinal,
naturalistic development with cross-sectional designs was questioned. Rosansky (1976), for
instance, raised such concerns and argued for longitudinal L2 studies of spontaneous speech.
Rather than compare two languages to predict errors that students may make, as in
contrastive analysis (Lado, 1957), the advocates of error analysis (e.g., Corder, 1967) posited
that an examination of learners’ actual errors would be more informative. Throughout the
1970s and 1980s, error analyses were conducted, often with listeners judging the severity of
the errors (both spoken and written). However, in 1974, Schachter’s article “An Error in Error
Analysis” illustrated that non-occurrence of an error does not necessarily mean that a learner
has mastered a particular form; rather it can indicate avoidance of a form that a learner finds
difficult. Although Schachter’s study was based on written passages, it had implications for
speaking as well. Error analysis provides no insight into aspects of language that do not
appear in an L2 speaker’s speech.
Another methodology to investigate L2 speaking development, a case study, was carried
out by Schmidt and Frota (1986). Schmidt kept a diary, outlining observations of his own
learning of Portuguese in addition to recording his own conversations. His linguist
co-author, a native speaker of Portuguese, later analyzed noun and verb phrases in the
recordings, noting changes over time. Schmidt took 50 hours of language classes, studied on
his own (especially grammar), and participated in an active social life while in Brazil, giving
him ample opportunities to interact with others. As for the value of formal instruction, the
authors concluded that Schmidt “learned and used what he was taught if he subsequently
heard it and if he noticed it” (p. 279; italics in the original). This is a prime example of the
power of a diary study, in that it resulted in an important theoretical concept, the noticing
hypothesis, that remains influential today.
Conversational analysis (CA), which is also rooted in L1 research, has been used to
examine L2 speaking (e.g., Sacks et al., 1974). CA not only examines the linguistic forms in a
conversation but also, more importantly, takes into account interactional behaviours.
Although much of L2 research focuses on speakers, CA considers both interlocutors, and
documents such phenomena as how turn-taking is managed and how repairs are made when
a communication breakdown occurs. Typically, a conversation is recorded and then
painstakingly transcribed and examined for social patterns. For a comprehensive review, see
Kasper and Wagner (2014).
Another approach to studying L2 speaking acquisition emerged in the 1970s, this time
borrowing from studies of Caregiver Speech (CS) in L1. It was posited that the adjustments
parents make for their young children have a facilitating effect on their offspring’s
linguistic development. Ferguson (1975) proposed that foreigner talk (FT), the adjustments
native speakers make for lower proficiency L2 speakers, played a similar role to CS. Like
CA, FT studies examined both sides of the equation in a conversation, but the focus of
analysis was primarily linguistic. Long (1983) argued that native speakers make a range of
adjustments in negotiating meaning with an L2 learner and that these interactional
adjustments (such as clarification requests, confirmation checks, and paraphrase) are
employed to “avoid conversational trouble and to repair the discourse when trouble occurs”
(p. 131). For instance, in the Find the Difference task that follows, two interlocutors each
had similar pictures, but were required to find six differences through talking only. The
native speaker (NS) took the lead, describing his picture, including the word “chimney,”
which the non-native speaker (NNS) did not know. Using paraphrase, by describing the
function and physical properties of a chimney, the NS was able to assist the learner to
assign an English noun to an object.
NS: I have a house and a tree.
NNS: Mmhmm.
NS: And uh, the house has a door.
NNS: Yeah.
NS: And a window,
NNS: Mmhmm.
NS: And a chimney.
NNS: Chimney?
NS: For the fireplace? On the roof? You know the roof of the house?
NNS: Yeah.
NS: And then there’s a chimney?
NNS: Yeah.
NS: Made of brick?
NNS: Oh, that chimney!
NS: Chimney. Do you have a chimney?
NNS: Yeah, I don’t have.
NS: You don’t have a chimney.
NNS: Yeah, I don’t have.
(unpublished data, T. M. Derwing)
Fillmore first brought attention to four distinct types of fluency in L1 in 1979 (reprinted in
2000), the first of which is “the ability to talk at length with few pauses” (p. 51). This is the
type of fluency most often studied in L2 speaking research; it is typically measured by
examining elements of dysfluency, such as pause placement and length, mean length of run
(number of words between pauses), hesitations, repetitions, and false starts. These physical
measures correlate with human listeners’ scalar judgements of fluency, although there is not a
one-to-one relationship (Derwing et al., 2004).
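To make the temporal measures named above concrete, the following Python sketch computes mean length of run, pause count, mean pause duration, and speech rate from time-aligned words. It is an illustration under stated assumptions rather than a standard implementation: the input format, the function name, and the 0.25-second silent-pause threshold are all hypothetical choices (published studies use different thresholds), and filled pauses, repetitions, and false starts are not handled.

```python
from statistics import mean


def fluency_measures(words, min_pause=0.25):
    """Compute basic temporal fluency measures from time-aligned words.

    words: list of (word, start_sec, end_sec) tuples in order of production.
    min_pause: silence threshold in seconds (an assumption; studies differ).
    """
    if not words:
        raise ValueError("At least one time-aligned word is required.")
    pauses = []  # durations of silent gaps at or above the threshold
    runs = [1]   # number of words in each pause-free run
    for (_, _, prev_end), (_, start, _) in zip(words, words[1:]):
        gap = start - prev_end
        if gap >= min_pause:
            pauses.append(gap)
            runs.append(1)  # a pause closes the current run and opens a new one
        else:
            runs[-1] += 1
    total_time = words[-1][2] - words[0][1]
    return {
        "mean_length_of_run": mean(runs),
        "pause_count": len(pauses),
        "mean_pause_duration": mean(pauses) if pauses else 0.0,
        "words_per_minute": len(words) / total_time * 60,
    }


# Invented timings for a short utterance, for illustration only.
sample = [
    ("I", 0.0, 0.2), ("have", 0.2, 0.5), ("a", 0.5, 0.6), ("house", 0.6, 1.0),
    ("and", 1.5, 1.7), ("a", 1.7, 1.8), ("tree", 2.4, 2.9),
]
result = fluency_measures(sample)
print(result)
```

With these invented timings the utterance splits into three runs separated by two silent pauses. Changing `min_pause` changes the run boundaries, which is one reason such measures need to be reported together with the threshold used.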
L2 speaking has also been studied in terms of learner affective variables. Test instruments
have been used to measure language anxiety (Horwitz et al., 1986), and speaking strategies
have been explored using interviews, observation, and three types of verbal report: self-
revelation (think-alouds), self-observation, which entails either immediate introspection or
retrospection (e.g., watching a video of oneself speaking), and self-report (Cohen, 2014).
Cohen also offers a comprehensive discussion of the benefits and disadvantages of structured
versus semi-structured interviews. MacIntyre et al. (1998) explored the concept of
Willingness to Communicate, which has been investigated using several different approaches.
Sociocultural studies (Surtees & Duff, this volume) tend to use a wide variety of
qualitative methods to examine L2 speaking and typically take into account the context in which
interactions occur to examine power relations and other connections to the speakers’ identities.
Video and audio recordings, interviews, field notes, mappings, and other data are collated to
bring together a nuanced understanding of a participant’s communication development.
“spontaneous” speech (speech that occurs in a natural discourse without prompting by a
researcher) is essentially non-existent in L2 research because of the ethical requirement of
informed consent from participants. Perhaps the nearest approximation is illustrated by de
Leeuw (2019), who avoided ethics concerns by analyzing public domain news recordings in a
longitudinal study of tennis player Stefanie Graf’s L2 English speech. Of course, Graf knew
she was being recorded, though not for linguistic reasons.
Applied linguists often use monologic tasks which, though not spontaneous, are designed
to minimize the impact of the observer. One example is the “danger-of-death” narrative,
developed by sociolinguists in the 1970s (see Labov, 1972), in which the speaker orally
recounts a life-threatening experience (Oyama, 1976). Deep engagement with an emotional
topic is thought to distract the speaker’s attention from linguistic form. Closely related,
though not as emotionally charged, are other monologic activities including picture
descriptions, personal narratives, and oral summaries of videos. Given limited control by the
researcher, these are thought to elicit relatively natural speech, but a significant drawback is
that the speaker may not produce the particular vocabulary, grammatical structures or
phonological patterns relevant to the goals of the investigation. Consequently, they are most
beneficial in studies in which specific language structures are not at issue, as in fluency
research or global speech analysis (e.g., intelligibility or comprehensibility).
To examine speakers’ use of specific language structures, elicitation procedures must be
more tightly constrained. At the most controlled end of the continuum are read-aloud and
simple oral repetition tasks, in which the linguistic content is fully predetermined. Although
such tasks are still used, they have serious drawbacks. Because both entail relatively
uncommon types of speech acts in normal human communication, their outcomes may not be
generalizable to typical speaking performance. Aside from this aspect of ecological validity,
reading aloud may require the speaker to use unfamiliar vocabulary and may yield spelling
pronunciations even of known words. Repetition may yield speech that is heavily shaped by
the characteristics of the model and may therefore fail to represent the speakers’ capabilities.
To some degree, these issues are addressed with modified tasks. In vocabulary and
pronunciation research, for example, target lexical items might be elicited from pictures without
orthographic representations. As well, some types of morpho-syntactic knowledge can be
tapped through descriptions of pictures or videos in which responses involve grammatical
forms in obligatory contexts. An alternative to simple repetition is delayed repetition, in
which the speaker’s auditory memory is disrupted by a pause of several seconds (Trofimovich
& Baker, 2006) or an intervening sound or a task (such as counting aloud to 10) before the
repetition is produced. The delay is believed to give more natural performance because it
requires greater processing and coding of the speech material than does immediate repetition.
Because so much human communication is interactional rather than monologically based,
the evaluation of interactional speech material is fundamental in the study of speaking.
Though their methods differ, both quantitative and qualitative researchers have used
interactional material effectively. An important source of quantitative data has been classroom
observations focusing on how learning takes place, rather than on specific language content.
Some studies, for instance, have explored teacher–student interactions to establish how
teachers provide corrective feedback and how it is received (Lyster & Ranta, 1997; see Goo,
this volume). Another line of work examines language behaviour during student–student
interactions in task-based learning (e.g., Foster & Skehan, 1996). Still other research
examines the types of adjustments native speakers make in response to L2 speech (Long, 1983).
In qualitative studies, interactions serve as a rich source of information. Surtees and Duff
(this volume), for instance, point to the benefits of micro-level analysis from naturalistic
video and audio recordings obtained in classroom and work environments, and in informal
social settings, such as the dinner table. Some researchers have gained insights into learners’
interactive experiences using the introspective technique known as stimulated recall (Gass &
Mackey, 2016), in which research participants consider a video or audio recording of their
own interactions and provide commentary on their experiences at the time of the original
activity.
An important development in recent years is the growing availability of online spoken
corpora (see also Huensch & Staples, this volume). The TalkBank (2020) project, for
instance, consists of multiple repositories covering a wide range of categories including
conversational materials, clinical recordings, child language data and speech from bilinguals and
second language learners. These can be accessed and assessed using a variety of software
tools included in the project. The Dutch corpus, JASMIN-CGN (Cucchiarini et al., 2008), is
a useful model for corpora; it contains orthographically and phonemically transcribed
recordings and part-of-speech tagging of speech produced by children, seniors, and L2 Dutch
learners, and has been used for multiple research purposes.
Approaches to Analysis
The range of approaches for analyzing speech data is wide, encompassing transcription,
expert data coding, acoustic measurements, and naive listener ratings. Furthermore, using a
combination of approaches yields deeper insights into speech material than does a single
approach.
The first step in studies of less-controlled elicitation types such as narratives and
interactions is transcription, typically in standard orthography. The level of detail in the
transcription is determined by the dependent variables at issue and the type of coding planned.
Although researchers may choose not to employ formal transcription conventions for their own
purposes, these are an essential part of speech corpora and other data-sharing situations.
Coding of transcribed material requires expertise, and must be checked for reliability,
often through independent coding of the materials by one or more coders in addition to the
main one. Coding may entail identifying, classifying, and tagging parts of speech,
grammatical errors, particular speech acts, and grammatical structures such as phrases, clauses,
and t-units.
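One common way to check the reliability of independent coding is a chance-corrected agreement index. The sketch below computes Cohen's kappa for two coders in Python. The chapter does not prescribe a particular statistic, so the choice of kappa is an assumption made here for illustration, and the category labels and codes are invented.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders on the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of items.")
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Agreement expected if each coder assigned labels independently
    # at his or her own base rates.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented codes for ten interactional adjustments, for illustration only.
main_coder = ["clarification"] * 4 + ["confirmation"] * 3 + ["paraphrase"] * 3
second_coder = (
    ["clarification"] * 3 + ["confirmation"] * 4
    + ["paraphrase"] * 2 + ["clarification"]
)

kappa = cohens_kappa(main_coder, second_coder)
print(round(kappa, 3))
```

Raw percentage agreement for these invented codes is 80%, but kappa is lower because some of that agreement would be expected by chance given how often each coder used each category.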
Acoustic analyses are usually performed using software that displays speech waveforms
and other representations. In fluency research, for example, durational data, measured in
milliseconds, may be obtained on filled and unfilled pauses, runs, repetitions, and false starts.
While software allows automation of some such measurements (De Jong & Wempe, 2009),
results must generally be checked for accuracy and reliability. More detailed acoustic
analyses of the type used in pronunciation research are performed with analysis software such as
Praat (Boersma & Weenink, 2020). Among other measurements, these include vocal pitch,
vowel formant frequencies, and temporal properties of consonants such as voice onset time
(VOT). On the one hand, acoustic data often shed light on listeners’ perceptions. For
instance, speaking rate measurements can predict fluency ratings. On the other, caution must
be exercised in interpreting such data because there is rarely a straightforward relationship
between the acoustic events in speech and listeners’ perceptions of those events. It is possible,
for instance, to find measurable differences between utterances that are not perceivable by
listeners, and therefore have no relevance to communication. Small differences in VOT in
stop consonants, for example, might be reliably measured yet imperceptible.
Naive listener ratings have played a role in the analysis of L2 speech for several decades.
Listeners may listen to audio files and then assign scalar ratings on such dimensions as social
appropriateness (pragmatics research), fluency, comprehensibility, and accentedness. In sociolinguistic
studies, such scales may cover perceived personal characteristics of the speakers, such as their
friendliness, teaching ability, competence, or assertiveness. Protocols for rating studies require
careful administration, including appropriate scale sizes and labelling, speech samples of suitable
duration and content, controlled listening conditions, and checks on inter- and intra-rater
reliability.
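As one illustration of an inter-rater reliability check, the Python sketch below computes Cronbach's alpha across listeners, treating each listener as an "item". This is a hedged example only: alpha is one of several indices that could be reported (intraclass correlations are another), the function name is hypothetical, and the ratings are invented.

```python
from statistics import variance


def cronbach_alpha(ratings_by_rater):
    """Consistency of scalar ratings across raters.

    ratings_by_rater: one list per rater, each holding that rater's ratings
    of the same speakers in the same order.
    """
    k = len(ratings_by_rater)
    if k < 2:
        raise ValueError("At least two raters are required.")
    # Sum of each rater's own rating variance across speakers.
    summed_rater_variance = sum(variance(r) for r in ratings_by_rater)
    # Variance of the per-speaker totals (ratings summed over raters).
    totals = [sum(speaker_ratings) for speaker_ratings in zip(*ratings_by_rater)]
    return k / (k - 1) * (1 - summed_rater_variance / variance(totals))


# Invented 9-point comprehensibility ratings: three listeners, five speakers.
listener_ratings = [
    [2, 4, 5, 7, 8],
    [3, 4, 6, 6, 9],
    [2, 5, 5, 8, 8],
]

alpha = cronbach_alpha(listener_ratings)
print(round(alpha, 3))
```

Intra-rater reliability could be examined in a similar spirit by having the same listener rate a subset of samples twice and comparing the two sets of ratings.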
rather than exclusively monologic speech. Some of the challenges raised by the application of
new technologies in automated assessment of speaking are discussed by Iwashita (this
volume). While noting their administrative advantages, she comments on the limitations of the
unidirectional evaluation of speaking on which they typically focus.
data analysis tools. L2 speaking research, regardless of its scope, often generates complex,
hierarchical data sets that include speakers and observations (e.g., individual sounds, words,
sentences, monologues, and sometimes interlocutors in conversation). It has been common
practice to average over different facets of the data to compute single-measure averages (e.g.,
speaker averages) that are necessary for ANOVA and other statistical tests. Likewise, when
two facets of the data were of interest (e.g., speakers and items, speakers and listeners),
separate analyses have been conducted. These analyses make it difficult to consider how
various dimensions of the data interact, that is, how variables nested within different facets
of the data influence L2 speaking outcome measures. Researchers are increasingly embracing
statistical techniques such as mixed-effects modelling that allow them to estimate within a single
analysis a wide range of effects. These techniques also enable researchers to evaluate the
extent to which any given effect varies for the speakers, items, and listeners in the sample. By
using these techniques, researchers can gain a far more accurate and nuanced portrait of
speaking-related phenomena.
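A generic mixed-effects specification of the kind described above can be written out as follows. The notation is an assumption introduced here for illustration and is not drawn from any particular study: y is a rating of speaker s on item i by listener l, x is a hypothetical speaker-level predictor, and the random terms let the intercept vary by speaker, item, and listener and let the effect of x vary by listener.

```latex
y_{sil} = \beta_0 + (\beta_1 + b_{l})\, x_{s} + u_{s} + v_{i} + w_{l} + \varepsilon_{sil},
\qquad
u_{s} \sim \mathcal{N}(0, \sigma_u^2),\quad
v_{i} \sim \mathcal{N}(0, \sigma_v^2),\quad
w_{l} \sim \mathcal{N}(0, \sigma_w^2),\quad
b_{l} \sim \mathcal{N}(0, \sigma_b^2),\quad
\varepsilon_{sil} \sim \mathcal{N}(0, \sigma^2)
```

In this sketch the fixed effects (the betas) describe the average pattern, while the variance components indicate how much speakers, items, and listeners differ from that average. Averaging over listeners or items before analysis would discard exactly this information. For simplicity, the random intercept and slope for listeners are shown as independent, although in practice they are often allowed to correlate.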
aspects of L2 speaking development in the same learner sample from different perspectives.
Once a large body of work is accumulated on these samples, interesting comparisons of
grammatical, lexical, pronunciation, and pragmatic development will be possible. Of course,
as we avail ourselves of open tools, tasks, and analyses, so too must we ensure that our
methods are as transparent and replicable as possible by publishing our own tasks, data sets,
and analyses whenever feasible.
We also encourage collaboration between researchers and teachers. Teachers often have
an intuitive sense of areas of special difficulty for their students and the factors that play a
role in predicting achievement. Researcher–teacher collaborations can help to bridge the gap
between research and practice by designing classroom studies that are both empirically sound
and ecologically valid. To do so, researchers may have to let go of some of the traditional
control mechanisms that they have in the laboratory. For instance, a focus on whole
classrooms often prevents random assignment of students into control and experimental
groups. Despite the problems of interpretation that may result, these circumstances reflect
the reality of the learning context in which many teachers work.
7 Future Directions
The interdisciplinary scope of L2 speaking research demands multidimensional theoretical
and methodological approaches. As the state of the art evolves, bringing with it new research
questions, so too must methods evolve to address them. We offer three methodological
recommendations to enhance the breadth and depth of L2 speaking research. First, more work
is needed on languages other than English. To date, the majority of studies have focused on
L2 English, and the few studies that have investigated other L2s have mostly recruited L1
English speakers. Thus, as a research community, we have ignored a substantial proportion
of language learners worldwide who are not L1 English speakers and who choose to learn an
L2 other than English. As we expand the languages we investigate, we should also strive for
greater diversity in the social backgrounds of research participants and aim especially for
non-academic samples (e.g., Andringa & Godfroid, 2020).
We also need more longitudinal studies, particularly research examining speakers over
more than three data points and over longer periods. We can gain a more nuanced
understanding of the long-range dynamics of L2 speaking through multiwave longitudinal
research. For example, Derwing and Munro (2013) were able to provide unique insights into
issues such as non-linear change over time and individual developmental trajectories because
they examined their speakers over a 7-year period. Another example is Huensch and
Tracy-Ventura’s (2017) work on foreign language learners’ fluency development before, during,
and after study abroad, including a follow-up 4 years after the learners’ study abroad experience
(Huensch et al., 2019). This research addresses complex questions related to the role of
individual differences in rate of change and L2 maintenance over many years. There is also a
need for dense, multiwave research conducted over shorter timeframes, as well as dynamic
methods focussing on system-level change and emphasizing the interconnectedness of
elements (e.g., Hiver & Al-Hoorie, 2019).
Finally, we should strive for a multifaceted view of L2 speaking, which includes linking
different types of L2 speaking measures over time and across various tasks using robust
quantitative and qualitative methods. This also calls for working at the interface of different
subdomains of SLA using diverse methods, for instance, Ruivivar and Collins’ (2018) work on the
interplay between non-native accents and spoken grammar. To a certain extent, we are already
making progress towards a multidisciplinary, multimethod approach to L2 speaking. As L2
speaking researchers, even though we may approach questions from different perspectives,
we are interested in the same fundamental issues: What are the components of L2 speaking?
How do those components relate to listeners’ impressions of L2 speech, and how does L2
speaking relate to global communicative competence? How do L2 learners become
competent L2 speakers? Ultimately, all our theoretical perspectives and methodological skillsets
will be needed to understand L2 speaking in its complexity.
Further Reading
Burns, A. (2017). Research and the teaching of speaking in the second language classroom. In E. Hinkel
(Ed.), Handbook of research in second language teaching and learning (pp. 242–256). Routledge.
Burns addresses cognitive and affective aspects of L2 speaking. She summarizes key differences between
spoken and written discourse and discusses the contributions discourse analysis can make to L2
speaking pedagogy.
Day, R. R. (Ed.) (1986). Talking to learn: Conversation in second language acquisition. Rowley, MA:
Newbury House.
A collection of classic papers foundational to our current understanding of L2 speaking.
Mackey, A., & Gass, S. (2015). Second language research: Methodology and design (2nd edn). London:
Routledge.
An in-depth analysis of the research process, addressing topics such as study design, data collection and
analysis, and reporting. The authors review a wide range of data collection instruments with illustrative
examples of the data the instruments generate.
Plonsky, L. (Ed.) (2015). Advancing quantitative methods in second language research. London:
Routledge.
An edited volume containing both a critical appraisal of basic quantitative principles such as descriptive
statistics, p values, and effect sizes, as well as targeted summaries of advanced statistical techniques.
Chapters include step-by-step instructions on how to carry out analyses, drawing upon examples from
published research.
References
Andringa, S., & Godfroid, A. (2020). Sampling bias and the problem of generalizability in applied
linguistics. Annual Review of Applied Linguistics, 40, 134–142.
Baker, W., & Trofimovich, P. (2005). Interaction of native- and second-language vowel system(s) in
early and late bilinguals. Language and Speech, 48(1), 1–27.
Bardovi-Harlig, K. (2018). Matching modality in L2 pragmatics research design. System, 75, 13–22.
Boersma, P., & Weenink, D. (2020). Praat: doing phonetics by computer [Computer program]. Version
6.1.21, retrieved 20 August, 2020 from http://www.praat.org/
Brown, R. (1973). A first language: The early stages. Cambridge, MA: Harvard University Press.
Cobb, T. (2020). Compleat lexical tutor, v. 8.3. Accessed 28 September, 2020 at https://www.lextutor.ca
Cohen, A. D. (2014). Strategies in learning and using a second language (2nd edn). London & New York:
Routledge.
Corder, S. P. (1967). The significance of learners’ errors. International Review of Applied Linguistics, 5,
160–170.
Cucchiarini, C., Driesen, J., Van Hamme, H., & Sanders, E. (2008). Recording speech of children,
non-natives and elderly people for HLT applications: The JASMIN-CGN corpus. Proceedings of the 6th
international conference on language resources and evaluation, LREC 2008 (pp. 1445–1450).
De Jong, N. H., & Wempe, T. (2009). Praat script to detect syllable nuclei and measure speech rate
automatically. Behavior Research Methods, 41(2), 385–390.
de Leeuw, E. (2019). Native speech plasticity in the German-English late bilingual Stefanie Graf: A
longitudinal study over four decades. Journal of Phonetics, 73, 24–39.
Derwing, T. M., & Munro, M. J. (2013). The development of L2 oral language skills in two L1 groups:
A 7-year study. Language Learning, 63(2), 163–185.
Derwing, T. M., Rossiter, M. J., Munro, M. J., & Thomson, R. I. (2004). L2 fluency: Judgments on
different tasks. Language Learning, 54, 655–679.
Doughty, C. J., & Long, M. H. (2000). Eliciting second language speech data. In L. Menn, &
N. Bernstein Ratner (Eds.), Methods for studying language production (pp. 149–177). Mahwah,
NJ: Lawrence Erlbaum.
Dulay, H. C., & Burt, M. K. (1973). Should we teach children syntax? Language Learning, 23(2),
245–258.
Dulay, H. C., & Burt, M. K. (1974). Natural sequences in child second language acquisition. Language
Learning, 24(1), 37–53.
Ferguson, C. A. (1975). Towards a characterization of English foreigner talk. Anthropological
Linguistics, 17, 1–14.
Fillmore, C. J. (2000). On fluency. In H. Riggenbach (Ed.), Perspectives on fluency (pp. 43–60). Ann
Arbor: University of Michigan Press.
Foster, P., & Skehan, P. (1996). The influence of planning and task type on second language
performance. Studies in Second Language Acquisition, 18(3), 299–323.
Gass, S. M., & Mackey, A. (2016). Stimulated recall methodology in applied linguistics and L2 research.
Milton Park: Taylor & Francis.
Gass, S., & Plonsky, L. (2020). Introducing the SSLA methods forum. Studies in Second Language
Acquisition, 42, 667–669.
Goldrick, M., Shrem, Y., Kilbourn-Ceron, O., Baus, C., & Keshet, J. (2021). Using automated acoustic
analysis to explore the link between planning and articulation in second language speech
production. Language, Cognition, and Neuroscience, 36(7), 824–839.
Gregersen, T., MacIntyre, P. D., & Meza, M. D. (2014). The motion of emotion: Idiodynamic case
studies of learners’ foreign language anxiety. The Modern Language Journal, 98, 574–588.
Hiver, P., & Al-Hoorie, A. (2019). Research methods for complexity theory in applied linguistics. Blue
Ridge Summit, PA: Multilingual Matters.
Horwitz, E. K., Horwitz, M., & Cope, J. A. (1986). Foreign language classroom anxiety. The Modern
Language Journal, 70(2), 125–132.
Huensch, A., & Tracy-Ventura, N. (2017). L2 utterance fluency development before, during, and after
residence abroad: A multidimensional investigation. The Modern Language Journal, 101(2),
275–293.
Huensch, A., Tracy-Ventura, N., Bridges, J., & Cuesta-Medina, J. (2019). Variables affecting the
maintenance of L2 proficiency and fluency four years post-study abroad. Study Abroad Research in
Second Language Acquisition and International Education, 4, 96–125.
Kasper, G. & Wagner, J. (2014). Conversation analysis in applied linguistics. Annual Review of Applied
Linguistics, 34, 1–42.
Labov, W. (1972). Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press.
Lado, R. (1957). Linguistics across cultures: Applied linguistics for language teachers. Ann Arbor, MI:
University of Michigan.
Leopold, W. (1949). Speech development of a bilingual child (Volume 4). Evanston, IL: Northwestern
University Press.
Linck, J. A., & Cunnings, I. (2015). The utility and application of mixed-effects models in second
language research. Language Learning, 65(S1), 185–207.
Long, M. H. (1983). Native speaker/non-native speaker conversation and the negotiation of
comprehensible input. Applied Linguistics, 4, 126–142.
Lyster, R. & Ranta, L. (1997). Corrective feedback and learner uptake: Negotiation of form in
communicative classrooms. Studies in Second Language Acquisition, 19, 37–66.
Mackey, A. (2012). Input, interaction, and corrective feedback in L2 learning. Oxford, England: Oxford
University Press.
Mackey, A., & Gass, S. (2016). Second language research: Methodology and design (2nd edn).
New York: Routledge.
MacIntyre, P. D., Dörnyei, Z., Clément, R., & Noels, K. A. (1998). Conceptualizing willingness to
communicate in a L2: A situational model of L2 confidence and affiliation. The Modern Language
Journal, 82(4), 545–562.
Marsden, E., Mackey A., & Plonsky, L. (2016). The IRIS Repository: Advancing research practice and
methodology. In A. Mackey & E. Marsden (Eds.), Advancing methodology and practice: The IRIS
repository of instruments for research into second languages (pp. 1–21). New York: Routledge.
Marsden, E., Morgan-Short, K., Trofimovich, P., & Ellis, N. C. (2018). Introducing Registered Reports
at Language Learning: Promoting transparency, replication, and a synthetic ethic in the language
sciences. Language Learning, 68(2), 309–320. 10.1111/lang.12284
Nagle, C. L. & Rehman, I. (2021). Doing L2 speech research online: Why and how to collect online
ratings data. Studies in Second Language Acquisition, 43(4), 916–939.
Nagle, C., Trofimovich, P., & Bergeron, A. (2019). Toward a dynamic view of second language
comprehensibility. Studies in Second Language Acquisition, 41(4), 647–672.
O’Brien, M. G. (2016). Methodological choices in rating speech samples. Studies in Second Language
Acquisition, 38(3), 587–605.
Oyama, S. (1976). A sensitive period for the acquisition of a nonnative phonological system. Journal of
Psycholinguistic Research, 5(3), 261–283.
Rosansky, E. J. (1976). Methods and morphemes in second language acquisition research. Language
Learning, 26(2), 409–425.
Ruivivar, J., & Collins, L. (2018). The effects of foreign accent on perceptions of nonstandard
grammar: A pilot study. TESOL Quarterly, 52(1), 187–198.
Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-
taking for conversation. Language, 50(4), 696–735.
Schachter, J. (1974). An error in error analysis. Language Learning, 24, 205–214.
Schmidt, R. W., & Frota, S. N. (1986). Developing basic conversational ability in a second language: A
case study of an adult learner of Portuguese. In R. R. Day (Ed.), Talking to learn: Conversation in
second language acquisition (pp. 237–326). Rowley, MA: Newbury House.
Surtees, V. (2013). Mobile tracking of L2 interactions: Identifying speech act contexts for inclusion in
pragmatic assessment tools. Paper presented at the Canadian Association of Applied Linguistics
Conference, Victoria, Canada.
Talkbank (2000). Electronic resource retrieved from https://talkbank.org, September 10, 2020.
Trofimovich, P., & Baker, W. (2006). Learning second language suprasegmentals: Effect of L2
experience on prosody and fluency characteristics of L2 speech. Studies in Second Language
Acquisition, 28(1), 1–30.
Thomson, R. I. & Derwing, T. M. (2015). The effectiveness of L2 pronunciation instruction: A
narrative review. Applied Linguistics, 36, 326–344.
8
SPOKEN CORPORA
Amanda Huensch and Shelley Staples
1 Introduction/Definitions
In second language acquisition research on L2 speaking,1 there is growing interest in the use
of spoken corpora to understand language development. Corpora (the plural of corpus) are
generally defined as large collections of speech (or writing) that are balanced and
representative of a particular discourse domain (Biber et al., 1998; McEnery & Hardie, 2012).
Most corpus researchers consider corpora to be collections of naturally occurring oral and/or
written texts. However, learner corpus researchers often also include in their definitions
collections of texts containing elicited material, such as classroom or assessment tasks. We
propose a definition of corpora incorporating a cline of spoken data from more controlled
to more naturally occurring, with items like reading of words or sentences at one end of the
spectrum; followed by picture description tasks or narrative recount tasks; followed by
speaking performance assessments (e.g., oral interviews or monologues); followed by
open-ended classroom tasks (e.g., introducing oneself or describing a trip); followed by
conversation and other spoken domains outside of the classroom. As with the surge of written
corpora starting in the 1980s, when computing capabilities improved, one of the reasons why
spoken corpora are growing in popularity is the increase of digital tools that can make
building, analyzing, and sharing spoken corpora easier.
2 Historical Perspectives
Early “corpora” (which were not referred to as such) consisted of individual words, focusing
primarily on the study of spoken utterances. These datasets were rightly criticized as
unrepresentative of speech as a whole and were particularly attacked for their focus on
empiricism versus rationalism by Chomsky (1962). However, interestingly, even after the
Chomskian revolution of the 1950s, phoneticians continued to work with naturally observed
data, as did second language acquisition researchers (see McEnery & Wilson, 2001). With the
advent of machine-readable capabilities, modern corpora were built in the 1950s and 1960s,
particularly in English. Most of these comprised writing by L1 speakers, and were not
intended for the study of second language acquisition. In fact, of the earliest modern
(computerized) corpora, only one made a significant contribution to understanding spoken
English, the London-Lund corpus. Started in 1975 and completed in the early 1980s, for
years it was the only corpus of spontaneous spoken English with prosodic annotation. In the
1990s, an explosion of learner and other corpora usable for SLA research took place. Two
major databases for accessing these corpora are the Université Catholique de Louvain’s
CECL2 list of learner corpora around the world and TalkBank’s SLABank.3 Major spoken
corpora used for study of SLA include the ESF4 family of corpora, LANGSNAP5 (as well as
FLLOC6 and SPLLOC7), LINDSEI,8 and TLC.9 Many other corpora exist (see Appendix
8A for spoken corpora and Appendix 8B for selected existing corpora) but few languages are
represented (mostly English, French, or Spanish). In addition, there are limitations on the
types of analyses that can be conducted. Many corpora (including LINDSEI) do not provide
researchers with sound files but are limited to transcripts. For those that do include sound
files (e.g., FLLOC and SPLLOC), no phonological annotation is provided, and thus analysis
of features such as prosody would be very time consuming.
acquisition of grammatical and lexical features (e.g., Crossley et al., 2015), pragmatic
features (e.g., Fernández, 2013), utterance fluency (e.g., Huensch & Tracy-Ventura, 2017),
phonological features (e.g., Götz, 2013), complexity/accuracy/fluency (CAF) analyses (e.g.,
Vercellotti, 2017), and more. A 2019 special issue in the International Journal of Learner
Corpus Research (IJLCR) highlights some of the possibilities of using oral corpora to explore
spoken SLA using the TLC9, a large (4.2 million words) corpus collected from 2012 to 2018
which includes monologic and interactive speech from the Graded Examinations in Spoken
English assessment developed by Trinity College London.
One area of spoken SLA research fairly well-represented by the use of corpora is oral
fluency development (Huensch, 2020). In the IJLCR special issue, Götz (2019) used a subset
of the TLC to investigate utterance fluency, specifically the relationship between filled pause
frequency and variables such as proficiency level, country of origin, and age of acquisition.
Using regression modelling to predict filled pause frequency, Götz demonstrated that the
factor with the strongest explanatory power was country of origin, which is a loose proxy for
L1 background. With evidence that filled pause usage is particularly linked to L1 influence,
Götz calls into question the practice of high-stakes assessment such as the Common
European Framework of Reference explicitly mentioning this feature in rubrics designed to test all
learners on the same scale. Many other studies have used spoken corpora to explore oral
fluency, such as the PAROLE10 corpus which includes speech from learners of English and
French as well as NS control groups and has been used to compare utterance fluency
characteristics among NSs and learners at different proficiency levels (Hilton, 2014). The
WiSP11 corpus, including English and Turkish L1 learners of L2 Dutch, has also been used
to explore multiple research questions regarding L2 fluency, including investigations of
L1–L2 fluency relations (e.g., De Jong et al., 2015).
Another area of research using corpora to investigate SLA pertains to the development
of constructions, or form-meaning pairings ranging from morphemes to words to idiomatic
expressions to syntactic frames (Ellis et al., 2016). Verb constructions are the focus of
Gilquin (2019) and Römer and Garner (2019) in the IJLCR special issue. Römer and
Garner examined the development of verb argument constructions (e.g., V about n, V for n)
across proficiency levels (low intermediate to high advanced). One benefit of using corpora
for such an analysis is the ability to compare results to a large reference corpus, in this case
the British National Corpus. Römer and Garner discovered that learners at advanced
proficiency levels evidenced similar distributions to the British National Corpus in both the
number and distribution of verbs in the constructions and were also able to demonstrate
how lower-level learners differed in terms of the types of verbs used in the constructions.
A host of other studies have focused on lexico-grammatical patterns of learner speech
across proficiency levels (e.g., Biber et al., 2016; Staples et al., 2017). A fairly consistent
finding across research contexts is that task type strongly influences the use of features
associated with informational elaboration (e.g., use of nouns and noun modifiers, longer
words, passive voice, and relative clauses), more often associated with writing than speech.
While mode clearly plays a major role in determining learners’ use of these features, tasks
requiring more informational content (e.g., an oral interview focused on students’
professional experience or an integrated speaking task) lead to greater production of these features.
In addition, speakers at higher proficiency levels use more of these features within
informationally driven tasks.
The final two studies in the IJLCR special issue used the TLC to investigate pragmatic
development in the use of backchannels (Castello & Gesuato, 2019) and stance adverbs
(Pérez-Paredes & Díez-Bedmar, 2019). Pérez-Paredes & Díez-Bedmar explored the impact of
task (monologic vs. dialogic) and proficiency level on the use of adverbs such as really,
actually, and obviously to display stance. Using both quantitative and qualitative analyses,
the researchers provided evidence that task type differentially impacted adverb usage:
actually was more task-independent than really.
Pragmatics of spoken language development has been the subject of several studies of L2
spoken discourse outside of assessment contexts (Fernández & Yuldashev, 2011; Friginal
et al., 2017; Gilquin, 2008; Polat, 2011). Friginal et al. (2017) explore how hedges (e.g., think,
sort of) and boosters (e.g., so) along with first person pronouns and modal verbs are used by
learners in EAP classroom discourse. Their results show that learners used think
overwhelmingly as a hedging device, and did not use modals for this purpose as much as their
teacher interlocutors. Modal verbs were also used more frequently by L2 learners in
collaborative tasks when compared to non-collaborative tasks. Possibility, ability, and permission
modals (e.g., can, could) were particularly frequent, reflecting learners’ negotiation of
meaning during collaborative tasks (e.g., can you explain…).
LANGSNAP
As described earlier, corpora have been used to investigate many research questions in spoken
second language acquisition. The Languages and Social Networks Abroad Project
(LANGSNAP, Mitchell et al., 2017) is a good example of the benefits of publicly shared longitudinal
corpora and how a corpus can be designed and used to answer a wide range of research
questions. The LANGSNAP corpus contains data from UK university students who were L2
learners of French or Spanish and required to spend their third year of a four-year degree
programme living in a French- or Spanish-speaking country. From 2011 to 2013, 56
participants completed a picture-based narration and a semi-structured interview at each of six data
collection points: once before, three times during, and twice after their
9-month sojourn abroad. Participants also completed an argumentative writing task. The
audio files and transcriptions (in CHAT format, discussed later) are available for download on
TalkBank. The oral data have been used to explore spoken language development of modality
(McManus & Mitchell, 2015), CAF (McManus et al., 2020), L1–L2 fluency relationships
(Huensch & Tracy-Ventura, 2017), and identity (Mitchell et al., 2020). Because it is a publicly
available corpus, it has also been used by other research groups. For example, Gudmestad
et al. (2019) explored the development of grammatical gender marking in L2 Spanish from a
variationist SLA perspective and, using a multifactorial analysis, demonstrated how multiple
linguistic (e.g., noun gender, noun frequency) and extralinguistic (e.g., task) factors contribute
to different components of stability and variability in the gender marking of advanced L2
speakers. Data are still being added to this “productive” corpus. In 2016 and 2019, 33 and 31,
respectively, of the original 56 speakers participated in two additional rounds of data
collection, bringing the total project to 8 years and allowing new research questions examining
factors that impact foreign language attrition/development/maintenance (Huensch et al., 2019).
LINDSEI
The LINDSEI corpus (Gilquin et al., 2010) has been used for an impressive number of research
studies (see https://uclouvain.be/en/research-institutes/ilc/cecl/lindsei-bibliography.html). LINDSEI
was designed as a spoken counterpart to written argumentative essays provided in the International
Corpus of Learner English (ICLE) corpus (Granger, 1998). The LINDSEI corpus consists of
interviews with university-level English as a Foreign Language learners following a set structure in
three parts. Each interview begins with a warm-up comprising a monologic speaking task on a
given topic followed by an informal dialogic interview about speakers’ lives at university. To finish,
speakers complete a picture description task. The corpus (transcripts only) is available for
purchase and currently includes interviews with 554 participants. Two main strengths of the LINDSEI
corpus are the variety of L1s represented (11 different backgrounds) and its parallel L1 English
corpus, the Louvain Corpus of Native English Conversation (LOCNEC, De Cock, 2004). This
design allows for both cross-linguistic and L1–L2 comparisons. Several studies have focused on
discourse markers and other “small words” in LINDSEI (Buysse, 2012; Gilquin, 2008). For
instance, Buysse (2012) explored spoken usage of the discourse marker so in the LINDSEI Dutch
L1 subcorpus (n = 40 interviews), comparing learners majoring in English Linguistics with those
majoring in Commercial Sciences and also compared the learners to the L1 English LOCNEC
corpus. Results indicated that both groups of learners and L1s evidenced the use of so in a variety
of different functions, but that learners (from both majors) tended to overuse so in comparison to
the L1 reference corpus. Similarly, Götz (2013) is a book-length treatment exploring native and
non-native speaker utterance fluency using the German L1 subcorpus of LINDSEI. One finding
from her analysis of the patterns of use of discourse markers was that learners often underused
them and used a limited variety in comparison to native speakers. Rosen (2016) used the French
L1 subcorpus of LINDSEI to explore the constructs of error and innovation by comparing the
LINDSEI corpus data to a variety of English influenced by Norman French, Jersey English.
Rosen’s analysis brings together SLA research and research on indigenized varieties of English and
asserts that “the difference between the notions of (not yet conventionalized) innovations on the
one hand and errors on the other seems to be terminological and attitudinal – a matter of
perspective and norm-orientation rather than a linguistic difference” (p. 304). Data collection for the
LINDSEI corpus involves multiple researchers across several international institutions following a
protocol to ensure that data collected are suitable for comparison. Additional subcorpora are
continually being added to the LINDSEI corpus.
LeaP
The LeaP corpus (Gut, 2012) is one of the few L2 phonological corpora. The corpus consists of
spoken data from L2 learners of German and English collected between 2001 and 2003. The
project examined the acquisition of prosodic features (e.g., intonation, stress) and the potential
impact of factors such as proficiency, formal instruction, and individual differences variables such
as motivation and musicality. Over 12 hours of speech was collected from learners and native
speakers completing tasks comprising both read and spontaneous speech. The reading tasks
included a list of nonsense words and a narrative passage. The spontaneous speech tasks included a
re-telling of the narrative passage and an informal interview. Time-aligned annotations were
completed in Praat (Boersma & Weenink, 2020) and included segmentation of words, syllables,
phonemes, tone, and pitch. The corpus (including sound files, TextGrids, XML files, and manual) is
freely available for download. One potential limitation is that the tools developed for its analysis
are not publicly available and likely require basic knowledge of programming in the Perl language
(Edalatishams, 2017). Investigations of both the development of phonological features and oral
language fluency have been published. For example, Gut (2017) used a subset of learners from the
LeaP corpus to conduct a mixed-methods analysis of the effects of learning context on
phonological development in different tasks over time. Contexts included study abroad, study abroad
with participation in a phonology course, and at-home learners who participated in a phonology
course. Phonological variables included vowel reduction, intonation, and fluency (articulation rate
and mean length of run). The quantitative results showed no clear advantage for one of the
contexts over another (although there were trends indicating benefits for the groups who received
explicit teaching). Additionally, the qualitative analysis revealed a large amount of individual
variability across learners in all contexts, and indicated that learners who made gains in a
phonological feature typically did so across multiple tasks in the corpus.
CCOT
The Corpus of Collaborative Oral Tasks17 (CCOT; Crawford, 2021) was created at Northern
Arizona University between 2009 and 2012. The tasks in the corpus were given to students as
part of their achievement tests during their study in an Intensive English Programme, from one
to three times. There are 24 tasks, with at least ten learner performances of each task for a total
of 775 files. There are 600 speakers from three proficiency levels. The most common tasks are
problem solving (e.g., where learners decide which patient to treat or create an advertisement
together). Both the audio files and the transcriptions are available by contacting the creator,
William Crawford. An edited volume (Crawford, 2021) includes research on lexico-grammar,
pronunciation, and other types of speech analysis. For example, Staples (2021) investigates
lexico-grammatical features (e.g., nouns, conditional clauses, that complement clauses),
interactional features (turn length, backchannels, questions), fluency (speech rate, length of
pauses), and pitch range across task types (informational and argumentative). Not
surprisingly, nouns and other informational features were used more in the informational task while
conditional clauses were more common in the argumentative task. These findings align with
numerous studies supporting the use of these lexico-grammatical features for these particular
purposes. However, perhaps more interesting are the findings for interactional features,
fluency, and pitch range. Backchannels were used more frequently in informational tasks,
perhaps reflecting the listener’s uptake of information provided by the speaker. Speech rate was
faster and number of pauses was lower for the argumentative tasks, likely reflecting the lower
density of informational content in the argumentative tasks. Pitch range was also higher in
the argumentative task, perhaps due to the need to stress syllables at higher pitch to make
points more salient and arguments appear stronger. These findings have important
implications for the understanding of interactional variables, fluency, and pronunciation across tasks.
corpora. However, to investigate intonation and rhythm, more naturally occurring speech is
needed. To investigate vocabulary, larger corpora are needed.
Corpus methods typically take three different approaches to research design (coined Type
A, Type B, and Type C by Biber & Jones, 2009). In Type A studies, researchers investigate a
linguistic feature to determine how that feature varies based on the linguistic environment.
For example, one might examine copular verbs in Spanish and Portuguese learner corpora to
see how they vary depending on type of complement. Logistic regression can then identify
whether the patterns vary across L1 background or learner level (Picoral, 2020).
Type B studies take as their unit of observation an individual text. Linguistic features are
examined within each text, but the output is the frequency of occurrence of that feature in each
text. Thus, the focus is not on the behaviour of a linguistic feature in a linguistic environment, but
rather how frequently that feature is used across L1 backgrounds, learner levels, and/or text types.
Statistical methods used with these corpora include the ANOVA family, to investigate
differences across subgroups (e.g., by proficiency level or L1 background), and the
correlation/regression family, to determine relationships between a continuous operationalization
of proficiency (e.g., scores on a proficiency test) and linguistic features.
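As a rough illustration of the Type B logic, the sketch below computes a normed rate for one feature in each text and then correlates those rates with a continuous proficiency score. The scores, counts, and text lengths are invented for illustration, and the function names are hypothetical; in practice a statistics package would normally be used, and assumptions such as independence of observations would need checking.

```python
from math import sqrt

def rate_per_100(count, n_words):
    """Normed rate of a feature: occurrences per 100 words of one text."""
    return count * 100 / n_words

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical data: one tuple per text (the unit of observation in Type B):
# (proficiency score, count of the feature, text length in words)
texts = [(45, 6, 300), (55, 10, 400), (65, 12, 400), (80, 20, 500)]

scores = [score for score, _, _ in texts]
rates = [rate_per_100(count, n_words) for _, count, n_words in texts]
r = pearson_r(scores, rates)
print(rates, round(r, 3))
```

Each text contributes exactly one data point, which is what distinguishes this design from one that pools counts across a whole subcorpus.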
Type C studies are similar to Type B, but they investigate the frequencies of an entire
subcorpus rather than getting the frequency for each individual text within that subcorpus.
In this case, the use of inferential statistics is more limited, and it is commonplace for
researchers to report frequency data. Reporting range along with normed frequencies is
advisable, to help researchers determine whether the phenomena are spread throughout
speakers in a subcorpus or are used by only one or two speakers.
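The point about reporting range alongside normed frequency can be made concrete with a small sketch. It assumes a subcorpus stored as a mapping from speaker ID to a list of tokens; the speaker IDs, the utterances, and the function names are all hypothetical.

```python
def normed_frequency(count, total_words, base=1000):
    """Occurrences per `base` words (here, per 1,000 words)."""
    return count / total_words * base

def frequency_and_range(tokens_by_speaker, target):
    """Pooled frequency of `target` plus the number of speakers who use it."""
    total_hits = 0
    total_words = 0
    speakers_using = 0
    for tokens in tokens_by_speaker.values():
        hits = sum(1 for tok in tokens if tok.lower() == target)
        total_hits += hits
        total_words += len(tokens)
        if hits > 0:
            speakers_using += 1
    return {
        "raw": total_hits,
        "per_1000": normed_frequency(total_hits, total_words),
        "range": speakers_using,
        "speakers": len(tokens_by_speaker),
    }

# Invented three-speaker subcorpus for illustration only
subcorpus = {
    "S01": "so I think so it was um really good".split(),
    "S02": "we went to the city and it was nice".split(),
    "S03": "so so so I did not know so".split(),
}
result = frequency_and_range(subcorpus, "so")
print(result)
```

In this toy subcorpus the pooled rate looks high, but the range shows that only two of the three speakers produce the item at all, and most tokens come from a single speaker.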
Most corpora are sampled from one period of time and thus are typically cross-sectional.
Corpus compilers ideally balance the corpus across score or proficiency levels, and also
typically try to balance across L1 backgrounds. Such corpora provide valuable information about
linguistic and other features that characterize performance at different levels. However, more
recently, there has been a call for more quantitative longitudinal studies. Longitudinal corpora
provide an ideal dataset for examining spoken development. One of the choices researchers
must make is whether they prioritize the similarity of task across time periods (e.g., the same
task is administered to learners at two or more points in time) or whether they want to
prioritize the type of tasks suitable for learners at different developmental stages. The former
has the obvious advantage of being more controlled, while the latter has the advantage of more
ecological validity. Researchers are exploring these two options in corpus data, and it is clear
that new methods are needed to address different types of longitudinal datasets.
For corpora consisting of read words or sentences, balance and representativeness are not
important considerations. As discussed earlier, such corpora have the advantage that they
can be designed to have the control of a psycholinguistic experimental setting with shared
prompts and lab-quality recordings but have the disadvantage of not representing a spoken
discourse domain. Methods for these types of corpora are similar to those for
psycholinguistic data, discussed in Nagle et al., this volume.
Digital Tools
A variety of digital tools exist to transcribe, annotate (including tagging and segmentation),
and analyze spoken corpora. This part describes these processes and some of the most
commonly used tools to complete them.
Transcription is the process of representing oral language in some form of written script,
such as orthographic transcription (e.g., following typical spelling conventions) or phonemic
or phonetic transcription (e.g., using IPA symbols and diacritics). While digital tools can
assist in (semi-) automating other processes with spoken corpora, manual transcription is
often a necessary and time-consuming first step. For instance, Brezina et al. (2019) reported
that it took 5 years and nearly 3,500 hours to transcribe the TLC (see footnote 9). Spoken
corpora can be transcribed in text editors (e.g., Notepad++ on Windows, TextEdit on Mac) or
software programs specialized for linguistic analysis such as the freely available CLAN or
ELAN. CLAN (Computerized Language Analysis, MacWhinney, 2000) is a software
program developed for the TalkBank system. CLAN is designed to work in conjunction with the
CHAT (Codes for Human Analysis of Transcripts) transcription and coding format, a set of
standardized conventions for creating computerized transcripts of speech. ELAN (EUDICO
Linguistic Annotator, Wittenburg et al., 2006) is another software program that allows for
transcription and analysis of audio and video. A useful feature of ELAN is its organization
around tiers, which can be hierarchically structured.
Annotation is the process of providing additional linguistic information to the
transcription. One of the most common forms of annotation across both written and spoken
corpora is known as tagging. This is the process of marking up words in the corpus with
part-of-speech information based on the word and its context. For example, in Figure 8.1, lines 17
and 18 represent the part-of-speech (POS) tagged words from the orthographically
transcribed Spanish utterance in line 16. As shown in Figure 8.1, POS tagging in this case
provides information about word class, tense, gender, number, etc. Typically, the tagging
process is automatic, although some follow-up disambiguation might be necessary depending
on the accuracy of the tagger. At a minimum, it is important to include accuracy checking as
one of the steps when using automatic annotators.
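One simple way to carry out such an accuracy check is to hand-correct the tags for a sample of tokens and compare them with the tagger's output. The sketch below assumes the two tag sequences are already aligned token by token; the tag labels and function names are illustrative and do not reflect any particular tagger's tag set.

```python
from collections import Counter

def tagging_accuracy(auto_tags, gold_tags):
    """Proportion of tokens whose automatic tag matches the hand-checked tag."""
    if not gold_tags or len(auto_tags) != len(gold_tags):
        raise ValueError("tag sequences must be non-empty and of equal length")
    correct = sum(a == g for a, g in zip(auto_tags, gold_tags))
    return correct / len(gold_tags)

def confusion_counts(auto_tags, gold_tags):
    """Count each (hand-checked tag, automatic tag) pair where the two disagree."""
    return Counter((g, a) for a, g in zip(auto_tags, gold_tags) if a != g)

# Invented five-token sample: the tagger mislabels one adjective as a noun
gold = ["PRON", "VERB", "DET", "NOUN", "ADJ"]
auto = ["PRON", "VERB", "DET", "NOUN", "NOUN"]

accuracy = tagging_accuracy(auto, gold)
errors = confusion_counts(auto, gold)
print(f"accuracy = {accuracy:.2f}")
print(errors.most_common())
```

Listing the confusions as well as the overall figure shows which categories need manual disambiguation, which is often more useful for learner speech than a single accuracy value.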
Once a corpus has been transcribed and, potentially, POS annotated, concordancing tools can be used for analyses
related to the frequency and distribution of words in a corpus. These often involve extracting not
only key words or phrases, but also the words occurring before and after them [known as Key
Word In Context (KWIC) analyses]. These tools are available as stand-alone (e.g., AntConc) or
web-based (e.g., SketchEngine) applications and have been used not only for linguistic research,
but also as pedagogical tools. For example, SketchEngine provides access to 500+ corpora in over
90 languages, but researchers can also upload their own corpora for analysis.
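The idea behind a KWIC display can be sketched in a few lines. This is a minimal illustration of the technique, not a description of how AntConc or SketchEngine work internally; the function name, the whitespace tokenization, and the sample utterance are assumptions made for the example.

```python
def kwic(tokens, keyword, window=3):
    """Return (left context, node word, right context) for each hit of `keyword`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines

# Invented utterance, tokenized on whitespace for simplicity
transcript = "so we went there and so I think it was fine".split()

for left, node, right in kwic(transcript, "so"):
    # Right-align the left context so the node words line up in a column
    print(f"{left:>20} [{node}] {right}")
```

Aligning the node word in a central column is what allows recurring patterns on either side of it to be scanned quickly by eye.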
Many other forms of annotation are possible (for an overview, see Leech, 2005).
Regarding annotation specific to spoken corpora, for example, prosodic annotation could be
used to indicate information about intonation, stress, and pausing. Additionally, symbols
may be added to transcripts for features such as filled pauses (e.g., uh, um) and repeated or
reformulated words or phrases. Segmentation is a common form of annotation in spoken
corpora and can be used at multiple levels. For instance, segmentation might be used to
separate speech from silence, to indicate discourse units such as turns, to separate phonemes
or syllables within a word, etc. In addition to ELAN, Praat is a commonly used digital tool
for segmentation and annotation of speech. Annotations in Praat are created in TextGrid
files, which can have multiple tiers as shown in Figure 8.2. Praat has a built-in feature to
automatically segment silence from speech called Annotate To TextGrid (silences…). Its
accuracy varies depending on the sound quality of the file, so manual post-checking is recommended.
Figure 8.2 Praat TextGrid with annotation including a word tier (1), syllable tier (2), and a consonant/
vowel tier (3). Reprinted from Ghanem et al. (2020) with permission
Given Praat’s wide usage, many other digital corpus tools (e.g., CLAN, ELAN, Phon) can read and/or
write Praat TextGrid files. Praat has also been used for annotation of
phonological corpora, which are particularly time- and effort-intensive to annotate. For
example, the LeaP corpus involved approximately 1,000 annotated events per minute (Gut,
2012), and the reliability of the annotation varied depending on what was being annotated. One of the most
important benefits of annotation, however, is that once completed, it allows for automatic
analysis of the corpus.
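As an illustration of how time-aligned annotation supports automatic analysis, the sketch below represents one annotation tier as a list of labeled intervals and derives simple pausing measures from it. It does not read or write Praat TextGrid or ELAN files; the Interval class, the convention that an empty label marks silence, the 0.25-second pause threshold, and the sample intervals are all assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class Interval:
    start: float  # onset in seconds
    end: float    # offset in seconds
    label: str    # empty string marks silence in this sketch

    @property
    def duration(self) -> float:
        return self.end - self.start

def pause_summary(tier, min_pause=0.25):
    """Summarize silent and spoken time on one tier (a list of Interval)."""
    # Only silences at least min_pause seconds long are counted as pauses.
    pauses = [iv for iv in tier if iv.label == "" and iv.duration >= min_pause]
    return {
        "n_pauses": len(pauses),
        "pause_time": sum(iv.duration for iv in pauses),
        "speech_time": sum(iv.duration for iv in tier if iv.label != ""),
    }

# Hypothetical word tier for a short stretch of speech
word_tier = [
    Interval(0.00, 0.40, "so"),
    Interval(0.40, 1.10, ""),
    Interval(1.10, 1.50, "I"),
    Interval(1.50, 1.60, ""),
    Interval(1.60, 2.20, "think"),
]

summary = pause_summary(word_tier)
print(summary["n_pauses"], round(summary["pause_time"], 2), round(summary["speech_time"], 2))
```

Because the measures are computed from the annotation rather than by hand, changing the pause threshold or adding further tiers requires only a rerun, which is the practical benefit of investing in annotation up front.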
Ghanem et al. (2020) provide an overview and evaluation of five commonly used digital
tools for spoken discourse, and recommendations for combining them efficiently across
different stages of data preparation and analysis for pronunciation corpora. They provide
documentation and evaluation of the five digital tools on their website.18
Amanda Huensch and Shelley Staples
Two major database providers, CECL and TalkBank, provide suggestions and/or example
guidelines for creating new corpora and embarking on multi-institutional collaborations.
Another recommendation is to increase training opportunities and materials for re-
searchers who currently work with or would like to work with spoken corpora. Many
programs are available for transcription, annotation, and analysis of spoken corpora – so
many that those new to the field might have difficulty deciding where to start. More established
programs (e.g., Praat, AntConc) usually have detailed documentation and user guides on the web.
Providing additional training opportunities, whether in pre-conference workshops or conference
presentations at venues such as the American Association for Corpus Linguistics (AACL), the
Pronunciation in Second Language Learning and Teaching (PSLLT) conference, or the Second
Language Research Forum (SLRF), or as part of a summer workshop session, is a great way to
increase skills and encourage wider use of spoken corpora for SLA.
7 Future Directions
Ultimately, while many potential benefits of using spoken corpora for SLA research exist,
the field is in its early stages. We have indicated areas of research that represent important
next steps. Thus far, we have highlighted the need for more longitudinal and phonological
corpora and research as well as examinations of spoken SLA combining experimental and
corpus-based methods. For projects such as these, collaborative efforts that bring together
researchers from multiple institutions and methodological expertise are likely to be most
successful. Such efforts require careful planning as well as consistency in data collection and
preparation.
One future direction that deserves mention is the need for spoken corpora representing
less commonly taught languages (LCTLs) as well as a greater range of L1–L2 pairings in general. Not
surprisingly, most corpora represent languages such as English, French, or Spanish. One
project working to collect data from two LCTLs, Russian and Portuguese, is the
Multilingual Corpus of Assignments – Writing and Speech (MACAWS).19
Notes
1 To clarify, by “second language acquisition of speaking,” we mean L2 acquisition (in or outside
instructional contexts) in the spoken mode, including investigation of phonology, syntax, and
pragmatics.
2 Université Catholique de Louvain’s Centre for English Corpus Linguistics (CECL);
https://uclouvain.be/en/research-institutes/ilc/cecl/corpora.html
3 SLABank; MacWhinney (2020), https://slabank.talkbank.org/
4 European Science Foundation Second Language (ESF); (Perdue, 1993),
https://slabank.talkbank.org/access/Multiple/ESF/
5 Languages and Social Networks Abroad Project (LANGSNAP); (Mitchell et al., 2017),
http://langsnap.soton.ac.uk/, https://scholarcommons.usf.edu/langsnap/
6 French Learner Language Oral Corpora (FLLOC); http://www.flloc.soton.ac.uk/
7 Spanish Learner Language Oral Corpora (SPLLOC); http://www.splloc.soton.ac.uk
8 Louvain International Database of Spoken English Interlanguage (LINDSEI); (Gilquin et al.,
2010), https://uclouvain.be/en/research-institutes/ilc/cecl/lindsei.html
9 Trinity-Lancaster Corpus (TLC); (Brezina et al., 2019),
http://cass.lancs.ac.uk/trinity-lancaster-corpus/
10 Parallèle Oral en Langue Etrangère ‘Parallel Oral Foreign Language’ (Parole); (Hilton, 2009),
https://slabank.talkbank.org/access/English/PAROLE.html
11 What is Speaking Proficiency (WiSP); (De Jong et al., 2015).
12 Intonational Variation in English (IViE); (Grabe et al., 2001),
http://www.phon.ox.ac.uk/files/apps/IViE/
Further Reading
Biber, D., & Reppen, R. (Eds.). (2015). The Cambridge handbook of English corpus linguistics.
Cambridge: Cambridge University Press.
This handbook covers major research areas within corpus linguistics, with an emphasis on English but
with relevance to research of other languages as well. The book contains helpful chapters on common
research areas, such as keyword and collocational analysis, as well as introductions to both spoken
corpus and learner corpus research.
Granger, S., Gilquin, G., & Meunier, F. (Eds.) (2015). The Cambridge handbook of learner corpus
research. Cambridge: Cambridge University Press.
This handbook provides a comprehensive guide to the rapidly developing field of learner corpus
research. The volume contains 27 chapters divided among parts devoted to corpus design and
methodology, learner language analysis, and intersections between learner corpus research and SLA,
language teaching, and natural language processing.
Tracy-Ventura, N., & Paquot, M. (2020). The Routledge handbook of SLA and corpora. New York:
Routledge.
This handbook begins with introductory chapters on corpus linguistics, learner corpus research
(LCR), SLA, and the intersections of SLA and LCR. The remainder of the handbook comprises three
parts: (a) aspects of corpus design, annotation, and analysis, (b) the role of corpora in SLA theory
and practice, and (c) SLA constructs (e.g., input, interaction, accuracy) and corpora. The handbook
ends with a chapter on future directions of the use of corpora in SLA.
References
Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics: Investigating language structure and use.
Cambridge: Cambridge University Press.
Biber, D., Gray, B., & Staples, S. (2016). Predicting patterns of grammatical complexity across textual
task types and proficiency levels. Applied Linguistics, 37, 639–668.
Biber, D., & Jones, J. K. (2009). Quantitative methods in corpus linguistics. In Corpus linguistics: An
international handbook (pp. 1286–1304). Berlin: De Gruyter Mouton.
Boersma, P., & Weenink, D. (2020). Praat: Doing phonetics by computer (Version 6.1.09) [Computer
program]. Retrieved from http://www.praat.org/
Brezina, V., Gablasova, D., & McEnery, T. (2019). Corpus-based approaches to spoken L2 production:
Evidence from the Trinity Lancaster Corpus. International Journal of Learner Corpus Research, 5, 119–125.
Buysse, L. (2012). So as a multifunctional discourse marker in native and learner speech. Journal of
Pragmatics, 44, 1764–1782.
Castello, E., & Gesuato, S. (2019). Holding up one’s end of the conversation in spoken English: Lexical
backchannels in L2 examination discourse. International Journal of Learner Corpus Research, 5,
231–252.
Chomsky, N. (1962). Paper given at the University of Texas 1958. In 3rd Texas conference on problems
of linguistic analysis in English. Austin, TX: University of Texas.
Crawford, W. (2021). Multiple perspectives on learner interaction: The corpus of collaborative oral tasks.
New York: DeGruyter.
Crossley, S. A., Salsbury, T., & McNamara, D. S. (2015). Assessing lexical proficiency using analytic
ratings: A case for collocation accuracy. Applied Linguistics, 36, 570–590.
De Cock, S. (2004). Preferred sequences of words in NS and NNS speech. Belgian Journal of English
Language and Literatures, 2, 225–246.
De Jong, N. H., Groenhout, R., Schoonen, R., & Hulstijn, J. H. (2015). Second language fluency:
speaking style or proficiency? Correcting measures of second language fluency for first language
behavior. Applied Psycholinguistics, 36, 223–243.
Durand, J., Laks, B., & Lyche, C. (2002). La phonologie du français contemporain: usages, variétés et
structure. In C. Pusch & W. Raible (Eds.), Romanistische Korpuslinguistik- Korpora und gesprochene
Sprache/Romance corpus linguistics – Corpora and spoken language (pp. 93–106). Tübingen: Gunter
Narr Verlag.
Edalatishams, I. (2017). LeaP corpus (review). In M. O’Brien & J. Levis (Eds.), Proceedings of the 8th
pronunciation in second language learning and teaching conference, ISSN 2380-9566, Calgary, AB,
August 2016 (pp. 236–240). Ames, IA: Iowa State University.
Ellis, N. C., Römer, U. & O’Donnell, M. B. (2016). Usage-based approaches to language acquisition and
processing: cognitive and corpus investigations of construction grammar. Language Learning
Monograph Series. Hoboken, NJ: Wiley-Blackwell.
Fernández, J. (2013). A corpus-based study of vague language use by learners of Spanish in a study
abroad context. In C. Kinginger (Ed.), Social and cultural aspects of language learning in study
abroad (pp. 299–332). Philadelphia: John Benjamins.
Fernández, J., & Yuldashev, A. (2011). Variation in the use of general extenders and stuff in instant
messaging interactions. Journal of Pragmatics, 43, 2610–2626.
Friginal, E., Lee, J. J., Polat, B., & Roberson, A. (2017). Exploring spoken English learner language
using corpora: Learner talk. New York: Springer.
Gablasova, D., Brezina, V., & McEnery, T. (2019). The Trinity Lancaster corpus: Development, de-
scription and application. International Journal of Learner Corpus Research, 5, 126–158.
Gilquin, G. (2008). Hesitation markers among EFL learners: Pragmatic deficiency or difference? In J.
Romero-Trillo (Ed.), Pragmatics and corpus linguistics: A mutualistic entente (pp. 119–149). Berlin:
Mouton de Gruyter.
Gilquin, G. (2019). Light verb constructions in spoken L2 English: An exploratory cross-sectional
study. International Journal of Learner Corpus Research, 5, 181–206.
Ghanem, R., Edalatishams, I., Huensch, A., Puga, K., & Staples, S. (2020). The effectiveness of
computer programs in the transcription and analysis of spoken discourse: towards a protocol for
pronunciation corpora. In O. Kang, S. Staples, K. Yaw, & K. Hirschi (Eds.), Proceedings of the 11th
pronunciation in second language learning and teaching conference (pp. 97–114). Ames, IA: Iowa State
University.
Gilquin, G., De Cock, S., & Granger, S. (2010). The Louvain international database of spoken English
interlanguage. Handbook and CD-ROM. Louvain, Belgium: Presses universitaires de Louvain.
Gilquin, G., & Gries, S. (2009). Corpora and experimental methods: A state-of-the-art review. Corpus
Linguistics and Linguistic Theory, 5, 1–26.
Götz, S. (2019). Filled pauses across proficiency levels, L1s and learning context variables: A multi-
variate exploration of the Trinity Lancaster Corpus Sample. International Journal of Learner Corpus
Research, 5, 159–180.
Götz, S. (2013). Fluency in native and nonnative English speech. Philadelphia, PA: John Benjamins.
Grabe, E., Post, B. & Nolan, F. (2001). The IViE corpus. Department of Linguistics, University of
Cambridge. http://www.phon.ox.ac.uk/old_IViE.
Granger, S. (2009). The contribution of learner corpora to second language acquisition and foreign
language teaching: A critical evaluation. Corpora and Language Teaching, 33, 13–32.
Granger, S. (1998). The computerized learner corpus: A versatile new source of data for SLA research.
In S. Granger (Ed.), Learner English on computer (pp. 3–18). London: Longman.
Gudmestad, A., Edmonds, A., & Metzger, T. (2019). Using variationism and learner corpus research to
investigate grammatical gender marking in additional language Spanish. Language Learning, 69, 911–942.
Gut, U. (2017). Phonological development in different learning contexts. International Journal of
Learner Corpus Research, 3, 196–222.
Gut, U. (2012). The LeaP corpus. A multilingual corpus of spoken learner German and learner
English. In Th. Schmidt & K. Wörner (Eds.), Multilingual corpora and multilingual corpus
analysis (pp. 3–23). Amsterdam: John Benjamins.
Gut, U., & Voormann, H. (2014). Corpus design. In J. Durand, U. Gut, & G. Kristoffersen (Eds.), The
Oxford handbook of corpus phonology (pp. 13–26). Oxford: Oxford University Press.
Hilton, H. E. (2014). Oral fluency and spoken proficiency: Ideas for testing and research. In P. Leclercq,
A. Edmonds, & H. Hilton (Eds.), Measuring L2 proficiency: Perspectives from SLA (pp. 27–53).
Bristol, UK: Multilingual Matters.
Hilton, H. (2009). Annotation and analyses of temporal aspects of spoken fluency. CALICO Journal,
26, 644–661.
Huensch, A. (2020). Fluency. In N. Tracy-Ventura, & M. Paquot (Eds.), The Routledge handbook of
SLA and corpora. New York: Routledge.
Huensch, A., & Staples, S. (2018). Towards a protocol for a multilingual corpus for pronunciation
researchers. Pronunciation in Second Language Learning and Teaching, Ames, Iowa.
https://apling.engl.iastate.edu/conferences/pronunciation-in-second-language-learning-and-teaching-conference/psllt-archive/
Huensch, A., & Tracy-Ventura, N. (2017). Understanding second language fluency behavior: The ef-
fects of individual differences in first language fluency, cross-linguistic differences, and proficiency
over time. Applied Psycholinguistics, 38, 755–785.
Huensch, A., Tracy-Ventura, N., Bridges, J., & Cuesta-Media, J. (2019). Variables affecting the
maintenance of L2 proficiency and fluency four years post-study abroad. Study Abroad Research in
Second Language Acquisition and International Education, 4, 96–125.
Leech, G. (2005). Adding linguistic annotation. In M. Wynne (Ed.), Developing linguistic corpora: A
guide to good practice (pp. 17–29). Oxford: Oxbow Books, http://users.ox.ac.uk/~martinw/dlc/
index.htm
MacWhinney, B. (2020). TalkBank and SLA. In N. Tracy-Ventura, & M. Paquot (Eds.), The Routledge
handbook of SLA and corpora. New York: Routledge.
MacWhinney, B. (2017). A shared platform for studying second language acquisition. Language
Learning, 67, 254–275.
MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk (3rd edn). Mahwah, NJ:
Lawrence Erlbaum.
Marsden, E., Mackey A., & Plonsky, L. (2016). The IRIS Repository: Advancing research practice and
methodology. In A. Mackey & E. Marsden (Eds.), Advancing methodology and practice: The IRIS
repository of instruments for research into second languages (pp. 1–21). New York: Routledge.
McEnery, T., Brezina, V., Gablasova, D., & Banerjee, J. (2019). Corpus linguistics, learner corpora,
and SLA: Employing technology to analyze language use. Annual Review of Applied Linguistics,
39, 74–92.
McEnery, T., & Hardie, A. (2012). Corpus linguistics: Method, theory and practice. Cambridge:
Cambridge University Press.
McEnery, T., & Wilson, A. (2001). Corpus linguistics (2nd edn). Edinburgh: Edinburgh University
Press.
McManus, K., Mitchell, R., & Tracy-Ventura, N. (2020). Longitudinal study of advanced learners’
linguistic development before, during, and after study abroad. Applied Linguistics. doi: 10.1093/
applin/amaa003.
McManus, K. & Mitchell, R. F. (2015). Subjunctive use and development in L2 French: A longitudinal
study. Language, Interaction and Acquisition, 6(1), 42–73.
Meunier, F., & Littre, D. (2013). Tracking learners’ progress: Adopting a dual ‘corpus cum experi-
mental data’ approach. The Modern Language Journal, 97, 61–76.
Mitchell, R., Tracy-Ventura, N., & Huensch, A. (2020). After study abroad: The long-term evolution of
multilingual identity among anglophone languages graduates. Modern Language Journal, 104(2),
327–344.
Mitchell, R., Tracy-Ventura, N., & McManus, K. (2017). The Anglophone student abroad: Identity,
social relationships and language learning. New York: Routledge.
Myles, F. (2015). Second language acquisition theory and learner corpus research. In S. Granger, G.
Gilquin, & F. Meunier (Eds.), The Cambridge handbook of learner corpus research (pp. 309–331).
Cambridge: Cambridge University Press.
Myles, F. (2008). Investigating learner language development with electronic longitudinal corpora:
Theoretical and methodological issues. In L. Ortega, & H. Byrnes (Eds.), The longitudinal study of
advanced L2 capacities (pp. 58–72). New York: Routledge.
Ortega, L., & Iberri-Shea, G. (2005). Longitudinal research in second language acquisition: Recent
trends and future directions. Annual Review of Applied Linguistics, 25, 26–45.
Perdue, C. (Ed.) (1993). Adult language acquisition. Vol 1: field methods. Cambridge: Cambridge
University Press.
Pérez-Paredes, P., & Díez-Bedmar, M. B. (2019). Certainty adverbs in spoken learner language: The
role of tasks and proficiency. International Journal of Learner Corpus Research, 5, 253–279.
Picoral, A. (2020). L3 Portuguese by Spanish-English bilinguals: Copula construction use and acqui-
sition in corpus data (Publication No. 27957666). [Doctoral dissertation, University of Arizona].
ProQuest Dissertations Publishing.
Polat, B. (2011). Investigating acquisition of discourse markers through a developmental learner
corpus. Journal of Pragmatics, 43, 3745–3756.
Rose, Y., & MacWhinney, B. (2014). The PhonBank project: Data and software-assisted methods for
the study of phonology and phonological development. In J. Durand, U. Gut, & G. Kristoffersen
(Eds.), The Oxford handbook of corpus phonology (pp. 380–401). Oxford: Oxford University Press.
Rosen, A. (2016). The fate of linguistic innovations: Jersey English and French learner English com-
pared. International Journal of Learner Corpus Research, 2, 302–322.
Römer, U., & Garner, J. R. (2019). The development of verb constructions in spoken learner English:
Tracing effects of usage and proficiency. International Journal of Learner Corpus Research, 5, 207–230.
Staples, S. (2021). Exploring the impact of situational characteristics on the linguistic features of spoken
oral assessment tasks. In W. Crawford (Ed.), Multiple perspectives on learner interaction: The corpus
of collaborative oral tasks (pp. 123–144) Berlin: DeGruyter.
Staples, S., LaFlair, G., & Egbert, J. (2017). A multi-dimensional comparison of oral proficiency in-
terviews to conversation, academic and professional spoken registers. Modern Language Journal,
101, 194–213.
Tracy-Ventura, N., & Huensch, A. (2018). The potential of publicly shared longitudinal learner corpora
in SLA research. In A. Gudmestad & A. Edmonds (Eds.), Critical reflections on data in second
language acquisition (pp. 149–170). Philadelphia/Amsterdam: John Benjamins.
Vercellotti, M. L. (2017). The development of complexity, accuracy, and fluency in second language
performance: A longitudinal study. Applied Linguistics, 38, 90–111.
Wittenburg, P., Brugman, H., Russel, A., Klassmann, A., & Sloetjes, H. (2006). ELAN: A professional
framework for multimodality research. In Proceedings of LREC 2006, Fifth international conference
on language resources and evaluation. https://tla.mpi.nl/tools/tla-tools/elan/
Zhao, G., Sonsaat, S., Silpachai, A., Lucic, I., Chukharev-Hudilainen, E., Levis, J. M., & Gutierrez-Osuna, R.
(2018). L2 ARCTIC: A non-native English speech corpus. Proceedings of interspeech (Hyderabad, India).
Corpus: BeMaTaC
Language/Proficiency: German L2 (n = 10) [advanced prof = C1/C2]; German NS (n = 24)
Size: 18,123
Data/Annotations: Transcripts (EXMARaLDA); Sound files (wav, mp3); Video files (mov, webm)
Strengths: Rich metadata; Dialogic
Weaknesses: Limited PRON annotations; Single task
Task: Info-gap
Access: Free; download ANNIS

Corpus: CCOT
Language/Proficiency: English L2 (n = 600); three proficiency levels [TOEFL 32–69]
Size: 268,324; 775 files
Data/Annotations: Transcripts; Sound files (wav)
Strengths: Dialogic; multiple L1s represented; multiple tasks represented
Weaknesses: No PRON annotations; Variable sound quality
Task: 24 different tasks
Access: Free; contact creator (William.Crawford@nau.edu)

Corpus: FLLOC
Language/Proficiency: Collection of 8 corpora (n = 491 participants, aged 5–23) [varying proficiency]
Size: 40,00 files; >3 million words
Data/Annotations: Transcripts (CHAT/CLAN); Sound files (wav, mp3); POS-tagged/MOR
Strengths: Large; Comparisons w/ SPLLOC
Weaknesses: Limited PRON annotations; Variable sound quality
Task: Elicitation tasks; Narratives; Interview
Access: Free; download

Corpus: HKCSE (Prosodic)
Language/Proficiency: L1 Hong Kong Chinese, L2 English (n=643,286) [advanced proficiency]; L1 English (227,894); L1 Other (29,064)
Size: 900,214 words; 311 recordings
Data/Annotations: Transcripts (txt); Brazil annotations (searchable through iConc interface)
Strengths: Large; Wide range of naturally occurring tasks; Annotation using Brazil’s system
Weaknesses: Highly proficient L2 speakers; No sound files
Task: Business (e.g., job interview; presentations, service encounters); Academic (student presentations, lectures); Public (speeches, interviews)
Access: $125 CD-ROM

Corpus: LANGSNAP
Language/Proficiency: French L2 (n=29); Spanish L2 (n=27) [advanced proficiency]; French NS (n=10); Spanish NS (n=10)
Size: 742,203 words; 1,238 files
Data/Annotations: Transcripts (CHAT/CLAN); Sound files (wav, mp3); POS-tagged/MOR
Strengths: Longitudinal; Controlled and free tasks; Multiple L2s
Weaknesses: Limited PRON annotations; Variable sound quality
Task: Interview; Story retell; Essay
Access: Free; download

Corpus: LeaP
Language/Proficiency: L1 German, L2 English (n=176); L1 English, L2 German (n=183) [advanced and intermediate? proficiency]; L1 English (n=8); L1 German (n=10)
Size: 73,941 words; 12 hours
Data/Annotations: Transcripts (XML-based TASX format); Syllables, segments, pitch accents and boundary tones, intonation contours, part of speech, lemmas (Praat); Sound files (wav)
Strengths: Detailed segmental/suprasegmental annotation; Controlled and free tasks
Weaknesses: Low inter-annotator agreement; Relatively small
Task: Free speech in an interview; Reading a story; Retelling a story; Reading nonsense word list
Access: Free; download

Corpus: LINDSEI
Language/Proficiency: ~50 files each of L2 English from the following L1s: Bulgarian, Chinese, Dutch, French, German, Greek, Italian, Japanese, Polish, Spanish, Swedish (Intermediate to Advanced [10 files per L1 CEFR evaluated])
Size: > 1 million words; 792,000 of learner language; 554 interviews; 130 hours
Data/Annotations: Transcripts (XML); Fluency annotations: filled and unfilled pauses
Strengths: Multiple L1s; Controlled and free tasks
Weaknesses: Limited PRON annotations; No sound files; Proficiency information not clear (only 5 interviews from each group were rated)
Task: Interview; Informal chat; Picture description
Access: €211.75 CD-ROM

Corpus: Speech Accent Archive
Language/Proficiency: L1 and L2 English; ~350 L1s [proficiency not provided]
Size: 2,642 samples; 69 words per sample
Data/Annotations: Phonetic transcription
Strengths: Extremely broad range of L1s; Comparability across samples
Weaknesses: Read speech; No proficiency information
Task: One passage of read speech
Access: Free

Corpus: SPLLOC
Language/Proficiency: 2 corpora, each having: L2 Spanish (n=60) [varying proficiency]; L1 Spanish (n=15)
Size: 575 files
Data/Annotations: Transcripts (CHAT/CLAN); Sound files (wav, mp3); POS-tagged/MOR
Strengths: Comparisons w/ FLLOC
Weaknesses: Limited PRON annotations; Variable sound quality
Task: Elicitation tasks; Narratives; Interview
Access: Free; download
9
SPEAKING ASSESSMENT
Noriko Iwashita
1 Introduction/Definitions
In today’s globalized society, where the world is interconnected through vastly increased
trade and cultural exchanges, the demand for excellent communicative competence continues
to grow for employment, study, immigration, and travel opportunities. This gives rise to a
need for appropriate assessment tools for speaking. While speaking is one of the most valued
skills in language teaching (Lado, 1961), its assessment is relatively new compared with other
skills (Fulcher, 2003).
Since speaking assessment tools were first introduced in the early 1980s, their format has
undergone many changes. In step with the format prevalent in each period, researchers have
investigated various aspects of speaking assessment in order to support appropriate inferences
about test-takers’ language ability. Early studies were mostly quantitative and drew on large
samples. More recently, however, as test-takers’ interactional ability has been increasingly
recognized as an important element of speaking assessment, there has been a shift from
individualistic assessment to an approach that incorporates co-construction and seeks an
alternative view that recognizes the fundamentally social nature of language assessment
(Roever & Kasper, 2018). This shift has brought a variety of qualitative methods into use.
With advances in communication technology and its increased use in daily communication,
computer-mediated delivery has been implemented in large standardized tests (e.g., APTIS,
Duolingo, PTE Academic, TOEFL iBT, Versant English Test), and automated scoring systems have
been introduced in some of the large commercial tests. This new mode of speaking assessment
eases the complexity of administering speaking tests, but it has also brought challenges in
operationalizing the speaking construct (Galaczi & Taylor, 2018).
This chapter presents an overview of current research on speaking assessment, preceded by a
historical overview. Several scholars have already contributed reviews on broader areas of
speaking assessment (e.g., Ginther, 2013; Isaacs, 2017; O’Sullivan, 2013). To complement
their reviews, this chapter focuses on studies that examined test-taker performance in
speaking assessment. Speaking assessment here includes monologues, conversations, and
interviews involving oral interaction with one or more other people, who may be an examiner,
an interviewer, or a fellow test-taker. Although other issues, such as washback and test use,
are important areas of speaking assessment research, they are not covered in this chapter
because of space constraints.
2 Historical Perspectives
The direct format of speaking assessment, in which test-takers’ speaking ability is assessed
through face-to-face communication or in a form of monologue, was not introduced until the
early 1980s (Clark, 1979). It is worth noting that the format of speaking assessment corre-
sponded with the teaching methodologies employed at the time. For example, when a
structure-based curriculum was prevalent, speaking assessment consisted of an “indirect” test
where test-takers were required to identify sounds, words, or other aspects of the language in
a spoken text (e.g., dialogue). As a communicative-oriented approach to language teaching
came to be widely applied, direct assessment was gradually implemented. Over time, two formats
have become common (Harding, 2014): direct assessment, also referred to as performance
assessment, in which test-takers perform a task that reflects language use outside the
classroom (e.g., face-to-face interviews, group orals); and semi-direct tests, in which
test-takers speak in response to a prompt delivered by phone, audio-recording, or computer.
While performance assessments involving interaction with an
interviewer or other test-taker(s) are extensively implemented in both commercial tests [e.g.,
Cambridge Examination, the Trinity College Integrated Skills in English (ISE), IELTS] and
classroom assessments, it should be noted that traditional formats of speaking assessment
such as monologues and responding to prompts are also widely available in both computer-
mediated tests (e.g., APTIS, Duolingo, PTE Academic, TOEFL iBT, Versant English Test)
and in classroom assessment.
In the early days of language assessment and testing, scholars debated whether language
proficiency was unitary or multi-trait. On the unitary view (Oller, 1979), language ability
cannot be divided into components; instead, there is a general factor of language proficiency.
It is, however, now generally agreed that language proficiency is multi-componential (Bachman,
1990). Currently, the communicative language ability model proposed by Bachman (1990)
and Bachman and Palmer (1996) (based on Canale & Swain, 1980) is the best-known model
in the field of language assessment. The model has therefore been used widely as a basis for
many communicative language tests and frameworks (e.g., the Common European Framework
of Reference, CEFR), although it may not be explicitly mentioned. In this model, overall
language competence first comprises organizational and pragmatic competences, and each of
these is then further divided into several types of competence (i.e., organizational competence
into grammatical and textual competences; pragmatic competence into illocutionary and
sociolinguistic competences). The model provides a comprehensive list of the competences
involved in language ability and is therefore considered the most elaborate model
(Alderson & Banerjee, 2002).
Research in speaking assessment has investigated how various factors (e.g., test-taker, rater,
task, and interlocutor) affect speaking performance, drawing on the models of speaking
performance proposed by McNamara (1996), Fulcher (2003), and Skehan (2009). In their models,
the quality of the test performance reflected in the test score depends not only on the
test-taker’s language ability, but also on the test tasks and the rating. For example, a task
is usually easier when the topic is familiar than when it is unfamiliar, and some test-takers
may find it easier to respond to interview questions than to present a speech. Task conditions
such as preparation (planning) time or access to prompt materials may also influence the
difficulty of the task. As for rating, test scores depend on what is described in the rating
scale (e.g., delivery, grammatical accuracy), and they are also affected by raters’ leniency
or harshness. In essence, these scholars suggest that test performance should be interpreted
in terms of the interactions among the abilities of the test-taker, the task, and the rating
involved in speaking assessments. These models were developed incrementally and are not
mutually exclusive, although the emphasis in each model differs slightly.
In early research on speaking assessment, studies were mainly conducted in the context of
the oral proficiency interview (OPI), developed by the Foreign Service Institute (FSI) in the
United States and its associated US Government agencies, which is known as the ACTFL
Oral Proficiency Interview (ACTFL OPI). The construct of speaking assessment, rating, and
test scores were the main foci of these investigations.
For example, some scholars investigated how different aspects of language ability con-
tribute to overall communicative language ability. This line of enquiry was undertaken
largely in the context of rating scale development and contributed to clarifying the construct
of language proficiency through the analyses of test scores and/or teachers’ and/or assessors’
perceptions of test-taker performances. Higgs and Clifford (1982), for instance, analyzed
raters’ perceptions of the relative role of the five component factors constituting global
proficiency (i.e., vocabulary, grammar, pronunciation, fluency, and sociolinguistics).
Subsequently, they proposed a relative contribution model (RCM), arguing that various
aspects of language contribute differently to overall language proficiency at the levels defined
in the FSI scale. According to the RCM, vocabulary and grammar contribute to overall
proficiency across all levels, but as levels increase, some aspects, such as pronunciation,
fluency, and sociolinguistic factors, become more prominent than other factors. In the
development of the ACTFL Guidelines based on the FSI scale, proficiency was described in
terms of communicative growth comprising four main contributors: function, content,
context, and accuracy (Breiner-Sanders et al., 1999). Other studies focused their
investigations on communicative language ability through tasks (Raffaldini, 1988), evaluation of the
construct validity of ACTFL guidelines (Alonso, 1997), and rater behaviour (Thompson,
1995). These studies were conducted across levels (secondary to university) and languages
(e.g., Mandarin Chinese, English, German, Japanese, and Spanish).
As the OPI became widely implemented, scholars questioned whether test-takers’
conversational ability could be satisfactorily assessed with an OPI (e.g., van Lier, 1989).
Accordingly, employing discourse and conversation analysis methods, various aspects of
OPI were investigated including the resemblance of an OPI and a conversation, and the
nature of communication observed in an OPI as a speech event (e.g., He, 1998; Johnson &
Tyler, 1998; Lazaraton, 1992). These studies made a significant contribution to uncovering
the characteristics of spoken interactions observed in an OPI. They concluded that the
speech elicited in this context cannot be considered to reflect a conversation in real life
because of the lack of natural conversation features, including turn-taking management and
topic negotiation (e.g., Lazaraton, 1992).
The limitations of OPI reported in the studies resulted in the exploration of alternative
modes of speaking assessment such as group/paired interviews in both in-house classroom
and standardized tests. In the group/paired interview format of speaking assessment, test-
takers are asked to perform the task with another test-taker (paired interview) or in a group
with/without the interviewer’s presence. The introduction of pair/group interviews in large
commercial tests and classroom contexts was motivated by practicality in terms of cost, time,
and resources (e.g., Ockey, 2009; Van Moere, 2006). Also, the pair/group interview format is
similar to learners’ engagement experiences in their classroom activities (e.g., Taylor, 2000),
resulting in positive washback in the classroom for both students and teachers. In addition,
the pair/group interview is considered to be more authentic for test-takers’ real-life situations
and, therefore, may provide a closer link between test results and the target language use
situation (Bachman & Palmer, 1996).
As the pair/group interview format was introduced, various aspects of this format of
speaking assessment have been investigated. The earlier research focused on the effect of
interlocutor variables on test-taker performance in terms of score and features identified in
discourse analyses. The variables include interlocutor status (i.e., interviewer/examiner or
test-taker) (e.g., Brooks, 2009; Taylor, 2000), personality traits (e.g., Nakatsuhara, 2011;
Ockey, 2009), proficiency (e.g., Davis, 2009; Davis et al., 2018; Iwashita, 1996; Lazaraton,
1992; Nakatsuhara, 2006), and familiarity (e.g., O’Sullivan, 2002). On the whole, positive
findings in the use of pair interview formats have been reported. In some studies, however,
the quality of language produced in this format of assessment was observed to vary according to
interlocutor variables (e.g., proficiency, familiarity, and personality), but it was reported that
variables such as personality traits were harder to control (Iwashita, 2019). Furthermore,
some studies found that the features identified in the analysis of test-discourse were not
always found in the rating scale (e.g., Brooks, 2009; Nakatsuhara, 2006; Taylor, 2000), which
raised a question about the validity of pair/group interviews.
In the broader context of communicative language tests, there is ongoing research interest
in the aspects of speaking in the speaking performance models introduced earlier. This
includes how task content and format (i.e., independent vs. integrated, e.g., Frost et al., 2012;
Iwashita et al., 2008); implementation conditions (i.e., planning time, e.g., Elder & Iwashita,
2005); ratings (i.e., holistic vs. analytic scale, rater background, rater cognition, automated
scoring, e.g., Ducasse & Brown, 2009); and test-taker attributes (i.e., L1, working memory
capacity, personality trait, and test-taking strategy, e.g., Crossley & Kim, 2019) influence test
performance in the context of both commercial and classroom/in-house tests. These studies
were undertaken in both monologic and interactional tasks in the context of face-to-face or
computer-mediated settings.
As explained earlier, recent advances in communication technology have contributed to the
wide implementation of computer-mediated and automated scoring systems. Accordingly,
some studies investigated the equivalence of different modes of the oral proficiency interview
(e.g., SOPI and OPI) (e.g., O’Loughlin, 1995), while others explored test-takers’ reaction to
face-to-face interviews and computer-mediated tests (e.g., Qian, 2009). Similarly, for rating, the
comparability of automated scoring with human ratings has been explored (e.g., Neumeyer
et al., 2000). While the empirical findings from studies in computer-based assessment and
automated scoring systems indicate that they overcome some challenges in face-to-face
assessment (i.e., cost, administration, subjective rating; e.g., Isaacs, 2017), the findings clearly
show limitations in the use of these modes of assessment and scoring systems. In particular,
given current technological limitations, the unidirectional character of computer-based
assessment makes it difficult to assess the interactional aspects of speaking, which results in a
narrowing of the construct in this mode of assessment. Further discussion will be presented in
the Current Contributions and Research Part.
In summary, the format of speaking assessment has undergone many changes in tandem
with the development of teaching methodology and technological advances. While both
commercial and classroom assessments employ a traditional mode of speaking assessment
(i.e., monologue), speaking assessment involving face-to-face interaction is increasing.
Furthermore, technological advances have contributed to the wider implementation of
computer-mediated testing and automated scoring, which is now being used for high-stakes
purposes. The development of a variety of performance assessments and the implementation of
computer-mediated tests have resulted in studies that have contributed to identifying those
aspects of a test that potentially influence performance. In particular, the introduction of
[...]
communication (Roever & Kasper, 2018). Commercial tests such as Cambridge English:
First, Trinity College ISE, and the Examination for the Certificate of Competency in English
incorporate IC explicitly or implicitly in the rating scale, such as assessing interactive aspects
of task fulfilment, turn-taking, including initiation and elaboration of turns, repair, and topic
negotiation.
While increased attention has been drawn to non-cognitive views of assessment, this type
of assessment has also attracted criticisms including difficulties with standardization, and the
impact of interlocutors’ interactional behaviour on test-takers (Galaczi & Taylor, 2018).
There is a long tradition of research on interactional behaviours in both face-to-face
interviews and pair/group tasks. For example, McNamara and Lumley (1997) and Brown (2003)
reported a significant impact from interlocutors’ behaviours, such as questioning techniques
and rapport, on the test-takers. More recently, Roever and Kasper (2018) and Ross (2018)
showed that how the interviewers/examiners shape the interaction (e.g., question technique,
rapport, topic shift) determines test-takers’ opportunities to demonstrate their interactional
competence.
In contrast to the current trend of adopting a non-cognitive view in test development and
research, the recent development of computer-mediated tests largely draws on the
psycholinguistic view of assessment (Galaczi & Taylor, 2018). In psycholinguistic-based
assessments, the language elicited in the test performance is not usually embedded in interaction, as
social interaction is not included in the assessment criteria. Within this perspective, the
speaking construct is described in terms of efficiency of processing and automaticity (referred
to as “near effortless processing of language”, p. 326), with strong emphasis on the cognitive
dimension of speaking in monologue (Van Moere, 2012). Van Moere (2012) is critical of the
language assessment literature for paying little attention to automaticity as a characteristic of
a competent L2 speaker. Further, he recommends that, to assess well-defined
psycholinguistic constructs, tasks should require evidence of a test-taker’s capacity to handle the
language (i.e., morphology, syntax, and lexis) that would be processed in real-life domains.
This perspective of assessment considers interactional strategies (such as turn-taking and
organizing ideas) as social skills, and holds that a speaking assessment incorporating these skills
may consequently become less standardized and reliable. Furthermore, incorporation of
these strategies may lead to the inclusion of other factors such as personality (referred to as a
“construct irrelevant trait”) (Van Moere, 2012). It should be noted that many native speakers
with poor turn-taking or other interactional strategies are still able to communicate well.
Similarly, according to Kormos (2006) and Dörnyei and Kormos (1998), based on the speech
processing production model (Levelt, 1989), communicative success largely depends on an
individual speaker’s ability to employ linguistic resources.
[...]
analyses of group assessment tasks in Hong Kong employing conversation analysis method.
Roever and Kasper (2018) also found both quantitative and qualitative differences in the use
of preliminaries (i.e., the conversational move which occurs before initiative actions such as
invitations, announcements, and requests) between higher- and lower-proficiency L2 speakers. While
some differences of the features under study across the levels were observed in the studies
identified earlier, how the features observed in their studies represent IC remains unclear
(Galaczi & Taylor, 2018).
Features of IC have also been examined in rater studies through verbal protocols
collected during rater assessment of paired interview performances for validation of rating
scales. For example, May (2009) reports that 12 native speaker raters identified three key
features of the interaction (i.e., collaborating, cooperating, and assisting each other).
Similarly, Ducasse and Brown (2009) found non-verbal interpersonal communication (use
of body language and gaze), interactive listening (the test-takers’ manner of displaying
attention or engagement), and interactional management (the management of the topics
and turns) as the main features of interaction. These findings revealed that the interactional
aspects of performance reported in the raters’ verbal reports are rarely mentioned in rating
scales in general, and suggested the need for a scale that reflects the complexities of IC in a
paired speaking test. Building on earlier studies, May et al. (2020) recently analyzed the
stimulated verbal reports on paired interactions, focusing on interactional features of test-
taker performance. Through thematic analysis of examiner comments, 9 main categories
and over 50 sub-categories were identified. These three studies support the claim of Fulcher
et al. (2011) for a performance data-driven approach (PDA) that could provide a richer
description of test-taker performance.
Although the majority of IC studies in speaking assessment draw on sociolinguistics for
theoretical orientation, a range of interactional strategies (e.g., negotiation of meaning,
responding to clarification requests) observed in test performance (e.g., Ramazani et al., 2018;
van Batenburg, Oostdam et al., 2018) has been investigated on the basis of psycholinguistic
and cognitive underpinnings. For example, van Batenburg et al. (2018) examined test-takers’
abilities to employ self-supporting and other-supporting strategies in the performance of six
dialogic tasks with a scripted speech by pre-vocational learners in the Netherlands using both
holistic (focusing on linguistic accuracy and interactional ability) and analytic categories (in
terms of compensation, meaning negotiation and correcting misinterpretation) based on the
CEFR descriptors (Council of Europe, 2001). The high levels of agreement among the three
raters led the researchers to justify the assessment of interactional ability by examining
individual test-takers’ use of strategies; they concluded that IC can be regarded as an individual
trait and an integral part of speech production.
Although psycholinguistic-based research has contributed to the current literature on the
assessment of IC, there are concerns about employing this view (e.g., Roever & Kasper, 2018)
because of its focus on individual performance and its treatment of interactional features of
performance as construct irrelevant variables (Van Moere, 2012). The same concern has been
raised about the construct drawn from the current format of computer-based tests because it
does not include interactions, unlike face-to-face interviews (e.g., Plough et al., 2018).
However, new developments have been proposed to overcome these limitations through the
introduction of video-conference technologies such as Skype and Zoom, and speech
recognition systems to provide opportunities for interaction.
The video-conference modes enable both speaking and visual input to occur, which may
tap into interactional resources (Iwashita et al., 2021). Further, according to Davis et al.
(2018), incorporating face-to-face interaction into computer-based platforms with these
devices potentially diversifies task types in these assessments. Accordingly, a small number of
studies have examined the feasibility of this new development in existing testing by
comparing test performance with face-to-face interviews.
Kim and Craig (2012) compared test-taker performance in face-to-face and video-
conferenced oral interviews and found no significant difference in either overall or analytic
scores between the two test modes. Qualitative analysis of test performance also revealed
comparability in the two modes in terms of comfort, computer familiarity, environment,
non-verbal linguistic cues, interests, speaking opportunities, and topic/situation factors, with
little interviewer effect.
More recently, building on earlier studies, Nakatsuhara et al. (2017) compared
performances of IELTS speaking in face-to-face and video-conference modes in terms of the scores
and features observed in discourse analysis. While the scores were identical, some differences in
the language test-takers used in the two modes were found. In the video-mediated mode, more
clarifications were requested in Speaking Part 1 (i.e., test-takers answer questions about family,
work, and interests) and Part 3 (i.e., test-takers answer questions relating to the topic that the
test-taker spoke about in Part 2), suggesting that in video-conference mode, skills such as
interactive listening and signalling communication breakdown may be required. The impact of
the two modes on the interaction was further investigated in examiners’ use of verbal and
non-verbal cues, such as back-channelling, nodding, eye contact, and other gestures. The analysis
revealed that examiners used different cues according to the interaction mode. Further,
examiners said that turn-taking and eye contact are challenging in video-conference mode. The
differences between face-to-face and video modes that emerged in this study point to the
integral role of IC in the speaking construct as well as the context-dependent nature of IC.
Davis et al. (2018) investigated the feasibility of delivering interactive speaking assessment
online through the perceptions of participants in the United States and China and their
performances on speaking tasks involving a moderator and two or three participants. Turn-
taking, collaboration (i.e., sharing the floor), engagement (i.e., contributing to the elaboration
of topics), and appropriateness (i.e., communicating in a pragmatically appropriate way) were
included in the scoring rubric. As reported in Nakatsuhara et al. (2017), technological stability
and test-taker familiarity with access to technology were challenges. Analysis of test-taker
performance revealed limited ranges of visual input available to participants in a face-to-face
interaction through video. However, considering that conversations and job interviews often
take place via video-conference, Davis et al. (2018) recommend that this assessment mode be
introduced along with research on spoken interaction in this new format.
In addition to face-to-face interviews through video-conferencing, other recent
developments include speaking assessment with the use of a virtual environment, where test-takers are
required to participate in a discussion in a library setting and interact with avatars (Ockey
et al., 2017), and speech technologies that enable an electronic device to recognize and
analyze spoken words such as a speech recognition device (e.g., Litman et al., 2018).
Attempts to incorporate IC in computer-mediated testing are promising, but there are
challenges in terms of stable internet connection and test-takers’ access to and familiarity
with devices. Furthermore, research has revealed that new aspects of IC, which arise in online
environments rather than in a face-to-face environment and are shown by examiners’ interactional
behaviours, should be considered.
In summary, the growing recognition of IC as an integral part of speaking assessment has
resulted in studies identifying features of IC, incorporation of IC into tasks, and
rating scales incorporating IC. The research findings have contributed to further
understanding of IC in speaking assessment, but at the same time have raised several challenges
(Galaczi & Taylor, 2018) regarding the definition of IC, the scalability of these features in
rating scales, and the potentially task-dependent nature of the features of IC. These
challenges have added further complexities to incorporating IC into tasks and rating scales in
the growing number of computer-mediated tests, considering the current limitations of
communication technology. Developments have been made by introducing video-conference
mode in face-to-face interviews, the use of a virtual environment, and speech recognition
systems. The findings of the small number of research projects have shown promise for the
future of incorporating technology in the assessment of IC in computer-mediated tests.
[...]
conversational partners (Ross, 2018; Young, 2011). It is, therefore, important to help
learners understand the context of communication (e.g., situation, interlocutor, the speaker’s
role), and to deploy the language knowledge specific to the context.
Finally, the increased implementation of large, commercial, computer-mediated speaking
tests has increased test-taker access, but communication via technology results in different
communication strategies being employed in video-mediated interviews (Davis et al., 2018;
Nakatsuhara et al., 2017). Considering that communication via technology is a part of our
daily lives, teachers are encouraged to include this new context in their teaching materials and
test preparation. With an increased demand for global communication and advancement of
communication technology, practitioners are expected to devise appropriate speaking
assessments, considering context and test-takers’ needs. Successful implementation of appropriate
assessment tools requires continuing collaboration between practitioners and researchers.
7 Future Directions
The field has come a long way since the direct format of speaking assessment was introduced in
the early 1980s. The historical overview of the development of types of speaking assessment and
research contributions presented here clearly shows that contextual variables (e.g., current
teaching methodology, test-users’ needs in terms of societal trends, and advancement of
communication technology) play a prominent role in deciding on the test format and its focus
in the assessment, its impact on curriculum, and vice versa. Incorporation of IC into speaking
assessments and technological advancements including computer-mediated tests and
automated scoring systems have challenged scholars to revise existing constructs of speaking
proficiency. Considering that test validity (i.e., appropriateness for its purpose, use and impact)
is a core business of language assessment research, validation research is expected to continue
to draw on varied theoretical perspectives, using a range of research methods.
Further Reading
Fulcher, G. (2003). Testing second language speaking. Harlow: Longman/Pearson Education.
A comprehensive overview of the issues involved in second language speaking tests incorporating
practice and theory.
Galaczi, E. D. & Taylor, L. (2018). Interactional competence: Conceptualisations, operationalisations, and
outstanding questions. Language Assessment Quarterly, 15(3), 219–236. doi:10.1080/15434303.2018.1453816
An historical and current overview of IC in the context of spoken language use which discusses its
operationalization in tests and assessment scales, posing several challenges associated with this activity.
Harding, L. (2014). Communicative language testing: Current issues and future research. Language
Assessment Quarterly, 11(2), 186–197. doi:10.1080/15434303.2014.895829
Harding discusses a range of current issues and future research directions in Communicative Language
Testing (CLT) based on key questions which emerged at the CLT symposium at the 2010 Language
Testing Forum. He suggests a reinvigorated communicative approach that focuses on “adaptability” in
language testing, and several future research directions.
Nakatsuhara, F., Inoue, C., Berry, V., & Galaczi, E. (2017). Exploring the use of video-conferencing
technology in the assessment of spoken language: A Mixed-Methods study. Language Assessment
Quarterly, 14(1), 1–18. doi:10.1080/15434303.2016.1263637
This study investigated the comparability of test performance in Internet-based video-conferencing
technology and standard face-to-face modes. It presents a comprehensive overview of video-mediated
testing and reports the findings of similarities and differences in performance under the two modes.
Van Moere, A. (2012). A psycholinguistic approach to oral language assessment. Language Testing,
29(3), 325–344. doi:10.1177/0265532211424478
A framework for incorporating the assessment of psycholinguistic constructs into spoken language
proficiency testing.
References
Alderson, C. & Banerjee, J. (2002). Language testing and assessment (Part 2). Language Teaching, 35,
79–113. doi:10.1017/S0261444802001751
Alonso, E. (1997). The evaluation of Spanish-speaking bilinguals’ oral proficiency according to ACTFL
guidelines (trans. from Spanish). Hispania, 80(2), 328–341.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford, UK: Oxford University
Press.
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford, UK: Oxford University Press.
Bolus, R. E., Hinofotis, F. & Bailey, K. M. (1981). An introduction to generalizability theory in second
language research. Language Learning, 32(1), 245–258. doi:10.1111/j.1467-1770.1982.tb00970.x
Breiner-Sanders, K. E., Lowe, P. Jr., Miles, J., & Swender, E. (1999). ACTFL Proficiency guidelines –
speaking revised 1999. Foreign Language Annals, 33(1), 13–18. doi:10.1111/j.1944-9720.2000.tb00885.x
Brooks, L. (2009). Interacting in pairs in a test of oral proficiency: Co-constructing a better perfor-
mance. Language Testing, 26(3), 341–366. doi:10.1177/0265532209104666
Brown, A. (2003). Interviewer variation and the co-construction of speaking proficiency. Language
Testing, 20(1), 1–25. doi:10.1191/0265532203lt242o
Canale, M. & Swain, M. (1980). Theoretical bases of communicative approaches to second language
teaching and testing. Applied Linguistics, 1(1), 1–47. doi:10.1093/applin/I.1.1
Chalhoub-Deville, M. (2003). Second language interaction: Current perspectives and future trends.
Language Testing, 20, 369–383. doi:10.1191/0265532203lt264oa
Clark, J. L. D. (1979). Direct vs. semi-direct tests of speaking ability. In E. J. Briere & F. B. Hinofotis
(Eds.), Concepts in language testing: Some recent studies (pp. 35–49). Washington, DC: TESOL.
Crossley, S. A., & Kim, You Jin. (2019). Text integration and speaking proficiency: Linguistic, in-
dividual differences, and strategy use considerations. Language Assessment Quarterly, 16(2),
217–235. doi:10.1080/15434303.2019.1628239.
Council of Europe. (2001). Common European framework of reference for languages: Learning,
teaching and assessment. Cambridge, UK: Cambridge University Press.
Davis, L., Timpe-Laughlin, V., Gu, L. & Ockey, G. (2018). Face-to-face speaking assessment in the
digital age: Interactive speaking tasks online. In J. M. Davis, J. Norris, M. Malone, T. McKay, & Y.
A. Son (Eds.), Useful assessment and evaluation in language education (pp. 115–130). Washington,
DC: Georgetown University Press.
Davis, L. (2009). The influence of interlocutor proficiency in a paired oral assessment. Language
Testing, 26(3), 367–396.
Dörnyei, Z., & Kormos, J. (1998). Problem-solving mechanisms in L2 communication. Studies in
Second Language Acquisition, 20(3), 349–385.
Ducasse, A. M., & Brown, A. (2009). Assessing paired orals: Raters’ orientation to interaction.
Language Testing, 26(3), 423–443. doi:10.1177/0265532209104669
Elder, C. A. (1996). The effect of language background on foreign language test performance: The case
of Chinese, Italian, and Modern Greek. Language Learning, 46(2), 233–282. doi:10.1111/j.1467-
1770.1996.tb01236.x
Elder, C. & Iwashita, N. (2005). Planning for test performance: Does it make a difference? In R. Ellis
(Ed.), Planning and task performance in a second language (pp. 219–238). Amsterdam, Netherlands:
John Benjamins.
Frost, K., Elder, C. A., & Wigglesworth, G. (2012). Investigating the validity of an integrated listening-
speaking task: A discourse-based analysis of test takers’ oral performances. Language Testing, 29(3),
345–369. doi:10.1177/0265532211424479
Fulcher, G. (2003). Testing second language speaking. New York, NY: Routledge.
Fulcher, G., Davidson, F., & Kemp, J. (2011). Effective rating scale development for speaking tests:
Performance decision trees. Language Testing, 28(1), 5–29.
Galaczi, E. D. (2008). Peer–peer interaction in a speaking test: The case of the First Certificate in
English examination. Language Assessment Quarterly, 5(2), 89–119. doi:10.1080/15434300801934702
Galaczi, E. D. (2014). Interactional competence across proficiency levels: How do learners manage
interaction in paired speaking tests? Applied Linguistics, 35(5), 553–574. doi:10.1093/applin/amt017
Galaczi, E. D. & Taylor, L. (2018). Interactional competence: Conceptualisations, operationalisations,
and outstanding questions. Language Assessment Quarterly, 15(3), 219–236. doi:10.1080/
15434303.2018.1453816
Ginther, A. (2013). Assessment of speaking. In C. A. Chapelle (Ed.), The encyclopedia of applied
linguistics. Hoboken, NJ: Blackwell. doi:10.1002/9781405198431.wbeal0052
Graham, C. R., Lonsdale, D., Kennington, C., Johnson, A., & McGhee, J. (2008). Elicited imitation as
an oral proficiency measure with ASR scoring. Proceedings of the sixth international conference on
language resources and evaluation (LREC 2008) (pp. 1604–1610). Marrakech, Morocco: LREC.
Hall, J. K., & Pekarek Doehler, S. (2011). L2 interactional competence and development. In J. K. Hall,
J. Hellermann, & S. Pekarek Doehler (Eds.), L2 interactional competence and development
(pp. 1–15). Clevedon, UK: Multilingual Matters.
Harding, L. (2014). Communicative language testing: Current issues and future research. Language
Assessment Quarterly, 11(2), 186–197. doi:10.1080/15434303.2014.895829
He, A. W. (1998). Answering questions in LPIs: A case study. In R. Young & A. W. He (Eds.),
Talking and testing: Discourse approaches to the assessment of oral proficiency, Studies in Bilingualism
(Vol. 14, pp. 10–16). Amsterdam, Netherlands: John Benjamins.
Higgs, T., & Clifford, R. (1982). The push towards communication. In T. V. Higgs (Ed.), Curriculum,
competence, and the foreign language teacher (pp. 57–79). Lincolnwood, IL: National Textbook
Company.
Huang, Heng-Tsung Danny, Hung, Shao-Ting Alan & Plakins, L. (2018). Topical knowledge in L2
speaking assessment: Comparing independent and integrated speaking test tasks. Language Testing,
35(1), 27–49. doi:10.1177/0265532216677106
Isaacs, T. (2017). Fully automated speaking assessment: Changes to proficiency testing and the role of
pronunciation. In O. Kang, R. I. Thomson, & J. M. Murphy (Eds.), The Routledge handbook of
contemporary English pronunciation (pp. 570–584). New York, NY: Routledge.
Iwashita, N. (1996). The validity of the paired interview format in oral performance assessment.
Melbourne Papers in Language Testing, 5(2), 51–65.
Iwashita, N. (2019). Peer interaction assessment: Overview of research and directions. In C.
Roever & G. Wigglesworth (Eds.), Social perspectives on language testing: Papers in honour of
Tim McNamara (pp. 105–120). Berlin, Germany: Peter Lang.
Iwashita, N., Brown, A., McNamara, T., & O’Hagan, S. (2008). Assessed levels of second language
speaking proficiency: How distinct? Applied Linguistics, 29, 24–49. doi:10.1093/applin/amm017
Iwashita, N., May, L. & Moore, P. (2021). Operationalising interactional competence in computer-
mediated speaking tests. In M. R. Salaberry & A. R. Burch (Eds.), Assessing speaking in context:
Expanding the construct and its applications (pp. 283–302). Bristol, UK: Multilingual Matters.
Johnson, M. & Tyler, A. (1998). Re-analyzing the OPI: How much does it look like natural
conversation? In R. Young & A. W. He (Eds.), Talking and testing: Discourse approaches to the
assessment of oral proficiency, Studies in Bilingualism (Vol. 14, pp. 27–51). Amsterdam, Netherlands:
John Benjamins.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Lawrence Erlbaum
Associates.
Kim, J. & Craig, D. A. (2012). Validation of a video conferenced speaking test. Computer Assisted
Language Learning, 25(3), 257–275.
Kramsch, C. (1986). From language proficiency to interactional competence. The Modern Language
Journal, 70(4), 366–372. doi:10.1111/modl.1986.70.issue-4
Lado, R. (1961). Language testing: The construction and use of foreign language tests. London, UK:
Longman.
Lam, D. M. K. (2018). What counts as ‘responding’? Contingency on previous speaker contribution as a
feature of interactional competence. Language Testing, 35(3), 377–401. doi:10.1177/0265532218758126
Lazaraton, A. (1992). The structural organization of a language interview: A conversation analytic
perspective. System, 20(3), 373–386. doi:10.1016/0346-251X(92)90047-7
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Litman, D., Strik, H. & Lim, G. (2018). Speech technologies and the assessment of second language
speaking: Approaches, challenges and opportunities. Language Assessment Quarterly, 15(3),
294–309. doi:10.1080/15434303.2018.1472265
Marian, K. S., & Balaman, U. (2018). Second language interactional competence and its development:
An overview of conversation analytic research on interactional change over time. Language and
Linguistics Compass, 18(8). doi:10.1111/lnc3.12285
May, L. (2009). Co-constructed interaction in a paired speaking test: The rater’s perspective. Language
Testing, 26(3), 397–421. doi:10.1177/0265532209104668
May, L., Nakatsuhara, F., Lam, D., & Galaczi, E. (2020). Developing tools for learning oriented
assessment of interactional competence: Bridging theory and practice. Language Testing, 37(2),
165–188.
McNamara, T. F. (1996). Measuring second language performance. London, UK: Longman.
McNamara, T. F. & Lumley, T. (1997). The effect of interlocutor and assessment mode variables in
overseas assessments of speaking skills in occupational settings. Language Testing, 14(1), 140–156.
doi:10.1177/026553229701400202
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd edn, pp. 13–103). New
York, NY: American Council on Education & Macmillan.
Nakatsuhara, F. (2006). The impact of proficiency level on conversational styles in paired speaking
tests. Cambridge ESOL Research Notes, 25, 15–20.
Nakatsuhara, F. (2011). Effects of test-taker characteristics and the number of participants in group
oral tests. Language Testing, 28(4), 483–508.
Nakatsuhara, F. (2013). The co-construction of conversation in group oral tests. Frankfurt/Main,
Germany: Peter Lang.
Nakatsuhara, F., Inoue, C., Berry, V., & Galaczi, E. (2017). Exploring the use of video-conferencing
technology in the assessment of spoken language: A Mixed-Methods study. Language Assessment
Quarterly 14(1), 1–18. doi:10.1080/15434303.2016.1263637
Neumeyer, L., Franco, H., Digalakis, V., & Weintraub, M. (2000). Automatic scoring of pronunciation
quality. Speech Communication, 30, 83–94. doi:10.1016/S0167-6393(99)00046-1
O’Loughlin, K. (1995). Lexical density in candidate output on direct and semi-direct versions of an oral
proficiency test. Language Testing, 12(2), 217–237. doi:10.1177/0265532208101010
O’Sullivan, B. (2002). Learner acquaintanceship and oral proficiency test pair-task performance.
Language Testing, 19(3), 277–295.
O’Sullivan, B. (2013). Assessing speaking. In A. J. Kunnan (Ed.), The companion to language assessment.
New York, NY: John Wiley & Sons. doi:10.1002/9781118411360.wbcla084.
O’Sullivan, B., & Weir, C. J. (2011). Test development and validation. In B. O’Sullivan (Ed.), Language
testing: Theories and practices (pp. 13–32). Basingstoke, UK: Palgrave Macmillan.
Ockey, G. J. (2009). The effects of group members’ personalities on a test taker’s L2 group oral
discussion test scores. Language Testing, 26(2), 161–186. doi:10.1177/0265532208101005
Ockey, G. J., Gu, L. & Keehner, M. (2017). Web-based virtual environments for facilitating assessment
of L2 oral communication ability. Language Assessment Quarterly, 14(4), 346–359.
doi:10.1080/15434303.2017.1400036
Oller, J. W., Jr. (1979). Language tests at school: A pragmatic approach. London, UK: Longman.
Plough, I., Banerjee, J. & Iwashita, N. (2018). Interactional competence: Genie out of the bottle.
Language Testing, 35(3), 427–455. doi:10.1177/0265532218772325.
Purpura, J. (1998). Investigating the effects of strategy use and second language test performance with
high- and low-ability test takers: A structural equation modelling approach. Language Testing,
15(3), 333–379. doi:10.1177/026553229801500303
Qian, D. (2009). Comparing direct and semi-direct modes for speaking assessment: Affective effects on
test takers. Language Assessment Quarterly, 6(2), 113–125.
Raffaldini, T. (1988). The use of situation tests as measures of communicative ability. Studies in Second
Language Acquisition, 10(2), 197–216.
Ramazani, M., Behnam, B., Ahangari, S. (2018). Psychometric characteristics of a rating scale for
assessing interactional competence in paired-speaking tasks at micro-level. The Journal of English
Language Pedagogy and Practice 11(23), 180–206. doi:10.30495/jal.2019.664545
Roever, C. & Kasper, G. (2018). Speaking in turns and sequences: Interactional competence as a target
construct in testing speaking. Language Testing, 35(3), 331–355. doi:10.1177/0265532218758128
Ross, S. J. (2018). Listener response as a facet of interactional competence. Language Testing, 35(3),
357–375. doi:10.1177/0265532218758125
Skehan, P. (2009). Modelling second language performance: Integrating complexity, accuracy, fluency,
and lexis. Applied Linguistics, 30(4), 510–532. doi:10.1093/applin/amp047.
Taylor, L. (2000). Investigating the paired speaking test format. Cambridge ESOL Research Notes,
2, 14–15.
Thompson, I. (1995). A study of interrater reliability of the ACTFL Oral Proficiency Interview in five
European languages: Data from ESL, French, German, Russian and Spanish. Foreign Language
Annals, 28(3), 407–422. doi:10.1111/j.1944-9720.1995.tb00808.x
van Batenburg, E., Oostdam, R., van Gelderen, A., & de Jong, N. (2018). Measuring L2 speakers’
interactional ability using interactive speech tasks. Language Testing, 35(1), 75–100.
doi:10.1177/0265532216679452
van Lier, L. (1989). Reeling, writhing, drawling, stretching and fainting in coils: Oral proficiency
interviews as conversations. TESOL Quarterly, 23, 489–508. doi:10.2307/358692
Van Moere, A. (2006). Validity evidence in a university group oral test. Language Testing, 23, 411–440.
doi:10.1191/0265532206lt336oa
Van Moere, A. (2012). A psycholinguistic approach to oral language assessment. Language Testing,
29(3), 325–344. doi:10.1177/0265532211424478
Young, R. (2008). Language learning and discursive practice. Language Learning 58, 135–181.
doi:10.1111/j.1467-9922.2009.00492.x
Young, R. (2011). Interactional competence in language learning, teaching and testing. In E. Hinkel
(Ed.), Handbook of research in second language teaching and learning (Vol. 2, pp. 426–443).
New York, NY: Routledge.
PART III
Core Topics
10
PRONUNCIATION LEARNING AND TEACHING
Tracey M. Derwing and Murray J. Munro
1 Introduction/Definitions
Several features are eye-catching when two people meet for the first time. Age, gender, race,
height, weight – these are all very noticeable. Another characteristic that jumps out as soon
as two people talk is pronunciation, which can index a speaker’s social class, regional
background, and education. A second language accent is immediately discernable; in fact,
Flege (1984) determined that people can sometimes recognize a non-native utterance with
only 30 milliseconds of exposure. Moreover, even if recordings are played backwards,
listeners can reliably distinguish between native and non-native speech (Munro et al.,
2010). “Pronunciation,” or the way people speak (including their production of individual
sounds, prosody, speech rate, and voice quality) is highly salient to listeners. Surprisingly,
then, over a period of about 30 years in the past century, applied linguists and many
language teachers paid relatively little attention to second language (L2) pronunciation,
thinking that learners’ productions would improve with massive amounts of input.
However, in the 21st century, pronunciation has gone from a minor issue in L2 research
journals, to a highly prominent topic that now garners tremendous interest (Levis &
Sonsaat, 2020). Significant numbers of adult L2 learners, though, have always been concerned
with their pronunciation; one need only look online at the proliferation of accent
reduction programs to see that. “Accentedness,” or the degree to which one’s speech differs
from that of a local community, is not the most important speech dimension for social
interaction; “intelligibility,” or the degree to which a listener understands the speaker’s
intention, and “comprehensibility,” or the degree of effort required of a listener to understand
a speaker’s message (also known as processing fluency in social psychology), are far more
critical to successful communication. It is quite possible to have a very heavy accent and yet
be fully intelligible and easy to understand (Derwing & Munro, 1997; Munro & Derwing,
1995a). The identification of the features of an accent that interfere with comprehensibility
and intelligibility is the key to helping L2 speakers. Some differences in an L2 speaker’s
productions, though highly salient, have little or no impact on listener understanding,
while others cause communicative breakdowns. Another central aspect of pronunciation
research is “fluency,” or the flow of speech, that is, the degree to which a speaker can talk
without noticeable pauses mid-clause or phrase and dysfluencies such as repetitions and
false starts (Derwing et al., 2004; Kahng, this volume).
2 Historical Perspectives
Although phonetics, the study of speech, dates back more than two millennia, many people’s
first introduction to it is George Bernard Shaw’s (1912) play, Pygmalion, or its movie adaptation,
My Fair Lady, in which the character of Henry Higgins changes the pronunciation of a
lowly flower seller to that of an aristocrat. It has been suggested that Higgins was based on
Henry Sweet and Daniel Jones, both British phoneticians from Shaw’s day. Although the
Pygmalion story revolved around L1 markers of class, interest in helping L2 speakers change
their pronunciation to conform to a local standard had existed for hundreds of years. An early
volume addressing the notorious spelling system in English was also aimed at improving the
pronunciation of “outlanders” (Price, 1665). The rise of phonetics in Britain and elsewhere in
the 20th century led to a stronger focus on L2 pronunciation. David Abercrombie, a former
student of Daniel Jones, was initially interested in L1 phonetics, but unlike the fictional
Higgins, he was not at all concerned with promoting the use of Received Pronunciation (Kelly,
1993). This acceptance of difference was evident in his approach to L2, which he summarized
as follows: “I believe that pronunciation teaching should have, not a goal which must of
necessity be normally an unrealized ideal, but a limited purpose which will be completely
fulfilled; the attainment of intelligibility” (Abercrombie, 1949, p. 120).
Abercrombie’s commitment to intelligibility was not always sustained in the language
teaching community. The Audiolingual Method (Lado, 1964), a type of language teaching
which stressed native-like pronunciation and correct grammar, took hold first in the United
States, but then spread across the world in the 1950s and 1960s. Audiolingualism was based
on the imitation of exemplars; language students repeated dialogues until they were memorized.
Concomitant developments in technology meant that language labs gave teachers the
opportunity to assign hours of “listen and repeat” exercises. The format of the Audiolingual
Method was a boon to teachers who did not have a good grasp of the L2 themselves, but on
the downside, students found it to be extremely boring (Flanders & Nuthall, 1972). A reaction
to its monotony appeared in the 1970s, with the rise of what have come to be known
as the “designer methods” including Suggestopedia, The Silent Way, Total Physical
Response, and Community Counselling Learning (see Richards & Rodgers, 2014 for in-depth
descriptions of these methods). The Silent Way (Gattegno, 1972), in particular, was
geared to native-like pronunciation; in fact, learners’ attention was focused on individual
sounds to a greater extent than in any other approach to language teaching. Each of the
designer methods required extensive training on the part of the instructor, and eventually they
were superseded by the Communicative Approach to Language Teaching in the early 1980s.
Early proponents of this approach argued that accurate L2 pronunciation would develop
naturally with sufficient input (Krashen, 1982). Furthermore, as it became clear that adult L2
learners were unlikely to ever achieve a native-like accent, the whole issue of pronunciation
teaching was seen by many as pointless (Murphy & Baker, 2015). A small cadre of expert
practitioners, including Judy Gilbert, Joan Morley, Clifford Prator and Betty Robinett,
continued to press for more attention to L2 pronunciation instruction (PI), but, for the most
part, it was abandoned in the L2 classroom. Applied research on L2 learners’ phonological
development became scarce.
In sum, Abercrombie’s plea for comfortably intelligible pronunciation was largely forgotten,
replaced by concerns for perfectly native-like productions, and then by the notion
that massive exposure was the only way to improve a learner’s speech. In our view, the lack
of research to support the arguments of experienced practitioners was problematic. The
recent renewal of interest in L2 pronunciation is a direct result of empirical evidence
demonstrating the importance of intelligibility and comprehensibility over accentedness.
Moreover, it has become clear that individual differences in learning trajectories are far more
varied than was previously believed (Derwing & Munro, 2015). Inventories of pronunciation
difficulties, such as Swan and Smith (2001) and Nilsen and Nilsen (2010), identify problems
for people from different L1 backgrounds, but often these are over-simplifications that do
not hold for many learners (Munro, 2018).
The past two decades have witnessed an upsurge in L2 pronunciation research, including a
special issue of TESOL Quarterly in 2005, the establishment of an annual conference in 2009,
Pronunciation in Second Language Learning and Teaching, the appearance of a dedicated
academic journal, Journal of Second Language Pronunciation, in 2015, and a dedicated strand
at the American Association for Applied Linguistics, starting in 2018. The first three of these
four developments were initiated by John Levis, who has both led and pushed the applied
study of pronunciation forward. For an in-depth historical account of pronunciation
teaching, see Murphy and Baker (2015).
was exposed and to the learner’s perceptual and production capabilities at the time of
learning. It is thus important not to focus exclusively on the collective performance of groups
of learners, but to examine individual learning trajectories. Error hierarchies, in particular,
can be misleading. To be effective, instructors should carry out a needs analysis for each
student, placing the emphasis on difficulties that interfere with intelligibility or
comprehensibility. The functional load principle, “a measure of the work which two phonemes (or a
distinctive feature) do in keeping utterances apart” (King, 1967, p. 831), can be applied at
this level. Conducting needs analyses, however, requires a knowledge foundation that entails
a basic background in phonetics and an understanding of the functional load principle.
Furthermore, effective teaching requires skill in listening, at both segmental and
suprasegmental levels, to isolate problematic elements needing intervention. Unfortunately, formal
training in these areas is not available for many prospective language teachers (Foote et al.,
2011; Huensch, 2019).
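To make the functional load idea concrete, the following is a minimal Python sketch, assuming that the "work" a contrast does can be approximated by counting the minimal pairs it keeps apart. Minimal-pair counting is one common proxy and is not necessarily the measure King (1967) proposed. The toy lexicon, the phoneme labels, and the function name are all hypothetical, chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy lexicon: each word is a tuple of phoneme labels.
# This is invented illustration data, not a real corpus.
LEXICON = {
    "pat": ("p", "ae", "t"),
    "bat": ("b", "ae", "t"),
    "pit": ("p", "ih", "t"),
    "bit": ("b", "ih", "t"),
    "pin": ("p", "ih", "n"),
    "bin": ("b", "ih", "n"),
    "thin": ("th", "ih", "n"),
    "fin": ("f", "ih", "n"),
    "then": ("dh", "eh", "n"),
    "den": ("d", "eh", "n"),
}


def minimal_pair_count(lexicon, phoneme_a, phoneme_b):
    """Count word pairs that differ only by phoneme_a vs. phoneme_b in one position."""
    count = 0
    for (_, w1), (_, w2) in combinations(lexicon.items(), 2):
        if len(w1) != len(w2):
            continue
        diffs = [(x, y) for x, y in zip(w1, w2) if x != y]
        if len(diffs) == 1 and set(diffs[0]) == {phoneme_a, phoneme_b}:
            count += 1
    return count


# Rank a few contrasts by how many minimal pairs they distinguish in the toy lexicon.
contrasts = [("p", "b"), ("th", "f"), ("dh", "d")]
ranking = sorted(
    contrasts,
    key=lambda pair: minimal_pair_count(LEXICON, *pair),
    reverse=True,
)
```

In this toy lexicon the /p/–/b/ contrast separates three word pairs while the other two contrasts separate one each, so a needs analysis built on this proxy would list /p/–/b/ first. A real analysis would need a large lexicon and, ideally, word-frequency information.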
Another critical issue in the area of pronunciation is the reluctance of L2 speakers to
receive instruction from non-native teachers. Often, they express a preference for native
speakers, yet as Levis et al. (2016) demonstrated in a controlled comparison of two
pronunciation classes, one taught by a native speaker and the other by a non-native teacher,
listeners assigned equivalent comprehensibility ratings to students in both classes. As
Derwing and Munro (2005) pointed out, the key to effective language teaching is not tied to
the instructor’s native speaker status. Rather, appropriate pedagogical training, linguistic
knowledge, and proficiency in the language taught all contribute to success, such that L1
status is immaterial if the other factors are met. In fact, given reasonably easy access to
global resources featuring a wide range of accents, learners should be encouraged to
understand the value of seeking out a variety of speech models, using multimedia technology,
for their own perception and production of English.
A critical area needing considerably more research is the determination of the relative
importance of various accent features to intelligibility and comprehensibility. Hahn (2004)
examined the effect of primary stress on intelligibility. By comparing appropriate stress with
misassigned and monotone utterances, she found that the latter two types of productions
impeded understanding. Her work provides a valuable model for exploring other elements of
L2 speech. Munro and Derwing (2006) used listener judgements of comprehensibility to
determine the relative importance of some English segments on the basis of functional load.
They found empirical evidence for what several practitioners had hypothesized: some errors
matter much more than others. Similar hierarchies should be explored in other L2s.
Increasingly, researchers have realized that pronunciation and other aspects of language
interact. Varonis and Gass (1982) were the first to examine the relationship between
grammar and listeners’ perceptions of L2 pronunciation, and others have now pursued this
in more depth (see Ruivivar and Collins, this volume). An exploratory study of the effects of
pragmatics instruction on comprehensibility demonstrated that using predictable
pragmalinguistic formulas facilitates listeners’ ease of understanding with no change to
pronunciation (Derwing et al., 2021). Yates (this volume) calls for more attention to the
interaction of pragmatics and pronunciation.
2019), with some work extending the concept to interactive speaking situations (Crowther,
2020). Concern about the role of international speech varieties in communication, especially
World Englishes, also continues to command attention (Kang et al., 2019; Llurda, this
volume). Much of that work focuses on international intelligibility, and the long-standing issue of
choosing pronunciation models for instruction and assessment (Levis, 2018).
Meanwhile, a new emphasis is being placed on individual differences in learner performance,
which, as noted earlier, appear to be much greater than previously suspected. Wade et al. (2021),
for example, show how overemphasis on mean performance of groups can obscure important
subtleties in individual L1 pronunciation features. In their study, idiosyncratic voice onset time
was found to be stable in individual talkers. In other words, between-speaker variability was not
simply noise, but was a reliable feature of the individuals’ speech. This phenomenon is likely to
extend to L2 production and needs to be taken into account in assessing L2 speech.
With respect to technological innovations, notable developments include work on
computer-assisted pronunciation teaching (CAPT). One especially intriguing CAPT concept
is the “golden speaker,” a synthetic rendition of the L2 learner’s own voice, with the
pronunciation characteristics of a native speaker of the L2. Findings from Ding et al. (2019)
point to its effectiveness in improving learners’ comprehensibility and fluency. In other work,
Garcia et al. (2020) reported improvements in an intervention study over 15 weeks (one
group received traditional instruction and the other group used ASR technology). The
traditional instruction group showed long-term benefits in comprehensibility, while the ASR
approach was found to be more effective at targeting individual phonemes. The authors
recommend using a hybrid approach to maximize benefits for learners.
It is now well-established that L2 speech perception is closely linked to performance in
production (see Thomson, this volume). Moreover, evidence from High Variability Phonetic
Training (HVPT) demonstrates the effectiveness of instruction in one domain (perception)
on performance in another (production) (Thomson, 2018). An important issue that has
arisen in this research is the failure of the training to generalize to some new phonetic
contexts (Thomson, 2011). This outcome is consistent with Munro (2021) and provides
further evidence that L2 phonemes do not emerge simultaneously across the entire lexicon.
This has very important ramifications for teaching in that we cannot assume that teaching a
particular sound in one context will generalize to another. Rather, language instructors need
to be aware of pronunciation difficulties experienced by their learners at the lexical level. All
vocabulary instruction, for instance, should incorporate a pronunciation component to
ensure that L2 learners acquire the appropriate production of new words, along with their
meanings (see Horst, this volume).
It has been clear for some time that corrective feedback can benefit pronunciation learners,
but a recent innovative study (Martin & Sippel, 2021) examined feedback in a novel way. The
authors compared four groups of learners of German in a pronunciation training study: one
group received teacher feedback; one group provided feedback to their peers (according to a
checklist designed by the researchers, which focused on the segments targeted in the
instruction); one group were the recipients of feedback from their peers; finally, a control group was
included, who received neither the pronunciation intervention nor corrective feedback.
Recordings of each group were assessed for comprehensibility before and after the 3 weeks of
training. The three intervention groups all performed better than the control group, but the
group with the most improvement was the peer provider group, followed by the teacher
feedback group. Apparently, the engagement required to provide pronunciation feedback to
classmates had a significant and salutary effect on the learners’ own pronunciation.
Research on pronunciation assessment has lagged far behind other aspects of speaking
assessment (Isaacs & Trofimovich, 2016). Until recently, assessment of speaking in high
The nature of rating scales for these tasks has been the subject of considerable discussion.
Munro (2018) found that both comprehensibility and accentedness are amenable to
equal-interval scaling with numbered points. For comprehensibility, a numbered nine-point scale
anchored with the labels “very easy to understand” and “very difficult to understand” has
often been used with reliable results. However, the number of points to which listeners are
actually able to resolve their ratings is unknown. Some researchers have successfully used
anchored but otherwise unnumbered quasi-continuous scales with up to 1,000 underlying
points (Reid et al., 2019). The gradations, however, are not visible to the rater, who simply
clicks a location on a seemingly continuous line on the computer screen. Current evidence
does not indicate any advantage of one approach over the other in terms of listener
reliability, though large scales have the added benefit of suitability to mixed-effects statistical
modelling (Huensch & Nagle, 2021). Also, whether listeners evaluate multiple target
constructs simultaneously or carry out the ratings for each construct separately does not appear
to affect results (O’Brien, 2016).
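The two scale formats described above can be put on a common footing with a few lines of arithmetic. The Python sketch below is an illustration under stated assumptions, not the procedure of any study cited here: the function names, the pixel-based click coordinates, and the choice of a 0 to 1 target range are hypothetical. It shows how a click on a seemingly continuous line might be resolved to one of 1,000 underlying points, and how ratings from a nine-point scale and a 1,000-point scale can be rescaled for comparison.

```python
def slider_to_points(click_x, line_start_x, line_width, n_points=1000):
    """Map a click on a visually continuous line to an underlying step from 1 to n_points."""
    fraction = (click_x - line_start_x) / line_width
    fraction = min(max(fraction, 0.0), 1.0)  # clamp clicks that fall outside the line
    return 1 + round(fraction * (n_points - 1))


def rescale(rating, low, high):
    """Linearly rescale a rating from the range [low, high] to [0, 1]."""
    return (rating - low) / (high - low)


# A rating of 5 on a nine-point scale sits at the midpoint of the rescaled range.
nine_point_mid = rescale(5, 1, 9)

# A click at the far right of a 500-pixel line maps to the top of the 1,000-point scale.
slider_top = slider_to_points(click_x=500, line_start_x=0, line_width=500)
slider_top_rescaled = rescale(slider_top, 1, 1000)
```

Which anchor ("very easy" or "very difficult to understand") sits at the low end is a design decision that the sketch leaves open; it must simply be the same for every listener and for both scale formats before ratings are compared.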
Indirect investigation of intelligibility and comprehensibility can sometimes be accomplished
using acoustic measurements and techniques from artificial intelligence (Dalby &
Kewley-Port, 1999). While automatic speech recognition shows promise as a pedagogical
tool for the future, acoustic measurements are prone to serious misuse if researchers do not
have extensive training in phonetics. No straightforward characteristic, or combination of
characteristics, of the acoustic speech signal is known to correspond to any of the global
measures of L2 speech. As a result, acoustic assessment of global intelligibility would require
advance knowledge of all the specific acoustic dimensions that influence listeners, an
impossible expectation. While acoustic measurements of segments can provide useful
information if expertly carried out and interpreted, testing their validity requires listener
judgements. To illustrate the relevant dangers, imagine an intervention study in which vowel
quality is taught, and pre- and post-intervention measurements reveal a change in second
formant frequencies toward native-like values. In the first place, a change of this type in no
way guarantees improved intelligibility; only listener assessments can establish such a
connection. In the second, a measurable “improvement” in one acoustic dimension may be offset
by a worsening in some other dimension, perhaps one not actually evaluated by the
researcher, such that the changes together yield a net impact of zero on intelligibility. Given
these complexities, it is unsurprising that Chan and Hall (2019) found that the degree of
acoustic deviations from native vowel norms failed to account for listeners’ perceptions.
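The thought experiment above can be sketched in a few lines of Python. Every number here is invented for illustration and comes from no study; the single "native target" value per formant is also a simplifying assumption, since native norms are distributions rather than points. The sketch shows only that a learner can move closer to the target on the dimension the researcher measured (F2) while moving farther away on one that was not measured (F1).

```python
# Invented formant values in Hz, for illustration only.
NATIVE_TARGET = {"F1": 400.0, "F2": 2000.0}  # hypothetical reference values
PRE = {"F1": 420.0, "F2": 1700.0}            # hypothetical pre-intervention vowel
POST = {"F1": 520.0, "F2": 1900.0}           # hypothetical post-intervention vowel


def deviations(measured, target):
    """Absolute deviation from the target on each acoustic dimension."""
    return {dim: abs(measured[dim] - target[dim]) for dim in target}


def change_report(pre, post, target):
    """For each dimension, say whether the learner moved closer to or farther from the target."""
    before = deviations(pre, target)
    after = deviations(post, target)
    report = {}
    for dim in target:
        if after[dim] < before[dim]:
            report[dim] = "closer"
        elif after[dim] > before[dim]:
            report[dim] = "farther"
        else:
            report[dim] = "unchanged"
    return report


report = change_report(PRE, POST, NATIVE_TARGET)
# → {"F1": "farther", "F2": "closer"}
```

A study that reported only F2 would record an "improvement" here. Neither deviation, alone or combined, says anything about intelligibility, which still has to be established through listener judgements.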
Recent innovations in L2 speech measurement illustrate intriguing refinements on commonly
used procedures. One proposal is to use multiple measures to gain simultaneous,
complementary perspectives on the same speech material. Kang et al. (2018), for instance,
compared five different assessment types to determine the strengths of each. One of these,
commonly used in the study of pathological speech, was listener transcriptions of nonsense
utterances, which require a focus on segmental phonemic details, while filtered speech was
more suited to a suprasegmental analysis. A recently developed approach to speech ratings
entails dynamic assessment, in which listeners give multiple judgements over time as they
listen to speech samples (Nagle et al., 2019). While this early research suggests considerable
interrater variability in such tasks, more work remains to be done.
comprehensibility. To achieve that aim, however, teachers require some expertise in identifying
aspects of L2 speech that interfere with listeners’ understanding, as opposed to elements
of an accent that may be salient but which have little or no effect on conveyance of
meaning. Language teacher preparation programs should include courses on how to teach
pronunciation, and those courses should minimally provide ample opportunity for trainees
to assess L2 speakers’ utterances for intelligibility and comprehensibility (Derwing, in press;
Murphy, 2017). Such courses should also touch on research that sheds light on features
shown to affect these two dimensions, such as functional load (Munro & Derwing, 2006),
primary stress (Hahn, 2004), and speech rate (Munro & Derwing, 2001).
A great deal of PI is directed at learners who may be assumed to have plateaued in their
learning with little expectation of further improvement without intervention. Although PI
can be effective even then (Derwing et al., 2014), it makes more sense to intervene early if
intelligibility is at risk. Derwing and Munro (2015) have identified the first months of massive
exposure to the L2 as the “Window of Maximal Opportunity” – the period when naturalistic
improvements in pronunciation are most prominent, before speech patterns become entrenched.
It stands to reason that PI should be introduced as early as possible to take advantage
of this window. Zielinski and Yates (2014) make a compelling argument for starting
PI with beginner learners and provide many suggestions for how to do so. Moreover, PI
should not be restricted to a stand-alone course (although in some settings, a dedicated
pronunciation course is entirely warranted). Pronunciation should be integrated, not only
into speaking courses, but in any language course where L2 speakers have comprehensibility
issues, just as incidental vocabulary is introduced across the curriculum in all types of L2
courses.
A consequence of the recognition that comfortable intelligibility should be the goal is that
many L2 learners do not need PI. Some have a high aptitude for pronouncing well in an L2
(see Mora, this volume) and therefore require minimal help. It has become clear, however,
that individual differences are far more prevalent and diverse than was originally believed. It
is also evident that learning segments in one particular phonetic context does not necessarily
generalize to others. It is thus imperative that learners be given supports that are customized
to their needs at multiple levels (segmental, lexical, phrasal, and so on). Technology offers
considerable promise in this regard, especially for segmentals. Thomson (2018) reviewed over
30 studies which showed the efficacy of HVPT, yet most teachers are unfamiliar with this
technique. Developments such as the English Accent Coach (Thomson, 2021) can provide
learners with access to HVPT training on precisely those segments causing difficulty.
Although numerous L2 pronunciation resources are available online, many provide advice
that is misleading or simply wrong (see Derwing & Munro, 2015 for examples). A key
service that instructors can offer their students is a frank caveat emptor discussion regarding
accent reduction scams and well-meaning but ill-informed coaches. As more authoritative
resources appear online (e.g., pronunciationforteachers.com) and in print, we can hope for a
reduction in opportunism that exploits learners.
6 Future Directions
Teaching
Despite the numerous articles that have appeared in the past two decades examining the
learnability of various phonemic contrasts (Thomson & Derwing, 2015), the increase in
teaching materials, and many studies on teacher cognition (Murphy & Baker, 2015), few
teacher preparation programs include courses on how to teach L2 pronunciation. Until
language programs have staff who are well-versed in identifying their students’ communication
problems and addressing them accordingly, students will continue to look elsewhere
to improve their productions. With the turn to online courses in many educational institutions
as a result of COVID, perhaps for-credit teacher preparation courses at reputable
institutions, offered by leading experts, will become available.
In view of the current lack of adequate pronunciation courses, one approach to help
learners in the short term is technological assistance for self-study. The English Accent
Coach (Thomson, 2021), for example, employs HVPT to assist learners with perception of
segmentals, but perhaps a more macro level of exposure to suprasegmentals could be incorporated
in similar software. A key feature of HVPT underlying its efficacy is its utilization
of multiple talkers for training stimuli. Parallel variability may also improve suprasegmental
training. Similar platforms should be developed for other heavily studied languages as well
(e.g., Mandarin and Spanish). Other apps have also been shown to benefit both perception
and production, but as Fouz-González (2020) points out, students are unlikely to spend
much money on apps. It is imperative, then, that evidence-based apps should be attractive to
potential users in terms of both cost and functionality. Some of the technological tools
developed thus far are expensive to maintain and require buy-in from funders and
collaboration across disciplines (O’Brien et al., 2018).
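The multiple-talker design noted above as a key feature of HVPT can be illustrated with a short Python sketch. It is a minimal sketch of talker variability only and does not describe how the English Accent Coach or any other tool is implemented; the function name, talker labels, and word list are hypothetical. Each target word is paired with every talker, and the presentation order is shuffled so that a learner never hears a run of items from one voice by design.

```python
import random


def build_hvpt_trials(talkers, target_words, repetitions=1, seed=0):
    """Cross every target word with every talker, then shuffle the presentation order."""
    trials = [
        {"talker": talker, "word": word}
        for _ in range(repetitions)
        for talker in talkers
        for word in target_words
    ]
    rng = random.Random(seed)  # fixed seed so the same order can be reproduced
    rng.shuffle(trials)
    return trials


trials = build_hvpt_trials(
    talkers=["talker_A", "talker_B", "talker_C", "talker_D"],  # hypothetical talker IDs
    target_words=["ship", "sheep", "full", "fool"],             # hypothetical stimuli
)
```

With four talkers and four words this yields 16 trials, each word heard once from each voice. A fuller design would presumably also vary the phonetic contexts in which each segment appears, given the generalization problem discussed earlier in the chapter.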
As for classroom-based instruction, promising studies such as Galante and Thomson
(2017) offer direction for effective activities. More studies like theirs, using a range of
activities, participants from different L1s, and learners of different L2s, would be useful. Direct
collaboration among teachers and researchers could encourage incorporation of effective PI
in the classroom, as was the case with Rojczyk (2015) who showed that students’ imitation of
an English accent while speaking their L1, Polish, benefited their pronunciation of English.
Martin and Sippel’s (2021) research demonstrates that full engagement can lead to better
noticing of one’s own pronunciation. Learners who are called on to provide peer feedback
appear to monitor and modify their own productions to a greater extent than if they relied
solely on the teacher’s corrections. In a workplace pronunciation course, Derwing et al.
(2014) noted that once an atmosphere of trust was established, learners felt comfortable
listening to and correcting their peers. Employing peer feedback with guidance from the
teacher may have positive implications for classroom teaching going forward, with the caveat
that the quality of peer feedback be supervised by the instructor.
Research
The future of pronunciation research looks bright; many more scholars are engaged in this
area than just a decade ago. Calls for replications of existing research in languages other than
English are now being answered (e.g., Huensch & Nagle, in press for Spanish; Zhang &
Yuan, 2020 for Mandarin). More longitudinal research has also started to appear (e.g.,
Huensch et al., 2019), as have explorations of relatively new data collection approaches, such
as crowdsourcing, using Mechanical Turk (Nagle, 2019). Each of these streams is promising
and we look forward to seeing far more work in these areas. O’Brien et al. (2018) offer
extensive lists of recommendations for pronunciation research, particularly with reference to
the use of various technologies. Other areas are worthy of exploration as well, including a
focus on the interaction of different approaches to PI. Extensive analyses of the relative
contributions of PI techniques, both individually and in combination, would have practical
consequences for the classroom.
Assessment is another issue to which researchers are turning their attention (e.g., Isaacs &
Trofimovich, 2016; Kang et al., 2018) but this field needs considerably more exploration,
especially in view of the complexity of World Englishes (Kang et al., 2020). Hansen Edwards
et al. (2020) call for the development of a diagnostic tool that teachers could use to identify
learner needs. Further, an interest in regional dialect acquisition by L2 learners and the
factors that determine uptake is worthy of probing (Schoonmaker-Gates, 2020).
Researchers should think well into the future when designing studies with the possibility
of multi-purposing of data. Rather than conducting one-off studies that focus on a very
limited research question, some strategic planning to maximize the use of data for several
purposes (prior to the ethics submission stage) would be useful. Contributions to existing
corpora, for example, could be collected at the same time as the data for addressing a
particular question. A related consideration is the benefit of collaborative work which may
allow for more uses of the same set of data.
Finally, pronunciation learning needs to be contextualized by the recognition that everyone
speaks with an accent. And nearly everybody communicates with people from outside
their own speech community. Rather than focus on discriminatory reactions to accent, researchers
are encouraged to investigate ways to improve listening capacity in target language
speakers and to enhance listeners’ willingness to communicate with interlocutors whose
pronunciation differs from their own (Derwing et al., 2002).
Further Reading
Derwing, T. M. & Munro, M. J. (2015). Pronunciation fundamentals: Evidence-based perspectives for L2
teaching and research. Amsterdam: John Benjamins.
Critical issues relevant to optimal teaching and learning of L2 pronunciation are covered in a clear,
comprehensive manner.
Grant, L. (Ed.) (2014). Pronunciation myths: Applying second language research to classroom teaching.
Ann Arbor: University of Michigan Press.
This very readable collection of papers addresses popular myths related to pronunciation learning and
teaching.
Levis, J. M. (2018). Intelligibility, oral communication, and the teaching of pronunciation. Cambridge:
Cambridge University Press.
With a strong emphasis on classroom practice and how pronunciation teaching can be more effectively
approached in different teaching contexts, this book is an important resource for pronunciation researchers.
References
Abercrombie, D. (1949). Teaching pronunciation. ELT Journal, 3, 113–122.
Chan, K. Y. & Hall, M. D. (2019). The importance of vowel formant frequencies and proximity in
vowel space to the perception of foreign accent. Journal of Phonetics, 77, 1–22.
Council of Europe. (2018). Common European Framework of Reference for languages: Learning,
teaching, assessment: Companion volume with new descriptors. Strasbourg, France: Council of
Europe.
Crowther, D. (2021). Measuring phonology. In P. Winke & T. Brunfaut (Eds.), The Routledge
handbook of second language acquisition and language testing (pp. 243–253). New York, NY &
Abingdon OX: Routledge.
Crowther, D. (2020). Rating L2 speaker comprehensibility on monologic vs. interactive tasks: What is
the effect of speaking task type? Journal of Second Language Pronunciation, 6(1), 96–121.
Dalby, J., & Kewley-Port, D. (1999). Explicit pronunciation training using automatic speech
recognition technology. CALICO Journal, 16(3), 425–445.
Derwing, T. M. (in press). Lessons learned from teaching teachers to teach pronunciation. In V.
Sardegna & A. Jarosz (Eds.), Theoretical and practical developments in English speech assessment,
research, and training. Berlin: Springer.
Derwing, T. M. & Munro, M. J. (1997). Accent, comprehensibility and intelligibility: Evidence from
four L1s. Studies in Second Language Acquisition, 19, 1–16.
Kang, O., Thomson, R. I. & Moran, M. (2020). Which features of accent affect understanding?
Exploring the intelligibility thresholds of diverse accent varieties. Applied Linguistics, 41, 453–480.
Kelly, J. (1993). Obituary: David Abercrombie. Phonetica, 50, 68–71.
Kennedy, S., & Trofimovich, P. (2019). Comprehensibility: A useful tool to explore listener
understanding. The Canadian Modern Language Review, 75(4), 275–284.
King, R. D. (1967). Functional load and sound change. Language, 43, 831–852.
Krashen, S. D. (1982). Principles and practice in second language acquisition. Oxford: Pergamon Press.
Lado, R. (1964). Language teaching: A scientific approach. New York: McGraw-Hill.
Levis, J. (2020). Changes in L2 pronunciation: 25 years of intelligibility, comprehensibility, and
accentedness. Journal of Second Language Pronunciation, 6(3), 277–282.
Levis, J. M. (2018). Intelligibility, oral communication, and the teaching of pronunciation. Cambridge:
Cambridge University Press.
Levis, J. M. & Sonsaat, S. (2020). Publication venues for L2 pronunciation research. Journal of Second
Language Pronunciation, 6, 1–11.
Levis, J. M., Sonsaat, S., Link, S., & Barriuso, T. A. (2016). Native and nonnative teachers of L2
pronunciation: Effects on learner performance. TESOL Quarterly, 50, 894–931.
Martin, I. A., & Sippel, L. (2021). Is giving better than receiving?: The effects of peer and teacher feedback
on L2 pronunciation skills. Journal of Second Language Pronunciation.
https://doi-org.login.ezproxy.library.ualberta.ca/10.1075/jslp.20001.mar
Munro, M. J. (2021). On the difficulty of defining “difficult” in second-language vowel acquisition.
Frontiers in Communication. 53
Munro, M. J. (2018). Dimensions of pronunciation. In O. Kang, R. Thomson, & J. Murphy (Eds.), The
Routledge handbook of contemporary English pronunciation (pp. 413–431). London: Routledge.
Munro, M. J. (2018). How well can we predict L2 learners’ pronunciation difficulties? The CATESOL
Journal, 30(1), 267–281.
Munro, M. J. & Derwing, T. M. (1995a). Foreign accent, comprehensibility and intelligibility in the
speech of second language learners. Language Learning, 45, 73–97.
Munro, M. J., & Derwing, T. M. (1995b). Processing time, accent, and comprehensibility in the
perception of native and foreign-accented speech. Language & Speech, 38, 289–306.
Munro, M. J. & Derwing, T. M. (2001). Modelling perceptions of the comprehensibility and
accentedness of L2 speech: The role of speaking rate. Studies in Second Language Acquisition, 23, 451–468.
Munro, M. J. & Derwing, T. M. (2006). The functional load principle in ESL pronunciation
instruction: An exploratory study. System, 34, 520–531.
Munro, M. J. & Derwing, T. M. (in preparation). Huh? The amazing individual differences in L2
pronunciation learning trajectories.
Munro, M. J., Derwing, T. M. & Burgess, C. (2010). Detection of nonnative speaker status from
content-masked speech. Speech Communication, 52(7–8), 626–637.
Munro, M. J., Derwing, T. M., & Thomson, R. I. (2015). Setting segmental priorities for English
learners: Evidence from a longitudinal study. IRAL, 53(1), 39–60.
Munro, M. J. & Derwing, T. M. (2020). Collecting data in L2 pronunciation research. In O. Kang, S.
Staples, K. Yaw, & K. Hirschi (Eds.), Proceedings of the 11th pronunciation in second language
learning and teaching conference, Northern Arizona University, September 2019 (pp. 8–18). Ames,
IA: Iowa State University.
Murphy, J. (2017). Teaching the pronunciation of English: Focus on whole courses. Ann Arbor MI:
University of Michigan Press.
Murphy, J. & Baker, A. (2015). The history of ESL pronunciation teaching. In M. Reed & J. M. Levis
(Eds.), The handbook of English pronunciation (pp. 36–65). Hoboken, NJ: Wiley Blackwell.
Nagle, C. (2019). Developing and validating a methodology for crowdsourcing L2 speech ratings in
Amazon Mechanical Turk. Journal of Second Language Pronunciation, 5, 294–323.
Nagle, C., Trofimovich, P., & Bergeron, A. (2019). Toward a dynamic view of second language
comprehensibility. Studies in Second Language Acquisition, 41, 647–672.
Nilsen, D. L. F., & Nilsen, A. P. (2010). Pronunciation contrasts in English (2nd edn). Long Grove, IL:
Waveland Press.
O’Brien, M. G. (2016). Methodological choices in rating speech samples. Studies in Second
Language Acquisition, 38(3), 587–605.
O’Brien, M. G., Derwing, T. M., Cucchiarini, C., Hardison, D. M., Mixdorff, H., Thomson, R. I.,
Strik, H., Levis, J. M., Munro, M. J., Foote, J. A., & Levis, G. M. (2018). Directions for the future
11
SPEECH INTELLIGIBILITY
John M. Levis and Alif O. Silpachai
1 Introduction/Definitions
After the 2003 invasion of Iraq by American and allied forces, the Australian comedy show
skitHOUSE showed a British reporter apparently interviewing Iraqi insurgents outside
Tikrit, hometown of the Iraqi president (https://www.youtube.com/watch?v=j0m4rcx0of4).
The insurgent spokesman questions why subtitles are needed for his speech when his comrade
speaks and is given no subtitles. Referring to the comrade, the reporter tells him that
“obviously he’s comprehensible,” and the comrade says that subtitles are like “teletext, for
the hearing impaired.” The skit thus plays off of three issues central to how we understand
L2 speech: accentedness, or the pronunciation of a speaker; comprehensibility, or the ease
with which listeners understand a speaker; and intelligibility, or “the extent to which a
speaker’s message is actually understood by a listener” (Munro & Derwing, 1995a, p. 76).
All three constructs of listener evaluation of speech (accentedness, comprehensibility, and
intelligibility) are partially related yet distinct. This chapter is about intelligibility, which
involves several types of understanding: understanding at the word level, the message level,
and in the interpretation of the message (Levis, 2018; Smith & Nelson, 1985). At the word
level, speech is intelligible when words in a message can be identified and decoded. If a
speaker’s words are unclearly pronounced, or mispronounced, or the competition from noise
is such that words cannot be effectively understood, words may be unintelligible by being
heard as another word, or by not being understood at all. At the message level, unintelligibility
occurs when the intended meaning of a discourse or utterance is not understood, or is
not fully understood. Intelligibility at the level of interpretation comes into play when listeners
understand the speaker’s intent behind a message, that is, its illocutionary force. This
kind of intelligibility is rarely studied for L2 speech because intent is often ambiguous, and
listeners may think they have correctly understood intent but have not.
A listener-based account of intelligibility is basic to how speakers are or are not understood.
In other words, we only know that spoken language is (un)intelligible because listeners
can or cannot understand it at the word, message or interpretation levels. From a speaker’s
viewpoint, intelligibility means speaking in such a way that listeners can understand what is
said. For L2 speakers, this may mean paying attention to segmental and prosodic features
associated with reduced intelligibility (Zielinski, 2008). Ladefoged and Disner (2012) describe
this give and take between speakers and listeners in terms of two principles behind speech:
articulatory ease and auditory distinctiveness. Speakers want to deliver their message with
the least amount of effort necessary, while listeners demand a level of clarity in speech that
makes listening easy. In effect, intelligibility means that the two sides need to meet in the
middle of any communicative exchange. When L2 speech is involved in the communicative
exchange, speakers will speak out of their own L1 or L2 variety, and they will also listen
with their own perceptual systems.
It is clear that most exchanges between L1 speakers are mutually intelligible, even across
dialects. However, interactions including L2 speakers add unexpected variability that can
more seriously affect intelligibility. Hahn and Watts (2011), for example, report an otherwise
friendly interaction between a Hausa-speaking woman (and her children, all in their best
clothes) and an English-speaking man in Nigeria, in which the Hausa speaker said “You
want to snuff me?” The English speaker was very confused until the woman pointed to his
camera and he was able to interpret the word “snuff” as actually being “snap” (as in a
snapshot, a photo). Two mispronunciations at the word level, in the vowel and the final stop
consonant, caused unintelligibility in both what was being asked and why it was being asked.
It is helpful to think of intelligibility in terms of different types of speakers and listeners, as
in the modified speaker-listener matrix from Levis (2020), in which native speaker-listeners
(NS) and non-native speaker-listeners (NNS) interact in various combinations. Table 11.1
assumes that intelligibility can be compromised in any interaction, that is, that listeners can
misunderstand speakers at the word, message or interpretation level for a variety of reasons.
In Quadrant A, this would happen between two NS speaker-listeners, perhaps due to differences
in dialect. In Quadrant B, NNS listeners would find NS speakers unintelligible,
perhaps, for example, because of speed of speech, register of speech (with casual speech being
more difficult than formal speech), or because the NNS listener’s proficiency in their L2 is
limited. Quadrant C reflects most research on L2 speech intelligibility, in which NS listeners
misunderstand NNS speakers because of unexpected phonological, lexical, or grammatical
choices or errors. While some errors may be relatively easy to interpret (e.g., this pronounced
as dis), others are not as easy (e.g., flight pronounced as fright). Finally, Quadrant D reflects
other research on L2 intelligibility, in which speakers and listeners with different L1s use a
common L2 to communicate. Intelligibility in this Quadrant may be affected by the phonological
and perceptual systems of both speaker-listeners. That is, not only will their L1
affect how they produce the common L2, they will also interpret the other person’s speech
through their L1 phonological system. Jenkins (2000) reports an English interaction between
a Swiss-German and a Japanese speaker, in which the German interpreted the Japanese
production of the words “grey house” as “clay house.” The Japanese speaker’s challenge in
producing a difference between /l/ and /ɹ/ also affected how the German speaker interpreted
the initial velar stop /ɡ/ as /k/. This chapter focuses on Quadrants C and D.
Table 11.1 Possible intelligibility interactions with native and non-native speakers
2 Historical Perspectives
Intelligibility has a long history in speech and signal processing research, especially in regard
to understanding how electronically delivered speech (e.g., telephone speech) is understood,
the effects of noise and other non-linguistic factors on understanding, and the extent to
which distortions of the speech signal affect understanding. As research came to incorporate
linguistic considerations, intelligibility was defined “as a property of speech communication
involving meaning” (Lehiste & Peterson, 1959, p. 280), resulting in studies of intelligibility in
deaf and hearing-impaired people, linguistic factors such as word predictability, and the
effects of age on listeners’ ability to understand. More recently, there have been increasing
numbers of studies on the intelligibility of L2 speech, the central topic here.
Within L2 pronunciation, one of the earliest advocates of intelligibility was Abercrombie
(1949) who said that L2 learners needed “comfortably intelligible” pronunciation, “which
can be understood with little or no conscious effort on the part of the listener” (p. 120).
Abercrombie’s view has much in common with modern notions of comprehensibility (i.e.,
requiring little conscious effort from a listener); he was ahead of his time in taking an explicitly
listener-based approach to L2 pronunciation. Intelligibility as the goal of pronunciation
learning made a more prominent appearance in the 1970s and 1980s in regard to
how speakers of World Englishes understood each other (Smith & Rafiqzad, 1979), how new
English varieties were understood (Bansal, 1976), how international teaching assistants in the
United States were understood (Gallego, 1990; Hinofotis & Bailey, 1980), and in relation to
pronunciation teaching approaches (Pennington & Richards, 1986). Attempts were made to
identify what was actually meant by intelligibility (Smith & Nelson, 1985), what features
were important for intelligibility of L2 English (Jenner, 1989), how listeners with different L1
backgrounds evaluated accented speech (Fayer & Krasinski, 1987), and the effect of familiarity
with message content, accent, and speaker on improved intelligibility (Gass & Varonis,
1984). Intelligibility was not yet widely accepted as the primary goal for L2 pronunciation
teaching, but rather competed with nativelikeness as an alternative goal for L2 pronunciation
(Leather, 1983). It was not until Munro and Derwing (1995a) that intelligibility was understood
in the way it is used in most L2 pronunciation research today, that is, whether listeners
understand a speaker’s message at the level of the word, message or intention.
Another construct related to intelligibility is comprehensibility (Levis, 2020), but the two
constructs do not refer to the same thing. Comprehensibility is measured not by success or
failure but by the amount of work listeners do to understand the speech. There are respects in
which intelligibility and comprehensibility are similar. Like intelligibility, comprehensibility
is affected by pronunciation and lexico-grammatical features, and intelligibility and comprehensibility
can both change even when accentedness does not (Derwing et al., 1998;
Zhang & Yuan, 2020). However, comprehensibility also can be affected by fluency and the
ways in which spoken content is organized (Isaacs & Trofimovich, 2012).
intelligibility of L2 speech, the potential influence of clear speech (i.e., speech styles with
careful articulation as opposed to casual speech styles), the influence of noise, and the impact
of listener attitudes on L2 intelligibility.
linked to L1 sound structure. The evidence for these facilitative effects, however, remains weak.
In Bent and Bradlow’s (2003) own experiment, only the Korean L1 listeners experienced ISIB.
Listeners who spoke English with Chinese as their L1 did not find the English of a Chinese L1
speaker more intelligible than that of a Korean L1 speaker. In Edwards et al. (2019), Cantonese
listeners found Cantonese-accented English intelligible, but they also found the English
produced by Mandarin speakers intelligible. The
authors suggested that this may have been because the listeners were familiar with Mandarin
speakers’ English. A similar claim was made by Tauroza and Luk (1997) who did not find
evidence for shared background effects. They found that Hong Kong based ESL learners
showed higher comprehension scores when listening to Received Pronunciation (RP) compared
to Hong Kong English. The authors suggested that this was because the learners were
familiar with RP as it had been the language of instruction.
Using a transcription task, Munro et al. (2006) did not find clear facilitative effects for
listeners with an L1 background that matched that of the talkers. Their study reported that
Japanese listeners found the English produced by Japanese speakers more intelligible than
the English produced by native speakers of Cantonese, Mandarin, and English. However, the
Cantonese listeners did not find speech produced by native speakers of Cantonese more
intelligible compared to the other language groups. The authors suggested that if shared
background effects were present, they were likely weak and outweighed by other factors,
including ones that are not known or well understood. One factor they considered was
proficiency, given that the Japanese listeners had reported using English more than the
Cantonese listeners; however, the authors concluded that this factor too was likely trumped
by another factor: the properties of the speech itself.
Another area of interest to intelligibility is research into clear and conversational speech.
According to Smiljanić and Bradlow (2007), clear speech is produced when speakers think
that they are speaking to someone with impaired hearing or to someone who is a non-native
speaker. Plain (or conversational) speech is produced when speaking to someone who is
familiar with the speech style of the speaker. In general, the researchers found that “clear speech
is a beneficial articulatory modification regardless of the listener and talker L1 backgrounds”
(p. 664). Related research shows that Lombard speech, or speech produced in noise
(Lombard, 1911), may similarly facilitate intelligibility compared to speech produced in quiet
(Dreher & O’Neill, 1957; Summers et al., 1988; Pittman & Wiley, 2001). Lombard speech
occurs when humans unconsciously modify their vocalizations in various ways to facilitate
communication in noisy environments. Modifications include increased amplitude in proportion
to noise, a slower speech rate, a rise in fundamental frequency, and more energy at
higher frequencies (Bosker & Cooke, 2020; Cooke et al., 2014; Hotchkin & Parks, 2013; Luo
et al., 2015).
Previous studies have suggested that, compared to native speech, noise may greatly hinder
the intelligibility of non-native speech, especially the speech produced by low-intelligibility
speakers (Munro, 1998; Strori et al., 2020). However, the types of noise or other interference
with speech used in previous L2 intelligibility studies have been limited. Speech intelligibility in
natural settings is often hindered by interfering factors such as background noise,
reverberation (i.e., the prolongation of a sound in an enclosed environment; George et al.,
2010), competing voices (Healy et al., 2017), and room characteristics (Astolfi et al.,
2012). Although most current L2 investigations have examined speech produced in quiet,
research on speech produced in noise and/or reverberation might be beneficial because
speech in everyday life is rarely produced in ideal settings. Perhaps future L2 research on
intelligibility of speech produced in naturalistic settings such as classrooms, large halls, and
outdoors would increase current understanding of L2 intelligibility.
A final non-linguistic factor that can affect intelligibility is the attitudes that listeners have
about L2 accentedness. Beliefs about foreign accents can make listeners believe that they do
not understand the speaker (Lippi-Green, 2012), even though strongly accented speech can
be fully intelligible (Munro & Derwing, 1995a). Attitudes about accents may also be seen in
research about whether listeners find particular speakers acceptable for particular jobs, on
the assumption that the wrong type of accented speech can harm business. Thus, questions of
acceptability (Pilott, 2016) may be less about intelligibility and more about attitudes toward
accented speech. A similar reaction to accented speech is reflected in the terms “annoyance”
or “irritation” (Fayer & Krasinski, 1987), with similar uncertainty about the relationship to
intelligibility. Even the expectation of accented speech may be enough to affect listeners’
views of speech being less intelligible (Rubin, 1992), although such expectations may be
mediated by the (mis)match of the speaker’s appearance and beliefs about how the speaker
should sound (McGowan, 2015).
Nonetheless, reactions to accented speech can result in discrimination based on the belief
that strong accents are a cause of unintelligibility. Munro (2003) describes cases investigating
discrimination on the basis of accent in Canada. These types of cases do not often reach a
courtroom, partly because of weak legal protections for speakers with foreign accents.
Wolfram (2013) quotes Lippi-Green (2012) on this point: “Accent discrimination…is so
commonly accepted, so widely perceived as appropriate, that it must be seen as the last back door
to discrimination”. This alone calls for a higher profile for intelligibility and a lower one for
accentedness in how societies think of spoken language.
4 Measuring Intelligibility
Intelligibility is traditionally defined by the absence of its opposite, that is, an utterance is
intelligible if it is not unintelligible. As a result, most measures of intelligibility try to identify
places in which listeners do not accurately understand. We will first describe approaches to
measuring intelligibility, each of which is summarized in Table 11.2.
We will then consider how loss of intelligibility can make it difficult for listeners to navigate
speech.
A dominant approach to measuring intelligibility in signal processing is by using verbal
repetition (e.g., Fogerty & Humes, 2012), in which participants repeat aloud as accurately as
possible each sentence they hear. They do not receive any feedback on their repetition and
are encouraged to guess if unsure (Jørgensen et al., 2013). They may be allowed to listen to
each sentence a limited number of times (Chen et al., 2013). The scoring of repeated sentences
can be more or less strict, but typically the score is the proportion of words correctly
recognized, as judged by trained raters. In a stricter approach, all words including their
exact morphemes are assessed (Fogerty & Humes, 2012), whereas in a less strict approach,
only specific keywords are considered (Zekveld et al., 2013).
Studies of L2 intelligibility primarily have measured word intelligibility, but some studies
have also examined the intelligibility of segments and of messages. The primary task used to
measure L2 word intelligibility has been transcription rather than verbal repetition. Scoring
can involve words in isolation (Field, 2005), all words (Munro & Derwing, 1995a), keywords
(Bent et al., 2007), or cloze transcription in which some words are provided and the targeted
words must be supplied (Smith, 1992). There is no research that we know of that compares
these different types of measures under similar conditions.
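The difference between all-words and keyword scoring can be illustrated with a minimal sketch. It is not the procedure of any study cited above: the function names, the example sentence, and the keyword list are hypothetical, and matching is done on exact word forms without regard to word order, which simplifies what trained raters actually do.

```python
import re


def tokenize(text):
    """Lowercase the text and return its word tokens, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def all_words_score(target, transcription):
    """Proportion of target words found in the listener's transcription.

    Assumption: exact word forms must match, and each transcribed word
    can be credited to at most one target word. Word order is ignored.
    """
    remaining = tokenize(transcription)
    target_tokens = tokenize(target)
    hits = 0
    for word in target_tokens:
        if word in remaining:
            remaining.remove(word)  # use each transcribed word only once
            hits += 1
    return hits / len(target_tokens)


def keyword_score(keywords, transcription):
    """Proportion of pre-selected keywords found in the transcription."""
    heard = set(tokenize(transcription))
    keys = [k.lower() for k in keywords]
    return sum(k in heard for k in keys) / len(keys)


# Hypothetical item: the speaker intended "flight", the listener wrote "fright".
target = "The flight was delayed by two hours"
transcription = "The fright was delayed by two hours"

print(all_words_score(target, transcription))  # 6 of 7 words credited
print(keyword_score(["flight", "delayed", "hours"], transcription))  # 2 of 3 keywords
```

With the same transcription, the two scoring schemes give different proportions, which is one reason results obtained with different measures are hard to compare directly.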
Since intelligibility in normal speech may be higher than expected because of the semantic
redundancy of speech, another type of intelligibility measure is sentence verification. Munro
and Derwing (1995b) used decontextualized sentences that were either true or false. They
asked listeners to verify whether each sentence was true or false and connected that judgement
to the subsequent transcriptions of the sentences.
A third type of intelligibility measure uses the analysis of interaction (Jenkins, 2000;
Kennedy, 2012), in which misunderstandings and repairs in the midst of interactions are
identified and used as a proxy for loss of intelligibility. These misunderstandings may then be
used to identify possible causes of unintelligibility.
A fourth measure is similar to the third, but directly identifies the phonetic and phonological
causes of unintelligibility. Zielinski (2008) used three extended (2-hour) interviews of Chinese,
Korean, and Vietnamese speakers to identify sentences that listeners had trouble transcribing.
Those sentences were then used for a formal intelligibility task in which listeners transcribed the
sentences. The same sentences were phonetically annotated for segments and stress patterns.
The phonetic annotations were used to describe linguistic patterns affecting intelligibility.
Intelligibility at the level of discourse meaning is typically measured by comprehension of a
message. Hahn (2004) used a matched-guise study with three versions of a short lecture (about
5 minutes). Each lecture was delivered by the same Korean–English bilingual, whose oral
English proficiency was high, ensuring consistency in delivery across versions. One version had
correct prominence placements, one included incorrect prominence placements, and one was spoken
with no identifiable prominence placement (as with Korean prosody). Methodologically, Hahn
estimated intelligibility by the number of main ideas and details that listeners recalled.
Intelligibility can also be measured at the level of the segment. In this kind of study,
listeners are asked to identify the segment that they heard, often within a minimal pair (e.g.,
Lee & Lyster, 2017). Thomson and Derwing (2015) asked listeners to identify whether a
vowel was a correct or incorrect example of a category. If correct, the listener was asked to
identify whether the production was a good or poor example of the vowel. This approach
only indirectly says anything about whether mispronounced segments would impair intelligibility
in a sentence or discourse context.
Scalar judgements have also been used to evaluate intelligibility (e.g., Gooch et al., 2016).
This is an unsatisfactory approach to measuring intelligibility, as Thomson (2017) says:
“using Likert-type scales to assess intelligibility…appeals to listeners’ subjective experiences
of listening, rather than requiring subjects to demonstrate that they can match what they
have heard with what was uttered” (p. 19).
All of these methods require comparisons to evaluate how a certain level of intelligibility is
to be understood. For example, what level of unintelligibility is too much for a listener?
Labov and Hanau (2011), in a description of training for doctors whose L2 oral dictations
were consistently difficult for medical transcribers, showed that only 2% of words were
impossible to transcribe correctly at pre-test, a number that decreased by 20%, to 1.6%, at
post-test. The seemingly low percentage suggests that even 1 of 50 words is enough to cause
listeners to misunderstand, especially in contexts where accuracy is crucial.
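The arithmetic behind these figures can be checked with a short sketch. It assumes that the 20% decrease is relative to the pre-test rate rather than a drop in percentage points; the function and variable names are illustrative and do not come from Labov and Hanau (2011).

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the drop from `before` to `after` as a fraction of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Rates reported in the text: share of words that could not be transcribed.
pre_test_rate = 0.02    # 2% at pre-test
post_test_rate = 0.016  # 1.6% at post-test

# Assumption: "decreased by 20%" is a relative change, not percentage points.
reduction = relative_reduction(pre_test_rate, post_test_rate)
words_per_problem_word = 1 / pre_test_rate

print(f"Relative reduction: {reduction:.0%}")
print(f"About 1 problem word in every {words_per_problem_word:.0f} words at pre-test")
```

On this reading, the post-test rate still corresponds to roughly 1 untranscribable word in every 62 or 63 words, which is why the absolute improvement looks small even though the relative improvement is substantial.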
John M. Levis and Alif O. Silpachai
must be intelligible to a listener, and successful listening can only occur when the listener can
understand the words, messages, and intentions of the speaker. Particular accents may be
important for speakers and group identity, but trying to master particular accents should not
be a priority in most L2 teaching, where accents are unlikely to become native-like. Munro’s
(2011) statement remains the dominant view of the field today: “Intelligibility is the single
most important aspect of all communication. If there is no intelligibility, communication has
failed. In language pedagogy this…is an empirically sound concept that will provide a basis
for a wide range of pedagogically-oriented research in the future” (p. 13).
A teaching and learning approach focused on intelligibility means that teaching must
prioritize features that strongly affect intelligibility and that teaching for intelligibility must
include pronunciation and other language features. Jenkins (2000), in her data from
NNS–NNS communication, found that pronunciation (mostly segmental) was implicated in
loss of intelligibility about two-thirds of the time, while the other one-third of the time,
grammatical and lexical issues influenced loss of intelligibility. Because Jenkins established
priorities based on a relatively small number of errors and limited L2–L2 speaker pairs, her
conclusions must be seen as preliminary.
Teaching for intelligibility can be done reactively or proactively. Reactive correction, or ad
hoc correction, is a common type of corrective feedback and is important because it is part of
communicative language practice. Proactive approaches draw on teachers’ knowledge of
the types of features that are likely to cause loss of intelligibility and address
them through planned instruction. Not all pronunciation errors are likely to cause loss of
intelligibility even if they are noticeable, and some errors in certain types of
suprasegmentals may not even be noticed as pronunciation errors. For example, word stress
errors such as saying noun–verb pairs (e.g., PERmit vs. perMIT) with the wrong stress patterns
do not seem to affect intelligibility or comprehensibility for L1 English listeners (Cutler, 1986).
For these reasons, it is valuable for teachers to understand that some errors matter more
for intelligibility, while others may simply be bothersome to the teacher’s ear.
In the classroom, loss of intelligibility should be addressed explicitly, especially at the
word level. Word-level intelligibility problems occur most often when segmentals or word
stress choices are inaccurate. For example, Benrabah (1997) reported British English listener
transcriptions of English words spoken by Indian, Nigerian, and Algerian speakers. All the
words reported had unexpected stress patterns and their transcriptions were of completely
different words: UPset was transcribed as absent, norMALly as no money, wriTTEN as
retain, and seCONdary as country. Hahn and Watts (2011) show that vowel and consonant
mispronunciations can also result in word-level unintelligibility. They report on NS listeners
hearing math as meth, bed as bad, duck as dog, passion as patience, fork as pork, pen as Ben,
and ranking as linking. A number of these mishearings involve more than one segment (e.g.,
the duck/dog confusion involved both a vowel and a final consonant error). Hahn and Watts say about
two-thirds of their examples involved more than one error. Other errors are high-functional load
errors (Brown, 1988), that is, they involve substitutions that have many minimal pairs in
English, such as /æ/-/ɛ/ (math-meth, bad-bed) and /ɹ/-/l/ (ranking-linking).
Because L2 learners are more likely to be aware of accentedness than of intelligibility, it is
important to raise their awareness that intelligibility matters more. To do this, teachers could show that
pronunciation can sometimes be so unexpected that a listener cannot understand what is being
said. This can be done with stories or humour (almost all L2 learners have examples of this
happening) or by using others’ stories, such as those in Hahn and Watts (2011). Teachers can
then give opportunities to model and practice clarification or negotiation strategies for restoring
intelligibility (see Jenkins, 2000). There must also be opportunities for pronunciation instruction
for words that are particularly troublesome or for phonemic or word stress patterns.
168
Speech Intelligibility
In addressing segmental errors, the functional load principle offers helpful guidance for
teaching. Functional load measures the amount of work two phonemes do in distinguishing
otherwise identical words in a language. Brown (1988) lists phoneme contrasts for English
from 1 (low) to 10 (high). Low-functional load sound pairs have few minimal pairs (e.g.,
soot–suit, thought–fought), while high-functional load pairs have many (e.g., pat–fat,
plead–bleed, feet–fit). Because high-functional load errors cause listeners to work harder in
understanding and to judge accentedness more seriously (Munro et al., 2006), teachers
should prioritize high-functional load errors. Finally, all difficulties with understanding
should be openly discussed so that everyone can share strategies for resolving problems when
interacting in different contexts, for example, by paraphrasing or writing a word for the
listener. By doing so, teachers and learners may find it easier to talk about why
communication was not successful.
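The idea that functional load reflects how much work a contrast does in keeping words apart can be illustrated with a rough sketch that counts minimal pairs per phoneme contrast. This is only a crude proxy and is not Brown's (1988) procedure; the toy lexicon, its transcriptions, and the function name are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy lexicon: word -> phoneme sequence (illustrative only).
LEXICON = {
    "pat": ("p", "æ", "t"),
    "bat": ("b", "æ", "t"),
    "fat": ("f", "æ", "t"),
    "pet": ("p", "ɛ", "t"),
    "bet": ("b", "ɛ", "t"),
    "pin": ("p", "ɪ", "n"),
    "bin": ("b", "ɪ", "n"),
    "fin": ("f", "ɪ", "n"),
    "thin": ("θ", "ɪ", "n"),
}


def minimal_pair_counts(lexicon):
    """Count, for each phoneme contrast, the word pairs it alone distinguishes."""
    counts = Counter()
    for (_, phones_a), (_, phones_b) in combinations(lexicon.items(), 2):
        if len(phones_a) != len(phones_b):
            continue  # only same-length words can differ in exactly one segment
        diffs = [(a, b) for a, b in zip(phones_a, phones_b) if a != b]
        if len(diffs) == 1:
            counts[frozenset(diffs[0])] += 1
    return counts


counts = minimal_pair_counts(LEXICON)
for contrast, n in counts.most_common():
    print("/" + "/-/".join(sorted(contrast)) + "/", n)
```

In this toy lexicon the /p/-/b/ contrast separates more word pairs than /f/-/θ/, mirroring the high versus low distinction described above. A realistic estimate would need a full pronouncing dictionary and, ideally, word frequency information.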
In addressing suprasegmentals, it is important to remember that errors in suprasegmentals
may not be heard as pronunciation errors but as a general lack of understanding the larger
message or even as a social failing (Levis, 2018; Smith & Nelson, 1985). Unlike at the word
level, the role of pronunciation at these higher levels of intelligibility may be harder
to recognize because of the greater role of suprasegmentals in communicating messages and
intentions. With the exception of word stress, suprasegmentals are unlikely to cause
unintelligibility at the word level (Levis, 2018). However, a high degree of word-level intelligibility
may be insufficient for promoting message-level intelligibility. Hahn (2004) found that the
misplacement or absence of one pronunciation feature, primary phrase stress (i.e.,
prominence or nuclear stress), resulted in undergraduate students understanding less of the content
of a short lecture.
Examples of suprasegmentals being heard as a social failing can be seen in Gumperz
(1982) and Low (2006). Gumperz reported a conflict that involved a cross-cultural
pronunciation difference in intonation. Servers of Indian and Pakistani origin were accused of
being rude to British-origin employees at a British Airways cafeteria. In turn, the Indian and
Pakistani workers interpreted the British-origin workers as also being rude. After analysis,
Gumperz pinpointed the problem as coming from differences in intonation. The Indian and
Pakistani workers served “Gravy” with a falling intonation, a pattern interpreted as
something like, “Here, take the gravy”. Although Gumperz reported this was a polite way to offer
something in the servers’ L1s, it was impolite in English, where offers are conventionally
made with a rising intonation. In another example, Low (2006) reported that her Singapore
English pronunciation, which usually ends sentences with a prominent syllable even when
other varieties of English would not, was sometimes heard by L1 speakers of English as
signalling that she was angry or upset, even if that was the furthest thing from her intention.
6 Future Directions
The field of L2 pronunciation has progressed greatly since 1995. With an extensive pedagogical
pedigree, growing research findings on the differential effects of errors on intelligibility, studies
of the effectiveness of various techniques for face-to-face and autonomous learning, and the
extension of research on L2 pronunciation into a variety of languages, learning contexts, and
listening environments, the field is ready for further development. Murphy and Baker (2015)
write about four overlapping historical waves of L2 pronunciation: that pronunciation should
be taught, that knowledge about pronunciation makes a difference in teaching, that knowledge
about teachers can improve teaching, and that pedagogy can best improve when built on a solid
research base. Murphy and Baker also foresee a fifth wave in which pedagogy changes as a
result of attention to how teaching and learning are conceptualized as social practices, research
into teaching and learning materials, and innovations in teacher training. In such a fifth wave,
intelligibility will remain central to pedagogy and research.
Further Reading
These sources address intelligibility from different perspectives: ELF (Jenkins), research (Derwing &
Munro), and pedagogy (Levis). Munro and Derwing (2011) is an accessible starting point.
Derwing, T. M. & Munro, M. J. (2015). Pronunciation fundamentals: Evidence-based perspectives for L2
teaching and research. Amsterdam: John Benjamins Publishing Company.
Jenkins, J. (2000). The phonology of English as an international language. Oxford: Oxford University
Press.
Levis, J. (2018). Intelligibility, oral communication, and the teaching of pronunciation. Cambridge:
Cambridge University Press.
Munro, M. J. & Derwing, T. M. (2011). The foundations of accent and intelligibility in pronunciation
research. Language Teaching, 44(3), 316–327.
References
Abercrombie, D. (1949). Teaching pronunciation. ELT Journal, 3(5), 113–122.
Astolfi, A., Bottalico, P., & Barbato, G. (2012). Subjective and objective speech intelligibility
investigations in primary school classrooms. The Journal of the Acoustical Society of America, 131(1),
247–257.
Bansal, R. (1976). The intelligibility of Indian English. Central Institute of English and Foreign
Languages.
Bent, T., & Bradlow, A. (2003). The interlanguage speech intelligibility benefit. The Journal of the
Acoustical Society of America, 114(3), 1600–1610.
Bent, T., Bradlow, A., & Smith, B. (2007). Segmental errors in different word positions and their effects
on intelligibility of non-native speech. In O.-S. Bohn & M. Munro (Eds.), Language experience in
second language speech learning: In honour of James Emil Flege (pp. 331–347). Amsterdam: John
Benjamins.
Benrabah, M. (1997). Word-stress – a source of unintelligibility in English. International Review of Applied Linguistics in Language Teaching, 35(3),
157–165.
Best, C. (1995). A direct realist view of cross-language speech perception. In W. Strange (Ed.), Speech
perception and linguistic experience: Issues in cross-language research (pp. 171–206). York Press.
Bosker, H. & Cooke, M. (2020). Enhanced amplitude modulations contribute to the Lombard
intelligibility benefit: Evidence from the Nijmegen Corpus of Lombard Speech. The Journal of the
Acoustical Society of America. doi: 10.1121/10.0000646.
Brown, A. (1988). Functional load and the teaching of pronunciation. TESOL Quarterly, 22(4),
593–606.
Chen, F., Wong, L., & Wong, E. (2013). Assessing the perceptual contributions of vowels and
consonants to Mandarin sentence intelligibility. The Journal of the Acoustical Society of America,
134(2), EL178–EL184.
Cole, R., Yan, Y., Mak, B., Fanty, M., & Bailey, T. (1996, May). The contribution of consonants
versus vowels to word recognition in fluent speech. In 1996 IEEE international conference on
acoustics, speech, and signal processing conference proceedings (Vol. 2, pp. 853–856). IEEE.
Cooke, M., King, S., Garnier, M., & Aubanel, V. (2014). The listening talker: A review of human and
algorithmic context-induced modifications of speech. Computer Speech & Language, 28(2), 543–571.
Cutler, A. (1986). Forbear is a homophone: Lexical prosody does not constrain lexical access. Language
and Speech, 29(3), 201–220.
Derwing, T. & Munro, M. (2015). Pronunciation fundamentals: Evidence-based perspectives for L2
teaching and research. Amsterdam: John Benjamins Publishing Company.
Derwing, T., Munro, M., & Wiebe, G. (1998). Evidence in favor of a broad framework for
pronunciation instruction. Language Learning, 48(3), 393–410.
Dreher, J., & O’Neill, J. (1957). Effects of ambient noise on speaker intelligibility for words and
phrases. The Journal of the Acoustical Society of America, 29(12), 1320–1323.
Edwards, J., Zampini, M., & Cunningham, C. (2019). Listener proficiency and shared background
effects on the accentedness, comprehensibility and intelligibility of four varieties of English. Journal
of Monolingual and Bilingual Speech, 1(2), 333–356.
Fayer, J. & Krasinski, E. (1987). Native and nonnative judgments of intelligibility and irritation.
Language Learning, 37(3), 313–326.
Field, J. (2005). Intelligibility and the listener: The role of lexical stress. TESOL Quarterly, 39(3),
399–423.
Flege, J. (1992). Speech learning in a second language. In C. Ferguson, L. Menn, & C. Stoel-Gammon
(Eds.), Phonological development: Models, research, implications (pp. 565–604). York Press.
Fogerty, D., & Humes, L. (2012). The role of vowel and consonant fundamental frequency, envelope,
and temporal fine structure cues to the intelligibility of words and sentences. The Journal of the
Acoustical Society of America, 131(2), 1490–1501.
Fogerty, D., & Kewley-Port, D. (2009). Perceptual contributions of the consonant-vowel boundary to
sentence intelligibility. The Journal of the Acoustical Society of America, 126(2), 847–857.
Gallego, J. (1990). The intelligibility of three nonnative English-speaking teaching assistants: An
analysis of student-reported communication breakdowns. Issues in Applied Linguistics, 1(2),
219–237.
Gass, S., & Varonis, E. (1984). The effect of familiarity on the comprehensibility of nonnative speech.
Language Learning, 34(1), 65–87.
George, E., Goverts, S., Festen, J., & Houtgast, T. (2010). Measuring the effects of reverberation and
noise on sentence intelligibility for hearing-impaired listeners. Journal of Speech, Language, and
Hearing Research, 53, 1429–1439.
Gooch, R., Saito, K., & Lyster, R. (2016). Effects of recasts and prompts on L2 pronunciation
development: Teaching English /ɹ/ to Korean adult EFL learners. System, 60, 117–127.
Gumperz, J. (1982). Discourse strategies. Cambridge: Cambridge University Press.
Hahn, L. (2004). Primary stress and intelligibility: Research to motivate the teaching of
suprasegmentals. TESOL Quarterly, 38(2), 201–223.
Hahn, L., & Watts, P. (2011). Intelligibility tales. In Proceedings of the 2nd pronunciation in second
language learning and teaching conference (pp. 17–29). Iowa State University, Ames, IA.
Healy, E., Delfarah, M., Vasko, J., Carter, B., & Wang, D. (2017). An algorithm to increase
intelligibility for hearing-impaired listeners in the presence of a competing talker. The Journal of the
Acoustical Society of America, 141(6), 4230–4239.
Hinofotis, F., & Bailey, K. (1980). American undergraduates’ reactions to the communication skills of
foreign teaching assistants. In J. Fisher, M. Clarke, & J. Schachter (Eds.), On TESOL (Vol. 80,
pp. 120–133). Washington, DC: Teachers of English to Speakers of Other Languages.
Hotchkin, C., & Parks, S. (2013). The Lombard effect and other noise‐induced vocal modifications:
Insight from mammalian communication systems. Biological Reviews, 88(4), 809–824.
Im, J., & Levis, J. (2015). Judgments of non-standard segmental sounds and international teaching
assistants’ spoken proficiency levels. In G. Gorsuch (Ed.), Talking matters: Research on talk and
communication of international teaching assistants (pp. 113–142). Stillwater, OK: New Forums Press.
Isaacs, T., & Trofimovich, P. (2012). Deconstructing comprehensibility: Identifying the linguistic
influences on listeners’ L2 comprehensibility ratings. Studies in Second Language Acquisition, 34(3),
475–505.
Jenkins, J. (2000). The phonology of English as an international language. Oxford: Oxford University
Press.
Jenner, B. (1989). Teaching pronunciation: The common core. Speak Out, 4, 2–4.
Jørgensen, S., Ewert, S., & Dau, T. (2013). A multi-resolution envelope-power based model for speech
intelligibility. The Journal of the Acoustical Society of America, 134(1), 436–446.
Kennedy, S. (2012). When non-native speakers misunderstand each other: Identifying important
aspects of pronunciation. Contact Magazine, 38(2), 49–62.
Kent, R., & Minifie, F. (1977). Coarticulation in recent speech production models. Journal of Phonetics,
5(2), 115–133.
Kewley-Port, D., Burkle, T., & Lee, J. (2007). Contribution of consonant versus vowel information to
sentence intelligibility for young normal-hearing and elderly hearing-impaired listeners. The Journal
of the Acoustical Society of America, 122(4), 2365–2375.
Labov, J., & Hanau, C. (2011). Pronunciation as life and death: Improving the communication skills of
non-native English-speaking pathologists. In B. Hoekje & S. Tipton (Eds.), English language and the
medical profession: Instructing and assessing the communication skills of international physicians
(pp. 261–285). Bingley, UK: Emerald Group Publishing.
Ladefoged, P., & Disner, S. (2012). Vowels and consonants (3rd edn). Malden, MA: Wiley Blackwell.
Ladefoged, P., & Broadbent, D. (1957). Information conveyed by vowels. The Journal of the Acoustical
Society of America, 29(1), 98–104.
Leather, J. (1983). Second-language pronunciation learning and teaching. Language Teaching, 16(3),
198–219.
Lehiste, I., & Peterson, G. (1959). Linguistic considerations in the study of speech intelligibility. The
Journal of the Acoustical Society of America, 31(3), 280–286.
Levis, J. (2005). Changing contexts and shifting paradigms in pronunciation teaching. TESOL
Quarterly, 39(3), 369–377.
Levis, J. (2018). Intelligibility, oral communication, and the teaching of pronunciation. Cambridge:
Cambridge University Press.
Levis, J. (2020). Revisiting the nativeness and intelligibility principles. Journal of Second Language
Pronunciation, 6(3), 310–328.
Lindblom, B. (1963). Spectrographic study of vowel reduction. The Journal of the Acoustical Society of
America, 35(11), 1773–1781.
Lippi-Green, R. (2012). English with an accent. Language, ideology, and discrimination in the United
States (2nd edn). Milton Park, UK: Routledge.
Lombard, E. (1911). Le signe de l’elevation de la voix. Ann. Mal. de L’Oreille et du Larynx, 37, 101–119.
Low, E. L. (2006). A cross‐varietal comparison of deaccenting and given information: Implications for
international intelligibility and pronunciation teaching. TESOL Quarterly, 40(4), 739–761.
Luo, J., Goerlitz, H., Brumm, H., & Wiegrebe, L. (2015). Linking the sender to the receiver: Vocal
adjustments by bats to maintain signal detection in noise. Scientific Reports, 5, 1–11.
Lee, A. H., & Lyster, R. (2017). Can corrective feedback on second language speech perception errors
affect production accuracy? Applied Psycholinguistics, 38(2), 371.
McGowan, K. (2015). Social expectation improves speech perception in noise. Language and Speech,
58(4), 502–521.
Miller, G., Heise, G., & Lichten, W. (1951). The intelligibility of speech as a function of the context of
the test materials. Journal of Experimental Psychology, 41(5), 329–335.
Munro, M. (1998). The effects of noise on the intelligibility of foreign-accented speech. Studies in
Second Language Acquisition, 20(2), 139–154.
Munro, M. (2003). A primer on accent discrimination in the Canadian context. TESL Canada Journal,
20(2), 38–51.
Munro, M. (2011). Intelligibility: Buzzword or buzzworthy? In J. Levis & K. LeVelle (Eds.),
Proceedings of the 2nd pronunciation in second language learning and teaching conference, Sept. 2010.
(pp. 7–16). Ames, IA: Iowa State University.
Munro, M. & Derwing, T. (1995a). Foreign accent, comprehensibility, and intelligibility in the speech
of second language learners. Language Learning, 45(1), 73–97.
Munro, M. & Derwing, T. (1995b). Processing time, accent, and comprehensibility in the perception of
native and foreign-accented speech. Language and Speech, 38(3), 289–306.
Munro, M. & Derwing, T. (2011). The foundations of accent and intelligibility in pronunciation
research. Language Teaching, 44(3), 316–327.
Munro, M., Derwing, T., & Morton, S. (2006). The mutual intelligibility of L2 speech. Studies in Second
Language Acquisition, 28(1), 111–131.
Murphy, J., & Baker, A. (2015). History of ESL pronunciation teaching. In M. Reed & J. Levis (Eds.),
The handbook of English pronunciation (pp. 36–65). New York: Wiley Blackwell.
Pennington, M. & Richards, J. (1986). Pronunciation revisited. TESOL Quarterly, 20(2), 207–225.
Pilott, M. (2016). Migrant pronunciation: What do employers find acceptable? Doctoral dissertation,
Victoria University, Wellington, New Zealand.
Pittman, A. & Wiley, T. (2001). Recognition of speech produced in noise. Journal of Speech, Language,
and Hearing Research, 44, 487–496.
Rubin, D. L. (1992). Nonlanguage factors affecting undergraduates’ judgments of nonnative English-
speaking teaching assistants. Research in Higher Education, 33(4), 511–531.
Shattuck-Hufnagel, S., & Turk, A. (1996). A prosody tutorial for investigators of auditory sentence
processing. Journal of Psycholinguistic Research, 25(2), 193–247.
Smiljanić, R., & Bradlow, A. (2007). Clear speech intelligibility: Listener and talker effects. In
Proceedings of the XVIth international congress of phonetic sciences (pp. 661–664). Saarbrucken,
Germany.
Smith, L. (1992). Spread of English and issues of intelligibility. In B. Kachru (Ed.), The other tongue:
English across cultures (pp. 75–90). Champaign: University of Illinois Press.
Smith, L., & Nelson, C. (1985). International intelligibility of English: Directions and resources. World
Englishes, 4(3), 333–342.
Smith, L., & Rafiqzad, K. (1979). English for cross-cultural communication: The question of
intelligibility. TESOL Quarterly, 13(3), 371–380.
Strori, D., Bradlow, A., & Souza, P. (2020). Recognition of foreign-accented speech in noise: The
interplay between talker intelligibility and linguistic structure. The Journal of the Acoustical Society
of America, 147(6), 3765–3782.
Summers, W., Pisoni, D., Bernacki, R., Pedlow, R., & Stokes, M. (1988). Effects of noise on speech
production: Acoustic and perceptual analyses. The Journal of the Acoustical Society of America,
84(3), 917–928.
Tauroza, S., & Luk, J. (1997). Accent and second language listening comprehension. RELC Journal,
28(1), 54–71.
Thomson, R. (2017). Measurement of accentedness, intelligibility, and comprehensibility. In O. Kang &
A. Ginther (Eds.), Assessment in second language pronunciation (pp. 11–29). Milton Park, UK:
Routledge.
Thomson, R., & Derwing, T. (2015). The effectiveness of L2 pronunciation instruction: A narrative
review. Applied Linguistics, 36(3), 326–344.
Wolfram, W. (2013). Sound effects. Teaching Tolerance, 52(43), 29–31.
Zekveld, A., Rudner, M., Johnsrude, I., & Rönnberg, J. (2013). The effects of working memory
capacity and semantic cues on the intelligibility of speech in noise. The Journal of the Acoustical
Society of America, 134(3), 2225–2234.
Zielinski, B. (2008). The listener: No longer the silent partner in reduced intelligibility. System,
36(1), 69–84.
Zhang, R., & Yuan, Z. (2020). Examining the effects of explicit pronunciation instruction on the
development of L2 pronunciation. Studies in Second Language Acquisition, 42(4), 905–918.
12
SPEECH COMPREHENSIBILITY
Pavel Trofimovich, Talia Isaacs, Sara Kennedy, and Aki Tsunemoto
1 Introduction
In 21st century second language (L2) pronunciation research, pedagogy, and assessment, two
contrasting views continue to dominate the landscape (Levis, 2018). Propagated by the
unregulated accent reduction industry (Thomson, 2013), the first view upholds nativelike
attainment as the goal of L2 pronunciation learning and assessment. The second advances an
agenda to help L2 speakers be more easily understandable, not necessarily nativelike, to
listeners. This agenda includes researching the presumed factors that could foster or impede
listeners’ understanding of L2 speech and developing pedagogical interventions to enhance
the quality of L2 speakers’ performance and improve listeners’ ability to understand them.
Most L2 pronunciation experts deem the traditional emphasis on nativelikeness to be an
unsuitable goal in many contexts of language use (Derwing & Munro, 2015). However, the
alternative view – based on measuring how understandable L2 speakers are to listeners – has
been mired in definitional confusion and inconsistency (Isaacs, 2008).
predating Munro and Derwing’s influential work. Some researchers use the term
intelligibility when measuring understanding through Likert-type scales (e.g., Fayer & Krasinski,
1987) when, in fact, what is being measured is comprehensibility. Other scholars use the term
comprehensibility to refer to measures of what Munro and Derwing would call intelligibility,
such as examining the accuracy of listeners’ transcriptions of L2 utterances (e.g., Gass &
Varonis, 1984, but see Varonis & Gass, 1982, for a measure compatible with Munro and
Derwing’s notion of comprehensibility). Yet others have measured intelligibility with rating
scales that additionally conflate nativelike pronunciation and intelligibility. For
example, Anderson-Hsieh et al.’s (1992) scalar ratings of pronunciation ranged from
“heavily accented speech that was unintelligible” to “near nativelike speech” (p. 538). This
leads to two problems: listeners’ perceptions are treated as their actual understanding, and
speakers’ comprehensibility is confounded with how nativelike they sound.
Definitional challenges can also be seen in the context of L2 oral proficiency scales in
human-mediated standardized language tests often used for high-stakes decision making (e.g.,
TOEFL iBT, IELTS, TOEIC, and Aptis), where the use of comprehensibility has become
pervasive. Many rating scale descriptors make reference to intelligibility or intelligible speech,
but the use of listener- or examiner-mediated scales implies that, in fact, Munro and Derwing’s
notion of comprehensibility is being measured. To illustrate, Band 8 of the public version of
the IELTS speaking descriptors refers to L2 speech as “easy to understand throughout; L1
[first language] accent has minimal effect on intelligibility” (British Council, 2020). In another
example from language assessment, Isaacs et al. (2018) developed a dedicated L2
comprehensibility scale with extended descriptors intended for English for Academic Purposes
teachers to use as a pedagogical tool (i.e., for low-stakes formative assessment rather than
high-stakes consequential decision making). In their detailed analytic scale, comprehensibility is
discussed in terms of underlying pronunciation, fluency, lexis, and grammar features at
different ability levels, with the degree of listener effort described across the subscales. This scale
illustrates a data-driven approach to modelling comprehensibility, where comprehensibility is a
multidimensional construct defined through multiple extended descriptors rather than a single
numerical scale commonly used in research settings.
3 Critical Issues
One overarching issue about the construct of comprehensibility is its role among other global
measures of speaking (e.g., intelligibility) and specific metrics of speakers’ performance (e.g.,
pronunciation accuracy and fluency). The key question is whether and to what degree scalar
ratings of comprehensibility can be useful for language teachers and learners, researchers,
and language speakers more generally.
Pavel Trofimovich et al.
more time for listeners to complete the tasks. Comprehensibility ratings are also reliable
across listeners, meaning that listeners generally agree with each other regardless of how
comprehensibility is measured (Munro, 2018; Nagle, 2019). By comparison, intelligibility scores
often vary across task type, influenced by the nature of the speech sample and the type of
listening task used to measure intelligibility (Kang et al., 2018; Kennedy, 2009). Most
importantly, although intelligibility and comprehensibility are partially independent,
comprehensibility ratings provide a reasonable estimate of listeners’ actual understanding of speech
(Sheppard et al., 2017). For instance, Munro and Derwing (1995a) reported substantial
overlaps between these dimensions, with correlation coefficients approaching .90, although
the magnitude of this link might vary for different speakers and listeners (Matsuura et al.,
1999). An intuitive, easy-to-use scalar measure, comprehensibility might thus be a useful
general metric of understanding in several contexts of language teaching, learning, and use.
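The kind of overlap described here is usually expressed as a Pearson correlation between per-speaker intelligibility scores and comprehensibility ratings. The following is a minimal sketch of that calculation; the scores are hypothetical values invented for illustration (they are not data from any cited study), and the scale direction (9 = easy to understand) is likewise an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 items)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-speaker scores, invented purely for illustration.
intelligibility = [0.98, 0.91, 0.85, 0.77, 0.95, 0.68]  # proportion of words transcribed
comprehensibility = [8.5, 7.0, 6.5, 4.0, 8.0, 3.5]      # mean rating, 9 = easy to understand

r = pearson_r(intelligibility, comprehensibility)
print(f"r = {r:.2f}")
```

A coefficient close to 1 would indicate that speakers who are transcribed more accurately also tend to be rated as easier to understand, while leaving room for the partial independence of the two measures noted above.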
harder to pronounce are considered less trustworthy (Newman et al., 2014), regardless of the
content of the statements. Similarly, readers exposed to text printed in a difficult-to-read font
react more negatively than those reading the same text in an easy-to-read font, despite having
similar text comprehension for both conditions (Sanchez & Jaeger, 2015; Song & Schwarz,
2008). Munro and Derwing (1995a) observed that comprehensibility might be rated
differently for speech that is perfectly intelligible, which aligns with findings from processing
fluency studies that listeners’ various reactions to speech and speakers might be linked not to
actual understanding (intelligibility) but to comprehensibility.
Growing evidence suggests that comprehensibility captures socially important decisions
for listeners. For instance, in social–psychological research, speakers whom listeners perceived
as hard to understand were downgraded in listeners’ affective and attitudinal evaluations.
Such speakers were ascribed negative emotions of annoyance and irritation and deemed less
intelligent and successful (Dragojevic et al., 2017). Similarly, in an e-learning study, when
students evaluated an instructional video narrated by an instructor who was rated hard to
understand, students downgraded their evaluations of the instructor, expressed negative
attitudes towards coursework, and evaluated video content as more difficult, even though
students’ actual understanding of the video was not compromised (Sanchez & Khan, 2016).
In fact, a comprehensibility scale akin to that used in L2 speech research has now been
validated as part of a five-item processing fluency measure that appears to explain various
human judgements (truthfulness, preference, and perceived risk) all formerly attributed to
processing fluency (Graf et al., 2018). Thus, an intuitive appeal of comprehensibility as a
measure of processing fluency is that it might help to explain aspects of human behaviour,
including, for instance, whether interlocutors continue interacting with speakers they find
difficult to understand or whether university students drop out of courses led by instructors
whose speech they consider hard to process.
this would seem short sighted. It may be irresponsible to presume that L2 speakers will
exclusively speak with native speakers, especially for languages of major global or regional
significance (e.g., English, Mandarin, and Spanish).
Comprehensibility – Dynamic
Although pedagogically oriented investigations of comprehensibility have tracked L2
speakers’ comprehensibility over weeks, months, and sometimes years, comprehensibility
has rarely been framed as a dynamic, variable process which can change on a finer-grained
timescale, as a matter of minutes or seconds. Nagle et al. (2019) explored whether
comprehensibility can be construed as dynamic, examining how raters explain their assessments
as they evolve over time. Twenty-four Spanish-speaking listeners evaluated 3-minute
personal narratives by L2 Spanish learners using a computer interface which allowed lis-
teners to increase or decrease the comprehensibility rating as the speech unfolded. Listeners
showed varying rating profiles, such that some listeners increased or decreased compre-
hensibility ratings infrequently over a speech sample whereas others increased or decreased
ratings at a high frequency, with varying magnitude of change. In a follow-up study,
Trofimovich et al. (2020) reasoned that interactive speech, where interlocutors react to one
another in real time, might be even more amenable to dynamic comprehensibility judge-
ments, compared to the one-way listening task that Nagle et al. (2019) examined. For this
study, L2 English university students from different language backgrounds engaged in
collaborative tasks over 17 minutes, rating their partner’s comprehensibility at 2- to 3-
minute intervals. Speakers’ comprehensibility ratings for the most part followed a U-
shaped function, with comprehensibility (initially perceived to be high) dipping to lower
levels but then reaching high levels by the end of the interaction. Speakers’ ratings also
became more similar to each other soon after the interaction started and remained alike
throughout. Taken together, these findings not only suggest that speakers’ comprehensi-
bility can change over time as interaction unfolds but also imply that comprehensibility
issues might become less important for both interlocutors in a conversation after a certain
minimum threshold of comprehensibility has been reached. Whether such a threshold in-
volves a degree of interpersonal comfort or is simply a matter of investing sufficient time
into communication is an area for future research.
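To make the idea of a listener's "rating profile" concrete, the following is a minimal illustrative sketch, not the analysis used in the studies cited above. It assumes each listener's continuous ratings have been exported as a simple list of values sampled at regular intervals; the function name `rating_profile`, the 0–100 scale, and the sample traces are hypothetical.

```python
from statistics import mean

def rating_profile(ratings):
    """Summarize one listener's rating trace (hypothetical format:
    one comprehensibility value per sampling interval)."""
    # Keep only the steps where the listener actually moved the rating.
    changes = [b - a for a, b in zip(ratings, ratings[1:]) if b != a]
    return {
        # How often the listener adjusted the rating.
        "n_changes": len(changes),
        # Average size of an adjustment, ignoring direction.
        "mean_abs_change": mean(abs(c) for c in changes) if changes else 0.0,
    }

# Invented traces: one listener who rarely adjusts, one who adjusts often.
infrequent = [70, 70, 70, 65, 65, 65, 65, 70]
frequent = [70, 60, 75, 55, 80, 60, 85, 70]

print(rating_profile(infrequent))
print(rating_profile(frequent))
```

Comparing the two summaries shows how listeners can differ in both the frequency and the magnitude of their rating changes, even when their starting and final ratings are the same.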
• There is little difference in the ratings obtained through the use of 5- versus 9-point
scales, although shorter scales are sometimes perceived by listeners as constraining
whereas longer scales are considered difficult for differentiating across skill levels (Isaacs
& Thomson, 2013). Compared to direct magnitude estimation of comprehensibility,
9-point scales perform just as well, suggesting that the use of scalar ratings is a reliable
approach to measuring L2 comprehensibility for research purposes (Munro, 2018).
• Evaluations of individual short sentences by the same speaker often lack consistency,
suggesting that ratings of shorter speech samples might not be representative of ratings
of longer discourse produced by the same speaker (Munro, 2018).
• Listeners sometimes assign harsher ratings when evaluating the same samples again,
because listeners might become increasingly aware of how the speakers’ output differs
from the language expected by listeners (Flege & Fletcher, 1992; Munro &
Derwing, 1994).
• Comprehensibility ratings do not appear to be influenced by whether this dimension is
evaluated separately or in combination with other global dimensions such as accent-
edness and fluency (O’Brien, 2016) or by the order in which comprehensibility judge-
ments occur in a rating sequence (Derwing & Munro, 1997; O’Brien, 2016).
• Speech ratings obtained in online environments with built-in controls (e.g., through
crowdsourcing platforms) yield highly reliable judgements, comparable to those ob-
tained in research laboratories (Nagle, 2019).
As Isaacs et al. (2015) note, regardless of the method used to capture comprehensibility, in
the absence of detailed guidance, raters may interpret the target construct in different ways,
for example, assuming that it refers to listeners’ perceptions of understanding the overall
message, to understanding every single word that is uttered, or solely to understanding
meaning-laden words. Put simply, some listeners’ interpretations of comprehensibility may
differ from other listeners’ interpretations and might not exactly conform to what the re-
searchers believe they are measuring. This is important to establish in light of construct
validity for comprehensibility measurement, which has only been examined infrequently to
date (Isaacs & Thomson, 2013; Munro, 2018; Nagle, 2019).
contexts for teaching or tutoring L2 speech or familiarizing listeners with L2 speech. For
instance, the type of speaking task, the speaker’s use of vocabulary and grammar, the
listener’s level of motivation, attitude towards or experience with L2 speech and learning
generally are all elements that could be linked to comprehensibility. Teachers, speakers,
and listeners could then work on elements over which they potentially have control, such as
pronunciation, vocabulary, attitude, motivation, or experience with L2 speech. Clearly,
neither teachers and speakers nor listeners can attend simultaneously to all elements po-
tentially linked to comprehensibility. The importance of particular elements varies ac-
cording to the person and context, and speakers and listeners should be encouraged to
enhance their awareness of these elements through awareness-building activities, including
guided analysis of self or others’ comprehensibility or spoken language (e.g., Derwing
et al., 2002; Krech Thomas, 2004).
In light of comprehensibility’s dynamic nature, it is important for researchers to consider
how the length of a speech sample or frequency of rating might impact comprehensibility.
Because comprehensibility trends upward over the course of interactions, L2 speakers aiming
for comprehensibility or increased confidence in their comprehensibility should be en-
couraged to seek opportunities for spoken interactions which are not brief. These might be
found in group discussions or brainstorming sessions, interviews, workshops, and commu-
nity group meetings. Another consideration is speakers’ overall affective and motivational
profiles which have been linked to comprehensibility ratings. Many researchers and teachers
try to ensure that their research and teaching contexts are not stressful for speakers or lis-
teners. Rehearsal of tasks and self-reports of anxiety, willingness to communicate, and
motivation could also help to modulate or document the possible influence of these and other
variables on interlocutors’ comprehensibility. Confidence could be promoted by teachers or
L2 speakers through calming or self-affirming exercises prior to and during spoken inter-
action. On a more global scale, enjoyable learning environments, where teachers promote
positivity, encourage learners’ desire to communicate, increase motivation, and reduce an-
xiety (e.g., Moskowitz & Dewaele, 2021), are likely most conducive to the development of
comprehensibility and successful L2 learning generally.
Because comprehensibility appears to be influenced by social variables, language tea-
chers might engage their learners in initiatives which involve structured opportunities for
positive contact between various types of interlocutors. Other initiatives might target
native-speaking listeners, to help them discover some differences between their language
and another language, do structured practice in transcribing L2 speech, or take the per-
spective of an L2 peer. The goal would be to guide people to consider different facets of
individuals with whom they might unknowingly share linguistic and social commonalities,
as a way of promoting harmonious communication (Hansen et al., 2014). With en-
couragement and support from administration and managers, formal or informal activities
such as happy hours, sharing circles, or language classes can also be done in workplaces
with colleagues from different backgrounds (Kim et al., 2019). Of course, individuals can
themselves initiate contact with L2 speakers or try to learn or use a less familiar language,
and so reduce anxiety or develop more positive attitudes about communication with L2
speakers. For research, the importance of attitudes towards L2 speech means that eliciting
measures of raters’ attitudes will add another dimension to the analysis and interpretation
of comprehensibility ratings. Moreover, social biases which a rater is exposed to before
rating can affect the rating itself. Researchers should carefully consider who is involved in
administering the rating session (e.g., a majority or minority language speaker in a given
context), what is said prior to the ratings, and how these factors could influence raters and
the scores they assign.
7 Future Directions
Comprehensibility is an appealing construct because it connects language learners and tea-
chers, who might be interested in improving L2 oral production, with researchers, whose goal
is to describe what linguistic, social, experiential, and behavioural dimensions underlie
people’s experience with speech. The breadth of theoretical questions and the versatility of
applied contexts relevant to comprehensibility make for exciting future research. For ex-
ample, researchers could intensify longitudinal research examining how learners with dif-
ferent cognitive, motivational, experiential, and affective profiles develop comprehensible L2
speech across different contexts, both instructed and uninstructed. In keeping with a dynamic
view of comprehensibility, researchers might continue exploring interlocutors’ comprehen-
sibility in paired or group interaction. This work could clarify how interlocutors’ cumulative
shared experience impacts their comprehensibility ratings in tasks that increase versus de-
crease in cognitive difficulty over time and examine how non-linguistic cues (e.g., gestures,
facial expressions, and displays of emotion) and interactional variables (e.g., backchannelling
and clarification requests) contribute to interlocutors’ mutual comprehensibility judgements.
Researchers might explore links between interaction-based comprehensibility ratings and
interlocutor awareness of what makes speech comprehensible for them, using different
combinations of interlocutors who vary in language proficiency, experience, and other
variables (e.g., personality characteristics). Similarly, it might be useful to explore long-term
effects of interlocutors’ extended conversational experience on their perception of compre-
hensibility, focusing on speakers’ judgements of the same and new partners in another in-
stance of interaction, after a delay. In light of demonstrated alignment between both
partners’ comprehensibility scores in extended interaction, it could be fruitful to examine the
validity of a joint (rather than speaker-specific) measure of comprehensibility for both
partners in a conversational dyad. Given attitudinal influences on comprehensibility, re-
searchers might explore situations where interlocutors’ comprehensibility judgements are
influenced by one or both interlocutors’ sociopolitical views, stereotypical judgements, or
other attitudes towards the speaker or the topic of conversation. Finally, comprehensibility
ratings, as useful measures of listener understanding and listener processing fluency, could be
examined in relation to such conversational phenomena as speakers’ engagement in dialogue,
participation patterns, or affective responses to the task or their partner, to clarify the role of
processing effort in interlocutor experience in interaction.
8 Conclusion
Beyond a doubt, comprehensibility is a valuable construct, relevant to both speakers and
listeners and useful for both researchers and educational practitioners. People’s perceptions
of each other’s comprehensibility are subject to social influences, evolve dynamically over
time as communication unfolds, are tied to many linguistic (and some non-linguistic) features
of interaction, and affect other aspects of people’s judgements, such as how annoying or
intelligent a person is. These characteristics make comprehensibility a worthy conceptual and
practical target. By understanding how interlocutors perceive each other’s comprehensibility
(in terms of what comprehensibility means for them), it might be possible to empower
speakers to become more successful L2 communicators. Nevertheless, comprehensibility is
but one of several constructs relevant to L2 speech. Gaining a clearer understanding of the
teaching and learning of L2 speech would unquestionably require the use of multiple
complementary metrics of listeners’ understanding, including measures of comprehensibility,
intelligibility, and listening comprehension.
Further Reading
Derwing, T. M., & Munro, M. J. (2015). Pronunciation fundamentals: Evidence-based perspectives for L2
teaching and research. John Benjamins.
A state-of-the-art review of literature on various constructs relevant to L2 pronunciation, including
comprehensibility.
Isaacs, T., & Trofimovich, P. (Eds.). (2016). Second language pronunciation assessment: Interdisciplinary
perspectives. Multilingual Matters.
This open access resource features multiple research contributions relevant to pronunciation assess-
ment, including the assessment of comprehensibility.
Isaacs, T., Trofimovich, P., & Foote, J. A. (2018). Developing a user-oriented second language com-
prehensibility scale for English-medium universities. Language Testing, 35, 193–216.
This study describes the development and validation of a comprehensibility-focused scale for English
for Academic Purposes teachers.
Nagle, C., Trofimovich, P., & Bergeron, A. (2019). Toward a dynamic view of second language
comprehensibility. Studies in Second Language Acquisition, 41, 647–672.
The first study investigating comprehensibility from a dynamic perspective.
References
Anderson-Hsieh, J., Johnson, R., & Koehler, K. (1992). The relationship between native speaker
judgments of nonnative pronunciation and deviance in segmentals, prosody, and syllable structure.
Language Learning, 42, 529–555.
Bergeron, A., & Trofimovich, P. (2017). Linguistic dimensions of accentedness and
comprehensibility: Exploring task and listener effects in second language French. Foreign Language Annals, 50,
547–566.
British Council (2020). How IELTS is assessed. Retrieved 5 April 2020 from
https://takeielts.britishcouncil.org/teach-ielts/test-information/assessment
Caspers, J. (2010). The influence of erroneous stress position and segmental errors on intelligibility,
comprehensibility and foreign accent in Dutch as a second language. Linguistics in the Netherlands,
27, 17–29.
Crowther, D., Trofimovich, P., & Isaacs, T. (2016). Linguistic dimensions of second language accent
and comprehensibility: Nonnative listeners’ perspectives. Journal of Second Language Pronunciation,
2, 160–182.
Crowther, D., Trofimovich, P., Saito, K., & Isaacs, T. (2018). Linguistic dimensions of L2 accentedness
and comprehensibility vary across speaking tasks. Studies in Second Language Acquisition, 40,
443–457.
Derwing, T. M., & Munro, M. J. (1997). Accent, intelligibility, and comprehensibility: Evidence from
four L1s. Studies in Second Language Acquisition, 19, 1–16.
Derwing, T. M., & Munro, M. J. (2009). Comprehensibility as a factor in listener interaction pre-
ferences: Implications for the workplace. Canadian Modern Language Review, 66, 181–202.
Derwing, T. M., & Munro, M. J. (2013). The development of L2 oral language skills in two L1 groups:
A 7‐year study. Language Learning, 63, 163–185.
Derwing, T. M., & Munro, M. J. (2015). Pronunciation fundamentals: Evidence-based perspectives for L2
teaching and research. John Benjamins.
Derwing, T. M., Munro, M. J., & Thomson, R. I. (2008). A longitudinal study of ESL learners’ fluency
and comprehensibility development. Applied Linguistics, 29, 359–380.
Derwing, T. M., Munro, M. J., & Wiebe, G. (1998). Evidence in favor of a broad framework for
pronunciation instruction. Language Learning, 48, 393–410.
Derwing, T. M., Munro, M. J., Foote, J. A., Waugh, E., & Fleming, J. (2014). Opening the window on
comprehensible pronunciation after 19 years: A workplace training study. Language Learning, 64,
526–548.
Derwing, T. M., Rossiter, M. J., & Munro, M. J. (2002). Teaching native speakers to listen to foreign-
accented speech. Journal of Multilingual and Multicultural Development, 23, 245–259.
Dragojevic, M., Giles, H., Beck, A.-C., & Tatum, N. T. (2017). The fluency principle: Why foreign
accent strength negatively biases language attitudes. Communication Monographs, 84, 385–405.
Fayer, J. M., & Krasinski, E. (1987). Native and nonnative judgments of intelligibility and irritation.
Language Learning, 37, 313–326.
Flege, J., & Fletcher, K. (1992). Talker and listener effects on the perception of degree of foreign accent.
Journal of the Acoustical Society of America, 91, 370–389.
Foote, J. A., & McDonough, K. (2017). Using shadowing with mobile technology to improve L2
pronunciation. Journal of Second Language Pronunciation, 3, 34–56.
Foote, J., & Trofimovich, P. (2018). Is it because of my language background? A study of language
background influence on comprehensibility judgments. Canadian Modern Language Review, 74,
253–278.
Gass, S., & Varonis, E. M. (1984). The effect of familiarity on the comprehensibility of nonnative
speech. Language Learning, 34, 65–89.
Graf, L. K. M., Mayer, S., & Landwehr, J. R. (2018). Measuring processing fluency: One versus five
items. Journal of Consumer Psychology, 28, 393–411.
Hahn, L. D. (2004). Primary stress and intelligibility: Research to motivate the teaching of supraseg-
mentals. TESOL Quarterly, 38, 201–223.
Hansen, K., Rakic, T., & Steffens, M. C. (2014). When actions speak louder than words: Preventing
discrimination of nonstandard speakers. Journal of Language and Social Psychology, 33, 68–77.
Isaacs, T. (2008). Towards defining a valid assessment criterion of pronunciation proficiency in non-
native English-speaking graduate students. Canadian Modern Language Review, 64, 555–580.
Isaacs, T., & Thomson, R. I. (2013). Rater experience, rating scale length, and judgments of L2 pro-
nunciation: Revisiting research conventions. Language Assessment Quarterly, 10, 135–159.
Isaacs, T., & Trofimovich, P. (2012). Deconstructing comprehensibility: Identifying the linguistic in-
fluences on listeners’ L2 comprehensibility ratings. Studies in Second Language Acquisition, 34,
475–505.
Isaacs, T., Trofimovich, P., & Foote, J. A. (2018). Developing a user-oriented second language com-
prehensibility scale for English-medium universities. Language Testing, 35, 193–216.
Isaacs, T., Trofimovich, P., Yu, G., & Chereau, B. M. (2015). Examining the linguistic aspects of speech
that most efficiently discriminate between upper levels of the revised IELTS Pronunciation scale.
IELTS Research Report Series, 4, 1–48.
Isbell, D. R., Park, O. S., & Lee, K. (2019). Learning Korean pronunciation: Effects of instruction,
proficiency, and L1. Journal of Second Language Pronunciation, 5, 13–48.
Kang, O., Rubin, D., & Pickering, L. (2010). Suprasegmental measures of accentedness and judgments
of language learner proficiency in oral English. The Modern Language Journal, 94, 554–566.
Kang, O., Thomson, R. I., & Moran, M. (2018). Empirical approaches to measuring the intelligibility
of different varieties of English in predicting listener comprehension. Language Learning, 68,
115–146.
Kennedy, S. (2009). L2 proficiency: Measuring the intelligibility of words and extended speech. In A.
Benati (Ed.), Issues in second language proficiency (pp. 132–144). Continuum.
Kennedy, S., & Trofimovich, P. (2008). Intelligibility, comprehensibility, and accentedness of L2
speech: The role of listener experience and semantic context. Canadian Modern Language Review, 64,
459–490.
Kennedy, S., & Trofimovich, P. (2010). Language awareness and second language pronunciation: A
classroom study. Language Awareness, 19, 171–185.
Kennedy, S., & Trofimovich, P. (2013). First- and final-semester non-native students in an English-
medium university: Judgments of their speech by university peers. Learning in Higher Education, 3,
283–303.
Kennedy, S., Foote, J. A., & Buss, L. K. (2015). Second language speakers at university: Longitudinal
development and rater behaviour. TESOL Quarterly, 49, 199–209.
Kim, R., Roberson, L., Russo, M., & Briganti, P. (2019). Language diversity, nonnative accents, and
their consequences at the workplace: Recommendations for individuals, teams, and organizations.
The Journal of Applied Behavioral Science, 55, 73–95.
Krech Thomas, H. (2004). Training strategies for improving listeners’ comprehension of foreign‐accented
speech (Doctoral dissertation). Retrieved from
https://scholar.colorado.edu/concern/graduate_thesis_or_dissertations/j098zb31g
Levis, J. (2018). Intelligibility, oral communication, and the teaching of pronunciation. Cambridge
University Press.
Ludwig, A., & Mora, J. C. (2017). Processing time and comprehensibility judgments in non-native
listeners’ perception of L2 speech. Journal of Second Language Pronunciation, 3, 167–198.
Ludwig, J. (1982). Native-speaker judgments of second-language learners’ efforts at communication: A
review. The Modern Language Journal, 66, 274–283.
MacIntyre, P. D. (2012). The idiodynamic method: A closer look at the dynamics of communication
traits. Communication Research Reports, 29, 361–367.
Matsuura, H., Chiba, R., & Fujieda, M. (1999). Intelligibility and comprehensibility of American and
Irish Englishes in Japan. World Englishes, 18, 49–62.
Moskowitz, S., & Dewaele, J.-M. (2021). Is teacher happiness contagious? A study of the link between
perceptions of language teacher happiness and student attitudes. Innovation in Language Learning
and Teaching, 15, 117–130.
Munro, M. J. (2018). Dimensions of pronunciation. In O. Kang, R. I. Thomson, & J. M. Murphy
(Eds.), The Routledge handbook of contemporary English pronunciation (pp. 413–431). Routledge.
Munro, M. J., & Derwing, T. M. (1994). Evaluations of foreign accent in extemporaneous and read
material. Language Testing, 11, 253–266.
Munro, M. J., & Derwing, T. M. (1995a). Foreign accent, comprehensibility, and intelligibility in the
speech of second language learners. Language Learning, 45, 73–97.
Munro, M. J., & Derwing, T. M. (1995b). Processing time, accent, and comprehensibility in the per-
ception of native and foreign-accented speech. Language and Speech, 38, 289–306.
Munro, M. J., Derwing, T. M., & Morton, S. L. (2006). The mutual intelligibility of L2 speech. Studies
in Second Language Acquisition, 28, 113–131.
Nagle, C. (2018). Motivation, comprehensibility, and accentedness in L2 Spanish: Investigating moti-
vation as a time‐varying predictor of pronunciation development. The Modern Language Journal,
102, 199–217.
Nagle, C. (2019). Developing and validating a methodology for crowdsourcing L2 speech ratings in
Amazon Mechanical Turk. Journal of Second Language Pronunciation, 5, 294–323.
Nagle, C., Trofimovich, P., & Bergeron, A. (2019). Toward a dynamic view of second language
comprehensibility. Studies in Second Language Acquisition, 41, 647–672.
Newman, E. J., Sanson, M., Miller, E. K., Quigley-McBride, A., Foster, J. L., Bernstein, D. M., & Garry, M.
(2014). People with easier to pronounce names promote truthiness of claims. PLoS ONE, 9(2), e88671.
O’Brien, M. G. (2014). L2 learners’ assessments of accentedness, fluency, and comprehensibility of
native and nonnative German speech. Language Learning, 64, 715–748.
O’Brien, M. G. (2016). Methodological choices in rating speech samples. Studies in Second Language
Acquisition, 38, 587–605.
Piazza, L. G. (1980). French tolerance for grammatical errors made by Americans. The Modern
Language Journal, 64, 422–427.
Polyanskaya, L., & Ordin, M. (2019). The effect of speech rhythm and speaking rate on assessment of
pronunciation in a second language. Applied Psycholinguistics, 40, 795–819.
Reber, R., & Greifeneder, R. (2017). Processing fluency in education: How metacognitive feelings shape
learning, belief formation, and affect. Educational Psychologist, 52, 84–103.
Rubin, D. L., & Smith, K. A. (1990). Effects of accent, ethnicity, and lecture topic on undergraduates’
perceptions of nonnative English-speaking teaching assistants. International Journal of Intercultural
Relations, 14, 337–353.
Saito, K., & Akiyama, Y. (2017). Linguistic correlates of comprehensibility in second language
Japanese speech. Journal of Second Language Pronunciation, 3, 199–217.
Saito, K., & Plonsky, L. (2019). Effects of second language pronunciation teaching revisited: A
proposed measurement framework and meta‐analysis. Language Learning, 69, 652–708.
Saito, K., & Shintani, N. (2016). Do native speakers of North American and Singapore English dif-
ferentially perceive comprehensibility in second language speech? TESOL Quarterly, 50, 421–446.
Saito, K., Dewaele, J.-M., Abe, M., & In’nami, Y. (2018). Motivation, emotion, learning experience,
and second language comprehensibility development in classroom settings: A cross‐sectional and
longitudinal study. Language Learning, 68, 709–743.
Saito, K., Tran, M., Suzukida, Y., Sun, H., Magne, V., & Ilkan, M. (2019). How do L2 listeners
perceive the comprehensibility of foreign-accented speech? Roles of L1 profiles, L2 proficiency, age,
experience, familiarity and metacognition. Studies in Second Language Acquisition, 41, 1133–1149.
Saito, K., Trofimovich, P., & Isaacs, T. (2017a). Using listener judgements to investigate linguistic
influences on L2 comprehensibility and accentedness: A validation and generalization study. Applied
Linguistics, 38, 439–462.
Saito, K., Trofimovich, P., Isaacs, T., & Webb, S. (2017b). Re-examining phonological and lexical
correlates of second language comprehensibility: The role of rater experience. In T. Isaacs & P.
Trofimovich (Eds.), Second language pronunciation assessment: Interdisciplinary perspectives
(pp. 141–156). Multilingual Matters.
Sanchez, C. A., & Jaeger, A. J. (2015). If it’s hard to read, it changes how long you do it: Reading time
as an explanation for perceptual fluency effects on judgment. Psychonomic Bulletin and Review, 22,
206–211.
Sanchez, C. A., & Khan, S. (2016). Instructor accents in online education and their effect on learning
and attitudes. Journal of Computer Assisted Learning, 32, 494–502.
Schwarz, N. (2018). Of fluency, beauty, and truth: Inferences from metacognitive experiences. In
J. Proust & M. Fortier (Eds.), Metacognitive diversity: An interdisciplinary approach (pp. 25–46).
Oxford University Press.
Sheppard, B. E., Elliott, N. C., & Baese-Berk, M. M. (2017). Comprehensibility and intelligibility of
international student speech: Comparing perceptions of university EAP instructors and content
faculty. Journal of English for Academic Purposes, 26, 42–51.
Song, H., & Schwarz, N. (2008). If it’s hard to read, it’s hard to do: Processing fluency affects effort
prediction and motivation. Psychological Science, 19, 986–988.
Strachan, L., Kennedy, S., & Trofimovich, P. (2019). Second language speakers’ awareness of their own
comprehensibility: Examining task repetition and self-assessment. Journal of Second Language
Pronunciation, 5, 347–373.
Suzuki, S., & Kormos, J. (2020). Linguistic dimensions of comprehensibility and perceived fluency: An
investigation of complexity, accuracy, and fluency in second language argumentative speech. Studies
in Second Language Acquisition, 42, 143–167.
Taylor Reid, K., O’Brien, M. G., Trofimovich, P., & Tsunemoto, A. (2020a). Exploring the stability of
second language speech ratings through task practice in bilinguals’ two languages. Journal of
Monolingual and Bilingual Speech, 2, 315–329.
Taylor Reid, K., O’Brien, M., Trofimovich, P., & Bajt, A. (2020b). Testing the malleability of teachers’
judgments of second language speech. Journal of Second Language Pronunciation, 6, 236–264.
Taylor Reid, K., Trofimovich, P., & O’Brien, M. G. (2019). Social attitudes and speech ratings: Effects
of positive and negative bias on multiage listeners’ judgments of second language speech. Studies in
Second Language Acquisition, 41, 419–442.
Taylor Reid, K., Trofimovich, P., O’Brien, M. G., & Tsunemoto, A. (2021). Using task practice to
reduce social influences on listener evaluations of second language accent and comprehensibility.
International Journal of Listening. Advance online publication.
https://doi.org/10.1080/10904018.2021.1904933
Thomson, R. I. (2013). Accent reduction. In C. A. Chapelle (Ed.), The encyclopedia of applied linguistics
(pp. 8–11). Wiley-Blackwell.
Thomson, R. I. (2018). Measurement of accentedness, intelligibility, and comprehensibility. In O. Kang
& A. Ginther (Eds.), Assessment in second language pronunciation (pp. 11–29). Routledge.
Trofimovich, P., Nagle, C. L., O’Brien, M. G., Kennedy, S., Taylor Reid, K., & Strachan, L. (2020).
Second language comprehensibility as a dynamic construct. Journal of Second Language
Pronunciation, 6, 430–457.
Tyler, A., & Bro, J. (1992). Discourse structure in nonnative English discourse: The effect of ordering
and interpretive cues on perceptions of comprehensibility. Studies in Second Language Acquisition,
14, 71–86.
Varonis, E., & Gass, S. (1982). The comprehensibility of nonnative speech. Studies in Second Language
Acquisition, 4, 114–136.
Weyant, J. M. (2007). Perspective taking as a means of reducing negative stereotyping of individuals
who speak English as a second language. Journal of Applied Social Psychology, 37, 703–716.
13
FLUENCY
Jimin Kahng
1 Introduction/Definitions
While having a conversation in a second language (L2) in which you are less proficient, have you
ever missed a turn because you could not construct your message fast enough to keep up with the
pace of the conversation? Speaking is a skill exercised under time pressure. Compared to their
first language (L1),
people typically have less knowledge of their second language, and are also considerably less
fluent using the L2 knowledge they do have (Segalowitz, 2010). Consequently, fluency
constitutes a crucial aspect of understanding L2 performance and proficiency (e.g., Housen
et al., 2012; Iwashita et al., 2008).
The importance of fluency in second language acquisition and education has been widely
acknowledged by researchers and practitioners; however, defining the term has been a long-
standing issue in the field, mainly due to its polysemous nature (Schmidt, 1992). For ex-
ample, in “Emily speaks three languages fluently,” “fluently” can be casually substituted with
“well” and fluency relates to someone’s overall proficiency. This meaning is what Lennon
(1990, 2000) labelled as the “broad” sense of fluency. One of the broadest conceptualizations
of fluency was provided by Fillmore (1979). In the discussion of how well people speak their
L1, Fillmore identified four dimensions of fluency based on speed and smoothness, semantic
density and coherence, appropriateness, and creativity.
On the other hand, fluency can refer to a specific aspect of proficiency, complemented by
others, such as the accuracy and complexity of language use (Housen et al., 2012). This
relates to what Lennon (1990) called the “narrow” sense of fluency, defined as the “rapid,
smooth, accurate, lucid, and efficient translation of thought or communicative intention
under the temporal constraints of online processing” (2000, p. 26). The majority of L2 flu-
ency research has focused on this narrow sense, although there have been some discussions of
taking a more holistic approach to examining fluency (e.g., Wright & Tavakoli, 2016).
Segalowitz (2010) maintains that even the narrow sense of fluency itself is a multidimensional
construct, encompassing three distinct aspects – cognitive, utterance, and perceived fluency.
Cognitive fluency refers to a speaker’s capacity to utilize the underlying cognitive processes
responsible for fluent speech production (e.g., efficiency of lexical retrieval, grammatical/
phonological encoding). Utterance fluency refers to the temporal and repair characteristics of
speech (e.g., articulation rate, number of pauses or repairs). In contrast, perceived fluency
involves listeners’ inferences about the speaker’s cognitive fluency based on their speech.
The distinction among the three aspects of fluency is a useful way to conceptualize and situate
various L2 fluency studies and will be used throughout the chapter.
2 Historical Perspectives
Disfluencies such as silent and filled pauses (e.g., uh and um), repetitions, and self-corrections
are common in spontaneous speech. However, historically they have not always been central
in language research. For instance, Chomsky (1965) considered disfluencies to be random or
characteristic errors, and argued that disfluencies should be excluded from linguistic theory.
On the other hand, a psychologist, Frieda Goldman-Eisler, was one of the first researchers
to systematically examine temporal features and disfluencies in L1 spontaneous speech (e.g.,
1951, 1968). Since her pioneering work, research on disfluencies has expanded in related
fields such as psycholinguistics, speech-language pathology, discourse analysis, sociolinguistics,
second language acquisition (SLA), and language assessment with different
focuses. Before discussing early work in SLA, a brief overview of different perspectives on
fluency research in these neighbouring fields will help us to contextualize the topic (see De
Jong, 2018 for a comprehensive review).
Psycholinguistic studies on disfluencies have mainly investigated when and why disfluencies
occur in L1 speech and how listeners process them. They showed that slower speech
and disfluencies occur when speech planning is challenging (e.g., describing complex things,
Goldman-Eisler, 1968; before low-frequency words, Beattie & Butterworth, 1979).
Disfluencies were initially hypothesized to be edited out by listeners (e.g., Levelt, 1989);
however, later studies showed disfluencies have functions such as having listeners anticipate
something complex or new (e.g., Brennan & Schober, 2001).
Fluency-related research in discourse analysis has focused on the social-interactional functions
of disfluencies in conversation. Some of the main findings suggest that disfluencies are interactional
devices to regulate turn-taking; different types of disfluencies such as silent and
filled pauses are used to hold the floor or to give the floor to interlocutors (e.g., Maclay &
Osgood, 1959). Sociolinguists have investigated how individuals differ in speaking style
including fluency features based on various factors such as age, gender, socioeconomic
status, and personality. In particular, speakers who use linguistic markers of powerlessness,
including hesitations, have been perceived as less competent, less attractive, and less trustworthy (see
Hosman, 2015, for an overview).
Research on L2 fluency started in the 1970s and 1980s (e.g., Dechert & Raupach, 1987).
From the 1990s, L2 fluency research saw considerable growth in the fields of SLA and
language testing. This increased interest led to a special issue in the International Review of
Applied Linguistics in Language Teaching and a few books on L2 fluency, including the first
edited volume on L2 fluency from multiple disciplines by Riggenbach (2000), Segalowitz’s
(2010) monograph on fluency from a cognitive science perspective, and two recent additions
by Lintunen et al. (2020), and Tavakoli and Wright (2020).
Among early works, Lennon (1990) is one of the most widely cited studies on L2 fluency.
This paper introduced the broad and narrow senses of fluency and identified temporal speech
features reflecting L2-perceived fluency. Four advanced EFL learners told a story based on
pictures at the beginning and end of a 6-month residence in the United Kingdom. Their
speech samples were rated by 10 EFL teachers for fluency and were also analyzed to obtain
12 objective measures of utterance fluency, for example, words per minute, repetitions,
self-corrections, and filled pauses per T-unit [a T-unit is “one main clause with all subordinate
clauses attached to it” (Hunt, 1965, p. 20)]. Most raters agreed that the participants improved
in overall fluency and their speech improved in terms of speech rate, filled pauses per T-unit,
Jimin Kahng
Measure Formula
Fluency
proficiency. Finally, a few studies explored the underlying mechanism of fluent speech by
examining the relationship between measures of utterance fluency and those of cognitive
fluency. In doing so, L1 utterance fluency or speaking style has also been viewed as a major
factor of L2 utterance fluency. In the subsequent part, details about each of the issues and
topics will be discussed in depth.
four separate experiments, they examined the relative contributions of the three fluency aspects
to perceived fluency and whether perceptual sensitivity to each aspect can explain the relative
contributions. They found each of the three fluency aspects explained a significant amount of
variance of perceived fluency (breakdown 60%, speed 54%, repair 16%, altogether 84%), although
repair measures made the least contribution. When listeners rated only one of the
three fluency aspects, they were sensitive to each aspect, but a bit more sensitive to pauses
compared to speed or repairs. They concluded that the contributions of the three aspects of
fluency cannot be explained by perceptual sensitivity alone and listeners seem to weight the
importance of the perceived aspects of fluency for an overall judgement.
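A brief numerical aside may help explain why the separate percentages reported above (60%, 54%, and 16%) sum to more than the combined 84%: when predictors are themselves correlated, the variance they explain overlaps. The Python sketch below illustrates this with the standard two-predictor formula for R². The function name and the correlation values are hypothetical and are not taken from the study; this is only one possible reading of such figures.

```python
def joint_r_squared(r_y1: float, r_y2: float, r_12: float) -> float:
    """R^2 of a linear regression with two predictors, from pairwise correlations.

    r_y1, r_y2: correlation of each predictor with the outcome (e.g., fluency ratings)
    r_12:       correlation between the two predictors
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Hypothetical values: two fluency measures that each relate to the ratings
# and also correlate with one another.
r_pause, r_speed, r_between = 0.7, 0.6, 0.5

separate_sum = r_pause ** 2 + r_speed ** 2            # each predictor on its own
combined = joint_r_squared(r_pause, r_speed, r_between)

print(f"sum of separate R^2: {separate_sum:.3f}")
print(f"combined R^2:        {combined:.3f}")
```

With these illustrative numbers the combined value is smaller than the sum of the separate values, because part of what the two measures explain is shared.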
More recently, Saito et al. (2018) investigated temporal correlates of four different levels
of perceived fluency. They found that the number of clause-final pauses distinguished between
low- and mid-levels of perceived fluency; the number of mid-clause pauses further
distinguished between mid- and high-levels of perceived fluency; and finally, articulation rate
further distinguished between high- and native-levels of perceived fluency, suggesting the
importance of articulation rate and mid-clause pauses in perceived fluency.
Studies on perceived fluency utilizing correlation analyses, however, cannot demonstrate
whether those utterance features actually cause different levels of perceived fluency.
Therefore, some studies utilized phonetic manipulations to examine causal relationships.
Munro and Derwing (1998) used L1 and advanced L2 speech samples and three additional
sets of the samples whose speed had been synthetically manipulated (with a mean L1 rate, a
mean L2 rate, and a reduced rate). English native listeners rated how appropriate each
speaker’s rate was using a 9-point scale (1 = too slowly, 5 = just right, 9 = too quickly). [This
scale differs from the typical scales of perceived fluency, in which the anchors refer to “very
disfluent” and “very fluent.” It is noteworthy that typical fluency rating scales do not have a
reference point for “too quickly” or “too fluent.”] The results showed that speeding up slow
L2 speech samples had a positive effect on the ratings, whereas listeners in general preferred
L2 speech presented at a slightly slower rate than that of L1 speech. More recently, Bosker
et al. (2014) examined the effects of speed, number and length of silent pauses on perceived
fluency of L1 and L2 speech. They manipulated L1 and L2 speech samples in terms of
pauses, by creating no-, short- and long-pause conditions, and speed, by speeding up L2
speech and slowing down L1 speech. They found for both L1 and L2 speech samples, the
number and the length of silent pauses negatively affected fluency ratings. In terms of speed,
following their predictions, speeding up non-native speech increased fluency ratings and
slowing down native speech decreased fluency ratings. Kahng (2018) further examined
whether the location of silent pauses has an impact on perceived fluency by manipulating
pause location in L1 and L2 speech. She constructed no-pause, pauses-within-clauses, and
pauses-between-clauses conditions and compared their fluency ratings. The results showed
that the no-pause condition was rated more fluent than the two conditions with pauses.
Crucially, for both L1 and L2 speech, the pauses-between-clauses condition was rated more
fluent than the pauses-within-clauses condition (see Wennerstrom, 2001, for a similar finding
with non-manipulated L2 speech). The findings of Bosker et al. (2014) and Kahng (2018)
suggest that although L1 and L2 speech may be different in terms of speed and disfluencies
and L1 speech is usually perceived to be more fluent than L2 speech, the factors of perceived
fluency seem to operate in a similar fashion in L1 and L2 speech.
methodologies have been used – comparing L2 with L1 speech, relating utterance fluency to
different speaking proficiency levels cross-sectionally, and tracking learners’ gains in utterance
fluency longitudinally.
When compared to L1 speech, L2 speech tends to be slower and have more pauses and
repairs (e.g., Kahng, 2014; Riazantseva, 2001). Importantly, there is also a difference in the
distribution of pauses. Compared to L1 speech, L2 speech has more pauses within clauses
(Kahng, 2014; Tavakoli, 2011) and within Analysis of Speech (AS) units (De Jong, 2016;
Skehan & Foster, 2007). [Foster et al. (2000, p. 365) define an AS unit as “a single speaker’s
utterance consisting of an independent clause, or sub-clausal unit, together with any subordinate
clause(s) associated with either.”] De Jong (2016) further examined the effects of
word frequency on pause occurrences and found that both L1 and L2 speech are more likely
to have pauses before lower than higher frequency words. She concluded that the findings
suggest that in both L1 and L2 speech production, pauses at utterance boundaries are mainly
connected with conceptual planning, whereas pauses within utterances involve difficulties in
formulating the linguistic message including lexical retrieval.
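The within-unit versus between-unit distinction described above can be sketched in code. The Python example below assumes that a transcript has already been segmented into clauses or AS units with start and end times and that silent pauses have been time-stamped; the function name, the data layout, and the timings are hypothetical and do not reproduce the procedure of any cited study.

```python
def classify_pauses(units, pauses):
    """Count silent pauses that fall inside a unit versus at unit boundaries.

    units:  list of (start, end) times in seconds for clauses or AS units
    pauses: list of (start, end) times in seconds for silent pauses
    A pause counts as "within" only if it lies strictly inside one unit;
    every other pause is counted as "between".
    """
    counts = {"within": 0, "between": 0}
    for p_start, p_end in pauses:
        inside = any(u_start < p_start and p_end < u_end
                     for u_start, u_end in units)
        counts["within" if inside else "between"] += 1
    return counts


# Hypothetical timings: two units separated by a boundary pause.
units = [(0.0, 3.0), (3.8, 7.0)]
pauses = [(1.2, 1.6), (3.0, 3.8), (5.0, 5.4)]
print(classify_pauses(units, pauses))
```

Here the first and third pauses fall inside a unit and the second sits at the boundary between units, which is the kind of tally that underlies comparisons of pause distribution in L1 and L2 speech.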
Studies that related utterance fluency measures to speaking proficiency found moderate to
strong correlations (Ginther et al., 2010; Iwashita et al., 2008; Kahng, 2014; Kang et al.,
2010; Révész et al., 2016). For instance, Iwashita et al. (2008) analyzed spoken test performances
using a range of measures of grammatical accuracy and complexity, vocabulary,
pronunciation, and fluency and investigated which features best distinguish overall levels of
performance. They found that measures of fluency, especially speech rate, along with those
of vocabulary, had the strongest impact on distinguishing L2 speaking proficiency levels. De
Jong (2018) points out that the rubrics of speaking proficiency of Iwashita et al. (2008) included
aspects of fluency, which might have led raters to pay more attention to fluency features.
However, a couple of studies (Kahng, 2014; Révész et al., 2016) on this topic whose raters
were not instructed to focus on fluency still reported significant correlations between utterance
fluency and speaking proficiency. In Kahng (2014), the rubric did not contain any
aspects of fluency yet she still found that overall speaking scores were significantly correlated
with mean syllable duration (inverse of articulation rate) and the number of silent pauses
within AS units.
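To make the relationship between these temporal measures concrete, the Python sketch below computes speech rate, articulation rate, and mean syllable duration for a single sample. It assumes common operationalizations (speech rate as syllables over total time including pauses; articulation rate as syllables over speaking time excluding silent pauses); the class name, field names, and numbers are hypothetical, and individual studies may define the measures differently.

```python
from dataclasses import dataclass


@dataclass
class SpeechSample:
    """Hypothetical summary of one annotated speech sample."""
    syllables: int               # number of syllables produced
    total_time_s: float          # total response time, silent pauses included
    silent_pause_time_s: float   # summed duration of silent pauses


def speaking_time(s: SpeechSample) -> float:
    """Time spent speaking, i.e., total time minus silent pausing."""
    return s.total_time_s - s.silent_pause_time_s


def speech_rate(s: SpeechSample) -> float:
    """Syllables per second of total time (assumed definition)."""
    return s.syllables / s.total_time_s


def articulation_rate(s: SpeechSample) -> float:
    """Syllables per second of speaking time (assumed definition)."""
    return s.syllables / speaking_time(s)


def mean_syllable_duration(s: SpeechSample) -> float:
    """Seconds per syllable: the inverse of articulation rate."""
    return speaking_time(s) / s.syllables


# Hypothetical sample: 120 syllables in 60 s, of which 20 s is silent pausing.
sample = SpeechSample(syllables=120, total_time_s=60.0, silent_pause_time_s=20.0)
print(speech_rate(sample), articulation_rate(sample), mean_syllable_duration(sample))
```

Because pausing time is removed before dividing, articulation rate and mean syllable duration are treated as pure speed measures, whereas speech rate also reflects how much a speaker pauses.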
Some studies have also tracked learners’ gains in utterance fluency over time. Since
Lennon’s (1990) and Towell et al.’s (1996) early work on the development of utterance fluency,
research on study abroad and related areas has pursued this line of research with various
L1–L2 speakers as participants. For example, O’Brien et al. (2007) found that after a semester
of study abroad, the English-speaking learners of Spanish improved speech rate and mean
length of run without fillers. In Mora and Valls-Ferrer (2012), after a 3-month stay abroad,
Catalan-Spanish bilingual learners of English showed gains in speech rate, mean length of run,
pause frequency and duration. In their 2-year-long study with English-speaking learners of
Spanish, Huensch and Tracy-Ventura (2017b) reported that gains in speed appeared quickly and
were maintained after return from study abroad whereas gains in pausing appeared later and
were sensitive to attrition after return home. Derwing and Munro (2013) followed two groups
of L2 immigrants to English-speaking Canada over 7 years and determined that fluency
continued to develop over time in the group that reported more interaction and exposure to
English, whereas the other group showed no significant fluency progress; most members of this
group reported limited exposure to English. The authors interpreted these findings through a
Willingness to Communicate framework (MacIntyre, 2007).
Taken together, studies on the relationship between utterance fluency and proficiency
suggest that pure speed measures (e.g., articulation rate, mean syllable duration) and the frequency
of silent pauses, especially those within clauses or AS units, seem to be reliable
indicators of speaking proficiency. On the other hand, these studies do not tell us about what
cognitive processes are responsible for such fluent L2 speech. We will turn to the issue in the
next part.
found that abandoning and regenerating a speech plan is cognitively demanding for both L1
and L2 speakers and leads to disfluencies; however, the additional time L2 speakers needed
to regenerate a speech plan was greater than for L1 speakers.
In summary, these few studies on the relationship between L2 utterance fluency and
cognitive fluency suggest that not all utterance fluency measures reflect L2 cognitive fluency
and some objective measures are more connected with certain subprocesses of speech production.
This line of research is exciting and inspiring, yet more research is needed for a clear
understanding of the underlying mechanisms of fluent L2 speech.
the framework of CALF (complexity, accuracy, lexis, and fluency) research (e.g., Housen
et al., 2012; Skehan & Foster, 2007; Tavakoli & Skehan, 2005). Some studies have explored
the cognitive and affective factors of L2 fluency development, including phonological short-
term memory (O’Brien et al., 2007), personality (Dewaele & Furnham, 2000), and affective
variables (Kormos & Préfontaine, 2017). In their 7-year study, Derwing and Munro (2013)
demonstrated how learners’ L1, age, L2 use, and their willingness to communicate influenced
the development of fluency, comprehensibility, and accentedness. There is a paucity of research
exploring the effects of learning contexts (e.g., Mora & Valls-Ferrer, 2012; Segalowitz
& Freed, 2004) and instruction types on fluency development (e.g., Boers et al., 2006; De
Jong & Perfetti, 2011; Galante & Thomson, 2017). There is also a growing interest in L2
fluency in dialogue (e.g., McCarthy, 2010; Sato, 2014; van Os et al., 2020).
Further Reading
De Jong, N. H. (2018). Fluency in second language testing: Insights from different disciplines. Language
Assessment Quarterly, 15, 237–254.
An overview of L2 fluency research in applied linguistics, psycholinguistics, discourse analysis, and
sociolinguistics, focusing on the differences in its conceptualization with implications for language
research and testing.
Derwing, T. M. (2019). L2 fluency development. In S. Loewen & M. Sato (Eds.), The Routledge
handbook of instructed second language acquisition (pp. 246–259). New York: Routledge.
A thorough and comprehensive review on the historical background, current issues, empirical evidence,
pedagogical implications and teaching tips for L2 fluency development.
Segalowitz, N. (2010). Cognitive bases of second language fluency. New York: Routledge.
A comprehensive and interdisciplinary synthesis of fluency research from a cognitive science perspective
covering a wide range of cognitive, social, motivational factors of second language fluency.
References
Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.
Beattie, G. W., & Butterworth, B. L. (1979). Contextual probability and word frequency as determinants
of pauses and errors in spontaneous speech. Language and Speech, 22, 201–211.
Boers, F., Eyckmans, J., Kappel, J., Stengers, H., & Demecheleer, H. (2006). Formulaic sequences and
perceived oral proficiency: Putting a lexical approach to the test. Language Teaching Research, 10,
245–261.
Bosker, H. R., Pinget, A., Quené, H., Sanders, T., & De Jong, N. H. (2012). What makes speech sound
fluent? The contributions of pauses, speed and repairs. Language Testing, 30, 159–175.
Bosker, H. R., Quené, H., Sanders, T., & De Jong, N. H. (2014). The perception of fluency in native and
nonnative speech. Language Learning, 64, 579–614.
Brennan, S. E., & Schober, M. F. (2001). How listeners compensate for disfluencies in spontaneous
speech. Journal of Memory and Language, 44, 274–296.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Cucchiarini, C., Strik, H., & Boves, L. (2002). Quantitative assessment of second language learners’
fluency: Comparisons between read and spontaneous speech. The Journal of the Acoustical Society
of America, 111, 2862–2873.
De Jong, N. H., & Perfetti, C. A. (2011). Fluency training in the ESL classroom: An experimental study
of fluency development and proceduralization. Language Learning, 62, 533–568.
De Jong, N. H. (2016). Predicting pauses in L1 and L2 speech: The effects of utterance boundaries and
word frequency. International Review of Applied Linguistics in Language Teaching, 54, 113–132.
De Jong, N. H. (2018). Fluency in second language testing: Insights from different disciplines. Language
Assessment Quarterly, 15, 237–254.
De Jong, N. H., Groenhout, R., Schoonen, R., & Hulstijn, J. H. (2015). Second language fluency:
Speaking style or proficiency? Correcting measures of second language fluency for first language
behavior. Applied Psycholinguistics, 36, 223–243.
De Jong, N. H., & Mora, J. C. (2019). Does having good articulatory skills lead to more fluent speech in
first and second languages? Studies in Second Language Acquisition, 41, 227–239.
De Jong, N. H., Steinel, M. P., Florijn, A., Schoonen, R., & Hulstijn, J. H. (2013). Linguistic skills and
speaking fluency in a second language. Applied Psycholinguistics, 34, 893–916.
Dechert, H. W., & Raupach, M. (Eds.). (1987). Psycholinguistic models of production. Norwood, NJ:
Ablex Publishing Corporation.
Derwing, T. M. (2019). L2 fluency development. In S. Loewen & M. Sato (Eds.), The Routledge
handbook of instructed second language acquisition (pp. 246–259). New York: Routledge.
Derwing, T. M., & Munro, M. J. (2013). The development of L2 oral language skills in two L1 groups:
A 7-year study. Language Learning, 63, 163–185.
Derwing, T. M., Munro, M. J., Thomson, R. I., & Rossiter, M. J. (2009). The relationship between L1
fluency and L2 fluency development. Studies in Second Language Acquisition, 31, 533–557.
Derwing, T. M., Rossiter, M., Munro, M., & Thomson, R. (2004). Second language fluency: Judgments
on different tasks. Language Learning, 54, 655–679.
Dewaele, J.-M., & Furnham, A. (2000). Personality and speech production: A pilot study of second
language learners. Personality and Individual Differences, 28, 355–365.
Felker, E. R., Klockmann, H. E., & de Jong, N. H. (2019). How conceptualizing influences fluency in
first and second language speech production. Applied Psycholinguistics, 40, 111–136.
Fillmore, C. (1979). On fluency. In C. Fillmore, D. Kempler, & W. S.-Y. Wang (Eds.), Individual
differences in language ability and language behavior (pp. 85–101). New York: Academic Press.
Foster, P. (2020). Oral fluency in a second language: A research agenda for the next ten years. Language
Teaching. doi: 10.1017/S026144482000018X
Foster, P., Tonkyn, A., & Wigglesworth, G. (2000). Measuring spoken language: A unit for all reasons.
Applied Linguistics, 21, 354–375.
Freed, B. F. (2000). Is fluency, like beauty, in the eyes (and ears) of the beholder? In H. Riggenbach
(Ed.), Perspectives on fluency (pp. 243–265). Ann Arbor, MI: University of Michigan Press.
Galante, A., & Thomson, R. I. (2017). The effectiveness of drama as an instructional approach for the
development of second language oral fluency, comprehensibility, and accentedness. TESOL
Quarterly, 51, 115–142.
Gatbonton, E., & Segalowitz, N. (2005). Rethinking communicative language teaching: A focus on
access to fluency. Canadian Modern Language Review, 61, 325–353.
Ginther, A., Dimova, S., & Yang, R. (2010). Conceptual and empirical relationships between temporal
measures of fluency and oral English proficiency with implications for automated scoring. Language
Testing, 27, 379–399.
Goldman-Eisler, F. (1951). The measurement of time sequences in conversational behaviour. British
Journal of Psychology, 42, 355–362.
Goldman-Eisler, F. (1968). Psycholinguistics: Experiments in spontaneous speech. London: Academic
Press.
Hosman, L. (2015). Powerful and powerless speech styles and their relationship to perceived dominance
and control. In R. Schulze, & H. Pishwa (Eds.), The exercise of power in communication
(pp. 221–232). London: Palgrave Macmillan.
Housen, A., Kuiken, F., & Vedder, I. (Eds.). (2012). Dimensions of L2 performance and proficiency:
Complexity, accuracy and fluency in SLA. Amsterdam: John Benjamins.
Huensch, A., & Tracy-Ventura, N. (2017a). Understanding L2 fluency behavior: The effects of individual
differences in L1 fluency, cross-linguistic differences, and proficiency over time. Applied
Psycholinguistics, 38, 755–785.
Huensch, A., & Tracy-Ventura, N. (2017b). L2 utterance fluency development before, during, and after
residence abroad: A multidimensional investigation. The Modern Language Journal, 101, 275–293.
Hunt, K. W. (1965). Grammatical structures written at three grade levels. National Council of Teachers
of English Research No. 3. Urbana Champaign, IL: National Council of Teachers of English.
Iwashita, N., Brown, A., McNamara, T., & O’Hagan, S. (2008). Assessed levels of second language
speaking proficiency: How distinct? Applied Linguistics, 29, 24–49.
Kahng, J. (2014). Exploring utterance and cognitive fluency of L1 and L2 English speakers: Temporal
measures and stimulated recall. Language Learning, 64, 809–854.
Kahng, J. (2018). The effect of pause location on perceived fluency. Applied Psycholinguistics, 39,
569–591.
Kahng, J. (2020). Explaining second language utterance fluency: Contribution of cognitive fluency and
first language utterance fluency. Applied Psycholinguistics, 41, 457–480.
Kang, O., Rubin, D., & Pickering, L. (2010). Suprasegmental measures of accentedness and judgments
of language learner proficiency in oral English. Modern Language Journal, 94, 554–566.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Erlbaum.
Kormos, J., & Dénes, M. (2004). Exploring measures and perceptions of fluency in the speech of second
language learners. System, 32, 145–164.
Kormos, J., & Préfontaine, Y. (2017). Affective factors influencing fluent performance: French learners’
appraisals of second language speech tasks. Language Teaching Research, 21, 699–716.
Lennon, P. (1990). Investigating fluency in EFL: A quantitative approach. Language Learning, 40,
387–417.
Lennon, P. (2000). The lexical element in spoken second language fluency. In H. Riggenbach (Ed.),
Perspectives on fluency (pp. 25–42). Ann Arbor, MI: University of Michigan Press.
Levelt, W. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Lintunen, P., Mutta, M., & Peltonen, P. (Eds.). (2020). Fluency in L2 learning and use. Bristol:
Multilingual Matters.
MacIntyre, P. D. (2007). Willingness to communicate in a second language: Understanding the decision
to speak as a volitional process. The Modern Language Journal, 91, 564–576.
Maclay, H., & Osgood, C. E. (1959). Hesitation phenomena in spontaneous English speech. Word,
15, 19–44.
McCarthy, M. (2010). Spoken fluency revisited. English Profile Journal, 1, E4.
Mora, J. C., & Valls-Ferrer, M. (2012). Oral fluency, accuracy, and complexity in formal instruction
and study abroad learning contexts. TESOL Quarterly, 46, 610–641.
Munro, M. J., & Derwing, T. M. (1998). The effects of speaking rate on listener evaluations of native
and foreign-accented speech. Language Learning, 48, 159–182.
Nation, I. S. P., & Newton, J. (2009). Teaching ESL/EFL listening and speaking. New York, NY:
Routledge.
O’Brien, I., Segalowitz, N., Freed, B., & Collentine, J. (2007). Phonological memory predicts second
language oral fluency gains in adults. Studies in Second Language Acquisition, 29, 557–582.
Révész, A., Ekiert, M., & Torgersen, E. N. (2016). The effects of complexity, accuracy, and fluency on
communicative adequacy in oral task performance. Applied Linguistics, 37, 828–848.
Riazantseva, A. (2001). Second language proficiency and pausing. Studies in Second Language
Acquisition, 23, 497–526.
Riggenbach, H. (Ed.). (2000). Perspectives on fluency. Ann Arbor, MI: University of Michigan Press.
Rossiter, M. J., Derwing, T. M., Manimtim, L. G., & Thomson, R. I. (2010). Oral fluency: The neglected
component in the communicative language classroom. The Canadian Modern Language
Review, 66, 583–606.
Saito, K., Ilkan, M., Magne, V., Tran, M. N., & Suzuki, S. (2018). Acoustic characteristics and learner
profiles of low-, mid- and high-level second language fluency. Applied Psycholinguistics, 39, 593–617.
Sato, M. (2014). Exploring the construct of interactional oral fluency: Second language acquisition and
language testing approaches. System, 45, 79–91.
Schmidt, R. (1992). Psychological mechanisms underlying second language fluency. Studies in Second
Language Acquisition, 14, 357–385.
Segalowitz, N. (2010). Cognitive bases of second language fluency. New York, NY: Routledge.
Segalowitz, N. (2016). Second language fluency and its underlying cognitive and social determinants.
International Review of Applied Linguistics in Language Teaching, 54, 79–95.
Segalowitz, N., & Freed, B. F. (2004). Context, contact, and cognition in oral fluency acquisition:
Learning Spanish in at home and study abroad contexts. Studies in Second Language Acquisition, 26,
173–200.
Segalowitz, N., French, L., & Guay, J. (2017). What features best characterize adult second language
utterance fluency and what do they reveal about fluency gains in short-term immersion? Canadian
Journal of Applied Linguistics, 20, 90–116.
Skehan, P., & Foster, P. (2007). Complexity, accuracy, fluency and lexis in task-based performance:
A meta-analysis of the Ealing Research. In S. Van Daele, A. Housen, F. Kuiken, M. Pierrard, &
I. Vedder (Eds.), Complexity, accuracy, and fluency in second language use, learning, and teaching
(pp. 207–226). Brussels, Belgium: University of Brussels Press.
Tavakoli, P. (2011). Pausing patterns: Differences between L2 learners and native speakers. ELT
Journal, 65, 71–79.
Tavakoli, P., & Skehan, P. (2005). Strategic planning, task structure, and performance testing. In
R. Ellis (Ed.), Planning and task performance in a second language (pp. 239–276). Amsterdam: John
Benjamins.
Tavakoli, P., & Wright, C. (2020). Second language speech fluency: From research to practice.
Cambridge: Cambridge University Press.
Thai, C., & Boers, F. (2016). Repeating a monologue under increasing time pressure: Effects on fluency,
complexity, and accuracy. TESOL Quarterly, 50, 369–393.
Thomson, R. I. (2015). Fluency. In M. Reed, & J. M. Levis (Eds.), The handbook of English pronunciation
(pp. 209–226). Malden, MA: John Wiley & Sons.
Towell, R., Hawkins, R., & Bazergui, N. (1996). The development of fluency in advanced learners of
French. Applied Linguistics, 17, 84–119.
Van Os, M., De Jong, N. H., & Bosker, H. R. (2020). Fluency in dialogue: The effect of turn-taking
behavior on perceived fluency in native and non-native speech. Language Learning. Advanced online
publication. doi: 10.1111/lang.12416
Wennerstrom, A. (2001). The music of everyday speech: Prosody and discourse analysis. New York:
Oxford University Press.
Wray, A. (2002). Formulaic language and the lexicon. Cambridge: Cambridge University Press.
Wright, C., & Tavakoli, P. (2016). New directions and developments in defining, analyzing and measuring
L2 speech fluency. International Review of Applied Linguistics in Language Teaching,
54, 73–77.
14
THE ROLE OF PROSODY ACROSS LANGUAGES
Yanjiao Zhu and Peggy Mok
1 Introduction/Definitions
Prosody is conceptualized as covering a spectrum of speech phenomena including pitch,
intonation, tone, loudness, rhythm and stress. Pitch is the height of a tone perceived by the
listener, which reflects the number of times the vocal folds vibrate per second (the fundamental
frequency, F0). Pitch functions at both the lexical and post-lexical (i.e., phrasal,
sentential, and discourse) levels, and is signified by tone and intonation. Tone refers to the
pitch of a syllable used for lexical contrast. In tone languages, the contour and the level of
the pitch decide the meaning of a syllable. Intonation, also known as tunes, contours or
melodies, denotes the movement of pitch in an utterance and is used to perform a wide range
of post-lexical functions. For instance, it can indicate whether an utterance is a question or a
statement, divide chunks of speech, express emotions like enthusiasm, irony, incredulity, and
show relationships between utterances in conversational turns. Loudness refers to the
strength of the sound perceived by the listeners, which is associated with the acoustic intensity
(relative amplitude) of the sound. Rhythm is understood as the perceived regular
occurrence of units, with “stress-timed” referring to perceived regularity of stress beats and
“syllable-timed” to syllables. Hindi, Finnish, and many Romance languages are regarded as
syllable-timed languages, while English, Russian, Arabic, and all Germanic languages are
classified as stress-timed. Although this classification appears to capture a perceivable
rhythmic difference, the notion of truly equal inter-stress or inter-syllable intervals (isochrony)
has been shown to be false (Dauer, 1983). Rhythm is not a single prosodic property
but is intertwined with syllable structure, syllable duration, vowel quality, speech rate, fluency,
pausing, and connected speech processes (Dauer, 1983). Stress can be defined as the
perceived prominence of a syllable in relation to other syllables. A stressed syllable is usually
longer in duration, higher in pitch, and louder in amplitude, but languages vary in the relative
importance of these cues (Cutler, 2007). Stress placement also shows cross-linguistic
variation. In free-stress languages like English and Spanish, stress is lexically specified,
whereas in fixed-stress languages like Polish and Czech, it is realized on one fixed position in
words or phrases.
In L2 acquisition, prosody is worth paying attention to because it can affect the accentedness
and intelligibility of L2 speech (De Mareüil & Vieru-Dimulescu, 2006). Accentedness
denotes the degree of foreign accent in an L2 utterance perceived by a listener, while
intelligibility refers to how much of the message is actually understood by the listener. A
related concept is comprehensibility, denoting listeners’ subjective estimation of the difficulty
they experience in understanding an utterance (Munro & Derwing, 1999). In the literature,
off-target production and perception of L2 prosody are evidenced in speech rhythm (Grenon
& White, 2008), lexical tone (Wu et al., 2014), pitch contour (Graham & Post, 2018), stress
(McGory, 1997), and a variety of prosodic features are known to influence the intelligibility
of L2 speech (Anderson-Hsieh et al., 1992; Munro & Derwing, 1998; Trofimovich & Baker,
2006). Therefore, to facilitate communication in real life, measures should be taken to improve
L2 prosody.
2 Historical Perspectives
In the past, research often focused on the differences between native speakers and L2 learners
in the production of intonation, stress, rhythm, and so forth. Through auditory and acoustic
analyses of tonal events, L2 learners were commonly found to produce non-nativelike
prosodic features. For example, the native judges in Backman (1979) found that Venezuelan
Spanish-speaking learners of English had a narrower pitch range and a higher pitch level on
unstressed syllables than native American English speakers. Willems (1982) demonstrated that
Dutch learners of L2 English had a narrower pitch range and lower F0 at the beginning of an
utterance than native English speakers. Adams and Munro (1978) found that L2 English
speakers of various L1 backgrounds differed from native speakers in the placement and
frequency of stress.
Early studies further examined the relationship between non-native prosody and foreign
accent, showing that non-nativelike temporal features in L2 speech resulted in the perception
of foreign accent by native listeners. Hutchinson (1973) found that L1 Spanish learners of L2
English who made a smaller durational contrast between stressed and unstressed syllables in
English were rated lower in pronunciation by native English listeners. Jonasson and
McAllister (1972) demonstrated that manipulation of vowel and consonant duration can
influence native listeners’ perception of foreign accent in American-accented Swedish, and
Flege (1993) found that larger durational contrasts between long and short vowel minimal
pairs led to weaker foreign accent in Mandarin-accented English.
However, with the progression of research in second language acquisition, most
researchers now believe that it is more important for L2 speakers to be understood and to be
able to communicate (Moyer, 2004), which can be achieved even when the L2 speakers retain
some foreign accent (Munro & Derwing, 1999). Therefore, contemporary studies also
examine the effect of prosody on the intelligibility and comprehensibility of L2 speech. In
addition, recent attempts have been made to explore ways of improving L2 prosody with
these concepts in mind.
L2 Intonation
Intonation contributes substantially to successful communication as it affects the listeners’
understanding of speakers’ intentions. Gumperz (1982) illustrated this with an example in
which British airline employees felt that they were treated badly because cafeteria servers of
Indian and Pakistani descent spoke with a falling intonation. Gumperz’s explanation was that
politeness is expressed with rising intonation in English, but by falling intonation in the
servers’ L1s. Direct evidence also illustrates the effect of intonation on speakers’
comprehensibility and intelligibility. Wennerstrom (1998) showed that teaching assistants who more
often used intonation to signal topic shifts scored higher on an oral intelligibility
examination. Similarly, the evaluation of native and L2 teaching assistants’ presentations in Pickering
(2001) revealed that L2 speakers’ use of rising intonation affected the social connection they
made with students and hence the comprehensibility of their speech.
However, intonation poses considerable challenges for L2 speech acquisition. It is not easy for
learners to realize native-like pitch features in their speech production. Mennen et al. (2014)
showed that German learners of L2 English expanded pitch range in earlier portions and
compressed pitch range in later portions of intonational phrases, and Mennen (2004) found
that advanced Dutch speakers of L2 Greek had much earlier peak alignment (the timing of
the F0 peaks) than native Greek speakers due to L1 transfer. Moreover, L2 speakers have
difficulty learning to use intonation to achieve attitudinal, discoursal, grammatical, and
focusing functions. Pytlyk (2008) suggested that L2 Mandarin speakers with L1 English used
a falling pitch contour for questions with the bu particle and a rising pitch contour for
questions with the ma particle preceded by Tone 1, which was opposite to native Mandarin
patterns.
L2 Lexical Tone
As introduced earlier, in tone languages, lexical tone can differentiate one word from
another. To be understood by others, learners of tone languages need to produce and process
tones correctly. However, evidence indicates that speakers of non-tone languages face
difficulties when they perceive tones with contextual variation. Lee et al. (2010) studied the
identification of acoustically modified Mandarin tones by L2 learners of various L1
backgrounds and proficiencies. They observed that L2 listeners were faster in recognizing tones
containing F0 continuity information of the preceding context than tones whose F0
continuity information was cut out, while native Mandarin listeners did not show such an F0
continuity effect. This observation suggested that L2 listeners relied on canonical F0 contour
and could not compensate for contextual variability.
Due to the complexities of acquisition, research in L2 tone perception is very vibrant with
a growing number of studies using more complex designs, frameworks and technologies, as
discussed later.
L2 Stress
Stress is critical for spoken word recognition (Cutler et al., 1997). In some languages, stress
placement can contrast lexical meaning, as in the English word contract (a noun meaning a
written or spoken agreement, stressed on the first syllable) and contract (a verb meaning to
decrease in size or number, stressed on the second syllable). In
addition, the degree of stress plays a role in lexical retrieval. For example, the syllable oc-
receives primary stress in octopus, secondary stress in October, and no stress in occur. To
identify the three words, listeners need to be able to tell the difference between degrees of
stress in an activated cohort of words starting with oc- (Levis, 2018). As native listeners use
stress information to access words, mis-stressing in L2 can be problematic. Field (2005)
showed that L2 speech was rated as more intelligible when the primary stress was placed
correctly, and Zielinski (2008) suggested that syllable stress pattern was a more reliable cue
than segmental information for intelligibility.
L2 learners often have problems with perceiving L2 stress and using stress information to
access the L2 mental lexicon. Dupoux et al. (2008) reported stress “deafness” in L2 Spanish
learners whose L1 is French, a language that makes no lexically contrastive use of duration,
pitch, or stress. These learners had problems encoding stress in short-term memory
and using stress to access the lexicon. Context-sensitive “stress-deafness” was reported in
Ortega-Llebaria et al. (2013), who found that L1 English speakers had difficulty perceiving
L2 Spanish stress because English uses context-sensitive phonetic details differently from
Spanish in stress-marking. Such “stress-deafness” can hamper communication when L2
listeners fail to use stress to recognize words uttered by interlocutors.
In addition to perception, learners can have difficulty in stress production. Fokes and
Bond (1989) found that in L2 speech, stressed vowels were too short and unstressed vowels
were too long compared to native norms. McGory (1997) showed that L1 Korean learners of
English had difficulty acquiring reduction in unstressed syllables since Korean is a nonstress
language. In addition, Zhang et al. (2008) revealed that Mandarin learners of English were
unable to produce F0 and vowel reduction in a native-like manner and that L2 stress
production was judged by native listeners as less acceptable than native production.
L2 Speech Rhythm
Rhythm plays an important role in speech communication because it helps listeners to segment
the continuous speech flow into identifiable words. According to Cutler (2012), alternations
between strong and weak syllables are used by listeners to locate the beginnings of words. In
stress-timed languages like English, a strong syllable is likely to be the beginning of a word.
English listeners apply this feature to segment speech, whereas French listeners, whose L1 is
syllable-timed, do not. Therefore, L2 learners of a language with a different rhythmic pattern
need to learn the appropriate rhythm to help understand sentences spoken in the target language.
Numerous studies have demonstrated that rhythm can influence L2 speakers’ accentedness
and intelligibility. Huang and Jun (2011) found that the perception of Mandarin-accented
English was strongly influenced by speech rate and articulation rate. Trofimovich and Baker
(2006) suggested that the degree of accent in L2 English speech produced by Korean learners
was most affected by pause duration and speech rate. Munro and Derwing (1998) found that
native listeners preferred a higher speech rate to a lower one in foreign accent ratings.
Even when speech rate was controlled, Quené and Delft (2010) found that non-native
durational patterns alone can influence listeners’ perception of foreign accent. As for
intelligibility, Tajima et al. (1997) found that the intelligibility of L2 English speech improved
by 15%–25% when phonemic durations were manipulated to match native durational
patterns, whereas the intelligibility of L1 English dropped by 15% when its phonemic durations
were warped to the non-native temporal patterns.
L2 speakers learning a language with a different rhythmic structure usually struggle with
acquiring speech rhythm. Mok and Dellwo (2008) found that English spoken by L1
Mandarin and Cantonese speakers sounded more syllable-timed than native English, as
reflected in rhythm metrics. Setter (2003) measured syllable duration of English spoken by
Cantonese speakers and found that L2 English had insufficient vowel reduction, making it
sound more syllable-timed. Gut (2003) found that various types of L2 German had less
vowel reduction or deletion than native German, again because of more syllable-timed
speech. Nevertheless, learners were able to learn speech rhythm through more L2 exposure.
Tortel and Hirst (2010), for example, demonstrated an increase in durational variability and
thus an approximation of stress-timing in French learners’ L2 English production during the
course of L2 acquisition.
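To make the notion of durational variability more concrete, the following minimal Python sketch computes two rhythm metrics of the kind used in this line of research: the normalised Pairwise Variability Index (nPVI; Grabe & Low, 2002) and a variation coefficient of interval durations (cf. Dellwo, 2006). The function names and the duration values are hypothetical and serve only as an illustration; they are not data from any of the studies cited, and the use of the population standard deviation is an assumption of this sketch.

```python
from statistics import mean, pstdev


def npvi(durations):
    """Normalised Pairwise Variability Index of successive interval durations.

    Each pair of neighbouring intervals contributes the absolute difference
    of the two durations divided by their mean; the average is scaled by 100.
    """
    if len(durations) < 2:
        raise ValueError("nPVI needs at least two intervals")
    pairs = zip(durations[:-1], durations[1:])
    terms = [abs(a - b) / ((a + b) / 2) for a, b in pairs]
    return 100 * mean(terms)


def varco(durations):
    """Variation coefficient: 100 * standard deviation / mean duration.

    Assumption: the population standard deviation is used here.
    """
    return 100 * pstdev(durations) / mean(durations)


# Hypothetical vocalic interval durations in milliseconds (illustration only).
alternating = [60, 140, 55, 150, 65, 145]  # strong long/short alternation
even = [100, 105, 95, 100, 102, 98]        # near-equal durations

for label, data in (("alternating", alternating), ("even", even)):
    print(f"{label}: nPVI = {npvi(data):.1f}, Varco = {varco(data):.1f}")
```

In this sketch, higher values on both measures indicate greater durational variability, which is the direction usually associated with stress-timing; lower values correspond to the more even durations described above as sounding syllable-timed.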
L2 Prosody Production
Unlike earlier studies, which usually described global prosodic features in L2
production, recent studies have enquired into the L2 production of specific prosodic features in
the realization of linguistic and pragmatic functions. Intonational marking of polarity
contrasts was studied in Turco et al. (2015), focusing on L2 Italian spoken by highly
proficient German and Dutch speakers. A strong L1 effect was observed, as the speakers
encoded the polarity contrast either by producing a nuclear accent on the finite verb as in
German or using lexical markers as in Dutch. Tremblay et al. (2018) examined the impact of
intonation on segmentation in the perception of L2 French by Dutch and English listeners.
An F0 rise signals word final boundaries in French, but signals word initial boundaries in
English and Dutch, with a stronger weight in Dutch than in English. Results suggested that
Dutch-speaking learners were at an advantage over English-speaking learners when learning
to use an F0 rise to cue word boundaries in French, showing the influence of L1. Post-focus
compression (PFC) realization for focus marking was systematically examined in Hong
Kong English (HKE), a typical variety of Cantonese-accented English spoken in Hong
Kong. PFC refers to the phenomenon of F0 narrowing and lowering after focus, which is
typically found in English but not in Cantonese. Fung and Mok (2014) studied the prosodic
realization of narrow focus in L2 English by Cantonese speakers. Production results
suggested that narrow focus was realized by the L2 speakers as on-focus F0 range expansion but
not PFC, which could be partially explained by L1 Cantonese influence. The lack of PFC in
L2 English was likewise observed in Gananathan et al. (2015), who investigated the
production and perception of narrow focus by HKE speakers. The study suggested that HKE
speakers were nativelike in the perception of narrow focus, but far from nativelike in
production. The less proficient speakers did not mark narrow focus at all, and the more
proficient speakers did not realize PFC or rarely exhibited on-focus F0 changes.
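Since PFC is defined above as F0 narrowing and lowering after focus, one simple way to quantify it is to compare the F0 range and mean F0 of the post-focus stretch with the same stretch in a neutral-focus rendition. The Python sketch below illustrates this under stated assumptions: the function names, the convention that a value of 0 marks an unvoiced frame, and all F0 values are hypothetical, and the procedure is not the one used in the studies cited.

```python
import math
from statistics import mean


def voiced(f0_hz):
    """Keep voiced frames only (assumption: 0 marks an unvoiced frame)."""
    return [f for f in f0_hz if f > 0]


def f0_range_semitones(f0_hz):
    """F0 span of a stretch of speech (maximum vs. minimum) in semitones."""
    frames = voiced(f0_hz)
    if len(frames) < 2:
        raise ValueError("need at least two voiced frames")
    return 12 * math.log2(max(frames) / min(frames))


def mean_f0_shift_semitones(region_hz, baseline_hz):
    """Mean F0 of a region relative to a baseline, in semitones.

    Negative values mean the region is lower than the baseline.
    """
    return 12 * math.log2(mean(voiced(region_hz)) / mean(voiced(baseline_hz)))


# Hypothetical F0 tracks (Hz) for the same words after the focus position.
neutral_post = [210, 190, 0, 230, 180, 200]  # neutral-focus rendition
focus_post = [185, 180, 0, 190, 175, 182]    # rendition with earlier narrow focus

range_change = f0_range_semitones(focus_post) - f0_range_semitones(neutral_post)
level_change = mean_f0_shift_semitones(focus_post, neutral_post)
print(f"range change: {range_change:.2f} st, level change: {level_change:.2f} st")
```

In this sketch, negative values on both measures (a narrower range and a lower mean F0 than in the neutral rendition) would be consistent with PFC, whereas values near zero would be consistent with its absence.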
L2 Prosody Perception
Given the deviations in L2 intonation production, researchers have also examined L2
learners’ knowledge of the relevant intonation patterns in the target language. For instance, He
et al. (2012) investigated Chinese learners’ knowledge of L2 Dutch intonation contours. In a
forced-choice task, L2 learners saw sentences on a screen, listened to the same sentences
spoken with correct or incorrect intonation contours, and selected the most appropriate
spoken version. Low proficiency learners tended to choose rising contours for sentences
ending with a question mark and falling contours in other situations. Using a similar design,
the mapping between intonation and its functions was investigated in Mok et al. (2016),
showing that L1 Cantonese speakers of English were quite native-like for some
sentence types, but were less accurate in tag questions and wh-questions.
In addition to intonation, stress has received attention in current perception studies
delving into the interaction between L1 and L2 in stress presentation. As introduced before,
languages vary in their use of duration, pitch, and intensity cues for stress identification
(Cutler, 2007), and learners’ cue-weighting strategies for stress perception in L1 can be
transferred to L2. Qin et al. (2017) investigated the processing of word stress by L1 Standard
Mandarin and L1 Taiwan Mandarin learners of English. F0 signals lexical meaning in both
varieties, but only Standard Mandarin uses duration to distinguish stressed from unstressed
words. The study showed that when English stress was only signalled by duration, the
Taiwan Mandarin speakers performed worse than Standard Mandarin speakers, indicating
that cue properties in L1 determine the weight of these cues in L2. Lin et al. (2020) examined
the processing of L2 English word stress by L1 Korean and L1 Mandarin speakers. The
study revisited the concept of “stress-deafness” (Dupoux et al., 2008) and suggested different
degrees of learning difficulty varying with L1. Mandarin and English have contrastive word
stress while Korean does not, and it was found that L1 Korean learners of L2 English
performed worse than their Mandarin-speaking peers in stress perception.
Training L2 Prosody
Although many studies demonstrate deviations in L2 prosody in comparison to native
speech, current work suggests that training can improve learners’ ability to produce and
perceive prosodic features so as to enhance comprehensibility.
One method known to be effective is high-variability phonetic training (HVPT), which
provides trainees with speech produced by multiple speakers and/or in multiple phonetic
contexts, along with corrective feedback. This training method helps listeners to form robust
phonetic categories while ignoring context- and talker-specific information. Studies on L2
prosody have shown that HVPT can facilitate the acquisition of tones. For instance,
Silpachai (2020) found that HVPT with multiple talkers could improve the perception of
Mandarin tones by English listeners. As for speech production, Wiener et al. (2020) trained
L1 English speakers to produce rising and falling tones in a Mandarin-like artificial language
and suggested that HVPT in combination with explicit instruction led to significant
improvement.
Metaphoric bodily actions have also been found to facilitate the acquisition
of L2 prosody. In Eng et al. (2013), English-speaking learners of L2 Mandarin showed
improvement in tonal identification after a training session in which they were provided with
hand gestures mimicking pitch contours. In Morett and Chang (2015), English-speaking
learners of Mandarin benefited from gestures representing Mandarin pitch contours in
identifying the meaning of minimal lexical tone pairs. In Burnham et al. (2006), rigid head
motion was shown to serve as a perceptual cue for L2 Cantonese tone perception. In addition
to tone perception, Zheng et al. (2018) showed that hand gestures and head nods can
modestly improve the production of L2 Mandarin tones.
Furthermore, quite a few studies have exemplified effective uses of modern computer-
assisted visualization technologies in L2 prosodic training. Hirata (2004) trained English
learners of L2 Japanese with Japanese words contrasting in pitch and duration using
computer-assisted methods that provided prosody graphs as visual feedback. Participants in
the 3.5-week training programme and a control group without training took pre- and post-
tests on production and perception of words including pitch and duration contrasts. Results
showed that the trained participants improved more than the control group in production
and perception of Japanese pitch and duration contrasts. Computer-assisted prosody
training with visual feedback was likewise facilitative in the learning of L2 French prosody
by English speakers (Hardison, 2004). Moreover, computer-aided instruction on various
aspects of prosody was shown in Lima (2015) to significantly improve learners’ intelligibility.
Other studies suggested that some visualization methods outperformed others in prosodic
training. Niebuhr et al. (2017) compared the usability of different prosody visualization
techniques for Danish learners of L2 German. Both auditory phonological analysis
of prosodic accuracy and learners’ ratings of the usability of visualization techniques suggested
that iconic techniques performed better than symbolic techniques. Godfroid et al. (2017)
designed a 3-week online course for L2 Mandarin tone perception using different multimodal
methods. The study suggested that colour-mediated methods resulted in 10%–20%
improvement in accuracy in tone perception, but colour needed to be presented with a concrete
object to optimize perceptual learning.
stress also significantly improved ESL learners’ comprehensibility and fluency (Levis &
Levis, 2018).
Therefore, teachers should be encouraged to draw learners’ attention to prosodic
phenomena. Hirano-Cook (2011) showed that raising learners’ awareness and improving their
self-monitoring skills could improve L2 learners’ production and perception of Japanese
pitch accents. In Saito and Wu (2014), a group who received form-focused instruction
demonstrated significant gains in the perception of L2 Mandarin tones. This led the authors to
suggest that a communicative focus on form could facilitate L2 prosody acquisition.
In addition, teachers are advised to give explicit instruction on prosody, the importance
of which has been underscored in past studies. Gordon and Darcy (2016) examined
the effects of explicit and non-explicit instruction on the development of comprehensible
speech by ESL learners. The participants received explicit and non-explicit pronunciation
instruction on prosody and vowels. The results showed that the comprehensibility of L2 speech
improved after explicit instruction, and that this improvement was most salient for prosody
training. In an investigation of L2 stress acquisition, Chen (2013) called for explicit training
on English lexical categories, which could facilitate the acquisition of English lexical stress
rules by Mandarin ESL learners.
Finally, teachers can use multiple innovative teaching methods involving actions,
motions, and computerized visualization to help students perceive and produce speech prosody
in class. For example, haptic pronunciation teaching techniques such as tapping and clapping
have been recommended in class to improve learning of L2 English rhythm grouping and stress
(Burri & Baker, 2016). Teaching speech rhythm with physical beat gestures was also shown to
improve Catalan learners’ production in their L2 English (Gluhareva & Prieto, 2017). Kaiser
(2013) offered practical teaching advice such as teaching only two levels of word stress, using
physicalization (e.g., rubber bands) to match strong actions with stresses, and adopting both
bottom–up prediction skills and top–down skills to facilitate the acquisition of L2 speech
rhythm. Frost and Picavet (2014) introduced a project incorporating a series of teacher
training workshops and blended learning platforms with a variety of tools to facilitate L2
prosody teaching. Modern speech visualization technology may also be used to enhance
learners’ practice of discourse level intonation (Levis & Pickering, 2004).
7 Future Directions
Several new directions can be pursued in future work on L2 prosody. Most previous
studies have been cross-sectional. Longitudinal investigations have been far fewer given the
logistical complexities involved, but they can provide much more useful insight into how L2
prosody develops over time, as it has been suggested that the improvement in L2
pronunciation is concentrated in the first year following arrival in the L2 environment (Flege,
1988; Munro & Derwing, 2008). So far, only a handful of longitudinal studies of L2 English
rhythm have been published (Quené & Orr, 2014; White & Mok, 2018, 2019). Further work
can assess whether L2 prosody can be improved beyond the initial gains of the first year.
Today, there are even more multilingual speakers than bilingual speakers, and the vast
multilingual population provides more varied language combinations that allow researchers
to study more interesting intonation phenomena in L3. Only a handful of L3 studies have
scratched the surface of the new area of L3 prosody. Gabriel et al. (2015) found that
multilingual Chinese–German speaking learners of L3 English and L3 French benefited from all
of their previous languages in producing L3 speech rhythm. Studies of Cantonese–English
bilingual learners of L3 German have also shown that various aspects of L3 German prosody
were influenced by both the L1 and L2 (Zhu et al., 2019; Zhu & Mok, 2016). Many
interesting topics, such as the influence of prosody on L3 speakers’ intelligibility, factors
affecting the acquisition of L3 prosody, and the prosodic training of L3 speech, still await
further investigation.
Further Reading
Trouvain, J., & Gut, U. (2007). Non-native prosody: Phonetic description and teaching practice. Berlin:
De Gruyter.
A description of how L2 learners differ from native speakers in various aspects of prosody.
Derwing, T. M., & Munro, M. J. (2015). Pronunciation fundamentals: Evidence-based perspectives for L2
teaching and research. Amsterdam/Philadelphia: John Benjamins.
A general overview of the current state of L2 pronunciation research.
Levis, J. (2018). Intelligibility, oral communication, and the teaching of pronunciation. Cambridge, UK:
Cambridge University Press.
A review of research on the impact of prosody on intelligibility.
References
Adams, C., & Munro, R. R. (1978). In search of the acoustic correlates of stress: Fundamental
frequency, amplitude, and duration in the connected utterance of some native and non-native speakers
of English. Phonetica, 35, 125–156.
Anderson‐Hsieh, J., Johnson, R., & Koehler, K. (1992). The relationship between native speaker
judgments of nonnative pronunciation and deviance in segmentals, prosody and syllable structure.
Language Learning, 42(4), 529–555.
Arslan, L. M., & Hansen, J. H. L. (1997). A study of temporal features and frequency characteristics in
American English foreign accent. The Journal of the Acoustical Society of America, 102, 28–40.
Backman, N. (1979). Intonation errors in second-language pronunciation of eight Spanish-speaking
adults learning English. Interlanguage Studies Bulletin, 4, 239–265.
Braun, B., & Galts, T. (2014). Lexical encoding of L2 tones: The role of L1 stress, pitch accent and
intonation. Second Language Research, 30, 323–350.
Burnham, D., Reynolds, J., Vatikiotis-Bateson, E., Yehia, H., Ciocca, V., Morris, R. H., Hill, H.,
Vignali, G., Bollwerk, S., Tam, H., & Jones, C. (2006). The perception and production of phones
and tones: The role of rigid and non-rigid face and head motion. In ISSP 2006 – proceedings of the
7th international seminar on speech production (pp. 185–192).
Burri, M., & Baker, A. A. (2016). Teaching rhythm and rhythm grouping: The butterfly technique.
English Australia Journal: The Australian Journal of English Language Teaching, 31, 72–77.
Capliez, M. (2016). Prosody- vs. segment-based teaching: Impact on the perceptual skills of French
learners of English. Language, Interaction and Acquisition, 7, 212–237.
Chen, H. C. (2013). Chinese learners’ acquisition of English word stress and factors affecting stress
assignment. Linguistics and Education, 24, 545–555.
Cutler, A. (2007). Lexical stress. In D. B. Pisoni & R. E. Remez (Eds.), The handbook of speech
perception. New York, NY: Blackwell.
Cutler, A. (2012). Native listening. Cambridge, MA: MIT Press.
Cutler, A., Dahan, D., & Van Donselaar, W. (1997). Prosody in the comprehension of spoken
language: A literature review. Language and Speech, 40, 141–201.
Dauer, R. M. (1983). Stress-timing and syllable-timing reanalyzed. Journal of Phonetics, 11, 51–62.
De Mareüil, P. B., & Vieru-Dimulescu, B. (2006). The contribution of prosody to the perception of
foreign accent. Phonetica, 63, 247–267.
Delais-Roussarie, E., Avanzi, M., & Herment, S. (Eds.). (2015). Prosody and language in contact: L2
acquisition, attrition and languages in multilingual situations. Berlin/Heidelberg: Springer-Verlag.
doi: 10.1007/978-3-662-45168-7
Dellwo, V. (2006). Rhythm and speech rate: A variation coefficient for deltaC. In Language and
language processing (pp. 231–241). Frankfurt: Peter Lang.
Dupoux, E., Sebastián-Gallés, N., Navarrete, E., & Peperkamp, S. (2008). Persistent stress “deafness”:
The case of French learners of Spanish. Cognition, 106, 682–706.
Eng, K., Hannah, B., Leung, L., & Wang, Y. (2013). Can co-speech hand gestures facilitate learning of
non-native tones? The Journal of the Acoustical Society of America, 133, 3572–3572.
Field, J. (2005). Intelligibility and the listener: The role of lexical stress. TESOL Quarterly, 39, 399–423.
Flege, J. (1988). Factors affecting degree of perceived foreign accent in English sentences. Journal of the
Acoustical Society of America, 84, 70–79.
Flege, J. E. (1993). Production and perception of a novel, second-language phonetic contrast. Journal of
the Acoustical Society of America, 93, 1589–1608.
Fokes, J., & Bond, Z. S. (1989). The vowels of stressed and unstressed syllables in nonnative English.
Language Learning, 39, 341–373.
Frost, D., & Picavet, F. (2014). Putting prosody first – Some practical solutions to a perennial problem:
The innovalangues project. Research in Language, 12, 233–244.
Fung, H. S. H., & Mok, P. P. K. (2014). Realization of narrow focus in Hong Kong English
declaratives: A pilot study. Proceedings of the International Conference on Speech Prosody, 7,
964–968.
Gabriel, C., Stahnke, J., & Thulke, J. (2015). Assessing foreign language speech rhythm in multilingual
learners: An interdisciplinary approach. In H. Peukert (Ed.), Transfer effects in multilingual language
development (pp. 191–220). Amsterdam: John Benjamins Publishing Company.
Gananathan, R. Y., Yin, Y., & Mok, P. K. P. (2015). Interlanguage influence in cues of narrow focus:
A study of Hong Kong English. In Proceedings of the 18th international congress of phonetic sciences
(ICPhS 2015).
Gluhareva, D., & Prieto, P. (2017). Training with rhythmic beat gestures benefits L2 pronunciation in
discourse-demanding situations. Language Teaching Research, 21, 609–631.
Godfroid, A., Lin, C., & Ryu, C. (2017). Hearing and seeing tone through color: An efficacy study of
web-based, multimodal Chinese tone perception training. Language Learning, 819–857.
doi: 10.1111/lang.12246
Gordon, J., & Darcy, I. (2016). The development of comprehensible speech in L2 learners: A classroom
study on the effects of short-term pronunciation instruction. Journal of Second Language
Pronunciation, 1, 56–92.
Grabe, E., & Low, L. (2002). Durational variability in speech and the rhythm class hypothesis. In C.
Gussenhoven & N. Warner (Eds.), Laboratory phonology (Vol. 7, pp. 515–546). New York: Mouton
de Gruyter.
Graham, C., & Post, B. (2018). Second language acquisition of intonation: Peak alignment in American
English. Journal of Phonetics, 66, 1–14.
Grenon, I., & White, L. (2008). Acquiring rhythm: A comparison of L1 and L2 speakers of Canadian
English and Japanese. In Proceedings of the 32nd Boston university conference on language
development (pp. 155–166).
Guion, S. G., Flege, J. E., Liu, S. H., & Yeni-Komshian, G. H. (2000). Age of learning effects on the
duration of sentences produced in a second language. Applied Psycholinguistics, 21, 205–228.
Gumperz, J. (1982). Discourse strategies. Cambridge: Cambridge University Press.
Gut, U. (2003). Non‐native speech rhythm in German. In Proceedings of the ICPhS conference
(pp. 2437–2440).
Hahn, L. D. (2004). Primary stress and intelligibility: Research to motivate the teaching of
suprasegmentals. TESOL Quarterly, 38, 201–223.
Hao, Y. C. (2012). Second language acquisition of Mandarin Chinese tones by tonal and non-tonal
language speakers. Journal of Phonetics, 40, 269–279.
Hardison, D. M. (2004). Generalization of computer-assisted prosody training: Quantitative and
qualitative findings. Language Learning and Technology, 8, 34–52.
He, X., Van Heuven, V. J., & Gussenhoven, C. (2012). The selection of intonation contours by Chinese
L2 speakers of Dutch: Orthographic closure vs. prosodic knowledge. Second Language Research
(Vol. 28). doi: 10.1177/0267658312439668
Hirano-Cook, E. (2011). Japanese pitch accent acquisition by learners of Japanese: Effects of training
on Japanese accent instruction, perception, and production. Japanese Language & Literature, 45,
363–364.
Hirata, Y. (2004). Computer assisted pronunciation training for native English speakers learning
Japanese pitch and durational contrasts. Computer Assisted Language Learning, 17, 357–376.
Holm, S. (2009). Intonational and durational contributions to the perception of foreign-accented
Norwegian. Retrieved from http://www.fonkonsult.com/PhD_thesis_Snefrid_Holm.pdf
Huang, B. H., & Jun, S. A. (2011). The effect of age on the acquisition of second language prosody.
Language and Speech, 54, 387–414.
Huang, T., & Erickson, D. (2019). Articulation of English “prominence” by L1 (English) and L2
(French) speakers. In ICPhS 2019, 1, 1–5.
Hutchinson, S. P. (1973). An objective index of the English-Spanish pronunciation dimension.
Unpublished Master’s thesis, University of Texas, Austin, TX.
Jonasson, J., & McAllister, R. (1972). Foreign accent and timing: An instrumental study. PILUS,
14, 11–40.
Kainada, E., & Lengeris, A. (2015). Native language influences on the production of second-language
prosody. Journal of the International Phonetic Association, 45, 269–287.
Kaiser, D. J. (2013). Practical approaches and strategies for teaching stress-timed English rhythm. In
The conference proceedings of MIDTESOL: Cultivating best practices in ESL: 2013–2014, 71–90.
Lee, C. Y., Tao, L., & Bond, Z. S. (2010). Identification of acoustically modified Mandarin tones by
non-native listeners. Language and Speech, 53, 217–243.
Levis, J. M. (2018). Intelligibility, oral communication, and the teaching of pronunciation. Cambridge,
UK: Cambridge University Press.
Levis, J. M., & Levis, G. M. (2018). Teaching high-value pronunciation features: Contrastive stress for
intermediate learners. CATESOL Journal, 30, 139–160.
Levis, J., & Pickering, L. (2004). Teaching intonation in discourse using speech visualization
technology. System, 32, 505–524.
Lima, E. D. (2015). Development and evaluation of online pronunciation instruction for international
teaching assistants’ comprehensibility. Unpublished PhD Dissertation, Iowa State University,
Ames, Iowa.
Lin, C. Y., Wang, M., Idsardi, W. J., & Xu, Y. (2020). Stress processing in Mandarin and Korean
second language learners of English. Bilingualism: Language and Cognition, 17, 316–346.
McGory, J. T. (1997). Acquisition of intonational prominence in English by Seoul Korean and Mandarin
Chinese speakers. Unpublished PhD Dissertation, Ohio State University, Columbus, Ohio.
Mennen, I. (2004). Bi-directional interference in the intonation of Dutch speakers of Greek. Journal of
Phonetics, 32, 543–563.
Mennen, I., Chen, A., & Karlsson, F. (2010). Characterising the internal structure of learner intonation
and its development over time. In Proceedings of the 6th international symposium on the acquisition of
second language speech, new sounds, 319–324.
Mennen, I., Schaeffler, F., & Dickie, C. (2014). Second language acquisition of pitch range in German
learners of English. Studies in Second Language Acquisition, 36, 303–329.
Mok, P. K. P., & Dellwo, V. (2008). Comparing native and non-native speech rhythm using acoustic
rhythmic measures: Cantonese, Beijing Mandarin and English. Speech Prosody 2008, 423–426,
Campinas, Brazil.
Mok, P. K. P., Yin, Y., Setter, J., & Nayan, N. M. (2016). Assessing knowledge of English intonation
patterns by L2 speakers. In Speech Prosody 2016, 543–547. Boston, MA.
Morett, L. M., & Chang, L. Y. (2015). Emphasising sound and meaning: Pitch gestures enhance
Mandarin lexical tone acquisition. Language, Cognition and Neuroscience, 30, 347–353.
Moyer, A. (2004). Age, accent and experience in second language acquisition. Clevedon: Multilingual
Matters.
Munro, M. J., & Derwing, T. M. (1998). The effects of speaking rate on listener evaluations of native
and foreign-accented speech. Language Learning, 42, 159–182.
Munro, M. J., & Derwing, T. M. (1999). Foreign accent, comprehensibility, and intelligibility in the
speech of second language learners. Language Learning, 49, 285–310.
Munro, M. J., & Derwing, T. M. (2008). Segmental acquisition in adult ESL learners: A longitudinal
study of vowel production. Language Learning, 58, 479–502.
Niebuhr, O., Alm, M., Schümchen, N., & Fischer, K. (2017). Comparing visualization techniques for
learning second language prosody. International Journal of Learner Corpus Research, 3, 250–277.
Ortega-Llebaria, M., Gu, H., & Fan, J. (2013). English speakers’ perception of Spanish lexical stress:
Context-driven L2 stress perception. Journal of Phonetics, 41, 186–197.
Pelzl, E., Lau, E. F., Guo, T., & DeKeyser, R. (2019). Advanced second language learners’ perception
of lexical tone contrasts. Studies in Second Language Acquisition, 41, 59–86.
Pickering, L. (2001). The role of tone choice in improving ITA communication in the classroom.
TESOL Quarterly, 35, 233–253.
Pytlyk, C. (2008). Interlanguage prosody: Native English speakers’ production of Mandarin yes-no
questions. In Proceedings of the 2008 annual conference of the Canadian linguistic association.
Qin, Z., Chien, Y. F., & Tremblay, A. (2017). Processing of word-level stress by Mandarin-speaking
second language learners of English. Applied Psycholinguistics, 38(3), 541–570.
Quené, H., & Delft, L. E. Van. (2010). Non-native durational patterns decrease speech intelligibility.
Speech Communication, 52, 911–918.
Quené, H., & Orr, R. (2014). Long-term convergence of speech rhythm in L1 and L2 English. In 7th
international conference on speech prosody 2014, 342–345.
Ramus, F., Nespor, M., & Mehler, J. (1999). Correlates of linguistic rhythm in the speech signal.
Cognition, 73, 265–292.
Santiago, F., & Delais‐Roussarie, E. (2015). The acquisition of question intonation by Mexican
Spanish learners of French. In Prosody and language in contact (pp. 243–270). Springer, Berlin,
Heidelberg.
Saito, K., & Wu, X. (2014). Communicative focus on form and second language suprasegmental
learning: Teaching Cantonese learners to perceive Mandarin tones. Studies in Second Language
Acquisition, 36, 647–680.
Setter, J. (2003). A comparison of speech rhythm in British and Hong Kong English. In Proceedings of
the 15th International Congress of Phonetic Sciences, (pp. 467–470).
Setter, J., & Deterding, D. (2003). Extra final consonants in the English of Hong Kong and Singapore.
In Proceedings of the 15th international congress of phonetic sciences, Barcelona, 12–14.
Silpachai, A. (2020). The role of talker variability in the perceptual learning of Mandarin tones by
American English listeners. Journal of Second Language Pronunciation, 6, 209–235.
Slowiaczek, L. M. (1990). Effects of lexical stress in auditory word recognition. Language and Speech,
33, 47–68.
Smith, C. L., Erickson, D., & Savariaux, C. (2019). Articulatory and acoustic correlates of prominence
in French: Comparing L1 and L2 speakers. Journal of Phonetics, 77, 1–29.
Tajima, K., Port, R., & Dalby, J. (1997). Effects of temporal correction on intelligibility of foreign-
accented English. Journal of Phonetics, 25, 1–24.
Tortel, A., & Hirst, D. (2010). Rhythm metrics and the production of English L1 / L2. In Proceedings of
speech prosody 2010-5th International Conference.
Tremblay, A., Broersma, M., & Coughlin, C. E. (2018). The functional weight of a prosodic cue in the native
language predicts the learning of speech segmentation in a second language. Bilingualism, 21, 640–652.
Trofimovich, P., & Baker, W. (2006). Learning second language suprasegmentals: Effect of L2
experience on prosody and fluency characteristics of L2 speech. Studies in Second Language
Acquisition, 28, 1–30.
Turco, G., Dimroth, C., & Braun, B. (2015). Prosodic and lexical marking of contrast in L2 Italian.
Second Language Research, 31, 465–491.
Wang, X. (2006). Perception of L2 tones: L1 lexical tone experience may not help. In Proceedings of
speech prosody, Dresden, Germany.
Wennerstrom, A. (1998). Intonation as cohesion in academic discourse: A study of Chinese speakers of
English. Studies in Second Language Acquisition, 20, 1–25.
White, D., & Mok, P. (2018). L2 speech rhythm development in new immigrants. In Proceedings of
speech prosody 2018 (pp. 838–842). Poznań.
White, D., & Mok, P. (2019). L2 speech rhythm and language experience in new immigrants. In
Proceedings of 19th international congress of phonetic sciences, Melbourne, Australia 2019
(pp. 334–338).
Wiener, S., Chan, M. K. M., & Ito, K. (2020). Do explicit instruction and high variability phonetic
training improve nonnative speakers’ Mandarin tone productions? The Modern Language Journal,
104, 152–168.
Willems, N. (1982). English intonation from a Dutch point of view. In Proceedings of the tenth
international congress of phonetic sciences, 1–6 August 1983, Utrecht, The Netherlands, 706–709.
Wu, X., Munro, M. J., & Wang, Y. (2014). Tone assimilation by Mandarin and Thai listeners with and
without L2 experience. Journal of Phonetics, 46, 86–100.
Zhang, Y., Nissen, S. L., & Francis, A. L. (2008). Acoustic characteristics of English lexical stress
produced by native Mandarin speakers. The Journal of the Acoustical Society of America, 123,
4498–4513.
Zheng, A., Hirata, Y., & Kelly, S. D. (2018). Exploring the effects of imitating hand gestures and head
nods on L1 and L2 Mandarin tone production. Journal of Speech, Language and Hearing Research,
61, 1–18.
Zhu, Y., Chen, A., Sudhoff, S., & Mok, P. (2019). Third language prosody: Evidence from Cantonese-
English-German trilinguals. In ICPhS 2019, 1–5.
Zhu, Y., & Mok, P. K. P. (2016). Intonational phrasing in a third language: The production of German
by Cantonese-English bilingual learners. In Proceedings of the 8th international conference on speech
prosody (SP2016) (pp. 751–755).
Zielinski, B. W. (2008). The listener: No longer the silent partner in reduced intelligibility. System,
36, 69–84.
15
GRAMMAR FOR SPEAKING
June Ruivivar and Laura Collins
1 Introduction/Definitions
Grammatical knowledge is defined as knowledge of the structures and patterns that govern a
language, and its acquisition as the ability to use this information in communication
(Nassaji, 2019). We consider both syntax and morphology as part of this definition,
inasmuch as they constitute linguistic patterns and are treated as grammar in pedagogical
practice. We also include lexicogrammar, the grammatical behaviour of lexical items
(Halliday & Matthiessen, 2013).
In second language acquisition (SLA), grammar learning has commonly been associated
with the learning of rules, most of which are derived from written language. However, as
spoken corpora have become more numerous and accessible, it is now well established that
the rules we commonly see in language teaching and pedagogical materials do not fully
capture the unique grammatical properties of speech. There are competing views on whether
these differences are attributable to a distinct grammar of speech, as espoused by Brazil
(1995), or whether speech and writing are governed by the same grammar but with structures
occurring with different distributions (Biber et al., 1999). However, most studies, and the
pedagogical recommendations resulting from them, support the latter view. This view is also
consistent with the functionalist approach, which proposes that language is primarily a tool
for communication. From this perspective, grammar is best examined in terms of how it is
used to express meaning and how contextual factors influence speakers’ language choices.
One observation that supports a functionalist approach to the grammar of speech – in
particular, that grammar fulfils different functions in writing and speaking – is that the
grammatical “rules” derived from writing are consistently flouted in speech, particularly in
conversation. Examples (1) to (3) below illustrate some deviations.
1. Incomplete sentences
• Team meeting today!
• What time?
• Two o’clock
2. Subject ellipsis
• [I] Had to get my bike fixed
• How come?
• [It] Was making this funny noise
3. There is + plural
• There’s too many assignments in this class
• Yeah? Last term there was five papers to write
• Oh, there’s only three this time
The incomplete sentences in (1) indicate that conversations are co-constructed across turns;
rather than formulating well-formed sentences, the speakers build on each other’s utterances
to complete a thought. In (2), shared knowledge between speakers eliminates the need to
specify the subject: it is clear whose bike needs fixing and what was making the noise.
Example (3) shows how a non-standard form that facilitates speaking can be
conventionalized: generalizing there’s to singular and plural subjects not only conveys the
message more quickly, but also bypasses the task of ensuring subject–verb agreement.
Spoken grammar features appear to arise from the social and cognitive demands of speaking.
Learners wishing to communicate effectively in their second language (L2) must therefore
learn to use grammar to meet these demands.
2 Historical Perspectives
The notion that speech is grammatically distinct from writing is not recent. For example, during
the Grammarians’ War of 1519–1521, prescriptive grammarians took issue with spoken
passages in the Latin grammar book Vulgaria, which reflected the colloquial language of the vulgus
or common people (Carlson, 1992). Colloquial varieties were seen as degrading the language,
characterized as “abusions of speech” (Jonson, 1640, cited in Carter & McCarthy, 2017) and
“adventitious sounds…rather than voices of art” (Harris, 1773, cited in Carter & McCarthy,
2017). The debate was also pedagogical, with some scholars recognizing the primacy of spoken
language over written (Emerson, 1896; Sweet, 1899). Sweet (1899), for example, noted that
because speech predates writing developmentally, native speakers can comfortably maintain
their speech patterns despite learning “formal” grammar later on; and so learners must also
develop an equally strong association with the spoken register (p. 211). Linn (2013) offers an
overview of similar divides in languages where written forms are historically associated with
scholarship, education, and careful style. In contemporary usage, examples include the French
passé simple (il parla, “he spoke”) and the imperfect subjunctive (qu’il parlât, “that he have
spoken”), which occur almost exclusively in literary works and are considered inappropriate, or
even comical, in conversation.
Today, attempts to standardize language continue to uphold the notion of speech as
inferior to writing, albeit to a lesser degree. Until 2018, for example, the American Heritage
Dictionary employed a “usage panel” of linguists, authors, journalists, and other language
experts who voted on the acceptability of certain constructions. Many were popular
grammatical devices that were not considered acceptable until they had been in use for many
years. An example is hopefully as a sentence modifier (Hopefully we’ll be on time), which was
rated acceptable by only 34% of the panel members in 1999 but was accepted by 63% in 2012
(American Heritage Dictionary, 2020).
Pedagogical approaches have also shifted over the years. The Reform Movement of
19th-century Britain and Europe called for greater attention to spoken language than was given by the
grammar-translation approaches that characterized foreign language teaching, but the
instructional texts still tended to follow principles of written language in terms of vocabulary
and grammar (see Thornbury & Slade, 2006, p. 248). The audio-lingual method, popular in
North America through the 1960s and early 1970s, assumed that grammar was best learned
through oral practice. It favoured mechanical drilling of grammatical constructions in
dialogues, without making the patterns explicit. The choice of patterns and their forms,
however, was based on the conventions of written grammar (e.g., the use of complete sentences),
resulting in unnatural dialogues. With the introduction of the notion of communicative
competence (Canale & Swain, 1980) and communicative language teaching (CLT) came a
greater focus on authentic interaction in classrooms. Central to CLT is the idea that effective
communication requires not only the ability to construct grammatical sentences, but also the
ability to deploy them in real-life situations. Fluency (Chapter 13) became an important goal
alongside accuracy (which often meant grammatical accuracy), and it is still common to see
pedagogical activities described as targeting one or the other. Contemporary methodologies
favour teaching grammar in context, with the teacher drawing attention to the grammar used
in a given situation and providing opportunities for controlled or free practice. However,
there remains limited attention in textbooks to how learners can use grammar to speak more
efficiently.
commonly considered formal, such as university lectures, are in fact closer to a conversa-
tional register (Biber et al., 2002). Therefore, a critical issue underlying any research on the
grammar of speech is the particular register, variety, or speech event under study within these
dimensions, as well as the larger social context in which it is used.
“a full range of structures,” with occasional “‘slips’ characteristic of native speaker speech”
considered acceptable (IELTS, 2020). The TOEFL iBT only specifies an “appropriate range
of grammatical structures” and tolerates minor errors that do not impede comprehension
(Educational Testing Service, 2019). These descriptors raise the question of what forms
might be considered “errors” or “slips” when they are in fact serving the purposes of speech.
There is evidence that L2 speakers are held to a higher standard of grammatical accuracy
than native speakers, especially when they use features of spoken grammar. For example,
Ruivivar and Collins (2018) found that topic fronting (This book, have you read it?), which
aids both speaker and listener by specifying the topic of a sentence, is judged as less
grammatical when produced by even moderately accented speakers.
Grammar is therefore used here for what Rühlemann (2007) calls relation management.
Examples 5a and 5b illustrate discourse management: the quantifier can communicate a
speaker’s stance or attitude, with some indicating that the speaker expects a positive
response, and any indicating an expected negative response (Larsen-Freeman, 2001). L2 users
appear to struggle with this type of variation (e.g., Mougeon et al., 2004), although
near-native speakers have been shown to closely approximate L1 speakers’ patterns
(Donaldson, 2016).
Grammatical choices can also mark contextual appropriacy. While 6a is, by textbook
standards, more grammatical than 6b, the latter may be more appropriate in a typical coffee
shop, where encounters are expected to be brief. L2 users might benefit from knowing that in
such situations, speaking in full sentences (as they may have been trained to do) is not only
unnecessary, but perhaps dispreferred. Larsen-Freeman (2001) argues that such grammatical
options represent choices that are available to speakers and that allow them to express politeness,
authority, and other social attributes.
What types of grammatical knowledge are most beneficial for L2 fluency? Researchers
have answered this question indirectly by identifying features that support fluent usage
among L1 users (e.g., Biber et al., 1999). This research suggests that speakers draw upon
mental representations of lexicogrammatical chunks, or constructions – ordered
combinations of words that function as one unit with a specific semantic or discoursal function (see
Peters, this volume). Examples of constructions include determiner + noun (a bicycle, your
brother) and subject pronoun + verb + object pronoun (I found it, she called them). The use of
constructions to produce stretches of language while limiting attention to grammatical form
has been noted by several scholars (e.g., Pawley & Syder, 1983) and is consistent with a
usage-based theory of language use (Tomasello, 2003). In L2 speech, it has been shown that
learners fulfil pragmatic functions using unanalyzed chunks of language, or “holophrases”
(Corder, 1973). More recently, Gilquin (2018) found that advanced learners primarily use the
same set of simple, two-word constructions as native speakers. While native speakers
exhibited longer and more complex constructions, some constructions were more frequent in
the learner corpus (e.g., subordinating conjunction + pronoun, because you), suggesting that
at least in advanced stages, learners can use language chunks to support speaking.
allowed them to identify the distinguishing grammatical and lexical properties of different
spoken and written registers. One important outcome is a probabilistic view of grammar,
which proposes that rather than following separate grammars, speech and writing (and their
various registers) use grammatical structures at different frequencies, such that the nature of
texts and speech events can be identified by their linguistic properties and vice versa.
Corpora have also been used to study the relationship between grammar and discourse. In
addition to frequency analyses, discourse studies are concerned with the social and
pragmatic meanings conveyed by grammatical choices. They consider contextual factors such
as the relationship between interlocutors and the intent of the interaction. For example,
Adolphs and Carter (2003) analyzed the frequency and uses of like in spoken English, finding
that it occurs largely in intimate and social contexts (as opposed to professional and
transactional) and serves a variety of discourse functions, such as reporting speech (We were
like, watch out!), analogizing (House work? Like what? Like dusting the shelves), and hedging
(I ran at like the speed of light). Drawing from the same corpus, Carter and McCarthy (1999)
reported that the get-passive was overwhelmingly favoured in the reporting of negative
events, with focus on the subject rather than the agent (e.g., We got stranded).
Research on the grammar of discourse can also focus on processing and information
structure; that is, how grammatical forms define the (non)prominence of different elements of a turn
or utterance. Speakers often decide on word order depending on what they wish to emphasize
(e.g., Your teacher called vs. I got a call from your teacher), which in most languages means
putting prominent elements at the end (Halliday, 1985). This can also be achieved through
non-standard word order, such as cleft constructions in French and English, as illustrated.
Discourse-based studies typically focus on grammatical features selected for their functions
in narrow contextual categories or specific varieties, and so cannot easily be generalized
across contexts (Dontcheva-Navratilova, 2012). Rather, this research identifies the linguistic
markers of different interactional situations, which can tell us where certain forms may be
more or less appropriate, and what information will be most useful for L2 learners in a
variety of contexts.
(Derwing et al., 2002). Ruivivar and Collins (2019) have used a holistic measure of
grammaticality, in which raters judge how correct a sentence sounds overall, following a training
session that discourages attention to specific grammatical features.
Perception and judgement studies are usually carried out in laboratories, with raters listening to
pre-written speech samples. The limitation of this approach is that the speech samples might lack
authenticity compared to natural language. However, it is challenging to obtain consistent
numbers and types of grammatical errors, or sufficient samples of specific grammatical features, from
naturalistic data, and so laboratory studies remain prevalent. Authenticity can be maximized by
instructing speakers to speak as naturally as possible, giving them time to practice, and recording
multiple versions of each item from which the most natural is selected for the study.
hedging or hesitation. At the same time, expressions such as wild guess or
educated guess, though not frequent overall, may be useful as they are conventionalized ways
of expressing certain notions (Bybee, 2008), and often only work in limited grammatical
contexts. Corpus searches can help learners evaluate the appropriacy or acceptability of
grammatical utterances they may create (e.g., wildly guessing or guessing educatedly).
7 Future Directions
Pedagogical Approaches
Given the continued pedagogical focus on communicative competence, one promising
direction for research is to examine the effect of grammar instruction on different aspects of L2
speech, and which aspects of L2 speech benefit the most from instruction. This can build on
research linking grammar to fluency (Hilton, 2009) and classroom studies showing that explicit
instruction can promote conceptual understanding and productive use of sociolinguistically
variable grammar (e.g., van Compernolle, 2013). Further studies might explore how fluency,
appropriacy, and other speech measures might be targeted through grammar instruction.
Teaching approaches could address specific grammatical features such as conversational
discourse markers (e.g., right, okay) and variable question forms (see examples 4a to 4c), or they
may take discourse or pragmatic concepts (such as distancing and politeness) as units of
instruction, with the relevant grammar identified to achieve the desired communicative intent.
Finally, research on the pedagogy of grammar for speaking should take a more learner-centric
approach. The literature has generally centred on two issues:
what features to teach and how, and whether native speech is an appropriate model for
pedagogy. Most pedagogical methods have focused on either features of informal discourse
(e.g., Jones & Carter, 2014) or non-standard structures such as left-dislocation (e.g., Timmis,
2005), using various forms of inductive teaching and speaking practice. The native-speaker
question has not been resolved, though the selection of features is generally informed by
native-speaker corpus data. Missing from this discussion is learners’ perspectives on their
own language use. There is some evidence that learner-centric issues of identity and
alignment with the target language community can influence learning and production; in
particular, some learners are reluctant to use features associated with native-speaker usage
(Ruivivar, 2020; Soruç & Griffiths, 2015). Future research might explore these perceptions
and how teaching can incorporate learners’ stance on learning conversational grammar and
what aspects of spoken language they find useful or want to learn.
Further Reading
Cook, V. (2016). Where is the native speaker now? TESOL Quarterly, 50(1), 186–189.
Hughes, R. (2010). What can a corpus tell us about grammar teaching materials? In A. O’Keeffe & M. McCarthy
(Eds.), The Routledge handbook of corpus linguistics (1st edn, pp. 401–412). London: Routledge.
Rühlemann, C. (2012). Conversational grammar. In C. Chapelle (Ed.), The encyclopedia of applied
linguistics. Oxford: Wiley Blackwell.
Thornbury, S., & Slade, D. (2006). Conversation: From description to pedagogy. Cambridge: Cambridge
University Press.
References
Adolphs, S., & Carter, R. (2003). And she’s like it’s terrible, like: Spoken discourse, grammar and
corpus analysis. International Journal of English Studies, 3(1), 45–56.
American Heritage Dictionary of the English Language. (2020). Hopefully [dictionary entry]. Retrieved
from https://ahdictionary.com/word/search.html?q=hopefully
Asano, Y., & Weber, A. (2016). Listener sensitivity to foreign-accented speech with grammatical errors.
In A. Papafragou, D. Groder, D. Mirman, & J. C. Trueswell (Eds.), Proceedings of the 38th annual
conference of the cognitive science society (pp. 1775–1780). Austin, TX: Cognitive Science Society.
Biber, D. (1986). Spoken and written textual dimensions in English: Resolving the contradictory
findings. Language, 62, 384–414.
Biber, D. (1988). Variation across speech and writing. Cambridge: Cambridge University Press.
Biber, D. (1995). Dimensions of register variation: A cross-linguistic comparison. Cambridge: Cambridge
University Press.
Biber, D., Conrad, S., Reppen, R., Byrd, P., & Helt, M. (2002). Speaking and writing in the university:
A multidimensional comparison. TESOL Quarterly, 36(1), 9–48.
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman grammar of spoken and
written English. London: Longman.
Brazil, D. (1995). A grammar of speech. Oxford: Oxford University Press.
Bybee, J. (2008). Usage-based grammar and second language acquisition. In P. Robinson and N. Ellis (Eds.),
Handbook of cognitive linguistics and second language acquisition (pp. 216–236). New York: Routledge.
Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language
teaching and testing. Applied Linguistics, 1, 1–47.
Carter, R., & McCarthy, M. (1995). Grammar and the spoken language. Applied Linguistics, 16(2), 141–158.
Carlson, D.R. (1992). The “Grammarians’ War” 1519–21: Humanist careerism in early Tudor England
and printing. Medievalia et Humanistica, 18, 157–181.
Carter, R., & McCarthy, M. (1999). The English get-passive in spoken discourse: Description and
implications for an interpersonal grammar. English Language and Linguistics, 3(1), 41–58.
Carter, R., & McCarthy, M. (2017). Spoken grammar: Where are we and where are we going? Applied
Linguistics, 38(1), 1–20.
Carter, R., Hughes, R., & McCarthy, M. (2000). Exploring grammar in context: Upper-intermediate and
advanced. Cambridge: Cambridge University Press.
Cook, V. (2002). Background to the L2 user. In V. Cook (Ed.), Portraits of the L2 user (pp. 1–28).
Clevedon: Multilingual Matters.
Corder, S. P. (1973). Introducing applied linguistics. New York: Penguin.
Cullen, R., & Kuo, I. (2007). Spoken grammar and ELT course materials: A missing link? TESOL
Quarterly, 41(2), 361–386.
Derwing, T. M., Rossiter, M. J., & Ehrensberger-Dow, M. (2002). “They speaked and wrote real
good”: Judgements of non-native and native grammar. Language Awareness, 11(2), 84–99.
Donaldson, B. (2016). Aspects of interrogative use in near-native French: Form, function, and register.
Linguistic Approaches to Bilingualism, 6(4), 467–503.
Dontcheva-Navratilova, O. (2012). The grammar of discourse. In C. Chapelle (Ed.), The encyclopedia
of applied linguistics. Oxford: Wiley Blackwell.
Educational Testing Service (2019). Performance descriptors for the TOEFL iBT® Test. Retrieved from
https://www.ets.org/s/toefl/pdf/pd-toefl-ibt.pdf
Emerson, O. F. (1896). The teaching of English grammar. The School Review, 5.
Etienne, C., & Sax, K. (2009). Stylistic variation in French: Bridging the gap between research and
textbooks. The Modern Language Journal, 93(4), 584–606.
Fafulas, S. (2015). Progressive constructions in native-speaker and adult-acquired Spanish. Studies in
Hispanic and Lusophone Linguistics, 8(1), 85–133.
Fernandez, C. (2011). Approaches to grammar instruction in teaching materials: A study in current L2
beginning-level Spanish textbooks. Hispania, 94(1), 155–170.
Folse, K. (2015). Creating corpus-based vocabulary lists for two verb tenses: A lexicogrammar
approach. In M. A. Christison, D. Christian, P. A. Duff, & N. Spada (Eds.), Teaching and learning
English grammar: Research findings and future directions (pp. 119–135). New York: Routledge.
Gilquin, G. (2018). Exploring the spoken learner English construction. In R. Alonso Alonso (Ed.),
Speaking in a second language (pp. 128–152). Amsterdam: John Benjamins.
Goldschneider, J. M., & DeKeyser, R. M. (2001). Explaining the “natural order of morpheme
acquisition” in English: A meta-analysis of multiple determinants. Language Learning, 51(1), 1–50.
Gudmestad, A. (2012). Acquiring a variable structure: An interlanguage analysis of second language
mood use in Spanish. Language Learning, 62(2), 373–402.
Halliday, M. A. K. (1985). An introduction to functional grammar. London, England: Edward Arnold.
Halliday, M. A. K., & Matthiessen, C. (2013). Halliday’s introduction to functional grammar (4th edn).
London: Routledge.
Hanulikova, A., Van Alphen, P. M., Van Goch, M. M., & Weber, A. (2012). When one person’s
mistake is another’s standard usage: The effect of foreign accent on syntactic processing. Journal of
Cognitive Neuroscience, 24(4), 878–887.
Hilton, H. (2009). The link between vocabulary knowledge and spoken L2 fluency. The Language
Learning Journal, 36(2), 153–166.
IELTS (2020). Speaking: Band descriptors. Retrieved from https://www.ielts.org/-/media/pdfs/speaking-
band-descriptors.ashx?la=en
Jenkins, J. (2015). Repositioning English and multilingualism in English as a Lingua Franca. Englishes
in Practice, 2(3), 49–85.
Jones, C., & Carter, R. (2014). Teaching spoken discourse markers explicitly: A comparison of III and
PPP. International Journal of English Studies, 13(1), 37–54.
Kuo, I. (2006). Addressing the issue of teaching English as a lingua franca. ELT Journal, 60(3), 213–221.
Labov, W. (1969). Contraction, deletion, and inherent variability of the English copula. Language, 45,
715–762.
Labov, W. (1972). The logic of nonstandard English. Philadelphia: University of Pennsylvania Press.
Larsen-Freeman, D. (2001). The grammar of choice. In E. Hinkel & S. Fotos (Eds.), New perspectives in
grammar teaching in second language classrooms (pp. 104–118). New York: Routledge.
Larsen-Freeman, D. (2012). On the roles of repetition in language teaching and learning. Applied
Linguistics Review, 3(2), 195–210.
June Ruivivar and Laura Collins
16
CONVERSATIONAL INTERACTION STUDIES
Jaemyung Goo
1 Introduction/Definitions
Interaction researchers claim that conversational interaction provides crucial opportunities
for learners to refine and restructure their inter-language by drawing their attention to lin-
guistic code features during negotiation for meaning (see Gass, 1997, 2003; Gass & Mackey,
2015; Goo, 2019; Long, 1985, 1996, 2007; Mackey, 2012; Mackey, Abbuhl, & Gass, 2012;
Pica, 1994, 1996 for reviews). Negotiation for meaning unfurls several interactional features
and activates cognitive processes optimally attuned to L2 development. Empirical research
has yielded sufficient evidence that interaction precipitates L2 learning. A number of inter-
actional features and processes, and learner-internal and -external factors believed to mediate
the extent to which L2 learners benefit from conversational interaction (e.g., modified output
opportunities, noticing, CF type, task type/complexity, working memory, and language
aptitude) have been investigated in various learning contexts and from diverse perspectives.
Overall, research results have furthered our understanding of the role of interaction in L2
learning. However, given that the field of SLA has expanded its theoretical and experimental
boundaries by adopting varied research methods and theoretical frameworks, the current
state of affairs, with many questions left unanswered or partially answered, bespeaks the
complex and multi-faceted nature of interaction-based learning involving a multitude of
potential mediating factors interacting with each other. I will first illustrate the Interaction
Hypothesis, along with relevant interactional features, then discuss variables that may
mediate, or are believed to mediate, the extent of interaction effects. Finally, I will provide a
brief summary and future directions.
Corrective feedback (CF): Oral or written responses to L2 learners’ erroneous output that
indicate its non-target-likeness in some way.
Modified output: Output modified by a learner following CF on their non-target-like original
utterance.
Uptake: A learner’s immediate response of any kind to CF, including modified output.
Noticing: Registering L2 input/CF with some level of awareness.
2 Historical Perspectives
Attention to the role of linguistic and conversational adjustments in L2 learning within the
interaction framework was triggered by Long’s (1981) call for experimental studies to test the
hypothesis that “participation in conversation with NS, made possible through the mod-
ification of interaction, is the necessary and sufficient condition for SLA” (p. 275). The early
version of Long’s Interaction Hypothesis drew on Krashen’s (1982) emphasis on compre-
hensible input. That is, if linguistic/interactional adjustments contribute to making input
more comprehensible, and if comprehensible input leads to L2 acquisition, as Krashen has
claimed, we can assume that linguistic/interactional adjustments result in L2 development.
Accordingly, early interaction research was conducted to investigate and describe various
linguistic and conversational adjustments, discourse patterns, and negotiation sequences in
the process of dealing with incomplete understandings.
The pivotal value of negotiated interaction in L2 learning is further illustrated in Long’s
(1996) revised Interaction Hypothesis, which includes cognitive aspects. He proposed that
That is, negotiation for meaning, which involves “provoking adjustments to linguistic form,
conversational structure, message content, or all three, until an acceptable level of under-
standing is achieved” (p. 418), creates invaluable breeding grounds for L2 acquisition by
providing nontrivial learning conditions and opportunities for attentional and psycho-
linguistic processes optimized for the development of various L2 aspects. As Long suggested,
negotiation for meaning involving interactional adjustments likely “connects input, internal
learner capacities, particularly selective attention, and output in productive ways” (p. 452),
functioning as a critical interaction-learning bridge.
Findings of early interaction research indicate that modified input through interactional
adjustments is more effective at precipitating L2 comprehension (and later production) than
premodified or unmodified input without interaction opportunities, and that the amount of
interaction may depend on task type and grouping (see Gass, 1997, 2003). However, early
research also suggested that comprehension via interaction does not necessarily lead to L2
acquisition, that is, “posing a linear relationship between comprehension of input and intake
of the structures contained therein may be untenable” (Loschky, 1994, p. 320). Accordingly,
interaction researchers began to investigate a direct relationship between interaction and L2
learning. For instance, Mackey (1999) found active participation in negotiated interaction
output may stimulate learners to move from the semantic, open-ended, non-
deterministic, strategic processing prevalent in comprehension to the complete gram-
matical processing needed for accurate production. Output, thus, would seem to have a
potentially significant role in the development of syntax and morphology (p. 128).
Accordingly, output is now viewed as an essential part of the L2 learning mechanism, not
just as an opportunity to practice, or an outcome of practicing what has been learned, as was
moves is that learner noticing of corrective intent of CF is represented in the form of uptake
or repair, which may not necessarily be the case (see Bao et al., 2011; Yoshida, 2010 for
disparities between uptake and noticing).
Noticing
Given that noticing through selective attention during negotiation for meaning (Long, 1996), and
noticing in general (Schmidt, 1990, 2001), plays a pivotal role in L2 learning, noticing of CF
moves is a critical aspect of how CF precipitates L2 development. Learner noticing of CF
determines the effectiveness of CF in refining and restructuring L2 learners’ interlanguage (IL); that
is, noticing functions as “a potential mediator in the feedback-learning relationship” (Mackey,
2006, p. 426). Research findings suggest that noticing of CF is influenced by factors including
target type, CF type, characteristics of recasts, type of uptake, teaching contexts, and learner
beliefs about CF. Mackey et al.’s (2000) results, for instance, revealed that whereas the learners
were relatively more accurate in their perceptions of lexical and phonological feedback, their
perceptions of morphosyntactic feedback provided mostly in recasts were generally inaccurate,
which indicates recasts on morphosyntactic errors may sometimes go unnoticed. In a similar
vein, the level of noticing depends on the type of target; some structures are more amenable to
noticing than are others (e.g., Kartchava & Ammar, 2014a; Mackey, 2006).
Other issues have been associated with noticing (e.g., learner beliefs, modified output, and CF
type). Rassaei (2013) found that learner noticing of explicit correction occurred more often than
noticing of recasts. In Kartchava and Ammar’s (2014a) study, prompts and mixed CF (prompts
+ recasts) led to more noticing of corrective intent, compared to recasts. In addition, Kartchava
and Ammar (2014b) showed that learner noticing of recasts was significantly correlated with
learner beliefs about the effectiveness of recasts and CF in general, but no such correlation was
found for the noticing of prompts. The corrective intent of recasts is more readily perceived when
recasts are short and contain only one or two changes (e.g., Egi, 2007; Philp, 2003). In addition, the
production of modified output appears to facilitate noticing (e.g., Egi, 2010). Although noticing
and modified output production are not entirely isomorphic (e.g., Gass, 2003; Goo & Mackey,
2013; Long, 2007; Mackey & Philp, 1998), learner noticing may be mirrored in modified output
production (e.g., Egi, 2010; Lyster et al., 2013). Nevertheless, noticing does not guarantee L2
development (e.g., Kartchava & Ammar, 2014a; Mackey, 2006).
L2 Learning Via CF
Numerous experimental and quasi-experimental studies have examined the CF-learning relationship
in interactional contexts and have provided empirical evidence for the beneficial role of CF in L2
development (see Nassaji, 2016 for a review). Regarding the relative efficacy of different CF
moves, research findings suggest that prompts or explicit CF moves may be more effective at
promoting L2 development than implicit CF moves such as recasts (Lyster et al., 2013). Goo and
Mackey (2013) discussed methodological issues that led them to question the validity of previous
CF research designs/methods and the reliability of findings (see Lyster & Ranta, 2013 for their
response). As Ellis (2015) notes, different CF moves, whether implicit or explicit, and whether
recasts or prompts, when combined effectively, are likely to function as critical interactional
devices that promote L2 learning. Recently, researchers have explored many factors that may mediate
the extent to which L2 learners benefit from interactional features (e.g., CF moves). These factors
include effects of CF via active participation in interaction (Yilmaz, 2016), feedback timing (Li,
Ellis, & Zhu, 2016), different types of recasts (Wacha & Liu, 2017), recasts versus scaffolded
feedback (Rassaei, 2014), and extensive versus intensive recasts (Nassaji, 2017).
Task Complexity
Given that noticing plays a nontrivial role in L2 learning, and “task demands are a powerful
determinant of what is noticed” (Schmidt, 1990, p. 143), investigating tasks of different
complexity levels involving varying degrees of cognitive demands is of importance in inter-
action research (e.g., Gilabert et al., 2009; Kim, 2009, 2012; Kim et al., 2015; Kourtali &
Révész, 2020; Révész, 2009, 2011). Most interaction research on task complexity has been
conducted to test Robinson’s Cognition Hypothesis (2001, 2007, 2011). Robinson claims that
increasing task complexity along resource-directing dimensions (e.g., ±reasoning, ±few ele-
ments, ±here-and-now) “has the potential to connect cognitive resources, such as attention
and memory, with effort at conceptualization and the L2 means to express it” (2011, p. 14),
leading to greater accuracy and complexity of production. He also notes that increased
cognitive/conceptual demands of tasks likely trigger more interaction and negotiation for
meaning and furthermore, by directing learners’ attentional and memory resources to diverse
features of the L2 linguistic system, give rise to “more noticing of task relevant input, and
heightened memory for it, and so lead to more uptake of forms made salient in the input
through various focus on form interventions” (2007, p. 23). Overall findings are inconclusive,
with some studies reporting (partial) supportive evidence (e.g., Kim, 2012; Révész, 2011) but
others showing counterevidence or no evidence (e.g., Kim et al., 2015; Kourtali & Révész,
2020). Interaction researchers have also explored potential mediating variables (e.g., proficiency
and type of task dimensions). Révész (2011) found no mediating role of self-confidence,
anxiety, and self-perceived communicative competence in the impact of task
complexity on L2 learner performance. Cognitive abilities are another such variable and, in
fact, possible links between cognitive capacities and task complexity were evidenced in terms
of noticing and learning gains (e.g., Kim et al., 2015; Kourtali & Révész, 2020). Studies of
this kind are important to note because relevant findings delineate the scope of the impact of
task complexity on L2 learning. Further research is clearly warranted in this area.
Language Aptitude
Language aptitude refers to the ability to adapt to and benefit from instructed or naturalistic
exposure to the L2 (Robinson, 2013). Long (2015) emphasizes the essential role of language
aptitude in L2 learning, maintaining that “(given otherwise comparable abilities and learning
opportunities) one factor, sensitivity to input (not to negative input only), is the most likely
predictor of success and failure at the level of the individual” (p. 60). Language aptitude is not a
single, unitary cognitive entity, but a composite of cognitive abilities that are critical in promoting L2
learning. Aptitude tests comprise several subtests measuring different components. Li’s
(2016) recent meta-analysis showed a correlation of .49 between language aptitude and L2
proficiency (k = 53); that is, approximately 24% of the variance in L2 proficiency can be accounted for
by language aptitude. Accordingly, a growing number of cognitive-interactionist researchers have
investigated whether language aptitude mediates the extent of beneficial effects of interaction
on L2 learning. Relevant research has not offered a clear picture of how language aptitude
works in interactional contexts, showing less-than-consistent results depending on target type
(e.g., Granena & Yilmaz, 2019), feedback condition (e.g., Li, 2013; Yilmaz, 2013a; Yilmaz &
Granena, 2016), task type (e.g., Kourtali & Révész, 2020; Li et al., 2019), and dependent
variables (e.g., Kourtali & Révész, 2020; Li, 2013; Yilmaz & Granena, 2019). For example,
Li (2013) and Kim (2021) provided supporting evidence for a significant mediating role of
language analytic ability (LAA), meaning the ability to recognize grammatical patterns and
other linguistic entities in language samples and infer underlying rules, in the effectiveness of
recasts in L2 development. Li’s (2013) study, however, showed no significant correlation
between LAA and the effectiveness of explicit feedback (metalinguistic correction).
In contrast, Yilmaz (2013a) reported a significant mediating role of LAA, measured
via the LLAMA F test (Meara, 2005), in the efficacy of explicit feedback (i.e., explicit
correction) and no evidence of LAA playing a role in the case of recasts. Yilmaz’s results
further suggested that explicit correction worked better than recasts for learners with high
LAA, but not for those with low LAA. Yilmaz and Granena (2016, 2019; Granena &
Yilmaz, 2019) conducted a series of CF-aptitude studies to obtain a more detailed look at
how aptitude measured via LLAMA subtests functions. Findings suggest that LLAMA
subtests are differentially associated with implicit and explicit CF moves with target type and
dependent variables influencing the extent of such association (see Kourtali & Révész, 2020
on relationships among LLAMA subtests, task complexity, recasts, and dependent variable
measures). Regarding instructional conditions, Li et al. (2019) manipulated five learning
conditions. LAA was found to be associated with learning gains under two conditions (i.e.,
Task Only and Post-task Feedback) unrelated to WM, indicating that WM and LAA may
function differently under different instructional/learning conditions. Obviously, any con-
clusions are still premature. More research needs to be conducted to obtain a better un-
derstanding of the role of language aptitude and its relationship with other cognitive
variables (e.g., WM and attention control).
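The shared-variance figure given above for Li’s (2016) meta-analysis follows from squaring the reported correlation coefficient. The short Python sketch below works through that arithmetic; the helper name variance_explained is hypothetical and introduced only for illustration, and the calculation assumes a simple bivariate correlation.

```python
def variance_explained(r: float) -> float:
    """Return the proportion of shared variance (r squared) for a correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return r * r


# Correlation between language aptitude and L2 proficiency reported in the text.
r_aptitude = 0.49
share = variance_explained(r_aptitude)
print(f"r = {r_aptitude:.2f} -> shared variance = {share:.4f} ({share:.0%})")
# prints: r = 0.49 -> shared variance = 0.2401 (24%)
```

On this reading, aptitude accounts for about 24% of the variance in proficiency, or roughly a quarter, which leaves most of the variance to other learner-internal and learner-external factors.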
Other Issues
Interaction effects, especially CF moves (and their effectiveness), have also been examined in
terms of developmental readiness with findings indicating that more proficient learners are
more likely to benefit from CF moves compared to less proficient learners (e.g., Ammar &
Spada, 2006; Mackey & Philp, 1998). Research findings suggest that advanced learners may
benefit from both recasts and prompts, whereas low-proficiency learners likely benefit more
from prompts than recasts (e.g., Ammar & Spada, 2006; H. Li, 2018; Li, 2014) although
details of this general observation may vary depending on other variables. One such variable
is the type of target feature (e.g., Kim, 2021; Li, 2014; Mackey, 2006; van de Guchte et al.,
2015) with more salient target features being more susceptible to CF effects. In terms of
interaction between target type and CF type, research findings are inconclusive (Li, 2014).
In addition, some studies have investigated how form-focused instruction (FFI) optimizes
the effectiveness of CF moves (e.g., Lyster, 2004; Saito & Lyster, 2012). Findings suggest that
FFI may enhance the efficacy of CF moves with evidence of more benefits when combined
with prompts than recasts. Interlocutor types and characteristics also mediate interaction-driven
learning (see Gurzynski-Weiss, 2017 for a review). For instance, peer interaction has been
found to be beneficial for L2 learning (e.g., Adams et al., 2011; Saito & Lyster, 2012).
Research findings have also revealed that instructors’ educational background and teaching
experiences are associated with their feedback use (e.g., Mackey et al., 2004; Gurzynski-
Weiss, 2016; Kartchava et al., 2020). Kartchava et al. (2020) reported a disparity between
pre-service teachers’ beliefs about CF and their in-class correction behaviours as reflected in
a smaller percentage of error correction in actual classroom practices. Nevertheless, their
preference for the same CF type (i.e., recasts) was observed with a much higher rate in actual
teaching practices. Learner beliefs, attitudes, and anxiety are also important to note because
they may influence the amount of modified output (e.g., repair), learner noticing, and L2
learning as a consequence (e.g., Akiyama, 2017; Kartchava & Ammar, 2014b; Lee, 2013;
Sheen, 2008). Lee’s (2013) advanced ESL learners, for example, chose explicit correction as
the most preferred CF move and perceived clarification requests as an anxiety-provoking CF
move that likely created emotional discomfort. It makes intuitive sense that, as indicated in
Akiyama (2017), CF moves that match learners’ preferred types may lead to more
favourable grounds for L2 learning, although not necessarily so. Attempts to investigate
variables such as cognitive style (e.g., Rassaei, 2015a), creativity (e.g., McDonough et al.,
2015) and gestures (e.g., Nakatsukasa, 2016, 2021), albeit few, have contributed to expanding
the spectrum of interaction research.
Nevertheless, empirical evidence is insufficient to offer unambiguous conclusions about
the specific functions of many of the variables described earlier. More research, including
replication studies, should be conducted to broaden our understanding of these already-
complicated cognitive phenomena involving a multitude of variables that may affect
interaction-driven L2 development.
5 Future Directions
The interaction approach, unparalleled in its impact on the L2 research community, has
attracted substantial scholarly attention to whether and how it contributes to L2 develop-
ment. Long’s (1996) revised Interaction Hypothesis
Empirical research has offered clear evidence for interaction effects, that is, L2 learners
benefit from negotiation for meaning because it affords L2 learners opportunities to receive
(modified) input and CF on erroneous output and produce modified output in response to
their interlocutors’ CF moves (see Gass & Mackey, 2015; Goo, 2019). Given that L2 learning
involves a constellation of complex phenomena encompassing many cognitive and affective
processes, it stands to reason that recent interaction research has focused on investigating
relevant variables in terms of their mediating role in the extent of interaction effects (e.g.,
WM, language aptitude, cognitive style, and anxiety). These variables have attracted much
attention, but still merit further research because we currently lack sufficient evidence for
generalizable statements. As Goo (2019) notes,
although more than thirty years of interaction research have no doubt revealed that
interaction precipitates L2 learning, we now have much more complicated issues to
deal with than when this idea of interaction first emerged, and far more questions
than answers when it comes to how it does so (p. 247).
Further Reading
Gass, S. M. (2018). Input, interaction, and the second language learner. New York, NY: Routledge.
A reprint of Gass’s (1997) classic text with the same title. It comprises the original 1997 text and a
newly-added preface that contains insights from Alison Mackey, Rod Ellis, and Mike Long. The main
text, albeit not updated, deals with such fundamental issues as the nature of input, attention and
awareness, output, and the role of interaction.
Ellis, R., Skehan, P., Li, S., Shintani, N., & Lambert, C. (2020). Task-based language teaching: Theory
and practice. Cambridge, UK: Cambridge University Press.
A crucial overview of the current state of affairs in task-based language teaching (TBLT), illustrating
important theoretical perspectives that underpin TBLT. It also discusses pedagogic and research per-
spectives on TBLT and various nontrivial issues that both researchers and practitioners should take
into consideration.
Nassaji, H., & Kartchava, E. (Eds.) (2021). The Cambridge handbook of corrective feedback in second
language learning and teaching. Cambridge, UK: Cambridge University Press.
An insightful analysis and discussion of cutting-edge research on the role of CF in diverse dimensions of
L2 development. Also discussed are a wide range of methodological and pedagogical issues considered
to be crucial for both research and teaching communities.
References
Adams, R., Nuevo, A.-M., & Egi, T. (2011). Explicit and implicit feedback, modified output, and SLA:
Does explicit and implicit feedback promote learning and learner-learner interactions? Modern
Language Journal, 95(Supplement), 42–63.
Ahmadian, M. J. (2012). The relationship between working memory capacity and L2 oral performance
under task-based careful online planning condition. TESOL Quarterly, 46, 165–175.
Akiyama, Y. (2017). Learner beliefs and corrective feedback in telecollaboration: A longitudinal in-
vestigation. System, 64, 58–73.
Akiyama, Y., & Saito, K. (2016). Development of comprehensibility and its linguistic correlates: A
longitudinal study of video-mediated telecollaboration. Modern Language Journal, 100, 585–609.
Ammar, A., & Spada, N. (2006). One size fits all? Recasts, prompts, and L2 learning. Studies in Second
Language Acquisition, 28, 543–574.
Baddeley, A. D. (2007). Working memory, thought, and action. Oxford: Oxford University Press.
Bao, M., Egi, T., & Han, Y. (2011). Classroom study on noticing and recast features: Capturing learner
noticing with uptake and stimulated recall. System, 39, 215–228.
Brown, D. (2016). The type and linguistic foci of oral corrective feedback in the L2 classroom: A meta-
analysis. Language Teaching Research, 20, 436–458.
Doughty, C. (2001). Cognitive underpinnings of focus on form. In P. Robinson (Ed.), Cognition and
second language instruction (pp. 206–257). Cambridge, UK: Cambridge University Press.
Egi, T. (2007). Interpreting recasts as linguistic evidence: The roles of linguistic target, length, and
degree of change. Studies in Second Language Acquisition, 29, 511–537.
Egi, T. (2010). Uptake, modified output, and learner perceptions of recasts: Learner responses as
language awareness. Modern Language Journal, 94, 1–21.
Ellis, R. (2015). Understanding second language acquisition (2nd edn). Oxford: Oxford University Press.
Gass, S. M. (1997). Input, interaction, and the second language learner. Mahwah, NJ: Lawrence
Erlbaum.
Gass, S. M. (2003). Input and interaction. In C. J. Doughty & M. H. Long (Eds.), The handbook of
second language acquisition (pp. 224–255). Oxford: Blackwell.
Gass, S. M., & Mackey, A. (2015). Input, interaction, and output in second language acquisition. In B.
VanPatten & J. Williams (Eds.), Theories in second language acquisition (pp. 180–206). New York,
NY: Routledge.
Gilabert, R., Barón, G., & Llanes, A. (2009). Manipulating cognitive complexity across task types
and its impact on learners’ interaction during oral performance. IRAL, 47, 367–395.
Goo, J. (2012). Corrective feedback and working memory capacity in interaction-driven L2 learning.
Studies in Second Language Acquisition, 34, 445–474.
Goo, J. (2019). Interaction in L2 learning. In J. W. Schwieter & A. Benati (Eds.), The Cambridge
handbook of language learning (pp. 233–257). Cambridge: Cambridge University Press.
Goo, J. (2020). Research on the role of recasts in L2 learning. Language Teaching, 53, 289–315.
Goo, J., & Mackey, A. (2013). The case against the case against recasts. Studies in Second Language
Acquisition, 35, 127–165.
Granena, G., & Yilmaz, Y. (2019). Corrective feedback and the role of implicit sequence-learning ability
in L2 online performance. Language Learning, 69(S1), 127–156.
Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition (pp. 413–468). New York,
NY: Academic Press.
Long, M. H. (2007). Problems in SLA. Mahwah, NJ: Lawrence Erlbaum Associates.
Long, M. H. (2015). Second language acquisition and task-based language teaching. Malden, MA: Wiley-
Blackwell.
Loschky, L. (1994). Comprehensible input and second language acquisition: What is the relationship?
Studies in Second Language Acquisition, 16(3), 305–325.
Lyster, R. (2004). Differential effects of prompts and recasts in form-focused instruction. Studies in
Second Language Acquisition, 26, 399–432.
Lyster, R., & Mori, H. (2006). Interactional feedback and instructional counterbalance. Studies in
Second Language Acquisition, 28, 269–300.
Lyster, R., & Ranta, L. (1997). Corrective feedback and learner uptake: Negotiation of form in
communicative classrooms. Studies in Second Language Acquisition, 19, 37–66.
Lyster, R., & Ranta, L. (2013). Counterpoint piece: The case for variety in corrective feedback research.
Studies in Second Language Acquisition, 35, 167–184.
Lyster, R., & Saito, K. (2010). Oral feedback in classroom SLA: A meta-analysis. Studies in Second
Language Acquisition, 32, 265–302.
Lyster, R., Saito, K., & Sato, M. (2013). Oral corrective feedback in second language classrooms.
Language Teaching, 46, 1–40.
Mackey, A. (1999). Input, interaction, and second language development: An empirical study of
question formation in ESL. Studies in Second Language Acquisition, 21, 557–587.
Mackey, A. (2006). Feedback, noticing and instructed second language learning. Applied Linguistics,
27, 405–430.
Mackey, A. (2012). Input, interaction and corrective feedback in L2 classrooms. Oxford, UK: Oxford
University Press.
Mackey, A., Abbuhl, R., & Gass, S. (2012). Interactionist approach. In S. Gass & A. Mackey (Eds.),
The Routledge handbook of second language acquisition (pp. 7–24). New York, NY: Routledge.
Mackey, A., Adams, R., Stafford, C., & Winke, P. (2010). Exploring the relationship between modified
output and working memory capacity. Language Learning, 60, 501–533.
Mackey, A., Philp, J., Egi, T., Fujii, A., & Tatsumi, T. (2002). Individual differences in working
memory, noticing of interactional feedback and L2 development. In P. Robinson (Ed.), Individual
differences and instructed language learning (pp. 181–209). Amsterdam: John Benjamins.
Mackey, A., Gass, S. M., & McDonough, K. (2000). How do learners perceive interactional feedback?
Studies in Second Language Acquisition, 22, 471–497.
Mackey, A., & Goo, J. (2007). Interaction research in SLA: A meta-analysis and research synthesis. In
A. Mackey (Ed.), Conversational interaction in second language acquisition: A collection of empirical
studies (pp. 407–452). Oxford: Oxford University Press.
Mackey, A., & Philp, J. (1998). Conversational interaction and second language development: Recasts,
responses, and red herrings? Modern Language Journal, 82, 338–356.
Mackey, A., Polio, C., & McDonough, K. (2004). The relationship between experience, education and
teachers’ use of incidental focus-on-form techniques. Language Teaching Research, 8, 301–327.
Mackey, A., & Sachs, R. (2012). Older learners in SLA research: A first look at working memory,
feedback, and L2 development. Language Learning, 62, 704–740.
McDonough, K. (2005). Identifying the impact of negative feedback and learners’ responses on ESL
question development. Studies in Second Language Acquisition, 27(1), 79–103.
McDonough, K., Crawford, W. J., & Mackey, A. (2015). Creativity and EFL students’ language use
during a group problem-solving task. TESOL Quarterly, 49, 188–199.
Meara, P. (2005). LLAMA language aptitude tests. Swansea, UK: Lognostics.
Mitchell, R., Myles, F., & Marsden, E. (2019). Second language learning theories (4th edn). New York,
NY: Routledge.
Nakatsukasa, K. (2016). Efficacy of recasts and gestures on the acquisition of locative prepositions.
Studies in Second Language Acquisition, 38, 771–799.
Nakatsukasa, K. (2021). Gesture-enhanced recasts have limited effects: A case of the regular past tense.
Language Teaching Research, 25(4), 587–612.
Nassaji, H. (2016). Interactional feedback in second language teaching and learning: A synthesis and
analysis of current research. Language Teaching Research, 20, 535–562.
Nassaji, H. (2017). The effectiveness of extensive versus intensive recasts for learning L2 grammar. The
Modern Language Journal, 101, 353–368.
Parlak, Ö., & Ziegler, N. (2017). The impact of recasts on the development of primary stress in a
synchronous computer-mediated environment. Studies in Second Language Acquisition, 39, 257–285.
Philp, J. (2003). Constraints on noticing the gap: Nonnative speakers’ noticing of recasts in NS-NNS
interaction. Studies in Second Language Acquisition, 25, 99–126.
Pica, T. (1994). Research on negotiation: What does it reveal about second language learning condi-
tions, processes, and outcomes? Language Learning, 44, 493–527.
Pica, T. (1996). Do second language learners need negotiation? International Review of Applied
Linguistics in Language Teaching, 34, 1–21.
Rassaei, E. (2013). Corrective feedback, learners’ perceptions, and second language development. System,
41(2), 472–483.
Rassaei, E. (2014). Scaffolded feedback, recasts, and L2 development: A sociocultural perspective. The
Modern Language Journal, 98(1), 417–431.
Rassaei, E. (2015a). Recasts, field dependence/independence cognitive style, and L2 development.
Language Teaching Research, 19, 499–518.
Rassaei, E. (2015b). Oral corrective feedback, foreign language anxiety and L2 development. System,
49, 98–109.
Révész, A. (2009). Task complexity, focus on form, and second language development. Studies in
Second Language Acquisition, 31, 437–470.
Révész, A. (2011). Task complexity, focus on L2 constructions, and individual differences: A
classroom-based study. Modern Language Journal, 95(Supplement), 162–181.
Révész, A. (2012). Working memory and the observed effectiveness of recasts on different L2 outcome
measures. Language Learning, 62, 93–132.
Robinson, P. (2001). Task complexity, cognitive resources, and syllabus design: A triadic framework
for examining task influences on SLA. In P. Robinson (Ed.), Cognition and second language in-
struction (pp. 287–318). Cambridge, UK: Cambridge University Press.
Robinson, P. (2007). Task complexity, theory of mind, and intentional reasoning: Effects on L2 speech
production, interaction, uptake and perceptions of task difficulty. International Review of Applied
Linguistics in Language Teaching, 45, 193–213.
Robinson, P. (2011). Second language task complexity, the Cognition Hypothesis, language learning,
and performance. In P. Robinson (Ed.), Second language task complexity: Researching the cognition
hypothesis of language learning and performance (pp. 203–235). Amsterdam: John Benjamins.
Robinson, P. (2013). Aptitude in second language acquisition. In C. A. Chapelle (Ed.), Encyclopedia of
applied linguistics: Language learning and teaching (pp. 129–133). Oxford: Wiley-Blackwell.
Saito, K., & Akiyama, Y. (2017). Video-based interaction, negotiation for comprehensibility, and
second language speech learning: A longitudinal study. Language Learning, 67(1), 43–74.
Saito, K., & Lyster, R. (2012). Effects of form-focused instruction and corrective feedback on L2 pro-
nunciation development of /ɹ/ by Japanese learners of English. Language Learning, 62(2), 595–633.
Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–158.
Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and second language instruction
(pp. 3–32). Cambridge: Cambridge University Press.
Sheen, Y. (2004). Corrective feedback and learner uptake in communicative classrooms across in-
structional settings. Language Teaching Research, 8, 263–300.
Sheen, Y. (2006). Exploring the relationship between characteristics of recasts and learner uptake.
Language Teaching Research, 10, 361–392.
Sheen, Y. (2008). Recast, language anxiety, modified output and L2 learning. Language Learning, 58,
835–874.
Swain, M. (1985). Communicative competence: Some roles of comprehensible input and comprehensible
output in its development. In S. Gass & C. Madden (Eds.), Input in second language
acquisition. Rowley, MA: Newbury House.
Swain, M. (1995). Three functions of output in second language learning. In G. Cook & B. Seidlhofer
(Eds.), Principle and practice in applied linguistics: Studies in honor of H.G. Widdowson
(pp. 125–144). Oxford: Oxford University Press.
Swain, M. (2005). The output hypothesis: Theory and research. In E. Hinkel (Ed.), Handbook on re-
search in second language teaching and learning (pp. 471–484). Mahwah, NJ: Lawrence Erlbaum.
van de Guchte, M., Braaksma, M., Rijlaarsdam, G., & Bimmel, P. (2015). Learning new grammatical
structures in task-based language learning: The effects of recasts and prompts. Modern Language
Journal, 99, 246–262.
Wacha, R. C., & Liu, Y.-T. (2017). Testing the efficacy of two new variants of recasts with standard
recasts in communicative conversational settings: An exploratory longitudinal study. Language
Teaching Research, 21(2), 189–216.
Wen, Z. (2016). Working memory and second language learning: Towards an integrated approach.
Bristol, UK: Multilingual Matters.
Wen, Z., Biedron, A., & Skehan, P. (2017). Foreign language aptitude theory: Yesterday, today and
tomorrow. Language Teaching, 50, 1–31.
Yilmaz, Y. (2013a). The relative effectiveness of mixed, explicit and implicit feedback in the acquisition
of English articles. System, 41, 691–705.
Yilmaz, Y. (2013b). Relative effects of explicit and implicit feedback: The role of working memory
capacity and language analytic ability. Applied Linguistics, 34, 344–368.
Yilmaz, Y. (2016). The role of exposure condition in the effectiveness of explicit correction. Studies in
Second Language Acquisition, 38, 65–96.
Yilmaz, Y., & Granena, G. (2016). The role of cognitive aptitudes for explicit language learning in the
relative effects of explicit and implicit feedback. Bilingualism: Language and Cognition, 19, 147–161.
Yilmaz, Y., & Granena, G. (2019). Cognitive individual differences as predictors of improvement and
awareness under implicit and explicit feedback conditions. Modern Language Journal, 103(3),
686–702.
Yoshida, R. (2010). How do teachers and learners perceive corrective feedback in the Japanese lan-
guage classroom? Modern Language Journal, 94, 293–314.
Ziegler, N., & Mackey, A. (2017). Interactional feedback in synchronous computer-mediated com-
munication: A review of the state of the art. In H. Nassaji & E. Kartchava (Eds.), Corrective
feedback in second language teaching and learning: Research, theory, applications, implications
(pp. 80–94). New York, NY: Routledge.
17
PRAGMATICS: SPEAKING AS A
PRAGMALINGUISTIC RESOURCE
Kathleen Bardovi-Harlig
1 Introduction/Definitions
Readers may have heard or perhaps can imagine an exchange such as the following in which
a parent intervenes when one sibling has somehow impinged on another:
This chapter examines the intersection of research on the acquisition of pragmatics (the
apology part of this example) and characteristics of speaking and what they contribute to
pragmatics (the say-it-like-you-mean-it part of the example).
What Is Pragmatics?
Pragmatics is the study of how-to-say-what-to-whom-when (Taguchi, 2019, adds “what not
to say,” p. 1; see also Ishihara & Cohen, 2014, pp. 3–4). If that captures the sense of
pragmatics, then L2 pragmatics is the study of how learners come to know how-to-say-what-
to-whom-when (and what not to say!) in their second language. Although this is fairly ac-
curate in spirit, it lacks the detail required for research. Following Kasper & Rose (2002), I
adopt Crystal’s (1997) definition of pragmatics as “the study of language from the point of
view of users, especially of the choices they make, the constraints they encounter in using
language in social interaction and the effects their use of language has on other participants in
the act of communication” (p. 301; italics added by Kasper & Rose, 2002, p. 2).
The focus on interaction is echoed in two handbooks on pragmatics: “Pragmatics studies
the connection between a linguistic form and a context, where that form is used, and how this
connection is perceived and realized in a social interaction” (Taguchi, 2019, p. 1) and
In these definitions, both interaction and mode open the way for the investigation of how
speaking contributes to and/or influences the intended message (or illocutionary force) of an
utterance and the perceived intention of the message (or perlocutionary effect). However, in
mainstream L2 pragmatics, even though the focus has been on pragmatics for conversation
(Bardovi-Harlig, 2012, 2015, 2018; Ishihara & Cohen, 2014), characteristics of speaking have
not been included as part of standard analyses.
What Is Speaking?
Goh and Burns (2012; Burns, 2013) propose a three-component model of speaking com-
petence which includes knowledge of language and discourse, core speaking skills, and
communication strategies. The first two components are particularly relevant to prag-
matics. Knowledge of language and discourse includes the L2 sound system (including
intelligibility at segmental and suprasegmental levels), lexis, and morphosyntax (see
Bardovi-Harlig, 1999, for a discussion of the relation of grammar and pragmatics in L2
acquisition) and “understanding how stretches of connected speech (discourse, genre) are
organized, so that they are socially and pragmatically appropriate” (Burns, 2013, p. 167). The
second component of the model, core speaking skills, includes “the ability to process
speech quickly to increase fluency (e.g., speech rate, chunking, pausing, formulaic language,
discourse markers),” negotiating speech (including responding to the utterances of
others, checking comprehension, giving feedback, and repair), and “managing the flow of
speech as it unfolds (e.g., initiating topics, turn-taking, signaling intentions, opening/
closing conversations)” (Burns, 2013, p. 167).
As listeners we need to interpret what is said, as well as what is not said, and what
may be communicated non-verbally. These verbal and non-verbal cues transmit to
us just how polite, direct, or formal the communication is and what the intent is
(e.g., to be kind, loving, attentive, or devious, provocative, or hostile) (pp. 3–4).
House (1996) uses the term pragmatic fluency for the intersection of pragmatics and speaking
(pp. 228–229) and Kang, Kermad and Taguchi (2021) employ the term pragma-prosodic to
describe prosody used in the service of pragmatics (akin to the term pragmalinguistic which
refers to resources available to realize sociopragmatics). Yates (2017) identifies both
In this example, the hearer links tone to the anticipated illocutionary force and to the
speaker’s affect: “she sounded a bit embarrassed and nervous.” She interprets increased
speech rate and pitch or loudness as indicators of the speaker’s discomfort at performing the
disinvitation.
Three comprehensive reviews of pragmatics and prosody have recently appeared
(Escandell-Vidal & Prieto, 2021, on Spanish pragmatics and prosody; Hirschberg, 2017, on
pragmatics and prosody; Kang & Kermad, 2019, on L2 pragmatics and prosody).
Although these reviews indicate interest, Kang and Kermad also report a general paucity
of L2 studies. I will not attempt to replicate the coverage of prosody undertaken by these
reviews. Instead, I will explore the intersection of pragmatics and speaking and consider
what pragmatics research might look like with more sustained interest in the characteristics
of speaking; that is, what would pragmatics research discover if we regularly considered
speaking as the addressee of the disinvitation did?
2 Historical Perspectives
Here, I consider the history of L2 pragmatics research related to speaking. (For in-
formation on how interlanguage pragmatics and L2 pragmatics research relate to SLA
generally, see Bardovi-Harlig, 2012; Taguchi, 2019.) This part situates spoken tasks among
the tasks used to collect data in L2 pragmatics research and demonstrates that although L2
pragmatics research views characteristics of speaking as contributing to illocutionary force
and perlocutionary effect, such characteristics have not been investigated systematically.
At the university
Ann missed a lecture yesterday and would like to borrow Judith’s notes.
Ann: ____________________________________________
Judith: Sure, but let me have them back before the lecture next week.
Three other major data types used to study pragmatics production include oral DCTs, role-
plays, and conversation. What elicitation tasks have in common is a scenario that describes
the setting in which talk is imagined to take place, the relevant characteristics of the speakers
involved, their relationship, the event that precipitates the speech, and the goal. In oral
DCTs, speakers provide a single spoken turn; the respondent may initiate a turn or reply to a
spoken turn as in (3) from Bardovi-Harlig (2009, p. 795).
(3). You give your classmate a ride home. He lives in the building next to yours. He gets out
of the car and says,
In role-plays, speakers interact either with a researcher, research assistant, or fellow learner
to negotiate over several turns the goal stated by the scenario. Conversation may be elicited
in interviews, information gap activities, problem solving, and peer feedback tasks, or
spontaneously in institutional talk, service encounters, and conversation. Many of these are
treated as separate categories of talk, but I will discuss them as one large category, identi-
fying the type of talk investigated by individual studies.
In a study using a written DCT to explore regional variation in pragmatics, Schneider
(2011) suggested that written representation of pragmatics and speaking in dramatic scripts,
even in the absence of speech characteristics, nevertheless displays “most essential features of
dialogue” (p. 17). However, the use of written production tasks precludes the study of
speaking in L2 pragmatics. Similarly, the presentation of written language samples in judgement
and interpretation tasks excludes phonetic and prosodic information from the
utterances to be judged.
Two large-scale reviews (Bardovi-Harlig, 2010, covering 1979–2008; Nguyen, 2019, covering 1979–2017)
show that oral data are in the majority by task. Moreover, production tasks greatly outweigh
non-production tasks in L2 pragmatics. Bardovi-Harlig reported that of 152 studies, 129
(70%) included a production task exclusively, only 23 (or 15%) exclusively used a
non-production task (judgement or interpretation), and 22 (or 15%) used both. Taken together,
63% of the tasks are oral/aural (production tasks weigh in at 69%, a higher orality rate than
non-production tasks at 35% or mixed studies at 59%). Nguyen (2019) similarly reported that
217 (88%) of the 246 studies reviewed were production studies. Of those, at least 64% were
oral (there could be more because oral DCTs are in a mixed oral–written DCT category).
Thus, we cannot attribute the lack of integration of speaking into pragmatics to a lack of oral
data or the dominance of written data. However, we might consider that the analyses established
during the dominance of large-scale written studies have influenced current analyses.
Analyzing oral production data without explicit reference to characteristics of speech relies
on the same “essential” features (as Schneider 2011 described written dialogue without
speaking) and misses whatever information speaking provides.
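The proportions reported from the two reviews can be cross-checked with simple arithmetic. The sketch below is only an illustration, not part of either review: it recomputes each share from the counts as they appear in this text, and the helper name `share` is hypothetical. As printed here, the three mutually exclusive categories sum to 174 rather than 152, and 129 of 152 is closer to 85% than to 70%, so the first count should be checked against the original review before it is reused.

```python
def share(count, total):
    """Return count as a percentage of total, rounded to the nearest whole percent."""
    return round(100 * count / total)


# Counts as printed in this text for the first review (152 studies).
total_first_review = 152
printed_counts = {
    "production only": 129,
    "non-production only": 23,
    "both": 22,
}

for label, count in printed_counts.items():
    print(f"{label}: {count}/{total_first_review} = {share(count, total_first_review)}%")

# The categories are described as exclusive, so their sum can be compared with the total.
category_sum = sum(printed_counts.values())
print("sum of categories:", category_sum, "vs. reported total:", total_first_review)

# Second review: 217 production studies out of 246 reviewed.
print(f"production studies in second review: {share(217, 246)}%")
```

Running the same check on any corrected counts taken from the original review only requires editing the dictionary.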
The next part reviews L2 pragmatic studies that include speaking as part of their results
and lay the groundwork for more systematic investigation. The remainder of this discussion
will be limited to the study of speech acts, the dominant approach to L2 pragmatics research
(65% of the 152 studies reviewed by Bardovi-Harlig, 2010, explored speech acts).
also reported that “one of the raters commented, routines realized in the opening phase
appeared to be rattled off quickly and … unfeelingly, so as to get them over with in a rather
artificial fashion” (House, 1996, p. 239). Likewise, Bardovi-Harlig (2013) noted that some
utterances produced in response to an oral DCT were “flat” or “monotonic.” Learners ex-
hibited both overly rapid and slowed delivery of condolence formulas in contrast to native
speaker (NS) production (Bardovi-Harlig, 2013). They also exhibited pauses in a range of
conventional expressions, suggesting that they had not fully mastered the formulaic se-
quences. Word stress also plays a role in pragmatics. A study of academic advising sessions
reported that the hedge “I thínk” becomes an aggravator when produced with a stressed
pronoun, as in “Í think” (Bardovi-Harlig & Hartford, 1996). Rather than softening the
proposition expressed, the stress on the pronoun appears to contrast the student’s opinion
with her advisor’s.
House (1996) reported that the advanced German university EFL speakers in her study
were “so advanced that their productions are never characterized by markedly slow speech or
irritatingly long (unfilled) pauses that necessitate excessive repairing or other overt,
paralinguistic signs of disfluency” (pp. 244–245). Nevertheless, even those learners differed in
initiating and responding utterances. Preliminary results showed that when responding,
“learners slow down more markedly, pauses are longer, and repairs are frequent,” an ob-
servation that warrants more investigation.
Shively’s (2018) examination of the development of humour in a second language during
study abroad reveals an important intersection of speaking and pragmatics. Learners of
Spanish frequently used humour with host families and native-speaker peers, and most of
these humour tokens (94%) were accompanied by a change in a characteristic of speaking.
These occurred “with laughter and/or prosodic markers such as smile voice, exaggerated
intonation, singsong intonation, increases or decreases in volume and pitch, sound length-
ening, and slowing down or speeding up the rate of speech” (p. 85). Deadpan humour,
tokens that occurred without such prosodic cues, was much more likely to go unrecognized
by interlocutors.
In contrast to pragmatically problematic deliveries, Bardovi-Harlig (2013) reported the
successful intonation of a single expression “you too!” in response to “have a nice day!” Both
are frequently delivered with singsong intonation in the Midwest community where the
learners resided, and this was captured on the aural prompt to which learners responded.
Several learners responded with a singsong production of “you too!” This was noticeable
given the frequent monotonic production by learners (cf. Kang et al., 2021; Pickering, 2001).
Finally, Taguchi (2007) asked whether the type of social situation differentially affects rate
of L2 speech act production. However, she linked rate of speech to task difficulty, reasoning
that social demands presented in the scenario may make the task more difficult, and task
difficulty leads to slower speech rate. Although that is not the direct link between pragmatics
and speaking that other researchers have noted, it is nevertheless one of the first studies to
systematically investigate the relationship. Taguchi reported that lower-level EFL learners
produced requests and refusals with low power (P), social distance (D), and degree of imposition
(R) appropriately and relatively quickly, but produced those with higher PDR
less appropriately and more slowly. Additionally, the lower-proficiency group was slower
than the higher-proficiency group, whereas NSs did not show a difference. Taguchi inter-
preted the scenarios describing low PDR as easier tasks than scenarios describing high PDR.
The crucial step is to link speech rate to speech act realization directly, rather than to task.
Recall the analysis of Tateyama’s raters of apology scenarios: at least some speech acts
should be performed slowly, demonstrating the cost to the speaker of making the speech act.
Taguchi observed that in addition to proficiency and task difficulty, speech rate may be
influenced by L1 expectations for reduced speed for consequential acts. More recent work
returns to this question.
learners but environment was not a factor. In this case, as suggested in the previous part,
slower rate of speech was characteristic for all speakers for the higher-stakes realizations.
Based on the NSs’ judgments, to what extent can speakers (EFL, ESL, NS) convey
their intent in producing sincere and ostensible apologies?
The learners were 30 Thai learners of English; the NSs were from the American
Midwest, as were the 45 NS judges. Speakers were given role-plays clearly indicating whether
the apology to be produced was heartfelt (sincere) or ostensible (the speaker was not re-
morseful). The 320 production tokens were judged. Both learners and native speakers pro-
duced apologies recognized as sincere by the judges. However, only the NSs were considered
successful in conveying ostensible apologies. Judges did not report perceiving ostensible
apologies in the learner productions to the same extent. Alexander notes that for all speakers,
and especially learners, the word-level IFIDs present in the apologies likely interacted with
acoustic characteristics and with intonation as an IFID. Apologies including intensifiers such as
“I’m really sorry” and “I’m so sorry” were frequently judged to be sincere, regardless of
prosody.
The analysis of intonation, pitch, and boundary tones was carried out on the highest
scoring apologies from each of three categories from the judgement task: 24 top-rated tokens
from sincere apologies judged to be sincere, 24 intended to be ostensible (insincere) but
judged to be sincere, and 24 that were ostensible apologies correctly perceived to be insincere.
Auditory analysis was conducted: Intonational transcriptions were carried out by the author
and a phonetician; pitch prominence was marked by the author. The 72 tokens meeting the
selection criteria were distributed over three apology formulas: three tokens of bare Sorry, 41
I’m sorry tokens, and 28 tokens of I’m + intensifier + sorry (e.g., I’m really sorry, I’m so
sorry). Considering speech characteristics, items more likely to be perceived as sincere were:
(1) intensified apologies (with or without a pitch accent on the intensifier), (2) apologies
carrying a high pitch accent (H*) on lexical items other than I’m, and (3) apologies ending
with a low boundary tone (L%). Utterances more likely to be perceived as ostensible included
(1) apologies with accented I’m, (2) apologies with a double pitch accent (i.e., L*+H and
L+H*) which follows an accented intensifier, and (3) apologies ending with a high boundary
tone (H%).
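The cue lists above can be pictured as a simple annotation record. The sketch below is a hypothetical illustration and not the procedure used in the study: the class `ApologyToken`, its field names, and the function `cue_tally` are all assumptions made for the example. It only counts how many of the reported cues a hand-annotated token carries. Because the cues were reported as tendencies, the tally is a descriptive summary and not a classifier of sincerity.

```python
from dataclasses import dataclass


@dataclass
class ApologyToken:
    """Hypothetical annotation record for one apology token (field names are illustrative)."""
    has_intensifier: bool                  # e.g., "I'm really sorry"
    accented_im: bool                      # pitch accent on "I'm"
    h_star_on_other_word: bool             # H* on a lexical item other than "I'm"
    double_accent_after_intensifier: bool  # L*+H or L+H* following an accented intensifier
    boundary_tone: str                     # "L%" or "H%"


def cue_tally(token):
    """Count how many of the reported sincere and ostensible cues a token carries."""
    sincere = sum([
        token.has_intensifier,
        token.h_star_on_other_word,
        token.boundary_tone == "L%",
    ])
    ostensible = sum([
        token.accented_im,
        token.double_accent_after_intensifier,
        token.boundary_tone == "H%",
    ])
    return {"sincere": sincere, "ostensible": ostensible}


# An invented token: intensified, H* on a word other than "I'm", low boundary tone.
example = ApologyToken(
    has_intensifier=True,
    accented_im=False,
    h_star_on_other_word=True,
    double_accent_after_intensifier=False,
    boundary_tone="L%",
)
print(cue_tally(example))
```

A record of this kind makes it easy to see that a token can carry cues from both lists at once, for example an intensified apology with an accented I’m, which is one reason the word-level and prosodic cues need to be analyzed together.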
Does English proficiency play a role in the success rate of interpreting the meaning/
force of the utterance?
Participants listened to each dialogue twice, then wrote a one-sentence paraphrase of the
last turn in the conversation on their answer sheet. Intermediate learners interpreted
unmarked intonation very easily (97% accuracy), but marked intonation much less ac-
curately (41%). Advanced learners showed 100% accuracy for unmarked and 83% for
marked intonation (in the NS range). To address concerns that the different contexts may
incline learners to the correct interpretation, Wan is developing interpretation tasks to
supplement the current task so that both renditions of a given string occur with the same
dialogue; the pair of utterances is then distinguished only by intonation.
In an item like Example (6) a participant would hear either B or B′ and provide
an interpretation.
Overview
The five studies reviewed here illustrate the interaction of pragmatics and speaking. Both
words and delivery contribute to conveying illocutionary force (Alexander, 2011; Pickering
et al., 2012; Wan, 2020), the perception of it (perlocutionary effect; Alexander, 2011; Wan,
2020), and sincerity (Alexander, 2011). Pitch concord/discord seems to function as an IFID
(Pickering et al., 2012). Length of utterance influences delivery (Alexander, 2011; Kang et al.,
2021) and should be investigated as a variable. In addition, rate of speech (Taguchi, 2011)
and pitch, stress, and tone were affected by degree of imposition (Kang et al., 2021). If we
combine the two studies (Taguchi, 2011; Kang et al., 2021), we get a more complete view of
development. No doubt, prosody played some role in acceptability judgements. Taguchi
reported that proficiency, and not study abroad, significantly correlated with pragmatic
appropriateness; Kang et al. reported that study abroad was associated with the use of level
tones and wider pitch range.
None of these discoveries could be made without investigating both pragmatics and
speaking, thus suggesting that there is much to learn from exploring this interface.
intermediate proficiency. The study had two parts. The first comprised eight role-plays with the
researchers (2 for each of 4 speech acts: requests, refusals, compliments, and apologies), instruction
informed by pragmatics research and recommendations, and a post-test consisting of the
same role-plays. The pragmatics group showed improvement in pragmatics, whereas a
control group, who received no pragmatics instruction, did not. In the second part of the
study, 56 listeners judged 2 refusals and 2 requests from each of the 11 students in the
pragmatics group for both pretests and posttests. The listeners rated the role-plays on three
9-point scales: pragmatics (socially appropriate to extremely inappropriate), comprehensi-
bility (extremely easy to understand to impossible to understand) and fluency (extremely
fluent to extremely disfluent). Raters found the learners to be more appropriate pragmati-
cally after instruction. Comprehensibility ratings improved on three out of four scenarios
(the remaining scenario involved asking an employer for a previously promised raise).
Fluency ratings showed an increase for only one scenario (refusing a customer’s request at a
bank because of lack of ID).
This study suggests that although learners may improve their pragmatics through instruction
focused on oral production, they do not automatically improve their delivery, and it
further suggests that integrated instruction should be offered. Such instruction is unlikely to
be found ready-made in language textbooks as reviews by pragmatics researchers have found
them to rarely cover pragmatics and only inauthentically when included (Ishihara & Cohen,
2014, among others). Two recent reviews of speaking and pragmatics in English language
teaching textbooks reached similar conclusions.
Diepenbroek and Derwing (2013) reviewed pragmatics and fluency activities separately
in 48 ESL texts used in Canada (12 series of 4 textbooks each). They found fluency ac-
tivities to be lacking and unevenly distributed across textbooks. Like previous reviews,
they cited limited quality, depth of coverage, and lack of contextualization in the pre-
sentation of pragmatics. Petraki and Bayes (2013) reviewed five intermediate-level ESL
textbooks popular in Australia. One textbook addressed intonation and presented ex-
amples of polite, impolite, and sarcastic intonation, and students were encouraged to
express impoliteness and negative affect in role-plays. Another textbook had what was described
as a mini-lesson on polite and impolite intonation, along with a diagram of polite
request intonation, and yet another presented only four examples. In short, not surprisingly,
the teaching of speaking and pragmatics fares no better than pragmatics alone; the review
did not report whether intonation patterns were authentic (an issue in the portrayal of
speech acts).
Fortunately, both Yates and Pickering have developed concrete proposals and activities
for integrated instruction of pragmatics and pronunciation (Yates, 2017) and discourse in-
tonation (Pickering, 2018). Yates persuasively argues that both pragmatics and pronuncia-
tion should have greater prominence in language teaching. She details the PREFER
approach (Practice-relevant models, Raising awareness of pragmatic and pronunciation is-
sues and their interaction, Experimentation with new pragmatic resources and pronuncia-
tion, Feedback, Exploring the world outside, and Reflection on what to do and how to do it)
(p. 240). She provides excerpts of L2 speakers interacting in their professional setting, and
then provides an instructional activity sequence using the PREFER approach that addresses
the relevant features.
Pickering includes teaching suggestions in many of her articles. In addition, her 2018
monograph provides a practical introduction to discourse intonation for ESL/EFL teachers.
Seven chapters include parts on pedagogical implications; the penultimate chapter includes
teaching suggestions and means for evaluating them. Two appendices developed by ESL
specialists provide additional elaborated activities.
7 Future Directions
This chapter advocates integrating the study of speaking into the study of L2 pragmatics so
regularly that speaking comes to be recognized as a pragmalinguistic resource. The question
is how to achieve that.
One starting point is to replicate a classic study (making it oral, if necessary), include spoken
characteristics in the analysis, and follow it up by including speech characteristics in the results.
A second route is for researchers with existing oral data to consider revisiting the re-
cordings, if they are of sufficient quality to do so. If they cannot be analyzed instrumentally
or auditorily, the recordings may suffice for a pilot for a new study.
A third type of investigation involves studying task effects, exploring how tasks influence the
spoken characteristics of speech act realization. While writing this chapter, I thought about the
extent to which the oral DCT might contribute to what we impressionistically described as
monotonic or monotonous production. Could recording oneself talking aloud to no one other
than the computer contribute to the observed lack of pitch variation? Kang et al. (2021) also
report monotonic production by some learners completing a similar task. On the other hand,
Pickering et al. (2012) report monotonous delivery of class lectures, which do have an
audience. However, the lectures of international teaching assistants are often not interactive in
the way that the lectures of NS teaching assistants are. Thus, the relevance of interlocutors and
their influence on speech act delivery remains to be investigated.
Finally, this is an area that calls for collaboration. Few L2 pragmatics researchers are
trained in acoustic analysis and few L2 speaking researchers are trained in pragmatics, although,
as this chapter shows, such individuals exist. Combining expertise within
the SLA subfields is therefore another route towards establishing research at the interface of L2
pragmatics and speaking.
Author Note
I wrote this chapter as a pragmatics researcher from the perspective of what speaking can
contribute to pragmatics. I would be pleased to read the mirror-image chapter, what prag-
matics can add to the research of speaking, but I do not have the expertise to write it.
Further Reading
Kang, O., & Kermad, A. (2019). Prosody in L2 pragmatics research. In N. Taguchi (Ed.), The
Routledge handbook of second language acquisition and pragmatics (pp. 78–92). Routledge.
Various approaches to understanding prosody are explored, along with the role of prosody in NS
pragmatics. L2 pragmatics and prosody are discussed.
Yates, L. (2017). Learning how to speak: Pronunciation, pragmatics and practicalities in the classroom
and beyond. Language Teaching, 50, 227–246.
Yates argues for integrating the teaching of pragmatics and pronunciation. This article presents a
pedagogical plan adaptable to any target language.
Kang, O., Kermad, A., & Taguchi, N. (2021). The interplay of proficiency and study abroad experience
on the prosody of L2 speech acts. Journal of Second Language Pronunciation.
An example of a pragmatics study integrating a speech act approach with a description of six char-
acteristics of delivery.
References
Alexander, S. T. (2011). Sincerity, intonation, and apologies: A case study of Thai EFL and ESL learners
[Unpublished doctoral dissertation]. Indiana University.
Austin, J. L. (1962). How to do things with words. Cambridge: Harvard University Press.
Bardovi-Harlig, K. (1999). Exploring the interlanguage of interlanguage pragmatics: A research agenda
for acquisitional pragmatics. Language Learning, 49, 677–713.
Bardovi-Harlig, K. (2009). Conventional expressions as a pragmalinguistic resource: Recognition
and production of conventional expressions in L2 pragmatics. Language Learning, 59,
755–795.
Bardovi-Harlig, K. (2010). Exploring the pragmatics of interlanguage pragmatics: Definition by design.
In A. Trosborg (Ed.), Pragmatics across languages and cultures (Vol. 7 of Handbooks of pragmatics;
pp. 219–259). Berlin: Mouton de Gruyter.
Bardovi-Harlig, K. (2012). Pragmatics in SLA. In S. M. Gass & A. Mackey (Eds.), The Routledge
handbook of second language acquisition (pp. 147–162). London: Routledge/Taylor & Francis.
Bardovi-Harlig, K. (2013). On saying the same thing: Assessing the production of conventional
expressions in L2 pragmatics. Pragmatics and Language Learning, 13, 191–211.
Bardovi-Harlig, K. (2015). Disinvitations: You’re not invited to my birthday party! Journal of
Pragmatics, 75, 91–110.
Bardovi-Harlig, K. (2015). Operationalizing conversation in studies of instructional effects in L2
pragmatics. System, 48, 21–34.
Bardovi-Harlig, K. (2018). Matching modality in L2 pragmatics research design. System, 75, 13–22.
Bardovi-Harlig, K., & Hartford, B. S. (1996). Input in an institutional setting. Studies in Second
Language Acquisition, 18, 171–188.
Blum-Kulka, S., House, J., & Kasper, G. (Eds.) (1989). Cross-cultural pragmatics: requests and
apologies. Norwood, NJ: Ablex.
Burns, A. (2013). A holistic approach to teaching speaking in the language classroom. In M. Olofsson
(Ed.), Symposium 2012: Lärarrollen i svenska som andraspråk (pp. 165–178). Stockholm: Stockholms
universitets förlag.
Couper-Kuhlen, E. (1986). An introduction to English prosody. Tübingen: Niemeyer.
Derwing, T. M., Waugh, E., & Munro, M. J. (2021). Pragmatically speaking: Preparing adult ESL
students for the workplace. Applied Pragmatics, 3, 107–135.
Diepenbroek, L., & Derwing, T. M. (2013). To what extent do popular ESL textbooks incorporate oral
fluency and pragmatic development? TESL Canada Journal, 30, 1–20.
Escandell-Vidal, V., & Prieto, P. (2021). Pragmatics and prosody in research on Spanish. In D. A.
Koike, & J. C. Félix-Brasdefer (Eds.), The Routledge handbook of Spanish pragmatics (pp. 149–166).
New York: Routledge.
Gass, S. M. & Houck, N. (1999). Interlanguage refusals: A cross-cultural study of Japanese English.
Berlin: Mouton de Gruyter.
Goh, C. & Burns, A. (2012). Teaching speaking: A holistic approach. New York: Cambridge University Press.
Grice, H. P. (1975). Logic and Conversation. In P. Cole & J. Morgan (Eds.), Speech Acts (Syntax and
Semantics, Vol. 3, pp. 41–58). New York: Academic Press.
Gumperz, J. (1982). Discourse strategies. Cambridge: Cambridge University Press.
Hirschberg, J. (2017). Pragmatics and prosody. In Y. Huang (Ed.), The Oxford handbook of pragmatics
(pp. 532–549). Oxford: Oxford University Press.
House, J. (1996). Developing pragmatic fluency in English as a foreign language: Routines and metapragmatic
awareness. Studies in Second Language Acquisition, 18, 225–252.
Ishihara, N., & Cohen, A. D. (2014). Teaching and learning pragmatics: Where language and culture
meet. Abingdon, UK: Routledge.
Kang, O., & Kermad, A. (2019). Prosody in L2 pragmatics research. In N. Taguchi (Ed.), Routledge
handbook of SLA and pragmatics (pp. 78–92). New York: Routledge.
Kang, O., Kermad, A., & Taguchi, N. (2021). The interplay of proficiency and study abroad experience
on the prosody of L2 speech acts. Journal of Second Language Pronunciation.
Kasper, G., & Rose, K. R. (2002). Pragmatic development in a second language. Oxford: Blackwell.
Koike, D. A., & Félix-Brasdefer, J. C. (Eds.) (2021). The Routledge handbook of Spanish pragmatics. New York:
Routledge.
Myles, F., Hooper, J., & Mitchell, R. (1998). Rote or rule? Exploring the role of formulaic language in
classroom foreign language learning. Language Learning, 48, 323–363.
PART IV
Teaching Speaking
18
SECOND LANGUAGE SPEAKING STRATEGIES
Sara Kennedy
1 Introduction/Definitions
In this chapter, second language (L2) speaking strategies are considered a subset of second
language communication strategies, with second language referring to any language(s) other
than a speaker’s dominant language(s) learned in childhood. Both communication strategies and
speaking strategies can be conceived of in quite focused or in broader terms, but speaking
strategies in this chapter encompass any spoken attempts to “enhance the effectiveness of
communication,” per Canale’s (1983) definition (as cited in Hung & Higgins, 2016, p. 903). As
elaborated in the next part, L2 speaking strategies can be conceived of narrowly, as something
used to fill a gap in communication, such as asking an interlocutor to clarify the meaning of an
utterance, or can be conceived of more widely as something done to enhance relations and
interaction between interlocutors, such as re-using an interlocutor’s words or phrases in a
subsequent turn. In Table 18.1, a set of L2 speaking strategies which have been commonly
grouped under a narrow (problem-oriented) approach are presented, followed by Table 18.2,
which presents a set of L2 speaking strategies more oriented towards supporting and enhancing
interaction between interlocutors. It is important to note that the problem-oriented strategies
could also be used to enhance interaction, but are here focused specifically on addressing
problems in L2 communication. A brief history of the framing and analysis of L2 speaking
strategies is described, followed by a discussion of some critical issues and topics in research on
L2 speaking strategies. Current themes in research on L2 speaking strategies are then outlined,
followed by the main research methods used in studies on L2 speaking strategies. Finally, some
recommendations for teaching practice and potential future directions in research are discussed.
2 Historical Perspectives
L2 speaking strategies (included under L2 communication strategies) were initially described as
errors resulting from speakers’ incomplete knowledge of the L2 (Richards, 1971), such as
inappropriate cross-linguistic transfer of first language (L1) vocabulary (e.g., enregistrates
[records] the light). Soon after, L2 speaking strategies became more widely viewed as speakers’
reactions to problems in communicating. These reactions were categorised according to surface-level
moves such as switching topics or reformulating messages (Ervin, 1979). Later researchers
incorporated cognitive processes into the analysis of L2 speaking strategies, such as speakers’
control of how meaning could be expressed even if knowledge of the L2 was insufficient, such
as using a word from the L1 (Bialystok, 1990).
Problem-Oriented
Interaction-Oriented
Strategy | Source
Let it pass – listener allows unclear words or utterances to “pass” unless understanding them becomes essential to communication | Firth (1996)
Build solidarity by using mutual L1 | Lauriks et al. (2015)
Convey warmth and care through voice quality | Jain and Krieger (2011)
Invitation to continue – A: “…the people don’t accept me and aaaa I don’t know mmmm what do you want me to say? (Laughing)” B: “what do you mean they don’t accept you?” | Jamshidnejad (2011)
Using different forms of address for rapport building – A: “Morning Dennis, small cup, darlin’?” B: “Small.” | Collier (2010)
Anticipating and completing an interlocutor’s phrase or utterance – A: “I am gonna ask him what what does it what does it” B: “consume” A: “yeah consume…” | Björkman (2014)
Ultimately, approaches to the study and analysis of L2 speaking strategies coalesced into
two spheres: a psycholinguistic perspective and an interactional perspective. In the psycholinguistic
perspective, L2 speaking strategies are explained through reference to cognitive and
linguistic models of language learning and use, such as Levelt’s model (1983, 1989, 1993,
1995; see de Bot & Bátyi, this volume). From this perspective, L2 speaking strategies are
typically viewed as steps taken to address speakers’ problems or challenges in communicating,
such as providing fillers or hesitation devices while searching for words or continuing
an utterance (e.g., uhh…the thing is…). From the interactional perspective, L2
speaking strategies are part of an overall enterprise of interlocutors jointly constructing and
achieving understanding (e.g., Firth, 1990). For example, one interlocutor who uses a lexical
item from her L1 while speaking an L2 is not solving a problem in production, but taking a
step with other interlocutors to jointly achieve orderly interaction by using all available
resources and reaching mutual understanding (Firth & Wagner, 1997).
Much research in the late 1990s and early 2000s focused on one of two areas: training
in L2 speaking strategies, which was generally framed in the psycholinguistic perspective
(e.g., Scullen & Jourdain, 2000), and studies about the use of English as a lingua franca
(ELF), which explained the use of L2 speaking strategies as resources jointly used by interlocutors
to enable successful communication using ELF (e.g., House, 2003). Computer-mediated
communication also became a more frequent context for research on L2
communication strategies in the early 2000s; however, it was only in the 2010s, when technological
advances allowed for widespread use of digitally-based video communication, that
the use of L2 speaking strategies in computer-mediated communication began to be more
widely studied, especially in post-secondary settings (e.g., Shih, 2014). This research generally
has taken a psycholinguistic perspective in identifying and classifying the use of L2 speaking
strategies, while research in ELF settings continues to frame L2 speaking strategy use from
an interactional perspective, with an increasing variety of research contexts such as professional
workplaces, customer service centres, and small enterprises (e.g., Collier, 2010).
Many of the critical issues and topics discussed in the next part build on and extend the
more recent L2 speaking strategy research on pedagogical interventions, descriptions of
second language interactions, and Internet and mobile technology.
speech event, depending on the intent and interpretation of the interlocutors and the purpose
of the speech event, as well as other individual and contextual factors such as proficiency
level or membership in a community of practice. The influence of different individual or
contextual factors, such as creativity level or role in the workplace, on speakers’ use of L2
speaking strategies has been established (e.g., Carter Pipes, 2019; Collier, 2010); however, the
outcome of that use is rarely explored, likely because of the difficulty of determining how the
use of a particular speaking strategy has affected communication, interlocutors’ communicative
goals, or interlocutors’ relationships. It is easier to measure changes in the use of
particular L2 speaking strategies than to measure effective use of L2 speaking strategies, but
strategies are used to address a challenge or a purpose in a communicative speaking event. It
is important not only that L2 speakers learn to use L2 speaking strategies more frequently,
but also that they use those strategies in ways that benefit their purpose in speaking.
Therefore, teachers and researchers need to more consistently explore the reasons why L2
speakers use particular speaking strategies and also explore interlocutors’ perceptions of the
effects of employing those strategies on the communication itself and on relationships between
interlocutors (see Hung & Higgins, 2016). Few teachers or researchers would suggest
that word coinage, for example, is a more effective L2 speaking strategy for all interlocutors
and in all contexts than the strategy of asking for assistance. However, more attention has
been paid to how focused instruction can increase the use of particular L2 speaking strategies
than to how L2 speakers effectively use a range of L2 speaking strategies, depending on their
capacities and contexts.
One domain where an increasing amount of L2 speaking strategy research is being done is
Internet and mobile technology. Researchers have found that L2 learners’ computer- or
mobile device-mediated use of L2 speaking strategies may differ in clear ways from L2
learners’ use of L2 speaking strategies in face-to-face interaction (e.g., Smith, 2003). The
modalities of interaction using computers or mobile devices can take many forms, from two-way
voice communication to audio-video communication to communication in virtual reality
settings. These different technologies contribute various affordances, “constraining and
enabling aspects that are brought about by technological artifacts” (Rosenbaun, 2016, p. 7).
In some synchronous computer-mediated communication, the interlocutors may not be
known to each other, may not be physically visible to each other (even if avatars are visible),
and may not demonstrate a clearly defined purpose in communicating except to use the
technology (e.g., Skypecasts in Brandt & Jenks, 2013), so the use of L2 speaking strategies
can be quite different from face-to-face interaction or even telephonic interaction, where one
party has purposely contacted another party. Additionally, the possible combination of L2
speaking strategies together with communication cues of other kinds (e.g., using written text
and symbols, gestures by avatars) can support spoken communication in ways that are less
common in face-to-face communication (Shih, 2014).
Users of Internet and mobile technologies may be in communication even if they are not
affiliated through jobs or careers, schooling, or geographic regions. The use of the technologies
themselves may contribute to what has been called Transient International Groups
(Pitzl, 2019) or Transient Multilingual Communities (Mortensen, 2017), where an L2 speaker
might not interact repeatedly with another speaker and so build a relationship, but might
interact transiently with other speakers, with no clearly shared purpose or shared rules for
engagement. The nature of these interactions may up-end typical expectations for the use of
L2 speaking strategies. For example, an account executive and an international client
speaking via a video call may each expect L2 speaking strategies to be used in somewhat
predictable ways, due to the established relationship and communication norms and
(perhaps partial) joint purpose in communicating; however, two interlocutors who encounter
each other via a massively multiplayer online game, such as World of Warcraft, might use L2
speaking strategies in quite different ways from one interaction to the next, due to the lack of
established norms of communicating in an environment with the potential for many transient
encounters (see Jenks, 2009, for an early exploration of this concept). It is important,
therefore, that researchers who study L2 speaking strategies also focus on communicative
technologies which afford transient (virtual) encounters between L2 speakers who may not
want or need to establish shared norms for communication with their interlocutors. The use
and function of L2 speaking strategies in these environments may be noticeably more
idiosyncratic and distinctive to a given interaction.
interlocutors, including speaking strategies which were goal-directed and strongly connected
to interlocutors’ actions and speech. Research on L2 speaking strategies will continue to
intensify in these multiplayer online gaming and virtual environments, which offer contexts
for interaction where interlocutors may differ in their levels of target language proficiency or
of familiarity with each other and with the online environment (see part on Transient
Multilingual Communities); this contrasts with many classroom settings, where patterns of
interaction are more structured and more familiar to interlocutors.
Another area where study of L2 speaking strategies continues to grow is research in ELF
settings. In these contexts, speakers from a range of L1s use English primarily for communicative
purposes, rather than in pedagogical settings. The ways speaking strategies in English
are used may therefore be less informed by speaking according to L1 English norms and more
by using English to meet interlocutors’ communicative purposes. Ehrenreich (2018) explained
how the concept of community of practice was useful for ELF research. To constitute a
community of practice, groups of people must interact regularly, must have a joint goal or
purpose that guides the interrelated actions of group members, and must have a “shared
repertoire” (Wenger, 1998, as cited in Ehrenreich, 2018, p. 43), a set of resources for negotiating
meaning within the group, which includes linguistic resources. In post-secondary academic
settings, Björkman (2014) found that students using ELF while doing group work at a Swedish
university used speaking strategies mainly focused on confirming or checking accurate understanding
(e.g., clarification requests), rather than speaking strategies focused on nativelike
use of English (e.g., word replacement). In business settings, Firth (1996) found that in sales
phone calls between ELF users, non-standard or potentially unclear utterances were often not
singled out by interlocutors (let it pass), but Tsuchiya and Handford (2014) noted that in a
multiparty meeting on designing a large bridge, two members of the meeting regularly requested
clarification of or reformulated unclear or ambiguous utterances from other members.
Tsuchiya and Handford suggested that the occurrence of these potentially face-threatening
strategies may have been due to the previously-attested combative nature of construction
communication (communication in construction-related fields), the gender of the participants
(all male), and the belief of one of the two members (the Chair of the meeting) that he had a
special responsibility to ensure that all meeting members understood what was being said.
Clearly, the use of L2 speaking strategies is influenced by the context of target language use;
where the target language is being used as a lingua franca (with little to no reference to L1
speaker norms), the use of L2 speaking strategies might be quite different from their use in an
L2 or language learning context. Additionally, if interaction is taking place in communities of
practice with specific purposes and repertoires for negotiating meaning and understanding, the
use of L2 speaking strategies might be noticeably different amongst communities because of
the range of purposes and repertoires appropriate to each community. The context for use of
L2 speaking strategies is as important as the nature of the strategies themselves.
relatively more influence over the environment and the recording quality, as opposed to
workplaces, homes, or other naturalistic settings where ambient noise can affect the intelligibility
of recorded speech. However, the use of recordings from naturalistic settings is more
likely to reflect situated and contextualized L2 speech, where the nature of the surrounding
environment and interlocutors may considerably shape how, when, and which speaking
strategies are used. The recall of L2 speaking strategy use (via retrospection) is less common
than the recorded use of L2 speaking strategies in data collection, likely because of the risk of
speakers forgetting or not noting use of strategies during a specific speaking task; “if the task is
complicated or takes a lot of time, the participant can forget some of the mental processes that
occurred” (Perry, 2011, p. 119). Stimulated recall, where speakers are presented with excerpts
of their recorded speech and are asked to describe their thoughts at the time, can help speakers’
recollection of the strategies they used (e.g., Lam, 2010; Poulisse et al., 1987). Alternatively,
researchers can collect self-reports of recalled speaking strategies in contexts where recording
speech is not practical or feasible. Self-reports from questionnaires (e.g., Kongsom, 2009) can
also be used as teaching opportunities for raising learners’ consciousness of different types of
L2 speaking strategies that they already use or could start to use.
The identification and analysis of L2 speaking strategies have changed over time. Early frameworks
identified speaking strategies according to their surface-level characteristics, such as
word coinage or topic switch (Ervin, 1979). Later frameworks incorporated speakers’ cognitive
processes, for example the source of information (e.g., the speaker’s L1) which speakers drew upon
to produce L2 speaking strategies (Bialystok, 1983). Psycholinguistic models of language
learning and use were incorporated beginning in the late 1980s to categorise ways in which L2
speaking strategies reflected psycholinguistic processes (Bongaerts & Poulisse, 1989). All of these
frameworks focused on individual speakers and the means by which they addressed the challenges
in speaking in an L2 (a cognitive/psycholinguistic approach). However, another approach
treats the use of L2 speaking strategies not as the actions of individual speakers in
response to possible problems in communicating, but as the interactive use of resources by
interlocutors as a natural outcome of communication, accomplished by interlocutors
co-constructing and jointly achieving understanding (Firth, 1990). With this approach to analyzing
L2 speaking strategies, the L2 communication analyzed must be interactive (i.e., with a person
listening to the speaker) and naturalistic (not elicited or guided by a researcher or prompt).
Generally speaking, speech samples analyzed for L2 speaking strategies range across a
continuum from monologic or interactive speech elicited from prompts in labs or classrooms
to unguided speech authentically used in naturalistic settings, whether the setting is a
classroom, a family residence, a restaurant, or a video call. L2 speaking strategies can be
identified and analyzed according to their surface-level characteristics (e.g., circumlocution,
language switch), to cognitive processes or psycholinguistic models which are presumed to
underlie the use of those speaking strategies, or according to a more qualitative description
of the resources used by interlocutors in the process of accomplishing orderly interaction in a
given context. The choice of speech samples, analytic frameworks, and data collection
methods (recorded versus recalled) can depend on practical considerations of collecting and
analyzing data, the researcher’s questions, and the researcher’s explicit or implicit perspective
on how languages are primarily learned, used, and analyzed, whether it be from a cognitive
basis, a constructivist (interactional) basis, or a combination of the two.
many L2 speaking strategies have been shown to be learnable through explicit instruction
(e.g., circumlocution, use of all-purpose words), it is strategies related to L2 oral fluency,
such as filling pauses and minimizing hesitations (e.g., “let me see…”), which were learned in
multiple studies in different instructional contexts. These fluency-related strategies may
simply be relatively easy to adopt, and thus “low-hanging fruit” for teachers. Other L2
speaking strategies, such as re-structuring, may be more challenging for L2 speakers to use
proficiently. The acquisition and use of speaking strategies that require a degree of individual
linguistic flexibility or creativity, such as paraphrasing or word coinage, or which involve an
interlocutor, such as appealing for help or rephrasing an interlocutor’s utterance, may be
influenced by individual differences (e.g., vocabulary or syntactic knowledge, extroversion,
or confidence). They may also be influenced by contextual characteristics (e.g., the status or
mutual familiarity of interlocutors, the topic, or each speaker’s assumptions about pragmatic
norms for communication). Although studies on L2 speaking strategy instruction are
longitudinal by nature, with typical instructional periods measured in weeks and months, the
long-term impact of instruction is rarely examined (Guo, 2011). What is clear is that L2
speakers receiving explicit instruction in L2 speaking strategies do produce at least some of
the instructed L2 speaking strategies more frequently. However, the learnability of particular
L2 speaking strategies may be influenced by many factors, including the target language, L2
speakers’ L1, their level of L2 proficiency, or the sociopragmatic context for communication.
While individual L2 speaking strategies may be more or less learnable, it remains unclear
which L2 speaking strategies are important to learn. To my knowledge, the effectiveness of
using particular L2 speaking strategies in particular interactions has not been examined.
Several researchers have analyzed the effects of the use of specific strategies on interlocutors’
cognitions or on the interaction itself, using an etic (external to the interaction) or emic (from
an interlocutor’s perspective) approach (e.g., Chang & Liu, 2016; Kaur, 2011). Nevertheless,
findings from these studies are not meant to represent the utility or effectiveness of particular
speaking strategies; rather, the effect of using a specific L2 speaking strategy is analyzed within
the context of the particular interaction with those particular interlocutors. How, then, are
teachers to decide on which L2 speaking strategies to prioritise for explicit instruction?
Clearly, no L2 speaking strategies have the same effect on L2 communication in every
communicative context. Rather than focusing on a restricted set of L2 speaking strategies as
instructional targets, teachers might consider the purposes for which their L2 learners will
need or want to communicate: to have their L2 speech assessed, to communicate while
travelling, to find a job, etc. The goal for L2 learners would be to develop the flexibility to
select from a range of L2 speaking strategies so as to use ones that support learners’ communicative
purposes. For example, an L2 learner seeking a job may feel that fluent speech is
more important than complex speech, so may use speaking strategies such as filling pauses
and using formulaic time-gaining phrases (e.g., let me see).
Teachers are familiar with target language contexts and, potentially, with challenges that
learners may face in learning to use relevant strategies. One approach to address the use of
different L2 speaking strategies in a given situation could be for teachers to present examples,
whether authentic or instructional, of specific communicative situations such as retail
transactions to raise awareness of the use or potential use of L2 speaking strategies. L2
learners working in a retail setting, for instance, could potentially build rapport through
repeating interlocutors’ language, or could address difficulties with lexical retrieval by
paraphrasing. Once learners’ awareness is raised about possibilities for using speaking
strategies in the L2, learners could engage in guided or more spontaneous speaking in a
similar communicative situation; learners could then analyze how they used or could have
used L2 speaking strategies for particular purposes, whether by recalling their speech or by
listening to recordings of it. This cycle of awareness-raising, practice, and analysis can be
repeated for similar or different communicative situations in order to enhance learners’ repertoire
and flexibility in the selection of L2 speaking strategies.
Unfortunately, there is no published evidence for specific L2 speaking strategies which are
generally more communicatively effective than other strategies. L2 speakers, like other
speakers, can analyze the communicative effects of using particular speaking strategies by
reflecting on particular instances of communication. Teachers who know that their learners
struggle in particular areas of L2 speaking, such as lexical retrieval, might want to focus instruction
on very specific speaking strategies to address those struggles; however, most teachers
and L2 learners cannot easily predict which speaking strategies will be needed in L2 communication.
L2 speakers who are resourceful, adaptable, and sensitive to particular communicative
needs will be better placed to use L2 speaking strategies which meet their needs.
7 Future Directions
Much of the early research on L2 speaking strategies was set in classrooms, whether as
snapshots of strategy use by L2 learners, or as studies of effects of explicit instruction on L2
speaking strategies. To date, little published research has examined the long-term effects of
explicit instruction on the use of L2 speaking strategies, or L2 speakers’ longitudinal learning
of L2 speaking strategies in naturalistic (non-instructed) settings. These are valuable topics
for future investigation, especially given the rise in the use of Internet and virtual technology
for professional, educational, and recreational purposes.
These technologies allow for speech and interaction for some of the same purposes as off-line
contexts, such as sales calls or training sessions, and have affordances for the use of
many different modalities at the same time, such as images, video, and text. These multimodal
communications, which can often be recorded, accentuate the combination of L2
speaking strategies with other, non-speech elements, which contribute in their entirety to the
communication. Gullberg (2006) explored this combination in face-to-face interaction, but
the surge in online multimodal communication means that the L2 speaking strategies could
and should be examined in conjunction with other communication cues, as in Wigham and
Chanier (2013), who found that some L2 users of Second Life often used their avatars to
help convey the message of their next utterance, such as moving their avatars towards a
certain area or using their avatars to point to specific objects.
More research is being done on authentic use of L2 speaking strategies outside classroom
contexts, especially in international work and academic settings (e.g., Björkman, 2014;
Du-Babcock, 2013); however, the examination of how the use of L2 speaking strategies is influenced
by social and political environments and interlocutors’ status (e.g., Lauriks et al.,
2015) is still an emergent area ripe for further research. Similarly, research on Transient
Multilingual Communities has the potential to extend our knowledge of how L2 speaking
strategies are used in contexts where interactional norms may not be established or shared.
Finally, it would be interesting to explore the effects of pedagogical interventions focusing
on preparing L2 learners to use L2 speaking strategies to react to ongoing communicative
or interactional needs. That is, explicit instruction would centre not only on
particular L2 speaking strategies, but on guiding learners to recognize their communicative
or interactional needs during particular L2 spoken interaction, to draw on their available
strategic resources, and to reflect on the effects of their use of L2 speaking strategies. When
L2 speakers can engage in communication using whichever strategic resources they feel are
suitable, they will be more autonomous and more adaptable to the changing demands of
authentic L2 communication.
Further Reading
Guido, M. G. (2012). ELF authentication and accommodation strategies in crosscultural immigration
encounters. Journal of English as a Lingua Franca, 1(2), 219–240.
An observational study of English interactions between Italian immigration officials and asylum-
seekers from African countries.
Kennedy, S. & Trofimovich, P. (2016). Research timeline: Second language communication strategies.
Language Teaching, 49(4), 494–512.
A timeline describing important concepts and research studies from the 1970s to the mid-2010s.
Sato, T., Yujobo, Y. J., Okada, T., & Ogane, E. (2019). Communication strategies employed by
low-proficiency users: Possibilities for ELF-informed pedagogy. Journal of English as a Lingua Franca,
8(1), 9–35.
Lower-proficiency learners of English, a population not often targeted in L2 speaking strategy research,
are observed in paired, task-based interaction with L1 English speakers, to examine the speaking
strategies learners use without instruction and to measure the effectiveness of those, with suggestions
provided for L2 speaking strategies to encourage or discourage for these low-proficiency learners.
Soekarno, M., & Ting, S. H. (2020). Fluency and communication strategy use in group interactions for
occupational purposes. Journal of English Language Teaching Innovations and Materials (JELTIM),
2(2), 63–84.
An exploration of the effects of extended explicit instruction in L2 speaking strategies in an occupational
(culinary) program in a hospitality college in Malaysia. Although participants received training on 11
different strategies over 12 weeks, only the use of time-gaining fillers and the repetition of words for
various purposes were used repeatedly by all participants across three data collection times.
References
Bialystok, E. (1983). Some factors in the selection and implementation of communication strategies. In
C. Faerch & G. Kasper (Eds.), Strategies in Interlanguage Communication (pp. 100–118). London:
Longman.
Bialystok, E. (1990). Communication strategies: A psychological analysis of second-language use. Oxford:
Blackwell.
Birlik, S., & Kaur, J. (2020). BELF expert users: Making understanding visible in internal BELF
meetings through the use of nonverbal communication strategies. English for Specific Purposes,
58, 1–14.
Björkman, B. (2014). An analysis of polyadic English as a lingua franca (ELF) speech: A commu-
nicative strategies framework. Journal of Pragmatics, 66, 122–138.
Bøhn, H., & Myklevold, G. A. (2018). Exploring communication strategy use and metacognitive
awareness in the EFL classroom. In Å. Haukås, C. Bjørke, & M. Dypedahl (Eds.), Metacognition in
language learning and teaching (pp. 179–203). New York: Routledge.
Bongaerts, T. & Poulisse, N. (1989). Communication strategies in L1 and L2: Same or different?
Applied Linguistics, 10(3), 253–268.
Brandt, A., & Jenks, C. (2013). Computer-mediated spoken interaction: Aspects of trouble in
multi-party chat rooms. Language@Internet, 10(5). https://www.languageatinternet.org/
Chang, S.-Y. & Liu, Y. (2016). From problem-orientedness to goal-orientedness: Re-conceptualizing
communication strategies as forms of intra-mental and inter-mental mediation. System, 61, 43–54.
Collier, S. (2010). Getting things done in the L1 and L2: Bilingual immigrant women’s use of com-
munication strategies in entrepreneurial contexts. Bilingual Research Journal, 33(1), 61–81.
Dörnyei, Z. (1995). On the teachability of communication strategies. TESOL Quarterly, 29(1), 55–85.
Du-Babcock, B. (2013). English as Business Lingua Franca: A comparative analysis of communication
behavior and strategies in Asian and European contexts. Ibérica, Revista de la Asociación Europea de
Lenguas para Fines Específicos, 26, 99–130.
Ehrenreich, S. (2018). Communities of practice and English as a lingua franca. In J. Jenkins, W. Baker,
& M. Dewey (Eds.), The Routledge handbook of English as a Lingua Franca (pp. 37–50). London:
Routledge.
Ervin, G. L. (1979). Communication strategies employed by American students of Russian. The
Modern Language Journal, 63(7), 329–334.
Fernández Dobao, A. (2012). Collaborative dialogue in learner–learner and learner–native speaker
interaction. Applied Linguistics, 33(3), 229–256. doi: 10.1093/applin/ams002
Firth, A. (1990). ‘Lingua franca’ negotiations: Towards an interactional approach. World Englishes,
9(3), 269–280.
Firth, A. (1996). The discursive accomplishment of normality: On ‘lingua franca’ English and con-
versation analysis. Journal of Pragmatics, 26(2), 237–259.
Firth, A. & Wagner, J. (1997). On discourse, communication, and some fundamental concepts in SLA
research. The Modern Language Journal, 81(3), 285–300.
Gullberg, M. (2006). Handling discourse: Gestures, reference tracking, and communication strategies in
early L2. Language Learning, 56(1), 155–196.
Guo, J. (2011). Empirical studies on L2 communication strategies over four decades: Looking back and
ahead. Chinese Journal of Applied Linguistics, 34(4), 89–106.
House, J. (2003). Misunderstanding in intercultural university encounters. In J. House, G. Kasper & S.
Ross (Eds.), Misunderstanding in social life: Discourse approaches to problematic talk (pp. 41–104).
London: Pearson.
Hsieh, A. F.-Y. (2014). The effect of cultural background and language proficiency on the use of oral
communication strategies by second language learners of Chinese. System, 45, 1–16.
doi: 10.1016/j.system.2014.04.002
Hung, Y.-W. & Higgins, S. (2016). Learners’ use of communication strategies in text-based and video-
based synchronous computer-mediated communication environments: Opportunities for language
learning. Computer Assisted Language Learning, 29(5), 901–924. doi: 10.1080/09588221.2015.1074589
Jain, P., & Krieger, J. L. (2011). Moving beyond the language barrier: The communication strategies
used by international medical graduates in intercultural medical encounters. Patient Education and
Counseling, 84(1), 98–104.
Jamshidnejad, A. (2011). Developing accuracy by using oral communication strategies in EFL inter-
actions. Journal of Language Teaching and Research, 2(3), 530–536. doi: 10.4304/jltr.2.3.530
Jenks, C. J. (2009). Getting acquainted in Skypecasts: Aspects of social organization in online chat
rooms. International Journal of Applied Linguistics, 19, 26–46.
Kaur, J. (2011). Raising explicitness through self-repair in English as a lingua franca. Journal of
Pragmatics, 43(11), 2704–2715. doi: 10.1016/j.pragma.2011.04.012.
Kennedy, S. (2017). Using stimulated recall to explore the use of communication strategies in English
lingua franca interactions. Journal of English as a Lingua Franca, 6(1), 1–27.
Kongsom, T. (2009). The effects of teaching communication strategies to Thai learners of English. In
G. Raţă (Ed.), Language education today: Between theory and practice (pp. 154–168). Newcastle
upon Tyne, UK: Cambridge Scholars.
Kouwenhoven, H., Ernestus, M., & van Mulken, M. (2018). Communication strategy use by Spanish
speakers of English in formal and informal speech. International Journal of Bilingualism, 22(3),
285–304.
Lam, W. Y. K. (2010). Implementing communication strategy instruction in the ESL oral classroom:
What do low-proficiency learners tell us? TESL Canada Journal, 27(2), 11–30.
Lauriks, S., Siebörger, I., & De Vos, M. (2015). “Ha! Relationships? I only shout at them!” Strategic
management of discordant rapport in an African small business context. Journal of Politeness
Research, 11(1), 7–39.
Levelt, W. J. M. (1983). Monitoring and self-repair in speech. Cognition, 14(1), 41–104.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Levelt, W. J. M. (1993). Language use in normal speakers and its disorders. In G. Blanken,
J. Dittmann, H. Grimm, C. Marshall & C.-W. Wallesch (Eds.), Linguistic disorders and pathologies
(pp. 1–15). Berlin: de Gruyter.
Levelt, W. J. M. (1995). The ability to speak. From intentions to spoken words. European Review,
3(1), 13–23.
Mortensen, J. (2017). Transient multilingual communities as a field of investigation: Challenges and
opportunities. Journal of Linguistic Anthropology, 27(3), 271–288.
Nakatani, Y. (2005). The effects of awareness-raising training on oral communication strategy use. The
Modern Language Journal, 89(1), 76–91.
Newgarden, K., & Zheng, D. (2016). Recurrent languaging activities in World of Warcraft: Skilled
linguistic action meets the Common European Framework of Reference. ReCALL, 28(3), 274–304.
Perry, F. (2011). Research in Applied Linguistics: Becoming a discerning consumer (2nd edn). New York:
Routledge.
Pipes, A. (2019). Examining creativity as an individual difference in second language production [Doctoral
dissertation, Georgetown University]. Georgetown University Institutional Repository.
Pitzl, M.-L. (2019). Investigating communities of practice (CoPs) and transient international groups
(TIGs) in BELF contexts. Iperstoria, 13. https://iperstoria.it/index
Poulisse, N., Bongaerts, T., & Kellerman, E. (1987). The use of retrospective verbal reports in the
analysis of compensatory strategies. In C. Faerch & G. Kasper (Eds.), Introspection in second lan-
guage research (pp. 213–229). Clevedon, Avon: Multilingual Matters.
Rabab’ah, G. (2016). The effect of communication strategy training on the development of EFL
learners’ strategic competence and oral communicative ability. Journal of Psycholinguistic Research,
45, 625–651. doi: 10.1007/s10936-015-9365-3
Richards, J. (1971). Error analysis and second language strategies. Language Sciences, 17(1), 12–22.
Rosenbaun, L. (2016). Interaction management in recreational video-mediated communication:
Participation, multiactivities, and visual playfulness in multiparty Google+ Hangouts. [Doctoral dis-
sertation, University of Haifa].
Rossiter, M. J. (2003). The effects of affective strategy training in the ESL classroom. TESL-EJ,
7(2), 1–20.
Scullen, M. E., & Jourdain, S. (2000). The effect of explicit training on successful circumlocution: A
classroom study. In J. F. Lee & A. Valdman (Eds.), Form and meaning: Multiple perspectives
(pp. 231–252). Boston: Heinle & Heinle.
Shih, Y.-C. (2014). Communication strategies in a multimodal virtual communication context. System,
42(1), 34–47.
Smith, B. (2003). The use of communication strategies in computer-mediated communication. System,
31(1), 29–53.
Tsuchiya, K., & Handford, M. (2014). A corpus-driven analysis of repair in a professional ELF
meeting: Not ‘letting it pass’. Journal of Pragmatics, 64, 117–131.
Wigham, C. R., & Chanier, T. (2013). A study of verbal and nonverbal communication in Second Life-
the ARCHI21 experience. Computer Assisted Language Learning, 28(3), 260–283.
Zhang, W. & Liu, M. (2013). Evaluating the impact of oral test anxiety and speaking strategy use on
oral English performance. Journal of Asia TEFL, 10(2), 115–148.
19
TEACHING VOCABULARY
Marlise Horst
1 Introduction/Definitions
Knowing the words one wants to say is clearly at the very heart of what it means to speak a
new language. The centrality of second language (L2) vocabulary knowledge is vividly il-
lustrated in a 2008 study by Hilton in which learners were asked to describe a clip from a
silent movie. The closely transcribed speech data feature painfully long silences and deep
sighs as speakers search for the language they need. Interestingly, Hilton found that an
overwhelming proportion of the hesitations lasting three seconds or longer could be ascribed
to vocabulary problems. The learners were unable to retrieve a needed word or settled on a
wrong one. Hesitations ascribed to problems with phonology or grammar were relatively
few, the pauses were shorter, and the speakers were usually able to resolve the problems and
move on with their stories. But when the problems were lexical, the speaking tended to break
down. This chapter discusses what learners and their teachers can do to overcome the lexical
deficits that stand in the way of successful spoken communication. Techniques that have
shown their effectiveness in teaching and learning spoken L2 vocabulary are reviewed.
We begin with key concepts used to describe vocabulary and learners’ lexical knowledge,
drawing on an excerpt from Hilton’s 2008 study. Transcription codes have been removed for
simplicity; the speaker’s L1 is French (p. 161):
SPEAKER: …and [2 seconds] the result is [1 second] that uh the fridge [8 seconds]… I uh don’t
know uh uh [laughter] tomber (= English fall).
INTERVIEWER: falls down.
SPEAKER: tomber?
INTERVIEWER: mhmm falls down.
SPEAKER: falls down
INTERVIEWER: mhmm
SPEAKER: um [6 seconds] falls down uh [2 seconds] sur? (= English onto)
INTERVIEWER: onto
SPEAKER: on the car.
The exchange highlights the speaker’s need of a larger productive L2 vocabulary size and
knowledge of the English equivalent of French tomber, in particular. Productive vocabulary
size is defined as the number of L2 words a learner can say or write; it stands in contrast to
receptive vocabulary size, defined as the number of words a learner can recognize and as-
sociate with a basic meaning in reading or listening contexts. Research shows that learners
know more L2 words receptively than productively (Webb, 2008). Thus, it is possible that the
speaker in the example would recognize the basic meaning of fall when hearing it used, but,
as we see in the excerpt, cannot readily produce the spoken form. The tomber/fall connection
has not yet become fully automatized for production. The example makes the point that the
L2 word knowledge needed for speaking involves having a large mental store of concepts
linked to L2 forms and crucially, the ability to make form-meaning connections quickly.
There is evidence that these two go hand in hand. Uchihara and Saito (2019) found that
learners with higher productive vocabulary scores tended to speak with fewer pauses and
repetitions and at a faster tempo. In another study, the comprehensibility of L2 speech was
linked to speakers’ ability to use a wide variety of words appropriately with few pauses and
hesitations (Saito et al., 2016).
There is more to know about the form of an L2 word than its sound and spelling. In the
case of fall in the excerpt, the learner may have intended to say the fridge fell. Well-developed
knowledge of fall includes awareness of fell and its other forms falls, falling and fallen. To
take the multiple morphological forms of a word into account, vocabulary researchers have
devised the pedagogical unit word family. A word family is defined as a headword along with
its grammatical inflections and frequently used derivations (Webb & Nation, 2017). The
definition raises the question of what instruction should include: For example, in explaining
expect, a teacher might also raise awareness of derived forms such as expectant and un-
expectedly – depending on the level and needs of the learners.
Nation (2013) addresses the question of what it means to know a word in an often-cited
scheme that considers form, meaning, and use. Knowing a word’s form includes receptive
and productive knowledge of its spelling, pronunciation, and morphology. Knowledge of
meaning involves connecting a word form to a basic concept, but also knowing extended and
metaphorical senses of a word. Falling has the literal meaning of losing balance, but also
refers to a declining amount or dying in battle. An important use aspect is the word’s
grammatical patterning; like other intransitive verbs, fall is not followed by an object in
sentences. Another use consideration is collocation or co-occurrence, broadly defined as the
company a word keeps. Frequent collocates of fall are down and over; note that the learner in
the excerpt above was unable to produce the needed fall down onto sequence. The point of
Nation’s scheme is to show what “deep” end-state knowledge of a word looks like; it is not a
lesson plan. Instructors could hardly teach all the knowledge aspects in a single vocabulary
teaching episode and learners would be overwhelmed. But the scheme has (at least) two
important messages for teaching. One is that vocabulary knowledge has depth; there is more
to teach and learn than the basic meaning of a new word and its spoken form. The second is
to not expect too much initially; deep knowledge is acquired gradually over time. A single
instance of saying the new word aloud for students and providing a quick definition is un-
likely to result in their full-fledged ability to use the new L2 word in sentences of their own.
2 Historical Perspectives
language use for real-life purposes with activities that involve learners in problem-solving
tasks such as requesting information by phone or planning a class outing. Speaking is clearly
central in such activities, but questions have been raised about their usefulness for voca-
bulary learning. Bruton’s (2005) review of TBLT studies shows that the activities do not
result in the acquisition of much new vocabulary. But the focus on new words may be
missing the point. In addition to confirming the importance of teaching new vocabulary,
Nation’s discussion of key instructional “strands” (2013) highlights the value of commu-
nicative activities for learning vocabulary that is not new: Meaning-focused interactive tasks
usefully push learners to consolidate vocabulary that is partially known but not yet well
established. He also points to the value of activities that develop more fluent use of known
words by including a timed element that pushes speakers to perform faster. An example is the
4/3/2 activity in which learners tell the same story three times to different listeners with
decreasing time for each retelling. In their study of this activity, Arevart and Nation (1991)
found that with each retelling, learners produced more words per minute and there were
fewer hesitations.
O’Dell (1997) highlights a different concern in her discussion of communicative syl-
labus design: The vocabulary of task scenarios is often chosen haphazardly; a context for
practicing a particular language function is created with little regard to whether the vo-
cabulary that is introduced is useful to know. For example, a desert island activity in-
tended to practice planning structures (we will need/we won’t need) in a recent ESL
textbook I examined offers a set of pictured items including hat, matches, towel, flashlight,
GPS, thermos and sunscreen. In this set, thermos and sunscreen (and possibly others) are
infrequent words of questionable importance; it would have been easy enough to design
the activity to include more generally useful terms like water container and sun protection
instead. As the learners will use the words repeatedly in the activity, selecting “high-value”
vocabulary matters.
soften what might otherwise be blunt statements of fact. You know – the most frequently
occurring pragmatic chunk in both the CANCODE and the spoken part of the BNC (Shin
& Nation, 2008) – signals informality and points to understanding between the speaker and
the listener. It is worth noting that the core meaning of know (have information in your
mind) differs considerably from its meaning in you know as used in conversations to signal
“I’m thinking what to say next”, or “I wonder if you’ve understood my point”. From the
learner’s perspective, know and you know are arguably two different lexical entries. Shin and
Nation (2008) provide a useful list of 1,000 frequently occurring phrases based on analysis of
the spoken part of the BNC.
So far, we have seen that productive knowledge of around 3,000 frequent word families is
an important goal in attaining spoken proficiency in English. The learner’s task is clearly
very large (though much smaller than the 8,000 families Nation’s 2006 study indicates are
needed for comprehending written English). But the task is larger still given that word fa-
milies include multiple members. In principle, knowledge of enjoy entails also knowing de-
rived forms such as enjoyable and enjoyment and other members of the enjoy family. Learners
are also eventually expected to know multiple meanings of words and their use in colloca-
tions (fall apart, fall asleep) and expressions (fall in love, fall off the wagon). In practice,
research shows that even advanced learners have difficulty producing derived forms of fa-
miliar base words (Schmitt & Zimmerman, 2002) and collocation errors persist in the writing
of advanced university learners of English (Laufer & Waldman, 2011). Finally, if we consider
that fluent speech also involves knowing hundreds or even thousands of multi-word con-
versational chunks and their pragmatic uses, the learning task is very large indeed.
The next part describes research-informed teaching and learning techniques. The focus
is on ways of promoting the acquisition of large amounts of rapidly accessed spoken
vocabulary.
sounds used in English. There is evidence that learners avoid using phonologically difficult
words. Repeating the problem words aloud is useful; repetition makes them more familiar
(Ellis & Beaton, 1993). Hu (2008) suggests that raising learners’ phonological awareness in
activities such as reciting rhymes and segmenting words into sound units may also be
useful. Familiarizing learners with the stress of a new word is particularly important;
Field’s (2005) study showed that use of appropriate word stress plays a key role in making
L2 speech intelligible.
The connection between form and meaning is easily made when the new L2 word is a
helpful cognate. A cognate is a word that has a formal resemblance to a word in another
language; usually (but not always), there is a shared meaning. For example, for Dutch-
speaking learners, the English verb form fall is very similar to the Dutch label (vallen) for this
concept and this makes it easy to remember and produce. The Germanic roots of English and
its heavy borrowing of French, Latin and Greek vocabulary give speakers of many European
languages a distinct cognate advantage when learning the vocabulary of English. Other
languages have borrowed extensively from English; there are thousands of English loan-
words in Japanese (Daulton, 2008). The cognate advantage extends to morphology for many
European learners. The suffix -or is a noun-maker and -al an adjective-maker in both English
and Spanish; there are many other examples. Research shows that learners produce cognates
more easily than non-cognates (Rogers et al., 2015), but there is also evidence that learners
tend to underuse this helpful strategy. White and Horst (2012) found that showing learners
how to recognize cognates benefitted their word learning.
In the many cases where the L1 does not offer an obvious link to the sound of a new L2
word, teachers can work with learners to create mnemonics. The keyword method is an
often-cited example (Webb & Nation, 2017); it involves linking the meaning of a new L2
word to a sound-alike L1 word via an image. For example, a teacher of English-speaking
learners of French might say, “To remember flèche (the French word for arrow), picture an
arrow piercing your flesh”. In a study where L2 learners were directed to study and remember
word–picture pairs, the effectiveness of this technique was confirmed (Barcroft, 2009a).
However, the study also showed that using multiple strategies was effective for learning; these
included translating the words into the L1, visualizing the word and the matching picture,
and repeating the words silently.
Cognitive psychology research shows that when new information is first met, the number
and quality of elaborations are important in facilitating learning and retention (Anderson,
1990). This is an argument for presenting a new L2 word in varied ways: The teacher can say
it, write it, translate it, define it, and in the case of a word like earthquake, ask the learners if
they can see familiar parts in it. The learners can repeat the word, write down the spelling,
draw a picture, share a related personal experience, consider whether there is an L1 cognate,
act it out, and more. Richly elaborated processing contributes to making a new word
memorable by providing multiple mental pathways for retrieving it. Joe et al. (1996) suggest
that speaking activities based on texts are useful in this regard. When students reconstruct
what they have read in a retelling activity or act out the events in a role-play, there are
opportunities to produce new words from the text repeatedly, clarify them to a partner, and
use them in ways that vary from the original sentence contexts.
However, Barcroft warns against elaborated approaches to teaching new words (2002,
2009b). His TOPRA (type of processing – resource allocation) model hypothesizes that using
cognitive resources to work on one kind of learning reduces the amount of attention available
for attending to another. His research has shown that when activities focus learners’ attention
on the meanings of new words, acquisition of their forms is disadvantaged. For example, in an
investigation of TOPRA predictions (2009b), Barcroft asked Spanish-speaking learners of
English to read a passage that contained unknown target words along with their L1 transla-
tions. Some of the learners were also asked to write another Spanish synonym for each target,
which meant that their attention was focused closely on the meanings of the new English
words. Performance on a surprise spelling test given after the reading activity showed that
learners who had read the passage without the synonym task were substantially better at
providing accurate or near-accurate spellings of the target words.
The research discussed in this part presents an apparent contradiction: there are convincing
arguments for richly elaborated ways of presenting new vocabulary but also evidence that early
attention to word meanings may disadvantage the acquisition of word forms. Further research is
needed to clarify this issue. In the meantime, a possible pedagogical way forward is illustrated in
a suggestion for introducing the lexical chunk I’m sorry to beginning learners of English from
Calderón and Soto (2017, p. 38). Note that in the following sequence, the teacher begins with
practicing the spoken form only; then she gradually introduces semantic aspects.
The final steps in the sequence are explaining I’m sorry briefly using a short definition and
then providing two simple examples of situations in which people feel sorry.
Any activity that involves speaking provides opportunities for productive retrieval as
learners search their mental lexicons for the words they want to say, but activities can be
designed to enhance the retrieval aspect. Learners can be asked to put aside a recently studied
text and reconstruct it orally with a partner; this involves them in reusing new and partially
known words in the text multiple times. Reviewing previously taught vocabulary in class is
another important way of providing learners with added opportunities to retrieve words and
say them. How many productive retrievals are needed to develop fully automatized use of new
words in speech? So far, the answer is unclear, but studies of incidental vocabulary acquisition
through reading show that learners need to meet a new word repeatedly – as often as ten times
or more – in order for it to be learned receptively (Webb & Nation, 2017). We can assume that
the ability to use a new word in speaking also requires many repeated learning opportunities.
contexts where there was no special focus on vocabulary determined that one hour of time
spent in class typically results in knowledge of just four new word families.
6 Future Directions
Rote memorization of vocabulary seems at odds with current approaches to language teaching.
How it would fit into a rethought version of the communicative approach is unclear; course
designers might look to popular electronic vocabulary games for ways of making learning lists of
words meaningful and fun. Certainly, the well-established benefits of real spoken interaction in
communicative classrooms should not be compromised, but given the size of the spoken vo-
cabulary “syllabus” outlined in this chapter, the efficiency of old-fashioned memorization for
building a large amount of rapidly accessed word knowledge is too powerful to ignore.
As for future investigations of spoken L2 vocabulary development, there is a need for
studies that assess learning gains in terms of actual spoken production. Such studies are
surprisingly scarce; almost all the research discussed in this chapter used written measures.
While it may be reasonable to assume that ability to produce a recognizable spelling of a new
word means that the L2 learner can also say it, written production is hardly a satisfactory
basis for conclusions about the usefulness of learning activities for spoken production.
Finally, pedagogical studies conducted in actual language classrooms also proved to be
scarce. Given the centrality of speaking ability, research that sheds light on effective methods
for learning spoken vocabulary in real L2 classrooms is needed and important.
Further Reading
Hilton, H. E. (2008). The link between vocabulary knowledge and spoken L2 fluency. Language
Learning Journal, 36(2), 153–166.
Hilton draws on the findings of her speaking study to make a strong argument for the revival of direct
vocabulary teaching and memorization; she expresses strong doubts about the efficacy of commu-
nicative language teaching methods.
Horst, M. (2019). Focus on vocabulary learning. Oxford: Oxford University Press.
This volume expands on corpus-informed approaches to teaching vocabulary and other themes
outlined in this chapter. The emphasis is on learners of English aged 5–18 in schools, but the discussion is
relevant to many L2 learning contexts.
Joe, A., Nation, P., & Newton, J. (1996). Vocabulary learning and speaking activities. English Teaching
Forum, 34(1), 2–7.
This short, teacher-friendly paper remains a classic. It offers practical ideas for designing speaking
activities and explains the theory and research underpinning the suggested activities in an acces-
sible way.
Thornbury, S. (2002). How to teach vocabulary. Harlow, UK: Pearson Education Limited.
This book is popular with instructors in university training programs for teachers of English as a second
language. Thornbury has an engaging style; research-informed principles are supported by many ex-
amples and the volume is richly illustrated.
References
Anderson, J. R. (1990). Cognitive psychology and its implications (3rd edn). New York: Freeman.
Arevart, S., & Nation, P. (1991). Fluency improvement in a second language. RELC Journal,
22(1), 84–94.
Baddeley, A. (1990). Human memory. London: Lawrence Erlbaum Associates.
Barcroft, J. (2002). Semantic and structural elaboration in L2 lexical acquisition. Language Learning,
52, 323–363.
Barcroft, J. (2009a). Strategies and performance in intentional L2 vocabulary learning. Language
Awareness, 18(1), 74–89.
Barcroft, J. (2009b). Effects of synonym generation on incidental and intentional vocabulary learning
during reading. TESOL Quarterly, 43(1), 79–103.
Biber, D., Conrad, S., Reppen, R., Byrd, P., & Helt, M. (2002). Speaking and writing in the university:
A multidimensional comparison. TESOL Quarterly, 36(1), 9–48. doi: 10.2307/3588359
Bruton, A. (2005). Task based language learning: For the state secondary FL classroom? Language
Learning Journal, 31, 55–68.
Calderón, M., & Soto, I. (2017). Academic language mastery: Vocabulary in context. Thousand Oaks,
CA: Corwin.
Cobb, T., & Horst, M. (2011). Does Word Coach coach words? CALICO Journal, 28(3), 639–661.
Craik, F. I., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal
of Verbal Learning & Verbal Behavior, 11(6), 671–684.
Cummins, J. (2008). BICS and CALP: Empirical and theoretical status of the distinction. In B. Street &
N. H. Hornberger (Eds.), Encyclopedia of language and education, Volume 2: Literacy (2nd edn,
pp. 71–83). New York: Springer.
Daller, H., & Xue, H. (2007). Lexical richness and the oral proficiency of Chinese EFL students. In H.
Daller, J. Milton, & J. Treffers-Daller (Eds.), Modelling and assessing vocabulary knowledge
(pp. 150–164). Cambridge: Cambridge University Press.
Dang, T. N., Coxhead, A., & Webb, S. (2017). The Academic Spoken Word List. Language Learning,
67, 959–997.
Daulton, F. E. (2008). Japan’s built-in lexicon of English-based words. Clevedon: Multilingual
Matters.
De la Fuente, M. J. (2002). Negotiation and oral acquisition of L2 vocabulary: The roles of input and
output in the receptive and productive acquisition of words. Studies in Second Language Acquisition,
24(1), 81–112.
Elgort, I. (2011). Deliberate learning and vocabulary acquisition in a second language. Language
Learning, 61(2), 367–413.
Ellis, N. C., & Beaton, A. (1993). Psycholinguistic determinants of foreign language learning. Language
Learning, 43(4), 559–617.
Field, J. (2005). Intelligibility and the listener: The role of lexical stress. TESOL Quarterly, 39, 399–423.
Griffin, G. F., & Harley, T. A. (1996). List learning of second language vocabulary. Applied
Psycholinguistics, 17, 443–460.
Hart, B., & Risley, T. R. (2003). The early catastrophe. The 30 million word gap by age 3. American
Educator, 22, 4–9.
Hilton, H. E. (2008). The link between vocabulary knowledge and spoken L2 fluency. Language
Learning Journal, 36(2), 153–166.
Hilton, H. E., Osborne, N. J., Derive, M.-J., Suco, N., O’Donnell, J., Rutigliano, S., & Billard, S.
(2008). Corpus PAROLE. Chambéry, France: Université de Savoie.
Horst, M. (2019). Focus on vocabulary learning. Oxford: Oxford University Press.
Hu, C.-F. (2008). Rate of acquiring and processing L2 colour words in relation to phonological
awareness. Modern Language Journal, 92(1), 39–52.
Hulstijn, J. H. (2001). Intentional and incidental vocabulary learning: A reappraisal of rehearsal,
elaboration and automaticity. In P. Robinson (Ed.), Cognition and second language instruction
(pp. 258–287). Cambridge: Cambridge University Press.
Joe, A., Nation, P., & Newton, J. (1996). Vocabulary learning and speaking activities. English Teaching
Forum, 34(1), 2–7.
Krashen, S. (1982). Principles and practice in second language acquisition. Oxford: Pergamon.
Krashen, S. (1989). We acquire vocabulary and spelling by reading: Additional evidence for the input
hypothesis. Modern Language Journal, 73(4), 440–464.
Laufer, B. (1997). What’s in a word that makes it hard or easy: Some intralexical factors that affect the
learning of words. In N. Schmitt & M. McCarthy (Eds.), Vocabulary: Description, acquisition and
pedagogy (pp. 139–155). Cambridge: Cambridge University Press.
Laufer, B. (2006). Comparing focus on form and focus on formS in second-language vocabulary
teaching. Canadian Modern Language Review, 63(1), 149–166.
Laufer, B., & Waldman, T. (2011). Verb-noun collocations in second language writing: A corpus
analysis of learners’ English. Language Learning, 61(2), 647–672.
Milton, J. (2009). Measuring second language vocabulary acquisition. Bristol: Multilingual Matters.
Nakata, T. (2015). Effects of expanding and equal spacing on second language vocabulary learning.
Studies in Second Language Acquisition, 37(4), 677–711.
Nation, I. S. P. (1982). Beginning to learn foreign vocabulary: A review of the research. RELC Journal,
13(1), 14–36.
Nation, I. S. P. (2006). How large a vocabulary is needed for reading and listening? Canadian Modern
Language Review, 63(1), 59–82.
Nation, I. S. P. (2012). The BNC/COCA word family lists. Retrieved from
http://www.victoria.ac.nz/lals/about/staff/publications/paul-nation/Information-on-the-BNC_COCA-word-family-lists.pdf.
Nation, I. S. P. (2013). Learning vocabulary in another language (2nd edn). Cambridge: Cambridge
University Press.
Newton, J. (1993). Task-based interaction among adult learners of English and its role in second language
development. Unpublished PhD thesis. Victoria University of Wellington.
O’Dell, F. (1997). Incorporating vocabulary into the syllabus. In N. Schmitt & M. McCarthy (Eds.),
Vocabulary: Description, acquisition and pedagogy (pp. 258–278). Cambridge: Cambridge University
Press.
O’Keeffe, A., McCarthy, M., & Carter, R. (2007). From corpus to classroom: Language use and language
teaching. Cambridge: Cambridge University Press.
Perez, M. M., Peters, E., & Desmet, P. (2018). Vocabulary learning through viewing video: The effect of
two enhancement techniques. Computer Assisted Language Learning, 31(1–2), 1–26.
Rodgers, J., Webb, S., & Nakata, T. (2015). Do the cognacy characteristics of loanwords make them
more easily learned than non-cognates? Language Teaching Research, 19(1), 9–27.
Saito, K., Webb, S., Trofimovich, P., & Isaacs, T. (2016). Lexical profiles of comprehensible second
language speech: The role of appropriateness, fluency, variation, sophistication, abstractness, and
sense relations. Studies in Second Language Acquisition, 38(4), 677–701. doi: 10.1017/S0272263115000297
Schmitt, N., & Schmitt, D. (2014). A reassessment of frequency and vocabulary size in L2 vocabulary
teaching. Language Teaching, 47, 484–503.
Schmitt, N., & Zimmerman, C. B. (2002). Derivative word forms: What do learners know? TESOL
Quarterly, 36(2), 145–171.
Shin, D., & Nation, I. S. P. (2008). Beyond single words: The most frequent collocations in spoken
English. ELT Journal, 62(4), 339–348.
Smith, B. (2004). Computer-mediated negotiated interaction and lexical interaction. Studies in Second
Language Acquisition, 26, 365–398.
Swain, M. (1985). Communicative competence: Some roles of comprehensible input and comprehensible
output in its development. In S. Gass & C. Madden (Eds.), Input in second language acquisition
(pp. 235–253). Rowley, MA: Newbury House.
Uchihara, T., & Saito, K. (2019). Exploring the relationship between productive vocabulary knowledge
and second language oral ability. Language Learning Journal, 47(1), 64–75.
Webb, S. (2008). Receptive and productive vocabulary size. Studies in Second Language Acquisition,
30(1), 79–95.
Webb, S., & Nation, P. (2017). How vocabulary is learned. Oxford: Oxford University Press.
Webb, S., & Rodgers, M. P. H. (2009). Vocabulary demands of television programs. Language
Learning, 59(2), 335–366.
White, J., & Horst, M. (2012). Cognate awareness-raising in late childhood: Teachable and useful.
Language Awareness, 21, 181–196.
Van Zeeland, H., & Schmitt, N. (2013). Lexical coverage in L1 and L2 listening comprehension: The
same or different from reading comprehension? Applied Linguistics, 34(4), 457–479.
Zimmerman, C. B. (1997). Historical trends in second language vocabulary instruction. In J. Coady &
T. Huckin (Eds.), Second language vocabulary acquisition: A rationale for pedagogy (pp. 5–19). New
York: Cambridge University Press.
20
THE ROLE OF FORMULAIC SEQUENCES IN L2 SPEAKING
Duy Van Vu and Elke Peters
1 Introduction/Definitions
Recent decades have witnessed an increased interest in formulaic sequences (FSs) in both first
language (L1) and second language (L2) research and teaching (e.g., Boers & Lindstromberg,
2012; Wood, 2010; Wray, 2002). FSs, also called multiword units/expressions/sequences,
lexical chunks, formulas, prefabs, routines, or prefabricated patterns (see Wray, 2002), can
be broadly defined as:
FSs include collocations (take a shower), phrasal verbs (back up), idioms (raining cats and
dogs), lexical bundles (as far as I know), and proverbs (actions speak louder than words). Although
different types of FSs may differ in relevance to spoken proficiency, they are very common in
speech and serve several communicative functions (Erman & Warren, 2000).
Language users draw on a large repertoire of ready-made FSs. Consider coffee: we say
strong coffee, not powerful coffee, even though strong and powerful are synonymous and both
phrases are grammatically correct. Thus, oral language production is not only governed by
grammar rules and combinations of single words, but also by conventionalized word combinations
(Sinclair, 1991). Sinclair (1991) argues that reliance on ready-made FSs affords
processing advantages. In speech production, FSs are produced faster and more fluently than
non-FSs (e.g., Erman, 2007). Faster processing frees up attentional resources for other
cognitive tasks demanding more working memory during speech (N. Ellis, 2002). For example,
smooth talkers, such as sports commentators and auctioneers, make abundant use of
FSs in their speech (Kuiper, 1996); faster action requires greater use of FSs.
Research also demonstrates the importance of FSs for L2 learners. A positive relationship
holds between FSs and aspects of L2 proficiency, like reading (e.g., Kremmel, Brunfaut &
Alderson, 2017), writing (e.g., Granger & Bestgen, 2014), and speaking (e.g., Boers, Eyckmans,
Kappel, Stengers, & Demecheleer, 2006; Kyle & Crossley, 2015; Saito, 2020; Wood, 2009).
However, many L2 learners struggle with appropriate use of FSs (e.g., Hoang & Boers, 2016;
Laufer & Waldman, 2011; Nesselhauf, 2003; Pawley & Syder, 1983). This chapter provides an
overview of research into the role of FSs in speaking.
2 Historical Perspectives
It is important to zoom in on two widely recognized approaches to studying FSs: the
phraseological and the frequency-based approaches (see Granger & Paquot, 2008 for a thorough
discussion). In the phraseological approach, FSs are classified on the basis of linguistic
criteria, such as the degree of compositionality, i.e., the predictability of word combinations
from the meaning of their components, or the degree of substitutability (or fixedness), i.e.,
the possibility of replacing words in word combinations (Barfield & Gyllstad, 2009). The
frequency-based approach is more recent and data-driven, such that FSs are regarded as
frequently co-occurring words. In the 1980s, this approach made great strides by showing the
extent of formulaicity in language thanks to computer software and corpus linguistics (Cobb,
2019). While early research on FSs in native speaker corpora demonstrated the ubiquity of
FSs in L1 speech (e.g., Altenberg, 1998; Biber et al., 1999; Erman & Warren, 2000), more
recent learner corpus research shows how language learners struggle with the appropriate use
of FSs, as they underuse, overuse, or misuse FSs (e.g., De Cock, 2004). Recently, research
has combined both approaches when studying FSs (e.g., Columbus, 2013; Wulff, 2008).
Many early studies on FSs concerned the L2 development of children and teenagers (e.g., R.
Ellis, 1984; Hakuta, 1974, 1976; Myles, Hooper & Mitchell, 1998; see Wray, 2002 for an
overview). Hakuta’s (1974) study was among the first attempts to examine FS use in children’s
L2 speech. He found that prefabricated patterns constituted over 50% of the utterances
of a Japanese learner of English in early learning stages. Hakuta (1976) argues that
prefabricated patterns meet beginner learners’ needs to express various functions beyond
their linguistic competence. Examining FSs in classroom learning, R. Ellis (1984) suggests
that FSs allow L2 children to “perform important communicative functions” (p. 64) and
“may contribute, directly or indirectly, to the acquisition of rules for producing novel sentences”
(p. 65). Similarly, Myles et al. (1998), observing English learners of French in a British
secondary school, found that learners use FSs for communicative needs in early stages and
employ parts of those FSs to produce new utterances in later stages.
In addition to studies on FSs in L2 development in children and adolescents, some researchers
have investigated FSs in L2 development of adults (e.g., Bolander, 1989; Hanania
& Gradman, 1977; Schmidt, 1983; see Wray, 2002 for an overview). An early exploration was
Hanania and Gradman’s (1977) longitudinal case study of an Arabic-speaking learner of
English in an English-speaking environment. The findings showed that her language development
patterns were similar to those of L1 learners. Importantly, in the early stages, the
learner uttered word strings that she perceived as single units. Some years later, more evidence
of FS use was reported in Schmidt (1983), who found that an adult L2 English learner
used FSs extensively as a “major linguistic strategy” (p. 150) to facilitate his fluency.
Investigating adult learning in the classroom, Bolander (1989) examined learners’ acquisition
of Swedish word order rules and found that memorized FSs support conversational speech in
early learner language. Eskildsen and Cadierno (2007) demonstrated how a learner produced
increasingly abstract patterns from the FS I don’t know in oral classroom interactions, and
how the use of this FS expanded to include other lexical items and past-tense expressions.
These studies illustrate how FSs contribute to language development. However, the role of
FSs in L2 learning may be less than in L1 learning (Wulff, 2019).
In pragmatics, several studies have examined L2 learners’ use of FSs (e.g., Bardovi-
Harlig, 2009; De Cock, 2004; Scarcella, 1979) and the value of FSs in L2 pragmatic
competence (e.g., Bardovi-Harlig, 2006; House, 1996; Kecskes, 2010) (see Bardovi-Harlig,
2019 for an overview). Scarcella (1979) found that even common routines are not easily
acquired by L2 learners, who produce a number of pragmatically deviant routines.
Further, De Cock (2004) showed that native speakers and advanced EFL learners differ
in their use of frequently recurrent sequences in informal speech. She found learners
significantly underused and misused markers of vagueness (e.g., or something, and things,
kind of), which made them come across as too formal, detached, over-emphatic, or rude,
depending on the FSs used. The underuse of FSs might result from a lack of familiarity
with some FSs, overuse of familiar FSs, level of pragmatic development, and socio-
pragmatic knowledge (Bardovi-Harlig, 2009). As for the contributions of FSs to L2
pragmatic competence, House (1996) showed that learners became more fluent pragmatically
after taking a course rich in FSs. More recently, Bardovi-Harlig (2006) made a
distinction between developmental (acquisitional) formulas, i.e., formulas used when the
internal grammar of the formulas exceeds learners’ grammar in general, and social
(target) formulas, i.e., formulas expressing societal knowledge shared by a community.
Both of these FS types are crucial for L2 learners’ pragmatic competence. FSs can also
serve as pragmatic acts necessary for interlocutors in situational contexts (Kecskes, 2010).
These L2 pragmatics studies illustrate the importance of FSs for pragmatic purposes
in speech.
In L2 teaching, FSs have not always occupied a prominent place, especially in the Grammar
Translation method, where the focus was on grammar rules and reading. Speaking activities,
which barely played a role, were subordinate to syntax and morphology (Richards & Rodgers,
2001). In the 1960s, with the advent of Audiolingualism, the focus shifted to more oral practice
with drills in language labs. Target language patterns were selected, taught and practiced until
they became internalized by learners. Yet, the focus remained predominantly on grammar with
repetition, replacement, contraction, and inflection activities (Richards & Rodgers, 2001). FSs
took centre stage in Communicative Language Teaching (CLT) from the 1970s onwards
(Cowie, 1992; Dörnyei, 2009). This is reflected in the Common European Framework of
Reference, in which FSs are included as an important descriptor of learners’ speaking and
writing proficiency (Council of Europe, 2018). CLT places considerable emphasis on spoken
interaction and formulaic language to achieve communicative competence, which “is not a
matter of knowing rules…. It is much more a matter of knowing a stock of partially pre-
assembled patterns, formulaic frameworks…” (Widdowson, 1989, p. 135). In his proposal for
a principled communicative approach, Dörnyei (2009) advocates that teaching FSs be a core
component, maintaining that “[t]here should be sufficient awareness raising of the significance
and pervasiveness of formulaic language in real-life communication, and selected phrases
should be practiced and recycled intensively” (p. 41).
Dörnyei’s suggestion is reminiscent of Michael Lewis’s lexical approach, which regards
words and FSs as the building blocks of language learning. FSs are at the heart of the lexical
approach; Lewis (1993, 1997) maintains that L2 learners should be made aware of the
pervasiveness of FSs in language and should be encouraged to identify FSs in language
learning materials. Lewis’s publications (1993, 1997) have inspired several empirical studies
investigating the effectiveness of this approach (e.g., Boers et al., 2006; Peters & Pauwels,
2015). Early on, Boers et al. (2006) showed that awareness-raising activities have the potential
to increase L2 learners’ repertoire of FSs. Nevertheless, research into written activities
has indicated that awareness-raising should be complemented by practice and repetition
(Peters & Pauwels, 2015).
Empirical research into pedagogical interventions has mainly focused on learning FSs
from written input (see Boers & Lindstromberg, 2012; Pellicer-Sánchez & Boers, 2019 for
reviews). Few studies have explored the effect of interventions on learning and using FSs
in speech. Boers and colleagues (Boers, 2014; Hoang & Boers, 2016; Thai & Boers, 2016)
have conducted pioneering research in this respect. Boers (2014) and Thai and Boers
(2016) explored whether Nation’s (2013) 4/3/2-activity (and a modified version, the 3/2/1-
activity) affected learners’ fluency, complexity, and accuracy and whether learners’ spoken
output differed when a monologue was repeated under constant time conditions. In the
4/3/2-activity, L2 learners repeat a task under increasing time pressure (4 minutes, 3 minutes,
2 minutes). Each monologue is then told to another listener. The findings in both
studies showed that repeating a monologue under increased time pressure fosters learners’
fluency, especially speech rate, but at the expense of accuracy. Learners’ speech in the
repetitions was characterized by several verbatim repetitions of sequences of two or more
words, many of which were repetitions of errors. In another study, Hoang and Boers
(2016) explored how many words and FSs Vietnamese EFL learners recycled after a
reading-and-listening activity. The analyses showed that input influences learners’ word
use in the retelling task. However, learners only recycled, on average, 2.41 FSs from the 35
FSs in the input and only 0.48, on average, were recycled correctly. These findings show
that accurate use of FSs is challenging for L2 learners, even after exposure to input
containing relevant FSs.
mean length of run) and a variable related to FS use (formula/run ratio) were analyzed. The
results showed that the learners’ fluency significantly improved and they used more FSs.
Other scholars have examined fluency in more depth. Tavakoli (2011) studied the differences
in pausing patterns between L2 learners and native speakers through four oral
narrative tasks (picture stories). Temporal measures of fluency, including the number of
pauses and the total amount of silence (in seconds) in the middle and at the end of clauses,
were statistically analyzed for each task. The results indicated that L2 learners and native
speakers differ primarily in the positions of pauses rather than the number of pauses or the
amount of silence. For instance, L2 learners produced significantly more pauses and silences
in the middle of clauses than native speakers, while they paused less frequently and did not
differ significantly in the amount of silence at the end of clauses. Further analyses revealed
that both L2 learners and native speakers hardly ever pause in the middle of FSs, which
suggests that FSs facilitate fluency. Going beyond pausing, Yan (2020) investigated the influence
of FSs on both speech rate and pausing in L1 and L2 speakers of intermediate and
advanced proficiency levels in elicited imitation tasks. The findings revealed that FSs create a
processing advantage and also facilitate oral fluency for both L1 and L2 speakers. FSs had a
significant effect on pausing at the sentence level, but the effect on speech rate of sentence
repetition was non-significant.
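The pause measures described above (number of pauses and total silence, split by position in the clause) can be illustrated with a minimal sketch. It assumes that pauses have already been annotated by hand with a duration and a position; the `Pause` record, the labels "mid" and "end", and the function name are hypothetical and are not taken from Tavakoli (2011).

```python
from dataclasses import dataclass


@dataclass
class Pause:
    duration: float  # length of the silence in seconds
    position: str    # "mid" (inside a clause) or "end" (at a clause boundary)


def pause_profile(pauses):
    """Count pauses and sum silence separately for mid-clause and end-clause positions."""
    profile = {pos: {"count": 0, "silence": 0.0} for pos in ("mid", "end")}
    for pause in pauses:
        profile[pause.position]["count"] += 1
        profile[pause.position]["silence"] += pause.duration
    return profile


# Invented annotations for one speaker's narrative (illustrative data only).
speaker = [Pause(0.5, "mid"), Pause(0.75, "end"), Pause(0.25, "mid")]
print(pause_profile(speaker))
```

Comparing such profiles across speaker groups would show whether they differ in where pauses fall rather than in how many pauses occur overall.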
Tavakoli and Uchihara (2019) investigated the relationship between oral fluency and use
of FSs across levels of proficiency. Fluency was measured in terms of speed, breakdown, and
repair while FSs were measured through n-grams (proportion, frequency, and strength of
association). The findings showed that: 1) there was a linear relationship between oral
proficiency level and many n-gram measures; 2) there were significant positive correlations
between fluency and high-frequency n-grams. Based on the findings, Tavakoli and Uchihara
(2019) explain that FSs facilitate oral fluency in both the formulation stage, where learners
with a large repertoire of FSs retrieve them holistically and process them as single words, and
the articulation stage, where FSs help phraseologically proficient learners speak faster given
their access to information about phonetic reduction of FSs in their lexicon. In fact, the
assumption of holistic retrieval that Tavakoli and Uchihara (2019) suggest for FSs is not
uncommon (e.g., Erman, 2007; Kecskes, 2010; Wray, 2002). However, this notion is controversial
(see Siyanova-Chanturia, 2015a); some psycholinguistic evidence counters it (e.g.,
Sprenger, Levelt & Kempen, 2006).
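The three kinds of n-gram measure mentioned above (proportion, frequency, and strength of association) can be made concrete with a small sketch for bigrams. It is an illustration under stated assumptions, not the procedure used by Tavakoli and Uchihara (2019): the reference corpus is a toy word list, pointwise mutual information (PMI) stands in for "strength of association" as one common choice, and all function names are hypothetical.

```python
import math
from collections import Counter


def bigrams(tokens):
    """Return the adjacent word pairs in a list of tokens."""
    return list(zip(tokens, tokens[1:]))


def association_table(reference_tokens):
    """Build bigram counts and PMI scores from a reference corpus."""
    unigram_counts = Counter(reference_tokens)
    bigram_counts = Counter(bigrams(reference_tokens))
    n_tokens = len(reference_tokens)
    n_bigrams = max(n_tokens - 1, 1)
    pmi = {}
    for (w1, w2), count in bigram_counts.items():
        p_pair = count / n_bigrams
        p_w1 = unigram_counts[w1] / n_tokens
        p_w2 = unigram_counts[w2] / n_tokens
        pmi[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return bigram_counts, pmi


def ngram_profile(sample_tokens, bigram_counts, pmi):
    """Summarize a speech sample: share of attested bigrams, mean frequency, mean PMI."""
    pairs = bigrams(sample_tokens)
    attested = [pair for pair in pairs if pair in bigram_counts]
    if not attested:
        return {"proportion": 0.0, "mean_frequency": 0.0, "mean_pmi": 0.0}
    return {
        "proportion": len(attested) / len(pairs),
        "mean_frequency": sum(bigram_counts[p] for p in attested) / len(attested),
        "mean_pmi": sum(pmi[p] for p in attested) / len(attested),
    }


# Toy reference corpus and learner sample (illustrative data only).
reference = "i do not know i do not care as far as i know".split()
counts, scores = association_table(reference)
print(ngram_profile("i do not know him".split(), counts, scores))
```

In practice the reference counts would come from a large spoken corpus, and the choice of association measure affects which sequences count as strongly associated.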
attention because they consist of high-frequency words (Boers, 2020). Many collocations, for
instance, do not cause problems at the level of comprehension, but at the level of production.
For example, the collocation make an effort is easy to understand for Dutch-speaking learners
of English, but difficult to produce because the Dutch collocation is do an effort. Consequently,
on encountering this collocation in speech, Dutch-speaking L2 learners may not notice that
English uses a different verb. Furthermore, many collocations contain delexicalized verbs
(make, do, have), which add little to the meaning of the collocation.
Another explanation for some FSs’ difficulty is a lack of semantic transparency. Unlike
collocations, idioms are semantically opaque; their meaning cannot be easily derived from
individual constituents. Take once in a blue moon for instance. It is not easy to figure out the
meaning of “very rarely” by means of the words in the idiom. Moreover, L2 learners’ L1 exerts
a strong influence on their production of FSs, rendering their speech odd and non-nativelike
(e.g., Laufer & Waldman, 2011). Finally, most adult L2 learners might take an analytic
approach to learning a new language. Even though more attention is now paid to FSs in L2
textbooks, there is still considerable reliance on single words.
FSs in the input (Pellicer-Sánchez & Boers, 2019). L2 learners could also use CALL applications,
like concordancers, to study the use of FSs in spoken registers. Concordancers have
the advantage that they present FSs in context (Cobb, 2019; Meunier, 2020).
The memorability of FSs can be increased by pointing out sound patterns in FSs, like
alliteration (e.g., play a part), assonance (e.g., turn a blind eye), or rhyme (e.g., wear and tear)
(Boers & Lindstromberg, 2012). Given that learners’ L1 exerts a strong influence on their
production of FSs, a contrastive approach (e.g., translation activities) might help raise their
awareness of differences between the L1 and L2 (Laufer & Girsai, 2008; Webb &
Kagimoto, 2011).
To enlarge L2 learners’ formulaic repertoire and foster automatic retrieval, they should
practice FSs, as memorization and repetition enhance the learning of FSs (Wood, 2009;
Fitzpatrick & Wray, 2006). However, practice should involve retrieval (see also Pellicer-
Sánchez & Boers, 2019). Studies have shown that merely copying (Webb & Kagimoto,
2011) or orally repeating FSs (Alali & Schmitt, 2012) is not as effective as activities that
prompt learners to actually retrieve FSs from memory, like gap-filling or contrastive activities.
Importantly, learners should first be presented with FSs as a whole to prevent
them from making erroneous combinations, which are difficult to “unlearn” (Strong &
Boers, 2019).
7 Future Directions
Studies on the role of FSs in speech are growing. From the research discussed here, it is clear
that English is the most studied language. Only limited research has focused on FSs in other
languages (e.g., Erman et al., 2015; Lundell et al., 2014). This is an area warranting further
investigation if the findings of existing studies are to be generalized.
Another issue hampering the generalizability of research findings is the sampling bias in SLA;
most studies are conducted with university students and advanced learners of English (see Myles
et al., 1998, for an exception). Our understanding of formulaic competence in speech would be
enhanced if more studies included younger and beginner L2 learners. An interesting example is
Siyanova-Chanturia’s (2015b) work mapping the development of formulaic competence in the
writing of beginner learners of Italian. Similar studies could be conducted for speaking. It would
be worthwhile to establish research collaborations with teachers in primary and secondary
schools to further the study of FSs in an ecologically valid way. The field would also benefit from
more longitudinal research investigating how FSs develop over time.
Empirical pedagogical studies of FSs and speech are still limited. More research into
pedagogical interventions to fuel the learning and use of FSs in speaking tasks is warranted.
Given that L2 learners’ vocabulary is somewhat task dependent (Eguchi & Kyle, 2020), more
research into how tasks and prompts affect learners’ use of FSs is needed.
As for corpus-based studies, researchers often combine data from different learners and
analyze them together. This focus on averages hides individual variation. Granger (2020)
argues that learner language has a high level of variability, so both group and individual
scores should be computed in corpus-driven research to resolve this issue. More research into
learners’ individual development of FSs in speech would enhance our understanding of
learning trajectories (Granger, 2020).
Further Reading
Annual Review of Applied Linguistics (2012). Volume 32. Topics in formulaic language.
Language Teaching Research (2017). Volume 21, Issue 3. Special issue. Multi-word expressions.
Siyanova-Chanturia, A. & Pellicer-Sánchez, A. (Eds.). (2019). Understanding formulaic language: A
second language acquisition perspective. London/New York: Routledge.
References
Alali, F. A., & Schmitt, N. (2012). Teaching formulaic sequences: The same as or different from
teaching single words? TESOL Journal, 3(2), 153–180.
Altenberg, B. (1998). On the phraseology of spoken English: The evidence of recurrent word-
combination. In A. Cowie (Ed.), Phraseology: Theory, analysis and applications (pp. 101–122).
Oxford: Oxford University Press.
Bardovi-Harlig, K. (2006). On the role of formulas in the acquisition of L2 pragmatics. In K. Bardovi-
Harlig, C. Felix-Brasdefer, & A. S. Omar (Eds.), Pragmatics and language learning: Vol. 11
(pp. 1–28). Honolulu, HI: University of Hawaii, National Foreign Language Resource Center.
Bardovi-Harlig, K. (2009). Conventional expressions as a pragmalinguistic resource: Recognition and
production of conventional expressions in L2 pragmatics. Language Learning, 59(4), 755–795.
Bardovi-Harlig, K. (2019). Formulaic language in second language pragmatics research. In A.
Siyanova-Chanturia & A. Pellicer-Sánchez (Eds.), Understanding formulaic language: A second
language acquisition perspective (pp. 97–114). London/New York: Routledge.
Barfield, A., & Gyllstad, H. (2009). Introduction: Researching L2 collocation knowledge and development.
In A. Barfield & H. Gyllstad (Eds.), Researching collocations in another language: Multiple
interpretations (pp. 1–18). Basingstoke, UK: Palgrave Macmillan.
Biber, D. (2006). University language: A corpus-based study of spoken and written registers. Amsterdam:
John Benjamins.
Biber, D., & Barbieri, F. (2007). Lexical bundles in university spoken and written registers. English for
Specific Purposes, 26(3), 263–286.
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman grammar of spoken and
written English. Harlow: Longman.
Boers, F. (2014). A reappraisal of the 4/3/2 activity. RELC Journal, 45(3), 221–235.
Boers, F. (2020). Factors affecting the learning of multiword items. In S. Webb (Ed.), The Routledge
handbook of vocabulary studies (pp. 143–157). London/New York: Routledge.
Boers, F., Eyckmans, J., Kappel, J., Stengers, H., & Demecheleer, M. (2006). Formulaic sequences and
perceived oral proficiency: Putting a lexical approach to the test. Language Teaching Research, 10(3),
245–261.
Boers, F., & Lindstromberg, S. (2012). Experimental and intervention studies on formulaic sequences in
a second language. Annual Review of Applied Linguistics, 32, 83–110.
Bolander, M. (1989). Prefabs, patterns and rules in interaction? Formulaic speech in adult learners’ L2
Swedish. In K. Hyltenstam & L. Obler (Eds.), Bilingualism across the lifespan: Aspects of acquisition,
maturity, and loss (pp. 73–86). Cambridge: Cambridge University Press.
Cobb, T. (2019). From corpus to CALL: The use of technology in teaching and learning formulaic
language. In A. Siyanova-Chanturia & A. Pellicer-Sánchez (Eds.), Understanding formulaic
language: A second language acquisition perspective (pp. 192–210). London/New York:
Routledge.
Columbus, G. (2013). In support of multiword unit classifications: Corpus and human rating data
validate phraseological classifications of three different multiword unit types. Yearbook of
Phraseology, 4(1), 23–44.
Council of Europe. (2018). Common European framework of reference for languages: Learning, teaching,
assessment - Companion volume with new descriptors. Retrieved 24 April, 2020 from
http://rm.coe.int/cefr-companion-volume-with-new-descriptors-2018/1680787989.
Cowie, A. P. (1992). Multiword lexical units and communicative language teaching. In P. Arnaud
& H. Bejoint (Eds.), Vocabulary and applied linguistics (pp. 1–12). Basingstoke, England:
Macmillan.
Crossley, S. A., Salsbury, T., & McNamara, D. S. (2015). Assessing lexical proficiency using analytic
ratings: A case for collocation accuracy. Applied Linguistics, 36(5), 570–590.
Crossley, S., & Salsbury, T. L. (2011). The development of lexical bundle accuracy and production in
English second language speakers. IRAL-International Review of Applied Linguistics in Language
Teaching, 49(1), 1–26.
De Cock, S. (2004). Preferred sequences of words in NS and NNS speech. Belgian Journal of English
Language and Literatures, 2, 225–246.
Dörnyei, Z. (2009). Communicative language teaching in the 21st century: The ‘principled communicative
approach’. Perspectives, 36, 33–43.
Eguchi, M., & Kyle, K. (2020). Continuing to explore the multidimensional nature of lexical sophistication:
The case of oral proficiency interviews. The Modern Language Journal, 104(2),
381–400.
Ellis, N. C. (2002). Frequency effects in language processing: A review with implications for theories of
implicit and explicit language acquisition. Studies in Second Language Acquisition, 24(2), 143–188.
Ellis, R. (1984). Formulaic speech in early classroom second language development. In J. Handscombe,
R. A. Orem, & B. P. Taylor (Eds.), On TESOL ’83 (pp. 53–65). Washington, DC: TESOL.
Erman, B. (2007). Cognitive processes as evidence of the idiom principle. International Journal of
Corpus Linguistics, 12(1), 25–53.
Erman, B., Denke, A., Fant, L., & Lundell, F. F. (2015). Nativelike expression in the speech of long-
residency L2 users: A study of multiword structures in L2 English, French and Spanish.
International Journal of Applied Linguistics, 25(2), 160–182.
Erman, B., & Warren, B. (2000). The idiom principle and the open choice principle. Text, 20(1), 29–62.
Eskildsen, S. W., & Cadierno, T. (2007). Are recurring multi-word expressions really syntactic freezes?
Second language acquisition from the perspective of usage-based linguistics. In M. Nenonen & S.
Niemi (Eds.), Collocations and idioms 1: Papers from the first Nordic conference on syntactic freezes
(pp. 86–99). Joensuu, Finland: University of Joensuu.
Fitzpatrick, T., & Wray, A. (2006). Breaking up is not so hard to do: Individual differences in L2
memorization. Canadian Modern Language Review, 63(1), 35–57.
Granger, S. (2020). Learner corpora. In C. A. Chapelle (Ed.), The concise encyclopedia of applied linguistics
(pp. 681–688). Hoboken, NJ: John Wiley & Sons.
Granger, S., & Bestgen, Y. (2014). The use of collocations by intermediate vs. advanced non-native
writers: A bigram-based study. International Review of Applied Linguistics in Language Teaching, 52,
229–252.
Granger, S., & Paquot, M. (2008). Disentangling the phraseological web. In S. Granger & F. Meunier
(Eds.), Phraseology: An interdisciplinary perspective (pp. 27–49). Amsterdam, the Netherlands: John
Benjamins.
Hakuta, K. (1974). Prefabricated patterns and the emergence of structure in second language acquisition.
Language Learning, 24(2), 287–297.
Hakuta, K. (1976). A case study of a Japanese child learning English as a second language. Language
Learning, 26(2), 321–351.
Hanania, E. A., & Gradman, H. L. (1977). Acquisition of English structures: A case study of an adult
native speaker of Arabic in an English-speaking environment. Language Learning, 27, 75–91.
Hoang, H., & Boers, F. (2016). Re-telling a story in a second language: How well do adult learners mine
an input text for multiword expressions? Studies in Second Language Learning and Teaching, 6(3),
513–535.
House, J. (1996). Developing pragmatic fluency in English as a foreign language: Routines and me-
tapragmatic awareness. Studies in Second Language Acquisition, 18(2), 225–252.
Kremmel, B., Brunfaut, T., & Alderson, J. C. (2017). Exploring the role of phraseological knowledge in
foreign language reading. Applied Linguistics, 38(6), 848–870.
Kecskes, I. (2010). Situation-bound utterances as pragmatic acts. Journal of Pragmatics, 42(11),
2889–2897.
Kuiper, K. (1996). Smooth talkers: The linguistic performance of auctioneers and sportscasters. Mahwah,
NJ: Erlbaum.
Kyle, K., & Crossley, S. A. (2015). Automatically assessing lexical sophistication: Indices, tools,
findings, and application. TESOL Quarterly, 49, 757–786.
Laufer, B., & Girsai, N. (2008). Form-focused instruction in second language vocabulary learning: A
case for contrastive analysis and translation. Applied Linguistics, 29(4), 694–716.
Laufer, B., & Waldman, T. (2011). Verb-noun collocations in second language writing: A corpus
analysis of learners’ English. Language Learning, 61(2), 647–672.
Lewis, M. (1993). The lexical approach. Hove, UK: Language Teaching Publications.
Lewis, M. (1997). Implementing the Lexical Approach. Hove, UK: Language Teaching Publications.
Lin, P. (2019). Formulaic language and speech prosody. In A. Siyanova-Chanturia & A. Pellicer-
Sánchez (Eds.), Understanding formulaic language: A second language acquisition perspective
(pp. 78–94). London/New York: Routledge.
Lin, P. M. S., & Siyanova-Chanturia, A. (2015). Internet television for L2 vocabulary learning. In D.
Nunan & J. C. Richards (Eds.), Language learning beyond the classroom (pp. 149–158). London:
Routledge.
Lundell, F. F., Bartning, I., Engel, H., Gudmundson, A., Hancock, V., & Lindqvist, C. (2014). Beyond
advanced stages in high-level spoken L2 French. Journal of French Language Studies, 24(2),
255–280.
Meunier, F. (2020). Resources for learning multiword items. In S. Webb (Ed.), The Routledge handbook
of vocabulary studies (pp. 336–350). London/New York: Routledge.
Myles, F., Hooper, J., & Mitchell, R. (1998). Rote or rule? Exploring the role of formulaic language in
classroom foreign language learning. Language Learning, 48(3), 323–363.
Nation, I. S. P. (2013). Learning vocabulary in another language (2nd ed.). Cambridge: Cambridge
University Press.
Nesselhauf, N. (2003). The use of collocations by advanced learners of English and some implications
for teaching. Applied Linguistics, 24(2), 223–242.
Nation, I. S. P. & Newton, J. M. (2009). Teaching ESL/EFL listening and speaking. New York:
Routledge.
Pawley, A., & Syder, F. H. (1983). Two puzzles for linguistic theory: Native-like selection and native-
like fluency. In J. C. Richards & R. W. Schmidt (Eds.), Language and communication (pp. 191–226).
New York: Longman.
Pellicer-Sánchez, A., & Boers, F. (2019). Pedagogical approaches to the teaching and learning of for-
mulaic language. In A. Siyanova-Chanturia & A. Pellicer-Sánchez (Eds.), Understanding formulaic
language: A second language acquisition perspective (pp. 153–173). London/New York: Routledge.
Peters, E., & Pauwels, P. (2015). Learning academic formulaic sequences. Journal of English for
Academic Purposes, 20, 28–39.
296
The Role of Formulaic Sequences
Puimège, E., & Peters, E. (2019). Learning L2 vocabulary from audiovisual input: an exploratory study
into incidental learning of single words and formulaic sequences. The Language Learning Journal,
47(4), 424–438.
Puimège, E., & Peters, E. (2020). Learning formulaic sequences through viewing L2 television and
factors that affect learning. Studies in Second Language Acquisition, 42(3), 525–549.
Richards, J. C., & Rodgers, T. S. (2001). Approaches and methods in language teaching. Cambridge:
Cambridge University Press.
Rossiter, M. J., Derwing, T. M., Manimtim, L. G., & Thomson, R. I. (2010). Oral fluency: The ne-
glected component in the communicative language classroom. Canadian Modern Language Review,
66(4), 583–606.
Saito, K. (2020). Multi‐or single‐word units? The role of collocation use in comprehensible and con-
textually appropriate second language speech. Language Learning, 70(2), 548–588.
Saito, K., & Hanzawa, K. (2018). The role of input in second language oral ability development
in foreign language classrooms: A longitudinal study. Language Teaching Research, 22(4),
398–417.
Scarcella, R. C. (1979). Watch up!: A study of verbal routines in adult second language performance.
Working Papers on Bilingualism Toronto, (19), 79–90.
Schmidt, R. (1983). Interaction, acculturation, and the acquisition of communicative competence: A
case study of an adult. In N. Wolfson & E. Judd (Eds.), Sociolinguistics and language acquisition
(pp. 137–174). Rowley, MA: Newbury House.
Sinclair, J. (1991). Corpus, concordance, collocation. Oxford: Oxford University Press.
Sprenger, S. A., Levelt, W. J., & Kempen, G. (2006). Lexical access during the production of idiomatic
phrases. Journal of Memory and Language, 54(2), 161–184.
Stengers, H., Boers, F., Housen, A., & Eyckmans, J. (2011). Formulaic sequences and L2 oral profi-
ciency: Does the type of target language influence the association?. IRAL-International Review of
Applied Linguistics in Language Teaching, 49(4), 321–343.
Siyanova-Chanturia, A. (2015a). On the ‘holistic’ nature of formulaic language. Corpus Linguistics and
Linguistic Theory, 11(2), 285–301.
Siyanova-Chanturia, A. (2015b). Collocation in beginner learner writing: A longitudinal study. System,
53, 148–160.
Siyanova-Chanturia, A., & Pellicer-Sánchez, A. (2019). Formulaic language: Setting the scene. In A.
Siyanova-Chanturia & A. Pellicer-Sánchez (Eds.), Understanding formulaic language: A second
language acquisition perspective (pp. 1–15). London/New York: Routledge.
Strong, B., & Boers, F. (2019). The error in trial and error: Exercises on phrasal verbs. TESOL
Quarterly, 53(2), 289–319.
Tavakoli, P. (2011). Pausing patterns: Differences between L2 learners and native speakers. ELT
Journal, 65, 71–79.
Tavakoli, P., & Uchihara, T. (2019). To what extent are multiword sequences associated with oral
fluency? Language Learning, 70(2), 506–547.
Thai, C., & Boers, F. (2016). Repeating a monologue under increasing time pressure: Effects on fluency,
complexity, and accuracy. TESOL Quarterly, 50(2), 369–393.
Van Lancker-Sidtis, D., & Rallon, G. (2004). Tracking the incidence of formulaic expressions in ev-
eryday speech: Methods for classification and verification. Language & Communication, 24(3),
207–240.
Webb, S., & Kagimoto, E. (2011). Learning collocations: Do the number of collocates, position of the
node word, and synonymy affect learning? Applied Linguistics, 32, 259–276.
Webb, S., & Nation, P. (2017). How vocabulary is learned. Oxford: Oxford University Press.
Widdowson, H. G. (1989). Knowledge of language and ability for use. Applied Linguistics, 10, 128–137.
Wisniewska, N., & Mora, J. C. (2020). Can captioned video benefit second language pronunciation?.
Studies in Second Language Acquisition, 42(3), 599–624.
Wood, D. (2001). In search of fluency: What is it and how can we teach it?. Canadian Modern Language
Review, 57(4), 573–589.
Wood, D. (2009). Effects of focused instruction of formulaic sequences on fluent expression in second
language narratives: A case study. Canadian Journal of Applied Linguistics, 12, 39- 57.
Wood, D. (2010). Formulaic language and second language speech fluency: Background, evidence and
classroom applications. London/New York: Continuum.
Wray, A. (2002). Formulaic language and the lexicon. Cambridge University Press.
Wulff, S. (2008). Rethinking idiomaticity: A usage-based approach. London/NewYork: Continuum.
297
Duy Van Vu and Elke Peters
298
21
TECHNOLOGY FOR SPEAKING
DEVELOPMENT
Walcir Cardoso
1 Introduction/Definitions
In Spike Jonze’s movie Her (2013), Theodore (played by Joaquin Phoenix) develops a re-
lationship with an operating system’s voice, Samantha. Samantha is an artificially intelligent
virtual assistant personified via a female voice (Scarlett Johansson), who uses and processes
language just like human beings to communicate effectively: it is (artificially) intelligent,
creative, interactive, and sensitive to pragmatics (see Figure 21.1):
With these human attributes, Samantha seems like the ideal language learning partner
and/or tutor. In addition to the human qualities described earlier, the “synthesized” voice
also excels in many of the features offered by computer-assisted language learning (CALL is
a cover term for any type of computer-based technology, including computers, online re-
sources, and mobile devices; see Levy & Hubbard, 2005 for the rationale): it encourages
practice and repetition, provides multiple and varied modalities for practice, promotes
learner autonomy in an anxiety-free environment, accommodates different learning styles,
has the ability to provide immediate feedback, fosters exploratory learning with wide access
to information, and is motivating and fun (see Egbert & Shahrokni, 2018 for a discussion of
these CALL affordances). In addition, Samantha fulfills Chapelle’s (2001) criteria for se-
lecting CALL tasks; namely, there is potential for a positive impact on language learning
through the (interactive) feedback provided, opportunities to engage with language and
consequently personalize the experience, attention to both form and meaning (attention, a
central concept in cognitive L2 acquisition, can be promoted via input enhancements such as
repetition and multimodal exposure; it is considered one of the main advantages of CALL
vis-à-vis classrooms – Chapelle, 2003), authentic interactions reflecting the out-of-class
world, and practicality (the system is accessible and easy to use).
However, we are still light years away from the type of full artificial intelligence depicted in the
conversation between Samantha and Theodore, despite attempts to design dialogue systems that
engage users in reasonably natural interactions with humans (e.g., the Virtual Language Patient,
an application to train healthcare professionals in interactions with patients; Walker et al., 2011).
First, current technology is not capable of affording the earlier-mentioned attributes that
characterize effective human communication. Second, it is ineffective in providing opportunities
for collaborative interaction, including negotiation of meaning (Dickinson et al., 2013), shown
when Samantha explains why she believes she is being challenged by Theodore. The conversation
between Samantha and Theodore works mainly because it is a collaborative
event in which speaker and interlocutor interact in ways that allow each to understand the
other. This might explain why speaking is considered the most difficult skill to teach via
technology (Levy & Stockwell, 2006; Sevilla-Pavón et al., 2011).
Despite these limitations, there are many ways in which language teachers can use CALL
tools to enhance their students’ speaking abilities. To promote learner speaking as part of
social interaction in a CALL setting, Egbert and Shahrokni (2018) propose three “task
structures” for the design of technology-enhanced activities, in which students can learn
around the computer (to internalize information to support or complement learning; e.g.,
listening to a podcast), through the computer (e.g., to communicate with others via video-
conferencing tools), and with the computer (e.g., to interact with a synthesized voice, as
illustrated earlier); see also Chapelle (2003), for whom these task structures are broken into
interactions that can be intrapersonal (within a learner’s mind) or interpersonal (i.e., involving
learner–computer or learner–learner interactions). Here, this tripartite task structure model
provides a basic framework for describing and analyzing the types of speaking interactions
that learners can engage in via technology.
Normally, second/foreign language (L2) learning occurs within the classroom, facilitated
by a teacher. Sometimes, however, learning can also take place informally, outside of the
classroom, either extramurally (under the aegis of a school – Sundqvist & Sylvén, 2016), or in
the digital wilds (independent of formal instructional contexts, in which the impetus to learn
originates from the learner, not the school – Sauro & Zourou, 2019). Examples of the latter
include user-selected digital games, music streaming, and social networking. An interesting
implication of this observation is that it recognizes that L2 learning occurs within and
outside of the classroom, opening a world of pedagogical opportunities for extending the
reach of the classroom and consequently minimizing one of the challenges that afflicts in-
class teaching: time (Bione & Cardoso, 2020). This is particularly important for speaking, an
L2 skill that requires a substantial amount of output practice (Everly, 2018) so that learners
can test their hypotheses about what they are learning and consequently automatize their
abilities (Grimshaw & Cardoso, 2018). By extending the reach of the classroom, teachers can
allocate their in-class time resources to activities that may necessitate human support (e.g.,
personalized quality feedback). In this chapter, the degree of connection of these learning
settings to the classroom is assumed to vary within an overlapping continuum that views
CALL-assisted speaking taking place inside and/or outside of the classroom (i.e., informally
in the wild, or to extend the reach of the classroom), depending on teachers’ and learners’
needs, interests, and comfort with technology.
The goals of this chapter are to introduce the field of technology for L2 speaking development,
review some of the relevant literature in the field, and propose recommendations for practice.
2 Historical Perspectives
Following the tripartite conceptual framework for describing computer–learner interactions,
the history of CALL in speaking development can be subsumed into two general categories:
those that promote interactions around and with the computer, as illustrated in the dialogue
between Samantha and Theodore, and those that target interactions through the computer
(Egbert & Shahrokni, 2018).
A computer system with all of Samantha’s speaking abilities is not yet possible and it is
unlikely to exist in the foreseeable future (it has abilities beyond those of human beings!). For
over 50 years, computer scientists have attempted to create interactive conversation systems
resembling Samantha. The first attempt was ELIZA, a natural language conversation program
developed by Weizenbaum (1966). ELIZA applied pattern-matching rules that mapped user
prompts to scripted replies, simulating a conversation between a human user and a virtual
therapist. A therapist was chosen as interlocutor to simplify programming:
therapists often ask open-ended questions and are not required to give advice or accurate
information. Unlike Samantha, ELIZA never initiated a conversation and was not capable
of learning new sentence patterns or words through interaction. Figure 21.2 shows a sample
dialogue between a user and ELIZA, using a JavaScript emulator of the original program
(https://www.masswerk.at/elizabot/eliza.html).
Figure 21.2 Sample dialogue between ELIZA and a user. Generated via Norbert Landsteiner’s
implementation at Masswerk (www.masswerk.at/elizabot/eliza.html) and published with
permission
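The pattern-matching approach described above can be illustrated with a minimal sketch in Python. The rules, the default reply, and the function name reply are hypothetical and chosen only for illustration; they are not taken from Weizenbaum's original script or from the JavaScript emulator, and the sketch omits features such as pronoun reflection.

```python
import re

# Hypothetical rules for illustration only; not Weizenbaum's original script.
# Each rule pairs a pattern with a scripted reply that reuses the matched text.
RULES = [
    (re.compile(r"\bI need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Open-ended fallback used when no rule matches the user's prompt.
DEFAULT_REPLY = "Please go on."


def reply(user_input: str) -> str:
    """Return the scripted reply of the first rule that matches the prompt."""
    text = user_input.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return DEFAULT_REPLY


if __name__ == "__main__":
    for prompt in ["I need a holiday.", "I am tired", "Hello there"]:
        print(prompt, "->", reply(prompt))
```

Like ELIZA, this sketch never initiates a conversation and cannot learn new patterns: every possible reply is fixed in advance by the rule list.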
Since then, there have been many efforts to advance Weizenbaum’s ground-breaking in-
vention, ranging from specialized use of related technologies such as speech recognition software
and text-to-speech synthesizers, to intelligent personal assistants (IPAs). Representing the most
recent materialization of ELIZA, and a conceivable prototype for a Samantha-type of personal
assistance in the future, IPAs are voice-controlled services connected to smart speakers (e.g.,
Amazon Echo) that interact with users by answering (and sometimes asking) questions, and
performing tasks such as telling jokes, summarizing the news, and playing music (see Moussalli
& Cardoso, 2019). The first studies to examine IPAs’ potential for L2 oral skill development were
conducted in the late 2010s; they emphasized users’ perceptions of the technology and the
IPAs’ ability to understand, and be understood by, L2 learners (Dizon, 2017, 2020; Moussalli &
Cardoso, 2019).
To function properly, IPAs use a combination of two relatively more “established” CALL
technologies for speaking development: automatic speech recognition (to convert human
speech into text for searching purposes) and text-to-speech synthesizers (to output the results
of the search). Automatic speech recognition (ASR) refers to the ability of an application to
identify spoken words and either convert them into text (e.g., Dragon Naturally Speaking,
Office Dictation) or respond to them (e.g., play a song following a request). The first ASR system,
the Audrey System, was designed in 1952 by Bell Laboratories to recognize numbers. By the
1960s, ASRs (e.g., IBM’s Shoebox) became capable of identifying and responding to 16 words
in English, a number that increased to 1,000 words in the 1970s, but with a high degree of
inaccurate recognition. In the early 2000s, recognition accuracy reached 80%; it has recently
reached its highest rate, at over 95%, with Google holding the speech accuracy title (Globalme
Language & Technology, 2019). The advances in speech recognition observed in the late 1990s
triggered a number of studies examining ASRs’ potential for L2 pedagogy, particularly for the
teaching of oral skills (Coniam, 1999; Derwing et al., 2000).
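The accuracy figures cited above are reported without a description of how they were computed. One widely used measure in speech recognition is the word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into a reference transcript, divided by the number of words in the reference. The Python sketch below assumes, for illustration only, that accuracy can be approximated as 1 minus WER; the function name and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between two transcripts, divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # previous[j] holds the edit distance between the reference words processed
    # so far and the first j words of the hypothesis.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one misrecognized word in a six-word request.
    wer = word_error_rate("play a song by the beatles", "play a song by the beetles")
    print(f"WER = {wer:.3f}, approximate accuracy = {1 - wer:.3f}")
```

Because WER counts insertions as well as substitutions and deletions, it can exceed 1 for very poor output, so the 1 minus WER approximation is only meaningful for reasonably accurate recognizers.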
Text-to-speech synthesizers (TTS, or text readers), on the other hand, are computer ap-
plications that convert text into speech. This feature is found on most modern computers and
mobile devices that “speak” to users after searching for the information requested in a da-
tabase. Applications that employ TTS synthesis include GPSs, voice assistants (e.g., Alexa,
Siri), and online or dedicated tools such as VoiceMaker and NaturalReader. Despite its
novelty appeal, TTS is considered the oldest speech technology, with its origin dating back to
the early 1000s AD when “machines” were built to imitate human speech to answer yes/no
questions (Hyman, 2011), receiving major improvements in the 1930s with the development
of the Vocoder (Bell Laboratories) and The Voder, both highly limited in terms of intellig-
ibility. Over the past 20 years, the quality of synthesized speech has been steadily improving,
with some speech features indistinguishable from actual human speech (Bione & Cardoso,
2020), thus calling researchers’ and pedagogues’ attention to TTS’ potential for the teaching
of L2 speaking.
The advent of the social, participatory Web 2.0 in the early 2000s brought us improve-
ments in internet technologies that allowed L2 researchers and pedagogues to embrace tools
with potential to promote interactions through the computer, via computer-mediated com-
munication. The two most frequently used internet technologies are videoconferencing (e.g.,
Skype and Zoom) and social media applications such as Facebook, Twitter, and social
messaging (e.g., Messenger and WhatsApp). Using these technologies, learners can interact
with one another through their devices using text, voice-only or videoconferencing. Real-time
web-based chatting appeared in the mid-1990s and instantly drew the attention of scholars,
who began to explore its potential for L2 speaking pedagogy. Some of this research includes
works that emphasize the pedagogical use of Skype (Mullen et al., 2009), Twitter
(Mompean & Fouz-González, 2016), audioblogs (Hsu et al., 2008), digital games (Grimshaw
& Cardoso, 2018), and virtual learning environments such as Moodle, which combine many
of the through-the-computer tools available (Barcomb & Cardoso, 2020).
Finally, the new millennium has experienced great advances in virtual reality (VR), a tech-
nology with roots in Morton Heilig’s approach to moviemaking (“Experience Theatre”, pa-
tented in 1962), which attempted to incorporate multiple senses (e.g., sound, smell, touch, and
sight) into the event. VR is a simulated, multimodal experience that can be similar to or com-
pletely different from the real world. Although the use of specialized headsets is not a require-
ment for VR implementation (e.g., Second Life – Cooke-Plagwitz, 2008), currently, standard VR
systems use either dedicated headsets or multi-projected environments to create an immersive,
multimodal experience, which can be used for entertainment or educational purposes (e.g., to
create an interactive learning environment that promotes learning through or with a computer).
Given the accessibility and the interactive essence of VR technology, it is not surprising that it
has sparked great interest among educators who seek to provide an immersive L2 learning experience
to their students (see Marcel, 2020 for a review of the literature).
Because of the recent advances in computer technology and the available research high-
lighting CALL’s pedagogical potential, effective speaking lessons (e.g., those that adhere to
Chapelle’s 2001 criteria for adopting effective CALL tasks, discussed earlier) can be designed
and delivered to students extensively and inexpensively, providing more interactionally au-
thentic practice opportunities than is possible in the traditional (i.e., not CALL-assisted) L2
classroom.
quality feedback, including Hassani et al.’s (2016) proposal for a computational model in
an intelligent virtual environment, which assesses learners’ speaking skills, estimates their
conversational abilities, and adjusts the level of communication complexity accordingly,
with the goal of improving students’ oral communicative skills.
However, this type of customizable, intelligent feedback is still uncommon in most ap-
plications that target speaking, leading many students to abandon them, particularly
when used autonomously (Tuncay, 2020).
Finally, one of the main obstacles to using speech technologies for teaching oral skills used to
be that text synthesizers and speech recognizers were not always deemed appropriate or beneficial
for pedagogical purposes: synthesized voices were not as accurate, natural, or intelligible
as human speech (e.g., Stevens et al., 2005), and ASR applications performed less
favourably than humans in terms of accuracy and their ability to identify pronunciation errors
(e.g., Derwing et al., 2000). Fortunately, these two technologies have improved significantly over
the last decade, with research indicating that the accuracy of voice recognition systems is now
nearly on par with that of humans (Globalme Language & Technology, 2019); see also Moussalli and
Cardoso (2019) for an experiment contrasting IPAs and humans in their ability to understand L2 speech.
Differences between human and synthesized voices are negligible in measures of understanding
and phonological processing (i.e., comprehensibility, intelligibility, and the aural identification of
English regular past forms; Bione & Cardoso, 2020).
the potential to reduce learners’ communication anxiety and, as a result, increase their
willingness to communicate and overall motivation to learn (Grimshaw & Cardoso, 2018).
As acknowledged by a reviewer, this generalization is particularly interesting in the context
of the COVID-19 crisis, when many students had no choice but to engage in CMC. The
short- and long-term impact that this experience will have on the use of CMC for language
learning remains unknown.
Chien et al. (2020), for example, who adopted spherical video-based virtual reality with
360-degree videos and photos for emulating virtual environments, viewed through a
head-mounted display. The authors found that the VR system contributed not only to the
participants’ improvement in speaking performance, but also to their motivation and critical thinking.
Positive outcomes were also observed in a study using mixed (real and virtual) reality, in
which Marcel (2020) confirmed that the proposed customized VR environment had a po-
sitive effect on the oral production of English vocabulary. Despite these observed pedago-
gical benefits, VR research remains in the realm of exploratory, not experimental research.
An important benefit of the four technologies discussed in this part is that they engage
learners in interactions around, through and with computers. This type of computer-mediated
interaction has a number of advantages from a pedagogical standpoint, mostly because it
does not require anxiety-inducing face-to-face exchanges: in addition to increasing learners’
opportunities to practice in an interactive and stress-free environment, these technologies
have the potential to enhance learning, as demonstrated earlier.
1. Development. This level focuses on the development of a new tool, usually based on
insights from second language acquisition, computer sciences, and CALL. For an example,
see Sundberg and Cardoso (2018), a development study that introduces a music
app for the teaching of L2 oral skills and vocabulary, and provides the theoretical and
pedagogical rationale behind its design.
2. Exploration. This second stage involves an examination of the pedagogical affordances
and potential of a novel technology. This stage may serve as a follow-up to a tool’s
development, illustrated earlier, or it can be used to investigate an existing technology
(see van Lieshout & Cardoso, 2022, for an example exploring Google Translate’s speech
capabilities for the autonomous learning of a set of spoken phrases in L2 Dutch).
3. Assessing suitability. This stage involves research on usability, acceptance, and learners’
attitudes towards a technology. Once a new technology or tool is deemed pedagogically
appropriate, one possibility a CALL researcher might entertain is to investigate its
pedagogical suitability via quantitative (e.g., surveys), qualitative (e.g., interviews, focus
groups), or mixed methods. See Walker et al. (2011) for an example of a feasibility study
evaluating a computer-based L2 training module for healthcare professionals – the
Virtual Language Patient.
4. Assessing pedagogical effectiveness. Finally, the last stage in this framework consists of
assessing a target technology’s pedagogical effectiveness. These studies often employ
quantitative or mixed methods, with a pretest–posttest research design to investigate the
effects of the chosen technology on learning. Chiu et al. (2007) describe a study in which
participants improved their speaking abilities using CandleTalk, an ASR-equipped ap-
plication that promotes simulated conversations.
In practice, depending on the research questions being addressed, many studies combine two
or more of these levels.
• Before adopting a technology, consider using Chapelle and Jamieson’s (2007) criteria to
assess its pedagogical suitability by asking the following questions (simplified): Is there
potential for learning aspects of L2 speaking (e.g., phonological, interactional skills)? Is
it accessible and user-friendly? Are there opportunities for: feedback, engagement with
the L2, focus on form and meaning, and authentic interactions?
• If the above scrutiny is persuasive, reflect on Salaberry’s (2001) concerns about CALL:
Will the selected technology be pedagogically effective? Can it easily be integrated into
the curriculum without major disruptions? Will it be efficient in terms of human and
material resources?
• Decide on the role that the adopted technology will play in teaching: Is it to diagnose
problems or to improve or assess speaking skills? What aspects of speaking will it ad-
dress: pronunciation, interactional skills, speaking functions, or oral fluency? These
objectives can be easily integrated into computer-assisted curricula, but they may require
adaptations.
• Following insights from Sundqvist and Sylvén (2016), determine how technology will be
implemented: will it be used in in-class or out-of-class contexts? If the latter, will it be
used as an extension of the classroom (e.g., to complement in-class discussions), or in-
formally to promote autonomous learning (e.g., for strategy development – Chapelle
& Jamieson, 2007)? Before deciding, teachers should ensure that all students have access
to technology outside of school: technology is not always a great equalizer.
• Based on learners’ needs and interests, as well as the insights from Egbert and Shahrokni
(2018) to promote speaking as part of social interaction, decide on how the
learner–computer interaction will occur: around, through, or with the computer. The main
advantage of through- and with-computer applications (e.g., synchronous videoconferencing via
Skype or Zoom, IPAs) is that they require spontaneous speech, while around technologies
(e.g., videoblogging) compel planning – two authentic environments that characterize the
act of speaking. Research has shown that when learners plan and produce speech, their
utterances are generally more accurate and complex (Sotillo, 2000).
7 Future Directions
This chapter has introduced L2 speaking development from a CALL perspective, reviewed
some of the relevant literature on the subject, and highlighted the pedagogical implications of
what is known about the art of teaching L2 speaking with technology. One of the important
generalizations we can make about the field is that it has often taken advantage of advances
in computer technology, and that it has greatly contributed to the enhancement of L2
teaching and learning. Despite these optimistic remarks, there remain some issues that need
to be investigated and/or corroborated in future research. For the sake of brevity, only three
of these directions will be discussed.
First, L2 speaking researchers should expand their focus of analysis to include other
aspects of what it means to speak an L2. As indicated earlier, most of the research on
CALL-based speaking targets pronunciation, with the majority covering segments (e.g., Cardoso,
2018; Thomson, 2011; but see Anderson-Hsieh, 2013 for a suprasegmental focus). This is
possibly because pronunciation is an area that benefits from technology in a way that cannot
be replicated in face-to-face interactions in the classroom (e.g., it requires decontextualized
articulation practice – Pennington, 1999). Others use unclear definitions for what constitutes
speaking (e.g., Golonka et al., 2014; Mompean & Fouz-González, 2016). One way of
characterizing speaking was proposed by Bohlke (2014), for whom speaking competence can
be broken down into four skill areas: phonological, interactional, extended discourse, and
speech functions. How would the pedagogical use of certain technologies contribute to the
development of these subcomponents of speaking?
Another topic in need of further investigation is methodological in nature. Although there
have been many methodological advances in CALL research, particularly in speaking, as
implied in earlier discussions, this skill has not received the attention it deserves from em-
pirical, experimental perspectives. For instance, it could be argued that very few available
studies could be classified as “assessing pedagogical effectiveness,” considering the four-level
hierarchy established earlier to describe the main research methods utilized in the field.
Including pretest–posttest designs with control groups would strengthen the internal validity
of findings and provide researchers with a high level of control over the experiment (e.g., to
isolate specific variables). Inspired by Chapelle (2003), another interesting direction would be
to investigate learners’ behaviour on the computer (e.g., how they negotiate meaning, how
they manage communication breakdowns) to establish a cause-and-effect relationship
between that behaviour and the potential acquisition of the target L2 form.
Walcir Cardoso
Further Reading
Blake, R. (2017). Technologies for teaching and learning L2 speaking. In C. Chapelle & S. Sauro (Eds.),
The handbook of technology and second language teaching and learning (pp. 107–117). London:
Wiley-Blackwell.
An SLA-informed introduction to CALL tasks and tools that can be used to promote L2 speaking. It
focuses on tutorial exercises (to engage students in self‐directed speech practice) and computer-
mediated communication (e.g., videoconferencing).
Chapelle, C., & Jamieson, J. (2007). Tips for teaching with CALL: Practical approaches to
computer-assisted language learning. London: Pearson.
The L2 instructor is introduced to the art of teaching with technology. Two chapters are relevant: One
dedicated to speaking (Chapter 6) and another to computer-mediated communication (Chapter 7).
These chapters constitute a good basis for teachers who would like to learn more about the art of
teaching speaking with CALL.
Sundqvist, P., & Sylvén, L. (2016). Extramural English in teaching and learning: From theory and
research to practice. London: Palgrave Macmillan.
An examination of theory, research, and practice of learning L2 English in the digital wilds. The book
explores how this environment affects learning, and describes tools that teachers can use to develop
their students’ language skills, including speaking. Although the focus is on English, the ideas can be
implemented in the teaching of any L2.
References
Anderson-Hsieh, J. (2013). Interpreting visual feedback on English suprasegmentals in computer
assisted pronunciation instruction. CALICO Journal, 11(4), 5–22.
Baker, M. (2016). The negotiation of meaning in epistemic situations. International Journal of Artificial
Intelligence in Education, 26, 133–149.
Barcomb, M., & Cardoso, W. (2020). Rock or lock? Gamifying an online course management system
for pronunciation instruction: Focus on English /r/ and /l/. CALICO Journal, 37(2), 127–147.
Beatty, K. (2010). Teaching and researching computer-assisted language learning. Harlow, UK:
Longman.
Bione, T., & Cardoso, W. (2020). Synthetic voices in the foreign language context. Language Learning
& Technology, 24(1), 169–186.
Bohlke, D. (2014). Fluency-oriented second language teaching. In M. Celce-Murcia, D. Brinton, & M.
Snow (Eds.), Teaching English as a second or foreign language. National Geographic Learning
(pp. 121–135). Boston: Heinle Cengage.
Cardoso, W. (2018). Learning L2 pronunciation with a text‑to-speech synthesizer. In P. Taalas, J.
Jalkanen, L. Bradley, & S. Thouësny (Eds.), Papers from EUROCALL (pp. 1–6). Research-
publishing.net.
Technology for Speaking Development
Chapelle, C. (2003). English language learning and technology. Amsterdam, Netherlands: John
Benjamins.
Chapelle, C. (2010). Evaluating computer technology for language learning. TESOL Ontario Contact,
36(2), 36–55.
Chapelle, C., & Jamieson, J. (2007). Tips for teaching with CALL: Practical approaches to
computer-assisted language learning. Harlow: Pearson-Longman.
Chien, S., Hwang, G., & Jong, M. (2020). Effects of peer assessment within the context of spherical
video-based virtual reality on EFL students’ English-speaking performance and learning
perceptions. Computers & Education, 146, 1–20.
Chiu, T.-L., Liou, H.-C., & Yeh, Y. (2007). A study of web-based oral activities enhanced by
automatic speech recognition for EFL college learning. Computer Assisted Language Learning, 20,
209–233.
Coniam, D. (1999). Voice recognition software accuracy with second language speakers of English.
System, 27(1), 49–64.
Cooke-Plagwitz, J. (2008). New directions in CALL: An objective introduction to Second Life.
CALICO Journal, 25(3), 547–557.
Dehghanzadeh, H., Fardanesh, H., Hatami, J., Talaee, E., & Noroozi, O. (2019). Using gamification to
support learning English as a second language: A systematic review. Computer Assisted Language
Learning. doi: 10.1080/09588221.2019.1648298.
de Vries, B., Cucchiarini, C., Bodnar, S., Strik, H., & Hout, R. (2015). Spoken grammar practice and
feedback in an ASR-based CALL system. Computer Assisted Language Learning, 28(6), 550–576.
Derwing, T., Munro, M., & Carbonaro, M. (2000). Does popular speech recognition software work
with ESL speech? TESOL Quarterly, 34, 592–603.
Dickerson, W. (2015). Using orthography to teach pronunciation. In M. Reed & J. Levis (Eds.), The
handbook of English pronunciation (pp. 488–503). London: Wiley Blackwell.
Dickinson, M., Brew, C., & Meurers, D. (2013). Language and computers. London: Wiley-Blackwell.
Dizon, G. (2017). Using intelligent personal assistants for second language learning: A case study of
Alexa. TESOL Journal, 8(4), 811–830.
Dizon, G. (2020). Evaluating intelligent personal assistants for L2 listening and speaking development.
Language Learning & Technology, 24(1), 16–26.
Egbert, J., & Shahrokni, S. (2018). CALL principles and practices. Open educational resources (OER).
https://opentext.wsu.edu/call.
Enge, E. (2019). Rating the smarts of the digital personal assistants in 2019. Retrieved from
https://www.perficient.com/insights/research-hub/digital-personal-assistants-study#which_smartest
Everly, P. (2018). Expanding pronunciation instructional time beyond the classroom: Microsoft Office
2016 OneNote Class Notebook as an interactive delivery platform. TESOL Journal, 10(2).
doi: 10.1002/tesj.421
Globalme Language & Technology (2019, July). Speech recognition technology overview. Retrieved
from https://www.globalme.net/blog/the-present-future-of-speech-recognition
Golonka, E., Bowles, A., Frank, V., Richardson, D., & Freynik, S. (2014). Technologies for foreign
language learning: A review of technology types and their effectiveness. Computer Assisted Language
Learning, 27(1), 70–105.
Grimshaw, J., & Cardoso, W. (2018). Activate space rats! Fluency development in a mobile
game-assisted environment. Language Learning & Technology, 22(3), 159–175.
Hanson-Smith, E. (2003). A brief history of CALL theory. CATESOL Journal, 15(1), 21–30.
Hassani, K., Nahvi, A., & Ahmadi, A. (2016). Design and implementation of an intelligent virtual
environment for improving speaking and listening skills. Interactive Learning Environments, 24(1),
252–271.
Hsu, H.-Y., Wang, S.-K., & Comac, L. (2008). Using audioblogs to assist English language learning:
An investigation into student perception. Computer Assisted Language Learning, 21, 181–198.
Hwang, W.-Y., Shih, T., Ma, Z.-H., Shadiev, R., & Chen, S.-Y. (2016). Evaluating listening and
speaking skills in a mobile game-based learning environment with situational contexts. Computer
Assisted Language Learning, 29(4), 639–657.
Hyman, W. (2011). The automaton in English renaissance literature, literary and scientific studies of early
modernity. Milton Park: Routledge.
Kukulska-Hulme, A. (2016). Personalization of language learning through mobile technologies.
Cambridge: Cambridge University Press.
Lee, W. (1979). Language teaching games and contests. Oxford: Oxford University Press.
Levy, M., & Hubbard, P. (2005). Why call CALL “CALL”? Computer Assisted Language Learning,
18(3), 143–149.
Levy, M., & Stockwell, G. (2006). CALL dimensions: Options and issues in computer-assisted language
learning. Mahwah: Lawrence Erlbaum.
Levy, M. (2009). Technologies in use for second language learning. The Modern Language Journal, 93,
769–782.
Li, J. (2020). A systematic review of video games for second language acquisition. In P. Sullivan, J.
Lantz, & B. Sullivan (Eds.), Handbook of research on integrating digital technology with literacy
pedagogies (pp. 472–499). IGI Global.
Liakin, D., Cardoso, W., & Liakina, N. (2015). Learning L2 pronunciation with a mobile speech
recognizer: French /y/. CALICO Journal, 32(1), 1–25.
Liakin, D., Cardoso, W., & Liakina, N. (2017). The pedagogical use of mobile speech synthesis: Focus
on French liaison. Computer Assisted Language Learning, 30(3-4), 348–365.
Mackey, A., & Gass, S. M. (2011). Research methods in second language acquisition. Hoboken, NJ:
Wiley-Blackwell.
Marcel, F. (2020). Mobile mixed reality technologies for language teaching and learning [Unpublished
doctoral dissertation]. University of Toronto.
McCrocklin, S. (2016). Pronunciation learner autonomy: The potential of automatic speech
recognition. System, 57, 25–42.
Mompean, J., & Fouz-González, J. (2016). Twitter-based EFL pronunciation instruction. Language
Learning & Technology, 20(1), 166–190.
Moussalli, S., & Cardoso, W. (2019). Intelligent personal assistants: Can they understand and be
understood by accented L2 learners? Computer Assisted Language Learning.
doi: 10.1080/09588221.2019.1595664
Mullen, T., Appel, C., & Shanklin, T. (2009). Skype-based tandem language learning and web 2.0. In
M. Thomas (Ed.), Handbook of research on Web 2.0 and second language learning (pp. 101–118).
Hershey, PA: IGI Global.
Parmaxi, A. (2020). Virtual reality in language learning: A systematic review and implications for
research and practice. Interactive Learning Environments. doi: 10.1016/j.procs.2021.08.141.
Pennington, M. C. (1999). Computer‐aided pronunciation pedagogy: Promise, limitations, directions.
Computer Assisted Language Learning, 12(5), 427–440.
Qian, M., Chukharev-Hudilainen, E., & Levis, J. (2018). A system for adaptive high variability
segmental perceptual training: Implementation, effectiveness, transfer. Language Learning &
Technology, 22(1), 69–96.
Rueb, A., Cardoso, W., & Grimshaw, J. (2018). The acquisition of French vocabulary in an interactive
digital gaming context. In P. Taalas, L. Bradley, & S. Thouësny (Eds.), Language learning as
exploration and encounters (pp. 272–277). Research-publishing.net.
Saito, K., & Akiyama, Y. (2017). Video-based interaction, negotiation for comprehensibility, and
second language speech learning: A longitudinal study. Language Learning, 67(1), 43–74.
Salaberry, M. R. (2001). The use of technology for second language learning and teaching: A
retrospective. The Modern Language Journal, 85, 39–56.
Sauro, S., & Zourou, K. (2019). What are the digital wilds? Language Learning & Technology,
23(1), 1–7.
Sevilla-Pavón, A., Martínez-Sáez, A., & Macario de Siqueira, J. (2011). Self-assessment and tutor
assessment in online language learning materials: InGenio FCE Online Course and Tester. In S.
Thouësny & L. Bradley (Eds.), Second language teaching and learning with technology (pp. 45–69).
Research-publishing.net.
Shadiev, R., & Yang, M. (2020). Review of studies on technology-enhanced language learning and
teaching. Sustainability, 12(2). doi: 10.3390/su12020524
Sotillo, S. (2000). Discourse functions and syntactic complexity in synchronous and asynchronous
communication. Language Learning & Technology, 4, 82–119.
Stevens, C., Lees, N., Vonwiller, J., & Burnham, D. (2005). On-line experimental methods to evaluate
text-to-speech (TTS) synthesis: Effects of voice gender and signal quality on intelligibility,
naturalness and preference. Computer Speech & Language, 19(2), 129–146.
Sundberg, R., & Cardoso, W. (2019). Learning French through music: The development of the Bande à
Part app. Computer Assisted Language Learning, 32(1-2), 49–70.
Sundqvist, P., & Sylvén, L. (2016). Extramural English in teaching and learning: From theory and
research to practice. London: Palgrave Macmillan.
Thomson, R. I. (2011). Computer assisted pronunciation training: Targeting second language vowel
perception improves pronunciation. CALICO Journal, 28, 744–765.
Torsani, S. (2016). CALL teacher education. Rotterdam: Sense Publishers.
Tuncay, H. (2020). App attrition in computer-assisted language learning: Focus on Duolingo
[Unpublished master’s thesis]. McGill University.
van Lieshout, C., & Cardoso, W. (2022). Google Translate as a tool for self-directed language learning.
Language Learning & Technology, 26(1), XX–XX.
Walker, N., Trofimovich, P., Cedergren, H., & Gatbonton, E. (2011). Using ASR technology in
language training for specific purposes. CALICO Journal, 28(3), 721–743.
Weizenbaum, J. (1966). ELIZA – a computer program for the study of natural language
communication between man and machine. Communications of the ACM, 9(1), 36–45.
22
CURRICULUM ISSUES IN
TEACHING L2 SPEAKING
Jonathan Newton, Trang Le Diem Bui, Bao Trang Thi Nguyen,
and Thi Phuong Thao Tran
1 Introduction/Definitions
This chapter brings a classroom-based, teacher-oriented perspective to the topic of teaching
second language (L2) speaking. It explores the practical classroom realities, contingencies
and challenges experienced by teachers, and draws on research related to these issues. This
approach reveals the maze of deep-seated educational, sociocultural, and learner-internal
issues that, in many educational contexts, make L2 speaking uniquely challenging
to teach. Giving credence to the practical factors teachers face in teaching L2 speaking is a
crucial step in building bridges between the lived experience of teachers and the concerns of
second language acquisition (SLA) theory-building and research. A step in bridge-building,
as illustrated in this chapter, is teacher research, which takes as its starting point real world
issues experienced in specific educational ecologies, and which offers tangible, workable
innovations with transformational potential.
The chapter is something of a three-act play. In the first act (Into the maze), we present a
fictional account of a speaking lesson in an English as a foreign language (EFL) classroom
somewhere in Asia. While fictional, the account brings together our years of experience
teaching and observing teachers in similar classrooms. The scenario is related to other studies
revealing similar themes. The second act (Through the maze) consists of three case studies of
research conducted in teaching L2 speaking in Vietnamese EFL classrooms. Each study
focuses on a different education sector: primary school, high school, and university. The
three studies share a commitment to understanding the school and classroom ecology in
which the research is situated before introducing a pedagogic innovation to address the
specific challenges faced in this setting. In the third and final act (Above the maze), we tease
out themes and offer implications for moving the teaching of L2 speaking forward.
Throughout, we seek to place teachers’ concerns and realities front and centre to emphasize
the value of research informed by practice – a “researched pedagogy” for teaching L2
speaking (Samuda et al., 2018).
Starting with real L2 classrooms raises two questions. First, which L2? As our expertise
and experience are with (and as) EFL teachers, this will be our primary focus, although we
expect the issues raised will resonate with teachers of other L2s. Second, which classrooms?
L2 speaking is taught in a huge range of circumstances and learned for a myriad of purposes.
Consider the differences between teaching L2 speaking in an ESP (English for Specific
Purposes) class for air traffic controllers in the UAE, and teaching English to a class of 40 or
so lower primary school students in a rural school in Cambodia. A single chapter will
struggle to do justice to this diversity, so we have chosen to showcase issues faced by teachers
in contexts we are familiar with, mostly English as a compulsory foreign language in Asian
schools and universities. We anticipate many of these issues will resonate with readers from
different contexts.
required by the ministry of education, and achieved a disappointing but not surprising
CEFR level B1 (Threshold independent user).
Today’s speaking class is her least favourite class. It confronts her with her own sense of
inadequacy as a proficient communicator in spoken English. To make matters worse, she has
trouble controlling the students in these speaking lessons, which always seem to be on the
edge of chaos. At least with the reading and writing class she can maintain order [“Switch to
writing if you find them turning restless,” was the advice provided by the UK Department of
Education and Science (1979), as reported by Alexander (2020, p. 91)]. The scheduled
textbook lesson begins with a vocabulary exercise followed by listening to a recording of a
conversation and then practice of a set dialogue. The unit concludes with a communicative
activity in which learners are expected to circulate around the class to survey other students
about their favourite foods. As has been her usual strategy, Ann spends as much of the lesson
as possible on the earlier exercises in the unit, where she uses her mother tongue to explain
points of grammar and aspects of word meaning and use. In the second half, she leads the
whole class in choral practice of the textbook dialogue before learners work in pairs to
practise the same dialogue.
The lesson concludes with the communication activity, but she keeps it brief. There is little
space to move in the packed room; students simply turn around and pair up with the person
behind them. They are not clear about the point of the activity and spend most of it talking in
their mother tongue. Off-task talk is frequent. They have done this kind of activity enough
times and with the same partner to be rather uninterested in it by now. Nevertheless, they
enjoy the chance to talk freely without close scrutiny, something frowned on in other classes.
The cramped classroom prevents Ann from moving around the groups to provide guidance
and monitoring; she usually focuses on the learners at the front. Consequently, most
students get little if any help or attention from the teacher, which contributes to their
half-hearted approach to the task. Noise levels rise dramatically and Ann has to bring the activity
to a premature close out of concern for complaints from the teachers on either side. Two
pairs are asked to stand and perform the task again, and do so with much laughter from the
class. The school bell rings for lunch just as Ann sums up key learning points. The lesson
is over.
The teachers avoided the CLT approach of the new textbook. They were uncertain
how to implement this approach and so, drawing on their ‘apprenticeship of
observation’, reverted to teacher-centred approaches using familiar and comfortable
approaches…Class time was mostly devoted to teacher talk
(Humphries & Burns, 2015, p. 246).
Carless (2003, 2007, 2015; Deng & Carless, 2009) and others (e.g., Chen & Wright, 2017)
have reported similar findings from research on the introduction of CLT and task-based
language teaching (TBLT) in primary and secondary schools in China and Hong Kong.
Chen and Wright, for example, found that even in a secondary school in mainland China
with a track record of CLT and TBLT, teachers adopted a “weak” version of CLT in which
the so-called “tasks” in the teachers’ lesson plans were often little more than “end-of-class
and add-on activities for practicing oral skills” (p. 532). Behind this practice was the teachers’
persistent belief that practising communicative English compromises the accuracy of the
learners’ speaking.
An even bleaker picture is painted by King (2013) in a large multi-site observational study
of the classroom behaviour of 924 English language learners across nine universities in
Japan. Findings showed that students were responsible for less than 1% of class talk and that
more than a fifth of class time involved no oral participation by either students or teachers.
King adopts a dynamic systems perspective (Cameron & Larsen-Freeman, 2007) to explain
the complex interaction of learner-internal and sociocultural factors leading to a strong
dispreference for talk in these classrooms. As King argues, issues such as lack of L2 ability,
unfamiliarity with topics/tasks, and problems with the delivery of the teacher’s talk all
converge to make “many learners…simply unwilling to engage in the potentially
embarrassing behaviour of active oral participation for fear of being negatively judged by their
peers” (p. 339).
Even in the more cosmopolitan high school classrooms of Singapore, Aw (2017) found
considerable resistance to dialogic approaches to teaching. Teachers failed to engage students
in the kind of oracy work through which speaking mediates thinking, and notably higher
order thinking skills (Alexander, 2020). Teachers expressed insecurity about their expertise in
“teaching” speaking, and, along with senior school management, parents and students, cited
high stakes written examinations as a major impediment to being able or willing to see a
broader role for speaking in thinking and learning.
These illustrative studies are sufficient to establish that, for many teachers and even more
learners, teaching and learning L2 speaking can be uniquely challenging. Dörnyei and
Ushioda (2013, p. 339) refer to the “sociocultural maze” of issues that affect learner
motivation, and we think the phrase applies equally to the challenge of teaching L2 speaking.
Nevertheless, this maze has escape routes, which we now turn to.
I think the steps are so fixed. It’s like we arrange and assign things for students. We
show them this is what they should say. Then students just have to follow the
structural patterns we have taught them. This fails to enhance students’ ability to
use the language…It is like the learning process is very theoretical;…we have to
provide students something in advance and they have to follow. We provide the
theory for students before we get them to practice. I think this cannot enhance
students’ ability to use English language. It is like we force them to do what we want
them to do, speak what we want them to speak
(Bui & Newton, forthcoming).
These views from the teachers justified the initiation of a second phase of the research in
which task-based versions of two PPP lessons were designed and implemented by three
teachers. These redesigned lessons were intended to reduce mechanical pattern practice and
emphasize meaningful language use from the beginning. Specifically, the teacher presentation
phase was replaced by an input-based task (e.g., listen to a dialogue about the school
timetable and fill in a timetable); the practice phase was replaced with an information gap
task; and the production phase was replaced with teacher-led discussion of public
performance of two or three pairs of students, and related language analysis activities which
allowed for a more deliberate focus on the target structures.
As reported in Bui (2019), in all performances of the main tasks in both lessons, learners
successfully completed the information gap tasks, and did so without the often lengthy
teacher presentation of language structures common in the PPP lessons. In performing the
information gap tasks, the learners consistently co-constructed utterances, self-corrected
errors, corrected each other’s errors, and negotiated for meaning to resolve comprehension
difficulties. They also frequently drew on their L1 to resolve problems and fill gaps in
expression. While some teachers might baulk at the use of L1 by learners in an L2 speaking
class, the learners usually used it to manage the task performance and seek/provide
assistance, as in the following example.
Example 1
P1: What subject do you have on Friday?
P2: It…I have Art and Vietnamese and Science…Vietnamese and Science
P1: Vietnamese hả? (Is it Vietnamese?) Đọc lại cho tui nghe coi. (Say it again)
P2: I have Art and Vietnamese and Science
P1: Friday phải không? (Is it Friday?)
P2: Friday
In interviews, both teachers and pupils reported uniformly positive experiences in these
lessons. One teacher said the following:
The pupils could learn better when the two speaking lessons were taught this way. It
may be because they were cognitively engaged during the lesson. They had to think
to work out the language to speak. They had to manage their talk by themselves.
Previously the pupils did not have such experiences. Their learning was controlled.
They just followed the teacher
(Bui, 2019, p. 156).
I like to exchange information about the timetable with my friend. I tried to help my
friend understand using the language I knew. This helps me speak English more
naturally
(Newton & Bui, 2020, p. 43).
In summary, the study showed how teaching L2 speaking for young beginner-level EFL
learners could be transitioned from a heavy emphasis on drilling and memorizing
Example 2
S1: Ê, they are poor hay they poor thôi hè? (Hey, they are poor or just they poor?)
S2: Er they are poor. Poor nớ tính từ mà, phải có động từ! (That poor is an adjective, it
needs a verb!)
S1: They are poor. They are poor.
In this example, the meaning-making required by the task pushes S1 to attend to the
grammatical issue of whether to use “they are poor” or “they poor.” S2 provides an answer
“they are poor,” with a meta-linguistic explanation that poor is an adjective, not a verb. In
simple terms, S1 notices a gap in her knowledge, collaborates with a partner to fill the gap,
and puts the new knowledge to immediate use.
Also evident in this example is extensive use of L1 in rehearsal to resolve language gaps.
Crucially, because the learners anticipated the possibility of being called on to perform the
task publicly in English, their use of L1 in rehearsal overwhelmingly functioned to
resource the upcoming L2 performance, not to replace or avoid it (Seals et al., 2020; Skehan
& Foster, 2005, 2016). The following example of rehearsal and performance illustrates this
point.
Example 3
Rehearsal
S1: I’m erm mình nói kinh doanh have business à? (I want to say “do business”. Should it be
“have business”?)
S2: I do business thôi! (I do business!)
S1: I do business and erm I gain kiếm được…kiếm được là chi? (earn…how to say “earn money”?)
S2: raise (.) uhm kiếm được là chi hè (how to say “earn”) (.) earn (.)
S1: earn! and I earn a lot of money
Public Performance (PP)
S1: Hi Linh. How are you doing?
S2: I’m fine. And what’s your job?
S1: I do business and I earn a lot of money and I want to take uhm part in volunteer work
S2: Ok. That’s a good idea and erm what are you going to do with this money?
In summary, the research shows how teachers in this school adopted an approach to
teaching L2 speaking that successfully addressed local contingencies and, in Phase 2, showed
Example 4
RESEARCHER: In your opinion, what culture content should be taught in class?
NHU: I think, teachers should teach ways of behaviour. If we understand better about
ways of behaviour, we can avoid cultural shock in future work and study in many
places.
DIEM: I see learning about culture influences my maturity a lot. The more countries’
culture you understand about the better, for future work. For example, I’m not
sure I’ll work in America. I may work in Thailand instead. In that case, I need to
master how Thai people communicate.
DUY: I agree with Diem…not necessary to focus on a certain country. If you know
culture of more countries, you can work with diversity more easily.
Drawing on these findings, the second phase sought to develop a more principled
engagement with culture by involving the case study teachers in two PAR workshop cycles. The
teachers were introduced to principles and practical examples of intercultural language
teaching and subsequently, in their classroom teaching, implemented the redesigned
intercultural-oriented lessons. Materials for the workshops included lessons from their
textbook redesigned to reflect the following core principles of iCLT:
The redesigned iCLT lessons were structured as follows: (a) The students were given a
scenario from the relevant textbook unit (e.g., how to ask a teacher to write you a letter of
recommendation) and asked to make and discuss hypotheses about cultural dimensions of
the communication to be mindful of; (b) The students created role plays for this scenario in
Vietnamese and/or English; (c) The students performed their role plays and discussed
differences between them; (d) They listened to and analysed the textbook dialogue for the same
scenario, comparing it to their role plays and to their original hypotheses about cultural
dimensions of the scenario.
Space does not allow a detailed description, but two overall findings warrant comment.
First, classroom observations of the revised lessons revealed a marked increase in student
engagement and interaction with each other and the teacher. Second, in interviews, both
teachers and learners expressed unanimously positive views of the experience of intercultural
teaching and learning. Two comments from one of the teachers illustrate this point:
They [the students] analysed conversations, they get involved, they relate,…they
revise, they get involved…all steps they have to get involved.
[In the future] I will analyse the lesson plans more in detail…, and reflecting, relating
because that means you bring real life in your teaching. That is the part I am very
pleased with. I think the reasons why students cannot speak fluently in real situation
because they only learnt from book, they don’t bring real life in class.
(Tran, 2020, p. 191).
Two comments from students in the focus groups reveal complementary student perceptions
of the experience:
It was interesting as I understood that the ways Vietnamese and foreigners express
opinions are greatly different. For example, foreigners pay attention to phrases that
indicate politeness such as Can I, Could I, Will you. When they give advice, they
include reason to explain it to convince listeners. When disagreeing, they use phrases
that minimize conflicts such as I see what you mean but…I see I better understand
ways of speaking and using English to speak with foreigners.
I like sharing experience most because it is followed by a practical situation for us to
apply language structures to talk about it…It is more interesting to listen to friends’
sharing experience, more lessons and comments on one issue.
(Tran, 2020, p. 195).
In summary, this research showed how the EFL teachers at a Vietnamese university were
guided to successfully adopt an intercultural stance in their teaching of L2 speaking.
Furthermore, they achieved this in a setting with no previous experience or expertise in
intercultural teaching, and where the curriculum was devoid of intercultural content.
Importantly for other teachers who work from a similarly prescribed curriculum, these
outcomes were accomplished through adapting rather than replacing the set textbook.
Six key features of teacher research are modelled in the three studies described earlier: a
goal of understanding a classroom issue; teachers as researchers; a subjective orientation;
context specificity; a flexible, open-ended process; and the centrality of teacher knowledge
(Borg, 2006).
Studies 1 and 3 also offer models of how to build teacher professional development
(TPD) into teacher research, a crucial factor for strengthening sustainability. They do
this by modelling the five principles of effective TPD
proposed by Desimone (2009, p. 184): (1) a focus on subject-matter content and how
students learn that content; (2) opportunities for active teacher learning; (3) effort made
to build links to teachers’ knowledge and beliefs, and to local educational policies; (4)
teacher learning over a time span beyond a “one shot” workshop; and (5) collective
participation and dialogic learning.
These alignments with TPD can be contrasted with the conclusions reached by Humphries
and Burns (2015) discussed earlier, in which failure to adopt a more communicative pedagogy
in a college in Japan was attributed to three features – teachers’ beliefs, understanding
of the new approach, and lack of ongoing support. These three problems mirror the absence
of TPD principles (3), (1), and (5). The Carless studies cited earlier similarly attribute the
failure of communicative reforms in Hong Kong to factors such as lack of teacher knowledge
and poor professional support.
Now, we return to our story of the lesson taught by Ann in Act 1. This story points to
other issues critical to the effective teaching of L2 speaking such as effective management
of first language (L1) use (Seals et al., 2020), teacher expertise (Tsui, 2012) and teacher
language proficiency (Le & Renandya, 2017). With respect to the latter, Alexander (2020,
p. 21) offers the following insight into the critical role of the teacher’s speaking skills
across the curriculum, a point which has even stronger resonance with respect to teaching
L2 speaking:
In reading and writing, the student’s skills are influenced more by the teacher’s skills
as a teacher of reading and writing, than by how well the teacher herself reads and
writes. Not so with talk. Its essentially interactive nature means…that the teachers
Another issue highlighted in Ann’s lesson is the classroom as a physical and acoustic space.
Many teachers teach in classrooms in which ambient noise and the acoustic properties of the
classroom make clear voice projection particularly challenging when it comes to teaching
speaking. Group work in such classroom spaces is an exercise in din reduction. And for
learners, as if talking in L2 is not challenging enough, in pair and group work they have the
additional stress of having to raise their voices to be heard. Even an enthusiastic teacher with
expertise in cooperative learning will find silent book work an attractive option in such
conditions. To the extent that classroom researchers interested in L2 speaking gloss over such
front-of-mind issues for L2 speaking teachers, they risk reinforcing perceptions of their own
irrelevance. Interestingly, Alexander (2020, p. 135) identifies “space” as the first of five key
framing elements in a proposed generic framework for investigating talk in the classroom
(the other four being student organization, time, curriculum, and routine, rule, and ritual).
There is hope.
This classroom space-noise issue is symptomatic of a much broader challenge for the L2
speaking teacher, and that is a traditional and widespread discounting of the value of talk –
of oracy – across the curriculum. Speaking in formal education very often plays second fiddle
to literacy, if it is allowed to play at all. Consequently, students’ experience of talk for
learning and thinking across the curriculum may be confined to slotting answers into IRF
sequences framed on either side by teacher talk. Talk – genuine dialogic, exploratory, learning
talk – may already be “foreign” before the learner experiences it in the foreign language
classroom.
We conclude on a more promising note. National education policies and curricula across
the globe have, over recent decades, sought to implement models of 21st-century skills and
learning (Fullan & Scott, 2014; Scott, 2015). As (or if) the principles of collaboration,
communication and creativity common to these models are worked out in classrooms, they
require a fundamental rethink of how talk is valued and what roles talk needs to play in
learning, a point long argued by the influential British educationalist Alexander (2020). For
contexts such as the one Ann struggles to teach in, the aspiration to teach for 21st-century
learning may be an uphill climb for years to come. And yet, it is an aspiration that offers just
the kind of educational environment in which truly communicative teaching of L2 speaking
is likely to thrive, as well as offering the L2 speaking teacher a leading role in bringing these
aspirations to life.
Further Reading
Alexander, R. (2020). A dialogic teaching companion. London: Routledge.
This overview of the neglect of talk in education in the UK makes a compelling case for the positive
impact of dialogic teaching on student engagement and learning. A framework for pedagogic teaching
and a professional development strategy for schools are provided.
King, J. (2013). Silence in the second language classrooms of Japanese universities. Applied Linguistics,
34(3), 325–343.
A nuanced discussion of the complex sociocultural and learner-internal factors accounting for
willingness and reluctance to speak in the L2 classroom.
Newton, J., & Nation, I. S. P. (2020). Teaching ESL/EFL listening and speaking (2nd edn). London:
Routledge.
This book outlines the four-strands framework, offering teachers a practical and principled basis for
planning a speaking curriculum. The strands are exemplified with a wide range of teaching techniques
and activities.
References
Alexander, R. (2020). A dialogic teaching companion. London: Routledge.
Aw, H. T. (2017). Speaking in the secondary English language classroom: Teachers’ beliefs, strategies and
use of talk. Singapore: National Institute of Education, Nanyang Technological University.
Borg, S. (2006). Teacher cognition and language education: Research and practice. London: Continuum.
Borg, S. (2010). Language teacher research engagement. Language Teaching, 43(4), 391–429.
doi: 10.1017/S0261444810000170
Bui, T. (2019). The implementation of task-based language teaching in EFL primary school classrooms: A
case study in Vietnam. Wellington: Victoria University of Wellington.
Bui, T., & Newton, J. (2021). A critical account of PPP: Insights from Vietnamese primary school EFL
classrooms. Language Teaching for Young Learners, 3(1), 93–116.
https://doi.org/10.1075/ltyl.19015.bui
Cameron, L., & Larsen-Freeman, D. (2007). Complex systems and applied linguistics. International
Journal of Applied Linguistics, 17(2), 226–239.
Carless, D. R. (2003). Factors in the implementation of task-based teaching in primary schools. System,
31(4), 485–500.
Carless, D. R. (2007). The suitability of task-based approaches for secondary schools: Perspectives from
Hong Kong. System, 35(4), 595–608. doi: 10.1016/j.system.2007.09.003
Carless, D. R. (2015). Teachers’ adaptations of TBLT: The Hong Kong story. In M. Thomas & H.
Reinders (Eds.), Contemporary task-based language teaching in Asia (pp. 366–380). London:
Bloomsbury.
Chen, Q., & Wright, C. (2017). Contextualization and authenticity in TBLT: Voices from Chinese
classrooms. Language Teaching Research, 21(4), 517–538. doi: 10.1177/1362168816639985
Deng, C., & Carless, D. R. (2009). The communicativeness of activities in a task-based innovation in
Guangdong, China. Asian Journal of English Language Teaching, 19, 113–134.
Desimone, L. M. (2009). Improving impact studies of teachers’ professional development: Toward
better conceptualizations and measures. Educational Researcher, 38(3), 181–199.
doi: 10.3102/0013189X08331140
Dörnyei, Z., & Ushioda, E. (2013). Teaching and researching: Motivation. London: Routledge.
Fullan, M., & Scott, G. (2014). Education plus, new pedagogies for deep learning. Collaborative impact
SPC, Washington, DC.
http://www.michaelfullan.ca/wp-content/uploads/2014/09/Education-Plus-A-Whitepaper-July-2014-1.pdf
Gałajda, D. (2017). Communicative behaviour of a language learner: Exploring willingness to
communicate. New York: Springer.
Humphries, S., & Burns, A. (2015). ‘In reality it’s almost impossible’: CLT-oriented curriculum change.
ELT Journal, 69(3), 239–248.
King, J. (2013). Silence in the second language classrooms of Japanese universities. Applied Linguistics,
34(3), 325–343. doi: 10.1093/applin/ams043
Le, V. C., & Renandya, W. A. (2017). Teachers’ English proficiency and classroom language use: A
conversation analysis study. RELC Journal, 48(1), 67–81. doi: 10.1177/0033688217690935
Newton, J. (2016). Teaching English for intercultural spoken communication. In W. A. Renandya & H.
P. Widodo (Eds.), English language teaching today (pp. 161–177). New York: Springer.
Newton, J., & Bui, T. (2020). Low-proficiency learners and task-based language teaching. In C. P.
Lambert & R. Oliver (Eds.), Using tasks in diverse contexts (pp. 28–40). Bristol: Multilingual
Matters.
Newton, J., & Nguyen, B. T. T. (2019). Task repetition and the public performance of speaking tasks in
EFL classes at a Vietnamese high school. Language Teaching for Young Learners, 1(1), 34–56.
doi: 10.1075/ltyl.00004.new
Nguyen, B. T. T., & Newton, J. (2019). Learner proficiency and EFL learning through task rehearsal
and performance. Language Teaching Research. doi: 10.1177/1362168818819021
Nguyen, B. T. T., Newton, J., & Crabbe, D. (2018). Teacher transformation of oral textbook tasks in
Vietnamese EFL high school classrooms. In V. Samuda, K. Van den Branden, & M. Bygate (Eds.),
TBLT as a researched pedagogy (pp. 52–70). Amsterdam: John Benjamins.
Pinter, A. (2007). Some benefits of peer–peer interaction: 10-year-old children practising with a
communication task. Language Teaching Research, 11(2), 189–207. https://doi.org/10.1177/1362168807074604
Samuda, V., Van den Branden, K., & Bygate, M. (Eds.). (2018). TBLT as a researched pedagogy.
Amsterdam: John Benjamins.
Scott, C. L. (2015). The futures of learning 2: What kind of learning for the 21st century. Education
Research and Foresight Working Papers, 3.
Seals, C. A., Newton, J., Ash, M., & Nguyen, T. B. T. (2020). Translanguaging and TBLT: Cross-overs
and challenges. In Z. Tian, L. Aghai, P. Sayer, & J. Schissel (Eds.), Envisioning TESOL through a
translanguaging lens: Global perspectives. New York: Springer.
Sinclair, J. M., & Coulthard, M. (1975). Towards an analysis of discourse: The English used by teachers
and pupils. Oxford: Oxford University Press.
Skehan, P., & Foster, P. (2005). Strategic and on-line planning: The influence of surprise information
and task time on second language performance. In R. Ellis (Ed.), Planning and task performance in a
second language (pp. 193–216). Amsterdam: John Benjamins Publishing Company.
Skehan, P., & Foster, P. (2016). Task type and task processing conditions as influences on foreign
language performance. Language Teaching Research, 1(3), 185–211.
doi: 10.1177/136216889700100302
Tran, T. P. T. (2020). Intercultural language teaching in Vietnamese tertiary EFL classes: A participatory
action research study. Wellington: Victoria University of Wellington.
Tsui, A. B. M. (2012). The dialectics of theory and practice in teacher knowledge development. In J.
Hüttner, B. Mehlmauer-Larcher, S. Reichl, & B. Schiftner (Eds.), Theory and practice in EFL teacher
education: Bridging the gap (pp. 16–37). Bristol: Multilingual Matters.
Wingate, U. (2018). Lots of games and little challenge – a snapshot of modern foreign language
teaching in English secondary schools. The Language Learning Journal, 46(4), 442–455.
doi: 10.1080/09571736.2016.1161061
23
ORAL LANGUAGE DEVELOPMENT
IN IMMERSION AND DUAL
LANGUAGE CLASSROOMS
Roy Lyster and Diane J. Tedick
1 Introduction/Definitions
School-based additive bilingual programmes that teach an additional language through
subject-matter instruction permeate a wide range of international contexts and instructional
settings. Such programmes come in many shapes and sizes with names including immersion
and dual language (ImDL) education, content and language integrated learning (CLIL),
content-based instruction (CBI), and content-based language teaching (CBLT). They share
an instructional approach in which non-linguistic curricular content such as geography,
history, or science is taught to students through the medium of a language they are learning
as an additional language. One of the most attractive features of these programmes is the
increased exposure to and engagement with the target language via subject-matter instruction,
which provides a motivational basis for purposeful communication and a cognitive
basis for language learning. Often overlooked, however, is the research evidence demonstrating
that, for these programmes to be effective, they need to integrate a systematic focus
on the target language that encourages shifts in learners’ attentional focus between language
and content.
Given the diverse range of such programmes and the wide variety of research associated
with each, this chapter focuses specifically on immersion and dual language (ImDL) programmes
in Canada and the United States, first identifying and explaining some limitations
in terms of students’ oral production abilities and then proposing instructional solutions for
enhancing students’ oral language development through opportunities for focused and
contextualized practice.
More specifically, still in comparison to native speakers of French of the same age, Harley
et al. found that French immersion students performed similarly on measures of discourse
competence, operationalized as the ability to process language coherently and cohesively. An
example of coherence in discourse is the accurate use of pronouns to refer to characters,
objects, and locations when telling a story. Examples of discourse cohesiveness include the
accurate use of conjunctions and adverbs to make logical connections – such as temporal
sequencing and cause-effect relationships – between clauses and sentences.
In contrast to their native-like levels of discourse competence, French immersion students
were much less proficient on most grammar aspects, which included verb and preposition
usage, and also fell short of native speakers on measures of sociolinguistic competence, which
is the ability to vary one’s language according to social context. Harley (1994) summed up
French immersion students’ oral production as containing phonologically salient items that
are easy to acquire from the stream of speech, along with high-frequency lexical items and
syntactic patterns similar to those of English. Missing from their oral production are less
salient morphosyntactic features that differ from English or are not crucial for getting one’s
meaning across. For example, French immersion students underuse conditional verb forms
to express hypothetical meaning or uncertainty, instead using lexical items such as peut-être
(maybe). These are useful communication strategies but do not lead students towards higher
levels of academic language production.
Like their Canadian counterparts, majority-language students in one-way immersion
programmes in the United States produce spoken and written language lacking grammatical
accuracy. Moreover, their language is typically not sociolinguistically appropriate (see
Tedick & Wesely, 2015, for a review), and their vocabulary can be underdeveloped (e.g.,
Fortune & Tedick, 2015). Most US studies on the language development of one-way immersion
students have been done in Spanish or French programmes, and increasingly
Mandarin Chinese. Nevertheless, teachers representing programmes in a wide range of
languages, including Indigenous languages, share anecdotally that immersion students’
minority-language development is far from optimal.
Research on two-way immersion programmes has shown that English L1 students continue
to perform better in English than in Spanish, while Spanish L1 students tend to develop
more balanced oral and written proficiencies in both languages (e.g., Lindholm-Leary, 2001).
However, some Spanish L1 students become dominant in English (Fortune, 2001) and develop
certain grammatical inaccuracies in Spanish, their home language (e.g., Potowski,
2007; Tedick & Young, 2016). Most of the research has been done in Spanish/English
programmes, but anecdotal evidence indicates that L1 speakers of other minority languages
in two-way immersion, such as Hmong, also develop some grammatical inaccuracies in their
home language.
Findings related to the shortcomings in ImDL students’ production abilities led to descriptive
research aimed at better understanding them. Early classroom observations revealed
that subject-matter instruction did not necessarily invite much oral production by
students. For example, Swain (1988) reported findings from an observational study in which
only 14% of the turns produced by Grade 6 French immersion students were more than
one clause in length. She argued that exposure to input, but with minimal opportunities for
production, engages comprehension strategies enabling students to comprehend content by
drawing on pragmatic and situational cues, real-world knowledge, and inference, without
processing structural elements in the language. Bypassing language structure in this way,
however, is harder to do when producing the language. Swain thus argued in favour of more
opportunities for student output and the provision of feedback that would push students to
express themselves more precisely and appropriately. This came to be known as “pushed
output.” This line of research also revealed that, contrary to expectations, subject-matter
instruction exposes students to a limited range of language forms and functions (e.g., Swain,
1988); in addition, a tendency for ImDL teachers to keep language instruction and content
instruction separate was also observed (Allen et al., 1990). Overall, this research suggested
that, for students to improve their production abilities, they need to engage with the target
language in its full functional range and teachers must engage with a range of instructional
practices considered effective for integrating content and language.
in all four modalities – listening, speaking, reading, and writing – in unrehearsed “real-
world” situations. Levels range from Novice to Distinguished, with High, Mid, and Low
sublevels for the Novice, Intermediate, and Advanced ranges (ACTFL, 2012). This scale is
similar to the Common European Framework of Reference for Languages (CEFR), developed
by the Council of Europe (2001). It includes levels ranging from A1 (Novice) to C2
(Distinguished). These are broad-brush scales in that they describe proficiency generally,
with little attention to details.
Drawing on the ACTFL Guidelines, the Center for Applied Linguistics (CAL) developed
a rating scale for use with their oral proficiency assessment tools for young learners (Grades
K–8): the Student Oral Proficiency Assessment (SOPA), used with students learning additional
languages, and the CAL Oral Proficiency Exam (COPE), designed for ImDL students
in Grades 5–8 (Thompson et al., 2006). These tools utilize the same rating scale, which has
nine levels (from Junior Novice-Low to Junior Advanced-High) across the domains of oral
fluency, grammar, vocabulary, and listening comprehension. Another tool is the
Standards-Based Measurement of Proficiency (STAMP) (Avant Assessment, 2015). The web-based,
computer adaptive STAMP assesses proficiency in all four skill areas and uses a scale aligned
with the ACTFL levels.
Two large-scale studies were conducted using STAMP, each assessing over 1,000 ImDL
students (most enrolled in 50:50 programmes). Burkhauser et al. (2016) reported that Grade 8
ImDL students performed in the Intermediate-Mid range for speaking (Chinese and Spanish)
and between Intermediate-Low and -Mid in Japanese (between A2 and B1 according to the
CEFR [ACTFL, n.d.]). The Center for Applied Second Language Studies (CASLS, 2013)
found that 41% performed at the Intermediate-Low level in speaking by Grade 6 and 97% by
Grade 12, with 3% rated in the Intermediate-High range. The study included students in
Chinese, French, Japanese, and Spanish programmes, but results were not disaggregated by
language. Also relying on the STAMP, Fortune and Song (2016) reported that the speaking
proficiency of 70% of early, total one-way Mandarin immersion Grade 5 students (n = 80)
was in the Intermediate-Low range, pointing to the superiority of the early, total programme
over the 50:50 model.
Fortune and Tedick (2015) used CAL’s SOPA and COPE assessments in a cross-sectional
study of the oral language of early, total Spanish immersion students (n = 218) across Grades
K, 2, 5, and 8. Findings showed statistically significant differences between students in
Grades K, 2, and 5 across all domains, with Grade 5 students’ median proficiency score rated
at the Junior Advanced-Low level in oral fluency, grammar, and vocabulary. However, a
plateau effect emerged. Grade 8 students’ performance was not significantly better than either
Grade 5 or Grade 2 students’ in the speaking domains. Fortune and Tedick speculated
that this plateau may have occurred because Grade 8 students received only 25% of instruction
in Spanish, whereas Grade 5 students received about 70%. They also questioned
whether the broad-brush rating scale was nuanced enough to detect differences in higher
levels of proficiency.
Following a similar cross-sectional design and using the same tools developed by CAL,
Fortune and Ju (2017) assessed the oral proficiency of early, total Mandarin immersion
students in Grades K, 2, and 5. They found significant differences in all domains between
Grades K and 2 but none between Grades 2 and 5. Both Grade 2 and 5 students were rated at
the Junior Intermediate-Mid level in the speaking domains. Also concerned about the inability
of the rating scale to detect differences, Fortune and Ju conducted a fine-grained,
follow-up linguistic complexity analysis of three representative speech samples (one from
each grade). They found steadily increasing grammatical complexity from one grade to the
next (K, 2, and 5). Lexical complexity, however, increased between Grades K and 2, but not
between Grades 2 and 5. This finding suggests the need for more robust vocabulary development
in ImDL classrooms, also recommended by Fortune and Tedick (2015) based on
their study results.
Xu et al. (2015) used the STAMP to examine proficiency in two-way Mandarin immersion
students. About 71% of Grade 5 students met or exceeded the Intermediate-Low level in
speaking. Interestingly, the heritage learners (i.e., L1 speakers of the minority language) and
English-speaking Mandarin L2 learners scored about the same. This perhaps was because the
heritage learners were defined based on the presence of Mandarin in the home, rather than
on the requirement that Mandarin be their L1. Potowski (2007), in contrast, defined heritage
and non-heritage learners based on students’ L1s and found stark differences in proficiency
between the two groups at Grade 8, with heritage (Spanish L1) speakers being assessed
significantly higher in oral language proficiency than English L1 speakers.
In summary, English L1 students in ImDL programmes tend to perform orally in the
Intermediate-Low to Advanced-Low ranges, and students in early, total (and 90:10) programmes
achieve higher ratings in oral proficiency than students in 50:50 programmes.
Students in Spanish early, total programmes have higher levels of oral proficiency than
students in Mandarin early, total programmes. Moreover, heritage learners in two-way
programmes tend to outperform English L1 speakers and to develop more balanced bilingualism,
especially in Spanish/English programmes. Broad-brush measures may not be
nuanced enough to detect growth within the higher Intermediate to lower Advanced ranges
where students’ oral language seemingly plateaus. In addition, oral language assessments
should be developed specifically for immersion contexts to gain access to students’ ability to
use complex language indicative of academic literacy.
Other research on classroom interaction has investigated students’ varied use of both
instructional languages in ImDL classrooms. This work has emerged against the backdrop of
translanguaging theory, which argues that an individual’s entire linguistic repertoire functions
as one integrated system (e.g., García, 2009; Otheguy et al., 2015). In practice, translanguaging
entails the use of two or more languages to make meaning, form experiences, and
cultivate knowledge and understandings (García, 2009). As translanguaging advocates increasingly
promote the use of translanguaging pedagogies, ImDL studies have followed,
particularly in two-way contexts. Several have reported that students move in and out of
different roles as they engage with each other and use both languages, often concurrently,
which, researchers have argued, affords opportunities for metalinguistic analysis, the mediation
of understanding, and other learning (e.g., García, 2011; Hamman, 2018). At the same
time, some of this research has revealed linguistic imbalances. For example, Hamman (2018)
observed far more instances of students using English during instructional time in Spanish
than vice versa. She noted:
Despite the ubiquity of and overwhelming support for translanguaging in the literature, a
growing number of scholars have questioned whether translanguaging pedagogies and
practices in ImDL classrooms are warranted, in particular when it comes to development of
the minority language in contexts where English is the societal majority language (e.g.,
Ballinger et al., 2017; Fortune & Tedick, 2019; Lyster, 2019a; Tedick & Lyster, 2020).
Indeed, minority language development in such ImDL settings continues to be of great
concern, as evidenced in the next part.
Quasi-Experimental Research
In response to the observed shortcomings in ImDL students’ oral language development
along with the observations of the limitations in ImDL classroom discourse, Harley and
Swain (1984) proposed, nearly 40 years ago, a twofold instructional sequence to improve
students’ proficiency in the target language:
1. …more focused L2 input which provides the learners with ample opportunity to observe
the formal and semantic contrasts involved in the relevant target subsystem (this does
not necessarily involve explicit grammar teaching); and
2. …increased opportunity for students to be involved in activities requiring the productive
use of such forms in meaningful situations. (p. 310)
This proposal laid the groundwork for a series of quasi-experimental studies conducted in
French immersion classrooms to enhance students’ awareness of target features while providing
opportunities for their productive use in meaningful contexts with a content or thematic
focus (Day & Shapson, 1991; Harley, 1989, 1998; Lyster, 1994, 2004a; Wright, 1996).
Implemented in Grade 2–8 classrooms, the instructional treatments were designed in accordance
with the instructed second language acquisition (SLA) construct of form-focused
instruction (FFI; Spada, 1997), which is designed to draw learners’ attention to target features
“as they are experiencing a communicative need” (Loewen, 2011, p. 582) and thus
differs considerably from decontextualized language instruction. Taken together, the results
of these studies showed that, in more than 75% of the 40 tests given either as immediate or
delayed posttests to assess both knowledge and productive use of the target features, students
participating in the FFI improved more than students left to their own devices to “pick up”
the target forms from the regular curriculum (Lyster, 2016).
Zooming in specifically on the results of the oral production measures, however, we find
that the oral outcomes varied across these studies. Specifically, the instructional treatment
targeting the functional distinctions between perfect and imperfect past tenses (i.e., passé
composé vs. imparfait) in Harley’s (1989) study yielded significant short-term improvement
on a cloze test and written production task, but no significant improvement in oral production
either in the short or long term. Instruction on the conditional mood in Day and
Shapson’s (1991) study yielded short- and long-term significant improvement in written
production, but none in oral production. In contrast, Wright’s (1996) study targeting verbs
of motion and the studies targeting second-person pronouns and grammatical gender by
Lyster (1994, 2004a) all showed long-term significant improvement in oral production. At the
same time, Harley’s (1998) study targeting grammatical gender showed significant improvement
in oral production on a picture description task but none on an oral task eliciting
only unfamiliar nouns.
The different outcomes in oral production across these studies can be attributed to the
differential emphases in their instructional treatments (Lyster, 2004b). To explain why, we refer
to skill acquisition theory and its distinction between declarative and procedural knowledge
(e.g., DeKeyser, 2007). Proponents of skill acquisition theory propose that L2 development
entails a gradual transition from effortful use that relies on declarative
knowledge (knowing about) to more automatic use of the target language that relies on
procedural knowledge (knowing how), brought about through practice and feedback in
meaningful contexts. In this view, effective instruction needs to target both types of
knowledge: (a) through noticing and awareness activities designed to increase the saliency
and frequency of the forms and functions of target features to facilitate their intake in
declarative form; and (b) through opportunities for production practice that allow students
to proceduralize more target-like representations in contextualized and meaningful
ways.
The two studies with no long-term effects on oral production arguably overemphasized
production activities at the expense of activities promoting noticing and metalinguistic
awareness. For example, the main thematic activities in Harley (1989) and Day and Shapson
(1991) – the creation of childhood albums and the design of futuristic space colonies, respectively –
succeeded in engaging students in meaningful interaction and motivating content,
but may not have drawn their attention to linguistic accuracy any more than is typically
the case and, as noted in both studies, fell short of pushing students to actually use the target
forms in oral production.
In contrast, of considerable importance in the treatments targeting verbs of motion,
second-person pronouns, and gender were the awareness tasks that first helped students to
consciously notice the target features through typographical enhancement and increased
frequency, and then helped them to develop analyzable representations of the target features
through a range of consciousness-raising tasks. In addition, the production activities in these
four studies were limited to role-plays, games, riddles, rhymes, and songs, giving more emphasis
to guided practice than to autonomous practice and thereby clearing the way for
teachers to provide corrective feedback more strategically.
These are important observations because they suggest that oral abilities, at least with
respect to difficult target features and recalcitrant interlanguage forms, do not develop
only from speaking but are further enhanced by opportunities to develop metalinguistic
awareness. This underscores the importance of preceding practice activities with rich
input-driven tasks that provide students with useful models while drawing their attention
to the target language patterns they will need for successful completion of the production
tasks.
Scaffolding
Scaffolding was initially invoked as a means to characterize parent–child interaction and was
qualified as that which “enables a child or novice to solve a problem, carry out a task or
achieve a goal which would be beyond his unassisted efforts” (Wood et al., 1976, p. 90). The
notion of scaffolding has since been aptly applied to teacher–student interaction and is
considered to encapsulate effective teaching. Scaffolding provides ImDL teachers with the
means to structure classroom discourse in ways that make oral interaction a key source of
learning. Scaffolding is what makes ImDL work, because it enables students to engage with
content in a language they know only partially as they draw on the contextual clues provided
in the scaffolding while also drawing on prior knowledge. Scaffolding students’ oral pro-
duction is necessary for assisting them in producing a language they are still learning or
helping them to produce language that is more academically sophisticated, complex, and
precise. At the same time, as students engage with subject-matter content, teacher support in
the form of scaffolding for comprehension is equally important. In this sense, scaffolding for
comprehension and scaffolding for production can be seen as interrelated in actual classroom
practice.
Drawing on earlier work by Echevarría et al. (2008), Tedick and Lyster (2020) describe
three types of scaffolding: verbal, procedural, and instructional. Verbal scaffolding for
comprehension involves linguistic redundancy whereby teachers express the same message in a
variety of ways. Verbal scaffolding
for production aims to facilitate student language use during classroom interaction through
questioning techniques and follow-up moves. Procedural scaffolding encompasses activity
frames and routines that teachers use to facilitate comprehension or to create multiple op-
portunities for students to use the language independently. Instructional scaffolding refers to
various tools or print and multimedia resources embedded in instructional activities to
promote comprehension and support production. Table 23.1 provides examples of these
three types.
As a metaphor borrowed from the construction industry, scaffolding has often been
considered a temporary support (e.g., Cazden, 1983). However, in ImDL classrooms, teacher
scaffolding is a key instructional strategy throughout the entire programme from beginning
to end. Although the nature of the scaffolding changes as students progress and become
more autonomous, the need to provide support for student learning is not any less apparent
in higher grades where academic language and content become increasingly more complex
(Tedick & Lyster, 2020).
Oral Language Development
Roy Lyster and Diane J. Tedick
Figure 23.1 Variable emphases on content and language in the CAPA model (Tedick & Lyster,
2020, p. 112). The figure shows four phases in sequence: Contextualization (focus on content),
Awareness and Practice (predominant focus on language), and Autonomy (focus on content).
in the foreground as students discussed the main points surrounding Cartier’s voyages. Next,
during the awareness phase, the text of the video’s narration was projected, with instances of
the passé composé in bold. Students were led to identify the tense of the highlighted verbs, to
notice the two different auxiliaries, and to make a list for future reference classifying verbs
according to auxiliary.
Then, during the practice phase, each student received one of five images illustrating an
important event or place related to Cartier and wrote a description using verbs in the passé
composé. They then mingled with other students to find those with the same image, and
together in small groups they synthesized their descriptions to create an historical account,
which they then conveyed orally to the whole class, thus giving the teacher an opportunity to
provide corrective feedback as necessary. Finally, in the autonomy phase, students produced
an illustrated timeline in small groups depicting some of the landmark events in Cartier’s
career, including a legend for each event using the passé composé. As each group presented its
timeline to the class, the teacher had the opportunity to provide feedback on both language
and content.
The CAPA model thus draws on previous ImDL classroom intervention studies by in-
corporating noticing and awareness activities that increase the frequency and saliency of
target features, as well as production practice activities that promote the proceduralization of
the target features. In addition, the CAPA model intertwines language and content objectives
by making autonomous use of the target language the ultimate goal while ensuring that
focused consciousness-raising tasks serve to consolidate students’ metalinguistic awareness in
ways that pave the way for their autonomous use of the target language.
6 Future Directions
It has long been established that ImDL programmes hold much promise for student language
acquisition. Nonetheless, the research reviewed in this chapter reveals that they are falling short
of their potential with respect to oral language development. It is difficult to push students’
language proficiency beyond intermediate levels, and significant shortcomings in grammatical
accuracy, lexical specificity and variety, and sociolinguistic appropriateness persist. Despite early
observations and recommendations regarding the need for ImDL teachers to increase oppor-
tunities for students to produce output (Swain, 1988; Swain & Lapkin, 1986), many teachers
today find themselves still relying on the provision of comprehensible input and providing too
few opportunities for students to engage with the language. When they do provide such op-
portunities, they tend to focus on content learning without systematically offering corrective
feedback to improve students’ language. Thus, recommendations that were proposed nearly four
decades ago remain relevant today. Teachers need to orchestrate classroom interaction and
learning activities to maximize student output. They must have high expectations, scaffold
learning so students can meet those expectations, and provide age-appropriate corrective feed-
back both strategically and selectively to push students’ interlanguage to the next level. More
research on classroom interaction and on interventions to improve students’ oral language de-
velopment is clearly needed.
Given that ImDL teachers are typically prepared as content teachers and rarely have
immersion-specific preparation, they often lack knowledge of the target language themselves
in addition to knowledge of pedagogical approaches and strategies to shift learners’ attention
between content (meaning) and language (form). They need resources along the lines of
Lyster’s (2016) book for French immersion teachers as well as systematic and sustained
professional development opportunities to learn about language and pedagogical approaches
to embed a focus on language in the context of their content instruction, such as the CAPA
model. The field also needs more research on the types of professional development ex-
periences that lead teachers to transform their practices and ultimately improve student
language development.
ImDL programmes need also to make students’ continued development in the minority
language a primary goal rather than limiting assessments of programme effectiveness to students’
academic achievement and English language outcomes. The development of high-quality
immersion-specific oral language assessments and more refined rating scales to capture differ-
ences in oral language development, particularly at High-Intermediate to Pre-Advanced stages,
would greatly benefit the field (Fortune & Tedick, 2015; Fortune & Ju, 2017).
Although translanguaging theory, practice, and pedagogy have caught the attention of
many researchers and teachers who work in ImDL and other bilingual settings, some
translanguaging practices may not be appropriate in certain contexts, especially those in
which English is the majority language (e.g., Lyster, 2019a; Tedick & Lyster, 2020). For
example, in its aim to “use the entire linguistic repertoire of bilingual students” (García,
2013, p. 2), translanguaging practice may be detrimental to the development of the
minority language if English is used to process and engage with increasingly complex
subject matter. Some studies in two-way classrooms have indeed shown that when stu-
dents are free to engage in translanguaging practices in the classroom, they default to
English (e.g., Hamman, 2018). Sustained use of the minority language is arguably more
beneficial for pushing its development forward than recourse to English – given appro-
priate instruction and sufficient scaffolding to sustain use of the minority language.
Translanguaging practices, however, that incorporate cross-linguistic pedagogy as a
means “to teach for two-way cross-lingual transfer” (Cummins, 2007, p. 11) have much
potential to foster biliteracy development by improving students’ morphological aware-
ness (Lyster et al., 2013) and increasing their motivation to read in both languages (Lyster
et al., 2009). More evidence-based research on translanguaging pedagogies and their ef-
fects on minority language development in a range of contexts is needed before endorsing
across-the-board implementation.
Nearly 60 years have passed since the first immersion programmes were established (one-way
in St. Lambert, Montreal and two-way in Miami-Dade County, Florida), with more launched
each year in countries around the world. This form of bilingual education will continue to evolve
and improve as it incorporates relevant research findings about effective instructional practices
that integrate language and content and develops sustained professional development oppor-
tunities for teachers to respond to students’ oral language development needs.
Further Reading
Lyster, R. (2016). Vers une approche intégrée en immersion [Towards an integrated approach in im-
mersion]. Montreal: Les Éditions CEC.
Lyster, R. (2019). Making research on instructed SLA relevant for teachers through professional de-
velopment. Language Teaching Research, 23(4), 494–513.
Tedick, D. J., & Björklund, S. (Eds.) (2014). Language immersion education: A research agenda for
2015 and beyond [Special issue]. Journal of Immersion and Content-Based Language Education, 2(2).
Tedick, D. J., & Lyster, R. (2020). Scaffolding language development in immersion and dual language
classrooms. New York: Routledge.
References
American Council on the Teaching of Foreign Languages (ACTFL). (2012). ACTFL Proficiency
Guidelines 2012. Alexandria, VA: ACTFL. Retrieved from https://www.actfl.org/publications/
guidelines-and-manuals/actfl-proficiency-guidelines-2012.
ACTFL. (n.d.). Assigning CEFR ratings to ACTFL assessments. Retrieved from https://www.actfl.org/
publications/additional-resources/assigning-cefr-ratings-actfl-assessments
Allen, P., Swain, M., Harley, B., & Cummins, J. (1990). Aspects of classroom treatment: Toward a more
comprehensive view of second language education. In B. Harley, P. Allen, J. Cummins, & M. Swain
(Eds.), The development of second language proficiency (pp. 57–81). Cambridge, UK: Cambridge
University Press.
Arshad & Lyster, R. (2021). Professional development in action: Teachers’ experiences in learning to
bridge language and content. In K. Talbot, S. Mercer, M.-T. Gruber, & R. Nishida (Eds.), The
psychological experience of integrating language and content (pp. 232–249). Bristol, UK: Multilingual
Matters.
Avant Assessment. (2015). Standards-Based Measurement of Proficiency (STAMP). Eugene, OR:
Author. Retrieved from https://avantassessment.com/.
Ballinger, S., & Lyster, R. (2011). Student and teacher language use in a two-way Spanish/English
immersion school. Language Teaching Research, 15, 289–306. doi: 10.1177/1362168811401151
Ballinger, S., Lyster, R., Sterzuk, A., & Genesee, F. (2017). Context-appropriate crosslinguistic
pedagogy: Considering the role of language status in immersion education. Journal of Immersion and
Content-Based Language Education, 5(1), 30–57.
Broner, M. (2001). Impact of interlocutor and task on first and second language use in a Spanish im-
mersion program (CARLA Working Paper #18). Minneapolis: University of Minnesota, Center for
Advanced Research on Language Acquisition. Retrieved from http://www.carla.umn.edu/resources/
working-papers/documents/ImpactOfInterlocutorTaskOn1st2ndLanguage.pdf.
Burkhauser, S., Steele, J. L., Li, J., Slater, R. O., Bacon, M., & Miller, T. (2016). Partner-language
learning trajectories in dual-language immersion: Evidence from an urban district. Foreign Language
Annals, 49(3), 415–433.
Cazden, C. B. (1983). Adult assistance to language development: Scaffolds, models, and direct in-
struction. In R. P. Parker & F. A. Davis (Eds.), Developing literacy: Young children’s use of language
(pp. 3–17). Newark, DE: International Reading Association.
Center for Applied Second Language Studies (CASLS). (2013). What levels of proficiency do
immersion students achieve? Eugene, OR: Center for Applied Second Language Studies. Retrieved
from https://casls.uoregon.edu/wp-content/uploads/pdfs/tenquestions/TBQImmersionStudentProficiencyRevised.pdf.
Council of Europe. (2001). Common European Framework of Reference for Languages: Learning,
teaching, assessment (CEFR). Strasbourg: Author. Retrieved from www.coe.int/lang-CEFR.
Cummins, J. (2007). Rethinking monolingual instructional strategies in multilingual classrooms. Canadian
Journal of Applied Linguistics, 10, 221–241.
Day, E., & Shapson, S. (1991). Integrating formal and functional approaches to language teaching in
French immersion: An experimental study. Language Learning, 41, 25–58.
DeKeyser, R. (1998). Beyond focus on form: Cognitive perspectives on learning and practicing second
language grammar. In C. Doughty & J. Williams (Eds.), Focus on form in classroom second language
acquisition (pp. 42–63). Cambridge, UK: Cambridge University Press.
DeKeyser, R. (Ed.). (2007). Practice in a second language: Perspectives from applied linguistics and
cognitive psychology. Cambridge, UK: Cambridge University Press.
Echevarría, J., Vogt, M., & Short, D. J. (2008). Making content comprehensible for English learners: The
SIOP® Model (3rd edn). Boston, MA: Pearson Education.
Escamilla, K., Hopewell, S., Butvilofsky, S., Sparrow, W., Soltero-González, L., Ruiz-Figueroa, O., &
Escamilla, M. (2014). Biliteracy from the start: Literacy Squared in action. Philadelphia: Caslon
Publishing.
Fortune, T. W. (2001). Understanding immersion students’ oral language use as a mediator of social
interaction in the classroom. Unpublished doctoral dissertation, University of Minnesota,
Minneapolis, MN.
Fortune, T. W., & Ju, Z. (2017). Assessing and exploring the oral proficiency of young Mandarin
immersion learners. Annual Review of Applied Linguistics, 37, 264–287. doi: 10.1017/S0267190517000150
Fortune, T. W., & Song, W. (2016). Academic achievement and language proficiency in early total
Mandarin immersion education. Journal of Immersion and Content-Based Language Education, 4(2),
168–197.
Fortune, T. W., & Tedick, D. J. (2015). Oral proficiency development of English Proficient K–8 Spanish
immersion students. Modern Language Journal, 99(4), 637–655. doi: 10.1111/modl.12275
Fortune, T. W., & Tedick, D. J. (2019). Context matters: Translanguaging and language immersion
education in the U.S. and Canada. In M. Haneda & H. Nassaji (Eds.), Perspectives on language as
action: Festschrift in honor of Merrill Swain (pp. 27–44). Bristol, UK: Multilingual Matters.
García, O. (2009). Bilingual education in the 21st century: A global perspective. Malden, MA:
Wiley-Blackwell.
García, O. (with Makar, C., Starcevic, M., & Terry, A.) (2011). Translanguaging of Latino kinder-
garteners. In K. Potowski & J. Rothman (Eds.), Bilingual youth: Spanish in English-speaking so-
cieties (pp. 33–55). Amsterdam: John Benjamins.
García, O. (2013). Theorizing translanguaging for educators. In C. Celic & K. Seltzer (Eds.),
Translanguaging: A CUNY-NYSIEB guide for educators (pp. 1–6). New York, NY: CUNY-
NYSIEB.
Hamman, L. (2018). Translanguaging and positioning in two-way dual language classrooms: A case for
criticality. Language and Education, 32(1), 21–42. doi: 10.1080/09500782.2017.1384006
Harley, B. (1989). Functional grammar in French immersion: A classroom experiment. Applied
Linguistics, 10, 331–359.
Harley, B. (1994). Appealing to consciousness in the L2 classroom. AILA Review, 11, 57–68.
Harley, B. (1998). The role of form-focused tasks in promoting child L2 acquisition. In C. Doughty & J.
Williams (Eds.), Focus on form in classroom second language acquisition (pp. 156–174). Cambridge,
UK: Cambridge University Press.
Harley, B., Cummins, J., Swain, M., & Allen, P. (1990). The nature of language proficiency. In B.
Harley, P. Allen, J. Cummins & M. Swain (Eds.), The development of second language proficiency
(pp. 7–25). Cambridge, UK: Cambridge University Press.
Harley, B., & Swain, M. (1984). The interlanguage of immersion students and its implications for
second language teaching. In A. Davies, C. Criper & A. Howatt (Eds.), Interlanguage (pp. 291–311).
Edinburgh: Edinburgh University Press.
Hernández, A. M. (2015). Language status in two-way bilingual immersion: The dynamics between
English and Spanish in peer interaction. Journal of Immersion and Content-Based Language
Education, 3(1), 102–126.
Lindholm-Leary, K. J. (2001). Dual language education. Clevedon, UK: Multilingual Matters.
Loewen, S. (2011). Focus on form. In E. Hinkel (Ed.), Handbook of research in second language teaching
and learning (Vol. 2, pp. 576–592). New York: Routledge.
Lyster, R. (1994). The effect of functional-analytic teaching on aspects of French immersion students’
sociolinguistic competence. Applied Linguistics, 15, 263–287.
Lyster, R. (2004a). Differential effects of prompts and recasts in form-focused instruction. Studies in
Second Language Acquisition, 26, 399–432.
Lyster, R. (2004b). Research on form-focused instruction in immersion classrooms: Implications for
theory and practice. Journal of French Language Studies, 14, 321–341.
Lyster, R. (2007). Learning and teaching languages through content: A counterbalanced approach.
Amsterdam: John Benjamins.
Lyster, R. (2016). Vers une approche intégrée en immersion [Towards an integrated approach in im-
mersion]. Montreal: Les Éditions CEC.
Lyster, R. (2018). Content-based language teaching [The Routledge E-Modules on Contemporary
Language Teaching edited by B. VanPatten & G. Keating.] New York: Routledge.
Lyster, R. (2019a). Translanguaging in immersion: Cognitive support or social prestige? The Canadian
Modern Language Review, 75(4), 340–352.
Lyster, R. (2019b). Making research on instructed SLA relevant for teachers through professional
development. Language Teaching Research, 23(4), 494–513.
Lyster, R., Collins, L., & Ballinger, S. (2009). Linking languages through a bilingual read-aloud project.
Language Awareness, 18(3–4), 366–383.
Lyster, R., Quiroga, J., & Ballinger, S. (2013). The effects of biliteracy instruction on morphological
awareness. Journal of Immersion and Content-Based Language Education, 1(2), 169–197.
Lyster, R., & Sato, M. (2013). Skill Acquisition Theory and the role of practice in L2 development. In
P. García Mayo, M. Gutierrez-Mangado, & M. Martínez Adrián (Eds.), Contemporary approaches
to second language acquisition (pp. 71–92). Amsterdam: John Benjamins.
Otheguy, R., García, O., & Reid, W. (2015). Clarifying translanguaging and deconstructing named
languages: A perspective from linguistics. Applied Linguistics Review, 6(3), 281–307.
Potowski, K. (2007). Language and identity in a dual immersion school. Clevedon, UK: Multilingual
Matters.
Spada, N. (1997). Form-focused instruction and second language acquisition: A review of classroom
and laboratory research. Language Teaching, 29, 73–87.
Swain, M. (1988). Manipulating and complementing content teaching to maximize second language
learning. TESL Canada Journal, 6, 68–83.
Swain, M., & Lapkin, S. (1986). Immersion French in secondary schools: ‘The goods’ and ‘the bads’.
Contact, 5(3), 2–9.
Tarone, E., & Swain, M. (1995). A sociolinguistic perspective on second language use in immersion
classrooms. Modern Language Journal, 79, 166–178. doi: 10.1111/j.1540-4781.1995.tb05428.x
Tedick, D. J., & Lyster, R. (2020). Scaffolding language development in immersion and dual language
classrooms. New York: Routledge.
Tedick, D. J., & Wesely, P. (2015). A review of research on content-based foreign/second language
education in US K-12 contexts. Language, Culture and Curriculum, 28(1), 25–40.
Tedick, D. J., & Young, A. I. (2016). Fifth grade two-way immersion students’ responses to form-
focused instruction. Applied Linguistics, 37(6), 784–807.
Thompson, L. E., Boyson, B. A., & Rhodes, N. C. (2006). Administrator’s manual for CAL foreign
language assessments, grades K–8. Washington, DC: Center for Applied Linguistics.
Wood, D., Bruner, J., & Ross, G. (1976). The role of tutoring in problem solving. Journal of Child
Psychology and Psychiatry, 17, 89–100.
Wright, R. (1996). A study of the acquisition of verbs of motion by grade 4/5 early French immersion
students. The Canadian Modern Language Review, 53, 257–280.
Xu, X., Padilla, A. M., & Silva, D. M. (2015). Learner performance in Mandarin immersion and high
school world language programs: A comparison. Foreign Language Annals, 48, 26–38.
doi: 10.1111/flan.12123
24
SPEAKING AND ENGLISH AS A
LINGUA FRANCA
Enric Llurda
1 Introduction/Definitions
Second language acquisition (SLA) research has made enormous progress since its inception
more than 50 years ago. Yet, despite Sridhar and Sridhar’s (1986) early urgings to re-
searchers to look beyond native speaker models and avoid excessive reliance on native-
speaker environments, mainstream SLA studies have often focused on the native speaker as
the baseline model on which second language performance and achievement are measured.
Some authors have questioned the use of monolingual native models in SLA research by
claiming that the outcome of second language (L2) learning can never be identical to
monolingual first language (L1) competence (Cook, 1999; Grosjean, 1989), but it was the
appearance of the World Englishes paradigm (Kachru, 1985) and later the surge of interest in
English as a Lingua Franca (ELF) that contributed most to the displacement of the target
from the idealized native-like speaker to a real-life bi/multilingual L2 user (Cook, 1999).
Such displacement is necessary to avoid falling into a deficit perspective of SLA in which all
learners are doomed to fail.
The teaching of L2 speaking entails a range of techniques that make it a complex task,
which is often avoided by teachers who feel more comfortable dealing with grammar and
vocabulary, and the rules of writing. Yet, speaking is clearly different from writing and,
therefore, classroom activities should address its specific requirements. According to Burns
(2016), speaking involves a series of “core skills,” namely, pronunciation, speech function,
interaction management, and discourse organization skills. Therefore, when we look at re-
search on speaking and ELF, we should focus our attention on studies dealing with the
pronunciation and pragmatics of ELF.
In the specific area of pronunciation, Derwing and Munro have established the in-
dependence of intelligibility from accent (Derwing & Munro, 2015; Munro & Derwing, 1995),
and the limited role of accent in communicative effectiveness (Derwing & Munro, 2009), with
intelligibility being allocated the primary role in the achievement of success in oral interaction.
Accent is certainly an important aspect of L2 pronunciation, as a marker of an L2 speaker’s
“identification and affiliation with the target language” (Moyer, 2014, p. 11). The extent to
which speakers are willing to abandon their previous identities and embrace new ones affects
the extent to which they adopt the pronunciation of a particular community of speakers. But
what happens when the L2 is not identified with a specific community, and the speaker does
not have a socially motivated pull to adapt their way of speaking to a particular model that
characterizes a group of speakers? What if the L2 is a lingua franca used to bring together
speakers of many different speech communities? This is the case of English in its global or
lingua franca dimension. The community of speakers of English is not restricted to a local or
national context. Rather, it is spread across the world and is made up of people unevenly
distributed among all countries in the world. English is thus the language used to bring to-
gether people of different cultural backgrounds, which means that speaking English in lingua
franca situations involves developing intercultural communication skills and enhanced prag-
matic strategies to deal with cultural and pragmatic diversity.
Whereas the intent of the acquisition of most languages is communication with local
communities of speakers, the more international the language is, the more diverse settings its
speakers will encounter. English is thus a unique language, given that it is spoken in so many
contexts. So, when we deal with models of speaking, and particularly models of pro-
nunciation, we may need first to question who comprises the speaker’s potential target au-
dience. Widdowson (2012) emphasized that communication in English in international
contexts is necessarily ruled by the specific purposes of the participants in the interaction.
Thus, the relevant question is whether that purpose involves a narrowly defined speech
community. If not, we face a situation in which there is no point in imitating the specific
pronunciation of a group of speakers. In this context, broad global intelligibility is the de-
sired ultimate attainment. Nativeness becomes a singularity with no particular significance
other than the social value attached to a community of speakers.
Standard language ideology has had an enormous impact on mainstream conceptions of
language and has deeply affected language teaching by confining and reducing language to
its formalized standard versions, barring from classrooms any form that is not considered
“correct.” The importance of standard language ideology in determining goals of L2
speaking can be seen in the promotion in textbooks of artificial pseudo-spoken forms pre-
cooked in written form that rarely occur in natural conversations, or in the promotion of
pronunciation forms produced by specific groups of speakers (e.g., British Received
Pronunciation in many UK textbooks and General American English in materials produced
in the United States). ELF is certainly not the only research paradigm that challenges the
concept of standard language and the predominant role it has had in linguistic and applied
linguistic analysis (Canagarajah, 2007; García & Wei, 2014; Makoni & Pennycook, 2007),
but it clearly confronts such ideology and calls into question native speakerism (Holliday,
2005), a pervasive notion that has sacralized native speakers while simultaneously ignoring
the many colours the language could take when spoken by individuals who have a different
first language.
Not only has standard language ideology created a potentially discriminatory situation
for non-native speakers of English, but it also contributes to the discrimination against many
native speakers whose accent is not deemed socially acceptable. Conversely, ELF research
emphasizes diversity within language use. Though such diversity has always been present in
NS forms and many sociolinguists have documented it, ELF has contributed to the trend in
L2 teaching and SLA to take a broader perspective, letting go of rigidities and constraints
posed by a narrowly defined concept of language use and a target language model.
2 Historical Perspectives
The historical evolution of ELF can be summarized as moving from describing a single
variety that could eventually become a new internationally accepted standard variety and
serve as a model for future learners of the language to the recent pluricentric
ELF1
In this phase, the goal was to characterize ELF interactions and predict the future evolution
of an upcoming international variety of English, and to help learners find the linguistic
aspects key to intelligibility in ELF. More or less implicitly, researchers hoped to
eventually codify such a variety and establish it as a norm for the teaching and learning of
English in international contexts (Seidlhofer, 2001).
In 2000, Jenkins put forward the Lingua Franca Core (LFC), which is one of the most
controversial aspects of ELF and L2 speaking. The LFC was (mis)interpreted by several
applied linguists as providing an alternative model to native varieties. However, Jenkins
(2007, 2015) claimed she did not intend to offer a prescriptivist set of rules to be complied
with or imposed as a model, and that the LFC is a hierarchy of pronunciation elements to
help teachers and learners establish priorities in their teaching and learning, rather than a
rigid simplified version of the syllabus. Such priorities would necessarily take into account
the particularities of different groups of learners and their L1s. The LFC gave a tre-
mendous push to general interest in ELF among English language teaching (ELT) re-
searchers and practitioners. Additionally, research on pronunciation teaching has shown
the importance of prioritizing elements of pronunciation that are most likely to contribute
to an increase in the intelligibility of the speaker (Derwing et al., 1998; Munro
et al., 2015).
Very little research has continued Jenkins’s (2000) work on pronunciation and the LFC. One
relevant study is Deterding (2013), who analysed misunderstandings in conversations
among Asian speakers of English and concluded that pronunciation is the main factor
causing them. More recently, Gardiner and Deterding (2018) concluded that consonant
clusters need to be maintained and “teachers should focus on the full number of con-
sonants in initial clusters” (p. 231), but the exact quality of the second consonant is not as
relevant, as a variant realization of that consonant would not create as much trouble as
omitting it.
Recently, not as much attention has been paid to the LFC, as ELF researchers (Cogo &
Dewey, 2012; Jenkins, 2007; Seidlhofer, 2011) have emphatically argued that ELF is a
complex and diverse entity that originates in natural interactions among speakers of different
L1s; they have mainly aimed at achieving an understanding of the dynamics of such language
encounters and the potential common patterns that appear. Understanding the complexity of
ELF gave way to the two subsequent stages: ELF2 and ELF3.
ELF2
Seidlhofer (2009) outlined a new orientation of ELF research, in which the emphasis was not
so much on forms that identify ELF as an emerging variety of English as on the several
“multi-faceted multilingual repertoires” (p. 242) of members of the global English-using community
of practice. The focus was placed on ELF’s variability, “understood as a defining char-
acteristic of ELF communication” (Jenkins, 2015, p. 55). This perspective was not very
different from that of Canagarajah (2007) and Makoni and Pennycook (2007), who
questioned the existence of varieties per se and raised concerns as to what “English” and
“Englishes” actually refer to. Thus, the emphasis moved from the description of a new
variety to the acknowledgement of the inappropriateness of categorizing varieties and the
need to focus on communicative strategies of global speakers of English in different contexts
and situations. Seidlhofer (2009) explained that “the crucial challenge has been to move from
the surface description of particular features, however interesting they may be in themselves,
to an explanation of the underlying significance of the forms” (p. 241) and Cogo and Dewey
(2012) argued that they are not so much interested in surface-level features of ELF in-
novations as in “the underlying communicative motives that give rise to them” (p. 14). These
authors went on to describe ELF as “a naturally occurring, very widespread, especially
contemporary phenomenon,” which “entails contact between speakers from varying lin-
guacultural backgrounds,” “involves online modifications of English language resources to
suit the particular communicative needs of interlocutors” and “entails (…) processes of
identity signaling, codeswitching, accommodation and language variation” (Cogo & Dewey,
2012, p. 18). In line with these defining traits, the authors further claim that “successful
communication is any exchange that proves to be meaningful for the participants and that
has reached the required purpose or purposes,” as they wish to analyze “the strategies that
participants use to make the conversation work, the moves they make to negotiate meaning
or to align with their co-participants” (Cogo & Dewey, 2012, p. 36).
ELF3
Much recent ELF research makes it clear that ELF always takes place in multilingual
environments where two or more languages are involved, and at least one of the speakers is
proficient in at least one language other than English; more often than not those languages
are also present in the interaction. This evolution of ELF research is parallel to the interest
in the constant and fluid interplay among languages in contact situations, and the
emphasis on the hybridity and plurilithic nature of language use in spontaneous
communication, as in García and Wei’s (2014) work on translanguaging, which is defined as
paradoxically binds them together in the shared awareness of “being involved in an especially
diverse linguacultural encounter” (Cogo & Dewey, 2012, p. 115). This includes native
speakers of English, who must be aware of such diversity to help achieve meaning. If they
rely on native speaker norms when in ELF interactions, it is quite likely they will encounter
trouble and ensuing miscommunication.
Diversity involves socio-pragmatic practices that may hinder communication, but at the
same time speakers’ linguacultural background provides them with a pool of resources to
face any communicative situation. Thus, diversity constitutes an asset in dealing with the
unexpected.
Current research on L2 speaking and ELF includes pragmatics and communication
strategies used by ELF speakers in interaction. Matsumoto (2011) is an example of a study
on negotiation strategies solving potential problems created by differences in pronunciation.
She argues that creating solidarity and trust among ELF speakers during the interaction is
fundamental, and observes that some L2 speakers find it difficult to establish such bonding
mechanisms with L1 speakers of English, an aspect that, according to the author, is a
consequence of unequal power relations.
The study of pragmatics in ELF started with the work of Firth (1996), who identified
several strategies found in international business negotiations, most notably the let-it-pass
and the make-it-normal principles, and was complemented by House’s (1999) and
Meierkord’s (2000) contributions on communicative efficiency. All three emphasized how
ELF conversations were characterized by cooperation and they highlighted the joint con-
struction of meaning that reduces misunderstandings to a minimum. Mauranen (2006)
concluded that misunderstandings do not occur more frequently in ELF than in non-ELF
conversations, possibly due to the use of proactive strategies in anticipation of commu-
nicative difficulty. Self-repairs, co-construction of expressions, unsolicited clarifications and
repetitions frequently occurred in Mauranen’s data. Thus, pragmatics and communication
strategies appeared to provide a compensatory mechanism to make up for actual formal
differences among different ELF speakers, as conversations are strongly oriented “toward
securing mutual intelligibility (…) quite possibly on the basis of the natural commonsense
assumption that it is not easy to achieve (mutual understanding) without special effort”
(Mauranen, 2006, p. 147).
Björkman (2011) described the use of pragmatic strategies in an ELF academic
environment and found that speakers could use a wide array of strategies regardless of their
degree of proficiency, and that students working in groups deployed such strategies more
frequently than lecturing instructors (in particular, “backchanneling” and “comment on
common ground”), which was attributed to the different speech events (interactive vs.
monologic) typical of group-work sessions and lectures.
According to Cogo and Dewey (2012), a key element in ELF communication is the use
of strategies during moments of “non-understanding,” which indicates a realization by
speakers of an unsuccessful interaction. Such strategies are useful to pre-emptively avoid
non-understanding, as they bring the interlocutor’s attention to a preceding part of the
conversation “in order to clarify or precise that particular segment” (Cogo & Dewey,
2012, p. 128).
Finally, an aspect that has attracted attention in contemporary research in ELF, in line
with the current interest in translanguaging and the multilingual turn in ELF research, is the
interplay among the different languages known by ELF speakers. Brunner and Diemer
(2018) and Vettorel (2019) describe how ELF speakers of different nationalities resort to
code-switching to either their own or their interlocutor’s L1 and “multilingual repertoires” to
build rapport and increase communicative efficiency.
complete disregard of how it might be achieved” (2011, pp. 182–183). The key point is that
“learners’ non-conformities are to be categorized not as errors but as evidence of suc-
cessful learning” (Seidlhofer, 2011, p. 186). One example of how traditional native
speaker-oriented pedagogical models may not be the best ones is the denial of any sig-
nificant role of the L1 in L2 development. The use of the L1 as a scaffolding mechanism to
increase L2 proficiency has been appraised in fairly recent literature (Hall & Cook, 2012),
which shows that learners can benefit from transferring tools and strategies developed in
their L1.
ELF researchers’ recommendations for teaching practitioners involve a change of attitude
rather than a list of discrete items to be incorporated or erased from common ELT syllabi.
Llurda and Mocanu (2019) outlined a detailed five-stage programme to change teachers’
attitudes towards ELF:
7 Future Directions
ELF research has contributed to L2 speaking research by providing a profound under-
standing of interactions among speakers of English from different L1 backgrounds, and
particularly emphasizing how such speakers manage to develop communicatively efficient
interactions. Yet, as Derwing (2016) suggested, we need to expand research by conducting
longitudinal studies observing the effect of different communicative resources on long-
term intercultural relations. Furthermore, the work of Derwing and her colleagues on the
interplay of intelligibility, comprehensibility and accent demonstrates how accent is par-
tially independent of the more important variables which determine L2 users’ capability of
being understood by both L1 and L2 speakers. Thus, ELF research could benefit from incorporating
the framework developed by Derwing and Munro (1997, 2005), in which they distinguish
among those three speech dimensions, and outline several factors that are key in successful
communication, including comprehensibility and intelligibility, fluency, pragmatics,
nature of the interaction (short-term vs. ongoing) and Willingness to Communicate
(Derwing, 2016).
The implementation of an ELF-aware vision in speaking pedagogy and assessment needs
further development, especially when the general principle of going beyond NS norms is
translated into practical teaching and testing guidelines. However, research on pedagogical
experiences is increasing and the findings will likely engage more teachers in ELF-aware
practices. The current moment demands a higher integration of ELF and multilingualism
research traditions, as ELF always takes place in multilingual environments (Jenkins, 2015).
Llanes and Cots (2020) and Llurda and Cots (2020) demonstrate how a translanguaging and
ELF-inspired pedagogy in a Business English course can result in better outcomes for
students than a conventionally monolingual NS-oriented approach.
Research on ELF speaking must continue making inroads into standard SLA research
and mainstream English pedagogy. However, as researchers and practitioners become in-
creasingly aware of the transformations required by the global spread of English, they are
gradually changing their perspective of a prototypical conversation in English, from one
involving NSs in a country where English is the majority language, to one in which speakers
are multilingual and may live anywhere in the world.
Further Reading
Cogo, A., & Dewey, M. (2012). Analysing English as a lingua franca. A corpus-driven investigation.
London: Bloomsbury.
A corpus-based study on ELF that explores lexical and pragmatic aspects of ELF use, and additionally
provides a theoretical discussion on ELF research and its outcomes.
Jenkins, J., Baker, W., & Dewey, M. (Eds.). (2018). The Routledge handbook of ELF. Abingdon, UK:
Routledge.
A comprehensive volume dealing with different research topics and approaches to ELF research, and its
pedagogical implications. The most authoritative voices in the field have contributed to the book.
Sifakis, N., & Tsantila, N. (Eds.). (2019). English as a lingua franca for EFL contexts. Bristol: Multilingual Matters.
Reflective questions and tips for teachers on how to make the transition from traditional EFL to ELF-
aware teaching.
References
Bayyurt, Y., & Akcan, S. (Eds.). (2015). Current perspectives on pedagogy for English as a Lingua
Franca. Berlin: De Gruyter Mouton.
Bayyurt, Y., & Sifakis, N. (2017). Foundations of an EIL-aware teacher education. In A. Matsuda
(Ed.), Preparing teachers to teach English as an International Language (pp. 3–18). Clevedon:
Multilingual Matters.
Björkman, B. (2011). Pragmatic strategies in English as an academic lingua franca: Ways of achieving
communicative effectiveness? Journal of Pragmatics, 43, 950–964.
Brunner, M.-L., & Diemer, S. (2018). “You are struggling forwards, and you don’t know, and then
you… you do code-switching…” – Code-switching in ELF Skype conversations. The Journal of
English as a Lingua Franca, 7, 59–88.
Burns, A. (2016). Research and the teaching of speaking in the second language classroom. In E. Hinkel
(Ed.), Handbook of research in second language teaching (pp. 242–256). Abingdon: Routledge.
Canagarajah, S. (2007). Lingua franca English, multilingual communities, and language acquisition.
The Modern Language Journal, 91, 923–939.
Chopin, K. (2015). Reconceptualizing norms for language testing: Assessing English language profi-
ciency from within an ELF framework. In Y. Bayyurt & S. Akcan (Eds.), Current perspectives on
pedagogy for English as a Lingua Franca (pp. 193–204). Berlin: De Gruyter Mouton.
Cogo, A., & Dewey, M. (2012). Analysing English as a Lingua Franca. A corpus-driven investigation.
London: Bloomsbury.
Cook, V. J. (1999). Going beyond the native speaker in language teaching. TESOL Quarterly, 33,
185–209.
Derwing, T. M. (2016). Challenges for intelligibility and comprehensibility in ELF. Plenary talk de-
livered at the 9th International Conference of English as a Lingua Franca, Lleida, June 26, 2016.
Derwing, T. M., & Munro, M. J. (1997). Accent, intelligibility, and comprehensibility. Studies in Second
Language Acquisition, 19, 1–16.
Derwing, T. M., & Munro, M. J. (2005). Second language accent and pronunciation teaching: A
research-based approach. TESOL Quarterly, 39, 379–397.
Derwing, T. M., & Munro, M. J. (2009). Putting accent in its place: Rethinking obstacles to com-
munication. Language Teaching, 42, 476–490.
Derwing, T. M., & Munro, M. J. (2015). Pronunciation fundamentals: Evidence-based perspectives for L2
teaching and learning. Amsterdam: John Benjamins.
Derwing, T. M., Munro, M. J., & Wiebe, G. (1998). Evidence in favor of a broad framework for
pronunciation instruction. Language Learning, 48, 393–410.
Derwing, T. M., Rossiter, M., & Munro, M. J. (2002). Teaching native speakers to listen to foreign-
accented speech. Journal of Multilingual and Multicultural Development, 23(4), 245–259.
Deterding, D. (2013). Misunderstandings in English as a lingua franca. Berlin: De Gruyter Mouton.
Dewey, M. (2015). Time to wake up some dogs! Shifting the culture of language in ELT. In Y. Bayyurt
& S. Akcan (Eds.), Current perspectives on pedagogy for English as a Lingua Franca (pp. 121–134).
Berlin: De Gruyter Mouton.
Firth, A. (1996). The discursive accomplishment of normality: On ‘lingua franca’ English and con-
versation analysis. Journal of Pragmatics, 26, 237–259.
García, O., & Wei, L. (2014). Translanguaging: Language, bilingualism and education. Basingstoke:
Palgrave.
Gardiner, I. A., & Deterding, D. (2018). Pronunciation and miscommunication in ELF interactions. In
J. Jenkins, W. Baker, & M. Dewey (Eds.), The Routledge handbook of ELF (pp. 224–232). Abingdon,
UK: Routledge.
Grosjean, F. (1989). Neurolinguists, beware! The bilingual is not two monolinguals in one person. Brain
and Language, 36, 3–15.
Hall, G., & Cook, G. (2012). Own-language use in language teaching and learning. Language Teaching,
45(3), 271–308.
Holliday, A. (2005). The struggle to teach English as an international language. Oxford: Oxford
University Press.
House, J. (1999). Misunderstanding in intercultural communication: Interactions in English as a lingua
franca and the myth of mutual intelligibility. In C. Gnutzmann (Ed.), Teaching and learning English
as a global language (pp. 73–93). Tübingen: Stauffenburg.
Jenkins, J. (2000). The phonology of English as an international language. Oxford: Oxford University
Press.
Jenkins, J. (2007). English as a lingua franca: Attitude and identity. Oxford: Oxford University Press.
Jenkins, J. (2015). Repositioning English and multilingualism in English as a lingua franca. Englishes in
Practice, 2, 49–85.
Kachru, B. B. (1985). Standards, codification and sociolinguistic realism. In R. Quirk & H. G.
Widdowson (Eds.), English in the world: Teaching and learning the language and literatures
(pp. 11–30). Cambridge: Cambridge University Press.
Kohn, K. (2019). Towards the reconciliation of ELF and EFL: Theoretical issues and pedagogical changes.
In N. Sifakis & N. Tsantila (Eds.), ELF for EFL contexts (pp. 32–49). Bristol: Multilingual Matters.
Llanes, A., & Cots, J. M. (2020). Measuring the impact of translanguaging in TESOL: A plurilingual
approach to ESP. International Journal of Multilingualism. doi: 10.1080/14790718.2020.1753749.
Llurda, E. (2015). Non-native teachers and advocacy. In M. Bigelow & J. Ennser-Kananen (Eds.), The
Routledge handbook of educational linguistics (pp. 105–116). New York: Routledge.
Llurda, E., Bayyurt, Y., & Sifakis, N. (2018). Raising teachers’ awareness about English and English as
a lingua franca. In P. Garrett & J. M. Cots (Eds.), The Routledge handbook of language awareness
(pp. 155–169). Abingdon: Routledge.
Llurda, E., & Cots, J. M. (2020). PLURELF: A project implementing plurilingualism and English as a
lingua franca in English language teaching at university. Status Quaestionis, 19, 259–276.
Llurda, E., & Mocanu, V. (2019). Changing teachers’ attitudes towards English as a lingua franca. In
N. Sifakis & N. Tsantila (Eds.), ELF for EFL contexts (pp. 175–191). Bristol: Multilingual Matters.
Lynch, T. (1996). Communication in the language classroom. Oxford: Oxford University Press.
Makoni, S., & Pennycook, A. (Eds.). (2007). Disinventing and reconstituting languages. Bristol:
Multilingual Matters.
Matsumoto, Y. (2011). Successful ELF communications and implications for ELT: Sequential analysis
of ELF pronunciation negotiation strategies. The Modern Language Journal, 95(1), 97–114.
Mauranen, A. (2006). Signaling and preventing misunderstanding in English as lingua franca com-
munication. International Journal of the Sociology of Language, 177, 123–150.
McKay, S. L. (2002). Teaching English as an international language. Oxford: Oxford University Press.
McNamara, T., & Shohamy, E. (2016). Language testing and ELF: Making the connection. In M.-L.
Pitzl & R. Osimk-Teasdale (Eds.), English as a lingua franca: Perspectives and prospects
(pp. 227–233). Berlin: De Gruyter.
Meierkord, C. (2000). Interpreting successful lingua franca interaction. An analysis of non-native-/non-
native small talk conversations in English. Linguistik Online, 5. https://bop.unibe.ch/linguistik-
online/article/view/1013/1673
Moyer, A. (2014). The social nature of L2 pronunciation. In J. M. Levis & A. Moyer (Eds.), Social
dynamics in second language accent (pp. 11–29). Boston: De Gruyter.
Munro, M. J., & Derwing, T. M. (1995). Foreign accent, intelligibility and comprehensibility in the
speech of second language learners. Language Learning, 45, 73–97.
Munro, M. J., Derwing, T. M., & Thomson, R. I. (2015). Setting segmental priorities for English
learners: Evidence from a longitudinal study. IRAL, 53, 39–60.
Newbold, D. (2019). ELF in language tests. In N. Sifakis & N. Tsantila (Eds.), ELF for EFL contexts
(pp. 211–226). Bristol: Multilingual Matters.
Seidlhofer, B. (1999). Double standards: Teacher education in the expanding circle. World Englishes,
18, 233–245.
Seidlhofer, B. (2001). Closing a conceptual gap: The case for a description of English as a lingua franca.
International Journal of Applied Linguistics, 11, 133–158.
Seidlhofer, B. (2009). Common ground and different realities: World Englishes and English as a lingua
franca. World Englishes, 28, 236–245.
Seidlhofer, B. (2011). Understanding English as a lingua franca. Oxford: Oxford University Press.
Seidlhofer, B., & Widdowson, H. G. (2019). ELF for EFL: A change of subject? In N. Sifakis & N.
Tsantila (Eds.), ELF for EFL contexts (pp. 17–31). Bristol: Multilingual Matters.
Sifakis, N., & Bayyurt, Y. (2018). ELF-aware teaching, learning and teacher development. In J.
Jenkins, W. Baker, & M. Dewey (Eds.), The Routledge handbook of ELF (pp. 456–467). Abingdon,
UK: Routledge.
Sridhar, K. K., & Sridhar, S. N. (1986). Bridging the paradigm gap: Second language acquisition
theory and indigenized varieties of English. World Englishes, 5, 3–14.
Thir, V. (2020). International intelligibility revisited. L2 realizations of NURSE and TRAP and
functional load. Journal of Second Language Pronunciation, 6, 458–482.
Vettorel, P. (2019). Communication strategies and co-construction of meaning in ELF: Drawing on
“multilingual resource pools”. Journal of English as a Lingua Franca, 8, 179–210.
Walker, R. (2010). Teaching the pronunciation of English as a lingua franca. Oxford: Oxford University
Press.
Widdowson, H. G. (2012). ELF and the inconvenience of established concepts. Journal of English as a
Lingua Franca, 1, 5–26.
PART V
Emerging Issues
25
WORKPLACE COMMUNICATION
Lynda Yates
1 Introduction/Definitions
Workplace communication is a broad research area encompassing a range of spoken,
written, and mediated communicative activity. Studies from multiple perspectives using
several methodologies have provided rich descriptions of how people communicate to get
things done, exercise leadership, and build relationships at work. My aim here is not to
survey this work in its entirety, as this has been well covered elsewhere (e.g., Vine, 2018, 2020).
Rather, it is to consider the implications of this work for adults who must learn to speak an
additional, later-learned language at work. This is an emerging issue, not because it is new
(people have been using other languages at work since time immemorial) but because the world
of work is changing fast and the explosion in world-wide interconnectivity, large-scale global
flows of workers at all levels and the “superdiversity” (Vertovec, 2007) that now characterizes
our major cities make it more important than ever to ensure adult learners have the skills
needed to speak effectively at work.
Although I will draw chiefly on literature related to English, my focus is on spoken com-
munication rather than the features of any specific language. The term “language” is therefore
used as shorthand for “the use of language that is appropriate in context.” This entails the
ability to understand the nature of an interaction and the relative rights and obligations of
participants, that is, sociopragmatic competence, and the ability to select and use the language
behaviours that are pragmatically appropriate, that is, pragmalinguistic competence.
2 Historical Perspectives
As workplace communication is a relatively new area of focus in SLA studies, much of the
research is more accurately discussed in terms of critical issues currently contributing to the
emerging research strengths of the field.
3 Critical Issues and Topics; Current Contributions and Research
Talk and Workplaces
Workplaces have been described as “a moving target” (Kerekes, 2018, p. 414) for learners. Not
only do they embrace an almost infinite variety of settings, tasks and people, but they also
change at a phenomenal speed in ways that previous generations could not have imagined
under the impact of technical innovation and global events. These new and constantly evolving
conditions demand flexibility of thought, activity, and therefore of modes and styles of com-
munication. Moreover, since “jobs for life” are now the exception, an individual is likely to
experience a range of different roles and workplaces, so that an ability to style-shift and tailor
communication to context is of even greater importance, particularly for migrants who often
start “at the bottom” of the employment ladder, whatever their pre-migration skills.
Although talk at work shares many characteristics with talk elsewhere, it is constrained by,
and assumes particular meanings in relation to, the shared community of practices in the
workplace in which it is embedded (Drew & Heritage, 1992). While this is true for workplaces
in general, it is particularly visible in specialized work contexts where behaviours are strictly
regulated according to roles and power differences, such as turn-taking in courtrooms,
classrooms, boardrooms, and meetings. While the communicative demands of different
workplaces vary considerably, they are typically characterized by an orientation towards a
shared goal. This focus can require not only precise expression and technical language, but also
the ability to make and interpret economical contextual references to activities that may be
obscure to outsiders. Moreover, this shared attention to current or past activity can necessitate
complex explanations and a facility with specialized or localized discourses associated with a
particular industry or workplace (Handforth, 2018; Koester, 2006).
At work, we assume roles and identities which confer different rights and obligations to
speak from those we enact outside, and we often have to communicate effectively with a wide
range of people at different levels, whether we like them or not. This requires considerable
style-shifting and on-going relational work, both in overtly social conversations around the
water cooler or in the lunch room, and in more formal on-task interactions, and now in
virtual contexts for many. All of these interactions are important for building and main-
taining productive workplace relationships, especially for newcomers, since the consequences
of not “getting it right” can be severe, impacting not only efficiency and well-being, but also
ultimately job security and long-term career progression (Holmes, 2000).
work settings and have provided insight into key areas of workplace talk. These include the
use of humour (Holmes, 2006), the role of social talk (Holmes, 2000), how leadership and
gender are enacted (Holmes, 2006; Holmes et al., 2011), and how people wield power and get
things done (Holmes & Stubbe, 2015; Vine, 2004). This important work uses an interactional
sociolinguistic methodology to illustrate and foreground the important role of subtle,
pragmatic aspects of interaction in how speakers achieve their goals and come to be per-
ceived by others. It has also highlighted the centrality of informal interpersonal aspects of
communication in the workplace (Newton & Kusmierczyk, 2011). This research has had
considerable practical as well as theoretical impacts through collaborative initiatives with
government authorities and the dissemination of findings through materials and programmes
targeting migrants, as well as those dealing with a multicultural workforce (e.g., Joe &
Riddiford, 2017; Riddiford & Newton, 2010; Work Talk).
Descriptions of the language demands of workforce entry roles have received somewhat
less attention, and indeed many can be considered “language marginal” (McAll Bayley 2003)
in that they offer few opportunities for workers to use their additional language at all. Yet, it
is in these more menial roles that many migrant learners, even those from professional
backgrounds, first find employment, and in which they not only hope to earn a living, but
also to learn the dominant language and start engaging with their new communities. One
aspect of factory floor communication that has received attention, and sometimes surprises
migrants to English-dominant workplaces, is the role of swearing and humour. Daly et al.’s
(2004) study of talk on the factory floor of a packaging plant highlights the importance of
this behaviour as a means of building solidarity and comradeship. The use of humour in the
form of short jokes and apparently confrontational complaints and competitive sequences
reflects a close working relationship within some teams and increases bonding to make life
more enjoyable.
In considering the diversity of workplaces, it is also important to note their increasingly
multilingual nature as they reflect the superdiversity of our communities and the global reach
of many enterprises. Communication in such workplaces demands the ability to adroitly
navigate the “ecosystem” of different languages in the office or on the factory floor (Angouri,
2014). Some companies mandate the use of a particular language, often English, through an
explicit language policy. Research on ELF suggests that lingua franca communication takes
on particular characteristics: users are more collaborative, more tolerant of deviation from
norms, less focussed on linguistic accuracy and more supportive of each other as they ne-
gotiate meaning together (Firth, 1996). However, alongside explicit language policies, im-
plicit language policies can also affect how interactions are managed and reacted to (Hazel,
2015). This can involve the selection of a particular language for a particular function and
can give rise to interactional norms internalized by experienced members of a community but
which newcomers need to learn. Code-switching, for example, might occur at work because it
is efficient (Kleifgen, 2013) or to signal either solidarity or discord (Lauriks et al., 2015).
Since language selection can index membership, the confident use of the language policy can
index a high status, as illustrated by the public sanctioning of the use of a dispreferred
language (Danish) in a business meeting where ELF is routinely used (Hazel, 2015, p. 147).
At the other end of the employment spectrum, Goldstein (1997) has illustrated the role that
different languages can play in wielding power and getting things done on the factory floor.
Moreover, workers find their own creative ways of communicating. Kalmar (2001), for example,
showed how Latino immigrant workers in the United States used their own language as they
worked through English, matching unfamiliar English sounds to familiar ones, producing hybrid
words. While the development and use of these “fragmented” and context-specific “multilingual
repertoires” (Blommaert, 2010) facilitates communication and evidences considerable creative
linguistic and metalinguistic skill, it can also leave workers unsure as to exactly what language
they are speaking and thus lacking in confidence about their actual level of proficiency in the
dominant language (Pujiastuti, 2017). For individuals seeking to climb the employment ladder,
such short-term efficiency gains can have a deleterious impact on longer-term prospects.
While experimental role-plays used in early interlanguage pragmatic studies allowed the
comparison of learner and expert speaker behaviour in similar situations, they inevitably
tapped into perceptions of language use in “idealized” general settings and contexts for a
particular variety (often English in the United States). In addition, as student populations
offer a ready source of participants for such studies, the findings likely reflect the intuitions of
young adults rather than older speakers more experienced in the workplace. To specifically
compare intuitions of experienced participants in workplace situations with those of migrant
learners preparing for the workforce in Australia, Yates (2010) elicited role-play data from
Dinka-background refugees and mature Australians performing in common workplace
situations. The findings highlighted some specific pragmalinguistic and sociopragmatic areas
where the learners might benefit from explicit attention, leading to the development of
teaching materials to specifically target these (Yates & Springall, 2010). Similarly, in the
interest of increasing workplace relevance, a combination of role-play data collected from
trained physicians and ethnographic data from naturally occurring doctor–patient con-
sultations were used in a series of studies investigating the pragmatic and cultural challenges
facing internationally trained doctors preparing for registration in Australia (Dahm & Yates,
2013; Yates & Dahm, 2016; Yates et al., 2016).
Cultural and linguistic backgrounds influence not only the forms that speakers select and
aspects of delivery such as turn-taking, hesitation phenomena, and gaze, but also expecta-
tions regarding the rights and obligations of speakers, the topics considered appropriate to
discuss, and so on. Since these vary cross-culturally, they are a potential source of mis-
communication at work. An example of this is small talk. While apparently trivial, research
suggests that small talk is actually very important for building good relationships and social
inclusion at work, yet the topics, functions, forms and notions of when and with whom small
talk might be appropriate can vary across cultures, so that newcomers can find themselves
excluded and unsure how to participate (Holmes, 2000). Similarly, other sociocultural ex-
pectations about behaviour at work, including how much and when to smile or laugh, what
counts as friendship, how much personal disclosure or informality is appropriate, and so on
can vary across cultures and pose a challenge for newcomers (Yates & Major, 2015).
To complicate matters, any workplace involves the intersection of different cultures re-
lating not only to the wider language and cultural community, but also to the particular
industry, company, and even department, and all will be relevant to the communicative
expectations of that site (Haugh & Watanabe, 2018). Effective speaking therefore demands
an understanding of the macro- and micro-cultural environment, as well as an understanding
of different genres and the communicative behaviours and roles expected in different
workplace activities. Thus, as discussed above, the swearing and apparently competitive
workplace talk reported in Daly et al. (2004) might in other settings be viewed as con-
frontational and offensive, while in the context of that factory floor it served to build soli-
darity. Learning to talk effectively at work involves much more than the skill to supply
behaviours appropriate to the context. It also demands the development of the knowledge
and skills to take stock of a new situation and respond appropriately. How these might best
be acquired through informal and formal learning will be considered later.
Lynda Yates
experienced others, moving from peripheral to more central membership as they gain in their
understanding of how things are done in a community. Drawing on work in the medical field,
Lu and Corbett (2012) argue that it is only through direct observation that someone can
understand how things actually work. As any workplace involves the intersection of various
communities of practice (Holmes & Stubbe, 2015), there is clearly a role for informal on-the-
job learning in the development of both proficiency and relevant workplace and industry-
specific expertise (Losa, 2018). Moreover, it can sometimes be easier to see exactly what
needs to be learned, and how this might be different from what you already know, once you
are at work. Ming, in Li (2000), came to understand that her workplace expected commu-
nication to be “directly, truthful and things a little bit sweet” (p. 75) in the “American way.”
Similarly, Charles from Colombia observed that things were done differently in his
Australian workplace. His role in furniture design meant that he had to critique the work of
others, but he came to realize that it did not go down well if he gave negative opinions
directly. He therefore learned that he had to “say without saying” in order to keep relations
sweet at work, even if he felt the designs were “horrible” (Yates & Major, 2015, p. 147).
However, the success or otherwise of language learning at work depends on many factors,
including how conducive the workplace environment is to learning, or even communication.
Drawing on reports from migrants participating in a longitudinal study of their early ex-
periences in Australia, Yates (2018) reviews some of the factors that fostered or impeded
their ability to learn English at work. She concluded that although the strong motivation to
work among participants increased the relevance of learning English, especially since there
were few other opportunities to engage with English outside class, it could also be a siren call,
luring them prematurely from their language classes. Professional roles were particularly
useful for language learning as they made more demands on their skills and forced learners to
take risks with their language. Many found that even low-level jobs gave them the confidence
to use the English that they had, particularly if colleagues were helpful, and that these jobs
could be useful stepping stones to other, more challenging roles as their English improved.
However, many migrants take jobs in businesses run by speakers of their first language
(L1), or in factories, laundries, bakeries, salons or other workplaces requiring little English
outside the routine phrases and technical vocabulary needed to get the job done. In some
workplaces it can be too noisy to communicate, or talking can be explicitly forbidden (Yates,
2018). Thus, many of the menial jobs taken by migrants can offer few language learning
opportunities, as illustrated in Strömmer’s (2016) case study of Kifibin, a graduate from
Uganda who worked as a cleaner in Finland. As cleaning is often undertaken when everyone
else is absent, he worked largely on his own with little or no contact with clients or anyone
else. His supervisors only spoke to him when there were changes to the normal routine, and
his engagement with Finnish was therefore largely limited to receiving instructions and
asking for clarification.
However, even where co-workers are present, they are not always keen or helpful inter-
locutors. Derwing (2016), for example, found that employees in the Canadian workplace she
studied tended to socialize within their own first language groups. Most of the 24 native
speaker respondents reported difficulties understanding and being understood by their mi-
grant colleagues, and two-thirds indicated they were sometimes reluctant to initiate con-
versations with them, even though, ironically, most felt that more practice speaking would
improve their co-workers’ English. This seems to be a common experience (e.g., Sandwall,
2010; Yates, 2011). While these difficulties may relate to their lack of experience with the
language and formats of workplace conversations, they may also reflect the reluctance of
native and expert speakers to engage. Reasons for not engaging may range from a desire not
to place an undue burden on colleagues who appear to struggle with language, through to
Workplace Communication
discomfort with ethnic or cultural differences or, more seriously, racism. Whatever the cause,
responsibility for good communication rests with everyone, not solely newcomers. This
suggests an important role for training programmes targeting the skills and attitudes of
managers and co-workers (Derwing, 2016; Derwing et al., 2021).
Even where English is used at work as a lingua franca, co-workers may speak varieties
that are difficult to understand, or the workplace itself might foster the development of its
own variety. Thus, although Jurik from Mexico used English all the time with his interna-
tional co-workers in the kitchen of the golf club where he worked, their focus was on the
efficient delivery of food to customers rather than the beauty or otherwise of their English, so
that he felt the “kitchen English” he used did not help to improve his English (Chappell,
Benson & Yates, unpublished data). Similarly, Pujiastuti (2017) reports how hotel workers
from different language backgrounds developed a very effective multilingual mode of
communication that resulted in a bricolage, which one of them referred to as “basura de
lenguas” (language trash). While such creativity may be effective for productivity in the
workplace, it is much less useful for learning the dominant language used outside. Thus,
while experiential learning in the workplace can be beneficial, formal language learning also
has an important role to play in developing the range of spoken language that migrants and
others need if they are to retain the ability to respond to new employment opportunities.
Below I consider issues related to the design and delivery of programmes to support language
learning for the workplace.
The aspect of linguistic proficiency singled out for particular attention as a priority by
Derwing (2016), pronunciation, has long been recognized as an important, but often ne-
glected concern for employers, co-workers and the learners themselves. Since this aspect of
spoken language is covered in more depth elsewhere (see Derwing & Munro, this volume),
here I will only make brief comment on what might be appropriate pronunciation goals, why
they should be a priority for training, and the role of interlocutors at work.
While it is unnecessary and unfair to expect mature second language (L2) users to sound
native-like, there is general agreement that a comfortable intelligibility is a reasonable goal
for the workplace (see Derwing et al., 2021). Since adults typically find it challenging to
“hear” and reproduce the often subtle phonological and prosodic features of a later learned
language, they may not be aware of the extent to which their pronunciation affects their
intelligibility, and even sympathetic interlocutors can feel shy about offering the assistance
they might with other aspects of language, such as vocabulary or grammar. This makes it a
priority candidate for specific instruction, particularly as a speaker’s problematic pro-
nunciation can too easily be used to make negative judgements about their general language
ability, and even their competence (Derwing et al., 2021).
Intelligibility is a two-way process, and interlocutors’ familiarity with or attitude towards
an accent or ethnic group can play a major role in their ability or willingness to understand
what is said. Thus, accents associated with “high status” groups may be more “easily” un-
derstood (Bresnahan et al., 2002). Moreover, workplaces can be pressured environments
where supervisors and co-workers are not always sympathetic listeners. This suggests that
practical programmes for co-workers and managers on how to improve their understanding
and ability to communicate with migrant colleagues are also a priority, alongside
pronunciation classes for the migrants themselves. Findings from the few studies reporting on
the outcomes of such training are encouraging, even for older learners and long-term mi-
grants who have been in the workforce for many years (see Derwing, 2016; Derwing et al.,
2014; Derwing et al., 2002; Kang & Rubin, 2012).
As the discussion of research into the communicative demands of the workplace in the
previous parts demonstrates, pragmatic proficiency, that is, the ability to convey and un-
derstand intended meaning appropriately in context, is vital to communicative success. This
makes it a clear priority for workforce preparation programmes. To be able to use their
general language proficiency appropriately and to good effect, speakers need the socio-
cultural knowledge and skills to accurately identify the nature of the situation and context in
which they are speaking, the purpose of the conversation, and the relative rights and ob-
ligations of the participants. Crucially, since all communication has a relational dimension
indexing attitude, speakers also require the ability to make pragmalinguistic choices that
accurately reflect their intentions in that context. Since, as discussed earlier, these pragmatic
conventions are often below the level of consciousness and usually learned during early
language socialization in childhood, they have been referred to as “the secret rules” of
language (Yates, 2004). The danger here is that, while syntactic or phonological errors index
a speaker as a learner, infelicities involving pragmatic choices are less clearly “visible” as
learner errors, and thus more likely to be misinterpreted as rudeness. Importantly, general
language proficiency does not guarantee pragmatic proficiency. Indeed, greater general L2
proficiency can actually lead speakers into greater difficulty, since the more a learner can say,
the greater the scope for pragmatic infelicity (Kasper, 1992). This suggests the need for an
explicit focus on pragmatics.
However, despite its centrality in effective communication, with some notable exceptions,
pragmatics is often overlooked in preparation programmes. Learning materials and activities
appropriate to the level of learners are not always available (Diepenbroek & Derwing, 2013).
As Kerekes (2018) notes, applied linguists have an important role to play in making their work
available and accessible to the world of practice. She rejects, however, the notion that research
findings can be simplified into scripts that can simply be learned, advocating instead that they
be used as the stimulus for a focus on strategies rather than rules. This approach has been used
successfully in short courses for professional migrants drawing on insights from collaborations
between researchers and practitioners in the Language in the Workplace project. Through a
combination of explicit pragmatic instruction involving reflection on models and the oppor-
tunity to engage and observe during workplace internships, course participants could develop
their awareness of sociocultural and pragmalinguistic phenomena. Riddiford and Joe (2010),
for example, demonstrated how learners not only developed their awareness of different ways
of making requests, but they also put them into practice during their work placements and
reflected on their own behaviour. In this way, they began to develop the understandings and
analytical skills necessary to explore workplace pragmatics for themselves. Such skills are a
crucial component of any workplace programme because no one course can cover everything
learners need to communicate effectively at work.
A combination of explicit instruction, reflection and the opportunity for observation and
engagement in a workplace offers the advantages of both formal and informal learning,
equipping learners with the tools needed to have agency in their own learning beyond the
classroom. Approaches to workforce language training such as paid hours of formal learning
within working hours, programmes specifically designed and delivered in and for particular
workplaces (see, for example, Norquest, 2010) and workplace internships embedded into
language programmes (AMES Australia, 2016; Riddiford & Joe, 2010) offer a welcome al-
ternative to after-hours classes, which learners can find too disconnected from their own
workplace and too exhausting after a full day’s work.
An important issue raised in workplace language training relates to the models that should be
used for programmes, since many workplaces are multilingual, and adult learners are not, and
may not want to be, native speakers of the language they are learning. As Timpe-Laughlin (2019)
warns, the “target norm” should not be equated to “native speaker norm.” As noted earlier,
lingua franca communication may have its own norms. Another issue raised in the literature
relates to the scope of programmes preparing migrants for the workplace, and the tension be-
tween providing migrants with the skills they need to do their (entry-level) jobs and the provision
of a foundation capable of catering for their longer-term language needs and career progression
(Warriner, 2010). Further, prescriptive approaches to language and culture training constrain
what learners can say, and thus how they act and think at work. The socially constitutive and
indexical nature of talk means that learning to talk the way that others do at work is a double-
edged sword: on the one hand, it can bring a feeling of acceptance, but on the other, it can
encourage the reproduction, and therefore tacit endorsement, of the status quo. There are
dangers, too, in prescriptive approaches that objectify culture by essentializing groups or failing
to acknowledge the complexity and fluidity of interaction and community membership. Thus, a
social constructionist perspective should be incorporated alongside historical or geographically
bound perspectives (Lazzaro-Salazar, 2018). Such recommendations accord with contemporary
approaches to the teaching of pragmatics, which stress learner reflection, understanding and
choice, rather than compliance with a set of rules (Kerekes, 2018; Timpe-Laughlin, 2019).
5 Future Directions
To conclude, I briefly outline directions for future research in workplace communication and
how research findings can benefit different kinds of learners who must acquire the knowledge
and skills they need for work.
While research into workplace communication has already provided considerable insight
into the nature and formats of how people talk at work, the workplaces of tomorrow will be
vastly different from those of the past, and the rate of change is rapid and ferocious. New
industries, new technologies and developments on the home–work interface will change work
practices and spawn new ways of interacting. It is imperative that research into the nature of
workplace communication not only keep pace with these changes, but also expand to include
attention to hitherto neglected aspects of spoken communication. While pragmatic per-
spectives have rightfully been a major focus, the relationship between pragmatics and other
modes of delivery relevant to communicative success such as pronunciation (but see Derwing
et al., 2021), gesture, deixis, gaze, the use of artefacts, and so on is as yet under-researched
and suggests important possibilities for future endeavours. We must also build our
understanding of how communication works in other work contexts and languages.
Moreover, the increasingly globalized and multilingual nature of workplaces means that
research into the role, status, and use of different languages and varieties at work will be
increasingly relevant. Such research will make an important contribution theoretically to our
understanding of the role and status of different languages at work; in practical terms it will
provide an evidence base for training interventions supportive of speakers from all back-
grounds – native, expert, and learner – to manage this diversity. This suggests the importance
of increased research attention to the nature of ELF and other linguae francae. Since much
ELF research to date has been conducted in academic contexts and in Europe, there is a need
to expand the scope to the exploration of other languages and workplace contexts.
From an applied perspective, we need more research insights into the design, conduct and
evaluation of workplace language training programmes and into how learners gain the skills
they need for work. While studies of workplace communication can increase researcher
awareness of the source content for such programmes, that is, how people actually com-
municate at work, there remains the significant challenge of ensuring access to these findings
for policy-makers, curriculum designers, and practitioners to inform practice. Long-term
collaboration between researchers and practitioners offers one productive way of addressing
this need. Collaborative, integrated research-to-practice initiatives could include exploration
of the content, design, and delivery of programmes for both learners and their co-workers.
For researchers, this connection with practitioner–collaborators can offer good insights into
learner challenges, access to new research sites and a sense of being able to make a real
difference. For practitioners, it can mean a reliable evidence base for materials development,
opportunities for professional growth, and confidence in the relevance of their work to the
current needs of their learners. Such collaborations can help to nurture a stronger connection
between research and practice in support of those who need to learn more about how to
communicate effectively in the workplace.
Further Reading
Riddiford, N., & Joe, A. (2010). Tracking the development of sociopragmatic skills. TESOL Quarterly,
44(1), 195–205.
A report on the conduct of a practical intervention for migrant language learners which combined
classroom instruction based on empirical evidence from the Language in the Workplace Project with
reflection on experiences during a practical internship in New Zealand workplaces.
Timpe-Laughlin, V. (2019). Pragmatics learning in the workplace. In N. Taguchi (Ed.), The Routledge
handbook of second language acquisition and pragmatics (pp. 413–428). New York, NY: Routledge.
This chapter draws on a number of empirical studies of pragmatics in workplace communication to
offer both theoretical and applied perspectives that will be useful for readers with an interest in the
processes of learning as well as in the language used in work contexts.
Vine, B. (Ed.) (2018). The Routledge handbook of language in the workplace. Abingdon, Oxon:
Routledge.
A fascinating introduction to studies in the field as it brings together a wide range of perspectives on
workplace communication using a staggering array of different research methodologies.
References
AMES Australia. (2016). In transition: Employment outcomes of migrants in English language programs at
AMES Australia. Research and Policy Unit AMES Australia December 2016. Accessed 02.09.20 from
https://www.ames.net.au/-/media/files/research/transitions_slpet-short-report_-final_dec-2016.pdf
Angouri, J. (2012). Managing disagreement in problem-solving meeting talk. Journal of Pragmatics,
44, 1566–1570.
Angouri, J. (2014). Multilingualism in the workplace: Language practices in multilingual contexts.
Multilingua, 33(1–2), 1–9.
Blommaert, J., & Dong, J. (2010). Ethnographic fieldwork: A beginner’s guide. Buffalo, NY:
Multilingual Matters.
Blum-Kulka, S., House, J., & Kasper, G. (Eds.). (1989). Cross-cultural pragmatics: Requests and
apologies. Norwood, NJ: Ablex.
Bresnahan, M. J., Ohashi, R., Nebashi, R., Liu, Y., & Shearman, S. M. (2002). Attitudinal and affective
response toward accented English. Language & Communication, 22(2), 171–185.
Bygate, M. (2018). Creating and using the space for speaking within the foreign language classroom:
What, why and how? In R. Alonso (Ed.), Speaking in a second language. Amsterdam,
Netherlands: John Benjamins.
Dahm, M., & Yates, L. (2013). English for the workplace: Doing patient-centred care in medical
communication. TESL Canada, 30(Special Issue 7), 21–23.
Daly, N., Holmes, J., Newton, J., & Stubbe, M. (2004). Expletives as solidarity signals in FTAs on the
factory floor. Journal of Pragmatics, 36(5), 945–964.
Derwing, T. M. (2016). The 3 P’s of ESL in the workplace: Proficiency, pronunciation and pragmatics.
In H. McGarrell & D. Wood (Eds.), Contact: Refereed proceedings of the TESL Ontario Research
Symposium, 42(2), 10–20.
Derwing, T. M., Munro, M. J., Foote, J. A., Waugh, E., & Fleming, J. (2014). Opening the window on
comprehensible pronunciation after 19 years: A workplace training study. Language Learning, 64, 526–548.
Derwing, T. M., Waugh, E., & Munro, M. J. (2021). Pragmatically speaking: Preparing adult ESL
students for the workplace. Applied Pragmatics, 3(2), 107–135.
Derwing, T. M., Rossiter, M. J., & Munro, M. J. (2002). Teaching native speakers to listen to foreign-
accented speech. Journal of Multilingual and Multicultural Development, 23(4), 245–259.
Diepenbroek, L. G., & Derwing, T. M. (2013). To what extent do popular ESL textbooks incorporate
oral fluency and pragmatic development? TESL Canada, 30, 1–20.
Drew, P., & Heritage, J. (1992). Analyzing talk at work: An introduction. In P. Drew & J. Heritage
(Eds.), Talk at work: Interaction in institutional settings (pp. 3–65). Cambridge, NY: Cambridge
University Press.
Duff, P., Wong, P., & Early, M. (2002). Learning language for work and life: The linguistic socialization
of immigrant Canadians seeking careers in healthcare. The Modern Language Journal, 86, 397–422.
Félix-Brasdefer, J. C. (2017). Interlanguage pragmatics. In Y. Huang (Ed.), The Oxford handbook of
pragmatics (pp. 416–434). Oxford: Oxford University Press.
Firth, A. (1996). The discursive accomplishment of normality: On ‘lingua franca’ English and
conversation analysis. Journal of Pragmatics, 26, 237–259. https://doi.org/10.1016/0378-2166(96)00014-8
Fletcher, J. (2018). Rapport management. In B. Vine (Ed.), The Routledge handbook of language in the
workplace (pp. 77–88). Abingdon, Oxon: Routledge.
Goldstein, T. (1997). Two languages at work: Bilingual life on the production floor. Mouton de Gruyter.
Gumperz, J. J. (1982). Discourse strategies. Cambridge, England: Cambridge University Press.
Gumperz, J. J., Jupp, T. C., & Roberts, C. (1979). Crosstalk. Background materials and notes to
accompany the BBC film. London: National Centre for Industrial Language Training.
Handforth, M. (2010). The language of business meetings. New York, NY: Cambridge University Press.
Handforth, M. (2018). Corpus linguistics. In B. Vine (Ed.), The Routledge handbook of language in the
workplace (pp. 51–64). Abingdon, Oxon: Routledge.
Haugh, M., & Watanabe, Y. (2018). In B. Vine (Ed.), The Routledge handbook of language in the
workplace (pp. 65–76). Abingdon, Oxon: Routledge.
Hazel, S. (2015). Identities at odds: Embedded and implicit language policing in the internationalized
workplace. Language and Intercultural Communication, 15(1), 141–160.
Holmes, J. (2000). Talking English from 9 to 5: Challenges for ESL learners at work. International
Journal of Applied Linguistics, 10(1), 125–140.
Holmes, J. (2006). Sharing a laugh: Pragmatic aspects of humour and gender in the workplace. Journal
of Pragmatics, 38(1), 26–50.
Holmes, J., Marra, M., & Vine, B. (2011). Leadership, discourse and ethnicity. New York, NY: Oxford
University Press.
Holmes, J., & Stubbe, M. (2015). Power and politeness in the workplace: A sociolinguistic analysis of talk
at work (2nd edn). London, UK: Routledge.
Holmes, J., & Riddiford, N. (2011). From classroom to workplace: Tracking socio-pragmatic devel-
opment. ELT Journal, 65(4), 376–386.
Ishihara, N., & Cohen, A. D. (2010). Teaching and learning pragmatics: Where language and culture
meet. Abingdon, Oxon: Pearson Education.
Joe, A., & Riddiford, N. (2017). Applying research to real world challenges and issues: Developing
research-based resources to help migrants enter the workforce. In M. Marra, & P. Warren (Eds.),
Linguists at Work. Wellington: Victoria University Press.
Kalmar, T. (2001). Illegal alphabets and adult biliteracy: Latino migrants crossing the linguistic border.
Mahwah, NJ: Erlbaum.
Kang, O., & Rubin, D. (2012). Inter-group contact exercises as a tool for mitigating undergraduates’
attitudes toward ITAs. Journal of Excellence in College Teaching, 23(3), 159–166.
Kasper, G. (1992). Pragmatic transfer. Interlanguage Studies Bulletin (Utrecht), 8(3), 203–231.
Kerekes, J. (2018). Language preparation for internationally educated professionals. In B. Vine
(Ed.), The Routledge handbook of language in the workplace (pp. 389–400). Abingdon, Oxon: Routledge.
Kleifgen, J. A. (2013). Communicative practices at work: Multimodality and learning in a high-tech firm.
Bristol: Multilingual Matters.
Koester, A. (2006). Investigating workplace discourse. London, UK: Routledge.
Koester, A. (2010). Workplace discourse. London, UK: Continuum.
Li, D. (2000). The pragmatics of making requests in the L2 workplace: A case study of language
socialization. The Canadian Modern Language Review/La Revue canadienne des langues vivantes,
57(1), 58–87.
Lazzaro-Salazar, M. (2018). Social constructionism. In B. Vine (Ed.), The Routledge handbook of language in the
workplace (pp. 425–435). Abingdon, Oxon: Routledge.
Lauriks, S., Siebörger, I., & De Vos, M. (2015). “Ha! Relationships? I only shout at them!” Strategic
management of discordant rapport in an African small business context. Journal of Politeness
Research, 11(1), 7–39.
Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge, UK:
Cambridge University Press.
Losa, S. A. (2018). Vocational education. In B. Vine (Ed.), The Routledge handbook of language in the
workplace (pp. 389–400). Abingdon, Oxon: Routledge.
Lu, P.-Y., & Corbett, J. (2012). English in medical education: An intercultural approach to teaching
language and values. Bristol: Multilingual Matters.
McAll, C. (2003). Language dynamics in the bi- and multilingual workplace. In R. Bayley & S. R.
Schecter (Eds.), Language socialization in bilingual and multilingual societies (pp. 235–250).
Clevedon: Multilingual Matters.
Martinez-Flor, A., & Usó-Juan, E. (Eds.) (2010). Speech act performance: Theoretical, empirical and
methodological issues. Amsterdam: John Benjamins.
Newton, J. (2004). Face-threatening talk on the factory floor: Using authentic workplace interactions in
language teaching. Prospect, 19(1), 47–64.
Newton, J., & Kusmierczyk, E. (2011). Teaching second languages for the workplace. Annual Review of
Applied Linguistics, 31, 1–19.
Norquest (2010). Common ground. English in the workplace. A how-to guide for employers. Edmonton,
Canada: Norquest College.
Paltridge, B., & Starfield, S. (2012). The handbook of English for specific purposes. Malden, MA: Wiley-
Blackwell Publishing.
Pujiastuti, A. (2017). Language socialization in the workplace: Immigrant workers’ language practice
within a multilingual workplace. Unpublished dissertation, Graduate School of The Ohio State
University.
Roberts, C. (2010). Language socialization in the workplace. Annual Review of Applied Linguistics, 30,
211–227.
Roberts, C. (2011). Gatekeeping encounters in employment interviews. In S. Sarangi & C. Candlin
(Eds.), Handbook of communication in organisations and professions (pp. 407–432). Berlin: De
Gruyter.
Riddiford, N., & Joe, A. (2010). Tracking the development of sociopragmatic skills. TESOL Quarterly,
44(1), 195–205.
Riddiford, N., & Newton, J. (2010). Workplace talk in context: An ESOL resource. Wellington, New
Zealand: School of Linguistics and Applied Language Studies, Victoria University of Wellington.
Sandwall, K. (2010). ‘I Learn More at School’: A critical perspective on workplace-related second
language learning in and out of school. TESOL Quarterly, 44, 542–574.
Schnurr, S. (2013). Exploring professional communication: Language in action. London: Routledge.
Spencer-Oatey, H. (2008). Introduction. In H. Spencer-Oatey (Ed.), Culturally speaking: Culture,
communication and politeness theory (pp. 1–8). London, UK: Continuum.
Spencer-Oatey, H. (2000). Rapport management: A framework for analysis. In H. Spencer-Oatey (Ed.),
Culturally speaking. Managing rapport through talk across cultures (pp. 11–46). New York, NY:
Continuum.
Spencer-Oatey, H., & Jiang, W. (2003). Explaining cross-cultural pragmatic findings: Moving from
politeness maxims to sociopragmatic interactional principles (SIPs). Journal of Pragmatics,
35(10–11), 1633–1650.
Strömmer, M. (2016). Affordances and constraints: Second language learning in cleaning work.
Multilingua, 35(6), 697–721.
Tatsuki, D. H., & Houck, N. (2010). Pragmatics: Teaching speech acts. Alexandria, VA: TESOL.
Timmis, I. (2005). Towards a framework for teaching spoken grammar. ELT Journal, 59(2), 117–125.
Timpe-Laughlin, V. (2019). Pragmatics learning in the workplace. In N. Taguchi (Ed.), The Routledge
handbook of second language acquisition and pragmatics (pp. 413–428). New York, NY: Routledge.
Vertovec, S. (2007). Superdiversity and its implications. Ethnic and Racial Studies, 30, 1024–1054.
Vine, B. (Ed.). (2018). The Routledge handbook of language in the workplace. Abingdon, Oxon:
Routledge.
Vine, B. (2020). Introducing language in the workplace. Cambridge, UK: Cambridge University Press.
Vine, B. (2004). Getting things done at work. Amsterdam: John Benjamins.
Wenger, E. (1998). Communities of practice: Learning, meaning and identity. Cambridge, UK: Cambridge
University Press.
Warriner, D. S. (2010). Competent performances of situated identities: Adult learners of English.
Teaching and Teacher Education, 26, 22–30.
Work Talk. Accessed 1.08.20 at https://worktalk.immigration.govt.nz.
Yates, L. (2004). The ‘secret rules of language’: Tackling pragmatics in the classroom. Prospect:
Journal of Australian TESOL, 19(1), 3–21.
Yates, L. (2018). Migrants at work: Language learning on-the-job. In B. Vine (Ed.), The Routledge handbook of
language in the workplace (pp. 425–435). Abingdon, Oxon: Routledge.
Yates, L. (2010). Dinkas downunder: Request performance in simulated workplace interaction. In G.
Kasper, H. Nguyen, D. Yoshimi & J. Yoshika (Eds.), Pragmatics and language learning 12
(pp. 113–140). University of Hawaii: National Foreign Language Resource Center.
Yates, L. (2011). Interaction, language learning and social inclusion in early settlement. International
Journal of Bilingual Education and Bilingualism, 14(4), 457–471.
Yates, L., Dahm, M., Roger, P., & Cartmill, J. (2016). Rapport and teamwork in Australia: Insights for
international medical graduates. English for Specific Purposes, 42, 104–116.
Yates, L., & Dahm, M. (2016). Doing patient-centred consultations: Some challenges for IMGs. In S.
White & J. Cartmill (Eds.), Communication in Surgical Practice (pp. 35–67). Sheffield, UK: Equinox.
Yates, L., & Major, G. (2015). “Quick-chatting”, “smart dogs”, and how to “say without saying”:
Small talk and pragmatic learning in the community. System Journal, 48, 141–152.
Yates, L., & Springall, J. (2010). Soften up!: Successful requests in the workplace. In D. Tatsuki & N.
Houck (Eds.), Pragmatics from research to practice: Teaching speech acts (pp. 67–86). Alexandria,
VA: TESOL.
26
THE RELATIONSHIP BETWEEN
L2 SPEECH PERCEPTION AND
PRODUCTION
Ron I. Thomson
1 Introduction/Definitions
On hearing someone speaking with a familiar foreign accent, we can often identify the
speaker’s first language (L1) background. This perceptual ability is rarely accompanied by an
equal capacity to perfectly imitate the same accent. This disconnect could signal that speech
perception and production are two independent skills. Alternatively, the two skills may be
connected, but precision in production may lag behind accuracy in perception. In some
unusual cases, the ability to produce foreign language sounds may precede a speaker’s ability
to perceive them. This talent could indicate that learning to produce second language (L2)
sounds can be facilitated by a strategy of applying explicit knowledge in controlled contexts,
something that is not possible in infant L1 learning.
Scholars with an interest in speech perception and production have traditionally treated
these as separate processes. Moreover, researchers typically focus on one or the other, not
both. This division arose from early evidence that aphasia (neurological language impairment)
often impacts perception (Wernicke’s aphasia) and production (Broca’s aphasia) separately,
depending on the location of brain injury (Lichtheim, 1885). Psycholinguistic
research also largely treats speech perception and production as separate systems. Perception
is seen as comprising a number of complementary, non-linear processes (e.g., McClelland &
Elman, 1986), while production is characterized as largely linear in nature (e.g., Levelt,
1999). Given these different orientations, it is impossible to simply apply perceptual processes
in reverse to arrive at an explanation for speech production, or vice versa. See de Bot
and Bátyi (this volume) for a discussion of L1 and L2 speech models, and Simard (this
volume) for descriptions of the psycholinguistic processes involved in L2 speech production.
One consequence of placing speech perception and production in separate scholarly silos
is that limited attention has been given to potential relationships between subsystems involved
in each larger process. For example, both speech perception and speech production
feature subsystems which process meaning, words, and individual sounds. It is not unreasonable
to ask whether subsystems within the larger perception and production processes
may interact across modalities. To the extent that they do, changes in perception could lead
to changes in production, and vice versa. The focus of this chapter is on this narrow sense of
speech perception and production. Henceforth, speech perception refers to the ability of
listeners to decode phonetic input and recognize individual segments (speech sounds) as
intended by speakers; speech production refers to how a speaker generates segments in spoken
utterances. While speech perception and production also include prosodic patterns, these are
the focus of Mok (this volume) and are not discussed here.
I begin with a brief summary of relevant findings from typical L1 speech development
research, a basic understanding of which is essential if one assumes that L2 speech learning
relies on the same mechanisms underlying L1 acquisition. I then discuss L2 speech literature
to provide evidence supporting a partial alignment of L2 speech perception and production.
The chapter concludes with implications for L2 pronunciation teaching practices and future
directions.
2 Historical Perspectives
adults read printed words before and after repeated exposure to auditory tokens of those
same words, independent judges indicated that the read words sounded more like imitations
of the auditory prompts after exposure. A related ability is apparent in naturalistic speaking
contexts. The phonetic realizations of words spoken by pairs of interlocutors begin to
converge over the course of a single conversation (Pardo, 2006). This suggests that interlocutors
notice phonetic information in their conversation partner’s speech, and modify their
own productions, in real-time, to more closely approximate their interlocutors’
pronunciation.
used in L1 speech learning, only that explicit instruction can alter normal progression. These
issues will be discussed in greater detail later in the chapter.
Measuring Perception
Task Type
Behavioural tasks have long been the standard in L2 speech perception research. One
technique is the Forced Choice Identification (FCID) task (e.g., Carlet & de Souza, 2018;
Schmitz et al., 2018). In this task, L2 learners listen to target stimuli and indicate which
sound category they perceive, from a fixed set of possibilities. Responses are usually captured
via computer, using a mouse or button click. The number of possible responses depends on
how many target sounds are involved. For example, Bradlow et al. (1997) investigated categorization
of two sounds, English /l/-/r/, while Thomson (2011) investigated categorization
of ten English vowels. How response options are represented also varies. For example,
Thomson and Derwing (2016) used phonetic symbols, while Baker and Trofimovich (2006)
used keywords written in standard orthography containing the target sounds. To ameliorate
the potential activation of faulty learner representations associated with previous experience
with orthography, Thomson (2011) used images of ten distinct nautical flags, which learners
first learned to associate with target vowels.
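The scoring of an FCID task as described above can be illustrated with a minimal sketch. It assumes that each trial is logged as a (target, response) pair of category labels; the function name and the example responses are hypothetical and are not drawn from any of the cited studies.

```python
from collections import Counter


def score_fcid(trials):
    """Score forced choice identification trials.

    trials: list of (target, response) category labels, one per trial.
    Returns the overall proportion correct and a confusion count
    keyed by (target, response).
    """
    if not trials:
        raise ValueError("no trials to score")
    confusions = Counter(trials)
    correct = sum(n for (target, response), n in confusions.items()
                  if target == response)
    return correct / len(trials), confusions


# Hypothetical responses for a two-category /l/-/r/ task.
trials = [("l", "l"), ("l", "r"), ("r", "r"), ("r", "r")]
accuracy, confusions = score_fcid(trials)
```

The same function applies unchanged to a larger response set, such as ten vowel categories, because the confusion counts are keyed by whatever labels the task uses.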
Another technique often used to measure L2 speech perception is the discrimination task
(e.g., Baese-Berk, 2019). These tasks do not assess whether a listener can accurately perceive
a member of a particular category, but instead test whether they are able to tell two categories
apart. In an AXB design, the A and B comparators represent two contrastive categories
and the listener must indicate whether the target X prompt is more similar to the A or
B prompt. The related AX task asks whether the A and X are the same or different. Instead
of AXB and AX tasks, some researchers use oddity discrimination tasks. In these tasks,
listeners hear a sequence of three productions and then indicate which, if any, is different
from the others (e.g., Flege & MacKay, 2004). One advantage of using discrimination tasks
over FCID tasks is that they do not require listeners to learn symbols against which to
associate L2 sounds, and they avoid potentially negative effects of orthography.
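One possible way to assemble and score AXB trials is sketched below. The token file names, the dictionary layout of a trial, and the choice to draw X from a different token of the matching category are all illustrative assumptions rather than a description of any cited study's procedure.

```python
import random


def make_axb_trials(tokens_a, tokens_b, rng):
    """Build AXB trials in which X belongs to the category of A or of B."""
    trials = []
    for a, b in zip(tokens_a, tokens_b):
        for answer in ("A", "B"):
            pool = tokens_a if answer == "A" else tokens_b
            # Prefer an X token that is not physically identical to A or B,
            # falling back to the full pool if no other token exists.
            candidates = [t for t in pool if t not in (a, b)] or pool
            trials.append({"A": a, "X": rng.choice(candidates),
                           "B": b, "answer": answer})
    rng.shuffle(trials)
    return trials


def score_axb(trials, responses):
    """Proportion of trials on which the listener chose the matching comparator."""
    hits = sum(t["answer"] == r for t, r in zip(trials, responses))
    return hits / len(trials)


# Hypothetical recordings of two contrastive categories.
rng = random.Random(0)
axb_trials = make_axb_trials(["l1.wav", "l2.wav"], ["r1.wav", "r2.wav"], rng)
```

Because listeners respond only "A" or "B", no category labels or orthographic symbols need to be learned before the task begins.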
While FCID and discrimination tasks are both measures of speech perception, Wayland’s
(2007) comparison of FCID and AXB data revealed that FCID scores do not predict discrimination
patterns well. On its face, an FCID task may seem to be more in line with what listeners
do in the real world, which is to recognize rather than discriminate sounds they encounter.
While not widely utilized, another approach to measuring the perception of speech sounds
is phoneme monitoring (Hanulíková et al., 2012). This requires listeners to listen to words
and indicate whether a specific phoneme is present. Alternatively, listeners can be asked to
indicate whether words, presented either in isolation or in a sentential context, are pronounced
with the target phoneme.
Although behavioural tasks remain the standard for adult L2 speech perception research,
neurophysiological methods are also possible (Schmitz et al., 2018). For example, event-related
potentials (ERP) can assess changes in brain activity when a listener encounters a
deviant stimulus. In this technique, electrodes placed on the surface of a listener’s scalp
provide a direct measure of auditory discrimination ability. A listener is presented with repeated
instances of a familiar vowel category interrupted at some point by a token representing
a different vowel category. An ability to discriminate the deviant token is
reflected by a sudden shift in neurological activity.
Stimulus Characteristics
In addition to task type, stimulus characteristics may influence results obtained from L2
speech experiments. For example, FCID and discrimination tasks have presented isolated
vowels (Schmitz et al., 2018), vowels in open CV syllables (Thomson, 2011), sounds in
nonsense words (Carlet & de Souza, 2018), and sounds in real words (Baker & Trofimovich,
2006). Each context has a different impact on how the target sounds are perceived.
Researchers who use real words have found evidence that lexical frequency and phonetic
context influence the perception of sound categories (Thomson & Derwing, 2016; Thomson
& Isaacs, 2009). Presentation of sounds in isolation, in syllables, or in nonsense words is a
purer test of phonetic knowledge, though such contexts are more remote from real-world
experiences.
Surrounding phonetic context can also distort perception. When Wayland (2007) obtained
different results from the same listeners using FCID versus discrimination tasks, she
suspected that the A and B comparators presented before and after the target X stimulus
affected how the X stimulus was perceived. This cannot occur in a typical FCID task,
where target stimuli are presented in isolation from other surrounding stimuli. Wayland
(2007) later modified the FCID task such that the tokens to be identified were presented
within the same AXB frames used in the discrimination task. This led to far more comparable
results across the two task types.
Another stimulus characteristic that may influence results of speech perception experiments
is whether the speech is synthetic (generated by a computer) or drawn from natural
speech recordings (e.g., Borden et al., 1983; Schmitz et al., 2018). Synthetic speech is often
used in experiments where the researcher wants to precisely measure listener responses to
vowels and consonants along a continuum representing ambiguous to less ambiguous instances
of contrasting sounds. While this may afford greater control over variability, it is
unclear to what degree synthetic speech tokens reflect the natural variability that learners
experience in the real world.
The number of talkers (voices) used in speech perception tasks is also important. Results
based on responses to a single talker (e.g., Hanulíková et al., 2012) are unlikely to adequately
capture listeners’ perceptual ability in the real world, where perception differs depending on
who is talking. In contrast, the use of multiple talkers (e.g., Baker & Trofimovich, 2006;
Carlet & de Souza, 2018; Thomson, 2011) allows researchers to obtain mean perception
scores that average out listener responses to speech produced by individual talkers.
Finally, speech perception research varies in how many replays are allowed before a
perceptual response is expected from the listener. For example, Thomson (2011) allowed no
replays, while Baker and Trofimovich (2006) allowed multiple replays. In the latter case, the
researchers maintained that allowing multiple listens meant that the final response was not
based on a guess, but upon phonological processing. The use of replays does not reflect the
ephemeral nature of speech perception in the real world, and so may not lead to valid results.
Measuring Production
Unlike perception tasks, which capture listeners’ auditory responses to target stimuli in a
single step, production tasks require two steps. First, there is the speaking task itself, during
which productions are elicited and recorded. The second step requires instrumental or
human evaluation of those recordings to measure accuracy.
Speaking Tasks
A wide variety of speaking tasks are used to elicit L2 speech production (see Nagle et al., this
volume). As most research is focused on evaluating the production of specific sound categories,
controlled production tasks are the norm. While these may not reflect L2 learners’ spontaneous
speech production, they offer a measure of a learner’s knowledge in optimal conditions. Read
speech is frequently used in L2 pronunciation assessment (Thomson & Derwing, 2015), but it is
not as common in laboratory research. Reference to orthography can have both negative and
positive effects on production. For example, a word might be mispronounced if the orthographic
representation is opaque. Alternatively, knowledge of spelling can allow L2 learners to
apply explicit knowledge to the production of difficult sounds (Thomson & Isaacs, 2009).
In laboratory research, elicited repetition of speech using auditory prompts is common,
and comes in two basic forms, either immediate or delayed. Immediate repetition (e.g.,
Hanulíková et al., 2012; Kabakoff et al., 2020) might be better characterized as a measure of
short-term phonological working memory (WM). Immediate imitations of prompts may
reflect properties of the stimulus, rather than a speaker’s typical ability to produce the same
sounds (Shockley et al., 2004). While WM is an important mechanism in speech learning, and
influences long-term phonological representations, it may not be indicative of the present
state of an L2 learner’s phonological system. To overcome the limitations of immediate
repetition, many L2 researchers use delayed and often interrupted repetition tasks. Auditory
prompts are embedded in a carrier phrase (e.g., “The next word is ____”) and learners respond
by producing the target word in a different carrier phrase (e.g., “Now I say ____”) (see
Munro & Derwing, 2008; Thomson, 2011). Flege et al. (2003) added an additional layer of
complexity by presenting a target word in the middle of a three-word sequence, which was
then embedded in a carrier phrase. Embedding target words in carrier phrases interrupts the
listener’s ability to store a prompt’s phonetic properties in WM, and is believed to activate
their long-term phonological representation.
Whether repetitions of auditory prompts are immediate or delayed, they do not necessarily
indicate production ability in isolation, since they are mediated by potentially inaccurate
perception of the prompts. In both cases, accurate production requires accurate
perception, but inaccurate production does not necessarily entail inaccurate perception, since
articulatory control is also required in production.
Hybrid approaches to speech elicitation, where speakers produce speech after seeing
written targets as well as hearing auditory models, have also been used (e.g., Bradlow et al.,
1997; Flege et al., 1999; Hanulíková et al., 2012). The combination of the two has been
shown to lead to more accurate productions than either on its own (Thomson & Isaacs,
2009), suggesting that learners are able to apply explicit knowledge during reading, while also
being alerted to any mismatch between the written form and the target pronunciation via the
auditory model. As such, this technique may not reflect the procedural knowledge that is
typically activated in communicative language use.
Though less frequently used, picture-naming is a more ecologically valid L2 speech elicitation
task, because it can isolate production from perception, and resulting productions are
likely to be closer to spontaneous speech. Picture-naming is not without its own limitations,
however. For example, researchers can use pictures of minimal pairs such as “lock” versus
“rock” to elicit /l/ and /r/ productions, but cannot do so as transparently across all
phonetic contexts, for example, “laughed” versus “raft.” The latter are more difficult to represent
using images, and “raft” may be an unknown word for many L2 speakers. When the
target includes a larger number of contrasting sound categories (e.g., ten vowels) pictures
representing every vowel in the same phonetic context are not available (Schmitz et al., 2018;
Thomson & Derwing, 2016). Furthermore, being constrained to words that can be pictured, researchers
may need to include words that are not well known to learners (e.g., Baker &
Trofimovich, 2006; Schmitz et al., 2018). Such research has attempted to overcome this obstacle
by including a familiarization phase to ensure learners actually know the target words
and how they are pronounced.
Stimulus Characteristics
Stimulus characteristics also determine which task type should be used to elicit L2 production.
In L2 English, for example, isolated vowels or nonsense words cannot be elicited using reading
tasks, unless the subjects are familiar with a phonetic alphabet. Picture-naming is not possible
unless participants are first taught to associate pictures with nonsense words. This helps explain why
studies examining productions at the phonetic level almost always use elicited repetition tasks.
Evaluation of Production
Once recordings of L2 speech are obtained, they must be evaluated. This is done by human
judges or through acoustic analysis of the speech recordings.
When the target of analysis is individual phonemes, listeners are presented with randomized
recordings of L2 speech tokens using a FCID task (e.g., Thomson, 2011; Thomson & Derwing,
2016) to indicate which category they perceived. If the goal is to measure change in L2 speech
production over time, paired-comparison tasks are sometimes used. Judges listen to pairs of
pre-/post-test recordings, randomized within pairs, and report which is the better production of the
target (Bradlow et al., 1997). To account for variation across listeners’ identification of L2
speech, it is common to average scores across multiple judges. In other studies, expertly trained
phoneticians transcribe recordings, which are then validated by additional experts (McAndrews
& Thomson, 2017). To measure within-category differences, scalar ratings of “goodness-of-fit”
to a native speaker model are sometimes employed (Flege et al., 2003; Hanulíková et al., 2012).
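The two judge-based measures described above, identification averaged across judges and paired-comparison preference, might be computed as in the following sketch. The data layout, function names, and example judgements are hypothetical and serve only to make the arithmetic concrete.

```python
from statistics import mean


def mean_identification(judgements, target):
    """Proportion of tokens heard as `target`, averaged across judges.

    judgements: dict mapping a judge ID to the list of categories that
    judge perceived, one label per L2 token.
    """
    per_judge = [sum(label == target for label in labels) / len(labels)
                 for labels in judgements.values()]
    return mean(per_judge)


def post_test_preference(choices):
    """Proportion of pre-/post-test pairs in which the post-test token was preferred."""
    return sum(choice == "post" for choice in choices) / len(choices)


# Hypothetical judgements of four learner attempts at English /i/.
judgements = {
    "judge1": ["i", "i", "ɪ", "i"],
    "judge2": ["i", "ɪ", "ɪ", "i"],
}
identification = mean_identification(judgements, "i")
preference = post_test_preference(["post", "post", "pre", "post"])
```

Averaging within each judge first, and then across judges, keeps any single listener's perceptual bias from dominating the score.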
Acoustic Analysis
An alternative to human judgements for assessing L2 speech production is the measurement of
acoustic features known to be correlated with the perception of specific sound categories. For
example, frequency and durational components of L2 vowels and consonants can be compared
to the same information extracted from recordings of native speakers producing the
same sounds, or to recordings of sounds produced in the learners’ L1, to determine whether
L2 productions reflect the target language or L1 transfer (Thomson et al., 2009).
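A comparison of this kind can be sketched as a simple distance calculation. The example assumes that each vowel is summarized by its first two formant frequencies (F1, F2) in Hz and that reference means for the target category and the nearest L1 category are already available. The formant values are invented for illustration, and the use of raw, unnormalized Hz is a simplifying assumption.

```python
import math


def closer_reference(token, l2_target_mean, l1_mean):
    """Report whether a learner's vowel lies nearer the L2 target or the L1 category.

    Each argument is an (F1, F2) pair in Hz. Returns the label of the
    nearer reference together with both Euclidean distances.
    """
    d_target = math.dist(token, l2_target_mean)
    d_l1 = math.dist(token, l1_mean)
    label = "target" if d_target <= d_l1 else "L1"
    return label, d_target, d_l1


# Hypothetical formant values (Hz) for one learner token and two reference means.
label, d_target, d_l1 = closer_reference((300, 2200), (310, 2250), (350, 2000))
```

A durational measure could be added as a further dimension, although durations and frequencies would then need to be placed on comparable scales before distances are computed.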
however, it can be rather imprecise, because listeners bring their own perceptual distortions
to the task. Just as L2 learners fail to perceive and accurately classify target sounds due to the
influence of their L1 categories, native listeners may be unable to detect differences in
L2-accented productions (Flege et al., 1997).
judge degree of accent in the L2 productions is unusual, because it is not analogous to the
FCID task used by learners. Had they used judgements of intelligibility instead, the relationship
might have been stronger.
Some rare counter-examples provide evidence that L2 production can precede L2 perception.
Sheldon and Strange (1982) examined the pronunciation of the /l/-/r/ contrast by
advanced Japanese L2 English learners living in the United States and found that some were
able to accurately produce the contrast, despite being unable to discriminate the difference.
Notably, not only were they unable to recognize the difference in productions of native
speakers, but they also failed to discriminate it in recordings of their own accurate productions.
Bradlow et al. (1997) and Borden et al. (1983) reported similar patterns for /l/-/r/
perception and production by Japanese and Korean L2 English learners, respectively. While
such results are interesting, they are not surprising. The production data in these studies were
elicited using a reading task, suggesting that participants relied on explicit knowledge about
production of words spelled with “r” and “l.” As such, these results cannot reasonably be taken
to disprove that perception precedes production in normal circumstances.
Lee and Lyster (2017) demonstrated that the impact of perceptual training on production
is mediated by feedback type. Specifically, learners must not only be told when they misclassify
a training prompt, but must also re-hear the correct category, or hear an example
of the incorrect category chosen in error. Only with one of these two types of auditory
feedback did perceptual training in Lee and Lyster’s study transfer to production. Curiously,
providing replays of both correct and incorrect exemplars together interferes with transfer.
The perceptual training studies discussed thus far indicate that, given optimal learning
conditions, L2 production is sensitive to and benefits from changes in L2 speech perception.
While a few studies have failed to find a transfer of perceptual training to production,
methodological issues are the likely explanation. For example, Carlet and de Souza
(2018) trained Brazilians to perceive L2 English vowels presented in real words. As Thomson
and Derwing (2016) argue, when training stimuli are not minimal pairs, or nonsense words,
learners can apply explicit knowledge to decide what sounds are supposed to be in words,
despite not being able to accurately perceive them. In another study, Sakai (2016) found no
impact of perceptual training on the production of the English /i/-/ɪ/ contrast by Spanish L1
speakers. In her study, perceptual training utilized synthetically idealized exemplars of target
categories, which may have allowed learners to discriminate the contrast using acoustic cues
which would not prove useful in production.
A few studies have examined the impact of production training on perception. Liakin
et al. (2013) used automatic speech recognition (ASR) technology to provide feedback to
English learners on their productions of an L2 French vowel contrast. Training resulted in
improvement in production scores, as judged by NSs of French, but with no concomitant
improvement in perception. Herd et al. (2013) trained a group of Spanish L2 learners to
produce the Spanish intervocalic /d, ɾ, r/ contrasts, by having them compare waveforms and
spectrograms of their own productions to waveforms and spectrograms of NS productions of
the same sounds. Notably, learners did not hear recordings of the NS productions that they
were trying to match. Training resulted in improvements in both production and perception.
In both studies, production training was not entirely independent of perceptual input, since
the learners could hear their own productions.
Sakai (2016) isolated production from the influence of learners’ own perceptual input by
mapping spectral information from L1-Spanish productions of English /i/-/ɪ/ onto a two-
dimensional vowel space, analogous to the IPA vowel chart. Learners produced the target
vowels, while attempting to have them mapped to the appropriate part of the space. One
training group monitored their own productions, while another group wore noise cancelling
headphones, which made it impossible to hear their own productions. The group that
monitored their own productions significantly improved in their production of the L2 contrast,
while the group who could not hear themselves did not. Both production groups
significantly improved in their ability to perceive the contrast. This study provides compelling
evidence for a bi-directional connection between production and perception
mechanisms.
not improve on their own. For example, improvement in the perception of L2 vowels seems
to trigger improvement in production in most, but not all cases (Thomson, 2008; 2011). In
contrast, many English learners have difficulty producing a Spanish trill, despite being able
to perceive it (Herd et al., 2013). In cases where perceptual learning is ineffective, working on
production may be a good strategy, and one which may benefit perception (Sakai, 2016).
For perceptual training to work effectively, training prompts should incorporate variability
in terms of the number of talkers who provide training stimuli, as well as the number of
contexts in which target sounds are presented (Thomson, 2018). Training should also provide
auditory corrective feedback on errors, by either replaying prompts learners get wrong, or
playing an example of the incorrect choice (Lee & Lyster, 2017).
7 Future Directions
Given the many unanswered questions regarding the nature of L2 speech perception and production,
the available avenues for future work are numerous. Highlighted here are some of
the most important concerns. First, researchers should identify the most valid research
methodologies. Without precise and comparable measures, it is difficult to draw accurate
conclusions about how the two skills are connected. Future studies using neurophysiological
measures, such as event-related potentials (ERP) or functional MRI, can offer important
insights, in addition to confirming the validity of commonly used behavioural measures.
Second, the field would benefit from longitudinal research. In most L2 studies, measures of
perception and production are taken from a single time-point, or immediately before and
after a short intervention. Nagle (2018, 2020) provides a compelling example of what
longitudinal research can reveal. He found evidence of a time-lag between when L1 English
speakers learned to accurately perceive an L2 Spanish stop contrast and when that learning
emerged in production. He also found that the relationship varied by stop category. Third,
very little is known about how social factors mediate the development of, and relationship
between, L2 speech perception and production. The fact that the speech of adult interlocutors
phonetically converges over the course of a conversation (Pardo, 2006; Lewandowski &
Jilka, 2019) suggests that basic imitative mechanisms remain intact over the lifespan. What is
unclear is how this ability might be used to improve long-term phonological representations.
The shadowing technique (Foote & McDonough, 2017), in which listeners imitate a recorded
stretch of speech as quickly as possible after hearing each word, may be tapping into this
ability. Fourth, researchers should examine which types of training are most effective. While
evidence suggests that improvement in perception induces improvement in production, and
vice versa, there is also evidence that targeting both skills simultaneously is deleterious to
learning (Baese-Berk, 2019; Herd et al., 2013). Finally, most research is limited to examining
the impact of training on individual segments in isolated words. The extent to which L2
speech training leads to improvements in production beyond the level of segments should be
researched.
Further Reading
Nagle, C. L. (2020). Revisiting perception–production relationships: Exploring a new approach to
investigate perception as a time‐varying predictor. Language Learning. doi: 10.1111/lang.12431.
An application of mixed-effects modeling to longitudinally examine the relationship between change in
L2 speech perception and its time-lagged impact on production.
Sakai, M., & Moorman, C. (2018). Can perception training improve the production of second language
phonemes? A meta-analytic review of 25 years of perception training research. Applied
Psycholinguistics, 39, 187–224.
A meta-analysis of eighteen studies in which researchers used perceptual training to effect change in L2
speech production.
Trofimovich, P., & Foote, J. A. (2017). Second language pronunciation learning: An overview of
theoretical perspectives. In The Routledge handbook of contemporary English pronunciation
(pp. 93–108). Philadelphia: Routledge.
A detailed overview of major theories of second language speech learning, including linguistic,
psychological and sociocultural perspectives.
References
Baese-Berk, M. M. (2019). Interactions between speech perception and production during learning of
novel phonemic categories. Attention, Perception, & Psychophysics, 81(4), 981–1005.
Baker Smemoe, W., & Haslam, N. (2013). The effect of language learning aptitude, strategy use and
learning context on L2 pronunciation learning. Applied Linguistics, 34(4), 435–456.
Baker, W., & Trofimovich, P. (2006). Perceptual paths to accurate production of L2 vowels: The role of
individual differences. International Review of Applied Linguistics in Language Teaching, 44(3),
231–250.
Borden, G., Gerber, A., & Milsark, G. (1983). Production and perception of the /r/-/l/ contrast in
Korean adults learning English. Language Learning, 33, 499–526.
Bradlow, A. R., Pisoni, D. B., Akahane-Yamada, R., & Tohkura, Y. (1997). Training Japanese listeners
to identify English /r/ and /l/: IV. Some effects of perceptual learning on speech production.
Journal of the Acoustical Society of America, 101, 2299–2310.
Carlet, A., & de Souza, H. K. D. (2018). Improving L2 pronunciation inside and outside the classroom:
Perception, production and autonomous learning of L2 vowels. Ilha do Desterro, 71(3), 99–123.
Derwing, T. M., & Munro, M. J. (2015). Pronunciation fundamentals: Evidence-based perspectives for L2
teaching and research. Amsterdam: John Benjamins.
Fadiga, L., Craighero, L., Buccino, G., & Rizzolatti, G. (2002). Speech listening specifically modulates
the excitability of tongue muscles: A TMS study. European Journal of Neuroscience, 15, 399–402.
Flege, J. E. (1995). Second language speech learning: Theory, findings, problems. In W. Strange (Ed.),
Speech perception and linguistic experience: Issues in cross-language research (pp. 233–277).
Timonium, MD: York Press.
Flege, J. E., & MacKay, I. R. (2004). Perceiving vowels in a second language. Studies in Second
Language Acquisition, 26(1), 1–34.
Flege, J. E., Bohn, O.-S., & Jang, S. (1997). Effects of experience on non-native speakers’ production
and perception of English vowels. Journal of Phonetics, 25, 437–470.
Flege, J. E., MacKay, I. R. A., & Meador, D. (1999). Native Italian speakers’ perception and production
of English vowels. Journal of the Acoustical Society of America, 106, 2973–2987.
Flege, J. E., Munro, M. J., & MacKay, I. R. (1995). Factors affecting strength of perceived foreign
accent in a second language. The Journal of the Acoustical Society of America, 97(5), 3125–3134.
Flege, J. E., Schirru, C., & MacKay, I. R. (2003). Interaction between the native and second language
phonetic subsystems. Speech Communication, 40(4), 467–491.
Foote, J. A., & McDonough, K. (2017). Using shadowing with mobile technology to improve L2
pronunciation. Journal of Second Language Pronunciation, 3(1), 34–56.
Gass, S. (1984). Development of speech perception and speech production abilities in adult second
language learners. Applied Psycholinguistics, 5, 51–74.
Goldinger, S. D., & Azuma, T. (2004). Episodic memory reflected in printed word naming.
Psychonomic Bulletin & Review, 11, 716–722.
Grenon, I., Benner, A., & Esling, J. H. (2007). Language-specific phonetic production patterns in the
first year of life. Proceedings of the 16th International Congress of Phonetic Sciences, 3, 1561–1564.
Hanulíková, A., Dediu, D., Fang, Z., Bašnaková, J., & Huettig, F. (2012). Individual differences in the
acquisition of a complex L2 phonology: A training study. Language Learning, 62, 79–109.
Hattori, K., & Iverson, P. (2010). Examination of the relationship between L2 perception and
production: An investigation of English /r/-/l/ perception and production by adult Japanese speakers.
Paper presented at the Interspeech Workshop on Second Language Studies: Acquisition, Learning,
Education and Technology, Waseda University.
Herd, W., Jongman, A., & Sereno, J. (2013). Perceptual and production training of intervocalic /d, ɾ,
r/ in American English learners of Spanish. The Journal of the Acoustical Society of America, 133(6),
4247–4255.
Kabakoff, H., Go, G., & Levi, S. V. (2020). Training a non-native vowel contrast with a distributional
learning paradigm results in improved perception and production. Journal of Phonetics, 78.
doi: 10.1016/j.wocn.2019.100940.
Kingston, J., & Diehl, R. L. (1994). Phonetic knowledge. Language, 70(3), 419–454.
Kosky, C., & Boothroyd, A. (2003). Perception and production of sibilants by children with hearing
loss: A training study. The Volta Review, 103(2), 71–98.
Kuhl, P. K. (2004). Early language acquisition: Cracking the speech code. Nature Reviews Neuroscience,
5(11), 831–843.
Lee, A. H., & Lyster, R. (2017). Can corrective feedback on second language speech perception errors
affect production accuracy? Applied Psycholinguistics, 38(2), 371–393.
Lee, G. Y., & Kisilevsky, B. S. (2014). Fetuses respond to father’s voice but prefer mother’s voice after
birth. Developmental Psychobiology, 56(1), 1–11.
Levelt, W. J. (1999). Models of word production. Trends in Cognitive Sciences, 3(6), 223–232.
Lewandowski, N., & Jilka, M. (2019). Phonetic convergence, language talent, personality and attention.
Frontiers in Communication, 4. doi: 10.3389/fcomm.2019.00018.
Liakin, D., Cardoso, W., & Liakina, N. (2013). Mobile speech recognition software: A tool for
teaching second language pronunciation. OLBI Journal, 5.
Liberman, A. M., & Mattingly, I. G. (1985). The motor theory of speech perception revised. Cognition,
21, 1–36.
Lichtheim, L. (1885). On aphasia. Brain, 7, 433–484.
McAndrews, M. M., & Thomson, R. I. (2017). Establishing an empirical basis for priorities in
pronunciation teaching. Journal of Second Language Pronunciation, 3(2), 267–287.
McClelland, J., & Elman, J. (1986). The TRACE model of speech perception. Cognitive Psychology,
18, 1–86.
Munro, M. J., & Derwing, T. M. (2008). Segmental acquisition in adult ESL learners: A longitudinal
study of vowel production. Language Learning, 58, 479–502.
Nagle, C. L. (2018). Examining the temporal structure of the perception–production link in second
language acquisition: A longitudinal study. Language Learning, 68, 234–270.
Nagle, C. L. (2020). Revisiting perception–production relationships: Exploring a new approach to
investigate perception as a time‐varying predictor. Language Learning. doi: 10.1111/lang.12431.
Pardo, J. S. (2006). On phonetic convergence during conversational interaction. Journal of the
Acoustical Society of America, 119, 2382–2393. doi: 10.1121/1.2178720
Saffran, J. R., Johnson, E. K., Aslin, R. N., & Newport, E. L. (1999). Statistical learning of tone
sequences by human infants and adults. Cognition, 70(1), 27–52.
Saito, K., & Plonsky, L. (2019). Effects of second language pronunciation teaching revisited: A
proposed measurement framework and meta‐analysis. Language Learning, 69(3), 652–708.
Sakai, M. (2016). (Dis)Connecting perception and production: Training native speakers of Spanish on the
English /i/-/I/ distinction (Unpublished doctoral dissertation). Georgetown University, Washington, DC.
Sakai, M., & Moorman, C. (2018). Can perception training improve the production of second language
phonemes? A meta-analytic review of 25 years of perception training research. Applied
Psycholinguistics, 39, 187–224.
Schmitz, J., Díaz, B., Fernandez Rubio, K., & Sebastian-Galles, N. (2018). Exploring the relationship
between speech perception and production across phonological processes, language familiarity, and
sensory modalities. Language, Cognition and Neuroscience, 33(5), 527–546.
Scovel, T. (1988). A time to speak. A psycholinguistic inquiry into the critical period for human speech.
Rowley, MA: Newbury House.
Sheldon, A., & Strange, W. (1982). The acquisition of /r/ and /l/ by Japanese learners of English:
Evidence that speech production can precede speech perception. Applied Psycholinguistics, 3,
243–261.
Shockley, K., Sabadini, L., & Fowler, C. A. (2004). Imitation in shadowing words. Perception &
Psychophysics, 66(3), 422–429.
Thomson, R. I. (2008). L2 English vowel learning by Mandarin speakers: Does perception precede
production? Canadian Acoustics, 36(3), 134–135.
Thomson, R. I. (2011). Computer assisted pronunciation training: Targeting second language vowel
perception improves pronunciation. CALICO Journal, 28(3), 744–765.
Thomson, R. I. (2018). High Variability [Pronunciation] Training (HVPT): A proven technique about
which every language teacher and learner ought to know. Journal of Second Language
Pronunciation, 4(2), 207–230.
Thomson, R. I., & Derwing, T. M. (2015). The effectiveness of L2 pronunciation instruction: A
narrative review. Applied Linguistics, 36(3), 326–344.
Thomson, R. I., & Derwing, T. M. (2016). Is phonemic training using nonsense or real words more
effective? In J. Levis, H. Le, I. Lucic, E. Simpson, & S. Vo (Eds.), Proceedings of the 7th
Pronunciation in Second Language Learning and Teaching conference (pp. 88–97). Ames, IA: Iowa
State University.
Thomson, R. I., & Isaacs, T. (2009). Within-category variation in L2 English vowel learning. Canadian
Acoustics, 37(3), 138–139.
Thomson, R. I., Nearey, T. M., & Derwing, T. M. (2009). A modified statistical pattern recognition
approach to measuring the crosslinguistic similarity of Mandarin and English vowels. The Journal of
the Acoustical Society of America, 126(3), 1447–1460.
Trofimovich, P., & Foote, J. A. (2017). Second language pronunciation learning: An overview of
theoretical perspectives. In The Routledge handbook of contemporary English pronunciation
(pp. 93–108). Philadelphia: Routledge.
Wayland, R. P. (2007). The relationship between identification and discrimination in cross-language
perception: The case of Korean and Thai. In O.-S. Bohn & M. J. Munro (Eds.), Second-language
speech learning: The role of language experience in speech perception and production: A festschrift in
honour of James E. Flege (pp. 201–218). Amsterdam: John Benjamins.
Werker, J. F., & Tees, R. C. (1984). Cross-language speech perception: Evidence for perceptual
reorganization during the first year of life. Infant Behavior and Development, 7(1), 49–63.
27
THE RELATIONSHIP BETWEEN
GESTURES AND SPEAKING IN L2
LEARNING
Marianne Gullberg
1 Introduction/Definitions
Speaking is a multimodal act involving many articulators – not only the mouth, but also the
hands, arms, head, eyebrows, etc. In other words, when we speak, we also gesture. Gestures,
defined as non-practical actions and visible bodily movements related to ongoing talk and
recognized as communicatively relevant by onlookers, are an integral part of the speech
production process. They are not an added communicative frill, but a fundamental aspect of
speaking. There is substantial evidence for the view that gestures are systematically and
closely linked to language in speech production (and in comprehension), the modalities
forming an integrated mode of expression that is subject to cross-linguistic, cognitive, social,
and cultural variation (Bavelas, 1994; Clark, 1996; Holler & Levinson, 2019; Kendon, 2004;
McNeill, 2017; Özyürek, 2017). The term multimodal is used throughout to refer to the use
of speech and language-related bodily visual behaviour, such as manual gestures and head
movements. “Multimodal” includes the term bimodal language use (speech + gesture) also
found in the literature.
The tight link between speaking and gesturing is seen in many ways. Gestures are
predominantly a speaker phenomenon. More importantly, speech and gestures express
semantically related, discursively and prosodically highlighted meaning at the same time with
millisecond precision (Kendon, 1980; Levy & McNeill, 1992; Loehr, 2007; McNeill, 1992).
This fine-grained coordination also means that gestures reflect cross-linguistic variation in
which semantic elements are expressed in speech, and how they are morphosyntactically
organized (see Kita, 2009, for an overview). The coordination is neurologically based: speech and
gestures engage similar brain regions and motor control systems in speech production and
comprehension (Gentilucci & Volta, 2008; Oi, Saito, Li, & Zhao, 2013; Özyürek, 2014).
Linguistically, gestures provide necessary referential content to deictic expressions (e.g., the
key is there; Fricke, 2014), and gestures can function as independent speech acts (e.g.,
pointing to a door as an imperative Get out). Finally, speech and gestures develop in parallel
in child language (Colletta et al., 2015; Gullberg et al., 2008; Iverson & Goldin-Meadow,
2005), and break down in parallel in stuttering (Mayberry & Jaques, 2000), disfluency
(Graziano & Gullberg, 2018; Seyfeddinipur, 2006), and aphasia (Rose, 2006). The reasons
why the speech–gesture link exists are under debate (see Church et al., 2017), but the link
itself is not. These empirical facts strongly suggest that gestures are an integral part of
speaking. They therefore naturally become relevant to the study of speaking in a first (L1) or
second language (L2). The term second language acquisition (SLA) (L2 acquisition) will be
used throughout for both second and foreign language contexts, both instructed learning and
naturalistic acquisition. Moreover, the term “L2 learner” will refer to participants sometimes
called “L2 learners/users” by linguists, sometimes “bilinguals” by psychologists.
I will focus on gestures as defined earlier, leaving aside non-verbal behaviours such as
posture shifts and proxemics. The remaining class of movements, gestures, can be structurally
characterized in terms of articulators (hands, head, eyebrows, etc.), place of articulation,
and movement patterns with internal phase structure – a sort of “phonetics of gesture”
(Kendon, 1980). Gesture analyses often focus on the core phase of the movement, the stroke,
which is the most meaningful part of the movement. Other phases include the preparation
phase, the retraction phase when hands return to a resting position, and holds, when gestures
are momentarily kept immobile in space (Kendon, 1980). Gestures are also often classified
functionally or semiotically (see Kendon, 2004, for an overview of classification systems).
For example, representational gestures convey meaning by iconically representing properties
of concrete or abstract objects or actions (iconic and metaphoric gestures) or by spatial
contiguity to an intended entity (deictic and indexical gestures). Rhythmic gestures (beats)
mark scansion; pragmatic gestures express non-referential content such as stance or comments
on what is said; and interactive gestures refer to some aspect of conversation itself.
Gestures also show different degrees of conventionalization. At one end are fully lexicalized,
conventional gestures (emblems, quotable gestures, e.g., thumbs-up), which are language- and
culture-specific form-meaning pairs that function like words or idiomatic expressions. At the
other end, non-conventional gestures (gesticulation, co-speech, or speech-associated gestures)
lack fixed form or meaning but accompany speech on the fly to convey speech-related
meaning. Since co-speech gestures are the most closely aligned with speech semantically,
prosodically, and temporally, they will be the focus of this review.
Gestures are deeply multi-functional. They serve both addressee-directed (communicative)
and speaker-directed (cognitive) functions in speaking. Speakers produce and tailor gestures
for their addressees to convey, highlight, and disambiguate meaning and speech content, to
establish common ground, and to regulate turn-taking (Bavelas et al., 2008; Holler & Beattie,
2003; Streeck, 2009). But speakers also produce gestures for themselves to organize thoughts
and facilitate their own speech production (see Kita et al., 2017, for an overview). Both aspects
are vital.
I will briefly review how insights on the multimodal nature of speaking affect research on
speaking in SLA.
2 Historical Perspectives
The study of gesture has a long history (see Kendon, 2004, for an overview). However, the
advent of easily accessible film and video recordings in the 1970s enabled pioneering multimodal
work in interaction studies and anthropology (Efron, 1941/1972; Kendon, 1972,
2004), child language studies (Bates et al., 1977; Volterra et al., 2005), psychology
(Goldin-Meadow, 2003; McNeill, 1992, 2005), and psychiatry (Davis, 1985; Freedman, 1972). These
studies laid the foundation for the explosion of work seen in the past two decades. Gesture
studies is now a vibrant research field in its own right.
In SLA studies, interest in gestures has been slower to develop. Early on, gestures were
occasionally discussed as culture-specific practices to acquire for cultural fluency in a target language
(e.g., Green, 1968; Pennycook, 1985; Von Raffler-Engel, 1980; Wylie, 1985), or as a pedagogical
tool for improving L2 comprehension in language classrooms (e.g., Kellerman, 1992). A few
studies also discussed whether bilinguals switch language in both speech and gesture (Efron,
1941/1972; Lacroix & Rioux, 1978; Von Raffler-Engel, 1976). In the 1990s, interest grew in applying
gesture studies to theoretical issues in SLA. The theoretical, methodological, and empirical
advances in L1 gesture studies provided new analytical tools and empirical facts about multimodal
language use to motivate such a shift (e.g., Duncan, 1996; Kendon, 1986, 1990; McNeill, 1992;
Müller, 1998). This early L2 work examined how gestures function in L2 interaction as
communication strategies for lexical, grammatical, and pragmatic challenges (Gullberg, 1998); how native
speakers adjust speech and gestures to learners in multimodal foreigner talk (Adams, 1998); how
learners use private speech and gesture to internalize knowledge (McCafferty, 1998); and how
learners’ gestures may show traces of cross-linguistic influence (Stam, 1998). Since the early 2000s,
research on SLA and gestures has diversified considerably (see Gullberg, 2006b, 2008; Gullberg & de
Bot, 2010; Gullberg & McCafferty, 2008; McCafferty & Stam, 2008; Stam, 2012; Stam &
Buescher, 2018, for overviews and collections of papers). It is not yet a separate subfield of study in
SLA, but the potential is obvious.
A key issue is to understand which aspects of such general behaviour are driven by cognitive
or developmental mechanisms, and which are more firmly rooted in communicative concerns.
Different aspects of gesture production may shed light on both issues (e.g., gesture
frequency may speak to cognitive aspects, and gesture articulation in space relative to an
interlocutor may speak to communicative issues).
A fourth topic is concerned with what role learners’ speech–gesture ensembles play in
interaction (e.g., turn-taking), in collaborative practices for the establishment of meaning
and structure (e.g., jointly finding words with gestures), for understanding, problem
resolution, and ultimately for L2 acquisition. Approaches here are largely interactionist,
conversation-analytic (CA), or sociocultural, and apply qualitative micro-analyses.
Learners’ and their interlocutors’ gesture production is spontaneous, and the focus is on
sequences of unfolding behaviour.
A related issue is whether learners’ L2 acquisition of lexicon and grammar can be improved
by gesture production during explicit teaching. This line of work is experimental and
exclusively focused on instruction and explicit (non-spontaneous) gesture production. It
draws on L1 gesture research showing that gesture production affects memory more generally
(see Cook & Fenn, 2017, for an overview), possibly because gestures strengthen
representations by evoking motor and visual imagery (Morett, 2018), or because gestures
engage sensorimotor brain networks that grow larger the more sensory modalities are linked
to a new element (Macedonia et al., 2019). The last two topics have important pedagogical
implications.
sometimes even when L2 speech looks target-like. Interestingly, new studies also find effects of
L1 co-speech gestures on the SLA of sign language (Ortega & Morgan, 2015). Gestural CLI can
be reflected in the timing of gestures relative to spoken elements (Stam, 2006), gesture meaning
(Choi & Lantolf, 2008), gesture form (Casey et al., 2012; Gullberg, 2009), gesture frequency (So,
2010), or in the way information is distributed across speech and gesture and in how
co-expressive the modalities are (Brown & Gullberg, 2008). Overall, studies find that even as speech
shifts towards L2 patterns, gestures often reveal lingering influences from the L1, sometimes
persistently (Özçalışkan, 2016). But there is also evidence of gestural shifts and learning
(Gullberg, 2009; Lewis, 2012; Stam, 2015). On the whole, speech seems to shift more easily
towards the L2 pattern than gestures. Why this is the case remains unclear.
Another line of CLI work examines the reverse influence, from the L2 on the L1, or
bidirectional influences, even at modest levels of L2 proficiency. In keeping with
psycholinguistic evidence that all known languages affect each other in an individual mind (Cook,
2003; Van Hell & Dijkstra, 2002), studies have found traces of L2 speech–gesture patterns in
the L1. For example, speakers with knowledge of an L2 speak and gesture significantly
differently in their L1 from monolingual peers. Crucially, their own L1 and L2 production is
sometimes indistinguishable, suggesting multimodal convergence
(Brown, 2015; Brown & Gullberg, 2008). Similar bidirectional influences have been found
from a signed L2 onto L1 co-speech gestures (Casey et al., 2012).
Third, multimodal studies looking at general learner phenomena are rare (beyond issues
of gesture rates and proficiency). One line of study has explored how early L2 learners
organize information about entities to create coherent discourse (reference tracking) (Perdue,
2000), especially when pronominal systems and word order patterns are not yet mastered.
These studies show that learners tend to avoid pronouns, and instead use full lexical noun
phrases (NPs) to refer both to new and old entities (Hendriks, 2003; Williams, 1988). They
create over-explicit discourse. L2 gesture analyses reveal that early L2 speakers with different
L1s and L2s are also multimodally over-explicit, accompanying every mention of a referent,
new or old, with a gesture (Gullberg, 2006a; So et al., 2013; Yoshioka, 2008). Interestingly,
this pattern appears whether addressees can see the gestures or not (Gullberg, 2006a),
suggesting that gesture production is not only a disambiguation strategy but may also serve a
self-directed purpose, perhaps to reduce memory strain (“cognitive load”; Cook & Fenn,
2017), by externalizing the referents that must be kept in mind onto gesture. An outstanding
issue is whether these patterns hold also when pro-drop languages are involved (cf. So et al.,
2013; Yoshioka, 2008). The precise temporal alignment between gestures and spoken elements
must be clarified to test this since in the case of zero anaphora, gestures must align
with other elements than NPs or pronouns.
Fourth, a growing number of studies explore the role of gestures in L2 learners’ spoken
interactions inside and outside of classrooms, examining how learners deploy gestures as
communication strategies (Gullberg, 1998), in repair sequences (Olsher, 2008), in joint
co-production with native speakers (Mori & Hayashi, 2006), or to internalize new knowledge in
private speech (Lee, 2008; McCafferty & Rosborough, 2014). Other studies examine how
learners and teachers use gestures in L2 classroom speech to support the learning of
vocabulary, grammar, pronunciation, and even writing (Eskildsen & Wagner, 2015; Kim & Cho,
2017; Lazaraton, 2004; Matsumoto & Dobs, 2017; Smotrova, 2017). Gestural teacher talk is
also studied, showing, for example, that during vocabulary training language teachers
modulate their gestures depending on students’ L2 proficiency, with more, longer and bigger
representational gestures the lower students’ proficiency (Tellier & Stam, 2012). Other studies
examine the effects of gestural corrective feedback and recasts (see Nakatsukasa & Loewen,
2017, for an overview), with mixed effects on L2 learning, perhaps depending on the linguistic
domain. Overall, these studies reveal that gestures serve as a crucial communicative and
semiotic resource to learners and their interlocutors alike.
Finally, a flourishing subfield probes whether learners’ gesture production can measurably
improve their L2 acquisition (cf. Cook & Fenn, 2017). By now, many studies show that both
child and adult L2 learners who repeat modelled speech and gestures during vocabulary training
retain more words than learners who do not gesture, especially when gesture meanings match
speech content (Andrä et al., 2020; Kelly et al., 2009; Morett, 2014; Tellier, 2008). (Many studies
also examine the effects of simply observing gestures, but since that is perception and not
speaking, it is outside the scope of this review; see Macedonia, 2019, for a discussion.)
Similarly, gesture production during the training of new speech sounds improves L2
pronunciation (Baills et al., 2019; Li et al., 2020). In all cases, it is assumed that learning is
boosted by the double engagement of motor and auditory memory (Morett, 2018).
Interestingly, the effects of gesture training are mixed in the domain of L2 phonology where
gesture appears to benefit L2 production skills more (Li et al., 2020) than reception skills
(Hirata et al., 2014). Although all studies are pre-test/post-test designs, the tasks involved
vary greatly, meaning that the overall effects remain unclear. Further, other linguistic
domains need probing, the longevity of the effect needs clarification, and effects of
non-modelled, spontaneous gestures also need investigating.
(Nicoladis, 2007), gesture form and meaning (Choi & Lantolf, 2008), timing (Stam, 2006),
function (Gullberg, 1998), and degree of semantic overlap (co-expressivity) with speech
(Brown & Gullberg, 2008). Speech annotations, of course, target the same issues as in other SLA
research. It is vital to have interrater reliability measures of gesture annotations since gesture
coding is not standardized and often under-described.
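One common way to quantify interrater reliability for categorical gesture codes is Cohen's
kappa, which corrects raw agreement for chance. The following is a minimal sketch in Python,
assuming two coders have each assigned one label to the same set of gestures. The function
name, the label set, and the toy data are illustrative assumptions and are not taken from any
study reviewed here.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Return Cohen's kappa for two coders' labels over the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of gestures.")
    n = len(coder_a)

    # Observed agreement: proportion of gestures given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement: for each label, the product of the two coders' label proportions.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both coders used one and the same label throughout; treat as perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of eight gestures by two coders (invented data).
coder_a = ["iconic", "deictic", "beat", "iconic", "deictic", "iconic", "beat", "iconic"]
coder_b = ["iconic", "deictic", "beat", "deictic", "deictic", "iconic", "iconic", "iconic"]

kappa = cohens_kappa(coder_a, coder_b)
print(f"kappa = {kappa:.2f}")
```

Kappa suits two coders and nominal categories such as gesture type. Other statistics may be
more appropriate for more than two coders or for continuous measures such as gesture timing.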
Qualitative CA-based or sociocultural perspectives currently seem to dominate the study
of multimodal SLA, but quantitative, (semi-)experimental approaches are also frequent, with
intervention studies in particular on the rise.
7 Future Directions
The study of L2 speaking and gesturing has come far in the past 20 years. However, much remains
to be done. Crucially, the study of multimodal SLA needs to widen the empirical base. We need to
go beyond lexical concerns and probe other linguistic domains of speaking, in new languages, and
other settings. That is, we need to know what native speakers of languages X and Y do gesturally
when using evidentials, complex demonstrative systems, or when they distinguish entities with
regard to definiteness and specificity, for example. The extension of multimodal SLA to new
domains and to new languages calls for more L1 baseline data than is currently available. A way to
achieve this is to systematically study participants’ production in both their L1 and L2.
We should also move beyond the compensatory view that treats L2 gesture production
mainly as a communicative support system (cf. the discussion of gesture frequency). Embodied
language use is everywhere, in L1 and L2 speakers alike (Holler & Levinson, 2019), and fluent
speech is more tightly paired with gesture production than disfluent speech, also in L2
production (Graziano & Gullberg, 2018). There is thus good reason to consider gestures differently.
In general, we still know very little about whether, when, and how L2 learners come to
speak and gesture in language-specific ways. We do not know why gestures seem to be more
resistant to change than speech, whether gestures change through imitation (e.g., during
study abroad and immersion) or with shifts in the linguistic system, or some combination of
the two. As in all SLA, we need more longitudinal work both inside and outside of classrooms
to address these issues. We know nothing at all about whether L2 learners come to
produce culture-specific emblems when speaking, for example using the circle gesture
accurately (joining the thumb and index finger to mean zero/bad, excellent, money, or bodily
orifice) depending on linguistic community and culture (Morris et al., 1979).
Since emblems function like idiomatic expressions and discourse markers, they are well
worth investigating relative to their spoken equivalents.
Most importantly, however, current work on gestures and L2 speaking is not very well
embedded in current SLA theory or empirical endeavours in SLA studies. Therefore, the
most pressing future direction for this line of work is to focus on what gesture analysis can
add to key SLA domains (cf. the table of contents of this volume). Obvious candidate areas
include work on the role of gestures for attention, noticing, implicit/explicit learning and
knowledge; task demands and cognitive load; individual differences; the entire field of
instructed SLA; and L2 processing in production. To this list one might add that we know nothing
about how uninstructed learners or learners with different degrees of literacy speak and
gesture. The entire domain of assessment of spoken L2 skills is also in need of a multimodal
approach. Although a few studies suggest that learners’ gestures affect assessments of their
spoken skills positively (Gullberg, 1998; Jenkins & Parra, 2003), this is a deeply
under-researched area with pedagogical implications.
In sum, gestures offer a rich and multidimensional view of L2 speaking. Multimodal
analyses of L2 speech and gesture can provide a fuller picture of both communicative and
cognitive aspects of L2 speaking. As such, gestures have considerable potential for
expanding the scope and depth of SLA research. The study of L2 gestures still needs to
integrate SLA concerns more fully, and work towards establishing gestures as a natural
element of any study of L2 speaking. It behoves us all to work to shift our theories and
models of SLA away from monomodal perspectives towards multimodal ones. It is time
for a gestural turn.
Further Reading
Two core texts on gesture research by the pioneers:
Kendon, A. (2004). Gesture. Visible action as utterance. Cambridge: Cambridge University Press.
McNeill, D. (2005). Gesture and thought. Chicago: University of Chicago Press.
A collection of papers on various domains of SLA where gesture analysis has fruitfully been applied:
McCafferty, S. G., & Stam, G. (Eds.). (2008). Gesture. Second language acquisition and classroom
research. New York: Routledge.
A brief overview of methodological issues in research on gestures in SLA:
Gullberg, M. (2012b). Gesture analysis in second language acquisition. In C. Chapelle (Ed.),
Encyclopedia of Applied Linguistics. Oxford: Wiley-Blackwell.
References
Adams, T. W. (1998). Gesture in foreigner talk (PhD diss.), University of Pennsylvania, Philadelphia.
Andrä, C., Mathias, B., Schwager, A., Macedonia, M., & von Kriegstein, K. (2020). Learning foreign
language vocabulary with gestures and pictures enhances vocabulary memory for several months
post-learning in eight-year-old school children. Educational Psychology Review, 32, 815–850.
doi: 10.1007/s10648-020-09527-z
Aziz, J. R., & Nicoladis, E. (2019). “My French is rusty”: Proficiency and bilingual gesture use in a
majority English community. Bilingualism: Language and Cognition, 22(4), 826–835.
Baills, F., Suárez-González, N., González-Fuente, S., & Prieto, P. (2019). Observing and producing
pitch gestures facilitates the learning of Mandarin Chinese tones and words. Studies in Second
Language Acquisition, 41(1), 33–58.
Bates, E., Benigni, L., Bretherton, I., Camaioni, L., & Volterra, V. (1977). From gesture to the first
word: On cognitive and social prerequisites. In M. Lewis & L. A. Rosenblum (Eds.), Interaction,
conversation, and the development of language (pp. 247–307). New York: Wiley.
Bavelas, J. B. (1994). Gestures as part of speech: Methodological implications. Research on Language
and Social Interaction, 27(3), 201–221.
Bavelas, J. B., Chovil, N., Lawrie, D. A., & Wade, A. (1992). Interactive gestures. Discourse Processes,
15(4), 469–489.
Bavelas, J. B., Gerwing, J., Sutton, C., & Prevost, D. (2008). Gesturing on the telephone: Independent
effects of dialogue and visibility. Journal of Memory and Language, 58(2), 495–520.
Brown, A. (2015). Universal development and L1–L2 convergence in bilingual construal of manner in
speech and gesture in Mandarin, Japanese, and English. The Modern Language Journal,
99(S1), 66–82.
Brown, A., & Gullberg, M. (2008). Bidirectional crosslinguistic influence in L1-L2 encoding of Manner
in speech and gesture: A study of Japanese speakers of English. Studies in Second Language
Acquisition, 30(2), 225–251.
Casey, S., Emmorey, K., & Larrabee, H. (2012). The effects of learning American Sign Language on
co-speech gesture. Bilingualism: Language and Cognition, 15(4), 677–686.
Choi, S., & Lantolf, J. P. (2008). Representation and embodiment of meaning in L2 communication.
Motion events in the speech and gesture of advanced L2 Korean and L2 English speakers. Studies in
Second Language Acquisition, 30(2), 191–224.
Church, R. B., Alibali, M. W., & Kelly, S. D. (Eds.). (2017). Why gesture?: How the hands function in
speaking, thinking and communicating. Philadelphia, Amsterdam: John Benjamins.
Clark, H. H. (1996). Using language. Cambridge: Cambridge University Press.
Colletta, J.-M., Guidetti, M., Capirci, O., Cristilli, C., Demir, O. E., Kunene-Nicolas, R. N., & Levine,
S. (2015). Effects of age and language on co-speech gesture production: An investigation of French,
American, and Italian children’s narratives. Journal of Child Language, 42(1), 122–145.
Cook, S., & Fenn, K. M. (2017). The function of gesture in learning and memory. In R. Breckinridge
Church, M. W. Alibali, & S. D. Kelly (Eds.), Why gesture?: How the hands function in speaking,
thinking and communicating (pp. 129–153). Amsterdam: John Benjamins.
Cook, V. (2003). Introduction: The changing L1 in the L2 user’s mind. In V. Cook (Ed.), Effects of the
second language on the first (pp. 1–18). Clevedon: Multilingual Matters.
Davis, M. (1985). Nonverbal behavior research and psychotherapy. In G. Stricker & R. H. Keisner
(Eds.), From research to clinical practice (pp. 89–112). New York: Plenum Press.
De Jong, N. H., Groenhout, R., Schoonen, R. O. B., & Hulstijn, J. H. (2015). Second language fluency:
Speaking style or proficiency? Correcting measures of second language fluency for first language
behavior. Applied Psycholinguistics, 36(2), 223–243.
Denisova, V. A., Cienki, A., & Iriskhanova, O. K. (2018). Boundary expression in verbs and gesture:
Differences between L1 and L2 speakers. In Computational Linguistics and Intellectual Technologies
(pp. 163–171). Moscow, May 30–June 2, 2018.
Duncan, S. D. (1996). Grammatical form and ‘thinking-for-speaking’ in Mandarin Chinese and English:
An analysis based on speech-accompanying gesture (PhD diss.), University of Chicago, Chicago.
Efron, D. (1941/1972). Gestures, race and culture. The Hague: Mouton.
Eskildsen, S. W., & Wagner, J. (2015). Embodied L2 construction learning. Language Learning, 65(2),
268–297.
Freedman, N. (1972). The analysis of movement behavior during the clinical interview. In A. W.
Siegman & B. Pope (Eds.), Studies in dyadic communication (pp. 153–175). New York: Pergamon.
Fricke, E. (2014). Deixis, gesture, and embodiment from a linguistic point of view. In C. Müller, A.
Cienki, E. Fricke, S. H. Ladewig, D. McNeill, & S. Tessendorf (Eds.), Body – Language –
Communication (pp. 1803–1823). Berlin, New York: Mouton de Gruyter.
Gentilucci, M., & Volta, R. D. (2008). Spoken language and arm gestures are controlled by the same
motor control system. The Quarterly Journal of Experimental Psychology, 61(6), 944–957.
Goldin-Meadow, S. (2003). Hearing gesture: How our hands help us think. Cambridge, MA: The
Belknap Press.
Graziano, M., & Gullberg, M. (2018). When speech stops, gesture stops: Evidence from developmental
and crosslinguistic comparisons. Frontiers in Psychology, 9(879). doi: 10.3389/fpsyg.2018.00879
Green, J. R. (1968). A gesture inventory for teaching Spanish. New York: Clinton Books.
Gregersen, T. S. (2005). Nonverbal cues: Clues to the detection of foreign language anxiety. Foreign
Language Annals, 38(3), 388–400.
Gregersen, T. S., Olivares-Cuhat, G., & Storm, J. (2009). An examination of L1 and L2 gesture use:
What role does proficiency play? The Modern Language Journal, 93(2), 195–208.
Gu, Y., Mol, L., Hoetjes, M., & Swerts, M. (2017). Conceptual and lexical effects on gestures: The case
of vertical spatial metaphors for time in Chinese. Language, Cognition and Neuroscience, 32(8),
1048–1063.
Gullberg, M. (1998). Gesture as a communication strategy in second language discourse. A study of
learners of French and Swedish. Lund: Lund University Press.
Gullberg, M. (2006a). Handling discourse: Gestures, reference tracking, and communication strategies
in early L2. Language Learning, 56(1), 155–196.
Gullberg, M. (2006b). Some reasons for studying gesture and second language acquisition (Hommage à
Adam Kendon). International Review of Applied Linguistics, 44(2), 103–124.
Gullberg, M. (2008). Gestures and second language acquisition. In P. Robinson & N. C. Ellis (Eds.),
Handbook of cognitive linguistics and second language acquisition (pp. 276–305). London:
Routledge.
Gullberg, M. (2009). Reconstructing verb meaning in a second language: How English speakers of L2
Dutch talk and gesture about placement. Annual Review of Cognitive Linguistics, 7, 222–245.
Gullberg, M. (2010). Methodological reflections on gesture analysis in SLA and bilingualism research.
Second Language Research, 26(1), 75–102.
Gullberg, M. (2012a). Bilingualism and gesture. In T. K. Bhatia & W. C. Ritchie (Eds.), The handbook
of bilingualism and multilingualism (2nd edn, pp. 417–437). Malden, MA: Wiley-Blackwell.
Gullberg, M. (2012b). Gesture analysis in second language acquisition. In C. Chapelle (Ed.),
Encyclopedia of Applied Linguistics. Oxford: Wiley-Blackwell.
Gullberg, M., & de Bot, K. (Eds.). (2010). Gestures in language development. Amsterdam: Benjamins.
Gullberg, M., de Bot, K., & Volterra, V. (2008). Gestures and some key issues in the study of language
development. Gesture, 8(2), 149–179.
Gullberg, M., & McCafferty, S. G. (2008). Introduction to Gesture and SLA: Toward an integrated
approach. Studies in Second Language Acquisition, 30(2), 133–146.
Hendriks, H. (2003). Using nouns for reference maintenance: A seeming contradiction in L2 discourse.
In A. G. Ramat (Ed.), Typology and second language acquisition (pp. 291–326). Berlin: Mouton.
Hirata, Y., Kelly, S. D., Huang, J., & Manansala, M. (2014). Effects of hand gestures on auditory
learning of second-language vowel length contrasts. Journal of Speech, Language, and Hearing
Research, 57, 2090–2101.
Holler, J., & Beattie, G. (2003). How iconic gestures and speech interact in the representation of
meaning: Are both aspects really integral to the process? Semiotica, 146(1), 81–116.
Holler, J., & Levinson, S. C. (2019). Multimodal language processing in human communication. Trends
in Cognitive Sciences, 23(8), 639–652.
Housen, A., Kuiken, F., & Vedder, I. (Eds.). (2012). Dimensions of L2 performance and proficiency:
Complexity, accuracy, and fluency in SLA. Amsterdam: Benjamins.
Iverson, J. M., & Goldin-Meadow, S. (2005). Gesture paves the way for language development.
Psychological Science, 16, 367–371.
Jenkins, S., & Parra, I. (2003). Multiple layers of meaning in an oral proficiency test: The
complementary roles of nonverbal, paralinguistic, and verbal behaviors in assessment decisions. Modern
Language Journal, 87(1), 90–107.
Kellerman, S. (1992). ‘I see what you mean’: The role of kinesic behaviour in listening and implications
for foreign and second language learning. Applied Linguistics, 13(3), 239–257.
Kelly, S. D., McDevitt, T., & Esch, M. (2009). Brief training with co-speech gesture lends a hand to
word learning in a foreign language. Language and Cognitive Processes, 24(2), 313–334.
Kendon, A. (1972). Some relationships between body motion and speech: An analysis of an example. In
A. W. Siegman & B. Pope (Eds.), Studies in dyadic communication (pp. 177–210). New York:
Pergamon.
Kendon, A. (1980). Gesticulation and speech: Two aspects of the process of utterance. In M. R. Key
(Ed.), The relationship of verbal and nonverbal communication (pp. 207–227). The Hague: Mouton.
Kendon, A. (1986). Some reasons for studying gesture. Semiotica, 62(1/2), 3–28.
Kendon, A. (1990). Conducting interaction. Cambridge: Cambridge University Press.
Kendon, A. (1994). Do gestures communicate?: A review. Research on Language and Social Interaction,
27(3), 175–200.
Kendon, A. (2004). Gesture. Visible action as utterance. Cambridge: Cambridge University Press.
Kim, S., & Cho, S. (2017). How a tutor uses gesture for scaffolding: A case study on L2 tutee’s writing.
Discourse Processes, 54(2), 105–123.
Kita, S. (2009). Cross-cultural variation of speech-accompanying gesture: A review. Language and
Cognitive Processes, 24(2), 145–167.
Kita, S., Alibali, M. W., & Chu, M. (2017). How do gestures influence thinking and speaking? The
gesture-for-conceptualization hypothesis. Psychological Review, 124(3), 245–266.
Lacroix, J. M., & Rioux, Y. (1978). La communication non-verbale chez les bilingues. Canadian Journal
of Behavioral Science, 10(2), 130–140.
Lazaraton, A. (2004). Gesture and speech in the vocabulary explanations of one ESL teacher: A
microanalytic inquiry. Language Learning, 54(1), 79–117.
Lee, J. (2008). Gesture and private speech in second language acquisition. Studies in Second Language
Acquisition, 30(2), 169–190.
Levy, E. T., & McNeill, D. (1992). Speech, gesture, and discourse. Discourse Processes, 15(3), 277–301.
Lewis, T. N. (2012). The effect of context on the L2 Thinking for Speaking development of path
gestures. L2 Journal, 4(2), 247–268.
Li, P., Baills, F., & Prieto, P. (2020). Observing and producing durational hand gestures facilitates the
pronunciation of novel vowel-length contrasts. Studies in Second Language Acquisition, 42(5),
1015–1039. doi: 10.1017/S0272263120000054
Lin, Y.-L. (2020). A helping hand for thinking and speaking: Effects of gesturing and task planning on
second language narrative discourse. System, 91, 102243.
Loehr, D. P. (2007). Aspects of rhythm in gesture and speech. Gesture, 7(2), 179–214.
Macedonia, M. (2019). Embodied learning: Why at school the mind needs the body. Frontiers in
Psychology, 10(2098). doi: 10.3389/fpsyg.2019.02098
Macedonia, M., Repetto, C., Ischebeck, A., & Mueller, K. (2019). Depth of encoding through observed
gestures in foreign language word learning. Frontiers in Psychology, 10(33). doi: 10.3389/fpsyg.2019.00033
MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk (3rd edn). Mahwah, NJ:
Lawrence Erlbaum Associates.
Matsumoto, Y., & Dobs, A. M. (2017). Pedagogical gestures as interactional resources for teaching and
learning tense and aspect in the ESL grammar classroom. Language Learning, 67(1), 7–42.
Mayberry, R. I., & Jaques, J. (2000). Gesture production during stuttered speech: Insights into the
nature of gesture-speech integration. In D. McNeill (Ed.), Language and gesture (pp. 199–214).
Cambridge: Cambridge University Press.
McCafferty, S. G. (1998). Nonverbal expression and L2 private speech. Applied Linguistics,
19(1), 73–96.
McCafferty, S. G., & Rosborough, A. (2014). Gesture as a private form of communication during
lessons in an ESL-designated elementary classroom: A sociocultural perspective. TESOL Journal,
5(2), 225–246.
McCafferty, S. G., & Stam, G. (Eds.). (2008). Gesture. Second language acquisition and classroom
research. New York: Routledge.
McNeill, D. (1992). Hand and mind. What gestures reveal about thought. Chicago: University of Chicago
Press.
McNeill, D. (2005). Gesture and thought. Chicago: University of Chicago Press.
McNeill, D. (2017). Gesture-speech unity: What it is, where it came from. In R. Breckinridge Church,
M. W. Alibali, & S. D. Kelly (Eds.), Why gesture?: How the hands function in speaking, thinking and
communicating (pp. 77–101). Amsterdam: John Benjamins.
Morett, L. M. (2014). When hands speak louder than words: The role of gesture in the communication,
encoding, and recall of words in a novel second language. The Modern Language Journal, 98(3),
834–853.
Morett, L. M. (2018). In hand and in mind: Effects of gesture production and viewing on second
language word learning. Applied Psycholinguistics, 39, 355–381.
Mori, J., & Hayashi, M. (2006). The achievement of intersubjectivity through embodied completions: A
study of interactions between first and second language speakers. Applied Linguistics, 27(2),
195–219.
Morris, D., Collett, P., Marsh, P., & O’Shaughnessy, M. (1979). Gestures, their origins and distribution.
London: Cape.
Müller, C. (1998). Redebegleitende Gesten. Berlin: Berlin Verlag Arno Spitz GmbH.
Nagpal, J., Nicoladis, E., & Marentette, P. (2011). Predicting individual differences in L2 speakers’
gestures. International Journal of Bilingualism, 15(2), 205–214.
Nakatsukasa, K., & Loewen, S. (2017). Non-verbal feedback. In H. Nassaji & E. Kartchava (Eds.),
Corrective feedback in second language teaching and learning: Research, theory, applications,
implications (pp. 158–173). New York: Routledge.
Nicoladis, E. (2007). The effect of bilingualism on the use of manual gestures. Applied Psycholinguistics,
28(3), 441–454.
Oi, M., Saito, H., Li, Z., & Zhao, W. (2013). Co-speech gesture production in an animation–narration
task by bilinguals: A near-infrared spectroscopy study. Brain and Language, 125(1), 77–81.
Olsher, D. (2008). Gesturally-enhanced repeats in the repair turn: Communication strategy or cognitive
language-learning tool? In S. G. McCafferty & G. Stam (Eds.), Gesture. Second language acquisition
and classroom research (pp. 109–130). New York: Routledge.
Ortega, G., & Morgan, G. (2015). Phonological development in hearing learners of a sign language:
The influence of phonological parameters, sign complexity, and iconicity. Language Learning, 65(3),
660–688.
Özçalışkan, Ş. (2016). Do gestures follow speech in bilinguals’ description of motion? Bilingualism:
Language and Cognition, 19(3), 644–653.
Özyürek, A. (2014). Hearing and seeing meaning in speech and gesture: Insights from brain and
behaviour. Philosophical Transactions of the Royal Society B: Biological Sciences, 369(1651),
20130296.
Özyürek, A. (2017). Function and processing of gesture in the context of language. In R. B. Church, M.
W. Alibali, & S. D. Kelly (Eds.), Why gesture? How the hands function in speaking, thinking, and
communicating (pp. 39–58). Amsterdam: John Benjamins Publishing Company.
Pennycook, A. (1985). Actions speak louder than words: Paralanguage, communication and education.
TESOL Quarterly, 19(2), 259–282.
Perdue, C. (2000). Organising principles of learner varieties. Studies in Second Language Acquisition,
22(3), 299–305.
Rauscher, F. H., Krauss, R. M., & Chen, Y. (1996). Gesture, speech and lexical access: The role of
lexical movements in speech production. Psychological Science, 7(4), 226–231.
Robinson, P. (2007). Task complexity, theory of mind, and intentional reasoning: Effects on L2 speech
production, interaction, uptake and perceptions of task difficulty. International Review of Applied
Linguistics in Language Teaching, 45(3), 193–213.
Rose, M. L. (2006). The utility of arm and hand gesture in the treatment of aphasia. Advances in
Speech-Language Pathology, 8(2), 92–109.
Seyfeddinipur, M. (2006). Disfluency: Interrupting speech and gesture (PhD diss.), Radboud University,
Nijmegen.
Slobin, D. I. (1996). From “thought and language” to “thinking for speaking”. In J. J. Gumperz & S. C.
Levinson (Eds.), Rethinking linguistic relativity (pp. 70–96). Cambridge: Cambridge University
Press.
Smotrova, T. (2017). Making pronunciation visible: Gesture in teaching pronunciation. TESOL
Quarterly, 51(1), 59–89.
So, W. C. (2010). Cross-cultural transfer in gesture frequency in Chinese-English bilinguals. Language
and Cognitive Processes, 25(10), 1335–1353.
So, W. C., Kita, S., & Goldin-Meadow, S. (2013). When do speakers use gestures to specify who does
what to whom? The role of language proficiency and type of gestures in narratives. Journal of
Psycholinguistic Research, 42(6), 581–594.
Stam, G. (1998). Changes in patterns of thinking about motion with L2 acquisition. In S. Santi, I.
Guaïtella, C. Cavé, & G. Konopczynski (Eds.), Oralité et Gestualité (ORAGE ‘98) (pp. 615–619).
Paris: l’Harmattan.
Stam, G. (2006). Thinking for Speaking about motion: L1 and L2 speech and gesture. International
Review of Applied Linguistics, 44(2), 143–169.
Stam, G. (2012). Second language acquisition and gesture. In C. Chapelle (Ed.), Encyclopedia of
Applied Linguistics. Oxford: Wiley-Blackwell.
Stam, G. (2015). Changes in thinking for speaking: A longitudinal case study. The Modern Language
Journal, 99(S), 83–99.
Stam, G., & Buescher, K. (2018). Gesture research. In A. Phakiti, P. De Costa, L. Plonsky, & S.
Starfield (Eds.), Palgrave Handbook of applied linguistics research methodology (pp. 793–809).
London: Palgrave Macmillan.
Stam, G., & McCafferty, S. G. (2008). Gesture studies and second language acquisition: A review. In S.
G. McCafferty & G. Stam (Eds.), Gesture. Second language acquisition and classroom research
(pp. 3–24). New York: Routledge.
Streeck, J. (2009). Forward-gesturing. Discourse Processes, 46(2), 161–179.
Tellier, M. (2008). The effect of gestures on second language memorisation by young children. Gesture,
8(2), 219–235.
Tellier, M., & Stam, G. (2012). Stratégies verbales et gestuelles dans l’explication lexicale d’un verbe
d’action. In V. Rivière (Ed.), Spécificités et diversité des interactions didactiques (pp. 357–374). Paris:
Riveneuve éditions.
Van Hell, J. G., & Dijkstra, T. (2002). Foreign language knowledge can influence native language
performance in exclusively native contexts. Psychonomic Bulletin & Review, 9(4), 780–789.
Volterra, V., Caselli, M. C., Capirci, O., & Pizzuto, E. (2005). Gesture and the emergence and
development of language. In M. Tomasello & D. I. Slobin (Eds.), Beyond nature-nurture: Essays in honor
of Elizabeth Bates (pp. 3–40). Mahwah, NJ: Erlbaum.
Von Raffler-Engel, W. (1976). Linguistic and kinesic correlates in code switching. In W. C. McCormack
& S. A. Wurm (Eds.), Language and man: Anthropological issues (pp. 229–238). The Hague:
Mouton.
Von Raffler-Engel, W. (1980). Kinesics and paralinguistics: A neglected factor in second language
research and teaching. Canadian Modern Language Review, 36(2), 225–237.
Williams, J. (1988). Zero anaphora in second language acquisition. Studies in Second Language
Acquisition, 10, 339–370.
Wittenburg, P., Brugman, H., Russel, A., Klassmann, A., & Sloetjes, H. (2006). ELAN: A professional
framework for multimodality research. In Proceedings of the fifth international conference on
Language Resources and Evaluation (LREC 2006) (pp. 1556–1559). Genoa.
Wylie, L. (1985). Language learning and communication. The French Review, 57(6), 777–785.
Yoshioka, K. (2008). Gesture and information structure in first and second language. Gesture, 8(2),
236–255.
28
SPEECH-LANGUAGE
PATHOLOGISTS AND L2 SPEAKERS
Marie Nader
1 Introduction/Definitions
Speech-Language Pathologists (SLPs) are healthcare professionals primarily concerned with
the prevention, assessment, diagnosis, and treatment of a wide range of communication and
swallowing disorders in a variety of settings (e.g., hospitals, schools, and private clinics). By
communication disorder, I refer to “unexpectedly long-lasting, persistent, or recurrent difficulties
that interfere with normal, successful, ordinary communication” (Oller et al., 2010,
p. 5) whether developmental or acquired. They include speech sound disorders (SSDs) (i.e.,
impairment of the articulation of speech sounds, fluency and/or voice), and language disorders
[i.e., “impaired comprehension and/or use of language which may involve (1) the form
of language (phonology, morphology, syntax), (2) the content of language (semantics), and/
or (3) the function of language in communication (pragmatics)”] (American Speech-
Language-Hearing Association [ASHA], 1993, n.p.). With increasing diversity in the population,
professional colleges stress ethical delivery of service to all individuals regardless of
their cultural and linguistic backgrounds (ASHA, 2017; Speech-Language and Audiology
Canada [SAC], 2016). However, SLPs find it difficult to ensure equal quality of service to all
culturally and linguistically diverse individuals. At the same time, clinicians are increasingly
involved with and challenged by communication disorders of second language (L2) speakers,
that is, “bilingual speakers who have already made significant progress toward acquisition of
[L1] when they begin the acquisition of a second language” (Paradis et al., 2011, p. 6).
In addition to working with native speakers (NSs) and non-native speakers (NNSs) with
disorders, SLPs provide services to typically developing (TD) individuals, people who do not
present any underlying disorders to explain their observed behaviour(s). For instance, SLPs
may provide pronunciation instruction (PI) to L2 speakers, often labelled foreign accent
modification/management/reduction (FAM). Foreign accent is defined as “non-pathological
speech that differs in some noticeable respects from native speaker pronunciation norms”
(Munro & Derwing, 1995, p. 289). This encompasses segmental features (consonants and
vowels) and suprasegmental features (e.g., word stress, sentence stress, intonation, and
rhythm). Pronunciation is operationalized in L2 research through three partially distinct
dimensions (see Munro & Derwing, 1995), namely accentedness, that is, difference from a
local variety; intelligibility, that is, the extent to which a listener actually understands
L2 speech; and comprehensibility, that is, the effort a listener expends understanding an
utterance. For NNSs, accents result
2 Historical Perspectives
1917, many L2 speakers were included in speech classes by correctionists who had up to 250
students in their caseload.
A functional approach to speech production was central to the pre-profession. Accent
correction programmes were usually taken from treatment techniques designed for monolingual
native English-speaking children with speech disorders. Through articulatory placement,
modelling, and imitating NS speech and articulation patterns, practitioners such as Van
Riper (1954) proposed accent “treatment” aiming to develop nativelikeness, a dominant
principle in pronunciation teaching for decades. However, developing a native-like accent is
an unrealistic goal for most individuals because it is conditioned by age, amount and quality
of L2 exposure, amount of L1 use, and motivation. With the Civil Rights movement (1960s),
the need to distinguish disorders from differences was highlighted. In 1975, ASHA published
its first position statement acknowledging foreign accents and social dialects as differences,
not disorders. Yet, to date, SLPs remain influenced by a medical view of foreign accent,
whereas language teachers embrace a pedagogical view of L2 pronunciation instruction
(Derwing et al., 2014; Thomson & Foote, 2019).
Terminology Considerations
In speech-language pathology, L2 speakers are described using a framework of cultural and
linguistic diversity. This framework enables professional colleges and policy makers to emphasize
legal and ethical responsibilities to provide appropriate services to all culturally and
linguistically diverse (CLD) individuals (ASHA, 2014; Individuals with Disabilities
Education Act [IDEA], 2004). However, since populations within the framework vary as a
function of disability, ethnicity, gender identity, culture, language, and dialect, challenges are
expected when language is the focus of investigation. Maydosz and Maydosz (2020)
examined case law and law review journals regarding CLD individuals with disorders,
including L2 speakers. They noted that the term CLD is not uniformly defined across the
literature, resulting in faulty practices as evidenced by legal complaints of inequitable services,
biased evaluations, and misplacements in special education. Despite increasing
research examining CLD individuals, operational definitions remain scarce. When provided,
a definition encompasses heterogeneous groups such as bilinguals, non-standard dialect
users, and monolinguals in a minority language (D’Souza et al., 2012).
Differences across and within these groups are expected, such as sociocultural
background, amount of L2 exposure, and language sociopolitical status (minority vs. majority).
Furthermore, terms within the framework are hardly interchangeable and may include
a variety of subgroups. For instance, non-standard dialect users refers to speakers of a
regional/cultural dialect of a given language, who may or may not be bilinguals. Bilinguals,
on the other hand, is a neutral term referring to individuals who speak at least two languages
with varying degrees of proficiency (Meisel, 2019). Cook (2002) challenges the use of the term
bilingual as it has “contradictory definitions and associations in both popular and academic
usage” (p. 4). Among bilinguals are L2 speakers and heritage-language (HL) individuals, that
is, bilinguals of an immigrant minority language, raised in migrant families, exposed to their
HL and the majority language of the community since birth or in childhood, who may speak
or merely understand their HL (Benmamoun et al., 2013). Each subgroup is characterized by
linguistic particularities. For instance, TD L2 speakers can show signs of non-target-like
acquisition in different linguistic components of the L2 (e.g., phonetics, phonology, morphology,
semantics, syntax) (Benmamoun et al., 2013). TD HL speakers share linguistic
properties with both NSs and L2 speakers, and exhibit signs of non-target-like development
in their HL as well as in L2 which often becomes their more dominant language (Valdés,
2005). Such characteristics have implications for both assessment and intervention, hence, an
in-depth knowledge of L1 and L2 processes for each subgroup is paramount to providing
equitable and ethical speech-language pathology services.
SLPs working with adults with neurogenic communication disorders (such as aphasia and
traumatic brain injuries); 60% of clinicians “did not feel competent to provide clinical services
to diverse populations, particularly when a language or dialect difference was involved”
(p. 116). More recently, in a survey of 83 English and French-speaking SLPs in Québec, 96%
strongly agreed that mandatory L2-related education and training in graduate programmes
are needed (Nader & Chapdelaine, 2022).
structures and characteristics of the L1. Finally, the use of L1 norm-referenced tools with L2
children can lead to their misidentification for disorders. Indeed, L2 children are exposed to
L1 input that is quantitatively and qualitatively different from the input to which NSs of the
L1 are exposed in their country of origin. Some L2 children undergo incomplete acquisition
or even attrition of their L1.
Based on the Fundamental Difference Hypothesis (Bley-Vroman, 1990) within the perspective
of Universal Grammar, Sorace (1993) argued that L2 incomplete representations
in speakers’ interlanguage are due to a critical period for acquisition. On the other hand,
incomplete acquisition of L1 can also occur as observed in internationally adopted and HL
children, due to insufficient L1 input (Montrul, 2008; but see Perez-Cortes et al., 2019). As for
L1 attrition, it refers to a non-pathological decrease in L1 use due to the loss of, or limited
access to, previously acquired linguistic features or structures (see Chapter 31 this volume). L2 speakers
may experience L1 incomplete acquisition (children) and/or attrition (children and adults) as
a by-product of L2 contact. A gradual shift in language dominance from L1 to L2 can occur,
which, for children, is often observed before the L1 is fully developed, hindering L1 native-
like proficiency in many linguistic and phonological aspects (Montrul, 2008). Given that
SLPs are encouraged to assess speakers in their L1 and L2, factoring incomplete acquisition
and attrition into their clinical judgement is fundamental. Indeed, if L2 speakers have undergone
L1 incomplete acquisition and/or attrition, an underperformance on L1 test scores
would be observed compared to monolingual age-peers. In such cases, clinicians unfamiliar
with these processes may over-identify speech and language disorders.
clinical research can lead to unethical practices, even by the most well-intentioned
practitioner.
Inaccuracies can be found in SLP-led research stemming from a lack of L2-pronunciation
education and evidence-based practice. Consider, for instance, the following statement: “[a] perceived
strong or thick foreign or regional accent as compared to a mild accent is difficult or impossible
for a native speaker to understand” (Freysteinson et al., 2017, p. 300). Aside from
collapsing NNSs, NSs with non-standard (regional) accent, and NSs with a more “standard”
accent, the statement reflects a lack of knowledge of L2 research, which has clearly established
that having a strong foreign accent does not necessarily impede intelligibility
(Munro & Derwing, 1995).
Experts in the field of L2 pronunciation frequently argue that SLPs, as well as language
teachers wishing to work as FAM/PI providers, must acquire specialized education and
training in L2-pronunciation (Derwing et al., 2014; Müller et al., 2000; Thomson & Foote,
2019). Yet, to date in North America, specific L2-education and training requirements and
enforceable policies by professional bodies are lacking, thus contributing to ongoing problems
in the L2 pronunciation field.
Finally, most SLPs obtain a detailed repertoire of L2 speakers’ phonetic and phonological
errors to inform subsequent individual training programmes (stemming from a structuralist
approach to SSDs). However, not all phonemes are equally important for intelligibility.
Following the principle of functional load, that is, “a measure of the work which two phonemes
(or a distinctive feature) do in keeping utterances apart” (King, 1967, p. 831),
unattainable objective for most adults of eliminating foreign accents (Abrahamsson &
Hyltenstam, 2009).
First, a shift in the focus of attention from accentedness to intelligibility is evident in the
effort to change the current label, FAM, to a more evidence-based term, “intelligibility
enhancement” (Blake et al., 2020). FAM misleads vulnerable L2 speakers into
thinking that their accent is the core problem in communication breakdowns and negative
outcomes. Shifting to “intelligibility enhancement” will focus L2 speakers’ attention (and
practitioners’ likewise) on improving intelligibility rather than modifying one’s accent. This is
an important change that should be more broadly adopted by SLPs, and more importantly,
strongly endorsed by professional colleges and associations.
Second, SLPs interested in FAM seem increasingly critical of the quality of available research
within their field. For instance, Gu and Shah (2019) reviewed 26 published studies
from 1990 to 2018 examining the effectiveness of FAM training programmes implemented
with L2 speaker healthcare professionals. The authors reported that “all included studies
were of low research quality and often had small sample sizes and few objective outcome
measures, indicating a lack of generalizability and reproducibility” (p. 391). However, even
these authors embraced a medical approach using terms such as “patient,” “clinicians,” and
“interventions.” Few references were made to leading experts in L2 pronunciation in discussing
the reviewed studies.
Finally, neuroscientists have shown increased interest in understanding neural mechanisms
underlying TD L2 speakers’ accent using neuroimaging techniques such as functional
magnetic resonance imaging (fMRI). In one such study in Québec, Ghazi-Saidi et al. (2015)
examined 12 Spanish-speaking adults who learnt 35 new Spanish–French cognates by means
of a computerized training programme for 4 weeks, followed by picture naming measures of
learnt cognates during fMRI scanning in French L2 and Spanish L1. All participants
self-evaluated as having low proficiency in French. The study indicated that attempting to produce
an L2 native-like pronunciation is cognitively effortful. The authors also outlined the role in L2
speech processing of a small brain region, the insula, that had previously been associated
with emotional processing and processing uncertainty. Although neuroscience studies offer an
interesting avenue to increase our understanding of differences between TD L2 speakers’
performances in L2 pronunciation, their usefulness remains limited.
medical-related knowledge and acquired skills with L1 speakers with disorders can be irrelevant
to PI and lead to unethical practices. Given the specialized area of practice that is PI,
official policy makers and professional colleges must come together to regulate the practice,
first, by making mandatory a specialized certification in L2 pronunciation instruction, and
second, by suggesting a recommended guideline of courses/topics and training objectives
with the collaboration of L2 experts.
6 Future Directions
We have come a long way from “speech correctionists” to professional communication expert
SLPs. Throughout our journey, collaborating with professionals from related fields has
been (and continues to be) enriching. Such collaborations enable us to provide evidence-
based services to L1 speakers and to offer support and counsel to family members and
caregivers. However, much is still needed to ensure L2 speakers’ legal rights to ethical and
evidence-based services. Currently, the interaction between our field and the fields of applied
linguistics and L2 research remains extremely limited. With more L2 individuals in need of
SLP services, it is time to bridge the gap. One potential venue for such interactions is the
annual Pronunciation in Second Language Learning and Teaching (PSLLT) conference, which could
open discussions with L2 experts. Moving forward, a field dedicated to L2 clinical linguistics
may pave the way for fruitful collaborations between our fields. It is through such interactions
that L2 theories can shape SLP practices with L2 speakers and that L2 clinical research
and practice can inform linguistic theories.
Further Reading
Grant, L. (2014). Pronunciation myths: Applying second language research to classroom teaching. Ann
Arbor, MI: University of Michigan Press.
Experts in L2 pronunciation research and teaching discuss seven myths about L2 pronunciation in-
struction while covering central concepts, terms and issues.
Montrul, S. (2008). Incomplete acquisition in bilingualism: Re-examining the age factor. Amsterdam:
Benjamins.
A review of literature on non-native L2 and L1 attainment in adult L2 speakers and heritage language
(HL) children. Various degrees of incomplete acquisition are described in L2 contexts as are processes
such as attrition and incomplete acquisition.
References
Abrahamsson, N., & Hyltenstam, K. (2009). Age of onset and nativelikeness in a second language:
Listener perception versus linguistic scrutiny. Language Learning, 59, 249–306.
American Speech-Language-Hearing Association (ASHA). (1993). Definitions of communication dis-
orders and variations. Retrieved from https://www.asha.org/policy/rp1993-00208/
American Speech-Language-Hearing Association. (2014). Cultural competence. Retrieved from
www.asha.org/Practice-Portal/Professional-Issues/Cultural-Competence/
American Speech-Language-Hearing Association (ASHA). (2016a). 2016 Schools survey: SLP caseload
characteristics. Rockville, MD. Retrieved from http://www.asha.org
American Speech-Language-Hearing Association (ASHA). (2016b). Code of Ethics [Ethics]. Retrieved
from www.asha.org/policy/
American Speech-Language-Hearing Association (ASHA). (2017). Issues in Ethics statement: Cultural
and linguistic competence. Retrieved from https://www.asha.org/Practice/ethics
American Speech-Language-Hearing Association (ASHA). (n.d.). In Frequently Asked Questions.
Retrieved from https://www.asha.org/slp/clinical/dysphagia/dysphagia_faqs/
Anderson, J., Saleemi, S., & Bialystok, E. (2017). Neuropsychological assessments of cognitive aging in
monolingual and bilingual older adults. Journal of Neurolinguistics, 43, 17–27.
Marie Nader
Arias, G., & Friberg, J. (2017). Bilingual language assessment: Contemporary versus recommended
practice in American schools. Language, Speech, and Hearing Services in Schools, 48, 1–15.
Artiles, A., Harry, B., Reschly, D., & Chinn, P. (2002). Over-identification of students of color in
special education: A critical overview. Multicultural Perspectives, 4, 3–10.
Bedore, L., & Peña, E. (2008). Assessment of bilingual children for identification of language impair-
ment: Current findings and implications for practice. International Journal of Bilingual Education
and Bilingualism, 11, 1–29.
Benmamoun, E., Montrul, S., & Polinsky, M. (2013). Heritage languages and their speakers:
Opportunities and challenges for linguistics. Theoretical Linguistics, 39, 129–181.
Blake, H. L., McLeod, S., & Verdon, S. (2020). Intelligibility enhancement assessment and intervention:
A single-case experimental design with two multilingual university students. Clinical Linguistics and
Phonetics, 34(1-2), 1–20.
Bley-Vroman, R. (1990). The logical problem of foreign language learning. Linguistic Analysis, 20, 3–49.
Brady, K., Duewer, N., & King, A. (2016). The effectiveness of a multimodal vowel-targeted intervention
in accent modification. Contemporary Issues in Communication Science and Disorders, 43, 23–34.
Caesar, L., & Kohler, P. (2007). The state of school-based bilingual assessment: Actual practice versus
recommended guidelines. Language, Speech, and Hearing Services in Schools, 38, 190–200.
Cavanaugh, M. (1996). History of teaching English as a second language. The English Journal,
85, 40–44.
Cook, V. (2002). Background of L2 users. In V. Cook (Ed.), Portraits of L2 users (pp. 1–28). Clevedon,
UK: Multilingual Matters.
Damico, J. S. (1993). Synergy in applied linguistics: Theoretical and pedagogical implications. In F.
Eckman (Ed.), Confluence: Linguistics, L2 acquisition, and speech pathology (pp. 195–212).
Amsterdam: John Benjamins Publishing Company.
Derwing, T. M., Fraser, H., Kang, O., & Thomson, R. (2014). L2 accent and ethics: Issues that merit
attention. In A. Mahboob & L. Barratt (Eds.), Englishes in multilingual contexts. Berlin: Springer.
Derwing, T. M., & Munro, M. J. (2015). Pronunciation fundamentals: Evidence-based perspectives for L2
teaching and research. Amsterdam: John Benjamins Publishing Company.
D’Souza, C., Kay-Raining Bird, E., & Deacon, H. (2009). Survey of Canadian speech-language pa-
thology service delivery to linguistically diverse clients. Canadian Journal of Speech-Language
Pathology and Audiology, 36, 18–39.
Duchan, J. (2006). The diagnostic practices of Speech-Language Pathologists in America over the last
century. In J. Duchan & D. Kovarsky (Eds.), Diagnosis as cultural practice (pp. 200–222). Berlin:
Mouton de Gruyter.
Flege, J. (2016). The role of phonetic category formation in second language speech acquisition. In
Eighth International Conference on Second Language Speech, Aarhus University, Denmark.
Freysteinson, M., Adams, D., Cesario, S., Belay, H., Clutter, P., Du, J., Duson, B., Goff, M.,
McWilliams, L., Nurse, R. P., & Allam, Z. (2017). An accent modification program. Journal of
Professional Nursing, 33, 299–304.
Ghazi-Saidi, L., Dash, T., & Ansaldo, A. (2015). How native-like can you possibly get: fMRI evidence
for processing accent. Frontiers in Human Neuroscience, 9, 1–12.
Gillam, R., & Peña, E. (2004). Dynamic assessment of children from culturally diverse backgrounds.
Perspectives on Communication Disorders and Sciences in Culturally and Linguistically Diverse
Populations, 11(2), 2–5.
Gu, Y., & Shah, A. (2019). A systematic review of interventions to address accent-related
communication problems in healthcare. Ochsner Journal, 19, 378–396.
Guiberson, M., & Atkins, J. (2012). Speech-Language Pathologists’ preparation, practices, and
perspectives on serving culturally and linguistically diverse children. Communication Disorders
Quarterly, 33, 169–180.
Hedrick, J. (1922). A unique speech clinic. Position paper at Session – National Society for the Study and
Correction of Speech Disorders. Atlantic City, NJ.
Individuals with Disabilities Education Act Amendments of 2004 [IDEA]. (2004). Retrieved from
https://ideadata.org/
Kang, O., Thomson, R., & Moran, M. (2020). Which features of accent affect understanding?
Exploring the intelligibility threshold of diverse accent varieties. Applied Linguistics, 41, 453–480.
King, R. D. (1967). Functional load and sound change. Language, 43, 831–852.
Laing, S., & Kamhi, A. (2003). Alternative assessment of language and literacy in culturally
and linguistically diverse populations. Language, Speech, and Hearing Services in Schools, 34(1),
44–55.
Levis, J. (2018). Intelligibility, oral communication, and the teaching of pronunciation. Cambridge:
Cambridge University Press.
Lippi-Green, R. (2012). English with an accent: Language, ideology, and discrimination in the United
States (2nd ed.). London: Routledge.
Marinis, T., & Armon-Lotem, S. (2014). Sentence repetition. In S. Armon-Lotem, N. Meir & J. de Jong
(Eds.), Assessing multilingual children: Disentangling bilingualism from language impairment
(pp. 116–143). Clevedon, UK: Multilingual Matters.
Maydosz, A., & Maydosz, D. (2020). Culturally and linguistically diverse students with disabilities:
Case law review. Multicultural Learning & Teaching, 8, 65–80.
Meisel, J. (2019). Bilingual children: A guide for parents. Cambridge, UK: Cambridge University Press.
Montrul, S. (2008). Incomplete acquisition in bilingualism: Re-examining the age factor. Amsterdam/
Philadelphia: John Benjamins Publishing Company.
Müller, N., Ball, M., & Guendouzi, J. (2000). Accent reduction programmes: Not a role for speech-
language pathologists? Advances in Speech-Language Pathology, 2, 119–129.
Munro, M. J., & Derwing, T. M. (1995). Foreign accent, comprehensibility, and intelligibility in the
speech of second language learners. Language Learning, 45, 73–97.
Munro, M. J., & Derwing, T. M. (2001). Modelling perceptions of the comprehensibility and accent-
edness of L2 speech: The role of speaking rate. Studies in Second Language Acquisition, 23, 451–468.
Munro, M. J., & Derwing, T. M. (2006). The functional load principle in ESL pronunciation in-
struction: An exploratory study. System, 34(4), 520–531.
Nader, M., & Chapdelaine, C. (2022). Speech-Language Pathologists and linguistically diverse popu-
lations: Training, practice and future perspectives. [manuscript in preparation]
Nader, M., Simard, D., Fortier, V., & Molokopeeva, T. (2017). Étude de la contribution de la mémoire
de travail et de la mémoire phonologique dans la réalisation d’une tâche métasyntaxique chez des
enfants de langue d’origine. Revue canadienne de linguistique appliquée / Canadian Journal of
Applied Linguistics, 20, 55–76.
O’Brien, I., Segalowitz, N., Freed, B., & Collentine, J. (2007). Phonological memory predicts second
language oral fluency gains. Studies in Second Language Acquisition, 29, 557–582.
Oelke, M., Sachet, L., Nagle, K., Bislick, L., Brookshire, E., & Kendall, D. (2015). Can intensive
phonomotor therapy modify accent? A phase I study. Speech, Language and Hearing, 18, 229–242.
Oller, J. W. Jr., Oller, S. D., & Badon, L. (2010). Cases: Introducing communication disorders across the
life span. San Diego, CA: Plural Publishing, Inc.
Orellana, C., Wada, R., & Gillam, R. (2019). The use of Dynamic Assessment for the diagnosis of
language disorders in bilingual children: A meta-analysis. American Journal of Speech-Language
Pathology, 28, 1298–1317.
Ortiz, S., & Ochoa, S. (2005). Cognitive assessment of culturally and linguistically diverse individuals:
An integrated approach. In R. Rhodes, S. H. Ochoa, & S. O. Ortiz (Eds.), Assessing culturally and
linguistically diverse students: A practical guide (pp. 168–201). New York: Guilford Press.
Oswald, D., Coutinho, M., Best, A., & Singh, N. (1999). Ethnic representation in Special Education:
The influence of school-related economic and demographic variables. The Journal of Special
Education, 32, 194–206.
Paradis, J., Genesee, F., & Crago, M. (2011). Dual language development and disorders: A handbook on
bilingualism and second language learning (2nd ed.). Baltimore, MD: Brookes.
Perez-Cortes, S., Putnam, M. T., & Sánchez, L. (2019). Differential access: Asymmetries in accessing
features and building representations in Heritage Language grammars. Languages, 4(4), 81.
Peña, E., Spaulding, T., & Plante, E. (2006). The composition of normative groups and diagnostic decision
making: Shooting ourselves in the foot. American Journal of Speech-Language Pathology, 15, 247.
Rapeer, L. (1916). Review: A speech symposium. The English Journal, 5, 519–520.
Rothman, J. (2009). Understanding the nature and outcomes of early bilingualism: Romance languages
as heritage languages. International Journal of Bilingualism, 13(2), 155–163.
Schmidt, A., & Sullivan, S. (2003). Clinical training in foreign accent modification: A national survey.
Contemporary Issues in Communication Science and Disorders, 30, 127–135.
Service, E., Maury, S., & Luotoniemi, E. (2007). Individual differences in phonological learning and
verbal STM span. Memory & Cognition, 35(5), 1122–1135.
Sikorski, L. (2005). Foreign accents: Suggested competencies for improving communicative pro-
nunciation. Seminars in Speech and Language, 26(2), 126–130.
Sorace, A. (1993). Incomplete vs. divergent representations of unaccusativity in non-native grammars
of Italian. Second Language Research, 9, 22–47.
Speech-Language and Audiology Canada [SAC]. (2016). Scope of practice for speech-language pa-
thology. Available from www.sac-oac.ca
Sprague, J. (1925, July 4). Better Speech, Better Business. Saturday Evening Post, pp. 40–42.
Sullivan, A. (2011). Disproportionality in special education identification and placement of English
Language Learners. Exceptional Children, 77(3), 317–334.
Thomson, R. (2014). Myth 6: Accent reduction and pronunciation instruction are the same thing. In L.
Grant (Ed.), Pronunciation myths: Applying second language research to classroom teaching
(pp. 160–187). Ann Arbor, MI: University of Michigan Press.
Thomson, R., & Foote, J. (2019). Pronunciation teaching: Whose ethical domain is it anyways? In J.
Levis, C. Nagle & E. Todey (Eds.), Proceedings of the 10th Pronunciation in Second Language
Learning and Teaching Conference, ISSN 2380-9566, Ames, IA, September 2018 (pp. 226–236).
Ames, IA: Iowa State University.
Valdés, G. (2005). Bilingualism, Heritage Language Learners, and SLA research: Opportunities lost or
seized? The Modern Language Journal, 89, 410–426.
Van Riper, C. (1954). Speech correction: Principles and methods (3rd ed.). New York: Prentice-Hall.
Wagner, E., & Toth, P. (2017). The role of pronunciation in the assessment of second language listening
ability. In T. Isaacs & P. Trofimovich (Eds.), Second language pronunciation assessment:
Interdisciplinary perspectives (pp. 72–92). Bristol; Blue Ridge Summit: Multilingual Matters/Channel
View Publications.
Wagner, R., Francis, D., & Morris, R. (2005). Identifying English language learners with learning
disabilities: Key challenges and possible approaches. Learning Disabilities Research and Practice,
20(1), 6–15.
Wallace, G. (1997). Infusing multicultural content into the traditional neurogenics framework. In G.
Wallace (Ed.), Multicultural neurogenics: A resource for Speech-Language Pathologists providing
services to neurologically impaired adults from culturally and linguistically diverse backgrounds
(pp. 115–127). Austin, TX: Pro-Ed.
Washington, J. A., & Craig, H. K. (1994). Dialectal forms during discourse of poor, urban, African
American preschoolers. Journal of Speech and Hearing Research, 37(4), 816–823.
Yan, R., & Oller, J. W. (2007). Processing-dependent measures as a failed solution to the assessment of
individuals from language and dialect minorities. Communicative Disorders Review, 1(3), 1–14.
29
CHILD L2 SPEAKERS WITH
LANGUAGE AND
COMMUNICATION DISORDERS
Johanne Paradis
1 Introduction/Definitions
Children who are dual language learners with language and communication disorders
(LCDs) are the focus of this chapter. In school contexts, these children are often referred to
as students with special education needs. Research on this population follows two main
branches. The first is more theoretical and concerns whether children with LCDs have the
capacity to learn two languages, and whether their dual language learning has unique
characteristics. The second branch is more applied and focuses on issues of referral and
assessment with dual language children, and on which language(s) should be used in inter-
vention, educational programming, and at home for dual language children with LCDs.
The term dual language learners (DLLs) or bilinguals will often be used here because,
unlike adult L2 speakers, child L2 speakers are in the process of acquiring their two lan-
guages simultaneously, and consideration of both L1 and L2 acquisition is common in the
research with this population. Most research on DLLs with LCDs has been conducted with
children who speak a heritage-L1 at home and a majority, societal L2 at school and in the
wider community; that is, children who are primarily first- or second-generation children
from migrant families. Such children are not bilingual by choice but by necessity. To date,
there is limited research on child L2 speakers with LCDs who are learning their L2 by choice,
for example, through an immersion education programme, and so this will not be covered
here (see Kay-Raining Bird, Genesee et al., 2020 for information).
Children with LCDs have developmental disorders that cause impairment in language
development. Developmental disorders are different from acquired disorders in that children
are born with them. Developmental language disorder (DLD; formerly specific language
impairment or SLI) is one of the most common developmental disorders. Children with
DLD present with early language delay that does not resolve, and with difficulties in learning
language that persist until adulthood, but they do not have any other clinically significant
condition (Leonard, 2014). Thus, language disorder is their primary condition. Other chil-
dren present with LCDs as a consequence of another clinical condition. For example, chil-
dren with autism spectrum disorder (ASD) have core deficits in social interaction and
communication generally rather than in language learning specifically. Nevertheless, the
majority of children with ASD present with delay in onset of speaking, and for those who
become verbal, deficits in the pragmatic use of language are very common; some also exhibit
symptoms of language disorder in preschool and school (Schwartz, 2017). Children with
Down syndrome (DS) have moderate-to-severe intellectual disabilities affecting multiple
aspects of their development, including delayed onset and protracted speech–language de-
velopment which rarely exceeds their mental age (Schwartz, 2017). As the existing research
has focused on DLLs with these LCDs, these are the ones covered here.
2 Historical Perspectives
Studies on DLLs and on children with LCDs were pursued separately and mainly in isolation
from each other until the 1990s. Consequently, the discussion begins with early research on
DLLs with typical development and then turns to the more recent intersectional research on
dual language development and disorders. (For historical perspectives on child LCDs, see
Leonard, 2014; Schwartz, 2017.)
Up until the 1970s, research with DLLs was primarily focused on whether early bi-
lingualism suppressed intelligence and therefore was a risk factor for development (Arsenian,
1945; Darcy, 1946). Most of this research was deeply flawed methodologically in that the
“bilinguals” were often beginner learners of the L2 from lower socioeconomic status (SES)
backgrounds than the monolingual comparison groups, which were the likely reasons for
their lower performance on tests conducted in the L2 (Hakuta, 1986). The landmark study of
Peal and Lambert (1962), with French–English bilinguals at a private school in Montreal,
Canada, showed that when SES and proficiency in the L2 were controlled, bilinguals actually
displayed cognitive advantages. The cognitive consequences of dual language learning in
children have been an active line of research ever since (Bialystok, 2011). In the 1970s, systematic
research on the nature and characteristics of child L2 acquisition (as opposed to
adult L2 acquisition) emerged. Developmental versus L1 transfer errors in the L2 speech of
young learners, interdependence and common underlying proficiencies between children’s
two languages, and timelines to native-like proficiency in the L2 were key topics (Cummins,
2000; Dulay et al., 1982). These are still active topics in the field of child L2 acquisition.
In the 1990s, research emerged that focused on issues in clinical practice with DLLs, for
example, challenges in the differential diagnosis of DLLs with typical and atypical devel-
opment given clinical protocols and testing materials based on monolingual mainstream
populations (Guterrez-Clellen Cole 1996; Westernoff, 1991). In the early 21st century, a
sharp increase in studies on DLD (formerly SLI) and dual language development appeared
and the past decade has seen an extension of this research focus to DLLs with ASD and DS.
These studies were not only focused on clinical issues but also on the potential developmental
costs for children with LCDs to learn two languages, which is somewhat reminiscent of the
early 20th-century view that bilingualism could be a risk factor for intellectual development.
systems implicated in language learning (Leonard, 2014), coping with dual language input
could limit uptake and/or overload the system. In the case of ASD, deficits in social inter-
action and pragmatics could limit linguistic input and uptake, and therefore impede language
learning and exacerbate early language delays. For children with DS, intellectual disability in
general, and deficits in auditory memory in particular (Schwartz, 2017), already place bar-
riers to language learning that could be increased through dual language input. Whatever the
rationale, the CEH supports the view that children with LCDs should not be exposed to two
languages, or that bilingualism should be discontinued post-diagnosis on the grounds that
this would be a risk factor in their already compromised development. Existing research does
not show evidence in favour of the CEH; similarly, it does not show evidence in favour of
bilingualism being a risk factor for intelligence.
tend to be language specific in that what might be difficult for English speakers is not difficult
for French or Spanish speakers (Leonard, 2014). These studies found that the bilingual
children with DLD showed the same level of morphosyntactic abilities in each language as
their monolingual age peers with DLD; they showed the same profiles of strengths and
weaknesses with clinical and non-clinical markers as monolinguals with DLD; and their
morphosyntactic profiles with clinical markers were language specific, and thus showed no
evidence of crosslinguistic transfer.
Because French and English are both prestige and majority languages in Canada, it is
possible that these findings would not generalize to other simultaneous or early sequential
bilinguals who speak a heritage-L1 with a majority-L2, since the heritage-L1 would receive
less community support. However, research with Spanish–English DLLs in the United States
(Morgan et al., 2013), with English-L2 DLLs from diverse L1 backgrounds in Canada
(Rezzonico, Chen et al., 2015), and Dutch-L2 DLLs from diverse L1 backgrounds in the
Netherlands (Boerma et al., 2016) shows results consistent with those of French–English children
with DLD: dual language learning in the early years does not add additional difficulties for
children with DLD.
Regarding sequential bilinguals who started learning their L2 at school, comparisons to
their monolingual peers with DLD are complicated by their delay in the onset of L2 acquisition;
in other words, child L2 speakers would always lag behind monolinguals, whether
they have DLD or not. Therefore, studies of sequential bilinguals with DLD tend to compare
them to their typically developing (TD) bilingual peers to ascertain whether the children with
DLD show exceptional delays or unique profiles in their L2 development. A clinical marker
of monolingual English-speaking children with DLD is their protracted development of tense
morphology (e.g., past tense [-ed], or third person singular [-s]; Leonard, 2014). DLLs with
DLD also exhibit protracted development of tense morphology in their English L2, as
compared to their TD DLL peers (Blom & Paradis, 2013; Jacobson & Yu, 2018).
Furthermore, DLLs with DLD can show similar abilities to their TD DLL peers in their
acquisition of morphosyntactic structures that are not clinical markers, also parallel to
monolinguals (Paradis, 2010). Thus, in their English L2 acquisition, children with TD and
with DLD display the same profiles as their monolingual peers with TD and with DLD; it is
just that these profiles extend longer through the elementary school years. Moving beyond
morphosyntax, researchers have found that DLLs with DLD show no exceptional delays,
and have similar profiles of strengths and weaknesses in their L2 lexical and narrative skills
as those of monolingual children with DLD (Boerma et al., 2016; Govindarajan & Paradis,
2019; Sheng et al., 2013). Taken together, studies on morphosyntax, lexical, and narrative
skills indicate that the weight of evidence is against the CEH for bilingual children with DLD
(but see Verhoeven et al., 2011).
Turning to children with ASD, most research has focused on the impact of dual language
exposure on language and communication abilities in the preschool years. Comparisons
between DLLs and monolinguals with ASD have consistently found that dual language
exposure in the preschool years does not negatively impact children’s overall development
beyond the expected impacts of having ASD (Drysdale et al., 2015; Ohashi, Mirenda, et al.,
2012; Wang et al., 2018). Specifically, dual language exposure did not cause later onset of
first words, weaker expressive and receptive language skills, disadvantages in communicative
functioning or an increase in non-linguistic ASD characteristics.
In contrast to the research on DLLs with DLD, there is less focus on examining profiles in
L2 acquisition of school-age children with ASD and their TD peers. This is due, in part, to the
variable expressive language trajectories of children with ASD – some are minimally verbal
during elementary school while others have language abilities that have normalized for the
most part (Schwartz, 2017). Expressive narrative skills are often examined in school-age
children with ASD because they implicate all linguistic domains, crucially including discourse
pragmatics, which is a known weakness for children with ASD. Studies with DLLs from
diverse L1 backgrounds have found that children with ASD had less coherent story structures
and less frequent use of mental state vocabulary than their TD peers in their L2 narratives,
which is consistent with the profile of monolinguals with ASD compared to their neurotypical
peers (Govindarajan, 2020; Hoang et al., 2018). Furthermore, Gonzalez‐Barrero and Nadig
(2018) found that DLLs with ASD who were receiving the majority of their language input in
their French L2 had similar vocabulary and morphological skills to their neurotypical DLL
peers (see also Paradis et al., 2018).
There is even less research on DLLs with DS than on DLLs with ASD. Studies with
adolescents, either French–English or Spanish–English bilinguals, indicate that they show
abilities in their dominant language, English, in line with their mental ages, parallel to the
profile of monolingual adolescents with DS (Kay-Raining Bird, Cleave et al., 2005; Edgin
et al., 2011; Trudeau et al., 2011). DLLs with DS have similar profiles to their monolingual
peers with DS for morphosyntactic acquisition and word learning skills (Cleave et al., 2014;
Feltmate & Kay-Raining Bird, 2008). While both DS and ASD have more profound con-
sequences for children’s overall development than DLD, the limited research suggests that
the weight of evidence is not in favour of the CEH for these populations either.
are usually referenced by age or grade. Therefore, use of monolingual norms in the L2 can
lead to TD children being over-identified as having an LCD because their scores are low for
their age, as referenced to their monolingual peers (Paradis et al., 2013). Similar over-
identification can occur using tests in the L1 normed with monolingual rather than heritage
speakers of that language (Barragan et al., 2018).
instead of static testing procedures with L2 tests would aid in gauging learning capacity
separately from existing language knowledge. Because children with LCDs have reduced
language learning capacities, the dynamic component would assist in distinguishing them
from their TD peers regardless of existing knowledge of the language of testing or cultural
mismatches. A final strategy, alternative norm referencing, focuses on changing the norm-
referencing system for tests in the L2, rather than changing the tests themselves, to reduce
bias in assessment. The logic here is that the performance of DLLs should be benchmarked
to other DLLs, not to monolinguals. Re-norming tests for child L2 speakers could be doable
in many school districts that conduct districtwide testing at regular intervals based on ages or
grades. Test scores from DLLs could be separated from the aggregate data and used to
create local bilingual norms for subsequent testing of DLLs in the district. DLLs with scores
below a certain threshold on the bilingual norms could be considered at risk for language or
learning disorders, prompting further assessment. For more details on these strategies, tools
for implementing them and the research base underlying them, see Paradis et al. (2021).
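The re-norming procedure described above can be illustrated with a minimal sketch in Python. It is not taken from the chapter or from any published tool: the record layout, the function names, the scores, and the cutoff of 1.25 standard deviations below the mean are all assumptions chosen for illustration only. The sketch separates DLL scores from the aggregate districtwide data, builds local norms per grade, and checks whether a child falls below the chosen threshold on those norms.

```python
from statistics import mean, stdev


def build_local_norms(records, group):
    """Return {grade: (mean, sd)} computed only from one learner group."""
    scores_by_grade = {}
    for record in records:
        if record["group"] == group:
            scores_by_grade.setdefault(record["grade"], []).append(record["score"])
    # A sample standard deviation needs at least two scores per grade.
    return {
        grade: (mean(scores), stdev(scores))
        for grade, scores in scores_by_grade.items()
        if len(scores) >= 2
    }


def z_score(score, grade, norms):
    """Standardize a score against the norm for the child's grade."""
    grade_mean, grade_sd = norms[grade]
    if grade_sd == 0:
        return 0.0  # all reference scores identical; no spread to compare against
    return (score - grade_mean) / grade_sd


def is_at_risk(score, grade, norms, cutoff=-1.25):
    """True if the score falls below the (hypothetical) cutoff on the given norms."""
    return z_score(score, grade, norms) < cutoff


# Invented districtwide results for one grade; not real data.
district_scores = (
    [{"group": "DLL", "grade": 2, "score": s} for s in (70, 75, 80, 85, 90)]
    + [{"group": "monolingual", "grade": 2, "score": s} for s in (95, 100, 105)]
)

dll_norms = build_local_norms(district_scores, "DLL")
monolingual_norms = build_local_norms(district_scores, "monolingual")
```

With these invented numbers, a DLL who scores 78 falls below the cutoff when benchmarked to monolingual norms but not when benchmarked to the local bilingual norms, whereas a DLL who scores 68 is flagged on the bilingual norms as well and would be referred for further assessment. In practice, the threshold and the minimum sample size per grade would have to be set on empirical grounds.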
doing language and communication exercises with their child, or for fully expressing
themselves more generally, and (3) they valued their child’s ability to communicate with
extended family members and to develop their ethnic identity.
A final point regards the benefits of bilingualism for heritage-L1 children.
Interdependence between the L1 and L2 that supports L2 development and the potential
cognitive advantages of bilingualism are widely recognized, and there is no reason to believe
they apply any less to heritage-L1 children with LCDs. Another key reason for supporting
continued development of the heritage-L1 is children’s socio-emotional well-being, as
individuals and within the family unit, which is dependent on parents and children being able
to communicate easily with each other, and thus is enhanced by children’s proficiency in the
heritage-L1 (Oh & Fuligni, 2020). Because parents of children with moderate-to-severe
LCDs are likely to be long-term caregivers, and the family context likely to be a primary
source of social interaction, elective monolingualism of one family member, the child with an
LCD, could be detrimental to that child’s overall development.
6 Future Directions
Research at the intersection of child bilingualism and developmental disorders has grown
immensely since the start of the 21st century and has increased our understanding of the
capacity for children with LCDs to learn two languages, the strategies to improve the accuracy of
speech–language assessment with DLLs, and the importance of and potential for
supporting both languages in intervention. Nevertheless, there are still gaps in knowledge
that future research should address. First, more information is needed on the char-
acteristics of L2 acquisition beyond the pre-school years for children with LCDs. Fully
understanding the capacity for bilingualism must include a long-term perspective, espe-
cially to see if L2 outcomes meet expectations in line with the developmental disorder,
what the connections between language and reading development are, and if the heritage-
L1 is at greater risk of attrition than for TD DLLs. Second, more research on dual
language development and disorders beyond DLD is needed. More systematic information
on DLLs with ASD and DS, as well as on DLLs with hearing impairment,
attention-deficit-hyperactivity disorder and dyslexia, would be useful for educators and clinicians
because these additional developmental disorders can have an impact on language and
reading development in the school years.
Further Reading
Goldstein, B., & Conboy, B. (Eds.). (2021). Bilingual language development and disorders in
Spanish–English speakers (3rd edn). Baltimore, MD: Paul H. Brookes Publishing.
An edited volume covering topics in dual language development of Spanish-English children with
relevance for clinical practice. Chapters are organized around different language and literacy domains
providing a comprehensive research review and discussion of phonological, lexical, semantic,
morphosyntactic, narrative and reading development in Spanish-English TD children and children
with LCDs.
Paradis, J., Genesee, F., & Crago, M. (2021). Dual language development and disorders: A handbook on
bilingualism and second language learning (3rd edn). Baltimore, MD: Paul H. Brookes Publishing.
This comprehensive research review and discussion of dual language development in children with typical and
atypical development covers simultaneous bilinguals, child L2 acquisition, heritage language acquisition and
second language education, including chapters on language and reading disorders in DLLs.
Peña, E., Bedore, L., & Baron, A. (2017). Bilingualism and child language disorders. In R. Schwartz
(Ed.), Handbook of child language disorders (2nd edn, pp. 297–327). New York, NY: Routledge.
A concise overview of the issues in clinical practice with bilingual children, focusing on dual language
assessment and intervention with Spanish-English speaking children.
Roseberry-McKibbin, C. (2018). Multicultural students with special language needs: Practical strategies
for assessment and intervention (5th edn). Oceanside, CA: Academic Communication Associates.
This book has a strong applied orientation and includes practical strategies and resources for clinicians
and educators working with culturally and linguistically diverse children in preschool and elementary
school.
References
Arsenian, S. (1945). Bilingualism in the post-war world. Psychological Bulletin, 42, 65–85.
Barragan, B., Castilla-Earls, A., Martinez-Nieto, L., Restrepo, M., & Gray, S. (2018). Performance of
low-income dual language learners attending English-only schools on the Clinical Evaluation of
Language Fundamentals–Fourth Edition, Spanish. Language, Speech, and Hearing Services in
Schools, 49(2), 292–305. doi: 10.1044/2017_LSHSS-17-0013
Bialystok, E. (2011). Reshaping the mind: The benefits of bilingualism. Canadian Journal of Experimental
Psychology/Revue canadienne de psychologie expérimentale, 65(4), 229–235. doi: 10.1037/a0025406
Blom, E., & Paradis, J. (2013). Past tense production by English second language learners with and
without impairment. Journal of Speech, Language and Hearing Research, 56, 1–14.
Boerma, T., Leseman, P., Timmermeister, M., Wijnen, F., & Blom, E. (2016). Narrative abilities of
monolingual and bilingual children with and without language impairment: Implications for clinical
practice. International Journal of Language and Communication Disorders, 51(6), 626–638.
Caesar, L., & Kohler, P. (2007). The state of school-based bilingual assessment: Actual practice versus
recommended guidelines. Language, Speech and Hearing Services in Schools, 38, 190–200.
Cheatham, G. A., Santos, R. M., & Kerkutluoglu, A. (2012). Review of comparison studies in-
vestigating bilingualism and bilingual instruction for students with disabilities. Focus on Exceptional
Children, 45(3), 1–12.
Chondrogianni, V., & Marinis, T. (2011). Asynchronous development of vocabulary, morphology and
complex syntax in successive bilingual children: Differential effects of internal and external factors.
Linguistic Approaches to Bilingualism, 1, 318–345.
Cleave, P., Kay-Raining Bird, E., Trudeau, N., & Sutton, A. (2014). Syntactic bootstrapping in children
with Down syndrome: The impact of bilingualism. Journal of Communication Disorders, 49, 42–54.
Cummins, J. (2000). Language, power and pedagogy: Bilingual children in the crossfire. Clevedon,
England: Multilingual Matters.
Darcy, N. (1946). The effect of bilingualism upon the measurement of the intelligence of children of
preschool age. Journal of Educational Psychology, 37, 21–44.
DeNavas-Walt, C., & Proctor, B. D. (2015). Income and poverty in the United States: 2014. Washington,
DC: U.S. Census Bureau.
Drysdale, H., van der Meer, L., & Kagohara, D. (2015). Children with autism spectrum disorder from
bilingual families: A systematic review. Review Journal of Autism and Developmental Disorders, 2,
26–38. doi: 10.1007/s40489-014-0032-7
Dulay, H., Burt, M., & Krashen, S. (1982). Language two. Oxford, UK: Oxford University Press.
Durán, L. K., Hartzheim, D., Lund, E. M., Simonsmeier, V., & Kohlmeier, T. L. (2016). Bilingual and
home language interventions with young dual language learners: A research synthesis. Language,
Speech, and Hearing Services in the Schools, 47, 347–371.
Ebert, K., Kohnert, K., Pham, G., Disher, J., & Payesteh, B. (2014). Three treatments for bilingual
children with primary language impairment: Examining cross-linguistic and cross-domain effects.
Journal of Speech, Language, and Hearing Research, 57, 172–186.
Edgin, J. O., Kumar, A., Spano, G., & Nadel, L. (2011). Neuropsychological effects of second language
exposure in Down syndrome. Journal of Intellectual Disability Research, 55, 351–356.
Feltmate, K., & Kay-Raining Bird, E. (2008). Language learning in four bilingual children with Down
syndrome: A detailed analysis of vocabulary and morphosyntax. Canadian Journal of Speech-
Language Pathology and Audiology, 32, 6–20.
423
Johanne Paradis
Gonzalez‐Barrero, A. M., & Nadig, A. (2018). Bilingual children with autism spectrum disorders: The
impact of amount of language exposure on vocabulary and morphological skills at school age.
Autism Research, 11, 1667–1678. doi: 10.1002/aur.2023
Govindarajan, K. (2020). Narrative abilities of bilingual children with autism spectrum disorder, devel-
opmental language disorder and typical development. Unpublished doctoral dissertation, University
of Alberta, Canada.
Govindarajan, K., & Paradis, J. (2019). Narrative abilities of bilingual children with and without de-
velopmental language disorder (SLI): Differentiation and the role of age and input factors. Journal
of Communication Disorders, 77, 1–16.
Graham, H. R., Minhas, R. S., & Paxton, G. (2016). Learning problems in children of
refugee background: A systematic review. Pediatrics, 137(6), e20153994. doi: doi: 10.1542/peds.
2015-3994
Guterrez-Clellen, V. (1996). Language diversity: Implications for assessment. In K. Cole, P. Dale, & D.
Thal (Eds.), Assessment of communication and language (pp. 29–56). Baltimore: Brookes.
Gutiérrez-Clellen, V., Restrepo, A., & Simon-Cereijido, G. (2006). Evaluating the discriminant accu-
racy of a grammatical measure with Spanish-speaking children. Journal of Speech, Language and
Hearing Research, 49, 1209–1223.
Hampton, S., Rabagliati, H., Sorace, A., & Fletcher-Watson, S. (2017). Autism and bilingualism: A
qualitative interview study of parents’ perspectives and experiences. Journal of Speech, Language and
Hearing Research, 60, 435–446.
Hakuta, K. (1986). The mirror of language. The debate on bilingualism. New York: Basic Books.
Hoang, H., Gonzalez-Barrero, A. M., & Nadig, A. (2018). Narrative skills of bilingual children with
autism spectrum disorder. Discours, 23. Retrieved from http://journals.openedition.org/discours/985
6. doi: 10.4000/discours.9856
Jacobson, P. F., & Yu, Y. H. (2018). Age-related changes in English past tense by bilingual children
with and without developmental language disorders. Journal of Speech-Language and Hearing
Research, 61(10), 2532–2546.
Jegatheesan, B. (2011). Multilingualism and autism: Perspectives of South Asian Muslim immigrant
parents on raising a child with a communicative disorder in multilingual contexts. Bilingual Research
Journal, 34, 185–200.
Johnson, Y. U., Martinez-Cantu, V., Jacobson, A. L., & Weir, C.-M. (2012). The home instruction for
parents of preschool youngsters ’program’s relationship with mother and school outcomes. Early
Education & Development, 23(5), 713–727.
Kay-Raining Bird, E., Cleave, P., Trudeau, N., Thordardottir, E., Sutton, A., & Thorpe, A. (2005). The
language abilities of bilingual children with Down Syndrome. American Journal of Speech-Language
Pathology, 14, 187–199.
Kay-Raining Bird, E., Genesee, F., Sutton, A., Chen, X., Oracheski, J., Pagan, S., Squires, B., Burchell,
D., & Sorenson, T. D. (2020). Access and outcomes of children with special education needs in early
French immersion. Journal of Immersion and Content-Based Language Instruction. doi: 10.1075/jicb.2
0012.kay
Kay‐Raining Bird, E., Lamond, E., & Holden, J. (2012). Survey of bilingualism in autism spectrum
disorders. International Journal of Language & Communication Disorders, 47, 52–64. doi: 10.1111/j.14
60-6984.2011.00071.x
Kohnert, K. (2010). Bilingual children with primary language impairment: Issues, evidence and im-
plications for clinical actions. Journal of Communication Disorders, 43, 456–473. doi: 10.1016/
j.jcomdis.2010.02.002
Kohnert, K., Yim, D. S., Nett, K., Kan, P. F., & Duran, L. (2005). Intervention with linguistically
diverse preschool children: A focus on developing home language(s). Language, Speech and Hearing
Services in Schools, 36, 251–263.
Lesaux, N. K., Rupp, A. A., & Siegel, L.S. (2007). Growth in reading skills of children from diverse
linguistic backgrounds: Findings from a 5-year longitudinal study. Journal of Educational
Psychology, 99(4), 821–834.
Leonard, L. (2014). Introduction (Chapter 1; pp 3-35). Children with specific language impairment (2nd
edn). Cambridge, MA: MIT Books.
Lim, N., O’Reilly, M. F., Sigafoos, J., Ledbetter-Cho, K., & Lancioni, G. E. (2019). Should heritage
languages be incorporated into interventions for bilingual individuals with neurodevelopmental
disorders? A systematic review. Journal of Autism and Developmental Disorders, 49, 887–912. doi: 1
0.1007/s10803-018-3790-8
424
Child L2 Language Disorders
Morgan, G., Restrepo, M., & Auza, A. (2013). Comparison of Spanish morphology in monolingual
and Spanish–English bilingual children with and without language impairment. Bilingualism:
Language and Cognition, 16(3), 578–596. doi: 10.1017/S1366728912000697
Oh, J. S., & Fuligni, A. J. (2010). The role of heritage language development in the ethnic identity and
family relationships of adolescents from immigrant backgrounds. Social Development, 19, 202–220.
doi: 10.1111/j.1467-9507.2008.00530.x
Ohashi, J. K., Mirenda, P., Marinova-Todd, S., Hambly, C., Fombonne, E., Szatmari, P., &
Thompson, A. (2012). Comparing early language development in monolingual- and bilingual-
exposed young children with autism spectrum disorders. Research in Autism Spectrum Disorders,
6(2), 890–897.
Oller, D. K., Pearson, B., & Cobo-Lewis, A. B. (2007). Profile effects in early bilingual language and
literacy. Applied Psycholinguistics, 28, 191–230.
Paradis, J. (2007). Bilingual children with SLI: Theoretical and applied issues. Applied Psycholinguistics,
28, 512–564.
Paradis, J. (2010). The interface between bilingual development and specific language impairment.
Keynote article for special issue with peer commentaries. Applied Psycholinguistics, 31, 3–28.
Paradis, J., Genesee, F., & Crago, M. (2021). Dual language development and disorders: A handbook on
bilingualism and second language learning (3rd edn). Baltimore: Brookes Publishing.
Paradis, J., Govindarajan, K., & Hernandez, K. (2018). Bilingual development in children with autism
spectrum disorder from newcomer families. Education and Research Archive. 10.7939/R31V5BT9X
Paradis, J., Schneider, P., & Sorenson Duncan, T. (2013). Discriminating children with language im-
pairment among English language learners from diverse first language backgrounds. Journal of
Speech, Language and Hearing Research, 56, 971–981.
Peal, E., & Lambert, W. E. (1962). The relation of bilingualism to intelligence. Psychological
Monographs, 76, 1–23.
Peña, E., Bedore, L., & Baron, A. (2017). Bilingualism and child language disorders. In R. Schwartz
(Ed.), Handbook of child language disorders (2nd edn, pp. 297–327). New York, NY: Routledge.
Pesco, D., & Crago, M. B. (2017). Language socialization in Canadian indigenous communities. In P.
Duff & S. May (Eds.), Language socialization. Encyclopedia of language and education (3rd edn,
pp. 291–307). Springer, Cham.
Prevoo, M. J. L., Malda, M., Mesman, J., Emmen, R. A. G., Yeniad, N., Van Ijzendoorn, M. H., &
Linting, M. (2014). Predicting ethnic minority children’s vocabulary from socioeconomic status,
maternal language and home reading input: Different pathways for host and ethnic language.
Journal of Child Language, 41(5), 963–984.
Rezzonico, S., Chen, X., Cleave, P. L., Greenberg, J., Hipfner-Boucher, K., Johnson, C. J., Milburn, T.,
Pelletier, J., Weitzman, E., & Girolametto L. (2015). Oral narratives in monolingual and bilingual
preschoolers with SLI. International Journal of Language and Communication Disorders, 50(6), 830–841.
Roseberry-McKibbin, C. (2018). Multicultural students with special language needs: Practical strategies
for assessment and intervention (5th edn). Oceanside, CA: Academic Communication Associates.
Schwartz, R. (2017). Handbook of child language disorders. New York, NY: Routledge.
Sheng, L., Bedore, L. M., Peña, E. D., & Taliancich-Klinger, C. (2013). Semantic convergence in
Spanish-English bilingual children with primary language impairment. Journal of Speech, Language,
and Hearing Research, 56(2), 766–777.
Simon-Cereijido, G., Gutiérrez-Clellen, V. F., & Sweet, M. (2013). Predictors of growth or attrition of
the first language in Latino children with specific language impairment. Applied Psycholinguistics,
34(6), 1219–1243. doi: 10.1017/S0142716412000215
Sorenson Duncan, T ., & Paradis, J. (2016). English language learners’ nonword repetition perfor-
mance: The influence of L2 vocabulary size, length of L2 exposure and L1 phonology. Journal of
Speech Language and Hearing Research, 59, 39–48.
Soto-Corominas, A., Paradis, J., Al Janaideh, R., Vitoroulis, I., Chen, X., Georgiades, K., Jenkins, J.,
& Gottardo, A. (2020). Socioemotional wellbeing influences bilingual and biliteracy development in
Syrian refugee children. In M. Brown & A. Kohut (Eds.), Proceedings of the 44thBoston university
conference on language development (pp. 620–633). Somerville, MA: Cascadilla Press.
Stewart, J., El Chaar, D., McCluskey, K., & Borgardt, K. (2019). Refugee student integration: A focus
on settlement, education, and psychosocial support. Journal of Contemporary Issues in Education,
14(1), 55–70.
Trudeau, N., Kay-Raining Bird, E., Sutton, A., & Cleave, P. (2011). Développement lexical chez les
enfants bilingues ayant le syndrome de Down. Enfance, 2011(3), 383–404.
425
Johanne Paradis
Verhoeven, L., Steenge, J., & van Balkom, H. (2011). Verb morphology as a clinical marker of specific
language impairment: Evidence form first and second language learners. Research in Developmental
Disabilities, 32, 1186–1193.
Wang, M., Jegathesan, T., Young, E., Huber, J., & Minhas, R. (2018). Raising Children with
Autism Spectrum Disorders in Monolingual vs Bilingual Homes: A Scoping Review. Journal of
Developmental and Behavioral Pediatrics, 39(5), 434–446.
Westernoff, F. (1991). The assessment of communication disorders in second language learners. Journal
of Speech-Language Pathology, 15(4), 73–79.
Yohani, S., Brosinsky, L., & Kirova, A. (2019). Syrian refugee families with young children: An ex-
amination of strengths and challenges during early resettlement. Journal of Contemporary Issues in
Education, 14(1), 13–32.
Yoshida, Y., & Amoyaw, J. (2020). Looking beyond labour market integration: Household conditions
surrounding refugee children in Canada. In A. Korntheuer, D. B. Maehler, P. Pritchard, & L.
Wilkinson (Eds.), Refugees in Canada and Germany: Responses in policy and practice. Köln: GESIS -
Leibniz-Institut für Sozialwissenschaften.
Yu, B. (2013). Issues in bilingualism and heritage language maintenance: Perspectives of minority-
language mothers of children with autism spectrum disorders. American Journal of Speech-Language
Pathology, 22(1), 10–24.
426
30
TRAINING INTERPRETERS
Jim Hlavac
1 Introduction/Definitions
Interpreter training is a younger area of research than second language acquisition (SLA).
The oldest schools were established in the 1940s and 1950s, and it was not until the 1980s
that interpreting pedagogy emerged as an area distinct from translation pedagogy. Similar to
SLA, interpreter training has been influenced by various approaches describing linguistic
production, but there has been relatively little cross-over. This is surprising as both
disciplines have a strong focus on the linguistic and extra-linguistic abilities of learners.
Learning and information processing strategies utilized by successful interpreter trainees are
those that many language learners employ (Zannirato, 2008). In interpreter pedagogy,
strategies that successful learners should develop are commonly found in SLA research:
self-motivation (Dörnyei, 1994), segmentation of input (Pica, 1994), anticipation and inferencing
(Laviosa, 2014), restructuring and paraphrasing (Nabei & Swain, 2002), use of prosodic and
non-verbal features (Jenkins & Parra, 2003), memorizing input (Gu & Johnson, 1996), and
monitoring output of production and repairing errors (Kormos, 2006). Reflecting on SLA
from an Interpreter Studies perspective, Dejean (2000, p. 9) considers that “methods used by
interpreting students to perfect a language can obviously be of interest to those who wish to
achieve a true command of a foreign language.”
Essentially, interpreting is the transfer of verbal or signed messages from one language
into another. There are also situational factors pertaining to how most interpreting is
performed: immediacy and finality. Thus, a widely accepted contemporary definition of
interpreting is: “Interpreting is a form of Translation in which a first and final rendition in another
language is produced on the basis of a one-time presentation of an utterance in a source
language” (Pöchhacker, 2016, p. 11. Original emphasis).
Terms such as “one-time presentation” and “first and final rendition” refer to the
mental and verbal dexterity that interpreters must have to readily understand, remember,
transfer, and re-produce speech from one language into another. This is the hallmark of
interpreting and why it is a “special” kind of speaking with at least six levels; components of
the first four are addressed here:
Here, I examine interpreting between spoken languages only, not sign language interpreting;
almost all sign interpreters are L1 speakers of the spoken language. The interested
reader is referred to Nicodemus and Emmorey (2015).
Here, I focus on individuals interpreting from the L1 into their L2. Interpreters with
three or more languages typically still work into their L1 or L2 only. Within Interpreting
Studies, A, B, and C are used to refer to interpreters’ L1, L2, and L3, respectively (with C
encompassing L3, L4, and L5) (AIIC, 2012). To align this chapter with the rest of this book,
the terms L1 and L2 are used, with the latter term employed as a hypernym for all
non-dominant languages.
Terms specific to interpreting are source and target. Source refers to the original speech
form that the interpreter hears, or source speech/text. By analogy, the language from which
they interpret is the source language. The term target refers to the interpreter’s spoken
output, that is, their target speech/text, and the language into which they interpret, the target
language. Target is to be understood in this way, and should not be confused with the use of
target in SLA where it refers to an optimal form that a learner typically aspires to.
Four modes of interpreting exist: dialogue interpreting of stretches of source speech of 1–50
words, using short-term memory skills with minimal or no note-taking, from and into both
languages (bidirectionally); consecutive interpreting of source speeches of >50 words
(usually between 150 and 1,000 words), using notes and memory skills and working either
monodirectionally or bidirectionally; simultaneous interpreting, which is almost
contemporaneous with the source speech, with a delay of 0.5–5 seconds (called décalage), usually
working monodirectionally, less often bidirectionally; and sight translation, which is the reading
of a written source text and giving a spoken interpretation of it, usually monodirectionally. All
modes of interpreting can require interpreters to work bidirectionally to some extent, thus
giving spoken output in their L2 as well as their L1.
Interpreting is commonly defined according to the field in which the interpreter works.
There are two main fields, with some overlap: conference interpreting, encompassing
international meetings, high-level government and diplomats’ meetings, business and commerce,
media and press conferences; and public-service or community interpreting, encompassing
healthcare, social welfare and other government services, police/legal/asylum/courtroom/
prison settings, education, housing and employment, family violence, sport, faith-based
organizations, humanitarian and emergency situations, retail and customer service,
and some media events. Conference interpreting modes include simultaneous and
consecutive interpreting (very common), and dialogue interpreting and sight translation (less
common). The modes in public-service interpreting are dialogue and consecutive interpreting
(very common), and sight translation and simultaneous interpreting (less common). While
many interpreters favour working in one or two modes only (most untrained interpreters can
undertake dialogue interpreting and sight translation only) and many specialize in one field
or thematic area or receive work mainly in this field, the spread of fields means that most
interpreters work into their L2 with some frequency.
2 Historical Perspectives
The training of interpreters dates back to the mid-20th century; models of interpreting
performance are even more recent. Attempts to locate speaking in either pedagogically based
or practice-based descriptions of interpreting yield few instances in which it is overtly
mentioned. Speaking is conceived of as a feature of performance where attention is focused
on other things – usually the fidelity of transfer of the referential content from one language
into another. This is unsurprising for a discipline which by name implicitly refers to the
transfer of messages cross-linguistically.
Examining the professionalization of interpreting and comments received by the “first
generation” of professional interpreters about their performance, Baigorri-Jalón (2004, p. 82)
surmises that “the average listener appreciated more the rhetorical fluency than the presumed
accuracy of their interpretations.” This statement applies to almost all users of interpreting
services: it is the interpreter’s speaking skills that strongly determine others’ notions of their
performance rather than accuracy of translation; the latter is something that most users are
unable to ascertain (Kurz, 2001).
Interpreting into an L2 compared to an L1 has been a hotly debated issue in both training
and practice. Fluency, grammatical accuracy, and rhetorical skills are more easily displayed
in one’s L1, and interpreting into the L1 as the preferred direction became a principle
advocated by one of the earliest interpreter educators, Danica Seleskovitch (1978). But an
equally influential school of interpreter training focuses on the source speech being the
interpreter’s L1. The reasoning behind this is that the interpreter must fully understand the
source speaker and this is most likely when the language of the source speaker is also the
interpreter’s L1. Thus, the interpreter’s interpretation into the L2 may contain phonological,
grammatical or other shortcomings, but the level of accuracy in the transfer of referential
content is likely to be higher because there is less chance of misunderstanding the source
speech (Denissenko, 1989).
Empirical studies on directionality were not undertaken until this century; the evidence
from these is mixed. Studies examining linguistic accuracy in target speech production
amongst the same cohorts of interpreters have found that interpreting into the L1 is superior
(Chang & Schallert, 2007). Other studies that tested for the variables of anticipation, that is,
predicting source-speech constituents not yet available for the interpreter’s output planning
(Kurz & Färber, 2003) or working memory taking in L1 input (Gorton, 2012) have found
interpreting into the L2 to be superior. Kalina (2005) and Gile (2009) argue that factors such
as the language-pair combinations, topic area, and familiarity with content can outweigh the
issue of directionality into one’s L1 or L2. Surveys of users such as conference delegates
reveal no strong preference for listening to interpreters speaking into the interpreters’ L1
compared to their L2 (Donovan, 2004). These developments are mirrored by similar ones in
SLA and teacher training that show that the existence of an L2 accent should not prevent a
person from working as an educator (Derwing et al., 2014).
The debate is now largely over; most university programmes teaching conference or
simultaneous interpreting include training for and assessment of students to work into their L2
(Lim, 2005), albeit with a lower weighting of assessment than into the L1 (EMCI, 2018). The
debate has also been put to rest due to the emergence of public-service interpreting in which
Full Functional Proficiency. Able to use language fluently and accurately on all
levels pertinent to professional needs. Examples—Understands the details and
ramifications of concepts that are culturally or conceptually different from one’s own.
Can set the tone of interpersonal, official, semi-official, professional, and
non-professional verbal exchanges with a representative range of native speakers (for all
audiences, purposes, tasks, and settings). Can play an effective role among native
speakers in such contexts as negotiations, conferences, lectures, and debates on
matters of disagreement. Can advocate a position at length, both formally and in
chance encounters, using sophisticated verbal strategies.
(ASTM International, 2007, p. 2).
SLA activity has been most widely researched in relation to English as the L2: several such
studies have appeared in East Asian countries such as Japan, South Korea, and China (Lee,
2014). Interpreting exercises are employed to counter some students’ reticence to speak or
other students’ difficulties in “knowing what to say”; thus L1 input is used as a catalyst for
learners to speak in the L2 via inter-linguistic transfer. For instance, Lee (2014) reports the
use of sight translation and consecutive interpreting of sentence-long L1 segments that
learners recorded on their smartphones. Learners then compared their L2 target speech with
aspirational L2 target renditions provided by the educator. Students’ confidence levels are
reported to increase, but no other outcomes in measuring spoken L2 are reported. In general
interpreter training, exercises of transferring utterance-length source speech into target speech
are restricted to the early stages of learning dialogue interpreting only.
Another development specific to some universities in East Asian countries is interpreting
streams (usually 2–6 semesters) designated as an English Language Major or Translation and
Interpreting Specialisation. In Japan, a national policy, the “Action plan to foster Japanese
who can use English,” was launched in 2003 which led to a large expansion of interpreting
and translation programmes in universities (Komatsu, 2016).
In a study of an interpreting programme at a Japanese university with data from 8
instructors and 19 students in a 6-semester programme, Giustini (2020, pp. 8–11) identified a
sub-optimal level of speaking proficiency in English at the entry point (B2 on the CEFR) as one
factor in students’ performance. Great difficulties were encountered and reported
by students (and instructors) in interpreting consecutively and simultaneously into their L2,
English. Only dialogue interpreting was well-performed, providing learners with a sense of
improving their spoken proficiency in English (Giustini, 2020). These findings make sense
only when one considers that the overall goal of the programme was not to equip learners to
become interpreters. Trainers were unconcerned if learners could not adequately perform
consecutive and simultaneous interpreting. According to one instructor, “the aim of the
course is to provide language instruction through practice-related interpreting subjects…We
use this innovative training system so that students pursue a systematic acquisition of
communicative skills” (Giustini, 2020, p. 7). In a wider context, this statement also makes
sense when one considers that most practising interpreters in Japan are not graduates of
university interpreting programmes but graduates of private-sector vocational colleges
affiliated with corporations or interpreting agencies (Komatsu, 2016). Those students in
university programmes who do hope to become interpreters are reported to “experience
disappointment and disheartenment over their training, with possibly more negative than
positive effects on their ultimate acquisition of communicative competence in English”
(Giustini, 2020, p. 12).
pleasant voice, fluency of delivery, logical cohesion of utterance, correct grammatical usage,
use of correct terminology, and use of appropriate style. In regard to speaking skills, Bühler
(1986) reports that fellow interpreters rate logical cohesion as most important (75%) followed
by correct terminology, fluency of delivery and correct grammar with ratings of 50% or
more. In another survey of interpreters, “sense consistency” (i.e., full transfer of referential
content in a coherent way) was rated most important (Chiaro & Nocella, 2004). The same
survey showed that interpreter colleagues perceive other colleagues’ “foreign accent” in their
L2 to be the least important impediment to accomplished interpreting, but this view may not
be shared by non-interpreters.
Kurz and Pöchhacker’s (1995) survey of 19 lay clients listed a native accent, together with
a pleasant voice, and fluency of delivery as important qualities. Cheung’s (2013) survey of lay
clients’ perceptions also records preferences for interpretations to be in the interpreter’s L1.
When presented with a choice, users’ preferences are likely to yield higher ratings for L1 over
L2. This is but one feature of performance that is largely disregarded if other presentational
features are performed well. For example, Hale et al. (2011) found that a non-native accent
had no effect on how source speakers in courtroom settings were perceived.
Fluency, referring to a speaker’s ability to draw on a wide variety of alternative turns and
to deliver these with an appropriate speech rate and prosody with few “disfluencies,” is a
difficult quality to elicit specifically in users’ perceptions. Users sometimes identify pauses
and hesitations rather than fluency in an overall sense (Pradas Macías, 2006). Tissi (2000)
showed that source speakers’ disfluencies are not replicated in interpreters’ target speech; this
suggests that fluency is a characteristic of individual speech behaviour. Perceived lack of
fluency is reported to strain users’ comprehension of target speech (Ahrens, 2004).
Intonation and prosody are under-studied areas in Interpreting Studies research.
Intonation, pitch movement across an utterance, has been examined in terms of interpreters’
replication of source-speakers’ intonation and whether there are intonation patterns
characteristic of interpreting in general. One identified by Shlesinger (1994, p. 229) relating to
simultaneous interpreting is low-rise final pitch movement. Ahrens (2004) reports similar
findings regarding final pitch features and suggests that “interpreters do not know how the
source text will continue and therefore avoid intonational closure, in favour of a final pitch
movement that indicates continuation” (Ahrens, 2015, p. 213). By implication, interpreters
view a slightly rising pitch in utterance-final position as less infelicitous than falling
intonation at a juncture point that does not, in hindsight, mark the end of an utterance.
Prosody, the acoustic parameters of pitch, loudness, tempo and rhythm, can be a con-
spicuous feature of interpreters’ speech, especially when interpreting simultaneously.
Anomalies include hesitation pauses, changeable tempo, monotonous or “levelled out” in-
tonation, mismatches with the illocutionary force of the speech act, and prosodic features
inappropriate to the genre of the target speech (Lenglet & Michaux, 2020). Interpreters find it
easier to replicate prosodic features in consecutive interpreting but the processing requirements
of memory retrieval (or note-taking) can still result in target speech sounding “levelled out.”
Targeted training of L2 prosody patterns has an effect on interpreting students’ performance,
not only in recognizing and understanding the function of prosodic features when listening to
source speech, but also in target speech production when working into the L2. Yenkimaleki
and van Heuven (2018) show higher ratings for prosody-related features such as accentedness,
pace, and voice amongst students who received such training in their L2 compared to a control
group of students who did not. This finding is congruent with SLA research: the compre-
hensibility of L2 speakers is enhanced with prosodic instruction (Derwing & Rossiter, 2003).
Coherence refers to the underlying functional connectedness or identity of spoken text. It
is a feature noted usually by consumers of texts, who perceive a text’s coherence in terms of
their knowledge of the world, and the inferences and assumptions they make (Gernsbacher &
Givón, 1995). Textual coherence of source speech is crucial to the interpreter who listens and
makes sense of it, and in turn the coherence of their target speech is important to the au-
dience. In users’ judgements of the quality of interpreting, “logical coherence of utterances”
is consistently one of the top-ranking criteria (Grbić, 2008, p. 235). Using Rhetorical
Structure Theory (RST) as a framework, Peng (2009, p. 236) reports that a greater degree of
coherence in the target speech of trainee interpreters is observable when they work into their
L1 compared to their L2, but this contrast was less marked in the performance of profes-
sional interpreters.
Cohesion concerns linguistic features of speech signalling underlying concepts and rela-
tions which aid listeners in “making sense” of what they hear. The observation that trans-
lated texts tend to be more explicitly marked with cohesive devices than their source texts
(Pym, 2005) is also true for interpreted speeches compared to their source speeches although
the frequency of cohesive device markers decreases when the interpreter is working into their
L2 (Peng, 2009).
Pauses can be naturally occurring junctures in interpreted speech as they are in mono-
lingual speech. Depending on rhetorical or intonation features in their environment, they aid
and augment comprehension. But pauses, filled or unfilled, can be seen as disfluencies oc-
curring due to online production of (interpreted) talk. In qualitative studies of simultaneous
interpreters’ performance, Cecot (2001) found that rapidly delivered source speeches result in
equivalently rapidly delivered target speeches with a commensurately small number of pauses
in both, while Ahrens (2004) found that interpreters paused less often than the source
speakers, but their pauses were longer. These findings relate to interpreting into the L1. In
regard to differences in interpreting into one’s L2 compared to L1, Mead (2015) reports that
pauses are more frequent and longer.
Repairs are instantiations of speaker monitoring that function as corrections of speech
that is “erroneous.” Repairs also relate to disruptions in speech production that are probably
naturally occurring. Interpreting requires two levels of monitoring – that of checking the
fidelity of target speech produced in relation to its source speech and that of speech pro-
duction in general. An example of online monitoring in Polish-English simultaneous inter-
preting of interpreters working into their L2 is provided by Kopcyński (1980, p. 85), “our
common…eh…aims will come true…will be achieved.” Such an utterance is more an ex-
ample of “fine-tuning” than actual correction as suggested by Mead (2015, p. 349). It is also
hard to distinguish causality given that repairs in L2 speech are often indicators that en-
coding processes in the L2 have not become fully automatized.
Voice quality relates to the description of phonation types. Supralaryngeal features de-
termine whether a person’s voice sounds breathy, creaky or somehow conspicuous such that
it elicits an aesthetic-evaluative response in listeners. Voice training for interpreting students
remains an under-studied area (Flerov & Jacobs, 2016). A pleasant interpreter voice can have
a more persuasive effect on users than the actual referential content of the target speech and
conversely, an unpleasant voice can undermine good content (Shlesinger, 1994). Iglesias
Fernández (2013) reports that high pitch and nasality can be associated with an interpreter’s
perceived lack of maturity and competence, whereas lower pitch, wider pitch range and
higher resonance are associated with positive attributes such as credibility and reliability.
Specific comments on vocal quality commonly mention prosody and fluency, meaning that
distinctly phonational features such as timbre may be harder to tease out (Iglesias
Fernández, 2013). Although these studies relate mostly to interpreters working into their L1,
results from studies of phonational features of L2 show that L2 speakers have a narrower
pitch range than in their L1 (Zimmerer et al., 2014).
training programmes set a high level of L2 proficiency as a condition for entry as Keiser
(1977) indicated: “interpretation courses are not language courses, in other words…the would-
be student must have mastered his (sic) language before entering into the course…he must
have the required mastery of his active and passive languages before starting the inter-
pretation course otherwise he will constantly stall and stumble under the tremendous pres-
sure of interpretation per se” (p. 13. Original emphasis). Few interpreter courses today set
such a high requirement for L2 spoken proficiency. Interpreter training has expanded
greatly; varying L2 levels and vocational expectations are now a feature of the student profile
of many training courses. I provide an overview of a cross-section of training courses, beginning
with short courses of approximately 50 contact hours and concluding with Master degree
courses. Levels of spoken proficiency may be ascertained via language tests for L2 learners,
such as IELTS for English, or levels aligned to the Common European Framework of
Reference for Languages (CEFR). Training institutions may conduct their own entrance
tests targeting L2 speaking skills, either with or without other formal tests.
Some courses in public-service interpreting have very general or even unspecified formal
linguistic prerequisites, often because they target speakers of several L1s who undertake
language-neutral training, or training with a common language of instruction, usually English
(the L2 for most learners). Where such courses have an entrance test, applicants’ L2 skills
may be assessed through spoken interviews and sight translation of a non-complex 100-word
text in a language other than English into spoken English (Hlavac et al., 2012).
Other vocationally focused courses with interpreting subjects specific to students’
language pairs, such as the one-year full-time Diploma of Interpreting offered at
RMIT University (Melbourne), may have no formal academic prerequisites but do set L2 language
requirements. For English, these are an IELTS (Academic) overall score of 6.0 (speaking not
lower than 5.5), and where the L2 is a language other than English, the requirement is
passing a secondary school final year exam in that language, or completion of an entrance
test (RMIT, 2020).
For 3-year Bachelor degrees providing a grounding in interpreting, the entry-level re-
quirement for L2 spoken proficiency is typically C1 (e.g., University of Vienna, 2017). For
postgraduate Master courses in interpreting, a proficiency level of C2 in the L2 is commonly
required (e.g., University of Vienna, 2018). An equivalent MA in Interpreting at a university
in a predominantly Anglophone country requires an IELTS score of 7.0 (overall) with scores
of no less than 6.5 for speaking for English L2 applicants (Newcastle University, 2020).
The toughest admission requirements of L2 speaking skills are for a specialist 2-year
Master in Simultaneous Interpreting, where applicants experience a 10-week entrance course
testing students’ speaking skills through activities such as paraphrasing, summarising and
performing impromptu role-plays in English (the L2 of most trainees) (Moser-Mercer, 1985).
Spoken proficiency in English is the first criterion for admission and pronunciation/enunciation
the second. Of the four remaining criteria, one is “assertiveness,” that is,
trainees’ confidence in their presentation skills and overall communicative competence
(Moser-Mercer, 1985, pp. 98–99).
In general, many Master courses have entrance-test requirements. In a survey of 18
postgraduate programmes, mainly in Europe, Timarová and Ungoed-Thomas (2008) list
spoken interviews and 2- to 5-minute oral presentations (with preparation time) in the L2 as
well as short consecutive interpreting and sight translation into the L2 as entrance-test tasks.
Interestingly, the authors do not identify shadowing as a task given to course applicants.
Shadowing is an exercise that has been used in the training of simultaneous interpreters since the 1960s. It
is a monolingual exercise performed in the L2 (or L1) and consists of the spoken repetition of
another’s speech with little lag time (phonemic shadowing) or at longer latencies (phrase
shadowing) (Riccardi, 2015). As a dual-task exercise requiring the trainee to follow and
replicate the form of the source speaker’s speech, shadowing draws trainees’ attention to the
form of others’ speech.
In the past decade, diagnostic tools have been trialled to gauge potential trainees’ spoken
ability in L2 (and L1). The syncloze test was developed based on a recording of a 660-word
information text on a well-known topic (e.g., mobility and health) played to entrance-test
candidates. Gaps at regular intervals require candidates to utter forms in the L2 to make the
sentence thematically and/or grammatically complete. Their spoken performance is recorded
and their insertions in the gaps are assessed for lexical, collocational, and grammatical ac-
curacy (Pöchhacker, 2011).
Coming back to the issue of where interpreter training stands sequentially vis-à-vis SLA,
the variation in L2 proficiency for interpreting courses follows a general pattern: the shorter
and less demanding the course, the lower the level of required L2 skills. Still, in the content of
the courses surveyed above, there are no components focusing on language acquisition. The
rapid expansion of interpreting courses at pre-university level, and at undergraduate and
postgraduate levels suggests that many more students undertake training, but it is likely that
their L2 speaking skills vary substantially. This is a feature identified by Angelelli and
Degueldre (2002) who bemoan the paucity of bridging or superior-level courses addressing
the need to further develop the L2 speaking skills of prospective interpreting trainees.
Interpreting trainees typically report their training leads to development of L2 skills, for
instance, learning specialist vocabulary, knowledge of forms used in specific discourse genres,
command of multiple registers, and so on, but these are byproducts of the training. In the
post-training period and in the language services industry, interpreters typically report their
L2 skills continue to advance together with work-related skills such as discourse-
management and the development of business acumen.
6 Future Directions
The geographical spread of English, globalization, and the increasing numbers of speakers of
English as an L2 have led to the term “World Englishes” as a hypernym for all varieties of
English, whether L1 or L2. In many work interactions, English is no one’s L1 and functions
as a lingua franca (see Llurda, this volume). The growing use of English as a lingua franca
(ELF) has had three consequences for interpreting: less work for interpreters in bi- or
international settings where interlocutors increasingly communicate via English; interpreters
receiving source speech input in English now more often work with input coming from L2
users of English; interpreters providing target speech output into English are doing so for a
target audience of more L2 users of English. The last scenario is the focus here, where in-
terpreters working into English as their L2 know that many recipients of their interpretations
are English L2 users. In a descriptive model of conference interpreter competence containing
processes congruent with those identified by Gile (2009) in his Effort Model and others,
including pre- and post-interaction skills, Albl-Mikasa (2013a, p. 19) identifies two pro-
duction skills relating to the above situation: “balancing between high fidelity and audience
design” and “ELF accommodation” (Figure 30.1).
The term “high fidelity,” referring to production, is described by Albl-Mikasa (2013a,
p. 28) as an “ultra-completed rendition.” This metaphor refers to an English L2 rendition
fully reflective of the high register and lexically and/or phraseologically complex structure of
the source speech. Quoting conference interpreters working into English as their L2, Albl-
Mikasa (2013b, p. 10) lists anecdotes such as, “What is the use of throwing in expressions like,
‘that’s a sticky wicket’ when no one understands them?” Experienced interpreters
consciously avoid target speech constructions that match the source speech but are
unlikely to be understood by the recipients of their target speech.
Para-process skills
- business know-how, customer relations, professional standards
- lifelong learning predilection
- meta-reflection
Peri-process skills
- teamwork, cooperation
- unimposing extrovertedness
- instinct and realism
- pressure resistance
Pre-process skills
- high-level command of languages
- low-level terminology management
- informed semi-knowledge
- streamlined preparation
In-process skills
Comprehension skills
- below-expert scanning, identifying, matching
- contextualization
- ELF compensation
Transfer skills
- simultaneity
- capacity relief measures
Production skills
- synchronicity and décalage modulation
- reduction
- balancing between high fidelity and audience design
- ELF accommodation
- performance, presentation, prosody
Post-process skills
- terminology wrap-up
- quality control
Figure 30.1 Process- and experience-based model of interpreter competence. Reprinted from Albl-
Mikasa (2013b, p. 10) with permission
Further, Albl-Mikasa (2013b) invokes Kalina’s (1998) justification for “changing registers”
in target speech for non-native audiences, and refers to sociolinguistic models describing shifts
in style according to audience design (Bell, 1984) and communicative accommodation theory
(Giles & Coupland, 1991). In the case of audience design, interpreters modify their language
style to L2 audiences particularly in dialogic, public-service interpreting settings (i.e., to
“addressees”), and in monologic, conference interpreting settings (i.e., to “auditors”). In the
case of communicative accommodation theory, the means that Giles and Coupland (1991, p. 88)
describe to “modify the complexity of speech (e.g., by decreasing diversity of
vocabulary) or simplifying syntax and increase clarity (by changing pitch, loudness [and]
tempo)…” are strategies that interpreters working into their L2 for a non-native audience
reportedly engage in (Albl-Mikasa, 2013b). Data on interpreters discussing their behaviour
show this, with 72% (n = 23) of a sample of mainly conference interpreters reporting that
they “adjust [their] English (consciously or unconsciously) to [their] listener/addressee” particularly
in instances “when they had evidence that no native speakers were in the audience”
(Albl-Mikasa, 2010, p. 132).
A general feature of interpreters’ profiles is that relatively few have English as
their L1. Because the linguistic profile of users of interpreting services increasingly
encompasses English as an L2, Albl-Mikasa calls for interpreting into ELF to be a component
of training in terms of trainees’ spoken production, and a research area yielding empirical data on
interpreters’ style and register in L2 interpreting and on L2 users’ reception of their
interpretations. Such research could benefit from cross-fertilization with SLA, where some
traditional target models are being re-evaluated, such that features of production like pronunciation
are re-conceived according to international intelligibility (Pickering, 2006).
Further Reading
Albl-Mikasa, M. (2013b). Teaching Globish? The need for an ELF pedagogy in interpreter training.
International Journal of Interpreter Education, 5(1), 3–16.
Conference interpreters report on working into English as their L2 and the reception of their interpreta-
tions amongst English L2 users.
Pöchhacker, F. (2016). Introducing interpreting studies (2nd edn). Abingdon, Oxon: Routledge.
Chapter 4 contains an overview of models, while chapters 6 and 7 deal with cognitive processes and
speech production respectively.
Setton, R., & Dawrant, A. (2016). Conference interpreting. A trainer’s guide. Amsterdam: John Benjamins.
Chapter 7 outlines language enhancement in the interpreting curriculum and discusses challenges and
strategies when interpreting into an L2 with a focus on quality of production and different speech and
event types.
References
Ahrens, B. (2004). Prosodie beim Simultandolmetschen. Frankfurt: Peter Lang.
Ahrens, B. (2015). Intonation. In F. Pöchhacker (Ed.), Routledge encyclopedia of interpreting studies
(pp. 212–214). Abingdon, Oxon: Routledge.
AIIC. International Association of Conference Interpreters. (2012). Working languages.
https://aiic.net/node/6/working-languages/lang/1 (accessed 1 April 2020).
Albl-Mikasa, M. (2010). Global English and English as a Lingua Franca (ELF): Implications for the
Interpreting Profession. Trans-Kom, 3(2), 126–148.
Albl-Mikasa, M. (2013a). Developing and cultivating expert interpreter competence. The Interpreters’
Newsletter, 18, 17–34.
Albl-Mikasa, M. (2013b). Teaching Globish? The need for an ELF pedagogy in interpreter training.
International Journal of Interpreter Education, 5(1), 3–16.
Angelelli, C., & Degueldre, C. (2002). Bridging the gap between language for general purposes and
language for work: An intensive Superior-level language/skill course for teachers, translators, and
interpreters. In B. Leaver & B. Shekhtman (Eds.), Developing professional-level language proficiency
(pp. 77–95). Cambridge, UK: Cambridge University Press.
ASTM International. (2007). Standard guide for Language interpretation services. (F2089). West
Conshohocken, PA: ASTM International.
Baigorri-Jalón, J. (2004). Interpreters at the United Nations: A history [Trans. by A. Barr]. Salamanca:
Ediciones Universidad de Salamanca.
Bell, A. (1984). Language style as audience design. Language in Society, 13(2), 145–204.
Bühler, H. (1986). Linguistic (semantic) and extra-linguistic (pragmatic) criteria for the evaluation of
conference interpretation and interpreters. Multilingua, 5(4), 231–235.
Cecot, M. (2001). Pauses in simultaneous interpreting: A contrastive analysis of professional inter-
preters’ performances. The Interpreters’ Newsletter, 11, 63–85.
Chang, C.-C., & Schallert, D. (2007). The impact of directionality on Chinese/English simultaneous
interpreting. Interpreting, 9(2), 137–176.
Cheung, A. (2013). Non-native accents and simultaneous quality perceptions. Interpreting, 15(1), 25–47.
Chiaro, D., & Nocella, G. (2004). Interpreters’ perception of linguistic and non-linguistic factors af-
fecting quality: A survey through the World Wide Web. Meta, 49(2), 278–293.
Cho, S. (2007). Curriculum development in the undergraduate interpretation and translation program.
The Journal of Translation Studies, 8(2), 163–191.
Cokely, D. (1992). Interpretation: A sociolinguistic model. Burtonsville, MD: Linstok Press.
Dejean, L. (2000). Perfecting active and passive languages. Conference Interpretation and Translation,
2, 7–23.
Denissenko, J. (1989). Communicative and interpretative linguistics. In L. Gran & J. Dodds (Eds.), The
theoretical and practical aspects of teaching conference interpretation (pp. 155–158). Udine:
Campanotto.
Derwing, T. M., & Rossiter, M. J. (2003). The effects of pronunciation instruction on the accuracy,
fluency and complexity of L2 accented speech. Applied Language Learning, 13, 1–18.
Derwing, T. M., Fraser, H., Kang, O., & Thomson, R. I. (2014). L2 accent and ethics: Issues that merit
attention. In A. Mahboob & L. Barratt (Eds.), English in a multilingual context (pp. 63–80). New
York: Springer.
Donovan, C. (2004). European Masters Project Group: Teaching simultaneous interpretation into a B
language. Interpreting, 6(2), 205–216.
Dörnyei, Z. (1994). Motivation and motivating in the foreign language classroom. The Modern
Language Journal, 78(3), 273–284.
EMCI [European Masters in Conference Interpreting] (2018). Examinations: Admission and diploma
tests. https://www.emcinterpreting.org/examinations (accessed 2 April 2020).
Flerov, C., & Jacobs, M. (2016). Improving the interpreter’s voice. Morrisville, NC: Lulu Press.
Gernsbacher, M., & Givón, T. (1995). Coherence in spontaneous text. Amsterdam: John Benjamins.
Gile, D. (2009). Basic concepts and models for interpreter and translator training (2nd edn). Amsterdam:
John Benjamins.
Giles, H., & Coupland, N. (1991). Language: Contexts and consequences. Milton Keynes, UK: Open
University Press.
Giustini, D. (2020). Interpreter training in Japanese higher education: An innovative method for the
promotion of linguistic instrumentalism. Linguistics and Education, 56, 100792.
Gorton, A. (2012). ‘B’ language interpreting: The interpreter’s perspective. Forum, 10(2), 61–88.
Grbić, N. (2008). Constructing interpreting quality. Interpreting, 10(2), 232–257.
Gu, Y., & Johnson, R. (1996). Vocabulary learning strategies and language learning outcomes.
Language Learning, 46(4), 643–679.
Hale, S., Bond, N., & Sutton, J. (2011). Interpreting accent in the courtroom. Target, 23(1), 48–61.
Hlavac, J., Orlando, M., & Tobias, S. (2012). Intake tests for a short interpreter-training course: design,
implementation, feedback. International Journal of Interpreter Education, 4(1), 21–45.
Hymes, D. (1974). Foundations in sociolinguistics: An ethnographic approach. Philadelphia: University of
Pennsylvania Press.
Iglesias Fernández, E. (2013). Unpacking delivery criteria in interpreting quality assessment. In D.
Tsagari & R. Van Deemter (Eds.), Assessment issues in language translation and interpreting
(pp. 51–66). Frankfurt: Peter Lang.
Jenkins, S., & Parra, I. (2003). Multiple layers of meaning in an oral proficiency test. The com-
plementary roles of nonverbal, paralinguistic and verbal behaviors in assessment decisions. The
Modern Language Journal, 87(1), 90–107.
Kalina, S. (1998). Strategische Prozesse beim Dolmetschen. Theoretische Grundlagen, empirische
Fallstudien, didaktische Konsequenzen. Tübingen: Gunter Narr.
Kalina, S. (2005). Quality assurance for interpreting processes. Meta, 50(2), 768–784.
Keiser, W. (1977). Selection and training of conference interpreters. In D. Gerner & W. Sinaiko (Eds.),
Language interpretation and communication (pp. 11–24). New York: Plenum Press.
Komatsu, T. (2016). A brief history of interpreting and interpreter training in Japan since the 1960s.
In Y. Someya (Ed.), Consecutive notetaking and interpreter training (pp. 15–38). London:
Routledge.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Lawrence
Erlbaum.
Kopcyński, A. (1980). Conference interpreting: Some linguistic and communicative problems. Poznań:
Adam Mickiewicz University Press.
Kurz, I. (2001). Conference interpreting: Quality in the ears of the user. Meta, 46(2), 394–409.
Kurz, I., & Färber, B. (2003). Anticipation in German-English simultaneous interpreting. Forum, 1(2),
123–150.
Kurz, I., & Pöchhacker, F. (1995). Quality in TV interpreting. Translation. FIT Newsletter, 14(3/4),
350–358.
Laviosa, S. (2014). Translation and language education. Pedagogic approaches explored. Abingdon,
Oxon: Routledge.
Lee, T. (2014). Using computer-assisted interpreter training methods in Korean undergraduate English
classrooms. The Interpreter and Translator Trainer, 8(1), 102–122.
Lenglet, C., & Michaux, C. (2020). The impact of simultaneous-interpreting prosody on comprehen-
sion. Interpreting, 22(1), 1–34.
Levelt, W. (1999). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Lim, H.-O. (2005). Working into the B language: The condoned taboo? Meta, 50(4), CD-ROM.
Mead, P. (2015). Pauses. In F. Pöchhacker (Ed.), Routledge encyclopedia of interpreting studies
(pp. 301–303). Abingdon, Oxon: Routledge.
Mead, P. (2015). Repairs. In F. Pöchhacker (Ed.), Routledge encyclopedia of interpreting studies
(pp. 348–350). Abingdon, Oxon: Routledge.
Moser-Mercer, B. (1985). Screening potential interpreters. Meta, 30(1), 97–100.
NAATI. (2019). Certified Interpreter Test Assessment Rubrics. Available at: https://www.naati.com.au/
media/2245/ci_spoken_assessment_rubrics.pdf.
Nabei, T., & Swain, M. (2002). Learner awareness of recasts in classroom interaction: A case study of
an adult ESL student’s second language learning. Language Awareness, 11(1), 43–63.
Newcastle University. (2020). Interpreting MA. Entry requirements.
https://www.ncl.ac.uk/postgraduate/courses/degrees/interpreting-ma/#entryrequirements (accessed 1 April 2020).
Nicodemus, B., & Emmorey, K. (2015). Directionality in ASL-English interpreting: Accuracy and
articulation quality in L1 and L2. Interpreting, 17(2), 145–166.
Park, H. (1999). A study on developing an interpretation track for undergraduate students. Conference
Interpretation and Translation, 1, 47–74.
Peng, G. (2009). Using rhetorical structure theory (RST) to describe the development of coherence in
interpreting trainees. Interpreting, 11(2), 216–243.
Pica, T. (1994). Research on negotiation: What does it reveal about second-language conditions,
processes and outcomes. Language Learning, 44(3), 493–527.
Pickering, L. (2006). Current research on intelligibility in English as a lingua franca. Annual Review of
Applied Linguistics, 26, 219–233.
Pöchhacker, F. (2011). Assessing aptitude for interpreting: The Syncloze test. Interpreting, 13(1),
106–120.
Pöchhacker, F. (2016). Introducing interpreting studies (2nd edn). Abingdon, Oxon: Routledge.
Pradas Macías, M. (2006). Probing quality criteria in simultaneous interpreting: The role of silent
pauses in fluency. Interpreting, 8(1), 25–43.
Pym, A. (2005). Explaining explicitation. In K. Károly & Á. Fóris, (Eds.), New trends in translation
studies: In honour of Kinga Klaudy (pp. 29–34). Budapest: Akadémiai Kiadó.
Riccardi, A. (2015). Shadowing. In F. Pöchhacker (Ed.), Routledge encyclopedia of interpreting studies
(pp. 371–373). Abingdon, Oxon: Routledge.
RMIT. (2020). Diploma of interpreting (LOTE-English).
https://www.rmit.edu.au/study-with-us/levels-of-study/vocational-study/diplomas/diploma-of-interpreting-loteenglish-c5364
Seleskovitch, D. (1978). Interpreting for international conferences [Trans. by S. Dailey & E. McMillan].
Leesburg, VA: Pen and Booth.
Setton, R., & Dawrant, A. (2016). Conference interpreting. A trainer’s guide. Amsterdam: John
Benjamins.
Shlesinger, M. (1994). Intonation in the production and perception of simultaneous interpretation. In S.
Lambert & B. Moser-Mercer (Eds.), Bridging the gap: Empirical research in simultaneous inter-
pretation (pp. 225–236). Amsterdam: John Benjamins.
Timarová, Š., & Ungoed-Thomas, H. (2008). Admission testing for interpreting courses. The Interpreter
and Translator Trainer, 2(1), 29–46.
Tissi, B. (2000). Silent pauses and disfluencies in simultaneous interpretation: A descriptive analysis.
The Interpreters’ Newsletter, 10, 103–127.
University of Vienna. (2017). Bachelorstudium Transkulturelle Kommunikation.
https://transvienna.univie.ac.at/fileadmin/user_upload/z_translationswiss/Studium/Curricula/Curriculum_Bachelorstudium_Transkulturelle_Kommunikation_2016_Stand2017.pdf (accessed 3 April 2020).
31
FIRST LANGUAGE ATTRITION
Monika Schmid
1 Introduction/Definitions
This chapter not only concludes the “Emerging issues” part of this volume on second
language acquisition (SLA) and speaking, but is also the final chapter overall. In many ways
this is appropriate: First language (L1) attrition has struggled for a long time to emerge from
its status as a niche subject and is often regarded as an afterthought. Many researchers
assume that substantial traffic and interference from the second language (L2) to the L1 will
affect only a small number of bilinguals under very specific conditions such as long-term,
immersed bilingualism, dominant use of and near-native proficiency in the L2 and extremely
limited use of the L1. With such a view, effects of the second language on the first are not
necessarily relevant to all research on L2 development, and, in particular, to research on
instructed L2 learning. However, the cumulative evidence from research over the past dec-
ades indicates that measurable changes in the L1 become established very early in both
immersed and instructed late bilinguals. The bidirectional nature of cross-linguistic inter-
action should therefore be a consideration in most, if not all, contexts of bilingualism.
The view of attrition as a phenomenon on the periphery of bilingual development is
evident from the fact that in the 1990s and early 2000s, most reference works on L2 ac-
quisition and bilingual development did not include chapters dedicated to this phenomenon
(e.g., Bhatia & Ritchie, 2004; Doughty & Long, 2003; Kroll & de Groot, 2005). This changed
only gradually (see part on encyclopedia chapters in Schmid, 2020), starting with Cook’s
(2003) theory of linguistic multicompetence, which is based on the notion of an “integration
continuum” in which all languages of the multilingual language user exist somewhere be-
tween the extreme points of entire separation and entire integration. Where exactly the
languages are situated on this continuum may vary – across individuals, across linguistic
levels, and across time – but some form of connection, however tenuous, will always exist,
and the L2 will therefore also always affect the L1 (see Lowie & Verspoor, this volume).
The differences observed in the L1 between bilingual and monolingual speakers are often
much more subtle than differences between (monolingual) natives and L2 users. They relate
mainly to issues of processing and activation rather than representation, and therefore rely
on fine-grained, online measures for their detection. This has led to an increase in interest not
only in language attrition from the perspective of psycho- and neurolinguistics, but also in
the area that is most reliant on and most characteristic of online and naturalistic language
processing: speaking. In this chapter, I follow the terminology established in Schmid and
Köpke (2017) in that I assume the term “attriter” to broadly refer to any individual who
became bilingual or multilingual after the onset of puberty. The term most commonly, but
not uniquely, refers to a late bilingual who has been living in an L2 environment for some
time and, where not otherwise indicated, that is how I will be using it here.
2 Historical Perspectives
In the early phases of language attrition research in the 1980s and early 1990s, attrition
phenomena were almost exclusively understood and investigated in terms of representational
changes, that is, changes to underlying “competence” (e.g., Seliger & Vago, 1991; Sharwood
Smith & van Buren, 1991). In this context, any kind of “interference” or speech error pro-
duced by attriters was taken as an indication that attrition had taken place, that is, that the
underlying lexical or grammatical system had been changed or simplified so that obligatory
rules were no longer fully applied correctly or words were used inappropriately (e.g.,
Olshtain & Barzilay, 1991). The same interpretation was given to erroneous responses in
experimental tasks, such as grammaticality judgements (e.g., Altenberg, 1991). The
assumption that non-attrited native speech is, by default, homogeneous and error-free was so
strong that, more often than not, these studies did not establish a control group to assess
whether error rates among attriters were higher than among monolinguals (see Köpke &
Schmid, 2004). This is spelled out, for example, by Vago (1991):
On the assumption that the subject’s first language dialect agreed in essential detail
with her parents’ dialect from the period of initial acquisition in Hungary through
the onset of attrition in Israel, the standard dialect of Hungarian may reasonably be
identified as the base-line grammar with which the subject’s grammar may be
compared. On this basis, any structural deviation from the standard may be identified
as an attrition phenomenon. (p. 241f., emphasis added)
This prevailing view that attrition is anything that deviates from an idealized monolingual
norm, which, in turn, is tacitly assumed to be 100% accurate, was challenged by Köpke
and Schmid (2004) and Schmid (2004), who highlighted a number of conceptual and
methodological difficulties, among them establishing what an “error” is in the first place
(as “right” and “wrong” in free speech are not necessarily discrete categories), what the
unattrited baseline looks like in terms of error-rates and other features, and what the
distributional properties are of the feature under investigation in both attrited and un-
attrited data. In particular, Schmid (2004) drew attention to the topic of avoidance
strategies and the fact that speakers who have lost confidence in their grammatical in-
tuitions may construct their utterances in a manner allowing them not to use those
structures about which they are uncertain, potentially leading to superficially error-free
but simplified discourse.
In the late 1990s and early 2000s, technological and methodological advances as well as an
increase in research networks led to more empirical rigour and larger sample sizes in attrition
research (e.g., Köpke & Schmid, 2004). The small-group case studies with relatively sim-
plistic experimental tasks that had largely prevailed until then were replaced by larger in-
vestigations typically comprising 20–50 participants, an unattrited control group, and a
variety of measures, usually a combination of elicited/free speech and controlled experi-
mental tasks, as suggested in the Language Attrition Test Battery (Schmid, 2011; see also
https://languageattrition.org).
Monika Schmid
The findings that arose from these studies paint a rather more complex picture of L1
attrition than was originally assumed, suggesting that what is at stake is not so much the
erosion of underlying knowledge leading to inaccuracies and a “deviant” grammar, but the
target-like application of fundamentally intact knowledge under the cognitive and time
pressures incurred in online speech production. All speakers experience these pressures (and
occasionally make mistakes or slips of the tongue), but for bilinguals they increase due to the
higher cognitive load incurred by managing two language systems (Simard, this volume). In
language attrition, a relatively low level of resting activation of the language due to lack of
exposure and use may further contribute to the cognitive load. That being the case, it makes
sense to assume that attriters will develop strategies to alleviate such pressures in real-time
speech production by avoiding more costly operations or using time-buying strategies, which
may in turn influence the complexity and fluency, alongside the accuracy, of their speech
output. Such phenomena were anticipated at the earliest stages of attrition research, for
example, by Andersen (1982), who predicted that attrited speech would be characterised not
only by a “lack of adherence to the linguistic norm” (p. 91) but also by reductions affecting
lexicon, phonology and grammar and by “linguistic insecurity” (p. 111), leading to slowed-
down linguistic interactions and the overuse of compensatory strategies and searches for
words or phrases.
First Language Attrition
Complexity
The concept of complexity has been widely studied in bilingualism research, most often in the
context of L2 writing (e.g., Bulté & Housen, 2012; Ortega, 2012), but also in L1 attrition and
spoken language. For example, a range of studies have attempted to measure the lexical
complexity of attrited naturalistic speech production (see Jarvis, 2019 for an overview), in
line with Andersen’s prediction that attriters would have a smaller, less accessible productive
lexicon consisting of common, highly frequent and unmarked lexical items (Andersen, 1982).
In the early stages of attrition research, studies tended to focus on the specificity of particular
lexical items. For example, Olshtain and Barzilay (1991) collected a corpus of spoken English
from a group of American attriters in Israel by means of two picture-story booklets popular
in child language acquisition research (the “Frog stories”). They focus on how various items
occurring in the story are named, for example, pond, deer and gopher, and conclude that
there is more variability in the Israel-based than in the US-based population (it should be
said here, however, that the attriting group was 2.5 times the size of the monolingual one),
with attriters often preferring more general terms (e.g., “body of water” instead of “pond”).
A similar approach was adopted in studies by Pavlenko (e.g., 2004; 2010; Pavlenko & Malt,
2011), who assessed how entire semantic fields, such as verbs of emotion or motion and
household objects, may shift their overall meaning through cross-linguistic influence and
semantic extension from the L2. Based on these and other findings, Schmid (2011, Chapter 3)
provides an overview and taxonomy of the different types of cross-linguistic influence that
may occur in L1 lexical attrition.
Andersen’s (1982) position paper also makes a number of predictions for syntactic at-
trition: he hypothesizes that attriters will preserve and overuse those syntactic constructions
that more transparently reflect underlying semantic and syntactic relations, and furthermore,
where applicable, they will tend to collapse different surface structures into one, except where
such a collapse would lead to informational loss (Andersen 1982, p. 99). Empirical studies on
such syntactic simplifications in free speech are much rarer than investigations of lexical
complexity and sophistication, but there are some tentative findings indicating that the most
complex options may indeed become dispreferred in speech production by attriters when less
complex alternatives exist. For example, Yılmaz (2011) investigated the use of five types
of Turkish complex embedding constructions
among a population of attriters in the Netherlands and controls in Turkey (matched for age,
education, and region of origin) in a naturalistic interview. Yılmaz concluded that the most
complex of these constructions, postpositional clauses, had decreased to some extent among
the Turkish–Dutch bilinguals relative to their use by the reference population, but that the
four other types remain unaffected. Similarly, Jackson et al. (2011) found that film retellings
from German attriters, on average, contained fewer constituents in the inner field of the
German “verbal bracket” (formed by the finite and the nonfinite part of the verb), but that
this tendency to extrapose information outside the Verb Phrase was modulated by the
typological proximity between the languages in that it was stronger for those attriters whose
second language was Dutch than for German–English bilinguals. Finally, Karayayla (2020)
conducted an investigation of the extremely productive Turkish inflectional morphology
system. Turkish agglutinative morphology is characterised by the use of frequently
co-occurring “suffix templates”, that is, formulaic strings of up to four individual suffixes
which, due to their association, develop relationships that are similar to lexical collocations.
For example, the suffix chain A3pl + P2pl + Abl as in kitap-lar-ınız-dan (“from your books”)
appears to be stored in a separate mental representation and accessed with the same speed as
a single suffix (Bilgin, 2016). Karayayla found that, while verb phrase templates remained
unaffected by attrition, nominal suffix templates were used less productively in language
attrition. In particular, the less frequent templates were not applied to as large a range of
lexical lemmas of different frequencies by the attriters as by the monolingual controls.
Interestingly, the frequency of the lemma itself also played a role, with less frequent nouns
being less productive in terms of the suffix templates they were paired up with. This again
hints at a possible effect of a high cognitive load incurred by retrieving less accessible items
from memory that takes its toll on other parts of the computational system.
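The notion of template productivity used here can be made concrete with a small sketch. The code below is an illustration only, not Karayayla's procedure: it assumes a hypothetical corpus already annotated as (lemma, suffix template) pairs and simply counts how many distinct lemmas each template combines with. The function name, the toy data, and the tag "A3pl+Abl" are invented for the example; only the tag "A3pl+P2pl+Abl" comes from the text above.

```python
from collections import defaultdict


def template_productivity(tokens):
    """Return, for each suffix template, the number of distinct lemmas it occurs with.

    `tokens` is an iterable of (lemma, template) pairs; this annotation
    format is an assumption made for the sketch.
    """
    lemmas_by_template = defaultdict(set)
    for lemma, template in tokens:
        lemmas_by_template[template].add(lemma)
    return {template: len(lemmas) for template, lemmas in lemmas_by_template.items()}


# Hypothetical toy annotations, not data from any study.
controls = [
    ("kitap", "A3pl+P2pl+Abl"),
    ("ev", "A3pl+P2pl+Abl"),
    ("araba", "A3pl+P2pl+Abl"),
    ("kitap", "A3pl+Abl"),
]
attriters = [
    ("kitap", "A3pl+P2pl+Abl"),
    ("kitap", "A3pl+P2pl+Abl"),
    ("ev", "A3pl+Abl"),
]

control_scores = template_productivity(controls)
attriter_scores = template_productivity(attriters)
```

In this toy data the three-suffix template combines with three different lemmas among the controls but with only one among the attriters, which is the kind of contrast the paragraph describes. A real analysis would also need to control for lemma frequency and sample size.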
Taken together, these findings suggest that, in terms of syntactic strategies, there are no
sweeping or dramatic simplification effects to be observed even in long-term attrition (all of
the studies mentioned earlier investigated attriting populations with a minimum of 15 years
and an average of over 35 years of residence in an L2 environment), as was suggested by
Andersen (1982), but that some more subtle shifts may occur. The effect of these simplifi-
cations is to relieve pressure on the computational system through stronger reliance on more
frequent constructions or a distributional realignment of certain constructions that is more
closely in line with that supplied by the L2. In general, morphological features with a limited
range of values – for instance, case, tense, and evidentiality – do not become reduced overall
in the attritional process, as attriters tend to continue to make use of all available forms (i.e.,
there is no evidence of the nominative case overall replacing the oblique, or one tense
supplanting another). However, as the example from Turkish shows, when a very large range of
formulaic suffix sequences is available, their use may become somewhat less productive in the
process of L1 attrition, leading to surface-level simplification.
Fluency
Attrition effects with respect to speech fluency (Kahng, this volume) were also anticipated by
Andersen (1982), with the prediction that attriters “will be less capable […] of being quick and
easy and of being expressive in the language”, leading to slowed-down linguistic interactions and
the overuse of compensatory strategies, searches for words or phrases, etc. (Andersen, 1982,
p. 111). This “linguistic insecurity” was first examined by Schmid and Beers Fägersten (2010),
who investigated the distribution of filled and empty pauses as well as repetitions and self-
corrections within a corpus of film retellings collected from German and Dutch attriters and
controls. With the exception of filled pauses, all disfluency markers had increased in the attriting
populations, and their placement had also changed in that the attriting populations made more
use of empty pauses immediately preceding nouns, articles, and pronouns. Schmid and Beers
Fägersten ascribed these patterns not only to a reduction of the accessibility of lexical items in
themselves, but to a potential weakening of other properties of the lemma which feed into lexical
retrieval and sentence building (such as the grammatical gender of nouns), leading to increased
insecurities on which article or pronoun to select. For filled pauses, on the other hand, Schmid
and Beers Fägersten found a change in distribution among the attriters that appeared to re-
semble the statistical and distributional properties of such items in the L2.
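To illustrate the kind of measure involved, the following sketch computes disfluency rates and records the word class that follows each disfluency marker. It is a simplified illustration under stated assumptions, not the coding scheme of Schmid and Beers Fägersten: the transcript format (a list of (form, tag) pairs), the tag labels, and the rate per 100 words are all hypothetical choices made for the example.

```python
from collections import Counter

# Hypothetical tag labels for the four disfluency types mentioned in the text.
DISFLUENCY_TAGS = {"FILLED_PAUSE", "EMPTY_PAUSE", "REPETITION", "SELF_CORRECTION"}


def disfluency_profile(tokens):
    """Return (rates per 100 words, counts of (disfluency tag, following word class)).

    `tokens` is a list of (form, tag) pairs in which disfluencies carry one of
    DISFLUENCY_TAGS and ordinary words carry a word-class tag.
    """
    n_words = sum(1 for _, tag in tokens if tag not in DISFLUENCY_TAGS)
    counts = Counter()
    following = Counter()
    for i, (_, tag) in enumerate(tokens):
        if tag not in DISFLUENCY_TAGS:
            continue
        counts[tag] += 1
        # Word class of the next ordinary word, skipping any further disfluencies.
        next_class = next(
            (t for _, t in tokens[i + 1:] if t not in DISFLUENCY_TAGS), None
        )
        if next_class is not None:
            following[(tag, next_class)] += 1
    rates = {tag: 100 * c / n_words for tag, c in counts.items()} if n_words else {}
    return rates, following


# Hypothetical annotated utterance, invented for the example.
sample = [
    ("ich", "PRON"),
    ("sehe", "VERB"),
    ("<pause>", "EMPTY_PAUSE"),
    ("den", "ART"),
    ("äh", "FILLED_PAUSE"),
    ("Hund", "NOUN"),
]

rates, following = disfluency_profile(sample)
```

Comparing such profiles between attriters and controls would show both whether overall rates differ and whether pauses cluster before particular word classes, such as articles or nouns.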
A number of other studies have since corroborated these findings, confirming that
monolingual-like fluency is one of the most vulnerable factors in L1 attrition (e.g.,
Badstübner, 2011; Bergmann et al., 2015; Dostert, 2009; Opitz, 2011; Varga, 2012; Yılmaz &
Schmid, 2012). Interesting results are reported by Schmid and Keijzer (2009), who concluded
that the increase in disfluency markers and the reduction in lexical accessibility found among
attriters is similar to what happens in healthy aging, to the extent that older monolinguals
appear to “catch up” with their attrited age-matched peers around age 70, while the bilin-
guals do not show any further age-related patterns of increased disfluency. There also ap-
pears to be a clear effect of re-exposure: Stolberg and Münch (2010) report on a longitudinal
single-case study in which an elderly native German speaker living in an L2 English en-
vironment (with a length of residence of over 50 years) was interviewed repeatedly over a
period of four years, and showed a marked recovery in terms of the number of hesitation
markers (as well as semantic accessibility and morphosyntactic accuracy) across this time.
cumulative findings from a range of empirical studies of attrited speech production show
consistent, albeit subtle, differences in complexity, accuracy and fluency between attrited and
non-attrited populations. To date, there is only one single-case study (Iverson, 2012) of a
user of two typologically closely related languages (Spanish and Portuguese) which has
presented convincing evidence of actual restructuring, though under highly unusual
circumstances.
The growing realization that language attrition does not usually lead to contact-
induced changes or underlying restructuring of the linguistic system has brought about a
change in both research paradigms and attempts to integrate findings into theories of
language development. With respect to the former, empirical research has shifted from
using mainly behavioural tasks (such as offline grammaticality judgements) to tap into
shifts in representational knowledge, to online and neurocognitive research technologies,
alongside a greater reliance on data collected from naturalistic speech production. In
terms of theoretical approaches, questions about the vulnerability versus the stability of
different aspects of the L1 linguistic system have been shown to have important im-
plications for the broader view of language development and the architecture of the
human capacity for language and bilingualism.
Research Paradigms
Language attrition research has often found that in targeted behavioural research designs
and elicitation paradigms which allow the language user to focus attention entirely on one
specific linguistic process (e.g., lexical retrieval, offline grammaticality judgements, and
picture naming), attriting populations do not differ, or differ only very slightly, from
monolingual controls. More consistent differences have been found through methods
capable of capturing not so much the deterioration of a grammatical system and its rules
but of how the additional cognitive load incurred by the competition between languages
on the one hand and lack of exposure on the other feeds into accessibility and processing.
Neurocognitive methods such as eyetracking (see Dussias et al., 2019) and EEG (see
Steinhauer & Kasparian, 2019) allow fine-grained insights into shifts in language pro-
cessing caused by these pressures even in the absence of more overt changes, such as a
decrease in the ability to detect grammatical violations. In particular, these studies have
shown how bilinguals and attriters may parse sentences differently from monolinguals, for
instance, in areas such as relative clause attachment, or how other features of processing
and evaluation may change (for a more detailed discussion of online and neurocognitive
methods in studies of L2-to-L1 influence see Schmid & Köpke, 2017 as well as the
part on psycholinguistic and neurolinguistic approaches to L1 attrition in Schmid &
Köpke, 2019).
A further window into the more subtle aspects of change usually witnessed in language
attrition is provided by online, real-time speech production. This process forces the speaker
to rapidly integrate information from all linguistic levels under pressure from time con-
straints and limited cognitive resources. In-depth comparisons between speech produced by
monolinguals and attriters almost invariably show differences in terms of the phenomena
discussed in Part 3 of this chapter. While the analysis of naturalistic spoken data presents a
range of methodological challenges in terms of the quantification of features and items which
are more easily controlled in studies employing targeted elicitation methods (Iwashita, this
volume), it has been proposed that a combination of both elicited (online or offline) and
naturalistic data is better suited to detecting and describing the full range of attrition phe-
nomena (e.g., Schmid, 2011) than is one of these paradigms alone.
Theoretical Frameworks
The findings described earlier present a clear explanatory challenge for theoretical frame-
works of bilingualism: any theory of how the configuration of the human mind allows
speakers to become proficient in one or more languages, and how pre-established languages
influence and constrain the acquisition of languages acquired later, should also be able to
account for the deterioration of proficiency and facility through the cognitive pressure and
competition incurred by learning and using another.
Some researchers favour explanatory frameworks based largely on the accessibility and
the deterioration of unused knowledge, viewing attrition as the result of a long-term lack of
stimulation (e.g., Paradis, 2007), which affects less frequent items and forms more than
highly frequent ones. A related account, also linked to the domain-general operation of
memory and cognition, assumes that those grammatical features that were learned earliest
and/or learned best will be the ones that are most resilient in the attrition process (the
Regression Hypothesis, e.g., Keijzer, 2007). A more recent development of this approach is
presented in MacWhinney’s (2019) extension of his Competition Model to the attrition
context, which takes into account psycho- and neurolinguistic processes such as entrench-
ment, transfer and resonance to arrive at a model more capable of predicting specific out-
comes of the attritional process.
Other theoretical approaches attempting to account for the selectiveness of attritional
processes have their roots less in the distributional properties of morphosyntactic forms
and more in the role they play within a grammatical theory. The model that has been
empirically tested most often in language attrition research is probably the Interface
Hypothesis (IH, e.g., Sorace, 2005). This hypothesis has evolved somewhat since it was
first proposed, but in its essence, it predicts that grammatical features representing core
syntactic properties will be less vulnerable to language attrition than structures situated at
the interface with other cognitive domains (e.g., pragmatics). Evidence supporting this
hypothesis comes from, among others, a study of subject–verb inversion in Spanish
(Perpiñán, 2011). While Spanish main clauses follow a rigid subject–verb word order (La
maestra escribió un libro vs. *Escribió un libro la maestra, “The teacher wrote a book,”
Perpiñán, 2011, her example 3) subject–verb inversion is obligatory in questions (¿Qué dijo
Juan? vs. *¿Qué Juan dijo? “What did John say?” Perpiñán, 2011, her example 1). In
embedded relative clauses, however, inversion is optional and regulated by pragmatic
factors (Pedro no leyó el libro que la maestra escribió vs. Pedro no leyó el libro que escribió
la maestra, “Pedro did not read the book that the teacher wrote”, Perpiñán, 2011, her
example 4). Perpiñán’s investigation shows that purely syntactic inversion (in wh-
questions) is unaffected by attrition, while pragmatically licensed inversion differs be-
tween attriters and monolinguals. Furthermore, a number of studies on the use of null vs.
overt pronouns (e.g., Tsimpli et al., 2004) and Differential Object Marking (e.g.,
Chamorro et al., 2016) provide evidence for the IH. However, criticisms have been raised
about the lack of specificity of the IH, that is, the inability to predict a hierarchy of
vulnerability to attrition among interface phenomena (e.g., Gürel, 2011). Other syntactic
accounts, most notably the Feature Reassembly Hypothesis (e.g., Lardiere, 2009) have
recently been invoked in this discussion in search of more fine-grained models (e.g., Hicks
& Domínguez, 2020; Putnam et al., 2019). Within these frameworks, it is assumed that
morphological forms may represent language-specific “feature bundles,” which are only
gradually adjusted to fully native-like settings in L2 acquisition, and which may then also
shift and change in a language that is already established, leading to non-target-like use
and intuitions (e.g., Putnam & Sánchez, 2013).
Figure 31.1 Number of references on Google Scholar to “Second Language Acquisition” and
“Language Attrition,” 1981–2020 (chart with two series, “Second Language Acquisition” and
“Language Attrition”; horizontal axis in five-year intervals from 1981–1985 to 2016–2020;
vertical axis from 0 to 40,000 references)
be hoped that future studies will avail themselves of the groundwork laid in these important
contributions to fully integrate our understanding of what goes on in language attrition with
the wider perspective on bilingual development.
The second gap in knowledge relates to the puzzling dichotomy noted above between the
extreme vulnerability to forgetting of childhood languages and their apparent stability post
puberty. A fuller understanding of how, when and why this change takes place has the potential
to contribute very substantially to our understanding of how age of learning and ultimate success
interact, and thus provide an answer to questions about maturational constraints and critical or
sensitive periods in bilingual development. To gain such an understanding, studies are needed
which assess the full range of ultimate proficiency in second language learners and native lan-
guage forgetters across all ranges of ages of learning and forgetting.
6 Conclusion
Since the 1980s, a well-established body of research on the spoken L1 use of immigrants who
use their L2 in daily life has demonstrated that such speakers come to show cross-linguistic
influence, a phenomenon known as language attrition. Such speakers differ from mono-
linguals with respect to the complexity, accuracy, and fluency of their native language, and
they may also develop a foreign accent. However, the differences tend to be minor and
subtle, indicative more of the online pressure of managing two highly active linguistic sub-
systems under time pressure than of any underlying representational changes to the native
system. As such, language attrition research presents a valuable added perspective to in-
vestigations of bilingualism and second language acquisition, as it allows researchers to se-
parate out the effects of online cross-linguistic influence from those of any potential failure to
establish native-like underlying representations.
Further Reading
Hicks, G., & Domínguez, L. (2020). A model for L1 grammatical attrition. Second Language Research,
36(2), 143–165.
Describes a model of language attrition based on the Feature Reassembly Hypothesis, which seeks to
explain under what conditions language attrition can be integrated into the developmental process
sustained by the language faculty.
Schmid, M. S., & Köpke, B. (2017). The relevance of first language attrition to theories of bilingual
development. Linguistic Approaches to Bilingualism, 7(6), 637–667.
Addresses the long-held distinction between attrition at the level of underlying structure (‘competence’)
and online processing (‘performance’) and examines whether it is methodologically possible and the-
oretically appropriate to distinguish the two.
Schmid, M. S., & Köpke, B. (Eds.) (2019). The Oxford handbook of language attrition. Oxford: Oxford
University Press.
Presents the state of the art of research in language attrition, showcasing different theoretical and
methodological approaches, and features of attrition at different phases of the lifespan, on different
linguistic levels and under different contexts and circumstances.
References
Altenberg, E. P. (1991). Assessing first language vulnerability to attrition. In H. W. Seliger & R. M.
Vago (Eds.), First language attrition (pp. 189–206). Cambridge: CUP.
Andersen, R. W. (1982). Determining the linguistic attributes of language attrition. In R. D. Lambert &
B. F. Freed (Eds.), The loss of language skills (pp. 83–118). Rowley, MA: Newbury House.
Badstübner, T. (2011). L1 attrition: German immigrants in the U.S. PhD dissertation, University of
Arizona at Tucson.
Bergmann, C., Nota, A., Sprenger, S. A., & Schmid, M. S. (2016). L2 immersion causes non-native-like
L1 pronunciation in German attriters. Journal of Phonetics, 58, 71–86.
Bergmann, C., Sprenger, S., & Schmid, M. S. (2015). The impact of language co-activation on L1 and
L2 speech fluency. Acta Psychologica, 161, 25–35.
Bhatia, T. K., & Ritchie, W. C. (Eds.) (2004). The handbook of bilingualism. Oxford: Blackwell.
Bilgin, O. (2016). Frequency effects in the processing of morphologically complex Turkish words. Master's
thesis, Boğaziçi University, Istanbul. Retrieved from http://st2.zargan.com/public/resources/
turkish/frequency_effects_in_turkish.pdf
Bulté, B., & Housen, A. (2012). Defining and operationalising L2 complexity. In A. Housen, F. Kuiken,
& I. Vedder (Eds.), Dimensions of L2 performance and proficiency (pp. 21–46). Amsterdam: John
Benjamins.
Bylund, E. (2019). Age effects in language attrition. In M. S. Schmid & B. Köpke (Eds.), The Oxford
handbook of language attrition (pp. 277–286). Oxford: Oxford University Press.
Chamorro, G., Sturt, P., & Sorace, A. (2016). Selectivity in L1 attrition: Differential object
marking in Spanish near-native speakers of English. Journal of Psycholinguistic Research, 45(3),
697–715.
Chang, C. B. (2012). Rapid and multifaceted effects of second-language learning on first-language
speech production. Journal of Phonetics, 40(2), 249–268.
Cook, V. (2003). The changing L1 in the L2 user’s mind. In V. Cook (Ed.), Effects of the second
language on the first (pp. 1–18). Clevedon: Multilingual Matters.
de Leeuw, E. (2019). Phonetic attrition. In M. S. Schmid & B. Köpke (Eds.), The Oxford handbook of
language attrition (pp. 204–217). Oxford: Oxford University Press.
de Leeuw, E., Schmid, M. S., & Mennen, I. (2010). The effects of contact on native language pro-
nunciation in an L2 migrant setting. Bilingualism: Language and Cognition, 13(1), 33–40.
Dostert, S. (2009). Multilingualism, L1 attrition and the concept of ‘native speaker’. PhD thesis,
Heinrich-Heine Universität Düsseldorf.
Doughty, C. J., & Long, M. H. (Eds.) (2003). The handbook of second language acquisition. Oxford:
Wiley Blackwell.
Dussias, P. E., Valdés Kroff, J. R., Johns, M., & Villegas, Á. (2019). How bilingualism affects
syntactic processing in the native language: Evidence from eye-movements. In M. S. Schmid & B.
Köpke (Eds.), The Oxford handbook of language attrition (pp. 98–107). Oxford: Oxford
University Press.
Flege, J. E. (1987). The production of ‘new’ and ‘similar’ phones in a foreign language: Evidence for the
effect of equivalence classification. Journal of Phonetics, 15, 47–65.
Flege, J. E. (2002). Interactions between the native and second-language phonetic systems. In P.
Burmeister, T. Piske & A. Rohde (Eds.), An integrated view of language development: Papers in honor
of Henning Wode (pp. 217–244). Trier, Germany: Wissenschaftlicher Verlag.
Flege, J. E., & Eefting, W. (1987). Cross-language switching in stop consonant perception and pro-
duction by Dutch speakers of English. Speech Communication, 6(3), 185–202.
Granena, G., & Long, M. H. (2013). Age of onset, length of residence, language aptitude, and ultimate
L2 attainment in three linguistic domains. Second Language Research, 29(3), 311–343.
Gürel, A. (2011). In search for a unified model of L2 acquisition and L1 attrition: A commentary for the
Interface Hypothesis. Linguistic Approaches to Bilingualism, 1(1), 39–42.
Hicks, G., & Domínguez, L. (2020). A model for L1 grammatical attrition. Second Language Research,
36(2), 143–165.
Hopp, H., & Schmid, M. S. (2013). Perceived foreign accent in first language attrition and second
language acquisition: The impact of age of acquisition and bilingualism. Applied Psycholinguistics,
34(2), 361–394.
Iverson, M. (2012). Advanced language attrition of Spanish in contact with Brazilian Portuguese. PhD
thesis, University of Iowa.
Jackson, C. N., McDermott, L., & Schmid, M. S. (2011). Changing syntactic preferences in L1 attriters
of German. Paper presented at the 7th International Symposium on Bilingualism, Oslo, June 2011.
Jarvis, S. (2019). Lexical attrition. In M. S. Schmid & B. Köpke (Eds.), The Oxford handbook of lan-
guage attrition (pp. 241–250). Oxford: Oxford University Press.
Karayayla, T. (2020). Effects of first language attrition on heritage language input and ultimate at-
tainment: Two generations of Turkish immigrants in the UK. In B. Brehmer, J. Treffers-Daller, &
D. Berndt (Eds.), Lost in transmission: The role of attrition and input in heritage language develop-
ment (pp. 34–69). Amsterdam: John Benjamins.
Karayayla, T., & Schmid, M. S. (2019). First language attrition as a function of age at onset of bi-
lingualism: First language attainment of Turkish–English bilinguals in the United Kingdom.
Language Learning, 69(1), 106–142.
Keijzer, M. (2007). Last in first out? An investigation of the Regression Hypothesis in Dutch emigrants in
anglophone Canada. PhD thesis, Vrije Universiteit, Amsterdam.
Köpke, B., & Schmid, M. S. (2004). Language attrition: The next phase. In M. S. Schmid, B. Köpke,
M. Keijzer, & L. Weilemar (Eds.), First language attrition: Interdisciplinary perspectives on metho-
dological issues (pp. 1–43). Amsterdam: John Benjamins.
Kroll, J. F., & de Groot, A. M. B. (Eds.) (2005). Handbook of bilingualism. Oxford: Oxford University
Press.
Lardiere, D. (2009). Some thoughts on the contrastive analysis of features in second language acqui-
sition. Second Language Research, 25(2), 173–227.
MacWhinney, B. (2019). Language attrition and the competition model. In M. S. Schmid & B. Köpke
(Eds.), The Oxford handbook of language attrition (pp. 7–17). Oxford: Oxford University Press.
Major, R. C. (1992). Losing English as a first language. The Modern Language Journal, 76(2), 190–208.
Mayr, R., Price, S., & Mennen, I. (2012). First language attrition in the speech of Dutch-English
bilinguals: The case of monozygotic twin sisters. Bilingualism: Language and Cognition, 15(4),
687–700.
Mennen, I. (2004). Bi-directional interference in the intonation of Dutch speakers of Greek. Journal of
Phonetics, 32, 543–563.
Olshtain, E., & Barzilay, M. (1991). Lexical retrieval difficulties in adult language attrition. In H. W.
Seliger & R. M. Vago (Eds.), First language attrition (pp. 139–150). Cambridge: Cambridge University
Press.
Opitz, C. (2011). First language attrition and second language acquisition in a second language
environment. PhD thesis, Trinity College Dublin.
Opitz, C. (2019). A complex dynamic systems perspective on personal background variables in L1
attrition. In M. S. Schmid & B. Köpke (Eds.), The Oxford handbook of language attrition
(pp. 49–60). Oxford: Oxford University Press.
Ortega, L. (2012). Interlanguage complexity: A construct in search of theoretical renewal. In B.
Kortmann & B. Szmrecsanyi (Eds.), Linguistic complexity: Second language acquisition,
indigenization, contact (pp. 127–155). Berlin: Walter de Gruyter.
Paradis, M. (2007). L1 attrition features predicted by a neurolinguistic theory of bilingualism. In
B. Köpke, M. S. Schmid, M. Keijzer, & S. Dostert (Eds.), Language attrition: Theoretical
perspectives (pp. 121–134). Amsterdam: John Benjamins.
Pavlenko, A. (2004). L2 influence and L1 attrition in adult bilingualism. In M. S. Schmid, B. Köpke,
M. Keijzer, & L. Weilemar (Eds.), First language attrition: Interdisciplinary perspectives on
methodological issues (pp. 47–59). Amsterdam: John Benjamins.
Pavlenko, A. (2010). Verbs of motion in L1 Russian of Russian-English bilinguals. Bilingualism:
Language and Cognition, 13(1), 49–62.
Pavlenko, A., & Malt, B. C. (2011). Kitchen Russian: Cross-linguistic differences and first-language
object naming by Russian-English bilinguals. Bilingualism: Language and Cognition, 14(1), 19–45.
Perpiñán, S. (2011). Optionality in bilingual native grammars. Language, Interaction and Acquisition,
2(2), 312–341.
Putnam, M. T., & Sánchez, L. (2013). What’s so incomplete about incomplete acquisition?: A
prolegomenon to modeling heritage language grammars. Linguistic Approaches to Bilingualism, 3(4), 478–508.
Putnam, M. T., Perez-Cortes, S., & Sánchez, L. (2019). Language attrition and the feature reassembly
hypothesis. In M. S. Schmid & B. Köpke (Eds.), The Oxford handbook of language attrition
(pp. 18–24). Oxford: Oxford University Press.
Schmid, M. S. (2004). First language attrition: The methodology revised. International Journal of
Bilingualism, 8(3), 239–255.
Schmid, M. S. (2011). Language attrition. Cambridge: Cambridge University Press.
Schmid, M. S. (2014). The debate on maturational constraints in bilingual development: A perspective
from first-language attrition. Language Acquisition, 21(4), 386–410.
Schmid, M. S. (2019). Language attrition as a problem for LADO. In P. L. Patrick, K. Zwaan, & M. S.
Schmid (Eds.), Language analysis for the determination of origin (pp. 155–165). Cham: Springer.
Schmid, M. S. (2020). First language attrition. Oxford bibliographies in linguistics. Oxford: Oxford
University Press.
Schmid, M. S., & Beers Fägersten, K. (2010). Disfluency markers in L1 attrition. Language Learning,
60(4), 753–791.
Schmid, M. S., & Keijzer, M. (2009). First language attrition and reversion among older migrants.
International Journal of the Sociology of Language, 200, 83–101.
Schmid, M. S., & Köpke, B. (2017). The relevance of first language attrition to theories of bilingual
development. Linguistic Approaches to Bilingualism, 7(6), 637–667.
Schmid, M. S., & Köpke, B. (2019). Introduction. In M. S. Schmid & B. Köpke (Eds.), The Oxford
handbook of language attrition (pp. 1–4). Oxford: Oxford University Press.
Schmid, M. S., & Köpke, B. (Eds.) (2019). The Oxford handbook of language attrition. Oxford: Oxford
University Press.
Schmid, M. S., & Yılmaz, G. (2018). Predictors of language dominance: An integrated analysis of first
language attrition and second language acquisition in late bilinguals. Frontiers in Psychology,
9, 1306.
Seliger, H. W., & Vago, R. M. (1991). The study of first language attrition: An overview. In H. W.
Seliger & R. M. Vago (Eds.), First language attrition (pp. 3–15). Cambridge: Cambridge University
Press.
Sharwood Smith, M. A., & van Buren, P. (1991). First language attrition and the parameter setting
model. In H. W. Seliger & R. M. Vago (Eds.), First language attrition (pp. 17–30). Cambridge:
Cambridge University Press.
Sorace, A. (2005). Selective optionality in language development. In L. Cornips & K. Corrigan (Eds.),
Syntax and variation: Reconciling the biological and the social (pp. 46–111). Amsterdam: John
Benjamins.
Steinhauer, K., & Kasparian, K. (2019). Electrophysiological approaches to L1 attrition. In M. S.
Schmid & B. Köpke (Eds.), The Oxford handbook of language attrition (pp. 146–165). Oxford: Oxford
University Press.
Stolberg, D., & Münch, A. (2010). “Die Muttersprache vergisst man nicht”–or do you? A case study in
L1 attrition and its (partial) reversal. Bilingualism: Language and Cognition, 13(1), 19–31.
Tsimpli, I. M., Sorace, A., Heycock, C., & Filiaci, F. (2004). First language attrition and syntactic
subjects: A study of Greek and Italian near-native speakers of English. International Journal of
Bilingualism, 8(3), 257–277.
Vago, R. (1991). Paradigmatic regularity in first language attrition. In H. W. Seliger & R. M. Vago
(Eds.), First language attrition (pp. 241–252). Cambridge: Cambridge University Press.
Varga, Z. (2012). First language attrition and maintenance among Hungarian speakers in Denmark. PhD
thesis, Aarhus University, Denmark.
Yılmaz, G. (2011). Complex embeddings in free speech production among late Turkish-Dutch
bilinguals. Language, Interaction and Acquisition, 2(2), 251–275.
Yılmaz, G., & Schmid, M. S. (2012). L1 accessibility among Turkish-Dutch bilinguals. Mental Lexicon,
7(3), 249–274.
INDEX