Successful Spoken English

Successful Spoken English: Findings from Learner Corpora demonstrates how
spoken learner corpora can be used to define and explore the constituents of
successful spoken English. Taking the approach that language learners can
speak effectively whilst still using some non-standard forms, this book:
• Examines databases of transcribed speech from learners at different
CEFR levels to analyse what makes a successful speaker of English;
• Discusses features of communicative competence, including the use of
linguistic strategies, organisation of extended stretches of speech, and
sensitivity to context;
• Demonstrates quantitative and qualitative data analysis using corpus
tools, looking at areas such as word frequency;
• Helps to reassess the goals of language learners and teachers, and
provides recommendations for teaching practice and for further research.
Successful Spoken English: Findings from Learner Corpora is key reading
for postgraduate students of TESOL and Applied Linguistics, as well as for
pre- and in-service English language teachers.
Christian Jones is Senior Lecturer in Applied Linguistics and TESOL at the
University of Liverpool, UK.
Shelley Byrne is Lecturer in English for Academic Purposes at the University
of Central Lancashire, UK.
Nicola Halenko is Senior Lecturer in English Language Teaching at the
University of Central Lancashire, UK.
The Routledge Applied Corpus Linguistics Series is a series of monograph studies
exhibiting cutting-edge research in the field of corpus linguistics and its applications
to real-world language problems. Corpus linguistics is one of the most dynamic and
rapidly developing areas in the field of language studies, and it is difficult to see a
future for empirical language research where results are not replicable by reference to
corpus data. This series showcases the latest research in the field of applied language
studies where corpus findings are at the forefront, introducing new and unique
methodologies and applications which open up new avenues for research.
SERIES EDITOR: RONALD CARTER

Ronald Carter is Research Professor of Modern English Language in the School
of English at the University of Nottingham, UK. He is the co-series editor of the
Routledge Applied Linguistics, Routledge Introductions to Applied Linguistics and
Routledge English Language Introductions series.
SERIES EDITOR: MICHAEL McCARTHY

Michael McCarthy is Emeritus Professor of Applied Linguistics at the University of
Nottingham, UK, Adjunct Professor of Applied Linguistics at the University of
Limerick, Ireland and Visiting Professor in Applied Linguistics at Newcastle University,
UK. He is co-editor of the Routledge Handbook of Corpus Linguistics and editor of
the Routledge Domains of Discourse series.
SERIES EDITOR: ANNE O’KEEFFE

Anne O’Keeffe is senior lecturer in Applied Linguistics, Department of English
Language and Literature, Mary Immaculate College, University of Limerick, Ireland and
Chair of IVACS.
Editorial Panel: IVACS (Inter-Varietal Applied Corpus Studies Group), based at Mary
Immaculate College, University of Limerick, is an international research network
linking corpus linguistic researchers interested in exploring and comparing language
in different contexts of use.
Other titles in this series

Language, Corpus and Empowerment
Applications to Deaf Education, Healthcare and Online Discourses
Luke Collins

Historical Spoken Language Research
Corpus Perspectives
Ivor Timmis

Successful Spoken English
Findings from Learner Corpora
Christian Jones, Shelley Byrne and Nicola Halenko
More information about this series can be found at

www.routledge.com/series/RACL
Successful Spoken English
Findings from Learner Corpora
Christian Jones, Shelley Byrne and Nicola Halenko
First published 2018
by Routledge
2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN
and by Routledge
711 Third Avenue, New York, NY 10017
Routledge is an imprint of the Taylor & Francis Group, an informa business
© 2018 Christian Jones, Shelley Byrne and Nicola Halenko
The right of Christian Jones, Shelley Byrne and Nicola Halenko to
be identified as authors of this work has been asserted by them in
accordance with sections 77 and 78 of the Copyright, Designs and
Patents Act 1988.
All rights reserved. No part of this book may be reprinted
or reproduced or utilised in any form or by any electronic,
mechanical, or other means, now known or hereafter invented,
including photocopying and recording, or in any information
storage or retrieval system, without permission in writing from
the publishers.
Trademark notice: Product or corporate names may be trademarks
or registered trademarks, and are used only for identification and
explanation without intent to infringe.
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Cataloging-in-Publication Data
A catalog record for this book has been requested
ISBN: 978-1-138-68399-0 (hbk)
ISBN: 978-1-315-10171-2 (ebk)
Typeset in Sabon
by Apex CoVantage, LLC
Contents
List of figures
List of tables
Acknowledgements
1 What is a successful speaker of English?
1.1 Introduction
1.2 Rationale for exploring successful spoken language
1.3 Rationale for a focus on spoken language
1.4 Definitions of successful language: communicative competence
1.4.1 Hymes’s theory of communicative competence
1.4.2 Canale and Swain’s theory of communicative competence
1.4.3 Canale’s theory of communicative competence
1.4.4 Bachman and Palmer’s model of language use and performance
1.5 Towards a definition of successful spoken language: communicative competence
1.6 Linking communication, communicative competence and learner success to the CEFR
1.7 Measuring successful spoken language in this book
1.8 Conclusion
2 Linguistic competence
2.1 Introduction
2.2 Definitions of linguistic competence
2.3 Previous studies
2.4 Methods of analysis
2.4.1 Frequency profiles
2.4.2 Frequency lists
2.4.3 Keyword lists
2.4.4 Lexical chunks
2.5 Linguistic competence at B1-C1 levels
2.6 Most frequent words
2.6.1 WE
2.6.2 Er/Erm
2.7 Keywords
2.7.1 Think
2.7.2 Can
2.8 Most frequent lexical chunks
2.8.1 A lot of
2.8.2 Agree with you
2.9 Discussion
2.10 Conclusion
3 Strategic competence
3.1 Introduction
3.2 Definitions of strategic competence and communication strategies
3.5 Strategic competence at B1-C1 levels
3.5.1 Preliminary analysis of CEFR strategies in B1, B2 and C1 speech
3.5.2 Comparison of CEFR strategy realisation in B1, B2 and C1 speech
3.5.3 Production strategies: correction
3.5.4 Interaction strategies: inviting others into the discussion and seeking clarification
3.6 Conclusion
4 Discourse competence
4.1 Introduction
4.2 Definitions of discourse competence
4.2.1 Definitions of spoken discourse markers
4.5 Discourse competence at B1-C1 levels
4.6 The frequency and functions of common discourse markers used to achieve discourse competence
4.6.1 And er
4.6.2 Yeah/yeah/yeah I
4.6.3 I think er and I think it
4.6.4 Ok er/ ok I
4.6.5 Well I
4.7 Conclusion
5 Pragmatic competence
5.2 Definitions of pragmatic competence
5.2.1 Requests
5.2.2 Apologies
5.2.3 Formulaic language and developing pragmatic competence
5.5 Pragmalinguistic features of successful request language
5.6 Pragmalinguistic features of successful apology language
5.7 Sociopragmatic features of successful request and apology language
5.8 Conclusion
6 Conclusion
6.2 Summary of findings
6.2.1 Linguistic competence
6.2.2 Strategic competence
6.2.3 Discourse competence
6.2.4 Pragmatic competence
6.2.5 Summary
6.3 Implications for research
6.4 Implications for teaching
6.5 Final thoughts
Index
Figures
1.1 Canale and Swain’s (1980) theory of communicative competence
1.2 Bachman and Palmer’s (1996: 63) language use and performance with Bachman’s (1990: 87) components of language ability
1.3 CEFR Common Reference Levels (Council of Europe 2001)
1.4 An example of a scenario from the CAPT to elicit spoken data
2.1 K-1 and K-2 words at B1
2.2 K-1 and K-2 words at B2
2.3 K-1 and K-2 words at C1
2.4 Percentage type vocabulary profiles
2.5 Percentage token vocabulary profiles
2.6 We used with a specific function at B1
2.7 We used to refer to a third party at B2
2.8 We used to make a general reference at C1
2.9 Repetition of we at B1
2.10 Linking ideas with I and synonyms in the native speaker data
2.11 I think + NP + ADJP at B1
2.12 I think + NP at B1
2.13 I think + NP at B2
2.14 I think + NP at C1
2.15 I can at B1
2.16 You can at B2
2.17 We can at C1 level
2.18 The use of yeah with we can to develop points across speaker turns at C1
2.19 We can used to develop ideas within a speaker turn at B1
2.20 A lot of things I at B2 level
2.21 Agree with you but at B2 level
2.22 Agree with you because at C1 level
3.1 Procedure for identifying CS language in the USTC
3.2 B1 production strategies
3.3 B1 interaction strategies
3.4 B2 production strategies
3.5 B2 interaction strategies
3.6 C1 production strategies
3.7 C1 interaction strategies
3.8 B1 pronoun correction
3.9 B2 verb correction
3.10 C1 verb correction
3.11 NS pronoun correction
3.12 B1 tense correction
3.13 B2 tense correction
3.14 C1 tense correction
3.15 NS tense correction
3.16 B2 missing word correction
3.17 C1 missing word correction
3.18 C1 utterance reformulation
3.19 NS utterance reformulation
3.20 C1 post-utterance correction
3.21 NS post-utterance correction
3.22 C1 interaction questions to relinquish the turn
3.23 Task instruction clarification request at B1
3.24 Vague question clarification request at B2
3.25 Repetition clarification request at C1
3.26 Use of paraphrase and synonyms to clarify unknown vocabulary at C1
3.27 Use of overt questions to clarify unknown vocabulary at B1
4.1 Samples of it used for cataphoric and anaphoric reference
4.2 Examples of it used as a cohesive device
4.3 Examples of this and that at B1, B2 and C1 levels
4.4 For example to develop a theme
4.5 Use of what do you think? to initiate conversation
4.6 Use of so yeah to close a turn
4.7 And er as a continuation
4.8 Yeah across the levels
4.9 I think er/I think it across the levels
4.10 Ok across the levels
4.11 Well used at C1 level
5.1 An example of a scenario from the CAPT to elicit spoken data
5.2 Strategy choice for request head acts
5.3 Formulaic strategies for the apology speech act (based on Trosborg 1987)
5.4 NNS and NS head act requests
5.5 Use of want in the NNS and NS request data
5.6 Use of can as a repair strategy in the NNS and NS apology data
5.7 Common chunks in NNS and NS apology data
5.8 Grammatically inaccurate chunks in the NNS apology data
5.9 Examples of NNS strategies and organisation in the SPACE data
5.10 Examples of the because-therefore, therefore-because patterns in the request data
Tables
1.1 Marking criteria
1.2 Details of the learner test corpus
1.3 Rating scale to evaluate participant request and apology responses
1.4 Details of the Speech Act Corpus of English (SPACE)
2.1 Percentage of K-1 and K-2 words at B1, B2 and C1
2.2 Types, tokens, means and SDs for K-1 and K-2
2.3 Cumulative percentage types and tokens
2.4 The 20 most frequent words from B1-C1 and in NS data
2.5 Normalised frequencies for we
2.6 Collocates of we at B1, B2 and C1
2.7 Normalised frequencies for er and erm
2.8 Collocates of er
2.9 Collocates of erm
2.10 B1 Keywords
2.11 B2 Keywords
2.12 C1 Keywords
2.13 NS Keywords
2.14 Think at B1, B2 and C1
2.15 Colligational patterns of think
2.16 Collocations of I think
2.17 Can at B1-C1 levels
2.18 Collocates with we can B1-C1 levels
2.19 3-word chunks at B1-C1 levels and in the native speaker data
2.20 4 word chunks at B1-C1 levels and in the native speaker data
2.21 Comparison of many, much and a lot of across B1, B2 and C1
2.22 Collocates of a lot of across USTC
2.23 Comparison of USTC a lot of collocates with LINDSEI data
3.1 Dörnyei and Scott’s (1995) CS taxonomy
3.2 Final CS codes used for B1, B2 and C1 (Council of Europe 2001: 64–65, 86–87)
3.3 Revised B1 production strategies
3.4 Most frequent production strategy statements at B1, B2 and C1
3.5 Most frequent interaction strategy statements at B1, B2 and C1
3.6 Focus of error correction in the USTC corpus
3.7 CEFR qualitative descriptions of learner accuracy (Council of Europe 2001: 28)
3.8 Summary of most frequent error types in USTC
3.9 Top three questions used at B1 to invite others into the conversation
3.10 Top three questions used at B2 to invite others into the conversation
3.11 Top three questions used at C1 to invite others into the conversation
3.12 The three most frequent clarification requests at B1, B2 and C1
3.13 Language used for clarification requests
4.1 CEFR ‘can do statements’ related to discourse competence
4.2 Language items used to realise discourse competence across B1 – C1 levels
4.3 Log-likelihood scores for key linguistic items used to realise discourse competence
4.4 Frequency of DMs in learners and NS data
4.5 Log-likelihood scores for each discourse marker
4.6 Two to six word chunks with er, yeah, I think, Ok, you know, well and I mean
5.1 CEFR ‘can do’ statements for sociolinguistic appropriateness
5.2 The twenty most frequent words in the learner and native speaker request data
5.3 The twenty most frequent four-, three- and two-word chunks in the learner request data
5.4 The twenty most frequent four-, three- and two-word chunks in the native-speaker request data
5.5 The twenty most frequent words in the learner and native-speaker apology data
5.6 The twenty most frequent four-, three- and two-word chunks in the learner apology data
5.7 The twenty most frequent four-, three- and two-word chunks in the native speaker apology data
6.1 Summary of successful speakers’ key competences at B1-C1 levels
Acknowledgements
We would like to thank colleagues past and present for help, inspiration and
support:
Svenja Adolphs, Marco Antonini, Nick Carter, Ronald Carter, Jane
Cleary, John Cross, Isabel Donnelly, James Donnithorne, Andy Downer,
Graham Ethelston, Nick Gregson, Patrycja Golebiewska, Tania Horak,
Douglas Hamano-Bunce, Simon Hobbs, Stuart Hobbs, Geoffrey Leech,
Josie Leonard, Michael McCarthy, Fergus Mackinnon, Hitomi Masuhara,
Marije Michel, Alan Milby, Carmel Milroy, Clive Newton, Anne
O’Keeffe, Sheena Palmer, Simon Pate, Raymond Pearce, Lesley Randles,
Karen Smith, Naoko Taguchi, Ivor Timmis, Daniel Waller, Neil Walker,
Nicola Walker, Andy Williams, Jane Willis.
Thanks to all our BA TESOL/MOLA and MA TESOL with Applied Linguistics
students past and present.
Thanks to Nadia Seemungal-Owen and Helen Tredget at Routledge for clear
guidance and help throughout.
Finally, thanks to our families for all their support.
Chapter 1
What is a successful speaker of English?
1.1 Introduction
The aim of this chapter is to explore and attempt to define the concept
of a successful speaker of English and lay the groundwork for the chap-
ters which will follow. In order to work towards this definition, we seek to
explore the notion of a successful speaker within one main framework: the
notion of communicative competence, first developed by Hymes (1972). We
will argue that this is an appropriate manner in which to explore successful
speakers and their interlanguage (Selinker 1972). A successful speaker is,
we believe, one who can demonstrate all facets of communicative compe-
tence at their level and as appropriate for the purpose of the discourse they
are taking part in. We will suggest that the different aspects of a successful
speaker can therefore be analysed by exploring their linguistic, strategic,
discourse and pragmatic competence, as outlined by Hymes (1972), Canale
and Swain (1980), Canale (1983) and Bachman and Palmer (1996) and that
these competences will vary according to a learner’s level of English. This
model is not new but has, we believe, had a large impact upon the develop-
ment of communicative language teaching and testing and was one of the
theoretical bases of the Common European Framework of Reference for
Languages (Council of Europe 2001), so we feel it is still highly relevant. For
instance, although much learner corpus research focuses on errors or
deviations identified through comparisons with native speaker language use, communicative
competence models still promote the notion that assessments of language
should attend both to language knowledge and how a speaker uses this
knowledge in actual communication. After 30 years, this fundamental view
still resonates in the CEFR, since in viewing learners as social agents with
varying target language needs, development across levels is tracked accord-
ing to what students can do with their language when communicating across
contexts, according to different functions and with changing audiences. In
basing its own treatment of communicative competence on the interplay
between linguistic, pragmatic, sociolinguistic and existential competences,
the CEFR clearly adopts many of the original aspects of communicative
competence to demonstrate that language knowledge, skills and know-how
are only activated, and available for measurement across the level scales,
when language is actually used for communication. With the CEFR having
had a great impact on language assessment and coursebook design, it is clear
that communicative competence models such as those of Canale and Swain
(1980), Canale (1983) and Bachman and Palmer (1996) are still significant
and of relevance to notions of success. When analysing communicative com-
petence in this way, we seek to show that successful spoken interaction takes
place at the level of discourse, involves cooperation between speakers and
effective listenership, alongside spoken production (Carter and McCarthy
2015). We also hope to show that a successful speaker’s English is something
which changes as a learner’s interlanguage develops and need not be seen as
the final outcome of learning a second language.
Before we explore these definitions in more detail, this chapter will first
outline why we wish to explore success in general and, in particular, in terms
of spoken language. As we do this, we will make reference throughout to the
ways in which corpora have helped to define notions of success. We will also
outline details of the learner corpora used as the primary data sources for
our analysis and how we intend to use them in the rest of the book.
1.2 Rationale for exploring successful spoken language
As English language tutors and assessors, our experience has confirmed the
view that many learners aspire towards a native or native-like proficiency
in English. A noticeable trend, however, is the tendency for some learners to
overlook the fact that they can still be successful in their use of English with-
out achieving such a level. The ‘comparative fallacy’ (Bley-Vroman 1983: 1),
involving the comparison and assessment of non-native speakers against
native-speaker norms, would seem, therefore, not only to be a feature found
in Second Language Acquisition (SLA) research, but one which appears in
learners’ self-assessments. In spite of native-speaker (NS) variation in
linguistic and sociocultural abilities causing considerable debate regarding the
identification of a sole NS norm (see Andreou and Galantomos 2009; Cook
1999; Kramsch 2003; Lee 2005; Lyons 1996; Rampton 1990), learners still
strive towards the NS ideal (Timmis 2002; 2003; 2005). Whilst doing so, it
is our view that learners often do not realise that they, as non-native
learners, can be ‘successful users of English’ (SUEs) (Prodromou 2008: xiv) who
are capable of drawing on their linguistic resources to operate effectively in
the contexts they encounter. However, with studies of SUEs (see Piller 2002;
Prodromou 2008) often focussing on more advanced learners or those who
can ‘pass’ as NSs, there is sometimes a notion that ‘success’ is associated
only with the elite group of learners who are able to reach the highest levels
in language learning. Our reason for exploring learner success in English
thus stems from the sense that the NS target set by learners, and indeed
sometimes by practitioners (see Canagarajah 2007; Kramsch 2003; Timmis
2002), forms the basis of a learner’s assessment of their own achievement:
success equates to attaining native-like proficiency whereas other proficien-
cies equate to failed or incomplete attempts (Birdsong 2004). In contrast
to this, we wish to suggest that learners can be successful at different levels
and not only at a high level of competence. In other words, communicative
competence (as we will move on to discuss) is an attainable goal at different
language levels and it is the ability to be communicatively competent which
we can view as a measure of success. This argument is in opposition to the
notion that the ‘native speaker’ is the only model of success. We suggest that
this can create a ‘deficit relationship’ in which NSs have ‘the upper hand’
(Prodromou 1997: 439) and, rather than underlining a learner’s success in
using an L2, can heighten their sense of failure towards L2 development
and the NS model itself (Cook 2002; 2008). As Naiman et al. (1978: 2)
remark:
Failure is accompanied by dissatisfaction, awareness of one’s own
inadequacy, and sometimes annoyance, disappointment, frustration, and
even anger at the colossal waste of time.
The continual pressure to replicate the NS not only creates ‘stereotypes that
die hard’ (Nayar 1994: 4), but it potentially discourages learners from per-
severing with the acquisition of English. It also overlooks the potential that
could be unlocked by viewing learners not as ‘failed native speakers’ but
instead as ‘successful multicompetent speakers’ of more than one language
(Cook 1999: 204).
Of vital importance to this study, however, is the barrier created by the
NS model. By focussing on the overall goal of L2 development, little assis-
tance is offered to learners whose goals for success may simply be to develop
their interlanguage to become more proficient in English than they currently
are. For instance, the NS model cannot be fully relevant to their needs since
in representing the ‘finished article’, it cannot demonstrate the nuances in
interlanguage between the beginner, pre-intermediate, intermediate, upper-
intermediate and advanced learner levels. While this realisation has prompted
some writers to propose alternative models of greater relevance to L2 learners’
multicompetences and language learning strategies (see Alptekin 2002; Cook
2008; Coperías Aguilar 2008; Edge 1988; Medgyes 1992; Modiano 1999
and Preston 1981 for alternative models; see also Cook 1992; Cook 1999;
Coppieters 1987; Galambos and Goldin-Meadow 1990 for multicompetence
and O’Malley and Chamot 1990; Oxford 1990; Oxford and Nyikos 1989
for language learning strategies), relatively few studies have sought to inves-
tigate what makes learners within them successful in their own right. By
focussing on learners, this book therefore intends to address the imbalance
caused by the dominance of the NS model by establishing not what learners
of English are unable to do at different stages, but rather what they are able
to do in terms of their spoken production. In accordance with the levels and
descriptors offered in the CEFR, it will detail how success manifests itself in
the speech of B1, B2 and C1 learners and how their interlanguage progresses
in ways that might not always be recognised in some syllabi and textbooks.
1.3 Rationale for a focus on spoken language

Although it would be possible to consider other skills, in this book we have
chosen to focus on successful spoken language. There are several reasons
for this. Firstly, as the ‘primary form’ of language, it lays the foundation for
other modes (e.g. written language) and provides the source from which
language evolution and change often stems (Hughes 2011: 14). The second
reason is that despite this theoretical ‘reverence’, in language teaching, it
is often not valued as much as other skills (Bygate 1987: vii) and corpus-
informed research is often focused upon written corpora, due to their preva-
lence (Jones and Waller 2015). This is despite the fact that our experience
also tells us that spoken language is not undervalued by learners and the
desire to learn English is often framed in terms of a desire to speak it success-
fully. Finally, and possibly most importantly for the topic of success, speak-
ing represents the skill which is ‘most frequently judged’ (Bygate 1987: vii),
a perhaps unsurprising fact given that the majority of language use is spoken
rather than written (Lewis 1993). For learners wishing to be judged as suc-
cessful in an L2, the skill of speaking would seem to be, therefore, the most
obvious starting point in a study of learner success and it is for these reasons
that we have made it the focus of this book.
1.4 Definitions of successful language: communicative competence
In order to advance our definition of successful spoken language, it will first
be helpful to outline the models of communicative competence on which
we will base our analysis in this book. We will do so first by defining and
explaining each theory, before bringing them together to form our
definition of successful spoken English.
1.4.1 Hymes’s theory of communicative competence

The first theory of communicative competence, created by Hymes (1972),
challenged the view that sociolinguistic aspects of language use were
merely performance-related imperfections (Hymes 1972; Llurda 2000;
Taylor 1988). By stating that competence developed independently of socio-
cultural aspects, Chomsky (1965), who famously separated competence from
performance, had neglected to show the features of language which made
it not only well-formed, but also acceptable. Hymes (1972) hence made the
key distinction that sociocultural aspects of language use also symbolise a
type of competence that develops during first language acquisition (FLA).
Governed by their own set of rules and systems, social experiences not only
contribute to performance, but also to a language user’s internal knowledge
of their first language. Hymes highlights that an individual’s knowledge and
the ability to use it in performance are linked and are of equal worth to
descriptions of language:
We have then to account for the fact that a normal child acquires knowl-
edge of sentences not only as grammatical but also as appropriate. He
or she acquires competence as to when to speak, when not, and as to
what to talk about with whom, when, where, in what manner. In short,
a child becomes able to accomplish a repertoire of speech acts, to take
part in speech events, and to evaluate their accomplishment by others.
(Hymes 1972: 277)
The key argument in this theory, therefore, is that linguistic and sociocul-
tural competences are not separate entities with the latter ‘grafted’ onto the
former, nor is sociocultural competence irrelevant to a language user’s over-
all language competence. Similarly, competence no longer remains a static
state of knowledge due to its involvement in the process of creating and
comprehending meaning in a range of communicative contexts (Ellis 1994).
Competence instead embodies the ‘capabilities of a person’, dependent both
on declarative knowledge (knowledge about language) and the ‘ability for
use’ (Hymes 1972: 282). This definition incorporates four distinct qualities:
that language is formally possible, in terms of grammar, culture or com-
munication; feasible, regarding the implementation of language according
to psycholinguistic and cultural factors; appropriate, involving awareness
of contextual features and tacit knowledge of sentences and situations; and
done in the sense that the forms do actually occur in a language (see Hymes
1972: 281–286).
Hymes’s (1972) theory did much to extend the concept of competence in
communication and its applicability to second language learning and suc-
cess. For instance, as a theory fundamental to social interaction (Paulston
1992), it stipulates that grammatical accuracy alone is insufficient in the
learning and use of a language. It advocates that language users need ‘knowl-
edge of both the L2 grammar and of how this system is put to use in actual
communication’ (Ellis 2008: 6). For adult L2 learners, this means, for example,
that the learning of grammar rules alone would not result in successful
communicative competence; they also have to be capable of adapting
language to suit its mode, audience, genre and context. Ultimately, Hymes’s
theory asserts that learners possessing high grammatical accuracy and low
sociocultural competence are likely to be judged as unsuccessful. Hymes’s
communicative competence also acknowledges the variation evident across
individuals. By recognising that language users possess ‘differential knowledge
of a language’, less emphasis is placed upon the ideal speaker-hearer
proposed by Chomsky (Hymes 1972: 270). It is clear that this theory of
communicative competence has much relevance for L2 learners who differ
not only from each other, but also in their own evolving knowledge and use
of a second language.
1.4.2 Canale and Swain’s theory of communicative competence
One criticism of Hymes’s explanations relates to the exact nature of compe-
tence and the interactions involved in making language knowledge available
for use. Despite communicative competence rapidly assuming ‘buzzword’
status in language learning (Canale 1983: 2), communicative competence
theory still required development following its initial proposal. Considered a
‘major advance’ (McNamara 1995: 167), Canale and Swain’s (1980) theory
of communicative competence thus aimed to clarify terms, create a new
model, and, in stark contrast to Chomsky and Hymes, place the language
learner at the heart of discussion.
Hymes (1972) posits that competence comprises a combination of gram-
matical and sociolinguistic knowledge. However, his model did not include
an explicit definition of grammatical competence and this led Canale and
Swain to stipulate that grammar should be reinstated in definitions and
models of communicative competence:
Just as Hymes (1972) was able to say that there are rules of grammar
that would be useless without rules of language use, so we feel that there
are rules of language use that would be useless without rules of grammar.
(Canale and Swain 1980: 5)
They stress that grammatical, sociolinguistic and strategic competence (to
be discussed shortly) are of equal importance within communicative
competence, which is thus reliant upon the ‘relation and interaction’ of all its
elements (Canale and Swain 1980: 6). Communicative performance is there-
fore defined as:
the realization of these competences and their interaction in the actual
production and comprehension of utterances (under general psychological
constraints that are unique to performance) . . . [it is the] actual
demonstration of this knowledge in real second language situations for
authentic communication purposes.
(Canale and Swain 1980: 6)
Canale and Swain’s clarification of each of the competences is displayed
in Figure 1.1. As mentioned, their theory comprises three distinct elements:
grammatical competence, sociolinguistic competence and strategic competence,
a new addition based on communication skill research focussing on
the oral skills learners need to ‘get along in . . . or cope with’ most of the
situations they are likely to encounter (Canale and Swain 1980: 9).
Grammatical competence concerns knowledge of lexis, morphology,
syntax, semantics and phonology which merge with L2 pedagogy to help
learners ‘determine and express accurately the literal meaning of utterances’
(Canale and Swain 1980: 30). Sociolinguistic competence, on the other
hand, relates more to the non-literal traits of language use involved in mak-
ing sense of the grammar of an utterance and the language user’s intentions.
It deals specifically with i) the characteristics involved in topic, participant
role and interaction norms, and ii) the employment of grammar, attitude
and register that are required for communication to be seen as appropriate,
a case in point being a university student using imperative forms to demand
that a tutor offer an assignment deadline extension and then using ‘ta’ rather
than ‘thank you’ if it is granted. Similarly, sociolinguistic competence relies
on rules of discourse, examinations based on the cohesion and coherence
of utterances on a collective rather than an individual basis; put simply,
language as a whole rather than isolated utterances. Whereas judgements of
accuracy can report on utterances separately, discourse incorporates the way
utterances combine to complete communicative transactions appropriately.

Communicative competence
  Grammatical competence: lexis; morphology; syntax; sentence-grammar semantics; phonology
  Sociolinguistic competence: sociocultural rules of use; discourse
  Strategic competence: verbal/non-verbal communication strategies; grammatical competence strategies; sociolinguistic competence strategies
Figure 1.1 Canale and Swain’s (1980) theory of communicative competence

Finally, strategic competence plays a role in communication breakdowns
attributed to a lack of competence or the influence of performance variables
(Canale and Swain 1980: 29). Relating either to obstacles facing grammatical
or sociolinguistic competences, strategic competence helps learners to
counteract problems they contend with during communication. Although
these three features comprise Canale and Swain’s (1980: 31) theory of
communicative competence, the writers are keen to emphasise one overarching
condition affecting each component: its dependence on ‘probability
rules of occurrence’. Perhaps similar to Hymes’s condition necessitating
knowledge of whether a linguistic term is ‘done’, communicative competence
in the target language will only be achievable if the learner acquires knowl-
edge as to whether linguistic features actually occur in the second language.
Canale and Swain’s (1980) theory has significant implications for second
language learning success. This theory shifts focus from the more generalised
definition of Hymes, towards L2 learners and in particular L2 teaching
and testing. Related to this is the caveat that communicative teaching should
endeavour to ‘prepare second language learners to exploit’ language [empha-
sis added] (Canale and Swain 1980: 29). Whilst this concerns the expansion
of grammatical knowledge through sociocultural and strategic competence
and L2 experience, this remark links clearly to an assertion made in the
introductory chapter: that language teaching should pay attention not only
to the gaps in learners’ knowledge, but also to how their knowledge can be
enhanced, boosted and ‘exploited’ to achieve continued success. Another
distinguishing feature of this theory in comparison to Chomsky and Hymes
is that gaps in grammatical and sociocultural competences can, in fact, be
minimised by strategic competence. Though learners will encounter a range
of communicative obstacles, strategic competence may still allow learners to
communicate efficiently. The fact that Canale and Swain (1980: 30) remark
that communication strategies can be ‘called into action’ suggests that learn-
ers may make conscious decisions about how and when to use them.
1.4.3 Canale’s theory of communicative competence

The theories of Hymes (1972) and Canale and Swain (1980) are helpful
models but have also been criticised in the following ways: i) for an absence
of explicit detail regarding how the proposed components of communicative
competence interact, ii) for the suggestion that learners remain passive in
the communicative competence process, and iii) for confusion as to the role
of formulaic lexis in grammatical competence and the nature of strategic
competence as knowledge or ability. Of particular relevance is the absence of
detail regarding how competence is actually used in performance, something
which Canale’s (1983) theory advanced.
Responding to criticisms of Canale and Swain’s theory, Canale strives to
clarify and expand on some of the ambiguities that arose. In reaction to asser-
tions that the theory included a concept similar to Hymes’s ability for use
(see Shohamy 1996), despite their overt reluctance to do so (see Canale and
Swain 1980: 7), Canale (1983: 5) acknowledges that their theory referred
alternatively to a communicative competence incorporating ‘underlying
systems of knowledge and skill required for communication’. The knowl-
edge fundamental to communicative competence, or likewise the declarative
knowledge of ‘knowing about’ language is therefore accompanied by skill,
the procedural knowledge concerning the extent to which knowledge ‘can be
performed’ or put to use in ‘actual communication’: the new term created for
performance in order to avoid confusion with Chomsky’s 1965 definition
(Canale 1983; McNamara 1995). Transitioning from knowledge-oriented
approaches to language teaching, Canale asserts that adopting a more
skill-oriented approach is a much-needed shift if students are to learn how to
employ such knowledge adequately:
such [knowledge-oriented] approaches do not seem to be sufficient for
preparing learners to use the second language well in authentic situa-
tions: they fail to provide learners with the opportunities and experience
in handling authentic communication situations in the second language,
and thus fail to help learners to master the necessary skills in using
knowledge.
(Canale 1983: 15)
For learners of a second language aspiring to be seen as successful, competence
is therefore not only reliant upon having knowledge but also being exposed
to it in real situations and having the means to utilise it. As Lewis (2012: 33)
remarks, knowledge is necessary but ‘What matters is not what you know,
but what you can do. ‘Knowing’ a foreign language may be interesting; the
ability to use it is life-enhancing.’ With learners clearly varying in different
levels of knowledge and in the skills to execute them, this means that com-
municative competence could be described as a quality that changes as tar-
get language experience develops. It also represents a more explicit attempt,
beyond simply including strategic competence, to associate competence and
performance. Furthermore, a learner with a high degree of knowledge may,
in fact, be considered unsuccessful if they are not able to put that knowledge
into action whereas a learner with less knowledge may still be able to exploit it
to some extent. Canale’s (1983: 7–11) theory of communicative competence
thus contains four key areas of knowledge and skill, briefly summarised here:
• Grammatical competence: ‘concerned with mastery of the language
code’ and, once again, with the comprehension and creation of literal
meaning; it relates to vocabulary, word formation, sentence formation,
pronunciation, spelling and linguistic semantics.
• Sociolinguistic competence: involves the ‘extent to which utterances
are produced and understood appropriately in different sociolinguistic
contexts depending on contextual factors.’ With appropriacy thus
relying on the form and meaning (of functions, attitudes and ideas),
sociolinguistic competence displays a contrast with Canale and Swain’s
theory in that it no longer encompasses discourse rules.
• Discourse competence: ‘concerns mastery of how to combine grammati-
cal forms and meanings to achieve a unified spoken or written text in
different genres.’ As a separate entity, discourse competence refers to
the structural connections enabling interpretation of a text, or cohesion,
and the relationships among the literal, functional and attitudinal
meanings conveyed in a text, or coherence.
• Strategic competence: relating to the ‘mastery of verbal and non-verbal
communication strategies’ allowing for i) the compensation of commu-
nication breakdowns which occur due to a lack of competence or due
to ‘limiting conditions’, and ii) the boosting of communicative ‘effec-
tiveness’ relating to the context and function of language i.e. rhetorical
effect, a new addition to the previous definition of strategic competence
in Canale and Swain.
This model is helpful, as it clarifies some issues which arose from previous
models. As the next section will also reveal, the model emphasises renewed
attempts by theorists to describe more explicitly how knowledge is utilised
in communication.
1.4.4 Bachman and Palmer’s model of language use and performance
Though the previously summarised theories illustrated a shift from purely
knowledge-oriented models of competence to models of communicative
competence acknowledging the roles of sociolinguistic, discourse and strategic
competence, there is still a gap within these theories. This gap pertains to
how competence is realised in performance; to paraphrase Lewis (2012), to
show ultimately what can be done with language knowledge. Despite asser-
tions that performance cannot offer a true or complete reflection of learner
competence, the final theory to be presented here, developed by Bachman
and Palmer (1996), outlines learner competences and their involvement in
language use. Their model of language use and performance by no means
represents a large departure from the theories discussed. They commend
the move beyond sentence level grammar prompted by Hymes (1972) and
state that their theory is ‘essentially an extension’ of Canale and Swain’s
(1980) theory of communicative competence (Bachman 2007: 54). Similarly,
it clearly shares parallels with Canale’s (1983) definitions of knowledge
and skill due to its integration of Bachman’s (1990) notion of communi-
cative language ability. By stating that operating communicatively in lan-
guage involves both ‘knowledge of competence . . . and the capacity for
implementing, or using this competence’ (Bachman 1990: 81), it is evident
that skills, or one’s ‘capacity’ for utilising language, are fundamental to the
demonstration of language knowledge in performance. For second language
learner success, it once again implies that capacity for executing language
use can develop within and differ across individuals. A key principle under-
pinning language use and performance in this theory is the interactive nature
of language production. For the focus of this book, the manner in which
language use occurs within interactive situations is particularly important.
This is summarised as follows:
In general, language use can be defined as the creation or interpretation
of intended meanings in discourse by an individual, or as the dynamic
and interactive negotiation of intended meanings between two or more
individuals in a particular situation.
(Bachman and Palmer 1996: 61–62)
As a theory useful for considering and planning test design, Bachman and
Palmer’s model broadens the scope of language usage to incorporate char-
acteristics of the speaker and the task and setting. As figure 1.2 shows, this
change was attributed to features other than language knowledge – personal
characteristics, topical knowledge and affect – playing an important role
in someone’s ability to communicate. For example, in a test situation, a
learner’s display of their language knowledge may be impaired by factors
affecting the individual on the day of the exam, and this could lead to a
potential reluctance to produce language. The figure demonstrates, there-
fore, the characteristics held by an individual (in the bold circle), bearing
some resemblance to the CEFR’s existential competence (Council of Europe
2001), and how this relates to the external task or setting in which they
interact. Although individual differences will not be elaborated upon here, it
is evident that factors such as mood, tiredness, personality, topic knowledge
and willingness to attempt or adapt language use, to name but a few, will
influence a speaker’s overall performance.
In terms of its composition, Bachman and Palmer’s (1996) theory of language
use and performance rests ultimately on Bachman’s (1990) theory of
language ability. They posit that the combination of two elements, language
knowledge and strategic competence, provides language users with ‘the ability,
or capacity, to create and interpret discourse, either in responding to tasks
on language tests or in non-test language use’ (Bachman and Palmer 1996:
67). In contrast to previous studies of communicative competence, these two
factors are thus no longer seen as uniform in their impact. For instance, whilst
Canale and Swain (1980) stress that grammatical competence is neither more
nor less important than sociolinguistic or strategic competence, the latter in
this theory is given a renewed role. In Bachman and Palmer’s model, strategic
competence is made central to interactions between task and setting
characteristics and an individual’s language knowledge. In a sense, strategic
competence has been extended beyond the realm of simply compensating
or accommodating language, to the heightened status of underpinning all
language use (Bachman 2007). With respect to performance and success,
therefore, learners’ competences in goal setting, ‘deciding
what one is going to do’; assessment, ‘taking stock of what is needed, what
one has to work with, and how well one has done’; and planning, ‘deciding
how to use what one has’ are crucial (Bachman and Palmer 1996: 71–75);
any deficit or gap in these areas will clearly diminish a language user’s
potential to impart meaning, cope with task demands, adhere to social and
discourse norms, or adapt topic knowledge to the task at hand.

Language use and performance (Bachman and Palmer 1996: 63)
  Characteristics of the individual (bold circle): language knowledge; topical knowledge; personal characteristics; affect; strategic competence
  Characteristics of the language use or test task and setting
Components of language ability (Bachman 1990: 87)
  Language competence
    Organisational competence
      Grammatical competence: vocabulary; morphology; syntax; phonology/graphology
      Textual competence: cohesion; rhetorical organisation
    Pragmatic competence
      Illocutionary competence: ideational, manipulative, heuristic and imaginative functions
      Sociolinguistic competence: sensitivity to dialect or variety; sensitivity to register; sensitivity to naturalness; cultural references and figures of speech
Figure 1.2 Bachman and Palmer’s (1996: 63) language use and performance with Bachman’s
(1990: 87) components of language ability

In their comprehensive discussion of language knowledge, another dis-
tinction with previous theories becomes evident. Previous theories kept
grammar and sociolinguistics (discourse and pragmatics) relatively separate.
Here, such competences are merged under the term ‘language knowledge’,
which is split into two main components. Firstly, organisational competence
concerns knowledge of formal structure contained within utterances or sen-
tences which contribute to the production and comprehension of meaning at
utterance and text level (Bachman and Palmer 1996). Comprising grammatical
competence and textual competence, it relies on a learner’s knowledge
of vocabulary, morphology, syntax, phonology/graphology, cohesion and
rhetorical organisation (a text’s ‘conceptual structure’ and its effect on the
language user (Bachman 1990: 88)). Evident here is the move towards discourse
competence, as described by Canale and Swain (1980), away from sociolin-
guistic aspects of language to organisational aspects so as to illustrate how
utterance-level grammar blends with a language user’s knowledge of the
building of conversation and the marking of connections across utterances.
Secondly, pragmatic competence refers to the ‘interpretation of discourse’
when the formal structure of utterances is connected to the intended mean-
ings of language users in accordance with the setting for language use (Bach-
man and Palmer 1996: 69). Whereas illocutionary (functional) competence
associates utterances and texts to the meanings implied by a language user,
sociolinguistic competence refers to the interactions between language,
the language user and the setting and context. Illocutionary competence is
responsible for the additional meanings attached to utterances. For instance,
the statement ‘I’d like to know how much this vase is’, rather than indi-
cating desired knowledge, functions more as a polite, indirect request for
information; likewise, a partner exclaiming ‘I’m too tired to cook tonight’
could be seen as a suggestion that they would prefer to order a takeaway or
an instruction that the other person should cook. This type of competence
can also be dependent upon prior knowledge or experience. Phonological
aspects aside, an interlocutor listening to a tutor saying ‘She spent one day
on her assignment’ would need prior knowledge of the student’s character
and assignment mark to interpret this as a statement of fact, a statement of
surprise or a statement of criticism; if this example is extended to involve
a student as interlocutor, it could additionally act as a warning for them
not to do the same. Therefore, real-world knowledge (ideational), learning
through language (heuristic), humour and figurative language (imaginary)
and getting other people to do something (manipulative) are all functions of
language which can extend meaning beyond the message conveyed at utter-
ance level. When combined with sociolinguistic aspects such as sensitivity to
dialect or variety, register differences, naturalness of language and cultural
references, it becomes easier to see how Bachman and Palmer’s (1996) model
of language use and performance is more thorough than previous theories in
terms of what, exactly, comprises communicative language ability and what
makes it accurate and appropriate.
This theory is of significance to learner success in a number of ways.
Through the use of arrows, a much more tangible attempt has been made to
explore the types of knowledge required by L2 learners and the interactions
occurring between them. It demonstrates how the form, or organisation,
of language is of equal importance to the meanings inferred by the speaker
or interpreted by the interlocutor as a result of pragmatic competence and
the interactive setting. The notion that appropriacy as well as accuracy is
integral to communication is thus maintained. Of particular interest to this
book is the elevated importance given to strategic competence. In a study
of success, it would be easy to confine this competence to compensation or
communication breakdowns; the implication from Bachman and Palmer is
that learners require strategic competence in a broader sense.
Bachman and Palmer’s (1996) model is by no means perfect or definitive.
Based principally on language testing, its application to general language use
may be more problematic. For instance, it might be relatively easy to docu-
ment or evaluate aspects of pragmatic or sociolinguistic competence in lan-
guage tests of a familiar design, but in wider, freer communicative settings,
they may be harder to distinguish. Secondly, there is no explanation for how
learners make use of formulaic language in these competences. Finally, though
Bachman and Palmer accentuate the interaction with the task or setting, mod-
els of spoken English in teaching materials may actually interfere with a stu-
dent’s impressions of successful interactive speech (Jones and Horak 2014).
1.5 Towards a definition of successful spoken language: communicative competence
In order to define what we mean by successful spoken English, it is first nec-
essary to summarise our definition of communicative competence. Taking
elements of the theories of communicative competence we have discussed so
far, we define this as follows:
1 Linguistic competence – the ability to use language, which includes lexis,
grammar, lexicogrammar and phonology, effectively;
2 Strategic competence – the ability to repair errors when communicating
and also to make appropriate choices which oil the wheels of
conversations;
3 Discourse competence – the ability to organise and link language across
extended conversational turns;
4 Pragmatic competence – the ability to use language as appropriate for
the sociolinguistic context.
We would acknowledge that it is a difficult task to define, let alone measure,
all possible aspects of a successful language user at every level of compe-
tence, in every situation they find themselves in. This is because success is
likely to encompass many variables, including such aspects as confidence
and ability to use and read non-verbal communication. We would also
accept that any definition based on language which students produce will
not be able to measure what students simply understand about language.
Despite this, we believe it is possible to define and measure some aspects of
successful language use and that is what we will attempt to do in this book.
Our definition, coming as it does with an acknowledgement of its limitations,
is therefore based on the discussions of communicative competence earlier
in this chapter and is as follows:
A successful user is one who is able to use linguistic, strategic, discourse
and pragmatic competence as appropriate for a particular goal, at a
particular language level, as defined by the CEFR.
To give an example, one sample goal from the CEFR at B1 level is as follows:
‘I can . . . relate the plot of a book or film and describe my reactions.’
(Council of Europe 2001: 26)
In order to undertake this successfully, we can suggest that a learner needs
language to describe the main aspects of a film or book (linguistic compe-
tence), the ability to clarify unknown language for the listener (strategic
competence), the ability to judge the formality of the language as appropriate
for the listener (pragmatic competence) and the ability to link ideas together
into a coherent narrative, allowing the listener ‘space’ to react and respond
(discourse competence). The interaction, if successful, will also entail coop-
eration between speaker and listener and will include effective listenership
and turn-taking from both parties (Carter and McCarthy 2015). We can also
suggest that it is possible to assess this example of a goal, at least holistically,
by raters assessing the extent to which the goal has been achieved or not and
that the manner in which such a goal was achieved will change according
to the general language level of the speaker. This is an aspect we will discuss
further in section 1.7. Before we move on to this discussion, it is first
necessary, in section 1.6, to outline the basis of the CEFR and how this links to
communicative competences and the notion of learner success.
1.6 Linking communication, communicative competence and learner success to the CEFR
Currently translated into 39 languages (Council of Europe 2014) and in
preparation between 1993 and 2000 (Goullier 2006), the CEFR is a docu-
ment outlining how language proficiency and abilities progress across a vast
range of language learning contexts:
It describes in a comprehensive way what language learners have to learn
to do in order to use a language for communication and what knowl-
edge and skills they have to develop so as to be able to act effectively.
(Council of Europe 2001: 1)
Documenting the language activities (receptive, productive, interactive,
mediative), the language domains (public, personal, educational, occupational) and
the communicative competences (linguistic, sociolinguistic, pragmatic) of
learners, the CEFR categorises language learning into six Common Refer-
ence Levels ranging from A1 Breakthrough to C2 Mastery (see figure 1.3).
These levels are then each illustrated via a set of general and specific descrip-
tors detailing the abilities of learners as they progress globally through the
levels, or as they develop within a particular language use context. With
regard to its usage, the CEFR acts as an aid to those involved in the learning,
teaching, assessment and policy of language, and represents the culmination
of over six decades of work by the Council of Europe (2001), an organisation
responsible for endorsing plurilingualism, linguistic diversity, mutual under-
standing, democratic citizenship and social cohesion across its 47 member
states (Language Policy Division 2006; Council of Europe 2015).

A – Basic user       B – Independent user   C – Proficient user
A1 – Breakthrough    B1 – Threshold         C1 – Effective Operational Proficiency
A2 – Waystage        B2 – Vantage           C2 – Mastery
Figure 1.3 CEFR Common Reference Levels (Council of Europe 2001)

In doing so, the CEFR functions as a tool for improving unity between these members
and for establishing a common terminology to increase transparency,
communication and reflection (Council of Europe 2001: xi). Likewise, in detailing
the numerous competences learners possess and develop, it has also been
found to be ‘extremely influential’ in syllabus design, curricula planning
and language examinations (Hulstijn 2007: 663) as well as the planning
of language learning programmes, language certification and self-directed
learning (Council of Europe 2001). The CEFR was intended to be
non-language-specific to widen its application and scope, and its six levels,
illustrated with ability descriptors, have been credited with its swift
adoption (Alderson 2007) and with its use as the ‘exclusive neutral reference’
in the national, educational setting (Martyniuk and Noijons 2007: 7).
Discussions so far have explored definitions of communicative compe-
tence, language use and communication so as to present previous litera-
ture relating to learner success in speech. However, as we will be examining
success ‘in accordance’ with the CEFR, and the language levels it contains, it
is important that the Council of Europe’s position is also examined. The
CEFR documents learner proficiency and abilities in a range of skills, across
a range of contexts. While its positive wording encourages positive apprais-
als of learner language in terms of what they can do, rather than what they
cannot, several gaps still remain as to the language to be evidenced and the
differences across proficiency levels. This section of the chapter therefore
serves a dual purpose. Intending to relate previous definitions of commu-
nicative competence and communication to the CEFR’s perspective, it will
first investigate the types of knowledge learners are said to possess and how
this is realised in their speech, in particular, during communicative encoun-
ters. Secondly, it will expand on previous criticisms of the gaps that remain
in describing learner proficiency so that their bearing on evaluating learner
success can be fully appreciated.
Firstly, it is necessary to pinpoint how language is regarded in the CEFR.
Taking an action-oriented view, language users and learners are seen as
social agents: individuals achieving ‘tasks’ requiring the use of their stra-
tegic and general competences within specific contexts from which acts of
speech acquire their ‘full meaning’ (Council of Europe 2001: 9). Language
use therefore:
. . . comprises the actions performed by persons who as individuals

and as social agents develop a range of competences, both general and
in particular communicative language competences. They draw on the
competences at their disposal in various contexts under various con-
ditions and under various constraints to engage in language activities
involving language processes to produce and/or receive texts in relation
to themes in specific domains, activating those strategies which seem
most appropriate for carrying out the tasks to be accomplished. The
monitoring of these actions by the participants leads to the reinforcement
or modification of their competences.
Within the CEFR, language use is not viewed as static, nor is it uniform. Lan-
guage users, such as those in Canale and Swain’s (1980) theory, benefit from
their unique set of competences and strategies which enable them not only
to achieve linguistic and non-linguistic tasks, but also to develop in those
competences. The CEFR, similarly to Canale and Swain (1980), Canale
(1983) and Bachman and Palmer (1996), views communicative competence
as part of a language user’s wider, more general competence. Comprising
knowledge, skills, existential competence and the ability to learn, an almost
cyclic relationship is created: communicative competence is thus a part of
general competence, but it is this competence which rather vaguely is said
to ‘contribute[s] in one way or another’ to learners’ communicative abili-
ties (Council of Europe 2001: 101). Declarative knowledge, concerning the
world, sociocultural and intercultural knowledge, is a product of experience
and formal learning. It relates to one’s language and culture as well as to
language users’ knowledge of day-to-day life. In a sense, declarative knowledge
is the CEFR’s equivalent to Bachman and Palmer’s (1996) treatment of topic
knowledge and knowledge of different settings. Skills, pertaining to procedural
knowledge or ‘know-how’, on the other hand concern the ‘ability to
carry out procedures’ (Council of Europe 2001: 11). Using similar terminol-
ogy to Canale (1983), they relate to social, living, vocational and leisure skills
which differ in their degree of mastery, and the ease, speed and confidence
with which they can be performed. Existential competence, comparable to
Bachman and Palmer’s (1996) personal characteristics, is composed of per-
sonal attributes unique to language users. Containing attitudes, motivations,
values, beliefs, cognitive styles and personal factors, existential competence
results from acculturation and a person’s readiness to interact socially with
others. Finally, drawing on ‘various types of competence’, the ability to learn
amalgamates the three previous aspects (Council of Europe 2001: 12). Helping
language users to ‘deal with the unknown’ (Council of Europe 2001: 12),
it involves more than the capability to learn; it suggests that users will dif-
fer in their predisposition, or willingness, to seize or seek opportunities for
exploiting learning potential, hopefully reducing the resistance or threat felt
in learning a new language.
The CEFR’s notion of general competence thus relies on more than
knowledge that something can be done: it highlights the need to know how
it can be done, the willingness to do it, and the ability to add to what is
already known. Regarding communicative competence, comparisons may
be made between this definition and Canale and Swain’s (1980) model as
it contains three similar components: linguistic competences, sociolinguis-
tic competences and pragmatic competences. Interesting to note here is the
plurality of the term ‘competence’ as, in the CEFR’s sense once again, com-
petence refers not only to knowledge but to a combination of knowledge,
skills and know-how in each of the three competences. As in the preceding
theories, linguistic competences relate closely to previous definitions.
Comprising knowledge of and the ability to use lexical, formulaic, gram-
matical, morphological, syntactical, semantic, phonological, orthographical
and orthoepic competence, linguistic competence relates to the construction
and formulation of well-formed, meaningful messages (Council of Europe
2001). Interestingly, in stating that linguistic competence can be scaled, the
CEFR suggests that language is ‘never completely mastered by any of its
users’ (Council of Europe 2001: 109). Perhaps in opposition to previous
NS–learner comparisons, all language users are thus said to always have
something new to learn; their skills and their access to the range and quality
of knowledge also differ across individuals. Sociolinguistic competences,
on the other hand, closely resemble the social and cultural conventions of
previous models, the particular linguistic customs which may need to be followed
in different settings. As no truly homogeneous settings exist, this set
of competences is said to relate specifically to language use in response to
the social contexts surrounding it. Learners thus need to develop their intercultural
competence to facilitate this. Despite having knowledge, skills and
know-how in such an area, language users may be uninformed about sociolinguistic
norms: just as Paulston (1992) remarks, language users can be
unaware of cultural rules until the point at which they are broken. Pragmatic
competence, in place of Canale’s (1983) discourse competence, concerns ‘the
functional use of linguistic resources . . . the mastery of discourse, cohesion
and coherence’ (Council of Europe 2001: 13). Combining the arrangement
of sentences into sequences (discourse competence) as well as the knowledge
of and ability to construct interaction, it conforms closely to Bygate’s (1987)
discussion of ‘routine’ and ‘interactive skills’ to produce spoken conversa-
tion. In short, the view of communicative competence in the CEFR remains
that it constitutes the internalised, yet changeable, state of knowledge, skills
and know-how which are then put to use. It is through performance or
‘observable behaviour’ (Council of Europe 2001: 14) that aspects can be
recognised and activated. However, no specification is offered as to how this
is actually achieved or whether performance is fully able to reflect what is
held in a language learner’s competence.
1.7 Measuring successful spoken language in this book
Just as it is impossible to define successful language in all its aspects, it is
also not possible to measure all aspects of successful language use. We could
not, for example, hope to record and analyse all the successful speech of even
one language user in their daily life. Even if we chose to do so, there are huge
ethical issues when attempting to record conversations with other people,
who may not wish to be recorded. Therefore, in general, most corpora have
sought to capture data under circumstances which are easier to control and
that is a procedure we have used in this book. We have done so by exploring
data from language tests and from computer-animated production tests,
alongside comparison to larger learner and native speaker corpora.
The UCLan Speaking Test Corpus (USTC) informing discussions of success
in this book comprised speakers of various nationalities at three CEFR levels:
B1, B2 and C1, plus a small section comprising native speakers also undertaking
the tests. These tests are based on Canale’s (1984) Oral Proficiency Interview
design and each exam consisted of a general question and answer warm-up
phase, a paired discussion segment in which examiners do not participate,
and a probe stage encompassing a topic-specific conversation between speak-
ers and examiners. Test timings differed across the levels from 10 minutes at
B1 to 15 minutes at C1 (averaging 10 minutes across the three levels).
The native-speaker (NS) portion of the corpus was created as an aid for
comparing and contrasting learner and NS data in the test corpus. Whilst
comparisons of learner and native-speaker data are increasingly being dis-
couraged so as to reduce the effects of the comparative fallacy, such a tech-
nique does have benefits for the overall aim of this book. Similarities, of
course, will exhibit the ways in which learners produce target-like speech;
for those wishing to attain native-like levels of English, findings will dem-
onstrate where NS models of speech can be of more relevance and use.
Alternatively, where differences emerge, evidence will indicate not so much
the irrelevance of an NS model, but the ways in which its occasional dominance
as a ‘yardstick’ or measure for learner success loses significance.
To ensure learners were of the correct levels, chosen exam intakes followed
courses in which the students had completed a placement exam on entry.
Despite not being a failsafe guarantee of language level, proficiency tests
did provide some evidence that students were not taking exams above or
below their level. Secondly, and more importantly for documenting language
success, careful consideration had to be given to the exam grades allocated
to candidates. Following completion of the speaking exams, pre-approved
examiners were responsible for assessing students according to criteria on
marking scales at each level. Individual marks in the areas of grammar,
vocabulary, pronunciation, discourse management and interactive ability
were given to each candidate and then amalgamated into one global score
(see table 1.1 for a sample of these criteria). A global score of 2.5 equates
to a borderline pass, whereas a score of 3 equates to a firm pass. The mark-
ing scheme had strong implications for which exams could be included in
the study since candidates with a global mark of 5 could have possessed an
ability beyond that of B1, B2 or C1 level, whereas candidates obtaining a
borderline global mark of 2.5 may not have exhibited a solid performance.
Furthermore, decisions had to be made as to whether a score of 2.5 for each
language aspect needed to be obtained or whether a global mark would be
sufficient. It was decided that all candidates whose data was to be incorporated
had to achieve a global mark of 3.5 or 4 to ensure a solid performance
and to minimise the possibility of students receiving lower marks in the different
spoken sub-categories; nevertheless, regardless of individual criteria,
the overall performance had still been deemed successful. It also meant that
each candidate in the exam dyad or triad had to obtain a pass of this grade.
Table 1.1 Marking criteria (B1)
Mark 5
Grammar: Structures mostly accurate for the level with only occasional minor slips.
Vocabulary: Consistently demonstrates appropriate and extensive range of lexis for this level.
Pronunciation: Use of stress and intonation puts very little strain on listener and individual sounds are articulated clearly. Utterances are consistently understandable.
Discourse management: Consistently makes extensive, coherent and relevant contributions to the achievement of the task.
Interactive ability: Sustained interaction in both initiating and responding which facilitates fluent communication. Very sensitive to turn-taking.
Mark 4.5
More features of band 4 than band 5.
Mark 4
Grammar: Generally structurally accurate for the level but some non-impeding errors present.
Vocabulary: Evidence of an extensive and appropriate range of lexis with occasional lapses.
Pronunciation: Stress and intonation patterns may cause occasional strain on listener. Individual sounds are generally articulated clearly.
Discourse management: Contributions are generally relevant, coherent and of an appropriate length.
Interactive ability: Meaningful communication is largely achieved through initiating and responding effectively. Hesitation is minimal and the norms of turn-taking are generally applied.
Mark 3
Grammar: Reasonable level of structural accuracy but some impeding errors are acceptable.
Vocabulary: Lexis is mostly effective and appropriate although range and accuracy are restricted at times.
Pronunciation: Use of stress and intonation is sufficiently adequate for most utterances to be comprehensible. Some intrusive L1 sounds may cause difficulties for the listener.
Discourse management: Contributions are normally relevant, coherent and of an appropriate length but there may be occasional irrelevancies and lack of coherence.
Interactive ability: Sufficient and appropriate initiation and response generally maintained throughout the discourse although there may be some undue hesitation. Turn-taking norms may not be observed.
Mark 2
Grammar: Frequent basic errors and a limited command of structure leading to misunderstandings.
Vocabulary: Lexis is limited in terms of range and accuracy and may be inappropriate for the task.
Pronunciation: Inadequacies in all areas of pronunciation put considerable strain on the listener.
Discourse management: Discourse is not developed adequately and may be incoherent and irrelevant at times.
Interactive ability: Contributions limited and the patience of the listener may be strained by frequent hesitations. The norms of turn-taking are rarely observed.
Mark 1
Grammar: Serious structural inaccuracy and lack of control which obscures intended meaning.
Vocabulary: Insufficient or inappropriate lexis to deal with the task adequately.
Pronunciation: Limited competence in all areas of pronunciation severely impedes comprehension.
Discourse management: Monosyllabic responses. Performance lacks relevance and coherence throughout.
Interactive ability: Fails to initiate and/or respond. The interaction breaks down as a result of persistent hesitation. The norms of turn-taking are not observed.
Mark 0
Grammar: Too little speech to assess effectively.
Vocabulary: Too little speech to assess effectively.
Pronunciation: Too little speech to assess effectively.
Discourse management: Too little speech to assess effectively.
Interactive ability: Too little speech to assess effectively.
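The inclusion rule described above (a global mark of 3.5 or 4 for every candidate in the exam dyad or triad) can be illustrated with a short check. This is only a minimal sketch and not the procedure actually used to compile the USTC: the function name, the data layout (one list of global marks per exam) and the sample marks are hypothetical, and only the accepted marks and the every-candidate requirement are taken from the text.

```python
# Global marks accepted for inclusion, as stated in the text.
ACCEPTED_GLOBAL_MARKS = {3.5, 4.0}


def exam_is_eligible(global_marks):
    """Return True if every candidate in the exam holds an accepted global mark.

    `global_marks` is a hypothetical list with one global mark per candidate
    (two marks for a dyad, three for a triad).
    """
    if not global_marks:
        return False
    return all(mark in ACCEPTED_GLOBAL_MARKS for mark in global_marks)


# Invented example exams, for illustration only.
exams = {
    "exam_a": [3.5, 4.0],       # dyad, both candidates within the accepted range
    "exam_b": [4.0, 2.5],       # dyad, one candidate with a borderline pass
    "exam_c": [3.5, 4.0, 5.0],  # triad, one mark above the accepted range
}

eligible = [name for name, marks in exams.items() if exam_is_eligible(marks)]
print(eligible)
```

Under these assumptions a single candidate outside the accepted range excludes the whole exam, which mirrors the requirement that each member of the dyad or triad obtain a pass of this grade.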
To further ensure exams were of the correct grade, procedures were fol-
lowed to check inter-rater reliability. Inter-rater reliability relates to inde-
pendent judges assigning grades to the same performance; the raters do
not confer or collaborate but the item under inspection remains the same
(Bachman 2004). In this case, this process was achieved in the following
ways. Firstly, UCLan has strict procedures in place for the standardisation
and monitoring of examiners’ ratings. Following a training and standardisation
process, individuals worked from the same agreed marking standards.
UCLan guidelines state that an assessor completes the marking grid for each
student but that the interlocutor must also provide a global score. Therefore,
two people, working independently of each other, had to agree on a score.
Secondly, 20% of the exams were checked for consistency by a senior exams
co-ordinator. Table 1.2 demonstrates the final composition of the USTC, giv-
ing details of the corpus’ size and the learners’ backgrounds.
Alongside this data, as a reference corpus, we have also made use of the
Louvain International Database of Spoken English Interlanguage (LINDSEI)
(Gilquin, De Cock and Granger 2010). This corpus contains over a million
words of learner speech from individuals varying in nationality, age and
proficiency and thus it was considered ideal for comparison; based on learner
interviews consisting of three tasks (set topic, free discussion and picture
description), it was thought to be particularly comparable to the data in
the USTC (see Gilquin, De Cock and Granger 2010).
At times, we have also made some comparisons to the spoken section of the
BYU-BNC corpus (Davies 2004). The native speaker data in the BYU-BNC
is of course very different in nature to learner test data, coming as it does
from a variety of speech genres and contexts so we have used this only when
we felt it gave us an instructive comparison and not as something to which
we need to defer. Details of the analysis undertaken are given in each chapter.
The pragmatics corpus arose because, as we have detailed in this chap-
ter, we consider pragmatic competence one aspect of successful language use.
However, in general, we felt that test data does not tend to require learners to
display this competence. This is largely because the interlocutor has control
over the talk and as the topics are not selected, learners are not in general
required to consider appropriacy to any great extent. As a result, we constructed
a corpus based on two very common speech acts, requests and apologies,
and have termed this the Speech Act Corpus of English (SPACE). The test
used to elicit data for the corpus was an innovative oral computer-animated
production test (CAPT) of the type described by Halenko (2013). These tests
use an animated figure within virtual role plays to provide learners with a context
and spoken prompt to which they respond and record their answers. For
example, one situation could be ‘You want an extension on your assignment.
You go to your tutor’s office to ask him for extra time’. The tutor then says
‘Hello, you wanted to see me?’ before the learner responds (see figure 1.4).
Table 1.2 Details of the learner test corpus
CEFR levels examined: B1, B2, C1 and native speaker
Total word count including examiner: 91,173 tokens
Total word count excluding examiner: 69,561 tokens
Total number of texts (exams) used: 57
Total word count at B1 (with/without examiner): 24,074/17,171
Total word count at B2 (with/without examiner): 26,931/21,299
Total word count at C1 (with/without examiner): 29,838/23,083
Total word count in the native speaker section (with/without examiner): 10,330/8,008
Total number of speakers: 121 (60 males (49.6%), 61 females (50.4%))
  B1 = 35 (16 males, 19 females)
  B2 = 37 (22 males, 15 females)
  C1 = 35 (18 males, 17 females)
  Native speaker = 14 (4 males, 10 females)
Average age: 23 years
Average time spent learning English (exc. native speakers): 7 years (B1); 7 years (B2); 10 years (C1)
Average time in UK (exc. native speakers): 8 months (B1); 13 months (B2); 21 months (C1)
Nationalities represented: Chinese = 50; British = 14; Saudi = 13; Japanese = 11; Qatar = 6; Republic of Korea = 4; Nigerian = 4; Unanswered = 3; United Arab Emirates = 3; Iraqi = 3; Libyan = 2; Omani = 2; Egyptian = 2; Colombian = 2; Turkish = 1; Italian = 1
Speakers’ first language: Chinese = 50; Arabic = 32; English = 12; Japanese = 11; Korean = 4; Not given = 3; Kurdish = 2; Spanish = 2; Hausa = 1; Italian = 1; Turkish = 1
Exam topics:
  B1: Cinema (5), games and sports (2), learning (2), friends (1), homes (1), memories (1), personality (1), travel and tourism (1), work (1).
  B2: Cultures and traditions (2), outdoor hobbies (2), advertising (1), education technology (1), future (1), homes (1), jobs (1), lifestyles (1), modern technology (1), music (1), success and luck (1), weather (1), work and training (1).
  C1: Travel and tourism (7), cities (1), community (1), environment (1), history (1), language learning (1), staying healthy (1), the world around us (1), transport (1).
  NS: Cinema (1), environment (1), happiness (1), jobs (1), learning (1), language learning (1).
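The sub-corpus figures reported in Table 1.2 can be checked against the stated totals with a few lines of code. The sketch below is illustrative only: the variable and function names are our own assumptions, while the token and speaker counts are copied from the table.

```python
# Token counts from Table 1.2 as (with examiner, without examiner).
SUBCORPORA = {
    "B1": (24_074, 17_171),
    "B2": (26_931, 21_299),
    "C1": (29_838, 23_083),
    "NS": (10_330, 8_008),
}
REPORTED_TOTALS = (91_173, 69_561)

# Speaker counts from Table 1.2 as (males, females).
SPEAKERS = {
    "B1": (16, 19),
    "B2": (22, 15),
    "C1": (18, 17),
    "NS": (4, 10),
}


def column_sums(rows):
    """Sum each position across a dict of equal-length tuples."""
    return tuple(sum(column) for column in zip(*rows.values()))


token_totals = column_sums(SUBCORPORA)
males, females = column_sums(SPEAKERS)

# Share of non-examiner tokens produced by learners rather than native speakers.
learner_share = 1 - SUBCORPORA["NS"][1] / token_totals[1]

print(token_totals == REPORTED_TOTALS)
print(males, females, males + females)
print(f"{learner_share:.1%} of non-examiner tokens are learner speech")
```

A check of this kind is a simple way to confirm that the level-by-level counts and the headline totals describe the same set of exams before any frequency comparisons are made.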
We acknowledge that this method of data collection cannot replace the
ideal of capturing data in naturally occurring settings, but it goes some way to
addressing some of the well-documented drawbacks of traditional written and
oral production tasks, such as limited authenticity of interaction and learner response.
These virtual role plays were selected over face-to-face role plays because
they allowed all the tests to be administered efficiently and simultaneously to
comparable groups of students, under controlled conditions. A further advantage of the
CAPT is that the computer-animated characters are able to display a range of
non-verbal signals such as facial expressions and gestures, considered to be as
powerful as verbal cues. In addition, authentic voice recordings can be uploaded
for the characters which also aim to simulate semi-authentic interaction in the
academic environment within which learners are currently studying. Halenko
(2013: 288) reports the CAPT successfully elicited language which is closer
to what students would actually say in a given situation, rather than what
they might say, when compared to written production tasks. For instance,
Halenko found written discourse completion task responses tended to be
longer, and often included extraneous detail, unlike the more efficient CAPT
responses, as these examples, from the same participant, illustrate
(lost book borrowed from tutor scenario):
Sorry, sir. I am made a big mistake that I lost a book that you lent me.
I try my best to find where it is but I can’t and I try and find a new one
but I don’t know where I can buy it. I’m really sorry about that. Can
you tell me where I can buy this book? I will buy a new one for you. I
very apologise about that.
(Written production task response)
I apologise about that. Last week you lent me a book but now I can’t
find it at home. I’m so sorry. Can I buy you a new one.
(CAPT response)
Figure 1.4 An example of a scenario from the CAPT to elicit spoken data
[On-screen prompt: ‘You have not completed your essay. You go to your tutor’s office, who you know well, to ask for extra time. You:’]
All of the scenarios and characters were designed to be familiar to learners
studying in an academic study abroad context. The interlocutors within
the CAPT were characters who the learners were likely to encounter on
campus (e.g. a tutor, a librarian, a campus security guard), thereby increas-
ing the external face validity of the instrument (Nureddeen 2008). Higher
imposition requests were included in the scenarios, as led by staff members’
descriptions of situations typifying interactions with international students,
elicited during the design of the test, e.g. requesting an extension for an
assignment from a tutor; asking a campus security guard to retrieve a mobile
phone from a classroom out of hours; apologising for damaging a library
book or apologising for missing class. Participants were therefore placed
in familiar roles and situations, according to the academic context within
which they were currently studying, which are said to be key considerations
to improve both the quality of response and construct validity of the tests
(Bardovi-Harlig 1999; Schauer 2007). The data were collected from international
students on pre-sessional programmes who had been institutionally
assessed as having English proficiency equivalent to B2.
Following data collection, and in a similar procedure to the exam-based
corpus, the pragmatics corpus was then compiled using the learner request
and apology responses which were considered ‘appropriate’ for the sce-
narios presented, as rated by English-speaking tutors. The raters were first
instructed to judge each learner request and apology independently and eval-
uate them on a five-point Likert scale for pragmatic ‘appropriateness’, which
determined to what extent the responses were successful in terms of levels of
directness and politeness. For the purposes of this corpus, ‘appropriateness’ is
defined as, ‘the knowledge of the conventions of communication in a society,
as well as linguistic abilities that enable learners to communicate successfully
in L2’ (Taguchi 2006: 513). The rating scale employed in table 1.3 was
adapted from Shively and Cohen (2008). The rating scale did not require
attention to the grammatical accuracy of the responses, since the focus was
on their overall effectiveness.
Table 1.3 Rating scale to evaluate participant request and apology responses
Rating score: Description
5: I would feel completely satisfied with this response
4: I would feel very satisfied with this response
3: I would feel satisfied with this response
2: I would not feel particularly satisfied with this response
1: I would not feel satisfied at all with this response
0: No response provided
The raters attended a standardisation meeting prior to the actual evaluation
stage, at which the project, the instrument, the rating criteria and the
procedure were explained. A number of practice items, followed by a comparison
of ratings, were completed to achieve a final consensus. A rating of ‘3’ was
discussed as being ‘of minimal satisfaction’ and was set as the cut-off
point for a response to be considered appropriate. Where queries were raised
by the raters during the evaluation stage, these were resolved in follow-up
meetings with the researcher. To understand which strategies and language
components were considered most effective for each scenario, all responses
awarded appropriate scores of 3 (appropriate) and 4 (very appropriate) by
the raters were isolated and used to comprise the pragmatics corpus detailed
in this chapter. The corpus also included NS data elicited via the same means.
The NS data were used as a point of comparison and to illustrate alternate
ways success could be achieved. This approach enabled us to measure prag-
matic competency more clearly. Table 1.4 details the learner data in the
pragmatics corpus.
Table 1.4 Details of the Speech Act Corpus of English (SPACE)
Total word count for learners at B2 level: 33,712
Total number of request and apology responses: Apology: 236; Requests: 851
Total word count for native speakers: 5,725
Total number of speakers: 103 (learner data = 90 speakers; NS data = 13 native speakers)
Average age: 22.4 years
Average time spent learning English: 9.1 years
Average time in the UK: 4.3 months
Nationalities represented: Chinese = 66; Qatar = 11; Japanese = 13; British = 13
Speakers’ first language: Chinese = 66; Arabic = 11; Japanese = 13; English = 13
Sample request scenarios:
  Request for assignment extension from a course tutor
  Request to loan a library book beyond due date
  Request to change accommodation at the accommodation office
Sample apology scenarios:
  Apologise to a tutor for missing class
  Apologise for a noisy party
  Apologise to a librarian for damaging a library book
1.8 Conclusion
In this chapter, we have explained the reason for a focus on successful spo-
ken language. In short, there are two reasons for this. Firstly, successful
spoken language, as produced by learners, is worthy of study on its own
terms, not simply as something in deficit to a native-speaker model. We
acknowledge that many learners aspire to reaching a native-speaker stan-
dard but we also argue that this is often unattainable and successful spoken
English can be produced by learners at different levels of competence. Sec-
ondly, a focus on spoken language has been chosen because it is the spoken
skill which many learners aspire to obtaining and yet spoken language is still
relatively under-researched. In this chapter we have also reviewed the theo-
ries of communicative competence which have informed our definition of a
successful speaker of English and how these link to the CEFR levels B1–C2.
To reiterate, the broad definition we are employing is: A successful user of
spoken English is one who is able to use linguistic, strategic, pragmatic and
discourse competences as appropriate for a particular goal, at a particular
language level, as defined by the CEFR. We have also acknowledged that it
is not possible to measure successful spoken language in all contexts and
therefore we have chosen to examine it by looking at speaking data captured
from tests and computer-animated production tests. Such data gives us evidence
of the successful language students can produce in these contexts but does not, of course, inform
us about language which students only understand. Despite this, we feel it is
a realistic way to examine successful spoken language.
In the chapters which follow, we will use this data to explore successful
spoken language. As the definitions in this chapter show, the different
aspects of competence are often linked together. Each of the chapters which
follow nevertheless takes one of the competences which make up successful
spoken English and analyses it individually.
We therefore look at linguistic competence, followed by strategic, discourse
and pragmatic competence in separate chapters. This is in order to clarify
the different aspects for the reader but where appropriate, we explicitly
make links between the competences and we acknowledge, as mentioned
above, that the different competences work together to produce communica-
tive competence.
References
Alderson, J.C. 2007. The CEFR and the need for more research. The Modern Lan-
guage Journal, 91(4), 659–663.
Alptekin, C. 2002. Towards intercultural communicative competence in ELT. ELT
Journal, 56(1), 57–64.
Andreou, G. and Galantomos, I. 2009. The native speaker ideal in foreign language
teaching. Electronic Journal of Foreign Language Teaching, 6(2), 200–208.
Bachman, L.F. 1990. Fundamental considerations in language testing. Oxford:
Oxford University Press.
Bachman, L.F. 2004. Statistical analysis for language assessment. Cambridge: Cam-
bridge University Press.
Bachman, L.F. 2007. What is the construct? The dialectic of abilities and contexts
in defining constructs in language assessment. In: J. Fox, M. Wesche, D. Bayliss,
L. Cheng, C.E. Turner and C. Doe, eds. Language testing reconsidered. Ontario:
Ottawa Press, 41–72.
Bachman, L.F. and Palmer, A.S. 1996. Language testing in practice. Oxford: Oxford
University Press.
Bardovi-Harlig, K. 1999. Exploring the interlanguage of interlanguage pragmat-
ics: A research agenda for acquisitional pragmatics. Language Learning, 49(4),
677–713.
Birdsong, D. 2004. Second language acquisition and ultimate attainment. In: A.
Davies and C. Elder, eds. The handbook of applied linguistics. Oxford: Blackwell
Publishing, 82–105.
Bley-Vroman, R. 1983. The comparative fallacy in interlanguage studies: The case of
systematicity. Language Learning, 33(1), 1–17.
Bygate, M. 1987. Speaking. Oxford: Oxford University Press.
Canagarajah, S. 2007. Lingua franca English, multilingual communities, and lan-
guage acquisition. The Modern Language Journal, 91(Supplement 1), 923–939.
Canale, M. 1983. From communicative competence to communicative language
pedagogy. In: J.C. Richards and R.W. Schmidt, eds. Language and communica-
tion. New York: Longman, 2–27.
Canale, M. 1984. Testing in a communicative approach. In: G. Jarvis, ed. The challenge
for excellence in foreign language education. Middlebury, VT: The North East
Conference Organisation, 79–92.
Canale, M. and Swain, M. 1980. Theoretical bases of communicative approaches to
second language teaching and testing. Applied Linguistics, 1(1), 1–47.
Carter, R. and McCarthy, M. 2015. Spoken grammar: Where are we and where are
we going? Applied Linguistics, 38(1), 1–20.
Chomsky, N. 1965. Aspects of the theory of syntax. Cambridge: MIT Press.
Cook, V.J. 1992. Evidence for multicompetence. Language Learning, 42(4), 557–591.
Cook, V.J. 1999. Going beyond the native speaker in language teaching. TESOL
Quarterly, 33(2), 185–209.
Cook, V.J. 2002. Portraits of the L2 user. Clevedon: Multilingual Matters.
Cook, V.J. 2008. Second language learning and language teaching. 4th ed. London:
Arnold.
Coperías Aguilar, M.J. 2008. Dealing with intercultural communicative competence
in the foreign language classroom. In: E. Alcón Soler and M.P. Safont Jordá, eds.
Intercultural language use and language learning. New York: Springer, 59–78.
Coppieters, R. 1987. Competence differences between native and near-native speak-
ers. Language, 63(3), 544–573.
Council of Europe 2001. Common European Framework of Reference for Languages:
Learning, teaching, assessment. Cambridge: Cambridge University Press.
Council of Europe 2014. Education and languages, language policy: The Common
European Framework of Reference for Languages: Learning, teaching and
assessment (CEFR) [online] Available from: <www.coe.int/t/dg4/linguistic/cadre1_en.asp>
[Accessed 12 July 2015].
Council of Europe 2015. Council of Europe in brief. [online] Available from: <www.
coe.int/en/web/about-us/who-we-are> [Accessed 12 July 2015].
Davies, M. 2004. BYU-BNC. (Based on the British National Corpus from Oxford
University Press). Available from: <http://corpus.byu.edu/bnc/> [Accessed 27th March
2017].
Edge, J. 1988. Natives, speakers, and models. JALT Journal, 9(2), 153–157.
Ellis, R. 1994. The study of second language acquisition. Oxford: Oxford University
Press.
Ellis, R. 2008. The study of second language acquisition. 2nd ed. Oxford: Oxford
University Press.
Galambos, S.J. and Goldin-Meadow, S. 1990. The effects of learning two languages
on levels of metalinguistic awareness. Cognition, 34(1), 1–56.
Gilquin, G., De Cock, S. and Granger, S. 2010. LINDSEI: Louvain international
database of spoken English interlanguage [CD-ROM]. Louvain-la-Neuve: Presses
Universitaires de Louvain.
Goullier, F. 2006. Council of Europe tools for language teaching: Common European
Framework and portfolios. [online] Available from: <www.coe.int/t/dg4/linguistic/
Source/Goullier_Outils_EN.pdf> [Accessed 12 August, 2015].
Halenko, N. 2013. Using computer animation to assess and improve spoken lan-
guage skills. In: Pixel, ed. ICT for language learning conference proceedings. 6th
ed., 286–290.
Hughes, R. 2011. Teaching and researching speaking. 2nd ed. London: Pearson
Education.
Hulstijn, J. 2007. The shaky ground beneath the CEFR: Quantitative and quali-
tative dimensions of language proficiency. Modern Language Journal, 91(4),
663–667.
Hymes, D. 1972. On communicative competence. In: J.B. Pride and J. Holmes, eds.
Sociolinguistics: Selected readings. Harmondsworth: Penguin, 269–293.
Jones, C. and Horak, T. 2014. Leave it out! Soap operas as models of spoken dis-
course in the ELT classroom. The Journal of Language Teaching and Learning,
4(1), 1–14.
Jones, C. and Waller, D. 2015. Corpus linguistics for grammar: A guide for research.
Abingdon: Routledge.
Kramsch, C. 2003. The privilege of the non-native speaker. In: C. Blythe, ed. The
sociolinguistics of foreign-language classrooms: Contributions of the native, the
near-native, and the non-native speaker. Boston: Heinle, 251–262.
Language Policy Division (LPD) 2006. Plurilingual education in Europe: 50 years of
international cooperation. [online] Available from: <www.coe.int/t/dg4/linguistic/
Source/PlurinlingalEducation_En.pdf> [Accessed 12 August 2015].
Lee, J.J. 2005. The native speaker: An achievable model? Asian EFL Journal, 7(2),
152–163.
Lewis, M. 1993. The lexical approach: The state of ELT and a way forward. Hove:
Language Teaching Publications.
Llurda, E. 2000. On competence, proficiency, and communicative language ability.
International Journal of Applied Linguistics, 10(1), 85–96.
Lyons, J. 1996. On competence and performance and related notions. In: G. Brown,
K. Malmkjær and J. Williams, eds. Performance and competence in second language
acquisition. Cambridge: Cambridge University Press, 9–32.
Martyniuk, W. and Noijons, J. 2007. Executive summary of results of a survey on:
The use of the CEFR at national level in the Council of Europe Member States.
Strasbourg: Council of Europe.
McNamara, T.F. 1995. Modelling performance: Opening Pandora’s box. Applied
Linguistics, 16(2), 159–179.
Medgyes, P. 1992. Native or non-native: Who’s worth more? ELT Journal, 46(4),
340–349.
Modiano, M. 1999. International English in the global village. English Today, 15(2),
22–28.
Naiman, N., Fröhlich, M., Stern, H.H. and Todesco, A. 1978. The good language
learner. Clevedon: Multilingual Matters.
Nayar, P.B. 2004. Whose English is it? TESL-EJ, 1(1), n.p. Available from: <www.
tesl-ej.org/wordpress/issues/volume1/ej01/ej01f1/> [Accessed 27th March 2017].
Nureddeen, F.A. 2008. Cross cultural pragmatics: Apology strategies in Sudanese
Arabic. Journal of Pragmatics, 40, 279–306.
O’Malley, J. and Chamot, A.U. 1990. Learning strategies in second language acquisi-
tion. Cambridge: Cambridge University Press.
Oxford, R. 1990. Language learning strategies: What every teacher should know.
Boston, MA: Wadsworth Publishing.
Oxford, R. and Nyikos, M. 1989. Variables affecting choice of language learning
strategies by university students. The Modern Language Journal, 73(3), 291–300.
Paulston, C.B. 1992. Linguistic and communicative competence. Avon: Multilingual
Matters Ltd.
Piller, I. 2002. Passing for a native speaker: Identity and success in second language
learning. Journal of Sociolinguistics, 6(2), 179–208.
Preston, D.R. 1981. The Ethnography of TESOL. TESOL Quarterly, 15(2), 105–116.
Prodromou, L. 1997. Correspondence. ELT Journal, 63(4), 439–440.
Prodromou, L. 2008. English as a lingua franca: A corpus-based analysis. London:
Continuum.
Rampton, M.B.H. 1990. Displacing the ‘native speaker’: Expertise, affiliation, and
inheritance. ELT Journal, 44(2), 97–101.
Schauer, G.A. 2007. Finding the right words in the study abroad context: The devel-
opment of German learners’ use of external modifiers in English. Intercultural
Pragmatics, 4(2), 193–220.
Selinker, L. 1972. Interlanguage. International Review of Applied Linguistics, 10,
209–231.
Shively, R. and Cohen, A. D. 2008. Development of Spanish requests and apologies
during study abroad. Íkala, revista de lenguaje y cultura, 13(20), 57–118.
Shohamy, E. 1996. Competence and performance in language testing. In: G. Brown,
K. Malmkjær and J. Williams, eds. Performance and competence in second
language acquisition. Cambridge: Cambridge University Press, 138–151.
Taguchi, N. 2006. Analysis of appropriateness in a speech act of request in L2
English. Pragmatics Quarterly Publication of the International Pragmatics Asso-
ciation (IPrA), 16(4), 513–533.
Taylor, D.S. 1988. The meaning and use of the term ‘competence’ in linguistics and
applied linguistics. Applied Linguistics, 9(2), 148–168.
Timmis, I. 2002. Native-speaker norms and International English: A classroom view.
ELT Journal, 56(3), 240–249.
Timmis, I. 2003. Corpora, classroom and context: The place of spoken grammar in
English language teaching. Unpublished PhD. University of Nottingham. Avail-
able from: <http://eprints.nottingham.ac.uk/12246/1/397578.pdf> [Accessed
10 December 2015].
Timmis, I. 2005. Towards a framework for teaching spoken grammar. ELT Journal,
59(2), 117–125.
Chapter 2
Linguistic competence
2.1 Introduction
In this chapter we examine the linguistic competence of successful spoken
English at different levels. As mentioned in chapter 1, we take our defini-
tion of this competence from a combination of Hymes (1972), Canale and
Swain (1980), Canale (1983) and Bachman and Palmer (1996) but in doing
so, we are defining linguistic competence as a competence mainly concerned
with lexis. This arises from a belief that the key to developing linguistic
competence is primarily lexical and that learning words, their collocations
and chunks is a process which grammar assists in but is not driven by. As
a result, we will explore the data in terms of word and keyword frequency
and also how keywords collocate and colligate within larger patterns of
language. By doing so, we will explore the patterns of lexis which learners
at different levels use and show how this language varies according to level.
We will also make comparisons with larger reference corpora and show how
these patterns differ from or resemble those found in these corpora.
2.2 Definitions of linguistic competence

The broad CEFR definition of linguistic competence suggests that it com-
prises knowledge of and the ability to use lexical, formulaic, grammatical,
morphological, syntactical, semantic, phonological, orthographical and
orthoepic competence, in relation to the construction and formulation of
well-formed, meaningful messages (Council of Europe 2001). While this is a
comprehensive description, space restrictions and our focus on spoken lan-
guage will not enable us to describe all aspects of this competence. Instead,
as noted in the introduction, we will primarily focus on the lexical aspects
in the above description, as well as exploring how this language is used to
make meaning by learners at different levels. In doing so, we have been influenced
by definitions of linguistic competence developed by Hymes (1972),
Canale and Swain (1980), Canale (1983) and Bachman and Palmer (1996),
as discussed in chapter 1. We will briefly recap our definition here.
The key to our belief is summarised by Hymes, who suggested that an
individual’s knowledge and the ability to use knowledge of the language
in performance are inextricably linked. In other words, linguistic and
sociocultural competences are not separate entities, with the latter ‘grafted’
onto the former, but co-exist and develop at the same time. As Hymes notes:
We have then to account for the fact that a normal child acquires knowl-
edge of sentences not only as grammatical but also as appropriate. He or
she acquires competence as to when to speak, when not, and as to what
to talk about with whom, when, where, in what manner. In short, a child
becomes able to accomplish a repertoire of speech acts, to take part in
speech events, and to evaluate their accomplishment by others . . .
(Hymes 1972: 277)
This definition suggests that when we examine the language which learners
at different levels can use, we must explore what they can use but also how
they use it, i.e. how it functions in discourse. Therefore, in this chapter, we
will explore linguistic competence in terms of the lexis used and how this
functions at utterance and text level to achieve meanings within the English
language tests which make up our corpus. As we will go on to describe in
section 2.4, this has been undertaken through an exploration of the most
frequent words, keywords and chunks used at each level but prior to this, in
section 2.3, we offer a brief review of previous research in this area.
2.3 Previous studies

The ‘basic dimension of lexical competence’ is vocabulary size (Meara 1996:
37). Though it can be defined in numerous ways (see Goulden, Nation and
Read 1990; Meara 1996; Nation 2001), it will first be discussed in the rather
loose sense of knowing individual words (Lewis 1993) and knowing word
families containing the inflections and derivations stemming from a one-word
root (Nation 2001; Schmitt 2008). Because vocabulary facilitates student performance
in reading, writing, listening and speaking (Chujo 2004), previous research
has aimed to discover what or how much vocabulary learners require in
order to be successful. Since English is rich in its vocabulary (Götz 2013),
studies have also tried to determine how many words are ‘known’ by speak-
ers and how many of these are actually needed in order to use the language
for most purposes (Nation 2001).
Significant gains in establishing the vocabulary size of native speakers
have been made by using corpora to examine frequency (see Laufer and
Nation 1999; Nation 2001) and coverage: the percentage of ‘known words
in a piece of discourse’ (Van Zeeland and Schmitt 2013: 457). With fre-
quency studies often grouping words into bands of 1000, researchers have
been able to calculate that an adult native speaker will know approximately
17,000–20,000 word families (Goulden, Nation and Read 1990; Nation
2001; Nation and Waring 1997). With regard to coverage, however, there
is a broad consensus that a vocabulary of 2000 word families will satisfy
most language demands (Götz 2013; Laufer and Nation 1999; McCarthy
1999; Nation 2001; Nation and Chung 2009; Nation and Waring 1997;
Schonell, Meddleton and Shaw 1956; Stæhr 2008; Thornbury and Slade
2006). Though many studies have focused on written language, estimates of
2000–3000 word families have been suggested for ‘everyday conversations’
(Götz 2013: 64) as this can account for coverage of up to 95% in speech
(Adolphs and Schmitt 2003; Schonell, Meddleton and Shaw 1956). The first
2000 words in English are thus said to encompass the ‘heavy duty’ vocabu-
lary (McCarthy 1999: 4) which, due to their high frequency and coverage,
can provide a strong foundation for meeting individual learners’ everyday
needs (Thornbury and Slade 2006). Despite these findings, one noticeable
gap in research concerns how this figure relates to learner speech. Many of
the studies mentioned (for example, Adolphs and Schmitt 2003; Schonell,
Meddleton and Shaw 1956) have been based on native speaker speech, so
claims as to the coverage provided by the 2000 most frequent words in
English have not been substantiated for learner speech, nor for speech
at different proficiency levels. The CEFR similarly offers little explanation
as to the changes expected in vocabulary sizes across proficiency levels other
than the rather ambiguous descriptors it provides (see Council of Europe
2001: 28–29). Whilst Laufer and Nation (1999) indicate that the usage of
frequent words reduces as proficiency increases, little or no confirmation has
been offered from a language learner perspective which focuses on speech. It
is difficult, therefore, to determine whether the first 2000 words in English
do provide such high coverage in learner speech and whether this figure
changes across proficiency levels. Though corpus studies such as English
Vocabulary Profile (2016) and English Grammar Profile (2016) have been
able to illustrate some of the changes across CEFR proficiency levels, they
base themselves largely on analyses of written learner language. This is one
area which we seek to address in this chapter.
Viewing vocabulary size in terms of individual words is useful in that the
‘central units to be acquired’ by speakers (McCarthy 2006: 8) can be identi-
fied, but doing so fails to appreciate the links generated in meaning and form
when words frequently, and non-randomly, co-occur. Studies of lexicogram-
mar, acknowledging that much of language competence is driven by lexis,
have therefore also led to a re-evaluation of lexis in which the formulaic
pairing and grouping of words into collocations and multi-word units has
been emphasised in place of their treatment in isolation (see Hoey 2005;
Pawley and Syder 1983; Schmitt 2000; Sinclair 1991). Research in corpus
linguistics has revealed how central collocations and multi-word units are
to language use. For instance, given that 55% of language has been shown
to hinge on formulaicity, a feature more prevalent in spoken than in
written language (Biber et al. 1999; Erman and Warren 2000), collocations
and lexical chunks cannot be seen as an ‘added extra’. Instead, we can sug-
gest that chunks ‘are extremely frequent, are necessary in discourse and are
fundamental to successful interaction’ (Adolphs and Carter 2013: 36).
Previous studies have managed to shed light on the definitions and functions
of formulaic language. In terms of definition, collocation signifies two or
more words in adjacency or proximity to each other (McEnery and Hardie
2012: 123). These words co-occur frequently enough that their co-occurrence cannot
be attributed to chance alone (Greaves and Warren 2010: 212); rather, they
surpass a threshold of probability and statistical significance that confirms
their status as a meaningful collocation (Hunston 2002; McEnery and Wilson
2001). Collocations are also considered fundamental to the creation of
meaning (Adolphs and Carter 2013; Lewis 1993) despite often being uncon-
nected to pragmatic function (Nattinger and DeCarrico 1992). On the other
hand, lexical chunks, a form of multi-word unit, tend to perform a more
functional role in language (Schmitt 2000) and differ from collocations in
length and role (see De Cock 1998; Nattinger and DeCarrico 1992). In terms
of length, whilst collocations can primarily be associated with word pair-
ings (see Adolphs and Carter 2013: 23) e.g. ‘fussy eater’, ‘tall man’, ‘commit
fraud’, it is clear that lexical chunks extend ‘far beyond’ this level (Schmitt
2000: 400). They can be short, long or ‘anything in between’ (Schmitt and
Carter 2004: 3) but are typically expected to be between two and four words
in length (McCarthy 2010). With regard to function, although they are rel-
evant for conveying meaning or ‘referential’ topic-related information, they
are considered to be of higher importance in realising pragmatic and discourse
functions (De Cock 1998: 69). However, they cannot be assumed
to be syntactically complete, e.g. ‘in the’, ‘top of’, ‘the end of’ (Adolphs and
Carter 2013; Biber et al. 1999; De Cock 1998), nor are they restricted to
one sole purpose (Erman and Warren 2000). Chunks are examined in relation
to discourse and pragmatic competence in chapters 4 and 5. However, the
nature and function of formulaic language in spoken learner language, and
the way it develops or changes across levels, remain less clear, and
the implications for learner success remain unknown. It is these gaps in the
literature which we will attempt to address in the analysis of learner data
in this chapter.
2.4 Methods of analysis
2.4.1 Frequency profiles

To detect which words comprised the majority of student test language,
we first created a lexical frequency profile (Laufer and Nation 1995). Such
profiles identify the words which arise in learner language but more impor-
tantly, they identify the frequency band to which the words correspond. Put
simply, the Lexical Frequency Profile is a ‘measure of how vocabulary size
is reflected in use’ (Laufer and Nation 1995: 307). To produce the lexical
frequency profile, uncoded text files consisting solely of candidate language
use were uploaded into Cobb’s (2017) Compleat Lexical Tutor online soft-
ware. The files were then computed using the ‘vocabulary profile’ function
with the BNC-20 used as a measure. The resulting output demonstrated how
many tokens and types were present in the corpus data along with information
about their distribution across the 20 BNC frequency bands. Not only did
analysis consist of identifying vocabulary coverage for the first 2000 most
frequent words, but token-type ratios were also examined to allow our texts
(of similar length) to be subjected to comparisons for lexical variation, as
outlined by Barnbrook (1996) and McEnery and Hardie (2012).
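As an illustration only, the logic of a lexical frequency profile can be sketched in a few lines of Python. This is not the Compleat Lexical Tutor procedure: the band lists below are tiny hypothetical sets of word forms, whereas real profilers match tokens against full word-family lists, and the function names are our own.

```python
from collections import Counter


def frequency_profile(tokens, bands):
    """Return the percentage of tokens falling in each frequency band.

    `bands` maps a band label (e.g. "K-1") to a set of word forms.
    Tokens found in no band are counted as "off-list".
    """
    counts = Counter()
    for token in tokens:
        label = next(
            (name for name, words in bands.items() if token in words),
            "off-list",
        )
        counts[label] += 1
    total = len(tokens)
    return {label: 100 * n / total for label, n in counts.items()}


def tokens_per_type(tokens):
    """Mean number of tokens per distinct type (higher = more repetition)."""
    return len(tokens) / len(set(tokens))


# Hypothetical miniature band lists and utterance, for illustration only.
bands = {
    "K-1": {"i", "like", "the", "and", "it"},
    "K-2": {"football", "sport"},
}
tokens = "i like football and i like the sport it it".split()

profile = frequency_profile(tokens, bands)
print(profile)                  # token coverage per band, in percent
print(tokens_per_type(tokens))  # 10 tokens spread over 7 types
```

Because raw type-token measures are sensitive to text length, a comparison of this kind is only meaningful for texts of similar size, which is why the texts compared here were of similar length.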
2.4.2 Frequency lists

Following this, frequency lists were generated using the Wordlist function
in WordSmith Tools 6.0 (Scott 2015). With regard to the use of frequency
lists, it is generally acknowledged that the first stage of basic corpus analysis
involves observing the occurrence of individual words; the lists identify word
types within a corpus, the raw frequencies and their percentage coverages
(Baker 2006; Barnbrook 1996; Hunston 2002). These were produced for B1,
B2 and C1, as well as the individual exam sections at each level.
Once the frequency lists had been generated, the following procedure
was followed. Firstly, introductory analysis involved comparing the B1,
B2 and C1 frequency lists i) with reference corpora, and ii) against each
other. When compared in such a way, frequency lists can become ‘much
more meaningful’ (Barnbrook 1996: 46), especially in the case of specialised
corpora (Hunston 2002). Since speakers have a wide range of options from
which to choose when interacting, the high frequencies of some items will
inevitably reveal speaker preferences and intentions, ‘whether conscious or
not’ (Baker 2006: 48). Comparison with reference corpora and across levels
can help to alleviate this effect.
The choice of reference corpus does, of course, affect the reliability of
comparisons. A reference corpus can be defined as a large corpus, ‘repre-
sentative of a particular language variety’ which can act as a ‘benchmark’
demonstrating typical language usage within that variety (Baker 2006: 30).
If chosen imprecisely, findings could suffer due to the lack of similarity or
‘appropriateness’ with the language under inspection (Scott and Tribble
2006). The decision was therefore taken to make comparisons of USTC
data with a reference corpus containing learner speech. The Louvain Inter-
national Database of Spoken English Interlanguage (LINDSEI) (Gilquin, De
Cock and Granger 2010) was selected. Containing over a million words
of learner speech from participants varying in nationality, age and profi-
ciency, it was considered ideal for comparison. As it is also based on learner
interviews consisting of three tasks (set topic, free discussion and picture),
it was thought to be particularly comparable to the data in the USTC as
outlined in chapter 1. In preparation for subsequent analyses, a wordlist
composed of learner turns only was constructed from LINDSEI using the
Wordlist function in WordSmith (Scott 2015). Comparisons were also made
to our own NS data and the spoken section of the BYU-BNC corpus (Davies
2004) where we thought this could be illustrative.
Initially, observation of the data was carried out to identify words which
differed greatly in their frequencies. Any words appearing with high fre-
quency across the levels and any words in the USTC data but not the NS
data were immediately earmarked for further investigation. Normalised fre-
quencies were also incorporated into the analysis as these reveal how often a
word can be expected within a set number of tokens (the base of normalisa-
tion), typically, every 10,000 words or 1,000,000 words. They are calculated
by dividing a word type’s total occurrence by the size of the corpus; this fig-
ure is then multiplied by the base of normalisation to allow comparisons to
be made across corpora of differing sizes (McEnery and Hardie 2012). Since
the base of normalisation adapts to corpus size, normalised frequencies in
this study utilised a base of 10,000 for LINDSEI and cross-level compari-
sons of the USTC data. Once a word was selected for further investigation,
qualitative examination of the word in context via the use of concordances
allowed us to explore patterns of use. This helped to paint a picture of a
word’s function in learners’ utterances at different levels. To investigate this
further, an additional measure (Juilland’s D) was used to ascertain how a
word was dispersed across the corpus. Whereas frequency identifies how
often a word occurs, dispersion can give more indication of its distribution
and whether it was typical of all texts in a corpus (see Oakes 1998). The
closer the figure is to 1, the higher the dispersion rate. To assist in calculat-
ing this measure, Brezina’s (2014) online toolbox for corpus linguistics was
utilised.
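The two quantitative measures described above can be sketched as follows. This is a minimal illustration, not the implementation in Brezina’s toolbox: the function names are hypothetical, the counts in the usage lines are invented, and Juilland’s D is computed with the commonly cited formula D = 1 − V/√(n − 1), where V is the coefficient of variation (population standard deviation divided by the mean) of a word’s relative frequencies across n corpus parts.

```python
import math


def normalised_frequency(raw_count, corpus_size, base=10_000):
    """Occurrences per `base` tokens: raw count / corpus size * base."""
    return raw_count / corpus_size * base


def juilland_d(part_counts, part_sizes):
    """Juilland's D for one word across n corpus parts (n must be > 1).

    Raw counts are first converted to relative frequencies so that parts
    of unequal size remain comparable. Values near 1 indicate an even
    spread; values near 0 indicate concentration in few parts.
    """
    rel = [count / size for count, size in zip(part_counts, part_sizes)]
    n = len(rel)
    mean = sum(rel) / n
    if mean == 0:
        return 0.0  # the word never occurs, so dispersion is undefined
    sd = math.sqrt(sum((r - mean) ** 2 for r in rel) / n)  # population SD
    return 1 - (sd / mean) / math.sqrt(n - 1)


# Invented figures: 50 occurrences in a 20,000-token corpus.
print(normalised_frequency(50, 20_000))            # per 10,000 tokens

# Invented figures: the same word in four equally sized exam transcripts.
print(juilland_d([5, 5, 5, 5], [1000] * 4))        # evenly spread
print(juilland_d([20, 0, 0, 0], [1000] * 4))       # all in one transcript
```

The two measures answer different questions: a word can have a high normalised frequency yet a low D if a single speaker accounts for most of its occurrences.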
Information was obtained regarding what a particular word occurred
with, the structure it usually occurred in and the purpose it fulfilled. When
performing this qualitative analysis via the use of concordance lines, we
also observed salient collocations. Word co-occurrence is a concept closely
related to concordance lines ‘since the idea of two words occurring in a common
context is similar to that of two words occurring in the same concordance
window’ (Oakes 1998: 159). Once a collocation seemed to be salient,
quantitative measures of collocational strength were adopted. These mea-
sures compared the number of appearances of a collocate within a four- or
five-word window (or span) to the left and right of the node (Baker 2006;
McEnery and Hardie 2012). To assess collocation strength, t-scores and
mutual information (MI) scores were used. T-scores are considered more
accurate in the analysis of words with relatively low frequencies due to the
manner of calculation (Baker 2006; Barnbrook 1996) and in this case we
assumed (following Barnbrook 1996: 98) that any t-score greater than 2
would signify a potentially ‘interesting’ focus and that an MI score of 3
would also be worth examination (Oakes 1998). For the purposes of presenting
collocates in this chapter, both t-scores and MI scores will be used.
When looking at the collocates of single words, both figures will have
to satisfy the thresholds specified; at times we have focused only
on the most frequent collocates, due to their number, and we make this
clear in the text where this occurs. When looking at collocates in the form
of chunks (e.g. collocates of ‘we can’), only one figure will have to reach its
threshold, due to the lower frequencies of the data.
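The thresholds described above can be made concrete with a short sketch. It assumes one common formulation of the two measures, in which the expected co-occurrence frequency is f(node) × f(collocate) × span / N, MI = log2(O/E) and t = (O − E)/√O; corpus tools differ in whether and how the span enters the calculation, so the figures produced by WordSmith Tools may not match. All function names and counts are hypothetical.

```python
import math


def expected(f_node, f_collocate, corpus_size, span=8):
    """Expected co-occurrences by chance within `span` window positions.

    span=8 corresponds to four words to the left and right of the node.
    """
    return f_node * f_collocate * span / corpus_size


def mi_score(observed, f_node, f_collocate, corpus_size, span=8):
    """Mutual information: log2 of observed over expected co-occurrence."""
    return math.log2(observed / expected(f_node, f_collocate, corpus_size, span))


def t_score(observed, f_node, f_collocate, corpus_size, span=8):
    """t-score: (observed - expected) / sqrt(observed)."""
    exp = expected(f_node, f_collocate, corpus_size, span)
    return (observed - exp) / math.sqrt(observed)


def is_salient(observed, f_node, f_collocate, corpus_size, span=8,
               require_both=True, t_min=2.0, mi_min=3.0):
    """Apply the chapter's thresholds: t > 2 and MI of at least 3.

    require_both=True mirrors the rule for collocates of single words;
    require_both=False mirrors the rule for collocates of chunks.
    """
    t_ok = t_score(observed, f_node, f_collocate, corpus_size, span) > t_min
    mi_ok = mi_score(observed, f_node, f_collocate, corpus_size, span) >= mi_min
    return (t_ok and mi_ok) if require_both else (t_ok or mi_ok)


# Invented counts: node occurs 100 times, collocate 200 times, together 20
# times, in a corpus of 100,000 tokens.
print(is_salient(20, 100, 200, 100_000))

# Invented low-frequency pair: high MI but a t-score just under 2.
print(is_salient(4, 10, 10, 100_000))                      # both required
print(is_salient(4, 10, 10, 100_000, require_both=False))  # either suffices
```

The low-frequency pair shows why the rule is relaxed for chunks: with few observations the t-score rarely exceeds 2 even when the MI score is high.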
2.4.3 Keyword lists

Keywords are words that appear significantly, or unusually, more or less
often than in a chosen reference corpus. These occurrences are said to indi-
cate the ‘aboutness’ of a text since function words typically occupying high
positions in frequency lists may be replaced with lexis of more interest for the
researcher. Typically, the words elicited occupy three categories. In addition
to a text’s proper nouns, words revealing the topics or content included in a
text are presented, and finally, words associated with the style, or nature, of
the language are captured (Scott 2015). Therefore, to discover which words
were key in the USTC, keyword lists were generated using WordSmith. A
keyness figure was generated using log-likelihood, accompanied by a p value
confirming that the keyword indeed occurred significantly more frequently.
Log-likelihood was chosen as it is said to provide a reliable measure of keyness
(Scott 2015: para 4) and, in conjunction with a p value set at p<.001,
it gives us some certainty that a word can be considered truly key (Scott
2015). Once calculated, every word in the B1, B2 and C1 lists had a p value
of p<.001 meaning that it was extremely unlikely that the words in the lists
were established as key by chance alone. Comparisons across keyness lists
were then conducted using manual observations of data, concordance and
exploration of the patterns of usage, as per the process detailed previously
in relation to frequent words. We then chose to analyse two keywords which
occurred at each level (‘think’ and ‘can’) as they were used to fulfil important
functions at each level and were clearly not key simply because they reflected
the test topics.
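For readers unfamiliar with the statistic, a log-likelihood keyness value can be approximated as in the sketch below. It uses the widely used two-term form of the calculation, which compares a word’s observed frequency in the study and reference corpora with the frequencies expected if the word were equally common in both. This is an assumption on our part: WordSmith Tools may compute the figure differently in detail. The counts are invented, and 10.83 is the chi-square critical value for one degree of freedom at p<.001.

```python
import math

CRITICAL_P001 = 10.83  # chi-square critical value, 1 df, p < .001


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Two-term log-likelihood for one word in two corpora."""
    total_freq = freq_study + freq_ref
    total_size = size_study + size_ref
    exp_study = size_study * total_freq / total_size
    exp_ref = size_ref * total_freq / total_size
    ll = 0.0
    for observed, exp in ((freq_study, exp_study), (freq_ref, exp_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / exp)
    return 2 * ll


def is_positive_keyword(freq_study, size_study, freq_ref, size_ref):
    """Key at p < .001 and relatively more frequent in the study corpus."""
    overused = freq_study / size_study > freq_ref / size_ref
    significant = log_likelihood(
        freq_study, size_study, freq_ref, size_ref
    ) > CRITICAL_P001
    return overused and significant


# Invented counts: 100 hits in a 10,000-token study corpus against
# 100 hits in a 100,000-token reference corpus.
print(log_likelihood(100, 10_000, 100, 100_000))
print(is_positive_keyword(100, 10_000, 100, 100_000))
```

Note that the statistic itself is unsigned: the direction of the difference has to be read from the relative frequencies, which is what distinguishes positive from negative keywords.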
2.4.4 Lexical chunks

Lexical chunks can be defined as ‘recurring sequences of n words’ (McEnery
and Hardie 2012: 110) and hence are termed Ngrams in corpus software.
Their length can be determined by the analyses the researcher wishes
to make and, as mentioned in section 2.3, most chunks are considered to be
between two and four words in length. However, an initial examination of
two-word chunks led us to feel that many of these seemed quite fragmentary
in nature and so a decision was made in this chapter to explore chunks of
three and four words in length. Whilst less common than three-word chunks
(Biber et al., 1999), the longer stretches of lexis were also thought interesting
for comparisons of B1, B2 and C1 as many would encapsulate some of the
three-word Ngrams (see Carter and McCarthy 2006). For example, ‘what
do you’ may appear as a three-word chunk and then as a part of the four-word
chunk ‘what do you think?’. All chunks were included in the analysis, on the
basis of frequency. This was because we believed that their appearance in the
lists still suggests that they are typically retrieved and produced as if they are
single items, even if some might not be considered to be syntactically whole.
To produce the chunk lists, uncoded text files were uploaded into AntConc
(Anthony 2017). The previously discussed approaches to extracting collo-
cational information and analysing concordance lines were also followed,
which allowed us to examine both the language used and also how it functioned
in speech. This also meant we again tried to focus our attention on
chunks which occurred with high frequency across the levels and were not obviously
frequent simply because of the exam topics. For these reasons, we explore
both ‘a lot of’ and ‘agree with you’.
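The notion of a chunk as a recurring sequence of n words can be illustrated with a minimal sketch. It is not a description of how AntConc works internally; the utterances are invented, and counting within each utterance (so that no chunk spans a turn boundary) is a design choice of ours.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous sequence of n tokens."""
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


# Invented learner turns, for illustration only.
utterances = [
    "what do you think",
    "what do you mean",
    "i agree with you",
]

three_word = Counter()
four_word = Counter()
for utterance in utterances:
    tokens = utterance.split()
    three_word.update(ngram_counts(tokens, 3))
    four_word.update(ngram_counts(tokens, 4))

print(three_word.most_common(3))
print(four_word.most_common(3))
```

Here ‘what do you’ is counted both as a three-word chunk in its own right and as part of the four-word chunk ‘what do you think’, which is the overlap described above.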
2.5 Linguistic competence at B1-C1 levels

The following sections will look at linguistic competence in the following
ways. Firstly, we will look at how the words were distributed amongst the
K-1 and K-2 bands in order to understand the type of lexis successful learners
used (this section). We then move on to exploring the most frequent words
and their functions and the keywords and their functions (section 2.6).
Finally, we move to an exploration of three- and four-word chunks (sec-
tion 2.7). Throughout the chapter we mention ‘can do’ statements from the
CEFR where we consider this to be appropriate.
Table 2.1 displays the vocabulary profile results for K-1 and K-2 bands
in the USTC data.
This data shows that approximately 97% of the words used at
B1, B2 and C1 originated from the 2000 most frequent words in English.
When tokens were examined, very little difference was in evidence across
the learner levels in this study. The 1000 most frequent words afforded
learners at each level a token coverage between 91% and 93%; the second
1000 most frequent words offered an additional token coverage of 3% to
nearly 5%. When combined, Table 2.1 reveals that these two bands pro-
vided a token coverage of approximately 97% meaning that only 1 in every
33 words at B1, B2 and C1 came from a different frequency band. An exami-
nation of coverage according to word families also supported the finding
that there was little difference across the levels. Combined K-1 and K-2 family
coverages remained rather stable at approximately 81–84%: B1 had a family coverage
of 83.26%, B2’s family coverage stood at 83.70% and C1’s family coverage
decreased slightly to 80.62%. It can be inferred from this data that for learn-
ers to be successful and present a solid performance in their speech at the
Threshold, Vantage and Effective Operational stages, the two most frequent
vocabulary bands in English are essential. Figures 2.1–2.3 taken from the
B1, B2 and C1 data demonstrate that successful language at these levels is
indeed comprised of vocabulary mostly from the K-1 and K-2 bands (words
in italics = K-1, words in underlined italics = K-2, words in bold denote a
different frequency band or an off-list word):
Table 2.1 Percentage of K-1 and K-2 words at B1, B2 and C1
Exam   Freq.  Families      Types          Tokens          Cumul.    Total types  Total tokens  Tokens    Standardised TTR
level  band   (%)           (%)            (%)             token %   (all bands)  (all bands)   per type  (WordSmith Tools)
B1     K-1    585 (63.59)   878 (65.28)    14762 (92.81)   92.81     1345         15905         11.83     25.00
B1     K-2    181 (19.67)   219 (16.28)    553 (3.48)      96.29
B2     K-1    671 (62.89)   1011 (66.21)   18706 (92.86)   92.86     1527         20144         13.19     24.90
B2     K-2    222 (20.81)   267 (17.49)    884 (4.39)      97.25
C1     K-1    675 (58.39)   1136 (62.80)   20303 (91.64)   91.64     1809         22156         12.25     26.95
C1     K-2    257 (22.23)   340 (18.79)    1104 (4.98)     96.62
<$19M> Er actually the most popular sport in my country is football er I I like football and er
I’ve found national team. Er actually national team played yesterday last night and er er losed
the cup silver cup. I’m sad today but er the the you know the football it’s a game I think it’s help
the politics to keep the people of the country happy to keep the people in the country you know er
fans watch the TV <$=> it’s </$=> and also the people er happy when they when they watch er
the the foot= the football match.
Figure 2.1 K-1 and K-2 words at B1
<$29F> Well they look for higher qualifications and it depends actually on the job if you’re
gonna apply like if it’s for sport they will look for sport section <$=> if it’s for the </$=> it
depends which major you’re gonna you’re gonna apply for not major meaning you’re gonna
apply for the if you’re studying business like I’m studying business now I might t=er go to a
company and manage the company so it depends on the subject you studied and it depends on
the company you’re applying to.
Figure 2.2 K-1 and K-2 words at B2

<$29M> To learn to drive I think it must be sixteen or seventeen but to start driving it should be
eighteen because if people just learn to drive in one month or in one session they won’t learn
everything so <$=> they start </$=> two years ago they start having classes some safety some
safety classes about the classes about the cars <$=> can </$=> what they are capable of what
they can do what they can’t do they must tell them everything before they can use the car.
Figure 2.3 K-1 and K-2 words at C1
Table 2.2 Types, tokens, means and SDs for K-1 and K-2
Level  Freq.  Families                 Types                    Tokens
       band   Freq.  Mean     SD       Freq.  Mean     SD       Freq.   Mean      SD
B1     K-1    585    163.76   23.76    878    197.59   28.75    14762   868.12    191.63
B1     K-2    181    17.76    6.09     219    19.24    6.87     553     32.53     14.35
B2     K-1    671    186.29   21.98    1011   226.00   31.69    18706   1100.24   286.69
B2     K-2    222    23.82    6.11     267    26.00    6.13     884     52.00     14.93
C1     K-1    675    200.23   25.39    1136   252.12   41.00    20303   1194.18   316.98
C1     K-2    257    29.29    8.38     340    33.94    10.40    1104    64.94     29.30
Observations of raw type and token frequencies revealed an extended range
and usage of K-1 and K-2 words as proficiency levels rose. The fourth column
in table 2.1 shows an incremental use of K-1 and K-2 types from B1 to C1.
K-1 types increased by 133 (15%) from B1 to B2, and by 125 (12%) from
B2 to C1. K-2 types increased by 48 (22%) from B1 to B2, and by 73 (27%)
from B2 to C1. When independent t-tests were performed on data from the
three exam sets, it was confirmed that B2 learners produced significantly
more K-1 and K-2 types than B1 learners (p<.01 for K-1 and K-2) and that
C1 learners produced significantly more K-1 and K-2 types than B1 (p<.01
in both cases) and B2 learners (p<.01 for K-1 and p<.03 for K-2). This data
shows that to be successful at higher levels, a wider range of types within the
2000 most frequent words in English must be evidenced. Similarly, raw token
frequencies (column five, table 2.1) displayed a change in the number of
words used across levels. Whereas B1 learners produced 14,762 K-1 tokens,
B2 generated 18,706 (an increase of 27%) and C1 learners reached 20,303
(an increase of 38% from B1 and 8.5% from B2). K-2 token usage also
doubled from B1 to C1. When significance tests were performed, however,
a distinction arose. B2 and C1 students clearly used significantly more K-1
and K-2 tokens than B1 learners (p<.01 in all cases). C1 learners on the other
hand did not produce significantly more K-1 or K-2 tokens than B2 learners.
The data implies that learners need to use more individual words at B2 than
B1 to be considered successful, but from B2 to C1, the difference rests on
other factors. Types, tokens and means at each level are shown in table 2.2.
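The percentage changes reported above follow directly from the raw frequencies in table 2.1, as the short check below shows. The helper function is our own; the figures are those given in the table, and the final line reproduces the ‘1 in every 33 words’ estimate from an approximate cumulative coverage of 97%.

```python
def pct_change(old, new):
    """Percentage change from `old` to `new`."""
    return (new - old) / old * 100


# Raw frequencies taken from table 2.1.
k1_types = {"B1": 878, "B2": 1011, "C1": 1136}
k2_types = {"B1": 219, "B2": 267, "C1": 340}
k1_tokens = {"B1": 14762, "B2": 18706, "C1": 20303}
k2_tokens = {"B1": 553, "B2": 884, "C1": 1104}

for label, data in (("K-1 types", k1_types), ("K-2 types", k2_types),
                    ("K-1 tokens", k1_tokens)):
    print(label,
          "B1 to B2:", round(pct_change(data["B1"], data["B2"]), 1),
          "B2 to C1:", round(pct_change(data["B2"], data["C1"]), 1))

# K-2 token usage at C1 relative to B1 (roughly a doubling).
print(round(k2_tokens["C1"] / k2_tokens["B1"], 2))

# With about 97% of tokens covered by K-1 and K-2, roughly one token in
# every 100 / (100 - 97) lies outside those two bands.
print(round(100 / (100 - 97)))
```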
As was suggested earlier and has been confirmed here, learners at any of
the three levels cannot, quite obviously, be successful without knowledge of
the 2000 most frequent words in English. Furthermore, it is apparent that
progression from B1 to B2 rests on a broader and more
frequent use of lexis from the K-1 and K-2 word categories. However, the
stability in token and family coverage, as well as the increased type and token
numbers, directly contradicts Laufer and Nation’s (1999) finding that higher-level
students use fewer high frequency words. In fact, successful USTC students
used more lexis from the 2000 most frequent words as proficiency
rose. This finding was also corroborated when the number of K-1 and K-2
word families used by B1, B2 and C1 learners was examined. B2 learners
produced significantly more K-1 and K-2 families than B1 learners (p<.01
for K-1 and for K-2). Though C1 learners did not use significantly more K-1
word families than B2 learners, they did produce significantly more K-2 families than B2
learners (p<.05). Success at higher proficiency levels does not therefore rely
on the increased use of less frequent vocabulary; it is, in fact, reliant upon an
increased use of very frequent vocabulary across each of the levels.
Following these findings from the vocabulary profiles, one final analy-
sis was performed. The USTC data was inputted into EVP’s Text Inspec-
tor tool (WebLingua 2016) to supplement and enhance findings. Though
Text Inspector is normally used to analyse written language, it was felt
that this analysis could provide a deeper understanding of the words categorised
by the vocabulary profiles and could test the assumption made by some that
lexis increases in complexity as proficiency rises. For additional comparison,
vocabulary profile data for the native-speaker exams were also incorpo-
rated. Figures 2.4 and 2.5 display the type and token percentages, respec-
tively, for speech from B1, B2, C1 and native speaker data sets. Table 2.3
shows cumulative percentage figures for types and tokens across the four
categories.
From the above data, it is clear to see that four-fifths of all speakers’
speech came from the A1 and A2 levels, including the speech used by native
Figure 2.4 Percentage type vocabulary profiles (bar chart of % types at each vocabulary level, A1 to C2 and UNLISTED, for B1, B2, C1 and NS types)

Figure 2.5 Percentage token vocabulary profiles (bar chart of % tokens at each vocabulary level, A1 to C2 and UNLISTED, for B1, B2, C1 and NS tokens)
Table 2.3 Cumulative percentage types and tokens
Vocab. level   Cumulative type percentage         Cumulative token percentage
               B1      B2      C1      NS         B1      B2      C1      NS
A1             51.96   52.89   46.89   44.63      69.24   69.88   69.22   67.96
A2             72.00   73.16   66.97   64.12      79.99   81.29   81.38   81.93
B1             84.80   85.53   82.32   78.29      83.14   85.58   86.60   87.11
B2             88.53   90.13   88.22   85.38      84.20   87.01   88.23   89.29
C1             89.37   90.89   89.66   87.44      84.36   87.22   88.62   89.83
C2             89.66   91.18   90.00   87.84      84.41   87.27   88.68   89.96
UNLISTED      100.01  100.00   99.99  100.01     100.00  100.00   99.99   99.99
*Percentages rounded
speakers. Though there are small increases across the proficiency levels in B1
and B2 token usage, all speakers in the USTC undeniably used lexis which
was high in frequency and without a great deal of complexity. The Text
Inspector findings thus illustrate that an increase in language level does not
necessarily equate to an increase in usage of what could be considered ‘more
difficult’ vocabulary.
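The per-level shares plotted in figure 2.5 can be recovered from the cumulative figures in table 2.3 by simple differencing. The following is a minimal sketch of that arithmetic for the B1 token column only; the function and variable names are illustrative and are not part of the original analysis.

```python
# Cumulative token percentages for the B1 learners, copied from Table 2.3.
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2", "UNLISTED"]
B1_CUMULATIVE_TOKENS = [69.24, 79.99, 83.14, 84.20, 84.36, 84.41, 100.00]


def per_band_shares(cumulative):
    """Turn cumulative percentages into the share contributed by each band."""
    shares = []
    previous = 0.0
    for value in cumulative:
        # Each band's share is the step up from the previous cumulative total.
        shares.append(round(value - previous, 2))
        previous = value
    return shares


b1_shares = dict(zip(LEVELS, per_band_shares(B1_CUMULATIVE_TOKENS)))

# Share of B1 tokens drawn from the A1 and A2 bands combined.
a1_a2_share = round(b1_shares["A1"] + b1_shares["A2"], 2)

print(b1_shares)
print(a1_a2_share)
```

Read this way, the A1 and A2 bands together account for roughly four-fifths of the B1 tokens, which is the cumulative A2 figure in the table.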
This section has aimed to examine the vocabulary profile of speakers at
B1, B2 and C1. The CEFR states that there should be development in vocab-
ulary range but it does not suggest what this development is in quantitative
terms. Instead, it uses vague expressions such as ‘enough language to get
by, with sufficient vocabulary [B1]’, ‘has a sufficient range of language [B2]’
and ‘a broad range of language [C1]’ (Council of Europe 2001: 28) to imply
that the amount of vocabulary known by learners should grow in some
way. Whilst vocabulary research can focus on the number of word families,
derivations and inflections known to learners, such enquiry has pinpointed

a common target for all learners: that they learn the ‘heavy duty vocabulary’
(McCarthy 1999: 4) comprising the 2000 most frequent word families in
English. According to Laufer and Nation (1999), learners at more advanced
levels make less use of high frequency vocabulary than learners at lower lev-
els. However, the USTC learners at the B1, B2 and C1 levels displayed com-
parable token and family coverages at the K-1 and K-2 bands. This result
is not unique. In a study conducted by Galaczi and Ffrench (2011), it was
discovered that frequency profiles remained relatively fixed as learner profi-
ciency increased. Learners equivalent to B1, B2 and C1 displayed coverages
of 97.05%, 97.61% and 97.75%, respectively, which demonstrates that the
learners in this study were comparable in their fixed coverages across levels.
Despite the intuitive appeal of these results, only looking at frequency counts
provides a limited picture of language use and success. Support is thus lent
to Galaczi and Ffrench’s (2011: 160) claim that such quantitative lexical
variables alone are unable to ‘consistently show the lexical improvement in
candidate speech’. Put simply, whilst K-1 and K-2 words enabled learners
at the three levels to be successful, vocabulary profiles alone cannot show
us exactly how that success varied. For these reasons, we explore the most
frequent words and how they functioned in context in the next sections.
2.6 Most frequent words

Table 2.4 shows the 20 most frequent words at each level and in our native-
speaker data.
The 20 most frequent words occupied a large portion of the total corpus
at each of the three levels. When percentages for these words were com-
bined, it was discovered that at B1, the words comprised 44.46%, at B2, the
words comprised 40.63% and at C1, the words comprised 39.06% of their
respective corpora. Compared to the cumulative percentage of LINDSEI’s
top 20 words, which stood at 38.01%, it is noticeable that USTC students
relied a little more heavily on these highly frequent words than the learners
in LINDSEI. However, with more than two-thirds of the top 20 words at
B1, B2 and C1 (70%, 75%, 80%) corresponding to LINDSEI’s top 20 list,
there is evidence that being successful in spoken interaction still necessitates
knowledge of the most common words identified. Learners relied on these
frequent words to form around two-fifths of their speech, but the decreasing
percentages showed that dependency on these words lessened: transitions
from B1 to B2 and from B2 to C1 resulted in increasing usage of word types
beyond the 20 most frequent threshold.
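The combined percentages quoted above can be checked by summing the individual percentage values listed in table 2.4. The sketch below does this for B1 and B2 only; the list and function names are illustrative.

```python
# Percentage of the corpus taken up by each of the 20 most frequent words,
# copied from the B1 and B2 columns of Table 2.4 (rank 1 to rank 20).
B1_TOP20_PERCENTAGES = [
    8.09, 6.55, 4.00, 3.04, 2.68, 1.86, 1.72, 1.67, 1.67, 1.56,
    1.47, 1.45, 1.38, 1.38, 1.29, 1.09, 0.94, 0.88, 0.87, 0.87,
]
B2_TOP20_PERCENTAGES = [
    6.16, 4.34, 3.88, 2.77, 2.74, 2.43, 1.78, 1.76, 1.52, 1.48,
    1.44, 1.43, 1.29, 1.18, 1.18, 1.09, 1.07, 1.06, 1.04, 0.99,
]


def combined_coverage(percentages):
    """Sum per-word percentages into one coverage figure (2 decimal places)."""
    return round(sum(percentages), 2)


b1_coverage = combined_coverage(B1_TOP20_PERCENTAGES)
b2_coverage = combined_coverage(B2_TOP20_PERCENTAGES)

print(b1_coverage, b2_coverage)
```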
Following this preliminary overview, we will now turn to the individual
words which provoked further investigation. The words explored in more
depth are: we, er and erm. As mentioned in section 2.4.3, these were chosen
for further analysis because they were common at each level but notably less
frequent in our NS data.
Table 2.4 The 20 most frequent words from B1-C1 and in NS data
N B1 B2 C1 Native speakers
Word Freq. % Texts % Word Freq. % Texts % Word Freq. % Texts % Word Freq. % Texts %
1 ER 1,391 8.09 17 100.00 ER 1,312 6.16 17 100.00 THE 1,147 4.97 17 100.00 I 374 4.67 6 100.00
2 I 1,126 6.55 17 100.00 I 926 4.34 17 100.00 I 807 3.50 17 100.00 AND 249 3.11 6 100.00
3 THE 688 4.00 17 100.00 THE 827 3.88 17 100.00 ER 770 3.33 17 100.00 THE 249 3.11 6 100.00
4 AND 523 3.04 17 100.00 AND 591 2.77 17 100.00 AND 665 2.88 17 100.00 LIKE 214 2.67 6 100.00
5 TO 461 2.68 17 100.00 TO 585 2.74 17 100.00 TO 640 2.77 17 100.00 TO 197 2.46 6 100.00
6 ERM 320 1.86 17 100.00 YOU 518 2.43 17 100.00 ERM 460 1.99 17 100.00 YOU 184 2.30 6 100.00
7 IN 296 1.72 17 100.00 YEAH 380 1.78 17 100.00 YEAH 442 1.91 17 100.00 A 172 2.15 6 100.00
8 IS 288 1.67 17 100.00 IN 376 1.76 17 100.00 IS 417 1.81 17 100.00 YEAH 171 2.13 6 100.00
9 SO 288 1.67 16 94.12 A 325 1.52 17 100.00 IN 414 1.79 17 100.00 IT 166 2.07 6 100.00
10 A 269 1.56 17 100.00 IS 316 1.48 16 94.12 YOU 376 1.63 17 100.00 ERM 165 2.06 6 100.00
11 YOU 252 1.47 17 100.00 ERM 308 1.44 17 100.00 A 350 1.52 17 100.00 OF 134 1.67 6 100.00
12 YEAH 250 1.45 16 94.12 LIKE 304 1.43 17 100.00 LIKE 348 1.51 17 100.00 THAT 128 1.60 6 100.00
13 MY 238 1.38 17 100.00 THINK 276 1.29 17 100.00 THINK 342 1.48 17 100.00 THINK 127 1.59 6 100.00
14 THINK 237 1.38 17 100.00 MY 252 1.18 17 100.00 OF 312 1.35 17 100.00 IT’S 100 1.25 6 100.00
15 LIKE 222 1.29 17 100.00 SO 252 1.18 17 100.00 SO 295 1.28 17 100.00 IN 97 1.21 6 100.00
16 BECAUSE 188 1.09 17 100.00 BUT 233 1.09 17 100.00 THEY 285 1.23 16 94.12 SO 79 0.99 6 100.00
17 OF 161 0.94 17 100.00 HAVE 229 1.07 17 100.00 IT’S 276 1.20 17 100.00 JUST 75 0.94 6 100.00
18 IT’S 152 0.88 16 94.12 THEY 226 1.06 15 88.24 IT’S 254 1.10 17 100.00 IF 74 0.92 6 100.00
19 CAN 150 0.87 15 88.24 OF 222 1.04 17 100.00 WE 221 0.96 17 100.00 ON 71 0.89 6 100.00
20 WE 149 0.87 16 94.12 WE 212 0.99 16 94.12 BECAUSE 196 0.85 17 100.00 BUT 69 0.86 6 100.00
Table 2.5 Normalised frequencies for we
        USTC B1                  USTC B2                  USTC C1                  LINDSEI                  BNC Spoken
Word    List pn  Freq.  NF       List pn  Freq.  NF       List pn  Freq.  NF       List pn  Freq.  NF       List pn  Freq.   NF
WE      20       149    86.77    20       212    99.54    19       221    95.74    26       5773   72.80    13       10448   10.45
2.6.1 WE
In terms of frequency, normalised frequencies for we were similar in the
USTC and LINDSEI data. There was, however, a dramatic fall in the Spoken
BNC data, which suggests that we is far more frequent in the speech of
successful learners than in that of native speakers.
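A normalised frequency rescales a raw count to a common base so that corpora of different sizes can be compared. The sketch below assumes a base of 10,000 words, which is inferred from table 2.5 and table 2.4 (0.87% of the B1 corpus corresponds to an NF of 86.77), and a B1 corpus size of 17,171 tokens, which is back-calculated from those tables rather than stated in this section. Both should be treated as assumptions.

```python
def normalised_frequency(raw_freq, corpus_size, base=10_000):
    """Occurrences per `base` words of running text."""
    return raw_freq / corpus_size * base


# Assumption: B1 corpus size inferred from the reported raw and normalised
# frequencies; it is not given explicitly in this section.
ASSUMED_B1_TOKENS = 17_171

# Raw frequency of "we" at B1, from Table 2.5.
nf_we_b1 = round(normalised_frequency(149, ASSUMED_B1_TOKENS), 2)

print(nf_we_b1)
```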
Nineteen common and significant collocates were found across B1, B2 and
C1. Some differences in functions were identified and sometimes differences
correlated with collocates not shared across levels. These frequencies and
collocates are displayed in tables 2.5 and 2.6.
What is notable from this data is the frequency with which we collocates
with modal forms such as should, can and need. We discuss this in relation
to the keyword can in section 2.7.2.
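For readers unfamiliar with the two association measures reported in table 2.6, the sketch below shows the textbook forms of the t-score and the MI score, with the expected co-occurrence taken as the product of the two word frequencies divided by corpus size. This simple form, with no adjustment for window size, comes close to the figures reported for we and can at B1, but it is offered as an illustration rather than as the exact procedure used by the corpus software. The corpus size is an inferred assumption.

```python
import math


def expected_cooccurrence(node_freq, collocate_freq, corpus_size):
    """Co-occurrences expected by chance if the two words were independent."""
    return node_freq * collocate_freq / corpus_size


def t_score(observed, expected):
    """T-score: favours frequent, well-attested collocations."""
    return (observed - expected) / math.sqrt(observed)


def mi_score(observed, expected):
    """Mutual information: favours exclusive (often rarer) pairings."""
    return math.log2(observed / expected)


# Assumption: B1 corpus size inferred from the reported frequencies.
ASSUMED_B1_TOKENS = 17_171

# B1 figures: "we" occurs 149 times, "can" 150 times, together 49 times.
expected = expected_cooccurrence(149, 150, ASSUMED_B1_TOKENS)
t = t_score(49, expected)
mi = mi_score(49, expected)

print(round(t, 3), round(mi, 3))
```

The contrast between the two measures helps explain why a very frequent item such as er has a high t-score but a modest MI score, whereas should and need show the opposite tendency.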
Exploring the functions of we as a subject pronoun across B1, B2 and C1
showed that its usage matched three distinct purposes identified by Carter
and McCarthy (2006: 379). Though they are learned at B1 level according
to the English Vocabulary Profile (2016), they were still representative of
speech at all three levels. The functions were as follows:
• Specific – Inclusive of speakers in immediate context

• Third party – Exclusive as it refers to speakers and persons absent from
discourse
• General – Reference to larger groups of people, e.g. society or people in
general
Figures 2.6–2.8 show each of these uses in context:
<$0> Who do you most like to spend time with?

<$17F> Okay I like to spend my time or all my free time with my friends.
<$0> Mhm.
<$17F> Er we go to shopping and er we sometimes go out to eat out restaurant. Yeah we have
er nice time <$O26> when </$O26> we together.
Figure 2.6 We used with a specific function at B1

Table 2.6 Collocates of we at B1, B2 and C1
B1 B2 C1
Collocate  Freq. with node  T-score  MI score  (the same four columns repeat for B1, B2 and C1)
ER 109 9.286 3.177 ER 105 8.973 3.008 THE 93 8.505 3.083

CAN 49 6.814 5.236 THE 81 8.086 3.300 ER 65 7.148 3.141
AND 42 5.781 3.212 AND 60 6.987 3.352 TO 63 7.166 3.362
TO 34 5.146 3.089 TO 53 6.481 3.187 CAN 59 7.458 5.106
SO 31 5.120 3.635 HAVE 50 6.749 4.456 HAVE 59 7.443 5.012
A 30 5.052 3.686 IN 42 5.904 3.489 AND 56 6.633 3.137
ERM 29 4.870 3.387 CAN 34 5.483 4.067 ERM 48 6.293 3.447
HAVE 28 5.093 4.739 SO 33 5.308 3.719 SO 38 5.706 3.750
SHOULD 19 4.301 6.240 OUR 29 5.269 5.532 SHOULD 31 5.468 5.803
YEAH 19 3.862 3.133 LIKE 29 4.824 3.262 KNOW 31 5.392 4.989
IF 18 4.100 4.891 ARE 28 5.154 5.269 BECAUSE 27 4.835 3.847
ARE 16 3.890 5.178 A 28 4.681 3.115 IF 24 4.641 4.248
WITH 15 3.676 4.298 KNOW 24 4.698 4.607 IT 24 4.403 3.303
SOME 14 3.485 3.863 CULTURE 23 4.613 4.716 BUT 22 4.354 3.800
ALSO 13 3.519 5.381 IF 23 4.468 3.871 WITH 18 3.949 3.855
KNOW 13 3.377 3.981 DO 21 4.264 3.844 DON’T 16 3.761 4.063
GET 12 3.347 4.881 OR 21 4.190 3.544 LEARN 15 3.789 5.527
MM 12 3.161 3.517 BECAUSE 20 4.056 3.427 OUR 15 3.762 5.122
OUR 11 3.262 5.918 JUST 19 4.092 4.029 SOME 15 3.557 3.614
NEED 10 3.113 6.003 THAT 19 4.049 3.812 THAT 15 3.401 3.037
BUT 10 2.869 3.431 NEED 18 4.154 5.574 ABOUT 13 3.276 3.453
ABOUT 18 3.980 4.014 OR 13 3.213 3.198
DON’T 17 3.870 4.025 NEED 12 3.387 5.485
WILL 17 3.793 3.641 USE 12 3.351 4.934
WHAT 16 3.766 4.097 CAN’T 12 3.348 4.900
GO 15 3.601 3.831 MM 12 3.124 3.350
SHOULD 14 3.681 5.935 WHEN 11 3.158 4.385
LEARN 14 3.635 5.137 ARE 11 2.913 3.037
SOMETIMES 14 3.595 4.678 FROM 10 2.932 3.781
THIS 13 3.255 3.363 ALSO 10 2.878 3.474
USE 12 3.326 4.652 DO 10 2.848 3.329
NOW 12 3.252 4.027
ALSO 11 3.077 3.789
MAYBE 11 2.972 3.266
CAN’T 10 3.087 5.389
LOT 10 2.876 3.466
*Only significant collocates with a frequency equal to or greater than 10 are displayed
<$0> + how long have you been studying English?

<$25M> I have been er studying English er eight months. Er I started er in August er no sorry
in October last year 2012. Er er you can say in er when I was in school I sta= I start from the
business English because we don’t er use it English in our country. We use Arabic language. So
that’s it.
Figure 2.7 We used to refer to a third party at B2
<$0> The most environmentally friendly way to travel? Do you understand that?
<$16F> The most er invently +
<$0> Environmentally friendly.
<$16F> <$O17> Oh </$O17> Yeah I got the meaning.
<$15F> <$O17> Environment </$O17>
<$0> Mm.
<$16F> Mm I answer the question?
<$0> Yeah.
<$16F> In the travel I think the environment friendly it means we go to travel not by a car and it
not er mm we can protect er environment and that is er mm er green green travel yes erm green
travel when we go to travel I think if we use bicycle I think it’s not harmful the er fresh air and
we can’t throw the rubbish anywhere I think it’s er mm friendly environment
Figure 2.8 We used to make a general reference at C1
As the examples in figures 2.6–2.8 show, analysing the functions of the

personal pronoun we uncovered the subtle influences it had on success at B1,
B2 and C1 and showed how the functions of we related to the CEFR ‘can do’
statements. The CEFR states that the nature of the topics discussed evolves
across the levels from those of ‘familiar, or personal interest’ at B1, to ‘a wide
range of subjects related to [a learner’s] field of interest’ at B2, and finally to
those deemed to be ‘complex’ at C1 (Council of Europe 2001: 26–27). The
use of we reflected such changes across levels: at B1 and B2, third party
usage was most prevalent, whereas at C1, general usage was most
common. Being successful at B1 meant learners tended to use we to discuss
topics in relation to themselves. Progressing from B1 to B2 called for the
ability to use we to relate topics not only to personal or general situations,
but also to situations shared by speakers partaking in the discourse. At C1
level, more complex ideas were expressed by using we to describe a general,
sometimes hypothetical situation related to a country or culture rather than
just the individual or a third party.
The pronoun we also enabled another function relevant to success to be
performed. Specific meanings expressed using we shed light on the importance
the pronoun held for managing interaction. It clearly afforded learners some
flexibility not only in expressing their opinions, but also in managing discourse
and confirming comprehension. In terms of managing discourse, the greater
usage and repetition of this pronoun helped learners at all levels to formulate
their own turns and to link their own ideas together, within and sometimes
across turns. This is something we will discuss further in chapter 4, but we
can see an example of this usage in figure 2.9, which is taken from B1
level learners.
In terms of more specific functions, we was mainly used by learners to
manage interaction by including the other speaker in the answers to enable
a form of co-construction, confirming what the other speaker said and often
adding to it, as the example in figure 2.9 also shows. This co-construction
and ability to link across speaker turns increased from B1-C1 levels, as we
will discuss in chapter 4.
There was very little evidence of we being used in this manner in NS data,
and a far greater tendency to rely on I to fulfil the same function and to use
other means of linking ideas together, such as the use of synonyms or other
pronouns. Figure 2.10 shows an example of this.
<$1M> First er I think er working for a large international comp= company er company have er
some advantage and disadvantage. I think er +
<$2M> This one er er the advantage is er we can communicate wi= with different people and er
these people er from <$=> different b= <$=> from different place. And er the disadvantage is
the the language is a big problem.
<$1M> If the language is a problem maybe we can talk talk with each other.
<$2M> Yeah. I’m very agree with your opinion and er in addition I would like to say erm er
work in the international company is very erm good and erm er we should er get erm high
income in the company rather than erm interesting in some er work. I mean er in this city erm
the pre= pressure influence on lots of people so most of reason is money so we should get high
income er work to get lots of money for our life.
<$1M> Yeah I agree with you and I think income is er very big thing <$=> and erm erm </$=>
but er we need practise our language if we if we work in UK we need a very high level
Figure 2.9 Repetition of we at B1
<$6F> Is it possible to get a good job without qualifications? It is more important to have a
high salary than an interesting job. Should we start with the first one?
<$5F> Yeah sure. Erm I’d say it is impossible to get a good job without good qualifications. For
example you couldn’t erm be a doctor without good qualifications. Erm you couldn’t be a lawyer
without good qualifications for example.
<$6F> I think it depends like for me like what do you define a good job? Because for example if
I had my own way like a good job for me would to be an actress and you wouldn’t necessarily
need qualifications. You just need to know the right people and talent erm <$=> or </$=> I
guess it is possible I think it’s just if you’re in the right place at the right time.
<$5F> It’s definitely dependent on erm what you like say is a good job.
<$6F> Yeah like what the profession is +
Figure 2.10 Linking ideas with I and synonyms in the native speaker data
2.6.2 Er/Erm
Er and erm will be further discussed in chapter 4 in relation to discourse
competence, but here they are analysed because of their high frequency. Er
occupied first position in the B1 and B2 frequency lists and third position
at C1, and erm was the 6th most frequent word at B1 and C1, and the 11th
most frequent at B2. The percentage coverage and frequencies of er gradually
fell as proficiency increased but erm showed no such pattern. It fluctuated
between 1.4% and 2.0% of the B1, B2 and C1 corpora and its normalised
frequencies were comparable to, albeit higher than, those in the LINDSEI
data (see table 2.7).
T-tests revealed that frequency changes for erm across the levels showed no
statistically significant differences. For er, on the other hand, they showed
key differences. Though there was no statistically significant change in
usage from B1 to B2, B1 (p<0.01) and B2 (p<0.03) learners used er
significantly more than C1 learners, with mean figures decreasing from 82
at B1, to 77 at B2 and to 45 at C1.
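The mean figures quoted here are consistent with dividing the raw frequency of er at each level by the 17 texts recorded for that level in table 2.4. The short sketch below makes that arithmetic explicit; it assumes this is how the means were obtained, which the text does not state.

```python
# Raw frequencies of "er" at each level, from Table 2.4.
ER_FREQUENCIES = {"B1": 1391, "B2": 1312, "C1": 770}

# Number of texts at each level, from the "Texts" column of Table 2.4.
TEXTS_PER_LEVEL = 17

# Assumption: the reported mean is raw frequency divided by number of texts.
mean_er_per_text = {
    level: round(freq / TEXTS_PER_LEVEL)
    for level, freq in ER_FREQUENCIES.items()
}

print(mean_er_per_text)
```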
Examination of collocational information also revealed patterns in the use
of er and erm, in particular with coordinating and subordinating conjunc-
tions (see tables 2.8 and 2.9), a point which will be developed further in
chapter 4. This was especially true of er and erm at B2, which collocated
significantly with and, but, because, and so.
The combinations of er and erm with different conjunctions at B1, B2 and
C1 may have implications for success in terms of the fluency portrayed by
USTC learners at boundaries where utterances are combined to form chains
of discourse. On the one hand, filled pauses such as these can be discouraged in
teaching materials as, rather bluntly, they can make speakers ‘sound stupid’
(Viney and Viney 1996 cited by Hughes 2011: 37). This data, however, ques-
tions whether such a concerted effort is required. Though Biber et al. (1999)
group fillers under dysfluency features, for Götz (2013: 36), they can act as
Table 2.7 Normalised frequencies for er and erm
        USTC B1                  USTC B2                  USTC C1                  LINDSEI                  BNC Spoken
Word    List pn  Freq.  NF       List pn  Freq.  NF       List pn  Freq.  NF       List pn  Freq.  NF       List pn  Freq.   NF
ER      1        1391   810.09   1        1312   615.99   3        770    333.58   4        23925  301.71   17       88188   88.51
ERM     6        320    186.36   11       308    144.61   6        460    199.28   11       10354  130.57   27       62086   62.31
Table 2.8 Collocates of er
N B1 B2 C1
Word Freq. T-score MI score Word Freq. T-score MI score Word Freq. T-score MI score
1 I 796 24.985 3.127 I 503 19.886 3.142 I 231 13.820 3.463

2 THE 516 20.266 3.213 AND 453 19.575 3.638 ER 227 13.348 3.132
3 AND 462 19.526 3.449 THE 473 19.408 3.216 THE 118 9.684 3.204
4 TO 317 15.710 3.088 TO 296 15.112 3.039 AND 86 8.224 3.143
5 THINK 215 13.355 3.487 YOU 276 14.694 3.114 IS 61 7.124 3.509
6 ERM 227 13.348 3.132 IN 211 12.933 3.189 THINK 54 6.748 3.614
7 IS 220 13.262 3.239 IS 185 12.171 3.250 IN 48 6.133 3.123
8 IN 206 12.684 3.105 THINK 168 11.651 3.306 BECAUSE 42 5.941 3.586
9 YOU 186 12.143 3.190 LIKE 169 11.561 3.175 LIKE 37 5.404 3.163
10 A 175 11.584 3.008 MY 147 10.845 3.244 VERY 31 5.163 3.783
11 LIKE 158 11.141 3.137 CAN 137 10.632 3.448 CAN 30 4.968 3.426
12 MY 154 10.858 3.000 BUT 137 10.479 3.256 WE 29 4.870 3.387
13 BECAUSE 139 10.500 3.202 HAVE 133 10.310 3.238 OF 27 4.620 3.172
14 OF 123 9.916 3.239 BECAUSE 128 10.296 3.475 HAVE 21 4.091 3.221
15 CAN 120 9.847 3.306 THEY 129 10.133 3.213 IF 18 3.936 3.788
16 WE 109 9.286 3.177 FOR 117 9.696 3.270 FOR 19 3.847 3.089
17 FOR 98 8.919 3.336 WITH 112 9.524 3.322 GOOD 17 3.821 3.769
18 WHEN 95 8.892 3.511 IT’S 105 9.268 3.388 WITH 17 3.726 3.376
19 HAVE 96 8.799 3.294 WE 105 8.973 3.008 WHEN 17 3.658 3.149
20 IT 95 8.668 3.175 IF 95 8.749 3.288 ALWAYS 15 3.642 4.070
*Top 20 collocates with significant T- and MI scores
Table 2.9 Collocates of erm
N B1 B2 C1
Word Freq. T-score MI score Word Freq. T-score MI score Word Freq. T-score MI score
1 I 231 13.820 3.463 I 152 11.244 3.506 THE 212 12.991 3.214
2 ER 227 13.348 3.132 YOU 73 7.668 3.286 I 156 11.203 3.278
3 THE 118 9.684 3.204 THINK 61 7.300 3.935 ER 143 10.675 3.221
4 AND 86 8.224 3.143 IS 57 6.945 3.642 TO 108 9.165 3.082
5 IS 61 7.124 3.509 A 52 6.560 3.469 THINK 95 9.048 3.801
6 THINK 54 6.748 3.614 LIKE 51 6.526 3.537 YOU 74 7.732 3.304
7 IN 48 6.133 3.123 BUT 43 6.044 3.675 OF 62 7.085 3.318
8 BECAUSE 42 5.941 3.586 CAN 37 5.598 3.650 BECAUSE 54 6.817 3.790
9 LIKE 37 5.404 3.163 MY 33 5.111 3.180 SOME 42 6.087 4.042
10 VERY 31 5.163 3.783 SO 30 4.812 3.042 PEOPLE 39 5.674 3.451
11 CAN 30 4.968 3.426 FOR 28 4.754 3.298 MAYBE 33 5.429 4.186
12 WE 29 4.870 3.387 SOME 26 4.685 3.623 MY 35 5.354 3.395
13 OF 27 4.620 3.172 THEY 28 4.674 3.100 ABOUT 31 5.124 3.649
14 HAVE 21 4.091 3.221 BECAUSE 26 4.569 3.266 THIS 29 4.975 3.713
15 IF 18 3.936 3.788 YES 22 4.271 3.482 KNOW 27 4.805 3.732
16 FOR 19 3.847 3.089 PEOPLE 22 4.265 3.464 CAN 29 4.723 3.024
17 GOOD 17 3.821 3.769 DO 21 4.119 3.305 LOT 23 4.497 4.003
18 WITH 17 3.726 3.376 THIS 19 3.938 3.372 IF 25 4.474 3.249
19 WHEN 17 3.658 3.149 HOW 17 3.91 4.246 ARE 25 4.442 3.164
20 ALWAYS 15 3.642 4.070 DON’T 18 3.885 3.568 NOT 24 4.326 3.095
*Top 20 collocates with significant t- and MI scores
a ‘fluency enhancement strategy’ to alleviate some of the pressures of spon-

taneous first and second language speech. They also form part of Bygate’s
(1987) facilitative communication skills since they permit learners to pause
before making important lexical choices whilst retaining turns when they
have not fully completed their responses. Actively avoiding er and erm can
therefore be disadvantageous. On the other hand, filled pauses could be
detrimental to perceptions of fluency which, according to the CEFR, should
increase as proficiency grows. This discussion by no means equates fluency
solely to the use of filled pauses like er and erm (see Gilquin and De Cock
2013; Götz 2013 for more in-depth treatments of fluency), but the CEFR
does explicitly stress that noticeable hesitations or pauses are more evident
at B1 and B2. At B1, ‘pausing for grammatical and lexical planning . . .
is very evident’; at B2 few long pauses should occur though learners ‘can
be hesitant as he/she searches for patterns and expressions’; at C1, speech
should flow ‘almost effortlessly’ for most subject areas (Council of Europe
2001: 28–29). In the USTC data, both er and erm were high in all the word
frequency lists. Normalised frequencies for er were more than double those
of LINDSEI for B1 and B2 whilst erm was still much higher at B1 and C1.
For usage of er at least, there was also a statistically significant fall from
B1 to C1, and again from B2 to C1. Successful speech, therefore, as we
would expect, displayed a reduction in filled pauses using er as proficiency
increased, but their appearance at all levels shows that they are still essential
in successful learner speech. Though no difference was established for the
use of erm across the proficiency levels, this study supports CEFR
descriptors of fluency to some extent: if filled pauses are linked to fluency
perceptions, their usage should ultimately decrease.
2.7 Keywords
The keywords for B1-C1 levels and the NS corpus are shown in tables
2.10–2.13 in comparison to the LINDSEI corpus.
Analysis using the LINDSEI wordlist as a comparison showed that the
keywords produced tended to be more characteristic of the topics discussed
(see chapter 1 for an overview of the topics used in the test data). This
includes words such as Preston, tourism, football and cinema, which reflect
to a degree the topics discussed in the test based around learners’ lives and
their free time. We do not consider such keywords as central to analysis of
learners’ linguistic competence. Instead, discussion will follow based on two
keywords: think and can. These have been chosen because they appear in the
lists at all levels and also because close examination of the data revealed they
fulfil important functions at all levels and help us to define how successful
learners at different levels express themselves.
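This section does not state how the keyness values in tables 2.10–2.13 were calculated. A widely used keyness statistic in corpus tools is the log-likelihood ratio, and the sketch below illustrates it under that assumption. The two corpus sizes are rough values inferred from the frequencies and percentages reported in this chapter, so the result should be expected to approximate, not reproduce, the tabled figure for er at B1.

```python
import math


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness of a word in a study corpus vs a reference corpus."""
    total_freq = freq_study + freq_ref
    total_size = size_study + size_ref
    # Frequencies expected if the word were equally common in both corpora.
    expected_study = size_study * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    # Observed/expected pairs for the word and for all other tokens.
    cells = [
        (freq_study, expected_study),
        (freq_ref, expected_ref),
        (size_study - freq_study, size_study - expected_study),
        (size_ref - freq_ref, size_ref - expected_ref),
    ]
    return 2 * sum(obs * math.log(obs / exp) for obs, exp in cells if obs > 0)


# Assumptions: corpus sizes inferred from reported frequencies, not stated here.
ASSUMED_B1_TOKENS = 17_171
ASSUMED_LINDSEI_TOKENS = 792_980

# "er": 1,391 occurrences at B1 and 23,925 in LINDSEI (Table 2.10).
er_keyness = log_likelihood(1391, ASSUMED_B1_TOKENS, 23925, ASSUMED_LINDSEI_TOKENS)

print(round(er_keyness, 2))
```

A word with the same relative frequency in both corpora scores zero, so higher values indicate words that are unusually frequent in the study corpus relative to the reference corpus.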
Table 2.10 B1 Keywords
N Keyword B1 – LINDSEI
Freq. % Texts RC. Freq. RC. % Keyness
1 ER 1,391 8.09 17 23,925 3.02 1,014.14

2 SPORT 41 0.24 5 30 220.58
3 WATCH 54 0.31 12 160 0.02 181.32
4 LAUGHS 23 0.13 4 6 147.94
5 MY 238 1.38 17 4,526 0.57 138.89
6 CAN 150 0.87 15 2,224 0.28 132.98
7 I 1,126 6.55 17 37,060 4.67 118.37
8 FOOTBALL 27 0.16 4 45 114.75
9 THINK 237 1.38 17 5,131 0.65 104.87
10 CINEMA 37 0.22 6 168 0.02 98.79
11 MEMORY 18 0.1 5 21 85.78
12 MOVIE 47 0.27 5 371 0.05 84.23
13 QUESTION 26 0.15 9 86 0.01 82.69
14 TELEVISION 22 0.13 8 53 81.05
15 IMPORTANT 44 0.26 12 338 0.04 80.7
16 LAUGHTER 12 0.07 2 4 74.65
17 HARRY 12 0.07 4 4 74.65
18 POTTER 11 0.06 4 4 67.54
19 FAVOURITE 22 0.13 8 85 0.01 64.46
20 GAME 19 0.11 3 59 62.34
Table 2.11 B2 Keywords
N Keyword B2 – LINDSEI
1 ER 1,312 6.16 17 23,925 3.02 536.42

2 CULTURE 88 0.41 6 174 0.02 316.25
3 AGREE 55 0.26 15 61 243.59
4 CAN 204 0.96 17 2,224 0.28 204.39
5 TECHNOLOGY 31 0.15 4 8 186.75
6 YOU 518 2.43 17 9,842 1.24 185.92
7 WILL 137 0.64 16 1,165 0.15 184.46
8 PRESTON 24 0.11 10 0 174.89
9 UK 19 0.09 10 0 138.45
10 WEATHER 42 0.2 5 101 0.01 138.28
11 SPEND 40 0.19 12 100 0.01 129.29
12 YEAH 380 1.78 17 7,498 0.95 122.43
13 USE 48 0.23 6 200 0.03 116.72
14 SUCCESSFUL 28 0.13 2 42 112.04
15 THINK 276 1.29 17 5,131 0.65 104.07
16 IF 158 0.74 17 2,261 0.29 103.88
17 MY 252 1.18 17 4,526 0.57 103.5
18 SPORTS 29 0.14 6 61 101.42
19 OUR 63 0.3 11 455 0.06 99.79
20 SMARTPHONE 13 0.06 2 0 94.72
Table 2.12 C1 Keywords
N Keyword C1 – LINDSEI
1 HOTEL 95 0.41 5 108 0.01 403.34

2 TOURISM 56 0.24 8 13 333.39
3 IMPORTANT 104 0.45 16 338 0.04 279.01
4 LAUGHS 40 0.17 11 6 250
5 UM 32 0.14 7 0 228.21
6 WILL 151 0.65 17 1,165 0.15 206.3
7 DUBAI 31 0.13 2 2 206.1
8 ENVIRONMENT 46 0.2 9 46 203.18
9 COUNTRY 115 0.5 17 708 0.09 195.22
10 THINK 342 1.48 17 5,131 0.65 175.91
11 YEAH 442 1.91 17 7,498 0.95 171.87
12 AGREE 41 0.18 15 61 158.45
13 LOCATION 22 0.1 5 3 138.71
14 CAN 179 0.78 17 2,224 0.28 130.45
15 FOOD 53 0.23 14 232 0.03 117.53
16 TRAVEL 43 0.19 10 136 0.02 117.09
17 GOVERNMENT 23 0.1 9 16 112.13
18 LIKE 348 1.51 17 6,394 0.81 108.87
19 RATING 14 0.06 4 0 99.83
20 RESTAURANT 28 0.12 11 52 99.07
Table 2.13 NS Keywords
N Keyword NS – LINDSEI
1 LAUGHS 35 0.44 6 6 288.5

2 LIKE 214 2.67 6 6,394 0.81 213.28
3 LAUGHTER 13 0.16 5 4 101.29
4 YEAH 171 2.13 6 7,498 0.95 87.94
5 THINK 127 1.59 6 5,131 0.65 76.92
6 IF 74 0.92 6 2,261 0.29 70.86
7 DEFINITELY 20 0.25 6 139 0.02 66.75
8 QUALIFICATIONS 8 0.1 1 2 63.72
9 YOU 184 2.3 6 9,842 1.24 57.57
10 THERE’S 28 0.35 6 470 0.06 51.82
11 WATCH 17 0.21 2 160 0.02 47.85
12 LAUGH 11 0.14 3 45 46.75
13 LEEDS 6 0.07 1 2 46.31
14 RECOMMENDED 6 0.07 1 2 46.31
15 NECESSARILY 8 0.1 3 13 46.04
16 DEPENDS 14 0.17 4 116 0.01 42.46
17 STUFF 20 0.25 5 286 0.04 42.21
18 JOB 22 0.27 1 357 0.05 41.9
19 JUST 75 0.94 6 3,163 0.4 41.55
20 SALARY 7 0.09 1 11 40.64
2.7.1 Think
Think was of fundamental importance to learners at all levels and was also
a keyword in our NS data. Juilland’s D of approximately 0.9 across all lev-
els also demonstrates its significance in the USTC data. Although we would
expect it to arise as a keyword due to the nature of the corpus data, com-
parison to LINDSEI lists showed that it still occurred more frequently in the
USTC data. In addition, as table 2.14 shows, think increased in frequency
as proficiency developed, and we can see it occurred on average more fre-
quently at C1 level and with a higher keyness factor.
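Juilland's D is a dispersion measure that ranges from 0 (a word concentrated in a single part of the corpus) to 1 (a word spread perfectly evenly). The sketch below uses one common formulation, D = 1 − V/√(n − 1), where V is the coefficient of variation of the word's frequency across n corpus parts. It assumes equally sized parts, and the per-text counts are hypothetical values chosen only to show the behaviour of the measure; they are not taken from the USTC.

```python
import math
import statistics


def juilland_d(counts):
    """Juilland's D for a word's frequencies across equally sized corpus parts."""
    n = len(counts)
    mean = statistics.fmean(counts)
    if n < 2 or mean == 0:
        raise ValueError("need at least two parts and a non-zero frequency")
    # Coefficient of variation, using the population standard deviation.
    variation = statistics.pstdev(counts) / mean
    return 1 - variation / math.sqrt(n - 1)


# Hypothetical per-text counts for illustration only.
evenly_spread = [5, 5, 5, 5]
single_text = [20, 0, 0, 0]
fairly_even = [12, 15, 14, 13]

print(juilland_d(evenly_spread))
print(juilland_d(single_text))
print(juilland_d(fairly_even))
```

On this reading, a value of approximately 0.9 indicates that think was used fairly evenly across the texts at each level rather than being driven by a few speakers.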
Unsurprisingly, I topped the collocates lists for t-scores and MI scores
across all levels, a finding reflected in the LINDSEI data, but colligational
analysis shows some interesting patterns in the data. These are displayed in
table 2.15.
The most common colligation contained a complement with an adjec-
tive phrase (this was set apart from comparative and superlative phrases to
enhance analysis). One pattern in particular I [er/erm] think + object + is
very important was prominent in learner utterances:
• B1 = 11 occurrences
• B2 = 12 occurrences
• C1 = 31 occurrences
Learners used this pattern as a precursor to additional explanation and, as
such, it clearly functioned as a way to manage a turn, as we will discuss
further in chapter 4. Phrases such as I think it is very important also acted as
a stalling device, giving learners thinking time as they formulated ideas.
When it was used in such a way, it often followed or preceded other
hesitations such as er. Important was found to be a significant collocate
following think at B1 (t-score: 3.134, MI 4.181), B2 (t-score: 3.003, MI 4.308)
and C1 (t-score: 4.692, MI 4.021). Figure 2.11 shows an example of this usage.
The second most common colligational pattern combined I think and a
noun phrase. Across levels, this was achieved using lexis such as way, place,
Table 2.14 Think at B1, B2 and C1
        USTC                                                                                        LINDSEI
Level   Raw freq.  Normalised freq.  Mean freq.  % of corpus  Juilland’s D  Position in keyword list  Keyness   Raw freq.  Normalised freq.  % of corpus
B1      237        138.02            13.94       1.38         0.86          9                         104.87    5,131      64.70             0.65
B2      276        129.58            16.24       1.29         0.90          15                        104.07
C1      342        148.16            20.12       1.48         0.89          10                        175.91
NS      127        158.59            21.17       1.59         0.88          5                         76.92
Table 2.15 Colligational patterns of think
Usage B1 B2 C1
Total % Total % Total %
I [er/erm] think + object 35 63.64 19 39.58 47 57.32

+ is/’s + adjective phrase.
Example: I think er sport is
very important [B1]
+ is/’s + noun phrase.
Example: I think it’s er a
matter of culture [C1]
+ is/’s + comparative or
superlative. Example: I
think outside is better [B2]
+ is/’s + comparative or
superlative. Example: I
think the most important
thing in our life is that
there’s no things should be
happened in the future [B2]
Total 55 100.00 48 100.00 82 100.00
<$0> Thank you. <$1M>. What do you do to keep healthy?

<$1M> Keep healthy I think er sport is er very important you know er sometimes <$=> I every
</$=> I like basketball and er sometimes I really with my roommate or my classmate go to
basketball.
Figure 2.11 I think + NP + ADJP at B1
city, thing, habit, subject, man, sport, or problem. Such utterances were sim-
ply used to provide an answer to a question briefly or to round off an utter-
ance, as we can see in figures 2.12–2.14.
Since occurrences of comparative, superlative and relative clause
colligations were low, I think was instead analysed for the collocations
which surrounded it in learner discourse, using a window of 5 words left
and right, to see if differences arose (see table 2.16).
As the collocations suggest, I think was found to be multifunctional, acting
as a discourse marker, stance marker and hedging device (see Carter
and McCarthy 2006). It displayed a sequencing function at all levels due to
combinations with and and so; the latter often used to conclude or round
off remarks (see chapter 4 for more in-depth discussion). Conjunctions such
<$0> Tell me about an interesting place you have visited recently.

<$9F> Er erm in recent? <$O15> Recently </$O15>
<$0> <$O15> Recently yes </$O15>
<$9F> I went to York. <$=> York is </$=> I think York is good pla= good place.
Figure 2.12 I think + NP at B1
<$0> Okay. Would you prefer TV without advertising? Would you like +
<$5M> No. No I I don’t agree it because I I have studies two years media and er in the future I
want er go to a TV station. The er the television is erm is a very important part of earn money er
in the of the to the TV er station. So I think it is not a good thing for the TV station.
Figure 2.13 I think + NP at B2
<$0> Er okay <$15F> what are the major transport problems in your country? Major transport
problems in your country.
<$15F> Mm major transport problems in my country? Okay I think it’s the traffic jams
Figure 2.14 I think + NP at C1
as those previously identified in discussion of er and erm were found to be
once again significant here and helped to focus, divert or shift attention; they
also allowed students to pause and sometimes introduce an idea different to
the one expected by the listener.
Similarly, stance, specifically epistemic stance, expresses a speaker’s
comments, evaluations and attitudes towards a particular topic as well as their
origin (Biber et al. 1999; Carter and McCarthy 2006). The use of I think as
a stance marker appeared to fulfil learners’ needs at all levels as there was
very little evidence of other stance markers such as ‘basically’ or ‘actually’.
However, especially at B2, think did act as a hedging device either on its own
or in combination with maybe. Such hedging performs an important
sociolinguistic function as it helps to maintain relationships between speakers by
making their utterances sound less ‘blunt and assertive’ (Carter and
McCarthy 2006: 223). Whilst I think is capable of softening statements on its own,
its collocation with maybe at B2 emphasised the intentional uncertainty or
lack of strength in learner speech.
The frequency of I think is not surprising if we consider Bygate’s (1987)
treatment of production and interaction skills. He suggests that during com-
munication via spontaneous, unplanned speech, the use of routines is essen-
tial and clearly I think plays a role in many such routines. This then allows
Table 2.16 Collocations of I think
N B1 B2 C1
1 ER 101 14.242 11.293 ER 151 12.283 11.260 THE 140 11.828 11.461
2 THE 27 9.051 11.001 THE 95 9.743 11.258 ER 106 10.292 11.634
3 I 22 7.992 9.933 I 72 8.480 10.695 ERM 92 9.590 12.173
4 ERM 29 7.679 11.630 IS 62 7.872 12.030 IS 69 8.304 11.900
5 IS 11 7.414 11.681 ERM 51 7.139 11.785 IT'S 60 7.744 12.293
6 SO 32 6.706 11.392 SO 37 6.081 11.612 AND 52 7.207 10.818
7 IT’S 5 6.402 12.179 IN 37 6.080 11.034 I 45 6.703 10.330
8 A 13 5.828 11.086 IT'S 35 5.915 12.160 SO 43 6.555 11.717
9 IN 14 5.565 10.814 AND 34 5.826 10.260 YEAH 43 6.555 11.133
10 TO 16 5.472 10.128 YEAH 32 5.654 10.810 IN 41 6.400 11.159
11 AND 21 5.472 9.946 A 27 5.193 10.790 THAT 38 6.163 12.165
12 VERY 5 5.290 11.958 TO 27 5.191 9.942 MM 37 6.082 12.762
13 BECAUSE 15 4.794 11.039 BECAUSE 25 4.998 11.477 TO 37 6.078 10.383
14 YOU 6 4.579 10.485 YOU 25 4.995 10.007 IT 30 5.475 11.413
15 THIS 4 4.471 11.868 IT 24 4.897 11.257 OF 29 5.383 11.068
16 MM 11 4.357 11.399 BUT 24 4.897 11.100 A 28 5.289 10.851
17 MY 7 4.356 10.423 FOR 23 4.794 11.281 FOR 27 5.195 11.750
18 GOOD 4 4.242 12.174 THIS 22 4.689 11.850 IMPORTANT 26 5.098 12.495
19 YEAH 12 4.239 10.274 MM 22 4.689 11.630 VERY 25 4.999 11.949
20 FOR 6 3.871 11.070 MAYBE 20 4.471 11.856 BUT 23 4.794 11.652
speakers to reduce the demands on thinking time during the interaction and
supplies learners with a ‘way into’ the discourse. In terms of learner success,
I think often satisfied what the CEFR describes as the need at B1 for
learners to express ideas as a ‘linear sequence of thoughts’, at B2 to give more
‘detailed descriptions’ and at C1 to ‘develop particular
points’, as per the productive sustained monologue descriptors (Council of
Europe 2001: 59).
2.7.2 Can
Can was focused upon because it features as a keyword at each level
but not in the NS data. It peaks in terms of frequency at B2 level and its
keyness is also greatest at this level, which suggests that can has most
importance at B2. These patterns are summarised in table 2.17.
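The frequency measures reported in table 2.17 can be illustrated with a short sketch. In that table the normalised frequency is 100 times the percentage of the corpus, which is consistent with a per-10,000-token basis. The corpus size (17,170 tokens) and number of exams (17) used below are assumptions back-calculated from the B1 row for illustration. They are not figures reported in this section, and the function names are hypothetical.

```python
def normalised_frequency(raw, corpus_size, per=10_000):
    """Occurrences per `per` tokens."""
    return raw / corpus_size * per

def percent_of_corpus(raw, corpus_size):
    """Share of all tokens in the corpus, as a percentage."""
    return raw / corpus_size * 100

def mean_per_text(raw, n_texts):
    """Average occurrences per exam (or other text unit)."""
    return raw / n_texts

# Assumed values: 17,170 tokens and 17 exams are back-calculated from the
# B1 row of table 2.17, not reported figures.
raw, size, exams = 150, 17_170, 17
print(round(normalised_frequency(raw, size), 2))
print(round(percent_of_corpus(raw, size), 2))
print(round(mean_per_text(raw, exams), 2))
```

Normalising in this way is what allows sub-corpora of different sizes to be compared directly.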
When looking at patterns of collocations, quite distinct differences were
found. The most common pattern at B1 was I can (I can = 70 occurrences,
you can = 26, we can = 36), while at B2 this was you can (you can = 64
occurrences, I can = 40, we can = 22) and at C1 this was we can (we can = 51
occurrences, I can = 35, you can = 42). Initial observation of the data
suggested that can occupied different functions at different
levels. At B1 level, the most common use was simply to describe that the
speaker has the possibility to do something. This was most commonly
related in some way to the speaker, rather than describing a specific ability
to do something, e.g. when I stay in my room I can play computer games.
At B2 level, the function was very similar but the use of ‘you’ also allowed
speakers to extend this general possibility to do something
beyond the individual speaker to a more general theoretical possibility,
with ‘you’ as a synonym of ‘one’, e.g. If you want to learn a new language er it’s
not difficult you can translate. At C1, can was combined with the general
use of ‘we’ discussed earlier (section 2.6.1), extending to wider groups of
people or society in general and the possibilities they have or do not have,
e.g. we can communication with the others er who are come from other
countries we can know their culture and can make friends. These functions
are shown in figures 2.15–2.17.
Table 2.17 Can at B1-C1 levels
       USTC                                                                                                    LINDSEI
Level  Raw freq.  Normalised freq.  Mean freq.  % of corpus  Juilland’s D  Position in keyword list  Keyness  Raw freq.  Normalised freq.  % of corpus
B1     150        87.36             8.82        0.87         0.84          6                         132.98   2,224      28.05             0.28
B2     204        95.78             12.00       0.96         0.87          4                         204.39
C1     179        77.55             10.53       0.78         0.84          14                        130.45
<$0> Thank you. <$2M>. What kind of books do you enjoy reading?
<$2M> Actually er I’m not er like the book <$=> I always er watch </$=> I always er enjoy
the <$G2> because erm I can er learn lots of things about them maybe like fishing.
Figure 2.15 I can at B1
<$0> <$O24> Rich </$O24> is it necessary to be rich in order to be successful?
<$10M> Okay. Necessary rich yes erm I don’t think so. I don’t think so yes but erm rich th= this
words I can’t understand er this this words. That means about your hand and about your money.
Maybe ri= rich er in from my perspective rich that means you have a lot of ideal yes you can
have a erm a strange inspiration maybe if if if er that’s rich that means . . .
Figure 2.16 You can at B2
<$16F> In the travel I think the environment friendly it means we go to travel not by a car and it
not er mm we can protect er environment and that is er mm er green green travel yes erm green
travel . . .
Figure 2.17 We can at C1 level
These functions reflect our earlier discussion in relation to the vocabulary
profile of each level. Successful speakers use similar words across the levels
but the number of functions for those words increases so that at C1, a
successful speaker is able to use can to describe a general possibility related to
themselves, people in general and also wider cultures in general.
As section 2.6.1 demonstrated, can was a significant collocate with we at
each level, so we decided to look for collocates at each level with we can,
excluding the words we, I and can, which may have been repeated near
to the target words. As mentioned in the methodology section, because the
frequency of collocates was lower with this chunk, collocates with a t-score of
2 or more or an MI score of 3 or more were included. These are shown in
table 2.18 in order of their raw frequency.
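The inclusion rule above (a t-score of 2 or more, or an MI score of 3 or more) can be sketched with commonly used formulations of the two measures. This is an assumption-laden illustration: corpus tools differ in how they compute expected frequency (for example, whether the window size is included), so the formulas below may not match the software used in the study. The counts and function names are hypothetical.

```python
import math

def expected(node_freq, coll_freq, corpus_size, span=10):
    """Expected co-occurrences if node and collocate were independent.

    `span` is the total window size (5 left + 5 right = 10 here).
    """
    return node_freq * coll_freq * span / corpus_size

def t_score(observed, exp):
    """T-score: difference between observed and expected, scaled by sqrt(O)."""
    return (observed - exp) / math.sqrt(observed)

def mi_score(observed, exp):
    """Mutual information: log2 of the observed/expected ratio."""
    return math.log2(observed / exp)

def is_significant(observed, exp, t_min=2.0, mi_min=3.0):
    """Apply the inclusion rule from the text: t >= 2 or MI >= 3."""
    return t_score(observed, exp) >= t_min or mi_score(observed, exp) >= mi_min

# Hypothetical counts, not taken from the USTC data.
exp = expected(node_freq=50, coll_freq=40, corpus_size=20000, span=10)
print(round(exp, 3), round(t_score(8, exp), 3), round(mi_score(8, exp), 3))
print(is_significant(8, exp), is_significant(2, exp))
```

The t-score tends to favour frequent pairings while MI favours rarer but more exclusive ones, which is one reason for accepting a collocate on either criterion when frequencies are low.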
The most obvious point to note is that the number of significant collocates
increased at C1 level, which matches the suggestions we have made in regard
to an expansion of functions of can moving from a description of general
possibility including the speaker to a more general and hypothetical pos-
sibility as the levels increase. There was also some evidence that at B1 and
B2, the ways in which we can was used tended to be limited in conversation
to a focus on the speaker’s own turn rather than as a response to another
speaker. This was evidenced particularly by the use of yeah in the C1 data.
Table 2.18 Collocates with we can B1-C1 levels
Collocates with we can at B1
Collocate  Freq  T score  MI score (3 +)
Get        8     2.47389  2.99601
Remember   5     2.09085  3.94472
Memorise   4     1.99085  7.71025
Start      3     1.53355  3.12529
Collocates with we can at B2
Collocate  Freq  T score  MI score
Think      10    2.28714  1.85338
Games      3     1.66837  4.76551
Computer   3     1.58732  3.58108
Collocates with we can at C1
Collocate  Freq  T score   MI score
Er         16    2.11266   1.08365
Yeah       12    2.21312   1.46942
So         10    2.253866  1.79944
Learn      8     2.71057   4.58490
With       8     2.37780   2.65000
Life       7     1.64148   4.25733
Bring      5     2.2009    5.99429
Speakers   3     1.69243   5.44997
Without    3     1.68111   5.08740
Native     3     1.66412   4.67236
Japan      4     1.64714   4.35044
Relax      3     1.89705   4.28005
Use        3     2.43892   4.12217
Any        3     1.63016   4.08740
Friends    3     1.59054   3.61347
Countries  3     1.57922   3.50244
The use of yeah with other keywords and chunks (as we will discuss further
in chapter 4) allowed successful learners at this level to respond to the turn
of their partner more easily and develop and co-construct ideas. At the B1
and B2 levels, learners showed less evidence of this and instead we can was
used to develop ideas within their own turn. Figures 2.18 and 2.19 show
these different uses.
2.8 Most frequent lexical chunks

The three- and four-word chunks for the B1, B2 and C1 data are presented
in tables 2.19 and 2.20. In total, there was a small rise in the number of
three-word chunks from B1 level with 513 at B1, to 530 at B2 and 526 at
C1, while the NS data showed a total of 275 chunks, from a smaller data
set. Four-word chunks were lower in number with 200 used at B1, 211 at
B2 and 190 at C1, with the NS data showing a total of 127. It was also evi-
dent, as Carter and McCarthy (2006) remark, that many of the four-word
chunks were extensions of three-word chunks. Statistical significance was
not reached in any of the comparisons, so the rise in proficiency for this
study did not relate to a rise in the number of chunks. Statistical significance
tests did highlight, however, that learners at all levels produced significantly
fewer four-word chunks than three-word chunks (p<.01 at all levels), a pattern
similar to that observed in spoken corpora of native speaker
data (for example, O’Keeffe, McCarthy and Carter 2007).
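Counting three- and four-word chunks of the kind reported here amounts to counting recurrent contiguous word sequences. The following is a minimal sketch only. The minimum-frequency threshold, the function names and the sample turn are assumptions for illustration, since the study's inclusion criteria for chunks are not stated in this section.

```python
from collections import Counter

def chunk_counts(tokens, n):
    """Count every contiguous n-word sequence in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def recurrent_chunks(tokens, n, min_freq=2):
    """Keep only chunks that recur at least `min_freq` times."""
    return {c: f for c, f in chunk_counts(tokens, n).items() if f >= min_freq}

# Hypothetical sample turn, lower-cased and split on whitespace.
sample = ("i think it's good yeah i think it's very good "
          "and i think it's important").split()
three = recurrent_chunks(sample, 3)
four = recurrent_chunks(sample, 4)
print(three)
print(four)
```

In this sample the three-word chunk i think it's recurs, but each of its four-word extensions occurs only once. This illustrates why four-word chunks are typically fewer than, and extensions of, three-word chunks.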
<$35F> <$O76> Mhm we can bring them </$O76> And normally they have hairdryers so +
<$34F> Yes.
<$35F> or laundry er machines so yeah it’s alright.
<$34F> Yeah if there is amenities i= it’s fine +
<$35F> Yeah.
<$34F> + and it’s good for me +
<$35F> Yeah.
<$34F> + but yeah we can bring it.
<$35F> We can bring it yeah
Figure 2.18 The use of yeah with we can to develop points across speaker turns at C1
<$0> Mm. Why is it important to work hard at school?

<$34F> Oh er <$G?> yeah you know I mm I try to work hard <$=> but yeah cos </$=> mm
<$E> laughs </$E> er yeah why important yeah maybe it’s important to after er af= after it’s
important to go social like er to work not in school not in social yeah so and we can get the
concentration
Figure 2.19 We can used to develop ideas within a speaker turn at B1

Table 2.19 3-word chunks at B1-C1 levels and in the native speaker data
Rank B1 B2 C1 NS
Chunk Freq. Chunk Freq. Chunk Freq. Chunk Freq.
1 I DON’T 77 A LOT OF 64 I THINK IT 53 I DON’T 45
2 I THINK ER 48 I DON’T 50 THINK IT’S 42 I THINK IT 25
3 A LOT OF 38 I THINK ER 35 A LOT OF 39 A LOT OF 24
4 ER I THINK 38 ER I THINK 29 I DON’T 34 YEAH I THINK 17
5 I THINK IT 32 I AGREE WITH 29 SO I THINK 31 DON’T KNOW 16
6 THINK IT’S 29 I THINK IT 29 IN MY COUNTRY 29 AND I THINK 14
7 I WANT TO 27 AGREE WITH YOU 27 I THINK THE 27 THINK IT’S 14
8 DON’T LIKE 24 DO YOU THINK 24 IT’S VERY 26 BUT I THINK 12
9 ER IT’S 21 IN THE FUTURE 24 YEAH IT’S 26 DON’T REALLY 12
10 I LIKE TO 21 SO I THINK 23 IT’S A 25 DON’T THINK 11
11 SO I THINK 21 DON’T HAVE 21 I AGREE WITH 24 ERM I THINK 11
12 I CAN’T 20 I WANT TO 21 IT’S NOT 24 IF YOU’RE 10
13 AND ER I 19 THINK IT’S 21 DON’T KNOW 23 OF THE TIME 9
14 DON’T KNOW 19 YOU HAVE TO 21 DO YOU THINK 22 YOU’VE GOT 9
15 ER AND ER 18 ER IT’S 20 OF THE HOTEL 21 AND IT’S 8
16 I THINK IT 17 DON’T KNOW 19 YEAH I THINK 21 AND THINGS LIKE 8
17 ER I LIKE 16 LOT OF TIME 19 DON’T HAVE 20 I THINK THAT 8
18 ER YOU KNOW 16 AND ER I 18 ERM I THINK 20 IT’S LIKE 8
19 YEAH YEAH YEAH 16 DON’T LIKE 18 ER I THINK 19 ALL THE TIME 7
20 AND ER ER 15 IT’S VERY 18 SO IT’S 19 I THINK I 7
Table 2.20 4-word chunks at B1-C1 levels and in the native speaker data
Rank B1 B2 C1 NS
Chunk Freq. Chunk Freq. Chunk Freq. Chunk Freq.
1 I THINK IT’S 26 I AGREE WITH YOU 21 I THINK IT’S 41 I DON’T KNOW 15
2 I DON’T LIKE 23 I THINK IT’S 20 I DON’T KNOW 12 I THINK IT’S 13
3 ER I THINK ER 15 A LOT OF TIME 19 WHAT DO YOU THINK 12 I DON’T THINK 11
4 I DON’T KNOW 13 WHAT DO YOU THINK 16 YEAH I AGREE WITH 12 AND THINGS LIKE THAT 7
5 AND ER AND ER 10 SPEND A LOT OF 15 I AGREE WITH YOU 11 I DON’T REALLY 7
6 I WOULD LIKE TO 10 I DON’T LIKE 12 IT’S IT’S 10 A LOT OF THE 6
7 GO TO THE CINEMA 9 I DON’T KNOW 10 LOCATION OF THE HOTEL 10 I THINK THAT’S 6
8 TO BE IN A 9 THEY DON’T HAVE 10 THE LOCATION OF THE 8 LOT OF THE TIME 6
9 ER A LOT OF 8 DO YOU THINK ABOUT 9 YEAH IT’S VERY 8 AND STUFF LIKE THAT 5
10 I DON’T THINK 8 I DON’T THINK 9 DO YOU THINK ABOUT 7 BUT I DON’T 5
11 I I DON’T 8 LOT OF TIME WITH 9 MY FAVOURITE RESTAURANT IS 7 DON’T KNOW I 5
12 BE IN A FILM 7 AGREE WITH YOU BUT 8 SO I THINK IT 7 ERM I DON’T 5
13 DO YOU LIKE TO 7 YES I AGREE WITH 8 THINK IT’S VERY 7 I MEAN IT’S 5
14 ER I DON’T 7 A LOT OF THINGS 7 BUT I THINK IT 6 I THINK IT IS 5
15 HAVE A LOT OF 7 ER I THINK ER 7 IN MY OPINION I 6 YEAH I THINK IT 5
16 I’D LIKE TO 7 WITH YOU BUT ER 7 IT’S VERY IMPORTANT 6 YOU’VE GOT TO 5
17 LIKE TO BE IN 7 AND ER I THINK 6 A FOUR STAR HOTEL 5 I THINK A LOT 4
18 WANT TO BE A 7 AND ER IT’S 6 A LOT OF PEOPLE 5 LIKE TO BE IN 4
19 BUT I DON’T 6 BECAUSE I DON’T 6 A LOT OF THINGS 5 NO I DON’T 4
20 DON’T KNOW THE 6 HAPPEN IN THE FUTURE 6 AGREE WITH YOU BECAUSE 5 S A LOT OF 4
It is first striking to see how many chunks at all levels again centre on the
verb think, a word which we have noted is important in the data. Whereas
at B1, emphasis was placed on giving opinions with chunks such as er I think,
I think er, I think it’s, at B2 and C1, chunks were employed both to give and
seek opinions: I think it’s, so I think, what do you think, do you think about.
We will not analyse the usage of these chunks containing I think in detail in
this chapter, as we have already discussed the use of I think and such chunks
are looked at again in chapter 4, when we explore discourse competence.
Instead, we will focus on the usage of two other high frequency chunks,
namely a lot of and the various patterns of agree with before a comparison
of chunks used in general across levels.
2.8.1 A lot of
A lot of shared a similar dispersion rate across levels in that Juilland’s D for
B1, B2 and C1 was 0.74, 0.71 and 0.71, respectively. It was therefore not only
high in terms of chunk frequency but was also used rather regularly throughout
the exams at each level. A lot of was explored in terms of its influence on
accuracy and collocation. It is not the intention of this book to analyse errors,
but since gains in accuracy affect judgements of success and since accuracy
can rise with the use of chunks, a brief error analysis was conducted.
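Juilland’s D, the dispersion measure cited above, is commonly defined as D = 1 - V / sqrt(n - 1), where V is the coefficient of variation of an item's frequency across n corpus parts. The sketch below uses that standard definition with the population standard deviation. It assumes parts of equal size (with exams of differing length, normalised frequencies would be needed), and the per-exam figures and function name are hypothetical.

```python
import math

def juillands_d(freqs):
    """Juilland's D for frequencies observed in n corpus parts.

    D = 1 - V / sqrt(n - 1), where V is the coefficient of variation
    (population standard deviation divided by the mean).
    """
    n = len(freqs)
    if n < 2:
        raise ValueError("need at least two corpus parts")
    mean = sum(freqs) / n
    if mean == 0:
        raise ValueError("item does not occur in any part")
    sd = math.sqrt(sum((f - mean) ** 2 for f in freqs) / n)
    return 1 - (sd / mean) / math.sqrt(n - 1)

# Hypothetical per-exam frequencies of one chunk (not USTC figures).
even = [4, 4, 4, 4]      # used equally in every exam
skewed = [16, 0, 0, 0]   # used in one exam only
print(juillands_d(even), juillands_d(skewed))
```

Values close to 1 indicate an item spread evenly across exams, and values close to 0 an item concentrated in very few. On this scale, figures such as 0.74 and 0.71 point to fairly regular use.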
Firstly, all uses of a lot of with countable and uncountable nouns were
classed as either correct or incorrect. When turned into a mean percentage,
a lot of had accuracy scores of 75.00% at B1, 71.67% at B2, and 91.30%
at C1. At B1, all uses of uncountable nouns (money, art, food, technology,
and time) were accurate but countable nouns proved more problematic with
the plural ‘s’ often missing, e.g. a lot of experiences, buildings, parks, places,
films, books, kinds and words. At B2, again, countable nouns posed more
difficulties than uncountable nouns with the latter being incorrect on only
three out of 27 occasions. In terms of countable nouns, again, plural ‘s’ was
sometimes omitted: a lot of elements, definitions, films, cultures, traditions,
things, programmes, cultures and sources. At C1, all uses of uncountable
nouns, such as food, frustration, benefit, money, slang and entertainment,
were accurate whilst only five instances of countable nouns (problems,
points, countries, taxis and expressions) were incorrect out of 46.
Statistical significance tests were carried out to establish whether there
were i) increases in usage of a lot of across the three levels or ii) changes
in accuracy scores. Though no statistically significant outcomes were found
between the levels in either of these respects, comparative
analysis of much and many provided further insight into the worth lexical
chunks have for learner speech. Table 2.21 displays the frequencies and
accuracy scores of many, much and a lot of at B1, B2 and C1.
From this data, it can be argued that chunks such as a lot of have a posi-
tive effect on learner accuracy. Combined scores of much, many and a lot of
Table 2.21 Comparison of many, much and a lot of across B1, B2 and C1
For each quantifier: correct uses/total uses, then % accuracy. Final column: % increase or decrease in accuracy (combined vs. a lot of).
Level  Many            Much           Combined        A lot of       % increase or decrease
B1     22/33  66.67%   1/3  33.33%    23/36  63.89%   27/36  75.00   +11.11%
B2     26/36  72.22%   6/9  66.67%    32/45  71.11%   43/60  71.67   –0.56%
C1     47/53  88.68%   8/9  88.89%    55/62  88.71%   42/46  91.30   –2.59%
displayed small drops in accuracy at B2 and C1, but at the lowest level in the
USTC, the chunk increased accuracy by 11%. With general usage of much
and many increasing from B1 to C1, perhaps simultaneously suggesting a
growth in learner confidence with these quantifiers, such data could lend
further support to studies which advocate the benefits of lexical chunks on
learner accuracy and their speech in general.
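The accuracy percentages in table 2.21 can be recomputed directly from the correct/total counts, as in the sketch below. The counts are those printed in the table, while the function names and the choice to express the difference as a lot of accuracy minus combined many and much accuracy are assumptions. That reading reproduces the +11.11 printed for B1. On the same reading the B2 and C1 differences come out as small positive values (0.56 and 2.59), so the minus signs printed for those levels may follow a different convention and could be worth checking against the original data.

```python
# Correct uses / total uses as printed in table 2.21.
table = {
    "B1": {"many": (22, 33), "much": (1, 3), "a lot of": (27, 36)},
    "B2": {"many": (26, 36), "much": (6, 9), "a lot of": (43, 60)},
    "C1": {"many": (47, 53), "much": (8, 9), "a lot of": (42, 46)},
}

def accuracy(correct, total):
    """Percentage of correct uses, rounded to two decimal places."""
    return round(100 * correct / total, 2)

def compare(level):
    """Return (combined many+much accuracy, a lot of accuracy, difference)."""
    row = table[level]
    correct = row["many"][0] + row["much"][0]
    total = row["many"][1] + row["much"][1]
    combined = accuracy(correct, total)
    chunk = accuracy(*row["a lot of"])
    # Difference taken as chunk minus combined (an assumed convention).
    return combined, chunk, round(chunk - combined, 2)

for level in table:
    print(level, compare(level))
```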
With regard to collocates with a lot of, a wide variety of items in the position
to its right was not found due to the small frequencies involved. T-scores were
often lower but MI scores were still able to show that collocations comparable
with LINDSEI did exist. These collocates are shown in tables 2.22 and 2.23.
What is interesting to note here, despite the low frequencies, is the use of
vague language by USTC learners. The MI scores for a lot of things are
comparable to LINDSEI at B2 and C1; its t-score at B1 was not found to be
significant. To be successful, it is necessary that learners become comfortable
and more confident in using vague language. As an element of speech which
can reduce online processing and reduce hesitation, it could also be of bene-
fit to learners who may not always be able to retrieve the precise vocabulary
they require to complete their utterance. The sample in figure 2.20 shows
this chunk in use.
2.8.2 Agree with you

In tables 2.19 and 2.20, agree with appeared in multiple forms. Whilst it did
not feature in the B1 lists at all, it formed five chunks at B2: I agree with,
agree with you, I agree with you, agree with you but, and yes I agree with, and
three chunks at C1: I agree with, I agree with you, yeah I agree with. Although
agree with you did appear four times at B1 in the concordance data, chunk data
suggested that interaction based on reactions to others’ utterances was not a
priority at this level. Further analysis revealed that it was multifunctional in
nature, especially at B2 level. If we take figure 2.20 as an example, despite
Table 2.22 Collocates of a lot of across USTC
      B1                                   B2                                     C1
Rank  Word      Freq.  T-score  MI score   Word        Freq.  T-score  MI score   Word      Freq.  T-score  MI score
1     PEOPLE    6      1.998    9.712      TIME        23     4.582    12.202     PEOPLE    12     3.314    10.471
2     TIME      5      1.997    9.578      THINGS      9      2.828    12.292     THINGS    7      2.449    11.621
3     CHANNELS  4      1.732    13.655     PEOPLE      8      2.643    10.078     KINDS     3      1.732    14.495
4     MONEY     3      1.732    12.655     CULTURE     7      2.234    10.242     MONEY     3      1.731    10.654
5     BUILDING  5      1.732    12.485     FRIENDS     5      1.999    11.094     FOOD      3      1.731    10.352
6     FOOD      2      1.413    10.485     TRAINING    3      1.732    13.157     COUNTRY   6      1.729    9.234
7     THINGS    2      1.413    10.485     ACTIVITIES  3      1.732    12.643     TOURIST   2      1.414    11.688
8     FRIENDS   2      1.413    9.860      DEFINITION  2      1.414    13.380     CARS      2      1.413    10.688
9     MUSEUMS   2      1.000    14.070     TRADITION   2      1.414    12.795     CAUSES    3      1.000    12.910
10    PARK      3      1.000    12.070     HOME        2      1.413    10.522     HOMETOWN  2      1.000    11.688
11    CITY      2      0.999    9.982      COUSINS     2      1.000    13.380     CULTURES  2      1.000    11.325
12    SPORT     2      0.998    8.712      COMPUTER    2      0.999    9.736      PROBLEMS  3      0.999    10.247
13    -         -      -        -          FAMILY      8      0.998    8.953      TOURISM   2      0.998    8.688
Table 2.23 Comparison of USTC a lot of collocates with LINDSEI data
         B1                         B2                         C1                         LINDSEI
Word     Freq.  T-score  MI score   Freq.  T-score  MI score   Freq.  T-score  MI score   Freq.  T-score  MI score
People   6      1.998    9.712      8      2.643    10.078     12     3.314    10.471     111    9.798    14.770
Things   2      1.413    10.485     9      2.828    12.292     7      2.449    11.621     75     8.366    15.386
Money    3      1.732    12.655     -      -        -          3      1.731    10.654     35     5.657    16.016
Time     5      1.997    9.578      23     4.582    12.202     -      -        -          58     7.000    14.413
Food     2      1.413    10.485     -      -        -          3      1.731    10.352     5      1.732    13.324
Friends  2      1.413    9.860      5      1.999    11.094     -      -        -          16     3.741    13.223
<$0> Er <$27F> how might your culture be different in 100 years’ time?
<$27F> In 100 years’ time in the future. Erm <$E laughs /$E> er I think there is huge huge
change will happen in the future. If er God give me a longer life to see that I will see my my my
my kids to grow in the in that time so I think a lot of things will change. Nobody have s= er the
own or local culture all the culture in the world er the same I think so.
Figure 2.20 A lot of things I at B2 level
1 Arabic a lot of time when we study. Yeah I agree with you but er I also may use technology
2 How about you ? Er I agree with you but er smartphone at this moment it’s very
3 drive to the city centre well . Yes I agree with you but I think I still enjoys life in er city
4 they have to control this. Yeah Yes I agree with you but er as you know everything it have
5 good teacher. I I agree with you I agree with you but er it depends with the teacher.
6 Mm yeah I + Sorry. + abso= absolutely agree with you but er mm if if cold weather er I
7 er something. Okay er I am totally agree with you but if er the family er make timetable for
8 at the computer. Yeah absolutely I agree with you but er they less communication with
Figure 2.21 Agree with you but at B2 level
the emphatic use of absolutely to imply explicit agreement with the previous
statement, it is followed by the conjunction but. This was found in a number
of exams at B2 as the example in figure 2.21 shows.
Whilst but was a significant collocate at B2 (t = 3.581, MI = 4.541), it
did not appear in the B1, C1 or LINDSEI collocate lists at all. At C1, the
conjunction but was instead replaced with because (t = 1.946, MI = 5.091).
Though only occurring on four occasions, and once being used to show
1 environment. No this time I don’t agree with you because it’s not only the cars + Yeah. +
2 erm rubbishes to the bins. Yeah I agree with you because erm people use education to teach
3 cancer tongue cancer well. Yes I agree with you because junk food is also food and also h
4 nment and give us a green life. I agree with you because also the cars er pollute the envi
Figure 2.22 Agree with you because at C1 level
disagreement, C1 learners employed the chunk more for its interpersonal
nature and its ability to extend and build on utterances, something
we will explore further in chapter 4 in relation to discourse competence. The
finding for B2 level, however, indicated that I agree with you performed a
different function: that of a stalling device for gaining time whilst students
formulated their own ideas. As can be seen in figure 2.21, agree with you
but was nearly always accompanied by er, which accentuated the hesitation
between learners using the chunk and joining it with their own opinion.
Though the chunk gave the impression of listenership, KWIC analysis revealed
otherwise at the B2 level.
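A KWIC (key word in context) display of the kind shown in figure 2.21 can be sketched as follows, together with a simple check of how often a chunk is immediately followed by a hesitation marker. This is an illustration only: the function names, the context width and the sample utterances (loosely modelled on figure 2.21) are assumptions, not the study's data or tool.

```python
def kwic(tokens, node, width=6):
    """Return (left context, node, right context) for each hit of `node`.

    `node` is a list of tokens; `width` is the number of context tokens
    shown on each side.
    """
    n = len(node)
    lines = []
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + n:i + n + width])
            lines.append((left, " ".join(node), right))
    return lines

def followed_by(lines, word):
    """Share of concordance lines whose right context starts with `word`."""
    if not lines:
        return 0.0
    hits = sum(1 for _, _, right in lines if right.split()[:1] == [word])
    return hits / len(lines)

# Hypothetical utterances modelled loosely on figure 2.21.
sample = ("yeah i agree with you but er i also use technology . "
          "yes i agree with you but i think i enjoy the city").lower().split()
lines = kwic(sample, ["agree", "with", "you", "but"])
for left, node, right in lines:
    print(f"{left:>30} | {node} | {right}")
print(followed_by(lines, "er"))
```

Reading the right-hand context line by line is what allows a judgement about whether the chunk leads into a genuine response or simply buys time.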
As a final note in this section, the occurrence of I agree with, agree with
you, and I agree with you in the B2 and C1 chunk lists prompted another
line of enquiry. Since chunks were used to imply or express agreement, data
was analysed to see how disagreement was expressed at all three levels.
What tended to happen in this case was that learners used the verb dis-
agree: five instances at B2 and four instances at C1 were found. The much
higher frequency of I agree with in comparison with I don’t agree or disagree
implies that the chunk also held a sociolinguistic function of maintaining
relationships between speakers during the interaction.
2.9 Discussion
Overall, the data demonstrated that USTC learners did use formulaic chunks
in their speech. With learners averaging between 30 and 35 three-word chunks
and approximately 12 four-word chunks per exam, formulaic language allowed them
a degree of success in their interactions. However, correlation between rising
chunk numbers and rising proficiency could not be established. For instance,
though three-word chunk numbers grew by 60 from B1 to C1, statistical sig-
nificance was not reached for either the three- or four-word categories. The
frequency of chunks was therefore not a reliable indicator of growing
proficiency in the USTC data. One reason for this could be the differences
exhibited in individual exams. Three-word chunk data showed considerable
variation from exam to exam: B1 frequencies fluctuated from 11 to 46 chunks,
B2 figures wavered between 18 and 59 chunks and, at C1, chunk frequency
varied from 12 to 68. Four-word chunk data was similar in this respect: 5–22
chunks at B1, 4–47 chunks at B2, and 4–31 chunks at C1. As the CEFR
explains, each learner’s communicative competence is formed by their previ-
ous language experiences. It could be possible that some learners within levels
had not had much experience of lexical chunk tuition whereas others had.
Attention thus turned to the nature of the chunks produced to see whether
this had an effect on learner success at B1, B2 and C1. Relating chunks to
Carter and McCarthy’s (2006) categories revealed that the vast majority
of chunks contained subject-verb forms consisting of lexical and auxiliary
verbs. Conjunction-verb structures were also used quite often across the
levels and occasional preposition and noun phrase expressions were also
identified. Again, regardless of three- or four-word constructions, not much
difference was found across the levels but for success, the data demon-
strates that some of the chunks relate to previously discussed functions, for
instance, I think it’s, so I think, I think er/erm, I don’t like, and er, and it’s
very (e.g. important). What this chunk data highlights is that their use also
conveys a degree of fluency or ease, given the benefits for online processing
and retrieval that chunks are said to have. Whilst er and erm did appear in some
chunks, especially at B1, the fact of the matter remains that salient lexico-
grammatical items highlighted earlier as having an impact on success added
another dimension to successful learner language use.
The final point to be raised here regarding chunk type pinpoints one
clear difference between levels. Interpersonal chunks such as er you know,
I don’t know, but I think, I think er are useful as they ‘reflect . . . mean-
ings (meanings which build and consolidate personal and social relations)
created between speakers and listeners’ (Carter and McCarthy 2006: 835).
Some of these, especially those centred on the verb think, showed that B1
learners were more concerned with expressing opinions while learners at
B2 and C1 distributed usage between expressing and seeking opinions. The
degree of interaction is not only exemplified via chunk data, but it is also
realised via chunks in learner language; simultaneously this finding stresses
the more strategic nature that turn-taking chunks can have (see Carter and
McCarthy 2006: 836). To reinforce this final point, chunks relating to
agreement, such as I agree with you, reveal greater attention paid to interacting in
the discourse at B2 and C1 rather than discourse being potentially composed
of two separate monologues with little joint construction.
2.10 Conclusion
Although we have described only certain aspects of linguistic competence,
given how many aspects it has, these still lead us to a number of
conclusions which, due to their number, are summarised in bullet points as
follows:
• B1, B2 and C1 learners were comparable in their combined K-1 and K-2
token and family coverages. K-1 and K-2 tokens stood at 97% whereas
family coverage fell between 81–84%.
• Learners did not use less frequent vocabulary, according to K-1 and K-2
bands, as proficiency developed, so the 2000 most frequent words are
fundamental to success at each level.
• The number of words used, i.e. tokens, increased significantly from B1
to B2. It was not, however, a distinguishing feature of the changes from
B2 to C1.
• The 20 most frequent words at B1, B2 and C1 comprised approximately
40% of all speech. Knowledge of these words is vital for success.
• We was more frequent in the learner data than the NS data. We was
frequent at each level but the functions changed and broadened as the
levels increased. At B1 and B2 levels, the use tended to be focused on
third parties, whereas at C1 level, the learners were also able to use this
to discuss concepts in a general sense.
• Er and erm were very frequent at all levels but hesitation via er did
reduce as proficiency levels increased, something which is reflected in
the CEFR descriptions of fluency.
• Hesitation was often concentrated around conjunctions which linked
utterances and it appeared in many chunks and commonly functioned as
a means of buying time and signalling that the turn was to be continued.
Successful speech requires er and erm to realise discourse competence
(see chapter 4 for further discussion).
• The verb think was a keyword at each level and was used throughout in
a range of patterns, illustrating how communicative routines for giving
opinions could be realised. Analysis revealed a variety of functions. It
enabled learners to successfully sequence utterances, shift focus, express
stance and hedge language.
• Can was also a keyword at each level and was used most often to discuss
general possibility, with the focus shifting as the levels increased. At B1,
this was used to talk from the speaker’s viewpoint regarding what I can
do and at B2 and C1 levels, from a more hypothetical or general
viewpoint with patterns around you can and we can.
• Success at higher proficiency levels related to the flexibility in functions
an individual word could satisfy. Just because a word is used frequently
does not mean that it is always repetitive or used in the same way.
• Though increases in frequency of lexical chunks were evident across the
levels, there were no statistically significant gains in usage as proficiency
grew.
• Chunks which were favoured by learners tended to be those which
could be used in a variety of speech and with a variety of functions such
as a lot of and I agree with you.
• The chunk I agree with you also demonstrated a better ability to interact at
B2 and C1, but it did not appear in chunk lists at B1. However, B2 learners did
use it more as a stalling device while they prepared their own utterances.
• Chunks such as a lot of demonstrated that chunks can reflect grammatical
accuracy. Correct usage is not guaranteed but for lower level learners,
a lot of may prove an easier-to-use option than many and much.
These findings will be discussed further in subsequent chapters and the
implications discussed in our concluding chapter.
References
Adolphs, S. and Carter, R. 2013. Spoken corpus linguistics: From monomodal to
multimodal. Abingdon: Taylor and Francis.
Adolphs, S. and Schmitt, N. 2003. Lexical coverage of spoken discourse. Applied
Linguistics, 24(4), 425–438.
Anthony, L. 2017. AntConc. (Version 3.4.4). [computer software] Tokyo, Japan: Waseda
University. Available from: <www.laurenceanthony.net/> [Accessed 28 March 2017].
Bachman, L.F. and Palmer, A.S. 1996. Language testing in practice. Oxford: Oxford
University Press.
Baker, P. 2006. Using corpora in discourse analysis. London: Continuum.
Barnbrook, G. 1996. Language and computers: A practical introduction to the com-
puter analysis of language. Edinburgh: Edinburgh University Press.
Biber, D., Johansson, S., Leech, G., Conrad, S. and Finegan, E. 1999. Longman gram-
mar of spoken and written English. Harlow: Longman.
Brezina, V. 2014. Statistics in Corpus linguistics: Web resource. [online] Available
from: <http://corpora.lancs.ac.uk/stats> [Accessed 2 June 2016].
Bygate, M. 1987. Speaking. Oxford: Oxford University Press.
pedagogy. In: J.C. Richards and R.W. Schmidt, eds. Language and communica-
tion. New York: Longman, 2–27.
Carter, R. and McCarthy, M. 2006. Cambridge grammar of English: A comprehen-
sive guide: Spoken and written English grammar and usage. Cambridge: Cam-
Chujo, K. 2004. Measuring vocabulary levels of English textbooks and tests using
a BNC lemmatised high frequency word list. Language and Computers, 51(1),
231–249.
Cobb, T. 2017. Compleat Lexical Tutor (Lextutor). [Online] Available from:
<www.lextutor.ca.> [Accessed 1 January 2017].
Council of Europe 2001. Common European Framework of Reference for Languages:
Learning, teaching, assessment. Cambridge: Cambridge University Press.
Davies, M. 2004. BYU-BNC (based on the British National Corpus from Oxford
University Press). [Online] Available from: <http://corpus.byu.edu/bnc.> [Accessed
15 February 2017].
De Cock, S. 1998. A recurrent word combination approach to the study of formulae
in the speech of native and non-native speakers of English. International Journal
of Corpus Linguistics, 3(1), 59–80.
English Grammar Profile 2016. [Online] Available from: <www.englishprofile.org/
english-grammar-profile> [Accessed 15 January 2016].
English Vocabulary Profile 2016. [Online] Available from: <www.englishprofile.org/
wordlists> [Accessed 15 January 2016].
Erman, B. and Warren, B. 2000. The idiom principle and the open choice principle.
Text, 20(1), 29–62.
Galaczi, E. and Ffrench, A. 2011. Context validity. In: L. Taylor, ed. Examining
speaking: Research and practice in assessing Second language speaking (studies in
language testing). Cambridge: Cambridge University Press, 112–170.
Gilquin, G. and De Cock, S. 2013. Errors and disfluencies in spoken corpora.
Amsterdam: John Benjamins Publishing.
Gilquin, G., De Cock, S. and Granger, S. 2010. LINDSEI: Louvain International
Database of Spoken English Interlanguage. [CD-ROM]. Louvain: Presses
Universitaires de Louvain.
Götz, S. 2013. Fluency in native and non-native English speech. Amsterdam: John
Benjamins Publishing.
Goulden, R., Nation, P. and Read, J. 1990. How large can a receptive vocabulary be?
Applied Linguistics, 11(4), 341–363.
Greaves, C. and Warren, M. 2010. What can a corpus tell us about multi-word units?
In: A. O’Keeffe and M. McCarthy, eds. The Routledge Handbook of Corpus Lin-
guistics. Abingdon: Routledge, 212–226.
Hoey, M. 2005. Lexical priming: A new theory of words and language. London:
Routledge.
Hughes, R. 2011. Teaching and researching speaking. 2nd ed. London: Pearson Education.
Hunston, S. 2002. Corpora in applied linguistics. Cambridge: Cambridge University
Press.
Sociolinguistics: Selected readings. Harmondsworth: Penguin, 269–293.
Laufer, B. and Nation, P. 1995. Vocabulary size and use: Lexical richness in L2 writ-
ten production. Applied Linguistics, 16(3), 307–322.
Laufer, B. and Nation, P. 1999. A vocabulary-size test of controlled productive ability.
Language Testing, 19(33), 33–51.
Lewis, M. 1993. The lexical approach: The state of ELT and the way forward. Hove:
Language Teaching.
McCarthy, M. 1999. What constitutes a basic vocabulary for spoken communica-
tion? Studies in English Language and Literature, 1, 233–249.
McCarthy, M. 2006. Explorations in corpus linguistics. Cambridge: Cambridge
University Press. Available from: <www.cambridge.org/elt/teacher-support/pdf/
McCarthy-Corpus-Linguistics.pdf> [Accessed 14 June 2016].
McCarthy, M. 2010. Spoken fluency revisited. English Profile Journal, 1(1), 1–15.
McEnery, T. and Hardie, A. 2012. Corpus linguistics. Cambridge: Cambridge Uni-
versity Press.
McEnery, T. and Wilson, W. 2001. Corpus linguistics. 2nd ed. Edinburgh: Edinburgh
University Press.
Meara, P. 1996. The dimensions of lexical competence. In: G. Brown, K. Malmkjær
and J. Williams, eds. Performance and competence in second language acquisition.
Cambridge: Cambridge University Press, 70–88.
Nation, I.S.P. 2001. Learning vocabulary in another language. Cambridge: Cam-
Nation, P. and Chung, T. 2009. Teaching and testing vocabulary. In: M.H. Long and
C.J. Doughty, eds. 2009. The handbook of language teaching, Oxford: John Wiley
and Sons, 543–559.
Nation, P. and Waring, R. 1997. Vocabulary size, text coverage and word lists.
Vocabulary: Description, Acquisition and Pedagogy, 14, 6–19.
Nattinger, J.R. and DeCarrico, J.S. 1992. Lexical phrases and language teaching.
Oxford: Oxford University Press.
Oakes, M.P. 1998. Statistics for corpus linguistics. Edinburgh: Edinburgh University
Press.
O’Keeffe, A., McCarthy, M. and Carter, R. 2007. From corpus to classroom: Lan-
guage use and language teaching. Cambridge: Cambridge University Press.
Pawley, A. and Syder, F.H. 1983. Two puzzles for linguistic theory: Nativelike selec-
tion and nativelike fluency. In: J.C. Richards and R.W. Schmidt, eds. Language and
communication. New York: Longman, 191–227.
Schmitt, N. 2000. Key concepts in ELT. ELT Journal, 54(4), 400–401.
Schmitt, N. 2008. Instructed second language vocabulary learning. Language Teach-
ing Research, 12(3), 329–363.
Schmitt, N. and Carter, R. 2004. Formulaic sequences in action: An introduction. In:
N. Schmitt, ed. Formulaic sequences. Amsterdam: John Benjamins, 1–22.
Schonell, F.J., Meddleton, I.G. and Shaw, B.A. 1956. A study of the oral vocabulary
of adults. Brisbane: University of Queensland Press.
Scott, M. 2015. WordSmith Tools.v.6.0.0.252. [Online]. Lexical Analysis Software
Ltd. Available from <http://lexically.net/wordsmith/> [Accessed 29 December 2015].
Scott, M. and Tribble, C. 2006. Textual patterns: Keywords and corpus analysis in
language education. Amsterdam: John Benjamins Publishing.
Sinclair, J.M. 1991. Corpus concordance collocation. Oxford: Oxford University
Press.
Stæhr, L.S. 2008. Vocabulary size and the skills of listening, reading and writing. The
Language Learning Journal, 36(2), 139–152.
Thornbury, S. and Slade, D. 2006. Conversation: From description to pedagogy.
Cambridge: Cambridge University Press.
Van Zeeland, H. and Schmitt, N. 2013. Lexical coverage in L1 and L2 listening
comprehension: The same or different from reading comprehension? Applied
Linguistics, 34(4), 457–479.
Viney, P. and Viney, K. 1996. Handshake: A course in communication. Oxford:
Oxford University Press.
WebLingua. 2016. Text Inspector. Available from: <www.textinspector.com/> [Accessed
28 March 2017].
Chapter 3
Strategic competence
3.1 Introduction
Identified as a component of communicative competence by Canale and
Swain (1980), strategic competence is typically associated with the ability
to convey and negotiate meaning successfully despite obstacles in the com-
munication process. Previously neglected in competence theories centring
on the native speaker (NS) (Chomsky 1965; Hymes 1972) and receiving less
attention than linguistic and sociolinguistic competences (Tarone and Yule
1989), it was a shift in focus towards the language learner that saw strate-
gic competence transition from being of equal importance to grammatical
and sociolinguistic competences, to it subsequently being foregrounded as
a central element of all language users’ production (Dörnyei and Thurrell
1991; Bachman and Palmer 1996). This chapter aims to briefly review stra-
tegic competence definitions before exploring how it can be exhibited and
utilised in the speech of learners at different levels. The argument we wish
to reinforce is that strategic competence, and the communication strategies
employed by learners, can be facilitative and indicative of successful speech
and, as such, should not simply be dismissed as a sign of insufficient pro-
ficiency. With earlier research also criticised for emphasising discussion of
strategies rather than their actual usage in communication (Ellis 1985), this
chapter additionally aims to illustrate the common patterns that B1, B2, and
C1 language users display in their interactions and the effects they have on
their overall performance as successful speakers of English.
3.2 Definitions of strategic competence and communication strategies
The paradox created regarding the relationship between strategic compe-
tence and successful second language speech can be attributed to earlier
definitions in the field. Comprising verbal and non-verbal communication
strategies to ‘compensate for breakdowns’ (Canale and Swain 1980: 30) and
utilised only ‘in the face of some apparent deficiencies in the interlanguage
system’ (Tarone 1981: 286), strategic competence initially emphasised a
learner’s need to make up for what they did not yet know or could not yet do in a
target language. The use of the term interlanguage itself similarly accentuated
learners’ incomplete target language knowledge and reinforced suggestions
that ‘strategies of second language communication’ were essential if speak-
ers were to converse successfully with speakers of that language (Selinker
1972: 229). In a sense, the assumption that strategic competence was called
into action only when speakers faced an obstacle in conveying their desired
meaning placed its impact on success in a somewhat negative light: the use
of strategies ultimately exposed learners’ ‘inferior’ language use.
A turning point was prompted, however, following research into speakers’
first language strategy transfer and their lack of awareness of, or desire to
attend to, obstacles during speech (see Chang and Liu 2016; Dörnyei 1995).
Strategic competence, and the communication strategies it encompassed,
instead became seen as a vital resource for helping speakers assess, plan and
execute authentic communication to its full potential (Bachman and Palmer
1996). It ultimately enhanced, boosted and optimised their language use to
make it more effective or fluent (Canale 1983; Dörnyei and Thurrell 1991;
Prebianca 2009; Tarone and Yule 1989). With such competence influenc-
ing learners’ conversational skills and fluency – the latter itself a key factor
affecting perceptions of proficiency (Rossiter et al. 2010) – strategic
competence clearly has implications for learner success as well as benefits for
speakers’ overall communicative competence (Chen 1990).
Reviewing literature into strategic competence inevitably requires knowl-
edge of what communication strategies (CSs) are but with the numerous
taxonomies that exist, there still remains no mutually accepted definition
(Cohen 2014; Dörnyei and Scott 1997). It is nevertheless important to note
that despite these differing treatments, the actual ‘substance’ of individual
strategies has been thought to remain relatively similar (Bialystok 1990:
61). Communication strategies have previously been defined as attempts,
techniques or devices that compensate for deficiencies in the target lan-
guage system or enable speakers to communicate meaning to interlocutors
(Corder 1981; Tarone 1981; Tarone and Yule 1989). Seen as linguistic ‘first
aid devices’ (Dörnyei 1995: 64) and employed when there is a ‘mismatch’
between the speaker’s intended meaning and their means for realising it
(Varadi 1992: 437), it is clear to see that traditional CS definitions are
problem-oriented in nature (Chang and Liu 2016; Dörnyei and Scott 1997).
However, seeing them solely as sticking plasters for the obstacles encoun-
tered when producing meaning inevitably overlooks their potential for
manipulating and extending a speaker’s current linguistic resources to cope
with ever-changing, demanding communicative situations. Since speech is a
real-time information exchange (see Hughes 2011; Taylor 2011) and one of
the most demanding mental processes (Ellis, Simpson-Vlach and Maynard
2008; Levelt 1989), it is quite possible that CSs also respond to speakers’
context-specific communicative needs rather than solely to the linguistic or
sociolinguistic problems they encounter (Williams, Inscoe and
Tasker 1997). The mutual cooperation between speakers and listeners to con-
verse with minimal effort hence implies that CSs can act as a means to main-
tain the clarity and economy of communication and simplify the demands
placed upon them (Poulisse 1997). In sum, such a view broadens the function
of CSs to acknowledge that they are a feature of ‘ordinary’ language use and
are not strictly confined to speech which is problematic (Bialystok 1990: 131).
It is with this perspective that we turn attention to Dörnyei and Scott’s
(1995) CS taxonomy (table 3.1).
Though CEFR statements for strategy usage in speech will ultimately pro-
vide the foundation for this chapter, we consider that CSs play important
roles in both meaning construction and the production of spontaneous speech
for all speakers. The inclusion of direct, interactional and indirect strate-
gies recognises traditional approaches to CSs concentrating on the actual
speaker producing meaning, but at the same time extends them to include
‘trouble-shooting exchanges’ (Dörnyei and Scott 1997: 199) between speak-
ers and listeners who reciprocally confirm, clarify and interpret each other’s
messages. The acknowledgement of indirect strategies for coping with time
pressures similarly displays some useful crossover with the CEFR statements
in that dealing with real-time speech is encompassed (see table 3.2). Dörnyei
and Scott’s CS taxonomy also helps us to make an additional key distinc-
tion. Since CSs are a feature of ordinary speech, it would be short-sighted
to assume that native speakers do not also employ them. In response to the
deficit view of CSs, we therefore end this section with the CEFR’s defini-
tion of a communication strategy as it resonates clearly with the taxonomy
presented, the notion of learner success and the overall aims of this book:
Strategies are a means the language user exploits to mobilise and balance
his or her resources, to activate skills and procedures, in order to fulfil
the demands of communication in context and successfully complete the
task in question in the most comprehensive or most economical way fea-
sible depending on his or her precise purpose. Communication strategies
should therefore not be viewed simply with a disability model – as a way
of making up for a language deficit or a miscommunication. Native speak-
ers regularly employ communication strategies of all kinds . . . when the
strategy is appropriate to the communicative demands placed upon them.

Table 3.1 Dörnyei and Scott’s (1995) CS taxonomy

Direct strategies
  Resource deficit-related strategies
    • Message abandonment
    • Message reduction
    • Message replacement
    • Circumlocution
    • Approximation
    • Use of all-purpose words
    • Word coinage
    • Restructuring
    • Literal translation
    • Foreignising
    • Code switching
    • Use of similar sounding words
    • Mumbling
    • Omission
    • Retrieval
    • Mime
  Own-performance problem-related strategies
    • Self-rephrasing
    • Self-repair
  Other-performance problem-related strategies
    • Other-repair
Interactional strategies
  Resource deficit-related strategies
    • Appeals for help
  Own-performance problem-related strategies
    • Comprehension check
    • Own-accuracy check
  Other-performance problem-related strategies
    • Asking for repetition
    • Asking for clarification
    • Asking for confirmation
    • Guessing
    • Expressing nonunderstanding
    • Interpretive summary
    • Responses
Indirect strategies
  Processing time pressure-related strategies
    • Use of fillers
    • Repetitions
  Own-performance problem-related strategies
    • Verbal strategy markers
  Other-performance problem-related strategies
    • Feigning understanding

Canale (1983) was first to argue that a skill-oriented approach to language
learning was needed if learners were to deftly manage and cope with
the communication in which they participated. Only by raising learners’
procedural knowledge of strategies would they be able to progress from simply
having knowledge of language to actually being able to use it. In a similar
vein, Bachman and Palmer’s (1996: 84) treatment of communicative com-
petence called upon both knowledge and ‘the capacity for implementing, or
executing that competence’, thus attributing a learner’s success to how well
competence, including strategic competence, is employed. With CS usage
hence evolving as proficiency grows, it is vital to learn ‘Who uses which
strategies, when and [to] what effect’ (Bialystok 1983: 103).
In the CEFR, the language learning process reveals itself in i) the language
activities learners participate in, and ii) their employment of CSs. Assum-
ing that growing proficiency and progressing through the CEFR levels are
common targets for learners, the latter would suggest that CSs act as a ‘con-
venient basis for the scaling of language ability’ (Council of Europe 2001:
57). However, with the writers emphasising the value of CSs for learners at
beginner levels and with previous studies linking an increase in proficiency
to a reduced need for CSs (Bialystok and Fröhlich 1980; Chen 1990; Parib-
akht 1985; Terrell 1977), it would be easy to maintain that CSs display a
negative correlation with success: it is their decrease, rather than increase, in
occurrence which denotes higher proficiency. This supposition would imply
that CS frequency is the sole indicator of changing proficiency but several
studies have instead indicated that the nature and efficiency of CSs are also
significant factors.
In their study with learners of French, Bialystok and Fröhlich (1980)
examined the effects of age, proficiency and task-type on learner attempts
to contend with vocabulary and infer meaning. Using a picture reconstruc-
tion task and a picture description task, they ascertained that as proficiency
grew, the occurrence of lexical CSs (for example foreignising first language
vocabulary, describing words in the target language and word coinage) fell
but their overall effectiveness, rated by a NS participating in the picture
reconstruction task, was found to improve. With regard to the nature of CSs,
Bialystok and Fröhlich’s typology indicated that advanced learners used sig-
nificantly fewer strategies originating from the L1, but more L2-based CSs
than other learners; their ability to select appropriate strategies was also
found to be better. Similarly, once again focussing on lexical CSs, Chen’s
(1990) study of 12 Chinese learners of English also identified that higher
proficiency resulted in fewer CSs. Students were instructed to complete a
concept-identification task consisting of 12 concrete and 12 abstract con-
cepts. It was discovered that higher proficiency learners were more likely
to employ linguistic-based CSs (e.g. synonyms, metalanguage, etc.) than
knowledge-based CSs (e.g. cultural knowledge and examples) or repetition
which were favoured by lower-proficiency learners. Greater efficiency in the
use of CSs was once again established. Moving beyond the sole analysis of
lexical CSs, Prebianca’s (2009) study investigated the use of lexical, gram-
matical, phonological and articulatory CSs by pre-intermediate, interme-
diate and advanced learners of English. Over an academic semester, three
oral narratives (about a fact making them happy, a story or movie they
liked and a story constructed from a series of pictures) were collected and
analysed for CS type and frequency across levels. Though previous results
had led Prebianca to expect a fall in CS frequency despite advances in CS
sophistication as proficiency rose, these predictions were not found to be
fully supported by the data: it was the intermediate group which exceeded
the pre-intermediate and advanced learners in terms of frequency. This was
attributed to the group’s overall enthusiasm for language learning and to
comparisons involving pre-intermediate learners whose level was too low
to yield consistent CS results and advanced learners who did not need to
‘resort’ to CSs as often. CS type was finally found to be uniform across the
three levels with respect to six CS categories: transfer, grammatical reduction,
unfilled pauses, umming and erring, sound-lengthening and self-repetition.
What these studies demonstrate is that frequency, efficiency and CS type
all differ according to proficiency level. Whilst decreasing CS use may be
typical as proficiency grows, it should not be assumed to be true in all
cases. Such an assumption could reinforce the deficit view of CSs as only
a problem-oriented feature of language but it could also overlook the sig-
nificance of CS type, CS efficiency or sophistication and the profiles of the
learners themselves. To fully understand what strategic competence means
for learner success, a more balanced view of CSs is required. Another impli-
cation of these studies with respect to success is the nature of the communi-
cation in which learners participate. As was highlighted in chapter one, the
native speaker model has, at times, been overbearing. With many studies
requiring students to converse with native speakers, little can be deduced as
to what makes learner speech with other learners successful. It is important
for research to likewise discover what makes learners successful in their own
right when they interact with fellow learners at different levels.

As mentioned in section 3.1, we sought to ascertain which CSs B1, B2 and
C1 learners used in their speech. To do so, it was the communication strate-
gies documented in the CEFR (Council of Europe 2001) that were used in
this chapter’s methodology. However, as has been pointed out on several
occasions in this book, the vague language and lack of illustrative learner
language in the framework have resulted in a document that is open to inter-
pretation. Discovering what language could satisfy strategy descriptors was
therefore essential before any subsequent analysis could take place.
As a result, this chapter, in contrast to the others in this book, took a very
different analytical approach. The compilation of the USTC resulted in a
usable collection of learner language, but corpus linguistic methods could
not be employed to identify how utterances in learner language did or did
not satisfy spoken CEFR strategy descriptors. Quantitative corpus analysis
can reveal frequent and significant lexicogrammatical patterns in a particu-
lar body of language, but such patterns are limited in their ability to show
how utterance-level structures combine to perform distinct functions during
spontaneous interaction. Another problem is that delving deeper into the
context surrounding particular lexis or structures requires prior knowledge
of the items being searched but without illustrative or exhaustive language
items, such searches would have been impossible or unreasonably restricted.
CS analysis thus required a systematic and robust qualitative procedure so
that this aim could be achieved and so that objectivity and reliability could
be upheld as much as possible. Since there is no definitive process suitable
for all qualitative research (Cohen, Manion and Morrison 2011), analysis
ultimately rests on the processes followed and the researcher’s own interpre-
tations of data (Dörnyei 2007). Ensuring that illustrations of can-do occur-
rence across B1, B2 and C1 were not solely a product of personal account
was essential. Figure 3.1 gives an overview of the process followed:
As figure 3.1 shows, this chapter’s findings essentially relied on the use of
codes for individual strategies. Coding refers to ‘the ascription of a category
label to a piece of data, that is either decided in advance or in response to the
data that have been collected’ (Cohen, Manion and Morrison 2011: 559).
These codes had already been established by the CEFR and would remain
largely unchanged during analysis so the first step in the process centred on
the identification of relevant can-do statements that could be used as codes.
1. B1, B2, C1 and C2 spoken can-do statements filtered according to relevance for the
USTC speaking exams. This was completed by the researcher and independent
judges.
2. Verification of can-do selection completed. Unanimous can-do selections accepted
automatically; ambiguous statements selected at researcher’s discretion.
3. Final list of selected statements for all three levels constructed. Statements checked
for overlap or repetition to make sure categories were distinct and operationalisable.
4. All exam transcripts containing either i) learner and examiner language or ii) only
learner language mined for examples of language satisfying can-do statements.
5. Independent judges for each level completed the same process using one exam only.
Selected exams compared to ensure rater-reliability and to reduce subjectivity.
6. Final choices made regarding can-do occurrence and inputted on NVivo.
7. Quantitative analysis of codes: categories organised according to CEFR classification:
production, interaction or strategy and counted at B1, B2 and C1.
8. Language analysis of codes: can-do occurrence and language use compared across B1,
B2, C1 and C2.
Figure 3.1 Procedure for identifying CS language in the USTC
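
The reliability check and counting stages of this procedure can be sketched in code. What follows is a minimal illustration only. The procedure does not name an agreement statistic, so Cohen's kappa is assumed here purely for demonstration, and the function names, labels and data are hypothetical rather than taken from the USTC.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of categorical labels.

    Assumption: kappa is used only as an example of an agreement measure;
    the chapter does not state which measure was applied.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same, non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


def tally_codes(coded_utterances):
    """Count coded utterances per (level, CEFR category) pair."""
    return Counter((u["level"], u["category"]) for u in coded_utterances)


# Hypothetical data: did each of eight utterances satisfy a given can-do statement?
researcher = ["yes", "yes", "no", "yes", "no", "no", "yes", "no"]
judge = ["yes", "no", "no", "yes", "no", "no", "yes", "yes"]
kappa = cohen_kappa(researcher, judge)

# Hypothetical coded utterances, as they might be exported from a coding tool.
coded = [
    {"level": "B1", "category": "production"},
    {"level": "B1", "category": "interaction"},
    {"level": "B2", "category": "interaction"},
    {"level": "B1", "category": "production"},
]
counts = tally_codes(coded)
```

In this invented example the two raters agree on six of eight utterances, which is no better than moderate agreement once chance is taken into account. In practice, a value of this kind would prompt the comparison and discussion of codes that the procedure describes before final choices are recorded.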

With the CEFR aiming to be comprehensive and applicable to a range of
language learning situations, not all abilities documented within it would be
evidenced in the USTC speaking exams. As qualitative research is typically
time consuming (Dörnyei 2007), efficiency had to be maximised by search-
ing for only the abilities that were expected to arise. To do this, a list of all
spoken CEFR can-do statements at B1, B2 and C1 was made; C2 statements
were also included as a basis for analysing NS speech. The researcher read
each statement and decided upon its relevance in the exams. Independent
raters were also asked to complete the same procedure marking statements
as relevant, irrelevant or potentially relevant. In total, five independent raters
assisted at this stage. Each had several years’ experience using the exams as
an interlocutor, assessor, senior examiner or examination standards official.
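
The selection rule applied at this stage can be expressed as a short sketch. It assumes that each can-do statement receives one vote from every rater, using the three labels described above. Treating unanimously irrelevant statements as rejected is our assumption for illustration, and the statement identifiers and votes are hypothetical.

```python
ALLOWED_VOTES = {"relevant", "irrelevant", "potentially relevant"}


def triage_statements(votes_by_statement):
    """Sort can-do statements into accepted, rejected and for-review lists.

    Unanimous 'relevant' votes are accepted automatically; any mixed set of
    votes is left to the researcher's discretion. Rejecting unanimous
    'irrelevant' votes is an assumption made for this sketch.
    """
    accepted, rejected, for_review = [], [], []
    for statement, votes in votes_by_statement.items():
        unknown = set(votes) - ALLOWED_VOTES
        if unknown:
            raise ValueError(f"unexpected vote label(s): {sorted(unknown)}")
        unique = set(votes)
        if unique == {"relevant"}:
            accepted.append(statement)
        elif unique == {"irrelevant"}:
            rejected.append(statement)
        else:
            for_review.append(statement)
    return accepted, rejected, for_review


# Hypothetical votes: the researcher plus five independent raters.
votes = {
    "statement_1": ["relevant"] * 6,
    "statement_2": ["relevant", "relevant", "potentially relevant",
                    "relevant", "irrelevant", "relevant"],
    "statement_3": ["irrelevant"] * 6,
}
accepted, rejected, for_review = triage_statements(votes)
```

Keeping the ambiguous statements in a separate list mirrors the decision to resolve them by judgement rather than by a fixed threshold such as a majority vote.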
Upon looking at the statements selected at the end of this stage, it was
clear that some overlap and repetition remained. Overall CEFR descriptors
for spoken production were often broken down into two or three individual
descriptors; sometimes, statements from other categories, e.g. spoken inter-
action, appeared extremely similar to statements belonging to the strategy
group. For instance, B2’s goal-oriented cooperation statement ‘can help
along the progress of the work by inviting others to join in, say what they
think, etc.’ is of great similarity to B2’s interactive cooperating strategy ‘can
help the discussion along on familiar ground, confirming comprehension,
inviting others in, etc.’ (Council of Europe 2001: 79 and 86). In qualita-
tive research, attempts should be made to make codes discrete; this can be
completed via a systematic process of refining preliminary codes or by a
process of changing coding labels into fuller sentences describing actions
or patterns (see Cohen, Manion and Morrison 2011; Miles, Huberman and
Saldana 2014). However, since this study aimed to investigate ways in which
can-do statements were realised in learner speech, modification of CEFR
terminology was considered unfavourable. The following decisions were
made. Where several descriptors combined to formulate an overall descrip-
tor, the individual categories were chosen; where overall descriptors were
not itemised, the researcher broke them down into discrete labels; and when
statements were repeated, only one statement was chosen. Table 3.2 displays
the final descriptors that were used as codes for categorising CS use at B1,
B2, C1 and C2. Records were kept both for CS usage throughout the exam,
i.e. where examiner and learner language was included, and in learner-to-
learner paired discussion in which the examiner did not participate.
3.5 Strategic competence at B1-C1 levels

This section examines strategic competence in the following ways. It first
outlines how frequently CEFR strategy statements were satisfied by B1, B2
and C1 speakers throughout their USTC communications. In doing so, it will
incorporate some CS usage findings from learner-to-learner paired discussion
Table 3.2 Final CS codes used for B1, B2 and C1 (Council of Europe 2001: 64–65, 86–87)
B1 Strategies Can start again using a different tactic when communication breaks down.
(production) Can correct mix-ups with tenses or expressions that lead to misunderstandings provided the interlocutor indicates there is a problem.
Can define the features of something concrete for which he/she can’t remember the word. Can convey meaning by qualifying a word
meaning something similar (e.g. a truck for people = bus).
Strategies Can summarise the point reached in a discussion and so help focus the talk.
(interaction) Can invite others into the discussion.
Can ask someone to clarify or elaborate what they have just said.
Can repeat back part of what someone has said to confirm mutual understanding and help keep the development of ideas on course.
Can invite others into the discussion.
Can exploit a basic repertoire of language and strategies to help keep a conversation or discussion going.
Can initiate, maintain and close simple, face-to-face conversation on topics that are familiar or of personal interest.
Can intervene in a discussion on a familiar topic, using a suitable phrase to get the floor.
B2 Strategies Can use circumlocution and paraphrase to cover gaps in vocabulary and structure.
(production) Can correct slips and errors if he/she becomes conscious of them or if they have led to misunderstandings.
Can make a note of ‘favourite mistakes’ and consciously monitor speech for it/them.
Strategies Can intervene appropriately in discussion, exploiting appropriate language to do so.
(interaction) Can initiate discourse, take his/her turn when appropriate, and end conversations when he/she needs to, though he/she may not always
do this elegantly.
Can use stock phrases (e.g. ‘That’s a difficult question to answer’) to gain time and keep the turn whilst formulating what to say.
Can help the discussion along on familiar ground, inviting others in etc.
Can ask follow-up questions to check that he/has understood what a speaker intended to say, and get clarification of ambiguous points.
C1 Strategies Can backtrack when he/she encounters a difficulty and reformulate what he/she wants to say without fully interrupting the flow of
(production) speech.
(As B2+) Can use circumlocution and paraphrase to cover gaps in vocabulary and structure
Strategies (As B2) Can ask follow-up questions to check that he/she has understood what a speaker intended to say, and get clarification of
(interaction) ambiguous points.
Can relate own contribution skillfully to those of other speakers.
Can select a suitable phrase from a readily available range of discourse functions to preface his/her remarks appropriately in order to
get the floor, or to gain time and keep the floor whilst thinking.
C2 Strategies Can substitute an equivalent term for a word he/she can’t recall so smoothly that it is scarcely noticeable.
(production) Can backtrack and restructure around a difficulty so smoothly the interlocutor is hardly aware of it.
Strategies (As C1) Can select a suitable phrase from a readily available range of discourse functions to preface his/her remarks appropriately in
(interaction) order to get the floor, or to gain time and keep the floor whilst thinking.
(As C1) Can relate own contribution skillfully to those of other speakers.
(As B2) Can ask follow-up questions to check that he/she has understood what a speaker intended to say, and get clarification of
ambiguous points.
in response to previous research which has focused mainly on learner to native-speaker
interactions. Following this, qualitative analyses of utterances, along
with noteworthy corpus findings from frequent word, keyword and chunk
data, will be presented to demonstrate not only which statements were satis-
fied by learner language, but also how they were realised in learner speech.
This aims to add a qualitative dimension to findings of what made the speak-
ers successful by explaining how different strategies were verbalised by learn-
ers. It is by no means exhaustive, however, and can only offer evidence as to
what was demonstrated by the USTC learners and not as to what was possible
according to their full communicative competences. This secondary assess-
ment of trends in B1, B2 and C1 learners’ employment of CSs will incorporate
the strategy usage of native speakers carrying out the same speaking task. It is
crucial to note here that the inclusion of such a comparison does not promote
the deficit view of learners by ‘pigeon-holing’ them by what cannot be done or
what cannot be done as well as a NS. It instead aims to emphasise the parallels
CS usage shares across learners and NSs and combat the perception that CS
frequency alone is an indicator of higher or lower proficiency.
3.5.1 Preliminary analysis of CEFR strategies in B1, B2 and C1 speech
Preliminary analysis of learner language at B1, B2 and C1 pinpointed the
occurrence of the selected can-do strategy statements identified in
table 3.3. The prevalence of production and interaction strategies is
presented by level – figures 3.2 and 3.3 for B1, figures 3.4 and 3.5 for B2, and
figures 3.6 and 3.7 for C1 – alongside brief summaries synthesising the main
findings at each of the three levels.
Statement | Total | % of all strategy use | % of production strategy total
Can define the features of something concrete for which he/she can’t remember the word. Can convey meaning by qualifying a word meaning something similar. | 23 | 5.79 | 53.49
Can correct mix-ups with tenses or expressions that lead to misunderstandings provided the interlocutor indicates there is a problem. | 0 | 0 | 0
Can start again using a different tactic when communication breaks down. | 20 | 5.04 | 46.51
Figure 3.2 B1 production strategies

Table 3.3 Revised B1 production strategies
B1 | Can-do statement | Total | % of production strategy use | % of all strategy use
Revised production strategies | Can start again using a different tactic when communication breaks down. | 20 | 20.62 | 4.43
Revised production strategies | Can correct mix-ups with tenses or expressions that lead to misunderstanding. | 54 | 55.67 | 11.97
Revised production strategies | Can define the features of something concrete for which he/she can’t remember the word. Can convey meaning by qualifying a word meaning something similar. | 23 | 23.71 | 5.10
Statement | Total | % of all strategy use | % of interaction strategy total
Can intervene in a discussion on a familiar topic, using a suitable phrase to get the floor. | 14 | 3.53 | 3.95
Can initiate, maintain and close simple, face-to-face conversation on topics that are familiar or of personal interest. | 80 | 20.15 | 22.60
Can exploit a basic repertoire of language and strategies to help keep a conversation or discussion going. | 61 | 15.37 | 17.23
Can repeat back part of what someone has said to confirm mutual understanding and help keep the development of ideas on course. | 56 | 14.11 | 15.82
Can ask someone to clarify or elaborate what they have just said. | 50 | 12.59 | 14.12
Can invite others into the discussion. | 46 | 11.59 | 12.99
Can summarise the point reached in a discussion and so help focus the talk. | 47 | 11.84 | 13.28
Figure 3.3 B1 interaction strategies
B1 learner data exhibited 397 statements across the two categories. Of
these, 10.83% were production strategies and 89.17% were interaction strategies.
Given that one of two key characteristics of B1 learners is their ability
to ‘maintain interaction and get across what [they] want to’ (see Council
Statement | Total | % of all strategy use | % of production strategy total
Can use circumlocution and paraphrase to cover gaps in vocabulary and structure. | 27 | 9.47 | 32.14
Can correct slips and errors if he/she becomes conscious of them or if they have led to misunderstandings. | 57 | 20.00 | 67.86
Can make a note of ‘favourite mistakes’ and consciously monitor speech for it/them. | 0 | 0.00 | 0.00
Figure 3.4 B2 production strategies
Statement | Total | % of all strategy use | % of interaction strategy total
Can intervene appropriately in discussion, exploiting appropriate language to do so. | 27 | 9.47 | 13.43
Can initiate discourse, take his/her turn when appropriate, and end conversations when he/she needs to, though he/she may not always do this elegantly. | 41 | 14.39 | 20.40
Can use stock phrases (e.g. ‘That’s a difficult question to answer’) to gain time and keep the turn whilst formulating what to say. | 34 | 11.93 | 16.92
Can ask follow-up questions to check that he/she has understood what a speaker intended to say, and get clarification of ambiguous points. | 38 | 13.33 | 18.91
Can help the discussion along on familiar ground, inviting others in, etc. | 61 | 21.40 | 30.35
Figure 3.5 B2 interaction strategies
of Europe 2001: 34), it is perhaps unsurprising that interaction strategy
statements held such a large majority both throughout the exam and during
learner-to-learner paired discussion (production strategies = 14.10%,
interaction strategies = 85.90%). These figures also present differing impressions
Statement | Total | % of all strategy use | % of production strategy total
(As B2+) Can use circumlocution and paraphrase to cover gaps in vocabulary and structure | 42 | 9.15 | 34.15
Can backtrack when he/she encounters a difficulty and reformulate what he/she wants to say without fully interrupting the flow of speech. | 81 | 17.65 | 65.85
Figure 3.6 C1 production strategies
Statement | Total | % of all strategy use | % of interaction strategy total
Can select a suitable phrase from a readily available range of discourse functions to preface his/her remarks appropriately in order to get the floor, or to gain time and keep the floor whilst thinking. | 146 | 31.81 | 43.45
Can relate own contribution skillfully to those of other speakers. | 40 | 8.71 | 11.90
(As B2) Can ask follow-up questions to check that he/she has understood what a speaker intended to say, and get clarification of ambiguous points. | 75 | 16.34 | 22.32
(As B2) Can help along the progress of the work by inviting others to join in, say what they think, etc. | 75 | 16.34 | 22.32
Figure 3.7 C1 interaction strategies
of learner attempts to involve others in the discussion. Whilst figure 3.3
shows that initiating, maintaining and closing simple face-to-face discussion
accounted for nearly a quarter of all interaction strategies (80, 22.60%),
thus reinforcing the main characteristics of B1 learners as per the CEFR,
inviting others into the conversation was just over half that at 13% of all
interaction strategies. In learner-to-learner speech, however, the latter was
instead the most frequent statement, with 16 (23.88%) interaction strategy
occurrences coinciding with attempts to involve other speakers. Though
maintaining interaction and communicating meaning are at the forefront of
learners’ interactive abilities at B1 according to the CEFR, it is clear that
in successful speech they also display a desire not only to give opinions but
also to seek them.
With regard to production strategies, it at first appeared that no correction
of mix-ups with tenses or expressions was evident in the learners’ speech,
and that restarting an utterance and defining features of unknown vocabulary
were used to a similar extent, representing 46.51% and 53.49% of the production
category respectively. This did not fully reflect the true nature of learner
language at this level though. Because the data came from a corpus of spoken
test language, the wording of the CEFR strategy for correction proved
problematic in that it could only be satisfied ‘provid[ing] the interlocutor
indicates there is a problem’ (Council of Europe 2001: 65), an action that an
interlocutor or assessor would not deliberately perform. The CEFR (Council of
Europe 2001: 34), however, simultaneously suggests that in freer production,
repair at B1 is ‘very evident’, so the decision was taken to adjust this
strategy to incorporate correction of mix-ups with tenses or expressions
without the interlocutor’s intervention. The resulting data
painted a very different picture from the initial analysis (see table 3.3).
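The effect of this adjustment can be checked arithmetically. The following is a minimal sketch, assuming the raw counts reported in figures 3.2 and 3.3 and in table 3.3; the function and variable names are illustrative only and are not part of the original analysis.

```python
# Counts for B1 (all speech) as reported in figures 3.2 and 3.3 and table 3.3.
# All names below are illustrative assumptions, not part of the original study.

def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimal places."""
    return round(100 * count / total, 2)

INTERACTION_TOTAL = 354   # sum of the seven B1 interaction statements
RESTART = 20              # 'Can start again using a different tactic ...'
DEFINE = 23               # 'Can define the features of something concrete ...'

# Correction counted only when the interlocutor flags a problem (original
# wording) versus correction without interlocutor intervention (revised wording).
corrections = {"original": 0, "revised": 54}

for label, correction in corrections.items():
    production_total = RESTART + DEFINE + correction
    all_total = production_total + INTERACTION_TOTAL
    print(
        label,
        "| production total:", production_total,
        "| correction % of production:", share(correction, production_total),
        "| correction % of all strategy use:", share(correction, all_total),
    )
```

Under the revised wording the production total rises from 43 to 97 statements, which is why the shares of the other two production statements in table 3.3 differ from those in figure 3.2.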
In producing speech, B1 learners were clearly preoccupied with their accuracy
in conveying meaning to those listening. Correction occupied 12% of all
strategy usage at this level and accounted for over half of the production
strategies. This was also reflected in learner-to-learner discussion, which saw
seven instances of correction, although its proportion of the production
strategy category fell (38.89%). This fall could again be attributed
to the nature of the speaking task, but it could similarly reflect the
practicalities of producing real-time speech with little time to plan or
reflect on the construction of meaning.
B2 learners carried out the selected CEFR strategy statements on 285
occasions. Production and interaction strategies constituted 29.47% and 70.53%
of these respectively in all speech, and 30.67% and 69.33% in
learner-to-learner speech. Once again, as was seen at B1, interaction
strategies dominated, occupying more than two-thirds of all the statements
evidenced in learner language. In a change from B1, however, the most frequent
strategy shifted. At B1, nearly a quarter of interaction strategy use related
to initiating, maintaining and closing conversation on topics that are familiar
or of personal interest. At B2, by contrast, 61 occurrences, nearly a third of
all interaction strategies, related to helping the discussion along on familiar
ground and inviting others into the discussion, a strategy proportionally more
than twice as frequent as at B1. Though confirming mutual
understanding remained relatively similar, analysis of strategy use illustrated
that success at B2 is reliant upon communication which is reciprocal and
jointly constructed. This corresponds to CEFR descriptions of learners at
this level who are able to ‘more than hold [their] own in social discourse’,
which demands a more natural conversation in which the listener plays an
equal part in the communication (Council of Europe 2001: 35).
Another distinction across the two levels was revealed in learners’
use of production strategies, in particular those associated with correction.
Correction at B2 constituted 68% of production strategy usage and 20%
of all strategy use throughout the discussions; in learner-to-learner
communication, it equated to 56.52% of production strategy use and 17.33% of
all strategy use. By comparison, across all B1 speech, it represented 56% of
production strategies and 12% of all strategy use, but 38.89% of production
strategy usage and 8.24% of all usage in learner-to-learner discussion. This
increase across the two levels could, therefore, to some extent exemplify B2
learners’ ‘new degree of language awareness’ as specified within the CEFR
(Council of Europe 2001: 35). Though no ‘favourite mistakes’ were identified,
learners did correct slips and errors on 57 occasions, a further potential
indicator of the awareness which results in their labelling
as ‘vantage’ learners (Council of Europe 2001: 35).
In total, 459 strategies were identified in all C1 speech: 26.80% for
production and 73.20% for interaction. Interestingly, despite the fact that this
level had the lowest number of selected statements for analysis (B1 = 10,
B2 = 8, C1 = 6 statements), the learners within it produced the highest
amount of statement evidence. Immediately, therefore, it could be argued
that the statements dominating each of the two strategy categories were
fundamental to learners’ overall spoken success. For instance, descriptions
of C1 ability in the CEFR highlight the use of suitable phrases for prefacing
remarks and the ‘controlled use of organisational patterns, connectors and
cohesive devices’ to aid fluency (Council of Europe 2001: 36). With 146
occurrences (43.45% of the interaction category and nearly a third of all
strategy use) relating to prefacing remarks to get the floor, gain time and
keep the floor, the way that utterances were combined to convey meaning seemed
to take precedence over the ability to clarify points and invite others into
the discussion. This was replicated in paired learner discussion data which,
despite dramatic falls in frequency, saw the prefacing statement occupy the
largest share of interaction strategies at this level (16, 34.78%). Similarly,
with 66% of production strategy evidence relating to backtracking and
reformulating utterances (57.69% in learner-to-learner speech), the ability
to monitor speech was once again prominent, as it was at the other two levels.
What is clear from these data, therefore, is that production strategies do have
an important role to play in successful learner speech in spite of their reduced
occurrence in comparison with interaction strategies.
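The production and interaction splits quoted for each level in this section can be reproduced from the category totals. The sketch below assumes the totals obtained by summing the statement counts in figures 3.2 to 3.7 (with B1 using the unrevised production count); the dictionary layout and function name are illustrative choices rather than the authors' procedure.

```python
# Category totals per level (all speech), summed from figures 3.2 to 3.7.
# B1 uses the unrevised production count (23 + 0 + 20 = 43).
# The data layout and names are illustrative assumptions.
counts = {
    "B1": {"production": 43, "interaction": 354},
    "B2": {"production": 84, "interaction": 201},
    "C1": {"production": 123, "interaction": 336},
}

def category_split(level_counts: dict) -> dict:
    """Return each category's share of the level total as a percentage."""
    total = sum(level_counts.values())
    return {name: round(100 * n / total, 2) for name, n in level_counts.items()}

splits = {level: category_split(c) for level, c in counts.items()}

for level, split in splits.items():
    print(level, "total:", sum(counts[level].values()), split)
```

Laying the three levels side by side in this way makes it easier to see that interaction strategies account for roughly 70% or more of the evidence at every level, while the raw totals differ.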
3.5.2 Comparison of CEFR strategy realisation in B1, B2 and C1 speech
The next stage of analysis involved comparing the most frequent strategy
statements at each of the three levels (across all speech as well as in the
learner-to-learner paired discussion task) to reveal whether B1, B2 and C1
Table 3.4 Most frequent production strategy statements at B1, B2 and C1
Level | Most frequent statement across all speech | Freq. | % of category | Most frequent statement in learner-to-learner speech | Freq. | % of category
B1 | [Amended] Can correct mix-ups with tenses or expressions that lead to misunderstandings. | 54 | 55.67% | [Amended] Can correct mix-ups with tenses or expressions that lead to misunderstandings. | 7 | 38.89%
B2 | Can correct slips and errors if he/she becomes conscious of them or if they have led to misunderstandings. | 57 | 67.86% | Can correct slips and errors if he/she becomes conscious of them or if they have led to misunderstandings. | 13 | 56.52%
C1 | Can backtrack when he/she encounters a difficulty and reformulate what he/she wants to say without fully interrupting the flow of speech. | 81 | 65.85% | Can backtrack when he/she encounters a difficulty and reformulate what he/she wants to say without fully interrupting the flow of speech. | 15 | 57.69%
Table 3.5 Most frequent interaction strategy statements at B1, B2 and C1
Level | Most frequent statement across all speech | Freq. | % of category | Most frequent statement in learner-to-learner speech | Freq. | % of category
B1 | Can initiate, maintain and close simple, face-to-face conversation on topics that are familiar or of personal interest. | 80 | 22.60% | Can invite others into the discussion. | 16 | 20.51%
B2 | Can help the discussion along on familiar ground, inviting others in, etc. | 61 | 30.35% | Can help the discussion along on familiar ground, inviting others in, etc. | 14 | 18.67%
B2 | | | | Can ask follow-up questions to check that he/she has understood what a speaker intended to say, and get clarification of ambiguous points. | 14 | 18.67%
C1 | Can select a suitable phrase from a readily available range of discourse functions to preface his/her remarks appropriately in order to get the floor, or to gain time and keep the floor whilst thinking. | 146 | 43.45% | Can select a suitable phrase from a readily available range of discourse functions to preface his/her remarks appropriately in order to get the floor, or to gain time and keep the floor whilst thinking. | 16 | 34.78%
learners evidenced similar traits in their production and interaction strategy
use. Whilst such analysis would reveal which descriptors were central
to success at each level, it would also act as a precursor to the in-depth
comparison of how learners used them and via what language. Tables 3.4
and 3.5 present the most frequent CEFR strategy statements for production
and interaction.
From tables 3.4 and 3.5, two clear themes for further analysis emerge.
Production strategies at each level were, for the most part, heavily dominated
by learner correction or utterance reformulation. Interaction strategy
statements, by contrast, were shared between initiating, maintaining and closing
conversation, helping discussion along and asking follow-up questions to
seek clarification. Additional insights will now be provided into
the way these strategies were used and realised in learner speech. Interaction
strategy use will not, however, concentrate on the way conversations were
started, closed or maintained, since there is considerable
crossover with chapter 4’s findings related to discourse management.
Discussion here therefore comments on correction for production strategies,
and on inviting others into the conversation and clarification for interaction
strategies.
3.5.3 Production strategies: correction

As was introduced in section 3.5.1, analysis of production strategies
identified correction and rephrase – part of Dörnyei and Scott’s (1995) direct
strategies for one’s own-performance problems – as the most common
strategy across all three levels. Although B1 learners initially appeared to
employ circumlocution and paraphrase more than correction, adjusting the
CEFR statement to remove the requirement for interlocutor intervention showed
otherwise. In fact, quantitative analysis of strategy occurrence indicated
that correction occupied a larger proportion of the production strategy
category at B2 and C1 than at B1, both in all speech (55.67% at B1,
67.86% at B2, 65.85% at C1) and in learner-to-learner speech (38.89%
at B1, 56.52% at B2, 57.69% at C1). Such a finding thereby challenges
claims that increasing CS evidence signifies a lower degree of language
proficiency, as the increase in correction actually denoted continuing
success at the B2 and C1 levels. Similarly, the data contest previous
assumptions which associate productive skills more with NSs due to their more
complete linguistic knowledge (Thornbury and Slade 2006). The fact that
the ability to reformulate speech was evidenced at all three levels instead
demonstrates that the ability to enhance and refine initial messages is one
which learners at B1, B2 and C1 exhibit, and one which once again
contradicts associations of self-correction and rephrase with signs of
diminished competence.
What remains to be seen, however, are the factors to which learners lent
their attention. The CEFR beyond B1 level is a little vague in this respect.
B1 descriptors emphasise ‘mix-ups with tense or expressions’, whereas B2
and C1 statements highlight the need to i) repair ‘slips and errors’, or ii)
backtrack upon encountering a ‘difficulty’ (Council of Europe 2001: 65).
Little expansion is offered, therefore, on how such correction or reformulation
may manifest itself in learners’ speech. To clarify the errors receiving
attention, utterances were assessed for their main corrective focus. Table 3.6
shows, with the help of illustrative learner language, which items were cor-
rected by learners. It also incorporates data from NS speech as a neutral
point for comparison.
The data highlight that B1, B2 and C1 learners were clearly alike in their
ability to correct utterances according to word choice selections. Equating to
approximately half of all learner corrective statements, word choice revisions
comprised 52%, 51% and 47%, respectively, across the levels and 38% in
NS data (again the majority error type). Analysis of successful learner speech
therefore identified that the CEFR’s ambiguous slips and errors related
mostly to the choice of verbs and pronouns (see figures 3.8-3.11), not only
at B2 and C1 but also in B1 and NS speech:
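Table 3.6 Focus of error correction in the USTC corpus
Focus | B1 | % | B2 | % | C1 | % | NS | %
Agreement | 0 | 0.00 | 3 | 5.26 | 1 | 1.23 | 0 | 0.00
Contraction | 3 | 5.56 | 1 | 1.75 | 3 | 3.70 | 0 | 0.00
Missing word | 3 | 5.56 | 12 | 21.05 | 14 | 17.28 | 7 | 13.21
Negation | 3 | 5.56 | 1 | 1.75 | 2 | 2.47 | 2 | 3.77
Passive or active voice | 1 | 1.85 | 0 | 0.00 | 0 | 0.00 | 0 | 0.00
Plurality | 0 | 0.00 | 0 | 0.00 | 1 | 1.23 | 1 | 1.89
Pronunciation | 0 | 0.00 | 3 | 5.26 | 3 | 3.70 | 0 | 0.00
Rephrase | 0 | 0.00 | 1 | 1.75 | 11 | 13.58 | 16 | 30.19
Tense | 10 | 18.52 | 4 | 7.02 | 6 | 7.41 | 6 | 11.32
Word choice: Adjective | 2 | | 1 | | 0 | | 1 |
Word choice: Adverb | 2 | | 4 | | 4 | | 1 |
Word choice: Auxiliary verb | 2 | | 2 | | 4 | | 3 |
Word choice: Conjunction | 2 | | 1 | | 2 | | 4 |
Word choice: Determiner | 2 | | 2 | | 3 | | 0 |
Word choice: Noun | 3 | | 3 | | 5 | | 1 |
Word choice: Preposition | 1 | | 0 | | 1 | | 0 |
Word choice: Pronoun | 5 | | 11 | | 8 | | 5 |
Word choice: Verb | 9 | | 5 | | 11 | | 5 |
Word choice total | 28 | 51.85 | 29 | 50.88 | 38 | 46.91 | 20 | 37.74
Word form | 6 | 11.11 | 3 | 5.26 | 2 | 2.47 | 1 | 1.89
Total | 54 | 100.00 | 57 | 100.00 | 81 | 100.00 | 53 | 100.00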
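The percentages in table 3.6, and the ranking of correction foci that the discussion draws on, can be derived from the raw counts. The following is a minimal sketch assuming the counts as printed in the table; the names `table_3_6` and `ranked_shares` are hypothetical and are used here only for illustration.

```python
# Raw correction counts from table 3.6, in the order B1, B2, C1, NS.
# Word choice is entered as its total; sub-categories are omitted.
# The structure and names are illustrative assumptions.
LEVELS = ("B1", "B2", "C1", "NS")

table_3_6 = {
    "Agreement": (0, 3, 1, 0),
    "Contraction": (3, 1, 3, 0),
    "Missing word": (3, 12, 14, 7),
    "Negation": (3, 1, 2, 2),
    "Passive or active voice": (1, 0, 0, 0),
    "Plurality": (0, 0, 1, 1),
    "Pronunciation": (0, 3, 3, 0),
    "Rephrase": (0, 1, 11, 16),
    "Tense": (10, 4, 6, 6),
    "Word choice": (28, 29, 38, 20),
    "Word form": (6, 3, 2, 1),
}

def ranked_shares(level: str) -> list:
    """Return (focus, % of corrections) pairs for one level, largest first."""
    i = LEVELS.index(level)
    total = sum(row[i] for row in table_3_6.values())
    shares = [(focus, round(100 * row[i] / total, 2))
              for focus, row in table_3_6.items()]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)

for level in LEVELS:
    # Print the two most frequent correction foci at each level.
    print(level, ranked_shares(level)[:2])
```

Sorting by share rather than reading down the table makes the second-most frequent focus at each level explicit, which is the comparison taken up after the word choice examples.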
<$5F> Er I think erm the photographs can be can be important er souvenirs to <$=> our
</$=> to us because
Figure 3.8 B1 pronoun correction
<$29F> First of all if you have a good salary but you don’t like your job you won’t get a you
won’t enjoy it.
Figure 3.9 B2 verb correction
<$21M> + because I want to er gradually incre= improve my life
Figure 3.10 C1 verb correction
<$3F> However <$=> it </$=> they’re two very different things though
Figure 3.11 NS pronoun correction
Further distinctions in learner correction were uncovered when the second-most
frequent focuses for correction were compared. First, at B1, the
level explicitly characterised in the CEFR by its tense correction, the data
prompted two opposing conclusions. With tense correction occupying nearly a
fifth of all B1 correction (18.52%), USTC data initially supported the
supposition that tenses receive more attention at this level; tense correction
more than halved from 18.52% at B1 to 7.02% and 7.41%, respectively, at B2
and C1. However, though the proportion for this correction category admittedly
fell across the learner levels, it rose again from 7% at B2 and C1 to 11% in
the NS data. Whilst tense correction may not be an often acknowledged feature
of speech at higher proficiency levels, it does not disappear altogether, nor
does it result in a less successful use of speech (see figures 3.12 to 3.15).
Though the majority of the CEFR’s ‘slips’ were clearly exemplified via word
choice changes, speech alterations were also sometimes attributable to tense.
Furthermore, when looking at the next most common type of error in
learner data between B2 and C1, and C1 and NS level, learners either evi-
denced an ability to identify words which had been accidentally omitted in
their speech (see figures 3.16 and 3.17) or an ability to monitor the effective-
ness of the overall meaning conveyed (see figures 3.18 and 3.19):
<$33F> + because erm during the Easter holiday erm <$=> I would go mm </$=> <$=> I
ex= </$=> I would being in Spain but I can’t I couldn’t go because of er flight strike.
Figure 3.12 B1 tense correction
<$0> First I’m going to ask you some questions about yourselves. So <$10M> tell me about the
place where you live.
<$10M> Now?
<$0> Yes.
<$10M> Er now I live er I live in I’m living in Preston +
Figure 3.13 B2 tense correction
<$31F> + cos some teenagers don’t have like the sense of responsibility +
<$30F> Yeah.
<$31F> + to drive safety so erm <$=> they’ll </$=> they’re thinking about mm hanging out
with their friends so they’re not like with the right sense of safety
Figure 3.14 C1 tense correction
<$0> Okay. And final question er <$6F> which job would you least like to do?
<$6F> Erm. I guess one that’s sort of <$=> didn’t doesn’t </$=> doesn’t really go anywhere
or I don’t feel like I’m like utilising my skills
Figure 3.15 NS tense correction
<$37M> I enjoy yeah it’s nice erm I can make new fr= mm new friend I er I mean I erm this
studying er in England in Pr= in Preston I enjoy. I make new friend I spend all my time <$=> I
don’t have time to </$=> sometime I don’t have time to do it.
Figure 3.16 B2 missing word correction
<$16F> so think when I travel to the American <$=> I can </$=> I also can speak with the
native people and I can speak with British people.
Figure 3.17 C1 missing word correction
<$20F> Er I want to have my own <$G2> restaurant in the future because I like to eat food and
<$=> I hope I can eat our ah ah wrong word </$=> <$=> so I want my </$=> I want to open
a restaurant in my country.
Figure 3.18 C1 utterance reformulation

<$2M> Er I’d agree er like erm I always thought if you know something you can teach it you can
explain it to someone else better <$=> so if you’re just </$=> it’s good enough explaining it to
yourself I suppose.
Figure 3.19 NS utterance reformulation
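In the transcript excerpts in this section, the stretch of speech that a speaker goes on to replace is usually enclosed in `<$=>` and `</$=>`. Assuming that this is what the tag marks (an inference from the excerpts, not a statement of the USTC transcription conventions), corrected stretches could be located automatically along the following lines. The function names are hypothetical.

```python
import re

# Assumed convention, inferred from the excerpts in this section rather than
# from a published transcription guide: <$=> ... </$=> encloses a stretch of
# speech that the speaker abandons and then reformulates.
ABANDONED = re.compile(r"<\$=>\s*(.*?)\s*</\$=>", re.DOTALL)

def abandoned_stretches(utterance: str) -> list:
    """Return the text of every <$=> ... </$=> span in a transcribed utterance."""
    return ABANDONED.findall(utterance)

def repaired_text(utterance: str) -> str:
    """Remove abandoned stretches and collapse whitespace."""
    without = ABANDONED.sub(" ", utterance)
    return re.sub(r"\s+", " ", without).strip()

# Utterance from figure 3.8 (B1 pronoun correction).
example = ("<$5F> Er I think erm the photographs can be can be important "
           "er souvenirs to <$=> our </$=> to us because")

print(abandoned_stretches(example))
print(repaired_text(example))
```

A search of this kind only retrieves candidate corrections; deciding whether each one concerns word choice, tense, a missing word or a rephrase would still require manual coding. Corrections that are not tagged, such as the one in figure 3.9, would not be retrieved at all.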
Table 3.7 CEFR qualitative descriptions of learner accuracy (Council of Europe 2001: 28)
Level | Accuracy
B1 | Uses reasonably accurately a repertoire of frequently used ‘routines’ and patterns associated with more predictable situations.
B2 | Shows a relatively high degree of grammatical control. Does not make errors which cause misunderstanding, and can correct most of his/her mistakes.
C1 | Consistently maintains a high degree of grammatical accuracy; errors are rare, difficult to spot and generally corrected when they do occur.
C2 | Maintains consistent grammatical control of complex language, even while attention is otherwise engaged (e.g. in forward planning, in monitoring others’ reactions).
Representing approximately a fifth of all errors at B2 (21.05%) and C1
(17.28%), and 13.21% at NS, but only 5.56% at B1, reformulating speech to
insert a missing word seemed to receive more attention at the B2 and C1
levels. However, the data likewise revealed that assessing and rephrasing
a message for accuracy of meaning began to appear more consistently at
the C1 and NS levels (B1 = 0 occurrences (0%), B2 = 1 occurrence (1.75%),
C1 = 11 occurrences (13.58%), NS = 16 occurrences (30.19%)). Such findings
are of clear significance, therefore, for what it means to be successful in
speech not only according to the CEFR corrective strategy descriptions, but
also to its qualitative descriptions of learner accuracy (see table 3.7).
At B2, for instance, speakers are said to show ‘a relatively high degree of
grammatical control’; impeding errors are not made and speakers can correct
‘most’ of their mistakes (Council of Europe 2001: 28). The fact that speakers
addressed and amended their utterances when a word was missed out,
most typically at B2, suggests that this could be characteristic of learners at
this proficiency level. However, though accuracy descriptions for C1 and C2
discuss how conspicuous errors are, it is our opinion that as learners
develop in proficiency, correction is not always demonstrative of ‘forward
planning’; it instead takes place quite to the contrary of CEFR descriptions
(Council of Europe 2001: 28). Often, when a partial or complete rephrase
of an utterance was given, especially in the NS data, the impression was not
that speakers had carefully ‘planned ahead’ in terms of what they were going
to say or how they were going to say it. Instead, it seemed that monitoring of
speech only occurred once part or all of an utterance had been verbalised (see
figures 3.20 and 3.21). In a sense, the freeing up of online attention resources
which aid more fluent speech at higher levels (see proceduralisation in
Taylor 2011) might have been responsible not for quick, minor changes during
such verbalisation but for the aforementioned backtracking once the message
had been conveyed.
To summarise this section, a closer inspection of production strategy
usage for correction has been used to shed light on the CEFR’s B1, B2 and
C1 statements and how learners can be successful in their speech. Not only
were occurrences of correction seen to rise across the proficiency levels, in
contrast to studies which suggested that strategy use should decrease, but slips
and backtracking were also illustrated by attention to word choice, tense usage,
missing words and occasional rephrasing. Table 3.8 summarises the typical
features reformulated at each level, but the data do not associate a higher
prevalence of correction with lower achievement or success. For instance, the
increases in raw frequencies and mean corrections per exam suggest that the
ability to self-correct without an interlocutor’s intervention should be seen
as a significant indicator of success. Though some
may assume that increasing correction can disrupt the flow of speech, and
therefore perceptions of success, it is a key production strategy in all B1, B2,
C1 and NS speech.
3.5.4 Interaction strategies: inviting others into the discussion and seeking clarification
Table 3.5 earlier showed that in addition to initiating, maintaining and clos-
ing discussion, interaction strategies were called upon to invite others into
the discussion and seek clarification. At some stage, across all three levels,
such strategies were integral to either speech throughout the exam or speech
during the learner-to-learner interaction. It is therefore worthwhile, after
<$3M> I I think that’s a bad thing because <$=> it doesn’t promotes like </$=> even if you
want like your country to erm develop faster you need a good like erm tourism needs to be good
so that erm other countries might want to do like they’d like when they come and they see what
they like they want to do business with your country
Figure 3.20 C1 post-utterance correction
<$4F> + well that’s a good point <$O18> television </$O18> or magazine I prefer well I
don’t <$O19> I don’t really </$O19> no <$O20> not magazine </$O20>
Figure 3.21 NS post-utterance correction

Table 3.8 Summary of most frequent error types in USTC
Level | Most frequent correction type | Second-most frequent correction type
B1 | Word choice (verbs and pronouns) | Tenses
B2 | Word choice (pronouns and verbs) | Missing words
C1 | Word choice (verbs and pronouns) | Missing words and rephrases
NS | Word choice (verbs and pronouns) | Rephrases
investigating one example of Dörnyei and Scott’s (1995) direct strategies,
to explore i) how interlocutors were drawn into the communication, and ii)
how clarification, an example of an interactional strategy, was realised in
learner speech.
Initially, lexical chunk data in chapter 2 (see section 2.8) demonstrated
how USTC learners sought to engage listeners more often at the B2 and
C1 levels. Observation of the verb think suggested that B1 learners did not
fully participate in communication which was truly reciprocal; they appeared
more concerned with giving their own opinions and thoughts than asking for
those of others. It was only at B2 and C1 that interrogative forms such as
what do you, do you think and what do you think began to emerge. Though such
a supposition would coincide with CEFR descriptors of interactive ability, in
that engaging another person in conversation is only introduced at B2 (see
Council of Europe 2001: 28), using only corpus analysis tools here does B1
learners something of a disservice. With 21% of learner-to-learner interaction
strategies at B1 aimed at conversing with others, there is evidence to argue
that it is an emerging skill at B1, albeit one that may not be as established
as at B2 and C1. Though admittedly the number of questions
posed at each level did increase (B1 = 46; B2 = 60; C1 = 75), qualitative
analysis was able to reveal a vast range of questions used at all three levels.
Discussion here will focus on the three most frequent question types at
each level (see tables 3.9-3.11). There is some overlap here with the discourse
management discussion in chapter 4 (section 4.5) about opening, closing and
turn-taking; however, the aim is to demonstrate what learners expect of their
discussion partners and the less obvious strategies which can be utilised.
These three question categories comprised 44%, 72% and 49% of all questions
asked, respectively, at the B1, B2 and C1 levels. Not only does this analysis
demonstrate that such questions are essential to successful interaction in
learner-to-learner speech, but it also indicates that interacting with others
in a jointly constructed spoken dialogue is a significant element of speech at
all three levels, not just at B2 or C1. The nature of the three questions also
reveals that learners at higher levels may ask more of their interlocutors. For
instance, at B1, the top two questions, though admittedly capable of provoking
a longer response, essentially invited yes/no or one-word answers;
only the last category, ‘How/What about you?’, required a lengthier answer.
Table 3.9 Top three questions used at B1 to invite others into the conversation
Question type | Freq. | % of all question types | Example language
Do/Did you . . .? | 10 | 21.74 | Do you think meeting er new people is er good or erm er interesting when you are when you meet er a good new people?; Do you agree with me?
Which ones/of these reasons/choice/of the following . . .? | 5 | 10.87 | Erm erm which choice you want to start a new sports er easy to learn?; Which of the following would make you watch a film?
How/What about . . .? | 5 | 10.87 | Okay how about you?; What about you?
Table 3.10 Top three questions used at B2 to invite others into the conversation
Question type | Freq. | % of all question types | Example language
Do/Did you . . .? | 16 | 26.67 | Did you know?; Do you agree with me?; Er mm er do you think er do you think listening to music or playing musical instruments can help you to reduce stress?
What do you think about/of . . .? | 15 | 25.00 | So what do you think of lifestyle? Healthy lifestyle?; Yeah so what do you think about the er young people they feel bored when they stay at home?; Okay what do you think about the first topic?
How/what about . . .? | 12 | 20.00 | And er mm how about this one?; So how about the food?; How about you?
Table 3.11 Top three questions used at C1 to invite others into the conversation
Question type | Freq. | % of all question types | Example language
What do you think/reckon about/of . . .? | 19 | 25.33 | And what do you think about a reasonable price?; Er erm how what do you reckon about this one?
How/What about . . .? | 11 | 14.67 | How about the second point?; What about finding information on the internet?
Do/Did you . . .? | 7 | 9.33 | Like do you spend time going through it see what other erm other people stayed in the hotel thinks about the hotel itself or?; Do you think if we visit the places where it happened er it will still be the same?
The yes/no, one-word question structures constituted a third of all questions
asked at B1. At B2 and C1, however, yes/no question forms reduced
to 27% and 9% respectively and were replaced by more open question structures on
45% and 40% of occasions. Learners at these two levels were not simply
concerned with eliciting a quick response to their questions; instead, they
seemed to want more detail as to the answers that were given, perhaps in an
attempt to build on them more as the interaction developed.
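The proportions quoted in this paragraph can be recomputed from the frequencies in tables 3.9-3.11 and the question totals reported earlier (B1 = 46; B2 = 60; C1 = 75). The short Python sketch below is illustrative only and is not part of the original analysis: the labels 'closed' and 'open', the grouping of question types under them, and the function names are assumptions made for the purpose of the example.

```python
# Question totals per level, as reported in the text.
TOTALS = {"B1": 46, "B2": 60, "C1": 75}

# Frequencies copied from tables 3.9-3.11.
# Assumption for illustration: "closed" = yes/no or one-word answer forms,
# "open" = forms that invite a lengthier response.
TOP_THREE = {
    "B1": {"closed": {"Do/Did you": 10, "Which": 5},
           "open": {"How/What about": 5}},
    "B2": {"closed": {"Do/Did you": 16},
           "open": {"What do you think": 15, "How/What about": 12}},
    "C1": {"closed": {"Do/Did you": 7},
           "open": {"What do you think/reckon": 19, "How/What about": 11}},
}


def share(freqs, total):
    """Percentage of all questions at a level accounted for by freqs."""
    return 100 * sum(freqs.values()) / total


def summarise(level):
    """Closed and open shares for one level, rounded to two decimals."""
    total = TOTALS[level]
    return {kind: round(share(freqs, total), 2)
            for kind, freqs in TOP_THREE[level].items()}


for level in TOTALS:
    print(level, summarise(level))
```

Under this grouping, the closed forms account for roughly a third of B1 questions, while the open forms account for 45% and 40% of questions at B2 and C1.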
In terms of strategic competence, examination of inviting others into the
discussion also uncovered a hidden, indirect strategy: that of buying time
to think about a response or ‘passing the buck’ to momentarily transfer the
turn until learners were ready to speak. As can be seen in figure 3.22 with
learner $26F, some learners may have intentionally asked questions of an
interlocutor so that they were not pressured into answering before they were
ready to do so, or so that they were not exposed if they did not know how to
respond. The ability to ask questions therefore plays both an interactive and
strategic function which successful learners exploit in their real-time speech.
Posing questions formed one clear part of the information exchange in
the USTC, but the ability to handle and negotiate meaning when there was
a potential for communication to break down was another important
skill. Clarification, especially in B2 learner-to-learner speech, was found to
be the most frequent interaction strategy used, so it was necessary to explore
to whom clarification requests were addressed and for what purpose. The use of
exams meant that speech was aimed at both NS examiners and fellow
candidates; perhaps unsurprisingly, the majority of clarification requests
at all levels (B1 = 86%, B2 = 89%, C1 = 85%) were posed to the NS. It
would be easy to assume from a deficit view of CSs, therefore, that this
was a sign that learners found it difficult to understand the interlocutor.
<$E> Candidates read sheet </$E>

<$26F> Which one do you think is suitable?
<$27M> Well first of all for me I think reading books it’s erm one of the most suitable ways of
knowing history because when you read books they have full details and information about
history <$=> and for wh= </$=> and what do you feel movies change? <$E> coughs </$E>
so= some movies change erm they change some details about +
<$26F> Yeah they <$O38> leave out </$O38> points.
<$27M> <$O38> the history </$O38> yeah and then the museum also is it’s also a good place
for knowing history. They have people who describe and explain the things you so it’s a good
way of knowing history.
<$26F> Do you think if we visit the places where it happened er it will still be the same?
<$27M> No I don’t think it’s a good way of knowing the history of the place because over time
the place changes there is development so it’s not a good way.
<$26F> What about finding information on the internet?
<$27M> Erm on the internet people like us puts puts the information on the internet so we’re not
really sure if it’s <$O39> are correct yeah </$O39>
Figure 3.22 C1 interaction questions to relinquish the turn

However, since ‘understanding a native speaker’ represents an ability to be

evidenced across all six CEFR levels (Council of Europe 2001: 75), it instead
could be a sign that USTC learners employed their strategic competence so
as to manage and exploit their linguistic resources fully. Further support for
this perspective was gathered when the purpose of the clarification request
was identified. Table 3.12 displays the three most frequent clarification
requests at all levels.
As can be seen, two-thirds of B1 clarification requests were attributed
to the meaning of vocabulary or task instructions. Task instructions were
undoubtedly complex at this level so this is why many students checked
what they needed to do (see figure 3.23). At B2, however, focus shifted to
clarifying examiner questions where multiple answers could have been given
on 37% of occasions. For instance, questions about learners’ circumstances
could have been answered with information about the learners’ L2 country
or their home countries (see figure 3.24).
Table 3.12 The three most frequent clarification requests at B1, B2 and C1
Level | Rank | Clarification purpose | Freq. | % | % of the levels’ clarification requests
B1 | 1st | Meaning of vocabulary/question | 19 | 38.00 | 82.00
B1 | 2nd | Task/Instructions | 14 | 28.00 |
B1 | =3rd | Vague examiner question | 4 | 8.00 |
B1 | =3rd | Fellow student’s question | 4 | 8.00 |
B2 | 1st | Vague examiner question | 14 | 36.84 | 78.95
B2 | 2nd | Meaning of vocabulary/question | 10 | 26.32 |
B2 | 3rd | Task/Instructions | 6 | 15.79 |
C1 | 1st | Meaning of vocabulary/question | 30 | 40.00 | 76.00
C1 | 2nd | Repetition | 16 | 21.33 |
C1 | 3rd | Vague examiner question | 11 | 14.67 |
<$3M> Er excuse me we’re same topic+

<$0> Yes.
<$3M> Okay but I think it’s different.
<$0> No your prompts are different <$O4> the topic </$O4> is the same
Figure 3.23 Task instruction clarification request at B1
<$0> <$26M> er what do you like about the place where you live?
<$26M> Mm here or in my country?
<$0> Any.
Figure 3.24 Vague question clarification request at B2

Interestingly, while checking the meaning of vocabulary and making inter-

locutor questions more explicit were again very frequent at C1 level (55% of
all clarification requests), a new tactic emerged for eliciting clarification: the
use of repetition (21%). The use of repetition by learners often prompted
the interlocutor to repeat the same word as if the learner had not heard it
(see figure 3.25). However, it is our belief that this strategy was, in fact, a
stalling technique, similar to table 3.1’s indirect strategy for processing time,
which afforded learners valuable, additional time in which to formulate their
responses. The fact that clarification attempts relating to vocabulary were so
high should also not be assumed to be a sign of lower success. Since C1 learn-
ers are characterised by their ability to discuss ‘complex subjects’ (Council of
Europe 2001: 27), it is understandable that they will from time to time have
to clarify unknown lexis. As figure 3.25 shows, however, this is sometimes
approached in a less explicit way via the combination of repetition to gain
time and the use of paraphrase or synonyms, rather than overt questions
relating to meaning that are seen at other levels (see figures 3.26 and 3.27).
Finally, to sum up this interaction strategy, it is necessary to outline how
clarification requests were realised in the speech of learners. Table 3.13 has
grouped responses according to the categories that emerged in the B1, B2
and C1 data. B1 learners, though attempting to formulate accurate ques-
tions, had the majority of their requests in the partially formed question
category. This was mostly due to the omission of auxiliary verbs or inaccu-
rate word order. B2 and C1 learners, however, showed small, but increasing
<$0> So <$34F> tell me about your favourite restaurant.

<$34F> Favourite restaurant?
<$0> Yes.
<$34F> It’s tricky one +
Figure 3.25 Repetition clarification request at C1
<$0> Mhm okay and how important do you think cultural awareness is when you’re travelling?
<$10F> Er so you mean cultural awareness?
<$0> Mhm.
<$10F> Mm it means er the people in the other countries tell their feelings?
Figure 3.26 Use of paraphrase and synonyms to clarify unknown vocabulary at C1
<$5F> Sorry. What’s the word’s meaning?
Figure 3.27 Use of overt questions to clarify unknown vocabulary at B1

Table 3.13 Language used for clarification requests
Language used | B1 | % | B2 | % | C1 | % | Example
Statement | 6 | 12.00 | 0 | 0.00 | 0 | 0.00 | I don’t know what’s the meaning here. (B1, Exam 12)
Partially formed question | 15 | 36.00 | 7 | 18.43 | 13 | 17.34 | Which one we’ll talk about the one or we’ll talk about all three? (B2, Exam 15)
Fully formed question | 6 | 12.00 | 7 | 18.43 | 17 | 22.67 | Er I think the plane is the er sorry can you explain the topic? (C1, Exam 8)
Repetition of words in question | 5 | 10.00 | 3 | 7.89 | 16 | 21.33 | Successful art? (B2, Exam 5)
Sorry/Pardon/Excuse me | 3 | 6.00 | 2 | 5.26 | 11 | 14.67 | Er excuse me? (B1, Exam 9)
Why? How? Here? Etc. | 3 | 6.00 | 1 | 2.63 | 1 | 1.33 | Er why? (B1, Exam 11)
So ____ | 3 | 6.00 | 3 | 7.89 | 4 | 5.33 | So you are not the type of person who travels a lot? (C1, Exam 1)
Other | 6 | 12.00 | 15 | 39.47 | 13 | 17.33 | Er about er about the games she says. (B1, exam 15)
Total | 50 | 100.00 | 38 | 100.00 | 75 | 100.00 |
numbers of questions in the fully formed question category in which no

errors were present, perhaps in part due to the mastery of questions such as
could you repeat that? As was alluded to earlier when discussing the use of
repetition in clarification requests, C1 learners displayed the use of sorry,
pardon, or excuse me, almost three times as often as B2 and B1 learners.
3.6 Conclusion
The data presented in this chapter have allowed us to reach several conclu-
sions regarding the role of strategic competence in successful learner speech.
Using CEFR spoken strategy statements as a basis for analysis, we have docu-
mented how the use of communication strategies in real-time learner speech
facilitates spoken production by allowing learners to manage discourse and
exploit their linguistic resources in response to the challenges they face.
Though traditionally communication strategies have been viewed in a negative
light as indicators of a lack of proficiency, this chapter has demonstrated
that they in fact can be a sign of improving proficiency as the use of CEFR
strategies was found to be highest at C1 level despite a reduced number of
descriptors. The occurrence of different types of strategies also unveiled typi-
cal characteristics that are expected to be seen at each level. In terms of inter-
action, B1 learners were able to maintain discourse and convey their desired
meanings, B2 learners demonstrated an ability to initiate, maintain and close
communication as well as a desire to invite others into the discussion, and

C1 learners prefaced remarks to take and maintain their turn whilst gaining
vital thinking time when needed. In terms of production, though correction is
associated more with B2 and C1 learners, USTC data revealed that B1 learn-
ers also monitor their speech independently of an interlocutor. Such capabili-
ties across the three levels were essential in speech both throughout the exam
and during parts in which only learner-to-learner discussion took place.
Closer inspection of the most common strategies at each level, and comparison
against NS speech, provided further insight into what makes learners
successful. In terms of correction, which could once again be viewed as a
sign of lower proficiency, analysis of learner speech showed an increase
in the amount of reformulation of individual words or utterances; the inclusion
of NS exams also showed that correction was a common, and in fact
more prevalent, feature in their speech where mean correction scores were
concerned. Assumptions that correction, and other CSs, are attributable
only to learners were therefore once again challenged. The focus
of correction, however, was found to vary across the levels. Whereas all groups
displayed their highest amount of correction in changes to word forms, B1
learners were found to amend tense use, B2 learners monitored speech for
missing words and C1 learners began to reformulate entire utterances, much
like their NS counterparts did. However, it is important to note that these
trends cannot be taken to be absolute. Learners at each level produced a
variety of corrected errors, so it cannot be claimed that one ‘type’ of correction
disappears at a particular proficiency level. With regard to success,
therefore, we argue that an alternative view of correction needs to be
adopted: a growing frequency and change in nature can help define, rather
than constrain, continuing success at different proficiency levels for
learners and native speakers alike.
This chapter has also shed light on some of the interactive differences B1,
B2 and C1 learners exhibit. Whilst chunk data in chapter 2 suggested that B1
learners did not often attempt to involve others in communication, analysis
of CEFR strategies instead identified that learner-to-learner discussion at
this level was dominated by attempts to invite others into the conversation.
Admittedly, such attempts did increase as proficiency grew, but it would
be rather hasty to overlook questions posed to interlocutors as a
sign of success at B1. However, the data did reveal that the detail involved
in the responses sought can be an indicator of developing proficiency. As
learner proficiency grew, so too did the use of questions which could not be
answered via a one-word response. It seemed that to be truly interactive and
successful at higher levels, detailed answers were called for, and the questions
seeking them could sometimes, rather deceptively, also be exploited as a time-gaining strategy.
Finally, analysis of clarification requests established that successful USTC
learners did ask more questions of NSs than their fellow learners. Once
again, there is potential here for the NS model to be overbearing in that the
need to enhance understanding could be seen as a potential ‘weakness’ of

learners. However, the fact they are able to request clarification successfully
demonstrates that learners can exploit their language, when needed, and
employ relevant strategies for adapting to the real-time spoken encounters
in which they participate. Though the nature of the requests was found to
evolve across the levels, from task instructions at B1, to vague examiner questions
at B2 and the meaning of vocabulary at C1, attempts to clarify remarks
should not be dismissed as an automatic sign of insufficient comprehension.
As was found at C1 level, clarification can also, rather cleverly, act as an
occasional stalling device to provide learners with valuable thinking time.
References
Bachman, L. and Palmer, A. 1996. Language testing in practice: Designing and devel-
oping useful language tests. 3rd ed. New York: Oxford University Press.
Bialystok, E. 1983. Some factors in the selection and implementation of communica-
tion strategies. In: C. Faerch and G. Kasper, eds. 1983. Strategies in interlanguage
communication. London: Longman. 79–99.
Bialystok, E. 1990. Communication strategies. Oxford: Blackwell.
Bialystok, E. and Fröhlich, M. 1980. Oral communication strategies for lexical dif-
ficulties. Interlanguage Studies Bulletin, 5(1), 3–30.
pedagogy. In: J.C. Richards and R.W. Schmidt, eds. 1983. Language and Com-
munication. New York: Longman, 2–27.
Chang, S. and Liu, Y. 2016. From problem-orientedness to goal-orientedness:
Re-conceptualizing communication strategies as forms of intra-mental and inter-
mental mediation. System, 61, 43–54.
Chen, S. 1990. A study of communication strategies in Interlanguage production by
Chinese EFL Learners. Language Learning, 40(2), 155–187.
Chomsky, N. 1965. Aspects of the theory of syntax. Cambridge: The MIT Press.
Cohen, A. 2014. Strategies in learning and using a second language. 2nd ed. Abing-
don: Routledge.
Cohen, L., Manion, L. and Morrison, K. 2011. Research methods in education. 7th ed.
London: Routledge.
Corder, S. 1981. Error analysis and interlanguage. Oxford: Oxford University Press.
Council of Europe 2001. Common European framework of reference for languages.
Dörnyei, Z. 1995. On the teachability of communication strategies. TESOL Quar-
terly, 29(1), 55–85.
Dörnyei, Z. 2007. Research methods in applied linguistics. 1st ed. Oxford: Oxford
University Press.
Dörnyei, Z. and Scott, M.L. 1995. Communication strategies: An empirical analysis
with retrospection. In: J.S. Turley and K. Lusby, eds. Selected papers from the pro-
ceedings of the 21st annual symposium of the Deseret Language and Linguistics
Society. Provo, UT: Brigham Young University, 155–168.
Dörnyei, Z. and Scott, M.L. 1997. Communication strategies in a second language:

Definitions and taxonomies. Language Learning, 47(1), 173–210.
Dörnyei, Z. and Thurrell, S. 1991. Strategic competence and how to teach it. ELT
Journal, 45(1), 16–23.
Ellis, R. 1985. Understanding second language acquisition. Oxford: Oxford Univer-
sity Press.
Ellis, N., Simpson-Vlach, R. and Maynard, C. 2008. Formulaic language in native
and second language speakers: Psycholinguistics, corpus linguistics, and TESOL.
TESOL Quarterly, 42(3), 375–396.
Hughes, R. 2011. Teaching and researching: Speaking. 2nd ed. London: Pearson
Education.
Sociolinguistics, Harmondsworth: Penguin, 269–293.
Levelt, W. 1989. Speaking: From intention to articulation. Cambridge, MA: The MIT
Press.
Miles, M., Huberman, A. and Saldaña, J. 2014. Qualitative data analysis: A methods
sourcebook. 1st ed. Los Angeles: Sage.
Paribakht, T. 1985. Strategic competence and language proficiency. Applied Linguis-
tics, 6(2), 132–146.
Poulisse, W. 1997. Compensatory strategies and the principle of clarity and economy.
In: G. Kasper and E. Kellerman, eds. Communication strategies: Psycholinguistic and
sociolinguistic perspectives. Abingdon: Pearson Education Limited, 49–64.
Prebianca, G. 2009. Communication strategies and proficiency levels in L2 speech pro-
duction: A systematic relationship. Revista de Estudos da Linguagem, 17(1), 7–50.
Rossiter, M., Derwing, T., Manimtim, L. and Thomson, R. 2010. Oral fluency: The
neglected component in the communicative language classroom. Canadian Mod-
ern Language Review, 66(4), 583–606.
Selinker, L. 1972. Interlanguage. International Review of Applied Linguistics, 10,
209–231.
Tarone, E. 1981. Some thoughts on the notion of communication strategy. TESOL
Quarterly, 15(3), 285–295.
Tarone, E. and Yule, G. 1989. Focus on the language learner: Approaches to identify-
ing and meeting the needs of second language learners. 2nd ed. Oxford: Oxford
University Press.
Taylor, L. 2011. Examining speaking: Research and practice in assessing Second lan-
guage speaking (studies in language testing). Cambridge: Cambridge University Press.
Terrell, T. 1977. A natural approach to second language acquisition and learning.
The Modern Language Journal, 61(7), 325–337.
Varadi, T. 1992. Reviews. Applied Linguistics, 13(4), 434–440.
Chapter 4
Discourse competence
4.1 Introduction
This chapter will define discourse competence broadly as the ability of learn-
ers to produce and manage spoken texts with cohesion and coherence. This
is similar to the definition given in the CEFR (2001: 123), which suggests
it can be defined as the way in which messages are ‘organised, structured
and arranged’. This broad definition will be employed to analyse how suc-
cessful speakers use specific features of spoken language to create successful
spoken texts across different levels. After we define terms and review some
related previous studies in sections 4.2 and 4.3, the data will be explored in
two main ways. Firstly, in section 4.4 we will adapt Celce-Murcia, Dörnyei
and Thurrell’s (1995) more detailed definition of discourse competence and
explore their main categories in terms of the language used to achieve key
aspects of discourse competence at different levels. Secondly, in section 4.5,
we will focus on how speakers use discourse markers (DMs) to help them
to achieve discourse competence. As we do so, we will attempt to show how
learners use different aspects of language to achieve the CEFR (2001: 124/125)
‘Can do’ statements related to discourse competence at different levels, but
we also note that some of these statements are so vague that they are difficult
to measure. As in previous chapters, some comparison will be made to our
native-speaker test data where we feel this illuminates the analysis. Overall,
we wish to show that successful speech, even at lower levels, tends to pay
attention to language use beyond simply constructing sentences and towards
linking ideas within and across turns.
4.2 Definitions of discourse competence

Most researchers agree that we can define discourse broadly as any written
or spoken text which is longer than a written sentence or a single spoken
utterance (see Carter and McCarthy 1993, Cook 1989, McCarthy 1991,
Thornbury 2005 for examples). There is slightly less agreement as regards a
definition of discourse competence. Early research from Canale and Swain
(1980), as discussed in chapter 1, suggests that discourse competence is the
ability to link ideas together to form cohesive and coherent texts of differ-
ent types. In a written form, this may be the ability to write, as an example,
a postcard which is understood by its recipient, while in a spoken form it
could be the ability to request a money transfer from your bank on the
telephone. The CEFR (Council of Europe 2001: 123) defines discourse com-
petence as ‘the ability of a user/learner to arrange sentences in sequence
so as to produce coherent stretches of language’. While both definitions
seem reasonable (if somewhat vague), it is of course inadequate to describe
spoken language in terms of ‘sentences’, and cohesion and coherence need
a more detailed description because in conversation they are achieved both
within a speaker’s turn and across the turns of different participants. For
these reasons, and for the purposes of this chapter, we will use the more
detailed definition of discourse competence given by Celce-Murcia, Dörnyei
and Thurrell (1995), alongside the broad definition given at the start of this
chapter. Celce-Murcia, Dörnyei and Thurrell suggest that this competence
can be defined as follows: ‘Discourse competence concerns the selection,
sequencing, and arrangement of words, structures, sentences and utterances
to achieve a unified spoken or written text’ (Celce-Murcia, Dörnyei and
Thurrell 1995: 15). They go on to suggest that this competence can be further
defined and measured by examining the following elements:
1 Cohesion – which will be achieved with aspects such as substitution,

reference, ellipsis and lexical chains.
2 Deixis – which will be achieved via use of pronouns to indicate spatial
(e.g. ‘this’, ‘that’), temporal (e.g. ‘now’, ‘then’) and textual (e.g. ‘the fol-
lowing chart’; ‘the example above’) relations.
3 Coherence – which will be achieved by organising information so that
the message makes sense and is easy to follow for the listener or reader.
4 Genre/generic structure – which will be achieved by following the expected
organisation of genres such as service encounters or research reports.
5 Conversational structure – which will be achieved by aspects such as
how to perform openings and reopening, topic establishment and change
and how to collaborate and backchannel.
(Celce-Murcia, Dörnyei and Thurrell 1995: 14)
It is these aspects which will inform how discourse competence is analysed

in this chapter. The nature of the USTC data does not allow us to explore
the ways in which successful speakers at different levels follow the expected
organisation of different genres, but we will attempt to address other key
aspects listed above. In order to make a coherent and cohesive spoken text it
is clear that a successful speaker will need to perform the skills above in com-
bination with each other. Some of these will include aspects such as the need
to be able to respond to what someone else has said in a way that continues
the conversation, to join his or her ideas together across and within turns and
to close down or open new topics and turns. These skills will of course be
achieved in a number of ways, at what Celce-Murcia, Dörnyei and Thurrell
(1995) suggest is a macro level and a micro level. We are taking the macro level
to mean the larger more holistic view of spoken discourse as often exemplified
in the CEFR ‘can do’ statements and the micro level to mean the ‘smaller’ lan-
guage features which help speakers to realise coherent and cohesive discourse.
The language features we will explore are items set out in section 4.4 and
include items such as it for anaphoric reference. We also explore spoken dis-
course markers, and it is therefore necessary to define these in the next section.
4.2.1 Definitions of spoken discourse markers

Defining a DM is a difficult task, something Jucker and Ziv (1998: 1)
acknowledge when they suggest that ‘there is no generally agreed upon
definition of the term ‘discourse marker’ ’. Instead, the literature reveals a
multiplicity of definitions and terms. Amongst these are ‘sentence connective’
(Halliday and Hasan 1976), ‘discourse marker’ (Schiffrin 1987, Jucker
and Ziv 1998a), ‘discourse operator’ (Redeker 1990), ‘pragmatic marker’
(Fraser 1996), and ‘discourse particle’ (Aijmer 2002). The variety of terms
is perhaps a result of the difficulty in providing a definition for a part of
speech which can have multiple functions and also operate as part of several
word classes, sometimes as a DM and sometimes not. We need therefore to
acknowledge that researchers use different terms and a DM is something of
a ‘fuzzy concept’ (Jucker and Ziv 1998: 2). Having acknowledged this, the
term ‘discourse marker’ has been chosen for the purposes of this book as it
seems to be the term most widely understood and used.
We will use the largely functional definitions of Aijmer (2002) and Fung
and Carter (2007) and suggest that in order for a lexical item or phrase to
be a DM, there are a number of characteristics it will display, and the more
characteristics it seems to display, the more ‘prototypical’ (Jucker and Ziv
1998: 2) it is as a DM. These characteristics have been summarised by Jones
(2011) as follows:
1 DMs are lexical items or phrases (Redeker 1990, Carter and McCarthy
2006), such as ‘right’, ‘I mean’, ‘you know’, ‘I think’.
2 DMs are optional – the absence of a DM does not affect the semantics
or grammar of an utterance. However, the absence will make compre-
hension at least more difficult (Aijmer 2002, Eslami and Eslami-Rasekh
2007).
3 DMs are multifunctional – the same DM can have a variety of functions,
each dependent on context. Fung and Carter (2007) give the example of
‘so’, which can, for instance, both summarise and launch a topic.
4 DMs are not drawn from one grammatical class and are not a closed
grammatical class. Aijmer (2002), Carter and McCarthy (2006) and
Fung and Carter (2007) give examples of DMs drawn from a wide
variety of grammatical classes, such as prepositional phrases (‘by the
way’), response tokens (‘right’) and interjections (‘oh’).
5 DMs have a procedural but not propositional meaning. A DM may
possess a propositional meaning when used as part of another class. An
example of this is the temporal use of ‘now’. The meaning of a DM can
be defined from the broader context in which it operates.
6 DMs function at a referential, interpersonal, structural and cognitive
level (Aijmer 2002; Fung and Carter 2007). They act as signposts for
speakers and listeners as they orientate themselves to the ongoing discourse
(Schiffrin 1987; Aijmer 2002) by, for instance, signalling that listeners
need time to think or that they wish to show they are listening.
7 DMs are often (but not always) sentence or turn initial (Aijmer 2002;
Fung and Carter 2007). This position occurs often as it fulfils a number
of common functions, such as launching topics (Fung and Carter 2007).
8 DMs ‘should be prosodically independent and be largely separate from
the utterances they introduce’ (Fung and Carter 2007: 413). This will
generally be indicated by the DM occupying a separate tone unit and
(often) being followed by a pause.
If we apply this definition to the following (invented) examples with the

word ‘right’, it is possible to illustrate the above functions more clearly.
‘Right, shall we start the lesson?’ (DM usage: fulfilling categories 1, 2, 3,
5, 6 (structural), 7 and 8).
‘Turn right at the next corner.’ (Non-DM usage: fulfilling category 1 only
and having a clear propositional meaning).
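For readers working with transcribed corpus data, the contrast between these two examples can be roughly approximated in code. The Python sketch below is a heuristic under stated assumptions and is not a procedure used in this book: it treats punctuation in a transcript as a stand-in for prosodic independence (characteristic 8) and checks only utterance-initial position (characteristic 7). The function name `looks_like_dm` and the candidate list are hypothetical, and because DMs are multifunctional any hits would still need to be checked manually in context.

```python
import re

# Hypothetical candidate list, taken from the examples given under
# characteristic 1.
CANDIDATES = ("right", "i mean", "you know", "i think")


def looks_like_dm(utterance, candidates=CANDIDATES):
    """Return the candidate that opens the utterance and is set off by
    punctuation, or None if there is no such candidate.

    Punctuation is used as a proxy for a pause or tone-unit boundary,
    which only approximates prosodic independence.
    """
    text = utterance.strip().lower()
    for item in candidates:
        # Candidate at the very start, followed directly by , . ? or !
        if re.match(rf"{re.escape(item)}\s*[,.?!]", text):
            return item
    return None


print(looks_like_dm("Right, shall we start the lesson?"))
print(looks_like_dm("Turn right at the next corner."))
```

The first call flags ‘right’ as a possible DM, whereas the second returns nothing because ‘right’ is neither utterance initial nor set apart from the rest of the utterance.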

Several studies have explored discourse competence across CEFR levels but
few have done so in relation to spoken language. Recent studies in this area
include Chen and Baker (2016), who examined lexical bundles in written
essays from learners at B1-C1 levels. They found that as the learner levels
increased, the bundles used tended to become more similar to those in academic
discourse and less like those used in conversational discourse. Also,
although there were some bundles shared across the levels (such as ‘on the
other hand’), the accuracy of usage increased as the levels progressed. DMs
have also been explored in relation to discourse competence across CEFR
levels but thus far, this seems to have been largely undertaken in relation
to written texts. Waller (2015), for example, explored the use of metadis-
course markers used in written exams at B2 and C1 levels, as one aspect of
discourse competence. Metadiscourse markers are defined as items which
either explicitly relate to textual organisation (for example, ‘in conclusion’)

or the stance of the writer (for example, ‘personally’). His findings show that
there were no significant differences in the number of DMs used at each level
but that the markers used for particular functions did differ in important
ways across the levels.
One recent study did examine discourse competence by exploring spoken
data. Iwashita and Vasquez (2015) analysed 58 samples of IELTS speaking
task two at levels 5, 6, and 7, which are broadly equivalent to CEFR lev-
els B1-C1. Speaking task two is a long turn of approximately two minutes
produced by a single speaker on an everyday topic, which candidates have a
short time to prepare for and during which the examiner does not interrupt
them. Looking at this data via quantitative and qualitative analysis, Iwashita
and Vasquez (2015) explored aspects of discourse competence such as the
use of cohesive devices and the organisation of the discourse. Their findings
show that cohesive devices used for reference (such as ‘it’
for anaphoric reference) were more frequent and used with a greater range
and accuracy as the levels increased. The way learners structured their talk
at IELTS levels 6 and 7 also closely conformed to the expected structure of
the text. However, the authors also observed that the differences were only
statistically significant in regard to some of the aspects of language used
(such as comparative conjunctions), and there was a great deal of individual
variation across the levels. These results show that discourse competence can
be achieved at different CEFR levels without huge variation in the language
students use but with an increase in frequency, accuracy and flexibility of
use, a point Byrne (2015) makes in relation to the vocabulary use of C1 level
learners’ speech. Her findings show that C1 level learners make a great deal
of use of the first 1000 most frequent words in the British National Corpus
but use them more frequently and flexibly across a range of lexical chunks
and functions than learners at B2 level.
As mentioned in section 4.4, DMs have been extensively analysed in pre-
vious studies. Such research has offered a number of definitions of these
items and there have also been a number of studies which explore the use of
DMs by learners. These studies have tended to compare their use with that
of native speakers. Recent examples are Fung and Carter (2007), Hellerman
and Vergun (2007), Gilquin (2008), Adolphs and Carter (2013) and Tsai
and Chu (2015). All the studies found that in general, learners used DMs
with lesser frequency than native speakers and that different types of DMs
tend to be used by learners when compared to native speakers. Fung and
Carter (2007) found that native speakers used a greater range of DMs with
a wider variety of functions and that learners significantly underused DMs
with an interpersonal function such as ‘sort of’ and ‘you know’, as verified
by Adolphs and Carter (2013). Hellerman and Vergun (2007) looked at the
use of ‘well’, ‘you know’ and ‘like’ by elementary level learners in America.
They found that learners at these levels used few of these items and those
that did tended to be more acculturated. Gilquin (2008) found that French
learners of English used fewer DMs as lexical markers of hesitation (such as
‘like’) compared to native speakers of English. Tsai and Chu (2015) looked
at the use of DMs by Chinese L1 speaking teachers and learners of Chinese
as an L2, studying both inside and outside China. They found that the fre-
quency of use of DMs was linked to greater levels of fluency by individual
speakers. While such studies are illuminating, we wish in this chapter to
explore the use of DMs by successful learners to show how they contrib-
ute to discourse competence in differing ways at each level. We make some
comparison to our native-speaker data but this is not simply to illustrate that
learners use fewer DMs but rather to explore which ones they do use, how
they use them and why they do so.

The first part of the data analysis in this chapter seeks to analyse the language
used to meet the various CEFR descriptors for discourse competence from
B1-C1 levels. ‘Can do’ statements of this nature allow us to look at the data
in a holistic fashion, at the macro level. However, there is a certain element
of subjectivity when we try to interpret the data and seek evidence for some-
times vague statements. To give one example, one statement for C1 learners
is that they ‘can produce clear, smoothly flowing, well-structured speech’
(Council of Europe 2001: 125). We can see that such a descriptor requires
a subjective impression in regard to the definition of what the researcher
defines as ‘clear’ and it is certainly possible to suggest that a learner at B1 or
B2 can also be ‘clear’, albeit using different forms of language. In order to
highlight the difficulty of interpreting some of these statements, they are shown
for B1-C1 levels (Council of Europe 2001: 124/125), in table 4.1.
In a bid to militate against an overly subjective analysis, it was decided
that we would first search the corpus for specific items of language which
we felt could realise these different ‘can do’ statements, partly based on our
observation of the data as discussed in relation to high frequency language
in chapter 2. The language searched for here was also influenced by the defi-
nitions given by Celce-Murcia, Dornyei and Thurell (1995) and by studies
such as Iwashita and Vasquez (2015), as described in section 4.2. Once the
language was examined, the vagueness of the ‘flexibility’ descriptor meant
we did not feel able to interpret this notion in a meaningful way in this
analysis, though we do mention it in discussion of how different language
items function across levels.
Table 4.1 CEFR ‘can do statements’ related to discourse competence

B1
Flexibility: Can adapt his/her expression to deal with less routine, even difficult, situations. Can exploit a wide range of simple language flexibly to express much of what he/she wants.
Coherence and cohesion: Can link a series of shorter, discrete simple elements into a connected, linear sequence of points.
Thematic development: Can reasonably fluently relate a straightforward narrative or description as a linear sequence of points.
Turn-taking: Can initiate, maintain and close simple face-to-face conversation on topics that are familiar or of personal interest.

B2
Flexibility: Can adjust what he/she says and the means of expressing it to the situation and the recipient and adopt a level of formality appropriate to the circumstances. Can adjust to the changes of direction, style and emphasis normally found in conversation. Can vary formulation of what he/she wants to say.
Coherence and cohesion: Can use a limited number of cohesive devices to link his/her utterances into clear, coherent discourse, though there may be some ‘jumpiness’ in a long contribution.
Thematic development: Can develop a clear description or narrative, expanding and supporting his/her main points with relevant supporting detail and examples.
Turn-taking: Can initiate discourse, take his/her turn when appropriate and end conversation when he/she needs to, though he/she may not always do this elegantly. Can use stock phrases (e.g. ‘That’s a difficult question to answer’) to gain time and keep the turn whilst formulating what to say. Can intervene in a discussion on a familiar topic, using a suitable phrase to get the floor.

C1
Flexibility: As of B2
Coherence and cohesion: Can produce clear, smoothly flowing, well-structured speech, showing controlled use of organisational patterns, connectors and cohesive devices. Can use a variety of linking words efficiently to mark clearly the relationships between ideas.
Thematic development: Can give elaborate descriptions and narratives, integrating sub-themes, developing particular points and rounding off with an appropriate conclusion
Turn-taking: Can select a suitable phrase from a readily available range of discourse functions to preface his/her remarks appropriately in order to get the floor, or to gain time and keep the floor whilst thinking. Can intervene appropriately in discussion, exploiting appropriate language to do so. Can initiate, maintain and end discourse appropriately with effective turn-taking.

The first part of the analysis looked at the data quantitatively to measure
the overall frequency of some key aspects of language used across each level
and qualitatively to measure the functions for which this language was used.
Following this initial analysis, frequency levels were compared using
log-likelihood scores, a form of analysis further described later in this section. Such
analysis is necessarily selective but we felt that the language chosen could
reflect key aspects of discourse competence. For these reasons, we chose to
look at the following language areas in order to explore discourse competence:
1 Anaphoric referencing: We, he, she, it, this, that, they
2 Spatial deixis: This, that, these, those
3 Elaborating on and developing a topic to achieve coherence: for example,
and also, I also.
4 Opening and closing topics and turns: What do you think?, How about
you?, So yeah.
The corpus was searched for the items above and was then analysed manu-
ally to ensure that the items searched for were being used with the functions
mentioned in 1–4 above and only items which functioned in these ways were
counted. To give one example, it used as a form of cataphoric reference in
figure 4.1 sample ‘A’ would be discounted and it used as a form of anaphoric
reference in figure 4.1 sample ‘B’ would be accepted. Both the examples
below are taken from B2 level.
When analysing DMs in the second part of the chapter, a combination of
quantitative and qualitative analysis was also undertaken. We first simply
looked at the frequency of the following DMs: Er, yeah, I think, Ok, You
know, Well, I mean, Right, Anyway.
These items were chosen because they have been found to be high fre-
quency (O’Keeffe, McCarthy and Carter 2007) in spoken corpora in general
and, as noted in chapter 2, this includes our corpus to a certain degree. They
also fulfil different functions according to the analysis described by Fung
and Carter (2007). These functions can be defined as follows:
1 Interpersonal – you know (marking shared knowledge), I think (showing
attitude), Ok and yeah (showing a response).
2 Referential – anyway (digression), well (unexpected response)
3 Structural – Ok (opening topics)
4 Cognitive – I mean (reformulation), well (hesitation), er (hesitation), you
know (hesitation), I think (hesitation)
Sample A
<$6> some advertisements should er limited it’s because should er from the parents side they
should protect their young children.
Sample B
<$2F> Er maybe winter because I like snow and er I like er play snow with my friends. I think
it’s really interesting
Figure 4.1 Samples of it used for cataphoric and anaphoric reference
We are aware that not everyone will define er as a discourse marker, but
we wish to suggest that it meets many of the criteria for a DM and this is
something others have also suggested (see for example, Hoey 2004). It is,
for example, mainly turn initial and prosodically independent and carries a
procedural rather than a propositional meaning.
In order to analyse this data, we explored the USTC data at different levels
and qualitatively assessed each instance to check it was being used as a DM
as some of the items listed can have a different propositional meaning. Ok,
for instance, can also be used as an adjective. Only those uses where the
items functioned as DMs were counted. Following this initial analysis, fre-
quency levels were compared using log-likelihood scores. Oakes (1998) and
Jones and Waller (2015) describe this as a measure which allows us to check
frequency of occurrence in different corpora for statistical significance. It
does so by checking the relative frequency of an item in one corpus when
compared to another while taking into account the size of the corpus and
then showing if the difference in frequency of occurrence is significant or
not. Once frequency levels of the DMs were established, the items were ana-
lysed to examine how they were typically ‘chunked’ by speakers, via search-
ing for NGrams of between two and six words. Six was chosen as a ‘cut off’
point because this is generally recognised as the largest number of words in
any meaningful lexical chunk (O’Keeffe, McCarthy and Carter 2007), and
we felt that looking at some larger chunks may illuminate this analysis, even
though, as noted in chapter two, the number of chunks in evidence usu-
ally decreases greatly after four words. To conclude the analysis, the DMs
were examined qualitatively to analyse how different functions vary across
learner levels and to show how they help learners to achieve discourse com-
petence and the various CEFR ‘can do’ statements.
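The log-likelihood comparison described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming the two-corpus form of the statistic commonly used in corpus linguistics; it is not necessarily the exact procedure or tool used for the analysis in this chapter. The function name, the corpus sizes used in any call and the sign convention in the code are assumptions made for illustration, and the critical values are the standard chi-square thresholds for one degree of freedom.

```python
import math

# Standard chi-square critical values (1 degree of freedom), matched to the
# star notation used in the tables: * p<.05, ** p<.01, *** p<.001, **** p<.0001
CRITICAL_VALUES = [(15.13, "****"), (10.83, "***"), (6.63, "**"), (3.84, "*")]


def log_likelihood(freq1, size1, freq2, size2):
    """Compare the frequency of one item in two corpora.

    freq1, freq2: raw frequency of the item in corpus 1 and corpus 2.
    size1, size2: total number of words in each corpus (hypothetical values
    must be supplied by the user; they are not given in this chapter).
    Returns (score, sign, stars).
    """
    total = freq1 + freq2
    # Expected frequencies if the item were equally common in both corpora,
    # taking the size of each corpus into account.
    expected1 = size1 * total / (size1 + size2)
    expected2 = size2 * total / (size1 + size2)

    score = 0.0
    for observed, expected in ((freq1, expected1), (freq2, expected2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    score *= 2

    # '+' = relatively greater use in corpus 1, '-' = greater use in corpus 2
    sign = "+" if freq1 / size1 >= freq2 / size2 else "-"
    stars = next((s for value, s in CRITICAL_VALUES if score >= value), "")
    return score, sign, stars
```

A call such as log_likelihood(100, 1000, 10, 1000) compares an item occurring 100 times in one 1000-word corpus with 10 times in another of the same size. Because relative rather than raw frequencies are compared, the same function can be applied to learner and native-speaker corpora of different sizes.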
4.5 Discourse competence at B1-C1 levels

As mentioned in section 4.4, we first looked at the language students used at
each level and related this to different aspects of discourse competence. The
initial results of this are shown in table 4.2, which gives the raw frequencies
of each item fulfilling each function given.
Table 4.2 Language items used to realise discourse competence across B1-C1 levels
Column headings
CEFR: Cohesion and coherence; Thematic development; Turn-taking (opening and closing)
Celce-Murcia, Dornyei and Thurell’s (1995) categories: Cohesion (anaphoric reference only); Deixis (referring to the immediate environment); Coherence (elaborating on and developing a topic); Conversational structure (opening and closing)

B1
Cohesion: IT = 275, THIS = 84, THAT = 53, HE = 67, SHE = 38, THEY = 69, WE = 150
Deixis: THIS = 6, THAT = 0, THESE = 6, THOSE = 0
Coherence: FOR EXAMPLE = 11, AND ALSO = 14, I ALSO = 6
Conversational structure: WHAT DO YOU THINK? = 3, HOW ABOUT YOU? = 3, SO YEAH = 13

B2
Cohesion: IT = 361, THIS = 120, THAT = 180, HE = 34, THEY = 229, WE = 216
Deixis: THIS = 7, THAT = 0, THESE = 6, THOSE = 1
Coherence: FOR EXAMPLE = 28, AND ALSO = 8, I ALSO = 8
Conversational structure: WHAT DO YOU THINK? = 15, HOW ABOUT YOU? = 5

C1
Cohesion: IT = 500, THIS = 102, THAT = 166, HE = 26, THEY = 313, WE = 230
Deixis: THIS = 7, THAT = 3, THESE = 9, THOSE = 3
Coherence: FOR EXAMPLE = 17, AND ALSO = 28, I ALSO = 10
Conversational structure: WHAT DO YOU THINK? = 16, HOW ABOUT YOU? = 1

Native speakers
Cohesion: IT = 228, THIS = 3, THAT = 123, HE = 8, SHE = 3, THEY = 70, WE = 38
Deixis: THIS = 4, THAT = 0, THESE = 3, THOSE = 2
Coherence: FOR EXAMPLE = 8, AND ALSO = 2, I ALSO = 2
Conversational structure: WHAT DO YOU THINK? = 0, HOW ABOUT YOU? = 0, SO YEAH = 4

What is striking about this data is that cohesive devices (in this case
pronouns) in general increase across levels. ‘It’ is the most frequent pronoun
with this function and its frequency increases so that at C1, learners use
the item nearly twice as often as they do at B1 level. Examining the data
showed us that at B1 level, successful learners tended to focus more on
referring back to previous ideas within their own turn and from B2 levels
this changes so that learners refer back to ideas within the speakers’ own
turns but also across previous turns in the conversation. The extracts in
figure 4.2 show some examples of it used as a cohesive device in the ways
described.
B1
<$0> Thank you. <$1M>. What kind of music do you like to listen to?
<$1M> Er music? Er <$=> s= </$=> er sometimes I like erm er listen some pop music
because er I’m very interested in it
C1
<$2M> I personally think the same about the location the hotel is really important.
<$1F> Yeah.
<$2M> I said if it’s near to the attraction point then that’s good but if it’s far away then you
know it will be costly to travel up.
<$1F> Yeah spend more.
Figure 4.2 Examples of it used as a cohesive device
In contrast to this, there is a notable increase as the levels increase of
both this and that but this seems to peak at B2 level, where that is used
almost four times as much as at B1 level but with a similar frequency at
C1 level. This seems to show that learners at B1 level will be familiar with
these items but less confident in using them to link ideas across turns, in
a similar way as we saw with it. As previously shown with the example
of it at B1 level, learners tend to use this and that within their own turn
while at B2 and C1, the items are used across a learner’s own turns and
those of their partner. The extracts in figure 4.3 demonstrate this usage at
each level.
B1
. . . when I coming to this school erm because he she she’s my friend in China so I I just feel
happy and er full of <$G1> about this school so
B2
<$1M> + mm and er if er what if we will have party er if tomorrow is er Sunday we can have
the party outside and er if mm they are mm Sunday I think erm most er football match and er
other sports will go on but if have a heav= heavy rain that will be stopped.
<$0> Okay.
<$1M> Or like F E.
<$0> Alright and er what do you think?
<$2F> Er erm I think er in my country I can believe this because in my country maybe it’s er
the weather al= always the same but in UK er I you can’t believe these because th= they
always change the weather sometimes rains maybe after sunshine <$=> so </$=>.
C1
<$32M> And and if if with the tourists from from other country erm ask ask us ask us some
questions such as the way how the way or some place then we have to we have to speak English
+
<$0> Mhm.
<$32M> + and not Japanese.
<$0> And what do you think <$33M>?
Figure 4.3 Examples of this and that at B1, B2 and C1 levels
These uses show the increased control speakers have over cohesive devices,
as suggested by the CEFR ‘can do statements’. At C1, for example, it is sug-
gested that learners ‘Can use a variety of linking words efficiently to mark
clearly the relationships between ideas’ (Council of Europe 2001: 124/125).
However, it is clear that what is more important is not that the range of
items used to achieve cohesion increases but rather that the frequency and
flexibility of use does, as the study by Iwashita and Vasquez (2015) reviewed
in section 4.3 also found.
The data also shows that there is very little use of spatial deixis at any of
the levels and also in the native-speaker data. This may be because deixis is
normally used to discuss things within immediate time and space and tends to
occur in types of speech such as ‘language in action’ (Carter and McCarthy
2006) where speakers talk about something as they are undertaking a task.
The types of topics discussed in the speaking tests do not develop this type
of talk and as a result, deixis is relatively uncommon.
The items used to realise thematic development had a similar frequency
across the levels, although there is a slight increase in the items used to realise
this from B2 level onwards. Students at all levels tend to develop themes in a
simple additive fashion, much in the same way as native speakers do (Willis
2003), and there is little evidence of spoken forms such as ‘moreover’
or ‘in addition’ being used in the learner data. A check on the
native-speaker data from our corpus also reveals similar frequency patterns to those at B2 and
C1 levels, and table 4.3 shows that the differences were not significant. This
suggests that learners from B1-C1 use these items to achieve the sort of the-
matic development as described in the B2 level statement ‘Can develop a clear
description or narrative, expanding and supporting his/her main points with
relevant supporting detail’ (Council of Europe 2001: 124/125). Figure 4.4
gives an example of this usage at B2 level and an example from the NS data.
Table 4.3 Log-likelihood scores for key linguistic items used to realise discourse competence
ITEM                B1-B2       B1-C1       B2-C1      B1-NS        B2-NS       C1-NS
IT                  –4.02*      –53.28****  –10.55**   –52.09****   –33.18****  –11.46***
THIS                –2.99       –4.38*      +3.89*     +41.10****   +57.41****  +41.55****
THAT                –59.63****  –54.33****  +2.91      –118.45****  –23.21****  –38.55****
FOR EXAMPLE         –5.88*      –0.96       +3.96*     –1.24        +0.57       –0.49
WHAT DO YOU THINK?  –10.31**    –8.99**     +0.01      +2.14        +9.74**     +9.51**
SO YEAH             +3.49       +0.13       –2.52      +0.36        –0.70       +0.12
Note: * = p<.05, ** = p<.01, *** = p<.001, **** = p<.0001
+ = greater use in corpus 1 compared to corpus 2
– = greater use in corpus 2 compared to corpus 1
B2
<$0> Right erm in your opinion is technology changing too fast?
<$17F> Yes <$E <$0> laughs /$E> yeah it is changing too fast now <$=> the er </$=> for
example the iphone er it takes long time <$E <$0> laughs /$E> yeah <$E laughs /$E> it takes
a long time to be produced now there iphone 1 2 3 and 5 maybe it just changes +
NS
<6F> Erm. I guess one that’s sort of <$=> didn’t doesn’t </$=> doesn’t really go anywhere or I
don’t feel like I’m like utilising my skills so <$=> not to put </$=> not to put like the job down
but for example if if I was to work at say McDonald’s or Weatherspoons for the rest of my life +
Figure 4.4 For example to develop a theme
B2
<$4M> Erm okay I think that erm if I’m going on a holiday the first thing that I will see is the
location of the hotel. I prefer the hotel to be in a down town so that I can go like for shopping or
go <$=> to </$=> to run some errands as easily as I could. What do you think?
Figure 4.5 Use of what do you think? to initiate conversation
What is also striking is that learners from B2 level onwards are better
able to initiate conversation as we can see in the increased use of what do
you think? This suggests that as learners progress, they have to focus less
on how to manage their own turn and can think more about the interactive
nature of discourse. The fact that the native-speaker data has no examples
of this suggests that there is less need to explicitly invite a conversational
partner into the discourse because they will be aware of a ‘gap’ and can find
it without the need for a signal of this nature. The sample in figure 4.5 shows
this usage at B2 level.
The use of so yeah to round off a turn and to signal the learners are ready
to close it follows a different pattern to what do you think? with an increase
in frequency of use at B1 and C1 levels but as table 4.3 shows, there is no
significant difference in the frequency of usage. A typical use of so yeah
is shown in figure 4.6 at B1 level.
<$35M> + er in the time so yeah you know so IELTS is written in English er <$O69> so
</$O69> you know we are just er er not we are not not native speakers +
<$34F> <$O69> yeah </$O69> yeah.
<$35M> + so yeah <$O70> yeah </$O70> yeah it is very difficult to learn +
Figure 4.6 Use of so yeah to close a turn
The findings described so far are confirmed by the log-likelihood scores
in table 4.3, which compare the scores for the most frequent items
above in order to show if increased usage was significant when comparing
the learner data across levels and with the native speaker data. Calculations
were undertaken to compare B1-B2, B1-C1, B2-C1
and the learner data with the native speaker data. A plus symbol (+) means
the item was used significantly more in the first corpus compared to the
second and a minus symbol (-) means it is used significantly more in the
second corpus than the first. Stars indicate the level of significance, if this
has been found (please refer to table 4.3).
In summary this analysis shows the following:
1 It, this and that were generally used for anaphoric referencing with more
frequency and confidence and flexibility as levels increased, to link ideas
within turns at B1 and then across turns from B2 levels onwards. This
difference in frequency of use was significant particularly in the case of
it, with the frequency of this being more significant at B2 levels. Both
it and that were used significantly less by all learners than the native
speakers, while this was used significantly more.
2 Learners from B2 level onwards were better able to initiate conversation
via the use of what do you think? to invite a contribution from their
partner. From B2 level onwards, learners used this with greater frequency
than native speakers and this increased frequency was significant.
Comparisons also show that B2 and C1 levels used this with significantly
greater frequency than B1 level learners (see section 3.5.4). This suggests
that, as mentioned, after B1 level, learners begin to focus less on how to
manage their own turn and can think more about the interactive nature
of discourse and that there is a need for learners to be more explicit
when inviting a conversation partner’s turn.
3 Thematic development was realised at all levels (and by native
speakers) by use of simple additive expressions such as and also or I also.
The use of for example to explicitly signal that
a turn is going to be extended is significantly more frequent at B2 level
when compared with B1 and C1 levels but not when compared with the
native-speaker data.
4 So yeah is used to round off turns more frequently at both B1 and C1
levels than at B2 level, and more frequently than by native speakers, but
none of these differences in frequency is significant.
Having looked at these results, the next section will attempt a similar style of
analysis: examining discourse competence at the micro level, via an analysis
of DMs while exploring how this relates to CEFR competences at the macro
level of discourse.
4.6 The frequency and functions of common discourse markers used to achieve discourse competence
An initial search of the frequency of the DMs can be seen in table 4.4 show-
ing the frequency across the different levels, in order of frequency for each
level represented in the learner corpus.
We can observe several patterns from the raw frequency counts, as
follows:
Firstly, in almost all cases, even taking into account the smaller amount
of NS data, the DMs are used with greater frequency by learners than
native speakers. In terms of specific items, we can see that er decreases as
the CEFR level increases but er is by far the highest frequency DM.
Secondly, the use of yeah, I think and well increases as the CEFR level moves
up. Thirdly, Ok is noticeably more frequent in general in the learner data
and in particular at B2 level and You know is more frequently used at B1
level. Lastly, Right and anyway are infrequent in the learner and
native-speaker data.
There are several possible reasons for this and these will be discussed
after looking at the log-likelihood scores for these items. Table 4.5 gives the
log-likelihood scores for the most frequent DMs (anyway and right have
been excluded from further analysis due to their very low frequency) in an
attempt to show if increased usage was significant when comparing the learner
and native-speaker data and across levels. Calculations were undertaken
to compare B1-B2, B1-C1, B2-C1 and B1-NS, B2-NS and C1-NS. A plus
symbol (+) means the DM was used significantly more in the first corpus
compared to the second and a minus symbol (-) means the opposite. Stars
indicate the level of significance, if this has been found.
Table 4.4 Frequency of DMs in learners and NS data
DMs       B1    B2    C1   Native speaker data
ER        1391  1312  770  42
YEAH      250   380   442  171
I THINK   205   219   288  105
OK        86    116   60   3
YOU KNOW  51    35    38   25
WELL      27    33    64   14
I MEAN    4     8     8    17
RIGHT     0     5     7    0
ANYWAY    1     2     0    0
Based on the raw frequencies and this data, we can make several further
observations.
Firstly, as noted here and in chapter 2, the raw frequency with which Er
is used decreases as level improves but is still in use at C1 level. As we have
discussed in chapter 1, this is likely to be because it allows for a speaker to
control their own turn and hold the floor. It also buys time and at lower
levels, there is a greater need to do this as you search for language. This is
supported by the log-likelihood scores, which show that the use of Er and
also Ok has a significantly higher frequency at all levels when compared
to the native-speaker data and that they also occur with significantly more
frequency at B1 and B2 when compared with C1 level. We would suggest
that this is because Ok also has a key function of buying time and
this is needed most where learners have less ability to quickly recall
language. The significantly increased use of you know at B1 level when com-
pared to B2 and C1 levels suggests that it is primarily used as a marker to
manage the discourse and to hold the floor, something an examination of the
data demonstrated clearly. There was little evidence of this DM being used
to mark shared knowledge, which Fung and Carter (2007) suggest is a key
function it often fulfils and instead it was part of the greater need of B1 level
learners to manage their turn by buying time. This reinforces the findings of
Fung and Carter (2007), who suggested that learners made less use of the
interpersonal functions of DMs such as you know.
Table 4.5 Log-likelihood scores for each discourse marker
DMs       B1-B2       B1-C1        B2-C1        B1-NS        B2-NS        C1-NS
ER        +19.66****  +339.14****  +203.76****  +715.08****  +585.27****  +241.31****
YEAH      –14.42****  –20.72****   –0.48        –21.27****   –2.95        –1.50
I THINK   0.22        –1.89        –3.68        –2.13        –3.44        –0.20
OK        –1.74       +11.91****   +24.33****   +42.39****   +55.01****   +19.70****
YOU KNOW  +5.06*      +5.72*       +0.01        –0.29        –5.37*       –5.86*
WELL      –0.12       –8.59**      –7.16***     –0.32        –0.10        +2.66
I MEAN    –0.95       –0.64        +0.04        –23.31****   –17.47****   –19.59****
Note: * = p<.05, ** = p<.01, *** = p<.001, **** = p<.0001
+ = greater use in corpus 1 compared to corpus 2
– = greater use in corpus 2 compared to corpus 1
Another aspect to observe is that the use of yeah is significantly more
frequent at B2 and C1 and in the NS data, when compared to B1 level. This
seems to indicate a greater ability to pay attention to and respond to the
ongoing discourse, rather than the greater need to focus on your own turn,
as mentioned in the initial analysis of cohesive devices such as it earlier in
this chapter. Similarly, well is used significantly more at B2 and particularly
C1 levels, primarily to mark a pause before a turn is continued.
In contrast, I mean is used significantly more by native speakers and I
think does not differ with any significance across all sets of the data. This
indicates that native speakers use I mean to explicitly adjust and reformu-
late speech, a key function of this DM, as identified by Fung and Carter
(2007). Successful learners do not do this as often and we can suggest that
this is likely to be because when trying to reformulate a message, the focus
is on what you are trying to say and this takes up processing time, leaving
little spare time to signal what you are doing. Finally, the use of I think also
differed across levels in the manner in which it is used and the functions it
fulfils and these will be discussed in the next section, when we look at how
the DMs are used in chunks.
Table 4.6 shows the most frequent two to six word chunks for each word.
These were sorted left and right around the keyword. As mentioned, the
most frequent chunks in each category are listed here, with three occur-
rences the minimum cut off point. In each case the frequencies are given
after each chunk.
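The chunk search described here can be approximated with a short Python sketch. It is offered only as an illustration under stated assumptions: the function names are hypothetical, the input is assumed to be a list of word tokens from which transcription codes have already been removed, and the sketch does not stop chunks from running across speaker turns, which a full analysis would need to consider. It collects every sequence of two to six words that contains the keyword and keeps those occurring at least three times.

```python
from collections import Counter


def _contains(window, key):
    """True if the keyword (one or more words) occurs contiguously in window."""
    k = len(key)
    return any(window[i:i + k] == key for i in range(len(window) - k + 1))


def chunks_with_keyword(tokens, keyword, min_n=2, max_n=6, min_freq=3):
    """Count 2- to 6-word chunks containing the keyword.

    tokens: list of words (assumed already stripped of transcription codes).
    keyword: a DM such as "er" or "I think".
    Returns (chunk, frequency) pairs with frequency >= min_freq,
    most frequent first.
    """
    key = tuple(keyword.upper().split())
    words = tuple(t.upper() for t in tokens)
    counts = Counter()
    # Start above the keyword's own length so the bare keyword is not
    # counted as a chunk of itself.
    for n in range(max(min_n, len(key) + 1), max_n + 1):
        for i in range(len(words) - n + 1):
            window = words[i:i + n]
            if _contains(window, key):
                counts[" ".join(window)] += 1
    return [(chunk, freq) for chunk, freq in counts.most_common()
            if freq >= min_freq]
```

Because the keyword may fall anywhere in the window, the search captures chunks to both the left and the right of the item, in the same way that And er and Er I both appear under er in table 4.6.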
It is obvious from this data that the chunks in use across each level
have many similarities. The DMs frequently cluster together (such as Er
I think, Well I think and You know er) and there are some which dif-
fer only slightly in frequency across levels such as Er I think. They are
also markedly different in nature to written forms of discourse marking
such as ‘in addition’. Although differences in form did occur, differences
were mainly found in terms of the functions for which they are employed.
Some of these differences for er, yeah, I think, Ok, and well will now be
explored, with examples used from the exams at each level. I mean has
been excluded because there were too few occurrences to merit attention
in the learner data.
4.6.1 And er
At the most obvious level, this chunk is used by speakers to aid the cohe-
sion and coherence by allowing a speaker to continue a turn by adding new
information to old. It is another way in which successful speakers achieve
thematic development, as for example also does, as discussed in section 4.5.
This function can be seen in particular at B1 and B2 levels and is used in a
Table 4.6 Two to six word chunks with er, yeah, I think, Ok, you know, well and I mean
DMs B1 B2 C1 NATIVE-SPEAKER DATA
ER AND ER (175) AND ER (173) AND ER (77) ER I (5)

ER I (160) ER I (106) ER I (69) ER BUT I (3)
I THINK ER (47) I THINK ER (35) ER I THINK (22)
ER I THINK (37) ER I THINK (28) I THINK ER (16)
ER I THINK ER (14) ER I THINK ER (7) ER I AGREE WITH (4)
ER II DON’T (3) AGREE WITH YOU BUT ER (6) THINK IT’S ER (4)
ER I AGREE WITH YOU (5) I THINK IT’S ER (4)
I AGREE WITH YOU BUT ER (5)
I SPEND A LOT OF ER (3)
YEAH YEAH YEAH (30) YEAH I (33) YEAH I (50) YEAH I (24)
YEAH YEAH YEAH (12) YEAH YEAH (25) SO YEAH (14) YEAH I THINK (10)
YEAH IT’S (9) YEAH IT’S (22) YEAH I THINK IT (4)
YEAH YEAH YEAH (7) YEAH I AGREE WITH (11)
YEAH I AGREE WITH (4) YEAH I AGREE WITH YOU (4)
YEAH I AGREE WITH YOU (4)
I THINK I THINK ER (47) I THINK ER (35) I THINK IT (54) I THINK IT (25)
I THINK IT’S (26) I THINK IT’S (20) I THINK IT’S (41) I THINK IT’S (24)
I THINK IT’S SO (4) I THINK IT’S ER (4) I THINK IT’S VERY (7) I THINK A LOT OF (4)
I THINK IF YOU HAVE A (3) I THINK IT’S GOOD FOR (3)
OK OK ER (9) OK OK (17) OK I (7) TOO FEW
OK OK (7) OK OK OK (5) OK OK (5) OCCURRENCES
OK ERM I (3) OK I THINK (3) TO GATHER ANY
MEANINGFUL DATA
YOU YOU KNOW ER (12) YOU KNOW ER (5) YOU KNOW THE (6) YOU KNOW YOU (3)
KNOW YOU KNOW IT’S (3)
WELL WELL I (9) WELL I (11) WELL I (13) WELL SO (3)
WELL I THINK (4) ER WELL (6) WELL I THINK (4)
ER WELL (4) WELL I THINK (4)
I MEAN TOO FEW OCCURRENCES TOO FEW OCCURRENCES TOO FEW OCCURRENCES I MEAN IT (6)
I MEANS IT’S (5)
I MEAN I (3)
simple additive fashion, within and across turns. When speakers progress to
C1 level, there is also evidence of this but in addition, at this level, speakers
also use the chunk to hold the floor within longer, more extended stretches
of talk and across turns, as the examples in figure 4.7 show.
4.6.2 Yeah/yeah/yeah I
In addition to use of so yeah to close conversations at B1 and C1 levels, as
noted in section 4.5, this data shows how the DM changes function across
levels. At B1, it seems to be used most often to agree, pause or buy time
(yeah yeah), and at B2 to agree and to add a viewpoint (yeah I) and buy
time. These functions continue at C1 but yeah I also begins to function
as a more interactive marker to agree and also continue a topic, even if it
was begun by a different speaker. This usage begins at B2 level (whereas at
B1 it tends to be used to manage within a speaker’s own turn), but there
are more examples of this at C1. The samples in figure 4.8 show these
functions.
B1
<$29F> Oh okay so well actually er I need describe a game er so the game must be the the same
like the sport or any games?
<$0> Any game. Any game.
<$29F> Ah okay so erm I I think er erm in <$G1> games I just enjoy play the computer game
because I don’t like sport. Yeah so er you know when I play the indoor the online games I can
stay in my rooms don’t need go out and er and sometimes I can lie in the bed play this game with
my friend.
B2
<$16F> Yeah erm I I like you I prefer outdoor outdoor sports <$O38> just like </$O38> er I
like play badminton and er maybe sometimes I think just lie +
<$15F> <$O38> Yeah </$O38> Yeah.
<$16F> + on the grass is very comfortable.
<$15F> Yes I think erm you do some sports and some sports is on just like football and erm er
basketball and er you can also erm play play football er or basketball inside or outside.
C1 level
<$0> Okay. Thank you. Is there a particular place that you would like to visit?
<$20F> Er I went to Bristol. I like that city +
<$0> Mhm.
<$20F> + and er Lancashire er and er Liverpool er Manchester. I hope I will go to other cities.
Figure 4.7 And er as a continuation

B1
Agreeing
<$1M> If the language is a problem maybe we can talk talk with each other.
<$2M> Yeah. I’m very agree with your opinion and er in addition I would like to say
Buying time
<$20M> Okay. Er <$E> coughs </$E> er when I was young I used to play er football everyday
but er now it’s because we= I have a lot of things to do it I er I always busy and <$G5> and no
more time I er I I didn’t have any time to play football but I like to watch it and er er yeah
football yeah I like to watch it.
B2
Buying time and agreeing/showing listenership
<$2F> Okay. Yeah erm but er different er like er our break is very warm but is really hot so
people er always er always re= can’t erm go out and er to en= to enjoy they job I think.
<$1M> Yeah.
<$2F> And er +
<$1M> Mm yeah I+
C1 Showing listening and extending the conversation
<$1F> The hotel amenities before I stay there has to be good like they has to be very good
before I stay in a hotel because I can’t just stay in normal hotel that just have run down stuff +
<$2M> Yeah.
<$1F> + it has to be really good.
<$2M> Yeah I do the same and most importantly first is the hotel room right?
<$1F> Mhm.
<$2M> Yeah the place where you would spend the night over yeah how important do you think
it is?
<$1F> Yeah.
<$2M> Yeah it’s very important yeah.
Figure 4.8 Yeah across the levels
4.6.3 I think er and I think it

The extensive use of chunks with think was discussed in chapter 2, where
we noted it can function to give views, gain time and to hedge opinions. In
this data, the more frequent use of I think er at B1 and B2 levels shows that it
tends to be used to express a view and to gain thinking time while holding the
floor. At C1 level, as discussed in chapter 2, this function is still used but the
preference for I think it shows that successful learners at this level tend to
use this chunk to link to previous turns or to reference something outside the
immediate context. Figure 4.9 shows some examples of these uses.
B1
<$0> Okay. Would you begin please?
<$4F> Er I will begin first <$E> coughs </$E> Er yes I think er individual personality will
affect the choice of work job because sometimes if the person is confidence maybe they will do
some own <$G1> not follow some other people’s doing some job applications.
B2
<$7M> + so what do you think?
<$8M> Yeah I definitely agree with you because I think er maybe sometimes it’s depend
depends on different average er different <$G2> of application maybe maybe some developed
countries er the average is very high so people maybe have a quite good of er knowledge so they
can just communicate very easily <$=> and er mm I er </$=> besides I think mm +
C1
<$0> <$29M> sorry. When is the best age to learn to drive?
<$29M> To learn to drive I think it must be sixteen or seventeen but to start driving it should be
eighteen because if people just learn to drive in one month or in one session they won’t learn
everything so <$=> they start </$=> two years ago they start having classes some safety some
safety classes about the classes about the cars <$=> can </$=> what they are capable of what
they can do what they can’t do they must tell them everything before they can use the car.
Figure 4.9 I think er/I think it across the levels
4.6.4 Ok er/ ok I
This DM is used to buy time, to show understanding, and to
launch a turn or topic. The first two functions are most common at B1 and
B2 and the final function becomes increasingly used at C1 level. Figure 4.10
shows these uses.
4.6.5 Well I
As noted previously, the use of well increases as levels increase and its use
is significantly more frequent at B2 and C1 levels when compared with B1
level. If we compare B1 to C1, this difference is highly significant. At B2
level, the most common use is to signal some form of hesitation at the start
of the turn. At C1 level, this use is also in evidence but there is also some
evidence that it is used to mark a slightly unexpected answer, which is often
listed as a common function of this DM (see for example Carter and McCarthy
2006). The examples in figure 4.11 show these uses at C1 level.
Overall, what these samples show is that DMs vary to some degree in fre-
quency across levels and this is also shown in the log-likelihood scores, but
more importantly they vary in terms of how they function across levels. As
with the cohesive devices explored in the earlier part of this chapter, there is
a clear tendency for successful speakers to use DMs to manage the discourse
within their own turn at B1 level. While these functions continue at higher
Showing understanding and buying time

B1
<$0> Right. Now I’m going to give each of you a topic to talk about.
<$15M> Ok.
<$0> So <$15M> you will talk for about a minute after which I will stop you. And I’ll ask you er
<$16M> to comment on your partner’s topic at the end of the minute. Okay so <$15M> here
you are. And let <$16M> see it please. And can you answer the questions for us <$15M>?
<$15M> Ok er actually I I I I prefer live in house because the house is big is bigger than
apartment and maybe some house
B2
Showing understanding
<$0> Okay would you begin please?
<$8M> Ok er <$7M> let’s talk about something about er the companies’ team working. So what
do you think about it if we just refer to these two words?
<$7M> The training er +
C1
Launching a turn
<$4M> how important is tourism to your country?
<$3M> Ok can I go first? Ok erm tourism in my country is not really important why because
erm my country has a lot of problems like erm there are a lot of things that er need to be atte=
attended to before tourism so basically what they are focussing on is not tourism at all they are
trying to focus on agriculture and the production the industries and the rest so like tourism is
like neglected in my country
Figure 4.10 Ok across the levels
Marking hesitation
<$22F> Okay what do you think?
<$21M> Er well these points important points in tourism but er I think well I agree with this
paper and the first point sites of historical interest especially erm in my country there is many
many sites historical site
Unexpected response
<$0> Film or a book?
<$27M> Well I don’t read books.
Figure 4.11 Well used at C1 level
levels, DMs tend to become used more across turns from B2 and in particular
at C1 levels. In terms of the CEFR descriptors, the B1 descriptor suggests
that regarding flexibility, learners ‘Can exploit a wide range of simple lan-
guage flexibly to express much of what he/she wants’ while at B2 and C1
levels learners ‘Can adjust to the changes of direction, style and emphasis
normally found in conversation’ (Council of Europe 2001: 124/125). This
data suggests that learners use DMs more flexibly as levels increase to adjust
to the ongoing needs of the discourse and of their conversational partner.
It would seem then that successful discourse competence at lower lev-
els is reflected in the ability to manage and extend your own turn, and as
learners progress this becomes increasingly part of what McCarthy (2009)
has termed ‘confluence’ in relation to fluency, that is co-constructing the
discourse across turns. It is also interesting that in some cases the learners
used more DMs than was evident in the NS data, although, as noted, these
differences in frequency were not always significant. We can suggest that this
is likely to be because successful learners have a greater need to explicitly
signal as a way to manage the discourse in spoken test conversations such
as the ones which make up this data. This is likely to be because the topics
are not of their choosing and more time is thus taken processing, managing
and responding than it was by native speakers.
4.7 Conclusion
The findings in this chapter lead us to several conclusions related to the
language learners use at different levels to achieve discourse competence and
have clear implications for teaching and learning.
The first thing we can note is that speakers generally use similar language
to produce discourse that is coherent, cohesive, developed and managed but
that frequency of occurrences does differ across levels in significant ways as
do the specific functions for which such language is used. This was demonstrated
through the analysis in the first part of the chapter. In terms of cohesion, for
example, we can see that when learners refer back to something said previously,
they used it more frequently as the levels increased, whereas for this and that
the frequency of use reached a peak at B2. This increase in
use of it was significant in terms of log-likelihood scores when we compare
B2 and C1 levels with B1 level. However, the usage of this was significantly
more frequent at B2 level in comparison to B1 and C1, while that was sig-
nificantly more frequent at both B2 and C1 levels when compared with B1.
Such differences reflect the fact that successful discourse competence at B1
level generally reflected the ability of learners to manage their own turn, and
as learners progress this becomes increasingly part of co-constructing the
discourse across turns and developing what a speaking partner says.
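To make the comparisons of frequency across levels more concrete, the sketch below computes the two-corpus log-likelihood statistic that is widely used in corpus linguistics. It is offered under stated assumptions: the chapter does not set out its exact calculation at this point, the function name is hypothetical, and the corpus sizes of 30,000 tokens per level are invented for illustration (only the raw frequencies of and er, 175 at B1 and 77 at C1, come from table 4.6).

```python
import math


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Two-corpus log-likelihood for one item.

    freq_a, freq_b: observed frequency of the item in corpus A and B.
    size_a, size_b: total number of tokens in corpus A and B.
    """
    total = freq_a + freq_b
    # Expected frequencies if the item were spread evenly across both corpora.
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


# Frequencies of 'and er' from table 4.6; corpus sizes are HYPOTHETICAL.
score = log_likelihood(175, 30_000, 77, 30_000)
print(round(score, 2), score > 3.84)
```

A score above 3.84 corresponds to p < 0.05 with one degree of freedom, which is the threshold conventionally used to call a difference significant. Because the corpus sizes here are assumed, the resulting score illustrates the method only and is not a finding of this study.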
Secondly, the chapter shows that certain DMs are an important feature of
discourse competence at all levels. Some items were used with significantly
higher frequency by learners at all levels when compared with the NS data,
namely er and ok. In the case of I mean, the usage was significantly higher
in the NS data. The frequency of items also varied across the levels, and these
variations can be significant, alongside differences in how the items function. To
give two examples, you know was used with significantly higher frequency
at B1 level, when we compare this to B2 and C1. At B1, it is used to mark
hesitation or buy time rather than to mark shared knowledge. Well was used
with significantly higher frequency at C1 level when compared to B1 and
B2, and its functions also increase. At B2 level, well is primarily used to mark
hesitation and to buy time, while at C1 it is also used to mark an unexpected
response. As a whole, the data also supports the findings of Gilquin (2008),
that hesitation was more commonly marked with er before any other DM,
even though usage decreases as levels increase. The low frequency of I mean
in the USTC data suggests that learners have less focus on explicitly signalling
a reformulation in their output and may be more focused on the message
they are producing. DMs also formed chunks such as so yeah which fulfil
important functions across the levels. In the case of so yeah, it was used most
frequently at B1 and C1 levels to close turns.
Overall, this chapter shows that discourse competence at B1 level is devel-
oped largely within a speaker’s own turn where the learner focuses on creat-
ing coherence and cohesion mainly within their own contribution. At B2 and
C1 levels, this starts to expand across turns and learners can co-construct
turns and increasingly develop what a speaking partner says by linking their
contribution to what has been said.
References
Adolphs, S. and Carter, R. 2013. Spoken corpus linguistics: From monomodal to
multimodal. London: Routledge.
Aijmer, K. 2002. English discourse particles: Evidence from a corpus. Amsterdam:
John Benjamins.
Byrne, S. 2015. Examining successful language use at C1 level: A learner corpus
study into the vocabulary and abilities demonstrated by successful speaking exam
candidates. Journal of Linguistics and Language Teaching, 6(1), 67–88.
Carter, R. and McCarthy, M. 2006. Cambridge grammar of English: A comprehensive
guide: Spoken and written English grammar and usage. Cambridge: Cambridge
University Press.
Celce-Murcia, M., Dörnyei, Z. and Thurrell, S. 1995. Communicative competence:
A pedagogically motivated model with content specifications. Issues in Applied
Linguistics, 6(2), 5–35.
Chen, Y-H. and Baker, P. 2016. Investigating criterial discourse features across sec-
ond language development: Lexical bundles in rated learner essays, CEFR B1, B2
and C1. Applied Linguistics, 37(6), 849–880.
Cook, G. 1989. Discourse. 6th ed. Oxford: Oxford University Press.
Council of Europe 2001. Common European framework of reference for languages:
Learning, teaching, assessment. Cambridge: Cambridge University Press.
Fung, L. and Carter, R. 2007. Discourse markers and spoken English: Native and
learner use in pedagogic settings. Applied Linguistics, 28(3), 410–439.
Gilquin, G. 2008. Hesitation markers among EFL learners: pragmatic deficiency or
difference? In J. Romero-Trillo ed. Pragmatics and corpus linguistics: A mutualistic
entente. Berlin: Mouton de Gruyter, 119–143.
Halliday, M.A.K. and Hasan, R. 1976. Cohesion in English. 2nd ed. London:
Longman.
Hellermann, J. and Vergun, A. 2007. Language which is not taught: The discourse
marker use of beginning adult learners of English. Journal of Pragmatics, 39(1),
157–179.
Hoey, M. 2004. Spoken discourse: Discourse markers er, erm and OK. MED Magazine
[Online] Available from: <www.macmillandictionaries.com/MED-Magazine/
March2004/17-language-awareness-discourse-uk.htm> [Accessed 12 February
2017].
Iwashita, N. and Vasquez, C. 2015. An examination of discourse competence at dif-
ferent proficiency levels in IELTS speaking part 2. IELTS research reports, 5, 1–44.
Jones, C. 2011. Spoken discourse markers and English language teaching: practices
and pedagogies. PhD. Thesis. Nottingham: University of Nottingham.
Jones, C. and Waller, D. 2015. Corpus linguistics for grammar: A guide for research.
London: Routledge.
Jucker, A.H. and Ziv, Y. 1998. Discourse markers: Descriptions and theory.
Amsterdam: John Benjamins.
McCarthy, M. 1991. Discourse analysis for language teachers. 2nd ed. Cambridge:
Cambridge University Press.
McCarthy, M. 2009. Rethinking spoken fluency. Estudios de linguistica inglesa apli-
cada, 9, 11–29.
McCarthy, M. and Carter, R. 1993. Language as discourse: Perspectives for language
teaching. New York: Longman.
Oakes, M.P. 1998. Statistics for corpus linguistics. Edinburgh: Edinburgh University
Press.
O’Keeffe, A., McCarthy, M. and Carter, R. 2007. From corpus to classroom: Lan-
guage use and language teaching. Cambridge: Cambridge University Press.
Redeker, G. 1990. Ideational and pragmatic markers of discourse structure. Journal of
Pragmatics, 14, 367–381.
Schiffrin, D. 1987. Discourse markers. Cambridge: Cambridge University Press.
Thornbury, S. 2005. Beyond the sentence: Introducing discourse analysis (methodol-
ogy). 3rd ed. Oxford: Macmillan Education.
Tsai, P-S and Chu, W-H. 2015. The use of discourse markers among Mandarin Chi-
nese teachers, and Chinese as a second language and Chinese as a foreign language
learners. Applied Linguistics. Advanced online access [Online]. Available from:
<https://academic.oup.com> [Accessed 13th March 2017].
Waller, D. 2015. Investigation into the features of written discourse at levels B2 and
C1 of the CEFR. Unpublished PhD dissertation, University of Bedfordshire.
Willis, D. 2003. Rules, patterns and words: Grammar and lexis in English language
teaching. Cambridge: Cambridge University Press.
Chapter 5
Pragmatic competence
5.1 Introduction
This chapter will focus on the speech acts of requests and apologies with
the aim of establishing the prerequisite components for their successful pro-
duction. As outlined in chapter 1, the data used to analyse the pragmatic
features of successful spoken request and apology language were based upon
a 39,437-word corpus of international learner responses. As Geluykens and
Kraft (2008) highlight, in the absence of suitable reference corpora which
often do not lend themselves well to cross-cultural analyses, or are too gen-
eral to be useful for pragmatics research in particular, researchers may be
left with little option but to devise their own. It is for these reasons that the
Speech Act Corpus of English (SPACE) originally took shape, though it is
limited to the speech acts of request and apology in its current form, at the
time of writing this book.
The majority of the NNS and NS data for the SPACE corpus were gath-
ered over a series of intervention studies (Halenko 2016; Halenko and Jones
forthcoming) and captured via the use of innovative and interactive virtual
role plays, designed to elicit large amounts of data in a controlled envi-
ronment, in an attempt to simulate authentic NS-NNS interactions more
closely (see chapter 1 for a more detailed description of this instrument).
As discussed in chapter 1, simulated spoken data cannot entirely replicate
naturally-occurring discourse but is a more efficient means of capturing
large samples of language with a specific focus, in controlled environments.
As a reminder, the test featured interlocutors from an academic context who
international students were likely to encounter on campus at a British uni-
versity, playing high and low social distance roles, within scenarios typical
of staff-student exchanges such as high imposition requests and apologies
(see figure 5.1). The SPACE corpus comprised those request and apology
responses considered successful from B2 level learners, which were rated as
being ‘appropriate’ or ‘very appropriate’ for the scenarios presented.
Without revisiting communicative models from chapter 1 in too much
depth, section 5.2 first outlines features which underpin the notion of prag-
matic competence, its importance as a component to successful language
You have not completed your essay.
You go to your tutor’s office, who you know well, to ask for extra time.
You:
Figure 5.1 An example of a scenario from the CAPT to elicit spoken data
production, and how the CEFR embeds the notion of sociolinguistic
competence in its descriptors. Given the central focus of requests and apologies
in the SPACE corpus, these speech acts are also defined, providing
frameworks for analysing the data more closely, in addition to an acknowl-
edgement of the substantial role formulaic language plays in production
of pragmatic language, in particular. Following a review of previous stud-
ies and how the data were analysed in sections 5.3 and 5.4, the chapter
turns to examining the linguistic aspects of successful request (section 5.5)
and apology (section 5.6) language and cross-referencing the findings to the
CEFR descriptors. Section 5.7 concludes the chapter by identifying the prag-
matic strategies and organisational patterns of moves considered requisite
for structuring successful requests and apologies. As a point of comparison
of alternate ways success can be achieved, native-speaker data within the
SPACE corpus will also be examined, as in previous chapters. The aim of this
chapter is to be able to highlight components of successful B2 level request
and apology language to encourage prioritising these aspects in teaching and
learning programmes.
5.2 Definitions of pragmatic competence

Hymes’s (1972) shift away from the Chomskyan (1965) view of language
as a system isolated from context and use introduced the importance of
situating both the knowledge of language and the ability to use it within
the construct of communicative competence, thereby guiding the design of
later influential frameworks. As seen in chapter 1, Canale and Swain (1980),
Canale (1983) and Bachman and Palmer (1996) are among those credited
with attempting to capture the essential components of communicative com-
petence in second language acquisition (SLA). Whilst Canale and Swain’s
(1980) work implicitly embeds a pragmatic component, referring to the
rules of use and appropriateness within ‘sociolinguistic competence’, Bachman
and Palmer (1982), and subsequently Bachman (1990), were the first
to explicitly categorise it as a discrete element to describe the ability to use
language in socially appropriate ways.
Collectively, the above models posit that it is not only linguistic knowledge
that is a key tenet of communicative competence, but also the acquisition of
functional and sociolinguistic control of language. For instance, when requesting
a favour from someone, in addition to possessing the declarative knowledge
of what forms and lexis are needed (linguistic competence), learners need
adequate procedural knowledge of how to enact the request by consider-
ing its acceptability on the basis of the overall social context, the specific
situation, the favour itself, and from whom they are soliciting the favour
(pragmatic competence). The importance of the social aspects of interaction
is echoed by a number of researchers who suggest that pragmatic compe-
tence must be reasonably well developed for successful communication in a
second language (L2) (Bardovi-Harlig 2001; Kasper and Rose 2002; Rose
2005), though early studies provided evidence of disparities between NNS
linguistic proficiency and pragmatic competence, even in advanced learners
of English (Bardovi-Harlig and Hartford 1990; Blum-Kulka 1982, 1983;
Kasper 1995; Rintell 1981; Thomas 1983).
Most second language pragmatics investigations are grounded in Austin’s
(1962) and Searle’s (1969) speech act theory, which aims to provide a clearer
understanding of what is required for effective and appropriate communica-
tion. It is problematic to assign a clear definition of a speech act given that
it is not a sentence or an utterance, but an act in itself. As Austin (1962)
describes, language is more than making statements of fact; it has a performative
function to carry out social actions. Stating ‘I apologise’, for example,
has both a linguistic and a social function. With this in mind, Austin (1962)
posited that when producing utterances, a speaker actually performs three
acts: the locutionary act (the utterances themselves), the illocutionary act
(the speaker’s intention behind the words, such as requesting or apologising)
and the perlocutionary act (the effect of the utterance on the hearer).
Of the three acts described above, the illocutionary act is said to be the
underlying focus of speech act theory. Building on Austin’s (1962) classifi-
cations of illocutionary acts, Searle’s (1969) revised taxonomy is based on
functional characteristics and incorporates five major groups: representatives
(e.g. assertions), directives (e.g. requests), expressives (e.g. apologies),
commissives (e.g. promises) and declarations (e.g. vows). The illocution-
ary act, also known as illocutionary force, provides a signal as to how the
speaker wishes the utterance to be interpreted (Barron 2002), and is typically
realised by Illocutionary Force Indicating Devices (IFIDs) such as performa-
tive verbs (e.g. requesting or apologising), or word order and intonation.
For instance, ‘Would it be possible to have an extension for my assignment?’
functions as a request by the speaker. An IFID is considered successful if the
listener obliges and complies with the request. The success of utilising IFIDs
appropriately, however, is less commonly achieved by NNS (Barron 2003),
the reasons for which have been one of the motivating drives for second
language pragmatic investigations. Building on the descriptions in chapter 1,
Crystal (1997: 301) provides the most widely referenced definition of sec-
ond language pragmatics:
Pragmatics is the study of language from the point of view of the users,
especially of the choices they make, the constraints they encounter in using
language in social interaction, and the effects their use of language has on
other participants in the act of communication.
These two areas of choice and constraint are conveniently differentiated
by Leech (1983) and Thomas (1983) as pragmalinguistic and socioprag-
matic components of pragmatics. Pragmalinguistics refers to the knowledge
of linguistic resources available and the choices made to convey messages.
Sociopragmatics, on the other hand, is primarily concerned with the effect
constraints such as social distance and status will have when realising a
communicative act.
As noted in chapter 1, pragmatic competence, as referred to in the CEFR,
aligns with Canale’s discourse competence, representing ‘the functional use of
linguistic resources . . . the mastery of discourse, cohesion and coherence’
(CEFR 2001: 13). With the descriptions of pragmatic competence outlined
in 5.1 in mind, which place an emphasis on the influence of social and cul-
tural conventions on language use, it seems a much closer fit to the CEFR’s
notions of sociolinguistic competence: ‘sociolinguistic competence is con-
cerned with the knowledge and skills required to deal with the social dimen-
sion of language use . . . [including appropriate use of] linguistic markers
of social relations, politeness conventions and differences in register at the
different levels’ (CEFR 2001: 118). The CEFR descriptors are captured in
table 5.1 and suggest B2 learners (whose responses comprise the SPACE cor-
pus) should have a sociopragmatic awareness of politeness and behaviour
relative to the context and interlocutor, and be able to make appropriate
(pragma)linguistic choices based on this.
5.2.1 Requests
Derived from Searle’s (1976) classification of ‘directives’, requests are seen
as illocutionary acts in which a speaker conveys to a hearer his/her wish
for the hearer to perform an act which is of cost to them but has benefit
to the speaker. This can be a request for verbal goods, such as information,
or non-verbal goods, such as an object or service (Trosborg 1995). It is
characterised as a pre-event act given the expectation that the act will take
place in the immediate or near future time. As the request imposes on the
hearer, it is also, by nature, a face-threatening act (FTA). Within Brown and
Levinson’s (1987) theory of politeness, a request specifically threatens the
Table 5.1 CEFR ‘can do’ statements for sociolinguistic appropriateness
Sociolinguistic appropriateness
C1
• Can recognise a wide range of idiomatic expressions and colloquialisms, appreciating register shifts; may, however, need to confirm occasional details, especially if the accent is unfamiliar.
• Can follow films employing a considerable degree of slang and idiomatic usage.
• Can use language flexibly and effectively for social purposes, including emotional, allusive and joking usage.
B2
• Can express him or herself confidently, clearly and politely in a formal or informal register, appropriate to the situation and person(s) concerned.
• Can with some effort keep up with and contribute to group discussions even when speech is fast and colloquial.
• Can sustain relationships with native speakers without unintentionally amusing or irritating them or requiring them to behave other than they would with a native speaker.
• Can express him or herself appropriately in situations and avoid crass errors of formulation.
B1
• Can perform and respond to a wide range of language functions, using their most common exponents in a neutral register.
• Is aware of the salient politeness conventions and acts appropriately.
• Is aware of, and looks out for signs of, the most significant differences between the customs, usages, attitudes, values and beliefs prevalent in the community concerned and those of his or her own.
hearer’s negative face (the freedom to be unimpeded by others) by creating
this imposition.
To mitigate this FTA, a number of strategies can be undertaken to mini-
mise the request, whilst maximising politeness at the same time. One of
the most common ways a request may be minimised is through indirect-
ness within the core component of the request, the head act, which conveys
the speaker’s wish/es. The head act comprises three main strategies: direct,
conventionally indirect and non-conventionally indirect which increase in
indirectness, as outlined in figure 5.2.
Firstly, direct strategies may be employed when the speaker wishes to
explicitly state the illocutionary point of the utterance via performative
verbs e.g. I request a lift from you, imperatives, e.g. Give me a lift, or modals
expressing obligation, e.g. You must give me a lift. They fail to offer any
options to the hearer so are considered the least polite. Next, conventionally
indirect strategies (CID) question the hearer’s ability and willingness to com-
ply with the request, e.g. Could you give me a lift? In this case, compliance is
not taken for granted and a means to opt out is supplied, thereby lowering
the risk of the speaker losing face by increasing indirectness. CID strategies
typically comprise routinised formulae, and those which are hearer-oriented,
e.g. Could you . . ., are generally considered more polite as a compliance
option is provided. Finally, the third strategy, non-conventionally indirect
Direct strategies
• direct
e.g., Give me a lift
Conventionally indirect strategies
• less direct
e.g., Could you give me a lift?
Non-conventionally indirect strategies
• least direct
e.g., I’m late for the train
Figure 5.2 Strategy choice for request head acts
or ‘hints’, is employed when the speaker does not wish to overtly state the
desired action but instead prefers to make a statement or ask a question. It
requires the hearer’s interpretation of the speaker’s intent, e.g. I’m late for
the train ought to signal to the hearer that he/she might offer the speaker a
lift to the train station.
The head act is able to function independently but is typically embedded
within a range of mitigating supportive moves which serve to soften the
request. These comprise internal and external modifiers. Internal modifiers
are those which form part of the head act itself and include softeners which
reduce the force, e.g. Could you possibly . . ., items used to fill in the gaps
of the utterance, e.g., Could you, erm, possibly . . . or alerters which serve
to gain the interlocutor’s attention, e.g. Excuse me . . . or the token please.
In contrast, external modifiers surround the head act, serving to further
absorb the impact of the impending imposition. These include Preparators,
employed to set up the request, e.g. Mr Waters, I’ve got a question about my
assignment, and Grounders, devices used to provide a reason or explanation
for the request, e.g. Could I have an extension? I’ve had computer problems.
Observations about the context and social environment need to be made
before deciding on the appropriate construction of the request itself.
5.2.2 Apologies
As with requests, apologies are considered face-threatening acts (FTAs).
To repair the damage of FTAs, interlocutors may engage in a number
of facework strategies to ‘redress’ the incident which include apologies.
Olshtain and Cohen (1983: 20) maintain that when an action, utterance
(or lack thereof) causes offence on the part of the recipient(s), the culpable
person(s) needs to apologise to re-establish social harmony. In other words,
‘[an] apology is an instance of socially-sanctioned hearer-supportive behaviour’
(Edmondson 1981: 273) and is defined as a post-act event (Blum-Kulka
and Olshtain 1984).
The conditions for the apology being fulfilled, however, are dependent on
the culpable person acknowledging or recognising the offence has occurred,
which may be determined by sociocultural norms just as linguistic norms will
determine whether the utterance actually qualifies as an apology (Olshtain
and Cohen 1983: 20). In Brown and Levinson’s (1987) terms, the act of
apologising is face-saving for the hearer and face-threatening for the speaker.
Leech (1983) qualifies this by maintaining there is some kind of benefit
for the hearer at a cost for the speaker through the act of apology, unlike
requests which are costly in the reverse.
Researchers have posited a number of general and more detailed classifi-
cations for the semantic formulae contained in acts of apology. Most build
on the influential work of Goffman (1971) who describes apologising as
‘remedial work’ accomplished by accounts (excuses/explanations), requests
(begging sufferance) and apologies. Goffman classifies apologies as either
‘ritual’, motivated by social habits, or ‘substantive’, motivated by the wish to repair any
damage or harm caused by the initial act.
The limited categorisations proposed by Goffman have since been modi-
fied and expanded by a number of scholars based on cumulative research
conducted in the 1980s (Olshtain and Cohen 1983; Owen 1983; Blum-
Kulka and Olshtain 1984; Trosborg 1987). As a result, these studies have
developed and described a range of strategies to be undertaken for appro-
priate apology behaviour. Observational and interventional investigations
often cite Blum-Kulka, House and Kasper’s (1989) seminal Cross-
Cultural Speech Act Realisation Project (CCSARP) for a basic conceptual
framework of the semantic formulae involved in apologising, though this
is largely a reformulation of those proposed in the earlier studies (Blum-
Kulka and Olshtain 1984; Olshtain and Cohen 1983; Owen 1983; Trosborg
1987). It consists of a set of five formulae which individually may be con-
sidered sufficient to placate the hearer, although a combination, signifying
a more intense apology, is also commonplace (see figure 5.3). The apology
may also be accompanied by an intensifier, e.g. I’m very sorry, to amplify
the speaker’s regret (Dalmau and Gotor 2007: 1). It is useful
to view the formulae on a continuum, as Trosborg (1987) suggests. In this case,
formulae a) to e) in figure 5.3 cumulatively increase in indirectness
and potential for placating the recipient.
Explicit expressions of apology a) are generally realised through some kind
of performative verb such as ‘apologise’ or ‘forgive’. An explanation b) provides
a reason for the violation or damage which has occurred and often provides
supportive evidence to a). An admission of responsibility for the offence is
realised through strategy c), which Nureddeen (2008: 290) suggests is the
‘most explicit, most direct and strongest apology strategy’. An offer to repair
or pay for the damage caused is provided through d), whilst a promise not to
repeat the offence in the future is made through strategy e).
a) Expression of apology (with intensification), e.g. I’m really sorry
b) Explanation or account, e.g. My bus was late
c) Acknowledgement of responsibility, e.g. It’s my fault
d) Offer of repair, e.g. Let me buy you a drink
e) Promise of forbearance, e.g. It won’t happen again
Figure 5.3 Formulaic strategies for the apology speech act (based on Trosborg 1987)
Strategies a) and b) are said to be the basis of any remedial work, whilst c)
to e) are situation-dependent (Blum-Kulka, House and Kasper 1989) in the
event further mitigation is required. In contrast, Bergman and Kasper’s
(1993) review of a range of empirical apology studies concluded that
the essential components of an apology are strategies a) and c),
explicit expressions of apology and statements of responsibility (Holmes
1990; House 1988; Kasper 1989; Trosborg 1987), and that it is the severity of the
infraction which dictates the redressive strategy preferences (e.g. Brown and
Levinson 1987; Holmes 1990; Maeshiba et al., 1996). A single apology in
the form of an IFID (illocutionary force indicating device) may be adequate for being slightly late to meet a friend
(ritual apology), but a more elaborate formulation incorporating multiple
strategies may be required if the offence is much more serious such as break-
ing a person’s treasured possession (substantive apology). Contextual fac-
tors such as power and social distance between interlocutors also influence
apology performance (Maeshiba et al., 1996).
As many of the ways to realise both requests and apologies comprise the
use of formulaic chunks, it is worth visiting this area further in section 5.2.3
with specific reference to pragmatic language.
5.2.3 Formulaic language and developing pragmatic competence
Formulaic language, and lexical chunks in particular, are seldom a focus of
mainstream second language pragmatic studies, despite the acknowledgement
that formulaic language plays a central role in effective and efficient commu-
nication (Pawley and Syder 1983; Schmitt 2004; Sinclair 1991; Wray 2008),
as described in chapter 2. For L2 learners in particular, formulaic language
can save mental capacity which can be used more effectively to internalise
syntactic rules (Wang 2011), relieves pressure on memory which may benefit
L2 acquisition (Weinert 1995) and is said to improve fluency (Fillmore 1979).
For these reasons, language learning from a formulae-based approach can be
an effective learning strategy. Interlocutors also easily recognise formulaic
language and, in fact, have an expectation that conventionalised sequences
are used in order to expedite effective communication in many formal and
informal situations. Kecskes (2014) notes that much formulaic language
which occurs in social situations is culture-specific, so successful use and
interpretation depend on an understanding of the social norms of a speech
community. Much conventional formulaic language can be found in realising
speech acts such as requests and apologies (Schmitt and Carter 2004),
but as Wray (2012: 236) notes, ‘instructed L2 learners have an impoverished
stock of formulaic expressions’. Bardovi-Harlig (2009) and Kecskes (2000)
suggest lack of familiarity with expressions and sociopragmatic knowledge,
overuse of familiar expressions, level of proficiency and inadequate L2 expo-
sure as factors for this lack of resource.

Pragmatic competence is learned and developed through social interaction
but, assuming accessibility to input, can be a slow process (Cohen 2008;
Taguchi 2010). Estimates have suggested up to 10 years (Olshtain and Blum-
Kulka 1984), yet some researchers suggest competency may never be achieved
despite permanent residency in an L2 context (Cohen 2008; Kasper and Rose
2002). Instruction in pragmatics is seen as one way to efficiently advance
pragmatic knowledge and the majority of second language pragmatics studies
have shown this to be an effective technique for improving learners’ pragmatic
language production across a range of speech acts. Recent intervention stud-
ies specifically targeting request (Eslami and Eslami-Rasekh 2008; Johnson
and deHaan 2013; Halenko and Jones 2011; Martinez-Flor 2008; Safont
Jorda 2004) and apology language (Eslami and Eslami-Rasekh 2008; John-
son and deHaan 2013) also follow this positive trend.
Corpus-driven studies of spoken data with a pragmatic focus most com-
monly investigate learner use of pragmatic markers or discourse mark-
ers, as described in chapter 4 (see Vyatkina and Cunningham 2015 for a
recent overview). The interactive nature of discourse markers to maintain
conversation means they have an important pragmatic role to play. Studies
utilising corpora to examine functional language such as speech acts, on the
other hand, are limited to a handful of investigations such as Cheng (2004),
Schauer and Adolphs (2006) and Jones and Halenko (2014). Collectively,
these studies demonstrate how corpora may be used successfully for work-
place training and language teaching purposes.
Cheng’s (2004) study analysed a six-hour sub-corpus of service encounter
exchanges at a hotel reception desk from the 500,000-word Hong Kong Corpus
of Spoken English (HKCSE). Her quantitative and qualitative analysis
identified a need for pragmatics training of frontline hotel staff
who failed to meet the sociopragmatic and pragmalinguistic expectations of
their role and the hotel’s high customer care ethos. Specifically, the discourse
highlighted staff members’ preoccupation with settling the bill during the
speech event of ‘checking out’, at the expense of expected moves which show
sufficient concern and interest in the guests’ overall satisfaction. This was
exacerbated by the infrequent use of honorifics, hedging and the marker
‘please’ whilst guiding guests through the check-out process.
Schauer and Adolphs (2006) focussed on formulaic sequences in expres-
sions of gratitude elicited from written production tasks compared to the
multi-million word Cambridge and Nottingham Corpus of Discourse in
English (CANCODE). Findings revealed the former successfully provided a
wide range of interactional formulaic sequences and contemporary expres-
sions such as that’s wicked which could be used as classroom resources for
learners to gain insights into current language practices. In addition, the
CANCODE data was able to provide examples of how false starts and hesi-
tations keep conversations moving, and revealed expressions of gratitude
often comprised several phrases rather than a single expression e.g. ‘Great
stuff. That’s brilliant. Thank you very much. That’s much appreciated’. The
authors advised combining both data sets for maximum teaching benefits.
Jones and Halenko’s (2014) exploratory study also aimed to encourage
the use of corpora in classroom teaching by showing how the freely available
corpus tool Lextutor (2013) could be used to examine learner language, in
this case Chinese EAP learners’ successful request language, as a founda-
tion for teaching materials. The authors’ analyses concluded that conven-
tional indirectness comprising high frequency modals can, could and would,
and formulaic expressions to encourage listenership (e.g. sorry to bother
you, excuse me) and convey politeness/indirectness (e.g. would you mind,
could you help me) were key features of the 3,919-word spoken sample.
These findings were further confirmed in a data comparison with the British
National Corpus (BNC). In terms of strategy choice, greater social distance
and imposition tended to produce the organisational pattern: apology (for
imposition) + grounder + request or request + grounder e.g. I’m sorry to
bother you, because I have personal reason, could I ask for extension for my
assignment? For interactions with smaller social distance, learners mainly
opted for preparator (e.g. compliment) + request e.g. You are good at com-
puter skills; can you design our group’s presentation?

Employing a similar approach to the previous chapters, the SPACE data
were analysed in the following ways. Firstly, all the requests and apologies
were examined quantitatively through WordSmith Tools 6.0 (Scott 2015)
to identify the most frequent words. Frequency counts were obtained for
both the request and apology data, in addition to separating the learner
and native-speaker data. Secondly, AntConc (Anthony 2017) was used to
analyse all the requests and apologies to identify the most frequent two-
to four-word chunks which cluster together for evidence of formulaicity of
language patterns. As described in chapter 2, most chunks appear between
two and four words in length (McCarthy 2010). For the purposes of this
chapter this seemed a reasonable range to capture both shorter chunks such
as excuse me, and longer stretches typical of more formal requests such as
the bi-clausal I was wondering if, both of which can form essential parts
of the request speech act. AntConc also provided useful concordance lines
to illustrate specific examples of how the language functioned in context.
Finally, the data were analysed qualitatively in order to examine how their
construction in terms of strategy use contributed to their success. This chapter
does not include any comparisons of the SPACE corpus to larger corpora
such as the multi-million-word British National Corpus (BNC). A
reference corpus needs to be at least reasonably comparable and, given the
very specific nature of SPACE and the much more generalised nature of the
BYU-BNC, we felt comparisons would have little to tell us.
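The chunk analysis described above can be approximated outside AntConc. The following is a minimal sketch only, assuming a simple lower-case tokenisation and that chunks are counted within, never across, individual responses. The function names and the three sample responses are illustrative and are not taken from the SPACE data files or from AntConc itself.

```python
from collections import Counter
import re

def tokenize(text):
    """Lower-case a transcript and split it into word tokens (assumed scheme)."""
    return re.findall(r"[a-z']+", text.lower())

def chunk_counts(responses, n_min=2, n_max=4):
    """Count contiguous n-word chunks inside each response, for n_min..n_max."""
    counts = {n: Counter() for n in range(n_min, n_max + 1)}
    for response in responses:
        tokens = tokenize(response)
        for n in counts:
            # Slide a window of n tokens across the response.
            for i in range(len(tokens) - n + 1):
                counts[n][" ".join(tokens[i:i + n])] += 1
    return counts

# Illustrative responses modelled on examples quoted in this chapter.
responses = [
    "Excuse me. Could you help me find a new place to live?",
    "Hi. Could you help me book a study room please?",
    "Excuse me, I was wondering if you could help me.",
]
chunks = chunk_counts(responses)
for n in (4, 3, 2):
    print(n, chunks[n].most_common(3))
```

Counting within responses keeps a chunk such as help me from being formed across the boundary between two speakers, which seems the safer choice for short elicited turns.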
Given the corpus responses had already been evaluated as successful, and
in addition to avoiding the pitfalls of conducting native-speaker comparisons
as discussed in chapter 1, the main focus of the analysis here is to identify
the successful sociopragmatic and pragmalinguistic components from the
learners’ request and apology data. Where references to the native-speaker
data are made, this will be to confirm the learner data or offer alternate ways
success may be achieved. The aim is that by isolating successful features in
these ways, practitioners and other stakeholders may benefit from model
language to include in curricula or use as a basis for further research.
5.5 Pragmalinguistic features of successful request language
Tables 5.2–5.4 illustrate word frequency lists from the request responses.
These are first viewed with regards to raw frequency lists, and subsequently
examined to identify the most frequent two-, three- and four-word chunks
for evidence of occurrences of formulaic language. To some extent the tokens
in both analyses will reflect language specific to the context of the scenarios
so the subsequent discussions will focus on frequencies most relevant to
request head acts, or internal and external modification as described earlier.
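As a rough illustration of how the columns in a raw frequency list can be derived, the sketch below computes a word's frequency, its share of all tokens, the number of responses ('texts') it occurs in, and the share of responses containing it. It assumes that the second percentage column reports the proportion of texts containing the word; the function name, the tokenisation and the sample responses are hypothetical and do not reproduce WordSmith Tools output.

```python
from collections import Counter
import re

def frequency_table(responses, top=5):
    """Return (word, freq, freq %, texts, texts %) rows, most frequent first."""
    token_lists = [re.findall(r"[a-z']+", r.lower()) for r in responses]
    freq = Counter(tok for toks in token_lists for tok in toks)
    # Count each word once per response to get the number of texts it occurs in.
    texts = Counter(tok for toks in token_lists for tok in set(toks))
    total_tokens = sum(freq.values())
    rows = []
    for word, f in freq.most_common(top):
        rows.append((
            word.upper(),
            f,
            round(100 * f / total_tokens, 2),
            texts[word],
            round(100 * texts[word] / len(responses), 2),
        ))
    return rows

# Illustrative responses modelled on examples quoted in this chapter.
responses = [
    "I want to book a study room please. Can you help me?",
    "Hi. Could you help me book a study room please?",
    "Excuse me, could I have more time?",
]
rows = frequency_table(responses)
for row in rows:
    print(row)
```

Reporting the texts column alongside raw frequency helps to show whether a word is spread across many responses or concentrated in a few.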
Table 5.2 The twenty most frequent words in the learner and native speaker request data
     Learner Request Data                        Native Speaker Request Data
N    Word       Freq.   %      Texts   %         Word        Freq.   %      Texts   %
1    I          1,245   7.81   479     87.41     I           240     9.00   67      89.33
2    TO         899     5.64   458     83.58     TO          114     4.27   56      74.67
3    YOU        768     4.82   468     85.40     YOU         83      3.11   55      73.33
4    ME         603     3.78   411     75.00     ME          67      2.51   46      61.33
5    MY         365     2.29   265     48.36     IT          64      2.40   43      57.33
6    THE        338     2.12   219     39.96     A           63      2.36   43      57.33
7    CAN        325     2.04   252     45.99     COULD       56      2.10   43      57.33
8    A          296     1.86   214     39.05     IF          52      1.95   39      52.00
9    SOME       289     1.81   210     38.32     THE         50      1.87   29      38.67
10   AND        258     1.62   186     33.94     AND         41      1.54   27      36.00
11   HAVE       246     1.54   198     36.13     HAVE        38      1.42   25      33.33
12   SO         241     1.51   187     34.12     MY          35      1.31   30      40.00
13   IT         231     1.45   186     33.94     BOOK        32      1.20   27      36.00
14   BOOK       225     1.41   147     26.82     REALLY      32      1.20   22      29.33
15   SORRY      221     1.39   200     36.50     HI          31      1.16   31      41.33
16   THIS       215     1.35   167     30.47     PLEASE      31      1.16   28      37.33
17   FINISH     208     1.31   156     28.47     EXCUSE      30      1.12   30      40.00
18   WANT       199     1.25   155     28.28     WONDERING   29      1.09   27      36.00
19   COULD      192     1.21   162     29.56     FOR         28      1.05   22      29.33
20   TIME       182     1.14   121     22.08     THAT        27      1.01   18      24.00
The data indicate that both learners and native speakers achieve success
through the use of the modal verbs can (NNS) and could (NNS and NS)
which express conventional indirectness in the request head act. The learner
data suggest both modals are used interchangeably, though can is the preferred
choice (see figure 5.4). Want also features highly in the ranking but,
on closer inspection, tends to be isolated to transactional exchanges where
services are being provided (library, accommodation office) so increased
directness is less likely to affect the outcome of the request, and the direct
head act is almost always mitigated by supportive external modification
devices such as alerters or internally modified by please. In contrast, NS use
both can and want infrequently. Typical of native-speakers’ requests but not
evident in the learner data are the use of please as a closing and really as an
emphatic marker to boost the urgency of the request, show appreciation,
or as an intensifier to an apology, mitigating the imposition of the request.
<R01S12Ch> I want to find out how to book a study room please. Can you help me? (NNS)
<R01S09Na> Hi. Could you help me book a study room please? (NS)
Figure 5.4 NNS and NS head act requests

Table 5.3 The twenty most frequent four-, three- and two-word chunks in the learner request data
Rank Four-word chunks (freq.) Three-word chunks (freq.) Two-word chunks (freq.)
1 BOOK A STUDY ROOM (117) I WANT TO (234) I HAVE (298)
2 MORE TIME TO FINISH (102) TO FINISH MY (138) I WANT (288)
3 TO FINISH MY HOMEWORK (95) A STUDY ROOM (134) EXCUSE ME (283)
4 TO BOOK A STUDY (91) FINISH MY HOMEWORK (127) COULD YOU (258)
5 TIME TO FINISH MY (69) BOOK A STUDY (121) WANT TO (252)
6 HOW TO BOOK A (60) MORE TIME TO (120) CAN YOU (210)
7 GIVE ME MORE TIME (59) TIME TO FINISH (110) THANK YOU (207)
8 A NEW PLACE TO (56) YOU HELP ME (109) HELP ME (198)
9 FIND A NEW PLACE (56) TO BOOK A (106) TO FINISH (195)
10 COULD YOU HELP ME (53) EXCUSE ME I (103) GIVE ME (188)
11 CAN YOU HELP ME (52) YOU GIVE ME (94) FINISH MY (181)
12 NEW PLACE TO LIVE (51) HOW TO BOOK (88) MY HOMEWORK (179)
13 ME MORE TIME TO (47) HELP ME TO (81) STUDY ROOM (161)
14 CAN YOU GIVE ME (45) A NEW PLACE (74) MORE TIME (160)
15 COULD YOU GIVE ME (44) UNTIL NEXT WEEK (74) AND I (159)
16 YOU HELP ME TO (43) I HAVE SOME (72) I NEED (156)
17 TO FIND A NEW (41) ME MORE TIME (68) TIME TO (146)
18 NEED MORE TIME TO (40) FIND A NEW (66) TO BOOK (145)
18 SORRY TO BOTHER YOU (40) GIVE ME MORE (62) BOOK A (143)
20 I NEED MORE TIME (39) NEW PLACE TO (61) CAN I (140)
Table 5.4 The twenty most frequent four-, three- and two-word chunks in the native-speaker request data
Rank Four-word chunks (freq.) Three-word chunks (freq.) Two-word chunks (freq.)
1 I WAS WONDERING IF (17) I WAS WONDERING (21) EXCUSE ME (27)
2 BOOK A STUDY ROOM (12) IF YOU COULD (19) THANK YOU (26)
3 WAS WONDERING IF YOU (9) WAS WONDERING IF (17) AND I (25)
4 WONDERING IF YOU COULD (9) IF I COULD (15) I WAS (25)
5 TO BOOK A STUDY (9) A STUDY ROOM (13) WONDERING IF (24)
6 WONDERING IF I COULD (8) BOOK A STUDY (12) IF I (23)
7 I JUST WANTED TO (6) EXCUSE ME I (11) IF YOU (22)
8 IF I COULD HAVE (6) WONDERING IF YOU (11) I HAVE (21)
9 WANTED TO KNOW IF (6) TO BOOK A (10) WAS WONDERING (21)
10 WAS WONDERING IF I (6) WONDERING IF I (10) I COULD (20)
11 HOW TO DO IT (5) PROBLEMS WITH MY (9) YOU COULD (20)
12 JUST WANTED TO KNOW (5) TO KNOW IF (9) BUT I (18)
13 KNOW IF I COULD (5) BE ABLE TO (8) HELP ME (16)
14 THANK YOU VERY MUCH (5) WANTED TO KNOW (8) BOOK A (14)
15 TO KNOW IF I (5) I HAVE A (7) I NEED (14)
16 A FEW PROBLEMS WITH (4) I NEED TO (7) ME I (14)
17 A NEW PLACE TO (4) I REALLY NEED (7) A STUDY (13)
18 AND I WAS WONDERING (4) HOW TO DO (6) STUDY ROOM (13)
18 COULD YOU HELP ME (4) I COULD HAVE (6) THAT I (13)
20 FEW PROBLEMS WITH ME (4) I JUST WANTED (6) TO DO (13)
The data further suggest the commonality of formulaic language, within
different positions of the request, as a key component to their success. Both
data sets evidence excuse me to be a popular initial, highly appropriate and
face-saving opening (alerter) to seek the listener’s attention. This is outranked
in the learner data only by sorry which most commonly collocates
in expressions such as sorry to bother you, sorry to interrupt you, sorry to dis-
turb you, to apologise for the imposition prior to the request itself. NS utilise
hi more frequently as an alternative alerter, perhaps illustrating a greater lin-
guistic sensitivity (and confidence to carry this out) in the lower social distance
scenarios. At the same time, collocations featuring if as part of the request
head act are commonly produced by native speakers, e.g. if you/I could, won-
dering if, wanted to know if, but are not as frequent in the learner data. This
suggests, whilst successful, learners may have a more limited range of formu-
laic expressions at their disposal. These findings are confirmed to some extent
when examining the most frequent clusters, as illustrated in tables 5.3–5.4.
These analyses suggest, perhaps unsurprisingly, that of all the possible com-
ponents of a request, the most frequent chunks are those used to construct the
head act. In the learner data this most often comprises past or present modals,
e.g. Could you help to book a study room; Can you help me find a new place;
Could you give me more time to finish my homework. In contrast, NS rely
more heavily on the bi-clausal structure I was wondering if I/you could. Both
data sets also feature the direct verb want but in slightly different ways (see
figure 5.5). For instance, I want features frequently as a direct request in the
learner data but, as mentioned previously, is often accompanied by an appro-
priate mitigator such as please, and used most often in service exchanges.
NS, however, seem more sensitised to the directness of this verb as its force is
generally softened both with the internal modifier just and use of a past tense
verb form, as in the example I just wanted to know if.
In addition to the core request, alerters are also common chunks from both
data sets, notably excuse me being the most frequent two-word chunk, along-
side sorry to bother you, as the four-word chunk (learner data only). Accord-
ing to the CEFR, learners should be able to express themselves appropriately
in social interaction and ‘avoid crass errors of formulation’ (2001: 122). One
way of interpreting this is that learners are able to make appropriate linguistic
choices, within the sociocultural and contextual boundaries of the interaction,
which do not have a negative effect on the listener. The request data
discussed in section 5.5 have illustrated how this can be achieved. Section 5.6
will now consider this in light of the speech act of apology.
<R05S15Ch> Hi sir. I want to study but the students are too noisy. Could you tell them please?
(NNS)
<R05S08Na> Hi. I just wanted to know if you could please go and talk to those students and ask
them to keep the noise down a bit. I’d really appreciate it. (NS)
Figure 5.5 Use of want in the NNS and NS request data
5.6 Pragmalinguistic features of successful apology language
It is not unexpected that sorry occurs frequently in both the NNS and NS
data (see table 5.5) and is probably the earliest learned L1 and L2 token
for expressing regret. Interestingly, learners rely heavily on so as the pre-
ferred adverbial, acting as an intensifier, whilst NS opt for really. The latter
is commonly repeated a number of times within the NS apologies perhaps
to foreground a substantive, rather than ritualistic apology, as differentiated
by Goffman (1971). As presented in figure 5.6, both data sets also reveal
the use of can as a means of suggesting how to repair the situation (NS) or
requesting assistance in repairing it (NNS).
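The contrast between so and really as intensifiers of sorry could be tallied with a simple pattern search. The sketch below is one possible approach rather than the procedure used in this chapter. The shortlist of intensifiers and the function name are assumptions, and the sample responses are modelled on examples quoted in this chapter rather than drawn from the SPACE data files.

```python
import re
from collections import Counter

# Hypothetical shortlist; the chapter itself discusses "so" and "really".
INTENSIFIERS = ("so", "really", "very", "terribly")

def intensifiers_before_sorry(responses):
    """Count intensifier runs directly preceding 'sorry' (e.g. 'really really')."""
    pattern = re.compile(
        r"\b((?:(?:%s)\W+)+)sorry\b" % "|".join(INTENSIFIERS), re.IGNORECASE
    )
    counts = Counter()
    for response in responses:
        for match in pattern.finditer(response):
            # Strip punctuation so 'really, really' and 'really really' match.
            words = re.findall(r"[a-z]+", match.group(1).lower())
            counts[" ".join(words)] += 1
    return counts

# Illustrative responses modelled on examples quoted in this chapter.
responses = [
    "I'm so sorry about the book. Can you tell me what to do now?",
    "I'm really sorry about the book. I can pay for the damage.",
    "Hi I'm really, really sorry for being late this week.",
]
counts = intensifiers_before_sorry(responses)
print(counts)
```

Treating a repeated intensifier as a single run keeps really really sorry distinct from really sorry, which matters if repetition is read as a signal of a substantive apology.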
Table 5.5 The twenty most frequent words in the learner and native-speaker apology data
     Learner Apology Data                        Native Speaker Apology Data
N    Word       Freq    %      Texts   %         Word        Freq    %      Texts   %
1    I          849     11.59  186     98.94     I           309     11.64  60      100.00
2    TO         260     3.55   131     69.68     IT          107     4.03   43      71.67
3    SORRY      243     3.32   153     81.38     TO          90      3.39   48      80.00
4    SO         225     3.07   116     61.70     REALLY      79      2.98   36      60.00
5    YOU        221     3.02   112     59.57     SORRY       66      2.49   43      71.67
6    AND        150     2.05   98      52.13     YOU         59      2.22   32      53.33
7    MY         132     1.80   90      47.87     THAT        57      2.15   37      61.67
8    THE        128     1.75   83      44.15     THE         53      2.00   32      53.33
9    IT         113     1.54   68      36.17     AND         51      1.92   34      56.67
10   ABOUT      112     1.53   76      40.43     A           42      1.58   32      53.33
11   THAT       108     1.47   77      40.96     FOR         32      1.21   20      33.33
12   A          97      1.32   74      39.36     JUST        30      1.13   20      33.33
13   HAVE       91      1.24   63      33.51     BUT         29      1.09   23      38.33
14   FOR        86      1.17   63      33.51     HI          29      1.09   29      48.33
15   THIS       85      1.16   59      31.38     BOOK        28      1.05   24      40.00
16   APOLOGISE  81      1.11   70      37.23     THIS        27      1.02   20      33.33
17   ME         81      1.11   63      33.51     ABOUT       26      0.98   20      33.33
18   CAN        77      1.05   49      26.06     IF          26      0.98   13      21.57
19   BOOK       71      0.97   40      21.28     CAN         25      0.94   17      28.33
20   WILL       70      0.96   53      28.19     KNOW        25      0.94   16      26.67
<A07S15Ch> I’m so sorry about the book. Can you tell me what to do now? (NNS)
<A08S08Na> I’m really sorry about the book. I can pay for the damage. (NS)
Figure 5.6 Use of can as a repair strategy in the NNS and NS apology data
A feature of the learner word list not evident in the NS word list (table 5.5) is the
performative verb apologise, though it is much less frequent than its counterpart sorry (see
table 5.6). The data reveal the instances of apologise are generally limited
to high social distance-high imposition scenarios, hence their acceptability
in these situations. As identified in the request data, NS apologies tend to
incorporate the use of just as a mitigator to the main verb, e.g. just wondering
if, just want to know if, just been so busy, just didn’t have enough time
(see table 5.7).
Confirming initial examinations of the raw frequency lists, explicit expres-
sions of regret are common to both the NNS and NS sets but these are
realised in different ways: a combination of (so) sorry about (that) dominates
the NNS apologies whilst NS opt for a combination of really (really) sorry
about (that). For both data sets this is generally followed by a description
of the context of the apology, and often concluded with direct offers to
repair the situation using I will or I can (see figure 5.7), in addition to some
requests for help to make amends, which feature in the NNS set only:
Can you help me . . . /What should I do . . .
In comparison to the examples in figure 5.7, there are fewer occurrences
of chunks containing apologise. NNS experiment with several collocations,
in the order of frequency: apologise to; I apologise for; want to/have to apol-
ogise, whereas NS most commonly rely on the sequence want to apologise
for. Figure 5.8 provides an example of several instances of say apologise to
you from the NNS data, where grammatical and syntactic errors are evident,
suggesting that inaccuracies may not always affect the success of a formu-
laic sequence, as noted in other studies (Halenko 2016; Halenko and Jones
2011; Halenko and Jones forthcoming).
<A10S07Jp> Hi I’m sorry I’ve been late to your class every day in this week. I missed the bus
and train every day because I woke up late um I’m so sorry about that. I will make effort to wake
up earlier from next time. (NNS)
<A09S12Na> Hi I’m really, really sorry for being late this week I’ve been ill I’ve had a lot of
assignments erm, I’ll do better next week it’s just it’s just been a really, really tough week. (NS)
Figure 5.7 Common chunks in NNS and NS apology data
<A07S02Sa> Er hello excuse me just I want to say apologise to you about the noise because the
party with my friend er but I promise you that this last time I do it so I apologise to you and the
student.
Figure 5.8 Grammatically inaccurate chunks in the NNS apology data

Table 5.6 The twenty most frequent four-, three- and two-word chunks in the learner apology data
Rank Four-word chunks (freq.) Three-word chunks (freq.) Two-word chunks (freq.)
1 SO SORRY ABOUT THAT (21) SORRY ABOUT THAT (44) SO SORRY (88)
2 SORRY ABOUT THAT I (13) I HAVE TO (27) I HAVE (74)
3 I HAD A PARTY (12) I WANT TO (26) SORRY ABOUT (62)
4 SPLIT SOME COFFEE ON (12) SO SORRY ABOUT (26) SO I (57)
5 HAD A PARTY AT (8) SO SORRY I (22) AND I (56)
6 HOW ARE YOU DOING (7) I HAD A (16) I WILL (56)
7 I FORGOT TO COLLECT (7) HAD A PARTY (15) SORRY I (52)
8 REALLY SORRY ABOUT THAT (7) HOW ARE YOU (15) ABOUT THAT (50)
9 SAY APOLOGISE TO YOU (7) ABOUT THAT I (14) I CAN (39)
10 THANK YOU SO MUCH (7) APOLOGISE TO YOU (14) I WANT (38)
11 WHAT SHOULD I DO (7) SPILT SOME COFFEE (14) REALLY SORRY (37)
12 A PARTY AT MY (6) EXCUSE ME I (13) I AM (34)
13 CAN YOU HELP ME (6) SOME COFFEE ON (13) TO YOU (34)
14 HI HOW ARE YOU (6) I AM SORRY (12) THANK YOU (33)
15 I HAVE SPILT SOME (6) IN MY FLAT (11) BUT I (31)
16 I HAVE TO APOLOGISE (6) A PARTY AT (10) EXCUSE ME (30)
17 A BOOK FROM YOU (5) COMPLETE MY ESSAY (9) WANT TO (29)
18 A PARTY IN MY (5) I APOLOGISE FOR (9) HAVE TO (28)
18 FLAT WITH MY FRIENDS (5) REALLY SORRY ABOUT (9) FIND IT (27)
20 HAVE SPILT SOME COFFEE (5) SORRY I HAVE (9) ARE YOU (26)
Table 5.7 The twenty most frequent four-, three- and two-word chunks in the native speaker apology data
Rank Four-word chunks (freq.) Three-word chunks (freq.) Two-word chunks (freq.)
1 REALLY SORRY ABOUT THAT (6) REALLY REALLY SORRY (10) REALLY SORRY (39)
2 SPLIT SOME COFFEE ON (6) REALLY SORRY ABOUT (10) AND I (30)
3 TO COLLECT A PACKAGE (6) REALLY SORRY I (10) FIND IT (20)
4 I FORGOT TO COLLECT (5) SPLIT SOME COFFEE (8) THANK YOU (20)
5 I JUST WANT TO (5) I FORGOT TO (7) THAT I (20)
6 REALLY SORRY BUT I (5) I JUST WANT (7) BUT I (19)
7 SORRY FOR BEING LATE (4) COLLECT A PACKAGE (6) I CAN (18)
8 WANT TO APOLOGISE FOR (4) SOME COFFEE ON (6) I WAS (17)
9 A PACKAGE FROM THE (3) SORRY ABOUT THAT (6) SORRY I (17)
10 ACCIDENTALLY SPILT SOME COFFEE (3) TO APOLOGISE FOR (6) HI I (13)
11 COLLECT A PACKAGE FROM (3) TO COLLECT A (6) I JUST (13)
12 FORGOT TO COLLECT A (3) WANT TO APOLOGISE (6) IT I (12)
13 HI I JUST WANT (3) BE ABLE TO (5) REALLY REALLY (12)
14 I BE ABLE TO (3) FIND IT AT (5) SORRY ABOUT (12)
15 I BORROWED A BOOK (3) FIND IT I (5) A PACKAGE (11)
16 I BORROWED FROM YOU (3) FOR BEING LATE (5) THIS WEEK (10)
17 I JUST WANTED TO (3) FORGOT TO COLLECT (5) ABOUT THAT (9)
18 I WAS WONDERING IF (3) JUST WANT TO (5) I FORGOT (9)
18 IT AT THE MOMENT (3) REALLY SORRY BUT (5) I KNOW (9)
20 REALLY REALLY SORRY ABOUT (3) SORRY BUT I (5) I WILL (9)
5.7 Sociopragmatic features of successful request and apology language
Examining the SPACE corpus qualitatively allows for a closer inspection of
the moves and organisational patterns considered requisite for a success-
ful request or apology. Unlike the evidence of pragmalinguistic variation
between the learner and native-speaker data that has been presented, there
is less variation with regards to adopting appropriate strategies and the
organisation of these for both requests and apologies. Regarding requests,
the most common strategy choice is that of alerter + request head act, and
organised in this sequence. In the scenarios requiring greater sensitivity to
either the context or interlocutor, additional external moves, incorporat-
ing an explanation in the pattern alerter + request head act + explanation,
are evident. Successful apologies for NNS and NS tend to adopt explicit
expressions of regret, often accompanied by an appropriate intensifier,
as the main strategy. The main apology also tends to be enhanced by at
least one other strategy and, depending on circumstance, may include an
offer of repair or explanation. Overall, an apology might look like this:
Explicit expression of regret + intensifier + strategy (e.g. offer of repair
or explanation). The examples in figure 5.9 illustrate that the learners
have achieved the CEFR’s descriptors of sociolinguistic appropriateness by
observing polite expression and using a suitable register which is context-
and interlocutor-appropriate.
An exception to this is in the data from Chinese learners where requests
are often foregrounded by firstly justifying why the request is needed
(Grounder) before producing the request head act itself, such as: I’ve been
sick these last days (Grounder). Could you give me more time to finish my
essay? (request). This construction, termed the because-therefore pattern (Yu
1999; Zhang 1995), is a typical L1 strategy used to convey politeness, but it
is the reverse of what NS most commonly produce, where the request precedes
the grounder, as in the example: Could I have an extension on the essay?
I’ve been in bed all week with flu (see figure 5.10). Nevertheless, as with the
grammatical inaccuracies, this does not seem to have affected the success of
the responses.
<R06S01Ch> Excuse me. Could you help me find a new place to live? I have some problems
with my accommodation.
<R03S12Jp> I’m so sorry about that. What can I do about this?
Figure 5.9 Examples of NNS strategies and organisation in the SPACE data
<R04S12Ch> I really need this book to finish my essay so could you help me to borrow it until
next week? (NNS)
<R04S12Na> Hi. Is it possible for me to borrow this book until next week because I need it for
my homework? (NS)
Figure 5.10 Examples of the because-therefore, therefore-because patterns in the request data
5.8 Conclusion
In summary, the SPACE corpus reveals that the learner and native-speaker
request and apology realisations share many common features. What is
most noteworthy is that, although the learner data suggest that the repertoire
of request and apology expressions and strategies is less varied, learners
need not, in fact, have a great pragmalinguistic range in order to be
considered successful. Instead, they can rely on a smaller range of polite core
requests and alerters when formulating requests, and explicit apologies with
intensifiers when apologising, whose forms need not always be grammati-
cally accurate to be successful. Similarly, simply by observing basic request
and apology moves which incorporate appropriate alerters and head acts
(requests) and explicit apologies with attempts at repairing the situation
(apologies), learners are able to successfully show appropriate socioprag-
matic awareness even though disparities in organisational patterns may exist
and choice of strategy may differ from that of their NS counterparts. In addition, as
outlined in chapter 1, the frequency of formulaic expressions within the
SPACE corpus lends support to research claims concerning their importance
for effective communication (Pawley and Syder 1983; Schmitt 2004; Wray
2008), and underlines how much formulaic language can be found in speech
acts such as requests and apologies (Aijmer 1996; Wang 2011). The fact that
the learners did not always have native-like command of these formulaic
expressions, yet their responses were still judged successful on the basis of
the raters’ scores, raises two interesting points. Firstly, there does appear to be a social
expectation that conventionalised expressions are used, as proposed by Kecskes
(2000). The influence of formulaic expressions is further underlined in
that simply attempting to employ them, even when the result was not
syntactically or grammatically correct, was still viewed positively, as reflected
in the raters’ high scores and the inclusion of the responses in the SPACE corpus. The
second point concerns the perception of pragmatic competence versus
linguistic competence for successful communication. As highlighted by
Bardovi-Harlig and Dörnyei’s (1998) empirical investigation, ESL tutors in the host
community overwhelmingly favoured pragmatic ability over grammatical
ability, whilst the reverse was true for EFL tutors in the at-home
environment. This view of the importance of pragmatic awareness seems to hold
for the present data and the examples presented above. Thomas’s (1983: 96–97)
seminal paper, seen as a driver of pragmatic studies, provides an emphatic
distinction between the consequences of pragmatic and grammatical errors
and underlines the importance of pragmatic awareness for successful spo-
ken English: ‘while grammatical error may reveal a speaker to be a less than
proficient language user, pragmatic failure reflects badly on him/her as a per-
son’. Overall, this review of what constitutes pragmatic success for requests
and apologies may be a good starting point for learners and practitioners to
enhance sociopragmatic and pragmalinguistic awareness, particularly when
preparing for interactions with members of an academic community.
References
Anthony, L. 2017. AntConc. (Version 3.4.4). [computer software] Tokyo, Japan:
Waseda University. Available from: <www.laurenceanthony.net/> [Accessed 2 March
2017].
Aijmer, K. 1996. Conventional routines in English: Convention and creativity. New
York: Longman.
Austin, J. 1962. How to do things with words. Oxford: Oxford University Press.
Bachman, L.F. 1990. Fundamental considerations in language testing (Oxford
applied linguistics). 5th ed. Oxford: Oxford University Press, USA.
Bachman, L.F. and Palmer, A.S. 1982. The construct validation of some components
of communicative proficiency. TESOL Quarterly, 16(4), 449–465.
Bachman, L.F. and Palmer, A.S. 1996. Language testing in practice: Designing and
developing useful language tests. New York: Oxford University Press.
Bardovi-Harlig, K. 2001. Evaluating the empirical evidence: Grounds for instruction
in pragmatics. In K. R. Rose and G. Kasper, eds. Pragmatics in language teaching.
Bardovi-Harlig, K. 2009. Conventional expressions as a pragmalinguistic resource:
Recognition and production of conventional expressions in L2 pragmatics. Lan-
guage Learning, 59(4), 755–795.
Bardovi-Harlig, K. and Hartford, B.S. 1990. Congruence in native and nonnative
conversations: Status balance in the academic advising session. Language
Learning, 40(4), 467–501.
Barron, A. 2002. Acquisition in interlanguage pragmatics: Learning how to do things
with words in a study abroad (Pragmatics and beyond new series). Philadelphia,
PA: John Benjamins Publishing Co.
Barron, A. 2003. Acquisition in interlanguage pragmatics: Learning how to do things
with words in a study abroad context. Amsterdam: John Benjamins.
Bergman, M.L. and Kasper, G. 1993. Perception and performance in native and non-
native apology. In: S. Blum-Kulka and G. Kasper, eds. Interlanguage pragmatics.
Oxford: Oxford University Press, 82–108.
Blum-Kulka, S. 1982. Learning to say what you mean in a second language: A study
of the speech act performance of learners of Hebrew as a second language.
Applied Linguistics, III(1), 29–59.
Blum-Kulka, S. 1983. Interpreting and performing speech acts in a second language:
A cross-cultural study of Hebrew and English. In N. Wolfson and E. Judd, eds.
Sociolinguistics and Language Acquisition. Rowley: Newbury House, 36–55.
Blum-Kulka, S., House, J. and Kasper, G. 1989. Cross-cultural pragmatics: Requests
and apologies. Norwood, NJ: Ablex Pub. Corp.
Blum-Kulka, S. and Olshtain, E. 1984. Requests and apologies: A cross-cultural study
of speech act realization patterns (CCSARP). Applied Linguistics, 5(3), 196–213.
Brown, P. and Levinson, S.C. 1987. Politeness: Some universals in language usage.
Canale, M. 1983. From communicative competence to communicative language ped-
agogy. In J. C. Richards and R. W. Schmidt, eds. Language and communication.
New York: Longman, 2–27.
Canale, M. and Swain, M. 1980. Theoretical bases of communicative approaches to
second language teaching and testing. Applied Linguistics, I(1), 1–47.
Cheng, W. 2004. //a did you TOOK// a from the MINIbar//: What is the practical
relevance of a corpus-driven language study to practitioners in Hong Kong’s hotel
industry? In U. Connor and T.A. Upton, eds. Discourse in the professions.
Amsterdam: John Benjamins, 141–166.
Chomsky, N. 1965. Aspects of the theory of syntax. Cambridge: The MIT Press.
Cobb, T. 2013. Compleat Lexical Tutor (LexTutor) [Online]. Available from:
www.lextutor.ca/ [Accessed 21 February 2017].
Cohen, A. 2008. Teaching and assessing L2 pragmatics: What can we expect from
learners? Language Teaching, 41(02), 213–235.
Council of Europe 2001. Common European framework of reference for languages:
Learning, teaching, assessment. Cambridge: Cambridge University Press.
Crystal, D. 1997. English as a global language. Cambridge, England: Cambridge
University Press.
Dalmau, M.S.I and Gotor, H.C. 2007. From “Sorry very much” to “I’m ever so
sorry”: Acquisitional patterns in L2 apologies by Catalan learners of English.
Intercultural Pragmatics, 4(2), 287–315.
Dörnyei, Z. and Bardovi-Harlig, K. 1998. Do language learners recognize pragmatic
violations? Pragmatic versus grammatical awareness in instructed L2 learning.
TESOL Quarterly, 32(2), 233–259.
Edmondson, W. 1981. Spoken discourse: A model for analysis. London: Longman
Higher Education.
Eslami, Z. and Eslami-Rasekh, A. 2008. Enhancing the pragmatic competence of
non-native English-speaking Teacher Candidates (NNESTCs) in an EFL context.
In E. Alcon Soler and A. Martinez-Flor, eds. Investigating pragmatics in foreign
language learning. Bristol: Multilingual Matters, 178–197.
Fillmore, C.J. 1979. On fluency. In D. Kempler and W.S.Y. Wang, eds. Individual dif-
ferences in language ability and language behaviour. New York: Academic Press,
85–102.
Geluykens, R. and Kraft, B. 2008. The use(fulness) of corpus research in cross-cultural
pragmatics: Complaining in intercultural service encounters. In: J. Romero-Trillo,
ed. Pragmatics and corpus linguistics, 1st ed. Berlin: Mouton de Gruyter, 93–118.
Goffman, E. 1971. The presentation of self in everyday life. United Kingdom:
Penguin Books.
Halenko, N. 2016. Evaluating the explicit pragmatic instruction of requests and
apologies in a study abroad setting: The case of Chinese ESL learners at a UK
Higher Education Institution. Unpublished PhD thesis. Lancaster University.
Halenko, N. and Jones, C. 2011. Teaching pragmatic awareness of spoken requests
to Chinese EAP learners in the UK: Is explicit instruction effective? System, 39(2),
240–250.
House, J. 1988. “Oh excuse me please . . .”: Apologizing in a foreign language. In
B. Kettemann, P. Bierbaumer, A. Fill and A. Karpf, eds. Englisch als Zweitsprache.
Tübingen: Narr, 303–327.
Hymes, D. 1972. Models of interaction of language and social life. In J. J. Gumperz
and D. H. Hymes, eds. Directions in sociolinguistics: The ethnography of commu-
nication. New York: Holt, Rinehart and Winston, 35–71.
Johnson, N. H. and de Haan, J. 2013. Strategic interaction 2.0. International Journal
of Strategic Information Technology and Applications, 4(1), 49–62.
Jones, C. and Halenko, N. 2014. What makes a successful spoken request? Using
corpus tools to analyse learner language in a UK EAP context. Apples-Journal of
Applied Language Studies, 8(2), 23–41.
Kasper, G. 1989. Interactive procedures in interlanguage discourse. In W. Oleksy ed.,
Contrastive pragmatics. Amsterdam: Benjamins, 189–229.
Kasper, G. 1995. Pragmatics of Chinese as native and target language. Honolulu: Sec-
ond Language Teaching and Curriculum Center, University of Hawaii at Manoa.
Kasper, G. and Rose, K. R. 2002. Pragmatic development in a second language. Mal-
den, MA: Blackwell Publishers.
Kecskes, I. 2000. A cognitive-pragmatic approach to situation-bound utterances.
Journal of Pragmatics, 32(5), 605–625.
Kecskes, I. 2014. Intercultural Pragmatics. United States: Oxford University Press.
Leech, G. 1983. Principles of pragmatics. 12th ed. New York: Longman.
Maeshiba, N., Yoshinaga, N., Kasper, G. and Ross, S. 2006. Transfer and proficiency
in interlanguage apologizing. In S. Gass and J. Neu, eds. Speech acts across cul-
tures: Challenges to communication in a Second language. Germany: De Gruyter
Mouton, 155–187.
Martinez-Flor, A. 2008. The effect of inductive-deductive teaching approach to
develop learners’ use. In E. A. Soler, ed. Learning how to request in an instructed
language learning context. Switzerland: Peter Lang Pub, 191–226.
McCarthy, M. 2010. Spoken fluency revisited. English Profile Journal, 1(1), 1–15.
Nureddeen, F. 2008. Cross cultural pragmatics: Apology strategies in Sudanese Ara-
bic. Journal of Pragmatics, 40(2), 279–306.
Olshtain, E. and Blum-Kulka, S. 1984. Requests and apologies: A cross-cultural study
of speech act realization patterns (CCSARP). Applied Linguistics, 5(3), 196–213.
Olshtain, E. and Cohen, A. D. 1983. Apology: A speech act set. In N. Wolfson and E. Judd,
eds. Sociolinguistics and Language Acquisition. Rowley: Newbury House. 18–35.
Owen, M. 1983. Apologies and remedial interchanges: A study of language use in
social interaction. Germany: Mouton de Gruyter.
Pawley, A. and Syder, F. 1983. Two puzzles for linguistic theory: Nativelike selection
and nativelike fluency. In J. Richards and R. Schmidt, eds. Language and Com-
munication. London: Longman, 191–225.
Rintell, E. 1981. Sociolinguistic variation and pragmatic ability: A look at learners.
International Journal of the Sociology of Language, 27, 11–34.
Rose, K. 2005. On the effects of instruction in second language pragmatics. System,
33(3), 385–399.
Safont Jorda, M. 2004. An analysis on EAP learners’ pragmatic production: A focus
on request forms. Iberica, 8, 23–39.
Schauer, G. A. and Adolphs, S. 2006. Expressions of gratitude in corpus and DCT
data: Vocabulary, formulaic sequences, and pedagogy. System, 34(1), 119–134.
Schmitt, N. 2004. Formulaic sequences: Acquisition, processing, and use. Amsterdam:
John Benjamins Publishing Co.
Schmitt, N. and Carter, R. 2004. Formulaic sequences in action. In N. Schmitt, ed.
Formulaic sequences. Amsterdam: John Benjamins, 1–23.
Scott, M. 2015. WordSmith Tools.v.6.0.0.252. [Online]. Lexical Analysis Software
Ltd. Available from: <http://lexically.net/wordsmith/> [Accessed 25 February 2017].
Searle, J.R. 1969. Speech acts: An essay in the philosophy of language. London:
Cambridge University Press.
Searle, J.R. 1976. Indirect speech acts. In P. Cole and J. Morgan, eds. Syntax and
semantics 3; Speech Acts. New York: Academic Press, 59–82.
Sinclair, L. and Sinclair, J. 1991. Corpus concordance and collocation (describing
English language). 3rd ed. Oxford: Oxford University Press.
Taguchi, N. 2010. Longitudinal studies in interlanguage pragmatics. In A. Trosborg,
ed. Handbook of pragmatics vol.7: Pragmatics across languages and cultures. Ber-
lin: Mouton de Gruyter, 333–361.
Thomas, J. 1983. Cross-cultural pragmatic failure. Applied Linguistics, 4(2), 91–112.
Trosborg, A. 1987. Apology strategies in natives/non-natives. Journal of Pragmatics,
11(2), 147–167.
Trosborg, A. 1995. Interlanguage Pragmatics: Requests, complaints, and apologies.
Germany: De Gruyter Mouton.
Vyatkina, N. and Cunningham, D.J. 2015. Learner corpora and pragmatics. In S.
Granger, F. Meunier and G. Gilquin, eds. The Cambridge Handbook of Learner
Corpus Research. Cambridge: Cambridge University Press, 281–305.
Wang, V. 2011. Making requests by Chinese EFL learners. Netherlands: John Ben-
jamins Publishing Co.
Weinert, R. 1995. The role of formulaic language in second language acquisition: A
review. Applied Linguistics, 16(2), 180–205.
Wray, A. 2008. Formulaic language: Pushing the boundaries. Oxford: Oxford
University Press.
Wray, A. 2012. What do we (think we) know about formulaic language? An eval-
uation of the current state of play. Annual Review of Applied Linguistics, 32,
231–254.
Yu, M. 1999. Universalistic and culture-specific perspectives on variation in the
acquisition of pragmatic competence in a second language. Pragmatics Quarterly
Publication of the International Pragmatics Association (IPrA), 9(2), 281–312.
Zhang, Y. 1995. Indirectness in Chinese requesting. In G. Kasper, ed. Pragmatics
of Chinese as native and target language. Hawai’i: University of Hawai’i Press,
69–118.
Chapter 6
Conclusion
6.1 Introduction
This chapter will attempt to summarise the main findings of this book and
draw some conclusions and implications for both research and teaching.
In order to do so, section 6.2 gives a summary of findings for each aspect
of communicative competence we have described. We again separate each
competence for ease of analysis whilst acknowledging, as we have through-
out this book, that the different aspects of communicative competence do
overlap and interlock in many ways. We then discuss the implications for
research in section 6.3 and for teaching in section 6.4. Finally, we summarise
the limitations of the research and what we hope its implications will be.
6.2 Summary of findings
6.2.1 Linguistic competence

As noted, this chapter restricted its focus to the lexical aspects of
linguistic competence by exploring the most frequent words, the keywords and
the three- and four-word chunks used at each level. Comparisons were made
with the NS data and the LINDSEI corpus (Gilquin, De Cock and Granger
2010) where applicable. Findings show that B1, B2 and C1 learners are
comparable in their combined K-1 and K-2 token and family coverage. K-1 and
K-2 tokens together stood at 97%, whereas family coverage fell between 81%
and 84%. Fewer than 1 in 33 tokens came from another band. In addition,
learners do not turn to less frequent vocabulary, as measured by the K-1 and
K-2 bands, as proficiency develops. The 2,000 most frequent words are therefore
fundamental to success at each level, and the 20 most frequent words at B1, B2
and C1 also comprise approximately 40% of all speech. Knowledge of these words
is clearly vital for success. When these frequent words were compared
with the NS data and LINDSEI, there were some noticeable differences.
The pronoun we is used more frequently by the learners in the USTC cor-
pus. This item is used to perform a range of functions and express a number
of meanings which change across levels. At B1 and B2 levels, the use tends
to be focused on third parties, whereas at C1 level, the learners are also able
to use this to discuss concepts in a general sense.
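The coverage figures above can be made concrete with a short calculation. The Python sketch below is only an illustration under stated assumptions: band_coverage is a hypothetical helper, the two band lists are tiny stand-ins for real K-1 and K-2 lists (taken here to mean the first and second thousand most frequent word families), and the token string is invented rather than drawn from the USTC.

```python
def band_coverage(tokens, bands):
    """Return the share of tokens in each frequency band, plus an 'off-list' share."""
    totals = {name: 0 for name in bands}
    totals["off-list"] = 0
    for token in tokens:
        for name, words in bands.items():
            if token in words:
                totals[name] += 1
                break
        else:
            # The token matched none of the bands.
            totals["off-list"] += 1
    return {name: count / len(tokens) for name, count in totals.items()}


# Tiny hypothetical band lists, standing in for full K-1 and K-2 lists.
bands = {
    "K-1": {"i", "think", "we", "can", "the", "is"},
    "K-2": {"package", "apologise"},
}
tokens = "i think we can apologise the package is ubiquitous".split()
coverage = band_coverage(tokens, bands)
print(coverage)

# If K-1 and K-2 together cover 97% of tokens, 3% lie outside them,
# which is roughly one token in every 33.
print(1 / (1 - 0.97))
```

The last line shows where the "1 in 33" figure comes from: it is simply the reciprocal of the 3% of tokens not covered by the two bands.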
In addition, in the USTC, er and erm are very frequent at all levels although
hesitation via er does reduce as proficiency grows, which reflects changes
documented in CEFR descriptors of fluency. We suggested then that far from
being a mark of an unsuccessful speaker, hesitancy has important functions
as it signals to the listener that the speaker needs more time and it thus can
enable them to hold the floor and extend their own turn (see chapter 4 for
further discussion).
In terms of keywords, it was clear that the verb think is used throughout
in a range of patterns, illustrating how communicative routines for giving
opinions can be realised. Analysis revealed a variety of functions. It enables
learners to successfully sequence utterances, shift focus, express stance and
hedge language. Can also featured as a keyword and is used most often to dis-
cuss general possibility, with the focus shifting as levels increase. At B1, this
is used to talk about things from the speaker’s viewpoint (employing phrases
with I can) and at B2 and C1 levels, it is used to express more hypothetical
or general viewpoints (employing phrases with you can and we can). The use
of these keywords across the levels showed that success at higher proficiency
levels relates to the range of functions an individual word fulfils rather than a
learner simply using a broader range of words. Though increases in frequency
of lexical chunks are evident across the levels, there were no statistically sig-
nificant gains in usage as proficiency grew. Instead, it is clear that learners at
all levels favoured chunks such as I agree with you which can be used with a
range of functions such as expressing agreement or buying time, rather than
employing a range of chunks for one function. A number of chunks enable
learners at different levels to fulfil different aspects of communicative com-
petence. For example, we discussed how so yeah is used by B1 and C1 levels
to signal the closing of a turn as part of discourse competence in chapter 4.
6.2.2 Strategic competence

In identifying how strategic competence can contribute to the spoken success
of B1, B2 and C1 learners, this chapter analysed learner language for evi-
dence of the CEFR’s production and interaction strategy descriptors. With
some previous studies advocating a deficit view of communication strategies
(CSs), often concentrating only on learner-to-native speaker interaction, it
aimed to establish whether strategy use could act as a positive rather than
negative indicator of growing proficiency in both learner-to-native speaker
and learner-to-learner speech. A central finding was that not only was a
variety of strategies displayed, but their volume also increased as proficiency
grew, despite falls in the number of relevant can-do statements from B1 to C1.
Heavily weighted towards interaction at all three levels, the data established
that CSs relate to typical ‘problems’ or gaps in learner language but these do
not dominate successful speech at all; it is the strategies used to maintain,
extend, clarify and preface speech in jointly constructed interaction which
shine through. Arguments that CSs can boost and maximise perceptions of
learners’ overall communicative competence are to some extent supported
in this chapter.
Concerning production strategies, all learners exhibited an ability to
correct or reformulate their speech. Potentially seen as a shortcoming or
disfluency in learner language, the findings instead show that correction is
evidenced more frequently at higher levels both in overall and in learner-
to-learner speech. Whilst corrective focus shows parallels across all levels
with word choice in terms of pronoun and verb selection often receiving
attention, the data also display trends within each level: B1 learners attend
to tense use, B2 learners attend to missing words in their utterances, and C1
learners begin attending to reformulations of entire utterances as opposed
to the individual components within them. We argued therefore that cor-
rection can act as an indicator of proceduralisation as proficiency grows,
with speech monitoring following verbalisation potentially taking the place
of forward planning in speech. Closer inspection of corrective production
strategies thereby uncovered several layers as to the implications correction
has for learners’ successful speech both within and across proficiency levels.
Finally, though prefacing remarks to initiate, maintain and close discus-
sion were found to dominate learner interaction at B1, B2 and C1, it was
their ability to invite others into the conversation and clarify meaning when
necessary that were explored in more detail. Previous findings in chapter 2
suggested that B1 learners neglected to ask questions of partners in favour
of communicating their own thoughts and opinions. It was instead found
that this interactive CS begins to emerge at B1, rather than at B2;
in fact, 21% of learner-to-learner CSs are aimed at conversing with others at
this level. Analysis of USTC data once again demonstrated a change across
B1, B2 and C1 in the nature of the question posed. Though prevalent at
B1, questions requiring a yes/no or one-word answer, such as do you agree
with me? and do you think X is good?, decrease from B1 to C1; the ability to
elicit longer responses does, however, rise. We surmise that successful
interaction at higher levels hinges on this ability to seek extended responses,
rather than short answers, from interlocutors, since extended responses can in
turn provide the foundation for jointly constructed discourse. Clarification requests were likewise
found to exhibit changes in both their focus and their expression. Whereas
B1 learners clarify task instructions on the majority of occasions, B2 learn-
ers concentrate on vague examiner questions and C1 learners enquire about
the meaning of vocabulary or questions. This is achieved through the use of
partially and fully formed questions as well as repetition. In our discussion,
we acknowledged that clarification could be viewed in a negative light; ask-
ing questions of fellow learners or native speakers could be taken as a sign
of lacking understanding or vocabulary. However, the fact that learners are
able to formulate clarification requests according to their own needs once
again demonstrates that strategic competence can enable and extend learn-
ers’ current communicative capability to adapt to the varying, numerous and
perhaps unpredictable language use settings they will encounter.
6.2.3 Discourse competence

This chapter explored how successful speakers used linguistic forms such as
it for anaphoric reference and discourse markers (DMs) such as you know to
realise discourse competence at different levels and fulfil the CEFR descrip-
tions related to this competence. The first aspect we noted was that speakers
generally use similar language to produce discourse that is coherent, cohesive,
developed and managed but that frequency of occurrences does differ across
levels in sometimes significant ways. As an example of this, the analysis dem-
onstrated that in terms of cohesion, when learners refer back to something
said previously they tend to use it more frequently as the levels increase and
this increase is significant in terms of log-likelihood scores when we compare
B2 and C1 levels with B1 level. However, when we explored this and that,
the frequency seemed to level off at B2. The usage of this is significantly more
frequent when we compare B2 to B1 and C1 while that is significantly more
frequent at both B2 and C1 levels when compared with B1. This usage seems
to reflect the fact that successful discourse competence at lower levels is gen-
erally reflected in the ability to manage speakers’ own turns, and as learn-
ers progress this becomes increasingly part of co-constructing the discourse
across turns and in developing what a speaking partner says. It, to give one
example, is increasingly used as levels progress to refer back to what a speak-
ing partner has said and to develop the idea in their own turn.
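For readers unfamiliar with the statistic, the log-likelihood comparison mentioned above can be sketched in a few lines of Python. This is a minimal version of the two-corpus log-likelihood calculation widely used in corpus linguistics, offered under assumptions: the function name and the counts are invented for illustration, and the source does not specify which tool or exact settings produced its scores.

```python
import math


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Log-likelihood (G2) for one item's frequency in two corpora of given token sizes."""
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    # Expected frequencies if the item were equally likely in both corpora.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


# Hypothetical counts of "it" in two sub-corpora; not figures from the USTC.
ll = log_likelihood(freq_a=300, size_a=20000, freq_b=450, size_b=20000)

# 3.84 is the chi-square critical value for p < 0.05 with one degree of freedom.
print(round(ll, 2), ll > 3.84)
```

Because the statistic takes corpus size into account, it allows sub-corpora of unequal length, such as different proficiency levels, to be compared directly.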
The chapter also showed that DMs are an important feature of discourse
competence at all levels. They co-occur with other items to form chunks,
which are then used to manage the discourse within and across turns. The
use of items does vary in frequency across the levels, and these variations
can be significant, alongside differences in how the items function. To give
two examples, you know is used with significantly higher frequency at B1 level
than at B2 and C1, and at B1 level it is used to mark hesitation
or buy time rather than to mark shared knowledge. Well is used with
significantly higher frequency at C1 levels when compared to B1 and B2
and the functions also increase. At B2 level, well is primarily used to mark
hesitation and to buy time while at C1 it is also used to mark an unexpected
response. Overall, this chapter showed that discourse competence at B1 level
is developed largely within a speaker’s own turn where the learner focuses
on creating coherence and cohesion within their own contribution while at
B2 and C1 levels, this expands across turns, co-constructing and developing
what a speaking partner says.
Conclusion 163
6.2.4 Pragmatic competence

This chapter analysed components of successful speech act data within a
corpus containing examples of request and apology language from non-
native speakers (NNS) and native speakers (NS) in an academic setting.
The Speech Act Corpus of English (SPACE) was devised in the absence of a
corpus which was effective and large enough for a specific pragmatic analy-
sis of the linguistic and organisational features of spoken language. What
the chapter showed is that NNS at B2 level are able to achieve the levels
of sociolinguistic appropriateness, as defined by the CEFR, by observing
politeness and sociocultural conventions, as dictated by the interlocutor and
context. Specifically, the NNS data reveal that successful requests are mainly
realised through conventional indirectness, for example with the modals can
and could, whereas more direct requests using the verb want tend to be
confined to service encounters and further mitigated with politeness markers
such as please. An examination of formulaic chunks shows that NNS tend to
rely on a smaller range of chunks, which are less complex than common NS
expressions such as I just wanted to know if and I was wondering if. NNS
nevertheless find alternative, successful ways to convey the same message,
such as using polite alerters to get the listener’s attention (e.g. sorry to
bother you) and making appropriate use of modals
such as can and could. This evidence of variation in achieving success tends
to be mirrored in the apology data too. For instance, there is evidence that
NNS exhibit more caution and less confidence by asking for assistance to
repair a problematic situation rather than offering a solution themselves, and
are less able than NS to judge the appropriateness of the performative
apologise. Nevertheless, it appears that NNS need not possess the
same level and range of skills as NS to be considered successful, which is
welcome news for learners and practitioners. In fact, the data further
suggest that formulaic chunks need not always be grammatically accurate, or
organised in a prescribed way, to be successful. As a final note, in terms
of strategy choice, the selection of alerters to accompany a request head act,
and explicit expressions of apology (+ intensifier), accompanied by a further
mitigating strategy, such as an offer of repair or explanation, seem sufficient
to manage most academic interactions, as typified in the SPACE corpus. The
results do offer a starting point for teaching and learning to advance prag-
matic competence and may be particularly beneficial for learners engaged in
overseas academic communities.
6.2.5 Summary
We can summarise the findings overall in table 6.1. This gives simple ‘can do’
statements based on our key findings for each chapter, by level and in
general. As the data were not divided by level when measuring pragmatic
competence, we have kept the pragmatic statements in the ‘B2’ category only. The intention here is
Table 6.1 Summary of successful speakers’ key competences at B1–C1 levels

Level: B1

Linguistic competence
• I can use we to refer to third parties.
• I can use I can to refer to general possibilities in relation to my own life.
• I can use chunks with I think to express my own views.

Strategic competence
• I can correct mix-ups with tenses and expressions that lead to
misunderstandings without interlocutor feedback.
• I can use simple question forms e.g. do you agree with me? requiring a yes/no
or one-word answer to invite others into a discussion.
• I can use statements and questions, though they may not always be accurately
formed, to clarify information from native speakers and fellow learners.
• I can enquire as to the meaning of unknown vocabulary or unclear questions as
well as task instructions.

Discourse competence
• I can use it to link back to previous ideas in my turn.
• I can use simple additive expressions such as for example, and also and I also
to develop the theme of my turn.
• I can use so yeah to close a turn.
• I can use yeah in chunks such as yeah yeah to agree, pause or buy time.
• I can use Ok on its own and in chunks such as Ok er to buy time or to show
understanding.

Level: B2

Linguistic competence
• I can use we to refer to third parties and general topics.
• I can use I can and you can to refer to general possibilities in relation to
my own life and in relation to others’ lives.
• I can use chunks with agree with to agree with a speaking partner, to signal
some disagreement with I agree with you but and to gain thinking time.
• I can use chunks with I think to give opinions, to seek the views of others,
to hedge my own views and to hold the floor.

Strategic competence
• I can correct slips and errors relating to simple word choices and missing
words when I become conscious of them or they have led to misunderstandings.
• I can use open-ended questions e.g. how about you? requiring more detailed
responses to invite others into the discussion and maintain co-constructed
conversation.
• I can use questions, though not always accurately formed, to clarify vague
questions that are asked of me.

Discourse competence
• I can use it, increasingly this and that to link back to previous ideas in my
turn and a partner’s turn.
• I can use simple additive expressions such as and also and I also and
increasingly for example to develop the theme of my turn.
• I can use what do you think? to open a turn and invite a partner to speak.
• I can use yeah on its own and in chunks such as yeah I to agree, buy time and
add a viewpoint.
• I can use Ok to buy time and increasingly in chunks such as Ok Ok to show
understanding.
• I can use well in chunks such as well I to mark hesitation.

Pragmatic competence
• I can formulate requests with conventional indirectness using modals such as
can and could.
• I can use common 2–4 word chunks such as sorry to bother you to get the
listener’s attention and could you give me, could you help me to form the head
act in requests.
• I can organise a request using the pattern alerter (e.g. Excuse me) + polite
request head act using conventional indirectness.
• I can formulate apologies with explicit apologies incorporating an intensifier
such as I’m so sorry.
• I can use common 2–4 word chunks such as I’m so sorry about that, I want to
apologise (for).
• I can organise an apology using the pattern explicit expression of apology +
offer of repair/explanation.

C1 I can use we to refer to I can backtrack when I encounter a I can use it, this and increasingly that
third parties’ general difficulty in speech relating to word to link back to previous ideas in my
topics and increasingly choice, missing words and tenses. turn and a partner’s turn.
hypothetical topics. I can also monitor utterances for I can use simple additive
I can use I can and you their success in conveying desired expressions such as I also, for
can to refer to general messages and reformulate them example and increasingly and also
possibilities in relation when necessary. to develop the theme of my turn.
to my own life and that I can use open-ended questions I can use what do you think? to
of others we can to e.g. what do you think about X? open a turn and invite a partner
increasingly general and requiring more detailed responses to speak.
hypothetical topics. to invite others into the discussion I can use so yeah to close a turn.
I can use chunks with and maintain co-constructed I can use yeah on its own and in
agree with to agree with conversation. These can also be used chunks such as yeah I to agree and
a speaking partner and to gain more thinking time when add a viewpoint, often continuing
to extend their ideas preparing a response. a topic begun by my speaking
from their turn into my I can use partially and fully formed partner.
own turn. questions, as well as repetition to I can sometimes use Ok to launch
I can use chunks with I clarify the meaning of vocabulary and turns or topics.
think to give opinions, to questions. I can also use repetition I can use well in chunks such as well
seek the views of others, to gain more thinking time when I to mark hesitation and sometimes
to hedge my own views preparing a response. to signal an unexpected response.
and to hold the floor.
In I can use vocabulary I can self-correct my speech when I can use er to buy time, mark
general form the first two errors relating to words choices, in hesitation and hold the floor.
thousand words to talk particular pronouns and verbs, occur. I can use chunks such as I think to
about everyday topics. I can clarify the responses mark hesitation and hold the floor
I can use three- and produced by native speakers and
four-word chunks such fellow learners to aid my own
as a lot of across a range understanding in real-time speech.
of everyday topics
I can use chunks with
think to give my views.
166 Conclusion
not to rewrite the CEFR ‘can do’ statements, as such an undertaking would
involve a much lengthier process and would need to be based on larger, more
comprehensive corpora. Instead, we simply offer these as summaries based
on our data, which we feel could be used as an initial guide to the commu-
nicative competence of successful speakers.
6.3 Implications for research

1 Research into fluency is one way in which this current study could be
extended. Fluency is of course a ‘slippery notion’ but it is inherently
linked with impressions of learner success. As mentioned in the CEFR,
for instance, at B2 learners should be able to interact ‘with a degree
of fluency and spontaneity that makes regular interaction with native
speakers quite possible without imposing strain on either party’ (Council
of Europe 2001: 129), so fluency clearly has the potential to influence
judgements based on learners’ all-round performances. Although
research is being carried out at the University of Louvain (see Gilquin
and De Cock 2013) into software for analysing fluency in learner language,
there remains scope for further research in this area.
Aspects which could be explored include learner
speech rates, the contribution of chunks to fluency and the use of both
filled and unfilled pauses in the USTC. Such analysis could take
quantitative measures such as the frequency of chunks used and
attempt to correlate these with fluency scores given to learners at
different levels by test raters.
2 Future studies might also begin to explore perceptions of learner success
in speech. Though perception-type studies may not always produce rep-
licable facts, they can provide an excellent barometer for current beliefs
or ‘feeling’ in the world of practice. Just as Timmis (2002) was able to
provide insights into teacher and learner views towards NS models and
spoken grammar in international contexts based on qualitative ques-
tionnaires, a similar study into what makes learners successful could
identify the barriers that currently exist. Research taking this approach
could also investigate whether there are differences between novice and
experienced teachers and assessors, or whether learners from different
cultural backgrounds judge themselves according to different criteria.
3 Clearly, the corpus could be extended further to include other levels and
more tokens at each level. Although we have argued that a specialised
corpus of this type can be relatively small in size, a larger set of
data would allow a more extensive analysis and so support broader
generalisations about successful spoken English.
4 Related to point three, the research could also be extended particu-
larly in relation to linguistic competence so that both grammatical and
phonological aspects of this competence are explored. In chapter 2, we
argued that lexis is an important component of this competence but also
noted that there are other elements we could consider. Extending the
analysis could give a broader picture of linguistic competence and how
it operates at each level.
5 Returning perhaps to a more traditional measure of success, current
research is also being conducted into how sophistication and accuracy
manifest in the speech of learners. Hunston (2016) presents a
model of accuracy, not based on written grammar, which is being devised so
that speech can be analysed from the bottom up. This work once again advocates
the view that learners need to be viewed and judged according to what
they can do, rather than what they cannot, and there is potential for similar
analysis to be conducted on the USTC once more details are released.
6 As we have shown in a simple manner in table 6.1, there is also potential
for corpora such as USTC to be used to amend and add clarity to the
CEFR ‘can do’ statements, so that these become easier for researchers
and teachers to use. While the original intention of the CEFR
was not to specify which forms could be used to realise the various
competences, we feel it could be enhanced by additional information
taken from corpus data. Such work has already been undertaken based
on written learner language in relation to grammar in the English Gram-
mar Profile (Mark and O’Keeffe 2015) and could easily be extended to
the kind of spoken data we have described in this book.
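Point 1 above proposes relating the frequency of chunks to the fluency scores awarded by test raters. A minimal sketch of that calculation is given here, assuming Python and entirely hypothetical transcripts and scores (none of the data is taken from the USTC). The names chunk_rate and pearson are illustrative only and do not belong to any corpus tool used in this study; the sketch shows one possible way of setting up such an analysis, not the method a future study would necessarily adopt.

```python
from math import sqrt


def chunk_rate(tokens, chunk, per=1000):
    """Return occurrences of `chunk` (a tuple of words) per `per` tokens."""
    if not tokens:
        return 0.0
    n = len(chunk)
    hits = sum(
        1 for i in range(len(tokens) - n + 1)
        if tuple(tokens[i:i + n]) == chunk
    )
    return hits / len(tokens) * per


def pearson(xs, ys):
    """Pearson correlation for two equal-length sequences (non-zero variance)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical data: (tokenised transcript, rater fluency score).
speakers = [
    ("i think it is good".split(), 3.0),
    ("er i think yeah i think so".split(), 4.0),
    ("it is good for me".split(), 2.0),
]

# Normalised frequency of the chunk "I think" for each speaker.
rates = [chunk_rate(tokens, ("i", "think")) for tokens, _ in speakers]
scores = [score for _, score in speakers]

# Correlation between chunk frequency and fluency score.
r = pearson(rates, scores)
```

Normalising per thousand tokens keeps speakers with long and short turns comparable. With real data, a far larger sample and an appropriate significance test would be needed before drawing any conclusion from the value of r.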
6.4 Implications for teaching

1 Our results firstly reinforce previous calls for classroom instruction to
centre on the core, basic vocabulary of the first 2000 words in English
(Adolphs and Schmitt 2003; McCarthy 1999). Though the overall size
of a learner’s vocabulary is intrinsically linked with L2 competence and
assessments of proficiency (see Laufer 1998; Meara 1996; Stæhr 2008;
Taylor 2011), the findings have shown that successful speech cannot be
achieved without knowledge of this specific group of words. The fact
that the 2000 most frequent words accounted for a large percentage
of successful B1, B2 and C1 speech supports claims for learners to be
taught vocabulary strategies so that they can learn words beyond this
limit and compensate for unknown vocabulary when necessary (Schmitt
2008). The use of large amounts of classroom time for explicit teaching
of words outside the 2000 most frequent would seem to be of limited use
given the small gains other bands yield in terms of coverage (see Adolphs
and Schmitt 2003). Therefore, giving learners opportunities to explore
these words in context (rather than just being given a list of these words)
would seem to be a useful approach. This could ensure that vocabulary
learning is not superficial, that further judgement can be offered as
to the items with the most lexical content, and that learners
understand that not every word in the list is equally useful (see Lewis 1993;
McCarthy 1999; Moon 2010; Schmitt 2008). Training
learners in the use of corpora as a learning tool to explore the use of
words in context may aid in this pursuit and, with appropriate guidance
and exemplification from a teacher, there is no reason why this cannot
be undertaken (see Haywood 2014 for a good example of materials
using corpora in an EAP context). Similarly, learners can be encouraged
to use dictionaries to check the frequency of words they encounter
and pay particular attention to those which occur in the most frequent
1000, something now listed by many dictionaries. Both of these can be
offered as strategies which learners can employ as they attempt to learn
vocabulary, as employing conscious strategies has been shown to benefit
learning (Folse 2004).
2 The next implication would be to consider the emphasis that is placed
on the learning of vocabulary by CEFR descriptors. As pointed out on
several occasions in this book, and in particular in chapter 2, qualita-
tive descriptions of vocabulary range in the CEFR clearly stress that
higher amounts of vocabulary are required at higher proficiency levels.
Whilst formulaic language is remarked upon (Council of Europe 2001:
110–112), the overall impression given is that learners must learn more
words if they are to progress and therefore be successful. However, given
that there was little difference in vocabulary profile data and given that
analysis of individual words and chunks revealed more about varying
functions at B1, B2 and C1, we would argue that broadening learners’
vocabularies should not be carried out at the expense of teaching learners
what can actually be achieved with the lexis they already know. Other
researchers have previously advocated such an approach: namely that
learners need to make use of a limited vocabulary which is continually
repeated and recycled to satisfy a range of functions and meanings (for
example, Nation 2001; Nation and Waring 1997). It is therefore our view
that enhancing comprehensive word knowledge can be of more benefit
to learners, and have more influence on their successful spoken English,
than simply enlarging their vocabularies.
For example, we have seen that words such as it, this and that may
first be learned at a low level but it is also clear they have an important
role to play in discourse competence, allowing speakers to link their
own ideas together and extend those of others in conversation, a point
also made by Carter, Hughes and McCarthy (2000) in relation to native
speaker corpora. Undoubtedly, as we have noted, the first 2000 words in
English provide a solid foundation that can then be built upon according
to individual needs (Thornbury and Slade 2006), and we believe that
more can be gained from enhancing how well learners know the words they
have already learned (Nation 2001; Qian 2002; Schmitt 2008). Knowing
a word must therefore entail an increasing awareness of how it can be
used with different functions, to achieve different aspects of communi-
cative competence and within its typical patterning in collocations and
chunks. To take one example, can was a keyword at each level but was
patterned differently from B1–C1 and the range of functions for which
it could be used increased as the level did.
3 The use of chunks in achieving communicative competence was clear
across the chapters. One simple way in which teachers can supplement
courses with instruction on chunks is to use the many useful lists of
these now available based on corpus evidence (for example, Martinez
and Schmitt 2012) as a way of informing their teaching. Another way
teaching could address this would be exploring learner corpora such as
the USTC. The focus here could be upon drawing learners’ attention to
real language learner transcripts of speech and helping them to search
for chunks of language that are used to perform functions such as ini-
tiating, maintaining and ending turns, and seeking clarification. Asking
learners to perform a speaking task (such as a speaking exam practice)
and then examining a recording and transcript of a more successful
speaker (B1 level listening to B2 learners, for example) undertaking the
same task has long been advocated in relation to task-based learning
(for example, Willis 1996). Students can be asked to undertake the task,
then to listen to the recording of more successful speakers, first not-
ing general differences, then exploring the transcripts for differences in
the language used before being guided towards finding and underlining
or highlighting specific chunks with specific functions. Such a process
would help to foster the habit of noticing in learners, something which
has long been suggested to be a crucial aspect of second language
acquisition (Schmidt 1990). In addition to such classroom practices,
assembling a bank of multifunctional chunks such as the ones highlighted in
this study would also help teachers and learners to focus on using the
same chunks in different ways as their level increases.
4 Related to the use of the USTC for teaching is the potential
role it could play as a study aid for learners. The language contained
within the corpus can itself be used to model how some functions can
be realised in learner speech. As a supplementary tool, it can offer an
alternative to resources containing NS speech, which may seem unachievable
or unrealistic. Giving learners access to the corpus or the texts it
comprises could thus raise awareness of successful speech. For instance,
if the corpus was used to demonstrate how discourse competence is realised
in speech at different levels, it could draw learners’ attention to features
which make the process of speaking with fellow language users a more
manageable and perhaps less daunting task.
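Point 1 above rests on the idea of coverage: the proportion of a speaker’s running words that fall within a given frequency list. For teachers or learners who wish to try this with their own transcripts, a minimal sketch is given here, assuming Python, a plain list of word forms and a tiny hypothetical word list standing in for the 2000 most frequent words. It counts word forms rather than word families, so it is a simplification and not the profiling procedure used in this study; the name token_coverage is illustrative only.

```python
def token_coverage(tokens, wordlist):
    """Percentage of running words (tokens) that appear in `wordlist`."""
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token.lower() in wordlist)
    return 100 * known / len(tokens)


# Hypothetical stand-in for a 2000-word frequency list.
frequent_words = {"i", "think", "it", "is", "a", "lot", "of", "good", "for", "me"}

# Hypothetical learner utterance, split into word forms.
transcript = "I think it is a lot of fun for me".split()

print(f"{token_coverage(transcript, frequent_words):.1f}% of tokens covered")
```

Replacing frequent_words with a published frequency list and transcript with a learner’s own speech would give a rough indication of how much of that speech the most frequent words account for.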
6.5 Final thoughts

As noted in chapter 1, any exploration of successful spoken English will
always be limited and partial. Because spoken, specialised corpora of learner
language tend to be smaller, it must be accepted that their findings ‘cannot
provide the basis of sweeping generalizations’ about language (Carter and
McCarthy 1995: 143). Related to this is the caveat that corpora can only
provide a snapshot of language. No matter how large they are, in nearly all
cases, they cannot replicate the language itself, nor the infinite choices and
combinations possible within it. Corpora provide a source of evidence about
language but ultimately, they will never have exactly the same properties as
the language itself. This means that spoken corpora, such as the corpus we
have used, are only able to capture what is evidenced in language use and
not what the speaker is capable of. They cannot reveal what learners are able
to understand and what they choose not to use in their speech. Put simply,
just because particular words or structures do not appear in a corpus does
not mean that learners are unaware that they exist; they may have simply
decided not to employ them in their language production or the task may
not have demanded them. Also, as we have noted, this study of success cannot
claim to be completely representative of the learners’ full communicative
competence in English. Focusing specifically on an exam-based research
tool, for instance, we have been unable to fully report on the learners’ use of
language in alternative tasks or contexts, or with different speakers. There-
fore, the results of this study will not relate to all modes and genres of
learner speech and we acknowledge the fact that spoken exams are in many
ways a specific genre of speech. Using exam data could also mean that the
learners involved might have been concentrating on passing the test rather
than satisfying the goals of the interaction and, as noted, we can of course
always question the authenticity of the interaction in any spoken exam,
where learners do not know each other and may not have chosen to speak
to each other outside this context. A final limitation concerns the analysis
of speech through written transcription. In this study, spoken language was
captured and represented in the form of written, broad transcriptions. Not
only were multimodal features such as body language, gesture, and facial
expression lost, but important prosodic information such as pronunciation,
tone, timings and stress was also omitted.
Having noted these limitations, we would also argue that the study does
make a contribution to our understanding of successful spoken English.
Earlier comparisons of learners against NS norms had resulted in
perceptions which failed to appreciate what learners could successfully
accomplish with their speech. Though native-like proficiency is often seen
as the ultimate target in second language learning, we have suggested that
such a target is insufficient in highlighting how learners can become more
proficient in their speech at different stages of the second-language
learning process.
Using descriptions of communicative competence from the CEFR as a basis,
this study into B1, B2 and C1 speech has shown that learners demonstrate
successful spoken language use in a number of ways. The use of a common
vocabulary differing little in token and family coverage, frequency bands
and difficulty demonstrated that it was the flexibility with which individual
lexis could be used that most exemplified success at different levels, with
particular words and chunks revealing that learner proficiency is in part
reliant upon the manner in which multifunctionality can be exploited and adapted
in speech. In chapter 1 we described a successful speaker as one who is able
to use linguistic, strategic, pragmatic and discourse competence as appropri-
ate for a particular goal, at a particular language level, as defined by the
CEFR. We hope that this book has contributed in a small way to under-
standing this concept and we hope it is one which is further developed with
the ultimate goal of helping as many learners as possible to be successful
speakers of English.
References
Adolphs, S. and Schmitt, N. 2003. Lexical coverage of spoken discourse. Applied
Linguistics, 24(4), 425–438.
Carter, R., Hughes, R. and McCarthy, M. 2000. Exploring grammar in context:
Upper intermediate and advanced. Cambridge: Cambridge University Press.
Council of Europe 2001. Common European Framework of Reference for Languages:
Learning, teaching, assessment. Cambridge: Cambridge University Press.
Folse, K. 2004. Vocabulary myths: Applying second language research to classroom
teaching. Michigan: University of Michigan Press.
Gilquin, G. and De Cock, S. 2013. Errors and disfluencies in spoken corpora.
Amsterdam: John Benjamins Publishing.
Gilquin, G., De Cock, S. and Granger, S. 2010. The Louvain international database
of spoken English interlanguage. Handbook and CD-ROM. Louvain-la-Neuve:
Presses universitaires de Louvain.
Haywood, S. 2014. Academic vocabulary. [Online] Available from:
<www.nottingham.ac.uk/alzsh3/acvocab/index.htm> [Accessed on 12 February 2017].
Hunston, S. 2016. Measuring improvements in learner language: Issues of complex-
ity and correctness. In: English Profile, 10th anniversary English Profile seminar.
Cambridge, Cambridge University Press. 5th February 2016.
Laufer, B. 1998. The development of passive and active vocabulary in a second lan-
guage: Same or different? Applied Linguistics, 19(2), 255–271.
Lewis, M. 1993. The lexical approach: The state of ELT and a way forward. Hove:
Language Teaching Publications.
Mark, G. and O’Keeffe, A. 2015. Introducing the English grammar profile [Online]
Available from: <www.cambridge.org/elt/blog/2015/11/11/introducing-english-
grammar-profile-1-building-profile-1/> [Accessed on 20 February 2017].
Martinez, R. and Schmitt, N. 2012. A phrasal expressions list. Applied Linguistics,
33(3), 299–320.
McCarthy, M. 1999. What constitutes a basic vocabulary for spoken communica-
tion? Studies in English Language and Literature, 1, 233–249.
Meara, P. 1996. The dimensions of lexical competence. In: G. Brown, K. Malmkjær
and J. Williams, eds. Performance and competence in second language acquisition.
Moon, R. 2010. What can a corpus tell us about lexis? In: A. O’Keeffe and M.
McCarthy, eds. The Routledge handbook of corpus linguistics. Abingdon: Routledge,
197–211.
Nation, I.S.P. 2001. Learning vocabulary in another language. Cambridge: Cam-
Nation, P. and Waring, R. 1997. Vocabulary size, text coverage and word lists.
Vocabulary: Description, Acquisition and Pedagogy, 14, 6–19.
Qian, D.D. 2002. Investigating the relationship between vocabulary knowledge and
academic reading performance: An assessment perspective. Language Learning,
52(3), 513–536.
Schmidt, R.W. 1990. The role of consciousness in second language learning. Applied
Linguistics, 11(2), 129–158.
Schmitt, N. 2008. Instructed second language vocabulary learning. Language Teach-
ing Research, 12(3), 329–363.
Stæhr, L.S. 2008. Vocabulary size and the skills of listening, reading and writing. The
Language Learning Journal, 36(2), 139–152.
Taylor, L. 2011. Examining speaking. Cambridge: Cambridge University Press.
Timmis, I. 2002. Native-speaker norms and International English: a classroom view.
ELT Journal, 56(3), 240–249.
Willis, J. 1996. A framework for task-based learning. London: Longman.