Lexical Facility
Size, Recognition Speed and Consistency
as Dimensions of Second Language
Vocabulary Knowledge
Michael Harrington
University of Queensland
Brisbane, QLD, Australia
Acknowledgments
I would first like to thank my wife Jan and daughter Bridget for their
forbearance. I am also greatly indebted to John Read for his advice and
support throughout this project. He, of course, is not responsible for the final
outcome. Special thanks to collaborators Thomas Roche, Michael Carey, and
Akira Mochida, and colleagues Noriko Iwashita, Paul Moore, Wendy Jiang,
Mike Levy, Yukie Horiba, Yuutaka Yamauchi, Shuuhei Kadota, Ken Hashimoto,
Fred Anderson, Mark Sawyer, Kazuo Misono, John Ingram and Jenifer Larson-
Hall. Thanks also to Said Al-Amrani, Lara Weinglass, and Mike Powers.
Vikram Goyal programmed and has served as the long-standing sys-
tem administrator for the LanguageMAP online testing program used
to collect the data reported. He has been especially valuable to the proj-
ect. Special thanks also to Chris Evason, Director of the University of
Queensland’s (UQ) Foundation-Year program, who has provided encour-
agement and financial support for testing and program development.
Funding support is also acknowledged from Andrew Everett and the UQ
International Education Directorate.
The research reported here has been supported by the Telstra Broadband
Fund and a UQ Uniquest Pathfinder grant for the development of the
LanguageMAP program. Support was also provided by research contracts
from the Milton College (Chap. 9) and International Education Services–
UQ Foundation-Year (Chaps. 8, 9, and 10), and a grant, with Thomas
Roche, from the Omani Ministry of Research.
List of Figures
Fig. 5.4 Elements of the instruction set for the Timed Yes/No Test
Fig. 6.1 Lexical facility measures by English proficiency levels
Fig. 6.2 Median proportion of hits and 95% confidence intervals for lexical facility measures by frequency levels and groups
Fig. 6.3 Median individual mnRT and 95% confidence intervals for lexical facility measures by frequency levels and groups
Fig. 6.4 Median coefficient of variation (CV) and 95% confidence intervals for lexical facility measures by frequency levels and groups
Fig. 7.1 University entry standard study. Mean proportion of hits by frequency levels for written and spoken test results
Fig. 7.2 University entry standard study. Mean response times by frequency levels for written and spoken test results
Fig. 7.3 University entry standard study. Mean CV ratio by frequency levels for written and spoken test results
Fig. 8.1 Combined IELTS dataset: Timed Yes/No Test scores by IELTS overall band scores
Fig. 9.1 Sydney language program study. Comparison of VKsize and mnRT scores with program placement grammar and listening scores across four placement levels
Fig. 9.2 Singapore language program levels. Standardized scores for the lexical facility measures (VKsize, mnRT, and CV) for the VLT and BNC test versions
Fig. 9.3 Singapore language program study. Standardized scores for the lexical facility measures (VKsize, mnRT, and CV) for the combined test by level
Fig. 10.1 Oman university GPA study. Standardized VKsize, mnRT, and CV scores by faculty
Introduction
Research Goals
This book has three goals. The first is to make the theoretical case for lexi-
cal facility. The validity of the construct is established in the first four
chapters by first examining the crucial roles that vocabulary size (Chaps. 1
and 2) and word recognition skill (Chap. 3) play in L2 performance.
The rationale for characterizing size and processing skill jointly as an L2
vocabulary construct, that is, for lexical facility, is then set out in Chap. 4.
This chapter discusses key theoretical and methodological issues that arise
from the proposal. Primary among these is the attempt to treat size and
speed as parts of a unitary construct. Standard practice in the psychomet-
ric tradition has long been to treat the two as separate dimensions.
Human performance has been characterized either as knowledge (also
called power) or speed, the relative importance of each dependent on the
kind of performance being measured. Knowledge is seen as the critical
attribute of higher-level cognitive tasks such as educational testing, while
speed is paramount for mechanical tasks such as typing. The lexical
facility account proposes that size (knowledge) and processing skill (speed
Part 1
Introduction
References
Meara, P., & Buxton, B. (1987). An alternative to multiple choice vocabulary
tests. Language Testing, 4(2), 142–145.
Nation, I. S. P. (2013). Learning vocabulary in another language (2nd ed.).
Cambridge: Cambridge University Press.
1 Size as a Dimension of L2 Vocabulary Skill
Aims
1.1 Introduction
This chapter introduces the field of what will be called vocabulary size
research, an approach based on the simple assumption that the overall
number of words a user knows—the breadth of an individual’s vocabulary
stock—provides an index of vocabulary knowledge. The focus on vocab-
ulary breadth means that little attention is given to what specific words
are known or the extent (or depth) to which any given word is used.
Rather, researchers in the area are interested in estimating the vocabulary
size needed to perform particular tasks in a target language. These tasks
can range from reading authentic texts (Hazenberg and Hulstijn 1996) to
coping with unscripted spoken language (Nation 2006). Size estimates
are used to propose vocabulary thresholds for second language (L2)
instruction, and more generally to provide a quantitative picture of an
individual’s L2 vocabulary knowledge (Laufer 2001; Laufer and
Ravenhorst-Kalovski 2010). The focus here, and in the book in general,
is on the size of recognition vocabulary and the role it plays in L2 use.
The main focus is on the recognition of written language.
Recognition vocabulary is acquired before productive vocabulary and
serves as the foundation for the learning of more complex language struc-
tures. The store of recognition vocabulary knowledge builds up over the
course of an individual’s experience with the language. This knowledge
ranges from the most minimal, as in the case of knowing only that a word
exists, to an in-depth understanding of its meaning and uses. A sparkplug
may be a thingamajig found in a car or, according to Wikipedia, ‘a device
for delivering electric current from an ignition system to the combustion
chamber of a spark-ignition engine to ignite the compressed fuel/air mix-
ture by an electric spark, while containing combustion pressure within
the engine’. Recognition vocabulary knowledge emerges from both
intentional learning and implicit experience, and even the most casual
experience can contribute to the stock of recognition vocabulary knowl-
edge. Repeated exposure to a word also has a direct effect on how effi-
ciently it is recognized.
The notion that knowing more words allows a language user to do
more in the language hardly seems controversial. However, many appar-
ently commonsensical assumptions in language learning are often diffi-
cult to specify in useful detail or to apply in practice (Lightbown and
Spada 2013). Even when evidence lends support to the basic idea, spe-
cific findings introduce qualifications that often diminish the scope and
power of the original insight. This chapter introduces and surveys the
vocabulary size research literature to see how the ‘greater size = better per-
formance’ assumption manifests itself. The methodology used for esti-
mating vocabulary size is first described, and then findings from key
studies are presented.
Size is a quantitative property and therefore requires some unit of mea-
surement. In the vocabulary size approach, it is the single word. Size
1.2 Estimating Vocabulary Size
What to Count
modification. But this knowledge is only part of the lexicon, which
consists of these words in combination with the mostly implicit grammatical
properties that constrain how the words are used. These properties reside
in procedural memory, a system of implicit, unconscious knowledge.
Paradis (2009) makes a distinction between vocabulary and the lexicon to
capture this difference. Vocabulary is the totality of sound–meaning asso-
ciations and is typical of L2 learner knowledge, particularly in the early
stages. The lexicon characterizes the system of explicit and implicit
knowledge that the first language (L1) user develops as a matter of course
in development, and which is developed to varying degrees in more
advanced L2 users. In Paradis’s terms, the lexical facility account relates
strictly to vocabulary knowledge, its measurement, and its relationship to
L2 proficiency and performance.
Last, the pivotal role the single word plays in online processing also
reflects its importance. The word serves as the intersecting node for a
range of sentence and discourse processes that unfold in the process of
reading (Andrews 2008). It is where the rubber meets the road, as it were,
in text comprehension.
The focus on the recognition of single words means that the vocabu-
lary size approach captures only a small part of L2 vocabulary knowledge,
a multidimensional notion comprising knowledge of form, meaning, and
usage. Each word is part of a complex web of relationships with other
words, and this complex network is used to realize the wide range of
expressive, communicative, and instrumental functions encountered in
everyday use. Figure 1.1 depicts the basic elements of word knowledge in
a three-part model adapted from Nation (2013); see also Richards (1976).
The vocabulary size account reduces vocabulary knowledge to the sin-
gle dimension of the number of individual words a user knows, or more
precisely, recognizes. It is about the user’s ability to relate a form to a basic
meaning, whether by identifying the meaning from among a set of alter-
natives, as in the Vocabulary Levels Test (VLT), or merely recognizing a
word when it is presented alone, as in the Yes/No Test. This passive ‘rec-
ognition knowledge’ is assumed to be an internal property—a trait—of
the L2 user’s vocabulary stock that can be measured independently of a
given context.
of a given word very often depends on the context, and ‘knowing’ a word
ultimately comes down to whether it facilitates comprehension in a par-
ticular context in an appropriate and timely manner. The measurement
of size alone says nothing about the depth of word knowledge, though
the two are not unrelated. Ultimately, greater vocabulary size correlates
with greater depth of vocabulary knowledge (Vermeer 2001).
The central question in the vocabulary size approach is the degree to
which this single form–meaning dimension relates to individual differ-
ences in L2 performance. Evidence of a reliable relationship between size
and performance has implications for the way L2 vocabulary knowledge
is conceptualized and, in turn, for L2 vocabulary assessment. The next
section will consider the challenging problem of how to count single
words.
There are alternative ways to calculate vocabulary size, all with their
advantages and disadvantages. The number of words on this page could
simply be counted by tallying the number of white spaces before each
word. These are all words in the simplest sense. But this method would
yield a very insensitive measure of vocabulary knowledge, given that
many words are repeated. For example, the word ‘the’ appears seven times
in this paragraph. The same word can also appear in different forms.
Does the researcher count ‘word’ and ‘words’ as one or two words? As a
result, although estimating vocabulary size is a quantitative process, the
researcher must make qualitative distinctions as to if and how individual
word forms are counted. Several alternatives are available.
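The difference between counting every running word and counting distinct word forms can be made concrete with a small sketch. The function name, the tokenizing rule (lowercased runs of letters and apostrophes), and the sample sentence below are illustrative assumptions, not part of the original text.

```python
import re
from collections import Counter


def count_tokens_and_types(text: str) -> tuple[int, int, Counter]:
    """Return (number of tokens, number of distinct types, per-type counts)."""
    # Lowercase so that 'The' and 'the' are treated as the same type,
    # then keep runs of letters and apostrophes as word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    freqs = Counter(tokens)
    return len(tokens), len(freqs), freqs


# Hypothetical sample sentence, chosen to include repeats and an inflected form.
sample = "The word and the words: does the researcher count one word or two?"
n_tokens, n_types, freqs = count_tokens_and_types(sample)

print(n_tokens, n_types)              # running words vs. distinct forms
print(freqs["word"], freqs["words"])  # 'word' and 'words' are separate types here
```

Note that this simple count still treats 'word' and 'words' as two separate items; whether they should be collapsed into one is exactly the qualitative decision the counting units discussed next are meant to settle.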
Word Families Related to the lemma is the word family, which is defined
as the base word form plus its inflections and most common derivational
variants, for example, invite, invites, inviting, invitation (Hirsh and Nation
1992, p. 692). English inflections include third person -s, past participle
-ed, present participle -ing, plural -s, possessive -s, and comparative -er and
superlative -est. Derivational affixes include -able, -er, -ish, -less, -ly, -ness,
-th, -y, non-, un-, -al, -ation, -ess, -ful, -ism, -ist, -ity, -ize, -ment, and in-
(Hirsh and Nation 1992, p. 692). As with the lemma, the underlying idea
is that a base word and its inflected forms express the same core meaning,
and thus can be considered learned words if a learner knows the base and
the affix rules. Bauer and Nation (1993) proposed seven levels of affixes,
which include derivations and inflections. Word families differ from lem-
mas in that they cross syntactic categories. In the example of bank, as
above, the noun and verb forms are counted as part of the same family.
As a result, a lemma count will always be larger than the word family
count, given the narrower range of forms counted as a single instance.
Milton identifies what he terms a ‘very crude’ equivalence of lemma to
word family involving multiplying the word family size by 1.6 to get the
approximate lemma size (Milton 2009, p. 12).
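As a worked illustration of this rough conversion, the sketch below applies the 1.6 multiplier to a word family count. The function and constant names are hypothetical, and the result is only as reliable as the 'very crude' ratio itself.

```python
# Milton's (2009, p. 12) 'very crude' ratio of lemmas to word families.
LEMMAS_PER_WORD_FAMILY = 1.6


def approx_lemma_count(word_families: int) -> int:
    """Convert a word family count to an approximate lemma count."""
    if word_families < 0:
        raise ValueError("word_families must be non-negative")
    return round(word_families * LEMMAS_PER_WORD_FAMILY)


# A learner estimated to know 5000 word families would know roughly 8000 lemmas.
print(approx_lemma_count(5000))
```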
The word family has been widely used as the unit of counting in vocabu-
lary size studies (Schmitt 2010). Nation has argued that the word family
is a particularly appropriate unit for studying L2 recognition vocabulary
because it is primarily about meaning and meaning potential (Nation
2006, p. 76). It also has a degree of psycholinguistic reality regarding how
the different forms in a given family are stored in the mental lexicon
(Nagy et al. 1989). The basic assumption is that if the meaning of the
base word is known, the various inflections and derivations in which it
appears will also be potentially understood, at least to some degree. This
assumption has proved to be useful in relating individual vocabulary size
to L2 use, but is one that is not categorical. The assumption that a learner
who knows the meaning of build will understand the meaning of builder
on the first encounter is a probabilistic one. Schmitt and Zimmerman
(2002) show that university-level ESL students’ knowledge of the derived
forms of many stem words is far from complete, for example, not knowing
that persistent, persistently, and persistence all come from persist. However,
they also recognize that users will probably work out the meaning of
persistence faster if they know persist than if they do not.
The word family construct also conflates the distinction that Paradis
(2009) makes between the stock of form–meaning associations stored in
declarative memory and morphological processes that are procedural in
nature. Widely used tests of vocabulary size, the VLT (Nation 2013) and
the Yes/No Test (Meara and Buxton 1987), always present the base form
as the test item, thus sidestepping any attempt to measure the morpho-
logical knowledge assumed in the word family construct.
Figuring out how many words a user knows is the next challenge for the
vocabulary size researcher. While in theory it may be possible to identify
every single word a user knows, in practice, the process of fixing vocabu-
lary size is one of estimation. A vocabulary size estimate is based on a
finite sample of a user’s knowledge obtained in a specific task or set of
tasks. Recognition vocabulary knowledge is passive by nature, and evi-
dence for it must be elicited from the user. This is done by presenting a
set of words to a user and eliciting a response that indicates whether the
items are known. Time and resource limitations mean that any test can
present only a limited number of words, and it is from this limited sam-
ple that the user’s vocabulary size is estimated. Word frequency statistics
provide the vocabulary size researcher with a reliable and objective means
to index the size of recognition (and productive) vocabulary knowledge
(Laufer 2001).
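One simple way to picture how a limited sample can yield a size estimate is to test a handful of items from each frequency band and extrapolate the proportion recognized to the whole band. The sketch below assumes bands of 1000 word families each and uses invented test results; it is a minimal illustration of the sampling logic, not the scoring procedure of any particular test discussed in this book.

```python
def estimate_vocab_size(
    band_results: dict[str, tuple[int, int]], band_size: int = 1000
) -> float:
    """Extrapolate sampled recognition rates to whole frequency bands.

    band_results maps a band label to (items recognized, items tested).
    band_size is the assumed number of word families in each band.
    """
    total = 0.0
    for label, (recognized, tested) in band_results.items():
        if tested <= 0 or not 0 <= recognized <= tested:
            raise ValueError(f"invalid counts for band {label!r}")
        # Proportion recognized in the sample, scaled up to the full band.
        total += band_size * recognized / tested
    return total


# Invented results: 20 items sampled from each of the first four 1K bands.
results = {"1K": (19, 20), "2K": (16, 20), "3K": (10, 20), "4K": (5, 20)}
estimate = estimate_vocab_size(results)
print(estimate)  # 950 + 800 + 500 + 250 word families
```

The declining proportions across bands in the invented data mirror the expectation that lower-frequency words are less likely to be known.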
Words greatly differ in how often they occur in a given language.
When the words in a large corpus of spoken or written English are rank-
ordered from the most to least frequently occurring, a highly distinctive
pattern emerges. The 2000–3000 most frequently occurring words
account for the vast majority of tokens that appear in the corpus. Beyond
these high-frequency words, the relative frequency of a given word
steadily decreases as a function of its relative order, until the very-low-
frequency words tail off and account for only a tiny proportion of tokens.
This frequency distribution, called Zipf’s law (after one of its original
discoverers), provides an index for the measurement and interpretation of
vocabulary size. The law states that, for a corpus of natural language
utterances, the frequency of any word is in inverse proportion to its rank
they also have direct implications for vocabulary learning and the repre-
sentation of this knowledge in the mental lexicon. This is discussed below.
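The inverse relationship between frequency and rank in Zipf's law can be illustrated numerically. The sketch below assumes an idealized distribution in which the frequency of a word is exactly proportional to 1/rank, over a hypothetical vocabulary of 10,000 word types; real corpora only approximate this pattern.

```python
def zipf_shares(n_types: int) -> list[float]:
    """Share of all tokens taken by each rank under an idealized Zipf law."""
    # The weight of the word at a given rank is proportional to 1 / rank.
    weights = [1.0 / rank for rank in range(1, n_types + 1)]
    total = sum(weights)
    return [w / total for w in weights]


shares = zipf_shares(10_000)

# The top-ranked word is twice as frequent as the second-ranked word.
print(shares[0] / shares[1])

# Proportion of all tokens accounted for by the 2000 most frequent types.
top_2000 = sum(shares[:2000])
print(round(top_2000, 3))
```

Under these assumptions a small set of high-frequency types accounts for most of the running text, while the long tail of low-frequency types contributes very little.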
[Figure: likelihood of knowing a word (y-axis, 0–100) by frequency of occurrence (x-axis: high, mid, low)]
Table 1.1 Vocabulary size expressed in word families and text coverage (written
and spoken) across nine corpora (a) (Nation 2006, p. 79)

Frequency band            Approximate written     Approximate spoken
(all word families known) text coverage (%)       text coverage (%)
1K                        78–81                   81–84
2K                        8–9                     5–6
3K                        3–5                     2–3
4K–5K                     3                       1.5–3
6K–9K                     2                       0.75–1
10K–14K                   <1                      0.5
Proper nouns              2–4                     1–1.5
+14K                      1–3                     1

(a) Corpora analyzed: Lancaster–Oslo–Bergen (LOB) Corpus, Freiburg–LOB, Brown,
Frown, Kolhapur, Macquarie, Wellington Written, Wellington Spoken, and
Lund, available from the International Computer Archive of Modern and
Medieval English at http://gandalf.aksis.uib.no/icame.html (Nation 2006, p. 63).
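The per-band figures in Table 1.1 can be accumulated to show roughly how much written text is covered as each further band is learned. The sketch below sums the lower and upper ends of the written coverage ranges for the first five rows; adding range endpoints in this way is a rough simplification, proper nouns are left out, and the variable and function names are illustrative assumptions.

```python
# Written text coverage (%) per band from Table 1.1, as (label, low, high).
WRITTEN_COVERAGE = [
    ("1K", 78, 81),
    ("2K", 8, 9),
    ("3K", 3, 5),
    ("4K-5K", 3, 3),
    ("6K-9K", 2, 2),
]


def cumulative_coverage(bands):
    """Running totals of the low and high coverage estimates, band by band."""
    low_total = high_total = 0
    running = []
    for label, low, high in bands:
        low_total += low
        high_total += high
        running.append((label, low_total, high_total))
    return running


cumulative = cumulative_coverage(WRITTEN_COVERAGE)
for label, low, high in cumulative:
    print(f"up to {label}: {low}-{high}% of written text")
```

On these figures, the first three bands together already account for the bulk of written text, with each later band adding only a few percentage points.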
[Figure: percentage of text coverage (y-axis, 0–100) by frequency band (x-axis: 1K, 2K, 3K, 4–5K, 6–9K, 10–14K), for spoken and written text]
Text coverage (%)    Unfamiliar words per 100    Lines of text per unfamiliar word
99                   1                           10
98                   2                           5
95                   5                           2
90                   10                          1
80                   20                          0.5

Fig. 1.4 Text coverage as the number of unfamiliar words and the number of
lines of text per unfamiliar word
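The arithmetic behind Fig. 1.4 can be reproduced in a few lines. The sketch below assumes a line of text contains ten words, a value inferred from the figure (99% coverage corresponding to one unfamiliar word every ten lines) rather than stated in the text; the function name is hypothetical.

```python
def unfamiliar_word_density(
    coverage_pct: int, words_per_line: int = 10
) -> tuple[int, float]:
    """Return (unfamiliar words per 100 tokens, lines of text per unfamiliar word)."""
    unfamiliar_per_100 = 100 - coverage_pct
    if unfamiliar_per_100 <= 0:
        raise ValueError("coverage must be below 100% for unfamiliar words to occur")
    # 100 tokens span 100 / words_per_line lines; divide by the unfamiliar count.
    lines_per_unfamiliar = 100 / (unfamiliar_per_100 * words_per_line)
    return unfamiliar_per_100, lines_per_unfamiliar


for coverage in (99, 98, 95, 90, 80):
    print(coverage, unfamiliar_word_density(coverage))
```

At 98% coverage a reader meets an unfamiliar word about every five lines, whereas at 80% coverage there are on average two in every line.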
word families were sufficient for 95% text coverage, a number similar to
the written text research. In contrast, knowledge of only the 6K–7K
bands was needed for 98% text coverage, lower than the 8K–9K
suggested as being necessary to read authentic texts with some degree of
fluency (Nation 2006). van Zeeland and Schmitt (2013) also reported
that listening comprehension required knowledge of fewer word families
than comparable reading levels.
A question remains as to whether these text coverage levels, particu-
larly the 95% and 98% levels, reflect a qualitative threshold that must be
met for adequate comprehension, or a continuum from lesser to greater
comprehension skill. Schmitt et al. (2011) examined this issue by plot-
ting text coverage levels against performance for 600 tertiary L2 English
readers from 12 different countries. The relationship between text cover-
age and comprehension was plotted at ten text coverage levels, ranging
from 90% to 100% coverage. See Fig. 1.6.
This figure is adapted from Schmitt et al. (2011, p. 34), with only
alternating text coverage levels reported here. A consistent linear relation-
ship is evident across the reading comprehension and vocabulary cover-
age levels. There is little suggestion of discrete thresholds at the 95% or
[Figure: comprehension percentage (y-axis, 0–100; mean, +1 SD, and −1 SD) by vocabulary coverage and number of participants at each level: 90% (n=21), 92% (n=39), 94% (n=93), 96% (n=176), 98% (n=200), 99% (n=186), 100% (n=187)]
1.4 Conclusions
The vocabulary size approach is based on the simple assumption that the
number of words an individual knows has a direct relationship to L2
proficiency. The focus here is on recognition vocabulary knowledge,
which is narrowly defined as the ability to recognize the association
between a single word form and a basic meaning. Vocabulary learning is
viewed as an input-driven process in which vocabulary size emerges from
the user’s experience with the language. Corpus-based word frequency
statistics provide a means of estimating the overall vocabulary size from
recognition performance on a limited set of words. These vocabulary size
estimates have been related to L2 proficiency and use in two ways.
Vocabulary size has been examined as a predictor of differences in L2
performance as measured by standardized and context-specific tests. As a
key component of the lexical facility construct, this use of vocabulary size
is the focus of the book. Considerable attention has also been given to the
relationship between vocabulary size and text coverage, the latter reflect-
ing the comprehension demands of written or spoken texts. Vocabulary
size thresholds have been proposed to meet the levels of text coverage
required for successful comprehension. Both uses demonstrate the utility
of vocabulary size as a dimension of L2 vocabulary knowledge and a cor-
relate of L2 proficiency.