Applied Linguistics 19/1 24-44 © Oxford University Press 1998

Phraseology and Second Language Proficiency
PETER HOWARTH
University of Leeds

Downloaded from http://applij.oxfordjournals.org/ at Belgorod State University on December 30, 2013

It is now generally accepted that advanced learners of English need to have
command of a wide range of complex lexical units, which are for a native
speaker processed as prefabricated chunks, fixed, or semi-fixed expressions.
However, although there has been an increasing amount written about the role
of phraseology in second language acquisition, there remains a lack of detailed
description of learners' phraseological performance as the basis for
understanding how phraseological competence develops. This paper addresses
certain current issues in the description of collocations in English, and, in
discussing the major approaches to the linguistic description of prefabricated
language, the need for detailed categorization is emphasized, particularly for
those interested in the development of this component of proficiency in a
second language. Data is presented from native speaker language use,
illustrating what can be revealed by one such descriptive model. Finally, the
findings of a number of studies of native and non-native academic writing in
English are discussed.

INTRODUCTION
Although the term 'phraseology' (the study of word combinations) is
increasingly used by writers in a number of language-related disciplines, the
field has perhaps not yet achieved wide recognition in applied linguistics, nor
are the implications of research within the field fully understood by or easily
available to language teachers. This is partly the result of interest in the
phenomenon of word combinations having developed independently in a
variety of disciplines, and few writers have attempted an overview (but see
Cowie 1994 and forthcoming). It is not possible within the scope of a single
paper to give an account of the whole field (see Cowie and Howarth 1996 for
an extensive bibliography), and the focus of this paper is restricted to some
aspects of phraseology that have relevance to the needs of adult learners of
English, in particular their use of collocations. These are defined as
combinations of words with a syntactic function as constituents of sentences
(such as noun or prepositional phrases or verb and object constructions). They
can be considered most centrally involved in the process of composition at
clause level, therefore potentially sensitive indicators of learners' acquisition,
and they raise some of the most challenging issues in the study of
phraseology. The data described in later sections is taken from academic
writing in the social sciences, a register that requires the production of a large
proportion of conventional collocations. The paper begins by briefly discussing
the theoretical foundations of one approach to the description of phraseology
and then presents data from a comparative study of native and non-native
performance, along with findings from other empirical research, to illustrate
what can be learned from applying that framework about developing
proficiency in a second language.

Background

The theoretical framework used in the following description of collocations is
presented in detail in Howarth (1996). This model draws on the work of
Cowie (e.g. 1988) and Gläser (1988) in descriptive linguistics, of Bolinger
(1976) and Pawley and Syder (1983) in language processing, of Cowie (1981)
and Aisenstadt (1981) in lexicography, much of which in turn is influenced by
the Russian tradition of phraseology (e.g. Arnold 1986). There are important
ways in which this approach differs from other perspectives on phraseology,
partly in the terminology used, but also in the ways in which items are
identified and categorized. For example, Weinert in a recent Applied Linguistics
article (1995) focuses on the function of formulaic language in spoken
discourse and on its role in acquisition processes (for instance, as a
communication strategy). Her chief interest is in the linguistic development
of early L2 learners' speech, rather than in phraseology per se, and possibly as
a result there is some lack of precision in terminology. She lists the following
associated terms: 'formulas, prefabricated or ready-made language, chunks,
unanalysed language or wholes, etc.', and, she says, 'I will use these terms
interchangeably' (1995: 182). Her justification is that 'researchers have very
much the same phenomenon in mind', but there is a danger that such terms
may be used too loosely as labels for a wide range of phenomena that may,
under closer examination, differ significantly from each other, and she ignores
the very considerable problems of categorization. Underlying her list of terms,
several phraseological features or processes can be distinguished: firstly, the
formulaic nature of expressions (i.e. the conventional 'form-meaning
pairings' (Pawley and Syder 1983) that become institutionalized in a
language); secondly, memorization (which can be seen as a property of the
individual language user); thirdly, lexicalization (when a multi-word unit
becomes stored and processed unanalysed as if it were a simple lexical item).
A fourth feature is that of fixedness, not explicitly referred to in the above list,
though the term 'fixed expression' is common in certain branches of
phraseology, especially lexicography (Moon 1992) and language teaching
(Alexander 1984).
It may give a misleading impression to suggest that 'formulaic language' is a
single category, to be contrasted with language generated by rule, and
encompassing all significant features of word combinations. In the absence of
formal criteria for identification and categorization it is not clear whether
'formulaic language' includes such 'prefabricated chunks' as collocations and
idioms. Furthermore, an examination of a small amount of natural data
suggests that the above properties of formulaic language are gradable. Taking
lexicalization as an example, what is an unanalysed whole in a child's speech
may become analysed at a later date (see Peters 1983), though, as Bolinger
(1976) points out, a speaker can continue well into adulthood before
discovering the analysability of certain combinations. What is partially
unanalysed for a native speaker may be produced in identical form by
compositional means by a learner, and even among native speakers
expressions will be more or less analysed (for instance, in to let off steam (=
'to display anger'), speakers may vary in the degree to which the literal sense
of steam (as 'water vapour') is activated), or features of the context may cause
an expression to be more decomposed for a particular speaker. For example,
in he conceived the idea of crossing Antarctica. It was not all plain sailing, the marine
sense of sailing might for some people become apparent within what is
normally an opaque idiom. It is not only possible but necessary to discriminate
between categories of combination according to some at least of the above
criteria and to avoid the very broad use of 'formula', limiting its application to
the more specific sense employed by Coulmas (1979) and others.
A widely recognized approach to phraseology has become familiar to ELT
professionals through the work of John Sinclair and COBUILD (see Sinclair
1991, for example). It has also found specialist application in research into the
phraseology of speech (Altenberg 1993, forthcoming) and in the production of
a dictionary of collocations (Kjellmer 1994). It overcomes some of the
complex problems of description by means of statistical measurement and
claims that a radical new ('data-driven') approach to categorization is needed.
The focus of this approach is typically on co-occurrences of word-forms that
are recurrent in a body of texts, drawing on Firth's concept of 'meaning by
collocation'. The significance of any pair of forms is stated in terms of its
occurring more frequently than probability would predict: the more
frequently it is found, the more significant it is regarded to be in the
language. The reliability of such statements derives from the size of the corpus
being analysed, the whole approach to analysis being greatly assisted by
advances in computer technology and the development of very large corpora.
Naturally, such automatic quantitative analysis focuses on performance and
may exclude considerations of competence. It is essential for those interested
in learners' developing proficiency to consider processes of memory storage
and production underlying performance. It may be extremely difficult to
arrive at firm answers, but it is crucial to consider what kind of lexical units
are stored and how they are manipulated. The mental lexicon clearly holds
more abstract entities than are identified by computational searches, and
neither native speakers nor learners produce word combinations on the basis
of their frequency and probability of co-occurrence. Furthermore, a notion of
significance based solely on frequency risks giving unwarranted emphasis to
completely transparent collocations such as have children, which may occur
frequently as a result of the subject matter of certain texts but are quite
unproblematic for processing. To be of use to learners and teachers of a
language, the notion of phraseological significance needs to take into account
differences between phraseological types and to consider how they are
processed by native and non-native speakers and writers in production.
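
The frequency-based notion of significance described above, in which a pair of forms counts as significant when it co-occurs more often than chance would predict, can be illustrated with a small sketch. It is not the procedure used by Sinclair, Kjellmer, or any other study cited here. It assumes adjacent word pairs, a toy text, and pointwise mutual information as one common way of comparing observed with expected co-occurrence; the function name bigram_scores is hypothetical.

```python
import math
from collections import Counter


def bigram_scores(tokens):
    """Score each adjacent word pair by pointwise mutual information (PMI).

    PMI = log2(observed pair probability / probability expected if the
    two words occurred independently). Positive values mean the pair is
    found more often than chance would predict.
    """
    word_counts = Counter(tokens)
    pair_counts = Counter(zip(tokens, tokens[1:]))
    n_words = len(tokens)
    n_pairs = n_words - 1

    scores = {}
    for (w1, w2), observed in pair_counts.items():
        p_pair = observed / n_pairs
        p_independent = (word_counts[w1] / n_words) * (word_counts[w2] / n_words)
        scores[(w1, w2)] = math.log2(p_pair / p_independent)
    return scores


# Toy text: an assumption for illustration, not data from the studies cited.
text = (
    "they have children and we have children but "
    "they make a claim and we make a claim"
)
scores = bigram_scores(text.split())
for pair in [("have", "children"), ("make", "a"), ("and", "we")]:
    print(pair, round(scores[pair], 2))
```

A score of this kind treats a transparent pair such as have children exactly like any other recurrent pair, which is the limitation the preceding paragraph points to: the measure says nothing about how the combination is stored or processed.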
The approach followed here recognizes the enormous value of corpora large
and small, but takes the view that phraseological significance means
something more complex and possibly less tangible than what any computer
algorithm can reveal. By closely examining the internal form and external
function of word combinations it is possible to establish a set of features which
provides a basis for their categorization. These include semantic specialization,
syntactic restriction, and the blocking of lexical substitution (or lack of
'commutability'). The following model is an amalgamation of several that are
quite widely accepted, and owe much to Russian lexicology (Arnold 1986,
Cowie 1988, Gläser 1988).

word combinations
    functional expressions
        non-idiomatic
        idiomatic
    composite units
        grammatical composites
            non-idiomatic
            idiomatic
        lexical composites
            non-idiomatic
            idiomatic

Figure 1: Phraseological categories

The most significant feature of this model is that the split between idiomatic
and non-idiomatic combinations, which some writers such as Zgusta (1971)
apply as a primary division, is here seen to cut across the functional and
formal categories. The major division of word combinations into 'functional
expressions' and 'composite units' corresponds to Zgusta's second-level
distinction between 'multi-word lexical units' and 'set groups' and Gläser's
'propositions' and 'nominations'. Functional expressions, on the one hand,
are identified by their role in discourse (for example, discourse-structuring
devices, such as gambits: For a kick off), and some may be complete
utterances in themselves: proverbs (You scratch my back and I'll scratch yours),
catchphrases (What's up doc?), and slogans (Your country needs you) (categories
and examples taken from Alexander 1984). Composite units, on the other
hand, have a syntactic function in the clause or sentence and are generally
best seen as realizations of phrase structures such as prepositional phrases,
noun phrases, etc. Following Benson (1985) they can be further divided into
grammatical and lexical categories, depending on the word class of their
constituents: lexical collocations consist of two open class words (verb + noun
(make a claim), adjective + noun (ulterior motive), etc.), while collocations
between one open and one closed class word are grammatical: preposition +
noun (in advance), adjective + preposition (fond of). As Figure 1 indicates, the
split between idiomatic and non-idiomatic applies equally to grammatical and
lexical composites, though as Figure 2 shows, this further sub-categorization is
not a simple two-way division but a continuum derived from the application
of such criteria as restricted collocability, semantic specialization, and
idiomaticity, each of which is gradable. This continuum is illustrated below
by lexical composites of verb and noun and grammatical composites of
preposition and noun.

                         free              restricted     figurative              pure
                         combinations      collocations   idioms                  idioms

lexical composites       blow a trumpet    blow a fuse    blow your own trumpet   blow the gaff
(verb + noun)

grammatical composites   under the table   under attack   under the microscope    under the weather
(preposition + noun)

Figure 2: Collocational continuum

Free combinations (also referred to as open or free collocations) consist of
elements used in their literal senses and freely substitutable (carry a trumpet, on
top of the table). Restricted collocations have one component (usually the
preposition, verb, or adjective 'collocator' of the 'base' noun, to use
Hausmann's 1979 terms) that is used in a specialized, often figurative sense
only found in the context of a limited number of collocates. While figurative
idioms have metaphorical meanings in terms of the whole and have a current
literal interpretation, pure idioms have a unitary meaning that cannot be
derived from the meanings of the components and are the most opaque and
fixed category. The significance of composites is regarded as psychological,
their degree of restrictedness related to mental storage and processing.
Significance is therefore gradable and the result of a complex of features
rather than simply a statistical measure.
Howarth (1996) describes how the main criterion of commutability can be
applied more or less strictly to vary the number of subdivisions of the central
category, 'restricted collocations'. The strictest application, allowing no
substitution of either verb or noun element, identifies restricted collocations
that are on the borderline with idioms (e.g. curry favour); allowing some
substitution of either the verb or noun admits into the category pay/take heed
and give the appearance/impression. A more liberal application, permitting
limited substitution in both elements, would include introduce/table/bring
forward a bill/an amendment as restricted. There are clearly problems in
making this method of categorization reliable. The difficulty lies in finding an
authority for deciding on what substitutions are 'permitted'. A pragmatic
combination of published collocational dictionaries and (increasingly) large
corpora can provide substantial amounts of data, and recent technological
developments in automatic lemmatization, tagging, and parsing have enabled
computational processing to identify collocations at the required abstract,
lexemic level. However, it must be recognized that decisions about the
acceptability of combinations that occur individually at very low frequencies
must continue to rely heavily on human judgement. The absence of a possible
combination from dictionaries and even large corpora cannot reasonably
exclude it from consideration. Additionally, the collocations of most interest
in studying acquisition are not typically fixed enough for automatic
identification.
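
The three degrees of strictness described at the start of the preceding paragraph can be sketched as a small decision procedure. The sketch assumes that the permitted substitutes for each element have already been judged by a human analyst with the help of dictionaries and corpora; the function name, the level labels, and the input format are hypothetical and are not taken from Howarth (1996).

```python
def commutability_level(verbs, nouns):
    """Return the strictest application of the commutability criterion
    under which a verb + noun combination still counts as restricted.

    `verbs` and `nouns` are the sets of elements judged (by a human
    analyst, not by this code) to be permitted in the combination.
    """
    verb_varies = len(verbs) > 1
    noun_varies = len(nouns) > 1

    if not verb_varies and not noun_varies:
        return "no substitution (borderline with idioms)"
    if verb_varies != noun_varies:
        return "substitution in one element"
    return "limited substitution in both elements"


# Examples taken from the text; the set representation is an assumption.
examples = {
    "curry favour": ({"curry"}, {"favour"}),
    "pay/take heed": ({"pay", "take"}, {"heed"}),
    "give the appearance/impression": ({"give"}, {"appearance", "impression"}),
    "introduce/table/bring forward a bill/an amendment": (
        {"introduce", "table", "bring forward"},
        {"bill", "amendment"},
    ),
}
for label, (verbs, nouns) in examples.items():
    print(label, "->", commutability_level(verbs, nouns))
```

What counts as limited, as opposed to free, substitution is exactly the judgement the sketch leaves to the analyst, which is why the method cannot be made fully automatic.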

Native speaker phraseological competence


It is generally agreed by phraseologists, whatever their theoretical standpoint,
that native speaker linguistic competence has a large and significant
phraseological component. For those with an additional interest in the
development of such competence in a second language, there is a need to
understand much more about how and to what extent this component
presents difficulties for learners. If it is accepted that phraseological
phenomena are not merely fixed or formulaic expressions and therefore are
not purely memorized as lexical items, it is necessary to make a much more
detailed description of conventionality than simply listing familiar phrases (an
approach unfortunately common in ELT materials). The conventionality
associated with the use of phraseological expressions results from
specialization of form and meaning, and evidence can be found from errors at the
phraseological level for a considerable amount of variability in spontaneous
native speech. This may be unintentional deviation or deliberate creativity,
and the stylistic effects of non-standard or unconventional phraseology can be
briefly summarized:

(a) Unconventional use forces the decomposition of combinations, and
readers or listeners become unnaturally aware of the constituent
elements in the expression (they are forced into being taxis rather than
buses, in Aitchison's (1987) graphic terms). There is much greater need to
refer to the context, which slows down processing and defeats the
purpose of conventionality. This, of course, explains why deviation is so
effective when intended, but unintended deviation in certain (especially
more formal) registers can have quite undesired effects.
(b) In spite of this, intelligibility is generally maintained. In cases of severe
distortion there is a strong tendency for listeners or readers to find a
conventional interpretation to overcome problems of comprehension, by
reference to a known familiar combination: He's not standing on his laurels
[resting]. Related to this is the finding, confirmed empirically by Moon
(forthcoming), that the conventional meaning of an idiom dominates
over any alternative literal interpretation: 'Corpora support the view that
the idiom-collocation or metaphor blocks use of the literal equivalent'.
For this reason, literal uses of conventionally specialized expressions are
generally avoided, unless employed for effect, and indeed their use may
be taken as a sign of non-nativeness: The Yamamah project is a big deal
(NNS).
(c) This ability to find an interpretation is associated with the arbitrariness of
some more idiomatic combinations. If one or more elements in an
expression is figurative and makes no independent contribution to its
overall meaning, it may not make much difference which combination is
produced. The idiomaticity of an expression may mean that the
conventional sense associated with the whole expression can be accessed
from only a partial lexical clue (assuming sufficient assistance from the
context) and no analysis is required: he seems to be carrying the rap [carry the
can/take the rap]. Indeed, even in less idiomatic expressions the deviation
may pass unnoticed: Mugs and crockery are disappearing with regular
monotony [with monotonous regularity].

These instances of natural language use could be dismissed as performance
slips, but the consideration of these phenomena can make a contribution to
an understanding of underlying language processes, if described within a
framework that allows categories to emerge and fine distinctions to be made.
It becomes clear from this evidence that the phraseological performance of
learners at an advanced level is likely to be under severe strain, with demands
on their control of lexical complexes and stylistic appropriateness. There now
follow the findings from a range of formal and informal studies, illustrating
what can be learned from a detailed application of the descriptive method to
bodies of native and non-native texts. They are suggestive of how learners'
collocational proficiency develops.

Understanding non-native collocational competence


A glance through recent EFL coursebooks (e.g. Flower and Berman 1989,
Harmer and Rossner 1991, Harrison 1990, McCarthy and O'Dell 1994,
Redman and Ellis 1991, Rudzka et al. 1981, Wellman 1992) shows that
teachers and materials writers are paying increasing attention to the need
for learners to acquire knowledge of collocations and are aware that this
component of competence should be addressed explicitly. Although this need
was recognized and examined in detail as long ago as the 1930s (Palmer 1933,
IRET 1933), the prolonged influence of generative grammar and the purer
forms of communicative language teaching downgraded vocabulary learning
in the syllabus and made teachers and applied linguists shy away from any
materials that smacked of phrasebook learning. As Cowie (1992) has shown,
this is to misunderstand the significance of phraseology, not only in second
language acquisition, but also in the linguistic system as a whole, and since
the early 1980s there have been an increasing number of writers
acknowledging this fact: for example, Channell 1981, Pawley and Syder 1983, Peters
1983, Allerton 1984, Nattinger and DeCarrico 1992, Widdowson 1989. The
following is representative:

So often the patient language-learner is told by the native speaker that
a particular sentence is perfectly good English but that native
speakers would never use it. How are we to explain such a state of
affairs? (Allerton 1984: 39)

Nattinger and DeCarrico (1992), in one of the few book-length studies of L2
phraseology, see formulaic units (or 'lexical phrases', distinguished from
collocations in having a pragmatic function) as 'the very center of language
acquisition' (xv):

One common pattern in language acquisition is that learners pass
through a stage in which they use a large number of unanalyzed
chunks of language in certain predictable social situations. (ibid.)

There is general recognition of the problems facing learners in achieving the
naturalness of native-speaker use that derives from the appropriate selection
of conventional phraseology. Unfortunately, few of these references to L2
phraseology examine learner language empirically. Palmer (1933) produces
an extensive categorization of problematic target forms in English, but does
not seem to have gone further in describing learner performance than a few
hypothetical examples of error (make a question, perform a favour, do trouble,
keep patience), and it is still true today that, while the role of formulae and
other sentence-level expressions in second language development is well
documented (see Weinert 1995), few researchers have studied the acquisition
of collocations from the point of view of L2 learners' production.

Data
A possible reason is the problem of data collection and the selection of
subjects. One option is the informal recording of non-native speech and
writing from a range of sources, as in the first set of examples below, an
approach which can satisfactorily illustrate some of the descriptive points
made above:
(1) ?Give lip service to
(2) ?Keep them under key and lock
(3) ?The general Synod took the monumental decision
(4) ?Riddled with knife wounds
(5) ?May all your dreams become true
(6) ?You can lead a horse to the river, but you cannot make it drink the water
Not all, of course, are taken to be errors, but they may be viewed as deviating
from what native speakers would regard as the most natural and, perhaps,
acceptable form of expression. For example, in (1) the lexical idiom lip service
conventionally collocates with the verb pay to form a complex collocation.
The learner appears to have mastered the idiomatic component (the most
fixed part) but perhaps allows it more freedom to collocate with verbs than it
has for native speakers. (2) illustrates both the fixed order of lock and key and
its collocability with the preposition under, which can perhaps also be viewed
as restricted. The choice of monumental in (3) in place of momentous might be
treated as a semantic error, though the phonological similarity suggests that
the result was a collocational mis-hit. (4) indicates that riddled collocates with
wounds for this speaker, though, for a native, riddled would perhaps be
determined by the single collocation bullet wounds. (5) and (6) show that even
in longer stretches of language a tension can be seen between a conventional
form and the compositional expression of the same meaning. The intended
meaning is clear in each, but in the speech formula (5) there seems a strong
and arbitrary preference for the verb come (though a semantic distinction
between become and come could be argued). The explanation for (6) is harder
to find. It is structurally so close to the target that it cannot have been
generated compositionally. Unless there is a problem of interference from an
L1 proverb, it is probable that the learner has half-remembered the original
and not believed it to be so entirely fixed. While there is no shortage of data of
this kind, the random nature of the subjects and variable circumstances of
production reduce the value of this method.
A second approach involves devising experimental procedures for eliciting
L2 collocations from an appropriately controlled homogeneous group of
subjects. Gabrys-Biskup (1992), Bahns (1993) and Bahns and Eldaw (1993),
for example, working from a pre-established set of English collocations
predicted to be problematic for German and Polish undergraduate students,
designed translation and other tasks that might reveal L1 interference (such as
lead a bookshop [run] or put up a record [set]). The acceptability of the forms
produced was in both studies evaluated by native-speaker informants. These
studies could be regarded as direct measures of collocational competence, as
the results are expressed in terms of the number of subjects who know a given
collocation. One drawback of this approach is the difficulty of establishing the
validity of any predefined list of target collocations, since, as many EFL
teachers might agree, this component of a learner's linguistic competence is
one of the least predictable, making it hard to generalize from subjects'
knowledge or ignorance of a small number of individual items.
Granger (forthcoming), also interested in interlingual collocational use (in
her case advanced French students of English), is one of the few researchers to
adopt a corpus approach to learner phraseology and a comparative analysis of
NS and NNS written performance. The NNS data is taken from the French
component of the International Corpus of Learner English (ICLE) (see
Granger 1993), composed of exam papers, timed, and untimed argumentative
essays, and therefore controlled for content and length. The corpus as a whole
consists of comparable texts from EFL learners from nine language
backgrounds and provides a substantial body of data for crosslingual studies.
Although her analysis is chiefly quantitative, she recognizes the difficulty in
the automatic retrieval of lexical collocations that occur at very low
frequencies. As a result, some of her most interesting findings focus on the
range of collocations used by the two groups of writers rather than on their
raw totals. For instance, in comparing the collocations produced only by NS
writers, those only produced by NNS and those produced by both groups, she
reports:

the native-exclusive combinations fell into two categories: stereotyped
combinations [restricted collocations] and creative combinations.
Both types of combination were significantly underused by the
learners; the few stereotyped combinations used by the learners
typically have a direct translation equivalent in French.

Additionally, Granger examined the judgements of the two groups about
which, in a list of 15 adjectives in each case, were acceptable collocates of 11
amplifiers:

On balance, the learners marked a greater number of types of
combinations than the natives, indicating that the learners' sense of
salience is not only weak, but partly misguided.

Cowie and Howarth (1996a), looking in greater detail at a much smaller
amount of data, also compared NS and NNS writing, though, without
controlling the language background of the L2 writers, they were unable to
make interlingual comparisons. It could be claimed that although interlingual
data is of great value, the problems that phraseology poses derive as much
from internal features of the target language as from interlingual interference,
and thus justify an intralingual approach. A further justification was the
practical problem for researchers in the UK of finding a linguistically
homogeneous group of learners studying at a British university. The focus
of that small-scale study was the collocational proficiency displayed in the
academic essays of relatively advanced NNS writers (non-native teachers of
EFL taking an MA in Linguistics and ELT) and NS undergraduates of English.
The hypothesis was that there might be a measurable overlap in collocational
use between less proficient NS and more proficient NNS writers. This overlap
was indeed found, in terms of the proportion used of collocations of a given
grammatical pattern (verb + noun) that could be regarded as restricted. In
order to discover this stylistic feature of academic texts, it was necessary to
analyse each essay individually, rather than the two complete corpora.
This is the approach to NNS data analysis followed in the study reported in
Howarth 1996, on the grounds that language proficiency is the property of an
individual learner, and by extracting an average performance from a corpus of
several writers one loses opportunities for identifying significant differences
among learners' processing mechanisms and cognitive strategies. Therefore,
while NS norms can be established through the large-scale analysis of corpora,
NNS proficiency is best studied by means of small-scale manual analysis.

NNS versus NS collocational use

The scope of the research¹ was limited in several ways. First, the focus was
narrowed to one category of conventional language: restricted lexical
collocations (see Figure 2). Second, it concentrated on a single register:
formal written English in social science academic writing. If phraseological
competence makes an important contribution to successful style, it was
desirable to focus on a register in which the stylistic demands can be clearly
specified. In the case of academic writing, these are assumed to be clarity,
precision, and lack of ambiguity. Other formal registers would make different
demands: a diplomatic or political speech, for example, might allow
obfuscation and ambiguity on occasions (such as current affairs media
interviews), though these effects would still need to be under the control of
the speaker. Third, the category of lexical collocations was limited to transitive
verb + object noun, a small part of the whole framework, but of considerable
importance from the point of view of the propositional content of an academic
argument, and therefore central to the production process. Much of the
distinctive procedural vocabulary of academic disciplines can be found in such
predicate structures as make a claim, reach a conclusion, adopt an approach, set out
criteria.
The NS data consisted of appropriate sections of the 1 million-word LOB
(Lancaster/Oslo/Bergen) corpus of written British English (the 29 texts
classified as social science²), supplemented by published texts in sociology,
education, and law acquired from British academic writers. The total
amounted to 240,000 words. The NNS data was drawn from a set of 10
essays, written by 10 non-native postgraduate students as assignments for an
MA in applied linguistics at the University of Leeds. The total came to 25,000
words. The point of departure in analysing the former set of data was a list of
the highest frequency verb lemmas (more than 60 in total), which yielded in
excess of 5,300 lexical collocations of the desired pattern in the NS corpus. In
the case of the NNS material, all collocations (1,200) of this pattern were
examined.
By means of formal criteria it is possible to categorize such collocations
according to their degree of restrictedness, and thus identify degrees of
conventionality (see Figure 2 above):

• the use of a verb in a specialized sense (figurative, technical, or delexical), and
• a degree of limitation on the permitted substitution of the verb or noun

Assistance in making such judgements was provided by the few collocational
dictionaries available (Benson et al., BBI Combinatory Dictionary of English, 1986,
and Dzierzanowska and Kozłowska, Selected English Collocations, 1988) and other
phraseological dictionaries, such as the Oxford Dictionary of Current Idiomatic
English (Cowie and Mackin 1975/1993 and Cowie et al. 1983). At the time the
research was undertaken there were unfortunately no suitable large corpora
easily available for reference.
The following table gives examples of combinations drawn from the NS
corpus along with their categorization:

combination                                 category

COMPARE behaviour, levels, results, size    free collocations
EMPHASIZE autonomy, concept, link, rights   free collocations
INFLUENCE content, culture, groups          free collocations
INTRODUCE bill, amendment, motion           restricted collocations Level 1
PAY attention, heed                         restricted collocations Level 2
MAKE decision, improvements
GIVE credit to sb, preference to sth        restricted collocations Level 3
DRAW line                                   figurative idiom
SET store by sth                            pure idiom

Figure 3: Examples from NS data

It is essential to see the categories as forming a continuum from the most free
combinations to the most fixed idioms, rather than discrete classes. Dividing
lines cannot be strictly drawn, though points along the scale are regarded as
somehow reflecting psychological reality. (It can be pointed out here that
much of the statistically significant data produced by computational
techniques consists of free combinations.)

Research findings
Quantitative
The detailed numerical data from the research are reported elsewhere
(Howarth 1996). The figures are derived from a categorization of
the verb + noun collocations into the three main levels of restrictedness (free
collocations, restricted collocations, and idioms): the proportion in the whole
NS corpus of the two categories regarded as conventional (restricted
collocations and idioms) was 38%. This finding is helpful in providing an
indicator of collocational 'density' and degree of stylistic conventionality, and
permits comparison with analyses of other registers. Cowie's studies (1991,
1992) of journalistic writing (news reports and feature articles) give
comparable figures of between 40% and 50%. Cowie and Howarth (1996)
provides further support for the existence of a stable norm of mature native
phraseological use underlying proficient performance in a given register.
This quantitative measure provides the most straightforward means of
comparison with the non-native writers in this study, who produced, on
average, a much lower density of conventional combinations (25%),
suggesting either a generally lower level of knowledge of collocations, or a
lack of awareness of how to deploy them appropriately, or both (a finding
confirmed by Granger forthcoming).
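
The density measure can be made concrete with a short sketch. The Python fragment below is illustrative only: the category labels follow the three main levels named above, but the function name `conventional_density` and the sample counts are hypothetical and are not taken from the research data.

```python
from collections import Counter

# Categories treated as conventional in the text: restricted collocations and idioms.
CONVENTIONAL = {"restricted", "idiom"}


def conventional_density(categories):
    """Return the share of collocations labelled as restricted or idiom.

    `categories` is an iterable of labels ("free", "restricted", "idiom"),
    one per verb + noun collocation found in a text.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no collocations to analyse")
    conventional = sum(counts[c] for c in CONVENTIONAL)
    return conventional / total


# Hypothetical sample: 100 categorized collocations from one invented text.
sample = ["free"] * 62 + ["restricted"] * 33 + ["idiom"] * 5
print(f"{conventional_density(sample):.0%}")
```

Computed per writer rather than per corpus, the same function would give the individual figures that the following paragraphs rely on.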
The purpose of the research, however, was not to consider the average
performance of a group of learners, but to identify indicators of proficiency, a
necessarily individual phenomenon. This overall score disguises a range of
individual NNS figures from 13% to 33% of conventional patterns, and
attempts were made to discover a direct correlation between these scores and
other measures of proficiency. Data was available from an English language
test administered in the university to all students whose first language is not
English, though the percentage score on this test is not intended to be a highly
accurate measure of general proficiency. (It is used for the rapid identification
of students likely to have difficulty studying academic subjects through the
medium of English.) However, it was thought possible that the rank ordering
of the 10 subjects' test scores might correlate with their ranked collocational
score as defined above. No correlation was found (0.15), thus perhaps
suggesting that this language test was the wrong measure to choose and that
another more global test might correlate better. Alternatively, it could be that
appropriate collocational performance, in the sense of approximating to NS
norms of conventionality, is a highly individual matter of style, which follows
a quite separate path of development from measurable levels of general
language proficiency.
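
A comparison of two rank orderings of this kind can be sketched as follows. The text does not name the coefficient used, so Spearman's rank correlation is an assumption here, and the two score lists are invented for illustration; they are not the study's data.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("undefined when all values in a sequence are tied")
    return cov / (var_x * var_y) ** 0.5


# Invented scores for 10 writers (hypothetical, for illustration only).
test_scores = [55, 62, 48, 71, 66, 59, 80, 52, 74, 68]
density = [25, 13, 30, 22, 33, 18, 27, 21, 16, 29]
rho = spearman(test_scores, density)
print(round(rho, 2))
```

With only 10 subjects, a coefficient close to zero says little either way, which is consistent with the cautious interpretation offered above.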

Qualitative
Numerical analysis, then, appears to be of limited help in understanding
collocational proficiency. Qualitative conclusions drawn from the production
of individual NNS writers in relation to the norms of NS performance can be
more revealing, especially when focused on the central area of the
collocational spectrum: the borderlands between restricted and free
collocations. It is here that the greatest descriptive effort is required and where the
most significant features of NS phraseological competence and NNS
proficiency are discovered. It may be claimed that the problem facing the
non-native writer or speaker is knowing which of a range of collocational
options are restricted and which are free. The conclusions that stand out as
particularly illuminating relate to types of deviation from NS norms and types
of strategy employed by learners.

Types of deviation
One of the most useful results of the detailed analysis has been to uncover
certain clusters of overlapping collocations, found towards the free end of the
restricted category:

[Figure 4: Overlapping collocations. Diagram linking a set of verbs (left) to the nouns list, memorandum, and table (right).]


In this example, the verbs on the left are treated as being equivalent in
meaning (in this particular lexical context), while the nouns on the right are
part of a lexical set, which could be extended, but is more or less constrained
by the set of verbs. Taken together they form a cluster of collocations in which
certain combinations are unacceptable, for example ?write a table. Their
significance lies in the way in which specific collocations might be predicted
by analogy, but are arbitrarily blocked by usage, and clearly they are the kind
of phenomenon likely to confound learners. Collocational clusters of this kind
are a descriptive tool with great potential for the very fine analysis of verb
meaning carried out by lexicographers (see Howarth 1996 for discussion).
While NS writers do not seem, on the whole, to fall into the trap of producing
collocations that are arbitrarily blocked, the largest class of NNS errors found
in the research data involves the production of a blocked combination in an
overlapping cluster.

(7) a test can be designed to test the ability to participate in group
discussion for students required to perform a group project
*perform a project
perform a task
carry out a task/project

In this case the combination perform a project is regarded as non-standard, and
the proposed explanation is that the writer, aware of the acceptability of
perform a task, believed that task and project share identical collocability with
verbs. Further examples in this category include *pay effort (pay attention/a call,
make a call/an effort) and *reach findings (reach a conclusion, arrive at a
conclusion/findings). This method of analysis finds support from Granger's
(forthcoming) evidence concerning learners' misguided sense of collocational
'salience'.
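
One way to picture an overlapping cluster is as a grid of verbs against nouns in which some cells are empty. The sketch below uses only the combinations given in example (7); the function name `cluster_gaps` and the set-based representation are hypothetical conveniences, not part of the analysis reported here.

```python
from itertools import product

# Attested combinations from example (7) in the text.
attested = {
    ("perform", "task"),
    ("carry out", "task"),
    ("carry out", "project"),
}


def cluster_gaps(pairs):
    """Return verb-noun pairs predicted by analogy but not attested.

    Every verb in the cluster is crossed with every noun; whatever is
    missing from `pairs` is a candidate blocked combination.
    """
    verbs = sorted({verb for verb, _ in pairs})
    nouns = sorted({noun for _, noun in pairs})
    return [(v, n) for v, n in product(verbs, nouns) if (v, n) not in pairs]


print(cluster_gaps(attested))
```

A gap found this way is only a candidate: whether it is genuinely blocked still has to be judged against NS usage or a collocational dictionary.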
It is perhaps this type of error that many EFL teachers will recognize and
that presents them with the hardest task of description and explanation, since,
unlike the more restricted collocations and idioms, combinations of this kind
are probably not all learned as inflexible wholes. While they are often not
fully lexicalized, they are quite institutionalized, and therefore form part of
the stock of complexes that help to mark a piece of writing as natural and
proficient. It appears that the ability to manipulate such clusters is a sign of
true native speaker competence and is a useful indicator of degrees of
proficiency across the boundary between non-native and native competence.
A second type, accounting for the largest number of NS non-standard
lexical substitutions in the research data, is blending, which differs from the
first type in an important way. Within overlapping clusters there is a natural
tendency to 'fill in' the combinations by analogy, if, as a learner, one is
unaware of the restrictedness, or, in spontaneous native speech, one is under
processing pressure. Blends, on the other hand, seem to occur among more
restricted collocations, where the verbs involved are more obviously figurative
or delexical in meaning and the collocations as a whole are semantically more
closely related.

(8) and we can pay particular care to look at the fortunes of United
Kingdom trade
*pay care
pay attention
take care

While native speakers might repair their slips in overlapping collocations such
as do a proposal, they might not be aware of having produced a blend such as
he seems to be carrying the rap. (The distinction between blends and overlapping
collocations is more fully discussed in Howarth forthcoming.) It is noticeable
that NNS writers produce many fewer blends than overlaps and that it is the
more proficient who produce them.

(9) We have drawn a favourable correlation between motivation and
proficiency in language
*draw a correlation
draw a comparison
make a correlation

Dechert and Lennon (1989) conduct a detailed study of blends in learner
essays written under test conditions, though their conception of blending is
somewhat broader, interpreting forms such as he has got financially broke as a
blend of he is broke and he has got into financial difficulties. These forms are
regarded as part of a careful style in learners' interlanguage, produced under
certain task conditions, in which the writer's focus is on linguistic form. They
suggest that the prevalence of blends in advanced learner language shows that
this linguistic level has been ignored in language teaching in favour of syntax,
and has not been developed through use to the same degree of 'procedural
automaticity' as have syntactic structures. It may be, however, that blending
is a sign of competence rather than its absence. Since one can only produce
blends of collocations that are already known, they could be regarded as
useful evidence of the acquisition of phraseology. Again, the evidence
suggests the need for detailed analysis and sub-categorization in
understanding L2 proficiency.

Types of strategy
There is a danger in the above approach to description in focusing too much
attention on error, deviance, or non-standard forms. While analysing what
makes an individual collocation non-standard can help in understanding what
the non-native has done on a particular occasion, and some general
conclusions can be drawn, there is a need for alternative perspectives to
increase our understanding of deeper processes of acquisition such as learner
strategies. In discussing strategies in relation to phraseology, one must
distinguish between two different phenomena. On the one hand, there is
the repeated use of routines and patterns as an early communication strategy
used by a speaker to overcome a lack of linguistic resources (discussed in
Krashen and Scarcella 1978); on the other hand, there are cognitive strategies
used by more advanced learners when consciously attending to collocational
knowledge. We are concerned here with the latter. The following sections
present a preliminary categorization of strategies, drawn from the empirical
collocational studies already referred to.
Avoidance. This category includes cases where the learner is either unable to
produce a target L2 collocation or gives as a paraphrase a freely generated
expression. It is clear from her test material that Gabrys-Biskup (1992)
recognizes the significance of the distinction between restricted collocations
and free combinations in learners' production and its relevance to proficiency.
She was able with her experimental method (outlined above) to identify
instances when subjects gave no L2 response to an L1 stimulus. In her
research, she found that Polish subjects were more inclined not to answer
than German students and less likely than the latter to provide a descriptive
paraphrase (e.g. to break a nut open for crack a nut). She attributes this
difference to 'the Polish educational system insisting on accuracy while the
Germans pay more attention to fluency and communication' (88). As a result,
Polish students gave more correct answers overall than did the Germans. On
the other hand, Bahns and Eldaw (1993) found that their German student
subjects had difficulty in accurately paraphrasing German collocations in
English, leading them to conclude that for successful L2 communication 'a
knowledge of collocations is essential' (109). At high levels of performance,
then, avoidance of collocations is not an effective solution.
Experimentation. One of the most interesting conclusions from
Gabrys-Biskup's research is that one reason for the greater number of errors among
German students was that they were risk-takers, and 'when [they] didn't
know an [English] restricted collocation they tried to find a synonymous one
(free combination)' (89). Granger (forthcoming) also talks of this creative
process in L2 production, and quotes from her corpus of French learners' essays
collocations such as ferociously menacing and shamelessly exploited, which she
takes to be non-institutionalized and thus invented. This process can, of
course, misfire, and she also found forms such as dangerously threatened and
irretrievably different. Examples of less successful NNS experimentation can also
be found in Howarth 1996: trigger the focus and ignite the emergence, though, as
the type of academic writing studied does not encourage creativity of this
kind, few instances would be expected.
The whole area of avoidance, paraphrase, and experimentation in
collocation use is highly problematic in learner language. While in NS use
non-institutionalized combinations are clearly produced by means of
generative rules and are semantically quite transparent (e.g. provide a
characterization), in non-native data it is more difficult to be certain of the
dividing line between the non-institutionalized and the lexically or
semantically deviant. There are combinations which are clearly nonce forms
but fail to communicate a clear meaning (e.g. that is the means to collect students'
interest). They can be regarded as not only non-institutionalized but also not
semantically rule-governed, therefore not the result of the simple substitution
of an unconventional collocate for a conventional one, and outside the realms
of phraseology.
Transfer. Risk-taking of the kind identified by Gabrys-Biskup may lead to
interference error, though learners might not be aware of any potential
problems in transferring an L1 collocation to L2. She was able to distinguish
between Polish interference errors (e.g. to state a record [set]) and German (to
lead a bookshop [run]). In contrast, Granger found evidence of successful
collocational transfer from L1. In her comparative study of NS and NNS
writing she writes:

Of all the combinations used by natives, the only one that translates
into French is precisely that used by learners—severely punished, which
corresponds to sévèrement puni (forthcoming)

Analogy. The process of adapting a known L2 collocation (by substituting one
element for another known lexical item) could be regarded as a form of
intralingual L2 transfer, and is clearly highly productive. Howarth (1996)
shows how this can result in a very successful degree of variation. The
following verbs were all used by a single NNS writer as collocates of method,
and this evidence illustrates the value of examining the work of individual
writers: adopt, base on sth, bring in, discuss, evolve, extend, implement, modify,
practise, recommend, teach, use.
However, the above discussion of overlapping clusters of collocations shows
how analogy can misfire and lead to overgeneralization of collocability (see
Cowie and Howarth 1996a): ?adopt ways (possibly on analogy with adopt an
approach?), ?carry out principles (from carry out research?).

Repetition. Without the confidence or inclination to extend collocations by
analogy, a writer may fall back on the repeated use of a limited number of
known collocations. One of Granger's main conclusions is that her French
NNS writers tend to overuse certain 'general-purpose' adverb + adjective
collocations with very, and some combinations such as deeply-rooted occurred
several times in the corpus. Howarth 1996 found a considerable difference
between individual writers' repetition of conventional collocations, ranging
from one text in which 42% of all instances were repeats to another with less
than 10% of repeats.
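
A repetition figure of this kind could be computed as in the sketch below. The operationalization is an assumption, since the text does not define a 'repeat': here the first occurrence of a collocation counts as new and every later occurrence counts as a repeat. The function name `repeat_share` and the sample list are hypothetical.

```python
def repeat_share(collocations):
    """Share of instances that repeat a collocation already used in the text.

    Assumed definition: repeats = total instances minus distinct collocations.
    """
    total = len(collocations)
    if total == 0:
        raise ValueError("no collocations to analyse")
    distinct = len(set(collocations))
    return (total - distinct) / total


# Invented example: 10 instances drawn from 6 distinct collocations.
text = [
    "make a claim", "reach a conclusion", "make a claim", "adopt an approach",
    "pay attention", "make a claim", "reach a conclusion", "set out criteria",
    "draw a comparison", "pay attention",
]
print(f"{repeat_share(text):.0%}")
```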
Discussion of learners' strategies in producing collocations must be tentative
at this stage, based so far on interpreting small amounts of data. To make
stronger claims about strategies would require much more information about
what individual writers know of L2 collocations. There is much less certainty
than with native writers concerning the competence that learners are drawing
on, whether they 'really know' the assumed target collocation or whether the
deviant form is the result of incomplete knowledge of an associated set of
collocations. The following example could be analysed either as a blend of two
collocations that have a semantic similarity or as the result of gaps in an
overlapping cluster of collocations:

*make a reaction
either (a) a blend of
give a reaction and
make a response
or (b) partial overlap within the following cluster

[Diagram: the verbs make and give (left) linked to the nouns reaction, comment, and response (right), with make + reaction marked as blocked.]

There can be no definitive choice between the two interpretations. It is not
sufficient in the case of non-native writers to identify a pair of collocations
assumed to be known that result in a blend, since there is no way of knowing
what the state of each writer's competence is.

Conclusion
Although there is a growing recognition of collocation (and phraseology in
general) in language teaching, there seems to be a lack of awareness of its true
significance. Linguists and teachers have traditionally concentrated their
attention on the extreme ends of the spectrum: free combinations and idioms,
giving learners the impression that there are two distinct modes of
construction: the unfettered application of generative rules to lexis in free
combinations, on the one hand, and complete frozenness in idioms, on the
other. The large and complex middle ground of restricted collocations (not
generally recognized as a pedagogically significant category) is often regarded
as an unrelated residue of arbitrary co-occurrences and familiar phrases.
The continuum model demonstrated above is designed to show the
connection. It has also been suggested here that learners' difficulties lie
chiefly in this central area, since idioms and free collocations are,
phraseologically, largely unproblematic. The greatest challenge lies in
differentiating between combinations that are free and those that are
somehow limited in substitutability. This is a non-trivial objective, which
involves distinguishing between what is semantic and thus generalizable and
what is collocational and thus highly specific, one in which teachers
themselves need a great deal of guidance.
It is misleading to account for the majority of collocations as arbitrary
co-occurrences that must be learned as wholes, as some materials do, encouraged
perhaps by the Neo-Firthian approach to collocational description, suggesting
that co-occurrence is a question of probability. It is the gaps in collocability (as
seen in partially-overlapping clusters) that are arbitrary, not the collocations
that do occur, and these may be of more significance for the development of
proficiency. A comparison between NS performance and NNS errors suggests
that at an advanced level learners are lexically competent and have
successfully internalized the more restricted collocations or semi-idioms.
There remains, however, a vast hinterland of less restricted combinations, far
too many to learn as unitary lexical items (which many are not). It is far more
efficient to teach the nature of the phenomenon and thereby develop
awareness of potential problems.
The categorization proposed provides a framework for uncovering stages in
L2 development and strategies of L2 learning that would not be found by
purely quantitative methods. Evidence of the kind presented here shows the
value of relatively small-scale but detailed manual analysis in uncovering
features of acquisition specific to the collocational component of competence.
This is not to deny the value of larger more automated corpus studies for
certain purposes, and it is hoped that a productive co-ordination of the two
approaches can be developed.
(Revised version received April 1997)

NOTES
1 Full details of the data and method can be found in Howarth 1996.
2 J22-35 Social, behavioural sciences and J36-50 Political science, law, education.

REFERENCES
Aisenstadt, E. 1981. 'Restricted collocations in English lexicology and
  lexicography.' ITL (Instituut voor Toegepaste Linguistiek) Review of
  Applied Linguistics 53: 53-61.
Aitchison, J. 1987. 'Reproductive furniture and extinguished professors' in
  R. Steele and T. Threadgold (eds.): Language Topics: Essays in Honour of
  Michael Halliday, Volume 2. Amsterdam: John Benjamins.
Alexander, R. 1984. 'Fixed expressions in English: Reference books and the
  teacher.' English Language Teaching Journal 38/2: 127-32.
Allerton, D. 1984. 'Three (or four) levels of word cooccurrence restriction.'
  Lingua 63: 17-40.
Altenberg, B. 1993. 'Recurrent verb-complement constructions in the
  London-Lund Corpus' in J. Aarts, P. de Haan, and N. Oostdijk (eds.):
  English Language Corpora: Design, Analysis and Exploitation. Amsterdam:
  Rodopi.
Altenberg, B. forthcoming. 'On the phraseology of spoken English: The
  evidence of recurrent word-combinations' in Cowie (forthcoming).
Arnold, I. V. 1986. The English Word. 2nd edition. Moscow: Vyssaja Skola.
Bahns, J. 1993. 'Lexical collocations: a contrastive view.' English Language
  Teaching Journal 47/1: 56-63.
Bahns, J. and M. Eldaw. 1993. 'Should we teach EFL students collocations?'
  System 21/1: 101-14.
Benson, M. 1985. 'Collocations and idioms' in R. Ilson (ed.): Dictionaries,
  Lexicography and Language Learning. ELT Documents 120. Pergamon
  Press/British Council.
Benson, M., E. Benson, and R. Ilson. 1986. The BBI Combinatory Dictionary of
  English. Amsterdam: John Benjamins.
Bolinger, D. 1976. 'Meaning and memory.' Forum Linguisticum 1/1: 1-14.
Channell, J. 1981. 'Applying semantic theory to vocabulary teaching.' English
  Language Teaching Journal 35/2: 115-22.
Coulmas, F. 1979. 'On the sociolinguistic relevance of routine formulae.'
  Journal of Pragmatics 3: 239-66.
Cowie, A. P. 1981. 'The treatment of collocations and idioms in learners'
  dictionaries' in A. P. Cowie (ed.): Lexicography and its Pedagogical
  Applications. Applied Linguistics 2/3 (thematic issue).
Cowie, A. P. 1988. 'Stable and creative aspects of vocabulary use' in
  R. Carter and M. McCarthy (eds.): Vocabulary and Language Teaching.
  London: Longman.
Cowie, A. P. 1991. 'Multiword units in newspaper language' in S. Granger
  (ed.): Perspectives on the English Lexicon. Louvain-la-Neuve: Cahiers de
  l'Institut de Linguistique de Louvain.
Cowie, A. P. 1992. 'Multi-word lexical units and communicative language
  teaching' in Arnaud and Béjoint.
Cowie, A. P. 1994. 'Phraseology' in R. E. Asher (ed.): The Encyclopedia of
  Language and Linguistics, Vol. 6. Oxford and New York: Pergamon.
Cowie, A. P. (ed.) forthcoming. Phraseology: Theory, Analysis and Application.
Cowie, A. P. and P. Howarth. 1996a. 'Phraseological competence and written
  proficiency' in G. M. Blue and R. Mitchell (eds.): Language and Education
  (British Studies in Applied Linguistics 11). Clevedon: Multilingual Matters.
Cowie, A. P. and P. Howarth. 1996b. 'Phraseology: A select bibliography.'
  International Journal of Lexicography 9/1: 38-51.
Cowie, A. P. and R. Mackin. 1975. Oxford Dictionary of Current Idiomatic
  English, Volume 1. (Second edition, entitled Oxford Dictionary of Phrasal
  Verbs, 1993.) Oxford: Oxford University Press.
Cowie, A. P., R. Mackin, and I. R. McCaig. 1983. Oxford Dictionary of Current
  Idiomatic English, Volume 2 (retitled Oxford Dictionary of English Idioms,
  1993). Oxford: Oxford University Press.
Dechert, H. and P. Lennon. 1989. 'Collocational blends of advanced language
  learners: A preliminary analysis' in W. Oleksy (ed.): Contrastive
  Pragmatics. Amsterdam: John Benjamins.
Dzierzanowska, H. and C. Kozłowska. 1988. Selected English Collocations.
  2nd edition. Warsaw: Panstwowe Wydawnictwo Naukowe.
Flower, J. and M. Berman. 1989. Build Your Vocabulary. Hove: Language
  Teaching Publications.
Gabrys-Biskup, D. 1992. 'L1 influence on learners' renderings of English
  collocations: A Polish/German empirical study' in Arnaud and Béjoint.
Gläser, R. 1988. 'The grading of idiomaticity as a presupposition for a
  taxonomy of idioms' in W. Hüllen and R. Schulze (eds.): Understanding the
  Lexicon: Meaning, Sense and World Knowledge in Lexical Semantics.
  Tübingen: Max Niemeyer.
Granger, S. 1993. 'The International Corpus of Learner English' in J. Aarts,
  P. de Haan, and N. Oostdijk (eds.): English Language Corpora: Design,
  Analysis and Exploitation. Amsterdam/Atlanta: Rodopi.
Granger, S. forthcoming. 'Prefabricated patterns in advanced EFL writing:
  Collocations and lexical phrases' in A. P. Cowie (forthcoming).
Harmer, J. and R. Rossner. 1991. More Than Words. London: Longman.
Harrison, M. 1990. Word Perfect. London: Nelson.
Hausmann, F. 1979. 'Un dictionnaire des collocations est-il possible?'
  Travaux de Linguistique et de Littérature 17: 187-95.
Howarth, P. 1996. Phraseology in English Academic Writing: Some Implications
  for Language Learning and Dictionary Making. Lexicographica Series Maior
  75. Tübingen: Max Niemeyer.
Howarth, P. forthcoming. 'The Phraseology of Learners' Academic Writing' in
  Cowie (forthcoming).
IRET. 1933. Second Interim Report on English Collocations. Tokyo: Institute
  for Research in English Teaching.
Kjellmer, G. 1994. A Dictionary of English Collocations. 3 volumes. Oxford:
  Clarendon Press.
Krashen, S. and R. Scarcella. 1978. 'On routines and patterns in language
  acquisition and performance.' Language Learning 28: 283-300.
McCarthy, M. and F. O'Dell. 1994. English Vocabulary in Use. Cambridge:
  Cambridge University Press.
Moon, R. 1992. 'Textual aspects of fixed expressions in learners'
  dictionaries' in Arnaud and Béjoint.
Moon, R. forthcoming. 'Frequencies and forms of fixed expressions in English'
  in Cowie (forthcoming).
Nattinger, J. and J. DeCarrico. 1992. Lexical Phrases and Language Teaching.
  Oxford: Oxford University Press.
Palmer, H. 1933. 'Aids to conversational skill.' The Bulletin of the Institute
  for Research in English Teaching 90: 1-3.
Pawley, A. and F. H. Syder. 1983. 'Two puzzles for linguistic theory:
  Nativelike selection and nativelike fluency' in J. C. Richards and
  R. W. Schmidt (eds.): Language and Communication. London: Longman.
Peters, A. 1983. The Units of Language Acquisition. Cambridge: Cambridge
  University Press.
Redman, S. and R. Ellis. 1991. A Way With Words. Cambridge: Cambridge
  University Press.
Rudzka, B., J. Channell, Y. Putseys, and P. Ostyn. 1981. The Words You Need.
  London: Macmillan.
Sinclair, J. 1991. Corpus, Concordance, Collocation. Oxford: Oxford
  University Press.
Weinert, R. 1995. 'The role of formulaic language in second language
  acquisition: A review.' Applied Linguistics 16/2: 180-205.
Wellman, G. 1992. Wordbuilder. London: Heinemann.
Widdowson, H. 1989. 'Knowledge of language and ability for use.' Applied
  Linguistics 10/2: 128-37.
Zgusta, L. 1971. Manual of Lexicography. Janua Linguarum Series Maior 39.
  The Hague: Mouton.
