JB[v.20020404] Prn:24/01/2005; 15:21 F: IJC10103.tex / p.
1 (54-120)
Corpus Linguistics and Critical Discourse Analysis
Examining the ideology of sleaze
Debbie Orpin
University of Wolverhampton
Critical Discourse Analysis (CDA) has often proved fruitful in providing
insights into the relationship between language and ideology. However, CDA
is not without its critics. Constructive criticism has been offered by Stubbs,
who suggests bolstering CDA by using a large corpus as the basis on which to
make reliable generalisations about language use. Taking up that suggestion,
this paper reports on a study of a group of words semantically related to
corruption. In the study, corpus methodology is used to manipulate the data:
concordances and collocational tools are used to provide semantic profiles of
the words and highlight connotational differences, and to identify the
geographical locations that the words refer to. It is argued that words with a
noticeably negative connotation tend to be used when referring to activities
that take place outside of Britain, while less negative words are used when
referring to similar activities in British contexts. CDA theory is drawn on to
interpret the ideological significance of the findings.
Keywords: corpus linguistics, critical discourse analysis, collocation, ideology
1. Introduction
For any scholar wishing to undertake a study of the relationship between lan-
guage and ideology, Critical Discourse Analysis (CDA) can provide a useful
framework (e.g. Fairclough 1989, 1992, 1995; van Dijk 1997; Wodak 1996).
One of the strengths of CDA is that by marrying a Hallidayan approach to
linguistic analysis (an approach that sees language as firmly rooted in its
sociolinguistic context) with theories relating to the mediation of ideology
and its relation to power structures in society, the researcher can make
insightful statements about the socio-political implications of the instances
of language use.
International Journal of Corpus Linguistics. © John Benjamins Publishing Company
Although proponents of CDA advocate taking a multidisciplinary ap-
proach to the study of language and ideology, CDA itself is situated firmly
within the field of Applied Linguistics. It is therefore disturbing to see CDA
criticised precisely for weaknesses in its linguistic analytical methodology.
Sharrock and Anderson (1981) and Widdowson (1995a, 1995b, 1996) voice
concerns about academic rigour, implying that the data is analysed in such a
way as to bear out the analyst’s preconceptions. Criticisms of CDA methods
have also come from within the field of critical language study, most notably
from Fowler (1996: 8), who draws attention to methodological weaknesses in-
herent in its qualitative approach to language study, stating that, although a
range of text types have been studied, ‘they tend to be fragmentary [and]
exemplificatory’.
A critique of CDA which could be seen as offering an important contribution
to the development of a more robust methodology is provided by Stubbs
(1997). Although he makes a series of criticisms, he also puts forward a number
of proposals for strengthening CDA. Among his criticisms, Stubbs (1997: 107)
points out that few CDA studies compare the features they find in texts with
norms in the language. Such comparison is crucial if reliable generalisations
are to be made concerning the effects of different linguistic choices. He also
takes issue with the fact that it is often hard to argue that the data sampled
in CDA studies is representative, since little data is analysed and selection
is not normally random. Among his proposals, Stubbs (ibid.: 107, 111)
emphasises the need to compare features of texts with language norms, and
suggests using a corpus for this purpose. He also stresses the necessity of
using a large body of data, so that reliable generalisations can be made about
typical language use.
2. Critical Discourse Analysis and Corpus Linguistics methodology
The major problem in combining a CDA approach with corpus methodology is
deciding where to start. The qualitative methods of CDA are obviously at odds
with the quantitative methodology of Corpus Linguistics, which is best suited
to describing the collocational and syntactic patterns of a given lexical item. If
corpus methods are to be employed in critical language study, the researcher
needs firstly to decide which aspects of the CDA approach can be best served
by corpus analysis, and secondly to find a point or points of entry into the
data. An attendant danger in using a large corpus is that the researcher may
feel swamped by the huge amount of data s/he is faced with. It is necessary
therefore to exploit whatever corpus tools are available to the best effect in
order to render the task more manageable.
Relatively few studies to date have used computer corpora to examine
language and ideology. Of these, most have looked at grammatical or lexi-
cal choice, concepts of key importance in CDA. Pronoun use is examined in
Stubbs’ (1992) study of sexism and language. Stubbs and Gerbig (1993) con-
sider transitivity choices in their study of the encoding of causation and agency
in a comparison of geography textbooks (see also Stubbs 1996), as do Galasin-
ski and Marley (1998) in their comparison of representations of the foreign
in the British and Polish press, and Jeffries (2003) in her article on the re-
porting of the 1995 Yorkshire drought. Examples of studies considering lexi-
cal choice are Caldas-Coulthard’s (1993) article on representations of women
in the news, Krishnamurthy’s (1996) study of the words ethnic, racial and
tribal, Stubbs’s (1996) work on corpus analysis and ideologically significant
language use, Hardt-Mautner’s (1995) analysis of representations of the EU in
the British press, Alexander’s (1999) work on business texts concerning ecolog-
ical issues, Bayley’s (1999) study of British parliamentary debates on European
integration, and Fairclough’s (2000) analysis of New Labour rhetoric.
Methods that are common to all of the above studies are: the compari-
son of frequencies, and the analysis of the syntagmatic environment of key
words. The basic software tool used to highlight typical collocational and syn-
tactic patterns is the concordancer, although some researchers (e.g. Louw 1993;
Krishnamurthy 1996; Stubbs 1996) make extensive use of collocational soft-
ware tools to automate the process of identifying the most significant collocates
of a word. Lists of significant collocates gathered in this way provide a seman-
tic profile of a word, and thus enable the researcher to gain insight into the
semantic, connotative and prosodic meanings of a word. This idea goes back
to Leech’s (1974) notion of collocative meaning (i.e. words have a tendency to
take on the meanings of their habitual collocates), and Sinclair’s (1991) idea of
semantic prosody (i.e. the connotative meanings of words can be coloured by
the collocates they attract, e.g. set in collocates with negative words such as rot,
decay etc.).
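The basic concordancer view described above can be sketched in a few lines of Python. This is only an illustration of the keyword-in-context idea, not the software used by the studies cited: the function name kwic, the crude regular-expression tokeniser and the two sample sentences are all assumptions made for the example.

```python
import re

def kwic(text, node, width=4):
    """Return keyword-in-context lines: up to `width` tokens either side of each hit."""
    # Crude tokeniser (an assumption): runs of letters, apostrophes and hyphens.
    tokens = re.findall(r"[A-Za-z'-]+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Right-align the left context so the node words line up vertically.
            lines.append(f"{left:>40} [{tok}] {right}")
    return lines

# Invented sample text, echoing the 'set in' example in the paragraph above.
sample = ("Decay had set in long before the inquiry. "
          "Once the rot has set in, it spreads quickly.")

for line in kwic(sample, "set"):
    print(line)
```

Reading down the aligned node column is what allows recurrent neighbours (here rot and decay) to be spotted by eye before any statistics are applied.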
Krishnamurthy (1996) adds a diachronic aspect to his study by comparing
frequency data for ethnic, racial and tribal taken from the pre-1985 Birming-
ham Collection of English Texts (18 million words) with data from the post-1985
Bank of English (167 million words).
The study reported here largely follows Krishnamurthy’s (1996) methodology,
but it specifically draws on CDA theory to interpret the sociological
implications of the findings. Since it combines corpus methodology with CDA
theory, this study can be considered as an attempt at responding to Stubbs’
(1997) proposal: that CDA methodology be made more reliable through the
use of random sampling, the analysis of large bodies of data (rather than merely
short or fragmentary texts), and by comparing features found in text samples
with language norms highlighted by the use of a large corpus. Furthermore,
by using collocational tools as well as concordances to provide semantic pro-
files of words, a fuller and more reliable picture of their meanings and associ-
ations is built up. This is crucial to CDA, which firmly espouses the view that
the choice of one word rather than another can encode a speaker’s ideological
stance towards what they are talking about.
3. Background to this study
The starting point for this study was a smaller piece of research I carried out in
1995 into the words sleaze and corruption. The stimulus for that research origi-
nated in the observations that: (a) sleaze shared some areas of semantic overlap
with corruption (i.e. denoting the abuse of a position of power for personal or
financial advancement), (b) sleaze was also used to refer to sexual misconduct,
(c) sleaze seemed to be the generally preferred choice, rather than corruption,
when referring to events in public life in Britain and the US (for example it
was used when talking about certain British politicians accused of accepting
bribes). For my data, I consulted the Bank of English corpus, which at that time
contained 167 million words of text. I found that use of the word sleaze was
restricted to the media data in the corpus (i.e. data from British and American
newspapers, journals, and broadcast news), and that it was more frequent in
the American data. The observation that the word sleaze was the preferred me-
dia choice over corruption when referring to misconduct in British and US pub-
lic life was borne out: of the 215 citations of sleaze, all but two instances referred
to British and US contexts. However, similar financial or political malpractice
in Southern and Eastern Europe, Africa, Asia and South America was typi-
cally referred to as corruption. Furthermore, on examining the concordances
in greater detail, it seemed that corruption had a greater negative connotation
than sleaze. Finding in my study that the words sleaze and corruption covered
some of the same semantic area, but were connotationally different, and were
used in different geographical contexts, raised questions as to what the
ideological implications may be, and whether other lexical choices made by the media
when talking about corruption in public life showed similar geographical re-
strictions. I decided therefore to carry out a more detailed study, extending
the set of nouns under examination, and using CDA theory to interpret the
implications of the results.
4. Method
4.1 The corpus
The data consulted was again drawn from the Bank of English corpus. But by
the time of this later study, the corpus had been updated and stood at 323 mil-
lion running words (it has since been updated again). All of the data dated from
between 1990 and 1996. The corpus was divided into 17 sub-corpora, each of
which could be accessed separately if so desired, and each containing data from
a different source (e.g. British spoken data, British books, American books,
British magazines, British journals, British radio news broadcasts, American
radio news, and various British, American and Australian newspapers). Four
of the sub-corpora represented data from British newspapers. These were the
Guardian, the Independent, the Times and the (now defunct) tabloid Today.
These four sub-corpora together contained over 800 texts. Owing to time con-
straints, I decided to limit the scope of the detailed study to an examination
of the lexical choices made in these four British newspaper sub-corpora. How-
ever, the general semantic profiles of the words were constructed using data
from the entire 323 million word Bank of English corpus.
4.2 The selection of lexical items
In extending the research, the aim was to assemble a set of nouns which were
synonyms, near-synonyms, or hyponyms of corruption, i.e. a set of nouns
that represented choices on the paradigmatic axis. To do this, two thesauri
were consulted, and a further list of nouns was added by looking through a
computer-generated list of the most significant collocates of corruption in the
corpus. The selected nouns also needed to occur frequently in the Guardian,
Independent, Times and Today sub-corpora. To qualify definitively for inclusion
in the set of nouns to be examined, each noun had to have at least 15 citations
in the four British newspaper sub-corpora. The final set of 8 nouns assembled
by this process consisted of: bribery, corruption, cronyism, graft,
impropriety/ies, malpractice(s), nepotism, sleaze.
4.3 Frequency and distribution
Having assembled the set of lexical items, the next task was to gather infor-
mation on their overall frequencies in the Bank of English corpus. In order to
discover whether the frequency of use of these items had changed over time, I
also compared the Bank of English frequencies with frequencies from the pre-
1985, 18 million word Birmingham Collection of English Texts. Since the 323
million word Bank of English is approximately 18 times the size of the Birm-
ingham Collection of English Texts, a reasonable comparison could be made by
simply multiplying the frequencies obtained in the earlier corpus by 18.
To gain an idea of the language variety, mode, genre, or discourse com-
munity in which each of the lexical items is typically used, their distribution
across the 17 sub-corpora of the Bank of English was examined. The distribu-
tion of a word is generally calculated as the average number of times it occurs
per million words of text in a given sub-corpus. Unfortunately, by the time of
this study, although frequency data from the Birmingham Collection of English
Texts was still available, concordances were not. It was therefore impossible to
eliminate from the pre-1985 frequency data instances of the major senses of
graft (i.e. those relating to surgical procedures and hard work), and count only
the instances for the “corruption” sense. The diachronic comparison of fre-
quency data for graft is therefore unreliable. However, the data relating to the
distribution of graft in the Bank of English data remains valid, as it is based
solely on occurrences of the “corruption” sense of graft.
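The per-million normalisation mentioned above is a one-line calculation. The sketch below applies it to whole-corpus figures reported in this paper, since the sizes of the individual sub-corpora are not given here; the function name per_million and the constant names are assumptions made for the example.

```python
def per_million(count, corpus_size):
    """Average number of occurrences per million running words."""
    return count / corpus_size * 1_000_000

# Corpus sizes as stated in the text (rounded figures).
BANK_OF_ENGLISH_SIZE = 323_000_000
BIRMINGHAM_COLLECTION_SIZE = 18_000_000

# sleaze: 1,152 occurrences in the later corpus
print(round(per_million(1_152, BANK_OF_ENGLISH_SIZE), 2))
# bribery: 57 occurrences in the earlier corpus
print(round(per_million(57, BIRMINGHAM_COLLECTION_SIZE), 2))
```

Normalising in this way makes counts from corpora or sub-corpora of different sizes directly comparable, which raw frequencies are not.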
4.4 Concordances and collocational data
For each lexical item, the concordances from the Guardian, Independent, Times
and Today sub-corpora were manually scanned, to get an initial impression of
the typical contexts in which they were used. A list of the top 50 collocates of
each item was then obtained automatically using a collocation program draw-
ing data from the entire Bank of English corpus. This program takes the col-
locates from a span of four words either side of the node (or key word), cal-
culates the collocational significance (using a statistical measure called t-score)
of each collocate, and outputs a list of collocates in order of statistical signifi-
cance. Since graft is polysemous, all concordances relating to senses other than
the “corruption” sense were eliminated before the collocation program was
run. Finally, all the concordances from the four British newspaper sub-corpora
were manually scanned again in order to verify the connotational impressions
and geographical references obtained from the initial manual scanning and the
collocate lists.
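The collocation step described above (a span of four words either side of the node, ranked by t-score) can be sketched as follows. The exact formula used by the Bank of English software is not given in this paper, so the code assumes one commonly used form, t = (O - E) / sqrt(O), where O is the observed co-occurrence count and E = f(node) x f(collocate) x span size / N. The function name and the toy text are likewise assumptions.

```python
import math
from collections import Counter

def tscore_collocates(tokens, node, window=4, top=50):
    """Rank the words found within `window` tokens of `node` by t-score."""
    n = len(tokens)
    freq = Counter(tokens)
    observed = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            lo, hi = max(0, i - window), min(n, i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    observed[tokens[j]] += 1
    scores = {}
    for word, o in observed.items():
        # Expected co-occurrences if the word were spread evenly over the corpus
        # (assumed formula; the span size is `window` slots on each side).
        expected = freq[node] * freq[word] * 2 * window / n
        scores[word] = (o - expected) / math.sqrt(o)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top]

# Invented toy text; a real run would use the tokenised corpus.
toy = ("allegations of corruption rocked the city . the mayor denied "
       "allegations of corruption . the weather was mild and the city "
       "was calm .").split()

for word, score in tscore_collocates(toy, "corruption", top=5):
    print(f"{word:12} {score:6.3f}")
```

Because the t-score rewards frequent co-occurrence, very common words still appear in raw output; this is consistent with the tables below listing lexical collocates only, although how grammatical words were filtered out is not stated.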
5. Results
5.1 Frequency
In the post-1985 Bank of English corpus, corruption is by far the most frequent
of the 8 lexical items, followed by sleaze, bribery, graft, malpractice(s), impropri-
ety/ies, nepotism, and cronyism. In the pre-1985 Birmingham Collection of En-
glish Texts, the frequency order was roughly similar: corruption, bribery, graft,
nepotism, impropriety, malpractice, and cronyism. Sleaze is remarkable in that it
was completely absent in the earlier corpus, but has 1,152 occurrences in the
later one (and is the second most frequent member of the set).
Table 1 shows the change in frequency of each of the lexical items between
the pre-1985 period and the post-1985 period. It gives the raw frequencies of
the items in the earlier corpus; a calculation of what their expected
frequencies would be in the later, larger corpus (if usage remained stable);
and the actual frequencies in the later corpus. The final column shows the
relationship between the actual frequency and the expected frequency in the
later corpus, expressed as the percentage by which the actual frequency
exceeds (or falls short of) the expected one.
The data shows that all but one of the items (nepotism) occurred more often
in the later corpus than the scaled pre-1985 frequencies would predict. The
size of the difference varies, though: bribery shows only a slight increase
in actual versus expected frequency (9.06%), while corruption and graft
have increased by just over 50% (the true picture for graft may, however, be
masked by the frequencies of references to surgery or hard work); impropri-
ety/ies shows an increase of just over 100%, and cronyism and malpractice(s)
have risen by over 300%. As mentioned earlier, sleaze was absent in the earlier
corpus, but is the second most frequent item in the later one.
Table 1. Diachronic comparison of corpus frequencies for the lexical items

Lexical item      Frequency in      Expected frequency    Actual frequency      Relationship between
(alphabetical     pre-1985 corpus   in post-1985 corpus   in post-1985 corpus   actual and expected
order)            (18m words)       (323m words)          (323m words)          frequency (%)
bribery           57                1,026                 1,119                 9.06%
corruption        299               5,382                 8,278                 53.8%
cronyism          1                 18                    81                    350%
graft             27                486                   740                   52%
impropriety/ies   13                234                   474                   102.5%
malpractice(s)    9                 162                   703                   333.95%
nepotism          18                324                   306                   –5.55%
sleaze            0                 0                     1,152                 N/A
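The expected frequencies and percentages in Table 1 can be reproduced with the short sketch below, which multiplies each pre-1985 frequency by 18 and expresses the gap between actual and expected frequency as a percentage of the expected one. The function and variable names are assumptions made for the example. The pre-1985 figure for corruption is taken as 299, the value implied by the expected frequency of 5,382 (5,382 / 18).

```python
SCALE = 18  # later corpus treated as roughly 18 times the earlier one (323m / 18m)

# (pre-1985 frequency, actual post-1985 frequency), as reported in Table 1
FREQS = {
    "bribery": (57, 1119),
    "corruption": (299, 8278),
    "cronyism": (1, 81),
    "graft": (27, 740),
    "impropriety/ies": (13, 474),
    "malpractice(s)": (9, 703),
    "nepotism": (18, 306),
    "sleaze": (0, 1152),
}

def expected_and_change(pre, actual, scale=SCALE):
    """Return (expected frequency, % difference of actual from expected)."""
    expected = pre * scale
    if expected == 0:
        # No baseline, so a percentage cannot be computed (sleaze: N/A).
        return expected, None
    return expected, (actual - expected) / expected * 100

for item, (pre, actual) in FREQS.items():
    expected, change = expected_and_change(pre, actual)
    shown = "N/A" if change is None else f"{change:+.2f}%"
    print(f"{item:16} expected={expected:5d} actual={actual:5d} change={shown}")
```

The printed percentages may differ from the table in the last decimal place, depending on whether values are rounded or truncated.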
5.2 Distribution
The information about distribution comes solely from the post-1985 Bank of
English corpus. Owing to constraints of space, I will not show the details of the
distribution of the 8 items across all the 17 sub-corpora. However, I can report
that all of the items were used most frequently in the media sub-corpora. This
indicates that the activities denoted by these items were of current concern in
the public domain between 1990 and 1996. Graft proved to be far more fre-
quent in the American books sub-corpus than the British books sub-corpus.
Half the citations for cronyism (the least frequent word of the set) were found
in the Economist and Australian news sub-corpora. Interestingly, sleaze (which
in my earlier study, based on the 167 million word Bank of English corpus had
been far more frequent in the American media data than in the British data)
was now found to be more frequent in the British media data. This suggests ei-
ther that concern about the subject had fallen in the US while it rose in Britain,
or that a different term was now being used in the US.
In the sub-corpora that I decided to examine in more detail, i.e. the
British newspaper sub-corpora (Guardian, Independent, Times and Today), the
raw frequencies are highest for corruption, sleaze, and bribery, and lowest for
graft and cronyism (N.B. the figure for graft refers only to citations for the
bribery/corruption sense) (see Table 2).
Table 2. Frequencies in the 1990–96 British newspaper data

Lexical item      Frequency    Lexical item      Frequency
bribery           380          impropriety/ies   180
corruption        2,555        malpractice(s)    179
cronyism          16           nepotism          123
graft             45           sleaze            862
5.3 Manual scanning of concordances
A look at a selection of concordances from the Guardian, Independent, Times
and Today gives an initial impression of the contexts in which these newspapers
typically used these words.
bribery
. . . Richard Branson’s allegations of attempted bribery
. . . question Mr Berlusconi about bribery allegations
. . . the Milan bribery and corruption scandal
. . . charges of fraud, bribery and criminal conspiracy
. . . allegations of soccer bribery and match-fixing
. . . Pakistan cleared Salim Malik of bribery charges.
. . . Mr. Claes’s involvement in a bribery scandal
corruption
. . . he uncovers fraud, bribery, malpractice and corruption
. . . Labour group in centre of corruption allegations
. . . a country rife with corruption and plagued by the Mafia
. . . the West Midlands police corruption case
. . . Ms Bhutto was ordered to stand trial on corruption charges
. . . the corruption endemic in Italian politics
. . . allegations of corruption in Malaysia’s government
. . . Italy’s corruption scandals.
cronyism
. . . Mr. Gingrich was guilty of cronyism
. . . the mingled dishonesty, favours, cronyism and ruthlessness that have
characterised his style of government
. . . allegations of cronyism from the Bush administration.
. . . casual ethics and parochial cronyism which has dogged the Clinton
administration
graft
. . . an attempt by the New York city council to create an independent
agency to monitor police graft
. . . her wealth was obtained by graft and corruption, he said.
. . . graft and illegal campaign contributions have been the lifeblood of
Italy’s political system
. . . Milan’s anti-graft magistrates
. . . South Korean graft scandals taint the entire system
. . . the wholesale graft uncovered in Italy
impropriety/ies
. . . she was worried about alleged improprieties in the [White House]
travel office.
. . . Mr Aitken and Mr Howard were cleared of any impropriety
. . . General of Oflot, also denied any impropriety in accepting free flights.
. . . allegations of financial impropriety were made against him
The report finds no evidence of impropriety in the conduct of the Matrix
Churchill prosecution.
. . . deny any hint of sexual impropriety.
. . . rumours of sexual impropriety, strongly denied,
. . . nobody is suggesting impropriety.
malpractice(s)
. . . from office for alleged electoral malpractice
Company management has strenuously denied accusations in the press of
financial malpractice.
. . . 7m awarded to Trevor in a medical malpractice suit
Obstetric accident claims account for the largest individual awards in
medical malpractice suits.
. . . the government inquiry cleared the council of malpractice.
. . . a combination of police malpractice and judicial complacency
Blair has been beset by reports of malpractice in Lambeth, Birmingham,
Hackney
. . . fraud and serious malpractices in Whitehall. . .
nepotism
Corruption, incompetence and nepotism are the hallmarks of the new-
look NHS.
. . . Monklands councillors are accused of nepotism and granting sectarian
favours.
. . . Labour town halls are centres of nepotism and malpractice
. . . Strathclyde, has been found guilty of nepotism and unfair bias
. . . harassment, discrimination or nepotism adequately protects the one
who . . .
. . . Tory nepotism, unaccountable health authorities
sleaze
. . . allegations of Tory sleaze
. . . the Nolan Report into political sleaze
. . . Labour crackdown on sleaze
. . . Madonna, the Queen of Sleaze
. . . accusations of Tory sleaze and scandal.
. . . the fundraising annual Sleaze Ball
. . . plunged the Tories into a fresh sleaze scandal
. . . sleaze scandal MP, Sir Jerry Wiggin
The concordances indicate that, at the time represented by the data (1990 to
1996), bribery, corruption, graft and sleaze were the objects of allegations and
scandals. Bribery and corruption are mentioned in connection with various
countries (e.g. Britain, Pakistan and Italy), while graft appears to be strongly
connected with Italy. Sleaze is particularly associated with British politics:
the Tories are the targets of accusations of sleaze, while Labour are the ac-
cusers. There is also reference to a fundraising Sleaze Ball; and the pop singer,
Madonna, is termed the Queen of Sleaze. Nepotism also appears to be asso-
ciated with British politics (in particular Labour town councils) and the Na-
tional Health Service (NHS). There were only 16 citations for cronyism in the
four sub-corpora under investigation, and they mostly referred to American
politics; whereas citations for impropriety and malpractice seem to refer largely
to British contexts. We see that impropriety can be of a financial or sexual na-
ture, or the precise nature may be unspecified. Similarly, malpractice can be
financial, medical or unspecified.
5.4 Semantic profiles
5.4.1 Collocational data
What follows is an analysis of the collocational data. Table 3 shows the top
lexical collocates of the 8 selected items.
Table 3. Top lexical collocates of the lexical items

Lexical item   Main lexical collocates
bribery        accusations, accused, affair, allegations, alleged, attempted,
               betting, Branson, centre, charge(s), charged, claims, conspiracy,
               convicted, corruption, cricket, crime, embezzlement, extortion,
               former, guilty, inquiry, investigation, involvement, laundering,
               match-fixing, nepotism, officials, Pakistan, police, Salim Malik,
               scandal(s), tax, theft, trial, widespread
corruption     abuse (of power), accusations, accused, allegations, alleged,
               bribery, cases, charged, charges, commission, crime, endemic,
               evidence, Fitzgerald, former, fraud, government, greed,
               incompetence, independent, inefficiency, inquiry, investigation,
               Italy, mismanagement, nepotism, official, party, police,
               political, public, rampant, rife, scandal(s), trial, widespread
cronyism       accusations, act, allegations, blatant, corruption,
               demonstration, dishonesty, favours, government, guilty, jobs for
               the boys, mingled, newspapers, patronage, politics, rise,
               ruthlessness, shady
graft          allegations, Beanland, bribery, (graft-) busting, campaign, case,
               charges, China, claims, combat, commission, corrupt, corruption,
               denies, di Pietro, (anti-graft) drive, entrenched, former, fraud,
               given, influence, investigation, jail, magistrates, Milan, money,
               new, office, official, opportunities, paid, police, political,
               South, system, years, 1993
impropriety    accusations, accused, allegations, alleged, appearance, avoid,
               Bowman, charges, claims, cleared, concerning, Cranston, denied,
               denies, dishonesty, election, evidence, financial, guilty, hint,
               illegality, investigation, involved, involving, irrationality,
               part, procedural, public, question, sacked, sexual, suggesting,
               suggestion, whatsoever
malpractice    accusations, allegations, alleged, awards, BCCI, cases, claims,
               corruption, costs, doctors, electoral, evidence, financial,
               found, fraud, incompetence, increased, inquiry, insurance,
               intimidation, lawsuits, lawyers, legal, liability, medical,
               negligence, number, physicians, police, premiums, reform,
               serious, suit, widespread
nepotism       accusations, accused, allegations, article, bias, (a) bit,
               blatant, bribery, charge(s), charged, corruption, council,
               criticism, discrimination, evidence, excessive, favouritism,
               fraud, government, guilty, inefficiency, inquiry, kind, Labour,
               law, mismanagement, misuse, Monklands, NHS, nineteenth,
               patronage, political, promotion, rife, said, sectarian,
               sectarianism, tribalism, widespread
sleaze         accusations, allegations, (Sleaze) Ball, committee, Commons,
               corruption, factor, football, government, greed, incompetence,
               inquiry, issue, Labour, Major, MP(s), new, Nolan, party,
               political, politics, public, Queen (of Sleaze), report, row,
               scandal(s), sex, Sir Jerry (Wiggin), Tories, Tory, word
5.4.2 Shared or common collocates
The lexical items above were chosen for analysis precisely because they shared
a key area of semantic overlap. This overlap is reflected by the fact that several
of the collocates are shared by most of the items. We are dealing with the field
of corruption and, unsurprisingly, all of the items except impropriety (and cor-
ruption itself) collocate with corruption. The collocational data substantiates
the evidence of the concordance lines as regards shared or common collocates:
allegations collocates with all 8 of the items (and alleged with 4 of them); ac-
cusations with 7 (and accused with 4); charge/charges/charged, political/politics,
and inquiry with 5; and claims, evidence, fraud, government, guilty, investigation,
police, and widespread collocate with 4 of the items.
5.4.3 Unique collocates
What is more interesting to examine, since we are concerned with the concept
of lexical choice and the possible ideological consequences of such choices, is
where there are areas of divergence among the collocates of each word. Such
differences are likely to highlight differences in areas of activity, people or places
with which each word is associated and may also highlight differences in the
attitude encoded by the user of the item to what is being spoken of. Table 4
shows the collocates that are unique to each item.
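The distinction between shared and unique collocates amounts to counting, for each collocate, how many of the 8 items it appears with. The sketch below illustrates this with a deliberately abbreviated excerpt of the collocate lists in Table 3 (a handful of collocates per item), so the uniqueness it reports holds only within that excerpt. The variable names are assumptions made for the example.

```python
from collections import Counter

# Abbreviated excerpt of Table 3; the full collocate lists are much longer.
COLLOCATES = {
    "bribery": {"accusations", "allegations", "cricket", "match-fixing", "police"},
    "corruption": {"accusations", "allegations", "endemic", "rampant", "police"},
    "cronyism": {"accusations", "allegations", "favours", "shady"},
    "graft": {"allegations", "magistrates", "Milan", "police"},
    "impropriety": {"accusations", "allegations", "hint", "whatsoever"},
    "malpractice": {"accusations", "allegations", "medical", "negligence", "police"},
    "nepotism": {"accusations", "allegations", "favouritism", "tribalism"},
    "sleaze": {"accusations", "allegations", "Commons", "Nolan"},
}

# Number of lexical items each collocate occurs with.
shared = Counter(c for colls in COLLOCATES.values() for c in colls)

# Collocates that occur with exactly one item (within this excerpt).
unique = {item: sorted(c for c in colls if shared[c] == 1)
          for item, colls in COLLOCATES.items()}

print(shared["allegations"], shared["accusations"], shared["police"])
for item, colls in unique.items():
    print(f"{item:12} {', '.join(colls)}")
```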
5.4.4 Domains
The collocates of a word can give an indication of which areas of life, people
and places the word is associated with. All the items in the set are linked with
politics, public office and officialdom (see Table 5).
Bribery is further linked with the fields of business and sport, collocating
with Branson (the businessman), betting, cricket, Salim Malik (a Pakistani
cricketer) and match-fixing; sleaze also shows a connection with sport,
collocating with football.
Malpractice, on the other hand, is connected more with financial, legal,
and medical institutions or practitioners, as is evidenced by collocates such as
awards, BCCI (Bank of Credit and Commerce International), claims, costs,
doctors, insurance, lawsuits, lawyers, legal, medical, physicians, premiums, suit.
5.4.5 Geographical locations
As for places, bribery features Pakistan among its most significant collocates,
while graft features China, Italy and Milan. There is a reference to a British
businessman (Branson) among the collocates of bribery, to Australian politicians
with corruption (Fitzgerald) and impropriety (Cranston), to an Australian
scandal (Beanland), to the Italian anti-corruption magistrate (di Pietro) with
graft, and to an American politician (Bowman) with impropriety. There are
numerous references to British politicians, political parties and places among the
collocates of nepotism and sleaze.

Table 4. Collocates unique to each lexical item

Lexical item   No. of unique   Unique collocates
               collocates
bribery        16              affair, attempted, betting, Branson, centre, conspiracy, convicted,
                               cricket, embezzlement, extortion, laundering, match-fixing,
                               Pakistan, Salim Malik, tax, theft
corruption     6               abuse (of power), endemic, Fitzgerald, independent, Italy,
                               rampant
cronyism       9               act, demonstration, favours, jobs for the boys, mingled,
                               newspapers, rise, ruthlessness, shady
graft          20              Beanland, (graft-)busting, campaign, China, combat, di Pietro,
                               (anti-graft) drive, entrenched, given, influence, jail, magistrates,
                               Milan, money, opportunities, paid, South, system, years, 1993
impropriety    13              appearance, avoid, Bowman, cleared, concerning, hint, illegality,
                               irrationality, part, procedural, question, sacked, whatsoever
malpractice    17              awards, BCCI, costs, doctors, found, increased, insurance,
                               intimidation, liability, medical, negligence, number, physicians,
                               premiums, reform, serious, suit
nepotism       18              article, bias, (a) bit, council, criticism, discrimination, excessive,
                               favouritism, kind, misuse, Monklands, NHS, nineteenth,
                               promotion, said, sectarian, sectarianism, tribalism
sleaze         14              (Sleaze) Ball, committee, Commons, factor, football, issue, Major,
                               MP(s), Nolan, Queen (of Sleaze), report, row, Sir Jerry (Wiggin),
                               word

Table 5. Collocates indicating domain

Lexical item   Collocates: politics, public office, officialdom
bribery        officials, police
corruption     commission, Fitzgerald (an Australian politician), government, political,
               party, police
cronyism       government, politics
graft          campaign, commission, office, police, political
impropriety    election, public
malpractice    electoral, police
nepotism       council, government, Labour, Monklands (a town council in Scotland),
               NHS, political
sleaze         committee, Commons, government, Labour, (John) Major, MP(s), Nolan,
               party, political, politics, public, Sir Jerry (Wiggin), Tory, Tories

Table 6. Collocates indicating criminal activities

Lexical item      Collocates: criminal activities
bribery           conspiracy, convicted, corruption, crime, embezzlement, extortion,
                  laundering, match-fixing, nepotism, theft, trial
corruption        abuse (of power), bribery, crime, fraud, nepotism, trial
cronyism          -
graft             bribery, corrupt, corruption, fraud, jail, magistrates
impropriety/ies   -
malpractice(s)    corruption, fraud, intimidation
nepotism          bribery, corruption, fraud
sleaze            -
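As a minimal sketch of how unique collocates such as those in Table 4 might be derived, the Python below treats each word's collocate list as a set and subtracts the union of all the other words' sets. The function name unique_collocates and the toy collocate lists are illustrative assumptions (they reuse a few collocates reported in Tables 4 to 6); they are not the Bank of English output, nor necessarily the procedure used in the study.

```python
def unique_collocates(profiles):
    """Return, for each word, the collocates found in no other word's list."""
    unique = {}
    for word, collocates in profiles.items():
        # Union of every other word's collocates.
        others = set()
        for other_word, other_collocates in profiles.items():
            if other_word != word:
                others |= set(other_collocates)
        unique[word] = sorted(set(collocates) - others)
    return unique


# Toy profiles (hypothetical subsets, for illustration only).
profiles = {
    "bribery": ["Pakistan", "extortion", "police", "corruption"],
    "graft": ["China", "Milan", "police", "corruption"],
    "sleaze": ["Nolan", "Commons", "government"],
    "cronyism": ["shady", "government"],
}

for word, items in unique_collocates(profiles).items():
    print(word, items)
```

Collocates shared by at least two words (here police, corruption and government) drop out, leaving only those that distinguish one word from the rest of the set.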
.. Connotations
The activities with which a word is associated can be highlighted by its collo-
cates.
... Criminality: bribery, corruption, graft, nepotism, (malpractice)

Scanning the collocates of the selected set of words, we see that bribery, cor-
ruption, graft, nepotism and (to a very slight degree) malpractice all show a
connection with criminal activities (see Table 6). This is particularly clear in
the profiles of bribery and corruption. Graft is similar. Nepotism is weaker. Mal-
practice has a very different general profile from the other words, but does share
the collocates corruption and fraud. It also collocates with intimidation.
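The relative strength of the association with criminality can be made concrete by counting the collocates listed for each word in Table 6. The short Python sketch below does only that. Using a raw count is a simplifying assumption of this example: it ignores collocate frequency and significance scores, which are not given in this section.

```python
# Crime-related collocates per lexical item, transcribed from Table 6.
crime_collocates = {
    "bribery": ["conspiracy", "convicted", "corruption", "crime", "embezzlement",
                "extortion", "laundering", "match-fixing", "nepotism", "theft", "trial"],
    "corruption": ["abuse (of power)", "bribery", "crime", "fraud", "nepotism", "trial"],
    "cronyism": [],
    "graft": ["bribery", "corrupt", "corruption", "fraud", "jail", "magistrates"],
    "impropriety": [],
    "malpractice": ["corruption", "fraud", "intimidation"],
    "nepotism": ["bribery", "corruption", "fraud"],
    "sleaze": [],
}

# Number of crime-related collocates for each word, largest first.
counts = {word: len(items) for word, items in crime_collocates.items()}
ranked = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)
for word, n in ranked:
    print(f"{word:12} {n}")
```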
... Bad professional practice: corruption, malpractice, nepotism, sleaze

As can be seen from the data shown in Table 7, corruption has an extra dimension
to that outlined above, in that some of its collocates do not indicate
criminal activities but bad professional practice: incompetence, inefficiency,
mismanagement. These words also occur among the collocates of other items in the
set. Malpractice additionally collocates with negligence, reflecting the fact that
malpractice is often used in medical contexts.

Table 7. Collocates indicating bad professional practice

Lexical item      Collocates: bad professional practice
bribery           -
corruption        incompetence, inefficiency, mismanagement
cronyism          -
graft             -
impropriety/ies   -
malpractice(s)    incompetence, negligence
nepotism          inefficiency, mismanagement
sleaze            incompetence

Table 8. Collocates indicating dishonest and discriminatory practice

Lexical item      Collocates: dishonest/discriminatory practice
bribery           -
corruption        -
cronyism          dishonesty, favours, jobs for the boys, patronage, ruthlessness
graft             -
impropriety/ies   -
malpractice(s)    -
nepotism          bias, discrimination, favouritism, misuse, patronage,
                  sectarianism, tribalism
sleaze            -
... Greed: corruption, sleaze

Corruption furthermore collocates with greed, indicating a tendency to covet
money. Greed is also a collocate of sleaze.
... Dishonest and discriminatory practice: nepotism, cronyism

Nepotism and cronyism have several collocates which are peculiar to them and
are a reflection of their semantic meaning (see Table 8). These collocates de-
note dishonest and discriminatory practice. Similar semantically but with a
greater negative connotation are sectarianism and tribalism, which collocate
with nepotism. Ruthlessness, which has a particularly negative connotation, collocates
with cronyism. The data for cronyism is skewed, however, owing to the
fact that there are relatively few occurrences of it in the corpus and one concordance
line, containing the words dishonesty, favours and ruthlessness, is repeated
four times.
... Sex: impropriety, sleaze

Impropriety and sleaze are alone in having an association with sex. Impropri-
ety collocates with sexual, and sleaze with sex. Apart from its association with
elections, money and sex, impropriety shares relatively few collocates with the
other items. There are no collocates suggesting crime, except for the very non-
specific term illegality, and the only collocates suggesting bad practice are dis-
honesty (which it shares with cronyism) and irrationality. Illegality, dishonesty
and irrationality denote states of affairs rather than specifying actions, and are
therefore rather vague terms.
... Negative speaker attitudes

Finally there are a number of collocates which encode speaker attitude. These
generally have negative connotations. Rampant collocates with corruption, as
does rife, which also collocates with nepotism. Blatant collocates with nepo-
tism, shady with cronyism, and entrenched with graft. Serious collocates with
malpractice, and widespread collocates with bribery, corruption, malpractice and
nepotism. These collocates do not have such an obvious negative connotation as
those in previous sections, but do indicate concern about the degree or extent
of a problem.
... Most negative connotations: bribery, corruption, graft, nepotism

The semantic profiles of the words bribery, corruption, graft and nepotism ap-
pear to have particularly negative connotations. All are associated with crim-
inal activities, and corruption, graft and nepotism also have collocates which
indicate a particularly negative speaker attitude. The negative connotations of
nepotism are reinforced by the group of collocates indicating dishonesty and
discrimination. Cronyism, like nepotism, displays the aspect of dishonest and
discriminatory dealings and a negative speaker attitude, but lacks collocates
indicating crime.
... Least negative connotations: malpractice, impropriety, sleaze

Corruption was found to have an aspect of its meaning associated with bad
professional practice. Malpractice similarly covers this area but has a less pro-
nounced negative connotation, having far fewer collocates denoting crime and
none indicating a strongly negative speaker attitude. Impropriety and sleaze
can be seen to have the least negative connotations of all the words in the set.
The semantic area that they share with corruption is that denoting bad professional
practice, and neither impropriety nor sleaze collocates with words denoting
crimes, nor does either have collocates that indicate a negative speaker attitude.
Furthermore, the semantic profile of impropriety proved to be made up in part
of words which were non-specific in the actions they denoted.
.. Geographical references

Table 9 lists the thirteen countries that were referred to most frequently in the
data, presented in descending order of their overall frequency, and shows, as a
percentage, the proportion of the concordances for each word which referred
to a given country.
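The percentages in Table 9 are proportions of concordance lines. Assuming each concordance line has been hand-coded with the country it refers to, the calculation could be sketched in Python as follows. The function name location_shares and the toy list of coded lines are hypothetical and are not the study's data.

```python
from collections import Counter


def location_shares(coded_lines):
    """Map each location to its percentage of the coded concordance lines."""
    if not coded_lines:
        return {}
    counts = Counter(coded_lines)
    total = len(coded_lines)
    return {place: round(100 * n / total, 1) for place, n in counts.items()}


# Hypothetical coding of ten concordance lines for one word.
graft_lines = ["Italy", "Italy", "Italy", "US", "China",
               "UK", "Italy", "US", "S. Korea", "Other"]

shares = location_shares(graft_lines)
for place, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{place:10} {pct:5.1f}%")
```

Because each word's percentages are computed against that word's own total, the columns are comparable even though the words differ greatly in raw frequency.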
Overall, events in the UK are referred to most often. This is only to be expected,
since the newspapers consulted are British. However, the ranking of the
foreign references is striking when compared with the results of a study carried
out by Galasinski and Marley (1998) into the foreign pages of the British
and Polish press. They found that in the foreign pages of the British press, six of
the G7 countries (the US, UK, France, Germany, Japan and Italy) are referred
to most frequently, followed by Russia and other Eastern European countries,
newer members of the EU, and a handful of Third World countries. The pro-
portion of coverage given to each of the G7 countries identified in their data is
as follows:
1. US 28%
2. UK 16.5%
3. France 7.6%
4. Germany 5.8%
5. Japan 4.5%
6. Italy 4.1%
(Galasinski & Marley 1998: 569)
My data represents national as well as foreign pages. That would account for
the UK being mentioned more often than any other country. What is significant,
however, is that Italy ranks second, above the US, with Pakistan fourth,
above France. Furthermore, in my data Italy accounts for 11.5% of the citations
of bribery, 11.5% of the citations of corruption and 30.8% of the citations
of graft. These figures are particularly striking given that Galasinski and Marley
(1998) found that only 4.1% of coverage in the foreign pages of the British
press is devoted to Italy. Similarly, the 17.8% of the citations of bribery that refer
to Pakistan is notable, as is the fact that a number of Third World countries,
China, South Korea, India and Malaysia, rank more highly than Galasinski and
Marley’s (ibid.) data would lead one to expect. As predicted, the majority of
the citations of sleaze refer to British contexts, although a small proportion
are used in references to other countries. A relatively small proportion refer to
U.S. contexts.

Table 9. Geographical references
Locations are listed in order of frequency of references. Each cell gives the % of
citations of the lexical item that refer to the location in the left-hand column.

Location        bribery  corruption  cronyism  graft  impropriety  malpractice  nepotism  sleaze
1. U.K.            28.7        23.8       6.3    5.8         76.8         72.4      53.7    79.0
2. Italy           11.5        11.5       0.0   30.8          1.5          1.5       1.6     0.9
3. U.S.             2.6         4.0      50.0   13.5          8.1          4.5       1.6     3.9
4. Pakistan        17.8        22.3       0.0    0.0          1.0          0.8       0.0     0.0
5. France           3.7         6.0       0.0    5.8          0.5           .0       0.8     1.6
6. China            2.4         4.8       0.0    7.8          0.0          1.5       0.8     0.0
7. S. Korea         3.1         3.3       0.0    5.8          0.0          0.8       1.6     0.0
8. Germany          2.4         1.8       0.0    1.9          0.0          0.0       0.0     0.2
9. India            1.6         2.3       0.0    0.0          0.0          0.0       0.8     0.0
10. Malaysia        4.5         0.5       0.0    0.0          0.0          0.8       0.0     0.0
11. Belgium         2.4         1.3       0.0    0.0          0.0          0.0       0.0     0.0
12. Spain           0.3         1.8       0.0    0.0          0.0          1.5       0.0     1.2
13. Japan           1.0         0.8       0.0    0.0          0.5          0.8       0.0     0.0
Others              5.6        24.8      43.7   32.1          3.5          4.4      11.3     1.4
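One way to make the comparison with Galasinski and Marley's figures explicit is to divide a country's share of citations by its share of general foreign-page coverage. The Python sketch below does this for Italy using the percentages quoted above. The ratio is an illustrative measure introduced here, not one reported in the study, and it sets aside the fact that the two datasets were sampled differently (my data includes national as well as foreign pages).

```python
# Share of general foreign-page coverage devoted to Italy
# (Galasinski & Marley 1998: 569).
BASELINE_ITALY = 4.1

# Share of citations referring to Italy, from Table 9.
italy_share = {"bribery": 11.5, "corruption": 11.5, "graft": 30.8, "sleaze": 0.9}


def over_representation(share, baseline):
    """Ratio of observed share to baseline share (1.0 means parity)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return share / baseline


ratios = {word: over_representation(pct, BASELINE_ITALY)
          for word, pct in italy_share.items()}
for word, ratio in ratios.items():
    print(f"{word:11} {ratio:.1f}x the baseline")
```

A ratio well above 1 for bribery, corruption and graft, alongside a ratio below 1 for sleaze, is the pattern the discussion goes on to interpret.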
As was established in 3.4.6.7 and 3.4.6.8 above, bribery, corruption and graft
were found to have very negative semantic profiles. A choice of sleaze or (fi-
nancial) impropriety can be made in their place and it is important to ask why
that choice is almost never taken up when events in Italy are written about
and never taken up to describe events in Pakistan, China, South Korea, India
and Malaysia. One might ask too why graft is used to describe events in the
USA (13.5% of the citations of graft) more often than it is applied to British
contexts (5.8%).
The one word with a particularly negative semantic profile which is used
more in British contexts (53.7%) than elsewhere is nepotism. By contrast,
50% of the citations of its (less negative) synonym cronyism are applied to
US contexts.
. Discussion
Analysis of the data revealed a greater-than-expected increase since 1985 in the
frequency of all the words in the set (with the exception of nepotism) across the
whole of the Bank of English. Sleaze has grown exponentially in its frequency
and may have entered British English usage from American English. It was ab-
sent from the pre-1985 corpus, was evident in the 167 million word corpus
but was more common in the American media data than British data, and by
1996 was more frequent in British data than American media data. The distri-
bution of the words across the 17 sub-corpora showed that all the words were
most frequent in the media sub-corpora. The raw frequencies in the Guardian,
Independent, Times and Today sub-corpora of all the words (except cronyism
and graft) are particularly high, indicating that the phenomena of corruption,
malpractice, sleaze etc. were of particular concern in the British press at the
time. According to Galtung and Ruge (1973: 66), in their analysis of criteria that
make an item newsworthy, events in geographically or culturally close coun-
tries or in elite nations (e.g. the US) are more likely to be reported than events
elsewhere. This criterion can be overridden though, if events taking place in
distant or non-elite nations are deemed relevant to the readership because they
are similar to, or more extreme examples of, events that are of current concern
at home. This may account for the fact that, in my data, the number of refer-
ences to corruption in countries such as Italy, Pakistan, South Korea, India and
Malaysia is disproportionate to the amount of coverage these countries usually
receive in the international pages of the British press.
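A "greater-than-expected" increase presupposes some expected value. One common approach, assumed here rather than taken from the study, is to scale a word's frequency in the earlier corpus by the ratio of the two corpus sizes. In the Python sketch below the function name and all the numbers are hypothetical placeholders, not figures from the Bank of English.

```python
def expected_frequency(old_freq, old_corpus_size, new_corpus_size):
    """Frequency expected in the newer corpus if relative usage were unchanged."""
    if old_corpus_size <= 0:
        raise ValueError("old_corpus_size must be positive")
    return old_freq * new_corpus_size / old_corpus_size


# Hypothetical figures, for illustration only.
old_freq = 50              # occurrences in the earlier corpus
old_size = 20_000_000      # tokens in the earlier corpus
new_size = 300_000_000     # tokens in the later corpus
actual = 2_400             # occurrences observed in the later corpus

expected = expected_frequency(old_freq, old_size, new_size)
print(f"expected {expected:.0f}, actual {actual}, ratio {actual / expected:.2f}")
```

For a word that is absent from the earlier corpus, as sleaze was, the expected value under this assumption is zero, so the ratio is undefined and any later occurrence counts as an increase.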
The analysis also showed an increase since 1985 in the number of words
available to talk about corruption: cronyism and sleaze were found to be recent
coinages and graft (in the sense of bribery and corruption) was seen to be en-
tering British English from American English. This may point to what Fowler
(1991: 84) terms relexicalisation, that is the coining of a new term to imply that
a new phenomenon is being denoted, and overlexicalisation, ‘the existence of
an excess of “quasi synonymous” terms to talk about entities and ideas that are
a particular problem or concern within a culture’s discourse’. If so, there are
two factors which might account for these phenomena occurring at the time
represented by the data.
Firstly, there was growing dissatisfaction with the Conservative govern-
ment which had been in power for well over a decade. It is now widely ac-
knowledged that accusations of ‘Tory sleaze’ formed a large part of the Labour
party’s arsenal in attacking the then government. The recently coined term
sleaze was useful to exploit in that, by denoting both financial and sexual mis-
conduct, misdemeanours of a diverse nature (some serious, some trivial) could
be lumped together, thus intensifying the notion that the government was un-
trustworthy. Also, use of a non-specific term such as sleaze could protect an ac-
cuser from litigation. These factors might account for the increase in frequency
of the term in British English.
Secondly, Britain in the 1980s had undergone, and was still undergoing,
massive structural readjustment. There was a move towards deregulation and
greater private ownership of previously public assets. There was (and still is)
public concern over the full implications of such changes. This concern is
evident in some of the data that was examined, for example:
. . . the government’s reforms of the NHS have opened up a whole new
world of possibilities for corruption
. . . checking the National Health Service for fraud and corruption and
ensuring that public money for patients is put to proper use since the
internal market was introduced.
Compared with the earlier study I made into corruption and sleaze (using the
167 million word corpus), where all but two instances of sleaze referred to
British or US contexts and almost all of the citations of corruption referred
to Southern European or Third World countries, the data in this study did not
show quite such a marked split. The more clearly negatively connotative bribery
and corruption (and to a lesser extent graft) are seen to be chosen to describe ac-
tions in Britain as well as sleaze. This could be evidence of a shifting awareness
in Britain, an awareness that corruption does not only happen abroad. Indeed
at the time a public inquiry, the Nolan inquiry, had been set up to investigate
standards in public life. Examples from the British news sub-corpora illustrate
this awareness:
. . . bribery of an MP should be a criminal offence
When such a system operates overseas, Tory MPs call it corruption.
Nolan cannot be used as a carpet under which the endemic corruption in
our political system is swept.
. . . until recently people did not believe MPs were involved in graft
The fact that most citations for nepotism were applied to British contexts might
be further evidence of this shifting awareness. To cite another corpus example:
There was a time when nepotism and fleecing the public purse were asso-
ciated with Third World countries, while our government and civil service
were held up as models of rectitude. Alas, now we have fallen.
The decision to choose to use the word sleaze or impropriety or bribery, corrup-
tion or graft in a given context can thus be seen to reflect an ideological stance.
Where no shift in attitude among the British press was apparent is in
its tendency to choose words with greater negative associations to talk about
events abroad. Sleaze and its near-synonym impropriety are very infrequent
choices when Italy is written about, and are not chosen at all to talk about
Pakistan, China, South Korea, India or Malaysia. Even the word malpractice,
which was seen to have a greater negative semantic profile than sleaze and im-
propriety but a lesser one than bribery, corruption and graft, was found not to
be the preferred choice for events in these countries.
This might well have the effect of reinforcing existing stereotypes. As Fair-
clough (1995: 12) makes clear, the media are instrumental in reproducing ide-
ology precisely by representing different groups in certain ways. Above all,
. . . if particular lexical and grammatical choices are regularly made, and if peo-
ple and things are repeatedly talked about in certain ways, then it is plausible
that this will affect how they are thought about. (Stubbs 1996: 92)
The data drawn on for this study dates back to the first half of the 1990s. Lan-
guage use changes over time and it is likely that, if one were to conduct a sim-
ilar study based on more contemporary data, one might find further changes
in the frequency and distribution of the words examined. Intuitively, it seems
likely that the relative frequency of sleaze has diminished, and probable that
occurrences of Tory sleaze have vastly decreased. Perhaps one would find that
the word cronyism has all but completely taken the semantic space occupied by
nepotism and that Labour is one of its top collocates, along with Tony’s + cronies.
. Conclusion
This paper has attempted to respond to Stubbs’s (1997) suggestion of using
corpus linguistics methodology to bolster a critical linguistic analysis. By using
corpus methodology, I have been able to deal with a very large body of
data: 800 texts. A qualitative analysis was carried out, identifying the contexts
in which individual words were used, but the usage in these contexts was com-
pared against the norms of language use as reflected by the 323 million word
Bank of English. Use of the Bank of English also made it possible to identify
the different shades of meaning among the members of a set of closely related
words, and helped to highlight areas of semantic overlap and areas of divergence.
I am not arguing for corpus methodology to replace qualitative analysis
but to complement it, so that assertions made in CDA can be backed up by
reliable, empirical evidence.
Acknowledgements
I would like to thank Ramesh Krishnamurthy for his comments on an earlier
draft of this paper.
References
Alexander, R. J. (1999). Ecological Commitment in Business: A computer-corpus-based
critical discourse analysis. In J. Verschueren (Ed.), Language and Ideology: Selected
papers from the 6th International Pragmatics Conference (Vol. 1) (pp. 14–24). Antwerp:
International Pragmatics Association.
Bayley, P. (1999). Lexis in British Parliamentary Debate: Collocation patterns. In J.
Verschueren (Ed.), Language and Ideology: Selected papers from the 6th International
Pragmatics Conference (Vol. 1) (pp. 43–55). Antwerp: International Pragmatics
Association.
Caldas-Coulthard, C. R. (1993). From discourse analysis to Critical Discourse Analysis:
The differential representation of women and men speaking in written news. In J.
McH. Sinclair, M. Hoey & G. Fox (Eds.), Techniques of Description: Spoken and Written
Discourse (pp. 196–208). London: Routledge.
Fairclough, N. (1989). Language and Power. London: Longman.
Fairclough, N. (1992). Discourse and Social Change. Cambridge: Polity.
Fairclough, N. (1995). Media Discourse. London: Edward Arnold.
Fairclough, N. (2000). New Labour, New Language? London: Routledge.
Fowler, R. (1991). Language in the News: Discourse and ideology in the press. London:
Routledge.
Fowler, R. (1996). On critical linguistics. In C. R. Caldas-Coulthard & M. Coulthard
(Eds.), Texts and Practices: Readings in Critical Discourse Analysis (pp. 3–14). London:
Routledge.
Galasinski, D. & Marley, C. (1998). Agency in foreign news: A linguistic complement of a
content analytical study. Journal of Pragmatics, 30, 565–587.
Galtung, J. & Ruge, M. H. (1973). Structuring and selecting news. In S. Cohen & J. Young
(Eds.), The Manufacture of News: Social problems, deviance and the mass media (pp.
67–72). London: Constable.
Halliday, M. A. K. & Hasan, R. (1976). Cohesion in English. London: Longman.
Hardt-Mautner, G. (1995). Only connect. Critical discourse analysis and corpus linguistics.
University of Lancaster. Online, available at:
http://www.comp.lancs.ac.uk/computing/research/ucrel/tech_papers.html.
Jeffries, L. (2003). Not a drop to drink: Emerging meanings in local newspaper reporting of
the 1995 water crisis in Yorkshire. Text, 23 (4), 513–538.
Krishnamurthy, R. (1996). Ethnic, racial and tribal: The language of racism? In C. R. Caldas-Coulthard
& M. Coulthard (Eds.), Texts and Practices: Readings in Critical Discourse
Analysis (pp. 129–149). London: Routledge.
Leech, G. N. (1974). Semantics. Harmondsworth: Penguin.
Louw, B. (1993). Irony in the text or insincerity in the writer? The diagnostic potential
of semantic prosodies. In M. Baker, G. Francis, & E. Tognini-Bonelli (Eds.), Text and
Technology (pp. 157–176). Amsterdam/Philadelphia: John Benjamins.
Sharrock, W. W. & Anderson, D. C. (1981). Language, thought and reality again. Sociology,
15, 287–293.
Sinclair, J. McH. (1991). Corpus, concordance, collocation. Oxford: Oxford University Press.
Stubbs, M. (1992). Institutional Linguistics: Language and institutions, linguistics and
sociology. In M. Putz (Ed.), Thirty Years of Linguistic Evolution. Amsterdam/
Philadelphia: John Benjamins.
Stubbs, M. (1996). Text and Corpus Analysis. Oxford: Blackwell.
Stubbs, M. (1997). Whorf ’s Children: Critical comments on Critical Discourse Analysis
(CDA). In A. Ryan & A. Wray (Eds.), Evolving models of language (pp. 110–116).
Clevedon: BAAL in association with Multilingual Matters.
Stubbs, M. & Gerbig, A. (1993). Human and Inhuman Geography: On the computer-
assisted analysis of long texts. In M. Hoey (Ed.), Data, Description, Discourse (pp.
64–85). London: HarperCollins.
van Dijk, T. A. (1997). Discourse as Social Interaction. London: Sage.
Widdowson, H. (1995a). Discourse analysis: A critical view. Language and Literature, 4 (3),
157–172.
Widdowson, H. (1995b). Review of Fairclough Discourse and Social Change. Applied
Linguistics, 16 (4), 510–516.
Widdowson, H. (1996). Discourse and interpretation: Conjectures and refutations [Reply to
Fairclough, 1996]. Language and Literature, 5 (1), 57–69.
Wodak, R. (1996). Disorders of Discourse. London: Sage.
