Fuzzy Math

Fuzzy Math: Source Diversity Not Apparent in Newspaper and
Wire Service Computer-Assisted Reporting Stories in 2001
Paper for JOUR 7320 Mass Media and Diversity

Dr. Cynthia Bond Hopson, University of Memphis
By Pierce Presley
April 18, 2005
CONTENTS
Acknowledgements
Abstract
Introduction
Literature Review
Method
Results
Conclusion
Appendices
    Appendix 1: Codebook
    Appendix 2: Coding Form
    Appendix 3: Coding Results
Bibliography
ACKNOWLEDGEMENTS
The author would like to thank:
Professors Bob Thomas, Lisa Martin, and Liz Scott of Loyola
University New Orleans for inspiring him and setting him on
this path;
the Veterans Administration, its Vocational Rehabilitation
program and Jerry Foshee of the VA for their support during
this leg of the journey;
the staff of Investigative Reporters and Editors, and its
National Institute for Computer-Assisted Reporting for data
it might well have been impossible to collect otherwise;
and, finally and most of all, Kelly, Laura and Kane for
their support and patience as he has, slowly at times,
trudged along it to this point.
ABSTRACT
A content analysis of computer-assisted reporting stories
published in American newspapers or wire services in 2001
showed that there is scant direct evidence of the
ethnicity, disability or poverty, little evidence of the
race or age, and ample evidence of the sex of persons
mentioned in those stories. The evidence shows that men
appear three times as often as women, but there is little
difference between the rates at which men and women simply
appear and those at which they are quoted or paraphrased. This
study finds that there is little internal evidence of
diversity within these stories and that further pursuit of
this avenue of inquiry is likely fruitless.
INTRODUCTION
Technology has driven change in newspapers since the
first recognized newspapers appeared in the 17th century,
when not only the printing press but regular mail service
was instrumental in starting the industry.1 One of the major
drivers of change in the latter half of the 20th century
was the availability of inexpensive personal computers and
of data in electronic form. Though a computer and attendant
software for $10,000, as mentioned in a 1992 article, does
not seem inexpensive today, at the time it represented
computing power that was not only several orders of
magnitude greater than was available to anyone 25 years
before but quite simply unavailable to anyone outside of
government as little as a decade prior.2 This movement of
computing power from centralized locations and cloistered
institutions to where work was done gave reporters, among
others, more power to do independent analysis of data,
freeing them from relying on official interpretations and
the obfuscation that often ensued. The opening of the data
to public consumption, thanks in large part to freedom of
information and sunshine acts at the federal and state
levels, meant there was something to aim that computing
power toward that could serve the public welfare.

1 Michael Emery, Edwin Emery, and Nancy L. Roberts, The press
and America: an interpretive history of the mass media, 9th
ed. (Needham Heights, Mass.: Allyn & Bacon, 2000), 7.
2 George Landau, “Quantum leaps: computer journalism takes
off,” Columbia Journalism Review 31, no. 1 (May-June 1992):
62.

Melisma Cox argues persuasively that the development of
computer-assisted reporting (CAR) was driven by
individuals’ initiative, whether the product was produced
for an established news outlet or on a freelance basis, but
this does not mean that those outlets have not since
embraced the reporting techniques.3 Currently, the CAR
story is a staple of newspapers in America and around the
world, and its practitioners are often among the most
experienced and best-trained reporters in a newsroom.4 They
have the support not only of generalist professional
organizations, such as the Society of Professional
Journalists and Radio Television News Directors
Association, but also of the specialized Investigative
Reporters and Editors and its National Institute for
Computer-Assisted Reporting. Evidence of the spread of CAR
can also be found in journalism schools, where at least two
of the leading institutions are now adding CAR to their
required studies.5

3 Melisma Cox, “The Development of Computer-Assisted
Reporting” (Paper presented as part of the Southern
Colloquium of the Association for Education in Journalism
and Mass Communications, Chapel Hill, N.C., 17-18 March
2000).
4 Cecilia Friend, “Computerized reports and newspapers;
computer-assisted records analysis becoming a staple of
papers,” Editor and Publisher 125, no. 52 (26 December
1992): 62; Bruce Garrison, “Tools daily newspapers use in
computer-assisted reporting,” Newspaper Research Journal
17, no. 1-2 (winter-spring 1996): 113; Lucenda Davenport,
Fred Fico, and Mary Detwiler, “How Michigan dailies use
computers to gather news,” Newspaper Research Journal 22,
no. 3 (summer 2001): 45; and Louis Rom, “Pushing numbers: an
increasing number of newsrooms use technology to develop
stories: CAR advocates call for even more,” The Quill 91,
no. 2 (March 2003): 11.
5 Mike Reilley, “Stalled C.A.R.: computer-assisted
reporting: journalism is still way behind,” Columbia
Journalism Review 41, no. 5 (January-February 2003): 62.

This experience, training and support place most CAR
practitioners at the forefront of the profession at large,
and their work is often an exemplar of the profession’s
best. Thus that body of work would seem to make an
excellent subject for an examination of how newspaper
journalists select persons within the stories they write
and how they choose whom to quote or paraphrase, as well as
how they choose to convey that information. What has become
clear, however, is that this body of work is opaque so far
as the diversity of the people within its boundaries goes.

This study was conceived as an initial investigation
into the state of CAR stories vis-à-vis diversity and to
determine whether they would be suitable for further
diversity research. The aim was not to look at the analyses
behind the stories, but at the writers’ selection of
people, both within those analyses and outside them, to
speak for and about the issues covered. While the
resemblance between the data underpinning the stories and
the persons included within them seems an interesting
avenue of research, it was not the focus of this study and
would likely need a far greater commitment of time and
resources, not to mention considerable access to
unpublished work, than the current study.

The scope, while likely too limited for exhaustive study,
should be useful for determining whether larger-scale
studies would be of use. The results seem to indicate that
research in other directions or of other datasets would be
more fruitful than that reported here.

LITERATURE REVIEW
Academic research into journalism is far less prevalent
than research into mass media generally. Research into the
practice of CAR has fallen off since 2000, but there are
still important items to note.6 CAR has moved from being the
sole province of a cabal of specialist reporters to
becoming more and more that of the beat reporter, a
movement driven in part by the specialists themselves and
IRE/NICAR.7 The number of CAR stories available from
IRE/NICAR for the year 2003 is more than double that of
2001.8
Investigations into diversity within news reports have
focused on general newspaper items, which are produced by
reporters with a wide range of experience and training, or
on the broadcast media where such things as sex, race, and
ethnicity seem easier to determine, though this is easily
shown to be fraught with peril in a multicultural society
like America’s.

6 Louis Rom, 12.
7 IRE Conferences and NICAR CAR Conferences have featured
beat-focused panels and classes in the last few years in a
stated effort to expand CAR’s use in the newsroom.
8 The author is pursuing separate research on a dataset from
2003.

Newsrooms do not currently reflect the diversity of
their communities well and have difficulty in retaining
members of diverse populations within their ranks.9
A study of sample size in newspaper research indicated
that a broader view of single years is preferable to a
shallow view of many years when testing the tenor of
coverage.10

9 American Society of Newspaper Editors, Newsroom Employment
Census http://www.asne.org/index.cfm?id=1138 (accessed 6
April 2005); and Pamela T. Newkirk, “Guess who’s leaving
the newsrooms,” Columbia Journalism Review 39, no. 3
(September 2003): 36.
10 Stephen Lacy and others, “Sample size for newspaper
content analysis in multi-year studies,” Journalism and
Mass Communication Quarterly 78, no. 4 (winter 2001): 842.
While that study addressed multi-year research, its
findings were judged applicable to the choice, in
preliminary research, between using a single year and
using multiple years.

METHOD
This study looked at CAR stories published in 2001 as
identified and supplied by the IRE/NICAR Resource Center, a
publicly available repository of investigative stories,
including entries for IRE awards.11
The study used content analysis to attempt to identify
the sex, ethnicity, race, age group, disability status and
income status of persons within the stories.12 The analysis
relied on direct indications of status, such as a stated
age or description, or on fairly unambiguous indications,
such as a first name within the top 100 American names for
one sex and not in the top 1,000 for the other.13 Sex was
the only characteristic for which such indirect
identification was possible.
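
The first-name rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in the coding: the function name, its parameters and the tiny ranked lists are hypothetical placeholders, and a real application would load the ranked names from the Census name files cited in the text.

```python
from typing import Sequence


def infer_sex_from_first_name(
    first_name: str,
    male_ranked: Sequence[str],
    female_ranked: Sequence[str],
    top_n: int = 100,
    exclude_n: int = 1000,
) -> str:
    """Return "male", "female" or "unknown" for a first name.

    A name counts for one sex only if it is within that sex's
    top_n names and absent from the other sex's top exclude_n.
    Both lists must be ordered from most to least common.
    """
    name = first_name.strip().upper()
    male_top = {n.upper() for n in male_ranked[:top_n]}
    female_top = {n.upper() for n in female_ranked[:top_n]}
    male_wide = {n.upper() for n in male_ranked[:exclude_n]}
    female_wide = {n.upper() for n in female_ranked[:exclude_n]}

    if name in male_top and name not in female_wide:
        return "male"
    if name in female_top and name not in male_wide:
        return "female"
    return "unknown"


# Hypothetical placeholder lists, NOT the actual Census rankings.
TOY_MALE = ["JAMES", "JOHN", "MATTHEW", "KELLY"]
TOY_FEMALE = ["MARY", "PATRICIA", "KELLY", "LAURA"]

for candidate in ("Matthew", "Kelly", "Laura"):
    print(candidate, infer_sex_from_first_name(candidate, TOY_MALE, TOY_FEMALE))
```

With these placeholder lists, a name that appears on both lists falls through to "unknown", which mirrors the conservative intent of the rule: an ambiguous name is left uncoded rather than guessed.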
Persons in the stories were further separated into
prominent persons, i.e. those quoted or paraphrased
directly (neither through hearsay nor through previous
writings or utterances), and persons mentioned, a category
that included the entirety of the former. Infographics,
commentary and photo captions were not included in the
textual analysis, though obvious visual clues in related
photographs were used to determine some part of a person’s
status.14

11 Investigative Reporters and Editors, IRE Resource Center
http://www.ire.org/resourcecenter/ (accessed 16 March 2005).
12 U.S. Census Bureau, 2000 Census of Population and Housing.
U.S. Department of Commerce. Washington, 2000.
13 U.S. Census Bureau, 1990 Census Name Files
http://www.census.gov/genealogy/names/names_files.html
(accessed 15 March 2005).

Textual analysis was done on a per-story basis, so that
the identification of a person as being of a certain status
in one story did not influence their identification in
another. The author did the entirety of the analysis to
ensure continuity of evaluations and to limit expenditures.
A codebook and coding form were created and are included
in Appendices 1 and 2 herein.15
This research attempted to answer the research question,
“Do CAR stories have enough internal evidence to allow
further analysis into the composition of persons within
these stories?”

14 The Charlotte (N.C.) Observer’s “Death at the track”
project included mug shots and capsules about all those
killed at tracks between 1990 and 2001. While this
represented a tremendous effort on the paper’s part, and
great journalism besides, including these data would have
had the effect of moving the entire universe of the
Observer’s dataset into this study’s. Because the purpose
of this study was to look at those persons highlighted
within stories rather than all those included in the CAR
analyses, that inclusion was rejected.
15 Guidance for creating these documents came from Kimberly
A. Neuendorf’s The Content Analysis Guidebook (Los Angeles:
Sage, 2001) and Ruth A. Palmquist’s Content Analysis
http://www.gslis.utexas.edu/~palmquis/courses/content.html
(accessed 17 March 2005).

RESULTS
The dataset included 10 projects, each defined as a series
of related stories covering a central topic, comprising 45
stories in total. Projects contained 18, 11, seven, two, or
a single story, and appeared in medium to large newspapers
or wire services (one project identified as student work
appeared on a statewide news service). All but two stories
were bylined, and a total of 18 authors were credited.
The main result observed was that the sex of persons
within the stories can be readily determined using the
coding methods used, but almost no other information is
available unless there is an accompanying photograph, and
that can be considered dubious evidence in several
categories.
About 92 percent of persons could be identified by sex.
Men were in both the mentioned and prominent categories
almost three times as often as women.
The age of persons in the story also could be determined
more than half the time, owing to intermittent references
to a person’s age, but far less often than sex.
Ethnicity and race could be determined less than 10
percent of the time, and it is likely that without The
(Nashville) Tennessean’s stories on the death penalty’s
effects on blacks in that state, both would have been far
lower.
Black and white were the only identified races. There
were some persons whose names indicated they were likely
Asian, but difficulties, including transliteration between
languages, were deemed sufficient to disallow that
determination. No Native Americans or Alaska Natives,
Pacific Islanders, or persons of more than one race were
mentioned.
Disabled persons likewise would likely not have been
present save for The Charlotte (N.C.) Observer’s series on
car racing deaths, and even then the evidence was often
ambiguous as to a person’s status at the time of the story.
Poverty was unmentioned entirely save for a person’s
self-characterization as “poor,” and this was determined to
be too weak to merit the inclusion of that person within
that category. Whether this was because few of the stories
dealt with issues directly related to poverty or because
there was a bias against the use of such persons (or
against identifying them as such) is a question the data
are wholly unable to answer.
The data collected did indicate that men and women had
similar ratios between the prominent and mentioned groups,
which might be taken as a sign of even-handed treatment
within the two groups. This should be tempered by men’s
three-to-one advantage, however, which is an inversion of
women’s slight advantage in the population as a whole.
Further analysis of the data collected was deemed
unlikely to be fruitful and was not performed.
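
The comparison of prominent and mentioned groups described above amounts to a simple per-sex ratio. The following is a minimal sketch of that arithmetic, assuming each coded person is reduced to a (sex, is_prominent) pair; the function name and the sample records are hypothetical placeholders chosen only to show the calculation, and they are not the study's data.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def prominence_ratios(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Share of mentioned persons of each sex who are also prominent.

    Every record counts as mentioned, because the mentioned
    category includes the prominent one.
    """
    mentioned: Counter = Counter()
    prominent: Counter = Counter()
    for sex, is_prominent in records:
        mentioned[sex] += 1
        if is_prominent:
            prominent[sex] += 1
    return {sex: prominent[sex] / mentioned[sex] for sex in mentioned}


# Hypothetical placeholder records, NOT the study's coding results.
toy_records = [
    ("male", True), ("male", False), ("male", True),
    ("male", False), ("male", True), ("male", False),
    ("female", True), ("female", False),
]

ratios = prominence_ratios(toy_records)
print(ratios)
```

In this placeholder set the two sexes have the same prominence ratio even though men outnumber women three to one, which is the pattern the paragraph above cautions against reading as balance.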

CONCLUSION
How well newspapers and other media represent people
from diverse backgrounds and in diverse categories may
determine whether the medium survives far into the 21st
century as it fights circulation losses and slashed
newsroom budgets.
It is entirely possible that the dearth of information
about persons included in the news report is a part of the
perception that mainstream newspapers do not cover or care
about non-mainstream communities. Even when there are
internal efforts to ensure better reflection and balance,
e.g. the USA TODAY diversity index, it may be that a lack
of identification is equated with ignoring that segment of
the community, rightly or wrongly.16 One thing is certain:
these data show that there is little information about
people’s identities in CAR stories from 2001. Whether this
lack of identification is due to self-censorship and
whether that would be good for journalism are important
questions that lie beyond the scope of this research.

16 Phil Meyer and Paul Overberg, “Updating the USA TODAY
Diversity Index,” CARstat: Tools
http://www.unc.edu/~pmeyer/carstat/tool.html (accessed 18
March 2005).

APPENDICES
APPENDIX 1: CODEBOOK
DIVERSITY ANALYSIS CODEBOOK
Units of Analysis:
A story is a single newspaper item, including jumps and
    sidebars, that has a headline and consists of text. A
    story is referred to by its headline or by its first
    line of text.
A person is an individual human either referred to
    directly, paraphrased or quoted within the material.
    Writings by a person are not references to a person
    (e.g. court opinions). A person is referred to by his or
    her name, role or description, in descending order of
    preference.
Metadata:
A project is a connected series of stories that share a
    common topic. A project is referred to by its name or
    the headline of its first story.
A publication is a newspaper or wire service. A publication
    is referred to by its name.
A topic is the main subject matter of a project.
Data:
A person’s sex is either male, female or unknown. Sex is
    determined by the use of references such as mother,
    daughter, sister, or of citations such as “she said.”
    The mere pairing of a person with another person of an
    identified sex shall not be used to identify the first
    person’s sex. Sex can also be identified if the person’s
    first name is among the U.S. Census Bureau’s top 100
    names for a sex without being in the top 1,000 names for
    the opposite sex, i.e. Matthew is appropriate, Kelly is
    not. A person’s appearance in an accompanying photograph
    can also be used to determine his or her sex.
A person’s ethnicity is either Hispanic, non-Hispanic or
    unknown. It is determined either through his or her
    identification as a Hispanic or reasonably clear
    evidence in an accompanying photograph.
A person’s race is white, black, native (including American
    Indians and Alaskan Natives), Asian, Pacific Islander
    (including Hawaiian) or more than one of the preceding.
    It is determined by direct reference or clear
    photographic evidence. Persons not residing in the
    United States will be identified as unknown.
A person’s age is his or her chronological age at the time
    of the story’s writing. It is determined by direct
    reference to a known point in time only. A person is
    considered young if they are less than 25 years of age
    and old if they are more than 64 years of age.
A person’s disability status is determined by direct
    reference only.
A person’s poverty status (“poor”) is determined by direct
    reference only.
Persons not classified as young, old, disabled or poor are
    classified as other-unknown.
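
The age thresholds in the codebook can be expressed as a small function. This is a minimal sketch covering only the age portion of the coding, assuming ages are recorded as whole years and that a missing age is passed as None; the function and constant names are hypothetical and do not come from the coding form.

```python
from typing import Optional

YOUNG_BELOW = 25  # codebook: young if less than 25 years of age
OLD_ABOVE = 64    # codebook: old if more than 64 years of age


def code_age_group(age: Optional[int]) -> str:
    """Map a directly stated age to the codebook's age category.

    An age that is missing, or that is neither young nor old,
    is coded "other-unknown" on this dimension.
    """
    if age is None:
        return "other-unknown"
    if age < YOUNG_BELOW:
        return "young"
    if age > OLD_ABOVE:
        return "old"
    return "other-unknown"


for stated_age in (24, 25, 64, 65, None):
    print(stated_age, code_age_group(stated_age))
```

The boundary cases matter here: under the codebook's wording, a 25-year-old is not young and a 64-year-old is not old.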
APPENDIX 2: CODING FORM
APPENDIX 3: CODING RESULTS
BIBLIOGRAPHY
American Society of Newspaper Editors. Newsroom Employment
    Census. http://www.asne.org/index.cfm?id=1138 (accessed
    6 April 2005).
Cox, Melisma. “The Development of Computer-Assisted
    Reporting.” Paper presented as part of the Southern
    Colloquium of the Association for Education in
    Journalism and Mass Communications, Chapel Hill, N.C.,
    17-18 March 2000.
Davenport, Lucenda, Fred Fico, and Mary Detwiler. “How
    Michigan dailies use computers to gather news.”
    Newspaper Research Journal 22, no. 3 (summer 2001): 45.
Emery, Michael, Edwin Emery, and Nancy L. Roberts. The
    press and America: an interpretive history of the mass
    media, 9th ed. Needham Heights, Mass.: Allyn & Bacon,
    2000.
Friend, Cecilia. “Computerized reports and newspapers;
    computer-assisted records analysis becoming a staple of
    papers.” Editor and Publisher 125, no. 52 (26 December
    1992): 62.
Garrison, Bruce. “Tools daily newspapers use in
    computer-assisted reporting.” Newspaper Research
    Journal 17, no. 1-2 (winter-spring 1996): 113.
Investigative Reporters and Editors. IRE Resource Center.
    http://www.ire.org/resourcecenter/ (accessed 16 March
    2005).
Lacy, Stephen, Daniel Riffe, Staci Stoddard, Hugh Martin,
    and Kuang-Kuo Chang. “Sample size for newspaper content
    analysis in multi-year studies.” Journalism and Mass
    Communication Quarterly 78, no. 4 (winter 2001): 842.
Landau, George. “Quantum leaps: computer journalism takes
    off.” Columbia Journalism Review 31, no. 1 (May-June
    1992): 62.
Meyer, Phil and Paul Overberg. “Updating the USA TODAY
    Diversity Index.” CARstat: Tools.
    http://www.unc.edu/~pmeyer/carstat/tool.html (accessed
    18 March 2005).
Neuendorf, Kimberly A. The Content Analysis Guidebook. Los
    Angeles: Sage, 2001.
Newkirk, Pamela T. “Guess who’s leaving the newsrooms.”
    Columbia Journalism Review 39, no. 3 (September 2003):
    36.
Palmquist, Ruth A. Content Analysis.
    http://www.gslis.utexas.edu/~palmquis/courses/content.html
    (accessed 17 March 2005).
Reilley, Mike. “Stalled C.A.R.: computer-assisted
    reporting: journalism is still way behind.” Columbia
    Journalism Review 41, no. 5 (January-February 2003): 62.
Rom, Louis. “Pushing numbers: an increasing number of
    newsrooms use technology to develop stories: CAR
    advocates call for even more.” The Quill 91, no. 2 (March
    2003): 11.
U.S. Census Bureau. 2000 Census of Population and Housing.
    U.S. Department of Commerce. Washington, 2000.
U.S. Census Bureau. 1990 Census Name Files.
    http://www.census.gov/genealogy/names/names_files.html
    (accessed 15 March 2005).