Fuzzy Math


Fuzzy Math: Source Diversity Not Apparent in Newspaper and

Wire Service Computer-Assisted Reporting Stories in 2001

Paper for JOUR 7320 Mass Media and Diversity


Dr. Cynthia Bond Hopson, University of Memphis

By Pierce Presley
April 18, 2005
CONTENTS

Acknowledgements

Abstract

Introduction

Literature Review

Method

Results

Conclusion

Appendices

Appendix 1: Codebook

Appendix 2: Coding Form

Appendix 3: Coding Results

Bibliography

ACKNOWLEDGEMENTS

The author would like to thank:

Professors Bob Thomas, Lisa Martin, and Liz Scott of Loyola
University New Orleans for inspiring him and setting him on
this path;

the Veterans Administration, its Vocational Rehabilitation
program and Jerry Foshee of the VA for their support during
this leg of the journey;

the staff of Investigative Reporters and Editors, and its
National Institute for Computer-Assisted Reporting for data
it might well have been impossible to collect otherwise;

and, finally and most of all, Kelly, Laura and Kane for
their support and patience as he has, slowly at times,
trudged along it to this point.

ABSTRACT

A content analysis of computer-assisted reporting stories

published in American newspapers or wire services in 2001

showed that there is scant direct evidence of the

ethnicity, disability or poverty, little evidence of the

race or age, and ample evidence of the sex of persons

mentioned in those stories. The evidence shows that men

appear three times as often as women, but there is little

difference between the rates at which men and women simply appear

and those at which they are quoted or paraphrased. This

study finds that there is little internal evidence of

diversity within these stories and that further pursuit of

this avenue of inquiry is likely fruitless.

INTRODUCTION

Technology has driven change in newspapers since the

first recognized newspapers appeared in the 17th Century,

when not only the printing press but regular mail service

was instrumental in starting the industry.1 One of the major

drivers of change in the latter half of the 20th Century

was the availability of inexpensive personal computers and

of data in electronic form. Though the idea of a computer

and attendant software for $10,000, as mentioned in a 1992

article, does not seem inexpensive today, at the time it

represented computing power that was not only several orders

of magnitude higher than was available to anyone 25 years

before but also quite simply not available to anyone outside

of government as little as a decade prior.2 This movement of

computing power from centralized locations and cloistered

institutions to where work was done gave reporters, among

others, more power to do independent analysis of data,

freeing them from relying on official interpretations and

the obfuscation that often ensued. The opening of the data

to public consumption, thanks in large part to freedom of

information and sunshine acts at the federal and state


1
Michael Emery, Edwin Emery, and Nancy L. Roberts, The press
and America: an interpretive history of the mass media 9th
ed. (Needham Heights, Mass.: Allyn & Bacon, 2000), 7.
2
George Landau, “Quantum leaps: computer journalism takes
off,” Columbia Journalism Review 31, no. 1 (May-June 1992):
62.


levels, meant there was something to aim that computing

power toward that could serve the public welfare.

Melisma Cox argues persuasively that the development of

CAR was driven by individuals’ initiative, whether the

product was produced for an established news outlet or on a

freelance basis, but this does not mean that those outlets

have not since embraced the reporting techniques.3

Currently, the CAR story is a staple of newspapers in

America and around the world, and its practitioners are

often among the most experienced and best trained reporters

in a newsroom.4 They have the support not only of generalist

professional organizations, such as the Society of

Professional Journalists and the Radio-Television News

Directors Association, but also of the specialized Investigative

Reporters and Editors and its National Institute for

3
Melisma Cox, “The Development of Computer-Assisted
Reporting” (Paper presented as part of the Southern
Colloquium of the Association for Education in Journalism
and Mass Communications, Chapel Hill, N.C., 17-18 March
2000).
4
Cecilia Friend, “Computerized reports and newspapers;
computer-assisted records analysis becoming a staple of
papers,” Editor and Publisher 125, no. 52 (26 December
1992): 62; Bruce Garrison, “Tools daily newspapers use in
computer-assisted reporting,” Newspaper Research Journal
17, no. 1-2 (winter-spring 1996): 113; Lucenda Davenport,
Fred Fico, and Mary Detwiler, “How Michigan dailies use
computers to gather news,” Newspaper Research Journal 22,
no. 3 (summer 2001): 45; and Louis Rom, “Pushing numbers: an
increasing number of newsrooms use technology to develop
stories: CAR advocates call for even more,” The Quill 91,
no. 2 (March 2003): 11.

Computer-Assisted Reporting. Evidence of the spread of CAR

can also be found in journalism schools, where at least two

of the leading institutions are now adding CAR to their

required studies.5

This experience, training and support places most CAR

practitioners at the forefront of the profession at large,

and their work is often an exemplar of the profession’s

best. Thus that body of work would seem to make an

excellent subject for an examination of how newspaper

journalists select persons within the stories they write

and how they choose whom to quote or paraphrase, as well as

how they choose to convey that information. What has become

clear, however, is that this body of work is opaque so far

as the diversity of the people within its boundaries goes.

This study was conceived as an initial investigation

into the state of CAR stories vis-à-vis diversity and to

determine if they would be applicable to further diversity

research. The aim was to look not at the analyses behind

the stories but at the writers’ selection of people, both

within those analyses and outside them, to speak for and

about the issues covered. While the resemblance between the data

underpinning the stories and the persons included within

5
Mike Reilley, “Stalled C.A.R.: computer-assisted
reporting: journalism is still way behind,” Columbia
Journalism Review 41, no. 5 (January-February 2003): 62.

them seems an interesting avenue of research, it was not

the focus of this study and would likely need a far greater

commitment of time and resources, not to mention

considerable access to unpublished work, than the current

study.

The scope, while likely too limited for exhaustive study,

should be useful for determining whether larger scale

studies would be of use. The results seem to indicate that

research in other directions or of other datasets would be

more fruitful than that reported here.


LITERATURE REVIEW

Academic research into journalism is far less prevalent

than research into mass media generally. Research into the

practice of CAR has fallen off since 2000, but there are

still important items to note.6 CAR has moved from being the

sole province of a cabal of specialist reporters to

becoming more and more that of the beat reporter, a

movement driven in part by the specialists themselves and

IRE/NICAR.7 The number of CAR stories available from

IRE/NICAR for the year 2003 is more than double that of

2001.8

Investigations into diversity within news reports have

focused on general newspaper items, which are produced by

reporters with a wide range of experience and training, or

on the broadcast media where such things as sex, race, and

ethnicity seem easier to determine, though this is easily

shown to be fraught with peril in a multicultural society

like America’s.

6
Louis Rom, 12.
7
IRE Conferences and NICAR CAR Conferences have featured
beat-focused panels and classes in the last few years in a
stated effort to expand CAR’s use in the newsroom.
8
The author is pursuing separate research on a dataset from
2003.


Newsrooms do not currently reflect the diversity of

their communities well and have difficulty in retaining

members of diverse populations within their ranks.9

A study of sample size in newspaper research indicated

that a broader view of single years is preferable to a

shallow view of many years when testing the tenor of

coverage.10

9
American Society of Newspaper Editors, Newsroom Employment
Census http://www.asne.org/index.cfm?id=1138 (accessed 6
April 2005); and Pamela T. Newkirk, “Guess who’s leaving
the newsrooms,” Columbia Journalism Review 39, no. 3
(September 2003): 36.
10
Stephen Lacy and others, “Sample size for newspaper
content analysis in multi-year studies,” Journalism and
Mass Communication Quarterly 78, no. 4 (winter 2001): 842.
While this study looked at multi-year research, it was felt
that it could be applied to the choice in preliminary
research of using a single or multiple years.
METHOD

This study looked at CAR stories published in 2001 as

identified and supplied by the IRE/NICAR Resource Center, a

publicly available repository of investigative stories,

including entries for IRE awards.11

The study used content analysis to attempt to identify

the sex, ethnicity, race, age group, disability status and

income status of persons within the stories.12 The analysis

relied on direct indication of status, such as a stated age

or a description, or on fairly unambiguous indications, such

as a first name that ranks within the top 100 American names

for one sex but not in the top 1,000 for the other.13 Sex was the only

characteristic for which this was possible.
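
The first-name rule can be sketched in code. The Python below is an illustration only, not the procedure actually used (the coding in this study was done by hand). The function name, the rank tables and every rank in them are assumptions made for the example; the ranks are not actual Census figures.

```python
from typing import Mapping, Optional

TOP_SAME_SEX = 100     # name must rank within the top 100 for one sex
TOP_OTHER_SEX = 1000   # and must not appear in the top 1,000 for the other


def infer_sex_from_first_name(
    first_name: str,
    male_ranks: Mapping[str, int],
    female_ranks: Mapping[str, int],
) -> Optional[str]:
    """Return "male", "female", or None when the name is too ambiguous."""
    name = first_name.strip().upper()
    m = male_ranks.get(name)    # None means the name is not in the male list
    f = female_ranks.get(name)  # None means the name is not in the female list

    male_ok = m is not None and m <= TOP_SAME_SEX and (f is None or f > TOP_OTHER_SEX)
    female_ok = f is not None and f <= TOP_SAME_SEX and (m is None or m > TOP_OTHER_SEX)

    if male_ok:
        return "male"
    if female_ok:
        return "female"
    return None


# Hypothetical rank tables for illustration; NOT real Census ranks.
MALE = {"MATTHEW": 20, "KELLY": 300}
FEMALE = {"KELLY": 60, "MARY": 1}

matthew = infer_sex_from_first_name("Matthew", MALE, FEMALE)
kelly = infer_sex_from_first_name("Kelly", MALE, FEMALE)
```

Under these made-up ranks, "Matthew" resolves to male, while "Kelly" stays undetermined because it appears in both lists. Returning no answer for an ambiguous name keeps the rule conservative, in line with the study's reliance on fairly unambiguous indications.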

Persons in the stories were further separated into

prominent persons, i.e. those quoted or paraphrased

directly (neither through hearsay nor by previous writings

nor utterances), and persons mentioned, which included the

entirety of the previous category. Infographics, commentary

and photo captions were not included in the textual

analysis, though obvious visual clues in related

11
Investigative Reporters and Editors, IRE Resource Center
http://www.ire.org/resourcecenter/ (accessed 16 March 2005).
12
U.S. Census Bureau, 2000 Census of Population and Housing.
U.S. Department of Commerce. Washington, 2000.
13
U.S. Census Bureau, 1990 Census Name Files
http://www.census.gov/genealogy/names/names_files.html
(accessed 15 March 2005).


photographs were used to determine some part of a person’s

status.14
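
The relationship between the two categories (every prominent person is also a mentioned person) can be sketched as a simple tally. This Python is a minimal illustration under stated assumptions: the record type, the field names and the sample records are hypothetical and are not the study's data.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple


class CodedPerson(NamedTuple):
    sex: str         # "male", "female" or "unknown"
    prominent: bool  # True if quoted or paraphrased directly


def tally_by_sex(persons: Iterable[CodedPerson]) -> Dict[str, dict]:
    """Count mentioned and prominent persons per sex.

    Every person counts as mentioned; prominent persons are a subset.
    """
    mentioned: Counter = Counter()
    prominent: Counter = Counter()
    for p in persons:
        mentioned[p.sex] += 1
        if p.prominent:
            prominent[p.sex] += 1
    return {
        sex: {
            "mentioned": mentioned[sex],
            "prominent": prominent[sex],
            # Share of mentioned persons of this sex who were also prominent.
            "prominent_share": prominent[sex] / mentioned[sex],
        }
        for sex in mentioned
    }


# Made-up records for illustration only; not drawn from the dataset.
sample = [
    CodedPerson("male", True),
    CodedPerson("male", False),
    CodedPerson("female", True),
    CodedPerson("unknown", False),
]
result = tally_by_sex(sample)
```

Comparing the prominent share across sexes is one way to express how evenly each group moves from merely appearing to being quoted or paraphrased.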

Textual analysis was done on a per-story basis, so that

the identification of a person as being of a certain status

in one story did not influence their identification in

another. The author did the entirety of the analysis to

ensure continuity of evaluations and to limit expenditures.

A codebook and coding form were created and are included

in Appendices 1 and 2 herein.15

This research attempted to answer the research question,

“Do CAR stories have enough internal evidence to allow

further analysis into the composition of persons within

these stories?”

14
The Charlotte (N.C.) Observer’s “Death at the track”
project included mug shots and capsules about all those
killed at tracks between 1990 and 2001; while this
represented a tremendous effort on the paper’s part, and
great journalism besides, it was decided that the inclusion
of this data into this study would have the effect of
moving the entire universe of the Observer’s dataset into
this study’s, and as the purpose of this study was to look
at those persons highlighted within stories rather than all
those included in the CAR analyses, this was rejected.
15
Guidance for creating these documents came from Kimberly
A. Neuendorf’s The Content Analysis Guidebook (Los Angeles:
Sage, 2001) and Ruth A. Palmquist’s Content Analysis
http://www.gslis.utexas.edu/~palmquis/courses/content.html
(accessed 17 March 2005).
RESULTS

The dataset included 10 projects, each defined as a series of

related stories covering a central topic, which together

comprised 45 stories. Projects contained 18, 11,

seven, two, or a single story, and appeared in medium to

large newspapers or wire services (one project identified

as student work appeared on a statewide news service). All

but two stories were bylined, and a total of 18 authors are

credited.

The main result observed was that the sex of persons

within the stories can be readily determined using the

coding methods used, but almost no other information is

available unless there is an accompanying photograph, and

that can be considered dubious evidence in several

categories.

About 92 percent of persons could be identified by sex.

Men were in both the mentioned and prominent categories

almost three times as often as women.

The age of persons in the story also could be determined

more than half the time, owing to intermittent references

to a person’s age, but far less often than sex.

Ethnicity and race could be determined less than 10

percent of the time, and it is likely that without The

(Nashville) Tennessean’s stories on the death penalty’s


effects on blacks in that state, both would have been far

lower.

Black and white were the only identified races. There

were some persons whose names indicated they were likely

Asian, but difficulties, including transliteration between

languages, were deemed sufficient to disallow that

determination. No Native Americans or Alaska Natives,

Pacific Islanders, or persons of more than one race were

mentioned.

Disabled persons, too, would likely not have been

present save for The Charlotte (N.C.) Observer’s series on

car racing deaths, and even then the evidence was often

ambiguous as to a person’s status at the time of the story.

Poverty was unmentioned entirely save for a person’s

self-characterization as “poor,” and this was determined to

be too weak to merit the inclusion of that person within

that category. Whether this was because few of the stories

dealt with issues directly related to poverty or because there

was a bias against the use of such persons (or against

identifying them as such) is a question the data are

wholly unable to answer.

The data collected did indicate that men and women had

similar ratios between the prominent and mentioned groups,

which might be taken as a sign of even-handed treatment



within the two groups. This should be tempered by men’s

three-to-one advantage, however, which is an inversion of

women’s slight advantage in the population as a whole.

Further analysis of the data collected was deemed

unlikely to be fruitful and was not performed.


CONCLUSION

How well newspapers and other media represent people

from diverse backgrounds and in diverse categories may

determine whether the medium survives far into the 21st

century as it fights circulation losses and slashed

newsroom budgets.

It is entirely possible that the dearth of information

about persons included in the news report is a part of the

perception that mainstream newspapers do not cover or care

about non-mainstream communities. Even when there are

internal efforts to ensure better reflection and balance,

e.g. the USA TODAY diversity index, it may be that a lack

of identification is equated with ignoring that segment of

the community, rightly or wrongly.16 One thing is certain:

these data show that there is little information about

people’s identities in CAR stories from 2001. Whether this

lack of identification is due to self-censorship and

whether that would be good for journalism are important

questions that lie beyond the scope of this research.

16
Phil Meyer and Paul Overberg, “Updating the USA TODAY
Diversity Index,” CARstat: Tools
http://www.unc.edu/~pmeyer/carstat/tool.html (accessed 18
March 2005).


APPENDICES


APPENDIX 1: CODEBOOK

DIVERSITY ANALYSIS CODEBOOK

Units of Analysis:

A story is a single newspaper item, including jumps and
sidebars, that has a headline and consists of text. A
story is referred to by its headline or by its first
line of text.

A person is an individual human either referred to
directly, paraphrased or quoted within the material.
Writings by a person are not references to a person
(e.g. court opinions). A person is referred to by his or
her name, role or description, in descending order of
preference.

Metadata:

A project is a connected series of stories that share a
common topic. A project is referred to by its name or
the headline of its first story.

A publication is a newspaper or wire service. A publication
is referred to by its name.

A topic is the main subject matter of a project.

Data:

A person’s sex is either male, female or unknown. Sex is
determined by the use of references such as mother,
daughter, sister, or of citations such as “she said.”
The mere pairing of a person with another person of an
identified sex shall not be used to identify the first
person’s sex. Sex can also be identified if the person’s
first name is among the U.S. Census Bureau’s top 100
names for a sex without being in the top 1,000 names for
the opposite sex, e.g. Matthew qualifies but Kelly does
not. A person’s appearance in an accompanying photograph
can also be used to determine his or her sex.

A person’s ethnicity is either Hispanic, non-Hispanic or
unknown. It is determined either through his or her
identification as a Hispanic or reasonably clear
evidence in an accompanying photograph.


A person’s race is white, black, native (including American
Indians and Alaska Natives), Asian, Pacific Islander
(including Hawaiian) or more than one of the preceding.
It is determined by direct reference or clear
photographic evidence. Persons not residing in the
United States will be identified as unknown.

A person’s age is his or her chronological age at the time
of the story’s writing. It is determined by direct
reference to a known point in time only. A person is
considered young if they are less than 25 years of age
and old if they are more than 64 years of age.

A person’s disability status is determined by direct
reference only.

A person’s poverty status (“poor”) is determined by direct
reference only.

Persons not classified as young, old, disabled or poor are
classified as other-unknown.
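
The status rules above can be sketched as a small record type. This Python is a hedged illustration only: the class, field and function names are assumptions made for the example and do not reproduce the coding form. Only the age thresholds and the "other-unknown" fallback come from the codebook.

```python
from dataclasses import dataclass
from typing import List, Optional

YOUNG_BELOW = 25  # "young" means less than 25 years of age
OLD_ABOVE = 64    # "old" means more than 64 years of age


def age_group(age: Optional[int]) -> Optional[str]:
    """Return "young", "old", or None when neither applies or no age is stated."""
    if age is None:
        return None
    if age < YOUNG_BELOW:
        return "young"
    if age > OLD_ABOVE:
        return "old"
    return None


@dataclass
class PersonRecord:
    """One coded person; every field defaults to the 'no evidence' value."""
    sex: str = "unknown"
    ethnicity: str = "unknown"
    race: str = "unknown"
    age: Optional[int] = None   # set only on direct reference
    disabled: bool = False      # set only on direct reference
    poor: bool = False          # set only on direct reference

    def status(self) -> List[str]:
        """List the status labels that apply, or ["other-unknown"] if none do."""
        labels: List[str] = []
        group = age_group(self.age)
        if group is not None:
            labels.append(group)
        if self.disabled:
            labels.append("disabled")
        if self.poor:
            labels.append("poor")
        return labels or ["other-unknown"]


no_evidence = PersonRecord().status()
older_disabled = PersonRecord(age=70, disabled=True).status()
```

Defaulting every field to its "no evidence" value mirrors the codebook's insistence on direct reference: a status is recorded only when the story supplies it.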


APPENDIX 2: CODING FORM


APPENDIX 3: CODING RESULTS

BIBLIOGRAPHY

American Society of Newspaper Editors. Newsroom Employment
Census. http://www.asne.org/index.cfm?id=1138 (accessed
6 April 2005).

Cox, Melisma. “The Development of Computer-Assisted
Reporting.” Paper presented as part of the Southern
Colloquium of the Association for Education in
Journalism and Mass Communications, Chapel Hill, N.C.,
17-18 March 2000.

Davenport, Lucenda, Fred Fico, and Mary Detwiler. “How
Michigan dailies use computers to gather news.”
Newspaper Research Journal 22, no. 3 (summer 2001): 45.

Emery, Michael, Edwin Emery, and Nancy L. Roberts. The
press and America: an interpretive history of the mass
media, 9th ed. Needham Heights, Mass.: Allyn & Bacon,
2000.

Friend, Cecilia. “Computerized reports and newspapers;
computer-assisted records analysis becoming a staple of
papers.” Editor and Publisher 125, no. 52 (26 December
1992): 62.

Garrison, Bruce. “Tools daily newspapers use in
computer-assisted reporting.” Newspaper Research Journal
17, no. 1-2 (winter-spring 1996): 113.

Investigative Reporters and Editors. IRE Resource Center.
http://www.ire.org/resourcecenter/ (accessed 16 March
2005).

Lacy, Stephen, Daniel Riffe, Staci Stoddard, Hugh Martin,
and Kuang-Kuo Chang. “Sample size for newspaper content
analysis in multi-year studies.” Journalism and Mass
Communication Quarterly 78, no. 4 (winter 2001): 842.

Landau, George. “Quantum leaps: computer journalism takes
off.” Columbia Journalism Review 31, no. 1 (May-June
1992): 62.

Meyer, Phil and Paul Overberg. “Updating the USA TODAY
Diversity Index.” CARstat: Tools.
http://www.unc.edu/~pmeyer/carstat/tool.html (accessed
18 March 2005).

Neuendorf, Kimberly A. The Content Analysis Guidebook. Los
Angeles: Sage, 2001.

Newkirk, Pamela T. “Guess who’s leaving the newsrooms.”
Columbia Journalism Review 39, no. 3 (September 2003):
36.

Palmquist, Ruth A. Content Analysis.
http://www.gslis.utexas.edu/~palmquis/courses/content.html
(accessed 17 March 2005).

Reilley, Mike. “Stalled C.A.R.: computer-assisted
reporting: journalism is still way behind.” Columbia
Journalism Review 41, no. 5 (January-February 2003): 62.

Rom, Louis. “Pushing numbers: an increasing number of
newsrooms use technology to develop stories: CAR
advocates call for even more.” The Quill 91, no. 2 (March
2003): 11.

U.S. Census Bureau. 2000 Census of Population and Housing.
U.S. Department of Commerce. Washington, 2000.

U.S. Census Bureau. 1990 Census Name Files.
http://www.census.gov/genealogy/names/names_files.html
(accessed 15 March 2005).
