Do Open-Access Articles Have a Greater Research Impact?

Kristin Antelman

Although many authors believe that their work has a greater research impact if it is freely available, studies to demonstrate that impact are few. This study looks at articles in four disciplines at varying stages of adoption of open access (philosophy, political science, electrical and electronic engineering and mathematics) to see whether they have a greater impact as measured by citations in the ISI Web of Science database when their authors make them freely available on the Internet. The finding is that, across all four disciplines, freely available articles do have a greater research impact. Shedding light on this category of open access reveals that scholars in diverse disciplines are adopting open-access practices and being rewarded for it.

Kristin Antelman is Associate Director for Information Technology at North Carolina State University Libraries; e-mail:

There is currently an explosion of interest in the academic and publishing communities about the promise (and possible perils) of open-access scholarship and publishing. Although the term "open access" is somewhat fluid, included under its banner are the so-called "two roads to open access": open-access journals and e-print (i.e., preprints or postprints) repositories, both of which make the full text of scholarly articles freely available to everyone on the open Internet.1 Although debate swirls around questions of copyright, peer review, and publishing costs, individual authors are taking action in this arena by posting their articles to personal or institutional Web pages and to disciplinary repositories.

Since Steve Lawrence circulated his study of the impact of free online availability of computer science conference documents under the catchy title "Online or Invisible," the notion that freely available papers have a greater research impact has taken hold. (Lawrence's preprint itself was evidence of this phenomenon: the version published in the journal Nature [under a different title] was almost an afterthought; more than seven hundred documents in Google reference "Online or Invisible."2) Despite the limited domain of the Lawrence study, making it difficult to make assumptions about other disciplines based on his findings, it is now common to see the assertion that research impact is increased by open access. For example, Eugene Garfield casually noted in a recent post to the American Scientist Open Access Forum listserv that it has been demonstrated that on line access improves both readership and citation
impact.3 Although this statement may be intuitively believable, the evidence to document it is still being collected. In addition to the Lawrence study, small studies of the research impact of e-prints have been done for physics, chemistry, and a subdiscipline of computer science.4-6 Preliminary results from a larger study, which will look at articles from seven thousand journals from the ISI Web of Science database, indicate a significantly increased research impact for open-access articles in physics.7 In related research, a study performed using the Citebase data found that the more often a paper is downloaded, the more likely it is to be cited (with the strongest correlation for high-citation papers and authors).8

In April 2004, ISI released a study titled "The Impact of Open Access Journals," in which it compared impact factor and the number of citations of open-access journals in the natural sciences with non-open-access journals. ISI found that the OA journals have a broadly similar citation pattern to other journals, but may have a slight tendency to earlier citations.9 However, it qualifies the findings by noting that many of the journals in the study only recently shifted to open access, that high-profile titles (such as PLoS Biology) were too new to be included, and that their relatively small sample included many regional titles that would not be expected to be high-impact journals. It also should be noted that this study was of journals, not articles; it is open to question whether these two groups of journals are sufficiently comparable with one another in measures other than their access model to produce a meaningful gauge of the impact of open access. Given that we are at an early stage in the evolution of open-access journals, comparing the research impact of articles taken from the same issues of the same journals is a more solid methodology for measuring the impact of open access.

A common author perspective is reflected in a reported conversation with an electrophysiologist in the journal Science: "Free online papers are likely to reach more readers," he figures, "and therefore attract more citations."10 Backing this up is a survey of authors (primarily in the life and medical sciences) sponsored by the Joint Information Systems Committee (JISC), which found that the two principal reasons for publishing in an open-access journal were belief in the principle of free access for all readers, followed by "I perceive OA journals to have faster publication times." The next two most important reasons were related to perception of research impact: "I perceive the readership to be larger" and "I think my article will be more frequently cited."11 One quantifiable measure of authors' belief that free access is valuable to them is their adoption of an open-access option initiated by the Entomological Society of America and now offered by a number of publishers. After offering open access to individual articles for a (low) fee, the society found that by the second year more than half of the authors elected to purchase it.12

There also is more indirect evidence of a link between free online availability and impact. Studies have shown that authors, as consumers of research information, rely heavily on browsing online journals and articles.13 Brody and colleagues have used data from the CiteSeer repository to demonstrate that the peak of citations occurs higher and sooner for papers deposited in each succeeding year.14 Moreover, research has been done on the tangential question of whether online journals receive greater use than journals available only in print. That research, undertaken at a point when only a portion of journals were available in so-called combination online and print editions, did not make the distinction between access to a licensed online version versus a freely available online version.15

College & Research Libraries September 2004

University libraries contemplating providing support for open-access initiatives, such as institutional repositories or open-access journals, face several key challenges. Librarians must be able to draw on a sophisticated understanding of the scholarly communication practices of individual disciplines even as they are rapidly evolving, including scholars' use of prepublication research material not traditionally part of the domain of libraries in a print environment. If we choose to implement institutional repositories, we also must be able to persuade faculty, many of whom are for a variety of reasons quite reluctant, to contribute their prime research output. Data showing that freely available articles in their discipline are more likely to be cited is powerful evidence of the value of repositories as well as other open-access channels.

The Research Question

This study's hypothesis is that scholarly articles from disciplines with varying rates of open-access adoption have a greater research impact if the articles are freely available online than if they are not. To determine whether a difference in impact exists, the mean citation rates, as recorded in the ISI Web of Science database, of freely available articles (1) were compared with those that are not (0) for a sample population of journal articles in four disciplines. The null hypothesis is that there is no difference between the mean citation rates: $H_0$: $d_1 = d_0$; $H_1$: $H_0$ is false.

These data provide a snapshot in time of open-access adoption in a few disparate disciplines. Within Christine L. Borgman's framework describing the intersection of scholarly communication and bibliometrics, this study examines the impact and potential of networked information on artifacts of scholarly communication (published articles). Two of Borgman's four bibliometric research question types are addressed, namely, characterizing larger patterns of behavior in scholarly communities and evaluating measures of influence of scholarly contributions.16

This study does not attempt to address the variable of publisher policies regarding posting of pre- or postprints of articles from their journals, or the question of why authors choose to post or not to post e-prints.

Methodology

The four disciplines (mathematics, electrical and electronic engineering, political science, and philosophy) were chosen with the expectation that they would represent different points on the continuum of open-access adoption. They also were selected as disciplines whose scholars have a tradition of active use of preprints. Ten leading journals in each discipline, as defined by ISI's Journal Citation Reports (JCR) for 2002, were selected (except philosophy, where there is no JCR).17 High-impact journals were selected as indicators of leading researcher behavior while making no assumptions about journal quality. Open-access articles published in lower-impact journals may, in fact, have a greater relative research impact because they are not so widely available to authors through personal and institutional subscriptions.

ISI Web of Science citation data were used as a measure of research impact. Although citations cannot in themselves be said to measure research impact, nevertheless, citedness as measured by ISI is a measure that is commonly relied on as a surrogate for such impact.18,19 Use of this inexact proxy for research impact is less of a concern in this study because citedness in this literature is itself viewed by scholars as an objective, at least in the
sciences and social sciences, and an actual assessment of research impact is not of interest but, rather, the effect of open access on one traditional and frequently used measure of research impact.

Articles from 2001 and 2002 were selected as the population from which to draw the sample. Philosophy was the exception, where 1999 and 2000 were selected because of the lower level of citation of philosophy articles.

Sample

A systematic presample, which included a minimum of fifty open-access and fifty restricted articles, was taken for each of the disciplines to estimate the expected frequency and variance of open access in each population in order to calculate the necessary sample sizes. (See table 1.) Data collection for political science and electrical and electronic engineering began at the midpoint of the population (the first 2002 issue) and expanded equally into the second half of 2001 and first half of 2002 until the sample target for open-access articles was reached. For mathematics and philosophy, where the frequency of the restricted and open-access articles, respectively, was low, data were taken for the entire population. A potential bias introduced, namely, that some journals have more articles per issue than others, is not a concern because it is unlikely that an author who publishes a given article in a higher-page-count title is either more or less likely to post that article online than an author who publishes in a lower-page-count title.

TABLE 1
Sample (Number of Articles)* and Frequency of Open Access

Discipline                              ss (Total)   ss (1)   ss (0)   % of Total Open Access
Philosophy                              602          101      501      17
Political science                       299          87       212      29
Electrical and electronic engineering   506          188      318      37
Mathematics                             610          426      184      69
* (1) = open; (0) = not open

Data Collection

Article titles and the number of citations to each article, as recorded in the ISI Web of Science database, were collected from the sample population. Self-citations, citations from articles within the same journal issue, and citations from 2004 were excluded. After collecting these data, the article title was searched as a phrase in Google. If any freely available full-text version (including drafts, preprints, and postprints) was available, the article was considered to be open access. Google is a particularly powerful search engine as it now indexes not only the full text of PDF files, but also some PostScript files. Zipped files and dvi files (output from the TeX typesetting program commonly used in mathematics), on the other hand, were only discoverable through Google from Web page links. Some full-text articles in mathematics, therefore, may have been missed; however, the majority of open-access articles in mathematics are contained in repositories, which are well indexed. (See table 2.) Searching on the full title, which in most cases was a phrase unique to an article, resulted in any full-text copy of that article appearing at the top of the result list ahead of references to it (because the title is typically encoded in the title tag). Parenthetical additions to the title made by some philosophy journals were removed before searching, as were nontext or encoded characters.

TABLE 2
Where Open Access Articles Are Found (n = 50)

                                        Author's  Discipline  Other       Deptl /       Conf. / Assoc. /  Working       Another        Course
Discipline                              Site      Repository  Repository  Company Site  Project Site      Paper Series  Person's Site  Archive
Philosophy                              36        7           0           1             2                 2             0              2
Political science                       23        3           2           5             6                 8             0              3
Electrical and electronic engineering   25        9           0           9             2                 0             4              1
Mathematics                             15        30          0           0             3                 0             1              1

Three phenomena indicate that the total number of articles coded as open access may be underestimated. The first is that a number of articles were ephemeral, meaning that there was evidence that they were available at one time but not when the sample was taken (e.g., the link was dead). Another set of ephemeral articles is reflected in the practice of authors who post a preprint but remove it when the article is published or replace it with a link to the restricted publisher's copy. Whether they do that knowing that that copy is restricted is not always clear. (Of course, it also is possible that articles available at the time of data collection were posted only recently.) There also is the likelihood that some article titles changed significantly enough to not be discoverable between the preprint and the final publication. Finally, because only Google was searched, it is very likely that articles not indexed in Google would be discoverable through other search engines.

Results and Discussion

Because bibliographic distributions are highly skewed, the nonparametric equivalent of the t-test, the two-sided Wilcoxon signed rank test, was run for each discipline. (See table 3 for summary results.) The data show a significant difference in the mean citation rates of open-access articles and those that are not freely available online in all four disciplines. The relative increase in citations for open-access articles ranged from a low of 45 percent in philosophy to 51 percent in electrical and electronic engineering, 86 percent in political science, and 91 percent in mathematics.

TABLE 3
Comparison of Mean Citation Rates Between Freely Available Articles and Those That Are Not Freely Available

                                        Mean   Standard    Mean   Standard    Difference  Percent Difference  Wilcoxon Two-tailed  SD     SD
Discipline                              (1)    Error (1)   (0)    Error (0)   in Means    in Means            p Value              (1)    (0)
Philosophy                              1.60   0.491       1.10   0.230       .500        45%                 .0012                2.51   2.62
Political science                       2.20   0.477       1.18   0.353       1.016       86%                 <.0001               2.27   1.73
Electrical and electronic engineering   2.35   0.449       1.56   0.275       .798        51%                 .0006                3.14   2.50
Mathematics                             1.60   0.270       0.84   0.230       .762        91%                 <.0001               2.84   1.60

FIGURE 1
Comparison of Citation Rates Across Disciplines for Open and Not Open Articles
[Box plots not reproduced. Groups: Mathematics, not open (n = 184); Mathematics, open (n = 424); Electrical and electronic engineering, not open (n = 317); Electrical and electronic engineering, open (n = 186); Political Science, not open (n = 212); Political Science, open (n = 87); Philosophy, not open (n = 500); Philosophy, open (n = 102). Horizontal axis: Number of Citations, 0 to 16 (Source: ISI Web of Science); outliers > 15 excluded.]

The disciplines selected did indeed represent a spectrum of adoption of open-access practices
among scholars. Seventeen percent of articles in philosophy were open access; 29 percent in political science; 37 percent in electrical and electronic engineering; and 69 percent in mathematics (table 1). It is interesting to note that the discipline with the highest rate of adoption of open access (mathematics) is not the discipline with the greatest impact of open access on citation rates (political science). On the other hand, the discipline with the lowest open-access rate (philosophy) also exhibited the most tenuous link between citation and open access (the 95% confidence interval includes both means).

Although it has been shown that scientists prefer to access their research material online, and this study indicates that that may be the case for social scientists as well, it may well not yet be true for humanists.20 It is likely that a critical mass of open articles is needed before authors will become accustomed to regularly looking for needed articles online but, when they do, the move away from print is irreversible. Research supporting this interpretation comprises studies done in medical libraries (whose journals were the first to move online), which indicate that even journals only available in print (or back issues of online journals) started to see a dramatic decline in usage when a critical mass of journals went online.21 As more research is available online, readers lower the threshold of effort they are willing to expend to retrieve documents that present any barriers to access. This indicates both a push away from print and a pull toward open access, which may strengthen the association between open access and research impact. As Lawrence hypothesized, it is likely that the greater citation rate for open-access articles indicates that authors are finding them more easily, reading them more often, and therefore citing them disproportionately in their own work. It is conceivable that the converse is true, that high impact articles are for some reason more likely to be posted online (maybe their authors get more requests for reprints and want to facilitate that process). However, that seems unlikely because the author behavior observed during the gathering of these
data indicates that the typical practice of each individual is to post either all or none of his or her articles.

Because comparing means of highly skewed distributions can be misleading, the citation distributions of each population were examined to see if there was a difference. The box plots in figure 1 show that across all disciplines the distribution of number of citations indicates that articles in the open-access sample have higher citation counts. These distributions also suggest that the greatest impact of open access is with the most-cited articles, as Steve Hitchcock et al. and Elana Broch had found.22 This also is reflected in the higher standard deviations for the open sample for three of the four disciplines studied (philosophy being the exception).

Author and Reader Behaviors

To better understand the individual choice that posting a research article online represents, subsamples of fifty open-access articles were drawn randomly from each discipline. For each article, its location was noted (table 2).23 Links to articles most often were found on author home pages (50% overall) or in disciplinary repositories (25%), with clear differences across disciplines. Also noted was whether the article was a preprint or a postprint, which also varies by discipline. (See figure 2.) Perhaps the fact that it is more typical to post a postprint in electrical and electronic engineering reflects a somewhat weaker preprint culture in that discipline. What is not reflected in these numbers is that some authors will post a preprint from a given journal and others will post a postprint. Thus, author choice is clearly a factor in the decision to post a pre- or postprint (as well as any restrictions a journal may place on posting pre- or postprints).

One conclusion that could be drawn from the increased citation rates of articles whose authors made them available as preprints is that preprints are being used as near substitutes for the less accessible published versions for many readers (although those readers still cite the published version).24 In the increasing use and distribution of the preprint, there is a blurring of the boundary between "research documentation" and "scholarly literature," to use terms from an IFLA statement on open access.25 A likely driver of the observed author behavior is that the more accessible online preprint, for disciplines that practice the exchange of preprints, is an innovation that facilitates long-established peer communication behavior.
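
As a quick arithmetic check, the overall shares quoted in this section (about 50% on author home pages and 25% in disciplinary repositories) can be recomputed from the counts in table 2. The sketch below is illustrative only: the dictionary layout and the helper name `overall_share` are assumptions introduced here, and only the first two location columns of table 2 are transcribed.

```python
# Counts transcribed from table 2 (subsample of 50 open-access articles per discipline).
# Only the two most common locations are included; the layout is an illustrative assumption.
SUBSAMPLE_SIZE = 50

author_site = {
    "Philosophy": 36,
    "Political science": 23,
    "Electrical and electronic engineering": 25,
    "Mathematics": 15,
}
discipline_repository = {
    "Philosophy": 7,
    "Political science": 3,
    "Electrical and electronic engineering": 9,
    "Mathematics": 30,
}


def overall_share(counts):
    """Return the percentage of all subsampled articles found at one location type."""
    total_articles = SUBSAMPLE_SIZE * len(counts)
    return 100 * sum(counts.values()) / total_articles


author_share = overall_share(author_site)                # 99 of 200 articles
repository_share = overall_share(discipline_repository)  # 49 of 200 articles

print(f"Author sites: {author_share:.1f}%")
print(f"Disciplinary repositories: {repository_share:.1f}%")
```

The two results, 49.5% and 24.5%, are consistent with the rounded figures given in the text. The per-discipline rows show how strongly the split differs: mathematics favors repositories (30 of 50), while philosophy favors author sites (36 of 50).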

FIGURE 2
Preprint/Postprint Distribution in Subsample (n = 50)
[Chart not reproduced. Categories: Philosophy; Political Science; Engineering (electrical and electronic); Mathematics. Surviving data label: "40% preprint".]
One may speculate that when articles are only a mouse click away, bad author behaviors that have been described in the citation analysis literature will be less common. One example is citation bias, where authors reference only journals they can access.26 Another author behavior that may become less prevalent thanks to open access is hollow citing, where the author does not read the article he or she cites, evidenced by the carrying forward of prior citation errors.27

Publisher Behaviors

Some interesting publisher practices also were uncovered during the course of this study. Publishers such as Blackwell, IEEE, ACM, and Kluwer are exposing article-level metadata, sometimes through formal agreements with Google, linking back to a restricted or pay-per-view copy.28 The result is that the Google searcher will see as many as five or six different links to the restricted copy of the article. Google searching currently works in favor of the seeker of an open-access copy, however. If an open copy of the article is available, and the article itself has been indexed by Google, that copy will appear ahead of the restricted copies in the search results display. Where the open-access article itself is not indexed but is available through a link, the page with the link will follow the links to the restricted copies.

Publisher use of the Digital Object Identifier (DOI) to expose article-level metadata deserves singling out as a metadata exposure technique. Many publishers note the DOI on the electronic copy of the article and promote its use as a unique identifier for an article, one that some authors use on their own Web pages.29 Publishers are currently leveraging the DOI through partnerships, such as the one between CrossRef and Google in which CrossRef will provide Google with publisher metadata, including DOIs.30 But because publishers register DOIs with the International DOI Foundation (IDF), which also provides resolution services, DOI links will always resolve to the publisher's copy of the article.31 This practice points to the need for the DOI to be directly actionable on the Web, perhaps through the "info" URI, so that authors, libraries, or others could provide services that resolve DOI-based links to open-access (or locally licensed) copies of the articles.32

Conclusions

This study indicates that, across a variety of disciplines, open-access articles have a greater research impact than articles that are not freely available. Although this finding is only a part of the complex picture of ongoing changes in scholarly communication in a networked information environment, it can help to inform librarians' strategies in working on initiatives such as building institutional repositories, pursuing open-access publishing alternatives, or working with faculty on negotiating rights with publishers. Understanding disciplinary differences in authors' posting of e-prints has implications for existing library services as well. Some approaches to changing the existing scholarly communication model, such as disciplinary or institutional repositories and open-access journals, are longer term and top-down. Author self-posting, as seen in the ad hoc manner in which it is occurring, is a present-day grassroots response to clearly perceived benefits in sharing scholarly output. Libraries as institutions must respond in both the long and short terms by participating in major new initiatives and at the same time better encompassing the new open-access literature in their collections. For instance, the contents of many repositories can be accessed using the Open Archives Initiative Protocol for Metadata Harvesting. In many
cases, simply passing a title through to Google would be a valuable addition to a reference linking service.

It is well known that, despite the fact that journal-level impact factors are routinely used to evaluate authors of individual articles, journal impact factors correlate poorly with actual citations of individual articles.33 The high standard deviations of these samples bear this out and point to the value of new citation measures, such as CiteSeer or ParaCite, which assess the impact of individual articles.34 Open-access articles make these new, more meaningful measures of research impact possible. Evidence of the rapid evolution of bibliometrics toward "webometrics," "cybermetrics," and "influmetrics," as Blaise Cronin has characterized them, is the partnership between ISI and CiteSeer to create a new citation measurement tool.35,36

Studies such as this one can help to shed light on the "dark matter" of open access. Outside a few disciplines, the majority of freely available articles will not be found in a repository or in an open-access journal but, rather, on personal home pages. Those articles are not amenable to collecting hit-count measures of use, yet they are clearly used. Shedding light on this category of open access reveals that scholars in diverse disciplines are adopting open-access practices at a surprisingly high rate and are being rewarded for it, as reflected in a traditional measure of research impact.
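
To make the repository-harvesting suggestion in the conclusions more concrete, the following minimal sketch builds a `ListRecords` request for the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), asking for unqualified Dublin Core records. The `verb`, `metadataPrefix`, `from`, and `set` arguments are part of the protocol; the base URL, the function name `build_list_records_url`, and the example date are hypothetical placeholders. A real harvester would also need to send the request and follow any `resumptionToken` values in the responses.

```python
from typing import Optional
from urllib.parse import urlencode


def build_list_records_url(base_url: str,
                           metadata_prefix: str = "oai_dc",
                           from_date: Optional[str] = None,
                           set_spec: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL (no network access is performed)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date  # selective harvesting by datestamp
    if set_spec is not None:
        params["set"] = set_spec    # selective harvesting by repository-defined set
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, used only for illustration.
url = build_list_records_url("https://repository.example.edu/oai", from_date="2004-01-01")
print(url)
```

A library service could run such requests periodically against the repositories its faculty use and load the returned titles and identifiers into a local index. Articles on personal home pages would remain outside its reach, since they are not exposed through this protocol.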

