Article on bioRxiv
analysis, accounting for 14% of all published preprints (Elsevier was second, with 13%). Papers were matched across bioRxiv and the Nature journals by digital object identifier (DOI).

The data show that 57% of the preprints related to papers published in Nature journals were posted on bioRxiv after the paper was submitted. A total of 5% of papers were posted after acceptance, while 0.2% were posted after publication.

The time afforded to posting before submission was generally brief. Only 29% of published papers were posted as a preprint more than 10 days before the paper was submitted to the journal that published it, while 26% of preprints were posted within 10 days before or after submitting it to that journal. The remainder were posted more than 10 days after submission.

Figure 1 shows the data for each paper from 1 to 1,220 (X-axis). Negative values connote days after submission, meaning the preprint was posted after submission.

The trend in the data was towards a longer period occurring between submission and posting, doubling from 20 days post-submission for 2016 preprints to 40 days post-submission for 2018 preprints. The overall effect of the data suggests fewer postings prior to submission and more postings after (the data are from 2016 to 2018, reading left to right, with the spread of papers distributed evenly from 0 to 1,220). This suggests that authors are increasingly using the platform for purposes other than receiving pre-submission feedback.

Overall and on average, the preprints related to Nature journal articles were posted slightly more than a month (34 days) after being submitted to the journal that would ultimately publish the work. This timing may have some significance. In general, for many journals, 30 days is enough time for a paper to clear initial editorial review and, in some cases, to have cleared initial peer review.

Given these data, it appears that, instead of a pre-submission system, bioRxiv may be viewed as a pre-publication system by the authors using it. This shift in author utilization of bioRxiv may be accelerating as well.

There are limitations to these data. We cannot tell from these data if any papers were reviewed and rejected elsewhere prior to the ultimately successful submission, review, and publication process. This was a sampling of articles submitted to a specific publisher. The sample size was limited by the availability of complete submission, acceptance, and publication data across many publishers.

With the majority of preprints in this sample being posted after submission to the journal that would ultimately publish them, and usually after some editorial signal has been sent, if not complete reviews, it seems bioRxiv is being used by authors for purposes other than garnering community feedback. Given the potential for posted preprints to reflect peer review or editorial review feedback, and the increasing time between submission and posting, these practices may have implications for publisher policies around Green OA, preprints, and copyright transfer agreements.

TRENDS IN UNREVIEWED PREPRINTS ON BIORXIV

(A note on terminology: I view preprints as published in the practical sense. However, where they have not been accepted and published by a peer reviewed journal, what remains on bioRxiv is the unreviewed assertions of the authors, which here are called the ‘unreviewed preprint’. This term is used to differentiate these articles from those that have been reviewed and subsequently published in peer reviewed journals.)

Using data from more than 37,000 preprints posted on bioRxiv between November 2013 and 31 December 2018, thousands of unreviewed preprints remain on bioRxiv servers. I consider these to be abandoned manuscripts – either abandoned by the authors, who never intended to publish them in a peer reviewed journal or who gave up after multiple unsuccessful attempts, or abandoned by the scientific community via the peer review and editorial review mechanisms the community uses to accept or reject research findings. However defined, the rate of unreviewed or abandoned manuscripts is steady – 2 years after posting, approximately a third of preprints has not resulted in a peer reviewed publication.
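
As a rough cross-check of the "approximately a third" figure, the per-year percentages reported in this section for 2014–2017 and the 2013 counts (77 preprints posted, 31 never published) can be recomputed. The sketch below is illustrative only: the function and variable names are hypothetical, and it assumes a simple unweighted mean across years, which the article does not specify.

```python
from typing import Mapping

# Per-year unpublished rates (percent) as reported in the article for the
# four complete years. The dictionary name is illustrative.
UNPUBLISHED_RATE_PCT: Mapping[int, float] = {
    2014: 31.9,
    2015: 32.2,
    2016: 31.3,
    2017: 32.4,
}


def unpublished_rate(posted: int, never_published: int) -> float:
    """Return the percentage of posted preprints with no peer reviewed paper."""
    if posted <= 0:
        raise ValueError("posted must be a positive count")
    return 100.0 * never_published / posted


def mean_rate(rates: Mapping[int, float]) -> float:
    """Unweighted mean of the yearly rates (an assumption, see lead-in)."""
    return sum(rates.values()) / len(rates)


# 2013 counts reported in the article: 77 posted, 31 never published.
rate_2013 = unpublished_rate(posted=77, never_published=31)
mean_2014_2017 = mean_rate(UNPUBLISHED_RATE_PCT)

print(f"2013 unpublished rate: {rate_2013:.1f}%")
print(f"2014-2017 unweighted mean: {mean_2014_2017:.2f}%")
```

Under these assumptions the 2013 rate and the four-year mean land close to the 40.3% and 32% quoted in the article. A mean weighted by yearly posting volume would need per-year posting totals, which are only given for 2013.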
As of 1 August 2019, for the years 2013–2019, 41.9% of bioRxiv preprints have not been published in a peer reviewed journal. The averages at the heart of the data are quite consistent (Fig. 2):

• 2013 – 40.3% (service started in November, 77 preprints posted, 31 never published)
• 2014 – 31.9%
• 2015 – 32.2%
• 2016 – 31.3%
• 2017 – 32.4%
• 2018 – 50.0%

Focusing on the four complete years, when the service had a full year for preprint posting and publication events can be assumed to have had sufficient time to occur (2014–2017), the average rate of abandoned manuscripts is 32%, or roughly 1 in 3. These data are not dissimilar to publication rates calculated from arXiv preprints (Lariviere et al., 2014).

The rapid increase in the raw number of preprints means that the steady percentage of abandoned preprints represents a growing number of preprints posted on bioRxiv that have never been accepted in a peer reviewed journal (Fig. 3):

• 2013 – 31 preprints never published
• 2014 – 252
• 2015 – 509
• 2016 – 1,293
• 2017 – 3,339
• 2018 – 10,213

This rapid increase in preprints posted, combined with the historical norms of unpublished or abandoned preprints, means that the volume of unreviewed and unpublished manuscripts on bioRxiv is growing, strongly indicating that the purpose of bioRxiv as a platform allowing authors to share manuscripts for community-based improvements prior to publication in a peer reviewed journal is not being fulfilled.

PUBLISHERS OF PAPERS WITH ASSOCIATED PREPRINTS

Since 2013, Elsevier and Nature have published the largest percentage of papers initially posted on bioRxiv. These papers were published in journals with a variety of business models, including subscription journals. Over time, in fact, the percentage of papers published by pure OA publishers (e.g. PLOS, BioMed Central, eLife) has decreased.

As of 1 August 2019, Nature and Elsevier accounted for 27% of the published papers based on preprints from 2013 to 2018. Of the two, Elsevier is growing the fastest. Along with PLOS and Oxford University Press (OUP), four publishers published 47% of the papers based on bioRxiv preprints over the years mentioned above:

• Nature – 14%
• Elsevier – 13%
• PLOS – 11%
• OUP – 9%

In the market overall (personal communication, 2019), Elsevier accounts for 18% of the papers published (which puts them short of market here), while SpringerNature accounts for ~12% (which means the Nature subset is strongly overperforming in the market for these papers). Authors associated with bioRxiv preprints may submit papers to Nature journals more often due to its journals mix – there are many prominent biology journals at Nature, as well as the flagship. Despite this, Elsevier’s share appears to be overtaking Nature’s – up to 1 August 2019, for 2018 preprints, Elsevier had published 14% of the published preprints to Nature’s 13%.

Two of the Top 4 are non-profit publishers. Overall, the mix of the most active publishers of papers emanating from bioRxiv preprints skews towards the non-profit space. This may have to do with how biology and medical publishers skew overall, with some of the most robust independent journals coming from
biology and medicine. Non-profit biomedical publishers are usually affiliated with a society and have some of the most respected journal brands – high-impact, central to communities, aspirational – which may also explain the observation.

A total of ~100 publishers or journals have published papers that first existed as bioRxiv preprints. This number seems to be stabilizing, probably indicating the scope of interest for authors of viable bioRxiv preprints. However, the papers are not evenly distributed. The share of bioRxiv preprints published with the 12 most active publishers of bioRxiv preprints totals 72% of the papers resulting from preprints. These 12 are (in order – those with a share of 10% or more are in bold italic):

1. Nature
2. Elsevier
3. PLOS
4. OUP
5. BioMed Central
6. eLife
7. Wiley
8. PNAS
9. Genetics Society of America
10. Society for Neuroscience
11. Frontiers
12. PeerJ

The total share of papers published from bioRxiv preprints for this dozen has been declining slowly; however, 80% of papers based on 2015 preprints were published by these 12, and this had dropped to 70% for 2018 preprints as of 1 August 2019.

Megajournals do not appear to appeal to authors posting preprints on bioRxiv. For the preprint years 2015–2017, bioRxiv preprints were only published as peer reviewed papers 3–9% of the time in four of the major megajournals (PLOS One, Nature Communications, Scientific Reports, and Science Advances). These percentages declined over time.

OTHER OBSERVATIONS

Further analysis shows that publication events generally occur more frequently in the final calendar quarter of each year, with the lowest percentage published in the first calendar quarter (16%) and the highest percentage (30%) published in the last calendar quarter. Given the high probability of calendar quarters aligning with fiscal year quarters, these data suggest that publishers accelerate publication of papers as the year ends. These trends were independent of the overall growth of the number of preprints on bioRxiv (Fig. 4).

The publishers most consistently publishing both more papers and a higher percentage of papers in Q4 were Nature, Elsevier, and OUP. This behaviour suggests that some calendar-year incentive may drive the pace of paper acceptances and publication practices.

Papers emanating from preprints were published in journals from exclusively OA publishers (PLOS, eLife, PeerJ, BioMed Central, and Frontiers) and OA megajournals 51% of the time in 2014. This percentage dropped to 24% by 2017.

As the share of Nature and Elsevier journals publishing preprints increased, there was an observable tendency for society journals contracting for services with these large publishers to publish articles using the Gold OA business model. Nature and Elsevier’s proprietary titles appeared to do this far less commonly.

Twitter amplification of preprints is the most commonly associated activity on the bioRxiv platform. However, its efficacy has been declining. From 2014 to 2017, a tweet about a bioRxiv preprint on average reached 76,181 users. By the first half of 2019, the average reach had declined to 53,967, a 29.2% decline. The number of tweets per preprint also declined, from an average of
18 in the 2014–2017 time period to 13 in the first half of 2019. The major contributing factor appears to be that Twitter users with followers numbering in the hundreds of thousands boosted preprints in the early days but have since moved on to other things and no longer tweet about bioRxiv or preprints.

DISCUSSION

Overall, the null hypothesis – that bioRxiv helps authors improve manuscripts prior to publication in peer reviewed journals – is not supported by the data. Most authors who publish peer reviewed articles based on preprints post the related preprint after submission to the accepting journal, and more post within 10 days of submission. Taken together, these data indicate that bioRxiv is not generally used as a source of pre-submission feedback. Trends also indicate that the rate of unpublished and ultimately abandoned preprints remains steady, indicating that bioRxiv is not scaling any effective and productive feedback mechanism for authors, which one would reasonably expect would lead to a decreasing percentage of unpublished preprints.

The practice of posting preprints close to or after submission, and potentially after receiving a positive sign indicating likely publication, is not unexpected. Good authors are cautious. They do not want to be scooped or embarrassed. Therefore, it makes sense they would only post a paper that has received some signal about its fate, and perhaps some review and positive feedback, maybe even a letter noting that only minor changes remain, bolstering their confidence that their paper is in decent shape and bound for publication.

There are also data suggesting that papers posted as preprints generate more citations (Davis & Fromerth, 2007). An unknown proportion of authors may be placing preprints in bioRxiv to realize this benefit. Depositing preprints also helps authors establish priority. Finally, there is a distinct aspect of social media promotion detectable on bioRxiv, with most ‘comments’ linked from the tool the platform uses coming from article promotion on Twitter and other social media platforms. In a more competitive ‘publish or perish’ and funding environment, authors may be using any and all methods they have to promote themselves and their work. While rational, author and paper promotion is not a stated purpose of bioRxiv.

The observation that a steady percentage of bioRxiv preprints are abandoned is more problematic. Operational assumptions – such as assigning permanent DOIs to preprints – were made based on the notion that preprints would be tied to permanent, validated journal articles. If the assumption had been otherwise, a URL would be perfectly functional until and unless a peer reviewed article was published based on the existing preprint. Because this assumption was made, bioRxiv also lacks a policy to retire or deprecate an inactive preprint after a reasonable period, which the data indicate would be 2–3 years after posting.

Overall, four publishers – Nature, Elsevier, PLOS, and OUP – were responsible for publishing 72% of the papers emanating from preprints posted to bioRxiv. The level of concentration among these four is slowly decreasing over time. This may be due to more society publishers establishing cascade systems and launching Gold OA publications. The lack of interest in megajournals among authors posting bioRxiv preprints and successfully pursuing publication may indicate that publication brand is an important element of submission decisions. The erosion of pure OA publishers as outlets for papers emanating from bioRxiv preprints appears to confirm other findings that a business model is not a primary way for authors to determine publication venue.

Preprint servers are an experiment and need to be measured and assessed as such. To date, bioRxiv looks as if it adds little to the experience of most authors, the fate of most papers, and the overall health of the scholarly publishing ecosystem. The risks and costs of leaving thousands of unreviewed and abandoned preprints available to the general public, scientists, and practitioners are not clear and may be sizable, especially as unrefereed preprints are ported into medRxiv, a preprint server specifically focused on medicine.

These data suggest that bioRxiv is being utilized by authors more as a pre-publication, post-acceptance platform for article promotion, career advancement, and protection of priority.

ACKNOWLEDGEMENTS

The author acknowledges the contributions of Eric Anderson, which included the acquisition of key data elements about publishers’ utilization of preprints. The author also thanks Phil Davis for discussions about similar findings for arXiv and for providing useful references for the same.

REFERENCES

Abdill, R. J., & Blekhman, R. (2019). Meta-research: Tracking the popularity and outcomes of all bioRxiv preprints. eLife, 8, e45133. https://doi.org/10.7554/eLife.45133

Brooks, T. C. (2009). Organizing a research community with SPIRES: Where repositories, scientists, and publishers meet. Information Services & Use, 29, 91–96. Retrieved from https://inspirehep.net/info/general/project/ape09.pdf

Davis, P. M., & Fromerth, M. J. (2007). Does the arXiv lead to higher citations and reduced publisher downloads for mathematics articles? Scientometrics, 71, 203–215. https://doi.org/10.1007/s11192-007-1661-8

Goldschmidt-Clermont, L. (1965). Communication patterns in high-energy physics. Retrieved from http://eprints.rclis.org/4253/

Kling, R. (2005). The internet and unrefereed scholarly publishing. Annual Review of Information Science and Technology, 38(1), 591–631. https://doi.org/10.1002/aris.1440380113

Lariviere, V. et al. (2014). arXiv e-prints and the journal of record: An analysis of roles and relationships. Journal of the Association for Information Science and Technology, 65(6), 1157–1169. https://doi.org/10.1002/asi.23044