doi: 10.1093/bioinformatics/btab190
Advance Access Publication Date: 23 April 2021
Applications Note
Sequence analysis
DamageProfiler: fast damage pattern calculation for ancient DNA
Judith Neukamm1,2,3,*, Alexander Peltzer2,4 and Kay Nieselt2,*
Received on October 5, 2020; revised on February 22, 2021; editorial decision on March 11, 2021
Abstract
Motivation: In ancient DNA research, the authentication of ancient samples based on specific features remains a crucial step in data analysis. Because of this central importance, researchers lacking deeper programming knowledge should be able to run a basic damage authentication analysis. Such software should be user-friendly and easy to integrate into an analysis pipeline.
Results: DamageProfiler is a Java-based, stand-alone software to determine damage patterns in ancient DNA. The results are provided in various file formats and plots for further processing. DamageProfiler has an intuitive graphical as well as command line interface that allows the tool to be easily embedded into an analysis pipeline.
Availability and implementation: All of the source code is freely available on GitHub (https://github.com/Integrative-Transcriptomics/DamageProfiler).
Contact: judith.neukamm@uzh.ch or kay.nieselt@uni-tuebingen.de
Supplementary information: Supplementary data are available at Bioinformatics online.
1 Introduction
The field of ancient DNA (aDNA) further develops with every passing year and current research spans diverse topics including the analysis of various taxa, from ancient plant and animal species (Debruyne et al., 2008; Jaenicke-Després et al., 2003), to human microbiological communities from oral or gut microbiomes (Tito et al., 2012; Warinner, 2016). One central analysis step is the authentication of the ancient origin of the DNA. Several characteristics have been identified that describe post-mortem DNA modifications. These can be used to differentiate between authentic aDNA and modern contamination and are typically summarized as (i) short fragment length (Meyer et al., 2016); (ii) preferential fragmentation of aDNA at purine bases (Briggs et al., 2007); and (iii) increased Cytosine (C) to Thymine (T) base misincorporations towards the five prime (5′) end and (iv) complementary Guanine (G) to Adenine (A) base misincorporations towards the three prime (3′) end of the fragment due to enhanced cytosine deamination in single-stranded 5′-overhanging ends (Briggs et al., 2007). Existing software such as mapDamage2.0 (Jónsson et al., 2013) and PMDTools (Skoglund et al., 2014) calculate the nucleotide misincorporation and fragmentation patterns from next-generation sequencing (NGS) reads and provide graphical and text-based representations, as well as further statistical estimates. However, their usability is hindered by the exclusive use as a command line tool and long runtimes for larger sample sets. For ancient metagenomic samples, HOPS (Hübler et al., 2019) has recently been published. This tool offers a fully automatized bacterial screening pipeline for aDNA sequences that provides detailed information on species identification and authenticity. However, HOPS includes the metagenomic mapping software MALT (Vågene et al., 2018) and is limited to its output format.
Here, we present DamageProfiler, a fast and user-friendly software to investigate damage patterns in high-throughput sequencing data. DamageProfiler aims for high usability and fast runtime to be applicable to large datasets. The results are provided as intuitive, well-described figures and text-based representation in various file formats, including JSON format. Moreover, the visualizations can be explored interactively in the graphical user interface. The various output formats and runtime improvements make it suitable for integrated use in analysis pipelines.
2 Results
DamageProfiler is a fast, interactive and easy-to-use Java-based program to investigate the damage patterns of high-throughput sequencing data. DamageProfiler is compatible with newer file formats such as CRAM, and also legacy file formats such as SAM or BAM. In addition to several statistics, DamageProfiler generates a damage plot, which is calculated by determining the frequency of C
© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com