doi: 10.1093/bioinformatics/btv161
Advance Access Publication Date: 19 March 2015
Applications Note
Systems biology
Abstract
Motivation: Network alignment aims to find conserved regions between different networks.
Existing methods aim to maximize total similarity over all aligned nodes (i.e. node conservation).
Then, they evaluate alignment quality by measuring the amount of conserved edges, but only after
the alignment is constructed. Thus, we recently introduced MAGNA (Maximizing Accuracy in
Global Network Alignment) to directly maximize edge conservation while producing alignments
and showed its superiority over the existing methods. Here, we extend the original MAGNA with
several important algorithmic advances into a new MAGNA++ framework.
Results: MAGNA++ introduces several novelties: (i) it simultaneously maximizes any one of three
different measures of edge conservation (including our recent superior S3 measure) and any
desired node conservation measure, which further improves alignment quality compared with
maximizing only node conservation or only edge conservation; (ii) it speeds up the original
MAGNA algorithm by parallelizing it to automatically use all available resources, as well as by
reimplementing the edge conservation measures more efficiently; (iii) it provides a friendly graph-
ical user interface for easy use by domain (e.g. biological) scientists; and (iv) at the same time,
MAGNA++ offers source code for easy extensibility by computational scientists.
Availability and implementation: http://www.nd.edu/cone/MAGNA++/
Contact: tmilenko@nd.edu
1 Introduction
Proteins produced by genes interact to carry out cellular processes, which can be modeled by
protein–protein interaction (PPI) networks. PPI network alignment can be used to find a node mapping
between networks of different species that identifies similar regions between the networks
(Sharan and Ideker, 2006; Clark and Kalita, 2014). Consequently, it can be used to predict protein
function by transferring knowledge from the network of a well-studied species to the network of a
poorly studied species, or to reconstruct species’ phylogenetic relationships based on similarities
of their networks (Liao et al., 2009; Kuchaiev et al., 2010; Milenković et al., 2010; Kuchaiev and
Pržulj, 2011; Patro and Kingsford, 2012; Faisal et al., 2014; Saraph and Milenković, 2014; Sun et al.,
2014).
Traditionally, existing methods first compute pairwise node similarities between networks and then
they find a high-scoring alignment that maximizes (greedily or optimally) the total similarity over
all aligned nodes (or node conservation) (Faisal et al., 2014; Crawford et al., 2014). However,
alignment quality is then evaluated with respect to a measure of edge conservation. Thus, traditional
methods aim to conserve edges by aligning nodes that are similar.
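The traditional two-stage workflow described above can be illustrated with a minimal sketch. It is not taken from any of the cited aligners: the function names, the toy networks and the similarity values are all assumptions made for illustration. The alignment is built greedily from node similarities alone, and edge conservation is only measured afterwards.

```python
def greedy_align(sim):
    """Greedily pair nodes by descending similarity (hypothetical helper).

    sim maps (u, v) -> similarity, with u in network 1 and v in network 2.
    Each node is used at most once, so the result is an injective mapping.
    """
    alignment, used = {}, set()
    for (u, v), _ in sorted(sim.items(), key=lambda kv: kv[1], reverse=True):
        if u not in alignment and v not in used:
            alignment[u] = v
            used.add(v)
    return alignment


def conserved_edges(alignment, edges1, edges2):
    """Count edges of network 1 whose aligned endpoints form an edge in network 2."""
    e2 = {frozenset(e) for e in edges2}
    return sum(
        1
        for a, b in edges1
        if a in alignment and b in alignment
        and frozenset((alignment[a], alignment[b])) in e2
    )


# Toy data (assumed): two 3-node paths and an arbitrary similarity table.
edges1 = [("a", "b"), ("b", "c")]
edges2 = [("x", "y"), ("y", "z")]
sim = {
    ("a", "x"): 0.9, ("a", "y"): 0.2, ("a", "z"): 0.1,
    ("b", "x"): 0.3, ("b", "y"): 0.8, ("b", "z"): 0.4,
    ("c", "x"): 0.1, ("c", "y"): 0.5, ("c", "z"): 0.7,
}

alignment = greedy_align(sim)                           # stage 1: node conservation only
n_conserved = conserved_edges(alignment, edges1, edges2)  # stage 2: evaluated afterwards
```

In this sketch the edge lists play no role while the alignment is being built, which is the limitation the text points out.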
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
2410 V.Vijayan et al.
In contrast, our recent MAGNA (Maximizing Accuracy in Global Network Alignment) (Saraph and
Milenković, 2014) directly maximizes edge conservation while producing alignments. MAGNA was shown
to outperform the existing methods, including IsoRank (Singh et al., 2007), MI-GRAAL (Kuchaiev and
Pržulj, 2011) and GHOST (Patro and Kingsford, 2012), in terms of both node and edge conservation.
Importantly, in addition to constructing its own superior alignments from scratch, MAGNA can combine
alignments of existing methods to further improve them.
As simultaneously maximizing both node and edge conservation could further improve alignment quality
(Neyshabur et al., 2013; Crawford and Milenković, 2014; Sun et al., 2014), here we extend MAGNA into
a new MAGNA++ framework, which (i) allows for [...]
[...] the accuracy of the alignment in terms of node correctness (NC) (Fig. 1) (Saraph and
Milenković, 2014).
2.2 Speedup via parallelization and faster calculation of edge conservation
The original MAGNA is a genetic algorithm that combines existing ‘parent’ alignments into superior
‘children’ alignments and then evolves this process over multiple generations. Its main
computational bottleneck is the calculation of quality for each alignment in each generation, which
is needed in order to select the highest-scoring alignments for the next generation. As quality of
each alignment can be calculated independently, unlike the original single-[...]
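Because each alignment in a generation can be scored independently, the selection step is easy to parallelize. The sketch below is one way to express that idea and is not the MAGNA++ implementation: the names `quality` and `select_survivors`, the toy networks and the use of a conserved-edge count as the quality score are assumptions. A thread pool is used only to keep the example self-contained. For CPU-bound scoring in CPython, a process pool (`concurrent.futures.ProcessPoolExecutor`) would be the more realistic choice.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy networks (assumed): both are the path 0-1-2-3.
EDGES1 = [(0, 1), (1, 2), (2, 3)]
EDGES2 = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3)]}


def quality(alignment):
    """Stand-in quality score: number of conserved edges.

    alignment[i] is the network-2 node that network-1 node i is mapped to.
    """
    return sum(frozenset((alignment[a], alignment[b])) in EDGES2 for a, b in EDGES1)


def select_survivors(population, keep, workers=4):
    """Score every alignment concurrently, then keep the `keep` highest-scoring ones."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each call to quality() depends only on its own alignment,
        # so the calls can run in any order or at the same time.
        scores = list(pool.map(quality, population))
    ranked = sorted(zip(scores, population), key=lambda sp: sp[0], reverse=True)
    return [alignment for _, alignment in ranked[:keep]]


population = [(0, 1, 2, 3), (1, 0, 2, 3), (3, 2, 1, 0), (2, 0, 3, 1)]
survivors = select_survivors(population, keep=2)
```

The crossover step that produces ‘children’ from ‘parent’ alignments is deliberately left out, since the text above does not specify it.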
[Figure 1 plot not reproduced. Legend: S3, NM, NC. x-axis: α, from 0 to 1, for each network pair.]
Fig. 1. Alignment quality in terms of NC, GDV-similarity node conservation measure (NM) and S3 edge
conservation measure (S3), when optimizing with MAGNA++ node conservation only (left; α = 0), edge
conservation only (right; α = 1; the original MAGNA), or a combination of node and edge conservation
(middle; α in the (0, 1) range). We show results for the same 13 synthetic and real-world network
pairs (shown at the top of the figure) as in the original MAGNA publication (Saraph and Milenković,
2014). Recall that we can compute NC for the synthetic but not real-world networks.
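The role of α in the caption can be made concrete with a short sketch. It assumes the objective is a convex combination, α × (edge conservation) + (1 − α) × (node conservation), which matches the endpoints stated in the caption (α = 0 is node conservation only, α = 1 is edge conservation only). The S3 formula used here is the commonly stated one: conserved edges divided by the sum of the edges of the first network and the edges of the second network induced on the aligned nodes, minus the conserved edges. Averaging the node similarities is an assumption, and all function names and toy data are hypothetical.

```python
def s3_score(alignment, edges1, edges2):
    """S3 edge conservation for an injective alignment of every node of network 1."""
    e2 = {frozenset(e) for e in edges2}
    image = set(alignment.values())
    conserved = sum(frozenset((alignment[a], alignment[b])) in e2 for a, b in edges1)
    # Edges of network 2 with both endpoints among the aligned nodes.
    induced = sum(1 for e in e2 if e <= image)
    denom = len(edges1) + induced - conserved
    return conserved / denom if denom else 0.0


def node_score(alignment, sim):
    """Assumed node conservation: mean similarity over the aligned node pairs."""
    return sum(sim[(u, v)] for u, v in alignment.items()) / len(alignment)


def combined_score(alignment, edges1, edges2, sim, alpha):
    """alpha = 1 gives edge conservation only, alpha = 0 node conservation only."""
    edge_part = s3_score(alignment, edges1, edges2)
    node_part = node_score(alignment, sim)
    return alpha * edge_part + (1 - alpha) * node_part


# Toy data (assumed).
edges1 = [("a", "b"), ("b", "c")]
edges2 = [("x", "y"), ("y", "z"), ("x", "z"), ("z", "w")]
alignment = {"a": "x", "b": "y", "c": "z"}
sim = {("a", "x"): 1.0, ("b", "y"): 0.5, ("c", "z"): 0.0}

for alpha in (0.0, 0.5, 1.0):
    print(alpha, combined_score(alignment, edges1, edges2, sim, alpha))
```

In the toy data, both edges of the first network are conserved, but the aligned nodes of the second network also carry the extra edge x-z, so S3 is 2/3 rather than 1.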
Accuracy in global network alignment 2411
3 Conclusion
MAGNA is an already proven network aligner. MAGNA++ is its novel extension that allows for higher
alignment quality, lower computational complexity, intuitive use by domain scientists and easy
functional extensibility by computational scientists.
Funding
This work was supported by the National Science Foundation
[CCF-1319469].