Munir 2018
PII: S1476-9271(17)30871-X
DOI: https://doi.org/10.1016/j.compbiolchem.2018.04.011
Reference: CBAC 6843
Please cite this article as: Munir, Anum, Elahi, Sana, Masood, Nayyer, Clustering
based drug-drug interaction networks for possible repositioning of drugs against EGFR
mutations: Clustering based DDI networks for EGFR mutations. Computational Biology
and Chemistry https://doi.org/10.1016/j.compbiolchem.2018.04.011
This is a PDF file of an unedited manuscript that has been accepted for publication.
As a service to our customers we are providing this early version of the manuscript.
The manuscript will undergo copyediting, typesetting, and review of the resulting proof
before it is published in its final form. Please note that during the production process
errors may be discovered which could affect the content, and all legal disclaimers that
apply to the journal pertain.
Clustering Based Drug-Drug Interaction Networks for Possible Repositioning of Drugs against EGFR Mutations

Anum Munir1,3*, Sana Elahi1, Nayyer Masood1,2

1 Department of Bioinformatics and Biosciences, Capital University of Science and Technology, Islamabad, Pakistan
2 Department of Computer Science, Capital University of Science and Technology, Islamabad, Pakistan
3 Bioinformatics International Research Club, Abbottabad, Pakistan

Highlights:
The synergistic effect arises when an interaction increases the effect of either drug molecule.
interactions
Centrality analysis is conducted to determine the strong interactions among the drug compounds.
Toxic doses are frequently reported as LD50 values in mg/kg of body weight.
The LD50 is the median lethal dose.
Different types of interactions are observed in each docked complex.
Drug-like properties based on the Lipinski rule can be utilized as a part of drug repurposing.
Drug-like properties based on the Lipinski rule can be utilized as a part of new drug-drug interaction discoveries.

Abstract
EGFRs are a large group of receptor tyrosine kinases that play an important role in a number of tumors, including lung, head and neck, breast, and esophageal cancers. Several techniques are used in the process of drug design. Drug repositioning, or repurposing, is an emerging idea that consists of identifying new therapeutic indications for already
compounds against EGFR mutations. Data on 2062 drugs are obtained, and mining is performed to retain only those drugs that fulfill the Lipinski rule of five. Clustering is performed, and DDI networks are built on the clusters to identify effective drug compounds. Only 1052 compounds fulfill the Lipinski rule. Twelve clusters are formed for the 1052 drug compounds. DDIs are developed for each cluster. Only 15 drugs are suggested to be more effective
Introduction
The epidermal growth factor receptor (EGFR) group of receptor tyrosine kinases (TKs), also known as the HER or ErbB family, comprises four members, HER1 (ErbB1), HER2 (ErbB2), HER3 (ErbB3), and HER4 (ErbB4), that direct numerous metabolic, physiological, and developmental pathways (Gazdar, 2009). Stanley Cohen, a scientist, discovered EGFR 25 years ago and explained its part in cell development (Seshacharyulu et al., 2012). The EGFRs are a large group of receptor tyrosine kinases that play an important role in a number of tumors, including lung, head and neck, breast, and esophageal cancers. EGFR and its family members are significant contributors to signaling cascades and thus regulate the development, signaling, migration, adhesion, and survival of tumor cells. Because of their
Particularly, the abnormal activity of EGFR has been shown to play a key role in apoptosis and cell proliferation (Wells, 1999). Nowadays, many studies investigate the inhibition of EGFR ligand binding through the design of antibodies and therapies (Lin et al., 2003). A number of techniques are being used in the process of drug
(Mountain, 2003) and is the real subject of research for some academic labs (Anderson, 2003). Cancer targets can be challenging because the targets are often somatic cell mutants of proteins that regulate fundamental cell functions, resulting in the loss of a general function. Obviously, it is difficult for a small inhibitor to potentiate the recovery of
therapeutic indications for already existing active pharmaceutical compounds (Udrescu et al., 2016; Sleigh et al., 2010; Munir et al., 2016). The rise of large-scale data and
of pharmacology and drug design, including drug repurposing. In fact, computational models are used to reveal drug interactions that were not found during clinical trials (Tatonetti et al., 2012), or to predict drug safety (Egan et al., 2004). In silico tools provide a visual and intuitive framework for representing the interactions (Lewis, 2010), thereby supporting medicinal and pharmaceutical practice.
drugs, and the links between them correspond to the drug interaction relationships (Polesk et al., 2011; Karlgren et al., 2011). Clustering methodologies are used to build DDI networks that suggest possible repositioning (Nugent et al., 2016). In this research work, the same clustering technique is applied to build DDI networks for drug repositioning against EGFR mutations. However, current research proposes that interaction evidence from DDIs alone can be used to estimate the physiological effects of a drug and therefore,
This procedure is based on the clustering of drug properties and the development of DDIs. First, the clustering procedure is explained, then the method of building DDIs and
Collection of Drugs Property Data
The data on commercial compounds that can be used in the treatment of EGFR mutations in various cancers were obtained from the Zinc database through virtual screening.
structures of drug compounds, it also gives information about the drug-like properties of the chemical compounds. Virtual screening is a computational method used in drug discovery and development to find collections of small ligands that can appropriately bind to their target proteins or enzymes (Rester, 2008; Rollinger et al., 2008). The data comprised about 2062 records of drug compounds, consisting of chemical properties such as Drug Zinc ID, Log P, molecular weight (Mw), hydrogen bond
Initially, the data consisted of all the drug compounds that can be used to treat EGFR mutations. Most of the records contained obsolete drugs, which are toxic in nature and have many side effects. The data were sorted, and mining was performed to filter out those drug compounds that do not fulfill the Lipinski rule of five, leaving only 1052 drugs that fulfill the rule. This rule states that a drug-like compound must have an Mw of no more than 500 Daltons, no more than five HBD, no more than ten HBA, and logP
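The filtering step can be illustrated with a minimal sketch. It assumes the commonly stated Lipinski thresholds (Mw ≤ 500 Da, HBD ≤ 5, HBA ≤ 10, logP ≤ 5) and a strict reading in which a compound is kept only if it violates none of them. The `Compound` record and the two "HYPOTHETICAL" entries are invented for illustration, and the first entry's values are read from Table 3 under the assumption that its columns are logP, HBD, HBA, and Mw.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    zinc_id: str
    log_p: float
    hbd: int    # hydrogen bond donors
    hba: int    # hydrogen bond acceptors
    mw: float   # molecular weight in Daltons


def passes_lipinski(c: Compound) -> bool:
    """Return True when the compound violates none of the four thresholds."""
    return c.mw <= 500 and c.hbd <= 5 and c.hba <= 10 and c.log_p <= 5


def lipinski_filter(compounds):
    """Keep only the compounds that satisfy every threshold."""
    return [c for c in compounds if passes_lipinski(c)]


compounds = [
    Compound("ZINC29128814", 3.19, 4, 9, 426),     # values as read from Table 3 (column order assumed)
    Compound("HYPOTHETICAL-1", 6.20, 2, 4, 380),   # invented: logP too high
    Compound("HYPOTHETICAL-2", 2.10, 7, 12, 610),  # invented: too heavy, too many donors and acceptors
]
kept = lipinski_filter(compounds)
print([c.zinc_id for c in kept])
```

The original formulation of the rule tolerates a single violation; the strict variant is used here only because the text speaks of compounds that "fulfill" the rule.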
Simple K-means clustering was performed on the drug property data in the Weka tool. WEKA is a widely used machine learning workbench that contains implementations of algorithms for clustering, classification, and association rule mining, along with graphical user interfaces and visualization services for data exploration and algorithm evaluation (Bouckaert et al., 2010). K-means is one of the simplest unsupervised learning algorithms used to solve the well-known clustering problem. It follows a simple procedure to partition a given data set (x1, x2, x3, ..., xn) into a certain number of clusters, usually K clusters (K ≤ n) (MacQueen, 1967). The optimal number of clusters K was determined using the Elbow method given by Kodinariya and Makwana (2013), where K is the number of
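As an illustration of the clustering step, the following sketch runs a plain K-means (Lloyd's algorithm) for several values of K and picks the elbow of the within-cluster sum of squares (WCSS) curve. It is not the Weka implementation: the deterministic farthest-point initialisation, the "largest bend" elbow heuristic, and the nine two-dimensional points are assumptions chosen to keep the example small and reproducible.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100):
    # Deterministic farthest-point initialisation (an assumption; Weka seeds randomly).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    wcss = sum(sq_dist(p, centroids[i]) for i, cl in enumerate(clusters) for p in cl)
    return centroids, clusters, wcss


def elbow_k(wcss_by_k):
    """Pick the K where the WCSS curve bends most (largest second difference)."""
    ks = sorted(wcss_by_k)
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        bend = (wcss_by_k[prev_k] - wcss_by_k[k]) - (wcss_by_k[k] - wcss_by_k[next_k])
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Invented toy data: three well-separated groups of three points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
          (9.0, 1.0), (9.1, 1.2), (8.9, 0.8)]
wcss_by_k = {k: kmeans(points, k)[2] for k in range(1, 7)}
best_k = elbow_k(wcss_by_k)
centroids, clusters, _ = kmeans(points, best_k)
print(best_k, sorted(len(cl) for cl in clusters))
```

In practice the elbow is often read off a plot of WCSS against K; the second-difference rule used here is only one simple way to automate that reading.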
Drug-drug interaction networks were built on the K clusters to determine the strong associations between drug compounds within each cluster with the assistance of Gephi 0.9.1. It is a principal tool for the visualization and analysis of large networks designed by Andrew (2012). Each network consists of a set of vertices V and edges E and is characterized by the average path length L, the degree D of the nodes, the network density, and the modularity classes. Modules are particular parts of a network where the density of edges is substantially higher than
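\[ l = \frac{1}{n(n-1)} \sum_{i \neq j} d(v_i, v_j) \]
where n is the number of vertices in the graph and d(v_i, v_j) is the shortest-path distance between vertices v_i and v_j. The average clustering coefficient was calculated by the formula
\[ C_i = \frac{\left|\{ e_{jk} : v_j, v_k \in N_i,\ e_{jk} \in E \}\right|}{k_i (k_i - 1)} \]
where E is the set of edges, v_j and v_k are vertices, N_i is the set of neighbors of vertex v_i, and k_i is the number of neighbor vertices; the coefficient is averaged over all vertices. The strong interactions among the drug compounds in each cluster were observed.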
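The two statistics can be sketched on a small graph. The sketch assumes an undirected, connected network stored as an adjacency dictionary, with the average path length taken as the mean shortest-path distance over all ordered pairs of distinct vertices, and the local clustering coefficient as the fraction of ordered neighbor pairs that are themselves linked. The four-node graph and its labels are invented.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path (hop) distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_path_length(adj):
    """Mean distance over all ordered pairs of distinct vertices (connected graph assumed)."""
    n = len(adj)
    total = sum(d for s in adj for d in bfs_distances(adj, s).values())
    return total / (n * (n - 1))


def local_clustering(adj, i):
    """Fraction of ordered neighbour pairs of i that are connected to each other."""
    neighbours = adj[i]
    k = len(neighbours)
    if k < 2:
        return 0.0
    linked = sum(1 for j in neighbours for m in neighbours if j != m and m in adj[j])
    return linked / (k * (k - 1))


def average_clustering(adj):
    return sum(local_clustering(adj, i) for i in adj) / len(adj)


# Invented example: a triangle A-B-C with a fourth node D attached to C.
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(average_path_length(adj), average_clustering(adj))
```

Counting ordered pairs in an undirected graph counts every neighbor-to-neighbor edge twice, so the denominator k(k − 1) gives the same value as the more familiar 2e / (k(k − 1)).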
The modularity was calculated by the formula
\[ M = \sum \left[ \frac{i}{E} - \left( \frac{d}{2E} \right)^{2} \right] \]
where E is the number of edges in the network, i represents the number of strongly connected edges, and d is the degree of a node. The drug compounds, which demonstrated
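A minimal sketch of a modularity calculation follows. It assumes the standard Newman reading of the formula, in which the sum runs over modules, i is the number of edges with both ends inside a module, and d is the summed degree of that module's nodes; the text does not spell this out. The graph (two triangles joined by one edge) and the module labels are invented.

```python
def modularity(edges, community_of):
    """Sum over modules of (inside_edges / E) - (module_degree / 2E)^2."""
    m = len(edges)
    inside = {}   # edges with both ends in the same module
    degree = {}   # summed degree of the nodes in each module
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    return sum(inside.get(c, 0) / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Invented example: two triangles joined by the single edge C-D.
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"),
         ("C", "D")]
two_modules = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
one_module = {node: 0 for node in two_modules}
print(modularity(edges, two_modules), modularity(edges, one_module))
```

Placing every node in a single module gives a modularity of zero, which is why a clearly positive value indicates that edges are concentrated inside the modules.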
The strongly interacting drugs obtained through the DDI network were further analyzed for their toxicity level; the drugs in lethal toxicity classes were removed, and only those drugs
suggested to be used in combination with each other to achieve maximal results during the treatment. DDIs reflect the interference between several behaviors of the drugs. To understand potential repositioning, the strategy explained in this paper depends on the behavioral interactions between drugs rather than the structural similarities characterized by chemical-structure interactions or drug-target interactions. Therefore, to reduce the complexity of drug repurposing, it is suggested that this interaction-based perspective can be integrated with the complementary structural perspective (Bastian et al., 2009).
Results
Cluster Generation
The drug property data obtained from the Zinc database were filtered based on the Lipinski rule of five. Among the 2062 chemical compounds, only 1052 fulfilled the Lipinski rule of five and were selected for further analysis. The K-means algorithm was used to group similar chemical compounds. Twelve clusters were identified by the Elbow method and built for the 1052 drug compounds. The clusters were built for all the properties fulfilling the Lipinski rule, as shown in Fig 1. Each cluster contains only those drug compounds that are similar to each other, based on their
Twelve distinct DDI networks were built on the clustered data to find strongly interacting drugs. In each DDI network, every node represents a distinct drug. A connection between two drugs corresponds to the interaction between them according to the similarity of their properties. No data on the structural properties of these drug compounds are used throughout the procedure. As each cluster contains similar drugs, the strong interactions among the drugs indicate that they are synergistic, because this type of interaction provides information about the functional profile of a drug cluster. When an interaction increases the effect of either drug molecule, the interaction is known as a synergistic effect. An "additive synergy" occurs when the final effect is equivalent to the sum of the effects of the two drugs. When the final effect is substantially greater than the sum of the two effects, this is called an enhanced synergistic effect (Qato et al., 2016). The DDIs are shown in Fig 2.
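One way to turn property similarity into network edges is sketched below. The text does not state how similarity was measured or thresholded, so everything here is an assumption: properties are min-max scaled, similarity is the Euclidean distance between scaled vectors, and the two cutoffs (`link_cutoff`, `strong_cutoff`) are hypothetical values. The three property vectors are read from Table 3 under the assumption that its columns are logP, HBD, HBA, and Mw.

```python
from itertools import combinations
from math import dist


def normalise(vectors):
    """Min-max scale every property to [0, 1] so that no single property dominates."""
    cols = list(zip(*vectors.values()))
    lo = [min(c) for c in cols]
    span = [(max(c) - min(c)) or 1.0 for c in cols]
    return {name: tuple((x - l) / s for x, l, s in zip(vec, lo, span))
            for name, vec in vectors.items()}


def build_network(vectors, link_cutoff, strong_cutoff):
    """Link two drugs when their scaled property vectors are close enough."""
    scaled = normalise(vectors)
    edges = []
    for a, b in combinations(sorted(scaled), 2):
        d = dist(scaled[a], scaled[b])
        if d <= link_cutoff:
            edges.append((a, b, "strong" if d <= strong_cutoff else "weak"))
    return edges


# Property vectors (logP, HBD, HBA, Mw), column order assumed from Table 3.
cluster = {
    "ZINC29128814": (3.19, 4, 9, 426),
    "ZINC28901061": (3.13, 4, 8, 446),
    "ZINC29198958": (1.53, 3, 7, 365),
}
# Hypothetical cutoffs, chosen only to make the example produce both edge types.
edges = build_network(cluster, link_cutoff=1.85, strong_cutoff=0.6)
for edge in edges:
    print(edge)
```

The resulting edge list, with its weak and strong labels, can be exported to a network tool for layout and further statistics.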
Network 1 consists of 338 nodes and 562 edges, and network 2 comprises 205 nodes and 305 edges. Similarly, 373 nodes and 569 edges were identified in network 3, 509 nodes and 809 edges in network 4, 404 nodes and 592 edges in network 5, 238 nodes and 342 edges in network 6, 424 nodes and 628 edges in network 7, 480 nodes and 799 edges in network 8, 549 nodes and 914 edges in network 9, 328 nodes and 453 edges in network 10, 318 nodes and 428 edges in network 11, and 470 nodes and 710 edges in network 12. It was observed that networks 9, 8, 4, and 12 contained the most interacting drugs and edges
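Node and edge counts are enough to derive two of the summary statistics reported for each network. The sketch below uses the textbook definitions of average degree and graph density; whether the networks were treated as directed or undirected is not stated in the text, so both variants are shown and neither is claimed to reproduce the tabulated values.

```python
def average_degree(nodes, edges, directed=False):
    """Average number of edge ends per node (each undirected edge has two ends)."""
    return edges / nodes if directed else 2 * edges / nodes


def density(nodes, edges, directed=False):
    """Share of possible connections that are present."""
    possible = nodes * (nodes - 1)
    return edges / possible if directed else 2 * edges / possible


# Node and edge counts reported for network 1.
n, e = 338, 562
print(round(average_degree(n, e), 3))
print(round(density(n, e), 4), round(density(n, e, directed=True), 4))
```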
Modularity classes were calculated for each DDI network. Distinct colors were assigned automatically to the nodes based on modularity. The network structure is usually characterized by a high modularity, both in the bipartite network and in its projections, showing that the topology is very different from that of a random network, and that it contains a rich and
properties and usefulness of chemical compounds (Qato et al., 2016). Additionally, modularity is directly connected to the links, which represent drug interactions. The average (Avg) degree, network density, modularity, Avg path length, and Avg graph clustering coefficients are shown in Table 1.
Network Centrality Analysis
Centrality analysis was performed to determine the strong interactions among the drug compounds; drugs that were most susceptible to DDIs correspond to the nodes with the largest
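As an illustration of a centrality ranking, the sketch below computes normalised degree centrality and returns the highest-ranked nodes. Degree centrality is an assumption, since the text does not name the centrality measure that was used; the edge list and node labels are invented.

```python
def degree_centrality(edges):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    n = len(degree)
    return {node: d / (n - 1) for node, d in degree.items()}


def top_nodes(centrality, count):
    """Nodes ordered by decreasing centrality, ties broken alphabetically."""
    return sorted(centrality, key=lambda node: (-centrality[node], node))[:count]


# Invented example network of four drugs.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
centrality = degree_centrality(edges)
print(top_nodes(centrality, 2))
```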
From each DDI network, drugs representing the strong interactions were selected, and a final drug-drug interaction network was built, based on the data consisting of strongly interacting drug
The final DDI network contains 290 nodes and 416 edges. Again, the same procedure was used to assign different colors to the nodes; the average degree, network density, modularity, average path length, and average graph clustering coefficients were calculated, and the drug compounds that demonstrated strong interactions were selected. Finally, only 15 drugs showed stronger interactions with each other. Values such as network density, clustering
The chemical properties of the strongly interacting drugs found in the final DDI network,
The 15 strongly interacting drugs were further analyzed for toxicity. Toxic doses are frequently reported as LD50 values in mg/kg of body weight. The LD50 is the median lethal dose, that is, the dose at which 50% of test subjects die upon exposure to a compound. The drugs with lower toxicity values are shown in Table 4.
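The mapping from an LD50 value to a toxicity class can be sketched as follows. The text does not name the classification scheme, so the sketch assumes the six-class, GHS-style acute oral toxicity scheme used by common in silico toxicity predictors, with upper bounds of 5, 50, 300, 2000, and 5000 mg/kg for classes 1 to 5 and class 6 above 5000 mg/kg. The example doses are the LD50 values listed in Table 4.

```python
def toxicity_class(ld50_mg_per_kg):
    """Map an oral LD50 (mg/kg body weight) to a class from 1 (most toxic) to 6."""
    # (upper bound in mg/kg, class); thresholds are an assumed GHS-style scheme.
    bounds = [(5, 1), (50, 2), (300, 3), (2000, 4), (5000, 5)]
    for upper, cls in bounds:
        if ld50_mg_per_kg <= upper:
            return cls
    return 6


# LD50 values that appear in Table 4.
for ld50 in (1000, 1600, 3160):
    print(ld50, "mg/kg -> class", toxicity_class(ld50))
```

Under this assumed scheme, a higher LD50 means that more compound is needed to reach the lethal dose, so higher class numbers correspond to lower acute toxicity.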
Only 11 drugs fell in toxicity classes 4 and 5, both of which are regarded as non-toxic classes; four drugs were removed due to higher toxicity values. Based on these results, these 11 strongly interacting drugs are suggested for use in the treatment of EGFR mutations, and can be given to patients in
This information presents the case for recovering various drug properties, for reproducing some known drug repositionings, and the arrangements of conceivable drug-like properties based on the Lipinski rule can be utilized as a part of drug repurposing or in new
pharmacological issues as well as known repositioning, using only information about DDIs
Discussions
The clustering-based DDI network strategy introduced in this research work portrays drug interaction potential for the treatment of certain diseases. Moreover, a few parameters such as modularity, path length, and clustering coefficient can distinguish those drugs that have the highest potential for DDIs. By using twofold clustering techniques for DDI networks and drug repositioning, one can filter out useful drug molecules from the jumble of drug compounds. There is a high likelihood that the
pharmacological properties by this twofold clustering strategy can be affirmed; these 15 drug compounds can be further examined with in vivo and in vitro methods to confirm the possible repositioning or new interactions. The DDIs usually reflect the interference of drug behaviors. The approach explained in this paper depends on the behavioral interactions between drugs instead of the basic structural or drug-target associations. It is proposed by Nacher and Schwartz (2012) that this behavioral point of view can be integrated with the structural perspective for better outcomes.

Nowadays, many network-based methods are being developed; these methods have likewise been used to identify drug candidates suitable for repositioning (Newman, 2003). For example, Chiang and Butte used a drug-drug similarity mechanism to identify drug repositioning candidates (Zahoránszky et al., 2016), whereas others used either drug-drug similarities (Wu et al., 2013; Chiang and Butte, 2011) or both drug-drug and disease-disease similarity approaches (Yang and Agarwal, 2011; Cheng et al., 2012; Yutaka et al., 2013). Nonetheless, the vast majority of these methodologies are either drug driven, disease driven, or symptom driven. In this research work, a DDI approach based on the clustering strategy is used to identify the best chemical
poor prognosis. Hence, it is critical to determine the proper clinical use of such medicines. Several anti-cancer agents are being used. Various combinations of treatment regimens have turned out to be effective (Assaf et al., 2011) and are widely used as initial treatments for cancers (Keiser et al., 2009). However, at present, the effect of these treatments on improving patient survival remains far from satisfactory (Bonomi et al., 2000; Schiller et al., 2002; Shepherd et al., 2013). Therefore, the method of identifying effective drugs by using DDIs based on clustering methods will prove fruitful and will
Conclusion
This approach to predicting novel drugs by representing drug-drug interactions and using the clustering methodology not only prompted the proposal of drug repositioning but, hopefully, also permits additional insight into them. The predictions produced for strongly interacting drugs show that this approach can effectively identify new indications and more effective drugs to treat EGFR mutations. Twelve clusters were formed for the 1052 drug compounds. DDIs were developed for each cluster. Only 15 drugs are suggested to be more effective based on strong interactions in a DDI network and can be given in combination to achieve maximal benefits. However, this is an in silico study. Further examinations can be performed in vivo and in vitro to confirm the efficacy of these suggested drug compounds.
Acknowledgments
The authors are grateful to Bioinformatics International Research Club, Abbottabad, for
Conflict of Interests: None of the authors has any conflict of interest.
References
Anderson, A. C. (2003). The process of structure-based drug design. Chemistry & Biology,
Andrew Ng (2012). Clustering with the K-Means Algorithm, Machine Learning
Assaf G, Gideon YS, Eytan R, Roded S (2011). PREDICT: a method for inferring novel drug indications with application to personalized medicine. Molecular Systems Biology, Vol. 7, No. 1.
Bastian, M., Heymann, S., Jacomy, M. et al (2009). Gephi: an open source software for
(2000). Comparison of survival and quality of life in advanced non-small-cell lung cancer patients treated with two dose levels of paclitaxel combined with cisplatin versus etoposide with cisplatin: Results of an Eastern Cooperative Oncology Group trial. J Clin Oncol, Vol.
Bouckaert, R. R., Frank, E., Hall, M. A., Holmes, G., Pfahringer, B., Reutemann, P., &
leads for novel drug uses. Clinical pharmacology and therapeutics, Vol. 86, No. 5, pp. 507-510.
Egan, W. J., Zlokarnik, G. & Grootenhuis, P. D. (2004). In silico prediction of drug safety: despite progress there is abundant room for improvement. Drug Discovery Today: Technologies, Vol. 1, pp. 381–387
Gazdar, A. F. (2009). Activating and resistance mutations of EGFR in non-small-cell lung cancer: role in clinical response to EGFR tyrosine kinase inhibitors. Oncogene, Vol. 28, pp. S24–S31. https://doi.org/10.1038/onc.2009.198
Gemma, A., Li, C., Sugiyama, Y., Matsuda, K., Seike, Y., Kosaihira, S., … Kudoh, S. (2006). Anticancer drug clustering in lung cancer based on gene expression profiles and
Grandis JR, Sok JC. (2004). Signaling through the epidermal growth factor receptor during the development of malignancy. Pharmacol Ther. Vol. 102, pp. 37–46. [PubMed: 15056497]
J. B. MacQueen (1967) "Some Methods for classification and Analysis of Multivariate
and predict clinical drug-drug interactions, Pharmaceutical research Vol. 29, pp. 411–426
Kaelin, W. (1999). Choosing anticancer drug targets in the post-genomic era. J. Clin. Invest. Vol. 104, pp. 1503–1506.
Keiser MJ, Setola V, Irwin JJ, Laggner C, Abbas AI, Hufeisen SJ, Jensen NH, Kuijer MB, Matos RC, Tran TB, Whaley R, Glennon RA, Hert J, Thomas KL, Edwards DD, Shoichet BK, Roth BL (2009). Predicting new molecular targets for known drugs. Nature, Vol. 462, No. 7270, pp. 175-181.
Lin, Q., Rana, P., Yingnian, L., Rajesh, A., Gail, H., Alex, F., & Michael, G. (2003).
Lewis, L. (2010). Drug-drug interactions: is there an optimal way to study them? British
Mountain, V. (2003). Astex, Structural Genomix, and Syrrx. Chem. Biol. Vol. 10, pp. 95–98.
Munir, A., Azam, S., Fazal, S., Khan, Z., & Mehmood, A. (2016). In silico Repositioning of Alendronate and Cytarabine Drugs to Cure Mutations of FPPS, HAP, PTPRS, PTPRE, PTN4, GGPPS Gene and Mutant DNA, DPOLB, TOP2a, DPOLA, DNMT, RNA, TYSY,
Nacher, J. C., & Schwartz, J.-M. (2012). Modularity in Protein Complex and Drug Interactions Reveals New Polypharmacological Properties. PLoS ONE, Vol. 7, No. 1, e30028. https://doi.org/10.1371/journal.pone.0030028
Newman, M. E. (2003). The structure and function of complex networks. SIAM review Vol. 45, pp. 167–256
Nugent, T., Plachouras, V. & Leidner, J. L. (2016). Computational drug repositioning based on side-effects mined from social media. PeerJ Computer Science 2, e46
criteria-based assessment. British journal of clinical pharmacology Vol. 71, pp. 727–736
Qato DM, Wilder J, Schumm LP, Gillet V, Alexander GC (2016). "Changes in Prescription and Over-the-Counter Medication and Dietary Supplement Use Among Older Adults in the United States, 2005 vs 2011". JAMA Internal Medicine. Vol. 176, No. 4, pp. 473–82. doi:10.1001/jamainternmed.2015.8581
Rester U (2008). "From virtuality to reality - Virtual screening in lead discovery and lead
Rollinger JM, Stuppner H, Langer T (2008). "Virtual screening for the discovery of bioactive natural products". Progress in Drug Research. Vol. 65, No. 211, pp. 213–49.
Schiller JH, Harrington D, Belani CP, Langer C, Sandler A, Krook J, Zhu J, Johnson DH
cancer. N Engl J Med, pp. 346:92-98.
Seshacharyulu, P., Ponnusamy, M. P., Haridas, D., Jain, M., Ganti, A. K., & Batra, S. K. (2012). Targeting the EGFR signaling pathway in cancer therapy. Expert Opinion on
https://doi.org/10.1517/14728222.2011.64861
trial of docetaxel versus best supportive care in patients with non-small-cell lung cancer previously treated with platinum-based chemotherapy. J Clin Oncol, Vol. 2000, No. 18, pp. 2095-2103.
prediction of drug effects and interactions. Science translational medicine Vol. 4, pp. 125ra31–125ra31
Udrescu, L., Sbârcea, L., Topîrceanu, A., Iovanovici, A., Kurunczi, L., Bogdan, P., & Udrescu, M. (2016). Clustering drug-drug interaction networks with energy model layouts: community analysis and drug repurposing. Scientific Reports, Vol. 6, No. 1. https://doi.org/10.1038/srep32745
Wells A. (1999). EGF receptor. Int J Biochem Cell Biol. Vol. 31, pp. 637–43. [PubMed: 10404636]
Wu, C., Gudivada, R. C., Aronow, B. J., & Jegga, A. G. (2013). Computational drug repositioning through heterogeneous network clustering. BMC Systems Biology, Vol. 7, No. 5, S6.
Yutaka Fukuoka DT, Hisamichi Ogawa. (2013). A two-step drug repositioning method based on a protein-protein interaction network of genes shared by two diseases and the
Figure 1: The number of clusters formed by using every property of the 1052 drug compounds
Figure 2: Drug-drug interaction networks built for each cluster. Small spheres represent drugs; interactions between the drug compounds are represented by dotted lines, whereas the strong interactions are represented by solid lines. The strong interactions are represented by solid lines of
Figure 3: The Drug-drug interaction network generated on strongly interacted drugs data. The
stronger interactions are shown by dark green color edges, which were selected for further analysis.
Table 1: Degrees, densities, path lengths, and modularity values predicted for each drug-drug interaction network
Network     3.213  5  0.005  0.587  0.021  0.9821
Network 2   2.967  2  0.007  0.627  0.01   1.036
Network 12  3.021  2  0.003  0.595  0.002  1.095
Table 2: Degrees, densities, path lengths, and modularity values predicted for the final drug-drug interaction network
Average Network Graph Modularity Clustering Average
length
Network
Table 3: The properties of the drug compounds fulfilling the Lipinski rule of five
acceptors
ZINC29128814  3.19  4  9  426  9
ZINC28901061  3.13  4  8  446  9
ZINC34800033  3.47  4  8  461  9
ZINC29128958  3.54  4  9  493  9
ZINC26983604  1.56  5  9  397  9
ZINC29198958  1.53  3  7  365  7
LD50 class
1
4
2 ZINC29128766 1000 mg/kg Toxicity class
3 ZINC29043935 1000 mg/kg Toxicity class
4 ZINC28903156 1000 mg/kg Toxicity class
4
5
4
7 ZINC28901061 1600 mg/kg Toxicity class
8 ZINC34642910 3160 mg/kg Toxicity class
9 ZINC64564748 1000 mg/kg Toxicity class
4
4
5