
Hindawi Publishing Corporation

Advances in Bioinformatics
Volume 2014, Article ID 324753, 18 pages
http://dx.doi.org/10.1155/2014/324753

Research Article
Alternate Phosphorylation/O-GlcNAc Modification
on Human Insulin IRSs: A Road towards Impaired Insulin
Signaling in Alzheimer and Diabetes

Zainab Jahangir,1 Waqar Ahmad,2 and Khadija Shabbiri1,2


1 Department of Chemistry, GC University Lahore, Lahore, Pakistan
2 School of Biological Sciences, Goddard Building, The University of Queensland, Brisbane, QLD 4072, Australia

Correspondence should be addressed to Khadija Shabbiri; k shabbiri@yahoo.com

Received 24 August 2014; Accepted 10 November 2014; Published 17 December 2014

Academic Editor: Jian-Liang Li

Copyright © 2014 Zainab Jahangir et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Impaired insulin signaling is thought to be an important step in both Alzheimer’s disease (AD) and type 2 diabetes mellitus
(T2DM). Posttranslational modifications (PTMs) regulate the functions and interaction of insulin with insulin receptor substrates
(IRSs) and activate downstream insulin signaling pathways via autophosphorylation on several tyrosine (Tyr) residues on IRSs.
Two important insulin receptor substrates, IRS-1 and IRS-2, are widely expressed in humans, and alternative phosphorylation on
their serine (Ser) and threonine (Thr) residues is known to block the Tyr phosphorylation of IRSs, thus inhibiting insulin
signaling and promoting insulin resistance. Like phosphorylation, O-glycosylation is an important PTM and inhibits phosphorylation
on the same or neighboring Ser/Thr residues, often called Yin Yang sites. Both IRS-1 and IRS-2 have been shown to be O-glycosylated;
however, the exact sites have not yet been determined. In this study, by using neural network-based prediction methods, we found
more than 50 Ser/Thr residues that have the potential to be O-glycosylated and may act as possible sites as well. Moreover,
alternative phosphorylation and O-glycosylation on IRS-1 Ser-312, 984, 1037, and 1101 may act as possible therapeutic targets to
minimize the risk of AD and T2DM.

1. Introduction

Alzheimer’s disease (AD) is a neurodegenerative disease associated with a progressive loss in memory and other
mental processes, resulting in dementia. In the AD brain, some changes occur in tertiary structures of
microtubule-associated protein tau and β-amyloid (Aβ) peptide, resulting in self-association and deposition.
Hyperphosphorylated forms of tau act as major constituents of neurofibrillary tangles (NFTs), while Aβ is the
major constituent of plaques [1]. By 2050, there is expected to be a million new cases of AD per year, or one
new case every 33 seconds, and AD prevalence is projected to be 11 to 16 million [2]. On the other hand,
diabetes mellitus (DM) is characterized by hyperglycemia. Type 1 diabetes is insulin-dependent diabetes
mellitus caused by absolute deficiency of insulin secretion, while type 2 diabetes (T2DM) is
non-insulin-dependent diabetes mellitus caused by both inadequate insulin secretion and insulin resistance
of target organs [3]. In DM, numerous peripheral organs, such as kidneys, eyes, and peripheral nerves, undergo
pathological changes. DM also affects the central nervous system (CNS). Specifically, both types of diabetes
are associated with impairment of learning abilities and memory deficits [3]. Hoyer in 1998 presented a
hypothesis stating that the brain may be insulin resistant in AD and that AD can be called a brain-specific
“type 3 diabetes.” Both AD and T2DM are degenerative diseases characterized by neuronal loss and β-cell
destruction, respectively. Impaired insulin signaling resulting in neurodegeneration and cognitive damage is
the main systematic link between these two diseases. Many studies found that DM almost doubles the risk of AD.
It was also found that diabetes is not only a risk factor for AD but also accelerates the onset of AD. Recent
studies also elaborate the involvement of impaired insulin

signaling in tau hyperphosphorylation. AD and DM also share several enzymes (DOPA decarboxylase, glutamic acid
decarboxylase) and growth factor receptors (thyrotrophin-releasing hormone, neuronal growth factor receptors,
and p75 receptors) [4–9].

Insulin receptor substrate proteins (IRSs) act as interaction platforms that are phosphorylated on multiple
tyrosine residues and attract different signaling proteins to spread the stimulus. Two isoforms of IRSs, that
is, IRS-1 and IRS-2, are most abundant in mammals and have many tyrosine phosphorylation sites. In general,
IRS-1 is important for growth, while proper function of pancreatic β-cells and glucose homeostasis are the main
functions of IRS-2 [10, 11]. Deletion of either the IRS-1 or the IRS-2 gene in mice leads to insulin
resistance, whereas only IRS-2 knockout results in diabetes [12]. Paradoxically, heterozygous and
brain-specific deletion of IRS-1 and IRS-2 in mice increases life span [13, 14]. However, IRS-2 deletion
promotes tau phosphorylation and the accumulation of cytoplasmic deposits of phosphorylated tau [15]. Loss of
IRS-2 in neurons of AD patients is particularly severe. Decreased levels of IRS-1 and IRS-2 in AD neurons
increase NFT pathology. Reduced IRS-1 and IRS-2 levels indicate ineffective signaling of IR and IGF-1R in AD
[16]. Apart from their important role in metabolic signaling, IRSs also spread proliferative and antiapoptotic
signals and as a result become overexpressed in most cancers [17].

In biological systems, posttranslational modifications (PTMs) determine protein activity, localization, and
interaction with other proteins [18]. Multiple kinases activated by many triggers of insulin resistance, like
reactive oxygen species [19], inflammatory cytokines [20], or excess lipids [19, 20], phosphorylate IRSs on
several Ser/Thr residues [20]. This phosphorylation on various Ser/Thr residues is thought to inhibit insulin
signaling and induce insulin resistance by blocking Tyr phosphorylation of IRSs. Increased levels of
inactivated phosphoserine-312-IRS-1 and phosphoserine-616-IRS-1 associated with decreased levels of IRS-1 and
IRS-2 were identified in AD neurons. These increased levels of inactivated phosphoserine are strongly
colocalized with NFTs [16]. Phosphorylation of a Ser or Thr residue results in IRS-1 or IRS-2 inactivation
[21]. After Ser/Thr phosphorylation, IRS-1 becomes a weak substrate for the insulin receptor tyrosine kinase
[22–24]. Phosphoserine-312-IRS-1 causes IRS-1 inactivation by disrupting IRS-1 binding to the insulin receptor
(IR), whereas phosphoserine-616-IRS-1 prevents insulin activation of PI3-kinase [21, 23, 25]. Mechanistically,
many kinases that are overexpressed in AD phosphorylate IRS-1, resulting in IRS-1 inactivation and IR/IGF-1R
resistance. Phosphorylation and O-glycosylation are also involved in affinity changes of IGFBP-6 and cause
inhibition of IGF-II action on target cells [26, 27]. O-glycosylation of IGFBP-6 leads to 10-fold higher
affinity for IGF-II by preventing IGFBP-6 binding to glycosaminoglycans and the cell membrane [28].

Analogous to phosphorylation, O-glycosylation is one of the important PTMs, during which an
N-acetylglucosamine (O-GlcNAc) molecule is introduced on a Thr or Ser residue by O-GlcNAc transferase (OGT).
Phosphorylation on a Thr or Ser residue can be inhibited by the addition of O-β-GlcNAc. Interplay between
these two PTMs on the same residues, also known as yin yang sites, has been reported in several nuclear and
cytoplasmic proteins [29]. These PTMs regulate many functions of the proteins and are responsible for
temporary conformational changes [30]. Various studies have shown that O-glycosylation on IRSs is reduced in
AD and T2DM due to impaired insulin signaling and glucose metabolism [31–34]. Although some Ser/Thr sites have
been mapped as O-glycosylated on the C-terminal of IRSs, their exact location is still undetermined. This work
predicts potential phosphorylation, O-β-GlcNAc, and yin yang sites on IRS-1 and IRS-2 proteins, which are
involved in metabolic functions in the brain, by using various bioinformatics tools. Impaired phosphorylation
and O-glycosylation on IRS-1 and IRS-2 lead to defective insulin signaling that results in Alzheimer- and
diabetes-like complications. The sites where this type of crosstalk occurs are useful targets for designing
new drugs for the resulting diseases.

2. Materials and Methods

2.1. Multiple Alignment and Sequences Retrieval. The FASTA sequences of human IRS-1 and IRS-2 were taken from
the UniProt sequence database (http://www.uniprot.org/uniprot/?query=irs&sort=score) with entry name
IRS1_HUMAN and accession number P35568 and entry name IRS2_HUMAN with accession number Q9Y4H2, respectively.
A BLAST search with these two sequences was run using the UniProt BLAST service. A total of 37 hits for IRS-1
and 24 hits for IRS-2 were obtained with an E-value of zero. Sequences based on uncharacterized, putative,
hypothetical, predicted, and unnamed proteins were excluded. For the multiple alignments of IRS-1 proteins,
eight sequences were selected from the 37 retrieved sequences. The accession numbers of these selected
sequences are given in Table 1. For the alignment of IRS-2 proteins, three sequences were selected from the 24
retrieved sequences. The accession numbers of these three selected sequences are given in Table 2. ClustalW
(http://www.ebi.ac.uk/Tools/msa/clustalw2/) was used for the multiple alignments of all the sequences of IRS-1
and IRS-2 to get the conservation status of serine, threonine, and tyrosine residues.

2.2. Protein Structure Prediction and Analysis. As there were no complete 3D structures of human IRS-1 and
IRS-2 available in the Protein Data Bank, ab initio models were designed by using the software MODELLER
(ModWeb: https://modbase.compbio.ucsf.edu/scgi/modweb.cgi) [35], I-TASSER
(http://zhanglab.ccmb.med.umich.edu/I-TASSER/) [36], and Swiss Model (http://swissmodel.expasy.org/) [37].

On the basis of Ramachandran plots, the best ab initio models were selected. Ramachandran plots were taken
from the MolProbity (http://molprobity.biochem.duke.edu/) server [38]. The FATCAT web server
(http://fatcat.burnham.org/fatcat-cgi/cgi/fatcat.pl?-func=pairwise) was used to perform pairwise alignment of
protein 3D structures. Chimera (http://www.cgl.ucsf.edu/chimera/) [39] and Jmol (http://jmol.sourceforge.net/)
software were used to view and analyze the 3D structures.
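
The model selection step in Section 2.2 can be sketched in Python as follows. This is only an illustration: the text states that the best models were chosen on the basis of Ramachandran plots from MolProbity but gives no ranking rule, so the rule used here (highest percentage of residues in favored regions, ties broken by fewer outliers), the `ModelQuality` record, and all numbers are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelQuality:
    """Ramachandran summary for one candidate model (hypothetical record)."""
    name: str            # server that produced the model
    favored_pct: float   # % of residues in Ramachandran-favored regions
    outlier_pct: float   # % of residues flagged as Ramachandran outliers


def pick_best_model(models):
    """Return the model with the most favored residues; ties go to fewer outliers."""
    if not models:
        raise ValueError("no models to rank")
    return max(models, key=lambda m: (m.favored_pct, -m.outlier_pct))


# Hypothetical values for illustration only; these are not results from the paper.
candidates = [
    ModelQuality("ModWeb", 82.0, 6.5),
    ModelQuality("I-TASSER", 88.5, 3.1),
    ModelQuality("SWISS-MODEL", 88.5, 4.0),
]
best = pick_best_model(candidates)
print(best.name)
```

With these made-up numbers the two models tied on favored residues are separated by their outlier percentage. Other criteria, such as rotamer outliers or clash scores, could be added to the sort key in the same way.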

Table 1: Different IRS-1 proteins used for multiple alignment.

Species name                               Accession number   Identity   Score   E-value
Human (Homo sapiens)                       P35568             100%       6,593   0.0
Green monkey (Chlorocebus aethiops)        Q28224             96%        6,392   0.0
Pig (Sus scrofa)                           Q68H99             92%        6,016   0.0
Chicken (Gallus gallus)                    P79773             90%        5,854   0.0
Chinese hamster (Cricetulus griseus)       G3IDC4             89%        5,804   0.0
Mouse (Mus musculus)                       Q543V3             89%        5,786   0.0
Rat (Rattus norvegicus)                    P35570             89%        5,783   0.0
Western clawed frog (Xenopus tropicalis)   F6XDD6             59%        3,161   0.0

Table 2: Different IRS-2 proteins used for multiple alignment.

Species name                               Accession number   Identity   Score   E-value
Human (Homo sapiens)                       Q9Y4H2             100%       7,108   0.0
Rat (Rattus norvegicus)                    F1MAL5             85%        5,908   0.0
Mouse (Mus musculus)                       P81122             85%        5,891   0.0
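
The conservation status that Section 2.1 derives from the ClustalW alignments can be reproduced with a short Python sketch. It assigns each alignment column the usual Clustal consensus symbol ("*" identical, ":" strong group, "." weak group) and reports the status of Ser/Thr/Tyr residues in human numbering. The residue groups are the standard Clustal strong and weak groups. The function names and the 11-column excerpt, taken from the start of the second block of Figure 1 (human residues 48 to 58), are illustrative choices and not part of the original workflow.

```python
# Standard Clustal residue groups used for the consensus line.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]
LABELS = {"*": "conserved", ":": "conserved substitution",
          ".": "semiconserved", " ": "not conserved"}


def column_symbol(column):
    """Return the Clustal consensus symbol for one alignment column."""
    residues = set(column)
    if "-" in residues:
        return " "  # any gap means the column is not scored as conserved
    if len(residues) == 1:
        return "*"
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."
    return " "


def residue_conservation(aligned, reference, start, residues="STY"):
    """Map reference positions of Ser/Thr/Tyr residues to a conservation label.

    `start` is the reference-sequence number of the first non-gap residue.
    """
    position = start - 1
    result = {}
    for i, ref_res in enumerate(aligned[reference]):
        if ref_res == "-":
            continue  # gaps do not advance the reference numbering
        position += 1
        if ref_res in residues:
            column = [seq[i] for seq in aligned.values()]
            result[position] = LABELS[column_symbol(column)]
    return result


# Excerpt from Figure 1: columns covering human IRS-1 residues 48-58.
block = {
    "IRS1_HUMAN": "ENEKKWRHKSS",
    "IRS1_CHLAE": "ENEKKWRHKSS",
    "IRS1_CHICK": "ENEKKWWHKSS",
    "IRS1_MOUSE": "ENEKKWRHKSS",
    "IRS1_RAT":   "ENEKKWRHKSS",
    "IRS1_CRIGR": "ENEKKWRHKSS",
    "IRS1_PIG":   "ENEKKWRHKSS",
    "IRS1_XENTR": "ENEKKWRHKSG",
}
width = len(block["IRS1_HUMAN"])
consensus = "".join(column_symbol([seq[i] for seq in block.values()]) for i in range(width))
status = residue_conservation(block, "IRS1_HUMAN", start=48)
print(consensus)
print(status)
```

For this excerpt the sketch labels Ser-57 as conserved and Ser-58 as semiconserved, which agrees with the residue lists in the caption of Figure 1.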

2.3. Prediction Methods for PTMs. We used three distinct algorithms: support vector machine (SVM), artificial
neural networks (ANNs), and hierarchical searches [40] to reduce the false negative and false positive sites
when predicting PTM sites on human IRS-1 and IRS-2. The NetPhos server generates neural network predictions for
Ser/Thr/Tyr phosphorylation sites in eukaryotic proteins. NetPhosK 1.0 produces kinase-specific
phosphorylation site predictions. KinasePhos is a new version of a kinase-specific phosphorylation site
prediction server. Phospho.ELM is a database of experimentally confirmed phosphorylation sites in eukaryotic
proteins. The YinOYang server predicts O-β-GlcNAc attachment sites in eukaryotic nuclear and intracellular
protein sequences. These bioinformatics tools have been used efficiently in many studies [27, 41, 42].

2.3.1. Phosphorylation Residues and Related Kinases Prediction. Neural network-based programs NetPhos 2.0
(http://www.cbs.dtu.dk/services/NetPhos/) [43] and Scansite (http://scansite.mit.edu/motifscan_id.phtml) [44]
were used to predict phosphorylation potential for serine, threonine, and tyrosine residues of human IRS-1 and
IRS-2. A minimum threshold value of 0.5 was used for NetPhos 2.0 predictions.

Prediction of kinase-specific phosphorylation sites in human IRS-1 and IRS-2 was made by Scansite
(http://scansite.mit.edu/motifscan_id.phtml) [44], NetPhosK 1.0 (http://cbs.dtu.dk/services/NetPhosK/) [45],
and KinasePhos 2.0 (http://kinasephos2.mbc.nctu.edu.tw/) [46]. Kinase-specific acceptor substrates including
Thr, Ser, and Tyr are predicted by these programs.

Phospho.ELM (http://phospho.elm.eu.org) [47] and UniProt (http://www.uniprot.org/uniprot/) databases were used
to obtain the data regarding experimentally verified phosphorylation sites on human IRS-1 and IRS-2.

2.3.2. O-Glycosylated Residues and Yin Yang Sites Prediction. YinOYang 1.2
(http://www.cbs.dtu.dk/services/YinOYang/) database [48] was used to predict potential O-β-GlcNAc modification
sites. Because the YinOYang 1.2 program can also predict potential phosphorylation sites, it predicts yin yang
sites, using a variable threshold that is adjusted in accordance with amino acid surface accessibility.
Modification potential and conservation status of Thr and Ser residues were used to determine the false
negative (FN) yin yang sites.

The surface accessibility of predicted Ser, Thr, and Tyr residues for posttranslational modifications was
assessed by using NetSurfP (http://www.cbs.dtu.dk/services/NetSurfP/) [49]. Scansite
(http://scansite.mit.edu/motifscan_id.phtml) [50] was also used to check surface accessibility of human IRS-1
and IRS-2 to solvents and for PTMs.

3. Results

3.1. Alignment of Sequences to Determine the Conservation of Ser/Thr/Tyr Residues within IRS-1. To determine
the evolutionary conservation of Ser/Thr/Tyr residues, IRS-1 was aligned with orthologous sequences of other
species including green monkey, pig, chicken, Chinese hamster, mouse, rat, and western clawed frog (Figure 1).
Out of 182 Ser residues, 99 (54.3%) were conserved, 12 (6.5%) were conserved substitutions, and 29 (15.9%) were
semiconserved. Of the 60 Thr residues, 29 (48.3%) were conserved, 9 (15%) were conserved substitutions, and 5
(8.3%) were semiconserved. Similarly, of the 32 Tyr residues, 25 (78.1%) were conserved and 1 (3.1%) was a
conserved substitution. These data showed that human IRS-1 has high conservation status.

3.2. Alignment of Sequences to Determine the Conservation of Ser/Thr/Tyr Residues within IRS-2. IRS-2 was
aligned with the only known orthologous sequences, from rat and mouse (Figure 2). Like IRS-1, human IRS-2
showed high conservation among different species. Out of 177 Ser residues,

IRS1 HUMAN MASPPE---SDGFSDVRKVGYLRKPKSMHKRFFVLRAASEAGGPARLEYY 47
IRS1 CHLAE MASPPE---SDGFSDVRKVGYLRKPKSMHKRFFVLRAASETGDPARLEYY 47
IRS1 CHICK MASPPE---SDGFSDVRKVGYLRKPKSMHKRFFVLRAASEAGAPG-VEYY 46
IRS1 MOUSE MASPPD---TDGFSDVRKVGYLRKPKSMHKRFFVLRAASEAGGPARLEYY 47
IRS1 RAT MASPPD---TDGFSDVRKVGYLRKPKSMHKRFFVLRAASEAGGPARLEYY 47
IRS1 CRIGR MASPPD---TDGFSDVRKVGYLRKPKSMHKRFFVLRAASEAGGPARLEYY 47
IRS1 PIG -------------SDVRKVGYLRKPKSMHKRFFVLRAASEAGGPARLEYY 37
IRS1 XENTR MASPTDPQAQENFSDVRKVGYLRKPKSMHKRFFVLRSASES-GLARLEYY 49
∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗ : ∗∗∗ : . : ∗∗∗

IRS1 HUMAN ENEKKWRHKSSAPKRSIPLESCFNINKRADSKNKHLVALYTRDEHFAIAA 97
IRS1 CHLAE ENEKKWRHKSSAPKRSIPLESCFNINKRADSKNKHLVALYTRDEHFAIAA 97
IRS1 CHICK ENEKKWWHKSSAPKRSIPLESCFNINKQVGDKNKHLVALYTRDEHFAIAA 96
IRS1 MOUSE ENEKKWRHKSSAPKRSIPLESCFNINKRADSKNKHLVALYTRDEHFAIAA 97
IRS1 RAT ENEKKWRHKSSAPKRSIPLESCFNINKRADSKNKHLVALYTRDEHFAIAA 97
IRS1 CRIGR ENEKKWRHKSSAPKRSIPLESCFNINKRADSKNKHLVALYTRDEHFAIAA 97
IRS1 PIG ENEKKWRHKSSAPKRSIPLESCFNINKRADSKNKHLVALYTRDEHFAIAA 87
IRS1 XENTR ENEKKWRHKSGAPKRSIPLESCFNINKRADSKNKHLVALYTKEECFAIAA 99
∗∗∗∗∗∗ ∗∗∗ . ∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗ : . . . ∗∗∗∗∗∗∗∗∗∗ : : ∗ ∗∗∗∗∗
IRS1 HUMAN DSEAEQDSWYQALLQLHNRAKGHHDGAAALGAGGGGGSCSGSSGLGEAGE 147
IRS1 CHLAE DSEAEQDSWYQALLQLHNRAKGHHDGAAALGAGGGGGSCSGSSGLGEAGE 147
IRS1 CHICK DSEAEQDSWYQALLPLHNRAKGHHDGAAALG-GGGGGSCSSSASLGEAGE 145
IRS1 MOUSE DSEAEQDSWYQALLQLHNRAKAHHD---GAGGG-CGGSCSGSSGVGEAGE 143
IRS1 RAT DSEAEQDSWYQALLQLHNRAKAHHD---GAGGG-CGGSCSGSSGVGEAGE 143
IRS1 CRIGR DSEAEQDSWYQALLQLHNRAKAHHD---GAGGGGCGGSCSGSSGVGEAGE 144
IRS1 PIG DSEAEQDSWYQALLQLHNRAKGHHDGAAGPGAGGGGGSCSGSSGLGEAGE 137
IRS1 XENTR ECEQEQDGWYQALVDLHNRGKTHHQ-------HHNHDGATNGVHDGLNGD 142
: . ∗ ∗∗∗ . ∗∗∗∗∗ : ∗∗∗∗ . ∗ ∗∗ : ... :. . ∗ ∗:
IRS1 HUMAN DLSYGDVPPGPAFKEVWQVILKPKGLGQTKNLIGIYRLCLTSKTISFVKL 197
IRS1 CHLAE DLSYGDVPPGPAFKEVWQVILKPKGLGQTKNLIGIYRLCLTSKTISFVKL 197
IRS1 CHICK DLSYGDGAPGPAFKEVWQVILKPKGLGQTKNLIGIYRLCLTSKTISFVKL 195
IRS1 MOUSE DLSY-DTGPGPAFKEVWQVILKPKGLGQTKNLIGIYRLCLTSKTISFVKL 192
IRS1 RAT DLSY-DTGPGPAFKEVWQVILKPKGLGQTKNLIGIYRLCLTSKTISFVKL 192
IRS1 CRIGR DLSY-DTGPGPAFKEVWQVILKPKGLGQTKNLIGIYRLCLTSKTISFVKL 193
IRS1 PIG DLSYGDVPPGPAFKEVWQVILKPKGLGQTKNLIGIYRLCLTSKTISFVKL 187
IRS1 XENTR DVYGEVTPPGLAFKEVWQVIMKPKGLGQLKNLVGIYRLCLTNRTISLVKL 192
∗: ∗∗ ∗∗∗∗∗∗∗∗∗ : ∗∗∗∗∗∗∗ ∗∗∗ : ∗∗∗∗∗∗∗∗ . : ∗∗∗ : ∗∗∗

IRS1 HUMAN NSEAAAVVLQLMNIRRCGHSENFFFIEVGRSAVTGPGEFWMQVDDSVVAQ 247
IRS1 CHLAE NSEAAAVVLQLMNIRRCGHSENFFFIEVGRSAVTGPGEFWMQVDDSVVAQ 247
IRS1 CHICK NSDAAAVVLQLLNIRRCGHSENFFFIEVGRSAVTVPGEFWMQVDDSVVAQ 245
IRS1 MOUSE NSEAAAVVLQLMNIRRCGHSENFFFIEVGRSAVTGPGEFWMQVDDSVVAQ 242
IRS1 RAT NSEAAAVVLQLMNIRRCGHSENFFFIEVGRSAVTGPGEFWMQVDDSVVAQ 242
IRS1 CRIGR NSEAAAVVLQLMNIRRCGHSENFFFIEVGRSAVTGPGEFWMQVDDSVVAQ 243
IRS1 PIG NSEAAAVVLQLMNIRRCGHSENFFFIEVGRSAVTGPGEFWMQVDDSVVAQ 237
IRS1 XENTR NSDAAAVVLQLMNIRRCGHSENFFFIEVGRSAVTGAGEFWMQVDDSVVAQ 242
∗∗ : ∗∗∗∗∗∗∗∗ : ∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗ . ∗∗∗∗∗∗∗∗∗∗∗∗∗∗
IRS1 HUMAN NMHETILEAMRAMSDEFRPRSKSQSSSNCSNPISVPLRRHHLNNPPPSQV 297
IRS1 CHLAE NMHETILEAMRAMSDEFRPRSKSQSSSNCSNPISVPLRRHHLNNPPPSQV 297
IRS1 CHICK NMHETILEAMRAMSEEFRPRSKSQSSSNCSNPISVPLRRHHINNPPPSQV 295
IRS1 MOUSE NMHETILEAMRAMSDEFRPRSKSQSSSSCSNPISVPLRRHHLNNPPPSQV 292
IRS1 RAT NMHETILEAMRAMSDEFRPRTKSQSSSSCSNPISVPLRRHHLNNPPPSQV 292
IRS1 CRIGR NMHETILEAMRAMSDEFRPRSKSQSSSSCSNPISVPLRRHHLNNPPPSQV 293
IRS1 PIG NMHETILEAMRAMSDEFRPRSKSQSSSNCSNPISVPLRRHHLNNPPPSQV 287
IRS1 XENTR NMHETILEAMKALSDEFRPRSKSQSSSNCSNPISVPLRRHHLNHPPPSQV 292
∗∗∗∗∗∗∗∗∗∗ : ∗ : ∗ : ∗∗∗∗∗ : ∗∗∗∗∗∗ . ∗∗∗∗∗∗∗∗∗∗∗∗∗ : ∗: ∗∗∗∗∗∗
IRS1 HUMAN GLTRRSRTESITATSPASMVG--GKPGSFRVRASSDGEGTMSRPASVDGS 345
IRS1 CHLAE GLTRRSRTESITATSPASMVG--GKPGSFRVRASSDGEGTMSRPASVDGS 345
IRS1 CHICK GLSRRSRTESVTATSPAGGVG--GKPSSFRVRESSDGEGTMSPPDQQDGS 343
IRS1 MOUSE GLTRRSRTESITATSPASMVG--GKPGSFRVRASSDGEGTMSRPASVDGS 340
IRS1 RAT GLTRRSRTESITATSPASMVG--GKPGSFRVRASSDGEGTMSRPASVDGS 340
IRS1 CRIGR GLTRRSRTESITATSPASMVG--GKPGSFRVRASSDGEGTMSRPASVDGS 341
IRS1 PIG GLTRRSRTESITATSPASLVG--GKQGSFRVRASSDGEGTMSRPASVDGS 335
IRS1 XENTR GLNRRARTESVTATSPAGGVAKHGSSSSFRVRASSDGEGTMSRPASMEGS 342
∗∗. ∗∗ : ∗∗∗∗ : ∗∗∗∗∗∗ . ∗ . ∗ . . ∗∗∗∗∗ ∗∗∗∗∗∗∗∗∗ ∗ . :∗∗
IRS1 HUMAN PVSPSTNRTHAHRHRGSARLHPPLNHSRSIPMPASRCSPSATSPVSLSSS 395
IRS1 CHLAE PVSPSTNRTHAHRHRGSARLHPPLNHSRSIPMPASRCSPSATSPVSLSSS 395
IRS1 CHICK PVSPSTNRTHAHRHRGSALLQPPLNHSRSIPMPASRCSPSATSPVSLSSS 393
IRS1 MOUSE PVSPSTNRTHAHRHRGSSRLHPPLNHSRSIPMPSSRCSPSATSPVSLSSS 390
IRS1 RAT PVSPSTNRTHAHRHRGSSRLHPPLNHSRSIPMPSSRCSPSATSPVSLSSS 390
IRS1 CRIGR PVSPSTNRTHAHRHRGSSRLHPPLNHSRSIPMPSSRCSPSATSPVSLSSS 391
IRS1 PIG PVSPSTNRTHAHRHRGSSRLHPPLNHSRSIPMPSSRCSPSATSPVSLSSS 385
IRS1 XENTR PVSPSASRAQSHRHRGSSRLHPPLNHSRSIPMPATRCSPSATSPVSLSSS 392
∗∗∗∗∗ : . ∗ : : : ∗∗∗∗∗∗ : ∗ :∗∗∗∗∗∗∗∗∗∗∗∗ : : ∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗

IRS1 HUMAN STSGHGSTSDCLFPRRSSASVSGSPSDGGFISSDEYGSSPCDFRSSFRSV 445
IRS1 CHLAE STSGHGSTSDCLFPRRSSASVSGSPSDGGFISSDEYGSSPCDFRSSFRSV 445
IRS1 CHICK STSGHGSTSDCLFPRRSSASVSGSPSDGGFISSDEYGSSPCDFRSSFRSV 443
IRS1 MOUSE STSGHGSTSDCLFPRRSSASVSGSPSDGGFISSDEYGSSPCDFRSSFRSV 440
IRS1 RAT STSGHGSTSDCLFPRRSSASVSGSPSDGGFISSDEYGSSPCDFRSSFRSV 440
IRS1 CRIGR STSGHGSTSDCLFPRRSSASVSGSPSDGGFISSDEYGSSPCDFRSSFRSV 441
IRS1 PIG STSGHGSTSDCLFPRRSSASVSGSPSDGGFISSDEYGSSPCDFRSSFRSV 435
IRS1 XENTR STSGHGSTSDCMCPRRSSASVSGSPSDGGFISSDEYGSSPCDFRSSFRSV 442
∗∗∗∗∗∗∗∗∗∗∗ : ∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗


IRS1 HUMAN TPDSLGHTPPARGEEELSNYICMGGKGPSTLTAPNGHYILSRGGNGHRCT 495
IRS1 CHLAE TPDSLGHTPPARGEEELSNYICMGGKGPSTLTAPNGHYILSRGGNGHRYT 495
IRS1 CHICK TPDSLGHTPPARGEEELSNYICMGGKGPSTLTAPNGHYILSRGANGHRCT 493
IRS1 MOUSE TPDSLGHTPPARGEEELSNYICMGGKGASTLAAPNGHYILSRGGNGHRYI 490
IRS1 RAT TPDSLGHTPPARGEEELSNYICMGGKGASTLTAPNGHYILSRGGNGHRYI 490
IRS1 CRIGR TPDSLGHTPPARGEEELSNYICMGGKGASTLTAPNGHYILSRGGNGHRYI 491
IRS1 PIG TPDSLGHTPPARGEEELSNYICMGGKGASTLAAPNGHYILPRGGNGHRCL 485
IRS1 XENTR TPDSLGHTPPAR-EEELNNYICMGKSGSHLQRSQQQRYQLSRGEE----- 486
∗∗∗∗∗∗∗∗∗∗∗∗ ∗∗∗∗ . ∗∗∗∗∗∗ . ∗ . : : : ∗ ∗ . ∗∗ :
IRS1 HUMAN PGTGLGTSPALAGDEAASAADLDNRFRKRTHSAGTSP-TITHQKTPSQSS 544
IRS1 CHLAE PGTGLGTSPALAGDEASSAADLDNRFRKRTHSAGTSP-TITHQKTPSQSS 544
IRS1 CHICK PGTGLGTSPALAGDEQTSAADLDNRFRKRTHSAGTSP-TITHQKTPSQSS 542
IRS1 MOUSE PGANLGTSPALPGDEAAGAADLDNRFRKRTHSAGTSP-TISHQKTPSQSS 539
IRS1 RAT PGATMGTSPALTGDEAAGAADLDNRFRKRTHSAGTSP-TISHQKTPSQSS 539
IRS1 CRIGR PGANLGTSPALTGDEATSAADLDNRFRKRTHSAGTSP-TISHQKTPSQSS 540
IRS1 PIG PGAGLGTSPALTGEEAGPASDLDNRFRKRTHSAGTSP-TISHQKTPSQSS 534
IRS1 XENTR ------------------HPDFDKVFRKRTHSSGTSPPTVSHQKTPSQS- 517
. ∗ : ∗ : ∗∗∗∗∗∗∗ : ∗∗∗∗ ∗ : : ∗∗∗∗∗∗∗∗
IRS1 HUMAN VASIEEYTEMMP-AYPPGGGSGGRLPGHRHSAFVPTRSYPEEGLEMHPLE 593
IRS1 CHLAE VASIEEYTEMMP-AYPPGGGSGGRLPGHRHSAFVPTHSYPEEGLEMHPLE 593
IRS1 CHICK VASIEEYTEMMP-AYPPGGGSGGRAWPGRLSAFVPTRSYPEEGLEMHPLE 591
IRS1 MOUSE VASIEEYTEMMPAAYPPGGGSGGRLPGYRHSAFVPTHSYPEEGLEMHHLE 589
IRS1 RAT VVSIEEYTEMMPAAYPPGGGSGGRLPGYRHSAFVPTHSYPEEGLEMHHLE 589
IRS1 CRIGR VASIEEYTEMMPAAYPPGGGSGGRLPSYRHSAFVPTHSYPEEGLEMHPLE 590
IRS1 PIG VASIEEYTEMMP-AYPPGGGSGGRMPNYRHSAFVPTHSYPEEGLEMHPLE 583
IRS1 XENTR --SIEEYTEMMP-------AHPVRLTSFRHSAFVPTYSYPEECLDLHLEG 558
∗∗∗∗∗∗∗∗∗∗ . ∗ ∗ ∗∗∗∗∗∗ ∗∗∗∗∗ ∗ : : ∗
IRS1 HUMAN RRGGHHRPDSSTLHTDDGYMPMSPGVAPVPSGRKGSGDYMPMSPKSVSAP 643
IRS1 CHLAE RRGGHHRPDSSTLHTDDGYMPMSPGVAPVPSSRKGSGDYMPMSPKSVSAP 643
IRS1 CHICK RRAGRTSPDSSTLHTDDGYMPMSPGWPPVSSGRKGNGDYMSMSPKSVSTP 641
IRS1 MOUSE RRGGHHRPDTSNLHTDDGYMPMSPGVAPVPSNRKGNGDYMPMSPKSVSAP 639
IRS1 RAT RRGGHHRPDSSNLHTDDGYMPMSPGVAPVPSNRKGNGDYMPMSPKSVSAP 639
IRS1 CRIGR RRGGHHRPDTSTLHTDDGYMPMSPGVAPVPSNRKGNGDYMPMSPKSVSAP 640
IRS1 PIG RRGGHHRQDTSSLHTDDGYMPMSPGVAPVPGTRKGSGDYMPMSPKSVSAP 633
IRS1 XENTR SRAN---------HTDDGYMPMSPGVAPVPTK---SNDYMPMSPKSVSAP 596
∗. . ∗∗∗∗∗∗∗∗∗∗∗∗ . ∗∗ . . . ∗∗∗ . ∗∗∗∗∗∗∗ : ∗
IRS1 HUMAN QQIINPIRRHPQRVDPNGYMMMSPSGGCSPDIGGGPSSSSSSSNAVPSGT 693
IRS1 CHLAE QQIINPIRRHPQRVDPNGYMMMSPSGGCSPDIGGGPSSSSSS--TVPSGS 691
IRS1 CHICK QQIINPITGPSERVYPNGYMMMSHSGGCSPDIGGGPSSSSSSSNAVPSGT 691
IRS1 MOUSE QQIINPIRRHPQRVDPNGYMMMSPSGSCSPDIGGG-SSSSSSISAAPSGS 688
IRS1 RAT QQIINPIRRHPQRVDPNGYMMMSPSGSCSPDIGGG-SCSSSSISAAPSGS 688
IRS1 CRIGR QQIINPIRRHPQRVDPNGYMMMSPSGSCSPDIGGG-SSSSS--GAAASGS 687
IRS1 PIG QQIINPIRRHPQRVDPNGYMMMSPSGSCSPDIGGGPSSGGSSSGAAPSGS 683
IRS1 XENTR QQIINPRRHS--AVDSNGYMMMSPSGSCS-----------------PDST 627
∗∗∗∗∗∗ ∗ . ∗∗∗∗∗∗∗ ∗ ∗ . ∗∗ . ..:
IRS1 HUMAN SYGKLWTNGVGGHHSHVLPHPKPPVESSGGKLLPCTGDYMNMSPVGDSNT 743
IRS1 CHLAE SYGKLWTKGVGAHNSQVLLHPKPPVESSGGKLLPCTGDYMNMSPVGDSNT 741
IRS1 CHICK SYGKLWTNGVGGHHSHVLPHPKPPVESSGGKLLPCTGDYMDMYPVGDSNT 741
IRS1 MOUSE SYGKPWTNGVGGHHTHALPHAKPPVESGGGKLLPCTGDYMNMSPVGDSNT 738
IRS1 RAT SYGKPWTNGVGGHHTHALPHAKPPVESGGGKLLPCTGDYMNMSPVGDSNT 738
IRS1 CRIGR SYGKPWTNGVGGHHSHALPHSKPPTESGSGKLLPCTGDYMNMSPVGDSNT 737
IRS1 PIG SYGKLWTNGVGGHHSHALPHPKLPVESSSGKLLSCTGDYMNMSPVGDSNT 733
IRS1 XENTR NYSKIWTNGTN---------PKLSIDSIEG-KLPCS-DYINMSPASGSTT 666
. ∗ . ∗ ∗∗ : ∗ . . . ∗ . : ∗ ∗ ∗ . ∗ : ∗∗ : : ∗ ∗ . . . ∗ . ∗
IRS1 HUMAN SSPSDCYYGPEDPQHKPVLSYYSLPRSFKHTQRPGEPEEGARHQHLRLST 793
IRS1 CHLAE SSPSDCYYGPEDPQHKPVLSYYSLPRSFKHTQRPGEPEEGARHQHLRLST 791
IRS1 CHICK SDPSLSYYGPEDPQHKPVLSYYSLPDAFKHPSGPGEPEEGARHQHLRLST 791
IRS1 MOUSE SSPSECYYGPEDPQHKPVLSYYSLPRSFKHTQRPGEPEEGARHQHLRLSS 788
IRS1 RAT SSPSECYYGPEDPQHKPVLSYYSLPRSFKHTQRPGEPEEGARHQHLRLSS 788
IRS1 CRIGR SSPSECYYGPEDPQHKPVLSYYSLPRSFKHTQRPGEPEEGARHQHLRLSS 787
IRS1 PIG SSPSDCYYGPEDPQHKPVLSYYSLPRSFKHTQRPGELEESARHQHLRLSS 783
IRS1 XENTR STPPDSYLNSIEESTKPVYSYFSLPRSFKHVHRKSE------DGNLRITA 710
∗ ∗ . . ∗ . . : . ∗∗∗ ∗∗ : ∗∗∗ : ∗∗∗ .∗ . : ∗∗ : : :
IRS1 HUMAN SSGRLLYAATADDSSSSTSSDSLGGGYCGARLEPSLPHPHHQVLQPHLPR 843
IRS1 CHLAE SSGRLLYAATADDSSSSTSSDSLGGGYCGARLEPSLPHPHHQVLQPHLPR 841
IRS1 CHICK SSGRLLYAATADDSSSSTSSDTLGGGYCGARLEPRLPHPHHQVLQPHLAR 841
IRS1 MOUSE SSGRLRYTATAEDSSSSTSSDSLGGGYCGARPESSLTHPHHHVLQPHLPR 838
IRS1 RAT SSGRLRYTATAEDSSSSTSSDSLGGGYCGARPESSVTHPHHHALQPHLPR 838
IRS1 CRIGR SSGRLLYAATAEDSSSSTSSDSLGGAYCGARPESGLPHPHHHVVQPHGPR 837
IRS1 PIG SSGRLLYAAAAEDSSSSTSSDSLGGGYCGARPEPGLPHLHHQVLQPHLPR 833
IRS1 XENTR NSGHNLYTE--DSSSSSTSSDSLGG------------------QDPQQPR 740
. ∗∗ : ∗ : : . ∗∗∗∗ ∗∗∗∗ : ∗∗∗ :∗ : . ∗
IRS1 HUMAN KVDT-AAQTNSRLARPTRLSLGDP-KASTLPRAREQQQQQQ--------- 882
IRS1 CHLAE KVDT-AAQTNSRLARPTRLSLGDP-KASTLPRAREQQQQQQQQQQQQQQQ 889
IRS1 CHICK KVDTSLAQTNSRLARPTRLSLGDP-KASTLPRAREQQQQSQ--------- 881
IRS1 MOUSE KVDT-AAQTNSRLARPTRLSLGDP-KASTLPRVREQQQQQQ--------- 877
IRS1 RAT KVDT-AAQTNSRLARPTRLSLGDP-KASTLPRVREQQQQQQQQQQ----- 881
IRS1 CRIGR KVDT-AAQTNRRLPRPTRLSLGDP-KASTFPRVWEQQQ------------ 873
IRS1 PIG KVDT-AAQTNNRLARPTRLSLGDP-KASTLPRAREQPPQP---------- 871
IRS1 XENTR KGET--CIQGKRLTRPTRLSLENSSKASTLPRAREPAL------------ 776
∗ : ∗ . . ∗∗ . ∗∗∗∗∗∗∗ : . ∗∗∗∗ : ∗∗ . ∗


IRS1 HUMAN --PLLHPPEPKSPGEYVNIEFGSDQSGYLSGPVAFHSSPSVRCPSQLQPA 930
IRS1 CHLAE QQPLLHPPEPKSPGEYVNIEFGSDQPGYLSGPVASRSSPSVRCPSQLQPA 939
IRS1 CHICK --PFACPPEPKSPGEYVNIEFGSDQSGYLSGPVAFHSSPSVRCPSQLQPA 929
IRS1 MOUSE --SSLHPPEPKSPGEYVNIEFGSGQPGYLAGPATSRSSPSVRCPPQLHPA 925
IRS1 RAT --SSLHPPEPKSPGEYVNIEFGSGQPGYLAGPATSRSSPSVRCLPQLHPA 929
IRS1 CRIGR --PSLHPPEPKSPGEYVNIEFGSGQPGYLAGPATSHSSPSVRCPPQLHPA 921
IRS1 PIG --ALLHPPEPKSPGEYVNIEFGGDQPGYLSGPVASHSAPSVRCSSQLQPA 919
IRS1 XENTR ------PPEPKSPGEYVNIEFN---DKVFSGGLMPSMCPLPFVQSRVVP- 816
∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗ . ::∗ .∗ .:: ∗
IRS1 HUMAN PREEETGTEEYMKMDLGPGRRAAWQESTGVE------MGRLGPAPPGAAS 974
IRS1 CHLAE PREEETGTEEYMKMDLGPGRRAAWQESTGVE------MGRLGPAPPGAAS 983
IRS1 CHICK PREEETGTEEYMKMDLGPGRRAAWQERTGVE------MGRLGPAPPGAAS 973
IRS1 MOUSE PREE-TGSEEYMNMDLGPGRRATWQESGGVE------LGRIGPAPPGSAT 968
IRS1 RAT PREE-TGSEEYMNMDLGPGRRATWQESGGVE------LGRVGPAPPGAAS 972
IRS1 CRIGR PREE-TGSEEYMNMDLGPGRRATWQESGGVE------LGRVGPAPPGAAT 964
IRS1 PIG PREENTGAEEYMNMDLGPGRRAIWQESAGVQ------PGRVGPAPPGAAS 963
IRS1 XENTR QREN---LSEYMNMDLGVWRAKTSYASTSSSSYEPPYKPVSSVCPTETCS 863
∗∗ : . ∗∗∗ : ∗∗∗∗ ∗ . . . .∗. : . :
IRS1 HUMAN ICRPTRAVPSSRGDYMTMQMSCPRQSYVDTSP----AAPVSYADMRTGIA 1020
IRS1 CHLAE ICRPTRAVPSSRGDYMTMQMSCPRQSYVDTSP----IAPVSYADMRTGIA 1029
IRS1 CHICK ICRPT-AVPSSRGDYMTMQMSCPRQSYVDTSP----AAPVSYADMRTGIA 1018
IRS1 MOUSE VCRPTRSVPNSRGDYMTMQIGCPRQSYVDTSP----VAPVSYADMRTGIA 1014
IRS1 RAT ICRPTRSVPNSRGDYMTMQIGCPRQSYVDTSP----VAPVSYADMRTGIA 1018
IRS1 CRIGR ICRPTRSVPGSRGDYMTMQIGCPRQSYVDTSP----VAPVSYADMRTSIG 1010
IRS1 PIG VCRPTRAVPSSRGDYMTMQMGCPRQSYVDTSP----VAPISYADMRTGIV 1009
IRS1 XENTR SSRPPIRGKTNSRDYMSMQLSALCSDYSQVPPTRITAKPITLSSNKSNYA 913
. ∗∗ . . ∗∗∗ : ∗∗ : . . . . ∗ : . . ∗ ∗: : : . :: .
IRS1 HUMAN AEEVSLPRATMAAASSSSAASAS--PTGPQGAAELAAHSSLLGGPQGPGG 1068
IRS1 CHLAE AEEVSLPRATMAAAASSSAASAS--PTGPQGAAELAAHSSLLGGAQGPGG 1077
IRS1 CHICK AEEVSLPRATIAAASSSSTTSAS--DWARQ-APELAAHFVLLGGPQGPGG 1065
IRS1 MOUSE AEKASLPRPTGAAPPPSSTASS--SVTPQGATAEQATHSSLLGGPQGPGG 1062
IRS1 RAT AEKVSLPRTTGAAPPPSSTASASASVTPQGA-AEQAAHSSLLGGPQGPGG 1067
IRS1 CRIGR AEKVSLPRTTGAAPSSSSTASAS-PAAPEGA-AEQAAHSSLLGGPQGPGG 1058
IRS1 PIG VEEASLPGATAAAPSASSATSAS-SAAPQGA-GELAACSSLLGGPQGPGG 1057
IRS1 XENTR EMSSGGVSDNIPAIPQTSNSSLS-----------EASRSSLLG--QG-SG 949
. . . . ∗ . :∗ : ∗ ∗: ∗∗∗ ∗∗ . ∗
IRS1 HUMAN MS-AFTRVNLSPNRNQSAKVIRADPQGCRRRHSSETFSSTPSATRVGNTV 1117
IRS1 CHLAE MS-AFTRVNLSPNRNQSAKVIRADPQGCRRRHSSETFSSTPSATRVGNTV 1126
IRS1 CHICK MSGSFTRANLSPNRNQSAKVIRADTQGCRRIASSQTLSSTPSSTRWHTHV 1115
IRS1 MOUSE MS-AFTRVNLSPNHNQSAKVIRADTQGCRRRHSSETFSAP---TRAGNTV 1108
IRS1 RAT MS-AFTRVNLSPNHNQSAKVIRADTQGCRRRHSSETFSAP---TRAANTV 1113
IRS1 CRIGR MS-AFTRVNLSPNHNQSAKVIRADTQGCRRRHSSETFSAP---TRAGNTV 1104
IRS1 PIG LS-AFTRVNLSPNRNQSAKVIRADPQGCRRRHSSETFSSTPSATRAGNTV 1106
IRS1 XENTR PS-AFTRVSLSPNRNQSAKVIRAGDPQGRRRHSSETFSSTPTTARVTS-- 996
∗ : ∗∗∗ . .∗∗∗∗ : ∗∗∗∗∗∗∗∗∗ . ∗∗ ∗∗ : ∗ : ∗ : . :∗ .
IRS1 HUMAN PFGAGAAVGGGG----GSSSSSEDVKRHSSASFENVWLRPGELGGAPKEP 1163
IRS1 CHLAE PFGAGAAIGGSG----GSSSSSEDVKRHSSASFENVWLRPGELGGAPKEP 1172
IRS1 CHICK PLRAGAAVGGIR----VSASSEQDVKPHSSASFENVWLRPGELGGAPKEP 1161
IRS1 MOUSE PFGAGAAVGGSGGGGGGGS---EDVKRHSSASFENVWLRPGDLGGVSKES 1155
IRS1 RAT SFGAGAAGGGSG----GGS---EDVKRHSSASFENVWLRPGDLGGASKES 1156
IRS1 CRIGR SFGAGAAVGGSGGGAGGGSSSSEDVKRHSSASFENVWLRPGDLGGASKET 1154
IRS1 PIG PFGGGAAVGGGG----GGSSSTEDVKRHSSASFENVWLRPGELGGAPKEL 1152
IRS1 XENTR -----------------GPVSGEDVKRHSSASFENVWLKPGEIARRDS-- 1027
.. : ∗∗∗ ∗∗∗∗∗∗∗∗∗∗∗ : ∗∗ : : . .
IRS1 HUMAN AKLCGAAGGLENGLNYIDLDLVKDFKQCPQECTPEPQPPPPPPPHQPLGS 1213
IRS1 CHLAE AQLCGAAGGLENGLNYIDLDLVKDFKQRPQECTPQPQPPPPPPPHQPLGS 1222
IRS1 CHICK AKLCGAAGGLENGLNYIDLDLVKDFKQCPQECTPEPQPPPPPPPHQPLGS 1211
IRS1 MOUSE APVCGAAGGLEKSLNYIDLDLAK---ERSQDCPSQQQSLPPPPPHQPLGS 1202
IRS1 RAT APGCGAAGGLEKSLNYIDLDLVKDVKQHPQDCPSQQQSLPPPPPHQPLGS 1206
IRS1 CRIGR AQVCGAAGGLESNLNYIDLDLAKDVRQRPQECPSQQQPLPPPAPHQPSRN 1204
IRS1 PIG AQMCGAAGGLENGLNYIDLDLVKDFKQRPQERPPQPQPPPPPPPHQPLGN 1202
IRS1 XENTR ---LQPSDHTHNGLNYIDLDLAKDLSGLDHCN------------------ 1056
. : . . . . ∗∗∗∗∗∗∗∗ . ∗ :
IRS1 HUMAN GESSSTRRSSEDLSAYASISFQKQPEDRQ 1242
IRS1 CHLAE SESSSTRRSSEDLSAYASISFQKQPEDLQ 1251
IRS1 CHICK GESSSTRRSSEDLSAYASTSFQKQPEDRQ 1240
IRS1 MOUSE NEGNSPRRSSEDLSNYASISFQKQPEDRQ 1231
IRS1 RAT NEGSSPRRSSEDLSTYASINFQKQPEDRQ 1235
IRS1 CRIGR NEGSSPRRSSEDLSTYASINFQKQPEDHQ 1233
IRS1 PIG SESSSTSRSSEDLSAYASIRFQKQPED-- 1229
IRS1 XENTR SHQSGVSHPSDDLSTYASITFHKLEEHRN 1085
. . . . : . ∗ : ∗∗∗ ∗∗∗ ∗ : ∗ ∗ .

Figure 1: Multiple alignments of eight sequences (human, green monkey, pig, chicken, Chinese hamster, mouse, rat, and western clawed frog).
ClustalW was used to align these different sequences. The conserved sequence (highlighted red) is marked by an asterisk, semiconserved
(highlighted yellow) substitution by a single dot, and conserved substitution (highlighted green) by a double dot. Multiple alignment showed
that Ser-11, 24, 36, 57, 63, 68, 193, 199, 217, 228, 243, 261, 270, 272, 273, 274, 277, 281, 295, 307, 312, 323, 329, 330, 337, 345, 348, 350, 362, 372, 374,
383, 385, 388, 391, 393, 394, 395, 396, 398, 402, 404, 412, 413, 415, 417, 419, 421, 427, 428, 433, 434, 440, 441, 444, 449, 527, 531, 541, 543, 547,
574, 581, 616, 636, 639, 641, 666, 668, 672, 720, 741, 744, 763, 766, 795, 807, 808, 809, 810, 812, 813, 862, 869, 892, 1038, 1041, 1070, 1078, 1084,
1100, 1101, 1105, 1142, 1143, 1145, 1223, 1227, and 1231; Thr-88, 188, 191, 231, 252, 305, 309, 311, 335, 387, 397, 403, 446, 453, 525, 530, 533, 539, 552,
579, 608, 700, 743, 811, 847, 859, 870, 1073, and 1103; and Tyr-18, 46, 47, 87, 107, 183, 431, 465, 483, 551, 582, 612, 632, 662, 695, 732, 750, 764,
800, 896, 941, 989, 1001, 1179, and 1229 are highly conserved residues. Meanwhile, Ser-137, 268, 303, 380, 770, 792, 815, 910, 974, 1011, 1037, and
1106; Thr-351, 354, 535, 693, 729, 793, 991, 1017, and 1111; and Tyr-765 showed conserved substitutions. Ser-58, 78, 99, 105, 135, 139, 189, 315, 341,
463, 486, 629, 691, 694, 747, 794, 918, 925, 985, 995, 1000, 1005, 1025, 1035, 1131, 1132, 1217, 1218, and 1222; and Thr-300, 979, 1004, 1030, and 1107
showed semiconserved substitutions.
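
The percentages reported for IRS-1 in Section 3.1 can be cross-checked against the residue lists in the caption above. The Python sketch below does this for the threonine residues only: the site lists are copied from the caption and the total of 60 Thr residues is taken from the text, while the helper name `percent` and the dictionary layout are illustrative assumptions.

```python
# Thr positions copied from the Figure 1 caption.
thr_sites = {
    "conserved": [88, 188, 191, 231, 252, 305, 309, 311, 335, 387, 397, 403,
                  446, 453, 525, 530, 533, 539, 552, 579, 608, 700, 743, 811,
                  847, 859, 870, 1073, 1103],
    "conserved substitution": [351, 354, 535, 693, 729, 793, 991, 1017, 1111],
    "semiconserved": [300, 979, 1004, 1030, 1107],
}
TOTAL_THR = 60  # total Thr residues in human IRS-1, as stated in Section 3.1


def percent(count, total):
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    return round(100.0 * count / total, 1)


summary = {label: (len(sites), percent(len(sites), TOTAL_THR))
           for label, sites in thr_sites.items()}
for label, (count, pct) in summary.items():
    print(f"{label}: {count} of {TOTAL_THR} ({pct}%)")
```

The counts and percentages obtained this way match the Thr values given in the text (29, 9, and 5 residues). Applying the same helper to the Ser counts gives 54.4% for 99 of 182 residues, so the reported 54.3% appears to be truncated rather than rounded.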

IRS2 HUMAN MASPPRHGPPGPASGDGPNLNNNNNNNN-HSVRKCGYLRKQKHGHKRFFV 49
IRS2 RAT MASAPLPGPPASAGGDGPNLNNNNNNNNNHSVRKCGYLRKQKHGHKRFFV 50
IRS2 MOUSE MASAPLPGPPASAGGDGPNLNNNNNNNN-HSVRKCGYLRKQKHGHKRFFV 49
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN LRGPGAGGDEATAGGGSAPQPPRLEYYESEKKWRSKAGAPKRVIALDCCL 99
IRS2 RAT LRGPGTGGEEAAAAGGSPPQPPRLEYYESEKKWKSKAGAPKRVIALDCCL 100
IRS2 MOUSE LRGPGTGGDEASAAGGSPPQPPRLEYYESEKKWRSKAGAPKRVIALDCCL 99
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN NINKRADAKHKYLIALYTKDEYFAVAAENEQEQEGWYRALTDLVSEGRAA 149
IRS2 RAT NINKRADAKHKYLIALYTKDEYFAVAAENEQEQEGWYRALTDLVSEGRSG 150
IRS2 MOUSE NINKRADAKHKYLIALYTKDEYFAVAAENEQEQEGWYRALTDLVSEGRSG 149
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN AGDAPPAAAPAASCSASLPGALGGSAGAAGAEDSYGLVAPATAAYREVWQ 199
IRS2 RAT DGGSGTTGG---SCSASLPGALGGSAGAAGCDDNYGLVTPATAVYREVWQ 197
IRS2 MOUSE EGGSGTTGG---SCSASLPGVLGGSAGAAGCDDNYGLVTPATAVYREVWQ 196
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN VNLKPKGLGQSKNLTGVYRLCLSARTIGFVKLNCEQPSVTLQLMNIRRCG 249
IRS2 RAT VNLKPKGLGQSKNLTGVYRLCLSARTIGFVKLNCEQPSVTLQLMNIRRCG 247
IRS2 MOUSE VNLKPKGLGQSKNLTGVYRLCLSARTIGFVKLNCEQPSVTLQLMNIRRCG 246
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN HSDSFFFIEVGRSAVTGPGELWMQADDSVVAQNIHETILEAMKALKELFE 299
IRS2 RAT HSDSFFFIEVGRSAVTGPGELWMQADDSVVAQNIHETILEAMKALKELFE 297
IRS2 MOUSE HSDSFFFIEVGRSAVTGPGELWMQADDSVVAQNIHETILEAMKALKELFE 296
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN FRPRSKSQSSGSSATHPISVPGARRHHHLVNLPPSQTGLVRRSRTDSLAA 349
IRS2 RAT FRPRSKSQSSGSSATHPISVPGARRHHHLVNLPPSQTGLVRRSRTDSLAA 347
IRS2 MOUSE FRPRSKSQSSGSSATHPISVPGARRHHHLVNLPPSQTGLVRRSRTDSLAA 346
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN TPPAAKCSSCRVRTASEGDGGAAAGAAAAGARPVSVAGSPLSPGPVRAPL 399
IRS2 RAT TPPAAKCTSCRVRTASEGDGGAAGGAGTAGGRPMSVAGSPLSPGPVRAPL 397
IRS2 MOUSE TPPAAKCTSCRVRTASEGDGGAAGGAGTAGGRPMSVAGSPLSPGPVRAPL 396
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN SRSHTLSGGCGGRGSKVALLPAGGALQHSRSMSMPVAHSPPAATSPGSLS 449
IRS2 RAT SRSHTLSAGCGGRPSKVALAPAGGALQHSRSMSMPVAHSPPAATSPGSLS 447
IRS2 MOUSE SRSHTLSAGCGGRPSKVTLAPAGGALQHSRSMSMPVAHSPPAATSPGSLS 446
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN SSSGHGSGSYPPPPGPHPPLPHPLHHGPGQRPSSGSASASGSPSDPGFMS 499
IRS2 RAT SSSGHGSGSYPLPPGSHPHLPHPLHHPQCQRPSSGSASASGSPSDPGFMS 497
IRS2 MOUSE SSSGHGSGSYPLPPGSHPHLPHPLHHPQGQRPSSGSASASGSPSDPGFMS 496
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN LDEYGSSPGDLRAFCSHRSNTPESIAETPPARDGGGGGEFYGYMTMDRPL 549
IRS2 RAT LDEYGSSPGDLRAFSSHRSNTPESIAETPPARDGSGG-ELYGYMSMDRPL 546
IRS2 MOUSE LDEYGSSPGDLRAFSSHRSNTPESIAETPPARDGSGG-ELYGYMSMDRPL 545
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN SHCGRSYRRVSGDAAQDLDRGLRKRTYSLTTPARQRPVPQPSSASLDEYT 599
IRS2 RAT SHCGRPYRRVSGDGAQDLDRGLRKRTYSLTTPARQRQVPQPSSASLDEYT 596
IRS2 MOUSE SHCGRPYRRVSGDGAQDLDRGLRKRTYSLTTPARQRQVPQPSSASLDEYT 595
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN LMRATFSGSAGRLCPSCPASSPKVAYHPYPEDYGDIEIGSHRSSSSNLGA 649
IRS2 RAT LMRATFSGSSGRLCPSLPASSPKVAYNPYPEDYGDIEIGSHKSSSSNLGA 646
IRS2 MOUSE LMRATFSGSSGRLCPSFPASSPKVAYNPYPEDYGDIEIGSHKSSSSNLGA 645
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN DDGYMPMTPGAALAGSGSGSCRSDDYMPMSPASVSAPKQILQPRAAAAAA 699
IRS2 RAT DDGYMPMTPGAALRSGGPNSCKSDDYMPMSPTSVSAPKQILQP----RSA 692
IRS2 MOUSE DDGYMPMTPGAALRSGGPNSCKSDDYMPMSPTSVSAPKQILQP----RLA 691
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN AAVPSAGPAGPAPTSAAGRTFPASGGGYKASSPAESSPEDSGYMRMWCGS 749
IRS2 RAT AALPPSGAAVPAPPSGAGRTFPVNGGGYKASSPAESSPEDSGYMRMWCGS 742
IRS2 MOUSE AALPPSGAAVPAPPSGVGRTFPVNGGGYKASSPAESSPEDSGYMRMWCGS 741
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN KLSMEHADGKLLPNGDYLNVSPSDAVTTGTPPDFFSAALHPGGEPLRGVP 799
IRS2 RAT KLSMENPDPKLLPNGDYLNMSPSEAGTAGTPPDFFSAALRPGGEALKGVP 792
IRS2 MOUSE KLSMENPDPKLLPNGDYLNMSPSEAGTAGTPPD-FSAALRGGSEGLKGIP 790
IRS2 CRIGR --------------------------------------------------
IRS2 HUMAN GCCYSSLPRSYKAPYTCGG-DSDQYVLMSSPVGRILEEERLEPQATPGPS 848
IRS2 RAT GHCYSSLPRSYKAPCTCGGGDNDQYVLMSSPVGRILEEERLEPQATPG-A 841
IRS2 MOUSE GHCYSSLPRSYKAPCSCSG-DNDQYVLMSSPVGRILEEERLEPQATPG-A 838
IRS2 CRIGR --------------------------------------------------

Figure 2: Continued.

IRS2 HUMAN QAASAFGAGPTQPPHPVVPSPVRPSG--GRPEGFLGQRGRAVRPTRLSLE 896


IRS2 RAT GTFGAAGGSHTQPHHSAVPSSMRPSGIVGRPEGFLGQRCRAVRPTRLSLE 891
IRS2 MOUSE GTFGAAGGSHTQPHHSAVPSSMRPSAIGGRPEGFLGQRCRAVRPTRLSLE 888
IRS2 CRIGR ---------------------MRPSGSGGRPEGFLGQRCRAVRPTRLSLE 29
: ∗∗∗ . ∗∗∗∗∗∗∗∗∗∗ ∗∗∗∗∗∗∗∗∗∗∗
IRS2 HUMAN GLPSLPSMHEYPLPPEPKSPGEYINIDFGEPGARLSPPAPPLLASAASSS 946
IRS2 RAT GLQTLPSMQEYPLPTEPKSPGEYINIDFGEGGTRLSPPAPPLLASAASSS 941
IRS2 MOUSE GLQTLPSMQEYPLPTEPKSPGEYINIDFGEAGTRLSPPAPPLLASAASSS 938
IRS2 CRIGR GLQTLPSMQEYPLPAEPKSPGEYINIDFGEAGTRLSPPAPPLLASAASSS 79
∗∗ : ∗∗∗ ∗ : ∗∗∗∗∗ . ∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗ ∗ : ∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗
IRS2 HUMAN SLLSASSPASSLGSGTPGTSSDSRQRSPLSDYMNLDFSSPKSPKPGAPSG 996
IRS2 RAT SLLSASSPASSLGSGTPGTSSDSRQRSPLSDYMNLDFSSPKSPKPSTRSG 991
IRS2 MOUSE SLLSASSPASSLGSGTPGTSSDSRQRSPLSDYMNLDFSSPKSPKPSTRSG 988
IRS2 CRIGR SLLSASSPASSLGSGTPGTSSDSRQRSPLSDYMNLDFSSPKSPKPGTHSG 129
∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗ . : ∗∗
IRS2 HUMAN HPVGSLDGLLSPEASSPYPPLPPRPSASPSSSLQPPPPPPAPGELYRLPP 1046
IRS2 RAT DTVGSIDGLLSPEASSPYPPLPPRPSASPSSLQQPLPP--APGDLYRLPP 1039
IRS2 MOUSE DTVGSMDGLLSPEASSPYPPLPPRPSTSPSSLQQPLPP--APGDLYRLPP 1036
IRS2 CRIGR DTVGSMDGLLSPEASSPYPPLPPRPSASPSSLQQPLPP--APGELYRLPP 177
. . ∗∗∗ : ∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗ : ∗∗∗∗ ∗∗ ∗∗ ∗∗∗ : ∗∗∗∗∗∗
IRS2 HUMAN ASAVATAQGPGAASSLSSDTGDNGDYTEMAFGVAATPPQPIAAPPKPEAA 1096
IRS2 RAT AT-AATSQGPTAGSSMSSEPGDNGDYTEMAFGVAATPPQPIAAPPKPEGA 1088
IRS2 MOUSE AS-AATSQGPTAGSSMSSEPGDNGDYTEMAFGVAATPPQPIVAPPKPEGA 1085
IRS2 CRIGR AS-AATSQGPTAGSSKSSEPGDNGDYTEMAFGVAAPPPQPIAAPPKPEGA 226
∗ : . ∗∗ : ∗∗∗ ∗ . ∗∗ ∗∗ : . ∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗ . ∗∗∗∗∗ . ∗∗∗∗∗∗ . ∗
IRS2 HUMAN RVASPTSGVKRLSLMEQVSGVEAFLQASQPPDPHRGAKVIRADPQGGRRR 1146
IRS2 RAT RVTSPTSGLKRLSLMDQVSGVEAFLQVSQPPDPHRGAKVIRADPQGGRRR 1138
IRS2 MOUSE RVASPTSGLKRLSLMDQVSGVEAFLQVSQPPDPHRGAKVIRADPQGGRRR 1135
IRS2 CRIGR RVTSPTSGLKRLSLMEQVSGVEAFLQVSQPPDPHRGAKVIRADPQGGRRR 276
∗∗ : ∗∗∗∗∗ : ∗∗∗∗∗∗ : ∗∗∗∗∗∗∗∗∗∗ . ∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗
IRS2 HUMAN HSSETFSSTTTVTPVSPSFAHNPKRHNSASVENVSLRKSSEGGVGVGPGG 1196
IRS2 RAT HSSETFSSTTTVTPVSPSFAHNSKRHNSASVENVSLRKSSEGNSILG--G 1186
IRS2 MOUSE HSSETFSSTTTVTPVSPSFAHNSKRHNSASVENVSLRKSSEGSSTLG--G 1183
IRS2 CRIGR HSSETFSSTTTVTPVSPSFAHNSKRHNSASVENVSLRKSSEGSGILG--G 324
∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗ . ∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗ . : ∗ ∗
IRS2 HUMAN GDEPPTSPRQLQP--APPLAPQGRPWTPGQPGGLVGCPGSGGSPMRRETS 1244
IRS2 RAT SDEPSTSPGQAQPSAGVPPAPQARPWNPGQPGALIGCPGGSSSPMRRETS 1236
IRS2 MOUSE GDEPPTSPGQAQPLVAVPPVPQARPWNPGQPGALIGCPGGSSSPMRRETS 1233
IRS2 CRIGR SDEPPTSPGQAQPSVAVPPAPQARPWNPGQPGALIGCPGGSSSPMRRETS 374
. ∗∗∗ . ∗∗∗ ∗ ∗∗ . ∗ . ∗ ∗ . ∗∗∗ . ∗∗∗∗∗ . ∗ : ∗∗∗∗ . . . ∗∗∗∗∗∗∗∗
IRS2 HUMAN AGFQNGLNYIAIDVREEPGLPPQPQPPPPPLPQPGDKSSWGRTRSLGGLI 1294
IRS2 RAT VGFQNGLNYIAIDVRGEQGSLAQSQPQH---PQPGDKNSWGRTRSLGGLL 1283
IRS2 MOUSE VGFQNGLNYIAIDVRGEQGSLAQSQ------PQPGDKNSWSRTRSLGGLL 1277
IRS2 CRIGR VGFQNGLNYIAIDVRDEQGSLSQTQPQH---PQPGDKSSWGRTRSLGGLL 421
. ∗∗∗∗∗∗∗∗∗∗∗∗∗∗ ∗ ∗ . ∗ . ∗ ∗∗∗∗∗∗. ∗∗ . ∗∗∗∗∗∗∗∗ :
IRS2 HUMAN SAVGVGSTGGGCGGPGPGALPPANTYASIDFLSHHLKEATIVKE 1338
IRS2 RAT GTVGGSGTSGVCGGPGTGALPSASTYASIDFLSHHLKEATVVKG 1327
IRS2 MOUSE GTVGGSGASGVCGGPGTGALPSASTYASIDFLSHHLKEATVVKE 1321
IRS2 CRIGR GTVGSTGSSGVCGGPGTGALPSASTYASIDFLSHHLKEATVVKE 465
. : ∗∗ . : . ∗ ∗∗∗∗∗ . ∗∗∗∗ .∗ . ∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗∗ : ∗∗

Figure 2: Multiple alignments of three sequences (human, rat, and mouse). ClustalW was used to align these different sequences. The
conserved sequence (highlighted red) is marked by an asterisk, semiconserved (highlighted yellow) substitution by a single dot, and conserved
substitution (highlighted green) by a double dot. Ser-3, 30, 66, 78, 84, 144, 162, 164, 166, 174, 210, 222, 237, 251, 253, 262, 277, 304, 306, 308,
309, 311, 312, 318, 334, 342, 346, 358, 365, 384, 388, 391, 400, 402, 406, 414, 428, 430, 432, 438, 444, 447, 449, 450, 451, 452, 456, 458, 482, 483,
485, 487, 489, 491, 493, 499, 505, 506, 515, 518, 523, 550, 560, 577, 591, 592, 594, 606, 608, 615, 619, 620, 639, 642, 643, 644, 645, 669, 672, 679,
682, 684, 714, 730, 731, 735, 736, 740, 749, 752, 770, 772, 785, 804, 805, 809, 827, 828, 868, 873, 894, 903, 915, 932, 941, 944, 945, 946, 947, 950,
952, 953, 956, 957, 960, 966, 967, 969, 973, 976, 984, 985, 988, 995, 1001, 1007, 1011, 1012, 1022, 1024, 1026, 1027, 1060, 1061, 1063, 1064, 1100, 1103,
1109, 1115, 1124, 1148, 1149, 1153, 1154, 1162, 1164, 1174, 1176, 1181, 1185, 1186, 1203, 1237, 1244, 1283, 1289, 1322 and 1327; Thr-117, 140, 191, 214, 225,
239, 265, 286, 314, 336, 344, 350, 363, 404, 443, 520, 527, 575, 579, 580, 599, 604, 657, 719, 776, 779, 844, 859, 891, 962, 965, 1052, 1073, 1082,
1102, 1151, 1155, 1156, 1157, 1159, 1202, 1243, 1287, 1319, and 1334 and Tyr-36, 75, 76, 111, 116, 121, 136, 184, 194, 217, 459, 503, 540, 542, 556, 576, 598,
625, 628, 632, 653, 675, 727, 742, 766, 803, 810, 823, 907, 919, 978, 1014, 1042, 1072, 1253, and 1320 are highly conserved residues. Meanwhile,
Ser-357, 848, 900, and 1048 and Thr-61, 544, 777, 815, and 1302 showed conserved substitutions. Ser-14, 183, 555, 665, 667, 704, 723, 820, 852,
1234, 1282, 1295, and 1301 and Thr-713, 1066, and 1221 showed semiconserved substitutions.
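
The conservation percentages reported in the text for these alignments are ratios of per-class counts to the total number of residues of each type. As a small worked sketch (the function name is hypothetical; the Thr counts of 45, 5, and 3 out of 53 are those reported in the text), the shares can be recomputed as follows. Rounding may differ in the last digit from the printed values.

```python
def conservation_shares(counts: dict[str, int], total: int) -> dict[str, float]:
    """Percentage of `total` residues falling in each conservation class."""
    if total <= 0:
        raise ValueError("total must be positive")
    return {label: round(100.0 * n / total, 2) for label, n in counts.items()}


# Thr counts as reported in the text: 45 conserved, 5 conserved
# substitutions, and 3 semiconserved out of 53 residues.
thr = conservation_shares(
    {"conserved": 45, "conserved substitution": 5, "semiconserved": 3}, total=53
)
print(thr)
```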

159 (89.8%) were conserved, 4 (2.25%) were conserved substitutions, and 13 (7.34%) were semiconserved. Of the 53 Thr residues, 45 (84.9%) were conserved, 5 (9.4%) were conserved substitutions, and 3 (5.6%) were semiconserved. Tyr residues showed the highest conservation: out of 37 Tyr residues, 36 (97.29%) were conserved.

3.3. Predicted 3D Structures of Human IRS-1 and IRS-2 Proteins. The 3D structure of a protein plays an important role in determining its precise functions. However, the complete 3D structures of IRS-1 and IRS-2 have not been determined yet. To address this difficulty, we predicted the 3D structures of these proteins by homology and

ab initio modeling. We used many servers to predict the whole structure of both proteins, and I-TASSER was the only server that predicted the complete structures. I-TASSER predicted five models for each 3D structure with minimum z-score. The energy of these structures was minimized using YASARA software. To determine whether these structures truly represent, or are close to, the possible true structures, their geometry was assessed by drawing Ramachandran plots, which are a reliable and suitable means of checking the geometry of a predicted 3D structure. Ramachandran plot analysis determines the phi and psi torsion angles of the amino acid residues and compares them with experimentally verified data as well as the standard geometrical properties of protein 3D structures. Ramachandran plots were constructed for each of the five models predicted by I-TASSER for IRS-1 and for IRS-2. Figures 3(a) and 3(b) show the predicted 3D structures, Ramachandran plot analysis, and geometrical properties of IRS-1 and IRS-2.

On the basis of Ramachandran plot analysis we proposed predicted model #2 (Figure 3(a)) as the possible 3D structure for the IRS-1 protein. In this structure 94.7% of residues were in the Ramachandran allowed region, with only 5.32% outliers. No residues with bad bonds were found, while only 1.06% of residues had bad angles. However, the Cβ deviations for these predicted models were not reported by the server.

Ramachandran plot analysis was also performed for the predicted structures of the IRS-2 protein, as shown in Figure 3(b), and model #1 was selected as the best predicted model. The percentage of compliance for the predicted structure of IRS-2 with the Ramachandran plot indicates that 82.2% (361) of residues reside in the favored region while 94.3% (414) of all residues exist within the allowed region, with only 9 Cβ deviations > 0.25 Å. Only 5.69% of residues exist within the disallowed region of the plot. The 3D structures of IRS-1 and IRS-2 proteins predicted by I-TASSER were submitted to the online Protein Model Database (PMDB; http://bioinformatics.cineca.it/PMDB/). PMDB accession numbers are given next to each respective model in Figures 3(a) and 3(b).

The predicted 3D structures of IRS-1 and IRS-2 were compared with each other by FATCAT and visualized using Chimera, as shown in Figure 4. The IRS-1 structure is shown in red and IRS-2 in green. Both structures were significantly different (P value 9.01E-01; raw score = 251.71). The structure alignment has 546 equivalent positions with an RMSD of 11.03, with five twists. The two structures were found to have 4.02% identity and 8.71% similarity. This information leads to the conclusion that IRS-1 and IRS-2 proteins regulate insulin signaling in different ways.

3.4. Posttranslational Modification Prediction and Analysis

3.4.1. Experimentally Verified Phosphorylation Sites on IRS-1 and IRS-2. Phospho.ELM is a website that contains data about experimentally verified phosphorylated residues and the kinases involved. Human IRS-1 has more than 40 Ser, Thr, and Tyr residues that were found to be phosphorylated experimentally. These include Ser-24, 268, 270, 272, 274, 307, 312, 323, 341, 345, 348, 374, 527, 531, 574, 616, 629, 636, 639, 794, 892, 1078, 1101, 1145, 1222, and 1223; Thr-530 and 1111; and Tyr-46, 612, 662, and 896. The UniProt database also showed phosphorylation potential on Ser-3, 99, 329, and 1223 and Tyr-465, 989, and 1229 on the basis of similarity. Different kinases and groups of kinases such as CK2, PLK1, IKK, PKC, MAPK3, SIK, AMPK, INSR, and RSK1 were involved in phosphorylation of IRS-1 on the same or different residues.

For the human IRS-2 protein, phosphorylation had been verified on Ser-304, 306, 312, 334, 346, 365, 384, 388, 391, 518, 523, 550, 560, 577, 594, 608, 620, 679, 714, 730, 731, 735, 736, 770, 772, 828, 894, 915, 973, 1100, 1103, 1109, 1148, 1149, 1162, 1174, 1176, 1203, and 1283; Thr-350, 363, 520, 527, 713, 777, 779, 1151, 1159, and 1202; and Tyr-75, 184, 653, 675, 823, and 919. The IKK kinase group was involved in phosphorylation, and the UniProt database showed phosphorylation on four more residues, including some listed above, that is, Thr-527, 579, and 580 and Tyr-1253.

3.4.2. Predicted Phosphorylation Sites in IRS-1. To predict phosphorylated residues in human IRS-1, NetPhos was used. Figure 5(a) is a graphical representation of the phosphorylation potential of the Ser, Thr, and Tyr residues in IRS-1. NetPhos predicted phosphorylation on 113 Ser, 13 Thr, and 20 Tyr residues, including all experimentally verified phosphorylation sites except Ser-323, 374, and 794 and Thr-530 and 1111, as shown in Table S1 available online at http://dx.doi.org/10.1155/2014/324753.

3.4.3. Predicted Phosphorylation Sites in IRS-2. As for IRS-1, the NetPhos 2.0 web server was used to predict phosphorylated residues in human IRS-2. Figure 5(b) is a graphical representation of the phosphorylation potential of the Ser, Thr, and Tyr residues in IRS-2. The predicted phosphorylated residues include all experimentally confirmed sites except Ser-312, 388, 518, 679, 714, 722, 730, 772, 828, 1103, and 1179; Thr-350, 527, 579, 713, 1151, and 1202; and Tyr-75, as shown in Table S2.

3.4.4. Predicted and Experimentally Verified Kinases in IRS-1. Various kinases are involved in phosphorylation of human IRS-1. NetPhosK 1.0, Scansite, and KinasePhos 2.0 were used to assess possible kinases acting on IRS-1. The results obtained from these three servers showed the involvement of various kinases in phosphorylation of human IRS-1. GSK3, PKC, and PKA showed maximum involvement in phosphorylation, while other kinases such as PKG, p38MAPK, cdk5, CKII, cdc2, ATM, and CKI also showed potential for phosphorylation of many Ser and Thr residues. INSR, syk, SRC, and insulin receptor kinase are the most frequently predicted kinases for tyrosine phosphorylation, as shown in Table S1.

3.4.5. Predicted and Experimentally Verified Kinases in IRS-2. As for IRS-1, NetPhosK 1.0, Scansite, and KinasePhos 2.0 predicted many kinases involved in phosphorylation of human IRS-2. GSK3, PKC, PKA, 14-3-3 Mode 1, PKG, p38MAPK, cdk5, CKII, cdc2, ATM, and CKI are the most

Figure 3 panels. Each panel shows, for five I-TASSER models, the predicted structure, its Ramachandran plot (Φ versus Ψ, both axes from −180 to 180), and the geometrical analysis of the predicted structure.

(a) Human IRS-1 structures predicted by I-TASSER.
Predicted model #1 (PMDB ID: PM0079784): poor rotamers = 48 (4.37%); Ramachandran outliers = 102 (8.23%); Ramachandran favoured = 987 (79.6%); Ramachandran allowed = 91.8%; Cβ deviations > 0.25 Å = N.A; residues with bad bonds = 2/4967 (0.04%); residues with bad angles = 50/6207 (0.81%).
Predicted model #2 (PMDB ID: PM0079785): poor rotamers = 44 (4.34%); Ramachandran outliers = 66 (5.32%); Ramachandran favoured = 1036 (83.55%); Ramachandran allowed = 94.7%; Cβ deviations > 0.25 Å = N.A; residues with bad bonds = 0/4967 (0.00%); residues with bad angles = 66/6207 (1.06%).
Predicted model #3 (PMDB ID: PM0079786): poor rotamers = 41 (4.04%); Ramachandran outliers = 86 (6.94%); Ramachandran favoured = 999 (80.56%); Ramachandran allowed = 93.1%; Cβ deviations > 0.25 Å = N.A; residues with bad bonds = 1/4967 (0.02%); residues with bad angles = 53/6207 (0.85%).
Predicted model #4 (PMDB ID: PM0079787): poor rotamers = 35 (3.45%); Ramachandran outliers = 148 (11.94%); Ramachandran favoured = 901 (72.6%); Ramachandran allowed = 89.0%; Cβ deviations > 0.25 Å = N.A; residues with bad bonds = 0/4967 (0.00%); residues with bad angles = 106/6207 (1.71%).
Predicted model #5 (PMDB ID: PM0079788): poor rotamers = 39 (3.85%); Ramachandran outliers = 137 (11.05%); Ramachandran favoured = 891 (71.85%); Ramachandran allowed = 91.8%; Cβ deviations > 0.25 Å = N.A; residues with bad bonds = 0/4967 (0.00%); residues with bad angles = 116/6207 (1.87%).

(b) Human IRS-2 structures predicted by I-TASSER.
Predicted model #1 (PMDB ID: PM0079789): poor rotamers = 42 (4.08%); Ramachandran outliers = 109 (8.16%); Ramachandran favoured = 1069 (80.01%); Ramachandran allowed = 91.8%; Cβ deviations > 0.25 Å = 84 (7.12%); residues with bad bonds = 0/5351 (0.00%); residues with bad angles = 80/6687 (1.20%).
Predicted model #2 (PMDB ID: PM0079790): poor rotamers = 50 (4.85%); Ramachandran outliers = 121 (9.06%); Ramachandran favoured = 1029 (77.02%); Ramachandran allowed = 90.9%; Cβ deviations > 0.25 Å = 92 (7.80%); residues with bad bonds = 1/5351 (0.02%); residues with bad angles = 102/6687 (0.81%).
Predicted model #3 (PMDB ID: PM0079791): poor rotamers = 36 (3.50%); Ramachandran outliers = 124 (9.28%); Ramachandran favoured = 1027 (76.8%); Ramachandran allowed = 90.7%; Cβ deviations > 0.25 Å = 77 (6.53%); residues with bad bonds = 2/5351 (0.04%); residues with bad angles = 116/6687 (1.73%).
Predicted model #4 (PMDB ID: PM0079792): poor rotamers = 36 (3.50%); Ramachandran outliers = 186 (13.92%); Ramachandran favoured = 924 (69.16%); Ramachandran allowed = 86.1%; Cβ deviations > 0.25 Å = 65 (5.51%); residues with bad bonds = 0/5351 (0.00%); residues with bad angles = 149/6687 (2.2%).
Predicted model #5 (PMDB ID: PM0079793): poor rotamers = 38 (3.69%); Ramachandran outliers = 109 (8.16%); Ramachandran favoured = 1039 (77.7%); Ramachandran allowed = 91.8%; Cβ deviations > 0.25 Å = 50 (4.24%); residues with bad bonds = 0/5351 (0.00%); residues with bad angles = 78/6687 (1.17%).

Figure 3: Homology models of human IRS-1 and IRS-2 retrieved through I-TASSER. Five models were predicted as possible 3D structures
for both IRS-1 and IRS-2 proteins. The best model for each protein was selected on the basis of Ramachandran plots and their geometrical
configuration. (a) Human IRS-1 full-length predicted 3D structures by I-TASSER with Ramachandran plots and geometry. (b) Human IRS-2
predicted 3D structures by I-TASSER with Ramachandran plots and geometry.
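
The caption states that the best model was chosen from the Ramachandran plots and geometry, but it does not give an explicit ranking rule. As an illustration only, the sketch below assumes a simple rule (highest share of residues in favoured regions, ties broken by fewer outliers) and applies it to the percentages listed in Figure 3. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelStats:
    model: int
    favoured_pct: float  # residues in Ramachandran favoured regions (%)
    outlier_pct: float   # Ramachandran outliers (%)


def best_model(models: list[ModelStats]) -> ModelStats:
    """Pick the model with the highest favoured share; ties go to fewer outliers.

    This ranking rule is an assumption made for illustration only.
    """
    return max(models, key=lambda m: (m.favoured_pct, -m.outlier_pct))


# Values transcribed from Figure 3(a) (IRS-1) and Figure 3(b) (IRS-2).
irs1 = [
    ModelStats(1, 79.6, 8.23),
    ModelStats(2, 83.55, 5.32),
    ModelStats(3, 80.56, 6.94),
    ModelStats(4, 72.6, 11.94),
    ModelStats(5, 71.85, 11.05),
]
irs2 = [
    ModelStats(1, 80.01, 8.16),
    ModelStats(2, 77.02, 9.06),
    ModelStats(3, 76.8, 9.28),
    ModelStats(4, 69.16, 13.92),
    ModelStats(5, 77.7, 8.16),
]
print(best_model(irs1).model, best_model(irs2).model)
```

Under this assumed rule the sketch selects model #2 for IRS-1 and model #1 for IRS-2, which agrees with the models named in the text; other criteria (rotamers, bad angles) could change the ranking.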

Figure 4: Homology models of human IRS-1 and IRS-2 were retrieved utilizing automated protein modelling options through I-TASSER. The best representative 3D models for IRS-1 (in red) and IRS-2 (in green) were aligned using the FATCAT server and viewed in the Chimera software package. The alignment revealed that both proteins are significantly different in structure and have only 4.02% identity and 8.71% similarity.

predicted kinases involved in phosphorylation of many serine and threonine residues in IRS-2. INSR, syk, SRC, and insulin receptor kinase are the most frequently predicted kinases for tyrosine phosphorylation, as shown in Table S2. The intrinsic tyrosine kinase in the β-subunit of IR is known to phosphorylate diverse substrates, including IRS-2 [51].

3.4.6. Prediction of O-β-GlcNAc Modification and Potential Yin Yang Sites for IRS-1. The YinOYang prediction server showed that human IRS-1 has 57 potential sites for O-β-GlcNAc modification (Figure 6(a)). Various researchers have shown that phosphorylated Ser/Thr residues can also undergo O-β-GlcNAc modification when OGT and kinases compete for modification of the same site [52–54], indicating a possibility for crosstalk between O-glycosylation and phosphorylation on these residues. YinOYang 1.2 showed that IRS-1 has high potential for O-glycosylation/phosphorylation interplay (Table 3), and 37 possible yin yang sites were predicted for the crosstalk of phosphorylation and O-β-GlcNAc modification (Figure 6(b)). These sites are Ser-7, 268, 270, 273, 303, 312, 341, 348, 388, 391, 393, 395, 413, 417, 543, 639, 680, 681, 683, 684, 741, 807, 808, 809, 985, 1000, 1005, 1011, 1035, 1036, 1101, 1105, 1213, and 1223 and Thr-305, 847, and 1004 (Table 3). Out of these 37 sites, 7 sites (Ser-7, 680, 681, 683, 684, 1036, and 1213) were labeled as false positive (FP) yin yang sites because they are nonconserved. Meanwhile, some conserved Ser and Thr residues that were not predicted to be O-β-GlcNAc modified, but exhibited very high phosphorylation potential and were close to the threshold value for O-glycosylation, are named false negative (FN) yin yang sites. These residues include Ser-383, 412, and 1037 and Thr-387.

3.4.7. Prediction of O-β-GlcNAc Modification and Potential Yin Yang Sites for IRS-2. According to the YinOYang 1.2 results, IRS-2 had high potential for O-β-GlcNAc modification, and 64 Ser/Thr residues were predicted to be O-glycosylated (Figure 7(a)). As described earlier, these sites can also undergo phosphorylation, which may lead to interplay between O-glycosylation and phosphorylation on these sites. YinOYang 1.2 predicted 40 possible yin yang sites (Figure 7(b)). These sites were Ser-304, 308, 311, 384, 400, 428, 438, 444, 449, 482, 483, 485, 489, 523, 560, 606, 620, 665, 731, 736, 953, 956, 967, 984, 988, 1012, 1022, 1024, 1026, 1027, 1048, 1149, 1162, and 1203 and Thr-443, 777, 844, 965, 1157, and 1159 (Table 4). Ser-162, 342, 357, 447, 714, 848, 946, 966, and 1115 and Thr-713 and 1202 were predicted as FN yin yang sites.

4. Discussion

Insulin is a primary hormone that regulates glucose transport and homeostasis in cell systems through its interaction with its receptors (IR) via tyrosine autophosphorylation of IR. Activated IR subsequently phosphorylates its substrates (IRSs). Insulin receptor substrates 1 and 2 are specific substrates for IR/IGFR and are widely expressed in mammalian tissues. Both IRS 1 and 2 contain phosphotyrosine-binding (PTB) domains. Tyrosine phosphorylation of IRS proteins is well studied, and more than 20 potential tyrosine phosphorylation sites are mapped on IRSs. During insulin signaling, IRS proteins become phosphorylated and activate AKT/PKB/MAPK pathways that regulate insulin mediated metabolic actions and control cell growth and differentiation [10, 20].

Besides Tyr phosphorylation, IRSs are also phosphorylated on Ser/Thr residues, and phosphorylation on these residues may affect downstream insulin signaling. More than 30 Ser/Thr residues on IRS-1 and 50 on IRS-2 have been mapped using different analytical techniques. However, the function of all mapped Ser/Thr residues on IRSs is still undetermined. Our results showed that both IRS-1 and IRS-2 have high potential to be phosphorylated on Ser and Thr residues, and more than 100 Ser/Thr residues have been predicted, including previously mapped residues on IRS 1 and 2, on both N- and C-terminals including the PTB domain (Figures 5(a) and 5(b)). These predicted residues are conserved, as shown in Figures 1 and 2 (multiple alignment of IRS-1 and IRS-2, resp.), and may have potential to modify IRS functioning. Meanwhile, various kinases such as GSK3, PKC, PKA, ERK, mTOR, S6K, PKG, CDK5, CKII, and JNK are involved in Ser/Thr phosphorylation of IRSs. Both ERK and mTOR are experimentally verified kinases that can cause phosphorylation at Ser-616 [55]. Phosphorylation of Ser-312 is mediated by increased activation of kinases downstream of PI3-kinase, comprising S6K1, IKKβ, mTOR, PKC, and JNK [22, 25, 56]. Akt-mTOR activity is highly activated within neurons in many AD cases [57]. S6K activation was also found to be induced in AD, which induces phosphorylation of Ser-636 and 639 of IRS-1 [58]. Ser-312, 616, 636, 639, and 1101 in IRS-1 are experimentally

Figure 5 panels: (a) NetPhos 2.0 predicted phosphorylation sites in sp P35568-I; (b) NetPhos 2.0 predicted phosphorylation sites in sp Q9Y4H2-I. Axes: sequence position (0 to 1200) versus phosphorylation potential (0 to 1). Legend: serine, threonine, tyrosine, threshold.

Figure 5: Graphical representation of the potential Ser, Thr, and Tyr residues for phosphorylation sites in human IRS-1 and IRS-2. (a) Predicted potential sites for phosphorylation on Ser, Thr, and Tyr residues of IRS-1. (b) Predicted potential sites for phosphorylation on Ser, Thr, and Tyr residues of IRS-2. In both panels, the light gray horizontal line shows the threshold for modification potential, and the blue, green, and red vertical lines indicate the potential phosphorylated
Ser, Thr, and Tyr residues, respectively. This graph was generated by NetPhos 2.0 web based program.
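
The threshold line in these plots separates residues that are reported as predicted sites from those that are not. A minimal sketch of that filtering step follows. It assumes a cut-off of 0.5 (commonly cited as the NetPhos default, but treated here as an assumption), and the scores are invented for illustration; they are not NetPhos output.

```python
from typing import NamedTuple


class SiteScore(NamedTuple):
    residue: str   # "S", "T", or "Y"
    position: int  # 1-based position in the protein sequence
    score: float   # predicted phosphorylation potential, 0..1


def predicted_sites(scores: list[SiteScore], threshold: float = 0.5) -> list[SiteScore]:
    """Keep residues whose potential is strictly above the threshold line."""
    return [s for s in scores if s.score > threshold]


# Hypothetical scores for illustration; they are not NetPhos output.
example = [
    SiteScore("S", 312, 0.99),
    SiteScore("S", 323, 0.12),
    SiteScore("T", 530, 0.31),
    SiteScore("Y", 612, 0.87),
]
for site in predicted_sites(example):
    print(f"{site.residue}-{site.position}: {site.score:.2f}")
```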

Figure 6 panels: (a) YinOYang predicted O-β-GlcNAc sites in sp P35568-I; (b) YinOYang predicted O-β-GlcNAc sites in sequence. Axes: sequence position (0 to 1200) versus O-glycosylation potential (0 to 1). Legend: O-GlcNAc potential, threshold; panel (b) also marks Yin-Yang sites.

Figure 6: Graphical representation of the potential Ser and Thr residues for O-𝛽-GlcNAc as well as yin yang sites in human IRS-1. (a) Predicted
potential sites for O-𝛽-GlcNAc modification of Ser and Thr. Green vertical line shows O-𝛽-GlcNAc modification potential of Ser/Thr residues,
while the light blue wavy line indicates the threshold for modification potential. YinOYang 1.2 web based program generated this graph. A
total number of 57 potential Ser/Thr residues showed O-𝛽-GlcNAc modification potential including Ser-7, 268, 270, 273, 295, 303, 312, 341,
348, 388, 391, 393, 394, 395, 412, 413, 417, 440, 541, 543, 624, 639, 641, 680, 681, 682, 683, 684, 741, 807, 808, 809, 917, 985, 1000, 1005, 1011,
1025, 1035, 1036, 1037, 1101, 1105, 1213, and 1223 and Thr-305, 311, 351, 387, 539, 743, 803, 847, 851, 1004, 1030, and 1045. (b) Predicted yin yang
residues for human IRS-1. A total of 37 Ser/Thr residues showed potential for possible yin yang sites for phosphorylation and O-glycosylation
interplay.
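
Conceptually, a yin yang site is a Ser/Thr residue that is predicted both as an O-β-GlcNAc site and as a phosphorylation site. The sketch below expresses that idea as a set intersection. It is an illustration, not the YinOYang server's actual procedure; the function name is hypothetical, the O-GlcNAc entries are a small subset of the Figure 6(a) list, and the phosphorylation set is invented for the example.

```python
def yin_yang_sites(oglcnac: set[tuple[str, int]],
                   phospho: set[tuple[str, int]]) -> list[tuple[str, int]]:
    """Residues predicted for both O-GlcNAc modification and phosphorylation."""
    return sorted(oglcnac & phospho, key=lambda site: site[1])


# Small illustrative subsets. The O-GlcNAc entries appear in the Figure 6(a)
# list; the phosphorylation set is hypothetical, chosen only for this example.
oglcnac_pred = {("Ser", 268), ("Ser", 295), ("Ser", 312), ("Thr", 311)}
phospho_pred = {("Ser", 268), ("Ser", 312), ("Ser", 616), ("Tyr", 612)}
print(yin_yang_sites(oglcnac_pred, phospho_pred))
```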

Figure 7 panels: (a) YinOYang predicted O-β-GlcNAc sites in sp Q9Y4H2-I; (b) YinOYang predicted O-β-GlcNAc sites in sequence. Axes: sequence position (0 to 1200) versus O-glycosylation potential (0 to 1). Legend: O-GlcNAc potential, threshold; panel (b) also marks Yin-Yang sites.

Figure 7: Graphical representation of the potential Ser and Thr residues for O-𝛽-GlcNAc as well as yin yang sites in human IRS-2. (a) Predicted
potential sites for O-𝛽-GlcNAc modification of Ser and Thr. Green vertical line shows O-𝛽-GlcNAc modification potential of Ser/Thr residues,
while the light blue wavy line indicates the threshold for modification potential. YinOYang 1.2 web based program generated this graph. A
total number of 64 potential Ser/Thr residues showed O-𝛽-GlcNAc modification potential including Ser-3, 162, 304, 308, 311, 357, 384, 400,
428, 438, 444, 449, 482, 483, 485, 489, 523, 560, 591, 592, 606, 615, 619, 620, 665, 679, 684, 704, 714, 731, 736, 848, 941, 944, 945, 946, 952,
953, 956, 966, 967, 984, 988, 1012, 1022, 1024, 1026, 1027, 1048, 1103, 1149, 1162, and 1203 and Thr-344, 443, 713, 719, 777, 844, 965, 1157, 1159,
1202, and 1319. (b) Predicted yin yang residues for human IRS-2. A total of 41 Ser/Thr residues showed potential for possible yin yang sites for
phosphorylation and O-glycosylation interplay.
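
Residue lists written in the compact style of these captions ("Ser-3, 162, ... and Thr-344, ...") can be turned into structured data for counting or comparison. The following is a minimal parsing sketch; the function name is hypothetical, and it assumes that every number following a residue prefix belongs to that residue type until the next prefix appears.

```python
import re


def parse_residue_list(text: str) -> dict[str, list[int]]:
    """Parse 'Ser-3, 162 and Thr-344, 443' into {'Ser': [3, 162], 'Thr': [344, 443]}.

    Limitation: any other digits that follow a prefix (for example a
    protein name such as 'IRS-2') would also be read as positions.
    """
    sites: dict[str, list[int]] = {}
    prefixes = list(re.finditer(r"\b(Ser|Thr|Tyr)-", text))
    for i, match in enumerate(prefixes):
        # The segment for this prefix runs up to the next prefix or the end.
        end = prefixes[i + 1].start() if i + 1 < len(prefixes) else len(text)
        positions = [int(n) for n in re.findall(r"\d+", text[match.end():end])]
        sites.setdefault(match.group(1), []).extend(positions)
    return sites


# Short excerpt in the caption's style.
parsed = parse_residue_list("Ser-3, 162, and 1203 and Thr-344, 443, and 1319")
print(parsed)
print(sum(len(v) for v in parsed.values()))
```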

Table 3: Experimental and predicted O-𝛽-GlcNAc and yin yang sites in human IRS-1.

Columns: Substrate (Residue; PS; Conservation; SA); O-β-GlcNAc status (YinOYang 1.2); Yin yang residues
Ser 7 NC E ++ Y(FP)
Ser 268 CS B ++ Y
Ser 270 C E ++ Y
Ser 273 C B + Y
Ser 295 C E + −
Ser 303 CS E + Y
Thr 305 C E + Y
Thr 311 C E + −
Ser 312 C B + Y
Ser 341 SC E + Y
Ser 348 C E + Y
Thr 351 CS E + −
Ser 383 C E − Y(FN)
Thr 387 C E +++ Y(FN)
Ser 388 C E ++ Y
Ser 391 C E ++++ Y
Ser 393 C B + Y
Ser 394 C B + −
Ser 395 C E + Y
Ser 412 C E +++ Y(FN)
Ser 413 C E ++ Y
Ser 417 C E ++ Y
Ser 440 C B ++ −
Thr 539 C E ++ −
Ser 541 C E + −
Ser 543 C E + Y
Ser 624 NC E + −
Ser 639 C E + Y
Ser 641 C E + −
Ser 680 NC E ++ Y(FP)
Ser 681 NC E +++ Y(FP)
Ser 682 NC E ++ −
Ser 683 NC E + Y(FP)
Ser 684 NC E + Y(FP)
Ser 741 C E + Y
Thr 743 C E ++ −
Thr 803 NC E ++ −
Ser 807 C E + Y
Ser 808 C E +++ Y
Ser 809 C E ++ Y
Thr 847 C E + Y
Thr 851 NC E + −
Ser 917 NC E + −
Ser 985 SC E + Y
Ser 1000 SC E + Y
Thr 1004 SC E + Y
Ser 1005 SC E + Y
Ser 1011 CS E ++ Y
Ser 1025 SC E + −
Thr 1030 SC E + −
Ser 1035 SC E + Y
Ser 1036 NC E +++ Y(FP)
Ser 1037 CS E ++++ Y(FN)
Thr 1045 NC E + −
Ser 1101 C E + Y
Ser 1105 C E ++ Y
Ser 1213 NC E + Y(FP)
Ser 1223 C B ++ Y
PS: position; SA: surface accessibility; E: exposed; B: buried; C: conserved; NC: nonconserved; CS: conserved substitutions; SC: semiconserved; Y: predicted as
yin yang sites by YinOYang 1.2; FP: false positive; FN: false negative; −: negative/not determined yet; +: positive; ++: highly positive; +++: very highly positive;
++++: very very highly positive.
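
The FP and FN labels in Table 3 follow the rules given in the text: a predicted yin yang site at a nonconserved position is marked FP, and a conserved residue that was not predicted but has very high phosphorylation potential and an O-GlcNAc score close to the threshold is marked FN. The sketch below encodes these rules under stated assumptions: the function name and boolean flags are hypothetical, every class other than NC is treated as conserved, and no numeric cut-offs are implied because the text does not give them.

```python
def label_site(conservation: str, predicted_yin_yang: bool,
               high_phospho: bool = False,
               near_oglcnac_threshold: bool = False) -> str:
    """Return "Y", "Y(FP)", "Y(FN)", or "-" following the Table 3 legend."""
    if predicted_yin_yang:
        # Predicted sites at nonconserved positions are treated as false positives.
        return "Y(FP)" if conservation == "NC" else "Y"
    # Assumption: C, CS, and SC all count as conserved for the FN rule.
    conserved = conservation != "NC"
    if conserved and high_phospho and near_oglcnac_threshold:
        return "Y(FN)"
    return "-"


# Conservation codes are taken from Table 3; the two boolean flags for the
# unpredicted residue are hypothetical inputs chosen for illustration.
print(label_site("NC", True))               # e.g. Ser 7
print(label_site("C", True))                # e.g. Ser 312
print(label_site("C", False, True, True))   # e.g. Ser 383
print(label_site("NC", False))              # e.g. Ser 624
```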

verified phosphorylation sites. Phosphorylation of Ser-312 causes IRS-1 proteasomal degradation that leads to insulin resistance under both pathological and physiological conditions, while phosphorylation of Ser-616 is thought to inhibit insulin activation of PI3-kinase. Furthermore, phosphorylation of Ser-1101 by S6K1 in the C-terminal region of IRS-1 is thought to be involved in blocking tyrosine as well as AKT phosphorylation, which leads to insulin resistance in mice as well as in humans [21, 23, 25]. GSK-3 and JNK sequentially phosphorylate IRS-2 at serine residues, and this phosphorylation plays an inhibitory role in hepatic insulin signaling [51, 58, 59].

The O-β-GlcNAc modification is known to be influential and analogous to phosphorylation. Some researchers have reported increased phosphorylation and reduced O-GlcNAc levels in AD and diabetes patients [60–62]. These findings make O-glycosylation an important PTM for elucidating the mechanism of such metabolic disorders. In this study we used the YinOYang 1.2 server to predict O-glycosylation on Ser/Thr residues of both IRS-1 and IRS-2, and more than 50 Ser/Thr residues were shown to have high potential for O-β-GlcNAc modification on IRS-1 (Figures 6(a) and 7(a) and Tables 3 and 4). O-β-GlcNAc modification in human IRS-1 has been reported at Ser-984 or Ser-985, at Ser-1011, and possibly at multiple residues within 1025–1045 [33]. Based on our results, Ser-984 is more prone to be O-glycosylated than Ser-985, while Ser-1030, 1035, 1037 and Thr-1045 have the potential to be O-glycosylated within the 1025–1045 sequence on the basis of conservation. Meanwhile, western blotting and site-directed mutagenesis studies in rat suggested Ser-1036 (human numbering Ser-1037) on IRS-1 as a possible site of O-β-GlcNAc modification. Ser-1037 is conserved in human IRS-1, and identification of this site will be helpful in exploring the biological implications of the O-β-GlcNAc modification [33]. Although to date no O-glycosylated residue has been mapped on IRS-2, experimental data showed that IRS-2, like IRS-1, has the potential to be glycosylated, and this glycosylation was shown to be reduced in diabetes and AD [32]. In this study we have predicted more than 65 Ser/Thr residues on IRS-2 that can act as potential O-glycosylated sites, which may help researchers map the correct O-glycosylated residues on IRS-2.

It is an established fact that O-β-GlcNAc can mimic or inhibit phosphorylation on the same or neighboring residues, and when this interplay occurs on the same Ser/Thr residues these sites are called yin yang residues. These yin yang residues can be used as therapeutic targets to overcome different diseases [42, 52]. Protein 3D structures are the best models to tell whether the predicted residues are accessible for these types of interplay [27, 41]. However, complete 3D structures have not yet been determined for either IRS-1 or IRS-2, so we used ab initio models to build complete 3D structures of both proteins. As both IRS-1 and IRS-2 are involved in the insulin signaling mechanism, the best possible 3D structures for both proteins were selected on the basis of Ramachandran plots (Figures 3(a) and 3(b)) and were aligned for comparison (Figure 4). We found that the two proteins have significantly nonidentical structures, as was previously described for their secondary structures. To determine yin yang sites and surface accessibility, we predicted surface accessible residues on the basis of their secondary structure by using NetPhos P server (as given in Tables S1 and S2 for IRS-1 and IRS-2, respectively). The secondary structure based prediction method showed that both proteins have high accessibility for PTMs on Ser/Thr residues, except Ser-268, 273, 312, 393, 394, 440, and 1223 on IRS-1 and Ser-428 and Thr-344 on IRS-2. Although the secondary structure based prediction method showed no or reduced surface accessibility for Ser-312 of IRS-1, the predicted 3D structure revealed that Ser-312 is accessible for PTMs (Figure 8). Moreover, Ser-312 is an experimentally verified and known phosphorylation residue, which confirms the exposed accessibility of this residue. Ser-312 phosphorylation
Table 4: Experimental and predicted O-β-GlcNAc and yin yang sites in human IRS-2.

Column groups: Substrate (Residues, PS, Conservation, SA); O-β-GlcNAc status (YinOYang 1.2, Yin yang residues)
Residues PS Conservation SA YinOYang 1.2 Yin yang residues
Ser 3 C E + −
Ser 162 C E ++ Y(FN)
Ser 304 C E ++ Y
Ser 308 C E + Y
Ser 311 C E + Y
Ser 342 C E − Y(FN)
Thr 344 C B + −
Ser 357 CS E ++ Y(FN)
Ser 384 C E ++++ Y
Ser 400 C E + Y
Ser 428 C B + Y
Ser 438 C E + Y
Thr 443 C E ++++ Y
Ser 444 C E ++ Y
Ser 447 C E − Y(FN)
Ser 449 C E + Y
Ser 482 C E + Y
Ser 483 C E ++++ Y
Ser 485 C E + Y
Ser 489 C E + Y
Ser 523 C E + Y
Ser 560 C E + Y
Ser 591 C E + −
Ser 592 C E + −
Ser 606 C E + Y
Ser 615 C E + −
Ser 619 C E + −
Ser 620 C E + Y
Ser 665 SC E ++ Y
Ser 679 C E + −
Ser 684 C E + −
Ser 704 SC E + −
Thr 713 SC E ++ Y(FN)
Ser 714 C E ++ Y(FN)
Thr 719 C E + −
Ser 731 C E + Y
Ser 736 C E + Y
Thr 777 CS E + Y
Thr 844 C E ++ Y
Ser 848 CS E ++ Y(FN)
Ser 941 C E + −
Ser 944 C E + −
Ser 945 C E + −
Ser 946 C E ++++ Y(FN)
Ser 952 C E + −
Ser 953 C E ++ Y
Ser 956 C E + Y
Thr 965 C E ++ Y
Ser 966 C E +++ Y(FN)
Ser 967 C E + Y
Ser 984 C E + Y
Ser 988 C E + Y
Ser 1012 C E + Y
Ser 1022 C E +++ Y
Ser 1024 C E ++ Y
Ser 1026 C E + Y
Ser 1027 C E ++ Y
Ser 1048 CS E + Y
Ser 1103 C E + −
Ser 1115 C E − Y(FN)
Ser 1149 C E + Y
Thr 1157 C E + Y
Thr 1159 C E + Y
Ser 1162 C E ++ Y
Thr 1202 C E ++ Y(FN)
Ser 1203 C E ++ Y
Thr 1319 C B + −
PS: position; SA: surface accessibility; E: exposed; B: buried; C: conserved; CS: conserved substitutions; SC: semiconserved; Y: predicted as yin yang sites by
YinOYang 1.2; FN: false negative; −: negative/not determined yet; +: positive; ++: highly positive; +++: very highly positive; ++++: very very highly positive.

is important in insulin resistance and negatively regulates insulin signaling. Interestingly, Ser-312 is situated very near the PTB domain (Figure 8), and any change in its vicinity via PTMs should alter the surrounding areas [21]. Furthermore, phosphorylation on three important Ser residues, 984, 1037, and 1101, in the C-terminal region of IRS-1 has also been shown to induce insulin resistance [33]. All these residues were conserved, accessible for such modifications based on the 3D structure, and predicted as yin yang residues in our study. This finding suggests that alternative O-glycosylation on these residues can inhibit the functional changes that arise from phosphorylation.

Although many phosphorylation sites mapped on IRS-2, such as Ser-303, 343, 675, and 907, are thought to be involved in inducing insulin resistance, we were unable to predict O-glycosylation on these residues due to the limited availability of experimental data. On the other hand, we found more than 40 Ser/Thr residues on IRS-2 that have high potential to act as possible yin yang sites (Table 4), and many of them may act as therapeutic targets in future.

Here we propose that alternative O-glycosylation on Ser-312 at the N-terminal and Ser-984, Ser-1037, and Ser-1101 at the C-terminal of IRS-1 is an important modification that becomes reduced in diabetes and AD due to impaired glucose metabolism. Under normal circumstances, O-glycosylation can block phosphorylation on these residues, and these residues can act as possible therapeutic sites to reduce the risk of diabetes and AD.

Figure 8: Space-filled model of human IRS-1 protein showing higher surface accessibility of possible therapeutic yin yang residues for interplay between phosphorylation and O-glycosylation. Blue and green colours denote the Ser and Thr residues, respectively. Labeled residues: Ser-312, Ser-984, Ser-1037, and Ser-1101.

Conflict of Interests

The authors have no conflict of interests regarding the publication of this paper.

Authors' Contribution

Zainab Jahangir and Waqar Ahmad contributed equally to this work.

References

[1] D. J. Selkoe, “Alzheimer’s disease: genotypes, phenotype, and treatments,” Science, vol. 275, no. 5300, pp. 630–631, 1997.
[2] Alzheimer’s Association, “2012 Alzheimer’s disease facts and figures,” Alzheimer’s & Dementia, vol. 8, no. 2, pp. 131–168, 2012.
[3] W. H. Gispen and G.-J. Biessels, “Cognition and synaptic plasticity in diabetes mellitus,” Trends in Neurosciences, vol. 23, no. 11, pp. 542–549, 2000.
[4] S. Hoyer, “Is sporadic Alzheimer disease the brain type of non-insulin dependent diabetes mellitus? A challenging hypothesis,” Journal of Neural Transmission, vol. 105, no. 4-5, pp. 415–422, 1998.
[5] S. M. de la Monte and J. R. Wands, “Alzheimer’s disease is type 3 diabetes-evidence reviewed,” Journal of Diabetes Science and Technology, vol. 2, no. 6, pp. 1101–1113, 2008.
[6] W. Ahmad, “Overlapped metabolic and therapeutic links between Alzheimer and diabetes,” Molecular Neurobiology, vol. 47, no. 1, pp. 399–424, 2013.
[7] M.-K. Sun and D. L. Alkon, “Links between Alzheimer’s disease and diabetes,” Drugs of Today, vol. 42, no. 7, pp. 481–489, 2006.
[8] Y. Deng, B. Li, Y. Liu, K. Iqbal, I. Grundke-Iqbal, and C.-X. Gong, “Dysregulation of insulin signaling, glucose transporters, O-GlcNAcylation, and phosphorylation of tau and neurofilaments in the brain: implication for Alzheimer’s disease,” The American Journal of Pathology, vol. 175, no. 5, pp. 2089–2098, 2009.
[9] Y. D. Ke, F. Delerue, A. Gladbach, J. Götz, and L. M. Ittner, “Experimental diabetes mellitus exacerbates Tau pathology in a transgenic mouse model of Alzheimer’s disease,” PLoS ONE, vol. 4, no. 11, Article ID e7917, 2009.
[10] G. Sesti, M. Federici, M. L. Hribal, D. Lauro, P. Sbraccia, and R. Lauro, “Defects of the insulin receptor substrate (IRS) system in human metabolic disorders,” FASEB Journal, vol. 15, no. 12, pp. 2099–2111, 2001.
[11] D. J. Withers, D. J. Burks, H. H. Towery, S. L. Altamuro, C. L. Flint, and M. F. White, “Irs-2 coordinates Igf-1 receptor-mediated β-cell development and peripheral insulin signalling,” Nature Genetics, vol. 23, no. 1, pp. 32–40, 1999.
[12] D. J. Withers, J. S. Gutierrez, H. Towery et al., “Disruption of IRS-2 causes type 2 diabetes in mice,” Nature, vol. 391, no. 6670, pp. 900–904, 1998.
[13] C. Selman, S. Lingard, A. I. Choudhury et al., “Evidence for lifespan extension and delayed age-related biomarkers in insulin receptor substrate 1 null mice,” FASEB Journal, vol. 22, no. 3, pp. 807–818, 2008.
[14] A. Taguchi, L. M. Wartschow, and M. F. White, “Brain IRS2 signaling coordinates life span and nutrient homeostasis,” Science, vol. 317, no. 5836, pp. 369–372, 2007.
[15] M. Schubert, D. P. Brazil, D. J. Burks et al., “Insulin receptor substrate-2 deficiency impairs brain growth and promotes tau phosphorylation,” The Journal of Neuroscience, vol. 23, no. 18, pp. 7084–7092, 2003.
[16] A. M. Moloney, R. J. Griffin, S. Timmons, R. O’Connor, R. Ravid, and C. O’Neill, “Defects in IGF-1 receptor, insulin receptor and IRS-1/2 in Alzheimer’s disease indicate possible resistance to IGF-1 and insulin signalling,” Neurobiology of Aging, vol. 31, no. 2, pp. 224–243, 2010.
[17] R. K. Dearth, X. Cui, H.-J. Kim, D. L. Hadsell, and A. V. Lee, “Oncogenic transformation by the signaling adaptor proteins insulin receptor substrate (IRS)-1 and IRS-2,” Cell Cycle, vol. 6, no. 6, pp. 705–713, 2007.
[18] M. Mann and O. N. Jensen, “Proteomic analysis of post-translational modifications,” Nature Biotechnology, vol. 21, no. 3, pp. 255–261, 2003.
[19] S. A. Summers, “Ceramides in insulin resistance and lipotoxicity,” Progress in Lipid Research, vol. 45, no. 1, pp. 42–72, 2006.
[20] G. Sesti, “Pathophysiology of insulin resistance,” Best Practice and Research: Clinical Endocrinology and Metabolism, vol. 20, no. 4, pp. 665–679, 2006.
[21] M. W. Greene, H. Sakaue, L. Wang, D. R. Alessi, and R. A. Roth, “Modulation of insulin-stimulated degradation of human insulin receptor substrate-1 by serine 312 phosphorylation,” The Journal of Biological Chemistry, vol. 278, no. 10, pp. 8199–8211, 2003.
[22] R. J. Griffin, A. Moloney, M. Kelliher et al., “Activation of Akt/PKB, increased phosphorylation of Akt substrates and loss and altered distribution of Akt and PTEN are features of Alzheimer’s disease pathology,” Journal of Neurochemistry, vol. 93, no. 1, pp. 105–117, 2005.
[23] L. Rui, V. Aguirre, J. K. Kim et al., “Insulin/IGF-1 and TNF-α stimulate phosphorylation of IRS-1 at inhibitory Ser307 via distinct pathways,” The Journal of Clinical Investigation, vol. 107, no. 2, pp. 181–189, 2001.
[24] Y. Zick, “Ser/Thr phosphorylation of IRS proteins: a molecular basis for insulin resistance,” Science’s STKE, vol. 2005, no. 268, p. pe4, 2005.
[25] H. Yu and T. Rohan, “Role of the insulin-like growth factor family in cancer development and progression,” Journal of the National Cancer Institute, vol. 92, no. 18, pp. 1472–1489, 2000.
[26] S. M. Firth and R. C. Baxter, “Cellular actions of the insulin-like growth factor binding proteins,” Endocrine Reviews, vol. 23, no. 6, pp. 824–854, 2002.
[27] W. Ahmad, K. Shabbiri, B. Ijaz et al., “Serine 204 phosphorylation and O-GlcNAC interplay of IGFBP-6 as therapeutic indicator to regulate IGF-II functions in viral mediated hepatocellular carcinoma,” Virology Journal, vol. 8, article 208, 2011.
[28] C. H. Wu, R. Apweiler, A. Bairoch et al., “The Universal Protein Resource (UniProt): an expanding universe of protein information,” Nucleic Acids Research, vol. 34, pp. D187–D191, 2006.
[29] N. E. Zachara and G. W. Hart, “The emerging significance of O-GlcNAc in cellular regulation,” Chemical Reviews, vol. 102, no. 2, pp. 431–438, 2002.
[30] M. F. White, “Regulating insulin signaling and β-cell function through IRS proteins,” Canadian Journal of Physiology and Pharmacology, vol. 84, no. 7, pp. 725–737, 2006.
[31] K. F. Petersen and G. I. Shulman, “Etiology of insulin resistance,” American Journal of Medicine, vol. 119, no. 5, 2006.
[32] C. D’Alessandris, F. Andreozzi, M. Federici et al., “Increased O-glycosylation of insulin signaling proteins results in their impaired activation and enhanced susceptibility to apoptosis in pancreatic β-cells,” The FASEB Journal, vol. 18, no. 9, pp. 959–961, 2004.
[33] A. L. Kleini, M. N. Berkaw, M. G. Buse, and L. E. Ball, “O-linked N-acetylglucosamine modification of insulin receptor substrate-1 occurs in close proximity to multiple SH2 domain binding motifs,” Molecular and Cellular Proteomics, vol. 8, no. 12, pp. 2733–2745, 2009.
[34] L. E. Ball, M. N. Berkaw, and M. G. Buse, “Identification of the major site of O-linked β-N-acetylglucosamine modification in the C terminus of insulin receptor substrate-1,” Molecular and Cellular Proteomics, vol. 5, no. 2, pp. 313–323, 2006.
[35] N. Eswar, B. Webb, M. A. Marti-Renom et al., “Unit 5.6. Chapter 5. Comparative protein structure modeling using modeller,” in Current Protocols in Bioinformatics, 2006.
[36] Y. Zhang, “I-TASSER server for protein 3D structure prediction,” BMC Bioinformatics, vol. 9, article 40, 2008.
[37] K. Arnold, L. Bordoli, J. Kopp, and T. Schwede, “The SWISS-MODEL workspace: a web-based environment for protein structure homology modelling,” Bioinformatics, vol. 22, no. 2, pp. 195–201, 2006.
[38] I. W. Davis, L. W. Murray, J. S. Richardson, and D. C. Richardson, “MolProbity: structure validation and all-atom contact analysis for nucleic acids and their complexes,” Nucleic Acids Research, vol. 32, pp. W615–W619, 2004.
[39] E. F. Pettersen, T. D. Goddard, C. C. Huang et al., “UCSF Chimera—a visualization system for exploratory research and analysis,” Journal of Computational Chemistry, vol. 25, no. 13, pp. 1605–1612, 2004.
[40] P. Baldi and S. Brunak, Bioinformatics: The Machine Learning Approach, MIT Press, Cambridge, Mass, USA, 2nd edition, 2001.
[41] W. Ahmad, K. Shabbiri, B. Ijaz et al., “Claudin-1 required for HCV virus entry has high potential for phosphorylation and O-glycosylation,” Virology Journal, vol. 8, article 229, 2011.
[42] W. Ahmad, K. Shabbiri, N. Nazar, S. Nazar, S. Qaiser, and M. A. S. Mughal, “Human linker histones: interplay between phosphorylation and O-β-GlcNAc to mediate chromatin structural modifications,” Cell Division, vol. 6, article 15, 2011.
[43] N. Blom, S. Gammeltoft, and S. Brunak, “Sequence and structure-based prediction of eukaryotic protein phosphorylation sites,” Journal of Molecular Biology, vol. 294, no. 5, pp. 1351–1362, 1999.
[44] Z. Songyang, S. Blechner, N. Hoagland, M. F. Hoekstra, H. Piwnica-Worms, and L. C. Cantley, “Use of an oriented peptide library to determine the optimal substrates of protein kinases,” Current Biology, vol. 4, no. 11, pp. 973–982, 1994.
[45] N. Blom, T. Sicheritz-Pontén, R. Gupta, S. Gammeltoft, and S. Brunak, “Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence,” Proteomics, vol. 4, no. 6, pp. 1633–1649, 2004.
[46] H.-D. Huang, T.-Y. Lee, S.-W. Tzeng, and J.-T. Horng, “KinasePhos: a web tool for identifying protein kinase-specific phosphorylation sites,” Nucleic Acids Research, vol. 33, no. 2, pp. W226–W229, 2005.
[47] F. Diella, S. Cameron, C. Gemünd et al., “Phospho.ELM: a database of experimentally verified phosphorylation sites in eukaryotic proteins,” BMC Bioinformatics, vol. 5, article 79, 2004.
[48] R. Gupta and S. Brunak, “Prediction of glycosylation across the human proteome and the correlation to protein function,” Pacific Symposium on Biocomputing, pp. 310–322, 2002.
[49] B. Petersen, T. N. Petersen, P. Andersen, M. Nielsen, and C. Lundegaard, “A generic method for assignment of reliability scores applied to solvent accessibility predictions,” BMC Structural Biology, vol. 9, article 51, 2009.
[50] J. C. Obenauer, L. C. Cantley, and M. B. Yaffe, “Scansite 2.0: proteome-wide prediction of cell signalling interactions using short sequence motifs,” Nucleic Acids Research, vol. 31, no. 13, pp. 3635–3641, 2003.
[51] H. Sharfi and H. Eldar-Finkelman, “Sequential phosphorylation of insulin receptor substrate-2 by glycogen synthase kinase-3 and c-Jun NH2-terminal kinase plays a role in hepatic insulin signaling,” American Journal of Physiology: Endocrinology and Metabolism, vol. 294, no. 2, pp. E307–E315, 2008.
[52] I. Ahmad, D. C. Hoessli, E. Walker-Nasir, S. M. Rafik, and A. R. Shakoori, “Oct-2 DNA binding transcription factor: functional consequences of phosphorylation and glycosylation,” Nucleic Acids Research, vol. 34, no. 1, pp. 175–184, 2006.
[53] A. Kaleem, D. C. Hoessli, I. Ahmad et al., “Immediate-early gene regulation by interplay between different post-translational modifications on human histone H3,” Journal of Cellular Biochemistry, vol. 103, no. 3, pp. 835–851, 2008.
[54] A. Kaleem, D. C. Hoessli, I.-U. Haq et al., “CREB in long-term potentiation in hippocampus: role of post-translational modifications-studies in silico,” Journal of Cellular Biochemistry, vol. 112, no. 1, pp. 138–146, 2011.
[55] Z. Gao, D. Hwang, F. Bataille et al., “Serine phosphorylation of insulin receptor substrate 1 by inhibitor κB kinase complex,” Journal of Biological Chemistry, vol. 277, no. 50, pp. 48115–48121, 2002.
[56] T. Haruta, T. Uno, J. Kawahara et al., “A rapamycin-sensitive pathway down-regulates insulin signaling via phosphorylation and proteasomal degradation of insulin receptor substrate-1,” Molecular Endocrinology, vol. 14, no. 6, pp. 783–794, 2000.
[57] W.-L. An, R. F. Cowburn, L. Li et al., “Up-regulation of phosphorylated/activated p70 S6 kinase and its relationship to neurofibrillary pathology in Alzheimer’s disease,” The American Journal of Pathology, vol. 163, no. 2, pp. 591–607, 2003.
[58] S. S. Neukamm, R. Toth, N. Morrice et al., “Identification of the Amino acids 300-600 of IRS-2 as 14-3-3 binding region with the importance of IGF-1/insulin-regulated phosphorylation of Ser-573,” PLoS ONE, vol. 7, no. 8, Article ID e43296, 2012.
[59] R. Zhande, W. Zhang, Y. Zheng et al., “Dephosphorylation by default, a potential mechanism for regulation of insulin receptor substrate-1/2, Akt, and ERK1/2,” The Journal of Biological Chemistry, vol. 281, no. 51, pp. 39071–39080, 2006.
[60] F. Liu, J. Shi, H. Tanimukai et al., “Reduced O-GlcNAcylation links lower brain glucose metabolism and tau pathology in Alzheimer’s disease,” Brain, vol. 132, no. 7, pp. 1820–1832, 2009.
[61] Y. Yu, L. Zhang, X. Li et al., “Differential effects of an O-GlcNAcase inhibitor on tau phosphorylation,” PLoS ONE, vol. 7, no. 4, Article ID e35277, 2012.
[62] S. A. Yuzwa, X. Shan, M. S. MacAuley et al., “Increasing O-GlcNAc slows neurodegeneration and stabilizes tau against aggregation,” Nature Chemical Biology, vol. 8, no. 4, pp. 393–399, 2012.