Dissertation Report

On
“STRUCTURE BASED DRUG DESIGN AND COMPUTATIONAL
ANALYSIS FOR MISFOLDING OF HUNTINGTON’S DISEASE
FRAGMENT”

Submitted to

Amity University Uttar Pradesh


In partial fulfilment of the requirements for the award of the degree
B. Tech Biotechnology

By
Ojas Singh
AMITY INSTITUTE OF
BIOTECHNOLOGY
AMITY UNIVERSITY UTTAR PRADESH

under the guidance of

Dr. Dibyakanti Mandal


AMITY UNIVERSITY UTTAR PRADESH

Signature of the Convener


CERTIFICATE

On the basis of the declaration submitted by Ojas Singh, a student of B.Tech
Biotechnology, I hereby certify that the project titled “STRUCTURE BASED DRUG
DESIGN AND COMPUTATIONAL ANALYSIS FOR MISFOLDING OF
HUNTINGTON’S DISEASE FRAGMENT”, which is submitted to Amity University Uttar
Pradesh in partial fulfilment of the requirement for the award of the degree of B.Tech
Biotechnology, is an original contribution to existing knowledge and a faithful record of
work carried out by him under my guidance and supervision. To the best of my knowledge,
this work has not been submitted in part or full for any Degree or Diploma to this University
or elsewhere.

Noida

Date

Signature of Internal faculty Guide

Dr. Dibyakanti Mandal

PLAGIARISM CERTIFICATE

This is to certify that the thesis entitled “STRUCTURE BASED DRUG DESIGN AND
COMPUTATIONAL ANALYSIS FOR MISFOLDING OF HUNTINGTON’S
DISEASE FRAGMENT” submitted by for the partial fulfilment of the degree of has been
checked by Turnitin software for plagiarism. The thesis has 3% plagiarism.

Signature of the IFC Signature of the NTCC Co-ordinator

Plag report

DECLARATION

I, Ojas Singh, student of B.Tech Biotechnology, hereby declare that the project titled
“STRUCTURE BASED DRUG DESIGN AND COMPUTATIONAL ANALYSIS FOR
MISFOLDING OF HUNTINGTON’S DISEASE FRAGMENT”, which is
submitted by me to Amity Institute of Biotechnology, Amity University Uttar
Pradesh, in partial fulfilment of the requirement for the award of the degree of B.Tech
Biotechnology, has not previously formed the basis for the award of any degree,
diploma or other similar title or recognition.

Date Name and Signature of Student

ACKNOWLEDGEMENT

It is a high privilege for me to express my deep sense of gratitude to all those
who helped me in the completion of the project “STRUCTURE BASED DRUG DESIGN AND
COMPUTATIONAL ANALYSIS FOR MISFOLDING OF HUNTINGTON’S
DISEASE FRAGMENT” under the supervision of my internal guide, Dr. Dibyakanti Mandal.

From the moment I started my internship, Dr. Dibyakanti Mandal provided me with clear
direction and expectations, and she was always available to answer my questions and
provide valuable feedback. Their expertise and guidance helped me to understand the inner
workings of the company and the industry and allowed me to make the most of my
internship.

My special thanks to all other members & seniors of the Amity University, Noida. I am
thankful for their time and effort, and for the invaluable knowledge and skills I have gained
during my internship.

Ojas Singh
A0504120094
(Amity University)

TABLE OF CONTENTS

S. No. Contents Page No.

0 ABSTRACT 10

1 INTRODUCTION 11-13

2 OBJECTIVES 14

3 REVIEW OF LITERATURE 15-21

4 MATERIALS AND METHODS 22-24

5 RESULTS AND DISCUSSION 25-47

6 CONCLUSION 48-49

7 REFERENCES 50-53

LIST OF FIGURES
S. No. Figure caption
3.1 Central Role Of Neurons
3.2 Huntington Pathogenesis Mechanism
3.3 MRI Scan Image Analysis Of Huntington Patient
5.1 ERRAT Result
5.2 Ramachandran Plot
5.3 3D Structure In Rasmol
5.4 Representing Hydrogen Bond
5.5 BLAST Description
5.6 BLAST Description
5.7 BLAST Description
5.8 Interpro Scan
5.9 COBALT Output
5.10 Membrane Preference In Cobalt
5.11 Clustal Omega
5.12 Protein Chain With Different Colors
5.13 Protein-Protein Interaction
5.14 Measurement Of SO4 In Pymol Interaction
5.15 Active Site Identification
5.16 Prepared Protein Sample
5.17 Docking Result
5.18 Docking Result
5.19 Structural Diversity In Protein
5.20 Analysis Of Structure Diversity
5.21 Architecture Of The Protein Sample
5.22 Graphical Representation
5.23 Specific Restriction Enzyme
5.24 Pathway Analysis
5.25 String Enrichment Analysis Of Chain A
5.26 String Enrichment Analysis Of Chain B

5.27 Sequence Identity Of PLAT With Chain B
5.28 Representation Of “Nodes” By Using String Analysis
5.29 Representation Of “Edges” By Using String Analysis

LIST OF TABLES
S. No. Table Caption
5.1 Docking Score

ABSTRACT
Huntington's disease (HD) is a devastating, dominantly inherited genetic disorder caused primarily
by the expansion of CAG repeats in exon 1 of the huntingtin (HTT) gene on chromosome 4.
This expansion results in mutant forms of the huntingtin protein (m-HTT)
with abnormally long polyglutamine tracts, which initiate disease progression. The first 17
amino acid residues of huntingtin [HTT(1-17)] are the binding target of a single-chain variable
fragment (scFv) known as C4, which protects against various pathological manifestations both in
laboratory experiments and in living organisms. A recent study showed that C4 scFv effectively
blocks the formation of amyloid structures by exon 1 fragments of huntingtin under laboratory
conditions. The study also explored the structural mechanisms underlying this inhibition and
protection by analyzing the crystal structure of the C4 scFv:HTT(1-17) complex. Residues 3-11
of the peptide form an amphipathic helix that binds to the
antibody fragment. This binding occurs within a dimeric C4 scFv:HTT(1-17) complex,
involving the hydrophobic surface of the helix and a β-sheet interface. Further elucidation
through high-resolution NMR and physicochemical analysis in solution provides deeper insight into how
C4 scFv prevents HTT aggregation, demonstrating its potential as a therapeutic
candidate.
Furthermore, computational analysis was conducted to assess the structural properties, total
protein atoms, domains, and functions of protein samples of huntingtin homologs related to a
protein sequence of interest. This was followed by an interactive examination of
sequence-structure relationships, active sites, and bound chemicals. Active sites were identified
from the PDB file in PyMOL. Drug design was performed to study the interaction
of Huntington’s disease receptors with associated ligands. The quality of the protein
structure was estimated with the PROCHECK and ERRAT servers and a Ramachandran plot, which
identifies the backbone conformations, and hence the secondary structures, that the protein can adopt.
Clustal Omega and InterProScan were used for sequence alignment, domain annotation, and
generation of a phylogenetic tree. Computer-aided drug design (CADD) was used to dock the
appropriate ligand to the binding site of the protein. Finally, STRING was used to determine
potential protein-protein interactions.
Key Words: Huntington's disease, Drug designing, Quality estimation of protein, protein-
protein interaction, structural analysis, phylogenetic relationships.

CHAPTER 1
INTRODUCTION

Neurodegenerative diseases represent a formidable challenge in modern medicine, posing
significant burdens on healthcare systems and society (Mukherjee S et al., 2020). These
conditions, characterized by progressive degeneration of neuronal structures and function,
often lead to debilitating symptoms and profound declines in quality of life. Among the myriad
neurodegenerative disorders, Huntington's disease (HD) stands out as a particularly devastating
condition, encompassing a complex interplay of genetic, molecular, and cellular abnormalities
(Ganesh S et al., 2023). In addition to Huntington's disease, neurodegenerative disorders such
as Alzheimer's disease, Parkinson's disease, amyotrophic lateral sclerosis (ALS), and
frontotemporal dementia (FTD) contribute to the global burden of neurological morbidity and
mortality (Ciurea AV et al., 2023). While each of these diseases has distinct clinical
manifestations and underlying pathological mechanisms, they share common features,
including protein misfolding, aggregation, and neuronal dysfunction (Labbadia J et al., 2013).

Huntington's disease (HD) stands as one of the most challenging neurodegenerative disorders,
characterized by a devastating triad of motor dysfunction, cognitive decline, and psychiatric
instability (Bano D et al., 2011). This hereditary condition exerts a profound impact on affected
individuals and their families, often manifesting in mid-adulthood and progressing relentlessly
over the course of 15 to 20 years until death. With an estimated prevalence of 5-10 cases per
100,000 individuals worldwide, HD represents a significant burden on healthcare systems and
society at large (Conway R., 2016).

At the heart of Huntington's disease lies a complex interplay of genetic mutations and
molecular aberrations, prominently involving the misfolding and aggregation of the huntingtin
(HTT) protein within neurons of the central nervous system (Gandhi J et al., 2019). The
causative genetic defect in HD is an expansion of CAG trinucleotide repeats within the HTT
gene, leading to an elongated polyglutamine (polyQ) stretch in the HTT protein (Stoyas CA et
al., 2018). This expanded polyQ tract confers a propensity for HTT protein misfolding,
aggregation into insoluble protein aggregates, and subsequent neuronal dysfunction and
degeneration (Tabrizi SJ et al., 2020). Despite decades of intensive research, effective disease-
modifying therapies for Huntington's disease remain elusive, underscoring the urgent need for
innovative therapeutic strategies that address the underlying molecular pathology (Akyol S et
al., 2023). While symptomatic treatments aimed at managing motor and psychiatric symptoms
exist, they provide only limited relief and do not alter the course of disease progression (Novak
MJ et al., 2011).
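
The relationship between CAG repeat expansion and the polyQ tract described above can be illustrated with a minimal Python sketch. It counts the longest uninterrupted run of CAG codons in a DNA string and compares it with a cutoff. The function names and the default threshold of 36 repeats are assumptions introduced here for illustration; they are not taken from this report, and the sketch is not a diagnostic tool.

```python
import re


def longest_cag_run(dna: str) -> int:
    """Return the number of repeats in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


def polyq_tract(n_repeats: int) -> str:
    """Each CAG codon encodes one glutamine (Q) residue."""
    return "Q" * n_repeats


def is_expanded(n_repeats: int, threshold: int = 36) -> bool:
    """Compare a repeat count with an assumed cutoff (hypothetical default)."""
    return n_repeats >= threshold


if __name__ == "__main__":
    # Toy sequence: five CAG repeats, an interruption, then two more.
    toy = "ATG" + "CAG" * 5 + "CCG" + "CAG" * 2
    n = longest_cag_run(toy)
    print(n, polyq_tract(n), is_expanded(n))
```

The run is counted on the string as given, so a real analysis would first need to locate exon 1 and the correct reading frame.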

In recent years, there has been a paradigm shift in the approach to drug discovery and
therapeutic intervention in neurodegenerative diseases, including Huntington's disease.
Increasingly, researchers are turning to computational analysis and structure-based drug design
as powerful tools to unravel the intricate molecular pathways underlying protein misfolding
and aggregation and identify novel therapeutic targets (Dunkel P et al., 2012).

Computational analysis plays a pivotal role in elucidating the structural determinants of HTT
misfolding and aggregation (Louros N et al., 2023). Molecular dynamics simulations, for
example, allow researchers to explore the conformational landscape of mutant HTT protein
and investigate how changes in protein structure contribute to pathogenicity. By simulating the
behavior of atoms and molecules over time, these simulations provide valuable insights into
the stability, flexibility, and interactions of mutant HTT protein, shedding light on its propensity
for aggregation and toxicity (Moldovean SN et al., 2019). Protein-ligand docking studies
represent another computational approach with significant implications for Huntington's
disease research. By computationally modeling the binding interactions between small
molecules and mutant HTT protein, researchers can identify potential drug candidates capable
of modulating protein misfolding and mitigating neurodegeneration (Khan MQ et al., 2023).
Structure-based virtual screening further enables the rapid identification of novel compounds
with favorable binding affinities and pharmacological properties, expediting the drug discovery
process (Lavecchia et al., 2013). In addition to experimental methods, computational structure-
based analysis plays a crucial role in understanding HD pathogenesis. Molecular modeling
methods, such as homology modeling and molecular dynamics simulations, allow researchers to
predict the three-dimensional structure of mutant HTT protein and investigate its dynamic
behavior at the atomic level. By simulating the folding and unfolding of mutant HTT protein,
these computational approaches provide insights into the stability, flexibility, and aggregation
propensity of the protein, aiding in the elucidation of its pathogenic mechanisms (Shafie et al.,
2024).

Through a multidisciplinary approach that integrates computational modeling, bioinformatics,
and structural biology techniques, this thesis seeks to shed light on the complex molecular
landscape of Huntington's disease (Moldovean SN et al., 2019). By elucidating the structural
determinants of protein misfolding and aggregation and identifying potential drug targets, this
research aims to pave the way for the design and optimization of novel small-molecule
therapeutics aimed at modulating protein misfolding and attenuating neurodegeneration in HD
(Khan MQ et al., 2023).

This research endeavors to explore the intersection of computational analysis and structure-
based drug design in the context of Huntington's disease, with a specific focus on understanding
the misfolding dynamics of key fragments of the HTT protein. By leveraging advanced
computational methodologies, including molecular dynamics simulations, protein-ligand
docking studies, and structure-based virtual screening, this research aims to elucidate the
structural basis of HTT misfolding and identify promising therapeutic targets for intervention.
The introductory sections of this research will provide a comprehensive overview of
Huntington's disease, encompassing its clinical manifestations, genetic etiology, and the
molecular mechanisms underlying protein misfolding and aggregation. Subsequent chapters
will delve into the principles of computational analysis and structure-based drug design,
elucidating their significance in rational drug discovery and the development of precision
therapeutics targeting HD. Ultimately, the insights gleaned from this research hold the potential
to catalyze transformative advancements in the treatment of Huntington's disease. By
harnessing the power of computational analysis and structure-based drug design, this thesis
aims to offer new avenues for therapeutic intervention and provide hope to the millions of
individuals worldwide affected by this devastating neurological disorder.

CHAPTER 2
OBJECTIVES

1. Computational analysis of a human Huntington's disease protein complex as the biological
sample.

2. To analyze the structural properties, total protein atoms, domains, and function of a protein
sample of Huntington's disease.

3. To use a macromolecular database to discover three-dimensional structures corresponding to
homologs of a protein sequence of interest, then engage interactively to examine the
relationships between sequence and structure, identify active sites, and explore bound
chemicals.
4. To apply a computational alignment method to calculate all possible parameters using a
biological database.
5. To use a computational analysis program to generate 2-D representations of protein-ligand
complexes and to identify active sites from standard Protein Data Bank file input
(PyMOL).
6. To perform molecular docking to study the interaction of the Huntington's disease
protein and associated ligands (docking server).
7. Quality estimation of protein structure and Ramachandran plot analysis; alignment with a
program that uses a seeded guide tree and a hidden Markov model (HMM) profile technique.
8. To analyse DNA sequences and predict potential restriction enzyme cleavage sites.
9. Analysis of the classification of protein domain data and the folding pattern of the protein
structure; enrichment analysis of multiple proteins at a time using the Search Tool for the
Retrieval of Interacting Genes/Proteins (STRING).
10. Connecting the biological sample to Gene Ontology and KEGG pathway networks.

CHAPTER 3
REVIEW OF LITERATURE
Neurodegenerative Disease:
Neurons play a central role in communication with other organs and are
irreplaceable for the normal functioning of the human brain (Fisher ER et al., 2014; Squitieri
F, 2016). Although they originate in the brain, they are present throughout the human
body (Kempermann G., 2006; Pino A., 2017). Neural stem cells produce many neurons during
childhood, but their number starts to decline as age progresses (Ganat Y.M. et al., 2017). Therefore,
any loss or degeneration of neurons, their structure, or their function drives
neurodegeneration, which is central to the pathophysiology of various neurodegenerative diseases
(Przedborski S. et al., 2003). These debilitating and incurable diseases are difficult for
the scientific community to understand because many genetic and
environmental factors are involved. Considerable evidence suggests that these disorders arise from
multiple factors: defects in protein folding, degradation, and
aggregation that result in abnormal protein function, or exposure to various pesticides
and metal toxicity. Although some success has been achieved with surgical and highly
invasive methods, their clinical approval rate is quite low because of the potential damage to the
blood-brain barrier. To halt or reverse neurodegeneration, several
nanotherapeutics capable of crossing the blood-brain barrier without damaging it have
been reported (Harilal S. et al., 2019; Hinge NS et al., 2022).
Some of the most common NDDs are Alzheimer's disease, Parkinson's disease, amyotrophic
lateral sclerosis, and Huntington's disease.

Figure 3.1: Representation of the central role of neurons in neurodegenerative diseases (NDDs).

Types of NDD-
• Alzheimer-
It is one of the most common NDDs and progresses with age; dementia (memory loss) is
considered an early sign of the disease. Amyloid protein is deposited
in the form of plaques, together with neurofibrillary tangles composed of tau
protein in the form of helical filaments, causing degeneration and
death of cerebral neurons. Normally, the brain can prevent protein aggregation, but it
may lose this capability with age, resulting in the accumulation of protein and
miscommunication between neurons. When these plaques accumulate in the brain,
various neurological symptoms are observed, such as memory loss, motor neuron
dysfunction, and other mental dysfunctions. The disease is not only incurable;
its molecular origin also remains unknown. Some recent scientific studies have proposed
neurotactin as a treatment to slow down or stop the progression of
the disease, but its use in humans still has a long way to go. Currently, there is no cure for
this disease, but effective treatment management and medication might slow the progression of
symptoms (Bayer T.A. et al., 2010; Pickett E.K. et al., 2019).

• Parkinson’s-
It is a disorder that destroys the part of the brain responsible for controlling various bodily
movements. The most commonly affected neurons are the dopamine-producing neurons
located deep in the brain, in the basal ganglia and the substantia nigra. For
reasons that remain unknown, the dopamine-producing neurons in the
substantia nigra begin to die; when more than 85% are lost, patients may experience symptoms such as
tremors, which usually start in one hand, followed by stiffness and balance-related
impairments. Why these neurons start to die is still unknown,
but scientists are investigating various genetic and environmental factors that
might trigger the disease. No currently available medication cures the disease, but
several drugs on the market might slow down the symptoms. To date,
self-management is considered the best approach for this neurological
disability (Harilal S. et al., 2019; Hinge NS et al., 2022).

• Amyotrophic Lateral Sclerosis-
This disorder, often termed motor neuron disease or Lou Gehrig's disease, is characterized
by the progressive damage and gradual deterioration of neurons, ultimately leading to their
demise. Consequently, the brain struggles to interpret signals, causing disruptions in
communication and resulting in muscle weakness and paralysis. Currently, researchers
have identified mutations in 30 different genes associated with the disease. However, in
70% of cases, mutations occur in C9orf72, TARDBP, SOD1, and FUS. These genes play
vital roles in various biological functions such as DNA repair, cellular
homeostasis, mitochondrial function, and glial cell function. The accumulation of
intraneuronal aggregates is considered a hallmark of the disease. Yet, it remains unclear
whether these protein complexes or aggregates precede neuron destruction or vice versa
(Hardiman O. et al., 2017).

• Huntington- This disorder is considered one of the most complex and severe
disorders. Like other NDDs, it is characterized by the progressive loss of neurons,
mainly affecting the striatum region of the brain and leading to motor
dysfunction. It is a genetically determined disorder with
autosomal dominant inheritance, caused by expanded CAG trinucleotide repeats (of variable length) in
HTT (the gene encoding huntingtin). In a person carrying the mutation, the abnormal
length of the polyglutamine tract results in a toxic gain of function, and the protein is eventually
fragmented (Pringsheim T. et al., 2012). More recent studies highlight an
increase in prevalence in some regions. Although there are plausible explanations for
variations and increased rates across populations, such as the identification of
different HTT gene haplotypes, healthcare accessibility, differing attitudes toward illness-related
stigma, migration, and the identification of HD cluster regions, specific
determinants remain to be elucidated (Evans SJ et al., 2013; Baine FK et al., 2016; Fisher
ER et al., 2014; Squitieri F et al., 2016).

Huntington's disease (HD) pathogenesis mechanism-

Figure 3.2: Schematic representation of Huntington pathogenesis mechanism.

Huntington's disease causes a wide variety of symptoms, which differ from person to person. These
symptoms may be dominant in some patients, impairing functional abilities, while in others they may
not have the same effect.
The disease progresses through several stages-
• Preclinical Stage- This is the stage where damage is still at the cellular level and is not easily
detectable. Patients might not feel any symptoms at all.
In the 10-15 years before symptoms appear, some cognitive & behavioral changes might be
seen, such as depression, difficulty in learning new things & poor coordination (Tang C. et
al., 2012).
• Juvenile HD – This form appears at approx. 15 years of age, when the symptoms are
more visible & severe. The Unified Huntington's Disease Rating Scale (UHDRS) is a tool used for
measuring the progression of the disease. This scale takes into account
cognitive changes, motor function coordination, and emotional and functional ability (Evans SJ
et al., 2013).
• Early Stage- It occurs in a patient's 20s-30s. Although cellular damage occurs in the
preclinical stage, symptoms are not yet prominent there. The early stage (stage I) is the first stage in which
symptoms become more prominent than in the preclinical stage & can last up to 8 years.
Symptoms of stage I include loss of coordination, trouble with complex movements,
involuntary twitches, difficulty in problem-solving, etc. (Baine FK et al., 2016).

• Early Intermediate Stage- This is stage II, where symptoms become more noticeable and
start to affect daily life. This can last up to 3-13 years. While symptoms may be the same
as in stage I the severity increases. Complex movements might become more difficult and
more noticeable changes in involuntary movements can be seen (Fisher ER et al., 2014).
• Late Intermediate Stage- This stage is sometimes referred to as stage III, now the
symptoms start to affect an individual’s daily activities. Symptoms may include severe
chorea, difficulty walking, frequent falls, concentration issues, difficulty organizing
thoughts, etc (Squitieri F et al., 2016).
• Early Advanced Stage- This stage might begin almost after a decade of onset of symptoms
but can range from 9-21 years after the initial stage of the disease. Sometimes referred to
as stage IV, this time people start to require more advanced care. Symptoms might include
dyslexia, severe cognitive loss, rigidity, abnormal movements, etc (Squitieri F et al., 2016).
• Advanced Stage- Sometimes referred to as stage V people at this stage need nursing care
round the clock, movement is severely affected, and even the most basic motor functions
are affected. Symptoms include- being completely bedridden, unable to eat independently
or swallow, unable to communicate, etc (Baine FK et al., 2016).

Figure 3.3: MRI scan Image analysis of Huntington Patient.

Huntington's Disease targets therapy for pathogenesis- Although the genetic origin of HD is well
known, the various molecular alterations reported are not
completely understood. Although toxicity in HD is known to arise from a dominant gain-of-function
mutation in the mutant protein and the toxic expression of the expanded polyglutamine
tract, a role for deletion or inactivation of wild-type huntingtin cannot be completely
discarded, as it also leads to neurodegeneration (Shafie et al., 2024).
Diagnosis- For the diagnosis of Huntington's disease it is necessary to know the family
history, as the disease is autosomal dominant. For clinical trial criteria, motor alteration remains the key
(Liu, D. et al., 2015). In a patient experiencing chorea, HD is strongly suspected.
Alternative approaches include neuroimaging with a CT scan or MRI, which reveals atrophy of
the caudate nucleus. A decrease in striatal metabolic rate can be shown by PET scanning
(Versluis M. et al., 2012; Kloeppel, S. et al., 2009). Another reliable technique is genetic
testing for clinical diagnosis (Nold 2017; Shin, H. et al., 2013).
Treatment- Various classes of medications are employed in the treatment of different aspects
of neurological disorders. For managing chorea, medications like Tetrabenazine and
Deutetrabenazine are utilized. These drugs work by depleting central monoamines through
reversible inhibition of VMAT2, albeit Deutetrabenazine has a longer half-life compared to
Tetrabenazine. Antipsychotic medications such as Olanzapine and Risperidone play a role in
controlling symptoms by targeting various receptors. Olanzapine inhibits dopamine, serotonin,
histamine, α1-adrenergic, and muscarinic receptors, while Risperidone selectively inhibits
serotonin and dopamine-D2 receptors. Antidepressants belonging to the SSRIs (Selective
Serotonin Reuptake Inhibitors) class, including Citalopram, Fluoxetine, and Sertraline,
function by inhibiting the reuptake of serotonin (5-HT) into the presynaptic nerve terminal,
thus increasing its availability in the synaptic cleft (Harilal S.et al., 2019).
In the realm of mood stabilizers, Lamotrigine and Carbamazepine are commonly prescribed.
Lamotrigine exerts its effect by blocking voltage-gated sodium ion channels and also
suppresses excitatory neurotransmitters like glutamate and aspartate. On the other hand,
Carbamazepine primarily functions through the blockade of voltage-gated sodium ion
channels. These medications offer diverse mechanisms of action, targeting different
neurotransmitter systems and pathways, to manage symptoms associated with neurological
disorders effectively (Liu, D. et al., 2015).

Surgical Treatment:
Another option for treating the symptoms is surgical intervention through deep
brain stimulation (DBS), typically used to treat or improve chorea symptoms. One study found that
DBS is effective in reducing chorea in pharmacologically resistant patients (Gonzalez et al.,
2014).
Potential Future Therapeutic Options for Huntington’s Disease:
Antisense oligonucleotide therapy (ASO):
It holds significant potential in HD therapeutics, with many ASOs currently in the trial phase.
ASOs are single-stranded oligonucleotide analogues that bind to pre-mRNA or mRNA & can act through
multiple mechanisms, such as translation inhibition, RNA degradation, and modulation of the
splicing mechanism, which eventually alter protein expression. An ASO can target mHTT in
an allele-specific manner or both wild-type and mutant HTT in a non-specific manner. Three ASOs are currently
in clinical trials: Tominersen (an allele non-specific ASO), WVE-120101, and WVE-120102
(allele-specific ASOs) (Gonzalez et al., 2014).
Current treatment of HD:
There is currently no disease-modifying drug for Huntington's disease.
Treatment is only symptomatic. Tetrabenazine is the only
known drug licensed in the UK for the treatment of choreiform movements. Trials
of cholinesterase inhibitors, which are used to treat cognitive problems seen in Alzheimer's disease, have
been largely negative in HD (Cubo et al., 2006).

CHAPTER 4
MATERIALS AND METHODS

1. Data Acquisition and Verification:

Initiate the study by accessing the Protein Data Bank (PDB) using the unique identifier
(PDB ID: 4RAV) to retrieve the target biological sample, ensuring the integrity and
authenticity of the structural data. Conduct a comprehensive validation process by
cross-referencing the obtained sample with the Molecular Modeling Database
(MMDB), prioritizing structures with high experimental reliability and quality. Utilize
the PDBsum web server to procure detailed overviews and pictorial analyses of the
sample's macromolecular structure, encompassing critical structural features, chain
composition, and intricate interactions within the biomolecular assembly.
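
As a minimal sketch of the retrieval step, the Python snippet below builds a download address for a PDB entry and counts ATOM records per chain in PDB-format text. The URL template follows the RCSB file-download pattern and is an assumption here, as are the function names; the report itself does not state how the file was fetched. The chain identifier is read from column 22 of each ATOM record, as in the fixed-column PDB format.

```python
from collections import Counter

# Assumed RCSB download pattern; not stated in the report.
RCSB_URL_TEMPLATE = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_download_url(pdb_id: str) -> str:
    """Build the (assumed) download URL for a PDB entry such as 4RAV."""
    return RCSB_URL_TEMPLATE.format(pdb_id=pdb_id.upper())


def count_atoms_per_chain(pdb_text: str) -> Counter:
    """Count ATOM records per chain ID (column 22 of the PDB format)."""
    counts = Counter()
    for line in pdb_text.splitlines():
        # HETATM records (ligands, ions, waters) are deliberately skipped.
        if line.startswith("ATOM") and len(line) > 21:
            counts[line[21]] += 1
    return counts


if __name__ == "__main__":
    print(pdb_download_url("4rav"))
```

Once the file has been downloaded by any means, passing its text to `count_atoms_per_chain` gives a quick cross-check against the chain composition reported by PDBsum.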

2. Three-Dimensional Analysis:

Embark on a meticulous three-dimensional analysis of the sample employing RasMol,
a robust molecular visualization tool known for its versatility in depicting complex
molecular structures. Delve into the atomic composition of the sample, meticulously
identifying and scrutinizing hydrogen bond interactions, spatial arrangement, and van
der Waals interactions to elucidate the structural intricacies and functional significance
of the macromolecular assembly.
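
A simple distance criterion is one common way to flag candidate hydrogen bonds between a donor and an acceptor atom. The sketch below illustrates the idea in Python; the 3.5 Å cutoff and the function name are assumptions chosen for illustration, and RasMol may apply different or additional geometric criteria.

```python
import math
from typing import Sequence


def is_potential_hbond(donor: Sequence[float],
                       acceptor: Sequence[float],
                       cutoff: float = 3.5) -> bool:
    """Flag a donor-acceptor pair whose separation is within the cutoff (in Å).

    The cutoff is an assumed, commonly used heuristic; angle criteria
    are ignored in this sketch.
    """
    return math.dist(donor, acceptor) <= cutoff


if __name__ == "__main__":
    # Toy coordinates in Å, not taken from any structure.
    print(is_potential_hbond((0.0, 0.0, 0.0), (2.9, 0.0, 0.0)))
```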

3. Sequence Analysis and Alignment:

Initiate a comprehensive sequence analysis to discern the underlying genetic
information encoded within the sample by employing BLAST against relevant
biological sequence databases. Proceed with multiple sequence alignment using
COBALT, leveraging a constraint-based approach to unveil conserved domains,
evolutionary relationships, and potential functional motifs within the sample,
facilitating a deeper understanding of its biological significance.
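
Alignment tools such as BLAST and COBALT report sequence identity between aligned sequences. The sketch below shows one way such a figure can be computed from two already-aligned strings. It is an illustration only: definitions of percent identity vary between tools (for example in how gap columns are treated), and this version, with its hypothetical function name, simply ignores any column containing a gap.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Toy aligned strings, not real database hits.
    print(round(percent_identity("MATLEK", "MATLQK"), 1))
```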

4. Visualization and Attribute Manipulation:

Leverage the advanced visualization capabilities of PyMOL to further dissect and
manipulate the attributes of the sample, enabling the identification of key structural
features and active drug-target sites critical for therapeutic intervention and drug design
endeavours.

5. Protein-Ligand Docking:

Conduct sophisticated protein-ligand docking studies utilizing the CB-Dock2 server to
explore potential therapeutic avenues, with a specific focus on addressing Huntington's
disease. Employ computational algorithms to predict the binding affinity and mode of
ligands to the target protein, providing valuable insights into the feasibility and efficacy
of prospective drug candidates.
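
Docking output is typically compared by ranking the predicted poses. The sketch below assumes the server reports Vina-style scores in kcal/mol, where a more negative value indicates stronger predicted binding; the function name and the example values are hypothetical placeholders, not results from this study.

```python
from typing import List, Tuple


def rank_poses(poses: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Sort (label, score) pairs so the most negative score comes first."""
    return sorted(poses, key=lambda pose: pose[1])


if __name__ == "__main__":
    # Placeholder labels and scores for illustration only.
    example = [("cavity_1", -6.2), ("cavity_2", -7.8), ("cavity_3", -5.1)]
    for label, score in rank_poses(example):
        print(label, score)
```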

6. Structural Quality Assessment:

Validate the structural integrity and reliability of the sample through a rigorous
assessment process using computational tools available on the SAVES server. Employ
state-of-the-art methods such as ERRAT and PROCHECK to evaluate key structural
parameters, including stereochemical correctness and Ramachandran plot analysis,
ensuring the robustness of the structural model.
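
A Ramachandran plot is built from the backbone dihedral angles phi (atoms C of residue i-1, N, CA, C of residue i) and psi (N, CA, C of residue i, N of residue i+1). The sketch below computes a dihedral angle from four atomic coordinates using only the standard library. It illustrates the geometry and is not the procedure used internally by PROCHECK; the function names are introduced here for illustration.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _scale(a: Vec, s: float) -> Vec:
    return (a[0] * s, a[1] * s, a[2] * s)


def dihedral_degrees(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Dihedral angle (degrees) about the p1-p2 bond for four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))  # unit vector along the bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Toy coordinates: the fourth point is rotated a quarter turn about the bond.
    print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applying the function to each residue's backbone atoms yields the (phi, psi) pairs that are plotted against the allowed regions.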

7. Structural Classification with CATH:

Incorporate the CATH (Class, Architecture, Topology, Homology) database into the
methodology to provide a comprehensive structural classification of the protein sample
obtained from PDB ID: 4RAV. Utilize the CATH database to classify the protein
structure into hierarchical levels based on its overall fold, architecture, topology, and
homologous relationships with other protein structures. By leveraging CATH,
researchers can gain deeper insights into the evolutionary relationships, structural
motifs, and functional implications of the protein sample within the broader context of
protein structure classification. This structural classification augments the
understanding of the protein's biological significance and its potential role in disease
mechanisms, including Huntington's disease pathology. Integrating CATH analysis
into the methodology enhances the structural characterization of the protein sample,
providing valuable information for further elucidating its functional properties and
therapeutic relevance.

8. DNA Analysis with Nebcutter:

Integrate Nebcutter into the methodology to analyze the DNA sequence associated with
the protein sample. Although primarily used for DNA sequence analysis and predicting
restriction enzyme cleavage sites, Nebcutter's adaptation allows for the assessment of
DNA characteristics relevant to the protein sample. Utilizing Nebcutter, investigate
features such as circularity, base pair count, and nucleotide composition, including GC
content and AT content. This unconventional application of Nebcutter provides
additional insights into the genetic organization and regulatory elements associated
with the protein sample's DNA sequence, complementing the structural and functional
analyses conducted using other bioinformatics tools.

9. Pathway Analysis:

Employ advanced bioinformatics techniques to elucidate the functional significance of
the sample within the context of biological pathways by conducting multiple sequence
alignment and motif analysis. Utilize the STRING database to unravel intricate protein-
protein interactions and connect the biological sample to relevant KEGG pathway
networks, providing crucial insights into the underlying molecular mechanisms of
Huntington's disease pathology and potential therapeutic targets.

By meticulously following this detailed methodology, researchers can unravel the complex
interplay between structure and function in biological macromolecules, offering profound
insights into disease mechanisms and facilitating the development of novel therapeutic
strategies to combat debilitating conditions such as Huntington's disease.

CHAPTER 5
RESULTS AND DISCUSSION
RESULTS:
5.1 Validation of structure (4rav) using server tools:
First, we verified whether the 4RAV structure is of sufficient quality for further analysis.
For this, we used ERRAT and PROCHECK, both available on the SAVES server (model validation).

ERRAT (Quality estimation of protein):

Figure 5.1: ERRAT result showing the distribution of residues across error and non-error
regions, with an overall quality factor of 96.270.
In the ERRAT plot, black bars identify regions whose error value exceeds the 99% confidence
limit (poorly modelled regions, which here lie away from the active site), gray bars mark
regions with error values between the 95% and 99% limits, and white bars indicate regions with
a lower error value. ERRAT evaluates the agreement between the observed distribution of
non-bonded atomic interactions and the distribution expected from high-resolution protein
structures. Regions with significant deviations from the expected distribution indicate
potential errors in the model, such as incorrect amino acid assignments or structural
distortions. The overall quality factor is the percentage of the protein for which the
calculated error value falls below the 95% rejection limit, so a value of 96.270 supports the
use of this structure in the subsequent analyses.

PROCHECK Server (model validation):

Figure 5.2: Validation of protein structure using Ramachandran plot of PROCHECK analysis
(4RAV).
The validation of protein structures is essential for ensuring their accuracy and reliability in
various biological applications. One widely used method for structural validation is the
Procheck analysis, which evaluates the conformational quality of protein models based on their
agreement with known structural principles. In the provided results, the Procheck analysis
indicates that a significant majority of residues, 89.8%, fall within the most favored regions of
the Ramachandran Plot. These regions represent the energetically favorable conformations for
amino acid residues in a protein structure, indicating that the model conforms well to expected
stereochemical constraints. Additionally, the Procheck analysis identifies smaller percentages
of residues in other regions of the Ramachandran Plot. These include 8.4% in additional
allowed regions, 1.2% in generously allowed regions, and only 0.5% in disallowed regions.
Residues in additional allowed and generously allowed regions may deviate slightly from the
most favored conformations but are still within acceptable ranges, while those in disallowed
regions indicate potential structural irregularities or errors. Overall, the distribution of residues
across these regions provides valuable insights into the overall quality and reliability of the
protein model.

5.2 Three-Dimensional Analysis:

Figure 5.3: Representing protein sample(4rav) in ribbon format 3D structure in RasMol.

Utilizing RasMol, the protein sample 4RAV is depicted in ribbon format, offering a captivating
portrayal of its three-dimensional structure. This ribbon representation elegantly highlights the
protein's secondary structural elements, such as alpha helices and beta strands, providing
insights into its overall fold and topology. Beyond mere visualization, RasMol enables users to
explore the protein's structural features dynamically, facilitating deeper analysis of its
conformational dynamics and functional attributes. Through its intuitive interface and powerful
analytical capabilities, RasMol serves as a gateway to unraveling the intricacies of
biomolecular architecture, offering a holistic understanding of the protein's structure-function
relationships and biological significance.

Figure 5.4: Representing hydrogen bond with a dotted line in the sample (4rav).

In the visualization of the protein sample 4RAV using RasMol, the depiction of hydrogen
bonds as dotted lines serves as a crucial element in elucidating the molecular interactions within
the protein structure. These hydrogen bonds, essential for stabilizing the protein's secondary
and tertiary structure, are drawn as dotted lines connecting donor and acceptor atoms. By highlighting
these interactions, RasMol provides valuable insights into the spatial arrangement and stability
of the protein, shedding light on its functional properties and potential binding sites. Through
the visualization of hydrogen bonds, RasMol enhances our understanding of the intricate
network of molecular interactions that govern protein folding and dynamics, paving the way
for deeper insights into its biological function and therapeutic potential.

5.3 Sequence Analysis and Alignment:


Sequence Analysis:

Figure 5.5: Representing BLAST output of Chain A, B showing sequence similarity of chain
A with the synthetic construct of anti-Huntington single chain.

Figure 5.6: Representing sequence similarity BLAST output of Chain B, D showing 100%
sequence identity with synthetic construct anti-Huntington intrabody single chain Fv
antibody.

Figure 5.7: Representing BLAST output of Chain E,F

Representing the BLAST output of Chain A, B, C, D, E, and F from the protein sample 4RAV
involves visualizing the sequence similarity between all the chains with the synthetic construct
of Huntington intrabody single chain antibody. This visualization typically entails aligning the
amino acid sequences of all the Chains and the synthetic construct using bioinformatics tools
such as Biopython or specialized software. By annotating regions of similarity, amino acid
substitutions, and sequence gaps, researchers can gain insights into the evolutionary
relationship and functional conservation between all the Chains and the synthetic construct of
Huntington single chain. This representation aids in interpreting the BLAST results and
facilitates further analysis of structural and functional implications arising from the observed
sequence similarity.

InterProScan is a tool used to consolidate predictive information about protein function
from various collaborating resources. It offers an overview of the families a protein belongs
to, along with the domains and sites it contains. Using the InterPro protein signature
databases, the tool scans biological sequences for matches. Interpreting these matches allows
protein sequences to be classified into families and pinpoints functionally significant
domains and conserved sites. InterProScan is the underlying software that allows both protein
and nucleic acid sequences to be searched against the InterPro database.

Figure 5.8: Representing amino acid sequence, domain, homologous superfamily and
unintegrated light chain giving an overview of the protein family and domain.

Utilizing InterProScan, the representation of amino acid sequences, domains, homologous
superfamilies, and unintegrated light chains offers a comprehensive overview of the protein
family and its domain architecture. This multifaceted analysis integrates diverse bioinformatics
tools to identify conserved domains, infer evolutionary relationships within superfamilies, and
uncover potential functional motifs within the protein sequence. By elucidating the structural
and functional characteristics of the protein family, InterProScan facilitates deeper insights into
its biological roles and regulatory mechanisms. Furthermore, this comprehensive analysis aids
in understanding the molecular basis of disease, including conditions like Huntington's disease,
by highlighting key domains and motifs associated with pathogenesis.

Alignment:

COBALT, a constraint-based multiple sequence alignment tool, was used to evaluate mismatches
(in red), matches (in gray), and membrane preference.

Figure 5.9: COBALT output in panorama view; a large proportion of the positions are
mismatches, coloured red.

In the COBALT output, the panorama view depicts the alignment between two sequences, with
mismatches highlighted in red and regions of sequence match coloured in gray. The gray
shading represents positions where the sequences exhibit a high degree of similarity, indicating
conserved residues between the two sequences. This measurement of sequence match in gray
signifies regions where amino acids are aligned with little to no variation, suggesting functional
or structural importance due to evolutionary conservation. By visually emphasizing these
conserved regions, the COBALT output aids in identifying potential functional domains,
evolutionary relationships, and critical residues within the aligned sequences, facilitating
deeper insights into their biological significance and potential functional roles.

Figure 5.10: Membrane preference in COBALT (red for low membrane preference and green for
high membrane preference).
In the COBALT output, membrane preference is represented by colour gradients ranging from
red for low membrane preference to green for high membrane preference. This colour scheme
highlights regions within the protein sequence that are predicted to exhibit varying degrees of
affinity or propensity for membrane interaction. Regions coloured in red indicate a lower
likelihood of membrane association, while those in green suggest a higher propensity for
membrane interaction. This visualization aids in identifying potential membrane-binding
domains or regions within the protein sequence, providing valuable insights into its cellular
localization, structural organization, and functional properties within the context of cellular
membranes.

Figure 5.11: Colour-coded amino acids in the Clustal Omega alignment.
In Clustal Omega, amino acids are colour-coded by physicochemical class for easier
visualization: AVFPMILW are coloured red, indicating small and hydrophobic residues
(including the aromatic residues except tyrosine, Y); DE residues are depicted in blue,
signifying acidic properties; and RK residues appear magenta, signifying basic properties
(histidine is not part of this group). STYHCNGQ residues are green, representing amino
acids containing hydroxyl, sulfhydryl, or amine groups, together with glycine. Any other
symbols are shown in gray, denoting unusual amino acids. This colour-coded representation
facilitates the analysis of sequence alignments and aids in identifying conserved regions,
functional motifs, and structural features within protein sequences.

5.4 Visualization and Attribute Manipulation:

PyMol software was used to determine the active site for ligand binding with the sample chain,
its RMSD score, and the potential binding site for the ligand molecule.

Figure 5.12: Representing each chain with different colors Chain A(yellow), Chain
B(Green), Chain C(Blue), Chain D(Orange), Chain E(gray), Chain F(red).

In PyMOL, the visualization of the protein sample 4RAV involves assigning distinct colors to
each chain to enhance clarity and facilitate comprehensive analysis. Chain A is designated a
bright yellow color, while Chain B is depicted in a vibrant green hue. Chain C adopts a tranquil
blue tone, and Chain D stands out with an attention-grabbing orange shade. Chain E assumes
a subtle gray color, and Chain F is represented in a bold red hue. By employing this color-
coded approach within PyMol’s user-friendly interface, researchers can effortlessly distinguish
individual chains within the protein structure, allowing for detailed examination of their spatial
orientation, interactions, and functional significance. This visual representation empowers
researchers to gain deeper insights into the complex architecture and dynamics of the protein
sample, facilitating a comprehensive understanding of its biological function and potential
therapeutic implications.

Figure 5.13: Representing alignment of 4rav chain & 2ld0 in PyMol RMSD = 3.870.

In PyMOL, the chains of the protein samples 4RAV and 2LD0 were superimposed to assess their
structural similarities and differences. The quality of the superposition is quantified by the
RMSD (Root Mean Square Deviation), which reflects the average distance between equivalent
atoms in the two aligned structures. Here, aligning Chain E of 4RAV (coloured gray) with Chain
A of 2LD0 (coloured cyan blue) yields an RMSD of 3.870 Å. A lower RMSD indicates a higher
degree of structural similarity; values below about 2 Å are generally taken to indicate closely
similar folds, so 3.870 Å points to a moderate similarity with noticeable local differences
between the two chains. During the alignment, PyMOL superimposes the backbone atoms of the
two chains and minimizes their structural deviation to achieve the best possible overlap. In
the figure, the two chains are shown in their respective colours (gray for 4RAV and cyan blue
for 2LD0), with structural deviations highlighted in purple. The purple regions mark areas
where the two chains diverge structurally, potentially reflecting differences in their amino
acid sequences or conformational flexibility. Overall, the alignment of 4RAV Chain E and 2LD0
Chain A gives both a visual and a quantitative picture of their structural relationship and
provides a basis for comparative analyses and further investigation of their functional
implications.
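
As a minimal sketch of what the RMSD value measures, the Python snippet below computes the
root mean square deviation between two sets of paired atomic coordinates that are assumed to
be already superimposed. The function name `rmsd` and the toy coordinates are hypothetical and
are not taken from 4RAV or 2LD0; PyMOL performs the superposition itself before reporting its
value.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical, already-superimposed coordinates in angstroms (illustration only).
chain_e = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
chain_a = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
example_rmsd = rmsd(chain_e, chain_a)
print(f"RMSD = {example_rmsd:.3f}")
```

Because every atom pair in this toy example is displaced by the same distance, the RMSD equals
that distance; in a real superposition the value summarizes many unequal deviations.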

Figure 5.14: Measurement of SO4 interactions with the protein sample in PyMOL.

In PyMOL, measuring the interaction of sulphate ions (SO4) with the protein sample 4RAV
involves selecting the sulphate ions and protein residues of interest, measuring the distances
between them using PyMol’s distance measurement tool, and analyzing these distances to
assess the strength and nature of the interactions. Visualizing these interactions allows
researchers to identify potential binding sites and understand the spatial relationship between
the sulphate ions and protein residues. Additionally, computational methods can be employed
to further analyze the binding affinity and energetics of the interactions, providing
comprehensive insights into the sulphate-protein interactions within the 4RAV structure.
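
The distance-based reasoning above can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the atom labels, the coordinates, and the 3.5 Å cutoff
are hypothetical choices for the example, not measurements from the 4RAV structure, and the
helper `contacts_within` is not a PyMOL function.

```python
import math


def contacts_within(ion_atom, protein_atoms, cutoff=3.5):
    """Return (label, distance) pairs for protein atoms within `cutoff` of the ion atom."""
    close = []
    for label, coord in protein_atoms.items():
        distance = math.dist(ion_atom, coord)
        if distance <= cutoff:
            close.append((label, round(distance, 2)))
    # Nearest atoms first; ties keep their original order.
    return sorted(close, key=lambda item: item[1])


# Hypothetical coordinates in angstroms (illustration only).
sulphate_s = (0.0, 0.0, 0.0)
nearby_atoms = {
    "ARG45 NH1": (3.0, 0.0, 0.0),
    "LYS12 NZ": (0.0, 4.0, 0.0),
    "SER30 OG": (1.0, 2.0, 2.0),
}
contacts = contacts_within(sulphate_s, nearby_atoms)
print(contacts)
```

A list like this only flags atoms that are close enough to interact; judging the strength or
nature of an interaction still requires inspecting the geometry and chemistry of each contact.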

Figure 5.15: Active site identification of the protein 4RAV by using Python-based software.

Identifying the active site of protein 4RAV with Python-based software involves loading the
protein structure using BioPython, performing structural analyses to pinpoint potential ligand-
binding regions, and employing computational algorithms to predict binding sites.
Visualization of predicted sites in PyMOL allows for validation against experimental data,
followed by functional annotation to characterize their roles. Further analyses, including
molecular docking simulations, offer insights into ligand-protein interactions. This
comprehensive approach facilitates the elucidation of 4RAV's functional landscape and
informs future studies in drug design and protein engineering.

Figure 5.16: Prepared Protein sample for protein-ligand interaction.


Next, NCBI BLAST was used to assess sequence similarity among the chains of the protein and
to identify which huntingtin chain is similar to our sample, 4RAV. The algorithm compares a
nucleotide or protein query sequence against the sequences in a database and assesses
statistical significance, reporting a bit score and an E value for each alignment. A lower
E value, closer to zero, signifies a more meaningful match, and sequences with 95% to 100%
identity are considered significantly similar. Molecular docking was then performed using the
CB-Dock2 server, which employs an algorithm to predict the 3D structure of the protein-ligand
complex and evaluate the binding affinity to the target protein. Its scoring function (Vina
score) ranks the potential binding poses of ligand molecules on the predicted structure, which
helps in screening large libraries of small-molecule compounds to identify potential drug
candidates and can eventually accelerate the drug discovery process.
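
To make the percent identity figure concrete, the following Python sketch counts identical
positions in a pairwise alignment and divides by the alignment length. The function name
`percent_identity` and the two short sequences are hypothetical; they are not the BLAST
alignments of the 4RAV chains, and BLAST itself also reports scores and E values that this
sketch does not attempt to reproduce.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # a column with gaps in both sequences carries no information
        columns += 1
        if res_a == res_b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment contains no comparable columns")
    return 100.0 * matches / columns


# Hypothetical aligned fragments (illustration only): one mismatch in nine columns.
identity = percent_identity("QVQLVESGG", "QVQLVQSGG")
print(f"{identity:.1f}% identity")
```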

5.5 Protein-Ligand Docking:


Next, PubChem was used to obtain the 2D and 3D structures of potential drugs for the
Huntington protein sample, in order to examine the structure-activity relationship of each
compound or ligand with the sample and prioritize candidates for our study. For this study,
only two potential drugs, Ingrezza (Valbenazine) and Haloperidol, were used; their auto-blind
docking results and scores are as follows:

(A) Haloperidol    (B) Ingrezza (Valbenazine)

In the context of Huntington's disease (HD), Ingrezza may hold promise as a treatment option for
managing chorea, a characteristic symptom of HD characterized by involuntary, jerky movements.
Chorea in HD is thought to arise from dysregulation of dopamine signaling in the brain, and VMAT2
inhibitors like Ingrezza could help modulate dopamine release and alleviate choreic movements. While
further research is needed to fully evaluate its efficacy and safety in HD, preliminary studies suggest
that Ingrezza may offer benefits for managing chorea and improving quality of life in individuals with
HD. However, it is important to consult with a healthcare professional for personalized medical advice
and treatment recommendations for Huntington's disease.

Figure 5.17: Representing auto-blind docking between sample (4rav) and Ingrezza
(Valbenazine).

Figure 5.18: Representing auto-blind docking between 4rav & Haloperidol.

Table 5.1: Docking Scores of effective drugs.
S.No.  Drug name               Vina Score  Cavity Volume (Å³)  Centre (x, y, z)  Docking Size (x, y, z)
1.     Ingrezza (Valbenazine)  -7          457                 29, -12, 70       23, 23, 23
2.     Haloperidol             -8          800                 70, 19, -61       25, 25, 25
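
AutoDock Vina reports its score as an estimated binding free energy in kcal/mol, so a more
negative score means a more favourable predicted pose. As a rough sketch, and only under the
assumption that the score can be read as a binding free energy, the relation ΔG = RT ln(Kd)
converts the two scores in Table 5.1 into order-of-magnitude dissociation constants. The
function `estimated_kd` and the choice of 298.15 K are assumptions of this sketch; the values
are predictions, not measured affinities, and the docking server does not report them.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def estimated_kd(vina_score_kcal_per_mol, temperature_k=298.15):
    """Approximate dissociation constant (mol/L) from dG = RT * ln(Kd)."""
    return math.exp(vina_score_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k))


# Vina scores taken from Table 5.1.
scores = {"Ingrezza (Valbenazine)": -7.0, "Haloperidol": -8.0}
estimates = {name: estimated_kd(score) for name, score in scores.items()}
for name, kd in estimates.items():
    print(f"{name}: estimated Kd ~ {kd:.1e} M")
```

Under this reading, a difference of 1 kcal/mol corresponds to roughly a fivefold difference in
the estimated constant, which is within the typical uncertainty of docking scores.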

5.6 Structural Classification with CATH:

CATH was used to classify, analyse, and interpret the domain structure of 4RAV. Class
describes the overall secondary-structure content of the domain, which for our sample is
mainly beta. Architecture describes the overall arrangement of the secondary structures in
three-dimensional space, which for our sample is a sandwich. Topology describes the spatial
arrangement and connectivity of the secondary structure elements, capturing finer detail than
class and architecture, which for our sample is immunoglobulin-like. Homologous Superfamily
groups evolutionarily related domains that share a common ancestor, which for our sample is
immunoglobulins.

CATH has various applications in protein engineering and design, in which researchers use it
to identify structurally similar domains with desired properties, such as binding specificity,
for engineering novel proteins with improved or tailored functions.
CATH Classification:

Figure 5.19: Representing structural diversity in protein sample 4rav; each class,
architecture, topology & homologous superfamily is represented by a distinct code.
Representing the structural diversity within the protein sample 4RAV involves categorizing its
class, architecture, topology, and homologous superfamily using distinct codes corresponding
to each level of structural organization. Classifying the protein's structural features at
these levels provides a systematic framework for understanding its diverse structural motifs
and evolutionary relationships. Each code denotes specific characteristics such as overall
secondary-structure content (class), arrangement of secondary structural elements
(architecture), connectivity patterns (topology), and homologous relationships with other
protein structures (homologous superfamily). This comprehensive classification scheme
facilitates the identification of structural similarities and differences among protein
samples, enabling researchers to elucidate the underlying principles governing protein
structure-function relationships and their implications in various biological processes,
including Huntington's disease pathology. By leveraging these codes to represent structural
diversity, researchers can gain deeper insights into the functional significance and
evolutionary conservation of the protein sample within the broader context of protein
structure classification.
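
A CATH code is written as four dot-separated numbers, one per level of the hierarchy. The
sketch below splits such a code into its levels. The example code 2.60.40.10 is an assumption
chosen because it matches the levels named in this section (mainly beta, sandwich,
immunoglobulin-like, immunoglobulins); it has not been read off the figures, and the helper
`split_cath_code` is hypothetical.

```python
CATH_LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")


def split_cath_code(code):
    """Map each CATH level to the prefix of the code that identifies it."""
    parts = code.split(".")
    if len(parts) != 4 or not all(part.isdigit() for part in parts):
        raise ValueError("expected a code of the form C.A.T.H, for example 2.60.40.10")
    return {
        level: ".".join(parts[: index + 1])
        for index, level in enumerate(CATH_LEVELS)
    }


# Assumed code for an immunoglobulin-like domain (illustration only).
levels = split_cath_code("2.60.40.10")
for level, identifier in levels.items():
    print(f"{level}: {identifier}")
```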

Figure 5.20: Analysis of structure Diversity in CATH.

Figure 5.21: Representing Architecture of the protein sample (4rav) by distinct code.

Figure 5.22: Graph representing the structural neighborhood of protein sample 4rav with
increasing sequence similarity in which yellow colour represents the same homologous
superfamily with our sample and blue represents the same topology with our sample.

5.7 DNA Analysis with Nebcutter:

Nebcutter is mainly used to analyse DNA sequences and predict potential restriction
enzyme cleavage sites, but we applied it to the DNA associated with our protein sample and
found that this DNA is circular with 103,106 bp, having a GC content of 49% and an AT content
of 53%.

Figure 5.23: Finding possible recognition site for a specific restriction enzyme using
Nebcutter.

Although Nebcutter is primarily utilized for analysing DNA sequences and predicting
restriction enzyme cleavage sites, in our study, we applied it to assess our protein sample,
revealing that its corresponding DNA sequence is circular, spanning 103,106 base pairs. The
analysis further unveiled a GC content of 49% and an AT content of 53%. While Nebcutter is
conventionally employed for DNA sequence analysis, our adaptation highlights its versatility
in providing insights into the structural characteristics and nucleotide composition of circular
DNA associated with our protein sample. This unconventional application underscores the
tool's utility in diverse molecular biology research contexts, offering valuable information for
understanding genetic organization and potential regulatory elements within circular DNA
sequences.
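
As a sketch of how length, GC content, and AT content are derived from a DNA sequence, the
Python function below counts the four bases and reports the composition. For a sequence made
only of A, C, G, and T, the GC and AT percentages sum to 100%. The function name
`base_composition` and the short test sequence are hypothetical and unrelated to the sequence
analysed with Nebcutter.

```python
def base_composition(sequence):
    """Return the length and the GC/AT percentages of an unambiguous DNA sequence."""
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc_percent = 100.0 * (counts["G"] + counts["C"]) / total
    return {
        "length": total,
        "gc_percent": gc_percent,
        "at_percent": 100.0 - gc_percent,
    }


# Hypothetical ten-base sequence (illustration only).
composition = base_composition("GATTACAGGC")
print(composition)
```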

5.8 Pathway Analysis:

KEGG was used to analyse the pathways involved in the disease, with the aim of identifying
the mechanism underlying the abnormality associated with our protein sample 4RAV.

Figure 5.24: Pathway analysis of Huntington’s Disease.

Pathway analysis of Huntington's Disease involves the systematic examination of molecular
pathways and biological processes implicated in the pathogenesis of the disease. By leveraging
advanced bioinformatics techniques and databases such as KEGG (Kyoto Encyclopedia of
Genes and Genomes), researchers aim to elucidate the intricate network of interactions among
genes, proteins, and metabolites associated with Huntington's Disease pathology. This analysis
provides crucial insights into the underlying molecular mechanisms driving disease
progression, including alterations in signalling pathways, protein aggregation, mitochondrial
dysfunction, and neuronal cell death. By comprehensively mapping out these pathways,
researchers can identify key molecular targets for therapeutic intervention and develop
strategies to mitigate disease symptoms and progression.

STRING was used for functional analysis of the protein; in the resulting network, nodes
represent proteins and edges visualize the interactions between them.

Figure 5.25: Representing string enrichment analysis of Chain A showing first cell
interaction with sequence identity 98.9%.
The metrics reported here come from the STRING enrichment analysis of the protein-protein
interaction (PPI) network obtained for Chain A. The network consists of 11 nodes representing
proteins, interconnected by 42 edges denoting interactions between these proteins. The
average node degree, that is, the average number of connections per node, is 7.64, implying a
relatively high degree of connectivity. The average local clustering coefficient, which
measures the tendency of nodes to form clusters, is high at 0.922, suggesting a dense and
highly interconnected local network structure. The expected number of edges for a random
network of the same size is 11, so the 42 observed edges represent a marked enrichment of
protein-protein interactions. This enrichment is supported by the low PPI enrichment p-value
of 3.97e-13, indicating that the observed interactions are statistically significant.
Together, these metrics give insight into the structure and functional implications of the
protein interactions in the Chain A network.

Figure 5.26: String Enrichment Analysis Of Chain B showing first cell interaction with
immunoglobulin lambda variable 10-54 having identity 80.8%.

Figure 5.27: Representing sequence identity of PLAT with chain B which is 100%.

The metrics reported here come from the STRING enrichment analysis of the protein-protein
interaction network obtained for Chain B. The network comprises 11 nodes representing
proteins, interconnected by 33 edges denoting interactions between these proteins. The
average node degree, calculated as the average number of connections per node, is 6,
indicating a moderate level of connectivity. The average local clustering coefficient, which
measures the tendency of nodes to form clusters, is high at 0.885, suggesting a dense and
highly interconnected local network structure. The expected number of edges for a random
network of the same size is 11, so the 33 observed edges represent a marked enrichment of
protein-protein interactions. This enrichment is supported by the low PPI enrichment p-value
of 3.97e-08, indicating that the observed interactions are statistically significant. Overall,
these metrics give insight into the structure and functional implications of the protein
interactions in the Chain B network.
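
The average node degree quoted for both networks follows directly from the node and edge
counts: each edge contributes to the degree of two nodes, so the average is 2E/N. The short
Python check below reproduces the reported values from the counts given in the text; the
function name `average_node_degree` is introduced here only for illustration.

```python
def average_node_degree(node_count, edge_count):
    """Average number of edges per node in an undirected network: 2E / N."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return 2 * edge_count / node_count


# Node and edge counts as reported for the Chain A and Chain B networks.
chain_a_degree = average_node_degree(11, 42)
chain_b_degree = average_node_degree(11, 33)
print(f"Chain A: {chain_a_degree:.2f}, Chain B: {chain_b_degree:.2f}")
```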

Figure 5.28: Representation of “Nodes” by using string analysis.

In the context of protein-protein interaction networks analyzed using the STRING database,
"nodes" typically refer to individual proteins represented within the network. Each node
represents a specific protein, and edges between nodes represent interactions or associations
between these proteins. Therefore, the term "nodes in STRING" would indicate the total
number of proteins included in the protein-protein interaction network analyzed using the
STRING database.

Figure 5.29: Representation of “Edges” by using string analysis.

In the context of protein-protein interaction networks analyzed using the STRING database, an
"edge" represents a connection or association between two proteins. Each edge between nodes
(proteins) in the network indicates a potential interaction or relationship between the
corresponding proteins. These interactions can include direct physical interactions, functional
associations, co-expression patterns, co-localization, and shared pathway memberships.
Therefore, the term "edge in STRING" refers to the total number of connections or interactions
between proteins within the protein-protein interaction network analyzed using the STRING
database.

DISCUSSION:

The above results show that the proteins have significantly more interactions among
themselves than would be expected for a random set of proteins of the same size and degree
distribution drawn from the genome. Such an enrichment indicates that the proteins are at
least partially biologically connected.

Studying the misfolding of Huntington's fragment is necessary for understanding its molecular
mechanism improving earlier prognosis and diagnosis for the development of various drugs or
targeted therapies which might eventually help determine the potential cause and timely cure
of the disease as like most neurodegenerative diseases, Huntington still has no available cure,
only its associated symptoms can be treated not the root of the disease. So, the objective of our
study is to computationally analyze the misfolding huntingtin fragment protein 4rav which was
isolated from the human blood sample. We started by identifying the validation of our protein
sample whether it is suitable for further analysis or not using ERRAT and PROCHECK of the
SAVES server in which the score was found to be significant for further analysis. Then we
started by downloading the structure of 4rav in PDB format for its detailed analysis in
RASMOL and PyMol software. Rasmol allows us to explore the graphical representation of
the sample and our understanding of its 3D structure by giving commands to the software for
identifying individual atoms, the total no. of atoms in the sample, and various chains and
residues. We can also manipulate the structure according to our purpose for more clarity.
PyMOL was used mainly to identify the active site and binding site of the protein molecule; it
can also calculate the RMSD value between two similar protein chains. We coloured each chain
differently in PyMOL for better clarity and to identify the active sites at which a prospective
ligand molecule can bind. We then used NCBI BLAST and COBALT to analyse the sequence similarity
between the chains of our protein sample. PubChem was used to obtain the structures of the
ligands (drug molecules) used to treat the symptoms of Huntington's disease, and CB-Dock2 was
used to blind dock each ligand (Ingrezza and Haloperidol) with the protein sample in PDB format;
the results were analysed using the Vina score to assess binding efficiency. For further
analysis of the protein, we used CATH to determine its classification levels, KEGG to determine
the potential disease pathway, and Clustal Omega and InterProScan to identify the evolutionary
relationship between the protein chains. Finally, STRING analysis was performed to identify
protein-protein interactions (PPIs).
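The comparison of docking results described above can be sketched as follows. This is an illustrative Python example rather than the exact procedure followed here: the function names are hypothetical, the input is assumed to be a list of (ligand, Vina score) pairs copied from the docking output, and it relies on the convention that a Vina score is an estimated binding energy in kcal/mol, so a more negative value indicates stronger predicted binding.

```python
def best_pose_per_ligand(poses):
    """Return {ligand: lowest (most favourable) Vina score}.

    `poses` is an iterable of (ligand_name, vina_score) pairs, with one
    pair per docked pose or cavity.
    """
    best = {}
    for ligand, score in poses:
        if ligand not in best or score < best[ligand]:
            best[ligand] = score
    return best


def rank_ligands(poses):
    """Rank ligands from strongest to weakest predicted binding.

    Sorting is ascending because more negative scores are more favourable.
    """
    best = best_pose_per_ligand(poses)
    return sorted(best.items(), key=lambda item: item[1])
```

For example, `rank_ligands([("ligand_A", -5.0), ("ligand_A", -6.0), ("ligand_B", -7.5)])` places `ligand_B` first; these names and scores are placeholders, not results from this study.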

CHAPTER 6
CONCLUSION

In conclusion, this research explored the structural and functional aspects of a protein
associated with Huntington's Disease through an integrated bioinformatics approach that
combined several computational analyses. At the core of our investigation lay the
protein-protein interaction networks, the webs of molecular interactions that govern cellular
processes and underpin the pathogenesis of Huntington's Disease. Through validation and
verification, we ensured the integrity and reliability of the structural data retrieved from
repositories such as the Protein Data Bank (PDB) and the Molecular Modelling Database (MMDB),
laying a foundation for our subsequent analyses. With the aid of bioinformatics tools and
visualization software, we examined the three-dimensional structure of our protein of interest,
including its atomic details and spatial arrangement. We examined hydrogen bond interactions,
van der Waals forces, and structural motifs that influence protein function and dysfunction in
the context of Huntington's Disease.

Our exploration extended beyond structural analysis. We examined the sequences of our protein
sample with sequence analysis tools such as BLAST and COBALT to identify conserved domains,
evolutionary relationships, and potential functional motifs. This sequence-level perspective
provided insight into the molecular mechanisms driving disease progression and laid the
groundwork for further investigation. Moreover, our analysis of protein-protein interaction
networks, particularly STRING enrichment analysis applied to specific protein chains, pointed
to connections and potential regulatory pathways implicated in disease pathogenesis, which may
inform targeted therapeutic interventions and drug discovery efforts.

Furthermore, our research adapted NEBcutter for DNA analysis. This unconventional approach
offered insights into the genetic landscape associated with Huntington's Disease and a further
perspective on disease regulation at the genetic level. Taken together, our findings have
implications for both scientific understanding and clinical practice. By elucidating the
structural and functional features of proteins associated with Huntington's Disease, this work
may support the development of novel treatments and interventions aimed at halting or even
reversing disease progression. Moreover, our findings contribute to the broader landscape of
neurodegenerative research, offering insights that may extend beyond Huntington's Disease to
other related conditions, thus broadening the scope of potential therapeutic targets and
strategies.

In essence, our research represents a testament to the power of interdisciplinary collaboration,
innovation, and perseverance in the pursuit of scientific knowledge and societal progress. As
we embark on the next phase of our scientific odyssey, let us carry forward the lessons learned
and the insights gained, ever mindful of the transformative potential that lies at the intersection
of science, technology, and human ingenuity. Together, we can continue to push the boundaries
of discovery, bringing us one step closer to a world free from the burdens of neurodegenerative
disease and ushering in a brighter, healthier future for all.
