Professional Documents
Culture Documents
11 - Chapter 1 Introduction
11 - Chapter 1 Introduction
11 - Chapter 1 Introduction
With the intensive growth of the amount of publicly available genomic data, a new
field of Computer Science i.e., Bioinformatics has emerged, focusing on the use of
computing systems for efficiently deriving, storing, and analyzing the character strings
of genome to help to solve problems in molecular biology. The term Bioinformatics was
coined by Paulien Hogeweg in 1979 for the study of informatics processes in biotic
systems. The National Center for Biotechnology Information (NCBI, 2001) defines as
Information Technology merges into a single discipline. In this field there are three
important sub discipline within: the development of new algorithms and statistics with
which to assess relationships among members of large data sets, the analysis and
The flood of data from biology, mainly in the form of DNA, RNA and Protein
sequences, puts heavy demand on computers and computational scientists. At the same
clustering; which is one of the branches of Data Mining can be applied to biological data
and this can result in a better era of rapid medical development and drug discovery.
1
Diabetes Mellitus is a endocrine metabolic disorder, it is unfortunate that India is
considered diabetes capital of the globe. Necessitating researchers to probe and peep into
In the past decade, the advent of efficient genome sequencing tools has led to
enormous progress in life sciences. Among the most important innovations, microarray
technology allows to quantify the expression for thousand of genes simultaneously. The
recognition data includes, a fair amount of random noise, missing values, a dimension in
the range of thousands, and a sample size in few dozens. A particular application of the
micro array technology is undertaken in our study of diabetes research, where the goal is
to search for gene expression at transcriptome level in Healthy, Prediabetic and Diabetic
subjects, and there effects on Gene Ontology (Molecular Function, Cellular Component
and Biological Process) and Metabolic Pathways. The challenge for a biologist and
2
Prediabetes and Diabetes:-
The global prevalence of pre diabetes has been increasing progressively in past few
decades, As per IDF (International Diabetes Federation) Diabetes Atlas, the number of
340 million (2). By 2030 the global prevalence of IGT is estimated to reach 8.4% which
will be approximately 462 million people. It has been established that Prediabetic status
is a strong risk factor for overt Diabetes with or without Micro and Macro vascular
complications (1,2).
associated with long term damage, dysfunction, and failure of different organs, like Eye,
Kidneys, Nerve, Heart and Blood Vessels. Diabetes is currently fastest growing
epidemic and has been ascribed to a collision between genes and the environment.
3
pancreas working overtime when patient develop full blown Diabetes the pancreas is
still working overtime and get exhausted. This is considered so called Clinical Diabetes
Mellitus. The pancreas tries to overcome this resistance initially by producing more
pancreas fail to overcome the insulin resistance, clinical Diabetes, develops late insulin
secretion starts declining either due to glucose toxicity and B-cell exhaustion. Increase in
hepatic glucose production also seen in Type 2 Diabetes (T2D) but not in Impaired
Glucose Tolerance (IGT). These increased hepatic glucose production is reversible with
genes have been elusive and no single gene defect explained T2D, hence the disease is
Genome-wide association studies (GWAS) have facilitated a substantial and rapid rise in
the number of confirmed genetic susceptibility variants for Type 2 Diabetes (T2D).
Approximately 40 variants have been identified so far, many of which were discovered
through GWAS in available literature (11). This success has led to widespread hope that
the findings will translate into improved clinical care for the increasing numbers of
patients with Diabetes. Potential areas or clinical translation include risk prediction and
therapeutics. However, the genetic loci so far identified account for only a small fraction
(approximately 10%) of the overall heritable risk for T2D. Uncovering the missing
4
heritability is essential to the progress of T2D genetic studies and to the translation of
(GWAS):-
NGS (next generation sequencing) and GWAS (Genome Wide Association Studies)
have led to technical development of genetic variants risked and protection of Type 2
Diabetes Mellitus. Next Generation Sequencing (NGS) which shows the amount of
gene which has been expressed and there arrangement of nucleotide bases in the gene
fragments which codes for protein, also some genes, or a copy of a gene which has lost
expression can be found due to Single Nucleotide Polymorphism (SNP’s), are DNA
variant that occur when a single base pair (A,T,G and C) in the sequence altered.
Transcriptome Analysis:-
molecular basis of phenotypic variation in biology, including diseases. During the past
decades micro-arrays have been the most important and widely used approach for such
a powerful alternative present study is based on this new transcriptome analysis (5). And
sequencing (NGS) methods to sequence cDNA that has been derived from an RNA
sample, and hence produces millions of short reads. These reads are then typically
5
mapped to a reference genome and the number of reads mapping within a genomic
the feature in the analyzed sample. Arguably the most common use of transcriptome
profiling is in the search for Differentially Expressed Genes (DEG), that is, genes that
show differences in expression level between conditions or in other ways are associated
with given predictors or responses. RNA-seq offers several advantages over microarrays
for differential expression analysis, such as an increased dynamic range and a lower
background level, and the ability to detect and quantify the expression of previously
progressively from past few decades and India being a major hub centre for Diabetes
genes which have been expressed and their arrangements of nucleotide base in the gene
fragments which code for protein and also to recognize the genes which has lost their
ability to produce functional proteins and genes. We have humbly tried to analyze with
Gene Ontology and Metabolic Pathway analysis for Differential Expressed Genes in