
RIIMS BIOINFORMATICS AND BIO-MOLECULAR COMPUTING

I. BIOINFORMATICS AND BIO-MOLECULAR COMPUTING

1. INTRODUCTION:

1.1 What is BIOINFORMATICS?

Bioinformatics = Biology + Information technology.

• Can be defined as the body of tools and algorithms needed to handle large and complex
biological information.

• Bioinformatics is a new scientific discipline created from the interaction of
biology and computer science.

• The NCBI defines bioinformatics as:

"Bioinformatics is the field of science in which biology, computer science, and
information technology merge into a single discipline."

Fig 1. The interrelationship of the different scientific subjects

Easy Answer - Using computers to solve molecular biology problems.

Hard Answer - Computational techniques for management and analysis of biological
data and knowledge.

What do you need to know?

It all depends on your background. Are you a:
• Biologist with some computer knowledge, or
• Computer scientist with some biology knowledge?
Few do both well.

“Bioinformatics is not just using a computer to store data, or speed up biology. With
bioinformatics, you do biological hypothesis testing on a computer.”


“Bioinformatics combines the tools of Biology, Chemistry, Mathematics, Statistics and
Computer Science to understand Life & its processes.”

Bioinformatics would not be possible without advances in computing hardware and
software: analysis of algorithms, data structures and software engineering.

Bioinformatics is another revolution in biology.

[Figure labels: Biology; Physics & Chemistry, 1950s; Molecular Biology, Biochemistry, Biophysics; Computer Sci. & Statistics, 1970s; Bioinformatics]

Fig 2. Revolution in biology

1.2 DNA ROLE:

DNA contains the instructions needed for a living organism to grow and function.
It tells cells exactly what role they should play in the body. It holds instructions to make
your:
• Heart cells beat.
• Limbs form in the correct place.
• Immune system fight infection.
• Digestive system digest your dinner.

1.3 GENOMICS ERA:

High-throughput DNA sequencing. The first high-throughput genomics technology was
automated DNA sequencing in the early 1990s.

Baker’s yeast, Saccharomyces cerevisiae (about 12 million bp), was the first eukaryotic genome to
be sequenced. In September 1999, Celera Genomics completed the sequencing of the
Drosophila genome. The 3-billion-bp human genome sequence was generated in a
competition between the publicly funded Human Genome Project and
Celera. Bioinformatics bridges many disciplines.


2. Biological Molecular Databases:


In the current scenario, the special database lists a total of 387 biological databases,
including primary and derived databases, which cover:
• Sequences and Structures
• Genomic, Proteomic and related data
• Intermolecular interactions
• Metabolic pathways and cellular regulation
• Mutation (SNPs and others)
• Pathology
• Transgenics etc.

3. BIOINFORMATICS AND COMPUTER SCIENCE CURRICULA


According to LeBlanc and Dyer, “Genomic research intersected with 10 of the 14
knowledge focus groups involving at least 36 of the 132 units.”
This means that a background in computer science is necessary for developing
bioinformatics curricula and for preparing specialists in bioinformatics.

For example, the computer science topics of algorithms, software engineering and databases
are linked with the biology topics of cell evaluation and genetics, which ties the courses of
biology and computer science together. The group of algorithms from computer science that is
highly relevant for computational statistics comprises machine learning, artificial intelligence
(AI), and knowledge discovery in databases, or data mining.

4. DATA MINING:
Data mining is the process of extracting knowledge hidden in large volumes of raw
data. Data mining automates the process of finding relationships and patterns in raw data
and delivers results that can be either utilized in an automated decision support system or
assessed by a human analyst. Data mining techniques can be divided into two broad
categories, predictive data mining and discovery data mining:

- Predictive techniques: Classification, Regression.
- Discovery techniques: Association Analysis, Sequence Analysis, Clustering.

4.1 Predictive data mining:


The term is applied to a range of techniques that find the relationship between a specific
variable (called the target variable) and the other variables in your data.

• Classification is about assigning data records to pre-defined categories.
In this case the target variable is the category, and the techniques discover
the relationship between the other variables and the category.

• Regression is about
predicting the value of a continuous variable from the other variables in a data record.
The most familiar value prediction techniques include linear and polynomial regression.
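
As a minimal sketch of the two predictive techniques described above, the following Python example fits a straight line by ordinary least squares (regression) and assigns a record to the category with the nearest centroid (classification). The function names, the nearest-centroid rule and the toy data are illustrative assumptions and are not taken from the sources cited in this paper.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def nearest_centroid(train, record):
    """Classify `record` by the closest category centroid.

    `train` maps a category label to a list of numeric records (tuples).
    """
    def centroid(rows):
        return [mean(column) for column in zip(*rows)]

    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    centroids = {label: centroid(rows) for label, rows in train.items()}
    return min(centroids, key=lambda label: squared_distance(centroids[label], record))

# Regression: the target variable is continuous.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# Classification: the target variable is a pre-defined category.
train = {
    "low": [(1.0, 1.2), (0.8, 1.0)],
    "high": [(5.0, 4.8), (5.2, 5.1)],
}
label = nearest_centroid(train, (4.9, 5.0))
print(slope, intercept, label)
```

In both cases the technique learns how the target variable depends on the other variables; only the type of the target (a number or a category) differs.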

4.2 Discovery data mining:


The term is applied to a range of techniques that find patterns inside your data without any prior
knowledge of what patterns exist. The following are examples of discovery mining
techniques:


• Clustering is the term for a range of techniques that attempt to group data
records on the basis of how similar they are.

• Association and sequence analysis describe a family of techniques that
determine relationships between data records.
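
To illustrate clustering as described above, here is a small sketch of k-means, one common clustering technique. The text does not name a specific algorithm, so the choice of k-means, the seeding rule (the first k records) and the sample points are assumptions made only for illustration.

```python
def k_means(points, k, iterations=20):
    """Group numeric records into k clusters by similarity (squared distance)."""
    # Assumption: the first k records seed the cluster centres.
    centroids = [list(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        # Move each centre to the mean of the records assigned to it.
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return clusters

points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1), (0.9, 1.1), (9.2, 8.9)]
clusters = k_means(points, k=2)
print(clusters)
```

No categories are supplied in advance; the two groups emerge only from how close the records are to one another.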

Where is the knowledge we have lost in information?
Where is the wisdom we have lost in knowledge?
--T. S. Eliot, "The Rock"

4.3 Knowledge Discovery:


Data mining is one stage in an overall knowledge discovery process. This process
involves selection and sampling of the appropriate data from the database(s);
preprocessing and cleaning of the data to remove redundancies, errors, and
conflicts; transforming and reducing data to a format more suitable for the data
mining; data mining; evaluation of the mined data; and visualization of the
evaluation results.

Fig 3 Data mining in the larger context of the knowledge discovery process.
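
The stages of the knowledge discovery process listed above can be sketched as a small pipeline. Everything in this example is hypothetical: the record layout, the field name `length`, the toy rule that flags values more than 1.5 times the average, and the text report that stands in for visualization.

```python
def knowledge_discovery(records, field):
    """Run a toy selection -> cleaning -> transformation -> mining -> evaluation pipeline."""
    # Selection: keep only records that carry the field of interest.
    selected = [r for r in records if field in r]

    # Cleaning: drop missing values and duplicate identifiers.
    seen, cleaned = set(), []
    for r in selected:
        if r[field] is None or r["id"] in seen:
            continue
        seen.add(r["id"])
        cleaned.append(r)

    # Transformation: reduce each record to a number suitable for mining.
    values = [float(r[field]) for r in cleaned]
    if not values:
        return [], "no usable records"

    # Data mining (toy rule): flag values far above the average.
    average = sum(values) / len(values)
    unusual = [v for v in values if v > 1.5 * average]

    # Evaluation, with a one-line report standing in for visualization.
    share = len(unusual) / len(values)
    report = f"{len(unusual)} of {len(values)} records flagged ({share:.0%})"
    return unusual, report

records = [
    {"id": "g1", "length": "900"},
    {"id": "g2", "length": "1100"},
    {"id": "g2", "length": "1100"},  # duplicate entry
    {"id": "g3", "length": None},    # missing value
    {"id": "g4", "length": "4000"},
    {"id": "g5"},                    # lacks the field entirely
]
unusual, report = knowledge_discovery(records, "length")
print(report)
```

The mining step is a single line here; most of the code prepares the data, which reflects the point that data mining is only one stage of the overall process.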

Development of new tools for data mining:

• Sequence alignment
• Genome sequencing
• Genome comparison
• Microarray data analysis
• Proteomics data analysis
• Small molecular array analysis

These tools are used to derive “information” and gain “knowledge” from the data.

• Bioinformatics is a good career path for computer scientists.
• Create tools adapted to individual needs.
• Access the vast amount of biological information stored in databases.

APPLICATIONS:
Bioinformatics is the use of IT in biotechnology for data storage, data warehousing
and the analysis of DNA sequences. In bioinformatics, knowledge of many branches is
required, such as biology, mathematics, computer science, the laws of physics and chemistry, and
of course a sound knowledge of IT ... Microbial genome applications;

Why Should Biologists Use Computers?


Computers are powerful devices for understanding any system that can be described in a
mathematical way. As our understanding of biological processes has grown and
deepened, it isn't surprising, then, that the disciplines of computational biology and, more


recently, bioinformatics, have evolved from the intersection of classical biology,
mathematics, and computer science.

BIOINFORMATICS FOR VARIOUS FIELDS:

• Molecular medicine, antibiotic resistance, forensic analysis of microbes
• Bio-weapon creation, evolutionary studies, crop improvement, insect
resistance
• Improved nutritional quality, development of drought-resistant varieties
• Veterinary science, personalised medicine, preventative medicine
• Gene therapy, drug development, microbial genome applications
• Waste cleanup, climate change studies, alternative energy sources
• Biotechnology

5. ADVANTAGES: As computing power increases and our databases of genetic and
molecular information expand, the realm of bioinformatics is sure to grow and change
drastically, allowing us to build models of incredible complexity and utility. It was
the combined use of information technology and computer science that
allowed researchers to create a large database capable of housing and securely storing all of the
valuable work being done in studies on DNA. That database
allows scientists to access millions of records of DNA sequences, and parts of them,
from different species and to compare them with the work currently being done.
It combines the opportunity for a flexible response with the ability to determine frequencies,
correlations and quantitative analyses, and it is particularly useful for tapping attitudes,
perceptions, and opinions.

6. LIMITATIONS:

• Cannot discriminate sensitivity and subtlety from the data.
• No assumption of equal intervals.
• No check on whether respondents are telling the truth.
• One person’s ‘strongly agree’ may be another’s ‘weakly agree’.

7. CONCLUSION:

I collected information from a variety of sources regarding the composition of core
bioinformatics curricula.
There is an overwhelming consensus that the next generation of bioinformaticians must be
trained from scratch as biologists and computer scientists combined, challenging the traditional
view that the two disciplines are orthogonal.


8. BIBLIOGRAPHY:

LeBlanc, M. D. and Dyer, B. D. (2005). Bioinformatics and Computing Curricula 2001: Why
Computer Science is well positioned in a post-genomic world.
http://genomics.wheatoncollege.edu/LeBlancDyerSIGCSE.html

http://www.bioinformatics.org/forums

II. BIO-MOLECULAR COMPUTING

1. DEFINITION:
• Molecular computing is an emerging field to which chemistry, biophysics,
molecular biology, electronic engineering, solid-state physics and computer science
contribute to a large extent.

• It involves the encoding, manipulation and retrieval
of information at a macromolecular level, in contrast to current techniques, which
accomplish these functions via IC miniaturization of bulk devices.

1.1 THE AIM OF THIS ARTICLE:

The aim is to exploit these characteristics to build computing systems that have many
advantages over their inorganic counterparts.

DNA computing began in 1994, when Leonard Adleman showed that DNA computing was
possible by using a molecular computer to solve a real problem: an instance of the
Hamiltonian Path Problem, a close relative of the Traveling Salesman Problem. In
theoretical terms, some scientists say the actual beginnings of DNA computation should
be attributed to Charles Bennett's work. Adleman, now considered the father of DNA
computing, is a professor at the University of Southern California and spawned the field
with his paper, "Molecular Computation of Solutions to Combinatorial Problems." Since
then, Adleman has demonstrated how the massive parallelism of a trillion DNA strands can
simultaneously attack different aspects of a computation to tackle hard combinatorial
problems.

1.2 Adleman's Traveling Salesman Problem:


The objective is to find a path from start to end that goes through every point exactly once.
This problem is difficult for conventional computers because it is NP-complete (it belongs
to the class of "non-deterministic polynomial time" problems), and no efficient algorithm
for it is known. Large instances are intractable on conventional computers, but they can
in principle be attacked with massively parallel machines such as DNA computers.
Adleman chose the Hamiltonian Path problem because it is a well-known problem.

The following algorithm solves the Hamiltonian Path problem:

1. Generate random paths through the graph.
2. Keep only those paths that begin with the start city (A) and conclude with the
end city (G).
3. If the graph has n cities, keep only those paths with n cities (n = 7).
4. Keep only those paths that enter all cities at least once.
5. Any remaining paths are solutions.
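
The five steps above can be sketched in software as a generate-and-filter procedure. The seven-city graph below is a hypothetical example chosen so that it has exactly one Hamiltonian path from A to G. It is not the graph from Adleman's paper, and the sample count and random seed are arbitrary assumptions.

```python
import random

# Hypothetical directed graph: city -> cities reachable in one step.
GRAPH = {
    "A": ["B", "D"],
    "B": ["C", "E"],
    "C": ["D", "A"],
    "D": ["E", "B"],
    "E": ["F", "C"],
    "F": ["G", "D"],
    "G": [],
}

def random_path(graph, max_len, rng):
    """Follow randomly chosen edges from a random city for a random number of steps."""
    city = rng.choice(sorted(graph))
    path = [city]
    for _ in range(rng.randint(0, max_len - 1)):
        if not graph[city]:
            break  # dead end: no outgoing edge
        city = rng.choice(graph[city])
        path.append(city)
    return tuple(path)

def hamiltonian_paths(graph, start, end, samples=50_000, seed=0):
    rng = random.Random(seed)
    n = len(graph)
    # Step 1: generate random paths through the graph.
    paths = {random_path(graph, 2 * n, rng) for _ in range(samples)}
    # Step 2: keep paths that begin at the start city and finish at the end city.
    paths = {p for p in paths if p[0] == start and p[-1] == end}
    # Step 3: keep paths with exactly n cities.
    paths = {p for p in paths if len(p) == n}
    # Step 4: keep paths that enter every city at least once.
    paths = {p for p in paths if set(p) == set(graph)}
    # Step 5: whatever remains is a solution.
    return paths

solutions = hamiltonian_paths(GRAPH, "A", "G")
print(solutions)
```

A conventional computer generates the candidate paths one after another, whereas in the DNA version all candidates form at once in the test tube; the filtering steps are the same in spirit.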

The key was using DNA to perform the five steps in the above algorithm. Adleman's first
step was to synthesize DNA strands of known sequences, each strand 20 nucleotides
long. He represented each of the seven vertices of the graph by a separate strand, and
further represented each edge between two connected vertices, such as 1 to 2, by a
DNA strand consisting of the last 10 nucleotides of the strand representing vertex
1 plus the first 10 nucleotides of the vertex 2 strand. Then, through the sheer number of
DNA molecules (3 x 10^13 copies of each edge in this experiment!) joining together in all
possible combinations, many random paths were generated. Adleman used well-established
techniques of molecular biology to isolate the Hamiltonian path, the one
that entered all vertices, starting at the start vertex and ending at the end vertex. After
generating the numerous random paths in the first step, he used polymerase chain reaction
(PCR) to amplify and keep only the paths that began at the start vertex and ended at the end
vertex. The next two steps kept only those strands that passed through exactly seven vertices,
entering each vertex at least once. At this point, any paths that remained would code for a
Hamiltonian path, thus solving the problem.

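
The edge encoding described above can be illustrated with a short sketch. The vertex sequences here are random 20-nucleotide strings generated only for the example, not the sequences Adleman synthesized, and the helper names are hypothetical.

```python
import random

def random_strand(length, rng):
    """A random DNA sequence over the four nucleotides."""
    return "".join(rng.choice("ACGT") for _ in range(length))

def edge_strand(vertex_strands, i, j):
    """Edge i -> j: last 10 nucleotides of vertex i plus first 10 of vertex j."""
    return vertex_strands[i][-10:] + vertex_strands[j][:10]

rng = random.Random(1)  # fixed seed so the example is repeatable
vertices = {v: random_strand(20, rng) for v in (1, 2, 3)}

edge_1_2 = edge_strand(vertices, 1, 2)
edge_2_3 = edge_strand(vertices, 2, 3)

# Laid end to end, the two edge strands spell out the whole of vertex 2
# across their junction, which is what lets a chain of edges encode a path.
junction = edge_1_2[10:] + edge_2_3[:10]
print(junction == vertices[2])
```

Because every edge strand is built from halves of two vertex strands, a chain of joined edge strands can be read back as the sequence of vertices it visits.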
1.3 Open Problem:

DNA Strand Engineering

Given a DNA strand, there are polynomial-time algorithms that predict the secondary
structure of the strand.

1.3.1 Inverse Problem: find an efficient algorithm that, given a desired secondary
structure, generates a strand with that structure.
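
As a rough illustration of a polynomial-time structure predictor, the sketch below uses a Nussinov-style dynamic programme that maximizes the number of complementary base pairs (A with T, G with C) within one strand. This is a deliberate simplification chosen for the example: practical predictors score structures with thermodynamic energy models, and the function name and the minimum loop length of 3 are assumptions.

```python
def max_base_pairs(strand, min_loop=3):
    """Largest number of nested complementary pairs in a single strand.

    Runs in O(n^3) time; at least `min_loop` unpaired bases must lie
    between the two partners of any pair.
    """
    complementary = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
    n = len(strand)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]  # best[i][j]: optimum for strand[i..j]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (strand[k], strand[j]) in complementary:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]

hairpin_pairs = max_base_pairs("GGGAAATCCC")
print(hairpin_pairs)
```

For the strand `GGGAAATCCC` the three G bases pair with the three C bases to form a hairpin stem, giving 3 pairs. The inverse problem asks for the opposite direction: start from the desired stem and loop layout and produce a sequence that folds into it.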

1.4 Applications:
– Information storage, retrieval for DNA computing
– Molecular bar codes for chemical libraries.


1.5 DNA Computing on Surfaces


DNA computers could be thousands of times smaller and more powerful than silicon-based
computers. One pound of DNA has the capacity to store more data than all the
electronic devices made to date. A DNA computer the size of a water droplet could have
more computing power than today's most powerful supercomputers. Another
advantage of DNA computing over silicon-based computers is the ability to do
parallel calculations. Silicon-based microprocessors perform calculations largely one at a
time, while a DNA computer can carry out many calculations simultaneously. The
creation of practical DNA computers would start a whole new computer revolution.

Conclusion

The idea of DNA-based computing is to subvert the mechanisms produced by
evolution and use them to do the data processing we want to do.

Advantages

1. Through DNA computing, people can obtain and analyze information about materials.
We can find all the genes in a DNA sequence and
develop tools for using this information in fields such as
biology, medicine, physics, and so on. A team from HP and U.C.L.A.
has found a way to build circuits using chemical processes, making the switches
as small as a molecule. Tim Gardner, a graduate student at Boston University,
recently made a genetic system that can store a single bit of information, either a
1 or a 0.
2. More parallel: for some problems too big to fit or run in a silicon machine, a DNA
computer, with its massive parallelism and memory, may be able to
complete a computation more quickly than a powerful supercomputer.
3. DNA computing brings several fields together to reach a
common goal; this improves those fields and gives rise to
new ones.

Disadvantages of DNA Computing:

1. Slow: the algorithms proposed so far use really slow molecular-biological operations.
Each primitive operation takes hours when you run it with a small test tube of
DNA. Scale up to the vast amounts of DNA we're talking about, and they may
slow down dramatically.


2. Hydrolysis: DNA molecules can fracture. Over the six months you're
computing, your DNA system is gradually breaking down in water. A DNA molecule
that was part of your computer can be fractured over time.

3. Unreliable: every operation you want to do in your DNA computer is random.
The components in the DNA computer are probabilistic. Because some components are
noisy, the computation is sometimes unreliable. If a tiny subcircuit is
supposed to give the answer "1," it may yield that answer 90 percent of the time
and "0" the rest of the time. To make DNA computing work, we have to figure
out how to build a reliable computer out of noisy components.

4. Not transmittable: the DNA computer is modeled as a highly
parallel computer, with each DNA molecule acting as a separate processor. In a
standard multiprocessor, connection buses transmit information from one
processor to the next. But the problem of transmitting information from one
molecule to another in a DNA computer has yet to be solved. Current DNA
algorithms compute successfully without passing any information, but this limits
their flexibility.

5. Not practical: DNA computing is not a here-and-now practical technology. It is
still a pie-in-the-sky research project.

6. No generality: concrete algorithms solve only specific concrete
problems. Every algorithm has some constraints on it.

7. The process is extremely laborious and time-consuming.

8. Radioactive probes pose health and disposal risks (although chemiluminescent technology
has eliminated this risk).

9. A relatively large amount of sample is required to perform the tests.

10. The method requires high-molecular-weight, undegraded DNA.

11. The use of yield gels is an essential, but time-consuming, step in the analysis, not only to
estimate the amount of DNA recovered but also to determine the suitability of the sample
for analysis.
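
Returning to the "Unreliable" item in the list above: one standard way to get a more reliable answer from noisy components is to repeat the computation and take a majority vote. The sketch below works out the arithmetic for the 90 percent figure quoted there. It assumes that the runs fail independently of one another, which the text does not state, and the function name is hypothetical.

```python
from math import comb

def majority_correct(p, runs):
    """Probability that more than half of `runs` independent runs are correct,
    when each run is correct with probability p."""
    needed = runs // 2 + 1
    return sum(
        comb(runs, k) * p**k * (1 - p) ** (runs - k)
        for k in range(needed, runs + 1)
    )

single = majority_correct(0.9, 1)       # one run: 0.9
triple = majority_correct(0.9, 3)       # best of three: 0.972
quintuple = majority_correct(0.9, 5)    # best of five: 0.99144
print(single, triple, quintuple)
```

Under the independence assumption, three repetitions raise the reliability from 90 percent to 97.2 percent and five repetitions to about 99.1 percent, at the cost of repeating operations that are already slow.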

Advantages of DNA microarray tests include high throughput (lots of information with
one test), and good coverage of the genome with the chips that have larger numbers of
test spots.


Disadvantages include incomplete coverage, which can lead to false normal results, and
the ability to test only for unbalanced rearrangements (duplications and deletions), and
not balanced translocations or inversions.


Advantages over “solution phase” chemistry

1. Facile purification steps
2. Reduced interference between strands
3. Easily automated

1.6 Disadvantages:

1. Loss of information density (2D)
2. Lower surface hybridization efficiency
3. Slower enzyme kinetics

The Future of Bioinformatics

We are currently witnessing a technological revolution. With the growing number of sequencing
projects, bioinformatics continues to make considerable progress in biology by providing
scientists with access to genomic information. The Human Genome Project has contributed
especially to this progress. The information obtained with the help of
bioinformatics tools furthers our understanding of various genetic and other diseases and
helps identify new drug targets. With the technological development of the Internet,
scientists are now able to freely access volumes of such biological information, which
enables the advancement of scientific discoveries in biomedicine.

In spite of being young, the science of Bioinformatics exhibits tremendous potential for
playing a major role in the future development of science and technology. This is evident
from the fact that modern biology and related sciences are increasingly becoming
dependent on this new technology. It is expected that Bioinformatics will especially
contribute in the future as the leading edge in biomedicine to pharmaceutical companies
by expediently yielding a greater quantity of lead drugs for therapy.

Artificial intelligence+Parallel computing


2. CONCLUSION:

• DNA computing has expanded the notion of what computation is.
• Solid-phase chemistry is a promising approach to DNA computing.
• DNA computing will require greatly improved DNA surface attachment chemistries and
control of chemical and enzymatic processes.
• It raises new research problems in combinatorics, complexity theory and algorithms.


BIBLIOGRAPHY:

1. Leonard M. Adleman (1994-11-11). "Molecular Computation of Solutions
to Combinatorial Problems". Science 266 (5187): 1021–1024.
http://www.usc.edu/dept/molecular-science/papers/fp-sci94.pdf. The first DNA
computing paper. Describes a solution for the directed Hamiltonian path problem.

www.dna computing.org
www.gene.ch.genetech
www.genetic engineering.org
