
CHAPTER ONE

COMPUTATIONAL INTELLIGENCE PARADIGMS: INTRODUCTION

Intelligence is the capacity of a system to achieve a goal or sustain desired behavior under conditions of uncertainty. Intelligent systems have to cope with sources of uncertainty such as the occurrence of unexpected events, for example unpredictable changes in the world in which the system operates, and incomplete, inconsistent or unreliable information available to the system for the purpose of deciding what to do next.

It is important to contrast intelligent systems, i.e., systems that can make decisions under uncertainty, with systems that are programmed to make only deterministic decisions. Data processing systems, conventional robots, production lines and computer-controlled machine tools are examples of such non-intelligent systems.

Intelligent systems exhibit intelligent behavior. Intelligent behavior is exhibited by artifacts and biological systems capable of achieving specified goals or sustaining desired behavior under conditions of uncertainty, even in poorly structured environments. Such environments are those in which variable characteristics are not measurable, where several characteristics change simultaneously and in unexpected ways, and where it is not possible to decide in advance how the system should respond to every combination of events. Some characteristics of intelligent behavior are adaptability, learning, goal-seeking, self-improvement and reproduction.

Computational Intelligence encompasses techniques and methods that can be used to tackle problems that are not solved well by many traditional mathematical and statistical techniques. It is also defined as "a methodology involving computing (whether with a computer or wetware) that exhibits an ability to learn and/or to deal with new situations, such that the system is perceived to possess one or more attributes of reason such as generalization, discovery, association and abstraction". Computational intelligence, or soft computing, is a research discipline which encompasses the theoretical formalization and applications of:

- Artificial Neural Networks (ANNs), also known as connection networks.

- Evolutionary algorithms, comprising Genetic Algorithms (GAs), Evolutionary Strategies (ES), Genetic Programming (GP), and Grammatical Evolution (GE).

- Fuzzy logic, etc.

The use of Computational Intelligence, which links the complementary fields of neural networks, evolutionary algorithms and adaptive fuzzy systems, is appropriate when systems that are analogous to the behavioural characteristics of biological systems are desired.

CI has resulted from a synergy among information processing technologies such as Artificial Neural Networks (ANNs), fuzzy sets, Case-Based Reasoning (CBR), Genetic Algorithms (GAs), and Simulated Annealing (SA). The CI sub-field is defined as the study of 'adaptive mechanisms' which enable or facilitate intelligent behaviour in complex and changing environments. In other words, CI is mainly about the design of algorithmic models to solve complex problems. There is a large number of topics that could be considered, but only three paradigms will be considered: artificial neural networks, fuzzy logic and genetic algorithms.

The technology of Computational Intelligence (CI) intensively exploits various mechanisms of interaction with humans and processes domain knowledge with the intent of building intelligent systems. As software complexity grows and the diversity of software systems skyrockets, it becomes apparent that there is a genuine need for a solid, efficient, designer-oriented vehicle to support software analysis, design, and implementation at various levels. CI is a highly compatible and appealing vehicle to address the needs of the knowledge-rich environment of Software Engineering.

CHAPTER TWO

ARTIFICIAL INTELLIGENCE

Artificial Intelligence (A.I.) is a branch of computer science that studies how to endow computers with the capabilities of human intelligence, or a branch of science which deals with helping machines find solutions to complex problems in a more human-like fashion; the field evolved around 1960. The initial hope for this field was to mimic human problem-solving abilities. This generally involves borrowing characteristics from human intelligence and applying them as algorithms in a computer-friendly way.

Warren McCulloch and Walter Pitts (1943) did the first work that is now generally recognized as AI. They drew on three sources: knowledge of the basic physiology and function of neurons in the brain, the formal analysis of propositional logic due to Russell and Whitehead, and Turing's theory of computation.

McCulloch and Pitts also suggested that suitably defined networks could learn. Their work served as the forerunner of both the logicist tradition in AI and the connectionist tradition.

Claude Shannon (1950) and Alan Turing (1953) wrote chess programs for Von Neumann-style conventional computers. At about the same time, two graduate students in the Princeton mathematics department, Marvin Minsky and Dean Edmonds, built the first neural network computer in 1951. It was called SNARC and it used 3000 vacuum tubes.


One of the major results of research in the area of artificial intelligence has been the development of
techniques that allow the modeling of information at higher levels of abstraction. These techniques are
embodied in languages or tools that allow programs to be built that closely resemble human logic in
their implementation and are therefore easier to develop and maintain. These programs, which emulate human expertise in well-defined problem domains, are called expert systems.

If our purpose is to make computer programs think like ourselves, then we must have a way of resolving how humans think. For this, it is necessary to determine a precise theory of the mind and, with the correct tools, express the theory as computer programs. If the actions of computer programs are indistinguishable from human behavior, then we may say that similar mechanisms may also exist in humans.

Many disciplines in cognitive science, which have mainly concentrated on the investigation of human and animal behavior, try to bring together computer models from AI and experimental techniques from psychology to build exact models of the human mind. AI and cognitive science enrich each other in the areas of computer vision, natural language and learning.

Studies on human thinking and the mind go back more than 2000 years. Aristotle was one of the first philosophers who attempted to formalize right thinking [8]. His syllogisms (three-part deductive reasoning) provided patterns for argument structures that always give true conclusions given true premises.

A big contribution to AI again came from McCarthy in 1958 when he wrote a high-level programming language called LISP. Even today it is one of the most dominant AI programming languages. In the same year, he developed a program called the 'Advice Taker', which was designed to use knowledge to search for solutions to problems. The program was described in a published paper titled 'Programs with Common Sense'. The system used domain knowledge and some simple axioms to generate plans; for example, it generated a plan to drive to the airport to catch a plane. The significance of this system in the development of AI is that it embodied knowledge representation and reasoning, and manipulated knowledge representations with deduction [19].

What is Intelligence? Intelligence is defined as the ability to learn, understand and think in logical ways about things. It is also defined as the ability to learn facts and skills and apply them.


The Turing Test: Alan Turing (1950) proposed the Turing test. It was designed to provide a satisfactory operational definition of intelligence. Turing defined intelligent behavior as the ability to achieve human-level performance in all cognitive tasks, sufficient to fool an interrogator. The proposed test is that the computer should be interrogated by a human via a teletype, and it passes the test if the interrogator cannot tell whether there is a computer or a human at the other end.

Programming a computer to pass the test provides plenty to work on. The computer will need to possess the following intelligent capabilities to pass the Turing test:

- Natural language processing to enable it to communicate successfully in English (or some other human language).

- Knowledge representation to store information provided before or during the interrogation.

- Automated reasoning to use the stored information to answer questions and to draw new conclusions.

- Machine learning to adapt to new circumstances and to detect and extrapolate patterns.

Definitions of Artificial Intelligence

The definitions of AI can be organized into four categories:

Systems that think like humans: "The automation of activities that we associate with human thinking, activities such as decision-making and problem solving." Bellman (1978).

Systems that act like humans: "The art of creating machines that perform functions that require intelligence when performed by people." Kurzweil (1990). "The study of how to make computers do things at which, at the moment, people are better." Rich and Knight (1991).

Systems that think rationally: "The study of mental faculties through the use of computational models." Charniak and McDermott (1985).

Systems that act rationally: "A field of study that seeks to explain and emulate intelligent behavior in terms of computational processes." Schalkoff (1990). "The branch of computer science that is concerned with the automation of intelligent behavior." Luger and Stubblefield (1993).

Artificial Intelligence is the art of creating machines that perform functions that require intelligence when performed by people (Kurzweil, 1990).

" The early studies on the operations of the mind established the field of logic. Today, the logical
approach aim to construct computer programs, with the hope that these programs will be

17

able to create intelligent systems. More specifically in Artificial Intelligent operation must have a
learning ability, Which is autonomous, goaI-directed and highly adaptive.

Artificial intelligence, as described above, demands a number of irreducible features and capabilities. In order to proactively accumulate knowledge from various (and/or changing) environments, it requires:

- Sensing, to obtain features from 'the world' (virtual or actual),

- A coherent means for storing knowledge obtained this way, and

- Adaptive output/actuation mechanisms (both static and dynamic). [7]

Importance and Applications of Artificial Intelligence

- It is being applied in industries for designing automobile parts.
- It is presently being used in the field of medicine for diagnosis.
- It is being used for mineral exploitation.
- It is being used in the field of education for instruction and learning.
- It is being used in image analysis and interpretation.
- It is applicable in carrying out maintenance and repair.
- It is being used for carrying out complex analysis of problems, i.e. debugging and interpretation.

Branches of Artificial Intelligence

The main research areas/branches in artificial intelligence are listed below:

- Game playing
- Natural language processing
- Expert systems
- Vision
- Robotics
- Machine learning
- Problem solving and inference
- Knowledge representation and utilization
- Speech processing
- Cognitive modeling
- Neural networks, or connectionism, or parallel distributed processing
- Logical AI
- Search
- Pattern recognition
- Common sense knowledge and reasoning
- Learning from experience
- Planning
- Epistemology
- Ontology
- Heuristics
- Genetic programming

WHAT IS ARTIFICIAL INTELLIGENCE? Artificial Intelligence (AI) is the study of how to make computers do things which minds can do. These include many things not normally thought of as intelligent, such as moving without bumping into obstacles, or gaining information about an environment through vision. Humans share these capacities, and also the ability to learn from experience, with many other animals. Only humans, however, have language. The intellectual aspects of intelligence depend on language. Much work in AI models intellectual tasks, as opposed to the sensory, motor, and adaptive abilities possessed by all mammals. Most AI systems are programs, existing only inside the computer. Others are robots, controlled either by a program or (in situated robots) by engineered reflexes.

More definitions are given below:

A.I. is the subfield of computer science that is concerned with symbolic reasoning and problem solving by manipulation of knowledge rather than mere data.

A.I. is the ability of a human-made machine (an automaton) to emulate or simulate human methods for the deductive and inductive acquisition and application of knowledge and reasoning.

A.I. is the art of creating machines that perform functions that require intelligence when performed by people (Kurzweil, 1990).

SCOPE OF A.I.

A.I. programs can do many different things. They can play games, predict share values, interpret photographs, diagnose diseases, plan travel itineraries, translate languages, take dictation, draw analogies, help design complex machinery, teach logic, make jokes, compose music, do drawings, and learn to do tasks better. Some of these things they do well. Expert systems can make medical diagnoses as well as, or better than, most human doctors. The world chess champion Garry Kasparov was beaten by a program in 1997; computers often predict share prices better than humans, and some AI-generated music sounds like compositions by famous composers. Other things they do rather badly. Their translations are imperfect, but good enough to be understood. Their dictation is reliable only if the vocabulary is predictable and the speech unusually clear. And their jokes are poor, although some are found funny by children. To match everything that people can do, they would need to model the richness and subtlety of human memory and common sense. Moreover, programs do only one thing, whereas people do many things.

AI robots, like industrial robots, are similarly limited. Very few can avoid obstacles smoothly, or move across uneven surfaces without falling over. Robots that plan their actions beforehand are vulnerable to unexpected environmental changes. Even if a robot performs successfully, it cannot undertake a wide variety of tasks. And its success often requires simplification of the environment: floor-cleaning robots are useful only if the floor is uncluttered. Nevertheless, AI robots can do boring, dirty, or dangerous jobs, sometimes in places that humans cannot reach.

TYPES OF ARTIFICIAL INTELLIGENCE

Symbolic Artificial Intelligence

Symbolic AI is based in logic. It uses sequences of rules to tell the computer what to do next. Expert systems consist of many so-called IF-THEN rules: IF this is the case, THEN do that. Since both sides of the rule can be defined in complex ways, rule-based programs can be very powerful. The performance of a logic-based program need not appear logical, since some rules may cause it to take apparently irrational actions. Illogical rules are not used for practical problem solving, but are useful in modelling how humans think. Symbolic programs are good at dealing with set problems, and at representing hierarchies (in grammar, for example, or planning). But they are brittle: if part of the expected input data is missing or mistaken, they may give a bad answer, or no answer at all.


Evolutionary Artificial Intelligence

Evolutionary AI draws on biology. Its programs make random changes in their own rules, and select the best 'daughter' programs to breed the next generation. This method develops problem-solving programs, and can evolve the 'brains' and bodies of robots. It is often used in modelling artificial life (A-Life). A-Life studies self-organization: how order arises from something that is ordered to a lesser degree. Biological examples include the flocking patterns of birds and the development of embryos. Technological examples include the A-Life flocking algorithms used for computer animation.

Connectionist Artificial Intelligence

Connectionism is inspired by the brain. It is closely related to computational neuroscience, which models actual brain cells and neural circuits. Connectionist AI uses artificial neural networks made of many units working in parallel. Each unit is connected to its neighbours by links that can raise or lower the likelihood that the neighbour unit will fire (excitatory and inhibitory connections respectively). Neural networks that are able to learn do so by changing the strengths of these links, depending on past experience. These simple units are much less complex than real neurons. Each can do only one thing: for instance, report a tiny vertical line at a particular place in an image. What matters is not what any individual unit is doing, but the overall activity pattern of the whole network.

APPLICATIONS OF AI: Since the use of computers continues to grow, it is natural to see that the use of AI grows with it. At present the trend of AI is up, hence we have included in this section the main aspects currently researched in AI. The intention is to give a basic understanding of the field, and to allow the reader to learn through seeing examples both in theory and in practice.

i. Game playing. There is some AI in game-playing programs, but they play well against people mainly through brute-force computation, looking at hundreds of thousands of positions. To beat a world champion by brute force and known reliable heuristics requires being able to look at 200 million positions per second. [19]

ii. Speech recognition. In the 1990s computer speech recognition reached a practical level for limited purposes. Thus United Airlines replaced its keyboard tree for flight information with a system using speech recognition of flight numbers and city names. It is quite convenient. On the other hand, while it is possible to instruct some computers using speech, most users have gone back to the keyboard and the mouse as still more convenient.

iii. Understanding natural language. Just getting a sequence of words into a computer is not enough. Parsing sentences is not enough either. The computer has to be provided with an understanding of the domain the text is about, and this is presently possible only for very limited domains.

iv. Computer vision. The world is composed of three-dimensional objects, but the inputs to the human eye and computers' TV cameras are two-dimensional. Some useful programs can work solely in two dimensions, but full computer vision requires partial three-dimensional information that is not just a set of two-dimensional views. At present there are only limited ways of representing three-dimensional information directly, and they are not as good as what humans evidently use.

v. Expert systems. A knowledge engineer interviews experts in a certain domain and tries to embody their knowledge in a computer program for carrying out some task. How well this works depends on whether the intellectual mechanisms required for the task are within the present state of AI. When this turned out not to be so, there were many disappointing results. One of the first expert systems was MYCIN in 1974, which diagnosed bacterial infections of the blood and suggested treatments. It did better than medical students or practicing doctors, provided its limitations were observed [8]. Namely, its ontology included bacteria, symptoms, and treatments, and did not include patients, doctors, hospitals, death, recovery, and events occurring in time. Its interactions depended on a single patient being considered. Since the experts consulted by the knowledge engineers knew about patients, doctors, death, recovery, etc., it is clear that the knowledge engineers forced what the experts told them into a predetermined framework. In the present state of AI, this has to be true. The usefulness of current expert systems depends on their users having common sense. [20]

vi. Heuristic classification. One of the most feasible kinds of expert system, given the present knowledge of AI, is to put some information into one of a fixed set of categories using several sources of information. An example is advising whether to accept a proposed credit card purchase, using information about the card holder, his record of payment, the item he is buying, and the establishment from which he is buying it (e.g. about whether there have been previous credit card frauds at this establishment). [19]

HEURISTICS

The word heuristic is derived from the Greek word heuriskein, meaning to find or to discover. After discovering the principle of flotation, Archimedes shouted "Heureka!" (I have found it); the word heureka was later converted to eureka.

George Polya (1945) defines heuristics as the study of the methods and rules of discovery and invention. A heuristic is a way of trying to discover something, or an idea embedded in a program. The term is used variously in AI.

Heuristic functions are used in some approaches to search to measure how far a node in a search tree seems to be from a goal. Heuristic predicates compare two nodes in a search tree to see if one is better than the other, i.e. constitutes an advance toward the goal, and may be more useful.


CHAPTER THREE

PROBLEM SOLVING IN ARTIFICIAL INTELLIGENCE

A problem is really a collection of information that an agent will use to decide what to do. Let us begin by specifying the information needed to define a single-state problem. The basic elements of a problem definition are the states and actions. To capture these formally, the following are needed:

- The initial state
- The operators
- The goal test
- The path cost

The initial state is the state that the agent knows itself to be in. There is also a set of possible actions available to the agent; the term operator is used to denote the description of an action in terms of which state will be reached by carrying out the action in a particular state.

Together, the initial state and the operator set define the state space of the problem.

The state space is the set of all states reachable from the initial state by any sequence of actions. A path in the state space is simply any sequence of actions leading from one state to another. The next element of a problem is the following:

The goal test, which the agent can apply to a single state description to determine if it is a goal state. Sometimes there is an explicit set of possible goal states, and the test simply checks to see if the agent has reached one of them. Sometimes the goal is specified by an abstract property rather than an explicitly enumerated set of states. For example, in chess the goal is to reach a state called checkmate, where the opponent's king can be captured on the next move no matter what the opponent does.

Finally, it may be the case that one solution is preferable to another, even though they both reach the goal. For example, we might prefer paths with fewer or less costly actions.

A path cost function is a function that assigns a cost to a path. In all cases we will consider, the cost of a path is the sum of the costs of the individual actions along the path. The path cost function is often denoted by g.

Together, the initial state, operator set, goal test, and path cost function define a problem. Naturally, we can define a data type with which to represent problems:

Datatype PROBLEM
    components: INITIAL-STATE, OPERATORS, GOAL-TEST, PATH-COST-FUNCTION
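A minimal Python sketch of this data type is given below; it is an illustration under the definitions above, and the field names simply mirror the listed components rather than coming from the text.

    from dataclasses import dataclass
    from typing import Any, Callable, List, Tuple

    State = Any
    Action = str

    @dataclass
    class Problem:
        initial_state: State
        # operators: maps a state to the (action, successor state, step cost) triples available there.
        operators: Callable[[State], List[Tuple[Action, State, float]]]
        goal_test: Callable[[State], bool]

        def path_cost(self, path: List[Tuple[Action, State, float]]) -> float:
            # g: the cost of a path is the sum of the costs of its individual actions.
            return sum(cost for _, _, cost in path)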

WHAT IS A SOLUTION?

The output of a search algorithm is a solution, that is, a path from the initial state to a state that satisfies the goal test. A search algorithm takes a problem as input and returns a solution in the form of an action sequence. Instances of the above datatype will be input to our search algorithms. The process of looking for sequences of actions that lead to states of known value and choosing the best one is called search. The majority of work in the area of search has gone into finding the right search strategy for a problem. In this field, we will evaluate strategies in terms of the four criteria stated below:

Completeness: is the strategy guaranteed to find a solution when there is one?

Time complexity: how long does it take to find a solution?

Space complexity: how much memory does it need to perform the search?

Optimality: does the strategy find the highest-quality solution when there are several different solutions?

Problems can be solved by searching; in this case we look at how an agent can decide what to do by systematically considering the outcomes of various sequences of actions that it might take. Once a solution is found, the actions it recommends can be carried out. This is called the execution phase. Thus, we have a simple formulate-search-execute design for an agent, as discussed in the next section. After formulating a goal and a problem to solve, the agent calls a search procedure to solve it. It then uses the solution to guide its actions, doing whatever the solution recommends as the next thing to do, and then removing that step from the sequence. Once the solution has been executed, the agent will find a new goal.
PROBLEM SOLVING AGENT

A problem solving agent is a goal-based agent that decides what to do by finding sequences of actions that lead to desirable states. A formulation process is needed by this agent. The problem type that results from the formulation process will depend on the knowledge available to the agent. The first step in problem solving is goal formulation. Intelligent agents are supposed to act in such a way that the environment goes through a sequence of states that maximizes the performance measure. An algorithm for a simple problem solving agent is written below.


A simple problem-solving agent

function SIMPLE-PROBLEM-SOLVING-AGENT(p) returns an action
    inputs: p, a percept
    static: s, an action sequence, initially empty
            state, some description of the current world state
            g, a goal, initially null
            problem, a problem formulation

    state <- UPDATE-STATE(state, p)
    if s is empty then
        g <- FORMULATE-GOAL(state)
        problem <- FORMULATE-PROBLEM(state, g)
        s <- SEARCH(problem)
    action <- RECOMMENDATION(s, state)
    s <- REMAINDER(s, state)
    return action
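A runnable Python sketch of the same loop is given below. It is illustrative only; the helper arguments (update_state, formulate_goal, formulate_problem, search) stand in for the functions named in the pseudocode, which the text does not define.

    def simple_problem_solving_agent(percept, memory, update_state, formulate_goal,
                                     formulate_problem, search):
        # memory holds the persistent ("static") variables: the plan s and the world state.
        memory["state"] = update_state(memory.get("state"), percept)
        if not memory.get("plan"):                      # s is empty: formulate and solve a new problem
            goal = formulate_goal(memory["state"])
            problem = formulate_problem(memory["state"], goal)
            memory["plan"] = search(problem) or []      # a solution is a sequence of actions
        if not memory["plan"]:
            return None                                 # no solution found
        action = memory["plan"][0]                      # RECOMMENDATION: first action of the plan
        memory["plan"] = memory["plan"][1:]             # REMAINDER: drop the executed step
        return action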

Problems in two views: Problems can be viewed as toy problems or real-world problems. Toy problems are intended to illustrate or exercise various problem solving methods. They can be given a concise, exact description, and they can be easily used by different researchers to compare the performance of algorithms. Examples include the 8-puzzle problem, the 8-queens problem and the vacuum world, amongst others. Real-world problems, on the other hand, tend not to have a single agreed-upon description, but a general flavor of their formulations is given below. Examples of real-world problems are robot navigation and assembly sequencing.


ROBOT NAVIGATION

Robot navigation is a generalization of the route-finding problem. Rather than a discrete set of routes, a robot can move in a continuous space with an infinite set of possible actions and states. For a simple, circular robot moving on a flat surface, the space is essentially two-dimensional. When the robot has arms and legs that must be controlled, the search space becomes many-dimensional. An example is a robot that can make a kinder cut on patients; robots can make surgery quicker, cheaper and safer.

ASSEMBLY SEQUENCING

Automatic assembly of complex objects by a robot was first demonstrated by FREDDY the robot (Michie, 1972). Progress since then has been slow but sure, to the point where the assembly of objects such as electric motors is feasible.


Search strategies

Search as a problem solving technique: humans generally consider a number of alternative strategies on their way to solving problems (Luger, 2001; Rich and Knight, 2003). Searching is a basic operational task of artificial intelligence, and A.I. programs depend on search procedures to perform a task. Problems generally are defined in terms of states, and their solutions correspond to goal states.

A.I. programs use a number of search strategies, which can be classified into two categories, namely informed (or heuristic) search strategies and uninformed (or blind) search. Not surprisingly, uninformed search is less effective than informed search. Uninformed search is still important, however, because there are many problems for which there is no additional information to consider.

We consider six uninformed search strategies:

(i) Breadth-first search

In this strategy, the root node is expanded first, then all the nodes generated by the root node are expanded next, then their successors, and so on. In general, all the nodes at depth d in the search tree are expanded before the nodes at depth d+1. Breadth-first search can be implemented by calling the GENERAL-SEARCH algorithm with a queuing function that puts the newly generated states at the end of the queue, after all the previously generated states:

function BREADTH-FIRST-SEARCH(problem) returns a solution or failure
    return GENERAL-SEARCH(problem, ENQUEUE-AT-END)

Breadth-first search is a very systematic strategy because it considers all the paths of length 1 first, then all those of length 2, and so on. Figure 1.1 shows the progress of the search on a simple binary tree. If there is a solution, breadth-first search will always find the shallowest goal state first. In terms of the four criteria, breadth-first search is complete, and it is optimal provided the path cost is a non-decreasing function of the depth of the node.
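A short Python sketch of breadth-first search over the Problem type sketched earlier in this chapter (an illustration, assuming the operators and goal_test callables introduced there):

    from collections import deque

    def breadth_first_search(problem):
        # The frontier is a FIFO queue: newly generated states go to the end (ENQUEUE-AT-END).
        frontier = deque([(problem.initial_state, [])])
        visited = {problem.initial_state}
        while frontier:
            state, path = frontier.popleft()
            if problem.goal_test(state):
                return path                      # the solution: a sequence of actions
            for action, successor, _cost in problem.operators(state):
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, path + [action]))
        return None                              # failure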

So far, the news about breadth-first search has been good. To see why it is not always the strategy of choice, we have to consider the amount of time and memory it takes to complete a search. To do this, we consider a hypothetical state space where every state can be expanded to yield b new states. We say that the branching factor of these states (and of the search tree) is b. The root of the search tree generates b nodes at the first level, each of which generates b more nodes, for a total of b^2 at the second level. Each of these generates b more nodes, yielding b^3 at the third level, and so on. Now suppose that the solution for this problem has a path length of d. Then the maximum number of nodes expanded before finding a solution is

1 + b + b^2 + b^3 + ... + b^d

This is the maximum number, but the solution could be found at any point on the dth level. In the best case, therefore, the number would be smaller.

Those who do complexity analysis get nervous (or excited, if they are the sort of people who like a challenge) whenever they see an exponential complexity bound like O(b^d). Figure 1.2 shows why. It shows the time and memory required for a breadth-first search with branching factor b = 10 and for various values of the solution depth. The space complexity is the same as the time complexity, because all the leaf nodes of the tree must be maintained in memory.

Table 1 assumes that 1000 nodes can be goal-checked and expanded per second, and that a node requires 100 bytes of storage. Many puzzle-like problems fit roughly within these assumptions (give or take a factor of 100) when run on a modern personal computer or workstation.

Table 1: Time and memory requirements for breadth-first search.

Depth   Nodes     Time            Memory
0       1         1 millisecond   100 bytes
2       111       0.1 seconds     11 kilobytes
4       11,111    11 seconds      1 megabyte
6       10^6      18 minutes      111 megabytes
8       10^8      31 hours        11 gigabytes
10      10^10     128 days        1 terabyte
12      10^12     35 years        111 terabytes
14      10^14     3500 years      11,111 terabytes

The figures shown assume branching factor b = 10; 1000 nodes/second; 100 bytes/node.
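The rows above follow directly from the node-count formula; a small Python check, using the same assumptions (b = 10, 1000 nodes/second, 100 bytes/node), reproduces them:

    def bfs_cost(depth, b=10, nodes_per_second=1000, bytes_per_node=100):
        nodes = sum(b ** i for i in range(depth + 1))   # 1 + b + b^2 + ... + b^d
        seconds = nodes / nodes_per_second
        memory_bytes = nodes * bytes_per_node
        return nodes, seconds, memory_bytes

    for d in range(0, 15, 2):
        nodes, seconds, memory = bfs_cost(d)
        print(d, nodes, f"{seconds:.1f} s", f"{memory / 1e9:.3f} GB")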

There are two lessons to be learned from Table 1. First, the memory requirements are a bigger problem for breadth-first search than the execution time. Most people have the patience to wait 18 minutes for a depth-6 search to complete, assuming they care about the answer, but not so many have the 111 megabytes of memory that are required. And although 31 hours would not be too long to wait for the solution to an important problem of depth 8, very few people indeed have access to the 11 gigabytes of memory it would take. Fortunately, there are other search strategies that require less memory.

The second lesson is that the time requirement is still a major factor. If your problem has a solution at depth 12, then (given our assumptions) it will take 35 years for an uninformed search to find it. Of course, if trends continue then in 10 years one will be able to buy a computer that is 100 times faster for the same price as the current one. Even with that computer, however, it will still take 128 days to find a solution at depth 12, and 35 years for a solution at depth 14. Moreover, there are no other uninformed search strategies that fare any better. In general, exponential complexity search problems cannot be solved for any but the smallest instances.

Uniform cost search

Breadth-first search finds the shallowest goal state, but this may not always be the least-cost solution for a general path cost function. Uniform cost search modifies the breadth-first strategy by always expanding the lowest-cost node on the fringe (as measured by the path cost g(n)), rather than the lowest-depth node. It is easy to see that breadth-first search is just uniform cost search with g(n) = DEPTH(n).

When certain conditions are met, the first solution that is found is guaranteed to be the cheapest solution, because if there were a cheaper path that was a solution, it would have been expanded earlier, and thus would have been found first. A look at the strategy in action will help explain. Consider the route-finding problem in Figure 2.1. The problem is to get from S to G, and the cost of each operator is marked.

Figure 2.1: (a) A state space showing the cost for each operator. (b) Progression of the search. Each node is labeled with g(n). At the next step, the goal node with the lowest g(n) will be selected.
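A uniform cost search sketch in Python follows (illustrative, again assuming the Problem type sketched earlier; the priority queue orders nodes by the path cost g):

    import heapq
    from itertools import count

    def uniform_cost_search(problem):
        tie = count()  # tie-breaker so states themselves never need to be compared
        frontier = [(0.0, next(tie), problem.initial_state, [])]
        best_g = {problem.initial_state: 0.0}
        while frontier:
            g, _, state, path = heapq.heappop(frontier)   # lowest-cost node on the fringe
            if problem.goal_test(state):
                return path
            for action, successor, step_cost in problem.operators(state):
                new_g = g + step_cost
                if new_g < best_g.get(successor, float("inf")):
                    best_g[successor] = new_g
                    heapq.heappush(frontier, (new_g, next(tie), successor, path + [action]))
        return None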

Depth-first search

Depth-first search always expands one of the nodes at the deepest level of the tree. Only when the search hits a dead end (a non-goal node with no expansion) does the search go back and expand nodes at a shallower level. This strategy can be implemented by GENERAL-SEARCH with a queuing function that always puts the newly generated states at the front of the queue. Because the expanded node was the deepest, its successors will be even deeper.

Depth-first search has very modest memory requirements. It needs to store only a single path from the root to a leaf node, along with the remaining unexpanded sibling nodes for each node on the path. For a state space with branching factor b and maximum depth m, depth-first search requires storage of only bm nodes, in contrast to the b^d that would be required by breadth-first search in the case where the shallowest goal is at depth d. Using the same assumptions as Figure 1.2, depth-first search would require 12 kilobytes instead of 111 terabytes at depth d = 12, a factor of about 10 billion times less space.

The time complexity for depth-first search is O(b^m). For problems that have very many solutions, depth-first search may actually be faster than breadth-first, because it has a good chance of finding a solution quickly.


The drawback of depth-first search is that it can get stuck going down the wrong path. Many problems have very deep or even infinite search trees, so depth-first search will never be able to recover from an unlucky choice at one of the nodes near the top of the tree. The search will always continue downward without backing up, even when a shallow solution exists. Thus, on these problems depth-first search will either get stuck in an infinite loop and never return a solution, or it may eventually find a solution path that is longer than the optimal solution. That means depth-first search is neither complete nor optimal. Because of this, depth-first search should be avoided for search trees with large or infinite maximum depths.

It is trivial to implement depth-first search with GENERAL-SEARCH:

function DEPTH-FIRST-SEARCH(problem) returns a solution, or failure
    return GENERAL-SEARCH(problem, ENQUEUE-AT-FRONT)

It is also common to implement depth-first search with a recursive function that calls itself on each of its children in turn. In this case, the queue is stored implicitly in the local state of each invocation on the calling stack.
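A recursive Python sketch of this idea is shown below (illustrative only; the limit parameter is an addition that anticipates the depth-limited variant discussed in the next section):

    def depth_limited_search(problem, state=None, limit=float("inf"), path=None):
        # Recursive depth-first search; the calling stack plays the role of the queue.
        # With limit = infinity this is plain depth-first search; a finite limit gives
        # the depth-limited search described in the next section.
        if state is None:
            state, path = problem.initial_state, []
        if problem.goal_test(state):
            return path
        if limit <= 0:
            return None                               # cutoff: do not expand deeper
        for action, successor, _cost in problem.operators(state):
            result = depth_limited_search(problem, successor, limit - 1, path + [action])
            if result is not None:
                return result
        return None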

Depth-limited search

Depth-limited search avoids the pitfalls of depth-first search by imposing a cutoff on the maximum depth of a path. This cutoff can be implemented with a special depth-limited search algorithm, or by using the general search algorithm with operators that keep track of the depth. For example, on the map of Romania there are 20 cities, so we know that if there is a solution, it must be of length 19 at the longest. We can implement the depth cutoff using operators of the form: if you are in city A and have traveled a path of fewer than 19 steps, then generate a new state in city B with a path length that is one greater. With this new operator set, we are guaranteed to find the solution if it exists, but we are still not guaranteed to find the shortest solution first: depth-limited search is complete but not optimal. If we choose a depth limit that is too small, then depth-limited search is not even complete. The time and space complexity of depth-limited search are similar to those of depth-first search. It takes O(b^l) time and O(bl) space, where l is the depth limit.

Iterative deepening search

Iterative deepening search is a strategy that sidesteps the issue of choosing the best depth limit by trying all possible depth limits: first depth 0, then depth 1, then depth 2, and so on.

The algorithm is shown in Figure 3. In effect, iterative deepening combines the benefits of depth-first and breadth-first search. It is optimal and complete, like breadth-first search, but has only the modest memory requirements of depth-first search. The order of expansion of states is similar to breadth-first, except that some states are expanded multiple times. Figure 4 shows the first four iterations of ITERATIVE-DEEPENING-SEARCH on a binary search tree.

Iterative deepening search may seem wasteful, because so many states are expanded multiple times.

Figure 3: The iterative deepening search algorithm

function ITERATIVE-DEEPENING-SEARCH(problem) returns a solution sequence
    inputs: problem, a problem
    for depth <- 0 to infinity do
        if DEPTH-LIMITED-SEARCH(problem, depth) succeeds then return its result
    end
    return failure

Figure 4: Four iterations of iterative deepening search on a binary tree
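In Python, iterative deepening can be sketched as a loop around the depth_limited_search helper given earlier (illustrative; the max_depth argument is an added safety bound, not part of the algorithm in the text):

    def iterative_deepening_search(problem, max_depth=50):
        # Try depth limits 0, 1, 2, ... until depth-limited search succeeds.
        for depth in range(max_depth + 1):
            result = depth_limited_search(problem, limit=depth)
            if result is not None:
                return result
        return None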

Bi-directional search

The idea behind bidirectional search is to simultaneously search both forward from the initial state and backward from the goal, and stop when the two searches meet in the middle. The search is about to succeed when a branch from the start node meets a branch from the goal node.

For problems where the branching factor is b, bi-directional search can make a big difference. If we assume as usual that there is a solution at depth d, then the solution will be found in O(2b^(d/2)) = O(b^(d/2)) steps, because the forward and backward searches each have to go only halfway. To make this concrete, for b = 10 and d = 6, breadth-first search generates 1,111,111 nodes, whereas bi-directional search succeeds when each direction is at depth 3, at which point 2,222 nodes have been generated. Several issues need to be addressed before this algorithm can be implemented. The space complexity of uninformed bidirectional search is O(b^(d/2)).
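The node counts quoted above can be checked with a couple of lines of Python (same assumptions, b = 10 and d = 6):

    b, d = 10, 6
    bfs_nodes = sum(b ** i for i in range(d + 1))                     # 1 + b + ... + b^d = 1,111,111
    bidirectional_nodes = 2 * sum(b ** i for i in range(d // 2 + 1))  # two searches to depth d/2
    print(bfs_nodes, bidirectional_nodes)                             # 1111111 2222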

CHAPTER FOUR

NATURAL LANGUAGE PROCESSING

Communication

Communication is such a widespread phenomenon that it is hard to pin down an exact definition. In general, it is the intentional exchange of information brought about by the production and perception of signs drawn from a shared system of conventional signs.

Most animals employ a fixed set of signs to represent messages that are important to their survival: food here, predator nearby, approach, withdraw, let's mate.

Humans use a limited number of conventional signs, like smiling and shaking hands, to communicate in much the same way as other animals. Humans have also developed a complex, structured system of signs known as language that enables them to communicate most of what they know about the world.

Fundamentals of Language

Basically, languages can be divided into two:

(1) Formal languages: these languages are invented and rigidly defined. Examples are Lisp, Prolog, first-order logic, etc.

(2) Natural languages: examples include German, Chinese, English and Spanish, which human beings use to talk to one another.

A formal language can also be defined as a set of strings, where each string is a sequence of symbols taken from a finite set called the terminal symbols. For English, the terminal symbols include words like a, aback, abacus, arrow, and about 400,000 more.

Working with formal and natural languages entails the use of phrase structure. This means that strings are composed of substrings called phrases, which come in different categories. The reasons for identifying phrases in this way are:

- Phrases are convenient handles to which we can attach semantics.

- Categorizing phrases helps us to describe the allowable strings of the language.

A noun phrase can combine with a verb phrase to form a phrase of category sentence. Grammatical categories are essentially posited as part of a scientific theory of language that accounts for the difference between grammatical and ungrammatical strings. Categories such as noun phrase, verb phrase and sentence are called nonterminal symbols.

The Component Steps of Communication

Communication is composed of seven processes. Considering a typical communication episode, in which a speaker wants to convey a proposition to a hearer, seven steps will be involved. Three take place in the speaker: intention, generation and synthesis, while four steps take place in the hearer: perception, analysis, disambiguation and incorporation.

The seven processes are explained below.

Intention: This involves reasoning about the beliefs and goals of the hearer, so that the utterance will have the desired impact.
Generation: The speaker uses knowledge about language to decide what to say. In many ways, this is
harder than the inverse problem of understanding.

Perception: When the medium of communication is speech, the perception step is called speech
recognition; when it is printing, it is called optical character recognition.

Analysis: This can be divided into two main parts: syntactic interpretation (or parsing) and semantic
interpretation. Parsing refers to the process of assigning a part of speech (noun, verb and so on) to each
word in a sentence and grouping the words into phrases. A parse tree could be used to display the result
of a syntactic analysis. It is a tree in which interior nodes represent phrases, links represent applications
of grammar rules and leaf nodes represent words.

Semantic interpretation includes understanding the meanings of words and incorporating knowledge of
the current situation. It is also called pragmatic interpretation and it extracts the meaning of an
utterance as an expression in some representation language. Disambiguation: This is the first process that depends heavily on uncertain reasoning. Analysis generates possible interpretations; if more than one interpretation is found, then disambiguation chooses the best one.

Incorporation: This entails taking the words used and the derived interpretation as additional pieces of
evidence that get considered along with all other evidence for and against the derived interpretation.

Models of Communication

There are two models of communication:

Encoded message model: This says that the speaker has a definite proposition in mind and encodes the proposition into words or signs. The hearer then tries to decode the message to retrieve the original proposition. Under this model, the meaning in the speaker's head, the message that gets transmitted, and the interpretation that the hearer arrives at all carry the same content. When they differ, it is because of noise in the communication channel or an error in encoding or decoding.

Situated language model: This says that the meaning of a message depends on both the words and the situation in which the words are uttered. This accounts for the fact that the same words can have very different meanings in different situations.

GRAMMAR

Lexicon: The first step in defining a grammar is to define a lexicon, or list of allowable vocabulary words. The words are grouped into the categories or parts of speech familiar to dictionary users: nouns, pronouns, and names to denote things; verbs to denote events; adjectives to modify nouns; and adverbs to modify verbs. Other categories include articles, prepositions and conjunctions. For nouns, verbs, adjectives, and adverbs, it is in principle infeasible to list them all. Not only are there thousands or tens of thousands of members in each class, but new ones are constantly being added. These four categories are called open classes. The other categories (pronoun, article, preposition, and conjunction) are called closed classes. They have a small number of words that could in principle be enumerated. Closed classes normally change only over the course of centuries.

Grammar: Having defined a lexicon, the next step is to combine the words into phrases. The different kinds of phrases include sentence, noun phrase, verb phrase, prepositional phrase, and relative clause.

Parsing: Parsing entails recovering the phrase structure of an utterance, given a grammar. There are many possible parsing algorithms. Some operate top-down, starting with a sentence and expanding it according to the grammar rules to match the words in the string. Some use a combination of top-down and bottom-up, and some use dynamic programming techniques to avoid the inefficiencies of backtracking.
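As an illustration of the top-down style (a toy grammar and lexicon constructed here, not taken from the text), the Python sketch below parses a tiny fragment of English by recursive descent:

    # Toy grammar: S -> NP VP, NP -> Article Noun, VP -> Verb NP
    LEXICON = {"the": "Article", "a": "Article",
               "dog": "Noun", "cat": "Noun",
               "sees": "Verb", "chases": "Verb"}
    GRAMMAR = {"S": [["NP", "VP"]],
               "NP": [["Article", "Noun"]],
               "VP": [["Verb", "NP"]]}

    def parse(symbol, words, i=0):
        # Returns (parse_tree, next_index) or None. Top-down: expand 'symbol' by its rules.
        if symbol in GRAMMAR:
            for rule in GRAMMAR[symbol]:
                children, j = [], i
                for part in rule:
                    result = parse(part, words, j)
                    if result is None:
                        break
                    child, j = result
                    children.append(child)
                else:
                    return (symbol, children), j
            return None
        # Terminal category: match the next word's part of speech.
        if i < len(words) and LEXICON.get(words[i]) == symbol:
            return (symbol, words[i]), i + 1
        return None

    tree, end = parse("S", "the dog chases a cat".split())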

Definite Clause Grammar (DCG)

Once a language is used for communication, a way of associating a meaning with each string is required. Also, grammars that are context-sensitive must be described. These are two problems with BNF, which talks only about strings and is context-free. Therefore, the idea is to use first-order logic to talk about strings and their meanings. Each nonterminal symbol becomes a one-place predicate that is true of strings that are phrases of that category.

A logic grammar is a grammar written with logical sentences. In using logical inference to solve the problems, unrestricted inference is expensive, therefore a restricted format is preferred by most logic grammars. An example, and the most common, is the definite clause grammar (DCG). This is a case in which every sentence must be a definite clause. DCG is attractive because it allows grammars to be described in terms of first-order logic.

Augmenting a Grammar

Because the English language is not context-free (for example, not every noun phrase can combine with any verb phrase to form a sentence), there is a need to introduce new categories, like subjective and objective noun phrases, if a context-free grammar is to be used.

However, an alternative approach is to augment the existing rules of the grammar instead of introducing new ones. This will produce a compact and concise grammar. Subject-verb agreement can also be handled with augmentation.

Verb Sub-categorization

In order to prevent ungrammatical sentences, the grammar must state which verbs can be followed by which other categories. This is called sub-categorization information for the verb. This means that each verb has a list of obligatory phrases that follow it within the verb phrase.

A sub-categorization list is therefore a list of complement categories that a verb accepts. It is possible for a verb to have more than one sub-categorization list.

The three steps needed to integrate verb sub-categorization are: (1) Augmenting the category verb phrase to take a sub-categorization argument that indicates the complements that are needed to form a complete verb phrase. (2) Changing the rule for sentences to say that a sentence requires a verb phrase that has all its complements. (3) Noting that, in addition to complements, phrases can also take adjuncts.

Generative Capacity of Augmented Grammars

The number of values for the augmentations determines the generative capacity of an augmented grammar. An augmented grammar is equivalent to a context-free grammar if there is a finite number of values. In the general case, however, augmented grammars go beyond context-free.

Semantic Interpretation

There is a need to consider the semantics of formal languages, because expressions that have a compositional semantics are easy to deal with. In semantic interpretation, the semantics of any phrase is a function of the semantics of its sub-phrases. That is, if the meanings of the sub-phrases are known, then the whole phrase has a known meaning. This allows for the handling of an infinite grammar with a finite and concise set of rules. Semantic interpretation is responsible for combining meanings compositionally in order to get a set of possible interpretations.

Due to some of the shortcomings of semantic interpretation, many modern grammars use an intermediate form called quasi-logical form, so called because it sits between the syntactic and logical forms. It has two properties:

- It is structurally similar to the syntax of the sentence and can thus be easily constructed through compositional means.

- It contains enough information that it can be translated into a first-order logical sentence.

Steps in Semantic Interpretation

i. The logical or quasi-logical form to be generated must be decided. Writing down some example sentences and the corresponding logical forms can do this.

ii. Modification of the example sentences one word at a time, in order to study the corresponding logical form.

iii. The basic logical type of each lexical category should be written down along with some logical form pairs. Once the semantic type of one word in a category is decided, everything in the category is of the same type.

iv. Determination of examples and types for constituent phrases. These modifications are done one phrase at a time.

v. Attachment of semantic interpretation augmentations to the grammar rules.

vi. Application of the relation or function to the object.

vii. Building up semantics by concatenating the semantics of the constituents, possibly with some connectors wrapped around them.

viii. Taking apart one of the constituents before putting the semantics of the whole phrase back together.

Ambiguity

The biggest problem that can cause communication breakdown is ambiguity, that is, the fact that most utterances are ambiguous.

Types of Ambiguity

1) Lexical ambiguity: this is the simplest type. It is a case where a word has more than one meaning.

2) Syntactic ambiguity: it is also called structural ambiguity. It can occur where a prepositional phrase can modify either the noun or the verb. It leads to semantic ambiguity.

3) Semantic ambiguity: this occurs mostly in phrases which have lexical and syntactic ambiguities.

4) Referential ambiguity: this is a pervasive form of semantic ambiguity. This type occurs because natural language consists almost entirely of words for categories, not for individual objects.

5) Pragmatic ambiguity: it occurs when the speaker and hearer disagree on what the current situation is.

6) Local ambiguity: this occurs where a sub-string can be parsed in several ways, but only one of these ways fits into the larger context of the whole string.

CHAPTER FIVE
EXPERT SYSTEMS

What is an Expert System?

An expert system is a computer program designed to simulate the problem-solving behavior of a human who is an expert in a narrow domain or discipline. An expert system is normally composed of a knowledge base (information, heuristics, etc.), an inference engine (which analyzes the knowledge base), and the end-user interface (accepting inputs, generating outputs). The path that leads to the development of expert systems is different from that of conventional programming techniques. The concepts for expert system development come from the subject domain of artificial intelligence (AI), and require a departure from conventional computing practices and programming techniques. A conventional program consists of an algorithmic process to reach a specific result. An AI program is made up of a knowledge base and a procedure to infer an answer.

Here are two other definitions of an expert system:

- A model and associated procedure that exhibits, within a specific domain, a degree of expertise in problem solving that is comparable to that of a human expert (Ignizio, 1991).

- An expert system is a computer system which emulates the decision-making ability of a human expert (Giarratano, 1992).

Simply put, an expert system contains knowledge derived from an expert in some narrow domain. This knowledge is used to help individuals using the expert system to solve some problem. The traditional definition of a computer program is usually:

    algorithm + data structures = program

In an expert system, the definition changes to:

    inference engine + knowledge = expert system
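To make the "inference engine + knowledge" idea concrete, here is a minimal Python sketch (an illustration invented for these notes, not a real expert system shell): the knowledge base is a set of facts plus IF-THEN rules, and the engine forward-chains over them until no new facts can be derived.

    # Knowledge base: facts plus IF-THEN rules (conditions -> conclusion).
    facts = {"fever", "cough"}
    rules = [
        ({"fever", "cough"}, "flu_suspected"),
        ({"flu_suspected"}, "recommend_rest"),
    ]

    def forward_chain(facts, rules):
        # Inference engine: repeatedly fire any rule whose IF-part is satisfied.
        derived = set(facts)
        changed = True
        while changed:
            changed = False
            for conditions, conclusion in rules:
                if conditions <= derived and conclusion not in derived:
                    derived.add(conclusion)
                    changed = True
        return derived

    print(forward_chain(facts, rules))
    # {'fever', 'cough', 'flu_suspected', 'recommend_rest'}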

Today's expert systems deal with domains of narrow specialization; they do not yet perform competently over a broad range of tasks. A newer deployment of expert system technology is the autonomous intelligent agent. An autonomous intelligent agent is a system situated within, and a part of, an environment that senses that environment and acts on it, over time, in pursuit of its own agenda and so as to effect what it senses in the future.

Expert systems are capable of delivering quantitative information, much of which has been developed through basic and applied research. The expert system derives its answers by running the knowledge base through an inference engine, a software program that interacts with the user and processes the results from the rules and data in the knowledge base.

One of the most powerful attributes of expert systems is the ability to explain reasoning. Since the system remembers its logical chain of reasoning, a user may ask for an explanation of a recommendation, and the system will display the factors it considered in providing a particular recommendation. This attribute enhances user confidence in the recommendation and acceptance of the expert system. The development of an electronic decision support system requires the combined effort of specialists from many domains.
A brief history of Expert System evolution

Edward A. Feigenbaum was one of the people in artificial intelligence research who decided, in the mid-1960s, that it was important to find out how much a program can know, and that the best way to find out would be to try to construct an artificial expert. While looking for an appropriate field, he met Joshua Lederberg, a Nobel laureate biochemist, who suggested that organic chemists sorely needed assistance in determining the molecular structure of chemical compounds.

Together with Bruce Buchanan, Lederberg and Feigenbaum began work on Dendral, the first expert system, in 1965 at Stanford University. Conventional computer-based systems had failed to provide organic chemists with a tool for forecasting molecular structure. Human chemists know that the possible structure of any chemical compound depends on a number of basic rules about how different atoms can bond to one another. They also know a lot of facts about the different atoms in known compounds. When they make or discover a previously unknown compound, they can gather evidence about the compound by analyzing the substance with a mass spectroscope, which provides a lot of data but no clues to what it all means.

Classification and types of Expert Systems

Expert systems can be broadly categorized according to several main delimiters, such as:

i.) Area of application:

- Configuration expert systems
- Diagnosis and predictive expert systems
- Instruction expert systems
- Interpretation and analytical expert systems
- Monitoring and management oversight expert systems
- Planning and scheduling expert systems
- Prognosis and remedy expert systems
- Control expert systems
- Autonomous intelligent agents

ii.) Logic of the inference engine or construction. This relates to the logical pattern or method of the inference and reasoning engine of the expert system (its logical construct); by this we have:

- Heuristic classification
- Heuristic construction
- Fuzzy logic
- Neural network (forward and backward propagation network) logic
- Adaptive semantic logic

iii.) Characteristics of Expert Systems
A good expert system must have the following qualities:

o Acquires broad knowledge in the field of expertise
o Has a fast response time (the time between the posing of a problem and the presentation of a solution should be small)
o Represents knowledge in a concise, robust and simple form
o Has a friendly user interface
o Explains and justifies its reasoning and conclusions
o Transfers new knowledge to, and/or modifies existing knowledge in, the system's knowledge base.

An expert system can be distinguished from other kinds of artificial intelligence programs because:

o It deals with subject matter of realistic complexity that normally requires a considerable amount of human expertise.
o It must exhibit high performance in terms of speed and reliability in order to be a useful tool.
o It separates knowledge (the codified knowledge base) from reasoning.
o It must be capable of explaining and justifying solutions and recommendations, in order to convince the user that its reasoning is in fact correct.

Reasons for Using Expert Systems
o Facility for non-expert personnel to solve problems that require some expertise
o When human experts are difficult to find
o For speedy solutions
o Power to manage without a human expert
o When human experts are expensive
o When available information is poor, incomplete and fuzzy
o To eliminate uncomfortable and monotonous activities
o When knowledge improvement is needed

Examples of Expert Systems
Some examples of expert systems include the following:

MYCIN: The Mycin expert system, developed at Stanford University in 1972, is used in the medical domain to diagnose blood infections. It uses approximately 600 production rules. It could also explain the reasons that led to its diagnosis and recommendation.

DENDRAL: A chemical analysis expert system developed in 1965 at Stanford University by Edward Feigenbaum and Joshua Lederberg. Its primary aim was to help organic chemists identify unknown organic molecules by analyzing their mass spectra. Dendral consists of two sub-programs, Heuristic Dendral and Meta-Dendral.

PUFF: Puff is an expert system used for the interpretation of pulmonary function data. PUFF was developed as a heuristic programming project by the departments of Medicine and Computer Science of Stanford University.

ALADIN: An expert system that aids metallurgists in the design of new aluminum alloys. It was developed in 1986 by the Robotics Institute, Carnegie Mellon University.

MACSYMA: Developed at the Massachusetts Institute of Technology (MIT) for assisting in solving complex mathematical problems.

LIMEX: An integrated expert system with multimedia that was developed to assist lime growers and extension agents in the cultivation of lime for the purpose of improving their yield. Its scope includes: assessment, irrigation, fertilization and pest control.

STD WIZARD: An expert system for recommending medical screening tests.


Other expert systems include:

EMYCIN, MOLGEN, PROSPECTOR, XCON, HELP, EXPERTAX

Development of Expert Systems
The process of building an expert system is called knowledge engineering, and it is carried out by a knowledge engineer. The knowledge engineer is a human with a good background in computer science and AI. A knowledge engineer also decides how to represent the knowledge in an expert system and helps the programmer to write the code. Knowledge engineering is the acquisition of knowledge from a human expert or any other source. The diagram below illustrates the process.

The steps involved in the development of an expert system include:
o Statement of the problem to be solved
o Search for the human expert or the equivalent data or experience
o Design of the expert system
o Selection of the degree of participation of the user
o Selection of the development tool, shell, or programming language
o Development of a prototype
o Prototype checking
o Refinement and generalization
o Maintenance
o Updating

Errors in Development Stages
Expert system development errors can be classified as follows:
i. Expert's knowledge error
ii. Semantic error
iii. Syntax error
iv. Inference engine error
v. Inference chain error
vi. Limit of ignorance error

CHAPTER SIX
KNOWLEDGE BASED SYSTEMS

Introduction
An Intelligent Knowledge Based System (KBS) has the capacity to acquire, store, retrieve, communicate, process and use knowledge for the purpose of solving problems. An expert system (ES) is a type of KBS that uses human knowledge captured in a computer to solve problems that ordinarily require human expertise. Well-designed systems imitate the reasoning process experts use to solve specific problems, usually within a narrow area of expertise called the domain.

Advantages of using a Knowledge Based System include:

o The system should perform better than a human (fewer errors than non-experts)
o Consistency, as it will not overlook solutions that are available
o Knowledge transfer will be available
o The system can manipulate knowledge, manage complex documents and help analyse expert knowledge
o The system can provide training to non-experts or those training to become experts (e.g. medical students and home users)
o The system can handle uncertainty by making explicit the human knowledge used.

Limitations of a Knowledge Based System (KBS):
o The knowledge that is required to build a successful system is not always readily available.
o It is sometimes difficult to extract the precise expertise from experts that is needed for the system to be accurate.
o The system will only work well within a narrow domain of knowledge.
o The vocabulary or jargon that experts use to express facts and relations is often limited and not understood by others, i.e. non-expert users.
o Incorrect recommendations may be made, as the system does not always arrive at correct conclusions.

KNOWLEDGE ELICITATION
Knowledge elicitation can be described as 'acquiring knowledge from human experts and learning from data'.

There are three major stages to the process:

o Initial understanding and structuring of the domain (environment, tasks, information flow, concepts and attributes).
o Producing the first working system (extracting relationships between domain concepts).
o Testing and debugging the system.

The knowledge utilized by an expert system is derived from a variety of sources, usually through human experts. Expertise can be divided into four categories:
i) Domain level
ii) Task level
iii) Strategic level
iv) Inference level

Research studies have identified a number of tools which can be utilized to help undertake the process of knowledge elicitation. These tools are listed in Table 1 with their advantages and disadvantages.
Knowledge Elicitation Methods
Many Knowledge Elicitation (KE) methods have been used to obtain the information required to solve problems. These methods can be classified in many ways. One common way is by how directly they obtain information from the domain expert. Direct methods involve directly questioning a domain expert on how they do their job. In order for these methods to be successful, the domain expert has to be reasonably articulate and willing to share information. The information also has to be easily expressed by the expert, which is often difficult because frequently performed tasks often become automatic. Indirect methods are used in order to obtain information that cannot be easily expressed directly.

Two other ways of classifying methods are discussed. One classifies the methods by how they interact with the domain expert. Another classifies them by what type of information is obtained.

Other factors that influence the choice of knowledge elicitation method are the amount of domain knowledge required by the knowledge engineer and the effort required to analyze the data.

Knowledge Elicitation (KE) Methods by Interaction Type
There are many ways of grouping KE methods. One is to group them by the type of interaction with the domain expert. Table 2 shows the categories and the type of information produced.

Case Study
In case study methods, different examples of problems/tasks within a domain are discussed. The problems consist of specific cases that can be typical, difficult, or memorable. These cases are used as a context within which directed questions are asked.

Protocols
Protocol analysis [Ericsson and Simon, 1984] involves asking the expert to perform a task while 'thinking aloud'. The intent is to capture both the actions performed and the mental process used to determine these actions. As with all the direct methods, the success of protocol analysis depends on the ability of the expert to describe why they are making their decisions. In some cases, the expert may not remember why they do things a certain way. In many cases, the verbalized thoughts will only be a subset of the actual knowledge used to perform the task. One method used to augment this information is interruption analysis. For this method, the knowledge engineer interrupts the expert at critical points in the task to ask questions about why they performed a particular action.

Role Playing
In role playing, the expert adopts a role and acts out a scenario where their knowledge is used [Geiwitz et al., 1990]. The intent is that, by viewing a situation from a different perspective, information will be revealed that was not discussed when the expert was asked directly. Table 6 illustrates role playing.

Knowledge Elicitation Methods by Knowledge Type Obtained
KE methods can also be grouped (to some extent) by the type of knowledge obtained, since some designers may not be able to directly express how they perform a design task. The following are some information/knowledge types that can be obtained:

o Problem solving strategy
o Procedures
o Goals
o Classification
o Relationships
o Evaluation

CHAPTER SEVEN

SOFTWARE AGENTS

Software agents are basically software components which operate fairly independently. They act in order to accomplish a task on behalf of their user. By this property, the history of software agents can be traced back to the early 1970s, when work was conducted in the fields of software engineering, human interface research and artificial intelligence (AI). Conceptually, the predecessors of software agents are found in the technology where the concept of 'actors' was introduced. These actors were self-contained objects, with their own encapsulated internal state and some interactive and concurrent communication capabilities [7].

Agents come in many different flavours. Depending on their intended use, agents are referred to by an enormous variety of names, e.g. softbot, userbot, personal assistant, travel agent, mobile agent, search agent, intranet agent, database agent, identification agent, information agent. The word 'agent' is an umbrella term that covers a wide range of specific agent types. The most popular names used for different agents are highly non-descriptive. It is therefore preferable to describe and classify agents according to the specific properties they exhibit [2].

A definition close to present-day reality is that of Ted Selker from the IBM Almaden Research Center: an agent is a software thing that knows how to do things that you could probably do yourself if you had the time [4]. As an agent can solve tasks given by its user, agents can also co-operate with other agents. Agent communication languages have been developed because of the nature of the conversation which the agents are designed to have; basic message passing was insufficient because it did not offer any goal-oriented approach. Usually, co-operation is the eligible behaviour if there is a collective of agents available. The benefits of co-operation are: efficient use of resources, speed-up in solving problems, and the ability to solve more complex problems.

Software Agent
The definition of a software agent requires that the agent possesses the following minimal characteristics:

o Delegation: The agent performs a set of tasks on behalf of a user (or other agents) that are explicitly approved by the user.
o Communication skills: The agent needs to be able to interact with the user to receive task delegation instructions, and to report task status and completion, through an agent user interface or through an agent communication language.
o Autonomy: The agent operates without direct intervention (e.g., in the background) to the extent of the user's specified delegation. The autonomy attribute of an agent can range from being able to initiate a nightly backup to negotiating the best price of a product for the user.
o Monitoring: The agent needs to be able to monitor its environment in order to be able to perform tasks autonomously.
o Actuation: The agent needs to be able to affect its environment via an actuation mechanism for autonomous operation.
o Intelligence: The agent needs to be able to interpret the monitored events to make appropriate actuation decisions for autonomous operation.

In addition to the basic attributes mentioned above, an agent may have other attributes such as mobility, security and others. A minimal sketch of this sense-decide-act loop is given below.
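
To make these attributes concrete, the following is a minimal, hypothetical sketch of an agent's sense-decide-act loop; the environment variable, the "disk usage" observation and the "run_backup" action are illustrative assumptions, not a real agent platform.

```python
# A minimal, hypothetical software-agent loop illustrating delegation, monitoring,
# intelligence, actuation and autonomy. All names and values are illustrative only.

class SimpleAgent:
    def __init__(self, delegated_task):
        self.delegated_task = delegated_task   # Delegation: what the user asked for
        self.log = []                          # Communication: status reported to the user

    def monitor(self, environment):
        """Monitoring: observe the current state of the environment."""
        return environment.get("disk_usage", 0)

    def decide(self, observation):
        """Intelligence: interpret the observation and choose an action."""
        return "run_backup" if observation > 80 else "wait"

    def actuate(self, action, environment):
        """Actuation: affect the environment according to the chosen action."""
        if action == "run_backup":
            environment["disk_usage"] = 10
        self.log.append(f"{self.delegated_task}: did {action}")

    def step(self, environment):
        """Autonomy: one sense-decide-act cycle performed without user intervention."""
        self.actuate(self.decide(self.monitor(environment)), environment)

if __name__ == "__main__":
    env = {"disk_usage": 92}
    agent = SimpleAgent("nightly backup")
    agent.step(env)
    print(agent.log, env)
```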

CLASSIFICATION OF AGENTS
There is no easy and straightforward way of grouping agents into classes. One way is by mobility, which yields the classes of stationary (or static) and mobile agents. Secondly, agents can be grouped by what they respond to: do they merely react to changes in the environment, or do they possess some sort of deliberate behaviour? Deliberate in this case means an agent which is capable of some sort of deliberate thinking through a symbolic reasoning model [6].

These criteria can be combined, together with the agents' capacity to learn as they react and interact with other agents, to yield the agent types described below.

Collaboration Agents
These emphasize autonomy and co-operation with other agents in order to complete the tasks given by their user. These agents can be capable of learning, but this is not necessary. They exhibit the following characteristics: autonomy, social ability, responsiveness and proactiveness. Hence they are able to act rationally and autonomously in open and time-constrained multi-agent environments. Application areas to which collaboration agents are suited come, among other places, from industry; they are often located between the monitoring process and the diagnostics. [13]

Interface Agents
These emphasize autonomy and learning in order to complete the tasks given by their user. The key metaphor underlying interface agents is that they are personal assistants who collaborate with the human user in the same working environment. Interaction between a human and an interface agent does not necessarily require an agent communication language, as the communication between agents does. Interface agents typically learn as they assist their users. Here is a list of ways in which an agent can learn from its user [13]:

o By observing and imitating
o By receiving positive and negative feedback
o By receiving clear instructions
o By taking advice from other agents.

Mobile Agents
These are computational software processes capable of migrating from one computer to another via the Internet or an intranet. They interact with foreign hosts, gathering information on behalf of their owner, and then return home with the gathered information. This sort of agent can be used in various applications, ranging from flight reservation to the management of telecommunication networks. Mobile agents are defined as agents because they are autonomous and they co-operate with other agents. When they do co-operate, it is important that they exchange information in the same area of interest without necessarily giving it all away, and, on the other hand, that they do not gather the same information twice or transport information to the location from where it originated. Security issues play a great role when designers are dealing with mobile agents; for instance, viruses exhibit the same behaviour as agents do, which raises the problem of how to distinguish agents from viruses. [13]

Information Agents
These are tools for gathering the constantly growing information on the Internet. Information agents perform the role of managing, manipulating and collating the retrieved and distributed information. Admittedly this role definition is somewhat loose, since it fits other agents' definitions as well; hence this is an issue of debate, among other issues, in the agent ideology. Information agents can have many different roles depending on their job in a given application, e.g. retrieving information from databases, in which case they are usually called database agents. [13]

Reactive Agents
These belong to a special category because they do not possess any kind of symbolic model of their environment. Therefore they act/respond in a stimulus-response manner to the present state of the environment in which they are embedded. Reactive agents are relatively simple and they interact with other agents in a basic way. Nevertheless, complex patterns of behaviour emerge from these interactions when the ensemble of agents is viewed globally. Because of these properties, reactive agents tend to operate on raw sensor data that needs to be processed quickly. [13]

Hybrid Agents
These are a combination of the agents mentioned in this chapter. Since each type has its own strengths and deficiencies, the aim is to find the most suitable solution for the problem at hand, one in which the deficiencies are minimized in a straightforward manner. [13]

Intelligent Agents
These always have some sort of decision-making model, which gives the agent a primitive level of intelligence. This intelligence is usually based on reasoning theory, fuzzy logic, knowledge-based systems, neural networks, or some combination of the previous four. Basically, an agent is intelligent if it perceives its environment and is capable of reasoning about its perceptions, solving problems, and determining actions depending on its environment and the tasks given to it by its user. It is possible for intelligent agents to learn as they communicate with their users or other agents. However, building a learning engine is such a complex task that most current intelligent agents are not learning agents. In many cases intelligent agents are the interfaces between human users and the agent community. This interaction can be done in several ways, e.g. text recognition, a speech interface, a command line, and so forth. Figure 2.4 shows the fields which influence intelligent agents. [13]

Mobility of Agents during Transactions

The following illustrates the various communications between software agents which carry out transactions/operations during online shopping.

a). Local Agents or User Interface Agents
These types of agents access local resources only. Usually, they act as advisory agents (e.g. intelligent help systems) or personal assistants which support human users during their daily work. That is, they are used to monitor the user's interactions with the application and can control various aspects of that interaction, such as the level of prompting or the number of options available.

b). Network Agents
Network agents, in contrast to local agents, can access not only local resources but also remote resources, and thus have more or less knowledge about the network infrastructure. They not only provide an intelligent interface to the user but also make extensive use of the various services available in the network.

c). DAI-based Agents
Distributed Artificial Intelligence (DAI) based agents coordinate intelligent behaviour among a collection of autonomous intelligent agents, i.e. they coordinate their knowledge, goals, skills, and plans to jointly take actions or solve problems. They are developed using various AI techniques such as rule-based systems and case- and example-based reasoning. So, the main concern here is agent cooperation.

d). Mobile Agents
Also known as travelling agents, these agents shuttle their being, code and state, among resources. This often improves performance by moving the agents to where the data reside instead of moving data to where the agents reside. Mobile agents are aimed primarily at large computer networks offering a huge number of sophisticated services. So, this existing technology supports agent mobility.

e). Filtering Agents
Filtering agents, as their name implies, act as filters that allow information of particular interest or relevance to users to get through, while eliminating the flow of useless or irrelevant information. They can also interact with other agents, functioning in a multi-agent system.

f). Information Agents
This is a parallel agent type to the filtering agent: rather than cutting down the information received, it actively finds information for the user. As a research or intelligence-gathering tool, an information agent could provide an invaluable service, keeping the user informed of any developments in a field or of new web sites that contain information related to their area of interest.

g). User Interface Agents
The major goal of these agents is to collaborate with the user, and hence the main emphasis of investigations clearly lies in the field of user/agent interaction. Therefore, these types of agents are also called intelligent interface or interface agents.

h). Office or Workflow Agents
An office management agent automates the kinds of routine, daily tasks that take up so much time at the office. These tasks include scheduling meetings, sending faxes, holding meetings, reviewing information, and updating process documents. Some of these tasks can be considered under workgroup or workflow software because they deal with documents and calendars.

i). System Agents
Their main job is to manage the operations of a computing system or a data communication network. These agents monitor for device failures or system overloads and redirect work to other parts in order to maintain a level of performance and/or reliability. They also act as resource managers for the system. System agents can be pro-active, responding not only to specific events in the environment but also taking the initiative to recognize situations that call for pre-emptive actions.
j). Brokering or Transaction Agents
Transaction agents concentrate on the monitoring and execution of transactions. Applications are found in e-commerce, manufacturing and the management of business processes. A broker agent is a software program that takes a request from a buyer and searches for a set of possible sellers using the buyer's criteria for the item of interest. When potential sellers that satisfy the request are found, the broker agent can return the result to the user, who chooses a seller and manually executes the transaction.

INTELLIGENT MOBILE AGENTS
Intelligent mobile agents can be considered as active objects or objects with mental states. To understand intelligent mobile agents we should first understand what an agent is. An agent is software that assists with tasks and acts on behalf of the initiator. Agents are typically autonomous, goal-driven, reactive, social, adaptive and mobile; an agent may have one or more of these characteristics. An autonomous agent (object) can be programmed to satisfy one or more goals, even if the agent (object) moves and loses contact with its creator. A mobile agent has the ability to move independently from one device to another on a network. Mobile agents are generally serializable and persistent.

There are also several public definitions of mobile agents. One definition that seems to hold universal meaning comes from Caltech: mobile agents are programs that encapsulate data and code, which may be dispatched from a client computer and transported to a remote server for execution. Intelligent mobile agents execute asynchronously and autonomously: once a user has created an agent, it can run without intervention from the user. Mobile agents provide a reliable transport between a client and a server without necessitating a reliable underlying communications medium.

Advantages of Using Intelligent Mobile Agents (IMA)

o The system should perform better and faster than a human.
o Useful in unreliable networks or unconducive environments.
o No special client is needed, especially in a shopping hall where an intelligent mobile agent (e.g. a robot) is used.
o It reduces the work to be done by humans, e.g. an IMA functioning as a road traffic controller.
o An IMA can be sent into a shopping mall to find special offers, and it can also be sent to deliver mail.
o Software agents generate low network traffic because agents do data processing locally.

Limitations of Using Intelligent Mobile Agents (IMA)

o Incorrect recommendations may be made, as an IMA does not always arrive at correct conclusions.
o It requires a lot of additional research.
o IMA require a lot of maintenance to prevent them from misbehaving.
o It requires a lot of capital, i.e. it is capital intensive.
o A software agent on a central server needs all the data from the other computers before it can do some processing.

INTELLIGENT MOBILE AGENTS
An intelligent mobile agent is an artificially intelligent machine that is movable and performs functions similar to a human, e.g. a robot, a war aircraft, a software agent, etc. To understand intelligent mobile agents, we study mobile agents in detail.

MOBILE AGENTS

Mobile agents are programs, typically written in a script language, which may be dispatched from a client computer and transported to a remote server computer for execution. Mobile agents are autonomous intelligent programs that move through a network, searching for and interacting with services on the user's behalf. These systems use specialized servers to interpret the agent's behaviours and communicate with other servers.

Mobile agents offer an important new method of performing transactions and information retrieval in networks. An agent should be able to execute on every machine in a network, and the agent code should not have to be installed on every machine the agent could visit. Therefore, mobile agents use a mobile code system like Java and the Java virtual machine, where classes can be loaded at run-time over the network. The creation of a pervasive agent framework facilitates a very large number of network service applications.
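
As a purely illustrative sketch of the mobile-agent idea of code and state travelling together, the snippet below serializes a trivial agent, "transports" it as bytes and resumes it with its state intact. Real mobile-agent platforms (such as the Java-based systems mentioned above) add networking, security and life-cycle management that are omitted here; the CountingAgent class is a made-up example.

```python
# Minimal sketch of agent mobility: serialize an agent (code reference + state),
# move the bytes, then resume it elsewhere. The CountingAgent class is made up.
import pickle

class CountingAgent:
    """A trivial agent that carries its state (a visit counter) with it."""
    def __init__(self):
        self.visits = 0

    def run(self, host_name):
        self.visits += 1
        return f"visited {host_name}, total visits: {self.visits}"

# "Client" side: create the agent, run it once, then serialize it for dispatch.
agent = CountingAgent()
print(agent.run("client"))
payload = pickle.dumps(agent)          # the agent's state travels with it

# "Server" side: deserialize and resume the same agent, state intact.
arrived = pickle.loads(payload)
print(arrived.run("remote-server"))    # total visits is now 2
```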

Problems That Can Be Solved by Intelligent Mobile Agents

Intelligent mobile agents (IMA) solve problems as if the problems were solved by a human.

Intelligent mobile agents (IMA) solve the nagging client/server network bandwidth problem. Network bandwidth in a distributed application is a valuable resource; a transaction between a client and the server may require many round trips over the wire to complete. Each trip creates network traffic and consumes bandwidth. In a system with many clients or transactions, the total bandwidth requirements may exceed the available bandwidth, resulting in poor performance for the application as a whole. By creating an agent to handle the query or the transaction, and sending the IMA from the client to the server, network bandwidth consumption is reduced or stopped: instead of intermediate results and information passing over the wire, only the IMA needs to be sent.

IMA reduce the stress on client/server architecture. In the design of a traditional client/server architecture, the architect spells out the roles of the client and server pieces very precisely at design time. The architect makes decisions about where a particular piece of functionality will reside based on network bandwidth constraints, network traffic, transaction volume, the number of clients and servers, and many other factors. If these assumptions are wrong, or the architect makes bad decisions, the performance of the application will suffer. Unfortunately, once the system has been built and the performance measured, it is often difficult or impossible to change the design and fix the problems. An architecture based on IMA is potentially much less susceptible to these problems: with the help of IMA, fewer decisions must be made at design time and the system is much more easily modified after it is built. Agent architectures that support adaptive network load balancing could do much of the redesign automatically.

In the early days before the introduction of IMA, thousands of soldiers went to war and many did not come back. Now, with the advent of IMA, a soldier can fight a nation without leaving his office, with the use of an intelligent mobile war aircraft. The aircraft can go as far as necessary to drop missiles, i.e. to attack enemies, without any pilot inside the aircraft. Intelligent mobile agent technology solves this age-old problem by getting a computer to do the real thinking for man.

Intelligent Mobile Agent Applications
Intelligent mobile agents are applicable in many fields. From our own studies at college, as well as from independent research, IMA are most applicable in the following fields:

1. In the military: IMA are highly useful in the military these days, because a soldier can destroy a whole nation without leaving his office with the help of an intelligent mobile war aircraft. They also help in the simulation of the outcome of a war.

2. In communication: In this field, IMA play a very vital role; the rapid growth in telecommunication networks has stimulated research on a new generation of artificial intelligence (AI) programming languages. Mobile agents in communication have basically three different domains of application:
i. Data-intensive applications: Here the remotely located data is owned by the remote service provider and the user sends an agent to the server storing the data.
ii. Agent launched by an appliance: This involves shipping an agent from a cellular phone to a remote server.
iii. Extensible servers: The user can ship and install an agent representing him more permanently on a remote server. The agent is now a personalized, autonomous piece of code that runs remotely and only contacts the user whenever events of interest to the user occur. Mobile agents are software abstractions that can migrate across the network, representing users in various tasks.

3. In transportation: An intelligent mobile agent such as a robot can be used as a road traffic officer that controls the movement of cars on the roads. A car can be designed based on IMA and sent into an environment that is not conducive for man.

4. Migration applications: IMA present a new genre of interface applications that can migrate from one machine to another, taking their user interface and application contexts with them, and continue from where they left off. Such applications are not tied to one user or one machine, and can roam freely over the network, rendering service to a community of users, gathering human input and interacting with people.

5. In supermarkets or big stores: On entering any supermarket or big store, you automatically receive a mobile agent, such as a robot, to guide you around.

These are a few of the applications of intelligent mobile agents. Finally, mobile agents are suitable for applications such as:

i. Electronic commerce
ii. System administration and management, especially network management
iii. Information retrieval.

CHAPTER NINE

ARTIFICIAL NEURAL NETWORKS

Introduction

The current wave of excitement in neural networks, which justifies it as a new area of research, stems from recent breakthroughs in the hardware construction of massively parallel machines that may enable much faster simulation of biological neural networks.

It is a well-known fact that, when it comes to performing intelligent information processing, such as image recognition, combinatorial optimization or language comprehension, the human brain outperforms even the fastest digital computer (Garry, 1987). The computational richness of the brain comes from its large number of living neurons, which are highly connected to each other by a complex network of synapses.

Digital computers can be programmed for intelligent tasks, but the problem sometimes is that the algorithmic solution to many information-processing tasks is far too complex to be programmed. This is not the only problem: sometimes, even if a particular application has a clear and concise solution, many algorithms are too computationally intensive to allow a digital computer to find a solution in any reasonable period of time. In view of the above constraints, attention has now shifted to neural computing.

INTRODUCTION TO NEURAL NETWORKS
A neural network (a paradigm of artificial intelligence) is a concept that lends itself well to the heuristics of learning from experience. Because of its ability to imitate the skill of experts by capturing knowledge and generalizing non-linear functional relationships between input and output variables, it provides a flexible way of handling complex and intelligent information processing.

OVERVIEW OF NEURAL COMPUTING
The approach of neural computing is to capture the guiding principles that underlie the brain's solution to problems and apply them to computer systems. We do not know how the brain represents high-level information, so we cannot mimic that exactly, but we know that the computational richness of the brain comes from its neurons, so neural network design uses the structure of neuron interconnection to emulate the way intelligent information processing occurs within the brain.

Definition (Artificial Neural Network)
An artificial neural network is a massively distributed processor, or simply a computing system, made of a number of simple, highly interconnected signal or information processing units (called artificial neurons) for storing experiential knowledge and making it available for use (Haykin, 1994). It resembles the human brain in that:

o It can acquire and store knowledge through a learning process.
o It can make the stored knowledge available when required.

A neural network has three kinds of layers, viz: an input layer, one or more hidden layers, and an output layer.

NEURAL COMPUTING TECHNIQUES
In contrast to conventional computing, which requires an explicit analysis of the problem to be solved so that the programmer can write down a step-by-step set of instructions to be followed by a computer, neural computing does not require an explicit description of how the problem is to be solved (http://www.emsl.pnl.gov.2080/docs/cie/neural.Homepagehtml). The neural computer is able to adapt itself during a training period, based on examples of similar problems, often with a desired solution to each problem. After sufficient training the neural computer is able to relate the problem data to the solution, input to output, and it is then able to offer a viable solution to a brand new problem.


BIOLOGICAL NEURONS
There are three major types of neurons in the human body:

o The sensory neurons
o The motor neurons
o The relay neurons

Neurons can basically be defined as cells which transmit impulses from the periphery of the body to the brain and also from the brain back to the periphery. This transmission is done by the three types of neuron mentioned above. Despite their functional differences, neurons have certain basic characteristics and features peculiar to them.

Sensory Neurons
Sensory neurons mainly transmit impulses from the periphery (skin) to the spinal cord or the brain. The impulses are transported in fast and slow phases of axoplasmic flow from the cell body along the axon and dendrites to the synapses. Impulse generation in sensory neurons is said to be unidirectional.

Motor Neurons
Motor neurons are neurons that transmit impulses from the brain to the effector organs or from the spinal cord to the effector organs. These neurons are also said to have a unidirectional mode of impulse transmission.

Relay Neurons
Relay neurons are mostly found in the spinal cord. Between the sensory neurons and the brain, and between the brain and the motor neurons, lies a neuron which transmits impulses linking the sensory neurons, the brain and the motor neurons. These intermediate neurons are known as relay neurons. They transmit impulses from the sensory neurons through the spinal cord to the brain, and from the brain through the spinal cord to the motor neurons; this is why they are mostly found in the spinal cord. Relay neurons exhibit a kind of bidirectional mode of impulse transmission, i.e. they transmit between the sensory and motor pathways.
SOMA OR CELL BODY
The soma or cell body is the large, round central body in which almost all the logical functions of the neuron are realised. The genetic and metabolic machinery necessary to keep the neuron alive is contained in the cell body.

The cell body is often located at the dendritic zone of the axon (e.g. auditory neurons) or attached to the side of the axon (e.g. cutaneous neurons). Its location makes no difference as far as the receptor function of the dendritic zone and the transmission function of the axon are concerned. It should be noted that integration of activity is not the only function of dendrites, and local potentials pass from one dendrite to another in the central nervous system. The neuron soma also contains the nucleus and the protein-synthesis machinery.

AXON

The axon is a nerve fiber originating from a somewhat thickened area of the cell body, and it serves as the final output channel of the neuron. The first segment of the axon is called the hillock. It converts signals into a sequence of nerve pulses (spikes) which are propagated without alteration along the axon to other cells. The axon is a highly branched fiber that links to numerous receptors and muscles.

DENDRITES
Dendrites are long, irregularly shaped nerve fibers that are attached to the soma. The fibers are normally highly branched, tree-like fibers; there are about $10^3$ to $10^4$ dendrites per neuron.

Dendrites connect the neuron to a set of other neurons. Dendrites either receive inputs from other neurons via specialised contacts called synapses or connect other dendrites to the synaptic output. Generally, dendrites are regarded as providing receptive surfaces for input signals to the neuron and as conducting signals with decrement to the cell body and axon hillock.

SYNAPSES

Impulses are transmitted from one nerve cell to another at synapses. These are junctions where the axon or some other portion of a cell terminates on the soma, the dendrites, or some other portion of another neuron. Synapses play the role of interfaces connecting the axons of some neurons to the spines of the input dendrites of others.

Synapses are capable of changing the local potential in a positive or negative direction. Accordingly, synapses can be excitatory or inhibitory in nature, in accordance with their ability to increase or damp the neuron's excitation.

The storage of information in a neuron is supposed to be concentrated in its synaptic connections, or more precisely in the pattern of these connections and the strengths (weights) of the synaptic connections.

OPERATION OF A BIOLOGICAL NEURON
The neuron cell body receives inputs from other neurons through adjustable or adaptive synaptic connections to the dendrites. Output signals are transmitted along the axon to the synapses of other neurons. Each pulse arriving at a synapse generates an analog internal potential in proportion to the synaptic strength, with a positive or negative value corresponding to exciting or inhibiting synapses. These potentials are summed in a spatio-temporal way, and when the total potential exceeds some value, called the threshold, a train of pulses is generated and travels along the axon. The maximum firing rate is around 1000 pulses per second. Information between neurons is transmitted in the form of nerve impulses, which can be considered as digital signals. Now, with a better appreciation of biological neurons, their features, functions and limitations, the artificial neuron can be better understood.

THE MAJOR ELEMENTS OF NEURAL NETWORKS
The neuron is the basic unit of the brain, and is a stand-alone analogue logical processing unit. The basic function of a biological neuron is to add up its inputs and produce an output. If the sum of the inputs is greater than some value, known as the threshold value, then the neuron will be activated and 'fire'; if not, the neuron will remain in its inactive, quiet state.

The inputs to the neuron arrive along the dendrites, which are connected to the outputs from other neurons by specialized junctions called synapses. These junctions alter the effectiveness with which the signal is transmitted; some synapses are good junctions and pass a large signal across, whilst others are very poor and allow very little through. Just like the biological neural network, the neuron is the major elementary processing unit in an artificial neural network. In most common networks, neurons are arranged in layers, with the input data fed to the network at the input layer (figure 1.1). The data then passes through the network to the output layer to provide the solution or answer.

At each neuron, every input has an associated 'weight', which is the strength of that input's connection to the neuron. The neuron simply adds together all the inputs and calculates an output to be passed on.

BENEFITS OF NEURAL COMPUTING

(i) Ability to tackle new kinds of problems
Neural networks are effective at solving problems whose solutions are difficult, if not impossible, to define. Since a network has the ability to learn from experience (previous examples), it can provide a solution when presented with a new but similar problem.

(ii) Robustness
Neural networks tend to be more robust than their conventional counterparts. They have the ability to cope well with incomplete or 'fuzzy' data.

(iii) Fault tolerance
Since data and processing are distributed rather than centralized, neural networks can be very tolerant of faults. This contrasts with conventional systems, where the failure of one component usually means the failure of the entire system.

(iv) Fast processing speed
Neural networks are very fast because they consist of a large number of massively interconnected processing units, all operating in parallel on the same problem. This contrasts with serial, one-step-at-a-time processing.

(v) Flexibility and ease of maintenance
Neural computers are very flexible in that they are able to adapt their behaviour to new and changing environments. This contrasts with serial conventional computing, which is strictly algorithmic and requires writing a new program for any modification. They are also easier to maintain to accommodate changes or modifications; some networks have the ability to learn from experience in order to improve their own performance.

(vi) Parallel processing
Parallel processing is a processing technique which involves multiple operations being carried out simultaneously. Parallelism reduces computational time; for this reason, it is used for many computationally intensive applications such as predicting economic trends or generating special visual effects for feature films. The high speed with which the brain and Artificial Neural Networks (ANNs) are able to process information is astounding. Consider the amount of computation needed to process a single visual image. If one restricts the image resolution to 1,000 x 1,000 receptors, a small number compared to the retina, over one million pixels (three million for colour images) must be examined and several million computations performed in order that objects in the image are identified.

Even at the nanosecond speeds of modern computers, this task can require several seconds on a conventional computer. In contrast, biological visual systems compute such tasks in milliseconds.

(vii) By patterning themselves after the architecture of the brain, they provide a plausible model of intelligent mechanisms.

(viii) They provide a tool for modeling and exploring brain functions.

(ix) They can be modeled to solve various classes of process control problems.

NEURAL NETWORK ARCHITECTURES

Artificial neural networks (ANNs) use a variety of architectures, of which the multi-layer feed-forward perceptron and the recurrent neural network are notable. Others include the Radial Basis Function network [(Broomhead and Lowe, 1988), (Moody and Darken, 1989)], the Adaptive and Learned Vector Quantization network (Kangas et al., 1990) used for data compression, the Kohonen Self-Organizing Maps (Kohonen, 1988), the Counter-Propagation neural network (Hecht-Nielsen, 1987), the Adaptive Resonance Theory networks (ART1 and ART2) proposed by Carpenter and Grossberg (1987, 1988), the Probabilistic Neural Network (Cain, 1990; Specht, 1990), the Self-Organizing Feature Maps (SOFM) (Fukushima, 1989), and the Cellular Neural Network (CNN) (Chua and Yang, 1988; Chua et al., 1993).

APPLICATIONS OF NEURAL NETWORKS

Neural networks have been successfully applied to a broad spectrum of data-intensive and intelligent computing applications, such as:

o Process Modeling and Control: Creating a neural network model for a physical plant, then using that model to determine the best control settings for the plant.

o Machine Diagnostics: Detecting when a machine has failed so that the system can automatically shut down the machine when this occurs.

o Portfolio Management: Allocating the assets in a portfolio in a way that maximizes return and minimizes risk.

o Target Recognition: Military applications which use video and/or infrared image data to determine whether an enemy target is present.

o Medical Diagnosis: Assisting doctors with their diagnoses by analyzing the reported symptoms and/or image data such as MRIs or X-rays.

o Credit Rating: Automatically assigning a company's or an individual's credit rating based on their financial condition.

o Targeted Marketing: Finding the set of demographics which have the highest response rate for a particular marketing campaign.

o Voice Recognition: Transcribing spoken words into ASCII text.

o Financial Forecasting: Using the historical data of a security to predict the future movement of that security.

o Quality Control: Attaching a camera or sensor to the end of a production process to automatically inspect for defects.

o Intelligent Searching: An internet search engine that provides the most relevant content and banner ads based on the user's past behavior.

o Fraud Detection: Detecting fraudulent credit card transactions and automatically declining the charge.

ARTIFICIAL NEURAL NETWORK (ANN)

An artificial neural network is a massively distributed processor that resembles the human brain in that:

(i) It can acquire and store knowledge through a learning process.

(ii) Inter-neuron connection strengths, known as synaptic weights, are used to store the knowledge, which can be made available for use.

There are several models of neural networks that have been successfully implemented in solving some seemingly intractable and complex problems such as:

(i) The travelling salesman problem (TSP)
(ii) Vowel and consonant discrimination
(iii) Pattern recognition, etc.

These networks differ in structure, implementation and principle of operation but share common features. Artificial neural networks are basically computing systems made up of a number of simple, highly interconnected signal or information processing units (artificial neurons). Fig. 2.3.1 shows a neural network consisting of neurons.

There are different ways to connect artificial neurons into a large network; these patterns are called architectures. The architectures of artificial neural networks can be divided into the following:

(a) Feed-forward (multi-layer) network (FNN)
(b) Feedback (recurrent) network (FRN)
(c) Cellular network (CN)


MODELING THE SINGLE NEURON
The neuron is the basic unit of the brain, and is a stand-alone analogue logical processing unit. The neurons form two main types: local processing inter-neuron cells that have their input and output connections over about 1000 microns, and output cells that connect from sensory organs into the brain.

The neuron accepts many inputs, which are all added up in some fashion.

The basic function of a biological neuron is to add up its inputs and produce an output if this sum is greater than some value, known as the threshold value. Then the neuron will be activated and 'fire'; if not, the neuron will remain in its inactive, quiet state.

The inputs to the neuron arrive along the dendrites, which are connected to the outputs from other neurons by specialized junctions called synapses. These junctions alter the effectiveness with which the signal is transmitted; some synapses are good junctions and pass a large signal across, whilst others are very poor and allow very little through.

Our model of the neuron must capture these important features:

(i) The output from a neuron is either on or off.
(ii) The output depends only on the inputs. A certain number must be on at any one time in order to make the neuron fire.

The efficiency of the synapses at coupling the incoming signal into the cell body can be modeled by having a multiplicative factor (a weight) on each of the inputs to the neuron. A more efficient synapse, which transmits more of the signal, has a correspondingly larger weight; a weak synapse has a small weight.

We now formulate this mathematically. If there are n inputs, then there are n associated weights on the input lines. The model neuron calculates the weighted sum of its inputs: it takes the first input, multiplies it by the weight on that input line, then does the same for the next input and so on, adding them all up at the end.

This can be written as

Total input = (weight on input 1 × input 1) + (weight on input 2 × input 2) + ... + (weight on input n × input n)

$$= w_1 x_1 + w_2 x_2 + w_3 x_3 + \dots + w_n x_n = \sum_{i=1}^{n} w_i x_i$$
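
As a concrete illustration of the weighted-sum-and-threshold model just described, the following minimal sketch uses arbitrary example weights, inputs and threshold.

```python
# Minimal sketch of the single-neuron model: weighted sum compared with a threshold.
# The weights, inputs and threshold below are illustrative values only.

def neuron_output(inputs, weights, threshold):
    """Return 1 ('fire') if the weighted sum of the inputs exceeds the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0

if __name__ == "__main__":
    x = [1, 0, 1]          # inputs
    w = [0.5, 0.2, 0.4]    # synaptic weights
    print(neuron_output(x, w, threshold=0.8))  # weighted sum 0.9 > 0.8, so the neuron fires: 1
```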

MATHEMATICAL MODELING OF A NEURAL NETWORK

The basic function of a neuron is to accept inputs, add them in some fashion and produce an output. A neuron can accept many inputs $x_i$. If there are n inputs, associated with each input is a weight $w_i$, and the neuron has a bias $\theta$. A neuron yields its output by performing the weighted sum (the linear combiner), adding the bias, and then passing the result through a non-linear function known as the activation function.

Here we define the input to the neuron as an n-tuple vector X, i.e.

$$X = [x_1, x_2, \dots, x_n]^T$$

The model neuron calculates the weighted sum of its inputs as follows: it takes the first input, multiplies it by the weight on that input line, and does the same for the next input and so on, adding them all up at the end. This sum is then compared in the neuron with the threshold value.

This is written as

$$y = w_1 x_1 + w_2 x_2 + w_3 x_3 + \dots + w_n x_n = \sum_{i=1}^{n} w_i x_i$$

In a multi-layer feed-forward network, the neurons are arranged in consecutive layers (figure 2.7) so that the inputs to a neuron in a particular layer are composed solely of the outputs of the neurons in the immediately preceding layer.

Let the vector $X_{pl} = [X_{pl1}, X_{pl2}, \dots, X_{plN}]^T$, where T denotes transpose, be the input vector to any neuron in the $l$-th layer corresponding to the $p$-th pattern presented to the net. Associated with the $n$-th neuron in the $l$-th layer is a weight vector $W_{ln} = [W_{ln1}, W_{ln2}, \dots, W_{lnN}]^T$ and a bias $\theta_{ln}$, where N is the number of weights in the layer.

The neuron forms the sum:

$$Y_{pln} = X_{pl}^T W_{ln} + \theta_{ln} \qquad (12)$$

Augmenting $X_{pl}$ by 1 and $W_{ln}$ by $\theta_{ln}$ allows us to write the sum in a simplified form as

$$Y_{pln} = X_{pl}^T W_{ln} \qquad (13)$$

where $X_{pl}$ and $W_{ln}$ are now (N + 1)-vectors. The output of the neuron is

$$Z_{pln} = F(Y_{pln}) \qquad (14)$$

The input to any neuron in the next layer is then

$$X_{p,l+1} = [X_{p,l+1,1}, X_{p,l+1,2}, \dots, X_{p,l+1,j}]^T = [Z_{pl1}, Z_{pl2}, \dots, Z_{plj}]^T \qquad (15)$$

where j is the number of neurons in the layer. The neurons in the (l+1)-th layer generate the outputs, which are the inputs to neurons in the next layer, and this continues up to the neurons in the output layer, whose outputs are the outputs of the net. A widely used activation function is the sigmoid function, given by

$$F(y) = \frac{1}{1 + e^{-sy}} \qquad (16)$$

where the parameter s specifies the steepness of the curve. The sigmoid function is a smooth switching function with the property

$$F(y) \to 1 \text{ as } y \to \infty, \qquad F(y) \to 0 \text{ as } y \to -\infty \qquad (17)$$

Other possible activation functions are the arc-tangent function, given by

$$F(y) = \frac{2}{\pi} \arctan(y) \qquad (18)$$

and the hyperbolic-tangent function, given by

$$F(y) = \frac{e^{y} - e^{-y}}{e^{y} + e^{-y}} \qquad (19)$$

All these logistic functions are bounded, continuous, monotonic and continuously differentiable.
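
The three activation functions (16), (18) and (19) can be sketched directly in code; the steepness value and sample inputs below are arbitrary illustrative choices.

```python
# Minimal sketch of the activation functions (16), (18) and (19).
# The steepness s and the sample inputs are illustrative values only.
import math

def sigmoid(y, s=2.0):
    """Equation (16): smooth switch between 0 and 1 with steepness s."""
    return 1.0 / (1.0 + math.exp(-s * y))

def arctan_activation(y):
    """Equation (18): scaled arc-tangent, bounded between -1 and 1."""
    return (2.0 / math.pi) * math.atan(y)

def tanh_activation(y):
    """Equation (19): hyperbolic tangent, bounded between -1 and 1."""
    return (math.exp(y) - math.exp(-y)) / (math.exp(y) + math.exp(-y))

if __name__ == "__main__":
    for y in (-2.0, 0.0, 2.0):
        print(y, sigmoid(y), arctan_activation(y), tanh_activation(y))
```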

How a Neural Network Learns (Training the Net)
Training a neural net can be viewed as mapping a set of input vectors to another set of desired output vectors (supervised training). To train the net to perform the desired mapping, we apply an input vector to the net and compare the actual output of the net with the desired output (the output vector corresponding to the applied input). The difference between the actual output and the desired output (i.e. the error) is used to update the weights and biases associated with every neuron in the net, until this difference, averaged over every input/output pair, is below a specified tolerance.

A net performs the desired mapping when each neuron in the net yields a correct response. Training the net hence implies training each neuron in the net. Now, assume that the desired output values of the linear combiner, $d_p$, $p = 1, 2, \dots, m$ (where m is the number of patterns in the input set), are known. Then the error of a node corresponding to pattern p is

$$E_p = \tfrac{1}{2}(d_p - Y_p)^2 \qquad (20)$$

where $Y_p = W^T X_p$ is the actual output of the linear combiner. The total mean squared error of the node is

$$E_T = \sum_{p=1}^{m} E_p \qquad (21)$$

To find the optimum weight vector that minimizes $E_T$, we take the gradient of $E_T$ with respect to W and set it to zero, as follows:

$$\nabla E_T = -\sum_{p=1}^{m} (d_p - W^T X_p) X_p = 0 \qquad (22)$$

Hence, we have

$$\left( \sum_{p=1}^{m} X_p X_p^T \right) W = \sum_{p=1}^{m} d_p X_p \qquad (23)$$

Defining

$$R = \sum_{p=1}^{m} X_p X_p^T \qquad (24)$$

and

$$P = \sum_{p=1}^{m} d_p X_p \qquad (25)$$

we can write equation (23) in matrix form as

$$RW = P \qquad (26)$$

The matrix R can be interpreted as the correlation matrix of the input set, and the vector P as the cross-correlation between the training patterns and the corresponding desired responses.

Equation (26) is referred to as the deterministic normal equation in the context of adaptive filtering, and the optimum weight vector is the solution to (26).

Equation (26) can be solved iteratively by a number of descent methods, such as the steepest descent method, which yields the popular Delta learning rule, by the conjugate-gradient method, and by several other quasi-Newton methods.
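
As a concrete, minimal illustration of the normal equation (26), the optimum weight vector can be computed directly from a small synthetic data set; the training patterns and desired outputs below are arbitrary illustrative values, not data from the text.

```python
# Minimal sketch: build R and P from equations (24)-(25) and solve RW = P, equation (26).
# The training patterns and targets are illustrative only.
import numpy as np

X = np.array([[1.0, 0.0],      # training patterns X_p (one per row)
              [0.0, 1.0],
              [1.0, 1.0]])
d = np.array([1.0, 2.0, 3.0])  # desired linear-combiner outputs d_p

R = sum(np.outer(x, x) for x in X)        # R = sum_p X_p X_p^T   (24)
P = sum(dp * x for dp, x in zip(d, X))    # P = sum_p d_p X_p     (25)

W = np.linalg.solve(R, P)                 # optimum weights from RW = P   (26)
print(W)                                  # recovers the weights [1, 2] exactly for this data
```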

BACKPROPAGATION OF ERROR

Recall (20), and let

$$E_{Tp} = \sum_{n} E_{pLn} \qquad (28)$$

where $E_{Tp}$ is the total mean square error of the net associated with the p-th training pattern, L is the number of layers in the network and the index n runs over the neurons in the output layer.

From (28) and (20), the error

$$\delta_{pLn} = d_{pLn} - Y_{pLn} \qquad (29)$$

associated with the p-th training pattern at the n-th output node can be expressed as

$$\delta_{pLn} = -\frac{\partial E_{Tp}}{\partial Y_{pLn}} \qquad (30)$$

Based on this observation, we now define the error associated with any node, hidden or otherwise, as the negative derivative of $E_{Tp}$ with respect to the linear combiner output at that node:

$$\delta_{pln} = -\frac{\partial E_{Tp}}{\partial Y_{pln}} \qquad (31)$$

The task now is to calculate $\delta_{pln}$ at every node. For the output nodes the desired result is given by (30). For the hidden layers (i.e. $l = 1, 2, \dots, L-1$), we have that

$$\delta_{pln} = -\frac{\partial E_{Tp}}{\partial Y_{pln}} = -\frac{\partial E_{Tp}}{\partial Z_{pln}} \frac{\partial Z_{pln}}{\partial Y_{pln}} \qquad (32)$$

By the chain rule, the first factor in (32) can be expressed as a linear combination of the derivatives of $E_{Tp}$ with respect to the variables associated with the nodes in the next (l+1)-th layer, as follows:

$$\frac{\partial E_{Tp}}{\partial Z_{pln}} = \sum_{r} \frac{\partial E_{Tp}}{\partial Y_{p,l+1,r}} \frac{\partial Y_{p,l+1,r}}{\partial Z_{pln}} \qquad (33)$$

where r runs over the nodes in layer l + 1. Using (33) in (32) we arrive at the recursion

$$\delta_{pln} = F'(Y_{pln}) \sum_{r} \delta_{p,l+1,r} W_{l+1,r,n} \qquad (34)$$

where, by definition,

$$F'(Y_{pln}) = \frac{\partial Z_{pln}}{\partial Y_{pln}} \qquad (35)$$

To summarize, the output errors calculated from (30) are backpropagated to the hidden layers by recursively applying (34).

Once the error is available at a particular node, it can be used to update the weights, as shown in what follows. Let the weight vector at iteration t be $W_{ln}(t)$. Minimizing $E_{Tp}$ by taking its gradient with respect to the weight vector and adjusting $W_{ln}(t)$ in the direction of steepest descent yields

$$W_{ln}(t+1) = W_{ln}(t) + \mu(-\nabla E_{Tp}) \qquad (36)$$

where $\mu$ is the step size. Since

$$Y_{pln} = W_{ln}^T X_{pl} \qquad (37)$$

we can write

$$\nabla E_{Tp} = \frac{\partial E_{Tp}}{\partial Y_{pln}} \nabla_{W} Y_{pln} \qquad (38)$$

Substituting (31) in (38) and observing from (37) that $\nabla_{W} Y_{pln} = X_{pl}$, we have

$$\nabla E_{Tp} = -\delta_{pln} X_{pl} \qquad (39)$$

and hence

$$W_{ln}(t+1) = W_{ln}(t) + \mu \, \delta_{pln} X_{pl} \qquad (40)$$

Equation (40) is variously called the Delta rule or the method of steepest descent. When (40) is used in conjunction with (30) and (34), the resulting iterative scheme is referred to as backpropagation [(Rumelhart et al., 1986), (Parker, 1985), (Hecht-Nielsen, 1987)].
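
A minimal numerical sketch of the delta rule (40) for a single linear node follows; the training data, learning rate and number of epochs are arbitrary illustrative choices, and a full multi-layer backpropagation would additionally use the recursion (34) to obtain the hidden-layer errors.

```python
# Minimal sketch of the delta rule (40): W(t+1) = W(t) + mu * delta * X, for one linear node.
# Training patterns, targets, learning rate and epoch count are illustrative values only.
import numpy as np

X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])      # input patterns X_p
d = np.array([1.0, 2.0, 3.0])   # desired outputs d_p
W = np.zeros(2)                 # initial weight vector
mu = 0.1                        # step size

for epoch in range(200):
    for x, target in zip(X, d):
        y = W @ x               # linear-combiner output   (37)
        delta = target - y      # node error               (29)
        W = W + mu * delta * x  # delta-rule update        (40)

print(W)  # converges towards [1, 2], the optimum weights of the earlier example
```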

CHAPTER TEN

EVOLUTIONARY COMUPUTING

Evolutionary computing (EC) draws ideas from natural evolution such as survival of the fittest, natural
selection, reproduction, mutation, competition and symbiosis. In EC techniques and tools the emphasis
is therefore on adaptation and optimization. To emulate the processes involved in natural selection, EC
makes use of genetic algorithms to model genetic evolution. These algorithms operate on strings of
information similar to the chromosome (or genome) concept in biology. In these algorithms, the
characteristics of individuals can be represented in two ways, by genotypes (which focus on inherited
aspects) and phenotypes (which focus on behavioral aspects). There are six main classes of evolutionary algorithms that differ in the way in which they represent (and study) individuals. These are:

• genetic programming (genotype-oriented; individuals are represented as trees or executable programs)

• evolutionary programming (phenotype-oriented; behavioural models)

• evolutionary strategies ('evolution of evolution', with an emphasis on phenotypic behaviour)

• differential evolution (population-based search strategy)

• cultural evolution (adaptation to the environment at rates faster than biological evolution)

• co-evolution (complementary evolution of associated species)

Examples of areas to which EC has been applied are: routing optimization and scheduling, design of filters, neural network architectures and structural optimization, controllers for gas turbine engines, visual guidance systems for robots, classification and clustering, function approximation and time series modeling, regression, composing music, and data mining.

The genetic algorithm, for example, is a computational equivalent of evolution, of survival of the fittest. The most interesting feature of genetic algorithms is their ability to expand the search space, to diverge, as well as to converge. For this reason, the GA is quite effective as a search algorithm, particularly for solving optimization problems with a large number of local minima. Some applications of genetic algorithms in engineering are: optimal capacitor placement and control, economic load dispatch, optimal power flow, expansion planning, and unit commitment problems.

Genetic algorithm + data structures = evolutionary programs

GENETIC ALGORITHM

A genetic algorithm (GA) is a heuristic used to find approximate solutions to difficult-to-solve problems through application of the principles of evolutionary biology to computer science. Genetic algorithms use biologically derived techniques such as inheritance, mutation, natural selection, and recombination (or crossover). Genetic algorithms are a particular class of evolutionary algorithms. A GA is also a method of SIMULATING the action of EVOLUTION within a computer. A population of fixed-length STRINGS is evolved with a GA by employing CROSSOVER and MUTATION operators along with a FITNESS FUNCTION that determines how likely individuals are to reproduce.

A genetic algorithm (GA) is a search technique used in computing to find exact or approximate solutions
to optimization and search problems.

• Chromosome is used to refer to a potential solution. It contains all of the necessary information needed to describe one solution.

• Clone is when a duplicate of a Chromosome is created. This is usually done so some sort of Mutation can be tried without damaging the initial Chromosome.

• Crossover is when two Chromosomes create one or more new Chromosomes by mixing their solutions.

• Fitness is a measure of the quality of a particular Chromosome. Chromosomes that are better solutions will have better fitness values than those that are less optimal solutions.

• Generation refers to one round of the Genetic Algorithm Cycle. New Chromosomes are created and old ones are removed to make room for them.

• Mutation is used in two slightly different contexts. The first is as a general term to describe any modification made to the Population. The second is a modification made to a single Chromosome.

• Penalty is a part of the fitness. It penalizes illegal or undesirable actions of the Chromosome in the solution space.

• Population is the collection of available Chromosomes. There is normally a limit on the size of the Population, and those Chromosomes that do poorly are eliminated to make room for better performing Chromosomes.

BACKGROUND OF GENETIC ALGORITHM

Computer simulations of evolution started as early as 1954 with the work of Nils Aall Barricelli, who was using the computer at the Institute for Advanced Study in Princeton, New Jersey (Barricelli, 1954). His 1954 publication was not widely noticed. Starting in 1957, the Australian quantitative geneticist Alex Fraser published a series of papers on simulation of artificial selection of organisms with multiple loci controlling a measurable trait. From these beginnings, computer simulation of evolution by biologists became more common in the early 1960s, and the methods were described in books by Fraser and Burnell (Fraser et al., 1970) and Crosby (Crosby, 1973). Fraser's simulations included all of the essential elements of modern genetic algorithms. Other noteworthy early pioneers include Richard Friedberg, George Friedman, and Michael Conrad. Many early papers are reprinted by Fogel (Fogel, 1998). Although Barricelli, in work he reported in 1963, had simulated the evolution of the ability to play a simple game, artificial evolution became a widely recognized optimization method as a result of the work of Ingo Rechenberg and Hans-Paul Schwefel in the 1960s and early 1970s; their group was able to solve complex engineering problems through evolution strategies (Schwefel, 1981).

Genetic algorithms in particular became popular through the work of John Holland in the early 1970s, and particularly his 1975 book (Holland, 1975). His work originated with studies of cellular automata, conducted by Holland and his students at the University of Michigan. Genetic algorithms (GAs) have been in use for over three decades as a solution technique for a variety of problems, including combinatorial optimization and artificial intelligence. GAs can be described as an intelligent approach to random search that is based on the natural processes of biological evolution (the survival-of-the-fittest principle) as proposed by Charles Darwin, where individuals compete to propagate to the next generation.

Genetic algorithms are general-purpose search algorithms based upon the principles of evolution observed in nature. Genetic algorithms combine selection, crossover, and mutation operators with the goal of finding the best solution to a problem. Genetic algorithms search for this optimal solution until a specified termination criterion is met.

GENETIC ALGORITHM STAGES

The steps involved in the genetic algorithm are explained below:

(i) Initialization: The starting point is the generation of a number of solutions, the so-called population. Each solution is represented by a constant-length chromosome string. We begin the search with an initial set of solutions, known as a population, and let it evolve into a new generation while usually keeping the population size constant.

(ii) Evaluation: Each chromosome (solution) is evaluated by its fitness value. Several things need to be considered for the definition:

• Desired Goal is of course the whole reason for the Genetic Algorithm. The fitness definition should be heavily based on that goal. For example, if the goal is to minimize cost, then have cost be the main determining factor for fitness.

• Calculability is an issue, since the Calculate Fitness step of the Cycle is done every generation. Efficiency of the calculation can be a major concern.

• Penalty needs to be assigned such that unfit chromosomes are eliminated over time.

(iii) Selection: Each new generation can be created by three means:

• Reproduction, where members of the current population are allowed to produce clones.

• Cross-over, where selected parents of the current population are allowed to produce children with mixed genes.

• Mutation, where new chromosomes are generated through minor changes to genes in existing ones.

The next generation's parents, i.e. certain chromosomes of the population, are selected. The choice takes place randomly following the principle of survival of the fittest, which means that the fittest solutions have the greatest chance of being chosen as parents.

(iv) Recombination: This involves how two parent individuals exchange characteristics to produce valid offspring. This is achieved through the use of GA operators such as the deletion operator, the mutation operator, and the crossover operator.

(v) Termination: This generational process is repeated until a termination condition has been reached. Common terminating conditions are:

• A solution is found that satisfies minimum criteria

• A fixed number of generations is reached

• The allocated budget (computation time/money) is reached

• The highest-ranking solution's fitness has reached a plateau such that successive iterations no longer produce better results

• Manual inspection

GENETIC ALGORITHM

1. Choose initial population

2. Evaluate the fitness of each individual in the population

3. Repeat
   1. Select best-ranking individuals to reproduce
   2. Breed a new generation through crossover and mutation (genetic operations) and give birth to offspring
   3. Evaluate the individual fitnesses of the offspring
   4. Replace the worst-ranked part of the population with the offspring

4. Until <terminating condition>

The three most important aspects of using genetic algorithms are: (1) definition of the objective function, (2) definition and implementation of the genetic representation, and (3) definition and implementation of the genetic operators. Once these three have been defined, the generic genetic algorithm should work fairly well.
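
A minimal sketch of the complete cycle described above is given below. It assumes the classic 'one-max' toy fitness (maximise the number of 1s in a bit string) and illustrative parameter values; tournament selection, single-point crossover and bit-flip mutation are common choices rather than the only possible ones.

import random

random.seed(1)
CHROM_LEN, POP_SIZE, GENERATIONS = 20, 30, 60
P_CROSS, P_MUT = 0.8, 0.02

def fitness(chrom):                      # illustrative goal: maximise the number of 1s (one-max)
    return sum(chrom)

def tournament(pop):                     # select the fitter of two randomly chosen individuals
    a, b = random.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def crossover(p1, p2):                   # single-point crossover
    if random.random() < P_CROSS:
        cut = random.randint(1, CHROM_LEN - 1)
        return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
    return p1[:], p2[:]

def mutate(chrom):                       # bit-flip mutation
    return [(1 - g) if random.random() < P_MUT else g for g in chrom]

population = [[random.randint(0, 1) for _ in range(CHROM_LEN)] for _ in range(POP_SIZE)]
for gen in range(GENERATIONS):
    offspring = []
    while len(offspring) < POP_SIZE:
        c1, c2 = crossover(tournament(population), tournament(population))
        offspring.extend([mutate(c1), mutate(c2)])
    # Replace the worst-ranked part of the population with the offspring (elitist merge).
    population = sorted(population + offspring, key=fitness, reverse=True)[:POP_SIZE]

best = max(population, key=fitness)
print("best fitness:", fitness(best), "chromosome:", best)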

CHAPTER ELEVEN

OVERVIEW OF QUANTUM COMPUTING

A BRIEF HISTORY OF QUANTUM COMPUTING

The idea of a computational device based on quantum mechanics was first explored in the 1970's and early 1980's by physicists and computer scientists such as Charles H. Bennett of the IBM Thomas J. Watson Research Center, Paul A. Benioff of Argonne National Laboratory in Illinois, David Deutsch of the University of Oxford, and the late Richard P. Feynman of the California Institute of Technology (Caltech).
The idea emerged when scientists were pondering the fundamental limits of computation. They
understood that if technology continued to abide by Moore's Law, then the continually shrinking size of
circuitry packed onto silicon chips would eventually reach a point where individual elements would be
no larger than a few atoms. Here a problem arose because at the atomic scale the physical laws that
govern the behavior and properties of the circuit are inherently quantum mechanical in nature, not
classical. This then raised the question of whether a new kind of computer could be devised based on
the principles of quantum physics.
Feynman was among the first to attempt to provide an answer to this question by producing an abstract
model in 1982 that showed how a quantum system could be used to do computations. He also
explained how such a machine would be able to act as a simulator for quantum physics. In other words,
a physicist would have the

ability to carry out experiments in quantum physics inside a quantum mechanical computer.

Later, in 1985, Deutsch realized that Feynman's assertion could eventually lead to a general purpose
quantum computer and published a crucial theoretical paper showing that any physical process, in
principle, could be modeled perfectly by a quantum computer. Thus, a quantum computer would have
capabilities far beyond those of any traditional classical computer. After Deutsch published this paper,
the search began to find interesting applications for such a machine.

Unfortunately, all that could be found were a few rather contrived mathematical problems, until Shor
circulated in 1994 a preprint of a paper in which he set out a method for using quantum computers to
crack an important problem in number theory, namely factorization. He showed how an ensemble of
mathematical

operations, designed specifically for a quantum computer, could be organized to enable such a
machine to factor huge numbers extremely rapidly, much faster than is possible on conventional
computers. With this breakthrough, quantum computing transformed from a mere academic curiosity
directly into a national and world interest.

WHAT IS QUANTUM COMPUTING?

What is quantum computing? The subject of quantum computing brings together ideas from classical
information theory, computer science, and quantum physics.

In quantum computing, we witness an exciting and very promising merger of two of the deepest and
most successful scientific and technological developments of this century: quantum physics and
computer science. In spite of the fact that its

experimental developments are in their infancy, there has already been a variety of concepts, models,
methods and results obtained at the theoretical level that clearly have lasting value. These concepts,
methods and results are the main subject of the term paper. Knowledge from these two areas is of
importance for understanding the basic developments in quantum computing: namely, quantum physics
and theoretical computer science. The term paper provides elements of both, and concentrates on the
presentation of concepts, models, methods and results mainly from a computing point of view. No
previous knowledge of quantum mechanics is required.

A quantum computer is a device for computation that makes direct use of distinctively quantum mechanical phenomena, such as superposition and entanglement, to perform operations on data. In a classical (or conventional) computer, information is stored as bits; in a quantum computer, it is stored as qubits (quantum binary digits). The basic principle of quantum computation is that quantum properties can be used to represent and structure data, and that quantum mechanisms can be devised and built to perform operations with these data.

Although quantum computing is still in its infancy, experiments have been carried out in which quantum computational operations were executed on a very small number of qubits. Both practical and theoretical research continues with interest, and many national government and military funding agencies support quantum computing research to develop quantum computers for both civilian and national security purposes, such as cryptanalysis.

If large-scale quantum computers can be built, they will be able to solve certain problems much faster than any of our current classical computers (for example, using Shor's algorithm). Quantum computers are different from other computers such as DNA computers and traditional computers based on transistors. Some computing architectures, such as optical computers, may use classical superposition of electromagnetic waves; without some specifically quantum mechanical resource such as entanglement, it is conjectured that an exponential advantage over classical computers is not possible.

Quantum computer basics

In the classical model of a computer, the most fundamental building block, the bit, can only exist in one of two distinct states, a 0 or a 1. In a quantum computer the rules are changed. Not only can a 'quantum bit', usually referred to as a 'qubit', exist in the classical 0 and 1 states, it can also be in a coherent superposition of both. When a qubit is in this state it can be thought of as existing in two universes, as a 0 in one universe and as a 1 in the other. An operation on such a qubit effectively acts on both values at the same time. The significant point is that by performing a single operation on the qubit, we have performed the operation on two different values. Likewise, a two-qubit system would perform the operation on four values, and a three-qubit system on eight. Increasing the number of qubits therefore exponentially increases the 'quantum parallelism' we can obtain with the system. With the correct type of algorithm it is possible to use this parallelism to solve certain problems in a fraction of the time taken by a classical computer.
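
The growth of the state space can be made concrete with a small classical simulation (a sketch under the assumption that numpy is available; it is not a quantum program): an n-qubit register is represented by a vector of 2^n amplitudes, and applying a Hadamard gate to every qubit puts the register into an equal superposition of all 2^n classical states.

import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)    # Hadamard gate acting on a single qubit
I = np.eye(2)

def apply_to_qubit(gate, qubit, n):
    # Build the 2^n x 2^n operator that applies `gate` to one qubit of an n-qubit register.
    op = np.array([[1.0]])
    for q in range(n):
        op = np.kron(op, gate if q == qubit else I)
    return op

n = 3
state = np.zeros(2 ** n); state[0] = 1.0        # register initialised to the classical state 000
for q in range(n):
    state = apply_to_qubit(H, q, n) @ state     # Hadamard on each qubit in turn

print(state)          # 8 equal amplitudes of 1/sqrt(8): all 2^3 classical states at once
print(state ** 2)     # measurement probabilities, each equal to 1/8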

The pitfall of quantum computing: decoherence

The very thing that makes quantum computing so powerful, its reliance on the bizarre subatomic goings-on governed by the rules of quantum mechanics, also makes it very fragile and difficult to control. For example, consider a qubit that is in the coherent state. As soon as it measurably interacts with the environment it will decohere and fall into one of the two classical states. This is the problem of decoherence and is a stumbling block for quantum computers, as the potential power of quantum computers depends on the quantum parallelism brought about by the coherent state. This problem is compounded by the fact that even looking at a qubit can cause it to decohere, making the process of obtaining a solution from a quantum computer just as difficult as performing the calculation itself.

Theory of universal computation

One thing that all computers have in common, from Charles Babbage's analytical engine (designed in the 1830s) to Pentium-based PCs, is the theory of classical computation as described by the work of Alan Turing (1936). In essence, Turing's work describes the idea of the universal Turing machine, a very simple model of a computer that can be programmed to perform any operation that would naturally be considered to be computable. All computers are essentially implementations of a universal Turing machine. They are all functionally equivalent and, although some may be quicker, larger or more expensive than others, they can all perform the same set of computational tasks.

Heating up over lost information

A great deal of time has been spent investigating whether quantum theory places any fundamental limits on computing machines. As a result, it is now believed that physics does not place any absolute limits on the speed, reliability or memory capacity of computing machines. One consideration that needs to be made, however, concerns the information that may be 'lost' in a computation. In order for a computer to run arbitrarily fast, its operation must be reversible (i.e. its inputs must be entirely deducible from its outputs). This is because irreversible computations involve a 'loss' of information, which can be equated to a loss as heat, and thus the restricted ability of the system to dissipate heat will in turn limit the performance of the computer. An example of information being lost can be seen in an ordinary AND gate. An AND gate has two inputs and only one output, which means that in the process of moving from the input to the output of the gate, we lose one bit of information.
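
A tiny illustration (my own, not from the source) of why the AND gate is irreversible: enumerating the preimages of each output shows that an output of 0 could have come from three different input pairs, so the inputs cannot be recovered from the output and one bit of information is lost.

from itertools import product

preimages = {0: [], 1: []}
for a, b in product([0, 1], repeat=2):
    preimages[a & b].append((a, b))       # group the four input pairs by their AND output

print(preimages[1])   # [(1, 1)]                  -> output 1 determines the inputs uniquely
print(preimages[0])   # [(0, 0), (0, 1), (1, 0)]  -> output 0 cannot be inverted uniquely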

The universal quantum computer

The Church-Turing principle: "There exists or can be built a universal computer that can be programmed to perform any computational task that can be performed by any physical object."

A number of key advances have been made in the theory of quantum computation, the first being the discovery, by Richard Feynman in 1982, that a simple class of 'universal simulator' can mimic the behaviour of any finite physical object. David Albert made the second discovery in 1984 when he described a 'self-measuring quantum automaton' that could perform tasks that no classical computer can simulate. By instructing the automaton to measure itself, it can obtain 'subjective' information that is absolutely inaccessible by measurement from the outside. The final and perhaps most important discovery was made by David Deutsch in 1989: he proved that all the computational capabilities of any finite machine obeying the laws of quantum computation are contained in a single machine, a 'universal quantum computer'. Such a computer could be built from the quantum equivalent of the Toffoli gate and, by adding a few extra operations that can bring about linear superpositions of 0 and 1 states, the universal quantum computer is complete. This discovery requires a slight alteration to the Church-Turing principle: "There exists or can be built a universal quantum computer that can be programmed to perform any computational task that can be performed by any physical object."
Artificial intelligence
The theories of quantum computation have some interesting implications in the world of artificial
intelligence. The debate about whether a computer will ever be able to be truly artificially intelligent has
been going on for years and has been largely based on philosophical arguments. Those against the
notion suggest that the human mind does things that aren't, even in principle, possible to perform on a
Turing machine.

The theory of quantum computation allows us to look at the question of consciousness from a slightly different perspective. The first thing to note is that every physical object, from a rock to the universe as a whole, can be regarded as a quantum computer and that any detectable physical process can be considered a computation. Under these criteria, the brain can be regarded as a computer and consciousness as a computation. The next stage of the argument is based on the Church-Turing principle and states that, since every computer is functionally equivalent and any given computer can simulate any other, it must therefore be possible to simulate conscious rational thought using a quantum computer.

Some believe that quantum computing could well be the key to cracking the problem of artificial
intelligence but others disagree. Roger Penrose of Oxford University believes that consciousness may
require an even more exotic (and as yet unknown) physics.

Building a quantum computer

A quantum computer is nothing like a classical computer in design; you can't, for instance, build one from transistors and diodes. In order to build one, a new type of technology is needed, a technology that enables 'qubits' to exist as coherent superpositions of 0 and 1 states. The best method of achieving this goal is still unknown, but many methods are being experimented with and are proving to have varying degrees of success.

Quantum dots

An example of an implementation of the qubit is the 'quantum dot', which is basically a single electron trapped inside a cage of atoms. When the dot is exposed to a pulse of laser light of precisely the right wavelength and duration, the electron is raised to an excited state; a second burst of laser light causes the electron to fall back to its ground state. The ground and excited states of the electron can be thought of as the 0 and 1 states of the qubit, and the application of the laser light can be regarded as a controlled NOT function as it knocks the qubit from 0 to 1 or from 1 to 0.

If the pulse of laser light is only half the duration of that required for the NOT function, the electron is
placed in a superposition of both ground and excited states simultaneously, this being the equivalent of
the coherent state of the qubit. More complex logic functions can be modeled using quantum dots
arranged in pairs. It would therefore seem that quantum dots are a suitable candidate for building a
quantum computer. Unfortunately there are a number of practical problems that are preventing this
from happening:

• The electron only remains in its excited state for about a microsecond before it falls to the ground state. Bearing in mind that the required duration of each laser pulse is around 1 nanosecond, there is a limit to the number of computational steps that can be made before information is lost.

• Constructing quantum dots is a very difficult process because they are so small. A typical quantum dot measures just 10 atoms (1 nanometer) across. The technology needed to build a computer from these dots doesn't yet exist.

• To avoid cramming thousands of lasers into a tiny space, quantum dots could be manufactured so that they respond to different frequencies of light. A laser that could reliably retune itself would thus selectively target different groups of quantum dots with different frequencies of light. This, again, is another technology that doesn't yet exist.

Computing liquids

Quantum dots are not the only implementation of qubits that have been
experimented with. Other techniques have attempted to use individual atoms or the polarisation of
laser light as the information medium. The common problem with these techniques is decoherence.
Attempts at shielding the experiments from their

surroundings, by for instance cooling them to within a thousandth of a degree of absolute zero, have
proven to have had limited success at reducing the effects of this problem.

The latest development in quantum computing takes a radical new approach. It drops the assumption
that the quantum medium has to be tiny and isolated from its surroundings and instead uses a sea of
molecules to store the information. When held in a magnetic field, each nucleus within a molecule spins
in a certain direction, which can be used to describe its state; spinning upwards can signify a 1 and
spinning down, a 0. Nuclear Magnetic Resonance (NMR) techniques can be used to detect these spin
states and bursts of specific radio waves can flip the nuclei from spinning up (1) to spinning down (0)
and vice-versa.

The quantum computer in this technique is the molecule itself and its qubits are the nuclei within the
molecule. This technique does not however use a single molecule to perform the computations; it
instead uses a whole 'mug' of liquid molecules. The advantage of this is that even though the molecules
of the liquid bump into one another, the spin states of the nuclei within each molecule remain
unchanged. Decoherence is still a problem, but the time before the decoherence sets in is much longer
than in any other technique so far. Researchers believe a few thousand primitive logic operations should be possible within the time it takes the qubits to decohere.

Dr. Gershenfeld from the Massachusetts Institute of Technology is one of the pioneers of the computing liquid technique. His research team has already been able to add one and one together, a simple task which is way beyond any of the other techniques being investigated. The key to being able to perform more complex tasks is to have more qubits, but this requires more complex molecules with a greater number of nuclei, the caffeine molecule being a possible candidate. Whatever the molecule, the advancement to 10-qubit systems is apparently straightforward. Such a system, Dr. Gershenfeld hopes, will be possible by the end of this year, and should be capable of factoring the number 15.

Advancing beyond a 10-qubit system may prove to be more difficult. In a given sample of 'computing liquid' there will be a roughly even number of up and down spin states, but a small excess of spin in one direction will exist. It is the signal from this small amount of extra spin, behaving as if it were a single molecule, that can be detected and manipulated to perform calculations, while the rest of the spins will effectively cancel each other out. This signal is extremely weak and grows weaker by a factor of roughly 2 for every qubit that is added. This imposes a limit on the number of qubits a system may have, as the readable output will be harder to detect.

Applications of quantum computers

It is important to note that a quantum computer will not necessarily outperform a classical computer at all computational tasks. Multiplication, for example, will not be performed any quicker on a quantum computer than it could be done on a similar classical computer. In order for a quantum computer to show its superiority it needs to use algorithms that exploit its power of quantum parallelism. Such algorithms are difficult to formulate, the most significant theorised to date being Shor's algorithm and Grover's algorithm. By using these algorithms a quantum computer will be able to outperform classical computers by a significant margin. For example, Shor's algorithm allows extremely quick factoring of large numbers: a classical computer can be estimated at taking 10 million billion billion years to factor a 1000-digit number, whereas a quantum computer would take around 20 minutes.

Shor's algorithm

This is an algorithm devised by Peter Shor in 1994 that can be used to quickly factorise large numbers. If it is ever implemented it will have a profound effect on cryptography, as it would compromise the security provided by public key encryption (such as RSA).

Public key encryption at risk

This is currently the most commonly used method for sending encrypted data. It works by using two keys, one public and one private. The public key is used to encrypt the data, while the private key is used to decrypt the data. The public key can be easily derived from the private key but not vice versa. However, an eavesdropper who has acquired your public key can in principle calculate your private key, as they are mathematically related. In order to do so it is necessary to factorise the public key, a task that is considered to be intractable.

For example, multiplying 1234 by 3433 is easy to work out, but calculating the factors of 4236322 is not
so easy. The difficulty of factorising a number grows rapidly with additional digits. It took 8 months and
1600 Internet users to crack RSA 129 (a number with 129 digits). Cryptographers thought that more
digits could be added to the key to combat increasing performance in computers (it would take longer
than the age of the universe to calculate RSA 140). However, using a quantum computer, which is
running Shor's algorithm, the number of digits in the key has little effect on the difficulty of the
problem. Cracking RSA 140 would take a matter of seconds.
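
A toy illustration of this asymmetry, using the numbers quoted above plus an additional pair of small primes chosen for the purpose (the function name and the extra primes are my own assumptions): multiplication is a single operation, while recovering factors by trial division requires scanning many candidate divisors, and the work grows rapidly with the size of the number.

def trial_factor(n):
    # Find the smallest non-trivial factor of n by trial division, counting the divisions tried.
    tries, d = 0, 2
    while d * d <= n:
        tries += 1
        if n % d == 0:
            return d, n // d, tries
        d += 1
    return None, n, tries                # n is prime

print(1234 * 3433)                       # forward direction: one multiplication -> 4236322
print(trial_factor(4236322))             # this particular number happens to have a small factor (it is even)...
p, q = 10007, 10009                      # ...so also try a product of two (illustrative) primes
print(trial_factor(p * q))               # (10007, 10009, 10006): thousands of trial divisions needed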

Shor's algorithm - an example

The purpose of this section is to illustrate the basic steps involved in Shor's algorithm. In order to keep the example relatively easy to follow we will consider the problem of finding the prime factors of the number 15. Since the algorithm consists of three key steps, this explanation will be presented in three stages.

Stage 1

The first stage of the algorithm is to place a memory register into a coherent superposition of all its possible states. The letter 'Q' will be used to denote a qubit that is in the coherent state.

Figure 3: A qubit and a three-bit register. A three-qubit register can represent 2^3 = 8 classical states simultaneously.

When a qubit is in the coherent state, it can be thought of as existing in two different universes. In one
universe it exists as a '1' and in the other it exists as a '0' (See Figure 1). Extending this idea to the 3 bit
register we can imagine that the register exists in 8 different universes, one for each of the classical
states it could represent (i.e. 000, 001, 010, 011, 100, 101, 110, 111). In order to hold the number 15, a
four bit register is required (capable of representing the numbers 0 to 15 simultaneously in the coherent
state).

A calculation performed on the register can be thought of as a whole group of calculations performed in
parallel, one in each universe. In effect, a calculation performed on the register is a calculation
performed on every possible value that register can

represent.

Stage 2

The second stage of the algorithm performs a calculation using the register. The details are as follows:

• The number N is the number we wish to factorise; here N = 15.

• A random number X is chosen, where 1 < X < N.

• X is raised to the power contained in the register (register A) and then divided by N. The remainder from this operation is placed in a second 4-bit register (register B).

Figure 4: Registers A and B; the operation performed in Stage 2 computes register B = X^A MOD N.

After this operation has been performed, register B contains the superposition of each universe's results.
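
Classically we can only tabulate these remainders one value at a time, which is exactly what the quantum register avoids. Still, a classical sketch of the arithmetic (an illustration only, with the choice X = 7 being my assumption) shows the periodic sequence that the remaining stage of the algorithm would extract, and how its period r yields the factors of 15 via gcd(X^(r/2) ± 1, N).

from math import gcd

N, X = 15, 7                      # X chosen with 1 < X < N and gcd(X, N) = 1 (assumed here)
A = list(range(16))               # every value the 4-bit register A can hold
B = [pow(X, a, N) for a in A]     # contents of register B: X^a mod N for each a
print(B)                          # [1, 7, 4, 13, 1, 7, 4, 13, ...] -> the sequence repeats

r = B[1:].index(B[0]) + 1         # period of the sequence: smallest r > 0 with X^r mod N = 1
print("period r =", r)            # r = 4

# With an even period, gcd(X^(r/2) +/- 1, N) yields the factors of N.
f1 = gcd(pow(X, r // 2) - 1, N)
f2 = gcd(pow(X, r // 2) + 1, N)
print("factors of 15:", f1, f2)   # 3 and 5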
Grover's algorithm

Lov Grover has devised an algorithm that uses quantum computers to search an unsorted database faster than a conventional computer. Normally it would take on average N/2 searches to find a specific entry in a database with N entries. Grover's algorithm makes it possible to perform the same search in roughly the square root of N searches. With the increasing size and integration of databases, this saving in time becomes significant. The speed-up that this algorithm provides is a result of quantum parallelism. The database is effectively distributed over a multitude of universes, allowing a single search to locate the required entry. A further number of operations (proportional to the square root of N) are required in order to produce a readable result.

Grover's algorithm has a useful application in the field of cryptography. It is theoretically possible to use this algorithm to crack the Data Encryption Standard (DES), a standard which is used to protect, amongst other things, financial transactions between banks. The standard relies on a 56-bit number that both participants must know in advance; the number is used as a key to encrypt/decrypt data.

If an encrypted document and its source can be obtained, it is possible to attempt to find the 56-bit key. An exhaustive search by conventional means would require searching about 2 to the power 55 keys on average before hitting the correct one. This would take more than a year even if one billion keys were tried every second; by comparison, Grover's algorithm could find the key after only around 185 million searches (of the order of the square root of the number of keys). For conventional DES, a method to stop faster computers from cracking the code would be simply to add extra digits to the key, which would increase the number of classical searches needed exponentially. However, the effect that this would have on the quantum algorithm is much smaller, since the number of quantum searches grows only with the square root of the key space.

APPLICATION OF QUANTUM COMPUTING

Quantum communication

The research carried out on quantum computing has created the spin-off field of quantum communication. This area of research aims to provide secure communication mechanisms by using the properties of quantum mechanical effects.

Researchers in quantum communication have enjoyed a greater level of success. The partial quantum computers involved have enabled secure communication over distances as far as 10 km. Depending on how costly these lines are to develop and the demand that exists for them, there could be a strong future for quantum communications.

CHAPTER TWELVE

MOLECULAR COMPUTING

A Brief Overview

Life is computation. Every single living cell reads information from a memory, re-writes it, receives data input (information about the state of its environment), processes the data and acts according to the results of all this computation. Globally, the zillions of cells populating the biosphere certainly perform more computation steps per unit of time than all man-made computers put together.

However, living cells do not use any of the devices which (1990s) computer users would expect to be necessary for computation: no semiconductor chips, nor quantum dots or mechanical Babbage-type machinery. Rather than mechanics, quantum mechanics or electronics, molecular computers or cells use chemistry: they compute by using molecules, mostly heteropolymeric macromolecules such as proteins and nucleic acids. Proteins can, for instance, act as signal receptors, logic gates, or signal transducers between different forms of signaling, including light, electricity and chemical messenger systems. Nucleic acids mainly act as memory, both for permanent and for short-term applications, although the self-splicing activity of certain RNA molecules means that they can perform information processing as well. While the genetic role of nucleic acids has been known for several decades, the roles of proteins in molecular computation in the cell are only just beginning to be explored. Signal transduction, even if it is only to do with yes-or-no signals, is a complex business, which only very slowly yields to massive research effort.

While we seem to know more about the computational aspects of DNA than those of proteins in the cell, this situation is reversed in the application of biomolecules in artificial computing systems. Although some light-harvesting proteins such as bacteriorhodopsin (bR) have been used in attempts to create molecular (computer) memories since the early 1970's, the notion of DNA-based computing only came up in 1994. This paradox may partly be explained by the fact that relatively small populations of bR molecules can be manipulated and 'read' by laser light, which has been available for several decades, whereas the methods to deal with small amounts of DNA have only been developed very recently.

To this day, copying and spreading genetic information is what life is all about. All the visible parts of living organisms (the phenotype) have evolved because they were able to assist this purpose in one way or another. Considering that life essentially is a molecular data-copying machinery, we will be less surprised to learn that evolution has produced some clever computational devices along the way.

Molecular computing can be defined as computation going on in life or living systems using molecules as data storage and/or processing units. Supramolecular chemistry, which studies complex systems composed of synthetic molecules associated by weak interactions, has already been used to produce some devices, such as molecular switches, which may prove useful for future developments in computation.

Molecular computing is closer to reality today with the creation of a molecular switch that can be turned on and off hundreds of times and could replace silicon-based computing. A molecular switch is a logic gate which can represent the binary language of a digital computer. Molecular switches would be many times cheaper than traditional solid-state devices and would allow for continued miniaturization and increases in power that silicon-based components would never be able to reach.
