Chapter3 2020

SCOB000 2020

CHAPTER 3: ARTIFICIAL INTELLIGENCE (AI)

1. Intelligence and machines

Artificial Intelligence (AI) got its start in the 1950s, when computer pioneer Alan Turing introduced the
Turing test in his paper “Computing Machinery and Intelligence.” The test involves a judge who must
communicate with two participants in two different rooms. In one room is a machine, and in the other is a
real person. The judge is then supposed to ask each participant questions and figure out which one is the
machine. More details will be covered later in this chapter. In the past couple of decades, there has been
enormous progress in this field, but we have to ask: is that necessarily a good thing? Here are some
interesting facts and questions about AI, some of which are a little troubling:

 Artificial Intelligence can read, write and learn
 Artificial Intelligence can be a fierce poker player
 Artificial Intelligence can repair itself
 Will Artificial Intelligence become smarter than humans?
 Can Artificial Intelligence become self-aware?
 Can Artificial Intelligence be hacked?
 Can Artificial Intelligence feel emotions?
 Can Artificial Intelligence actually exist?

All of these questions are worth thinking about. Although the computer is often
personified, an important distinction exists between its properties and the properties of a human mind.
 In terms of speed and accuracy, computers outperform human beings; however, they are not gifted in
situations where common sense is required.
 When faced with a situation not foreseen by the programmer, a computer’s performance deteriorates
rapidly.
 The human mind often flounders when confronted with complex computations but is capable of
understanding and reasoning.
 Lastly, whereas a machine might outperform a human in computing solutions to problems in, e.g.,
nuclear physics, the human is more likely to understand the results and to determine what the
next computation should be.

What is AI?

Artificial Intelligence (AI) is usually defined as the science of making computers do things that require
intelligence when done by humans. There are, of course, many definitions from different authors. For
example, Luger and Stubblefield (1993) defined AI as the branch of computer science that is concerned with the
automation of intelligent behavior.

J. Glenn Brookshear defines AI as the field of computer science that seeks to build autonomous machines,
that is, machines that can carry out complex tasks without human intervention.

In short, artificial intelligence refers to a series of related technologies that try to simulate and reproduce
human thought and behavior, including thinking, speaking, feeling and reasoning, by making use of
computers. To achieve all these things, computers must have the following properties:

 Understand common sense
 Understand relationships among facts
 Deal with exceptions
 Interface with humans in a free-format fashion (natural language)
 Understand facts and manipulate qualitative data
 Be able to deal with new situations based on previous learning
 Be able to learn from experience

The definitions of AI given by different authors vary along two dimensions. That is, we have
definitions that are concerned with thought processes and reasoning, and definitions that are concerned
with behavior. Building on these definitions, we can draw four possible goals or categories to pursue
in a study of AI:
 Systems that think like humans – Before we can conclude that a particular machine thinks like a
human, we must have some way of determining how humans think. In other words, we need to
get inside the actual workings of human minds. There are two ways to do this:
 Through introspection – trying to catch our own thoughts as they go by,
 Through psychological experiments.
Once we have a sufficiently precise theory of the mind, it becomes possible to express the theory as
a computer program.
 Systems that act like humans – for a particular machine to act like a human, it must possess the
following capabilities, based on the Turing test:
 Natural language processing – to enable it to communicate successfully in a natural
language,
 Knowledge representation – to store information provided before or during the
interrogation,
 Automated reasoning – to use the stored information to answer questions and to draw
new conclusions,
 Machine learning – to adapt to new circumstances and to detect and extrapolate patterns.
 Systems that think rationally – This approach is driven by the laws of logic or deductive
reasoning. For example, given two statements (say "Socrates is a man" and "All men are mortal")
we can infer further knowledge ("Socrates is mortal"). Using logic, we should be able to build
systems that allow us to take a description of a problem (in logical notation) and find the answer to
that problem.
 Systems that act rationally – acting rationally means acting so as to achieve one’s goals, given
one’s beliefs. In other words, the system is rational if it does the right thing. Making correct
inferences is sometimes part of being a rational agent, because one way to act rationally is to
reason logically to the conclusion that a given action will achieve one’s goals and then act on
that conclusion.
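
The deductive step described under "systems that think rationally" can be sketched in a few lines of Python. This is only an illustration under assumed conventions: facts are stored as (predicate, subject) pairs, the rule "All men are mortal" is stored as a pair of predicates, and the name forward_chain is hypothetical rather than taken from any library.

```python
# Facts are (predicate, subject) pairs. The rule "All men are mortal" is
# modelled as: if ("man", x) holds, then ("mortal", x) holds.
facts = {("man", "Socrates")}
rules = [("man", "mortal")]  # (premise predicate, conclusion predicate)


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for predicate, subject in list(derived):
                new_fact = (conclusion, subject)
                if predicate == premise and new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


conclusions = forward_chain(facts, rules)
print(("mortal", "Socrates") in conclusions)
```

The program is given only the two premises, and the conclusion "Socrates is mortal" appears as a derived fact.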

1.1. Intelligent Agents

Agent: a “device” that responds to stimuli from its environment, e.g., a robot. An agent has:
Sensors – to receive data from the environment
Actuators – to affect the environment
Much of the research in artificial intelligence can be viewed in the context of building agents that
behave intelligently, meaning that the actions of the actuators must be rational responses to the data
received through the sensors.
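
As a rough sketch of the sensor and actuator idea, consider a thermostat treated as a very simple agent. The class name ThermostatAgent, the target temperature and the action names are all assumptions made for this illustration.

```python
class ThermostatAgent:
    """Hypothetical agent: senses a temperature and chooses a heater action."""

    def __init__(self, target):
        self.target = target  # the agent's goal

    def act(self, sensed_temperature):
        # Rational response: pick the action that moves towards the goal.
        if sensed_temperature < self.target:
            return "heater_on"
        return "heater_off"


agent = ThermostatAgent(target=20.0)
for reading in [17.5, 20.0, 23.0]:  # simulated sensor data
    print(reading, agent.act(reading))
```

The sensor reading is the only input, and the returned action stands in for the command sent to an actuator.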

2. The Turing Test

The Turing test was designed to provide a satisfactory operational definition of intelligence. Turing thus
defined intelligent behavior as the ability to achieve human-level performance in all cognitive tasks,
sufficient to fool an interrogator.

This test was invented by Alan M. Turing (1912-1954) and first described in his 1950 article. The basic setup
of the test includes two people and the machine to be tested. One person is an interrogator, and the other
person and the machine are respondents. The interrogator and respondents are all in different rooms and
thus physically separated. The interrogator can only ask questions via a keyboard (e.g. a teletype or
computer terminal). Both respondents attempt to convince the interrogator that they are the human
respondent. Turing suggested that the test should be run for five minutes or so, but the precise length is
somewhat irrelevant. This, then, is an imitation game for the machine (figure 3.1).

Figure 3.1, Turing test: the questioner sends questions to, and receives answers from, two respondents
and aims to discover whether A or B is the computer. (A) The computer contestant aims to fool the
questioner; (B) the human confederate aims to help the questioner.

The machine is said to pass the test if the interrogator cannot tell the difference between the respondents,
or can do no better than chance at guessing their identities. The machine fails the test if the interrogator
can tell the difference. Turing thought that any machine which passes the test should be considered
intelligent, or more precisely, should be considered to 'think'.

But the big question remains: “Can a machine think?”

The crux of the test is that there must be a comparison between a human and a machine. So long as the
observer is not able to distinguish between the two, we can say that the computer has passed the test
and therefore that the machine is intelligent or can think. No computer has ever passed the test; this is
because Turing’s test condition was open and general. But if the activities involved are scaled down to
narrow fields such as playing chess, some machines are not far from passing the test.

However, there are quite a number of objections to Turing’s opinion concerning the thinking machine. Here
are some of the objections and the replies he gave to defend his opinion.

Theological Objection: This states that thinking is a function of man's immortal soul and therefore a machine
could not think. Turing replies by saying that he sees no reason why it would not be possible for God to grant
a computer a soul if he so wished.

Mathematical Objections: This objection uses mathematical theorems, such as Gödel's incompleteness
theorem, to show that there are limits to which questions a computer system based on logic can answer.
Turing suggests that humans are too often wrong themselves and pleased at the fallibility of a machine.

Argument from Consciousness: This argument, put forward by Professor Jefferson in his Lister Oration,
states, "not until a machine can write a sonnet or compose a concerto because of thoughts and emotions
felt, and not by the chance fall of symbols, could we agree that machine equals brain". Turing replies by
saying that we have no way of knowing that any individual other than ourselves experiences emotions, and
that therefore we should accept the test.

Lady Lovelace Objection: One of the most famous objections, it states that computers are incapable of
originality. Turing replies that computers could still surprise humans, in particular where the consequences
of different facts are not immediately recognizable.

Informality of Behaviour: This argument states that any system governed by laws will be predictable and
therefore not truly intelligent. Turing replies by stating that this is confusing laws of behaviour with general
rules of conduct.

Extra-sensory perception: Turing seems to suggest that there is evidence for extra-sensory perception.
However he feels that conditions could be created in which this would not affect the test and so may be
disregarded.

2.1. The eight-puzzle machine

But how do we judge whether a particular machine is intelligent? Let us now investigate how
machines can be programmed to appear to be intelligent. We are going to base our investigation on the
so-called eight-puzzle machine (figure 3.2). We start by identifying the elementary characteristics
that we will need to consider when designing such a machine:

 the machine will take the form of a box equipped with a gripper,
 a video camera, and
 a finger with a rubber end (so that it does not slip when pushing something).

Figure 3.2, puzzle-solving machine

Let us now consider such a machine placed on a table on which an eight-puzzle lies. This is a puzzle
consisting of eight square tiles labelled 1 through 8 mounted in a frame capable of holding a total of nine
such tiles in three rows and three columns. Among the tiles in the frame is a vacancy into which any of the
adjacent tiles can be pushed. Figure 3.3 shows how the tiles are arranged.

Figure 3.3 eight-puzzle in its solved configuration

We begin by picking up the puzzle and rearranging it by repeatedly pushing arbitrarily chosen tiles into the
vacancy. The machine is then turned on, and the gripper begins to open and close as if asking for the puzzle.
The puzzle is then placed in the gripper, and the gripper closes on the puzzle. After a short time the
machine’s finger lowers and begins pushing the tiles around in the frame until they are back in their original
order. At this point the machine releases the puzzle and turns itself off.

Understanding images

The first intelligent behavior required by the puzzle-solving machine is the extraction of information through
a visual medium. It is important to realize that the problem faced by the machine when looking at the puzzle
is not that of merely producing and storing an image. Technology has been doing that for years: photography
and television, for example, store images without understanding them. The problem faced by the
machine is to understand the image in order to extract the current status of the puzzle and later to monitor
the movement of the tiles. In other words, the machine must demonstrate the ability to perceive.

For the machine to understand the individual digits on the puzzle, we assume that the image of the
puzzle has been encoded in terms of bits in the computer’s memory, with each bit representing the
brightness level of a particular pixel. Assuming a uniform size of the image, the machine can detect
which tile is in which position by comparing the different sections of the picture to prerecorded templates
consisting of the bit patterns produced by the individual digits used in the puzzle. As matches are found, the
condition of the puzzle is revealed.
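
The template comparison described above might be sketched as follows. The 3x3 one-bit templates, the names TEMPLATES, mismatches and recognise, and the "fewest differing pixels" rule are all assumptions for illustration; a real camera image would be far larger and noisier.

```python
# Hypothetical 3x3 one-bit templates (1 = dark pixel) for two digits.
TEMPLATES = {
    "1": ((0, 1, 0),
          (0, 1, 0),
          (0, 1, 0)),
    "7": ((1, 1, 1),
          (0, 0, 1),
          (0, 0, 1)),
}


def mismatches(region, template):
    """Count the pixels at which the region and the template differ."""
    return sum(
        r != t
        for region_row, template_row in zip(region, template)
        for r, t in zip(region_row, template_row)
    )


def recognise(region):
    """Return the label of the template that differs least from the region."""
    return min(TEMPLATES, key=lambda label: mismatches(region, TEMPLATES[label]))


# A slightly noisy picture of the "7" template (one pixel flipped).
noisy_seven = ((1, 1, 1),
               (0, 0, 1),
               (0, 1, 1))
print(recognise(noisy_seven))
```

Choosing the closest template, rather than demanding an exact match, lets the sketch tolerate a small amount of noise in the image.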

The task of understanding general images is usually approached in two steps:

Image processing – This refers to identifying characteristics of the image. The following are the steps for
identifying the various components in an image:

 Edge enhancement – which is the process of applying mathematical techniques to clarify the
boundaries between regions in an image.
 Region finding – which is the process of identifying those areas in an image that have
common properties such as brightness, color, or texture. It is the ability to recognize regions
that allows computers to add color to cartoons or old-fashioned black and white motion
pictures.
 Smoothing – This is the process of removing flaws in the image.

Image analysis – This refers to the process of understanding the meaning of characteristics identified during
image processing. In other words it is the process of finding what these components represent and
ultimately what the image means. One approach to image analysis is to start with an assumption about what
the image might be and try to associate the components in the image with the objects whose presence is
conjectured.
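
Two of the image-processing steps named above, smoothing and edge enhancement, can be illustrated on a single row of brightness values. The median filter and the neighbour-difference measure used here are just one simple choice among many, and the sample row is made up.

```python
def smooth(row):
    """Replace each interior pixel by the median of itself and its neighbours."""
    result = list(row)
    for i in range(1, len(row) - 1):
        result[i] = sorted(row[i - 1:i + 2])[1]
    return result


def edges(row):
    """Absolute brightness difference between neighbouring pixels."""
    return [abs(b - a) for a, b in zip(row, row[1:])]


row = [10, 10, 90, 10, 10, 200, 200, 200]  # 90 is an isolated flaw
cleaned = smooth(row)
print(cleaned)         # the flaw is removed
print(edges(cleaned))  # one large value marks the boundary between regions
```

After smoothing, the only large difference lies between the dark region and the bright region, which is the boundary that edge enhancement is meant to clarify.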

3. Reasoning

As soon as the puzzle-solving machine has deciphered the positions of the tiles from the visual image, its task
becomes that of figuring out what moves are required to solve the puzzle. An approach to this problem that
might come to mind is to pre-program the machine with solutions to all possible arrangements of the tiles.
Then the machine’s task would merely be to select and execute the proper program. However, the
eight-puzzle has 181,440 different configurations, so providing an explicit solution for each one would not
be easy, especially when we consider time and storage constraints. The machine will thus have to
be programmed so that it can construct solutions to the eight-puzzle on its own. That is, the machine must be
programmed to make decisions, draw conclusions, and perform elementary reasoning activities.
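
The figure of 181,440 can be checked with a short calculation. Eight tiles and one vacancy can be placed in the frame in 9! ways, but only half of those arrangements can be reached from the solved puzzle by sliding tiles; the variable names below are illustrative.

```python
import math

total_arrangements = math.factorial(9)  # 8 tiles plus the vacancy
reachable = total_arrangements // 2     # only half can be reached by sliding
print(total_arrangements, reachable)    # 362880 181440
```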

3.1. Production systems

A production system consists of three main components:

 A collection of states – each state is a situation that might occur in the application environment. The
beginning state is called the start or initial state; the desired state is called the goal state. In the case of
the puzzle machine, a state is a configuration of the eight-puzzle; the start state is the configuration
of the puzzle when handed to the machine; the goal state is the configuration of the solved puzzle.
 A collection of productions (rules or moves) – A production is an operation that can be performed in
the application environment to move from one state to another. Each production may be associated
with preconditions that must hold before it can be applied.
 A control system – the control system consists of the logic that solves the problem of moving from
the start state to the goal state. At each step in the process the control system must decide which of
those productions whose preconditions are satisfied should be applied.
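
A minimal sketch of the states and productions of the eight-puzzle is given below. It assumes a state is stored as a tuple of nine numbers read row by row, with 0 marking the vacancy. The function name productions and the start state are chosen for illustration and are not taken from the figures.

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the vacancy


def productions(state):
    """Yield (description, next_state) for each production whose
    precondition holds, i.e. whose tile is adjacent to the vacancy."""
    hole = state.index(0)
    row, col = divmod(hole, 3)
    # (row offset, column offset, direction in which the neighbouring tile moves)
    for d_row, d_col, direction in ((-1, 0, "down"), (1, 0, "up"),
                                    (0, -1, "right"), (0, 1, "left")):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:  # precondition: a tile exists there
            tile_pos = r * 3 + c
            nxt = list(state)
            nxt[hole], nxt[tile_pos] = nxt[tile_pos], nxt[hole]
            yield f"move {state[tile_pos]} {direction}", tuple(nxt)


start = (1, 3, 5, 4, 2, 0, 7, 8, 6)  # assumed start state
for move, nxt in productions(start):
    print(move, nxt)
```

The control system is deliberately left out here: this sketch only lists the applicable productions, and choosing among them is the job of the search procedure.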

4. Search Trees

When developing control systems, a state graph becomes important. A state graph consists of a
collection of nodes representing the states, connected by arrows representing the productions that shift
the system from one state to another. The control system’s job basically involves searching the state
graph to find a path from the start node to the goal node.
The search starts at the start state and spreads in several directions as new states are reached by
applying productions. The process continues until the goal is found among these new states. Once the
solution has been found, the control system needs to apply the productions along the discovered path
from the start state to the goal state.
The effect of this strategy is to build a tree called a search tree, which consists of the part of the
state graph that has been investigated by the control system.
Attributes of a search tree
- Root node – it is the start state,
- Children of each node – they are the states reachable from the parent by applying one
production.
Each arc between nodes in a search tree represents the application of a single production, and each path
from the root to a leaf represents a path between the corresponding states in the state graph.

Figure 3.4 unsolved eight-puzzle

Figure 3.4 shows an unsolved eight-puzzle configuration, while figure 3.5 shows the tree that would be
produced when solving the puzzle in figure 3.4

Figure 3.5, search tree for the 8-puzzle

The leftmost branch of this tree represents an attempt to solve the problem by first moving the 6 tile up,
the center branch represents the approach of moving the 2 tile to the right, and the rightmost branch
represents the approach of moving the 5 tile down. Following this pattern of moving the tiles, the
goal state will occur in the last level of the search tree. Once the solution has been found, the control
system terminates its search procedure and begins constructing the instruction sequence that will be
used to solve the puzzle in the external environment.

Applying this method, the above search tree produces the stack of productions shown in figure 3.6. Now
the control system can solve the puzzle in the outside world by executing the instructions as they are
popped off this stack.
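
The whole procedure (grow the search tree level by level, then walk back from the goal pushing productions onto a stack) might be sketched as below. The start state is an assumption chosen to agree with the three opening moves described above (6 up, 2 right, 5 down); the figures themselves are not reproduced here, and the names successors and solve are illustrative.

```python
from collections import deque

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the vacancy


def successors(state):
    """Yield (tile moved, next state) for every legal production."""
    hole = state.index(0)
    row, col = divmod(hole, 3)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:
            pos = r * 3 + c
            nxt = list(state)
            nxt[hole], nxt[pos] = nxt[pos], nxt[hole]
            yield state[pos], tuple(nxt)


def solve(start):
    """Grow the search tree breadth-first, then return the productions as
    a stack whose top (last element) is the first move to execute."""
    parent = {start: None}  # child state -> (parent state, tile moved)
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        if state == GOAL:
            break
        for tile, nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = (state, tile)
                frontier.append(nxt)
    else:
        return None  # every reachable state examined without success
    stack = []
    while parent[state] is not None:  # walk from the goal up to the root
        state, tile = parent[state]
        stack.append(tile)
    return stack


stack = solve((1, 3, 5, 4, 2, 0, 7, 8, 6))  # assumed start state
moves = []
while stack:
    moves.append(stack.pop())  # pop the productions in execution order
print(moves)
```

Because the stack is filled while walking from the goal back to the root, popping it yields the moves in the order in which they must be executed.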

Figure 3.6 productions

4.1 Problem solving

Imagine you are lost in a town and trying to find your way out, while a mathematician is pacing up
and down trying to prove a particular theorem. Both of you are basically searching for a solution. Search
is a very important concept in computing and AI. Search strategies are normally evaluated in
terms of the following criteria:

 Completeness, i.e., whether the strategy is guaranteed to find a solution if there is one
 Time complexity, i.e., how long the strategy takes to find a solution
 Space complexity, i.e., how much memory it takes to perform the search
 Optimality, i.e., whether the strategy finds the highest-quality solution when there are
several different solutions

There are three main categories of searching techniques used to solve problems in AI:

 Dumb search – which consists of choosing the next move at random, hoping that sooner or
later a solution will be found.
 Blind search, also called uninformed search – that is, one does not have information about the
number of steps or the path cost from the current state to the goal state. In other words, the
entire search space may be searched systematically until either a solution is found or the
whole space has been covered.
 Heuristic search, also called informed search – uses some knowledge about the problem
to help guide the search strategy.

A search space is defined as the set of all possible solutions, and a state is defined as a possible solution.

For the following search strategies, consider the following simple road map:

4.1.1 Blind search

Blind search includes several strategies, which differ only in the order in which nodes
are expanded; as we shall observe, this order can have a dramatic effect on how well the search
performs. Here are some of the blind search strategies:

 Breadth-first search

In this strategy, the root node is expanded first, then all the nodes generated by the root node are
expanded next, then their successors, and so on. In other words, all the nodes in level D are expanded
before the nodes in level D+1. Its advantage is that if there is a solution, it will be found. The disadvantage
is that a large area of the search space is covered.
For example,

Find the shortest route from A to F


The shortest route is [A,C,F], the cost of the path g(n) = 12
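
Breadth-first search can be sketched on a small road map as follows. The road map itself is not reproduced in the text, so the ROADS table below is a hypothetical one-way map whose costs were chosen to agree with the route stated above; the function names are illustrative as well.

```python
from collections import deque

# Hypothetical road map: ROADS[x][y] is the cost of the road from x to y.
ROADS = {
    "A": {"B": 4, "C": 5, "D": 3},
    "B": {"E": 7},
    "C": {"F": 7},
    "D": {"F": 9},
    "E": {"F": 3},
    "F": {},
}


def breadth_first(start, goal):
    """Expand every node of one level before any node of the next level."""
    frontier = deque([[start]])  # queue of paths, shallowest first
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for neighbour in ROADS[path[-1]]:
            if neighbour not in visited:
                visited.add(neighbour)
                frontier.append(path + [neighbour])
    return None


def path_cost(path):
    """g(n): the sum of the road costs along the path."""
    return sum(ROADS[a][b] for a, b in zip(path, path[1:]))


route = breadth_first("A", "F")
print(route, path_cost(route))
```

Note that breadth-first search minimises the number of roads travelled, not the path cost; on this assumed map the two happen to coincide.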

 Uniform cost search

This strategy modifies the breadth-first strategy by always expanding the lowest-cost node on the fringe, as
measured by the path cost g(n). For example,

Find the shortest route from A to F

At first, the goal node F is reached through D; however, the algorithm does not accept it as the solution
yet, because there are still nodes on the fringe with a cheaper path cost.
The shortest route is [A,C,F], the cost of the path g(n) = 12
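
A sketch of uniform cost search using a priority queue is shown below. As before, the ROADS table is a hypothetical map (the original road map is not reproduced in the text), arranged so that F is first generated through D but only accepted once no cheaper node remains on the fringe.

```python
import heapq

# Hypothetical road map: ROADS[x][y] is the cost of the road from x to y.
ROADS = {
    "A": {"B": 4, "C": 5, "D": 3},
    "B": {"E": 7},
    "C": {"F": 7},
    "D": {"F": 9},
    "E": {"F": 3},
    "F": {},
}


def uniform_cost(start, goal):
    """Always expand the fringe node with the lowest path cost g(n)."""
    fringe = [(0, [start])]  # entries are (path cost, path)
    expanded = set()
    while fringe:
        cost, path = heapq.heappop(fringe)
        node = path[-1]
        if node == goal:  # accepted only when it is the cheapest entry
            return path, cost
        if node in expanded:
            continue
        expanded.add(node)
        for neighbour, step in ROADS[node].items():
            heapq.heappush(fringe, (cost + step, path + [neighbour]))
    return None


print(uniform_cost("A", "F"))
```

On this assumed map the routes through C and through D both cost 12; the tie is broken by comparing the paths themselves, which is simply a consequence of how tuples are ordered in the heap.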

 Depth-first search
This strategy always expands one of the nodes at the deepest level of the tree. Only when the search
hits a dead end (a non-goal state that cannot be expanded) does it go back and expand nodes at
shallower levels. In short, it explores one branch of the tree completely before it starts to explore another.
For example,

Find the shortest route from A to E


The shortest route is [A,B,E], the cost of the path g(n) = 11
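
Depth-first search is naturally written as a recursive procedure. The ROADS table is again a hypothetical map standing in for the missing figure, and the neighbours of each node are tried in the order in which they are listed.

```python
# Hypothetical road map: ROADS[x][y] is the cost of the road from x to y.
ROADS = {
    "A": {"B": 4, "C": 5, "D": 3},
    "B": {"E": 7},
    "C": {"F": 7},
    "D": {"F": 9},
    "E": {"F": 3},
    "F": {},
}


def depth_first(node, goal, path=None):
    """Follow one branch as deep as possible before trying the next one."""
    path = (path or []) + [node]
    if node == goal:
        return path
    for neighbour in ROADS[node]:
        if neighbour not in path:  # do not walk in circles
            found = depth_first(neighbour, goal, path)
            if found:
                return found
    return None  # dead end: back up to a shallower level


def path_cost(path):
    return sum(ROADS[a][b] for a, b in zip(path, path[1:]))


to_e = depth_first("A", "E")
to_f = depth_first("A", "F")
print(to_e, path_cost(to_e))
print(to_f, path_cost(to_f))
```

On this assumed map the first branch leads straight to E, but the same strategy reaches F by the longer route through B and E, which illustrates that depth-first search does not guarantee the cheapest route.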

4.1.2 Heuristic search

We can improve on blind search strategies by building in some information about the problem.
This helps us to be more direct in our search and to avoid covering the whole area. The more
direct the search, the smaller the search tree: instead of exploring all nodes, a heuristic method
lets us choose the next node intelligently from the list of currently open nodes. Here are
some examples of heuristic strategies:

 Hill climbing
The hill climbing search always moves towards the goal. Using a heuristic value H, it finds which
direction will take it closest to the goal. The value of H is determined by:
H = (the value at the current node – the value at the goal node), and the next move chosen from
the tree is the one with the smallest H. For example,

Find the shortest route from A to F

The shortest route is [A,C,F], the cost of the path g(n) = 12
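
Hill climbing can be sketched as a loop that always steps to the neighbour with the smallest heuristic value. Both the ROADS table and the estimates in H are hypothetical stand-ins for the missing road map; H[n] plays the role of an estimated distance from n to the goal F.

```python
# Hypothetical road map and heuristic estimates of the distance to F.
ROADS = {
    "A": {"B": 4, "C": 5, "D": 3},
    "B": {"E": 7},
    "C": {"F": 7},
    "D": {"F": 9},
    "E": {"F": 3},
    "F": {},
}
H = {"A": 10, "B": 9, "C": 7, "D": 8, "E": 3, "F": 0}


def hill_climb(start, goal):
    """Repeatedly move to the neighbour that looks closest to the goal."""
    path = [start]
    while path[-1] != goal:
        neighbours = ROADS[path[-1]]
        if not neighbours:
            return None  # stuck: no road leads onwards
        path.append(min(neighbours, key=H.get))
    return path


route = hill_climb("A", "F")
cost = sum(ROADS[a][b] for a, b in zip(route, route[1:]))
print(route, cost)
```

The sketch never reconsiders a choice, so on a less friendly map it could get stuck or, if the map contained cycles, wander indefinitely; the assumed map here has neither problem.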

 Least cost search
The least cost search chooses the open node which is closest to the start node. The method assures us of
finding the shortest route; its only disadvantage arises when the distances of the nodes are the same,
because it then resembles a blind search. This strategy is similar to uniform cost search.

 A* search strategy
The A* search strategy is a combination of hill climbing and least cost search. A* generates a
value which is based on both strategies. The value is calculated as follows: A = G + H, where G is the
least cost value (the path cost so far) and H is the hill climbing value (the estimated distance to the goal).
For example,

Find the shortest route from A to F

The shortest route is [A,D,F], the cost of the path g(n) = 12
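
The combination A = G + H can be sketched with a priority queue ordered by that sum. The ROADS table and the estimates in H are hypothetical, since the original road map is not reproduced in the text; they were chosen so that the routes through C and through D both cost 12, in which case the route found depends only on how ties are broken.

```python
import heapq
from itertools import count

# Hypothetical road map and heuristic estimates of the distance to F.
ROADS = {
    "A": {"B": 4, "C": 5, "D": 3},
    "B": {"E": 7},
    "C": {"F": 7},
    "D": {"F": 9},
    "E": {"F": 3},
    "F": {},
}
H = {"A": 10, "B": 9, "C": 7, "D": 8, "E": 3, "F": 0}


def a_star(start, goal):
    """Expand the open node with the smallest G + H."""
    order = count()  # tie-breaker: entries added earlier come first
    fringe = [(H[start], next(order), 0, [start])]
    best_g = {start: 0}
    while fringe:
        _, _, g, path = heapq.heappop(fringe)
        node = path[-1]
        if node == goal:
            return path, g
        for neighbour, step in ROADS[node].items():
            new_g = g + step
            if new_g < best_g.get(neighbour, float("inf")):
                best_g[neighbour] = new_g
                entry = (new_g + H[neighbour], next(order), new_g, path + [neighbour])
                heapq.heappush(fringe, entry)
    return None


print(a_star("A", "F"))
```

With these assumed values D has the smallest G + H after the first expansion (3 + 8 = 11), so the route through D is generated first and, being tied at cost 12, is the one returned.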

To get a clear understanding of these searching techniques, let us use the following classical example. At
the end of this example we should be able to visualize how these techniques could be used by a computer
to solve a chess problem.

Figure 3.7. arbitrary road map

You are sitting quietly in the lounge of the Snark Inn on Trogstar Beta, the inconspicuous Brown Dwarf
companion of a totally hypothetical yellow star in the third galactic arm, when your radionic modem emits
a beep. You assume that is merely some promotional data-frame which has somehow managed to evade
your junk mail filtering algorithm. You decide to examine the message out of curiosity anyway. Surprisingly
it turns out that it is an invitation to a party on Terra Firma, the third planet in your local system. Terra
Firma is hardly the place to be seen but then neither is the Snark Inn, so you decide to go, for want of
anything better to do. You whip out your gate-crasher’s intergalactic route map of the vicinity (figure 3.7)
and ask your computer the way to get to Terra Firma.

The road map of figure 3.7 shows different routes you might take from Trogstar Beta to Terra Firma. At the
start, your computer will randomly send you to one of the neighboring planets. From there it will once
more choose the next planet at random. Eventually you will either give up or arrive at Terra Firma if you
are lucky. The chances are you might arrive long after the party is over.

We now want to find out how different search strategies help us to find the cheapest route to
Terra Firma.

 Breadth-first search
The shortest route when using breadth-first will be [0, 1, 6, 9, 12, 15], g(n) = 121

 A* search
The shortest route when using A* will be [0, 5, 7, 9, 10, 11], g(n) = 111

 Uniform cost search
Give it a try.

 Hill climbing
Give it a try.
Heuristics revisited
A good heuristic has two characteristics:
- It should constitute a reasonable estimate of the amount of work remaining in the solution
if the associated state were reached. The better the estimate provided by the heuristic,
the better the decisions that are based on that information will be.
- It should be easy to compute, so that it has a good chance of benefiting the
search process rather than becoming a burden.

Let us now look into our eight-puzzle problem:

A simple heuristic in the case of the eight-puzzle would be to estimate the distance to the goal by
counting the number of tiles that are out of place, the conjecture being that a state in which four tiles are
out of place is farther from the goal (and therefore less appealing) than a state in which only two tiles are
out of place. However, this heuristic does not take into account how far out of position the tiles are. If the
two tiles are far from their proper positions, many productions could be required to move them across
the puzzle.

A slightly better heuristic, then, is to measure the distance each tile is from its destination and add these
values to obtain a single quantity. A tile immediately adjacent to its final destination would be associated
with a distance of one, whereas a tile whose corner touches the square of its final destination would be
associated with a distance of two. This heuristic is easy to compute and produces a rough estimate of the
number of moves required to transform the puzzle from its current state to the goal. For example, the
heuristic value associated with the configuration in figure 3.8 is seven (because tiles 2, 5, and 8 are each
at a distance of 1 from their final destinations while tiles 3 and 6 are each at a distance of 2 from their final
destinations). In this case it actually takes seven moves to turn this puzzle configuration into the solved
configuration.
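
Both heuristics are easy to express in code. The sketch below assumes a state is a tuple of nine numbers read row by row with 0 for the vacancy. The sample state is an assumed configuration chosen to match the distances quoted above (tiles 2, 5 and 8 at distance 1, tiles 3 and 6 at distance 2), since figure 3.8 is not reproduced here.

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the vacancy


def misplaced_tiles(state):
    """First heuristic: how many tiles are not in their goal position."""
    return sum(1 for tile, goal in zip(state, GOAL) if tile != 0 and tile != goal)


def total_distance(state):
    """Second heuristic: for each tile, rows plus columns to its destination."""
    total = 0
    for position, tile in enumerate(state):
        if tile == 0:
            continue
        goal_position = GOAL.index(tile)
        row, col = divmod(position, 3)
        goal_row, goal_col = divmod(goal_position, 3)
        total += abs(row - goal_row) + abs(col - goal_col)
    return total


state = (1, 5, 2, 4, 8, 0, 7, 6, 3)  # assumed configuration
print(misplaced_tiles(state), total_distance(state))
```

For this state the first heuristic only reports that five tiles are out of place, while the second also accounts for how far each of them has to travel.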

Figure 3.8 unsolved eight-puzzle

Let us now incorporate this heuristic into our decision-making process. We all know that any human
faced with a decision tends to select the option that appears closest to the goal. Thus our search
procedure should consider the heuristic of each leaf node in the tree and pursue the search from a leaf
node associated with the smallest value. This is the strategy adopted in the algorithm of figure 3.9.

Let us apply this algorithm to the eight-puzzle, starting from the unsolved configuration in figure 3.4.
First we establish this initial state as the root node and record its heuristic value, which is 5. Then, the
first pass through the body of the While structure instructs us to add the three nodes that can be reached
from the initial state, as shown in figure 3.10. The heuristic value of each leaf node has been recorded in
parentheses beneath the node.

Figure 3.9 an algorithm for a control system using heuristics

Figure 3.10

The goal has not been reached, so we again pass through the body of the While structure, this time
extending our search from the leftmost node (the leftmost leaf node with the smallest heuristic value).
After this, the search tree has the form shown in figure 3.11.
The heuristic value of the leftmost leaf node is now five, indicating that this branch is perhaps not a good
choice to pursue after all. The algorithm picks up on this and in the next pass through the loop instructs
us to expand the tree from the rightmost leaf node (which now is the leftmost leaf node with the
smallest heuristic value). Having been expanded in this fashion, the search tree appears as in figure 3.12.

Figure 3.11

Figure 3.12

At this point the algorithm seems to be on the right track, because the heuristic value of this last node is
only 3. The While structure then instructs us to continue pursuing this path, and the search focuses
towards the goal, producing the search tree appearing in figure 3.13.

Figure 3.13

Comparing this with the tree in figure 3.5 shows that, even with the temporary wrong turn taken early by
the algorithm, the use of heuristic information has greatly decreased the size of the search tree and
produced a much more efficient process.

After reaching the goal state, the While structure terminates, and we move on to traverse the tree from
the goal node up to the root, pushing the productions encountered onto a stack as we go. The resultant
stack appears as depicted earlier in figure 3.6.
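
The heuristic-guided procedure walked through above might be sketched as follows. This is not a transcription of figure 3.9, which is not reproduced here, but a sketch in the same spirit. It always extends the search from the leaf with the smallest heuristic value (earlier leaves win ties), skips states already in the tree, and finally walks from the goal to the root to build the stack. The start state is an assumed configuration with heuristic value 5.

```python
import heapq
from itertools import count

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the vacancy


def successors(state):
    """Yield (tile moved, next state) for every legal production."""
    hole = state.index(0)
    row, col = divmod(hole, 3)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:
            pos = r * 3 + c
            nxt = list(state)
            nxt[hole], nxt[pos] = nxt[pos], nxt[hole]
            yield state[pos], tuple(nxt)


def total_distance(state):
    """Sum over tiles of rows plus columns to the destination (index tile - 1)."""
    return sum(abs(i // 3 - (t - 1) // 3) + abs(i % 3 - (t - 1) % 3)
               for i, t in enumerate(state) if t)


def heuristic_search(start):
    """Return (stack of tiles to move, number of nodes in the search tree)."""
    order = count()  # earlier leaves win ties
    parent = {start: None}
    leaves = [(total_distance(start), next(order), start)]
    while leaves:
        _, _, state = heapq.heappop(leaves)  # leaf with the smallest value
        if state == GOAL:
            break
        for tile, nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = (state, tile)
                heapq.heappush(leaves, (total_distance(nxt), next(order), nxt))
    else:
        return None, len(parent)
    stack = []
    while parent[state] is not None:  # traverse from the goal up to the root
        state, tile = parent[state]
        stack.append(tile)
    return stack, len(parent)


stack, tree_size = heuristic_search((1, 3, 5, 4, 2, 0, 7, 8, 6))
print(list(reversed(stack)), tree_size)
```

The second value returned is the number of nodes placed in the search tree, which gives a rough way to compare this strategy with an uninformed one on the same start state.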

Exercise:

1. What is the significance of a production system in AI?

2. Using a breadth-first approach, draw the search tree that is constructed by a control system when
solving the eight-puzzle from the following start state:

1 2 3
4 8 5
7 6

3. Using the road map of figure 3.7,

a) What is the minimal route from Trogstar Beta to Terra Firma when using the depth-first
search method?
b) When using hill climbing, what would be the minimal route from Limbo to Luna?

5. Artificial Neural Networks

Artificial neural networks (ANNs) are crude electronic models that are based on the neural structure of
the brain. An ANN is composed of a large number of highly interconnected processing
elements (neurons) working in unison to solve specific problems. Like humans, ANNs learn by example or
experience. An artificial neural network is configured for a specific application, such as pattern
recognition or data classification, through a learning process. Learning in biological systems involves
adjustments to the synaptic connections that exist between the neurons. The same is true of
artificial neural networks.

Figure 3.1, biological neuron

5.1 Basic properties of Artificial Neural Networks

Artificial Neural Networks are constructed from many individual processors, called processing units.
These processing units are constructed in a manner that models networks of neurons in living
biological systems. A biological neuron is a single cell with input tentacles called dendrites and an output
tentacle called the axon (figure 3.1). The signals transmitted via a cell’s axon reflect whether the cell is in
an inhibited or excited state. This state is determined by the combination of signals received by the cell’s
dendrites. These dendrites pick up signals from the axons of other cells across small gaps known as
synapses.

Figure 3.2, activities within a processing unit

A processing unit in an Artificial Neural Network is a simple device that mimics this basic understanding
of the biological neuron. It produces an output of 1 or 0, depending on whether its effective input
exceeds a given threshold value. This effective input is a weighted sum of the actual inputs, as
represented in figure 3.2.

In this figure the outputs of three processing units (denoted by v1, v2, and v3) are used as inputs to
another unit. The “other” unit will be referred to as the fourth unit. The inputs to this fourth unit are
associated with values called weights (denoted by w1, w2, and w3).

The receiving unit multiplies each of its input values by the weight associated with that particular input
position and then adds these products to form the effective input (v1w1 + v2w2 + v3w3).

If this sum exceeds the processing unit’s threshold value, the unit produces an output of 1; otherwise the
unit produces an output of 0.

In figure 3.2, we adopted the convention of representing processing units as rectangles. At the input end
of the unit, we place a small rectangle for each input, and in this rectangle we write the weight
associated with that input. Finally we write the unit’s threshold value in the middle of the large
rectangle.

As an example, figure 3.3 represents a processing unit with three inputs and a threshold value of 1.5. The
first input is weighted by -2, the second by 3, and the third by -1.
Therefore, if the unit receives the inputs 1, 1, and 0, its effective input is:

(1)(-2) + (1)(3) + (0)(-1) = 1, which does not exceed the threshold value,

Therefore its output is 0.

But if the unit receives 0, 1, and 1, its effective input is:

(0)(-2) + (1)(3) + (1)(-1) = 2, which exceeds the threshold value,

Therefore the unit’s output is 1.
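
The two calculations above can be checked with a short Python sketch. The function name unit_output is an assumption introduced here for illustration; the weights, threshold, and inputs are those given for figure 3.3.

```python
def unit_output(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    effective_input = sum(v * w for v, w in zip(inputs, weights))
    return 1 if effective_input > threshold else 0


# Processing unit of figure 3.3: weights -2, 3, -1 and threshold 1.5
weights = [-2, 3, -1]
threshold = 1.5

print(unit_output([1, 1, 0], weights, threshold))  # effective input is 1
print(unit_output([0, 1, 1], weights, threshold))  # effective input is 2
```

The comparison is strict (greater than), matching the statement that the unit produces 1 only when the effective input exceeds the threshold.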

Figure 3.3, representation of a processing unit

The fact that a weight can be positive or negative means that the corresponding input can have either an
inhibiting or exciting effect on the receiving unit.

If the weight is negative, then a 1 at that input position reduces the weighted sum and thus tends to hold
the effective input below the threshold value. In contrast, a positive weight causes the associated input
to have an increasing effect on the weighted sum and thus increases the chances of that sum exceeding
the threshold value.

Figure 3.4, a neural network with two different programs

Moreover, the magnitude of the weight controls the degree to which the corresponding input is allowed
to inhibit or excite the receiving unit. Consequently, by adjusting the values of the weights throughout
an artificial neural network, we can program the network to respond to different inputs in a
predetermined manner.

For example, the simple network presented in figure 3.4(a) is programmed to produce an output of 1 if
its two inputs differ and an output of 0 otherwise. If, however, we change the weights to those shown in
figure 3.4(b), we obtain a network that responds with a 1 if both of its inputs are 1s and with 0
otherwise.
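
The idea of reprogramming a network by changing only its weights can be sketched in Python. The weights of figure 3.4 are not reproduced in the text, so the weights, the shared threshold of 0.5, and the two-hidden-unit layout below are assumptions chosen to give the same behaviour; the names network_output, program_a, and program_b are likewise hypothetical.

```python
def unit_output(inputs, weights, threshold):
    """Threshold unit: 1 if the weighted sum exceeds the threshold, else 0."""
    effective_input = sum(v * w for v, w in zip(inputs, weights))
    return 1 if effective_input > threshold else 0


def network_output(inputs, program, threshold=0.5):
    """Two hidden units feed one output unit; every unit uses the same threshold."""
    hidden = [unit_output(inputs, w, threshold) for w in program["hidden"]]
    return unit_output(hidden, program["output"], threshold)


# Hypothetical weights, not the values shown in figure 3.4.
program_a = {"hidden": [[1, -1], [-1, 1]], "output": [1, 1]}    # 1 if the inputs differ
program_b = {"hidden": [[1, 0], [0, 1]], "output": [0.3, 0.3]}  # 1 if both inputs are 1

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, network_output(x, program_a), network_output(x, program_b))
```

The structure of the network and its thresholds are identical in both cases; only the weight values differ between the two programs.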

The network presented in figure 3.4 is far simpler than an actual biological network, and the exact
workings of the brain are still a mystery. A human brain contains approximately 10^11 neurons
with about 10,000 synapses per neuron.

5.1.1 Comparing brains with digital computers

- Brains and computers normally perform quite different tasks and have different properties.
There are, however, more neurons in the typical human brain than there are in a typical high-end
computer workstation.
- Computers can execute instructions in tens of nanoseconds, whereas biological neurons require
milliseconds to fire.
- A brain can perform a complex task, e.g. recognizing a face, in less than a second, which is only
enough time for a few hundred cycles. A serial computer, however, requires billions of cycles to
perform the same task less well.
- Brains are more fault-tolerant than computers. A hardware error that flips a single bit can doom
an entire computation, but brain cells die all the time with no ill effect on the overall
functioning of the brain.

5.2 How artificial neurons work

The fundamental processing element of a neural network is a neuron. This building block of human
awareness encompasses a few general capabilities. Basically, a biological neuron receives inputs from
other sources, combines them in some way, performs a generally nonlinear operation on the result,
and then outputs the final result, as we have seen in the previous section.

In figure 3.1, we saw how the components of a neuron relate to each other. Within humans there are
many variations on this basic type of neuron, further complicating attempts at electrically
replicating the process of thinking. Yet all natural neurons have the same four basic components:

- Dendrites
- Soma (cell body)
- Axon
- Synapses

Dendrites are hair-like extensions of the soma which act like input channels. These input channels
receive their input through the synapses of other neurons. The soma (cell body) processes these
incoming signals over time. The soma then turns that processed value into an output, which is sent out to
other neurons through the axon and the synapses. Research has shown that biological neurons are
structurally more complex than the simplified explanation above.

5.3 Why Artificial neural networks

Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data,
can be used to extract patterns and detect trends that are too complex to be noticed by either humans
or other computer techniques.

Advantages of neural networks

- Expert system – a trained neural network can be thought of as an “expert” in the
category of information it has been given to analyze. This expert can then be used to
provide projections given new situations of interest and to answer “what if” questions.
- Adaptive learning – a network can learn how to do tasks based on the data given
for training or initial experience.
- Self-organization – a network can create its own organization or representation of
the information it receives during learning.
- Real-time operation – a network’s computations may be carried out in parallel, and special
hardware devices are being designed and manufactured to take advantage of this
capability.
- Fault tolerance via redundant information coding – partial destruction of a network
leads to a corresponding degradation of performance, but some network capabilities
may be retained even after major network damage.

5.3.1 Neural networks and conventional computers

Neural networks and conventional systems have different approaches to problem solving:

- Conventional computers use an algorithmic approach: the computer follows a set of instructions
in order to solve a problem.
- Neural networks learn by example; the network cannot be programmed to perform a specific
task. The examples must be selected carefully, otherwise useful time is wasted or, even
worse, the network might function incorrectly.
- A neural network must find out how to solve the problem by itself, so
its operation can be unpredictable.
- Conventional computers, on the other hand, use a cognitive approach to problem solving: the way the
problem is to be solved must be known and stated in small unambiguous instructions. These
instructions are then converted to a high-level language program and then into machine code that
the computer can understand.
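
The contrast between the two approaches can be illustrated with a small Python sketch. The first function states the rule explicitly, as a conventional program would. The second part learns the same behaviour from examples using the classic perceptron learning rule, a technique this chapter does not describe and which is used here only as an assumed stand-in for "learning by example". All function names, the learning rate, and the fixed threshold of 0.5 are hypothetical choices.

```python
def and_algorithmic(a, b):
    """Conventional approach: the programmer states the rule explicitly."""
    return 1 if a == 1 and b == 1 else 0


def unit_output(inputs, weights, threshold):
    """Threshold unit: 1 if the weighted sum exceeds the threshold, else 0."""
    effective_input = sum(v * w for v, w in zip(inputs, weights))
    return 1 if effective_input > threshold else 0


def train_unit(examples, threshold=0.5, rate=0.1, epochs=20):
    """Learn weights from (inputs, target) examples with the perceptron rule."""
    weights = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - unit_output(inputs, weights, threshold)
            # Nudge each weight in the direction that reduces the error.
            weights = [w + rate * error * v for w, v in zip(weights, inputs)]
    return weights


examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
learned = train_unit(examples)
for inputs, _ in examples:
    print(inputs, and_algorithmic(*inputs), unit_output(inputs, learned, 0.5))
```

The choice of examples matters: if the example ((1, 1), 1) is left out of the training set, no error ever occurs, the weights stay at zero, and the trained unit never produces a 1.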

Neural networks and conventional computers complement each other. There are tasks that are more
suited to an algorithmic approach like arithmetic operations and tasks that are more suited to neural
networks. Even more, a large number of tasks require systems that use a combination of the two
approaches; normally a conventional computer is used to supervise the neural network, in order to
perform at maximum efficiency.

5.4 How Artificial Neural Networks are being used

Artificial neural networks are progressing rapidly in a number of promising application areas. The following
are some of the areas where artificial neural networks might offer solutions.

5.4.1 Language processing

In the previous module we learned the rudiments of the translation process; these include lexical
analysis, parsing, and code generation. We looked at how these steps were applied to the task of
translating programs from high-level languages into low-level machine language. The ability to perform
the translation gives the illusion that the machine actually understands the language being translated.
However, programming languages are constructed from well-designed primitives so that each statement
has only one meaning and only one grammatical structure. In contrast, a statement in natural language
can have multiple meanings depending on its context or even the manner in which it is communicated.
Human beings normally rely on acquired knowledge and associated memory to help them understand.

For example, the sentences

Norman Rockwell painted people

and

Cinderella had a ball

have multiple meanings that cannot be distinguished by parsing or translating each word independently.
Instead, understanding these sentences requires the ability to understand the context in which the
statement is made. In other instances the true meaning of a sentence is not the same as its literal
translation.

For example,

Do you know what time it is?

often means “Please tell me what time it is” or, if the speaker has been waiting for a long time, it may
mean “You are very late”.

To unravel the meaning of a statement in a natural language therefore requires several levels of analysis:

- Syntactic analysis – its major task is parsing. It is here that the subject of the sentence is
recognized. For example, in the sentence “Mary gave John a birthday card” the subject is Mary,
while in the sentence “John got a birthday card” the subject is John.
- Semantic analysis – it has the task of identifying the semantic role of each word in the
statement. It seeks to identify such things as the action described, the agent of that
action, and the object of the action.
- Contextual analysis – it is at this level that the context of the sentence is brought into
the understanding process.

For example, it is easy to identify the grammatical role of each word in the sentence

The bat flew from his hand.

We can even perform semantic analysis by identifying the action involved as flying, the agent as bat, and
so on. But it is not until we consider the context of the statement that its meaning becomes clear: the
sentence has a different meaning in the context of a baseball game than it does in the context
of cave exploration.
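
A toy Python sketch can make the role of context concrete. It picks a sense for the word "bat" by counting how many cue words for each sense appear in the surrounding text. The cue-word lists, the sense labels, and the function name disambiguate are all assumptions invented for this illustration; real contextual analysis requires far richer knowledge than a word list.

```python
# Hypothetical cue words for two senses of "bat".
SENSES = {
    "bat": {
        "wooden club used in baseball": {"baseball", "pitch", "game", "swing", "batter"},
        "flying mammal": {"cave", "dark", "wings", "exploration", "roost"},
    }
}


def disambiguate(word, context):
    """Pick the sense whose cue words overlap most with the words of the context."""
    context_words = set(context.lower().replace(".", " ").split())
    senses = SENSES[word]
    return max(senses, key=lambda sense: len(senses[sense] & context_words))


sentence = "The bat flew from his hand."
print(disambiguate("bat", "He took a swing during the baseball game. " + sentence))
print(disambiguate("bat", "Deep in the dark cave he reached up. " + sentence))
```

The sentence itself is identical in both calls; only the preceding context changes the sense that is selected.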

One important area of natural language processing concerns entire documents rather than
individual sentences. The problems of concern fall into two categories:

- Information retrieval – which refers to the task of identifying documents that relate to the topic at hand.
As an example, imagine the task faced by World Wide Web users as they try to find the sites that relate to a
particular topic.

- Information extraction – which refers to the task of extracting information from documents so
that it takes a form that is useful in other applications. This may mean identifying the answer to
a specific question or recording the information in a form from which questions can be answered
at a later date.
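
As a minimal sketch of information retrieval, the Python fragment below ranks documents by the number of query words they contain. The scoring scheme, the function names relevance and retrieve, and the miniature document collection are assumptions made for illustration; practical search systems use considerably more sophisticated measures.

```python
def relevance(query, document):
    """Count how many distinct query words appear in the document."""
    query_words = set(query.lower().split())
    document_words = set(document.lower().split())
    return len(query_words & document_words)


def retrieve(query, documents):
    """Return the titles of matching documents, most relevant first."""
    scored = [(relevance(query, text), title) for title, text in documents.items()]
    return [title for score, title in sorted(scored, reverse=True) if score > 0]


# Hypothetical miniature document collection.
documents = {
    "robots": "industrial robots use sensors and actuators",
    "networks": "neural networks learn patterns from examples",
    "puzzles": "heuristic search solves the eight puzzle",
}
print(retrieve("how neural networks learn", documents))
```

Note that this kind of word matching identifies documents related to a topic but does not extract an answer from them, which is the separate task of information extraction.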

5.4.2 Robotics

Robotics is the study and technology of robots. It enables us to design automated machines capable of
replacing people in certain jobs. A robot is defined as an active artificial agent designed to move
materials, parts, tools, or other devices. It should be easily reprogrammed to perform a variety of tasks,
and must have sensors that enable it to react and adapt to changing conditions. Research in robotics has
emerged as a field independent of commercial applications. Its major goal is to build autonomous robots,
that is, robots that behave in a manner that ensures their survival.

The basic components of a robot system are:

- The manipulator/mechanical linkage - The manipulator consists of a set of rigid links connected
by joints. The joints are typically rotary or sliding. The last link or the most distal link is called the
end effector because it is this link to which a gripper or a tool is attached.
- Actuators - The actuators are used to drive the joints of the manipulator. The actuators convert
software commands into physical motion.
- Transmissions - Transmissions are elements between the actuators and the joints of the
mechanical linkage.
- Sensors - In order to control a robot, it is necessary to know the position of each joint in the
mechanical linkage. The joints of the robot are therefore instrumented
with position sensors.
- Controllers - The controller provides the intelligence that is necessary to control the manipulator
system. It looks at the sensory information and computes the control commands that must be
sent to the actuators to carry out the specified task.
- User interface - This interface allows a human operator to monitor or control the operation of
the robot. It must have a display that shows the status of the system. It must also have an input
device that allows the human to enter commands to the robot. The user interface may be a
personal computer with the appropriate software or a teach pendant.
- Power conversion unit - The power conversion unit takes the commands issued by the controller
which may be low power and even digital signals and converts them into high power analog
signals that can be used to drive the actuators.
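
How the sensor, controller, power conversion unit, and actuator cooperate can be sketched as a simple loop for a single rotary joint. The proportional control rule, the gain, the amplification factor, and all function names below are assumptions chosen for illustration and do not describe any particular robot.

```python
def controller(target_angle, measured_angle, gain=0.5):
    """Compute a low-power command proportional to the position error."""
    return gain * (target_angle - measured_angle)


def power_conversion(command, amplification=1.0):
    """Stand-in for the unit that turns the command into a drive signal."""
    return amplification * command


def simulate_joint(target_angle, start_angle=0.0, steps=10):
    """Repeat the sense, compute, actuate cycle for one rotary joint."""
    angle = start_angle
    for _ in range(steps):
        measured = angle                      # position sensor reading
        command = controller(target_angle, measured)
        angle += power_conversion(command)    # actuator moves the joint
    return angle


print(round(simulate_joint(90.0), 2))
```

With these assumed values the remaining position error is halved on every cycle, so the joint approaches the target angle without ever being told explicitly how far to move.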

5.4.3 AI and Database systems

The human mind is a marvelous device. How it stores its acquired knowledge and later identifies and
extracts the particular items of information that are pertinent to the task at hand remains a mystery
– a mystery that permeates many areas of artificial intelligence. A number of questions arise:

- How does the human mind recall the information needed to recognize images?
- How does the human mind apply the proper context when processing ambiguous statements in
natural languages?
- How does the human mind store its knowledge of the real world so that it can reason about that
knowledge?

In a sense, then, the fundamental problem of artificial intelligence is a database problem. Artificial intelligence
techniques are applied to traditional database systems to provide better service, and database
techniques are applied in artificial intelligence projects in an attempt to handle the massive amounts of
real-world knowledge underlying the decision processes involved.

In AI and Database systems, the same problem may be attacked from different perspectives.

Associative memory – one example is the problem of developing associative memory (database)
systems. This becomes important in the field of natural language processing, where relevant real-world
information must be applied to understand sentences according to their contexts. From the traditional database
perspective, the problem is to identify and retrieve information that is related to a topic rather
than merely the information that is explicitly requested.

Data storage – both AI and database research pursue the development of data storage and information retrieval
systems that can provide information that is implied by the stored data rather than merely respond with
information that is explicitly stored. In other words, we would like the database to be able to reason
about the information it contains.
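
The difference between information that is explicitly stored and information that is implied can be sketched in a few lines of Python. The stored facts and the function name is_a are hypothetical; the sketch assumes the facts contain no cycles, since the simple recursion would otherwise not terminate.

```python
# Hypothetical stored facts: each pair says "the first is a kind of the second".
STORED = {("canary", "bird"), ("bird", "animal"), ("shark", "fish"), ("fish", "animal")}


def is_a(thing, category, facts=STORED):
    """Answer from explicit facts, or from facts implied by chaining them.

    Assumes the facts are acyclic.
    """
    if (thing, category) in facts:
        return True
    return any(is_a(parent, category, facts)
               for child, parent in facts if child == thing)


print(("canary", "animal") in STORED)  # looks only at what is explicitly stored
print(is_a("canary", "animal"))        # also follows chains of stored facts
```

A plain lookup can only confirm what was entered, whereas the second query derives a fact that nobody stored directly.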

5.4.4 Expert systems

An important extension of the intelligent database concept is that of expert systems – software packages
designed to assist humans in situations in which an expert in a specific area is required. These systems are
designed to simulate the cause-and-effect reasoning that experts would follow if confronted with
the same situations. In other words, a medical expert system should propose the same procedure as a
medical expert would propose. The major task in constructing an expert system is to obtain the required
knowledge from an expert.

5.4.4.1 Expert system building tools

An expert system tool, or shell, is a software development environment containing the basic
components of expert systems. Associated with a shell is a prescribed method for building applications
by configuring and instantiating these components. Some of the generic components of a shell are
shown in Figure 3.15 and described below. The core components of expert systems are the knowledge
base and the inference engine.

Figure 3.15 Basic Components of Expert system Tools

- Knowledge base: This is the store of factual and heuristic knowledge. An ES tool provides one or
more knowledge representation schemes for expressing knowledge about the application
domain. Some tools use both frames (objects) and IF-THEN rules. In PROLOG the knowledge is
represented as logical statements.
- Inference engine: This contains the inference mechanisms for manipulating the symbolic information
and knowledge in the knowledge base to form a line of reasoning in solving a problem. The
inference mechanism can range from simple modus ponens backward chaining of IF-THEN rules
to case-based reasoning.
- Knowledge acquisition subsystem: This is a subsystem that helps experts to build knowledge
bases. Collecting the knowledge needed to solve problems and build the knowledge base continues
to be the biggest bottleneck in building expert systems.
- Explanation subsystem: This is a subsystem that explains the system’s actions. The explanation can
range from how the final or intermediate solutions were arrived at to justifying the need for
additional data.
- User interface: This provides the means of communication with the user. The user interface is
generally not a part of the ES technology, and was not given much attention in the past.
However, it is now widely accepted that the user interface can make a critical difference in the
perceived utility of a system regardless of the system’s performance.
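
The interplay of knowledge base, inference engine, and explanation subsystem can be sketched with a very small backward-chaining routine in Python. The rules are invented purely for illustration and carry no medical authority; the function name prove and the rule format are likewise assumptions, and the sketch assumes the rules contain no cycles.

```python
# Hypothetical knowledge base: IF all premises hold THEN the conclusion holds.
RULES = [
    ({"fever", "cough"}, "flu suspected"),
    ({"flu suspected", "short of breath"}, "refer to doctor"),
]


def prove(goal, facts, rules=RULES):
    """Backward chaining: return (proved, explanation) for a goal.

    A goal holds if it is a known fact, or if some rule concludes it and all
    of that rule's premises can be proved in turn. Assumes acyclic rules.
    """
    if goal in facts:
        return True, [f"{goal}: given by the user"]
    for premises, conclusion in rules:
        if conclusion != goal:
            continue
        explanation = []
        for premise in sorted(premises):
            proved, steps = prove(premise, facts, rules)
            if not proved:
                break
            explanation.extend(steps)
        else:
            # Every premise was proved, so the rule fires (modus ponens).
            explanation.append(f"{goal}: concluded from {', '.join(sorted(premises))}")
            return True, explanation
    return False, []


proved, explanation = prove("refer to doctor", {"fever", "cough", "short of breath"})
print(proved)
for step in explanation:
    print(" ", step)
```

The returned list of steps plays the role of the explanation subsystem: it records how each intermediate conclusion was reached on the way to the final one.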
