Topic 3i - Artificial Neural Networks - Revised 20/03/2020


CSC583 – Artificial Intelligence Algorithms
Topic 3i – Artificial Neural Networks
Faculty of Computer & Mathematical Sciences
Universiti Teknologi MARA
A. Introduction: Objectives
• To understand:
• The similarities between biological neurons and artificial
neurons;
• How an artificial neural network learns;
• The difference between machine learning and deep
learning;
• The applications of machine learning and deep learning.
A. Introduction: Human Intelligence
A. Introduction: Artificial Intelligence
A. Introduction: AI, Machine Learning & Deep
Learning
A. Introduction: How the Brain Works?
• Machine learning: adaptive mechanisms that enable computers to
learn from experience, learn by example and learn by analogy
• Learning capabilities can improve the performance of an intelligent
system over time
• The most popular approaches to machine learning are artificial neural
networks and genetic algorithms
A. Introduction: How the Brain Works? (cont)
• A neural network can be defined as a model of reasoning based on
the human brain.
• The brain consists of a densely interconnected set of nerve cells, or
basic information-processing units, called neurons.
• The human brain incorporates nearly 10 billion neurons and 60 trillion
connections, synapses, between them.
• By using multiple neurons simultaneously, the brain can perform its
functions much faster than the fastest computers in existence today
A. Introduction: How the Brain Works? (cont)
• Biological neurons
A. Introduction: How the Brain Works? (cont)
• Each neuron has a very simple structure
• Soma : cell body
• Dendrites : fibers that accept input from environment
• Axon : single long fiber that sends output to other neurons
A. Introduction: What is Artificial Neural
Network?
• An Artificial Neural Network (ANN) is an information processing
paradigm that is inspired by the way biological nervous systems, such
as the brain, process information.
• The key element of this paradigm is the novel structure of the
information processing system.
• It is composed of a large number of highly interconnected processing
elements (neurons) working in unison to solve specific problems.
• The neurons are connected by weighted links passing signals from
one neuron to another.
A. Introduction: What is Artificial Neural
Network? (cont)
• ANNs, like people, learn by example. An ANN is configured for a
specific application, such as pattern recognition or data classification,
through a learning process.
• Learning in biological systems involves adjustments to the synaptic
connections that exist between the neurons. This is true of ANNs as
well.
A. Introduction: Analogy between Biological and
Artificial Neural Networks
A. Introduction: Architecture of a Typical
Artificial Neural Network
B. Neuron: Simple Computing Element
• A neuron is the basic building element of an ANN
• First, the neuron computes the weighted sum of the input
signals, X:
X = x1w1 + x2w2 + … + xnwn
B. Neuron: Simple Computing Element (cont)
• Next, the neuron compares the result X with a threshold
value, θ, to get the final output Y. Example:
• If the net input X is less than the threshold, the neuron output takes the value -1
• But if the net input X is greater than or equal to the threshold, the neuron
becomes activated and the output takes the value +1
• This is the action of applying a transfer function, and the
example above is called a sign-type transfer function:
Y = +1 if X ≥ θ,   Y = -1 if X < θ
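As an illustration only, a minimal Python sketch of the computation just described (weighted sum followed by a sign-type transfer function); the function name and example values are ours, not from the slides:

```python
def neuron_output(inputs, weights, theta):
    """Single neuron: weighted sum of inputs, then a sign-type transfer function."""
    x = sum(xi * wi for xi, wi in zip(inputs, weights))  # X = x1*w1 + ... + xn*wn
    return +1 if x >= theta else -1                      # compare X with the threshold

# Example: two inputs, weights 0.3 and -0.1, threshold 0.2
print(neuron_output([1, 0], [0.3, -0.1], 0.2))  # X = 0.3 >= 0.2, so the output is +1
```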
B. Neuron: Transfer/Activation Functions
• Many activation functions have been tested, but only a few
have found practical applications
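For concreteness, hedged sketches of four activation functions that commonly appear in this context (step, sign, sigmoid and linear); the original slide's figure is not reproduced here, so this particular set is an assumption:

```python
import math

def step(x, theta=0.0):
    return 1 if x >= theta else 0        # hard limiter: outputs 0 or 1

def sign(x, theta=0.0):
    return 1 if x >= theta else -1       # sign function: outputs -1 or +1

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))    # smooth output in the range (0, 1)

def linear(x):
    return x                             # output equals the weighted input sum
```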
C. Perceptron: Training a Simple ANN
• In 1958, Frank Rosenblatt introduced a training algorithm
that provided the first procedure for training a simple ANN: a
perceptron
• The perceptron is the simplest form of a neural network. It
consists of a single neuron with adjustable weights and a
hard limiter
C. Perceptron: Training a Simple ANN (cont)
• Single layer two-input perceptron
C. Perceptron: Training a Simple ANN (cont)
• The aim of this perceptron with a hard limiter is to classify
inputs, for example x1 and x2, into one of two classes, for example
A1 and A2
• This is done by dividing the input space into two decision regions
by drawing a straight line defined by the function:
x1w1 + x2w2 = θ
C. Perceptron: How Does it Learn?
• This is done by making small adjustments in the weights
• The adjustment is to reduce the difference between actual
and desired outputs of the perceptron
• The initial weights are randomly assigned, usually in the
range [-0.5, 0.5] and then updated to obtain the output
consistent with the training examples.
C. Perceptron: How Does it Learn? –
Perceptron Weight Correction
• If at iteration p, the actual output is Y(p) and the desired
output is Yd(p), then the error is given by:
e(p) = Yd(p) - Y(p) where p = 1,2,3…
• If error e(p) is positive, we must increase perceptron output
Y(p). If error e(p) is negative, we must decrease perceptron
output Y(p).
• Thus the following perceptron learning rule is established:
wi(p+1) = wi(p) + α × xi(p) × e(p)
where α is the learning rate
C. Perceptron: Training Algorithm – Step 1
Step 1: Initialisation
• Set initial weights w1, w2, … wn and threshold θ to random
numbers in the range [-0.5, 0.5]
C. Perceptron: Training Algorithm – Step 2
Step 2: Activation
• Activate the perceptron by applying inputs x1(p), x2(p), …
xn(p) and desired output Yd(p).
• Calculate the actual output at iteration p = 1:
X(p) = x1(p)w1(p) + x2(p)w2(p) + … + xn(p)wn(p),   Y(p) = activation(X(p))
where n is the number of the perceptron inputs and activation is an
activation function (here, a step against the threshold θ)
C. Perceptron: Training Algorithm – Step 3
Step 3: Weight Training
• Update the weights of the perceptron using perceptron
learning rule:
wi(p+1) = wi(p) + α × xi(p) × e(p)
C. Perceptron: Training Algorithm – Step 4
Step 4: Iteration
• Increase iteration p by one, go back to Step 2 and repeat the
process until convergence (minimum error is achieved).
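As a hedged illustration, a minimal Python sketch of Steps 1–4 above for a single-neuron perceptron with a step activation against threshold θ; the function names, the rounding of weights (added only to avoid floating-point drift in hand-traceable examples) and the stopping criterion of one error-free epoch are our assumptions:

```python
import random

def step(x, theta):
    """Step activation: output 1 when the weighted sum reaches the threshold."""
    return 1 if x >= theta else 0

def train_perceptron(data, alpha=0.1, theta=0.2, weights=None, max_epochs=100):
    """data: list of (inputs, desired_output) pairs, e.g. [((0, 0), 0), ((1, 1), 1)]."""
    n = len(data[0][0])
    # Step 1: Initialisation - random weights in [-0.5, 0.5] unless supplied
    if weights is None:
        weights = [random.uniform(-0.5, 0.5) for _ in range(n)]

    for epoch in range(1, max_epochs + 1):
        total_error = 0
        for inputs, desired in data:
            # Step 2: Activation - weighted sum, then threshold
            y = step(sum(w * x for w, x in zip(weights, inputs)), theta)
            # Step 3: Weight training - perceptron learning rule
            error = desired - y
            weights = [round(w + alpha * x * error, 10)
                       for w, x in zip(weights, inputs)]
            total_error += abs(error)
        # Step 4: Iteration - stop once an epoch produces no errors (convergence)
        if total_error == 0:
            return weights, epoch
    return weights, max_epochs
```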
C. Perceptron: Example of Learning: AND
Operation
• Training data for AND Operation (Truth Table):

x1 | x2 | Yd
0  | 0  | 0
0  | 1  | 0
1  | 0  | 0
1  | 1  | 1
C. Perceptron: Example of Learning: AND
Operation – Step 1 (Initialisation)
• Weights and threshold randomly assigned:
w1 = 0.3, w2 = -0.1, θ = 0.2, α = 0.1
C. Perceptron: Example of Learning: AND
Operation – Step 2 (Activation): p=1
• Computes the weighted sums:
• X = x1w1 + x2w2
• X = 0(0.3) + 0(-0.1)
• X = 0
• Apply activation function:
• Y = activation(X)
• Y = 0

w1 = 0.3, w2 = -0.1, θ = 0.2, α = 0.1


C. Perceptron: Example of Learning: AND
Operation – Step 3 (Weight Training): p=1
• Update the weights of the perceptron using the perceptron learning
rule:
• wi(p+1) = wi(p) + α × xi(p) × e(p), where e(p) = Yd(p) - Y(p)

• w1(1+1) = 0.3 + (0.1 × 0 × (0-0))
• w1(2) = 0.3

• w2(1+1) = -0.1 + (0.1 × 0 × (0-0))
• w2(2) = -0.1

Because the error is 0, the weights remain the same


w1 = 0.3, w2 = -0.1, θ = 0.2, α = 0.1
C. Perceptron: Example of Learning: AND
Operation – Step 4 (Iteration): p=1
• Increase iteration p by 1, go back to Step 2 with updated
weights.
C. Perceptron: Example of Learning: AND
Operation – Step 2 (Activation): p=2
• Computes the weighted sums:
• X = x1w1 + x2w2
• X = 0(0.3) + 1(-0.1)
• X = -0.1
• Apply activation function:
• Y = activation(X)
• Y = 0

w1 = 0.3, w2 = -0.1, θ = 0.2, α = 0.1


C. Perceptron: Example of Learning: AND
Operation – Step 3 (Weight Training): p=2
• Update the weights of the perceptron using the perceptron learning
rule:
• wi(p+1) = wi(p) + α × xi(p) × e(p), where e(p) = Yd(p) - Y(p)

• w1(2+1) = 0.3 + (0.1 × 0 × (0-0))
• w1(3) = 0.3

• w2(2+1) = -0.1 + (0.1 × 1 × (0-0))
• w2(3) = -0.1

Because the error is 0, the weights remain the same


w1 = 0.3, w2 = -0.1, θ = 0.2, α = 0.1
C. Perceptron: Example of Learning: AND
Operation – Step 4 (Iteration): p=2
• Increase iteration p by 1, go back to Step 2 with updated
weights.
C. Perceptron: Example of Learning: AND
Operation – Step 2 (Activation): p=3
• Computes the weighted sums:
• X = x1w1 + x2w2
• X = 1(0.3) + 0(-0.1)
• X = 0.3
• Apply activation function:
• Y = activation(X)
• Y = 1

w1 = 0.3, w2 = -0.1, θ = 0.2, α = 0.1


C. Perceptron: Example of Learning: AND
Operation – Step 3 (Weight Training): p=3
• Update the weights of the perceptron using the perceptron learning
rule:
• wi(p+1) = wi(p) + α × xi(p) × e(p), where e(p) = Yd(p) - Y(p)

• w1(3+1) = 0.3 + (0.1 × 1 × (0-1))
• w1(4) = 0.2

• w2(3+1) = -0.1 + (0.1 × 0 × (0-1))
• w2(4) = -0.1

Because the error is no longer zero, the weights are updated


Before: w1 = 0.3, w2 = -0.1, θ = 0.2, α = 0.1
After:  w1 = 0.2, w2 = -0.1, θ = 0.2, α = 0.1
C. Perceptron: Example of Learning: AND
Operation – Step 4 (Iteration): p=3
• Increase iteration p by 1, go back to Step 2 with updated
weights.
C. Perceptron: Example of Learning: AND
Operation – Step 2 (Activation): p=4
• Computes the weighted sums:
• X = x1w1 + x2w2
• X = 1(0.2) + 1(-0.1)
• X = 0.1
• Apply activation function:
• Y = activation(X)
• Y = 0

w1 = 0.2, w2 = -0.1, θ = 0.2, α = 0.1


C. Perceptron: Example of Learning: AND
Operation – Step 3 (Weight Training): p=4
• Update the weights of the perceptron using the perceptron learning
rule:
• wi(p+1) = wi(p) + α × xi(p) × e(p), where e(p) = Yd(p) - Y(p)

• w1(4+1) = 0.2 + (0.1 × 1 × (1-0))
• w1(5) = 0.3

• w2(4+1) = -0.1 + (0.1 × 1 × (1-0))
• w2(5) = 0.0

Because the error is no longer zero, the weights are updated


C. Perceptron: Example of Learning: AND
Operation – Step 4 (Iteration): p=4
• Increase iteration p by 1, go back to Step 2 with updated
weights.
C. Perceptron: Example of Learning: AND
Operation – End of Epoch 1
• We have iterated through all possible inputs for x1 and x2 from p=1
until p=4
• This sequence of four input patterns is called an epoch
• However, there are still errors, and the weights w1 and w2 are not
uniform across the four iterations
• We must continue training the perceptron with the same
data until all the weights converge to a uniform set of values
• Therefore we start Epoch 2
C. Perceptron: Example of Learning: AND
Operation
• Training of the perceptron continues with subsequent
epochs
C. Perceptron: Example of Learning: AND
Operation (cont)
• Training ends at Epoch 5, where the error is 0 for every input
combination in the epoch and the weights w1 and w2 are uniform
throughout
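Using the training-algorithm sketch from Section C (an illustration under the same assumptions, not the slides' own code), the AND example can be reproduced as follows:

```python
# AND truth table and the initial parameters from the slides
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
final_weights, epochs = train_perceptron(and_data, alpha=0.1, theta=0.2,
                                         weights=[0.3, -0.1])
print(final_weights, epochs)  # expected: [0.1, 0.1] after 5 epochs, as in the slides
```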
C. Perceptron: Example of Learning: AND
Operation (cont)
• With the final weight values and the chosen threshold, we
can define the decision boundary of the AND classification graph using
this formula:
x1w1 + x2w2 = θ
• 0.1x1 + 0.1x2 = 0.2
• x1 + x2 = 2
C. Perceptron: Weakness
• Single-layer perceptron can classify only linearly separable
patterns
• XOR operation: cannot be separated by a single line on a
graph
C. Perceptron: Exercise (from Past Year Qs)
• Given the following parameter values for a perceptron to train logical
OR operation,
• Calculate the value of output Y for all iterations. Show your work
including the weight adjustments, if any for Epoch 1 only (10 Marks)
D. Multilayer Neural Networks: Introduction
• A multilayer perceptron is a feedforward neural network
with one or more hidden layers.
• The network consists of an input layer of source neurons, at
least one middle or hidden layer of computational neurons,
and an output layer of computational neurons.
• The input signals are propagated in a forward direction on a
layer-by-layer basis
D. Multilayer Neural Networks: Introduction
(cont)
• Multilayer perceptron with two hidden layers
D. Multilayer Neural Networks: What Does
the Middle Layer Hide?
• A hidden layer ‘hides’ its desired output. Neurons in the
hidden layer cannot be observed through the input/output
behaviour of the network
• There is no obvious way to know what the desired output of
the hidden layer should be.
• In other words, the desired output of the hidden layer is self-
determined by the layer itself
D. Multilayer Neural Networks: Can a Neural
Network include More than Two Hidden Layers?
• Commercial ANNs incorporate three and sometimes four
layers, including one or two hidden layers. Each layer can
contain from 10 to 1000 neurons.
• Experimental neural networks may have five or even six
layers, including three or four hidden layers, and utilise
millions of neurons
• But most practical applications use only three layers,
because each additional layer increases the computational
burden exponentially
D. Multilayer Neural Networks: Question
Example
• Given the following ANN architecture and the parameter values,
apply feed-forward learning algorithm and show the training process
for the first epoch.
• learning rate: 0.5; activation function: Y = 0 if sum ≤ 4, Y = 1 if sum > 4
• training data: input (1, 1) → output 1; input (0, 0) → output 0
D. Multilayer Neural Networks: Question
Example - Solution
• Arrange all information in a table:

P  | x1 | x2 | N1w1 | N1w2 | N2w1 | N2w2 | N3 | N3w | N4 | N4w | YD | Y1
p1 | 1  | 1  | 1    | 2    | 2    | 1    |    | 2   |    | 1   | 1  |
D. Multilayer Neural Networks: Question
Example – Solution (cont)
• Calculate the value of N3:
• N3 = x1(N1w1) + x2(N2w1)
•     = 1(1) + 1(2)
•     = 3, ≤ 4
• ∴ Y = 0, so N3 = 0

P  | x1 | x2 | N1w1 | N1w2 | N2w1 | N2w2 | N3 | N3w | N4 | N4w | YD | Y1
p1 | 1  | 1  | 1    | 2    | 2    | 1    | 0  | 2   |    | 1   | 1  |
D. Multilayer Neural Networks: Question
Example – Solution (cont)
• Calculate the value of N4:
• N4 = x1(N1w2) + x2(N2w2)
•     = 1(2) + 1(1)
•     = 3, ≤ 4
• ∴ Y = 0, so N4 = 0

P  | x1 | x2 | N1w1 | N1w2 | N2w1 | N2w2 | N3 | N3w | N4 | N4w | YD | Y1
p1 | 1  | 1  | 1    | 2    | 2    | 1    | 0  | 2   | 0  | 1   | 1  |
D. Multilayer Neural Networks: Question
Example – Solution (cont)
• Calculate the value of N5 (the output neuron):
• N5 = N3(N3w) + N4(N4w)
•     = 0(2) + 0(1)
•     = 0, ≤ 4
• ∴ Y = 0

P  | x1 | x2 | N1w1 | N1w2 | N2w1 | N2w2 | N3 | N3w | N4 | N4w | YD | Y1
p1 | 1  | 1  | 1    | 2    | 2    | 1    | 0  | 2   | 0  | 1   | 1  | 0
D. Multilayer Neural Networks: Question
Example – Solution (cont)
• Error is 1 - 0 = 1, not zero, so the weights must be adjusted, starting with the weights of N1:
• wi(p+1) = wi(p) + α × xi(p) × e(p)
• w1(2) = 1 + 0.5 × (1 × 1)
• ∴ w1(2) = 1.5
• w2(2) = 2 + 0.5 × (1 × 1)
• ∴ w2(2) = 2.5

P  | x1 | x2 | N1w1 | N1w2 | N2w1 | N2w2 | N3 | N3w | N4 | N4w | YD | Y1
p1 | 1  | 1  | 1    | 2    | 2    | 1    | 0  | 2   | 0  | 1   | 1  | 0
   |    |    | 1.5  | 2.5  |      |      |    |     |    |     |    |
D. Multilayer Neural Networks: Question
Example – Solution (cont)
• Continue with the weights of N2:
• wi(p+1) = wi(p) + α × xi(p) × e(p)
• w1(2) = 2 + 0.5 × (1 × 1)
• ∴ w1(2) = 2.5
• w2(2) = 1 + 0.5 × (1 × 1)
• ∴ w2(2) = 1.5

P  | x1 | x2 | N1w1 | N1w2 | N2w1 | N2w2 | N3 | N3w | N4 | N4w | YD | Y1
p1 | 1  | 1  | 1    | 2    | 2    | 1    | 0  | 2   | 0  | 1   | 1  | 0
   |    |    | 1.5  | 2.5  | 2.5  | 1.5  |    |     |    |     |    |
D. Multilayer Neural Networks: Question
Example – Solution (cont)
• Start iteration 2 with the new inputs and the adjusted weights

P  | x1 | x2 | N1w1 | N1w2 | N2w1 | N2w2 | N3 | N3w | N4 | N4w | YD | Y1
p1 | 1  | 1  | 1    | 2    | 2    | 1    | 0  | 2   | 0  | 1   | 1  | 0
p2 | 0  | 0  | 1.5  | 2.5  | 2.5  | 1.5  |    | 2   |    | 1   | 0  |
D. Multilayer Neural Networks: Question
Example – Solution (cont)
• Calculate the value of N3:
• N3 = x1(N1w1) + x2(N2w1)
•     = 0(1.5) + 0(2.5)
•     = 0, ≤ 4
• ∴ Y = 0, so N3 = 0

P  | x1 | x2 | N1w1 | N1w2 | N2w1 | N2w2 | N3 | N3w | N4 | N4w | YD | Y1
p1 | 1  | 1  | 1    | 2    | 2    | 1    | 0  | 2   | 0  | 1   | 1  | 0
p2 | 0  | 0  | 1.5  | 2.5  | 2.5  | 1.5  | 0  | 2   |    | 1   | 0  |
D. Multilayer Neural Networks: Question
Example – Solution (cont)
• Calculate the value of N4:
• N4 = x1(N1w2) + x2(N2w2)
•     = 0(2.5) + 0(1.5)
•     = 0, ≤ 4
• ∴ Y = 0, so N4 = 0

P  | x1 | x2 | N1w1 | N1w2 | N2w1 | N2w2 | N3 | N3w | N4 | N4w | YD | Y1
p1 | 1  | 1  | 1    | 2    | 2    | 1    | 0  | 2   | 0  | 1   | 1  | 0
p2 | 0  | 0  | 1.5  | 2.5  | 2.5  | 1.5  | 0  | 2   | 0  | 1   | 0  |
D. Multilayer Neural Networks: Question
Example – Solution (cont)
• Calculate the value of N5 (the output neuron):
• N5 = N3(N3w) + N4(N4w)
•     = 0(2) + 0(1)
•     = 0, ≤ 4
• ∴ Y = 0
• Error is 0 - 0 = 0, so no adjustment to the weights.
• Finish Epoch 1

P  | x1 | x2 | N1w1 | N1w2 | N2w1 | N2w2 | N3 | N3w | N4 | N4w | YD | Y1
p1 | 1  | 1  | 1    | 2    | 2    | 1    | 0  | 2   | 0  | 1   | 1  | 0
p2 | 0  | 0  | 1.5  | 2.5  | 2.5  | 1.5  | 0  | 2   | 0  | 1   | 0  | 0
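As an illustration, a minimal Python sketch of the simplified feed-forward training traced above (only the input-layer weights N1w1, N1w2, N2w1, N2w2 are adjusted, exactly as in the worked solution; this is not full backpropagation, and the variable names are ours):

```python
def activation(s):
    """Step function from the question: Y = 0 if sum <= 4, Y = 1 if sum > 4."""
    return 1 if s > 4 else 0

# Weights as given: N1, N2 are input neurons; N3, N4 hidden; N5 the output neuron
n1 = [1.0, 2.0]      # N1w1 (to N3), N1w2 (to N4)
n2 = [2.0, 1.0]      # N2w1 (to N3), N2w2 (to N4)
n3w, n4w = 2.0, 1.0  # hidden-to-output weights (left unchanged in the slides' trace)
alpha = 0.5

training_data = [((1, 1), 1), ((0, 0), 0)]   # Epoch 1: p1 and p2

for (x1, x2), yd in training_data:
    # Feed-forward pass through the hidden and output layers
    n3 = activation(x1 * n1[0] + x2 * n2[0])
    n4 = activation(x1 * n1[1] + x2 * n2[1])
    y = activation(n3 * n3w + n4 * n4w)
    # Simplified weight correction: adjust input-layer weights with the output error
    e = yd - y
    n1 = [w + alpha * x1 * e for w in n1]
    n2 = [w + alpha * x2 * e for w in n2]
    print((x1, x2), "Y =", y, "e =", e, "N1 =", n1, "N2 =", n2)
    # p1 gives Y = 0, e = 1, N1 = [1.5, 2.5], N2 = [2.5, 1.5]; p2 gives e = 0
```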
D. Multilayer Neural Networks: Exercise
• Train the following ANN using feed-forward algorithm where the
learning rate is 0.2 and step function against a threshold of 0.6 is
applied in the hidden and output layers. Show the results of the
training at Epoch 1 only. The training data is as follows:
D. Multilayer Neural Networks: Advantages &
Disadvantages
• Advantage:
• A neural network can perform tasks that a linear program cannot.
• Disadvantage:
• The neural network requires training to operate.
E. Classification of Learning Methods
• Supervised Learning
• Unsupervised Learning
E. Classification of Learning Methods:
Supervised Learning
• Supervised learning incorporates an external teacher, so
that each output unit is told what its desired response to
input signals ought to be.
• During the learning process global information may be
required.
• Paradigms of supervised learning include error-correction
learning, reinforcement learning and stochastic learning.
E. Classification of Learning Methods:
Supervised Learning (cont)
• An important issue concerning supervised learning is the
problem of error convergence, i.e. the minimisation of the error
between the desired and computed unit values.
• The aim is to determine a set of weights which minimises
the error.
E. Classification of Learning Methods:
Unsupervised Learning
• Unsupervised learning uses no external teacher and is based
upon only local information.
• identifies commonalities in the data and reacts based on the presence or
absence of such commonalities in each new piece of data.
• It is also referred to as self-organisation, in the sense that it
self-organises data presented to the network and detects
their emergent collective properties.
• Paradigms of unsupervised learning are Hebbian learning
and competitive learning.
E. Classification of Learning Methods:
Example of Supervised Learning
• Assume that we want a network to recognise hand-written digits. We might use an
array of, say, 256 sensors, each recording the presence or absence of ink in a small area
of a single digit. The network would therefore need 256 input units (one for each
sensor), 10 output units (one for each kind of digit) and a number of hidden units.
• For each kind of digit recorded by the sensors, the network should produce high activity
in the appropriate output unit and low activity in the other output units.
• To train the network, we present an image of a digit and compare the actual activity of
the 10 output units with the desired activity. We then calculate the error, which is
defined as the square of the difference between the actual and the desired activities.
Next we change the weight of each connection so as to reduce the error. We repeat
this training process for many different images of each kind of digit until the network
classifies every image correctly.
• To implement this procedure we need to calculate the error derivative for the weight
(EW) in order to change the weight by an amount that is proportional to the rate at
which the error changes as the weight is changed. One way to calculate the EW is to
perturb a weight slightly and observe how the error changes
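The perturbation idea in the last bullet can be sketched as a finite-difference estimate of EW; this snippet is illustrative only, and error_fn, the helper name and the example error function are our assumptions, not part of the slides:

```python
def numerical_ew(error_fn, weights, index, delta=1e-4):
    """Estimate dE/dw for one weight by perturbing it slightly (central difference)."""
    up = list(weights);   up[index] += delta
    down = list(weights); down[index] -= delta
    return (error_fn(up) - error_fn(down)) / (2 * delta)

# Example: a one-weight 'network' with squared error E(w) = (0.5 - w*1.0)**2
ew = numerical_ew(lambda w: (0.5 - w[0] * 1.0) ** 2, [0.2], 0)
print(ew)  # approximately -0.6: increasing this weight would reduce the error
```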
F. Applications of Neural Networks
• Neural networks in medicine
• modelling parts of the human body and recognising diseases from
various scans (e.g. cardiograms, CAT scans, ultrasonic scans, etc.).
• Neural networks are ideal for recognising diseases from scans,
since there is no need to provide a specific algorithm for how to
identify the disease. Neural networks learn by example, so the
details of how to recognise the disease are not needed. What is
needed is a set of examples that are representative of all the
variations of the disease.
F. Applications of Neural Networks (cont)
• Neural networks in business
• Credit Evaluation and mortgage screening (the data related to
property and borrower qualifications)
G. Review Questions
• What is machine learning?
• What is an artificial neural network?
• How does learning occur in a neural network?
• What are some examples of transfer or activation functions?
• What is the weakness of a single-layer perceptron?
• What are the common applications of neural networks?
