
Artificial Neural Network

What is a Neural Network?

1. A neural network is a computational model that works the way neurons work in the human brain.

2. It is also known as an ANN (Artificial Neural Network).

3. It mimics the working mechanism of the human brain.

4. In a neural network, the machine can learn, recognize patterns, and make decisions like a human being.
Biological prototype of a neuron

Neural Network Architectures

Network Architecture Types
There are three basic types of neuron connection architectures:

➢ Single-layer feed-forward network

➢ Multi-layer feed-forward network

➢ Recurrent neural network

Single-layer feed-forward network

This network is called a single-layer network, where "single layer" refers to the output layer of computation nodes (neurons).

There is only one computational layer, so it is a single-layer architecture.

The input layer only receives signals from the external world; it does not perform any processing.
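As a minimal sketch (the dimensions and weights below are illustrative assumptions, not from the slides), the whole network reduces to one weighted sum plus activation per output neuron:

```python
import numpy as np

def step(z):
    # Threshold activation: 1 if the weighted sum is non-negative, else 0
    return (z >= 0).astype(int)

# Hypothetical sizes: 3 inputs, 2 output neurons (the single computational layer)
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 3))   # one weight row per output neuron
b = np.zeros(2)               # biases

x = np.array([1.0, 0.5, -0.2])   # signal from the input layer (no processing there)
y = step(W @ x + b)              # the only computation happens in the output layer
print(y)
```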
Multi-layer feed-forward network

Recurrent neural network

These networks differ from feed-forward networks in that they contain at least one feedback loop. They can be single-layer or multi-layer networks.
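As an illustrative sketch (sizes and weights are assumptions, not from the slides), the defining feedback loop can be written as a hidden state that re-enters the computation at every time step:

```python
import numpy as np

# Hypothetical sizes: 2 inputs, 3 recurrent hidden units
rng = np.random.default_rng(0)
W_in = rng.normal(size=(3, 2))    # input-to-hidden weights
W_rec = rng.normal(size=(3, 3))   # hidden-to-hidden feedback weights

h = np.zeros(3)                   # recurrent state
for x in [np.array([1.0, 0.0]), np.array([0.0, 1.0])]:
    # The feedback loop: the previous state h feeds back into the update
    h = np.tanh(W_in @ x + W_rec @ h)
print(h)
```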
Learning Rules
➢ Error-correction learning

➢ Memory-based learning

➢ Hebbian learning

➢ Competitive learning

➢ Boltzmann learning

Error-correction learning

Memory-based learning

Hebbian Learning

Competitive Learning

Rosenblatt Perceptron Model
1. The Rosenblatt perceptron model was designed by Rosenblatt in 1958 to overcome the limitations of the McCulloch-Pitts neuron model.

✓ It can process non-Boolean inputs, and it assigns a different weight to each input automatically.

2. It is a single-layer network.

3. The Rosenblatt perceptron can be seen as a set of inputs that are weighted and to which an activation function is applied.
4. The inputs can be seen as neurons and are called the input layer.

5. These neurons (the input layer) and the activation function together form a perceptron.

6. This model implements the functioning of a single neuron, which can solve linear classification problems through a very simple learning algorithm (sketched below).

7. Rosenblatt perceptrons are called the first generation of neural networks.

8. The main limitation of this model is that it cannot solve non-linear problems.
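As a hedged sketch of the very simple learning algorithm mentioned in point 6 (the classic perceptron update rule; the toy data and learning rate are illustrative assumptions):

```python
import numpy as np

def train_perceptron(X, d, lr=0.1, epochs=20):
    """Classic perceptron rule: w <- w + lr * (d - y) * x, updating only on mistakes."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, d):
            y = 1 if w @ x + b >= 0 else 0   # step activation
            w += lr * (target - y) * x
            b += lr * (target - y)
    return w, b

# Linearly separable toy data: logical AND
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, d)
print([1 if w @ x + b >= 0 else 0 for x in X])   # [0, 0, 0, 1]
```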
Perceptron Convergence theorem

Activation Function

Multilayer Perceptron

The Back-Propagation Algorithm

Gradient Descent
Gradient descent is one of the most commonly used iterative optimization algorithms in machine learning, used to train machine learning and deep learning models. It helps find a local minimum of a function.

This entire procedure is known as gradient descent, which is also known as steepest descent. The main objective of the gradient descent algorithm is to minimize the cost function through iteration. To achieve this goal, it performs two steps iteratively:

Calculate the first-order derivative of the function to compute the gradient (slope) of the function at the current point.

Move in the direction opposite to the gradient, i.e., away from the direction in which the function increases, stepping from the current point by alpha times the gradient, where alpha is the learning rate. The learning rate is a tuning parameter in the optimization process that helps decide the length of the steps.
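A minimal sketch of these two steps on a simple quadratic cost (the function and learning rate are illustrative assumptions):

```python
# Gradient descent on f(w) = (w - 3)^2, whose minimum is at w = 3
def grad(w):
    return 2 * (w - 3)   # step 1: first-order derivative of the cost

w = 0.0        # starting point
alpha = 0.1    # learning rate: controls the step length
for _ in range(100):
    w = w - alpha * grad(w)   # step 2: move against the gradient
print(w)   # approaches 3.0
```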
XOR PROBLEM

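As an illustrative sketch (my construction, not recovered from the slides): XOR is the classic problem a single perceptron cannot solve, because no single line separates its classes, while one hidden layer of threshold units makes it solvable:

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
xor = np.array([0, 1, 1, 0])

# No single line w.x + b can separate {01, 10} from {00, 11}.
# One hidden layer of two threshold units makes the problem separable:
def step(z):
    return (z >= 0).astype(int)

h = step(X @ np.array([[1, -1], [-1, 1]]).T - 0.5)  # hidden: x1 AND NOT x2, x2 AND NOT x1
y = step(h @ np.array([1, 1]) - 0.5)                # output: OR of the hidden units
print(y.tolist() == xor.tolist())                   # True
```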
Batch Learning
In batch learning, the model cannot learn continuously from data once it has been trained. The model has to be trained once on the complete dataset, which may take longer and also require more computing resources.

If we then get some new data, how can we add it to this model? We have to train the model again from scratch using the whole dataset (old data + new data).

That again costs more time and computing resources. We can solve this problem using algorithms that are capable of learning continuously. This is called online learning.
Online Learning

In online learning, the model can keep learning: we feed the dataset in small groups, also known as mini-batches, without having to train the model all at once on the complete dataset. Alternatively, we can train the model using individual data points from the whole dataset.

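A hedged sketch of the idea (the model, data stream, and learning rate are assumptions): a simple linear model updated one mini-batch at a time, never revisiting old data:

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(2)   # linear model trained online

def sgd_update(w, X_batch, y_batch, lr=0.01):
    """One online step: update the weights from a mini-batch only."""
    err = X_batch @ w - y_batch
    return w - lr * X_batch.T @ err / len(y_batch)

# Data arrives in mini-batches; old batches are never stored or revisited
for _ in range(2000):
    X_batch = rng.normal(size=(8, 2))
    y_batch = X_batch @ np.array([2.0, -1.0])   # hidden true weights
    w = sgd_update(w, X_batch, y_batch)
print(w)   # moves toward [2, -1] without ever holding the full dataset
```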
Cover’s Theorem
Cover's theorem states that a complex pattern-classification problem, cast in a nonlinear high-dimensional space, is more likely to be linearly separable than in a low-dimensional space.

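An illustrative sketch (the data and the map x -> (x, x^2) are my assumptions, not from the slides): a one-dimensional problem whose classes interleave becomes linearly separable after a nonlinear cast into two dimensions:

```python
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
label = np.array([1, 0, 0, 0, 1])   # outer vs. inner points: not separable on the line

# Nonlinear cast into a higher dimension: x -> (x, x^2)
phi = np.stack([x, x**2], axis=1)

# In the lifted space, the classes are split by the line x2 = 2.5
pred = (phi[:, 1] > 2.5).astype(int)
print((pred == label).all())   # True: linearly separable after the cast
```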
Radial Basis Function Neural Network
The idea of Radial Basis Function (RBF) networks derives from the theory of function approximation. We have already seen how Multi-Layer Perceptron (MLP) networks with a hidden layer of sigmoidal units can learn to approximate functions. RBF networks take a slightly different approach. Their main features are:

Basic form of an RBF network:

Input layer: source nodes connecting the network to the environment.

Hidden layer: provides a set of functions that form a basis for mapping the inputs into the hidden space.

Output layer: supplies the network's response.

P is the dimensionality of the input feature space, and M is the dimensionality of the transformed feature space on which we have imposed our RBFs.

Training for this kind of network comprises two phases:
➢ Training the hidden layer, which comprises M RBF functions; the parameters to be determined for each RBF are the receptor position t and, in the case of a Gaussian RBF, the spread sigma.
➢ Training the weight vectors Wij of the output layer.

Training the hidden layer:

There are different approaches to training the hidden layer. Let us assume for now that we are dealing with Gaussian RBFs, so we need to determine the receptors t and the spread, i.e., sigma. One approach is to randomly select M receptors from the N sample feature vectors, but this does not seem principled, so we can instead use a clustering mechanism to determine the receptors ti.

Since we have M nodes in the hidden layer and N samples, for clustering to work here we need N > M.
Calculation of receptors:

Consider an example where M = 3, so we need to determine three t's. Initially we divide the feature vector space into three arbitrary clusters and take their means as the initial receptors. Then we iterate over every sample feature vector and perform the steps below:
➢ a) For the selected input feature vector x, determine its distances to the means (t1, t2, t3) of the three clusters; x is assigned to the cluster whose mean is nearest.
➢ b) After x has been assigned to a cluster, the means (t1, t2, t3) are recomputed.
➢ c) Perform steps (a) and (b) for all sample points.
➢ Once the iteration finishes, we obtain the final t1, t2, and t3.
Calculation of sigma:

Once the receptors are calculated, we can use a K-nearest-neighbour approach to calculate sigma; we need to select the value of P.
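The slide's exact formula was shown in a figure, so the sketch below substitutes one common P-nearest-neighbour heuristic as an assumption: set each sigma_i to the root-mean-square distance from receptor t_i to its P nearest fellow receptors:

```python
import numpy as np

def rbf_sigmas(t, P=2):
    """Assumed heuristic: sigma_i = RMS distance from receptor t_i
    to its P nearest other receptors (not necessarily the slide's formula)."""
    d = np.linalg.norm(t[:, None, :] - t[None, :, :], axis=2)  # pairwise distances
    np.fill_diagonal(d, np.inf)                                # ignore self-distance
    nearest = np.sort(d, axis=1)[:, :P]                        # P nearest receptors
    return np.sqrt((nearest**2).mean(axis=1))

t = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])  # hypothetical receptors
print(rbf_sigmas(t, P=2))
```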

Training the weight vectors:

Let us assume the dimensionality of the hidden layer is M and the sample size is N. We can then calculate the optimal weight vectors for the network using the pseudo-inverse matrix solution.

Every component of dk will be either 1 or 0: it equals 1 if the corresponding input vector belongs to class k, and 0 otherwise.
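A minimal sketch of the pseudo-inverse solution (the matrix names Phi and D are my notation, an assumption): with Phi the N x M matrix of hidden-layer outputs and D the N x K matrix whose rows are the target vectors dk, the least-squares weights are W = pinv(Phi) D:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: N = 6 samples, M = 3 hidden RBF outputs, K = 2 classes
Phi = rng.random((6, 3))              # hidden-layer activations, one row per sample
D = np.eye(2)[[0, 0, 1, 1, 0, 1]]     # one-hot targets: d_k = 1 iff sample is in class k

W = np.linalg.pinv(Phi) @ D           # pseudo-inverse (least-squares) weight solution
print(np.argmax(Phi @ W, axis=1))     # predicted class per sample
```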
K-means clustering
➢ K-means clustering is an unsupervised learning algorithm. There is no labeled data for this clustering, unlike in supervised learning. K-means divides objects into clusters that share similarities and are dissimilar to the objects belonging to other clusters.

➢ The term 'K' is a number: you need to tell the system how many clusters to create. For example, K = 2 refers to two clusters. There are ways of finding the best or optimum value of K for given data.

For a better understanding of k-means, let's take an example from cricket. Imagine you received data on many cricket players from all over the world, giving the runs scored by each player and the wickets taken by them in the last ten matches. Based on this information, we need to group the data into two clusters, namely batsmen and bowlers.

Let's take a look at the steps to create these clusters (a code sketch follows below).
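A hedged sketch of these steps (the player statistics below are made up for illustration; the features are runs scored and wickets taken):

```python
import numpy as np

def kmeans(X, k=2, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]   # arbitrary initial means
    for _ in range(iters):
        # Step a: assign each point to the nearest center
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
        # Step b: recompute each mean from its assigned points
        # (sketch assumes no cluster goes empty)
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers

# Made-up player stats: [runs scored, wickets taken] over the last ten matches
X = np.array([[400, 1], [380, 0], [350, 2], [60, 18], [45, 22], [80, 15]], dtype=float)
labels, centers = kmeans(X, k=2)
print(labels)    # one cluster of batsmen, one of bowlers
```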
Hybrid Learning Procedure for RBF Networks

Self-Organizing Map

• Introduced by Prof. Teuvo Kohonen in 1982
• Also known as the Kohonen feature map
• An unsupervised neural network
• A clustering tool for high-dimensional and complex data
• Maintains the topology of the dataset
• Training occurs via competition between the neurons
• It is impossible to assign network nodes to specific input classes in advance
• Can be used for detecting similarity and degrees of similarity
• It is assumed that input patterns fall into sufficiently large distinct groupings
• Weight vectors are initialized randomly
Terminology used
• Clustering
• Unsupervised learning
• Euclidean distance: for points $p = (p_1, p_2, \ldots, p_n)$ and $q = (q_1, q_2, \ldots, q_n)$,

$$ED = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$$
