
ARTIFICIAL NEURAL NETWORKS
Syllabus

UNIT - II
Artificial Neural Networks-1– Introduction, neural network representation,
appropriate problems for neural network learning, perceptrons, multilayer networks
and the backpropagation algorithm.
Artificial Neural Networks-2- Remarks on the Back-Propagation algorithm, An
illustrative example: face recognition, advanced topics in artificial neural networks.
Evaluating Hypotheses – Motivation, estimating hypothesis accuracy, basics of
sampling theory, a general approach for deriving confidence intervals, difference in
error of two hypotheses, comparing learning algorithms.

TEXT BOOK:
1. Machine Learning – Tom M. Mitchell, MGH
Introduction
ANN was first introduced in 1943 by the neurophysiologist Warren McCulloch and the
mathematician Walter Pitts.

Artificial neural networks (ANNs) provide a general, practical method for learning real-
valued, discrete-valued, and vector-valued functions from examples.

An artificial neural network (ANN) is a computing system designed to simulate the way the human brain analyzes and processes information.

Dr. Robert Hecht-Nielsen defines a neural network as:

"...a computing system made up of a number of simple, highly interconnected processing


elements, which process information by their dynamic state response to external
inputs.”
From Biological to Artificial Neurons:
Let's take a quick look at a biological neuron (represented in Figure 1). It is an unusual-looking cell mostly found in animal cerebral cortexes (e.g., your brain), composed of a cell body containing the nucleus and most of the cell's complex components, many branching extensions called dendrites, plus one very long extension called the axon. The axon's length may be just a few times longer than the cell body, or up to tens of thousands of times longer. Near its extremity the axon splits off into many branches called telodendria, and at the tip of these branches are minuscule structures called synaptic terminals (or simply synapses), which are connected to the dendrites (or directly to the cell body) of other neurons.

Figure 1: Biological neuron


Biological neurons receive short electrical impulses called signals from other neurons via
these synapses. When a neuron receives a sufficient number of signals from other
neurons within a few milliseconds, it fires its own signals.
Processing in a biological neuron:
 The neuron collects signals from its dendrites.
 It sends out spikes of electrical activity through an axon, which splits into thousands of branches.
 At the end of each branch, a synapse converts the activity into either excitatory or inhibitory activity of a dendrite at another neuron.
 The neuron fires when excitatory activity surpasses inhibitory activity.
 Learning changes the effectiveness of the synapses.
Biological neuron: Abstract neuron model:
ANN – APPROPRIATE FOR THE FOLLOWING PROBLEMS
 Instances are represented by many attributes.
 The target function output may be discrete-valued, real-valued, or a vector of values.
 The training examples may contain errors.
 Fast evaluation of the learned target function may be required.
 The ability of humans to understand the learned target function is not important.
APPROPRIATE PROBLEMS FOR NEURAL
NETWORK LEARNING:

Figure 4.1: Neural network learning to steer an autonomous vehicle. The ALVINN system uses BACKPROPAGATION to learn to steer an autonomous vehicle (photo at top) driving at speeds up to 70 miles per hour.
o = actual output (the output the function generates)
t = target output
If the actual output equals the target output, the weights are left unchanged; otherwise, the weights are changed.
An overview of backpropagation:
In machine learning, backpropagation is a widely used algorithm for training feedforward
neural networks. Generalizations of backpropagation exist for other artificial neural
networks, and for functions generally. These classes of algorithms are all referred to
generically as "backpropagation".
Representational Power of Perceptrons:
Representational power refers to the ability of a neural network to assign proper labels to a particular instance and create well-defined, accurate decision boundaries for that class. A two-input perceptron with inputs x1 and x2 represents a linear decision surface in the (x1, x2) plane. However, some boolean functions cannot be represented by a single perceptron, such as the XOR function.
• A single perceptron can represent many boolean functions.
• If we encode true as 1 and false as -1, then one way to implement the AND function is to set the weights w0 = -0.8 and w1 = w2 = 0.5 (see the sketch below).
• A perceptron can represent AND, OR, NAND, and NOR, but not XOR!
• Every boolean function can be represented by some network of perceptrons only two levels deep.
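As a minimal sketch (not from the text), the Python below implements a two-input threshold perceptron over {-1, +1} inputs. The AND weights w0 = -0.8, w1 = w2 = 0.5 follow the example above; the OR weights in the comment are an illustrative assumption.

```python
# Two-input threshold perceptron over {-1, +1} inputs.
# Weights (w0=-0.8, w1=w2=0.5) implement AND; swapping w0 to 0.3
# (an illustrative choice) implements OR. No choice of (w0, w1, w2)
# reproduces XOR, since XOR is not linearly separable.

def perceptron(x1, x2, w0, w1, w2):
    """Output +1 if w0 + w1*x1 + w2*x2 > 0, else -1."""
    return 1 if w0 + w1 * x1 + w2 * x2 > 0 else -1

for x1 in (-1, 1):
    for x2 in (-1, 1):
        print(x1, x2, perceptron(x1, x2, -0.8, 0.5, 0.5))  # AND truth table
```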
ANN: BACKPROPAGATION
Multilayer Neural Networks
 Single perceptrons can only express linear decision surfaces. In contrast, the kind of multilayer networks learned by the Backpropagation algorithm are capable of expressing a rich variety of nonlinear decision surfaces.
What is backpropagation?
 Backpropagation is the essence of neural network training. It is the method of fine-tuning the weights of a neural network based on the error rate obtained in the previous epoch (i.e., iteration). Proper tuning of the weights reduces error rates and makes the model reliable by increasing its generalization.
 Backpropagation is short for "backward propagation of errors." It is a standard method of training artificial neural networks. This method helps calculate the gradient of a loss function with respect to all the weights in the network.
The Perceptron Training Rule:
According to this rule, the weight vector is initially assigned random weights; the perceptron is then applied iteratively to each training example, modifying the perceptron weights whenever it misclassifies an example. This process is repeated, iterating through the training examples as many times as needed, until the perceptron classifies all training examples correctly. Weights are modified at each step according to the perceptron training rule, which revises the weight w_i associated with input x_i according to the rule

w_i ← w_i + Δw_i, where Δw_i = η(t − o)x_i

t = target output for the current training example,
o = output generated by the perceptron, and
η = learning rate

Example: Suppose the training example is already correctly classified by the perceptron. Then (t − o) is zero, making Δw_i zero, so no weights are updated.

Suppose instead the perceptron outputs -1 when the target output is +1. To make the perceptron output +1 instead of -1 in this case, the weights must be altered to increase the value of w · x.

For example, if x_i > 0, then increasing w_i will bring the perceptron closer to correctly classifying this example. A runnable sketch of this rule follows.
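As a minimal illustration (not from the text), the Python below applies the perceptron training rule to the boolean OR function over {-1, +1} inputs; the learning rate and weight-initialization range are arbitrary choices.

```python
import random

# Perceptron training rule: w_i <- w_i + eta*(t - o)*x_i.
# The OR dataset below is an illustrative assumption; any linearly
# separable data would converge by the perceptron convergence theorem.

def output(w, x):
    # x includes a leading 1 so that w[0] acts as the threshold weight.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1

data = [((1, -1, -1), -1), ((1, -1, 1), 1), ((1, 1, -1), 1), ((1, 1, 1), 1)]
eta = 0.1
w = [random.uniform(-0.5, 0.5) for _ in range(3)]

converged = False
while not converged:
    converged = True
    for x, t in data:
        o = output(w, x)
        if o != t:  # misclassified: adjust each weight toward the target
            converged = False
            w = [wi + eta * (t - o) * xi for wi, xi in zip(w, x)]

print(w)  # a weight vector that classifies all four examples correctly
```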
 WHAT IS GRADIENT DESCENT?
 Gradient descent is an optimization algorithm for finding a local minimum of a differentiable function. It is used to find the values of a function's parameters (coefficients) that minimize a cost function, as in the sketch below.
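As an illustration (not from the text), here is gradient descent applied to the one-dimensional function f(w) = (w − 3)², whose gradient is f′(w) = 2(w − 3); the learning rate and iteration count are arbitrary choices.

```python
# Gradient descent on f(w) = (w - 3)**2, gradient f'(w) = 2*(w - 3).
# Starting point, learning rate, and iteration count are illustrative.

w = 0.0      # arbitrary starting point
eta = 0.1    # learning rate
for _ in range(100):
    grad = 2 * (w - 3)
    w -= eta * grad       # step in the direction of steepest descent

print(w)  # approaches 3.0, the minimizer of f
```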
Multilayer Networks and the
Backpropagation Algorithm:
Single perceptrons can only express linear decision surfaces. In contrast, the kind of multilayer networks learned by the Backpropagation algorithm are capable of expressing a rich variety of nonlinear decision surfaces.
FIGURE 4.6: The sigmoid threshold unit.
 The sigmoid unit computes its output o as o = σ(w · x), where σ(y) = 1 / (1 + e^(−y)). The sigmoid function, often called the logistic function or squashing function, has output range (0, 1). It increases monotonically with its input. The sigmoid function has the useful property that its derivative is easily expressed in terms of its output. In particular,
 dσ(y)/dy = σ(y) · (1 − σ(y))
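A quick numerical check of this derivative identity, as an illustrative sketch (not from the text), using a central finite difference:

```python
import math

# Verifies numerically that d(sigma)/dy = sigma(y) * (1 - sigma(y)).

def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))

y, h = 0.7, 1e-6
numeric = (sigmoid(y + h) - sigmoid(y - h)) / (2 * h)  # central difference
analytic = sigmoid(y) * (1 - sigmoid(y))
print(numeric, analytic)  # the two values agree closely
```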
NEED FOR ACTIVATION FUNCTIONS
 ANNs use activation functions (AFs) to perform complex computations in the hidden layers and then transfer the result to the output layer. The primary purpose of AFs is to introduce non-linear properties into the neural network.

 If activation functions are not applied, the output signal would be a linear function of the input, i.e., a polynomial of degree one. Linear equations are easy to solve, but they have limited complexity and hence less power to learn complex functional mappings from data. Thus, without AFs, a neural network would be a linear regression model with limited abilities, as the sketch below demonstrates.
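A small NumPy sketch (an illustration, not from the text) of this point: composing two linear layers without activations collapses into a single linear layer, so depth adds no expressive power. The matrix shapes are arbitrary.

```python
import numpy as np

# Two linear layers with no activation function between them are
# equivalent to one linear layer with weight matrix W2 @ W1.

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # first "layer": 3 inputs -> 4 hidden units
W2 = rng.normal(size=(2, 4))   # second "layer": 4 hidden -> 2 outputs
x = rng.normal(size=3)

two_layers = W2 @ (W1 @ x)     # "deep" network without activations
one_layer = (W2 @ W1) @ x      # equivalent single linear map
print(np.allclose(two_layers, one_layer))  # True
```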
Sigmoid Function

Rectified Linear Unit (ReLU) Function

Hyperbolic Tangent Function (Tanh)

Softmax Function
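As one possible sketch, straightforward NumPy versions of these four activation functions (the definitions follow their standard formulas; the test input is arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

def tanh(z):
    return np.tanh(z)

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

z = np.array([-1.0, 0.0, 2.0])
print(sigmoid(z), relu(z), tanh(z), softmax(z), sep="\n")
```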
The Backpropagation Algorithm:
The Backpropagation algorithm:
• learns the weights for a multilayer network, given a network with a fixed set of units and interconnections;
• employs gradient descent to attempt to minimize the squared error between the network output values and the target values for these outputs.
We can write E(w) = (1/2) Σ_{d∈D} Σ_{k∈outputs} (t_kd − o_kd)²
where outputs = the set of output units in the network, and
t_kd and o_kd are the target and output values associated with the k-th output unit and training example d.
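To make the formula concrete, here is a short sketch computing E for a hypothetical dataset; the target and output values are invented purely for illustration.

```python
# E(w) = (1/2) * sum over examples d and output units k of (t_kd - o_kd)^2.
# Two training examples, two output units; all numbers are made up.

targets = [[1.0, 0.0], [0.0, 1.0]]   # t_kd
outputs = [[0.9, 0.2], [0.3, 0.8]]   # o_kd produced by the network

E = 0.5 * sum((t - o) ** 2
              for t_d, o_d in zip(targets, outputs)
              for t, o in zip(t_d, o_d))
print(E)  # 0.5 * (0.01 + 0.04 + 0.09 + 0.04) = 0.09
```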

The algorithm can be decomposed into the following five steps:
1. Initialization of weights
2. Feed-forward computation
3. Backpropagation to the output layer
4. Backpropagation to the hidden layer
5. Weight updates
A compact sketch of these steps is given below.
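Here is a compact Python sketch of these five steps for a two-layer sigmoid network trained on XOR. The hidden-layer size, learning rate, epoch count, and random seed are illustrative assumptions, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Step 1: initialization of weights (small random values).
W1 = rng.normal(scale=0.5, size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(scale=0.5, size=(4, 1)); b2 = np.zeros(1)

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])   # XOR inputs
T = np.array([[0.], [1.], [1.], [0.]])                   # XOR targets
eta = 0.5

for _ in range(20000):
    # Step 2: feed-forward computation.
    h = sigmoid(X @ W1 + b1)       # hidden-layer activations
    o = sigmoid(h @ W2 + b2)       # network outputs

    # Step 3: backpropagation to the output layer.
    delta_o = (o - T) * o * (1 - o)        # uses do/dnet = o(1 - o)

    # Step 4: backpropagation to the hidden layer.
    delta_h = (delta_o @ W2.T) * h * (1 - h)

    # Step 5: weight updates (gradient descent on the squared error E).
    W2 -= eta * h.T @ delta_o; b2 -= eta * delta_o.sum(axis=0)
    W1 -= eta * X.T @ delta_h; b1 -= eta * delta_h.sum(axis=0)

print(np.round(o, 2))  # outputs should approach [[0], [1], [1], [0]]
```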
Remarks: gradient descent over the interconnected neurons of a multilayer network is only guaranteed to converge toward a local minimum of the error (the best point inside a neighborhood of weight space), not necessarily the global minimum (the best point over the entire space).
An illustrative example: face recognition
Face Recognition Software & Applications

 Deep Vision AI
 SenseTime
 Amazon Rekognition
 FaceFirst
 Trueface
 Cognitec
Applications: hospital security, airline industry
Advanced topics in artificial neural networks
Evaluating Hypotheses
Comparing Learning Algorithms.
We have many algorithms in machine learning, and there is often confusion about which one to choose. Some factors to consider:
1. Time complexity
2. Space complexity
3. Sample complexity
4. Unbiased data
5. Online vs. offline algorithms
6. Parallelizability
7. Parametricity – parametric models have a fixed number of parameters; non-parametric models grow in size with the data
THANK YOU
