
Lecture 01: Introduction to Neural Networks
Artificial Intelligence vs. Machine Learning vs. Deep Learning and Neural Networks

What is AI?
The idea of artificial intelligence came into existence, and the term "AI" was first coined, in 1956 by John McCarthy.
"Artificial Intelligence is a technique which enables machines to mimic human behavior."
Artificial Intelligence vs. Machine Learning vs. Deep Learning and Neural Networks

What is Learning?
• "Learning is any process by which a system improves performance from experience." – Arthur Samuel
• "Learning denotes changes in a system that ... enable a system to do the same task more efficiently the next time." – Herbert Simon
• "Learning is constructing or modifying representations of what is being experienced." – Ryszard Michalski
• "Learning is making useful changes in our minds." – Marvin Minsky
Artificial Intelligence vs. Machine Learning vs. Deep Learning and Neural Networks

What is Machine Learning?
Machine Learning is a subset of Artificial Intelligence (AI) which provides machines with the ability to learn automatically and improve from experience without being explicitly programmed.
Artificial Intelligence vs. Machine Learning vs. Deep Learning and Neural Networks

What is Deep Learning?
Deep learning is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, unsupervised or reinforcement-based. (Wikipedia)
Artificial Intelligence vs. Machine Learning vs. Deep Learning and Neural Networks

What are Neural Networks (NNs)?
Neural Networks, or Artificial Neural Networks (ANNs), are machine learning techniques inspired by the functionality of our brain cells. These brain cells are known as biological neurons.

Biological neuron (Image source: Wikipedia)
What are Neural Networks?
Biological Neuron vs. Artificial Neuron

Biological neuron (Image source: Wikipedia); Artificial neuron (Image source: https://towardsdatascience.com)
What are Neural Networks?

Biological Neuron vs. Artificial Neuron

Biological Neuron           Artificial Neuron
Dendrites (input)           Input
Soma (cell body)            Node
Axon (output)               Output
Synapse                     Interconnections
Adaptation-based learning   Model-based learning
Neural Network Learning Algorithms (1)

• Two main algorithms:
  • Perceptron: the initial algorithm for learning simple neural networks (with no hidden layer), developed in the 1950s.
  • Backpropagation: a more complex algorithm for learning multi-layer neural networks, developed in the 1980s.
Neural Network Learning Algorithms (2)

• Neural Networks are one of the most important classes of learning algorithms in ML.
• The learned classification model is an algebraic function.
• The function is linear for the Perceptron algorithm and non-linear for the Backpropagation algorithm.
• Both the features and the output classes are allowed to be real-valued.
Types of Machine Learning Algorithms (1)

• Supervised learning is a method in which the machine is trained using data with corresponding labels.
• Unsupervised learning is when a machine is trained on unlabeled data, without providing any guidance with the data.
• Reinforcement learning is a method in which an agent learns by interacting with the environment, taking specific actions and discovering rewards or penalties in response to its actions.
Types of Machine Learning Algorithms (2)
Target problems in supervised, unsupervised and reinforcement learning, respectively (from left to right).

Types of Machine Learning Algorithms (3)
Nature of data in supervised, unsupervised and reinforcement learning, respectively (from left to right).

Types of Machine Learning Algorithms (4)
Training process in supervised, unsupervised and reinforcement learning, respectively (from left to right).

Types of Machine Learning Algorithms (5)
Purpose of supervised, unsupervised and reinforcement learning, respectively (from left to right).
Traditional ML vs. Deep Learning (1)

Traditional ML vs. Deep Learning (2)

Traditional ML vs. Deep Learning (3)
Perceptron: The First Neural Network
Types of Artificial Neural Networks
• ANNs can be categorized based on the number of hidden layers contained in the ANN architecture:

01 One-Layer Neural Network (Perceptron)
   • Contains 0 hidden layers

02 Multi-Layer Neural Network
   • Regular Neural Network: contains 1 hidden layer
   • Deep Neural Network: contains >1 hidden layers
One-Layer Artificial Neural Network (Perceptron)

• Multiple input nodes x1, x2, x3, ..., xn with weights w1, w2, w3, ..., wn
• Single output node ŷ
• Takes a weighted sum of the inputs: S = w1*x1 + w2*x2 + ... + wn*xn
• A unit function calculates the output of the network

Unit Function
• Linear function
• Weighted sum followed by an activation function (see the sketch below)
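
A minimal sketch of this unit function in Python (the helper names `weighted_sum`, `step_activation` and `predict` are illustrative, not from the slides): a weighted sum followed by a threshold activation that outputs +1 or -1.

```python
def weighted_sum(weights, inputs):
    """Linear part of the unit function: S = w1*x1 + ... + wn*xn."""
    return sum(w * x for w, x in zip(weights, inputs))

def step_activation(s):
    """Threshold activation: +1 if S >= 0, otherwise -1."""
    return 1 if s >= 0 else -1

def predict(weights, inputs):
    """Output ŷ of a one-layer network (perceptron) for one example."""
    return step_activation(weighted_sum(weights, inputs))
```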
Perceptron Example

• Task: categorize a 2x2 pixel binary image as "Bright" or "Dark"
• The rule is:
  • If it contains 2, 3 or 4 white pixels, it is "bright"
  • If it contains 0 or 1 white pixels, it is "dark"
• Perceptron architecture:
  • Four input units, one for each pixel
  • One output unit: +1 for bright, -1 for dark

(Figure: an image of 4 pixels)
Perceptron Example

• Inputs: Pixel 1 (x1), Pixel 2 (x2), Pixel 3 (x3), Pixel 4 (x4), each with weight 0.25
• Output ŷ: +1 (Bright) if S >= 0, otherwise -1 (Dark)

S = 0.25*x1 + 0.25*x2 + 0.25*x3 + 0.25*x4 (illustrated in code below)
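
Using the `predict` sketch above, this bright/dark classifier with all weights set to 0.25 can be checked on a couple of images (an illustrative check, not part of the original slides):

```python
weights = [0.25, 0.25, 0.25, 0.25]

# Two white pixels: S = 0.5 >= 0, so the image is classified "bright" (+1).
print(predict(weights, [1, 1, 0, 0]))   # +1
# One white pixel: S = 0.25 >= 0, so it is also classified "bright" (+1),
# even though the target label is "dark" (-1) -- this is the error that the
# training rule on the following slides corrects.
print(predict(weights, [1, 0, 0, 0]))   # +1
```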


Perceptron Example

• Calculation (Step 1):
  • x1 = 1, x2 = 0, x3 = 0, x4 = 0
  • S = 0.25*(1) + 0.25*(0) + 0.25*(0) + 0.25*(0) = 0.25
  • 0.25 >= 0, so the output of the ANN is +1
  • So the image is categorized as "Bright"
  • Target: "Dark"
Perceptron Training Rule (How to Update Weights)
• When t(E) is different from o(E):
  • Add ∆i to weight wi
  • Where ∆i = ɳ (t(E) – o(E)) xi, and ɳ is the learning rate (usually a very small value)
  • Do this for every weight in the network (see the sketch after this slide)
• Let ɳ = 0.1

Calculating the error values:                              Calculating the new weights:
∆1 = ɳ (t(E) – o(E)) * x1 = 0.1 * (-1 - 1) * 1 = -0.2      wʹ1 = w1 + ∆1 = 0.25 - 0.2 = 0.05
∆2 = ɳ (t(E) – o(E)) * x2 = 0.1 * (-1 - 1) * 0 = 0         wʹ2 = w2 + ∆2 = 0.25 + 0 = 0.25
∆3 = ɳ (t(E) – o(E)) * x3 = 0.1 * (-1 - 1) * 0 = 0         wʹ3 = w3 + ∆3 = 0.25 + 0 = 0.25
∆4 = ɳ (t(E) – o(E)) * x4 = 0.1 * (-1 - 1) * 0 = 0         wʹ4 = w4 + ∆4 = 0.25 + 0 = 0.25
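
A minimal sketch of this update rule, ∆i = ɳ (t(E) – o(E)) xi, reusing the `predict` helper sketched earlier (the function name `update_weights` is illustrative):

```python
def update_weights(weights, inputs, target, lr=0.1):
    """Perceptron training rule: w_i <- w_i + eta * (t(E) - o(E)) * x_i."""
    output = predict(weights, inputs)
    return [w + lr * (target - output) * x for w, x in zip(weights, inputs)]

# Reproducing the slide: input (1, 0, 0, 0), target -1, current output +1.
w = update_weights([0.25, 0.25, 0.25, 0.25], [1, 0, 0, 0], target=-1)
print(w)   # approx [0.05, 0.25, 0.25, 0.25] (up to floating-point rounding);
           # only w1 changes, since x2 = x3 = x4 = 0
```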
Perceptron Example

• Calculation (Step 2):
  • x1 = 1, x2 = 0, x3 = 0, x4 = 0
  • S = 0.05*(1) + 0.25*(0) + 0.25*(0) + 0.25*(0) = 0.05
  • 0.05 >= 0, so the output of the ANN is +1
  • So the image is categorized as "Bright"
  • Target: "Dark"
Perceptron Training Rule (How to Update Weights)
• When t(E) is different from o(E):
  • Add ∆i to weight wi
  • Where ∆i = ɳ (t(E) – o(E)) xi, and ɳ is the learning rate (usually a very small value)
  • Do this for every weight in the network
• Let ɳ = 0.1

Calculating the error values:                              Calculating the new weights:
∆1 = ɳ (t(E) – o(E)) * x1 = 0.1 * (-1 - 1) * 1 = -0.2      wʹ1 = w1 + ∆1 = 0.05 - 0.2 = -0.15
∆2 = ɳ (t(E) – o(E)) * x2 = 0.1 * (-1 - 1) * 0 = 0         wʹ2 = w2 + ∆2 = 0.25 + 0 = 0.25
∆3 = ɳ (t(E) – o(E)) * x3 = 0.1 * (-1 - 1) * 0 = 0         wʹ3 = w3 + ∆3 = 0.25 + 0 = 0.25
∆4 = ɳ (t(E) – o(E)) * x4 = 0.1 * (-1 - 1) * 0 = 0         wʹ4 = w4 + ∆4 = 0.25 + 0 = 0.25
Perceptron Example

• Calculation (Step 3):
  • x1 = 1, x2 = 0, x3 = 0, x4 = 0
  • S = -0.15*(1) + 0.25*(0) + 0.25*(0) + 0.25*(0) = -0.15
  • -0.15 < 0, so the output of the ANN is -1
  • So the image is categorized as "Dark"
  • Target: "Dark"

The short loop below reproduces these three steps.
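
A small loop, assuming the `predict` and `update_weights` helpers sketched earlier, that repeats the update for the same example until the output matches the target (illustrative, not from the slides):

```python
weights = [0.25, 0.25, 0.25, 0.25]
image, target = [1, 0, 0, 0], -1     # one white pixel -> target "dark"

step = 1
while predict(weights, image) != target:
    weights = update_weights(weights, image, target, lr=0.1)
    print(f"step {step}: weights = {weights}")
    step += 1
# step 1: w1 becomes approx  0.05 -> image still classified "bright"
# step 2: w1 becomes approx -0.15 -> image classified "dark", matching the target
```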
Another Example (AND)

X1  X2  X1 AND X2
0   0   0
0   1   0
1   0   0
1   1   1

Perceptron: output ŷ = +1 (ON) if S >= 0, otherwise -1 (OFF); initial weights w1 = 0.5, w2 = 0.5.

Training on x1 = 0, x2 = 1, with ɳ = 0.1 and t(E) = -1:

Weights          Step-1  Step-2  Step-3  Step-4
w1               0.5     0.5     0.5     0.5
w2               0.5     0.3     0.1     -0.1
Weighted Sum     0.5     0.3     0.1     -0.1
Observed Output  +1      +1      +1      -1
Another Example (AND)

X1  X2  X1 AND X2
0   0   0
0   1   0
1   0   0
1   1   1

Perceptron: output ŷ = +1 (ON) if S >= 0, otherwise -1 (OFF); current weights w1 = 0.5, w2 = -0.1.

Training on x1 = 1, x2 = 0, with ɳ = 0.1 and t(E) = -1:

Weights          Step-1  Step-2  Step-3  Step-4
w1               0.5     0.3     0.1     -0.1
w2               -0.1    -0.1    -0.1    -0.1
Weighted Sum     0.5     0.3     0.1     -0.1
Observed Output  +1      +1      +1      -1
Another Example (AND)

X1  X2  X1 AND X2
0   0   0
0   1   0
1   0   0
1   1   1

Perceptron: output ŷ = +1 (ON) if S >= 0, otherwise -1 (OFF); current weights w1 = -0.1, w2 = -0.1.

Training on x1 = 1, x2 = 1, with ɳ = 0.1 and t(E) = +1:

Weights          Step-1  Step-2
w1               -0.1    0.1
w2               -0.1    0.1
Weighted Sum     -0.2    0.2
Observed Output  -1      +1
Use of Bias

• Bias is just like an intercept added in a linear equation.
• Diagram: bias input x0, inputs x1 and x2 with weights 0.5, output ŷ = +1 (ON) if S >= 0, otherwise -1 (OFF).

output = sum(weights * inputs) + bias

• The output is calculated by multiplying the inputs with their weights and then passing the result through an activation function such as the Sigmoid function. Here, the bias acts like a constant which helps the model fit the given data (see the sketch below).
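
In code, the bias is simply added to the weighted sum before the activation; a minimal sketch, using the threshold from this slide (the name `predict_with_bias` is illustrative):

```python
def predict_with_bias(weights, inputs, bias):
    """output = activation( sum(weights * inputs) + bias )"""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if s >= 0 else -1
```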
Use of Bias

• A simpler way to understand bias is through the constant c of a linear function: y = mx + c
• It allows us to move the line up and down, fitting the prediction to the data better. If the constant c is absent, the line will pass through the origin (0, 0) and we will get a poorer fit.
Example (AND) with Bias

X1  X2  X1 AND X2
0   0   0
0   1   0
1   0   0
1   1   1

Perceptron with bias input x0 = 1: output ŷ = +1 (ON) if S > 0, otherwise -1 (OFF); initial weights w0 = 0.5, w1 = 0.5, w2 = 0.5.

Training on x1 = 0, x2 = 1, with ɳ = 0.1 and t(E) = -1:

Weights          Step-1  Step-2  Step-3  Step-4
w0               0.5     0.3     0.1     -0.1
w1               0.5     0.5     0.5     0.5
w2               0.5     0.3     0.1     -0.1
Weighted Sum     1       0.6     0.2     -0.2
Observed Output  +1      +1      +1      -1
Example (AND) with Bias

X1  X2  X1 AND X2
0   0   0
0   1   0
1   0   0
1   1   1

Perceptron with bias input x0 = 1: output ŷ = +1 (ON) if S > 0, otherwise -1 (OFF); current weights w0 = -0.1, w1 = 0.5, w2 = -0.1.

Training on x1 = 1, x2 = 0, with ɳ = 0.1 and t(E) = -1:

Weights          Step-1  Step-2
w0               -0.1    -0.3
w1               0.5     0.3
w2               -0.1    -0.1
Weighted Sum     0.4     0
Observed Output  +1      -1
Example (AND) with Bias

X1  X2  X1 AND X2
0   0   0
0   1   0
1   0   0
1   1   1

Perceptron with bias input x0 = 1: output ŷ = +1 (ON) if S > 0, otherwise -1 (OFF); current weights w0 = -0.3, w1 = 0.3, w2 = -0.1.

Training on x1 = 1, x2 = 1, with ɳ = 0.1 and t(E) = +1:

Weights          Step-1  Step-2
w0               -0.3    -0.1
w1               0.3     0.5
w2               -0.1    0.1
Weighted Sum     -0.1    0.5
Observed Output  -1      +1
Example (AND) with Bias

After 2 Epochs

X1  X2  X1 AND X2
0   0   0
0   1   0
1   0   0
1   1   1

Perceptron with bias input x0 = 1: output ŷ = +1 (ON) if S > 0, otherwise -1 (OFF).

• x1 = 0, x2 = 1, ɳ = 0.1, t(E) = -1

Final Weights
w0   -0.3
w1   0.3
w2   0.1

A quick check of these final weights follows below.
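
An illustrative check that the final weights w0 = -0.3, w1 = 0.3, w2 = 0.1 reproduce the AND truth table under the slide's "If S > 0" rule:

```python
w0, w1, w2 = -0.3, 0.3, 0.1

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    s = w0 * 1 + w1 * x1 + w2 * x2   # bias input x0 is always 1
    y = 1 if s > 0 else -1           # +1 = ON, -1 = OFF
    print(x1, x2, "->", y)
# 0 0 -> -1, 0 1 -> -1, 1 0 -> -1, 1 1 -> +1  (matches X1 AND X2)
```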
Learning in Perceptron

• Need to learn:
  • Both the weights between input and output units
  • And the value for the bias
• Make calculations easier by:
  • Thinking of the bias as a weight from a special input unit whose output is always 1
• Exactly the same result:
  • But we only have to worry about learning weights
New Representation for Perceptron

• A special input x0, which is always 1, carries the bias as weight w0.
• The threshold function becomes: ŷ = +1 if S > 0, otherwise -1.

S = w0 + w1*x1 + w2*x2 + ... + wn*xn
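
With the bias treated as weight w0 on a constant input x0 = 1, the whole network is again just a weighted sum followed by the threshold; a minimal sketch (the helper name is illustrative):

```python
def predict(weights, inputs):
    """weights = [w0, w1, ..., wn]; the bias input x0 = 1 is prepended automatically."""
    s = sum(w * x for w, x in zip(weights, [1] + list(inputs)))
    return 1 if s > 0 else -1
```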


Learning Algorithm
• Weights are randomly initialized.
• For each training example E:
  • Calculate the observed output from the Perceptron, o(E)
  • If the target output t(E) is different from o(E):
    • Then update all the weights so that o(E) becomes closer to t(E)
• This process is done for every example.
• It is not necessary to stop when all examples have been used once.
• Repeat the cycle again (an epoch) until the network produces the correct output (see the sketch below).
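
Putting the pieces together, a minimal, self-contained sketch of this learning algorithm (random initial weights, repeated epochs until every example is classified correctly; the function name and signature are illustrative):

```python
import random

def train_perceptron(examples, lr=0.1, max_epochs=100):
    """examples: list of (inputs, target) pairs with targets in {+1, -1}."""
    n = len(examples[0][0])
    weights = [random.uniform(-0.5, 0.5) for _ in range(n + 1)]   # w0 is the bias weight
    for _ in range(max_epochs):
        all_correct = True
        for inputs, target in examples:
            xs = [1] + list(inputs)                       # prepend the constant bias input
            s = sum(w * x for w, x in zip(weights, xs))
            output = 1 if s > 0 else -1
            if output != target:
                all_correct = False
                weights = [w + lr * (target - output) * x  # perceptron training rule
                           for w, x in zip(weights, xs)]
        if all_correct:                                    # stop once an epoch has no errors
            break
    return weights

# AND function, encoded with targets +1 (ON) and -1 (OFF)
and_examples = [([0, 0], -1), ([0, 1], -1), ([1, 0], -1), ([1, 1], 1)]
print(train_perceptron(and_examples))
```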
Limitations of Perceptron
• The perceptron can only learn simple problems: it is only useful if the problem is linearly separable.
• A linearly separable problem is one in which the classes can be separated by a single hyperplane.
Any Questions?
