ML Ch-4 Artificial Neural Network

Artificial Neural Network (ANN)
Introduction to Machine Learning
Lecture 4
Agenda
 Introduction
 Understanding the brain
 Artificial Neural Network
 Architecture of ANN
 How ANN Works
 Activation Function
 Backpropagation Algorithm
 Types of ANN
 The Perceptron
 Multilayer Perceptron
 Back propagation algorithm
 Nonlinear Regression
Introduction: ANN
 “...a computing system made up of a number of simple,
highly interconnected processing elements, which process
information by their dynamic state response to external
inputs.” (Dr. Robert Hecht-Nielsen)
 That is, an ANN is a computational model inspired by the
way biological neural networks in the human brain
process information.
Understanding the Brain
 The basic computational unit of the brain is a neuron.
 The human nervous system contains approximately 86 billion
neurons, connected by approximately 10^14 to 10^15 synapses.
 A biological neuron has three main components:
dendrites, cell body, and axon.
 Each neuron receives input signals from its dendrites
and produces output signals along its (single) axon.
Understanding the brain…
 The axon eventually branches out and connects via
synapses to dendrites of other neurons.
 In the computational model of a neuron:
 The signals that travel along the axons (e.g., x0) interact
multiplicatively (e.g., w0x0) with the dendrites of the other
neuron based on the synaptic strength at that synapse
(e.g., w0).
 The cell body sums the incoming signals.
 When sufficient input is received, the cell fires; that is, it transmits a
signal over its axon to other cells.
ANN
 Neural networks are a set of algorithms inspired by the
functioning of the human brain.
 When you open your eyes, what you see is data; the neurons
(data-processing cells) in your brain process it and recognize
what is around you.
 Neural networks work in a similar way.
 They take a large set of data, process it (extract
patterns from the data), and output a prediction of what it is.
ANN…
 ANNs have been developed as generalizations of
mathematical models of neural biology, based on the
assumptions that:
1. Information processing occurs at many simple elements
called neurons.
2. Signals are passed between neurons over connection
links.
3. Each connection link has an associated weight, which, in a
typical neural net, multiplies the signal transmitted.
4. Each neuron applies an activation function to its net input
to determine its output signal.
ANN…
 A neural net consists of a large number of simple
processing elements called neurons, units, cells or
nodes.
 Each neuron is connected to other neurons by means
of directed communication links, each with an associated
weight (w).
 The weights represent the information being used by the
net to solve a problem.
ANN …
 Each neuron has an internal state, called its activation or
activity level, which is a function of the inputs it has
received.
 Typically, a neuron sends its activation as a signal to several
other neurons.
 The basic unit of computation in a NN is the neuron:
 It receives input from some other nodes, or from an external
source and computes an output.
 Each input has an associated weight (w), which is
assigned based on its relative importance to other inputs.
ANN: Architecture
 A common NN architecture consists of 3 layers: Input
Layer, Hidden Layer(s), and Output Layer.
 Input Layer: passes the information to the next layer.
 No computation is performed.
 Hidden Layer(s): intermediate computation is done here.
 Transfers the computed information to the next
hidden layer or directly to the output layer.
 Output Layer: holds the result or the output of the
problem.
 Connections and weights:
 The network consists of connections, each connection
transferring the output of a neuron i to the input of a
neuron j.
 In this sense, i is the predecessor of j and j is the
successor of i. Each connection is assigned a weight Wij.
 Weights are typically initialized randomly and then adjusted
during training.
 A learned weight is significant because it indicates the importance
of the corresponding input feature to the output.
 Bias:
 It shifts the activation function to the left or to the
right along the input axis.
 Summation function:
 It is the function that sums the products of the weights
and the features, then adds the bias.
 Activation function:
 It is required to add non-linearity to the neural network
model.
ANN: How it works?
 Here’s how it works:
1. Information is fed into the input layer, which transfers it
to the hidden layer.
2. Each connection between the two layers carries a weight,
which is initially assigned at random.
3. Each input is multiplied by its weight, the products are
summed, and a bias is added to the sum.
4. The weighted sum is passed to the activation function.
5. The activation function determines which nodes should fire
for feature extraction.
 Example: a single neuron
Each neuron performs a dot product of the input and its weights, adds the bias, and
applies the non-linearity (or activation function), in this case the sigmoid:
$$\sigma(x) = \frac{1}{1 + \exp(-x)}$$
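The single-neuron computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the original slide; the function names and the input, weight, and bias values are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias):
    """Dot product of inputs and weights, plus the bias, passed through the sigmoid."""
    net_input = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(net_input)


# Hypothetical values chosen only for illustration.
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -0.25, 0.0]
bias = 0.0

# Net input is 0.5 - 0.5 + 0.0 + 0.0 = 0.0, and sigmoid(0) = 0.5.
output = neuron_output(inputs, weights, bias)
```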
6. The model applies a loss function to the output of the output
layer to measure the error.
 This is a way to quantify how “good” the model is doing so that it can try to do
“better”.
7. The error is back-propagated and the weights are adjusted to
minimize it; this process is called backpropagation.
Activation Function
 It decides whether a neuron should be activated or not, based on
the weighted sum of the neuron's inputs plus the bias.
 Its purpose is to introduce non-linearity into the output of a
neuron.
 Most real-world problems are nonlinear, and we want our
NN algorithm to learn that.
 Linear regression models do not have an activation function.
 An activation function (or non-linearity) takes a single
number and performs a certain fixed mathematical
operation on it.
Activation Function: types
 There are several types of activation functions
 Linear Activation functions: No activation
 All layers of the NN will collapse into one if a linear activation
function is used.
 No matter the number of layers in the neural network, the last
layer will still be a linear function of the first layer.
 So, essentially, a linear activation function turns the neural
network into just one layer.
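The collapse of stacked linear layers can be checked numerically. The sketch below uses small hypothetical weight matrices and plain Python lists; it shows that two layers with no non-linearity produce the same output as one layer with weights W2·W1 and bias W2·b1 + b2.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]


def add(u, v):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(u, v)]


# Hypothetical weights and biases for two layers with a linear (identity) activation.
W1, b1 = [[1.0, 2.0], [0.0, -1.0]], [0.5, 0.5]
W2, b2 = [[2.0, 1.0]], [1.0]
x = [3.0, 4.0]

# Two layers applied one after the other.
two_layers = add(matvec(W2, add(matvec(W1, x), b1)), b2)

# One equivalent layer: W = W2 W1, b = W2 b1 + b2.
W = matmul(W2, W1)
b = add(matvec(W2, b1), b2)
one_layer = add(matvec(W, x), b)
```

Both `two_layers` and `one_layer` evaluate to the same vector, which is why a non-linear activation is needed for depth to add expressive power.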
Activation Function: types…
 Sigmoid: takes a real-valued input and squashes it to the
range between 0 and 1.
 Maps inputs of any size to outputs in the range (0, 1).
 The larger (more positive) the input, the closer the output
is to 1.0, whereas the smaller (more negative) the input,
the closer the output is to 0.0.
 It is commonly used for models where we must predict a
probability as the output,
 since a probability always lies between 0 and 1.
[Figure: Sigmoid]
 SoftMax: takes a vector of arbitrary real-valued
scores and squashes it to a vector of values between
0 and 1 that sum to 1.
 It takes the raw outputs of the neural network and returns a
vector of probability scores.
 Used in the output layer of an NN for multiclass classification.
 SoftMax
$$\mathrm{softmax}(Z)_i = \frac{e^{Z_i}}{\sum_{j=1}^{N} e^{Z_j}}$$
• Z – output vector from the NN
• e – Euler's number (approximately 2.718)
• N – number of classes
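A minimal sketch of SoftMax in plain Python follows. Subtracting the maximum score before exponentiating is a common numerical-stability step that is not part of the slide; it does not change the result. The example scores are hypothetical.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    largest = max(scores)
    # Shifting by the maximum avoids overflow in exp() and leaves the result unchanged.
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs of a 3-class network.
probs = softmax([2.0, 1.0, 0.1])
```

The largest score receives the largest probability, and the entries of `probs` add up to 1.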
 tanh: takes a real-valued input and squashes it to the
range (-1, 1).
 The advantage is that negative inputs are mapped to
strongly negative outputs, and inputs near zero are mapped
to outputs near zero.
 The tanh function is mainly used for classification between
two classes.
[Figure: tanh]
 ReLU (Rectified Linear Unit) Activation Function
 It is one of the most widely used activation functions in deep learning.
 It takes a real-valued input and thresholds it at zero
(replaces negative values with zero): f(x) = max(0, x).
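The tanh and ReLU activations described above can be compared on the same inputs. This is a small sketch with hypothetical input values; `math.tanh` is the standard-library hyperbolic tangent.

```python
import math


def relu(x):
    """Rectified Linear Unit: negative inputs become zero, others pass through."""
    return max(0.0, x)


# Hypothetical inputs: one negative, one zero, one positive.
values = [-2.0, 0.0, 3.0]

relu_out = [relu(v) for v in values]       # [0.0, 0.0, 3.0]
tanh_out = [math.tanh(v) for v in values]  # negative, zero, positive; all inside (-1, 1)
```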
[Figure: How to choose an activation function for hidden layers]
Backpropagation Algorithm
 Every neural network has 2 main phases:
 Forward Propagation
 Is the way to move from the Input layer (left) to the Output
layer (right) in the neural network.
 Backward Propagation
 Is the process of moving from right to left, i.e., backward
from the Output layer to the Input layer.
 Forward propagation: repeated matrix
multiplications interleaved with activation functions.
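Forward propagation can be sketched as one small function applied layer after layer. The network below (2 inputs, 2 hidden neurons, 1 output) and all of its weights are hypothetical, chosen so that the numbers are easy to follow by hand.

```python
import math


def sigmoid(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b), computed row by row."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical parameters: each row of a weight matrix belongs to one neuron.
hidden_w, hidden_b = [[0.0, 0.0], [1.0, -1.0]], [0.0, 0.0]
output_w, output_b = [[1.0, 1.0]], [-1.0]

x = [2.0, 2.0]
hidden = dense(x, hidden_w, hidden_b, sigmoid)            # both net inputs are 0 -> [0.5, 0.5]
output = dense(hidden, output_w, output_b, sigmoid)       # 0.5 + 0.5 - 1 = 0 -> [0.5]
```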
 Loss Function:
 Quantifies the difference between the expected outcome
and the outcome produced by the machine learning model.
 It is a function that compares the target and predicted output
values.
 Measures how well the NN models the training data.
 When training, we aim to minimize this loss between the
predicted and target outputs.
 From the loss function, we can derive the gradients which
are used to update the weights.
 In supervised learning, there are two main types of
loss functions:
 Regression loss functions
 Mean Squared Error (MSE)
 Mean Absolute Error (MAE)
 Classification loss functions
 Binary Cross-Entropy
 Categorical Cross-Entropy
 The predicted value is evaluated against the expected value
using a cost function.
 Mean Squared Error (MSE)
 Finds the average of the squared differences between the
target and the predicted outputs.
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_{true,i} - y_{pred,i} \right)^2$$
• n is the number of samples
• y represents the variable being predicted
• y_{true} is the true value of the variable (the “correct answer”)
• y_{pred} is the predicted value
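The two regression losses named earlier, MSE and MAE, can be computed directly from their definitions. The target and predicted values below are hypothetical and are not the data from the slide exercise.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between targets and predictions."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


# Hypothetical targets and predictions for two samples.
y_true = [3.0, 5.0]
y_pred = [1.0, 5.0]

mse = mean_squared_error(y_true, y_pred)   # (2^2 + 0^2) / 2 = 2.0
mae = mean_absolute_error(y_true, y_pred)  # (2 + 0) / 2 = 1.0
```

Because MSE squares each error, the single error of 2 contributes 4 to the sum, so MSE penalizes large errors more heavily than MAE does.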
 MSE:
 What will be the loss???

 Categorical Cross-Entropy
 Is a measure of the difference between two probability
distributions.
 Is applied in multiclass classification scenarios.
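As a sketch of how categorical cross-entropy compares two probability distributions, the function below assumes a one-hot target vector and a predicted probability vector (for example, a SoftMax output). The small `eps` guard against log(0) and all example values are assumptions made for illustration.

```python
import math


def categorical_cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a target distribution and a predicted distribution."""
    # eps keeps log() defined if a predicted probability is exactly 0.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


# Hypothetical 3-class example: the true class is the second one.
target = [0.0, 1.0, 0.0]
confident = [0.05, 0.9, 0.05]  # most probability on the correct class
unsure = [0.4, 0.2, 0.4]       # little probability on the correct class

loss_confident = categorical_cross_entropy(target, confident)  # -log(0.9)
loss_unsure = categorical_cross_entropy(target, unsure)        # -log(0.2)
```

The loss is small when the predicted distribution puts most of its mass on the correct class and grows as that probability shrinks.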
 After each forward pass through a network,
backpropagation performs a backward pass while
adjusting the model’s parameters (weights and
biases).
 In every iteration, based on the model output and the
expected output, we calculate the gradients.
 The gradient indicates in which direction, and by how much,
the weights and biases need to change in order to reduce the loss.
 Based on the loss function’s value, the model “knows”
how much to adjust its parameters in order to get
closer to the expected output.
 This happens using the backpropagation algorithm.
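The forward pass, gradient calculation, and parameter update can be sketched for the smallest possible case: a single sigmoid neuron with one input and a squared-error loss. With only one neuron the chain rule has a single stage, so this is a simplified stand-in for full backpropagation through several layers. The learning rate, starting values, and training sample are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))


def train_step(w, b, x, target, learning_rate):
    """One forward pass, one backward pass, and one gradient-descent update."""
    # Forward pass.
    y = sigmoid(w * x + b)
    loss = (y - target) ** 2

    # Backward pass (chain rule): dL/dw = dL/dy * dy/dz * dz/dw.
    dloss_dy = 2.0 * (y - target)
    dy_dz = y * (1.0 - y)          # derivative of the sigmoid
    grad_w = dloss_dy * dy_dz * x
    grad_b = dloss_dy * dy_dz

    # Update: move each parameter against its gradient.
    return w - learning_rate * grad_w, b - learning_rate * grad_b, loss


w, b = 0.0, 0.0
losses = []
for _ in range(100):
    w, b, loss = train_step(w, b, x=1.0, target=1.0, learning_rate=0.5)
    losses.append(loss)
```

The first recorded loss is 0.25 (the output starts at 0.5), and each subsequent step lowers it as the weight and bias move toward values that push the output closer to the target.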
What kind of problems do NNs solve?
 NNs are used to solve complex problems that require
analytical calculations like those of the human brain.
 Classification. NNs label the data into classes by implicitly
analyzing its parameters.
 E.g., a NN can analyze the parameters of a bank client such as
age, solvency, credit history and decide whether to loan them
money.
 Prediction. E.g., it can foresee the rise or fall of a stock
based on the situation in the stock market.
 Recognition. This is currently the widest application of NNs.
 E.g., a security system can use face recognition to only let
authorized people into the building.
ANN: Types
 The Perceptron
 The simplest neural network model that can learn from data;
a single perceptron can solve only linearly separable problems.
 It comprises only an input layer and an output layer.
 These perceptron units are combined to form a bigger
ANN architecture.
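A minimal sketch of a perceptron follows, trained with the classic perceptron learning rule (which the slides do not spell out, so it is an assumption here) on the logical AND function, a linearly separable problem. The step threshold at zero, the learning rate, and the number of epochs are illustrative choices.

```python
def predict(weights, bias, inputs):
    """Step activation: output 1 if the weighted sum plus bias is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=10, learning_rate=1.0):
    """Perceptron learning rule: nudge weights by (target - prediction) * input."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Truth table of logical AND as (inputs, target) pairs.
and_gate = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

weights, bias = train_perceptron(and_gate)
predictions = [predict(weights, bias, inputs) for inputs, _ in and_gate]
```

After training, `predictions` matches the AND truth table. The same procedure would not converge on XOR, which is not linearly separable and needs a hidden layer.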
ANN: Types…
 Feed Forward NN
 It is a multi-layer NN, and, as the name suggests, the
information is passed in the forward direction—from left to
right.
 Data moves in a single direction.
 It enters via the input nodes and leaves through output
nodes.
 There are no feedback connections (loops) between nodes; such
networks can still be trained with backpropagation.
 The basic learning process of feed-forward networks
remains the same as that of the perceptron.
ANN: Types…
 Recurrent Neural Networks (RNNs)
 An RNN is a network that is good at modeling sequential data.
 Sequential data means data that follow a particular order,
in which one item follows another.
 RNNs have been widely used for sequential data, for example
in Apple’s Siri and Google’s voice search.
 An RNN remembers its previous inputs by means of an
internal memory.
ANN Types: RNN…
 RNN has the concept of sequential memory, which is
used to remember sequences of data.
 A user types in… what time is it?
[Figure: final hidden state of the RNN]
 Feed-forward NNs have no memory of the input they receive and are bad
at predicting what’s coming next.
 A feed-forward NN considers only the current input;
it has no notion of order in time.
 It simply can’t remember anything about what happened in the past
except its training.
 RNNs are used for: speech recognition, language modeling, translation,
image captioning…
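The idea of an internal memory can be sketched with a scalar recurrent unit: at each step the new hidden state depends on the current input and on the previous hidden state. The tanh activation is a common choice for such units; the weights and the two input sequences are hypothetical.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: mix the current input with the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(sequence, w_x=1.0, w_h=0.5, b=0.0):
    """Feed a sequence through the unit and return the final hidden state."""
    h = 0.0  # the memory starts empty
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
    return h


# The same values in a different order lead to different final hidden states.
final_a = run_rnn([1.0, 0.0, 0.0])
final_b = run_rnn([0.0, 0.0, 1.0])
```

A feed-forward network that only saw the last element would treat sequences ending in the same value identically, whereas here the final state reflects the order of the earlier inputs.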
ANN: LSTM
 Convolutional neural networks (CNNs)
 Contain five types of layers: input, convolution, pooling,
fully connected, and output.
 Each layer has a specific purpose, such as summarizing,
connecting, or activating.
 CNNs have popularized image classification and object
detection.
 However, CNNs have also been applied to other areas,
such as natural language processing and forecasting.
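The convolution and pooling layers can be sketched on a tiny image with plain Python lists. The convolution below is the unpadded sliding-window product (cross-correlation) that CNN layers typically compute, and the pooling is 2x2 max pooling with stride 2. The 5x5 image and the edge-detecting kernel are hypothetical.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2x2(feature_map):
    """Summarize each non-overlapping 2x2 block by its maximum value."""
    return [
        [
            max(feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1])
            for j in range(0, len(feature_map[0]) - 1, 2)
        ]
        for i in range(0, len(feature_map) - 1, 2)
    ]


# Hypothetical image: dark on the left (0), bright on the right (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical kernel that responds where brightness increases from left to right.
kernel = [[-1, 1], [-1, 1]]

feature_map = convolve2d(image, kernel)  # 4x4, every row is [0, 2, 0, 0]
pooled = max_pool2x2(feature_map)        # 2x2: [[2, 0], [2, 0]]
```

The convolution responds only at the vertical edge, and pooling keeps that response while halving the width and height of the feature map.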
Thank You!