Machine Learning Lecture 11


Building Blocks: Neurons

A neuron takes inputs, does some math with them, and produces one output.

First, each input is multiplied by a weight.

Second, all the weighted inputs are added together with a bias b:
(w · x) + b = w1*x1 + w2*x2 + b

Finally, the sum is passed through an activation function f:
y = f((w · x) + b) = f(w1*x1 + w2*x2 + b)
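A rough sketch of this computation in Python with numpy (not from the slides; the name neuron_output and passing the activation f as an argument are just for illustration):

import numpy as np

def neuron_output(w, x, b, f):
    # Weighted sum of the inputs plus the bias, passed through the activation f
    return f(np.dot(w, x) + b)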


Activation Function and Examples
The activation function is used to turn an unbounded input into an
output that has a nice, predictable form.
A commonly used activation function is the sigmoid function:

sigmoid(x) = 1 / (1 + e^(-x))

The sigmoid function only outputs numbers in the range (0, 1). You can think of it as compressing (−∞, +∞) to (0, 1): big negative numbers become ~0 and big positive numbers become ~1.
Example
w1 = 0, w2 = 1, and b = 4
x1 = 2, and x2 = 3
(w · x) + b = 0*2 + 1*3 + 4 = 7, so y = f(7) = sigmoid(7) ≈ 0.999
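A quick check of this example in numpy (a sketch, not slide code): the weighted sum is 0*2 + 1*3 + 4 = 7, and sigmoid(7) is about 0.999.

import numpy as np

def sigmoid(x):
    # Sigmoid activation: compresses (-inf, +inf) into (0, 1)
    return 1 / (1 + np.exp(-x))

w = np.array([0, 1])   # w1 = 0, w2 = 1
x = np.array([2, 3])   # x1 = 2, x2 = 3
b = 4

total = np.dot(w, x) + b       # 0*2 + 1*3 + 4 = 7
print(total, sigmoid(total))   # 7  ~0.999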
Example
Does the output change if we use other activation functions?

Function             Value at x = 7    Value at x = -7
Unit Step            1                 0
Linear Function      7                 -7
Hyperbolic Tangent   0.9999            -0.9999
Sigmoid              0.999             ~0
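This comparison can be reproduced with a few lines of numpy (a sketch; the unit step here uses the common convention step(x) = 1 for x >= 0 and 0 otherwise):

import numpy as np

def unit_step(x):
    return 1.0 if x >= 0 else 0.0

def linear(x):
    return x

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

for x in (7, -7):
    print(x, unit_step(x), linear(x), np.tanh(x), sigmoid(x))
#  7: step 1, linear  7, tanh  0.99999, sigmoid 0.99909
# -7: step 0, linear -7, tanh -0.99999, sigmoid 0.00091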
Linear Classifier
The simple neuron can solve linearly separable classification problems such as OR.

There are many lines that can be used to separate the two classes in this problem.


XOR problem

The simple neuron (perceptron) cannot classify problems such as XOR.

There is no single line that can separate the two classes in this problem.


Hidden Layer
The solution to this problem is to expand beyond the single-layer
architecture by adding an additional layer of units without any direct access
to the outside world, known as a hidden layer. This resolves the XOR problem.
This kind of architecture is another feed-forward network known as a
multilayer perceptron (MLP).
Combining Neurons into a Neural Network
A neural network is nothing more than a bunch of neurons connected together.

This network has 2 inputs, a hidden layer with 2 neurons (h1 and h2), and an output
layer with 1 neuron (o1).
Notice that the inputs for o1 are the outputs from h1 and h2
A hidden layer is any layer between the input (first) layer and output (last) layer.
There can be multiple hidden layers! (This is called Deep Learning.)
Example
Given this neural network:
w1 = w3 = 0, w2 = w4 = 1, and b = 0 for h1, h2, and o1.

Neurons h1, h2, and o1 have the same activation function.

x1 = 2 and x2 = 3. Calculate the output y.
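A sketch of this calculation with the sigmoid activation (assumption: o1 also uses the weights [0, 1] and bias 0, since the slide gives b = 0 for o1 but does not name its weights explicitly):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = np.array([2, 3])          # x1 = 2, x2 = 3
w_hidden = np.array([0, 1])   # w1 = w3 = 0, w2 = w4 = 1
b = 0

h1 = sigmoid(np.dot(w_hidden, x) + b)                   # sigmoid(3) ~ 0.9526
h2 = sigmoid(np.dot(w_hidden, x) + b)                   # same weights, same value
o1 = sigmoid(np.dot(w_hidden, np.array([h1, h2])) + b)  # sigmoid(h2) ~ 0.7216
print(h1, h2, o1)   # y = o1 ~ 0.7216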


Training a Neural Network
Build a neural network to predict someone’s sex given their weight and
height from this data
Sex   Weight   Height   ID
F     59       164      1
M     72       183      2
M     71       179      3
F     54       154      4

First Step: Construct the neural network topology:

Inputs (2 inputs: Weight and Height)
One hidden layer with two neurons, h1 and h2
One output neuron o1 (Sex)
Second Step (repeated task): train the network, that is, determine the weights and biases of the network:

w1, w2, w3, w4, w5, w6, b1, b2, b3


Training a Neural Network
First: Data preparation. We'll represent Male with a 0 and Female with a 1, and we'll also shift the data (Weight - 64, Height - 170) to make it easier to use:

y_true (Sex)   Weight - 64   Height - 170
1              -5            -6
0              8             13
0              7             9
1              -10           -16
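As a sketch (not slide code), the shifted dataset as numpy arrays, with the column order matching the feedforward code that follows (weight first, then height):

import numpy as np

# Each row: [weight - 64, height - 170]
data = np.array([
    [-5,  -6],   # ID 1, F  -> y_true = 1
    [ 8,  13],   # ID 2, M  -> y_true = 0
    [ 7,   9],   # ID 3, M  -> y_true = 0
    [-10, -16],  # ID 4, F  -> y_true = 1
])
all_y_trues = np.array([1, 0, 0, 1])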

Second: we initialize w1, w2, w3, w4, w5, w6, b1, b2, b3 with random values.

Third: we determine the predicted Sex for the given Weight and Height as follows.

o1 is called y_pred.
Training a Neural Network
Loss
We first need a way to quantify how “good” the network is doing so that it can try to do “better”.
That’s what the loss is.
We’ll use the mean squared error (MSE) loss:

MSE = (1/n) Σ (y_true − y_pred)²
Where
• n is the number of samples, which is 4
• y represents the variable being predicted, which is Sex.
• y_true is the true value of the variable (the “correct answer”).
• y_pred is the predicted value of the variable. It’s whatever our network outputs.
(y_true−y_pred)² is known as the squared error.
Our loss function is simply taking the average over all squared errors.
The better our predictions are, the lower our loss will be!
Better predictions = Lower loss.
Training a network = trying to minimize its loss.
Training a Neural Network
The process of calculating h1, h2, and o1 is called feedforward: the calculation goes from the input layer to the hidden layer and then to the output layer.

The process of modifying the weights (w1, w2, w3, ..., w6) and biases is called backpropagation: the modifications go from the output layer to the hidden layer and then back to the input layer.

Feed Forward Calculation


[Figure: a fully connected feedforward network with inputs x1 ... xn, hidden units H1 ... Hk, a single output O1, input-to-hidden weights w11 ... wnk, and hidden-to-output weights wH11 ... wHk1.]

So this type of neural network (NN) is named:
• Multilayer NN
• Feedforward NN
• Backpropagation NN

Why are these names used?
• Multilayer NN: the network is organized into layers (input, hidden, output).
• Feedforward NN: the feedforward calculation goes from the inputs through the hidden layer to the output.
• Backpropagation NN: the backpropagation weight modifications go from the output back towards the inputs.
Training a Neural Network

In feedforward we calculate:
1. h1 and h2
2. o1

In backpropagation we calculate (update):
1. b3
2. w5 and w6
3. b2 and b1
4. w1, w2, w3, and w4
Training Algorithm
1. Initialize w1, w2, w3, w4, w5, w6, b1, b2, b3
2. Repeat this part N times:
   I. Determine h1, h2, o1
   II. Determine the mean squared error (MSE) loss
   III. Modify w1, w2, w3, w4, w5, w6, b1, b2, b3
3. If the loss is acceptable:
   the neural network is constructed
   Else:
   try to re-build a new NN topology (increase the number of hidden nodes)
Training Algorithm (Trial and Error)

[Flowchart: Start -> Initialize weights and biases -> repeat for the specified number of iterations: determine hidden values (h1, h2) and output (o1) -> determine the mean squared error (MSE) loss -> modify weights and biases. When all iterations are done: if the loss is acceptable -> Stop; if not -> construct another topology (add nodes to the hidden layer or add a new hidden layer) and train again.]
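The loop in this flowchart can be sketched in Python. This is not the slides' exact method: the slides derive analytic derivatives in the next sections, while this sketch uses numerical (finite-difference) gradients only so it can run on its own; the learning rate of 0.1 and the 1000 iterations are assumed values.

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def mse_loss(y_true, y_pred):
    return ((y_true - y_pred) ** 2).mean()

# Shifted data from the slides: [weight - 64, height - 170] and y_true (F = 1, M = 0)
data = np.array([[-5, -6], [8, 13], [7, 9], [-10, -16]], dtype=float)
all_y_trues = np.array([1, 0, 0, 1], dtype=float)

# params = [w1, w2, w3, w4, w5, w6, b1, b2, b3]
params = np.random.normal(size=9)

def feedforward(p, x):
    w1, w2, w3, w4, w5, w6, b1, b2, b3 = p
    h1 = sigmoid(w1 * x[0] + w2 * x[1] + b1)
    h2 = sigmoid(w3 * x[0] + w4 * x[1] + b2)
    return sigmoid(w5 * h1 + w6 * h2 + b3)

def loss(p):
    y_preds = np.array([feedforward(p, x) for x in data])
    return mse_loss(all_y_trues, y_preds)

learn_rate = 0.1      # assumed value, not given in the slides
for iteration in range(1000):
    # Numerical gradient as a stand-in for the analytic backprop derivatives
    grad = np.zeros_like(params)
    eps = 1e-6
    base = loss(params)
    for i in range(len(params)):
        p_plus = params.copy()
        p_plus[i] += eps
        grad[i] = (loss(p_plus) - base) / eps
    params -= learn_rate * grad           # modify weights and biases
    if iteration % 100 == 0:
        print(iteration, loss(params))

# If the final loss is still not acceptable, rebuild the topology
# (add hidden nodes or another hidden layer) and train again.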
First Step: initialize w1, w2, w3, w4, w5, w6, b1, b2, b3

# Weights
self.w1 = np.random.normal()
self.w2 = np.random.normal()
self.w3 = np.random.normal()
self.w4 = np.random.normal()
self.w5 = np.random.normal()
self.w6 = np.random.normal()

# Biases
self.b1 = np.random.normal()
self.b2 = np.random.normal()
self.b3 = np.random.normal()

Example values drawn for this run:
w1 = -0.17078787256065822
w2 = 0.8018910260223238
w3 = 2.042028648489558
w4 = 0.9472266245782457
w5 = 0.11610745255156371
w6 = -0.04474672280574983
b1 = 0.049210103523584146
b2 = -0.7372822297715569
b3 = 0.6148445824873799
Second Step: Determine hidden values (h1, h2) and output (o1)

x1 = w1 * weight + w2 * height + b1,   h1 = f(x1)
x2 = w3 * weight + w4 * height + b2,   h2 = f(x2)
x3 = w5 * h1 + w6 * h2 + b3,           o1 = f(x3)

def feedforward(self, x):


# x is a numpy array with 2 elements.
h1 = sigmoid(self.w1 * x[0] + self.w2 * x[1] + self.b1)
h2 = sigmoid(self.w3 * x[0] + self.w4 * x[1] + self.b2)
o1 = sigmoid(self.w5 * h1 + self.w6 * h2 + self.b3)
return o1
Apply the feedforward to the first row: weight = -5, height = -6, y_true = 1 (SE = 0 is the initial value).

w1 = -0.170
w2 = 0.802
b1 = 0.049

w3 = 2.042
w4 = 0.947
b2 = -0.737

w5 = 0.116
w6 = -0.045
b3 = 0.615
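Plugging the rounded weights above and the first shifted row (weight = -5, height = -6) into the feedforward calculation reproduces the first y_pred shown on the next slide (a sketch, not slide code):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

w1, w2, b1 = -0.170, 0.802, 0.049
w3, w4, b2 = 2.042, 0.947, -0.737
w5, w6, b3 = 0.116, -0.045, 0.615

weight, height = -5, -6        # first row of the shifted data (y_true = 1)

h1 = sigmoid(w1 * weight + w2 * height + b1)   # ~0.02
h2 = sigmoid(w3 * weight + w4 * height + b2)   # ~0.00
o1 = sigmoid(w5 * h1 + w6 * h2 + b3)           # ~0.65, this is y_pred
print(h1, h2, o1)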
Determine mean squared error (MSE) loss

def mse_loss(y_true, y_pred):


# y_true and y_pred are numpy arrays of the same length.
return ((y_true - y_pred) ** 2).mean()

Ypred Ytrue
0.65 1
0.67 0
0.67 0
0.65 1

MSE = ¼ [(1-0.65)² + (0-0.67)² + (0-0.67)² + (1-0.65)²]

MSE = ¼ (1.1428) ≈ 0.286
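The same numbers run through the mse_loss function above give the same result (a quick check):

import numpy as np

y_true = np.array([1, 0, 0, 1])
y_pred = np.array([0.65, 0.67, 0.67, 0.65])

print(((y_true - y_pred) ** 2).mean())   # ~0.286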
Third Step: Modify Weights And Biases
We write the loss (MSE) as a function of the weights and biases:
MSE = L(w1, w2, w3, w4, w5, w6, b1, b2, b3)

We use partial differentiation to determine how much each weight affects the loss, so that:

new weight = old weight - learning rate * (partial derivative of L with respect to that weight)
Third Step: Modify Weights And Biases
y_pred = o1

L = (y_true - y_pred)²

∂L/∂y_pred = ∂(y_true - y_pred)²/∂y_pred = 2(y_true - y_pred) × (-1) = -2(y_true - y_pred)
Third Step: Modify Weights And Biases
Imagine we wanted to tweak w1. How would the loss L change if we changed w1?
That's a question the partial derivative ∂L/∂w1 can answer. How do we calculate it?
To start, let's rewrite the partial derivative in terms of ∂y_pred/∂w1 instead:

∂L/∂w1 = ∂L/∂y_pred × ∂y_pred/∂w1

where L = (1 - y_pred)², since y_true = 1 in this example.
Third Step: Modify Weights And Biases
We can break down ∂L/∂w1 into several parts we can calculate:

∂L/∂w1 = ∂L/∂y_pred × ∂y_pred/∂w1

Since w1 only affects h1 (not h2), we can write:

∂y_pred/∂w1 = ∂y_pred/∂h1 × ∂h1/∂w1
∂y_pred/∂h1 = w5 × f'(w5·h1 + w6·h2 + b3)
∂h1/∂w1 = x1 × f'(w1·x1 + w2·x2 + b1)

where f' is the derivative of the sigmoid, f'(x) = f(x)·(1 - f(x)).

This system of calculating partial derivatives by working backwards is known as backpropagation, or “backprop”.
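A sketch of that chain rule in code, reusing the rounded weights and the first training row from earlier. The helper deriv_sigmoid implements the sigmoid derivative f'(x) = f(x) * (1 - f(x)) and is not shown on the slides:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def deriv_sigmoid(x):
    # f'(x) = f(x) * (1 - f(x)) for the sigmoid
    fx = sigmoid(x)
    return fx * (1 - fx)

w1, w2, b1 = -0.170, 0.802, 0.049
w3, w4, b2 = 2.042, 0.947, -0.737
w5, w6, b3 = 0.116, -0.045, 0.615
x1, x2, y_true = -5, -6, 1         # first row: weight, height, sex

sum_h1 = w1 * x1 + w2 * x2 + b1
h1 = sigmoid(sum_h1)
sum_h2 = w3 * x1 + w4 * x2 + b2
h2 = sigmoid(sum_h2)
sum_o1 = w5 * h1 + w6 * h2 + b3
y_pred = sigmoid(sum_o1)

# Chain rule: dL/dw1 = dL/dy_pred * dy_pred/dh1 * dh1/dw1
d_L_d_ypred = -2 * (y_true - y_pred)
d_ypred_d_h1 = w5 * deriv_sigmoid(sum_o1)
d_h1_d_w1 = x1 * deriv_sigmoid(sum_h1)
d_L_d_w1 = d_L_d_ypred * d_ypred_d_h1 * d_h1_d_w1
print(d_L_d_w1)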
Third Step: Modify Weights And Biases
The same approach gives the partial derivatives with respect to w2, w3, w4, b1, and b2.
Third Step: Modify Weights And Biases
For w5, the chain is shorter: w5 affects y_pred, which affects the loss L.

So:

∂L/∂w5 = ∂L/∂y_pred × ∂y_pred/∂w5,  with  ∂y_pred/∂w5 = h1 × f'(w5·h1 + w6·h2 + b3)

The same applies to w6 and b3.
