Neural Networks: Introduction & Matlab Examples

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 46

Neural Networks: Introduction & Matlab Examples

Ricardo de Castro
Technische Universität München
Lehrstuhl für Elektrische Antriebssysteme
Agenda
• Introduction & Motivation

• Single-Layer Neural Network


• Fundamentals: neuron, activation function and layer
• Matlab example: constructing & evaluating NN
• Learning algorithms
• Batch solution: least-squares
• Online solution: LMS
• Matlab example: online system identification with NN

• Multi-Layer Neural Network


• Network architecture
• Learning algorithm: backpropagation
• Matlab example: nonlinear fitting with noise
• Overfitting & regularization

• Case Study
• Matlab example: MPC solution via Neural Networks
References

Chapters 2,3, 10 and 11 (aka Deep Learning Toolbox )

[1] Hagan et al. Neural Network Design, 2nd edition, 2014  online version: https://hagan.okstate.edu/nnd.html
[2] Abu-Mostafa et al. Learning from Data, a Short Course, 2012.
[3] Mathworks, Neural Network Toolbox User‘s Guide (2017)
3
Some Problems…
Computer vision Medical Diagnosis
symptoms:
fever, headache,
blood pressure, …
4 diagnosis
biochemical analysis,
age, height, …

Control
Finance (e.g. prediction) (e.g., prediction / system identification)
stockPrice[k],
u[k], u[k-1],… u[k-N],
stockPrice[k-1], stockPrice[k+1] y[k+1]
y[k], y[k-1], …, y[k-M]

stockPrice[k-N]
u = control input, y=output, k=time index

How to build a system that can learn


these tasks?

4
Motivation Example: Credit Approval

Application Information

credit approved

credit rejected

Adapted from “Learning from Data, a Short Course” 5


Learning Framework
Credit Approval

Formalism:

• Input: = , , ,… ∈ ℝ (customer application)

• Output: ∈ 0,1 (0 credit rejected, 1  credit approved)

• Ideal function: =ℎ (ℎ=ideal credit approval function)

• Data: , , , ,…,( , ) (historical records)

• Candidate model: = ( ) (formula to be learned from data, e.g. Neural Network)


Remarks:
• Ideal (credit approval) function is unknown,
• …, but banks have massive amounts of data (customer info, default history, etc… )

Adapted from “Learning from Data, a Short Course” 6


Learning Framework (II)
, =data point
Unknown Function Data Set
input output ,
=ℎ = ( , ,…,( , )}

model output,
Candidate Model
Learning Algorithm
(e.g. Neural Net)

parameters, Adapt s.t.



= ( , )

Main goal: learn (i.e. ≈ ℎ) using data

Adapted from “Learning from Data, a Short Course” 7


Classification vs Regression

Classification Problems

Output  discrete categories ∈ {0,1, . . 9, }

Regression Problems ( )
( − 1)
Output  continuous variable ∈ [0, ∞)

( − 1)

8
Neuron (Single Input)

= ( + )

Input Output
• ∈ ℝ • ∈ ℝ

Parameters Net Input Activation/Transfer


• weight: ∈ ℝ • ∈ ℝ Function
• bias : ∈ ℝ • : ℝ → ℝ
10
Neuron: Activation Functions

Linear Hard Limit

0 < 0
= =
1 ≥ 0

Matlab: purelin Matlab: hardlim

Log-Sigmoid Hyperbolic Tangent


Sigmoid

1
= −
1+ =
+

Matlab: logsig Matlab: tansig

11
Neuron: Multiple Inputs

element-wise representation vector representation

= ( ) = ( + )

Net input: = , + , +⋯ , + Net input: = +


Weights: , , , ,…, , Weight vector:
Bias: = , , … ,

Bias:
Inputs: , ,…,
Inputs: = …
Number of inputs:
12
Single-Layer Neural Network (NN)

element-wise representation

= , + , +⋯ , +

= , + , +⋯ , +


= , + , +⋯ , +

Notation:
Number of inputs= Number of outputs= = weight connecting neuron
,
Remark: each of the inputs is connected to a with input
neuron through a weight ,
13
Single-Layer Neural Network (NN)

vector representation
= ( + )

= number of inputs
= number of outputs

14
Create and Evaluate Single-Layer NN (I)

Matlab Code
P = [1;
2]; % data set - input (R=2)
Y = 1; % data set - output (S=1)

net = linearlayer;
% Creates a single layer of linear neurons
% OUTPUT: net = neural network object

net = configure(net, P, Y);


% Configures network dimensions (inputs, outputs, weight,
% bias, etc) to match the size of data set (P,Y)
% INPUT
% net= neural network object
% P = [R-by-N] matrix with data set (input)
% Y = [S-by-N] matrix with data set (output)
% OUTPUT
% net= configured neural network object

15
Create and Evaluate Single-Layer NN (II)

Matlab Code
% net = Neural network object
define activation function
net.layers{1}.transferFcn ='hardlim';

net.IW{1,1} = W; % set weights W define W,b


net.b{1} = b; % set bias b

evaluate NN for input(s) P


A= sim(net, P)
% Simulates neural network
% INPUT
% net= neural network object
% P = [R-by-N] input data
% OUTPUT
% A= [S-by-N] neural network output

16
Matlab Example

Consider a two-input network with one neuron and the following parameters:

−1
= = 3 2 b = 1.2
1
Write a Matlab script that computes the output of the network for the following
activation functions:

a) Hard limit
b) Linear
c) Log-Sigmoid

Answers:
a) 1.000
b) 0.200
c) 0.197

17
Learning Algorithm: Problem Setup

Data set: = ( , ,…,( , )},


vector input: ∈ℝ
scalar output: ∈ ℝ

Single-layer linear NN: = +


parameters = Problem: min ( )

Error function: ∗
= ∗

= optimal NN parameters
1
( )= − ( )
2

( )= +

18
Learning Algorithm: Batch Solution
Solution 1:
1
min ( ) = min − ( ) = =
2 1 1
A. vector representation

1
= = 1
min ( ) = min − ( − ) ⋮ , ⋮
1
B. first-order optimality condition
=0

∗ =
Least-squares Solution

Problem: solution might be costly to compute for large data sets

19
Learning Algorithm: On-line Solution
Solution 2:
1
min ( ) = min − ( ) = =
2 1 1
A. Split error function
1 1
min ( ) = min = ( − )
2

B. Sequential gradient descent


=
( ) ( ) =
= −
( ) =
C. Compute gradient

( ) = −( − )
1 1
( ) = ( ) + ( − 1 )
1 aka least-mean squares (LMS) algorithm

Remarks:
• data points are processed one at a time (useful for real-time applications)
• learning rate needs to be chosen with care to ensure algorithm convergence [Hagan, Chap. 10]
20
On-line Learning Algorithm: Matlab API
Matlab Code
%net = Neural network object

net.IW{1,1} = …; % set initial weights


net.b{1} = …; % set initial bias

% setup learning algorithm


net.inputWeights{1,1}.learnFcn = 'learnwh'; define learning algorithm
net.biases{1}.learnFcn = 'learnwh'; (Widrow-Hoff weight/bias learning=LMS)

net.inputWeights{1,1}.learnParam.lr = alpha; define learning rate ( )


net.biases{1,1}.learnParam.lr = alpha;
net.trainFcn = 'trains'; % set sequential/online training

[net] = adapt(net,p,y); % apply 1 steps of the LMS algorithm


% INPUT
% net= neural network object
% p = [R-by-1] data point- input
% y = [S-by-1] data point- output
% OUTPUT
% net= updated neural network object (with new weights and bias)

Remark: simplified API for function adapt() 21


Online System Identification via NN (I)

output
Plant

control
input

-
Neural Network
NN output
Learning
Algorithm

22
Online System Identification via NN (II)

Consider the following discrete-time model of a plant


= + 1.8 − 1 + 0.9 [ − 2]

Goal: design and train a single-layer NN capable of approximating the output of this plant
 Proposed structure for the NN
• Single layer
• Inputs: = −1 −2
• Activation function: linear

Write a Matlab script that:


a) generates a data set for the problem
- assume: = sin (2 ), = 1 , = 0.02 , = 1,2,3, … , 150
b) plots data set
c) creates the NN and define initial weights and bias: = 0 −10 0, = −2
d) adapts the weight and bias of the NN in order to approximate the plant’s output
- option 1: via adapt(.); option 2: manual implementation of the LMS algorithm
- learning rate = 0.2
e) plots i) plant’s output, NN’s output; ii) difference between plant’s output and NN’s output
23
Online System Identification via NN (III)

Results:

a) Training Data b) Fitting Results


4 y(true)
y(NN prediction)
4
u[k] 2
3 u[k-1]
u[k-2] 0
y[k]
2
-2

1 -4
0 50 100 150
0
5
-1

error=y-y(NN)
-2
0

-3

-4 -5
0 50 100 150 0 50 100 150
index index

Final weights/bias
= 6.23 −5.06 3.58 , = 0.0883

24
Multi-Layer Neural Network (I)
Example: 2-layer NN

= number of inputs

Element-wise representation

Layer 1 Layer 2
Number of neurons

Outputs , ,… , ,…
Weights
= 1, … , ; = 1, … , = 1, … , ; = 1, … ,
Bias , = 1, … , , = 1, … , 25

Activation function
Multi-Layer Neural Network (II)
Example: 2-layer NN

= ( + ) = +

Vector representation
Layer 1 Layer 2
Number of neurons

Outputs

Weights

Bias

Activation function

26
Multi-Layer Neural Network (III)
Hidden layer Output layer

= ( + ) = +

= ( + )+
Remarks:

• Layer 1  Hidden/Input layer

• Layer 2  Output layer

• ,  defined by data set (input and output dimensions)

• , ,  defined by the designer

• , , ,  parameters defined by the learning algorithm


27
Learning Algorithm: Problem Setup

Data set: = ( , ,…,( , )},


vector input: ∈ℝ
scalar output: ∈ ℝ

Two-layer NN:
= , = ( + )+
parameters =( , , , ) Problem: min ( )

Error function: ∗
= optimal NN parameters
1
( )= − ( )
2

( )= ,

28
Learning Algorithm: Backpropagation
Solution :
1 = ,
min ( ) = min − ( )
2
A. Split error function
1 1
min ( ) = min = ( − )
2

B. Sequential gradient descent


=
( ) ( ) =
= −
( ) =
C. Compute gradient
( , )
( ) = −( − )

, = ( + )+

Backpropagation algorithm
compute gradients of ( , ) using chain rule
(see [Hagan, Chap. 11] for details) 29
Matlab API: Create a 2-layer NN
Matlab Code

P = [1;
2]; % data set - input (R=2)
Y = 1; % data set - output
= 10 = 2, =1
hiddensize = [10];
net = feedforwardnet(hiddensize);
% Creates a multi-layer neuron network
% INPUT
% hiddensize = row vector with one or more
% hidden layer sizes
% OUTPUT: net = neural network object

net = configure(net, P, Y);


% configures network dimensions (inputs, outputs, weight,
% bias, etc) to match the size of data set (P,Y) 30
Matlab API: Setup activation functions, W, b
Matlab Code

% define activation function, , ,


% weight and bias of Layer 1
net.layers{1}.transferFcn ='logsig';
net.IW{1,1} = … ;% set W1
net.b{1} = …; % set b1

% define activation function,


% weight and bias of Layer 2
net.layers{2}.transferFcn ='purelin';
net.LW{2,1} = …; % set W2
net.b{2} = …; % set b2

% Remark: net.b{i} = bias of layer i; net.LW{i,j} = weights conecting layer i with layer j 31
Matlab API: Training Algorithm
Matlab Code

Data
Set

Learning/Training
, , , Algorithm

net.trainFcn = 'traingd'; % Gradient descent backpropagation


%net.trainFcn = ' trainlm'; % Levenberg-Marquardt backpropagation (default)

net.trainParam.lr = 0.01; % learning rate (\alpha)


net.trainParam.epochs=1000; % Maximum number of epochs to train
net.trainParam.goal=0; % performance goal (stop criteria)
(min acceptable value for ( ) )
net.trainParam.min_grad=1e-5; % Minimum performance gradient (stop critera)
(min acceptable value for ( ))
32
Matlab API: Training Algorithm
Matlab Code

Data
Set

Learning/Training
, , , Algorithm
net.trainParam.showWindow = 0; % deactivate interactive GUI
net.trainParam.showCommandLine=1; % Generate command-line output
net.divideFcn = ’’; % use entire data set for training

[net]=train(net, P,Y);
% train trains a network net according to net.trainFcn and net.trainParam.
% INPUT
% net= neural network object
% P = [R-by-N] data set - input
% Y = [S-by-N] data set - output
% OUTPUT
% net = (trained) neural network output 33
Matlab Example: Nonlinear Fitting

Consider the following system


• ∈ [−1,1]
input:
• output: ∈ ℝ
• function: ℎ = 1 + sin
ℎ • random noise: ∼ (0, )

Gaussian distribution with zero


mean and variance = 0.2
Write a Matlab script that:
a) generates an artificial data set for this problem, i.e.
= { , , } , where ∈ {−1, −0.9, −0.8, … , 1} is the input for the system
b) creates a neural network (NN) with 2 layers;
a) hidden layer with 10 neurons and activation function=logistic sigmoid
b) output layer with 1 neuron and activation function=linear
c) trains the weights of the NN to fit the data set
d) plots
 training results, i.e. the data set and the output of the trained NN
 testing results, i.e. the data set , the (true) function and trained NN evaluated in the
domain ∈ {−1, −0.99, −0.98, −0.97, … , 1}
34
Matlab Example: Nonlinear Fitting (II)

Perfect fitting of training data We might be fitting noise, instead of


the true signal (ℎ)
35
Overfitting

Problem: previous example shown poor generalization, i.e.


• NN fits very well the training data set,
• … but produces large error when faced with new inputs

Solution*: Regularization
• Idea: penalize large values of NN parameters (weights/bias)

Problem: min (1 − ) +
regularization term =( , , , ,…)

• ∈ [0,1] → controls the importance of regularlization vs fitting


• larger → smoother fitting of neural network

Matlab Code
net.performParam.regularization = 0.1; % sets value of lambda( )

*Additional solutions: early stopping, reduce number of neurons (see Hagen,Chap.13, for details)
36
Matlab Example: Nonlinear Fitting (III)

(Continuation of previous problem)


• ∈ [−1,1]
input:
• output: ∈ ℝ
• function: ℎ = 1 + sin
ℎ • random noise: ∼ (0, )

Gaussian distribution with zero


mean and variance = 0.2

e) adapt the previous Matlab script in order to train neural network with regularization
• use = 10

f) Plot training and testing results for both cases

37
Matlab Example: Nonlinear Fitting (IV)

=0

= 10

38
Matlab API: Auxiliary Commands

Matlab Code
gensim(net);
% net = neural network object

Generation of Simulink block


for neural network simulation

view(net);

Generation of a
graphical view

39
MPC-NN: Motivation

Consider the following discrete-time model of a mechanical system

Previous lecture: the MPC Toolbox was employed to design a controller that brings the
state of this system to the origin while fulfilling input and state constraints.

Issue: the computational time of the MPC is too high due to the numerical optimization
routines

Task: design a NN that learns the MPC’s control law, enabling us to quickly compute the
optimal control action

40
MPC-NN: Design Approach
1) Design MPC Controller
control
input states
MPC Plant

2) Design NN
MPC
-
error Training
Algorithm
Neural Network

parameters

3) Replace MPC with NN


control
input states
Neural Network Plant

41
MPC-NN: Matlab Example

Write a Matlab script that:


a) generates data set* where

b) plots training data


c) trains a neural network to learn the MPC‘s control law using the following settings
 2 layers
 hidden layer: 20 neurons, „logsig“ activation function
 output layer: 1 neuron, linear activation function
d) plots neural network’s fitting of training data
e) exports trained network to Simulink and simulates the closed-loop response of
plant with initial state

*Matlab implementation of MPC controller: i) previous lecture; or ii) https://tinyurl.com/tsfvdyp 42


MPC-NN: Matlab Example (II)

b) training data d) NN fitting

1 2

0.5 1

0
0

u
u

-0.5
-1
-1

-1.5 -2
-1 -1

-0.5 -0.5

-10 0 -10
0 -5
-5
0.5 0 0.5 0
5 x2 5
x2 1 1 10 x1
10 x1

43
MPC-NN: Matlab Example (III)

e) Plant response with NN control

x1

x2

time
44
Summary
Introduction
• Data sets, learning algorithms, candidate models
Key Matlab Functions
Single-Layer Neural Network
• Fundamentals: neuron, activation function, layer linearlayer()
configure()
• Matlab example: constructing & evaluating NN net.layer{i}.transferFcn=...
• Learning algorithms sim()

• Batch solution: least-squares


• Online solution: LMS adapt()

Multi-Layer Neural Network


feedforwardnet()
• Network architecture
• Learning algorithm: backpropagation train()

• Overfitting & regularization net.performParam.


regularization=…
Homework

Consider the following system


• input: = ∈ −1,1 × [−1,1]
• output: ∈ ℝ
ℎ • function: ℎ = 10 +3 +7
• random noise: ∼ 0, , =2

Write a Matlab script that:


a) generates an artificial data set for this problem, i.e.
= , , where is an input sampled from the grid −1, −0.9, . . , 1 × −1, −0.9, . . , 1
b) plots data set (hint use plot3)
c) creates a neural network (NN) with 2 layers;
a) hidden layer with 50 neurons and activation function= hyperbolic tangent sigmoid
b) output layer with 1 neuron and activation function=linear
d) trains the weights and bias of the NN to fit the data set
• Use all data for training; number of epochs = 2000
e) plots training results, i.e. the data set and the output of the trained NN
f) adds a regularization term to the training ( = 0.5) and plots results

46
Homework: Expected Results

= 0.5
= 0.0

30 20

20
10

10
0
y

y
0

-10
-10

-20 -20
1 1
0.5 1 0.5 1
0 0.5 0 0.5
0 0
-0.5 -0.5
-0.5 -0.5
p2 -1 -1 p1 p2 -1 -1 p1

47

You might also like