
Lecture 4

Logistic regression
and neural networks

Machine Learning
Andrey Filchenkov

Lecture plan
• Logistic regression
• Single-layer neural network
• Completeness problem of neural networks
• Multilayer neural networks
• Backpropagation
• Modern neural networks

• The presentation is prepared with materials from K.V. Vorontsov’s course “Machine Learning”.

Machine learning. Lecture 4. Logistic regression and neural networks. 08.06.2016.


Logistic regression

We may want to estimate the probability that an object belongs to a class (discussed in detail in Lecture 5):

$$y = \frac{1}{1 + e^{-\langle w, x \rangle}} = \sigma(\langle w, x \rangle),$$

where $\sigma(z)$ is the logistic (sigmoid) function, which maps $(-\infty, \infty)$ to $(0, 1)$.

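Then the classification model is fitted by minimizing

$$Q(a, T^\ell) = \sum_{i=1}^{\ell} \ln\bigl(1 + \exp(-\langle w, x_i \rangle y_i)\bigr) \to \min_w,$$

where $y_i \in \{-1, +1\}$. This is the logarithmic loss function.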
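
As a minimal sketch of the two formulas on this slide, the sigmoid and the logarithmic loss can be written in plain Python. The helper names (`sigmoid`, `dot`, `log_loss`) and the toy data are assumptions made for illustration; labels are taken to be in {-1, +1} and the last feature is a constant 1 that plays the role of a bias.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dot(w, x):
    """Inner product <w, x>."""
    return sum(wj * xj for wj, xj in zip(w, x))

def log_loss(w, X, y):
    """Q(w) = sum_i ln(1 + exp(-<w, x_i> * y_i)), labels y_i in {-1, +1}."""
    return sum(math.log(1.0 + math.exp(-dot(w, xi) * yi))
               for xi, yi in zip(X, y))

# Toy data, made up for illustration; the second feature is a constant bias input.
X = [[2.0, 1.0], [1.0, 1.0], [-1.0, 1.0], [-2.0, 1.0]]
y = [1, 1, -1, -1]

print(sigmoid(0.0))                 # 0.5
print(log_loss([0.0, 0.0], X, y))   # equals 4 * ln 2 for zero weights
print(log_loss([1.0, 0.0], X, y))   # smaller: this w separates the two classes
```

With zero weights every margin is 0, so each object contributes ln 2 to the loss; a weight vector that separates the classes gives a lower value.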


Logarithmic loss function plot


Gradient descent

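The derivative of the sigmoid is $\sigma'(s) = \sigma(s)\,\sigma(-s)$.

With the margin $M_i(w) = \langle w, x_i \rangle y_i$, the gradient of the logarithmic loss is

$$\nabla Q(w^{[k]}) = -\sum_{i=1}^{\ell} y_i x_i\, \sigma\bigl(-M_i(w^{[k]})\bigr).$$

Gradient descent step (for a single object $x_i$, with step size $\mu$):

$$w^{[k+1]} = w^{[k]} + \mu\, y_i x_i\, \sigma\bigl(-M_i(w^{[k]})\bigr).$$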
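
The step above can be sketched as a small stochastic gradient loop. This is an illustration under assumptions, not the lecture's reference implementation: the function names, the step size `mu=0.5`, the number of epochs and the toy data are all hypothetical, labels are in {-1, +1}, and the last feature is a constant 1 acting as a bias.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def margin(w, x, y):
    """M(w) = <w, x> * y; positive when x is classified correctly."""
    return sum(wj * xj for wj, xj in zip(w, x)) * y

def sgd_logistic(X, y, mu=0.5, epochs=200, seed=0):
    """Stochastic gradient descent on the logarithmic loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            # One step against the gradient:
            # w <- w + mu * y_i * x_i * sigma(-M_i(w))
            g = y[i] * sigmoid(-margin(w, X[i], y[i]))
            w = [wj + mu * g * xj for wj, xj in zip(w, X[i])]
    return w

# Linearly separable toy data, made up for illustration.
X = [[2.0, 1.0], [1.5, 1.0], [-1.0, 1.0], [-2.5, 1.0]]
y = [1, 1, -1, -1]

w = sgd_logistic(X, y)
print(all(margin(w, xi, yi) > 0 for xi, yi in zip(X, y)))
```

The factor `sigmoid(-margin)` is close to 1 for badly misclassified objects and close to 0 for confidently correct ones, so objects with large positive margins barely move the weights.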


Smoothed Hebb’s rule

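Hebb’s rule:
if $-\langle w^{[k]}, x_i \rangle y_i > 0$, then $w^{[k+1]} = w^{[k]} + \mu x_i y_i$.
Threshold indicator $[M_i < 0]$ versus its smoothed version $\sigma(-M_i)$, where $M_i = \langle w, x_i \rangle y_i$ is the margin: the logistic regression step is Hebb’s rule with the indicator replaced by $\sigma(-M_i)$.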


Logistic regression implementation

Python (scikit-learn): LogisticRegression with different solvers

Weka: Logistic


Biological intuition

Generalized McCulloch-Pitts neuron:

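$$a(x, T^\ell) = \sigma\Bigl(\sum_{j=1}^{n} w_j f_j(x) - w_0\Bigr),$$

where $\sigma$ is the activation function, $f_j(x)$ are the $n$ features of the object $x$, $w_j$ are the weights and $w_0$ is the threshold.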
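
A minimal sketch of this neuron in Python, assuming the features are simply the coordinates of `x` (that is, `f_j(x) = x[j]`). The function names and the example weights are made up for illustration.

```python
import math

def neuron(x, w, w0, activation):
    """a(x) = activation(sum_j w_j * f_j(x) - w_0), with f_j(x) = x[j]."""
    s = sum(wj * xj for wj, xj in zip(w, x)) - w0
    return activation(s)

def step(s):
    """Threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1 if s > 0 else 0

def sigmoid(s):
    """Smooth activation with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))

x = [1.0, 0.0, 1.0]
w = [0.5, -1.0, 0.25]

print(neuron(x, w, 0.5, step))      # weighted sum 0.75 - 0.5 = 0.25 > 0, so 1
print(neuron(x, w, 0.5, sigmoid))   # a value strictly between 0.5 and 1
```

Passing the activation as an argument keeps the same weighted-sum code for any choice of activation function.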


Activation functions


Rosenblatt’s rule and Hebb’s rule

Rosenblatt’s rule for weight learning in the {1; 0} classification case: for each object $x_i$, change the weight vector:
$$w^{[k+1]} := w^{[k]} - \eta\,\bigl(a(x_i) - y_i\bigr)\,x_i.$$

Hebb’s rule for weight learning in the {1; −1} classification case: for each object $x_i$, change the weight vector:
if $\langle w^{[k]}, x_i \rangle y_i < 0$, then $w^{[k+1]} := w^{[k]} + \eta\, x_i y_i$.
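
Rosenblatt’s rule can be sketched as a short training loop. This is an illustration under assumptions: the neuron uses a threshold activation with outputs in {0, 1}, the update is taken to be proportional to the object `x_i`, and the names `predict` and `rosenblatt_fit`, the step `eta=1.0` and the choice of logical AND as training data are hypothetical.

```python
def predict(w, x):
    """Threshold neuron with outputs in {0, 1}."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else 0

def rosenblatt_fit(X, y, eta=1.0, max_epochs=100):
    """Apply w <- w - eta * (a(x_i) - y_i) * x_i until no mistakes remain."""
    w = [0.0] * len(X[0])
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi      # a(x_i) - y_i, one of -1, 0, 1
            if err != 0:
                mistakes += 1
                w = [wj - eta * err * xj for wj, xj in zip(w, xi)]
        if mistakes == 0:
            break
    return w

# Logical AND as training data; the third feature is a constant bias input.
X = [[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]]
y = [0, 0, 0, 1]

w = rosenblatt_fit(X, y)
print([predict(w, xi) for xi in X])   # [0, 0, 0, 1]
```

The weights change only on misclassified objects, so on linearly separable data such as this one the loop stops after a finite number of corrections.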


Delta rule

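Let $L(a, x_i) = \bigl(\langle w, x_i \rangle - y_i\bigr)^2$.
Delta rule for weight learning: for each object $x_i$, change the weight vector:
$$w^{[k+1]} := w^{[k]} - \eta\,\bigl(\langle w^{[k]}, x_i \rangle - y_i\bigr)\,x_i$$
(the factor 2 from differentiating the square is absorbed into $\eta$).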
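
A minimal sketch of the delta rule as a loop over objects, assuming a linear output `<w, x_i>` and an update proportional to `x_i`. The function name, the step `eta=0.05`, the number of epochs and the toy targets (generated exactly by y = 2x - 1) are assumptions for illustration.

```python
def delta_rule_fit(X, y, eta=0.05, epochs=500):
    """Apply w <- w - eta * (<w, x_i> - y_i) * x_i for each object in turn."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) - yi   # <w, x_i> - y_i
            w = [wj - eta * err * xj for wj, xj in zip(w, xi)]
    return w

# Targets follow y = 2 * x - 1 exactly; the second feature is a constant bias input.
X = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
y = [-1.0, 1.0, 3.0, 5.0]

w = delta_rule_fit(X, y)
print([round(wj, 3) for wj in w])   # close to [2.0, -1.0]
```

Unlike Rosenblatt’s and Hebb’s rules, the correction here is proportional to the size of the error, so the weights keep moving even for objects on the correct side of the boundary.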


Completeness problem (for neuron)

Basic idea: synthesize combinations of neurons.

Completeness problem: how rich is the family of functions that can be represented by a neural network?

Start with a single neuron.


Logical functions as neural networks

Logical AND
$x_1 \wedge x_2 = [x_1 + x_2 - 3/2 > 0]$

Logical OR
$x_1 \vee x_2 = [x_1 + x_2 - 1/2 > 0]$

Logical NOT
$\neg x_1 = [-x_1 + 1/2 > 0]$
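
These three formulas can be checked directly. The sketch below implements each one as a single threshold neuron over inputs in {0, 1}; the function names are chosen here for illustration.

```python
def threshold(s):
    """Indicator [s > 0]."""
    return 1 if s > 0 else 0

def AND(x1, x2):
    return threshold(x1 + x2 - 1.5)

def OR(x1, x2):
    return threshold(x1 + x2 - 0.5)

def NOT(x1):
    return threshold(-x1 + 0.5)

# Print the truth tables.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, AND(x1, x2), OR(x1, x2))
print(NOT(0), NOT(1))   # 1 0
```

Each gate uses weights of +1 or -1 and differs only in its threshold, which is what makes a single neuron sufficient here.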


Two ways of making it more complex

Example (Minsky): XOR, $x_1 \oplus x_2$, cannot be represented by a single linear threshold neuron.

1. Use a nonlinear transformation:
$x_1 \oplus x_2 = [x_1 + x_2 - 2 x_1 x_2 - 1/2 > 0]$
2. Build a superposition:
$x_1 \oplus x_2 = [(x_1 \vee x_2) - (x_1 \wedge x_2) - 1/2 > 0]$
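
Both constructions can be verified on all four inputs. In this sketch the function names are hypothetical; the first version adds the product feature `x1 * x2` to a single neuron, and the second feeds the outputs of an OR neuron and an AND neuron into a third neuron.

```python
def threshold(s):
    """Indicator [s > 0]."""
    return 1 if s > 0 else 0

def xor_nonlinear(x1, x2):
    """Single neuron with an extra nonlinear feature x1 * x2."""
    return threshold(x1 + x2 - 2 * x1 * x2 - 0.5)

def xor_superposition(x1, x2):
    """Two-layer network: OR and AND neurons feed one output neuron."""
    h_or = threshold(x1 + x2 - 0.5)
    h_and = threshold(x1 + x2 - 1.5)
    return threshold(h_or - h_and - 0.5)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_nonlinear(x1, x2), xor_superposition(x1, x2))
```

The second version is the one that generalizes: it needs no hand-crafted features, only more neurons and one more layer.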


Completeness problem (Boolean functions)

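Completeness problem: how rich is the family of functions that can be represented by a neural network?

DNF Theorem:
Any particular Boolean function can be represented by one and only one full disjunctive normal form.
Since AND, OR and NOT can each be implemented by a neuron, any Boolean function can be represented by a network of neurons.
What about all possible functions?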


Gorban Theorem

Theorem (Gorban, 1998)

Let
• $X$ be a compact space,
• $C(X)$ be the algebra of continuous real-valued functions on $X$,
• $F$ be a linear subspace of $C(X)$ that is closed with respect to a nonlinear continuous function $\varphi$ and contains the constant $1 \in F$,
• $F$ separate the points of $X$.
Then $F$ is dense in $C(X)$.

Multilayer neural network


Any number of layers

Any number of neurons in each layer
Any number of connections between different layers


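Adjusting the weights

Let us use SGD to learn the weights $w = (w_{jh}, w_{hm})$, a real-valued vector of all network weights:

$$w^{[k+1]} = w^{[k]} - \eta \nabla L(w^{[k]}, x_i, y_i),$$

where $L(w, x_i, y_i)$ is the loss function (it depends on the problem being solved).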


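Differentiating a superposition of functions

Output layer: $a^m(x_i) = \sigma_m\Bigl(\sum_h w_{hm} u^h(x_i)\Bigr)$;

Hidden layer: $u^h(x_i) = \sigma_h\Bigl(\sum_j w_{jh} f_j(x_i)\Bigr)$.

Let $L_i(w) = \frac{1}{2} \sum_m \bigl(a^m(x_i) - y_i^m\bigr)^2$.
Find the partial derivatives
$$\frac{\partial L_i(w)}{\partial a^m}; \quad \frac{\partial L_i(w)}{\partial u^h}.$$

Errors on layers

$$\frac{\partial L_i(w)}{\partial a^m} = a^m(x_i) - y_i^m;$$
$\varepsilon_i^m = a^m(x_i) - y_i^m$ is the error on the output layer.

$$\frac{\partial L_i(w)}{\partial u^h} = \sum_m \bigl(a^m(x_i) - y_i^m\bigr)\,\sigma_m'\, w_{hm} = \sum_m \varepsilon_i^m \sigma_m' w_{hm};$$
$\varepsilon_i^h = \sum_m \varepsilon_i^m \sigma_m' w_{hm}$ is the error on the hidden layer.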
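
The error formulas can be turned into a small backpropagation sketch and compared with a finite-difference estimate. Several things here are assumptions rather than part of the slides: both layers use the sigmoid (so its derivative is `a * (1 - a)`), there are no bias terms, the weight values are made up, and the weight gradients `eps^m * sigma'_m * u^h` and `eps^h * sigma'_h * f_j` are the usual chain-rule continuation of the derivatives above.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def forward(W1, W2, f):
    """W1[j][h] = w_jh (input to hidden), W2[h][m] = w_hm (hidden to output)."""
    H, M = len(W2), len(W2[0])
    u = [sigmoid(sum(W1[j][h] * f[j] for j in range(len(f)))) for h in range(H)]
    a = [sigmoid(sum(W2[h][m] * u[h] for h in range(H))) for m in range(M)]
    return u, a

def loss(W1, W2, f, y):
    """L_i(w) = 1/2 * sum_m (a^m - y^m)^2 for one object."""
    _, a = forward(W1, W2, f)
    return 0.5 * sum((am - ym) ** 2 for am, ym in zip(a, y))

def backprop(W1, W2, f, y):
    u, a = forward(W1, W2, f)
    H, M = len(u), len(a)
    eps_out = [a[m] - y[m] for m in range(M)]                      # eps^m
    d_out = [eps_out[m] * a[m] * (1 - a[m]) for m in range(M)]     # eps^m * sigma'_m
    eps_hid = [sum(d_out[m] * W2[h][m] for m in range(M))
               for h in range(H)]                                  # eps^h
    d_hid = [eps_hid[h] * u[h] * (1 - u[h]) for h in range(H)]     # eps^h * sigma'_h
    gW2 = [[d_out[m] * u[h] for m in range(M)] for h in range(H)]  # dL/dw_hm
    gW1 = [[d_hid[h] * f[j] for h in range(H)] for j in range(len(f))]  # dL/dw_jh
    return gW1, gW2

# Made-up network: 3 inputs, 2 hidden neurons, 1 output.
W1 = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
W2 = [[0.3], [-0.6]]
f, y = [1.0, 0.5, -1.0], [1.0]

gW1, gW2 = backprop(W1, W2, f, y)

def numeric_grad(layer, r, c, delta=1e-6):
    """Central finite difference of the loss w.r.t. one weight (layer 1 or 2)."""
    def shifted(d):
        A = [row[:] for row in W1]
        B = [row[:] for row in W2]
        (A if layer == 1 else B)[r][c] += d
        return loss(A, B, f, y)
    return (shifted(delta) - shifted(-delta)) / (2 * delta)

print(abs(gW1[1][0] - numeric_grad(1, 1, 0)) < 1e-7)
print(abs(gW2[1][0] - numeric_grad(2, 1, 0)) < 1e-7)
```

The backward pass reuses the values `u` and `a` from the forward pass, which is why the whole gradient costs about as much as a few evaluations of the network, whereas the finite-difference check needs two evaluations per weight.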


Backpropagation discussion (advantages)

• efficiency: the gradient can be computed in time comparable to the time of computing the network itself;
• can be easily applied for any σ, 𝐿;
• can be applied in dynamic (online) learning;
• not all sample objects have to be used (training on a subsample is possible);
• can be parallelized.


Backpropagation discussion (disadvantages)

• does not always converge;
• can get stuck in local optima;
• the number of neurons in the hidden layer has to be chosen;
• the more connections, the more probable overfitting is;
• “paralysis” of a single neuron and of the whole network.


Plethora of neural networks

There are tens or even hundreds of different kinds of neural networks:

• self-organizing map
• deep learning networks
• recurrent neural networks
• radial basis function networks
• Bayesian neural networks
• modular neural networks
• …