Regularization
Hanoi, 2018
Outline
Overfitting Problem
Regularization
Overfitting Problem (regression)

Under fitting:  h_θ(x) = θ_0 + θ_1 x
OK:             h_θ(x) = θ_0 + θ_1 x + θ_2 x^2
Over fitting:   h_θ(x) = θ_0 + θ_1 x + θ_2 x^2 + θ_3 x^3 + θ_4 x^4 + ...
Overfitting Problem (classification)

Under fitting:  h_θ(x) = g(θ_0 + θ_1 x_1 + θ_2 x_2)
OK:             h_θ(x) = g(θ_0 + θ_1 x_1 + θ_2 x_2 + θ_3 x_1^2 + θ_4 x_2^2 + θ_5 x_1 x_2)
Over fitting:   h_θ(x) = g(θ_0 + θ_1 x_1 + θ_2 x_2 + θ_3 x_1^2 + θ_4 x_2^2 + θ_5 x_1^3 + θ_6 x_2^3 + ...)
Overfitting Problem

Under fitting:
Under fitting refers to a model that can neither model the training
data nor generalize to new data. An under-fit machine learning model
is not a suitable model, and this is obvious from its poor
performance on the training data itself.

Over fitting:
Overfitting happens when a model learns the detail and noise in the
training data to the extent that it negatively impacts the
performance of the model on new data.
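The contrast between the two failure modes can be reproduced numerically. The sketch below (a minimal illustration assuming NumPy; the toy data, seed, and polynomial degrees are mine, not from the slides) fits a low-degree and a high-degree polynomial to the same noisy sample and compares training error with error on unseen points:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy observations of a quadratic ground truth.
true_f = lambda x: 1.0 + 2.0 * x - 1.5 * x ** 2
x_train = np.linspace(-1.0, 1.0, 10)
y_train = true_f(x_train) + rng.normal(scale=0.2, size=x_train.size)
x_test = np.linspace(-0.9, 0.9, 50)   # unseen inputs
y_test = true_f(x_test)

def fit_and_score(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coef = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coef, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coef, x_test) - y_test) ** 2)
    return train_mse, test_mse

train_lo, test_lo = fit_and_score(2)   # the "OK" quadratic model
train_hi, test_hi = fit_and_score(9)   # degree 9 interpolates all 10 points:
                                       # training error collapses toward zero,
                                       # but test error does not follow
```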
Regularization
Overfitting Problem

OK:            h_θ(x) = θ_0 + θ_1 x + θ_2 x^2
Over fitting:  h_θ(x) = θ_0 + θ_1 x + θ_2 x^2 + θ_3 x^3 + θ_4 x^4

Idea: penalize large parameters so that the fitted θ_3 ≈ 0 and θ_4 ≈ 0,
pulling the quartic model back toward the quadratic one.

Regularization: minimize the cost function

    E(θ) = (1/2m) [ Σ_{i=1}^m (h_θ(x^(i)) − y^(i))^2 + λ Σ_{j=1}^n θ_j^2 ]

where λ Σ_{j=1}^n θ_j^2 is the regularization component.
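The cost function above translates directly into code. A minimal sketch, assuming NumPy (the function name and toy data are mine): the bias term θ_0 is excluded from the penalty, matching the sum over j = 1..n:

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """Regularized linear-regression cost:
    E(theta) = (1/2m) [ sum_i (h(x^(i)) - y^(i))^2 + lam * sum_{j>=1} theta_j^2 ].
    X contains the bias column x_0 = 1; theta[0] is not penalized."""
    m = len(y)
    residual = X @ theta - y
    penalty = lam * np.sum(theta[1:] ** 2)
    return (residual @ residual + penalty) / (2 * m)

# Toy data lying exactly on the line y = x, so the residual term is zero
# and the cost is driven entirely by the penalty.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 1.0, 2.0])
theta = np.array([0.0, 1.0])
```

With lam = 0 this reduces to the ordinary least-squares cost; on the perfect fit above, only a nonzero λ makes the cost positive.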
Regularization

Question:
What if λ is set to an extremely large number (too large for our
problem)? Which of the following statements is correct?
1. The algorithm works fine.
2. The algorithm fails to eliminate overfitting.
3. The algorithm results in under fitting.
4. Gradient descent will fail to converge.
Outline
Overfitting Problem
Regularization
2
Regularization with Linear Regression

Regularization: minimize the cost function

    E(θ) = (1/2m) [ Σ_{i=1}^m (h_θ(x^(i)) − y^(i))^2 + λ Σ_{j=1}^n θ_j^2 ]

Gradient descent:

    θ_j := θ_j − α ∂E(θ)/∂θ_j
Regularization with Linear Regression

Gradient descent. Repeat until converged:
{
    θ_0 := θ_0 − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) x_0^(i)

    θ_j := θ_j − α [ (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) x_j^(i) + (λ/m) θ_j ],   j = 1..n

    or equivalently:

    θ_j := θ_j (1 − α λ/m) − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) x_j^(i),   j = 1..n
}
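The update rule above can be sketched as a short NumPy loop (function name, step size, iteration count, and toy data are mine, chosen only for illustration):

```python
import numpy as np

def gd_ridge(X, y, lam, alpha=0.1, iters=1000):
    """Batch gradient descent for regularized linear regression:
      theta_0 := theta_0 - alpha*(1/m)*sum((h - y) * x_0)
      theta_j := theta_j*(1 - alpha*lam/m) - alpha*(1/m)*sum((h - y) * x_j)
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        grad = X.T @ (X @ theta - y) / m    # (1/m) * sum((h - y) * x_j)
        grad[1:] += (lam / m) * theta[1:]   # penalty gradient; bias excluded
        theta -= alpha * grad
    return theta

# Toy data on the line y = 1 + 2x (first column of X is the bias x_0 = 1).
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])
theta_unreg = gd_ridge(X, y, lam=0.0)    # recovers roughly [1, 2]
theta_reg = gd_ridge(X, y, lam=10.0)     # slope shrunk toward 0 by the penalty
```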
Regularization with Linear Regression
(X X ) X Y
T 1 T
0 0 ... 0
0 1 0 0 1 T
(X X
T
) X Y
... ... 1 ...
0 0 0 1
15
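The closed-form solution is a few lines of NumPy. A minimal sketch (the function name and toy data are mine); note the (0,0) entry of M is zeroed so the bias is not penalized:

```python
import numpy as np

def ridge_normal_equation(X, y, lam):
    """Closed-form theta = (X^T X + lam*M)^(-1) X^T y, where M is the
    identity with its (0,0) entry set to 0 (theta_0 unregularized)."""
    n = X.shape[1]
    M = np.eye(n)
    M[0, 0] = 0.0
    # Solve the linear system instead of forming the inverse explicitly.
    return np.linalg.solve(X.T @ X + lam * M, X.T @ y)

# Same toy data as before: y = 1 + 2x with a bias column.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])
```

With lam = 0 this is the ordinary normal equation; increasing λ shrinks the slope while leaving the intercept unpenalized.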
Regularization with Logistic Regression

    E(θ) = −(1/m) Σ_{i=1}^m [ y^(i) log h_θ(x^(i)) + (1 − y^(i)) log(1 − h_θ(x^(i))) ] + (λ/2m) Σ_{j=1}^n θ_j^2

Gradient descent:

    θ_j := θ_j − α ∂E(θ)/∂θ_j
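The regularized cross-entropy cost can be written out directly. A minimal NumPy sketch (names and toy data are mine, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(theta, X, y, lam):
    """Regularized logistic-regression cost: cross-entropy over the m
    examples plus (lam/2m) * sum_{j>=1} theta_j^2 (bias not penalized)."""
    m = len(y)
    h = sigmoid(X @ theta)
    ce = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
    return ce + lam * np.sum(theta[1:] ** 2) / (2 * m)

# Two-point toy set; at theta = 0 every h is 0.5, so the unpenalized
# cost is exactly log 2 regardless of lambda.
X = np.array([[1.0, -1.0], [1.0, 1.0]])
y = np.array([0.0, 1.0])
```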
Regularization with Logistic Regression

Gradient descent. Repeat until converged:
{
    θ_0 := θ_0 − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) x_0^(i)

    θ_j := θ_j − α [ (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) x_j^(i) + (λ/m) θ_j ],   j = 1..n

    or equivalently:

    θ_j := θ_j (1 − α λ/m) − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) x_j^(i),   j = 1..n
}
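The updates have the same shape as in the linear case, with h_θ(x) = g(θᵀx). A sketch below (step size, iteration count, and the linearly separable toy data are my choices); on separable data the unregularized parameters keep growing, while the penalty keeps the regularized ones bounded:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gd_logistic(X, y, lam, alpha=0.5, iters=2000):
    """Batch gradient descent for regularized logistic regression."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ theta) - y) / m
        grad[1:] += (lam / m) * theta[1:]   # penalty gradient; bias excluded
        theta -= alpha * grad
    return theta

# Linearly separable 1-D toy set (bias column plus one feature).
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta_unreg = gd_logistic(X, y, lam=0.0)   # weights keep growing
theta_reg = gd_logistic(X, y, lam=1.0)     # penalty caps the weight norm
```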
Regularization with Logistic Regression
t 1 : t H 1 E
1
E m
( h ( x (i )
) y (i )
) x0
i
0 1
E ... m (h( x ) y ) x1 1
(i ) (i ) i
m
...
E
n 1
m
(h( x ) y ) xn n
(i ) (i ) i
19
Regularization with Logistic Regression

Hessian matrix:

    H = (1/m) Σ_{i=1}^m h_θ(x^(i)) (1 − h_θ(x^(i))) x^(i) (x^(i))ᵀ + (λ/m) M

where M is again the identity matrix with its top-left entry set to 0:

    M = [ 0 0 ... 0
          0 1 ... 0
          ...
          0 0 ... 1 ]
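Putting the gradient and Hessian together gives a compact Newton iteration. A sketch assuming NumPy (function name, iteration count, and toy data are mine); with λ > 0 the Hessian stays nonsingular even on separable data:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def newton_logistic(X, y, lam, iters=20):
    """Newton's method for regularized logistic regression:
      theta := theta - H^(-1) grad E
      H = (1/m) sum h(1-h) x x^T + (lam/m) * diag(0, 1, ..., 1)
    """
    m, n = X.shape
    theta = np.zeros(n)
    D = np.eye(n)
    D[0, 0] = 0.0                      # bias row/column of the penalty is zero
    for _ in range(iters):
        h = sigmoid(X @ theta)
        grad = X.T @ (h - y) / m
        grad[1:] += (lam / m) * theta[1:]
        H = (X.T * (h * (1 - h))) @ X / m + (lam / m) * D   # X^T W X + penalty
        theta -= np.linalg.solve(H, grad)
    return theta

# Same separable toy set as in the gradient-descent example.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = newton_logistic(X, y, lam=1.0)
```

Newton's method typically reaches the regularized optimum in a handful of iterations here, where gradient descent needs thousands.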
References

Stanford OpenClassroom, Machine Learning:
http://openclassroom.stanford.edu/MainFolder/CoursePage.php?course=MachineLearning

Andrew Ng slides:
https://datajobs.com/data-science-repo/Generalized-Linear-Models-%5BAndrew-Ng%5D.pdf