Chapter 14
Lecture 14
Contents
1 Introduction
2 Simple Linear Regression
3 Gradient Descent
Introduction
Correlation:
Correlation measures the strength and direction of the linear
relationship between two variables.
The value of the correlation between two variables lies between -1
and 1.
If the two variables move in the same direction, those variables are
said to have a positive correlation. If they move in opposite
directions, they have a negative correlation.
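A quick numeric check (a sketch using NumPy's corrcoef, which is not part of the original slides): variables built to move together or in opposition sit at the two extremes of the correlation range.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 9], dtype=float)
same = 2 * x + 1         # moves in the same direction as x
opposite = -3 * x + 4    # moves in the opposite direction

# corrcoef returns the 2x2 correlation matrix; [0, 1] is corr(x, other)
r_pos = np.corrcoef(x, same)[0, 1]
r_neg = np.corrcoef(x, opposite)[0, 1]
print(r_pos, r_neg)   # a perfect linear relationship gives 1.0 and -1.0
```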
Simple Linear Regression model
The model for linear regression with one predictor variable can be
stated as follows:
y_i = βx_i + α + ε_i , where
y_i is the response for observation i, x_i is the predictor value,
α is the intercept, β is the slope, and ε_i is the random error term.
Simple Linear Regression (Contd.)

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 9]
y = [5, 2, 9, 7, 3]

plt.figure()
sns.regplot(x=x, y=y, fit_reg=True)
plt.scatter(np.mean(x), np.mean(y), color='green')
Simple Linear Regression (Contd.)
β̂ = Correlation(x, y) · σ_y / σ_x
α̂ = ȳ − β̂ · x̄ , where
ȳ = mean of y
x̄ = mean of x
σ_y = standard deviation of y
σ_x = standard deviation of x
Simple Linear Regression (Cont.)
Program to find the least-squares estimates of α and β:
1. By Formula:

import numpy as np
x = np.array([5, 15, 25, 35, 45, 55])
y = np.array([5, 20, 14, 32, 22, 38])

from typing import Tuple
from scratch.linear_algebra import Vector
from scratch.statistics import correlation, standard_deviation, mean

def least_squares_fit(x: Vector, y: Vector) -> Tuple[float, float]:
    """Given two vectors x and y,
    find the least-squares values of alpha and beta"""
    beta = correlation(x, y) * standard_deviation(y) / standard_deviation(x)
    alpha = mean(y) - beta * mean(x)
    return alpha, beta
Simple Linear Regression (Cont.)
Program to predict for any value of x and to find the error:
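The program itself did not survive on this slide; below is a minimal sketch in the same scratch-library style. The helper names predict, error, and sum_of_sqerrors are assumed to match that convention (sum_of_sqerrors is the function the R-square slide later calls).

```python
from typing import List

Vector = List[float]  # assumed alias, matching scratch.linear_algebra

def predict(alpha: float, beta: float, x_i: float) -> float:
    """Predicted y for input x_i under the fitted line."""
    return beta * x_i + alpha

def error(alpha: float, beta: float, x_i: float, y_i: float) -> float:
    """Prediction error for a single observation."""
    return y_i - predict(alpha, beta, x_i)

def sum_of_sqerrors(alpha: float, beta: float, x: Vector, y: Vector) -> float:
    """Total squared prediction error over the whole data set."""
    return sum(error(alpha, beta, x_i, y_i) ** 2
               for x_i, y_i in zip(x, y))

print(sum_of_sqerrors(5.0, 0.5, [2.0, 4.0], [6.0, 8.0]))  # (6-6)^2 + (8-7)^2 = 1.0
```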
Simple Linear Regression (Cont.)

import numpy as np
import statsmodels.api as s

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([5, 20, 14, 32, 22, 38])

x = s.add_constant(x)

model1 = s.OLS(y, x)
result1 = model1.fit()
print(result1.summary())
Simple Linear Regression (Cont.)
Calculation of R-square:

from scratch.statistics import de_mean

def total_sum_of_squares(y: Vector) -> float:
    """the total squared variation of y_i's from their mean"""
    return sum(v ** 2 for v in de_mean(y))

def r_squared(alpha: float, beta: float, x: Vector, y: Vector) -> float:
    """the fraction of variation in y captured by the model, which equals
    1 - the fraction of variation in y not captured by the model"""
    return 1.0 - (sum_of_sqerrors(alpha, beta, x, y) /
                  total_sum_of_squares(y))

rsq = r_squared(alpha, beta, num_friends_good, daily_minutes_good)
assert 0.328 < rsq < 0.330
Simple Linear Regression (Cont.)
R-square:
The R-square value lies between 0 and 1.
An R-square value close to 1 indicates that most of the variance in
the response is explained by the model.
An R-square value close to 0 indicates that the variance of the
response cannot be explained by x.
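For the small data set from the earlier slides, R-square can be computed directly from this definition, reusing the closed-form estimates β̂ = Correlation(x, y)·σ_y/σ_x and α̂ = ȳ − β̂x̄ (a NumPy sketch, not from the original slides):

```python
import numpy as np

x = np.array([5, 15, 25, 35, 45, 55], dtype=float)
y = np.array([5, 20, 14, 32, 22, 38], dtype=float)

# least-squares estimates via the correlation formula
beta = np.corrcoef(x, y)[0, 1] * y.std() / x.std()
alpha = y.mean() - beta * x.mean()

ss_res = np.sum((y - (alpha + beta * x)) ** 2)  # variation not captured by the line
ss_tot = np.sum((y - y.mean()) ** 2)            # total variation of y
r_sq = 1 - ss_res / ss_tot
print(r_sq)   # about 0.716: most of the variance in y is explained by x
```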
Gradient Descent:
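No gradient-descent code survives on these slides. Below is a minimal NumPy sketch, assuming the goal is to minimize the sum of squared errors for the same data set as before; the learning rate and iteration count are illustrative choices, not from the original.

```python
import numpy as np

x = np.array([5, 15, 25, 35, 45, 55], dtype=float)
y = np.array([5, 20, 14, 32, 22, 38], dtype=float)

alpha, beta = 0.0, 0.0
learning_rate = 1e-4

for _ in range(100_000):
    residual = y - (alpha + beta * x)
    grad_alpha = -2 * np.sum(residual)       # d(SSE)/d(alpha)
    grad_beta = -2 * np.sum(residual * x)    # d(SSE)/d(beta)
    alpha -= learning_rate * grad_alpha
    beta -= learning_rate * grad_beta

print(alpha, beta)   # converges to the closed-form estimates, about 5.633 and 0.54
```

The step size matters here: because the x values (and hence the β gradient) are large, a learning rate much above roughly 1e-4 makes the updates diverge instead of converge.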
Maximum Likelihood Estimation:
L(α, β | x_i, y_i, σ) = (1 / (√(2π) σ)) exp(−(y_i − α − βx_i)² / (2σ²))
Maximizing this likelihood over all observations is equivalent to
minimizing the sum of squared errors, so under normally distributed
errors the least-squares estimates of α and β are also the
maximum-likelihood estimates.
Thank You
Any Questions?