
Simple Linear Regression

Lecture 14

Centre for Data Science, ITER


Siksha ‘O’ Anusandhan (Deemed to be University), Bhubaneswar, Odisha, India.

Contents

1 Introduction

2 Simple Linear Regression model

3 Gradient Descent

4 Maximum Likelihood Estimation

Introduction

Correlation:
Correlation measures the strength and direction of the linear
relationship between two variables.
The value of the correlation between two variables lies between -1
and 1.
If the two variables move in the same direction, they are said to
have a positive correlation. If they move in opposite directions,
they have a negative correlation.
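The two extremes of this range can be checked with a minimal NumPy sketch (the data values here are illustrative, not from the slides):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pos = 2 * x + 1      # moves in the same direction as x
y_neg = -3 * x + 10    # moves in the opposite direction

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal
# entry is the correlation between the two variables
r_pos = np.corrcoef(x, y_pos)[0, 1]
r_neg = np.corrcoef(x, y_neg)[0, 1]
```

Here r_pos is +1 (perfect positive correlation) and r_neg is -1 (perfect negative correlation).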

What is the need for linear regression?

Simple Linear Regression model

The model for linear regression with one predictor variable can be
stated as follows:

yi = β xi + α + ϵi , where

yi is the value of the response variable in the ith trial.
α and β are parameters.
xi is known; it is the value of the predictor variable in the ith trial.
ϵi is a random error term, normally distributed with mean E(ϵi) = 0
and variance V(ϵi) = σ².
Cov(ϵi, ϵj) = 0 if i ≠ j, for i = 1, ..., n and j = 1, ..., n.
The above model is called simple because it has a single predictor,
and linear because it is linear in the parameters.
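The model above can be simulated to see how the error term scatters points around the line; a minimal sketch, assuming illustrative values α = 4, β = 3, σ = 2:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, sigma = 4.0, 3.0, 2.0

x = np.arange(1.0, 51.0)                   # predictor values x_i
eps = rng.normal(0.0, sigma, size=x.size)  # errors: E(eps_i) = 0, V(eps_i) = sigma^2
y = beta * x + alpha + eps                 # responses y_i = beta*x_i + alpha + eps_i

# For a reasonably large sample, the mean of the errors is close to 0
mean_eps = eps.mean()
```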

Simple Linear Regression (Contd.)

Examples of Simple Linear Regression model:

Figure 1: first figure
Figure 2: second figure

Simple Linear Regression (Contd.)

Program to plot the regression line and scatter plot:

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 9]
y = [5, 2, 9, 7, 3]

plt.figure()
sns.regplot(x=x, y=y, fit_reg=True)
plt.scatter(np.mean(x), np.mean(y), color='green')

Simple Linear Regression (Contd.)

Least squares estimation of α and β:

β̂ = Correlation(x, y) · σy / σx
α̂ = ȳ − β̂ x̄ , where

ȳ = mean of y
x̄ = mean of x
σy = standard deviation of y
σx = standard deviation of x
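The two formulas above can be checked with a self-contained NumPy sketch, using the same six-point data set as the programs on the following slides:

```python
import numpy as np

x = np.array([5, 15, 25, 35, 45, 55], dtype=float)
y = np.array([5, 20, 14, 32, 22, 38], dtype=float)

# beta_hat = Correlation(x, y) * sigma_y / sigma_x
r = np.corrcoef(x, y)[0, 1]
beta_hat = r * y.std() / x.std()

# alpha_hat = mean(y) - beta_hat * mean(x)
alpha_hat = y.mean() - beta_hat * x.mean()
```

For this data, beta_hat works out to 0.54 and alpha_hat to about 5.633, which the least-squares programs should reproduce.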

Simple Linear Regression (Cont.)
Program to find the least squares estimates of α and β:
1. By formula:

import numpy as np
x = np.array([5, 15, 25, 35, 45, 55])
y = np.array([5, 20, 14, 32, 22, 38])

from typing import Tuple
from scratch.linear_algebra import Vector
from scratch.statistics import correlation, standard_deviation, mean

def least_squares_fit(x: Vector, y: Vector) -> Tuple[float, float]:
    """Given two vectors x and y,
    find the least-squares values of alpha and beta"""
    beta = correlation(x, y) * standard_deviation(y) / standard_deviation(x)
    alpha = mean(y) - beta * mean(x)
    return alpha, beta
Simple Linear Regression (Cont.)
Program to predict for any value of x and to find the error:

def predict(alpha: float, beta: float, x_i: float) -> float:
    return beta * x_i + alpha

def error(alpha: float, beta: float, x_i: float, y_i: float) -> float:
    """The error from predicting beta * x_i + alpha
    when the actual value is y_i"""
    return predict(alpha, beta, x_i) - y_i

from scratch.linear_algebra import Vector

def sum_of_sqerrors(alpha: float, beta: float, x: Vector, y: Vector) -> float:
    return sum(error(alpha, beta, x_i, y_i) ** 2
               for x_i, y_i in zip(x, y))

m1 = least_squares_fit(x, y)
predict(m1[0], m1[1], 20)
E = [error(m1[0], m1[1], x_i, y_i) for x_i, y_i in zip(x, y)]  # error at each point

Simple Linear Regression (Cont.)

Program to find the least squares estimates of α and β:

2. By an inbuilt module of Python:

import numpy as np
import statsmodels.api as s

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([5, 20, 14, 32, 22, 38])

x = s.add_constant(x)   # adds the intercept column

model1 = s.OLS(y, x)
result1 = model1.fit()
print(result1.summary())

Simple Linear Regression (Cont.)

Test the least squares module:

x = [i for i in range(-100, 110, 10)]
y = [3 * i - 5 for i in x]
# Should find that y = 3x - 5
p = least_squares_fit(x, y)
assert least_squares_fit(x, y) == (-5, 3)

Use of assert statements in the least squares method:

from scratch.statistics import num_friends_good, daily_minutes_good

alpha, beta = least_squares_fit(num_friends_good, daily_minutes_good)
assert 22.9 < alpha < 23.0
assert 0.9 < beta < 0.905

Simple Linear Regression (Cont.)
Calculation of R-square:

from scratch.statistics import de_mean

def total_sum_of_squares(y: Vector) -> float:
    """the total squared variation of y_i's from their mean"""
    return sum(v ** 2 for v in de_mean(y))

def r_squared(alpha: float, beta: float, x: Vector, y: Vector) -> float:
    """the fraction of variation in y captured by the model, which equals
    1 - the fraction of variation in y not captured by the model"""
    return 1.0 - (sum_of_sqerrors(alpha, beta, x, y) /
                  total_sum_of_squares(y))

rsq = r_squared(alpha, beta, num_friends_good, daily_minutes_good)
assert 0.328 < rsq < 0.330

Simple Linear Regression (Cont.)

R-square:
The R-square value lies between 0 and 1.
An R-square value close to 1 indicates that most of the variance in
the response is explained by the model.
An R-square value close to 0 indicates that the variance of the
response cannot be explained using x.

Simple Linear Regression (Cont.)

R-square comparison between two plots:

Figure 3: R² close to 0
Figure 4: R² close to 1

Gradient Descent:

Program to estimate the parameters using gradient descent:


import random
import tqdm
from scratch.gradient_descent import gradient_step

num_epochs = 10000
random.seed(0)
guess = [random.random(), random.random()]  # choose random value to start
learning_rate = 0.00001

with tqdm.trange(num_epochs) as t:
    for _ in t:
        alpha, beta = guess
        # Partial derivative of loss with respect to alpha
        grad_a = sum(2 * error(alpha, beta, x_i, y_i)
                     for x_i, y_i in zip(num_friends_good,
                                         daily_minutes_good))

Gradient Descent (Cont.):

Program to estimate the parameters using gradient descent (continued):


        # Partial derivative of loss with respect to beta
        grad_b = sum(2 * error(alpha, beta, x_i, y_i) * x_i
                     for x_i, y_i in zip(num_friends_good,
                                         daily_minutes_good))

        # Compute loss to stick in the tqdm description
        loss = sum_of_sqerrors(alpha, beta,
                               num_friends_good, daily_minutes_good)
        t.set_description(f"loss: {loss:.3f}")

        # Finally, update the guess
        guess = gradient_step(guess, [grad_a, grad_b], -learning_rate)

# We should get pretty much the same results:
alpha, beta = guess
assert 22.9 < alpha < 23.0
assert 0.9 < beta < 0.905
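The same loop can also be run without the scratch library; a self-contained sketch on the six-point data set from the earlier slides (the learning rate and epoch count here are chosen for this data, not taken from the slides):

```python
# Gradient descent for simple linear regression, written in plain Python
x = [5, 15, 25, 35, 45, 55]
y = [5, 20, 14, 32, 22, 38]

alpha, beta = 0.0, 0.0
learning_rate = 0.0001

for _ in range(100_000):
    # Partial derivatives of the sum of squared errors
    grad_a = sum(2 * (beta * x_i + alpha - y_i) for x_i, y_i in zip(x, y))
    grad_b = sum(2 * (beta * x_i + alpha - y_i) * x_i for x_i, y_i in zip(x, y))
    alpha -= learning_rate * grad_a
    beta -= learning_rate * grad_b

# Converges to the least-squares estimates: alpha ≈ 5.633, beta ≈ 0.54
```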

Maximum Likelihood Estimation :

Imagine that we have a sample of data v1, v2, ..., vn that comes
from a distribution that depends on some unknown parameter θ.
If θ is unknown, then we can consider the likelihood of θ given the
sample, that is, L(θ | v1, ..., vn).
Under this approach, the most likely θ is the value that maximizes
this likelihood function.
As per the assumptions of the linear regression model, the errors
are normally distributed with mean 0 and standard deviation σ. So
the likelihood of α and β based on an observed pair (xi, yi) is:

L(α, β | xi, yi, σ) = (1 / √(2πσ²)) exp(−(yi − α − βxi)² / (2σ²))
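Maximizing this likelihood over α and β is equivalent to minimizing the sum of squared errors, because the log-likelihood is −(yi − α − βxi)²/(2σ²) plus a term that does not involve α or β. A minimal numerical check (the data and σ are illustrative; alpha ≈ 5.633, beta = 0.54 are the least-squares estimates for this data):

```python
import math

x = [5, 15, 25, 35, 45, 55]
y = [5, 20, 14, 32, 22, 38]
sigma = 1.0

def log_likelihood(alpha: float, beta: float) -> float:
    # Sum over the sample of log N(y_i | alpha + beta*x_i, sigma^2)
    return sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (y_i - alpha - beta * x_i) ** 2 / (2 * sigma ** 2)
               for x_i, y_i in zip(x, y))

# The least-squares estimates give a higher likelihood than nearby values
ll_best = log_likelihood(5.6333, 0.54)
```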

Thank You
Any Questions?

