Regression Kann Ur 14
BY
DR. ISMAIL B
PROFESSOR
DEPARTMENT OF STATISTICS
MANGALORE UNIVERSITY
MANGALAGANGOTHRI
e-mail: prof.ismailb@gmail.com
Descriptive Statistics
Log linear
Dependent
Regression Models
  1 Explanatory Variable: Simple
    Linear
    NonLinear
  2+ Explanatory Variables: Multiple
    Linear
    NonLinear
Linear Equations
Y = bX + a
b = Slope = (Change in Y) / (Change in X)
a = Y-intercept
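As a quick illustration of the slope as change in Y over change in X, the sketch below recovers b and a from two points on a line. The function name `line_through` and the two sample points are made up for illustration; they do not come from the slides.

```python
def line_through(p1, p2):
    """Return (b, a) for the line Y = bX + a through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical line: the slope is undefined")
    b = (y2 - y1) / (x2 - x1)  # slope = change in Y / change in X
    a = y1 - b * x1            # Y-intercept: the value of Y at X = 0
    return b, a


# Hypothetical points chosen only to make the arithmetic easy to follow
b, a = line_through((1.0, 5.0), (3.0, 9.0))
print(f"Y = {b}X + {a}")
```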
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
$Y_i$: Dependent (Response) Variable
$\beta_1$: Slope
$X_i$: Independent (Explanatory) Variable
Observed Value: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
$\varepsilon_i$ = Random Error
$\mu_{Y/X} = \beta_0 + \beta_1 X_i$, i.e. $E(Y/X)$
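Assumptions
1. $E(\varepsilon_i) = 0$, i.e., the disturbances have zero mean.
2. $V(\varepsilon_i) = \sigma^2$, $i = 1, 2, \ldots, n$, i.e., the disturbances have constant variance.
3. $E(\varepsilon_i \varepsilon_j) = 0$ for $i \neq j$, i.e., the disturbances are uncorrelated.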
$SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$
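Minimizing SSE then gives the OLS estimators
$\hat{\beta}_1 = \sum x_i y_i / \sum x_i^2$, $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$,
where $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$.
In matrix form, with
$X = \begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix}$,
$\hat{\beta} = (X'X)^{-1} X'Y$, $V(\hat{\beta}) = (X'X)^{-1} \sigma^2$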
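The two routes to the OLS estimates, deviations from the means and the matrix formula $(X'X)^{-1}X'Y$, can be sketched as follows. This is a minimal pure-Python illustration, not the author's code: the function names and the five data points are made up, and the 2x2 inverse is written out by hand rather than taken from a linear-algebra library.

```python
def ols_deviation_form(X, Y):
    """OLS for Y_i = b0 + b1*X_i + e_i using deviations from the means."""
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    x = [xi - x_bar for xi in X]  # x_i = X_i - X_bar
    y = [yi - y_bar for yi in Y]  # y_i = Y_i - Y_bar
    b1 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    b0 = y_bar - b1 * x_bar
    return b0, b1


def ols_matrix_form(X, Y):
    """Same estimates from (X'X)^{-1} X'Y with design-matrix rows (1, X_i)."""
    n = len(X)
    sx, sxx = sum(X), sum(v * v for v in X)
    sy, sxy = sum(Y), sum(u * v for u, v in zip(X, Y))
    # X'X = [[n, sx], [sx, sxx]] and X'Y = [sy, sxy]
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("X'X is singular: all X values are identical")
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1


# Made-up data for illustration only (not the household data in the slides)
X = [1, 2, 3, 4, 5]
Y = [2.0, 4.1, 5.9, 8.2, 9.8]
print(ols_deviation_form(X, Y))
print(ols_matrix_form(X, Y))
```

Both functions should agree up to floating-point rounding, since the deviation formulas are what the matrix expression reduces to when there is one regressor and an intercept.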
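$V(\hat{\beta}_1) = \sigma^2 / \sum_{i=1}^{n} x_i^2$
Estimation of $\sigma^2$:
$\hat{\sigma}^2 = s^2 = \sum_{i=1}^{n} e_i^2 / (n - 2)$,
where $e_i = Y_i - \hat{\beta}_{0,OLS} - \hat{\beta}_{1,OLS} X_i$ is the residual.
Testing $H_0: \beta_1 = 0$:
$t_{obs} = \hat{\beta}_{1,OLS} / S.E.(\hat{\beta}_{1,OLS})$, with $S.E.(\hat{\beta}_{1,OLS}) = (s^2 / \sum_{i=1}^{n} x_i^2)^{1/2}$
Confidence interval:
$\hat{\beta}_{1,OLS} \pm t_{\alpha/2,\, n-2} \; se(\hat{\beta}_{1,OLS})$.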
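A minimal sketch of these steps (estimate $s^2$, form the standard error of the slope, compute $t_{obs}$ and a confidence interval) is given below. The function name `slope_inference` and the data are illustrative assumptions, and the critical value `t_crit` is supplied by the caller from a t table rather than computed.

```python
import math


def slope_inference(X, Y, t_crit):
    """Fit Y_i = b0 + b1*X_i + e_i by OLS and test H0: b1 = 0.

    t_crit is the two-sided critical value t_{alpha/2, n-2}, looked up
    by the caller from a t table (it is not computed here).
    """
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in X)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - b0 - b1 * xi for xi, yi in zip(X, Y)]
    s2 = sum(e * e for e in residuals) / (n - 2)  # estimate of sigma^2
    se_b1 = math.sqrt(s2 / sxx)                   # S.E. of the slope
    t_obs = b1 / se_b1
    ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
    return {"b1": b1, "s2": s2, "se_b1": se_b1, "t_obs": t_obs,
            "ci": ci, "reject_H0": abs(t_obs) > t_crit}


# Made-up data for illustration only (not the household data in the slides)
X = [1, 2, 3, 4, 5]
Y = [2.0, 4.1, 5.9, 8.2, 9.8]
result = slope_inference(X, Y, t_crit=3.182)  # t_{0.025, 3} from a t table
print(result)
```

Passing the critical value in keeps the sketch free of third-party dependencies; with a statistics library available, it could be looked up programmatically instead.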
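A measure of fit:
$e_i = Y_i - \hat{Y}_i$, $\sum_{i=1}^{n} e_i = 0$, so $\sum_{i=1}^{n} Y_i = \sum_{i=1}^{n} \hat{Y}_i$.
With $y_i = Y_i - \bar{Y}$ and $\hat{y}_i = \hat{Y}_i - \bar{Y}$,
$\sum y_i^2 = \sum \hat{y}_i^2 + \sum e_i^2$
$R^2 = \sum \hat{y}_i^2 / \sum y_i^2 = 1 - \sum e_i^2 / \sum y_i^2$
Centered $R^2$ is
$R^2 = 1 - \sum_{i=1}^{n} e_i^2 / \sum_{i=1}^{n} (Y_i - \bar{Y})^2$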
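The two expressions for $R^2$ (explained over total variation, and one minus residual over total variation) can be checked against each other numerically. The sketch below uses made-up data and a hypothetical helper name, `r_squared_two_ways`, purely for illustration.

```python
def r_squared_two_ways(X, Y):
    """Return R^2 computed as explained/total and as 1 - residual/total."""
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in X)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in X]
    ss_total = sum((yi - y_bar) ** 2 for yi in Y)           # sum of y_i^2
    ss_explained = sum((fi - y_bar) ** 2 for fi in fitted)  # sum of yhat_i^2
    ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(Y, fitted))
    return ss_explained / ss_total, 1.0 - ss_residual / ss_total


# Made-up data for illustration only (not the household data in the slides)
X = [1, 2, 3, 4, 5]
Y = [2.0, 4.1, 5.9, 8.2, 9.8]
r2_explained, r2_residual = r_squared_two_ways(X, Y)
print(r2_explained, r2_residual)
```

The two values agree up to rounding only because the model contains an intercept, which is what makes the residuals sum to zero and the decomposition hold.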
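Prediction:
$Y_0 = \beta_0 + \beta_1 X_0 + \varepsilon_0$
BLUP of $E(Y_0)$ is
$\hat{Y}_0 = \hat{\beta}_0 + \hat{\beta}_1 X_0$
$V(\hat{Y}_0) = \sigma^2 \left[ \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum x_i^2} \right]$
The 95% interval for the forecast is
$\hat{Y}_0 \pm t_{0.025,\, n-2} \; s \left[ 1 + \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum x_i^2} \right]^{1/2}$
where the leading 1 accounts for the variance of the new disturbance $\varepsilon_0$, and $t_{0.025,\, n-2}$ is the 2.5% critical value of the t-distribution with n - 2 d.f.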
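The forecast and its 95% interval can be sketched in a few lines. This is an illustration under stated assumptions: the function name `prediction_interval`, the data, and the new value `x0` are made up, and the critical value is passed in from a t table.

```python
import math


def prediction_interval(X, Y, x0, t_crit):
    """Point forecast at x0 and the interval Y0_hat +/- t * s * sqrt(1 + 1/n + ...)."""
    n = len(X)
    x_bar, y_bar = sum(X) / n, sum(Y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in X)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(X, Y))
    s = math.sqrt(sse / (n - 2))
    y0_hat = b0 + b1 * x0
    # The leading 1 inside the root is the variance share of the new disturbance
    half_width = t_crit * s * math.sqrt(1.0 + 1.0 / n + (x0 - x_bar) ** 2 / sxx)
    return y0_hat, y0_hat - half_width, y0_hat + half_width


# Made-up data for illustration only (not the household data in the slides)
X = [1, 2, 3, 4, 5]
Y = [2.0, 4.1, 5.9, 8.2, 9.8]
print(prediction_interval(X, Y, x0=6, t_crit=3.182))  # t_{0.025, 3} from a t table
```

The interval is narrowest at `x0` equal to the sample mean of X and widens as `x0` moves away from it, because of the $(X_0 - \bar{X})^2$ term.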
Example:
Annual consumption of 10 households, each selected randomly
from a group of households with a fixed personal disposable income.
Both income and expenditure are measured in 1000 Rs.
Solution:
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where $Y_i$ is consumption and $X_i$ is income.
$\hat{\beta}_1 = 0.8095$
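$V(\hat{\beta}_1) = s^2 / \sum x_i^2 = 0.005941$, $s^2 = 0.311905$
$SE(\hat{\beta}_1) = 0.077078$
$V(\hat{\beta}_0) = s^2 \left[ \frac{1}{n} + \frac{\bar{X}^2}{\sum x_i^2} \right] = 0.365374$
$SE(\hat{\beta}_0) = 0.60446$
$t_{obs} = \hat{\beta}_1 / SE(\hat{\beta}_1) = 10.50$
$R^2 = r^2 = (\sum x_i y_i)^2 / (\sum x_i^2 \sum y_i^2) = 0.9324$
$R^2 = 1 - \sum e_i^2 / \sum y_i^2 = 0.9324$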
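The reported figures can be cross-checked against each other with a few lines of arithmetic. The sketch below uses only numbers quoted in the example; it assumes that 0.005941 and 0.365374 are the estimated variances of the slope and intercept estimators respectively, and that 10.50 is the t-ratio for the slope. The variable names are illustrative.

```python
import math

# Figures quoted in the example (assumed interpretation noted in the text)
beta1_hat = 0.8095    # slope estimate
var_beta1 = 0.005941  # estimated variance of the slope estimator
var_beta0 = 0.365374  # estimated variance of the intercept estimator

se_beta1 = math.sqrt(var_beta1)  # should be close to the reported 0.077078
se_beta0 = math.sqrt(var_beta0)  # should be close to the reported 0.60446
t_obs = beta1_hat / se_beta1     # should be close to the reported 10.50

print(round(se_beta1, 6), round(se_beta0, 5), round(t_obs, 2))
```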
$R^2 = SSR / SST$ = (regression sum of squares) / (total sum of squares)
$Y = a_0 + a_1 X_1 + a_2 X_2 + \cdots + a_n X_n + \varepsilon$
$\varepsilon \sim N(0, \sigma^2)$
$Y$: Response variable
$X_1, \ldots, X_n$: Explanatory variables
$\varepsilon$: Error
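With several explanatory variables, the coefficients can be estimated by solving the normal equations $(X'X)a = X'Y$. The sketch below is a minimal pure-Python version under stated assumptions: the helper names `solve` and `ols_multiple` are hypothetical, the solver is plain Gaussian elimination rather than a library routine, and the data are made up and noise-free so that the true coefficients (1, 2, 3) are recovered.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular (perfectly collinear regressors?)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ols_multiple(rows, Y):
    """Estimate (a0, a1, ...) where each row holds one observation (X1, X2, ...)."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    XtX = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    XtY = [sum(d[i] * y for d, y in zip(design, Y)) for i in range(k)]
    return solve(XtX, XtY)


# Made-up, noise-free data generated from Y = 1 + 2*X1 + 3*X2
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
Y = [1, 3, 4, 6, 8]
print(ols_multiple(rows, Y))
```

For real data a numerically more stable route (for example a QR-based least-squares routine from a linear-algebra library) would usually be preferred over forming $X'X$ explicitly.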
Multiple Regression
with?
Is the model adequate?
[Figure: residual plot, Residual versus Fitted Value]
Remedy?
Logistic Regression
Logistic regression is a form of regression used when the dependent variable is dichotomous
(binary) and the independent variables are of any type.
Continuous variables are not used as the dependent variable.
Logistic regression does not assume a linear relationship between the dependent and
independent variables.
It does not assume normality or homoscedasticity.
It assumes that the observations are independent and that the independent variables are linearly
related to the logit of the dependent variable.
In a scatter plot of the outcome variable (Y) against an independent variable, all points fall on
one of two parallel lines representing Y=0 and Y=1.
This scatter plot does not provide a clear picture of a linear relationship.
In linear regression the quantity E(Y/X) can take any value in the range
$(-\infty, +\infty)$,
whereas in logistic regression E(Y/X) lies in (0,1).
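The contrast between the two ranges can be seen numerically: a linear predictor is unbounded, while passing it through the logistic function gives a value between 0 and 1, and the logit maps that value back. The sketch below is illustrative only; the coefficients `b0` and `b1` are hypothetical, not estimates from any data.

```python
import math


def logistic(z):
    """Map a real number z to a value between 0 and 1 (numerically stable form)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def logit(p):
    """Inverse of the logistic function: log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients, chosen only to show the two ranges
b0, b1 = -3.0, 0.5
for x in (-10, 0, 6, 20):
    linear = b0 + b1 * x    # can be any real number
    prob = logistic(linear)  # stays between 0 and 1
    print(x, linear, round(prob, 4))
```

Here `logit(logistic(z))` returns `z` up to rounding, which is the sense in which the independent variables are linearly related to the logit rather than to the probability itself.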
Thanks !!!