
Regression

Explaining relationships (including causality)

Dependent variable: consumption expenditure (household)

Micro (household)                                      Macro
Factor                     Variable  Data   Sign       Factor / var
Income                     Ok        Ok     +          GDP
Price                      ???       No                CPI
Taste                      No        No
Demand                     No        No
Education of householder   ok        Ok     !!!        ?
Size (of household)        OK        Ok     +          Population
Gender of householder      Ok               ?
Age of householder         Ok               +/-
Weight
Height
Location                   Ok                          Urban/rural
Weather
Occupation / career                                    Interest rate
Literature review

- Topic
- Research question
- Review: background theory; economic theory (journal)
- Model
- Data
- Estimate
- Test / check
- Analyze
- Forecast
- Return to the research question
- Conclusion: summarize what you have done
ŵage_i = 2.23 + 1.654·exp_i
wage_i = 2.23 + 1.654·exp_i + e_i
- 2.23: estimated intercept: when experience is zero, average wage is estimated at 2.23 (units)
- 1.654: estimated slope: when experience increases by 1 unit (1 year), average wage increases by 1.654 units, cet. par.
- *NOTE: in this sample of 5 observations only*

Value                                  5      6      8      9
Overall mean                           7
Deviation from overall mean           -2     -1     +1     +2
Group                                  Under  Under  Grad   Grad
Mean of group                          5.5    5.5    8.5    8.5
Between deviation (group - overall)   -1.5   -1.5   +1.5   +1.5
Within deviation (value - group)      -0.5   +0.5   -0.5   +0.5

Total sum of squares (TSS) = (-2)² + (-1)² + 1² + 2² = 10
Explained SS = (-1.5)² + (-1.5)² + 1.5² + 1.5² = 9
Residual SS = (-0.5)² + 0.5² + (-0.5)² + 0.5² = 1
TSS = Explained SS + Residual SS
R² = Explained SS / Total SS = 9/10 = 90%
90% of total variation in wage is explained by variation in Graduated

Total variation => Total SS


Between variation => Between SS : Explained SS
Within variation => Within SS: Residual SS
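
A minimal R sketch of this decomposition, using the four observations above (values 5, 6, 8, 9 in groups Under, Under, Grad, Grad):

value <- c(5, 6, 8, 9)
group <- factor(c("Under", "Under", "Grad", "Grad"))

overall.mean <- mean(value)                # 7
group.mean   <- ave(value, group)          # 5.5 5.5 8.5 8.5 (group mean per observation)

TSS <- sum((value - overall.mean)^2)       # 10  total SS
BSS <- sum((group.mean - overall.mean)^2)  # 9   between / explained SS
WSS <- sum((value - group.mean)^2)         # 1   within / residual SS

BSS / TSS                                  # R-squared = 0.9 = 90%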

ANOVA Table: Factor => k groups


Source                      SS    Df     MS                  F-stat
Group / Factor (Between)    BSS   k - 1  BMS = BSS/(k - 1)   BMS / WMS
Error / Residual (Within)   WSS   n - k  WMS = WSS/(n - k)
Total                       TSS   n - 1

ANOVA: test for means


In k groups, the means are μ1, μ2, …, μk
H0: μ1 = μ2 = … = μk : the factor does not affect the means
H1: not H0 : the factor affects the means

F-test:
F_stat = [BSS/(k - 1)] / [WSS/(n - k)]
If F_stat > F_crit = f(k - 1, n - k)_α : reject H0
P-value of the test: P(F(k - 1, n - k) > F_stat)
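
A small R helper implementing exactly this test (the function name anova.F is mine, not from the notes). For the training example that follows, anova.F(BSS = 9, WSS = 1, k = 2, n = 4) gives F = 18, F_crit = 18.512 and p = 0.0513:

# F-test for equality of k group means, given the ANOVA sums of squares
anova.F <- function(BSS, WSS, k, n, alpha = 0.05) {
  Fstat <- (BSS / (k - 1)) / (WSS / (n - k))
  Fcrit <- qf(1 - alpha, k - 1, n - k)   # critical value f(k-1, n-k)_alpha
  pval  <- 1 - pf(Fstat, k - 1, n - k)   # P( F(k-1, n-k) > Fstat )
  c(F.stat = Fstat, F.crit = Fcrit, p.value = pval)
}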

Example
n=4 ;Factor: training => k =2 groups
BSS = 9, RSS = 1; TSS = 10

Source     SS   Df   MS           F-stat
Training   9    1    9/1 = 9      9/0.5 = 18
Error      1    2    1/2 = 0.5
Total      10   3

Test
H0: μ_Grad = μ_Under
F_stat = 18; F_crit = f(1, 2)_0.05 = ?
- p[distribution](x) => P(X < x) = F(x)
- d[distribution](x) => f(x)
- q[distribution](β) => x_β : P(X < x_β) = β
- Critical value x_α : P(X > x_α) = α => x_α = q(1 - α)
- f(1, 2)_0.05 = qf(0.95, 1, 2) = 18.512

- F_stat = 18 < 18.512: do not reject H0 => not enough evidence to say that the factor "Graduate" affects the mean of wage

- P-value = P(F(1, 2) > 18) = 1 - pf(18, 1, 2) = 0.0513
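
The same table can be reproduced with aov() (a sketch; the variable names wage and train are mine):

wage  <- c(5, 6, 8, 9)
train <- factor(c("Under", "Under", "Grad", "Grad"))
summary(aov(wage ~ train))
#             Df Sum Sq Mean Sq F value Pr(>F)
# train        1      9     9.0      18 0.0513
# Residuals    2      1     0.5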

If instead BSS = 1 and RSS = 9 (the factor explains little of the variation):

Source     SS   Df   MS           F-stat
Training   1    1    1/1 = 1      1/4.5 = 0.22
Error      9    2    9/2 = 4.5
Total      10   3

F_stat = 0.22; F_crit = 18.512: do not reject H0
P-value = P(F(1, 2) > 0.22) = 1 - pf(0.22, 1, 2) = 0.685

Survey 25 workers in 4 factories, BSS = 24; Residual SS = 14. Test for equality of
means !
Source      SS   Df   MS             F-stat
Factories   24   3    24/3 = 8       8/0.67 = 12
Error       14   21   14/21 = 0.67
Total       38   24

H0: μ1 = μ2 = μ3 = μ4
F_stat = 12; F_crit = f(3, 21)_0.05 = qf(0.95, 3, 21) = 3.07
Reject H0 => the means are not all equal
P-value = 1 - pf(12, 3, 21) = 0.000

2-factor (without interaction)


Source     SS    Df   MS              F-stat
Rows       24    4    24/4 = 6        6/3.67 = 1.64
Columns    21    1    21/1 = 21       21/3.67 = 5.72
Error      114   31   114/31 = 3.67
Total      159   36

H0: μ_R1 = μ_R2 = … = μ_R5 : the factor "row" does not affect the mean
F_stat = 1.64; P-value = 1 - pf(1.64, 4, 31) = 0.189
H0: μ_C1 = μ_C2 : the factor "column" does not affect the mean
F_stat = 5.72; P-value = 1 - pf(5.72, 1, 31) = 0.02

2-factor (with interaction)


Source        SS    Df         MS             F-stat
Rows          24    4          24/4 = 6       6/3.7 = 1.62
Columns       21    1          21/1 = 21      21/3.7 = 5.68
Row*Column    14    4*1 = 4    14/4 = 3.5     3.5/3.7 = 0.95
Error         100   27         100/27 = 3.7
Total         159   36

Test for the effect of Row:
P-value = 1 - pf(1.62, 4, 27) = 0.2
Test for the effect of Column:
P-value = 1 - pf(5.68, 1, 27) = 0.024
Test for the effect of Row*Column (interaction):
P-value = 1 - pf(0.95, 4, 27) = 0.45
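
The raw data behind these two tables are not in the notes, but the p-values follow directly from the F statistics; a sketch in R:

# p-values of the three tests, from the F statistics in the table above
1 - pf(1.62, 4, 27)   # Row:        about 0.20
1 - pf(5.68, 1, 27)   # Column:     about 0.024
1 - pf(0.95, 4, 27)   # Row*Column: about 0.45

# With the raw data in a data frame df (hypothetical) holding y, row, col,
# the full tables would come from:
#   summary(aov(y ~ row + col, data = df))   # without interaction
#   summary(aov(y ~ row * col, data = df))   # with interaction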
R² = 0.961: 96.1% of total variation in wage is explained by the model (by variation in experience).

23 Jan 2024
Example 2.2
Model
wage = β0 + β1·exp + ε

(a) Test:
H0: β1 = 0; H1: β1 ≠ 0
T_stat = (β̂1 - 0)/se(β̂1) = (1.6538 - 0)/0.1923 = 8.6
Critical value: t(n-2)_α/2; at 5%: t(3)_0.025 = [R] qt(0.975, 3) = 3.18
|T_stat| > 3.18: reject H0: the slope is significant at 5%
- P-value = 2·P(T(3) > 8.6) = [R] 2·[1 - pt(8.6, 3)] = 0.0033

“Slope is statistically significant (p = 0.0033)”

(b) Intercept
H0: β0 = 0; H1: β0 ≠ 0
T_stat = (β̂0 - 0)/se(β̂0) = (2.231 - 0)/0.5015 = 4.449

Critical value: t(n-2)_α/2; at 5%: t(3)_0.025 = [R] qt(0.975, 3) = 3.18

|T_stat| > 3.18: reject H0: the intercept is significant at 5%
- P-value = 2·P(T(3) > 4.449) = [R] 2·[1 - pt(4.449, 3)] = 0.021

(c) Confidence interval (CI) for the slope

β1 ∈ β̂1 ± t(n-2)_α/2 · se(β̂1)
(1.6538 ± 3.18 × 0.1923) = (1.042; 2.265)
At 95%, when experience increases by 1 unit (year), on average, the increase in wage lies in (1.042; 2.265) units
- The 95% CI of the average increase in wage when exp increases by 1 unit is (1.042; 2.265)

Test the hypothesis that when exp increases by 1 year, on average, wage increases by less than 2 thousand, and find the p-value.
H0: β1 = 2
H1: β1 < 2
T_stat = (1.6538 - 2)/0.1923 = -1.8
Critical value: -t(3)_0.05 = [R] -qt(0.95, 3) = -2.35
T_stat > -2.35: do not reject H0: not enough evidence that the increase is less than 2
P-value = P(T(3) < -1.8) = [R] pt(-1.8, 3) = 0.085

(d) exp = 6 => ŵage = 2.231 + 1.6538 × 6 = 12.1538

se(pred) = 0.4385 × sqrt(1 + 1/n + (6 - 2.4)²/5.2) = 0.843   (x̄ = 2.4, Σ(x_i - x̄)² = 5.2)
95% CI of the predicted value of wage:
12.1538 ± 3.18 × 0.843 = (9.478; 14.83)
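
A sketch reproducing (c) and (d) in R, using the 5-observation data set that appears later in these notes (year = 1, 2, 2, 3, 4; wage = 4, 6, 5, 7, 9):

year <- c(1, 2, 2, 3, 4)
wage <- c(4, 6, 5, 7, 9)
m <- lm(wage ~ year)

confint(m, "year", level = 0.95)   # (c) slope CI, about (1.042, 2.266)

# (d) point prediction at exp = 6 with a 95% interval for an individual value
predict(m, newdata = data.frame(year = 6),
        interval = "prediction", level = 0.95)
# fit about 12.154, interval about (9.47, 14.84)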

Call:
lm(formula = wage ~ gen)

Residuals:
1 2 3 4 5
-2.3333 -0.3333 -1.0000 1.0000 2.6667

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.0000 1.5635 3.838 0.0312 *
gen 0.3333 2.0184 0.165 0.8793

Dependent Variable: WAGE


Method: Least Squares
Date: 01/23/24 Time: 15:04
Sample: 1 5
Included observations: 5
Variable Coefficient Std. Error t-Statistic Prob.
C 6.000000 1.563472 3.837613 0.0312
GEN 0.333333 2.018434 0.165145 0.8793
R-squared 0.009009 Mean dependent var 6.200000
Adjusted R-squared -0.321321 S.D. dependent var 1.923538
S.E. of regression 2.211083 Akaike info criterion 4.714016
Sum squared resid 14.66667 Schwarz criterion 4.557792
Log likelihood -9.785041 F-statistic 0.027273
Durbin-Watson stat 0.765152 Prob(F-statistic) 0.879331

> year <- c(1,2,2,3,4)


> wage <- c(4,6,5,7,9)
> summary(lm(wage ~ year))

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.2308 0.5015 4.448 0.02113 *
year 1.6538 0.1923 8.600 0.00331 **

Residual standard error: 0.4385 on 3 degrees of freedom


Multiple R-squared: 0.961, Adjusted R-squared: 0.948

wage_i = 2.23 + 1.6538·year_i + e_i
ŵage_i = 2.23 + 1.6538·year_i
R-sq = 0.961
In this sample:
- When year = 0 (staff without experience): average wage is estimated at 2.23
- When experience increases by 1 (year): on average, wage increases by 1.6538, cet. par.
- 96.1% of total variation in wage is explained by the model (variation in year).
Without intercept
> summary(lm(wage ~ 0+ year))

Call:
lm(formula = wage ~ 0 + year)

Coefficients:
Estimate Std. Error t value Pr(>|t|)
year 2.4412 0.1795 13.6 0.000169 ***
Residual standard error: 1.047 on 4 degrees of freedom
Multiple R-squared: 0.9788, Adjusted R-squared: 0.973

wage_i = 2.4412·year_i + e_i
ŵage_i = 2.4412·year_i

27 February 2024

R²: proportion of total variation in the dependent variable that is explained by the model (by variation in all of the explanatory variables).

Adj-R² = 1 - (1 - R²)·(n - 1)/(n - k - 1)
n = 10
(1) Y <- x, z => R² = 0.6 → Adj-R² = 1 - (1 - 0.6)·(10 - 1)/(10 - (2 + 1)) = 0.486
(2) Y <- x, z, w => R² = 0.65 → Adj-R² = 1 - (1 - 0.65)·(10 - 1)/(10 - (3 + 1)) = 0.475
(3) Y <- x, w => R² = 0.62
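
A one-line R check of these adjusted R-squared values (the helper name adj.r2 is mine); by this criterion, model (3) with Adj-R² of about 0.511 comes out highest:

adj.r2 <- function(R2, n, k) 1 - (1 - R2) * (n - 1) / (n - k - 1)

adj.r2(0.60, 10, 2)   # model (1): 0.486
adj.r2(0.65, 10, 3)   # model (2): 0.475
adj.r2(0.62, 10, 2)   # model (3): 0.511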

Compare the linear model with the power (multiplicative) model:

Linear: y = β0 + β1·x + β2·z
- Marginal effect: ∂y/∂x = β1 : constant at any point (x = x0, z = z0)
- Elasticity: ε_x^y = (∂y/∂x)·(x/y) = β1·x / (β0 + β1·x + β2·z) : depends on the point (x0, z0)

Power: y = β0·x^β1·z^β2
- Marginal effect: ∂y/∂x = β0·β1·x^(β1-1)·z^β2 : depends on the point (x0, z0)
- Elasticity: ε_x^y = (∂y/∂x)·(x/y) = β0·β1·x^(β1-1)·z^β2 · x / (β0·x^β1·z^β2) = β1 : constant

With a multiplicative error term:
y = β0·x^β1·z^β2·u
ln y = ln β0 + β1·ln x + β2·ln z + ln u
ln y = β0* + β1·ln x + β2·ln z + ε
Log-log model (log-linear model) !!!!

β1 = elasticity of y with respect to x

dy/y = β1·(dx/x) + β2·(dz/z)

When x increases by 1%, on average, y changes by approximately β1 % (cet. par.)

- β1 ∈ (0, 1): y increases with x at a diminishing rate
- β1 = 1: y is linear in x (increases at a constant rate)
- β1 > 1: y increases with x at an increasing rate
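
A sketch of how such a log-log model would be estimated in R; y, x, z and the data frame df are placeholders, not data from these notes:

# Constant-elasticity (log-log) model: ln y = b0 + b1*ln x + b2*ln z + e
# The estimated coefficient on log(x) is the elasticity of y with respect to x.
m.loglog <- lm(log(y) ~ log(x) + log(z), data = df)
summary(m.loglog)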

Profit: 10 (bil.): increases by 1 percent => 10.1 (bil.)

Interest rate: 10 (%): increases by 1 percent =>
- 11% ?
- 10.1 (%) ? Correct !!!
Increases by 1 percentage point => 11%
Dependent variable: wage. Four estimated models; each row lists coefficient [p-value]:

Const      7.659 [0.001]***   -0.359 [0.994]   -2.177 [0.557]   -12.39 [0.060]
Exp        0.044 [0.936]      0.401 [0.003]    1.960 [0.029]
Exp^2      0.022 [0.516]
Edu        0.832 [0.913]      0.675 [0.030]    1.426 [0.009]
Edu^2      -0.005 [0.985]
Exp*edu    -0.115 [0.066]
R-sq       0.543              0.2103           0.723            0.823
Adj R-sq   0.442              0.035            0.661            0.757
F-stat     5.356 [0.029]      1.198 [0.346]    11.76 [0.003]    12.42 [0.002]

With data on Exp, Edu, Wage, Male

Significance level: 5%
Question Answer
1. Mean of wage

2. Sample variance of experience

3. Covariance of exp and edu

4. Correlation of exp and edu = ?

5. Test for correlation between wage and


experience, then test statistic =?
At 5%:
A. Reject Ho, not correlated
B. Reject Ho, correlated
C. Not reject Ho, not correlated
D. Not reject Ho, correlated
Regress wage on edu, intercept included (Model
[1])
6. Estimated intercept = ?

7. Coefficient of determination =?

8. Test for significance of the slope


A. Reject Ho, slope is insignificant
B. Reject Ho, slope is significant
C. Not reject Ho, insig.
D. Not reject Ho, sig
Transform the above model into log-log form
(Model [2])
9. Estimated slope =

10. The first fitted value =

11. The first residual =

12. Covariance between estimated intercept and


slope
Adding male into model [2], gain [3]
13. Adjusted coefficient of determination

14. At 5%, how many coefficients are significant

15. Estimate the difference between male and


female in average wage
Adding experience into model [3], gain [4]
16. The new variable’s coefficient

17.

18.

19.

20.

21.

22.

23.

24.

25.

26.

27.

28.

29.

30.
31.

32.

33.

34.

35.

36.

37.

rice = β0 + β1·size + β2·income + β3·nrice + ε

Estimate Std. Error t value Pr(>|t|)


(Intercept) 8.950e+02 2.947e+02 3.037 0.00254 **
size 1.106e+03 7.395e+01 14.951 < 2e-16 ***
income -1.559e-04 1.247e-03 -0.125 0.90058
nrice 2.986e-04 4.204e-03 0.071 0.94341

Var Coef Se Standardized coefficient


Intercept 895
Size 1106 0.637
Income -0.000156 -0.007
nrice 0.000299 0.004

rice^S = β1^S·size^S + β2^S·income^S + β3^S·nrice^S + ε^S   (standardized variables, no intercept)

> incomes<- (income - mean(income))/sd(income)


> sizes<- (size - mean(size))/sd(size)
> rices<- (rice - mean(rice))/sd(rice)
> nrices<- (nrice - mean(nrice))/sd(nrice)
> summary(lm(rices ~ 0 + sizes + incomes + nrices ))

Call:
lm(formula = rices ~ 0 + sizes + incomes + nrices)

Residuals:
Min 1Q Median 3Q Max
-2.6381 -0.4252 -0.0988 0.3343 7.5584

Coefficients:
Estimate Std. Error t value Pr(>|t|)
sizes 0.636886 0.042547 14.969 <2e-16 ***
incomes -0.007037 0.056232 -0.125 0.900
nrices 0.004236 0.059573 0.071 0.943
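
The standardized (beta) coefficients can also be recovered from the unstandardized fit by rescaling, beta_s = beta * sd(x)/sd(y); a sketch, assuming rice, size, income and nrice are still in memory:

m <- lm(rice ~ size + income + nrice)

# slope * sd(x) / sd(y) reproduces 0.637, -0.007 and 0.004 from the output above
coef(m)[-1] * c(sd(size), sd(income), sd(nrice)) / sd(rice)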

Regress wage on exp, edu, male (intercept included)
Q: Test the hypothesis that the coefficient of exp equals one
A. Reject Ho, hypothesis is correct
B. Reject Ho, hypothesis is incorrect
C. Not reject Ho, hyp. is correct
D. Not reject Ho, hyp. is incorrect
Test the hypothesis that the coefficient of exp differs from one
A. Reject Ho, hypothesis is correct
B. Reject Ho, hypothesis is incorrect
C. Not reject Ho, hyp. is correct
D. Not reject Ho, hyp. is incorrect
Test the hypothesis that the sum of the coefficients of exp and edu differs from 1.5
A, B, C, D
Test the hypothesis that the sum of slopes equals 3.
A, B, C, D
Add the square of exp into the model.
Test the significance of the new coefficient
A. Reject Ho, coef. is sig.
B. Reject Ho, coef. is insig.
C. Not reject, sig.
D. Not reject, insig.
Test for adding the squares of exp and edu into the model, using an F-test
A. Reject Ho, they should be added
B. Reject Ho, they should not be added
C. Not reject Ho, they should be added
D. Not reject Ho, they should not be added
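
The last question is the standard nested-model F-test; a sketch in R, assuming a data frame dat with wage, exp, edu and male:

# H0: the coefficients of exp^2 and edu^2 are both zero
m.r <- lm(wage ~ exp + edu + male, data = dat)                         # restricted
m.u <- lm(wage ~ exp + edu + male + I(exp^2) + I(edu^2), data = dat)   # unrestricted

anova(m.r, m.u)   # F-test; a small p-value => reject H0 => the squared terms should be added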

Firm A: 2022: K = 400, Y = 1000; 2023: K = 404, Y = 1010
Firm B: 2022: K = 400, Y = 2000; 2023: K = 404, Y = 2010

Absolute effect (both firms): ΔK = 4; ΔY = 10; ΔY/ΔK = 2.5
- When capital increases by 1 unit, output increases by 2.5 units
- ICOR (Incremental Capital-Output Ratio) = ΔK/ΔY = 4/10 = 0.4

Relative effect:
- Firm A: %ΔK = (ΔK/K)·100% = 0.01 = 1%; %ΔY = (ΔY/Y)·100% = 0.01 = 1%
- Firm B: %ΔK = (ΔK/K)·100% = 0.01 = 1%; %ΔY = (ΔY/Y)·100% = 0.005 = 0.5%

Elasticity:
- Firm A: ε_K^Y = %ΔY / %ΔK = 1% / 1% = 1
- Firm B: ε_K^Y = %ΔY / %ΔK = 0.5% / 1% = 0.5

Y = f(K): continuous
- Absolute effect: derivative: Y' = dY/dK → dY = f'(K)·dK
- Relative effect (elasticity):
  ε_K^Y = %dY / %dK = (dY/Y) / (dK/K) = (dY/dK)·(K/Y) = f'(K)·(K/Y)
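
A quick numerical check of ε_K^Y = f'(K)·K/Y; the production function below is a made-up example, not from the notes:

# Example: Y = 2 * K^0.5, so the elasticity should equal the exponent 0.5 at any K
f <- function(K) 2 * K^0.5

K <- 400
h <- 1e-4
fprime <- (f(K + h) - f(K - h)) / (2 * h)   # numerical derivative f'(K)
fprime * K / f(K)                           # about 0.5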
