Econometrics: Department of Marketing Faculty of Business Studies University of Dhaka

Department of Marketing

Faculty of Business Studies


University of Dhaka
Econometrics

Code : 412
Course Teacher : Haripada Bhattacharjee
Professor
Department of Marketing

Made by : Md. Fahad Bin Zahid


Id : 089
Batch : 23rd
Session : 2016-17
Text Book

Basic Econometrics Fifth Edition


Damodar N. Gujarati
Professor Emeritus of Economics, United States Military Academy, West Point
Dawn C. Porter
University of Southern California

Course Outline

1st mid:
1. Nature of regression analysis
2. Nature of two-variable regression
3. Classical linear regression model
4. Dummy variable regression analysis
5. The problem of estimation
6. The problem of inference
7. Non-linear regression models
8. Logit and probit models
9. Time-series estimation
10. Random effect model
11. Nature of simultaneous equations

2nd mid:
12. Testing the assumptions of the regression model
13. Multicollinearity
14. Heteroscedasticity
15. Autocorrelation
16. Model specification
Table of Contents

Econometrics
Methodology of Econometrics
Test of Hypothesis
Assumptions of the Classical Linear Regression Model
Interpretation of the Regression Model
Two-Variable Regression (Single Equation)
Multiple Regression Analysis (Simultaneous Equation Method)
Multiple Regression Analysis (Three Dimensions Method)
Cobb-Douglas Production Function
Autocorrelation / Serial Correlation
Multicollinearity
Heteroscedasticity
i|Page
Department of Marketing, University of Dhaka
Econometrics
Econometrics is the study of economic data using mathematical and statistical analysis.

Economics + Mathematics = Mathematical Economics
Mathematical Economics + Statistics = Econometrics

Methodology of Econometrics
The methodology of econometrics involves the following steps:

1. Developing Economic Theory: a set of relationships among variables that allows accurate prediction in economic analysis. Irrelevant variables must not be included in the theory.
2. Developing a Mathematical Model of the Theory: the model may be linear, non-linear, or log-linear.

y = a + bx

[When the highest power of the independent variable in an equation is one, the equation is linear.]

[When the unknown variable has a highest power of 1, it is a linear equation. If the unknown variable has a highest power of 2, it is a quadratic function, and it has to be converted into linear form by taking logarithms.]

3. Developing an Econometric Model of the Theory: variables that do not enter the mathematical model but still influence the dependent variable are represented by the symbol µ, called the error term (also the residual or stochastic factor). In the econometric model, this error term is developed based upon certain assumptions.

Y = a + bx + µ  (the error term µ is also written e, R, or S)

4. Collection of Data: in most cases, econometrics relies on secondary data. Economic data may be of two types, primary and secondary. Before using any data, its quality should be assessed in terms of reliability and fitness for the purpose at hand.
5. Estimation of the econometric model.
6. Hypothesis Testing:
a) Null hypothesis
b) Alternative hypothesis
In econometric modelling, researchers test the null hypothesis through various probability density functions. There are four techniques for testing the null hypothesis:
a) t-test (where n ≤ 30)
b) Z-test (where n > 30)
c) F-test (where several parameters are tested jointly)
d) Chi-square test (for differences between observed and expected frequencies)
7. Forecasting or prediction with the econometric model for policy purposes.
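Steps 2, 3, and 5 above can be sketched numerically: specify the linear model y = a + bx + µ and estimate a and b by ordinary least squares. The data below are made up for illustration only and are not from the text.

```python
import numpy as np

# Illustrative data (not from the text): x is the independent variable,
# y the dependent variable.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

# Step 5 (estimation): fit the mathematical model y = a + b*x by least squares.
b, a = np.polyfit(x, y, 1)          # np.polyfit returns [slope, intercept]

# Step 3: the error term mu is whatever the fitted line leaves unexplained.
mu = y - (a + b * x)
```

By construction the OLS residuals µ sum to zero, matching the zero-mean assumption on the error term introduced in step 3.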
Single Equation Regression Model

Common synonym pairs for the two sides of the equation:

| Dependent variable | Independent variable |
| Explained variable | Explanatory variable |
| Predictand variable | Predictor variable |
| Regressand variable | Regressor variable |
| Response variable | Stimulus variable |
| Endogenous variable | Exogenous variable |
| Outcome variable | Covariate |
| Controlled variable | Control variable |

Test of Hypothesis:
Procedure for testing a hypothesis.

Step 1: Developing the Null and Alternative Hypotheses

The null hypothesis (H0) is a statement in which no change, difference, or effect is expected.

Step 2: Alternative Hypothesis

The alternative hypothesis (H1) is a statement in which some difference or effect is expected.

Step 3: Identification of One-Tail and Two-Tail Test

A one-tailed test is a test of H0 in which the alternative hypothesis (H1) is expressed directionally. The symbols ≥ or ≤ indicate a one-tailed test.

In a two-tailed test, on the other hand, the null hypothesis (H0) is not expressed directionally; it uses the symbols = or ≠.

Step 4: Identification of Type I and Type II Error

Rejecting a null hypothesis that is in fact true is a Type I error. Accepting (failing to reject) a null hypothesis that is in fact false is a Type II error.

Step 5: Identification of Level of Significance

It is the probability at which H0 is rejected, usually set at 0.05 (corresponding to a 95% confidence level).

Step 6: Identification of Critical Value

The critical value is the cut-off value derived from the standard normal distribution (the Z-table). It should not be confused with the p-value, which is the probability of observing a result at least as extreme as the one obtained.
Step 7: Identification of Degrees of Freedom

Degrees of freedom is the number of values in the calculation that are free to vary: the number of observations less the number of estimated parameters. It varies with the test statistic being applied.

Step 8: Identification and Calculation of the Test Statistic

Four test statistics are frequently used in testing H0:

t-test

Z-test

F-test

Chi-square test

The chi-square test is used when the data are non-metric; the remaining three tests apply to metric data.

Step 9: Comparison of the Calculated Result with the Table Value

If the calculated value is greater than the table value, we reject H0.
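As a small sketch of steps 8 and 9, the fragment below runs a one-sample t-test (n ≤ 30, so the t-test applies) with `scipy.stats`. The sample and hypothesised mean are invented for illustration.

```python
from scipy import stats

# Hypothetical sample (not from the text); H0: population mean = 50, H1: mean != 50.
sample = [54.0, 52.5, 55.1, 53.2, 52.8, 51.9, 54.4, 53.7]
n = len(sample)
alpha = 0.05                                               # level of significance (step 5)

t_calc, p_value = stats.ttest_1samp(sample, popmean=50.0)  # two-tailed test statistic
t_table = stats.t.ppf(1 - alpha / 2, df=n - 1)             # critical (table) value, df = n - 1

# Step 9: reject H0 when the calculated value exceeds the table value.
reject_h0 = abs(t_calc) > t_table
```

Here the calculated t-ratio exceeds the critical value, so H0 is rejected; equivalently, the p-value falls below the chosen significance level.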

Assumptions of the Classical Linear Regression Model

Linearity: the amount of resource used is directly proportional to the level of activity.

Divisibility: fractional values are permitted.

Non-negativity: the decision variables may take only values greater than or equal to zero.

Additivity: the total output is the algebraic sum of the individual outputs.

1. LRM: the Linear Regression Model is linear in the parameters,

Y = a + bx + e  (e: error term / disturbance / stochastic term)

A parameter is the coefficient of an independent variable, and the highest power of each parameter is 1. If the equation is non-linear, it has to be converted into a linear equation by taking logarithms on both sides.

2. The residual variable, denoted e, has zero mean; that is, the expected value of e must always equal 0.
3. The variance of the error term is the same regardless of the value of x. When the variance of the error term is constant, this is called homoscedasticity.
4. There is no autocorrelation between any two error terms. The technique for measuring autocorrelation is the Durbin-Watson (D.W.) test.
5. There is no multicollinearity; that is, the independent variables are not correlated with each other. It can be detected by regressing each independent variable on the others (auxiliary regressions).
6. The number of observations (sample size) must be greater than the number of explanatory variables (x).
7. The independent variables must not contain outliers (abnormal values); e.g., in the series 5, 7, 9, 11, 31 the value 31 is an outlier.
8. The explanatory variables and the error term are uncorrelated.
Interpretation of the Regression Model:

Ŷ = â + b̂x + ê

1. In the regression model, the value of the constant term is of little importance.
2. The parameter b measures the change in the dependent variable in response to a change in the explanatory variable.
3. The significance of the parameter needs to be tested against the null hypothesis.
4. For testing the significance of the parameter, four theoretical density functions can be used: the Z-test, t-test, F-test, and chi-square test, depending on the nature of the problem.
5. A test of significance of the constant term is not usually conducted, as it has little or no economic significance.
6. To test the goodness of fit of the regression model, the coefficient of determination (r²) is calculated. The value of r² lies between 0 and 1; the higher the value of r², the better the model fits.
7. To test the interdependence of the independent variables, the correlation coefficient (r) is calculated. The smaller the correlation among the independent variables, the better the model fits.

Two-Variable Regression (Single Equation)

| X (Ads) | Y (Sales) | x = X − X̄ | x² | y = Y − Ȳ | y² | xy | Ŷ = 4 + 3.6X | e = Y − Ŷ | e² | (Ŷ − Ȳ)² |
| 10 | 45 | 0 | 0 | 5 | 25 | 0 | 40.0 | 5.0 | 25.00 | 0 |
| 9 | 40 | −1 | 1 | 0 | 0 | 0 | 36.4 | 3.6 | 12.96 | 12.96 |
| 11 | 42 | 1 | 1 | 2 | 4 | 2 | 43.6 | −1.6 | 2.56 | 12.96 |
| 12 | 45 | 2 | 4 | 5 | 25 | 10 | 47.2 | −2.2 | 4.84 | 51.84 |
| 8 | 28 | −2 | 4 | −12 | 144 | 24 | 32.8 | −4.8 | 23.04 | 51.84 |
| ∑X = 50 | ∑Y = 200 | 0 | ∑x² = 10 | 0 | ∑y² = 198 | ∑xy = 36 | | | ∑e² = 68.40 | 129.60 |

X̄ = 10, Ȳ = 40

Question:

1. Find the regression equation.
2. If BDT 15 is spent on advertisement, what will be the sales volume?
3. Find the coefficient of determination (goodness of fit) and the correlation.
4. Test the significance of the parameter (β1).
5. Interpret the result.

Answers:

1. The two-variable regression equation is Y = β0 + β1X.

So, the estimated regression equation will be Ŷ = β̂0 + β̂1X.

Here we get,
β̂1 = ∑xy / ∑x² = 36 / 10 = 3.6
β̂0 = Ȳ − β̂1X̄ = 40 − 3.6 × 10 = 4

So the estimated regression equation is
Ŷ = 4 + 3.6X

If no advertisement is given, sales will still be BDT 4. For each additional BDT 1 of advertisement, additional sales of BDT 3.6 are expected.

2. If BDT 15 is spent on advertisement, estimated sales will be
Ŷ = 4 + 3.6 × 15 = 58

3. The coefficient of determination (r²) can be calculated using the formula

r² = β̂1∑xy / ∑y² = (3.6 × 36) / 198 = 0.65

Advertisement thus explains about 65% of the variation in sales; if BDT 15 is spent on advertisement, the predicted sales of BDT 58 carry that degree of explanation.

5|Page
Department of Marketing, University of Dhaka
The relationship between sales and advertisement is given by the coefficient of correlation,
r = √r² = √0.65 = 0.81
The value r = 0.81 indicates a strong positive correlation between sales and advertisement.

4. The significance of the parameter (β1) is tested using its standard error,

SE(β̂1) = √( ∑e² / ((n − k) ∑x²) ) = √( 68.40 / ((5 − 2) × 10) ) = √2.28 = 1.51

To test the significance of this parameter, we calculate the t-ratio,

t = β̂1 / SE(β̂1) = 3.6 / 1.51 = 2.38

5. By the rule of thumb used here, a calculated t-ratio above 2.5 means the parameter is significant. The calculated t-ratio (2.38) is below 2.5, which indicates the parameter is not significant: the estimate that an additional BDT 1 of advertisement raises sales by BDT 3.6 is not statistically significant.
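The hand calculation above can be checked directly; the sketch below reproduces β̂1 = 3.6, β̂0 = 4, r² ≈ 0.65, SE(β̂1) ≈ 1.51 and t ≈ 2.38 from the same five observations.

```python
import numpy as np

# Data from the worked example: advertisement (X) and sales (Y).
X = np.array([10.0, 9, 11, 12, 8])
Y = np.array([45.0, 40, 42, 45, 28])
n, k = len(X), 2

x, y = X - X.mean(), Y - Y.mean()             # deviations from the means
b1 = (x * y).sum() / (x ** 2).sum()           # slope: 36 / 10 = 3.6
b0 = Y.mean() - b1 * X.mean()                 # intercept: 40 - 36 = 4
e = Y - (b0 + b1 * X)                         # residuals

r2 = b1 * (x * y).sum() / (y ** 2).sum()      # coefficient of determination
se_b1 = np.sqrt((e ** 2).sum() / ((n - k) * (x ** 2).sum()))
t_ratio = b1 / se_b1                          # ~2.38, below the 2.5 rule of thumb
```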

Multiple Regression Analysis (Simultaneous Equation Method)

| Family | Savings (S) | Income (Y) | Children (N) | Y² | N² | SY | SN | NY |
| A | 6 | 8 | 5 | 64 | 25 | 48 | 30 | 40 |
| B | 12 | 11 | 2 | 121 | 4 | 132 | 24 | 22 |
| C | 10 | 9 | 1 | 81 | 1 | 90 | 10 | 9 |
| D | 7 | 6 | 3 | 36 | 9 | 42 | 21 | 18 |
| E | 3 | 6 | 4 | 36 | 16 | 18 | 12 | 24 |
| ∑ | 38 | 40 | 15 | 338 | 55 | 330 | 97 | 113 |

S̄ = 7.6, Ȳ = 8, N̄ = 3

Question:

1. Estimate the multiple regression equation through simultaneous equation method.

Solution:

Step 1: The multiple regression equation is

S = β0 + β1Y + β2N + e

Step 2: The estimated equation will be

Ŝ = β̂0 + β̂1Y + β̂2N

Step 3: Applying the least squares method, the three normal equations are

∑S = β̂0 n + β̂1 ∑Y + β̂2 ∑N ……………… (i)
∑SY = β̂0 ∑Y + β̂1 ∑Y² + β̂2 ∑NY ……………… (ii)
∑SN = β̂0 ∑N + β̂1 ∑NY + β̂2 ∑N² ……………… (iii)

Step 4: Putting the values into these three equations we have

38 = 5 β̂0 + 40 β̂1 + 15 β̂2
330 = 40 β̂0 + 338 β̂1 + 113 β̂2
97 = 15 β̂0 + 113 β̂1 + 55 β̂2

Step 5: Multiplying equation (i) by 8 and deducting it from equation (ii):

330 = 40 β̂0 + 338 β̂1 + 113 β̂2
304 = 40 β̂0 + 320 β̂1 + 120 β̂2
26 = 18 β̂1 − 7 β̂2

18 β̂1 − 7 β̂2 = 26 ……… (iv)

Step 6: Again, multiplying equation (i) by 3 and deducting it from equation (iii):

97 = 15 β̂0 + 113 β̂1 + 55 β̂2
114 = 15 β̂0 + 120 β̂1 + 45 β̂2
−17 = −7 β̂1 + 10 β̂2

−7 β̂1 + 10 β̂2 = −17 ……… (v)

Step 7: Multiplying equation (iv) by 10 and equation (v) by 7 and adding:

180 β̂1 − 70 β̂2 = 260
−49 β̂1 + 70 β̂2 = −119
131 β̂1 = 141

β̂1 = 141 / 131
∴ β̂1 = 1.076

Step 8: Putting β̂1 = 1.076 into equation (v) we get:

−7 × 1.076 + 10 β̂2 = −17
⇒ 10 β̂2 = −17 + 7.532 = −9.468
∴ β̂2 = −0.947

Step 9: The equation passes through the mean values,

S̄ = β̂0 + β̂1 Ȳ + β̂2 N̄
⇒ 7.6 = β̂0 + 1.076 × 8 + (−0.947) × 3
⇒ β̂0 = 7.6 − 8.608 + 2.841
∴ β̂0 = 1.833

Our estimated equation stands as

Ŝ = 1.833 + 1.076 Y − 0.947 N

Illustration:

1. BDT 1.833 will be saved even with no income and no children.

2. Savings will rise by BDT 1.076 if income increases by BDT 1.

3. Savings will fall by BDT 0.947 for each additional child.
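Steps 4-9 above amount to solving a 3×3 linear system, so the elimination can be verified by solving the normal equations directly:

```python
import numpy as np

# The three normal equations from Step 4, written as A @ beta = c
# with beta = (b0, b1, b2).
A = np.array([[ 5.0,  40.0,  15.0],
              [40.0, 338.0, 113.0],
              [15.0, 113.0,  55.0]])
c = np.array([38.0, 330.0, 97.0])

b0, b1, b2 = np.linalg.solve(A, c)
# b1 ~ 1.076 (extra saving per unit of income), b2 ~ -0.947 (per child)
```

The solver reproduces the hand-derived coefficients β̂0 ≈ 1.833, β̂1 ≈ 1.076, β̂2 ≈ −0.947.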

Multiple Regression Analysis (Three Dimensions Method)

| Y (Qty.) | X1 (Price) | X2 (Income) | y = Y − Ȳ | y² | x1 | x1² | x2 | x2² | x1y | x2y | x1x2 |
| 100 | 5 | 1000 | 20 | 400 | −1 | 1 | 200 | 40,000 | −20 | 4,000 | −200 |
| 75 | 7 | 600 | −5 | 25 | 1 | 1 | −200 | 40,000 | −5 | 1,000 | −200 |
| 80 | 6 | 1200 | 0 | 0 | 0 | 0 | 400 | 160,000 | 0 | 0 | 0 |
| 70 | 6 | 500 | −10 | 100 | 0 | 0 | −300 | 90,000 | 0 | 3,000 | 0 |
| 50 | 8 | 300 | −30 | 900 | 2 | 4 | −500 | 250,000 | −60 | 15,000 | −1,000 |
| 65 | 7 | 400 | −15 | 225 | 1 | 1 | −400 | 160,000 | −15 | 6,000 | −400 |
| 90 | 5 | 1300 | 10 | 100 | −1 | 1 | 500 | 250,000 | −10 | 5,000 | −500 |
| 100 | 4 | 1100 | 20 | 400 | −2 | 4 | 300 | 90,000 | −40 | 6,000 | −600 |
| 110 | 3 | 1300 | 30 | 900 | −3 | 9 | 500 | 250,000 | −90 | 15,000 | −1,500 |
| 60 | 9 | 300 | −20 | 400 | 3 | 9 | −500 | 250,000 | −60 | 10,000 | −1,500 |
| ∑ 800 | ∑ 60 | ∑ 8,000 | 0 | 3,450 | 0 | 30 | 0 | 1,580,000 | −300 | 65,000 | −5,900 |

Ȳ = 80, X̄1 = 6, X̄2 = 800

Questions:

1. Estimate the multiple regression equation.
2. Find the coefficient of determination (R²).
3. Calculate the correlation coefficient (R).
4. Find the adjusted R² (R̄²).
5. Calculate ∑e².
6. Calculate the variance of the error term (σ̂e²).
7. Calculate the variances of β̂1 and β̂2.
8. Find the standard errors of β̂1 and β̂2.
9. Calculate the t-ratios of β̂1 and β̂2.
10. Draw the conclusion.

Solution

1. We know the multiple regression equation is Y = β0 + β1X1 + β2X2.

So, the estimated multiple regression equation will be Ŷ = β̂0 + β̂1X1 + β̂2X2.

Now,
β̂1 = (∑x1y × ∑x2² − ∑x2y × ∑x1x2) / (∑x1² × ∑x2² − (∑x1x2)²)
= ((−300)(1,580,000) − (65,000)(−5,900)) / ((30)(1,580,000) − (−5,900)²)
= −90,500,000 / 12,590,000
= −7.188

Again,
β̂2 = (∑x2y × ∑x1² − ∑x1y × ∑x1x2) / (∑x1² × ∑x2² − (∑x1x2)²)
= ((65,000)(30) − (−300)(−5,900)) / 12,590,000
= 180,000 / 12,590,000
= 0.0143

And,
β̂0 = Ȳ − β̂1X̄1 − β̂2X̄2
= 80 − (−7.188)(6) − (0.0143)(800)
= 80 + 43.128 − 11.44
= 111.688

The estimated multiple regression equation is

Ŷ = 111.688 − 7.188 X1 + 0.0143 X2

9|Page
Department of Marketing, University of Dhaka
2. Coefficient of determination,
R² = (β̂1∑x1y + β̂2∑x2y) / ∑y²
= ((−7.188)(−300) + (0.0143)(65,000)) / 3,450
= 0.894
That means 89.4% of the variation in quantity is explained by price and income.

3. The correlation coefficient, R = √R²
= √0.894
= 0.945

That means there is a strong positive correlation, because the value is close to 1.

4. The adjusted R-square (R̄²) indicates what would happen if more independent variables were added; its value is always less than the original R². The formula is

R̄² = 1 − (1 − R²) × (n − 1)/(n − k)   [n = number of observations, k = number of variables]
= 1 − (1 − 0.894) × (10 − 1)/(10 − 3)
= 0.863

Since the adjusted value remains close to the original R², the independent variables appear to be rightly identified.

5. Calculation of ∑e²:
∑e² = ∑y² (1 − R²) = 3,450 × (1 − 0.894) = 365.7

6. Variance of the error term:
σ̂e² = ∑e² / (n − 3) = 365.7 / (10 − 3) = 52.24

7. Variances of β̂1 and β̂2:
Var(β̂1) = σ̂e² × ∑x2² / (∑x1² × ∑x2² − (∑x1x2)²) = 52.24 × 1,580,000 / 12,590,000 = 6.56
Var(β̂2) = σ̂e² × ∑x1² / (∑x1² × ∑x2² − (∑x1x2)²) = 52.24 × 30 / 12,590,000 = 0.0001

8. Standard errors of β̂1 and β̂2:
SE(β̂1) = √Var(β̂1) = √6.56 = 2.56
SE(β̂2) = √Var(β̂2) = √0.0001 = 0.01

9. t-ratios of β̂1 and β̂2:
t(β̂1) = β̂1 / SE(β̂1) = −7.188 / 2.56 = −2.81
t(β̂2) = β̂2 / SE(β̂2) = 0.0143 / 0.01 = 1.43

10. Summary of these results:

Ŷ = 111.69 − 7.188 X1 + 0.0143 X2
SE:      (2.56)   (0.01)
t-ratio: −2.81*   1.43
R² = 0.894
R̄² = 0.863

*Significant at the 0.05 level. [If the t-ratio were greater than 3.55, we would consider it significant at the 0.01 level.]

1. Interpretation of the constant term β0 is not needed.
2. The price variable (X1) negatively influences the quantity produced (Q).
3. The income variable (X2) positively affects the output (Q).
4. The standard error of the price coefficient is 2.56, and that of the income coefficient is 0.01.
5. The t-ratio of X1 is statistically significant, but that of X2 is not: output responds mainly to the price factor.
6. Together, price and income explain 89.4% of the variation in output.
7. If more variables were added, the adjusted R² of 86.3% suggests the fit would remain close to the original.
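The deviation-form formulas used in the solution can be checked numerically; the sketch below reproduces β̂1 ≈ −7.188, β̂2 ≈ 0.0143, R² ≈ 0.894 and adjusted R² ≈ 0.864 from the ten observations in the table.

```python
import numpy as np

# Data from the table above: quantity (Y), price (X1) and income (X2).
Y  = np.array([100.0, 75, 80, 70, 50, 65, 90, 100, 110, 60])
X1 = np.array([5.0, 7, 6, 6, 8, 7, 5, 4, 3, 9])
X2 = np.array([1000.0, 600, 1200, 500, 300, 400, 1300, 1100, 1300, 300])
n = len(Y)

# Deviations from the means, exactly as in the table.
y, x1, x2 = Y - Y.mean(), X1 - X1.mean(), X2 - X2.mean()
s11, s22, s12 = (x1**2).sum(), (x2**2).sum(), (x1*x2).sum()
s1y, s2y = (x1*y).sum(), (x2*y).sum()
den = s11 * s22 - s12**2                      # 12,590,000

b1 = (s1y * s22 - s2y * s12) / den            # price coefficient, ~ -7.188
b2 = (s2y * s11 - s1y * s12) / den            # income coefficient, ~ 0.0143
b0 = Y.mean() - b1 * X1.mean() - b2 * X2.mean()

r2 = (b1 * s1y + b2 * s2y) / (y**2).sum()     # ~ 0.894
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 3)     # ~ 0.864
```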

Cobb-Douglas Production Function

| Q | L | K | lnQ | lnL | lnK | y | y² | x2 | x2² | x3 | x3² | x2y | x3y | x2x3 |
| 100 | 1.0 | 2.0 | 4.605 | 0.000 | 0.693 | −0.444 | 0.197 | −0.755 | 0.571 | −0.215 | 0.046 | 0.335 | 0.095 | 0.162 |
| 120 | 1.3 | 2.2 | 4.787 | 0.262 | 0.788 | −0.261 | 0.068 | −0.493 | 0.243 | −0.120 | 0.014 | 0.129 | 0.031 | 0.059 |
| 140 | 1.8 | 2.3 | 4.942 | 0.588 | 0.833 | −0.107 | 0.011 | −0.168 | 0.028 | −0.075 | 0.006 | 0.018 | 0.008 | 0.013 |
| 150 | 2.0 | 1.5 | 5.011 | 0.693 | 0.405 | −0.038 | 0.001 | −0.062 | 0.004 | −0.503 | 0.253 | 0.002 | 0.019 | 0.031 |
| 165 | 2.5 | 2.8 | 5.106 | 0.916 | 1.030 | 0.057 | 0.003 | 0.161 | 0.026 | 0.121 | 0.015 | 0.009 | 0.007 | 0.020 |
| 190 | 3.0 | 3.0 | 5.247 | 1.099 | 1.099 | 0.198 | 0.039 | 0.343 | 0.118 | 0.190 | 0.036 | 0.068 | 0.038 | 0.065 |
| 200 | 3.0 | 3.3 | 5.298 | 1.099 | 1.194 | 0.250 | 0.062 | 0.343 | 0.118 | 0.286 | 0.082 | 0.086 | 0.071 | 0.098 |
| 220 | 4.0 | 3.4 | 5.394 | 1.386 | 1.224 | 0.345 | 0.119 | 0.631 | 0.398 | 0.316 | 0.100 | 0.218 | 0.109 | 0.199 |
| ∑ | | | 40.39 | 6.04 | 7.27 | 0.000 | 0.502 | 0.000 | 1.505 | 0.000 | 0.551 | 0.865 | 0.379 | 0.647 |

Means: ln Q̄ = 5.049, ln L̄ = 0.755, ln K̄ = 0.908. Here y = lnQ − ln Q̄, x2 = lnL − ln L̄, x3 = lnK − ln K̄.

The Cobb-Douglas production function is

Q = A · L^α · K^β

where:
A = constant

L = labour

K = capital

Transforming into natural logs on both sides, we have

ln Q = ln A + α ln L + β ln K

Now,
α = (∑x2y × ∑x3² − ∑x3y × ∑x2x3) / (∑x2² × ∑x3² − (∑x2x3)²)
= (0.865 × 0.551 − 0.379 × 0.647) / (1.505 × 0.551 − (0.647)²)
= 0.5638

β = (∑x3y × ∑x2² − ∑x2y × ∑x2x3) / (∑x2² × ∑x3² − (∑x2x3)²)
= (0.379 × 1.505 − 0.865 × 0.647) / (1.505 × 0.551 − (0.647)²)
= 0.0249

ln A = ln Q̄ − α ln L̄ − β ln K̄
= 5.049 − 0.5638 × 0.755 − 0.0249 × 0.908
= 4.60

So,
ln Q̂ = 4.60 + 0.5638 ln L + 0.0249 ln K


According to the data on labour and capital, the contribution (output elasticity) of labour to growth is 0.5638 or 56.38%, and the contribution of capital is 0.0249 or 2.49%. When all factors of production are increased proportionately and output rises more than proportionately, per-unit cost falls; this is called economies of scale, and the long-run average cost curve is U-shaped.

[Figure: U-shaped long-run average cost curve; the minimum efficient scale (MES) lies where cost per unit becomes constant.]
Here, the values of α and β are:
α = 0.5638
β = 0.0249
α + β = 0.5638 + 0.0249
= 0.5887

Since α + β < 1, the function exhibits decreasing returns to scale: if all inputs are increased proportionately, output increases less than proportionately, so per-unit production cost rises as resources are increased.
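Because the model is linear in the logs, the estimation above is an ordinary least-squares problem; a small least-squares fit reproduces α, β, and ln A:

```python
import numpy as np

# Data from the table above: output Q, labour L, capital K.
Q = np.array([100.0, 120, 140, 150, 165, 190, 200, 220])
L = np.array([1.0, 1.3, 1.8, 2.0, 2.5, 3.0, 3.0, 4.0])
K = np.array([2.0, 2.2, 2.3, 1.5, 2.8, 3.0, 3.3, 3.4])

# ln Q = ln A + alpha*ln L + beta*ln K is linear in the logs,
# so it can be fitted by ordinary least squares.
X = np.column_stack([np.ones_like(Q), np.log(L), np.log(K)])
(lnA, alpha, beta), *_ = np.linalg.lstsq(X, np.log(Q), rcond=None)

returns_to_scale = alpha + beta               # < 1: decreasing returns to scale
```

The fitted values match the hand calculation: α ≈ 0.564, β ≈ 0.025, ln A ≈ 4.60, and α + β < 1.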

Autocorrelation / Serial Correlation

The five important assumptions of the classical regression model are:

1. The regression model is linear: Y = β1 + β2X + µ.

2. X values are fixed in repeated sampling; more specifically, X is assumed to be non-stochastic (conditional regression).

3. Zero mean value of the disturbance µi: E(µi | xi) = 0.

4. Equal variance of µi: the Y populations corresponding to various X values have the same variance (homoscedasticity).

5. No autocorrelation between the disturbances: µi and µj are uncorrelated. This is the assumption of no serial correlation, or no autocorrelation.

Detection of Autocorrelation - Assumptions:

1. The correlation between Ut and Ut−k is called autocorrelation of order k.
2. The correlation between Ut and Ut−1 is called first-order autocorrelation and is denoted by ρ1 (rho). [lag 1]
3. The correlation between Ut and Ut−2 is called second-order autocorrelation and is denoted by ρ2. [lag 2]
4. ρ is the slope coefficient in the regression of µt on µt−1.
5. The mean of the autocorrelated Ut turns out to be zero.
6. The variance of the autocorrelated Ut is a constant value.

The Durbin-Watson Test

A commonly used statistic for detecting autocorrelation is the Durbin-Watson (DW) statistic, denoted by d. It is defined as:

d = ∑(e_t − e_{t−1})² / ∑e_t²    (e/µ = disturbance term)

or, in terms of the estimated disturbances,

d = [∑ from t = 2 to n of (µ̂_t − µ̂_{t−1})²] / [∑ from t = 1 to n of µ̂_t²]

Decision Rule

1. If d is found to be 2, one may assume that there is no first-order autocorrelation, either positive or negative.
2. If ρ̂ = +1, indicating perfect positive correlation in the residuals, d = 0. Therefore, the closer d is to 0, the greater the evidence of positive serial correlation.

3.

| Null hypothesis | Decision | If |
| No positive autocorrelation | Reject | 0 < d < dL |
| No positive autocorrelation | No decision | dL ≤ d ≤ dU |
| No negative autocorrelation | Reject | 4 − dL < d < 4 |
| No negative autocorrelation | No decision | 4 − dU ≤ d ≤ 4 − dL |
| No autocorrelation (positive or negative) | Do not reject | dU < d < 4 − dU |

Example:
Consider the model Yt = α + βXt + µt. Test for autocorrelation with the following observations on Y and X:
X : 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Y : 2 2 1 1 3 5 6 6 10 10 10 12 15 10 11

The estimated slope is β̂ = 0.91, with standard error 0.107 and R² = 0.85.

The Durbin-Watson statistic computed from the residuals is d ≈ 1.4.

The values of dL and dU at the 5% level of significance, with n = 15 and one explanatory variable, are:

dL = 1.09 and dU = 1.36 [see table]

Since d ≈ 1.4 > dU = 1.36, the null hypothesis is accepted. In other words, there is no autocorrelation in the given sample of observations on X and Y.

(1) The null hypothesis is H0: ρ = 0.

The alternative hypothesis is H1: ρ ≠ 0.

(2) Compare the calculated d value with the table value of d, with (n − k) d.f., k being the number of explanatory variables including the constant term.

(3) Two values, an upper bound dU and a lower bound dL, have been assigned to d.

(4) Values of d lie between 0 and 4. If there is no autocorrelation, d ≈ 2 (we accept H0); if d is close to 0 or 4, we reject H0.

(5) The exact critical value of d is never known, but there exists a range of values within which we can neither accept nor reject H0.
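The computation in the example can be sketched as follows; fitting the line by exact least squares gives d ≈ 1.39, which, like the value quoted in the text, exceeds dU = 1.36, so H0 of no autocorrelation is not rejected.

```python
import numpy as np

# X and Y from the example above.
X = np.arange(1.0, 16.0)
Y = np.array([2.0, 2, 1, 1, 3, 5, 6, 6, 10, 10, 10, 12, 15, 10, 11])

b, a = np.polyfit(X, Y, 1)                    # OLS fit of Y on X
e = Y - (a + b * X)                           # residuals

# Durbin-Watson statistic: d = sum((e_t - e_{t-1})^2) / sum(e_t^2)
d = (np.diff(e) ** 2).sum() / (e ** 2).sum()

# With n = 15 and one regressor, dL = 1.09 and dU = 1.36 at the 5% level;
# d exceeds dU, so H0 (no autocorrelation) is not rejected.
```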
Multicollinearity

1. Multicollinearity refers to the existence of high correlation between independent variables.

2. There are no sure methods of detecting collinearity, but there are several indicators of it:

(a) A high R² but regression coefficients insignificant on the t-test.

(b) A high R² but low partial correlations.

(c) The higher the observed chi-square relative to the table value, the more severe the multicollinearity.

(d) If the calculated F-value is greater than the theoretical F-value, there is multicollinearity.

(e) If t* > t (the calculated value exceeds the table value), multicollinearity is the culprit.

3. The chi-square test examines the severity of multicollinearity, the F-test locates which variables are multicollinear, and the t-test detects the pattern of multicollinearity (which variables cause it).
4. The most popular method for detecting multicollinearity is Frisch's Confluence Analysis.

Example:

Year C Y L Pc Po
2001 8 82 17 91 94
2002 9 88 21 93 96
2003 10 99 25 96 97
2004 11 105 29 94 97
2005 12 117 34 100 100
2006 14 131 40 101 101
2007 15 148 44 105 104
2008 17 161 49 112 109
2009 19 174 51 112 111
2010 20 184 53 112 111

C = expenditure on clothing; Y = income; L = assets; Pc = price of clothing; Po = price of other commodities.

The model is C = b0 + b1Y + b2L + b3Pc + b4Po + e, and the estimated equation is

Ĉ = −13.53 + 0.097Y + 0.015L − 0.199Pc + 0.34Po
SE:  (7.5)  (0.03)  (0.05)  (0.09)  (0.15)

R² = 0.998, ∑ŷ² = 248.15, ∑e² = 0.33, d = 3.4

F* = (∑ŷ² / (k − 1)) / (∑e² / (n − k))
= (248.15 / 4) / (0.33 / 5)
= 940

The table value of F with (k − 1) and (n − k) d.f. at the 5% significance level is 5.19. Hence we reject H0: the regression as a whole is highly significant even though the individual coefficients are insignificant, which indicates multicollinearity.
Computing the simple correlation coefficients shows that all the explanatory variables are seriously multicollinear:

rYL = 0.993
rYPc = 0.980
rYPo = 0.987
rLPc = 0.964
rLPo = 0.973
rPcPo = 0.991
To explore the effects of multicollinearity, we compute elementary regressions, adding variables gradually into the function:

| Function | b̂0 | b̂1 (Y) | b̂2 (Pc) | b̂3 (L) | b̂4 (Po) | R² | d |
| C = f(Y) | −1.24 (0.37) | 0.118 (0.002) | | | | 0.995 | 2.6 |
| C = f(Y, Pc) | 1.40 (4.92) | 0.126 (0.01) | −0.036 (0.07) | | | 0.996 | 2.5 |
| C = f(Y, Pc, L) | 0.94 (5.17) | 0.138 (0.02) | −0.034 (0.06) | −0.037 (0.5) | | 0.996 | 3.1 |
| C = f(Y, Pc, L, Po) | −13.53 (7.5) | 0.097 (0.03) | −0.199 (0.09) | 0.015 (0.05) | 0.34 (0.15) | 0.998 | 3.4 |
| C = f(Y, Pc, Po) | −12.76 (6.82) | 0.104 (0.01) | −0.188 (0.07) | | 0.819 | 0.997 | 3.5 |

Note: numbers in brackets are the standard errors of the estimates.

Dropping L and introducing Po, we obtain a better overall fit; R² is slightly increased, so L is clearly a superfluous variable. Thus, the best fit is obtained from the function C = f(Y, Pc, Po).
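The simple correlation coefficients can be reproduced from the data table; any pairwise value near 1 flags collinearity among the regressors. (Computed this way the values come out slightly different from the rounded figures printed above, but all are close to 1.)

```python
import numpy as np

# Regressor data from the example above (income Y, assets L, prices Pc and Po).
Yi = np.array([82.0, 88, 99, 105, 117, 131, 148, 161, 174, 184])
L  = np.array([17.0, 21, 25, 29, 34, 40, 44, 49, 51, 53])
Pc = np.array([91.0, 93, 96, 94, 100, 101, 105, 112, 112, 112])
Po = np.array([94.0, 96, 97, 97, 100, 101, 104, 109, 111, 111])

# Matrix of pairwise correlations among the regressors; values near 1
# indicate that the regressors carry nearly the same information.
R = np.corrcoef(np.vstack([Yi, L, Pc, Po]))
r_YL = R[0, 1]   # correlation between income and assets, ~0.99
```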

Heteroscedasticity

1. One assumption we have made until now is that the errors µi in the regression equation have a common variance σ².
2. The other three assumptions about the errors are:

(a) randomness of the disturbance variable,

(b) zero mean,

(c) normality of the disturbance variable.

3. 'µ' is introduced into the model to take into account various sources of error, such as omitted variables, errors in the mathematical form of the model, errors of measurement of the dependent variable, and erratic human behaviour.
4. Randomness of 'µ' means the influence of omitted variables should not follow a systematic pattern.
5. Zero mean of 'µ' is a conceptual assumption and can never be proved directly. Geometrically, zero mean of 'µ' implies that the observations of Y and X must be scattered around the regression line in a random way.
6. The normality assumption of 'µ' says that the probability distribution of the random term µ remains the same over all observations of X, and in particular that the variance of each µ is the same for all values of the explanatory variable.
7. Heteroscedasticity can be detected by the following tests:

(a) Ramsey's test,

(b) Glejser's test,

(c) Breusch and Pagan's test,

(d) White's test,

(e) Goldfeld and Quandt's test,

(f) the likelihood-ratio test.

8. Among these, the most widely used is the Ramsey (RESET) test.
9. The test involves regressing µ̂i on X², X³ and so on, or regressing µ̂i on Ŷ², Ŷ³, to test whether or not the coefficients are significant.
10. There are three solutions to the heteroscedasticity problem:

(a) use of weighted least squares,

(b) deflating the data by some measure of "size",

(c) transforming the data to logarithmic form.

Detection of Heteroscedasticity:

1. The easiest way to detect its presence is to plot the residuals against the dependent variable and examine the pattern of the residuals.

2. The Spearman rank correlation coefficient can be used to test for heteroscedasticity. In this case, OLS is applied to estimate the equation

Y = α + βx + µ

then the estimated error terms and X are ranked, and the rank correlation coefficient is computed using the formula

r(e,x) = 1 − 6∑D² / (n(n² − 1))

where D is the difference between ranks. If the coefficient is high, there exists a problem of heteroscedasticity.

3. Park suggested a two-stage procedure to test for heteroscedasticity:
i. run the OLS regression and obtain the residuals ei;
ii. run a log-linear regression of et² on X and examine whether β is significant (a significant β suggests heteroscedasticity).

4. The Goldfeld-Quandt test involves estimating two least-squares lines, one using the data with low-variance errors and the other using the data with high-variance errors. If the two residual variances are approximately equal, there is no heteroscedasticity.
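The rank-correlation check in point 2 can be sketched as follows. The data are simulated with an error spread that grows with X, so heteroscedasticity is present by construction; the helper `ranks` and the variable names are mine, not from the text.

```python
import numpy as np

# Simulated data (not from the text): the error spread grows with X,
# so heteroscedasticity is present by construction.
rng = np.random.default_rng(0)
X = np.arange(1.0, 31.0)
Y = 2 + 0.5 * X + rng.normal(0, 0.2 * X)

b, a = np.polyfit(X, Y, 1)                    # step i: OLS regression
e = np.abs(Y - (a + b * X))                   # absolute residuals

def ranks(v):
    # Rank of each element, 1 = smallest (no tie handling; fine for continuous data).
    r = np.empty(len(v))
    r[np.argsort(v)] = np.arange(1, len(v) + 1)
    return r

# Spearman rank correlation: r = 1 - 6*sum(D^2) / (n*(n^2 - 1)).
D = ranks(e) - ranks(X)
n = len(X)
r_spearman = 1 - 6 * (D ** 2).sum() / (n * (n ** 2 - 1))
# A high positive r suggests heteroscedasticity.
```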
