
1 Question 1

1a

Heteroskedasticity arises when the variance of the error term in the model is not constant across observations. A standard error measures the variability of an estimate. Looking at the model output, the residual standard error is very large: about 1,133,000 on 750 degrees of freedom. The coefficient on fitted has an estimate of about −375 but a standard error of about 409,500. Taken together, this suggests the error variance is not constant, meaning the error terms do not have constant variance.

1b

Null hypothesis: The error variances are equal (heteroskedasticity is absent).

Alternative hypothesis: The error variances are not equal (heteroskedasticity is present).

From the auxiliary regression provided, we can use the White test to check for heteroskedasticity. The auxiliary regression gives us an R-squared value, from which we can compute the F-statistic with the formula below. The degrees of freedom are found in the auxiliary regression model, which are 2 and 750. We can reject the null hypothesis and conclude that heteroskedasticity is present if the F-statistic is bigger than the critical value:

F-statistic = (R² / 2) ÷ ((1 − R²) / 750) > F critical value (at the 5% significance level)

Here the F-statistic is larger than the critical value of 3.008 at the 5% significance level with degrees of freedom 2 and 750. Hence, we can reject the null hypothesis and conclude that heteroskedasticity is present.

We can also use the Breusch-Pagan heteroskedasticity test. The test statistic is N × R², which follows a chi-square distribution with k degrees of freedom (N is the sample size, k is the number of regressors in the auxiliary regression, and the R² value is obtained from the auxiliary regression). If this chi-square statistic is bigger than the critical value, the null hypothesis is rejected and heteroskedasticity is present.

1c

We can use a Box-Cox transformation. We may not know the distribution of the dependent variable, and the Box-Cox transformation can bring its distribution closer to normal.

We can also transform the model with logarithms to mitigate heteroskedasticity. This may reduce the variance of the errors, as the values are compressed when rescaled by the logarithm.

We can use weighted least squares (WLS), which attaches a weight to each individual observation to remove the heteroskedasticity: in simple words, we assign less weight to observations with higher error variance. More generally, we can apply generalized least squares (GLS) to transform the model so that the transformed errors have equal variances and are uncorrelated. We can split the sample into subsamples and divide each term in the model by the standard deviation of the error term to obtain a modified model. The generalized model is able to give the maximum likelihood estimate when heteroskedasticity is present.

1d
Heteroskedasticity affects the reliability of the model. From the model, the variance of the error is not constant, yet one important assumption of the ordinary least squares (OLS) model is homoskedasticity (a constant error variance). Under heteroskedasticity the OLS estimator is no longer efficient and no longer coincides with the maximum likelihood estimate, and its usual standard errors are invalid. Thus, we may not be able to get an accurate prediction of the number of hours from this model given the heteroskedasticity.

2a

In a time series model, one important assumption is the zero conditional mean: E[u_t | x_t] = 0

u_{t−2} = ρ·u_{t−3} + v_{t−2} -------------------------------- (1)

u_{t−1} = ρ·u_{t−2} + v_{t−1} -------------------------------- (2)

u_t = ρ·u_{t−1} + v_t -------------------------------- (3)

In a time series model, v_t is distributed with zero mean.

Substituting (2) into (3):

u_t = ρ(ρ·u_{t−2} + v_{t−1}) + v_t

u_t = ρ²·u_{t−2} + ρ·v_{t−1} + v_t

Substituting (1) into the equation above:

u_t = ρ²(ρ·u_{t−3} + v_{t−2}) + ρ·v_{t−1} + v_t

u_t = ρ³·u_{t−3} + ρ²·v_{t−2} + ρ·v_{t−1} + v_t

Continuing the recursive substitution (with |ρ| < 1, the leading term ρ^s·u_{t−s} vanishes as s grows):

u_t = v_t + ρ·v_{t−1} + ρ²·v_{t−2} + ρ³·v_{t−3} + …

E(u_t) = E(v_t) + ρ·E(v_{t−1}) + ρ²·E(v_{t−2}) + ρ³·E(v_{t−3}) + … = 0, since each v_t is distributed with zero mean.

2b

The first method is by recursive substitution. Since E(u_t) = 0 and E(v_t) = 0:

Var(u_t) = E(u_t²) − [E(u_t)]² = E(u_t²)

Var(v_t) = E(v_t²) − [E(v_t)]² = E(v_t²) = δ²

Squaring equation (3):

u_t² = (ρ·u_{t−1} + v_t)² = ρ²·u_{t−1}² + v_t² + 2ρ·u_{t−1}·v_t

u_{t−1}² = (ρ·u_{t−2} + v_{t−1})² = ρ²·u_{t−2}² + v_{t−1}² + 2ρ·u_{t−2}·v_{t−1}

u_t² = ρ²(ρ²·u_{t−2}² + v_{t−1}² + 2ρ·u_{t−2}·v_{t−1}) + v_t² + 2ρ·u_{t−1}·v_t

u_t² = ρ⁴·u_{t−2}² + ρ²·v_{t−1}² + 2ρ³·u_{t−2}·v_{t−1} + v_t² + 2ρ·u_{t−1}·v_t

Taking expectations and using stationarity, E(u_t²) = E(u_{t−2}²):

E(u_t²) − ρ⁴·E(u_t²) = ρ²·E(v_{t−1}²) + 2ρ³·E(u_{t−2}·v_{t−1}) + E(v_t²) + 2ρ·E(u_{t−1}·v_t)

Since E(u_{t−2}·v_{t−1}) = E(u_{t−1}·v_t) = 0 (each u depends only on earlier v's):

(1 − ρ⁴)·E(u_t²) = (1 + ρ²)·δ²

E(u_t²) = (1 + ρ²)·δ² / (1 − ρ⁴) = δ² / (1 − ρ²)

Var(u_t) = δ² / (1 − ρ²)
Second method:

Var(u_t) = Var(ρ·u_{t−1}) + Var(v_t) = ρ²·Var(u_{t−1}) + δ²

since u_{t−1} and v_t are independent and Var(v_t) = δ² as per the information above. By stationarity, Var(u_{t−1}) = Var(u_t), so:

Var(u_t) = ρ²·Var(u_t) + δ²

Var(u_t) − ρ²·Var(u_t) = δ²

(1 − ρ²)·Var(u_t) = δ²

Var(u_t) = δ² / (1 − ρ²)
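The result can be checked by simulation; ρ = 0.7 and δ = 2 below are arbitrary illustrative values:

```python
import numpy as np

rho, delta, T = 0.7, 2.0, 200_000
rng = np.random.default_rng(2)
v = rng.normal(0.0, delta, T)
u = np.empty(T)
u[0] = v[0]
for t in range(1, T):
    u[t] = rho * u[t - 1] + v[t]       # AR(1) error process

theory = delta**2 / (1 - rho**2)       # = 4 / 0.51, about 7.84
sample = u[1000:].var()                # drop burn-in so the process is stationary
print(sample, theory)                  # sample variance should be close to theory
```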

2c

Positive autocorrelation causes the ordinary least squares estimator to underestimate the true variance of the coefficient estimates; the t-values will therefore be larger than they should be, and variables may appear significant when they are not.

2d

We can conduct the Breusch-Godfrey test to detect autocorrelation of order higher than 1.

The assumption made is that the sample size is large.

Null hypothesis H0: ρ1 = ρ2 = … = ρq = 0 (no autocorrelation up to order q)

Alternative hypothesis H1: at least one ρi ≠ 0

First, we estimate the model and obtain the residuals. The auxiliary regression of the residuals on the original regressors and q lags of the residuals is then fitted. Next, we compute an F-statistic to evaluate the joint significance of the lagged residuals. If the F-statistic is larger than the critical value, the null hypothesis can be rejected.

On the other hand, we can also compute (N − q)·R²_auxiliary ~ χ²_q, where N is the number of observations and R²_auxiliary comes from the auxiliary regression. If this chi-square statistic is larger than the critical value, we reject the null hypothesis; the degrees of freedom equal q, the number of lags.

3a

The linear probability model causes problems because it assumes a linear relationship between the dependent variable and the independent variables. As the response variable is binary, we should not use the ordinary least squares model.

The R-squared and adjusted R-squared values are not good. The R-squared (coefficient of determination) and adjusted R-squared are 0.03877 and 0.03554 respectively. An R-squared of 0.03877 means only about 3.88% of the variation in the dependent variable (smoker or non-smoker) can be explained by the model consisting of the variables age, years of education, annual income, and price of cigarettes. This is probably because an incorrect model is being used; the technique or the chosen variables may be inappropriate.

The variables age and educ are significant, as their p-values are very small (less than 0.05); they are significant at the 0.05 level. However, the variables income and pcigs79 are not significant: their p-values are 0.5298 and 0.0723 respectively.

3b

I recommend using a logistic regression model. Logistic regression is designed to predict a binary response coded 1 or 0. This is done by estimating an equation for the probability that person i smokes:

P(smoker_i = 1 | x_i) = Λ(β0 + β1·age_i + β2·educ_i + β3·income_i + β4·pcigs_i), where Λ(z) = 1 / (1 + e^(−z)) is the logistic function

Logistic regression has a few assumptions:

1. The dependent variable need to be binary which is fulfilled


2. The observations should be independent of each other
3. There should be little or no multicollinearity among the independent variables.
4. The independent variables and log odds have linear relationship

The logistic regression model is not linear, which departs from the OLS assumption of a linear relationship between the dependent variable and the independent variables. This means the marginal effects of the independent variables in logistic regression are not constant.

Besides that, OLS inference assumes normally distributed errors, but in logistic regression the distribution does not need to be normal. Logistic regression also does not follow the OLS assumption that homoskedasticity must be present.

3c

The variables income and pcigs79 are not significant: as observed, their p-values are 0.510356 and 0.073538 respectively. Thus, it is recommended to construct another model without income and pcigs79.

Another possible model for a binary response is the probit model. The probit model can be generalized to cater for heteroskedasticity, i.e. non-constant error variances.

We can evaluate model fit based on the difference between the null deviance and the residual deviance, which tells us the significance of the model as a whole. Here the null deviance and residual deviance differ by 47.2. The null deviance is the deviance of a model containing only an intercept, while the residual deviance is the deviance of the model with all the variables included.

Besides that, we can also compare the new model with the existing model using the AIC statistic. A lower AIC indicates a better fit. The existing logit model has an AIC of 1551.7; the new model should have a lower value if it fits better.

Thus, we can construct another model without the variables income and pcigs79, or even try to construct a probit model. We then compare the difference between the null deviance and the residual deviance, as well as the AIC values.

5a

If a structural break occurs between the male and female subsamples, the full-sample fit differs from the sub-sample fits. The coefficients of the independent variables in the separate male and female equations will differ from those of the full-sample model: the independent variables, such as years of education, working experience, and time in the current job, have different effects on the wages of the male and female sub-groups in the population.

5b

We can construct 3 models as below:

1. Model 1 using the full sample:

wage_i = β0 + β1·educ_i + β2·exper_i + β3·tenure_i + u_i

2. Model 2 using the sample for the female group only (female = 1):

wage_i = α0 + α1·educ_i + α2·exper_i + α3·tenure_i + u_i

3. Model 3 using the sample for the male group only (female = 0):

wage_i = γ0 + γ1·educ_i + γ2·exper_i + γ3·tenure_i + u_i

We can perform a Chow test to examine the statement. The null and alternative hypotheses are as below:

H0: α_j = γ_j for all j (the coefficients are the same for the female and male subsamples)

H1: α_j ≠ γ_j for at least one j

We will need to run separate regressions for each model and collect the sum of squared residuals (SSR) for every model. Next, we can calculate the F-value from the formula below:

F = [(SSR_pooled − (SSR_female + SSR_male)) / k] / [(SSR_female + SSR_male) / (n − 2k)]

After obtaining the F-value, we compare it with the critical value from the F distribution table. The degrees of freedom are k and n − 2k, where k is the number of parameters in each model and n is the total sample size. If the F-value is bigger than the critical value, we can reject the null hypothesis and conclude that there is a structural break in the model between the male and female groups.

We can also insert a dummy variable for group membership into the full model to investigate how the parameters for the two groups relate to each other.

5c

H0: There is no difference between the full-sample fit and the sub-sample fits

H1: There is a difference between the full-sample fit and the sub-sample fits

F = [(SSR_pooled − (SSR_female + SSR_male)) / k] / [(SSR_female + SSR_male) / (n − 2k)]

F = [(4966.03 − 1257.092 − 3137.153) / 4] / [(1257.092 + 3137.153) / (252 + 274 − 2(4))]

The F-value is 16.85, which is larger than the critical value of 3.356 at the 0.01 significance level with degrees of freedom 4 and 518. We therefore reject the null hypothesis at the 0.01 significance level, indicating that there is a structural break. In conclusion, the independent variables, such as years of education, working experience, and time in the current job, have different effects on the wages of the male and female sub-groups in the population.
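The arithmetic can be reproduced directly from the quoted SSR values:

```python
# Chow test F from the sums of squared residuals reported in 5c
ssr_pooled = 4966.03                   # full sample
ssr_female = 1257.092
ssr_male = 3137.153
n = 252 + 274                          # total observations across both groups
k = 4                                  # parameters per model (intercept + 3 slopes)

F = ((ssr_pooled - ssr_female - ssr_male) / k) / ((ssr_female + ssr_male) / (n - 2 * k))
print(round(F, 2))                     # → 16.85
```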
