Problem Set - Vinay Boriwal
PC20200377
Problem Set
Question 1
Independent variables:
Part b: We keep only the data points with prbarr < 1, prbconv < 1 and prbpris < 1.
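The filtering step can be sketched as follows. This is a minimal illustration, not the code used for the problem set: the rows and the `county` column are hypothetical placeholders, and only the three column names prbarr, prbconv and prbpris come from the text.

```python
# Hypothetical rows for illustration; the real data set is not reproduced here.
rows = [
    {"county": 1, "prbarr": 0.35, "prbconv": 0.40, "prbpris": 0.45},
    {"county": 2, "prbarr": 1.09, "prbconv": 0.52, "prbpris": 0.41},
    {"county": 3, "prbarr": 0.29, "prbconv": 2.12, "prbpris": 0.38},
    {"county": 4, "prbarr": 0.20, "prbconv": 0.55, "prbpris": 0.50},
]

PROBABILITY_COLUMNS = ("prbarr", "prbconv", "prbpris")


def keep_valid_probabilities(data):
    """Keep rows where every probability column is strictly below 1."""
    return [row for row in data if all(row[col] < 1 for col in PROBABILITY_COLUMNS)]


filtered = keep_valid_probabilities(rows)
print([row["county"] for row in filtered])  # counties 2 and 3 are dropped
```

A value of 1 or more cannot be a probability, so rows that fail any of the three checks are excluded before estimation.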
Here, the p-value of lavgsen is greater than 0.1, so it is not statistically significant at the 10%
level. Hence, we remove the lavgsen variable.
The Breusch-Pagan Test gives a p-value less than 0.05, so we reject the null hypothesis that the
disturbances have constant variance. Pooled OLS assumes homoskedastic disturbances, and this test
rejects that assumption. This shows that Pooled OLS is not the correct model for this regression.
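To make the logic of the test concrete, here is a minimal sketch of the studentized (n times R-squared) form of the Breusch-Pagan statistic for a single regressor. It is an illustration under simplifying assumptions, not the routine used for the results above: the model in this problem set has several regressors, and the data below are constructed by hand so that the error spread grows with x.

```python
import math


def ols_fit(x, y):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(x, y):
    """R-squared of a simple OLS regression of y on x (0.0 if y is constant)."""
    intercept, slope = ols_fit(x, y)
    y_bar = sum(y) / len(y)
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 0.0 if sst == 0 else 1 - sse / sst


def breusch_pagan(x, y):
    """LM = n * R^2 from regressing squared residuals on x, with its p-value."""
    intercept, slope = ols_fit(x, y)
    squared_resid = [(yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)]
    lm = len(x) * r_squared(x, squared_resid)
    # With one regressor, LM is chi-square(1) under the null,
    # and P(chi2_1 > lm) = erfc(sqrt(lm / 2)).
    p_value = math.erfc(math.sqrt(lm / 2))
    return lm, p_value


# Constructed example: the size of the error grows with x (heteroskedasticity).
x = [1, 1, 2, 2, 3, 3, 4, 4]
noise = [1, -1, 2, -2, 3, -3, 4, -4]
y = [2 + 3 * xi + ei for xi, ei in zip(x, noise)]

lm, p_value = breusch_pagan(x, y)
print(lm > 3.84, p_value < 0.05)
```

A statistic above the 5% chi-square critical value (3.84 for one degree of freedom) leads to rejecting constant variance, which is the same decision rule applied to the p-value above.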
Part e:
The Hausman Test detects endogenous regressors (predictor variables) in a regression model.
Endogenous variables have values that are determined by other variables in the system. Having
endogenous regressors in a model makes ordinary least squares estimators biased and inconsistent, as
one of the assumptions of OLS is that there is no correlation between a predictor variable and the
error term. In panel data analysis, the Hausman test helps to choose between a fixed effects model and
a random effects model. The null hypothesis is that the preferred model is random effects and the
alternative hypothesis is that the preferred model is fixed effects. The test checks whether there is a
correlation between the unique errors (the individual-specific effects) and the regressors in the model.
The null hypothesis is that there is no correlation between the two.
Hence, the Fixed Effects Model is the best model for this data.
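For reference, the usual textbook form of the Hausman statistic is given below. The notation is an assumption introduced here and does not appear in the original output: the two beta vectors hold the fixed effects and random effects estimates of the time-varying coefficients, and k is the number of coefficients compared.

```latex
H = \left(\hat{\beta}_{FE} - \hat{\beta}_{RE}\right)'
    \left[\widehat{\mathrm{Var}}\left(\hat{\beta}_{FE}\right)
        - \widehat{\mathrm{Var}}\left(\hat{\beta}_{RE}\right)\right]^{-1}
    \left(\hat{\beta}_{FE} - \hat{\beta}_{RE}\right)
    \;\sim\; \chi^2_k \quad \text{under } H_0
```

A large value of H (a small p-value) means the two sets of estimates differ systematically, so the random effects null is rejected in favour of fixed effects.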
Part j:
Here, we can see that the p-value of pctymle is greater than 0.1. Hence, we can remove this variable.
Here, we can see that the p-value of density is greater than 0.1. Hence, we can remove this variable.
This model shows that the coefficients on the probabilities of arrest, conviction and prison sentence
are negative. This implies that the higher these probabilities are, the lower the crime rate will be.
The coefficient on police per capita is positive, which implies that areas with more police per capita
have a higher crime rate. This makes sense if we consider that areas with higher crime rates require
more police per capita.
Part k: Including the number of police per capita as an explanatory variable in a model of crime raises
concerns of endogeneity. The number of police per capita depends on the crime rate in the area: the
higher the crime rate, the more police are required, and hence the higher the police per capita. Here,
an explanatory variable is itself explained by the dependent variable (simultaneity), so it is
correlated with the error term. This violates Gauss-Markov assumption 4 (zero conditional mean), which
states that the error term has an expected value of zero given any values of the independent variables.
Assumption 3, by contrast, only requires that in the sample (and therefore in the population) none of
the independent variables is constant and there are no exact linear relationships among the independent
variables, and it is not affected here.
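The simultaneity problem can be illustrated with a small simulation. This is a sketch under assumed values, not an estimate from the problem-set data: the two-equation system, the parameters beta = -0.5 and gamma = 1.0, and the standard normal errors u and v are all hypothetical. The true effect of police on crime is negative, yet OLS recovers a positive slope because police also responds to crime.

```python
import random


def simulate_ols_slope(beta=-0.5, gamma=1.0, n=20000, seed=0):
    """Simulate crime = beta * police + u and police = gamma * crime + v,
    then return the OLS slope from regressing crime on police."""
    rng = random.Random(seed)
    denom = 1 - beta * gamma
    police, crime = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)  # shock to crime
        v = rng.gauss(0, 1)  # shock to police
        p = (gamma * u + v) / denom  # reduced form for police
        police.append(p)
        crime.append(beta * p + u)
    p_bar = sum(police) / n
    c_bar = sum(crime) / n
    sxy = sum((p - p_bar) * (c - c_bar) for p, c in zip(police, crime))
    sxx = sum((p - p_bar) ** 2 for p in police)
    return sxy / sxx


slope = simulate_ols_slope()
print(round(slope, 2))
```

Under these assumed parameters the OLS slope converges to beta + Cov(police, u) / Var(police) = -0.5 + 0.75 = 0.25, so the estimate has the wrong sign even in a large sample.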