
SCHOOL OF ECONOMICS

Econ 2206 Introductory Econometrics Final Examination Session 1, 2007

1. TIME ALLOWED - 2 Hours.
2. TOTAL NUMBER OF QUESTIONS - 6.
3. ANSWER ALL QUESTIONS.
4. ALL QUESTIONS ARE OF EQUAL VALUE (the marks awarded to each part of a question are indicated; the total marks for this exam is 60).
5. CANDIDATES MAY BRING THEIR OWN CALCULATORS TO THE EXAM.
6. STATISTICAL TABLES ARE PROVIDED AT THE END OF THE EXAM PAPER.
7. ALL ANSWERS MUST BE WRITTEN IN PEN. PENCILS MAY BE USED ONLY FOR DRAWING, SKETCHING OR GRAPHICAL WORK.

ANSWER ALL SIX QUESTIONS

REMINDER: When performing statistical tests, always state the null and alternative hypotheses, the test statistic and its distribution under the null hypothesis, the level of significance and the conclusion of the test.

Question 1. (10 Marks).

(i) Suppose that the correct population regression model is:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u \tag{1.1}$$

However, we have data only on $y$ and $x_1$, and as a consequence we estimate the following model by OLS:

$$y = \beta_0 + \beta_1 x_1 + v \tag{1.2}$$

In what circumstances will the OLS estimator for model (1.2):

(a) provide an unbiased estimate of the true population parameter $\beta_1$? (2 marks)

(b) provide an estimate of $\beta_1$ that has positive (or upward) bias? (2 marks)

(ii) Outline the advantages of using larger samples of data in regression analysis. (2 marks)

(iii) A model used for analysing the effect of house characteristics on the sale price was:

$$\log(\text{price}) = \beta_0 + \beta_1\,\text{area} + \beta_2\,\text{bdrms} + \beta_3\,\text{area} \cdot \text{bdrms} + u$$

where price is the house price, area is the floor area of the house (measured in square metres), and bdrms is the number of bedrooms. What is the partial effect on $\widehat{\log(\text{price})}$ of increasing area by 1 square metre? (2 marks).

(iv) What is the meaning of the term contemporaneous exogeneity as used in the context of time series data? What is the difference between contemporaneous exogeneity and strict exogeneity as used in multiple regression models for time series data? (2 marks)

Question 2. (10 Marks in total)

The following regression model explains monthly wages as a function of years of education (educ), years of labour market experience (exper) and current job tenure (tenure):

$$\log(\text{wage}) = \beta_0 + \beta_1\,\text{educ} + \beta_2\,\text{exper} + \beta_3\,\text{tenure} + u \tag{2.1}$$

With a random sample of data the following output was obtained using SHAZAM:

    Welcome to SHAZAM - Version 10.0
    |_sample 1 722
    |_read wage educ exper tenure
       4 VARIABLES AND 722 OBSERVATIONS STARTING AT OBS 1
    |_genr lnwage=log(wage)
    |_* Model estimates
    |_ols lnwage educ exper tenure
     REQUIRED MEMORY IS PAR= 81 CURRENT PAR= 2000
     OLS ESTIMATION
     722 OBSERVATIONS DEPENDENT VARIABLE= LNWAGE
     ...NOTE..SAMPLE RANGE SET TO: 1, 722
     R-SQUARE = 0.1551 R-SQUARE ADJUSTED = 0.1524
     VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.19493
     STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.44151
     SUM OF SQUARED ERRORS-SSE= 139.96
     MEAN OF DEPENDENT VARIABLE = 6.7790
     LOG OF THE LIKELIHOOD FUNCTION = -438.839

     VARIABLE   ESTIMATED     STANDARD     T-RATIO             PARTIAL  STANDARDIZED  ELASTICITY
     NAME       COEFFICIENT   ERROR        718 DF    P-VALUE   CORR.    COEFFICIENT   AT MEANS
     EDUC       0.74864E-01   0.6512E-02   11.50     0.000     0.353    0.3905        0.1487
     EXPER      0.15328E-01   0.3370E-02   4.549     0.000     0.147    0.1592        0.0261
     TENURE     0.13375E-01   0.2587E-02   5.170     0.000     0.167    0.1612        0.0143
     CONSTANT   5.4967        0.1105       49.73     0.000     0.852    0.0000        0.8108

(i) What is the interpretation of the coefficient on education, $\beta_1$? (2 marks).

(ii) Calculate the exact percentage effect of another year of education on the predicted wage level. (2 marks).

(iii) Test the null hypothesis that all the slope parameters in the model are jointly equal to zero using a 1 percent significance level. What do you conclude? (3 marks). Note: The F-test statistic based on $R^2$ is given by the formula:

$$F = \frac{(R^2_{ur} - R^2_{r})/q}{(1 - R^2_{ur})/(n - k - 1)}$$

where $q$ is the number of restrictions, and $ur$ and $r$ stand for unrestricted and restricted models, respectively.

(iv) We are interested in constructing a confidence interval for the (conditional) predicted log(wage) when educ = 13, exper = 11 and tenure = 7. To obtain the standard error for the prediction we need to estimate a transformed model that is equivalent to (2.1). Derive the transformed model which will give a direct estimate of the prediction and the standard error of the prediction. (3 marks).
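As an illustration of the arithmetic behind parts (ii) and (iii), the sketch below plugs the SHAZAM figures into the R-squared form of the F statistic and into the exact percentage-change formula for a log-level model. The function names are hypothetical, and the restricted R-squared is taken to be zero on the assumption that the restricted model contains only an intercept; this is a sketch of the calculation, not a model answer.

```python
import math


def f_stat_from_r2(r2_ur, r2_r, q, df_resid):
    """R-squared form of the F statistic.

    r2_ur, r2_r : R-squared of the unrestricted and restricted models
    q           : number of restrictions
    df_resid    : n - k - 1 for the unrestricted model
    """
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / df_resid)


def exact_pct_effect(beta):
    """Exact percentage change in the level of y for a one-unit change in x
    when the dependent variable is log(y): 100 * (exp(beta) - 1)."""
    return 100.0 * (math.exp(beta) - 1.0)


# Values read from the SHAZAM output; r2_r = 0 is an assumption
# (intercept-only restricted model under the joint null).
f_value = f_stat_from_r2(r2_ur=0.1551, r2_r=0.0, q=3, df_resid=718)
pct = exact_pct_effect(0.074864)

print(f"F statistic: {f_value:.2f}")
print(f"Exact % effect of one more year of education: {pct:.2f}")
```

The F value would then be compared with the 1% critical value for 3 numerator degrees of freedom in Table 2, using the row for large denominator degrees of freedom.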

Question 3. (10 Marks in total)

We are interested in analysing the effect of different house characteristics on the market price of houses in Sydney, and consider the following regression model:

$$\log(\text{price}) = \beta_0 + \beta_1 \log(\text{lotsize}) + \beta_2 \log(\text{sqrft}) + \beta_3 \log(\text{bdrms}) + u \tag{3.1}$$

where price is the sale price (measured in $1000), lotsize is land area (square metres), sqrmtr is the floor area of the house (also measured in square metres), and bdrms is the number of bedrooms. Based on a sample of data from 2005 house sales in Sydney, the following regression estimates were obtained (standard errors in parentheses below the coefficients):

$$\widehat{\log(\text{price})} = \underset{(0.3945)}{0.5481} + \underset{(0.0823)}{0.7013}\,\log(\text{sqrmtr}) + \underset{(0.0353)}{0.1745}\,\log(\text{lotsize}) + \underset{(0.0932)}{0.0363}\,\log(\text{bdrms})$$

$$n = 108, \quad R^2 = 0.551, \quad \bar{R}^2 = 0.538$$

(i) Construct a 90% confidence interval for $\beta_3$ (the coefficient on log(bdrms)). Is zero within the confidence interval? (3 marks).
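The interval in part (i) has the usual form estimate ± critical value × standard error. A minimal sketch follows, assuming 108 − 4 = 104 degrees of freedom and a critical value of about 1.66, read from the two-tailed 0.10 column of Table 1 between the 90 df (1.662) and 120 df (1.658) rows. The function name is hypothetical.

```python
def confidence_interval(estimate, std_err, t_crit):
    """Two-sided confidence interval: estimate +/- t_crit * std_err."""
    half_width = t_crit * std_err
    return estimate - half_width, estimate + half_width


# Coefficient and standard error on log(bdrms) from the estimated equation.
# t_crit = 1.66 is an approximation for 104 df (an assumption, see lead-in).
lower, upper = confidence_interval(estimate=0.0363, std_err=0.0932, t_crit=1.66)

print(f"90% CI for beta_3: ({lower:.4f}, {upper:.4f})")
print("Zero inside interval:", lower < 0.0 < upper)
```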

(ii) Given the estimation results, would you conclude that this is a good econometric model ? Explain. (3 marks).

(iii) We are concerned that the model in (3.1) may be misspecified. An alternative model specification where all the variables are in level form (rather than in log form) is:

$$\text{price} = \beta_0 + \beta_1\,\text{lotsize} + \beta_2\,\text{sqrft} + \beta_3\,\text{bdrms} + u \tag{3.2}$$

Outline a procedure for testing whether model (3.1) or model (3.2) is a better specification. What are the limitations (if any) of the test? Explain. (4 marks)

Question 4. (10 Marks in total).

In a recent study an economist examined the factors explaining whether a firm was taken over by another firm during a given year. The dependent variable in the analysis was Takeover, which is a binary variable equal to 1 if the firm was taken over (and 0 otherwise). The explanatory variables were profit, which is the firm's average profit rate over the previous five years, mktval, which is the market value of the firm (in $100m), and debtearn, which is the debt-to-earnings ratio. The table below presents coefficient estimates (and standard errors) based on a sample of 177 firms in 2004.

Table 4.1. Estimation Results for Takeover Models

    Variables           Dependent Variable: Takeover
    profit              0.251 (0.068)
    mktval              0.930 (0.287)
    debtearn            0.364 (0.249)
    constant            19.21 (4.839)
    Observations (n)    177
    R^2                 0.233

Note: The usual OLS standard errors are in () beside the coefficient estimates.

(i) What is the interpretation of the coefficient on profit? (2 marks)

(ii) What is the predicted probability of $\widehat{\text{Takeover}}$ for a firm with the following characteristics: profit = 0.05, mktval = 1.5 and debtearn = 6? Briefly explain whether the result is sensible. (2 marks)

(iii) We know the Linear Probability Model must contain heteroskedasticity. What is heteroskedasticity and what are the consequences of heteroskedasticity for: (a) estimation, and (b) inference with the standard OLS procedures? (2 marks)

(iv) Given that we know the model contains heteroskedasticity, what advice would you give an economist wishing to analyse the determinants of Takeover with regression methods? (4 marks)

Question 5. (10 Marks in total).

The following regression model was proposed to analyse the effect of the minimum wage on employment:

$$\log(\text{emprte}_t) = \beta_0 + \beta_1 \log(\text{minwg}_t) + \beta_2 \log(\text{minwg}_{t-1}) + \beta_3 \log(\text{GNP}_t) + u_t \tag{5.1}$$

where $\text{emprte}_t$ is the employment rate, $\text{minwg}_t$ is the minimum wage and $\text{GNP}_t$ is GNP (a proxy for labour demand) in year $t$.

(i) What is the interpretation of the coefficient $\beta_1$? (2 marks).

(ii) Is this a static or dynamic model? What is the purpose of including the lagged term $\text{minwg}_{t-1}$? Briefly explain. (2 marks).

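Using annual data from 1950-1987, the following regression model estimates were obtained (standard errors in parentheses below the coefficients):

$$\widehat{\log(\text{emprte}_t)} = \underset{(0.77)}{7.05} - \underset{(0.031)}{0.072}\,\log(\text{minwg}_t) - \underset{(0.015)}{0.061}\,\log(\text{minwg}_{t-1}) - \underset{(0.089)}{0.012}\,\log(\text{GNP}_t) \tag{5.2}$$

$$n = 38, \quad R^2 = 0.661, \quad \bar{R}^2 = 0.641$$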

(iii) Test the null hypothesis that the lagged term $\text{minwg}_{t-1}$ is insignificant using a 10 percent significance level and the one-sided alternative that the coefficient is negative ($H_0: \beta_2 = 0$, $H_1: \beta_2 < 0$). (2 marks).

(iv) There is not enough information in the results presented in (5.2) to construct a confidence interval for the Long Run Propensity (LRP). Rewrite the model in (5.1) in a form which will give you a direct estimate of the LRP (and the standard error of the LRP). What parameter in this transformed model corresponds to the LRP? (2 marks).
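The mechanics of the one-sided test in part (iii) can be sketched as follows, taking the estimate on the lagged minimum wage in (5.2) to be −0.061 with standard error 0.015 (the negative sign is an assumption, consistent with the stated alternative hypothesis). With 38 − 4 = 34 degrees of freedom, the one-tailed 10% critical value in Table 1 lies between 1.310 (30 df) and 1.303 (40 df); the sketch uses the more conservative 1.310. The function names are hypothetical.

```python
def t_ratio(estimate, std_err, null_value=0.0):
    """t statistic for H0: parameter = null_value."""
    return (estimate - null_value) / std_err


def reject_lower_tail(t_value, t_crit):
    """One-sided test against H1: parameter < null value.
    Reject H0 when the t statistic falls below -t_crit."""
    return t_value < -t_crit


# Assumed sign: the lagged-term coefficient is taken to be negative.
t_lag = t_ratio(estimate=-0.061, std_err=0.015)
reject = reject_lower_tail(t_lag, t_crit=1.310)

print(f"t statistic: {t_lag:.3f}")
print("Reject H0 at the 10% level:", reject)
```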

(v) I am concerned that the model in (5.2) may suffer from the spurious regression problem. What is the spurious regression problem and what simple adjustment to the model would help reduce the possibility of this problem? (2 marks).

Question 6. (10 Marks in total).

We are interested in analysing the effect of locating a water desalination plant on local property prices. Desalination plants are large, industrial sites which can generate a lot of noise pollution and reduce amenities in the local area. The South Australian government built a desalination plant in the Adelaide area of South Beach in 1998. Discussion about building a desalination plant in South Beach began after 1994, and the plant was built and began operating in 1998. We have data on the prices of houses sold in South Beach in 1994 (the before period) and another sample on houses sold in 2002 (the after period). The hypothesis we wish to test is that the price of houses located near the site of the desalination plant would fall below the price of more distant houses.

The data for each year includes the dummy variable nearplant, which is equal to one if the house is located within 3 kilometres of the desalination plant. The variable hprice denotes the real house price (scaled by $10,000). The following simple regression model was estimated using only the year 2002 sample of data (standard errors in parentheses below the coefficients):

$$\widehat{\text{hprice}} = \underset{(0.618)}{21.311} - \underset{(0.992)}{6.198}\,\text{nearplant}, \qquad n = 353, \quad R^2 = 0.212 \tag{6.1}$$

Using the 1994 sample, the following regression results were obtained:

$$\widehat{\text{hprice}} = \underset{(0.538)}{16.527} - \underset{(0.615)}{3.679}\,\text{nearplant}, \qquad n = 182, \quad R^2 = 0.172 \tag{6.2}$$

(i) What is the interpretation of the coefficient on the intercept term in model (6.2) (that is, what does the value 16.527 represent)? What is the interpretation of the coefficient on nearplant in model (6.2)? (2 marks)

(ii) Can you infer from the estimates in (6.1), based on the year 2002 data, that the location of the plant caused the price of houses located nearby to fall by an average of $61,980 ? Explain . (2 marks)

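(iii) An alternative approach is to pool the data for both years and estimate the following model (standard errors in parentheses below the coefficients):

$$\widehat{\text{hprice}} = \underset{(0.793)}{16.527} + \underset{(0.9471)}{4.7840}\,\text{year2} - \underset{(0.876)}{3.679}\,\text{nearplant} - \underset{(1.128)}{2.519}\,\text{year2} \cdot \text{nearplant} \tag{6.3}$$

$$n = 535, \quad R^2 = 0.202$$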

where year2 is a dummy variable equal to one if the observation is for the year 2002 (and is equal to zero if the observation is for the year 1994). What is the estimated effect of the plant on neighbouring house prices based on the difference-in-difference estimator? Is the effect significantly different from 0 at the 5% significance level? (Use the one-sided alternative hypothesis that the coefficient is negative.) (3 marks)
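The difference-in-difference estimate is the change over time in the near-versus-far price gap, so the interaction coefficient in (6.3) should equal the nearplant coefficient from the 2002 regression minus the one from the 1994 regression. The sketch below checks that arithmetic and forms the t ratio, assuming the nearplant and interaction coefficients are negative (signs inferred from the wording of the questions) and using the large-sample one-tailed 5% critical value of 1.645 from Table 1, since 535 − 4 = 531 degrees of freedom exceeds the last tabulated row. Variable names are hypothetical.

```python
# Near-vs-far price gaps from the separate regressions (6.1) and (6.2),
# in units of $10,000. Negative signs are an assumption (see lead-in).
gap_2002 = -6.198
gap_1994 = -3.679

# Difference-in-difference: change in the gap between 1994 and 2002.
did_from_separate = gap_2002 - gap_1994

# Interaction coefficient and its standard error from the pooled model (6.3).
did_pooled = -2.519
se_did = 1.128

t_did = did_pooled / se_did
reject = t_did < -1.645  # one-sided test against H1: coefficient < 0

print(f"DiD from separate regressions: {did_from_separate:.3f}")
print(f"t statistic on year2*nearplant: {t_did:.3f}")
print("Reject H0 at the 5% level:", reject)
```

Only the pooled regression supplies a standard error for the estimate, which is why the test uses (6.3) rather than the two separate regressions.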

(iv) What, if any, would be the advantages of collecting and using panel data to evaluate the effect of the location of the desalination plant on local property prices? Explain. (3 marks).

Table 1. Critical Values of the t Distribution

                            Significance Level
    1-Tailed:      0.10     0.05     0.025    0.01     0.005
    2-Tailed:      0.20     0.10     0.05     0.02     0.01

    Degrees of Freedom
      1           3.078    6.314   12.706   31.821   63.656
      2           1.886    2.920    4.303    6.965    9.925
      3           1.638    2.353    3.182    4.541    5.841
      4           1.533    2.132    2.776    3.747    4.604
      5           1.476    2.015    2.571    3.365    4.032
      6           1.440    1.943    2.447    3.143    3.707
      7           1.415    1.895    2.365    2.998    3.499
      8           1.397    1.860    2.306    2.896    3.355
      9           1.383    1.833    2.262    2.821    3.250
     10           1.372    1.812    2.228    2.764    3.169
     11           1.363    1.796    2.201    2.718    3.106
     12           1.356    1.782    2.179    2.681    3.055
     13           1.350    1.771    2.160    2.650    3.012
     14           1.345    1.761    2.145    2.624    2.977
     15           1.341    1.753    2.131    2.602    2.947
     16           1.337    1.746    2.120    2.583    2.921
     17           1.333    1.740    2.110    2.567    2.898
     18           1.330    1.734    2.101    2.552    2.878
     19           1.328    1.729    2.093    2.539    2.861
     20           1.325    1.725    2.086    2.528    2.845
     21           1.323    1.721    2.080    2.518    2.831
     22           1.321    1.717    2.074    2.508    2.819
     23           1.319    1.714    2.069    2.500    2.807
     24           1.318    1.711    2.064    2.492    2.797
     25           1.316    1.708    2.060    2.485    2.787
     26           1.315    1.706    2.056    2.479    2.779
     27           1.314    1.703    2.052    2.473    2.771
     28           1.313    1.701    2.048    2.467    2.763
     29           1.311    1.699    2.045    2.462    2.756
     30           1.310    1.697    2.042    2.457    2.750
     40           1.303    1.684    2.021    2.423    2.704
     60           1.296    1.671    2.000    2.390    2.660
     90           1.291    1.662    1.987    2.368    2.632
    120           1.289    1.658    1.980    2.358    2.617
    ∞             1.282    1.645    1.960    2.326    2.576

Example: The 1% critical value for a one tailed test with 25 df is 2.485. The 5% critical value for a two-tailed test with large (>120) df is 1.960.

Table 2. 1% Critical Values of the F Distribution

    Denominator                    Numerator Degrees of Freedom
    Degrees of
    Freedom        1      2      3      4      5      6      7      8      9     10
     10        10.04   7.56   6.55   5.99   5.64   5.39   5.20   5.06   4.94   4.85
     11         9.65   7.21   6.22   5.67   5.32   5.07   4.89   4.74   4.63   4.54
     12         9.33   6.93   5.95   5.41   5.06   4.82   4.64   4.50   4.39   4.30
     13         9.07   6.70   5.74   5.21   4.86   4.62   4.44   4.30   4.19   4.10
     14         8.86   6.51   5.56   5.04   4.69   4.46   4.28   4.14   4.03   3.94
     15         8.68   6.36   5.42   4.89   4.56   4.32   4.14   4.00   3.89   3.80
     16         8.53   6.23   5.29   4.77   4.44   4.20   4.03   3.89   3.78   3.69
     17         8.40   6.11   5.19   4.67   4.34   4.10   3.93   3.79   3.68   3.59
     18         8.29   6.01   5.09   4.58   4.25   4.01   3.84   3.71   3.60   3.51
     19         8.18   5.93   5.01   4.50   4.17   3.94   3.77   3.63   3.52   3.43
     20         8.10   5.85   4.94   4.43   4.10   3.87   3.70   3.56   3.46   3.37
     21         8.02   5.78   4.87   4.37   4.04   3.81   3.64   3.51   3.40   3.31
     22         7.95   5.72   4.82   4.31   3.99   3.76   3.59   3.45   3.35   3.26
     23         7.88   5.66   4.76   4.26   3.94   3.71   3.54   3.41   3.30   3.21
     24         7.82   5.61   4.72   4.22   3.90   3.67   3.50   3.36   3.26   3.17
     25         7.77   5.57   4.68   4.18   3.85   3.63   3.46   3.32   3.22   3.13
     26         7.72   5.53   4.64   4.14   3.82   3.59   3.42   3.29   3.18   3.09
     27         7.68   5.49   4.60   4.11   3.78   3.56   3.39   3.26   3.15   3.06
     28         7.64   5.45   4.57   4.07   3.75   3.53   3.36   3.23   3.12   3.03
     29         7.60   5.42   4.54   4.04   3.73   3.50   3.33   3.20   3.09   3.00
     30         7.56   5.39   4.51   4.02   3.70   3.47   3.30   3.17   3.07   2.98
     40         7.31   5.18   4.31   3.83   3.51   3.29   3.12   2.99   2.89   2.80
     60         7.08   4.98   4.13   3.65   3.34   3.12   2.95   2.82   2.72   2.63
     90         6.93   4.85   4.01   3.53   3.23   3.01   2.84   2.72   2.61   2.52
    120         6.85   4.79   3.95   3.48   3.17   2.96   2.79   2.66   2.56   2.47
    ∞           6.63   4.61   3.78   3.32   3.02   2.80   2.64   2.51   2.41   2.32

Example: The 1% critical value for numerator df =3 and denominator df=60 is 4.13.
