Week 1 Assignment
ECONOMETRICS
This exercise considers an example of data that do not satisfy all the
standard assumptions of simple regression. In the considered case,
assumption A6 that the coefficients α and β are the same for all
observations is violated. The dataset contains survey outcomes of a travel
agency that wishes to improve recommendation strategies for its clients.
The dataset contains 26 observations on age and average daily expenditures
during holidays.
(a) Use all data to estimate the coefficients a and b in a simple
regression model, where expenditures is the dependent variable and
age is the explanatory factor. Also compute the standard error and
the t-value of b.
Solution:
SE² = 25.698
SE = 5.0693
We know that the t-value of b is the estimate divided by its standard error, t_b = b / SE(b).
Thus, from the calculations done in Excel,
t_b = 0.00423
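The quantities asked for in part (a) can also be computed outside Excel. The following is a minimal sketch in Python, assuming the usual simple-regression formulas (b = Sxy / Sxx, a = mean(y) - b * mean(x), s² = sum of squared residuals / (n - 2), SE(b) = s / sqrt(Sxx)). The function name `simple_ols` and the five data points are hypothetical; they are not the survey data and will not reproduce the numbers reported above.

```python
import math

def simple_ols(x, y):
    """Least-squares fit of y = a + b*x; returns a, b, s, SE(b) and t-value of b."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squared deviations and cross-deviations
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = y_bar - b * x_bar              # intercept
    residuals = [yi - a - b * xi for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - 2)   # residual variance
    se_b = math.sqrt(s2 / sxx)         # standard error of the slope
    return {"a": a, "b": b, "s": math.sqrt(s2), "se_b": se_b, "t_b": b / se_b}

# Hypothetical illustration data (NOT the 26 survey observations)
age = [20, 30, 40, 50, 60]
spend = [105.0, 104.0, 99.0, 97.0, 95.0]

fit = simple_ols(age, spend)
for name, value in fit.items():
    print(f"{name} = {value:.4f}")
```

Replacing `age` and `spend` with the two columns of the dataset would give the values of a, b, SE(b) and t_b for the full sample.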
(b) Make the scatter diagram of expenditures against age and add the
regression line y = a + bx of part (a) in this diagram. What
conclusion do you draw from this diagram?
Solution:
[Figure: "Regression Line". Scatter diagram of Expenditure (vertical axis, 0 to 120) against Age (horizontal axis, 0 to 60) with the fitted regression line of part (a).]
(c) It seems there are two sets of observations in the scatter diagram,
one for clients aged 40 or higher and another for clients aged below
40. Divide the sample into these two clusters, and for each cluster
estimate the coefficients a and b and determine the standard error
and t-value of b.
Set-1
Clients aged below 40.
Intercept a = 100.821
Slope b = 0.179
SE = 1.162
t_b = 0.4239
Set-2
Clients aged 40 or higher.
Intercept a = 88.872
Slope b = 0.146
SE = 3.833
t_b = 0.00477
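The split-sample estimation of part (c) can be sketched as follows. This is only an illustration under stated assumptions: the cutoff of 40 follows the exercise text, the helper names `fit_line` and `split_by_age` are hypothetical, and the eight (age, expenditure) pairs are made up, so the output does not correspond to the Set-1 and Set-2 figures above.

```python
import math

def fit_line(points):
    """Least-squares fit of y = a + b*x on (x, y) pairs; returns (a, b, se_b, t_b)."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    b = sxy / sxx
    a = y_bar - b * x_bar
    ssr = sum((y - a - b * x) ** 2 for x, y in points)
    se_b = math.sqrt(ssr / (n - 2) / sxx)
    return a, b, se_b, b / se_b

def split_by_age(points, cutoff=40):
    """Clients below the cutoff go to the first group, the rest to the second."""
    below = [(x, y) for x, y in points if x < cutoff]
    at_or_above = [(x, y) for x, y in points if x >= cutoff]
    return below, at_or_above

# Hypothetical (age, expenditure) pairs, NOT the survey data
clients = [(22, 104.0), (28, 106.0), (34, 106.5), (38, 108.0),
           (42, 94.0), (48, 96.0), (53, 96.0), (57, 98.0)]

set1, set2 = split_by_age(clients)
fit1 = fit_line(set1)
fit2 = fit_line(set2)
for label, (a, b, se_b, t_b) in (("Set-1", fit1), ("Set-2", fit2)):
    print(f"{label}: a = {a:.3f}, b = {b:.3f}, SE(b) = {se_b:.3f}, t_b = {t_b:.3f}")
```

Each group gets its own intercept and slope, which is the point of the exercise: the two clusters are not forced to share one pair of coefficients.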
(d) Discuss and explain the main differences between the outcomes in
parts (a) and (c). Describe in words what you have learned from these
results.
Answer:
[Figure: "Regression Line for people under 40 years of age". Expenditure (vertical axis, 102 to 110) against Age (horizontal axis, 0 to 45).]
[Figure: regression line for the second age group. Expenditure (vertical axis, 88 to 102) against Age (horizontal axis, 0 to 60).]
In part (a), the regression on all data shows a decreasing relation between the
two variables, but when the two age groups are estimated and plotted
separately, as in part (c), the slope is positive in both groups. The overall
slope comes out negative because expenditures in Set-1 (clients below 40) are
much higher than in Set-2 (clients aged 40 or higher), so across the whole
sample expenditure falls with age and the pooled regression line gets a
negative slope. This is what a violation of assumption A6 looks like: a single
pair of coefficients a and b does not describe all observations.
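The sign reversal described here can be checked numerically. The sketch below relies on a standard identity: the pooled sum of cross-deviations Sxy (whose sign is the sign of the OLS slope) equals the sum of the within-group Sxy values plus a between-group term built from the group means. The data are hypothetical and only mimic the pattern in the exercise (a younger, high-spending group and an older, low-spending group); the names `mean` and `cross_dev` are illustrative.

```python
def mean(values):
    return sum(values) / len(values)

def cross_dev(points):
    """Sum of (x - x_bar) * (y - y_bar); it has the same sign as the OLS slope."""
    x_bar = mean([x for x, _ in points])
    y_bar = mean([y for _, y in points])
    return sum((x - x_bar) * (y - y_bar) for x, y in points)

# Hypothetical (age, expenditure) pairs, NOT the survey data
young = [(22, 104.0), (28, 106.0), (34, 106.5), (38, 108.0)]
older = [(42, 94.0), (48, 96.0), (53, 96.0), (57, 98.0)]
pooled = young + older

x_all = mean([x for x, _ in pooled])
y_all = mean([y for _, y in pooled])

# Within-group part: positive when expenditure rises with age inside each group
within = cross_dev(young) + cross_dev(older)

# Between-group part: negative when the older group spends less on average
between = sum(
    len(g) * (mean([x for x, _ in g]) - x_all) * (mean([y for _, y in g]) - y_all)
    for g in (young, older)
)

total = cross_dev(pooled)
print(f"within = {within:.3f}, between = {between:.3f}, pooled = {total:.3f}")
```

When the between-group term is negative and larger in magnitude than the within-group term, the pooled slope is negative even though both group slopes are positive.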