
WEEK 1 ASSIGNMENT

ECONOMETRICS
This exercise considers an example of data that do not satisfy all the
standard assumptions of simple regression. In the considered case,
assumption A6 that the coefficients α and β are the same for all
observations is violated. The dataset contains survey outcomes of a travel
agency that wishes to improve recommendation strategies for its clients.
The dataset contains 26 observations on age and average daily expenditures
during holidays.
(a) Use all data to estimate the coefficients a and b in a simple
regression model, where expenditures is the dependent variable and
age is the explanatory factor. Also compute the standard error and
the t-value of b.

Solution:

Estimating the regression on all 26 observations, we get the following coefficients:


a = 114.241
b = -0.334
Now, to compute the standard error, we use the formula:
$SE^2 = \frac{1}{n-2} \sum_{i=1}^{n} e_i^2$
where $e_i = y_i - a - b x_i$.

$SE^2 = 25.698$
$SE = 5.0693$

We know that the t-value of b is b divided by the standard error of b.
Thus, from the calculations shown in Excel:
t_b = 0.00423
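The calculation in part (a) can be sketched in a few lines of Python. This is a minimal illustration, not the Excel workbook used above: the function name `simple_ols` and the small data set in the usage note are assumptions made here for demonstration, and the survey data themselves are not reproduced. The sketch uses the textbook formulas, in which the standard error of b is the regression standard error divided by the square root of the sum of squared deviations of x.

```python
import math


def simple_ols(x, y):
    """Simple regression y = a + b*x with standard errors (illustrative helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squares and cross-products around the means
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept

    # Residuals e_i = y_i - a - b*x_i and regression standard error
    residuals = [yi - a - b * xi for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

    se_b = s / math.sqrt(s_xx)  # standard error of the slope
    t_b = b / se_b              # t-value of the slope
    return {"a": a, "b": b, "s": s, "se_b": se_b, "t_b": t_b}


# Made-up numbers for demonstration only (NOT the travel agency data)
demo = simple_ols([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(demo)
```

Passing the 26 ages as `x` and the expenditures as `y` would give the quantities requested in part (a), which can then be compared with the spreadsheet output.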

(b) Make the scatter diagram of expenditures against age and add the
regression line y = a + bx of part (a) in this diagram. What
conclusion do you draw from this diagram?
Solution:
[Figure: "Regression Line". Scatter diagram of Expenditure (vertical axis) against Age (horizontal axis) with the fitted regression line of part (a).]

We can see a negative relationship between age and expenditure: as age
increases, people spend less. We can also see two distinct groups on the
basis of age: clients below the age of 40 and clients aged 40 or higher.

(c) It seems there are two sets of observations in the scatter diagram,
one for clients aged 40 or higher and another for clients aged below
40. Divide the sample into these two clusters, and for each cluster
estimate the coefficients a and b and determine the standard error
and t-value of b.

Set-1
Clients aged below 40.
Intercept a = 100.821
Slope b = 0.179
SE = 1.162
t_b = 0.4239

Set-2
Clients aged 40 or higher.
Intercept a = 88.872
Slope b = 0.146
SE = 3.833
t_b = 0.00477
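The split into the two clusters can be sketched as follows. This is only an illustration under stated assumptions: the helper names `split_by_age` and `fit_line` and the six data points are invented here and are not the survey data. Clients aged exactly 40 are placed in the older cluster, following the wording of the question.

```python
def fit_line(points):
    """Least-squares intercept a and slope b for a list of (age, expenditure) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    s_xx = sum((x - x_bar) ** 2 for x, _ in points)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    b = s_xy / s_xx
    return y_bar - b * x_bar, b


def split_by_age(points, cutoff=40):
    """Split observations into clients below the cutoff and clients at or above it."""
    younger = [(x, y) for x, y in points if x < cutoff]
    older = [(x, y) for x, y in points if x >= cutoff]
    return younger, older


# Hypothetical observations for demonstration only (NOT the travel agency data)
sample = [(20, 105.0), (30, 107.0), (35, 108.0),
          (45, 92.0), (50, 93.0), (55, 94.5)]

younger, older = split_by_age(sample)
for label, group in (("below 40", younger), ("40 or higher", older)):
    a, b = fit_line(group)
    print(f"{label}: a = {a:.3f}, b = {b:.3f}")
```

Each cluster is fitted independently, so the two groups are allowed to have different intercepts and slopes.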

(d) Discuss and explain the main differences between the outcomes in
parts (a) and (c). Describe in words what you have learned from these
results.
Answer:
[Figure: "Regression Line for people under 40 years of age". Expenditure (vertical axis) against Age (horizontal axis) with the fitted regression line for Set-1.]

[Figure: "Regression Line for people above 40 years of age". Expenditure (vertical axis) against Age (horizontal axis) with the fitted regression line for Set-2.]

In part (a) we found a decreasing trend between the two variables, but when
we plot the two age groups separately, using the estimates of part (c), the
trend is increasing in both cases. The overall trend comes out negative
because expenditure in set 1 (clients below 40) is much higher than in set 2
(clients aged 40 or higher). Moving from the younger to the older group
therefore lowers expenditure, which gives the regression line fitted to all
data a negative slope. The coefficients are not the same for all
observations, so assumption A6 is violated and the single regression of
part (a) is misleading.
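The reasoning above can be made precise with a standard decomposition of the pooled least-squares slope into a within-group and a between-group part. The notation is introduced here and is not part of the original solution: g indexes the two clusters, n_g is the cluster size, \bar{x}_g and \bar{y}_g are the cluster means, S_{xx,g} is the within-cluster sum of squared age deviations, and b_g is the cluster slope from part (c).

```latex
b = \frac{S_{xy}}{S_{xx}}, \qquad
S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
       = \underbrace{\sum_{g} S_{xx,g}\, b_g}_{\text{within groups}}
       + \underbrace{\sum_{g} n_g (\bar{x}_g - \bar{x})(\bar{y}_g - \bar{y})}_{\text{between groups}}
```

The within-group term is positive because both b_g are positive. The between-group term is negative because the older cluster has a higher mean age but a lower mean expenditure. A negative pooled slope, as in part (a), means the between-group term outweighs the within-group term.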
