
21/22

THE UNIVERSITY OF HONG KONG


DEPARTMENT OF STATISTICS AND ACTUARIAL SCIENCE
STAT3907 Linear Models and Forecasting
Assignment 2
(Due date: April 3, 2022)

1. There is a hypothesis that the grade-point average (GPA) of a student can be explained
by how much time he or she spends studying, and it may also be related to whether or
not this student’s family is rich. To investigate these hypotheses, we conduct a survey
on four male students and four female students, and record the following data:

• Y = the GPA of the student;
• X1 = the number of hours spent studying per week;
• X2 = the monthly income of the student’s parents (in $1000).

        Male                  Female
  Y     X1    X2        Y     X1    X2
 4.0    22     5       3.0    12     5
 3.0    16    10       3.5    18    10
 3.5    16    15       2.5     7    15
 2.0    10    20       2.5    12    20

Consider a simple linear regression to investigate whether the GPA of a student can
be affected by the number of hours spent studying per week.

(a) Perform the Goldfeld-Quandt test for the hypotheses

$H_0: \sigma_i^2 = \sigma^2$ vs $H_1: \sigma_i^2 = cX_2$,

where $\sigma_i^2 = \mathrm{var}(\varepsilon_i)$ is the variance of the error term, and $c > 0$ is a constant. Use
the 5% significance level.
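
The mechanics of the Goldfeld-Quandt test can be sketched as follows. This is only an illustration under stated assumptions: the pooled eight observations are ordered by X2 (reading the alternative as a variance that grows with X2), the sample is split into two equal halves, and no central observations are dropped. The function names and the `drop` parameter are hypothetical, and the course may prescribe a different split.

```python
# Pooled data from the table (males first, then females).
Y = [4.0, 3.0, 3.5, 2.0, 3.0, 3.5, 2.5, 2.5]
X1 = [22, 16, 16, 10, 12, 18, 7, 12]
X2 = [5, 10, 15, 20, 5, 10, 15, 20]


def ols_rss(x, y):
    """Residual sum of squares from a simple linear regression of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    syy = sum((yi - ybar) ** 2 for yi in y)
    return syy - sxy ** 2 / sxx


def goldfeld_quandt(y, x, order_by, drop=0):
    """Return (F statistic, degrees of freedom per group).

    Assumption: observations are sorted by `order_by`, `drop` central
    observations are omitted, and the two remaining groups have equal size.
    """
    rows = sorted(zip(order_by, x, y), key=lambda r: r[0])
    n = len(rows)
    k = (n - drop) // 2                  # size of each group
    low, high = rows[:k], rows[n - k:]
    rss_low = ols_rss([r[1] for r in low], [r[2] for r in low])
    rss_high = ols_rss([r[1] for r in high], [r[2] for r in high])
    df = k - 2                           # two parameters fitted per group
    return (rss_high / df) / (rss_low / df), df


f_stat, df = goldfeld_quandt(Y, X1, X2)
F_CRIT = 19.00                           # F(2, 2) upper 5% point, from tables
reject_h0 = f_stat > F_CRIT
print(f"F = {f_stat:.4f} on ({df}, {df}) df; reject H0: {reject_h0}")
```

The test is one-sided because the alternative specifies which group has the larger error variance, so the high-X2 group goes in the numerator.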
(b) Suppose that the null hypothesis H0 in (a) is rejected. Describe how you would
modify your simple linear regression model to improve the efficiency of the
ordinary least squares estimators.
(c) To test for the existence of heteroscedasticity, assume that

$\sigma^2 = \mathrm{Var}(\varepsilon) = \delta_0 + \delta_1 x_1$,

where $\varepsilon$ is the error term, and $\delta_0$ and $\delta_1$ are constants. Perform the Breusch-Pagan
test for the hypotheses
$H_0: \delta_1 = 0$ vs $H_1: \delta_1 \neq 0$.
Use the 5% significance level.
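
One way to carry out the Breusch-Pagan computation is sketched below. It assumes the original form of the test, in which the statistic is half the explained sum of squares from regressing the scaled squared residuals on X1 and is compared with a chi-square distribution with 1 degree of freedom. The studentized variant (n times R-squared) would give a different number. The function names are hypothetical.

```python
# Pooled data from the table (males first, then females).
Y = [4.0, 3.0, 3.5, 2.0, 3.0, 3.5, 2.5, 2.5]
X1 = [22, 16, 16, 10, 12, 18, 7, 12]


def simple_ols(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def breusch_pagan(x, y):
    """Breusch-Pagan statistic (ESS / 2 form) for Var(eps) = d0 + d1 * x."""
    n = len(x)
    a, b = simple_ols(x, y)
    resid_sq = [(yi - a - b * xi) ** 2 for xi, yi in zip(x, y)]
    sigma2_hat = sum(resid_sq) / n       # ML estimate of the error variance
    g = [e2 / sigma2_hat for e2 in resid_sq]
    a_g, b_g = simple_ols(x, g)          # auxiliary regression of g on x
    gbar = sum(g) / n
    ess = sum((a_g + b_g * xi - gbar) ** 2 for xi in x)
    return ess / 2.0


bp_stat = breusch_pagan(X1, Y)
CHI2_CRIT = 3.841                        # chi-square(1) upper 5% point
reject_h0 = bp_stat > CHI2_CRIT
print(f"BP = {bp_stat:.4f}; reject H0: {reject_h0}")
```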

2. For the GPA example in the previous question, consider a multiple linear regression
for female students to investigate how both X1 and X2 affect the GPA of a student.

(a) Construct the 95% confidence interval for the expected GPA of female students
with X1 = 15 and X2 = 10.
(b) Calculate the partial correlation coefficient between Y and X2 after removing the
effect of X1.
(c) Construct a test to check whether the coefficient of X1 is half that of X2 . Use the
5% significance level.
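
For reference, the standard formulas behind the three parts can be written as below. The notation is an assumption, not taken from the assignment: $X$ is the $4 \times 3$ design matrix for the female students (intercept, X1, X2), $\hat\beta = (X^\top X)^{-1} X^\top Y$, $s^2 = \mathrm{RSS}/(n-p)$ with $n = 4$ and $p = 3$, and $r_{ab}$ denotes a simple sample correlation.

```latex
% (a) 95% confidence interval for the mean response at x_0 = (1, 15, 10)^T
\[
  x_0^\top \hat\beta \;\pm\; t_{0.025,\,n-p}\; s\,
  \sqrt{x_0^\top (X^\top X)^{-1} x_0}
\]

% (b) Partial correlation between Y and X_2, controlling for X_1
\[
  r_{Y2\cdot 1} \;=\;
  \frac{r_{Y2} - r_{Y1}\, r_{12}}
       {\sqrt{(1 - r_{Y1}^2)\,(1 - r_{12}^2)}}
\]

% (c) Test of H_0: \beta_1 = \beta_2 / 2, written as c^T \beta = 0
%     with c = (0, 1, -1/2)^T
\[
  t \;=\; \frac{c^\top \hat\beta}
               {s\,\sqrt{c^\top (X^\top X)^{-1} c}}
  \;\sim\; t_{n-p} \quad \text{under } H_0
\]
```

With only four observations and three coefficients there is a single error degree of freedom, so the relevant $t$ quantiles are very large.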

3. Suppose that the true model is

$Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \varepsilon_i$, $1 \le i \le n$,

where $\{\varepsilon_i\}$ are i.i.d. normal random variables with mean zero and variance $\sigma^2$.
However, we make a mistake, and fit a simple linear regression model

$Y_i = \beta_0^* + \beta_1^* X_i + \varepsilon_i^*$.

(a) Write down the ordinary least squares estimator of the slope parameter, denoted
by $\hat{\beta}_1^*$.
(b) Calculate $E(\hat{\beta}_1^*)$ and the variance of $\hat{\beta}_1^*$. When is it unbiased?
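
A small numerical check can help build intuition for part (b). Because the OLS slope is linear in the responses and the errors have mean zero, its expectation equals the slope obtained by regressing the mean responses on X. The sketch below uses hypothetical parameter values and two hypothetical designs. It illustrates the effect of omitting the quadratic term and is not a derivation.

```python
def ols_slope(x, y):
    """Least squares slope of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical true parameters (not from the assignment).
beta0, beta1, beta2 = 1.0, 2.0, 0.5


def expected_slope(x):
    """E(slope) of the misspecified fit: regress the mean responses on x."""
    mean_y = [beta0 + beta1 * xi + beta2 * xi ** 2 for xi in x]
    return ols_slope(x, mean_y)


x_skewed = [1, 2, 3, 4, 5]         # hypothetical design, not centred at 0
x_symmetric = [-2, -1, 0, 1, 2]    # hypothetical design, symmetric about 0

print(expected_slope(x_skewed))     # differs from beta1
print(expected_slope(x_symmetric))  # coincides with beta1
```

In the symmetric design the centred X values are uncorrelated with their squares, which is why the omitted term does not shift the expected slope there.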

4. Consider the GPA example in Q1 again. Use R to fit the multiple linear regression
model

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \gamma_0\,\mathrm{dsex}_i + \gamma_1\,\mathrm{dsex}_i X_{1i} + \gamma_2\,\mathrm{dsex}_i X_{2i} + \varepsilon_i$,

where $\mathrm{dsex}_i = 1$ for female students, and 0 for male students.

(a) Write down the ordinary least squares estimators of the parameters. Are they
significant?
(b) Write down the R-squared statistic, and construct an ANOVA table. Interpret
the results by using the 5% significance level.
(c) Construct the 95% confidence interval for the expected GPA of female students
with X1 = 15 and X2 = 10.
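
The question asks for R, where a call such as `lm(Y ~ X1 + X2 + dsex + dsex:X1 + dsex:X2)` would fit this model. Purely to show how the design matrix with the dummy and its interactions is built, here is a Python sketch using only the standard library. The helper names `design_row` and `solve` are hypothetical. It does not replace the R output (standard errors, ANOVA table) that the question requires.

```python
# Data from the table as (Y, X1, X2); dsex = 0 for males, 1 for females.
males = [(4.0, 22, 5), (3.0, 16, 10), (3.5, 16, 15), (2.0, 10, 20)]
females = [(3.0, 12, 5), (3.5, 18, 10), (2.5, 7, 15), (2.5, 12, 20)]


def design_row(x1, x2, dsex):
    """Columns: intercept, X1, X2, dsex, dsex*X1, dsex*X2."""
    return [1.0, x1, x2, dsex, dsex * x1, dsex * x2]


X = ([design_row(x1, x2, 0) for _, x1, x2 in males]
     + [design_row(x1, x2, 1) for _, x1, x2 in females])
y = [row[0] for row in males + females]


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


# Normal equations (X'X) beta = X'y.
p = len(X[0])
xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
beta = solve(xtx, xty)

fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
rss = sum(e ** 2 for e in residuals)
ybar = sum(y) / len(y)
tss = sum((yi - ybar) ** 2 for yi in y)
r_squared = 1 - rss / tss
print(beta, r_squared)
```

Because every coefficient is interacted with the dummy, this model is equivalent to fitting separate regressions for male and female students with a common error variance, leaving 8 - 6 = 2 error degrees of freedom.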
