Chapter 6 Exercises

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 4

Chapter 6 – Linear Regression with Multiple Regressors

Exercise 1:

The data set consists of information on 7178 full-time, full-year workers. The highest
educational achievement for each worker was either a high school diploma or a bachelor’s
degree.
The workers’ ages ranged from 25 to 34 years. The data set also contains information on the
region of the country where the person lived, marital status, and number of children. For the
purposes of these exercises, let:
AHE = average hourly earnings
College = binary variable (1 if college, 0 if high school)
Female = binary variable (1 if female, 0 if male)
Age = age (in years)
Northeast = binary variable (1 if Region = Northeast, 0 otherwise)
Midwest = binary variable (1 if Region = Midwest, 0 otherwise)
South = binary variable (1 if Region = South, 0 otherwise)
West = binary variable (1 if Region = West, 0 otherwise)

1. Compute 𝑅̅ 2 for each of the regressions.


2. Using the regression results in column (1):
a. Do workers with college degrees earn more, on average, than workers with only
high school diplomas? How much more?
b. Do men earn more than women, on average? How much more?
3. Using the regression results in column (2):
a. Is age an important determinant of earnings? Explain.
b. Sally is a 29-year-old female college graduate. Betsy is a 34-year-old female
college graduate. Predict Sally’s and Betsy’s earnings.
4. Using the regression results in column (3):
a. Do there appear to be important regional differences?
b. Why is the regressor West omitted from the regression? What would happen if it
were included?
c. Juanita is a 28-year-old female college graduate from the South. Jennifer is a 28-
year-old female college graduate from the Midwest. Calculate the expected
difference in earnings between Juanita and Jennifer.

Exercise 2:

Data were collected from a random sample of 200 home sales from a community in 2013. Let
Price denote the selling price (in $1000s), BDR denote the number of bedrooms, Bath denote
the number of bathrooms, Hsize denote the size of the house (in square feet), Lsize denote the
lot size (in square feet), Age denote the age of the house (in years), and Poor denote a binary
variable that is equal to 1 if the condition of the house is reported as “poor.”
An estimated regression yields:
Price = 109.7 + 0.567BDR + 26.9Bath + 0.239Hsize + 0.005Lsize + 0.1Age - 56.9Poor
𝑅̅ 2 = 0.849, SER = 45.8
a. Suppose that a homeowner converts part of an existing family room in her house into
a new bathroom. What is the expected increase in the value of the house?
b. Suppose that a homeowner adds a new bathroom to her house, which increases the
size of the house by 80 square feet. What is the expected increase in the value of the
house?
c. What is the loss in value if a homeowner lets his house run down so that its condition
becomes “poor”?
d. Compute the 𝑅 2 for the regression.

Exercise 3:

(𝑌𝑖 , 𝑋1𝑖 , 𝑋2𝑖 ) satisfy the LSA for causal inference in the multiple regression model. You are
interested in 𝛽̂1, the causal effect of 𝑋1 , on 𝑌. Suppose 𝑋1 and 𝑋2 are uncorrelated. You
estimate 𝛽̂1 by regressing 𝑌 on 𝑋1 (so that 𝑋2 is not included in the regression).
Does this estimator suffer from omitted variable bias? Explain.

Exercise 4:
Critique each of the following proposed research plans. Your critique should explain any
problems with the proposed research and describe how the research plan might be improved.
Include a discussion of any additional data that need to be collected and the appropriate
statistical techniques for analyzing those data.

a. A researcher wants to determine whether a leading global university is guilty of racial


bias in admissions. To determine potential bias, the researcher collects data on the
race of all applicants to the university for a given year. The researcher plans to
conduct a difference-in-means test to determine whether the proportion of acceptances
among Black candidates is systematically different from the proportion of acceptances
among other candidates.
b. A researcher is interested in identifying the impact of a mother’s education on the
educational attainment of her child. She collects data on a random sample of
individuals aged between 25 and 40 years who are out of the schooling system. The
data set contains information on each person’s level of schooling, the type of school
attended, gender and ethnicity, as well as information on the schooling of their parents
and the demographic characteristics of the household in which they grew up. The
researcher plans to regress years of schooling achieved by an individual on the years
of schooling of their mother, including in the regression the other potential
determinants of schooling (number of siblings and whether parents lived together or
are separated) as controls.
Exercise 5:
The cost of attending your college has once again gone up. Although you have been told that
education is investment in human capital, which carries a return of roughly 10% a year, you
(and your parents) are not pleased. One of the administrators at your university/college does
not make the situation better by telling you that you pay more because the reputation of your
institution is better than that of others. To investigate this hypothesis, you collect data
randomly for 100 national universities and liberal arts colleges from the 2000-2001 U.S.
News and World Report annual rankings. Next you perform the following regression

= 7,311.17 + 3,985.20 × Reputation – 0.20 × Size

+ 8,406.79 × Dpriv – 416.38 × Dlibart – 2,376.51 × Dreligion

R2 = 0.72, SER = 3,773.35

where Cost is Tuition, Fees, Room and Board in dollars, Reputation is the index used in U.S.
News and World Report (based on a survey of university presidents and chief academic
officers), which ranges from 1 ("marginal") to 5 ("distinguished"), Size is the number of
undergraduate students, and Dpriv, Dlibart, and Dreligion are binary variables indicating
whether the institution is private, a liberal arts college, and has a religious affiliation.
(a) Interpret the results. Do the coefficients have the expected sign?
(b) What is the forecasted cost for a liberal arts college, which has no religious affiliation, a
size of 1,500 students and a reputation level of 4.5? (All liberal arts colleges are private.)
(c) To save money, you are willing to switch from a private university to a public university,
which has a ranking of 0.5 less and 10,000 more students. What is the effect on your cost? Is
it substantial?
(d) What can you say about causation in the above relationship? Is it possible that Cost
affects Reputation rather than the other way around?

You might also like