
Chapter 4:

BASIC ESTIMATION TECHNIQUES

Essential Concepts

1. A simple linear regression model relates a dependent variable Y to a single independent (or
explanatory) variable X in a linear fashion: Y = a + bX.

The intercept parameter (a) gives the value of Y at the point where the regression line crosses the
Y-axis, which is the value of Y when X is zero. The slope parameter (b) gives the change in Y associated
with a one-unit change in X (b = ΔY/ΔX).


2. Because the variation in Y is affected not only by variation in X but also by various random effects as
well, the actual value of Y cannot be predicted exactly. The regression equation is correctly
interpreted as providing the average value, or the expected value, of Y for a given value of X.
3. Parameter estimates are obtained by choosing values of a and b that minimize the sum of the squared
residuals. The residual is the difference between the actual value of Y and the fitted value of Y, $Y - \hat{Y}$.
This method of estimating a and b is called the method of least-squares, and the estimated regression
line, $\hat{Y} = \hat{a} + \hat{b}X$, is called the sample regression line. The sample regression line is an estimate of the
true regression line.
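
The least-squares idea can be sketched in a few lines of Python. The data below are hypothetical, and the closed-form expressions for the slope and intercept are the standard simple-regression formulas, which this summary does not spell out.

```python
# Illustrative least-squares fit for Y = a + bX.
# The data are hypothetical and chosen only to keep the arithmetic easy to follow.

def least_squares(xs, ys):
    """Return (a_hat, b_hat) that minimize the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: co-variation of X and Y divided by the variation in X.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b_hat = s_xy / s_xx
    # Intercept: makes the fitted line pass through the point of means.
    a_hat = y_bar - b_hat * x_bar
    return a_hat, b_hat

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a_hat, b_hat = least_squares(xs, ys)
fitted = [a_hat + b_hat * x for x in xs]
# Residual = actual Y minus fitted Y.
residuals = [y - y_fit for y, y_fit in zip(ys, fitted)]
sum_sq_resid = sum(e ** 2 for e in residuals)

print(f"a_hat = {a_hat:.3f}, b_hat = {b_hat:.3f}, sum of squared residuals = {sum_sq_resid:.4f}")
```

Any other choice of intercept and slope would give a larger sum of squared residuals for these data, which is what makes this line the sample regression line.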

4. The estimates $\hat{a}$ and $\hat{b}$ do not, in general, equal the true values of a and b. Since $\hat{a}$ and $\hat{b}$ are
computed using data from a random sample, the estimates themselves are random variables—the
estimates would vary in value from one random sample to another random sample. Statisticians have
shown that the distribution of values that the estimates might take is centered around the true value of
the parameter. An estimator is unbiased if the average value, or the expected value, of the estimator is
equal to the true value of the parameter. The method of least-squares can produce unbiased estimates
of a and b.
5. It is the randomness of the parameter estimates that necessitates testing for statistical significance.
Just because the estimate $\hat{b}$ is not zero does not mean the true value of b is not zero. Even when b does
equal zero, it is still possible that the sample will produce a least-squares estimate $\hat{b}$ that is different
from zero. Thus, it is necessary to determine if there is sufficient statistical evidence in the sample to
indicate that Y is truly related to X (i.e., b ≠ 0).
6. There are two ways to determine whether an estimated parameter is statistically significant. Either a t-
test can be performed or the p-value for the parameter estimate can be examined.
7. To perform a t-test for significance, a researcher must first determine the level of significance for the
test. The significance level of a test is the probability of finding a parameter estimate to be
significantly different from zero when, in fact, b is zero. This mistake is called a Type I error. Lower
levels of significance, other things equal, are more desirable. One minus the level of significance is
called the level of confidence.
Once the level of significance is chosen, the t-ratio is computed as

$t = \hat{b} / S_{\hat{b}}$

Chapter 4: Basic Estimation Techniques


2016 by McGraw-Hill Education.  This is proprietary material solely for authorized instructor use. Not authorized for sale or distribution in any
manner.  This document may not be copied, scanned, duplicated, forwarded, distributed, or posted on a website, in whole or part.
where $S_{\hat{b}}$ is the standard error of the estimate $\hat{b}$. Next, the critical value of t is found in the t-table at
the end of your textbook. Choose the critical t-value with n – k degrees of freedom for the desired
level of significance, where n is the number of observations and k is the number of parameters being
estimated. If the absolute value of the t-ratio is greater (less) than the critical t-value, then $\hat{b}$ is (is not)
statistically significant.
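
The t-test procedure can be sketched as follows. The data are hypothetical, the standard-error formula is the usual one for simple regression (it is not given in this summary), and the critical value 3.182 is the two-tailed 5 percent entry for 3 degrees of freedom from a standard t-table.

```python
import math

def slope_t_test(xs, ys, critical_t):
    """Estimate Y = a + bX and test whether b_hat differs significantly from zero."""
    n = len(xs)
    k = 2  # number of parameters being estimated: a and b
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    a_hat = y_bar - b_hat * x_bar
    sum_sq_resid = sum((y - a_hat - b_hat * x) ** 2 for x, y in zip(xs, ys))
    # Standard error of b_hat (standard simple-regression formula, assumed here).
    s_b = math.sqrt(sum_sq_resid / (n - k) / s_xx)
    t_ratio = b_hat / s_b
    return {
        "b_hat": b_hat,
        "s_b": s_b,
        "t_ratio": t_ratio,
        "df": n - k,
        # Significant when |t| exceeds the critical t-value.
        "significant": abs(t_ratio) > critical_t,
    }

# Hypothetical sample with n = 5 observations, so n - k = 3 degrees of freedom.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

result = slope_t_test(xs, ys, critical_t=3.182)
print(result)
```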
8. An alternative method of assessing the statistical significance of parameter estimates is to treat as
statistically significant only those parameter estimates whose p-values are smaller than the maximum
acceptable significance level. The p-value gives the exact level of significance for a parameter
estimate, which is the probability of finding significance when none exists.
9. The coefficient of determination $R^2$ measures the fraction of the total variation in the dependent
variable that is explained by the regression equation. The value of $R^2$ ranges from 0 to 1. A high $R^2$
indicates Y and X are highly correlated and the scatter diagram tightly fits the sample regression line.
10. The F-test is used to test for significance of the overall regression equation. The F-statistic from the
computer printout is compared to the critical F-value obtained from the F-table at the end of your
textbook. The critical F-value is identified by two separate degrees of freedom and the significance
level. The first of the degrees of freedom is k – 1 and the second is n – k. If the value for the
calculated F-statistic (calculated by the computer) exceeds the critical F-value, the regression
equation overall is statistically significant at the specified significance level. Alternatively, if the p-
value for the F-statistic is smaller than the acceptable level of significance, the equation as a whole is
statistically significant.
11. Multiple regression uses more than one explanatory variable to explain the variation in the dependent
variable. The coefficient for each of the explanatory variables measures the change in Y associated
with a one-unit change in that explanatory variable, holding the other explanatory variables constant (ΔY/ΔX).


12. Two types of nonlinear models can be easily transformed into linear models that can be estimated
using linear regression analysis. These are quadratic regression models and log-linear regression
models.
(a) Quadratic regression models are appropriate when the curve fitting the scatter plot is either
U-shaped or ∩-shaped. A quadratic equation, $Y = a + bX + cX^2$, can be transformed into a linear
form by computing a new variable $Z = X^2$, which is then substituted for $X^2$ in the regression. Then,
the regression equation to be estimated is Y = a + bX + cZ.
(b) Log-linear regression models are appropriate when the relation takes the multiplicative
exponential form: $Y = aX^bZ^c$. The equation is transformed by taking natural logarithms:

$\ln Y = \ln a + b \ln X + c \ln Z$

The coefficients b and c are elasticities. For example, b measures the percent change in Y that results
when X changes by 1 percent.
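
A minimal sketch of the log-linear transformation, simplified to a single explanatory variable (Y = aX^b) so it can be fitted with a two-parameter least-squares routine. The parameter values and data are hypothetical and generated without random error, so the fit recovers them almost exactly.

```python
import math

def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

# Hypothetical "true" parameters of the multiplicative model Y = a * X**b.
A_TRUE, B_TRUE = 2.5, 0.35
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [A_TRUE * x ** B_TRUE for x in xs]

# Taking natural logs turns the model into ln Y = ln a + b ln X, which is linear.
ln_xs = [math.log(x) for x in xs]
ln_ys = [math.log(y) for y in ys]

ln_a_hat, b_hat = least_squares(ln_xs, ln_ys)
a_hat = math.exp(ln_a_hat)  # undo the log to recover a

print(f"a_hat = {a_hat:.4f}, elasticity b_hat = {b_hat:.4f}")
```

The quadratic case works the same way in spirit: compute Z = X**2 as a new column and include it as a second explanatory variable.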

Answers to Applied Problems

1. a. The intercept a is expected to be positive because even if no advertising is undertaken, some sales
are expected to occur. b is expected to have a positive sign since Vanguard's sales are positively
related to its level of advertising expenditures. Vanguard's sales should be inversely related to its
rivals' expenditures on advertising, so c is expected to be negative.

b. a is the sales of Bright Side detergent when neither Vanguard nor its rivals advertise. b is ΔS/ΔA,
the increase in Bright Side sales attributable to a $1,000 per week increase in advertising
expenditures by Vanguard. c is ΔS/ΔR, the decrease in Bright Side sales attributable to a $1,000
per week increase in advertising expenditures by Vanguard's rivals.

c. The exact level of significance of $\hat{b}$ is 0.0128. There is only a 1.28% chance of finding $\hat{b}$
significant when in fact b = 0, which is better than the 10 percent level required by the marketing director.

d. The exact level of significance of $\hat{c}$ is 0.0927. There is a 9.27% chance of finding $\hat{c}$ significant when
rivals’ spending on advertising does not in fact affect Vanguard’s sales (i.e., c = 0), which is just barely
better than the 10 percent level required by the marketing director.
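
The p-value rule used in parts c and d amounts to a single comparison per coefficient. A small sketch, using the two p-values reported above and the director's 10 percent threshold (the variable names are illustrative):

```python
# Maximum acceptable significance level set by the marketing director.
MAX_SIGNIFICANCE = 0.10

# Exact significance levels (p-values) reported in parts c and d.
p_values = {"b_hat": 0.0128, "c_hat": 0.0927}

# An estimate is treated as significant when its p-value is below the threshold.
significant = {name: p < MAX_SIGNIFICANCE for name, p in p_values.items()}

for name, p in p_values.items():
    verdict = "significant" if significant[name] else "not significant"
    print(f"{name}: p-value = {p:.4f} -> {verdict} at the 10 percent level")
```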

e. About 78 percent of the variation in sales remains unexplained. Find additional explanatory
variables that have a significant effect on S. The manager might try adding the price of its
detergent and the prices of its rivals’ detergents.

f. $\hat{S}$ = 175,086 + 0.855 × 40,000 – 0.284 × 100,000 = $180,886 of sales each week.
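
The prediction in part f is a direct substitution into the estimated equation. A short sketch, using the estimates quoted above (the function and argument names are illustrative):

```python
# Parameter estimates as used in part f.
A_HAT = 175_086   # intercept
B_HAT = 0.855     # coefficient on Vanguard's advertising, A
C_HAT = -0.284    # coefficient on rivals' advertising, R

def predicted_sales(advertising, rival_advertising):
    """Fitted weekly sales: S_hat = a_hat + b_hat * A + c_hat * R."""
    return A_HAT + B_HAT * advertising + C_HAT * rival_advertising

s_hat = predicted_sales(advertising=40_000, rival_advertising=100_000)
print(f"Predicted weekly sales: ${s_hat:,.0f}")
```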

2. a. At the 95% level of confidence, the critical F-value is $F_{k-1,\,n-k} = F_{1,15} = 4.54$. Since the computed
F-ratio 42.674 is greater than 4.54, the regression equation provides evidence of a statistically
significant relation. Note also that 74% of the variation in V is explained by the equation.

b. The critical t-value for n – k = 15 degrees of freedom and a 95% level of confidence is 2.131. For
$\hat{a}$: t = 25.418 > 2.131; statistically significant. If Proposition 103 has no impact on auto insurance
premiums in any given county, P = 0, and the expected percentage of voters favoring Proposition
103 in that particular county is given by:
V = 53.682 – 0.528(0) = 53.682,
or 53.7% are expected to favor Proposition 103.

c. For $\hat{b}$: |t| = |–6.519| = 6.519 > 2.131; statistically significant.

Remember that both V and P are measured as percentages. Thus, a 1 percentage point increase in P is
estimated to result in a 0.53 percentage point decrease in V. A 10 percentage point increase in P is
expected to result in a 5.3 percentage point (= 0.53 × 10) decrease in V.
Note: All of the p-values are so small that you could quickly determine that all tests would find
significance at extremely high levels of confidence.
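
The checks in parts b and c can be reproduced with a few comparisons. This sketch uses only the estimates, t-ratios, and critical value quoted in this answer; the function and variable names are illustrative.

```python
# Estimated equation from this answer: V = 53.682 - 0.528 P.
A_HAT = 53.682
B_HAT = -0.528
CRITICAL_T = 2.131  # n - k = 15 degrees of freedom, 95% level of confidence

# Reported t-ratios; significance depends on the absolute value.
t_ratios = {"a_hat": 25.418, "b_hat": -6.519}
significant = {name: abs(t) > CRITICAL_T for name, t in t_ratios.items()}

def predicted_vote_share(p):
    """Expected percentage of voters favoring Proposition 103 for a given P."""
    return A_HAT + B_HAT * p

baseline = predicted_vote_share(0)                    # P = 0: no impact on premiums
change_for_10 = predicted_vote_share(10) - baseline   # effect of raising P by 10

print(significant)
print(f"V at P = 0: {baseline:.3f}; change in V when P rises by 10: {change_for_10:.2f}")
```

The unrounded slope gives a change of 5.28, which the answer rounds to 5.3.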

3. a. The F-statistic provides evidence that the regression equation as a whole is statistically
significant. The p-value for the F-statistic indicates significance at better than the 0.01% level. The $R^2$
indicates the regression equation explains 83% of the variation in E. The p-values for the individual
coefficients are:
For $\hat{a}$: the p-value = 0.2369, so treating $\hat{a}$ as significant carries a 23.69% chance of a Type I error.
For $\hat{b}$: the p-value = 0.0023, so treating $\hat{b}$ as significant carries only a 0.23% chance of a Type I error.

b. Since ΔE/ΔN is estimated to be 32.31, each extra ticket sold in December is expected to increase
annual earnings by $32.31.

c. $\hat{E}$ = 25,042,000 + 32.31 × 950,000 = $55,736,500 for the year.

4. a. Estimate the model

Q' = a' + bH' + cS',
where Q' = ln Q, a' = ln a, H' = ln H, and S' = ln S.

b. b = %ΔQ/%ΔH
c = %ΔQ/%ΔS
A 20% increase in S will increase Q by 5.1% (= 0.2550 × 20).

c. Perform an F-test. The 5 percent critical F-value is $F_{k-1,\,n-k} = F_{2,50} = 3.18$. Since 29.97 > 3.18, the overall
equation is statistically significant. The p-value for the F-statistic indicates significance below the 0.01%
level.

d. 54.52% of the variation in Q is explained by this model. The $R^2$ could be increased by adding
some additional explanatory variables, such as the sales experience of the salespersons employed,
whether the sales day is a weekday or a Saturday/Sunday, and the level of advertising in
newspapers the previous week.
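
The reported F-ratio and R-squared can be cross-checked against each other. The sketch below uses the standard identity F = [R²/(k – 1)] / [(1 – R²)/(n – k)], which these answers do not state. The sample sizes are inferred from the degrees of freedom quoted in the answers (n = 53, k = 3 here; n = 17, k = 2 in Problem 2) and should be read as assumptions.

```python
def f_statistic_from_r2(r2, n, k):
    """F-statistic implied by R-squared, n observations, and k estimated parameters."""
    explained_per_df = r2 / (k - 1)          # explained variation per numerator df
    unexplained_per_df = (1 - r2) / (n - k)  # unexplained variation per denominator df
    return explained_per_df / unexplained_per_df

# Problem 4: R-squared = 0.5452 with F(2, 50); the answer reports F = 29.97.
f_problem_4 = f_statistic_from_r2(r2=0.5452, n=53, k=3)

# Problem 2: R-squared is reported only as 74%, with F(1, 15); the answer reports F = 42.674.
f_problem_2 = f_statistic_from_r2(r2=0.74, n=17, k=2)

print(f"Problem 4: F = {f_problem_4:.2f} (critical value 3.18)")
print(f"Problem 2: F = {f_problem_2:.2f} (critical value 4.54)")
```

The small gap between the computed value and the reported 42.674 for Problem 2 is consistent with R-squared having been rounded to 74%.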

e. The critical t-ratio is $t_{n-k} = t_{50} \approx 2$.

For $\hat{a}'$: t = 3.80 > 2; statistically significant. The p-value indicates exact significance at the
0.04% level. Since $\hat{a}' = \ln \hat{a}$, $\hat{a} = e^{\hat{a}'}$, and $\hat{a} = 2.5$. Since $\hat{Q} = 2.5 H^{0.3517} S^{0.2550}$,
$\hat{Q} = 2.5(0)^{0.3517}(0)^{0.2550} = 0$ when H and S are both zero.
Sales are expected to be zero because in the nonlinear form Q is zero when either H or S is zero.

f. For $\hat{b}$: t = 3.44 > 2; statistically significant. The p-value indicates exact significance at the
0.12% level.
Since b = %ΔQ / %ΔH, $\hat{b}$ is the estimated percentage increase in sales attributable to increasing the
hours of operation by 1%, all else constant. Since $\hat{b}$ = 0.3517, a 10 percent decrease in H will
decrease sales by 3.52 percent (= 0.3517 × 10).
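
The fitted multiplicative equation from parts e and f can be evaluated directly. The sketch below uses the estimates quoted in this answer; the function name is illustrative. It also compares the elasticity rule of thumb with the change implied by the fitted equation itself, since the elasticity calculation is a first-order approximation that is most accurate for small percentage changes.

```python
# Estimates from this answer: Q_hat = 2.5 * H**0.3517 * S**0.2550.
A_HAT, B_HAT, C_HAT = 2.5, 0.3517, 0.2550

def predicted_sales(h, s):
    """Fitted sales from the multiplicative (log-linear) model."""
    return A_HAT * h ** B_HAT * s ** C_HAT

# Part e: the fitted value is zero when H and S are zero.
q_at_zero = predicted_sales(0.0, 0.0)

# Part f: effect of a 10 percent decrease in H, all else constant.
pct_change_h = -10.0
# Elasticity rule: %change in Q is approximately b_hat * %change in H.
approx_pct_change_q = B_HAT * pct_change_h
# Change implied by the fitted equation (does not depend on the levels of H or S).
exact_pct_change_q = ((1 + pct_change_h / 100) ** B_HAT - 1) * 100

print(f"Q_hat at H = S = 0: {q_at_zero}")
print(f"Elasticity approximation: {approx_pct_change_q:.2f}%")
print(f"Implied by fitted equation: {exact_pct_change_q:.2f}%")
```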
