Statistical Inference
Hypothesis Testing
I. Statistical Inference:
Statistical inference “... draws conclusions from (or makes inferences about) a
population from a random sample taken from that population.”
For example, we could compute the average income for Singapore households
during the last year:
$$\bar{I} = \frac{1}{3000}\sum_{i=1}^{3000} I_i$$
The idea is that there is some underlying distribution to our random variable,
household income.
Suppose household income is distributed normally, with population mean and variance:
$$I \sim N(\mu_I, \sigma_I^2)$$
The sample mean is then also normally distributed:
$$\bar{I} \sim N\!\left(\mu_I, \frac{\sigma_I^2}{n}\right)$$
Standardising,
$$Z = \frac{\bar{I} - \mu_I}{\sigma_I/\sqrt{n}} \sim N(0, 1)$$
Of course, we don’t know the population standard deviation σI, but we can
replace it with the sample standard deviation (S):
$$\hat{\sigma} = S = \sqrt{\frac{\sum_{i=1}^{n}(I_i - \bar{I})^2}{n - 1}}$$
$$t = \frac{\bar{I} - \mu_I}{S/\sqrt{n}}$$
We can look up the critical t values in the Table for the t-distribution:
$$\text{Prob}\!\left(\bar{I} - \frac{2.576\,S}{\sqrt{n}} \le \mu_I \le \bar{I} + \frac{2.576\,S}{\sqrt{n}}\right) = 0.99$$
This is the Confidence Interval around the unknown population parameter μI.
Interpretation: there is a 99% probability that this ‘random interval’ contains
the true μI. Confidence level = 100% − significance level.
Suppose $\bar{I} = 40$ (i.e., the average household receives $40,000 per year), and the sample standard deviation is 10.
$$40 - \frac{25.76}{54.772} \le \mu_I \le 40 + \frac{25.76}{54.772}$$
$$39.530 \le \mu_I \le 40.470$$
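To double-check the arithmetic, here is a minimal Python sketch that recomputes the interval from the summary numbers above (mean 40, S = 10, n = 3000), using the same 2.576 critical value:

```python
import math

# Summary statistics from the example above
i_bar, s, n = 40.0, 10.0, 3000
crit = 2.576  # two-sided critical value for a 99% confidence level

half_width = crit * s / math.sqrt(n)  # 25.76 / 54.772
print(f"99% CI: {i_bar - half_width:.3f} <= mu_I <= {i_bar + half_width:.3f}")
# 99% CI: 39.530 <= mu_I <= 40.470
```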
The sample mean is an unbiased estimator of the population mean:
$$\bar{I} = \frac{1}{3000}\sum_{i=1}^{3000} I_i = \frac{1}{3000}(I_1 + I_2 + \cdots + I_n)$$
$$E(\bar{I}) = \mu_I$$
Note that the properties of unbiasedness and consistency are conceptually very
different. Unbiasedness can hold for any sample size. Consistency is strictly a
large-sample property.
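Because the two properties are easy to conflate, a small simulation may help; the sketch below assumes a hypothetical normal population with mean 40 and standard deviation 10, and shows that the sample mean is centred on the true mean at every sample size (unbiasedness) while its spread shrinks only as n grows (consistency):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 40.0, 10.0  # assumed (hypothetical) population parameters

for n in (10, 100, 1000):
    # 2000 replications of the sample mean at this sample size
    means = rng.normal(mu, sigma, size=(2000, n)).mean(axis=1)
    # The mean of the sample means stays near mu at every n (unbiasedness);
    # their standard deviation shrinks as n grows (consistency).
    print(f"n={n:5d}  mean={means.mean():7.3f}  sd={means.std():6.3f}")
```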
Suppose we want to ‘test the hypothesis’ that the true mean income of
households is equal to some value. For example, we might test the null
hypothesis that it’s equal to $41,000.
$$H_0: \mu_I = 41.0 \qquad H_1: \mu_I \neq 41.0$$
Recall the 99% confidence interval: $39.530 \le \mu_I \le 40.470$.
Therefore, if a particular null hypothesis doesn’t lie within this interval, we can reject it. The confidence interval is also known as the Acceptance Region. If a hypothesised value is not within this interval, it lies in the Rejection Region.
We might want to say something about the different ‘types’ of mistakes that one can make in these hypothesis tests.
• ‘Type I’ Error. Rejecting a null hypothesis when it’s true. Even when the null is true, we will wrongly reject it some of the time, so we are not always going to be right.
• ‘Type II’ Error. Accepting a null hypothesis when it’s false. Suppose our null had been that the mean household income was $40,250. This clearly lies within the confidence interval, so we would not reject the null. Yet the true mean might be $41,000.
This probability has been traditionally assigned the Greek letter β (do not
confuse this β with β’s used in the regression model). One minus this value is
called the Power of the Test. The classical approach is to set α to be a small
number and minimise β (or maximise the power of the test given the confidence
level).
We know that the statistic computed from the sample follows a t distribution
with n-1 degrees of freedom.
$$t = \frac{\bar{I} - \mu_I}{S/\sqrt{n}}$$
Let’s take the earlier example. Suppose we want to know whether or not μI is
equal to 41.
$$t = \frac{40 - 41}{10/\sqrt{3000}} = -5.477$$
Can we reject the null? Depends on the confidence level. Suppose again that
we set α=.01. Again, it’s a two-sided test.
For a two-sided test at α = .01 with this many degrees of freedom, the critical value is 2.576. Clearly, the absolute value of our computed t exceeds this critical value. We reject the null that the true mean household income is $41,000.
What about another null, for example that the population mean is equal to
$40,250? Compute the t statistic as:
$$t = \frac{40 - 40.25}{10/\sqrt{3000}} = -1.369$$
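Both tests are easy to reproduce; the sketch below uses the same summary figures (mean 40, S = 10, n = 3000) and the 1% two-sided critical value of 2.576:

```python
import math

i_bar, s, n = 40.0, 10.0, 3000
se = s / math.sqrt(n)   # estimated standard error of the sample mean
crit = 2.576            # 1% two-sided critical value (large df)

for null in (41.0, 40.25):
    t = (i_bar - null) / se
    verdict = "reject" if abs(t) > crit else "cannot reject"
    print(f"H0: mu_I = {null}: t = {t:.3f} -> {verdict}")
# H0: mu_I = 41.0:  t = -5.477 -> reject
# H0: mu_I = 40.25: t = -1.369 -> cannot reject
```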
In econometrics, more often than not, we test the null that a true coefficient is equal to zero. In other words, whether or not an estimated coefficient is significantly different from zero. The terminology is that the estimated coefficient is ‘statistically significant’. In the context of this example, this is written:
$$t = \frac{\bar{I}}{S/\sqrt{n}}$$
where the numerator is the estimated coefficient and the denominator is the
estimated standard error.
3. The ‘P Value’
The problem with classical hypothesis testing is that one has to choose a
particular significance level. This is quite arbitrary. Conventionally, α is set
equal to 0.10, 0.05 and/or 0.01.
The way around this is to compute the Exact Significance Level or P Value.
This is the largest significance level at which the null cannot be rejected (or the
lowest significance level at which the null can be rejected). Consider the
earlier example. Instead of specifying α and looking up the critical value, you
plug in the computed t statistic as the critical value and ‘look up’ the
corresponding α.
Suppose the corresponding α turns out to be 3.55%: the test is then significant at the 3.55% level. At 5%, the null is rejected, but at the 1% level, we cannot reject the null.
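As an illustration of the mechanics, the sketch below computes a two-sided p-value with scipy; applied to the t = −1.369 statistic from the earlier example (df = 2999), it gives a p-value of about 0.17, so that null cannot be rejected even at the 10% level:

```python
from scipy import stats

def two_sided_p(t_stat: float, df: int) -> float:
    # The P value: probability, under the null, of a |t| at least this large
    return 2 * stats.t.sf(abs(t_stat), df)

# Earlier example: testing H0: mu_I = 40.25 gave t = -1.369 with df = 2999
print(f"p = {two_sided_p(-1.369, 2999):.4f}")  # roughly 0.171
```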
We can also compare two population variances. The parameter to be tested is $\sigma_1^2/\sigma_2^2$. The statistic used is
$$\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$
which has a distribution called F with degrees of freedom $n_1 - 1$ and $n_2 - 1$, if the two populations are normally distributed.
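As a sketch of the mechanics: under $H_0: \sigma_1^2 = \sigma_2^2$ the statistic reduces to $S_1^2/S_2^2$, so with hypothetical sample variances and sizes (all numbers below are made up) the test runs as follows:

```python
from scipy import stats

# Hypothetical sample variances and sample sizes for the two populations
s1_sq, n1 = 12.5, 25
s2_sq, n2 = 8.0, 30

f_stat = s1_sq / s2_sq                       # S1^2 / S2^2 under H0
f_crit = stats.f.ppf(0.95, n1 - 1, n2 - 1)   # 5% upper-tail critical value
print(f"F = {f_stat:.3f}, 5% critical value = {f_crit:.3f}")
```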
$$H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0$$
Why is the test important? How to test it? Note that we typically put the value(s)
that we do not expect in the null hypothesis and put the values that we expect to
be true in the alternative hypothesis.
We note that
$$\hat{\beta}_1 \sim N\!\left(\beta_1, \text{Var}(\hat{\beta}_1)\right)$$
So by standardisation, we have
$$\frac{\hat{\beta}_1 - \beta_1}{\sigma(\hat{\beta}_1)} \sim N(0, 1)$$
Replacing the unknown $\sigma(\hat{\beta}_1)$ with the estimated standard error gives
$$t = \frac{\hat{\beta}_1 - \beta_1}{s(\hat{\beta}_1)}$$
where $s(\hat{\beta}_1) = \dfrac{\hat{\sigma}}{\sqrt{\sum x_i^2}}$ (in the SLRM) and K is the number of independent variables. This is a t distribution with $n - K - 1$ degrees of freedom.
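To make the pieces concrete, here is a minimal sketch that computes $\hat{\beta}_1$ and $s(\hat{\beta}_1)$ by hand for the SLRM; the data are made up purely for illustration:

```python
import numpy as np

# Made-up data for a simple linear regression y = b0 + b1*x + e
rng = np.random.default_rng(1)
x = rng.normal(size=30)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=30)

# OLS estimates
x_dev = x - x.mean()
b1 = (x_dev * (y - y.mean())).sum() / (x_dev ** 2).sum()
b0 = y.mean() - b1 * x.mean()

# sigma_hat^2 = RSS / (n - K - 1), with K = 1 in the SLRM
resid = y - b0 - b1 * x
sigma_hat = np.sqrt((resid ** 2).sum() / (len(x) - 2))
se_b1 = sigma_hat / np.sqrt((x_dev ** 2).sum())  # s(b1_hat)
print(f"b1_hat = {b1:.3f}, s(b1_hat) = {se_b1:.3f}, t = {b1 / se_b1:.2f}")
```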
We’ve gone through the ‘general mechanics’ already, so let’s use a specific numerical example to see how we would proceed. To illustrate, suppose we want to put a ‘confidence interval’ around $\beta_1$, and suppose we had a cross section of 10 households and wanted to estimate our linear consumption function. We get the following:
$$\hat{C}_i = \underset{(6.414)}{24.455} + \underset{(.036)}{.509}\,DI_i$$
$$n = 10 \qquad df = 8 \qquad \hat{\sigma}^2 = 42.159 \qquad R^2 = .962$$
Might want to construct the 95% confidence interval for β0. You should get the
following:
$$9.664 \le \beta_0 \le 39.245$$
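These bounds are just $\hat{\beta}_0 \pm t_{.025,\,8} \times 6.414$; the sketch below reproduces them with scipy:

```python
from scipy import stats

b0_hat, se_b0, df = 24.455, 6.414, 8
t_crit = stats.t.ppf(0.975, df)   # about 2.306
print(f"95% CI: {b0_hat - t_crit * se_b0:.3f} <= beta_0 <= "
      f"{b0_hat + t_crit * se_b0:.3f}")
# 95% CI: 9.664 <= beta_0 <= 39.246 (matching the interval above up to rounding)
```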
$$H_0: \beta_1 = 0.3 \qquad H_1: \beta_1 \neq 0.3$$
We want to know whether or not the true MPC is 0.3, where the alternative
hypothesis is that it’s something other than 0.3. Compute the t statistic from the
general formula above:
$$t = \frac{.509 - .3}{.036} = 5.806$$
Similarly, for a one-sided test of $H_0: \beta_1 = 0.4$ against $H_1: \beta_1 > 0.4$:
$$t = \frac{.509 - .4}{.036} = 3.028$$
but note that the formula and procedure are identical to the two-sided test. Only
the upper confidence limit will be changed. All 5% of the area is in the upper
tail.
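Both MPC tests can be checked in a few lines; the sketch below reproduces the two statistics and, for illustration, compares them with the 5% critical values for 8 degrees of freedom (two-sided 2.306, one-sided 1.860):

```python
from scipy import stats

b1_hat, se_b1, df = 0.509, 0.036, 8

for null, sided in ((0.3, "two"), (0.4, "one")):
    t = (b1_hat - null) / se_b1
    crit = stats.t.ppf(0.975 if sided == "two" else 0.95, df)
    print(f"H0: beta_1 = {null} ({sided}-sided): t = {t:.3f}, crit = {crit:.3f}")
# t = 5.806 vs 2.306 and t = 3.028 vs 1.860: reject in both cases
```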
1. The t-test does not say whether the model is economically valid. Examples: regressing stock prices on the intensity of dog barking, or regressing the consumer price index in Singapore on rainfall in the UK.
2. The t-test does not test the importance of an independent variable.
V. Further Use of t-Test: Testing a Linear Restriction
$$Y_i = \alpha_1 L_i^{\beta_1} K_i^{\beta_2} e^{\epsilon_i}$$
where: $Y_i$ = Output.
$L_i$ = Labour input.
$K_i$ = Capital input.
Taking logs,
$$\ln Y_i = \beta_0 + \beta_1 \ln L_i + \beta_2 \ln K_i + \epsilon_i$$
where $\beta_0 = \ln \alpha_1$.
$$H_0: \beta_1 + \beta_2 = 1$$
This is a test of 'Constant Returns to Scale'.
$$t = \frac{(\hat{\beta}_1 + \hat{\beta}_2) - (\beta_1 + \beta_2)}{SE(\hat{\beta}_1 + \hat{\beta}_2)} = \frac{(\hat{\beta}_1 + \hat{\beta}_2) - 1}{SE(\hat{\beta}_1 + \hat{\beta}_2)}$$
where
$$SE(\hat{\beta}_1 + \hat{\beta}_2) = \sqrt{\widehat{\text{Var}}(\hat{\beta}_1) + \widehat{\text{Var}}(\hat{\beta}_2) + 2\,\widehat{\text{Cov}}(\hat{\beta}_1, \hat{\beta}_2)}$$
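Given the estimated variances and covariance of $\hat{\beta}_1$ and $\hat{\beta}_2$ (the numbers below are hypothetical), the computation is:

```python
import math

# Hypothetical estimates and variance-covariance entries
b1_hat, b2_hat = 0.65, 0.41
var_b1, var_b2, cov_12 = 0.004, 0.003, -0.002

se_sum = math.sqrt(var_b1 + var_b2 + 2 * cov_12)  # SE(b1_hat + b2_hat)
t = (b1_hat + b2_hat - 1.0) / se_sum               # H0: beta_1 + beta_2 = 1
print(f"t = {t:.3f}")
```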
$$H_0: \beta_1 = \beta_2 = \cdots = \beta_K = 0$$
$H_1$: at least one of these β’s is not zero.
$$\begin{array}{lcc}
\text{SS} & \text{df} & \text{MSS} \\
\text{ESS} = \hat{\beta}_1 \sum y_i x_{1i} + \hat{\beta}_2 \sum y_i x_{2i} & K & \dfrac{\hat{\beta}_1 \sum y_i x_{1i} + \hat{\beta}_2 \sum y_i x_{2i}}{K} \\
\text{RSS} = \sum e_i^2 & n - K - 1 & \dfrac{\sum e_i^2}{n - K - 1} \\
\text{TSS} = \sum y_i^2 & n - 1 &
\end{array}$$
The F statistic is the ratio of the two mean squares:
$$F = \frac{\text{ESS}/K}{\text{RSS}/(n - K - 1)}$$
If the computed F statistic exceeds some critical value, we reject $H_0$ that the slope coefficients are simultaneously equal to zero.
Example: Woody’s.
$$F = \frac{29 \times (.618)}{3 \times (1 - .618)} = 15.64$$
Since this exceeds the critical value at the 5 percent significance level (2.93), we can reject $H_0$ that all slope coefficients are simultaneously equal to zero, i.e., that the model is insignificant overall.
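The Woody's statistic follows directly from $R^2 = .618$, K = 3, and n − K − 1 = 29; a short sketch reproducing it and the critical value:

```python
from scipy import stats

r_sq, k, df_resid = 0.618, 3, 29   # Woody's regression

f_stat = (r_sq / k) / ((1 - r_sq) / df_resid)
f_crit = stats.f.ppf(0.95, k, df_resid)
print(f"F = {f_stat:.2f}, 5% critical value = {f_crit:.2f}")  # 15.64 vs 2.93
```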
No real estimation issues here. The model is still linear in the parameters. However, $L_i$ and $L_i^2$ will tend to be highly correlated.
To assess the incremental contribution of $L_i$ and $L_i^2$, we use the following F test (i.e., $H_0: \beta_2 = \beta_3 = 0$).
$$F = \frac{(R_{UR}^2 - R_R^2)/m}{(1 - R_{UR}^2)/(n - K - 1)} = \frac{(RSS_R - RSS_{UR})/m}{RSS_{UR}/(n - K - 1)}$$
$$F = \frac{(.433 - .405)/2}{(1 - .433)/96} = \frac{.014}{.00591} = 2.369$$
At the 5% significance level, the critical value is 3.10. Since our F statistic is less than this value, we can’t reject $H_0$: experience and experience squared have an insignificant effect in our regression, and there is no statistical evidence for their inclusion.
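The same calculation in code, using the R² form of the F test with m = 2 restrictions and n − K − 1 = 96 (the computed critical value comes out near the 3.10 quoted above):

```python
from scipy import stats

r_sq_ur, r_sq_r = 0.433, 0.405
m, df_resid = 2, 96                # restrictions tested; residual df

f_stat = ((r_sq_ur - r_sq_r) / m) / ((1 - r_sq_ur) / df_resid)
f_crit = stats.f.ppf(0.95, m, df_resid)
print(f"F = {f_stat:.3f}, 5% critical value = {f_crit:.2f}")  # 2.369 vs ~3.09
```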
Note that this F-test can be used to deal with a null hypothesis that contains
multiple hypotheses or a single hypothesis about a group of coefficients.