
Hypothesis Testing

I. Statistical Inference:

Statistical inference “... draws conclusions from (or makes inferences about) a
population from a random sample taken from that population.”

A population is the 'universe', the total set of observations of interest.

A sample is a subset of a given population.

For example, we could compute the average income for Singapore households
during the last year:

$$\bar{I} = \frac{1}{3000}\sum_{i=1}^{3000} I_i$$

where $I_i$ is household income. $\bar{I}$ is a sample statistic of the population
parameter $E(I) = \mu_I$, the average income across all households in this
country.

There are 2 steps in this process: Estimation and Hypothesis Testing.

The idea is that there is some underlying distribution to our random variable,
household income.

Although it is unlikely to be the case, assume that income is normally
distributed:

$$I \sim N(\mu_I, \sigma_I^2)$$

with population mean $\mu_I$ and variance $\sigma_I^2$.

If this is the case, then the 'point estimate' $\bar{I}$ is distributed:

$$\bar{I} \sim N\!\left(\mu_I, \frac{\sigma_I^2}{n}\right)$$

where $n$ is the sample size.



We can turn this into a 'standardized' normal distribution by subtracting the
population mean and dividing by the standard deviation of the estimator:

$$Z = \frac{\bar{I} - \mu_I}{\sigma_I/\sqrt{n}} \sim N(0, 1)$$

Of course, we don't know the population standard deviation $\sigma_I$, but we can
replace it with the sample standard deviation ($S$):

$$\hat{\sigma} = S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(I_i - \bar{I}\right)^2}$$

and rewrite the earlier expression:

I - I
t=
S / n

which follows a t distribution with (n-1) degrees of freedom.

We can look up the critical t values in the Table for the t-distribution:

$$\text{Prob}(-2.576 \le t \le 2.576) = 0.99$$

Substitute in the earlier expression for t and rearrange terms:

$$\text{Prob}\!\left(\bar{I} - \frac{2.576\,S}{\sqrt{n}} \;\le\; \mu_I \;\le\; \bar{I} + \frac{2.576\,S}{\sqrt{n}}\right) = 0.99$$

This is the Confidence Interval around the unknown population parameter $\mu_I$.
Interpretation: there is a 99% probability that this 'random interval' contains
the true $\mu_I$. Confidence level = 100% − significance level.

Suppose $\bar{I} = 40$ (i.e., the average household receives $40,000 per year), and the
sample standard deviation is 10. With $n = 3000$:

$$40 - \frac{2.576 \times 10}{\sqrt{3000}} \;\le\; \mu_I \;\le\; 40 + \frac{2.576 \times 10}{\sqrt{3000}}$$

$$39.530 \le \mu_I \le 40.470$$
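
As a quick check, here is a minimal sketch of this interval in Python, assuming the sample statistics above ($\bar{I} = 40$, $S = 10$, $n = 3000$):

```python
# Minimal sketch: 99% confidence interval for mean household income,
# using the sample statistics assumed in the text.
import math
from scipy import stats

I_bar, S, n = 40.0, 10.0, 3000
alpha = 0.01

# Critical value from the t distribution with n - 1 df; with n = 3000 this
# is essentially the normal-table value 2.576 used above.
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)

half_width = t_crit * S / math.sqrt(n)
print(f"{I_bar - half_width:.3f} <= mu_I <= {I_bar + half_width:.3f}")
# -> approximately 39.530 <= mu_I <= 40.470
```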

Properties of Point Estimators



• Linearity. An estimator is linear if it is a linear function of the
observations in the sample:

$$\bar{I} = \frac{1}{n}\sum_{i=1}^{n} I_i = \frac{1}{n}\left(I_1 + I_2 + \ldots + I_n\right)$$
3000 3000

• Unbiasedness. An estimator is unbiased if, in repeated samples, the mean
value of the estimator is equal to the true parameter value:

$$E(\bar{I}) = \mu_I$$

• Efficiency. This has to do with the 'precision' of our estimator. An
estimator is efficient if it has the smallest variance among all linear
unbiased estimators: it is then BLUE (Best Linear Unbiased Estimator).

• Consistency. An estimator is 'consistent' if it approaches the true
parameter value as the sample size gets larger.

Note that the properties of unbiasedness and consistency are conceptually very
different. Unbiasedness can hold for any sample size. Consistency is strictly a
large-sample property.
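
A small simulation (with made-up normal data) makes the distinction concrete: averaged over repeated samples, the sample mean is centred on the true parameter at every $n$ (unbiasedness), while its spread around that value shrinks only as $n$ grows (consistency).

```python
# Simulation sketch: the sample mean is unbiased at every sample size,
# while consistency shows up as its spread shrinking with n.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 40.0, 10.0          # assumed 'true' population parameters

for n in (10, 100, 10_000):
    # 5,000 repeated samples of size n; one sample mean per row
    means = rng.normal(mu, sigma, size=(5_000, n)).mean(axis=1)
    print(f"n={n:6d}: mean of I_bar = {means.mean():.3f}, "
          f"sd of I_bar = {means.std():.3f}")
```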

Suppose we want to ‘test the hypothesis’ that the true mean income of
households is equal to some value. For example, we might test the null
hypothesis that it’s equal to $41,000.

H 0 :  I = 41.0 H 1 :  I  41.0

This is known as a two-sided alternative hypothesis. An example of a one-sided
alternative hypothesis is that the true parameter is greater than $41K.

II. Three Typical Approaches to Hypothesis Testing:


1. The Interval Approach.

We computed earlier the 99% confidence interval for μI.

$$39.530 \le \mu_I \le 40.470$$

Therefore, if a particular null hypothesis doesn't lie within this interval, we can
reject it. The confidence interval is also known as the Acceptance Region; a
hypothesized value not within this interval lies in the Rejection Region.

We might want to say something about the different 'types' of mistakes that one
can make in these hypothesis tests.

• 'Type I' Error. Rejecting a null hypothesis when it is true. We are not
always going to be right.

Type I Error = Prob(Rejecting $H_0$ | $H_0$ is True) = $\alpha$

where this is a 'conditional probability', and $\alpha$ (here .01) is the chosen
'significance level'.

• 'Type II' Error. Accepting a null hypothesis when it is false. Suppose
our null had been that the mean household income was $40,250. This
clearly lies within the confidence interval, so we would not reject the
null. Yet the truth might be that it is $41,000.

Type II Error = Prob(Accepting $H_0$ | $H_0$ is False) = $\beta$

This probability has been traditionally assigned the Greek letter β (do not
confuse this β with β’s used in the regression model). One minus this value is
called the Power of the Test. The classical approach is to set α to be a small
number and minimise β (or maximise the power of the test given the confidence
level).

2. The Significance Test Approach.

We know that the statistic computed from the sample follows a t distribution
with n-1 degrees of freedom.
I  I
t
S/ n

Let’s take the earlier example. Suppose we want to know whether or not μI is
equal to 41.

$$t = \frac{40 - 41}{10/\sqrt{3000}} \approx -5.477$$

Can we reject the null? That depends on the significance level chosen. Suppose
again that we set $\alpha = .01$. Again, it's a two-sided test.

$$\text{Prob}(|t| > 2.576) = 0.01$$

Clearly, the absolute value of our computed t exceeds this critical value. We
reject the null that the true mean household income is $41,000.

What about another null, for example that the population mean is equal to
$40,250? Compute the t statistic as:

$$t = \frac{40 - 40.25}{10/\sqrt{3000}} \approx -1.369$$

Clearly, here we’d be unable to reject this null.
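
Both tests can be sketched in a few lines of Python, again assuming the sample statistics from the text:

```python
# Significance-test sketch for the two nulls above (mu = 41 and mu = 40.25),
# reporting two-sided p-values from the t distribution.
import math
from scipy import stats

I_bar, S, n = 40.0, 10.0, 3000
se = S / math.sqrt(n)

for mu_0 in (41.0, 40.25):
    t = (I_bar - mu_0) / se
    p = 2 * stats.t.sf(abs(t), df=n - 1)   # two-sided p-value
    print(f"H0: mu = {mu_0}: t = {t:.3f}, p = {p:.4f}")
# t = -5.477 -> reject at the 1% level; t = -1.369 -> cannot reject
```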

In econometrics, more often than not, we test the null that a parameter is equal
to zero; in other words, whether or not an estimated coefficient is significantly
different from zero. The terminology is that the estimated coefficient is
'statistically significant'. In the context of this example, this is written:

$$t = \frac{\bar{I}}{S/\sqrt{n}}$$

where the numerator is the estimated coefficient and the denominator is the
estimated standard error.

3. The ‘P Value’

The problem with classical hypothesis testing is that one has to choose a
particular significance level. This is quite arbitrary. Conventionally, α is set
equal to 0.10, 0.05 and/or 0.01.

The way around this is to compute the Exact Significance Level or P Value.
This is the largest significance level at which the null cannot be rejected (or the
lowest significance level at which the null can be rejected). Consider the
earlier example. Instead of specifying α and looking up the critical value, you
plug in the computed t statistic as the critical value and ‘look up’ the
corresponding α.

$$\text{Prob}(|t| > 5.477) < .0001$$

In this case, 5.477 is significant at better than a .01% level.


A more typical result might be the following:

$$\text{Prob}(|t| > 2.196) = .0355$$

This says that this test is significant at a 3.55% level. At 5%, the null is
rejected, but at the 1% level, we cannot reject the null.
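
A sketch of this 'reverse lookup' in Python. The degrees of freedom below are illustrative: $n - 1 = 2999$ for the income example, while the 2.196 case evidently comes from a much smaller sample (about 30 degrees of freedom gives a p-value near .0355):

```python
# P-value sketch: plug in the computed t statistic and recover the exact
# significance level, instead of fixing alpha in advance.
from scipy import stats

print(2 * stats.t.sf(5.477, df=2999))  # ~5e-08: far below even the .01% level
print(2 * stats.t.sf(2.196, df=30))    # ~.0355: significant at 5%, not at 1%
```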

Inference about the population variance:

The sample variance $S^2$ is an unbiased, consistent and efficient point estimator
for $\sigma^2$. If the population is normally distributed, the statistic

$$\frac{(n-1)S^2}{\sigma^2}$$

has a Chi-square distribution with $(n-1)$ degrees of freedom.

Inference about the ratio of two population variances:

The parameter to be tested is $\sigma_1^2/\sigma_2^2$. The statistic used is

$$\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$

which has an F distribution with degrees of freedom $n_1 - 1$ and $n_2 - 1$, if the two
populations are normally distributed.
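
Both statistics are easy to exercise on simulated normal data; a minimal sketch (all numbers made up):

```python
# Sketch: chi-square interval for one variance, F statistic for the ratio
# of two variances, on simulated normal samples.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x1 = rng.normal(0.0, 2.0, size=30)      # true variance 4
x2 = rng.normal(0.0, 2.0, size=25)
n1, n2 = len(x1), len(x2)
s1_sq, s2_sq = x1.var(ddof=1), x2.var(ddof=1)

# (n-1)S^2/sigma^2 ~ chi-square(n-1)  =>  95% CI for sigma^2
lo = (n1 - 1) * s1_sq / stats.chi2.ppf(0.975, df=n1 - 1)
hi = (n1 - 1) * s1_sq / stats.chi2.ppf(0.025, df=n1 - 1)
print(f"95% CI for sigma^2: ({lo:.2f}, {hi:.2f})")

# Under H0: sigma1^2 = sigma2^2, S1^2/S2^2 ~ F(n1-1, n2-1)
F = s1_sq / s2_sq
p = 2 * min(stats.f.sf(F, n1 - 1, n2 - 1), stats.f.cdf(F, n1 - 1, n2 - 1))
print(f"F = {F:.3f}, two-sided p = {p:.3f}")
```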

III. Hypothesis Testing in Linear Regression Models

Suppose we want to test the hypothesis in linear regression models,

H 0 : 1 = 0 H 1 : 1  0

Why is this test important, and how do we carry it out? Note that we typically
put the value(s) that we do not expect in the null hypothesis and the values
that we expect to be true in the alternative hypothesis.

We note that

$$\hat{\beta}_1 \sim N\!\left(\beta_1, \text{Var}(\hat{\beta}_1)\right)$$

where $\text{Var}(\hat{\beta}_1) = \dfrac{\sigma^2}{\sum x_i^2}$.

So by standardisation, we have

$$\frac{\hat{\beta}_1 - \beta_1}{\sigma(\hat{\beta}_1)} \sim N(0, 1)$$

We can use this sampling distribution to make statistical inferences. However,
when the standard deviation of the OLS estimator is unknown, this sampling
distribution is not applicable. We can replace $\sigma$ with its estimator and get a
new statistic (with a new sampling distribution):

$$t = \frac{\hat{\beta}_1 - \beta_1}{s(\hat{\beta}_1)} = \frac{(\hat{\beta}_1 - \beta_1)\sqrt{\sum x_i^2}}{\hat{\sigma}} \;\sim\; t(n - K - 1)$$

where $s(\hat{\beta}_1) = \dfrac{\hat{\sigma}}{\sqrt{\sum x_i^2}}$ (in the SLRM) and $K$ is the number of independent
variables. This is a t distribution with $n - K - 1$ degrees of freedom.

We've gone through the 'general mechanics' already, so let's use a specific
numerical example to see how we would proceed. To illustrate, suppose we
want to put a 'confidence interval' around $\beta_1$, and suppose we had a cross-
section of 10 households and wanted to estimate our linear consumption
function. We get the following:

$$\hat{C}_i = \underset{(6.414)}{24.455} + \underset{(.036)}{.509}\,DI_i$$

$$n = 10 \qquad df = 8 \qquad \hat{\sigma}^2 = 42.159 \qquad R^2 = .962$$

We want to place a confidence interval around the estimated slope coefficient.
We need to choose a confidence level; suppose 95% ($\alpha = .05$), a two-tailed test.
From the table, the critical value is 2.306. Now recall the general expression
for the confidence interval:

$$\text{Prob}\!\left[\hat{\beta}_1 - t_{\alpha/2}\,s(\hat{\beta}_1) \;\le\; \beta_1 \;\le\; \hat{\beta}_1 + t_{\alpha/2}\,s(\hat{\beta}_1)\right] = 1 - \alpha$$

Plugging in the values and estimates from above we get:

$$.509 - (2.306)(.036) \;\le\; \beta_1 \;\le\; .509 + (2.306)(.036)$$

$$.4268 \le \beta_1 \le .5914$$

You might want to construct the 95% confidence interval for $\beta_0$ yourself. You
should get the following:

$$9.664 \le \beta_0 \le 39.245$$
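
Both intervals can be reproduced from the reported coefficients and standard errors; a minimal sketch:

```python
# Sketch: 95% confidence intervals for beta_1 and beta_0 from the reported
# (rounded) estimates and standard errors.
from scipy import stats

n, K = 10, 1
t_crit = stats.t.ppf(0.975, df=n - K - 1)   # 2.306 with 8 df

for name, b, se in (("beta_1", 0.509, 0.036), ("beta_0", 24.455, 6.414)):
    print(f"{b - t_crit * se:.3f} <= {name} <= {b + t_crit * se:.3f}")
# beta_1: ~.426 to ~.592 (the text's .4268-.5914 reflects unrounded inputs);
# beta_0: ~9.664 to ~39.246
```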

Now suppose I want to perform a two-sided test.

H 0 :  1 = 0.3 H 1 :  1  0.3

We want to know whether or not the true MPC is 0.3, where the alternative
hypothesis is that it’s something other than 0.3. Compute the t statistic from the
general formula above:

$$t = \frac{.509 - .3}{.036} = 5.806$$

which is significant at better than a 5% level (critical value given above as
2.306) and at the 1% level (critical value 3.355). Reject the null.

[Figure: the density function for the t variable]

Now suppose I want to perform a one-sided test.



H 0 :  1  0.4 H 1 :  1 > 0.4


We want to know whether or not the true MPC is less than or equal to 0.4,
where the alternative hypothesis is that it’s greater than 0.4.

Compute the t statistic:

$$t = \frac{.509 - .4}{.036} = 3.028$$

but note that the formula and procedure are identical to the two-sided test. Only
the critical value changes, since all 5% of the rejection area is now in the upper
tail (one-tailed critical value 1.860 with 8 df).
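
A sketch of both tests on the slope, using the reported estimate and standard error:

```python
# Sketch: two-sided test of beta_1 = 0.3 and one-sided test of beta_1 <= 0.4,
# with 8 degrees of freedom as in the consumption-function example.
from scipy import stats

b1, se, df = 0.509, 0.036, 8

t_two = (b1 - 0.3) / se                   # 5.806
print(t_two > stats.t.ppf(0.995, df=df))  # True: exceeds the 1% critical value 3.355

t_one = (b1 - 0.4) / se                   # 3.028
print(t_one > stats.t.ppf(0.95, df=df))   # True: exceeds the one-tailed 5% value 1.860
```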

IV. Do NOT Overuse the t-Test

1. The t-test does not say whether the model is economically valid. Examples:
regressing stock prices on the intensity of dog barking, or regressing the
consumer price index in Singapore on rainfall in the UK.
2. The t-test does not test the importance of an independent variable.

V. Further Use of t-Test: Testing a Linear Restriction

In this example, begin with a Cobb-Douglas production function:

$$Y_i = \alpha L_i^{\beta_1} K_i^{\beta_2} e^{\epsilon_i}$$

where: $Y_i$ = Output.
$L_i$ = Labour input.
$K_i$ = Capital input.

Taking natural logs, the model can be written in linear form:

$$\ln Y_i = \beta_0 + \beta_1 \ln L_i + \beta_2 \ln K_i + \epsilon_i$$

where $\beta_0 = \ln \alpha$.

An example of a linear restriction would be the following:

H 0 : 1   2  1
This is a test of 'Constant Returns to Scale'.

Use the following t-test:

$$t = \frac{(\hat{\beta}_1 + \hat{\beta}_2) - (\beta_1 + \beta_2)}{SE(\hat{\beta}_1 + \hat{\beta}_2)} = \frac{(\hat{\beta}_1 + \hat{\beta}_2) - 1}{\sqrt{\widehat{\text{Var}}(\hat{\beta}_1) + \widehat{\text{Var}}(\hat{\beta}_2) + 2\,\widehat{\text{Cov}}(\hat{\beta}_1, \hat{\beta}_2)}}$$

Compute this t statistic. If it exceeds the critical value, reject $H_0$.
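
As a sketch, the same restriction can be tested with statsmodels on simulated data (the data-generating values below are made up); `t_test` builds the standard error of $\hat{\beta}_1 + \hat{\beta}_2$ from the estimated variances and covariance, exactly as in the formula above:

```python
# Sketch: test H0: beta_1 + beta_2 = 1 (constant returns) on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 200
lnL = rng.normal(3.0, 0.5, n)
lnK = rng.normal(4.0, 0.5, n)
lnY = 0.2 + 0.6 * lnL + 0.4 * lnK + rng.normal(0.0, 0.1, n)  # true sum = 1

X = sm.add_constant(np.column_stack([lnL, lnK]))  # params: const, x1, x2
res = sm.OLS(lnY, X).fit()
print(res.t_test("x1 + x2 = 1"))  # small |t|, large p: cannot reject CRS
```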

VI. Testing for Overall Insignificance in an MLR

In an MLR with $K$ independent variables, you want to test:

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_K = 0$$

$H_1$: at least one of these $\beta$'s is not zero.

The t-test can't be used to test the overall significance of a regression model. In
particular, we can't simply take the product of the individual tests. The reason
is that the same data are used to estimate the coefficients, so they are not
independent of one another. It's possible for the coefficients to be individually
equal to zero and yet jointly different from zero. Although $R^2$ or $\bar{R}^2$ measure
the overall degree of fit of an equation, they are not a formal test.

Begin with Analysis of Variance (ANOVA). A typical table:

Source | SS | df | MSS
ESS | $\hat{\beta}_1\sum y_i x_{1i} + \hat{\beta}_2\sum y_i x_{2i}$ | $K$ | $\text{ESS}/K$
RSS | $\sum e_i^2$ | $n - K - 1$ | $\sum e_i^2/(n - K - 1)$
TSS | $\sum y_i^2$ | $n - 1$ |

When $\epsilon_i$ is normally distributed, we can construct the following variable:

$$F = \frac{\text{ESS}/df_{ESS}}{\text{RSS}/df_{RSS}} = \frac{\left(\hat{\beta}_1\sum y_i x_{1i} + \hat{\beta}_2\sum y_i x_{2i}\right)/K}{\sum e_i^2/(n - K - 1)} = \frac{(n - K - 1)\,R^2}{K(1 - R^2)}$$

This is the ratio of two independent chi-square distributed variables, each
divided by its degrees of freedom; it therefore has an F distribution with $K$
and $n - K - 1$ degrees of freedom.

If the computed F statistic exceeds some critical value, we reject H0 that the
slope coefficients are simultaneously equal to zero.

Note that the F statistic is closely related to $R^2$: if $R^2 = 0$, $F = 0$; if $R^2 = 1$, then $F \to \infty$.

Example: Woody’s.

$$\hat{Y}_i = 102{,}192 - 9075\,N_i + 0.355\,P_i + 1.288\,I_i \qquad R^2 = 0.618, \quad n = 33$$

$$F = \frac{29\,(.618)}{3\,(1 - .618)} \approx 15.64$$

Since this exceeds the critical value at a 5 percent significance level (2.93), we
can reject $H_0$ that all slope coefficients are simultaneously equal to zero, i.e.,
that the model is overall insignificant.
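
A minimal sketch of this computation, together with the 5% critical value:

```python
# Sketch: overall F test for Woody's from R^2, K, and n (numbers from the text).
from scipy import stats

R2, K, n = 0.618, 3, 33
F = (n - K - 1) * R2 / (K * (1 - R2))
F_crit = stats.f.ppf(0.95, dfn=K, dfd=n - K - 1)
print(f"F = {F:.2f}, 5% critical value = {F_crit:.2f}")  # ~15.64 vs ~2.93
```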

VII. Further Use of F-Test: Assessing the Marginal Contribution of Regressors

Practical question: How do we know whether other explanatory variables should be
added to our regression? How do we assess their 'marginal contribution'?
Theory is often too weak; what do the data tell us?

Suppose we have a random sample of 100 workers. We first obtain these
results:

$$\widehat{\ln W_i} = .673 + \underset{(.013)}{.107}\,S_i \qquad R^2 = .405$$

where $W_i$ is the wage rate and $S_i$ is the number of years of schooling or
education completed (standard error in parentheses).

We want to know whether 'labour market experience' ($L_i$) should be added as a
quadratic expression in this regression.

This involves 'polynomial regressions'. We couldn't discuss these earlier
because they require more than one independent variable. This is a
'second-degree' polynomial (it includes $X$ and $X^2$); a 'third-degree' polynomial
includes $X$, $X^2$ and $X^3$.

We obtain these results:

$$\widehat{\ln W_i} = -.078 + \underset{(.016)}{.118}\,S_i + \underset{(.026)}{.054}\,L_i - \underset{(.001)}{.001}\,L_i^2 \qquad R^2 = .433$$

This means that the wage function is 'concave' from below in terms of log
wages and experience (holding education constant).

There are no real estimation issues here; the model is still linear in the
parameters. However, $L$ and $L^2$ will tend to be highly correlated.
To assess the incremental contribution of both $L_i$ and $L_i^2$, we use the following F
test (i.e., $H_0: \beta_2 = \beta_3 = 0$):

$$F = \frac{\left(R^2_{UR} - R^2_R\right)/m}{\left(1 - R^2_{UR}\right)/(n - K - 1)} = \frac{\left(RSS_R - RSS_{UR}\right)/m}{RSS_{UR}/(n - K - 1)}$$

where: $K$ = number of slope coefficients in the unrestricted regression.
$R^2_{UR}$ = coefficient of determination in the 'unrestricted' regression.
$R^2_R$ = coefficient of determination in the 'restricted' regression.
$m$ = number of linear restrictions ($m = 2$ in this example).

$$F = \frac{(.433 - .405)/2}{(1 - .433)/96} = \frac{.014}{.00591} = 2.369$$

At a 5% significance level, the critical value is 3.10. Since our F statistic is
less than this value, we can't reject $H_0$: experience and experience squared have
an insignificant effect in our regression, and there is no statistical evidence
for their inclusion.
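
The same arithmetic, as a short sketch:

```python
# Sketch: incremental F test for adding L and L^2, from the two reported R^2
# values (m = 2 restrictions, denominator df = n - K - 1 = 96).
from scipy import stats

R2_ur, R2_r, m, df_denom = 0.433, 0.405, 2, 96
F = ((R2_ur - R2_r) / m) / ((1 - R2_ur) / df_denom)
F_crit = stats.f.ppf(0.95, dfn=m, dfd=df_denom)
print(f"F = {F:.3f}, 5% critical value = {F_crit:.2f}")  # ~2.37 < ~3.09
```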

Note that this F-test can be used to deal with a null hypothesis that contains
multiple hypotheses or a single hypothesis about a group of coefficients.

VIII. Questions for Discussion: Q5.11

IX. Computing Exercise: Q5.16
