
Economics 420

Introduction to Econometrics
Professor Woodbury
Fall Semester 2015
Experiments and Basic Statistics
1. The probability framework for statistical inference
2. Estimation
3. Hypothesis testing
4. Confidence intervals

"The problem with QE [quantitative easing] is that it works in practice, but it doesn't work in theory."
Ben Bernanke, Fed Chairman, January 2014

"That's a big problem. If we have no theory why something works, then maybe it doesn't really work."
John Cochrane, Hoover Institution, September 2015

3. Hypothesis testing
The hypothesis testing problem (for the mean)
Definition: The statement being tested in a significance test is
called the null hypothesis. The significance test is designed to
assess the strength of the evidence against the null hypothesis.
Usually, the null is a statement of no effect or no difference.
Do mean hourly earnings of recent US college graduates equal $20 per hour?
Are mean earnings the same for men and women?
Do students at private colleges take on greater debt than those who go to public colleges?

The challenge is to answer these questions based on a sample of data
Usually, we want to know whether the null hypothesis is true, or some alternative hypothesis is true
For the questions about earnings, we test one of the following:
H0: E(Y) = μY,0 vs. H1: E(Y) ≠ μY,0 (2-sided alternative; most general)
H0: E(Y) = μY,0 vs. H1: E(Y) > μY,0 (1-sided, > alternative)
H0: E(Y) = μY,0 vs. H1: E(Y) < μY,0 (1-sided, < alternative)

Some terminology for testing statistical hypotheses

In a given sample, the sample average Ȳ will rarely exactly equal μY,0
Why not?
Maybe the true mean really is not μY,0 (the null is false)
But maybe the true mean really is μY,0 (the null is true), and Ȳ differs from μY,0 because of random sampling (sampling error)
We will never know with complete certainty, but we can make a statement about how likely it is to observe Ȳ if μY,0 is true
Definition: a p-value is the probability of drawing a statistic (in this case Ȳ) as extreme as (or more extreme than) the value you observe, if the null hypothesis is true
The smaller the p-value, the stronger the evidence against the null hypothesis

Example: The National Student Loan Survey sampled 1,280 borrowers who started repaying their loans 4–6 months before the survey
The mean debt of those who attended a 4-year private college was $21,200
The mean debt of those who attended a 4-year public college was $17,100
The difference is $4,100, but would a different sample have given us a different answer?
What is the probability of finding a difference of $4,100 if in fact the mean difference were $0?
What is the null hypothesis here?

Calculate the p-value

Basically, we want to calculate how many standard deviations $4,100 (the observed Ȳ) is from $0
To do this, we need to know the standard deviation of the estimate; we will see later that it is ~$3,000
If so, then $4,100 is
z = (estimate − hypothesized value) / (std. dev. of estimate)
  = ($4,100 − $0) / $3,000
  = 1.37 standard deviations from $0 (the null)
We can look up how likely it is to have an observation that is 1.37 standard deviations from the mean in a standard normal probability table (Think in terms of standard deviations!)

Formally, we are looking for

P(Z ≤ −1.37 or Z ≥ 1.37)
It turns out that P(Z ≥ 1.37) = 0.0853
So P(Z ≤ −1.37 or Z ≥ 1.37) = 0.0853 × 2 = 0.1706
So there is about a 17% chance of seeing a difference as extreme as $4,100 in this sample if the true population difference is $0
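The arithmetic above can be checked in a few lines of Python (an illustrative sketch, separate from the course's Stata workflow; the $4,100 estimate and ~$3,000 standard deviation come from the text):

```python
# z-statistic and two-sided p-value for the student-debt example
from math import erf, sqrt

def normal_cdf(x):
    # CDF of the standard normal, Phi(x), via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

estimate = 4100.0    # observed difference in mean debt
null_value = 0.0     # hypothesized difference under H0
std_error = 3000.0   # std. dev. of the estimate (given in the text)

z = (estimate - null_value) / std_error       # about 1.37
p_value = 2.0 * (1.0 - normal_cdf(abs(z)))    # two-sided tail probability, about 0.17

print(round(z, 2), round(p_value, 2))
```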

Using Stata to compute p-values

The command to compute an expression is display, or di for short
To compute p-values after computing a test statistic, use the command
di normprob(1.58)
This gives the probability that a standard normal random variable is less than or equal to the value 1.58 (about 0.943)
So if a standard normal test statistic takes on the value 1.58, the one-sided p-value is 1 − 0.943 = 0.057
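For comparison, the standard normal CDF is also available in Python's standard library; this sketch reproduces the numbers in the text:

```python
# Check the Stata normprob(1.58) numbers with Python's stdlib
from statistics import NormalDist

phi = NormalDist().cdf(1.58)   # P(Z <= 1.58), about 0.943
p_one_sided = 1.0 - phi        # upper-tail p-value, about 0.057

print(round(phi, 3), round(p_one_sided, 3))
```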

Other functions are defined to give the p-value directly.

di tprob(df,t)
returns the p-value for a t test against a two-sided alternative (t is the absolute value of the t statistic and df is the degrees of freedom)
For example, with df = 31 and t = 1.32, the command returns the value 0.196.

Calculating the p-value based on Ȳ:

p-value = PrH0[ |Ȳ − μY,0| ≥ |Ȳact − μY,0| ]

where Ȳact is the value of Ȳ actually observed (nonrandom)

This is not difficult but you need to think about it and understand it

Calculating the p-value, ctd.

To compute the p-value, you need to know the sampling distribution of Ȳ, which is complicated if n is small.
If n is large, you can use the normal approximation (CLT):

p-value = PrH0[ |Ȳ − μY,0| ≥ |Ȳact − μY,0| ]
        = PrH0[ |(Ȳ − μY,0)/(σY/√n)| ≥ |(Ȳact − μY,0)/(σY/√n)| ]
        = probability under left + right tails of N(0,1)

where σȲ = std. dev. of the distribution of Ȳ = σY/√n.

Calculating the p-value with σY known

For large n, the p-value = the probability that a N(0,1) random variable falls outside |(Ȳact − μY,0)/σȲ|

In practice, σY is unknown, so it must be estimated

Estimator of the variance of Y

The sample variance of Y:

s²Y = [1/(n − 1)] Σi=1..n (Yi − Ȳ)²

Fact: If (Y1,…,Yn) are i.i.d. and E(Y⁴) < ∞, then s²Y →p σ²Y

Why does the law of large numbers apply?
Because s²Y is a sample average (averages over observations); see Appendix 3.3
Technical note: we assume E(Y⁴) < ∞ because here the average is not of Yi, but of its square; see App. 3.3
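As a quick check of the 1/(n − 1) formula, a minimal Python sketch (the data vector is made up for illustration; Python's statistics.variance uses the same divisor):

```python
# Sample variance with the 1/(n-1) divisor, checked against the stdlib
from statistics import variance, mean

y = [3.1, 4.7, 2.2, 5.0, 3.8, 4.1]   # made-up data
n = len(y)
ybar = mean(y)
s2 = sum((yi - ybar) ** 2 for yi in y) / (n - 1)

# statistics.variance computes exactly this quantity
assert abs(s2 - variance(y)) < 1e-12
print(round(s2, 4))
```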

Estimator of the standard deviation of Ȳ

So to estimate σȲ, the standard deviation of Ȳ, we simply take the square root of s²Y and divide by √n:

SE(Ȳ) = σ̂Ȳ = √(s²Y / n) = sY / √n

This is usually called the standard error of Ȳ because it is an estimator of the standard deviation (the terminology is intended to keep the two distinct)
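The standard error formula SE(Ȳ) = sY/√n can be sketched the same way (again with made-up data):

```python
# Standard error of the sample mean: s_Y / sqrt(n)
from statistics import stdev
from math import sqrt

y = [3.1, 4.7, 2.2, 5.0, 3.8, 4.1]   # made-up data
se = stdev(y) / sqrt(len(y))          # estimator of the std. dev. of Ybar

print(round(se, 4))
```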

Computing the p-value with σ²Y estimated:

p-value = PrH0[ |Ȳ − μY,0| ≥ |Ȳact − μY,0| ]
        = PrH0[ |(Ȳ − μY,0)/(σY/√n)| ≥ |(Ȳact − μY,0)/(σY/√n)| ]
        ≅ PrH0[ |(Ȳ − μY,0)/(sY/√n)| ≥ |(Ȳact − μY,0)/(sY/√n)| ]   (large n)

so

p-value = PrH0[ |t| ≥ |tact| ]   (σ²Y estimated)
        = probability under normal tails outside |tact|

where t = (Ȳ − μY,0)/(sY/√n) (the usual t-statistic)

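In code, the large-n p-value is just the probability under the N(0,1) tails outside |tact|. A small Python sketch (the t values fed in are illustrative):

```python
# Large-n two-sided p-value from a t-statistic, via the normal tails
from statistics import NormalDist

def p_value_two_sided(t_act):
    # 2 * P(Z >= |t_act|) for Z ~ N(0,1)
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_act)))

print(round(p_value_two_sided(1.96), 3))   # about 0.05
print(round(p_value_two_sided(1.37), 3))   # about 0.171, the debt example
```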

Notes on significance
The significance level of a test is a pre-specified probability of incorrectly rejecting the null when the null is true
What is the link between the p-value and the significance level?
Definition: If the p-value is as small as or smaller than α, then we say the estimate is statistically significant at the α level

The significance level is pre-specified

For example, if the pre-specified significance level is 5%,
you reject the null hypothesis if |t| ≥ 1.96
equivalently, you reject if p ≤ 0.05
The p-value is sometimes called the marginal significance level
Often, it is better to communicate the p-value than simply whether a test rejects or not: the p-value contains more information than the yes/no statement about whether the test rejects

Digression: the Student t distribution

At this point, you might be wondering ...
What happened to the t-table and the degrees of freedom?
If Yi, i = 1, …, n is i.i.d. N(μY, σ²Y), then the t-statistic has the Student t-distribution with n − 1 degrees of freedom
The critical values of the Student t-distribution are tabulated in the back of all statistics books

Remember the recipe?

Compute the t-statistic
Compute the degrees of freedom, which is n − 1
Look up the 5% critical value
If the t-statistic exceeds (in absolute value) this critical value, reject the null hypothesis

Comments on this recipe and the Student t-distribution

The theory of the t-distribution was one of the early triumphs of mathematical statistics
It is astounding, really: if Y is i.i.d. normal, then you can know the exact, finite-sample distribution of the t-statistic: it is the Student t
So you can construct confidence intervals (using the Student t critical value) that have exactly the right coverage rate, no matter what the sample size
This result was really useful in times when "computer" was a job title, data collection was expensive, and the number of observations was perhaps a dozen

It is also a conceptually beautiful result, and the math is beautiful too, which is probably why stats profs love to teach the t-distribution
But ...

If the sample size is moderate (several dozen) or large (hundreds or more), the difference between the t-distribution and N(0,1) critical values is negligible
Here are some 5% critical values for 2-sided tests:

degrees of freedom (n − 1)    5% t-distribution critical value
10                            2.23
20                            2.09
30                            2.04
60                            2.00
∞                             1.96

So, the Student-t distribution is only relevant when the sample size is very small
But in that case, for it to be correct, you must be sure that the population distribution of Y is normal
In economic data, the normality assumption is rarely plausible
Here are the distributions of some economic data
Do you think earnings are normally distributed?
Suppose you have a sample of n = 10 observations from one of these distributions; would you feel comfortable using the Student t distribution?

Comments on Student t distribution (continued)

Consider the t-statistic testing the hypothesis that two means (groups s, l) are equal:

t = (Ȳs − Ȳl) / √(s²s/ns + s²l/nl)

Even if the population distribution of Y in the two groups is normal, this statistic doesn't have a Student t distribution!

There is a statistic testing this hypothesis that has a Student t distribution: the pooled variance t-statistic
However, the pooled variance t-statistic is only valid if the variances of the normal distributions are the same in the two groups
Would you expect this to be true, say, for men's vs. women's wages?

The Student-t distribution: summary

The assumption that Y is distributed N(μY, σ²Y) is rarely plausible in practice (income? number of children?)
For n > 30, the t-distribution and N(0,1) are very close; as n grows large, the tn−1 distribution converges to N(0,1)
The t-distribution is an artifact from days when sample sizes were small and "computers" were people
For historical reasons, statistical software typically uses the t-distribution to compute p-values, but this is irrelevant when the sample size is moderate or large
For these reasons, we will focus on the large-n approximation given by the CLT

Summary of Hypothesis Testing

1. Compute the standard error of Ȳ [that is, SE(Ȳ) = √(s²Y/n) = sY/√n]
2. Compute the t-statistic (or z-statistic, essentially the same thing because we rely on large sample size):
t = (observed Ȳ − hypothesized μY) / SE(Ȳ)
3. Compute the p-value
4. Reject the hypothesis at the 5% significance level if the p-value is less than 0.05 (that is, if |t| > 1.96)
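The four steps can be sketched end to end in Python. The data vector and the null H0: E(Y) = 20 are made up for illustration; this uses the large-n normal approximation, as the slides recommend:

```python
# Four-step hypothesis test of H0: E(Y) = 20, large-n normal approximation
from statistics import mean, stdev, NormalDist
from math import sqrt

y = [18.2, 22.5, 19.9, 24.1, 17.3, 21.0, 23.4, 19.5, 20.8, 22.2]  # made-up data
mu_0 = 20.0                                        # hypothesized mean under H0

se = stdev(y) / sqrt(len(y))                       # step 1: SE(Ybar) = s_Y/sqrt(n)
t = (mean(y) - mu_0) / se                          # step 2: t-statistic
p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))         # step 3: two-sided p-value
reject = p < 0.05                                  # step 4: decision at the 5% level

print(round(t, 2), round(p, 3), reject)
```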

Experiments and Basic Statistics


1. The probability framework for statistical inference
2. Estimation
3. Hypothesis testing
4. Confidence intervals


4. Confidence Intervals
Definition: A 95% confidence interval for μY (or any parameter) is an interval calculated from sample data that contains the true value of μY in 95 percent of repeated samples
Remember what is random here:
The values of Y1,…,Yn, and thus any functions of them, are random, including the confidence interval
So the confidence interval will differ from one sample to the next
The population parameter μY is not random; we just don't know it

Confidence intervals, ctd.

A 95% confidence interval can always be constructed as the set of values of μY not rejected by a hypothesis test with a 5% significance level:

{μY : |(Ȳ − μY)/(sY/√n)| ≤ 1.96}
= {μY : −1.96 ≤ (Ȳ − μY)/(sY/√n) ≤ 1.96}
= {μY : Ȳ − 1.96·sY/√n ≤ μY ≤ Ȳ + 1.96·sY/√n}
= {μY ∈ (Ȳ − 1.96·sY/√n, Ȳ + 1.96·sY/√n)}

This confidence interval relies on the large-n results that Ȳ is approximately normally distributed and s²Y →p σ²Y
That is, the estimated variance of Y converges to the true variance as the sample size approaches infinity
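Constructing the interval Ȳ ± 1.96·sY/√n takes one line once the pieces are in place. A sketch with made-up data:

```python
# Large-n 95% confidence interval: Ybar +/- 1.96 * s_Y / sqrt(n)
from statistics import mean, stdev
from math import sqrt

y = [18.2, 22.5, 19.9, 24.1, 17.3, 21.0, 23.4, 19.5, 20.8, 22.2]  # made-up data
ybar = mean(y)
se = stdev(y) / sqrt(len(y))
ci = (ybar - 1.96 * se, ybar + 1.96 * se)

print(round(ci[0], 2), round(ci[1], 2))
```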

Confidence intervals for the population mean

A 95% two-sided confidence interval for μY is an interval constructed so it contains the true value of μY in 95% of all possible random samples
With a large sample, 95%, 90%, and 99% confidence intervals for μY are:

95% confidence interval for μY = {Ȳ ± 1.96 SE(Ȳ)}
90% confidence interval for μY = {Ȳ ± 1.64 SE(Ȳ)}
99% confidence interval for μY = {Ȳ ± 2.58 SE(Ȳ)}

Example
In 2012, the mean hourly earnings of men and women who were college graduates ages 25–34 were:

Ȳ(men) = $25.30
Ȳ(women) = $21.50

If the standard error of the difference between Ȳ(men) and Ȳ(women) is $0.35, what is the 95% confidence interval for the male-female wage gap?
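Working through this example with only the numbers given in the text:

```python
# 95% CI for the male-female wage gap: ($25.30 - $21.50) +/- 1.96 * $0.35
gap = 25.30 - 21.50            # observed difference, $3.80
se = 0.35                      # standard error of the difference (given)
low, high = gap - 1.96 * se, gap + 1.96 * se

print(round(low, 2), round(high, 2))   # about (3.11, 4.49)
```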

Summary
We have two assumptions:
1. Simple random sampling of a population, that is, {Yi, i = 1,…,n} are independent and identically distributed
2. 0 < E(Y⁴) < ∞ (so we don't have extreme outliers)
From these, we developed, for large samples (large n):

Theory of estimation (sampling distribution of Ȳ)
Theory of hypothesis testing (large-n distribution of the t-statistic and computation of the p-value)
Theory of confidence intervals (constructed by inverting the test statistic)

Question:
Are assumptions (1) & (2) plausible in practice?
Answer:
Yes

Let's go back to the original policy question

What is the effect on test scores of reducing STR by one student/class?

Have we answered this question?
