Economic Statistics (Fall 2012), Neşe Yıldız

Confidence Intervals
In the previous handout we discussed estimators and some desirable properties that we
might seek in an estimator. As discussed in that handout, estimators are functions of the
sample. Since individuals/subjects are selected into our sample randomly, the estimators
are random variables. This means that we can never be certain that the value of the
estimator will be exactly equal to the parameter it is designed to estimate. As a result,
it would be useful to be able to evaluate how close the population parameter is likely to
be to the value of the estimator that we use. Specifically, we would like to say something
like “the population parameter is believed to be in such and such interval with a certain
confidence level.” In this handout, we will study how we can make such statements
formally.

The easiest case in which such intervals can be constructed is the estimation of the
population mean of a variable whose population distribution is normal with an unknown
mean, but a known variance. Even though this example is not very realistic, it is the easiest context in which to describe how to construct confidence intervals. As usual, we are going to assume that we have a random sample of size $n$ on this variable. So we assume that $X_1, X_2, \ldots, X_n$ are mutually independent and $X_i \sim N(\mu, \sigma^2)$ for each $i$, where $\mu$ is unknown but $\sigma^2$ is known. Our goal is to make inferences
about $\mu$. As we saw in the previous handout, the sample mean is an estimator that has some good properties. So we are going to use the sample mean, $\bar X = \frac{1}{n}\sum_{i=1}^{n} X_i$, to estimate $\mu$,
the population mean of the X’s. To construct the interval that we would like, we are going
to rely on a special property of normally distributed random variables. If X1 , X2 ,...,Xn
are all mutually independent normally distributed random variables, then $\bar X$ will be normally distributed with $E(\bar X) = \frac{1}{n}\,[E(X_1) + E(X_2) + \cdots + E(X_n)]$ and $\mathrm{Var}(\bar X) = \frac{1}{n^2}\,[\mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n)]$. For our random sample, $E(X_1) = E(X_2) = \cdots = E(X_n) = \mu$, and $\mathrm{Var}(X_1) = \mathrm{Var}(X_2) = \cdots = \mathrm{Var}(X_n) = \sigma^2$. As a result, $\bar X \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$. This implies that $\frac{\bar X - \mu}{\sigma/\sqrt{n}} = \sqrt{n}\,\frac{\bar X - \mu}{\sigma} \sim N(0, 1)$. We will reverse engineer the desired confidence interval. For this purpose, let $Z \sim N(0, 1)$ and let $\gamma_c := P(|Z| < c)$. As we will see shortly,
$\gamma_c$ will denote the desired confidence level, and $c$ will denote the critical value. If $\gamma_c = 0.95$, then $c$ will be 1.96. If $\gamma_c = 0.9$, then $c$ will be approximately 1.645. If $\gamma_c = 0.88$, $c$ will be approximately 1.55. What we have so far is
$$\gamma_c = P\!\left(\left|\frac{\bar X - \mu}{\sigma/\sqrt{n}}\right| < c\right) = P\!\left(-c < \frac{\bar X - \mu}{\sigma/\sqrt{n}} < c\right) = P\!\left(-c\,\frac{\sigma}{\sqrt{n}} < \bar X - \mu < c\,\frac{\sigma}{\sqrt{n}}\right),$$
or equivalently, $\gamma_c = P\!\left(\bar X - c\,\frac{\sigma}{\sqrt{n}} < \mu < \bar X + c\,\frac{\sigma}{\sqrt{n}}\right)$. Notice that the interval $\left(\bar X - c\,\frac{\sigma}{\sqrt{n}},\; \bar X + c\,\frac{\sigma}{\sqrt{n}}\right)$ depends on the sample; depending on the value of $\bar X$, the interval changes. For some samples, this interval will trap/cover the population
mean, and for others it will not. For about $100\gamma_c\%$ of all samples of size $n$ drawn from this population, the interval $\left(\bar X - c\,\frac{\sigma}{\sqrt{n}},\; \bar X + c\,\frac{\sigma}{\sqrt{n}}\right)$ covers the population mean, and for the remaining $100(1 - \gamma_c)\%$ it does not.
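This coverage claim can be checked with a short simulation (not part of the handout; the values $\mu = 10$, $\sigma = 2$, and $n = 25$ are made up for illustration):

```python
import random
import statistics

# Simulate many samples of size n from N(mu, sigma^2) with sigma known,
# and count how often (xbar - c*sigma/sqrt(n), xbar + c*sigma/sqrt(n))
# covers the true population mean.
mu, sigma, n = 10.0, 2.0, 25
c = 1.96  # critical value for a 95% confidence level
random.seed(0)

trials = 10_000
covered = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.mean(sample)
    half_width = c * sigma / n ** 0.5
    if xbar - half_width < mu < xbar + half_width:
        covered += 1

print(covered / trials)  # should be close to 0.95
```

The simulated coverage fraction lands near 0.95, matching the $100\gamma_c\%$ statement above.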

In practice, however, we will have only one sample at hand, and one realization of $\bar X$. Let $\bar x$ denote this realization of the sample mean. Then the population mean either lies between $\bar x - c\,\frac{\sigma}{\sqrt{n}}$ and $\bar x + c\,\frac{\sigma}{\sqrt{n}}$ or it does not. We will not be able to tell whether the particular interval corresponding to our sample covers the population mean or not. All we know is that we have a procedure that works $100\gamma_c\%$ of the time. This is why we replace the term “probability” with the term “confidence”.

To get $c$ from the table in your book, note that the standard normal table in the book gives $P(0 < Z < c)$ for different $c$ values, and the value of $c$ is obtained by reading the leftmost column and adding to it the value in the top row. Since $P(|Z| < c) = P(-c < Z < c) = 2P(0 < Z < c)$ (because the standard normal density is symmetric around 0), you need to choose $c$ so that $P(0 < Z < c) = \gamma_c/2$.
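If you do not have the table at hand, Python's standard library can perform the same lookup. This sketch (not from the handout) uses `statistics.NormalDist`; note that $P(|Z| < c) = \gamma_c$ is equivalent to $\Phi(c) = (1 + \gamma_c)/2$:

```python
from statistics import NormalDist

def z_critical(gamma: float) -> float:
    """Critical value c with P(|Z| < c) = gamma for Z ~ N(0, 1).

    Since P(|Z| < c) = 2*Phi(c) - 1, we need Phi(c) = (1 + gamma) / 2,
    so c is the (1 + gamma)/2 quantile of the standard normal.
    """
    return NormalDist().inv_cdf((1 + gamma) / 2)

print(round(z_critical(0.95), 3))  # 1.96
print(round(z_critical(0.90), 3))  # 1.645
print(round(z_critical(0.88), 3))  # 1.555, roughly 1.55
```

These match the critical values quoted earlier for the 95%, 90%, and 88% confidence levels.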

Now that we have discussed the basic idea behind constructing confidence intervals, let us consider a slightly more realistic situation. We will next discuss constructing confidence intervals for the population mean of a variable whose population distribution is normal with an unknown mean and an unknown variance. If the population variance is unknown, we will not be able to construct intervals of the form $\bar X - c\,\frac{\sigma}{\sqrt{n}}$ and $\bar X + c\,\frac{\sigma}{\sqrt{n}}$ because we cannot plug in a number for the value of $\sigma$. We can estimate the population variance using our sample. In particular, we can use $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar X)^2$ to estimate $\sigma^2$.
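As a quick check (the data below are made up for illustration), the formula for $s^2$ agrees with Python's built-in `statistics.variance`, which also divides by $n - 1$:

```python
import statistics

# Hypothetical sample, purely for illustration
data = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7]
n = len(data)
xbar = sum(data) / n

# s^2 computed directly from the formula, with the n - 1 divisor
s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)

# statistics.variance uses the same n - 1 divisor, so the two agree
print(s2, statistics.variance(data))
```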

But if we do this, the probability distribution of $\frac{\bar X - \mu}{s/\sqrt{n}} = \frac{(\bar X - \mu)\sqrt{n}}{s}$ is no longer standard normal. The distribution of $\frac{(\bar X - \mu)\sqrt{n}}{s}$ is (Student's) t-distribution with $n - 1$ degrees of freedom. The t-density is symmetric around 0 and bell-shaped, like the standard normal
density, but has heavier/thicker tails, meaning that it is more prone to producing values
that fall far from its mean. As the degrees of freedom increase, the t-density becomes more
and more like the standard normal density; in the limit, with infinite degrees of freedom,
the t-density is the same as the standard normal density. Now let $\gamma$ denote the desired confidence level again. Then find $c$ such that $\gamma = P(|t_{n-1}| < c)$. Then for that $c$, we have
$$\gamma = P\!\left(\frac{|\bar X - \mu|\sqrt{n}}{s} < c\right) = P\!\left(\bar X - c\,\frac{s}{\sqrt{n}} < \mu < \bar X + c\,\frac{s}{\sqrt{n}}\right).$$
So the $100\gamma\%$ confidence interval for the population mean in this case is given by $\left(\bar x - c\,\frac{s}{\sqrt{n}},\; \bar x + c\,\frac{s}{\sqrt{n}}\right)$. In this

case, to get the c value, you need to use the table for t-distribution in the book. The
very left column of that table lists different degrees of freedom values. The very top row
of the table gives some right tail probabilities. Specifically, if we would like to find the
cutoff value such that the probability that a random variable that has a t-distribution
with 15 degrees of freedom takes a value above that cutoff value is 5%, we need to look
at the 15th row (excluding the very top row) of the table and 2nd column (excluding the
very left column) of the table and take the number at their intersection, which is 1.753.
So if we wanted a 90% confidence interval for the population mean using a sample of size 16, the critical value we would use is 1.753. (Note that $5\% = \frac{100 - 90}{2}\%$.)
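If the scipy library is available, the same table lookup can be done in code. This sketch reproduces the 15-degrees-of-freedom example: a 5% right-tail probability corresponds to the 95th percentile of the t-distribution.

```python
from scipy.stats import t

# Right-tail lookup, like the table in the book: for a 90% confidence
# interval with n = 16 (so 15 degrees of freedom), the right-tail
# probability is (100 - 90)/2 % = 5%, i.e. we need the 95th percentile.
df = 15
c = t.ppf(0.95, df)
print(round(c, 3))  # 1.753
```

This matches the value 1.753 read off the table above.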

Next, we consider what would happen if the population distribution of the variable whose
population mean we are trying to make inferences about is not normal. In that case, in general we will not know what the probability distribution of $\sqrt{n}\,\frac{\bar X - \mu}{\sigma}$ or $\sqrt{n}\,\frac{\bar X - \mu}{s}$ is. How should we get the critical value $c$ in such a case? If we cannot figure out what the actual probability distribution of $\sqrt{n}\,\frac{\bar X - \mu}{\sigma}$ is, we are going to approximate this distribution. The Central Limit Theorem is a remarkable result which tells us that we can use the standard normal distribution to approximate the probability distribution of $\sqrt{n}\,\frac{\bar X - \mu}{\sigma}$ as long as $\{X_1, X_2, \ldots, X_n\}$ is a random sample and the variance of each $X_i$ is finite (which we have been implicitly assuming anyway), regardless of what distribution the random variables $\{X_1, X_2, \ldots, X_n\}$ are drawn from, that is, regardless of what the population distribution of the variable whose population mean we are trying to make inferences about is. This means that $P\!\left(\left|\sqrt{n}\,\frac{\bar X - \mu}{\sigma}\right| < c\right)$ is approximately equal to $P(|Z| < c)$ when the sample size is large; when the sample size is small, the approximation will be bad. In this class, you can use this approximation when $n \geq 50$.
Although the Central Limit Theorem tells us that we can approximate $P\!\left(\left|\sqrt{n}\,\frac{\bar X - \mu}{\sigma}\right| < c\right)$ by $P(|Z| < c)$, we still face a problem because it is not realistic to assume that we know exactly what the population variance of the $X$'s is. We can estimate this variance by the sample variance, $s^2$, defined above. Using the fact that $s^2$ is a consistent estimator for $\sigma^2$ and that $\sqrt{n}\,\frac{\bar X - \mu}{\sqrt{s^2}}$ is a continuous function of $s^2$, along with the Central Limit Theorem, we can still approximate $P\!\left(\left|\sqrt{n}\,\frac{\bar X - \mu}{s}\right| < c\right)$ by $P(|Z| < c)$.
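Putting the large-sample procedure together, here is a sketch (not from the handout; the exponential population and $n = 200$ are made up for illustration) of an approximate 95% confidence interval that uses $s$ in place of the unknown $\sigma$:

```python
import random
import statistics

# A non-normal population: Exponential(rate=1), whose true mean is 1.
# Since n >= 50, the CLT justifies using the standard normal critical value.
random.seed(1)
n = 200
sample = [random.expovariate(1.0) for _ in range(n)]

xbar = statistics.mean(sample)
s = statistics.stdev(sample)  # sample standard deviation, n - 1 divisor
c = 1.96                      # z critical value for a 95% confidence level

lo = xbar - c * s / n ** 0.5
hi = xbar + c * s / n ** 0.5
print((round(lo, 3), round(hi, 3)))  # approximate 95% CI for the mean
```

Across repeated samples, intervals built this way cover the true mean about 95% of the time, even though the population is not normal.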
