
Statistical Methods in Economics

Estimation II

Dr. Michela Tincani

Department of Economics
University College London
CI Mean Diff in Normal Pop CI Diff in Prop Sample Size

CI for Difference in Mean of Two Normal Populations

Sometimes we want to compare the parameters of two different populations. For example,
1. A firm may want to compare impurity levels in chemical supplies from two different sources;
2. A farmer may want to compare the outcomes across two different fertilizers;
3. A student may want to compare the mean score in a course last year and the previous year.

There are basically two sampling schemes one can use:

1. Matched Pairs (Dependent Samples): Sample members are chosen in pairs, one from each population. The idea is to select individuals that resemble one another apart from the feature of interest (e.g. clinical trials, treatment effects). (WE WILL SKIP THIS CASE)
2. Independent Samples: One sample is drawn from each population. There are three situations:
   1. both population variances are known;
   2. both population variances are unknown but assumed identical; and
   3. both population variances are unknown and not assumed identical.

Two Means, Independent Samples, Known Variance

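Let the populations be described by the random variables $X$ and $Y$. We obtain (independent) samples of size $n_X$ and $n_Y$ from each population. Furthermore, assume that

$$X \sim N(\mu_X, \sigma_X^2) \quad\text{and}\quad Y \sim N(\mu_Y, \sigma_Y^2).$$

This implies that the sample means $\bar X$ and $\bar Y$ satisfy

$$\bar X \sim N(\mu_X, \sigma_X^2/n_X) \quad\text{and}\quad \bar Y \sim N(\mu_Y, \sigma_Y^2/n_Y).$$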



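It is then the case that the difference in sample means, $\bar X - \bar Y$, satisfies

$$E(\bar X - \bar Y) = E(\bar X) - E(\bar Y) = \mu_X - \mu_Y$$

and, because the two samples are independent,

$$\mathrm{Var}(\bar X - \bar Y) = \mathrm{Var}(\bar X) + \mathrm{Var}(\bar Y) = \sigma_X^2/n_X + \sigma_Y^2/n_Y.$$

It can also be shown that $\bar X - \bar Y$ follows a normal distribution:

$$\bar X - \bar Y \sim N\left(\mu_X - \mu_Y,\; \sigma_X^2/n_X + \sigma_Y^2/n_Y\right).$$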



Using the results we already know:

CI for Mean Difference (Indep. Samples and Known Variance)

Consider two independent random samples of $n_X$ and $n_Y$ observations from normal distributions with means $\mu_X$ and $\mu_Y$ and known variances $\sigma_X^2$ and $\sigma_Y^2$. Given observed sample means $\bar X$ and $\bar Y$, a $100(1-\alpha)\%$ confidence interval for $\mu_X - \mu_Y$ is given by

$$\bar X - \bar Y - Z_{\alpha/2}\sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}} \;<\; \mu_X - \mu_Y \;<\; \bar X - \bar Y + Z_{\alpha/2}\sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}.$$
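
As a rough illustration of the known-variance interval, the sketch below evaluates it using only Python's standard library. The function name `ci_diff_means_known_var` and all input numbers are made up for illustration; they do not come from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def ci_diff_means_known_var(xbar, ybar, var_x, var_y, n_x, n_y, alpha=0.05):
    """CI for mu_X - mu_Y when both population variances are known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # cutoff Z_{alpha/2}
    se = sqrt(var_x / n_x + var_y / n_y)      # standard deviation of Xbar - Ybar
    diff = xbar - ybar
    return diff - z * se, diff + z * se


# Hypothetical inputs (not from the lecture)
lower, upper = ci_diff_means_known_var(
    xbar=10.0, ybar=8.0, var_x=4.0, var_y=9.0, n_x=25, n_y=36
)
print(f"95% CI for mu_X - mu_Y: ({lower:.3f}, {upper:.3f})")
```

The interval is centred on the observed difference in sample means, and its half-width shrinks as either sample size grows.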

Two Means, Independent Samples, Equal Variance

What if the variances are the same but we do not know them? In this case, we can estimate the common variance!

Since $\sigma^2 = \sigma_X^2 = \sigma_Y^2$, both $s_X^2$ and $s_Y^2$ are unbiased estimators for the population variance. But we want to use information from both. The estimator that best combines these two estimators is the pooled variance

$$s_p^2 = \frac{(n_X - 1)s_X^2 + (n_Y - 1)s_Y^2}{n_X + n_Y - 2}.$$

Before, when we used the sample variance because the population variance was unknown, we were able to standardize the sample mean and obtain a quantity distributed according to a Student's t distribution. As a matter of fact,

$$T = \frac{(\bar X - \bar Y) - (\mu_X - \mu_Y)}{\sqrt{\dfrac{s_p^2}{n_X} + \dfrac{s_p^2}{n_Y}}}$$

follows a t distribution with $n_X + n_Y - 2$ degrees of freedom.



With this in hand we can obtain a confidence interval estimator:

CI for Two Means (Equal Variances)

Consider two independent random samples of $n_X$ and $n_Y$ observations from normal distributions with means $\mu_X$ and $\mu_Y$ and (unknown, but equal) variances $\sigma^2 = \sigma_X^2 = \sigma_Y^2$. Given observed sample means $\bar X$ and $\bar Y$, a $100(1-\alpha)\%$ confidence interval for $\mu_X - \mu_Y$ is given by

$$\bar X - \bar Y - t_{n_X+n_Y-2,\,\alpha/2}\sqrt{\frac{s_p^2}{n_X} + \frac{s_p^2}{n_Y}} \;<\; \mu_X - \mu_Y \;<\; \bar X - \bar Y + t_{n_X+n_Y-2,\,\alpha/2}\sqrt{\frac{s_p^2}{n_X} + \frac{s_p^2}{n_Y}}.$$

Example (practice problem)

(NCT) The residents of Orange City complain that traffic speeding fines given in their city are higher than traffic speeding fines that are given in nearby DeLand. The assistant to the county manager agreed to study the problem and indicate whether the complaints are reasonable. Independent random samples of the amounts paid by residents for speeding tickets in each of the two cities over the last three months were obtained. These amounts were

Orange City: 100 125 135 128 140 142 128 137
DeLand: 95 87 100 75 110 105 85 95

Assuming equal population variances, find a 95% confidence interval for the difference in the mean cost of speeding tickets in these two cities.
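
One possible way to work this practice problem numerically is sketched below, using the pooled-variance interval with $n_X + n_Y - 2 = 14$ degrees of freedom. The variable names are illustrative, and the cutoff `T_CRIT = 2.145` is an assumption read from a standard t table for 14 degrees of freedom and $\alpha/2 = 0.025$, since Python's standard library has no t quantile function.

```python
from math import sqrt
from statistics import mean, variance

orange = [100, 125, 135, 128, 140, 142, 128, 137]
deland = [95, 87, 100, 75, 110, 105, 85, 95]

n_x, n_y = len(orange), len(deland)
xbar, ybar = mean(orange), mean(deland)
s2_x, s2_y = variance(orange), variance(deland)  # sample variances (n - 1 divisor)

# Pooled estimate of the common variance
s2_p = ((n_x - 1) * s2_x + (n_y - 1) * s2_y) / (n_x + n_y - 2)

# Assumption: t cutoff for 14 df and alpha/2 = 0.025, taken from a t table
T_CRIT = 2.145

se = sqrt(s2_p / n_x + s2_p / n_y)
diff = xbar - ybar
lower, upper = diff - T_CRIT * se, diff + T_CRIT * se
print(f"difference in means: {diff:.3f}, pooled variance: {s2_p:.3f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

On these data the interval comes out at roughly (22.1, 48.7), so it lies entirely above zero. If SciPy is available, `scipy.stats.t.ppf(0.975, 14)` could replace the table lookup.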

Two Means, Independent Samples, Unknown Variance

Remember that, when the variances are known,

$$\mathrm{Var}(\bar X - \bar Y) = \mathrm{Var}(\bar X) + \mathrm{Var}(\bar Y) = \sigma_X^2/n_X + \sigma_Y^2/n_Y.$$

Our last case investigates the difference in means when the variances are unknown. In this case, we can estimate the above quantity by

$$s_X^2/n_X + s_Y^2/n_Y.$$

Since the difference in the sample means is normally distributed, we can use the results we already obtained for the construction of a CI for a normal mean when the variance is not known.

CI for Two Means (Unknown Variances)

Consider two independent random samples of $n_X$ and $n_Y$ observations from normal distributions with means $\mu_X$ and $\mu_Y$ and (unknown) variances $\sigma_X^2$ and $\sigma_Y^2$. Given observed sample means $\bar X$ and $\bar Y$, a $100(1-\alpha)\%$ confidence interval for $\mu_X - \mu_Y$ is given by

$$\bar X - \bar Y - t_{\nu,\alpha/2}\sqrt{s_X^2/n_X + s_Y^2/n_Y} \;<\; \mu_X - \mu_Y \;<\; \bar X - \bar Y + t_{\nu,\alpha/2}\sqrt{s_X^2/n_X + s_Y^2/n_Y},$$

where the degrees of freedom for the cutoff points are given by

$$\nu = \frac{\left[\left(\dfrac{s_X^2}{n_X}\right) + \left(\dfrac{s_Y^2}{n_Y}\right)\right]^2}{\left(\dfrac{s_X^2}{n_X}\right)^2 \Big/ (n_X - 1) + \left(\dfrac{s_Y^2}{n_Y}\right)^2 \Big/ (n_Y - 1)}.$$
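
The degrees-of-freedom formula is easy to mistype, so a small sketch may help. The helper name `welch_df` is hypothetical, and the data are the speeding-fine samples from the practice problem in these slides, reused here only to have concrete numbers.

```python
from statistics import variance


def welch_df(x, y):
    """Degrees of freedom nu for the unequal-variance t interval."""
    n_x, n_y = len(x), len(y)
    a = variance(x) / n_x   # s_X^2 / n_X
    b = variance(y) / n_y   # s_Y^2 / n_Y
    return (a + b) ** 2 / (a ** 2 / (n_x - 1) + b ** 2 / (n_y - 1))


orange = [100, 125, 135, 128, 140, 142, 128, 137]
deland = [95, 87, 100, 75, 110, 105, 85, 95]

nu = welch_df(orange, deland)
print(f"nu = {nu:.2f}")
```

The result is generally not an integer. It lies between the smaller of $n_X - 1$ and $n_Y - 1$ and the pooled value $n_X + n_Y - 2$; when reading a t table, one common convention is to round it down.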

If the sample size is large (⇒ $\nu$ is large), the Student's t distribution approaches a standard normal distribution and we can use $Z_{\alpha/2}$ instead of $t_{\nu,\alpha/2}$.

In this case the confidence interval is well approximated by

$$\bar X - \bar Y - Z_{\alpha/2}\sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}} \;<\; \mu_X - \mu_Y \;<\; \bar X - \bar Y + Z_{\alpha/2}\sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}.$$

CI for Diff in Pop Proportions

Assume we have a random sample of $n_X$ observations from a population with proportion $\pi_X$ of “successes”. Let the sample proportion be $\hat\pi_X$. You obtain an independent sample of $n_Y$ observations from another population with probability $\pi_Y$ of “success”. The proportion of successes in this sample is $\hat\pi_Y$.

We want to make inferences about the difference between these two population proportions: $\pi_X - \pi_Y$.


We know that an unbiased estimator for the difference in the population proportions is:

$$\hat\pi_X - \hat\pi_Y.$$

Furthermore, the variance of this estimator can be seen to be

$$\frac{\pi_X(1 - \pi_X)}{n_X} + \frac{\pi_Y(1 - \pi_Y)}{n_Y}.$$

By the Central Limit Theorem, we know that, in large samples,

$$\frac{(\hat\pi_X - \hat\pi_Y) - (\pi_X - \pi_Y)}{\sqrt{\pi_X(1 - \pi_X)/n_X + \pi_Y(1 - \pi_Y)/n_Y}}$$

has a distribution well-approximated by the standard normal distribution.

CI for Diff of Pop Proportions

Since we do not know $\pi_X$ or $\pi_Y$, we can estimate them by $\hat\pi_X$ and $\hat\pi_Y$.

In large samples, the quantity below is also well approximated by a standard normal random variable:

$$\frac{(\hat\pi_X - \hat\pi_Y) - (\pi_X - \pi_Y)}{\sqrt{\hat\pi_X(1 - \hat\pi_X)/n_X + \hat\pi_Y(1 - \hat\pi_Y)/n_Y}}.$$

As before, this allows us to construct confidence intervals for differences in population proportions (in large samples):

CI for Population Proportions (Large Samples)

Let $\hat\pi_X$ denote the observed proportion of “successes” in a random sample of $n_X$ observations from a population with population proportion $\pi_X$, and let $\hat\pi_Y$ denote the observed proportion of “successes” in a random sample of $n_Y$ observations from a population with population proportion $\pi_Y$. Then, for large samples, a $100(1-\alpha)\%$ confidence interval for the difference between population proportions $(\pi_X - \pi_Y)$ is given by

$$(\hat\pi_X - \hat\pi_Y) \pm Z_{\alpha/2}\sqrt{\frac{\hat\pi_X(1 - \hat\pi_X)}{n_X} + \frac{\hat\pi_Y(1 - \hat\pi_Y)}{n_Y}}.$$
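
A minimal sketch of this large-sample interval, assuming the data arrive as success counts and sample sizes. The function name `ci_diff_proportions` and the counts are invented for illustration and are not taken from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def ci_diff_proportions(successes_x, n_x, successes_y, n_y, alpha=0.05):
    """Large-sample CI for pi_X - pi_Y from two independent samples."""
    p_x = successes_x / n_x                   # sample proportion for X
    p_y = successes_y / n_y                   # sample proportion for Y
    z = NormalDist().inv_cdf(1 - alpha / 2)   # cutoff Z_{alpha/2}
    se = sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)
    diff = p_x - p_y
    return diff - z * se, diff + z * se


# Hypothetical counts: 120 successes out of 200, and 90 out of 180
lower, upper = ci_diff_proportions(120, 200, 90, 180)
print(f"95% CI for pi_X - pi_Y: ({lower:.4f}, {upper:.4f})")
```

Because the interval relies on the normal approximation, it should only be trusted when both samples are large.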

Sample Size Determination

We saw how, given the sample size, we can determine the margin of error in many different circumstances.

Sometimes we may be able to fix the margin of error a priori and we would like to determine the sample size necessary to achieve such a margin of error. For inferences about the mean of a normal population when the variance is known, we saw that the CI is given by

$$\bar X \pm \underbrace{\frac{Z_{\alpha/2}\,\sigma}{\sqrt{n}}}_{\equiv B}.$$

Let's say we want to achieve a certain margin of error $B$. We can then use the expression above to determine the necessary sample size:

$$B = \frac{Z_{\alpha/2}\,\sigma}{\sqrt{n}} \;\Rightarrow\; n = \frac{Z_{\alpha/2}^2\,\sigma^2}{B^2}.$$
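
In practice the formula rarely returns a whole number, so the result is rounded up to the next integer. The sketch below illustrates this; the function name `sample_size_for_mean` and the values of $\sigma$ and $B$ are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, alpha=0.05):
    """Smallest n whose margin of error Z_{alpha/2} * sigma / sqrt(n) is at most `margin`."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z * sigma / margin) ** 2)   # round up: n must be an integer


# Hypothetical inputs: sigma = 15, desired margin of error B = 2
n = sample_size_for_mean(sigma=15.0, margin=2.0)
print(f"required sample size: {n}")
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the target.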


If we are alternatively interested in a population proportion, we know that the appropriate confidence interval (in large samples) is

$$\hat p \pm Z_{\alpha/2}\sqrt{\frac{\hat p(1 - \hat p)}{n}}.$$

The problem here is that we do not know $\hat p$ before drawing a sample. But we know that $\hat p(1 - \hat p) \le 0.25$, whatever the sample turns out to be. So

$$Z_{\alpha/2}\sqrt{\frac{\hat p(1 - \hat p)}{n}} \le Z_{\alpha/2}\sqrt{\frac{0.25}{n}} = \bar B.$$

Then setting

$$n = \frac{0.25\,Z_{\alpha/2}^2}{\bar B^2}$$

guarantees the actual margin of error is no greater than $\bar B$.

Example

(NCT) Suppose that a random sample of graduate admissions personnel was asked what role standardized test scores (such as GMAT or GRE) play in the consideration of a candidate for graduate school. It is desired to ensure that a 95% confidence interval for the population proportion answering “very important” extends no further than 0.06 on each side of the sample proportion. How large a sample should be taken?
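
One way to answer this is to apply the conservative bound on the margin of error for a proportion, $n = 0.25\,Z_{\alpha/2}^2 / B^2$, with $B = 0.06$ and $\alpha = 0.05$, and round up. The sketch below does this; the function name `conservative_sample_size` is illustrative.

```python
from math import ceil
from statistics import NormalDist


def conservative_sample_size(margin, alpha=0.05):
    """Sample size using the worst case p(1 - p) = 0.25 for a proportion."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(0.25 * z ** 2 / margin ** 2)   # round up to a whole observation


n = conservative_sample_size(margin=0.06, alpha=0.05)
print(f"required sample size: {n}")
```

With these inputs the formula gives about 266.8, so a sample of 267 would suffice under the worst-case assumption.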

We have covered NCT8, 7.8 and 8.2-8.3.
