Chapter 2: Building Confidence in the Mean

Lesson 12: Inference About Means

For quantitative data, the sample mean x̄ (x-bar) is our best guess, or point estimate, of the
population mean µ (mu). Even though it's our best guess, it's only a single value that likely
differs from the population mean. As we did with proportions, we'll build a margin of error
around the sample mean to form an interval that's likely to capture the population mean.

Let's start with the basic form for confidence intervals that you saw in Lesson 10.
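
As a reminder, that basic form can be written roughly as follows. The labels are descriptive and are not necessarily the exact notation used in Lesson 10.

```latex
\text{point estimate} \;\pm\; (\text{critical value}) \times (\text{standard error})
```

The product of the critical value and the standard error is the margin of error.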

We'll use x̄ as the point estimate, but what about the critical value and standard error?

Remember the central limit theorem that we discussed back in Lesson 9? That theorem says the
standard deviation of x̄ is the population standard deviation σ divided by the square root of the
sample size: σ/√n. But if we don't know the population mean, we also don't usually know the
population standard deviation. So what should we do?

With proportions, we used p̂ (p-hat) not only as the point estimate but also as a replacement for
p in the standard error, √(p̂(1 − p̂)/n). Unfortunately, the point estimate x̄ doesn't tell us
anything that we can substitute into the calculation of the standard error of x̄.

We'll replace the unknown value σ with an estimate from the sample. The sample standard
deviation s is our best guess for the population standard deviation. We'll substitute it for σ to
form the standard error of x̄.

SE(x̄) = s / √n

That is, the standard error of the sample mean equals the sample standard deviation divided by
the square root of the sample size.
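
The calculation can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `standard_error` and the sample values are made up for this example and do not come from the lesson.

```python
import math
import statistics


def standard_error(sample):
    """Estimate the standard error of the sample mean as s / sqrt(n)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations to compute s")
    # statistics.stdev uses the n - 1 divisor, i.e. the sample standard deviation s
    s = statistics.stdev(sample)
    return s / math.sqrt(n)


# Hypothetical measurements, invented purely for illustration
sample = [4.9, 5.1, 5.0, 5.3, 4.7, 5.2]
x_bar = statistics.mean(sample)   # point estimate of the population mean
se = standard_error(sample)       # estimated standard deviation of x-bar
print(f"x-bar = {x_bar:.3f}, SE = {se:.4f}")
```

Note that the standard error shrinks as the sample size grows, because n appears under the square root in the denominator.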

In the early years of statistics (more than a century ago), people made this replacement and
then used the normal model assuming it would work. It did for large samples, but there were
problems when it came to smaller samples.

The problem when you make this replacement is that the sample standard deviation is a
statistic that varies from sample to sample. So in addition to the variability of the sample mean,
there's added variability from the sample standard deviation. This extra variation in the
standard error messes up the margins of error and p-values from hypothesis tests when you try
to use the normal model.
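
A small simulation can make this problem visible. The sketch below assumes a normally distributed population with mean 0 and standard deviation 1 (an arbitrary choice for illustration), builds nominal 95% intervals with the normal critical value 1.96 but with s in place of σ, and counts how often the interval captures the true mean. The function name `z_interval_coverage` and all parameter values are hypothetical.

```python
import math
import random


def z_interval_coverage(n, trials=5000, mu=0.0, sigma=1.0, z_star=1.96, seed=1):
    """Fraction of nominal 95% normal-model intervals (built with s) that capture mu."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        # Sample standard deviation s, using the n - 1 divisor
        s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
        se = s / math.sqrt(n)
        # The interval x_bar +/- z_star * se captures mu when |x_bar - mu| <= z_star * se
        if abs(x_bar - mu) <= z_star * se:
            hits += 1
    return hits / trials


for n in (5, 15, 100):
    print(f"n = {n:3d}: coverage = {z_interval_coverage(n):.3f}")
```

Under these assumptions, the coverage for very small samples should land noticeably below the advertised 95%, while the coverage for n = 100 should be close to it. That gap is the extra variability from s showing up in practice.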

And then came William Gosset and the t-distribution.

Student's t
William Gosset first realized that you need a new sampling distribution model to account for
this extra source of variation. He was the quality control engineer for the Guinness
Brewery in Ireland, and his job was to make sure the beer leaving the brewery was of
high quality.

Gosset's samples were small, and tests based on the normal model led him to reject too many
batches that further lab testing showed were good. (In other words, he was making
