Chapter 14: Comparing two means

I. Comparing two means

In economics we often need to compare two groups. For example, a firm may want to know whether a promotional offer increases its sales, so it observes two groups: one that receives the offer and one that does not. Side-by-side boxplots show whether the group with the offer tends to have higher sales than the group without it. When the difference is small, however, it is hard to tell by eye whether the offer worked, so we compare the means of the two samples. First, we find the SE of the difference between the means. Because the two groups are independent, the variances of the sample means add:

$SE(\bar{y}_a - \bar{y}_b) = \sqrt{Var(\bar{y}_a) + Var(\bar{y}_b)} = \sqrt{\frac{s_a^2}{n_a} + \frac{s_b^2}{n_b}}$.

The sampling distribution of the standardized difference follows a t-distribution model, so we have

$t = \frac{(\bar{y}_a - \bar{y}_b) - \Delta_0}{SE(\bar{y}_a - \bar{y}_b)}$,

where $\Delta_0$ is the difference between the means under the null hypothesis; it is generally equal to 0.
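As a rough illustration of the two formulas above, here is a minimal Python sketch using only the standard library. The function name `two_sample_t` and the sales figures are invented for this example and are not part of the chapter.

```python
import math
import statistics

def two_sample_t(sample_a, sample_b, delta_0=0.0):
    """Return (SE, t) for the difference between two independent sample means."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.variance is the sample variance s^2 (denominator n - 1)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    se = math.sqrt(var_a / n_a + var_b / n_b)
    diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    t = (diff - delta_0) / se
    return se, t

# Hypothetical weekly sales, made up for illustration only
with_offer = [12, 15, 14, 16, 13]
without_offer = [10, 11, 13, 12, 9]

se, t = two_sample_t(with_offer, without_offer)
print(f"SE = {se:.3f}, t = {t:.3f}")
```

With these made-up numbers both sample variances are 2.5, so the SE is 1 and t equals the raw difference in means, 3.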

II. The two-sample t-test

Once we have the null hypothesis, we choose an alternative hypothesis (one-sided or two-sided), compute the SE and t with the formulas above, and find the P-value. If the P-value is small enough, we reject the null hypothesis; otherwise we fail to reject it.
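The last step, going from t to a P-value, can be sketched as follows. This is only one possible approach under stated assumptions: the chapter summary does not say which degrees of freedom to use, so the sketch assumes the Welch-Satterthwaite approximation, and it integrates the t density numerically with Simpson's rule to avoid any third-party library. The names `t_pdf`, `two_sided_p_value`, `welch_df` and the level `alpha = 0.05` are all choices made for this example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t, df, steps=2000):
    """Approximate P(|T| >= |t|) with Simpson's rule on [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # P(-|t| < T < |t|), by symmetry
    return max(0.0, 1.0 - central_area)

def welch_df(var_a, n_a, var_b, n_b):
    """Welch-Satterthwaite degrees of freedom (an assumption, not from the text)."""
    a, b = var_a / n_a, var_b / n_b
    return (a + b) ** 2 / (a ** 2 / (n_a - 1) + b ** 2 / (n_b - 1))

# Hypothetical inputs: two groups of 5, sample variance 2.5 each, t = 3.0
df = welch_df(2.5, 5, 2.5, 5)
p_value = two_sided_p_value(3.0, df)
alpha = 0.05  # assumed significance level
reject_null = p_value < alpha
print(f"df = {df:.1f}, P-value = {p_value:.4f}, reject H0: {reject_null}")
```

For a one-sided alternative, the two-sided P-value would be halved when the observed difference points in the direction of the alternative.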

III. Assumptions and conditions

To run a two-sample t-test, the samples must satisfy the following assumptions and conditions:
- The independence assumption: the randomization condition must be met, which means that the data are drawn from a randomized sample. The 10% condition must also hold: the sample size n has to be less than 10% of the population.
- The normal population assumption: the nearly normal condition has to be met. If n < 15, the data should follow a normal model. If 15 < n < 40, the t-test works as long as the data are unimodal and reasonably symmetric. If n > 40 or 50, the t-test is safe unless the data are strongly skewed. If there are outliers, the analysis should be run both with and without them.
- The independent groups assumption: the two groups must be independent of each other. If, for example, the data are wives and husbands, or two salaries from the same person, the observations are likely dependent. Such data are called paired, and in that case we use the paired t-method.
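The numeric parts of this checklist can be written as two small helpers. This is only a sketch: the function names are invented, and the treatment of the boundary values n = 15 and n = 40 is an assumption, since the rules of thumb above do not say where they fall. Randomization and independence between groups cannot be checked from the numbers alone and are left out.

```python
def ten_percent_condition(n, population_size):
    """True if the sample is smaller than 10% of the population."""
    return n * 10 < population_size

def nearly_normal_guidance(n):
    """Return the rule of thumb for a group of size n."""
    # Boundary handling (n == 15, n == 40) is an assumption of this sketch
    if n < 15:
        return "data should follow a normal model"
    if n <= 40:
        return "data should be unimodal and reasonably symmetric"
    return "t is safe unless the data are strongly skewed; check outliers"

# Hypothetical example: 30 customers sampled from a base of 1000
print(ten_percent_condition(30, 1000))
print(nearly_normal_guidance(30))
```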

IV. A confidence interval for the difference between two means

To find a confidence interval, we use the same method as in the previous chapters:
$CI = (\bar{y}_a - \bar{y}_b) \pm t^*_{df} \times SE(\bar{y}_a - \bar{y}_b)$.
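A minimal sketch of this interval in Python follows. The critical value is passed in as `t_star` and is assumed to come from a t-table, because the standard library has no inverse t-distribution. The function name, the data, and the choice of 8 degrees of freedom are assumptions made for the example.

```python
import math
import statistics

def mean_diff_ci(sample_a, sample_b, t_star):
    """Confidence interval for the difference between two independent means."""
    n_a, n_b = len(sample_a), len(sample_b)
    diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    se = math.sqrt(statistics.variance(sample_a) / n_a
                   + statistics.variance(sample_b) / n_b)
    margin = t_star * se
    return diff - margin, diff + margin

# Hypothetical weekly sales, made up for illustration only
with_offer = [12, 15, 14, 16, 13]
without_offer = [10, 11, 13, 12, 9]

# 2.306 is the tabulated 95% two-sided critical value for 8 degrees of
# freedom; using df = 8 for this data is an assumption of the example
low, high = mean_diff_ci(with_offer, without_offer, t_star=2.306)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Since the interval for these invented numbers lies entirely above 0, it would be consistent with the offer raising mean sales.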

V. The pooled t-test

When the variances of the two groups are almost identical, we can use the pooled t-test. First, we find the pooled variance of the samples,

$s^2_{pooled} = \frac{(n_a - 1) s_a^2 + (n_b - 1) s_b^2}{(n_a - 1) + (n_b - 1)}$,

then the pooled standard error,

$SE_{pooled}(\bar{y}_a - \bar{y}_b) = \sqrt{\frac{s^2_{pooled}}{n_a} + \frac{s^2_{pooled}}{n_b}}$,

with $df = (n_a - 1) + (n_b - 1)$. The CI is built the same way as previously; only the SE changes, since we use $SE_{pooled}$ with the df above.
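The pooled calculation can be sketched in a few lines of standard-library Python. The function name `pooled_t` and the two samples are invented for this example; the sample variances (2.5 and 2.0) were chosen to be close, which is the situation where pooling is meant to apply.

```python
import math
import statistics

def pooled_t(sample_a, sample_b, delta_0=0.0):
    """Return (pooled variance, pooled SE, df, t) for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # s_a^2, denominator n_a - 1
    var_b = statistics.variance(sample_b)  # s_b^2, denominator n_b - 1
    df = (n_a - 1) + (n_b - 1)
    s2_pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se_pooled = math.sqrt(s2_pooled / n_a + s2_pooled / n_b)
    diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    t = (diff - delta_0) / se_pooled
    return s2_pooled, se_pooled, df, t

# Hypothetical data, made up for illustration only
group_a = [12, 15, 14, 16, 13]
group_b = [10, 11, 13, 12, 9, 11]

s2_pooled, se_pooled, df, t = pooled_t(group_a, group_b)
print(f"s2_pooled = {s2_pooled:.3f}, SE = {se_pooled:.3f}, df = {df}, t = {t:.3f}")
```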

VI. Paired data

To use the paired t-test, some assumptions must be checked: the independence assumption (here it applies to the differences, which must be independent of each other) and the nearly normal condition (also checked on the differences).

VII. Paired t-methods

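Here we work only with the pairwise differences $d$. We find their mean $\bar{d}$, then their standard deviation $s_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}}$ and the standard error $SE(\bar{d}) = \frac{s_d}{\sqrt{n}}$, where $n$ is the number of pairs. The CI is given by $CI = \bar{d} \pm t^*_{df} \times SE(\bar{d})$, with $df = n - 1$.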
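The paired steps can be sketched as follows, again with the standard library only. The function name `paired_summary`, the before/after figures, and the use of a tabulated critical value passed in as `t_star` are assumptions of this example.

```python
import math
import statistics

def paired_summary(before, after, t_star):
    """Return (mean difference, SE, t, CI) for paired observations."""
    if len(before) != len(after):
        raise ValueError("paired data need the same number of observations")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample sd, denominator n - 1
    se = s_d / math.sqrt(n)
    t = d_bar / se  # tests the null hypothesis of a mean difference of 0
    margin = t_star * se
    return d_bar, se, t, (d_bar - margin, d_bar + margin)

# Hypothetical sales for the same four stores before and after an offer
before = [10, 12, 9, 11]
after = [12, 15, 10, 13]

# 3.182 is the tabulated 95% two-sided critical value for df = n - 1 = 3
d_bar, se, t, (low, high) = paired_summary(before, after, t_star=3.182)
print(f"d_bar = {d_bar:.3f}, SE = {se:.3f}, t = {t:.3f}, CI = ({low:.3f}, {high:.3f})")
```

Working on the differences turns the problem into a one-sample t-procedure, which is why only one standard deviation appears.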

