Comparing Means and Proportions
Measures of Association
Truong Phuoc Long, Ph.D.
11/7/2022
Contents
1) Comparing means of 2 independent groups.
2) Comparing means of > 2 independent groups: ANOVA.
3) Comparing proportions of 2 independent groups: 2 sample z-test.
4) Comparing proportions of > 2 independent groups: Chi-square test.
5) Measures of association: relative risk, risk difference, odds ratio.
Comparing means of 2 independent samples
• Comparing means of two independent groups:
- Assumption: the two populations are independent and normally distributed.
- Question: are the two population means µ1 and µ2 equal or not?
- What can we assume about the two population variances?
→ Equal variances
→ Unequal variances
Comparing two independent groups
Equal variances
• Research question: is there a difference in serum iron levels between the population of healthy children and the population of children with cystic fibrosis, assuming that the two groups' variances are equal?
• A sample of healthy children: n1 = 9, x̄1 = 18.9 µmol/l, s1 = 5.9 µmol/l
• A sample of children with cystic fibrosis: n2 = 13, x̄2 = 11.9 µmol/l, s2 = 6.3 µmol/l
Comparing two independent groups:
Equal variances
H0: µ1 = µ2 or µ1 − µ2 = 0
HA: µ1 ≠ µ2 or µ1 − µ2 ≠ 0
• To carry out the test, begin by calculating the pooled estimate of the variance:
$$s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2} = \frac{(9-1)(5.9)^2 + (13-1)(6.3)^2}{9+13-2} = 37.74$$
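Equal variances
• Next, calculate the test statistic:
$$t = \frac{\bar{x}_1-\bar{x}_2}{\sqrt{s_p^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} = \frac{18.9-11.9}{\sqrt{37.74\left(\frac{1}{9}+\frac{1}{13}\right)}} = 2.63$$
• Degrees of freedom: n1 + n2 − 2 = 20
• Get the p-value: Pr(t20 ≥ 2.63)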
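The pooled-variance calculation for the serum iron example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `pooled_t` is an illustrative choice, not something from the slides, and it stops at the test statistic (the p-value still comes from a t-table or calculator).

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance two-sample t-test from summary statistics.

    Returns (pooled variance, t statistic, degrees of freedom).
    """
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    # Standard error of the difference in means.
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, (mean1 - mean2) / se, df

# Healthy children vs. children with cystic fibrosis (values from the slides).
sp2, t, df = pooled_t(9, 18.9, 5.9, 13, 11.9, 6.3)
print(f"pooled variance = {sp2:.2f}, t = {t:.2f}, df = {df}")
```

The printed values should agree with the hand calculation: a pooled variance of about 37.74 and a t statistic of about 2.63 on 20 degrees of freedom.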
Equal variances
• According to the t-table, the two-tailed p-value is somewhere between 0.01 and 0.02, i.e. p < 0.05 → reject H0 at α = 0.05
• Obtain the exact p-value with a p-value calculator: p = 0.016 (two-tailed)
• Conclusion: based on these samples, the difference between the mean serum iron level of healthy children and the mean level of children with cystic fibrosis is statistically significant.
• It appears that children with cystic fibrosis suffer from an iron deficiency.
Unequal variances
• Suppose we are investigating the effect of an antihypertensive drug treatment on patients with hypertension.
- A sample of 2308 individuals receiving the treatment has a mean systolic BP of 142.5 mmHg and an SD of 15.7 mmHg.
- A sample of 2293 patients receiving placebo has a mean systolic BP of 156.5 mmHg and an SD of 17.3 mmHg.
• State the null and alternative hypotheses and perform the test.
Unequal variances
H0: µ1 = µ2 or µ1 − µ2 = 0
HA: µ1 ≠ µ2 or µ1 − µ2 ≠ 0
• Now, we calculate the test statistic:
$$t = \frac{\bar{x}_1-\bar{x}_2}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}$$
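Unequal variances
• We don't need to calculate the pooled variance, but we do need to calculate the approximate degrees of freedom (Welch-Satterthwaite approximation):
$$\nu \approx \frac{\left(\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1}+\frac{(s_2^2/n_2)^2}{n_2-1}}$$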
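For the antihypertensive example, the unequal-variance test statistic and its approximate degrees of freedom can be sketched as follows. The code assumes the usual Welch form of the test (separate variances in the standard error, Welch-Satterthwaite degrees of freedom); the function name `welch_t` is illustrative.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Unequal-variance two-sample t-test from summary statistics.

    Returns (t statistic, approximate degrees of freedom).
    """
    v1, v2 = sd1**2 / n1, sd2**2 / n2      # per-group variance of the mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Treatment vs. placebo (values from the slides).
t, df = welch_t(2308, 142.5, 15.7, 2293, 156.5, 17.3)
print(f"t = {t:.2f}, approximate df = {df:.0f}")
```

With a t statistic near −28.7 and several thousand degrees of freedom, the p-value is far below 0.05, so H0 would be rejected.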
How to Determine Equal or Unequal Variance in t-tests
There are two ways to do so:
1) Use the Variance Rule of Thumb.
2) Perform an F-test.
Variance in t-tests
1) Use the Variance Rule of Thumb.
• If the ratio of the larger variance to the smaller variance is less than 4, we can assume the variances are approximately equal and use Student's t-test.
• For example:
- Variance of sample 1 = 24.86, variance of sample 2 = 15.76.
- Ratio of larger sample variance to smaller sample variance:
Ratio = 24.86 / 15.76 = 1.577
• Since this ratio is less than 4, we can assume that the variances of the two groups are approximately equal.
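The rule of thumb is simple enough to express directly in code. The sketch below uses the two sample variances from the example; the helper name `variance_ratio` and the variable `assume_equal` are illustrative.

```python
def variance_ratio(var_a, var_b):
    """Ratio of the larger sample variance to the smaller one."""
    return max(var_a, var_b) / min(var_a, var_b)

ratio = variance_ratio(24.86, 15.76)
assume_equal = ratio < 4  # rule-of-thumb cutoff: ratio below 4
print(f"ratio = {ratio:.3f}, assume equal variances: {assume_equal}")
```

Taking the maximum over the minimum means the order of the two arguments does not matter.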
Variance in t-tests
2) Perform an F-test.
• An F-test is a formal statistical test that uses the following null and alternative hypotheses:
H0: The two populations have equal variances.
HA: The two populations do not have equal variances.
• The test statistic is calculated as follows:
$F = s_1^2 / s_2^2$
where $s_1^2$ and $s_2^2$ are the sample variances.
• If the p-value that corresponds to the test statistic is less than the chosen significance level (e.g. 0.05), then we have sufficient evidence to say that the variances are not equal.
Variance in t-tests
• Once again, suppose we have two samples with sample variances 24.86 and 15.76.
• To perform an F-test on the two samples, we calculate the F test statistic as:
$F = s_1^2 / s_2^2 = 24.86 / 15.76 = 1.577$
- According to the F-distribution calculator, an F-value of 1.577 with numerator df = n1 − 1 = 12 and denominator df = n2 − 1 = 12 has a corresponding p-value of 0.22079.
- Since this p-value is not less than 0.05, we fail to reject the null hypothesis. In other words, we can treat the two population variances as equal.
Comparing means
• Two groups:
- Paired t-test
- Two-sample t-test
• More than two groups?
Example: 25 patients with blisters.
Treatments: Treatment A, Treatment B and Placebo (P).
Measurement: number of days until blisters heal.
Data [means]:
- A: 5, 6, 6, 7, 7, 8, 9, 10 [7.25]
- B: 7, 7, 8, 9, 9, 10, 10, 11 [8.875]
- P: 7, 9, 9, 10, 10, 10, 11, 12, 13 [10.11]
Are these differences significant?
Side-by-side boxplots
[Figure: boxplots of days (vertical axis) for each treatment A, B, P (horizontal axis)]
Comparing means of > 2 groups
• Multiple t-tests?
- Using t-tests, we would have to compare 1 vs. 2, 1 vs. 3, and 2 vs. 3.
- Each time we do a t-test, the type I error rate is equal to α.
- As the number of comparisons increases, the probability of making at least one type I error increases rapidly:
P(making at least 1 error in n tests) = 1 − (1 − α)^n
where P(making an error) = α, P(not making an error) = 1 − α, and P(not making an error in n tests) = (1 − α)^n.
- Note: a type I error is made if we reject the null hypothesis when it is actually true.
One-way Analysis of Variance: ANOVA
• Allows the means of more than two groups to be compared.
• The experiment-wise error rate (α) is held at 0.05.
• Example: if you instead set α = 0.05 for each of the three pairwise sub-analyses, the overall alpha value is about 0.14, since 1 − (1 − α)^3 = 1 − (1 − 0.05)^3 = 0.142625.
• This means that the probability of rejecting at least one null hypothesis even when it is true (type I error) is about 14.26%.
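The growth of the overall type I error rate with the number of tests is easy to tabulate. This sketch applies the formula 1 − (1 − α)^n, which treats the tests as independent; the function name `familywise_error` is illustrative.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one type I error in n independent tests."""
    return 1 - (1 - alpha) ** n_tests

# Three pairwise t-tests at alpha = 0.05, as in the blister example.
overall = familywise_error(0.05, 3)
print(f"overall alpha for 3 tests = {overall:.6f}")

# The rate climbs quickly as more comparisons are added.
for n in (1, 3, 6, 10):
    print(n, round(familywise_error(0.05, n), 4))
```

For three tests the result is 0.142625, which is the figure ANOVA avoids by testing all group means at once.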
One-way Analysis of Variance: ANOVA
• ANOVA tests the following hypotheses:
H0: The population means of all the groups are equal.
H0: µ1 = µ2 = µ3 = … = µk
i.e. no treatment effect (no variation in means among groups).
HA: Not all of the population means are equal,
i.e. at least one population mean is different, or, there is a treatment effect.
• Rejecting H0 doesn't say how or which means differ.
• It doesn't mean that all population means are different (some pairs may be the same).
Hypotheses of one-way ANOVA
H0: μ1 = μ2 = μ3 = … = μk
HA: Not all μi are the same
All means are the same:
The null hypothesis is true (no treatment effect)
μ1 = μ2 = μ3
Hypotheses of one-way ANOVA
H0: μ1 = μ2 = μ3 = … = μk
HA: Not all μi are the same
At least one mean is different:
The null hypothesis is NOT true (treatment effect is present)
μ1 = μ2 ≠ μ3 or μ1 ≠ μ2 ≠ μ3
Assumptions of ANOVA
• Each group is approximately normally distributed
- Check this by looking at histograms and/or normal quantile plots, or rely on assumptions about the population.
- ANOVA can handle some non-normality, but not severe outliers.
• Standard deviations of the groups are approximately equal
- Rule of thumb: the ratio of the largest to the smallest sample standard deviation must be less than 2:1
Normality Check
• We should check for normality using:
- Assumptions about the population.
- Histograms for each group.
- Normal quantile plot for each group.
• With such small data sets, there isn't a really good way to check normality from the data, but we make the common assumption that physical measurements of people tend to be normally distributed.
Standard Deviation Check
Variable  Treatment  N  Mean    Median  StDev
days      A          8  7.250   7.000   1.669
          B          8  8.875   9.000   1.458
          P          9  10.111  10.000  1.764
• Compare largest and smallest standard deviations:
- Largest: 1.764
- Smallest: 1.458
- 1.458 × 2 = 2.916 > 1.764, so the ratio is less than 2:1 and the equal standard deviation assumption is reasonable.
Notation for ANOVA
• n = number of individuals all together
• k = number of groups
• x̄ = mean for the entire data set
Group i has
• ni = number of individuals in group i
• xij = value for individual j in group i
• x̄i = mean for group i
• si = standard deviation for group i
How ANOVA works
• ANOVA measures two sources of variation in the data and compares their relative sizes.
Variation BETWEEN groups
- for each data value, look at the difference between its group mean and the overall mean: $(\bar{x}_i - \bar{x})^2$
Variation WITHIN groups
- for each data value, look at the difference between that value and the mean of its group: $(x_{ij} - \bar{x}_i)^2$
How ANOVA works
$$F = \frac{SSB/(k-1)}{SSW/(n-k)}, \qquad SSB = \sum_{i} n_i(\bar{x}_i-\bar{x})^2, \qquad SSW = \sum_{i}\sum_{j}(x_{ij}-\bar{x}_i)^2 = \sum_{i}(n_i-1)s_i^2$$
SSB = sum of squares between groups; k = number of groups
SSW = sum of squares within groups; n = number of individuals all together
• The F-statistic is simply a ratio of two variances.
• Variances are a measure of dispersion, or how far the data are scattered from the mean.
• Larger values represent greater dispersion. Hence, a large F-value indicates that the variation between sample means (how far they are from the grand mean) is large compared to the variation within samples.
How ANOVA works
Variable  Treatment  N  Mean    Median  StDev
days      A          8  7.250   7.000   1.669
          B          8  8.875   9.000   1.458
          P          9  10.111  10.000  1.764
$$SSB = 8(7.25-8.8)^2 + 8(8.875-8.8)^2 + 9(10.111-8.8)^2 = 34.73$$
$$SSW = 7(1.669)^2 + 7(1.458)^2 + 8(1.764)^2 = 59.26$$
$$F_{statistic} = \frac{34.73/2}{59.26/22} = 6.45 > F_{0.05,2,22} = 3.44$$
• p < 0.05 → reject H0
• Conclude:
- Not all of the population means are equal
- At least one population mean is different
- There is a treatment effect
[Table: F statistic table (critical values of the F distribution)]
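The ANOVA arithmetic for the blister data can be reproduced from the raw values. This is a minimal standard-library sketch; the dictionary `groups` and the variable names are illustrative, and the critical value 3.44 is taken from the slides rather than computed.

```python
from statistics import mean

# Days until blisters heal, by treatment (data from the slides).
groups = {
    "A": [5, 6, 6, 7, 7, 8, 9, 10],
    "B": [7, 7, 8, 9, 9, 10, 10, 11],
    "P": [7, 9, 9, 10, 10, 10, 11, 12, 13],
}

n = sum(len(g) for g in groups.values())   # total number of individuals
k = len(groups)                            # number of groups
grand_mean = mean(x for g in groups.values() for x in g)

# Between-group sum of squares: group means vs. the grand mean.
ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
# Within-group sum of squares: each value vs. its own group mean.
ssw = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

f_stat = (ssb / (k - 1)) / (ssw / (n - k))
print(f"SSB = {ssb:.2f}, SSW = {ssw:.2f}, F = {f_stat:.2f}")
print("reject H0 at 0.05:", f_stat > 3.44)  # F(0.05; 2, 22) from the slides
```

Working from the raw data rather than the rounded means and standard deviations can shift the last digit of SSB and SSW slightly, but F still comes out at about 6.45.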
CLT for the Proportion
• So far, we have always dealt with the mean.
• In many cases, we also care about the proportion:
- Proportion of individuals who are hypertensive
- Proportion of individuals positive on an HIV test
- Proportion of adverse drug reactions
- Proportion of premature infants who survive
- And so on.
• We have the proportion in the sample, and we want to draw conclusions about the proportion in the population.
• Luckily, the CLT also applies to the proportion.
• The sampling distribution of sample proportions based on all samples of the same size n is approximately normal (if n is large).
• The mean of all sample proportions in the sampling distribution is the true population proportion p from which the samples were taken.
• The standard deviation of the sample proportions is called the standard error of the sample proportion: $SE(\hat{p}) = \sqrt{p(1-p)/n}$
• Note: p is the population proportion, p̂ is the sample proportion.
• The 95% CI for the population proportion p is given by:
$$\hat{p} \pm 1.96\sqrt{\frac{p(1-p)}{n}}$$
• But we don't know p, so we estimate it with p̂.
• The estimated 95% CI for the population proportion p based on a single sample of size n:
$$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
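A short sketch of the estimated 95% confidence interval for a proportion follows. The counts used (30 events in a sample of 100) are hypothetical numbers chosen only for illustration, and `proportion_ci_95` is an illustrative name; the interval relies on the normal approximation, so it assumes n is large.

```python
import math

def proportion_ci_95(events, n):
    """Normal-approximation 95% CI for a population proportion."""
    p_hat = events / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error
    return p_hat - 1.96 * se, p_hat + 1.96 * se

# Hypothetical sample: 30 events out of 100 individuals.
low, high = proportion_ci_95(30, 100)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

For these made-up counts the interval runs from roughly 0.21 to 0.39.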
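Comparing two proportions
• If you have 2 independent random samples and want to test the null hypothesis H0: p1 = p2 against either a one-sided or a two-sided alternative hypothesis, use the two-sample z-test:
$$z = \frac{\hat{p}_1-\hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}, \qquad \hat{p} = \frac{x_1+x_2}{n_1+n_2}$$
where x1 and x2 are the numbers of events in the two samples and p̂ is the pooled sample proportion.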
Example
• In a study investigating mortality among pediatric victims of motor vehicle accidents, information regarding the effectiveness of seat belts was collected. Two random samples were selected:
- Sample 1: 123 children who were wearing a seat belt at the time of the accident, of whom 3 died.
- Sample 2: 290 children who were not wearing a seat belt at the time of the accident, of whom 13 died.
• Is there evidence of a relationship between child mortality and wearing seat belts?
Example
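The seat belt data can be worked through with a pooled two-sample z-test for proportions. The sketch below assumes the pooled form of the standard error under H0: p1 = p2 and a two-sided alternative; the function name `two_proportion_z` is illustrative, and `NormalDist` from the standard library stands in for the z-table.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z-test for H0: p1 = p2 (two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)            # overall proportion under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * NormalDist().cdf(-abs(z))   # both tails
    return z, p_value

# Deaths among children with (3/123) and without (13/290) seat belts.
z, p_value = two_proportion_z(3, 123, 13, 290)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

The lower-tail probability for z ≈ −0.98 in the z-table is about 0.164; doubling it gives a two-sided p-value of roughly 0.33, so H0 is not rejected at the 0.05 level.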
The z-table: Pr(z ≤ zstatistic)
z 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
-1 0.15865 0.15625 0.15386 0.1515 0.14917 0.14686 0.14457 0.14231 0.14007 0.13786
-0.9 0.18406 0.18141 0.17878 0.17618 0.17361 0.17105 0.16853 0.16602 0.16354 0.16109
-0.8 0.21185 0.20897 0.20611 0.20327 0.20045 0.19766 0.19489 0.19215 0.18943 0.18673
-0.7 0.24196 0.23885 0.23576 0.23269 0.22965 0.22663 0.22363 0.22065 0.21769 0.21476
-0.6 0.27425 0.27093 0.26763 0.26434 0.26108 0.25784 0.25462 0.25143 0.24825 0.24509
-0.5 0.30853 0.30502 0.30153 0.29805 0.2946 0.29116 0.28774 0.28434 0.28095 0.27759
-0.4 0.34457 0.3409 0.33724 0.33359 0.32997 0.32635 0.32276 0.31917 0.31561 0.31206
-0.3 0.38209 0.37828 0.37448 0.3707 0.36692 0.36317 0.35942 0.35569 0.35197 0.34826
-0.2 0.42074 0.41683 0.41293 0.40904 0.40516 0.40129 0.39743 0.39358 0.38974 0.3859
-0.1 0.46017 0.4562 0.45224 0.44828 0.44433 0.44038 0.43644 0.4325 0.42857 0.42465
0 0.5 0.49601 0.49202 0.48803 0.48404 0.48006 0.47607 0.47209 0.46811 0.46414
Comparing proportions of ≥ 2 groups: Chi-square test
          Seat belt
Dead      Yes    No     Total
Yes       3      13     16
No        120    277    397
Total     123    290    413
How the Chi-square test works
• If H0 is true, the mortality proportions among those wearing seat belts and those not wearing seat belts are identical.
• We can then ignore the two separate categories and treat all 413 children as a single sample.
• In this sample:
- The overall mortality proportion = 16/413 = 3.87%
- The non-mortality proportion = 397/413 = 96.13%
• If H0 is true, the mortality proportions among those wearing seat belts and those not wearing seat belts are identical.
• The mortality proportion among the 123 children wearing a seat belt is expected to be 3.87% → the expected number of deaths in this group = 123 (0.0387) ≈ 4.76
• The mortality proportion among the 290 children not wearing a seat belt is expected to be 3.87% → the expected number of deaths in this group = 290 (0.0387) ≈ 11.23
• Similarly, the expected numbers of survivors are:
- Wearing seat belt = 123 (0.9613) ≈ 118.23
- Not wearing seat belt = 290 (0.9613) ≈ 278.76
Observed counts, with expected counts under H0 in parentheses:
          Seat belt
Dead      Yes             No              Total
Yes       3 (4.76)        13 (11.23)      16
No        120 (118.23)    277 (278.76)    397
Total     123             290             413
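• Calculate the test statistic, summing over the four cells (O = observed count, E = expected count):
$$\chi^2 = \sum \frac{(O-E)^2}{E}$$
• df = (2 − 1)(2 − 1) = 1
• Using the chi-square distribution table, the p-value is somewhere between 0.25 and 0.5
• p-value = 0.325 (p-value calculator) > 0.05
• Do not reject H0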
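The expected counts and the chi-square statistic for the seat belt table can be computed as below. This is a standard-library sketch without a continuity correction; the p-value line uses the identity P(χ² > x) = erfc(√(x/2)), which holds only for 1 degree of freedom, and the variable names are illustrative.

```python
import math

# Rows: died yes / no. Columns: seat belt yes / no (counts from the slides).
observed = [[3, 13], [120, 277]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count under H0 = row total * column total / grand total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail, valid for df = 1 only
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

The expected counts match the table up to rounding in the last digit, and the statistic (about 0.97) is the square of the two-sample z statistic for the same data, which is why both tests give a p-value near 0.325.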
Chi-square test limitation
• Does not perform well in small samples.
• Requires that no expected cell count be less than 5.
• Fisher's exact test: always appropriate for testing the equality of two proportions, with no minimum sample size requirement.
Measures of association
• Risk difference (attributable risk)
• Relative risk (risk ratio)
• Odds ratio
Risk difference
• The risk difference (RD), also called excess risk or attributable risk, is the difference between the risk of an outcome in the exposed group and the risk in the unexposed group.
• It describes the actual (absolute) difference in the observed risk of events between the experimental and control interventions.
Example
• A randomized, double-blind, placebo-controlled trial of the efficacy and safety of zidovudine (AZT) in reducing the risk of maternal-infant HIV transmission. 363 HIV-infected pregnant women were randomized to AZT or placebo.
• Results
- Of the 180 women randomized to the AZT group, 13 gave birth to children who tested positive for HIV within 18 months of birth.
- Of the 183 women randomized to the placebo group, 40 gave birth to children who tested positive for HIV within 18 months of birth.
Note: A double-blind study is one in which neither the participants nor the experimenters know who is receiving a particular treatment.
Risk difference
HIV            Drug group
transmission   AZT    Placebo   Total
Yes            13     40        53
No             167    143       310
Total          180    183       363
• The risk of HIV transmission (i.e. the proportion of HIV transmission):
- AZT: p̂1 = 13/180 ≈ 0.07 = 7%
- Placebo: p̂2 = 40/183 ≈ 0.22 = 22%
- Risk difference: p̂1 − p̂2 ≈ −0.15 = −15%
• Interpretation: if AZT were given to 1,000 HIV-infected pregnant women, this would reduce the number of HIV-positive infants by about 150 (relative to the number of HIV-positive infants born to 1,000 women not treated with AZT).
Relative risk (Risk ratio)
• Relative risk (RR) is the ratio of the probability of an outcome in an exposed group to the probability of the outcome in an unexposed group.
• Ex: The risk of HIV transmission with AZT relative to placebo:
$$RR = \frac{\hat{p}_1}{\hat{p}_2} = \frac{13/180}{40/183} \approx 0.33$$
The risk of HIV transmission with AZT is about 1/3 the risk of HIV transmission with placebo.
Interpretation: an HIV-positive pregnant woman could reduce her personal risk of giving birth to an HIV-positive child by nearly 70% if she takes AZT during her pregnancy.
Relative risk (Risk ratio)
• RR could be computed in the other direction as well (placebo relative to AZT):
$$RR = \frac{\hat{p}_2}{\hat{p}_1} = \frac{40/183}{13/180} \approx 3.03$$
• Interpretation: an HIV-positive pregnant woman increases her personal risk of giving birth to an HIV-positive child to slightly more than three times the risk with treatment if she does not take AZT during her pregnancy.
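The risk difference and both directions of the relative risk can be computed from the AZT trial counts. This is a minimal sketch with illustrative variable names; it uses the unrounded risks, so results may differ slightly from figures based on 0.07 and 0.22.

```python
# Counts from the AZT trial: HIV-positive infants / women randomized.
azt_events, azt_n = 13, 180
placebo_events, placebo_n = 40, 183

risk_azt = azt_events / azt_n                 # about 0.07
risk_placebo = placebo_events / placebo_n     # about 0.22

risk_difference = risk_azt - risk_placebo     # AZT minus placebo
rr_azt_vs_placebo = risk_azt / risk_placebo
rr_placebo_vs_azt = risk_placebo / risk_azt   # same comparison, reversed

print(f"RD = {risk_difference:.2f}")
print(f"RR (AZT vs placebo) = {rr_azt_vs_placebo:.2f}")
print(f"RR (placebo vs AZT) = {rr_placebo_vs_azt:.2f}")
```

The two relative risks are reciprocals of each other, so reversing the direction changes the wording of the interpretation but not the strength of the association.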
Risk difference vs. Relative risk
• Risk difference (attributable risk) provides a measure of the public health impact of an exposure (assuming causality).
• Relative risk provides a measure of the magnitude of the disease-exposure association for an individual.
• Each provides a different piece of information about the "story".
What is an Odds?
• Odds is the ratio of the risk of having an outcome to the risk of not having the outcome.
• If p represents the risk of an outcome, then the odds are given by:
$$\text{odds} = \frac{p}{1-p}$$
What is an Odds?
• The estimated risk of giving birth to an HIV-infected child among mothers treated with AZT is p̂1 = 13/180 ≈ 0.07
• The corresponding odds estimate is $\hat{p}_1/(1-\hat{p}_1) = 13/167 \approx 0.078$
• The estimated risk of giving birth to an HIV-infected child among mothers not treated with AZT is p̂2 = 40/183 ≈ 0.22
• The corresponding odds estimate is $\hat{p}_2/(1-\hat{p}_2) = 40/143 \approx 0.28$
Odds Ratio
• The estimated odds ratio of an HIV birth with AZT relative to placebo:
$$\widehat{OR} = \frac{\hat{p}_1/(1-\hat{p}_1)}{\hat{p}_2/(1-\hat{p}_2)} = \frac{13/167}{40/143} \approx 0.28$$
• The odds of HIV transmission with AZT are 0.28 times (a little under 1/3) the odds of transmission with placebo.
• Interpretation: AZT is associated with an estimated 72% (estimated OR = 0.28) reduction in the odds of giving birth to an HIV-infected child among HIV-infected pregnant women.
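The odds and the odds ratio for the AZT trial can be checked from the 2×2 counts. This is a small sketch with illustrative names; it computes the odds as p/(1 − p) and also shows the equivalent cross-product form of the odds ratio.

```python
def odds(p):
    """Convert a risk (probability) into odds."""
    return p / (1 - p)

# Counts from the AZT trial: (HIV-positive, HIV-negative) infants.
azt_yes, azt_no = 13, 167
placebo_yes, placebo_no = 40, 143

odds_azt = odds(azt_yes / (azt_yes + azt_no))
odds_placebo = odds(placebo_yes / (placebo_yes + placebo_no))
odds_ratio = odds_azt / odds_placebo

# Equivalent shortcut: cross-product of the 2x2 table.
cross_product = (azt_yes * placebo_no) / (placebo_yes * azt_no)

print(f"odds AZT = {odds_azt:.3f}, odds placebo = {odds_placebo:.3f}")
print(f"OR = {odds_ratio:.2f} (cross-product: {cross_product:.2f})")
```

Because the outcome is not rare in the placebo group (about 22%), the odds ratio of roughly 0.28 is somewhat further from 1 than the relative risk of roughly 0.33.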
