
Use the t-test to determine if two groups (two treatments) are different.

Compare the blood pressure of patients taking a drug to that of patients not taking the drug
Two treatments: Drug vs. No drug
Data (Drug in cells A7:A13, No drug in cells B7:B13):
Drug    No drug
110     130
115     145
120     145
125     150
130     150
135     170
140     175
The p-value for the t-test tells us how likely it is that we would see a difference
at least this large between the two groups if the drug had no effect.
Expressed another way:
A small p-value means that random differences in sampling the two groups,
in the absence of any effect of the drug,
would rarely produce a difference as large as the one observed.
Using the Excel TTEST worksheet function (tails = 2 for a two-tailed test, type = 2 for two samples with equal variance):
=TTEST(A7:A13,B7:B13,2,2)
t-test p-value = 0.00253
Using Menu: Tools / DataAnalysis / t-test Two-sample assuming equal variance
t-Test: Two-Sample Assuming Equal Variances
                                Drug        No drug
Mean                            125         152.1429
Variance                        116.6667    240.4762
Observations                    7           7
Pooled Variance                 178.5714
Hypothesized Mean Difference    0
df                              12
t Stat                          -3.8
P(T<=t) one-tail                0.001265
t Critical one-tail             1.782287
P(T<=t) two-tail                0.00253
t Critical two-tail             2.178813
[Chart: Blood pressure vs treatment. x-axis: Patient number (1 to 7); y-axis: Blood pressure (0 to 200); series: Drug, No drug]
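The pooled variance and t Stat in the Excel output can be checked by hand. The following is a minimal sketch in Python, using only the standard library; the function name pooled_t and the variable names are illustrative assumptions, not part of the workbook. Because the standard library has no t distribution, the sketch compares the t statistic against the two-tail critical value copied from the Excel output instead of computing a p-value.

```python
from math import sqrt
from statistics import mean, variance

# Blood pressure readings from the worksheet (cells A7:A13 and B7:B13).
drug = [110, 115, 120, 125, 130, 135, 140]
no_drug = [130, 145, 145, 150, 150, 170, 175]

def pooled_t(a, b):
    """Return (t statistic, df, pooled variance) for an equal-variance two-sample t-test."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Weighted average of the two sample variances.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    std_err = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / std_err, df, pooled_var

t_stat, df, pooled_var = pooled_t(drug, no_drug)

# Two-tail critical value for df = 12 at the 0.05 level, copied from the Excel output.
T_CRITICAL_TWO_TAIL = 2.178813
significant = abs(t_stat) > T_CRITICAL_TWO_TAIL

print(f"pooled variance = {pooled_var:.4f}, df = {df}, t = {t_stat:.4f}")
print("significant at the 0.05 level:", significant)
```

The absolute value of the t statistic exceeds the critical value, which agrees with the two-tail p-value of 0.00253 being below 0.05.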
Single factor ANOVA can do the same thing as the t-test to determine if two groups (two treatments) are different.
Compare the blood pressure of patients taking a drug to that of patients not taking the drug
Two treatments: Drug vs. No drug
Drug    No drug
110     130
115     145
120     145
125     150
130     150
135     170
140     175
Using the Excel TTEST worksheet function:
=TTEST(A7:A13,B7:B13,2,2)
t-test p-value = 0.00253
Single-factor ANOVA = 1-way ANOVA
Two-factor ANOVA = 2-way ANOVA
Anova: Single Factor
SUMMARY
Groups Count Sum Average Variance
Drug 7 875 125 116.6667
No drug 7 1065 152.1429 240.4762
ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 2578.571 1 2578.571 14.44 0.00253 4.747225
Within Groups 2142.857 12 178.5714
Total 4721.429 13
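The ANOVA table above can be rebuilt from the SUMMARY rows alone, which also shows why ANOVA and the t-test agree for two groups. The sketch below is an illustration under assumptions: the function name anova_from_summary is hypothetical, and the variances are the rounded values printed by Excel, so results match only to a few decimal places.

```python
# Summary rows copied from the Excel SUMMARY table: (count, sum, variance).
summary = {
    "Drug": (7, 875, 116.6667),
    "No drug": (7, 1065, 240.4762),
}

def anova_from_summary(groups):
    """Rebuild single-factor ANOVA quantities from per-group count, sum, and variance."""
    total_n = sum(n for n, _, _ in groups.values())
    grand_mean = sum(s for _, s, _ in groups.values()) / total_n
    # Between groups: spread of the group means around the grand mean.
    ss_between = sum(n * (s / n - grand_mean) ** 2 for n, s, _ in groups.values())
    # Within groups: each variance multiplied back by its degrees of freedom.
    ss_within = sum((n - 1) * var for n, _, var in groups.values())
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat

ss_b, ss_w, df_b, df_w, f_stat = anova_from_summary(summary)
print(f"SS between = {ss_b:.3f}, SS within = {ss_w:.3f}, F = {f_stat:.2f}")

# With two groups, F equals the square of the t statistic (-3.8 squared is 14.44).
f_matches_t_squared = abs(f_stat - (-3.8) ** 2) < 0.01
print("F matches t squared:", f_matches_t_squared)
```

The within-groups mean square (178.5714) is the same number as the pooled variance in the t-test output, which is why both methods report the same p-value of 0.00253.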
Use single-factor (1-way) Analysis of variance (ANOVA) to determine if two or more groups are different
Compare the blood pressure of patients taking three different drugs, Drug A, B and C.
Drug A Drug B Drug C
110 105 130
115 115 145
120 125 145
125 125 150
130 125 150
135 140 170
140 140 175
The p-value for the ANOVA tells us how likely it is that we would see differences
at least this large among the groups if none of the treatments had any effect.
A small p-value means that random differences in sampling the groups,
in the absence of any effect of the drug, would rarely produce differences as large as those observed.
If at least one group is different, ANOVA tends to give us a small p-value.
Using Menu: Tools / DataAnalysis / ANOVA Single Factor
SUMMARY
Groups Count Sum Average Variance
Drug A 7 875 125 116.6667
Drug B 7 875 125 158.3333
Drug C 7 1065 152.1429 240.4762
ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 3438.095 2 1719.048 10.00462 0.001198 3.554561
Within Groups 3092.857 18 171.8254
Total 6530.952 20
Looking at the SUMMARY table, we notice that the average for drug C is 152.1429, while
the average for the other two groups is 125.
[Chart: x-axis: Patient (1 to 7); y-axis: Blood pressure (0 to 200); series: Drug A, Drug B, Drug C]
The ANOVA table tells us that P-value is 0.001198, which means that it is very unlikely
we would see this big a difference between the three groups just by chance.
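The F value in the single-factor ANOVA table can be reproduced from the raw data. This is a minimal sketch using only the Python standard library; the function name one_way_anova and the dictionary keys are illustrative assumptions. It compares F against the F crit value copied from the Excel output instead of computing a p-value.

```python
from statistics import mean

# Blood pressure readings for the three drugs, copied from the worksheet.
groups = {
    "Drug A": [110, 115, 120, 125, 130, 135, 140],
    "Drug B": [105, 115, 125, 125, 125, 140, 140],
    "Drug C": [130, 145, 145, 150, 150, 170, 175],
}

def one_way_anova(samples):
    """Return sums of squares, degrees of freedom, and F for a single-factor ANOVA."""
    all_values = [x for values in samples.values() for x in values]
    grand_mean = mean(all_values)
    # Between groups: how far each group mean sits from the grand mean.
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in samples.values())
    # Within groups: how far each reading sits from its own group mean.
    ss_within = sum((x - mean(v)) ** 2 for v in samples.values() for x in v)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "F": f_stat,
    }

result = one_way_anova(groups)

# F crit for df = (2, 18) at the 0.05 level, copied from the Excel output.
F_CRIT = 3.554561
print(f"F = {result['F']:.5f}, exceeds F crit: {result['F'] > F_CRIT}")
```

Note that a large F only says that at least one group differs; it does not say which one.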
Use two-factor (2-way) Analysis of variance to determine if either of two factors affects the outcomes
Suppose we think that two factors, age and drug, may affect the patient's response
Use the Excel menu: Tools/Data Analysis/ ANOVA 2-factor without replication
Factor: Drug
                         Drug A    Drug B    Drug C
Factor: Age   Under 21   118       128       135
              21 to 55   120       130       136
              Over 55    121       130       134
Anova: Two-Factor Without Replication
SUMMARY Count Sum Average Variance
Under 21 3 381 127 73
21 to 55 3 386 128.6667 65.33333
Over 55 3 385 128.3333 44.33333
Drug A 3 359 119.6667 2.333333
Drug B 3 388 129.3333 1.333333
Drug C 3 405 135 1
ANOVA
Source of Variation SS df MS F P-value F crit
Rows 4.666666667 2 2.333333 2 0.25 6.944272
Columns 360.6666667 2 180.3333 154.5714 0.000163 6.944272
Error 4.666666667 4 1.166667
Total 370 8
The analysis indicates that there is a significant difference among the columns (Factor: Drug), with a p-value of 0.000163
The analysis indicates that there is NOT a significant difference among the rows (Factor: age), with a p-value of 0.25
We might be concerned that we only treated three people with each drug, and feel that we would like more replicates.
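The Rows, Columns, and Error lines of the ANOVA table above can be reproduced by splitting the total variation into a row part, a column part, and what is left over. The sketch below is an illustration using only the Python standard library; the function name two_way_no_replication is a hypothetical choice.

```python
from statistics import mean

# Rows are age groups; columns are Drug A, Drug B, Drug C.
table = {
    "Under 21": [118, 128, 135],
    "21 to 55": [120, 130, 136],
    "Over 55": [121, 130, 134],
}

def two_way_no_replication(rows):
    """Return (SS, df, F) for rows and columns, and (SS, df) for error."""
    data = list(rows.values())
    n_rows, n_cols = len(data), len(data[0])
    grand = mean([x for row in data for x in row])
    row_means = [mean(row) for row in data]
    col_means = [mean(list(col)) for col in zip(*data)]
    ss_rows = n_cols * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n_rows * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    # With one reading per cell, whatever rows and columns do not explain is error.
    ss_error = ss_total - ss_rows - ss_cols
    df_rows, df_cols = n_rows - 1, n_cols - 1
    df_error = df_rows * df_cols
    ms_error = ss_error / df_error
    return {
        "rows": (ss_rows, df_rows, (ss_rows / df_rows) / ms_error),
        "columns": (ss_cols, df_cols, (ss_cols / df_cols) / ms_error),
        "error": (ss_error, df_error),
    }

result = two_way_no_replication(table)
print(f"F for rows (age)     = {result['rows'][2]:.4f}")
print(f"F for columns (drug) = {result['columns'][2]:.4f}")
```

With only 4 degrees of freedom for error, the F crit value is high (6.944272 in the Excel output), which is one reason more replicates are desirable.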
2-way ANOVA with replicates
Use two-factor (2-way) Analysis of variance to determine if either of two factors affects the outcomes
Include replicates to increase confidence in the results
Suppose we think that two factors, age and drug, may affect the patient's response
Factor: Drug
                         Drug A    Drug B    Drug C
Factor: Age   Under 21   118       128       135
                         117       126       136
                         110       125       130
                         118       131       135
              21 to 55   120       130       136
                         118       132       136
                         121       129       140
                         124       130       131
              Over 55    121       130       134
                         122       128       131
                         119       130       138
                         127       135       140
Use the Excel menu: Tools/Data Analysis/ ANOVA 2-factor with replication
Anova: Two-Factor With Replication
SUMMARY Drug A Drug B Drug C Total
Under 21
Count 4 4 4 12
Sum 463 510 536 1509
Average 115.75 127.5 134 125.75
Variance 14.91667 7 7.333333 70.20455
21 to 55
Count 4 4 4 12
Sum 483 521 543 1547
Average 120.75 130.25 135.75 128.9167
Variance 6.25 1.583333 13.58333 47.7197
Over 55
Count 4 4 4 12
Sum 489 523 543 1555
Average 122.25 130.75 135.75 129.5833
Variance 11.58333 8.916667 16.25 43.90152
Total
Count 12 12 12
Sum 1435 1554 1622
Average 119.5833 129.5 135.1667
Variance 17.35606 7 10.87879
ANOVA
Source of Variation SS df MS F P-value F crit
Sample 100.6667 2 50.33333 5.182078 0.012453 3.354131
Columns 1493.167 2 746.5833 76.86463 7.14E-12 3.354131
Interaction 24.66667 4 6.166667 0.63489 0.641998 2.727765
Within 262.25 27 9.712963
Total 1880.75 35
The analysis indicates that there is NOT an interaction between the rows (Factor: age) and the columns (Factor: Drug), with a p-value of 0.64
The analysis indicates that there is a significant difference among the columns (Factor: Drug), with a p-value of 7.14E-12
The analysis indicates that there is a significant difference among the rows (Factor: age), with a p-value of 0.012
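The Sample, Columns, Interaction, and Within lines of the two-factor ANOVA with replication can be reproduced from the raw readings. The following is a minimal sketch using only the Python standard library. The function name two_way_with_replication is a hypothetical choice, and the code assumes a balanced design (the same number of replicates in every cell).

```python
from statistics import mean

# cells[age][drug] holds the four replicate readings from the worksheet.
cells = {
    "Under 21": {"Drug A": [118, 117, 110, 118], "Drug B": [128, 126, 125, 131], "Drug C": [135, 136, 130, 135]},
    "21 to 55": {"Drug A": [120, 118, 121, 124], "Drug B": [130, 132, 129, 130], "Drug C": [136, 136, 140, 131]},
    "Over 55": {"Drug A": [121, 122, 119, 127], "Drug B": [130, 128, 130, 135], "Drug C": [134, 131, 138, 140]},
}

def two_way_with_replication(cells):
    """Return {source: (SS, df, F)} for a balanced two-factor design."""
    rows = list(cells)
    cols = list(cells[rows[0]])
    n = len(cells[rows[0]][cols[0]])  # replicates per cell (assumed equal everywhere)
    all_values = [x for r in rows for c in cols for x in cells[r][c]]
    grand = mean(all_values)
    row_means = [mean([x for c in cols for x in cells[r][c]]) for r in rows]
    col_means = [mean([x for r in rows for x in cells[r][c]]) for c in cols]
    ss_sample = n * len(cols) * sum((m - grand) ** 2 for m in row_means)
    ss_columns = n * len(rows) * sum((m - grand) ** 2 for m in col_means)
    # Within: scatter of the replicates around their own cell mean.
    ss_within = sum((x - mean(cells[r][c])) ** 2 for r in rows for c in cols for x in cells[r][c])
    ss_total = sum((x - grand) ** 2 for x in all_values)
    # Interaction: variation among cell means not explained by rows or columns.
    ss_interaction = ss_total - ss_sample - ss_columns - ss_within
    df_sample, df_columns = len(rows) - 1, len(cols) - 1
    df_interaction = df_sample * df_columns
    df_within = len(rows) * len(cols) * (n - 1)
    ms_within = ss_within / df_within
    return {
        "Sample": (ss_sample, df_sample, ss_sample / df_sample / ms_within),
        "Columns": (ss_columns, df_columns, ss_columns / df_columns / ms_within),
        "Interaction": (ss_interaction, df_interaction, ss_interaction / df_interaction / ms_within),
        "Within": (ss_within, df_within, None),
    }

result = two_way_with_replication(cells)
for source, (ss, df, f) in result.items():
    print(source, round(ss, 4), df, None if f is None else round(f, 5))
```

The replicates are what make the Within line possible: they give an estimate of error that does not depend on assuming there is no interaction.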
2-way ANOVA with replicates, test for interaction
Use two-factor (2-way) Analysis of variance to determine if either of two factors affects the outcomes
Include replicates to increase confidence in the results
Suppose we think that two factors, age and drug, may affect the patient's response
Factor: Drug
                         Drug A    Drug B
Factor: Age   Under 21   118       120
                         118       118
                         110       121
                         118       123
              21 to 55   120       118
                         118       117
                         121       110
                         125       118
Averages by age group:
              Drug A    Drug B
Under 21      116       120.5      Drug B > Drug A if under 21
21 to 55      121       115.75     Drug B < Drug A if 21 to 55
What is the effect of drug A vs drug B? Depends on age
What is the effect of age? Depends on drug
Therefore, there is an interaction between age and drug
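One way to put a number on this interaction is to compute the drug effect (Drug B minus Drug A) inside each age group and then take the difference between those two effects. The sketch below is an illustration only; the names drug_effect, contrast, and main_effect are assumptions, and the shortcut formula for the interaction sum of squares applies only to a balanced 2 x 2 design like this one.

```python
from statistics import mean

# Four replicate readings per cell, copied from the worksheet.
cells = {
    ("Under 21", "Drug A"): [118, 118, 110, 118],
    ("Under 21", "Drug B"): [120, 118, 121, 123],
    ("21 to 55", "Drug A"): [120, 118, 121, 125],
    ("21 to 55", "Drug B"): [118, 117, 110, 118],
}

def drug_effect(age):
    """Mean of Drug B minus mean of Drug A within one age group."""
    return mean(cells[(age, "Drug B")]) - mean(cells[(age, "Drug A")])

effect_under_21 = drug_effect("Under 21")   # positive: Drug B is higher
effect_21_to_55 = drug_effect("21 to 55")   # negative: Drug B is lower

# Averaged over both age groups, the opposite effects nearly cancel.
main_effect = (effect_under_21 + effect_21_to_55) / 2

# The interaction contrast is the difference between the two drug effects.
contrast = effect_under_21 - effect_21_to_55

# For a balanced 2 x 2 design with n replicates per cell,
# the interaction sum of squares is n * contrast^2 / 4.
n = 4
ss_interaction = n * contrast ** 2 / 4

print("drug effect, under 21:", effect_under_21)
print("drug effect, 21 to 55:", effect_21_to_55)
print("average drug effect:", main_effect)
print("interaction SS:", ss_interaction)
```

The interaction SS computed this way equals the value on the Interaction line of the ANOVA table for this sheet, while the average drug effect is close to zero.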
Anova: Two-Factor With Replication
SUMMARY Drug A Drug B Total
Under 21
Count 4 4 8
Sum 464 482 946
Average 116 120.5 118.25
Variance 16 4.333333 14.5
21 to 55
Count 4 4 8
Sum 484 463 947
Average 121 115.75 118.375
Variance 8.666667 14.91667 17.98214
Total
Count 8 8
Sum 948 945
Average 118.5 118.125
Variance 17.71429 14.69643
ANOVA
Source of Variation SS df MS F P-value F crit
Sample 0.0625 1 0.0625 0.005693 0.9411 4.747225
Columns 0.5625 1 0.5625 0.051233 0.82474 4.747225
Interaction 95.0625 1 95.0625 8.658444 0.012314 4.747225
Within 131.75 12 10.97917
Total 227.4375 15
The analysis indicates that there IS an interaction between the rows (Factor: age) and the columns (Factor: Drug), with a p-value of 0.01
The analysis indicates that there is NOT a significant difference among the columns (Factor: Drug), with a p-value of 0.82
The analysis indicates that there is NOT a significant difference among the rows (Factor: age), with a p-value of 0.94
Is it correct to conclude that neither age nor drug have any effect?
How should we analyze the data to determine the effect(s), if any, of age and drug?
What would we have concluded if we had not tested for the interaction between age and drug?
Because we have a significant interaction, we have to look at the effect of the drug separately in each age group
t-test for drug effect in "Under 21" age group separately
Under 21    Drug A    Drug B
            118       120
            118       118
            110       121
            118       123
Average     116       120.5
p-value = 0.032785
Drug B > Drug A if under 21
t-test for drug effect in "21 to 55" age group separately
21 to 55    Drug A    Drug B
            120       118
            118       117
            121       110
            125       118
Average     121       115.75
p-value = 0.02351
Drug B < Drug A if 21 to 55
If we do not test for interaction, we would conclude that the drug has no effect.
If we test for interaction, we learn that the drug has different effects in different age groups, and that
the effect is significant in each age group, but in opposite directions.
If you have a missing value, the number of observations differs between treatment conditions.
This situation is called an unbalanced design.
Excel's two-factor ANOVA tools cannot work on unbalanced designs.
You get an error message saying "Input range contains non-numeric data", because one of the cells is empty.
If you have an unbalanced design (missing values), use another statistics package, such as R, that can handle it.
Example: missing value in lower right cell of data.
Factor: Drug
                         Drug A    Drug B    Drug C
Factor: Age   Under 21   118       128       135
                         117       126       136
                         110       125       130
                         118       131       135
              21 to 55   120       130       136
                         118       132       136
                         121       129       140
                         124       130       131
              Over 55    121       130       134
                         122       128       131
                         119       130       138
                         127       135       (empty)
PatientID Drug Age Group Response
P1 Drug A Under 21 118
P2 Drug A Under 21 117
P3 Drug A Under 21 110
P4 Drug A Under 21 118
P5 Drug A 21 to 55 120
P6 Drug A 21 to 55 118
P7 Drug A 21 to 55 121
P8 Drug A 21 to 55 124
P9 Drug A Over 55 121
P10 Drug A Over 55 122
P11 Drug A Over 55 119
P12 Drug A Over 55 127
P13 Drug B Under 21 128
P14 Drug B Under 21 126
P15 Drug B Under 21 125
P16 Drug B Under 21 131
P17 Drug B 21 to 55 130
P18 Drug B 21 to 55 132
P19 Drug B 21 to 55 129
P20 Drug B 21 to 55 130
P21 Drug B Over 55 130
P22 Drug B Over 55 128
P23 Drug B Over 55 130
P24 Drug B Over 55 135
P25 Drug C Under 21 135
P26 Drug C Under 21 136
P27 Drug C Under 21 130
P28 Drug C Under 21 135
P29 Drug C 21 to 55 136
P30 Drug C 21 to 55 136
P31 Drug C 21 to 55 140
P32 Drug C 21 to 55 131
P33 Drug C Over 55 134
P34 Drug C Over 55 131
P35 Drug C Over 55 138
P36 Drug C Over 55 140
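Statistics packages that handle unbalanced designs usually expect the data in the one-row-per-patient layout shown above rather than in Excel's grid layout. The sketch below shows one way that reshaping could be done in plain Python; it is an assumption about workflow, not a step from the workbook. The name to_long is hypothetical, and None is used here to stand for the empty cell. The table above lists all 36 readings, whereas this sketch starts from the grid with the missing value, so it yields 35 rows.

```python
from collections import Counter

# Grid layout from the worksheet; None marks the empty cell
# (Drug C, Over 55, last replicate).
wide = {
    "Drug A": {"Under 21": [118, 117, 110, 118], "21 to 55": [120, 118, 121, 124], "Over 55": [121, 122, 119, 127]},
    "Drug B": {"Under 21": [128, 126, 125, 131], "21 to 55": [130, 132, 129, 130], "Over 55": [130, 128, 130, 135]},
    "Drug C": {"Under 21": [135, 136, 130, 135], "21 to 55": [136, 136, 140, 131], "Over 55": [134, 131, 138, None]},
}

def to_long(table):
    """Flatten the grid into one record per observed reading, skipping empty cells."""
    records = []
    for drug, by_age in table.items():
        for age, values in by_age.items():
            for value in values:
                if value is None:
                    continue  # a missing reading simply produces no row
                records.append({
                    "PatientID": f"P{len(records) + 1}",
                    "Drug": drug,
                    "Age Group": age,
                    "Response": value,
                })
    return records

records = to_long(wide)

# Counting rows per treatment condition makes the imbalance visible.
counts = Counter((r["Drug"], r["Age Group"]) for r in records)
print(len(records), "rows")
print("Drug C, Over 55:", counts[("Drug C", "Over 55")], "observations")
```

In this layout a missing value is just an absent row, so no cell is left empty for the software to reject.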
Repeated measures ANOVA
Recall that we used the paired t-test when each patient was measured before and after treatment,
or the same patient got two different treatments (drug A vs. drug B).
The advantage of the paired t-test is that we compare treatments within patients,
which controls for differences between patients.
The paired t-test only works for comparing two treatments, or two time points (before and after)
If we want to compare three or more treatments, we use Repeated Measures ANOVA, which is
the extension of the paired t-test.
In Excel, we perform a repeated measures ANOVA using the Data Analysis menu item "ANOVA: Two-factor without replication".
Essentially, we consider treatment (the drug) to be one factor, and patient is the second factor.
PatientID Drug A Drug B Drug C
1 118 110 125
2 117 126 117
3 110 125 110
4 118 121 131
5 120 130 130
6 118 132 132
7 121 119 129
8 124 130 130
9 121 130 118
10 122 128 121
11 119 130 130
12 127 135 135
Anova: Two-Factor Without Replication
SUMMARY Count Sum Average Variance
1 3 353 117.6667 56.33333
2 3 360 120 27
3 3 345 115 75
4 3 370 123.3333 46.33333
5 3 380 126.6667 33.33333
6 3 382 127.3333 65.33333
7 3 369 123 28
8 3 384 128 12
9 3 369 123 39
10 3 371 123.6667 14.33333
11 3 379 126.3333 40.33333
12 3 397 132.3333 21.33333
Drug A 12 1435 119.5833 17.35606
Drug B 12 1516 126.3333 46.78788
Drug C 12 1508 125.6667 56.78788
We see evidence of a difference in means among the drugs, but we need to consider variance, so we look at the ANOVA p-value.
ANOVA
Source of Variation SS df MS F P-value F crit
Rows 745.6388889 11 67.78535 2.550889 0.02960642 2.258518
Columns 332.0555556 2 166.0278 6.247933 0.00709924 3.443357
Error 584.6111111 22 26.57323
Total 1662.305556 35
The ANOVA p-value for patients (rows) is 0.0296, which indicates that patients differ.
The ANOVA p-value for the drugs (columns) is 0.0071, which indicates that the drugs differ.
We probably want to know which of the three drugs are different from each other.
When we compare pairs of treatments after an ANOVA, it is called "post-hoc" comparisons.
Excel doesn't have statistical tests that correct the p-values for doing multiple post-hoc comparisons.
To get p-values corrected for multiple comparisons we should use other software.
However, for now we'll use multiple t-tests, even though this method can give more false-positive results.
PatientID Drug A Drug B Drug C
1 118 110 125
2 117 126 117
3 110 125 110
4 118 121 131
5 120 130 130
6 118 132 132
7 121 119 129
8 124 130 130
9 121 130 118
10 122 128 121
11 119 130 130
12 127 135 135
t-test for A vs B 0.007941
t-test for A vs C 0.022839
t-test for B vs C 0.822582
It appears that there is a big difference between A and B, and between A and C.
B and C do not appear to differ.
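As a rough illustration of what a multiple-comparison correction does, the sketch below applies the Bonferroni correction to the three p-values above. Bonferroni is one common and conservative choice; the workbook does not name a specific method, so this is an assumption, and other software may use a different correction.

```python
# Uncorrected p-values from the three pairwise t-tests above.
p_values = {"A vs B": 0.007941, "A vs C": 0.022839, "B vs C": 0.822582}

def bonferroni(raw):
    """Multiply each p-value by the number of comparisons, capping the result at 1."""
    m = len(raw)
    return {name: min(1.0, p * m) for name, p in raw.items()}

adjusted = bonferroni(p_values)

for name, p in adjusted.items():
    verdict = "below 0.05" if p < 0.05 else "not below 0.05"
    print(f"{name}: corrected p = {p:.6f} ({verdict})")
```

Under this correction A vs B stays below 0.05, while A vs C no longer does, which illustrates how uncorrected multiple t-tests can give more false-positive results.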