Week 3 Selecting Proper Stat Tests

MED101x
Introduction to Applied Biostatistics
Selecting Proper Statistical Tests

for Evidence Based Medicine
Overview:
1.1 Why is the selection of valid statistical tests important?

1.2 What are the factors to be considered for test selection?
1.3 Can you select now? (Tutorials)
Example 1: Different tests → Different results
[Figure: scatter plot of cognitive function score at 3 months after ICU discharge (y-axis, 10 to 60) against biomarker S100 (x-axis, 0 to 500).]
Pearson's correlation: P=0.10 (NS)
Spearman's correlation: P=0.03 (significant)
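The gap between the two P values comes from what each coefficient measures: Pearson's r captures linear association, while Spearman's coefficient is Pearson's r computed on ranks, so it captures any monotone association. A minimal sketch with made-up numbers (not the study data), using only the Python standard library; the helper names are illustrative. In practice a statistics package such as SciPy (scipy.stats.pearsonr, scipy.stats.spearmanr) would also give the P values.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: the score falls steadily as the biomarker rises,
# but one patient has an extreme biomarker value.
biomarker = [1, 2, 3, 4, 5, 6, 7, 1000]
score = [8, 7, 6, 5, 4, 3, 2, 1.9]

r = pearson_r(biomarker, score)
rho = spearman_rho(biomarker, score)
print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")
```

The relationship is perfectly monotone, so the rank-based coefficient is at its extreme, while the single extreme value pulls Pearson's r toward a much weaker linear fit.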
Example 2: Different tests → Different results
[Figure: NIH research funds ($ billions; y-axis, 0 to 1.4) against disability-adjusted life-years (millions; x-axis, 0 to 9) lost due to illness, with labeled points for AIDS, ischemic heart disease (ISD), and breast cancer. Gross et al. (1999).]
P=0.539 (NS)
Spearman's correlation: P<0.0001 (significant)
Example 3: Different tests → Different results
[Figure: total delirium days by APO-E4 status (absent vs. present).]
Student's t-test: P=0.405 (NS)
Mann-Whitney U test / Wilcoxon rank sum test
The Scandal of Poor Medical Research

Douglas G. Altman. British Medical Journal, 1994.
What should we think about a doctor who uses the wrong treatment,
either willfully or through ignorance, or who uses the right treatment
wrongly (such as by giving the wrong dose of a drug)? Most people
would agree that such behavior was unprofessional, arguably
unethical, and certainly unacceptable.
What then would we think about researchers who use the wrong
techniques (either willfully or in ignorance), use the right techniques
wrongly, misinterpret their results, report their results selectively, cite
the literature selectively, and draw unjustified conclusions? We
should be appalled. Yet numerous studies of the medical literature,
in both general and specialist journals, have shown that all of the
above phenomena are common. This is surely a scandal.
1.2 Factors to be considered for test selection
Understanding A Statistician’s Mind
Flow-chart for popularly used statistical tests

Q1: Univariate / Multivariable | Q2: Difference / Correlation | Q3: Paired / related | Q4, Q5: Type of outcome (Normality) | Q6: No. of groups | Q7: Sample size | Valid tests
Univariate | Difference | Independent (unpaired) | Continuous (normal) | 2 | | Student's t-test
Univariate | Difference | Independent (unpaired) | Continuous (normal) | >2 | | One-way ANOVA
Univariate | Difference | Independent (unpaired) | Continuous (non-normal) / ordered categorical | 2 | | Mann-Whitney U test
Univariate | Difference | Independent (unpaired) | Continuous (non-normal) / ordered categorical | >2 | | Kruskal-Wallis H test
Univariate | Difference | Independent (unpaired) | Nominal | 2 | <20 | Fisher's exact test
Univariate | Difference | Independent (unpaired) | Nominal | ≥2 | ≥20 | Chi-square test
Univariate | Difference | Independent (unpaired) | Time to event | | | Log-rank test (Kaplan-Meier plot)
Univariate | Difference | Dependent (paired) | Continuous (normal) | 2 | | Paired t-test
Univariate | Difference | Dependent (paired) | Continuous (normal) | >2 | | Repeated-measures ANOVA; mixed-effect regression
Univariate | Difference | Dependent (paired) | Continuous (non-normal) / ordered categorical | 2 | | Wilcoxon signed-rank test
Univariate | Difference | Dependent (paired) | Continuous (non-normal) / ordered categorical | >2 | | Friedman test
Univariate | Difference | Dependent (paired) | Nominal | 2 | | McNemar's test
Univariate | Correlation | | Continuous (normal) | | | Pearson's correlation (r)
Univariate | Correlation | | Continuous (non-normal) / ordered | | | Spearman's correlation (rs)
Univariate | Correlation | | Nominal (2 levels) | 2 | | Spearman / Kappa (agreement)
Multivariable | | Independent (unpaired) | Continuous (normal residuals) | | | Linear regression
Multivariable | | Independent (unpaired) | Continuous (non-normal residuals) | | | Linear regression*
Multivariable | | Independent (unpaired) | Ordered categorical | | | Ordered logistic regression
Multivariable | | Independent (unpaired) | Nominal (2 levels) | | | Binary logistic regression
Multivariable | | Independent (unpaired) | Nominal (>2 levels) | | | Multinomial logistic regression
Multivariable | | Independent (unpaired) | Time to event | | | Cox proportional hazards regression
Multivariable | | Dependent (paired) | Continuous (normal residuals) | | | Linear mixed-effect regression
Multivariable | | Dependent (paired) | Continuous (non-normal residuals) | | | Linear mixed-effect regression*
Multivariable | | Dependent (paired) | Ordered categorical | | | Generalized estimating equations (GEE)
Multivariable | | Dependent (paired) | Nominal (2 levels) | | | Generalized estimating equations (GEE)

* Transform outcome variables to normalize residuals
Created based on Publishing Your Medical Research Paper, by Daniel Byrne, Williams and Wilkins (1998)
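The univariate "difference" branch of the flow-chart can be written as a small lookup. This is only a sketch of the chart's decision path, not a validated tool: the function name, argument names, and outcome labels are hypothetical, and combinations the chart does not list raise an error instead of guessing.

```python
def select_univariate_difference_test(outcome, paired, n_groups, total_n=None):
    """Follow Q3 to Q7 of the flow-chart for a univariate test of difference.

    outcome: "normal" (continuous, normal), "non_normal_or_ordinal",
             "nominal", or "time_to_event" (hypothetical labels)
    paired: True for dependent (paired) groups, False for independent groups
    n_groups: number of groups being compared
    total_n: total sample size, needed only for unpaired nominal outcomes
    """
    if n_groups < 2:
        raise ValueError("a test of difference needs at least 2 groups")
    two_groups = n_groups == 2

    if not paired:
        if outcome == "normal":
            return "Student's t-test" if two_groups else "One-way ANOVA"
        if outcome == "non_normal_or_ordinal":
            return "Mann-Whitney U test" if two_groups else "Kruskal-Wallis H test"
        if outcome == "nominal":
            if total_n is None:
                raise ValueError("total_n is required for a nominal outcome")
            if total_n >= 20:
                return "Chi-square test"
            if two_groups:
                return "Fisher's exact test"
        if outcome == "time_to_event":
            return "Log-rank test"
    else:
        if outcome == "normal":
            if two_groups:
                return "Paired t-test"
            return "Repeated-measures ANOVA or mixed-effect regression"
        if outcome == "non_normal_or_ordinal":
            return "Wilcoxon signed-rank test" if two_groups else "Friedman test"
        if outcome == "nominal" and two_groups:
            return "McNemar's test"

    raise ValueError("combination not covered by the flow-chart")


# Two independent groups, skewed continuous outcome:
print(select_univariate_difference_test("non_normal_or_ordinal", False, 2))
# Before/after measurements on the same patients, normal differences:
print(select_univariate_difference_test("normal", True, 2))
```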
Question 1 – Univariate?
Which type of test do you need:
Univariate or Multivariable?
Univariate - Unadjusted Analysis

Multivariable - Adjusted Analysis
-Are there confounders?

- Need for adjustment?
Question 1 – Univariate? (cont’d)
Randomization prevents confounding. Thus, in general, confounders are more problematic in observational studies than in RCTs.
If you want to adjust for confounders, you need to perform regression analysis.
RCT -> Probably OK with univariate analysis
Observational studies -> Need to use regression
Population?
Patients with an echo for possible coronary disease
Exposure?
Use of Aspirin at the baseline visit
Control?
No use of Aspirin at the baseline visit
Outcome?
Long term mortality
(median FU of 3.1 years)
Results
People who use aspirin had reasons to use aspirin: they were sicker and had a poorer prognosis. This may mask the effect of aspirin, because its effect is mixed with the patients' poorer prognosis. This is called “confounding”.
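A small numerical sketch of how confounding can mask an effect. All counts below are invented for illustration and are not the study's results. The lecture recommends regression for adjustment; this sketch swaps in simple stratification by illness severity, which shows the same idea with plain arithmetic.

```python
# (deaths, patients) for each group within each severity stratum.
# Hypothetical counts: sicker patients are more likely to be on aspirin.
strata = {
    "sicker": {"aspirin": (160, 800), "no_aspirin": (50, 200)},
    "healthier": {"aspirin": (10, 200), "no_aspirin": (120, 800)},
}


def risk(deaths, patients):
    """Proportion of patients who died."""
    return deaths / patients


def crude_risk(group):
    """Mortality risk for a group, ignoring severity (unadjusted analysis)."""
    deaths = sum(strata[s][group][0] for s in strata)
    patients = sum(strata[s][group][1] for s in strata)
    return risk(deaths, patients)


print("Crude risk, aspirin:   ", crude_risk("aspirin"))
print("Crude risk, no aspirin:", crude_risk("no_aspirin"))

for name, groups in strata.items():
    ratio = risk(*groups["aspirin"]) / risk(*groups["no_aspirin"])
    print(f"{name}: risk ratio (aspirin vs. no aspirin) = {ratio:.2f}")
```

In the unadjusted comparison the two groups have the same mortality, yet within each severity stratum the aspirin users do better. The crude analysis hides the benefit because the aspirin group contains more of the sicker patients.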
Question 2 – Difference?
• Do you want to test for a difference between groups, or for a correlation between variables?
– Comparing the mean (or median) of two groups?
– Correlation between two variables in one group?
[Figure: scatter plot of cognitive function score at 3 months after ICU discharge against biomarker (S100). Labels shown: P=0.10 (not significant); Spearman's correlation.]
Comparing Difference
Examples:
Student's t-test: comparing 2 means
Mann-Whitney U test / Wilcoxon rank sum test: comparing 2 medians
[Figure: total delirium days by APO-E4 status (absent vs. present).]
Question 3 - Paired?
• Were the groups paired or unpaired (dependent or independent)?
Are you measuring more than once from one sample?
Examples:
Student's t-test: comparing 2 independent means (comparing an outcome between intervention and control groups).
Paired t-test: comparing 2 related means (comparing an outcome before and after an intervention).
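Why pairing matters can be seen by computing both t statistics on the same numbers. The weights below are invented, and the function names are illustrative; the formulas are the standard pooled-variance Student's t and the paired t on within-subject differences. Only the t statistics are computed here; a package such as SciPy (scipy.stats.ttest_ind, scipy.stats.ttest_rel) would also return P values.

```python
from math import sqrt
from statistics import mean, stdev, variance


def independent_t(a, b):
    """Student's t statistic for two independent groups (pooled variance)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def paired_t(first, second):
    """Paired t statistic: mean within-subject difference over its standard error."""
    diffs = [x - y for x, y in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical weights (kg) of the same six patients before and after a program.
before = [80, 95, 70, 110, 88, 102]
after = [78, 92, 69, 106, 85, 100]

t_unpaired = independent_t(before, after)
t_paired = paired_t(before, after)
print(f"treated as independent groups: t = {t_unpaired:.2f}")
print(f"treated as paired measurements: t = {t_paired:.2f}")
```

Every patient loses a little weight, but patients differ a lot from each other. The unpaired statistic is swamped by that between-patient variation, while the paired statistic looks only at the within-patient changes.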
Paired or Independent groups?
Independent: new eye drop vs. placebo eye drop
Paired: new eye drop (right eye) vs. placebo eye drop (left eye)
Question 4 - Outcome Type?
• What is the level of measurement for the outcome variable?
- Continuous (interval)? Ex. blood pressure, BMI, weight
- Discrete / categorical / factor?
  - Nominal?
    2 levels (binary, dichotomous). Ex. died / survived
    > 2 levels. Ex. disease type (cancer, DM, cardiovascular)
  - Ordinal?
    > 2 levels. Ex. disease severity (1: mild, 2: moderate, 3: severe), disease score (0: normal, 10: abnormal)
Question 5 – Normality?
• If an outcome variable is continuous, is it normally distributed? If your histogram forms a bell-shaped curve, assume that it is normal; otherwise, assume that it is non-normal.
[Figure: two histograms (y-axis, count). log_Valsal: normal. VALSAL_1: non-normal / skewed.]
[Figure: outcome by APO-E4 status (absent vs. present), shown twice. Normal: use parametric (Student's t-test, P=0.405 (NS)). Non-normal: use non-parametric (Mann-Whitney U test / Wilcoxon rank sum test).]
[Figure: histograms of total delirium days (x-axis, 0 to 30; y-axis, count) for APO-E4 absent and APO-E4 present.]
Usually, any variable measured in “days” is highly skewed (it cannot take negative values and has a long tail).
Parametric tests are valid only when…
Student's t-test and ANOVA are valid only when the outcome variable is normally distributed within each group.
The paired t-test (for example, comparing BP before and after an intervention) is valid only when the within-patient difference in the outcome variable (e.g., BP) is normally distributed.
Linear regression is valid only when the residuals (differences between observed and predicted values) are normally distributed.
Pearson correlation analysis is valid only when both variables (outcome and exposure) are normally distributed.
Non-parametric tests are always valid, regardless of the distribution of the data.
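A rough way to back up the histogram check with a number is the skewness coefficient: near 0 for a symmetric, bell-shaped variable and clearly positive for a long right tail. The sketch below uses invented "days" data and an illustrative text histogram; the log(1 + x) transform is one assumed choice of transformation, in the spirit of the log_Valsal histogram. A formal test such as scipy.stats.shapiro is another option.

```python
from math import log1p
from statistics import mean, median


def skewness(values):
    """Moment coefficient of skewness: 0 for symmetric data, > 0 for a long right tail."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def text_histogram(values, bin_width):
    """One line per non-empty bin, with one '#' per observation."""
    counts = {}
    for v in values:
        start = int(v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return [f"{start:>3}+ {'#' * counts[start]}" for start in sorted(counts)]


# Hypothetical "total delirium days": never negative, a few very long stays.
days = [0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 7, 10, 14, 25]
log_days = [log1p(d) for d in days]  # log(1 + days) handles the zeros

print("\n".join(text_histogram(days, 5)))
print(f"mean = {mean(days)}, median = {median(days)}")
print(f"skewness of days:        {skewness(days):.2f}")
print(f"skewness after log(1+x): {skewness(log_days):.2f}")
```

A mean well above the median and a large positive skewness both point toward a non-parametric test, or toward transforming the outcome before a parametric analysis.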
Statistical Methods Recommendation by New England Journal of Medicine
The basis for these guidelines is described in Bailar JC III, Mosteller F. Guidelines for statistical reporting in articles for medical journals: amplifications and explanations. Ann Intern Med 1988;108:266-73.
Exact methods should be used as extensively as possible in the analysis of categorical data. For analysis of measurements, nonparametric methods should be used to compare groups when the distribution of the dependent variable is not normal.
This page can be found at http://authors.nejm.org/Misc/NewMs.asp#statistics
Question 6 - #groups?
• How many groups are there for the
independent (predictor) variable?
- 2 levels ?
- 3 or More?
Examples:
Student t-test comparing 2 group means
ANOVA comparing 3 or more group means
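With 3 or more groups, one-way ANOVA compares the variation between group means with the variation within groups. A minimal sketch of the F statistic on invented numbers, using only the standard library; the function name is illustrative, and a package such as SciPy (scipy.stats.f_oneway) would also return the P value.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """F statistic: between-group mean square over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical outcome values in three independent groups.
group_a = [1, 2, 3]
group_b = [2, 3, 4]
group_c = [6, 7, 8]

f_three = one_way_anova_f(group_a, group_b, group_c)
f_two = one_way_anova_f(group_a, group_b)
print(f"F for three groups: {f_three:.2f}")
print(f"F for two groups:   {f_two:.2f}")
```

With exactly 2 groups the F statistic equals the square of Student's t statistic, which is why the t-test can be seen as the 2-group case of ANOVA.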
Question 7 - Sample Size?
• What is the total sample size?
Examples:
Total N greater than 20: use the chi-square test.
Total N greater than 20 and less than 40, with an expected count < 5 in a cell: use Fisher's exact test.
When to use Fisher's exact test
New Drug (N=20): died N=12, survived N=8
Control Drug (N=20): died N=18, survived N=2

Observed | died | survived
New Drug | 12 | 8
Control | 18 | 2

Expected (row total × column total / total N) | died | survived
New Drug | 15 | 5
Control | 15 | 5
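The expected counts that the sample-size rule refers to come from the row and column totals, and Fisher's exact test sums hypergeometric probabilities over all tables with those same totals. A standard-library sketch using the observed counts from the slide; the function names are illustrative, and the two-sided rule used here (sum the probabilities of all tables no more likely than the observed one) is one common convention. SciPy offers scipy.stats.fisher_exact and scipy.stats.chi2_contingency for routine use.

```python
from math import comb

# Rows: New Drug, Control. Columns: died, survived (counts from the slide).
observed = [[12, 8], [18, 2]]


def expected_counts(table):
    """Expected cell counts under no association: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact P value for a 2x2 table."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small tolerance so that equally likely tables are counted despite rounding.
    return sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


expected = expected_counts(observed)
p_value = fisher_exact_two_sided(observed)
print("expected counts:", expected)
print("smallest expected count:", min(min(row) for row in expected))
print(f"Fisher's exact test, two-sided P = {p_value:.3f}")
```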
Selection of Regression
Depends only on the following 2 things:
- Type of outcome variable
- Whether the data are paired (repeated) or not

Outcome | Not repeated | Repeated
Continuous | Linear | Mixed effect
Binary | Logistic | GEE
Time to event | Cox | MULCOX

1.3 Tutorials for selecting valid statistical tests
Example 1
• Comparing ventilator-free days between patients who were randomized to daily awakening and breathing trial vs. daily breathing trial among ventilated patients in a medical ICU: a prospective randomized study.
Q1: (Univariate?) Univariate (Multivariable: linear regression)
Q2: (Difference?) Difference
Q3: (Paired?) Unpaired
Q4: (Type?) Continuous
Q5: (Normality?) Normal or Non-normal
Q6: (#groups?) 2
Q7: (sample size?) > 30 in each group
Valid test: Student's t-test (if normal) or Mann-Whitney U test (if non-normal)
Example 2
• Cytokine responses of peripheral blood mononuclear cells (PBMC) from HIV-seronegative adults with prior extrapulmonary TB were compared with responses from persons with prior pulmonary tuberculosis and latent M. tuberculosis infection in a case-control study. Antas, Journal of Allergy and Clinical Immunology. 2006.
Q1: (Univariate?) Univariate (Multivariable: linear regression)

Q5: (Normality?) Normal or Non-normal
Q6: (#groups?) 3
Valid test: 1-way ANOVA (if normal) or Kruskal-Wallis H test (if non-normal)
Example 3
• We want to estimate the relationship between two numerical measures: the biomarker value for S100 and patients' cognitive scores measured at 3 months after ICU discharge, among patients in a medical ICU.
Q1: (Univariate?) Univariate (Multivariable: linear regression)
Q2: (Difference?) Correlation
Q3: (Paired?) NA
Q6: (#groups?) 1 group
Valid test: Pearson's r (correlation coefficient) or Spearman's ρ (rank correlation coefficient)
Example 4
• Martinez-Picado et al. compared the proportion of patients with HIV infection who had a viral surge between alternation of antiretroviral drug regimens and standard regimens. A Randomized, Controlled Trial. Annals of Internal Medicine. 2003.
Q1: (Univariate?) Univariate (Multivariable: logistic regression)
Q4: (Type?) Nominal
Q5: (Normality?) NA
Q6: (#groups?) 2
Q7: (sample size?) > 20 or < 20
Valid test: Chi-square test (if > 20) or Fisher's exact test (if < 20)
Example 5
• A researcher wants to evaluate the effect of a new diet on
weight loss by comparing patient’s weight before and after
the diet program.
Q1: (Univariate?) Univariate

Q3: (Paired?) Paired
Q6: (#groups?) 2