Nonparametric Tests and ANOVAs: What You Need to Know

Nonparametric tests
• Nonparametric tests are usually based
on ranks
• There are nonparametric versions of
most parametric tests
Parametric                        Nonparametric
One-sample and paired t-test      Sign test
Two-sample t-test                 Mann-Whitney U-test
Quick Reference Summary:
Sign Test
• What is it for? A non-parametric test to
compare the median of a group to some
constant
• What does it assume? Random samples
• Formula: Identical to a binomial test with
$p_0 = 0.5$. Uses the number of subjects with
values greater than and less than a
hypothesized median as the test statistic.
$P(x) = \binom{n}{x} p^x (1 - p)^{n - x}$
$P = 2 \cdot \Pr[x \ge X]$
P(x) = probability of a total of x successes
p = probability of success in each trial
n = total number of trials
Sign test
Sample → Test statistic: x = number of values greater than mo
Null hypothesis: Median = mo → Null distribution: Binomial n, 0.5
Compare the test statistic with the null distribution: how unusual is this test statistic?
P < 0.05 → Reject Ho
P > 0.05 → Fail to reject Ho
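The sign test procedure can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `sign_test`, the sample values, and the choice to drop values exactly equal to the hypothesized median are all assumptions. For the two-tailed P-value it takes the larger of the "above" and "below" counts and doubles the upper binomial tail.

```python
from math import comb

def sign_test(values, m0):
    """Two-tailed sign test of Ho: median == m0 (illustrative sketch)."""
    above = sum(v > m0 for v in values)
    below = sum(v < m0 for v in values)
    n = above + below            # assumption: values equal to m0 are dropped
    x = max(above, below)        # the more extreme of the two counts
    # Pr[X >= x] under Binomial(n, 0.5)
    tail = sum(comb(n, k) for k in range(x, n + 1)) / 2 ** n
    return x, n, min(1.0, 2 * tail)

# Hypothetical data, hypothesized median of 2.0
x, n, p = sign_test([1.2, 3.4, 2.8, 4.1, 3.9, 5.0, 2.2, 3.3], m0=2.0)
print(x, n, p)
```

Here 7 of the 8 values lie above 2.0, so the P-value is 2 × (8 + 1) / 256.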


Quick Reference Summary:
Mann-Whitney U Test
• What is it for? A non-parametric test to compare the
central tendencies of two groups
• What does it assume? Random samples
• Test statistic: U
• Distribution under Ho: U distribution, with sample
sizes n1 and n2
• Formulae:
$U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$
$U_2 = n_1 n_2 - U_1$
n1 = sample size of group 1
n2 = sample size of group 2
R1 = sum of ranks of group 1
Use the larger of U1 or U2 for a two-tailed test
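As a rough sketch, the U formulae translate directly into Python. The function name `mann_whitney_u` and the two small samples are hypothetical, and the code assumes no tied values, so each observation gets a distinct whole-number rank.

```python
def mann_whitney_u(group1, group2):
    """U1, U2 and the larger of the two, via the rank-sum formula.

    Assumes there are no tied values across the pooled data.
    """
    pooled = sorted(group1 + group2)
    rank = {value: i + 1 for i, value in enumerate(pooled)}  # ranks 1..n1+n2
    n1, n2 = len(group1), len(group2)
    r1 = sum(rank[value] for value in group1)                # R1
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 - u1
    return u1, u2, max(u1, u2)

# Hypothetical samples
u1, u2, u = mann_whitney_u([1.1, 2.3, 3.0], [2.9, 4.2, 5.5, 6.1])
print(u1, u2, u)
```

In this example the ranks of group 1 are 1, 2 and 4, so R1 = 7, U1 = 12 + 6 - 7 = 11 and U2 = 1.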
Mann-Whitney U test
Sample → Test statistic: U1 or U2 (use the larger)
Null hypothesis: The two groups have the same median → Null distribution: U with n1, n2
Compare the test statistic with the null distribution: how unusual is this test statistic?
P < 0.05 → Reject Ho
P > 0.05 → Fail to reject Ho


Mann-Whitney U test
• Large-sample approximation:
$Z = \frac{2U - n_1 n_2}{\sqrt{n_1 n_2 (n_1 + n_2 + 1) / 3}}$
Use this when n1 and n2 are both > 10
Compare to the standard normal distribution
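The large-sample approximation can be sketched as follows. The function name `mann_whitney_z` and the input numbers are made up for illustration. The two-tailed P-value uses the standard normal identity 2(1 - Φ(|z|)) = erfc(|z| / √2).

```python
from math import sqrt, erfc

def mann_whitney_z(u, n1, n2):
    """Normal approximation for U; intended for n1 and n2 both > 10."""
    z = (2 * u - n1 * n2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 3)
    p_two_tailed = erfc(abs(z) / sqrt(2))  # area in both tails of N(0, 1)
    return z, p_two_tailed

# Hypothetical result: U = 95 (the larger of U1, U2) with n1 = 12, n2 = 11
z, p = mann_whitney_z(95, 12, 11)
print(z, p)
```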

Mann-Whitney U Test
• If you have ties:
– Rank them anyway, pretending they were
slightly different
– Find the average of the ranks for the
identical values, and give them all that rank
– Carry on as if all the whole-number ranks
have been used up
Example
Data    Sorted Data    Rank A    Rank
14      2              1         1.5
2       2              2         1.5
5       4              3         3
4       5              4         4
2       14             5         6
14      14             6         6
18      14             7         6
14      18             8         8

Rank A: rank the ties anyway, pretending they were slightly different.
Rank: find the average of the ranks for the identical values, and give them all that rank.
TIES: the two 2s occupy ranks 1 and 2 (Average = 1.5); the three 14s occupy ranks 5, 6 and 7 (Average = 6).

These can now be used for the Mann-Whitney U test
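The tie-handling steps can be reproduced with a short Python sketch. The function name `average_ranks` is hypothetical; the data are the eight values from the example. Ranks are returned in the original (unsorted) data order.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of values tied with the one at `pos`
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2  # average of the ranks in the run
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks

data = [14, 2, 5, 4, 2, 14, 18, 14]
ranks = average_ranks(data)
print(ranks)
```

Each 2 receives rank 1.5 and each 14 receives rank 6, matching the final Rank column.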


Benefits and Costs of
Nonparametric Tests
• Main benefit:
– Make fewer assumptions about your data
– E.g. only assume random sample
• Main cost:
– Reduced statistical power
– Increased chance of Type II error
When Should I Use
Nonparametric Tests?
• When you have reason to suspect the
assumptions of your test are violated
– Non-normal distribution
– No transformation makes the distribution
normal
– Different variances for two groups
Quick Reference Summary:
ANOVA (analysis of variance)
• What is it for? Testing the difference among k
means simultaneously
• What does it assume? The variable is
normally distributed with equal standard
deviations (and variances) in all k
populations; each sample is a random sample
• Test statistic: F
• Distribution under Ho: F distribution with k-1
and N-k degrees of freedom
Quick Reference Summary:
ANOVA (analysis of variance)
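• Formulae:
$F = \frac{MS_{group}}{MS_{error}}$
$MS_{group} = \frac{SS_{group}}{df_{group}} = \frac{SS_{group}}{k - 1}$
$MS_{error} = \frac{SS_{error}}{df_{error}} = \frac{SS_{error}}{N - k}$
$SS_{group} = \sum_i n_i (\bar{Y}_i - \bar{Y})^2$
$SS_{error} = \sum_i s_i^2 (n_i - 1)$
$\bar{Y}_i$ = mean of group i
$n_i$ = size of sample i
$\bar{Y}$ = overall mean
N = total sample size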
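A minimal sketch of these formulae in Python, using only the standard library. The function name `one_way_anova` and the three small groups are invented for illustration. `statistics.variance` gives the sample variance, which plays the role of s_i² here.

```python
from statistics import mean, variance

def one_way_anova(groups):
    """Return (SS_group, SS_error, F) for a list of samples."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / N
    # SS_group = sum over groups of n_i * (group mean - overall mean)^2
    ss_group = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SS_error = sum over groups of s_i^2 * (n_i - 1)
    ss_error = sum(variance(g) * (len(g) - 1) for g in groups)
    ms_group = ss_group / (k - 1)
    ms_error = ss_error / (N - k)
    return ss_group, ss_error, ms_group / ms_error

# Hypothetical data: k = 3 groups, N = 9
ss_group, ss_error, f_ratio = one_way_anova(
    [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
)
print(ss_group, ss_error, f_ratio)
```

With these numbers the group means are 2, 3 and 6, giving SS_group = 26, SS_error = 6 and F = 13 / 1 = 13.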
ANOVA
k Samples → Test statistic: F = MSgroup / MSerror
Null hypothesis: All groups have the same mean → Null distribution: F with k-1, N-k df
Compare the test statistic with the null distribution: how unusual is this test statistic?
P < 0.05 → Reject Ho
P > 0.05 → Fail to reject Ho


There are a LOT of equations here, and this is the simplest possible ANOVA.
Term     Sum of Squares                                      df                     Mean Squares                                                    F-ratio
Group    $SS_{group} = \sum_i n_i (\bar{Y}_i - \bar{Y})^2$   $df_{group} = k - 1$   $MS_{group} = SS_{group} / df_{group} = SS_{group} / (k - 1)$   $F = MS_{group} / MS_{error}$
Error    $SS_{error} = \sum_i s_i^2 (n_i - 1)$               $df_{error} = N - k$   $MS_{error} = SS_{error} / df_{error} = SS_{error} / (N - k)$


ANOVA Tables
Source of variation    Sum of squares       df       Mean Squares                    F ratio                  P
Treatment              SSgroup              k - 1    MSgroup = SSgroup / dfgroup     F = MSgroup / MSerror    *
Error                  SSerror              N - k    MSerror = SSerror / dferror
Total                  SSgroup + SSerror    N - 1

where $SS_{group} = \sum_i n_i (\bar{Y}_i - \bar{Y})^2$ and $SS_{error} = \sum_i s_i^2 (n_i - 1)$


ANOVA Table: Example
Source of variation    Sum of squares    df    Mean Squares    F ratio    P
Treatment              7.22              2     3.61            7.29       0.004
Error                  9.42              19    0.50
Total                  16.64             21
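The remaining columns of the example table follow from the sums of squares and degrees of freedom. The sketch below is illustrative: both function names are hypothetical, and the P-value helper relies on a closed form of the F distribution that holds only when the numerator df equals 2, as it does in this example. For other df you would need statistical tables or a statistics library.

```python
def complete_anova_table(ss_group, df_group, ss_error, df_error):
    """Fill in mean squares, F ratio and totals from SS and df."""
    ms_group = ss_group / df_group
    ms_error = ss_error / df_error
    return {
        "MS_group": ms_group,
        "MS_error": ms_error,
        "F": ms_group / ms_error,
        "SS_total": ss_group + ss_error,
        "df_total": df_group + df_error,
    }

def p_value_numerator_df_2(f_ratio, df_error):
    """Pr[F > f_ratio] for an F distribution with 2 and df_error df.

    Closed form valid ONLY for numerator df = 2.
    """
    return (df_error / (df_error + 2 * f_ratio)) ** (df_error / 2)

table = complete_anova_table(7.22, 2, 9.42, 19)
p = p_value_numerator_df_2(table["F"], 19)
print(table, p)
```

Computed from the rounded sums of squares, F comes out near 7.28 rather than exactly 7.29. The small gap is consistent with rounding in the tabulated values.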
Additions to ANOVA
• $R^2$ value: how much variance is
explained?
• Comparisons of groups: planned and
unplanned
• Fixed vs. random effects
• Repeatability
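For the first bullet, a small sketch assuming the usual definition for one-way ANOVA, R² = SSgroup / SStotal (the slides do not give the formula). The function name `r_squared` is hypothetical, and the inputs are the sums of squares from the example ANOVA table.

```python
def r_squared(ss_group, ss_error):
    """Fraction of total variation explained by group differences.

    Assumes R^2 = SS_group / SS_total, with SS_total = SS_group + SS_error.
    """
    return ss_group / (ss_group + ss_error)

r2 = r_squared(7.22, 9.42)
print(round(r2, 3))
```

With those values, a little under half of the variation is explained by the treatment.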
Two-Factor ANOVA
• Often we manipulate more than one
thing at a time
• Multiple categorical explanatory
variables
• Example: sex and nationality
Two-factor ANOVA
• Don’t worry about the equations for this
• Use an ANOVA table
Two-factor ANOVA
• Testing three null hypotheses:
1. Means don’t differ among levels of treatment 1
2. Means don’t differ among levels of treatment 2
3. There is no interaction between the
two treatments
Two-factor ANOVA Table
Source of variation          Sum of Squares    df                   Mean Square                      F ratio        P
Treatment 1                  SS1               k1 - 1               SS1 / (k1 - 1)                   MS1 / MSE
Treatment 2                  SS2               k2 - 1               SS2 / (k2 - 1)                   MS2 / MSE
Treatment 1 * Treatment 2    SS1*2             (k1 - 1)*(k2 - 1)    SS1*2 / [(k1 - 1)*(k2 - 1)]      MS1*2 / MSE
Error                        SSerror           XXX                  SSerror / XXX
Total                        SStotal           N - 1
