Download as pdf or txt
Download as pdf or txt
You are on page 1of 32

Assumptions in

hypothesis testing
HYPOTHESIS TESTING IN PYTHON

James Chapman
Curriculum Manager, DataCamp
Randomness
Assumption
The samples are random subsets of larger
populations

Consequence
Sample is not representative of population

How to check this


Understand how your data was collected

Speak to the data collector/domain expert

1 Sampling techniques are discussed in "Sampling in Python".

HYPOTHESIS TESTING IN PYTHON


Independence of observations
Assumption
Each observation (row) in the dataset is independent

Consequence
Increased chance of false negative/positive error

How to check this


Understand how our data was collected

HYPOTHESIS TESTING IN PYTHON


Large sample size
Assumption
The sample is big enough to mitigate uncertainty, so that the Central Limit Theorem applies

Consequence
Wider con dence intervals

Increased chance of false negative/positive errors

How to check this


It depends on the test

HYPOTHESIS TESTING IN PYTHON


Large sample size: t-test
One sample Two samples
At least 30 observations in the sample At least 30 observations in each sample

n ≥ 30 n1 ≥ 30, n2 ≥ 30

n: sample size ni : sample size for group i

Paired samples ANOVA


At least 30 pairs of observations across the At least 30 observations in each sample
samples
ni ≥ 30 for all values of i
Number of rows in our data ≥ 30

HYPOTHESIS TESTING IN PYTHON


Large sample size: proportion tests
One sample Two samples
Number of successes in sample is greater Number of successes in each sample is
than or equal to 10 greater than or equal to 10

n × p^ ≥ 10 n1 × p^1 ≥ 10

Number of failures in sample is greater than n2 × p^2 ≥ 10


or equal to 10
Number of failures in each sample is
n × (1 − p^) ≥ 10 greater than or equal to 10

n: sample size n1 × (1 − p^1 ) ≥ 10


p^: proportion of successes in sample
n2 × (1 − p^2 ) ≥ 10

HYPOTHESIS TESTING IN PYTHON


Large sample size: chi-square tests
The number of successes in each group in greater than or equal to 5

ni × p^i ≥ 5 for all values of i

The number of failures in each group in greater than or equal to 5

ni × (1 − p^i ) ≥ 5 for all values of i

ni : sample size for group i


p^i : proportion of successes in sample group i

HYPOTHESIS TESTING IN PYTHON


Sanity check
If the bootstrap distribution doesn't look normal, assumptions likely aren't valid

Revisit data collection to check for randomness, independence, and sample size

HYPOTHESIS TESTING IN PYTHON


Let's practice!
HYPOTHESIS TESTING IN PYTHON
Non-parametric
tests
HYPOTHESIS TESTING IN PYTHON

James Chapman
Curriculum Manager, DataCamp
Parametric tests
z-test, t-test, and ANOVA are all parametric tests

Assume a normal distribution

Require su ciently large sample sizes

HYPOTHESIS TESTING IN PYTHON


Smaller Republican votes data
print(repub_votes_small)

state county repub_percent_08 repub_percent_12


80 Texas Red River 68.507522 69.944817
84 Texas Walker 60.707197 64.971903
33 Kentucky Powell 57.059533 61.727293
81 Texas Schleicher 74.386503 77.384464
93 West Virginia Morgan 60.857614 64.068711

HYPOTHESIS TESTING IN PYTHON


Results with pingouin.ttest()
5 pairs is not enough to meet the sample size condition for the paired t-test:

At least 30 pairs of observations across the samples.

alpha = 0.01
import pingouin
pingouin.ttest(x=repub_votes_potus_08_12_small['repub_percent_08'],
y=repub_votes_potus_08_12_small['repub_percent_12'],
paired=True,
alternative="less")

T dof alternative p-val CI95% cohen-d BF10 power


T-test -5.875753 4 less 0.002096 [-inf, -2.11] 0.500068 26.468 0.239034

HYPOTHESIS TESTING IN PYTHON


Non-parametric tests
Non-parametric tests avoid the parametric assumptions and conditions

Many non-parametric tests use ranks of the data

x = [1, 15, 3, 10, 6]

from scipy.stats import rankdata


rankdata(x)

array([1., 5., 2., 4., 3.])

HYPOTHESIS TESTING IN PYTHON


Non-parametric tests
Non-parametric tests are more reliable than parametric tests for small sample sizes and
when data isn't normally distributed

HYPOTHESIS TESTING IN PYTHON


Non-parametric tests
Non-parametric tests are more reliable than parametric tests for small sample sizes and
when data isn't normally distributed

Wilcoxon-signed rank test


Developed by Frank Wilcoxon in 1945

One of the rst non-parametric procedures

HYPOTHESIS TESTING IN PYTHON


Wilcoxon-signed rank test (Step 1)
Works on the ranked absolute di erences between the pairs of data

repub_votes_small['diff'] = repub_votes_small['repub_percent_08'] -
repub_votes_small['repub_percent_12']
print(repub_votes_small)

state county repub_percent_08 repub_percent_12 diff


80 Texas Red River 68.507522 69.944817 -1.437295
84 Texas Walker 60.707197 64.971903 -4.264705
33 Kentucky Powell 57.059533 61.727293 -4.667760
81 Texas Schleicher 74.386503 77.384464 -2.997961
93 West Virginia Morgan 60.857614 64.068711 -3.211097

HYPOTHESIS TESTING IN PYTHON


Wilcoxon-signed rank test (Step 2)
Works on the ranked absolute di erences between the pairs of data

repub_votes_small['abs_diff'] = repub_votes_small['diff'].abs()
print(repub_votes_small)

state county repub_percent_08 repub_percent_12 diff abs_diff


80 Texas Red River 68.507522 69.944817 -1.437295 1.437295
84 Texas Walker 60.707197 64.971903 -4.264705 4.264705
33 Kentucky Powell 57.059533 61.727293 -4.667760 4.667760
81 Texas Schleicher 74.386503 77.384464 -2.997961 2.997961
93 West Virginia Morgan 60.857614 64.068711 -3.211097 3.211097

HYPOTHESIS TESTING IN PYTHON


Wilcoxon-signed rank test (Step 3)
Works on the ranked absolute di erences between the pairs of data

from scipy.stats import rankdata


repub_votes_small['rank_abs_diff'] = rankdata(repub_votes_small['abs_diff'])
print(repub_votes_small)

state county repub_percent_08 repub_percent_12 diff abs_diff rank_abs_diff


80 Texas Red River 68.507522 69.944817 -1.437295 1.437295 1.0
84 Texas Walker 60.707197 64.971903 -4.264705 4.264705 4.0
33 Kentucky Powell 57.059533 61.727293 -4.667760 4.667760 5.0
81 Texas Schleicher 74.386503 77.384464 -2.997961 2.997961 2.0
93 West Virginia Morgan 60.857614 64.068711 -3.211097 3.211097 3.0

HYPOTHESIS TESTING IN PYTHON


Wilcoxon-signed rank test (Step 4)
state county repub_percent_08 repub_percent_12 diff abs_diff rank_abs_diff
80 Texas Red River 68.507522 69.944817 -1.437295 1.437295 1.0
84 Texas Walker 60.707197 64.971903 -4.264705 4.264705 4.0
33 Kentucky Powell 57.059533 61.727293 -4.667760 4.667760 5.0
81 Texas Schleicher 74.386503 77.384464 -2.997961 2.997961 2.0
93 West Virginia Morgan 60.857614 64.068711 -3.211097 3.211097 3.0

Incorporate the sum of the ranks for negative and positive di erences

T_minus = 1 + 4 + 5 + 2 + 3
T_plus = 0
W = np.min([T_minus, T_plus])

HYPOTHESIS TESTING IN PYTHON


Implementation with pingouin.wilcoxon()
alpha = 0.01
pingouin.wilcoxon(x=repub_votes_potus_08_12_small['repub_percent_08'],
y=repub_votes_potus_08_12_small['repub_percent_12'],
alternative="less")

W-val alternative p-val RBC CLES


Wilcoxon 0.0 less 0.03125 -1.0 0.72

Fail to reject H0 , since 0.03125 > 0.01

HYPOTHESIS TESTING IN PYTHON


Let's practice!
HYPOTHESIS TESTING IN PYTHON
Non-parametric
ANOVA and
unpaired t-tests
HYPOTHESIS TESTING IN PYTHON

James Chapman
Curriculum Manager, DataCamp
Wilcoxon-Mann-Whitney test
Also know as the Mann Whitney U test

A t-test on the ranks of the numeric input

Works on unpaired data

HYPOTHESIS TESTING IN PYTHON


Wilcoxon-Mann-Whitney test setup
age_vs_comp = stack_overflow[['converted_comp', 'age_first_code_cut']]

age_vs_comp_wide = age_vs_comp.pivot(columns='age_first_code_cut',
values='converted_comp')

age_first_code_cut adult child


0 77556.0 NaN
1 NaN 74970.0
2 NaN 594539.0
... ... ...
2258 NaN 97284.0
2259 NaN 72000.0
2260 NaN 180000.0

[2261 rows x 2 columns]

HYPOTHESIS TESTING IN PYTHON


Wilcoxon-Mann-Whitney test
alpha=0.01

import pingouin
pingouin.mwu(x=age_vs_comp_wide['child'],
y=age_vs_comp_wide['adult'],
alternative='greater')

U-val alternative p-val RBC CLES


MWU 744365.5 greater 1.902723e-19 -0.222516 0.611258

HYPOTHESIS TESTING IN PYTHON


Kruskal-Wallis test
Kruskal-Wallis test is to Wilcoxon-Mann-Whitney test as ANOVA is to t-test

alpha=0.01

pingouin.kruskal(data=stack_overflow,
dv='converted_comp',
between='job_sat')

Source ddof1 H p-unc


Kruskal job_sat 4 72.814939 5.772915e-15

HYPOTHESIS TESTING IN PYTHON


Let's practice!
HYPOTHESIS TESTING IN PYTHON
Congratulations!
HYPOTHESIS TESTING IN PYTHON

James Chapman
Curriculum Manager, DataCamp
Course recap
Chapter 1 Chapter 3

Work ow for testing proportions vs. a Testing di erences in sample proportions


hypothesized value between two groups using proportion tests

False negative/false positive errors Using chi-square independence/goodness


of t tests

Chapter 2 Chapter 4

Testing di erences in sample means Reviewing assumptions of parametric


between two groups using t-tests hypothesis tests

Extending this to more than two groups Examined non-parametric alternatives


using ANOVA and pairwise t-tests when assumptions aren't valid

HYPOTHESIS TESTING IN PYTHON


More courses
Inference
Statistics Fundamentals with Python skill track

Bayesian statistics
Bayesian Data Analysis in Python

Applications
Customer Analytics and A/B Testing in Python

HYPOTHESIS TESTING IN PYTHON


Congratulations!
HYPOTHESIS TESTING IN PYTHON

You might also like