Probability and Statistics - Lecture 4
BIOL2001 / BIOL6200
Semester 1, 2024
Hypothesis testing – workflow
1. Define the Null Hypothesis and the Alternative Hypothesis.
2. Collect data.
3. Calculate a test statistic that compares the observed data to what we would expect if the null
hypothesis were true.
4. Define a threshold (alpha value; α) to determine whether a test statistic is statistically significant,
i.e., to determine whether to reject the null hypothesis based on a given p-value.
5. If the probability (p-value) of obtaining a test statistic at least this extreme is below this threshold,
we reject the null hypothesis; otherwise we do not reject it.
P-values: what they are and what they are not
The probability of obtaining results at least as extreme as the observed result
assuming the null hypothesis is true.
The p-value...
• IS a statement about the probability of obtaining the respective data under the null hypothesis.
The larger the sample size, the smaller the effect needed to produce a statistically significant p-value
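A minimal sketch of the sample-size effect, using the χ2 statistic introduced later in this lecture. The counts are illustrative, the helper names are hypothetical, and the p-value shortcut via erfc only holds for 1 degree of freedom. The proportions stay the same in every table; only the total number of observations grows.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Chi-square statistic for a 2 x 2 table of counts (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail probability of a chi-square distribution with 1 df."""
    return erfc(sqrt(stat / 2))


base = [[42, 15], [40, 23]]  # illustrative counts; same proportions at every scale
p_values = {}
for scale in (1, 2, 4, 8):
    scaled = [[cell * scale for cell in row] for row in base]
    p_values[scale] = p_value_df1(chi_square_2x2(scaled))
    print(f"N = {120 * scale:4d}  p = {p_values[scale]:.4f}")
```

The effect (the difference in proportions) is identical in all four tables, yet the p-value shrinks as N grows.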
Null Hypothesis Significance Testing (NHST)
We use statistical hypothesis tests to decide whether or not to reject the null hypothesis.
We set a threshold – the alpha value, α – below which a p-value is considered to reflect a statistically
significant result (i.e., one where we reject H0).
α is usually set at 0.05; sometimes 0.01 or 0.001, depending on the field (ultimately arbitrary).
α represents the conditional probability of rejecting the null hypothesis when the null hypothesis is
true; the Type I error rate.
Hypothesis testing – sometimes we will be wrong!
If we reject H0 when it is actually true, that is called a Type I error.
The Type I error rate, when H0 is true, is equivalent to our alpha value, α.
If we fail to reject H0 (i.e., we retain it) when it is indeed false, that is called a Type II error.
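To see that the Type I error rate is set by α, here is a small sketch under assumed conditions that are not from the lecture: 100 flips of a coin that really is fair (so H0 is true), tested with a two-category χ2 statistic against the df = 1 critical value for α = 0.05. The function names are hypothetical. Because counts are discrete, the rate should come out close to, but not exactly, 0.05.

```python
from math import comb

# Hypothetical setup (not from the lecture): 100 coin flips, H0 = the coin is fair.
N_FLIPS = 100
CRITICAL_VALUE = 3.841  # chi-square critical value for df = 1, alpha = 0.05


def chi_square_two_categories(heads, n):
    """Chi-square statistic for heads/tails counts against a fair coin."""
    expected = n / 2
    tails = n - heads
    return (heads - expected) ** 2 / expected + (tails - expected) ** 2 / expected


def type_i_error_rate(n, critical_value):
    """Exact probability of rejecting H0 when the coin really is fair."""
    total = 0.0
    for heads in range(n + 1):
        if chi_square_two_categories(heads, n) > critical_value:
            total += comb(n, heads) / 2 ** n  # binomial probability under H0
    return total


rate = type_i_error_rate(N_FLIPS, CRITICAL_VALUE)
print(f"Probability of rejecting a true H0: {rate:.4f}")
```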
Chi-Square (χ2) Statistic
$$\chi^2_k = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}; \quad k = n - 1$$
Functions like this produce a test statistic, and this particular function is called the chi-square
(χ2) statistic. The index k is the degrees of freedom (df).
o is the observed value
e is the expected value
[Figure from the previous lecture: Frequency of plots with given number of spiders]
[Plot: χ2 on the x-axis (0 to 10), k = 9]
Figuring out Degrees of Freedom
The number of degrees of freedom (df) is the total number of categories (or observations) minus the
number of categories (or observations) which we can calculate given the marginal (or total):
k = n – 1 for univariate data
# spiders   0        1        2        ≥3
Expected    5.4506   7.0858   4.6058   2.8578
Observed    3        9        5        3
Number of columns – 1 = 4 – 1 = 3 df
Critical values of the χ2 distribution, by degrees of freedom (df) and upper-tail probability:
df   0.995   0.99    0.975   0.95    0.90    0.10     0.05     0.025    0.01     0.005
1    3.9e-5  1.5e-4  0.001   0.004   0.016   2.706    3.841    5.024    6.635    7.879
2    0.010   0.020   0.051   0.103   0.211   4.605    5.991    7.378    9.210    10.597
3    0.072   0.115   0.216   0.352   0.584   6.251    7.815    9.348    11.345   12.838
4    0.207   0.297   0.484   0.711   1.064   7.779    9.488    11.143   13.277   14.860
5    0.412   0.554   0.831   1.145   1.610   9.236    11.070   12.833   15.086   16.750
6    0.676   0.872   1.237   1.635   2.204   10.645   12.592   14.449   16.812   18.548
7    0.989   1.239   1.690   2.167   2.833   12.017   14.067   16.013   18.475   20.278
8    1.344   1.646   2.180   2.733   3.490   13.362   15.507   17.535   20.090   21.955
9    1.735   2.088   2.700   3.325   4.168   14.684   16.919   19.023   21.666   23.589
At df = 3 and level of significance α = 0.05, the critical value is 7.815.
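As a sketch of how the pieces fit together for the spider data, the snippet below plugs the observed and expected counts into the χ2 formula and compares the result against the df = 3 critical value at α = 0.05. Variable names are illustrative only.

```python
# Observed and expected counts for 0, 1, 2 and >=3 spiders per plot.
observed = [3, 9, 5, 3]
expected = [5.4506, 7.0858, 4.6058, 2.8578]

# Chi-square statistic: sum of (o - e)^2 / e over all categories.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1   # k = n - 1 for univariate data
critical_value = 7.815   # from the table: df = 3, alpha = 0.05

reject_h0 = chi2 > critical_value
print(f"chi2 = {chi2:.3f}, df = {df}, reject H0: {reject_h0}")
```

The statistic falls well below the critical value, so with these counts the null hypothesis would not be rejected.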
Now, we are going to see it applied as an "Independence Test", or "Test of Association", or "Homogeneity Test".
– A test to check whether the frequencies of one variable differ depending on the value of the other variable.
Use of the χ2 test means we will continue to work with categorical data.
The "one variable vs. the other variable" means we will be working with bivariate data.
The researcher wants to know whether there is an association between field of study and
smoking.
If smoking is not associated with field of study, it would mean that the probability of
being a smoker is independent from the probability of being in a specific field.
i.e., there is no relationship!
χ2 Test, contingency tables, & bivariate data
Is there an association between field of study and smoking?
The researcher surveyed a random sample of 120 students from both fields and
obtained the following results:
                Non-smokers  Smokers  Marginal freqs
Medicine        42           15       57
Physics         40           23       63
Marginal freqs  82           38       120
• What is the probability of being a Med student?
P(Med) = 57/120 = 0.475
• What is the probability of being a smoker?
P(Smoker) = 38/120 ≈ 0.3167
Rules of Probability
1. The Addition Rule: P(A or B) = P(A) + P(B) - P(A and B)
If A and B are mutually exclusive events, or those that cannot occur together,
then the third term is 0, and the rule reduces to P(A or B) = P(A) + P(B).
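A quick worked check of the Addition Rule, using the counts from the smoking survey (57 Med students, 38 smokers, 15 Med students who smoke, 120 students in total). Exact fractions are used to avoid rounding; the variable names are illustrative.

```python
from fractions import Fraction

N = 120                  # total students surveyed
n_med, n_physics = 57, 63
n_smokers = 38
n_med_and_smoker = 15    # Med students who smoke

p_med = Fraction(n_med, N)
p_smoker = Fraction(n_smokers, N)
p_med_and_smoker = Fraction(n_med_and_smoker, N)

# General case: subtract the overlap so it is not counted twice.
p_med_or_smoker = p_med + p_smoker - p_med_and_smoker

# Mutually exclusive case: nobody is in both fields, so the third term is 0.
p_med_or_physics = p_med + Fraction(n_physics, N)

print(p_med_or_smoker, p_med_or_physics)
```

The first result matches a direct count of the cells: 42 + 15 + 23 = 80 students out of 120 are Med students, smokers, or both.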
If smoking and field of study are independent, we expect the joint frequencies to be:
P(X & Y) = P(X)∙P(Y); E(X & Y) = P(X & Y)∙N
Have a go at filling this out. For example, for Medicine and Smokers:
P(Med) = 57/120 = 0.475
P(Med & Smoker) = 0.475 × 0.3167 = 0.1504
E(Med & Smoker) = 0.1504 × 120 = 18.05 students
Completed table of expected frequencies under independence:
                Non-smokers  Smokers  Marginal freqs
Medicine        38.95        18.05    57
Physics         43.05        19.95    63
Marginal freqs  82           38       120
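The expected frequencies can be sketched in a few lines: each cell is P(row)∙P(column)∙N, which simplifies to (row total × column total) / N. The function name is hypothetical; the counts are the observed survey results.

```python
def expected_frequencies(table):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: Medicine, Physics. Columns: Non-smokers, Smokers.
observed = [[42, 15], [40, 23]]
expected = expected_frequencies(observed)

for label, row in zip(("Medicine", "Physics"), expected):
    print(label, [round(value, 2) for value in row])
```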
χ2 Test, contingency tables, & bivariate data
Observed frequencies:
                Non-smokers  Smokers  Marginal freqs
Medicine        42           15       57
Physics         40           23       63
Marginal freqs  82           38       120
$$\chi^2_k = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}$$
Expected frequencies:
                Non-smokers  Smokers    Marginal freqs
Medicine        E = 38.95    E = 18.05  57
Physics         E = 43.05    E = 19.95  63
Marginal freqs  82           38         120
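A minimal sketch of the χ2 calculation for this contingency table, summing (o − e)²/e over the four cells. It also counts degrees of freedom as (rows − 1)∙(columns − 1). Names are illustrative, and no continuity correction is applied (an assumption, since the slides do not mention one).

```python
# Rows: Medicine, Physics. Columns: Non-smokers, Smokers.
observed = [[42, 15], [40, 23]]
expected = [[38.95, 18.05], [43.05, 19.95]]

# Contribution of each cell to the statistic.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]
chi2 = sum(sum(row) for row in contributions)

n_rows, n_cols = len(observed), len(observed[0])
df = (n_rows - 1) * (n_cols - 1)

print(f"chi2 = {chi2:.3f}, df = {df}")
```

Printing `contributions` shows which cells drive the statistic; here the two Smokers cells contribute the most.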
Pause now and answer: How many degrees of freedom are there?
k = (r – 1) ∙ (c – 1); r = rows, c = columns
k = (2 – 1) ∙ (2 – 1) = 1 ∙ 1 = 1
1 degree of freedom
From the table of critical values above: at df = 1 and α = 0.05, the critical value is 3.841.
Categories must be mutually exclusive: in the previous example, the test does not allow for a student to be both a Med student and a Physics student.
Expected values for any one category should not be fewer than 5 and never fewer than 1; otherwise the results are unreliable.
It is recommended that χ2 not be used when the total number of observations is fewer than 50 (more is better).
                Non-smokers  Smokers  Marginal freqs
Medicine        10           2        12
Physics         29           6        35
Marginal freqs  39           8        47
Fisher’s Exact Test – when sample sizes are small
The test provides an exact p-value for a test of association.
Better than χ2 where the expected frequencies are too low to meet the
rules demanded by the χ2 approximation.
Fisher’s Exact Test is useful when values in any cell are < 5.
             Variable A1  Variable A2
Variable B1  8            9
Variable B2  5            2
Fisher's exact test was developed for a 2 x 2 contingency table with fixed
row and column totals, but it can be expanded to larger tables.
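A sketch of how an exact p-value can be computed for the 2 x 2 table above. With the row and column totals fixed, the top-left cell follows a hypergeometric distribution. The two-sided p-value here sums the probabilities of all tables that are no more probable than the observed one, which is one common convention and an assumption on my part, since the slides do not specify one. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test p-value for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    row1, col1, n = a + b, a + c, a + b + c + d
    col2 = n - col1

    def weight(x):
        # Number of ways to get x in the top-left cell with all margins fixed.
        return comb(col1, x) * comb(col2, row1 - x)

    lowest = max(0, row1 - col2)
    highest = min(row1, col1)
    observed_weight = weight(a)
    # Sum every possible table that is no more probable than the observed one.
    extreme = sum(
        weight(x) for x in range(lowest, highest + 1)
        if weight(x) <= observed_weight
    )
    return extreme / comb(n, row1)


# Rows: Variable B1, Variable B2. Columns: Variable A1, Variable A2.
table = [[8, 9], [5, 2]]
p_value = fisher_exact_two_sided(table)
print(f"Fisher's exact p-value: {p_value:.4f}")
```

Integer counts are compared rather than floating-point probabilities, so ties between equally probable tables are handled exactly.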
- Pause now and formulate this problem in the context of the hypothesis testing workflow.
Hypothesis testing – workflow
1. Define the Null Hypothesis and the Alternative Hypothesis.
H0 – smoking and field of study are independent (not related). The probability of being a smoker is
the same regardless of the field of study.
H1 – smoking and field of study are associated (not independent).
2. Collect data.
Collect observations (run the questionnaire). Build the contingency table and calculate the expected values
under the null model (the probabilities are independent).
3. Calculate a test statistic that compares the observed data to what we would expect if the null
hypothesis were true.
Calculate the χ2 statistic and the number of degrees of freedom:
χ2 ≈ 1.437
k=1
4. Define a threshold (alpha value; α) to determine whether a test statistic is statistically significant,
i.e., to determine whether to reject the null hypothesis based on a given p-value.
Let's choose the commonly used α = 0.05 threshold.
5. If the probability (p-value) of obtaining a test statistic at least this extreme is below this threshold, we reject the null
hypothesis; otherwise we do not reject it.
The calculated p-value is 0.231, which is larger than the α threshold of 0.05.
We therefore retain (i.e., fail to reject) our null hypothesis, H0.
H0 – smoking and field of study are independent (not related). The probability of being a
smoker is the same regardless of the field of study.
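Steps 3 to 5 can be sketched as follows. The statistic is recomputed from the observed and expected counts, and the p-value uses a shortcut that is valid only for 1 degree of freedom: P(χ2 ≥ x) = erfc(√(x/2)). For other degrees of freedom a statistics library or the table of critical values would be needed. Variable names are illustrative.

```python
from math import erfc, sqrt

# Cells in the order: Med/Non-smoker, Med/Smoker, Physics/Non-smoker, Physics/Smoker.
observed = [42, 15, 40, 23]
expected = [38.95, 18.05, 43.05, 19.95]
ALPHA = 0.05  # chosen significance threshold

# Step 3: test statistic.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Step 5: p-value for df = 1 only, then the decision.
p_value = erfc(sqrt(chi2 / 2))
reject_h0 = p_value < ALPHA

print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}, reject H0: {reject_h0}")
```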
Hypothesis Testing: field of study & smoking
If we were to reject H0 anyway (thus accepting the H1 alternative – that smoking and
field of study are indeed associated, i.e., not independent), we would be rejecting on data that are
not unusual under H0: when H0 is true, results at least as extreme as ours occur about 23% of the time.
That is, a decision rule that rejects H0 at p = 0.231 has a Type I error rate of about 23%, where the
typically accepted Type I error rate in Biology is 5%.
Summary
• A Type I error is when we reject H0 when it is actually true.
• A Type II error is when we fail to reject H0 when it is actually false.
• The number of degrees of freedom (df) is the total number of categories (or observations) minus the number
of categories (or observations) which we can calculate given the marginal (or total):
• k = n – 1 for univariate data (e.g., the spider example last lecture had 4 columns; 4 – 1 = 3 df)
• k = (r – 1) ∙ (c – 1) for bivariate data; r = number of rows, c = number of columns
• The χ2 goodness-of-fit and homogeneity tests are similar in terms of calculations.
• They differ in that a goodness-of-fit test compares against a known distribution, whereas a homogeneity test
checks whether one categorical variable is associated with the other variable (whether they are dependent).
• Both tests get "sketchy" if expected values in any cell are fewer than 5, and really, really sketchy if fewer than 1.
• When this is the case, or number of total observations < 50, use Fisher’s Exact Test!