Topic 4B. Inferential Statistics

International Baccalaureate
MATHEMATICS
Applications and Interpretation SL (and HL)
Lecture Notes
Christos Nikolaidis
TOPIC 4
STATISTICS AND PROBABILITY
4B. Inferential statistics
4.16 INTRODUCTION TO INFERENTIAL STATISTICS
4.17 HYPOTHESIS TEST FOR TWO MEANS μ1, μ2
4.18 CHI-SQUARE TEST FOR INDEPENDENCE
4.19 CHI-SQUARE TEST FOR GOODNESS OF FIT (GOF)
Only for HL
4.20 FURTHER DETAILS FOR THE CHI-SQUARE GOF TEST
4.21 CONFIDENCE INTERVAL FOR THE MEAN μ
4.22 HYPOTHESIS TEST FOR THE MEAN μ
4.23 HYPOTHESIS TEST FOR THE PARAMETER λ OF POISSON
4.24 HYPOTHESIS TEST FOR THE PROPORTION p (BINOMIAL)
4.25 HYPOTHESIS TEST FOR THE CORRELATION COEFFICIENT ρ
4.26 CRITICAL REGION – TYPE I AND TYPE II ERRORS
January 2023
4.16 INTRODUCTION TO INFERENTIAL STATISTICS
Well, I have to confess that this part of the syllabus is not my favorite (to put it gently!).
The idea of inferential Statistics is simple:
we study a small sample
we draw a conclusion for the general population
For example, we find the sample mean x̄ to draw a conclusion for the population mean μ.
The problem is that we use dozens of formulas, terms, tables, graphs and politically correct expressions [1] to draw a conclusion (which is, more or less, evident by simply observing our data!). Worst of all, this conclusion has a certain degree of uncertainty (something not very common in pure Mathematics).
In a University course of Statistics, the students refer to a formula booklet (of many pages) which contains all the appropriate formulas, as it is almost impossible to memorise all this information. In this course, the number of formulas has been reduced to the minimum, as the answers can be obtained by the GDC. However, there are certain concepts that require the use of formulas for a better understanding (but these formulas are not contained in the formula booklet!). I think that the IB must reconsider the balance between the formulas needed and the use of a GDC.
Thus, the following topics will provide more of a “recipe” for each case, rather than a mathematical investigation.
[1] For example, we don’t say “we accept the claim” but “we do not have enough evidence to reject the claim”.
The story is as follows:
There is a claim for a characteristic of the population, for example
the mean weight μ is 75 kg,
the proportion p of women is 45%,
smoking behavior is independent of gender, etc.
We state:
Null hypothesis H0: the claim (affirmative)
Alternative hypothesis H1: the negation of the claim
We investigate a sample of the population against this characteristic. The result usually differs from the claim.
Question: Is the result close enough to the claim?
If NOT, we reject H0
If YES, we do not have enough evidence to reject H0
• But what does “close enough” mean?
This is determined by a so-called significance level a. The significance level is usually
10%      5%       1%
a=0.10   a=0.05   a=0.01
(it is clearly stated in the question)
This is in fact the probability of rejecting H0 while it is true.
For example, a=0.05 means: we reject results far away from the claim H0, in such a way that the probability of making this mistake is 5%.
• How do we draw a conclusion?
There are two ways:
by checking the p-value (against a)
OR by checking a statistic value (against some critical value [2])
Both the p-value and the statistic are obtained from the sample (by GDC).
We reject H0 if
p-value < a
(or, equivalently, statistic value > critical value)
[2] The critical value depends on the significance level a (hopefully it will be given!)
• A CLARIFICATION ABOUT THE STANDARD DEVIATION
(mainly for HL)
For a group of data, the GDC provides two standard deviations denoted by
σx and sx
In fact, there are three standard deviations!
If DATA = the whole population
σx gives σ, the standard deviation of the population
σx² = σ² is the variance of the population
sx means nothing!
If DATA = a sample of the population
σx gives sn, the standard deviation of the sample
σx² = sn² is the variance of the sample
sx gives sn-1
sn-1² is an unbiased estimate of σ²
sn-1 is the corresponding standard deviation
Let’s try to explain: For some mathematical reason, while
x̄ is an unbiased estimate of the population mean μ,
sn² is not a good estimate of the population variance σ² (it is biased).
A slight modification gives sn-1² and corrects this bias:
$s_{n-1}^2 = \frac{n}{n-1}\, s_n^2$
We say that sn-1² is an unbiased estimate of the population variance σ².
For example, if our sample has size n=6 and sn ≈ 1.71, then
$s_{n-1}^2 = \frac{6}{5}\, s_n^2 = \frac{6}{5}\,(1.71)^2 = 3.50892$
$s_{n-1} \approx 1.87$
In inferential statistics, what we need to draw conclusions for the whole population is sn-1. Hopefully, in IB exams, if they don’t give us the full sample, they will, at least, give us sn-1.
EXAMPLE 1
For the following data
1, 2, 3, 4, 5, 6
the GDC gives
σx ≈ 1.71
sx ≈ 1.87
So, what is the standard deviation for our data?
There are two different situations: Population vs Sample.
• When we throw a die, the possible results are
1, 2, 3, 4, 5, 6 [population]
then
σ ≈ 1.71 (standard deviation of the population)
σ² ≈ 1.71² (variance of the population)
• When we throw a die one million times, we obtain a large population of numbers from 1 to 6. Suppose that a random sample of six numbers consists of [3]
1, 2, 3, 4, 5, 6 [sample]
then
sn ≈ 1.71 (standard deviation of the sample)
sn² ≈ 1.71² (variance of the sample)
But if we need an unbiased estimate for the variance of the population we use
sn-1 ≈ 1.87
sn-1² ≈ 1.87² (unbiased estimate of the variance σ²)
[3] Well, this sample doesn’t look very random, but it serves our purpose, to compare with the first situation. One would expect something more random like “2, 3, 6, 2, 1, 4” which has a smaller standard deviation sn, so that the correction would provide sn-1, a more reliable estimate for the population.
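The two GDC values of Example 1 can be reproduced with a few lines of Python. This is only a sketch for checking the numbers; the variable names are illustrative and the GDC remains the tool expected in the exam. It assumes that `statistics.pstdev` plays the role of σx (division by n) and `statistics.stdev` the role of sx (division by n-1).

```python
import statistics

data = [1, 2, 3, 4, 5, 6]

# sigma_x on the GDC: divide by n (sigma for a population, s_n for a sample)
s_n = statistics.pstdev(data)

# s_x on the GDC: divide by n-1 (s_{n-1}, used to estimate the population)
s_n_minus_1 = statistics.stdev(data)

# The correction s_{n-1}^2 = n/(n-1) * s_n^2 links the two values
n = len(data)
corrected_variance = n / (n - 1) * s_n ** 2

print(round(s_n, 2), round(s_n_minus_1, 2), round(corrected_variance, 2))
```

The corrected variance equals the square of sn-1, which is exactly the relation stated in the clarification above.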
4.17 HYPOTHESIS TEST FOR TWO MEANS μ1 and μ2 (t-test)
They give us some observed data:
either two samples of data (we enter the data in GDC in LIST 1 and LIST 2)
or only the corresponding statistics
sample means: x̄1, x̄2
standard deviations: sx1, sx2
sizes of the samples: n1, n2
The sample means x̄1 and x̄2 may be different, but we test if they are close enough or not, to draw a conclusion for the population means μ1 and μ2. Assuming that the distribution of each population is Normal, we perform the so-called t-test:
CLAIM: for the population means μ1 and μ2
μ1=μ2 against μ1≠μ2 (2-tailed test)
μ1=μ2 against μ1>μ2 or μ1<μ2 (1-tailed test)
We state
[null hypothesis] Ho: μ1=μ2
[alternative hypothesis] H1: μ1≠μ2 or μ1>μ2 or μ1<μ2
We use GDC
Statistics - TEST – t – (2 samples)
if List: the statistics x̄, sx, n are automatically entered
if Var: we enter x̄, sx, n ourselves
Pooled: ON (always)
Execute gives
p-value
Conclusion
IF p-value < a THEN we reject Ho
EXAMPLE 1
To compare the mean weights between two populations A and B
we obtain two samples
Sample A 65 70 74 69 57 64 61 78 83 80
Sample B 71 65 68 59 70 65 55 52
(GDC gives that the two sample means are x1 =70.1 and x2 =63.1)
We will test two different claims for the population means μ1, μ2
with a = 0.05
(a) Ann claims that μ1> μ2
(b) Bill claims that population means are different
Solution
(a) We perform a 1-tailed t-test
Ho: μ1 = μ2
H1: μ1 > μ2
GDC gives p-value = 0.041
Since p-value < 0.05
we reject Ho. That is, we accept Ann’s claim that μ1 > μ2
(b) We perform a 2-tailed t-test
Ho: μ1 = μ2
H1: μ1 ≠ μ2
GDC gives a p-value of about 0.08 (twice the 1-tailed value)
Since p-value > 0.05
we do not have enough evidence to reject Ho. We cannot support Bill’s claim!
NOTICE.
• If the significance level was a = 0.10, the null hypothesis Ho would be rejected in both cases
• If the significance level was a = 0.01, the null hypothesis Ho would not be rejected in either case
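For readers who want to see what the GDC does behind "t – (2 samples)" with Pooled: ON, here is a minimal sketch in Python. The helper `t_right_tail` is not a library function; it is a hypothetical name for a small numerical integration (Simpson's rule) of the t density, written here only so that the example is self-contained.

```python
import math
import statistics

sample_a = [65, 70, 74, 69, 57, 64, 61, 78, 83, 80]
sample_b = [71, 65, 68, 59, 70, 65, 55, 52]

def t_right_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, by Simpson integration of the t density on [0, t]."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = t / steps
    total = pdf(0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3

n1, n2 = len(sample_a), len(sample_b)
mean1, mean2 = statistics.mean(sample_a), statistics.mean(sample_b)
var1, var2 = statistics.variance(sample_a), statistics.variance(sample_b)  # s_{n-1}^2

# Pooled variance and test statistic (this is what "Pooled: ON" means)
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
t_stat = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

p_one_tailed = t_right_tail(t_stat, df)   # H1: mu1 > mu2
p_two_tailed = 2 * p_one_tailed           # H1: mu1 != mu2

print(round(t_stat, 2), round(p_one_tailed, 3), round(p_two_tailed, 3))
```

The 2-tailed p-value is simply twice the 1-tailed one, which is why Ann's claim passes at a = 0.05 while Bill's does not.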
EXAMPLE 2.
The same example, if they give us only the statistics:
Sample A: x̄1 = 70.1, sx1 = 8.57, n1 = 10
Sample B: x̄2 = 63.1, sx2 = 7.04, n2 = 8
Solution
We use Data: Var instead of Data: List to enter the statistics.
The results are as in Example 1.
4.18 χ² TEST FOR INDEPENDENCE
They give us a table of observed frequencies, for example
          Tennis   Volley   Basketball
Male        *        *          *
Female      *        *          *
CLAIM: the two criteria (gender, favorite sport) are independent
We state
[null hypothesis] Ho: the two criteria are independent
[alternative hypothesis] H1: the two criteria are not independent
The two criteria are considered independent if the observed frequencies above are close enough to the so-called expected frequencies:
          Tennis   Volley   Basketball
Male        #        #          #
Female      #        #          #
We use GDC
Statistics - TEST – CHI – 2WAY
We enter the observed frequencies in Matrix A (use F2 ►MAT)
The expected frequencies are automatically entered in Matrix B
Execute gives
χ²statistic and p-value
Conclusion
IF p-value < a (or χ²statistic > χ²critical) THEN we reject Ho
Notice: χ²critical will be given in the question, if necessary.
The GDC also gives the so-called degrees of freedom of the problem, which play a role in the determination of χ²critical.
NOTICE
• For an m×n matrix
degrees of freedom = (m-1)×(n-1)
• The expected frequencies (in Matrix B) are calculated as follows
Observed frequencies: we also find the totals
          Tennis      Volley    Basketball   total
Male      fobserved     -           -        row1
Female       -          -           -        row2
total     column1    column2     column3     TOTAL
Expected frequencies (keep only the totals)
          Tennis      Volley    Basketball   total
Male      fexpected     -           -        row1
Female       -          -           -        row2
total     column1    column2     column3     TOTAL
Then we complete the table. For the first entry:
$f_{\text{expected}} = \dfrac{(\text{column}_1)\times(\text{row}_1)}{\text{TOTAL}}$
Similarly for each entry.
• In fact, the test statistic χ²statistic is the sum, over all entries, of
$\dfrac{(f_{\text{observed}} - f_{\text{expected}})^2}{f_{\text{expected}}}$
[this calculation will not be asked]
EXAMPLE 1
In a group of 80 people, we asked about their favorite sport.
          Tennis   Volley   Basketball
Male        18       10         8
Female      12       10        22
Test if the favorite sport is independent of the gender.
Use the significance level a=0.05.
Solution
H o: gender and favorite sport are independent
H 1: gender and favorite sport are not independent
GDC gives
χ2statistic = 7.00
p-value = 0.0301
degrees of freedom = 2
Since
p-value < 0.05
we reject Ho, that is gender and favorite sport are not
independent.
Extra details:
• The matrix is 2×3. That is why
degrees of freedom = 1×2 = 2
• The expected frequencies are
          Tennis   Volley   Basketball
Male       13.5       9       13.5
Female     16.5      11       16.5
For the first entry:
$f_{\text{expected}} = \dfrac{(\text{column}_1)\times(\text{row}_1)}{\text{TOTAL}} = \dfrac{30\times 36}{80} = 13.5$
• An alternative way to draw the conclusion is by using the statistic value. It is given that the critical value, corresponding to a=0.05 and d.f. = 2, is χ²critical = 5.99
Since
χ²statistic > χ²critical (7.00 > 5.99)
we reject Ho.
• If the significance level given was a=0.01, then
p-value > 0.01
The conclusion would be: We do not have enough evidence to reject Ho, that is gender and favorite sport may be independent.
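The numbers of Example 1 (expected frequencies, χ² statistic, degrees of freedom, p-value) can be checked with the sketch below. It assumes nothing beyond the formulas of this section; the closed form used for the p-value is valid only for 2 degrees of freedom, which happens to be the case here.

```python
import math

observed = [[18, 10, 8],    # Male:   Tennis, Volley, Basketball
            [12, 10, 22]]   # Female: Tennis, Volley, Basketball

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# f_expected = (row total) * (column total) / TOTAL for every entry
expected = [[r * c / total for c in col_totals] for r in row_totals]

# chi-square statistic: sum of (f_observed - f_expected)^2 / f_expected
chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 2 only, the right-tail chi-square probability is exp(-x/2)
p_value = math.exp(-chi2 / 2)

print(expected, round(chi2, 2), df, round(p_value, 4))
```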
NOTICE
If some fexpected is less than 5, we need to merge two columns (or two rows accordingly) of the original matrix (observed) and repeat the whole process.
For example, for the observed frequencies
          Tennis   Squash   Volley   Basketball
Male        12       6        10         8
Female       7       5        10        22
the expected frequencies are
          Tennis   Squash   Volley   Basketball
Male       8.55     4.95       9       13.5
Female    10.45     6.05      11       16.5
Since 4.95 < 5 we merge the first two columns, so the observed frequencies become
          Tennis & Squash   Volley   Basketball
Male            18            10         8
Female          12            10        22
This table is exactly the same as that of Example 1, thus we proceed in the same way.
4.19 χ² TEST FOR GOODNESS OF FIT
They give us a list of observed frequencies,
f1, f2, … , fn
CLAIM: they follow a given distribution with probabilities
p1, p2, … , pn
We state
[null hypothesis] Ho: data follow the distribution
[alternative hypothesis] H1: data do not follow the distribution
We use GDC
Statistics - TEST – CHI – GOF
We enter the observed frequencies in List 1 and the expected frequencies in List 2
List 1   List 2
f1       N×p1
f2       N×p2
…        …
fn       N×pn
N = sum of the frequencies in List 1
We also enter the degrees of freedom d.f. = n-1
Execute gives
χ²statistic
p-value
Conclusion
IF p-value < a (or χ²statistic > χ²critical) THEN we reject Ho
χ²critical will be given if necessary.
NOTICE
• The distribution can be
Any discrete distribution given by a table as above
Binomial B(n,p) (we find the probabilities)
Normal N(μ,σ²) (we find the probabilities)
• If some fexpected is less than 5, we merge categories appropriately.
For example
Observed   Expected                     Observed   Expected
   13        14.2     is merged into       13        14.2
   10        11.5                          17        15.8
    7         4.3
Notice that the d.f. are also decreased accordingly.
EXAMPLE 1
Philipp claims that the supporters of Football teams A, B, C and D are distributed as follows
A     B     C     D
30%   30%   26%   14%
In a sample of 40 people we found
A     B     C     D
11    13    10    6
We test Philipp’s claim for a = 0.05
Solution
We perform a Goodness of fit Chi-squared test with
Ho: data follow the given distribution (0.30, 0.30, 0.26, 0.14)
H1: data do not follow the given distribution
We enter
observed frequencies in List 1,
expected frequencies in List 2 (multiply the probabilities by 40)
List 1   List 2
11       40×0.30 = 12
13       40×0.30 = 12
10       40×0.26 = 10.4
6        40×0.14 = 5.6
We use GDC: Statistics - TEST – CHI – GOF
We also enter degrees of freedom
d.f.= n-1 = 3
GDC gives
χ2statistic = 0.210
p-value = 0.976
Since
p-value > 0.05
we do not have enough evidence to reject Ho.
We may accept that Philipp’s claim about the distribution of the
people is true.
NOTICE
An alternative way to draw the conclusion for Example 1 is by using the statistic value χ²statistic = 0.210.
It is given that the critical value, corresponding to a=0.05 and d.f. = 3, is χ²critical = 7.81
Since
χ²statistic < χ²critical (0.21 < 7.81)
we do not have enough evidence to reject Ho.
EXAMPLE 2
100 people throw a die 10 times and count the number of sixes:
Number of sixes 0 1 2 3 4-10

Number of people 15 30 30 15 10
We test the claim that the game follows the Binomial distribution
B(10,1/6) with a=0.05
Solution
Ho: data follow Binomial distribution B(10,1/6)
H1: data do not follow Binomial distribution B(10,1/6)
The Binomial distribution B(10,1/6) gives the probabilities
0       1       2       3       4-10
0.162   0.323   0.291   0.155   0.070
We enter
List 1   List 2
15       100×0.162 = 16.2
30       100×0.323 = 32.3
30       100×0.291 = 29.1
15       100×0.155 = 15.5
10       100×0.070 = 7
Use GDC: Statistics - TEST – CHI – GOF
We also enter: d.f. = n-1 = 4
GDC gives
χ²statistic = 1.58 and p-value = 0.812
Since
p-value > 0.05
we do not have enough evidence to reject Ho. We may accept that the distribution is B(10,1/6).
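The whole of Example 2 can be retraced in Python as a check. The sketch below rounds the Binomial probabilities to 3 decimal places, as the notes do, before building List 2; the closed form for the p-value is valid only for 4 degrees of freedom. Function and variable names are illustrative.

```python
import math

observed = [15, 30, 30, 15, 10]   # people with 0, 1, 2, 3, 4-10 sixes
n_throws, p_six = 10, 1 / 6

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

probs = [binom_pmf(k, n_throws, p_six) for k in range(4)]
probs.append(1 - sum(probs))            # the merged class 4-10
probs = [round(q, 3) for q in probs]    # 3 d.p., as in the table of the notes

total = sum(observed)
expected = [total * q for q in probs]   # List 2
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For df = 4 only, the right-tail chi-square probability is exp(-x/2) * (1 + x/2)
p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)

print(probs, round(chi2, 2), round(p_value, 3))
```

Without the rounding step the statistic comes out slightly different, so small discrepancies with a GDC screen are to be expected.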
EXAMPLE 3
It is claimed that the amount of sugar contained in 1-kg packets is actually normally distributed with a mean of μ = 1000g and a standard deviation of σ = 30g. We pick at random 80 packets of sugar and record their weight. The results are shown below
Weight (g)   940-960   960-1000   1000-1040   1040-1060
packets         10        37          28           5
Test the claim with a=0.05
Solution
Ho: data follow Normal distribution N(1000,30²)
H1: data do not follow Normal distribution N(1000,30²)
The Normal distribution N(1000,30²) gives the probabilities
(-∞,960]   [960,1000]   [1000,1040]   [1040,∞)
0.091        0.409         0.409        0.091
We enter
List 1   List 2
10       80×0.091 = 7.28
37       80×0.409 = 32.72
28       80×0.409 = 32.72
5        80×0.091 = 7.28
Use GDC: Statistics - TEST – CHI – GOF
We also enter d.f. = n-1 = 3
GDC gives the χ² statistic and the p-value.
Since
p-value > 0.05
we do not reject Ho. The distribution may be N(1000,30²).
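The GDC output for Example 3 is not reproduced in these notes, so the following sketch recomputes the test from the data given. It assumes the same 3-decimal rounding of the probabilities as in the table, and it uses a closed form for the right-tail χ² probability that holds only for 3 degrees of freedom.

```python
import math
from statistics import NormalDist

observed = [10, 37, 28, 5]
claimed = NormalDist(mu=1000, sigma=30)

# Class probabilities for (-inf,960], [960,1000], [1000,1040], [1040,inf)
cuts = [960, 1000, 1040]
cdf = [0.0] + [claimed.cdf(c) for c in cuts] + [1.0]
probs = [round(hi - lo, 3) for lo, hi in zip(cdf, cdf[1:])]

total = sum(observed)
expected = [total * q for q in probs]   # List 2
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For df = 3 only: right-tail chi-square probability in closed form
p_value = (math.erfc(math.sqrt(chi2 / 2))
           + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2))

print(probs, round(chi2, 2), round(p_value, 3))
```

The p-value is well above 0.05, in agreement with the conclusion of the solution.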
ONLY FOR HL
4.20 FURTHER DETAILS FOR THE CHI-SQUARE GOF TEST
The distribution can also be Poisson
EXAMPLE 1
It is claimed that the following sample
x           0-4   5-7   8    9-10   11-15
frequency    10    35   15    20     20
comes from a population that follows Poisson Po(8)
Solution
Ho: data follow Poisson distribution Po(8)
H1: data do not follow Poisson distribution Po(8)
The Poisson distribution Po(8) gives the probabilities
0-4     5-7     8       9-10    11-∞
0.100   0.353   0.140   0.223   0.184
We enter
List 1   List 2
10       100×0.100 = 10
35       100×0.353 = 35.3
15       100×0.140 = 14
20       100×0.223 = 22.3
20       100×0.184 = 18.4
d.f. = n-1 = 4
GDC gives the χ² statistic and the p-value.
Since
p-value > 0.05
we do not reject Ho. The distribution can be Po(8).
• DEGREES OF FREEDOM
The definition is
d.f. = number of values that have the freedom to vary
In simple cases d.f. = n-1
This is because, in List 2 (expected data), the first n-1 values are free, but the last entry is not (as we know their sum):
List 2
*
*
*
*
N – (sum of the above)
SUM = N
However, in some cases we also have to estimate some extra parameters. Then we reduce d.f. accordingly.
(In the d.f. column below, n is the number of categories.)
Distribution                                              d.f.
Any distribution with n categories                         n-1
Binomial B(n,p)
  if they give us p                                        n-1
  otherwise we consider p = x̄/(number of trials)           n-2
Poisson Po(λ)
  if they give us λ                                        n-1
  if not, we consider λ = x̄                                n-2
Normal
  if they give us μ and σ                                  n-1
  if they don’t give μ, we consider μ = x̄                  n-2
  if they don’t give σ, we consider σ = sn-1               n-2
  if they give neither μ nor σ (so μ = x̄, σ = sn-1)        n-3
Let us revisit some examples we have seen, slightly modified:
EXAMPLE 2
100 people play a game 10 times and count the number of wins:
Number of wins 0 1 2 3 4-10

Number of people 15 30 30 15 10
We test the claim that the game follows the Binomial distribution
B(10,p) with a=0.05
Solution
Since p is not known we have to estimate it:
x      midpoint   frequency
0         0          15
1         1          30
2         2          30
3         3          15
4-10      7          10
GDC gives x̄ = 2.05
Since np = x̄, we get p = x̄/n = 2.05/10 = 0.205
We test the claim:
Ho: data follow Binomial distribution B(10,0.205)
H1: data do not follow Binomial distribution B(10,0.205)
The Binomial distribution B(10,0.205) gives the probabilities
0       1       2       3       4-10
0.101   0.260   0.302   0.207   0.130
We enter the observed frequencies in List 1 and the expected frequencies (100 × probability) in List 2.
We also enter: d.f. = n-1-1 = 3
Since p-value > 0.05, we do not have enough evidence to reject Ho.
We may accept that the distribution is B(10,0.205).
EXAMPLE 3
It is claimed that the amount of sugar contained in 1-kg packets is actually normally distributed with μ=1000. We pick at random 80 packets and record their weight. The results are shown below
Weight (g)   940-960   960-1000   1000-1040   1040-1060
packets         10        37          28           5
Test the claim with a=0.05
Solution
Since σ is not given we have to estimate it:
class        midpoint   frequency
940-960        950         10
960-1000       980         37
1000-1040     1020         28
1040-1060     1050          5
GDC gives sn-1 = 27.8 (also x̄ = 995, but it is not needed here).
We test the claim:
Ho: data follow Normal distribution N(1000,27.8²)
H1: data do not follow Normal distribution N(1000,27.8²)
The Normal distribution N(1000,27.8²) gives the probabilities
< 960    960-1000   1000-1040   > 1040
0.075     0.425       0.425      0.075
We enter the observed frequencies in List 1 and the expected frequencies (80 × probability) in List 2.
We also enter degrees of freedom d.f. = n-1-1 = 2
Since
p-value > 0.05
we do not reject Ho. The distribution may be N(1000,27.8²).
EXAMPLE 4
It is claimed that the following sample
x 0-4 5-7 8 9-10 11-15
frequency 10 35 15 20 20
comes from a population that follows Poisson.

Solution
Since λ is not given we have to estimate it:
x       midpoint   frequency
0-4        2          10
5-7        6          35
8          8          15
9-10       9.5        20
11-15     13          20
GDC gives x̄ = 8, thus we consider λ=8
We test the claim:
Ho: data follow Poisson distribution Po(8)
H1: data do not follow Poisson distribution Po(8)
The Poisson distribution Po(8) gives the probabilities
0-4     5-7     8       9-10    11-∞
0.100   0.353   0.140   0.223   0.184
We enter the observed frequencies in List 1 and the expected frequencies (100 × probability) in List 2.
But now we enter d.f. = n-1-1 = 3
GDC gives the χ² statistic and the p-value.
Since
p-value > 0.05
we do not reject Ho. The distribution can be Po(8).
4.21 CONFIDENCE INTERVAL FOR THE MEAN μ
They give us
either a sample of data (we enter the data in GDC in LIST 1)
or only the corresponding statistics
sample mean: x̄
standard deviation: sx (this is sn-1)
size of the sample: n
We are looking for a confidence interval for the population mean μ
We use GDC:
Statistics – INTR – (Z or t) – (1 sample)
Z: if we know σ. We have to enter σ ourselves.
t: if we don’t know σ. We use sx (instead of σ).
if List: the statistics x̄, sx, n are already there
if Var: we enter σ or sx, x̄, n ourselves
Execute gives
Lower
Upper
Conclusion
The confidence interval for the population mean μ, at the given confidence level (e.g. 95%), is
[Lower , Upper]
If they ask us to interpret the confidence level, e.g. 95%:
We are 95% confident that the interval contains the population mean μ
or otherwise
If we draw a sample 100 times and construct the interval each time, we expect that about 95 of these intervals will contain the population mean μ
EXAMPLE 1
For a sample of n=40 data, we know that x̄=23, sn-1=3. We also know that the standard deviation of the population is σ=2.8.
Find a 90% confidence interval for the mean μ of the population.
Solution
Since we know σ, we use the Z distribution (sn-1=3 was not necessary).
Use GDC: Statistics – INTR – Z – 1 SAMPLE – Data: Variable
The confidence interval is [22.3, 23.7]
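The interval of Example 1 is x̄ ± z·σ/√n, where z is the standard Normal value that leaves 5% in each tail for a 90% level. A minimal sketch of the computation, with illustrative variable names:

```python
import math
from statistics import NormalDist

n, sample_mean, sigma = 40, 23, 2.8
confidence = 0.90

# z leaves (1 - confidence)/2 in each tail of N(0,1)
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * sigma / math.sqrt(n)

lower, upper = sample_mean - margin, sample_mean + margin
print(round(lower, 1), round(upper, 1))
```

For the t interval (σ unknown) the same pattern applies with a t critical value in place of z; the Python standard library has no inverse t function, so that case is left to the GDC.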
EXAMPLE 2
For a sample of n=20 data, we know that x̄=23, sn-1=3.
Find a 90% confidence interval for the mean μ of the population.
Solution
Since we don’t know σ we use the t distribution.
Use GDC: Statistics – INTR – t – 1 SAMPLE – Data: Variable
EXAMPLE 3
For a sample of n=20 data, we know that x̄=23, sn=7.
(a) Find an unbiased estimate for the variance σ² of the population and hence sn-1
(b) Find a 90% confidence interval for the mean μ
Solution
(a) $s_{n-1}^2 = \frac{n}{n-1}\, s_n^2 = \frac{20}{19}\cdot 7^2 = 51.5789\ldots \approx 51.6$ and $s_{n-1} \approx 7.18$
(b) GDC: Statistics – INTR – t – 1 SAMPLE – Data: Variable
EXAMPLE 4
Consider the following sample of size n=12
16 17 17 15 20 16 18 15 21 18 17 16
(a) Find, the mean x , and the value of sn-1
(b) Find a 90% confidence interval for the mean μ
(c) Explain the meaning of the result (b)
Solution
Since we don’t know σ we use t distribution.
We enter data in LIST 1
Use GDC: Statistics – INTR – t – 1 SAMPLE – Data: List
(make sure that freq = 1)
Execute gives everything we need
(a) x̄ = 17.2, sn-1 = 1.85
(b) The confidence interval is [16.2, 18.1]
(c) We are 90% confident that the interval contains the population mean μ
or otherwise
If we draw a sample 100 times and construct the interval each time, we expect that about 90 of these intervals will contain the population mean μ
4.22 HYPOTHESIS TEST FOR THE MEAN μ
They give us
either a sample of data (we enter the data in GDC in LIST 1)
or only the corresponding statistics
sample mean: x̄
standard deviation: sx
size of the sample: n
CLAIM: for the population mean μ
μ=μ0 against μ≠μ0 (2-tailed test)
μ=μ0 against μ>μ0 or μ<μ0 (1-tailed test)
We state
[null hypothesis] Ho: μ=μ0
[alternative hypothesis] H1: μ≠μ0 or μ>μ0 or μ<μ0
We use GDC:
Statistics - TEST – (Z or t) – (1 sample)
Z: if we know σ. We have to enter σ ourselves.
t: if we don’t know σ. We use sx (instead of σ).
if List: the statistics x̄, sx, n are already there
if Var: we enter σ or sx, x̄, n ourselves
Execute gives
Zstatistic = z (or tstatistic = t)
p-value
Conclusion
IF p-value < a THEN we reject Ho
(or, in absolute values, |Zstatistic| > Zcritical, respectively |tstatistic| > tcritical, with the statistic on the side indicated by H1)
The critical value is obtained by GDC:
Zcritical: InvN with N(0,1), Tail: right
tcritical: Invt (t distribution), Tail: right
Area = a for a 1-tailed test
Area = a/2 for a 2-tailed test
• PAIRED DATA
Mind the difference between
unpaired data (two samples)
paired data (one sample at different times)
In case of paired samples, e.g. the values of the same objects at different times, we test if the mean increases or decreases. For example,
value in month 1   x1   x2   x3  …
value in month 2   y1   y2   y3  …
We do not use (2 sample), but
we find the differences y1 – x1, y2 – x2, y3 – x3, …
and treat the last row as (1 sample).
We test the mean μ of the differences.
μ=0 indicates that the mean remains the same
μ>0 indicates that the mean increases
μ<0 indicates that the mean decreases
EXAMPLE 1
For a sample of n=40 data, we know that x=23, sn-1 =3. We also
know that the standard deviation of the population is σ=2.8.
There is a CLAIM that μ = 24.
(a) Perform a two-tail test for this claim with a=0.05
(b) Perform a one-tail test for this claim (against μ<24)
Use the significance level a=0.05
Solution
Since we know σ, we use Z-test ( sn-1 =3 was not necessary).
Use GDC: Statistics – TEST – Z – 1 SAMPLE – Data: Variable
(a) H0: μ=24
H1: μ≠24
p-value = 0.0239
Since p-value < 0.05, we reject H0
Hence μ≠24
(b) H0: μ=24
H1: μ<24
p-value = 0.0119
Since p-value < 0.05, we reject H0
Hence μ<24
NOTICE
We can also use the statistic value (against the critical value)
(a) zstatistic = -2.26
(b) zstatistic = -2.26
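The z statistic and both p-values of Example 1 follow directly from z = (x̄ – μ0)/(σ/√n). A short sketch, with illustrative names, that reproduces them:

```python
import math
from statistics import NormalDist

n, sample_mean, sigma, mu0 = 40, 23, 2.8, 24

# Test statistic: the same value for the 1-tailed and the 2-tailed test
z_stat = (sample_mean - mu0) / (sigma / math.sqrt(n))

standard_normal = NormalDist()
p_left = standard_normal.cdf(z_stat)                  # H1: mu < 24
p_two_tailed = 2 * standard_normal.cdf(-abs(z_stat))  # H1: mu != 24

print(round(z_stat, 2), round(p_left, 4), round(p_two_tailed, 4))
```

Only the p-value changes between parts (a) and (b); the statistic itself does not depend on the form of H1.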
EXAMPLE 2
For a sample of n=20 data, we know that x=23, sn-1 =3.
There is a CLAIM that μ = 24.
(a) Perform a two-tail test for this claim with a=0.05
(b) Perform a one-tail test for this claim (against μ<24)
Use the significance level a=0.05
Solution
Since we don’t know σ, we use t-test.
Use GDC: Statistics – TEST – t – 1 SAMPLE – Data: Variable
(a) H0: μ=24
H1: μ≠24
p-value = 0.152
Since p-value > 0.05, we do not have enough evidence to reject H0
(b) H0: μ=24
H1: μ<24
p-value = 0.0762
Since p-value > 0.05, we do not have enough evidence to reject H0
NOTICE
We can also use the statistic value (against the critical value)
(a) tstatistic = -1.49
(b) tstatistic = -1.49
EXAMPLE 3
Consider the following sample of size n=12
16 17 17 15 20 16 18 15 21 18 17 16
We can easily find (by GDC) that x = 17.2
However, there is a CLAIM that μ = 18.
Can we support this claim, against μ ≠ 18 with a=0.05?
Solution
Since we don’t know σ we use t-test.
Use GDC: Statistics – TEST – t – 1 SAMPLE – Data: List
H0: μ=18
H1: μ≠18
p-value = 0.147
Since p-value > 0.05, we do not have enough evidence to reject H0.
Thus, we cannot reject the claim that μ = 18.
NOTICE
We can also use the statistic value (against the critical value)
tstatistic = -1.56
Finally, let’s see an example with paired data. Although it seems that we have two samples, we have in fact one sample, at two different moments.
EXAMPLE 4
We have the following measurements for 12 objects in two different months.
In January
16 17 17 15 20 16 18 15 21 18 17 16
In March
18 17 19 17 18 16 17 16 21 21 18 17
Can we support the CLAIM that the mean has increased?
Use a=0.05
Solution
We have paired data, hence we find the differences and treat them
as 1 sample.
2 0 2 2 -2 0 -1 1 0 3 1 1
Use GDC: Statistics – TEST – t – 1 SAMPLE – Data: List
H0: μd = 0 (μd is the mean of the differences March – January)
H1: μd > 0
p-value = 0.0475
Since p-value < 0.05, we reject H0.
Thus, we can support that the mean has increased.
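The paired test of Example 4 reduces to a 1-sample t-test on the differences. The sketch below computes the differences and the t statistic; since the Python standard library has no t distribution, it compares the statistic with the critical value 1.796, which is the usual table value for 11 degrees of freedom and a right tail of 0.05 (an assumption brought in here, not a value given in the notes).

```python
import math
import statistics

january = [16, 17, 17, 15, 20, 16, 18, 15, 21, 18, 17, 16]
march = [18, 17, 19, 17, 18, 16, 17, 16, 21, 21, 18, 17]

# Differences March - January: a positive mean indicates an increase
diffs = [m - j for j, m in zip(january, march)]

n = len(diffs)
mean_d = statistics.mean(diffs)
s_d = statistics.stdev(diffs)          # s_{n-1} of the differences

# 1-sample t statistic for H0: mean difference = 0
t_stat = mean_d / (s_d / math.sqrt(n))

T_CRITICAL = 1.796   # assumed table value: df = 11, right tail 0.05
reject_h0 = t_stat > T_CRITICAL

print(diffs, mean_d, round(t_stat, 2), reject_h0)
```

The statistic lies only slightly beyond the critical value, which matches a p-value just under 0.05.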
4.23 HYPOTHESIS TEST FOR THE PARAMETER λ OF POISSON
Let
X = number of incidents in a certain period
e.g. X = accidents per day.
They give us a sample of incidents for n different days
either x1, x2, …, xn (we enter the data in GDC in LIST 1)
or only the statistics
sample mean: x̄
size of the sample: n
We know that the total Σxi over the n days follows Poisson Po(λ), where λ refers to the whole period of n days
CLAIM: for the parameter λ
λ=λ0 against λ>λ0 or λ<λ0 (only 1-tailed test)
We state
[null hypothesis] Ho: λ=λ0
[alternative hypothesis] H1: λ>λ0 or λ<λ0
We use GDC
Statistics – DIST – Poisson Po(λ0)
The statistic is
$x_{\text{statistic}} = n\bar{x} = \sum x_i$
if H1: λ>λ0   p-value = P(X ≥ xstatistic)
if H1: λ<λ0   p-value = P(X ≤ xstatistic)
Conclusion
IF p-value < a THEN we reject Ho
EXAMPLE 1
The number of accidents per day in a certain area follows Poisson.
We record the number of accidents in 5 different days
Day 1 Day 2 Day 3 Day 4 Day 5

7 8 8 6 10
(a) Test if the total number of accidents follows Poisson Po(35)
(b) Test if the total number of accidents follows Poisson Po(50)
For both questions use a=0.10
Solution
For both questions, the statistic is Σxi = 39
(a) Ho: λ=35
H1: λ>35
We consider Po(35) and
p-value = P(X≥39) = 0.271
As p-value > 0.1 we don’t have enough evidence to reject Ho.
(we may accept the claim λ=35)
(b) Ho: λ=50
H1: λ<50
We consider Po(50) and
p-value = P(X≤39) = 0.065
Since p-value < 0.1, we reject Ho.
(we may accept that λ<50)
NOTICE
• For (a), since 5×7=35, the claim corresponds to 7 accidents per day. We could also state: Ho: λ=7, H1: λ>7. But still, for the p-value we consider Po(35).
• They could give us only n=5 and x̄=7.8, instead of the complete table. The statistic is Σxi = n x̄ = 39.
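Both p-values of this example are cumulative Poisson probabilities, which the GDC obtains with Pcd. A minimal sketch follows; `poisson_cdf` is a hypothetical helper written for this illustration, not a library call.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam), k >= 0, by summing the probabilities."""
    term = total = math.exp(-lam)   # P(X = 0)
    for i in range(1, k + 1):
        term *= lam / i             # P(X = i) from P(X = i-1)
        total += term
    return total

accidents = [7, 8, 8, 6, 10]
x_stat = sum(accidents)             # statistic: total over the 5 days

# (a) H1: lambda > 35, so p-value = P(X >= 39) under Po(35)
p_a = 1 - poisson_cdf(x_stat - 1, 35)

# (b) H1: lambda < 50, so p-value = P(X <= 39) under Po(50)
p_b = poisson_cdf(x_stat, 50)

print(x_stat, round(p_a, 3), round(p_b, 3))
```

Note the "– 1" in part (a): P(X ≥ 39) = 1 – P(X ≤ 38), a common source of off-by-one mistakes.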
4.24 HYPOTHESIS TEST FOR THE PROPORTION p (BINOMIAL)
Here we investigate the proportion p (or percentage) of a particular group in the population.
They give us some observed data (sample), only the statistics:
the size of the sample: n
the size of the group in the sample: x
The observed proportion of the group is x/n
CLAIM: For the proportion p of the group in the population
p=p0 against p>p0 or p<p0 (only 1-tailed test)
We state
[null hypothesis] Ho: p=p0
[alternative hypothesis] H1: p>p0 or p<p0
We use GDC
Statistics – DIST – Binomial B(n,p0)
The statistic is
xstatistic = x (size of the group)
if H1: p>p0   p-value = P(X ≥ xstatistic)
if H1: p<p0   p-value = P(X ≤ xstatistic)
Conclusion
IF p-value < a THEN we reject Ho
EXAMPLE 1
In a sample of 200 people there are 50 smokers. That is
n=200
x=50
observed proportion = x/n = 50/200 = 0.25
Test the following two claims
(a) the proportion of smokers in the population is 0.30 (i.e. 30%)
(b) the proportion of smokers in the population is 0.20 (i.e. 20%)
Use a=0.05
Solution
For both questions, the statistic is x=50
(a) Ho: p=0.30
H1: p<0.30
We consider B(200, 0.30) and
p-value = P(X≤50) = 0.0695
Since p-value > 0.05 we do not have enough evidence to reject Ho
[we may accept that the proportion is 0.30]
(b) Ho: p=0.20
H1: p>0.20
We consider B(200, 0.20) and
p-value = P(X≥50) = 0.049
Since p-value < 0.05 (only just!) we reject Ho
[we accept that the proportion is greater than 0.20]
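The two p-values are cumulative Binomial probabilities (Bcd on the GDC). The following sketch recomputes them; `binom_cdf` is an illustrative helper, not a library function.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k + 1))

n, x_stat = 200, 50

# (a) H1: p < 0.30, so p-value = P(X <= 50) under B(200, 0.30)
p_a = binom_cdf(x_stat, n, 0.30)

# (b) H1: p > 0.20, so p-value = P(X >= 50) = 1 - P(X <= 49) under B(200, 0.20)
p_b = 1 - binom_cdf(x_stat - 1, n, 0.20)

print(round(p_a, 4), round(p_b, 4))
```

The same observed value x = 50 is compatible with p = 0.30 but (narrowly) not with p = 0.20 at the 5% level.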
4.25 HYPOTHESIS TEST FOR THE CORRELATION COEFFICIENT ρ

Here we find the correlation coefficient r between two
characteristics in a sample and test if there is a correlation
between these characteristics in the whole population.
We apply t-test.
They give us some observed data (sample):
a bivariate set of data (we enter the data in GDC in LIST 1 and LIST 2)
CLAIM: for the population correlation coefficient ρ
ρ=0 against ρ≠0 (2-tailed test)
ρ=0 against ρ>0 or ρ<0 (1-tailed test)
We state
[null hypothesis] Ho: ρ=0 [there is no correlation]
[alternative hypothesis] H1: ρ≠0 or ρ>0 or ρ<0
We use GDC
Statistics – TEST – t – REG
Execute gives
p-value
Conclusion
IF p-value < a THEN we reject Ho
Interpretation of the result
ρ=0 implies there is no correlation
ρ≠0 implies there is correlation
ρ>0 implies there is positive correlation
ρ<0 implies there is negative correlation
EXAMPLE 1
Consider the following sample of bivariate data
x 5 10 15 20 25 30 35 40 45 50
y 50 52 55 53 57 57 60 55 58 55
(a) Find the equation of the regression line of y on x
(b) Find the correlation coefficient
There are two CLAIMS for the population
(c) Test the CLAIM that there is no correlation
(d) Test the CLAIM that there is a positive correlation
Use a=0.05
Solution
(a) y = 0.131x + 51.6
(b) r = 0.67
For (c) and (d) we use GDC: Statistics – TEST – t – REG
(c) H0: ρ = 0
H1: ρ ≠ 0
p-value = 0.035
Since p-value < 0.05 we reject H0
Hence, we can support that there is correlation.
(d) H0: ρ = 0
H1: ρ > 0
p-value = 0.018
Since p-value < 0.05 we reject H0
Hence, we can support that there is a positive correlation.
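The regression line of y on x, the correlation coefficient r and the t statistic behind the GDC's "t – REG" test can be checked from the raw data. The sketch uses the standard formula t = r·√((n-2)/(1-r²)) with n-2 degrees of freedom; the notes do not state this formula, so take it as background rather than syllabus material.

```python
import math

xs = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
ys = [50, 52, 55, 53, 57, 57, 60, 55, 58, 55]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Regression line of y on x: y = slope * x + intercept
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Sample correlation coefficient and the t statistic of the test (df = n - 2)
r = sxy / math.sqrt(sxx * syy)
t_stat = r * math.sqrt((n - 2) / (1 - r * r))

print(round(slope, 3), round(intercept, 1), round(r, 2), round(t_stat, 2))
```

Dividing sxy by syy instead of sxx would give the regression line of x on y, so it is worth checking which line the GDC screen is showing.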
4.26 CRITICAL REGION – TYPE I AND TYPE II ERRORS
For a specific hypothesis H0, the set of values for which we reject H0 is called the critical region. It has nothing to do with the values observed in the sample. It depends on the test itself and on the significance level a.
We define the critical region only for 3 of the tests we have seen.
• For the mean μ of the population [Z-test only]
• For the parameter λ of Poisson [Poisson]
• For a proportion p [Binomial]
For the population mean μ (Z-test)
We use Normal with mean μ0 and standard deviation σ/√n
CASES OF H1   CRITICAL REGION (by InvN)
μ<μ0          X<r           (left tail, area a)
μ>μ0          X>r           (right tail, area a)
μ≠μ0          X<r or X>s    (two tails, area a/2 each)
We will call the remaining region non-critical.
For the Poisson parameter λ
we use Po(λ0)
CASES OF H1   CRITICAL REGION (by Pcd with trial and error)
λ<λ0          X≤r, for the largest r with Pcd(0 to r) < a
λ>λ0          X≥r, for the smallest r with Pcd(r to +∞) < a
For the Binomial proportion p
we use B(n,p0)
CASES OF H1   CRITICAL REGION (by Bcd with trial and error)
p<p0          X≤r, for the largest r with Bcd(0 to r) < a
p>p0          X≥r, for the smallest r with Bcd(r to n) < a
NOTICE
In the last two cases, Poisson and Binomial, the actual significance level is not exactly a, but P(critical region), i.e. what Pcd or Bcd gives.
• TYPE I AND TYPE II ERRORS
There are two types of error we can make in a hypothesis test
TYPE I ERROR: we reject H0 although it is true
TYPE II ERROR: we accept (do not reject) H0 although it is false
Below, “the TYPE I (or TYPE II) error” refers to the probability of making that error.
Methodology: we find first
the CRITICAL REGION
the NON-CRITICAL REGION [4]
TYPE I ERROR
                  For μ               For λ     For p
we use            μ0                  λ0        p0
distribution      N(μ0, (σ/√n)²)      Po(λ0)    B(n,p0)
to find Prob(CRITICAL REGION)
In fact, this is the significance level a
For the TYPE II error, they give us a new μ1, λ1, p1
TYPE II ERROR
                  For μ               For λ     For p
we use            μ1                  λ1        p1
distribution      N(μ1, (σ/√n)²)      Po(λ1)    B(n,p1)
to find Prob(NON-CRITICAL REGION)
[4] or ACCEPTANCE REGION
EXAMPLE 1
For a sample of n=40 data, we know that x̄=23. We also know
that the standard deviation of the population is σ=2.8.
For the test
H0: μ=24
H1: μ≠24
with a=0.05
(a) Find the critical region and the non-critical region.
(b) State the TYPE I error.
(c) It was finally found that μ = 22.9. Find the TYPE II error.
Solution
(a) We use N(μ0, (σ/√n)²), that is Normal with
mean = 24, st. deviation = 2.8/√40
The critical region consists of the two tails, each of area 0.025.
InvN gives:
CRITICAL REGION: (-∞, 23.1) ∪ (24.9, +∞)
NON-CRITICAL REGION: (23.1, 24.9)
(b) It is a=0.05
(c) Given that μ=22.9, we use Ncd with
mean = 22.9, st. deviation = 2.8/√40
to find P(non-critical region), that is
P(23.1<X<24.9)
The TYPE II error is 0.326 (using the bounds rounded to 23.1 and 24.9)
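
The numbers of Example 1 can be checked with a short Python sketch. It is not the GDC procedure of the notes: inv_cdf stands in for InvN and cdf for Ncd, and the variable names are assumptions made here.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu1, sigma, n, a = 24, 22.9, 2.8, 40, 0.05
se = sigma / sqrt(n)                  # st. deviation of the sample mean

# (a) Two-tailed critical region under H0 (the InvN step).
under_h0 = NormalDist(mu0, se)
r = under_h0.inv_cdf(a / 2)
s = under_h0.inv_cdf(1 - a / 2)

# (b) TYPE I error: probability of the critical region when mu = mu0.
type_1 = under_h0.cdf(r) + (1 - under_h0.cdf(s))

# (c) TYPE II error: probability of the non-critical region when mu = mu1.
under_h1 = NormalDist(mu1, se)
type_2_rounded = under_h1.cdf(24.9) - under_h1.cdf(23.1)   # bounds as in the notes
type_2_unrounded = under_h1.cdf(s) - under_h1.cdf(r)       # bounds kept in full

print(round(r, 1), round(s, 1))
print(round(type_1, 3))
print(round(type_2_rounded, 3), round(type_2_unrounded, 3))
```

With the rounded bounds the sketch reproduces 0.326. If the bounds are kept unrounded (about 23.13 and 24.87), the TYPE II error comes out near 0.300, so the answer is quite sensitive to the rounding in part (a).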
EXAMPLE 2
The number of accidents per day in a certain area follows Poisson.
We record the number of accidents on 5 different days:

Day 1   Day 2   Day 3   Day 4   Day 5
7       8       8       6       10

The statistic is Σxi = 39 (the total for the 5 days, so λ below
is the mean number of accidents in 5 days).
For the test
H0: λ=35
H1: λ>35 with a = 0.10
(a) Find the critical region and the non-critical region.
(b) Find the TYPE I error.
(c) It was finally found that λ = 45. Find the TYPE II error.
Solution
(a) We use Po(35).
For the critical region we look for the smallest r with P(X≥r) < 0.10.
Pcd with trial and error gives r = 44, with P(X≥44) = 0.079
CRITICAL REGION: [44, +∞)
NON-CRITICAL REGION: [0, 43]
(b) It is P(X≥44) = 0.079
(c) We use Po(45).
We find the probability of the non-critical region.
The TYPE II error is P(X≤43) = 0.421
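
A sketch of the same calculation in Python, for readers who want to check the trial and error of Example 2. The helper poisson_cdf is an assumption made here and plays the role of Pcd.

```python
from math import exp


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam); gives 0 when k < 0."""
    term = exp(-lam)              # P(X = 0)
    total = 0.0
    for i in range(k + 1):
        total += term
        term *= lam / (i + 1)     # P(X = i + 1) obtained from P(X = i)
    return total


lam0, lam1, a = 35, 45, 0.10

# (a) Smallest r with P(X >= r) < a under Po(35).
r = 0
while 1 - poisson_cdf(r - 1, lam0) >= a:
    r += 1

# (b) TYPE I error: P(X >= r) under Po(35).
type_1 = 1 - poisson_cdf(r - 1, lam0)

# (c) TYPE II error: P(X <= r - 1) under Po(45).
type_2 = poisson_cdf(r - 1, lam1)

print(r, round(type_1, 3), round(type_2, 3))
```

The loop stops at r = 44, which agrees with the critical region [44, +∞) found above.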
NOTICE
For the test
H0: λ=50
H1: λ<50 with a = 0.10
we use Po(50) and look for the largest r with P(X≤r) < 0.10.
CRITICAL REGION: [0, 40] with Prob = 0.086
NON-CRITICAL REGION: [41, +∞)
EXAMPLE 3
In a sample of 200 people there are 50 smokers. That is
n=200
x=50
(observed proportion p̂ = 50/200 = 0.25)
For the test
H0: p = 0.30
H1: p < 0.30 with a = 0.05
(a) Find the critical region and the non-critical region.
(b) Find the TYPE I error.
(c) It was finally found that p = 0.23. Find the TYPE II error.
Solution
(a) We use B(200, 0.30).
For the critical region we look for the largest r with P(X≤r) < 0.05.
Bcd with trial and error gives r = 48, with P(X≤48) = 0.036
CRITICAL REGION: [0, 48]
NON-CRITICAL REGION: [49, 200]
(b) It is P(X≤48) = 0.036
(c) We use B(200, 0.23).
We find the probability of the non-critical region.
The TYPE II error is P(X≥49) = 0.333
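
Example 3 can be checked in the same way. The sketch below sums the Binomial probabilities directly; the helper binom_cdf is an assumption made here and plays the role of Bcd.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p); gives 0 when k < 0."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n, p0, p1, a = 200, 0.30, 0.23, 0.05

# (a) Largest r with P(X <= r) < a under B(200, 0.30).
r = -1
while binom_cdf(r + 1, n, p0) < a:
    r += 1

# (b) TYPE I error: P(X <= r) under B(200, 0.30).
type_1 = binom_cdf(r, n, p0)

# (c) TYPE II error: P(X >= r + 1) under B(200, 0.23).
type_2 = 1 - binom_cdf(r, n, p1)

print(r, round(type_1, 3), round(type_2, 3))
```

The search stops at r = 48 because P(X≤49) is already slightly above 0.05.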