Caie A2 Level Further Maths 9231 Further Statistics 2 v1
CAIE A2 LEVEL
FURTHER MATHS
(9231)
SUMMARIZED NOTES ON THE FURTHER STATISTICS 2 SYLLABUS
CAIE A2 LEVEL FURTHER MATHS (9231)
1. Continuous Random Variables

1.1. Probability Density Functions (PDF)
- A function whose area under its graph represents probability; used for continuous random variables
- Represented by $f(x)$

Conditions:
- Total area always = 1:
$$\int_c^d f(x)\,dx = 1$$
- Cannot have negative probabilities ∴ the graph cannot dip below the x-axis

Example:
Given that:
$$f(x) = \begin{cases} kx(6-x) & 2 < x < 5 \\ 0 & \text{otherwise} \end{cases}$$
1. Find the value of $k$
2. Find the mode, $m$
3. Find $P(X < m)$

Solution:
Part (i):
Total area must equal 1, hence
$$\int_2^5 kx(6-x)\,dx = \left[3kx^2 - \frac{kx^3}{3}\right]_2^5 = 1$$
$$75k - \frac{125}{3}k - 12k + \frac{8}{3}k = 24k = 1$$
$$\therefore k = \frac{1}{24}$$
Part (ii):
The mode is the value with the greatest probability density, hence we are looking for the maximum point of the pdf.
$$\frac{d}{dx}\left[kx(6-x)\right] = 6k - 2kx$$
$$6k - 2kx = 0$$
$$x = \frac{6\left(\frac{1}{24}\right)}{2\left(\frac{1}{24}\right)} = 3$$
$$\therefore \text{mode} = 3$$
Part (iii):
$P(X < m)$ can be interpreted as $P(-\infty < X < m)$:
$$\int_{-\infty}^{3} kx(6-x)\,dx = \int_2^3 kx(6-x)\,dx = \left[3kx^2 - \frac{kx^3}{3}\right]_2^3$$
$$= \frac{1}{24}\left(3(3^2) - \frac{3^3}{3} - 3(2^2) + \frac{2^3}{3}\right) = \frac{13}{36}$$

1.2. Cumulative Distribution Functions (CDF)
$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$
Median: the value of $x$ for which $F(x) = 0.5$
(Apply the same idea to quartiles/percentiles)
Notes:
Since it is always impossible to have a value of $X$ smaller than $-\infty$ or larger than $\infty$:
$$F(-\infty) = 0, \quad F(\infty) = 1$$
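The definitions of the PDF, the CDF and the median can be tried out on the example pdf $f(x) = kx(6-x)$ on $2 < x < 5$. The following is only a minimal sketch: the helper names (`integrate_poly`, `cdf`, `median`) are made up for illustration, exact fractions are used for the integrals, and the median is found by bisection on $F(x) = 0.5$.

```python
from fractions import Fraction

def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of sum(coeffs[i] * x**i)."""
    def antiderivative(x):
        x = Fraction(x)
        return sum(c * x ** (i + 1) / (i + 1) for i, c in enumerate(coeffs))
    return antiderivative(b) - antiderivative(a)

shape = [0, 6, -1]                      # x(6 - x) = 6x - x^2, before scaling by k
k = 1 / integrate_poly(shape, 2, 5)     # total area over 2 < x < 5 must equal 1
pdf = [k * c for c in shape]            # coefficients of f(x) on 2 < x < 5

def cdf(x):
    """F(x): 0 to the left of the interval, 1 to the right of it."""
    if x <= 2:
        return Fraction(0)
    if x >= 5:
        return Fraction(1)
    return integrate_poly(pdf, 2, x)

def median(tol=1e-9):
    """Solve F(x) = 0.5 by bisection; valid because F never decreases."""
    lo, hi = 2.0, 5.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < Fraction(1, 2):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(k)         # 1/24
print(cdf(3))    # 13/36, i.e. P(X < mode)
print(median())  # numerical solution of F(x) = 0.5
```

Bisection is used here only because the median of this pdf has no tidy closed form; for piecewise-linear CDFs the equation $F(x) = 0.5$ can be solved by hand.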
WWW.ZNOTES.ORG
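As $x$ increases, $F(x)$ either increases or remains constant; it never decreases.

Example:
Given that:
$$f(x) = \begin{cases} k & 0 < x < 1 \\ 4k & 1 < x < 3 \\ 0 & \text{otherwise} \end{cases}$$
1. Find the value of $k$
2. Find $F(x)$
3. Find the difference between the median and the fifth percentile of $X$

Solution:
Part (i):
Total area must equal 1, hence
$$\int_0^1 k\,dx + \int_1^3 4k\,dx = (k - 0) + (12k - 4k) = 9k = 1$$
$$\therefore k = \frac{1}{9}$$
Part (ii):
Integrate each case separately, from the lower limit of its interval up to $x$.
For the first interval, $0 \le x \le 1$:
$$F(x) = \int_0^x \frac{1}{9}\,dt = \left[\frac{1}{9}t\right]_0^x = \frac{1}{9}x$$
For the next interval, $1 \le x \le 3$, the integral from 0 to $x$ must be split at 1:
$$F(x) = \int_0^1 \frac{1}{9}\,dt + \int_1^x \frac{4}{9}\,dt = \frac{1}{9} + \left[\frac{4}{9}t\right]_1^x = \frac{4}{9}x - \frac{3}{9}$$
Writing in correct notation and fixing intervals (adding equal signs to the inequalities):
$$F(x) = \begin{cases} 0 & x < 0 \\ \frac{1}{9}x & 0 \le x \le 1 \\ \frac{4}{9}x - \frac{3}{9} & 1 \le x \le 3 \\ 1 & x > 3 \end{cases}$$
Part (iii):
Median: at the end of the first interval,
$$F(1) = \frac{1}{9} \times 1 = \frac{1}{9} < \frac{1}{2}$$
This means the median does not lie in this interval ∴ it lies in the second:
$$\frac{4}{9}x - \frac{3}{9} = 0.5 \implies x = \frac{15}{8}$$
The fifth percentile lies in the first interval as $\frac{1}{20} < \frac{1}{9}$, so
$$\frac{1}{9}x = \frac{1}{20} \implies x = \frac{9}{20}$$
The difference is
$$\frac{15}{8} - \frac{9}{20} = \frac{57}{40}$$

1.3. Expectation and Variance
To calculate expectation:
$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$
In general:
$$E(g(X)) = \int_{\forall x} g(x) f(x)\,dx$$
Substitute the information and calculate the variance using
$$\mathrm{Var}(X) = E(X^2) - (E(X))^2$$

1.4. Obtaining f(x) from F(x)
As $F$ is obtained by integrating $f$, $f$ can be obtained by differentiating $F$.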
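The piecewise example with heights $k$ on $(0, 1)$ and $4k$ on $(1, 3)$, together with the expectation and variance formulas of 1.3, can be checked with exact fractions. This is a sketch under stated assumptions: the representation of the pdf as `(left, right, multiple of k)` triples and the function names are illustrative choices, not part of the notes, and the approach only covers piecewise-constant pdfs.

```python
from fractions import Fraction as Fr

# Piecewise-constant pdf: height k on (0, 1) and 4k on (1, 3).
pieces = [(0, 1, 1), (1, 3, 4)]                       # (left, right, multiple of k)
k = Fr(1, sum(h * (b - a) for a, b, h in pieces))     # total area must equal 1

def cdf(x):
    """F(x): add up the area of every piece that lies to the left of x."""
    x = Fr(x)
    total = Fr(0)
    for a, b, h in pieces:
        if x > a:
            total += k * h * (min(x, b) - a)
    return total

def quantile(p):
    """Solve F(x) = p inside the first piece whose right end reaches p."""
    p = Fr(p)
    for a, b, h in pieces:
        if cdf(b) >= p:
            return a + (p - cdf(a)) / (k * h)
    raise ValueError("p must lie between 0 and 1")

def moment(n):
    """E(X^n): integrate x^n times the pdf height over each piece."""
    return sum(k * h * (Fr(b) ** (n + 1) - Fr(a) ** (n + 1)) / (n + 1)
               for a, b, h in pieces)

median = quantile(Fr(1, 2))            # 15/8
fifth = quantile(Fr(1, 20))            # 9/20
print(k, median, fifth, median - fifth)
mean = moment(1)
variance = moment(2) - mean ** 2       # Var(X) = E(X^2) - (E(X))^2
print(mean, variance)
```

The mean and variance printed at the end are not part of the worked example; they simply apply $E(X) = \int x f(x)\,dx$ and $\mathrm{Var}(X) = E(X^2) - (E(X))^2$ to the same pdf.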
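Example:
The random variable $X$ has CDF given by
$$F(x) = \begin{cases} 0 & x \le 1 \\ \dfrac{(x-1)^3}{8} & 1 \le x \le 3 \\ 1 & x \ge 3 \end{cases}$$
Find the form of the PDF of $X$.
Solution:
$F(x)$ is unchanging for $x < 1$ and for $x > 3$, therefore $f(x) = 0$ there. For $1 < x < 3$:
$$f(x) = \frac{d}{dx}\left(\frac{(x-1)^3}{8}\right) = \frac{3}{8}(x-1)^2$$
$$f(x) = \begin{cases} \frac{3}{8}(x-1)^2 & 1 < x < 3 \\ 0 & \text{otherwise} \end{cases}$$

1.5. Distribution of a Function of a Random Variable
$$f_X \to F_X \to F_Y \to f_Y$$
Example:
The random variable $Y$ is given by $Y = 2X + 3$. Determine the pdf and cdf of $Y$.
Solution:
Convert the CDF from $X$ to $Y$ using the relationship given:
$$F_Y(y) = P(Y \le y) = P(2X + 3 \le y) = P\left(X \le \tfrac{1}{2}(y-3)\right)$$
$$= F_X\left(\tfrac{1}{2}(y-3)\right) = \tfrac{1}{2}(y-3) - 2 \implies \tfrac{1}{2}(y-7)$$
Expressing the CDF of $Y$ with ranges worked out:
$$F_Y(y) = P(Y \le y) = \begin{cases} 0 & y \le 7 \\ \frac{1}{2}(y-7) & 7 \le y \le 9 \\ 1 & y \ge 9 \end{cases}$$
Differentiating gives the PDF:
$$f_Y(y) = \begin{cases} \frac{1}{2} & 7 \le y \le 9 \\ 0 & \text{otherwise} \end{cases}$$
This method can be used for both increasing and decreasing functions.

1.6. Confidence intervals for the difference in means
The confidence interval for the difference in means for small samples is given as:
$$(\bar{x} - \bar{y}) \pm t_{\frac{\alpha}{2},\,n_x+n_y-2} \times \sqrt{s_p^2\left(\frac{1}{n_x} + \frac{1}{n_y}\right)}$$
When $X$ and $Y$ are independent populations and the samples are large, it is given as:
$$(\bar{x} - \bar{y}) \pm z_{\frac{\alpha}{2}} \times \sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}}$$

Question 3 9231/04/SP/20:
Employees at a particular company have been working seven hours each day, from 9 am to 4 pm. To try to reduce absence, the company decides to introduce ‘flexi-time’ and allow employees to work their seven hours each day at any time between 7 am and 9 pm.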
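Assume:
- Population differences are normally distributed
Hypotheses:
$$H_0: \mu_d = 0$$
$$H_1: \mu_d < 0$$
Gather information:
$$n = 10, \quad \sum d = -34, \quad \sum d^2 = 648$$
$$\bar{d} = \frac{-34}{10} = -3.4$$
$$s_d^2 = \frac{1}{n-1}\left(\sum d^2 - \frac{(\sum d)^2}{n}\right) = \frac{1}{9}\left(648 - \frac{(-34)^2}{10}\right) = \frac{2662}{45}$$
Test statistic:
$$t = \frac{\bar{d} - \mu_d}{\frac{s_d}{\sqrt{n}}} = \frac{-3.4}{\sqrt{\frac{2662}{45}} \Big/ \sqrt{10}} \approx -1.39792$$
Critical value:
- One-tailed test
- 10% significance level
Which means: $t_{0.9,\,9} = 1.383$, so the critical value for this lower-tail test is $-1.383$.
Since $-1.398 < -1.383$, the test statistic is in the critical region, so we reject $H_0$.

t-distributions
The unbiased estimate of the population variance can be calculated from:
$$s^2 = \frac{1}{n-1}\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)$$
Or
$$s^2 = \frac{1}{n-1}\left(\sum x^2 - n\bar{x}^2\right)$$
Hypotheses:
$$H_0: \mu = k$$
$$H_1: \mu > k, \quad \mu < k \quad \text{or} \quad \mu \ne k$$
$$\text{Test statistic} = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}}$$
where $\bar{x}$ is the sample mean and $s$ is the unbiased estimator of the standard deviation.
The critical value for a given significance level is read from the t-distribution tables with $n - 1$ degrees of freedom.

Example:
A random sample of 12 workers from a mobile phone assembly line is selected from a large number of workers. The recorded times, in minutes, are given below:
43.2, 41.6, 49.3, 48.2, 44.2, 40.6, 39.7, 43.4, 44.9, 45.1, 46.2, 43.2
Assuming that this sample comes from an underlying normal population, investigate the claim that the population mean is 45 minutes. Use a 5% significance level.
Solution:
Assumed underlying normal distribution.
$$H_0: \mu = 45$$
$$H_1: \mu \ne 45$$
Test statistic: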
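$$\sum x = 529.6, \quad \bar{x} = \frac{529.6}{12} \approx 44.13$$
$$\frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}} = \frac{44.13 - 45}{\frac{2.86}{\sqrt{12}}} = -1.05$$
Critical value (two-tailed test, 5% significance level), which can be checked on the last page of these notes:
$$t_{0.975,\,11} = 2.201$$
Since $1.05 < 2.201$, the test statistic is not in the critical region and hence we do not reject $H_0$. There is insufficient evidence to claim that the population mean is not 45 minutes.
Note: if we were to take the negative value, $-1.05$, then we compare it with $-2.201$: since $-1.05 > -2.201$, we do not reject $H_0$. Otherwise, reject $H_0$.

Test statistic is:
$$t = \frac{\bar{x} - \bar{y}}{\sqrt{s_p^2\left(\frac{1}{n_x} + \frac{1}{n_y}\right)}}$$
where $s_p^2$ is found from:
$$s_p^2 = \frac{\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2}{n_x + n_y - 2}$$
Or
$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$$

Example:
$$\sum x^2 = 92274.44$$
Assuming these data are randomly sampled from normal distributions with the same variance, test the shopkeeper’s claim.
Solution:
- $n$ is small for both populations
- Both have an underlying normal distribution
Let $X$ be the sales with music, and $Y$ be the sales without music.
If the sales “with music” are greater than “without”, then $\mu_x > \mu_y$. Rewrite this as:
$$H_0: \mu_x - \mu_y = 0$$
$$H_1: \mu_x - \mu_y > 0$$
$$(n_x - 1)s_x^2 = \sum x^2 - \frac{(\sum x)^2}{n_x} = 92274.44 - \frac{960.1^2}{10} \approx 95.239$$
$$(n_y - 1)s_y^2 = \sum y^2 - \frac{(\sum y)^2}{n_y}$$
$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}, \quad n_x + n_y - 2 = 10 + 8 - 2 = 16$$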
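$$\bar{y} = \frac{748.2}{8} = 93.525$$
Test statistic: $t \approx 1.65154$
We used $t_{\alpha,\,n_x+n_y-2}$ here: $t_{0.95,\,16} = 1.746$.
Since $1.65154 < 1.746$, the test statistic is not in the critical region, so we do not reject $H_0$.

t-tests
Assume:
- Differences are normally distributed
- Population variance of the two populations is the same (but may be unknown)

2.4. Difference in means: Normal distribution
Assume:
- The population variances $\sigma_x^2$ and $\sigma_y^2$ are known (they need not be the same)
$$z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}}$$

Interval is written as:
$$\left(\bar{x} - t_{\frac{\alpha}{2},\,n-1}\frac{s}{\sqrt{n}},\ \ \bar{x} + t_{\frac{\alpha}{2},\,n-1}\frac{s}{\sqrt{n}}\right)$$
Example:
… buying their ticket, their replies, in minutes, are 12, 17, 21, 9, 14, 9.
Assuming a normal distribution, calculate a 90% confidence interval for the population mean.
Solution:
$$\sum x = 92, \quad \sum x^2 = 1512$$
$$\bar{x} = \frac{\sum x}{n} = \frac{92}{6} \approx 15.333$$
$$s^2 = \frac{1}{n-1}\left(\sum x^2 - \frac{(\sum x)^2}{n}\right) = \frac{1}{5}\left(1512 - \frac{92^2}{6}\right) \approx 20.267$$
We are creating a 90% confidence interval, so:
$$100(1 - \alpha) = 90 \implies \alpha = 0.1 \quad \therefore 1 - \frac{\alpha}{2} = 0.95$$
From the tables, $t_{0.95,\,5} = 2.015$, giving
$$15.33 \pm 2.015\sqrt{\frac{20.267}{6}}$$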
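The t-statistic and the t-interval both depend on the data only through $n$, $\sum x$ and $\sum x^2$, so they can be checked from the summary values quoted in the examples. The sketch below assumes the critical value is looked up in printed tables and passed in by hand (here 2.015); the function names are illustrative only.

```python
from math import sqrt

def unbiased_variance(n, sum_x, sum_x2):
    """s^2 = (sum x^2 - (sum x)^2 / n) / (n - 1)."""
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

def t_statistic(n, sum_x, sum_x2, mu0):
    """(sample mean - mu0) / (s / sqrt(n)), built from summary statistics."""
    mean = sum_x / n
    return (mean - mu0) / sqrt(unbiased_variance(n, sum_x, sum_x2) / n)

def t_interval(n, sum_x, sum_x2, t_crit):
    """mean +/- t_crit * s / sqrt(n); t_crit is taken from tables by the user."""
    mean = sum_x / n
    half_width = t_crit * sqrt(unbiased_variance(n, sum_x, sum_x2) / n)
    return mean - half_width, mean + half_width

# Paired differences: n = 10, sum d = -34, sum d^2 = 648, H0: mu_d = 0
t_paired = t_statistic(10, -34, 648, 0)
print(round(t_paired, 5))

# 90% interval: n = 6, sum x = 92, sum x^2 = 1512, t(0.95, 5) = 2.015 from tables
low, high = t_interval(6, 92, 1512, 2.015)
print(round(low, 2), round(high, 2))
```

Working from the sums rather than the raw values mirrors how exam questions usually present the data; with raw data, the sums would be computed first and passed to the same functions.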
The $\chi^2$ test can only be used to test two lists of frequencies: the observed and the expected frequencies calculated from the hypothesis.
$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ and $E_i$ are the observed and expected frequencies.

Variable    Probability    O_i    E_i    (O_i − E_i)²/E_i
⋮           ⋮              ⋮      ⋮      ⋮

If any expected frequency is less than 5, you must group this class with the next class (or two …) so that $E_i \ge 5$.
Hypothesis when testing:
$H_0$: the … distribution is a suitable model
$H_1$: the … distribution is not a suitable model
Once you have calculated the $\chi^2$ value of the data given, compare it with the critical value of the $\chi^2$ distribution.
To test 5 classes at a 5% significance level, find the critical value of the $\chi^2$ distribution at 95% with 4 degrees of freedom.
If the distribution fits, the calculated value should be less than the critical value.

When the null hypothesis states that the data has a ‘particular named distribution’ but does not specify all the parameters of the distribution, the parameters have to be estimated from the data. If $k$ parameters are estimated, you must subtract $k$ from the degrees of freedom $\nu$.
Hence, with $m$ different outcomes,
$$\nu = m - 1 - k$$

Question 2 9231/04/SP/20
Each of 200 identically biased dice is thrown repeatedly until an even number is obtained. The number of throws needed is recorded and the results are summarized in the following table.

No. of throws    1      2     3     4    5    6    ≥7
Frequency        126    43    22    3    5    1    0

Hypotheses:
$H_0$: Geo(0.6) is a suitable model
$H_1$: Geo(0.6) is not a suitable model
Calculate expected values using $200 \times (0.6)(0.4)^{r-1}$.

r      1      2     3     4    5    6    ≥7
O_i    126    43    22    3    5    1    0

Combine the last three classes so that all expected frequencies are greater than 5.

r      1      2     3       4       ≥5
O_i    126    43    22      3       6
E_i    120    48    19.2    7.68    5.12

$$\chi^2 = \sum\left(\frac{O_i^2}{E_i}\right) - N$$
$$= \frac{126^2}{120} + \frac{43^2}{48} + \frac{22^2}{19.2} + \frac{3^2}{7.68} + \frac{6^2}{5.12} - 200$$
$$= \frac{4063}{960} \approx 4.2323$$
From tables:
$$\chi^2_4(0.95) = 9.488$$
Since $4.2323 < 9.488$, we do not reject $H_0$: Geo(0.6) is a suitable model.
Note: You can use $\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$ to calculate the test statistic, but the method shown is more convenient to use.
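The goodness-of-fit calculation for the Geo(0.6) model can be reproduced with exact fractions, which also shows that the two forms of the statistic agree. This is a sketch only: the function names are illustrative, the pooling of the last classes into "5 or more" is done by hand as in the worked solution, and the critical value 9.488 is simply copied from the tables.

```python
from fractions import Fraction

def geometric_expected(p, total, classes):
    """Expected frequencies for r = 1 .. classes-1 plus a pooled 'r >= classes' tail."""
    q = 1 - p
    head = [total * p * q ** (r - 1) for r in range(1, classes)]
    tail = total * q ** (classes - 1)      # P(R >= classes) = q^(classes - 1)
    return head + [tail]

def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_squared_shortcut(observed, expected):
    """Sum of O^2 / E minus N; equal to the form above when sum(E) = sum(O) = N."""
    return sum(o ** 2 / e for o, e in zip(observed, expected)) - sum(observed)

observed = [126, 43, 22, 3, 5 + 1 + 0]            # last three classes pooled
expected = geometric_expected(Fraction(3, 5), 200, 5)
stat = chi_squared(observed, expected)
critical = 9.488                                   # chi-squared, 4 d.f., 95%, from tables

print([float(e) for e in expected])                # [120.0, 48.0, 19.2, 7.68, 5.12]
print(stat, float(stat))
print("reject H0" if stat > critical else "do not reject H0")
```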
For a contingency table with row totals $\sum R_i$, column totals $\sum C_j$ and grand total $T$:

        A        B        C
X                                  ∑R1
Y                                  ∑R2
Z                                  ∑R3
        ∑C1      ∑C2      ∑C3      T

The expectation of each variable is calculated by
$$\frac{\text{row total} \times \text{column total}}{\text{grand total}}$$
Or in math symbols
$$E_{ij} = \frac{\sum R_i \times \sum C_j}{T}$$
The degrees of freedom are
$$\nu = (r - 1)(c - 1)$$

Question 10 9231/02/QP/12:
Random samples of employees are taken from two companies, A and B. Each employee is asked which of three types of coffee (cappuccino, latte, and ground) they prefer. The results are shown in the following table.

             Cappuccino    Latte    Ground
Company A    60            52       32
Company B    35            40       31

a) Test, at the 5% significance level, whether coffee preferences of employees are independent of their company.
Larger random samples, consisting of $N$ times as many employees from each company, are taken. In each company, the proportions of employees preferring the three types of coffee remain unchanged.
b) Find the least possible value of $N$ that would lead to the conclusion, at the 1% significance level, that coffee preferences of employees are not independent of their company.

Solution:
$$E_{ij} = \frac{\sum R_i \times \sum C_j}{T}$$

Expected     Cappuccino    Latte         Ground
Company A    144×95/250    144×92/250    144×63/250
Company B    106×95/250    106×92/250    106×63/250

Calculating the test statistic:

O     E         (O − E)²/E
60    54.72     0.509474
52    52.992    0.01857
32    36.288    0.506695
35    40.28     0.692115
40    39.008    0.025227
31    26.712    0.68834

$$\chi^2 = \sum \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \approx 2.44$$

4. Non-Parametric Tests

4.1. Single-sample sign test
- Given $n$ data points, a single-sample sign test is created using $X \sim \text{Bin}(n, 0.5)$
- The test statistic can be the number of + signs, that is, the number of data points greater than the median.
- We can calculate the probability that $X$ is above this test statistic, below this test statistic, or either in the case of a two-tailed test.
- This can be expressed as $P(X \le ts \mid X \sim \text{Bin}(n, 0.5))$ or $P(X \ge ts \mid X \sim \text{Bin}(n, 0.5))$, where $ts$ stands for test statistic.
- This is the only chapter in which you reject $H_0$ when the test statistic is less than the critical value.

Example:
It is believed that the following dataset comes from a population with median 135:
150 130 125 140 170
140 190 180 175 165
160 130 140 140 145

Perform a single-sample sign test, at the 5% significance level, to test this claim.
Solution:
Hypotheses:
$H_0$: The population median is 135
$H_1$: The population median is not 135
12 of the 15 values are above 135, so the test statistic is 12, with $X \sim \text{Bin}(15, 0.5)$:
$$P(X \ge 12) \approx 0.017578$$
Since $0.017578 < 0.025$, the test statistic of 12 is in the critical region and, therefore, we reject $H_0$.

For large $n$, with $S$ the number of + signs:
$$E(S) = \frac{n}{2}, \quad \mathrm{Var}(S) = \frac{n}{4}$$
Then $S \sim N\left(\frac{n}{2}, \frac{n}{4}\right)$.
Applying continuity correction, our z-value is:
$$z = \frac{S - \mu + 0.5}{\sigma}$$

4.2. Wilcoxon signed-rank test
A Wilcoxon signed-rank test can be performed when:
- The underlying data are symmetric
- The underlying data are continuous
- The data are independent
Where:
- $P$ is the sum of the ranks corresponding to the positive differences from the stated median.
- $N$ is the sum of the ranks corresponding to the negative differences from the stated median.
- $T = \min(P, N)$ is the test statistic
If the test statistic is below the critical value, we reject $H_0$.

Weight, W_i    W_i − Median    P     N
1.6            −0.2                  2
1.1            −0.7                  7
2.1            0.3             3
2.4            0.6             6
⋮
Sums:                          46    9

So, the test statistic here is $T = \min(P, N) = 9$.
We look up the critical value in the statistical tables:
- One-tailed test
- 5% significance level
The critical value is 10.
Since $9 < 10$ (test statistic < critical value), there is sufficient evidence to reject $H_0$. ∴ There is sufficient evidence to suggest that the population median is not 1.8 kg.
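The single-sample sign test on the 15 data values with stated median 135 can be reproduced with a direct binomial tail sum. This is a minimal sketch: the function name is illustrative, and dropping values equal to the stated median is an assumption of the sketch (this dataset has none).

```python
from math import comb

def sign_test_upper_tail(data, median):
    """Return (number of + signs, n, P(X >= plus)) for X ~ Bin(n, 0.5)."""
    diffs = [x - median for x in data if x != median]   # assumption: ties with the median are dropped
    n = len(diffs)
    plus = sum(1 for d in diffs if d > 0)
    tail = sum(comb(n, r) for r in range(plus, n + 1)) / 2 ** n
    return plus, n, tail

data = [150, 130, 125, 140, 170,
        140, 190, 180, 175, 165,
        160, 130, 140, 140, 145]
plus, n, p_upper = sign_test_upper_tail(data, 135)
print(plus, n, round(p_upper, 6))      # 12 15 0.017578

# Two-tailed test at 5%: compare the one-tail probability with 0.025
print("reject H0" if p_upper < 0.025 else "do not reject H0")
```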
Note: Once again, in this case we require the test statistic to be lower than the critical value to reject $H_0$.

Given the statistic $T = \min(P, N)$:
$$E(T) = \frac{n(n+1)}{4}$$
$$\mathrm{Var}(T) = \frac{n(n+1)(2n+1)}{24}$$
For large $n$:
$$T \sim N\left(\frac{n(n+1)}{4}, \frac{n(n+1)(2n+1)}{24}\right)$$
$$z = \frac{T - \mu + 0.5}{\sqrt{\frac{31395}{4}}} \approx -5.21485$$
Critical value:
$$-z_{0.99} = -2.326$$
$$\therefore |-5.21485| > 2.326$$
Since the magnitude of our test statistic is greater than the critical value, we reject $H_0$; we can conclude that the median waiting time is lower than 50 minutes.
Note: as shown, once we have used the normal distribution to approximate the signed-rank test, we go back to the old rule where we reject $H_0$ when test statistic > critical value.

4.3. Paired-sample sign test

⋮
F    48    49    −
G    61    62    −
H    38    39    −

We will let the number of + signs be the test statistic.
The test statistic is 2.
Use: $X \sim \text{Bin}(9, 0.5)$
$$P(X \le 2) = \left[\binom{9}{0} + \binom{9}{1} + \binom{9}{2}\right](0.5)^9 \approx 0.0898$$
… difference in the times taken for children to tie their left and right shoelaces.
Note: Once again here, we only reject $H_0$ if our test statistic is lower than the critical value.

4.4. Wilcoxon matched-pairs signed-rank test
A Wilcoxon matched-pairs signed-rank test can be performed when:
- The underlying data are symmetric
- The underlying data are continuous
Where:
- $P$ is the sum of the ranks corresponding to the positive differences between the matched pairs
- $N$ is the sum of the ranks corresponding to the negative differences between matched pairs
- $T = \min(P, N)$ is the test statistic

Given the statistic $T = \min(P, N)$:
$$E(T) = \frac{n(n+1)}{4}$$
$$\mathrm{Var}(T) = \frac{n(n+1)(2n+1)}{24}$$
For large $n$:
$$z = \frac{T - \mu + 0.5}{\sigma}$$

Example:

Student    1      2      3      4      5      6      7      8
B.T.       130    170    125    170    130    130    145    160
A.T.       120    163    120    135    143    136    144    120

B.T. = Before Training
A.T. = After Training

Solution:
a)

Student      1      2     3     4      5     6    7     8
A_i − B_i    −10    −7    −5    −35    13    6    −1    −40
P                                      6     3
N            5      4     2     7                 1     8

$$P = 9, \quad N = 27$$
$$T = \min(9, 27) = 9$$
Critical value:
- 5% significance level
- One-tailed test
From the tables:
Critical value = 5
Since $9 > 5$, do not reject $H_0$. There is no evidence to suggest that the median blood pressure has decreased.
b)
Hypotheses:
$H_0$: Population median blood pressure is unchanged
$$\mathrm{Var}(T) = \frac{1}{24}n(n+1)(2n+1) = 2363.75$$
$$z = \frac{T - \mu + 0.5}{\sigma} \approx -2.06$$
Critical value:
5% significance level & one-tailed test.
$$-z_{0.95} = -1.645$$
Since $-2.06 < -1.645$, the test statistic is in the critical region, so we reject $H_0$.
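The matched-pairs calculation for the before/after training data, and the large-sample normal approximation, can be sketched as follows. The function names are illustrative, and the sketch assumes no zero differences and no tied absolute differences (true for this dataset), so no tie-averaging of ranks is attempted.

```python
from math import sqrt

def signed_rank_sums(before, after):
    """Rank |after - before| from smallest (rank 1) and return (P, N, T)."""
    diffs = [a - b for a, b in zip(after, before) if a != b]   # zero differences dropped
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    P = N = 0
    for rank, i in enumerate(order, start=1):    # assumes all |differences| are distinct
        if diffs[i] > 0:
            P += rank
        else:
            N += rank
    return P, N, min(P, N)

def normal_approx_z(T, n):
    """z = (T - mu + 0.5) / sigma with mu = n(n+1)/4 and sigma^2 = n(n+1)(2n+1)/24."""
    mu = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    return (T - mu + 0.5) / sqrt(var)

before = [130, 170, 125, 170, 130, 130, 145, 160]
after = [120, 163, 120, 135, 143, 136, 144, 120]
P, N, T = signed_rank_sums(before, after)
print(P, N, T)                       # 9 27 9

# Variance of T for n = 30, as used in part b)
print(30 * 31 * 61 / 24)             # 2363.75
```

For small $n$ the value of $T$ is compared with the table critical value (reject when $T$ is below it); `normal_approx_z` is only meant for the large-$n$ case.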
- The two samples have sizes $m$ and $n$, where $m \le n$
- $R_m$ is the sum of the ranks of the items in the sample of size $m$
- The test statistic is:
$$W = \min\left(R_m,\ m(n + m + 1) - R_m\right)$$

Example:
Researchers are investigating the effect of vitamin B12 on brain size.
$H_0$: Level of B12 has no effect on brain size
$H_1$: Level of B12 has an effect on brain size
First, rank the whole dataset. Note which group each value comes from. Then add up the ranks for each category. Use the value of the sums of the group with the smaller sample size.

Value    Rank
0.812    1
0.810    2
0.808    3
⋮
0.803    7
0.802    8
0.800    9
0.799    10
0.798    11
0.796    12
0.795    13
0.792    14
0.789    15
0.786    16

       Low B12    High B12
Sum    53         83

Calculate the test statistic:
$R_m = 83$ since this is the rank sum from the smaller-sized sample.
$$m(n + m + 1) - R_m = 7(9 + 7 + 1) - 83 = 36$$
The same value is obtained by ranking in the reverse order:

Value    Rank    Reverse rank
⋮
0.795    13      4
0.792    14      3
0.789    15      2
0.786    16      1
Sum      100     36

$$\therefore W = \min(83, 36) = 36$$

Given the test statistic $W$, then:
$$E(W) = \frac{1}{2}m(n + m + 1)$$
$$\mathrm{Var}(W) = \frac{1}{12}mn(n + m + 1)$$
Allowing for an approximate z-test with:
$$z = \frac{W - \mu + 0.5}{\sigma}$$

5. Probability Generating Functions (PGF)
The uses of PGFs:
- enable us to describe the structure of infinite discrete distributions

The PGF can be written as a single summation:
$$G_X(t) = \sum_x t^x P(X = x)$$
Notice that the expression for $G_X(t)$ is the same as that for the expectation function, $E(t^X)$, and so:
$$G_X(t) = \sum t^{x_i} P(X = x_i) = E(t^X)$$

Example:
Consider the probability distribution:

x           0      1      2      3       4      5      6
P(X = x)    0.1    0.2    0.3    0.15    0.1    0.1    0.05

Solution:
Apply the general form of the PGF:
$$G_X(t) = \sum t^{x_i} P(X = x_i) = 0.1 + 0.2t + 0.3t^2 + 0.15t^3 + 0.1t^4 + 0.1t^5 + 0.05t^6$$

Standard PGFs:

Probability Distribution    P(X = r)                       G_X(t)
Bin(n, p)                   $\binom{n}{r} p^r q^{n-r}$     $(q + pt)^n$
Uniform(n)                  $\frac{1}{n}$                  $\frac{t(1 - t^n)}{n(1 - t)}$

The probabilities in a PGF can be found using:
$$P(X = r) = \frac{G_X^{(r)}(0)}{r!}$$
The expectation and variance are found using:
$$E(X) = G_X'(1)$$
$$\mathrm{Var}(X) = G_X''(1) + G_X'(1) - \left(G_X'(1)\right)^2$$
For independent random variables:
$$G_{X_1 + \dots + X_n}(t) = G_{X_1}(t) \times \dots \times G_{X_n}(t)$$

Example:
Aisha has a bag containing 3 red balls and 3 white balls. She selects a ball at random, notes its colour and returns it to the bag; the same process is repeated twice more. The number of red balls selected by Aisha is denoted by $X$.
a) Find the probability generating function $G_X(t)$ of $X$.
The random variable $Z$ is the total number of red balls selected by Aisha and Basant.
c) Find the probability generating function of $Z$, expressing your answer as a polynomial.
d) Use the probability generating function of $Z$ to find $E(Z)$ and $\mathrm{Var}(Z)$.

Solution:
a) It’s best to make a probability distribution table first, then make the PGF.

x           0      1      2      3
P(X = x)    1/8    3/8    3/8    1/8

$$G_X(t) = \frac{1}{8} + \frac{3}{8}t + \frac{3}{8}t^2 + \frac{1}{8}t^3$$
b) Same as part (a) but this time without replacement.

y           0       1       2       3
P(Y = y)    1/20    9/20    9/20    1/20

$$G_Y(t) = \frac{1}{20} + \frac{9}{20}t + \frac{9}{20}t^2 + \frac{1}{20}t^3$$
c) Using the Convolution Theorem:
$$G_Z(t) = G_{X+Y}(t) = G_X(t) \times G_Y(t)$$
$$= \left(\frac{1}{8} + \frac{3}{8}t + \frac{3}{8}t^2 + \frac{1}{8}t^3\right) \times \left(\frac{1}{20} + \frac{9}{20}t + \frac{9}{20}t^2 + \frac{1}{20}t^3\right)$$
$$= \frac{1}{160}\left(1 + 12t + 39t^2 + 56t^3 + 39t^4 + 12t^5 + t^6\right)$$
d) Recall that:
$$E(Z) = G_Z'(1)$$
$$\mathrm{Var}(Z) = G_Z''(1) + G_Z'(1) - \left(G_Z'(1)\right)^2$$
$$G_Z'(t) = \frac{1}{160}\left(12 + 78t + 168t^2 + 156t^3 + 60t^4 + 6t^5\right)$$
Let $t = 1$:
$$G_Z'(1) = \frac{1}{160}(12 + 78 + 168 + 156 + 60 + 6) = \frac{480}{160} = 3$$
$$\therefore E(Z) = 3$$
$$G_Z''(t) = \frac{1}{160}\left(78 + 336t + 468t^2 + 240t^3 + 30t^4\right)$$
Let $t = 1$:
$$G_Z''(1) = \frac{36}{5}$$
$$\mathrm{Var}(Z) = \frac{36}{5} + 3 - (3)^2 = \frac{6}{5}$$
$$\therefore \mathrm{Var}(Z) = \frac{6}{5}$$
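The convolution step and the derivative formulas for $E(Z)$ and $\mathrm{Var}(Z)$ can be checked by treating each PGF as a list of coefficients. This is a sketch only: the list representation (index = power of $t$) and the helper names are illustrative choices, and it applies to PGFs that are finite polynomials.

```python
from fractions import Fraction as Fr

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of t)."""
    out = [Fr(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def derivative(p):
    """Differentiate term by term: c_k t^k -> k c_k t^(k-1)."""
    return [power * c for power, c in enumerate(p)][1:]

def at_one(p):
    """Evaluate a polynomial at t = 1 (just the sum of its coefficients)."""
    return sum(p)

G_X = [Fr(1, 8), Fr(3, 8), Fr(3, 8), Fr(1, 8)]        # with replacement
G_Y = [Fr(1, 20), Fr(9, 20), Fr(9, 20), Fr(1, 20)]    # without replacement
G_Z = poly_mul(G_X, G_Y)                              # PGF of Z = X + Y

first = derivative(G_Z)
second = derivative(first)
mean = at_one(first)                                  # E(Z) = G'(1)
variance = at_one(second) + mean - mean ** 2          # G''(1) + G'(1) - (G'(1))^2

print([str(c * 160) for c in G_Z])    # coefficients of G_Z multiplied by 160
print(mean, variance)                 # 3 6/5
```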