Cie A2 Maths 9709 Statistics2 v2 Znotes
A-LEVEL SERIES
Visit www.znotes.org
Updated to 2019 Syllabus
CIE A-LEVEL MATHS 9709 (S2)
FORMULAE AND SOLVED QUESTIONS FOR STATISTICS 2 (S2)
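TABLE OF CONTENTS
CHAPTER 1: The Poisson Distribution
CHAPTER 2: Linear Combinations of Random Variables
CHAPTER 3: Continuous Random Variables
CHAPTER 4: Sampling & Estimation
CHAPTER 5: Hypothesis Tests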
CIE A-LEVEL MATHEMATICS//9709
1. THE POISSON DISTRIBUTION
The Poisson distribution is used as a model for the number, $X$, of events in a given interval of space or time. It has the probability formula
$P(X = x) = e^{-\lambda} \dfrac{\lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$
where $\lambda$ is equal to the mean number of events in the given interval.
A Poisson distribution with mean $\lambda$ can be noted as
$X \sim Po(\lambda)$
Part (ii):
Write the distribution using the correct notation
$(A + B) \sim Po(2(0.65 + 0.45))$, so $(A + B) \sim Po(2.2)$
Use the limits given in the question to find probability
$P(A + B < 4) = e^{-2.2} \left( \dfrac{(2.2)^3}{3!} + \dfrac{(2.2)^2}{2!} + \dfrac{(2.2)^1}{1!} + \dfrac{(2.2)^0}{0!} \right) = 0.819$
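The Poisson probability formula and the Part (ii) calculation can be checked numerically. The following is a minimal sketch, not part of the original notes; the helper names `poisson_pmf` and `poisson_cdf` are illustrative choices, and only the Python standard library is used.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Po(lam), using e^(-lam) * lam^x / x!."""
    return math.exp(-lam) * lam**x / math.factorial(x)

def poisson_cdf(x: int, lam: float) -> float:
    """P(X <= x), found by summing the pmf from 0 to x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

# Sum of independent Poisson variables: A + B ~ Po(2 * (0.65 + 0.45)) = Po(2.2)
lam_total = 2 * (0.65 + 0.45)

# "Fewer than 4" for a discrete variable means "at most 3"
p_fewer_than_4 = poisson_cdf(3, lam_total)
print(round(p_fewer_than_4, 3))  # 0.819
```

Rewriting a strict inequality such as $P(A + B < 4)$ as $P(A + B \leq 3)$ before summing is the step most often missed in exam working.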
Use the limits given in the question to find probability
$P(X = 0) = \dfrac{0.25^0 \, e^{-0.25}}{0!} = 0.779$
Part (i)(b):
Use the rules of a Poisson distribution
$Var(X) = \mu = \lambda$
Calculate $\lambda$ in this scenario:
$\lambda = 6 \times \mu \text{ (in one month)} = 6 \times 0.25 = 1.5$
$\therefore Var(X) = 1.5$
Part (ii):
Calculate $\lambda$ in this scenario:
$\lambda = 12 \times \mu \text{ (in one month)} = 12 \times 0.25 = 3$
Use the limits given in the question to find probability
$P(X \geq 3) = 1 - P(X \leq 2)$
$= 1 - e^{-3} \left( \dfrac{3^2}{2!} + \dfrac{3^1}{1!} + \dfrac{3^0}{0!} \right) = 1 - 0.423 = 0.577$
Part (iii):
We will need two different $\lambda$s in this scenario:
$\lambda$ for one doctor in one year $= 1$
$\lambda$ for other two doctors in one year $= 2 \times 1 = 2$
For the first doctor:
$P(X = 3) = e^{-1} \left( \dfrac{1^3}{3!} \right)$
For the two other doctors:
$P(X = 0) = e^{-2} \left( \dfrac{2^0}{0!} \right)$
Considering that any of the three could be the first
$P = e^{-1} \left( \dfrac{1^3}{3!} \right) \times e^{-2} \left( \dfrac{2^0}{0!} \right) \times {}^3C_2 = 0.025$
1.6 Normal Approximation of a Poisson Distribution
To approximate a Poisson distribution given by:
$X \sim Po(\lambda)$
If $\lambda > 15$
Then we can use a normal distribution given by:
$X \sim N(\lambda, \lambda)$
Apply continuity correction to limits:
Poisson | Normal
$x = 6$ | $5.5 \leq x \leq 6.5$
$x > 6$ | $x \geq 6.5$
$x \geq 6$ | $x \geq 5.5$
$x < 6$ | $x \leq 5.5$
$x \leq 6$ | $x \leq 6.5$
(IS) Ex 10h: Question 11:
The no. of flaws in a length of cloth, $l$ m long, has a Poisson distribution with mean $0.04l$
i. Find the probability that a 10m length of cloth has fewer than 2 flaws.
ii. Find an approximate value for the probability that a 1000m length of cloth has at least 46 flaws.
iii. Given that the cost of rectifying $X$ flaws in a 1000m length of cloth is $X^2$ pence, find the expected cost.
Solution:
Part (i):
Form the parameters of Poisson distribution
$l = 10$ and $\lambda = 0.04l$
$\therefore \lambda = 0.4$
Write down our distribution using correct notation
$X \sim Po(0.4)$
Write the probability required by the question
$P(X < 2)$
From earlier equations:
$P(X < 2) = e^{-0.4} \left( \dfrac{0.4^0}{0!} + \dfrac{0.4^1}{1!} \right) = 0.938$
Part (ii):
Using question to form the parameters
$l = 1000$ and $\lambda = 0.04l$
$\therefore \lambda = 40 > 15$
Thus we can use the normal approximation
Write down our distribution using correct notation
$X \sim Po(40) \rightarrow Y \sim N(40, 40)$
Write the probability required by the question
$P(X \geq 46)$
Apply continuity correction for the normal distribution
$P(Y \geq 45.5)$
Evaluate the probability
$P(Y \geq 45.5) = 1 - \Phi \left( \dfrac{45.5 - 40}{\sqrt{40}} \right) = 0.192$
Part (iii):
Using the variance formula
$Var(X) = E(X^2) - (E(X))^2$
For a Poisson distribution
$E(X) = Var(X) = \lambda$ and $\lambda = 40$
Substitute into equation and solve for the unknown
$\therefore 40 = E(X^2) - 40^2$
$E(X^2) = 1640$ pence
$E(X^2) = £16.40$
Expected cost for rectifying cloth is £16.40
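The normal approximation with continuity correction used for the 1000m length of cloth can be sketched in code and compared against the exact Poisson tail. This is an illustrative check rather than the notes' own method; the function names are assumptions, and `statistics.NormalDist` from the Python standard library stands in for the normal tables.

```python
import math
from statistics import NormalDist

def poisson_upper_tail_normal_approx(k: int, lam: float) -> float:
    """Approximate P(X >= k) for X ~ Po(lam) by N(lam, lam).

    Continuity correction: P(X >= k) becomes P(Y >= k - 0.5).
    """
    approx = NormalDist(mu=lam, sigma=math.sqrt(lam))
    return 1 - approx.cdf(k - 0.5)

def poisson_upper_tail_exact(k: int, lam: float) -> float:
    """Exact P(X >= k) = 1 - P(X <= k - 1), summing the Poisson pmf."""
    below = sum(math.exp(-lam) * lam**x / math.factorial(x) for x in range(k))
    return 1 - below

lam = 0.04 * 1000  # mean number of flaws in 1000 m of cloth
approx_tail = poisson_upper_tail_normal_approx(46, lam)
exact_tail = poisson_upper_tail_exact(46, lam)
print(round(approx_tail, 3))  # 0.192

# Expected cost in pence: E(X^2) = Var(X) + (E(X))^2 = lam + lam^2
expected_x_squared = lam + lam**2
print(expected_x_squared)  # 1640.0
```

Comparing `approx_tail` with `exact_tail` gives a feel for how much accuracy the approximation gives up once $\lambda > 15$.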
2. LINEAR COMBINATIONS OF RANDOM VARIABLES
2.1 Expectation & Variance of a Function of $X$
$E(aX + b) = aE(X) + b$
$Var(aX + b) = a^2 Var(X)$
(IS) Ex 6a: Question 12:
The random variable $T$ has mean 5 and variance 16. Find two pairs of values for the constants $c$ and $d$ such that $E(cT + d) = 100$ and $Var(cT + d) = 144$
Solution:
Expand expectation equation:
$E(cT + d) = cE(T) + d = 100$
$\therefore 5c + d = 100$
Expand variance equation:
$Var(cT + d) = c^2 Var(T) = 144$
$16c^2 = 144$
$c = \pm 3$
Use first equation to find two pairs:
$c = 3, \ d = 85$ and $c = -3, \ d = 115$
2.2 Combinations of Random Variables
Expectations of combinations of random variables:
$E(aX + bY) = aE(X) + bE(Y)$
Variance of combinations of independent random variables:
$Var(aX + bY + c) = a^2 Var(X) + b^2 Var(Y)$
$Var(X \pm Y) = Var(X) + Var(Y)$
2.3 Expectation & Variance of Sample Mean
$E(\bar{X}) = \mu \qquad Var(\bar{X}) = \dfrac{\sigma^2}{n}$
(IS) Ex 6c: Question 5:
The mean weight of a soldier may be taken to be 90kg, and $\sigma = 10$kg. 250 soldiers are on board an aircraft, find the expectation and variance of their mean weight. Hence find the $\mu$ and $\sigma$ of the total weight of soldiers.
Solution:
Let $\bar{X}$ be the average weight, therefore
$E(\bar{X}) = \mu = 90$
$Var(\bar{X}) = \dfrac{\sigma^2}{n} = \dfrac{10^2}{250} = 0.4 \text{ kg}^2$
To find $\mu$ of total weight, you are calculating
$E(X_1) + E(X_2) + \ldots + E(X_{250}) = 250E(X) = 22\,500$kg
To find $\sigma$ of total weight, first find the variance of the total
$Var(X_1) + \ldots + Var(X_{250}) = 250Var(X) = 250 \times 10^2 = 25\,000 \text{ kg}^2$
$\sigma^2 = 25\,000$
$\therefore \sigma = \sqrt{25\,000} = 158.1$kg
3. CONTINUOUS RANDOM VARIABLES
3.1 Probability Density Functions (pdf)
Function whose area under its graph represents probability; used for continuous random variables
Represented by $f(x)$
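As a rough numerical illustration of the idea that, for a continuous random variable, probability is the area under $f(x)$: the sketch below uses a hypothetical pdf, $f(x) = 2x$ on $0 < x < 1$, chosen only for this illustration and not taken from the notes, together with a simple midpoint-rule integrator (also an assumption of this sketch).

```python
def integrate(f, a: float, b: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f from a to b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

def f(x: float) -> float:
    """Hypothetical pdf for illustration: 2x on 0 < x < 1, 0 otherwise."""
    return 2 * x if 0 < x < 1 else 0.0

total_area = integrate(f, 0, 1)      # a valid pdf has total area 1
p_below_half = integrate(f, 0, 0.5)  # P(X < 0.5); the exact value is 0.25

print(round(total_area, 6), round(p_below_half, 6))
```

The same two checks (total area equal to 1, then area up to a limit) are what the algebraic integration in exam questions carries out by hand.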
Notes:
o $P(X < b) = P(X \leq b)$ as no extra area added
o The mode of a pdf is its maximum (stationary point)
(IS) Ex 9a: Question 6:
Given that:
$f(x) = \begin{cases} kx(6 - x) & 2 < x < 5 \\ 0 & \text{otherwise} \end{cases}$
i. Find the value of $k$
ii. Find the mode, $m$
iii. Find $P(X < m)$
Solution:
Part (i):
Total area must equal 1 hence
$\int_2^5 kx(6 - x) \, dx = \left[ 3kx^2 - \dfrac{kx^3}{3} \right]_2^5 = 1$
$75k - \dfrac{125}{3}k - 12k + \dfrac{8}{3}k = 24k = 1$
$\therefore k = \dfrac{1}{24}$
Part (ii):
Mode is the value which has the greatest probability density hence we are looking for the max point on the pdf
$\dfrac{d}{dx} [kx(6 - x)] = 6k - 2kx$
Finding max point hence stationary point
$6k - 2kx = 0$
$x = \dfrac{6 \left( \frac{1}{24} \right)}{2 \left( \frac{1}{24} \right)} = 3$
$\therefore$ mode $= 3$
Part (iii):
$P(X < m)$ can be interpreted as $P(-\infty < X < m)$
$\int_{-\infty}^{m} kx(6 - x) \, dx = \int_2^3 kx(6 - x) \, dx = \left[ 3kx^2 - \dfrac{kx^3}{3} \right]_2^3$
$= \dfrac{1}{24} \left( 3(3^2) - \dfrac{3^3}{3} - 3(2^2) + \dfrac{2^3}{3} \right) = \dfrac{13}{36}$
3.2 Mean & Variance
To calculate mean/expectation
$E(X) = \int_{-\infty}^{\infty} x f(x) \, dx$
To calculate variance:
o First calculate $E(X)$ then $E(X^2)$ by
$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x) \, dx$
o Substitute information and calculate using
$Var(X) = E(X^2) - (E(X))^2$
3.3 The Median
The Cumulative Distribution Function (cdf)
Gives the probability that the value is less than $b$
$P(X < b)$ or $P(X \leq b)$
Represented by $F(b)$
It is the integral of $f(x)$
$F(b) = \int_{-\infty}^{b} f(x) \, dx$
Median: the value of $b$ for which $F(b) = 0.5$
4. SAMPLING & ESTIMATION
4.1 Sample & Population
Population: collection of all items
Sample: subset of population used as a representation of the entire population
4.1 Central Limit Theorem
If $(X_1, X_2, \ldots, X_n)$ is a random sample of size $n$ drawn from any population with mean $\mu$ and variance $\sigma^2$ then the sample mean $\bar{X}$ has:
Expected mean, $\mu$
Expected variance, $\dfrac{\sigma^2}{n}$
For large $n$ it is approximately normally distributed:
$\bar{X} \sim N \left( \mu, \dfrac{\sigma^2}{n} \right)$
(IS) Ex 10f: Question 12:
The weights of the trout at a trout farm are normally distributed with mean 1kg & standard deviation 0.25kg
a. Find, to 4 decimal places, the probability that a trout chosen at random weighs more than 1.25kg.
b. If $\bar{Y}$kg represents mean weight of a sample of 10 trout chosen at random, state the distribution of $\bar{Y}$: evaluate the mean and variance. Find the probability that the mean weight of a sample of 10 trout will be less than 0.9kg
Solution:
Part (a):
Write down distribution
$X \sim N(1, 0.25^2)$
Write down the probability they want
$P(X > 1.25) = 1 - P(X < 1.25)$
Standardize and evaluate
$1 - P \left( Z < \dfrac{1.25 - 1}{0.25} \right) = 0.1587$
Part (b):
Write down initial distribution
$X \sim N(1, 0.25^2)$
For sample, mean remains equal but variance changes
Find new variance
Variance of sample $= \dfrac{\sigma^2}{n} = \dfrac{0.25^2}{10} = 0.00625$
Write down distribution of sample
$\bar{Y} \sim N(1, 0.00625)$
Write down the probability they want
$P(\bar{Y} < 0.9)$
Standardize and evaluate
The standardized value is negative, so use 1 minus the probability for the corresponding positive value
$P \left( Z < \dfrac{0.9 - 1}{\sqrt{0.00625}} \right) = 1 - P \left( Z < \dfrac{0.1}{\sqrt{0.00625}} \right) = 0.103$
4.2 Point Estimate & Confidence Interval
A point estimate is a numerical value calculated from a set of data (sample) which is used as an estimate of an unknown parameter in a population
Examples of point estimates are:
Sample mean $\bar{x}$ estimates population mean $\mu$
Sample proportion $\dfrac{r}{n}$ estimates population proportion $p$
Sample variance $s^2$ estimates population variance $\sigma^2$
A point estimate is close to the population value but not exact
We can determine a confidence interval where the population value is likely to lie in $(\bar{x} - \delta, \bar{x} + \delta)$
4.3 The Variance
Variance can be calculated/given for either a sample or a population and there is a difference between them
Using the divisor $n$
This is appropriate to use when
o data is given for the whole population and you are interested in the variance of the whole population
o data is given for the sample and you are interested in the variance of just the sample
$\sigma^2 = \dfrac{1}{n} \left( \Sigma x^2 - \dfrac{(\Sigma x)^2}{n} \right)$
Using the divisor $(n - 1)$
Appropriate to use when data is given for a sample and you are estimating variance of the whole population
The quantity calculated, $s^2$, is known as the unbiased estimate of the population variance
$s^2 = \dfrac{1}{n - 1} \left( \Sigma x^2 - \dfrac{(\Sigma x)^2}{n} \right)$
4.4 Percentage Points for a Normal Distribution
The percentage points are determined by finding the $z$-value of specific percentages
E.g. to find the $z$-value of a 95% confidence level, we can see that the 5% would be removed equally from both sides (2.5%) so the $z$-value we would actually be finding would be of $100\% - 2.5\% = 97.5\%$
Percentage Points Table
Confidence level | 90% | 95% | 98% | 99%
$z$-value | 1.645 | 1.960 | 2.326 | 2.576
4.5 Confidence Interval for a Population Mean
Sample taken from a normal population distribution with known population variance
$\left( \bar{x} - z \dfrac{\sigma}{\sqrt{n}}, \ \bar{x} + z \dfrac{\sigma}{\sqrt{n}} \right)$
$z$ is the value corresponding to the confidence level required and $n$ is the sample size
The confidence interval calculated is exact
Large sample taken from an unknown population distribution with known population variance
By the Central Limit Theorem, the distribution of $\bar{X}$ will be approximately normal so same method as above
$\left( \bar{x} - z \dfrac{\sigma}{\sqrt{n}}, \ \bar{x} + z \dfrac{\sigma}{\sqrt{n}} \right)$
The confidence interval calculated is approximate
Large sample taken from an unknown population distribution with unknown population variance
As the population variance is unknown, you must first estimate the population variance, $s^2$, using sample data
$\left( \bar{x} - z \dfrac{s}{\sqrt{n}}, \ \bar{x} + z \dfrac{s}{\sqrt{n}} \right)$
The confidence interval calculated is approximate
{W13-P71}: Question 2:
Heights of a certain species of animal are normally distributed with $\sigma = 0.17$m. Obtain a 99% confidence interval for the population mean, with total width less than 0.2m. Find the smallest sample size required.
Solution:
For a 99% confidence interval, find $z$ where
$\Phi(z) = 0.995$ (think of the 1% cut from both sides)
$z = 2.576$
Subtract the limits of the interval and equate to 0.2
$\left( \bar{x} + z \dfrac{\sigma}{\sqrt{n}} \right) - \left( \bar{x} - z \dfrac{\sigma}{\sqrt{n}} \right) = 0.2$
$2 \left( z \dfrac{\sigma}{\sqrt{n}} \right) = 0.2$
Substitute information given and find $n$
$\sqrt{n} = \dfrac{2 \times 2.576 \times 0.17}{0.2} = 4.379$
$n = 19.18$
The width must be less than 0.2m, so round up: the smallest sample size is $n = 20$
Solution:
Part (i):
Find the midpoint of the limits, finding $p$
$0.1129 + \dfrac{0.1771 - 0.1129}{2} = 0.145$
The midpoint is equal to the proportion of people with high-speed internet use so
$\dfrac{87}{n} = 0.145 \quad \therefore n = 600$
Part (ii):
Using the upper limit, this was calculated by:
$0.1771 = 0.145 + z \sqrt{\dfrac{pq}{n}}$
Substituting values calculated ($q = 1 - p$), find $z$
$0.0321 = z \sqrt{\dfrac{\frac{87}{600} \times \frac{513}{600}}{600}} \quad \therefore z = 2.233$
Use normal tables and find corresponding probability
$\Phi(z) = 0.9872$
Think of symmetry, the same area is chopped off from both sides of the graph so
$1 - 2(1 - 0.9872) = 0.9744$
Hence the $\alpha\%$ confidence level is 97.44%
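The confidence-interval calculations in this section can be sketched in code: finding the $z$-value for a confidence level, the smallest sample size for a given total width, and the confidence level implied by a proportion interval. This is a minimal sketch under stated assumptions: the function names are illustrative, and `statistics.NormalDist` from the Python standard library replaces the normal tables, so results can differ from table values in the last decimal place.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # N(0, 1)

def z_for_confidence(level: float) -> float:
    """Two-sided z-value, e.g. level=0.95 gives the 97.5% point."""
    return STANDARD_NORMAL.inv_cdf(1 - (1 - level) / 2)

def min_sample_size(sigma: float, total_width: float, level: float) -> int:
    """Smallest n with 2 * z * sigma / sqrt(n) strictly less than total_width."""
    z = z_for_confidence(level)
    bound = (2 * z * sigma / total_width) ** 2
    return math.floor(bound) + 1

def level_from_proportion_interval(lower: float, upper: float, successes: int) -> float:
    """Confidence level implied by a proportion interval (lower, upper)."""
    p = (lower + upper) / 2           # the midpoint is the sample proportion
    n = round(successes / p)          # sample size from r / n = p
    half_width = (upper - lower) / 2
    z = half_width / math.sqrt(p * (1 - p) / n)
    return 2 * STANDARD_NORMAL.cdf(z) - 1  # same area removed from both tails

print(round(z_for_confidence(0.99), 3))  # 2.576
print(min_sample_size(0.17, 0.2, 0.99))  # 20
level = level_from_proportion_interval(0.1129, 0.1771, 87)
```

With the inputs $\sigma = 0.17$, total width 0.2 and a 99% level, the bound on $n$ is about 19.2, so rounding up gives 20. The value of `level` comes out close to the 97.44% obtained from tables.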
To carry out a hypothesis test:
Define the null and alternative hypotheses
Decide on a significance level
Determine the critical value(s)
Calculate the test statistic
Decide on the outcome of test depending on whether value of test statistic lies in rejection/acceptance region
State the conclusion in words
The test statistic $Z$ can be used to test a hypothesis about a population mean
$z = \dfrac{\bar{x} - \mu}{\sqrt{\dfrac{\sigma^2}{n}}}$
where $\mu$ is the population mean specified by $H_0$
The critical values for some commonly used rejection regions:
Significance level | Two-tail ($\mu \neq \mu_0$) | One-tail ($\mu > \mu_0$) | One-tail ($\mu < \mu_0$)
10% | ±1.645 | 1.282 | −1.282
5% | ±1.960 | 1.645 | −1.645
2% | ±2.326 | 2.054 | −2.054
1% | ±2.576 | 2.326 | −2.326
5.3 Type I and Type II Errors
A Type I error is made when a true null hypothesis is rejected
A Type II error is made when a false null hypothesis is accepted
P(Type I error) = significance level
Calculating P(Type II error):
o Firstly, calculate the acceptance region by leaving $\bar{x}$ as a variable and equating the test statistic to the critical value for the significance level
o Next, calculate the conditional probability that $\mu$ is now $\mu'$ and $\bar{x}$ is still in the acceptance region
P($\bar{x}$ is in acceptance region | $\mu = \mu'$)
Calculate this by substituting the limit of the acceptance region as $\bar{x}$ (calculated previously) and the new, given $\mu'$ into the test statistic equation and find the probability
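The test procedure and the Type II error calculation can be sketched for a one-tail test of $H_0: \mu = \mu_0$ against $H_1: \mu > \mu_0$. All numbers below ($\mu_0 = 50$, $\sigma = 4$, $n = 16$, $\bar{x} = 52.1$, $\mu' = 52$) are hypothetical values chosen for illustration, not taken from the notes, and the function names are likewise assumptions of this sketch.

```python
import math
from statistics import NormalDist

def z_statistic(x_bar: float, mu0: float, sigma: float, n: int) -> float:
    """Test statistic z = (x_bar - mu0) / sqrt(sigma^2 / n)."""
    return (x_bar - mu0) / math.sqrt(sigma**2 / n)

def upper_tail_test(x_bar, mu0, sigma, n, alpha):
    """One-tail test of H0: mu = mu0 against H1: mu > mu0.

    Returns (z, critical value, whether H0 is rejected).
    """
    z = z_statistic(x_bar, mu0, sigma, n)
    critical = NormalDist().inv_cdf(1 - alpha)
    return z, critical, z > critical

def type_ii_probability(mu0, mu_true, sigma, n, alpha):
    """P(x_bar in acceptance region | mu = mu_true) for the upper-tail test."""
    standard_error = sigma / math.sqrt(n)
    # Boundary of the acceptance region: x_bar where z equals the critical value
    boundary = mu0 + NormalDist().inv_cdf(1 - alpha) * standard_error
    # Probability of staying below that boundary when the true mean is mu_true
    return NormalDist(mu_true, standard_error).cdf(boundary)

# Hypothetical example: H0: mu = 50, sigma = 4, n = 16, 5% significance level
z, critical, reject = upper_tail_test(52.1, 50, 4, 16, 0.05)
beta = type_ii_probability(50, 52, 4, 16, 0.05)
print(round(critical, 3), reject)  # 1.645 True
```

Here P(Type I error) is the chosen 5%, while `beta` is the probability of a Type II error if the true mean is 52. Increasing `n` in this sketch shows `beta` shrinking while the significance level stays fixed.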
© Copyright 2019, 2017, 2015 by ZNotes
First edition © 2015, by Emir Demirhan, Saif Asmi & Zubair Junjunia for the 2015 syllabus
Second edition © 2017, reformatted by Zubair Junjunia
Third edition © 2019, reformatted by ZNotes Team for the 2019 syllabus
... written permission of the copyright owner. Under no conditions may this document be distributed under the name of false author(s) or sold for financial gain; the document is solely meant for educational purposes and it is to remain a property available to all at no cost. It is currently freely available from the website www.znotes.org
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.