Probability and Probability Distributions: DR Martin C. Simuunza Dept of Disease Control School of Veterinary Medicine
1.1 Relative frequency and probability
• The event A occurs r times during m experiments or trials.
• The relative frequency h(A) is then defined by the formula
h(A) = r/m
Example 1.1
• Toss a coin m = 10 times (10 experiments) and define the event A as
getting “head”.
• Assume that “head” occurred r = 8 times. The relative frequency will
then be h(A) = 8/10 = 0.8
Definition
The probability of an event is the event’s long-run
relative frequency in repeated experiments (trials)
under the same conditions
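The link between relative frequency and long-run probability can be illustrated with a short simulation. This is only a sketch: the function names, the fair-coin assumption (P(head) = 0.5) and the fixed seed are choices made here for illustration, not part of the lecture.

```python
import random

def relative_frequency(r, m):
    """Relative frequency h(A) = r/m of an event seen r times in m trials."""
    return r / m

def simulate_heads(m, seed=1):
    """Toss a simulated fair coin m times; return the relative frequency of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    r = sum(rng.random() < 0.5 for _ in range(m))
    return relative_frequency(r, m)

# Example 1.1: 8 heads in 10 tosses
print(relative_frequency(8, 10))  # 0.8

# With more tosses the relative frequency settles near the assumed 0.5
for m in (10, 100, 10_000, 100_000):
    print(m, simulate_heads(m))
```

With few tosses the relative frequency can be far from 0.5, as in Example 1.1; it stabilises only in the long run.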
Definition
An experiment is the process of collecting data
relevant to a phenomenon that exhibits variation in
its outcomes
Definition
The sample space is an exhaustive list of all the
possible outcomes of an experiment.
Example 1.3
Experiment: A blood sample is examined for blood
grouping.
Sample space: {O, A, B, AB}
Example 1.4
Experiment: Throwing a die
Sample space: {1, 2, 3, 4, 5, 6}
Definition
A sample event is defined as each distinct outcome
of an experiment
Example 1.3 (cont.)
The outcome of blood examination with regard to
blood group may be O, A, B, or AB. All these
outcomes or results are sample events.
The sample spaces described in examples 1.3 and 1.4 are both discrete
sample spaces.
Example 1.5
Experiment: The age of a cow is to be determined.
Sample space: {All real numbers between 0 and 20}
Example 1.6
Experiment: Examine one cow
Event: Clinical mastitis
3. If A is an event that can never occur, then P(A) = 0. The event A will
not consist of any sample events in the sample space.
5. P(sample space) = 1
Permutation and combination
As long as the sample space is simple and it is easy to find “the number
of elements in A”, we can directly use the rule:
P(A) = (No. of elements in A) / (No. of elements in the sample space)
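Rule of permutations
The number of different orderings that can be formed with r objects
selected from a group of n distinct objects is denoted by $P_r^n$, where
$P_r^n = n(n-1)(n-2)\cdots(n-r+1)$
For example, with n = 10:
r = 1: $P_1^{10} = 10$
r = 2: $P_2^{10} = 10 \cdot 9 = 90$
r = 3: $P_3^{10} = 10 \cdot 9 \cdot 8 = 720$
r = 4: $P_4^{10} = 10 \cdot 9 \cdot 8 \cdot 7 = 5040$
r = 5: $P_5^{10} = 10 \cdot 9 \cdot 8 \cdot 7 \cdot 6 = 30240$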
The number of different orderings of n objects in a group of n is
$P_n^n = n(n-1)(n-2)\cdots 2 \cdot 1 = n!$ (“n factorial”)
Example 1.9
How many different orderings can be formed from all n objects in a group?
n = 2 objects A and B:
AB, BA
$P_2^2 = 2 \cdot 1 = 2$
n = 3 objects A, B, C:
ABC, ACB, BAC, BCA, CAB, CBA
$P_3^3 = 3 \cdot 2 \cdot 1 = 6$
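The permutation counts can be checked numerically. The sketch below assumes Python 3.8 or later (for `math.perm`); the helper name `n_permutations` is introduced here for illustration only.

```python
import math
from itertools import permutations

def n_permutations(n, r):
    """Number of orderings of r objects chosen from n distinct objects."""
    return math.perm(n, r)  # n * (n-1) * ... * (n-r+1)

# Orderings of r objects out of n = 10
for r in range(1, 6):
    print(r, n_permutations(10, r))

# Example 1.9: list every ordering of the three objects A, B, C
orderings = ["".join(p) for p in permutations("ABC")]
print(orderings)       # ['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']
print(len(orderings))  # 6, which equals 3!
```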
The permutation rule deals with enumerating all arrangements when
choosing r objects out of n
In most situations we are interested only in the number of possible
choices of a group of r objects out of n, without looking at the order.
Example 1.10
We have a group of n = 3 objects (A, B, C) and we want to select 2.
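If the ordering is of interest, we can perform this in
$P_2^3 = 3 \cdot 2 = 6$ ways.
If the ordering is not of interest, the number of ways of selecting r objects out of n is
$\binom{n}{r} = \frac{n!}{r!\,(n-r)!} = \frac{P_r^n}{r!}$
which gives $\binom{3}{2} = 3$ ways here.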
Example 1.11
In how many different ways can r objects be selected from a group of n =
10?
r = 1: $\binom{10}{1} = \frac{10!}{1!\,9!} = 10$ (same as for permutation)
r = 2: $\binom{10}{2} = \frac{10!}{2!\,8!} = 45$
r = 3: $\binom{10}{3} = \frac{10!}{3!\,7!} = 120$
r = 4: $\binom{10}{4} = \frac{10!}{4!\,6!} = 210$
r = 5: $\binom{10}{5} = \frac{10!}{5!\,5!} = 252$
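The combination counts, and their relation to the permutation counts, can be verified in a few lines. This is a sketch assuming Python 3.8 or later (for `math.comb` and `math.perm`); the helper name `n_combinations` is hypothetical.

```python
import math
from itertools import combinations

def n_combinations(n, r):
    """Number of ways to choose r objects out of n when order is ignored."""
    return math.comb(n, r)  # n! / (r! * (n-r)!)

# Example 1.11: selections of r objects out of n = 10
for r in range(1, 6):
    # Each unordered selection corresponds to r! orderings
    assert n_combinations(10, r) == math.perm(10, r) // math.factorial(r)
    print(r, n_combinations(10, r))

# Example 1.10: choosing 2 of the objects A, B, C without regard to order
pairs = ["".join(c) for c in combinations("ABC", 2)]
print(pairs)  # ['AB', 'AC', 'BC']
```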
Example 1.12
In a herd of n = 10 cows in which four are sub-clinically ill, we are going
to randomly select two cows. This has to be done in such a way that
we are not able to investigate the first cow before selecting the
second.
$\binom{4}{2}\binom{6}{0} = \frac{4!}{2!\,2!} \cdot \frac{6!}{0!\,6!} = 6$
The number of sample events in the sample space is equal to the
number of possible ways of selecting r = 2 cows from the total
herd of n = 10. Consequently the sample space
consists of
$\binom{10}{2} = 45$ sample events.
In accordance with the rule
P(A) = (No. of elements in A) / (No. of elements in the sample space)
No. of elements in A: $\binom{4}{1}\binom{6}{1} = 24$
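The counting rule in Example 1.12 can be applied to every possible number of ill cows in the selection. The sketch below is an illustration only: the constant and function names are introduced here, and the probabilities are obtained simply by dividing the counts from the example by the size of the sample space.

```python
from fractions import Fraction
from math import comb

ILL, HEALTHY, DRAWN = 4, 6, 2  # herd composition and selection size from Example 1.12

def prob_ill(k):
    """P(exactly k ill cows among the DRAWN selected cows), by counting sample events."""
    favourable = comb(ILL, k) * comb(HEALTHY, DRAWN - k)
    total = comb(ILL + HEALTHY, DRAWN)  # 45 sample events
    return Fraction(favourable, total)

for k in range(DRAWN + 1):
    print(k, prob_ill(k))
# 0 1/3
# 1 8/15
# 2 2/15
```

The three probabilities add up to 1, as they must, because the three events are disjoint and together cover the whole sample space.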
[Figure: event A within the sample space]
Additive rule
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
If the two events are disjoint, P(A ∩ B) = 0.
P(Ā) = 1 – P(A)
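The additive rule and the complement rule can be checked by enumeration on the die sample space of Example 1.4. The two events used below (A = even number, B = number greater than 3) are hypothetical choices made for this illustration and assume a fair die.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # throwing a die (Example 1.4)

def prob(event):
    """P(event) for equally likely outcomes: elements in event / elements in sample space."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}  # hypothetical event: even number
B = {4, 5, 6}  # hypothetical event: number greater than 3

# Additive rule: P(A u B) = P(A) + P(B) - P(A n B)
lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)
print(lhs, rhs)  # 2/3 2/3

# Complement rule: P(not A) = 1 - P(A)
not_A = sample_space - A
print(prob(not_A), 1 - prob(A))  # 1/2 1/2
```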
Example
We found that P(event) = P(A ∪ AB) = 0.46
P(not event) = P(O ∪ B) = 1 – P(event) = 1 – 0.46 = 0.54
The probability of an event A must often be modified
after information is obtained as to whether or not a
related event B has taken place.
The modified probability is written P(A|B), the probability of A given B.
Example 1.14
There are 10 mice, of which 4 are white and 6 are grey. Two
mice are randomly selected. What is the probability that
the second selected mouse is white when the first was grey?
Let A be the event that the second mouse is white and B the event
that the first mouse is grey. Then we have
P(A) = 4/10 and P(B) = 6/10
But
P(A|B) = 4/9
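The values in Example 1.14 can be reproduced by listing every ordered pair of mice drawn without replacement. In this sketch A is read as "second mouse white" and B as "first mouse grey"; that reading, and all names in the code, are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations

mice = ["white"] * 4 + ["grey"] * 6
# Every ordered pair of two distinct mice: 10 * 9 = 90 equally likely draws
draws = list(permutations(range(len(mice)), 2))

def second_white(draw):
    return mice[draw[1]] == "white"

def first_grey(draw):
    return mice[draw[0]] == "grey"

def prob(event, given=None):
    """P(event), or P(event | given) when a conditioning event is supplied."""
    pool = draws if given is None else [d for d in draws if given(d)]
    return Fraction(sum(1 for d in pool if event(d)), len(pool))

print(prob(second_white))                    # P(A)   = 2/5 (= 4/10)
print(prob(first_grey))                      # P(B)   = 3/5 (= 6/10)
print(prob(second_white, given=first_grey))  # P(A|B) = 4/9
```

Restricting the pool of draws to those in which B occurred is exactly what conditioning on B means.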
[Figure: events A and B]
What is P(A ∩ B)?
By looking at the figure above, it is easy to convince yourself that
P(A|B) = P(A)
P(B|A) = P(B)
Assume we now don't replace the mouse after each selection. Then we
have
P(B) = ½ and P(B|A) = 1
• Definition
• A random variable is a numerically valued function defined
on a sample space.
NOT ALL EVENTS CAN BE RESTRUCTURED INTO A RANDOM
VARIABLE
EVENTS
• Gender (sex): Male, Female (nominal)
• Blood group: O, A, B, AB (nominal)
• Smoking: Non-smoker (1), Ex-smoker (2), Light smoker (3), Heavy smoker (4) (ordinal)
NUMERICAL VARIABLES
RANDOM VARIABLES
Example 2.2:
Experiment: A herd of 8 elephants is to be
investigated with regard to a given
pattern of symptoms.
Variable: The number of elephants with the given
pattern of symptoms.
Example 2.3:
Experiment: A herd of cows is investigated with regard to
body temperature.
Variable: Body temperature
Sample space: {all real numbers between 36
and 42}
Example 2.4:
Experiment: All the lambs in a sheep herd are to be treated with
anti-parasite drugs in the spring and the body weight, before
and after the summer, is to be recorded.
Variable: Increase in body weight
Sample space: {all real numbers larger than 0
and less than or equal to 40 kg}
Example 2.5:
Experiment: A total of n = 98 patients with rheumatoid arthritis (RA) are
investigated and the patients report the degree of pain
on a 10 cm Visual Analogue Scale (VAS).
Variable: Degree of pain
Sample space: {all real numbers between
0 and 10}
2.2 Describing Data
HOW TO PRESENT DATA?
1. Descriptive statistics
- Table and figures
2. Statistical analysis
CATEGORICAL VARIABLES
Variable: Frequency of perinatal mortality in
England and Wales in 1979 by day.
Table: No. of deaths per 1000 births
Wednesday
Thursday
Saturday
Tuesday
Monday
Sunday
Friday
13.4 14.3 13.7 13.9 14.2 16.1 17.0
No. of
deaths
CONTINUOUS RANDOM VARIABLE
Variable: Age and PImax in 25 patients with cystic fibrosis (O’Neill
et al. 1983)
Subject Age (years) PImax (cm H2O)
1 7 80
2 7 85
3 8 110
4 8 95
5 8 95
6 9 100
7 11 45
8 12 95
9 12 130
10 13 75
11 13 80
12 14 70
13 14 80
14 15 100
15 16 120
16 17 110
17 17 125
18 17 75
19 17 100
20 19 40
21 19 75
22 20 110
23 23 150
24 23 75
25 23 95
Table: Describing averages
Lung function (PImax, cm H2O)
Mean 92.6
Median 95
Mode 95
Range 40-150
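The summary figures can be recomputed from the 25 lung function values. One assumption is made in this sketch: the value for subject 21 is taken as 75 cm H2O, which is what the reported mean of 92.6 and the reported range of 40-150 imply. The variable names are illustrative.

```python
from statistics import mean, median

# Lung function values for subjects 1-25, in table order (cm H2O).
# Subject 21 is taken as 75 (assumption, see the text above).
pimax = [80, 85, 110, 95, 95, 100, 45, 95, 130, 75, 80, 70, 80,
         100, 120, 110, 125, 75, 100, 40, 75, 110, 150, 75, 95]

print("n      =", len(pimax))              # 25
print("mean   =", round(mean(pimax), 1))   # 92.6
print("median =", median(pimax))           # 95
print("range  =", min(pimax), "-", max(pimax))  # 40 - 150
```

The median is the 13th of the 25 sorted values, so it is unaffected by the extreme observations, whereas the mean uses every value.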
Variable: Serum IgM in 298 children aged from 6 months
to 6 years
IgM (g/l) Number of children
0.1 3
0.2 7
0.3 19
0.4 27
0.5 32
0.6 35
0.7 38
0.8 38
0.9 22
1.0 16
1.1 16
1.2 6
1.3 7
1.4 9
1.5 6
1.6 2
1.7 3
1.8 3
2.0 3
2.1 2
2.2 1
2.5 1
2.7 1
4.5 1
STEM-AND-LEAF PLOT
Cumulative Frequency
IgM (g/l) | Frequency | Relative frequency (%) | Cumulative frequency | Cumulative relative frequency (%)
0.1 3 1.0 3 1.0
0.2 7 2.3 10 3.4
0.3 19 6.4 29 9.7
0.4 27 9.1 56 18.8
0.5 32 10.7 88 29.5
0.6 35 11.7 123 41.3
0.7 38 12.8 161 54.0
0.8 38 12.8 199 66.8
0.9 22 7.4 221 74.2
1.0 16 5.4 237 79.5
1.1 16 5.4 253 84.9
1.2 6 2.0 259 86.9
1.3 7 2.3 266 89.3
1.4 9 3.0 275 92.3
1.5 6 2.0 281 94.3
1.6 2 0.7 283 95.0
1.7 3 1.0 286 96.0
1.8 3 1.0 289 97.0
2.0 3 1.0 292 98.0
2.1 2 0.7 294 98.7
2.2 1 0.3 295 99.0
2.5 1 0.3 296 99.3
2.7 1 0.3 297 99.7
4.5 1 0.3 298 100.0
Total 298 99.9
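The cumulative columns follow mechanically from the frequencies. The sketch below rebuilds them from the IgM counts; the variable names are illustrative, and percentages are rounded to one decimal as in the table.

```python
from itertools import accumulate

igm = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2,
       1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 2.0, 2.1, 2.2, 2.5, 2.7, 4.5]
freq = [3, 7, 19, 27, 32, 35, 38, 38, 22, 16, 16, 6,
        7, 9, 6, 2, 3, 3, 3, 2, 1, 1, 1, 1]

total = sum(freq)                   # number of children
cum_freq = list(accumulate(freq))   # running total of the frequencies
rel_freq = [round(100 * f / total, 1) for f in freq]
cum_rel_freq = [round(100 * c / total, 1) for c in cum_freq]

for row in zip(igm, freq, rel_freq, cum_freq, cum_rel_freq):
    print("{:4.1f} {:4d} {:6.1f} {:5d} {:6.1f}".format(*row))
print("Total", total)
```

Because each relative frequency is rounded separately, the rounded percentages need not add up to exactly 100, which is why the total in the table reads 99.9.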
2.3 Probability distribution for discrete variables
Each random variable has a sample space which consists of all
the possible values of the variable.
P(X=x1)+P(X=x2)+..+P(X=xn) = 1
Example 2.6
Experiment: Flipping a coin 3 times
Variable: Number of heads
Sample space: {0, 1, 2, 3}
From previously we know that the sample space written with
events is {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
• P(X=0) = 1/8
• P(X=1) = 3/8
• P(X=2) = 3/8
• P(X=3) = 1/8
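The distribution in Example 2.6 can be derived by listing all outcomes of three flips and counting heads in each. This sketch assumes a fair coin, so that the 8 outcomes are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2^3 = 8 equally likely outcomes of three flips, e.g. 'HHT'
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

# X = number of heads in an outcome; count how many outcomes give each value
counts = Counter(outcome.count("H") for outcome in outcomes)
distribution = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, p in distribution.items():
    print(f"P(X={x}) = {p}")
# P(X=0) = 1/8
# P(X=1) = 3/8
# P(X=2) = 3/8
# P(X=3) = 1/8

print(sum(distribution.values()))  # 1
```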
Example 2.7
Experiment: A herd of 8 cows is to be investigated
with regards to a given pattern of
symptoms.
Variable: The number of cows with the given
pattern.
Sample space: {0, 1, 2, 3, 4, 5, 6, 7, 8}
4.5 Properties of an estimator