Probability and Probability Distributions: Dr Martin C. Simuunza, Dept of Disease Control, School of Veterinary Medicine


1.1 Relative frequency and probability
• The event A occurs r times during m experiments or trials.
• The relative frequency h(A) is then defined by the formula
h(A) = r/m

Example 1.1
• Toss a coin m = 10 times (10 experiments) and define the event A as getting "head".
• Assume that “head” occurred r = 8 times. The relative frequency will
then be

h(head) = r/m = 8/10 = 0.8


• Example 1.2
• m = 50 persons are examined for a syndrome called
“restless legs”. The event A of interest is a given
person suffering from “restless legs”. Assume that r
= 21 of the examined persons suffered from the
syndrome. The relative frequency of “restless legs”
will then be:

h(restless legs) = 21/50 = 0.42

• The number of experiments m in both examples is small. What happens if we increase it?
Example 1.1 (cont)

The number of coin tosses is increased from m = 10 to m = 10,000.

No. of experiments   No. of "heads"   Relative frequency
        10                  8               0.800
        50                 20               0.400
       100                 46               0.460
      1000                526               0.526
    10,000               5080               0.508
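The stabilisation seen in the table can be reproduced with a short simulation (a sketch in Python; the simulated coin is assumed fair, so h(head) should settle near 0.5 rather than the 0.508 above):

```python
import random

random.seed(1)  # reproducible run

# Estimate P(head) by the relative frequency h(A) = r/m for growing m,
# mirroring the table above.
for m in (10, 50, 100, 1000, 10_000):
    r = sum(random.random() < 0.5 for _ in range(m))  # r = number of heads
    print(f"m = {m:6d}   h(head) = {r / m:.3f}")
```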
Example 1.2 (Cont.)
• Number of examined persons is increased from m = 50 to m = 4722

No. of experiments   No. with restless legs   Relative frequency
        50                    21                    0.42
       100                    23                    0.23
       200                    56                    0.28
       500                   161                    0.32
      1000                   316                    0.316
      4722                  1448                    0.307
• From these two experiments we can see that the relative frequency becomes more stable as the number of experiments increases.
• It seems that the relative frequency approaches a "true underlying frequency". This "true underlying frequency" is called the probability.

Definition
The probability of an event is the event’s long-run
relative frequency in repeated experiments (trials)
under the same conditions

We refer to the probability of an event A as P(A).


Read as “probability of A”.
Sample Space and events
• The term "experiment", when used in statistics, is not restricted to laboratory experiments, but includes:

• Any activity that results in the collection of data pertaining to a phenomenon that exhibits variation.

Definition
An experiment is the process of collecting data relevant to a phenomenon that exhibits variation in its outcomes
Definition
The sample space is an exhaustive list of all the
possible outcomes of an experiment.

Each possible outcome is represented by one and only one point in the sample space, which is usually denoted by S

Example 1.3
Experiment: A blood sample is examined for blood
grouping.
Sample space: {O, A, B, AB}
Example 1.4
Experiment: Throwing a die
Sample space: {1, 2, 3, 4, 5, 6}

In all the above examples, the sample space is finite.

Definition
A sample event is defined as each distinct outcome
of an experiment
Example 1.3 (cont.)
The outcome of blood examination with regard to
blood group may be O, A, B, or AB. All these
outcomes or results are sample events.

Example 1.4 (cont.)


The outcome of throwing one die may be 1, 2, 3, 4, 5
or 6. All these results are sample events.
Definition
A collection of elementary outcomes (sample event) characterised by
some descriptive feature is called an event.
An event is a subset of a sample space

Example 1.3 (cont.)


Sample space: {O, A, B, AB}
Events: all possible combinations of sample events

Example 1.4 (cont.)


Sample space: {1, 2, 3, 4, 5, 6}
Events: all possible combinations of sample events
Definitions
A discrete sample space is a sample space consisting of either a finite or countably infinite number of sample events (elements).

A continuous sample space is a sample space consisting of an uncountably infinite number of sample events (elements).

The sample spaces described in examples 1.3 and 1.4 are both discrete
sample spaces.

Example 1.5
Experiment: The age of a cow is to be determined.
Sample space: {All real numbers between 0 and 20}

This is a continuous sample space


1.3 Probability and events
Basic Definition

P(event) = (number of times the event occurs) / (number of times the experiment is repeated)

Example 1.6
Experiment: Examine one cow
Event: Clinical mastitis

N = 100 experiments (cows examined)
n = 24 cases of clinical mastitis

P(event) = P(clinical mastitis) = n/N = 24/100 = 0.24


• If all sample events are equally likely, this rule can be written as

P(event) = (number of sample events in the event) / (number of sample events in the sample space)

Example 1.3 (cont.)


Sample space: {O, A, B, AB}
Sample events: O, A, B or AB

We are interested in blood group A or AB

P(event) = (number of sample events in the event) / (number of sample events in the sample space)
= n/N = 2/4 = 0.5
However, this probability calculation assumes that each sample event is
equally likely. This is not always the case.
Example 1.7

We analyse N = 100 blood samples for blood grouping, i.e. we repeat the experiment 100 times.

The event is still {A, AB}

The results are : 43 cases with A and 3 cases with AB.

P(event) = (number of times the event occurs) / (number of repeated experiments) = (43 + 3)/100 = 0.46

This is an estimate based on the available information.
Conditions for a model of probability
for discrete sample space
The probability is a function defined on events that satisfies the following conditions:
1. For all events A, 0≤P(A)≤1

2. If A is an event that must occur, then P(A) = 1. The event A then consists of all the sample events in the sample space.

3. If A is an event that can never occur, then P(A) = 0. The event A will
not consist of any sample events in the sample space.

4. P(A) is the sum of probabilities of all sample events belonging to A.

5. P(sample space) = 1
Permutation and combination
As long as the sample space is simple and it is easy to find "the number of elements in A", we can directly use the rule:

P(A) = (number of elements in A) / (number of elements in the sample space)

If the event A is more complex, we need some new rules

Rule of permutations
The number of different orderings that can be formed with r objects selected from a group of n distinct objects is denoted by

P(n,r) = n(n−1) ··· (n−r+1)


Example 1.8:
How many different orderings can be formed with r objects selected from a group of n = 10?

r = 1:  P(10,1) = 10
r = 2:  P(10,2) = 10 · 9 = 90
r = 3:  P(10,3) = 10 · 9 · 8 = 720
r = 4:  P(10,4) = 10 · 9 · 8 · 7 = 5040
r = 5:  P(10,5) = 10 · 9 · 8 · 7 · 6 = 30240
The number of different orderings of all n objects in a group of n is
P(n,n) = n(n−1)(n−2) ··· 2 · 1
P(n,n) = n! = "n factorial"
Example 1.9
How many different orderings can be formed with n objects in a group of n?

n = 2 objects, A and B:
A, B
B, A          P(2,2) = 2 · 1 = 2

n = 3 objects, A, B and C:
ABC
ACB
BAC
BCA           P(3,3) = 3 · 2 · 1 = 6
CAB
CBA
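The counts above can be checked with Python's math.perm and itertools.permutations (a short sketch):

```python
from itertools import permutations
from math import perm

# P(n, r) = n(n-1)...(n-r+1): the number of orderings of r objects out of n.
for r in range(1, 6):
    print(f"P(10,{r}) = {perm(10, r)}")   # 10, 90, 720, 5040, 30240

# Example 1.9: list all orderings of {A, B, C}; there are 3! = 6 of them.
orders = list(permutations("ABC"))
print(len(orders))  # 6
```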
The permutation rule deals with enumerating all arrangements when
choosing r objects out of n
In most situations we are interested only in the number of possible
choices of a group of r objects out of n, without looking at the order.

Example 1.10
We have a group of n = 3 objects (A, B, C) and we want to select 2.
If the ordering is of interest, we can perform this in
P(3,2) = 3 · 2 = 6 ways

[AB, BA, AC, CA, BC, CB]

If the ordering is of no interest, we can perform this in [AB, AC, BC] = 3 ways

This is called Combination


Rule of Combinations
The number of possible collections of r objects chosen from a group of n distinct objects is denoted by

C(n,r) = n! / (r!(n−r)!) = P(n,r) / r!

Example 1.11
How many different ways can r objects be selected from a group of n = 10?

r = 1:  C(10,1) = 10!/(1!·9!) = 10   (same as for permutation)
r = 2:  C(10,2) = 10!/(2!·8!) = 45
r = 3:  C(10,3) = 10!/(3!·7!) = 120
r = 4:  C(10,4) = 10!/(4!·6!) = 210
r = 5:  C(10,5) = 10!/(5!·5!) = 252
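Python's math.comb gives the same values, and the identity C(n,r) = P(n,r)/r! can be verified directly (a short sketch):

```python
from math import comb, factorial, perm

# C(n, r) = n! / (r!(n-r)!) = P(n, r) / r!: the number of unordered selections.
for r in range(1, 6):
    assert comb(10, r) == perm(10, r) // factorial(r)
    print(f"C(10,{r}) = {comb(10, r)}")   # 10, 45, 120, 210, 252
```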
Example 1.12
In a herd of n = 10 cows in which four are sub-clinically ill, we are going
to randomly select two cows. This has to be done in such a way that
we are not able to investigate the first cow before selecting the
second.

a) What is the probability of selecting two sub-clinically ill cows?


b) What is the probability of selecting only one cow with sub-clinical
disease?
a) The event A in this case is to select two sick cows. The number of sample events in A is equal to the number of possible collections of r = 2 cows chosen from the group of 4 sick cows, combined with r = 0 cows chosen from the group of 6 healthy cows:

C(4,2) · C(6,0) = (4!/(2!·2!)) · (6!/(0!·6!)) = 6 · 1 = 6

The number of sample events in the sample space is equal to the number of possible ways of collecting r = 2 cows chosen from the total herd of n = 10. Consequently, the sample space consists of

C(10,2) = 45 sample events
In accordance with the rule

P(A) = (number of elements in A) / (number of elements in the sample space)

P(A) = 6/45 ≈ 0.13

b) This question can be solved in the same way.
Event A = selecting one sick cow and one healthy cow.

Number of elements in A: C(4,1) · C(6,1) = 4 · 6 = 24

The number of elements in the sample space is the same as in (a).

P(A) = 24/45 ≈ 0.53
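Both answers can be checked in a few lines (a sketch using math.comb):

```python
from math import comb

# Example 1.12: 10 cows, 4 sub-clinically ill, 6 healthy; select 2 at random.
sample_space = comb(10, 2)                           # 45 ways to pick 2 of 10

p_two_ill = comb(4, 2) * comb(6, 0) / sample_space   # a) both selected cows ill
p_one_ill = comb(4, 1) * comb(6, 1) / sample_space   # b) exactly one cow ill

print(round(p_two_ill, 2), round(p_one_ill, 2))      # 0.13 0.53
```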


Conditional probability

[Figure: an event A within a sample space; two events A and B that either (1) overlap or (2) are disjoint]
Two events are called disjoint or mutually exclusive events if
they have no sample events in common.

The intersection of two events A and B is defined as the set of sample events that belong simultaneously to both A and B. The intersection is denoted by A∩B.

The union of two events A and B is defined as the set of sample events in A, in B, or in both. The union is denoted by A∪B.

The complement of an event A is the set of sample events that are not in A. The complement is denoted by Ā.

Additive rule
P(AUB) = P(A) + P(B) – P(A∩B)
If the two events are disjoint P(A∩B) = 0

Then P(AUB) = P(A) + P(B)

In general, if A1, A2, ..., Ar are disjoint events, then

P(A1∪A2∪A3∪...∪Ar) = P(A1) + P(A2) + P(A3) + ... + P(Ar)
Example 1.13
Experiment: One blood sample is analysed for
blood grouping
Sample space: {O, A, B, AB}

Event: {A, AB}

The sample events are mutually exclusive and

P(O) = 0.46, P(A) = 0.43, P(B) = 0.08, P(AB) = 0.03

In accordance with the additive rule

P(event) = P(A∪AB) = P(A) + P(AB) = 0.43 + 0.03 = 0.46

The complement law

P(A) = 1 − P(Ā)
P(Ā) = 1 − P(A)

Example
We found that P(event) = P(A∪AB) = 0.46.
The complement: P(O∪B) = 1 − P(A∪AB) = 1 − 0.46 = 0.54
The probability of an event A must often be modified
after information is obtained as to whether or not a
related event B has taken place.

The revised probability of A when it is known that B has occurred is called

The Conditional probability of A, given B

P(A|B)
Example 1.14
There are 10 mice, of which 4 are white and 6 are grey. Two mice are randomly selected. What is the probability that the second selected mouse is white when the first was grey?

Let A denote the event that the second selected mouse is white and let B be the event that the first was grey.

Then we have
P(A) = 4/10 and P(B) = 6/10

But
P(A|B) = 4/9
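The conditional probability P(A|B) = 4/9 can also be obtained by brute-force enumeration of all ordered draws without replacement (a sketch; representing the mice as a string of W/G labels is an illustrative choice):

```python
from itertools import permutations
from fractions import Fraction

# Example 1.14: 4 white (W) and 6 grey (G) mice; draw two without replacement.
mice = "WWWWGGGGGG"
draws = list(permutations(range(10), 2))      # all ordered pairs of distinct mice

b = [d for d in draws if mice[d[0]] == "G"]           # B: first mouse grey
a_and_b = [d for d in b if mice[d[1]] == "W"]         # ... and second mouse white

print(Fraction(len(a_and_b), len(b)))         # P(A|B) = 4/9
```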
What is P(A∩B)?
It is easy to convince yourself that

P(A∩B) = P(A) · P(B|A)

or

P(A∩B) = P(B) · P(A|B)

This is the multiplicative law


Independent events
Two events A and B are said to be independent if:

P(A|B) = P(A)

P(B|A) = P(B)

P(A∩B) = P(A) . P(B)


Example 1.15
Assume we have two mice of which one is grey and the other is white.
Let A denote the event that the first selected mouse is white and B the
event that the second selected mouse is grey.
Assume further that we replace the mouse after each selection. Then
we have

P(A) = ½ & P(A|B) = ½

The two events A and B are independent.

Assume now that we don't replace the mouse after each selection. Then we have
P(B) = ½ and P(B|A) = 1

The two events are dependent.


Example 1.16
(From Br. Med. J. 1985; Telstad W. & Larsen S.)

                              Cramps
                         No      Yes     Total
Restless legs    No     2874     400     3274
                 Yes     675     773     1448
Total                   3549    1173     4722

Let A denote the event of suffering from cramps in the legs and B of suffering from restless legs.

P(B) = 1448/4722 = 0.307
P(B|A) = 773/1173 = 0.659
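These probabilities follow directly from the 2×2 table; a short sketch (encoding the table as a dictionary is an illustrative choice):

```python
# Example 1.16: restless legs (B) vs cramps (A) in 4722 persons.
# Keys: (restless legs, cramps) -> count, taken from the table above.
table = {("no", "no"): 2874, ("no", "yes"): 400,
         ("yes", "no"): 675, ("yes", "yes"): 773}

n = sum(table.values())                                    # 4722 persons
p_b = (table[("yes", "no")] + table[("yes", "yes")]) / n   # P(restless legs)
p_b_given_a = table[("yes", "yes")] / (table[("no", "yes")] + table[("yes", "yes")])

print(round(p_b, 3), round(p_b_given_a, 3))                # 0.307 0.659
```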
This strongly indicates that the two events "restless legs" and "cramps" are dependent.
Bayes' Theorem
• Bayes' Theorem is a result that allows new information
to be used to update the conditional probability of an
event.
• Using the multiplication rule gives Bayes' Theorem in its simplest form:

P(A|B) = P(A∩B)/P(B) = P(B|A)·P(A)/P(B)

Writing the denominator with the law of total probability:

P(A|B) = P(B|A)·P(A) / [P(B|A)·P(A) + P(B|Ā)·P(Ā)]
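A sketch of this full form with hypothetical diagnostic-test numbers (the prevalence and test accuracies below are illustrative assumptions, not taken from the lecture):

```python
# Hypothetical screening test: A = diseased, B = test positive.
p_a = 0.10               # P(A): assumed prevalence of disease
p_b_given_a = 0.90       # P(B|A): positive test given diseased (sensitivity)
p_b_given_not_a = 0.05   # P(B|not A): false-positive rate

# Denominator: total probability of a positive test, P(B).
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
p_a_given_b = p_b_given_a * p_a / p_b     # updated probability of disease

print(round(p_a_given_b, 3))              # 0.667
```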
2. RANDOM VARIABLES AND
PROBABILITY DISTRIBUTION
• Events and random variables
• Describing data
• Probability distribution for discrete variables
• Probability distribution for continuous variables
• Parameters describing the probability distribution
• Expectation; The theoretical mean
• Variance and standard deviation; A measure of
dispersion
• Skewness and kurtosis
• Joint distributions; Covariance and correlation
2.1 Events and Random Variables
• Assume that we carry out the experiments of flipping a coin
three times. The sample space of the event will then be:

• {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}

• The sample space described by sample events is a description by attributes rather than by numbers. Instead of reporting the results as sample events, we can report them in numbers like

• "number of tails" & "number of heads"


• If we change from events and report the results as "X = number of heads", the sample space changes to

• {0, 1, 2, 3}

• And the sample event has changed to a random variable.

• Definition
• A random variable is a numerical valued function defined
on a sample space.
NOT ALL EVENTS CAN BE RESTRUCTURED TO A RANDOM VARIABLE

• We have the following types of "variable":

• Categorical variables → events
  • Nominal (without natural ranking)
  • Ordinal (with natural ranking)

• Numerical variables → random variables
  • Discrete variables
  • Continuous variables

CATEGORICAL VARIABLES → EVENTS

• Gender (sex)
  Male
  Female            — nominal

• Blood group
  O
  A                 — nominal
  B
  AB

• Smoking
  Non-smoker (1)
  Ex-smoker (2)
  Light smoker (3)  — ordinal
  Heavy smoker (4)
NUMERICAL VARIABLES → RANDOM VARIABLES

1. Discrete random variables

A discrete random variable is a variable which can result only in a certain, countable number of numerical values.

The sample space of a discrete random variable consists of a countable number of numerical values.
• Example 2.1:
Experiment: Flipping a coin 10 times
Variable: Number of heads

Sample space: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

Example 2.2:
Experiment: A herd of 8 elephants is to be investigated with regard to a given pattern of symptoms.
Variable: The number of elephants with the given pattern of symptoms.

Sample space: {0, 1, 2, 3, 4, 5, 6, 7, 8}


2. Continuous random variables
• A continuous random variable is a variable which
theoretically can result in an unlimited number of
numerical values within a limited or unlimited interval.

• The sample space of a continuous random variable is a limited or unlimited interval consisting of an unlimited number of numerical values.

Example 2.3:
Experiment: A herd of cows is investigated with regard to
body temperature.
Variable: Body temperature
Sample space: {all real numbers between 36 and 42}
Example 2.4:
Experiment: All the lambs in a sheep herd are to be treated with anti-
parasite drugs in the spring and the body weight, before
and after the summer is to be recorded.
Variable: Increase in body weight
Sample space: {all real numbers larger than 0 and less than or equal to 40 kg}

Example 2.5:
Experiment: A total of n = 98 patients with Rheumatoid Arthritis (RA) are investigated and the patients report the degree of pain on a 10 cm Visual Analogue Scale (VAS).
Variable: Degree of pain
Sample space: {all real numbers between 0 and 10}
2.2 Describing Data
HOW TO PRESENT DATA?

1. Descriptive statistics
- Table and figures

2. Statistical analysis
CATEGORICAL VARIABLES
Variable: Frequency of perinatal mortality in England and Wales in 1979 by day of the week.

[Table: no. of deaths per 1000 births for each day of the week; the seven daily rates were 13.4, 14.3, 13.7, 13.9, 14.2, 16.1 and 17.0]
CONTINUOUS RANDOM VARIABLE
Variable: Age and PImax in 25 patients with cystic fibrosis (O'Neill et al. 1983)

Subject   Age (years)   PImax (cm H2O)
   1           7              80
   2           7              85
   3           8             110
   4           8              95
   5           8              95
   6           9             100
   7          11              45
   8          12              95
   9          12             130
  10          13              75
  11          13              80
  12          14              70
  13          14              80
  14          15             100
  15          16             120
  16          17             110
  17          17             125
  18          17              75
  19          17             100
  20          19              40
  21          19              75
  22          20             110
  23          23             150
  24          23              75
  25          23              95
Table: Describing averages

Lung function
(Pl max/cm H2O)
Mean 92.6
Median 95
Mode 95
Range 40-150
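The averages in the table can be recomputed from the raw PImax values (a sketch using Python's statistics module; subject 21's PImax is taken as 75 cm H2O, the value consistent with the reported mean of 92.6 and range 40-150):

```python
from statistics import mean, median, multimode

# PImax values (cm H2O) for the 25 cystic fibrosis patients in the table above.
pimax = [80, 85, 110, 95, 95, 100, 45, 95, 130, 75, 80, 70, 80,
         100, 120, 110, 125, 75, 100, 40, 75, 110, 150, 75, 95]

print("Mean:  ", mean(pimax))                    # 92.6
print("Median:", median(pimax))                  # 95
print("Mode:  ", multimode(pimax))               # most frequent value(s)
print("Range: ", f"{min(pimax)}-{max(pimax)}")   # 40-150
```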
Variable: Serum IgM in 298 children aged from 6 months to 6 years

IgM (g/l)   Number of children
0.1 3
0.2 7
0.3 19
0.4 27
0.5 32
0.6 35
0.7 38
0.8 38
0.9 22
1.0 16
1.1 16
1.2 6
1.3 7
1.4 9
1.5 6
1.6 2
1.7 3
1.8 3
2.0 3
2.1 2
2.2 1
2.5 1
2.7 1
4.5 1
STEM-AND-LEAF PLOT

[Figure: the IgM frequency table above displayed as a stem-and-leaf plot]
Cumulative Frequency

IgM (g/l)   Frequency   Relative Frequency %   Cumulative Frequency   Cumulative Relative Frequency %
0.1 3 1.0 3 1.0
0.2 7 2.3 10 3.4
0.3 19 6.4 29 9.7
0.4 27 9.1 56 18.8
0.5 32 10.7 88 29.5
0.6 35 11.7 123 41.3
0.7 38 12.8 161 54.0
0.8 38 12.8 199 66.8
0.9 22 7.4 221 74.2
1.0 16 5.4 237 79.5
1.1 16 5.4 253 84.9
1.2 6 2.0 259 86.9
1.3 7 2.3 266 89.3
1.4 9 3.0 275 92.3
1.5 6 2.0 281 94.3
1.6 2 0.7 283 95.0
1.7 3 1.0 286 96.0
1.8 3 1.0 289 97.0
2.0 3 1.0 292 98.0
2.1 2 0.7 294 98.7
2.2 1 0.3 295 99.0
2.5 1 0.3 296 99.3
2.7 1 0.3 297 99.7
4.5 1 0.3 298 100.0
Total 298 99.9
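The cumulative columns can be rebuilt from the frequency column alone (a short sketch):

```python
# IgM frequency table above: (IgM in g/l, number of children).
igm = [(0.1, 3), (0.2, 7), (0.3, 19), (0.4, 27), (0.5, 32), (0.6, 35),
       (0.7, 38), (0.8, 38), (0.9, 22), (1.0, 16), (1.1, 16), (1.2, 6),
       (1.3, 7), (1.4, 9), (1.5, 6), (1.6, 2), (1.7, 3), (1.8, 3),
       (2.0, 3), (2.1, 2), (2.2, 1), (2.5, 1), (2.7, 1), (4.5, 1)]

total = sum(f for _, f in igm)   # 298 children
cum = 0
for value, freq in igm:
    cum += freq                  # running (cumulative) frequency
    print(f"{value:4.1f}  {freq:3d}  {100*freq/total:5.1f}%"
          f"  {cum:3d}  {100*cum/total:5.1f}%")
```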
2.3 Probability distribution for
discrete variables
Each random variable has a sample space which consists of all the possible values of the variable.

Let S denote the sample space and X the random variable.

When X is a discrete random variable, S consists of a limited number of X-values:
S = {x1, x2, x3, ..., xn}

P(X=x) = f(x) for all x ∈ S

is called the probability density: the probability distribution of the variable X over the sample space S.

The demands on the probability density are:

0 ≤ P(X=x) ≤ 1 for all values x ∈ S
P(X=x1) + P(X=x2) + ... + P(X=xn) = 1
Example 2.6
Experiment: Flipping a coin 3 times
Variable: Number of heads
Sample space: {0, 1, 2, 3}

From before we know that the sample space written with sample events is

{HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}

Then we can calculate Probability

• P(X=0) = 1/8
• P(X=1) = 3/8
• P(X=2) = 3/8
• P(X=3) = 1/8
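The four point probabilities can be confirmed by enumerating the 8 equally likely sample events (a short sketch):

```python
from itertools import product
from collections import Counter

# Example 2.6: flip a coin 3 times; X = number of heads.
outcomes = list(product("HT", repeat=3))        # the 8 sample events
x_counts = Counter(o.count("H") for o in outcomes)

for x in range(4):
    print(f"P(X={x}) = {x_counts[x]}/8")        # 1/8, 3/8, 3/8, 1/8
```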
Example 2.7
Experiment: A herd of 8 cows is to be investigated with regard to a given pattern of symptoms.
Variable: The number of cows with the given pattern.
Sample space: {0, 1, 2, 3, 4, 5, 6, 7, 8}

In this case each point probability P(X=x) is unknown and needs to be calculated. This follows later.
Assume that the calculations gave the following results:
P(X=0)=0.017 P(X=1)=0.089 P(X=2)=0.209
P(X=3)=0.279 P(X=4)=0.232 P(X=5)=0.124
P(X=6)=0.041 P(X=7)=0.008 P(X=8)=0.001
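The point probabilities listed above are consistent with a binomial model, P(X=x) = C(8,x) p^x (1−p)^(8−x) with p = 0.4 (the model itself is an assumption here; the lecture derives these probabilities later). Assuming it, they can be reproduced to within rounding:

```python
from math import comb

# Binomial point probabilities for n = 8 cows, assuming p = 0.4 per cow.
p = 0.4
for x in range(9):
    print(f"P(X={x}) = {comb(8, x) * p**x * (1 - p)**(8 - x):.3f}")
```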
Important issues with probability distributions for discrete random variables:

1) Σ P(X=x) = 1 over all x ∈ S: the area under the density curve is 1
2) P(X=x) = f(x): the point probability of each value x
3) F(x) = P(X ≤ x): the sum of f(t) over all t ≤ x
4) P(a < X ≤ b) = F(b) − F(a)

f(x) is the Probability Density or Probability Distribution

F(x) is the Cumulative Probability Distribution
Example 2.7 (cont.)
Experiment: A herd of 8 cows investigated regarding a pattern
of symptoms.
Variable: No. of cows with the symptom.
Sample space: {0, 1, 2, 3, 4, 5, 6, 7, 8}
2.4 Probability distribution for continuous variables
• When X is a continuous random variable, the sample space consists of an unlimited number of values within a limited or unlimited interval.

• The probability density of a continuous random variable, f(x), is consequently a continuous function and not of the "histogram" type as for discrete random variables.

Demands on the probability density for a continuous random variable:

1. The probability density f(x) is always non-negative.

2. The area under the probability density curve f(x) is equal to 1.




4.5 Properties of an estimator

• An estimator is defined as a function of a random variable.
• Consequently, an estimator is also a random variable:
  – it has a probability distribution
  – E(θ̂) and Var(θ̂) are defined
• An estimator θ̂ is unbiased for the parameter θ if E(θ̂) = θ, whatever the true value of θ. If this property does not hold, θ̂ is a biased estimator.

However, in some situations there exists more than one unbiased estimator. In these situations, we have to look at SD(θ̂).

Select the unbiased estimator of θ that has the smallest variance, whatever the true value of θ. If one exists, it is called the minimum variance unbiased estimator of θ.
Which of the two demands
– unbiased
– small SD
is the most important?

• In order to answer this question, think of an estimator as a gun:
  – You have two guns. One is expected to hit the target on average, even though its shots scatter relatively widely.
  – The other gun is not expected to hit the target on average, but its shots scatter little.

• The most important property is that the estimator is unbiased for the parameter θ.

E.g., the expected value of a chi-squared variable with n − 1 degrees of freedom is n − 1.
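The preference for unbiasedness can be illustrated with a simulation: dividing the sum of squared deviations by n − 1 (which makes (n−1)s²/σ² a chi-squared variable with expectation n − 1) gives an unbiased variance estimate, while dividing by n does not. The normal population with σ² = 4 below is an illustrative assumption:

```python
import random
from statistics import mean, pvariance, variance

random.seed(2)  # reproducible run

# Repeatedly estimate a known variance (sigma^2 = 4) from samples of size 5,
# once dividing by n (pvariance) and once dividing by n-1 (variance).
estimates_n, estimates_n1 = [], []
for _ in range(20_000):
    sample = [random.gauss(0, 2) for _ in range(5)]
    estimates_n.append(pvariance(sample))    # biased: averages near (n-1)/n * 4 = 3.2
    estimates_n1.append(variance(sample))    # unbiased: averages near 4.0

print(round(mean(estimates_n), 2), round(mean(estimates_n1), 2))
```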
