Probability & Probability Distributions: Ira Mutiara Anjasmara, PHD

PROBABILITY & PROBABILITY
DISTRIBUTIONS
RM185101 - Applied Statistics and Probability
Ira Mutiara Anjasmara, PhD
Department of Geomatics Engineering

Faculty of Civil, Planning, and Geo Engineering
Institut Teknologi Sepuluh Nopember
Probability
Measurements are assumed to be random variables, with each measurement representing an individual sample in a random distribution.
Probability is the numerical measure of the chance or likelihood that a particular event will occur.
A random variable is the numerical description of the outcome of an experiment.
-IM Anjasmara, 2019-

Random Variable
There are two types of random variable:

• discrete: the variable may only assume certain particular values;
• continuous: the variable can assume any real value within a certain range.
Discrete and continuous random variables are treated differently.
They are termed “random” because they are a result of chance.

Example 1
Consider an experiment that consists of flipping a coin twice. If H indicates a head and T a tail, the possible outcomes for this experiment are:
(H, H) (H, T) (T, H) (T, T)
The number of heads occurring in this experiment can be 0, 1 or 2. Therefore, the number of heads is a random variable in that it can assume values of 0, 1, or 2.

In this example the random variable is discrete.

Example 2
The mean score of Statistics students in the mid-semester test can be anything between 0% and 100%. This value is a random variable that can assume any value in the range 0 - 100.

In this example the random variable is continuous.

Discrete Probability Distributions
A probability function, f (x), gives the probability that a particular random variable will assume a particular value.
A probability distribution is a table, graph or mathematical formula that shows all possible values of the random variable, x, and the associated probability function, f (x).
The probabilities of all possible outcomes in a probability distribution sum to 1:
$$\sum_{x=-\infty}^{\infty} f(x) = 1 \qquad (1)$$
There are two types of probability distributions: discrete and continuous. Here we focus upon the discrete case.
Example 3
Tossing 2 coins
From 1 coin: p(H) = 0.5, p(T) = 0.5.
Outcomes for 2 coins: (H, H) (H, T) (T, H) (T, T)
If we let our discrete random variable, x, be the number of heads:
f (2) = p(x = 2) = 0.5 × 0.5 = 0.25
f (1) = p(x = 1) = (0.5 × 0.5) + (0.5 × 0.5) = 0.50
f (0) = p(x = 0) = p(2T ) = 0.5 × 0.5 = 0.25
Note that $\sum f(x) = 1$.
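
The two-coin distribution can also be obtained by brute-force enumeration. The following is a minimal Python sketch, assuming a fair coin as stated in the example; the names `outcomes`, `heads_counts` and `pmf` are illustrative only and are not part of the course material.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the four equally likely outcomes of two fair coin tosses.
outcomes = list(product("HT", repeat=2))

# Count how many outcomes give each number of heads.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

# f(x) = (number of outcomes with x heads) / (total number of outcomes)
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(heads_counts.items())}

for x, fx in pmf.items():
    print(f"f({x}) = {float(fx):.2f}")
print("sum of f(x) =", sum(pmf.values()))
```

Exact fractions are used so that the check that the probabilities sum to 1 is not affected by floating-point rounding.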
Example 4
Roll of one 6-sided die.

There is the same probability for each outcome:
x      1    2    3    4    5    6
f (x)  1/6  1/6  1/6  1/6  1/6  1/6
The probability function is:
$$f(x) = \frac{\text{no. of occurrences of } x}{\text{total no. of outcomes}}$$

Example 5
Probability distribution for the sum of two 6-sided dice.
It can be given as a table:
x    f (x)   Dice rolls
2    1/36    1,1
3    2/36    1,2 2,1
4    3/36    1,3 2,2 3,1
5    4/36    1,4 2,3 3,2 4,1
6    5/36    1,5 2,4 3,3 4,2 5,1
7    6/36    1,6 2,5 3,4 4,3 5,2 6,1
8    5/36    2,6 3,5 4,4 5,3 6,2
9    4/36    3,6 4,5 5,4 6,3
10   3/36    4,6 5,5 6,4
11   2/36    5,6 6,5
12   1/36    6,6
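
The same table can be generated by counting, for each sum, how many of the 36 equally likely rolls produce it. This is a short Python sketch assuming two fair dice; the variable names (`rolls`, `counts`, `two_dice_pmf`) are made up for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) rolls.
rolls = list(product(range(1, 7), repeat=2))

# Number of rolls that give each possible sum.
counts = Counter(a + b for a, b in rolls)

# f(x) = no. of occurrences of x / total no. of outcomes
two_dice_pmf = {x: Fraction(counts[x], len(rolls)) for x in range(2, 13)}

for x in two_dice_pmf:
    print(f"{x:>2}  {counts[x]}/36")
```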

Example 5
Or as a graph:
[Figure: graph of f (x) against x for the sum of two 6-sided dice]

Expected value
The expected value of a random variable gives us the mean value for the random variable:
$$E[x] = \mu = \sum x\, f(x)$$
NOTE: µ is population mean, not sample mean.

For discrete random variables the expected value is not necessarily one of the values the variable can assume:
e.g., expected number of children for an Australian family = 2.4.
No single family can have 2.4 children, so this figure never occurs in the original data set; it only makes sense as a mean.

Variance
The variance is used to provide a measure of the dispersion or variability of the random variable, x. As with the mean, we refer to the population variance. As with frequency distributions, there are two ways to compute it:
$$\mathrm{var}[x] = \sigma^2 = \sum \{(x - \mu)^2 f(x)\} = \sum \{x^2 f(x)\} - \mu^2$$
The variance measures how far, on average, the values of a random variable lie from the expected value (mean), as the probability-weighted mean of the squared deviations.
The standard deviation of the probability distribution is the square root of the variance:
$$\sigma = \sqrt{\sigma^2}$$
Example 6
Probability distribution for the sum of two 6-sided dice.
x    f (x)   x f (x)   x² f (x)
2    1/36    2/36      4/36
3    2/36    6/36      18/36
4    3/36    12/36     48/36
5    4/36    20/36     100/36
6    5/36    30/36     180/36
7    6/36    42/36     294/36
8    5/36    40/36     320/36
9    4/36    36/36     324/36
10   3/36    30/36     300/36
11   2/36    22/36     242/36
12   1/36    12/36     144/36
Σ    1       7         54.833

$$\mu = \sum_{i=1}^{N} x\, f(x) = 7$$
$$\sigma^2 = \sum \{x^2 f(x)\} - \mu^2 = 54.833 - 7^2 = 5.833$$
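
As a cross-check of the column totals, the sketch below computes the mean and both forms of the variance with exact fractions. It assumes two fair dice; the closed form (6 − |x − 7|)/36 is only a shorthand that reproduces the tabulated f (x) values, and all names are illustrative.

```python
from fractions import Fraction

# f(x) for the sum of two fair dice; (6 - |x - 7|)/36 reproduces the table.
f = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}

# Expected value: mu = sum of x * f(x)
mu = sum(x * fx for x, fx in f.items())

# Variance, computed both ways.
second_moment = sum(x**2 * fx for x, fx in f.items())
var_shortcut = second_moment - mu**2
var_definition = sum((x - mu) ** 2 * fx for x, fx in f.items())

sigma = float(var_shortcut) ** 0.5

print("mu =", mu)
print("sum of x^2 f(x) =", float(second_moment))
print("variance =", var_shortcut, "=", float(var_shortcut))
print("standard deviation =", sigma)
```

Both variance expressions give 35/6, which is 5.833 to three decimal places.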
Binomial Distribution
The binomial probability distribution is a discrete probability distribution that has many applications.
It is associated with a multiple-step experiment that we call the binomial experiment.
Its properties are:
1. The experiment consists of a sequence of n identical trials.
2. Each trial has only two possible outcomes: success or failure.
3. The probability of success does not change from trial to trial.
4. The trials are independent.

Binomial Distribution
2. Each trial has only two possible outcomes: success or failure.
This property is not as restrictive as it first looks, if you think in
terms of things either happening or not happening.
E.g., true or false, heads or tails, male or female.
We define: probability of success = p; probability of failure = q
Since there are only 2 possible outcomes, then: p + q = 1
Example
When observing dice rolls, we can score 1, 2, 3, 4, 5 or 6. If we
want to see how many times 5 comes up in n rolls, we can use
the binomial probability distribution if we think in terms of two
outcomes:
5 occurring, with p = 1/6; or
all other results (1, 2, 3, 4, 6) as not 5, with q = 5/6.
The number of outcomes
The number of outcomes of a binomial experiment that result in exactly x successes from n trials is computed from:
$${}^{n}C_{x} = \frac{n!}{x!\,(n - x)!}$$
This expression is called a combination, or the binomial coefficient, and is the number of ways of selecting x objects from n objects, without replacement, and irrespective of order.
(As opposed to the permutation, ${}^{n}P_{x}$, which takes order into account.)

The number of outcomes
Some simple values of the binomial coefficient are:
$${}^{n}C_{0} = {}^{n}C_{n} = 1, \qquad {}^{n}C_{1} = {}^{n}C_{n-1} = n$$
The expression n! is called n factorial, and is computed by:
n! = n(n − 1)(n − 2)(n − 3) . . . 3 × 2 × 1
with 1! = 1, and 0! = 1
For very large n the Stirling formula approximates n!:
$$n! \approx \sqrt{2\pi}\; n^{\,n + 1/2}\, e^{-n}$$
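
To get a feel for these quantities, the sketch below evaluates a binomial coefficient and compares exact factorials with the Stirling approximation. It assumes Python 3.8 or later (for `math.comb`); the helper name `stirling` is hypothetical.

```python
import math


def stirling(n):
    """Stirling approximation: sqrt(2*pi) * n**(n + 1/2) * exp(-n)."""
    return math.sqrt(2 * math.pi) * n ** (n + 0.5) * math.exp(-n)


# Binomial coefficient nCx = n! / (x! (n - x)!)
print("3C2 =", math.comb(3, 2))

# Compare the exact factorial with the approximation for a few n.
for n in (5, 10, 20):
    exact = math.factorial(n)
    approx = stirling(n)
    rel_err = (exact - approx) / exact
    print(f"n = {n:>2}: exact = {exact}, Stirling = {approx:.4e}, relative error = {rel_err:.4f}")
```

The relative error shrinks as n grows, which is why the formula is reserved for large n.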

Example
A coin is flipped 3 times. In how many ways can we get exactly 2 heads?
Listing the outcomes, the number of ways we can get exactly 2 heads is 3 (HHT, HTH, THH).
Or, using the combination equation, with n = 3, x = 2:
$${}^{3}C_{2} = \frac{3!}{2!\,(3 - 2)!} = \frac{6}{2 \times 1} = 3$$

Binomial probability function
To determine the probability of x successes we need to know the probability of success and failure.
Since the trials of a binomial experiment are independent, we can multiply the probabilities associated with each trial outcome to find the probability of a particular sequence of outcomes.
The binomial probability distribution is given by:
$$f(x) = {}^{n}C_{x}\, p^{x} q^{\,n-x}$$
f (x) gives the probability of x successes from n trials.

Example
Find the probability of scoring exactly 2 heads from 3 tosses of a coin.
We have: p = 0.5, q = 1 − 0.5 = 0.5, n = 3.
$$f(2) = {}^{3}C_{2}\,(0.5)^{2}(0.5)^{1} = 3 \times 0.25 \times 0.5 = 0.375$$
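
The binomial probability function translates directly into code. The following is a minimal sketch, assuming Python 3.8 or later for `math.comb`; the function name `binomial_pmf` is an illustrative choice, not a library routine.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure
    return comb(n, x) * p**x * q ** (n - x)


# Exactly 2 heads from 3 tosses of a fair coin.
print(binomial_pmf(2, 3, 0.5))  # 0.375

# The probabilities over x = 0..n should sum to 1.
print(sum(binomial_pmf(x, 3, 0.5) for x in range(4)))
```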

Mean and variance
The expected number of successes in a binomial experiment is given as:
$$E[x] = \mu = np$$
The variance of a binomial distribution is given as:
$$\mathrm{var}[x] = \sigma^2 = npq$$

Example 1
A particular class has 35 students. From past experience it is known that 7% of the students fail the course.
What is the probability of a) 0 students, and b) 5 students failing the course? c) What is the expected number of students who will fail the course, and the standard deviation of this?

Answer:
This is a binomial problem with n = 35, p = 0.07, q = 0.93.
a) $f(0) = {}^{35}C_{0}\,(0.07)^{0}(0.93)^{35} = 0.079$
b) $f(5) = {}^{35}C_{5}\,(0.07)^{5}(0.93)^{30} = 0.062$
c) $\mu = 35 \times 0.07 = 2.45$, $\sigma = \sqrt{35 \times 0.07 \times 0.93} = 1.51$
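
These answers can be reproduced numerically. The sketch below assumes Python 3.8 or later (for `math.comb`) and uses made-up names such as `p_none` and `p_five`.

```python
from math import comb, sqrt

n, p = 35, 0.07  # class size and probability that a student fails
q = 1 - p


def f(x):
    """Binomial probability of exactly x students failing."""
    return comb(n, x) * p**x * q ** (n - x)


p_none = f(0)          # a) no student fails
p_five = f(5)          # b) exactly five students fail
mu = n * p             # c) expected number of failures
sigma = sqrt(n * p * q)  # c) standard deviation

print(f"a) f(0) = {p_none:.3f}")
print(f"b) f(5) = {p_five:.3f}")
print(f"c) mu = {mu:.2f}, sigma = {sigma:.2f}")
```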

Exercise
Sara has a Rocket Science exam tomorrow, and she hasn’t done
any revision. So she decides to answer the 20 multiple choice
questions randomly. Each question has 4 options: A, B, C or D,
only one of which is the correct answer.
a) What is the probability distribution function for Sara
getting a correct answer?
b) What is the probability that she will get every question
right?
c) What is the probability that she will get every question
wrong?
d) What is the probability that she will scrape half marks (i.e.,
get 10 questions right)?
e) What is her expected mark?
f) What is the standard deviation on this expected value?
Continuous Probability Distributions
A continuous probability distribution is the probability distribution that applies to a continuous random variable.
Note: a continuous random variable is a random variable that can take on all values over a range (e.g., distance, time, temperature).
For continuous random variables, the probability distribution function, f (x), is usually called the probability density function.

Uniform probability distribution
The continuous uniform probability distribution is used in all situations in which all values of the random variable are equally likely.
E.g., the random number generator on a calculator can be used to generate random numbers between 0 and 1.
Each number has an equal probability of being generated.
The uniform probability density function for a random number generator has the formula:
$$f(x) = \begin{cases} 1 & 0 \le x \le 1 \\ 0 & \text{elsewhere} \end{cases}$$

Uniform probability distribution
A plot of the probability density function looks like:
[Figure: uniform probability density function]
For each possible outcome the probability density function is the same.
Area as a measure of probability
The value of the probability density function does not represent probability.
The probability that a continuous random variable will assume a value between given limits a and b is given by the area under the graph of its probability density function between a and b.

Example
Uniform distribution:
So, if a = 0.25, and b = 0.75, then:
p(0.25 ≤ x ≤ 0.75) = 1 × (0.75 − 0.25) = 0.5
In general, for any probability density function, f (x):
$$p(a \le x \le b) = \int_{a}^{b} f(x)\,dx, \qquad f(x) \ge 0$$
Since the total probability of all outcomes must equal 1:
$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$
i.e., the total area under the probability density function must equal 1.
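
The idea of probability as area can be illustrated numerically. This is a rough sketch that approximates the integral with a simple midpoint rule, a generic numerical method chosen here for illustration; the function names `uniform_pdf` and `area` are hypothetical.

```python
def uniform_pdf(x):
    """Uniform density on [0, 1]: 1 inside the interval, 0 elsewhere."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def area(pdf, a, b, steps=10_000):
    """Midpoint-rule approximation of the area under pdf between a and b."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width


# p(0.25 <= x <= 0.75) for the uniform distribution on [0, 1]
print(round(area(uniform_pdf, 0.25, 0.75), 6))

# Total area, integrating over a range wider than [0, 1]
print(round(area(uniform_pdf, -1.0, 2.0), 4))
```

The second result is only approximately 1, because the midpoint rule cannot resolve the jumps of the density at 0 and 1 exactly.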
Consider the probability that the random variable assumes a distinct value, a:
Discrete: p(x = a) = f (x = a)
has a definite value, the value of the discrete probability distribution.
Continuous: $p(x = a) = \int_{a}^{a} f(x)\,dx = 0$,
so the value f (a) of the density does not itself represent a probability.

We must therefore compute the probability of the random variable taking a value in a particular interval surrounding a, say:
$$p(x = a) \approx p(a - \epsilon \le x \le a + \epsilon) = \int_{a-\epsilon}^{a+\epsilon} f(x)\,dx$$
where ε is some small number.

The Normal Distribution
The normal probability distribution (also called the Gaussian distribution) is the most widely used distribution in statistical analysis.
Two parameters define this distribution: the mean (µ), and the standard deviation (σ). The probability density function of the normal distribution is:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / 2\sigma^2}$$

The Normal Distribution
The distribution is symmetrical about the mean, with the mean = median = mode.
i.e., the mean is the most frequently occurring value (the mode), and lies at the point that divides the curve exactly in half (the median).
The curve is asymptotic, extending from the mean towards infinity in both directions, never quite reaching zero.

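Here are three normal curves of same σ, different µ:
[Figure: three normal curves with the same σ and different µ]
The standard deviation indicates the spread of the measurements (the width of the normal curve).

Here are three normal curves of same µ, different σ:
[Figure: three normal curves with the same µ and different σ]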

For any normally distributed random variable, the area under the curve (i.e., the probability of assuming a value) within:
±1σ of the mean is 68.27%;
±2σ of the mean is 95.45%;
±3σ of the mean is 99.73%.
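
These areas can be checked with the error function from Python's standard library, using the identity that the area within ±kσ of the mean equals erf(k/√2). This is a sketch only; `area_within` is an illustrative name, and values read from four-figure tables may differ from it in the last digit.

```python
import math


def area_within(k):
    """Area under any normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"±{k}σ: {100 * area_within(k):.2f}% of the area")
```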

The standard normal distribution
As mentioned, the area beneath the curve represents the probability of a measurement falling between a and b:
$$p(a \le x \le b) = \int_{a}^{b} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / 2\sigma^2}\, dx$$
Fortunately, this horrible integral has already been worked out for the standard normal probability distribution.
This is a normal distribution scaled, or standardised, to have
µ = 0; σ = 1
The values of the integral above, for the standard normal distribution only, have been catalogued in tables.
Probabilities for all normal distributions are computed using the standard normal distribution, but since most real data sets do not have a standard normal distribution, we must transform our real normal distribution (with mean µ, and SD σ) so it has a mean of 0, and a SD of 1.
This scaling is done using the equation:
$$z = \frac{x - \mu}{\sigma}$$
Therefore, z can be interpreted as the number of standard deviations that the random variable x lies from the mean.

Now we can use the tables to work out p(a′ ≤ z ≤ b′), where:
$$a' = \frac{a - \mu}{\sigma} \quad \text{and} \quad b' = \frac{b - \mu}{\sigma}$$
The tables for the standard normal distribution look something like this:
[Figure: extract of the standard normal table, with the entry for z = 1.06 highlighted]

The highlighted value gives the blue shaded area between the mean and z = 1.06, i.e.:
A = p(0 ≤ z ≤ 1.06) = 0.3554
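
Where printed tables are not to hand, the tabulated area between the mean and z can be computed from the error function, since p(0 ≤ z ≤ Z) = ½ erf(Z/√2). The sketch below uses only the Python standard library; `table_area` is a hypothetical helper name.

```python
import math


def table_area(z):
    """Area under the standard normal curve between the mean (0) and z."""
    return 0.5 * math.erf(z / math.sqrt(2))


print(f"{table_area(1.06):.4f}")  # 0.3554
```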

Example
The scores (as a %) in the Statistics exam were normally distributed with a mean of 60% and a standard deviation of 10%. What is the probability that a student scored:
a) greater than 80%
b) between 40 and 50%?

Solution
a) For p(x > 80):
First compute the z-score corresponding to x = 80:
$$z = \frac{80 - 60}{10} = 2$$
This gives us p(x > 80) = p(z > 2), the area under the curve to the right of z = 2 (shaded red in the original figure).

On the supplied standard normal tables, look up the probability for z = 2.00: the answer is 0.4772. This value is the area beneath the curve between the mean and z = 2 (shaded blue in the original figure):
A1 = p(0 ≤ z ≤ 2) = p(60 ≤ x ≤ 80) = 0.4772.
So if we subtract this area (A1) from the total area to the right of the mean (0.5), we'll get the answer:
p(x > 80) = 0.5 − 0.4772 = 0.0228
Solution
b) For p(40 < x < 50):
First compute the z-scores:
$$x = 40 \Rightarrow z = \frac{40 - 60}{10} = -2$$
$$x = 50 \Rightarrow z = \frac{50 - 60}{10} = -1$$
The required area is
p(40 ≤ x ≤ 50) = p(−2 ≤ z ≤ −1)
To find this area from the normal tables, we first use the symmetry of the normal distribution:
p(−2 ≤ z ≤ −1) = p(1 ≤ z ≤ 2)

Then, to find our required area we take the difference of two areas:
p(1 ≤ z ≤ 2) = p(0 ≤ z ≤ 2) − p(0 ≤ z ≤ 1)
From the standard normal tables, z = 2 gives A1 = 0.4772; and z = 1 gives A2 = 0.3413. Hence:
p(40 ≤ x ≤ 50) = 0.4772 − 0.3413 = 0.1359
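
Both parts of the exam example can be verified without tables. The following is a minimal sketch that builds the normal cumulative probability from the standard library error function; `normal_cdf`, `p_above_80` and `p_40_to_50` are illustrative names, not part of the course material.

```python
import math


def normal_cdf(x, mu, sigma):
    """p(X <= x) for a normal distribution with mean mu and SD sigma."""
    z = (x - mu) / sigma  # number of standard deviations from the mean
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


mu, sigma = 60, 10  # exam scores: mean 60%, standard deviation 10%

p_above_80 = 1 - normal_cdf(80, mu, sigma)
p_40_to_50 = normal_cdf(50, mu, sigma) - normal_cdf(40, mu, sigma)

print(f"a) p(x > 80) = {p_above_80:.4f}")
print(f"b) p(40 < x < 50) = {p_40_to_50:.4f}")
```

The results agree with the table-based values to four decimal places.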

