WK 3-Tut 3 Notes-Random Variables-Upload



Statistics and Analysis

Week 3 Review

Random Variables, Probability Distributions, Discrete and Continuous Variables

Random Variable
• Represents a possible numerical value from a random experiment
• Let X be a discrete random variable and x be one of its
possible values, then the probability of X taking a specific
value x is denoted as 𝑃(𝑋 = 𝑥)
• The probability distribution function of a random variable is
a representation of the probabilities for all the possible
outcomes of the random variable

Random Variables - Dr Gan 2

Week 3 Review – Dr Gan Chui Goh 1

Probability Distribution
Example: Toss 2 coins.
Let X=no. of heads, show 𝑃(𝑋 = 𝑥), for all values of x.
There are 4 possible outcomes: TT TH HT HH

Probability Distribution of X

x    P(X = x)
0    1/4 = 0.25
1    2/4 = 0.50
2    1/4 = 0.25

[Figure: bar chart of P(X = x) against x, with bars of height 0.25, 0.50 and 0.25 at x = 0, 1 and 2]
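
The two-coin distribution can be checked by enumerating the outcomes directly. The following is a minimal R sketch, not part of the original slides; the names `outcomes`, `heads` and `pmf` are chosen here for illustration only.

```r
# Enumerate the 4 equally likely outcomes of tossing 2 coins (TT, HT, TH, HH)
outcomes <- expand.grid(coin1 = c("T", "H"), coin2 = c("T", "H"),
                        stringsAsFactors = FALSE)

# X = number of heads in each outcome
heads <- rowSums(outcomes == "H")

# P(X = x) = (number of outcomes with x heads) / (total number of outcomes)
pmf <- table(heads) / nrow(outcomes)
print(pmf)
```

Counting outcomes this way only works because every outcome is equally likely, which assumes fair coins.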


Cumulative Distribution Function

Let X be a random variable with probability distribution
f(x) = P(X = x), for different values of x.

The cumulative distribution function:

$F(x) = P(X \le x) = \sum_{t \le x} P(X = t)$

Toss 2 coins. Show the cumulative distribution function.
Let X=no. of heads, 4 possible outcomes: {TT TH HT HH}
x    P(X = x)      F(x) = P(X ≤ x)
0    1/4 = 0.25    0.25
1    2/4 = 0.50    0.75
2    1/4 = 0.25    1.00
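
For a discrete variable, the cdf is just the running total of the pmf. A short R sketch of this for the two-coin example (the variable names are illustrative, not from the slides):

```r
x   <- 0:2                    # possible numbers of heads
pmf <- c(0.25, 0.50, 0.25)    # P(X = x) for the two-coin example

# F(x) = P(X <= x) is the cumulative sum of the pmf
cdf <- cumsum(pmf)

print(data.frame(x = x, pmf = pmf, cdf = cdf))
```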


Discrete Random Variable
• When a random variable can only take on a countable
number of values, it is a discrete random variable
(the list of values can be finite or infinite)
• e.g. X: the year in which a student was born.
– Discrete: there is no possible value between two consecutive
values, e.g. he is born in either Year 2001 or Year 2002; as long
as the values are countable, the variable is discrete
– Finite or infinite: it could be a list of years from 1950 to
2001, or it could be an infinite list of years.


Properties of Discrete Probability Distributions
The probability distribution of a discrete random variable X is called the
probability mass function (pmf) of X.
Probability Mass Function (pmf): f(x) = P(X = x) for different values of x
Cumulative Distribution Function (cdf): $F(x) = P(X \le x) = \sum_{t \le x} P(X = t)$
Properties:
• 0 ≤ f(x) ≤ 1 for each value of x, and
• F(x) is nondecreasing in x, and
• $\sum_{x} f(x) = \sum_{x} P(X = x) = 1$


Properties of Discrete Probability Distributions

For a discrete random variable X with the following probability distribution:

x    f(x)   F(x)
1    0.2    0.2
2    0.5    0.7
4    0.1    0.8
6    0.2    1.0

Its cdf can be written piecewise:

$F(x) = \begin{cases} 0 & x < 1 \\ 0.2 & 1 \le x < 2 \\ 0.7 & 2 \le x < 4 \\ 0.8 & 4 \le x < 6 \\ 1 & 6 \le x \end{cases}$

[Figure: step plot of the cdf, F(x), against x; the jump at x = 2 is 0.7 - 0.2 = 0.5, which equals f(2)]

[Figure: bar plot of the pmf, f(x), against x, with bars of 0.2, 0.5, 0.1 and 0.2 at x = 1, 2, 4 and 6]

Since X is discrete,
P(X ≤ 3.5) = P(X = 1) + P(X = 2)
= 0.2 + 0.5
= 0.7
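
The piecewise cdf above can be built from the pmf as a step function. This is a sketch using base R's `stepfun`; the name `cdf_fun` is an assumption made for this illustration.

```r
x   <- c(1, 2, 4, 6)            # support of X
pmf <- c(0.2, 0.5, 0.1, 0.2)    # f(x) at each support point

# Step function: 0 to the left of x = 1, then the running totals of the pmf.
# stepfun's default (right = FALSE) makes each step closed on the left,
# so cdf_fun(2) already includes the jump at x = 2.
cdf_fun <- stepfun(x, c(0, cumsum(pmf)))

cdf_fun(3.5)            # P(X <= 3.5), read off the step function
sum(pmf[x <= 3.5])      # the same probability, summing the pmf directly
```

Both lines give P(X = 1) + P(X = 2), since no probability mass sits between 2 and 3.5.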


Discrete Uniform Distribution: equal probability
(e.g. X ~ Uniform(a = 1, b = 10))
Probability Mass Function (pmf): $P(X = x) = \frac{1}{b - a + 1}$ for x = a, a + 1, …, b

Example:
10 groups of 5 students in a class are equally likely to be asked to present their answer to
the class.

X: random variable for the group no. selected, so P(X = x) = 1/10 = 0.1 for x = 1, 2, …, 10

[Figure: Discrete Uniform Distribution; bar plot of P(X = x) against X, group no.]
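
A small R sketch of the group-selection example, assuming the groups are numbered 1 to 10. The seed and the variable names are arbitrary choices for illustration.

```r
a <- 1
b <- 10
groups <- a:b

# Each group has the same probability 1 / (b - a + 1)
pmf <- rep(1 / (b - a + 1), length(groups))
sum(pmf)                      # the probabilities add up to 1

# Simulate picking one group at random (seed chosen arbitrarily)
set.seed(1)
picked <- sample(groups, size = 1)
picked
```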


Bernoulli Distribution
For events with two outcomes: e.g. Yes/No, success/failure
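Bernoulli probability distribution:
f(1) = P(X = 1) = p
f(0) = P(X = 0) = 1 − p
f(x) = P(X = x) = 0 for all other values of x

$f(x) = \begin{cases} p & x = 1 \\ 1 - p & x = 0 \\ 0 & \text{otherwise} \end{cases}$ where p is the probability of success.

Cumulative Probability:
F(0) = 1 − p
F(1) = 1

e.g. Pick a person at random; "success" means the person is obese, P(obese) = p = 0.75

[Figure: Bernoulli Distribution; bar plot of f(x) at x = 0 and x = 1]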
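
Base R has no separate Bernoulli function, but a Bernoulli trial is a binomial with a single trial, so `dbinom` and `pbinom` with `size = 1` reproduce the values above. A sketch for the p = 0.75 example:

```r
p <- 0.75                                     # probability of success (obese)

# Bernoulli(p) is Binomial(size = 1, prob = p)
pmf <- dbinom(c(0, 1), size = 1, prob = p)    # f(0) = 1 - p, f(1) = p
cdf <- pbinom(c(0, 1), size = 1, prob = p)    # F(0) = 1 - p, F(1) = 1

print(rbind(x = c(0, 1), pmf = pmf, cdf = cdf))
```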


Binomial Distribution
• For counting successes/failures (two-state outcomes)
• A fixed number of observations, n (e.g. 15 tosses of a
coin; ten light bulbs taken from a warehouse)
• Binomial: n independent Bernoulli trials, each with a constant
probability of success p
X ~ B(n, p), where X, the total number of successes in the
experiment, follows a binomial distribution.



Binomial Distribution
Binomial probability: $f(x) = P(X = x) = {}^{n}C_{x}\, p^{x} (1 - p)^{n - x}$, for x = 0, 1, 2, …, n
f(x) = P(X = x) = 0 for all other values of x
Note: n = no. of trials; p = probability of success; x is the no. of ‘successes’

${}^{n}C_{x} = \frac{n!}{x!\,(n - x)!}$, where $n! = n (n - 1)(n - 2) \cdots 2 \times 1$
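
The formula can be typed into R directly and compared with the built-in `dbinom`. In this sketch `binom_pmf` is a helper written for illustration, and n = 10, p = 0.25 are example values.

```r
# Binomial pmf written out from the formula: nCx * p^x * (1 - p)^(n - x)
binom_pmf <- function(x, n, p) {
  choose(n, x) * p^x * (1 - p)^(n - x)
}

n <- 10
p <- 0.25
x <- 0:n

manual  <- binom_pmf(x, n, p)
builtin <- dbinom(x, size = n, prob = p)

all.equal(manual, builtin)   # the two agree up to floating-point error
sum(manual)                  # the probabilities over x = 0..n add up to 1
```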


binomial in R: dbinom(x, n, p)
A quiz contains 10 MCQs, each with 4 possible answer
options, of which only one is correct. If you choose the answer
randomly for every question, what is the probability that you
get all the answers wrong?

x = 0, n = 10, P(right answer) = p = 0.25

$P(X = x) = {}^{n}C_{x}\, p^{x} (1 - p)^{n - x} = \frac{n!}{x!\,(n - x)!}\, p^{x} (1 - p)^{n - x}$

$f(0) = P(X = 0) = {}^{10}C_{0}\, (0.25)^{0} (1 - 0.25)^{10} = \frac{10!}{0!\,10!}\, (0.25)^{0} (1 - 0.25)^{10} = 0.0563$

In R: dbinom(0, 10, 0.25) = 0.0563


Some R commands for binomial distribution

prob: probability of success
size: no. of trials

• dbinom(x, size, prob) ⇒ the pmf f(x) or P(X = x)
– gives the probability at X = x (in graphic calculator, this is binompdf)

• pbinom(q, size, prob) ⇒ the cdf F(q) or P(X ≤ q)
– gives the cumulative probability up to q, i.e. P(X ≤ q) (in graphic
calculator, this is binomcdf)
e.g. P(a ≤ X ≤ b): we can use sum(dbinom(a:b, size, prob))
P(a < X ≤ b) = pbinom(b, size, prob) - pbinom(a, size, prob)

• qbinom(p, size, prob) ⇒ returns the smallest value of x such that F(x) ≥ p

• rbinom(n, size, prob) ⇒ draws a random sample of size n
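
The four commands can be tried together on the quiz setting (size = 10, prob = 0.25). This is a sketch; the interval end points a = 2 and b = 4 and the seed are arbitrary values chosen for illustration.

```r
size <- 10
prob <- 0.25
a <- 2   # illustrative lower end point
b <- 4   # illustrative upper end point

# P(a <= X <= b): sum the pmf over a, a+1, ..., b
p_closed <- sum(dbinom(a:b, size, prob))

# The same probability from the cdf needs a - 1, because X is discrete
p_closed_cdf <- pbinom(b, size, prob) - pbinom(a - 1, size, prob)

# P(a < X <= b): the difference of the cdf at b and at a excludes X = a
p_halfopen <- pbinom(b, size, prob) - pbinom(a, size, prob)

# Smallest x with F(x) >= 0.5
med <- qbinom(0.5, size, prob)

# Five simulated quiz scores (seed chosen arbitrarily)
set.seed(42)
draws <- rbinom(5, size, prob)
```

The difference between `p_closed` and `p_halfopen` is exactly P(X = a), which is why the end points matter for a discrete distribution.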



Continuous Random Variable

• A variable that can assume any value in an interval, i.e.
uncountably many different values (e.g. thickness, time, height)
• It can potentially take on any value, depending only on the
ability to measure accurately
• The probability density function, f(x), of a continuous
random variable describes the height of the distribution at
a specific value x
• The probability is the area under the distribution curve
• The probability at a specific value is zero for a continuous
random variable


Properties of Continuous Probability Distributions
Probability distribution of a continuous random variable X is called
probability density function (pdf) of X.

Probability Density Function (pdf): f(x)

Cumulative Distribution Function (cdf): $F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$

e.g. Probability that X lies between two points, a and b:
$P(a \le X \le b) = \int_{a}^{b} f(x)\,dx = F(b) - F(a)$

Properties:
• f(x) ≥ 0 for all values of x, i.e. F(x) is nondecreasing in x, and
• $\int_{-\infty}^{\infty} f(x)\,dx = 1$
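
These properties can be checked numerically with R's `integrate`. The sketch below uses the standard normal density `dnorm` purely as a convenient example of a pdf; the end points a = -1 and b = 2 are arbitrary.

```r
# Total area under a pdf should be 1
total <- integrate(dnorm, lower = -Inf, upper = Inf)$value

# P(a <= X <= b) as an area under the pdf ...
a <- -1
b <- 2
area <- integrate(dnorm, lower = a, upper = b)$value

# ... and as a difference of cdf values, F(b) - F(a)
area_cdf <- pnorm(b) - pnorm(a)

c(total = total, area = area, area_cdf = area_cdf)
```

`integrate` is a numerical routine, so the two areas agree only up to a small numerical error.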



The Standard Normal Distribution

• Any normal distribution (with any mean and variance combination) can
be transformed into the standardized normal distribution (Z), with
mean 0 and variance 1:
𝑍~𝑁(0, 1)

• Transforming X into Z:
$Z = \frac{X - \mu}{\sigma}$

[Figure: standard normal curve for Z, centred at μ = 0 with σ = 1]
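
A quick R check that standardising does not change the probability. The values mu = 50, sigma = 10 and x = 65 are made up for this illustration.

```r
mu    <- 50   # hypothetical mean
sigma <- 10   # hypothetical standard deviation
x     <- 65   # hypothetical value of X

# Transform X into Z
z <- (x - mu) / sigma

# P(X <= x) on the original scale and P(Z <= z) on the standard scale
p_x <- pnorm(x, mean = mu, sd = sigma)
p_z <- pnorm(z)              # defaults are mean = 0, sd = 1

c(z = z, p_x = p_x, p_z = p_z)
```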


Some R commands for normal distribution
• dnorm(x, mean = μ, sd = σ) ⇒ the pdf f(x)
– gives the “height” of the curve at point x (in graphic calculator,
this is normpdf)

• pnorm(q, mean = μ, sd = σ) ⇒ the cdf F(q)
– gives the area under the curve, i.e. P(X ≤ q) (in graphic
calculator, this is normcdf)

• qnorm(p, mean = μ, sd = σ) ⇒ returns the value of x such that F(x) = p
– when the area under the curve is p, it gives the value of x (in
graphic calculator, this is invnorm)

• rnorm(n, mean = μ, sd = σ) ⇒ draws a random sample (size = n)
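
A brief sketch of how these commands relate to each other; in particular `qnorm` undoes `pnorm`. The sample size, mean, sd and seed below are arbitrary illustration values.

```r
# Height of the standard normal curve at its mean: 1 / sqrt(2 * pi)
h <- dnorm(0)

# Area to the left of 1.96, then back from the area to the point
p <- pnorm(1.96)
q <- qnorm(p)        # recovers 1.96 (up to floating-point error)

# A random sample of 1000 values from N(mean = 5, sd = 2)
set.seed(123)
s <- rnorm(1000, mean = 5, sd = 2)
c(sample_mean = mean(s), sample_sd = sd(s))
```

The sample mean and sd should land near 5 and 2, but they vary from sample to sample.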



Example
A management consultant found that the amount of time per day spent by
executives performing tasks that could be done equally well by subordinates
followed a normal distribution with a mean of 2.4 hours. It was also found that 10%
of executives spent over 3.5 hours per day on tasks of this type. Find the probability
that an executive spends more than 3 hours per day on tasks of this type.

Step 1: find σ. The area to the left of 3.5 is 1 − 0.10 = 0.9, so

$P(X > 3.5) = P\left(Z > \frac{3.5 - 2.4}{\sigma}\right) = 0.10 \Rightarrow Z = \text{qnorm}(0.9, 0, 1) = 1.2816$

$\Rightarrow \frac{3.5 - 2.4}{\sigma} = 1.2816 \Rightarrow \sigma = \frac{3.5 - 2.4}{1.2816} = 0.8583$

Step 2: find P(X > 3).

$P(X > 3) = P\left(Z > \frac{3 - 2.4}{0.8583}\right) = P(Z > 0.6991)$
= 1 − pnorm(0.6991, 0, 1)
= 0.2422, or 1 − pnorm(3, 2.4, 0.8583)

[Figure: normal curve with the X axis marked at 2.4, 3 and 3.5 and the Z axis marked at 0, Z = ? (for X = 3) and Z = 1.2816; the area to the left of 3.5 is 1 − 0.10 = 0.9]

So the probability that an executive spends more than 3 hours per day on tasks of
this type is 0.2422.
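
The whole calculation can be done in R without rounding the intermediate values. This is a sketch; the variable names are chosen for illustration.

```r
mu <- 2.4                      # given mean (hours)

# 10% of executives exceed 3.5 hours, so 3.5 is the 90th percentile
z90 <- qnorm(0.9)              # about 1.2816

# Solve (3.5 - mu) / sigma = z90 for sigma
sigma <- (3.5 - mu) / z90

# P(X > 3) as the upper-tail area
p_more_than_3 <- 1 - pnorm(3, mean = mu, sd = sigma)
p_more_than_3
```

Because σ and z are not rounded here, the result can differ from 0.2422 in the fourth decimal place.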


Uniform Distribution
Equal probabilities for all equal-width intervals within the range of the
random variable.

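$f(x) = \begin{cases} \frac{1}{b - a} & \text{for } a \le x \le b \\ 0 & \text{for all other values of } x \end{cases}$

Total area under the uniform probability density function is 1.0.

[Figure: uniform density f(x) against x, constant between xmin = a and xmax = b]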



Some R commands for continuous uniform distribution

• dunif(x, min = a, max = b)
– gives the density

• punif(q, min = a, max = b)
– gives the distribution function

• qunif(p, min = a, max = b)
– gives the quantile function

• runif(n, min = a, max = b)
– generates uniform random numbers between a and b


Uniform distribution: Example, using punif(x, min, max)

Records have shown that the waiting time for a lift in the lobby of a hotel
is uniformly distributed between 0 and 4 min during 12pm-3pm. Determine
the probability that a randomly selected guest will spend at most 2.5
minutes waiting for the lift. What is the probability that a guest spends more
than 3 minutes waiting for the lift?

[Figure: Distribution of Elevator Waiting Times; f(x) = 1/4 for x, waiting time (min), from 0 to 4]

$P(X \le 2.5) = \frac{2.5 - 0}{4} = 0.625$

Or

P(X ≤ 2.5) = punif(2.5, 0, 4)
= 0.625

P(X > 3) = 1 − P(X ≤ 3)
= 1 − punif(3, 0, 4)
= 1 − 0.75
= 0.25

P(2.5 < X < 3) = punif(3, 0, 4) − punif(2.5, 0, 4)
= 0.75 − 0.625
= 0.125
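
The three probabilities for the lift example, written as a short R sketch (variable names are illustrative):

```r
a <- 0   # minimum waiting time (min)
b <- 4   # maximum waiting time (min)

# P(X <= 2.5): area of a rectangle of width 2.5 and height 1 / (b - a)
p_at_most_2_5 <- punif(2.5, min = a, max = b)

# P(X > 3): complement of the cdf at 3
p_more_than_3 <- 1 - punif(3, min = a, max = b)

# P(2.5 < X < 3): difference of two cdf values
p_between <- punif(3, min = a, max = b) - punif(2.5, min = a, max = b)

c(p_at_most_2_5, p_more_than_3, p_between)
```

For a continuous variable P(X < 3) and P(X ≤ 3) are the same, so strict and non-strict inequalities give identical answers here.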


- The End -
