Lecture 8 - Continuous Probability Distributions

ECON 1005 Lectures

Continuous Probability Distributions

Continuous Probability Distributions
• We have so far looked at two special discrete
probability distributions: the Binomial and the
Poisson

• Now we focus on continuous distributions, and in
particular on perhaps the most famous distribution in
Statistics: the Normal Distribution

Continuous Random Variables
• Recall that we defined a continuous (cts) random
variable as one whose possible values are not COUNTABLE

• A cts random variable can assume any value within an
interval

• Because the number of values contained in an
interval is infinite, the possible number of values that
a cts random variable can assume is also infinite

• We therefore cannot count these values as we do for
discrete random variables

Probability Distributions
• A probability distribution links a random variable X with the
probability that X assumes a discrete value or a range of values

• This can be presented by a table, function or formula

• Random variables can be discrete or continuous

• Probability distributions are also correspondingly discrete or continuous

• Strictly speaking, when a variable is continuous, Pr(X = x) = 0 for every single value x

• In other words, no positive probability can be attached to a PRECISE
value: a continuous random variable can take infinitely many values in any
interval, so the chance of it hitting any one of them exactly is zero

• It is only possible to determine the probability associated with
INTERVALS on the real line, for example, Pr(X ≤ 5) or Pr(-3 ≤ X ≤ 7).

Properties of Continuous Probability Distributions
• The probability distribution of a continuous random
variable possesses the following two characteristics:

– The probability that X assumes a value in any interval lies
in the range of 0 to 1 (like all probabilities)

– The total probability of all the mutually exclusive
intervals within which X can assume a value is 1

• The second characteristic means that the area under the
curve of f(x), the probability density function, is
equal to 1.

The Normal Distribution
• This is one of the many distributions that a cts
random variable can possess

• However, it is the most widely used continuous
distribution

• A large number of phenomena in the real world
are either exactly or approximately normally
distributed

The Normal Distribution
• A continuous random variable X having a
probability density function

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \qquad -\infty < x < \infty$$

is said to have a Normal Distribution.

• We know this looks somewhat unpleasant. But
don’t worry! For the moment, just try to
recognise some familiar terms within this
equation.
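
To make the terms in the density formula concrete, here is a minimal Python sketch that evaluates f(x) directly. The function name `normal_pdf` and the choice μ = 0, σ = 1 are illustrative assumptions, not part of the lecture.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height f(x) of the Normal Curve with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma                      # distance from the mean in units of sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Illustrative parameters only: mu = 0, sigma = 1
print(normal_pdf(0.0, 0.0, 1.0))  # height of the curve at the mean
print(normal_pdf(1.0, 0.0, 1.0))  # height one standard deviation above the mean
```

Note that f(x) is a height of the curve, not a probability; probabilities come from areas under it.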
The Normal Curve
[Figure: the Normal Curve, f(x) plotted against x, with its peak at x = μ]

• The graph of the normal distribution is called the
Normal Curve

• We can notice a few interesting things about this
curve:
– It is bell shaped and symmetric
– It is centred at the mean value μ
– Its tails extend indefinitely i.e. from -∞ on the left to +∞
on the right without touching or crossing the horizontal
axis
The Normal Curve
[Figure: the Normal Curve, f(x) plotted against x, with its peak at x = μ]

• We can identify certain properties of the normal distribution:

– The mean, median and mode of the distribution coincide at x = μ
– The curve is symmetrical about a vertical axis through the point x = μ
– The total area under the curve is equal to one

• The symmetry about the mean implies that the area under
the curve to the left of the mean equals 0.5; similarly, the area
under the curve to the right of the mean is also 0.5.

• The higher the peak of the curve, the smaller the standard
deviation, and vice versa
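
The symmetry and the link between peak height and standard deviation can be checked numerically. The sketch below is only an illustration: the mean of 12 and the three σ values are assumed for the example, and `normal_pdf` is a hypothetical helper name.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height f(x) of the Normal Curve with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

mu = 12.0  # illustrative mean
for sigma in (1.0, 2.0, 4.0):
    peak = normal_pdf(mu, mu, sigma)          # the curve is highest at x = mu
    left = normal_pdf(mu - 3.0, mu, sigma)    # height 3 units below the mean
    right = normal_pdf(mu + 3.0, mu, sigma)   # height 3 units above the mean
    print(f"sigma={sigma}: peak={peak:.4f}, f(mu-3)={left:.4f}, f(mu+3)={right:.4f}")
```

The peak height equals 1 / (σ√(2π)), so doubling σ halves the peak, and the two heights at equal distances from the mean always match.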
The Parameters of Normal Distribution
• μ and σ are called the parameters of the normal
distribution.

• Each combination of μ and σ gives rise to a unique
normal curve referred to as N(μ , σ).

• No probability can be computed without values for
μ and σ.

• (Recall: what were the parameters of the Binomial
Distribution and the Poisson Distribution?)

Calculating Probabilities with the Normal Distribution
• Recall that, with cts random variables and cts distributions
such as the Normal Distribution, we cannot attach a positive
probability to X being EQUAL TO a single value

• A cts random variable can assume infinitely many values
within an INTERVAL, so the probability of any one exact value
is zero

• It is therefore only possible to determine the probability
associated with INTERVALS on the real line, for example,
Pr(X ≤ 5) or Pr(-3 ≤ X ≤ 7).

• These probabilities can be found by calculating the
relevant AREA under the normal curve

Activity
• A continuous random variable X follows a
Normal Distribution with a mean of 12 and a
standard deviation of 4.

• For each of the following cases, sketch the curve and
shade the area that represents the required probability:
P(X < 5) P(X > 5)
P(X < -2) P(X > -2)
P(-1 < X < 3) P( 2 < X < 4)
P(X < 12) P (X > 12)

Calculating Probabilities with the Normal Distribution
• The probability density function of the Normal Distribution
is given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \qquad -\infty < x < \infty$$

• In the absence of any other information, calculating
probabilities that X lies in a particular interval will require
the calculation of the relevant area under the Normal Curve
(such as the areas you just shaded in the previous slide’s
Activity)

• How do we find areas under a curve? By integrating its
equation within the relevant limits.

• Does anyone want to try integrating the above formula?
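
Although the integral cannot be done by hand, a computer can approximate the area numerically. The sketch below uses the trapezoidal rule, which is one simple choice among many; the helper names, the number of steps and the cut-off at μ ± 10σ are assumptions made for illustration. It uses the distribution from the Activity (mean 12, standard deviation 4).

```python
import math

def normal_pdf(x, mu, sigma):
    """Height f(x) of the Normal Curve with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area_under_curve(a, b, mu, sigma, steps=10_000):
    """Approximate the integral of the normal pdf from a to b (trapezoidal rule)."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h

mu, sigma = 12.0, 4.0
# Assumption: the tails beyond mu +/- 10 sigma contribute a negligible area.
lower, upper = mu - 10 * sigma, mu + 10 * sigma

print(round(area_under_curve(lower, upper, mu, sigma), 4))  # total area under the curve
print(round(area_under_curve(lower, mu, mu, sigma), 4))     # P(X < 12)
print(round(area_under_curve(2.0, 4.0, mu, sigma), 4))      # P(2 < X < 4)
```

This is essentially what statistical software does behind the scenes, and it is also how printed probability tables can be produced.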


Calculating Probabilities with the Normal Distribution
• We don’t want to have to integrate the probability density function
directly: it has no closed-form antiderivative, so the integral cannot be
worked out by hand

• The areas under the Normal Curve can be presented in a cumulative
probability table. If we had this information, we could then use the tables
to calculate the required probabilities

• However, every Normal Curve will be different, depending on the values of
the parameters μ and σ

• Therefore, there exists an infinitely large family of Normal Curves based on
different combinations of μ and σ

• Does this suggest that we need to access a book containing infinitely many
cumulative probability tables? NO, it does not.

• We can adopt a practice that allows us to convert any Normal Distribution
probability into a standard metric. This metric is known as the Standard
Normal Distribution.

The Standard Normal Distribution
• The Standard Normal Distribution is the special case of the
Normal Distribution where μ = 0 and σ = 1

• The random variable that possesses the Standard Normal
Distribution is called the Standard Normal Variable and it is
denoted by Z

• Therefore, μ = E(Z) = 0, σ = Std Dev of Z = 1, and σ² = Var(Z) = 1

• The values of Z are located on the horizontal axis of the
Standard Normal Curve.

• The values of Z are called Z scores, or
standard scores.

The Standard Normal Curve
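[Figure: the Standard Normal Curve f(z), centred at 0, with the left-tail area F(-z) below -z and the right-tail area 1 - F(z) above z]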

– It is bell shaped and symmetric
– It is centred at the mean value μ = 0
– Its tails extend indefinitely i.e. from -∞ on the left to
+∞ on the right without touching or crossing the
horizontal axis
– The units on the horizontal axis are denoted by Z and
are called the Z-values or the Z-scores

Activity
Look at your copy of the Table of the Standard
Normal Distribution and use it to compute these
probabilities.
• P(Z > 2.1)
• P(Z < 1.9)
• P(1.9 < Z < 2.1)
• P(Z > -1.9)
• P(-1.9 < Z < 1.9)
• P(Z < -2.1)
• P(0 < Z < 0.44)

Activity
Look at your copy of the Table of the Standard
Normal Distribution and use it to compute these
probabilities.
• P(Z < 1.9) = 0.9713
• P(Z > 2.1) = 1 - 0.9821 = 0.0179
• P(1.9 < Z < 2.1) = 0.9821 - 0.9713 = 0.0108
• P(Z > -1.9) = 0.9713
• P(-1.9 < Z < 1.9) = 1 - (0.0287 + 0.0287) = 0.9426
• P(Z < -2.1) = 0.0179
• P(0 < Z < 0.44) = 0.6700 - 0.5 = 0.1700
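
If no printed table is at hand, the same cumulative probabilities can be computed with Python's standard library. This is a sketch rather than the lecture's method: the function name `phi` is an assumption, and it relies on the standard identity P(Z < z) = 0.5 × (1 + erf(z / √2)). Results may differ from table values in the last decimal place because of rounding.

```python
import math

def phi(z):
    """Cumulative probability P(Z < z) for the standard normal variable Z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(f"P(Z < 1.9)        = {phi(1.9):.4f}")
print(f"P(Z > 2.1)        = {1 - phi(2.1):.4f}")
print(f"P(1.9 < Z < 2.1)  = {phi(2.1) - phi(1.9):.4f}")
print(f"P(Z > -1.9)       = {1 - phi(-1.9):.4f}")
print(f"P(-1.9 < Z < 1.9) = {phi(1.9) - phi(-1.9):.4f}")
print(f"P(Z < -2.1)       = {phi(-2.1):.4f}")
print(f"P(0 < Z < 0.44)   = {phi(0.44) - phi(0.0):.4f}")
```

Every interval probability is a difference of two cumulative values, and every right-tail probability is one minus a cumulative value.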

Standardisation
• In general, a normal distribution has a mean of μ
(not necessarily equal to zero as in the standard
case) and a variance of σ² (not necessarily equal to
1).

• Yet the tables discussed above are valid only for
the standard case where μ = 0 and σ = 1

• How then can we use the Standard Normal tables
to calculate probabilities for variables that follow a
Normal but NOT a Standard Normal Distribution?

• The way to do this is to “STANDARDISE”

Standardisation
• For a random variable X following a
normal distribution with mean μ and
standard deviation σ, a particular value of
X can be converted to its corresponding Z
value by using the formula

$$Z = \frac{X - \mu}{\sigma}$$

Standardisation: Example
• Let X be a cts random variable that has a
normal distribution with a mean of 50 and a
standard deviation of 10. Convert the
following X values to Z values and find the
probability to the left of these points.

• (1) X = 55
• (2) X = 35

Standardisation: Example
X ~ N(50, 10), i.e. μ = 50 and σ = 10

X = 55
Z = (55 - 50) / 10 = 0.50
P(Z < 0.50) = 1 - P(Z > 0.50) = 1 - 0.3085 = 0.6915

X = 35
Z = (35 - 50) / 10 = -1.50
P(Z < -1.50) = P(Z > 1.50) = 0.0668
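
The two conversions in this example can be reproduced with a short Python sketch. The helper names `z_score` and `phi` are assumptions for illustration; `phi` stands in for the Standard Normal table.

```python
import math

def phi(z):
    """Cumulative probability P(Z < z) for the standard normal variable Z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_score(x, mu, sigma):
    """Standardise a value x from a Normal distribution with mean mu and std dev sigma."""
    return (x - mu) / sigma

mu, sigma = 50.0, 10.0  # the distribution in this example
for x in (55.0, 35.0):
    z = z_score(x, mu, sigma)
    print(f"X = {x}: Z = {z:.2f}, P(X < {x}) = {phi(z):.4f}")
```

A negative Z value simply means that the X value lies below the mean.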
Activity
• Let X be a normal random variable with a
mean of 40 and a standard deviation of 5.
Find the following probabilities:

(1) P(X > 55)
(2) P(X < 49)

• Let X be a normal random variable with a
mean of 50 and a variance of 8. Find
P(30 < X < 39)

How do we use the Table of the Standard Normal Distribution to find
a probability under the normal distribution N(μ , σ)?
Finding P(x1 <X < x2)
• Sketch the curve of N(μ , σ) and highlight the area
which relates to P(x1 <X < x2)
• Transform the end points X = x1 and X = x2 to Z scores
Z1 and Z2 respectively using the formula
$$Z = \frac{X - \mu}{\sigma}$$
• Sketch the equivalent area under the Standard
Normal Curve bounded by Z1 and Z2
• Read off the area from the table of the Standard
Normal Curve.
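
The steps above can be summarised in one line. Here F(z) denotes the cumulative area P(Z < z) read from the table, following the labelling on the Standard Normal Curve slide; the subscripted z_1 and z_2 correspond to Z1 and Z2 above.

```latex
P(x_1 < X < x_2)
  = P\!\left(\frac{x_1 - \mu}{\sigma} < Z < \frac{x_2 - \mu}{\sigma}\right)
  = P(z_1 < Z < z_2)
  = F(z_2) - F(z_1)
```

If the table only gives tail areas, the same quantity can be found as 1 minus the sum of the two tails, P(Z < z_1) + P(Z > z_2).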

Applications of the Normal Distribution:
Activity
The monthly share deposits of members
of a Credit Union are normally distributed
with mean $500 and standard deviation
$150.

Find the probability that in any month the
deposits will range between $250 and
$875.

Solution
• Let X represent the monthly share deposits of members
• X ~ N(500, 150), i.e. μ = 500 and σ = 150
• We therefore need to find P(250 < X < 875)
• Standardizing:
  Z1 = (250 - 500) / 150 = -1.67
  Z2 = (875 - 500) / 150 = 2.50
• Now we have the two corresponding Z values hence we can
use the Standard Normal Distribution and its Table

• Our resultant probability is: 1 - (0.0062 + 0.0475) = 0.9463, or about 0.946
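
As a cross-check on the table lookup, the same probability can be computed directly. This is a sketch with assumed helper names (`phi`, `normal_interval_prob`); because it does not round the Z values to two decimals, its answer can differ from a table-based answer in the third decimal place.

```python
import math

def phi(z):
    """Cumulative probability P(Z < z) for the standard normal variable Z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_interval_prob(x1, x2, mu, sigma):
    """P(x1 < X < x2) for X Normal with mean mu and standard deviation sigma."""
    z1 = (x1 - mu) / sigma   # standardise the left end point
    z2 = (x2 - mu) / sigma   # standardise the right end point
    return phi(z2) - phi(z1)

# Credit Union deposits: mean $500, standard deviation $150
p = normal_interval_prob(250.0, 875.0, 500.0, 150.0)
print(round(p, 3))
```

The direct calculation gives a probability of roughly 0.946.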


The Normal Approximation
to the Binomial Distribution

• This approximation is a special case of the very famous Central
Limit Theorem (which we will meet again soon), and is of both
practical and theoretical importance.

• In particular, it remains very useful notwithstanding the
widespread use of electronic computers.

• We have already seen that if X ~ Bin(n, p),
then E(X) = np and Var(X) = npq, where q = 1 - p

• If n is large, we can approximate X by a Normal Distribution
with mean np and variance npq

The Normal Approximation
to the Binomial Distribution
• Remember, we are approximating a DISCRETE distribution by a
CONTINUOUS one

• Before we approximate, we must apply what is known as a Continuity
Correction, to convert the discrete random variable into a continuous one.

• The continuity correction is made by subtracting 0.5 from the lower limit of
the interval and/or adding 0.5 to the upper limit of the interval.

• For example, if X is a discrete random variable that follows a Binomial
Probability Distribution and we are required to find Pr(X ≤ 9), then the
binomial probability Pr(X ≤ 9) will be approximated by the normal
probability Pr(X < 9.5), adding 0.5 to the upper limit (there is no lower
limit).

• Similarly, Pr(X ≥ 10) will become Pr(X > 9.5), and Pr(5 ≤ X ≤ 8) will become
Pr(4.5 < X < 8.5).

The Normal Approximation
to the Binomial Distribution
• 75% of students on the U.W.I campus are known to be female. A
sample of 100 students is drawn. What is the probability that
there will be more than 20 male students?

• Let X be the number of male students in the sample. The proportion
of male students is 0.25 (the value of p), so X ~ Bin(100, 0.25). If we
use the Binomial distribution, we must evaluate: Pr(X > 20) =
Pr(X = 21) + Pr(X = 22) + ... + Pr(X = 100)

• This is a Herculean task by hand, which should only be carried out using
MINITAB or some other statistical software. Doing so yields a
value of Pr(X > 20) = 0.8512
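
For readers without MINITAB, the exact binomial sum can also be evaluated in a few lines of Python. This is an illustrative sketch, not the software used in the lecture; the helper names are assumptions, and `math.comb` requires Python 3.8 or later.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_upper_tail(k, n, p):
    """P(X > k) for X ~ Bin(n, p), adding up P(X = k+1) + ... + P(X = n)."""
    return sum(binomial_pmf(j, n, p) for j in range(k + 1, n + 1))

# More than 20 males in a sample of 100 when p = 0.25
print(round(binomial_upper_tail(20, 100, 0.25), 4))
```

The sum has 80 terms, which is tedious by hand but trivial for a computer.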

• Since n = 100 is a relatively large number, we may use the normal
distribution to approximate the value of Pr(X > 20).

The Normal Approximation
to the Binomial Distribution
• The normal distribution in this case would have a mean of np =
100 x 0.25 = 25

• and a variance of npq = 100 x 0.25 x 0.75 = 18.75

• Since Pr(X > 20) = 1 - Pr(X ≤ 20), we must evaluate Pr(X ≤ 20). Employing
the continuity correction discussed above, we must evaluate Pr(X ≤ 20.5)
as follows:

$$\Pr(X \le 20.5) = \Pr\left(Z \le \frac{20.5 - 25}{\sqrt{18.75}}\right) = \Pr(Z \le -1.04) = 0.1492$$

• So that, finally, Pr(X > 20) = 1 - 0.1492 = 0.8508

• This value is reasonably close to that obtained using the Binomial
distribution.

Using The Normal Distribution To Approximate The Binomial
Distribution

1. Let X be a Binomial Variable with parameters
n and p.
2. Estimate P(a ≤ X ≤ b) by a Normal
Approximation
3. Perform the Continuity Correction i.e.
P(a ≤ X ≤ b) ≈ P(a - 0.5 < X < b + 0.5)
4. Set up the transformation
$$Z = \frac{X - np}{\sqrt{npq}}$$

Using The Normal Distribution To Approximate The Binomial
Distribution

5. Transform the left end point a - 0.5 to z1
6. Transform the right end point b + 0.5 to z2
7. Sketch a curve of the Standard Normal
Distribution and shade the area that
corresponds to P( z1 < Z < z2)
8. Read off the area from the Std Normal Table
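
Steps 1 to 8 can be collected into a single function. The sketch below is one possible arrangement under stated assumptions: the names `phi` and `normal_approx_binomial` are hypothetical, and `phi` replaces the table lookup in step 8.

```python
import math

def phi(z):
    """Cumulative probability P(Z < z) for the standard normal variable Z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_approx_binomial(n, p, a, b):
    """Approximate P(a <= X <= b) for X ~ Bin(n, p) by a Normal distribution."""
    mean = n * p                        # np
    sd = math.sqrt(n * p * (1 - p))     # square root of npq
    z1 = (a - 0.5 - mean) / sd          # continuity-corrected left end point
    z2 = (b + 0.5 - mean) / sd          # continuity-corrected right end point
    return phi(z2) - phi(z1)

# Lecture example: more than 20 males out of 100, i.e. P(21 <= X <= 100)
print(round(normal_approx_binomial(100, 0.25, 21, 100), 4))
```

For the lecture example this gives a value of about 0.85, in line with the exact binomial answer. It differs slightly from a table-based answer because the Z values are not rounded to two decimals.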

End of Lecture
• This material is covered in PS Mann,
Chapter 6

• Please ensure that you revise this material
before next week’s class. It is VITAL that you
do so!
