M4 StatEcon 3rd Probability

Introduction to
Normal Distributions &

Standard Distribution
Properties of Normal Distributions
A continuous random variable has an infinite number of possible

values that can be represented by an interval on the number line.
[Figure: number line of hours spent studying in a day, marked 0, 3, 6, ..., 24]
The time spent studying can be any number between 0 and 24.
The probability distribution of a continuous random variable is
called a continuous probability distribution.
Larson & Farber, Elementary Statistics: Picturing the World, 3e

The most important probability distribution in
statistics is the normal distribution.
Normal curve
A normal distribution is a continuous probability

distribution for a random variable, x. The graph of a
normal distribution is called the normal curve.

Properties of a Normal Distribution

1. The mean, median, and mode are equal.
2. The normal curve is bell-shaped and symmetric about the
mean.
3. The total area under the curve is equal to one.
4. The normal curve approaches, but never touches the x-axis as
it extends farther and farther away from the mean.
5. Between μ − σ and μ + σ (in the center of the curve), the
graph curves downward. The graph curves upward to the left
of μ − σ and to the right of μ + σ. The points at which the
curve changes from curving upward to curving downward are
called the inflection points.
[Figure: normal curve with total area = 1 and inflection points at μ − σ and μ + σ; x-axis marked μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ, μ + 3σ]
If x is a continuous random variable having a normal
distribution with mean μ and standard deviation σ, you can
graph a normal curve with the equation
$$y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}, \qquad e \approx 2.718, \quad \pi \approx 3.14.$$
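The curve equation can be checked numerically. The sketch below is an illustration in Python (not part of the slides): it evaluates the formula directly and compares it with the standard library's `statistics.NormalDist`. The function name `normal_curve` and the test point 8.7 are assumptions chosen for the example.

```python
import math
from statistics import NormalDist

def normal_curve(x, mu, sigma):
    """Height y of the normal curve at x, for mean mu and standard deviation sigma."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Peak height of the curve with mu = 0 and sigma = 1 (reached at x = mu).
peak = normal_curve(0.0, 0.0, 1.0)

# The hand-written formula should agree with the library's density function.
manual_value = normal_curve(8.7, 8.0, 0.7)
library_value = NormalDist(mu=8.0, sigma=0.7).pdf(8.7)
```

Because the curve is symmetric about μ, `normal_curve(mu - d, mu, sigma)` and `normal_curve(mu + d, mu, sigma)` return the same height for any distance `d`.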

Means and Standard Deviations (SD)
A normal distribution can have any mean and
any positive standard deviation.
The mean gives the location of the line of symmetry.
The standard deviation describes the spread of the data.
[Figure: two normal curves with their inflection points marked. Left curve: mean μ = 3.5, SD σ ≈ 0.3, x-axis marked 1 to 6. Right curve: mean μ = 6, SD σ ≈ 1.9, x-axis marked 1 to 11]

Means and Standard Deviations
Example:
1. Which curve has the greater mean?
2. Which curve has the greater standard deviation?
[Figure: normal curves A and B on a shared x-axis marked 1, 3, 5, 7, 9, 11, 13]
The line of symmetry of curve A occurs at x = 5. The line of

symmetry of curve B occurs at x = 9. Curve B has the greater mean.
Curve B is more spread out than curve A, so curve B has the greater
standard deviation.
(If the population distribution follows a normal distribution,
then so does the distribution of the sample mean.)
Interpreting Graphs
Example:
The heights of fully grown magnolia bushes are normally
distributed. The curve represents the distribution. What is the
mean height of a fully grown magnolia bush? Estimate the
standard deviation.
The inflection points are one standard deviation away from
the mean, so μ = 8 and σ ≈ 0.7.
[Figure: normal curve of height (in feet); x-axis marked 6, 7, 8, 9, 10]
The heights of the magnolia bushes are normally distributed

with a mean height of about 8 feet and a standard deviation of
about 0.7 feet.

The Standard Normal Distribution
The standard normal distribution is a normal distribution
with a mean of 0 and a standard deviation of 1.
The horizontal scale corresponds to z-scores.
[Figure: standard normal curve; z-axis marked −3, −2, −1, 0, 1, 2, 3]
Any value can be transformed into a z-score by using the formula
$$z = \frac{\text{Value} - \text{Mean}}{\text{Standard deviation}} = \frac{x - \mu}{\sigma}.$$

The Standard Normal Distribution
If each data value of a normally distributed random
variable x is transformed into a z-score, the result will be
the standard normal distribution.
The area that falls in the interval under the nonstandard
normal curve (the x-values) is the same as the area under
the standard normal curve (within the corresponding
z-boundaries).
[Figure: standard normal curve; z-axis marked −3, −2, −1, 0, 1, 2, 3]
After the formula is used to transform an x-value into a
z-score, the Standard Normal Table in Appendix B is
used to find the cumulative area under the curve.
The Standard Normal Table
Properties of the Standard Normal Distribution
1. The cumulative area is close to 0 for z-scores close to z = −3.49.
2. The cumulative area increases as the z-scores increase.
3. The cumulative area for z = 0 is 0.5000.
4. The cumulative area is close to 1 for z-scores close to z = 3.49.
[Figure: standard normal curve, z-axis marked −3 to 3. The area is close to 0 at z = −3.49, is 0.5000 at z = 0, and is close to 1 at z = 3.49]

Example:
Find the cumulative area that corresponds to a z-score of 2.71.
Appendix B: Standard Normal Table
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
0.1 .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
0.2 .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
2.6 .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7 .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8 .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
Find the area by finding 2.7 in the left hand column, and then moving
across the row to the column under 0.01.
The area to the left of z = 2.71 is 0.9966.
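The table lookup can be reproduced in software. The following sketch uses Python's standard library (an assumption of this example, since the slides work from the printed table) to compute the cumulative area to the left of z = 2.71.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Cumulative area to the left of z = 2.71, rounded to four places like the table.
area_left = round(standard_normal.cdf(2.71), 4)
print(area_left)
```

The printed value matches the table entry in row 2.7, column .01.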
Example:
Find the cumulative area that corresponds to a z-score
of −0.25.
z .09 .08 .07 .06 .05 .04 .03 .02 .01 .00
−3.4 .0002 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003
−3.3 .0003 .0004 .0004 .0004 .0004 .0004 .0004 .0005 .0005 .0005
−0.3 .3483 .3520 .3557 .3594 .3632 .3669 .3707 .3745 .3783 .3821
−0.2 .3859 .3897 .3936 .3974 .4013 .4052 .4090 .4129 .4168 .4207
−0.1 .4247 .4286 .4325 .4364 .4404 .4443 .4483 .4522 .4562 .4602
−0.0 .4641 .4681 .4721 .4761 .4801 .4840 .4880 .4920 .4960 .5000
Find the area by finding −0.2 in the left hand column, and then moving
across the row to the column under 0.05.
The area to the left of z = −0.25 is 0.4013.
Finding Areas
Finding Areas Under the Standard Normal Curve

1. Sketch the standard normal curve and shade the appropriate
area under the curve.
2. Find the area by following the directions for each case shown.
a. To find the area to the left of z, find the area that
corresponds to z in the Standard Normal Table.
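1. Use the table to find the area for the z-score.
2. The area to the left of z = 1.23 is 0.8907.
Check using symmetry: the area to the left of z = 0 is 0.5000, and the
area to the right of z = 1.23 is 0.1093. The area between z = 0 and
z = 1.23 is therefore 0.5000 − 0.1093 = 0.3907, so the area to the left
of z = 1.23 is 0.5000 + 0.3907 = 0.8907.
[Figure: standard normal curve with the area to the left of z = 1.23 shaded]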

Guidelines for Finding Areas

b. To find the area to the right of z, use the Standard
Normal Table to find the area that corresponds to z.
Then subtract the area from 1.
1. Use the table to find the area for the z-score.
2. The area to the left of z = 1.23 is 0.8907.
3. Subtract to find the area to the right of z = 1.23:
1 − 0.8907 = 0.1093.
[Figure: standard normal curve with the area to the right of z = 1.23 shaded]


c. To find the area between two z-scores, find the area
corresponding to each z-score in the Standard Normal Table.
Then subtract the smaller area from the larger area.
1. Use the table to find the area for each z-score.
2. The area to the left of z = 1.23 is 0.8907.
3. The area to the left of z = −0.75 is 0.2266.
4. Subtract to find the area of the region between the two z-scores:
0.8907 − 0.2266 = 0.6641.
[Figure: standard normal curve with the area between z = −0.75 and z = 1.23 shaded]

Example:
Find the area under the standard normal
curve to the left of z = −2.33.
Always draw the curve!
[Figure: standard normal curve with the area to the left of z = −2.33 shaded]
From the Standard Normal Table, the area is equal to 0.0099.
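Example:
Find the area under the standard normal
curve to the right of z = 0.94.
Always draw the curve!
[Figure: standard normal curve with the area to the right of z = 0.94 shaded; the area to the left is 0.8264]
1 − 0.8264 = 0.1736
From the Standard Normal Table, the area is equal to 0.1736.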

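Example:
Find the area under the standard normal
curve between z = −1.98 and z = 1.07.
Always draw the curve!
[Figure: standard normal curve with the area between z = −1.98 and z = 1.07 shaded; the areas to the left of the two z-scores are 0.0239 and 0.8577]
0.8577 − 0.0239 = 0.8338
From the Standard Normal Table, the area is equal to 0.8338.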
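The three cases (left of z, right of z, between two z-scores) can be sketched as small helper functions. This is a Python illustration under the assumption that a software cumulative distribution function replaces the printed table; the helper names are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_left(z):
    """Case a: cumulative area to the left of z."""
    return Z.cdf(z)

def area_right(z):
    """Case b: subtract the area to the left of z from 1."""
    return 1.0 - Z.cdf(z)

def area_between(z_low, z_high):
    """Case c: subtract the smaller cumulative area from the larger one."""
    return Z.cdf(z_high) - Z.cdf(z_low)

left = area_left(-2.33)             # table answer: 0.0099
right = area_right(0.94)            # table answer: 0.1736
between = area_between(-1.98, 1.07) # table answer: 0.8338
```

Results computed this way can differ from table answers in the fourth decimal place, because the table rounds each cumulative area before subtracting.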

Normal Distributions:
Finding Probabilities
Probability and Normal Distributions
If a random variable, x, is normally distributed,

you can find the probability that x will fall in a
given interval by calculating the area under the
normal curve for that interval.
[Figure: normal curve with μ = 10 and σ = 5; the area for P(x < 15) is shaded]

Normal Distribution: μ = 10, σ = 5, P(x < 15)
Standard Normal Distribution: μ = 0, σ = 1, P(z < 1)
Both curves have the same shaded area:
P(x < 15) = P(z < 1) = Shaded area under the curve
= 0.8413
(The remaining area, to the right of z = 1, is 1 − 0.8413 = 0.1587.)

Example:
The average on a statistics test was 78 with a standard
deviation of 8. If the test scores are normally distributed,
find the probability that a student receives a test score less
than 90.
μ = 78, σ = 8
$$z = \frac{x - \mu}{\sigma} = \frac{90 - 78}{8} = 1.5$$
P(x < 90) = P(z < 1.5) = 0.9332
[Figure: normal curve with x = 90 marked to the right of μ = 78, corresponding to z = 1.5]
The probability that a student receives a test score less than
90 is 0.9332.

Example:
The average on a statistics test was 78 with a standard deviation
of 8. If the test scores are normally distributed, find the
probability that a student receives a test score greater than
85.
μ = 78, σ = 8
$$z = \frac{x - \mu}{\sigma} = \frac{85 - 78}{8} = 0.875 \approx 0.88$$
P(x > 85) = P(z > 0.88) = 1 − P(z < 0.88) = 1 − 0.8106 = 0.1894
[Figure: normal curve with x = 85 marked to the right of μ = 78, corresponding to z = 0.88]
The probability that a student receives a test score greater
than 85 is 0.1894.

Example:
The average on a statistics test was 78 with a standard deviation of 8.
If the test scores are normally distributed, find the probability that a
student receives a test score between 60 and 80.
μ = 78, σ = 8
$$z_1 = \frac{x - \mu}{\sigma} = \frac{60 - 78}{8} = -2.25$$
$$z_2 = \frac{x - \mu}{\sigma} = \frac{80 - 78}{8} = 0.25$$
[Figure: normal curve with x = 60 and x = 80 marked around μ = 78, corresponding to z = −2.25 and z = 0.25]
P(60 < x < 80) = P(−2.25 < z < 0.25) = P(z < 0.25) − P(z < −2.25)
= 0.5987 − 0.0122 = 0.5865
The probability that a student receives a test score between
60 and 80 is 0.5865.
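The three test-score examples can be verified without converting to z-scores by hand. The sketch below is a Python illustration (the variable names are assumptions) that works directly with the distribution of scores, μ = 78 and σ = 8.

```python
from statistics import NormalDist

scores = NormalDist(mu=78, sigma=8)

# P(x < 90): area to the left of 90.
p_less_than_90 = scores.cdf(90)

# P(x > 85): one minus the area to the left of 85.
p_greater_than_85 = 1 - scores.cdf(85)

# P(60 < x < 80): difference of two cumulative areas.
p_between_60_and_80 = scores.cdf(80) - scores.cdf(60)
```

The second result comes out near 0.191 rather than the table answer 0.1894, because the software uses the unrounded z = 0.875 while the table method rounds it to 0.88 first.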
Normal Distributions:
Finding Values
Finding z-Scores
Example:
Find the z-score that corresponds to a cumulative area
of 0.9973.
Appendix B: Standard Normal Table
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
0.1 .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
0.2 .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
2.6 .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7 .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8 .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
Find the z-score by locating 0.9973 in the body of the Standard Normal
Table (row 2.7, column .08). The values at the beginning of the
corresponding row and at the top of the column give the z-score.
The z-score is 2.78.
Finding z-Scores
Example:
Find the z-score that corresponds to a cumulative area
of 0.4170.
z .09 .08 .07 .06 .05 .04 .03 .02 .01 .00
−3.4 .0002 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003
−3.3 .0003 .0004 .0004 .0004 .0004 .0004 .0004 .0005 .0005 .0005
−0.3 .3483 .3520 .3557 .3594 .3632 .3669 .3707 .3745 .3783 .3821
−0.2 .3859 .3897 .3936 .3974 .4013 .4052 .4090 .4129 .4168 .4207
−0.1 .4247 .4286 .4325 .4364 .4404 .4443 .4483 .4522 .4562 .4602
−0.0 .4641 .4681 .4721 .4761 .4801 .4840 .4880 .4920 .4960 .5000
Find the z-score by locating 0.4170 in the body of the Standard Normal
Table. Use the value closest to 0.4170, which is 0.4168 (row −0.2, column .01).
The z-score is −0.21.
Finding a z-Score Given a Percentile
Example:
Find the z-score that corresponds to P75.
[Figure: standard normal curve with area = 0.75 shaded to the left of z = 0.67]
The z-score that corresponds to P75 is the same z-score that
corresponds to an area of 0.75.
The z-score is 0.67.
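Going from a cumulative area back to a z-score is the inverse of the table lookup. As a rough Python sketch (assuming the standard library's `NormalDist.inv_cdf` stands in for reading the table backwards), the three examples above become:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# inv_cdf(p) returns the z-score whose cumulative (left-tail) area is p.
z_for_9973 = round(Z.inv_cdf(0.9973), 2)
z_for_4170 = round(Z.inv_cdf(0.4170), 2)
z_for_p75 = round(Z.inv_cdf(0.75), 2)
```

Rounding to two decimal places mirrors the table, which lists z-scores only to hundredths.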

Transforming a z-Score to an x-Score
To transform a standard z-score to a data value, x, in
a given population, use the formula
x = μ + zσ.
Example:
The monthly electric bills in a city are normally distributed
with a mean of $120 and a standard deviation of $16. Find
the x-value corresponding to a z-score of 1.60.
x = μ + zσ
= 120 + 1.60(16)
= 145.6
We can conclude that an electric bill of $145.60 is 1.6 standard
deviations above the mean.
Finding a Specific Data Value
Example:
The weights of bags of chips for a vending machine are
normally distributed with a mean of 1.25 ounces and a
standard deviation of 0.1 ounce. Bags that have weights in
the lower 8% are too light and will not work in the machine.
What is the least a bag of chips can weigh and still work in the
machine?
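P(z < ?) = 0.08
P(z < −1.41) = 0.08
x = μ + zσ
= 1.25 + (−1.41)(0.1)
≈ 1.11
[Figure: standard normal curve with the lower 8% shaded to the left of z = −1.41; on the x scale this corresponds to x ≈ 1.11, below the mean 1.25]
The least a bag can weigh and still work in the machine is 1.11 ounces.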
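The two steps of this example (find z for the given area, then apply x = μ + zσ) can be sketched in Python. The variable names are assumptions for illustration, and the last line shows a shortcut that asks the non-standard distribution for the value directly.

```python
from statistics import NormalDist

mu, sigma = 1.25, 0.1   # bag weights in ounces
lower_fraction = 0.08   # bags in the lower 8% are too light

# Step 1: z-score with a cumulative area of 0.08.
z = NormalDist().inv_cdf(lower_fraction)

# Step 2: transform the z-score to a data value, x = mu + z * sigma.
x_from_z = mu + z * sigma

# Shortcut: the same cutoff taken directly from the weight distribution.
x_direct = NormalDist(mu, sigma).inv_cdf(lower_fraction)
```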

Sampling Distributions and
the Central Limit Theorem
Sampling Distributions
A sampling distribution is the probability distribution of a
sample statistic that is formed when samples of size n are
repeatedly taken from a population.
[Figure: a population from which many samples are repeatedly drawn]

Sampling Distributions
If the sample statistic is the sample mean, then the
distribution is the sampling distribution of sample means.
[Figure: six samples, Sample 1 to Sample 6, each with its own sample mean]
The sampling distribution consists of the values of the
sample means, $\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4, \bar{x}_5, \bar{x}_6$.

Properties of Sampling Distributions
of Sample Means
1. The mean of the sample means, $\mu_{\bar{x}}$, is equal to the population
mean.
$$\mu_{\bar{x}} = \mu$$
2. The standard deviation of the sample means, $\sigma_{\bar{x}}$, is equal to the
population standard deviation, σ, divided by the square root of n.
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
The standard deviation of the sampling distribution of the sample
means is called the standard error of the mean.

Sampling Distribution of Sample Means
Example:
The population values {5, 10, 15, 20} are written on slips of
paper and put in a hat. Two slips are randomly selected, with
replacement.
a. Find the mean, standard deviation, and variance of the
population.
Population: 5, 10, 15, 20
μ = 12.5
σ ≈ 5.59
σ² = 31.25
Continued.

Example continued:
b. Graph the probability histogram for the population
values.
[Figure: Probability Histogram of Population of x; each of the population values 5, 10, 15, 20 has probability P(x) = 0.25]
This uniform distribution shows that all values have
the same probability of being selected.
Continued.

Example continued:
c. List all the possible samples of size n = 2 and calculate
the mean of each.
Sample    Sample mean    Sample    Sample mean
5, 5      5              15, 5     10
5, 10     7.5            15, 10    12.5
5, 15     10             15, 15    15
5, 20     12.5           15, 20    17.5
10, 5     7.5            20, 5     12.5
10, 10    10             20, 10    15
10, 15    12.5           20, 15    17.5
10, 20    15             20, 20    20
These means form the sampling distribution of the sample means.
Continued.
Example continued:
The population values {5, 10, 15, 20} are written on slips of paper
and put in a hat. Two slips are randomly selected, with
replacement.
d. Create the probability distribution of the sample means.
Probability Distribution of Sample Means
Sample mean    f    Probability
5              1    0.0625
7.5            2    0.1250
10             3    0.1875
12.5           4    0.2500
15             3    0.1875
17.5           2    0.1250
20             1    0.0625
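Parts (a), (c) and (d) of this example can be reproduced by brute-force enumeration. The sketch below is a Python illustration, not part of the slides: it lists all 16 ordered samples of size 2 drawn with replacement and checks the two properties of the sampling distribution of sample means. Variable names are assumptions.

```python
from collections import Counter
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [5, 10, 15, 20]
n = 2  # sample size

# All ordered samples of size n drawn with replacement (4 ** 2 = 16 samples).
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

# Frequency f of each sample mean, as in the table for part (d).
frequencies = Counter(sample_means)
probabilities = {m: f / len(samples) for m, f in frequencies.items()}

# Population parameters from part (a).
mu = mean(population)
sigma = pstdev(population)

# Mean and standard deviation (standard error) of the sample means.
mean_of_means = mean(sample_means)
standard_error = pstdev(sample_means)
```

Here `mean_of_means` equals `mu`, and `standard_error` equals `sigma / sqrt(n)`, which is what the two properties state.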

Example continued:
The population values {5, 10, 15, 20} are written on slips of paper and
put in a hat. Two slips are randomly selected, with replacement.
e. Graph the probability histogram for the sampling distribution.
[Figure: Probability Histogram of Sampling Distribution; probability P(x̄) against sample mean 5, 7.5, 10, 12.5, 15, 17.5, 20, peaking at 0.25]
The shape of the graph is symmetric and bell shaped. It approximates a
normal distribution.

The Central Limit Theorem
If a sample of size n ≥ 30 is taken from a population with
any type of distribution that has a mean = μ and standard
deviation = σ, the sample means will have an approximately
normal distribution.
[Figure: a population distribution of any shape, and the bell-shaped distribution of its sample means]

If the population itself is normally distributed, with
mean = μ and standard deviation = σ, the sample means will
have a normal distribution for any sample size n.
[Figure: a normal population distribution, and the normal distribution of its sample means]

In either case, the sampling distribution of sample means
has a mean equal to the population mean.
$$\mu_{\bar{x}} = \mu \qquad \text{(mean of the sample means)}$$
The sampling distribution of sample means has a standard
deviation equal to the population standard deviation
divided by the square root of n.
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \qquad \text{(standard deviation of the sample means)}$$
This is also called the standard error of the mean.
The Mean and Standard Error
Example:
The heights of fully grown magnolia bushes have a mean
height of 8 feet and a standard deviation of 0.7 feet.
Samples of 38 bushes are randomly selected from the
population, and the mean of each sample is determined.
Find the mean and standard error of the mean of the
sampling distribution.
Mean:
$$\mu_{\bar{x}} = \mu = 8$$
Standard deviation (standard error):
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{0.7}{\sqrt{38}} \approx 0.11$$
Continued.
Interpreting the Central Limit Theorem
Example continued:
The heights of fully grown magnolia bushes have a mean height
of 8 feet and a standard deviation of 0.7 feet. Samples of 38
bushes are randomly selected from the population, and the mean
of each sample is determined.
The mean of the sampling distribution is 8 feet, and the standard
error of the sampling distribution is 0.11 feet.
From the Central Limit Theorem, because the sample size is greater
than 30, the sampling distribution can be approximated by the normal
distribution with $\mu_{\bar{x}} = 8$ and $\sigma_{\bar{x}} = 0.11$.
[Figure: normal curve of the sample means; axis marked 7.6, 8, 8.4]

Example:
The heights of fully grown magnolia bushes have a
mean height of 8 feet and a standard deviation of 0.7
feet. Samples of 38 bushes are randomly selected from the
population, and the mean of each sample is determined.
The mean of the sampling distribution is 8 feet, and the
standard error of the sampling distribution is 0.11 feet
($\mu_{\bar{x}} = 8$, $\sigma_{\bar{x}} = 0.11$, n = 38).
Find the probability that the mean height of the 38 bushes is
less than 7.8 feet.
[Figure: normal curve of the sample means; axis marked 7.6, 7.8, 8, 8.4, with the area to the left of 7.8 shaded]
Continued.
Example continued:
Find the probability that the mean height of the 38
bushes is less than 7.8 feet.
$\mu_{\bar{x}} = 8$, $\sigma_{\bar{x}} = 0.11$, n = 38
$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{7.8 - 8}{0.11} \approx -1.82$$
$$P(\bar{x} < 7.8) = P(z < -1.82) = 0.0344$$
[Figure: normal curve of the sample means with 7.8 marked, corresponding to z = −1.82]
The probability that the mean height of the 38 bushes is
less than 7.8 feet is 0.0344.
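The magnolia calculation can be sketched in Python as follows. This is an illustration under the assumption that software replaces the table; it also shows how much the answer depends on rounding the standard error to 0.11 before computing z.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8.0, 0.7, 38

# Standard error of the mean: sigma / sqrt(n).
standard_error = sigma / sqrt(n)

# P(sample mean < 7.8) using the unrounded standard error.
p_exact_se = NormalDist(mu, standard_error).cdf(7.8)

# The same probability using the rounded standard error 0.11, as on the slide.
p_rounded_se = NormalDist(mu, 0.11).cdf(7.8)
```

With the unrounded standard error (about 0.1136) the probability is roughly 0.039; with the rounded value 0.11 it is close to the slide's 0.0344. The gap comes from rounding in the intermediate steps, not from a different method.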
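Example:
The average on a statistics test was 78 with a standard
deviation of 8. If the test scores are normally distributed,
find the probability that the mean score of 25 randomly
selected students is between 75 and 79.
$$\mu_{\bar{x}} = 78, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{8}{\sqrt{25}} = 1.6$$
$$z_1 = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{75 - 78}{1.6} \approx -1.88$$
$$z_2 = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{79 - 78}{1.6} \approx 0.63$$
[Figure: normal curve of the sample means with 75, 78, 79 marked, corresponding to z = −1.88, 0, 0.63]
Continued.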
Example continued:
$$P(75 < \bar{x} < 79) = P(-1.88 < z < 0.63) = P(z < 0.63) - P(z < -1.88)$$
= 0.7357 − 0.0301 = 0.7056
The probability that the mean score of the 25 students is between
75 and 79 is 0.7056. Equivalently, about 70.56% of all samples of
25 students will have a mean score between 75 and 79.
Probabilities of x and x̄
Example:
The population mean salary for auto mechanics is
μ = $34,000 with a standard deviation of σ = $2,500. Find
the probability that the mean salary for a randomly selected
sample of 50 mechanics is greater than $35,000.
$$\mu_{\bar{x}} = 34000, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2500}{\sqrt{50}} \approx 353.55$$
$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{35000 - 34000}{353.55} \approx 2.83$$
$$P(\bar{x} > 35000) = P(z > 2.83) = 1 - P(z < 2.83)$$
= 1 − 0.9977 = 0.0023
[Figure: normal curve of the sample means with 34000 and 35000 marked, corresponding to z = 0 and z = 2.83]
The probability that the mean salary for a randomly selected
sample of 50 mechanics is greater than $35,000 is 0.0023.

Example:
The population mean salary for auto mechanics is
μ = $34,000 with a standard deviation of σ = $2,500. Find
the probability that the salary for one randomly selected
mechanic is greater than $35,000.
(Notice that the Central Limit Theorem does not apply.)
μ = 34000, σ = 2500
$$z = \frac{x - \mu}{\sigma} = \frac{35000 - 34000}{2500} = 0.4$$
P(x > 35000) = P(z > 0.4) = 1 − P(z < 0.4)
= 1 − 0.6554 = 0.3446
[Figure: normal curve with 34000 and 35000 marked, corresponding to z = 0 and z = 0.4]
The probability that the salary for one mechanic is greater
than $35,000 is 0.3446.
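The contrast between one mechanic and the mean of 50 mechanics is easy to see side by side. The following Python sketch (variable names are assumptions) computes both probabilities from the same μ and σ; only the standard deviation used for the curve changes.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 34_000, 2_500, 50

# One randomly selected mechanic: use the population standard deviation.
# (This step relies on salaries themselves being normally distributed.)
p_one_mechanic = 1 - NormalDist(mu, sigma).cdf(35_000)

# Mean of a sample of 50 mechanics: use the standard error sigma / sqrt(n).
standard_error = sigma / sqrt(n)
p_sample_mean = 1 - NormalDist(mu, standard_error).cdf(35_000)
```

The sample mean varies far less than an individual salary, so a sample mean above $35,000 is much less likely than a single salary above $35,000.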

Example:
The probability that the salary for one randomly selected
mechanic is greater than $35,000 is 0.3446. In a group of
50 mechanics, approximately how many would have a
salary greater than $35,000?
P(x > 35000) = 0.3446
This also means that 34.46% of mechanics have a salary greater than
$35,000.
34.46% of 50 = 0.3446 × 50 = 17.23
You would expect about 17 mechanics out of the group
of 50 to have a salary greater than $35,000.
Normal Approximations to
Binomial Distributions
Normal Approximation
The normal distribution is used to approximate the
binomial distribution when it would be impractical
to use the binomial distribution to find a probability.
Normal Approximation to a Binomial Distribution
If np ≥ 5 and nq ≥ 5, then the binomial random variable x
is approximately normally distributed with mean
μ = np
and standard deviation
σ = √(npq).

Normal Approximation
Example:
Decide whether the normal distribution may be used to
approximate x in the following examples.
1. Thirty-six percent of people in the United States own
a dog. You randomly select 25 people in the United
States and ask them if they own a dog.
np = (25)(0.36) = 9
nq = (25)(0.64) = 16
Because np and nq are both at least 5, the normal distribution may be used.
2. Fourteen percent of people in the United States own
a cat. You randomly select 20 people in the United
States and ask them if they own a cat.
np = (20)(0.14) = 2.8
nq = (20)(0.86) = 17.2
Because np is less than 5, the normal distribution may NOT be used.
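The np ≥ 5 and nq ≥ 5 check is a one-line test. A minimal Python sketch, with a hypothetical helper name and the threshold exposed as a parameter:

```python
def normal_approximation_ok(n, p, threshold=5):
    """Return True when both np and nq reach the threshold (q = 1 - p)."""
    q = 1 - p
    return n * p >= threshold and n * q >= threshold

# Dog owners: n = 25, p = 0.36 gives np = 9 and nq = 16.
dog_owners_ok = normal_approximation_ok(25, 0.36)

# Cat owners: n = 20, p = 0.14 gives np = 2.8 and nq = 17.2.
cat_owners_ok = normal_approximation_ok(20, 0.14)
```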
Correction for Continuity
The binomial distribution is discrete and can be represented
by a probability histogram.
To calculate exact binomial probabilities, the binomial formula is
used for each value of x and the results are added.
Exact binomial probability: P(x = c)
Normal approximation: P(c − 0.5 < x < c + 0.5)
[Figure: histogram bar centered at c, and the matching area under the normal curve between c − 0.5 and c + 0.5]
When using the continuous normal distribution to approximate a
binomial distribution, move 0.5 unit to the left and right of the
midpoint to include all possible x-values in the interval.
This is called the correction for continuity.
Correction for Continuity
Example:
Use a correction for continuity to convert the binomial
intervals to a normal distribution interval.
1. The probability of getting between 125 and 145
successes, inclusive.
The discrete midpoint values are 125, 126, …, 145.
The continuous interval is 124.5 < x < 145.5.
2. The probability of getting exactly 100 successes.

The discrete midpoint value is 100.
The continuous interval is 99.5 < x < 100.5.
3. The probability of getting at least 67 successes.

The discrete midpoint values are 67, 68, ….
The continuous interval is x > 66.5.

Guidelines
Using the Normal Distribution to Approximate Binomial Probabilities
Each step is given in words, followed by the same step in symbols.
1. Verify that the binomial distribution applies. In symbols: specify n, p, and q.
2. Determine if you can use the normal distribution to approximate x, the binomial variable. In symbols: is np ≥ 5? Is nq ≥ 5?
3. Find the mean μ and standard deviation σ for the distribution. In symbols: μ = np, σ = √(npq).
4. Apply the appropriate continuity correction. In symbols: add or subtract 0.5 from endpoints.
5. Shade the corresponding area under the normal curve. Find the corresponding z-value(s). In symbols: z = (x − μ)/σ.
6. Find the probability. Use the Standard Normal Table.

Approximating a Binomial Probability
Example:
Thirty-one percent of the seniors in a certain high school plan to
attend college. If 50 students are randomly selected, find the
probability that fewer than 14 students plan to attend college.
np = (50)(0.31) = 15.5
nq = (50)(0.69) = 34.5
The variable x is approximately normally distributed with
μ = np = 15.5 and σ = √(npq) = √((50)(0.31)(0.69)) ≈ 3.27.
Correction for continuity: P(x < 14) becomes P(x < 13.5).
$$z = \frac{x - \mu}{\sigma} = \frac{13.5 - 15.5}{3.27} \approx -0.61$$
P(x < 13.5) = P(z < −0.61) = 0.2709
[Figure: normal curve with μ = 15.5 and the area to the left of 13.5 shaded; x-axis marked 10, 15, 20]
The probability that fewer than 14 plan to attend college is 0.2709.
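To see how close the approximation is, the exact binomial probability can be summed term by term and compared with the continuity-corrected normal value. This is a Python sketch for illustration; the slides do not compute the exact value, so no exact figure is quoted here.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.31
q = 1 - p

# Exact binomial: P(x < 14) = P(x = 0) + P(x = 1) + ... + P(x = 13).
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(14))

# Normal approximation with mean np and standard deviation sqrt(npq).
mu = n * p
sigma = sqrt(n * p * q)

# Continuity correction: "fewer than 14" becomes "less than 13.5".
approx = NormalDist(mu, sigma).cdf(13.5)
```

Printing `exact` and `approx` side by side shows the size of the approximation error for this n and p.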
Approximating a Binomial Probability
Example:
A survey reports that forty-eight percent of US citizens own
computers. 45 citizens are randomly selected and each is asked
whether he or she owns a computer. What is the probability
that exactly 10 say yes?
np = (45)(0.48) = 21.6
nq = (45)(0.52) = 23.4
μ = np = 21.6
σ = √(npq) = √((45)(0.48)(0.52)) ≈ 3.35
Correction for continuity: P(x = 10) becomes P(9.5 < x < 10.5).
P(9.5 < x < 10.5) = P(−3.61 < z < −3.31) ≈ 0.0003
[Figure: normal curve with μ = 21.6 and the area between 9.5 and 10.5 shaded]
The probability that exactly 10 of the 45 US citizens own a
computer is about 0.0003.
[Figure: density curve, y from 0.00 to 0.25 against x from 0 to 10]
For a continuous random variable the probability of any single value is zero, so
P(2≤X≤4) = P(2≤X<4) = P(2<X<4)
The normal distribution
A normal curve: Bell shaped
Density is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}$$
μ and σ² are the two parameters: the mean and the variance
of a normal population
(σ is the standard deviation)
[Figure: the normal, bell shaped curve with μ = 100, σ² = 10; x from 90 to 110]
[Figure: normal curves (μ = 0, σ² = 1) and (μ = 5, σ² = 1)]
[Figure: normal curves (μ = 0, σ² = 1) and (μ = 0, σ² = 2)]
[Figure: normal curves (μ = 0, σ² = 1) and (μ = 2, σ² = 0.25)]
[Figure: the standard normal curve, μ = 0 and σ² = 1]
How to calculate the probability of a normal
random variable?
Each normal random variable, X, has a density function,
say f(x) (it is a normal curve).
Probability P(a<X<b) is the area between a and b, under
the normal curve f(x).
Table I in the back of the book gives areas for a standard
normal curve with μ=0 and σ=1.
Probabilities for any normal curve (any μ and σ) can be
rewritten in terms of a standard normal curve.
Table I: Normal-curve Areas
Table I on page 494-495
We need it for tests
Areas under standard normal curve
Areas between 0 and z (z>0)
How to get an area between a and b? when a<b,
and a, b positive
area[0,b]–area[0,a]
Get the probability from standard
normal table
z denotes a standard normal random variable

Standard normal curve is symmetric about the
origin 0
Draw a graph
Table I: P(0<Z<z)
z .00 .01 .02 .03 .04 .05 .06

0.0 .0000 .0040 .0080 .0120 .0160 .0199 .0239
0.1 .0398 .0438 .0478 .0517 .0557 .0596 .0636
0.2 .0793 .0832 .0871 .0910 .0948 .0987 .1026
0.3 .1179 .1217 .1255 .1293 .1331 .1368 .1404
0.4 .1554 .1591 .1628 .1664 .1700 .1736 .1772
0.5 .1915 .1950 .1985 .2019 .2054 .2088 .2123
… … … … … … … …
1.0 .3413 .3438 .3461 .3485 .3508 .3531 .3554
1.1 .3643 .3665 .3686 .3708 .3729 .3749 .3770
Examples
Example 1
P(0<Z<1) = 0.3413
Example 2
P(1<Z<2)
= P(0<Z<2)–P(0<Z<1)
= 0.4772–0.3413
= 0.1359
Examples
Example 3
P(Z≥1)
= 0.5–P(0<Z<1)
= 0.5–0.3413
= 0.1587
Examples
Example 4
P(Z ≥ -1)
= 0.3413+0.50
= 0.8413
Examples

Example 5
P(-2<Z<1)
= 0.4772+0.3413
= 0.8185
Examples
Example 6
P(Z ≤ 1.87)
= 0.5+P(0<Z ≤ 1.87)
= 0.5+0.4693
= 0.9693
Examples
Example 7
P(Z<-1.87)
= P(Z>1.87)
= 0.5 – 0.4693
= .0307
The Normal Distribution
Changing μ shifts the distribution left or right.
Changing σ increases or decreases the spread.
[Figure: normal density f(X) plotted against X, centered at μ with spread σ]
The Normal Distribution:
as mathematical function
(pdf)
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$
Note constants:
π = 3.14159
e = 2.71828
This is a bell shaped curve with different centers and spreads
depending on μ and σ.
The Normal PDF
$$\int_{-\infty}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = 1$$
It’s a probability density function, so no matter what the values of μ
and σ, it must integrate to 1!
Normal distribution is defined by its mean and
standard dev.
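$$E(X) = \mu = \int_{-\infty}^{+\infty} x\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx$$
$$\mathrm{Var}(X) = \sigma^2 = \left(\int_{-\infty}^{+\infty} x^2\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx\right) - \mu^2$$
Standard Deviation(X) = σ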
The beauty of the normal curve:
No matter what μ and σ are, the area between μ − σ and
μ + σ is about 68%; the area between μ − 2σ and μ + 2σ is
about 95%; and the area between μ − 3σ and μ + 3σ is
about 99.7%. Almost all values fall within 3 standard
deviations.
68-95-99.7 Rule
[Figure: normal curve with nested bands covering 68% of the data, 95% of the data, and 99.7% of the data]
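The rule can be checked numerically for any μ and σ. The sketch below is a Python illustration with a hypothetical helper, `area_within`, that returns the area within k standard deviations of the mean.

```python
from statistics import NormalDist

def area_within(k, mu=0.0, sigma=1.0):
    """Area under the normal curve between mu - k*sigma and mu + k*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

within_1, within_2, within_3 = (area_within(k) for k in (1, 2, 3))

# The same area results for a different curve, e.g. mu = 500 and sigma = 50.
same_for_other_curve = area_within(1, mu=500, sigma=50)
```

The areas come out near 0.683, 0.954 and 0.997, which is where the rounded 68-95-99.7 figures come from.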

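68-95-99.7 Rule in Math terms…
$$\int_{\mu-\sigma}^{\mu+\sigma} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx \approx .68$$
$$\int_{\mu-2\sigma}^{\mu+2\sigma} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx \approx .95$$
$$\int_{\mu-3\sigma}^{\mu+3\sigma} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx \approx .997$$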
How good is the rule for real data?
Check some example data:

The mean of the weight of the women = 127.8
The standard deviation (SD) = 15.5
Example
Suppose TOEFL scores roughly follow a normal
distribution in the RI. population of college-bound
students (with range restricted to 200-800), and the
average math TOEFL is 500 with a standard
deviation of 50, then:
68% of students will have scores between 450 and 550
95% will be between 400 and 600
99.7% will be between 350 and 650
Example
BUT…
What if you wanted to know the math SAT score
corresponding to the 90th percentile (=90% of
students are lower)?
P(X≤Q) = .90
$$\int_{200}^{Q} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-500}{50}\right)^2} dx = .90$$
Solve for Q?….Yikes!

The Standard Normal (Z):
"Universal Currency"
The formula for the standardized normal probability
density function is
$$p(Z) = \frac{1}{(1)\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{Z-0}{1}\right)^2} = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}Z^2}$$
The Standard Normal Distribution (Z)
All normal distributions can be converted into the
standard normal curve by subtracting the mean and
dividing by the standard deviation:
$$Z = \frac{X - \mu}{\sigma}$$
Somebody calculated all the integrals for the standard
normal and put them in a table! So we never have to
integrate!
Even better, computers now do all the integration.
Comparing X & Z units
X (μ = 100, σ = 50): 100, 200
Z (μ = 0, σ = 1): 0, 2.0
Example
For example: What’s the probability of getting a math SAT score of 575
or less, μ=500 and σ=50?
$$Z = \frac{575 - 500}{50} = 1.5$$
i.e., a score of 575 is 1.5 standard deviations above the mean
$$\therefore P(X \le 575) = \int_{200}^{575} \frac{1}{50\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-500}{50}\right)^2} dx \;\rightarrow\; \int_{-\infty}^{1.5} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}Z^2} dz$$
But to look up Z = 1.5 in the standard normal chart (or enter it
into SAS) is no problem! The area is .9332
Practice problem
If birth weights in a population are normally
distributed with a mean of 109 oz and a standard
deviation of 13 oz,
a. What is the chance of obtaining a birth weight of 141
oz or heavier when sampling birth records at
random?
b. What is the chance of obtaining a birth weight of 120
oz or lighter?
Answer
a. What is the chance of obtaining a birth weight of 141
oz or heavier when sampling birth records at
random?
$$Z = \frac{141 - 109}{13} = 2.46$$
From the chart or SAS, a Z of 2.46 corresponds to a right tail (greater
than) area of: P(Z≥2.46) = 1-(.9931)= .0069 or .69 %
Answer
b. What is the chance of obtaining a birth weight of 120
oz or lighter?
$$Z = \frac{120 - 109}{13} = .85$$
From the chart or SAS, a Z of .85 corresponds to a left tail area of:
P(Z≤.85) = .8023= 80.23%
What is the area to the left of Z=1.51 in a
standard normal curve?
Area is 93.45%
[Figure: standard normal curve with the area to the left of Z=1.51 shaded]
Normal probabilities in SAS
data _null_;
theArea=probnorm(1.5);
put theArea;
run;
0.9331927987
The "probnorm(Z)" function gives you the probability from negative
infinity to Z (here 1.5) in a standard normal curve.
And if you wanted to go the other direction (i.e., from the area to the Z score), use the
so-called "probit" function:
data _null_;
theZValue=probit(.93);
put theZValue;
run;
1.4757910282
The "probit(p)" function gives you the Z-value that corresponds to a
left-tail area of p (here .93) from a standard normal curve. The probit
function is also known as the inverse standard normal function.
Probit function: the inverse
Φ⁻¹(area) = Z: gives the Z-value that goes with the probability you want
For example, recall SAT math scores example. What’s the score that corresponds to
the 90th percentile?
In Table, find the Z-value that corresponds to area of .90: Z = 1.28
Or use SAS
data _null_;
theZValue=probit(.90);
put theZValue;
run;
1.2815515655
If Z=1.28, convert back to raw SAT score:
$$1.28 = \frac{X - 500}{50}$$
X − 500 = 1.28(50)
X = 1.28(50) + 500 = 564 (1.28 standard deviations above the mean!)
Are my data "normal"?
Not all continuous random variables are normally
distributed!!
It is important to evaluate how well the data are
approximated by a normal distribution
Are my data normally distributed?
Look at the histogram! Does it appear bell shaped?

Compute descriptive summary measures—are mean,
median, and mode similar?
Do 2/3 of observations lie within 1 std dev of the mean?
Do 95% of observations lie within 2 std dev of the
mean?
Look at a normal probability plot—is it approximately
linear?
Run tests of normality (such as Kolmogorov-Smirnov).
But be cautious: these tests are highly influenced by sample size!
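Data from our class…
[Figure: three histograms of class data; the value in parentheses is the range expressed in standard deviations]
Histogram 1: Median = 6, Mean = 7.1, Mode = 0, SD = 6.8, Range = 0 to 24 (= 3.5 σ)
Histogram 2: Median = 5, Mean = 5.4, Mode = none, SD = 1.8, Range = 2 to 9 (~ 4 σ)
Histogram 3: Median = 3, Mean = 3.4, Mode = 3, SD = 2.5, Range = 0 to 12 (~ 5 σ)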
The Normal Probability Plot
Normal probability plot
Order the data.
Find corresponding standardized normal quantile values:
$$i\text{th quantile} = \Phi^{-1}\left(\frac{i}{n+1}\right)$$
where Φ⁻¹ is the probit function, which gives the Z value
that corresponds to a particular left-tail area.
Plot the observed data values against normal quantile values.
Evaluate the plot for evidence of linearity.
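The steps above can be sketched in Python as follows. The data values are hypothetical, made up only to show the mechanics, and `normal_quantiles` is an illustrative helper name; `NormalDist.inv_cdf` plays the role of the probit function.

```python
from statistics import NormalDist

def normal_quantiles(n):
    """Standard normal quantiles probit(i / (n + 1)) for i = 1, ..., n."""
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]

# Hypothetical observations, for illustration only.
data = [5.9, 4.1, 7.2, 5.0, 5.4]

# Step 1: order the data.
ordered = sorted(data)

# Step 2: find the corresponding standardized normal quantile values.
quantiles = normal_quantiles(len(ordered))

# Step 3: pair each quantile with its observed value; these are the plot points.
plot_points = list(zip(quantiles, ordered))
```

Plotting `plot_points` with any charting tool and judging whether they fall near a straight line completes the last step.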
Normal approximation to the binomial
When you have a binomial distribution where n is large and p isn’t too small (rule of thumb: mean>5), then the binomial starts to look like a normal distribution.
Recall: smoking example…
[Figure: binomial probability histogram, horizontal axis 0 to 8, vertical axis tick at .27]
Starting to have a normal shape even with fairly small n. You can imagine that if n got larger, the bars would get thinner and thinner and this would look more and more like a continuous function, with a bell curve shape. Here np=4.8.
Normal approximation to binomial
[Figure: binomial probability histogram for the smoking example, horizontal axis 0 to 8, vertical axis tick at .27]
What is the probability of fewer than 2 smokers?
Exact binomial probability (from before) = .00065 + .008 = .00865
Normal approximation probability:
μ = 4.8
σ = 1.39
$Z \approx \frac{2 - 4.8}{1.39} = \frac{-2.8}{1.39} \approx -2$, P(Z < -2) = .0228
A little off, but in the right ballpark… we could also use the value to the left of 1.5 (as we really wanted to know less than but not including 2; called the “continuity correction”)…
$Z \approx \frac{1.5 - 4.8}{1.39} = \frac{-3.3}{1.39} = -2.37$
P(Z ≤ -2.37) = .0089
A fairly good approximation of the exact probability, .00865.
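To see the effect of the continuity correction numerically, here is a small Python sketch. It assumes n = 8 and p = 0.6 for the smoking example; those values are inferred from np = 4.8, σ = 1.39 and the 0 to 8 axis rather than stated on this slide. The function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def normal_approx_cdf(x: float, n: int, p: float) -> float:
    """P(X <= x) under a normal curve with mean np and variance np(1-p)."""
    return NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(x)

if __name__ == "__main__":
    n, p = 8, 0.6  # assumed values: they reproduce np = 4.8 and sd = 1.39
    print(binomial_cdf(1, n, p))          # exact P(X < 2) = P(X <= 1)
    print(normal_approx_cdf(2.0, n, p))   # cut at 2, no continuity correction
    print(normal_approx_cdf(1.5, n, p))   # cut at 1.5, with continuity correction
```

The exact value here is computed without rounding the individual terms, so it differs slightly from the hand-rounded .00865. The corrected approximation should land much closer to it than the uncorrected one.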
Practice problem
1. You are performing a cohort study. If the probability of developing disease in the exposed group is .25 for the study duration, then if you sample (randomly) 500 exposed people, what’s the probability that at most 120 people develop the disease?
Answer
By hand (yikes!):
P(X≤120) = P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4)+….+ P(X=120)=
$\binom{500}{0}(.25)^{0}(.75)^{500} + \binom{500}{1}(.25)^{1}(.75)^{499} + \binom{500}{2}(.25)^{2}(.75)^{498} + \dots + \binom{500}{120}(.25)^{120}(.75)^{380}$
OR Use SAS:
data _null_;
Cohort=cdf('binomial', 120, .25, 500);
put Cohort;
run;
0.323504227
OR use, normal approximation:
μ=np=500(.25)=125 and σ²=np(1-p)=93.75; σ=9.68
$Z = \frac{120 - 125}{9.68} = -.52$
P(Z<-.52)= .3015
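The same three routes (summing by hand, an exact CDF call, and the normal approximation) can be compared in Python. This is only a sketch using the standard library. The constants come from the practice problem, while the function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

N_EXPOSED = 500   # sample size from the practice problem
P_DISEASE = 0.25  # probability of disease in the exposed group

def exact_at_most(k: int, n: int = N_EXPOSED, p: float = P_DISEASE) -> float:
    """Sum the binomial terms P(X=0) + ... + P(X=k)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def approx_at_most(k: float, n: int = N_EXPOSED, p: float = P_DISEASE) -> float:
    """Normal approximation with mu = np and sigma = sqrt(np(1-p))."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    return NormalDist().cdf((k - mu) / sigma)

if __name__ == "__main__":
    print(exact_at_most(120))     # "by hand" sum, done by the computer
    print(approx_at_most(120))    # normal approximation at 120
    print(approx_at_most(120.5))  # continuity-corrected variant
```

The approximation at 120 uses the unrounded Z of about -0.516, so it comes out near .303 rather than the table value .3015 for Z = -.52.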
Proportions…
The binomial distribution forms the basis of statistics for
proportions.
A proportion is just a binomial count divided by n.
For example, if we sample 200 cases and find 60 smokers, X=60 but the observed proportion=.30.
Statistics for proportions are similar to binomial counts, but
differ by a factor of n.
Stats for proportions
For binomial:
$\mu_x = np$
$\sigma_x^2 = np(1-p)$
$\sigma_x = \sqrt{np(1-p)}$
For proportion (the mean and standard deviation differ from the binomial versions by a factor of n):
$\mu_{\hat{p}} = p$
$\sigma_{\hat{p}}^2 = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n}$
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
P-hat stands for “sample proportion.”
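A quick numeric check of the "factor of n" relationship, sketched in Python. The sample size of 200 comes from the smokers example, while p = .30 is assumed here as the true proportion purely for illustration. The function names are hypothetical.

```python
from math import sqrt

def count_stats(n: int, p: float):
    """Mean and SD of a binomial count X: np and sqrt(np(1-p))."""
    return n * p, sqrt(n * p * (1 - p))

def proportion_stats(n: int, p: float):
    """Mean and SD of the sample proportion p-hat = X / n (count stats divided by n)."""
    mu_x, sd_x = count_stats(n, p)
    return mu_x / n, sd_x / n

if __name__ == "__main__":
    n, p = 200, 0.30  # p = .30 is assumed here purely for illustration
    print(count_stats(n, p))       # mean and SD of the count
    print(proportion_stats(n, p))  # mean and SD of the proportion
```

Dividing the count's standard deviation by n gives the same number as the formula sqrt(p(1-p)/n).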
It all comes back to Z…
Statistics for proportions are based on a normal
distribution, because the binomial can be
approximated as normal if np>5
From non-standard normal to standard
normal
X is a normal random variable with mean μ, and
standard deviation σ
Set Z=(X–μ)/σ
Z=standard unit or z-score of X
Then Z has a standard normal distribution and

Example 8
X is a normal random variable with μ=120, and σ=15
Find the probability P(X≤135)
Solution:
Let $z = \frac{x - \mu}{\sigma} = \frac{x - 120}{15}$
z is normal with $\mu_z = \frac{120 - 120}{15} = 0$ and $\sigma_z = \frac{15}{15} = 1$
$P(x \le 135) = P\left(\frac{x - \mu}{\sigma} \le \frac{135 - 120}{15}\right) = P(z \le 1) = 0.5 + 0.3413 = 0.8413$
X → Z
x → z-score of x
Example 8 (continued)
P(X≤150)
x = 150 → z-score z = (150-120)/15 = 2
P(X≤150) = P(Z≤2)
= 0.5 + 0.4772 = 0.9772
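Example 8 can be reproduced with a short Python sketch that standardizes first and then reads the area from the standard normal curve, mirroring the table lookup. The helper names `standardize` and `prob_at_most` are hypothetical.

```python
from statistics import NormalDist

def standardize(x: float, mu: float, sigma: float) -> float:
    """Z = (X - mu) / sigma."""
    return (x - mu) / sigma

def prob_at_most(x: float, mu: float, sigma: float) -> float:
    """P(X <= x), computed from the standard normal curve via the z-score."""
    return NormalDist().cdf(standardize(x, mu, sigma))

if __name__ == "__main__":
    for x in (135, 150):
        z = standardize(x, mu=120, sigma=15)
        print(x, z, round(prob_at_most(x, mu=120, sigma=15), 4))
```

The two probabilities should match the table-based answers 0.8413 and 0.9772.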
Areas Under Normal Curve
[Figure: normal curve f(X) with mean μ]
$z = \frac{x - \mu}{\sigma}$
P[Z > 1] = 0.1587
P[Z > 1.96] = 0.0250
Normal Deviate Z   .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0              .5000  .4960  .4920  .4880  .4840  .4801  .4761  .4721  .4681  .4641
0.4              .3446  .3409  .3372  .3336  .3300  .3264  .3228  .3192  .3156  .3121
0.9              .1841  .1814  .1788  .1762  .1736  .1711  .1685  .1660  .1635  .1611
1.5              .0668  .0655  .0643  .0630  .0618  .0606  .0594  .0582  .0571  .0559
1.9              .0287  .0281  .0274  .0268  .0262  .0256  .0250  .0244  .0239  .0233
2.4              .0082  .0080  .0078  .0075  .0073  .0071  .0069  .0068  .0066  .0064
3.0              .0013  .0013  .0013  .0012  .0012  .0011  .0011  .0011  .0010  .0010
% Points of Student t Distribution
[Figure: t curve with tail area = a beyond -1.812 and 1.812]
For ν = 10 degrees of freedom (d.o.f.):
P[t > 1.812] = 0.05
P[t < -1.812] = 0.05
ν \ a    .25     .10     .05     .01     .005    .0005
1       1.000   3.078   6.314  31.821  63.657  636.619
5        .727   1.476   2.015   3.365   4.032    6.859
10       .700   1.372   1.812   2.764   3.169    4.587
20       .687   1.325   1.725   2.528   2.845    3.850
30       .683   1.310   1.697   2.457   2.750    3.646
40       .681   1.303   1.684   2.423   2.704    3.551
60       .679   1.296   1.671   2.390   2.660    3.460
120      .677   1.289   1.658   2.358   2.617    3.373
68-95-99.7 Rule
68% of the data (within μ ± 1σ)
95% of the data (within μ ± 2σ)
99.7% of the data (within μ ± 3σ)
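The three percentages in the rule are areas under the standard normal curve between -k and +k for k = 1, 2, 3. A minimal Python sketch follows. The function name `share_within` is hypothetical.

```python
from statistics import NormalDist

def share_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k standard deviations."""
    z = NormalDist()
    return z.cdf(k) - z.cdf(-k)

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, round(share_within(k), 4))
```

The printed areas round to roughly 68%, 95%, and 99.7%.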
Example
Find the area under the standard normal curve between z = 0 and z = 1.45
[Figure: standard normal curve, area between 0 and 1.45]
A portion of Table 3:
z     0.00    0.01    0.02    0.03    0.04    0.05    0.06
⋮
1.4                                           0.4265
⋮
P(0 < z < 1.45) = 0.4265
Find the area under the normal curve to the right of z = 1.45; P(z > 1.45)
[Figure: the area asked for lies to the right of 1.45; the area between 0 and 1.45 is 0.4265]
P(z > 1.45) = 0.5000 - 0.4265 = 0.0735
Example: Find the area to the left of z = 1.45; P(z < 1.45)
[Figure: area 0.5000 to the left of 0 plus area 0.4265 between 0 and 1.45]
P(z < 1.45) = 0.5000 + 0.4265 = 0.9265
Example: Find the area between the mean (z = 0) and z = -1.26
[Figure: the area asked for lies between -1.26 and 0; by symmetry it equals the area between 0 and 1.26]
P(-1.26 < z < 0) = 0.3962
Example: Find the area between z = -2.30 and z = 1.80
[Figure: area 0.4893 between -2.30 and 0, area 0.4641 between 0 and 1.80]
P(-2.30 < z < 1.80) = P(-2.30 < z < 0) + P(0 < z < 1.80)
= 0.4893 + 0.4641 = 0.9534
Example: A bottling machine is adjusted to fill bottles with a mean of 32.0 oz of soda and standard deviation of 0.02. Assume the amount of fill is normally distributed and a bottle is selected at random:
1) Find the probability the bottle contains between 32.00 oz and 32.025 oz
2) Find the probability the bottle contains more than 31.97 oz
Solutions:
1) When x = 32.00: $z = \frac{32.00 - \mu}{\sigma} = \frac{32.00 - 32.0}{0.02} = 0.00$
When x = 32.025: $z = \frac{32.025 - \mu}{\sigma} = \frac{32.025 - 32.0}{0.02} = 1.25$
[Figure: area asked for lies between x = 32.0 (z = 0) and x = 32.025 (z = 1.25)]
$P(32.0 < x < 32.025) = P\left(\frac{32.0 - 32.0}{0.02} < \frac{x - 32.0}{0.02} < \frac{32.025 - 32.0}{0.02}\right) = P(0 < z < 1.25) = 0.3944$
2)
[Figure: area asked for lies to the right of x = 31.97 (z = -1.50)]
$P(x > 31.97) = P\left(\frac{x - 32.0}{0.02} > \frac{31.97 - 32.0}{0.02}\right) = P(z > -1.50)$
= 0.5000 + 0.4332 = 0.9332
Example: A radar unit is used to measure the speed of
automobiles on an expressway during rush-hour
traffic. The speeds of individual automobiles
are normally distributed with a mean of 62 mph.
Find the standard deviation of all speeds if 3% of the
automobiles travel faster than 72 mph.
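[Figure: normal curve with x = 62 (z = 0) and x = 72 (z = 1.88); area 0.4700 between them and area 0.0300 to the right of 72]
Solution
P(x > 72) = 0.03; P(z > 1.88) = 0.03
$z = \frac{x - \mu}{\sigma}$; $1.88 = \frac{72 - 62}{\sigma}$
1.88σ = 10
σ = 10/1.88 = 5.32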
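The radar example runs the z-score formula backwards: the tail area fixes z, and z then fixes σ. A minimal Python sketch follows. The function name `sigma_from_upper_tail` is hypothetical.

```python
from statistics import NormalDist

def sigma_from_upper_tail(x: float, mu: float, tail_area: float) -> float:
    """Solve z = (x - mu) / sigma for sigma, where P(Z > z) = tail_area."""
    z = NormalDist().inv_cdf(1.0 - tail_area)  # z with the given area to its right
    return (x - mu) / z

if __name__ == "__main__":
    sigma = sigma_from_upper_tail(x=72, mu=62, tail_area=0.03)
    print(round(sigma, 2))
```

Because the inverse CDF returns z to more digits than the table's 1.88, the unrounded result can differ slightly from the hand calculation before rounding.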
Example: Find the numerical value of z(0.10):
[Figure: area 0.10 lies to the right of z(0.10) (area information from notation); the table shows the area between 0 and z(0.10), which is 0.4000]
z(0.10) = 1.28
Example: Find the numerical value of z(0.80):
[Figure: z(0.80) lies to the left of 0]
Look for 0.3000; remember that z must be negative
Use Table 3: look for an area as close as possible to 0.3000
z(0.80) = -0.84
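In this notation z(α) is the z-value with area α to its right, so it equals the inverse CDF evaluated at 1 - α. A short Python sketch follows. The function name `z_of` is hypothetical.

```python
from statistics import NormalDist

def z_of(alpha: float) -> float:
    """z(alpha): the z-value with area alpha to its RIGHT under the standard normal curve."""
    return NormalDist().inv_cdf(1.0 - alpha)

if __name__ == "__main__":
    print(round(z_of(0.10), 2))
    print(round(z_of(0.80), 2))
```

A right-tail area above 0.5 automatically yields a negative z, which matches the reminder in the second example.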
