M4 StatEcon 3rd Probability
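Normal curve
[Figure: normal curve with total area = 1; inflection points at μ − σ and μ + σ; x-axis marked μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ, μ + 3σ]
y = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}, where e ≈ 2.718 and π ≈ 3.14
[Figure: two normal curves, A and B, plotted against x]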
Example:
The heights of fully grown magnolia bushes are normally
distributed. The curve represents the distribution. What is the
mean height of a fully grown magnolia bush? Estimate the
standard deviation.
The inflection points are one standard deviation away from the mean, so μ = 8 and σ ≈ 0.7.
[Figure: normal curve of height (in feet), x-axis from 6 to 10]
[Figure: standard normal curve, z-axis from −3 to 3]
After the formula z = (x − μ)/σ is used to transform an x-value into a
z-score, the Standard Normal Table in Appendix B is
used to find the cumulative area under the curve.
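The two steps (standardize, then look up the cumulative area) can be sketched in Python. This is a minimal illustration, not part of the original slides: it uses the standard library's `statistics.NormalDist` in place of the printed table, and the helper names `z_score` and `cumulative_area` are made up for this example.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

def cumulative_area(z):
    """Area under the standard normal curve to the left of z
    (what the Standard Normal Table tabulates)."""
    return NormalDist().cdf(z)

# Table lookup for z = 2.71 (the row 2.7, column .01 entry)
area = cumulative_area(2.71)
print(round(area, 4))
```

Rounded to four places, the computed area agrees with the table entry for z = 2.71.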
Larson & Farber, Elementary Statistics: Picturing the World, 3e 12
The Standard Normal Table
Properties of the Standard Normal Distribution
1. The cumulative area is close to 0 for z-scores close to z = −3.49.
2. The cumulative area increases as the z-scores increase.
3. The cumulative area for z = 0 is 0.5000.
4. The cumulative area is close to 1 for z-scores close to z = 3.49.
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
0.1 .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
0.2 .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
2.6 .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7 .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8 .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
Find the area by finding 2.7 in the left hand column, and then moving
across the row to the column under 0.01.
The area to the left of z = 2.71 is 0.9966.
The Standard Normal Table
Example:
Find the cumulative area that corresponds to a z-score
of −0.25.
Appendix B: Standard Normal Table
z .09 .08 .07 .06 .05 .04 .03 .02 .01 .00
−3.4 .0002 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003
−3.3 .0003 .0004 .0004 .0004 .0004 .0004 .0004 .0005 .0005 .0005
−0.3 .3483 .3520 .3557 .3594 .3632 .3669 .3707 .3745 .3783 .3821
−0.2 .3859 .3897 .3936 .3974 .4013 .4052 .4090 .4129 .4168 .4207
−0.1 .4247 .4286 .4325 .4364 .4404 .4443 .4483 .4522 .4562 .4602
−0.0 .4641 .4681 .4721 .4761 .4801 .4840 .4880 .4920 .4960 .5000
Find the area by finding −0.2 in the left-hand column, and then moving
across the row to the column under .05.
The area to the left of z = −0.25 is 0.4013.
Finding Areas
2. The area to the left of z = 1.23 is 0.8907.
3. Subtract to find the area to the right of z = 1.23: 1 − 0.8907 = 0.1093.
[Figure: standard normal curve with the area to the right of z = 1.23 shaded]
Always draw the curve!
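The left-area, right-area, and between-area cases can be checked numerically. The sketch below is an illustration only: it assumes Python's `statistics.NormalDist` as a stand-in for the printed table, and the function names are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_left(z):
    """Cumulative area to the left of z."""
    return Z.cdf(z)

def area_right(z):
    """Area to the right of z: subtract the left area from 1."""
    return 1 - Z.cdf(z)

def area_between(z_low, z_high):
    """Area between two z-scores: difference of two cumulative areas."""
    return Z.cdf(z_high) - Z.cdf(z_low)

left = area_left(1.23)
right = area_right(1.23)
print(round(left, 4), round(right, 4))
```

Rounded to four places, `left` and `right` match the values 0.8907 and 0.1093 used in the steps above.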
[Figure: standard normal curves with z = −2.33 and z = 0.94 marked]
From the Standard Normal Table, the area is equal to 0.1736.
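[Figure: standard normal curve with the area between z = −1.98 and z = 1.07 shaded]
P(x < 15) with μ = 10 and σ = 5
[Figure: the area to the left of x = 15 under the normal curve (μ = 10, σ = 5) is the same as the area to the left of z = 1 under the standard normal curve (μ = 0, σ = 1)]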
P(x > 85) = P(z > 0.88) = 1 − P(z < 0.88) = 1 − 0.8106 = 0.1894
2.6 .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7 .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8 .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
Find the z-score by locating 0.9973 in the body of the Standard Normal
Table. The values at the beginning of the corresponding row and at the top
of the column give the z-score.
The z-score is 2.78.
Finding z-Scores
Example:
Find the z-score that corresponds to a cumulative area
of 0.4170.
Appendix B: Standard Normal Table
z .09 .08 .07 .06 .05 .04 .03 .02 .01 .00
−3.4 .0002 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003
−3.3 .0003 .0004 .0004 .0004 .0004 .0004 .0004 .0005 .0005 .0005
−0.3 .3483 .3520 .3557 .3594 .3632 .3669 .3707 .3745 .3783 .3821
−0.2 .3859 .3897 .3936 .3974 .4013 .4052 .4090 .4129 .4168 .4207
−0.1 .4247 .4286 .4325 .4364 .4404 .4443 .4483 .4522 .4562 .4602
−0.0 .4641 .4681 .4721 .4761 .4801 .4840 .4880 .4920 .4960 .5000
Find the z-score by locating 0.4170 in the body of the Standard Normal
Table. Use the value closest to 0.4170, which is .4168 (row −0.2, column .01).
The z-score is −0.21.
Finding a z-Score Given a Percentile
Example:
Find the z-score that corresponds to P75.
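[Figure: standard normal curve with area = 0.75 to the left of the unknown z-score]
The z-score that corresponds to P75 is about 0.67.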
Example:
The monthly electric bills in a city are normally distributed
with a mean of $120 and a standard deviation of $16. Find
the x-value corresponding to a z-score of 1.60.
x = μ + zσ
= 120 + 1.60(16)
= 145.6
We can conclude that an electric bill of $145.60 is 1.6 standard
deviations above the mean.
Finding a Specific Data Value
Example:
The weights of bags of chips for a vending machine are
normally distributed with a mean of 1.25 ounces and a
standard deviation of 0.1 ounce. Bags that have weights in
the lower 8% are too light and will not work in the machine.
What is the least a bag of chips can weigh and still work in the
machine?
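P(z < ?) = 0.08
P(z < −1.41) = 0.08
[Figure: the lower 8% of the standard normal curve lies to the left of z = −1.41; the matching x-value lies below the mean of 1.25]
x = μ + zσ
= 1.25 + (−1.41)(0.1)
≈ 1.11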
The least a bag can weigh and still work in the machine is 1.11 ounces.
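As a cross-check on the table lookup, the same cutoff can be computed directly. This is a sketch under the assumption that Python's `statistics.NormalDist.inv_cdf` stands in for reading the table backwards; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 1.25, 0.1          # bag weights in ounces (from the example)
lower_tail = 0.08              # bags in the lowest 8% are too light

# z-score with 8% of the area to its left
z = NormalDist().inv_cdf(lower_tail)

# Transform the z-score back to a weight: x = mu + z * sigma
x = mu + z * sigma
print(round(z, 2), round(x, 2))
```

The computed z is slightly more precise than the table value of −1.41, but the cutoff still rounds to 1.11 ounces.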
[Figure: a population from which repeated samples (Sample 1 through Sample 6) are drawn, each with its own sample mean x̄1, x̄2, ..., x̄6]
Example:
The population values {5, 10, 15, 20} are written on slips of
paper and put in a hat. Two slips are randomly selected, with
replacement.
a. Find the mean, standard deviation, and variance of the
population.
Population: 5, 10, 15, 20
μ = 12.5
σ = 5.59
σ² = 31.25
Continued.
[Figure: histogram of the population values 5, 10, 15, 20]
Continued.
Probability Distribution of Sample Means
x̄ f Probability
5 1 0.0625
7.5 2 0.1250
10 3 0.1875
12.5 4 0.2500
15 3 0.1875
17.5 2 0.1250
20 1 0.0625
[Figure: probability histogram of the sample means; x-axis: sample mean (5 to 20), y-axis: probability (0.05 to 0.20)]
The shape of the graph is symmetric and bell shaped. It approximates a
normal distribution.
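The sampling distribution in this example is small enough to enumerate. The sketch below is an illustration, not part of the slides: it lists all 16 ordered samples of size 2 drawn with replacement and tallies their means.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from statistics import pvariance

population = [5, 10, 15, 20]
n = 2  # sample size, drawn with replacement

# All ordered samples of size n: 4 * 4 = 16 equally likely outcomes
samples = list(product(population, repeat=n))
means = [sum(s) / n for s in samples]

# Frequency and probability of each distinct sample mean
freq = Counter(means)
probs = {m: Fraction(f, len(samples)) for m, f in sorted(freq.items())}

mean_of_means = sum(means) / len(means)
var_of_means = pvariance(means)

for m, p in probs.items():
    print(m, freq[m], float(p))
print(mean_of_means, var_of_means)
```

The mean of the sample means equals the population mean (12.5), and their variance is the population variance divided by n (31.25 / 2 = 15.625).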
the sample means will have a normal distribution.
[Figure: distribution of sample means x̄]
The Central Limit Theorem
If the population itself is normally distributed, with
mean = μ and standard deviation = σ, the sampling distribution of
sample means is normally distributed for any sample size n.
Example:
The heights of fully grown magnolia bushes have a mean
height of 8 feet and a standard deviation of 0.7 feet. 38
bushes are randomly selected from the population, and
the mean of each sample is determined. Find the mean
and standard error of the mean of the sampling
distribution.
Mean: μx̄ = μ = 8
Standard deviation (standard error): σx̄ = σ/√n = 0.7/√38 ≈ 0.11
Continued.
Interpreting the Central Limit Theorem
Example continued:
The heights of fully grown magnolia bushes have a mean height
of 8 feet and a standard deviation of 0.7 feet. 38 bushes are
randomly selected from the population, and the mean of each
sample is determined.
[Figure: sampling distribution of x̄ with μx̄ = 8 and σx̄ = 0.11; x̄-axis marked 7.6, 8, 8.4, with x̄ = 7.8 indicated]
Continued.
Finding Probabilities
Example continued:
Find the probability that the mean height of the 38
bushes is less than 7.8 feet.
μx̄ = 8, n = 38, σx̄ = 0.11
z = (x̄ − μx̄)/σx̄ = (7.8 − 8)/0.11 = −1.82
[Figure: the area to the left of x̄ = 7.8 corresponds to the area to the left of z = −1.82]
P(x̄ < 7.8) = P(z < −1.82) = 0.0344
The probability that the mean height of the 38 bushes is
less than 7.8 feet is 0.0344.
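The calculation can be reproduced in a few lines. This sketch is an illustration that assumes Python's `statistics.NormalDist` in place of the table; it computes the probability twice, once with the rounded values used on the slide and once without intermediate rounding, because the two differ noticeably here.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8.0, 0.7, 38

# Standard error of the mean: sigma / sqrt(n)
se = sigma / sqrt(n)

# Slide-style calculation: round the standard error and z to 2 decimals
z_rounded = round((7.8 - mu) / round(se, 2), 2)
p_table = NormalDist().cdf(z_rounded)

# Same probability without intermediate rounding
p_unrounded = NormalDist(mu, se).cdf(7.8)

print(round(se, 2), z_rounded, round(p_table, 4), round(p_unrounded, 4))
```

The rounded route reproduces 0.0344; carrying full precision gives a slightly larger probability (about 0.039), which shows how sensitive tail areas are to rounding the standard error.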
Probability and Normal Distributions
Example:
The average on a statistics test was 78 with a standard
deviation of 8. If the test scores are normally distributed,
find the probability that the mean score of 25 randomly
selected students is between 75 and 79.
μx̄ = 78
σx̄ = σ/√n = 8/√25 = 1.6
z1 = (x̄ − μx̄)/σx̄ = (75 − 78)/1.6 = −1.88
z2 = (x̄ − μx̄)/σx̄ = (79 − 78)/1.6 = 0.63
[Figure: P(75 < x̄ < 79) shaded between x̄ = 75 and x̄ = 79, corresponding to z = −1.88 and z = 0.63]
Continued.
Probability and Normal Distributions
Example continued:
[Figure: area between x̄ = 75 and x̄ = 79, corresponding to z = −1.88 and z = 0.63]
P(75 < x̄ < 79) = P(−1.88 < z < 0.63) = P(z < 0.63) − P(z < −1.88)
= 0.7357 − 0.0301 = 0.7056
The probability that the mean score of 25 randomly selected students is
between 75 and 79 is approximately 0.7056.
Probabilities of x and x̄
Example:
The population mean salary for auto mechanics is
μ = $34,000 with a standard deviation of σ = $2,500. Find
the probability that the mean salary for a randomly selected
sample of 50 mechanics is greater than $35,000.
μx̄ = 34000
σx̄ = σ/√n = 2500/√50 = 353.55
z = (x̄ − μx̄)/σx̄ = (35000 − 34000)/353.55 = 2.83
P(x̄ > 35000) = P(z > 2.83) = 1 − P(z < 2.83)
= 1 − 0.9977 = 0.0023
For a single randomly selected mechanic, use σ instead of σx̄:
μ = 34000
σ = 2500
z = (x − μ)/σ = (35000 − 34000)/2500 = 0.4
P(x > 35000) = P(z > 0.4) = 1 − P(z < 0.4)
= 1 − 0.6554 = 0.3446
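The contrast between an individual value and a sample mean comes down to which standard deviation is used. The sketch below is an illustration assuming Python's `statistics.NormalDist`; the helper name `prob_greater` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 34000, 2500, 50
threshold = 35000

def prob_greater(value, mean, sd):
    """P(X > value) for a normal variable with the given mean and sd."""
    return 1 - NormalDist(mean, sd).cdf(value)

# One mechanic: spread is the population standard deviation
p_single = prob_greater(threshold, mu, sigma)

# Mean of 50 mechanics: spread is the standard error sigma / sqrt(n)
p_mean = prob_greater(threshold, mu, sigma / sqrt(n))

print(round(p_single, 4), round(p_mean, 4))
```

A single salary above $35,000 is fairly common, while a sample mean that high is rare, because averaging 50 values shrinks the spread by a factor of √50.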
Example:
Thirty-one percent of the seniors in a certain high school plan to
attend college. If 50 students are randomly selected, find the
probability that less than 14 students plan to attend college.
np = (50)(0.31) = 15.5
nq = (50)(0.69) = 34.5
The variable x is approximately normally distributed with μ = np = 15.5 and
σ = √(npq) = √((50)(0.31)(0.69)) = 3.27.
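A possible way to finish this calculation is sketched below. The slide stops after finding μ and σ, so the continuity correction used here (treating "fewer than 14" as "below 13.5") is an assumption about the intended method, not something the source states.

```python
from math import sqrt
from statistics import NormalDist

n, p = 50, 0.31
q = 1 - p

# Parameters of the approximating normal distribution
mu = n * p
sigma = sqrt(n * p * q)

# Assumed continuity correction: P(x < 14) = P(x <= 13) ~ P(normal < 13.5)
boundary = 13.5
approx = NormalDist(mu, sigma).cdf(boundary)

print(round(mu, 1), round(sigma, 2), round(approx, 4))
```

Under that assumption the approximate probability is about 0.27.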
[Figure: density curve over x from 0 to 10, with the area between 2 and 4 shaded]
P(2 ≤ X ≤ 4) = P(2 ≤ X < 4) = P(2 < X < 4)
The normal distribution
f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
μ and σ² are the two parameters: the mean and the variance
of a normal population
(σ is the standard deviation)
The normal bell-shaped curve: μ = 100, σ² = 10
[Figure: normal density f(x) plotted against x]
Normal curves: (μ = 0, σ² = 1) and (μ = 0, σ² = 2)
[Figure: the two densities plotted for x from −3 to 3]
Normal curves: (μ = 0, σ² = 1) and (μ = 2, σ² = 0.25)
[Figure: the two densities plotted for x from −2 to 8]
The standard normal curve: μ = 0 and σ² = 1
[Figure: standard normal density plotted for x from −3 to 3]
How to calculate the probability of a normal
random variable?
Example 1
P(0<Z<1) = 0.3413
Example 2
P(1<Z<2)
= P(0<Z<2)–P(0<Z<1)
= 0.4772–0.3413
= 0.1359
Example 3
P(Z ≥ 1)
= 0.5 – P(0<Z<1)
= 0.5 – 0.3413
= 0.1587
Example 4
P(Z ≥ -1)
= 0.3413 + 0.50
= 0.8413
Example 5
P(-2<Z<1)
= 0.4772 + 0.3413
= 0.8185
Example 6
P(Z ≤ 1.87)
= 0.5 + P(0<Z ≤ 1.87)
= 0.5 + 0.4693
= 0.9693
Example 7
P(Z<-1.87)
= P(Z>1.87)
= 0.5 – 0.4693
= 0.0307
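These examples all build on a table of areas between 0 and z. The sketch below mimics that table with a hypothetical helper `area_0_to` (built on Python's `statistics.NormalDist`, which is an assumption of this illustration) and recombines the pieces the same way the examples do.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

def area_0_to(z):
    """Area between 0 and z for z >= 0, as a 0-to-z table would list it."""
    return Z.cdf(z) - 0.5

examples = {
    "P(0<Z<1)": area_0_to(1),
    "P(1<Z<2)": area_0_to(2) - area_0_to(1),
    "P(Z>=1)": 0.5 - area_0_to(1),
    "P(Z>=-1)": area_0_to(1) + 0.5,       # symmetry: area from -1 to 0 equals 0 to 1
    "P(-2<Z<1)": area_0_to(2) + area_0_to(1),
    "P(Z<=1.87)": 0.5 + area_0_to(1.87),
    "P(Z<-1.87)": 0.5 - area_0_to(1.87),  # symmetry: equals P(Z > 1.87)
}

for label, value in examples.items():
    print(label, round(value, 4))
```

Small differences in the fourth decimal place relative to the slide values can occur because the slides add table entries that are already rounded.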
The Normal Distribution
Changing σ increases or
decreases the spread.
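[Figure: normal curves over X with different values of σ]
The Normal Distribution:
as mathematical function
(pdf)
f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}
E(X) = \mu = \int_{-\infty}^{+\infty} x \, \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx
Var(X) = \sigma^2 = \left( \int_{-\infty}^{+\infty} x^2 \, \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx \right) - \mu^2
Standard Deviation(X) = σ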
The beauty of the normal curve:
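[Figure: normal curve with 68% of the data within one standard deviation of the mean]
\int_{\mu-\sigma}^{\mu+\sigma} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = .68
\int_{\mu-2\sigma}^{\mu+2\sigma} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = .95
\int_{\mu-3\sigma}^{\mu+3\sigma} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = .997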
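The three areas can be verified numerically without doing the integrals by hand. This is a minimal sketch that assumes Python's `statistics.NormalDist`; the function name `within_k_sd` and the example parameters are illustrative.

```python
from statistics import NormalDist

def within_k_sd(k, mu=0.0, sigma=1.0):
    """Probability that a normal variable falls within k standard
    deviations of its mean: CDF(mu + k*sigma) - CDF(mu - k*sigma)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

areas = {k: within_k_sd(k) for k in (1, 2, 3)}
for k, a in areas.items():
    print(k, round(a, 4))

# The result does not depend on mu or sigma
same = within_k_sd(1, mu=500, sigma=50)
```

The areas are the same for every normal distribution, which is why the rule can be stated without reference to a particular mean or standard deviation.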
How good is the rule for real data?
BUT…
What if you wanted to know the math SAT score
corresponding to the 90th percentile (=90% of
students are lower)?
P(X≤Q) = .90
\int_{200}^{Q} \frac{1}{50\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-500}{50}\right)^2} dx = .90
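p(Z) = \frac{1}{(1)\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{Z-0}{1}\right)^2} = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}Z^2}
The Standard Normal Distribution (Z)
Z = \frac{X-\mu}{\sigma}
[Figure: standard normal curve (μ = 0, σ = 1), Z-axis marked at 0 and 2.0]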
Example
For example: What's the probability of getting a math SAT score of 575
or less, μ = 500 and σ = 50?
Z = (575 − 500)/50 = 1.5
Z = (141 − 109)/13 = 2.46
Z = (120 − 109)/13 = .85
From the chart or SAS, Z of .85 corresponds to a left-tail area of:
P(Z≤.85) = .8023 = 80.23%
What is the area to the left of Z=1.51 in a standard normal curve?
Area is 93.45%
[Figure: standard normal curve with the area to the left of Z=1.51 shaded]
Normal probabilities in SAS
data _null_;
theArea=probnorm(1.5); /* left-tail area for Z = 1.5 */
put theArea;
run;
0.9331927987
The "probnorm(Z)" function gives you the probability from negative infinity to
Z (here 1.5) in a standard normal curve.
And if you wanted to go the other direction (i.e., from the area to the Z score), use the
so-called "Probit" function:
data _null_;
theZValue=probit(.93); /* Z-value with left-tail area .93 */
put theZValue;
run;
1.4757910282
The "probit(p)" function gives you the Z-value that corresponds to a left-tail
area of p (here .93) from a standard normal curve. The probit function is also
known as the inverse standard normal function.
Probit function: the inverse
probit(area) = Z: gives the Z-value that goes with the probability you want
For example, recall the SAT math scores example. What's the score that corresponds to
the 90th percentile?
In the table, find the Z-value that corresponds to an area of .90: Z = 1.28
Or use SAS
data _null_;
theZValue=probit(.90);
put theZValue;
run;
1.2815515655
If Z=1.28, convert back to raw SAT score
1.28 = (X – 500)/50
X – 500 = 1.28(50)
X = 1.28(50) + 500 = 564 (1.28 standard deviations above the mean!)
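For readers without SAS, the same two lookups can be sketched in Python. This is an illustrative equivalent, not part of the original slides: `NormalDist().cdf` plays the role of `probnorm` and `NormalDist().inv_cdf` plays the role of `probit`.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Left-tail area for a given Z (counterpart of probnorm)
the_area = std_normal.cdf(1.5)

# Z-value for a given left-tail area (counterpart of probit)
the_z_value = std_normal.inv_cdf(0.93)

# 90th percentile of SAT math scores with mean 500 and SD 50
z_90 = std_normal.inv_cdf(0.90)
sat_90 = 500 + z_90 * 50

print(round(the_area, 10), round(the_z_value, 10), round(z_90, 10), round(sat_90))
```

The printed values agree with the SAS output shown above to the displayed precision, and the 90th-percentile score rounds to 564.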
Are my data "normal"?
Median = 6
Mean = 7.1
Mode = 0
SD = 6.8
Range = 0 to 24
(= 3.5 σ)
Data from our class…
Median = 5
Mean = 5.4
Mode = none
SD = 1.8
Range = 2 to 9
(~ 4 σ)
Data from our class…
Median = 3
Mean = 3.4
Mode = 3
SD = 2.5
Range = 0 to 12
(~ 5 σ)
The Normal Probability Plot
Normal probability plot
Order the data.
Find corresponding standardized normal quantile values:
i
(
th
0 1 2 3 4 5 6 7 8
Z = (1.5 − 4.8)/1.39 = −3.3/1.39 = −2.37
P(X ≤ 120) = \binom{500}{0}(.25)^{0}(.75)^{500} + \binom{500}{1}(.25)^{1}(.75)^{499} + \binom{500}{2}(.25)^{2}(.75)^{498} + \dots + \binom{500}{120}(.25)^{120}(.75)^{380}
OR Use SAS:
data _null_;
Cohort=cdf('binomial', 120, .25, 500);
put Cohort;
run;
0.323504227
OR use the normal approximation:
μ = np = 500(.25) = 125 and σ² = np(1−p) = 93.75; σ = 9.68
Z = (120 − 125)/9.68 = −.52
P(Z < −.52) = .3015
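The exact binomial sum and the normal approximation can be compared side by side. The sketch below is an illustration in Python rather than SAS; it uses `math.comb` for the binomial coefficients and, like the slide, applies no continuity correction.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, cutoff = 500, 0.25, 120

# Exact: P(X <= 120) = sum over k of C(n, k) p^k (1-p)^(n-k)
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(cutoff + 1))

# Normal approximation with mean np and variance np(1-p)
mu = n * p
sigma = sqrt(n * p * (1 - p))
z = (cutoff - mu) / sigma
approx = NormalDist().cdf(z)

print(round(exact, 4), round(z, 2), round(approx, 4))
```

The exact value matches the SAS result of about 0.3235. The approximation comes out slightly above the slide's .3015 because the slide rounds Z to −.52 before the table lookup.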
Proportions…
The binomial distribution forms the basis of statistics for
proportions.
A proportion is just a binomial count divided by n.
For example, if we sample 200 cases and find 60 smokers, X=60 but
the observed proportion=.30.
Statistics for proportions are similar to binomial counts, but
differ by a factor of n.
Stats for proportions
For binomial:
μ_x = np
σ_x² = np(1 − p)
σ_x = √(np(1 − p))
For proportion:
μ_p̂ = p
σ_p̂² = np(1 − p)/n² = p(1 − p)/n
σ_p̂ = √(p(1 − p)/n)
The mean and standard deviation for a proportion each differ from the binomial versions by a factor of n.
P-hat stands for "sample proportion."
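The factor-of-n relationship is easy to confirm numerically. In the sketch below, n = 200 comes from the smoker example, while treating p = 0.30 as the true proportion is an assumption made only for illustration; the function names are hypothetical.

```python
from math import sqrt

def binomial_stats(n, p):
    """Mean and standard deviation of a binomial count X."""
    return n * p, sqrt(n * p * (1 - p))

def proportion_stats(n, p):
    """Mean and standard deviation of the sample proportion X / n."""
    return p, sqrt(p * (1 - p) / n)

n, p = 200, 0.30  # p = 0.30 is assumed for illustration

mean_x, sd_x = binomial_stats(n, p)
mean_p, sd_p = proportion_stats(n, p)

# Dividing the count statistics by n gives the proportion statistics
print(mean_x, round(sd_x, 3), mean_p, round(sd_p, 4))
print(round(mean_x / n, 4), round(sd_x / n, 4))
```

Dividing the binomial mean and standard deviation by n reproduces the proportion's mean and standard deviation.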
It all comes back to Z…
Statistics for proportions are based on a normal
distribution, because the binomial can be
approximated as normal if np>5
From non-standard normal to standard
normal
X is a normal random variable with mean μ, and
standard deviation σ
Set Z=(X–μ)/σ
Z=standard unit or z-score of X
P(X≤150)
For x = 150 (with μ = 120 and σ = 15), the z-score is z = (150 − 120)/15 = 2
P(X≤150) = P(Z≤2)
= 0.5 + 0.4772 = 0.9772
Areas Under Normal Curve
[Figure: standard normal curve f(X) with the upper-tail area to the right of z shaded]
P[Z > 1] = 0.1587
P[Z > 1.96] = 0.0250
Normal Deviate Z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 .5000 .4960 .4920 .4880 .4840 .4801 .4761 .4721 .4681 .4641
0.4 .3446 .3409 .3372 .3336 .3300 .3264 .3228 .3192 .3156 .3121
0.9 .1841 .1814 .1788 .1762 .1736 .1711 .1685 .1660 .1635 .1611
1.5 .0668 .0655 .0643 .0630 .0618 .0606 .0594 .0582 .0571 .0559
1.9 .0287 .0281 .0274 .0268 .0262 .0256 .0250 .0244 .0239 .0233
2.4 .0082 .0080 .0078 .0075 .0073 .0071 .0069 .0068 .0066 .0064
3.0 .0013 .0013 .0013 .0012 .0012 .0011 .0011 .0011 .0010 .0010
% Points of Student t Distribution
[Figure: t density with upper-tail area = a to the right of t; −1.812 and 1.812 marked on the t-axis]
For 10 degrees of freedom (d.o.f.):
P[t > 1.812] = 0.05
P[t < −1.812] = 0.05
d.o.f. a = .25 .10 .05 .01 .005 .0005
1 1.000 3.078 6.314 31.821 63.657 636.619
5 .727 1.476 2.015 3.365 4.032 6.859
10 .700 1.372 1.812 2.764 3.169 4.587
20 .687 1.325 1.725 2.528 2.845 3.850
30 .683 1.310 1.697 2.457 2.750 3.646
40 .681 1.303 1.684 2.423 2.704 3.551
60 .679 1.296 1.671 2.390 2.660 3.460
120 .677 1.289 1.658 2.358 2.617 3.373
68-95-99.7 Rule
[Figure: normal curve f(X) with 68% of the data within one standard deviation of the mean]
Example
Find the area under the standard normal curve between
z = 0 and z = 1.45
[Figure: standard normal curve with the area between z = 0 and z = 1.45 shaded]
A portion of Table 3:
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06
...
1.4 0.4265 (in the 0.05 column)
...
P(0 ≤ z ≤ 1.45) = 0.4265
Find the area under the normal curve to the right of z
= 1.45; P(z > 1.45)
P(z > 1.45) = 0.5000 − 0.4265 = 0.0735
Example: Find the area to the left of z = 1.45; P(z <
1.45)
P(z < 1.45) = 0.5000 + 0.4265 = 0.9265
Example: Find the area between the mean (z = 0)
and z = −1.26
P(−1.26 ≤ z ≤ 0) = 0.3962
Example: Find the area between z = −2.30 and z =
1.80
P(−2.30 ≤ z ≤ 1.80) = 0.4893 + 0.4641 = 0.9534
Solutions:
1) When x = 32.00: z = (32.00 − 32.0)/0.02 = 0.00
[Figure: x from 32.0 to 32.025 corresponds to z from 0 to 1.25]
[Figure: x from 31.97 to 32.0 corresponds to z from −1.50 to 0]
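[Figure: normal curve with mean 62; area 0.4700 between x = 62 and x = 72 and area 0.0300 to the right of x = 72; z-axis marked 0 and 1.88]
Solution
P(x > 72) = 0.03; P(z > 1.88) = 0.03
z = (x − μ)/σ; 1.88 = (72 − 62)/σ
1.88σ = 10
σ = 10/1.88 = 5.32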
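Solving for an unknown standard deviation from a tail probability can be sketched as follows. This is an illustration that assumes Python's `statistics.NormalDist.inv_cdf` in place of the table lookup; the variable names are made up for the example.

```python
from statistics import NormalDist

mu = 62            # mean
x = 72             # value with 3% of the distribution above it
upper_tail = 0.03

# z-score with 3% of the area to its right (97% to its left)
z = NormalDist().inv_cdf(1 - upper_tail)

# Rearranging z = (x - mu) / sigma gives sigma = (x - mu) / z
sigma = (x - mu) / z
print(round(z, 2), round(sigma, 2))
```

The result agrees with the hand calculation: z ≈ 1.88 and σ ≈ 5.32.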
Example: Find the numerical value of z(0.10):
[Figure: standard normal curve with area 0.10 to the right of z(0.10)]
z(0.10) = 1.28
Example: Find the numerical value of z(0.80):
[Figure: standard normal curve with area 0.80 to the right of z(0.80), which lies to the left of 0]