Math322 Chapter5
METHODS
LECTURE SLIDES
CHAPTER 5
5.1 Introduction and Motivation
Exercises
DISCRETE PROBABILITY DISTRIBUTIONS
In statistics, there are different types of probability distributions, such as the binomial distribution, normal
distribution and Poisson distribution. All of these distributions can be classified as either a continuous or a discrete
probability distribution.
Observations generated by different statistical experiments often show the same general type of behavior. The
discrete random variables associated with such experiments can therefore be described by essentially the same
probability distribution and represented by a single formula. A discrete probability distribution describes a
discrete random variable: if a random variable is discrete, then it has a discrete probability distribution.
The following are the probability distributions that will be covered in this chapter:
Discrete Uniform Distribution
Binomial Distribution
Multinomial Distribution
Hypergeometric Distribution
Geometric Distribution
Poisson Distribution and Poisson Process
5.1 THE DISCRETE UNIFORM DISTRIBUTION
Definition. If the random variable X assumes the values $x_1, x_2, \ldots, x_k$ with equal probabilities, then the
discrete uniform distribution is given by
$$f(x; k) = \frac{1}{k}, \quad x = x_1, x_2, \ldots, x_k.$$
Example: When a die is tossed, each element of the sample space S = {1, 2, 3, 4, 5, 6} occurs with probability 1/6.
Therefore, we have a uniform distribution with
$$f(x; 6) = \frac{1}{6}, \quad x = 1, 2, 3, 4, 5, 6.$$
Theorem. The mean and variance of the discrete uniform distribution f(x; k) are
$$\mu = \frac{\sum_{i=1}^{k} x_i}{k} \quad \text{and} \quad \sigma^2 = \frac{\sum_{i=1}^{k} (x_i - \mu)^2}{k}.$$
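As a quick check of the theorem, the sketch below applies the two formulas to the die example. The function name `uniform_mean_var` is illustrative only, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

def uniform_mean_var(values):
    """Mean and variance of a discrete uniform distribution on `values`."""
    k = len(values)
    mu = Fraction(sum(values), k)
    # Average squared deviation from the mean, divided by k as in the theorem.
    var = sum((Fraction(x) - mu) ** 2 for x in values) / k
    return mu, var

mu, var = uniform_mean_var([1, 2, 3, 4, 5, 6])
print(mu, var)  # 7/2 35/12
```

For a fair die this gives a mean of 3.5 and a variance of 35/12, or about 2.92.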
THE BINOMIAL DISTRIBUTION
Perhaps the most commonly used discrete probability distribution is the
binomial distribution. An experiment which follows a binomial distribution
will satisfy the following requirements (think of repeatedly flipping a coin
as you read these):
1. The experiment consists of n identical trials, where n is fixed in advance.
2. Each trial has two possible outcomes, S or F, which we denote ``success''
and ``failure'' and code as 1 and 0, respectively.
3. The trials are independent, so the outcome of one trial has no effect on
the outcome of another.
4. The probability of success, p, is constant from one trial to another.
DEFINITION.
The random variable X of a binomial distribution counts the number of successes
in n trials. The probability that X takes a particular value x is given by the formula
$$P(X = x) = b(x; n, p) = \binom{n}{x} p^x (1 - p)^{n - x},$$
where $0 \le p \le 1$ and $x = 0, 1, 2, \ldots, n$. Recall that the quantity $\binom{n}{x}$, ``n choose x,'' is
$$\binom{n}{x} = \frac{n!}{x!\,(n - x)!}.$$
MEAN AND VARIANCE OF A BINOMIAL RANDOM VARIABLE
We could use the formulas previously given to compute the mean and variance of
X. However, for the binomial distribution these are always equal to
$$E(X) = \mu = np \quad \text{and} \quad \mathrm{Var}(X) = \sigma^2 = npq, \quad \text{where } q = 1 - p.$$
NOTE: A particularly important use of the binomial distribution is
sampling with replacement, which implies that p is constant from one draw to the next.
EXAMPLE 5.1
Suppose we have 10 balls in a bowl: 3 of the balls are red and 7 of them are blue. Define success S as
drawing a red ball. If we sample with replacement, P(S) = 0.3 for every trial. Let n = 20; then
$X \sim b(x; 20, 0.3)$ and we can compute any probability we want. For example,
$$P(X = 5) = \binom{20}{5} (0.3)^5 (1 - 0.3)^{20 - 5} = 15504\,(0.3)^5 (0.7)^{15} = 0.1789.$$
The mean and variance are
$$\mu = np = 20(0.3) = 6 \quad \text{and} \quad \sigma^2 = npq = 20(0.3)(0.7) = 4.2.$$
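The binomial formula for this ball-drawing example can be evaluated directly. The following is a minimal sketch using only the Python standard library; `binom_pmf` is an illustrative helper name, not something defined in the slides.

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 20, 0.3                  # 20 draws with replacement, P(red) = 0.3
prob_5 = binom_pmf(5, n, p)     # P(X = 5)
mean = n * p                    # np
variance = n * p * (1 - p)      # npq
print(round(prob_5, 4))  # 0.1789
```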
In statistics the so-called binomial distribution describes the possible number of times
that a particular event will occur in a sequence of observations.
The event is coded as binary: it either occurs or it does not. The binomial distribution is used
when a researcher is interested in the occurrence of an event, not in its magnitude.
For instance, in a clinical trial, a patient may survive or die. The researcher studies the
number of survivors, and not how long the patient survives after treatment. Another
example is whether a person is ambitious or not. Here, the binomial distribution
describes the number of ambitious persons, and not how ambitious they are.
The binomial distribution is specified by the number of
observations, n, and the probability of occurrence, which is denoted
by p.
A classical example, often used to illustrate concepts of
probability theory, is the tossing of a coin.
If a fair coin is tossed 4 times, then we may obtain 0, 1, 2, 3, or 4
heads. We may also obtain 4, 3, 2, 1, or 0 tails, but these outcomes
are equivalent to 0, 1, 2, 3, or 4 heads. The probabilities of obtaining
0, 1, 2, 3, or 4 heads are 1/16, 4/16, 6/16, 4/16, and 1/16,
respectively.
The figure on this page shows the distribution with p = 1/2.
Thus, in the example discussed here, the most likely result is 2
heads in 4 tosses, since this outcome has the highest probability.
Other situations in which binomial distributions arise are quality
control, public opinion surveys, medical research, and insurance
problems.
EXAMPLE 5.2
The probability that a patient recovers from a rare blood disease is 0.4. If 15 people are
known to have contracted this disease, what is the probability that
a. at least 2 survive?
b. between 3 and 7 survive?
c. exactly 5 survive?
d. none will survive? (left as exercise)
SOLUTION
Let X be the number of people who survive, so that $X \sim b(x; 15, 0.4)$, and let
$B(r; n, p) = \sum_{x=0}^{r} b(x; n, p)$ denote the cumulative binomial sum given in Table A.1.
a. $P(X \ge 2) = 1 - P(X \le 1) = 1 - [P(X = 0) + P(X = 1)]
= 1 - \left[\binom{15}{0}(0.4)^0(0.6)^{15} + \binom{15}{1}(0.4)^1(0.6)^{14}\right]
= 1 - B(1; 15, 0.4) = 1 - 0.0052 = 0.9948$
b. $P(3 < X < 7) = P(4 \le X \le 6) = b(4; 15, 0.4) + b(5; 15, 0.4) + b(6; 15, 0.4)
= B(6; 15, 0.4) - B(3; 15, 0.4) = 0.6098 - 0.0905 = 0.5193$ (using Table A.1)
c. $P(X = 5) = b(5; 15, 0.4) = \binom{15}{5}(0.4)^5(0.6)^{15-5} = 0.186$
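The table lookups in parts a to c can be reproduced by summing the binomial formula directly. This is a sketch under the assumption that B(r; n, p) denotes the cumulative sum P(X ≤ r); the helper names are illustrative.

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(r, n, p):
    """B(r; n, p) = P(X <= r), the cumulative binomial sum."""
    return sum(binom_pmf(x, n, p) for x in range(r + 1))

n, p = 15, 0.4
at_least_2 = 1 - binom_cdf(1, n, p)                         # part a
between_3_and_7 = binom_cdf(6, n, p) - binom_cdf(3, n, p)   # part b, P(4 <= X <= 6)
exactly_5 = binom_pmf(5, n, p)                              # part c

print(round(at_least_2, 4), round(between_3_and_7, 4), round(exactly_5, 4))
```

Part d can be checked the same way with `binom_pmf(0, n, p)`.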
[Figure: binomial distribution table, with the probability of success and the total number of trials labeled]
EXAMPLE 5.3.
A traffic control engineer reports that 75% of the vehicles passing through a checkpoint are from
within the state. What is the probability that fewer than 4 of the next 9 vehicles are from out of the
state?
Solution.
Let X be the number of the next 9 vehicles that are from out of the state. A vehicle from out of the state counts
as a success, so the probability of success is p = 0.25, the probability of failure is q = 1 − 0.25 = 0.75, and n = 9.
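One way to carry the computation through numerically is sketched below. It assumes that X counts the out-of-state vehicles among the next 9, so that "fewer than 4" means P(X ≤ 3); the helper names are illustrative.

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 9, 0.25   # 9 vehicles, each from out of state with probability 0.25
# "Fewer than 4" out-of-state vehicles means X = 0, 1, 2, or 3.
fewer_than_4 = sum(binom_pmf(x, n, p) for x in range(4))
print(round(fewer_than_4, 4))
```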
Solution. Probability of success = p = 6/10, and the probability of failure = q = 1 − 6/10 = 0.4
If a pair of dice is tossed 6 times, what is the probability of obtaining a total of
7 or 11 twice, a matching pair once, and any other combination 3 times?
Solution.
$S = \{(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), \ldots, (6,6)\}$, $n(S) = 36$
$E_1$: a total of 7 or 11 occurs. $E_1 = \{(1,6), (6,1), (2,5), (5,2), (3,4), (4,3), (5,6), (6,5)\}$,
$n(E_1) = 8$, therefore $P(E_1) = \frac{8}{36} = \frac{2}{9}$
$E_2$: a matching pair occurs. $E_2 = \{(1,1), (2,2), (3,3), (4,4), (5,5), (6,6)\}$,
$n(E_2) = 6$, therefore $P(E_2) = \frac{6}{36} = \frac{1}{6}$
$E_3$: neither a matching pair nor a total of 7 or 11 occurs. $n(E_3) = 22$, therefore $P(E_3) = \frac{22}{36} = \frac{11}{18}$
$$P\left(2, 1, 3; \tfrac{2}{9}, \tfrac{1}{6}, \tfrac{11}{18}, 6\right) = \binom{6}{2, 1, 3} \left(\tfrac{2}{9}\right)^2 \left(\tfrac{1}{6}\right)^1 \left(\tfrac{11}{18}\right)^3 = 0.1127.$$
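The event counts and the final multinomial probability in the dice problem can be verified by brute-force enumeration. The sketch below is one way to do that; `multinomial_pmf` is an illustrative helper that evaluates n!/(x1! x2! ... xk!) times the product of p_i^x_i.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def multinomial_pmf(counts, probs):
    """Probability of observing the given category counts in sum(counts) trials."""
    coeff = factorial(sum(counts))
    for c in counts:
        coeff //= factorial(c)          # multinomial coefficient, exact at each step
    prob = Fraction(coeff)
    for c, p in zip(counts, probs):
        prob *= Fraction(p) ** c
    return prob

# Enumerate all 36 outcomes of a pair of dice and count each event.
outcomes = list(product(range(1, 7), repeat=2))
n_e1 = sum(1 for a, b in outcomes if a + b in (7, 11))   # total of 7 or 11
n_e2 = sum(1 for a, b in outcomes if a == b)             # matching pair
n_e3 = len(outcomes) - n_e1 - n_e2                       # anything else

probs = [Fraction(n, 36) for n in (n_e1, n_e2, n_e3)]
result = multinomial_pmf([2, 1, 3], probs)
print(n_e1, n_e2, n_e3, float(result))
```

Subtracting to get the third count is valid here because a matching pair always has an even total, so the first two events cannot overlap.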
EXAMPLE 5.6.
Suppose that in a three-way election for a large country, candidate A received 20% of the votes,
candidate B received 30% of the votes, and candidate C received 50% of the votes. If six voters are
selected randomly, what is the probability that there will be exactly one supporter for candidate A,
two supporters for candidate B and three supporters for candidate C in the sample?
Note: This is sampling without replacement, so the correct distribution is the multivariate
hypergeometric distribution, but the distributions converge as the population grows large.
Solution:
$$P(1, 2, 3; 0.2, 0.3, 0.5, 6) = \binom{6}{1, 2, 3} (0.2)^1 (0.3)^2 (0.5)^3 = 0.135$$
EXAMPLE 5.7
The complexity of arrivals and departures of planes at an airport is such that computer simulation is often
used to model the “ideal” conditions. For a certain airport with three runways, it is known that in the ideal
setting the following are the probabilities that the individual runways are accessed by a randomly arriving
commercial jet:
Runway 1: p1 = 2/9,
Runway 2: p2 = 1/6,
Runway 3: p3 = 11/18.
What is the probability that 6 randomly arriving airplanes are distributed in the following fashion?
Runway 1: 2 airplanes,
Runway 2: 1 airplane,
Runway 3: 3 airplanes
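Solution : Using the multinomial distribution, we have
$$P\left(2, 1, 3; \tfrac{2}{9}, \tfrac{1}{6}, \tfrac{11}{18}, 6\right) = \binom{6}{2, 1, 3} \left(\tfrac{2}{9}\right)^2 \left(\tfrac{1}{6}\right)^1 \left(\tfrac{11}{18}\right)^3 = 0.1127.$$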
5.3 THE HYPERGEOMETRIC DISTRIBUTION
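$$P(X = x) = h(x; N, n, k) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}, \quad x = 0, 1, 2, 3, \ldots, k$$
$$\mu = \frac{nk}{N} \quad \text{and} \quad \sigma^2 = \frac{N-n}{N-1} \cdot n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right).$$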
The range of x is determined by the three binomial coefficients in the
definition: x and n − x can be no more than k and N − k, respectively, and
neither of them can be less than 0.
$$h(1; 40, 5, 3) = \frac{\binom{3}{1}\binom{37}{4}}{\binom{40}{5}} = 0.3011$$
Once again, this plan is not desirable since it detects a bad lot (3 defectives) only about 30% of the time.
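The value h(1; 40, 5, 3) can be checked with a few lines of Python. This is a sketch that assumes the parameter order h(x; N, n, k) used in the definition above; the function name is illustrative.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k): x successes in a sample of n drawn without replacement
    from N items, of which k are successes."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Lot of 40 items with 3 defectives, sample of 5, exactly 1 defective found.
p_one_defective = hypergeom_pmf(1, N=40, n=5, k=3)
print(round(p_one_defective, 4))  # 0.3011
```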
EXAMPLE 5.9.
From a lot of 10 missiles, 4 are selected at random and fired. If the lot contains 3 defective
missiles that will not fire, what is the probability that
a. all 4 will fire?
b. at most 2 will not fire?
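Solution. N = 10, n = 4, and k = 3 defective missiles, so N − k = 7 missiles will fire.
Let X: number of non-defective missiles in the sample.
(a) $P(X = 4) = \frac{\binom{7}{4}\binom{3}{0}}{\binom{10}{4}} = \frac{35}{210} = 0.1667$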
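Both parts of the missile example can be evaluated with exact fractions. The sketch below counts the defective missiles in the sample (an assumption made here for convenience), so "all 4 will fire" means 0 defectives and "at most 2 will not fire" means 0, 1, or 2 defectives.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k) as an exact fraction."""
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))

N, n, defective = 10, 4, 3
all_fire = hypergeom_pmf(0, N, n, defective)                               # part a
at_most_2_fail = sum(hypergeom_pmf(y, N, n, defective) for y in range(3))  # part b

print(all_fire, float(all_fire))
print(at_most_2_fail, float(at_most_2_fail))
```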
$$P(X = 0) = \frac{\binom{2}{0}\binom{8}{3}}{\binom{10}{3}} = 0.467$$
Thus, if the lot is truly unacceptable, with 2 defective parts, this sampling plan will allow
acceptance roughly 47% of the time. As a result, this plan should be considered faulty.
EXAMPLE 5.11
$$P(X = 2) = \frac{\binom{3}{2}\binom{5}{3}}{\binom{8}{5}} = \frac{30}{56}, \quad P(X = 3) = \frac{\binom{3}{3}\binom{5}{2}}{\binom{8}{5}} = \frac{10}{56}.$$
$$f(x_1, x_2, \ldots, x_k; a_1, a_2, \ldots, a_k, N, n) = \frac{\binom{a_1}{x_1}\binom{a_2}{x_2}\cdots\binom{a_k}{x_k}}{\binom{N}{n}},$$
with $\sum_{i=1}^{k} x_i = n$ and $\sum_{i=1}^{k} a_i = N$.
EXAMPLE 5.13
A group of 10 individuals is used for a biological case study. The group contains 3 people with
blood type O, 4 with blood type A, and 3 with blood type B. What is the probability that a
random sample of 5 will contain 1 person with blood type O, 2 people with blood type A, and 2
people with blood type B?
Solution : Using the extension of the hypergeometric distribution with x1 = 1, x2 = 2, x3 = 2,
a1 = 3, a2 = 4, a3 = 3, N = 10, and n = 5, we find that the desired probability is
$$f(1, 2, 2; 3, 4, 3, 10, 5) = \frac{\binom{3}{1}\binom{4}{2}\binom{3}{2}}{\binom{10}{5}} = \frac{3}{14}$$
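The blood-type calculation is a product of binomial coefficients divided by the number of possible samples. A minimal sketch follows, with `multivariate_hypergeom_pmf` as an illustrative name; N and n are derived from the group sizes and the requested counts.

```python
from fractions import Fraction
from math import comb, prod

def multivariate_hypergeom_pmf(xs, sizes):
    """Probability of drawing xs[i] items from each group of size sizes[i]
    when sum(xs) items are sampled without replacement from all groups."""
    N, n = sum(sizes), sum(xs)
    numerator = prod(comb(a, x) for a, x in zip(sizes, xs))
    return Fraction(numerator, comb(N, n))

# Groups: 3 type O, 4 type A, 3 type B; sample of 5 with 1 O, 2 A, 2 B.
prob = multivariate_hypergeom_pmf([1, 2, 2], [3, 4, 3])
print(prob)  # 3/14
```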
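THE RELATIONSHIP TO BINOMIAL DISTRIBUTION (BINOMIAL APPROXIMATION
TO HYPERGEOMETRIC DISTRIBUTION)
$$\mu = np = \frac{nk}{N} \quad \text{and} \quad \sigma^2 = \frac{N-n}{N-1} \cdot npq = \frac{N-n}{N-1} \cdot n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right),$$
where p = k/N and q = 1 − p. The factor (N − n)/(N − 1) is close to 1 when n is small relative to N, so that
σ² ≈ npq.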
EXAMPLE 5.14
A manufacturer of automobile tires reports that among a shipment of 5000 sent to a local
distributor, 1000 are blemished. If one purchases 10 of these tires at random, what is the
probability that exactly 3 are blemished?
$$P(X = 3) = h(3; 5000, 10, 1000) \approx b\left(3; 10, \tfrac{1}{5}\right) = \binom{10}{3} \left(\tfrac{1}{5}\right)^3 \left(\tfrac{4}{5}\right)^7 = 0.2013.$$
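To see how close the binomial approximation is for the tire shipment, the exact hypergeometric probability can be computed alongside it. This is a sketch with illustrative helper names; Python integers are arbitrary precision, so the large binomial coefficients cause no overflow.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k): exact probability when sampling without replacement."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """b(x; n, p): binomial probability."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

N, n, k = 5000, 10, 1000
exact = hypergeom_pmf(3, N, n, k)
approx = binom_pmf(3, n, k / N)     # p = k/N = 0.2
print(round(exact, 4), round(approx, 4))
```

The two values agree to about three decimal places, which is what one would expect when the sample of 10 is tiny compared with the shipment of 5000.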
THE GEOMETRIC DISTRIBUTION
While in the binomial distribution we have a fixed number of trials and a variable number of successes, in the
geometric distribution we wait for a single success, so the number of trials is variable.
A natural model for this situation is tossing a coin, loaded so that the probability of a head on a single toss
is p (and the probability of a tail on a single toss is 1 − p = q), until a head appears for the first time.
DEFINITION.
If repeated independent trials can result in a success with probability 𝑝 and a failure with
probability 𝑞 = 1 − 𝑝, then the probability distribution of the random variable 𝑋, the
number of the trial on which the first success occurs, is
$$g(x; p) = p q^{x-1}, \quad x = 1, 2, 3, \ldots$$
EXAMPLE 5.15.
The probability that a student pilot passes the written test for a private pilot’s license is
0.7. Find the probability that the student will pass the test on the fourth try.
$$g(x; p) = g(4; 0.7) = 0.7(1 - 0.7)^{4-1} = 0.7(0.3)^3 = 0.0189.$$
EXAMPLE 5.16
Find the probability that a person flipping a coin will get the first head on the
third flip.
$$g(x; p) = g(3; 0.5) = 0.5(1 - 0.5)^{3-1} = 0.5(0.5)^2 = 0.125.$$
EXAMPLE 5.17
A shooter has a constant probability of 0.4 of hitting the target.
a. What is the probability that the shooter will first hit the target on the fifth try?
b. What is the expected number of shots needed for the shooter to hit the target?
Solution.
(a) $g(x; p) = g(5; 0.4) = 0.4(1 - 0.4)^{5-1} = 0.4(0.6)^4 = 0.05184$
(b) $E(X) = \mu = \frac{1}{0.4} = \frac{10}{4} = 2.5$
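The geometric examples in this section all use the same one-line formula, so they can be checked together. The sketch below uses illustrative function names and takes the mean of the geometric distribution to be 1/p, as in part (b).

```python
def geom_pmf(x, p):
    """g(x; p): probability that the first success occurs on trial x."""
    return p * (1 - p) ** (x - 1)

def geom_mean(p):
    """Expected number of trials up to and including the first success."""
    return 1 / p

print(geom_pmf(4, 0.7))   # pilot passes on the fourth try
print(geom_pmf(3, 0.5))   # first head on the third flip
print(geom_pmf(5, 0.4))   # shooter first hits on the fifth try
print(geom_mean(0.4))     # expected number of shots
```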
5.5 THE POISSON DISTRIBUTION
The Poisson distribution is most commonly used to model the number of random occurrences of
some phenomenon in a specified unit of space or time.
In such situations we are often interested in whether or not the events occur randomly in time or space.
The Poisson distribution is a discrete probability distribution for the counts of events that occur randomly in a
given interval of time (or space).
DEFINITION
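For a Poisson random variable, the probability that X takes some value x is given by the formula
$$P(X = x) = f(x; \lambda t) = \frac{e^{-\lambda t} (\lambda t)^x}{x!}, \quad x = 0, 1, 2, \ldots$$
where $\lambda$ is the average number of occurrences per unit time or region and $t$ is the length of the time
interval or the size of the region. For the Poisson distribution,
$$E(X) = \lambda t \quad \text{and} \quad \mathrm{Var}(X) = \lambda t.$$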
EXAMPLE 5.18
The number of false fire alarms in a suburb of Houston averages 2.1 per day. Assuming
that a Poisson distribution is appropriate, the probability that 4 false alarms will occur on
a given day is given by
$$P(X = 4) = \frac{(2.1)^4 e^{-2.1}}{4!} = 0.0992$$
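The false-alarm probability follows directly from the Poisson formula. A minimal sketch, with `poisson_pmf` as an illustrative name and the mean λt passed as a single argument:

```python
from math import exp, factorial

def poisson_pmf(x, mean):
    """Poisson probability of exactly x events when the expected count is `mean`."""
    return exp(-mean) * mean**x / factorial(x)

# 2.1 false alarms per day on average; probability of exactly 4 in one day.
p_four_alarms = poisson_pmf(4, 2.1)
print(round(p_four_alarms, 4))  # 0.0992
```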
EXAMPLE 5.19
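Solution : Using the Poisson distribution with x = 6 and λt = 4 and referring to Table
A.2, we have
$$p(6; 4) = \frac{e^{-4} 4^6}{6!} = \sum_{x=0}^{6} p(x; 4) - \sum_{x=0}^{5} p(x; 4) = 0.8893 - 0.7851 = 0.1042.$$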
EXAMPLE 5.20:
Ten is the average number of oil tankers arriving each day at a certain port. The facilities
at the port can handle at most 15 tankers per day. What is the probability that on a given
day tankers have to be turned away?
Solution : Let X be the number of tankers arriving each day. Then, using Table A.2, we
have
$$P(X > 15) = 1 - P(X \le 15) = 1 - \sum_{x=0}^{15} p(x; 10) = 1 - 0.9513 = 0.0487.$$
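Cumulative Poisson probabilities such as those read from Table A.2 can also be computed by summing the formula. The sketch below treats the tanker problem as P(X > 15) with a mean of 10 arrivals per day; the function names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, mean):
    """Poisson probability of exactly x events when the expected count is `mean`."""
    return exp(-mean) * mean**x / factorial(x)

def poisson_cdf(r, mean):
    """P(X <= r), the cumulative Poisson sum."""
    return sum(poisson_pmf(x, mean) for x in range(r + 1))

# Tankers are turned away when more than 15 arrive in a day.
p_turned_away = 1 - poisson_cdf(15, 10)
print(round(p_turned_away, 4))
```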
Example 5.21: Births in a hospital occur randomly at an average rate of 1.8 births per hour. What is the probability of
observing 4 births in a given hour at the hospital?
Let X be the number of births in a given hour, so that λt = 1.8. We can now use the formula to calculate the
probability of observing exactly 4 births in a given hour:
$$P(X = 4) = \frac{e^{-1.8} (1.8)^4}{4!} = 0.0723.$$
What about the probability of observing 2 or more births in a given hour at the hospital?
Then
$$P(X \ge 2) = 1 - P(X = 0) - P(X = 1) = 1 - e^{-1.8}(1 + 1.8) = 0.5372.$$
EXAMPLE 5.23:
Suppose we know that births in a hospital occur randomly at an average rate of 1.8 births per hour.
What is the probability that we observe 5 births in a given 2-hour interval?
Solution:
If births occur randomly at a rate of 1.8 births per 1-hour interval, then births occur randomly at a rate of
3.6 births per 2-hour interval, so λt = 3.6 and
$$P(X = 5) = \frac{e^{-3.6} (3.6)^5}{5!} = 0.1377.$$
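The hospital examples differ only in the length of the interval, which scales the Poisson mean λt. The sketch below makes that scaling explicit; the variable and function names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, mean):
    """Poisson probability of exactly x events when the expected count is `mean`."""
    return exp(-mean) * mean**x / factorial(x)

rate_per_hour = 1.8   # average births per hour

# One-hour interval: mean = 1.8 * 1
p_4_in_1h = poisson_pmf(4, rate_per_hour * 1)
p_at_least_2_in_1h = 1 - poisson_pmf(0, rate_per_hour) - poisson_pmf(1, rate_per_hour)

# Two-hour interval: mean = 1.8 * 2 = 3.6
p_5_in_2h = poisson_pmf(5, rate_per_hour * 2)

print(round(p_4_in_1h, 4), round(p_at_least_2_in_1h, 4), round(p_5_in_2h, 4))
```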
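As $n \to \infty$ and $p \to 0$ with $np = \mu$ held constant, the binomial distribution approaches the Poisson
distribution:
$$b(x; n, p) \to p(x; \mu).$$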
EXAMPLE 5.24
The probability that a person will die from a certain respiratory infection is 0.002. Find the probability
that fewer than 5 of the next 2000 so infected will die.
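Solution:
𝑋: number of infected people who will die
Given n = 2000, p = 0.002, μ = np = 2000 × 0.002 = 4, the Poisson approximation gives
$$P(X < 5) = \sum_{x=0}^{4} b(x; 2000, 0.002) \approx \sum_{x=0}^{4} p(x; 4) = 0.6288.$$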
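For the respiratory-infection example, the quality of the Poisson approximation can be checked by computing the exact binomial sum next to it. This is a sketch with illustrative helper names, taking "fewer than 5" to mean P(X ≤ 4).

```python
from math import comb, exp, factorial

def binom_cdf(r, n, p):
    """Exact binomial P(X <= r)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(r + 1))

def poisson_cdf(r, mean):
    """Poisson P(X <= r) with expected count `mean`."""
    return sum(exp(-mean) * mean**x / factorial(x) for x in range(r + 1))

n, p = 2000, 0.002
exact = binom_cdf(4, n, p)        # exact binomial probability
approx = poisson_cdf(4, n * p)    # Poisson approximation with mean np = 4
print(round(exact, 4), round(approx, 4))
```

With n large and p small, the two values should differ only in the third or fourth decimal place.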
It is known that 5% of the books bound by a certain bindery have defective bindings.
Find the probability that 2 of 100 books bound by this bindery will have defective
bindings using
a. the formula for the binomial distribution