Chapter 3

Discrete Random Variables and Probability Distributions


3.1. Concept of a Random Variable
3.2. Probability Distribution Functions
3.3. Cumulative Distribution Functions
3.4. Mean and Variance of a Discrete Random Variable
3.5. Discrete Uniform Distribution
3.6. The Binomial Distribution
3.7. Negative Binomial and Geometric Distributions
3.8. Hypergeometric Distribution
3.9. The Poisson Distribution

After careful study of this chapter, the students will be able to:

1. Determine probabilities from probability mass functions, and the reverse
2. Determine probabilities from cumulative distribution functions, and cumulative distribution functions
from probability mass functions, and the reverse
3. Calculate means and variances for discrete random variables
4. Understand the assumptions for each of the discrete probability distributions presented
5. Select an appropriate discrete probability distribution to calculate probabilities in specific applications
6. Calculate probabilities, determine means and variances for each of the discrete probability
distributions presented

3.1. Concept of a Random Variable


A random variable is a function that assigns a real number to each outcome in the sample space of a random experiment.
A random variable is denoted by an uppercase letter such as X. After an experiment is conducted, the measured value of the
random variable is denoted by a lowercase letter such as x = 70 milliamperes.
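
The definition can be made concrete with a short sketch. The experiment below (two tosses of a coin) and the name number_of_heads are illustrative assumptions, not part of the chapter; the point is only that a random variable is an ordinary function from outcomes to real numbers.

```python
from itertools import product

# Sample space of a hypothetical experiment: two tosses of a coin.
sample_space = list(product("HT", repeat=2))

def number_of_heads(outcome):
    """Random variable X: assigns a real number to an outcome."""
    return outcome.count("H")

# Value x taken by X for every outcome in the sample space.
values = {outcome: number_of_heads(outcome) for outcome in sample_space}
for outcome, x in values.items():
    print("".join(outcome), "->", x)

# The set of possible values is finite, so X is a discrete random variable.
possible_values = sorted(set(values.values()))
print(possible_values)
```

Here X is the function and each printed number is an observed value x for one outcome.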

A sample space is discrete if it consists of a finite or countably infinite set of outcomes.

A sample space is continuous if it contains an interval (either finite or infinite) of real numbers or an infinite number of
possibilities equal to the number of points on a line segment.

A random variable is called a discrete random variable if its set of possible outcomes is countable.

When a random variable can take on values on a continuous scale, it is called a continuous random variable.

In most practical problems, continuous random variables represent measured data, such as all possible heights, weights,
temperatures, distances, or life periods, whereas discrete random variables represent count data, such as the number of
defectives in a sample of k items or the number of highway fatalities per year in a given state.

Example: Classify the following random variables as discrete or continuous:


X: the number of automobile accidents per year in Virginia.
Y : the length of time to play 18 holes of golf.
M: the amount of milk produced yearly by a particular cow.
N: the number of eggs laid each month by a hen.
P: the number of building permits issued each month in a certain city.
Q: the weight of grain produced per acre.
Answer: (X=discrete; Y=continuous; M=continuous; N=discrete; P=discrete; Q=continuous)

3.2. Probability Distribution Functions


The probability distribution of a random variable X is a description of the probabilities associated with the possible values
of X. For a discrete random variable, the distribution is often specified by just a list of the possible values along with the
probability of each. For other cases, probabilities are expressed in terms of a formula.
The set of ordered pairs (x, f(x)) is a probability function, probability mass function, or probability distribution of the
discrete random variable X if, for each possible outcome x,

REZEL A. STO. TOMAS, ECE 1


(1) 𝑓(𝑥) ≥ 0

(2) ∑𝑥 𝑓(𝑥) = 1

(3) 𝑓 (𝑥) = 𝑃(𝑋 = 𝑥)
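
Conditions (1) and (2) can be checked mechanically for any candidate table of values. The following is a minimal sketch; the helper name is_valid_pmf, the tolerance, and the fair-die example are assumptions made for illustration.

```python
import math

def is_valid_pmf(pmf, tol=1e-9):
    """Check conditions (1) and (2): f(x) >= 0 and the f(x) sum to 1."""
    nonnegative = all(p >= 0 for p in pmf.values())
    sums_to_one = math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)
    return nonnegative and sums_to_one

# Hypothetical pmf of a fair six-sided die (not from the text).
die = {face: 1 / 6 for face in range(1, 7)}
print(is_valid_pmf(die))

# Not a pmf: the probabilities sum to 1.1.
print(is_valid_pmf({0: 0.5, 1: 0.6}))
```

A small tolerance is used because floating-point sums of probabilities are rarely exactly 1.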

Exercises: Solve the following problems. Show your complete solution and box your final answer.
1. The sample space of a random experiment is {a, b, c, d, e, f }, and each outcome is equally likely. A random variable is
defined as follows:

Determine the probability mass function of X.

Answer: 𝑓𝑋 (0) = 1⁄3 , 𝑓𝑋 (1.5) = 1⁄3 , 𝑓𝑋 (2) = 1⁄6 , 𝑓𝑋 (3) = 1⁄6


2. Verify that the following functions are probability mass functions, and determine the requested probabilities.

Answer: (a) 1 (b) 7⁄8 (c) 3⁄4 (d) 1⁄2

For problems 1-2, CLICK THE LINK BELOW:

https://www.youtube.com/watch?v=7t1jLrg4WL8

3. The distributor of a machine for cytogenics has developed a new model. The company estimates that when it is introduced
into the market, it will be very successful with a probability 0.6, moderately successful with a probability 0.3, and not
successful with probability 0.1. The estimated yearly profit associated with the model being very successful is $15 million
and being moderately successful is $5 million; not successful would result in a loss of $500,000. Let X be the yearly profit
of the new model. Determine the probability mass function of X.
Answer: 𝑃(𝑋 = 15 𝑚𝑖𝑙𝑙𝑖𝑜𝑛) = 0.6, 𝑃(𝑋 = 5 𝑚𝑖𝑙𝑙𝑖𝑜𝑛) = 0.3, 𝑃(𝑋 = −0.5 𝑚𝑖𝑙𝑙𝑖𝑜𝑛) = 0.1

4. An assembly consists of three mechanical components. Suppose that the probabilities that the first, second, and third
components meet specifications are 0.95, 0.98, and 0.99. Assume that the components are independent. Determine the
probability mass function of the number of components in the assembly that meet specifications.
Answer: P(X = 0) = 0.00001, P(X = 1) = 0.00167, P(X = 2) = 0.07663, P(X = 3) = 0.92169
For problems 3-4, CLICK THE LINK BELOW:

https://www.youtube.com/watch?v=dvgRtUK4fks
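
One way to check the answer to Problem 4 is to enumerate all pass/fail patterns of the three components. The sketch below assumes only what the problem states (independence and the three probabilities); the variable names are illustrative.

```python
from itertools import product

# Probabilities that components 1, 2, and 3 meet specifications (Problem 4).
p_meet = [0.95, 0.98, 0.99]

pmf = {k: 0.0 for k in range(len(p_meet) + 1)}

# Enumerate every pass/fail pattern; independence lets us multiply probabilities.
for pattern in product([True, False], repeat=len(p_meet)):
    prob = 1.0
    for meets, p in zip(pattern, p_meet):
        prob *= p if meets else 1 - p
    # sum(pattern) counts the components that meet specifications.
    pmf[sum(pattern)] += prob

for k, prob in pmf.items():
    print(k, round(prob, 5))
```

The four probabilities add to 1, as condition (2) of a probability mass function requires.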

3.3. Cumulative Distribution Functions


An alternate method for describing a random variable’s probability distribution is with cumulative probabilities such as
P(X ≤ x).

The cumulative distribution function of a discrete random variable X with probability distribution f(x), denoted as F(x),
is
\[ F(x) = P(X \le x) = \sum_{t \le x} f(t), \quad -\infty < x < \infty. \]

For a discrete random variable X, F(x) satisfies the following properties.


(1) $F(x) = P(X \le x) = \sum_{t \le x} f(t)$
(2) $0 \le F(x) \le 1$
(3) If $x \le y$, then $F(x) \le F(y)$
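
The relationship between f(x) and F(x) runs in both directions: F is a running sum of f, and f is recovered from the jumps of F. The sketch below illustrates both; the pmf values and the function names are hypothetical, chosen only for illustration.

```python
from itertools import accumulate

def cdf_table(pmf):
    """Sorted support points and the value of F at each of them."""
    xs = sorted(pmf)
    return xs, list(accumulate(pmf[x] for x in xs))

def cdf_at(pmf, x):
    """F(x) = P(X <= x): sum f(t) over all support points t <= x."""
    return sum(p for t, p in pmf.items() if t <= x)

def pmf_from_cdf(xs, cumulative):
    """Recover f(x) as the jump of F at each support point."""
    jumps = {}
    previous = 0.0
    for x, F in zip(xs, cumulative):
        jumps[x] = F - previous
        previous = F
    return jumps

# Hypothetical pmf used only for illustration.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}
xs, cumulative = cdf_table(pmf)
print(xs, cumulative)
print(cdf_at(pmf, 1.5))   # F is flat between support points
print(cdf_at(pmf, -1))    # F is 0 below the smallest possible value
print(pmf_from_cdf(xs, cumulative))
```

Because F only changes at the possible values of X, it is a step function, which is why the answers in this section are written piecewise.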

Exercises: Solve the following problems. Show your complete solution and box your final answer.
1. Marketing estimates that a new instrument for the analysis of soil samples will be very successful, moderately successful,
or unsuccessful, with probabilities 0.3, 0.6, and 0.1, respectively. The yearly revenue associated with a very successful,
moderately successful, or unsuccessful product is $10 million, $5 million, and $1 million, respectively. Let the random
variable X denote the yearly revenue of the product.
(a) Determine the probability mass function of X.
(b) Determine the cumulative distribution function for the random variable.
Answer:
(a) P(X = 10 million) = 0.3, P(X = 5 million) = 0.6, P(X = 1 million) = 0.1
(b) F(x) = 0 for x < 1 million; 0.1 for 1 million ≤ x < 5 million; 0.7 for 5 million ≤ x < 10 million;
1 for 10 million ≤ x
2.

Answer: (a) 1 ,(b) 0.75, (c) 0.25, (d) 0.25, (e) 0, and (f) 0

For problems 1-2, CLICK THE LINK BELOW:

https://www.youtube.com/watch?v=SVvsivSw6VY

3.4. Mean and Variance of a Discrete Random Variable


The mean, or expected value, of a random variable X is of special importance in statistics because it describes where the
probability distribution is centered. By itself, however, the mean does not give an adequate description of the shape of the
distribution.

The mean or expected value of the discrete random variable X, denoted as μ or E(X), is

\[ \mu = E(X) = \sum_{x} x f(x) \]

The variance of X, denoted as σ² or V(X), is

\[ \sigma^2 = V(X) = E[(X - \mu)^2] = \sum_{x} (x - \mu)^2 f(x) = \sum_{x} x^2 f(x) - \mu^2 \]

The quantity (x − μ) is called the deviation of an observation from its mean. Since the deviations are squared and then
averaged, σ² will be much smaller for a set of x values that are close to μ than it will be for a set of values that vary
considerably from μ.

The standard deviation of X is σ = √(σ²), the positive square root of the variance.

The mean of a discrete random variable X is a weighted average of the possible values of X, with weights equal to the
probabilities.

The variance of a random variable X is a measure of dispersion or scatter in the possible values for X. The variance of X
uses weight f(x) as the multiplier of each possible squared deviation (x − μ)².
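
As a numerical illustration, the sketch below evaluates the mean and both forms of the variance for the yearly-revenue distribution of Exercise 1 in Section 3.3 (values in millions of dollars). The function names are illustrative assumptions; the formulas are the ones defined above.

```python
import math

def mean(pmf):
    """Weighted average of the possible values, with weights f(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Definition: sum of (x - mu)^2 f(x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def variance_shortcut(pmf):
    """Equivalent form: sum of x^2 f(x), minus mu^2."""
    mu = mean(pmf)
    return sum(x ** 2 * p for x, p in pmf.items()) - mu ** 2

# Revenue in millions of dollars and its probability (Section 3.3, Exercise 1).
revenue = {10: 0.3, 5: 0.6, 1: 0.1}

print(round(mean(revenue), 4))               # mean, in millions
print(round(variance(revenue), 4))           # variance, in millions squared
print(round(variance_shortcut(revenue), 4))  # same value from the shortcut form
print(round(math.sqrt(variance(revenue)), 4))  # standard deviation, in millions
```

The two variance functions agree up to floating-point rounding, and the variance carries squared units while the standard deviation is in the same units as X.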

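Since $\mu = \sum_{x} x f(x)$ by definition, and $\sum_{x} f(x) = 1$ for any discrete probability distribution, it follows that

\[ \sigma^2 = \sum_{x} (x - \mu)^2 f(x) = \sum_{x} x^2 f(x) - 2\mu \sum_{x} x f(x) + \mu^2 \sum_{x} f(x) = \sum_{x} x^2 f(x) - \mu^2 \]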

Exercises: Solve the following problems. Show your complete solution and box your final answer.
1. Determine the mean and variance of the random variable.

Answer: E(X) = 0, V(X) = 1.5

2. Determine the mean and variance of the random variable in Exercise 1 (3.3. Cumulative Distribution Functions).
Answer: E(X) = 6.1 million, V(X) = 7.89 million²
3. The range of the random variable X is [0, 1, 2, 3, x], where x is unknown. If each value is equally likely and the mean
of X is 6, determine x.
Answer: 𝑥 = 24

For problems 1-3, CLICK THE LINK BELOW:

https://www.youtube.com/watch?v=J9KJRqiIgzY

3.5. Discrete Uniform Distribution


The simplest discrete random variable is one that assumes only a finite number of possible values, each with equal
probability. A random variable X that assumes each of the values $x_1, x_2, \ldots, x_n$ with equal probability 1/n is
frequently of interest.

A random variable X has a discrete uniform distribution if each of the n values in its range, say, $x_1, x_2, \ldots, x_n$, has equal
probability. Then,
\[ f(x_i) = \frac{1}{n} \]

Suppose the range of the discrete random variable X is the consecutive integers $a, a+1, a+2, \ldots, b$ for $a \le b$. The range
of X contains $b - a + 1$ values, each with probability $1/(b - a + 1)$. Now,
\[ \mu = \sum_{k=a}^{b} k \left( \frac{1}{b - a + 1} \right) \]

From the algebraic identity $\sum_{k=a}^{b} k = \frac{b(b+1) - (a-1)a}{2}$, we have

\[
\mu = \frac{b(b+1) - (a-1)a}{2} \left( \frac{1}{b - a + 1} \right)
    = \frac{[b(b - a + 1) + ab] + [(b - a + 1)a - ab]}{2} \left( \frac{1}{b - a + 1} \right)
    = \frac{b + a}{2}
\]

Thus, the mean of X is

\[ \mu = E(X) = \frac{b + a}{2} \]

The variance of X is

\[ \sigma^2 = \frac{(b - a + 1)^2 - 1}{12} \]
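
The closed forms can be compared with a direct calculation from the definitions. The sketch below is illustrative (the function names are assumptions). It also shows a standard way to handle equally spaced non-integer values: write Y = cX with X on consecutive integers, so that E(Y) = cE(X) and V(Y) = c²V(X).

```python
def uniform_mean_var(a, b):
    """Closed forms for a uniform distribution on the integers a, a+1, ..., b."""
    n = b - a + 1
    return (a + b) / 2, (n ** 2 - 1) / 12

def direct_mean_var(values):
    """Mean and variance from the definitions, with f(x) = 1/n for each value."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n
    return mu, var

# Integers 1, 2, 3: both approaches should agree.
print(uniform_mean_var(1, 3))
print(direct_mean_var([1, 2, 3]))

# Values 0.15, 0.16, ..., 0.19 are 0.01 times the integers 15, ..., 19.
c = 0.01
mu_x, var_x = uniform_mean_var(15, 19)
print(c * mu_x, c ** 2 * var_x)
print(direct_mean_var([0.15, 0.16, 0.17, 0.18, 0.19]))
```

The scaled and direct results agree up to floating-point rounding.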

Exercises: Solve the following problems. Show your complete solution and box your final answer.
1. Let the random variable X have a discrete uniform distribution on the integers 1 ≤ 𝑥 ≤ 3. Determine the mean
and variance of X.
Answer: E(X) = 2, V(X) = 0.667

2. Thickness measurements of a coating process are made to the nearest hundredth of a millimeter. The thickness
measurements are uniformly distributed with values 0.15, 0.16, 0.17, 0.18, and 0.19. Determine the mean and variance of
the coating thickness for this process.
Answer: E(X) = 0.17, V(X) = 0.0002

3. The lengths of plate glass parts are measured to the nearest tenth of a millimeter. The lengths are uniformly distributed,
with values at every tenth of a millimeter starting at 590.0 and continuing through 590.9. Determine the mean and variance
of lengths.
Answer: E(X) = 590.45, V(X) = 0.0825

For problems 1-3, CLICK THE LINK BELOW:

https://www.youtube.com/watch?v=pz9HO8Ew9FI

3.6. The Binomial Distribution


An experiment often consists of repeated trials, each with two possible outcomes that may be labeled success or failure.

The most obvious application deals with the testing of items as they come off an assembly line, where each trial may
indicate a defective or a nondefective item. We may choose to define either outcome as a success. The process is referred
to as a Bernoulli process. Each trial is called a Bernoulli trial.

The Bernoulli Process
Strictly speaking, the Bernoulli process must possess the following properties:
1. The experiment consists of repeated trials.
2. Each trial results in an outcome that may be classified as a success or a failure.
3. The probability of success, denoted by p, remains constant from trial to trial.
4. The repeated trials are independent.

Binomial Distribution
The number X of successes in n Bernoulli trials is called a binomial random variable. The probability distribution of this
discrete random variable is called the binomial distribution, and its values will be denoted by b(x; n, p) since they depend
on the number of trials and the probability of a success on a given trial.

A Bernoulli trial can result in a success with probability 𝑝 and a failure with probability 𝑞 = 1 − 𝑝. Then the probability
distribution of the binomial random variable X, the number of successes in n independent trials, is

\[ f(x) = b(x; n, p) = C^{n}_{x} \, p^{x} q^{n-x}, \quad x = 0, 1, 2, \ldots, n \]

where
$C^{n}_{x} = \frac{n!}{x!\,(n-x)!}$ = the total number of sample points in the experiment that have x successes and n − x failures
$p^{x} q^{n-x}$ = the probability of x successes and n − x failures in a specified order.

If X is a binomial random variable with parameters p and n,


\[ \mu = E(X) = np \quad \text{and} \quad \sigma^2 = V(X) = np(1 - p) \]
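
The binomial formula and its mean and variance can be checked numerically. The sketch below uses n = 10 and p = 0.01, the values of Exercise 1 in this section; the function name binomial_pmf is an assumption made for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n Bernoulli trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.01
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# The probabilities over x = 0, ..., n should sum to 1.
total = sum(pmf)

# Mean and variance from the definitions, for comparison with np and np(1 - p).
mean = sum(x * f for x, f in enumerate(pmf))
var = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))

print(total)
print(mean, n * p)
print(var, n * p * (1 - p))
print(binomial_pmf(5, n, p))                         # P(X = 5)
print(binomial_pmf(3, n, p) + binomial_pmf(4, n, p))  # P(3 <= X < 5)
```

Interval probabilities such as P(3 ≤ X < 5) are obtained by adding the individual terms, here x = 3 and x = 4.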

Exercises: Solve the following problems. Show your complete solution and box your final answer.
1. The random variable X has a binomial distribution with n = 10 and p = 0.01. Determine the following probabilities.
(a) P(X = 5) (b) P(3 ≤ X < 5)
Answer: (a) 2.396 × 10⁻⁸ (b) 1.138 × 10⁻⁴

2. Batches that consist of 50 coil springs from a production process are checked for conformance to customer requirements.
The mean number of nonconforming coil springs in a batch is 5. Assume that the number of nonconforming springs in a
batch, denoted as X, is a binomial random variable.
(a) What are n and p?
(b) What is 𝑃(𝑋 ≤ 2)?
(c) What is 𝑃(𝑋 ≥ 49)?
Answer: (a) n = 50, p = 0.1 (b) 0.1117 (c) 4.51 × 10⁻⁴⁸

For problems 1-2, CLICK THE LINK BELOW:

https://www.youtube.com/watch?v=Sumrhkz0fHo

3. Determine the cumulative distribution function of a binomial random variable with n = 3 and p = 1/4.
Answer:
\[
F(x) =
\begin{cases}
0, & x < 0 \\
27/64, & 0 \le x < 1 \\
27/32, & 1 \le x < 2 \\
63/64, & 2 \le x < 3 \\
1, & 3 \le x
\end{cases}
\]

4. Let X denote the number of bits received in error in a digital communication channel, and assume that X is a binomial
random variable with p = 0.001. If 1000 bits are transmitted, determine the mean and variance of X.
Answer: E(X) = 1, V(X) = 0.999

For problems 3-4, CLICK THE LINK BELOW:

https://www.youtube.com/watch?v=wsl77TjITII

3.7. Negative Binomial and Geometric Distributions


Let us consider an experiment where the properties are the same as those listed for a binomial experiment, with the exception
that the trials will be repeated until a fixed number of successes occur. Therefore, instead of the probability of x successes
in n trials, where n is fixed, we are now interested in the probability that the kth success occurs on the xth trial. Experiments
of this kind are called negative binomial experiments.

The number X of trials required to produce k successes in a negative binomial experiment is called a negative binomial
random variable, and its probability distribution is called the negative binomial distribution, denoted as b*(x; k, p).

Negative Binomial Distribution
If repeated independent trials can result in a success with probability p and a failure with probability q = 1 − p, then the
probability distribution of the random variable X, the number of the trial on which the kth success occurs, is

\[ f(x) = b^{*}(x; k, p) = C^{x-1}_{k-1} \, p^{k} q^{x-k}, \quad x = k, k+1, k+2, \ldots \]

If X is a negative binomial random variable with parameters p and k,

\[ \mu = E(X) = \frac{k}{p} \quad \text{and} \quad \sigma^2 = V(X) = \frac{k(1 - p)}{p^2} \]
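
A short numerical sketch of the negative binomial formula follows. The function name is an assumption. The first call uses k = 5, p = 0.3, and x = 10, the values of Exercise 3 in this section. The mean and variance check uses illustrative values k = 3 and p = 0.5 and truncates the infinite sum at x = 200, beyond which the terms are negligible for these parameters.

```python
from math import comb

def negative_binomial_pmf(x, k, p):
    """b*(x; k, p): probability that the kth success occurs on trial x."""
    if x < k:
        return 0.0  # the kth success cannot occur before trial k
    return comb(x - 1, k - 1) * p ** k * (1 - p) ** (x - k)

# Tenth trial is the fifth success, with p = 0.3.
print(round(negative_binomial_pmf(10, 5, 0.3), 4))

# Compare a truncated sum with k/p and k(1 - p)/p^2 (illustrative k and p).
k, p = 3, 0.5
xs = range(k, 201)
mean = sum(x * negative_binomial_pmf(x, k, p) for x in xs)
var = sum((x - mean) ** 2 * negative_binomial_pmf(x, k, p) for x in xs)
print(mean, k / p)
print(var, k * (1 - p) / p ** 2)
```

The factor comb(x - 1, k - 1) counts the ways to place the first k − 1 successes among the first x − 1 trials, since the last trial must itself be a success.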

If we consider the special case of the negative binomial distribution where 𝑘 = 1, we have a probability distribution for
the number of trials required for a single success. Hence, a special case called the geometric distribution results.

Geometric Distribution
If repeated independent trials can result in a success with probability p and a failure with probability q = 1 − p, then the
probability distribution of the random variable X, the number of the trial on which the first success occurs, is

\[ f(x) = g(x; p) = p q^{x-1}, \quad x = 1, 2, 3, \ldots \]


The mean and variance of a random variable following the geometric distribution are

\[ \mu = E(X) = \frac{1}{p} \quad \text{and} \quad \sigma^2 = V(X) = \frac{1 - p}{p^2} \]
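
The geometric probabilities are simple enough to compute directly. The sketch below uses p = 0.8, the value of Exercise 1 in this section; the function names are illustrative assumptions. The cumulative form relies on the fact that the first success takes more than x trials only if the first x trials all fail, so P(X ≤ x) = 1 − q^x.

```python
def geometric_pmf(x, p):
    """g(x; p): probability that the first success occurs on trial x."""
    return p * (1 - p) ** (x - 1)

def geometric_cdf(x, p):
    """P(X <= x) = 1 - q**x, since X > x means the first x trials all failed."""
    return 1 - (1 - p) ** x

p = 0.8
exactly_four = geometric_pmf(4, p)        # first success on trial 4
at_most_four = geometric_cdf(4, p)        # first success within 4 trials
at_least_four = 1 - geometric_cdf(3, p)   # first 3 trials all fail

print(round(exactly_four, 4), round(at_most_four, 4), round(at_least_four, 4))

# The cumulative form agrees with adding the individual terms.
print(sum(geometric_pmf(x, p) for x in range(1, 5)))
```

Note that "at least four trials" is the complement of "at most three trials", not of "at most four".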

Exercises: Solve the following problems. Show your complete solution and box your final answer.
1. The probability of a successful optical alignment in the assembly of an optical data storage product is 0.8. Assume the
trials are independent.
(a) What is the probability that the first successful alignment requires exactly four trials?
(b) What is the probability that the first successful alignment requires at most four trials?
(c) What is the probability that the first successful alignment requires at least four trials?
Answer: (a) 0.0064 (b) 0.9984 (c) 0.008

2. An electronic scale in an automated filling operation stops the manufacturing line after three underweight packages are
detected. Suppose that the probability of an underweight package is 0.001 and each fill is independent.
(a) What is the mean number of fills before the line is stopped?
(b) What is the standard deviation of the number of fills before the line is stopped?
Answer: (a) 3000 (b) 1731.18

For problems 1-2, CLICK THE LINK BELOW:

https://www.youtube.com/watch?v=eAGUWdJ5QEU

3. The probability that a person living in a certain city owns a dog is estimated to be 0.3. Find the probability that the tenth
person randomly interviewed in that city is the fifth one to own a dog.
Answer: 0.0515

CLICK THE LINK BELOW:

https://www.youtube.com/watch?v=gThESKeCXDE

4. Suppose the probability that any given person will believe a tale about the transgressions of a famous actress is 0.8.
What is the probability that
(a) the sixth person to hear this tale is the fourth one to believe it?
(b) the third person to hear this tale is the first one to believe it?
Answer: (a) 0.1638; (b) 0.032

CLICK THE LINK BELOW:

https://www.youtube.com/watch?v=BHODtQNnrXM
