STAT 2006 Chapter 1 - 2022 - v2 - Polished
Presented by
Simon Cheung
Email: kingchaucheung@cuhk.edu.hk
Simulation Example
Consider flipping two fair coins. The set of possible outcomes is
{HH, HT, TH, TT}. We can simulate the experiment by generating 500 pairs
(U₁, U₂) of Uniform(0, 1) random numbers. We have four possible outcomes:
U₁ < 0.5, U₂ < 0.5
U₁ < 0.5, U₂ ≥ 0.5
U₁ ≥ 0.5, U₂ < 0.5
U₁ ≥ 0.5, U₂ ≥ 0.5
They correspond to the events {HH}, {HT}, {TH}, {TT} respectively.
By inspecting the relative frequency of each event, we can measure their
approximate probabilities.
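The simulation described above can be sketched in a few lines of Python (a minimal illustration, not part of the original slides; the seed and variable names are arbitrary choices):

```python
import random
from collections import Counter

random.seed(1)

# Simulate 500 pairs (U1, U2) of Uniform(0,1) numbers; map U < 0.5 to H, U >= 0.5 to T.
counts = Counter()
n = 500
for _ in range(n):
    u1, u2 = random.random(), random.random()
    outcome = ("H" if u1 < 0.5 else "T") + ("H" if u2 < 0.5 else "T")
    counts[outcome] += 1

# Relative frequencies approximate the true probabilities (each 0.25).
freqs = {k: v / n for k, v in counts.items()}
print(freqs)
```

Each relative frequency should land near 0.25, with random fluctuation shrinking as the number of pairs grows.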
Probability is a real-valued set function P that assigns to each event A in the sample
space S a number P(A), called the probability of the event A, such that the following hold:
• The probability of any event A must be nonnegative, that is, P(A) ≥ 0.
• The probability of the sample space is 1, that is, P(S) = 1.
• Given mutually exclusive events A₁, A₂, A₃, …, where Aᵢ ∩ Aⱼ = ∅ for i ≠ j,
P(A₁ ∪ A₂ ∪ ⋯ ∪ A_k) = P(A₁) + P(A₂) + ⋯ + P(A_k)
and, for a countably infinite collection,
P(⋃ᵢ₌₁^∞ Aᵢ) = Σᵢ₌₁^∞ P(Aᵢ).
Multiplication Theorem
• For any two events A and B with P(B) > 0,
P(A ∩ B) = P(A|B) P(B).
• For any three events A, B, C with P(B ∩ C) > 0,
P(A ∩ B ∩ C) = P(A|B ∩ C) P(B|C) P(C).
P(A ∩ B ∩ C) = P(the sum is 8, composed of double 4s) = 1/36 = P(A) P(B) P(C)
This appears to imply that A, B, C are independent.
• However, P(B ∩ C) = P(sum equals 7 or 8) = 11/36 ≠ P(B) P(C)
• Similarly, P(A ∩ B) ≠ P(A) P(B). Therefore, the requirement that
P(A ∩ B ∩ C) = P(A) P(B) P(C)
is not a strong enough condition to guarantee pairwise independence.
STAT 2006 - Jan 2021 13
Conditional Probability
Law of total probability and Bayes Theorem
• If 0 < P(B) < 1, then P(A) = P(A|B) P(B) + P(A|Bᶜ) P(Bᶜ), for any event A.
• If B₁, B₂, …, B_k are mutually exclusive and exhaustive events (that is, a partition of the
sample space), then, for any event A,
P(A) = Σⱼ₌₁ᵏ P(A|Bⱼ) P(Bⱼ).
• Proof.
We observe that A = (A ∩ B₁) ∪ (A ∩ B₂) ∪ ⋯ ∪ (A ∩ B_k)
and the events A ∩ B₁, A ∩ B₂, …, A ∩ B_k are mutually exclusive, so
P(A) = Σⱼ₌₁ᵏ P(A ∩ Bⱼ) = Σⱼ₌₁ᵏ P(A|Bⱼ) P(Bⱼ).
Example. When coded messages are sent, there may be errors in the transmission. In
particular, Morse code used "dots" and "dashes", which are known to occur in the
proportion of 3:4. Suppose there is interference on the transmission line, and with
probability 1/8 a dot is mistakenly received as a dash, and vice versa. If a single signal
is sent to us, what is the probability that we will receive a dot?
Denote B as the event that the original signal sent is a dot and A as the event that
we receive a dot. Then we have
P(B) = 3/7,  P(Aᶜ|B) = P(A|Bᶜ) = 1/8.
Then
P(A) = P(A|B) P(B) + P(A|Bᶜ) P(Bᶜ) = (1 − 1/8) × 3/7 + 1/8 × (1 − 3/7) = 25/56.
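The law-of-total-probability computation above can be checked with exact rational arithmetic (a small sketch using Python's `fractions`, not part of the original slides):

```python
from fractions import Fraction

# P(B): a dot is sent; dots and dashes occur in the proportion 3:4.
p_B = Fraction(3, 7)
p_flip = Fraction(1, 8)  # probability a signal is received wrongly, in either direction

# Law of total probability: P(A) = P(A|B) P(B) + P(A|B^c) P(B^c)
p_A = (1 - p_flip) * p_B + p_flip * (1 - p_B)
print(p_A)  # 25/56
```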
Bayes Theorem
• For any two events A and B with P(A) > 0 and P(B) > 0,
P(B|A) = P(A|B) P(B) / P(A).
• If B₁, B₂, …, B_k are mutually exclusive and exhaustive events (that is, a partition of the
sample space), and A is any event with P(A) > 0, then for any event Bⱼ,
P(Bⱼ|A) = P(A|Bⱼ) P(Bⱼ) / P(A) = P(A|Bⱼ) P(Bⱼ) / Σᵢ₌₁ᵏ P(A|Bᵢ) P(Bᵢ).
That is, even if the test shows positive on an individual, he/she will only have about a 13%
chance of being infected by the virus.
A tree diagram is a useful graphical display
that shows the outcomes of a set of events.
P(−) = 0.00003 + 0.97706 = 0.97709
P(Vᶜ|−) = 0.97706 / (0.97706 + 0.00003) = 0.99997
Conditional Probability
Example. A book club classifies members as heavy, medium, or light purchasers, and
separate mailings are prepared for each of these groups. Overall 20% of the members are
heavy purchasers, 30% medium, and 50% light. A member is not classified into a group
until 18 months after joining the club, but a test is made of the feasibility of using the first
3 months' purchases to classify members. The following percentages are obtained from
existing records of individuals classified as heavy, medium, or light purchasers.
If a member purchases no books in the first 3 months, what is the probability that the
member is a light purchaser?
Conditional Probability
Example. You are waiting for your bag at the baggage return carousel of an airport.
Suppose that you know that there are 200 bags to come from your flight, and you are
counting the distinct bags that come out. Suppose that 𝑥𝑥 bags have arrived, and your bag
is not among them. What is the probability that your bag will not arrive at all, that is, that it
has been lost (or at least delayed)?
Let us assign values to 𝑃𝑃 𝐴𝐴 based on empirical data. Page 9 of "aucbaggage.pdf" contains
data for the number of missing bags per 1000 passengers for 24 airlines (provided by the
Association of European Airlines (AEA) in 2006). Here are the data for two airlines:
• Air Malta 𝑃𝑃 𝐴𝐴 = 0.0044
• British Airways 𝑃𝑃 𝐴𝐴 = 0.023
Note that
• For Air Malta, P(A|199) = 0.469. So even
when only 1 bag remains to arrive, the
chance is less than half that your bag has
been lost.
• For British Airways, P(A|199) = 0.825.
However, P(A|197) = 0.541 is the first
probability over half.
Example.
Suppose that 5 people, including you and a friend, line up at random. Let the random
variable X denote the number of people standing between you and your friend. Determine
the probability mass function of X.
P_X(0) = (4 × 2! × 3!)/5! = 4/10,  P_X(1) = (3 × 2! × 3!)/5! = 3/10,
P_X(2) = (2 × 2! × 3!)/5! = 2/10,  P_X(3) = (1 × 2! × 3!)/5! = 1/10.
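The counting argument above can be verified by brute force: enumerate all 5! equally likely orderings and tally the gap between the two people of interest (a quick sketch; the labels "A" and "B" for you and your friend are arbitrary):

```python
from itertools import permutations
from collections import Counter

# Label the 5 people; "A" (you) and "B" (your friend) are the pair of interest.
people = ["A", "B", "C", "D", "E"]
counts = Counter()
for perm in permutations(people):
    gap = abs(perm.index("A") - perm.index("B")) - 1  # people standing between A and B
    counts[gap] += 1

total = sum(counts.values())  # 5! = 120 equally likely orderings
pmf = {x: counts[x] / total for x in sorted(counts)}
print(pmf)
```

The tallies reproduce 4/10, 3/10, 2/10, 1/10 for X = 0, 1, 2, 3.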
Hypergeometric distribution
If we randomly select 𝑛𝑛 items without replacement from a set of 𝑁𝑁 items of which 𝑀𝑀 of
the items are of one type and 𝑁𝑁 − 𝑀𝑀 of the items are of a second type, then the
probability mass function of the discrete random variable 𝑋𝑋 is called the hypergeometric
distribution and is of the form
P(X = x) = C(M, x) C(N − M, n − x) / C(N, n),  0 ≤ x ≤ M, x ≤ n, n − x ≤ N − M.
Note that when the samples are drawn with replacement, the discrete random
variable 𝑋𝑋 follows what is called the binomial distribution.
Example.
A crate contains 50 light bulbs of which 5 are defective and 45 are not. A Quality Control
Inspector randomly samples 4 bulbs without replacement. Let 𝑋𝑋 = the number of defective
bulbs selected. Find the probability mass function, 𝑃𝑃𝑋𝑋 (𝑥𝑥), of the discrete random
variable 𝑋𝑋.
P_X(0) = C(5, 0) C(45, 4)/C(50, 4) = 0.647,  P_X(1) = C(5, 1) C(45, 3)/C(50, 4) = 0.3081,
P_X(2) = C(5, 2) C(45, 2)/C(50, 4) = 0.043,  P_X(3) = C(5, 3) C(45, 1)/C(50, 4) = 0.00195,
P_X(4) = C(5, 4) C(45, 0)/C(50, 4) = 0.000022
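These hypergeometric probabilities are easy to reproduce with exact binomial coefficients (a minimal sketch; the helper name is an arbitrary choice):

```python
from math import comb

def hypergeom_pmf(x, N, M, n):
    # P(X = x) = C(M, x) * C(N - M, n - x) / C(N, n)
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Crate of N = 50 bulbs, M = 5 defective, sample n = 4 without replacement.
pmf = [hypergeom_pmf(x, 50, 5, 4) for x in range(5)]
print([round(p, 4) for p in pmf])
assert abs(sum(pmf) - 1.0) < 1e-12  # the pmf sums to 1 over its support
```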
Expectation
If P_X(x) is the pmf of a discrete random variable X with support S, and if the summation
Σ_{x∈S} u(x) P_X(x) exists, then the expected value of u(X) is defined by
E[u(X)] = Σ_{x∈S} u(x) P_X(x).
Example.
What is the average toss of a fair six-sided die?
The pmf of X, the face value of a tossed fair six-sided die, is P_X(x) = 1/6, x = 1, 2, 3, 4, 5, 6.
Thus, the expected face value of X is
E(X) = 1 × 1/6 + 2 × 1/6 + 3 × 1/6 + 4 × 1/6 + 5 × 1/6 + 6 × 1/6 = 3.5.
Example. What is the expected value of a discrete random variable 𝑋𝑋 with pmf
P_X(x) = c/x², x = 1, 2, 3, …,
where c is a constant. Then
E(X) = Σ_{x=1}^∞ x (c/x²) = Σ_{x=1}^∞ c/x = ∞.
Therefore, the expected value of X doesn't exist.
Example. Let 𝑣𝑣 𝑋𝑋 = 𝑋𝑋 − 𝑐𝑐 2 , where 𝑐𝑐 is a constant. Suppose that 𝐸𝐸 𝑋𝑋 − 𝑐𝑐 2 exists.
Find the value of 𝑐𝑐 that minimizes 𝐸𝐸 𝑋𝑋 − 𝑐𝑐 2 .
L(c) = E(X − c)² = E(X² − 2cX + c²)
(d/dc) L(c) = E(−2X + 2c) = −2E(X) + 2c
Set (d/dc) L(c) = 0 to imply that c = E(X).
The sample standard deviation is s = √s². Note that a sample before collection is random. We
often use capital letters X₁, X₂, …, X_n to represent it. Hence, before collection, the sample
mean X̄ and the sample variance S² are random variables.
Let M^(r)(t) be the r-th derivative of M(t) with respect to t. We have E(X^r) = M_X^(r)(0).
Hence, μ_X = M_X^(1)(0) and σ_X² = M_X^(2)(0) − [M_X^(1)(0)]².
Example.
Use the mgf of the Binomial random variable 𝑋𝑋 to determine the mean and the variance of
𝑋𝑋.
M_X(t) = (pe^t + 1 − p)^n
M_X^(1)(t) = n(pe^t + 1 − p)^(n−1) pe^t ⟹ M_X^(1)(0) = np
M_X^(2)(t) = np[e^t (pe^t + 1 − p)^(n−1) + e^(2t) (n − 1) p (pe^t + 1 − p)^(n−2)] ⟹ M_X^(2)(0) = np[1 + (n − 1)p]
It follows that μ_X = np and σ_X² = np[1 + (n − 1)p] − n²p² = np(1 − p).
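The moments obtained from the mgf can be double-checked numerically against the binomial pmf itself (a small sketch; the values n = 10, p = 0.3 are arbitrary):

```python
from math import comb

n, p = 10, 0.3
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))
second = sum(k * k * pk for k, pk in enumerate(pmf))
var = second - mean**2

# Agrees with the mgf results: mu = np and sigma^2 = np(1 - p).
assert abs(mean - n * p) < 1e-6
assert abs(var - n * p * (1 - p)) < 1e-6
print(mean, var)
```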
Bernoulli Trial
Consider an experiment of tossing a coin with sample space S = {H, T}, and define a
discrete random variable X as X(H) = 1 and X(T) = 0. More generally, let S be a sample space
and A be an event. Denote ω as an outcome of the experiment. Define a discrete random
variable X with X(ω) = 1 if ω ∈ A and X(ω) = 0 if ω ∉ A. The random variable X is called a
Bernoulli trial.
Suppose P is a probability function defined on S. Let p = P(A). Then p = P_X(X = 1), called
the probability of success. Examples of a Bernoulli trial:
1. Flipping a coin
2. Rolling a die, where a six is "success"
3. In conducting a political opinion poll, choosing a voter at random to ascertain
whether that voter will vote "yes" in an upcoming election.
P(Y = k) = C(n, k) p^k (1 − p)^(n−k),  k = 0, 1, 2, …, n
Proof
In n independent Bernoulli trials, k of them are successes, each with probability p, and
n − k of them are failures, each with probability 1 − p. In addition, there are C(n, k) different
ways to select the k success trials among the n trials. Thus, the probability of obtaining k
successes from n trials is C(n, k) p^k (1 − p)^(n−k).
E(Y) = Σᵢ₌₁ⁿ E(Xᵢ) = nπ
Let X be the number of germinated seeds in a sample of 20 seeds. X has a binomial
distribution with parameters n = 20 and π = 0.85. Since P(X ≤ 12) = 0.0059, it is highly
improbable that in 20 seeds we would obtain only 12 germinated seeds if π is equal to 0.85. The
germination rate is most likely a value considerably less than 0.85.
Discrete Random Variables
Example
A cable TV company is investigating the feasibility of offering a new service in a large
Midwestern city. For the proposed new service to be economically viable, it is necessary
that at least 50% of their current subscribers add the new service. A survey of 1,218
customers reveals that 516 would add the new service. Do you think the company should
offer the new service in this city?
Let X be the number of customers who would subscribe to the new service in a random
sample of 1,218 customers. If p = 0.5, X has a binomial distribution with parameters n =
1218 and p = 0.5. Since P(X ≤ 516) ≈ 0, observing only 516 subscribers would be extremely
unlikely if half of all customers really wanted the service, so offering the new service is not
a good idea.
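The tail probability P(X ≤ 516) can be computed exactly with big-integer arithmetic (a sketch, not part of the original slides; exact summation is feasible here because n is only 1218):

```python
from math import comb
from fractions import Fraction

n, k = 1218, 516
p = Fraction(1, 2)

# Exact binomial tail P(X <= 516) under p = 0.5.
tail = sum(comb(n, j) for j in range(k + 1)) * p**n
print(float(tail))
```

The result is a vanishingly small probability, which is what justifies the conclusion on the slide.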
Geometric Distribution
Since (d/dt) M(t) = pe^t / [1 − e^t(1 − p)]² and (d²/dt²) M(t) = [pe^t + p(1 − p)e^(2t)] / [1 − e^t(1 − p)]³,
E(X) = (d/dt) M(t) |_(t=0) = 1/p;
E(X²) = (d²/dt²) M(t) |_(t=0) = (2 − p)/p²;
Var(X) = E(X²) − [E(X)]² = (2 − p)/p² − 1/p² = (1 − p)/p².
Select r − 1 successes
from x − 1 trials.
X is equal to the sum of r independent Geometric distributed random variables, each with
the same success probability p.
The mgf of a Negative Binomial random variable X with parameters r and p is given by
E(e^(tX)) = E(e^(t(Y₁+⋯+Y_r))) = ∏ⱼ₌₁ʳ E(e^(tYⱼ)) = [pe^t / (1 − e^t(1 − p))]^r,
where Y₁, Y₂, …, Y_r are independent Geometric random variables with parameter p and
e^t(1 − p) < 1.
E(X) = E(Y₁) + E(Y₂) + ⋯ + E(Y_r) = r/p;
Var(X) = Var(Y₁) + Var(Y₂) + ⋯ + Var(Y_r) = r(1 − p)/p².
Example. An oil company conducts a geological study that indicates that an exploratory oil
well should have a 20% chance of striking oil. What is the probability that the first strike
comes on the third well drilled? ⼀
P(X = 3) = C(2, 0) (1 − p)² p = 0.8² × 0.2 = 0.128.
What is the probability that the third strike comes on the seventh well drilled?
P(X = 7) = C(6, 2) p³ (1 − p)⁴ = 15 × 0.2³ × 0.8⁴ = 0.049.
What is the mean and variance of the number of wells that must be drilled if the oil
company wants to set up three producing wells?
E(X) = r/p = 3/0.2 = 15;  Var(X) = r(1 − p)/p² = 3 × 0.8/0.2² = 60.
A random variable X, taking values in the non-negative integers, has a Poisson distribution
with parameter λ > 0 if
P(X = k) = e^(−λ) λ^k / k!,  k = 0, 1, 2, 3, …
Let 𝑋𝑋 denote the number of events in a given continuous interval. Then 𝑋𝑋 follows an
approximate Poisson process with parameter 𝜆𝜆 > 0 if:
• The number of events occurring in non-overlapping intervals are independent.
• The probability of exactly one event in a short interval of length h = 1/n is approximately
λh = λ/n.
• The probability of exactly two or more events in a short interval is essentially zero.
Consider dividing the given interval into n subintervals; then X has approximately a Binomial
distribution with parameters n and λ/n.
P_X(x) = C(n, x) (λ/n)^x (1 − λ/n)^(n−x)
= (λ^x/x!) [n(n − 1)⋯(n − x + 1)/n^x] (1 − λ/n)^n (1 − λ/n)^(−x)
= (λ^x/x!) (1 − 1/n)(1 − 2/n)⋯(1 − (x − 1)/n) (1 − λ/n)^n (1 − λ/n)^(−x) ⟶ (λ^x/x!) e^(−λ),  x ≪ n,
as n → ∞.
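The limit above can be seen numerically: as n grows, the Binomial(n, λ/n) pmf approaches the Poisson pmf (a small sketch; λ = 3 and x = 2 are arbitrary choices):

```python
from math import comb, exp, factorial

lam, x = 3.0, 2
poisson = exp(-lam) * lam**x / factorial(x)

# Binomial(n, lam/n) pmf at x approaches the Poisson pmf as n grows.
for n in (10, 100, 10000):
    binom = comb(n, x) * (lam / n)**x * (1 - lam / n)**(n - x)
    print(n, binom)

print("poisson:", poisson)
```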
Exponential Distribution
Suppose that X is the number of customers arriving at a bank in one hour. If λ > 0 is the
mean number of customers arriving in one hour, the number of customers arriving in t
hours has a Poisson distribution with mean λt. Let W be the waiting time until the first
customer arrives.
F_W(w) = P(W ≤ w) = 1 − P(W > w) = 1 − P(no customer arrived in [0, w]) = 1 − e^(−λw).
For w > 0, the pdf of W is defined by
f_W(w) = (d/dw) F_W(w) = λe^(−λw),  w > 0.
W has an exponential distribution with parameter λ.
Exponential Distribution
The mgf of an Exponential random variable X with parameter λ is given by
M(t) = E(e^(tX)) = ∫₀^∞ e^(tx) λe^(−λx) dx = λ ∫₀^∞ e^(−(λ−t)x) dx = [−λ/(λ − t) e^(−(λ−t)x)]₀^∞ = λ/(λ − t),
where t < λ.
Since (d/dt) M(t) = λ/(λ − t)² and (d²/dt²) M(t) = 2λ/(λ − t)³, we have
E(X) = (d/dt) M(t) |_(t=0) = 1/λ;
E(X²) = (d²/dt²) M(t) |_(t=0) = 2/λ² ⟹ Var(X) = 1/λ².
Exponential Distribution
Memoryless property
P(X > s + t | X > t) = P(X > s + t, X > t)/P(X > t) = P(X > s + t)/P(X > t) = e^(−λ(s+t))/e^(−λt) = e^(−λs) = P(X > s)
Example
Suppose that the amount of time one spends in a bank is exponentially distributed with
mean 10 minutes, λ = 1/10. What is the probability that a customer will spend more than 15
minutes in the bank? What is the probability that a customer will spend more than 15
minutes in the bank given that he is still in the bank after 10 minutes?
Solution
P(X > 15) = e^(−15λ) = 0.22
P(X > 15 | X > 10) = P(X > 5) = e^(−0.5) = 0.607
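The memoryless property in this example can be confirmed numerically (a short sketch; note e^(−0.5) ≈ 0.607):

```python
from math import exp

lam = 1 / 10  # rate for a mean waiting time of 10 minutes

p_over_15 = exp(-15 * lam)
# Conditional probability by definition: P(X > 15 | X > 10) = P(X > 15)/P(X > 10).
p_cond = exp(-15 * lam) / exp(-10 * lam)
# Memoryless: this should equal P(X > 5).
p_over_5 = exp(-5 * lam)

print(round(p_over_15, 3), round(p_cond, 3), round(p_over_5, 3))
```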
Gamma Distribution
Suppose that X is the number of customers arriving at a bank in one hour. If λ > 0 is the
mean number of customers arriving in one hour, the number of customers arriving in t
hours has a Poisson distribution with mean λt. Let W be the waiting time until the α-th
customer arrives.
F_W(w) = P(W ≤ w) = 1 − P(W > w)
= 1 − P(fewer than α customers arrived in [0, w]) = 1 − Σ_(k=0)^(α−1) (λw)^k e^(−λw)/k!
= 1 − e^(−λw) − Σ_(k=1)^(α−1) (λw)^k e^(−λw)/k!.
Gamma Distribution
For w > 0, the pdf of W is defined by
f_W(w) = (d/dw) F_W(w) = λe^(−λw) − Σ_(k=1)^(α−1) (λ^k/k!) [k w^(k−1) − λw^k] e^(−λw)
= λe^(−λw) + λe^(−λw) Σ_(k=1)^(α−1) [(λw)^k/k! − (λw)^(k−1)/(k − 1)!]
= λe^(−λw) + λe^(−λw) [(λw)^(α−1)/(α − 1)! − 1]
= λ^α w^(α−1) e^(−λw)/(α − 1)!.
Gamma Distribution
Since f_W(w) ∝ λ^α w^(α−1) e^(−λw), write f_W(w) = K λ^α w^(α−1) e^(−λw), w > 0, α > 0, λ > 0. We
can determine that K = 1/Γ(α), where Γ(α) is the Gamma function.
Chi-Square Distribution
Let X follow a gamma distribution with λ = 1/2 and α = r/2, where r is a positive integer. Then
the probability density function of X is
f_X(x) = x^(r/2 − 1) e^(−x/2) / [2^(r/2) Γ(r/2)],  x > 0.
We say that X follows a chi-square distribution with r degrees of freedom, denoted χ²(r).
The mgf of X is M(t) = (1 − 2t)^(−r/2), E(X) = r and Var(X) = 2r.
Normal Distribution
Let X be normally distributed with mean μ and variance σ². The pdf of X is defined by
f_X(x | μ, σ²) = [1/√(2πσ²)] exp[−(x − μ)²/(2σ²)],  −∞ < x < ∞.
The normal distribution plays a central role in statistics. There are three main reasons:
1) It and its associated distributions are very tractable analytically.
2) It has the familiar bell shape, whose symmetry makes it an appealing choice for
many popular models.
3) There is the Central Limit Theorem, which shows that, under mild conditions, the
normal distribution can be used to approximate a large variety of distributions in
large samples.
2) It is symmetrical about the mean μ. Its mode, median and mean are all equal.
3) If X is a normally distributed random variable with mean μ and variance σ², then Z = (X − μ)/σ is normally
distributed with mean 0 and variance 1, called a standard normal random variable. That is, the normal
distribution belongs to the family of location and scale distributions.
4) The cdf of a standard normal random variable Z is defined by
Φ(z) = P(Z < z).
Tables are often available for finding the value of Φ(z) for a given z value. You can visit the website
http://stattrek.com/online-calculator/normal.aspx to evaluate either a z for a given value of Φ(z) or a
value of Φ(z) for a given value of z. Hence
P(a < Z < b) = Φ(b) − Φ(a).
Example.
The United States Environmental Protection Agency (EPA) has developed procedures for
measuring vehicle emission levels of nitrogen oxide. Let 𝑋𝑋 denote the amount of this
pollution in a randomly selected automobile in Houston, Texas. Suppose the distribution of
𝑋𝑋 can be adequately modeled by a normal distribution with a mean level of 𝜇𝜇 = 70 ppb
(parts per billion) and standard deviation of 𝜎𝜎 = 13 ppb.
(a) What is the probability that a randomly selected vehicle will have emission levels less
than 60 ppb?
(b) What is the probability that a randomly selected vehicle will have emission levels
greater than 90 ppb?
(c) What is the probability that a randomly selected vehicle will have emission levels
between 60 and 90 ppb?
(d) A State of Texas environmental agency is going to offer a reduced vehicle license fee to
those vehicles having very low emission levels. As a preliminary pilot project, they will offer this
incentive to the group of vehicle owners having the best 10% of emission levels. What emission level
should the agency use?
0.1 = P(X ≤ x) = P(Z ≤ (x − 70)/13).
It follows that (x − 70)/13 = −1.28, or x = 53.36.
Continuous Random Variables
Illustration of Mathematical Properties
Let 𝑋𝑋 be normally distributed with mean 𝜇𝜇 and variance 𝜎𝜎 2 .
• The mgf of X is
M(t) = E(e^(tX)) = [1/√(2πσ²)] ∫_(−∞)^∞ e^(tx) e^(−(x−μ)²/(2σ²)) dx
= [1/√(2πσ²)] ∫_(−∞)^∞ e^(−[x² − 2(μ+σ²t)x + μ²]/(2σ²)) dx
= e^([(μ+σ²t)² − μ²]/(2σ²)) [1/√(2πσ²)] ∫_(−∞)^∞ e^(−[x − μ − σ²t]²/(2σ²)) dx
= e^(σ²t(2μ+σ²t)/(2σ²)) = e^(μt + σ²t²/2).
• Since Z = (X − μ)/σ has mean 0 and variance 1,
E(X) = E(μ + σZ) = μ
Var(X) = Var(μ + σZ) = σ²
Let X and Y be two discrete random variables, and let S denote the two-dimensional
support of X and Y. Then, the function f_{X,Y}(x, y) = P(X = x, Y = y) is a joint probability
mass function (pmf) if it satisfies the following three conditions:
• 0 ≤ f_{X,Y}(x, y) ≤ 1
• Σ_{(x,y)∈S} f_{X,Y}(x, y) = 1
• P((X, Y) ∈ A) = Σ_{(x,y)∈A} f_{X,Y}(x, y), where A ⊂ S.
Example. Let X and Y be two independent random variables having respective pmfs
f_X(x) = (1 − λ)λ^x and f_Y(y) = (1 − μ)μ^y for x, y = 0, 1, 2, … What is the pmf of Z = min(X, Y)?
Note that for z ≥ 0,
P(Z ≥ z) = P(X ≥ z, Y ≥ z) = P(X ≥ z) P(Y ≥ z) = [Σ_(x=z)^∞ (1 − λ)λ^x][Σ_(y=z)^∞ (1 − μ)μ^y] = λ^z μ^z.
Hence, for any z ≥ 0, P(Z = z) = P(Z ≥ z) − P(Z ≥ z + 1) = λ^z μ^z (1 − λμ).
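The tail-difference argument can be checked numerically for particular parameter values (a brief sketch; λ = 0.6 and μ = 0.4 are arbitrary):

```python
lam, mu = 0.6, 0.4

def tail(z):
    # P(Z >= z) = (lam * mu)^z, from summing the two geometric series.
    return lam**z * mu**z

for z in range(5):
    direct = tail(z) - tail(z + 1)                 # P(Z = z) from the tails
    formula = lam**z * mu**z * (1 - lam * mu)      # closed form from the slide
    assert abs(direct - formula) < 1e-12

print("pmf of Z = min(X, Y) verified for z = 0..4")
```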
Let X and Y be random variables (discrete or continuous) with means μ_X and μ_Y. Let the
joint support of X and Y be S. The covariance of X and Y is defined by
σ_{X,Y} = Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)].
• For the discrete case,
Cov(X, Y) = Σ_{(x,y)∈S} (x − μ_X)(y − μ_Y) f_{X,Y}(x, y)
• For the continuous case,
Cov(X, Y) = ∫∫_S (x − μ_X)(y − μ_Y) f_{X,Y}(x, y) dx dy
Let X and Y be random variables (discrete or continuous) with standard deviations σ_X and
σ_Y. The correlation coefficient of X and Y is defined by
ρ_{X,Y} = Corr(X, Y) = Cov(X, Y)/(σ_X σ_Y) = σ_{X,Y}/(σ_X σ_Y).
In the example, σ_X² = 1² × 0.5 + 2² × 0.5 − 1.5² = 0.25 and σ_Y² = 1² × 0.25 + 2² ×
0.5 + 3² × 0.25 − 2² = 0.5. The correlation coefficient of X and Y is
ρ_{X,Y} = σ_{X,Y}/(σ_X σ_Y) = 0.25/√(0.25 × 0.5) = 0.71.
Interpretation of correlation.
• −1 ≤ ρ_{X,Y} ≤ 1 (this follows since Var(X − tY) ≥ 0 for every t ∈ ℝ)
• If ρ_{X,Y} = 1, then X and Y are perfectly, positively, linearly correlated.
• If ρ_{X,Y} = −1, then X and Y are perfectly, negatively, linearly correlated.
• If ρ_{X,Y} = 0, then X and Y are not linearly correlated at all. That is, X and Y may be
perfectly correlated in some other manner, in a parabolic manner, perhaps, but not in a
linear manner.
• If ρ_{X,Y} > 0, then X and Y are positively, linearly correlated, but not perfectly so.
• If ρ_{X,Y} < 0, then X and Y are negatively, linearly correlated, but not perfectly so.
In our example, we can conclude that X and Y are positively, linearly correlated, but not
perfectly so.
If X and Y are independent, then
Cov(X, Y) = Σ_{x∈S₁} Σ_{y∈S₂} (x − μ_X)(y − μ_Y) f_{X,Y}(x, y) = Σ_{x∈S₁} Σ_{y∈S₂} (x − μ_X)(y − μ_Y) f_X(x) f_Y(y) = 0.
Counterexample
Note that X and Y are dependent. But we have XY = 0 and E(X) = 0. It follows that Cov(X, Y) =
E(XY) − E(X) E(Y) = 0.
Example. Let Z be a standard normal random variable. Z and Z² are dependent, but Cov(Z, Z²) = E(Z³) = 0.
The correlation coefficient
Example. A quality control inspector for a t-shirt manufacturer inspects t-shirts for defects.
She labels each t-shirt she inspects as either good, a second, or defective. The quality
control inspector inspects n = 2 t-shirts. Let X be the number of good t-shirts and Y be
the number of second t-shirts. Assume that the probability that a t-shirt is good is 0.6 and
that a t-shirt is a second is 0.2. Are X and Y independent? What is the correlation between
X and Y?
The joint pmf of X and Y is trinomial: f_{X,Y}(x, y) = [2!/(x! y! (2 − x − y)!)] 0.6^x 0.2^y 0.2^(2−x−y), 0 ≤ x + y ≤ 2.
Since the joint pmf cannot be factorized, X and Y are not independent. Since the marginal
pmf of X is Binomial with parameters n = 2 and p = 0.6, μ_X = 2 × 0.6 = 1.2 and σ_X² =
2 × 0.6 × 0.4 = 0.48. Similarly, the marginal pmf of Y is Binomial with parameters n = 2
and p = 0.2, so μ_Y = 2 × 0.2 = 0.4 and σ_Y² = 2 × 0.2 × 0.8 = 0.32. The expectation of XY is
μ_{XY} = Σ_x Σ_y xy f_{X,Y}(x, y) = f_{X,Y}(x = 1, y = 1) = 0.24. We have
Cov(X, Y) = 0.24 − 1.2 × 0.4 = −0.24 and Corr(X, Y) = Cov(X, Y)/(σ_X σ_Y) = −0.61.
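The trinomial computations in this example can be reproduced by summing over the full support (a small sketch; the variable names are arbitrary):

```python
from math import factorial

pG, pS = 0.6, 0.2  # P(good), P(second); P(defective) = 0.2
n = 2

def f(x, y):
    # Trinomial joint pmf for x good and y second t-shirts out of n = 2.
    return (factorial(n) / (factorial(x) * factorial(y) * factorial(n - x - y))
            * pG**x * pS**y * (1 - pG - pS)**(n - x - y))

support = [(x, y) for x in range(n + 1) for y in range(n + 1 - x)]
mu_X = sum(x * f(x, y) for x, y in support)
mu_Y = sum(y * f(x, y) for x, y in support)
e_XY = sum(x * y * f(x, y) for x, y in support)
cov = e_XY - mu_X * mu_Y

var_X = n * pG * (1 - pG)  # marginal of X is Binomial(2, 0.6)
var_Y = n * pS * (1 - pS)  # marginal of Y is Binomial(2, 0.2)
corr = cov / (var_X * var_Y)**0.5
print(round(cov, 2), round(corr, 2))
```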
Conditional pmf
The conditional pmf of X, given that Y = y, is defined by
g_{X|Y}(x|y) = f_{X,Y}(x, y)/f_Y(y),  f_Y(y) > 0.
The conditional pmf of Y, given that X = x, is defined by
h_{Y|X}(y|x) = f_{X,Y}(x, y)/f_X(x),  f_X(x) > 0.
Note that
Σ_{x∈S₁} g_{X|Y}(x|y) = 1 and Σ_{y∈S₂} h_{Y|X}(y|x) = 1
E(X) = 2 ∫₀¹ x² dx = 2[x³/3]₀¹ = 2/3 and E(Y) = 2/3.
Example. Let X and Y have the pdf f_{X,Y}(x, y) = x + y for 0 < x < 1 and 0 < y < 1. By symmetry,
E(X) = ∫₀¹ x(x + 1/2) dx = [x³/3 + x²/4]₀¹ = 7/12 and E(Y) = 7/12.
Two Continuous Random Variables
E[((X − μ_X)/σ_X)((Y − μ_Y)/σ_Y)] = ∫_(−∞)^∞ ∫_(−∞)^∞ xy [1/(2π√(1 − ρ²))] e^(−(x² − 2ρxy + y²)/(2(1−ρ²))) dx dy
= [1/(2π√(1 − ρ²))] ∫_(−∞)^∞ ∫_(−∞)^∞ xy e^(−[(y − ρx)² + x²(1−ρ²)]/(2(1−ρ²))) dy dx
= [1/(2π√(1 − ρ²))] ∫_(−∞)^∞ x e^(−x²/2) ∫_(−∞)^∞ y e^(−(y − ρx)²/(2(1−ρ²))) dy dx
= [1/(2π√(1 − ρ²))] ∫_(−∞)^∞ x e^(−x²/2) ρx √(2π(1 − ρ²)) dx = [ρ/√(2π)] ∫_(−∞)^∞ x² e^(−x²/2) dx = ρ,
since ∫_(−∞)^∞ x² e^(−x²) dx = √π/2. It follows that Cov(X, Y) = ρσ_X σ_Y.
If 𝑋𝑋 and 𝑌𝑌 have a bivariate normal distribution with correlation coefficient 𝜌𝜌, then 𝑋𝑋 and 𝑌𝑌 are
independent if and only if 𝜌𝜌 = 0.
Proof. Note that
f_{X,Y}(x, y) = [1/(2πσ_X σ_Y √(1 − ρ²))] exp{−[1/(2(1 − ρ²))] [((y − μ_Y)/σ_Y − ρ(x − μ_X)/σ_X)² + (1 − ρ²)((x − μ_X)/σ_X)²]}.
The distribution function technique to find the pdf of X uses the following steps:
• Find the cdf F_X(x) = P(X ≤ x)
• The pdf is f_X(x) = (d/dx) F_X(x)
Example. Let X be a random variable with pdf f_X(x) = 3(1 − x)² for 0 < x < 1. What is the pdf of
Y = (1 − X)³?
Answer. The cdf of Y is given by
F_Y(y) = P(Y ≤ y) = P(1 − X ≤ y^(1/3)) = P(X ≥ 1 − y^(1/3)) = ∫_(1−y^(1/3))^1 3(1 − x)² dx
= [−(1 − x)³]_(1−y^(1/3))^1 = y,  0 < y < 1.
Hence the pdf of Y is f_Y(y) = 1, 0 < y < 1. It follows that Y is U(0, 1) distributed.
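The conclusion that Y is U(0, 1) can be checked by simulation: sample X from its pdf by inverting the cdf F_X(x) = 1 − (1 − x)³, then examine Y = (1 − X)³ (a sketch, not part of the original slides; the seed, sample size, and the 0.3 check point are arbitrary):

```python
import random

random.seed(7)

# Sample X with pdf 3(1-x)^2 via inverse-cdf: F_X(x) = 1 - (1-x)^3.
def sample_x():
    u = random.random()
    return 1 - (1 - u) ** (1 / 3)

ys = [(1 - sample_x()) ** 3 for _ in range(100_000)]

# If Y is U(0,1), its mean is 1/2 and P(Y <= 0.3) is 0.3.
mean_y = sum(ys) / len(ys)
frac_below = sum(y <= 0.3 for y in ys) / len(ys)
print(round(mean_y, 3), round(frac_below, 3))
```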
Let X be a random variable with pdf f_X(x) defined on the support a₁ < x < a₂. Suppose
that Y = u(X) is an invertible function of X on the support, that is, X = u⁻¹(Y) for
u(a₁) < y < u(a₂). Then, the pdf of Y is given by
f_Y(y) = f_X(u⁻¹(y)) |du⁻¹(y)/dy|,
for u(a₁) < y < u(a₂).
Example. Let the pdf of X be f_X(x) = 3x² for 0 < x < 1. Find the pdf of Y = X².
Note that y = x² is a monotone function on (0, 1) with inverse x = √y defined on (0, 1).
Since dx/dy = 1/(2√y), f_Y(y) = 3(√y)² × 1/(2√y) = (3/2)√y, 0 < y < 1.
Example. Note that Y₁ and Y₂ are independent. The marginal pdf of Y₂ is given by
f_{Y₂}(y₂) = [1/(Γ(α + β) θ^(α+β))] y₂^(α+β−1) e^(−y₂/θ),  y₂ > 0.
The marginal pdf of Y₁ is given by
f_{Y₁}(y₁) = [Γ(α + β)/(Γ(α)Γ(β))] y₁^(α−1) (1 − y₁)^(β−1),  0 < y₁ < 1.
It follows that Y₁ has a Beta pdf with parameters α and β and Y₂ has a Gamma pdf with
parameters α + β and θ.
Let 𝑋𝑋1 , 𝑋𝑋2 be a random sample of size 2 and 𝑌𝑌 = 𝑋𝑋1 + 𝑋𝑋2 . We have
𝐸𝐸 𝑌𝑌 = 𝐸𝐸 𝑋𝑋1 + 𝐸𝐸 𝑋𝑋2 .
Proof.
E(Y) = ∫_(−∞)^∞ ∫_(−∞)^∞ (x₁ + x₂) f_{X₁,X₂}(x₁, x₂) dx₁ dx₂
= ∫_(−∞)^∞ ∫_(−∞)^∞ (x₁ + x₂) f_{X₁}(x₁) f_{X₂}(x₂) dx₁ dx₂
= ∫_(−∞)^∞ x₁ f_{X₁}(x₁) [∫_(−∞)^∞ f_{X₂}(x₂) dx₂] dx₁ + ∫_(−∞)^∞ x₂ f_{X₂}(x₂) [∫_(−∞)^∞ f_{X₁}(x₁) dx₁] dx₂
= E(X₁) + E(X₂).
Let 𝑋𝑋1 , 𝑋𝑋2 be a random sample of size 2 and 𝑌𝑌 = 𝑋𝑋1 + 𝑋𝑋2 . We have
𝑉𝑉𝑐𝑐𝑣𝑣 𝑌𝑌 = 𝑉𝑉𝑐𝑐𝑣𝑣 𝑋𝑋1 + 𝑉𝑉𝑐𝑐𝑣𝑣 𝑋𝑋2 .
Proof.
E(X₁X₂) = ∫∫ x₁x₂ f_{X₁,X₂}(x₁, x₂) dx₁ dx₂ = ∫∫ x₁x₂ f_{X₁}(x₁) f_{X₂}(x₂) dx₁ dx₂
= [∫_(−∞)^∞ x₁ f_{X₁}(x₁) dx₁][∫_(−∞)^∞ x₂ f_{X₂}(x₂) dx₂] = E(X₁) E(X₂).
Var(Y) = E[(X₁ − E(X₁)) + (X₂ − E(X₂))]²
= Var(X₁) + Var(X₂) + 2E[(X₁ − E(X₁))(X₂ − E(X₂))]
= Var(X₁) + Var(X₂) + 2[E(X₁X₂) − E(X₁)E(X₂)] = Var(X₁) + Var(X₂).
Let X₁, X₂, …, X_n be independent random variables. Suppose that their joint pdf is
f₁(x₁) f₂(x₂) ⋯ f_n(x_n).
The expected value of Y = u(X₁, X₂, …, X_n) is given by
E(Y) = ∫_(−∞)^∞ ∫_(−∞)^∞ ⋯ ∫_(−∞)^∞ u(x₁, x₂, …, x_n) f₁(x₁) f₂(x₂) ⋯ f_n(x_n) dx₁ dx₂ ⋯ dx_n,
provided that the integral exists. In particular, we have
E[u₁(X₁) u₂(X₂) ⋯ u_n(X_n)] = ∏ⱼ₌₁ⁿ ∫_(−∞)^∞ uⱼ(xⱼ) fⱼ(xⱼ) dxⱼ = E[u₁(X₁)] E[u₂(X₂)] ⋯ E[u_n(X_n)].
Suppose that 𝑋𝑋1 , 𝑋𝑋2 , … , 𝑋𝑋𝑛𝑛 are independent random variables with means 𝜇𝜇1 , 𝜇𝜇2 , … , 𝜇𝜇𝑛𝑛
and variances 𝜎𝜎12 , 𝜎𝜎22 , … , 𝜎𝜎𝑛𝑛2 . Then, the mean and variance of 𝑌𝑌 = 𝑐𝑐1 𝑋𝑋1 + 𝑐𝑐2 𝑋𝑋2 + ⋯ +
𝑐𝑐𝑛𝑛 𝑋𝑋𝑛𝑛 are 𝜇𝜇𝑌𝑌 = 𝑐𝑐1 𝜇𝜇1 + 𝑐𝑐2 𝜇𝜇2 + ⋯ + 𝑐𝑐𝑛𝑛 𝜇𝜇𝑛𝑛 and 𝜎𝜎𝑌𝑌2 = 𝑐𝑐12 𝜎𝜎12 + 𝑐𝑐22 𝜎𝜎22 + ⋯ + 𝑐𝑐𝑛𝑛2 𝜎𝜎𝑛𝑛2 .
Proof.
𝐸𝐸 𝑌𝑌 = 𝐸𝐸 𝑐𝑐1 𝑋𝑋1 + 𝐸𝐸 𝑐𝑐2 𝑋𝑋2 + ⋯ + 𝐸𝐸 𝑐𝑐𝑛𝑛 𝑋𝑋𝑛𝑛 = 𝑐𝑐1 𝜇𝜇1 + 𝑐𝑐2 𝜇𝜇2 + ⋯ + 𝑐𝑐𝑛𝑛 𝜇𝜇𝑛𝑛
𝑉𝑉𝑐𝑐𝑣𝑣 𝑌𝑌 = 𝑉𝑉𝑐𝑐𝑣𝑣 𝑐𝑐1 𝑋𝑋1 + 𝑉𝑉𝑐𝑐𝑣𝑣 𝑐𝑐2 𝑋𝑋2 + ⋯ + 𝑉𝑉𝑐𝑐𝑣𝑣 𝑐𝑐𝑛𝑛 𝑋𝑋𝑛𝑛 = 𝑐𝑐12 𝜎𝜎12 + 𝑐𝑐22 𝜎𝜎22 + ⋯ + 𝑐𝑐𝑛𝑛2 𝜎𝜎𝑛𝑛2
In general, let σ_{i,j} = Cov(Xᵢ, Xⱼ). We have
Var(Y) = Σᵢ₌₁ⁿ cᵢ²σᵢ² + 2 Σᵢ₌₁ⁿ⁻¹ Σⱼ₌ᵢ₊₁ⁿ cᵢcⱼσ_{i,j}.
Let 𝑋𝑋1 , 𝑋𝑋2 , … , 𝑋𝑋𝑛𝑛 be a random sample of size 𝑛𝑛 from a distribution with mean 𝜇𝜇 and
variance 𝜎𝜎 2 . The mean and variance of the sample mean 𝑋𝑋� are
E(X̄) = E(X₁/n + X₂/n + ⋯ + X_n/n) = μ/n + μ/n + ⋯ + μ/n = μ
Var(X̄) = Var(X₁/n + X₂/n + ⋯ + X_n/n) = σ²/n² + σ²/n² + ⋯ + σ²/n² = (σ²/n²) × n = σ²/n
Let 𝑋𝑋1 be the number of heads in two tosses of a fair coin and 𝑋𝑋2 be the number of heads
in three tosses of a fair coin. We let 𝑌𝑌 = 𝑋𝑋1 + 𝑋𝑋2 .
We understand that
• X₁ has a binomial distribution with n = 2 and p = 0.5. Its mgf is M_{X₁}(t) = (1/2 + (1/2)e^t)².
• X₂ has a binomial distribution with n = 3 and p = 0.5. Its mgf is M_{X₂}(t) = (1/2 + (1/2)e^t)³.
Suppose that X₁, X₂, …, X_n are independent random variables with mgfs M_{Xᵢ}(t) =
E(e^(tXᵢ)), i = 1, 2, …, n, respectively. The mgf of Y = Σⱼ₌₁ⁿ aⱼXⱼ is given by
M_Y(t) = E(e^(tY)) = E(e^(ta₁X₁) e^(ta₂X₂) ⋯ e^(ta_nX_n)) = ∏ᵢ₌₁ⁿ E(e^(aᵢtXᵢ)) = ∏ᵢ₌₁ⁿ M_{Xᵢ}(aᵢt).
If X₁, X₂, …, X_n are a random sample from a population with mgf M(t), then
• The mgf of Y = Σⱼ₌₁ⁿ Xⱼ is ∏ⱼ₌₁ⁿ M(t) = [M(t)]ⁿ.
• The mgf of X̄ is [M(t/n)]ⁿ.
Let 𝑋𝑋1 , 𝑋𝑋2 and 𝑋𝑋3 be a random sample of size 3 from a gamma distribution with 𝛼𝛼 = 7 and
𝜃𝜃 = 5. Let 𝑌𝑌 = 𝑋𝑋1 + 𝑋𝑋2 + 𝑋𝑋3 .
What is the distribution of Y?
The mgf of the gamma random variable is M(t) = (1 − 5t)^(−7) for t < 1/5. Thus, the mgf of Y
is M_Y(t) = [M(t)]³ = (1 − 5t)^(−21) for t < 1/5. It follows that Y has a gamma distribution
with α = 21 and θ = 5.
What is the distribution of X̄?
The mgf of X̄ is M_{X̄}(t) = [M(t/3)]³ = (1 − (5/3)t)^(−21) for t < 3/5. Hence, X̄ has a gamma
distribution with α = 21 and θ = 5/3.
For i = 1, 2, …, n, Xᵢ has a Chi-square distribution with rᵢ degrees of freedom. Assume that
X₁, X₂, …, X_n are independent. Then, Y = Σⱼ₌₁ⁿ Xⱼ has a Chi-square distribution with
Σⱼ₌₁ⁿ rⱼ degrees of freedom.
• Answer. The mgf of a Chi-square random variable with r degrees of freedom is M(t) =
(1 − 2t)^(−r/2) for t < 1/2. Hence, the mgf of Y is ∏ⱼ₌₁ⁿ (1 − 2t)^(−rⱼ/2) = (1 − 2t)^(−(1/2) Σⱼ₌₁ⁿ rⱼ)
for t < 1/2.
• In particular, let Z₁, Z₂, …, Z_n be independent standard normal variables. W = Σⱼ₌₁ⁿ Zⱼ²
has a Chi-square distribution with n degrees of freedom.
For 𝑖𝑖 = 1,2, … , 𝑛𝑛, 𝑋𝑋𝑖𝑖 has a normal distribution with mean 𝜇𝜇𝑖𝑖 and variance 𝜎𝜎𝑖𝑖2 . Assume that
𝑋𝑋1 , 𝑋𝑋2 , … , 𝑋𝑋𝑛𝑛 are independent. Then,
W = Σⱼ₌₁ⁿ [(Xⱼ − μⱼ)/σⱼ]²
has a Chi-square distribution with 𝑛𝑛 degrees of freedom.
For 𝑖𝑖 = 1,2, … , 𝑛𝑛, 𝑋𝑋𝑖𝑖 has a normal distribution with mean 𝜇𝜇𝑖𝑖 and variance 𝜎𝜎𝑖𝑖2 . Assume that
𝑋𝑋1 , 𝑋𝑋2 , … , 𝑋𝑋𝑛𝑛 are independent. Then,
Y = Σⱼ₌₁ⁿ cⱼXⱼ
has a normal distribution with mean Σⱼ₌₁ⁿ cⱼμⱼ and variance Σⱼ₌₁ⁿ cⱼ²σⱼ².
Answer. The mgf of a normal random variable with mean μ and variance σ² is M(t) = e^(μt + σ²t²/2).
The mgf of Y is
M_Y(t) = ∏ⱼ₌₁ⁿ M_{Xⱼ}(cⱼt) = exp(Σⱼ₌₁ⁿ cⱼμⱼ t + (1/2) Σⱼ₌₁ⁿ cⱼ²σⱼ² t²).
Example. Let 𝑋𝑋𝑖𝑖 denote the weight of a randomly selected prepackaged one-pound bag of carrots.
Past records suggest that 𝑋𝑋𝑖𝑖 is normally distributed with mean of 1.18 pounds and a standard
deviation of 0.07 pound. Now, let 𝑊𝑊 denote the weight of a randomly selected prepackaged three-
pound bag of carrots. It is known that 𝑊𝑊 is normally distributed with mean 3.22 pounds and
standard deviation 0.09 pound. Selecting bags at random, what is the probability that the sum of
three one-pound bags exceeds the weight of one three-pound bag?
Because bags are selected at random, we assume that 𝑋𝑋1 , 𝑋𝑋2 , 𝑋𝑋3 and 𝑊𝑊 are independent. Let 𝑌𝑌 =
𝑋𝑋1 + 𝑋𝑋2 + 𝑋𝑋3 be the sum of the weights of three one-pound bags. Then, 𝑌𝑌 is normally distributed
with mean 1.18 + 1.18 + 1.18 = 3.54 and variance 0.07² + 0.07² + 0.07² = 0.0147. Since Y and
W are independent, Y − W is normally distributed with mean 3.54 − 3.22 = 0.32 and variance
0.0147 + 0.09² = 0.0228.
P(Y > W) = P(Y − W > 0) = P(Z > (0 − 0.32)/√0.0228) = P(Z > −2.12) = P(Z < 2.12) = 0.9830
Example. GPAs have been recorded for a random sample of 16 from the entering freshman
class at a major university. It can be assumed that the distribution of GPA values is
approximately normal. The sample yielded a mean, x̄ = 3.1, and standard deviation, s =
0.8. The nationwide mean GPA of entering freshmen is μ = 2.7. What is the probability of
getting an X̄ which is greater than or equal to x̄ = 3.1 if the mean GPA of this university is
the same as the nationwide population of students?
Answer.
P(X̄ ≥ x̄) = P(T ≥ (x̄ − 2.7)/(s/√16)) = P(T ≥ (3.1 − 2.7)/(0.8/√16)) = P(T ≥ 2.0) ≅ 0.032,
where T ~ t₁₅.
Chebyshev's inequality:
P(|X − μ| > ε) ≤ σ²/ε².
Proof. WLOG assume that μ = 0.
σ² = E(X²) = ∫_(−∞)^∞ x² f_X(x) dx ≥ ∫_(−∞)^(−ε) x² f_X(x) dx + ∫_ε^∞ x² f_X(x) dx
≥ ∫_(−∞)^(−ε) ε² f_X(x) dx + ∫_ε^∞ ε² f_X(x) dx = ε² P(|X| > ε).
Example. A large drug company has 100 potential new prescription drugs under clinical test.
About 20% of all drugs that reach this stage are eventually licensed for sale. What is the
probability that at least 15 of the 100 drugs are eventually licensed? Assume that the
binomial assumptions are satisfied, and use a normal approximation with continuity correction.
Answer.
Let 𝑋𝑋 be the number of prescription drugs that are eventually licensed for sale. Then 𝑋𝑋 has a
binomial distribution with 𝑛𝑛 = 100 and 𝑝𝑝 = 0.2.
P(X ≥ 15) = P(Z ≥ (14.5 − 100 × 0.2)/√(100 × 0.2 × 0.8)) = P(Z ≥ −1.38) = P(Z ≤ 1.38) = 0.9162.
Note that the continuity correction refers to the approximation 𝑃𝑃 𝑋𝑋 ≥ 15 = 𝑃𝑃 𝑋𝑋 ≥ 14.5 .
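The quality of the continuity-corrected approximation can be gauged by comparing it with the exact binomial tail (a sketch, not part of the original slides):

```python
from math import comb, erf, sqrt

n, p = 100, 0.2

# Exact binomial tail P(X >= 15).
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(15, n + 1))

# Normal approximation with continuity correction: P(X >= 14.5).
mu, sigma = n * p, sqrt(n * p * (1 - p))
approx = 0.5 * (1 + erf((mu - 14.5) / (sigma * sqrt(2))))

print(round(exact, 4), round(approx, 4))
```

The two values agree to within a few thousandths, which is why the correction is the standard recommendation when approximating a discrete distribution by a continuous one.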
Example. The annual number of earthquakes registering at least 2.5 on the Richter Scale and
having an epicenter within 40 miles of downtown Memphis follows a Poisson distribution
with mean 6.5. What is the probability that at least 9 such earthquakes will strike next year?
Answer. Let 𝑋𝑋 be the number of earthquakes. We answer the question in the following ways:
• Using the Poisson distribution with mean 6.5, 𝑃𝑃 𝑋𝑋 ≥ 9 = 1 − 𝑃𝑃 𝑋𝑋 ≤ 8 = 1 − 0.792 =
0.208.
• Using the normal approximation, the distribution of 𝑋𝑋 is approximated by a normal
distribution with mean 6.5 and variance 6.5.
P(X ≥ 9) = P(Y > 8.5) = P(Z > (8.5 − 6.5)/√6.5) = P(Z > 0.78) = 0.218.