
EE319 Probability & Random Processes (Spring 2020)

Lecture 2020-04-24

Hypergeometric PGF

Prof. Dr.-Ing. Mukhtar Ullah


Head of Electrical Engineering
FAST NUCES, Islamabad
Hyper-geometric PGF
On the combined space of Bernoulli trials, define RVs
𝑋𝑖 ↔ ‘success indicator for trial 𝑖’
𝑆𝑚 ↔ ‘number of successes in a sequence of 𝑚 trials’
𝑆𝑟 ↔ ‘number of successes in a shorter sequence of 𝑟 trials’
which are distributed as
$$X_i \sim \operatorname{Ber}_p, \qquad S_m = \sum_{i=1}^{m} X_i \sim \operatorname{Bin}_{m,p}, \qquad S_r = \sum_{i=1}^{r} X_i \sim \operatorname{Bin}_{r,p}$$
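These distributional facts are easy to sanity-check numerically. Below is a minimal sketch (not from the lecture; it assumes NumPy and SciPy are available, with arbitrary illustrative values of 𝑚 and 𝑝) that simulates the 𝑚 Bernoulli trials many times and compares the empirical distribution of 𝑆𝑚 with Bin𝑚,𝑝.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
m, p, reps = 10, 0.3, 200_000          # assumed illustrative values

X = rng.random((reps, m)) < p          # each row = one run of m Bernoulli trials
S_m = X.sum(axis=1)                    # number of successes per run

empirical = np.bincount(S_m, minlength=m + 1) / reps
theoretical = stats.binom.pmf(np.arange(m + 1), m, p)
print(np.max(np.abs(empirical - theoretical)))   # small, shrinks as reps grows
```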
Define another RV
𝑆𝑟 |𝑆𝑚=𝑛 ↔ ‘number of successes in the 𝑟 trials, given 𝑛 successes in 𝑚 trials’
which, in mathematical parlance, is the restriction of 𝑆𝑟 to the event 𝑆𝑚 = 𝑛. You can think of this restricted sum as the number of successes in 𝑛 random draws, without replacement, from a set of 𝑚 objects, 𝑟 of which are favorable. Recall what is meant by random here – each object (remaining in the set) has the same probability of being picked. It is instructive to picture the sample space of 𝑚 objects as
$$\Omega = \{\omega_1, \dots, \omega_m\} = A \cup A^c$$
where elements of the subset A are favorable (considered a success). That draw-1 is random translates to equal elementary probabilities
$$P(\{\omega_i\}) = \frac{1}{m}, \qquad i \in \{1, \dots, m\}$$
and consequently
$$P(A) = \frac{r}{m}, \qquad P(A^c) = \frac{m-r}{m}$$
We could then express the number of successes in 𝑛 draws as a sum of dependent RVs:
$$S_r^{n,m} = S_r \mid_{S_m = n} = \sum_{i=1}^{n} K_i$$
where
𝐾𝑖 ↔ ‘success indicator for draw 𝑖’
Draw 1 is modeled by the distribution
$$P(K_1 = 0) = P(A^c) = \frac{m-r}{m}, \qquad P(K_1 = 1) = P(A) = \frac{r}{m}$$
Dependency of draw 2 on draw 1 is modeled by the conditional distribution $P(K_2 = k_2 \mid K_1 = k_1)$:
$$
\begin{array}{c|cc}
k_2 \backslash k_1 & 0 & 1 \\ \hline
0 & \dfrac{m-r-1}{m-1} & \dfrac{m-r}{m-1} \\[4pt]
1 & \dfrac{r}{m-1} & \dfrac{r-1}{m-1}
\end{array}
$$
Draws 1-2 are modeled by the joint distribution $P(K_2 = k_2, K_1 = k_1)$:
$$
\begin{array}{c|cc}
k_2 \backslash k_1 & 0 & 1 \\ \hline
0 & \dfrac{(m-r)_2}{(m)_2} & \dfrac{r\,(m-r)}{(m)_2} \\[4pt]
1 & \dfrac{(m-r)\,r}{(m)_2} & \dfrac{(r)_2}{(m)_2}
\end{array}
$$
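This two-draw table can be checked by simulation. The following minimal sketch (an illustration added here, not part of the lecture; it assumes NumPy, and the values 𝑚 = 10, 𝑟 = 4 are arbitrary) estimates the joint distribution of (𝐾1, 𝐾2) from random draws without replacement.

```python
import numpy as np

rng = np.random.default_rng(1)
m, r, reps = 10, 4, 100_000            # assumed illustrative values

# Objects are labeled 0..m-1; labels 0..r-1 count as favorable (success).
draws = np.array([rng.choice(m, size=2, replace=False) for _ in range(reps)])
K = (draws < r).astype(int)            # columns: K1, K2

for k1 in (0, 1):
    for k2 in (0, 1):
        est = np.mean((K[:, 0] == k1) & (K[:, 1] == k2))
        print((k1, k2), round(est, 4))
# e.g. P(K1=1, K2=1) = (r)_2/(m)_2 = 4*3/(10*9) ≈ 0.1333
```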
[Figure: probability tree for the first three draws, from Start to the leaves 000 through 111, where after 𝑖 draws containing 𝑗 successes the next branch has probability (𝑟 − 𝑗)/(𝑚 − 𝑖) for a success and (𝑚 − 𝑟 − 𝑗)/(𝑚 − 𝑖) for a failure.]
Summing over 𝑘1 gives the marginal distribution
$$P(K_2 = 0) = \frac{m-r}{m}, \qquad P(K_2 = 1) = \frac{r}{m}$$
which is identical to that of 𝐾1 despite their dependence.
Dependency of draw 3 on draws 1-2 is modeled by the conditional distribution $P(K_3 = k_3 \mid K_2 = k_2, K_1 = k_1)$:
$$
\begin{array}{c|cc|cc}
 & \multicolumn{2}{c|}{k_3 = 0} & \multicolumn{2}{c}{k_3 = 1} \\
k_2 \backslash k_1 & 0 & 1 & 0 & 1 \\ \hline
0 & \dfrac{m-r-2}{m-2} & \dfrac{m-r-1}{m-2} & \dfrac{r}{m-2} & \dfrac{r-1}{m-2} \\[4pt]
1 & \dfrac{m-r-1}{m-2} & \dfrac{m-r}{m-2} & \dfrac{r-1}{m-2} & \dfrac{r-2}{m-2}
\end{array}
$$

Draws 1-3 are modeled by the joint distribution $P(K_3 = k_3, K_2 = k_2, K_1 = k_1)$:
$$
\begin{array}{c|cc|cc}
 & \multicolumn{2}{c|}{k_3 = 0} & \multicolumn{2}{c}{k_3 = 1} \\
k_2 \backslash k_1 & 0 & 1 & 0 & 1 \\ \hline
0 & \dfrac{(m-r)_3}{(m)_3} & \dfrac{r\,(m-r)_2}{(m)_3} & \dfrac{(m-r)_2\, r}{(m)_3} & \dfrac{(m-r)\,(r)_2}{(m)_3} \\[4pt]
1 & \dfrac{(m-r)_2\, r}{(m)_3} & \dfrac{(r)_2\,(m-r)}{(m)_3} & \dfrac{(m-r)\,(r)_2}{(m)_3} & \dfrac{(r)_3}{(m)_3}
\end{array}
$$
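Because 𝑚 is finite, the three-draw table can also be verified exactly, with no sampling: every ordered triple of distinct objects is equally likely. A minimal sketch (assumed, not from the lecture; 𝑚 = 10, 𝑟 = 4 again arbitrary):

```python
from itertools import permutations
from collections import Counter

m, r = 10, 4                                     # objects 0..r-1 are favorable
counts = Counter()
for triple in permutations(range(m), 3):         # all (m)_3 ordered triples
    counts[tuple(int(obj < r) for obj in triple)] += 1

total = m * (m - 1) * (m - 2)                    # (m)_3 equally likely triples
for pattern in sorted(counts):
    print(pattern, counts[pattern] / total)
# e.g. pattern (1, 1, 1) -> (r)_3/(m)_3 = 4*3*2/(10*9*8) = 1/30
```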
Summing over 𝑘1 gives the joint distribution $P(K_3 = k_3, K_2 = k_2)$:
$$
\begin{array}{c|cc}
k_3 \backslash k_2 & 0 & 1 \\ \hline
0 & \dfrac{(m-r)_2}{(m)_2} & \dfrac{r\,(m-r)}{(m)_2} \\[4pt]
1 & \dfrac{(m-r)\,r}{(m)_2} & \dfrac{(r)_2}{(m)_2}
\end{array}
$$
which is identical to the one for draws 1-2. Dividing by 𝑃 (𝐾2 = 𝑘2 ) gives the conditional distribution $P(K_3 = k_3 \mid K_2 = k_2)$:
$$
\begin{array}{c|cc}
k_3 \backslash k_2 & 0 & 1 \\ \hline
0 & \dfrac{m-r-1}{m-1} & \dfrac{m-r}{m-1} \\[4pt]
1 & \dfrac{r}{m-1} & \dfrac{r-1}{m-1}
\end{array}
$$
Thus, the dependency of draw-3 on draw-2 matches that of draw-2 on draw-1. Following a similar procedure leads to the same conclusion for draws 1 and 3.
Exchangeable RVs Here is the conclusion: all the 𝑛 draws are identical in marginal densities,
$$P(K_i = 0) = \frac{m-r}{m}, \qquad P(K_i = 1) = \frac{r}{m}, \qquad i \in [1\,..\,n]$$
and all draw pairs have identical joint distributions $P(K_j = k_j, K_i = k_i)$:
$$
\begin{array}{c|cc}
k_j \backslash k_i & 0 & 1 \\ \hline
0 & \dfrac{(m-r)_2}{(m)_2} & \dfrac{r\,(m-r)}{(m)_2} \\[4pt]
1 & \dfrac{(m-r)\,r}{(m)_2} & \dfrac{(r)_2}{(m)_2}
\end{array}
$$
and, consequently, identical conditional densities $P(K_j = k_j \mid K_i = k_i)$:
$$
\begin{array}{c|cc}
k_j \backslash k_i & 0 & 1 \\ \hline
0 & \dfrac{m-r-1}{m-1} & \dfrac{m-r}{m-1} \\[4pt]
1 & \dfrac{r}{m-1} & \dfrac{r-1}{m-1}
\end{array}
$$
Stated differently, the RVs 𝐾𝑖 are said to be exchangeable, or interchangeable, though they are dependent. This has some very useful implications. Without knowing the joint distribution of these 𝑛 RVs, we can work out a few expectations by exploiting the fact that the RVs are interchangeable. To start with, all the success indicators 𝐾𝑖 have the same mean
$$\operatorname{E} K_i = \frac{r}{m}, \qquad i \in [1\,..\,n]$$
For all 𝑖, 𝑗 ∈ [1 ·· 𝑛], the expectation of the (pairwise) product 𝐾𝑖 𝐾𝑗 is
$$\operatorname{E}[K_i K_j] = \frac{r}{m}\left(1 + \frac{r-m}{m-1}\,[i \neq j]\right)$$
and the covariance between pairs $(K_i, K_j)$ is
$$\operatorname{Cov}(K_i, K_j) = \operatorname{E}\big[(K_i - \operatorname{E} K_i)(K_j - \operatorname{E} K_j)\big] = \operatorname{E}[K_i K_j] - \operatorname{E} K_i \operatorname{E} K_j = \frac{r}{m}\left(1 - \frac{r}{m}\right)\left(1 - \frac{m}{m-1}\,[i \neq j]\right)$$
The sum of all the success indicators has the mean
$$\operatorname{E} \sum_{i=1}^{n} K_i = \sum_{i=1}^{n} \operatorname{E} K_i = \frac{nr}{m}$$
and variance
$$
\begin{aligned}
\operatorname{Var} \sum_{i=1}^{n} K_i &= \operatorname{E}\left(\sum_{i=1}^{n} K_i - \operatorname{E}\sum_{i=1}^{n} K_i\right)^{\!2} = \operatorname{E}\left[\sum_{i=1}^{n} (K_i - \operatorname{E} K_i)\right]^{2} \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \operatorname{E}\big[(K_i - \operatorname{E} K_i)(K_j - \operatorname{E} K_j)\big] = \sum_{i=1}^{n}\sum_{j=1}^{n} \operatorname{Cov}(K_i, K_j) \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{r}{m}\left(1 - \frac{r}{m}\right)\left(1 - \frac{m}{m-1}\,[i \neq j]\right) \\
&= \frac{r}{m}\left(1 - \frac{r}{m}\right)\left(\sum_{i=1}^{n}\sum_{j=1}^{n} 1 \;-\; \frac{m}{m-1}\sum_{i=1}^{n}\sum_{j=1}^{n} [i \neq j]\right) \\
&= \frac{r}{m}\left(1 - \frac{r}{m}\right)\left(n^2 - \frac{m}{m-1}\,(n)_2\right) = \frac{nr}{m}\left(1 - \frac{r}{m}\right)\frac{m-n}{m-1}
\end{aligned}
$$
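All of these moment formulas can be confirmed by exact enumeration of the (𝑚)𝑛 equally likely ordered sequences of draws. A small sketch (assumed, not from the lecture; the values 𝑚 = 8, 𝑟 = 3, 𝑛 = 4 are chosen small enough to enumerate):

```python
from itertools import permutations
import numpy as np

m, r, n = 8, 3, 4                                 # assumed illustrative values
# Every ordered sequence of n distinct objects is equally likely; objects
# 0..r-1 are the favorable ones.
seqs = np.array([[int(o < r) for o in p] for p in permutations(range(m), n)])

print(seqs.mean(axis=0))                          # each entry = r/m = 0.375
print((seqs[:, 0] * seqs[:, 1]).mean())           # (r)_2/(m)_2 = 6/56
print(seqs.sum(axis=1).var())                     # exact variance of the sum
print(n*r/m * (1 - r/m) * (m - n) / (m - 1))      # closed form, matches above
```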
Capitalizing on what we have learned so far, consider a particular sequence of 𝑛 draws with 𝑘 successes and 𝑛 − 𝑘 failures. Regardless of how the successes (and failures) are positioned, this sequence is probabilistically no different than the sequence with 𝑘 consecutive successes followed by 𝑛 − 𝑘 consecutive failures. Assume that 𝑘 ∈ [1 ·· 𝑛]. The probability of a streak of 𝑘 successes in the first 𝑘 draws is
$$P\left(\bigcap_{i=1}^{k} \{K_i = 1\}\right) = P(K_1 = 1) \prod_{i=2}^{k} P\left(K_i = 1 \,\Big|\, \bigcap_{j=1}^{i-1} \{K_j = 1\}\right) = \frac{(r)_k}{(m)_k}$$
On the other hand, writing $A_k = \bigcap_{i=1}^{k} \{K_i = 1\}$ for that streak, the conditional probability of a streak of 𝑛 − 𝑘 failures in the last 𝑛 − 𝑘 draws is
$$P\left(\bigcap_{i=k+1}^{n} \{K_i = 0\} \,\Big|\, A_k\right) = \prod_{i=k+1}^{n} P\left(K_i = 0 \,\Big|\, A_k \cap \bigcap_{j=k+1}^{i-1} \{K_j = 0\}\right) = \frac{(m-r)_{n-k}}{(m-k)_{n-k}}$$
Multiplying the two gives the probability of the sequence with 𝑘 consecutive successes followed by 𝑛 − 𝑘 consecutive failures
$$P\left(\bigcap_{i=1}^{k} \{K_i = 1\} \cap \bigcap_{i=k+1}^{n} \{K_i = 0\}\right) = \frac{(r)_k}{(m)_k} \cdot \frac{(m-r)_{n-k}}{(m-k)_{n-k}} = \frac{(r)_k\,(m-r)_{n-k}}{(m)_n}$$
There are $\binom{n}{k}$ different sequences, each with 𝑘 successes and 𝑛 − 𝑘 failures. Though different in how the successes are positioned, all these sequences have the same probability, given by the above expression. Summing all these probabilities gives the probability of 𝑘 successes in 𝑛 draws
$$P\left(\sum_{i=1}^{n} K_i = k\right) = \binom{n}{k} \frac{(r)_k\,(m-r)_{n-k}}{(m)_n} = \binom{m}{n}^{-1} \binom{r}{k} \binom{m-r}{n-k}$$
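As a numerical cross-check (a sketch added here, not part of the lecture), the counting formula agrees with SciPy's built-in hypergeometric PMF. Note that SciPy's parameter order differs from our notation: it takes the population size first, then the number of favorable objects, then the number of draws.

```python
from scipy import stats
from scipy.special import comb

def falling(x, k):
    """Falling factorial (x)_k = x (x-1) ... (x-k+1)."""
    out = 1
    for i in range(k):
        out *= x - i
    return out

m, r, n = 20, 7, 5                               # assumed illustrative values
for k in range(n + 1):
    counting = (comb(n, k, exact=True)
                * falling(r, k) * falling(m - r, n - k) / falling(m, n))
    # SciPy's parameterization: hypergeom.pmf(k, M=m, n=r, N=n)
    print(k, counting, stats.hypergeom.pmf(k, m, r, n))   # columns agree
```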
Following the alternative route of conditioning successes in Bernoulli trials, we arrive at the same hyper-geometric distribution
$$P\big(S_r^{n,m} = k\big) = P(S_r = k \mid S_m = n) = \frac{P(S_r = k)\, P(S_m = n \mid S_r = k)}{P(S_m = n)} = \frac{\operatorname{Bin}(k; r, p)\, \operatorname{Bin}(n-k; m-r, p)}{\operatorname{Bin}(n; m, p)} = \binom{m}{n}^{-1} \binom{r}{k} \binom{m-r}{n-k}$$
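The fact that this ratio of binomial probabilities is free of 𝑝 is worth seeing numerically. A minimal sketch (assumed values for 𝑚, 𝑟, 𝑛, 𝑘; SciPy assumed available):

```python
from scipy import stats

m, r, n, k = 20, 7, 5, 2                     # assumed illustrative values
for p in (0.1, 0.5, 0.9):
    ratio = (stats.binom.pmf(k, r, p)
             * stats.binom.pmf(n - k, m - r, p)
             / stats.binom.pmf(n, m, p))
    print(p, ratio)                          # identical value for every p
print(stats.hypergeom.pmf(k, m, r, n))       # equals that common value
```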
written as $S_r^{n,m} \sim \operatorname{HG}_{n,r,m}$ with support
$$\mathcal{S} = [\max\{0, n-m+r\} \,..\, \min\{n, r\}],$$
the PDF, restricted to the support,
$$f_{S_r^{n,m}}\big|_{\mathcal{S}}(k) = \operatorname{HG}(k; n, r, m) = \binom{m}{n}^{-1} \binom{r}{k} \binom{m-r}{n-k},$$
and the PGF
$$G_{S_r^{n,m}}(z) = \sum_{k=0}^{n} \binom{m}{n}^{-1} \binom{r}{k} \binom{m-r}{n-k}\, z^k$$
Notice the simpler limits in the sum, allowed because generalized binomial coefficients vanish outside the support.
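That remark can be seen directly in code: scipy.special.comb returns 0 outside the classical range, so summing 𝑘 = 0, …, 𝑛 automatically respects the support. A sketch with assumed values:

```python
import numpy as np
from scipy.special import comb

m, r, n = 12, 4, 6                     # support [max(0, n-m+r) .. min(n, r)] = [0..4]
k = np.arange(n + 1)
coeff = comb(r, k) * comb(m - r, n - k) / comb(m, n)   # comb() is 0 for k > r
print(coeff)                           # last two entries are 0, as expected
print(coeff.sum())                     # G(1) = 1
print((k * coeff).sum(), n * r / m)    # G'(1) equals the mean nr/m = 2.0
```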

Distribution properties The first two derivatives are
$$
\begin{aligned}
G'_{S_r^{n,m}}(z) &= \sum_{k=1}^{n} k \binom{m}{n}^{-1} \binom{r}{k} \binom{m-r}{n-k} z^{k-1} \\
&= \sum_{k=1}^{n} k\, \frac{r}{k} \binom{r-1}{k-1} \frac{n}{m} \binom{m-1}{n-1}^{-1} \binom{m-1-(r-1)}{n-1-(k-1)} z^{k-1} \\
&= \frac{nr}{m} \sum_{j=0}^{n-1} \binom{m-1}{n-1}^{-1} \binom{r-1}{j} \binom{m-1-(r-1)}{n-1-j} z^{j} = \frac{nr}{m}\, G_{S_{r-1}^{n-1,m-1}}(z)
\end{aligned}
$$
and
$$G''_{S_r^{n,m}}(z) = \frac{nr}{m}\, G'_{S_{r-1}^{n-1,m-1}}(z) = \frac{nr}{m}\, \frac{(n-1)(r-1)}{m-1}\, G_{S_{r-2}^{n-2,m-2}}(z)$$
Setting 𝑧 = 1 yields the mean
$$\mu_{S_r^{n,m}} = \operatorname{E} S_r^{n,m} = G'_{S_r^{n,m}}(1) = \frac{nr}{m}$$
and the expectation
$$\operatorname{E}\big[S_r^{n,m}\big(S_r^{n,m} - 1\big)\big] = G''_{S_r^{n,m}}(1) = \frac{n(n-1)\,r(r-1)}{m(m-1)}$$
from which the variance can be recovered as
$$
\begin{aligned}
\operatorname{Var} S_r^{n,m} &= \operatorname{E}\big[S_r^{n,m}\big(S_r^{n,m} - 1\big)\big] - \mu_{S_r^{n,m}}\big(\mu_{S_r^{n,m}} - 1\big) \\
&= \frac{n(n-1)\,r(r-1)}{m(m-1)} - \frac{nr}{m}\left(\frac{nr}{m} - 1\right) = \frac{nr}{m}\left(\frac{(n-1)(r-1)}{m-1} + 1 - \frac{nr}{m}\right) \\
&= \frac{nr}{m} \cdot \frac{(r-1)(n-1)\,m + (m-rn)(m-1)}{m(m-1)} = \frac{nr}{m}\left(1 - \frac{r}{m}\right)\frac{m-n}{m-1}
\end{aligned}
$$
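As a final cross-check (a sketch, not from the lecture; parameter values assumed), the PGF-derived mean and variance agree with SciPy's hypergeometric moments:

```python
from scipy import stats

m, r, n = 50, 18, 12                             # assumed illustrative values
mean = n * r / m
var = mean * (1 - r / m) * (m - n) / (m - 1)
print(mean, var)
# SciPy's parameterization: hypergeom(M=m, n=r, N=n)
print(stats.hypergeom.mean(m, r, n), stats.hypergeom.var(m, r, n))
```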
