
MTL390-Statistical Methods

Lecture 1
Discrete Distributions

Bernoulli distribution: ℙ(X = 0) = 1 − p, ℙ(X = 1) = p

We have 𝔼(X) = p, M(t) = 𝔼(e^{tX}) = pe^t + q, var(X) = 𝔼(X²) − 𝔼(X)² = p − p² = pq

Finally, we have φ(t) = 𝔼(e^{itX}) = pe^{it} + q


Fisher Information: I(p) = 𝔼((∂ log f(x)/∂p)²)

We have f(x) = p^x (1 − p)^{1−x} ⇒ (∂ log f(x)/∂p)² = (1/q²)(x²/p² + 1 − 2x/p)

Hence, we have 𝔼((∂ log f(x)/∂p)²) = (1/q²)(p/p² + 1 − 2p/p) = (1/q²)(1/p − 1) = 1/(pq)

Binomial Distribution
The probability mass function of the Binomial distribution is given by f(x) = C(n, x) p^x (1 − p)^{n−x}, x = 0, 1, …, n

A Binomial(n, p) random variable can be viewed as the sum of n independent Bernoulli(p) random variables.

Hence, we have 𝔼(X) = p + p + ⋯ (n times) = np, var(X) = n var(X₁) = npq, M(t) = (q + pe^t)^n and φ(t) = (q + pe^{it})^n
Now, we have (∂ log f(x)/∂p)² = (1/q²)(x²/p² + n² − 2nx/p)

Hence, we have 𝔼((∂ log f(x)/∂p)²) = (1/q²)((npq + n²p²)/p² + n² − 2n·np/p) = (1/q²)(n(q + np)/p + n² − 2n²) = (n/(pq²))(q + np − np) = n/(pq)

Result: If X and Y are independent and X~B(n,p) and Y~B(m,p), then X+Y~B(m+n,p)

Proof:

Method 1: Using MGF

We have for RV X+Y

M(t) = 𝔼(e^{t(X+Y)}) = 𝔼(e^{tX}) 𝔼(e^{tY}) = (q + pe^t)^n (q + pe^t)^m = (q + pe^t)^{m+n}

Hence, we have 𝑋 + 𝑌~𝐵(𝑚 + 𝑛, 𝑝)

Method 2:

We have ℙ(X + Y = k) = Σ_{i=0}^{k} ℙ(X = i) ℙ(Y = k − i) = Σ_{i=0}^{k} C(n, i) p^i q^{n−i} C(m, k − i) p^{k−i} q^{m−k+i}

= Σ_{i=0}^{k} C(n, i) C(m, k − i) p^k q^{m+n−k}


Combinatorial argument: the number of ways of selecting k objects from a pile containing m + n objects is the same as splitting the pile into two groups of n and m objects, choosing i = 0, 1, 2, …, k objects from the first group and the remaining k − i objects from the second, and summing over i.

Hence, we have Σ_{i=0}^{k} C(n, i) C(m, k − i) = C(m + n, k)

Hence, we have ℙ(X + Y = k) = C(m + n, k) p^k q^{m+n−k}

Hence, we have 𝑋 + 𝑌~𝐵(𝑚 + 𝑛, 𝑝)
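A small simulation sketch (added for illustration, not part of the lecture) comparing the empirical pmf of a sum of independent Bin(n, p) and Bin(m, p) samples with the Bin(m + n, p) pmf from scipy; n, m, p and the sample size are arbitrary choices.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, m, p = 5, 7, 0.4
s = rng.binomial(n, p, 200_000) + rng.binomial(m, p, 200_000)   # X + Y
for k in range(0, m + n + 1, 3):
    # empirical P(X + Y = k) vs Bin(m + n, p) pmf at k
    print(k, np.mean(s == k), stats.binom.pmf(k, m + n, p))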

Hypergeometric distribution:
It models the probability of obtaining m successes in M draws made without replacement from a finite population of size N in which n objects count as successes.

Denoted as 𝐻𝐺(𝑁, 𝑛, 𝑀, 𝑚)
We have ℙ(X = m) = C(n, m) C(N − n, M − m) / C(N, M)

Hence, we have 𝔼(X) = Mn/N and var(X) = n (M/N)((N − M)/N)((N − n)/(N − 1))
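These moment formulas can be cross-checked against scipy's hypergeometric distribution (an added aside, not from the notes); note that scipy.stats.hypergeom takes (population size, number of success objects, number of draws), i.e. its arguments correspond to this note's (N, n, M). The numbers below are arbitrary.

from scipy import stats

N, n, M = 50, 12, 10                      # population, success objects, draws (note's notation)
rv = stats.hypergeom(N, n, M)
print(rv.mean(), M * n / N)               # E(X) = Mn/N
print(rv.var(), n * (M / N) * ((N - M) / N) * ((N - n) / (N - 1)))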

Poisson Distribution (Poi(λ))
We have f(x) = e^{−λ} λ^x / x!, x = 0, 1, 2, …

Hence, we have 𝔼(X) = λ, var(X) = λ, M(t) = exp(λ(e^t − 1)), φ(t) = exp(λ(e^{it} − 1))

Fisher information: log f(x) = −λ + x log λ − log(x!) ⇒ ∂ log f(x)/∂λ = −1 + x/λ

Hence, we have 𝔼((∂ log f(x)/∂λ)²) = 𝔼[1 + x²/λ² − 2x/λ] = 1 + (λ² + λ)/λ² − 2λ/λ = 1/λ

So, for the Poisson distribution, the Fisher information is 𝔼((∂ log f(x)/∂λ)²) = 1/λ

Theorem: If 𝑋~𝑝𝑜𝑖(𝜆), 𝑌~𝑝𝑜𝑖(𝛾), then 𝑋 + 𝑌~𝑝𝑜𝑖(𝜆 + 𝛾)

Proof:

We have φ(t) = exp(λ(e^{it} − 1)) × exp(γ(e^{it} − 1)) = exp((λ + γ)(e^{it} − 1))

Hence, we have 𝑋 + 𝑌~𝑃𝑜𝑖(𝜆 + 𝛾)
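A quick simulation sketch (an added illustration) of the theorem: the sum of independent Poi(λ) and Poi(γ) samples should match Poi(λ + γ) in mean and in pmf values; λ, γ, the test point and the sample size are arbitrary.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
lam, gam = 2.0, 3.5
s = rng.poisson(lam, 300_000) + rng.poisson(gam, 300_000)   # X + Y
print(np.mean(s), lam + gam)                                # empirical vs exact mean
print(np.mean(s == 4), stats.poisson.pmf(4, lam + gam))     # empirical vs exact P(X+Y=4)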

Geometric Distribution:
We have f(x) = pq^{x−1}, x = 1, 2, …

Hence, we have 𝔼(X) = 1/p, var(X) = q/p²

Geometric distribution possesses memoryless property, i.e. ℙ(𝑋 ≥ 𝑡 + 𝑠|𝑋 ≥ 𝑡) = ℙ(𝑋 ≥ 𝑠)

Result: If a discrete distribution has the memoryless property, then it must be geometric.
Proof:

We have ℙ(X > 2 | X > 1) = ℙ(X > 1) ⇒ ℙ(X > 2) = ℙ(X > 1)²

Let ℙ(X > 1) = q

Hence, we have ℙ(X > 2) = q², ℙ(X > 3) = ℙ(X > 3 | X > 2) ℙ(X > 2) = q(q²) = q³

Hence, in general, we have ℙ(X > n) = q^n

Hence, we have ℙ(X = n) = 1 − ℙ(X ≤ n − 1) − ℙ(X > n) = 1 − (1 − q^{n−1}) − q^n = q^{n−1}(1 − q)

Hence, writing p = 1 − q, we have ℙ(X = n) = pq^{n−1}, i.e., X ~ Geo(p)

For the geometric distribution, we have Σ_{i=0}^{∞} (1 − F(i)) = 𝔼(X)
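A tiny numerical sketch (not in the original notes) of this tail-sum identity for Geo(p) on {1, 2, …}, where ℙ(X > i) = q^i; the choice of p and the truncation at 1000 terms are arbitrary.

p = 0.3
q = 1 - p
tail_sum = sum(q**i for i in range(1000))   # sum of P(X > i) = q**i for i = 0, 1, ..., 999
print(tail_sum, 1 / p)                       # both should be close to E(X) = 1/p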

Continuous Distributions
Uniform Distribution (U(a, b))
We have f(x) = 1/(b − a), a ≤ x ≤ b, and F(x) = 0 for x ≤ a, (x − a)/(b − a) for a ≤ x ≤ b, 1 for x ≥ b

Hence, we have 𝔼(X) = (a + b)/2, var(X) = (b − a)²/12, M(t) = (e^{tb} − e^{ta})/(t(b − a)), t ≠ 0

Exponential Distribution (Exp(λ))
We have f(x) = λe^{−λx}, x > 0, and F(x) = 1 − e^{−λx}, x > 0

Hence, we have 𝔼(X) = 1/λ, var(X) = 1/λ², M(t) = λ/(λ − t) for t < λ

Fisher information: We have log f(x) = log λ − λx

Hence, we have ∂ log f(x)/∂λ = 1/λ − x ⇒ (∂ log f(x)/∂λ)² = 1/λ² + x² − 2x/λ

Hence, we have 𝔼[(∂ log f(x)/∂λ)²] = 1/λ² + (1/λ² + 1/λ²) − 2/λ² = 1/λ²

The exponential distribution has the memoryless property.

Proof:

We have ℙ(X > t + s | X > t) = ℙ(X > t + s)/ℙ(X > t) = e^{−λ(t+s)}/e^{−λt} = e^{−λs} = ℙ(X > s) ∎
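A simulation sketch of the memoryless property (added for illustration); numpy's exponential sampler is parameterized by the scale 1/λ, and the values of λ, t, s and the sample size are arbitrary.

import numpy as np

rng = np.random.default_rng(3)
lam, t, s = 1.5, 0.8, 0.5
x = rng.exponential(1 / lam, 1_000_000)        # Exp(lam) samples (scale = 1/lam)
cond = np.mean(x[x > t] > t + s)               # estimate of P(X > t+s | X > t)
print(cond, np.exp(-lam * s))                  # compare with P(X > s) = exp(-lam*s)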

Gamma Distribution (Γ(λ, α))

The pdf is f(x) = (λ^α/Γ(α)) e^{−λx} x^{α−1}; α > 0, λ > 0, x > 0

Hence, we have 𝔼(X) = (λ^α/Γ(α)) ∫₀^∞ e^{−λx} x^{α+1−1} dx = (λ^α/(λ^{α+1} Γ(α))) Γ(α + 1) = α/λ

Similarly, we have 𝔼(X²) = α(α + 1)/λ²

Hence, we have var(X) = α/λ²

Finally, we have M(t) = 𝔼(e^{tX}) = (λ^α/Γ(α)) ∫₀^∞ e^{−λx} x^{α−1} e^{tx} dx = (λ^α/Γ(α)) × Γ(α)/(λ − t)^α = (1 − t/λ)^{−α} for t < λ

Fisher information: log f(x) = α log λ − log Γ(α) − λx + (α − 1) log x

Hence, we have ∂ log f(x)/∂λ = α/λ − x ⇒ (∂ log f(x)/∂λ)² = α²/λ² + x² − 2αx/λ

Hence, we have 𝔼[(∂ log f(x)/∂λ)²] = α²/λ² + α(α + 1)/λ² − 2α²/λ² = α/λ²

The exponential distribution is a special case of the Gamma distribution: Exp(λ) = Γ(λ, 1)

If 𝑋~Γ(𝜆, 𝛼), 𝑌~Γ(𝜆, 𝛽), X, Y are independent, then 𝑋 + 𝑌~Γ(𝜆, 𝛼 + 𝛽)

Proof:

Let Z = X + Y

We have f_Z(z) = ∫₀^z (λ^α/Γ(α)) e^{−λx} x^{α−1} (λ^β/Γ(β)) e^{−λ(z−x)} (z − x)^{β−1} dx = (λ^{α+β}/(Γ(α)Γ(β))) e^{−λz} ∫₀^z x^{α−1} (z − x)^{β−1} dx

Hence, we have f_Z(z) = (λ^{α+β}/(Γ(α)Γ(β))) e^{−λz} z^{α+β−2} ∫₀^z (x/z)^{α−1} (1 − x/z)^{β−1} dx

Put x/z = y, we get f_Z(z) = (λ^{α+β}/(Γ(α)Γ(β))) e^{−λz} z^{α+β−1} ∫₀^1 y^{α−1} (1 − y)^{β−1} dy = (λ^{α+β}/(Γ(α)Γ(β))) e^{−λz} z^{α+β−1} B(α, β)

We have B(α, β) = Γ(α)Γ(β)/Γ(α + β)

Hence, we have f_Z(z) = (λ^{α+β}/Γ(α + β)) e^{−λz} z^{α+β−1}

Hence, we have Z ~ Γ(λ, α + β)
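A simulation sketch (an added aside) checking the mean and variance of the sum of two independent Gamma variables with a common rate; numpy's gamma sampler takes (shape, scale) with scale = 1/λ, and the parameter values are arbitrary.

import numpy as np

rng = np.random.default_rng(4)
lam, alpha, beta = 2.0, 1.5, 3.0
z = rng.gamma(alpha, 1 / lam, 500_000) + rng.gamma(beta, 1 / lam, 500_000)   # X + Y
print(np.mean(z), (alpha + beta) / lam)        # mean of Gamma(lam, alpha+beta)
print(np.var(z), (alpha + beta) / lam**2)      # variance of Gamma(lam, alpha+beta)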

Normal Distribution (N(μ, σ²))

We have f(x) = (1/√(2πσ²)) exp(−(x − μ)²/(2σ²)), M(t) = exp(μt + σ²t²/2)

If X ~ N(μ, σ²), then aX + b ~ N(aμ + b, a²σ²)

If X ~ N(μ, σ²) and Y ~ N(α, γ²) are independent, then X + Y ~ N(μ + α, σ² + γ²)
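A quick sketch (not part of the notes) verifying the mean and variance of the affine transform aX + b by simulation; all constants and the sample size are arbitrary.

import numpy as np

rng = np.random.default_rng(5)
mu, sigma, a, b = 1.0, 2.0, 3.0, -1.0
x = rng.normal(mu, sigma, 500_000)             # N(mu, sigma^2) samples
print(np.mean(a * x + b), a * mu + b)          # mean of aX + b
print(np.var(a * x + b), a**2 * sigma**2)      # variance of aX + b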

Tutorial 1
Q1: Solution: 𝑌 = −𝜆 log(1 − 𝑋)

Since 0 ≤ 𝑋 ≤ 1, we have 0 < 𝑌 < ∞


Hence, we have ℙ(Y ≤ y) = ℙ(−λ log(1 − X) ≤ y) = ℙ(1 − X ≥ exp(−y/λ)) = ℙ(X ≤ 1 − exp(−y/λ)) = 1 − exp(−y/λ)

Hence, we have F(y) = 1 − exp(−y/λ) ⇒ f(y) = (1/λ) exp(−y/λ)

Hence, we have Y ~ Exp(1/λ)

Q2: Solution: We have Y = F(X)

Hence, we have F_Y(y) = ℙ(Y ≤ y) = ℙ(F_X(X) ≤ y) = ℙ(X ≤ F_X^{−1}(y)) = F_X(F_X^{−1}(y)) = y

Hence, we have 𝑌~𝑈(0,1)
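A small sketch (an added aside) of this probability integral transform, using a standard normal CDF as the continuous F_X; any continuous distribution would do, and the sample size is arbitrary.

import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.normal(size=300_000)
u = stats.norm.cdf(x)                     # Y = F_X(X)
print(np.mean(u), np.var(u))              # should be close to 1/2 and 1/12, the U(0,1) moments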

Q3: Solution: We have 𝑌𝑘 = 𝑋1 + 𝑋2 + ⋯ + 𝑋𝑘

Here 𝔼( 𝑋𝑖 ) = 𝑝 − 𝑞

Hence, we have 𝔼(𝑌𝑘 ) = (𝑝 − 𝑞)𝑘

Q4: Solution: We have ℙ(N_t = k) = ℙ(exactly k coolers have failed by time t) = C(n, k) ℙ(X ≤ t)^k ℙ(X > t)^{n−k} = C(n, k)(1 − e^{−λt})^k (e^{−λt})^{n−k}

Hence, we have N_t ~ Bin(n, 1 − e^{−λt})

Q5: Done

Q6: Done

Q7: Solution: Since |x|^s ≤ 1 + |x|^r for s ≤ r, we have 𝔼(X^s) ≤ 𝔼(|X|^s) ≤ 𝔼(1 + |X|^r) = 1 + 𝔼(|X|^r) < ∞ (B)

Q8: Solution: Maximum number of viruses in the 1st generation = 1

Maximum number of viruses in the 2nd generation = 2

Maximum number of viruses in the 3rd generation = 4

To infect at least 7 humans, at least 4 viruses are needed

Hence, we have P = (0.4) × (0.4)² × [0.8³ × 0.2 × 4 + 0.8⁴]

Q9: Easy
Q10: Solution: We have f(x) = (λ^α x^{α−1}/Γ(α)) e^{−λx}

For maxima/minima we have df(x)/dx = 0 ⇒ −λ x^{α−1} e^{−λx} + (α − 1) x^{α−2} e^{−λx} = 0 ⇒ x = 0 or λx = α − 1 ⇒ x = (α − 1)/λ

By the first derivative test, x = (α − 1)/λ is the mode.
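A numerical sketch (not in the original solution) that locates the peak of the Gamma density on a grid and compares it with (α − 1)/λ; the grid and the parameter values are arbitrary.

import numpy as np
from scipy import stats

lam, alpha = 2.0, 3.5
xs = np.linspace(1e-6, 10, 100_001)
pdf = stats.gamma.pdf(xs, a=alpha, scale=1 / lam)   # Gamma(rate lam, shape alpha) density
print(xs[np.argmax(pdf)], (alpha - 1) / lam)        # grid argmax vs theoretical mode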

Q11: Solution: The first point can be chosen anywhere on the circle (probability 1).

The second point must then lie on the arc subtended by the equilateral triangle, which is one third of the circle.

Hence, p = 1/3
Lecture 3
Advanced Distributions
Negative Binomial (NB(k,p))
Models the number of failures before the k-th success

Hence, ℙ(X = n) = C(n + k − 1, k − 1) p^k q^n = C(−k, n) p^k (−q)^n, and 𝔼(X) = Σ_{x=0}^{∞} x C(−k, x) p^k (−q)^x
We have (1 − q)^{−k} = 1 + C(−k, 1)(−q)¹ + C(−k, 2)(−q)² + ⋯

Differentiating both sides with respect to q, we get

k(1 − q)^{−k−1} = Σ_{x=0}^{∞} (−x) C(−k, x)(−q)^{x−1}

Multiplying both sides by q (and by p^k), we get

𝔼(X) = p^k qk(1 − q)^{−k−1} = p^k qk p^{−k−1} = qk/p
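The mean qk/p can be cross-checked against scipy, whose nbinom(k, p) likewise counts the number of failures before the k-th success (an added aside); k and p below are arbitrary.

from scipy import stats

k, p = 4, 0.3
q = 1 - p
print(stats.nbinom(k, p).mean(), q * k / p)   # both should equal qk/p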

Chi-Squared Distribution (χ²(n))

If X ~ N(0,1), then X² ~ χ²(1)

We have ℙ(X² ≤ y) = ℙ(−√y ≤ X ≤ √y) = ∫_{−√y}^{√y} (1/√(2π)) e^{−x²/2} dx = 2 ∫₀^{√y} (1/√(2π)) e^{−x²/2} dx

Hence, we have f_Y(y) = 2 × (1/√(2π)) e^{−y/2} × (1/(2√y)) = ((1/2)^{1/2}/Γ(1/2)) e^{−y/2} y^{1/2−1}

Hence, we have χ²(1) = Γ(1/2, 1/2)

Let X₁, X₂, …, Xₙ be n iid random variables distributed as N(0,1); then Y = Σ_{i=1}^{n} Xᵢ² ~ χ²(n)

We have Y ~ Γ(1/2, n/2) {sum of independent Gamma distributions}
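A minimal sketch (added for illustration) comparing the χ²(n) density with the Γ(1/2, n/2) density; in scipy's parameterization a rate of 1/2 corresponds to scale = 2, and n and the evaluation points are arbitrary.

import numpy as np
from scipy import stats

n = 5
xs = np.linspace(0.5, 20, 5)
print(stats.chi2.pdf(xs, df=n))                    # chi-square(n) density
print(stats.gamma.pdf(xs, a=n / 2, scale=2))       # Gamma(rate 1/2, shape n/2) density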

Pareto Distribution (Pareto(σ, α))

Here σ is called the scale parameter and α is called the shape parameter; α, σ > 0

Pdf is: f(x) = ασ^α/x^{α+1}, x > σ

CDF is F(x) = 1 − (σ/x)^α, x > σ

Shifted exponential: f(x) = λe^{−λ(x−θ)}, x > θ

Double exponential: f(x) = (λ/2) e^{−λ|x|}, x ∈ ℝ

Example: If X ~ Exp(λ), what is the distribution of Y = ke^X, k > 0?

We have ℙ(Y ≤ y) = ℙ(ke^X ≤ y) = ℙ(X ≤ ln(y/k)) = 1 − e^{−λ ln(y/k)} = 1 − (k/y)^λ, y > k

Hence, we have Y ~ Pareto(k, λ)
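A simulation sketch (an added aside) of this example: samples of k·exp(X) for X ~ Exp(λ) should match the Pareto CDF 1 − (k/y)^λ; the values of λ, k and the test point are arbitrary.

import numpy as np

rng = np.random.default_rng(8)
lam, k = 2.5, 3.0
y = k * np.exp(rng.exponential(1 / lam, 500_000))   # Y = k * e^X
y0 = 5.0
print(np.mean(y <= y0), 1 - (k / y0) ** lam)        # empirical vs Pareto(k, lam) CDF at y0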

Weibull Distribution
Pdf: f(x) = c x^{c−1} e^{−x^c}, c, x > 0

We have 𝔼(X) = ∫₀^∞ c x^c e^{−x^c} dx

Put x^c = t; we have c x^{c−1} dx = dt

Hence, we have 𝔼(X) = ∫₀^∞ e^{−t} t^{1/c + 1 − 1} dt = Γ(1/c + 1)

More generally, we have 𝔼(X^n) = ∫₀^∞ c x^{n+c−1} e^{−x^c} dx

Put x^c = z, we get

𝔼(X^n) = ∫₀^∞ z^{n/c} e^{−z} dz = Γ(n/c + 1)

Hence, we have M(t) = Σ_{n=0}^{∞} Γ(n/c + 1) t^n / n!

Hence, we have var(X) = 𝔼(X²) − 𝔼(X)² = Γ(2/c + 1) − Γ(1/c + 1)²

Two-parameter version of the Weibull distribution: f(x) = (c/λ)(x/λ)^{c−1} e^{−(x/λ)^c}, c, x, λ > 0
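A quick check (not part of the notes) of the one-parameter moment formula 𝔼(X^n) = Γ(n/c + 1); numpy's standard Weibull sampler uses exactly the density c x^{c−1} e^{−x^c}, and c and the sample size are arbitrary.

import numpy as np
from scipy.special import gamma

rng = np.random.default_rng(9)
c = 1.7
x = rng.weibull(c, 500_000)
print(np.mean(x), gamma(1 / c + 1))       # first moment
print(np.mean(x**2), gamma(2 / c + 1))    # second moment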

Beta Function and Properties


We define B(m, n) = ∫₀^1 x^{m−1} (1 − x)^{n−1} dx

Properties of Beta function

[1] Β(𝑚, 𝑛) = Β(𝑛, 𝑚)


Proof: We know that ∫_a^b f(x) dx = ∫_a^b f(a + b − x) dx

Hence, we have B(m, n) = ∫₀^1 x^{m−1} (1 − x)^{n−1} dx = ∫₀^1 (1 − x)^{m−1} (1 − (1 − x))^{n−1} dx = B(n, m)
[2] B(m, n) = Γ(n)Γ(m)/Γ(m + n)

Proof:
We have Γ(n)Γ(m) = ∫₀^∞ ∫₀^∞ e^{−x} x^{n−1} e^{−y} y^{m−1} dx dy

Put x = ut and y = u(1 − t), we get

J = det[ u  t ; −u  1 − t ] = u(1 − t) + ut = u

Hence, we have Γ(n)Γ(m) = ∫₀^∞ ∫₀^1 e^{−ut} (ut)^{n−1} e^{−u(1−t)} (u(1 − t))^{m−1} u du dt = ∫₀^∞ ∫₀^1 e^{−u} u^{n+m−1} t^{n−1} (1 − t)^{m−1} du dt

Hence, we have Γ(n)Γ(m) = ∫₀^∞ e^{−u} u^{n+m−1} du ∫₀^1 t^{n−1} (1 − t)^{m−1} dt = Γ(m + n) B(n, m)

Hence, we have B(n, m) = B(m, n) = Γ(n)Γ(m)/Γ(m + n)
[3] B(m, n) = 2 ∫₀^{π/2} cos^{2m−1}θ sin^{2n−1}θ dθ

Proof:

Put t = cos²θ; we get dt = −2 sinθ cosθ dθ, and the limits t: 0 → 1 become θ: π/2 → 0

Hence, we have B(m, n) = ∫₀^{π/2} cos^{2m−2}θ sin^{2n−2}θ · 2 sinθ cosθ dθ = 2 ∫₀^{π/2} cos^{2m−1}θ sin^{2n−1}θ dθ
[4] B(m, n) = ∫₀^∞ t^{m−1}/(1 + t)^{m+n} dt

Proof:

In the original integral (in the variable t) put y = t/(1 − t), we get

t = y/(1 + y) ⇒ dt = dy/(1 + y)²

Hence, we have B(m, n) = ∫₀^∞ (y/(1 + y))^{m−1} (1 − y/(1 + y))^{n−1} dy/(1 + y)² = ∫₀^∞ y^{m−1}/(1 + y)^{m+n} dy

Hence, we have two types of Beta distributions:

[1] Beta1 Distribution: pdf f(x) = (1/B(m, n)) x^{m−1} (1 − x)^{n−1}; 0 ≤ x ≤ 1

[2] Beta2 Distribution: pdf f(x) = (1/B(m, n)) x^{m−1}/(1 + x)^{m+n}; x > 0

Properties of Beta Distribution


[1] If X ~ Γ(λ, n) and Y ~ Γ(λ, m) are independent, then X/(X + Y) ~ Beta1(n, m)

Proof: Let U = X/(X + Y), V = X + Y ⇒ X = UV, Y = V − UV

Hence, we have J = det[ v  u ; −v  1 − u ] = |v − vu + vu| = v

Hence, we have f_{UV}(u, v) = (λ^{m+n} v e^{−λv}/(Γ(n)Γ(m))) (uv)^{n−1} (v − uv)^{m−1} = (λ^{m+n}/(Γ(n)Γ(m))) v^{m+n−1} e^{−λv} u^{n−1} (1 − u)^{m−1}

Hence, we have f_U(u) = (λ^{m+n}/(Γ(n)Γ(m))) u^{n−1} (1 − u)^{m−1} ∫₀^∞ v^{m+n−1} e^{−λv} dv = (Γ(m + n)/(Γ(m)Γ(n))) u^{n−1} (1 − u)^{m−1}
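A simulation sketch (an added aside) of property [1]: for independent X ~ Γ(λ, n) and Y ~ Γ(λ, m), the ratio X/(X + Y) should match scipy's Beta(n, m) in mean and variance; the parameter values and sample size are arbitrary.

import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
lam, n, m = 1.3, 2.0, 5.0
x = rng.gamma(n, 1 / lam, 400_000)        # Gamma(rate lam, shape n)
y = rng.gamma(m, 1 / lam, 400_000)        # Gamma(rate lam, shape m)
u = x / (x + y)
print(np.mean(u), stats.beta(n, m).mean())   # Beta(n, m) mean is n/(n+m)
print(np.var(u), stats.beta(n, m).var())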
[2] If X ~ Γ(λ, m) and Y ~ Γ(λ, n), X, Y are independent, then X/Y ~ Beta2(m, n)

[3] If X ~ Beta1(m, n), then 𝔼(X) = m/(m + n), 𝔼(X^r) = ∏_{i=0}^{r−1} (m + i)/(m + n + i)

[4] If X ~ Beta2(m, n), then 𝔼(X) = m/(n − 1) and var(X) = m(m + n − 1)/((n − 2)(n − 1)²), n > 2

[5] χ²(n) = Γ(1/2, n/2)

Student's t-distribution
If X ~ N(0,1) and Y ~ χ²(n) are independent, then X/√(Y/n) is said to follow the t distribution with n degrees of freedom.

The pdf is

f(u) = 1 / (√n B(n/2, 1/2) (1 + u²/n)^{(n+1)/2})
F distribution (F_{m,n})
If X and Y are independent random variables with X ~ χ²(m) and Y ~ χ²(n), then the random variable Z = (X/m)/(Y/n) has the F distribution with parameters m, n.
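A simulation sketch (added for illustration) building Z = (X/m)/(Y/n) from independent chi-square samples and comparing it with scipy's F(m, n); the degrees of freedom and sample size are arbitrary.

import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
m, n = 4, 9
z = (rng.chisquare(m, 400_000) / m) / (rng.chisquare(n, 400_000) / n)
print(np.mean(z), stats.f(m, n).mean())      # F(m, n) mean is n/(n-2) for n > 2
print(np.median(z), stats.f(m, n).median())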
