Lecture02
Statistical Inference
Lecture 02: Exponential families
Yuansi Chen
Spring 2023
Duke University
https://www2.stat.duke.edu/courses/Spring23/sta732.01/
Recap from Lecture 01
Goal of Lecture 02
Exponential families
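An $s$-parameter exponential family has densities of the form
$$p_\eta(x) = \exp\left[\eta^\top T(x) - A(\eta)\right] h(x)$$
with respect to a common measure $\mu$ on $\mathcal{X}$, where
$T : \mathcal{X} \to \mathbb{R}^s$ sufficient statistics
$h : \mathcal{X} \to \mathbb{R}$ carrier/base density
$\eta \in \Xi \subseteq \mathbb{R}^s$ natural parameter
$A : \mathbb{R}^s \to \mathbb{R}$ cumulant-generating function (cgf)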
Notes on 𝐴(𝜂)
Example 2.1
Notes on the natural parameter
$\Xi_1 = \{\eta : A(\eta) < \infty\}$
Other parameterization for an exponential family
Take $\eta : \Omega \to \Xi$, define
$$p_\theta(x) = \exp\left[\eta(\theta)^\top T(x) - B(\theta)\right] h(x), \qquad B(\theta) = A(\eta(\theta))$$
Example 2.2: normal with unknown mean and variance
$$
\begin{aligned}
p_\theta(x) &= \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \\
&= \exp\left[\frac{\mu}{\sigma^2}\,x - \frac{1}{2\sigma^2}\,x^2 - \frac{\mu^2}{2\sigma^2} - \frac{1}{2}\log(2\pi\sigma^2)\right]
\end{aligned}
$$
We identify
$$\theta = \begin{pmatrix}\mu \\ \sigma^2\end{pmatrix}, \quad \eta(\theta) = \begin{pmatrix}\frac{\mu}{\sigma^2} \\ -\frac{1}{2\sigma^2}\end{pmatrix}, \quad T(x) = \begin{pmatrix}x \\ x^2\end{pmatrix}$$
$$h(x) = 1, \quad B(\theta) = \frac{\mu^2}{2\sigma^2} + \frac{1}{2}\log(2\pi\sigma^2)$$
How to write in terms of natural parameters?
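$$p_\eta(x) = \exp\left[\eta^\top \begin{pmatrix}x \\ x^2\end{pmatrix} - A(\eta)\right]$$
$$A(\eta) = \frac{-\eta_1^2}{4\eta_2} + \frac{1}{2}\log\left(-\frac{\pi}{\eta_2}\right)$$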
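As a quick numerical sanity check of Example 2.2, the sketch below evaluates the natural-parameter form and compares it with the usual normal density. The function names and the test values of $\mu$ and $\sigma^2$ are illustrative choices, not part of the lecture.

```python
import math

def natural_params(mu, sigma2):
    """Map theta = (mu, sigma^2) to eta = (mu / sigma^2, -1 / (2 sigma^2))."""
    return mu / sigma2, -1.0 / (2.0 * sigma2)

def log_partition(eta1, eta2):
    """A(eta) = -eta1^2 / (4 eta2) + 0.5 * log(-pi / eta2); requires eta2 < 0."""
    return -eta1 ** 2 / (4.0 * eta2) + 0.5 * math.log(-math.pi / eta2)

def density_natural(x, eta1, eta2):
    """p_eta(x) = exp(eta1 * x + eta2 * x^2 - A(eta)), with h(x) = 1."""
    return math.exp(eta1 * x + eta2 * x ** 2 - log_partition(eta1, eta2))

def density_normal(x, mu, sigma2):
    """Standard N(mu, sigma^2) density, used only for comparison."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

mu, sigma2 = 1.5, 0.7  # arbitrary illustrative values
eta1, eta2 = natural_params(mu, sigma2)
for x in (-1.0, 0.0, 2.3):
    assert math.isclose(density_natural(x, eta1, eta2), density_normal(x, mu, sigma2))
```

The check only works for $\eta_2 < 0$, which is the region where $A(\eta)$ is finite for this family.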
$\{p_\eta : \eta \in \Xi\}$ lives inside an $s$-dimensional subspace
The form of an exponential family is not unique
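1. Absorb $h$ into the base measure:
$$\mu \rightsquigarrow \tilde{\mu} \quad \text{with} \quad \frac{d\tilde{\mu}}{d\mu} = h$$
2. Reparameterize so $0 \in \Xi$: take $\eta_0 \in \Xi$
$$\eta \rightsquigarrow \tilde{\eta} = \eta - \eta_0, \qquad h \rightsquigarrow \tilde{h} = p_{\eta_0}(x)$$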
...
More examples
Example 2.3: joint density of $n$ i.i.d. normal
Given $X_1, \ldots, X_n \overset{\text{i.i.d.}}{\sim} \mathcal{N}(\mu, \sigma^2)$, the joint density is
$$
\begin{aligned}
p_\theta(x) &= \prod_{i=1}^{n} \left[\frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x_i-\mu)^2}{2\sigma^2}}\right] \\
&= \exp\left\{\sum_{i=1}^{n} \left[\frac{\mu}{\sigma^2}\,x_i - \frac{1}{2\sigma^2}\,x_i^2 - \frac{\mu^2}{2\sigma^2} - \frac{1}{2}\log(2\pi\sigma^2)\right]\right\}
\end{aligned}
$$
$$\eta(\theta) = \begin{pmatrix}\frac{\mu}{\sigma^2} \\ -\frac{1}{2\sigma^2}\end{pmatrix}, \quad T(x) = \begin{pmatrix}\sum x_i \\ \sum x_i^2\end{pmatrix}, \quad B(\theta) = n B^{(1)}(\theta)$$
where $B^{(1)}(\theta)$ is $B(\theta)$ for a single observation.
Ex: in general, the joint density of $n$ i.i.d. random variables from an $s$-parameter exponential family is still an $s$-parameter exponential family with the same parameters
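The sketch below illustrates Example 2.3 numerically: the joint log-density depends on the sample only through $(\sum x_i, \sum x_i^2)$. Function names and the sample values are made up for illustration.

```python
import math

def sufficient_stats(xs):
    """T(x) = (sum of x_i, sum of x_i^2) for an i.i.d. normal sample."""
    return sum(xs), sum(x * x for x in xs)

def B_single(mu, sigma2):
    """B^(1)(theta) for one observation: mu^2 / (2 sigma^2) + 0.5 * log(2 pi sigma^2)."""
    return mu ** 2 / (2.0 * sigma2) + 0.5 * math.log(2.0 * math.pi * sigma2)

def joint_log_density_expfam(xs, mu, sigma2):
    """eta(theta)^T T(x) - n * B^(1)(theta), using only the sufficient statistics."""
    t1, t2 = sufficient_stats(xs)
    eta1, eta2 = mu / sigma2, -1.0 / (2.0 * sigma2)
    return eta1 * t1 + eta2 * t2 - len(xs) * B_single(mu, sigma2)

def joint_log_density_direct(xs, mu, sigma2):
    """Sum of the individual normal log-densities, for comparison."""
    return sum(
        -(x - mu) ** 2 / (2.0 * sigma2) - 0.5 * math.log(2.0 * math.pi * sigma2)
        for x in xs
    )

xs = [0.2, -1.1, 3.4, 0.9]  # hypothetical sample
assert math.isclose(
    joint_log_density_expfam(xs, 1.0, 2.0),
    joint_log_density_direct(xs, 1.0, 2.0),
)
```

Two samples with the same sum and sum of squares would give the same value of `joint_log_density_expfam` for every $\theta$.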
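Example: binomial
$$
\begin{aligned}
p_\theta(x) &= \binom{n}{x}\, \theta^x (1-\theta)^{n-x} \\
&= \left(\frac{\theta}{1-\theta}\right)^{x} (1-\theta)^n \binom{n}{x} \\
&= \exp\left[\log\left(\frac{\theta}{1-\theta}\right) x + n\log(1-\theta)\right] \binom{n}{x}
\end{aligned}
$$
$$T(x) = x, \quad \eta(\theta) = \log\left(\frac{\theta}{1-\theta}\right)$$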
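A minimal sketch checking the binomial rewrite numerically. It uses $A(\eta) = n\log(1 + e^\eta)$, which follows from $-n\log(1-\theta)$ with $\theta = e^\eta/(1+e^\eta)$; this closed form is derived here and is not stated on the slide. Function names are illustrative.

```python
import math

def binom_pmf_direct(x, n, theta):
    """Binomial pmf in its usual form."""
    return math.comb(n, x) * theta ** x * (1.0 - theta) ** (n - x)

def binom_pmf_expfam(x, n, theta):
    """Same pmf as h(x) * exp(eta * T(x) - A(eta)) with h(x) = C(n, x), T(x) = x."""
    eta = math.log(theta / (1.0 - theta))     # natural parameter (log-odds)
    A = n * math.log(1.0 + math.exp(eta))     # equals -n * log(1 - theta)
    return math.comb(n, x) * math.exp(eta * x - A)

n, theta = 10, 0.3  # arbitrary illustrative values
for x in range(n + 1):
    assert math.isclose(binom_pmf_direct(x, n, theta), binom_pmf_expfam(x, n, theta))
```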
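Example: Poisson
$$
\begin{aligned}
p_\lambda(x) &= \frac{\lambda^x e^{-\lambda}}{x!} \\
&= \exp\left[\log(\lambda)\, x - \lambda\right] \frac{1}{x!}
\end{aligned}
$$
This is a 1-parameter exponential family
$$\eta(\lambda) = \log(\lambda)$$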
Differential Identities
Intuition for getting moments from cgf
$$e^{A(\eta)} = \int e^{\eta^\top T(x)}\, h(x)\, d\mu(x)$$
Theorem 2.4 in Keener
Theorem 2.4
Let $\Xi_f$ be the set of values for $\eta \in \mathbb{R}^s$ where
Proof sketch in 1-d (Chap. 2.3. in Keener)
Proof:
What do we get by differentiating 𝐴(𝜂)?
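$$\nabla A(\eta) = \mathbb{E}_\eta[T(X)]$$
Because
$$\frac{\partial}{\partial \eta_j}\, e^{A(\eta)} = \frac{\partial}{\partial \eta_j} \int \exp\left[\eta^\top T(x)\right] h(x)\, d\mu(x)$$
Differentiating under the integral sign, the left side is $e^{A(\eta)}\, \partial A(\eta)/\partial \eta_j$ and the right side is $\int T_j(x) \exp[\eta^\top T(x)]\, h(x)\, d\mu(x)$. Dividing by $e^{A(\eta)}$ gives $\partial A(\eta)/\partial \eta_j = \mathbb{E}_\eta[T_j(X)]$.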
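The identity $\nabla A(\eta) = \mathbb{E}_\eta[T(X)]$ can be checked numerically for the normal family of Example 2.2, where $\mathbb{E}[T(X)] = (\mu, \mu^2 + \sigma^2)$. The sketch below uses central finite differences; the step size and test values are arbitrary choices made here.

```python
import math

def A(eta1, eta2):
    """cgf of the normal family: -eta1^2 / (4 eta2) + 0.5 * log(-pi / eta2)."""
    return -eta1 ** 2 / (4.0 * eta2) + 0.5 * math.log(-math.pi / eta2)

def grad_A_numeric(eta1, eta2, step=1e-5):
    """Central finite-difference approximation of the gradient of A."""
    d1 = (A(eta1 + step, eta2) - A(eta1 - step, eta2)) / (2.0 * step)
    d2 = (A(eta1, eta2 + step) - A(eta1, eta2 - step)) / (2.0 * step)
    return d1, d2

mu, sigma2 = 1.5, 0.7  # arbitrary illustrative values
eta1, eta2 = mu / sigma2, -1.0 / (2.0 * sigma2)
d1, d2 = grad_A_numeric(eta1, eta2)

# E[X] = mu and E[X^2] = mu^2 + sigma^2 for a normal random variable
assert math.isclose(d1, mu, rel_tol=1e-6)
assert math.isclose(d2, mu ** 2 + sigma2, rel_tol=1e-6)
```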
Differentiating twice
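Example: Poisson
$$p_\lambda(x) = \frac{\lambda^x e^{-\lambda}}{x!}$$
With $\eta = \log(\lambda)$, the cgf is $A(\eta) = e^\eta$, so
$$\mathbb{E}_\eta[X] = \frac{d e^\eta}{d\eta} = e^\eta = \lambda$$
$$\mathrm{Var}_\eta[X] = \frac{d^2}{d\eta^2}\, e^\eta = e^\eta = \lambda$$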
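As a hedged check of the Poisson example, the sketch below computes the mean and variance by summing the pmf directly and compares both with $e^\eta$. The truncation at 200 terms is an assumption that is adequate only for moderate $\lambda$; function names are illustrative.

```python
import math

def poisson_pmf(x, lam):
    """Poisson pmf, computed in log space to avoid overflow in x!."""
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))

def poisson_mean_var_by_sum(lam, n_terms=200):
    """Mean and variance from the pmf, truncating the infinite sum at n_terms."""
    mean = sum(x * poisson_pmf(x, lam) for x in range(n_terms))
    second = sum(x * x * poisson_pmf(x, lam) for x in range(n_terms))
    return mean, second - mean ** 2

eta = math.log(3.2)  # natural parameter for lambda = 3.2 (arbitrary value)
mean, var = poisson_mean_var_by_sum(math.exp(eta))

# A(eta) = e^eta, so A'(eta) = A''(eta) = e^eta
assert math.isclose(mean, math.exp(eta), rel_tol=1e-9)
assert math.isclose(var, math.exp(eta), rel_tol=1e-9)
```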
Moment-generating function
Useful properties of moment-generating function
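$$\frac{\partial^{r_1}}{\partial u_1^{r_1}} \cdots \frac{\partial^{r_s}}{\partial u_s^{r_s}}\, M_T(u)\, \Big|_{u=0} = \mathbb{E}\left[T_1^{r_1} \cdots T_s^{r_s}\right]$$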
Moment-generating function of exponential family
$$
\begin{aligned}
M_\eta^{T(X)}(u) &= \mathbb{E}_\eta\left[e^{u^\top T(X)}\right] \\
&= \int e^{u^\top T}\, e^{\eta^\top T - A(\eta)}\, h\, d\mu
\end{aligned}
$$
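Combining the two exponents in the integral above and using $e^{A(\eta+u)} = \int e^{(\eta+u)^\top T} h\, d\mu$ gives $M_\eta^{T(X)}(u) = \exp[A(\eta+u) - A(\eta)]$ whenever $\eta + u \in \Xi$. This closing step is a standard identity that is not shown in the extracted slide. The sketch below checks it for the Poisson family with $A(\eta) = e^\eta$; names, the truncation at 200 terms, and the test values are illustrative assumptions.

```python
import math

def poisson_pmf(x, lam):
    """Poisson pmf, computed in log space to avoid overflow in x!."""
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))

def A(eta):
    """cgf of the Poisson family in its natural parameter eta = log(lambda)."""
    return math.exp(eta)

def mgf_by_sum(u, lam, n_terms=200):
    """E[exp(u X)] computed by summing against the pmf (truncated sum)."""
    return sum(math.exp(u * x) * poisson_pmf(x, lam) for x in range(n_terms))

def mgf_from_cgf(u, eta):
    """exp(A(eta + u) - A(eta)), the mgf of T(X) = X under p_eta."""
    return math.exp(A(eta + u) - A(eta))

lam, u = 3.2, 0.4  # arbitrary illustrative values
assert math.isclose(mgf_by_sum(u, lam), mgf_from_cgf(u, math.log(lam)), rel_tol=1e-9)
```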
Relationship between the moments and cumulants
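$$
\begin{aligned}
M' &= K' e^{K} &&\Rightarrow\ \mathbb{E}[T] = \kappa_1 \\
M'' &= (K'' + K'^2)\, e^{K} &&\Rightarrow\ \mathbb{E}[T^2] = \kappa_2 + \kappa_1^2 \\
M''' &= (K''' + 3K'K'' + K'^3)\, e^{K} &&\Rightarrow\ \mathbb{E}[T^3] = \kappa_3 + 3\kappa_1\kappa_2 + \kappa_1^3
\end{aligned}
$$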
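The three moment-cumulant relations above can be checked on the Poisson distribution, whose cumulants all equal $\lambda$ (since $K(u) = \lambda(e^u - 1)$). The sketch below compares the formulas with raw moments computed by direct summation; function names and the truncation at 200 terms are assumptions made for illustration.

```python
import math

def raw_moments_from_cumulants(k1, k2, k3):
    """First three raw moments E[T], E[T^2], E[T^3] from the cumulants."""
    return k1, k2 + k1 ** 2, k3 + 3.0 * k1 * k2 + k1 ** 3

def poisson_pmf(x, lam):
    """Poisson pmf, computed in log space to avoid overflow in x!."""
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))

def poisson_raw_moment(r, lam, n_terms=200):
    """E[X^r] by summing against the pmf (truncated sum)."""
    return sum(x ** r * poisson_pmf(x, lam) for x in range(n_terms))

lam = 3.2  # arbitrary illustrative value
m1, m2, m3 = raw_moments_from_cumulants(lam, lam, lam)
assert math.isclose(m1, poisson_raw_moment(1, lam), rel_tol=1e-9)
assert math.isclose(m2, poisson_raw_moment(2, lam), rel_tol=1e-9)
assert math.isclose(m3, poisson_raw_moment(3, lam), rel_tol=1e-9)
```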
Example 2.11: moments of normal
Proof:
Summary of useful properties of exponential families
What is next?
• Sufficiency
• Factorization theorem
• Minimal sufficiency
Thank you