Professional Documents
Culture Documents
Lecture03
Lecture03
Statistical Inference
Lecture 03: Sufficient Statistics
Yuansi Chen
Spring 2023
Duke University
https://www2.stat.duke.edu/courses/Spring23/sta732.01/
1
Recap from Lecture 02
2
Goal of Lecture 03
1. Define sufficiency
2. Factorization theorem
3. Minimal sufficiency
3
Sufficiency
Motivation for sufficiency
i.i.d.
Coin flipping experiment: suppose 𝑋1 , … , 𝑋𝑛 ∼ Bernoulli(𝜃).
Then the joint density is
𝑛
𝑝𝜃 (𝑋1 = 𝑥1 , … , 𝑋𝑛 = 𝑥𝑛 ) = ∏ 𝜃𝑥𝑖 (1 − 𝜃)1−𝑥𝑖
𝑖=1
𝑛 𝑛
= 𝜃∑𝑖=1 𝑥𝑖 (1 − 𝜃)𝑛−∑𝑖=1 𝑥𝑖
𝑛
Let 𝑇 (𝑋) = ∑𝑖=1 𝑋𝑖 . 𝑇 (𝑋) follows Binomial distribution
Binom(𝑛, 𝜃), with 𝑝𝜃 (𝑇 (𝑋) = 𝑡) = 𝜃𝑡 (1 − 𝜃)𝑛−𝑡 (𝑛𝑡)
4
Motivation for sufficiency
i.i.d.
Coin flipping experiment: suppose 𝑋1 , … , 𝑋𝑛 ∼ Bernoulli(𝜃).
Then the joint density is
𝑛
𝑝𝜃 (𝑋1 = 𝑥1 , … , 𝑋𝑛 = 𝑥𝑛 ) = ∏ 𝜃𝑥𝑖 (1 − 𝜃)1−𝑥𝑖
𝑖=1
𝑛 𝑛
= 𝜃∑𝑖=1 𝑥𝑖 (1 − 𝜃)𝑛−∑𝑖=1 𝑥𝑖
𝑛
Let 𝑇 (𝑋) = ∑𝑖=1 𝑋𝑖 . 𝑇 (𝑋) follows Binomial distribution
Binom(𝑛, 𝜃), with 𝑝𝜃 (𝑇 (𝑋) = 𝑡) = 𝜃𝑡 (1 − 𝜃)𝑛−𝑡 (𝑛𝑡)
It seems that to estimate 𝜃 it is sufficient to know 𝑇 (𝑋), but 𝑇 (𝑋)
did throw away data. How to justify?
4
Sufficient statistics
5
Back to the coin flipping example
𝑝𝜃 (𝑋 = 𝑥, 𝑇 = 𝑡)
𝑝𝜃 (𝑋 = 𝑥 ∣ 𝑇 = 𝑡) =
𝑝𝜃 (𝑇 = 𝑡)
𝜃∑ 𝑥𝑖 (1 − 𝜃)𝑛−∑ 𝑥𝑖 1∑ 𝑥𝑖 =𝑡
=
𝜃𝑡 (1 − 𝜃)𝑛−𝑡 (𝑛𝑡)
1
= 𝑛 1∑ 𝑥𝑖 =𝑡
(𝑡)
6
Interpretation of sufficiency
Interpretation 1: sufficiency from generative model perspective
7
A graphical model for the data generation
𝜃 𝑇 (𝑋) 𝑋
8
Fake data generated from 𝑇 is good enough
𝜃 𝑇 (𝑋) 𝑋
𝑋̃
9
Interpretation 2: estimator using 𝑇 (𝑋) alone cannot be worse
10
Interpretation 2: estimator using 𝑇 (𝑋) alone cannot be worse
10
Factorization theorem
Motivation for factorization theorem
11
Factorization theorem
12
proof (see the rigorous proof in Keener 6.4):
13
Ex1: exponential family
14
Ex2: joint distributuon of i.i.d. uniform
i.i.d.
Suppose 𝑋1 , … , 𝑋𝑛 ∼ 𝑈 [𝜃, 𝜃 + 1]. The joint density is
𝑛
𝑝𝜃 (𝑥) = ∏ 1{𝜃≤𝑥𝑖 ≤𝜃+1}
𝑖=1
The order statistics (𝑋(𝑖) )𝑛𝑖=1 (where 𝑋(𝑘) is the 𝑘-th smallest) is
sufficient
15
Ex3: joint distribution invariant to permutations of 𝑋𝑖
i.i.d. (1)
Suppose 𝑋1 , … , 𝑋𝑛 ∼ 𝑃𝜃 . The joint distribution 𝑃𝜃 is invariant
to permutations of 𝑋 = (𝑋1 , … , 𝑋𝑛 ). What are some sufficient
statistics?
16
Ex3: joint distribution invariant to permutations of 𝑋𝑖
i.i.d. (1)
Suppose 𝑋1 , … , 𝑋𝑛 ∼ 𝑃𝜃 . The joint distribution 𝑃𝜃 is invariant
to permutations of 𝑋 = (𝑋1 , … , 𝑋𝑛 ). What are some sufficient
statistics?
• Order statistics
• Empirical distribution
̂ 1 𝑛
𝑃𝑛 (⋅) = ∑ 𝛿𝑋𝑖 (⋅)
𝑛 𝑖=1
16
Minimal sufficiency
Motivation for minimal sufficiency
i.i.d
Consider 𝑋1 , … , 𝑋𝑛 ∼ 𝒩(𝜃, 1). Among the four sufficient
statistics
𝑛
• ∑𝑖=1 𝑋𝑖
𝑛
• 𝑋̄ = 𝑛1 ∑𝑖=1 𝑋𝑖
• 𝑂(𝑋) = (𝑋(1) , … , 𝑋(𝑛) )
• 𝑋 = (𝑋1 , … , 𝑋𝑛 )
17
Ordering of sufficient statistics
Proposition
If 𝑇 (𝑋) is sufficient and there exists 𝑓 such that 𝑇 (𝑋) = 𝑓(𝑆(𝑋)),
then 𝑆(𝑋) is sufficient
18
Minimal sufficiency
• 𝑇 is sufficient
• For any other sufficient statistics 𝑆, there exists 𝑓 such that
𝑇 = 𝑓(𝑆)
(a.e. P)
19
Example: which is minimal sufficient?
Suppose 𝑋1 , … , 𝑋2𝑛 are i.i.d. from 𝒩(𝜃, 1), which of the following
statistics is sufficient? is minimal?
• 𝑋1
𝑛
∑𝑖=1 𝑋𝑖
• 𝑇̃ = ( 2𝑛 )
∑𝑖=𝑛+1 𝑋𝑖
2𝑛
• ∑𝑖=1 𝑋𝑖
20
Another way to check minimal sufficiency
21
proof:
22
Ex1: minimal sufficient statistics in exponential family
⊤
𝑝𝜃 (𝑥) = 𝑒𝜂(𝜃) 𝑇 (𝑥)−𝐵(𝜃)
ℎ(𝑥)
23
Ex2: two parameter Gaussian
24
Ex3: Laplace location family
i.i.d. (1)
Suppose 𝑋1 , … , 𝑋𝑛 ∼ 𝑝𝜃 (𝑥) = 12 𝑒−|𝑥−𝜃| . Then the joint density
is
𝑛
1
𝑝𝜃 (𝑥) = 𝑛 exp {− ∑ |𝑥𝑖 − 𝜃|}
2 𝑖=1
25
Summary
26
What is next?
• Completeness
• Ancillarity
• Basu’s theorem (relationship between sufficient and ancillary
statistics)
27
Thank you
28
29