04_sufficiency_mvue
Advanced Statistics II
Outline
1 Sufficient statistics
4 Up next
Formally
Definition
Let (X1 , . . . , Xn ) ∼ fX (x1 , . . . , xn ; θ) be a random sample, and let
S1 = s1 (X1 , . . . , Xn ), . . . , Sr = sr (X1 , . . . , Xn ) be r statistics. The r
statistics are said to be sufficient statistics for fX (x; θ) iff
fX (x1 , . . . , xn ; θ | s1 , . . . , sr ) = h (x1 , . . . , xn ),
where h(x) does not depend on θ. Equivalently (Neyman factorization),
fX (x; θ) = g (s1 (x), . . . , sr (x); θ) · h(x),
where g is a function of only s1 (x), . . . , sr (x) and θ, and h(x) does not
depend on θ.
An example
Let (X1 , . . . , Xn ) be a random sample from a Bernoulli population with pdf
f (x; p) = px (1 − p)1−x I{0,1} (x), p ∈ [0, 1].
fX (x1 , ..., xn ; p) = Π_{i=1}^n p^{xi} (1 − p)^{1−xi} · I{0,1} (xi )
                     = p^{Σ_{i=1}^n xi} (1 − p)^{n − Σ_{i=1}^n xi} · Π_{i=1}^n I{0,1} (xi ).
Setting S = Σ_{i=1}^n Xi , the first term corresponds to g(s(x); p), and the
second, which is independent of p, corresponds to h(x).
It follows that the value of the sum of the sample outcomes, being sufficient
(Neyman!), contains all the sample information relevant for estimating q(p).
Suppose that n = 3 and that we observe s = 2. It does not matter how:
x = (1, 1, 0), x = (1, 0, 1), x = (0, 1, 1).
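A quick numerical illustration (a sketch, not part of the slides): the Bernoulli likelihood depends on the sample only through s = Σ xi, so the three samples above yield identical likelihoods for every value of p.

```python
# Sketch: the Bernoulli joint pmf depends on the sample only through s = sum(x).
def bernoulli_likelihood(x, p):
    """Joint pmf of an i.i.d. Bernoulli(p) sample x."""
    like = 1.0
    for xi in x:
        like *= p**xi * (1 - p)**(1 - xi)
    return like

samples = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]  # all have s = 2, n = 3
for p in (0.2, 0.5, 0.8):
    likes = [bernoulli_likelihood(x, p) for x in samples]
    # identical likelihoods: the data enter only through s
    assert all(abs(l - likes[0]) < 1e-12 for l in likes)
```

Any two samples with the same s are interchangeable as far as inference about p is concerned.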
Example
Let (X1 , . . . , Xn ) be a random N (µ, σ 2 ) sample with θ = (µ, σ 2 )0 and joint pdf
of the random sample
fX (x1 , ..., xn ; θ) = Π_{i=1}^n (2πσ²)^{−1/2} e^{−(xi − µ)²/(2σ²)}
                     = (2πσ²)^{−n/2} e^{−(1/(2σ²)) Σ_{i=1}^n (xi − µ)²}
                     = (σ²)^{−n/2} e^{−(1/(2σ²)) (Σ_{i=1}^n xi² − 2µ Σ_{i=1}^n xi + nµ²)} · (2π)^{−n/2}.
Setting S1 = Σ_{i=1}^n Xi and S2 = Σ_{i=1}^n Xi², the first term corresponds to
g(s1 (x), s2 (x); θ), and the second, which is independent of θ, corresponds to h(x).
So S1 = Σ_{i=1}^n Xi and S2 = Σ_{i=1}^n Xi² are sufficient statistics for θ = (µ, σ²)′.
The sample mean X̄n = (1/n) Σ_{i=1}^n Xi and the sample variance
S² = (1/(n − 1)) Σ_{i=1}^n (Xi − X̄n )² are invertible functions of S1 and S2 .
They provide the same information about θ.
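The invertibility can be checked numerically. A sketch (simulated data, parameter values chosen arbitrarily): compute (S1, S2), map to (X̄n, S²), and invert back.

```python
# Sketch: (S1, S2) = (sum x_i, sum x_i^2) and (xbar, s2) carry the same
# information, since each pair is an invertible function of the other.
import random

random.seed(1)
n = 10
x = [random.gauss(0.0, 1.0) for _ in range(n)]  # arbitrary N(0,1) sample

S1 = sum(x)
S2 = sum(xi**2 for xi in x)

xbar = S1 / n                           # sample mean from (S1, S2)
s2 = (S2 - n * xbar**2) / (n - 1)       # sample variance from (S1, S2)

# invert back: S1 = n*xbar and S2 = (n-1)*s2 + n*xbar^2
assert abs(n * xbar - S1) < 1e-9
assert abs((n - 1) * s2 + n * xbar**2 - S2) < 1e-9
```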
Definition
A sufficient statistic S = s(X ) for fX (x; θ) is said to be a minimal
sufficient statistic if, for every other sufficient statistic T = t(X ), ∃ a
function τ such that s(X ) = τ (t(X )).
Intuition:
By definition, a function can never have more elements in its range
than in its domain.
So t may simplify to s, but not the other way round.
Statistics and Econometrics (CAU Kiel) Summer 2021 9 / 30
Sufficient statistics
Corollary
Let X ∼ fX (x; θ), and suppose that R(X ) does not depend on θ. If the
statistic S = s(X ) is such that
fX (x; θ) / fX (y; θ) does not depend on θ iff (x, y) satisfies s(x) = s(y),
then S is a minimal sufficient statistic for fX (x; θ).¹
¹ See Mittelhammer (1996, p. 395-396) for the full result.
Theorem
Let fX (x; θ) be a member of the exponential class of density functions,
fX (x; θ) = exp[ Σ_{i=1}^k ci (θ)gi (x) + d(θ) + z(x) ].
Then (g1 (X ), . . . , gk (X )) is a minimal sufficient statistic for fX (x; θ).
Example
Let (X1 , . . . , Xn ) be a random sample from a Gamma population with a joint pdf
which belongs to the exponential class
fX (x; α, β) = Π_{i=1}^n (1/(β^α Γ(α))) xi^{α−1} e^{−xi /β}
             = exp[ (α − 1) Σ_{i=1}^n ln xi − (1/β) Σ_{i=1}^n xi − n ln(β^α Γ(α)) ],
with c1 (θ) = α − 1, g1 (x) = Σ_{i=1}^n ln xi , c2 (θ) = −1/β, g2 (x) = Σ_{i=1}^n xi .
Thus, it follows that [g1 (X ), g2 (X )] = [Σ_{i=1}^n ln Xi , Σ_{i=1}^n Xi ] is a
bivariate minimal sufficient statistic for (α, β).
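The two representations of the Gamma joint pdf can be verified numerically. A sketch (sample and parameter values chosen arbitrarily): the product form and the exponential-class form with g1 = Σ ln xi, g2 = Σ xi agree.

```python
# Sketch: product form vs. exponential-class form of the Gamma joint pdf.
import math
import random

random.seed(2)
alpha, beta = 2.0, 1.5
x = [random.gammavariate(alpha, beta) for _ in range(5)]  # arbitrary sample
n = len(x)

# product form of the joint pdf
prod = 1.0
for xi in x:
    prod *= xi**(alpha - 1) * math.exp(-xi / beta) / (beta**alpha * math.gamma(alpha))

# exponential-class form with g1 = sum ln x_i, g2 = sum x_i
g1 = sum(math.log(xi) for xi in x)
g2 = sum(x)
expo = math.exp((alpha - 1) * g1 - g2 / beta - n * math.log(beta**alpha * math.gamma(alpha)))

assert abs(prod - expo) < 1e-10 * abs(prod)  # the two forms coincide
```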
Theorem
Let S = s(X ) be an r-dimensional sufficient statistic for fX (x; θ). If
τ [s(X )] is an r-dimensional invertible function of s(X ), then
a. τ [s(X )] is an r-dimensional sufficient statistic for fX (x; θ);
b. if s(X ) is a minimal sufficient statistic, then τ [s(X )] is a minimal
sufficient statistic.
Main result
We focus on the scalar case.
Theorem
Let S = (S1 , ..., Sr )′ be an r-dimensional sufficient statistic for fX (x; θ),
and let T = t(X ) be any unbiased estimator for the scalar q(θ). Define
T∗ = E(T | S1 , ..., Sr ). Then
1. T∗ is a statistic and it is a function of S1 , ..., Sr ;
2. E(T∗ ) = q(θ), that is, T∗ is an unbiased estimator of q(θ);
3. Var(T∗ ) ≤ Var(T ) ∀ θ ∈ Ω, where the equality is attained only if
P(T∗ = T ) = 1.
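A Monte Carlo sketch of the theorem (not from the slides; parameter values chosen arbitrarily): for a Bernoulli(p) sample, T = X1 is unbiased for p, and conditioning on S = Σ Xi gives T∗ = E(T | S) = S/n = X̄n, which is also unbiased but has much smaller variance.

```python
# Sketch: Rao-Blackwellising T = X1 against the sufficient statistic S = sum(X_i).
import random

random.seed(3)
p, n, reps = 0.3, 10, 20000

t_vals, tstar_vals = [], []
for _ in range(reps):
    x = [1 if random.random() < p else 0 for _ in range(n)]
    t_vals.append(x[0])            # crude unbiased estimator T = X1
    tstar_vals.append(sum(x) / n)  # T* = E(T | S) = S/n

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((vi - m)**2 for vi in v) / (len(v) - 1)

# both roughly unbiased, but Var(T*) <= Var(T)
assert abs(mean(t_vals) - p) < 0.02
assert abs(mean(tstar_vals) - p) < 0.02
assert var(tstar_vals) < var(t_vals)
```

Here Var(T) = p(1 − p) while Var(T∗) = p(1 − p)/n, so the variance reduction is by a factor of n.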
Can’t lose: by item 3, Rao-Blackwellisation never increases variance, so
conditioning an unbiased estimator on a sufficient statistic can only help.
Where to stop? Different sufficient statistics may lead to different improved
estimators; completeness, introduced next, settles this question.
Definition
A sufficient statistic S = s(X ) for fX (x; θ) is said to be complete if, for
any function z satisfying
Eθ [z(S)] = 0 ∀ θ ∈ Ω,
it holds that
Pθ [z(S) = 0] = 1 ∀ θ ∈ Ω.
Lemma
If a sufficient statistic S is complete, two different functions of S cannot
have the same expected value.
Example
Let (X1 , . . . , Xn ) be a random sample from a Bernoulli population with
P(X = 1) = p, and consider the statistic
S = Σ_{i=1}^n Xi ∼ Binomial(n, p).
For Ep [z(S)] = Σ_{j=0}^n z(j) (n choose j) p^j (1 − p)^{n−j} to be equal to 0
for all p ∈ (0, 1), each coefficient must vanish:
z(j) (n choose j) = 0 with (n choose j) ≠ 0, so that z(j) = 0 ∀ j ∈ {0, 1, ..., n}.
Hence S is complete.
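The coefficient argument can be checked numerically. A sketch (n and the grid of p values chosen arbitrarily): the Bernstein-type terms (n choose j) p^j (1 − p)^{n−j} are linearly independent, so vanishing of Ep[z(S)] at n + 1 distinct values of p already forces z ≡ 0.

```python
# Sketch: the evaluation matrix of the Binomial pmf terms at n+1 distinct
# values of p is invertible, so E_p[z(S)] = 0 at those p's implies z = 0.
import math

n = 3
ps = [0.2, 0.4, 0.6, 0.8]  # n + 1 distinct values of p (arbitrary choice)

# B[k][j] = C(n, j) * p_k^j * (1 - p_k)^(n - j)
B = [[math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(n + 1)] for p in ps]

def det(m):
    """Determinant by Laplace expansion (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1)**j * m[0][j] * det(minor)
    return total

# nonzero determinant: the only solution of B z = 0 is z(0) = ... = z(n) = 0
assert abs(det(B)) > 1e-12
```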
Theorem
Let S = (S1 , . . . , Sr )0 be a complete sufficient statistic for f (x; θ). Let
T = t(S) be an unbiased estimator for the function q(θ). Then T = t(S)
is the MVUE of q(θ).
Summing up
There are two possible procedures for identifying the MVUE for q(θ):
Find a statistic of the form t(S) such that E(t(S)) = q(θ).
Then t(S) is necessarily the MVUE of q(θ).
Find any unbiased estimator of q(θ), say t∗ (X ).
Then t(S) = E(t∗ (X )|S) is the MVUE of q(θ).
Example
Let (X1 , . . . , Xn ) be a random sample from a Poisson population with pmf
f (x; θ) = e^{−θ} θ^x / x!  for x = 0, 1, 2, . . . ,  with E(X) = Var(X) = θ.
Find the MVUE of q(θ) = θ.
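A sketch of the first procedure for this exercise (simulation parameters chosen arbitrarily): S = Σ Xi is a complete sufficient statistic for the Poisson family, and t(S) = S/n = X̄n satisfies E(t(S)) = θ, so X̄n is the MVUE of θ. A quick Monte Carlo check of the unbiasedness:

```python
# Sketch: Xbar = S/n is unbiased for theta in the Poisson model.
import math
import random

random.seed(4)

def poisson(theta):
    """Draw one Poisson(theta) variate by inversion of the cdf."""
    u, k = random.random(), 0
    p = math.exp(-theta)
    cdf = p
    while u > cdf:
        k += 1
        p *= theta / k
        cdf += p
    return k

theta, n, reps = 2.0, 5, 20000
est = [sum(poisson(theta) for _ in range(n)) / n for _ in range(reps)]
mean_est = sum(est) / len(est)
assert abs(mean_est - theta) < 0.05  # E(Xbar) = theta, up to Monte Carlo error
```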
Coming up