
Sufficient statistics and MVUEs

Advanced Statistics II

Prof. Dr. Matei Demetrescu

Statistics and Econometrics (CAU Kiel) Summer 2021 1 / 30


Focus on the essential

For an identified model, the sample contains all relevant information.


Functions of the sample are supposed to summarize that information.
Some statistics may in principle be more informative than others (for
given parameters);
in fact, some statistics may be totally uninformative:
e.g., the sample variance is shift-invariant and hence carries no
information about the mean.
Thus, in place of the original random-sample outcome, it may be sufficient
to have the outcomes of selected statistics in order to estimate any q(θ).

(MVU) Estimation should then be based on such sufficient statistics!
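The shift-invariance of the sample variance is easy to verify numerically; below is a minimal Python sketch (the sample values are made up for illustration):

```python
# The sample variance is unchanged when every observation is shifted by a
# constant, so it cannot carry any information about the location (mean).
def sample_var(xs):
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / (n - 1)

data = [2.1, 3.7, 1.4, 5.0, 2.8]        # arbitrary made-up sample
shifted = [x + 100.0 for x in data]     # the mean shifts by 100 ...

# ... but the sample variance is identical (up to float rounding).
v_original, v_shifted = sample_var(data), sample_var(shifted)
```

Any shift-invariant statistic behaves the same way, which is why such statistics cannot help in estimating the mean.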



Today’s outline

Sufficient statistics and MVUEs

1 Sufficient statistics

2 The Blackwell-Rao theorem

3 Completeness and MVUEs

4 Up next



Sufficient statistics

Outline

1 Sufficient statistics

2 The Blackwell-Rao theorem

3 Completeness and MVUEs

4 Up next



Sufficient statistics

Formally

Definition
Let (X1 , . . . , Xn ) ∼ fX (x1 , . . . , xn ; θ) be a random sample, and let
S1 = s1 (X1 , . . . , Xn ), . . . , Sr = sr (X1 , . . . , Xn ) be r statistics. The r
statistics are said to be sufficient statistics for fX (x; θ) iff

fX(x1, . . . , xn; θ | s1, . . . , sr) = h(x1, . . . , xn),

i.e., the conditional density of X, given S = [S1, . . . , Sr]′, does not
depend on the parameter θ.

What remains after knowing the sufficient statistics is unidentified!
Therefore, estimation may safely ignore it.



Sufficient statistics

A necessary and sufficient criterion

Theorem (Neyman’s Factorization Theorem)


Let fX (x; θ) be the pdf of the random sample (X1 , . . . , Xn ). The
statistics S1 , . . . , Sr are sufficient statistics for fX (x; θ) iff fX (x; θ) can
be decomposed as

fX (x; θ) = g (s1 (x), . . . , sr (x); θ) · h(x),

where g is a function of only s1 (x), . . . , sr (x) and θ, and h(x) does not
depend on θ.

The proof is in a sense elementary,
and the intuition is that conditioning on the Sj leaves only randomness
... which is not identifying!



Sufficient statistics

An example
Let (X1 , . . . , Xn ) be a random sample from a Bernoulli population with pdf
f (x; p) = px (1 − p)1−x I{0,1} (x), p ∈ [0, 1].

Note that the joint pdf of the random sample is given by

fX(x1, ..., xn; p) = ∏_{i=1}^n p^{xi} (1 − p)^{1−xi} I_{0,1}(xi)
                   = p^{∑_{i=1}^n xi} (1 − p)^{n − ∑_{i=1}^n xi} · ∏_{i=1}^n I_{0,1}(xi).

Setting S = ∑_{i=1}^n Xi, the first factor corresponds to g(s(x); p); the
second factor is independent of p and corresponds to h(x).

It follows (via Neyman's factorization theorem) that the sum of the sample
outcomes is sufficient and contains all the sample information relevant for
estimating q(p). Suppose that n = 3 and that we observe s = 2. It does not
matter how: x = (1, 1, 0), x = (1, 0, 1), or x = (0, 1, 1).
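That "it does not matter how" can be checked directly: all arrangements with the same sum have identical likelihood for every p, and conditionally on S the arrangements are equally likely regardless of p. A minimal sketch (the p-values are arbitrary choices):

```python
from math import comb

def bernoulli_lik(x, p):
    """Joint Bernoulli likelihood p^(sum x) * (1 - p)^(n - sum x)."""
    s, n = sum(x), len(x)
    return p ** s * (1 - p) ** (n - s)

arrangements = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]   # all have s = 2, n = 3
for p in (0.2, 0.5, 0.9):
    liks = [bernoulli_lik(x, p) for x in arrangements]
    assert max(liks) == min(liks)   # identical likelihood for every arrangement

# Conditionally on S = 2, each arrangement has probability 1 / C(3, 2) = 1/3,
# free of p: the conditional distribution given S carries no information on p.
p = 0.7
cond = bernoulli_lik((1, 1, 0), p) / (comb(3, 2) * p ** 2 * (1 - p))
```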



Sufficient statistics

Example

Let (X1 , . . . , Xn ) be a random N (µ, σ 2 ) sample with θ = (µ, σ 2 )0 and joint pdf
of the random sample
fX(x1, ..., xn; θ) = ∏_{i=1}^n (2πσ²)^{−1/2} exp( −(xi − µ)²/(2σ²) )
                   = (2πσ²)^{−n/2} exp( −(1/(2σ²)) ∑_{i=1}^n (xi − µ)² )
                   = (σ²)^{−n/2} exp( −(1/(2σ²)) (∑_{i=1}^n xi² − 2µ ∑_{i=1}^n xi + nµ²) ) · (2π)^{−n/2}.

Setting S1 = ∑_{i=1}^n Xi and S2 = ∑_{i=1}^n Xi², the first factor corresponds
to g(s1(x), s2(x); θ); the last factor is independent of θ and corresponds to h(x).

So S1 = ∑_{i=1}^n Xi and S2 = ∑_{i=1}^n Xi² are sufficient statistics for θ = (µ, σ²)′.
The sample mean X̄n = (1/n) ∑_{i=1}^n Xi and variance S² = (1/(n−1)) ∑_{i=1}^n (Xi − X̄n)² are
invertible functions of S1 and S2. They provide the same information about θ.
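The invertible map between (S1, S2) and (X̄n, S²) can be sketched as follows; the data values are made up:

```python
def to_mean_var(s1, s2, n):
    """Map (S1, S2) = (sum x_i, sum x_i^2) to (sample mean, sample variance)."""
    xbar = s1 / n
    s_sq = (s2 - n * xbar ** 2) / (n - 1)
    return xbar, s_sq

def to_sums(xbar, s_sq, n):
    """Inverse map: recover (S1, S2) from the mean and sample variance."""
    s1 = n * xbar
    s2 = (n - 1) * s_sq + n * xbar ** 2
    return s1, s2

data = [1.0, 2.5, 0.3, 4.2, 3.1]    # arbitrary made-up sample
n = len(data)
s1, s2 = sum(data), sum(x * x for x in data)
xbar, s_sq = to_mean_var(s1, s2, n)  # same information, different packaging
```

Going back and forth between the two pairs loses nothing, which is the sense in which they carry the same sample information.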



Sufficient statistics

How much information compressing?

Definition
A sufficient statistic S = s(X) for fX(x; θ) is said to be a minimal
sufficient statistic if, for every other sufficient statistic T = t(X), ∃ a
function hT(·) such that

s(x) = hT(t(x)) ∀ x ∈ RΩ(X).

Note: The notation for the sample space RΩ(X) indicates that the range of X is
taken over all θs in the parameter space Ω. If the support of the pdf does not
change with θ (e.g., Normal, Gamma, etc.) then RΩ(X) = R(X).

Intuition:
By definition, a function can never have more elements in its range
than in its domain.
So t may simplify to s, but not the other way round.
Sufficient statistics

Lehmann-Scheffé’s Minimal Sufficiency Theorem


The general result depends on whether the sample space varies with the
parameter, so we look at a corollary only.1

Corollary
Let X ∼ fX (x; θ), and suppose that R(X ) does not depend on θ. If the
statistic S = s(X ) is such that
fX(x; θ) / fX(y; θ) does not depend on θ iff (x, y) satisfies s(x) = s(y),

then S = s(X ) is a minimal sufficient statistic.

Proof: See Mittelhammer (1996), p. 396.


We need to find an appropriate function S = s(X );
... but this is often less complicated than it sounds.

1 See Mittelhammer (1996, pp. 395-396) for the full result.
Sufficient statistics

Minimal sufficient statistics in the exponential class

Theorem
Let fX (x; θ) be a member of the exponential class of density functions,
fX(x; θ) = exp[ ∑_{i=1}^k ci(θ) gi(x) + d(θ) + z(x) ].

Then s(X) = [g1(X), . . . , gk(X)]′ is a k-variate sufficient statistic, and if
c1(θ), . . . , ck(θ) are linearly independent, the sufficient statistic is a
minimal sufficient statistic.

The result is established by using Neyman’s factorization theorem,


followed by the Corollary.



Sufficient statistics

Example

Let (X1 , . . . , Xn ) be a random sample from a Gamma population with a joint pdf
which belongs to the exponential class

fX(x; α, β) = ∏_{i=1}^n [1/(β^α Γ(α))] xi^{α−1} e^{−xi/β}
            = exp[ (α − 1) ∑_{i=1}^n ln xi − (1/β) ∑_{i=1}^n xi − n ln(β^α Γ(α)) ],

with c1(θ) = α − 1, g1(x) = ∑_{i=1}^n ln xi, c2(θ) = −1/β and g2(x) = ∑_{i=1}^n xi.

Thus, it follows that [g1(X), g2(X)] = [∑_{i=1}^n ln Xi, ∑_{i=1}^n Xi] is a bivariate
minimal sufficient statistic for (α, β).
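Sufficiency means the Gamma log-likelihood depends on the data only through (∑ ln xi, ∑ xi); a minimal numerical check (sample and parameter values are made up):

```python
from math import log, lgamma

data = [1.2, 0.7, 3.4, 2.2, 0.9]    # arbitrary made-up positive sample
n = len(data)
g1 = sum(log(x) for x in data)       # sum of ln x_i
g2 = sum(data)                       # sum of x_i

def loglik_direct(alpha, beta):
    """Gamma(alpha, beta) log-likelihood, summed observation by observation."""
    return sum((alpha - 1) * log(x) - x / beta
               - alpha * log(beta) - lgamma(alpha) for x in data)

def loglik_sufficient(alpha, beta):
    """The same log-likelihood written in terms of (g1, g2, n) only."""
    return (alpha - 1) * g1 - g2 / beta - n * (alpha * log(beta) + lgamma(alpha))

# Both versions agree for any (alpha, beta), up to float rounding.
```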



Sufficient statistics

Sufficiency and minimality


Sufficient statistics are not unique:
one-to-one transformations of a (minimal) sufficient statistic provide
the same sample information about the unknown parameter as the
initial statistic.

Theorem
Let S = s(X ) be an r-dimensional sufficient statistic for fX (x; θ). If
τ [s(X )] is an r-dimensional invertible function of s(X ), then
a. τ [s(X )] is an r-dimensional sufficient statistic for fX (x; θ);
b. if s(X ) is a minimal sufficient statistic, then τ [s(X )] is a minimal
sufficient statistic.

a. follows with Neyman's factorization theorem, while
b. is proved e.g. in Mittelhammer (1996), p. 405.
The Blackwell-Rao theorem

Outline

1 Sufficient statistics

2 The Blackwell-Rao theorem

3 Completeness and MVUEs

4 Up next



The Blackwell-Rao theorem

Main result
We focus on the scalar case.

Theorem
Let S = (S1, ..., Sr)′ be an r-dimensional sufficient statistic for fX(x; θ),
and let T = t(X) be any unbiased estimator for the scalar q(θ). Define

T∗ = t∗(X) = E[T | S1, ..., Sr].

Then
1. T ∗ is a statistic and it is a function of S1 , ..., Sr ;
2. E(T ∗ ) = q(θ), that is, T ∗ is an unbiased estimator of q(θ);
3. Var(T ∗ ) ≤ Var(T ) ∀ θ ∈ Ω, where the equality is attained only if
P(T ∗ = T ) = 1.

Proof: elementary manipulations.



The Blackwell-Rao theorem

Can’t lose

Given an unbiased estimator,
another unbiased estimator that is a function of a sufficient statistic
can be constructed,
which will never have larger variance, and possibly has smaller variance.
We may thus improve the MSE performance of unbiased estimators.

But: If an unbiased estimator T is already a function of a sufficient
statistic S, then the Rao-Blackwellized estimator T∗ will be identical to T.
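Rao-Blackwellization can be illustrated with the Bernoulli sample: T = X1 is unbiased for p, and conditioning on S = ∑ Xi gives T∗ = E[X1 | S] = S/n by exchangeability. A seeded simulation sketch (n, p and the replication count are arbitrary choices):

```python
import random
from statistics import mean, pvariance

random.seed(42)
n, p, reps = 10, 0.3, 20000

t_vals, t_star_vals = [], []
for _ in range(reps):
    x = [1 if random.random() < p else 0 for _ in range(n)]
    t_vals.append(x[0])             # crude unbiased estimator T = X1
    t_star_vals.append(sum(x) / n)  # Rao-Blackwellized T* = E[T | S] = S / n

# Both are (approximately) unbiased, but T* has much smaller variance:
# Var(T) = p(1 - p) = 0.21 versus Var(T*) = p(1 - p)/n = 0.021.
```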



The Blackwell-Rao theorem

Where to stop?

Will a Rao-Blackwellized estimator be the MVUE?

Yes! ... if the sufficient statistic S is complete.



Completeness and MVUEs

Outline

1 Sufficient statistics

2 The Blackwell-Rao theorem

3 Completeness and MVUEs

4 Up next



Completeness and MVUEs

Complete sufficient statistics


Definition
Let S = [S1 , . . . , Sr ]0 be a sufficient statistic for fX (x; θ). The sufficient
statistic S is said to be complete iff for any statistic z(S) with

Eθ [z(S)] = 0 ∀ θ ∈ Ω,

it holds that
Pθ [z(S) = 0] = 1 ∀ θ ∈ Ω.

(This relates – vaguely – to identification.)

Lemma
If a sufficient statistic S is complete, two different functions of S cannot
have the same expected value.

So any unbiased estimator of q(θ) that is a function of a complete
sufficient statistic is unique.
Completeness and MVUEs

The Bernoulli example I

Example
Let (X1 , . . . , Xn ) be a random sample from a Bernoulli population with
P(X = 1) = p, and consider the statistic
S = ∑_{i=1}^n Xi,

which is a sufficient statistic for p – is S also complete?

To determine whether S is a complete sufficient statistic, we need to
show that any function z(S) of S for which E[z(S)] = 0 ∀ p ∈ [0, 1]
necessarily satisfies P[z(S) = 0] = 1 ∀ p ∈ [0, 1].



Completeness and MVUEs

The Bernoulli example II

First note that S ∼ Binomial(n, p), so that, writing C(n, j) for the binomial
coefficient, E[z(S)] = 0 implies

E[z(S)] = ∑_{j=0}^n z(j) C(n, j) p^j (1 − p)^{n−j}
        = (1 − p)^n ∑_{j=0}^n z(j) C(n, j) ω^j = 0, where ω = p/(1 − p).

Hence, E[z(S)] = 0 ∀ p ∈ [0, 1] requires that

∑_{j=0}^n z(j) C(n, j) ω^j = 0 ∀ p ∈ [0, 1].



Completeness and MVUEs

The Bernoulli example III

For that polynomial in ω to be equal to 0 ∀ ω, all coefficients z(j) C(n, j)
need to be equal to 0; since C(n, j) ≠ 0, this means

z(j) = 0 ∀ j ∈ {0, 1, ..., n}.

Hence, E[z(S)] = 0 ∀ p requires that z(j) = 0 ∀ j, so that E[z(S)] = 0
implies P[z(S) = 0] = 1.
Thus, S = ∑_{i=1}^n Xi is a complete sufficient statistic for p.
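The coefficient argument can be verified exactly for a small n: evaluating E[z(S)] at n + 1 distinct values of p yields a linear system whose matrix is nonsingular, so z = 0 is the only solution. A sketch using exact rational arithmetic (n = 3 and the chosen p-values are arbitrary):

```python
from fractions import Fraction
from math import comb

n = 3
ps = [Fraction(k, 5) for k in (1, 2, 3, 4)]   # n + 1 distinct p-values in (0, 1)

# Row for a given p: the binomial pmf weights multiplying z(0), ..., z(n).
M = [[comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(n + 1)]
     for p in ps]

def det(a):
    """Determinant by Laplace expansion along the first row (exact Fractions)."""
    if len(a) == 1:
        return a[0][0]
    total = Fraction(0)
    for j in range(len(a)):
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * a[0][j] * det(minor)
    return total

# Nonzero determinant: M z = 0 forces z(j) = 0 for all j, i.e. requiring
# E[z(S)] = 0 on just n + 1 points already pins z down, let alone on all of [0, 1].
d = det(M)
```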



Completeness and MVUEs

Completeness in the exponential class


Theorem
Let the joint density fX(x; θ) of the random sample (X1, . . . , Xn) belong
to the exponential class of densities with pdf

fX(x; θ) = exp[ ∑_{i=1}^k ci(θ) gi(x) + d(θ) + z(x) ].

If the range of [c1(θ), . . . , ck(θ)]′, θ ∈ Ω, contains an open k-dimensional
rectangle, then s(X) = [g1(X), . . . , gk(X)]′ is a complete sufficient
statistic for fX(x; θ), θ ∈ Ω.

Note: The condition that the range of [c1(θ), . . . , ck(θ)]′ contains an open
k-dimensional rectangle excludes cases where the ci(θ)s are linearly dependent.
For a random sample from a N(µ, σ²) distribution with (µ, σ²) ∈ R × R₊, for
example, the range of [c1(·), c2(·)]′ = [µ/σ², −1/(2σ²)]′ is the set R × R₋
and contains an open 2-dimensional rectangle.



Completeness and MVUEs

The Lehmann-Scheffé completeness theorem

If complete sufficient statistics exist for a statistical model
{fX(x; θ), θ ∈ Ω}, then an alternative to the CRLB approach is available
to identify the MVUE of q(θ).

Theorem
Let S = (S1 , . . . , Sr )0 be a complete sufficient statistic for f (x; θ). Let
T = t(S) be an unbiased estimator for the function q(θ). Then T = t(S)
is the MVUE of q(θ).

Proof: Use Blackwell-Rao suitably.



Completeness and MVUEs

Summing up

There are two possible procedures for identifying the MVUE for q(θ):
Find a statistic of the form t(S) such that E(t(S)) = q(θ).
Then t(S) is necessarily the MVUE of q(θ).
Find any unbiased estimator of q(θ), say t∗ (X ).
Then t(S) = E(t∗ (X )|S) is the MVUE of q(θ).

The condition is that S be a complete sufficient statistic.



Completeness and MVUEs

The Poisson example I


Example
Let (X1 , ..., Xn ) be a random sample from a Poisson distribution with pdf

f(x; θ) = e^{−θ} θ^x / x!  for x = 0, 1, 2, . . . ,  with E(X) = Var(X) = θ.
Find the MVUE of q(θ) = θ.

The joint pdf fX(x; θ) is a member of the exponential class of densities,

fX(x; θ) = ∏_{i=1}^n f(xi; θ) = e^{−nθ} θ^{∑_{i=1}^n xi} / ∏_{i=1}^n xi!
         = exp[ ln(θ) ∑_{i=1}^n xi − nθ − ∑_{i=1}^n ln(xi!) ],

with c(θ) = ln(θ) and g(x) = ∑_{i=1}^n xi.



Completeness and MVUEs

The Poisson example II


Hence, the statistic g(X) = ∑_{i=1}^n Xi is a complete sufficient statistic
for θ.
To identify the MVUE for θ,
find a function of the complete sufficient statistic ∑_{i=1}^n Xi
... whose expectation is θ.
Since X̄n = (1/n) ∑_{i=1}^n Xi is an unbiased estimator for E(X) = θ, it is
the obvious choice; so X̄n = (1/n) ∑_{i=1}^n Xi is the MVUE of θ.

But: Does the variance of the MVUE of θ, given by

Var(X̄n) = Var( (1/n) ∑_{i=1}^n Xi ) = θ/n,

attain the CRLB?


Completeness and MVUEs

The Poisson example III


The CRLB for the variance of an unbiased estimator T of q(θ) = θ is

Var(T) ≥ [∂q(θ)/∂θ]² / ( n E[ {∂ ln f(X; θ)/∂θ}² ] ),

where ln f(x; θ) = −θ + x ln(θ) − ln(x!), such that

{∂ ln f(x; θ)/∂θ}² = (−1 + x/θ)² = 1 − 2x/θ + x²/θ²,

E[ {∂ ln f(X; θ)/∂θ}² ] = 1 − 2 + (Var(X) + E(X)²)/θ² = 1/θ,

and
Var(T) ≥ 1 / (n · (1/θ)) = θ/n.

Thus, the variance of the MVUE X̄n = (1/n) ∑_{i=1}^n Xi of θ attains the CRLB.
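That Var(X̄n) = θ/n can be checked by simulation. The Python standard library has no Poisson sampler, so the sketch below draws Poisson variates with Knuth's multiplication method (the seed, θ, n and the replication count are arbitrary choices):

```python
import random
from math import exp
from statistics import mean, pvariance

def poisson_draw(theta, rng):
    """Knuth's algorithm: multiply uniforms until the product drops below e^-theta."""
    limit, k, prod = exp(-theta), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

rng = random.Random(0)
theta, n, reps = 2.0, 20, 20000
xbars = [mean(poisson_draw(theta, rng) for _ in range(n)) for _ in range(reps)]

# The simulated Var(Xbar_n) should be close to the CRLB theta / n = 0.1.
crlb = theta / n
```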



Up next

Outline

1 Sufficient statistics

2 The Blackwell-Rao theorem

3 Completeness and MVUEs

4 Up next



Up next

Coming up

Point estimation methods: Maximum Likelihood

