Lecture Notes
Chapter 2
Parameter Estimation
Table of Contents
Section Remark
1. Introduction to Parameter Estimation
2. "Ad-hoc" Parameter Estimation Method
3. Parametric Statistical Models
4. Formal Process of Parameter Estimation
5. Moments of a Random Variable
6. Constructing Good Estimators: Method of Moments
7. Constructing Good Estimators: Maximum Likelihood Method
8. Maximizing Likelihood Functions in the Case of One Unknown Parameter
9. Maximum Likelihood Method for Continuous Distributions
10. Non-Existence of MLEs
11. Non-Uniqueness of MLEs
12. Bias, Standard Error, Mean Squared Error
13. Sample Variance as Estimator for Variance
14. Estimated Standard Error
15. Criteria for Quality of Estimators
16. Consistency of Estimators
17. Consistency of Method of Moments Estimators
18. Consistency of MLEs
19. Best Unbiased Estimator
20. Expected Value and Variance of Score Function
21. Fisher Information: Additivity and Alternative Formula
22. Cramér-Rao Bound and Efficient Estimators
23. Asymptotic Properties of Estimators
24. Asymptotic Normality and Asymptotic Variance of MLEs
25. Review of Concept of Confidence Intervals
26. Pivotal Quantities and Construction of Exact Confidence Intervals
27. Exact Confidence Intervals for Exponential Distributions
28. Asymptotically Pivotal Quantities and Approximate Confidence Intervals
1) Introduction to Parameter Estimation
Conclusion:
Remaining task: Approximately find
This is the purpose of parameter estimation
Process of Point Estimation
Given are data x_1, …, x_n, viewed as realizations of i.i.d. random variables X_1, …, X_n
Here
1.2) Find Functions to Approximate Parameters
(by Law of Large Numbers)
Not bad!
Recall: ,
(1)
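The Law of Large Numbers argument above can be checked numerically. A minimal sketch (the distribution and sample sizes here are illustrative, not from the slides):

```python
import random

random.seed(0)

# By the Law of Large Numbers, the sample mean of i.i.d. Uniform(0, 1)
# draws approaches the true mean 0.5 as the sample size grows.
def sample_mean(n):
    return sum(random.random() for _ in range(n)) / n

for n in (10, 1000, 100000):
    print(n, sample_mean(n))
```

The printed averages drift toward 0.5, which is exactly why functions of sample averages make reasonable approximations of population parameters.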
Observations drawn from
Binomial( ) with unknown
12 13 7 10 6 11 12 6 7 10 10 9 10 15 11 14 9 8 9 12 10 16 16 13 5 11 7 6 9 19 7 16 9 11 14 9 8 13 16 8 7 11 10
10 7 12 11 13 12 18 8 8 12 16 11 14 7 7 10 13 4 8 7 6 8 9 12 12 11 14 8 3 8 9 12 9 7 10 8 10 8 5 10 9 13 10
10 8 12 6 8 9 11 10 14 7 7 9 8 12 12 7 9 5 9 9 10 6 8 10 12 13 13 11 10 15 9 9 12 12 10 12 9 6 8 12 3 7 9
11 8 7 10 10 8 15 9 11 12 11 6 7 9 7 8 11 13 3 12 10 11 9 6 11 13 14 10 10 10 8 18 7 15 11 7 6 10 7 10 7 7 13
7 9 14 12 7 14 10 15 13 12 7 5 14 13 8 8 9 8 9 8 9 8 8 7 12 14 12 6 8 12 6 4 9 10 11 14 7 6 13 9 10 10 8 8
7 9 9 10 8 10 9 13 10 11 13 11 7 9 6 11 8 13 13 9 16 15 9 11 6 11 13 12 12 16 8 10 10 17 7 9 11 9 9 10 12 8 12
3 12 6 10 10 12 14 5 10 3 9 9 14 10 12 14 7 13 11 15 8 4 7 8 8 14 8 11 9 11 8 10 11 8 9 10 11 14 9 13 7 6 8
9 11 7 9 7 9 10 5 9 11 11 7 8 11 3 13 9 8 9 5 11 15 11 4 8 11 14 13 8 14 10 5 10 13 12 5 17 10 8 14 10 10 11
13 8 16 9 10 11 9 6 6 12 10 12 12 9 8 11 11 16 12 8 7 11 12 14 10 16 10 9 8 11 15 11 7 10 8 12 12 9 12 11 9 8
11 9 10 12 7 11 12 12 10 12 6 8 9 13 5 4 15 11 10 10 11 10 12 15 12 11 11 7 8 14 7 14 9 9 7 12 6 10 10 6 12 11
10 9 9 11 11 11 4 12 13 11 11 10 8 7 8 14 9 12 12 13 13 4 10 8 8 10 6 10 16 12 13 10 8 12 9 13 11 9 8 7 8 6
18 6 9 6 10 12 10 9 12 10 7 11 6 8 4 11 9 9 16 10 8 10 8 12 11 13 11 14 14 5 11 7 11 7 9 10 10 9 10 12 12 8 8
9 11 12 9 9 14 8 11 10 12 11 11 12 15 10 11 16 7 11 14 15 12 10 9 12 11 8 17 11 13 10 7 10 12 12 9 11 7 9 11 13
10 10 13 9 6 12 9 12 8 9 14 11 5 8 13 13 15 8 11 5 9 5 14 14 11 14 10 11 11 9 13 10 8 12 7 9 11 7 7 7 12 10
10 9 6 11 13 11 10 10 12 12 9 9 11 12 13 9 7 7 8 8 8 9 13 10 11 10 8 7 10 10 9 11 6 12 8 10 14 8 11 8 17 18
11 9 9 11 7 12 12 16 15 7 7 6 14 15 8 5 9 12 10 7 9 13 8 7 11 11 15 9 8 11 8 9 6 5 11 6 8 5 11 15 7 8 6 7
9 10 10 5 4 4 10 9 7 7 5 6 8 9 4 11 12 13 9 9 8 10 9 9 10 11 11 7 13 13 12 5 10 6 11 10 9 11 12 10 8 11 9 8
14 8 14 7 8 6 16 10 8 7 11 13 12 12 12 9 6 10 7 10 9 12 14 7 7 8 8 6 8 13 8 12 11 8 12 10 7 7 12 13 10 10 13
5 13 13 9 15 10 10 11 13 13 11 9 12 10 12 7 9 10 6 9 16 9 16 8 11 7 13 4 7 10 11 18 10 12 4 10 9 15 11 9 8 15
10 9 16 11 7 6 10 6 9 12 14 11 11 9 10 15 8 13 11 8 9 10 8 12 8 11 9 7 10 6 10 7 9 10 9 7 10 9 6 8 12 11 13
8 14 8 8 7 12 10 15 7 7 10 9 8 8 12 14 4 12 2 10 10 10 13 4 12 12 9 9 12 11 9 13 6 13 8 8 10 9 9 10 13 9 14
12 7 2 9 12 7 10 15 14 7 12 10 12 10 12 9 13 12 7 14 10 13 13 6 12 9 7 6 7 13 6 6 10 7 11 15 9 15 11 12 9 11
16 9 13 10 10 11 12 11 6 9 10 10 9 10 10 5 8 10 12 10 14 7 5 7 14 9 9 10 7 14 10 9 12 12 13 8 7 10 12 12 13 7
10 12 17 9 7 8 12 12 13 11 10 9 6 14 13 13 11 11
Estimates for the Unknown Parameters Derived from the Observations
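The slide's estimation formulas did not survive extraction. Assuming a Binomial(n, p) model with both parameters unknown, the standard "ad-hoc" moment-matching estimates are p̂ = 1 − s²/x̄ and n̂ = x̄/p̂ (from mean = np and variance = np(1 − p)). A sketch on simulated data (the observation block above could be substituted):

```python
import random

random.seed(1)
# Simulated observations from Binomial(n = 20, p = 0.5)
data = [sum(random.random() < 0.5 for _ in range(20)) for _ in range(1000)]

def binomial_mom(xs):
    """Moment-matching estimates (n_hat, p_hat) for Binomial(n, p)."""
    m = len(xs)
    mean = sum(xs) / m
    var = sum((x - mean) ** 2 for x in xs) / m   # population-style variance
    p_hat = 1 - var / mean        # from mean = n*p and var = n*p*(1 - p)
    n_hat = mean / p_hat
    return n_hat, p_hat

n_hat, p_hat = binomial_mom(data)
print(n_hat, p_hat)
```

With 1000 simulated observations, the estimates land close to the true values n = 20 and p = 0.5.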
or
or
or
or (PDF / PMF)
Discrete and Continuous Models
Examples:
Discrete parametric model: i.i.d.
Continuous parametric model: i.i.d.
4) Formal Process of Parameter Estimation
Parameter Estimation
Given: observations (data) x_1, …, x_n, viewed as realizations of i.i.d. random variables X_1, …, X_n
Identify a suitable type of population distribution for the X_i's which depends on unknown parameters
Point estimation:
Find functions of X_1, …, X_n which approximate the unknown parameters
Substitute the data x_1, …, x_n into these functions to get estimates for the parameters
Estimators
Let θ̂_1, …, θ̂_m be functions of the sample which approximate the parameters θ_1, …, θ_m
These functions are called estimators for θ_1, …, θ_m
Estimators are random variables which only depend on
the sample , that is, they are statistics
Fact: If the k-th moment exists, then the j-th moment also exists
for all j ≤ k (this is proved in the theory of
Lebesgue integration)
Example
Let be a random variable with PDF for
and otherwise
…
Definition of Sample Moments
(to use the function moment, first install and load the
package “moments”)
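The `moment` function referenced here is presumably from R's `moments` package. The k-th sample moment m_k = (1/n) Σ x_i^k is simple to compute directly; a sketch in Python (not the lecture's own code):

```python
def sample_moment(xs, k):
    """k-th sample moment: the average of the k-th powers of the data."""
    return sum(x ** k for x in xs) / len(xs)

data = [1.0, 2.0, 3.0, 4.0]
print(sample_moment(data, 1))  # 2.5 (the sample mean)
print(sample_moment(data, 2))  # 7.5
```

The first sample moment is just the sample mean; higher moments feed directly into the method of moments below.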
Example
That is,
Method of Moments
Let be an i.i.d. sample drawn from a population
distribution , which depends on parameters
Compute moments:
Each which depends on , say
, provides one equation for
(namely, )
Continue until the system of equations ,
with variables has a unique solution
,…,
Method of Moments, continued
Let ,…, be the
unique solution of ,
Replace moments in expressions for by
sample moments to obtain estimators
Equations for :
Unique solution:
,
,
(method of moments estimators for Gamma distribution)
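The slide's formulas are garbled; assuming the usual shape/rate parameterization Gamma(α, λ) with mean α/λ and variance α/λ², solving the two moment equations gives α̂ = x̄²/σ̂² and λ̂ = x̄/σ̂². A sketch on simulated data:

```python
import random

def gamma_mom(xs):
    """Method-of-moments estimates (alpha_hat, lambda_hat) for a
    Gamma(alpha, lambda) model with mean alpha/lambda and
    variance alpha/lambda**2 (shape/rate parameterization)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n   # second central sample moment
    return mean ** 2 / var, mean / var

random.seed(2)
# Simulated sample: alpha = 2, rate = 3 (gammavariate takes shape and SCALE)
data = [random.gammavariate(2.0, 1.0 / 3.0) for _ in range(5000)]
alpha_hat, lambda_hat = gamma_mom(data)
print(alpha_hat, lambda_hat)
```

With 5000 observations the estimates land near the true α = 2 and λ = 3, matching the example that follows.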
Example (Including Data & Modeling)
Data:
(here and )
Statistical Model, continued
Data Gamma PDFs
should be suitable
Can use method of moments estimators for
Example, continued
,
If , then
(independence of ’s)
(identical distribution of ’s)
(definition of PMF)
is called a
likelihood function
Interpretation of Likelihood Function (Discrete Case)
is the probability
to observe under the assumption that
the true parameters are
Observations: , for
Likelihood function:
Example, continued
The maximum likelihood method picks the value of the parameter
that maximizes the likelihood function
is the maximizer
Examples:
for
for
If satisfies , then
is called a maximum likelihood estimator (MLE) for
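As a concrete illustration (the slide's own distribution did not survive extraction), assume a Poisson(λ) model. The log-likelihood can be maximized by a simple grid search and compared with the known closed-form MLE λ̂ = x̄:

```python
import math

data = [3, 1, 4, 2, 2, 5, 3, 0, 2, 3]   # illustrative count data

def log_likelihood(lam, xs):
    # log of the product of Poisson PMFs: sum of x*log(lam) - lam - log(x!)
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in xs)

# Grid search over candidate parameter values in (0, 10]
grid = [0.01 * k for k in range(1, 1001)]
mle = max(grid, key=lambda lam: log_likelihood(lam, data))
print(mle, sum(data) / len(data))  # grid MLE vs. the closed form x-bar
```

The grid maximizer agrees with x̄, illustrating that the MLE is the parameter value making the observed data most probable.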
MLE Maximization Problem
To compute the MLE, we need to solve the maximization
problem , where and is the
range of the parameters
Recall PMF of :
,
where and
For :
,
For :
,
Typical Properties of
Likelihood Functions
Take only positive values in the range of the parameters
Differentiable with respect to parameters
The limit of the likelihood function is zero whenever a parameter
approaches a boundary of its range
( dominates for )
Standard Conditions
Let be a function defined on a (finite or infinite) interval
such that
a) for all
c)
Since , there is
a finite sub-interval of such
that
Let be a maximizer of on
So
(independence of ’s)
Conclusion:
Probability to observe up to deviation is
approximately proportional to
Likelihood Function (Continuous Case)
Let i.i.d. where is a continuous
distribution with PDF
Observations:
Likelihood function:
If satisfies , then
is called a maximum likelihood estimator (MLE) for
MLEs for Continuous Distributions: Summary
The maximum likelihood method for continuous distributions
is the same as in the discrete case, just with PMF replaced
by PDF
The interpretation of the method in the continuous case,
however, is different (maximize the probability of approximately
observing the data)
The maximization problem can be solved with the same
tools as for discrete distributions, including Theorem 1 and
the logarithm trick
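For instance, assuming an Exp(λ) model with PDF λe^(−λx), the logarithm trick gives ℓ(λ) = n log λ − λ Σ x_i, which is maximized at λ̂ = n/Σ x_i = 1/x̄. A sketch (this example and its numbers are illustrative, not necessarily the slide's):

```python
import math

data = [0.4, 1.2, 0.7, 2.1, 0.3, 0.9]   # illustrative waiting times
n = len(data)
s = sum(data)

def log_likelihood(lam):
    # l(lambda) = n*log(lambda) - lambda * sum(x_i)
    return n * math.log(lam) - lam * s

lam_hat = n / s               # closed-form MLE: solve l'(lambda) = 0
grid = [0.001 * k for k in range(1, 5001)]
lam_grid = max(grid, key=log_likelihood)
print(lam_hat, lam_grid)
```

The grid search lands on (essentially) the same value as the calculus solution, confirming the closed form.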
Example
Let be i.i.d. where
PDF: for and otherwise
Likelihood function:
Need to solve
(c)
Likelihood function:
and otherwise
, ,
Bias
Estimator: (for )
Bias
for all
So is unbiased
Definition of Standard Error
Let be a population distribution, parameterized by a real
number and let be an estimator for
Standard error of :
The standard error measures how much the estimator is,
on average, off from the true parameter
Example
Let be i.i.d. with population distribution
,
Estimator: (for )
Note:
Connection between Bias, Variance, and MSE
MSE(θ̂) = Var(θ̂) + Bias(θ̂)²
Proof:
MSE(θ̂) = E[(θ̂ − θ)²] = E[(θ̂ − E[θ̂] + E[θ̂] − θ)²]
= E[(θ̂ − E[θ̂])²] + (E[θ̂] − θ)²  (the cross term vanishes since E[θ̂ − E[θ̂]] = 0)
= Var(θ̂) + Bias(θ̂)²  (by definition of variance and bias)
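The decomposition of the MSE into variance plus squared bias can be verified empirically. A sketch with a deliberately biased estimator of a normal mean (all numbers illustrative):

```python
import random

random.seed(3)
mu, n, reps = 2.0, 5, 20000

# Deliberately biased estimator of the mean: divide by n + 1 instead of n.
def estimator(xs):
    return sum(xs) / (len(xs) + 1)

samples = [estimator([random.gauss(mu, 1.0) for _ in range(n)])
           for _ in range(reps)]

mean_est = sum(samples) / reps
var_est = sum((t - mean_est) ** 2 for t in samples) / reps
bias = mean_est - mu
mse = sum((t - mu) ** 2 for t in samples) / reps
print(mse, var_est + bias ** 2)   # the two numbers agree
```

The agreement is not a coincidence of the simulation: the empirical quantities satisfy the same algebraic identity as their theoretical counterparts.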
Recall:
Solution:
Replace the unknown parameter by its estimator in the expression
for the standard error to obtain the estimated standard error:
Example (Textbook Page 261)
Experiment: Asbestos suspended in water and spread on a
filter
Counts of asbestos fibers on 23 equally sized samples
from the filter:
31, 29, 19, 18, 31, 28, 34, 27, 34, 30, 16, 18,
26, 27, 27, 18, 24, 22, 28, 24, 21, 17, 24
• The size of the area is the same for all samples and the counts
on the samples are approximately independent
Estimator:
Substitute data
SE of :
Estimated SE:
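Assuming (as the counts suggest) a Poisson(λ) model for the fiber counts, the estimate is λ̂ = x̄ and the estimated standard error is √(λ̂/n). A sketch with the 23 counts above:

```python
import math

counts = [31, 29, 19, 18, 31, 28, 34, 27, 34, 30, 16, 18,
          26, 27, 27, 18, 24, 22, 28, 24, 21, 17, 24]
n = len(counts)

lam_hat = sum(counts) / n          # estimate of lambda: the sample mean
est_se = math.sqrt(lam_hat / n)    # estimated SE of X-bar under Poisson(lambda)
print(round(lam_hat, 2), round(est_se, 2))
```

This gives λ̂ ≈ 24.91 fibers per sample with an estimated standard error of about 1.04.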
Interpretation of Estimated SE
We have and
Hence,
15) Criteria for Quality of Estimators
Main Idea for Precise Estimators
for all .
If and , then
Have shown:
If holds, then for all
So that we have
How to Achieve
Example: i.i.d. , with both
unknown.
Now,
How to Achieve
How to Achieve
This usually is automatically satisfied for reasonable
estimators, because they are often based on averages, and
the variance of an average tends to zero as the sample
size increases
Example: i.i.d.
is an estimator for
Constructing Precise Estimators (Summary)
Goals: and
for all
by
Suppose that
(a) Bias(θ̂_n) → 0 as n → ∞, and
(b) Var(θ̂_n) → 0 as n → ∞.
Then θ̂_n is consistent.
Indeed, MSE(θ̂_n) = Var(θ̂_n) + Bias(θ̂_n)² → 0 (the cross term
vanishes as before), and convergence in mean square implies
convergence in probability, for all θ.
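The two conditions can be seen at work numerically for the sample mean: the empirical MSE shrinks roughly like 1/n as the sample size grows (distribution and sizes here are illustrative):

```python
import random

random.seed(4)
mu = 1.0

def empirical_mse(n, reps=1000):
    # Monte Carlo estimate of E[(X-bar - mu)^2] for sample size n
    total = 0.0
    for _ in range(reps):
        xbar = sum(random.gauss(mu, 1.0) for _ in range(n)) / n
        total += (xbar - mu) ** 2
    return total / reps

mses = [empirical_mse(n) for n in (10, 100, 1000)]
print(mses)  # decreasing with n, roughly 1/n
```

Since X̄ is unbiased and Var(X̄) = σ²/n → 0, both conditions hold and the MSE vanishes, which is consistency in action.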
Recall: Method of Moments (for Single Unknown Parameter)
Let i.i.d. sample drawn from
2) MME: where
Consistency of MMEs (for Single Unknown Parameter)
Theorem 1
We know:
a)
b)
c)
d)
e) is continuous
Proof of Theorem 1, continued
Step 1: Apply Law of Large Numbers to
for at all
Sub-additivity of probability
Hence,
i.i.d. with
Likelihood function:
Let and
Important question:
How much information about can observations
contain?
That is, how precise can an estimator for based on
be?
Preparations:
, , (previous lecture)
Note that
Fisher Information: Summary of Relevant Expressions
Let i.i.d. with likelihood function
Log-likelihood function: ℓ(θ) = ln L(θ)
Score function: S(θ) = ∂ℓ(θ)/∂θ
Fisher Information: I(θ) = Var(S(θ))
Condition
The following condition is useful for several identities
involving the Fisher Information.
(for PMF)
(for PDF)
(for PMF)
Recall: Score Function and Fisher Information
Likelihood function: L(θ)
Score function: S(θ) = ∂/∂θ ln L(θ)
Fisher information: I(θ) = Var(S(θ))
Our goal now is to learn more about the score function. This will
help simplify the computation of the Fisher information in many
cases.
Expected Value of Score Function
Consider first the case of sample size n = 1:
Expected Value of Score Function, continued
If the condition above holds, then E[S(θ)] = 0
Thus,
Variance of Score Function
Finally,
Score function:
(a)
(b)
21) Fisher Information: Additivity and Alternative Formula
Score function:
(a)
(b)
Additivity of Fisher Information
Suppose the condition holds. Then I_n(θ) = n · I_1(θ)
(definition)
(independence of ’s)
Additivity of Fisher Information
Then
(independence of ’s)
If the condition holds, then I(θ) = −E[ℓ″(θ)].
(for PDF)
(for PMF)
Alternative Formula (Proof)
To prove,
We notice that
and,
Alternative Formula (Proof)
So we have,
Therefore,
Example
i.i.d. with
PMF: ,
Example
Additivity,
22) Cramér-Rao Bound and Efficient Estimators
For a “good” estimator for , a basic requirement is to be
unbiased
Then
Note:
If we let and
and,
Example:
Let i.i.d. with
PMF: ,
We know, and
1) is unbiased
2) Equality holds in the Cramér-Rao bound for , that is,
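The slide's distribution did not survive extraction; as a sketch, assume a Poisson(λ) model, where I_n(λ) = n/λ, so the Cramér-Rao bound for unbiased estimators is λ/n, and X̄ attains it (X̄ is efficient). A simulation check:

```python
import math
import random

random.seed(5)
lam, n, reps = 4.0, 50, 5000

def poisson_sample(lam):
    """Draw one Poisson(lam) variate (Knuth's method; fine for moderate lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

# Empirical variance of X-bar over many repetitions vs. the bound lam/n
xbars = [sum(poisson_sample(lam) for _ in range(n)) / n for _ in range(reps)]
mean_xbar = sum(xbars) / reps
var_xbar = sum((x - mean_xbar) ** 2 for x in xbars) / reps
print(var_xbar, lam / n)
```

The empirical variance of X̄ matches the Cramér-Rao bound λ/n = 0.08 up to Monte Carlo error, illustrating efficiency.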
Value of
Behavior of
Conclusion:
Broad applicability
24) Asymptotic Normality and Asymptotic Variance of MLEs
converges to in distribution.
Theorem 1: √n (θ̂_n − θ) converges to N(0, 1 / I_1(θ))
in distribution
When Theorem 1 Cannot be Applied
Estimator:
In previous example:
and
When
When
means that we expect that roughly 100(1 − α)% of these
intervals contain the true parameter
Then the interval is called an
exact confidence interval for the parameter.
The probability is called the confidence level of the
interval
Motivation
So, we need to find and with
for all
Possible idea:
Suppose we are given an estimator for the parameter which only
takes positive values
for all
Motivation, continued
for all
for all
note that and are constants
is a pivotal quantity
( )
Conclusion:
CDF:
4) Then is an exact
confidence interval for
Construction of Pivotal Quantities
The essential step in the construction of confidence intervals
is to come up with pivotal quantities
CDF of is
(see Tutorial Problems)
CDF of is
CDF of is for
CDF of is for
is a pivotal quantity
Example for Construction of Confidence Interval from Pivotal Quantity
We have:
is a pivotal quantity for i.i.d.
CDF of is for
Example of observations: :
Note
Take
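As an illustration of the construction (a sketch; the lecture's exact quantile values are not reproduced here): for i.i.d. Exp(λ) data, T = λ · Σ X_i has a Gamma(n, 1) distribution for every λ, so it is pivotal, and its quantiles can be approximated by simulation to build an exact 95% confidence interval:

```python
import random

random.seed(6)

data = [0.8, 0.3, 1.9, 0.6, 1.1, 0.2, 0.9, 1.4]   # illustrative Exp(lambda) data
n, s = len(data), sum(data)

# T = lambda * sum(X_i) ~ Gamma(n, 1): simulate its 2.5% and 97.5% quantiles
sims = sorted(sum(random.expovariate(1.0) for _ in range(n))
              for _ in range(100000))
lo_q, hi_q = sims[int(0.025 * len(sims))], sims[int(0.975 * len(sims))]

# Invert lo_q <= lambda * s <= hi_q to get a 95% confidence interval for lambda
ci = (lo_q / s, hi_q / s)
print(ci)
```

Note the interval straddles the MLE n/Σ x_i, and its exactness comes entirely from the fact that the distribution of T does not depend on λ.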
MLE:
MLE:
and
and
Mean and variance of both do not depend on
potentially is a pivotal quantity
Tutorial Problems 2:
(does not depend on )
is a pivotal quantity
Confidence Intervals for , continued
Have shown: it is a pivotal quantity
Next step: Find with
is defined by
Confidence Intervals for , continued
Have shown:
, / , /
is confidence interval for
28) Asymptotically Pivotal Quantities and Approximate Confidence Intervals
Motivation
i.i.d. . Can we find a pivotal quantity
based on ?
MLE:
,
if is large
(see slide on )
is an approximate 100(1 − α)% confidence interval
Then
/ /
is an approximate 100(1 − α)% confidence interval
Case of polls: ,
is an approximate 95% confidence interval for the proportion
The bounds depend on
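The poll case can be sketched as follows: the approximate 95% interval is p̂ ± 1.96 · √(p̂(1 − p̂)/n), where both bounds depend on the data through p̂ (the sample numbers below are illustrative):

```python
import math

def approx_ci_proportion(successes, n, z=1.96):
    """Approximate 95% confidence interval for a proportion p, based on
    the asymptotically pivotal quantity (p_hat - p) / SE(p_hat)."""
    p_hat = successes / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

lo, hi = approx_ci_proportion(520, 1000)   # e.g. 520 of 1000 respondents
print(round(lo, 3), round(hi, 3))
```

For 520 successes out of 1000 this gives roughly (0.489, 0.551), the familiar "margin of error" reported with polls.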