Lecture 7: Estimation


Estimation of Parameters

Parameters: In statistics, parameters are the values that completely define a PDF or PMF.
For example, in X ~ N(μ, σ²), we can specify the complete PDF if we know μ and σ².
So in the normal distribution, μ and σ² are the parameters.
Distribution            Parameters
X ~ Binomial(n, p)      n, p
X ~ Poisson(λ)          λ
X ~ Exponential(λ)      λ
X ~ N(μ, σ²)            μ, σ²
X ~ Uniform(a, b)       a, b

As long as we know the parameters, we know exactly how the random variable behaves,
and, if we want to, we can calculate specific probabilities.
Parameters also specify characteristics of the population: if μ1 is greater than μ2 in a
normal distribution, we know that group 1 has a bigger mean than group 2.
In most cases, we do not know the exact values of the parameters. Just think about the
population of all high school students: we cannot calculate their mean height directly.
So we estimate it.
The question is how to estimate parameters and, if there are different ways to
estimate them, how to judge which one is better than the others.
Point Estimation and Interval Estimation
A point estimation estimates a parameter by a single value, while an interval
estimation estimates an interval that may contain the true value of the parameter.
ex)
μ = 170           point estimation
165 < μ < 175     interval estimation

Criteria of a good estimator

An estimator is a function of the sample values used to estimate a parameter.
For example, a sample mean is a (linear) function of the sample values.
(1) Unbiased Estimator
(2) Consistent Estimator
(3) Efficient Estimator

ex 1) Suppose we use the sample mean x̄ as an estimator of the parameter μ.
Then we know that x̄ is a linear function of the sample values
because x̄ = (x1 + x2 + ... + xn) / n.

ex 2) The sample variance s², an estimator of the population variance σ², is a
function (not a linear function, though) of the samples:

s² = Σ(xi - x̄)² / (n - 1)

Unbiased Estimator: If the expected value of an estimator equals the parameter,
i.e. E(estimator) = parameter, then this estimator is an unbiased estimator of the
parameter.

ex) E(x̄) = E((x1 + x2 + ... + xn) / n) = (1/n)(E(x1) + E(x2) + ... + E(xn))
= (1/n)(n · μ) = μ. Therefore x̄ is an unbiased estimator of μ.
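As a quick numerical check (a sketch that is not part of the lecture; μ = 170 and σ = 10 are made-up population values), the average of many sample means should land very close to μ:

import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, trials = 170.0, 10.0, 25, 100_000

# Draw many samples of size n; record the sample mean of each.
sample_means = rng.normal(mu, sigma, size=(trials, n)).mean(axis=1)

print(sample_means.mean())  # close to 170, illustrating E(x̄) = μ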

Consistent Estimator: A consistent estimator approaches the true value of the parameter
probabilistically as the sample size increases.
ex) x̄ is a consistent estimator of the parameter μ: E(x̄) = μ and Var(x̄) = σ² / n.
So, as n gets bigger, the variance of x̄ becomes smaller. This means that x̄ is
approaching the true value of μ in a stochastic sense.

x̄ is a consistent estimator: its variance gets smaller as n (the size of the sample) increases.

Efficient Estimator: Between two unbiased estimators, the one with the smaller
variance is the relatively efficient estimator.

x̄ from n = 20 is a relatively efficient estimator compared to x̄ from n = 10, because
the variance of x̄ is σ²/n,
and σ²/20 is smaller than σ²/10.
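A small simulation (a sketch, not from the lecture; μ = 0 and σ = 10 are assumed values) confirms Var(x̄) = σ²/n, so the n = 20 estimator indeed has the smaller variance:

import numpy as np

rng = np.random.default_rng(1)
mu, sigma, trials = 0.0, 10.0, 200_000

for n in (10, 20):
    means = rng.normal(mu, sigma, size=(trials, n)).mean(axis=1)
    # Empirical variance of x̄ versus the theoretical sigma**2 / n.
    print(n, round(means.var(), 3), sigma**2 / n)  # ~10.0 for n = 10, ~5.0 for n = 20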

Note: x̄ = (x1 + x2)/2 is the least-variance linear estimator of μ (for n = 2).

Proof: Let c = w1·x1 + w2·x2, with w1 + w2 = 1, be another linear estimator of μ.
We can show that E(c) = μ. The variance of c is (w1² + w2²)σ².
Substituting w2 = 1 - w1 and setting the derivative with respect to w1 to zero shows
that this variance is minimized when w1 = w2 = 1/2.
Why the formula for the sample variance is

s² = Σ(xi - x̄)² / (n - 1)

and not Σ(xi - x̄)² / n:

It can be shown that E(Σ(xi - x̄)²) = (n - 1)σ². Therefore

s² = Σ(xi - x̄)² / (n - 1) is an unbiased estimator of σ².

If we use n instead of n - 1, we underestimate σ².
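A short simulation (a sketch; σ² = 4 and n = 5 are made-up values) makes the underestimation visible: averaged over many samples, the n-divisor estimate falls below σ² by the factor (n - 1)/n, while the (n - 1)-divisor estimate centers on σ²:

import numpy as np

rng = np.random.default_rng(2)
sigma2, n, trials = 4.0, 5, 200_000

x = rng.normal(0.0, np.sqrt(sigma2), size=(trials, n))
print(x.var(axis=1, ddof=1).mean())  # divides by n - 1: ~4.0 (unbiased)
print(x.var(axis=1, ddof=0).mean())  # divides by n:     ~3.2 = 4 * (n-1)/n (biased low)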

Summary for the Point Estimation

(1) The sample mean x̄ is an unbiased estimator of μ.
(2) x̄ is a consistent estimator of μ.
(3) The bigger the sample size n is, the more efficient the estimator x̄ becomes.
(4) s² is an unbiased estimator of the population variance σ².


Interval Estimation: Confidence Interval Estimation of μ

We know that if X ~ N(μ, σ²), then

Z = (x̄ - μ) / (σ/√n)

follows the standard normal distribution N(0, 1).

So, P(-a < Z = (x̄ - μ) / (σ/√n) < a) = p.

If we rearrange the inequality in terms of μ:

x̄ - a·σ/√n < μ < x̄ + a·σ/√n

This is the interval that will contain the true μ with the confidence (probability)
of p × 100 %.
Suppose p = 0.95. If we find the value a for p = 0.95:

a = NORM.INV(0.95 + (1 - 0.95)/2, 0, 1) = 1.959964

So,

P(x̄ - 1.96·σ/√n < μ < x̄ + 1.96·σ/√n) = 0.95    .. (1)

From this, we can define μ's 95 % confidence interval as:

(x̄ - 1.96·σ/√n, x̄ + 1.96·σ/√n)

For the 90% confidence interval, a = 1.64, and for 99 %, a is 2.58.


Therefore, μ's confidence intervals x̄ ± a·σ/√n use:

Confidence    a
90%           1.644854
95%           1.959964
99%           2.575829
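The same interval can be computed outside Excel. A minimal Python sketch (the values x̄ = 170, σ = 10, n = 25 are made-up numbers for illustration):

import math
from scipy.stats import norm

xbar, sigma, n, p = 170.0, 10.0, 25, 0.95
a = norm.ppf(p + (1 - p) / 2)                # same value as NORM.INV(0.975, 0, 1): 1.959964...
half_width = a * sigma / math.sqrt(n)
print(xbar - half_width, xbar + half_width)  # 95% confidence interval for mu: (166.08, 173.92)

Changing p to 0.90 or 0.99 reproduces the other two rows of the table.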

So far we have assumed that the population variance σ² is known.
If it is not known, we have to use the t distribution.
If we replace σ with s (the sample standard deviation), then

t = (x̄ - μ) / (s/√n)

follows the t distribution with degree of freedom n - 1.

Using the t distribution, we can calculate the p × 100 % confidence interval of μ as:

x̄ - t·s/√n < μ < x̄ + t·s/√n

For example, with df = 10 (e.g. t = T.INV(0.9 + (1 - 0.9)/2, 10) for the 90% interval):

Confidence    t (df = 10)
90%           1.812461
95%           2.228139
99%           3.169273
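As a cross-check in Python rather than Excel (the eleven height values below are made up, chosen so that df = n - 1 = 10):

import numpy as np
from scipy.stats import t

data = np.array([172, 168, 175, 169, 171, 166, 174, 170, 173, 167, 172])  # hypothetical sample, n = 11
n, p = len(data), 0.90
xbar = data.mean()
s = data.std(ddof=1)                     # sample standard deviation (divides by n - 1)
tval = t.ppf(p + (1 - p) / 2, df=n - 1)  # same value as T.INV(0.95, 10): 1.812461...
half_width = tval * s / np.sqrt(n)
print(xbar - half_width, xbar + half_width)  # 90% confidence interval for mu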

Error and Sample Size


Error:

From (1), if we rearrange the inequality in terms of x̄ - μ, we get

-1.96·σ/√n < x̄ - μ < 1.96·σ/√n

If we define |x̄ - μ| as the error of estimation, then 1.96·σ/√n is the
upper limit of the error.

So, when we set the upper limit as a given value, say b,

b = 1.96·σ/√n, and n = ((1.96·σ) / b)²

For example, when σ = 10, to be 95% confident that the estimation error should be
less than 5, the sample size should be at least 16:

n = ((1.96 × 10) / 5)² = 15.3664 → 16        =((1.96*10)/ 5)^2
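The same calculation as a small Python helper (a sketch; the function name sample_size is my own, not from the lecture):

import math
from scipy.stats import norm

def sample_size(sigma, b, p=0.95):
    # Smallest n with a * sigma / sqrt(n) <= b, where a is the z value for confidence p.
    a = norm.ppf(p + (1 - p) / 2)
    return math.ceil((a * sigma / b) ** 2)

print(sample_size(10, 5))  # 16, matching the example above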



Exercise
1. Prove that x̄ = (1/n)(x1 + ... + xn) is the most efficient estimator among all
estimators that are linear functions of the n samples.
Hint: Let c = w1·x1 + w2·x2 + ... + wn·xn, where w1 + w2 + ... + wn = 1.
Calculate the variance of c, then minimize it subject to w1 + w2 + ... + wn = 1.

