Estimation
Part I - Estimation
Chapter 1 Estimation
Disclaimer
Probability vs Statistics
The theory of probability provides us with tools to describe events that are
random (e.g. throwing a die, flipping a coin, tomorrow’s temperature, patterns
of a boss in a video game)
Statistics consists in learning about these random events using data
Data are often, if not always, obtained as a sample taken from a population
of interest
A sample of the population of all Canadian adults
A sample of the population of all SFU students
A sample of the population of all firms
We need other tools to understand what data can tell us
Outline
Population vs samples
Estimators
Desirable properties
Sampling distribution
Moments of a sample proportion
Moments of a sample mean
Asymptotic results
Suggested reading: Chapter 3 in Stock and Watson
Population values vs sample values
Sample values: Example
Random sampling
The way a sample is drawn from a population will affect the properties of an
estimator
If observations are correlated and we don’t know it, we might fail to pick up
that correlation and our estimates might be biased or inconsistent
We also want our sample to be representative of the population we are
interested in
If we collect a sample by calling people at their house between 10 am and 3 pm,
our sample won’t be representative as it misses all the people who are not at
home at those times. The sample will be biased towards people who work from
home or don’t work
The easiest way to obtain a representative sample is to draw randomly: each member of the population has an equal chance of being selected
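As a rough illustration, here is a minimal Python sketch (entirely hypothetical population and numbers) contrasting a simple random sample with a "phoned at home between 10 am and 3 pm" sample drawn from the same population:

```python
# Minimal sketch with a hypothetical population: simple random sampling
# vs. a sample that only reaches people who are at home during the day.
import numpy as np

rng = np.random.default_rng(0)

n_pop = 100_000
at_home = rng.binomial(1, 0.3, size=n_pop)   # 1 = reachable at home midday (30%)
# Hypothetical incomes; people at home midday earn less on average here.
income = 50_000 + 20_000 * rng.standard_normal(n_pop) - 10_000 * at_home

# Simple random sample: every member has an equal chance of being selected.
srs = rng.choice(n_pop, size=1_000, replace=False)
print("random sample mean income:", income[srs].mean())

# Non-representative sample: only people at home between 10 am and 3 pm answer.
home_only = rng.choice(np.flatnonzero(at_home == 1), size=1_000, replace=False)
print("biased sample mean income:", income[home_only].mean())

print("population mean income:   ", income.mean())
```

The biased sample systematically misstates the population mean because it over-represents one group.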
Estimators
Estimators as random objects
Desirable properties
Desirable properties of estimators
Definition: Consistency
An estimator θ̂ of a nonrandom quantity θ is consistent (or asymptotically
unbiased) if it converges in probability towards the value it estimates:
$$\hat{\theta} \xrightarrow{P} \theta$$
where $\xrightarrow{P}$ denotes convergence in probability, i.e. $\forall \varepsilon > 0,\ P\left(|\hat{\theta} - \theta| > \varepsilon\right) \to 0$ as $n \to \infty$
Definition: Unbiasedness
An estimator θ̂ of a nonrandom quantity θ is unbiased if on average, it equals the
true value of the quantity of interest:
$$E[\hat{\theta}] = \theta$$
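As a rough check of both definitions, here is a small simulation sketch (hypothetical numbers), using the sample mean of exponential draws as the estimator of its expectation µ:

```python
# Minimal simulation sketch: unbiasedness and consistency of the sample mean.
import numpy as np

rng = np.random.default_rng(1)
mu = 2.0  # true (nonrandom) quantity being estimated

# Unbiasedness: over many repeated samples of fixed size n = 50,
# the estimates average out to the true value mu.
estimates = [rng.exponential(mu, size=50).mean() for _ in range(10_000)]
print("average of the estimates (n = 50):", np.mean(estimates))  # close to 2.0

# Consistency: a single estimate gets closer to mu as the sample size grows.
for n in (10, 1_000, 100_000):
    print(f"n = {n:>6}: estimate = {rng.exponential(mu, size=n).mean():.4f}")
```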
Desirable properties of estimators (cont’d)
Many (most) estimators are biased. For some, we can have an idea of the bias
direction
But consistency makes them reliable, provided the sample used is big enough: if an estimator is consistent, the bigger the sample, the more accurate the estimator and the closer the estimate to the true value
Asymptotic normality allows us to make inference about the true value of the parameter, i.e. to draw probabilistic conclusions about the true parameter, such as (not) rejecting hypotheses about the true value or building confidence intervals around the true value
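For example, a minimal sketch (hypothetical data) of one such piece of inference: an approximate 95% confidence interval for a population mean, built from the normal approximation:

```python
# Minimal sketch: approximate 95% confidence interval for a mean,
# justified by asymptotic normality of the sample mean.
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(2.0, size=500)       # hypothetical i.i.d. sample, true mean 2.0

x_bar = x.mean()
se = x.std(ddof=1) / np.sqrt(len(x))     # estimated standard error of the mean
ci = (x_bar - 1.96 * se, x_bar + 1.96 * se)  # 1.96 ≈ 97.5% quantile of N(0, 1)
print("approximate 95% CI for the mean:", ci)
```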
Sampling distributions
Sampling distributions of estimators
Sampling distribution of estimators (cont’d)
Sampling distribution of a proportion
Consider a population of people. Some are smokers, others aren’t. Let p be the
true proportion of smokers
We would like to know p, but we can’t observe everyone…
Consider an i.i.d. sample taken from that population
We would like to estimate p. Any guess?
A sensible estimator is the sample proportion p̂
Why is it sensible?
Moments of a sample proportion: Expectation
Let $X_i$ be the random variable equal to 1 if observation $i$ is a smoker, and 0 otherwise. Since there is a proportion $p$ of smokers in the population, $E[X_i] = p$ and $V[X_i] = p(1-p)$ ($X_i$ is a Bernoulli r.v.!)
Then the sample proportion is $\hat{p} = \frac{1}{n}\sum_{i=1}^{n} X_i$:
$$
\begin{aligned}
E[\hat{p}] &= E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] \\
&= \frac{1}{n}\sum_{i=1}^{n} E[X_i] && \text{(by linearity of the expectation)} \\
&= \frac{1}{n}\sum_{i=1}^{n} p && \text{(by the i.i.d. assumption, “identically distributed” part)} \\
&= p
\end{aligned}
$$
Moments of a sample proportion: Variance
$$
\begin{aligned}
V[\hat{p}] &= V\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] \\
&= \frac{1}{n^2}\sum_{i=1}^{n} V[X_i] && \text{(by the i.i.d. assumption, “independently distributed” part)} \\
&= \frac{1}{n^2} \times n\, V[X_i] && \text{(by the i.i.d. assumption, “identically distributed” part)} \\
&= \frac{1}{n^2} \times n \times p(1-p) \\
&= \frac{p(1-p)}{n}
\end{aligned}
$$
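A quick simulation sketch (hypothetical values p = 0.25 and n = 100) can verify both moments: across many repeated samples, the mean of p̂ is close to p and its variance is close to p(1 − p)/n:

```python
# Minimal simulation sketch: moments of the sample proportion.
import numpy as np

rng = np.random.default_rng(3)
p, n = 0.25, 100

# Many i.i.d. samples of n Bernoulli(p) draws; one p_hat per sample.
p_hats = rng.binomial(1, p, size=(20_000, n)).mean(axis=1)

print("mean of p_hat:", p_hats.mean(), " theory:", p)
print("var  of p_hat:", p_hats.var(), " theory:", p * (1 - p) / n)
```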
Moments of a sample average
Moments of a sample average: Expectation
$$
\begin{aligned}
E[\bar{X}] &= E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] \\
&= \frac{1}{n}\sum_{i=1}^{n} E[X_i] && \text{(by linearity of the expectation)} \\
&= \frac{1}{n}\sum_{i=1}^{n} \mu && \text{(by the i.i.d. assumption, “identically distributed” part)} \\
&= \mu
\end{aligned}
$$
Moments of a sample average: Variance
$$
\begin{aligned}
V[\bar{X}] &= V\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] \\
&= \frac{1}{n^2}\sum_{i=1}^{n} V[X_i] && \text{(by the i.i.d. assumption, “independently distributed” part)} \\
&= \frac{1}{n^2}\sum_{i=1}^{n} \sigma^2 && \text{(by the i.i.d. assumption, “identically distributed” part)} \\
&= \frac{1}{n^2} \times n \times \sigma^2 \\
&= \frac{\sigma^2}{n}
\end{aligned}
$$
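The same kind of simulation sketch (hypothetical values µ = 5, σ = 3, n = 50, with deliberately non-normal Xi) verifies E[X̄] = µ and V[X̄] = σ²/n:

```python
# Minimal simulation sketch: moments of the sample average.
import numpy as np

rng = np.random.default_rng(4)
mu, sigma, n = 5.0, 3.0, 50

# Many i.i.d. samples of size n from a shifted/scaled exponential
# (mean mu, variance sigma^2, clearly not normal).
x = mu + sigma * (rng.exponential(1.0, size=(20_000, n)) - 1.0)
x_bars = x.mean(axis=1)

print("mean of X_bar:", x_bars.mean(), " theory:", mu)
print("var  of X_bar:", x_bars.var(), " theory:", sigma**2 / n)
```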
Comments
Asymptotic results
The Law of large numbers
Ideally, the bigger the sample, the closer our estimate is to the true value, since we are using more and more data
A big sample then allows us to use an estimator that is biased but consistent, since it becomes less and less biased as the sample size increases
Theorem 1: Weak law of large numbers
Let $\{X_i\}_{i=1}^{n}$ be a sequence of i.i.d. random variables with finite expectation $\mu$. Then the average $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ converges to $\mu$ in probability, i.e.:
$$\bar{X} \xrightarrow{P} \mu$$
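A minimal sketch of the theorem in action (hypothetical Bernoulli(0.5) draws, i.e. fair coin flips): the running average approaches µ = 0.5 as n grows:

```python
# Minimal sketch: weak law of large numbers with fair coin flips.
import numpy as np

rng = np.random.default_rng(5)
flips = rng.binomial(1, 0.5, size=1_000_000)
running_mean = np.cumsum(flips) / np.arange(1, flips.size + 1)

for n in (10, 1_000, 1_000_000):
    print(f"X_bar after n = {n:>9} draws: {running_mean[n - 1]:.5f}")
```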
The Law of large numbers (cont’d)
The central limit theorem
The central limit theorem (cont’d)
Asymptotic distribution of a sample proportion and mean
Asymptotic distribution of the sample mean: known vs unknown variance
Asymptotic distribution of a sample proportion and mean
The previous two theorems mean that when the sample size is big enough, we can approximate the sampling distributions of the sample proportion and the sample mean in the following way:
$$\sqrt{n}\,\frac{\hat{p} - p}{\sqrt{p(1-p)}} \xrightarrow{d} N(0, 1)$$
$$\sqrt{n}\,\frac{\bar{X} - \mu}{\sigma} \xrightarrow{d} N(0, 1)$$
$$\sqrt{n}\,\frac{\bar{X} - \mu}{s} \xrightarrow{d} t_{n-1}$$
Even if the $X_i$ themselves are not normally distributed!
That will be useful for inference
Note: If we knew the distribution of the $X_i$ were normal, the exact sampling distribution of the sample mean would also be normal, and there would be no need to use the asymptotic distribution to approximate it
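A simulation sketch of the first approximation above (hypothetical p = 0.2, n = 200): even though the Xi are Bernoulli, far from normal, the standardized sample proportion behaves almost like a N(0, 1) draw:

```python
# Minimal simulation sketch: CLT approximation for the sample proportion.
import numpy as np

rng = np.random.default_rng(6)
p, n = 0.2, 200

x = rng.binomial(1, p, size=(20_000, n))            # Bernoulli draws, not normal
z = np.sqrt(n) * (x.mean(axis=1) - p) / np.sqrt(p * (1 - p))

# Compare a few quantiles of z with standard normal quantiles.
for q, z_normal in ((0.05, -1.645), (0.50, 0.0), (0.95, 1.645)):
    print(f"q = {q:.2f}: simulated {np.quantile(z, q):+.3f}   N(0,1) {z_normal:+.3f}")
```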