Intro Bootstrap 341
Machelle D. Wilson
Outline
Why the Bootstrap?
Limitations of traditional statistics
How Does it Work?
The Empirical Distribution Function and the Plug-in
Principle
Accuracy of an estimate: Bootstrap standard error and
confidence intervals
Examples
How Good is the Bootstrap?
Limitations of Traditional Statistics:
Problems with distributional assumptions
Often data cannot safely be assumed to come
from an identifiable distribution.
Sometimes the distribution of the statistic
is mathematically intractable, even when
distributional assumptions can be made.
Hence the bootstrap often provides a
superior alternative to parametric
statistics.
An example data set
[Figure: plots of the example data set. Red lines = bootstrap (BS) CI; black lines = normal CI.]
The Bootstrap Solution
In general, the number of distinct
bootstrap samples, $C_n$, is $C_n = \binom{2n-1}{n-1}$.
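As a quick check of the counting formula, the sketch below evaluates $C_n$ and compares it with a brute-force enumeration for small $n$. It assumes the $n$ data values are all distinct and that two bootstrap samples count as the same when they contain the same values regardless of order; the function names are illustrative only.

```python
from itertools import combinations_with_replacement
from math import comb


def n_distinct_bootstrap_samples(n: int) -> int:
    """Closed form C_n = (2n - 1 choose n - 1)."""
    return comb(2 * n - 1, n - 1)


def brute_force_count(n: int) -> int:
    """Enumerate every unordered size-n sample drawn with replacement
    from n distinct items and count them."""
    return sum(1 for _ in combinations_with_replacement(range(n), n))


# The closed form and the enumeration agree for small n.
for n in range(1, 7):
    assert n_distinct_bootstrap_samples(n) == brute_force_count(n)

print(n_distinct_bootstrap_samples(10))
```

The count grows quickly with $n$, which is why in practice one draws a random subset of $B$ bootstrap samples instead of enumerating all of them.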
$\hat{F}(A) = \hat{P}(A) = \#\{x_i \in A\} / n$
Example
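A random sample of 100 throws of a die
yields 13 ones, 19 twos, 10 threes, 17 fours,
14 fives, and 27 sixes. Hence the edf is
$\hat{F}(1) = 0.13$, $\hat{F}(2) = 0.19$, $\hat{F}(3) = 0.10$,
$\hat{F}(4) = 0.17$, $\hat{F}(5) = 0.14$, $\hat{F}(6) = 0.27$.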
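The die example can be reproduced in a few lines. This is a minimal sketch using the counts quoted above; the helper `p_hat` is a hypothetical name for the empirical probability of an event $A$, computed as the fraction of observations that fall in $A$.

```python
# Observed counts from the 100 throws of the die.
counts = {1: 13, 2: 19, 3: 10, 4: 17, 5: 14, 6: 27}
n = sum(counts.values())  # total number of throws

# Empirical distribution: each face gets its observed relative frequency.
edf = {face: c / n for face, c in counts.items()}


def p_hat(event):
    """Empirical probability of an event, given as a set of faces."""
    return sum(edf[face] for face in event)


print(edf)
print(p_hat({2, 4, 6}))  # empirical probability of an even face
```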
The parameter $\theta = T(F)$ is estimated
by
$\hat{\theta} = T(\hat{F})$
The Plug-in Principle
If the only information about $F$ comes from
the sample $x$, then $\hat{\theta} = T(\hat{F})$ is a minimum
variance unbiased estimator of $\theta$.
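To make the plug-in idea concrete, the sketch below applies two functionals $T$, the mean and the variance, to the empirical distribution of the die data from the earlier example. The function names are illustrative. Note that the plug-in variance computed this way divides by $n$, not $n - 1$, because it is the variance of $\hat{F}$ itself.

```python
# Empirical distribution of the die data (100 throws).
counts = {1: 13, 2: 19, 3: 10, 4: 17, 5: 14, 6: 27}
n = sum(counts.values())
edf = {face: c / n for face, c in counts.items()}


def plug_in_mean(dist):
    """T(F) = E_F[X], evaluated at the distribution passed in."""
    return sum(x * p for x, p in dist.items())


def plug_in_variance(dist):
    """T(F) = E_F[(X - E_F[X])^2], evaluated at the distribution passed in."""
    m = plug_in_mean(dist)
    return sum((x - m) ** 2 * p for x, p in dist.items())


theta_hat_mean = plug_in_mean(edf)      # approximately 3.81
theta_hat_var = plug_in_variance(edf)   # approximately 3.2139
print(theta_hat_mean, theta_hat_var)
```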
The bootstrap draws $B$ samples from the
empirical distribution to compute $B$ replicates
of the statistic of interest, $\hat{\theta}^*$.
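$x = \{x_1, x_2, \ldots, x_n\}$

$$\bar{t} = \hat{T}(x) = \frac{1}{B} \sum_{b=1}^{B} T(x^{*b})$$

$$\widehat{se}(T(x)) = \sqrt{\frac{\sum_{b=1}^{B} \left[ T(x^{*b}) - \bar{t} \right]^2}{B - 1}}$$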
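The two formulas above translate directly into code. The following is a minimal sketch, not a definitive implementation: it resamples with replacement from the data, evaluates the statistic on each bootstrap sample, and returns the average $\bar{t}$ and the standard error with the $B - 1$ divisor. The function name `bootstrap_se`, the default `B=1000`, and the fixed seed are assumptions made for illustration; the die data are reused as input.

```python
import random
import statistics


def bootstrap_se(x, stat, B=1000, seed=0):
    """Return (t_bar, se): the mean and standard error of `stat`
    over B bootstrap samples drawn with replacement from x."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    n = len(x)
    # T(x^{*b}) for b = 1..B: each x^{*b} is n draws with replacement from x.
    reps = [stat(rng.choices(x, k=n)) for _ in range(B)]
    t_bar = sum(reps) / B
    se = (sum((t - t_bar) ** 2 for t in reps) / (B - 1)) ** 0.5
    return t_bar, se


# Die data from the earlier example, expanded to 100 individual throws.
die = [1] * 13 + [2] * 19 + [3] * 10 + [4] * 17 + [5] * 14 + [6] * 27

t_bar, se = bootstrap_se(die, statistics.mean, B=2000)
print(t_bar, se)
```

For the sample mean, the result can be sanity-checked against the plug-in value $\sqrt{\hat{\sigma}^2 / n}$, which for these data is about 0.18. For statistics with no simple formula, such as the median, the same function applies unchanged by passing `statistics.median` as `stat`.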
Bootstrap Standard Error and Confidence
intervals.
The bootstrap estimate of the mean is just
the empirical average of the statistic over
all bootstrap samples.