Variance Estimation

You might also like

Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 31

Output Analysis: Variance Estimation IE680 Output Analysis : Variance Estimation IE680

Output Analysis: Variance Estimation



Jong-hyun Ryu
1
Output Analysis: Variance Estimation IE680
Contents
Motivation (Quality Control Chart)
Collecting output data
Type of simulations
Steady-State Simulation
Stochastic Stationary Process
Variance Estimation Methods
Replication Method
Batch mean methods
Non-overlapping vs. Overlapping
Standardized Time Series
Additional Methods

2
Output Analysis: Variance Estimation IE680
Motivation
Quality Control Chart
Simple Control Chart (Shewhart Control Chart)

Design of Shewhart Control Chart
Control Limits
UCL & LCL



Results from incorrect control limits
If overestimated detection delay
If underestimated high false alarm rate
Estimating process variance (standard deviation) is critical.
3
3
UCL x
LCL x
o
o
= +
=
3
Output Analysis: Variance Estimation IE680
Type of simulations
Terminating vs. non-terminating simulations

Terminating simulations
Runs for some duration of time T
E
, where E is a specified event
that stops the simulation.
Starts at time 0 under well-specified initial conditions.
Ends at the stopping time T
E
.
Bank example: Opens at 8:30 am (time 0) with no customers
present and 8 of the 11 teller working (initial conditions), and
closes at 4:30 pm (T
e
)
The simulation analyst chooses to consider it a terminating
system because the object of interest is one days operation.
4
Output Analysis: Variance Estimation IE680
Type of simulations
Non-terminating simulation
Runs continuously, or at least over a very long period of time.
Examples: assembly lines that shut down infrequently, telephone
systems, hospital emergency rooms.
Initial conditions defined by the analyst.
Runs for some analyst-specified period of time TE.
Study the steady-state (long-run) properties of the system,
properties that are not influenced by the initial conditions of the
model.

Terminating or non-terminating
The objectives of the simulation study
The nature of the system.
5
Output Analysis: Variance Estimation IE680
Output Analysis for Steady-State Simulation
Consider a single run of a simulation model to
estimate a steady-state or long-run characteristics of
the system.
The single run produces observations Y1, Y2, ... (generally
the samples of an autocorrelated time series).

Performance measures:
Point estimator and confidence interval
Independent of the initial conditions
6
Output Analysis: Variance Estimation IE680
Output Analysis for Steady-State Simulation
The sample size is a design choice, with several
considerations in mind:
Any bias in the point estimator that is due to artificial or
arbitrary initial conditions (bias can be severe if run length
is too short).
Desired precision of the point estimator.
Budget constraints on computer resources.
7
Output Analysis: Variance Estimation IE680
Initialization Bias
Methods to reduce the point-estimator bias caused by using
artificial and unrealistic initial conditions:

Intelligent initialization.
Divide simulation into an initialization phase and data-collection phase.

Intelligent initialization

Initialize the simulation in a state that is more representative of long-
run conditions.
If the system exists, collect data on it and use these data to specify more
nearly typical initial conditions.
If the system can be simplified enough to make it mathematically
solvable, e.g. queueing models, solve the simplified model to find long-
run expected or most likely conditions, use that to initialize the
simulation.
8
Output Analysis: Variance Estimation IE680
Initialization Bias
No widely accepted, objective and proven technique to
guide how much data to delete to reduce initialization
bias to a negligible level.

Plots can be misleading but they are still recommended.
If averages reveal a smoother and more precise trend as the # of
replications increases.
If averages can be smoothed further by plotting a moving
average.
Cumulative average becomes less variable as more data are
averaged.
The more correlation present, the longer it takes for to approach
steady state.

9
Output Analysis: Variance Estimation IE680
Introduction to variance estimation
Simulation output analysis
Point estimator and confidence interval
Variance estimation confidence interval

Independent and identically distributed (IID)
Suppose X
1
,X
m
are iid






10
Output Analysis: Variance Estimation IE680
Stochastic Stationary Process
Stationary time series with positive
autocorrelation



Stationary time series with negative
autocorrelation



Nonstationary time series with an upward
trend
The stochastic process X is stationary for t
1
,,t
k
, t T, if
1 1
" " ( ,..., ) ( ,..., )
d
k k
d
t t t t t t
where denotes equality in distribution X X X X
+ +
= =
11
Output Analysis: Variance Estimation IE680
Stochastic Stationary Process (2)

A discrete-time stationary process X = {X
i
: i1} with mean and
variance
X
2
= Cov(X
i
,X
i
),



Variance of the sample mean



Easy to see


For iid,







2 2
1, 1
1
2 ( )
X j
j
Cov X X o o

+
=
= +

2
lim ( )
n
n
nVar X o

=
2
1
2
1, 1
1
( ) 1 2
(1 ) ( )
1
n
n
X j
j
S X j
E Cov X X
n n n n
o

+
=
( (
=
( (

1
2
1, 1
1
1
( ) 2 (1 ) ( )
n
n X j
j
j
Var X Cov X X
n n
o

+
=
(
= +
(

12
Output Analysis: Variance Estimation IE680
Stochastic Stationary Process (3)
The expected value of the variance estimator is:






If Xi are independent, then is an unbiased estimator of

If the autocorrelation is positive, then is biased low as an estimator of

If the autocorrelation is negative, then is biased high as an estimator of




2
2
1
( )
( )
1
( )
( )
n n
n
n
n
n
S X a
E Var X
n n
S X
E Var X when positively correlated
n

(
=
(


(
<<
(

( )
n
Var X
( )
n
Var X
( )
n
Var X
2
( )
n
S X
n
2
( )
n
S X
n
2
( )
n
S X
n
13
Output Analysis: Variance Estimation IE680
Replication Method
Use to estimate point-estimator variability and to construct a confidence
interval.

Approach: make R replications, initializing and deleting from each one the
same way.

Important to do a thorough job of investigating the initial-condition bias:
Bias is not affected by the number of replications, instead, it is affected only by
deleting more data or extending the length of each run (i.e. increasing T
E
).

Basic raw output data {X
rj
, r = 1, ..., R; j = 1, , n} is derived by:
Individual observation from within replication r.
Batch mean from within replication r of some number of discrete-time observations.
Batch mean of a continuous-time process over time interval j.
14
Output Analysis: Variance Estimation IE680
Replication Method
Length of each replication (n) beyond deletion point (d):
(n - d) > 10d

Number of replications (R) should be as many as time
permits, up to about 25 replications.

For a fixed total sample size (n),
as fewer data are deleted (d decreases)
C.I. shifts: greater bias.
Decrease Variance
Trade off between bias and variance.


15
Output Analysis: Variance Estimation IE680
Batch means method

When using a single, long replication:

Problem :
Data are dependent so the usual estimator is biased
Replication method (more than 25 replications) is expensive

One solution: Batch means method


16
Output Analysis: Variance Estimation IE680
Non-overlapping batch mean (NBM)




















m
observations
with batch
mean Y
1,m
Batch 1












m
observations
with batch
mean Y
k,m
Batch k
Batch means method (Nonoverlapping batch mean)
1,
2, , ,
1 1 2 , ( 1) 1 , ( 1) 1
,..., , ,..., ... ,..., ... ,...,
m
m i m k m
n k m
m m m i m im k m km
Y
Y Y Y
X
X X X X X X X X
=
+ + +
17
Output Analysis: Variance Estimation IE680
Nonoverlapping batch mean (NBM)
Suppose the batch means become uncorrelated
as m


NBM estimator for
2



Confidence Interval









,
,
( )
( ) ( )
i m
n i m
nVar Y
nVar X mVar Y
k
~ =
2
,
1

( , ) ( )
1
k
B i m n
i
m
V k m Y X
k
=
=


1,1 / 2

( , )
B
n k
V k m
X t
n
o

18
Output Analysis: Variance Estimation IE680
Overlapping Batch Mean (OBM)
OBM estimator for
2

1 2 3 , 1 2 3
, , , ... , , , , ...
m m m m
X X X X X X X
+ + +
Y
1,m
Y
2,m
1
2
,
1

( ) ( )
( 1)( )
n m
O i m n
i
nm
V m Y X
n m n m
+
=
=
+

19
Output Analysis: Variance Estimation IE680
NBM vs. OBM
Under mild conditions



Thus, both have similar bias

Variance of the estimators



Thus, the OBM method gives better MSE

2

( , ) ( 1) (1 )
B
E V k m k n o n o
(
~ + + +

2

( ) (1 )
O
E V m m o m o
(
~ + +

( , )
3
, ,

2
( )
B
O
Var V k m
as k m
Var V m
(


(

20
Output Analysis: Variance Estimation IE680
Standardized Time Series
Define the centered partial sums of X
i
as


Central Limit Theorem




Define the continuous time process


Question: How does Z
n
(t) behave as n increases?

21
Output Analysis: Variance Estimation IE680
n=100
22
Output Analysis: Variance Estimation IE680
n=1000
23
Output Analysis: Variance Estimation IE680
n=10000
24
Output Analysis: Variance Estimation IE680
n=1,000,000
25
Output Analysis: Variance Estimation IE680
Functional Central Limit Theorem (FCLT)



where B(t) is the standard Brownian motion
(drift coefficient =0, diffusion coefficient =1)

Brownian motion with drift coefficient o and diffusion coefficient
o
2
is a real valued stochastic process with stationary and
independent increments having continuous sample paths where



Thus,



1
( ) ( ) (0 1)
nt
i
i
n
X nt
Z t B t as n t
n

o
(

=

(

s s

2
2 1 2 1 2 1
( ) ( ) ~ ( ( ), ( ))
( ) ( ) (0) ~ (0, )
n n n
B t B t Normal t t t t
Z t Z t Z Normal t
o o
=
26
Output Analysis: Variance Estimation IE680
Functional Central Limit Theorem (FCLT)
While CLT says that for any t,


FCLT also shows that Z
n
(t) converges to an
asymptotically continuous stochastic process with
independent increments.

27
Output Analysis: Variance Estimation IE680
The Weighted Area Estimator
If we can define the function f such that
Continuous [0,1]
Normalized so that

Then,

Define






( )
1
0
( ) ( ) 1 Var f t B t dt =
}
1
0
( ) ( ) ~ (0,1) f t B t dt N
}
( )
2
1
2 2 2
1
0
( ) ( ) ( ) ~ ( ) A f f t B t dt E A f o o _ o
(
=
(

}
28
Output Analysis: Variance Estimation IE680
The Weighted Area Estimator



Since

and


is a variance estimator


1
( )
nt
i
i
n
X nt
Z t
n

o
(

=

(

2
1
0
2
1
( ; ) ( ) ( ) ( ) ( ) ( )
1
n
i i
A f n f Z A f f t B t dt as n
n
n n n
i
o o
(
(
~
(
(

(
=
}
( ) ( )
2 2 2 2
1
( ; ) ( ) ~ ( ) ( )
d
A f n A f E A f E A f o _ o o = =
( ; ) A f n
29
Output Analysis: Variance Estimation IE680
Additional Methods
Regenerative Method (Fishman 1973; Shedler 1993)
Spectral Theory (Heidelberger and Welch 1981, 1983;
Damerdji 1991)
Quantile estimation (Heidelberger and Lewis 1984;
Seila 1982)
Estimation of functions of means (Munoz and Glynn
1997)
Estimation of multivariate means (Anderson 1984;
Charnes 1989, 1995; Seila 1984)

30
Output Analysis: Variance Estimation IE680
Reference
Alexopoulos, C., D. Goldsman, and R. F. Serfozo. 2005. Stationary processes: statistical
estimation. In The Encyclopedia of Statistical Sciences, ed. N. L. Johnson, and C. B. Read,
2nd edition. New York: John Wiley & Sons

Kim, S.-H., C. Alexopoulos, K.-L. Tsui, and J. R. Wilson. 2006. A distribution-free tabular
CUSUM chart for autocorrelated data. IIE Transactions, forthcoming.

Alexopoulos, C., N. T. Argon, D. Goldsman, and G. Tokol. 2004. Overlapping variance
estimators for simulations. Technical Report, School of Industrial and Systems Engineering,
Georgia Institute of Technology, Atlanta, Georgia.

Law, A. M., and W. D. Kelton. 2000 Simulation Modeling and Analysis, 3rd ed. New York:
McGraw-Hill.

31

You might also like