Output Analysis: Variance Estimation (IE680)

Jong-hyun Ryu
Contents
Motivation (Quality Control Chart)
Collecting output data
Type of simulations
Steady-State Simulation
Stochastic Stationary Process
Variance Estimation Methods
Replication Method
Batch mean methods
Non-overlapping vs. Overlapping
Standardized Time Series
Additional Methods

Motivation
Quality Control Chart
Simple Control Chart (Shewhart Control Chart)

Design of Shewhart Control Chart
Control Limits
UCL & LCL
$UCL = \bar{x} + 3\sigma$
$LCL = \bar{x} - 3\sigma$

Results from incorrect control limits
If $\sigma$ is overestimated, the limits are too wide: detection delay
If $\sigma$ is underestimated, the limits are too narrow: high false alarm rate
Estimating the process variance (standard deviation) is critical.
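A minimal sketch of the three-sigma limits, assuming the standard deviation is estimated from the sample itself. The function name `shewhart_limits` and the data values are made up for illustration; the plain sample standard deviation is only a sound estimate when the observations are close to independent, which is the concern of the rest of these slides.

```python
import statistics

def shewhart_limits(samples, k=3.0):
    """Return (LCL, center, UCL) = (xbar - k*s, xbar, xbar + k*s)."""
    center = statistics.fmean(samples)
    # Sample standard deviation: trustworthy only for (nearly) independent data.
    sigma_hat = statistics.stdev(samples)
    return center - k * sigma_hat, center, center + k * sigma_hat

# Hypothetical in-control measurements (illustrative numbers only).
data = [10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 10.0]
lcl, center, ucl = shewhart_limits(data)
print(f"LCL={lcl:.3f}  center={center:.3f}  UCL={ucl:.3f}")
```

If `sigma_hat` is too large the band widens and shifts are detected late; if it is too small the band narrows and false alarms increase.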
Type of simulations
Terminating vs. non-terminating simulations

Terminating simulations
Runs for some duration of time $T_E$, where E is a specified event
that stops the simulation.
Starts at time 0 under well-specified initial conditions.
Ends at the stopping time $T_E$.
Bank example: opens at 8:30 am (time 0) with no customers
present and 8 of the 11 tellers working (initial conditions), and
closes at 4:30 pm ($T_E$).
The simulation analyst chooses to consider it a terminating
system because the object of interest is one day's operation.
Type of simulations
Non-terminating simulation
Runs continuously, or at least over a very long period of time.
Examples: assembly lines that shut down infrequently, telephone
systems, hospital emergency rooms.
Initial conditions defined by the analyst.
Runs for some analyst-specified period of time $T_E$.
Study the steady-state (long-run) properties of the system,
properties that are not influenced by the initial conditions of the
model.

Whether a simulation is treated as terminating or non-terminating depends on:
The objectives of the simulation study
The nature of the system
Output Analysis for Steady-State Simulation
Consider a single run of a simulation model to
estimate a steady-state, or long-run, characteristic of
the system.
The single run produces observations Y1, Y2, ... (generally
the samples of an autocorrelated time series).

Performance measures:
Point estimator and confidence interval
Independent of the initial conditions
Output Analysis for Steady-State Simulation
The sample size is a design choice, with several
considerations in mind:
Any bias in the point estimator that is due to artificial or
arbitrary initial conditions (bias can be severe if run length
is too short).
Desired precision of the point estimator.
Budget constraints on computer resources.
Initialization Bias
Methods to reduce the point-estimator bias caused by using
artificial and unrealistic initial conditions:

Intelligent initialization.
Divide simulation into an initialization phase and data-collection phase.

Intelligent initialization

Initialize the simulation in a state that is more representative of long-
run conditions.
If the system exists, collect data on it and use these data to specify more
nearly typical initial conditions.
If the system can be simplified enough to make it mathematically
solvable, e.g. queueing models, solve the simplified model to find long-
run expected or most likely conditions, use that to initialize the
simulation.
Initialization Bias
There is no widely accepted, objective, and proven technique to
guide how much data to delete to reduce initialization
bias to a negligible level.

Plots can be misleading, but they are still recommended:
Averages across replications reveal a smoother and more precise trend as the number of
replications increases.
These averages can be smoothed further by plotting a moving
average.
The cumulative average becomes less variable as more data are
averaged.
The more correlation present, the longer it takes for the averages to approach
steady state.
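The three kinds of averages used in these warm-up plots can be sketched as follows. The helper names are hypothetical and the moving average is a simple trailing window, which is one choice among several; the plotting itself is left out.

```python
def ensemble_averages(replications):
    """Average the j-th observation across all replications (one value per j)."""
    n = min(len(rep) for rep in replications)
    r = len(replications)
    return [sum(rep[j] for rep in replications) / r for j in range(n)]

def moving_average(series, window):
    """Trailing moving average: mean of the last `window` values at each index."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]

def cumulative_average(series):
    """Running mean of the first 1, 2, ..., len(series) values."""
    out, total = [], 0.0
    for i, y in enumerate(series, start=1):
        total += y
        out.append(total / i)
    return out

# Two tiny made-up replications, only to show the shapes of the results.
reps = [[0, 2, 4, 6], [2, 4, 6, 8]]
ens = ensemble_averages(reps)
print(ens, moving_average(ens, 2), cumulative_average(ens))
```

Plotting `ens` (or its moving average) against the observation index is what the analyst inspects when choosing a deletion point.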
Introduction to variance estimation
Simulation output analysis
Point estimator and confidence interval
Variance estimation is needed to build the confidence interval

Independent and identically distributed (IID)
Suppose $X_1, \ldots, X_m$ are iid
Stochastic Stationary Process
[Figure: stationary time series with positive autocorrelation]
[Figure: stationary time series with negative autocorrelation]
[Figure: nonstationary time series with an upward trend]

The stochastic process X is stationary if, for all $t_1, \ldots, t_k, t \in T$,
$$(X_{t_1}, \ldots, X_{t_k}) \overset{d}{=} (X_{t_1+t}, \ldots, X_{t_k+t}),$$
where $\overset{d}{=}$ denotes equality in distribution.
Stochastic Stationary Process (2)

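A discrete-time stationary process $X = \{X_i : i \ge 1\}$ with mean $\mu$ and
variance $\sigma_X^2 = Cov(X_i, X_i)$

Variance of the sample mean
$$Var(\bar{X}_n) = \frac{1}{n}\left[\sigma_X^2 + 2\sum_{j=1}^{n-1}\left(1 - \frac{j}{n}\right) Cov(X_1, X_{1+j})\right]$$

Easy to see
$$\lim_{n \to \infty} n\,Var(\bar{X}_n) = \sigma^2, \quad \text{where} \quad \sigma^2 = \sigma_X^2 + 2\sum_{j=1}^{\infty} Cov(X_1, X_{1+j})$$

For iid, $S_n^2(X)/n$ is unbiased for $Var(\bar{X}_n)$; in general,
$$E\left[\frac{S_n^2(X)}{n}\right] = \frac{1}{n}\left[\sigma_X^2 - \frac{2}{n-1}\sum_{j=1}^{n-1}\left(1 - \frac{j}{n}\right) Cov(X_1, X_{1+j})\right]$$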
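To see the limit of $n\,Var(\bar{X}_n)$ numerically, the sketch below evaluates the variance-of-the-sample-mean formula for an assumed autocovariance $Cov(X_1, X_{1+j}) = \sigma_X^2 \phi^j$ (an AR(1)-type process). The choice of process, $\phi = 0.8$ and $\sigma_X^2 = 1$ are illustrative assumptions, not taken from the slides; for this choice $\sigma^2 = \sigma_X^2 (1+\phi)/(1-\phi) = 9$.

```python
def var_sample_mean(n, cov):
    """Var(Xbar_n) = (1/n) * [cov(0) + 2 * sum_{j=1}^{n-1} (1 - j/n) * cov(j)]."""
    s = sum((1 - j / n) * cov(j) for j in range(1, n))
    return (cov(0) + 2 * s) / n

PHI, VAR_X = 0.8, 1.0          # assumed AR(1)-type parameters (illustrative)

def ar1_cov(j):
    """Assumed autocovariance at lag j: sigma_X^2 * phi^j."""
    return VAR_X * PHI ** j

# sigma^2 = sigma_X^2 + 2 * sum_{j>=1} cov(j) = sigma_X^2 * (1 + phi) / (1 - phi)
sigma2 = VAR_X * (1 + PHI) / (1 - PHI)

for n in (10, 100, 10_000):
    print(n, n * var_sample_mean(n, ar1_cov), "limit:", sigma2)
```

With positive autocorrelation the values of $n\,Var(\bar{X}_n)$ increase toward $\sigma^2$, which is much larger than $\sigma_X^2$.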
Stochastic Stationary Process (3)
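The expected value of the variance estimator is:
$$E\left[\frac{S_n^2(X)}{n}\right] = \frac{n/a_n - 1}{n - 1}\,Var(\bar{X}_n), \quad \text{where} \quad a_n = \frac{n\,Var(\bar{X}_n)}{\sigma_X^2} = 1 + 2\sum_{j=1}^{n-1}\left(1 - \frac{j}{n}\right)\frac{Cov(X_1, X_{1+j})}{\sigma_X^2}$$
$$E\left[\frac{S_n^2(X)}{n}\right] \ll Var(\bar{X}_n) \quad \text{when positively correlated}$$

If the $X_i$ are independent, then $S_n^2(X)/n$ is an unbiased estimator of $Var(\bar{X}_n)$.

If the autocorrelation is positive, then $S_n^2(X)/n$ is biased low as an estimator of $Var(\bar{X}_n)$.

If the autocorrelation is negative, then $S_n^2(X)/n$ is biased high as an estimator of $Var(\bar{X}_n)$.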
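The size of this bias can be checked directly from the ratio $(n/a_n - 1)/(n - 1)$. The sketch below is only an illustration: the geometric autocorrelation functions $\rho_j = \phi^j$ with $\phi = 0.8$ and $\phi = -0.5$ are assumed examples, and the function names are hypothetical.

```python
def a_n(n, rho):
    """a_n = 1 + 2 * sum_{j=1}^{n-1} (1 - j/n) * rho(j), rho = lag-j autocorrelation."""
    return 1 + 2 * sum((1 - j / n) * rho(j) for j in range(1, n))

def expected_ratio(n, rho):
    """E[S_n^2 / n] / Var(Xbar_n) = (n / a_n - 1) / (n - 1)."""
    return (n / a_n(n, rho) - 1) / (n - 1)

n = 100
ratio_iid = expected_ratio(n, lambda j: 0.0)          # independent data
ratio_pos = expected_ratio(n, lambda j: 0.8 ** j)     # assumed positive autocorrelation
ratio_neg = expected_ratio(n, lambda j: (-0.5) ** j)  # assumed negative autocorrelation
print(ratio_iid, ratio_pos, ratio_neg)
```

A ratio of 1 means unbiased, a ratio below 1 means $S_n^2(X)/n$ is biased low, and a ratio above 1 means it is biased high.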
Replication Method
Used to estimate point-estimator variability and to construct a confidence
interval.

Approach: make R replications, initializing and deleting from each one in the
same way.

It is important to investigate the initial-condition bias thoroughly:
Bias is not affected by the number of replications; it is affected only by
deleting more data or extending the length of each run (i.e., increasing $T_E$).

Basic raw output data $\{X_{rj},\ r = 1, \ldots, R;\ j = 1, \ldots, n\}$ are derived from one of:
An individual observation from within replication r.
A batch mean from within replication r of some number of discrete-time observations.
A batch mean of a continuous-time process over time interval j.
Replication Method
Length of each replication (n) beyond the deletion point (d):
(n - d) > 10d

The number of replications (R) should be as many as time
permits, up to about 25 replications.

For a fixed total sample size (n),
as fewer data are deleted (d decreases):
The C.I. shifts: greater bias.
The variance of the point estimator decreases.
There is a trade-off between bias and variance.
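A minimal sketch of the replication method, assuming an AR(1) process as a stand-in for the simulation output (the model, its parameters, and the helper names are illustrative assumptions). Each run starts from an unrepresentative state, the first d observations are deleted, and a t interval is built from the R replication means. The critical value is passed in by hand (2.262 is the 0.975 quantile of the t distribution with 9 degrees of freedom) to keep the example free of third-party libraries.

```python
import math
import random
import statistics

def ar1_run(n, phi, mu, rng, x0):
    """One replication of X_j = mu + phi*(X_{j-1} - mu) + eps_j, eps_j ~ N(0, 1)."""
    x, out = x0, []
    for _ in range(n):
        x = mu + phi * (x - mu) + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def replication_ci(runs, d, t_crit):
    """Delete the first d observations of every run; t interval from the R means."""
    means = [statistics.fmean(run[d:]) for run in runs]
    r = len(means)
    center = statistics.fmean(means)
    half = t_crit * statistics.stdev(means) / math.sqrt(r)
    return center - half, center + half

rng = random.Random(680)                      # fixed seed, illustration only
R, n, d = 10, 2200, 100                       # (n - d) = 2100 > 10 * d = 1000
runs = [ar1_run(n, phi=0.8, mu=5.0, rng=rng, x0=0.0) for _ in range(R)]
low, high = replication_ci(runs, d, t_crit=2.262)   # t_{9, 0.975}
print(f"95% CI for the steady-state mean: ({low:.3f}, {high:.3f})")
```

Adding replications narrows the interval but does not remove initialization bias; only a larger d or longer runs do that.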
Batch means method

When using a single, long replication:

Problem:
Data are dependent, so the usual variance estimator is biased.
The replication method (more than 25 replications) is expensive.

One solution: the batch means method

Non-overlapping batch mean (NBM)

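Batch means method (nonoverlapping batch means)
The $n = km$ observations are split into $k$ adjacent batches of $m$ observations each:

Batch 1: $X_1, \ldots, X_m$ ($m$ observations with batch mean $Y_{1,m}$)
Batch 2: $X_{m+1}, \ldots, X_{2m}$ (batch mean $Y_{2,m}$)
...
Batch i: $X_{(i-1)m+1}, \ldots, X_{im}$ (batch mean $Y_{i,m}$)
...
Batch k: $X_{(k-1)m+1}, \ldots, X_{km}$ ($m$ observations with batch mean $Y_{k,m}$)

Sample mean of all $n$ observations: $\bar{X}_n$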
Nonoverlapping batch mean (NBM)
Suppose the batch means become uncorrelated
as $m \to \infty$:
$$n\,Var(\bar{X}_n) \approx \frac{n\,Var(Y_{i,m})}{k} = m\,Var(Y_{i,m})$$

NBM estimator for $\sigma^2$
$$V_B(k, m) = \frac{m}{k-1}\sum_{i=1}^{k}\left(Y_{i,m} - \bar{X}_n\right)^2$$

Confidence Interval
$$\bar{X}_n \pm t_{k-1,\,1-\alpha/2}\sqrt{\frac{V_B(k, m)}{n}}$$
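The NBM estimator and its confidence interval can be sketched in a few lines. The function names are hypothetical, leftover observations beyond $km$ are simply dropped (one possible convention), and the t critical value is supplied by the caller.

```python
import math
import statistics

def nbm_variance(x, k):
    """V_B(k, m) = m/(k-1) * sum_i (Y_{i,m} - Xbar_n)^2 with m = len(x) // k."""
    m = len(x) // k
    x = x[:k * m]                      # drop leftovers so that n = k * m
    batch_means = [statistics.fmean(x[i * m:(i + 1) * m]) for i in range(k)]
    grand = statistics.fmean(batch_means)   # equals Xbar_n for equal-size batches
    return m * sum((y - grand) ** 2 for y in batch_means) / (k - 1)

def nbm_ci(x, k, t_crit):
    """Xbar_n +/- t_crit * sqrt(V_B(k, m) / n), t_crit = t_{k-1, 1-alpha/2}."""
    n = (len(x) // k) * k
    grand = statistics.fmean(x[:n])
    half = t_crit * math.sqrt(nbm_variance(x, k) / n)
    return grand - half, grand + half

# Tiny made-up series: batches [1, 3] and [5, 7] have means 2 and 6.
toy = [1, 3, 5, 7]
print(nbm_variance(toy, k=2))          # 2/(2-1) * ((2-4)^2 + (6-4)^2) = 16.0
print(nbm_ci(toy, k=2, t_crit=1.0))    # 4 +/- sqrt(16/4) -> (2.0, 6.0)
```

In practice m must be large enough for the batch means to be approximately uncorrelated, which the toy data here do not attempt to show.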
Overlapping Batch Mean (OBM)
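Batches overlap: each batch of $m$ observations starts one observation after the previous one.
$Y_{1,m}$: mean of $X_1, X_2, X_3, \ldots, X_m$
$Y_{2,m}$: mean of $X_2, X_3, \ldots, X_m, X_{m+1}$
...

OBM estimator for $\sigma^2$
$$V_O(m) = \frac{nm}{(n-m+1)(n-m)}\sum_{i=1}^{n-m+1}\left(Y_{i,m} - \bar{X}_n\right)^2$$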
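A short sketch of the OBM estimator follows. It uses a sliding window sum so that each overlapping batch mean costs constant time; the function name is hypothetical and the sketch assumes $m < n$.

```python
import statistics

def obm_variance(x, m):
    """V_O(m) = n*m / ((n-m+1)*(n-m)) * sum_{i=1}^{n-m+1} (Y_{i,m} - Xbar_n)^2."""
    n = len(x)
    if not 0 < m < n:
        raise ValueError("batch size m must satisfy 0 < m < n")
    grand = statistics.fmean(x)
    window = sum(x[:m])                       # sum of the first batch
    total = (window / m - grand) ** 2
    for i in range(1, n - m + 1):
        window += x[i + m - 1] - x[i - 1]     # slide the batch one step right
        total += (window / m - grand) ** 2
    return n * m * total / ((n - m + 1) * (n - m))

# Tiny made-up series: overlapping batch means are 2, 4, 6 and Xbar_n = 4.
toy = [1, 3, 5, 7]
print(obm_variance(toy, m=2))   # 4*2 * (4 + 0 + 4) / (3 * 2) = 64/6
```

All $n - m + 1$ overlapping batches reuse the same data, which is why the estimator's variance is lower than that of NBM for the same run length.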
NBM vs. OBM
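Under mild conditions
$$E[V_B(k, m)] \approx \sigma^2 + \frac{\gamma\,(k+1)}{n} + o\left(\frac{1}{n}\right)$$
$$E[V_O(m)] \approx \sigma^2 + \frac{\gamma}{m} + o\left(\frac{1}{m}\right)$$
where $\gamma = -2\sum_{j=1}^{\infty} j\,Cov(X_1, X_{1+j})$.

Thus, both have similar bias (since $n = km$, $(k+1)/n \approx 1/m$ for large $k$)

Variance of the estimators
$$\frac{Var(V_B(k, m))}{Var(V_O(m))} \to \frac{3}{2} \quad \text{as } k, m \to \infty$$

Thus, the OBM method gives better MSE
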
Standardized Time Series
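Define the centered partial sums of $X_i$ as
$$\sum_{i=1}^{j} X_i - j\mu, \quad j = 1, \ldots, n$$

Central Limit Theorem
$$\frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} \Rightarrow Normal(0, 1) \quad \text{as } n \to \infty$$

Define the continuous time process
$$Z_n(t) = \frac{\sum_{i=1}^{\lfloor nt \rfloor} X_i - nt\mu}{\sigma\sqrt{n}}, \quad 0 \le t \le 1$$

Question: How does $Z_n(t)$ behave as n increases?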

[Figures: plots of $Z_n(t)$ for n = 100, n = 1000, n = 10000, and n = 1,000,000]
Functional Central Limit Theorem (FCLT)

$$Z_n(t) = \frac{\sum_{i=1}^{\lfloor nt \rfloor} X_i - nt\mu}{\sigma\sqrt{n}} \Rightarrow B(t) \quad \text{as } n \to \infty \quad (0 \le t \le 1)$$
where B(t) is the standard Brownian motion
(drift coefficient = 0, diffusion coefficient = 1)

Brownian motion with drift coefficient $\delta$ and diffusion coefficient
$\sigma^2$ is a real valued stochastic process with stationary and
independent increments having continuous sample paths where
$$B(t_2) - B(t_1) \sim Normal\left(\delta\,(t_2 - t_1),\ \sigma^2 (t_2 - t_1)\right)$$

Thus,
$$Z_n(t) = Z_n(t) - Z_n(0) \sim Normal(0, t) \quad \text{(approximately, for large } n\text{)}$$
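The behavior of $Z_n(t)$ can be checked by simulation. The sketch below assumes iid standard normal observations with known $\mu = 0$ and $\sigma = 1$ (the simplest case, chosen for illustration) and evaluates $Z_n$ on the grid $t = i/n$, where $nt = i$ exactly. The function name and the seed are arbitrary.

```python
import math
import random
import statistics

def standardized_path(x, mu, sigma):
    """Return [Z_n(0), Z_n(1/n), ..., Z_n(1)] for the observations x."""
    n = len(x)
    path, partial = [0.0], 0.0
    for i, xi in enumerate(x, start=1):
        partial += xi                                   # X_1 + ... + X_i
        path.append((partial - i * mu) / (sigma * math.sqrt(n)))
    return path

rng = random.Random(1)            # fixed seed, illustration only
mids, ends = [], []
for _ in range(2000):
    x = [rng.gauss(0.0, 1.0) for _ in range(100)]
    z = standardized_path(x, mu=0.0, sigma=1.0)
    mids.append(z[50])            # Z_n(0.5)
    ends.append(z[100])           # Z_n(1)

var_mid = statistics.variance(mids)   # should be near t = 0.5
var_end = statistics.variance(ends)   # should be near t = 1
print(var_mid, var_end)
```

The sample variances at $t = 0.5$ and $t = 1$ should come out close to $t$, in line with $Z_n(t)$ being approximately $Normal(0, t)$.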
Functional Central Limit Theorem (FCLT)
While the CLT says that, for any fixed t, $Z_n(t)$ is asymptotically $Normal(0, t)$,

the FCLT also shows that $Z_n(t)$ converges to a
stochastic process with continuous sample paths and
independent increments.

The Weighted Area Estimator
If we can define the function f such that it is
Continuous on [0,1]
Normalized so that
$$Var\left(\int_0^1 f(t) B(t)\,dt\right) = 1$$

Then,
$$\int_0^1 f(t) B(t)\,dt \sim N(0, 1)$$

Define
$$A(f) = \left[\int_0^1 f(t)\,\sigma B(t)\,dt\right]^2 \sim \sigma^2 \chi_1^2, \qquad E[A(f)] = \sigma^2$$
The Weighted Area Estimator

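Since
$$Z_n(t) = \frac{\sum_{i=1}^{\lfloor nt \rfloor} X_i - nt\mu}{\sigma\sqrt{n}}$$

and
$$A(f; n) = \left[\frac{1}{n}\sum_{i=1}^{n} f\left(\frac{i}{n}\right)\sigma Z_n\left(\frac{i}{n}\right)\right]^2 \approx A(f) = \left[\int_0^1 f(t)\,\sigma B(t)\,dt\right]^2 \quad \text{as } n \to \infty,$$
$$A(f; n) \overset{d}{\to} A(f) \sim \sigma^2 \chi_1^2, \qquad E[A(f; n)] \approx E[A(f)] = \sigma^2$$

$A(f; n)$ is a variance estimator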
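A minimal sketch of $A(f; n)$, under two assumptions that are not stated on the slides: the mean $\mu$ is treated as known, and the weight is the constant $f(t) = \sqrt{3}$. That constant satisfies the normalization for standard Brownian motion because $Var\left(\int_0^1 B(t)\,dt\right) = 1/3$. Note that $\sigma Z_n(i/n) = (X_1 + \cdots + X_i - i\mu)/\sqrt{n}$ does not require knowing $\sigma$.

```python
import math
import random
import statistics

def weighted_area(x, mu, f=lambda t: math.sqrt(3.0)):
    """A(f; n) = [ (1/n) * sum_{i=1}^n f(i/n) * sigma*Z_n(i/n) ]^2, where
    sigma*Z_n(i/n) = (X_1 + ... + X_i - i*mu) / sqrt(n)."""
    n = len(x)
    partial, acc = 0.0, 0.0
    for i, xi in enumerate(x, start=1):
        partial += xi - mu                       # centered partial sum
        acc += f(i / n) * partial / math.sqrt(n)
    return (acc / n) ** 2

# One estimate is sigma^2 times a chi-square(1) variable, so it is very noisy.
# Averaging many independent estimates (iid N(0, 1) data, so sigma^2 = 1)
# shows that the expected value is close to sigma^2.
rng = random.Random(7)                           # fixed seed, illustration only
estimates = [weighted_area([rng.gauss(0.0, 1.0) for _ in range(50)], mu=0.0)
             for _ in range(2000)]
avg = statistics.fmean(estimates)
print(avg)
```

Because a single $A(f; n)$ has only one degree of freedom, it is usually computed on several batches and averaged, as the loop above does with independent samples.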
Additional Methods
Regenerative Method (Fishman 1973; Shedler 1993)
Spectral Theory (Heidelberger and Welch 1981, 1983;
Damerdji 1991)
Quantile estimation (Heidelberger and Lewis 1984;
Seila 1982)
Estimation of functions of means (Munoz and Glynn
1997)
Estimation of multivariate means (Anderson 1984;
Charnes 1989, 1995; Seila 1984)

References
Alexopoulos, C., D. Goldsman, and R. F. Serfozo. 2005. Stationary processes: statistical
estimation. In The Encyclopedia of Statistical Sciences, 2nd ed., ed. N. L. Johnson and C. B. Read.
New York: John Wiley & Sons.

Kim, S.-H., C. Alexopoulos, K.-L. Tsui, and J. R. Wilson. 2006. A distribution-free tabular
CUSUM chart for autocorrelated data. IIE Transactions, forthcoming.

Alexopoulos, C., N. T. Argon, D. Goldsman, and G. Tokol. 2004. Overlapping variance
estimators for simulations. Technical Report, School of Industrial and Systems Engineering,
Georgia Institute of Technology, Atlanta, Georgia.

Law, A. M., and W. D. Kelton. 2000. Simulation Modeling and Analysis, 3rd ed. New York:
McGraw-Hill.
