04 Simulation-RNG

Simulation
010123211 Simulation and Modeling

Yuenyong Nilsiam
Ch. 24 Introduction to Simulation
• Simulation is useful when the system to be studied is not available
• A simulation provides performance predictions or a comparison of several alternatives
• Simulation models often fail because of
• Lack of statistical background or
• Lack of software development techniques
• Simulation takes a long time to develop
• Common mistakes in simulation and techniques to overcome them are discussed hereafter
24.1 Common Mistakes in Simulation
1. Inappropriate Level of Detail
• The level of detail is limited only by the time available for model development
• More detail also means greater execution time
• A more detailed model needs more input parameters; if their values are not accurately known, the model may become less accurate
• A detailed model can take too much time to develop
• Start with a less detailed model, get some results, study sensitivities, and introduce detail in the areas that have the highest impact on the results
2. Improper Language
• The choice of programming language has a significant impact on how quickly the model can be developed
• Special-purpose simulation languages
• Less time for model development
• Ease of several common tasks, such as
• Verification
• Statistical analysis
• General-purpose languages
• More portable
• More control over the efficiency and run time
3. Unverified Models
• Simulation models are generally large computer
programs
• Bugs can make the conclusion meaningless
4. Invalid Models
• The program has no errors, but it does not represent the real system correctly
• This is caused by incorrect assumptions
• Models must be validated
• All simulation model results should be confirmed by analytical models, measurements, or intuition.
5. Improperly Handled Initial Conditions
• The initial part of the simulation results should be discarded
• It is not representative of the steady state
6. Too Short Simulation
• Analysts try to save time by running the simulation for too short a time
• The results may not be representative of the real system
• The simulation length depends on the accuracy desired and the variance of the observed quantities
7. Poor Random-Number Generators
• Simulation models require random quantities
• It is safer to use a well-known generator
• Even well-known generators have problems
8. Improper Selection of Seeds
• The seed is the initial value that starts a random-number sequence
• The seeds for different random-number streams should be carefully chosen to maintain independence among the streams
• This avoids correlation among the various processes in the system
24.2 Other Causes of Simulation Analysis Failure
1. Inadequate Time Estimate
• Analysts often underestimate the time and effort required to develop a simulation model
• Projects that start off short often continue for years
• Successful models keep gaining more features, parameters, and details
• Generally, simulation takes the longest time
• Time is also needed for model verification
• For long simulation projects, provision should be made for incorporating changes in the system
2. No Achievable Goal
• A simulation needs a clearly specified set of goals that are specific, measurable, achievable, repeatable, and thorough (SMART)
• The goals require agreement between the analysts and the end users
3. Incomplete Mix of Essential Skills
a) Project Leadership: motivate, lead, manage
b) Modeling and Statistics: identify the key
characteristics of the system and model them
c) Programming: write a readable and verifiable
computer program that implements the model
correctly
d) Knowledge of the Modeled System: understand the
system, explain it to the team, interpret the modeling
results in terms of their impact on the system design
4. Inadequate Level of User Participation
• Periodic meetings between the modeling team and the user organizations to discuss progress, problems, and changes in the system
• They help to find “modeling bugs” early and keep the model in sync with changes in the system
5. Obsolete or Nonexistent Documentation
• Most simulation models are continuously modified as the system is modified or better understood
• Documentation of these models often lags behind
6. Inability to Manage the Development of a Large
Complex Computer Program
• Tools are available for management of large software
projects
• They keep track of design objectives, functional requirements, data structures, and progress estimates
• Design principles
• Top-down design
• Structured programming
7. Mysterious Results
• Due to bugs in simulation program, invalid model
assumptions, or lack of understanding of the real
system
• Therefore, the model needs to be verified
• If the mysterious result still persists, bring it to the attention of the end users
• It may provide a valuable insight into the system or point to system features that need to be modeled in more detail
Checklist for Simulation
1. Checks before developing a simulation:
(a) Is the goal of the simulation properly specified?
(b) Is the level of detail in the model appropriate for
the goal?
(c) Does the simulation team include personnel with
project leadership, modeling, programming, and
computer systems backgrounds?
(d) Has sufficient time been planned for the project?
2. Checks during development:
(a) Has the random-number generator used in the
simulation been tested for uniformity and
independence?
(b) Is the model reviewed regularly with the end
user?
(c) Is the model documented?
Uniformity: all values in the range are equally probable
Independence: the current value of a random variable has no relation to the previous values
3. Checks after the simulation is running:
(a) Is the simulation length appropriate?
(b) Are the initial transients removed before
computation?
(c) Has the model been verified thoroughly? (Does it meet the requirements?)
(d) Has the model been validated before using its results? (How well does it represent the real system?)
(e) If there are any surprising results, have they been validated?
(f) Are all seeds such that the random-number streams will not overlap?
24.3 Terminology
• Terms that are commonly used in modeling
• An example of simulating CPU scheduling is used in
this section
• Problem: to study various scheduling techniques
for given CPU demand characteristics of jobs
• Other components of the system will be ignored for the moment
• State Variables
• The variables whose values define the state of the
system
• If a simulation is stopped in the middle, it can be
restarted if and only if values of all state variables are
known
• In CPU scheduling simulation: the length of the job
queue
• Event
• A change in the system state
• In CPU scheduling simulation: arrival of a job, beginning
of a new execution, and departure of a job
• Continuous-Time and Discrete-Time Models
• The system state is defined at all times (continuous time) or only at particular instants in time (discrete time); time is the X-axis of a state-versus-time plot
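• Continuous-State and Discrete-State Models
• Depending upon whether the state variables (the Y-axis of a state-versus-time plot) are continuous or discrete
• A continuous-state model is also called a continuous-event model, and a discrete-state model is also called a discrete-event model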
• Deterministic and Probabilistic Models
• If the results of a model can be predicted with certainty, it is a deterministic model
• If repeated runs give different results for the same set of input parameters, it is a probabilistic model
• Static and Dynamic Models
• Static, if time is not a variable in the model
• Model of matter-to-energy transformation: $E = mc^2$
• Dynamic, if the system state changes with time
• The CPU scheduling model
• Linear and Nonlinear Models
• Linear if the output parameters are a linear function of the input parameters (a constant rate of change); nonlinear otherwise
• Open and Closed Models
• If the input is external to the model and is independent of it, it is called an open model
• A closed model has no external input
• Stable and Unstable Models
• If the dynamic behavior of the model settles down to a steady state, that is, one independent of time, it is called stable
• A model whose behavior keeps changing is called unstable
• In general, computer system models are
continuous time, discrete state, probabilistic,
dynamic, and nonlinear
• Either open or closed models are used
• Either stable or unstable models are used
24.4 Selecting a Language for
Simulation
• The most important step in simulation model
development
• An incorrect decision may lead to long
development times, incomplete studies, and
failures
• 4 choices:
• Simulation language
• General-purpose language
• Extension of a general-purpose language
• Simulation package (such as a network solver)
• Simulation language
• E.g., SIMULA and SIMSCRIPT
• Save time due to built-in facilities for time advancing,
event scheduling, entity manipulation, random variate
generation, statistical data collection, and report
generation
• Analysts can focus on issues specific to the system
• Code is easy to read, and error detection is good
• Simulation languages can be classified into two categories
• Continuous simulation languages, e.g., CSMP and DYNAMO
• Discrete-event simulation languages, e.g., SIMULA and GPSS
• Some languages allow discrete, continuous, and combined simulation, e.g., SIMSCRIPT and GASP
• General-purpose language
• Analysts are familiar with these languages
• Deadlines might not allow them to learn a simulation language
• Also, a simulation language might not be available to them
• Most people write their first simulation in a general-purpose language
• People do not want to spend time learning simulation languages
• However, they have to spend time developing routines for event handling, random-number generation, and so forth
• An extension of a general-purpose language
• E.g., GASP (for FORTRAN)
• PyCX, PyDSTool, SimPy, etc. for Python
• They should contain a collection of routines to handle
tasks required in simulation
• To provide efficiency, flexibility, and portability
• Simulation Packages
• E.g., QNET4 and RESQ
• A library of data structures, routines, and algorithms
• Biggest advantage is the time savings
• The main problem is inflexibility
• In most practical situations, analysts run into one or
another problem that cannot be modeled by the
package
• The analyst might have to make simplifications
24.5 Types of Simulations
24.5.1 Monte Carlo Simulation
• A static simulation, or one without a time axis
• Used to model probabilistic phenomena that do not change characteristics with time
Monte Carlo
• Example: find the area under $y = x^2$ for $x = 0$ to $x = 2$
• p = probability that a random point falls under the graph
• X = area that we want to find (light blue)
• A = area of the bounding rectangle
• $p = \frac{X}{A} = \frac{X}{2 \cdot 4} = \frac{X}{8}$, so $X = 8p$
• Generate N random points with $0 \le x \le 2$, $0 \le y \le 4$
• If n points fall in the light blue area, then $X \approx 8\,\frac{n}{N}$
Ref: https://preecha11th.wordpress.com/2011/11/06/monte-carlo-method-คืออะไร
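The area estimate can be sketched in a few lines of Python. This is only an illustration of the idea on this slide: the function name `estimate_area`, the fixed seed, and the number of points are assumptions made for the example.

```python
import random


def estimate_area(num_points, seed=12345):
    """Estimate the area under y = x**2 on [0, 2] by random sampling."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    hits = 0
    for _ in range(num_points):
        x = rng.uniform(0.0, 2.0)      # random point inside the 2-by-4 rectangle
        y = rng.uniform(0.0, 4.0)
        if y <= x * x:                 # the point lies under the curve
            hits += 1
    rectangle_area = 2.0 * 4.0         # A = 8
    return rectangle_area * hits / num_points   # X = 8 * n / N


if __name__ == "__main__":
    # The exact area is 8/3; the estimate should get closer as N grows.
    print(estimate_area(200_000))
```

There is no time axis anywhere in the code, which is what makes this a static (Monte Carlo) simulation.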
24.5.2 Trace-Driven Simulation
• A trace is a time-ordered record of events on a real
system
• They are generally used in analyzing or tuning resource
management algorithms
• Paging algorithms, cache analysis, CPU scheduling
algorithms, deadlock prevention algorithms, and
algorithms for dynamic allocation of storage are
examples of cases where trace-driven simulation has
been successfully used
• Sherman and Brown (1973) point out the following advantages of trace-driven simulations
1. Credibility: the results are easily accepted by others
2. Easy Validation: measure the performance characteristics of the real system while collecting the trace, then compare them with the simulation results
3. Accurate Workload: a trace preserves the
correlation and interference effects in the
workload
4. Detailed Trade-offs: the high level of detail in the workload makes it possible to study the effect of small changes in the model or algorithms
5. Less Randomness: a trace is a deterministic input, so overall the output has less variance
6. Fair Comparison: a trace allows different alternatives to be compared under the same input stream; in other models, the input is generated from a random stream
7. Similarity to the Actual Implementation: a trace-driven model is very similar to the system being modeled.
• The disadvantages of trace-driven simulations are
as follows:
1. Complexity: requires a more detailed simulation of the system
2. Representativeness: a trace taken on one system may not be representative of the workload on another system, and it may become obsolete
3. Finiteness: a trace is a long sequence; even a few minutes of activity may produce a huge trace
4. Single Point of Validation: One should use a few
different traces to validate the results
5. Detail: because of the high level of detail, the trace takes a long time to read and process
6. Trade-off: it is difficult to change workload characteristics; a trace of the changed workload is required
24.5.3 Discrete-Event Simulations
• A simulation using a discrete-state model of the system
• In computer systems, discrete-event models are used
since the state of the system is described by the number
of jobs at various devices
• A discrete-event simulation may use discrete- or
continuous-time values
• All discrete-event simulations have a common structure
• Regardless of the system being modeled, the simulation
will have some of the components described here
1. Event Scheduler: keeps a linked list of events waiting to happen; the events can be manipulated in various ways
(a) Schedule event X at time T.
(b) Hold event X for a time interval dt.
(c) Cancel a previously scheduled event X.
(d) Hold event X indefinitely (until it is scheduled by another event).
(e) Schedule an indefinitely held event.
2. Simulation Clock and a Time-advancing Mechanism:
• The clock is a global variable representing simulated time
• The scheduler is responsible for advancing this time
• Unit-time approach: increments time by a small increment and then checks whether any events can occur
• Event-driven approach: advances the time automatically to the time of the next earliest occurring event
3. System State Variables:
• These are global variables that describe the state of the
system
4. Event Routines:
• Each event is simulated by its routine
• These routines update the system state variables and
schedule other events
5. Input Routines:
• The input routines typically allow a parameter to be varied in
a specified manner
• For example, the simulation may be run with mean CPU
demand varying from 1 to 9 milliseconds in steps of 2
milliseconds
6. Report Generator:
• calculate the final result and print in a specified format
7. Initialization Routines:
• These set the initial state of the system state variables
and initialize various random-number generation
streams
• It is suggested that there be separate routines to
initialize the state at the beginning of a simulation, at
the beginning of an iteration, and at the beginning of a
repetition
8. Trace Routines:
• These print out intermediate variables as the simulation
proceeds
• They help debug the simulation program
• It is advisable that the trace have an on/off feature
9. Dynamic Memory Management:
• Entities are created and destroyed as the simulation runs, so periodic garbage collection is needed
• Most simulation languages and many general-purpose languages provide this automatically
10. Main Program:
• This brings all the routines together. It calls input
routines, initializes the simulation, executes various
iterations, and finally, calls the output routines.
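Several of these components can be seen together in a small event-driven simulation of one CPU with a FIFO job queue. This is a minimal sketch, not a definitive implementation: the function name `simulate_single_server`, the exponential interarrival and service times, the two seeds, and the use of a heap instead of a linked list for the event list are all assumptions made for the example.

```python
import heapq
import random


def simulate_single_server(mean_interarrival, mean_service, end_time, seed=1):
    """Event-driven simulation of one CPU serving a FIFO job queue."""
    # Initialization routine: one random stream per process.
    # (The seed values here are illustrative only.)
    arrival_rng = random.Random(seed)
    service_rng = random.Random(seed + 1)

    clock = 0.0          # simulation clock
    events = []          # event list: (time, sequence number, kind)
    seq = 0              # tie-breaker so equal times stay in scheduling order
    queue_length = 0     # state variable: jobs waiting
    busy = False         # state variable: CPU in use
    completed = 0

    def schedule(time, kind):
        """Event scheduler: put an event on the event list."""
        nonlocal seq
        heapq.heappush(events, (time, seq, kind))
        seq += 1

    schedule(arrival_rng.expovariate(1.0 / mean_interarrival), "arrival")

    while events:
        time, _, kind = heapq.heappop(events)
        if time > end_time:
            break
        clock = time     # event-driven approach: jump to the next event
        if kind == "arrival":          # event routine for a job arrival
            schedule(clock + arrival_rng.expovariate(1.0 / mean_interarrival),
                     "arrival")
            if busy:
                queue_length += 1
            else:
                busy = True
                schedule(clock + service_rng.expovariate(1.0 / mean_service),
                         "departure")
        else:                          # event routine for a job departure
            completed += 1
            if queue_length > 0:
                queue_length -= 1
                schedule(clock + service_rng.expovariate(1.0 / mean_service),
                         "departure")
            else:
                busy = False

    # Report generator: return the final statistics.
    return {"clock": clock, "completed": completed,
            "queue_length": queue_length}


if __name__ == "__main__":
    print(simulate_single_server(2.0, 1.0, 1000.0))
```

The clock never ticks in fixed steps; it moves directly from one event time to the next, which is the event-driven approach described in item 2.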
Ch. 26 Random-Number Generation
Introduction
• A routine to generate random values is a key component of a simulation
• First, generate a sequence of random numbers distributed uniformly between 0 and 1 (random-number generation)
• Then, the sequence is used to produce random values satisfying the desired distribution (random-variate generation)
• Random-number generation is discussed here
26.1 Desired Properties of a Good
Generator
• The most common method is to use a recursive relation in which the next number is a function of the previous ones
• For example, $x_n = (5x_{n-1} + 1) \bmod 16$
• $x_0 = 5$
• The first 32 numbers: 10, 3, 0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5, 10, 3, 0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5
• Divided by 16: 0.6250, 0.1875, 0.0000, 0.0625, 0.3750, 0.9375, 0.7500, 0.8125, 0.1250, 0.6875, 0.5000, 0.5625, 0.8750, 0.4375, 0.2500, 0.3125, 0.6250, 0.1875, 0.0000, 0.0625, 0.3750, 0.9375, 0.7500, 0.8125, 0.1250, 0.6875, 0.5000, 0.5625, 0.8750, 0.4375, 0.2500, 0.3125
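The numbers listed in the example are consistent with the recurrence $x_n = (5x_{n-1} + 1) \bmod 16$ started from $x_0 = 5$. A short sketch, assuming that recurrence (the generator function name `lcg` is made up for the illustration):

```python
from itertools import islice


def lcg(a, b, m, seed):
    """Yield x_1, x_2, ... from x_n = (a * x_{n-1} + b) mod m."""
    x = seed
    while True:
        x = (a * x + b) % m
        yield x


# Assumed parameters: a = 5, b = 1, m = 16, seed x_0 = 5.
first_16 = list(islice(lcg(a=5, b=1, m=16, seed=5), 16))
uniform_16 = [x / 16 for x in first_16]   # scale to the interval [0, 1)

print(first_16)
print(uniform_16)
```

After 16 numbers the seed value 5 comes back, so the same 16 numbers repeat from then on.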
• X0 is called the seed
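• The function f is deterministic, so the numbers are pseudo-random
• Pseudo-random numbers are preferable to fully random numbers because the sequence is repeatable across simulation runs
• The seed can be changed to generate a different sequence
• In the example, only the first 16 numbers are unique; the sequence then repeats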
• The desired properties of the generator function
are as follows:
1. It should be efficiently computable
• The processor time required should be small
2. The period should be large
• Otherwise, it may limit the useful length of simulation runs
3. The successive values should be independent and
uniformly distributed
• The correlation between successive numbers should be small
26.2 Linear-Congruential
Generators
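• In 1951, D. H. Lehmer found that the residues of successive powers of a number have good randomness properties: $x_n = a^n \bmod m$, or equivalently $x_n = a\,x_{n-1} \bmod m$
• a = multiplier (Lehmer used 23), m = modulus (Lehmer used $10^8 + 1$)
• Many of the currently used random-number generators are a generalization of this form, the linear-congruential generator (LCG): $x_n = (a\,x_{n-1} + b) \bmod m$
• $x_n$ is an integer between 0 and m-1; a and b are non-negative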
• The choice of a, b, and m affects the period and the autocorrelation in the sequence
• Results from researchers
1. The modulus m should be large, since the period can never exceed m
2. For efficient mod m computation, m should be a power of 2, $2^k$
3. If b is nonzero, the maximum possible period m is obtained if and only if
a) Integers m and b are relatively prime, that is, they have no common factors other than 1
b) Every prime number that is a factor of m is also a factor of a-1, and
c) a-1 is a multiple of 4 if integer m is a multiple of 4
• Notice that all conditions are met if m = $2^k$, a = 4c + 1, and b is odd, where c, b, and k are positive integers
• Full-period generator: a generator that has the maximum possible period
• Full-period generators are not all equally good; those with lower autocorrelation between successive numbers are preferable
• For example, two generators with the same full period can have correlations of 0.25 and $2^{-18}$
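The three full-period conditions for $x_n = (a\,x_{n-1} + b) \bmod m$ with nonzero b can be checked mechanically, and for a small modulus the result can be compared against a brute-force count. The helper names below (`prime_factors`, `has_full_period`, `period`) are hypothetical, chosen only for this sketch.

```python
from math import gcd


def prime_factors(n):
    """Return the set of prime factors of n (trial division, small n only)."""
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def has_full_period(a, b, m):
    """Check conditions (a), (b), (c) for x_n = (a*x_{n-1} + b) mod m, b != 0."""
    if b == 0:
        return False                      # the conditions apply to nonzero b
    if gcd(b, m) != 1:
        return False                      # (a) m and b relatively prime
    if any((a - 1) % p != 0 for p in prime_factors(m)):
        return False                      # (b) primes of m divide a-1
    if m % 4 == 0 and (a - 1) % 4 != 0:
        return False                      # (c) a-1 multiple of 4 if m is
    return True


def period(a, b, m, seed=0):
    """Brute-force cycle length starting from `seed` (small m only)."""
    first_seen = {}
    x, n = seed, 0
    while x not in first_seen:
        first_seen[x] = n
        x = (a * x + b) % m
        n += 1
    return n - first_seen[x]


if __name__ == "__main__":
    for a, b, m in [(5, 1, 16), (3, 1, 16), (5, 2, 16)]:
        print(a, b, m, has_full_period(a, b, m), period(a, b, m))
```

With m = 16, the choice a = 5, b = 1 has the form a = 4c + 1 with odd b and reaches the full period of 16, while a = 3 violates condition (c) and falls short.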
26.2.1 Multiplicative LCG
• b = 0: $x_n = a\,x_{n-1} \bmod m$
• Less computation time
• Two types of multiplicative LCGs
• m = $2^k$
• m ≠ $2^k$
26.2.2 Multiplicative LCG with m = $2^k$
• Easy mod operation
• Maximum period: $2^{k-2} = 2^k/4$
• This period is achieved if the multiplier a is of the form 8i ± 3 and the initial seed is an odd integer
Example 26.1
• $x_n = 5x_{n-1} \bmod 2^5$
• Using a seed of $x_0 = 1$:
• 5, 25, 29, 17, 21, 9, 13, 1, 5,…
• Period = 8 = 32/4
• If the initial seed is not an odd integer
• With $x_0 = 2$, the sequence is: 10, 18, 26, 2, 10,…
• The period is only 4.
• If the multiplier a is not of the form 8i ± 3
• $x_n = 7x_{n-1} \bmod 2^5$
• Using a seed of $x_0 = 1$, we get the sequence: 7, 17, 23, 1, 7,…
• The period is only 4
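The three cases of Example 26.1 can be reproduced with a few lines of Python. The function name `multiplicative_cycle` is an assumption made for this sketch; it simply lists the values of $x_n = a\,x_{n-1} \bmod m$ until one repeats.

```python
def multiplicative_cycle(a, m, seed):
    """Return x_1, x_2, ... of x_n = a*x_{n-1} mod m up to the first repeat."""
    values = []
    seen = set()
    x = seed
    while True:
        x = (a * x) % m
        if x in seen:
            return values
        seen.add(x)
        values.append(x)


if __name__ == "__main__":
    m = 2 ** 5
    print(multiplicative_cycle(5, m, 1))   # a = 8*1 - 3, odd seed
    print(multiplicative_cycle(5, m, 2))   # even seed
    print(multiplicative_cycle(7, m, 1))   # a not of the form 8i +/- 3
```

Only the first case reaches the maximum period of $2^{k-2} = 8$; the other two stop at 4.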
Multiplicative LCG with m ≠ $2^k$
• A solution to the small-period problem is to use a modulus m that is a prime number
• With a proper multiplier a, a period of m-1 is possible
• This is almost equal to the maximum possible period length of m
• $x_n$ can never be zero if m is prime and the seed is nonzero
• $x_n$ is between 1 and m-1
• A generator with a period of m-1 is a full-period generator
• Not all values of the multiplier are equally good
• A multiplicative LCG will be a full-period generator if and only if
• a is a primitive root of the modulus m
• a is a primitive root of m iff $a^n \bmod m \neq 1$ for $n = 1, 2, \ldots, m-2$
Example 26.2
• $x_n = 3x_{n-1} \bmod 31$
• Starting with a seed of $x_0 = 1$:
• 1, 3, 9, 27, 19, 26, 16, 17, 20, 29, 25, 13, 8, 24, 10, 30, 28, 22, 4, 12, 5, 15, 14, 11, 2, 6, 18, 23, 7, 21, 1, …
• The period is 30
• ⇒ 3 is a primitive root of 31
• $x_n = 5x_{n-1} \bmod 31$
• With a multiplier of a = 5: 1, 5, 25, 1,…
• The period is only 3 ⇒ 5 is not a primitive root of 31
• $5^3 \bmod 31 = 125 \bmod 31 = 1$
• Primitive roots of 31 = 3, 11, 12, 13, 17, 21, 22, and 24.
• a is a primitive root of m iff $a^n \bmod m \neq 1$ for $n = 1, 2, \ldots, m-2$
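The primitive-root condition can be tested directly from its definition when the prime modulus is small. The function name `is_primitive_root` is an assumption for this sketch, and the brute-force loop is meant for small m only.

```python
def is_primitive_root(a, m):
    """True if a**n mod m != 1 for n = 1, ..., m-2 (m is assumed prime)."""
    x = 1
    for _ in range(1, m - 1):      # n = 1, 2, ..., m-2
        x = (x * a) % m            # x is now a**n mod m
        if x == 1:
            return False
    return True


roots_of_31 = [a for a in range(2, 31) if is_primitive_root(a, 31)]

if __name__ == "__main__":
    print(is_primitive_root(3, 31))   # multiplier from the first generator
    print(is_primitive_root(5, 31))   # multiplier from the second generator
    print(roots_of_31)
```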
Other Generators
• Tausworthe Generators
• Extended Fibonacci Generators
• Combined Generators
• Combine two or more random generators
• Adding, exclusive-or, shuffle, etc.
26.7 Seed Selection
• If only one random variable is required, any seed value is okay
• Multistream simulations need more than one random stream, for example one for arrival times and one for service times
• Most of these guidelines are for multistream simulations
1. Do not use zero
• Some generators, such as multiplicative LCGs, will get stuck at zero
2. Avoid even values
• For a multiplicative LCG with modulus m = $2^k$, the seed should be odd
• If possible, avoid generators that have too many conditions on seed values or whose performance depends upon the seed value
3. Do not subdivide one stream
• Do not generate one stream {$u_1, u_2, \ldots$} from seed $u_0$ and then use $u_1$ for an arrival time, $u_2$ for a service time, $u_3$ for an arrival time, …
• This may result in a strong correlation between the two variables
4. Use nonoverlapping streams
• Each stream has a separate seed, but two streams can still overlap if the seeds are poorly chosen
• Overlap causes a correlation between the streams, and the results will not be independent
5. Reuse seeds in successive replications
• The seeds left over from the previous replication can continue to be used; the random streams need not be reinitialized
6. Do not use random seeds
• Such as the time of day
• Two problems: the simulation cannot be reproduced, and it is not possible to guarantee that the multiple streams will not overlap
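One way to obtain nonoverlapping streams from a multiplicative LCG is to space the seeds a fixed number of steps apart along the same sequence. Because $x_n = a^n x_0 \bmod m$ for a multiplicative LCG, the seed that lies n steps ahead can be computed without generating the numbers in between. The sketch below uses the small generator $x_n = 3x_{n-1} \bmod 31$ purely for illustration; the function names and the spacing of 10 are assumptions.

```python
def jump_ahead(a, m, seed, n):
    """Return x_n of x_n = a*x_{n-1} mod m without stepping n times."""
    return (pow(a, n, m) * seed) % m      # x_n = a**n * x_0 mod m


def stream_seeds(a, m, seed, num_streams, spacing):
    """Seeds x_0, x_spacing, x_2*spacing, ... for separate streams."""
    return [jump_ahead(a, m, seed, i * spacing) for i in range(num_streams)]


if __name__ == "__main__":
    # Three streams, each allowed to use up to 10 numbers before it would
    # run into the start of the next stream.
    print(stream_seeds(3, 31, 1, num_streams=3, spacing=10))
```

Each stream can then draw up to `spacing` numbers without reusing values that belong to another stream.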
26.8 Myths About Random-
Number Generation
1. A complex set of operations leads to random
results
• Using a sequence of operations where the final
result is difficult to guess does not necessarily
mean that the resulting sequence will pass the test
for uniformity and independence
• It is better to use simple operations that can be
analytically evaluated for randomness
2. A single test such as the chi-square test is sufficient
to test the goodness of a random-number generator
• The sequence 0,1,2,3,…,m-1 is obviously not random, but it will pass the chi-square test with a perfect score
• It is therefore necessary to use as many tests as possible
• It is better to avoid inventing new generators unless you are highly sophisticated statistically
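A small experiment shows why one test is not enough. The sketch below computes a chi-square statistic for uniformity and a lag-1 correlation for the sequence 0, 1, …, m-1. The function names, m = 1000, and the choice of 10 cells are assumptions made for the illustration.

```python
def chi_square_uniform(values, m, num_cells):
    """Chi-square statistic for integers in 0..m-1 against equal-size cells."""
    observed = [0] * num_cells
    for x in values:
        observed[x * num_cells // m] += 1        # cell index of x
    expected = len(values) / num_cells
    return sum((o - expected) ** 2 / expected for o in observed)


def lag1_correlation(values):
    """Sample correlation between successive values x_i and x_{i+1}."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    cov = sum((values[i] - mean) * (values[i + 1] - mean)
              for i in range(n - 1)) / (n - 1)
    return cov / var


m = 1000
ramp = list(range(m))                            # 0, 1, 2, ..., m-1

if __name__ == "__main__":
    print(chi_square_uniform(ramp, m, 10))       # uniformity looks perfect
    print(lag1_correlation(ramp))                # but successive values are
                                                 # almost perfectly correlated
```

The uniformity test alone would accept this sequence; a test of independence rejects it immediately.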
3. Random numbers are unpredictable
• A truly random sequence should be completely unpredictable
• For example, given the history of the throws of a fair die, it is impossible to predict the next throw
• This is not the case with pseudo-random number generators
• Given a few successive numbers from an LCG, the parameters a, b, and m can be computed
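A simplified version of this attack fits in a few lines. The sketch assumes that the modulus m is already known and prime, and recovers a and b of $x_n = (a\,x_{n-1} + b) \bmod m$ from three successive outputs; recovering m as well takes more outputs and is not shown. The function name `recover_lcg` and the demonstration parameters are assumptions, and the three-argument `pow` with exponent -1 needs Python 3.8 or later.

```python
def recover_lcg(x0, x1, x2, m):
    """Recover (a, b) of x_n = (a*x_{n-1} + b) mod m from three outputs.

    Uses x2 - x1 = a * (x1 - x0) mod m, so (x1 - x0) must be invertible
    mod m; that always holds when m is prime and x1 != x0.
    """
    diff = (x1 - x0) % m
    a = ((x2 - x1) * pow(diff, -1, m)) % m
    b = (x1 - a * x0) % m
    return a, b


# Demonstration with made-up "secret" parameters and a known prime modulus.
secret_a, secret_b, modulus = 9806, 1, 2 ** 17 - 1
outputs = [12345]
for _ in range(3):
    outputs.append((secret_a * outputs[-1] + secret_b) % modulus)

a, b = recover_lcg(outputs[0], outputs[1], outputs[2], modulus)
predicted_next = (a * outputs[2] + b) % modulus

if __name__ == "__main__":
    print(a, b)
    print(predicted_next == outputs[3])
```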
4. Some seeds are better than others
• This may be true for some generators
• For example, $x_n = (9806\,x_{n-1} + 1) \bmod (2^{17} - 1)$
• For $x_0 = 37{,}911$, it gets stuck at $x_n = 37{,}911$ forever
• Such generators should be avoided
• Some generators need the seed to be odd
• Generators whose period or randomness depends upon the seed should not be used
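The stuck seed of the generator $x_n = (9806\,x_{n-1} + 1) \bmod (2^{17} - 1)$ can be checked, and even derived, with a short calculation: a fixed point satisfies $(a - 1)\,x \equiv -b \pmod m$. The names below are chosen for this sketch only, and the modular inverse via `pow(..., -1, m)` needs Python 3.8 or later.

```python
A, B, M = 9806, 1, 2 ** 17 - 1      # generator parameters from the example


def step(x):
    """One step of x_n = (A * x_{n-1} + B) mod M."""
    return (A * x + B) % M


# Solve (A - 1) * x = -B (mod M) for the seed that maps to itself.
fixed_point = (-B * pow(A - 1, -1, M)) % M

if __name__ == "__main__":
    print(fixed_point)
    print(step(fixed_point) == fixed_point)
```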
5. Accurate implementation is not important
• The period and randomness properties of
generators are guaranteed only if the generation
formula is accurately implemented without any
overflow or truncation
6. Bits of successive words generated by a random-number generator are equally randomly distributed
• Generally, any particular bit position or sequence of bit positions will not be equally random
Example 26.7
• $x_n = (25{,}173\,x_{n-1} + 13{,}849) \bmod 2^{16}$
• The cyclic behavior of low-order bits illustrated in Example 26.7 is typical of all LCGs with modulus $m = 2^k$
• The high-order bits are more randomly distributed than the low-order bits
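The cyclic behavior of the low-order bits can be measured for the generator $x_n = (25173\,x_{n-1} + 13849) \bmod 2^{16}$. The sketch below is a hedged illustration: the function name `bit_period` and the seed of 1 are assumptions, and the code relies on this generator having the full period $2^{16}$ (b is odd and a-1 is a multiple of 4), so that every bit sequence repeats with a period that is a power of 2.

```python
def bit_period(a, b, k, bit, seed=1):
    """Period of one bit (0 = least significant) of a full-period LCG mod 2**k."""
    m = 2 ** k
    x = seed
    bits = []
    for _ in range(m):                   # one full cycle of the generator
        x = (a * x + b) % m
        bits.append((x >> bit) & 1)
    p = 1
    while p < m:                         # try periods 1, 2, 4, ..., m/2
        if all(bits[i] == bits[(i + p) % m] for i in range(m)):
            return p
        p *= 2
    return m


if __name__ == "__main__":
    for bit in (0, 1, 2, 3, 15):
        print(bit, bit_period(25173, 13849, 16, bit))
```

The lowest bit simply alternates, and each higher bit has twice the period of the one below it, so only the top bits come close to looking random.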
Next
Introduction to Queueing Theory
Q&A
