Session 1550

Accelerating Simulations Using

Efficient Modeling Techniques

R&D Solutions for Commercial
and Defense Networks
Session Goals

Learn the basic factors affecting the efficiency of Modeler simulations through a combination of lectures and labs
Learn and practice Modeler efficiency techniques and methodologies to identify causes of performance problems
Topics include
- Kernel types, compiler settings and other environmental factors
- Efficiency techniques
- Profiling programs
- General programming practices
- Tradeoffs
- Advanced efficiency techniques to reduce memory
- OPNET data structures, algorithms and memory API
- Wireless modeling efficiency

Simulation Optimization Goals

Session organization: from the Simple to the Complex

Environment Efficiency: the Low Hanging Fruit
- Lab 1: Using Development vs. Optimized Kernel

Programming Efficiency
- Data Structures & Algorithms
- Lab 2: Routing Protocol Profiling
- Discrete Event Simulation Kernel Efficiency
- Memory Efficiency
- Lab 3: Memory Statistics

Modeling Efficiency
- Trading Accuracy for Speed
- Using Global Space and Memory Sharing
- Lazy Evaluation
- Overloading Information Exchange
- Lab 4: Operating Protocols Efficiently

Take-Away Points
Simulation Optimization Goals

- Reduce wall-clock time
- Increase events/second

- Less memory
- Less chance of using swap space
- Better performance scaling with size of model

- Easier to maintain

- Modeling results accurate according to assumptions

Time / Space / Fidelity Trade-Offs
- Choose what is more important

Simulation Methodology (from 1572)

[Figure: simulation methodology flowchart. Surviving box labels: "Understanding the ...", "Understanding your goals for the ...", "Choosing aspects to be modeled", "Defining input and ...", "Specifying the system model", "Choosing input and ...", "running simulations", "System results". Annotation: Abstraction vs. Fidelity]


What Takes Time in a

Discrete Event Simulation?

Model Code
- Process models
- Pipeline stages
- External files

DES Kernel Services
- Kernel Procedures: ima, pk, sar, list, ...
- Time spent in kernel procedures is dictated by model code

DES Kernel Engine
- Event
- Dispatching interrupts
- Handling packet send/receive

Operating System
- Memory

[Figure: layer diagram with User Models above the DES Kernel]

Simulation Models

Simulation Methods
- Discrete event simulation (DES) models and engine
- Flow Analysis models and engine

DES models
- Open source C/C++ code
- Determine protocol behavior including data + control plane
- Utilize DES kernel to run simulations
- Capture detailed component/network behavior

Understand efficient modeling techniques to operate and design DES protocol models

Optimize Everything!

Efficient environment
Efficient simulation kernel
Efficient compilation
Efficient model representation
Efficient configuration of protocols
Efficient code
- Data structures
- Algorithms
- Efficient Kernel Procedures
- Model design and abstraction

What Gains Should You Expect

Pareto Principle: 80/20 rule
- 80% of time/memory/etc. in 20% of code
- Optimize most time/memory-consuming code first

Gains vary enormously depending on technique
- Most result in small improvement (1-5% overall speed or memory)
- A few result in big speed or memory improvement
  - Replace use of inappropriate data structure or algorithm
  - Achievable by trimming unnecessary computations (precision issues)

Efficient Environment

Faster hardware
- Multi-core processor machines

More memory: eliminates swapping

Less CPU contention
- Eliminate other activities: no complex screen savers

Run simulation from command-line or OPNET Console
- Or minimal GUI
  - Prevent speed and memory graphs from being displayed
  - Reduce frequency of simulation progress updates

Efficient Environment:
Concurrent Simulations

Modern CPUs have more than one microprocessor core
- Execute a simulation run on each core
- Distribute simulation runs to different hosts

OPNET Preference configuration:
- Allow Simulations on Multiple Hosts: TRUE
- Distributed Simulation Hosts: <hostname (localhost)>::<number of CPU cores>

Need separate Simulation Run-Time license for each concurrent run
- Site Simulation Runtime License covers unlimited number of concurrent runs

Efficient Simulation Kernel

Development Kernel
- Model
- OPNET Simulation Debugger (ODB)
- Profiling capability in Kernel Procedures

Optimized Kernel
- No ODB, tracing, or KP profiling
- Typically 50-100% faster than Development Kernel
- Use for production simulation after development

Parallel Kernel
- Use only with multi-threaded models

OPNET Preference kernel_type

Efficient Compilation

Use compiler optimizations
- Configuration using OPNET preferences
  - comp_prog selects compiler
  - comp_flags_devel for development kernel
  - comp_flags_optim for optimized kernel
- See Appendix for compiler flags or OPNET FAQs

Compile without OPNET function call stack (FCS) info
- FCS: error reporting overhead for quickly debugging problems
- Increases speed by 30-50%
- comp_trace_info preference
- Default is disabled when compiling
LAB 1:
Simulation Kernel Performance

Assess baseline simulation performance
Explore OPNET development environment configuration
Capability of environment + simulation kernel + compiler
Compare debug and optimized kernels
Run multiple simulations concurrently

LAB 1 Summary:
Simulation Kernel Performance

Use Optimized Kernel
- 2x speed improvement for simplest simulation
- Use unless debugging problem

Remove function stack trace information for speed

Reduce frequency of progress update

Solve Problems Before Dealing with Efficiency:

Code Optimization Methodology


Understand the perceived problem

Define the abstract problem
Design the solution
Implement a prototype
Verify correctness
Benchmark prototype
Identify optimization candidate
Optimize/redesign code
Repeat as necessary

Profiling Program Execution

Objective: to shorten execution time
Find hot spots accurately by code instrumentation
- Function timing data
- Number of times function called

Use commercially-available software
- Rational Quantify, compiler profilers (prof, gprof, Microsoft Profiler)

OPNET built-in profiler
- Relies on FIN/FOUT/FRET macros in user code
- Results not precise, but accurate for finding bottlenecks
- Kernel procedures only profiled if using development kernel
  - Optimized kernel shows only user model functions
- More details in upcoming lab

Question: how to improve inefficiency uncovered by profiling?

Complexity Analysis

Big-O notation
Way to quantify performance and compare different algorithms
- O(1), O(log n), O(n), O(n^2), O(2^n)
- Worst-case order of growth
- By inspection of solution
- Basic understanding of problem and solution
- Quick comparison of different approaches

Example: What is order of growth of this code?

for (i = 0; i < n; i++) {
    if (array [i] == x)
        return (true);
}
return (false);

Comparing Data Structures

Array (vector)
- Contiguous sequence of items
- Fixed number
- Direct access by index
- Access: O(1)
- Insert: O(N)
- Delete: O(N)

Doubly Linked List
- Non-contiguous sequence of items
- Variable number
- Indirect access starting from beginning/end
- Access: O(N)
- Insert: O(1)
- Delete: O(1)
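
The insert costs listed above can be seen in a few lines of plain C. This is a minimal sketch, not OPNET code: `array_insert`, `list_insert_after` and the `Node` type are hypothetical names introduced only for illustration. The array version has to shift every later element, while the list version only rewires pointers.

```c
#include <stddef.h>
#include <string.h>

/* Insert `value` at position `pos` in an array currently holding `n` ints.
 * Every element from `pos` onward is shifted one place, so the cost is O(N).
 * The caller must guarantee room for n + 1 elements. Returns the new count. */
size_t array_insert(int *items, size_t n, size_t pos, int value)
{
    memmove(&items[pos + 1], &items[pos], (n - pos) * sizeof items[0]);
    items[pos] = value;
    return n + 1;
}

/* Node of a doubly linked list (hypothetical type for this sketch). */
typedef struct Node {
    int          value;
    struct Node *prev;
    struct Node *next;
} Node;

/* Link `node` directly after `where`: a fixed number of pointer updates,
 * O(1) regardless of list length. Finding `where` is the O(N) part. */
void list_insert_after(Node *where, Node *node)
{
    node->prev = where;
    node->next = where->next;
    if (where->next != NULL)
        where->next->prev = node;
    where->next = node;
}
```

Note that the O(1) list insert assumes the insertion point is already known; locating it by value is still a sequential walk.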

Searching Algorithms

Problem: Find a particular value in a collection of elements

Sequential search
- Works for any data structure (array, list, etc.)
- O(n)

Binary search
- Requires ordered array/vector
- O(log n) on average
- Standard implementations
- Not appropriate for linked lists (no direct element access)
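
As a sketch of the two approaches, the following plain C compares a hand-written sequential search with the standard library's `bsearch`, which is one of the "standard implementations" of binary search. The function names `sequential_search` and `binary_search` are assumptions made for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Sequential search: works on unsorted data, O(n) comparisons.
 * Returns the index of `key`, or -1 if it is absent. */
long sequential_search(const int *items, size_t n, int key)
{
    for (size_t i = 0; i < n; i++)
        if (items[i] == key)
            return (long) i;
    return -1;
}

/* Comparison callback in the form bsearch() expects. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Binary search via the C standard library. `items` must be sorted in
 * ascending order; the search takes O(log n) comparisons. */
long binary_search(const int *items, size_t n, int key)
{
    const int *hit = bsearch(&key, items, n, sizeof items[0], compare_ints);
    return (hit != NULL) ? (long) (hit - items) : -1;
}
```

Both functions return the same index on a sorted array of distinct values; only the number of comparisons differs.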

Searching Algorithms (cont.)

Tree search
- O(log n)
- Insert/delete: O(height of tree), O(log n) for a balanced tree
- #include <prg_mapping.h>
- Example in OPNET Models

Searching Algorithms (cont.)

Hash table search

    index = hash_function (item key);

- O(1)
- Hash function on key
- Memory overhead
- Hashing overhead
- Collisions
- #include <prg_string_hash_funcs.h>
- #include <prg_bin_hash.h>
- Example in OPNET Models
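
To make the trade-offs above concrete, here is a minimal chained hash table in plain C. It is a sketch only and does not use or reproduce the OPNET `prg_string_hash` / `prg_bin_hash` packages; `HashTable`, `hash_put`, `hash_get`, the bucket count and the djb2 hash are all assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NUM_BUCKETS 64   /* table size chosen arbitrarily for this sketch */

typedef struct Entry {
    const char   *key;    /* not copied: the caller keeps the string alive */
    int           value;
    struct Entry *next;   /* chain of entries whose keys collide */
} Entry;

typedef struct {
    Entry *buckets[NUM_BUCKETS];   /* must start zero-initialized */
} HashTable;

/* Hashing overhead: djb2 string hash reduced to a bucket index. */
static size_t hash_function(const char *key)
{
    unsigned long h = 5381;
    while (*key != '\0')
        h = h * 33 + (unsigned char) *key++;
    return (size_t) (h % NUM_BUCKETS);
}

/* Insert or overwrite. Returns 0 on success, -1 if out of memory. */
int hash_put(HashTable *t, const char *key, int value)
{
    size_t index = hash_function(key);
    for (Entry *e = t->buckets[index]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            e->value = value;
            return 0;
        }
    }
    Entry *e = malloc(sizeof *e);   /* memory overhead: one node per key */
    if (e == NULL)
        return -1;
    e->key = key;
    e->value = value;
    e->next = t->buckets[index];
    t->buckets[index] = e;
    return 0;
}

/* Average O(1): only the entries in one bucket are compared. */
const int *hash_get(const HashTable *t, const char *key)
{
    size_t index = hash_function(key);
    for (const Entry *e = t->buckets[index]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return &e->value;
    return NULL;
}
```

Collisions are handled by chaining, so lookups degrade toward O(n) only if many keys land in the same bucket.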

OPNET API Data Structures

[Table: OPNET API data structures. Surviving column headers: API Package, Data Structure, Insert / ... Surviving rows:]
- resizable array
- linked list
- balanced tree: O(log n), O(log n)
- hash table (string keys)
- hash table (fixed length ...)



LAB 2:
Improving Code Efficiency

Routing protocol model
Identify bottlenecks using the OPNET Profiler
Apply different efficiency techniques

LAB 2 Summary:
Reworking Code for Efficiency

OPNET Simulation Profiler identifies inefficiencies in design
Redesign with more efficient algorithms and data structures
- 76% faster with hash table search over sequential search
- 86,944 events/second vs. 49,573 events/second

Efficiency of Packet Field APIs

Common operations on a packet
- Field
- Field types
- Transferring

Access by name: op_pk_nfd_ (pkptr, <name>,
(+) Safe if <name> not present
(+) Independent of field order
(-) Cost of string comparison inside KP

Access by index: op_pk_fd_ (pkptr, <index>,
(-) Need knowledge of index
(-) Dependent on field order
(+) Faster access
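
The cost difference between the two access styles can be sketched with a mock packet in plain C. This does not call or reproduce the OPNET packet kernel procedures; `MockPacket`, `field_get_by_index` and `field_get_by_name` are hypothetical stand-ins that only show where the string comparisons come from.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a packet with named integer fields. */
typedef struct {
    const char *name;
    int         value;
} Field;

typedef struct {
    Field  fields[8];
    size_t num_fields;
} MockPacket;

/* Access by index: one bounds check and one array read.
 * Fast, but the caller must know the field order. */
int field_get_by_index(const MockPacket *pk, size_t index, int *out)
{
    if (index >= pk->num_fields)
        return -1;
    *out = pk->fields[index].value;
    return 0;
}

/* Access by name: one string comparison per field until a match is found.
 * Slower, but safe when the name is absent and independent of field order. */
int field_get_by_name(const MockPacket *pk, const char *name, int *out)
{
    for (size_t i = 0; i < pk->num_fields; i++)
        if (strcmp(pk->fields[i].name, name) == 0)
            return field_get_by_index(pk, i, out);
    return -1;
}
```

A common compromise in user code is to resolve a name to an index once and reuse the index on the hot path.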

Packet Fields vs. Structs

Two ways of representing data items in packet
- Individual field for each item
- One structure field with corresponding C/C++ structure

struct {
    int src, dst, prio;
};

Individual packet fields
(+) More modular
(+) Automatic display in ODB
(-) Slower access to data

Fields of a C/C++ structure
(+) More compact representation
(-) Require display proc for ODB
(-) Require copy/delete/print procs
(+) Faster access to data

Field Get vs. Access for Structure Fields

_Get to extract data from packet
- Container packet internals updated to reflect extraction
- op_pk_(n)fd_get KPs for fields of type structure

_Access to modify data in-line
- op_pk_(n)fd_access_ptr KP for fields of type structure
- No update needed for container packet internals
- Useful to read/modify data in packet
- Packet sharing: make private
  - Potential structure data update
- Faster

_Access_Read_Only to peek at data inside packet
- op_pk_(n)fd_access_read_only_ptr KP for fields of type structure
- No update for packet internals nor packet sharing
- Fastest
Type-Specific Packet Field Access

Generic KPs to set/get packet fields
- Variable number or types of arguments
- No compiler help with bad calls
  - Likely to cause aborts at runtime due to type mismatch
- Deprecated APIs

Strongly Typed KPs
- <type>: dbl, info, int32, int64, objid, pkid, pkt, ptr, str

    op_pk_fd_get_dbl (Packet *, int index, double * val_ptr)
    op_pk_nfd_set_int32 (Packet *, const char * field, int val)

- Bad arguments reported when compiling the model
- Field type mismatch reported as simulation warning, not crash
- Faster than generic versions
Efficient Event State

Associating information with events

Pre-10.0: Must use Interface Control Information (ICI)

    Ici * op_ici_install (Ici * ici_ptr)
    Ici * op_ev_ici (Evhandle)

10.0+: Allows arbitrary C/C++ state structure

    void * op_ev_state_install (void * state_ptr, void*
    void * op_ev_state (Evhandle)

ICI
(+) More modular
(-) Slower access to data
(+) Automatic display in ODB
(-) Cannot view ICI internals in source debugger

C/C++ structure
(+) More compact representation
(+) Faster access to data
(-) Only display pointer in ODB
(+) Easier to manipulate in source debugger

Reducing Memory Usage

Swapping
- Insufficient physical memory for active tasks
- Temporarily transfer physical memory contents to disk
- Disk is slow
- Swapping dramatically decreases performance of simulations

Memory and performance
- Using less memory improves performance
  - Less paging => faster code
- Memory is slower than CPU
  - Less frequent memory access => faster code

CONFIDENTIAL RESTRICTED ACCESS: This information may not be disclosed, copied, or transmitted in any format without the prior written consent of OPNET Technologies, Inc. 2010 OPNET Technologies, Inc.


Identifying Memory Problems

Check high memory utilization
- Out-of-memory
- Look/listen for swapping
- Watch with tools

Memory monitoring tools
- MS Windows Task Manager, Performance Monitor (perfmon)
- OPNET Memory Tools

Dynamically Allocating Memory

C/C++ library calls
- new/delete

BEWARE: Memory fragmentation is possible with repeated allocations/deallocations

OPNET Memory
- Three
- APIs
  - op_prg_pmo (and prg_pmo) API package
  - op_prg_cmo (and prg_cmo) API package
  - op_prg_mem (and prg_mem) API package

OPNET Pooled Memory API
- Fixed size objects
- Allocate contiguous blocks

- Variable size objects
- Grouping of logically related memory allocations
  - Statistics per category

- Best used for transient memory

OPNET Pooled Memory API

Fixed-sized objects
- Simulations typically use fixed-sized objects
- Allocate blocks of struct objects
- Quickly allocate/deallocate objects
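
The idea behind pooled allocation of fixed-size objects can be sketched in plain C as a free list carved out of one contiguous block. This is an illustration under stated assumptions, not the OPNET `op_prg_pmo` implementation: `Pool`, `pool_init`, `pool_alloc`, `pool_free` and `pool_destroy` are hypothetical names, and objects needing stricter alignment than a pointer are out of scope.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    unsigned char *block;      /* one contiguous allocation for all objects */
    void          *free_list;  /* head of the chain of unused slots */
    size_t         slot_size;  /* object size rounded up to pointer alignment */
} Pool;

/* Reserve room for `count` (> 0) objects of `obj_size` bytes each.
 * Returns 0 on success, -1 if the block cannot be allocated. */
int pool_init(Pool *p, size_t obj_size, size_t count)
{
    size_t align = sizeof(void *);
    if (obj_size < align)
        obj_size = align;                 /* a free slot stores a pointer */
    obj_size = (obj_size + align - 1) / align * align;

    p->block = malloc(obj_size * count);
    if (p->block == NULL)
        return -1;
    p->slot_size = obj_size;
    p->free_list = NULL;
    /* Thread every slot onto the free list, lowest address first. */
    for (size_t i = count; i-- > 0; ) {
        void *slot = p->block + i * obj_size;
        *(void **) slot = p->free_list;
        p->free_list = slot;
    }
    return 0;
}

/* O(1): pop the first free slot, or return NULL when the pool is exhausted. */
void *pool_alloc(Pool *p)
{
    void *slot = p->free_list;
    if (slot != NULL)
        p->free_list = *(void **) slot;
    return slot;
}

/* O(1): push the slot back; no call into the general-purpose allocator. */
void pool_free(Pool *p, void *slot)
{
    *(void **) slot = p->free_list;
    p->free_list = slot;
}

void pool_destroy(Pool *p)
{
    free(p->block);
    p->block = NULL;
    p->free_list = NULL;
}
```

Because all objects share one block and one size, repeated allocate/deallocate cycles cannot fragment the heap the way mixed-size `malloc`/`free` traffic can.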


OPNET Memory Preferences

Preferences to disable memory management optimizations for detailed memory usage studies
- mem_opt.compact_pools
  - If TRUE, may share same blocks for pools of the same size
- mem_opt.pool_small_blocks
  - If TRUE, use Pooled Memory for small blocks of dynamically allocated memory
  - 8-byte overhead per memory object

Disabling memory management
- preference mem_optimize = false
- Find leaks with third-party memory debugging tools
  - IBM Rational Purify

OPNET Built-In Memory Statistics


Relies on using OPNET memory API
Detailed memory use statistics
Reports dynamically allocated memory use only

LAB 3:
Memory Statistics

OPNET's built-in memory utilization profiler
- Memory
- Memory Source Tracing

Interpret output results

See utility of OPNET memory API for analysis/debugging

LAB 3 Summary:
Memory Statistics

Finding memory leaks is easy in OPNET
- Memory Utilization graph
- Memory Statistics table
- Memory Source Tracing function call stacks

Structure Optimization

struct OPNETWORK_Session {
    ...
    char topic [128];
    ...
};

sizeof (struct OPNETWORK_Session)?

Structure Optimization (cont.)

struct OPNETWORK_Session {
    ...                  - 4 bytes
    ...                  - 4 bytes
    char topic [128];    - 128 bytes
    ...                  - 4 bytes
    ...                  - 8 bytes
};

sizeof (struct OPNETWORK_Session)?
- 4 + 4 + 128 + 8 + 4 = 148 bytes?
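
The question mark above is there because the compiler may add alignment padding, so `sizeof` can exceed the sum of the member sizes. The following sketch shows how to measure that. The member names are hypothetical, since only the sizes (4, 4, 128, 4 and 8 bytes) survive from the slide, and the exact result is implementation-defined; on common 64-bit ABIs the struct below typically occupies 152 bytes rather than 148.

```c
#include <stddef.h>

/* Hypothetical members matching the sizes listed on the slide. */
struct SessionPadded {
    int    id;             /* 4 bytes */
    int    room;           /* 4 bytes */
    char   topic[128];     /* 128 bytes */
    int    num_attendees;  /* 4 bytes, usually followed by padding */
    double start_time;     /* 8 bytes, usually 8-byte aligned */
};

/* Sum of the member sizes, ignoring any padding. */
size_t session_member_bytes(void)
{
    return 3 * sizeof(int) + 128 + sizeof(double);
}

/* Bytes the compiler added for alignment (0 if it added none). */
size_t session_padding_bytes(void)
{
    return sizeof(struct SessionPadded) - session_member_bytes();
}

/* Offset of the last member, which shows where the padding went. */
size_t session_start_time_offset(void)
{
    return offsetof(struct SessionPadded, start_time);
}
```

Printing `sizeof (struct SessionPadded)` and `session_padding_bytes()` on the target platform answers the slide's question for that compiler and architecture.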

Optimized Structure

Sub-byte struct members

struct OPNETWORK_Session_Optimal {
    char *          ...                 - 8 bytes
    ...                                 - 4 bytes
    unsigned char   numAttendees;       - 1 byte
    unsigned int    dayOfWeek : 3;      - 3 bits
    unsigned int    isFull : 1;         - 1 bit
};

sizeof (struct OPNETWORK_Session_Optimal)?
- 8 (char *) + 4 + 1 + 3/8 + 1/8 = 13.5, rounded up to 14 bytes => 16 bytes

Alternative to dynamic memory

Not deallocated
- Unique: uses hashing
- String comparison easier

From ip_qos_support.ex.c

    class_map_ptr->class_name =
        (char *) prg_string_const (class_map_name);

    if ((pkt_info->class_name != OPC_NIL) &&
        (class_map.class_name != OPC_NIL))
        if (pkt_info->class_name == class_map.class_name)
            match_found = OPC_TRUE;
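
The excerpt above compares two strings with `==` because each distinct string is stored exactly once. A minimal sketch of that idea (string interning) in plain C follows. It is not the implementation of `prg_string_const`; `string_const`, the bucket count and the hash are assumptions for illustration, and, as on the slide, the stored copies are never deallocated.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define INTERN_BUCKETS 128   /* arbitrary size for this sketch */

typedef struct InternEntry {
    struct InternEntry *next;
    char                text[];   /* flexible array member holds the copy */
} InternEntry;

static InternEntry *intern_table[INTERN_BUCKETS];

/* Return the unique stored copy of `s`, creating it on first use.
 * Equal strings always yield the same pointer, so later comparisons
 * can use pointer equality instead of strcmp(). Returns NULL only if
 * memory runs out. */
const char *string_const(const char *s)
{
    unsigned long h = 5381;
    for (const char *c = s; *c != '\0'; c++)
        h = h * 33 + (unsigned char) *c;

    InternEntry **bucket = &intern_table[h % INTERN_BUCKETS];
    for (InternEntry *e = *bucket; e != NULL; e = e->next)
        if (strcmp(e->text, s) == 0)
            return e->text;

    size_t len = strlen(s) + 1;
    InternEntry *e = malloc(sizeof *e + len);
    if (e == NULL)
        return NULL;
    memcpy(e->text, s, len);
    e->next = *bucket;
    *bucket = e;
    return e->text;
}
```

The string comparison cost is paid once, when a name is interned; every later match test is a single pointer comparison.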

Caching Information

Trade space for time
- Eliminate redundant computation of similar information
- Require space to store computed values

Issues to consider
- Cost of computation vs. cost of maintaining cache
  - Duration of validity
  - Ease of invalidating/updating cache
- Cost of cache access
- Memory increase due to cache

Good OPNET Candidates for Caching

Mapping between object IDs and hierarchical names
- Constant throughout the simulation
- op_id_from_name()

Topology relationships
- Neighbors discovered through op_topo_* KPs

Model attributes
- May or may not be constant during a simulation


Caching Data in Object State

Can store custom state structure with most OPNET objects
- op_ima_obj_state_set (Objid obj, void * state)
- void * op_ima_obj_state_get (Objid obj)

Efficient access

Useful to cache pipeline stage information in rx/tx channels
- Example from standard models:
  - dra_rxgroup.ps.c sets rx channel state for signal lock
  - dra_power.ps.c gets state and checks/sets signal lock
  - gets state and resets signal lock

Danger: only one state per object
- Requires good coordination between models

Attributes You Should Not Cache

Typically, built-in attributes modified by the simulation kernel should not be cached
- Position
  - op_ima_obj_pos_get: obtain lat/long/alt and X/Y/Z coordinates
  - op_ima_obj_attr_get/set: individual position attributes
    - x position, y position, altitude
- condition attribute
  - Determines status of a link/node as failed/live

TMM Caching Example

Terrain divided into 6 rectangles:
- The first time T1 communicates with R1 or R2
  - Attenuation cached for <rect 5><rect 1><freq>
- Each time T1 communicates with R1 or R2
  - Cached attenuation calculation is used
- If R3 moves from rectangle 3 to rectangle 1
  - Communication between T1 and R3 uses cached attenuation
- tmm_longley_rice.ex.c
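
The caching scheme above can be sketched as a table keyed by (transmitter rectangle, receiver rectangle, frequency band). This is an illustration only and is not taken from tmm_longley_rice.ex.c: the function names, the number of frequency bands and the placeholder attenuation formula are all assumptions.

```c
#include <stddef.h>

#define NUM_RECTS 6   /* terrain rectangles, as in the example above */
#define NUM_FREQS 4   /* hypothetical number of frequency bands */

static double        cache_value[NUM_RECTS][NUM_RECTS][NUM_FREQS];
static int           cache_valid[NUM_RECTS][NUM_RECTS][NUM_FREQS];
static unsigned long model_calls;   /* counts expensive evaluations */

/* Placeholder for an expensive terrain model. The formula is made up;
 * only its cost matters for this sketch. */
static double expensive_attenuation(int tx_rect, int rx_rect, int freq_band)
{
    int separation = (tx_rect > rx_rect) ? tx_rect - rx_rect
                                         : rx_rect - tx_rect;
    model_calls++;
    return 80.0 + 3.0 * separation + 1.5 * freq_band;
}

/* Rectangles and bands are 0-based indices. The model runs only on the
 * first request for a given key; later requests reuse the stored value. */
double attenuation_lookup(int tx_rect, int rx_rect, int freq_band)
{
    if (!cache_valid[tx_rect][rx_rect][freq_band]) {
        cache_value[tx_rect][rx_rect][freq_band] =
            expensive_attenuation(tx_rect, rx_rect, freq_band);
        cache_valid[tx_rect][rx_rect][freq_band] = 1;
    }
    return cache_value[tx_rect][rx_rect][freq_band];
}

unsigned long attenuation_model_calls(void)
{
    return model_calls;
}
```

Keying the cache by rectangle rather than by node is what lets a node that moves into an already-visited rectangle reuse earlier results.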

Three-Step Process

Identify Modeling Problem
- Define purpose of model and analysis

Understand Efficiency / Compromises
- Optimize model based on desired answer to question
- Do not simulate everything

Implementation Methods
- Design code according to compromises

Remove Overhead

Component Overhead
- Remove/shutdown unused IP interfaces
- Remove un-connected nodes (except standalone servers)
  - Reduces memory required by a simulation
- Remove unwanted protocol components from node models (Modeler only)

Results Overhead
- Do not collect unwanted statistics
- Use bucketized collection mode instead of collecting all values
- Disable unwanted reports
  - Reduces number of hard-disk writes
  - Reduces amount of data stored in memory

Balancing Fidelity vs. Speed

Fidelity and Simulation Performance
- Typically, higher fidelity => longer/larger simulation
- Very detailed modeling requires more, and computationally heavier, events, packets, etc.

Reduction in fidelity
- Can generate valid results in less time
- Let problem focus dictate fidelity of analysis
  - E.g. application-level analysis can tolerate link-layer abstractions

Modeling Abstractions

Abstraction: useful simplification of a real-world process
- To be useful, an abstraction must behave like the original process

Types of abstraction commonly used in OPNET
- Network abstraction
- Traffic abstraction
- Protocol abstraction
  - Control plane abstraction
  - Data plane abstraction
  - Statistical abstraction

Network Abstraction:
Modeling the Internet

Internet as an IP/ATM/FrameRelay cloud
Abstract a cluster of routers / switches
- Characterize cloud by latency/loss
- One routing table for entire cloud
- Use for end-to-end application performance studies

Routing/security policies cannot be deployed in complete detail
- No failure modeling in a cloud

Network Abstraction:
Server Farm as a LAN node

Abstract a cluster of servers/workstations connected to a LAN segment
- Characterize farm by MAC access delay/switching speed
- One protocol stack for all component nodes

MAC contention delays may be inaccurate for shared segments
- Not possible to model for all technologies

Traffic Abstraction

Traffic representation in OPNET

Traffic Type            OPNET Representation
Packet Level Traffic    Explicit Traffic
Aggregated Traffic      Traffic Flows
                        Device/Link Loads (Background Traffic)

Choice of representation depends on modeling purpose
- Packet by packet
  - End-to-end delays, protocol details, segmentation effects
- Aggregated traffic
  - Capacity planning, steady-state routing analysis

Hybrid Traffic

Traffic Demands can be set to a mix
(+) More accurate than analytics
(+) Faster than discrete
(-) Does not model all protocol dynamics like feedback, flow control, congestion control and policing

Hybrid Traffic TDMA Satellite Example

Blue is 100% explicit
Red is 1% explicit
(-) Did not model oversubscription of satellite at 9m 0s.

Protocol Abstraction:
Signaling Plane

Signaling Plane
- Used by connection oriented protocols for resource reservation
- Timers to model outages and topology changes
- Uses control plane information for exchanging signaling messages
- Memory required for connection management

OPNET model examples
- SAAL in ATM networks
- SIP for VoIP networks
- RSVP for MPLS networks

How to abstract
- Delays to model connection setup
- Simplify signaling protocol (SETUP, CONFIRM)
- Signal using global topology knowledge (if abstracting control plane)
- Examples of signaling layer abstractions in OPNET
  - SS7 in circuit switch networks




Protocol Abstraction:
Control Plane

Control Plane
- Model routing traffic, path-computation, tables/databases
- Model policies for filtering and altering routing information
- Handles failure/recovery situations

OPNET model examples
- Circuit switch
- Frame Relay

How to abstract
- Use of centralized graph accessible to all nodes
  - Construct with prg_djk or op_topo packages
- Stop control traffic after a certain simulation duration

- Cannot capture routing convergence times
- May not capture all routing policies
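
The centralized-graph idea can be sketched as one shared adjacency matrix on which any node runs a shortest-path computation, instead of every node exchanging routing messages. The code below is a plain C illustration of Dijkstra's algorithm and does not use the OPNET prg_djk or op_topo packages; `Graph`, `shortest_paths` and the constants are assumptions for this example.

```c
#include <limits.h>

#define MAX_NODES   16
#define NO_LINK     0         /* cost value meaning "not adjacent" */
#define UNREACHABLE INT_MAX

/* One global topology shared by all nodes (hypothetical type). */
typedef struct {
    int num_nodes;
    int cost[MAX_NODES][MAX_NODES];   /* positive link cost or NO_LINK */
} Graph;

/* Dijkstra's algorithm, O(V^2) array version. Fills dist[] with the path
 * cost from `source` and first_hop[] with the neighbor of `source` to use
 * for each destination (-1 if the destination is unreachable). */
void shortest_paths(const Graph *g, int source,
                    int dist[MAX_NODES], int first_hop[MAX_NODES])
{
    int done[MAX_NODES] = {0};

    for (int i = 0; i < g->num_nodes; i++) {
        dist[i] = UNREACHABLE;
        first_hop[i] = -1;
    }
    dist[source] = 0;
    first_hop[source] = source;

    for (int iter = 0; iter < g->num_nodes; iter++) {
        /* Pick the closest node that has not been finalized yet. */
        int u = -1;
        for (int i = 0; i < g->num_nodes; i++)
            if (!done[i] && dist[i] != UNREACHABLE &&
                (u < 0 || dist[i] < dist[u]))
                u = i;
        if (u < 0)
            break;                    /* remaining nodes are unreachable */
        done[u] = 1;

        /* Relax every link leaving u. */
        for (int v = 0; v < g->num_nodes; v++) {
            int c = g->cost[u][v];
            if (c == NO_LINK || done[v])
                continue;
            if (dist[u] + c < dist[v]) {
                dist[v] = dist[u] + c;
                first_hop[v] = (u == source) ? v : first_hop[u];
            }
        }
    }
}
```

Since the graph is read directly, no control packets are generated, which is also why this abstraction cannot show routing convergence times.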

Protocol Abstraction:
Data Plane

Simulation world
- Not required to follow physical path
- Direct transfer packet between any two modules
  - op_pk_deliver Kernel Procedures
  - By-pass pipeline stages & network transmission/propagation
- Well-known link-level behavior
  - Delay, error rate, etc.

Available in following models
- ATM - ATM Sim Efficiency: Enabled
- Frame Relay - FR Sim Efficiency Mode
- Transport Protocols - Direct Delivery

- Cannot model physical layer effects
  - Path-loss, interference, ...
- Cannot collect link/channel statistics
  - Throughput, utilization, error rate, ...
- Cannot model influence on other packets
  - Queuing, congestion, flow control back-offs, ...

Statistical Abstraction

Capture the statistics of a quasi-deterministic process
- version of random backoff in a medium access protocol
  - Assume p: probability of successful transmission in any slot
  - In case of failure, slot range doubles, tx slot chosen uniformly from range
  - Find distribution of number of slots to successful transmission
  - I computed $\Pr(\text{slot } k) = \frac{1}{2^n}\, p\, (1-p)^n$, where $n = \lfloor \log_2 k \rfloor$
- Instead of modeling collision detection, retrial etc., sample the distribution

Some processes can only be modeled via statistical abstraction
- Fading in a multi-carrier radio transmission with Doppler effect
  - Doppler effect leads to time-correlation
  - Time correlation can be captured via Markov chain
  - This technique has been used in WiMAX PHY modeling
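
Sampling the backoff distribution given above, rather than simulating each collision and retry, can be sketched in plain C as follows. The function names and the cap on the slot number are assumptions for this example; the probability formula is the one stated on the slide, with slots numbered from 1.

```c
#include <stddef.h>

#define MAX_SLOT (1UL << 20)   /* arbitrary cap on the sampled slot number */

/* floor(log2(k)) for k >= 1, using integer shifts. */
static unsigned floor_log2(unsigned long k)
{
    unsigned n = 0;
    while (k >>= 1)
        n++;
    return n;
}

/* Prob(slot k) = (1/2^n) * p * (1-p)^n with n = floor(log2 k), k >= 1. */
double backoff_slot_probability(unsigned long k, double p)
{
    unsigned n = floor_log2(k);
    double prob = p;
    for (unsigned i = 0; i < n; i++)
        prob *= (1.0 - p) / 2.0;
    return prob;
}

/* Inverse-transform sampling: `u` is a uniform random number in [0, 1)
 * supplied by the caller. Returns the slot in which the transmission
 * succeeds, accumulating probabilities until they exceed u. */
unsigned long backoff_sample_slot(double u, double p)
{
    double cumulative = 0.0;
    for (unsigned long k = 1; k < MAX_SLOT; k++) {
        cumulative += backoff_slot_probability(k, p);
        if (u < cumulative)
            return k;
    }
    return MAX_SLOT;
}
```

One random draw and one short loop replace the whole sequence of collision, range doubling and retransmission events, which is where the speed gain of the statistical abstraction comes from.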

Statistical Abstraction: An Example

Smart MAC
- Characterize MAC operation in a table
  - Packets arrive
  - Perform a table lookup to obtain delay/loss
  - Drop or deliver to destination node after delays
- Table should account for
  - Delays: retransmission, back-off, propagation
  - Losses: collision, buffer overflow

Operational Efficiency:
Use Pre-Computed Data

Use pre-computed data from other simulations or empirical sources

IP forwarding table import from Flow Analysis/DES
- Use forwarding table reports previously generated
- Entire routing information available at beginning of simulation
- Trade-offs
  - No control traffic
  - Cannot react to topology changes
  - Uses a single snapshot

UMTS pre-computes antenna gain and pathloss
- Computed for all UE - Node_b communication pairs
- Antenna gain is a function of transmission direction and path loss is a function of distance
- They are re-computed if distance threshold is crossed
- Trade-offs: may be inaccurate if threshold value is very high

Receiver Groups

Use of Global State

Omniscient Simulation
- Information about all nodes/links is accessible to all other nodes
- Store large/repetitive data in a global area: highly efficient when copies of these data structures would otherwise have to be made in many nodes

Example: ARP Resolution
- Global table for IP to MAC address resolution
- Does not generate ARP requests/responses
- Controlled by ARP Sim Efficiency attribute
(+) Increases simulation speed and eliminates memory required by ARP cache of each node
(-) Does not model ARP traffic or the time taken for resolution

Example: TDMA Radio Slot Plan
(+) Simplifies process model and increases simulation speed
(-) Does not simulate failure to receive slot plan

Lazy Evaluation

Delay work until result is needed
- Minimal or zero loss of accuracy
- Avoid work at regular intervals
- Avoid work whose results will never be used

(+) Increases simulation speed

Lazy Evaluation Example

Routing table aging
- Routing table or LS database entries are aged over time
- Each element expiry need not be an interrupt
- Possible lazy approaches to clean expired entries
  - Maintain variable with earliest expiry time in a database and purge expired element(s) on a subsequent operation on the database
  - Search on every database operation for earliest expiry element
  - Scan database periodically
(-) May not control memory usage
- Used by OLSR
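
The first lazy approach listed above (track the earliest expiry time and purge on a later database operation) can be sketched in plain C. This is an illustration, not the OLSR model code: `RouteTable`, `table_add`, `table_has_route` and the fixed table size are assumptions, and `now` stands for the current simulation time supplied by the caller.

```c
#include <float.h>
#include <stddef.h>
#include <string.h>

#define MAX_ROUTES 32

typedef struct {
    int    dest;
    double expiry;   /* simulation time at which the entry becomes invalid */
    int    in_use;
} Route;

typedef struct {
    Route         routes[MAX_ROUTES];
    double        earliest_expiry;  /* no entry expires before this time */
    unsigned long purge_scans;      /* full scans performed, for illustration */
} RouteTable;

void table_init(RouteTable *t)
{
    memset(t, 0, sizeof *t);
    t->earliest_expiry = DBL_MAX;
}

/* Purge only when some entry can actually have expired. No timer
 * interrupt is scheduled per entry. */
static void purge_if_due(RouteTable *t, double now)
{
    if (now < t->earliest_expiry)
        return;                       /* common case: nothing to do */
    t->purge_scans++;
    t->earliest_expiry = DBL_MAX;
    for (size_t i = 0; i < MAX_ROUTES; i++) {
        Route *r = &t->routes[i];
        if (!r->in_use)
            continue;
        if (r->expiry <= now)
            r->in_use = 0;
        else if (r->expiry < t->earliest_expiry)
            t->earliest_expiry = r->expiry;
    }
}

/* Returns 0 on success, -1 if the table is full. */
int table_add(RouteTable *t, int dest, double now, double lifetime)
{
    purge_if_due(t, now);
    for (size_t i = 0; i < MAX_ROUTES; i++) {
        Route *r = &t->routes[i];
        if (r->in_use)
            continue;
        r->dest = dest;
        r->expiry = now + lifetime;
        r->in_use = 1;
        if (r->expiry < t->earliest_expiry)
            t->earliest_expiry = r->expiry;
        return 0;
    }
    return -1;
}

/* Returns 1 if a live route to `dest` exists at time `now`, else 0. */
int table_has_route(RouteTable *t, int dest, double now)
{
    purge_if_due(t, now);
    for (size_t i = 0; i < MAX_ROUTES; i++)
        if (t->routes[i].in_use && t->routes[i].dest == dest)
            return 1;
    return 0;
}
```

Expired entries stay in memory until the next operation on the table, which is the memory-control drawback noted in the list above.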

LAB 4:
Operating Protocols Efficiently

Take an IP network and increase its simulation speed and reduce memory using some of the model efficiency techniques
- No code changes this time
- Few changes affect network fidelity
- Understand the trade-offs

LAB 4 Summary:
Operating Protocols Efficiently

Increased simulation efficiency using three efficiency methods

Eliminated OSPF control traffic after 260 seconds
- 170% improvement in speed

Eliminated routing table computations using pre-computed data
- 270% improvement in speed

Eliminated overhead in node models
- 21% improvement in speed
- 28% reduction in memory usage

Reduced overhead with pre-computed routing
- 300% improvement in speed

Documentation References

OPNET Modeler Online Documentation
- Modeling Concepts Reference Manual
  - Communication Mechanisms
- Programmer's Reference Manuals
  - Kernel Procedures API Reference Manual
  - Data Structures and Algorithms API Reference Manual
- External Interfaces Reference Manual
  - Simulation Execution
- OPNET Simulation Debugger (ODB)

OPNET Support Center (http://www.opnet.com/support)
- Methodologies and Case Studies
  - Optimizing Performance of Discrete Event Simulations

Big O Notation
Take-Away Points

Efficient environment
Efficient simulation kernel
Efficient compilation
Efficient model representation
Efficient configuration of protocols
Efficient code
- Data structures
- Algorithms
- Efficient kernel procedure use
- Model design and abstraction

Take-Away Points

Top 5 model efficiency techniques
- Reduce/eliminate control traffic for stable networks
- Use pre-computed data
- Remove extraneous overheads
- Lazy evaluate periodic activities
- Use global state

Redesign models based on efficient algorithms
- Target code to desired analysis

Big O Notation (intro from Wikipedia)

Also known as Landau notation, Bachmann-Landau notation and asymptotic notation.
Describes the limiting behavior of a function when the argument tends towards infinity,
usually in terms of simpler functions. Allows simplification of functions in order to
concentrate on their growth rates: different functions with the same growth rate may be
represented using the same O notation.
Used in the analysis of algorithms to describe an algorithm's usage of computational
resources: the worst case or average case running time or memory usage of an algorithm is
often expressed as a function of the length of its input using big O notation. This allows
algorithm designers to predict the behavior of their algorithms and to determine which of
multiple algorithms to use, in a way that is independent of computer architecture or speed.
Because Big O notation discards multiplicative constants on the running time, and ignores
efficiency for low input sizes, it does not always reveal the fastest algorithm in practice or
for practically-sized data sets. But the approach is still very effective for comparing the
scalability of various algorithms as input sizes become large.
A description of a function in terms of big O notation usually only provides an upper
bound on the growth rate of the function.
Associated with big O notation are several related notations, using the symbols o, Ω, ω,
and Θ, to describe other kinds of bounds on asymptotic growth rates.

Standard Models for Different Kernels

Kernel combinations
- 32 bit address
- 64 bit address

Different object files produced for different kernel types
- bgp.dev32.i0.pr.obj or bgp.opt64.s1.pr.o
- <filename>.<kernel_type>.<arch_type>.<file_type>.<file_extn>
  - kernel_type: one from kernel combinations (dev32, opt64, ...)
  - arch_type: machine architecture (i0: Windows; i1: Linux)
  - file_type: process model (pr), pipeline stage (ps)
  - file_extn: object file, c file

Repositories shipped with model library for sequential kernel
- {Development, Optimized} kernel on {32-bit, 64-bit} for {Windows, Linux}
- All repositories are named stdmod and placed in <opnet_dir>/models/std/base
- Reduces simulation loading time
- Does not include code of your custom models

Recommended comp_flags_optim Settings

MS Visual C++ 6.0 Professional (comp_prog: comp_msvc)
- /G6 /Ox /Ob2

MS Visual C++ .NET/2005, 2008 Professional (comp_prog: comp_msvc)
- /G7 /Ox /Ob2

Target CPU Optimizations: /G{1,2,3,4,5,6,7}

GCC (comp_prog: comp_gcc)

MS reference

Appendix: SMART MAC Capabilities

Contention based medium
- Packet transmissions undergo backoff due to contention from shared medium
- Resultant MAC throughput
  - Obtained by contention table lookup
  - Number of contenders computed by SMART MAC for every packet
  - Table values can be collected from empirical/theoretical/simulation sources

Nodes can be mobile and hence number of contenders can change
- Re-compute number of reachable nodes based on transmission range

Hidden Node Collision Detection
- Interference from nodes outside transmission range

Appendix: SMART MAC Mobile Network

Topology: random
Mobility: All nodes converge to the same point
Number of nodes: 100
Simulation duration: 90 minutes

Validation measures:
- Application Throughput

Performance measure:
- Elapsed time
- Real time ratio



Appendix: SMART MAC Mobile Network

Test Performance Gain

SMART MAC simulation speed compared to WLAN
- 100% faster

SMART MAC simulation speed compared to real time
- 100% faster than real time

