
Simulation for Grid Computing

Henri Casanova Univ. of California, San Diego casanova@cs.ucsd.edu



Grid Research
Grid researchers often ask questions in the broad area of distributed computing
Which scheduling algorithm is best for this application on a given Grid?
Which design is best for implementing a distributed Grid resource information service?
Which caching strategy is best for enabling a community of users to do distributed data analysis?
What are the properties of several resource management policies in terms of fairness and throughput?
What is the scalability of my publish-subscribe system under various levels of failures?
...
Edinburgh, Scotland, June 2005 Grid Performance Workshop 2005 2

Grid Research
Analytical or Experimental?

Analytical Grid Computing research
Some have developed purely analytical / mathematical models for Grid computing
makes it possible to prove interesting theorems
often too simplistic to convince practitioners
but sometimes useful for understanding principles in spite of dubious applicability
One often uncovers NP-complete problems anyway
e.g., routing, partitioning, scheduling problems
And one must run experiments

Grid computing research based on experiments
Most published works

Grid Computing as Science?


You can tell you are a scientific discipline if
You can read a paper, easily reproduce (at least a subset of) its results, and improve on them
You can tell a grad student "Here are the standard tools, go learn how to use them and come back in one month"
You can give a 1-hour seminar on widely accepted tools that are the basis for doing research in the area

We are not there today


But perhaps I can give a 1-hour seminar on emerging tools that could be the basis for doing research in the area, provided a few open questions are addressed

Need for standard ways to run Grid experiments



Grid Experiments
Real-world experiments are good
Eminently believable
Demonstrate that the proposed approach can be implemented in practice

But...
Can be time-intensive
Execution of applications for hours, days, months, ...
Can be labor-intensive
Entire application needs to be built and functional
including all design / algorithm alternatives
including all hooks for deployment

Is it bad engineering practice to build many full-fledged solutions to find out which ones work best?

Grid Experiments (2)


What experimental testbed?
My own little testbed
well-behaved, controlled, stable
often not representative of real Grids

Real grid platforms
(Still) challenging for many grid researchers to obtain
Not built as a computer scientist's playpen
other users may disrupt experiments
other users may find experiments disruptive
Platform will experience failures that may disrupt the experiments
Platform configuration may change drastically while experiments are being conducted
Experiments are uncontrolled and unrepeatable
even if disruption from other users is part of the experiments, it prevents back-to-back runs of competitor designs / algorithms

Grid Experiments (3)


Difficult to obtain statistically significant results on an appropriate testbed
And to make things worse...
Experiments are limited to the testbed
What part of the results is due to idiosyncrasies of the testbed?
Extrapolations are possible, but rarely convincing
Must use a collection of testbeds...
Still limited explorations of "what if" scenarios
what if the network were different?
what if we were in 2015?

Difficult for others to reproduce results
This is the basis for scientific advances!

Simulation
Simulation can solve many (all) of these difficulties
No need to build a real system
Conduct controlled / repeatable experiments
In principle, no limits to experimental scenarios
Possible for anybody to reproduce results
...

Simulation
representation of the operation of one system (A) through the use of another (B)
Computer simulation: B is a computer program

Key question: Validation
correspondence between simulation and real-world

Simulation in Computer Science


Microprocessor Design
A few standard cycle-accurate simulators are used extensively
http://www.cs.wisc.edu/~arch/www/tools.html
Possible to reproduce simulation results

Networking
A few standard packet-level simulators
NS-2, DaSSF, OMNeT++
Well-known datasets for network topologies
Well-known generators of synthetic topologies
SSF standard: http://www.ssfnet.org/
Possible to reproduce simulation results

Grid Computing?
None of the above up until a few years ago
Most people built their own ad-hoc solutions
Promising recent developments

Simulation of Distributed Computing Platforms?


Simulation of parallel platforms used for decades
Most simulations have made drastic assumptions

Simplistic platform model
Processors perform work at some fixed rate (Flops)
Network links send data at some fixed rate
Topology is fully connected (no communication interference) or a bus (simple communication interference)
Communication and computation are perfectly overlappable

Simplistic application model
All computation is CPU intensive
Clear-cut communication and computation phases
Application is deterministic

Straightforward simulation in most cases
Just fill in a Gantt chart with a computer rather than by hand
No need for a simulation standard

Grid Simulations?
Simple models:
perhaps justifiable for a switched dedicated cluster running a matrix multiply application
Hardly justifiable for grid platforms

Complex and wide-reaching network topologies
multi-hop networks, heterogeneous bandwidths and latencies
non-negligible latencies
complex bandwidth sharing behaviors
contention with other traffic

Overhead of middleware
Complex resource access/management policies
Interference of communication and computation

Grid Simulations
Recognized as a critical area
Grid eXplorer (GdX) project (INRIA)
Build an actual scientific instrument
Databases of experimental conditions
1000-node cluster for simulation/emulation
Visualization tools
Still in planning
What simulation technology???

Two main questions
1. What does a representative Grid look like?
2. How does one do simulation on a synthetic representative Grid?

Presentation Outline
Introduction

Simulation for Grid Computing?


Generating Synthetic Grids

Simulating Applications on Synthetic Grids


Current Work and Future Directions


Why Synthetic Platforms?


Two goals of simulations:
Simulate platforms beyond the ones at hand
Perform sensitivity analyses

Need: Synthetic platforms
Examine real platforms
Discover principles
Implement platform generators

What simulation results go in my paper?
Results for a few real platforms
Results for many synthetic platforms

Generation of Synthetic Grids


Three main elements

Network Topology
Graph
Bandwidth and Latencies

Compute Resources
And other resources

Background Conditions
Load and unavailability
Failures

What is Representative and Tractable?

Synthetic Network Topologies


The network community has wondered about the properties of the Internet topology for years
The Internet grows in a decentralized fashion with seemingly complex rules and incentives
Could it have a mathematical structure?
Could we then have generative models?

Three generations of graph generators
Random
Structural
Degree-based

Random Topology Generators


Simplistic
Place N nodes randomly in a square
Vertices u,v connected with prob. P(u,v) = p

Waxman [JSAC88]
Place N nodes randomly in a CxC square
$P(u,v) = \alpha\, e^{-d / (\beta C \sqrt{2})}$, $0 < \alpha, \beta \le 1$
d = Euclidean distance between u and v
First model to be widely used for network simulations

Others
Exponential, Locality-based, ...

Shortcoming: Real networks have a non-random structure with some hierarchy
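
As a rough illustration of the Waxman model, the sketch below places N nodes uniformly in a C x C square and links each pair with probability alpha * exp(-d / (beta * L)), taking L = C * sqrt(2) as the largest possible distance in the square. The function name, argument names, and the `seed` argument are assumptions made for this example; they do not come from any of the generators cited here.

```python
import math
import random


def waxman_graph(n, c, alpha, beta, seed=None):
    """Return (points, edges) for a Waxman-style random topology.

    Nodes are placed uniformly in a c x c square. Nodes u and v are linked
    with probability alpha * exp(-d / (beta * L)), where d is their
    Euclidean distance and L = c * sqrt(2) is the largest possible distance.
    """
    if not (0 < alpha <= 1 and 0 < beta <= 1):
        raise ValueError("alpha and beta must lie in (0, 1]")
    rng = random.Random(seed)
    points = [(rng.uniform(0, c), rng.uniform(0, c)) for _ in range(n)]
    max_dist = c * math.sqrt(2)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            d = math.dist(points[u], points[v])
            if rng.random() < alpha * math.exp(-d / (beta * max_dist)):
                edges.append((u, v))
    return points, edges


# Illustrative parameter values only.
points, edges = waxman_graph(n=50, c=100.0, alpha=0.4, beta=0.2, seed=1)
```

Larger alpha increases the overall edge density, while larger beta increases the proportion of long links relative to short ones.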

Structural Generators
"... the primary structural characteristic affecting the paths between nodes in the Internet is the distribution between stub and transit domains... In other words, there is a hierarchy imposed on nodes..." [Zegura et al., 1997]
Both at the AS level (peering relationships) and at the router level (Layer 2)
Quickly became accepted wisdom:
Transit-Stub [Calvert et al., 1997]
Tiers [Doar, 1996]
GT-ITM [Zegura et al., 1997]

Power-Laws!
In 1999, Faloutsos et al. [SIGCOMM99] rocked topology generation with power laws
Results obtained both for AS-level and router-level information from real networks

Outdegree (number of edges leaving a node)
For each node v, compute its outdegree $d_v$
Node rank, $r_v$: the index of v in the order of decreasing degree
Nodes can have the same rank

Law: $d_v$ proportional to $r_v^{R}$
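
To check a topology against the rank power law, one can fit a straight line to log(degree) versus log(rank); the slope estimates the exponent R. A minimal sketch, assuming a plain least-squares fit and giving tied degrees distinct consecutive ranks (a simplification of the definition above). `rank_exponent` is a hypothetical helper name, not a function of any cited tool.

```python
import math


def rank_exponent(degrees):
    """Least-squares slope of log(degree) against log(rank).

    Nodes are ranked by decreasing degree; zero-degree nodes are skipped
    because their logarithm is undefined. Needs at least two nodes.
    """
    ordered = sorted((d for d in degrees if d > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(d) for d in ordered]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic degrees that follow d = 100 * r^(-0.8) exactly.
synthetic_degrees = [100.0 * r ** -0.8 for r in range(1, 51)]
estimated_r = rank_exponent(synthetic_degrees)
```

On real topologies the fit is only approximate, so the correlation of the log-log points should be inspected alongside the slope.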


Power-Laws!
Random Generators do not agree with Power-laws

Structural Generators do not agree with Power-laws

Degree-based Generators
New common wisdom: A topology that does not agree with power-laws cannot be representative
Flurry of development of power-law generators after the Faloutsos paper
CMU power law graph generator [Palmer et al., 2000]
Inet [Jin et al., 2000]
BRITE [Medina et al., 2001]
PLRG [Aiello et al., 2000]

Structure vs. Power-law


We know networks have structure AND power laws
Combine both?
GridG project [Dinda et al., SC03]
Use a Tiers-generated topology
Add random links to satisfy the power-laws

How about just using power-laws?
Comparison paper [Tangmunarunkit et al., SIGCOMM02]
Degree-based generators capture the large-scale structure of real networks very well, better than structural generators!
structural generators impose too strict a hierarchy
Hierarchy arises naturally from the degree-based generators
e.g., backbone links
Works both for AS-level and router-level

Short story
What generator?
To model large networks (e.g., > 500 routers)
use degree-based generators
To model small networks (e.g., < 100 routers)
use structural generators
degree-based will not create any hierarchy

Bandwidths, latencies, traffic, etc.


Topology generators only produce a graph
We need link characteristics as well

Option #1: Model physical characteristics
Some models in topology generators
Need to simulate background traffic
No accepted model for generating background traffic
Simulation can be very costly

Option #2: Model end-to-end performance
Models ([Lee, HCW01]) or Measurements (NWS, ...)
Go from path modeling to link modeling?
Turns out to be a difficult question
DARPA workshop on network modeling

Bandwidths, latencies, traffic, etc.


Maybe none of this matters?
Fiber inside the network mostly unused
Communication bottleneck is the local link
Appropriate tuning of TCP or better protocols should saturate the local link
Don't care about topology at all!

Or maybe none of this matters for my application
No network contention

Compute Resources
What resources do we put at the endpoints?

Option #1: ad-hoc generalization
Look at the TeraGrid
Generate new sites based on existing sites

Option #2: Statistical modeling
Examine many production resources
Identify key statistical characteristics
Come up with a generative/predictive model

Synthetic Clusters?
Many Grid resources are clusters
What is the typical distribution of clusters?
Commodity Cluster synthesizer [Kee et al., SC04]
Examined 114 production clusters (10K+ procs)
Came up with statistical models
Validated model against a set of 191 clusters (10K+ procs)
Models allow extrapolation for future configurations
Models implemented in a resource generator

Architecture/Clock Models
Processor   Fraction (%)
Pentium2     1.4
Celeron      4.1
Pentium3    40.3
Pentium4    34.6
Itanium      3.9
Athlon       0.0
AthlonMP    12.4
AthlonXP     1.3
Opteron      2.0

[Figure: Clock Rate (GHz) vs. Year, 1997 to 2005, one line per processor family]
[Figure: # of Processors vs. Year, 1998 to 2003]

Current distribution of proc families
Linear fit between clock-rate and release year within a processor family
Quadratic fit for the fraction of processors released in a given year

Model future distributions and speeds

Other models?
Cache size: grows logarithmically
Number of processors per node: log2-normal
Memory size: log2-normal
Number of nodes per cluster: log2-normal

Models were validated against a snapshot of ROCKS clusters
These clusters have been added to the training set
More clusters are added every month

GridG
Provide a generic framework in which such laws and other correlations can be encoded

Resource Availability / Workload


Probabilistic models
Naive: exp. distributed availability and unavailability intervals
Sophisticated: Weibull distributions [Wolski et al.]

Traces
NWS, etc.
Desktop Grid resources [Kondo, SC04]

Workload models
e.g., Batch schedulers
Traces
Models [Feitelson, JPDC03]
job inter-arrival times: Gamma
amount of work requested: Hyper-Gamma
number of processors requested: Compounded (2^, 1, ...)
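
The two probabilistic availability models above can be sketched by alternating sampled "up" and "down" intervals. This is a minimal illustration using Python's standard `random` module; every parameter value (means, Weibull scale and shape, horizon) is a made-up placeholder rather than a fitted value from the cited work, and `availability_trace` is a hypothetical helper name.

```python
import random


def availability_trace(horizon, up_sampler, down_sampler):
    """Alternate up/down intervals until `horizon` time units are covered.

    Returns a list of (start, end, is_up) tuples, starting in the up state.
    """
    t, is_up, trace = 0.0, True, []
    while t < horizon:
        length = up_sampler() if is_up else down_sampler()
        end = min(t + length, horizon)
        trace.append((t, end, is_up))
        t, is_up = end, not is_up
    return trace


rng = random.Random(42)

# Naive model: exponentially distributed up and down intervals.
naive = availability_trace(
    1000.0,
    up_sampler=lambda: rng.expovariate(1 / 50.0),   # mean uptime 50 (placeholder)
    down_sampler=lambda: rng.expovariate(1 / 5.0),  # mean downtime 5 (placeholder)
)

# Weibull uptimes: weibullvariate(scale, shape); shape < 1 gives a heavy tail.
weibull = availability_trace(
    1000.0,
    up_sampler=lambda: rng.weibullvariate(50.0, 0.6),
    down_sampler=lambda: rng.expovariate(1 / 5.0),
)
```

Swapping the samplers for a replay of recorded intervals would turn the same loop into a trace-driven model.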

A Sample Synthetic Grid


Generate 5,000 routers with BRITE
Annotate latency according to BRITE's Euclidean distance method (scaling to obtain the desired network diameter)
Annotate bandwidth based on a set of end-to-end NWS measurements
Pick 30% of the end-points
Generate a cluster at each end-point according to Kee's synthesizer for Year 2006
Model cluster load with Feitelson's model with a range of parameters for the random distributions
Model resource failures based on Inca measurements on TeraGrid

Synthetic Grid Generation


Still far from widely accepted standards
Many ongoing, promising efforts
Researchers have recognized this as an issue
Tools from networking can be reused
A few Grid tools are available

What is really needed
Repository of Grid Measurements
Repository of Synthetic/Generated Grids

Presentation Outline
Introduction

Simulation for Grid Computing?


Generating Synthetic Grids

Simulating Applications on Synthetic Grids


Current Work and Future Directions


Simulation Levels
Simulating applications on a synthetic platform?
Spectrum of simulation levels (from more abstract to less abstract)
Mathematical Simulation: based solely on equations
Discrete-event Simulation: abstraction of system as a set of dependent actions and events (fine- or coarse-grain)
Emulation: trapping and virtualization of low-level application/system actions

Boundaries above are blurred (d.e. simulation ~ emulation)
A simulation can combine all paradigms at different levels

Simulation Options
Network (from more abstract to less abstract)
Macroscopic: Flows in pipes
coarse-grain d.e. simulation + mathematical simulation
Microscopic: Packet-level simulation
fine-grain d.e. simulation
Actual flows go through some network
emulation

CPU (from more abstract to less abstract)
Macroscopic: Flows in a pipe
coarse-grain d.e. simulation + mathematical simulation
Microscopic: Cycle-accurate simulation
fine-grain d.e. simulation
Virtualization via another CPU / Virtual Machine
emulation

Simulation Options
Application (from more abstract to less abstract)
Macroscopic: Application = analytical flow
Less Macroscopic: sets of abstract tasks with resource needs and dependencies
coarse-grain d.e. simulation
Application specification or pseudo-code API
Virtualization
emulation of actual code with trapping of application-generated events

Two projects
MicroGrid (UCSD)
SimGrid (UCSD + IMAG + Univ. Nancy)

MicroGrid
Set of simulation tools for evaluating middleware, applications, and network services for Grid systems
Applications are supported by emulation and virtualization
Actual application code is executed on virtualized resources

MicroGrid accounts for
CPU
network

[Figure: the Application runs on Virtual Resources, which MicroGrid virtualization maps onto Physical Resources]

MicroGrid Virtualization
Resource virtualization
resource names are virtualized
gethostname, sockets, Globus GIS, MDS, NWS, etc.

Time virtualization
Simulating the TeraGrid on a 4 node cluster
Simulating a 4 node cluster on the TeraGrid

CPU virtualization
Direct execution on (a fraction of) a physical resource
No application modification

Main challenge
Synchronization between real time and virtual time

MicroGrid Network
Packet-level simulation
Network calls are intercepted
Are sent to a network simulator that has been configured with the virtual network topology

Implemented with MaSSF
An MPI version of DaSSF [Liu et al., 2001]
Configured via SSF standard (MDL, etc.)

Protocol stacks implemented on top of MaSSF
TCP, UDP, BGP, etc.

Socket calls are trapped without application modification
real data is sent
delays are scaled to match the virtual time

Main challenges
scaling (20,000 routers on a 128-node cluster)
synchronization with computation

MicroGrid in a Nutshell

CPU: virtualization via another CPU (emulation)
Can be really slow
But hopefully accurate

Application: virtualization via trapping of application events (emulation)
Can have high overhead
But captures the overhead!

Network: microscopic, packet-level simulation (fine-grain discrete event simulation)
Can be really slow for long transfers
But hopefully accurate

[Figure: CPU, Application, and Network placed on a spectrum from less abstract (Emulation) through Discrete-event Simulation to more abstract (Mathematical Simulation)]

SimGrid
Originally developed for scheduling research
Must be fast to allow for thousands of simulations

Application
No real application code is executed
Consists of tasks that have
dependencies
resource consumptions

Resources
No virtualization
A resource is defined by
a rate at which it does work
a fixed overhead that must be paid by each task
traces of the above if needed + failures
A task can use multiple resources
data transfer over multiple links, computation that uses a disk and a CPU
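
The resource description above (a work rate plus a fixed per-task overhead) suggests a very simple cost model for a task that has a resource to itself. The sketch below is an illustration of that idea only; it is not SimGrid's API, and the class name, field names, and numeric values are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    rate: float      # work units per second (e.g. flop/s or bytes/s)
    overhead: float  # fixed cost in seconds paid by each task (e.g. latency)


def task_duration(resource, amount):
    """Time for one task to push `amount` work units through an idle resource."""
    return resource.overhead + amount / resource.rate


# Hypothetical resources: a 10 Mb/s link with 50 ms latency, and a 1 Gflop/s CPU.
link = Resource(rate=1.25e6, overhead=0.05)
cpu = Resource(rate=1e9, overhead=0.0)

transfer_time = task_duration(link, 2.5e6)  # 2.5 MB over the link
compute_time = task_duration(cpu, 3e9)      # 3 Gflop on the CPU
```

Time-varying rates (traces) and sharing between concurrent tasks are deliberately left out of this sketch.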

SimGrid
Uses a combination of mathematical simulation and coarse-grain discrete event simulation
Simple API to specify an application rather than having it already implemented
Fast simulation

Key issue: Resource sharing
In MicroGrid: resource sharing emerges out of the low-level emulation and simulation
Packets of different connections interleaved by routers
CPU cycles of different processes get slices of the CPU
Drawback: slow simulation
How can one do something faster that is still reasonable?
Come up with macroscopic models of resource sharing

Resource Sharing in SimGrid


Macroscopic resource sharing can be easy
A CPU: CPU-bound processes/threads get a fair share of the CPU in steady state
Why go through the trouble of emulating CPU-bound processes?
Just say how many cycles they need, and compute how many cycles they get per second
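
The CPU case can be worked out in a few lines. The sketch below assumes all processes start at time 0 and that the CPU is split equally among the processes still running; the function name and the example numbers are illustrative assumptions.

```python
def fair_share_finish_times(cycles_needed, cpu_rate):
    """Completion time of each CPU-bound process under equal CPU sharing.

    cycles_needed: cycles required by each process (all start at time 0)
    cpu_rate: cycles per second delivered by the CPU
    """
    order = sorted(range(len(cycles_needed)), key=lambda i: cycles_needed[i])
    finish = [0.0] * len(cycles_needed)
    now = 0.0          # current simulated time
    done = 0.0         # cycles each still-running process has received so far
    remaining = len(order)
    for i in order:
        # Every running process advances at cpu_rate / remaining cycles per second.
        now += (cycles_needed[i] - done) * remaining / cpu_rate
        done = cycles_needed[i]
        finish[i] = now
        remaining -= 1
    return finish


# Two processes needing 2 and 4 Gcycles on a 1 GHz CPU.
times = fair_share_finish_times([2e9, 4e9], cpu_rate=1e9)
```

While both processes run, each gets half the CPU, so the shorter one ends at 4 s; the longer one then runs alone and ends at 6 s, which matches the total work of 6 Gcycles at 1 GHz.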

Macroscopic resource sharing can be not so easy
The Network
Many end-points, routers, and links
Many end-to-end TCP flows?
How much bandwidth does each flow receive?

Bandwidth Sharing
Macroscopic TCP modeling is a field
Fluid in Pipe analogy
Rule of Thumb: Share of what a flow gets on its bottleneck link is inversely proportional to its Round-Trip Time

[Figure: example network annotated with rates of 10, 8, and 2 Mb/sec]

Turns out TCP in steady-state implements a type of resource sharing called: Max-Min Fairness

Max-Min Fairness
Principle
Consider the set of all network links, L
$c_l$ is the capacity of link $l$
Consider the set of all flows, R
a flow = a subset of L
$x_r$ is the bandwidth allocated to flow $r$

Bandwidth capacities are respected: $\forall l \in L,\ \sum_{r \in R \,\mid\, l \in r} x_r \le c_l$
TCP in steady-state is such that $\min_{r \in R} x_r$ is maximized

The above can be solved efficiently (with appropriate data structures)
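
One common way to compute a max-min fair allocation is to repeatedly find the most constrained link, freeze the flows that cross it at that link's fair share, and continue with what is left. The sketch below is a straightforward, unoptimized version of that idea; it ignores round-trip times, is not taken from SimGrid, and all names in it are assumptions made for illustration.

```python
def max_min_allocation(capacity, flows):
    """Max-min fair bandwidth for each flow.

    capacity: dict mapping link -> capacity
    flows: dict mapping flow -> non-empty iterable of links it traverses
    """
    remaining = dict(capacity)
    active = {f: set(links) for f, links in flows.items()}
    alloc = {}
    while active:
        # Fair share each link could still offer to its not-yet-frozen flows.
        share = {}
        for link, cap in remaining.items():
            users = [f for f, links in active.items() if link in links]
            if users:
                share[link] = cap / len(users)
        bottleneck = min(share, key=share.get)
        rate = share[bottleneck]
        # Freeze every flow crossing the bottleneck at that rate.
        for f in [f for f, links in active.items() if bottleneck in links]:
            alloc[f] = rate
            for link in active[f]:
                remaining[link] -= rate
            del active[f]
    return alloc


# Hypothetical example: two links, three flows.
example = max_min_allocation(
    capacity={"a": 10, "b": 4},
    flows={"f1": ["a"], "f2": ["a", "b"], "f3": ["b"]},
)
```

In this example link b is the first bottleneck, so f2 and f3 each get 2; f1 then takes the 8 units left on link a. Each round rescans all links, so a production implementation would want better data structures.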

SimGrid
Uses the Max-Min fairness principle for all resource sharing
fast
validated in the real-world for CPUs
validated with NS-2 for networks

Limitation
Max-Min fairness is for steady-state
e.g., no TCP slow-start
e.g., no process priority boosts
Unclear when it starts breaking down
Is justified for long enough transfers and computations
reasonable for scientific applications
not so much for applications such as a Grid information service

SimGrid in a Nutshell

Network: macroscopic, flows in a pipe (mathematical simulation + coarse-grain d.e. simulation)
Very fast
Not accurate for short transfers

Application: macroscopic, abstract tasks with resource needs and dependencies (coarse-grain d.e. simulation)
Very fast
Abstract application model

CPU: macroscopic, flows in a pipe (mathematical simulation + coarse-grain d.e. simulation)
Very fast
Abstract application model

[Figure: CPU, Application, and Network placed on a spectrum from less abstract (Emulation) through Discrete-event Simulation to more abstract (Mathematical Simulation)]

Other Projects
ModelNet
Network emulation
unmodified application packets routed through a core cluster
GigaBit-switched nodes running a modified kernel
Emulates router queues
More emulation than MicroGrid
Only for networking, but plans to add support for computation

Still many in-house simulators that may aspire to become widely-used tools

EmuLab/DummyNet
ChicagoSim
OptorSim
EDGSim
GridSim
...

So what should I use?


It really depends on your goal / resources
SimGrid's network model has clear limitations, e.g. for short transfers
SimGrid simulations are easy to set up
MicroGrid simulations take a lot of time (although they can be parallelized)
ModelNet requires some hardware setup
SimGrid does not require a full application to be written
MicroGrid models overhead of system calls implicitly
...

Key trade-off: accuracy and speed
The more abstract the simulation, the faster
The less abstract the simulation, the more accurate

Does this trade-off really hold?

Simulation Validation
The crux of most simulation work in most domains of computer science
Validation is difficult and almost never done convincingly
Provide justification that the model is plausible
Convince people that the simulator implements the model (verification)
Provide a few graphs that show that it's reasonable
validation in a few special cases, at best
validation against another validated simulator
Argue that although absolute values are off, the trends are respected
Conclude that the simulator is useful to compare algorithms/designs
Obtain scientific results?????

FLASH vs. FLASH


"FLASH vs. (Simulated) FLASH: Closing the Simulation Loop" [Gibson et al., ASPLOS00]

FLASH project at Stanford
building large-scale shared-memory multiprocessors
Went from conception, to design, to actual hardware (32-node)
Used simulation heavily over 6 years

The authors went back and compared simulation to the real world!
Simulation error is unavoidable
30% error in their case was not rare
Negating the impact of "we got 1.5% improvement"
One should focus on simulating the important things
A more complex simulator does not ensure better simulation
simple simulators worked better than sophisticated ones, which were unstable
simple simulators predicted trends as well as slower, sophisticated ones
It is key to use the real-world to tune/calibrate the simulator

Conclusion: for FLASH, the simple simulator was all that was needed

Presentation Outline
Introduction

Simulation for Grid Computing?


Generating Synthetic Grids

Simulating Applications on Synthetic Grids


Current Work and Future Directions


Grid Simulation: Accuracy vs. Speed


Comparing simulators and validating them is a gigantic amount of work
It will never be clear-cut
identify simulation regimes
It doesn't lead to many papers
It's eminently politically incorrect
Results depend on what the simulation is used for
Therefore nobody does it

The story one would like to tell is, e.g.:
Start with SimGrid simulations at first to identify promising approaches
Move to MicroGrid emulations to precisely quantify the trade-offs

How can we substantiate this story?

Current Work in SimGrid


SimGrid uses a simulation engine called SURF
currently SURF performs a blend of mathematical simulation and discrete event simulation

Current work
Adding a MaSSF back-end (i.e., MicroGrid)
Adding a ModelNet back-end

[Figure: SimGrid on top of three back-ends: SURF, MaSSF, ModelNet]

Goal: evaluate the speed-accuracy trade-off for simulation of the network
Expected result:
SimGrid sufficient and fast for large-enough messages
e.g., Good for scientific applications
MaSSF/ModelNet required for small/frequent messages
e.g., Good for middleware applications (e.g., NWS), p2p applications

But beware FLASH vs. FLASH!

What about Validation?


The GRAS project [Quinson et al.]
Idea: provide a way to compile code into real-world code and into simulation code
Write the application using the GRAS API
Compile it into a SimGrid simulation
Compile it into a real-world code
currently provides its own back-end and deployment
could use Globus as a back-end
Run and compare

Goals
smooth transition from design to prototyping to production
easy validation and simulation calibration

Plan
Use GRAS to easily compare SimGrid, MaSSF, ModelNet, and real world networks
Move on to full applications

Conclusion
Simulation is difficult
Eternal question: What really matters?

Grid researchers are actively working on it
Usable tools exist
Grid simulation today should not re-invent the wheel

Two crucial next steps
Repository of synthetic Grids, Grid measurement datasets, and Grid simulation software
Scientifically sound validation experiments
Validate simulators
Understand what matters and what does not

Only then will we have a scientific discipline with a standard way to conduct experiments and a way for researchers to reproduce each other's results.

Questions?


A Simple Experiment

- Sent out files of 500MB up to 1.5GB with TTCP
- Using from 1 up to 16 simultaneous connections
- Recorded the data transfer rate per connection

Experimental Results
[Figure: normalized data rate per connection vs. number of concurrent TCP connections]

Max-Min Fairness
Captures other resource sharing beyond networks!
Interference of Communication and Computation [Kreaseck et al., IJHPCA05]

CPU sharing
