
Smoothed Particle Hydrodynamics with MPI

Matthew Anderson
Center for Relativity
University of Texas at Austin

29 November 2001






Parallelization Issues

Major issues at hand:


Domain decomposition and load balancing
Nearest neighbor finding
Minimizing communication costs
Optimizing computation









Review of Parallelization Issues

The major issue: nearest neighbor finding


Recall that SPH
represents a fluid by a sampling of its mass elements (particles)
finds gradients by weighted interpolation over neighboring particles (a kernel sketch follows the figure below)
[Figure: a fluid and its representation as fluid mass elements (particles); plot of the cubic spline interpolation kernel f(h) for 0 <= h <= 2]
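For concreteness, here is a minimal sketch in C (not taken from the author's code) of the cubic spline kernel plotted above and of the weighted neighbor sum SPH uses, for example, to estimate the density; the 3-D normalization 1/(pi h^3) and the function names are assumptions.

#include <math.h>
#include <stdio.h>

#define SPH_PI 3.14159265358979323846

/* Cubic spline kernel W(r, h) with compact support of radius 2h (3-D normalization). */
double kernel(double r, double h)
{
    double q = r / h, sigma = 1.0 / (SPH_PI * h * h * h);
    if (q < 1.0) return sigma * (1.0 - 1.5 * q * q + 0.75 * q * q * q);
    if (q < 2.0) return sigma * 0.25 * (2.0 - q) * (2.0 - q) * (2.0 - q);
    return 0.0;
}

/* Density at particle i: a weighted interpolation over its nearest-neighbor list. */
double density(int i, int nnbr, const int *nbr, const double *mass,
               const double (*pos)[3], double h)
{
    double rho = 0.0;
    for (int a = 0; a < nnbr; a++) {
        int j = nbr[a];
        double dx = pos[i][0] - pos[j][0];
        double dy = pos[i][1] - pos[j][1];
        double dz = pos[i][2] - pos[j][2];
        rho += mass[j] * kernel(sqrt(dx * dx + dy * dy + dz * dz), h);
    }
    return rho;
}

int main(void)
{
    /* Trace out the kernel profile shown in the plot (0 <= r/h <= 2). */
    for (double q = 0.0; q <= 2.0; q += 0.25)
        printf("r/h = %.2f   W(r, 1) = %.4f\n", q, kernel(q, 1.0));
    return 0;
}

Every SPH gradient estimate has this same structure, a sum over the neighbor list weighted by the kernel or its derivative, which is why neighbor finding dominates the parallelization effort.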









Smoothing Length

SPH must have a sufficient number of particles within a smoothing length. The number of neighbors must be:
5 particles for 1 dimension
21 particles for 2 dimensions
57 particles for 3 dimensions
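These counts can be illustrated with a small lattice check; the support radius of 2.4 particle spacings used below is an inferred illustration value (for example, support 2h with h roughly 1.2 spacings), not a number quoted in the talk.

#include <stdio.h>

/* Count the particles a given particle would see on a uniform lattice when
 * the kernel support radius is r_s particle spacings (the count includes
 * the particle itself). */
int count_neighbors(int dim, double r_s)
{
    int r = (int)r_s + 1, count = 0;
    double s2 = r_s * r_s;
    for (int i = -r; i <= r; i++)
        for (int j = -r; j <= r; j++)
            for (int k = -r; k <= r; k++) {
                if (dim < 3 && k != 0) continue;   /* collapse the unused axes */
                if (dim < 2 && j != 0) continue;
                if ((double)(i * i + j * j + k * k) <= s2) count++;
            }
    return count;
}

int main(void)
{
    /* Prints 5, 21, and 57 for 1, 2, and 3 dimensions with r_s = 2.4. */
    for (int dim = 1; dim <= 3; dim++)
        printf("%dD: %d particles within 2.4 spacings\n",
               dim, count_neighbors(dim, 2.4));
    return 0;
}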









Parallel Nearest Neighbor Finder

Methods being attempted:


Use the particle index to find nearest neighbors
Each index carries the particle's velocity, 4-position, pressure, temperature, entropy, matter density, baryon number, and list of nearest neighbors (once found); a sketch of this record follows the figure below

[Figure: schematic of indexed particles laid out across the computational domain]








Parallel Nearest Neighbor Finder

Methods being attempted:


Search by index using a variable smoothing length along the processor boundary (one possible reading is sketched below the figure)
[Figure: processor boundary with a boundary smoothing length h_b]
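The talk does not spell out how the boundary smoothing length h_b is chosen; one possible reading, sketched below purely as an assumption, is to cap h for particles near a processor boundary so that the kernel support stays within a narrower ghost layer, which would explain the reduction from 3 to 2 ghost zones quoted on a later slide.

#include <math.h>

/* Cap the smoothing length of a particle near a processor boundary so the
 * kernel support (radius 2h) never reaches past a ghost layer of width
 * ghost_width beyond the boundary.  All names are illustrative. */
double boundary_smoothing_length(double h_interior, double x,
                                 double x_lo, double x_hi,
                                 double ghost_width)
{
    double d = fmin(x - x_lo, x_hi - x);     /* distance to the nearer boundary */
    double h_max = 0.5 * (d + ghost_width);  /* keep 2*h <= d + ghost_width     */
    return fmin(h_interior, h_max);
}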









Parallel Nearest Neighbor Finder

Methods being attempted:


Use the neighbors of nearest neighbors for neighbor updates (sketched below the figure)

[Figure: a particle's nearest neighbors and the neighbors of a nearest neighbor]
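A sketch in C of this update (the array layout, MAXNBR bound, and 2h support radius are assumptions): the candidates for particle i's new neighbor list are its current neighbors together with their neighbors.

#include <string.h>

#define MAXNBR 64

static double dist2(const double a[3], const double b[3])
{
    double s = 0.0;
    for (int k = 0; k < 3; k++) { double d = a[k] - b[k]; s += d * d; }
    return s;
}

/* pos[n][3] holds positions; nbr[n][MAXNBR] and nnbr[n] hold the current
 * neighbor lists, which this routine refreshes in place for particle i. */
void update_neighbors(int i, const double (*pos)[3],
                      int (*nbr)[MAXNBR], int *nnbr, double h)
{
    int newlist[MAXNBR], nnew = 0;
    double r2 = 4.0 * h * h;                  /* kernel support radius 2h */

    for (int a = 0; a < nnbr[i]; a++) {
        int j = nbr[i][a];
        for (int b = -1; b < nnbr[j]; b++) {
            int c = (b < 0) ? j : nbr[j][b];  /* b = -1 visits neighbor j itself */
            if (c == i || dist2(pos[i], pos[c]) > r2) continue;
            int dup = 0;                      /* already collected? */
            for (int k = 0; k < nnew; k++)
                if (newlist[k] == c) { dup = 1; break; }
            if (!dup && nnew < MAXNBR) newlist[nnew++] = c;
        }
    }
    memcpy(nbr[i], newlist, nnew * sizeof(int));
    nnbr[i] = nnew;
}

The implicit assumption behind this candidate set is that particles move only a fraction of a smoothing length between updates, so new neighbors can only arrive from the neighbors of current neighbors.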









Trade-Offs: Search by Index

Search by particle index:


The index search is simple and minimizes communication: 3 ghost zones of particles are exchanged for the test simulation (a sketch of such a ghost-zone exchange follows this list)
The index search works well when the initial data is decomposed so that neighboring particles have neighboring indices
The index search performs well with random velocity distributions in the initial data
The index search will not work for highly dynamic cases
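A minimal sketch of the kind of ghost-zone exchange this implies, assuming a 1-D slab decomposition with periodic neighbors and particle data packed into flat double arrays (none of these details are specified in the talk):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    enum { NGHOST = 128, NFIELD = 8 };       /* assumed sizes, for illustration */
    double send_lo[NGHOST * NFIELD], send_hi[NGHOST * NFIELD];
    double recv_lo[NGHOST * NFIELD], recv_hi[NGHOST * NFIELD];
    for (int i = 0; i < NGHOST * NFIELD; i++)
        send_lo[i] = send_hi[i] = rank;      /* stand-in for packed particle data */

    int left  = (rank - 1 + size) % size;    /* periodic neighbors assumed */
    int right = (rank + 1) % size;

    /* send low-side ghosts left, receive the right neighbor's low-side ghosts */
    MPI_Sendrecv(send_lo, NGHOST * NFIELD, MPI_DOUBLE, left,  0,
                 recv_hi, NGHOST * NFIELD, MPI_DOUBLE, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    /* send high-side ghosts right, receive the left neighbor's high-side ghosts */
    MPI_Sendrecv(send_hi, NGHOST * NFIELD, MPI_DOUBLE, right, 1,
                 recv_lo, NGHOST * NFIELD, MPI_DOUBLE, left,  1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    printf("rank %d: received ghosts from ranks %d and %d\n", rank, left, right);
    MPI_Finalize();
    return 0;
}

Compile with mpicc and launch with mpirun; each rank trades its boundary particles with its two neighbors in two MPI_Sendrecv calls, so the communication volume scales with the ghost-zone width.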









Trade-Offs: Variable H

Search by particle index with a variable smoothing length:
Further minimizes communication: only 2 ghost zones of particles are exchanged for the test simulation
Increases the chance of a numerical instability
Will not work for highly dynamic cases









Trade-Offs: Search the Neighbors

Searching the neighbors of nearest neighbors for updates:
Will work for highly dynamic cases
Simple to code and implement
Increases communication and computation time (more particles to search)
Nearest neighbors must initially be found some other way









Performance Results


Particle index search: test run


[Figure: scalability with fixed problem size: execution time vs. number of processors (1 to 7), MPI timings compared against ideal scaling]









Performance Results

Computation time
[Figure: performance with 4 processors: execution time vs. problem size (N x N x N, N = 20 to 120)]


Regression indicates computation time of









Performance Results

Efficiency
[Figure: efficiency for a 51 x 51 x 51 problem: parallel efficiency (0.80 to 1.00) vs. number of processors (1 to 7)]
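For reference, the efficiency plotted here is presumably the standard fixed-size measure (the talk does not state the exact definition used):

E(p) = \frac{T(1)}{p \, T(p)}

where T(p) is the execution time on p processors, so E(p) = 1 corresponds to ideal scaling.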









Performance Comments

All runs were performed on the NCSA SGI Origin 2000
Runs with fewer processors were performed in an interactive queue
Performance runs did not dump any data
Accuracy of nearest neighbors checked against serial code









Work In Progress

Complete testing of nearest neighbor methods
Complete parallelization of full SPH code
Develop space-filling curve method for initial data partition (one common choice of curve is sketched below)
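The talk does not say which space-filling curve is planned; a Morton (Z-order) key is one common choice, sketched below. Sorting particles by such a key gives spatially nearby particles nearby indices, which is exactly the property the index-based neighbor search relies on.

#include <stdint.h>
#include <stdio.h>

/* Spread the low 21 bits of x so that two zero bits follow each bit. */
static uint64_t spread_bits(uint64_t x)
{
    x &= 0x1fffff;                                    /* 21 bits -> 63-bit key */
    x = (x | x << 32) & 0x1f00000000ffffULL;
    x = (x | x << 16) & 0x1f0000ff0000ffULL;
    x = (x | x << 8)  & 0x100f00f00f00f00fULL;
    x = (x | x << 4)  & 0x10c30c30c30c30c3ULL;
    x = (x | x << 2)  & 0x1249249249249249ULL;
    return x;
}

/* Morton key of an integer cell (i, j, k): interleave the coordinate bits. */
uint64_t morton_key(uint32_t i, uint32_t j, uint32_t k)
{
    return spread_bits(i) | (spread_bits(j) << 1) | (spread_bits(k) << 2);
}

int main(void)
{
    /* Nearby cells get nearby keys, which a curve-based partition exploits. */
    printf("%llu %llu %llu\n",
           (unsigned long long)morton_key(1, 0, 0),
           (unsigned long long)morton_key(1, 1, 0),
           (unsigned long long)morton_key(2, 2, 2));
    return 0;
}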







