Professional Documents
Culture Documents
6 PLP Partitioning
6 PLP Partitioning
• Numerical integration
• N-body problem
Parallel Programming 3
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved.
4.2
Example: Add Sequence of
Numbers
• Add numbers X0 , … Xn-1
• Consider dividing the sequence into p
parts
• Each part will have n/p numbers
• (X0 … X(n/p)-1 ), (X(n/p) … X(2n/p)-1 ), …,
(X(p-1)n/p … Xn-1 )
Parallel Programming 4
Partitioning a sequence of numbers
into parts and adding the parts
Parallel Programming 5
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved.
4.3
Master
/* number of numbers for slaves */
s=n/p;
for (i=0, x=0; i<p; i++, x=x+s)
/* send s numbers to slaves */
send(&numbers[x], s, Pi );
sum = 0;
/* wait for results from slaves */
for (i = 0; I <p; i++) {
recv(&part_sum, Pany )
sum = sum +part_sum;
}
Parallel Programming 6
Slave
/* receive s numbers from the master */
recv(numbers, s, Pmaster )
part_sum = 0;
/* add the numbers */
for ( i = 0; i < s ; i++)
part_sum = part_sum + numbers[i];
/* send the sum to the master */
send (&part_sum, Pmaster )
Parallel Programming 7
Divide and Concur
• Divide a problem into smaller problems
that are of the same form as the larger
problem.
• Can use recursion (sequential)
• The recursion method will continually
divide the problem until the tasks can’t be
broken down into smaller parts.
Parallel Programming 8
Sequential: Recursion for Adding Numbers
int add( int *s) /* add list of numbers s */
{
if ( number(s) <= 2) return (n1+n2)
else {
/* Divide s into two parts */
Divide (s, s1, s2);
Parallel Programming 10
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.4
Dividing a list into parts
Parallel Programming 11
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.5
divide (s, s1, s2); /* Division phase */
/* send one part to another process */
send (s2, P4 );
divide (s1, s1, s2);
send (s2, P2 );
divide(s1, s1, s2);
Process P0
send (s2, P1 );
part_sum = 0; /* Combining phase */
recv (&part_sum1, P1 );
part_sum = part_sum + part_sum1;
recv (&part_sum1, P2 );
part_sum = part_sum + part_sum1;
recv (&part_sum1, P4 );
part_sum = part_sum + part_sum1;
Parallel Programming 12
recv (s1, P0 ); /* Division phase */
divide (s1, s1, s2);
send (s2, P6 );
divide (s1, s1, s2); Process P4
send (s2, P5 )
part_sum = 0; /* combining phase */
recv (part_sum1, P5 )
part_sum = part_sum + part_sum1;
recv (part_sum1, P6 )
part_sum = part_sum + part_sum1;
send (&part_sum, P0 )
Parallel Programming 13
Partial summation
Parallel Programming 14
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.6
M-ary Divide and Concur
• A tree in which each node has M children.
• Useful in decomposing multi-dimensional
problems.
• A quadtree is useful in decomposing two-
dimensional regions into four sub-regions.
• For example, a digitized image could be
divided into four quadrants, and then each
of the four quadrants into four sub-
quadrants, and so on.
Parallel Programming 15
Quadtree
Parallel Programming 16
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.7
Dividing an image
Parallel Programming 17
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.8
Partitioning and Divide and
Concur Example: Bucket Sort
• Is not based on compare and exchange,
• It is naturally a partitioning method.
• Works well if the original numbers
uniformly distributed across a known
interval, say 0 to a - 1.
• This interval is divided into m equal
regions:
• 0 to a/m -1, a/m to 2a/m -1, 2a/m to 3a/m -
1… Parallel Programming 18
Bucket sort
One “bucket” assigned to hold numbers that fall within each region.
Numbers in each bucket sorted using a sequential sorting algorithm.
Parallel Programming 20
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.10
Further Parallelization
Small buckets then emptied into p final buckets for sorting, which
requires each processor to send one small bucket to each of the
other processors (bucket i to processor i).
Parallel Programming 21
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.11
Another parallel version of bucket sort
Parallel Programming 23
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.13
“all-to-all” routine actually transfers rows of an array to columns:
Transposes a matrix.
Parallel Programming 24
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.14
Numerical integration using rectangles.
Each region calculated using an approximation given by
rectangles:
Aligning the rectangles:
Parallel Programming 25
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.15
Numerical integration using
trapezoidal method
Parallel Programming 27
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved.
4.17
Adaptive quadrature with
false termination.
Some care might be needed in choosing when to terminate.
Parallel Programming 31
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.21
else {
int_size = 1.0 / (double) n; //calcs interval size
part_sum = 0.0;
MPI::COMM_WORLD.Reduce(&temp_pi,&calc_pi, 1,
MPI::DOUBLE, MPI::SUM, 0);
Parallel Programming 32
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.22
if (rank == 0) { /*I am server*/
cout << "pi is approximately " << calc_pi
<< ". Error is " << fabs(calc_pi - actual_pi)
<< endl
<<"_______________________________________"
<< endl;
}
} //end else
if (rank == 0) { /*I am root node*/
cout << "\nCompute with new intervals? (y/n)" << endl; cin >> resp1;
}
}//end while
MPI::Finalize(); //terminate MPI
return 0;
} //end main
Parallel Programming 33
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.23
//functions
void printit()
{
cout << "\n*********************************" << endl
<< "Welcome to the pi calculator!" << endl
<< "Programmer: Parallel Hero" << endl
<< "You set the number of divisions \nfor estimating the
integral: \n\tf(x)=4/(1+x^2)"
<< endl
<< "*********************************" << endl;
} //end printit
Parallel Programming 34
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.24
Gravitational N-Body Problem
Parallel Programming 35
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.25
Gravitational N-Body Problem Equations
Parallel Programming 36
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.26
Details
Let the time interval be t. For a body of mass m, the force is:
Parallel Programming 39
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.29
Time complexity can be reduced using observation that a
cluster of distant bodies can be approximated as a single
distant body of the total mass of the cluster sited at the
center of mass of the cluster:
Parallel Programming 40
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.30
Barnes-Hut Algorithm
Start with whole space in which one cube contains the bodies
(or particles).
Parallel Programming 41
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.31
Creates an octtree - a tree with up to eight edges from each
node.
After the tree has been constructed, the total mass and center of
mass of the subcube is stored at each node.
Parallel Programming 42
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.32
Force on each body obtained by traversing tree starting at root,
stopping at a node when the clustering approximation can be
used, e.g. when:
Parallel Programming 43
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.33
Recursive division of two-dimensional space
Parallel Programming 44
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.34
Orthogonal Recursive Bisection
(For 2-dimensional area) First, a vertical line found that divides
area into two areas each with equal number of bodies. For each
area, a horizontal line found that divides it into two areas each
with equal number of bodies. Repeated as required.
Parallel Programming 45
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.35
Astrophysical N-body simulation by Scott
Linssen (undergraduate UNCC student, 1997)
using O(N2) algorithm.
Parallel Programming 46
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.36
Astrophysical N-body simulation by David
Messager (UNCC student 1998) using Barnes-Hut
algorithm.
Parallel Programming 47
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen,
@ 2004 Pearson Education Inc. All rights reserved. 4.37