
Mathl Comput. Modelling, Vol. 11, pp. 55-57, 1988          0895-7177/88 $3.00 + 0.00
Printed in Great Britain                                        Pergamon Press plc

NUMERICAL ALGORITHMS FOR THE HYPERCUBE CONCURRENT PROCESSOR

Jean E. Patterson
Farzin Manshadi
Ruel H. Calalo
Paulett C. Liewer
William A. Imbriale
James R. Lyons

Jet Propulsion Laboratory, California Institute of Technology


4800 Oak Grove Drive, M/S 138-208, Pasadena, California 91109

Abstract. With the development of concurrent computing architectures which promise cost-effective means of obtaining supercomputing performance, there is much interest in applying them to, and evaluating their actual performance on, large, computationally intensive problems. Of particular interest is the concurrent performance on large scale electromagnetic scattering problems. Two electromagnetic codes with differing underlying algorithms have been converted to run on the Mark III Hypercube. One is a time domain finite difference solution of Maxwell's equations to solve for scattered fields and the other is a frequency domain moment method solution. Important measures for demonstrating the utility of the parallel architecture are the size of the problem that can be solved and the efficiency by which the parallelization can increase the speed of execution.

Keywords. Electromagnetic scattering; parallel processing; hypercubes; parallel algorithms.

INTRODUCTION

With recent advances in high speed microprocessor technology there has been renewed interest in ensemble computing architectures. Although microprocessors yield relatively low individual performance, they can be clustered to deliver a high performance, cost-effective computing machine that can be compared in performance to the higher-cost supercomputers.

We are interested in evaluating the applicability of parallel computing for the solution of large scale electromagnetic scattering problems. In order to better assess the general performance, two electromagnetic scattering codes which use different numerical techniques have been converted to run on the parallel processing Mark III Hypercube.

THE MARK III HYPERCUBE

The Mark III Hypercube, which has been developed at the Jet Propulsion Laboratory/California Institute of Technology, consists of up to 32 nodes, with a 128-node configuration now under construction. A hypercube is a connectivity scheme which can be viewed as an array of N nodes where each is capable of communicating directly with n = log2 N neighboring nodes along the edges of an n-dimensional cube. Each node has a pair of Motorola 68020 processors: one is the main application processor and the second is the communication processor. Currently the hypercube uses the Motorola 68881 floating point co-processor, which delivers 60 to 120 thousand floating point operations per second per node. An even faster floating point accelerator daughter board is being added to each node, which should boost this performance by better than an order of magnitude. There are 4 megabytes of dynamic RAM and 128 kilobytes of static RAM per node. Because there is no global memory, the system utilizes internode message passing to distribute data. These internode messages flow at 2 megabytes per second per channel, with a node capable of communicating on all of its channels simultaneously.

FINITE DIFFERENCE CODE

The first code which has been selected for this evaluation is a finite difference time domain technique for the direct solution of Maxwell's time-dependent curl equations. By this technique one can model the propagation of an electromagnetic wave into a volume containing a dielectric or conducting structure. This volume is represented as a 3-dimensional lattice. The incident wave is tracked as it propagates and interacts with the scattering object by performing a finite difference version of the curl equations for each cell within the lattice. This tracking is complete when the desired sinusoidal steady state behavior can be observed at each lattice cell. This computationally intensive procedure is simplified by calculating the interaction of the propagating wave front with portions of the scattering object surface at a given time rather than performing a simultaneous solution of the complete problem (Umashankar, Taflove, 1984). This spatial independence lends itself well to a parallel decomposition. The parallel implementation uses the identical global lattice constructed by the sequential code but divides the lattice into blocks of nearly equal dimensions. Neighboring blocks are assigned to nodes which are directly connected. This decomposition scheme assures that each node can perform its discrete field updates either with resident information or with information communicated by a node directly connected to it.
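As an illustration of the hypercube connectivity described above (a sketch of ours, not part of the original paper): directly connected nodes are exactly those whose binary labels differ in one bit, so a node's n = log2 N neighbors can be enumerated by flipping each bit of its label in turn.

```python
def hypercube_neighbors(node: int, dim: int) -> list[int]:
    """Return the labels of the dim = log2(N) nodes directly connected
    to `node` in a dim-dimensional hypercube: each neighbor's binary
    label differs from `node` in exactly one bit position."""
    return [node ^ (1 << bit) for bit in range(dim)]

# A 32-node Mark III configuration is a 5-dimensional hypercube,
# so every node has exactly 5 direct neighbors.
print(hypercube_neighbors(0, 5))   # [1, 2, 4, 8, 16]
```

Any two nodes are thus at most n hops apart, which is what makes the internode message-passing scheme scale to larger cube dimensions.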

Proc. 6th Int. Conf. on Mathematical Modelling
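The finite difference decomposition described above can be sketched in miniature (our toy illustration, not the paper's Mark III code): a 1-D lattice is split into nearly equal blocks, and before each update a node needs one layer of "halo" cells from the directly connected blocks on either side.

```python
# Toy 1-D sketch of the lattice decomposition: each "node" owns a
# contiguous block of cells and must receive one boundary (halo) cell
# from each neighboring block before updating its own cells.

def split_lattice(n_cells: int, n_nodes: int) -> list[range]:
    """Divide n_cells into n_nodes contiguous blocks of nearly equal size."""
    base, extra = divmod(n_cells, n_nodes)
    blocks, start = [], 0
    for node in range(n_nodes):
        size = base + (1 if node < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks

def halo_cells(blocks: list[range], node: int) -> list[int]:
    """Cell indices a node must receive from neighbors each time step."""
    halo = []
    if node > 0:
        halo.append(blocks[node - 1][-1])   # last cell of left neighbor
    if node < len(blocks) - 1:
        halo.append(blocks[node + 1][0])    # first cell of right neighbor
    return halo

blocks = split_lattice(100, 4)
print([len(b) for b in blocks])    # [25, 25, 25, 25]
print(halo_cells(blocks, 1))       # [24, 50]
```

In the actual 3-D code the same idea applies per block face, and the assignment of neighboring blocks to directly connected nodes keeps every halo exchange a single hop.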

METHOD OF MOMENTS CODE

The code that has been selected is the Numerical Electromagnetics Code (NEC-2), which was developed at Lawrence Livermore National Laboratory. This code is used for the analysis of the electromagnetic response of antennas and other metallic structures. The code utilizes integral equations to compute the currents induced on a structure by sources or incident fields. It combines an integral equation for smooth surfaces with a specialized equation for wires to provide for the modeling of a wide range of structures. Applying these equations with the boundary condition on the structure produces an integral equation where the unknowns are the longitudinal currents on wire segments and the two perpendicular components of the surface current on patches. These equations are solved numerically by a method of moments technique. The method of moments applies to a general linear operator equation

    LI = E                                             (1)

where L is a linear integral operator, E is the known excitation by a source or an incident field, and I is the unknown response which one wishes to determine. The unknown response function, I, may be expanded as a sum of basis functions, I_j,

    I = Σ_{j=1}^{N} F_j I_j                            (2)

where F_j are the amplitude coefficients of the basis functions. A set of equations is next obtained by taking the inner product of LI = E with a set of weighting functions (W_i)

    <W_i, LI> = <W_i, E>.                              (3)

Due to the linearity of the operator L, one can substitute for I to obtain the equation

    Σ_{j=1}^{N} F_j <W_i, L I_j> = <W_i, E>,  i = 1, ..., N    (4)

This equation can then be expressed in matrix notation as

    [A][F] = [E]                                       (5)

where

    A_ij = <W_i, L I_j>                                (6)

    E_i = <W_i, E>.                                    (7)

For structures being modeled by a combination of wire segments and surface patches, the matrix equation has the form

    [ a  b ] [ F_W ]   [ E_W ]
    [ c  d ] [ F_P ] = [ H_P ]                         (8)

where a, b, c, d are sub-matrices of the matrix A representing the interactions wire-to-wire, surface-to-wire, wire-to-surface, and surface-to-surface respectively. F_W and F_P are the basis function amplitudes for wires and patches, E_W is the electric field at the center of wire segments, and H_P is the magnetic field at the center of the surface patches. From the solution for the amplitudes of the basis functions one can then calculate near-field and far-field quantities.

The numerical solution requires the inversion of a matrix of increasing size as the structure size itself is increased relative to the wavelength. Although there are no theoretical size limitations, modeling of structures with dimensions of more than several wavelengths becomes impractical on conventional computers due to limitations in available memory and excessive computing time. In the parallel implementation the inversion is performed by a factorization of the interaction matrix into a right triangular matrix by a series of orthogonal Householder transformations. Since the computations within one column of the matrix are independent of those in other columns, the factored matrix is distributed to the nodes by columns. The node assignment for columns is performed in card-dealing fashion to assure optimal load balance.

PERFORMANCE ANALYSIS

The codes have been implemented on the hypercube and verified to obtain the correct solutions by comparing the results with those from sequential runs and with exact solutions. The performance of the two codes has then been evaluated. One measure of the performance is to compare the execution time for a given problem on one node versus the time for the same problem running on 32 nodes. A speedup factor is then determined by dividing the single-node time by the 32-node time. For the finite difference code, speedup factors of up to 30 have been measured. The method of moments code, which requires more interprocessor communication because of the matrix element manipulation, has measured speedup factors of up to 26.

In the method of moments code the fill of the interaction matrix and the matrix factorization represent more than 90% of the total sequential run time. For this reason these two components were the first to be converted to concurrent algorithms. The timing and speedup factors of these components for a structure (in this case, a monopole on a pedestal) modeled by 290 wire segments versus the number of nodes in the hypercube are shown in Tables 1 and 2. Because the problem size is too large to run in one node, the fill and factor times are estimated by extrapolation of the results from a higher number of nodes.

TABLE 1. Timing and Speedup Factor of the Matrix Fill
         for Varying Size of Hypercube

    No. of    Fill Time    Speedup
    Nodes     (sec)        Factor

     1        1725.0*       1.00
     2         876.0        1.99
     4         445.5        3.87
     8         230.3        7.49
    16         122.6       14.07
    32          68.8       25.07

    * Timing extrapolated for 1-node hypercube
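The derivation in equations (1)-(7) can be sketched numerically. The kernel and pulse-basis/point-matching discretization below are our toy illustration, not NEC-2's actual integral-equation kernels: the interaction matrix A_ij = <W_i, L I_j> is filled, and the system [A][F] = [E] of equation (5) is solved for the amplitude coefficients.

```python
import numpy as np

# Toy method-of-moments sketch (our illustration, not NEC-2): solve
# L I = E for a contrived kernel using N pulse basis functions I_j and
# point-matching weights W_i (delta functions at match points x_i), so
# A[i, j] = <W_i, L I_j> reduces to evaluating L applied to pulse j at x_i.

N = 40
x = (np.arange(N) + 0.5) / N       # match points at segment centers on [0, 1]
h = 1.0 / N                        # pulse (segment) width

def kernel(xi, xj):
    # Contrived smooth kernel standing in for the true integral operator.
    return 1.0 / (1.0 + (xi - xj) ** 2)

# Fill the interaction matrix of eq. (6); the identity term keeps this
# toy system well conditioned (a second-kind integral equation).
A = np.eye(N) + kernel(x[:, None], x[None, :]) * h

E = np.ones(N)                     # known excitation sampled at match points
F = np.linalg.solve(A, E)          # amplitude coefficients F_j of eq. (5)

# Residual of [A][F] = [E] should be at machine precision.
print(np.max(np.abs(A @ F - E)))
```

Once F is known, field quantities follow by summing the basis-function contributions, which is the "near-field and far-field" step described above.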
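The card-dealing column assignment mentioned above can be sketched as follows (the helper is our illustration). The even spread it produces is the point: the Householder factorization retires columns from left to right, so a blocked assignment would leave low-numbered nodes idle early, while cyclic dealing keeps the remaining active columns spread over all nodes.

```python
def deal_columns(n_cols: int, n_nodes: int) -> dict[int, list[int]]:
    """Assign matrix columns to nodes in card-dealing (cyclic) fashion:
    column j goes to node j mod n_nodes, so active columns remain evenly
    spread over the nodes as the factorization eliminates them in order."""
    owners = {node: [] for node in range(n_nodes)}
    for col in range(n_cols):
        owners[col % n_nodes].append(col)
    return owners

owners = deal_columns(290, 32)      # 290 wire segments on 32 nodes
sizes = [len(cols) for cols in owners.values()]
print(min(sizes), max(sizes))       # 9 10 -- near-perfect load balance
```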

TABLE 2. Timing and Speedup Factor of the Matrix Factorization
         for Varying Size of Hypercube

    No. of    Factor Time    Speedup
    Nodes     (sec)          Factor

     1        6110.0*         1.00
     2        3141.0          1.95
     4        1615.0          3.78
     8         841.0          7.27
    16         454.0         13.46
    32         259.0         23.59

    * Time extrapolated for 1-node hypercube

Another measure of the performance is the comparison of the maximum size problem which can be executed on the hypercube versus a conventional sequential computer such as the VAX. On the VAX 11/750, the largest finite difference lattice which can run in a typical user's dynamic memory allocation contains about 192,000 unit cells; on the Mark III Hypercube with 32 active nodes, the largest lattice contains about 2,048,000 unit cells.

SUMMARY

This work has provided a demonstration of the applicability of a parallel architecture to the solution of large electromagnetic scattering problems. The two techniques used, finite difference and method of moments, have provided insight into the comparative speedups which can be attained for several algorithms. It has also illustrated the flexibility of the hypercube architecture for different analysis algorithms.

The work described above is only the first step in developing parallel computing capabilities for electromagnetic scattering analysis. Currently an effort is underway to develop an electromagnetics interactive workstation using the Mark III Hypercube as the computational element. A Sun 3/160 Color Graphics Workstation will provide a graphical interface between the user and the hypercube. This workstation will be used as an interactive tool to aid in the design phase for metallic structures such as antennas.

REFERENCES

Bierman, G.J. (1977). Factorization Methods for Discrete Sequential Estimation. Mathematics in Science and Engineering, Volume 128.
Burke, G.J., A.J. Poggio (1981). Numerical Electromagnetics Code (NEC)--Method of Moments. Lawrence Livermore National Laboratory.
Engquist, B., A. Majda (1977). Absorbing Boundary Conditions for the Numerical Simulation of Waves. Math. Comp., 31, 629-651.
Kunz, K.S. (1986). Generalized Three-Dimensional Experimental Lightning Code (G3DXL) User's Manual. Kunz Associates, Inc.
Mur, G. (1981). Absorbing Boundary Conditions for the Finite-Difference Approximation of the Time-Domain Electromagnetic-Field Equations. IEEE Trans. Electromagn. Compat., EMC-23, 377-382.
Taflove, A., M.E. Brodwin (1975). Numerical Solution of Steady-State Electromagnetic Scattering Problems Using the Time-Dependent Maxwell's Equations. IEEE Trans. Microwave Theory Tech., MTT-23, 623-630.
Taflove, A., K.R. Umashankar (1982). A Hybrid Moment Method/Finite Difference Time Domain Approach to Electromagnetic Coupling and Aperture Penetration into Complex Geometries. IEEE Trans. Antennas and Propagation, AP-30, 617-627.
Umashankar, K.R., A. Taflove (1984). Analytical Models for Electromagnetic Scattering, Part II: Finite-Difference Time-Domain Developments. Final Report on IITRI Project E06538, Electronics Department, IIT Research Institute.
Yee, K.S. (1966). Numerical Solution of Initial Boundary Value Problems Involving Maxwell's Equations in Isotropic Media. IEEE Trans. Antennas and Propagation, AP-14, 302-307.
