Abstract.: Array T N
Todd L. Veldhuizen
Indiana University Computer Science Department
1 Introduction
The goal of the Blitz++ library is to provide a solid "base environment" of
arrays, matrices and vectors for scientific computing in C++. This paper focuses
on arrays in Blitz++, which provide performance competitive with Fortran and
superior functionality. The design of Blitz++ has been influenced by Fortran
90, High-Performance Fortran, the Math.h++ library [3], A++/P++ [4], and
POOMA [5]. It incorporates various features from these environments, and adds
many of its own. This paper concentrates on the unique features of Blitz++
arrays.
2 Overview
Multidimensional arrays in Blitz++ are provided by the class template Array<T,
N>. The template parameter T is the numeric type stored in the array, and N is
its rank (dimensionality). This class supports a variety of array models:
- Arrays of scalar types, such as Array<int,2> and Array<float,3>
- Complex arrays, such as Array<complex<float>,2>
- Arrays of user-defined types. For example, if Polynomial is a class defined
by the user (or another library), Array<Polynomial,2> is a two-dimensional
array of Polynomial objects.
- Nested homogeneous arrays using the Blitz++ classes TinyVector and
TinyMatrix. For example, Array<TinyVector<float,3>,3> is a three-dimensional
vector field.
- Nested heterogeneous arrays, such as Array<Array<int,1>,1>, in which
each element is an array of variable length.
2.1 Storage layout and reference counting
Array objects are lightweight views of a separately allocated data block. This
design permits a single block of data to be represented by several array views
[3]. Each array object contains a descriptor (also called a dope vector) which
specifies the memory layout. The descriptor contains a pointer to the array data,
lower bounds for the indices, a shape vector, a stride vector, reversal flags, and
a storage ordering vector. This last is a permutation of the dimension numbers
$[1, 2, \ldots, N]$ which indicates the order in which dimensions are stored in memory.
Fortran-style column-major arrays correspond to $[1, 2, \ldots, N]$, and C-style
row-major arrays correspond to $[N, N-1, \ldots, 1]$. Reversal flags indicate whether
each dimension is stored in ascending or descending order.
The storage ordering vector and reversal flags allow arrays to be stored in
any one of $N! \, 2^N$ orderings. Only two of these, C-style and Fortran-style arrays,
are frequently used. There are occasional uses for other orderings: some image
formats store rows from bottom to top, which can be handled transparently by
a reversal flag.
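To make the role of the storage ordering vector and reversal flags concrete, the following is a minimal sketch of how a stride vector could be derived from them. It is not the Blitz++ implementation: the names Layout, makeLayout and offsetOf are hypothetical, dimensions are numbered from 0 rather than 1, and ordering[0] is assumed to name the dimension that varies fastest in memory.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical descriptor fragment for a rank-N array (illustration only).
template <std::size_t N>
struct Layout {
    std::array<std::ptrdiff_t, N> stride;  // element step for each dimension
    std::ptrdiff_t origin;                 // offset of index (0,...,0) in the block
};

// ordering[0] is the fastest-varying dimension (0-based, an assumption of this
// sketch). ascending[d] == false means dimension d is stored in reverse.
template <std::size_t N>
Layout<N> makeLayout(const std::array<std::size_t, N>& extent,
                     const std::array<std::size_t, N>& ordering,
                     const std::array<bool, N>& ascending) {
    Layout<N> layout{};  // strides and origin start at zero
    std::ptrdiff_t step = 1;
    for (std::size_t r = 0; r < N; ++r) {
        const std::size_t d = ordering[r];
        assert(d < N && extent[d] > 0);
        if (ascending[d]) {
            layout.stride[d] = step;
        } else {
            // A reversed dimension walks backwards from its last element.
            layout.stride[d] = -step;
            layout.origin += step * static_cast<std::ptrdiff_t>(extent[d] - 1);
        }
        step *= static_cast<std::ptrdiff_t>(extent[d]);
    }
    return layout;
}

// Maps a multi-index to an offset into the data block.
template <std::size_t N>
std::ptrdiff_t offsetOf(const Layout<N>& layout,
                        const std::array<std::size_t, N>& index) {
    std::ptrdiff_t off = layout.origin;
    for (std::size_t d = 0; d < N; ++d)
        off += layout.stride[d] * static_cast<std::ptrdiff_t>(index[d]);
    return off;
}
```

In this sketch a 2x3 C-style array uses ordering {1, 0} and a Fortran-style array uses ordering {0, 1}; an image stored bottom to top would additionally set ascending[0] to false.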
Arrays are reference-counted: the number of arrays referencing a data block is
monitored, and when no arrays refer to a data block it is deallocated. Reference
counting provides the benefits of garbage collection, and allows functions to
return array objects efficiently:
Array<float,2> someUserFunction(Array<float,2>&);
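The effect of reference counting on views can be sketched with std::shared_ptr standing in for the reference count. This is an illustration under stated assumptions, not the Blitz++ mechanism: View1D, sub and tail are hypothetical names, and the view is rank 1 with unit stride.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical rank-1 view over a shared, reference-counted data block.
struct View1D {
    std::shared_ptr<std::vector<float>> block;  // freed when the last view goes away
    std::size_t first = 0;                      // position of element 0 in the block
    std::size_t length = 0;

    explicit View1D(std::size_t n)
        : block(std::make_shared<std::vector<float>>(n, 0.0f)), length(n) {}

    float& operator()(std::size_t i) const {
        assert(i < length);
        return (*block)[first + i];
    }

    // O(1): the subarray shares the block; no elements are copied.
    View1D sub(std::size_t start, std::size_t n) const {
        assert(start + n <= length);
        View1D v(*this);
        v.first = first + start;
        v.length = n;
        return v;
    }
};

// Returning a view by value copies only the small descriptor.
View1D tail(const View1D& a) {
    assert(a.length >= 1);
    return a.sub(1, a.length - 1);
}
```

Writing through the view returned by tail changes the elements seen by the original view, and the block is released only after both views have been destroyed.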
Reference counting and flexible storage formats support useful O(1) array
operations:
- Arbitrary transpose operations: The dimensions of an array can be permuted
using the transpose(...) member function. This code makes B a shared
view of A, but with the first and second dimensions swapped:
Array<float,3> A(3,3,3); // A 3x3x3 array
Array<float,3> B = A.transpose(secondDim,firstDim,thirdDim);
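Why a transpose can be O(1) is easiest to see in a small model: only the extents and strides in the descriptor are permuted, while the data block is shared. The sketch below is an assumption-laden illustration (View3 is a hypothetical class, fixed at rank 3, C-style layout, 0-based dimension numbers), not the Blitz++ transpose itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical rank-3 view over a shared data block.
struct View3 {
    std::shared_ptr<std::vector<float>> data;
    std::array<std::size_t, 3> extent;
    std::array<std::size_t, 3> stride;

    View3(std::size_t n0, std::size_t n1, std::size_t n2)
        : data(std::make_shared<std::vector<float>>(n0 * n1 * n2, 0.0f)),
          extent{n0, n1, n2},
          stride{n1 * n2, n2, 1} {}  // C-style row-major strides

    float& operator()(std::size_t i, std::size_t j, std::size_t k) const {
        assert(i < extent[0] && j < extent[1] && k < extent[2]);
        return (*data)[i * stride[0] + j * stride[1] + k * stride[2]];
    }

    // Dimension 0 of the result is dimension p0 of *this, and so on.
    // Only the descriptor changes; the elements stay where they are.
    View3 transpose(std::size_t p0, std::size_t p1, std::size_t p2) const {
        assert(p0 < 3 && p1 < 3 && p2 < 3);
        assert(p0 != p1 && p1 != p2 && p0 != p2);
        View3 t(*this);  // copies the descriptor, shares the data
        t.extent = {extent[p0], extent[p1], extent[p2]};
        t.stride = {stride[p0], stride[p1], stride[p2]};
        return t;
    }
};
```

With a view a of shape 2x3x4, a.transpose(1, 0, 2) yields a 3x2x4 view in which element (j, i, k) refers to the same storage as a(i, j, k).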
The use of fromStart and toEnd is after [3]. An optional third parameter to the
Range constructor specifies a stride, so subarrays do not have to be contiguous.
3 Array Expressions
Array expressions in Blitz++ are implemented using the expression templates
technique [6]. Prior to expression templates, use of overloaded operators meant
generating temporary arrays, which caused huge performance losses. In Blitz++,
temporary arrays are never created. Since its original development, the expres-
sion templates technique has grown substantially more complex and powerful [1,
2]. Its present incarnation in Blitz++ supports a wide variety of useful notations
and optimizations. The next sections overview the main features of the Blitz++
expression templates implementation from a user perspective.
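The core idea of expression templates can be shown in a few lines: operator+ returns a lightweight node that records its operands, and the assignment operator evaluates the whole tree in a single loop, so no temporary array is built. The sketch below is a deliberately minimal illustration, far simpler than the Blitz++ implementation; the namespace sketch and the names Vec and AddExpr are hypothetical, and operator+ is left unconstrained for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

// Node representing lhs + rhs; it stores references, not results.
template <typename L, typename R>
struct AddExpr {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

struct Vec {
    std::vector<double> v;
    explicit Vec(std::size_t n, double x = 0.0) : v(n, x) {}
    double operator[](std::size_t i) const { return v[i]; }
    double& operator[](std::size_t i) { return v[i]; }
    std::size_t size() const { return v.size(); }

    // Assignment from an expression runs one fused loop over its elements.
    template <typename E>
    Vec& operator=(const E& e) {
        assert(e.size() == size());
        for (std::size_t i = 0; i < size(); ++i) v[i] = e[i];
        return *this;
    }
};

// Builds an expression node instead of computing a temporary Vec.
// Unconstrained for brevity; it is found only for types in this namespace.
template <typename L, typename R>
AddExpr<L, R> operator+(const L& l, const R& r) { return {l, r}; }

}  // namespace sketch
```

For vectors a, b, c and d of equal length, the statement a = b + c + d builds a nested AddExpr and then performs one pass that computes b[i] + c[i] + d[i] for each i. The nodes hold references, so an expression must be assigned within the statement that creates it.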
3.1 Operators
Any operator which is meaningful for the array elements can be applied to arrays.
For example:
Array<float,2> A, B, C, D; // ...
A = B + (C * D);
Array<int,1> E, F, G, H; // ...
E |= (F & G) >> H;
Operators are always applied in an elementwise manner. Users can create arrays
of their own classes, and use whichever overloaded operators they have provided:
class Polynomial {
// define operators + and *
};
Array<Polynomial,2> A, B, C, D; // ...
A = B + (C*D); // results in appropriate calls
// to Polynomial operators
Math functions provided by the standard C++, IEEE and System V math li-
braries may be used on arrays, for example sin(A) and lgamma(B).
Arrays with different storage formats can appear in the same expression; for
example, a user can add a C-style array to a Fortran array. Blitz++ transparently
corrects for the storage formats. Blitz++ allows arrays of different numeric types
to be mixed in an expression. Type promotion follows the standard C rules, with
some modifications to handle complex numbers and user-defined types.
Blitz++ supplies a set of index placeholder objects which allow array indices
to be used in expressions. This code creates a Hilbert matrix:
Array<float,2> A(4,4);
A = 1.0 / (i + j + 1); // i, j are index placeholders; indices start at 0
The tensor indices i,j,k,... are special objects concealed in the namespace
blitz::tensor. Users are free to declare their own tensor indices with different
names if they prefer. Tensor indices specify how arrays are oriented in the domain
of the array receiving the expression (Fig. 1). Any missing tensor indices are
interpreted as spread operations; for example, the A(i,j) term in the above
example is spread over the k index.
The vector fields V, force and advect are implemented as arrays of 3-vectors.
This eliminates the need to represent each vector field as three separate arrays,
common in Fortran implementations. The stencil operators Laplacian3DVec4
and grad3D4 are provided by Blitz++, and implement 4th-order Laplacian and
gradient operators. The Laplacian3DVec4 operator expands into a 45-point
stencil. Blitz++ supplies stencil operators for forward, central and backward
differences of various orders and accuracies; built on top of these are divergence,
gradient, curl, mixed partial, and Laplacian operators.
Blitz++ provides special support for vector fields (and in general,
multicomponent/multispectral arrays). The [] operator is overloaded for easy access
to individual components of a multicomponent array. For example, this code
initializes the force field with gravity:
const int x = 0, y = 1, z = 2;
force[x] = 0.0;
force[y] = 0.0;
force[z] = gravity;
3.4 Reductions
Reductions in Blitz++ transform an N-dimensional array (or array expression)
to a scalar value:
Array<int,2> A(4,4); // ...
int result1 = sum(A); // sum all elements
int result2 = count(A == 0); // count zero elements
Available reductions are sum, product, min, max, count, minIndex, maxIndex,
any and all. Partial reductions transform an N-dimensional array (or array
expression) to an N-1 dimensional array expression. The reduction is performed
along a single rank:
Array<int,2> A(2,4);
Array<int,1> B(2);
A = 0, 1, 1, 5,
    3, 0, 0, 0;
B = sum(A, j); // reduce along the second rank: B = 7, 3
Reductions can be chained: for example, this code finds the row with the
minimum sum of squares:
Array<float,2> A(N,N); // ...
int minRow = minIndex(sum(pow2(A),j));
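For readers unfamiliar with chained reductions, the computation above can be spelled out in plain C++: a partial reduction produces one sum of squares per row, and a full reduction then picks the index of the smallest entry. This is only a sketch of the arithmetic; Matrix, rowSumOfSquares and minRow are hypothetical names, and unlike an expression-template implementation it stores the intermediate vector explicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

using Matrix = std::vector<std::vector<float>>;  // rows of equal length assumed

// Partial reduction along the second rank: one sum of squares per row.
std::vector<float> rowSumOfSquares(const Matrix& a) {
    std::vector<float> out;
    out.reserve(a.size());
    for (const auto& row : a)
        out.push_back(
            std::inner_product(row.begin(), row.end(), row.begin(), 0.0f));
    return out;
}

// Full reduction of the partial result: index of the smallest row sum.
std::size_t minRow(const Matrix& a) {
    assert(!a.empty());
    const std::vector<float> s = rowSumOfSquares(a);
    return static_cast<std::size_t>(
        std::distance(s.begin(), std::min_element(s.begin(), s.end())));
}
```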
4 Optimizations
The expression templates technique allows Blitz++ to parse array expressions
and generate customized evaluation kernels at compile time. To achieve good
performance, Blitz++ performs many loop transformations which have tradi-
tionally been the responsibility of optimizing compilers:
- Loop interchange and reversal: Consider this bit of code, which is a naive
implementation of the array operation A = B + C:
for (int i=0; i < N1; ++i)
  for (int j=0; j < N2; ++j)
    for (int k=0; k < N3; ++k)
      A(i,j,k) = B(i,j,k) + C(i,j,k);
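One way to picture loop interchange is to choose the loop nesting from the strides, so that the innermost loop always walks the dimension that is contiguous in memory. The sketch below illustrates that idea at run time for rank 3. It is an assumption, not a description of how Blitz++ performs the transformation: loopOrder and addInMemoryOrder are hypothetical names, all three arrays are assumed to share one layout, and strides are assumed non-negative.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <numeric>

// Orders dimensions from outermost to innermost loop, so the innermost loop
// runs over the dimension with the smallest stride magnitude.
std::array<std::size_t, 3> loopOrder(const std::array<std::ptrdiff_t, 3>& stride) {
    std::array<std::size_t, 3> order{};
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
        return std::abs(stride[a]) > std::abs(stride[b]);
    });
    return order;
}

// Computes a = b + c elementwise, visiting elements in memory order.
void addInMemoryOrder(float* a, const float* b, const float* c,
                      const std::array<std::size_t, 3>& extent,
                      const std::array<std::ptrdiff_t, 3>& stride) {
    for (std::size_t d = 0; d < 3; ++d) assert(stride[d] >= 0);
    const std::array<std::size_t, 3> o = loopOrder(stride);
    std::array<std::size_t, 3> idx{};
    for (idx[o[0]] = 0; idx[o[0]] < extent[o[0]]; ++idx[o[0]])
        for (idx[o[1]] = 0; idx[o[1]] < extent[o[1]]; ++idx[o[1]])
            for (idx[o[2]] = 0; idx[o[2]] < extent[o[2]]; ++idx[o[2]]) {
                std::ptrdiff_t off = 0;
                for (std::size_t d = 0; d < 3; ++d)
                    off += stride[d] * static_cast<std::ptrdiff_t>(idx[d]);
                a[off] = b[off] + c[off];
            }
}
```

For a Fortran-style 2x3x4 array with strides {1, 2, 6}, loopOrder returns {2, 1, 0}: the loop over the third dimension becomes outermost and the loop over the first dimension innermost, the reverse of the i, j, k nesting written above.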
[Figure: performance in Mflops/s (0 to 400) versus array length (10^0 to 10^5) for Vector<T>, Array<T,1>, native BLAS, and Fortran 90.]
References
1. Geoffrey Furnish. Disambiguated glommable expression templates. Computers in
Physics, 11(3):263-269, May/June 1997.
2. Scott W. Haney. Beating the abstraction penalty in C++ using expression
templates. Computers in Physics, 10(6):552-557, Nov/Dec 1996.
3. Thomas Keffer and Allan Vermeulen. Math.h++ Introduction and Reference
Manual. Rogue Wave Software, Corvallis, Oregon, 1989.
4. Rebecca Parsons and Daniel Quinlan. A++/P++ array classes for architecture
independent finite difference computations. In Proceedings of the Second Annual
Object-Oriented Numerics Conference (OON-SKI'94), pages 408-418, April 24-27,
1994.
5. John V. W. Reynders, Paul J. Hinker, Julian C. Cummings, Susan R. Atlas, Sub-
hankar Banerjee, William F. Humphrey, Steve R. Karmesin, Katarzyna Keahey,
M. Srikant, and MaryDell Tholburn. POOMA. In Gregory V. Wilson and Paul Lu,
editors, Parallel Programming Using C++. MIT Press, 1996.
6. Todd L. Veldhuizen. Expression templates. C++ Report, 7(5):26-31, June 1995.
Reprinted in C++ Gems, ed. Stanley Lippman.
7. Todd L. Veldhuizen. The Blitz++ User Guide. 1998. http://seurat.uwaterloo.ca/-
blitz/.