Sparse matrices (Lec 2-3)

A sparse matrix can be loosely defined as a matrix with few nonzeros. It is important to take advantage of the sparsity. Sparsity can be structured or unstructured. We will consider representations, data structures, and operations.

Graph representations

Graph theory is often used when considering finite element/difference matrices.

Definition 1 A graph G = (V, E) is defined by a set of vertices V = {v_1, ..., v_n} and a set of edges E = {(v_i, v_j)} ⊆ V × V. For an n × n sparse matrix A, its adjacency graph is a graph G = (V, E) whose n vertices in V represent the n unknowns and whose edges (v_i, v_j) exist iff a_ij ≠ 0. An adjacency graph is generally directed, except for a symmetric pattern (a_ij ≠ 0 ⟺ a_ji ≠ 0), in which case (v_i, v_j) can be taken as an undirected edge.

Example 1

A = …   (1)

A = …   (2)

For PDEs involving one unknown per node, the adjacency graph can usually be taken to be the mesh. Nodes that are not directly connected can get connected via matrix operations (multiplications, eliminations, etc.). This causes fill-in. For example, the graph of A^k has an edge (i, j) for any nodes i, j when there exists at least one path of length k from i to j in the graph of A (assuming no cancellation). For k = 2 this follows from

c_ij = Σ_{a_ik ≠ 0, a_kj ≠ 0} a_ik a_kj.
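As an illustration, here is a minimal Python sketch (dense 0-based representation, not from the notes) that builds the adjacency graph of a small tridiagonal matrix and checks that the graph of A^2 gains the edge (0, 2) through the length-2 path 0-1-2:

```python
def adjacency_graph(A):
    """Edges (i, j), i != j, wherever a_ij is nonzero."""
    n = len(A)
    return {i: {j for j in range(n) if j != i and A[i][j] != 0}
            for i in range(n)}

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

# 4x4 tridiagonal matrix: nodes i and i+2 become connected in G(A^2)
A = [[2, 1, 0, 0],
     [1, 2, 1, 0],
     [0, 1, 2, 1],
     [0, 0, 1, 2]]

G1 = adjacency_graph(A)
G2 = adjacency_graph(matmul(A, A))
print(sorted(G1[0]))  # [1]
print(sorted(G2[0]))  # [1, 2] -- new edge (0, 2) via the path 0-1-2
```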

Thanks to Prof. Alan Laubs for offering his notes as references.

Cholesky factorization

Assume A is symmetric positive definite (SPD). Compute A = L L^T where L is lower triangular. Partition

A = [ d    v^T ]  =  [ √d      0       ] [ 1   0   ] [ √d   v^T/√d  ]
    [ v    B   ]     [ v/√d    I_{n-1} ] [ 0   A_1 ] [ 0    I_{n-1} ]

where A_1 = B − (1/d) v v^T is the Schur complement. If A_1 is further factorized as A_1 = L_1 L_1^T, then

A = [ √d     0   ] [ √d   v^T/√d ]
    [ v/√d   L_1 ] [ 0    L_1^T  ]
Remark 1
- No pivoting is necessary for numerical stability.
- In step i the computation of the Schur complement costs about (n − i)^2 operations, which explains the total cost of the factorization: (1/3) n^3 + O(n^2) (noting the symmetry).
- The Schur complements may get more and more dense.
- We hope that the final L retains the sparsity, but L is usually denser. The eliminations correspond to the removal of nodes from the adjacency graphs of A, A_1, A_2, .... Edges added during the eliminations are fill edges; they correspond to zero entries in A but nonzeros in L. Nodes get connected when they were previously indirectly connected by a path: the removal of a node connects all higher-numbered nodes that were previously connected to it.

Assume there is no cancellation. For matrix (1):
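The recursive step above can be sketched in a few lines of Python (illustrative only: dense storage, no exploitation of sparsity). At step i the pivot column is scaled by 1/√d and the rank-1 Schur update is subtracted:

```python
import math

def cholesky(A):
    """Outer-product Cholesky A = L L^T for an SPD matrix
    given as a list of lists; returns the lower-triangular L."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    S = [row[:] for row in A]          # working copy (Schur complements)
    for i in range(n):
        d = math.sqrt(S[i][i])
        L[i][i] = d
        for j in range(i + 1, n):
            L[j][i] = S[j][i] / d      # v / sqrt(d)
        for j in range(i + 1, n):      # A_1 = B - (1/d) v v^T
            for k in range(i + 1, n):
                S[j][k] -= L[j][i] * L[k][i]
    return L

A = [[4.0, 2.0], [2.0, 5.0]]
L = cholesky(A)
print(L)  # [[2.0, 0.0], [1.0, 2.0]]
```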

Figure 1: Adjacency graphs of A, A_1, A_2.

The filled graph is now G(L + L^T).

L = …

L + L^T is not as sparse as A. In an extreme case, matrix (2) (an arrow matrix), the first elimination step creates a fully dense A_1! Cost: O(n^3), space: O(n^2).

Figure 2: Adjacency graph for the Schur complement after 1 step of elimination of matrix (2).

Reordering

Reordering can reduce the fill. That is, we solve

(P A P^T)(P x) = P b,

where P is a permutation matrix.

Definition 2 A row (column) permutation A_{π,:} (A_{:,π}) of A is obtained by permuting the rows (columns) of A following the indices π = {i_1, ..., i_n}, where π is a permutation of {1, ..., n}. A_{π,:} (A_{:,π}) can be obtained by pre- (post-) multiplying A by a permutation matrix: A_{π,:} = P A, A_{:,π} = A P^T. The matrix P is a permutation matrix resulting from the product of some interchange matrices X_{ij}, or from applying the permutation π to I, so P = I_{π,:}. In Matlab, A(p, :) applies the row permutation, and A(p, p) applies both the row and column permutations. Often symmetric permutations are used to preserve symmetry and the diagonal elements:

A_{π,π} = P A P^T.
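A symmetric permutation A → P A P^T amounts to index selection, like A(p, p) in Matlab. A minimal sketch (0-based Python indices rather than the 1-based notation of the notes):

```python
def sym_permute(A, p):
    """Return the matrix whose (i, j) entry is A[p[i]][p[j]],
    i.e. P A P^T for the permutation p (0-based)."""
    return [[A[pi][pj] for pj in p] for pi in p]

A = [[1, 2, 0],
     [2, 3, 4],
     [0, 4, 5]]
p = [2, 0, 1]
B = sym_permute(A, p)
print(B)  # [[5, 0, 4], [0, 1, 2], [4, 2, 3]]
```

Note that the diagonal entries of B are a permutation of those of A, which is why symmetric permutations preserve the diagonal.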

As an example, the arrow matrix (2) can be permuted to a new arrow matrix (3). The LU factorization of this matrix has no fill-in! Cost: O(n), space: O(n).

A = …   (3)

A symmetric permutation relabels the vertices of the adjacency graph without changing the edges. As another example, for (1), the following ordering of the graph creates no fill-in.

Figure 3: Reordered adjacency graph.

Problem 1 Find a permutation P such that the Cholesky factor L of P A P^T is sparse. (The problem of minimizing fill is NP-complete.)

Notation 1
- G = (V, E): graph of A; n = |V|, m = |E|.
- G^+ = (V, E^+): filled graph (graph of L + L^T); m^+ = |E^+|.
- α: elimination order, a bijection {1, ..., n} → V (vertices of G).
- G_α: ordered graph; G_α^+: filled ordered graph.

Lemma 1 (Path lemma) Given G and α, (u, v) is an edge of G_α^+ iff there is a path u − w_1 − ⋯ − w_k − v connecting u and v in G, where the w_i are lower-numbered vertices (than u and v) in G_α.

Proof. (⇐) Induction on the length of the path: (u, v) is either an edge in G, or gets connected due to the elimination of some w_i at a certain elimination stage; apply the induction hypothesis to (u, w_i) and (w_i, v).
(⇒) Induction on min(α^{-1}(u), α^{-1}(v)). If the minimum is 1, then either u or v is eliminated at the first step, so (u, v) must be an edge of G; otherwise they could not get connected by a path later.
If the minimum is greater than 1, either (u, v) is an edge of G (done), or (u, v) is a fill edge created during the elimination of some variable w_i. Then (u, w_i) and (w_i, v) are edges of G_α^+ satisfying α^{-1}(w_i) < min(α^{-1}(u), α^{-1}(v)). By induction there exist a path in G from u to w_i and a path from w_i to v through lower-numbered vertices; their concatenation is the desired path.

Graph game: choose a vertex v; add a fill edge between each pair of v's neighbors that are not adjacent; remove v and its incident edges. The problem above is to find an ordering of G such that the graph game adds few edges. We approximate the optimal orderings using graph-based heuristics.
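The graph game can be coded directly. The sketch below (illustrative, with a hypothetical helper name) plays it on a 5-node star graph, the graph of an arrow matrix: eliminating the hub first adds C(4, 2) = 6 fill edges, while eliminating it last adds none, matching the two arrow-matrix orderings discussed above:

```python
from itertools import combinations

def graph_game(adj, order):
    """Play the elimination game on adjacency dict `adj`
    (vertex -> set of neighbors); return the set of fill edges."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    fill = set()
    for v in order:
        nbrs = adj.pop(v)
        for a, b in combinations(sorted(nbrs), 2):
            if b not in adj[a]:            # neighbors not yet adjacent
                adj[a].add(b)
                adj[b].add(a)
                fill.add((a, b))
        for u in nbrs:                     # remove v's incident edges
            adj[u].discard(v)
    return fill

# star graph: hub 0 connected to leaves 1..4
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(len(graph_game(star, [0, 1, 2, 3, 4])))  # 6 fill edges (hub first)
print(len(graph_game(star, [4, 3, 2, 1, 0])))  # 0 fill edges (hub last)
```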

3.1 Minimum Degree (MD)

Play the graph game, eliminating a vertex of smallest degree (i.e., with the fewest incident edges) at each step; in each step, update the degrees and update G. In matrix terms: at each step of the elimination, rows and columns are permuted so that the pivot row (column) to be eliminated is the one with the fewest nonzero off-diagonal entries.

Remark 2 Some remarks:
- Popular strategy; fast in practice. Takes little time compared to the factorization itself.
- A straightforward implementation needs O(n m^+) operations and O(m^+) space. Improvements to O(n m) time and O(m) space are possible.
- Local minimization of the fill only; nothing can be proved about global optimality.
- See the picture for an example.

Figure 4: Minimum degree ordering.

3.2 Cuthill-McKee & Reverse Cuthill-McKee

These are level-set orderings based on traversing the graph by level sets. A level set is defined recursively as the set of all unlabeled neighbors of all the nodes of the previous level set. One traversal example: for each node u in the current level set, its neighbors are inspected; if a neighbor v is not yet numbered, add v to the next level set. This is breadth-first search (BFS). Within the current level set, the nodes can be traversed in order of their degrees.

Algorithm 1 Cuthill-McKee (CM) & Reverse Cuthill-McKee (RCM)
Given G with |G| = N, choose a starting node r and set it to be p_1. For i = 1 : N, find all the unlabeled neighbors of node p_i and number them in increasing order of degree.
CM ordering: p_1, p_2, ..., p_N.
RCM ordering: p_N, p_{N-1}, ..., p_1.

A. George (1971) observed that RCM provides better results than CM.
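The algorithm above can be sketched as a degree-sorted BFS (0-based Python indices; an illustration, not the production algorithm):

```python
from collections import deque

def cuthill_mckee(adj, start):
    """BFS from `start`, numbering each node's unlabeled neighbors
    in increasing order of degree; returns the CM ordering."""
    order, seen = [start], {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        # unlabeled neighbors, smallest degree first (ties by index)
        for v in sorted(adj[u] - seen, key=lambda w: (len(adj[w]), w)):
            seen.add(v)
            order.append(v)
            queue.append(v)
    return order

# path 0-1-2-3 with an extra leaf 4 attached to node 1
adj = {0: {1}, 1: {0, 2, 4}, 2: {1, 3}, 3: {2}, 4: {1}}
cm = cuthill_mckee(adj, 0)
rcm = cm[::-1]                     # RCM: reverse the CM ordering
print(cm, rcm)  # [0, 1, 4, 2, 3] [3, 2, 4, 1, 0]
```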

3.3 Nested Dissection (ND)

Definition 3 For a graph G = (V, E), a subset V' ⊆ V is a separator of G if the removal of V' from G divides G into two or more components.

Choose a separator which divides G into pieces; separator nodes are ordered last. For each piece, apply the same idea recursively. For the 2D model problem (Laplacian), with one level of separators the Cholesky factor L has a bordered block-diagonal structure (independent diagonal blocks for the pieces plus a dense border for the separator), and with two levels this structure nests recursively within each block. See the picture for an example. If the mesh is N × N (for simplicity, let N = 2^k), the Cholesky factorization with ND ordering costs O(N^3) operations and needs O(N^2 log N) space. The proof is by recursion.

Figure 5: Nested dissection ordering.

Remark 3
- Small separators are preferred. Since the elimination of a node connects the nodes previously connected to it, there will be a lot of fill between nodes in separators.
- Separators should divide the graph into pieces of roughly equal size.
- Fast in practice, especially for large problems.
- Optimal with respect to cost and fill for certain classes of graphs (Hoffman et al., 1973), e.g. grid graphs.
- MD often wins for medium-sized problems; ND often wins for large problems; RCM often wins for long, thin problems.
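The recursive "separator last" idea is easiest to see in one dimension. The sketch below (illustrative only; real ND works on 2D meshes with separator lines) orders a path graph by taking the middle node as the separator, recursing on the two halves, and numbering the separator last:

```python
def nested_dissection(nodes):
    """ND ordering of a path graph given as a list of consecutive
    nodes: recurse on the two halves, middle (separator) node last."""
    if len(nodes) <= 2:
        return list(nodes)
    mid = len(nodes) // 2
    return (nested_dissection(nodes[:mid])
            + nested_dissection(nodes[mid + 1:])
            + [nodes[mid]])

# 7-node path: node 3 is the top-level separator and is ordered last
print(nested_dissection(list(range(7))))  # [0, 2, 1, 4, 6, 5, 3]
```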

Storage schemes
Coordinate format (COO)
iA: row indices, jA: column indices, vA: nonzero entries, so that A(iA(k), jA(k)) = vA(k). For the example matrix

A = [ 1  0  0  2  0
      3  4  0  5  0
      6  0  7  8  9
      0  0 10 11  0
      0  0  0  0 12 ]

the arrays (1-based) are
vA: 1 2 3 4 5 6 7 8 9 10 11 12
iA: 1 1 2 2 2 3 3 3 3  4  4  5
jA: 1 4 1 2 4 1 3 4 5  3  4  5
In Matlab, A = sparse(iA, jA, vA).

Compressed sparse row/column (CSR/CSC)
For CSR: iA: row starts, jA: column indices, vA: nonzero entries. For the same matrix,
vA: 1 2 3 4 5 6 7 8 9 10 11 12
jA: 1 4 1 2 4 1 3 4 5  3  4  5
iA: 1 3 6 10 12 13
Fill-in in a row (column) ⇒ later rows (columns) must be moved.

Diagonally structured
DIAG(i, j) = a_{i, i+off(j)}.

Linked lists

Remark 4 Linked lists easily accommodate fill-in, but they sometimes store redundantly, and indirect addressing leads to slower operations.
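A sketch of the CSR scheme in Python (0-based row pointers and column indices, unlike the 1-based convention in the notes), together with the matrix-vector product it makes cheap:

```python
# CSR arrays for the 5x5 example with nonzeros 1..12 in row-major order
vA = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
jA = [0, 3, 0, 1, 3, 0, 2, 3, 4, 2, 3, 4]   # column index of each entry
iA = [0, 2, 5, 9, 11, 12]   # row starts; iA[i+1]-iA[i] = nnz in row i

def csr_matvec(iA, jA, vA, x):
    """y = A x for A stored in CSR format."""
    y = [0] * (len(iA) - 1)
    for i in range(len(y)):
        for k in range(iA[i], iA[i + 1]):   # nonzeros of row i
            y[i] += vA[k] * x[jA[k]]
    return y

# multiplying by the all-ones vector gives the row sums
print(csr_matvec(iA, jA, vA, [1, 1, 1, 1, 1]))  # [3, 12, 30, 21, 12]
```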

Figure 6: Different orderings of a 200 × 200 test matrix (nz = 1065) and the sparsity of the resulting Cholesky factors: natural ordering (nz = 3389), symamd (nz = 1803), symrcm (nz = 2570), and nested dissection (nz = 2328).

Figure 7: CSC storage.

Figure 8: Diagonally structured storage.

Figure 9: Linked list storage.

References

[1] A. George and J. W. H. Liu, The evolution of the minimum degree ordering algorithm, SIAM Review, 31 (1989), pp. 1-19.
[2] J. A. George, Nested dissection of a regular finite element mesh, SIAM J. Numer. Anal., 10 (1973), pp. 345-363.
[3] J. A. George and J. W. H. Liu, Computer Solution of Large Sparse Positive Definite Systems, Prentice-Hall, Englewood Cliffs, NJ, 1981.

