LU Factorization March14
In summary
To solve Ax = b, factorize A into LU, so that L[Ux] = b.
Let x’ = Ux, where x’ is a vector. To get the value of x’, solve Lx’ = b using forward substitution.
Then get x from Ux = x’ using back-substitution.
BLAS – Basic Linear Algebra Subprograms library
TRSV – Triangular Solve with Vector: solves Lx’ = b for x’, then Ux = x’ for x
Linear Algebra – Matrix Factorization
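For example, take the system Ax = b with

$$
A = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 1 & 4 \\ 5 & 3 & 1 \end{bmatrix}, \quad
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad
b = \begin{bmatrix} 14 \\ 17 \\ 14 \end{bmatrix}
$$

Write the identity matrix next to the A matrix:

$$
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 2 & 3 \\ 3 & 1 & 4 \\ 5 & 3 & 1 \end{bmatrix}
$$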
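Convert the identity matrix into L and convert the A matrix into U, where e marks an entry still to be computed:

$$
L = \begin{bmatrix} 1 & 0 & 0 \\ e & 1 & 0 \\ e & e & 1 \end{bmatrix}, \qquad
U = \begin{bmatrix} e & e & e \\ 0 & e & e \\ 0 & 0 & e \end{bmatrix}
$$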
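Apply R2 ← R2 - 3R1 and R3 ← R3 - 5R1 to the A matrix. The first row is unchanged, so it will give the first row of U, and the entries below the first pivot become zero:

$$
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 2 & 3 \\ 0 & -5 & -5 \\ 0 & -7 & -14 \end{bmatrix}
$$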
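Record the inverse operation, R2 ← R2 + 3R1, in the left matrix: the multiplier 3 goes into row 2, column 1. It will give the second row as 3 1 0:

$$
\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 2 & 3 \\ 0 & -5 & -5 \\ 0 & -7 & -14 \end{bmatrix}
$$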
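In the same way, the inverse of R3 ← R3 - 5R1 is R3 ← R3 + 5R1, which puts the multiplier 5 into row 3, column 1 of the left matrix.
Then apply R3 ← R3 - 7/5 R2 to the right matrix to clear the last entry below the diagonal. Its inverse, R3 ← R3 + 7/5 R2, puts 7/5 into row 3, column 2 of the left matrix.
This gives the lower triangular matrix L and the upper triangular matrix U:

$$
L = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 5 & 7/5 & 1 \end{bmatrix}, \qquad
U = \begin{bmatrix} 1 & 2 & 3 \\ 0 & -5 & -5 \\ 0 & 0 & -7 \end{bmatrix}
$$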
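The elimination above can be sketched in a few lines of Python. This is a minimal illustration, assuming no row exchanges (pivoting) are needed, as in the example; the function name `lu_factor` is hypothetical, and exact fractions are used so that 7/5 is reproduced exactly.

```python
from fractions import Fraction

def lu_factor(A):
    """Doolittle LU factorization without pivoting: returns (L, U) with A = L U."""
    n = len(A)
    # Working copy of A; it is reduced to the upper triangular U.
    U = [[Fraction(v) for v in row] for row in A]
    # L starts as the identity matrix and collects the multipliers.
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):                      # pivot column
        if U[k][k] == 0:
            raise ZeroDivisionError("zero pivot: row exchanges would be needed")
        for i in range(k + 1, n):           # rows below the pivot
            m = U[i][k] / U[k][k]           # multiplier for R_i <- R_i - m R_k
            L[i][k] = m                     # record the multiplier in L
            for j in range(k, n):
                U[i][j] -= m * U[k][j]
    return L, U

A = [[1, 2, 3], [3, 1, 4], [5, 3, 1]]
L, U = lu_factor(A)
for row_l, row_u in zip(L, U):
    print([str(v) for v in row_l], [str(v) for v in row_u])
```

The multipliers 3, 5 and 7/5 land in L in the same positions as in the hand calculation.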
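First solve L x’ = b by forward substitution:

$$
\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 5 & 7/5 & 1 \end{bmatrix}
\begin{bmatrix} x_1' \\ x_2' \\ x_3' \end{bmatrix} =
\begin{bmatrix} 14 \\ 17 \\ 14 \end{bmatrix}
$$

x1’ = 14
x2’ = 17 - 3(14) = -25
x3’ = 14 - 5(14) - (7/5)(-25) = -21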
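Now solve U x = x’:

$$
\begin{bmatrix} 1 & 2 & 3 \\ 0 & -5 & -5 \\ 0 & 0 & -7 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} =
\begin{bmatrix} 14 \\ -25 \\ -21 \end{bmatrix}
$$

This is back substitution, working from the last row upward:
-7 x3 = -21, so x3 = 3
-5 x2 - 5(3) = -25, so x2 = 2
x1 + 2(2) + 3(3) = 14
x1 = 14 - 9 - 4
x1 = 1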
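The two triangular solves can be checked with a short Python sketch. The function names `forward_substitution` and `back_substitution` are hypothetical; in practice each solve would be one call to a BLAS TRSV routine. L, U and b are the values from the worked example.

```python
from fractions import Fraction

def forward_substitution(L, b):
    """Solve L y = b for y, where L is lower triangular (top row first)."""
    n = len(b)
    y = [Fraction(0)] * n
    for i in range(n):
        s = sum(L[i][k] * y[k] for k in range(i))   # terms already known
        y[i] = (Fraction(b[i]) - s) / L[i][i]
    return y

def back_substitution(U, y):
    """Solve U x = y for x, where U is upper triangular (bottom row first)."""
    n = len(y)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (Fraction(y[i]) - s) / U[i][i]
    return x

L = [[1, 0, 0], [3, 1, 0], [5, Fraction(7, 5), 1]]
U = [[1, 2, 3], [0, -5, -5], [0, 0, -7]]
b = [14, 17, 14]

x_prime = forward_substitution(L, b)   # intermediate vector x'
x = back_substitution(U, x_prime)      # solution of Ax = b
print([str(v) for v in x_prime], [str(v) for v in x])
```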
LU Factorization Algorithm:
1. Start
2. Read the elements of the matrix into array A and the right-hand side into vector b
3. Calculate elements of L and U
4. Print elements of L and U
5. Find x’ by solving Lx’ = b by forward substitution
6. Find x by solving Ux = x’ by backward substitution
7. Print vector x as the solution
8. Stop
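Steps 3, 5 and 6 can be written out entry by entry. The formulas below are the standard Doolittle form (unit diagonal in L, no row exchanges), which is an assumption consistent with the worked example; the symbols $l_{ij}$, $u_{ij}$ and $a_{ij}$ for the entries of L, U and A are introduced here for illustration.

$$
\begin{aligned}
u_{ij} &= a_{ij} - \sum_{k=1}^{i-1} l_{ik}\,u_{kj}, && i \le j \\
l_{ij} &= \frac{1}{u_{jj}} \Bigl( a_{ij} - \sum_{k=1}^{j-1} l_{ik}\,u_{kj} \Bigr), && i > j, \qquad l_{ii} = 1 \\
x'_i &= b_i - \sum_{k=1}^{i-1} l_{ik}\,x'_k, && i = 1, \dots, n \\
x_i &= \frac{1}{u_{ii}} \Bigl( x'_i - \sum_{k=i+1}^{n} u_{ik}\,x_k \Bigr), && i = n, \dots, 1
\end{aligned}
$$

For the example, $l_{32} = (a_{32} - l_{31} u_{12}) / u_{22} = (3 - 5 \cdot 2) / (-5) = 7/5$.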
How to do this for a large matrix?
Parallel LU Factorization
Blocked algorithms
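Partition the full n x n matrix A into four blocks, with block size b, so that A = LU can be computed block by block:

$$
A = \begin{bmatrix} A_{00} & A_{01} \\ A_{10} & A_{11} \end{bmatrix}
$$

A00 is b x b
A01 is b x (n-b)
A10 is (n-b) x b
A11 is (n-b) x (n-b)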
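In block form, A = LU is

$$
\begin{bmatrix} A_{00} & A_{01} \\ A_{10} & A_{11} \end{bmatrix} =
\begin{bmatrix} L_{00} & 0 \\ L_{10} & L_{11} \end{bmatrix}
\begin{bmatrix} U_{00} & U_{01} \\ 0 & U_{11} \end{bmatrix}
$$

Solving Lx = b with a triangular L and a vector b is TRSV. When the right-hand side is a matrix block instead of a vector, the same triangular solve is TRSM (Triangular Solve with Matrix).
1. Compute L00 & U00 by factorizing the b x b block A00 = L00 U00
2. Compute U01 from A01 = L00 U01 (TRSM)
3. Compute L10 from A10 = L10 U00 (TRSM)
4. Update A11. Using A11 = L10 U01 + L11 U11,
L11 U11 = A11 - L10 U01 = A11’
L10 and U01 are already computed, so update A11 to get A11’. A11 and A11’ are both (n-b) x (n-b); L10 is (n-b) x b and U01 is b x (n-b).
A11’ is nothing but L11 U11 in the original notation, so how do we get L11 and U11? Recursively.
Finally, recursively solve A11’ = L11 U11 to get the next set of blocks.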
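The four block steps can be sketched in plain Python as follows. This is an illustrative sketch, not a tuned implementation: it assumes no pivoting is needed, stores L and U in a single working copy of A, and replaces the recursion on A11’ with a loop over diagonal blocks, which performs the same computation. The name `blocked_lu` and the parameter `bs` (block size) are hypothetical.

```python
from fractions import Fraction

def blocked_lu(A, bs):
    """Blocked LU without pivoting. Returns (L, U) with A = L U."""
    n = len(A)
    # Working copy: L (below the diagonal) and U overwrite it block by block.
    M = [[Fraction(v) for v in row] for row in A]
    for s in range(0, n, bs):
        e = min(s + bs, n)        # current diagonal block A00 is rows/cols s..e-1
        # Step 1: factorize A00 = L00 U00 (unblocked elimination inside the block)
        for k in range(s, e):
            for i in range(k + 1, e):
                M[i][k] /= M[k][k]
                for j in range(k + 1, e):
                    M[i][j] -= M[i][k] * M[k][j]
        # Step 2: solve A01 = L00 U01 for U01 (TRSM with unit lower triangular L00)
        for j in range(e, n):
            for i in range(s, e):
                for k in range(s, i):
                    M[i][j] -= M[i][k] * M[k][j]
        # Step 3: solve A10 = L10 U00 for L10 (TRSM with upper triangular U00)
        for i in range(e, n):
            for j in range(s, e):
                for k in range(s, j):
                    M[i][j] -= M[i][k] * M[k][j]
                M[i][j] /= M[j][j]
        # Step 4: update A11' = A11 - L10 U01; the next pass factorizes A11'
        for i in range(e, n):
            for j in range(e, n):
                for k in range(s, e):
                    M[i][j] -= M[i][k] * M[k][j]
    # Unpack the combined storage into separate L (unit diagonal) and U.
    L = [[M[i][j] if j < i else Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    U = [[M[i][j] if j >= i else Fraction(0) for j in range(n)] for i in range(n)]
    return L, U

A = [[1, 2, 3], [3, 1, 4], [5, 3, 1]]
L, U = blocked_lu(A, 2)   # block size 2 on the 3 x 3 example
```

With block size 2 the 3 x 3 example goes through all four steps once and then factorizes the remaining 1 x 1 block, giving the same L and U as the row-by-row elimination.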
Now how to parallelize this blocked algorithm
The four steps at each level are:
1. Compute L00 & U00 by factorizing A00 = L00 U00
2. Compute U01 from A01 = L00 U01 (TRSM)
3. Compute L10 from A10 = L10 U00 (TRSM)
4. Update A11 to get L11 U11 = A11 - L10 U01 = A11’
• Compute step 1 sequentially
• Compute steps 2 & 3 in parallel, block by block, as there are NO dependencies between them
• The 4th step is the bottleneck, because it is a full matrix multiply and involves about 2b(n-b)^2 operations
• Compute step 4 with lots of block-by-block parallel execution, then continue recursively on A11’
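The dependency structure can be illustrated with Python's standard `concurrent.futures` module. This sketch only shows which steps may run concurrently: CPython threads will not normally speed up pure-Python arithmetic, and a real implementation would hand each task to BLAS routines (TRSM, GEMM) or to separate processes. All function names here are hypothetical, and no pivoting is assumed.

```python
from concurrent.futures import ThreadPoolExecutor
from fractions import Fraction

def factor_diag(M, s, e):
    """Step 1: unblocked LU of the diagonal block, in place."""
    for k in range(s, e):
        for i in range(k + 1, e):
            M[i][k] /= M[k][k]
            for j in range(k + 1, e):
                M[i][j] -= M[i][k] * M[k][j]

def solve_u01(M, s, e, n):
    """Step 2: U01 from A01 = L00 U01 (writes rows s..e-1, columns e..n-1)."""
    for j in range(e, n):
        for i in range(s, e):
            for k in range(s, i):
                M[i][j] -= M[i][k] * M[k][j]

def solve_l10(M, s, e, n):
    """Step 3: L10 from A10 = L10 U00 (writes rows e..n-1, columns s..e-1)."""
    for i in range(e, n):
        for j in range(s, e):
            for k in range(s, j):
                M[i][j] -= M[i][k] * M[k][j]
            M[i][j] /= M[j][j]

def update_rows(M, s, e, n, rows):
    """Step 4 for one strip of rows: A11' = A11 - L10 U01."""
    for i in rows:
        for j in range(e, n):
            for k in range(s, e):
                M[i][j] -= M[i][k] * M[k][j]

def parallel_blocked_lu(A, bs, workers=4):
    n = len(A)
    M = [[Fraction(v) for v in row] for row in A]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for s in range(0, n, bs):
            e = min(s + bs, n)
            factor_diag(M, s, e)                      # step 1: sequential
            # Steps 2 and 3 write disjoint blocks and only read L00/U00.
            tasks = [pool.submit(solve_u01, M, s, e, n),
                     pool.submit(solve_l10, M, s, e, n)]
            for t in tasks:
                t.result()                            # wait before step 4
            # Step 4: each strip of A11 rows is an independent task.
            tasks = [pool.submit(update_rows, M, s, e, n, range(r, min(r + bs, n)))
                     for r in range(e, n, bs)]
            for t in tasks:
                t.result()
    L = [[M[i][j] if j < i else Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    U = [[M[i][j] if j >= i else Fraction(0) for j in range(n)] for i in range(n)]
    return L, U

A4 = [[2, 1, 1, 0], [4, 3, 3, 1], [8, 7, 9, 5], [6, 7, 9, 8]]  # made-up test matrix
L4, U4 = parallel_blocked_lu(A4, 2)
```

Waiting on the step 2 and step 3 tasks before submitting step 4 enforces the one real dependency: the update of A11 needs both L10 and U01.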