Matrix Computations (Golub) PDF
1, then it suffices to count multiplications as the number of additions is roughly the same. If we just count the multiplications, then it suffices to examine the deepest level of the recursion as that is where all the multiplications occur. In strass there are q - d subdivisions and thus 7^(q-d) conventional matrix-matrix multiplications to perform. These multiplications have size nmin and thus strass involves about s = (2^d)^3 7^(q-d) multiplications compared to c = (2^q)^3, the number of multiplications in the conventional approach. Notice that

    s/c = (7/8)^(q-d).

If d = 0, i.e., we recur on down to the 1-by-1 level, then

    s = 7^q = n^(log2 7) ≈ n^2.807.

Thus, asymptotically, the number of multiplications in the Strassen procedure is O(n^2.807). However, the number of additions (relative to the number of multiplications) becomes significant as nmin gets small.

Example 1.3.1 If n = 1024 and nmin = 64, then strass involves (7/8)^(10-6) ≈ .6 times the arithmetic of the conventional algorithm.

Problems

P1.3.1 Generalize (1.3.2) so that it can handle the variable block-size problem covered by Theorem 1.3.3.
P1.3.2 Generalize (1.3.4) and (1.3.5) so that they can handle the variable blocking ...
P1.3.3 Adapt strass so that it can handle square matrix multiplication of any order. Hint: If the "current" A has odd dimension, append a zero row and column.
P1.3.4 Prove that if ... is a blocking of the matrix A, then ...
P1.3.5 Suppose n is even and define the following function from R^n to R:
    f(x) = x(1:2:n)^T x(2:2:n) = sum_{i=1}^{n/2} x_{2i-1} x_{2i}.
(a) Show that if x, y ∈ R^n then
    x^T y = sum_{i=1}^{n/2} (x_{2i-1} + y_{2i})(x_{2i} + y_{2i-1}) - f(x) - f(y).
(b) Now consider the n-by-n matrix multiplication C = AB. Give an algorithm for ... Note from Lemma 1.3.2 that ... and analyze each A_{ij}B_{jk} with the help of Lemma 1.3.1.

Notes and References for Sec. 1.3

For quite some time fast methods for matrix multiplication have attracted a lot of attention within computer science. See

S. Winograd (1968). "A New Algorithm for Inner Product," IEEE Trans. Comput. C-17, 693-694.
V. Strassen (1969). "Gaussian Elimination is Not Optimal," Numer. Math. 13, 354-356.
V. Pan (1984). "How Can We Speed Up Matrix Multiplication?," SIAM Review 26, 393-415.

Many of these methods have dubious practical value. However, with the publication of

D. Bailey (1988). "Extra High Speed Matrix Multiplication on the Cray-2," SIAM J. Sci. and Stat. Comp. 9, 603-607.

it is clear that the blanket dismissal of these fast procedures is unwise. The "stability" of the Strassen algorithm is discussed in §2.4.10. See also

N.J. Higham (1990). "Exploiting Fast Matrix Multiplication within the Level 3 BLAS," ACM Trans. Math. Soft. 16, 352-368.
C.C. Douglas, M. Heroux, G. Slishman, and R.M. Smith (1994). "GEMMW: A Portable Level 3 BLAS Winograd Variant of Strassen's Matrix-Matrix Multiply Algorithm," J. Comput. Phys. 110, 1-10.

1.4 Vectorization and Re-Use Issues

The matrix manipulations discussed in this book are mostly built upon dot products and saxpy operations. Vector pipeline computers are able to perform vector operations such as these very fast because of special hardware that is able to exploit the fact that a vector operation is a very regular sequence of scalar operations. Whether or not high performance is extracted from such a computer depends upon the length of the vector operands and a number of other factors that pertain to the movement of data such as vector stride, the number of vector loads and stores, and the level of data re-use.
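To make the two kernels concrete, here is a minimal NumPy sketch of a dot product and a saxpy update; the function names and the use of NumPy are illustrative assumptions, not part of the original text.

    import numpy as np

    def dot(x, y):
        # s = x^T y, accumulated as a regular sequence of multiply-adds
        s = 0.0
        for xi, yi in zip(x, y):
            s += xi * yi
        return s

    def saxpy(a, x, y):
        # y <- a*x + y, the basic vector update used throughout the book
        y += a * x
        return y

    x = np.array([1.0, 2.0, 3.0])
    y = np.array([4.0, 5.0, 6.0])
    print(dot(x, y))          # 32.0
    print(saxpy(2.0, x, y))   # [ 6.  9. 12.]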
Our goal is to build a useful awareness of these issues. We are not trying to build a comprehensive model of vector pipeline computing that might be used to predict performance. We simply want to identify the kind of thinking that goes into the design of an effective vector pipeline code. We do not mention any particular machine. The literature is filled with case studies.

1.4.1 Pipelining Arithmetic Operations

The primary reason why vector computers are fast has to do with pipelining. The concept of pipelining is best understood by making an analogy to assembly line production. Suppose the assembly of an individual automobile requires one minute at each of sixty workstations along an assembly line. If the line is well staffed and able to initiate the assembly of a new car every minute, then 1000 cars can be produced from scratch in about 1000 + 60 = 1060 minutes. For a work order of this size the line has an effective "vector speed" of 1000/1060 automobiles per minute. On the other hand, if the assembly line is understaffed and a new assembly can be initiated just once an hour, then 1000 hours are required to produce 1000 cars. In this case the line has an effective "scalar speed" of 1/60th automobile per minute.

So it is with a pipelined vector operation such as the vector add z = x + y. The scalar operations z_i = x_i + y_i are the cars. The number of elements is the size of the work order. If the start-to-finish time required for each z_i is τ, then a pipelined, length-n vector add could be completed in time much less than nτ. This gives vector speed. Without the pipelining, the vector computation would proceed at a scalar rate and would approximately require time nτ for completion.

Let us see how a sequence of floating point operations can be pipelined. Floating point operations usually require several cycles to complete. For example, a 3-cycle addition of two scalars x and y may proceed as in Fig. 1.4.1.

    Fig. 1.4.1 A 3-Cycle Adder

To visualize the operation, continue with the above metaphor and think of the addition unit as an assembly line with three "work stations". The input scalars x and y proceed along the assembly line spending one cycle at each of three stations. The sum z emerges after three cycles. Note that when a single, "free standing" addition is performed, only one of the three stations is active during the computation.

Now consider a vector addition z = x + y. With pipelining, the x and y vectors are streamed through the addition unit. Once the pipeline is filled and steady state reached, a z_i is produced every cycle. In Fig. 1.4.2 we depict what the pipeline might look like once this steady state is achieved.

    Fig. 1.4.2 Pipelined Addition

In this case, vector speed is about three times scalar speed because the time for an individual add is three cycles.

1.4.2 Vector Operations

A vector pipeline computer comes with a repertoire of vector instructions, such as vector add, vector multiply, vector scale, dot product, and saxpy. We assume for clarity that these operations take place in vector registers. Vectors travel between the registers and memory by means of vector load and vector store instructions.

An important attribute of a vector processor is the length of its vector registers, which we designate by v_L. A length-n vector operation must be broken down into subvector operations of length v_L or less.
Here is how such a partitioning might be managed in the case of a vector addition z = x + y where x and y are n-vectors:

    first = 1
    while first <= n
        last = min{n, first + v_L - 1}
        z(first:last) = x(first:last) + y(first:last)
        first = last + 1
    end

    span{a_1,...,a_n} = { sum_{j=1}^{n} β_j a_j : β_j ∈ R }.

If {a_1,...,a_n} is independent and b ∈ span{a_1,...,a_n}, then b is a unique linear combination of the a_j.

If S_1,...,S_k are subspaces of R^n, then their sum is the subspace defined by S = { a_1 + a_2 + ··· + a_k : a_i ∈ S_i, i = 1:k }. S is said to be a direct sum if each v ∈ S has a unique representation v = a_1 + ··· + a_k with a_i ∈ S_i. In this case we write S = S_1 ⊕ ··· ⊕ S_k. The intersection of the S_i is also a subspace, S = S_1 ∩ ··· ∩ S_k.

The subset {a_{i1},...,a_{ik}} is a maximal linearly independent subset of {a_1,...,a_n} if it is linearly independent and is not properly contained in any linearly independent subset of {a_1,...,a_n}. If {a_{i1},...,a_{ik}} is maximal, then span{a_1,...,a_n} = span{a_{i1},...,a_{ik}} and {a_{i1},...,a_{ik}} is a basis for span{a_1,...,a_n}. If S ⊆ R^m is a subspace, then it is possible to find independent basic vectors a_1,...,a_k ∈ S such that S = span{a_1,...,a_k}. All bases for a subspace S have the same number of elements. This number is the dimension and is denoted by dim(S).

2.1.2 Range, Null Space, and Rank

There are two important subspaces associated with an m-by-n matrix A. The range of A is defined by
    ran(A) = { y ∈ R^m : y = Ax for some x ∈ R^n },
and the null space of A is defined by
    null(A) = { x ∈ R^n : Ax = 0 }.
If A = [a_1,...,a_n] is a column partitioning, then
    ran(A) = span{a_1,...,a_n}.
The rank of a matrix A is defined by
    rank(A) = dim(ran(A)).
It can be shown that rank(A) = rank(A^T). We say that A ∈ R^{m×n} is rank deficient if rank(A) < min{m,n}. If A ∈ R^{m×n}, then
    dim(null(A)) + rank(A) = n.

2.1.3 Matrix Inverse

The n-by-n identity matrix I_n is defined by the column partitioning
    I_n = [ e_1,...,e_n ]
where e_k is the kth "canonical" vector: e_k = (0,...,0,1,0,...,0)^T. The canonical vectors arise frequently in matrix analysis and if their dimension is ever ambiguous, we use superscripts, i.e., e_k^(n) ∈ R^n.

If A and X are in R^{n×n} and satisfy AX = I, then X is the inverse of A and is denoted by A^{-1}. If A^{-1} exists, then A is said to be nonsingular. Otherwise, we say A is singular.

Several matrix inverse properties have an important role to play in matrix computations. The inverse of a product is the reverse product of the inverses:
    (AB)^{-1} = B^{-1} A^{-1}.                              (2.1.1)
The transpose of the inverse is the inverse of the transpose:
    (A^{-1})^T = (A^T)^{-1} ≡ A^{-T}.                       (2.1.2)
The identity
    B^{-1} = A^{-1} - B^{-1}(B - A)A^{-1}                   (2.1.3)
shows how the inverse changes if the matrix changes.

The Sherman-Morrison-Woodbury formula gives a convenient expression for the inverse of (A + UV^T) where A ∈ R^{n×n} and U and V are n-by-k:
    (A + UV^T)^{-1} = A^{-1} - A^{-1}U(I + V^T A^{-1}U)^{-1} V^T A^{-1}.   (2.1.4)
A rank k correction to a matrix results in a rank k correction of the inverse. In (2.1.4) we assume that both A and (I + V^T A^{-1}U) are nonsingular.

Any of these facts can be verified by just showing that the "proposed" inverse does the job. For example, here is how to confirm (2.1.3):
    B(A^{-1} - B^{-1}(B - A)A^{-1}) = BA^{-1} - (B - A)A^{-1} = I.

2.1.4 The Determinant

If A = (a) ∈ R^{1×1}, then its determinant is given by det(A) = a. The determinant of A ∈ R^{n×n} is defined in terms of order n - 1 determinants:
    det(A) = sum_{j=1}^{n} (-1)^{j+1} a_{1j} det(A_{1j}).
Here, A_{1j} is an (n-1)-by-(n-1) matrix obtained by deleting the first row and jth column of A. Useful properties of the determinant include

    det(AB) = det(A)det(B)            A, B ∈ R^{n×n}
    det(A^T) = det(A)                 A ∈ R^{n×n}
    det(cA) = c^n det(A)              c ∈ R, A ∈ R^{n×n}
    det(A) ≠ 0  ⇔  A is nonsingular   A ∈ R^{n×n}
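Before moving on, here is a small numerical check of the Sherman-Morrison-Woodbury formula (2.1.4); the random test matrix and the use of NumPy are our own assumptions, added only for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    n, k = 6, 2
    A = rng.standard_normal((n, n)) + n * np.eye(n)   # well conditioned by construction
    U = rng.standard_normal((n, k))
    V = rng.standard_normal((n, k))

    Ainv = np.linalg.inv(A)
    # Right-hand side of (2.1.4): A^{-1} - A^{-1} U (I + V^T A^{-1} U)^{-1} V^T A^{-1}
    correction = Ainv @ U @ np.linalg.solve(np.eye(k) + V.T @ Ainv @ U, V.T @ Ainv)
    lhs = np.linalg.inv(A + U @ V.T)
    print(np.allclose(lhs, Ainv - correction))        # True (up to roundoff)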
2.1.5 Differentiation

Suppose α is a scalar and that A(α) is an m-by-n matrix with entries a_{ij}(α). If a_{ij}(α) is a differentiable function of α for all i and j, then by Ȧ(α) we mean the matrix
    Ȧ(α) = d/dα A(α) = ( d/dα a_{ij}(α) ) = ( ȧ_{ij}(α) ).
The differentiation of a parameterized matrix turns out to be a handy way to examine the sensitivity of various matrix problems.

Problems

P2.1.1 Show that if A ∈ R^{m×n} has rank p, then there exist an X ∈ R^{m×p} and a Y ∈ R^{n×p} such that A = XY^T, where rank(X) = rank(Y) = p.
P2.1.2 Suppose A(α) ∈ R^{m×r} and B(α) ∈ R^{r×n} are matrices whose entries are differentiable functions of the scalar α. Show
    d/dα [A(α)B(α)] = [d/dα A(α)] B(α) + A(α) [d/dα B(α)].
P2.1.3 Suppose A(α) ∈ R^{n×n} has entries that are differentiable functions of the scalar α. Assuming A(α) is always nonsingular, show
    d/dα [A(α)^{-1}] = -A(α)^{-1} [d/dα A(α)] A(α)^{-1}.
P2.1.4 Suppose A ∈ R^{n×n}, b ∈ R^n and that φ(x) = (1/2) x^T A x - x^T b. Show that the gradient of φ is given by ∇φ(x) = (1/2)(A^T + A)x - b.
P2.1.5 Assume that both A and A + uv^T are nonsingular where A ∈ R^{n×n} and u, v ∈ R^n. Show that if x solves (A + uv^T)x = b, then it also solves a perturbed right-hand side problem of the form Ax = b + αu. Give an expression for α in terms of A, u, v, and b.

Notes and References for Sec. 2.1

There are many introductory linear algebra texts. Among them, the following are particularly useful:

P.R. Halmos (1958). Finite Dimensional Vector Spaces, 2nd ed., Van Nostrand Reinhold, Princeton.
S.J. Leon (1980). Linear Algebra with Applications, Macmillan, New York.
G. Strang (1993). Introduction to Linear Algebra, Wellesley-Cambridge Press, Wellesley, MA.
D. Lay (1994). Linear Algebra and Its Applications, Addison-Wesley, Reading, MA.
C. Meyer (1997). A Course in Applied Linear Algebra, SIAM Publications, Philadelphia, PA.

More advanced treatments include Gantmacher (1959), Horn and Johnson (1985, 1991), and

A.S. Householder (1964). The Theory of Matrices in Numerical Analysis, Ginn (Blaisdell), Boston.
M. Marcus and H. Minc (1964). A Survey of Matrix Theory and Matrix Inequalities, Allyn and Bacon, Boston.
J.N. Franklin (1968). Matrix Theory, Prentice Hall, Englewood Cliffs, NJ.
R. Bellman (1970). Introduction to Matrix Analysis, Second Edition, McGraw-Hill, New York.
P. Lancaster and M. Tismenetsky (1985). The Theory of Matrices, Second Edition, Academic Press, New York.
J.M. Ortega (1987). Matrix Theory: A Second Course, Plenum Press, New York.

2.2 Vector Norms

Norms serve the same purpose on vector spaces that absolute value does on the real line: they furnish a measure of distance. More precisely, R^n together with a norm on R^n defines a metric space. Therefore, we have the familiar notions of neighborhood, open sets, convergence, and continuity when working with vectors and vector-valued functions.

2.2.1 Definitions

A vector norm on R^n is a function f: R^n → R that satisfies the following properties:
    f(x) ≥ 0            x ∈ R^n,  (f(x) = 0 iff x = 0)
    f(x + y) ≤ f(x) + f(y)    x, y ∈ R^n
    f(αx) = |α| f(x)    α ∈ R, x ∈ R^n.
We denote such a function with a double bar notation: f(x) = ||x||. Subscripts on the double bar are used to distinguish between various norms. A useful class of vector norms are the p-norms defined by
    ||x||_p = ( |x_1|^p + ··· + |x_n|^p )^{1/p}    p ≥ 1.    (2.2.1)
Of these the 1, 2, and ∞ norms are the most important:
    ||x||_1 = |x_1| + ··· + |x_n|
    ||x||_2 = ( |x_1|^2 + ··· + |x_n|^2 )^{1/2} = (x^T x)^{1/2}
    ||x||_∞ = max_{1≤i≤n} |x_i|.
A unit vector with respect to the norm || · || is a vector x that satisfies ||x|| = 1.

2.2.2 Some Vector Norm Properties

A classic result concerning p-norms is the Holder inequality:
    |x^T y| ≤ ||x||_p ||y||_q,   1/p + 1/q = 1.

2.2.3 Absolute and Relative Error

If x̂ ∈ R^n is an approximation to x ∈ R^n, then for a given vector norm || · || we say that
    ε_abs = ||x̂ - x||
is the absolute error in x̂. If x ≠ 0, then
    ε_rel = ||x̂ - x|| / ||x||
prescribes the relative error in x̂. Relative error in the ∞-norm can be translated into a statement about the number of correct significant digits in x̂. In particular, if
    ||x̂ - x||_∞ / ||x||_∞ ≈ 10^{-p},
then the largest component of x̂ has approximately p correct significant digits.
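The following short NumPy sketch, an assumed illustration with made-up numbers, computes the three basic norms and the relative error estimate just described:

    import numpy as np

    x    = np.array([1.000, 0.250])
    xhat = np.array([1.001, 0.249])

    for p in (1, 2, np.inf):
        print(p, np.linalg.norm(x, p))        # the 1, 2, and infinity norms of x

    rel_err = np.linalg.norm(xhat - x, np.inf) / np.linalg.norm(x, np.inf)
    print(rel_err)   # 1e-3, so about three correct digits in the largest component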
Example 2.2.1 If x = (1.234  .05674)^T and x̂ = (1.235  .05128)^T, then ||x̂ - x||_∞/||x||_∞ ≈ .0043 ≈ 10^{-3}. Note that x̂_1 has about three significant digits that are correct while only one significant digit in x̂_2 is correct.

2.2.4 Convergence

We say that a sequence {x^(k)} of n-vectors converges to x if
    lim_{k→∞} ||x^(k) - x|| = 0.
Note that because of (2.2.4), convergence in the α-norm implies convergence in the β-norm and vice versa.

Problems

P2.2.1 Show that if x ∈ R^n, then lim_{p→∞} ||x||_p = ||x||_∞.
P2.2.2 Prove the Cauchy-Schwarz inequality (2.2.3) by considering the inequality 0 ≤ (ax + by)^T(ax + by) for suitable scalars a and b.
P2.2.3 Verify that || · ||_1, || · ||_2, and || · ||_∞ are vector norms.
P2.2.4 Verify (2.2.5)-(2.2.7). When is equality achieved in each result?
P2.2.5 Show that in R^n, x^(i) → x if and only if x_k^(i) → x_k for k = 1:n.
P2.2.6 Show that any vector norm on R^n is uniformly continuous by verifying the inequality | ||x|| - ||y|| | ≤ ||x - y||.

    ... ||AB|| ≤ ||A|| ||B||. For the most part we work with norms that satisfy (2.3.4).

The p-norms have the important property that for every A ∈ R^{m×n} and x ∈ R^n we have ||Ax||_p ≤ ||A||_p ||x||_p. More generally, for any vector norm || · ||_α on R^n and || · ||_β on R^m we have ||Ax||_β ≤ ||A||_{α,β} ||x||_α where ||A||_{α,β} is a matrix norm defined by
    ||A||_{α,β} = sup_{x≠0} ||Ax||_β / ||x||_α.    (2.3.5)
We say that || · ||_{α,β} is subordinate to the vector norms || · ||_α and || · ||_β. Since the set { x ∈ R^n : ||x||_α = 1 } is compact and || · ||_β is continuous, it follows that
    ||A||_{α,β} = max_{||x||_α = 1} ||Ax||_β = ||Ax*||_β    (2.3.6)
for some x* ∈ R^n having unit α-norm.

2.3.2 Some Matrix Norm Properties

The Frobenius and p-norms (especially p = 1, 2, ∞) satisfy certain inequalities that are frequently used in the analysis of matrix computations. For A ∈ R^{m×n} we have

    ||A||_2 ≤ ||A||_F ≤ sqrt(n) ||A||_2                         (2.3.7)
    max_{i,j} |a_{ij}| ≤ ||A||_2 ≤ sqrt(mn) max_{i,j} |a_{ij}|  (2.3.8)
    ||A||_1 = max_{1≤j≤n} sum_{i=1}^{m} |a_{ij}|                (2.3.9)
    ||A||_∞ = max_{1≤i≤m} sum_{j=1}^{n} |a_{ij}|                (2.3.10)
    (1/sqrt(n)) ||A||_∞ ≤ ||A||_2 ≤ sqrt(m) ||A||_∞             (2.3.11)
    (1/sqrt(m)) ||A||_1 ≤ ||A||_2 ≤ sqrt(n) ||A||_1             (2.3.12)

If A ∈ R^{m×n}, ...

Lemma 2.3.3 If F ∈ R^{n×n} and ||F||_p < 1, then I - F is nonsingular and
    (I - F)^{-1} = sum_{k=0}^{∞} F^k
with
    ||(I - F)^{-1}||_p ≤ 1/(1 - ||F||_p).

Proof. Suppose I - F is singular. It follows that (I - F)x = 0 for some nonzero x. But then ||x||_p = ||Fx||_p implies ||F||_p ≥ 1, a contradiction. Thus, I - F is nonsingular. To obtain an expression for its inverse consider the identity
    ( sum_{k=0}^{N} F^k )(I - F) = I - F^{N+1}.
Since ||F||_p < 1 it follows that lim_{k→∞} F^k = 0 because ||F^k||_p ≤ ||F||_p^k. Thus,
    ( lim_{N→∞} sum_{k=0}^{N} F^k )(I - F) = I.
It follows that (I - F)^{-1} = lim_{N→∞} sum_{k=0}^{N} F^k. From this it is easy to show that
    ||(I - F)^{-1}||_p ≤ sum_{k=0}^{∞} ||F||_p^k = 1/(1 - ||F||_p).  □

Note that ||(I - F)^{-1} - I||_p ≤ ||F||_p/(1 - ||F||_p) as a consequence of the lemma. Thus, if ε ≪ 1, then O(ε) perturbations in I induce O(ε) perturbations in the inverse. We next extend this result to general matrices.

Theorem 2.3.4 If A is nonsingular and r ≡ ||A^{-1}E||_p < 1, then A + E is nonsingular and
    ||(A + E)^{-1} - A^{-1}||_p ≤ ||E||_p ||A^{-1}||_p^2 / (1 - r).

Proof. Since A is nonsingular A + E = A(I - F) where F = -A^{-1}E. Since ||F||_p = r < 1 it follows from Lemma 2.3.3 that I - F is nonsingular and ||(I - F)^{-1}||_p < 1/(1 - r). Now (A + E)^{-1} = (I - F)^{-1}A^{-1} and so
    ||(A + E)^{-1}||_p ≤ ||A^{-1}||_p / (1 - r).
Equation (2.1.3) says that (A + E)^{-1} - A^{-1} = -A^{-1}E(A + E)^{-1} and so by taking norms we find
    ||(A + E)^{-1} - A^{-1}||_p ≤ ||A^{-1}||_p ||E||_p ||(A + E)^{-1}||_p ≤ ||A^{-1}||_p^2 ||E||_p / (1 - r).  □

Problems

P2.3.1 Show ||AB||_p ≤ ||A||_p ||B||_p where 1 ≤ p ≤ ∞.

    ... an exception occurs when |a op b| > M or 0 < |a op b| < m respectively. The handling of these and other exceptions is hardware/system dependent.

2.4.3 Cancellation

Another important aspect of finite precision arithmetic is the phenomenon of catastrophic cancellation. Roughly speaking, this term refers to the extreme loss of correct significant digits when small numbers are additively computed from large numbers. A well-known example taken from Forsythe, Malcolm and Moler (1977, pp. 14-16) is the computation of e^{-a} via Taylor series with a > 0.
The roundoff error associated with this method is ...

There are important classes of machines whose additive floating point operations satisfy
    fl(a ± b) = (1 + ε_1)a ± (1 + ε_2)b,   |ε_1|, |ε_2| ≤ u.

2.4.4 Absolute Value Notation

If B ∈ R^{m×n}, then |B| denotes the matrix with entries |B|_{ij} = |b_{ij}| for i = 1:m, j = 1:n, and B ≤ A means b_{ij} ≤ a_{ij} for i = 1:m, j = 1:n. With this notation we see that (2.4.6) has the form
    |fl(A) - A| ≤ u|A|.
A relation such as this can be easily turned into a norm inequality, e.g., ||fl(A) - A||_1 ≤ u||A||_1. However, when quantifying the rounding errors in a matrix manipulation, the absolute value notation can be a lot more informative because it provides a comment on each (i,j) entry.

2.4.5 Roundoff in Dot Products

We begin our study of finite precision matrix computations by considering the rounding errors that result in the standard dot product algorithm:

    s = 0
    for k = 1:n
        s = s + x(k)y(k)                                       (2.4.7)
    end

Here, x and y are n-by-1 floating point vectors.

In trying to quantify the rounding errors in this algorithm, we are immediately confronted with a notational problem: the distinction between computed and exact quantities. When the underlying computations are clear, we shall use the fl(·) operator to signify computed quantities. Thus, fl(x^T y) denotes the computed output of (2.4.7). Let us bound |fl(x^T y) - x^T y|. If
    s_p = fl( sum_{k=1}^{p} x_k y_k ),
then s_1 = x_1 y_1 (1 + δ_1) with |δ_1| ≤ u and ...

... If A ≥ 0 and B ≥ 0, then conventional matrix multiplication produces a product Ĉ that has small componentwise relative error:
    |Ĉ - C| ≤ nu|A||B| + O(u²) = nu|C| + O(u²).
This follows from (2.4.18). Because we cannot say the same for the Strassen approach, we conclude that Algorithm 1.3.1 is not attractive for certain nonnegative matrix multiplication problems if relatively accurate ĉ_{ij} are required.

Extrapolating from this discussion we reach two fairly obvious but important conclusions:

• Different methods for computing the same quantity can produce substantially different results.
• Whether or not an algorithm produces satisfactory results depends upon the type of problem solved and the goals of the user.

These observations are clarified in subsequent chapters and are intimately related to the concepts of algorithm stability and problem condition.
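A small NumPy experiment, assumed here purely for illustration, makes the first conclusion tangible: accumulating the same dot product with the recurrence (2.4.7) in single precision and again in double precision gives noticeably different answers.

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.standard_normal(10_000).astype(np.float32)
    y = rng.standard_normal(10_000).astype(np.float32)

    s32 = np.float32(0.0)
    for xk, yk in zip(x, y):          # the recurrence (2.4.7), carried out in single precision
        s32 += xk * yk

    s64 = float(np.dot(x.astype(np.float64), y.astype(np.float64)))
    print(s32, s64, abs(s32 - s64))   # the two results differ in the trailing digits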
Problems

P2.4.1 Show that if (2.4.7) is applied with y = x, then fl(x^T x) = x^T x (1 + α) where |α| ≤ nu + O(u²).
P2.4.2 Prove (2.4.3).
P2.4.3 Show that if E ∈ R^{m×n} with m ≥ n, then || |E| ||_2 ≤ ... ||E||_2. This result is useful when deriving norm bounds from absolute value bounds.
P2.4.4 Assume the existence of a square root function satisfying fl(√x) = √x (1 + ε) with |ε| ≤ u. ...

    ... ≥ (σ² + w^T w). But σ² = ||A||_2² = ||A_1||_2², and so we must have w = 0. An obvious induction argument completes the proof of the theorem.  □

The σ_i are the singular values of A and the vectors u_i and v_i are the ith left singular vector and the ith right singular vector respectively. It is easy to verify by comparing columns in the equations AV = UΣ and A^T U = VΣ^T that
    A v_i = σ_i u_i  and  A^T u_i = σ_i v_i,   i = 1:min{m,n}.
It is convenient to have the following notation for designating singular values:
    σ_i(A)   = the ith largest singular value of A,
    σ_max(A) = the largest singular value of A,
    σ_min(A) = the smallest singular value of A.
The singular values of a matrix A are precisely the lengths of the semi-axes of the hyperellipsoid E defined by E = { Ax : ||x||_2 = 1 }.

Example 2.5.1
    [ 0.96  1.72 ]   [ 0.6   0.8 ] [ 3  0 ] [ 0.8  -0.6 ]^T
    [ 2.28  0.96 ] = [ 0.8  -0.6 ] [ 0  1 ] [ 0.6   0.8 ]    = UΣV^T.

The SVD reveals a great deal about the structure of a matrix. If the SVD of A is given by Theorem 2.5.2, and we define r by
    σ_1 ≥ ··· ≥ σ_r > σ_{r+1} = ··· = σ_p = 0,
then
    rank(A) = r                                   (2.5.3)
    null(A) = span{ v_{r+1},...,v_n }             (2.5.4)
    ran(A)  = span{ u_1,...,u_r }                 (2.5.5)
and we have the SVD expansion
    A = sum_{i=1}^{r} σ_i u_i v_i^T.              (2.5.6)
Various 2-norm and Frobenius norm properties have connections to the SVD. If A ∈ R^{m×n}, then
    ||A||_F² = σ_1² + ··· + σ_p²,   p = min{m,n}  (2.5.7)
    ||A||_2 = σ_1                                 (2.5.8)
    min_{x≠0} ||Ax||_2 / ||x||_2 = σ_n   (m ≥ n)  (2.5.9)

2.5.4 The Thin SVD

If A = UΣV^T ∈ R^{m×n} is the SVD of A and m ≥ n, then
    A = U_1 Σ_1 V^T
where U_1 = U(:,1:n) = [u_1,...,u_n] ∈ R^{m×n} and Σ_1 = Σ(1:n,1:n) = diag(σ_1,...,σ_n) ∈ R^{n×n}. We refer to this much-used, trimmed down version of the SVD as the thin SVD.

2.5.5 Rank Deficiency and the SVD

One of the most valuable aspects of the SVD is that it enables us to deal sensibly with the concept of matrix rank. Numerous theorems in linear algebra have the form "if such-and-such a matrix has full rank, then such-and-such a property holds." While neat and aesthetic, results of this flavor do not help us address the numerical difficulties frequently encountered in situations where near rank deficiency prevails. Rounding errors and fuzzy data make rank determination a nontrivial exercise. Indeed, for some small ε we may be interested in the ε-rank of a matrix which we define by
    rank(A, ε) = min_{||A-B||_2 ≤ ε} rank(B).
Thus, if A is obtained in a laboratory with each a_{ij} correct to within ±.001, then it might make sense to look at rank(A, .001). Along the same lines, if A is an m-by-n floating point matrix then it is reasonable to regard A as numerically rank deficient if rank(A, ε) < min{m,n} with ε = u||A||_2.

Numerical rank deficiency and ε-rank are nicely characterized in terms of the SVD because the singular values indicate how near a given matrix is to a matrix of lower rank.

Theorem 2.5.3 Let the SVD of A ∈ R^{m×n} be given by Theorem 2.5.2. If k < r = rank(A) and
    A_k = sum_{i=1}^{k} σ_i u_i v_i^T,            (2.5.10)
then
    min_{rank(B)=k} ||A - B||_2 = ||A - A_k||_2 = σ_{k+1}.    (2.5.11)

Proof. Since U^T A_k V = diag(σ_1,...,σ_k,0,...,0) it follows that rank(A_k) = k and that U^T(A - A_k)V = diag(0,...,0,σ_{k+1},...,σ_p) and so ||A - A_k||_2 = σ_{k+1}.
Now suppose rank(B) = k for some B ∈ R^{m×n}. It follows that we can find orthonormal vectors x_1,...,x_{n-k} so null(B) = span{x_1,...,x_{n-k}}. A dimension argument shows that
    span{v_1,...,v_{k+1}} ∩ span{x_1,...,x_{n-k}} ≠ {0}.
Let z be a unit 2-norm vector in this intersection. Since Bz = 0 and Az = sum_{i=1}^{k+1} σ_i (v_i^T z) u_i, we have
    ||A - B||_2² ≥ ||(A - B)z||_2² = ||Az||_2² = sum_{i=1}^{k+1} σ_i² (v_i^T z)² ≥ σ_{k+1}²,
completing the proof of the theorem.  □

Theorem 2.5.3 says that the smallest singular value of A is the 2-norm distance of A to the set of all rank-deficient matrices. It also follows that the set of full rank matrices in R^{m×n} is both open and dense. Finally, if r_ε = rank(A, ε), then
    σ_1 ≥ ··· ≥ σ_{r_ε} > ε ≥ σ_{r_ε+1} ≥ ··· ≥ σ_p,   p = min{m,n}.
We have more to say about the numerical rank issue in §5.5 and §12.2.
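As an illustration of Theorem 2.5.3 and the ε-rank idea, the following NumPy sketch, an assumed example rather than part of the text, builds the truncated expansion A_k and checks that ||A - A_k||_2 = σ_{k+1}:

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((8, 5))
    U, s, Vt = np.linalg.svd(A, full_matrices=False)   # the thin SVD

    k = 2
    Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]         # sum of the first k rank-one terms
    print(np.linalg.norm(A - Ak, 2), s[k])             # equal up to roundoff

    eps = 1e-1
    print(np.sum(s > eps))     # rank(A, eps): the number of singular values above eps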
2.5.6 Unitary Matrices

Over the complex field the unitary matrices correspond to the orthogonal matrices. In particular, Q ∈ C^{n×n} is unitary if Q^H Q = Q Q^H = I_n. Unitary matrices preserve 2-norm. The SVD of a complex matrix involves unitary matrices. If A ∈ C^{m×n}, then there exist unitary matrices U ∈ C^{m×m} and V ∈ C^{n×n} such that
    U^H A V = diag(σ_1,...,σ_p) ∈ R^{m×n},   p = min{m,n}
where σ_1 ≥ σ_2 ≥ ··· ≥ σ_p ≥ 0.

Problems

P2.5.1 Show that if S is real and S^T = -S, then I - S is nonsingular and the matrix (I - S)^{-1}(I + S) is orthogonal. This is known as the Cayley transform of S.
P2.5.2 Show that a triangular orthogonal matrix is diagonal.
P2.5.3 Show that if Q = Q_1 + iQ_2 is unitary with Q_1, Q_2 ∈ R^{n×n}, then the 2n-by-2n real matrix
    Z = [ Q_1  -Q_2 ; Q_2  Q_1 ]
is orthogonal.
P2.5.4 Establish properties (2.5.3)-(2.5.9).
P2.5.5 Prove that
    σ_max(A) = max_{y ∈ R^m, x ∈ R^n, x,y ≠ 0} y^T A x / ( ||x||_2 ||y||_2 ).
P2.5.6 For the 2-by-2 matrix A = [ w  x ; y  z ], derive expressions for σ_max(A) and σ_min(A) as functions of w, x, y, and z.
P2.5.7 Show that any matrix in R^{m×n} is the limit of a sequence of full rank matrices.
P2.5.8 Show that if A ∈ R^{m×n} has rank n, then ||A(A^T A)^{-1}A^T||_2 = 1.
P2.5.9 What is the nearest rank deficient matrix to A = [ ... ] in the Frobenius norm?
P2.5.10 Show that if A ∈ R^{m×n} then ||A||_F ≤ sqrt(rank(A)) ||A||_2, thereby sharpening (2.3.7).

Notes and References for Sec. 2.5

Forsythe and Moler (1967) offer a good account of the SVD's role in the analysis of the Ax = b problem. Their proof of the decomposition is more traditional than ours in that it makes use of the eigenvalue theory for symmetric matrices. Historical SVD references include

E. Beltrami (1873). "Sulle Funzioni Bilineari," Giornale di Matematiche 11, 98-106.
C. Eckart and G. Young (1939). "A Principal Axis Transformation for Non-Hermitian Matrices," Bull. Amer. Math. Soc. 45, 118-121.
G.W. Stewart (1993). "On the Early History of the Singular Value Decomposition," SIAM Review 35, 551-566.

One of the most significant developments in scientific computation has been the increased use of the SVD in application areas that require the intelligent handling of matrix rank. The range of applications is impressive. One of the most interesting is

C.B. Moler and D. Morrison (1983). "Singular Value Analysis of Cryptograms," Amer. Math. Monthly 90, 78-87.

For generalizations of the SVD to infinite dimensional Hilbert space, see

I.C. Gohberg and M.G. Krein (1969). Introduction to the Theory of Linear Non-Self-Adjoint Operators, Amer. Math. Soc., Providence, RI.
F. Smithies (1970). Integral Equations, Cambridge University Press, Cambridge.

Reducing the rank of a matrix as in Theorem 2.5.3 when the perturbing matrix is constrained is discussed in

J.W. Demmel (1987). "The Smallest Perturbation of a Submatrix which Lowers the Rank and Constrained Total Least Squares Problems," SIAM J. Numer. Anal. 24, 199-206.
G.H. Golub, A. Hoffman, and G.W. Stewart (1988). "A Generalization of the Eckart-Young-Mirsky Approximation Theorem," Lin. Alg. and Its Applic. 88/89, 317-328.
G.A. Watson (1988). "The Smallest Perturbation of a Submatrix which Lowers the Rank of the Matrix," IMA J. Numer. Anal. 8, 295-304.

2.6 Projections and the CS Decomposition

If the object of a computation is to compute a matrix or a vector, then norms are useful for assessing the accuracy of the answer or for measuring progress during an iteration. If the object of a computation is to compute a subspace, then to make similar comments we need to be able to quantify the distance between two subspaces. Orthogonal projections are critical in this regard. After the elementary concepts are established we discuss the CS decomposition.
This is an SVD-like decomposition that is handy when having to compare a pair of subspaces. We begin with the notion of an orthogonal projection.

2.6.1 Orthogonal Projections

Let S ⊆ R^n be a subspace. P ∈ R^{n×n} is the orthogonal projection onto S if ran(P) = S, P^T = P, and P² = P. From this definition it is easy to show that if x ∈ R^n, then Px ∈ S and (I - P)x ∈ S^⊥.
If P_1 and P_2 are each orthogonal projections, then for any z ∈ R^n we have
    ||(P_1 - P_2)z||_2² = (P_1 z)^T(I - P_2)z + (P_2 z)^T(I - P_1)z.
If ran(P_1) = ran(P_2) = S, then the right-hand side of this expression is zero showing that the orthogonal projection for a subspace is unique. If the columns of V = [v_1,...,v_k] are an orthonormal basis for a subspace S, then it is easy to show that P = VV^T is the unique orthogonal projection onto S. Note that if v ∈ R^n, then P = vv^T / v^T v is the orthogonal projection onto S = span{v}.

2.6.2 SVD-Related Projections

There are several important orthogonal projections associated with the singular value decomposition. Suppose A = UΣV^T ∈ R^{m×n} is the SVD of A and that r = rank(A). If we have the U and V partitionings
    U = [ U_r  Ũ_r ]          V = [ V_r  Ṽ_r ]
        (r and m-r columns)       (r and n-r columns)
then
    V_r V_r^T = projection on to null(A)^⊥ = ran(A^T)
    Ṽ_r Ṽ_r^T = projection on to null(A)
    U_r U_r^T = projection on to ran(A)
    Ũ_r Ũ_r^T = projection on to ran(A)^⊥ = null(A^T).

2.6.3 Distance Between Subspaces

The one-to-one correspondence between subspaces and orthogonal projections enables us to devise a notion of distance between subspaces. Suppose S_1 and S_2 are subspaces of R^n and that dim(S_1) = dim(S_2). We define the distance between these two spaces by
    dist(S_1, S_2) = ||P_1 - P_2||_2                       (2.6.1)
where P_i is the orthogonal projection onto S_i. The distance between a pair of subspaces can be characterized in terms of the blocks of a certain orthogonal matrix.

Theorem 2.6.1 Suppose
    W = [ W_1  W_2 ]     Z = [ Z_1  Z_2 ]
        (k and n-k columns each)
are n-by-n orthogonal matrices. If S_1 = ran(W_1) and S_2 = ran(Z_1), then
    dist(S_1, S_2) = ||W_1^T Z_2||_2 = ||Z_1^T W_2||_2.

Proof.
    dist(S_1, S_2) = ||W_1 W_1^T - Z_1 Z_1^T||_2 = ||W^T(W_1 W_1^T - Z_1 Z_1^T)Z||_2
                   = || [ 0   W_1^T Z_2 ; -W_2^T Z_1   0 ] ||_2.
Note that the matrices W_1^T Z_2 and W_2^T Z_1 are submatrices of the orthogonal matrix
    Q = [ Q_11  Q_12 ; Q_21  Q_22 ] = [ W_1^T Z_1   W_1^T Z_2 ; W_2^T Z_1   W_2^T Z_2 ] ∈ R^{n×n}.
Our goal is to show that ||Q_21||_2 = ||Q_12||_2. Since Q is orthogonal it follows from
    Q [ x ; 0 ] = [ Q_11 x ; Q_21 x ]
that 1 = ||Q_11 x||_2² + ||Q_21 x||_2² for all unit 2-norm x ∈ R^k. Thus,
    ||Q_21||_2² = max_{||x||_2 = 1} ||Q_21 x||_2² = 1 - min_{||x||_2 = 1} ||Q_11 x||_2² = 1 - σ_min(Q_11)².
Analogously, by working with Q^T (which is also orthogonal) it is possible to show that
    ||Q_12^T||_2² = 1 - σ_min(Q_11)²,
and therefore ||Q_12||_2² = 1 - σ_min(Q_11)². Thus, ||Q_21||_2 = ||Q_12||_2.  □

Note that if S_1 and S_2 are subspaces in R^n with the same dimension, then 0 ≤ dist(S_1, S_2) ≤ 1. The distance is zero if S_1 = S_2 and one if S_1 ∩ S_2^⊥ ≠ {0}.
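To make (2.6.1) and Theorem 2.6.1 concrete, here is a small NumPy sketch, an assumed example with random subspaces, that computes dist(S_1, S_2) both from the projections and from the block W_1^T Z_2:

    import numpy as np

    rng = np.random.default_rng(3)
    n, k = 6, 2
    W, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthogonal W = [W1 W2]
    Z, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthogonal Z = [Z1 Z2]
    W1, W2 = W[:, :k], W[:, k:]
    Z1, Z2 = Z[:, :k], Z[:, k:]

    P1 = W1 @ W1.T          # orthogonal projection onto S1 = ran(W1)
    P2 = Z1 @ Z1.T          # orthogonal projection onto S2 = ran(Z1)
    print(np.linalg.norm(P1 - P2, 2))     # dist(S1, S2) via (2.6.1)
    print(np.linalg.norm(W1.T @ Z2, 2))   # the same value via Theorem 2.6.1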
A more refined analysis of the blocks of the Q matrix above sheds more light on the difference between a pair of subspaces. This requires a special SVD-like decomposition for orthogonal matrices.

2.6.4 The CS Decomposition

The blocks of an orthogonal matrix partitioned into 2-by-2 form have highly related SVDs. This is the gist of the CS decomposition. We prove a very useful special case first.

Theorem 2.6.2 (The CS Decomposition (Thin Version)) Consider the matrix
    Q = [ Q_1 ; Q_2 ],   Q_1 ∈ R^{n1×n}, Q_2 ∈ R^{n2×n}, n1 ≥ n,
where Q has orthonormal columns. Then there exist orthogonal matrices U_1 ∈ R^{n1×n1}, U_2 ∈ R^{n2×n2}, and V_1 ∈ R^{n×n} such that
    [ U_1  0 ; 0  U_2 ]^T [ Q_1 ; Q_2 ] V_1 = [ C ; S ]
where
    C = diag(cos(θ_1),...,cos(θ_n)),   S = diag(sin(θ_1),...,sin(θ_n)),
and 0 ≤ θ_1 ≤ ··· ≤ θ_n ≤ π/2.

Proof. Since ||Q_1||_2 ≤ ||Q||_2 = 1, the singular values of Q_1 are all in the interval [0,1]. Let
    U_1^T Q_1 V_1 = C = diag(c_1,...,c_n)
be the SVD of Q_1 where we assume
    1 = c_1 = ··· = c_k > c_{k+1} ≥ ··· ≥ c_n ≥ 0.
To complete the proof of the theorem we must construct the orthogonal matrix U_2. If
    Q_2 V_1 = [ W_1  W_2 ]   (k and n-k columns)
then
    [ U_1  0 ; 0  I ]^T [ Q_1 ; Q_2 ] V_1 = [ C ; W_1  W_2 ].
Since the columns of this matrix have unit 2-norm, W_1 = 0. The columns of W_2 are nonzero and mutually orthogonal because
    W_2^T W_2 = I_{n-k} - diag(c_{k+1}²,...,c_n²) = diag(1 - c_{k+1}²,...,1 - c_n²).
If s_k = sqrt(1 - c_k²) for k = 1:n, then the columns of
    Z = W_2 diag(1/s_{k+1},...,1/s_n)
are orthonormal. By Theorem 2.5.1 there exists an orthogonal matrix U_2 ∈ R^{n2×n2} with U_2(:, k+1:n) = Z. It is easy to verify that
    U_2^T Q_2 V_1 = diag(s_1,...,s_n) = S.
Since c_k² + s_k² = 1 for k = 1:n, it follows that these quantities are the required cosines and sines.  □

Using the same sort of techniques it is possible to prove the following more general version of the decomposition:

Theorem 2.6.3 (CS Decomposition (General Version)) If
    Q = [ Q_11  Q_12 ; Q_21  Q_22 ]
is a 2-by-2 (arbitrary) partitioning of an n-by-n orthogonal matrix, then there exist orthogonal matrices U = diag(U_1, U_2) and V = diag(V_1, V_2) such that U^T Q V is built from identity blocks, zero blocks, and the square diagonal matrices C = diag(c_1,...,c_p) and S = diag(s_1,...,s_p) with 0 < c_i, s_i < 1 and C² + S² = I.

Proof. See Paige and Saunders (1981) for details. We have suppressed the dimensions of the zero submatrices, some of which may be empty.  □

The essential message of the decomposition is that the SVDs of the Q_ij are highly related.

Example 2.6.1 The matrix Q = [ ... ] illustrates the decomposition. The angles associated with the cosines and sines turn out to be very important in a number of applications. See §12.4.

Problems

P2.6.1 Show that if P is an orthogonal projection, then Q = I - 2P is orthogonal.
P2.6.2 What are the singular values of an orthogonal projection?
P2.6.3 Suppose S_1 = span{x} and S_2 = span{y}, where x and y are unit 2-norm vectors in R². Working only with the definition of dist(·,·), show that dist(S_1, S_2) = sqrt(1 - (x^T y)²), verifying that the distance between S_1 and S_2 equals the sine of the angle between x and y.

Notes and References for Sec. 2.6

The following papers discuss various aspects of the CS decomposition:

C. Davis and W. Kahan (1970). "The Rotation of Eigenvectors by a Perturbation III," SIAM J. Num. Anal. 7, 1-46.
G.W. Stewart (1977). "On the Perturbation of Pseudo-Inverses, Projections and Linear Least Squares Problems," SIAM Review 19, 634-662.
C.C. Paige and M. Saunders (1981). "Toward a Generalized Singular Value Decomposition," SIAM J. Num. Anal. 18, 398-405.
C.C. Paige and M. Wei (1994). "History and Generality of the CS Decomposition," Lin. Alg. and Its Applic. 208/209, 303-326.

See §8.7 for some computational details. For a deeper geometrical understanding of the CS decomposition and the notion of distance between subspaces, see

T.A. Arias, A. Edelman, and S. Smith (1996). "Conjugate Gradient and Newton's Method on the Grassmann and Stiefel Manifolds," to appear in SIAM J. Matrix Anal. Appl.

2.7 The Sensitivity of Square Systems

We now use some of the tools developed in previous sections to analyze the linear system problem Ax = b where A ∈ R^{n×n} is nonsingular and b ∈ R^n. Our aim is to examine how perturbations in A and b affect the solution x. A much more detailed treatment may be found in Higham (1996).

2.7.1 An SVD Analysis

If
    A = sum_{i=1}^{n} σ_i u_i v_i^T = UΣV^T
is the SVD of A, then
    x = A^{-1}b = (UΣV^T)^{-1}b = sum_{i=1}^{n} (u_i^T b / σ_i) v_i.
This expansion shows that small changes in A or b can induce relatively large changes in x if σ_n is small.
It should come as no surprise that the magnitude of σ_n should have a bearing on the sensitivity of the Ax = b problem when we recall from Theorem 2.5.3 that σ_n is the distance from A to the set of singular matrices. As the matrix of coefficients approaches this set, it is intuitively clear that the solution x should be increasingly sensitive to perturbations.
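The following NumPy sketch, an assumed illustration, shows this effect: when σ_n is tiny, a small change in b along u_n produces a huge relative change in x.

    import numpy as np

    rng = np.random.default_rng(4)
    U, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    V, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    A = U @ np.diag([1.0, 1.0, 1e-8]) @ V.T      # sigma_min(A) = 1e-8

    b  = U @ np.array([1.0, 1.0, 0.0])
    db = 1e-6 * U[:, 2]                          # tiny perturbation along u_3

    x  = np.linalg.solve(A, b)
    xp = np.linalg.solve(A, b + db)
    # The 1e-6 perturbation is amplified by 1/sigma_min, giving a relative change of about 70.
    print(np.linalg.norm(xp - x) / np.linalg.norm(x))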
2.7.2 Condition

A precise measure of linear system sensitivity can be obtained by considering the parameterized system
    (A + εF) x(ε) = b + εf,   x(0) = x,
where F ∈ R^{n×n} and f ∈ R^n. If A is nonsingular, then it is clear that x(ε) is differentiable in a neighborhood of zero. Moreover, ẋ(0) = A^{-1}(f - Fx) and thus, the Taylor series expansion for x(ε) has the form
    x(ε) = x + εẋ(0) + O(ε²).
Using any vector norm and consistent matrix norm we obtain
    ||x(ε) - x|| / ||x|| ≤ |ε| ||A^{-1}|| { ||f||/||x|| + ||F|| } + O(ε²).    (2.7.2)
For square matrices A define the condition number κ(A) by
    κ(A) = ||A|| ||A^{-1}||                                     (2.7.3)
with the convention that κ(A) = ∞ for singular A. Using the inequality ||b|| = ||Ax|| ≤ ||A|| ||x|| it follows from (2.7.2) that
    ||x(ε) - x|| / ||x|| ≤ κ(A)(ρ_A + ρ_b) + O(ε²)              (2.7.4)
where
    ρ_A = |ε| ||F|| / ||A||   and   ρ_b = |ε| ||f|| / ||b||
represent the relative errors in A and b, respectively. Thus, the relative error in x can be κ(A) times the relative error in A and b. In this sense, the condition number κ(A) quantifies the sensitivity of the Ax = b problem.
Note that κ(·) depends on the underlying norm and subscripts are used accordingly, e.g.,
    κ_2(A) = ||A||_2 ||A^{-1}||_2 = σ_1(A) / σ_n(A).
Thus, the 2-norm condition of a matrix A measures the elongation of the hyperellipsoid { Ax : ||x||_2 = 1 }.
We mention two other characterizations of the condition number. For p-norm condition numbers, we have
    1/κ_p(A) = min_{A+ΔA singular} ||ΔA||_p / ||A||_p.          (2.7.6)
This result may be found in Kahan (1966) and shows that κ_p(A) measures the relative p-norm distance from A to the set of singular matrices. For any norm, we also have
    κ(A) = lim_{ε→0+} sup_{||ΔA|| ≤ ε||A||} ( ||(A+ΔA)^{-1} - A^{-1}|| / ε ) / ||A^{-1}||.    (2.7.7)
This imposing result merely says that the condition number is a normalized Frechet derivative of the map A → A^{-1}. Further details may be found in Rice (1966b). Recall that we were initially led to κ(A) through differentiation.
If κ(A) is large, then A is said to be an ill-conditioned matrix. Note that this is a norm-dependent property. (It also depends upon the definition of "large.") However, any two condition numbers κ_α(·) and κ_β(·) on R^{n×n} are equivalent in that constants c_1 and c_2 can be found for which
    c_1 κ_α(A) ≤ κ_β(A) ≤ c_2 κ_α(A),   A ∈ R^{n×n}.
For example, on R^{n×n} we have
    (1/n) κ_2(A) ≤ κ_1(A) ≤ n κ_2(A)
    (1/n) κ_∞(A) ≤ κ_2(A) ≤ n κ_∞(A)                            (2.7.8)
    (1/n²) κ_1(A) ≤ κ_∞(A) ≤ n² κ_1(A).
Thus, if a matrix is ill-conditioned in the α-norm, it is ill-conditioned in the β-norm modulo the constants above.
For any of the p-norms, we have κ_p(A) ≥ 1. Matrices with small condition numbers are said to be well-conditioned. In the 2-norm, orthogonal matrices are perfectly conditioned in that κ_2(Q) = 1 if Q is orthogonal.

2.7.3 Determinants and Nearness to Singularity

It is natural to consider how well determinant size measures ill-conditioning. If det(A) = 0 is equivalent to singularity, is det(A) ≈ 0 equivalent to near singularity? Unfortunately, there is little correlation between det(A) and the condition of Ax = b. For example, the matrix B_n defined by
    B_n = [ 1 -1 ··· -1 ; 0 1 ··· -1 ; ··· ; 0 0 ··· 1 ]        (2.7.9)
(unit upper triangular with -1 everywhere above the diagonal) has determinant 1, but κ_∞(B_n) = n 2^{n-1}. On the other hand, a very well conditioned matrix can have a very small determinant. For example,
    D_n = diag(10^{-1},...,10^{-1}) ∈ R^{n×n}
satisfies κ_p(D_n) = 1 although det(D_n) = 10^{-n}.
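A quick NumPy check of these two examples, a sketch under the definitions above:

    import numpy as np

    n = 20
    B = np.triu(-np.ones((n, n))) + 2 * np.eye(n)          # 1 on the diagonal, -1 above it
    print(np.linalg.det(B))                                # 1.0
    print(np.linalg.cond(B, np.inf), n * 2.0**(n - 1))     # both about 1.05e7

    D = 0.1 * np.eye(n)
    print(np.linalg.det(D), np.linalg.cond(D))             # 1e-20, yet condition number 1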
2.7.4 A Rigorous Norm Bound

Recall that the derivation of (2.7.4) was valuable because it highlighted the connection between κ(A) and the rate of change of x(ε) at ε = 0. However, it is a little unsatisfying because it is contingent on ε being "small enough" and because it sheds no light on the size of the O(ε²) term. In this and the next subsection we develop some additional Ax = b perturbation theorems that are completely rigorous.
We first establish a useful lemma that indicates in terms of κ(A) when we can expect a perturbed system to be nonsingular.

Lemma 2.7.1 Suppose
    Ax = b,            A ∈ R^{n×n}, 0 ≠ b ∈ R^n
    (A + ΔA)y = b + Δb,  ΔA ∈ R^{n×n}, Δb ∈ R^n
with ||ΔA|| ≤ ε||A|| and ||Δb|| ≤ ε||b||. If εκ(A) = r < 1, then A + ΔA is nonsingular and
    ||y|| / ||x|| ≤ (1 + r) / (1 - r).

Proof. Since ||A^{-1}ΔA|| ≤ ε||A^{-1}|| ||A|| = r < 1 it follows from Theorem 2.3.4 that (A + ΔA) is nonsingular. Using Lemma 2.3.3 and the equality (I + A^{-1}ΔA)y = x + A^{-1}Δb we find
    ||y|| ≤ ||(I + A^{-1}ΔA)^{-1}|| ( ||x|| + ε||A^{-1}|| ||b|| )
         ≤ (1/(1 - r)) ( ||x|| + ε||A^{-1}|| ||b|| ).
Since ||b|| = ||Ax|| ≤ ||A|| ||x|| it follows that ε||A^{-1}|| ||b|| ≤ r||x|| and so
    ||y|| ≤ ((1 + r)/(1 - r)) ||x||.  □

We are now set to establish a rigorous Ax = b perturbation bound.

Theorem 2.7.2 If the conditions of Lemma 2.7.1 hold, then
    ||y - x|| / ||x|| ≤ (2ε / (1 - r)) κ(A).                    (2.7.10)

Proof. Since
    y - x = A^{-1}Δb - A^{-1}ΔA y                               (2.7.11)
we have ||y - x|| ≤ ε||A^{-1}|| ||b|| + ε||A^{-1}|| ||A|| ||y|| and so
    ||y - x|| / ||x|| ≤ εκ(A) ||b|| / (||A|| ||x||) + εκ(A) ||y||/||x||
                     ≤ εκ(A) ( 1 + (1 + r)/(1 - r) ) = (2ε/(1 - r)) κ(A).  □

Example 2.7.1 The Ax = b problem
    [ 1  0 ; 0  10^{-6} ] [ x_1 ; x_2 ] = [ 1 ; 10^{-6} ]
has solution x = (1, 1)^T and condition κ_∞(A) = 10^6. If Δb = (10^{-6}, 0)^T, ΔA = 0, and (A + ΔA)y = b + Δb, then y = (1 + 10^{-6}, 1)^T and the inequality (2.7.10) says
    ||y - x||_∞ / ||x||_∞ = 10^{-6} ≪ κ_∞(A) 10^{-6} = 1.
Thus, the upper bound in (2.7.10) can be a gross overestimate of the error induced by the perturbation. On the other hand, if Δb = (0, 10^{-6})^T, ΔA = 0, and (A + ΔA)y = b + Δb, then this inequality says
    1 = ||y - x||_∞ / ||x||_∞ ≤ κ_∞(A) 10^{-6} = 1.
Thus, there are perturbations for which the bound in (2.7.10) is essentially attained.

2.7.5 Some Rigorous Componentwise Bounds

We conclude this section by showing that a more refined perturbation theory is possible if componentwise perturbation bounds are in effect and if we make use of the absolute value notation.

Theorem 2.7.3 Suppose
    Ax = b,            A ∈ R^{n×n}, 0 ≠ b ∈ R^n
    (A + ΔA)y = b + Δb,  ΔA ∈ R^{n×n}, Δb ∈ R^n
and that |ΔA| ≤ ε|A| and |Δb| ≤ ε|b|. If εκ_∞(A) = r < 1, then (A + ΔA) is nonsingular and
    ||y - x||_∞ / ||x||_∞ ≤ (2ε / (1 - r)) || |A^{-1}| |A| ||_∞.

Proof. Since ||ΔA||_∞ ≤ ε||A||_∞ and ||Δb||_∞ ≤ ε||b||_∞ the conditions of Lemma 2.7.1 are satisfied in the infinity norm. This implies that A + ΔA is nonsingular and
    ||y||_∞ / ||x||_∞ ≤ (1 + r)/(1 - r).
Now using (2.7.11) we find
    |y - x| ≤ |A^{-1}| |Δb| + |A^{-1}| |ΔA| |y| ≤ ε|A^{-1}| |b| + ε|A^{-1}| |A| |y| ≤ ε|A^{-1}| |A| ( |x| + |y| ),
and so, taking ∞-norms,
    ||y - x||_∞ ≤ ε || |A^{-1}| |A| ||_∞ ( ||x||_∞ + ||y||_∞ ) ≤ (2ε/(1 - r)) || |A^{-1}| |A| ||_∞ ||x||_∞.  □

Problems

P2.7.1 Show that there exist ΔA and Δb such that (A + ΔA)y = b + Δb with |ΔA| ≤ ...
P2.7.2 Relate ... to the 2-norm condition of the matrices B = [ ... ] and C = [ ... ].

Notes and References for Sec. 2.7

The condition concept is thoroughly investigated in

J. Rice (1966). "A Theory of Condition," SIAM J. Num. Anal. 3, 287-310.
W. Kahan (1966). "Numerical Linear Algebra," Canadian Math. Bull. 9, 757-801.

References for componentwise perturbation theory include

W. Oettli and W. Prager (1964). "Compatibility of Approximate Solutions of Linear Equations with Given Error Bounds for Coefficients and Right Hand Sides," Numer. Math. 6, 405-409.
J.E. Cope and B.W. Rust (1979). "Bounds on Solutions of Linear Systems with Inaccurate Data," SIAM J. Num. Anal. 16, 950-963.
R.D. Skeel (1979). "Scaling for Numerical Stability in Gaussian Elimination," J. ACM 26, 494-526.
J.W. Demmel (1992). "The Componentwise Distance to the Nearest Singular Matrix," SIAM J. Matrix Anal. Appl. 13, 10-19.
D.J. Higham and N.J. Higham (1992). "Componentwise Perturbation Theory for Linear Systems with Multiple Right-Hand Sides," Lin. Alg. and Its Applic. 174, 111-129.
N.J. Higham (1994). "A Survey of Componentwise Perturbation Theory in Numerical Linear Algebra," in Mathematics of Computation 1943-1993: A Half Century of Computational Mathematics, W. Gautschi (ed.), Volume 48 of Proceedings of Symposia in Applied Mathematics, American Mathematical Society, Providence, Rhode Island.
S. Chandrasekaran and I.C.F. Ipsen (1995). "On the Sensitivity of Solution Components in Linear Systems of Equations," SIAM J. Matrix Anal. Appl. 16, 93-112.

The reciprocal of the condition number measures how near a given Ax = b problem is to singularity. The importance of knowing how near a given problem is to a difficult or insoluble problem has come to be appreciated in many computational settings. See

A. Laub (1985). "Numerical Linear Algebra Aspects of Control Design Computations," IEEE Trans. Auto. Cont. AC-30, 97-108.
J.L. Barlow (1986). "On the Smallest Positive Singular Value of an M-Matrix with Applications to Ergodic Markov Chains," SIAM J. Alg. and Disc. Struct. 7, 414-424.
J.W. Demmel (1987). "On the Distance to the Nearest Ill-Posed Problem," Numer. Math. 51, 251-289.
J.W. Demmel (1988). "The Probability that a Numerical Analysis Problem is Difficult," Math. Comp. 50, 449-480.
N.J. Higham (1989). "Matrix Nearness Problems and Applications," in Applications of Matrix Theory, M.J.C. Gover and S. Barnett (eds.), Oxford University Press, Oxford, UK, 1-27.

Chapter 3
General Linear Systems

§3.1 Triangular Systems
§3.2 The LU Factorization
§3.3 Roundoff Analysis of Gaussian Elimination
§3.4 Pivoting
§3.5 Improving and Estimating Accuracy

The problem of solving a linear system Ax = b is central in scientific computation. In this chapter we focus on the method of Gaussian elimination, the algorithm of choice when A is square, dense, and unstructured. When A does not fall into this category, then the algorithms of Chapters 4, 5, and 10 are of interest. Some parallel Ax = b solvers are discussed in Chapter 6.
We motivate the method of Gaussian elimination in §3.1 by discussing the ease with which triangular systems can be solved. The conversion of a general system to triangular form via Gauss transformations is then presented in §3.2 where the "language" of matrix factorizations is introduced. Unfortunately, the derived method behaves very poorly on a nontrivial class of problems. Our error analysis in §3.3 pinpoints the difficulty and motivates §3.4, where the concept of pivoting is introduced. In the final section we comment upon the important practical issues associated with scaling, iterative improvement, and condition estimation.

Before You Begin

Chapter 1, §§2.1-2.5, and §2.7 are assumed. Complementary references include Forsythe and Moler (1967), Stewart (1973), Hager (1988), Watkins (1991), Ciarlet (1992), Datta (1995), Higham (1996), Trefethen and Bau (1996), and Demmel (1996). Some MATLAB functions important to this chapter are lu, cond, rcond, and the "backslash" operator "\". LAPACK connections include routines for solving AX = B and A^T X = B via PA = LU, solutions with error bounds and a condition estimate, and equilibration.

3.1 Triangular Systems

Traditional factorization methods for linear systems involve the conversion of the given square system to a triangular system that has the same solution. This section is about the solution of triangular systems.

3.1.1 Forward Substitution

Consider the following 2-by-2 lower triangular system:
    [ l_11   0   ] [ x_1 ]   [ b_1 ]
    [ l_21  l_22 ] [ x_2 ] = [ b_2 ].
If l_11 l_22 ≠ 0, then the unknowns can be determined sequentially:
    x_1 = b_1 / l_11
    x_2 = (b_2 - l_21 x_1) / l_22.
This is the 2-by-2 version of an algorithm known as forward substitution. The general procedure is obtained by solving the ith equation in Lx = b for x_i:
    x_i = ( b_i - sum_{j=1}^{i-1} l_{ij} x_j ) / l_{ii}.
If this is evaluated for i = 1:n, then a complete specification of x is obtained. Note that at the ith stage the dot product of L(i,1:i-1) and x(1:i-1) is required. Since b_i is involved only in the formula for x_i, the former may be overwritten by the latter:

Algorithm 3.1.1 (Forward Substitution: Row Version) If L ∈ R^{n×n} is lower triangular and b ∈ R^n, then this algorithm overwrites b with the solution to Lx = b. L is assumed to be nonsingular.
    b(1) = b(1)/L(1,1)
    for i = 2:n
        b(i) = ( b(i) - L(i,1:i-1) b(1:i-1) ) / L(i,i)
    end
This algorithm requires n² flops. Note that L is accessed by row. The computed solution x̂ satisfies
    (L + F)x̂ = b,   |F| ≤ nu|L| + O(u²).                        (3.1.1)
For a proof, see Higham (1996). It says that the computed solution exactly satisfies a slightly perturbed system. Moreover, each entry in the perturbing matrix F is small relative to the corresponding element of L.

3.1.2 Back Substitution

The analogous algorithm for upper triangular systems Ux = b is called back-substitution. The recipe for x_i is prescribed by
    x_i = ( b_i - sum_{j=i+1}^{n} u_{ij} x_j ) / u_{ii}
and once again b_i can be overwritten by x_i.

Algorithm 3.1.2 (Back Substitution: Row Version) If U ∈ R^{n×n} is upper triangular and b ∈ R^n, then the following algorithm overwrites b with the solution to Ux = b. U is assumed to be nonsingular.
    b(n) = b(n)/U(n,n)
    for i = n-1:-1:1
        b(i) = ( b(i) - U(i,i+1:n) b(i+1:n) ) / U(i,i)
    end
This algorithm requires n² flops and accesses U by row. The computed solution x̂ obtained by the algorithm can be shown to satisfy
    (U + F)x̂ = b,   |F| ≤ nu|U| + O(u²).                        (3.1.2)

3.1.3 Column Oriented Versions

Column oriented versions of the above procedures can be obtained by reversing loop orders. To understand what this means from the algebraic point of view, consider forward substitution. Once x_1 is resolved, it can be removed from equations 2 through n and we proceed with the reduced system L(2:n,2:n)x(2:n) = b(2:n) - x(1)L(2:n,1). We then compute x_2 and remove it from equations 3 through n, etc. Thus, if this approach is applied to
    [ ... ] x = [ ... ],
we find x_1 = 3 and then deal with the 2-by-2 system
    [ ... ] [ x_2 ; x_3 ] = [ ... ].
Here is the complete procedure with overwriting.

Algorithm 3.1.3 (Forward Substitution: Column Version) If L ∈ R^{n×n} is lower triangular and b ∈ R^n, then this algorithm overwrites b with the solution to Lx = b. L is assumed to be nonsingular.
    for j = 1:n-1
        b(j) = b(j)/L(j,j)
        b(j+1:n) = b(j+1:n) - b(j) L(j+1:n,j)
    end
    b(n) = b(n)/L(n,n)

It is also possible to obtain a column-oriented saxpy procedure for back substitution.

Algorithm 3.1.4 (Back Substitution: Column Version) If U ∈ R^{n×n} is upper triangular and b ∈ R^n, then this algorithm overwrites b with the solution to Ux = b. U is assumed to be nonsingular.
    for j = n:-1:2
        b(j) = b(j)/U(j,j)
        b(1:j-1) = b(1:j-1) - b(j) U(1:j-1,j)
    end
    b(1) = b(1)/U(1,1)

Note that the dominant operation in both Algorithms 3.1.3 and 3.1.4 is the saxpy operation. The roundoff behavior of these saxpy implementations is essentially the same as for the dot product versions.
The accuracy of a computed solution to a triangular system is often surprisingly good. See Higham (1996).
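A direct NumPy transcription of the row-oriented Algorithms 3.1.1 and 3.1.2 might look as follows; this is a sketch, and the function names are our own.

    import numpy as np

    def forward_subst(L, b):
        # Row-oriented forward substitution: returns the solution of Lx = b.
        x = b.astype(float).copy()
        n = len(b)
        x[0] = x[0] / L[0, 0]
        for i in range(1, n):
            x[i] = (x[i] - L[i, :i] @ x[:i]) / L[i, i]
        return x

    def back_subst(U, b):
        # Row-oriented back substitution for Ux = b.
        x = b.astype(float).copy()
        n = len(b)
        x[n - 1] = x[n - 1] / U[n - 1, n - 1]
        for i in range(n - 2, -1, -1):
            x[i] = (x[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
        return x

    L = np.array([[2.0, 0.0], [1.0, 5.0]])
    print(forward_subst(L, np.array([2.0, 7.0])))   # [1.  1.2]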
3.1.4 Multiple Right Hand Sides

Consider the problem of computing a solution X ∈ R^{n×q} to LX = B where L ∈ R^{n×n} is lower triangular and B ∈ R^{n×q}. This is the multiple right hand side forward substitution problem. We show that such a problem can be solved by a block algorithm that is rich in matrix multiplication assuming that q and n are large enough. This turns out to be important in subsequent sections where various block factorization schemes are discussed. We mention that although we are considering here just the lower triangular problem, everything we say applies to the upper triangular case as well.
To develop a block forward substitution algorithm we partition the equation LX = B as follows:

    [ L_11    0    ···    0  ] [ X_1 ]   [ B_1 ]
    [ L_21   L_22  ···    0  ] [ X_2 ] = [ B_2 ]                 (3.1.3)
    [  ···                   ] [ ··· ]   [ ··· ]
    [ L_N1   L_N2  ···  L_NN ] [ X_N ]   [ B_N ]

Assume that the diagonal blocks are square. Paralleling the development of Algorithm 3.1.3, we solve the system L_11 X_1 = B_1 for X_1 and then remove X_1 from block equations 2 through N:

    [ L_22   0   ···    0  ] [ X_2 ]   [ B_2 - L_21 X_1 ]
    [  ···                 ] [ ··· ] = [       ···       ]
    [ L_N2  ···       L_NN ] [ X_N ]   [ B_N - L_N1 X_1 ]

Continuing in this way we obtain the following block saxpy forward elimination scheme:

    for j = 1:N
        Solve L_jj X_j = B_j
        for i = j+1:N
            B_i = B_i - L_ij X_j                                 (3.1.4)
        end
    end

Notice that the i-loop oversees a single block saxpy update of the form

    [ B_{j+1} ]   [ B_{j+1} ]   [ L_{j+1,j} ]
    [   ···   ] = [   ···   ] - [    ···    ] X_j.
    [   B_N   ]   [   B_N   ]   [  L_{N,j}  ]

For this to be handled as a matrix multiplication in a given architecture it is clear that the blocking in (3.1.3) must give sufficiently "big" X_j. Let us assume that this is the case if each X_j has at least r rows. This can be accomplished if N = ceil(n/r) and X_1,...,X_{N-1} ∈ R^{r×q}.

3.1.5 The Level-3 Fraction

It is handy to adopt a measure that quantifies the amount of matrix multiplication in a given algorithm. To this end we define the level-3 fraction of an algorithm to be the fraction of flops that occur in the context of matrix multiplication. We call such flops level-3 flops.
Let us determine the level-3 fraction for (3.1.4) with the simplifying assumption that n = rN. (The same conclusions hold with the unequal blocking described above.) Because there are N applications of r-by-r forward elimination (the level-2 portion of the computation) and n² flops overall, the level-3 fraction is approximately given by
    1 - Nr²/n² = 1 - 1/N.
Thus, for large N almost all flops are level-3 flops and it makes sense to choose N as large as possible subject to the constraint that the underlying architecture can achieve a high level of performance when processing block saxpy's of width at least r = n/N.
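Here is a small Python sketch of the block scheme (3.1.4), with SciPy's triangular solver standing in for the r-by-r forward eliminations; the blocking, the test matrices, and the function name are illustrative assumptions.

    import numpy as np
    from scipy.linalg import solve_triangular

    def block_forward_subst(L, B, r):
        # Solve L X = B (L lower triangular) by the block saxpy scheme (3.1.4).
        n, q = B.shape
        X = B.astype(float).copy()
        for j0 in range(0, n, r):
            j1 = min(j0 + r, n)
            # Diagonal block system L_jj X_j = B_j (the level-2 portion).
            X[j0:j1] = solve_triangular(L[j0:j1, j0:j1], X[j0:j1], lower=True)
            # Level-3 update of the remaining block right-hand sides.
            X[j1:] -= L[j1:, j0:j1] @ X[j0:j1]
        return X

    n, q, r = 6, 3, 2
    L = np.tril(np.random.default_rng(6).standard_normal((n, n))) + 3 * np.eye(n)
    B = np.random.default_rng(7).standard_normal((n, q))
    print(np.allclose(L @ block_forward_subst(L, B, r), B))   # True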
3.1.6 Non-square Triangular System Solving

The problem of solving nonsquare, m-by-n triangular systems deserves some mention. Consider first the lower triangular case when m ≥ n, i.e.,
    [ L_11 ; L_21 ] x = [ b_1 ; b_2 ],   L_11 ∈ R^{n×n}, L_21 ∈ R^{(m-n)×n}, b_1 ∈ R^n, b_2 ∈ R^{m-n}.
Assume that L_11 is lower triangular and nonsingular. If we apply forward elimination to L_11 x = b_1 then x solves the system provided L_21(L_11^{-1} b_1) = b_2. Otherwise, there is no solution to the overall system. In such a case least squares minimization may be appropriate. See Chapter 5.
Now consider the lower triangular system Lx = b when the number of columns n exceeds the number of rows m. In this case apply forward substitution to the square system L(1:m,1:m)x(1:m) = b and prescribe an arbitrary value for x(m+1:n). See §5.7 for additional comments on systems that have more unknowns than equations.
The handling of nonsquare upper triangular systems is similar. Details are left to the reader.

3.1.7 Unit Triangular Systems

A unit triangular matrix is a triangular matrix with ones on the diagonal. Many of the triangular matrix computations that follow have this added bit of structure. It clearly poses no difficulty in the above procedures.

3.1.8 The Algebra of Triangular Matrices

For future reference we list a few properties about products and inverses of triangular and unit triangular matrices.

• The inverse of an upper (lower) triangular matrix is upper (lower) triangular.
• The product of two upper (lower) triangular matrices is upper (lower) triangular.
• The inverse of a unit upper (lower) triangular matrix is unit upper (lower) triangular.
• The product of two unit upper (lower) triangular matrices is unit upper (lower) triangular.

Problems

P3.1.1 Give an algorithm for computing a nonzero z ∈ R^n such that Uz = 0 where U ∈ R^{n×n} is upper triangular with u_{nn} = 0 and u_{11} ··· u_{n-1,n-1} ≠ 0.
P3.1.2 Discuss how the determinant of a square triangular matrix could be computed with minimum risk of overflow and underflow.
P3.1.3 Rewrite Algorithm 3.1.4 given that U is stored by column in a length n(n+1)/2 array.
P3.1.4 Write a detailed version of (3.1.4). Do not assume that r divides n.
P3.1.5 Suppose S, T ∈ R^{n×n} are upper triangular and that (ST - λI)x = b is a nonsingular system. Give an O(n²) algorithm for computing x. Note that the explicit formation of ST - λI requires O(n³) flops. Hint: Write
    S = [ S_1  ... ; 0  ... ],   T = [ T_1  ... ; 0  ... ]
and show that ... solves (S_1 T_1 - λI)x_1 = b_1 - ... Observe that x_1 and w_1 = T_1 x_1 each require O(n) flops.
P3.1.7 Suppose the matrices R_1,...,R_p ∈ R^{n×n} are all upper triangular. Give an O(pn²) algorithm for solving the system (R_1 ··· R_p - λI)x = b assuming that the matrix of coefficients is nonsingular. Hint: Generalize the solution to the previous problem.

Notes and References for Sec. 3.1

The accuracy of triangular system solvers is analyzed in

N.J. Higham (1989). "The Accuracy of Solutions to Triangular Systems," SIAM J. Num. Anal. 26, 1252-1265.

3.2 The LU Factorization

As we have just seen, triangular systems are "easy" to solve. The idea behind Gaussian elimination is to convert a given system Ax = b to an equivalent triangular system. The conversion is achieved by taking appropriate linear combinations of the equations. For example, in the system
    3x_1 + 5x_2 = 9
    6x_1 + 7x_2 = 4
if we multiply the first equation by 2 and subtract it from the second we obtain
    3x_1 + 5x_2 = 9
         - 3x_2 = -14.
This is n = 2 Gaussian elimination. Our objective in this section is to give a complete specification of this central procedure and to describe what it does in the language of matrix factorizations. This means showing that the algorithm computes a unit lower triangular matrix L and an upper triangular matrix U so that A = LU, e.g.,
    [ 3  5 ]   [ 1  0 ] [ 3   5 ]
    [ 6  7 ] = [ 2  1 ] [ 0  -3 ].
The solution to the original Ax = b problem is then found by a two step triangular solve process:
    Ly = b,  Ux = y   ⇒   Ax = LUx = Ly = b.
The LU factorization is a "high-level" algebraic description of Gaussian elimination. Expressing the outcome of a matrix algorithm in the "language" of matrix factorizations is a worthwhile activity. It facilitates generalization and highlights connections between algorithms that may appear very different at the scalar level.

3.2.1 Gauss Transformations

To obtain a factorization description of Gaussian elimination we need a matrix description of the zeroing process. At the n = 2 level, if x_1 ≠ 0 and τ = x_2/x_1, then
    [ 1   0 ] [ x_1 ]   [ x_1 ]
    [ -τ  1 ] [ x_2 ] = [  0  ].
More generally, suppose x ∈ R^n with x_k ≠ 0. If
    τ^T = ( 0,...,0, τ_{k+1},...,τ_n ),   τ_i = x_i / x_k,  i = k+1:n,
and we define
    M_k = I - τ e_k^T,                                          (3.2.1)
then
    M_k x = ( x_1, ..., x_k, 0, ..., 0 )^T.
In general, a matrix of the form M_k = I - τ e_k^T ∈ R^{n×n} is a Gauss transformation if the first k components of τ ∈ R^n are zero. Such a matrix is unit lower triangular. The components of τ(k+1:n) are called multipliers. The vector τ is called the Gauss vector.
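A tiny NumPy illustration of (3.2.1), zeroing everything below the kth entry of a vector; the numbers are an assumed example.

    import numpy as np

    x = np.array([2.0, 6.0, 4.0, 8.0])
    k = 0                                    # zero the entries below x[0]
    tau = np.zeros_like(x)
    tau[k + 1:] = x[k + 1:] / x[k]           # the multipliers
    M = np.eye(len(x)) - np.outer(tau, np.eye(len(x))[k])   # M_k = I - tau e_k^T
    print(M @ x)                             # [2. 0. 0. 0.]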
3.2.2 Applying Gauss Transformations

Multiplication by a Gauss transformation is particularly simple. If C ∈ R^{n×r} and M_k = I - τ e_k^T is a Gauss transform, then
    M_k C = (I - τ e_k^T)C = C - τ (e_k^T C) = C - τ C(k,:)
is an outer product update. Since τ(1:k) = 0 only C(k+1:n,:) is affected and the update C = M_k C can be computed row-by-row as follows:
    for i = k+1:n
        C(i,:) = C(i,:) - τ_i C(k,:)
    end
This computation requires 2(n - k)r flops.

Example 3.2.1 ... C = [ ... ] ...

3.2.3 Roundoff Properties of Gauss Transforms

If τ̂ is the computed version of an exact Gauss vector τ, then it is easy to verify that
    τ̂ = τ + e,   |e| ≤ u|τ|.
If τ̂ is used in a Gauss transform update and fl((I - τ̂ e_k^T)C) denotes the computed result, then
    fl((I - τ̂ e_k^T)C) = (I - τ e_k^T)C + E,
where
    |E| ≤ 3u( |C| + |τ| |C(k,:)| ) + O(u²).
Clearly, if τ has large components, then the errors in the update may be large in comparison to |C|. For this reason, care must be exercised when Gauss transformations are employed, a matter that is pursued in §3.4.

3.2.4 Upper Triangularizing

Assume that A ∈ R^{n×n}. Gauss transformations M_1,...,M_{n-1} can usually be found such that M_{n-1}···M_2 M_1 A = U is upper triangular. To see this we first look at the n = 3 case. Suppose
    A = [ 1 4 7 ; 2 5 8 ; 3 6 10 ],
and note that
    M_1 = [ 1 0 0 ; -2 1 0 ; -3 0 1 ]   ⇒   M_1 A = [ 1 4 7 ; 0 -3 -6 ; 0 -6 -11 ].
Likewise,
    M_2 = [ 1 0 0 ; 0 1 0 ; 0 -2 1 ]   ⇒   M_2(M_1 A) = [ 1 4 7 ; 0 -3 -6 ; 0 0 1 ].
Extrapolating from this example observe that during the kth step:

• We are confronted with a matrix A^{(k-1)} = M_{k-1}···M_1 A that is upper triangular in columns 1 to k - 1.
• The multipliers in M_k are based on A^{(k-1)}(k+1:n, k). In particular, we need a_{kk}^{(k-1)} ≠ 0 to proceed.

Noting that complete upper triangularization is achieved after n - 1 steps we therefore obtain
    k = 1
    while (A(k,k) ≠ 0) & (k ≤ n-1)
        ...

3.2.8 Solving a Linear System

Once A has been factored via Algorithm 3.2.1, then L and U are represented in the array A. We can then solve the system Ax = b via the triangular systems Ly = b and Ux = y by using the methods of §3.1.

Example 3.2.2 If Algorithm 3.2.1 is applied to
    A = [ 1 4 7 ; 2 5 8 ; 3 6 10 ]
and b = (1, 1, 1)^T, then y = (1, -1, 0)^T solves Ly = b and x = (-1/3, 1/3, 0)^T solves Ux = y.

3.2.9 Other Versions

Gaussian elimination, like matrix multiplication, is a triple-loop procedure that can be arranged in several ways. Algorithm 3.2.1 corresponds to the "kij" version of Gaussian elimination if we compute the outer product update row-by-row:
    for k = 1:n-1
        A(k+1:n,k) = A(k+1:n,k)/A(k,k)
        for i = k+1:n
            for j = k+1:n
                A(i,j) = A(i,j) - A(i,k)A(k,j)
            end
        end
    end
There are five other versions: kji, ikj, ijk, jik, and jki. The last of these results in an implementation that features a sequence of gaxpy's and forward eliminations. In this formulation, the Gauss transformations are not immediately applied to A as they are in the outer product version. Instead, their application is delayed. The original A(:,j) is untouched until step j. At that point in the algorithm A(:,j) is overwritten by M_{j-1}···M_1 A(:,j). The jth Gauss transformation is then computed.
This is accomplished by changing the line "A(k, k:n) ↔ A(μ, k:n)" in Algorithm 3.4.1 to "A(k, 1:n) ↔ A(μ, 1:n)."

Example 3.4.2 The factorization PA = LU of the matrix in Example 3.4.1 is given by

    [ 0  0  1 ] [ 3  17   10 ]   [  1     0   0 ] [ 6  18  −12 ]
    [ 1  0  0 ] [ 2   4   −2 ] = [ 1/2    1   0 ] [ 0   8   16 ]
    [ 0  1  0 ] [ 6  18  −12 ]   [ 1/3  −1/4  1 ] [ 0   0    6 ]

3.4.5 The Gaxpy Version

In §3.2 we developed outer product and gaxpy schemes for computing the LU factorization. Having just incorporated pivoting in the outer product version, it is natural to do the same with the gaxpy approach. Recall from (3.2.5) the general structure of the gaxpy LU process:

    for j = 1:n
        if j = 1
            v(j:n) = A(j:n, j)
        else
            Solve L(1:j−1, 1:j−1)z = A(1:j−1, j) for z and set U(1:j−1, j) = z
            v(j:n) = A(j:n, j) − L(j:n, 1:j−1)z
        end
        if j < n
            …
        end
        U(j,j) = v(j)
    end

In this implementation, we emerge with the factorization PA = LU where P = E_{n−1}⋯E_1 and E_k is obtained by interchanging rows k and p(k) of the n-by-n identity. As with Algorithm 3.4.1, this procedure requires 2n^3/3 flops and O(n^2) comparisons.

3.4.6 Error Analysis

We now examine the stability that is obtained with partial pivoting. This requires an accounting of the rounding errors that are sustained during elimination and during the triangular system solving. Bearing in mind that there are no rounding errors associated with permutation, it is not hard to show using Theorem 3.3.2 that the computed solution x̂ satisfies (A + E)x̂ = b where

    |E| ≤ nu(3|A| + 5 P̂^T|L̂||Û|) + O(u^2).                          (3.4.3)

Here we are assuming that P̂, L̂, and Û are the computed analogs of P, L, and U as produced by the above algorithms. Pivoting implies that the elements of L̂ are bounded by one. For example, if

    a_{ij} =  1   if i = j or j = n,
             −1   if i > j,
              0   otherwise,

then A has an LU factorization with |l_{ij}| ≤ 1 and u_{nn} = 2^{n−1}, showing that element growth of order 2^{n−1} can occur even with partial pivoting.

3.4.7 Block Gaussian Elimination

Gaussian elimination with partial pivoting can be organized so that it is rich in level-3 operations. We detail a block outer product procedure, but block gaxpy and block dot product formulations are also possible. See Dayde and Duff (1988).

Assume A ∈ R^{n×n} and for clarity that n = rN. Partition A as follows:

    A = [ A_11  A_12 ]        A_11 ∈ R^{r×r}.
        [ A_21  A_22 ]

The first step in the block reduction is typical and proceeds as follows:

• Use scalar Gaussian elimination with partial pivoting (e.g., a rectangular version of Algorithm 3.4.1) to compute a permutation P_1 ∈ R^{n×n}, a unit lower triangular L_11 ∈ R^{r×r}, a matrix L_21 ∈ R^{(n−r)×r}, and an upper triangular U_11 ∈ R^{r×r} so that

    P_1 [ A_11 ]   [ L_11 ]
        [ A_21 ] = [ L_21 ] U_11.

• Apply P_1 across the rest of A:

    [ Ã_12 ]       [ A_12 ]
    [ Ã_22 ] = P_1 [ A_22 ]

• Solve the lower triangular multiple right hand side problem

    L_11 U_12 = Ã_12.

• Perform the level-3 update

    Ã_22 = Ã_22 − L_21 U_12.

With these computations we obtain the factorization

    P_1 A = [ L_11  0 ] [ U_11  U_12 ]
            [ L_21  I ] [  0    Ã_22 ]

The process is then repeated on the first r columns of Ã_22. In general, during step k (1 ≤ k ≤ N−1) of the block algorithm we apply scalar Gaussian elimination to a matrix of size (n − (k−1)r)-by-r. An r-by-(n − kr) multiple right hand side system is solved and a level-3 update of size (n − kr)-by-(n − kr) is performed. The level-3 fraction for the overall process is approximately given by 1 − 3/(2N). Thus, for large N the procedure is rich in matrix multiplication.

3.4.8 Complete Pivoting

Another pivot strategy called complete pivoting has the property that the associated growth factor bound is considerably smaller than 2^{n−1}. Recall that in partial pivoting, the kth pivot is determined by scanning the current subcolumn A(k:n, k). In complete pivoting, the largest entry in the current submatrix A(k:n, k:n) is permuted into the (k,k) position.
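A compact NumPy sketch of row-pivoted outer product elimination may help connect Algorithm 3.4.1, the full-row interchange convention just described, and Example 3.4.2. The helper name lu_partial_pivoting and the explicit assembly of P from the pivot vector are illustrative choices rather than anything prescribed by the text.

import numpy as np

def lu_partial_pivoting(A):
    # Outer-product Gaussian elimination with partial pivoting (in the spirit
    # of Algorithm 3.4.1).  Returns the overwritten factor array and the pivot
    # indices p, where step k interchanged rows k and p[k].
    A = A.astype(float).copy()
    n = A.shape[0]
    p = np.arange(n)
    for k in range(n - 1):
        mu = k + np.argmax(np.abs(A[k:, k]))    # largest entry in A(k:n, k)
        A[[k, mu], :] = A[[mu, k], :]           # swap whole rows so earlier multipliers move too
        p[k] = mu
        if A[k, k] != 0.0:
            A[k+1:, k] /= A[k, k]
            A[k+1:, k+1:] -= np.outer(A[k+1:, k], A[k, k+1:])
    return A, p

A = np.array([[3., 17., 10.],
              [2., 4., -2.],
              [6., 18., -12.]])
F, p = lu_partial_pivoting(A)
L = np.tril(F, -1) + np.eye(3)
U = np.triu(F)
P = np.eye(3)
for k in range(2):                              # P = E_{n-1} ... E_1
    P[[k, p[k]], :] = P[[p[k], k], :]
assert np.allclose(P @ A, L @ U)

Running this on the matrix of Example 3.4.1 reproduces the P, L, and U displayed in Example 3.4.2, and the multipliers stored below the diagonal never exceed one in magnitude.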
Thus, we compute the upper triangularization

    M_{n−1}E_{n−1}⋯M_1E_1 A F_1⋯F_{n−1} = U

with the property that in step k we are confronted with the matrix

    A^{(k−1)} = M_{k−1}E_{k−1}⋯M_1E_1 A F_1⋯F_{k−1}

and determine interchange permutations E_k and F_k such that the largest entry of |A^{(k−1)}(k:n, k:n)| is brought to the (k,k) position. We have the analog of Theorem 3.4.1.

Theorem 3.4.2 If Gaussian elimination with complete pivoting is used to compute the upper triangularization

    M_{n−1}E_{n−1}⋯M_1E_1 A F_1⋯F_{n−1} = U                          (3.4.7)

then PAQ = LU where P = E_{n−1}⋯E_1, Q = F_1⋯F_{n−1}, and L is a unit lower triangular matrix with |l_{ij}| ≤ 1. The kth column of L below the diagonal is a permuted version of the kth Gauss vector. In particular, if M_k = I − τ^{(k)}e_k^T, then L(k+1:n, k) = g(k+1:n) where g = E_{n−1}⋯E_{k+1}τ^{(k)}.

Proof. The proof is similar to the proof of Theorem 3.4.1. Details are left to the reader. □

Here is Gaussian elimination with complete pivoting in detail:

Algorithm 3.4.2 (Gaussian Elimination with Complete Pivoting) This algorithm computes the complete pivoting factorization PAQ = LU where L is unit lower triangular and U is upper triangular. P = E_{n−1}⋯E_1 and Q = F_1⋯F_{n−1} are products of interchange permutations. A(1:k, k) is overwritten by U(1:k, k), k = 1:n. A(k+1:n, k) is overwritten by L(k+1:n, k), k = 1:n−1. E_k interchanges rows k and p(k). F_k interchanges columns k and q(k).

    for k = 1:n−1
        Determine μ and λ with k ≤ μ ≤ n and k ≤ λ ≤ n so that
            |A(μ,λ)| = max{ |A(i,j)| : i = k:n, j = k:n }
        …
    end

In exact arithmetic the entries of the reduced matrices A^{(k)} = M_kE_k⋯M_1E_1 A F_1⋯F_k satisfy

    |a^{(k)}_{ij}| ≤ k^{1/2}(2·3^{1/2}·4^{1/3}⋯k^{1/(k−1)})^{1/2} max_{i,j}|a_{ij}|.          (3.4.8)

The upper bound is a rather slow-growing function of k. This fact, coupled with vast empirical evidence suggesting that ρ is always modestly sized (e.g., ρ = 10), permits us to conclude that Gaussian elimination with complete pivoting is stable. The method solves a nearby linear system (A + E)x̂ = b exactly in the sense of (3.3.1). However, there appears to be no practical justification for choosing complete pivoting over partial pivoting except in cases where rank determination is an issue.

Example 3.4.4 If Gaussian elimination with complete pivoting and β = 10, t = 3 floating point arithmetic is applied to the problem of Example 3.3.1, then x̂ = (1.00, 1.00)^T. Compare with Examples 3.3.1 and 3.4.3.

3.4.10 The Avoidance of Pivoting

For certain classes of matrices it is not necessary to pivot. It is important to identify such classes because pivoting usually degrades performance. To illustrate the kind of analysis required to prove that pivoting can be safely avoided, we consider the case of diagonally dominant matrices. We say that A ∈ R^{n×n} is strictly diagonally dominant if

    |a_{ii}| > Σ_{j≠i} |a_{ij}|,    i = 1:n.

The following theorem shows how this property can ensure a nice, no-pivoting LU factorization.

Theorem 3.4.3 If A^T is strictly diagonally dominant, then A has an LU factorization and |l_{ij}| ≤ 1. In other words, if Algorithm 3.4.1 is applied, then P = I.

Proof. Partition A as follows:

    A = [ α  w^T ]
        [ v   C  ]

where α is 1-by-1, and note that after one step of the outer product LU process we have the factorization

    [ α  w^T ]   [  1   0 ] [ α       w^T     ]
    [ v   C  ] = [ v/α  I ] [ 0   C − v w^T/α ]

The theorem follows by induction on n if we can show that the transpose of B = C − v w^T/α is strictly diagonally dominant. This is because we may then assume that B has an LU factorization B = L_1U_1, and that implies

    A = [  1    0  ] [ α  w^T ]
        [ v/α  L_1 ] [ 0  U_1 ]

But the proof that B^T is strictly diagonally dominant is straightforward. From the definitions we have

    Σ_{i≠j} |b_{ij}| = Σ_{i≠j} |c_{ij} − v_i w_j/α|
                     ≤ Σ_{i≠j} |c_{ij}| + (|w_j|/|α|) Σ_{i≠j} |v_i|
                     < (|c_{jj}| − |w_j|) + (|w_j|/|α|)(|α| − |v_j|)
                     = |c_{jj}| − |v_j||w_j|/|α|
                     ≤ |c_{jj} − v_j w_j/α| = |b_{jj}|.  □

3.4.11 Some Applications

We conclude with some examples that illustrate how to think in terms of matrix factorizations when confronted with various linear equation situations.
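The following NumPy sketch mimics the complete pivoting strategy (a hypothetical rendering of the search-and-swap structure of Algorithm 3.4.2, not the book's code); it records the row and column interchanges and verifies PAQ = LU on the matrix of Example 3.4.1. The function name and the way P and Q are assembled from the pivot vectors are illustrative choices.

import numpy as np

def lu_complete_pivoting(A):
    # Gaussian elimination with complete pivoting.  Returns the overwritten
    # factor array together with row pivots p and column pivots q: step k
    # interchanges rows k, p[k] and columns k, q[k].
    A = A.astype(float).copy()
    n = A.shape[0]
    p = np.arange(n)
    q = np.arange(n)
    for k in range(n - 1):
        # largest entry (in modulus) of the trailing submatrix A(k:n, k:n)
        i, j = np.unravel_index(np.argmax(np.abs(A[k:, k:])), A[k:, k:].shape)
        mu, lam = k + i, k + j
        A[[k, mu], :] = A[[mu, k], :]
        A[:, [k, lam]] = A[:, [lam, k]]
        p[k], q[k] = mu, lam
        if A[k, k] != 0.0:
            A[k+1:, k] /= A[k, k]
            A[k+1:, k+1:] -= np.outer(A[k+1:, k], A[k, k+1:])
    return A, p, q

A = np.array([[3., 17., 10.],
              [2., 4., -2.],
              [6., 18., -12.]])
F, p, q = lu_complete_pivoting(A)
L = np.tril(F, -1) + np.eye(3)
U = np.triu(F)
P, Q = np.eye(3), np.eye(3)
for k in range(2):
    P[[k, p[k]], :] = P[[p[k], k], :]    # P = E_{n-1} ... E_1
    Q[:, [k, q[k]]] = Q[:, [q[k], k]]    # Q = F_1 ... F_{n-1}
assert np.allclose(P @ A @ Q, L @ U)

Note that each step performs an O((n−k)^2) search of the trailing submatrix, which is why complete pivoting costs noticeably more in comparisons than the O(n−k) column scans of partial pivoting.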
Suppose A is nonsingular and n-by-n and that B is n-by-p. Consider the problem of finding X (n-by-p) so that AX = B, i.e., the multiple right hand side problem. If X = [x_1,…,x_p] and B = [b_1,…,b_p] are column partitionings, then

    Compute PA = LU.
    for k = 1:p
        Solve Ly = Pb_k.                                             (3.4.9)
        Solve Ux_k = y.
    end

Note that A is factored just once. If B = I_n, then we emerge with a computed A^{−1}.

As another example of getting the LU factorization "outside the loop," suppose we want to solve the linear system A^k x = b where A ∈ R^{n×n}, b ∈ R^n, and k is a positive integer. One approach is to compute C = A^k and then solve Cx = b. However, the matrix multiplications can be avoided altogether:

    Compute PA = LU.
    for j = 1:k
        Overwrite b with the solution to Ly = Pb.                    (3.4.10)
        Overwrite b with the solution to Ux = b.
    end

As a final example we show how to avoid the pitfall of explicit inverse computation. Suppose we are given A ∈ R^{n×n}, d ∈ R^n, and c ∈ R^n and that we want to compute s = c^T A^{−1} d. One approach is to compute X = A^{−1} as suggested above and then compute s = c^T X d. A more economical procedure is to compute PA = LU and then solve the triangular systems Ly = Pd and Ux = y. It follows that s = c^T x. The point of this example is to stress that when a matrix inverse is encountered in a formula, we must think in terms of solving equations rather than in terms of explicit inverse formation.

Problems

P3.4.1 Let A = LU be the LU factorization of n-by-n A with |l_{ij}| ≤ 1. Let a_i^T and u_i^T denote the ith rows of A and U, respectively. Verify the equation

    u_i^T = a_i^T − Σ_{k=1}^{i−1} l_{ik} u_k^T

and use it to show that ‖U‖_∞ ≤ 2^{n−1}‖A‖_∞. (Hint: Take norms and use induction.)

P3.4.2 Show that if PAQ = LU is obtained via Gaussian elimination with complete pivoting, then no element of U(i, i:n) is larger in absolute value than |u_{ii}|.

P3.4.3 Suppose A ∈ R^{n×n} has an LU factorization and that L and U are known. Give an algorithm which can compute the (i,j) entry of A^{−1} in approximately (n−j)^2 + (n−i)^2 flops.

P3.4.4 Suppose X̂ is the computed inverse obtained via (3.4.9). Give an upper bound for ‖AX̂ − I‖.

P3.4.5 Prove Theorem 3.4.2.

P3.4.6 Extend Algorithm 3.4.2 so that it can factor an arbitrary rectangular matrix.

P3.4.7 Write a detailed version of the block elimination algorithm outlined in §3.4.7.

Notes and References for Sec. 3.4

An Algol version of Algorithm 3.4.1 is given in

H.J. Bowdler, R.S. Martin, G. Peters, and J.H. Wilkinson (1966). "Solution of Real and Complex Systems of Linear Equations," Numer. Math. 8, 217–234.

See also Wilkinson and Reinsch (1971).
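The "factor once, solve many" theme of these examples can be expressed with SciPy's LU routines. The sketch below is illustrative (the random test matrix, the diagonal shift that keeps it well conditioned, and the chosen sizes are assumptions, not from the text); it exercises the patterns (3.4.9), (3.4.10), and s = c^T A^{−1} d.

import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n, p, k = 6, 4, 3
A = rng.standard_normal((n, n)) + n * np.eye(n)    # shifted so A is comfortably nonsingular
B = rng.standard_normal((n, p))
c = rng.standard_normal(n)
d = rng.standard_normal(n)

lu, piv = lu_factor(A)                             # PA = LU computed once

# Multiple right hand sides: solve AX = B column by column (cf. (3.4.9)).
# (lu_solve also accepts the whole matrix B in one call.)
X = np.column_stack([lu_solve((lu, piv), B[:, j]) for j in range(p)])
assert np.allclose(A @ X, B)

# Solve A^k x = b by k repeated triangular-solve passes (cf. (3.4.10)).
b = rng.standard_normal(n)
x = b.copy()
for _ in range(k):
    x = lu_solve((lu, piv), x)
assert np.allclose(np.linalg.matrix_power(A, k) @ x, b)

# s = c^T A^{-1} d without forming A^{-1}: solve Ax = d, then s = c^T x.
s = c @ lu_solve((lu, piv), d)
assert np.isclose(s, c @ np.linalg.solve(A, d))

In every case the O(n^3) factorization is done once and each subsequent solve costs only O(n^2), which is precisely the point of keeping the factorization outside the loop.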