LINEAR ALGEBRA
With Geometric Applications

PURE AND APPLIED MATHEMATICS
A Program of Monographs, Textbooks, and Lecture Notes

EXECUTIVE EDITORS-MONOGRAPHS, TEXTBOOKS, AND LECTURE NOTES

Earl J. Taft, Rutgers University, New Brunswick, New Jersey
Edwin Hewitt, University of Washington, Seattle, Washington

CHAIRMAN OF THE EDITORIAL BOARD

S. Kobayashi, University of California, Berkeley, Berkeley, California

EDITORIAL BOARD

Masanao Aoki, University of California, Los Angeles
Glen E. Bredon, Rutgers University
Sigurdur Helgason, Massachusetts Institute of Technology
G. Leitman, University of California, Berkeley
Marvin Marcus, University of California, Santa Barbara
W. S. Massey, Yale University
Irving Reiner, University of Illinois at Urbana-Champaign
Paul J. Sally, Jr., University of Chicago
Jane Cronin Scanlon, Rutgers University
Martin Schechter, Yeshiva University
Julius L. Shaneson, Rutgers University

MONOGRAPHS AND TEXTBOOKS IN PURE AND APPLIED MATHEMATICS

1. K. Yano. Integral Formulas in Riemannian Geometry (1970)
2. S. Kobayashi. Hyperbolic Manifolds and Holomorphic Mappings (1970)
3. V. S. Vladimirov. Equations of Mathematical Physics (A. Jeffrey, editor; A. Littlewood, translator) (1970)
4. B. N. Pshenichnyi. Necessary Conditions for an Extremum (L. Neustadt, translation editor; K. Makowski, translator) (1971)
5. L. Narici, E. Beckenstein, and G. Bachman. Functional Analysis and Valuation Theory (1971)
6. D. S. Passman. Infinite Group Rings (1971)
7. L. Dornhoff. Group Representation Theory (in two parts). Part A: Ordinary Representation Theory. Part B: Modular Representation Theory (1971, 1972)
8. W. Boothby and G. L. Weiss (eds.). Symmetric Spaces: Short Courses Presented at Washington University (1972)
9. Y. Matsushima. Differentiable Manifolds (E. T. Kobayashi, translator) (1972)
10. L. E. Ward, Jr. Topology: An Outline for a First Course (1972)
11. A. Babakhanian. Cohomological Methods in Group Theory (1972)
12. R. Gilmer. Multiplicative Ideal Theory (1972)
13. J. Yeh. Stochastic Processes and the Wiener Integral (1973)
14. J. Barros-Neto. Introduction to the Theory of Distributions (1973)
15. R. Larsen. Functional Analysis: An Introduction (1973)
16. K. Yano and S. Ishihara. Tangent and Cotangent Bundles: Differential Geometry (1973)
17. C. Procesi. Rings with Polynomial Identities (1973)
18. R. Hermann. Geometry, Physics, and Systems (1973)
19. N. R. Wallach. Harmonic Analysis on Homogeneous Spaces (1973)
20. J. Dieudonné. Introduction to the Theory of Formal Groups (1973)
21. I. Vaisman. Cohomology and Differential Forms (1973)
22. B.-Y. Chen. Geometry of Submanifolds (1973)
23. M. Marcus. Finite Dimensional Multilinear Algebra (in two parts) (1973, 1975)
24. R. Larsen. Banach Algebras: An Introduction (1973)
25. R. O. Kujala and A. L. Vitter (eds.). Value Distribution Theory: Part A; Part B: Deficit and Bezout Estimates by Wilhelm Stoll (1973)
26. K. B. Stolarsky. Algebraic Numbers and Diophantine Approximation (1974)
27. A. R. Magid. The Separable Galois Theory of Commutative Rings (1974)
28. B. R. McDonald. Finite Rings with Identity (1974)
29. I. Satake. Linear Algebra (S. Koh, T. Akiba, and S. Ihara, translators) (1975)
30. J. S. Golan. Localization of Noncommutative Rings (1975)
31. G. Klambauer. Mathematical Analysis (1975)
32. M. K. Agoston. Algebraic Topology: A First Course (1976)
33. K. R. Goodearl. Ring Theory: Nonsingular Rings and Modules (1976)
34. L. E. Mansfield. Linear Algebra with Geometric Applications (1976)

LINEAR ALGEBRA
With Geometric Applications

LARRY E. MANSFIELD
Queens College of the City University of New York
Flushing, New York

MARCEL DEKKER, INC.  New York and Basel
COPYRIGHT © 1976 MARCEL DEKKER, INC. ALL RIGHTS RESERVED.

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage and retrieval system, without permission in writing from the publisher.

MARCEL DEKKER, INC.
270 Madison Avenue, New York, New York 10016

LIBRARY OF CONGRESS CATALOG CARD NUMBER: 75-10345
ISBN: 0-8247-6321-1

Current printing (last digit): 10 9 8 7 6 5 4 3 2 1

PRINTED IN THE UNITED STATES OF AMERICA

Contents

Preface

Chapter 1. A Geometric Model
§1. The Field of Real Numbers
§2. The Vector Space 𝒱2
§3. Geometric Representation of Vectors of 𝒱2
§4. The Plane Viewed as a Vector Space
§5. Angle Measurement in ℰ2
§6. Generalization of 𝒱2 and ℰ2

Chapter 2. Real Vector Spaces
§1. Definition and Examples
§2. Subspaces
§3. Linear Dependence
§4. Bases and Dimension
§5. Coordinates
§6. Direct Sums

Chapter 3. Systems of Linear Equations
§1. Notation
§2. Row-equivalence and Rank
§3. Gaussian Elimination
§4. Geometric Interpretation
§5. Determinants: Definition
§6. Determinants: Properties and Applications
§7. Alternate Approaches to Rank

Chapter 4. Linear Transformations
§1. Definitions and Examples
§2. The Image and Null Spaces
§3. Algebra of Linear Transformations
§4. Matrices of a Linear Transformation
§5. Composition of Maps
§6. Matrix Multiplication
§7. Elementary Matrices

Chapter 5. Change of Basis
§1. Change in Coordinates
§2. Change in the Matrix of a Map
§3. Equivalence Relations
§4. Similarity
§5. Invariants for Similarity

Chapter 6. Inner Product Spaces
§1. Definition and Examples
§2. The Norm and Orthogonality
§3. The Gram-Schmidt Process
§4. Orthogonal Transformations
§5. Vector Spaces over Arbitrary Fields
§6. Complex Inner Product Spaces and Hermitian Transformations

Chapter 7. Second Degree Curves and Surfaces
§1. Quadratic Forms
§2. Congruence
§3. Rigid Motions
§4. Conics
§5. Quadric Surfaces
§6. Chords, Rulings, and Centers

Chapter 8. Canonical Forms Under Similarity
§1. Invariant Subspaces
§2. Polynomials
§3. Minimum Polynomials
§4. Elementary Divisors
§5. Canonical Forms
§6. Applications of the Jordan Form

Appendix: Determinants
§1. A Determinant Function
§2. Permutations and Existence
§3. Cofactors, Existence, and Applications

Answers and Suggestions
Symbol Index
Subject Index
Preface

Until recently an introduction to linear algebra was devoted primarily to solving systems of linear equations and to the evaluation of determinants. But now a more theoretical approach is usually taken and linear algebra is to a large extent the study of an abstract mathematical object called a vector space. This modern approach encompasses the former, but it has the advantage of a much wider applicability, for it is possible to apply conclusions derived from the study of an abstract system to diverse problems arising in various branches of mathematics and the sciences.

Linear Algebra with Geometric Applications was developed as a text for a sophomore level, introductory course from dittoed material used by several classes. Very little mathematical background is assumed aside from that obtained in the usual high school algebra and geometry courses. Although a few examples are drawn from the calculus, they are not essential and may be skipped if one is unfamiliar with the ideas. This means that very little mathematical sophistication is required. However, a major objective of the text is to develop one's mathematical maturity and convey a sense of what constitutes modern mathematics. This can be accomplished by determining how one goes about solving problems and what constitutes a proof, while mastering computational techniques and the underlying concepts. The study of linear algebra is well suited to this task for it is based on the simple arithmetic properties of addition and multiplication.
Although linear algebra is grounded in arithmetic, so many new concepts must be introduced that the underlying simplicity can be obscured by terminology. Therefore every effort has been made to introduce new terms only when necessary and then only with sufficient motivation. For example, systems of linear equations are not considered until it is clear how they arise, matrix multiplication is not defined until one sees how it will be used, and complex scalars are not introduced until they are actually needed. In addition, examples are presented with each new term. These examples are usually either algebraic or geometric in nature. Heavy reliance is placed on geometric examples because geometric ideas are familiar and they provide good interpretations of linear algebraic concepts. Examples employing polynomials or functions are also easily understood and they supply nongeometric interpretations. Occasionally examples are drawn from other fields to suggest the range of possible application, but this is not done often because it is difficult to clarify a new concept while motivating and solving problems in another field.

The first seven chapters follow a natural development, beginning with an algebraic approach to geometry and ending with an algebraic analysis of second degree curves and surfaces. Chapter 8 develops canonical forms for matrices under similarity and might be covered at any point after Chapter 5. It is by far the most difficult chapter in the book. The appendix on determinants refers to concepts found in Chapters 4 and 6, but it could be taken up when determinants are introduced in Chapter 3.

Importance of Problems  The role of problems in the study of mathematics cannot be overemphasized. They should not be regarded simply as hurdles to be overcome in assignments and tests. Rather they are the means to understanding the material being presented and to appreciating how ideas can be used. Once the role of problems is understood, it will be seen that the first place to look for problems is not necessarily in problem sets. It is important to be able to find and solve problems while reading the text. For example, when a new concept is introduced, ask yourself what it really means; look for an example in which a property is not present as well as one in which it is, and then note the differences. Numerical examples can be made from almost any abstract expression. Whenever an abstract expression from the text or one of your own seems unclear, replace the symbols with particular numerical expressions. This usually transforms the abstraction into an exercise in arithmetic or a system of linear equations. The next place to look for problems is in worked out examples and proved theorems. In fact, the best way to understand either is by working through the computation or deduction on paper as you read, filling in any steps that may have been omitted. Most of our theorems will have fairly simple proofs which can be constructed with little more than a good understanding of what is being claimed and the knowledge of how each term is defined. This does not mean that you should be able to prove each theorem when you first encounter it; however, the attempt to construct a proof will usually aid in understanding the given proof. The problems at the end of each section should be considered next. Solve enough of the computational problems to master the computational techniques, and work on as many of the remaining problems as possible. At the very least, read each problem and determine exactly what is being claimed. Finally, you should often try to gain an overview of what you are doing; set yourself the problem of determining how and why a particular concept or technique has come about. In other words, ask yourself what has been achieved, what terms had to be introduced, and what facts were required. This is a good time to see if you can prove some of the essential facts or outline a proof of the main result.

At times you will not find a solution immediately, but simply attempting to set up an example, prove a theorem, or solve a problem can be very useful. Such an attempt can point out that a term or concept is not well understood and thus lead to further examination of some idea. Such an examination will often provide the basis for finding a solution, but even if it does not, it should lead to a fuller understanding of some aspect of linear algebra.

Because problems are so important, an extensive solution section is provided for the problem sets. It contains full answers for all computational problems and some theoretical problems. However, when a problem requires a proof, the actual development of the argument is one objective of the problem. Therefore you will often find a suggestion as to how to begin rather than a complete solution. The way in which such a solution begins is very important; too often an assumption is made at the beginning of an argument which amounts to assuming what is to be proved, or a hypothesis is either misused or omitted entirely. One should keep in mind that a proof is viewed in its entirety, so that an argument which begins incorrectly cannot become a proof no matter what is claimed in the last line about having solved the problem. A given suggestion or proof should be used as a last resort, for once you see a completed argument you can no longer create it yourself; creating a proof not only extends your knowledge, but it amounts to participating in the development of linear algebra.

Acknowledgments  At this point I would like to acknowledge the invaluable assistance I have received from the many students who worked through my original lecture notes. Their observations when answers did not check or when arguments were not clear have led to many changes and revisions. I would also like to thank Professor Robert B. Gardner for his many helpful comments and suggestions.

Chapter 1. A Geometric Model

§1. The Field of Real Numbers
§2. The Vector Space 𝒱2
§3. Geometric Representation of Vectors of 𝒱2
§4. The Plane Viewed as a Vector Space
§5. Angle Measurement in ℰ2
§6. Generalization of 𝒱2 and ℰ2
Before beginning an abstract study of vector spaces, it is helpful to have a concrete example to use as a guide. Therefore we will begin by defining a particular vector space, and after examining a few of its properties, we will see how it may be used in the study of plane geometry.

§1. The Field of Real Numbers

Our study of linear algebra is based on the arithmetic properties of real numbers, and several important terms are derived directly from these properties. Therefore we begin by examining the basic properties of arithmetic. The set of all real numbers will be denoted by R, and the symbol "∈" will mean "is a member of." Thus √2 ∈ R can be read as "√2 is a member of the real number system" or more simply as "√2 is a real number." Now if r, s, and t are any real numbers, then the following properties are satisfied:

Properties of addition:

r + s ∈ R, or R is closed under addition
r + (s + t) = (r + s) + t, or addition is associative
r + s = s + r, or addition is commutative
r + 0 = r, or 0 is an additive identity
For any r ∈ R, there is an additive inverse -r ∈ R such that r + (-r) = 0.

Properties of multiplication:

r·s ∈ R, or R is closed under multiplication
r·(s·t) = (r·s)·t, or multiplication is associative
r·s = s·r, or multiplication is commutative
r·1 = r, or 1 is a multiplicative identity
For any r ∈ R, r ≠ 0, there is a multiplicative inverse r⁻¹ ∈ R such that r·(r⁻¹) = 1.

The final property states that multiplication distributes over addition and ties the two operations together:

r·(s + t) = r·s + r·t, a distributive law.

This is a rather special list of properties. On one hand, none of the properties can be derived from the others, while on the other, many properties of real numbers are omitted. For example, it does not contain properties of order or the fact that every real number can be expressed as a decimal. Only certain properties of the real number system have been included, and many other mathematical systems share them. Thus if r, s, and t are thought of as complex numbers and R is replaced by C, representing the set of all complex numbers, then all the above properties are still valid. In general, an algebraic system satisfying all the preceding properties is called a field. The real number system and the complex number system are two different fields, and there are many others. However, we will consider only the field of real numbers in the first five chapters.

Addition and multiplication are binary operations; that is, they are only defined for two elements. This explains the need for associative laws. For if addition were not associative, then r + (s + t) need not equal (r + s) + t and r + s + t would be undefined. The field properties listed above may seem obvious, but it is not too difficult to find binary operations that violate any or all of them.

One phrase in the preceding list which will appear repeatedly is the statement that a set is closed under an operation. The statement is defined for a set of numbers and the operations of addition and multiplication as follows:

Definition  Let S be a set of real numbers. S is closed under addition if r + t ∈ S for all r, t ∈ S. S is closed under multiplication if r·t ∈ S for all r, t ∈ S.

For example, if S is the set containing only the numbers 1, 3, 4, then S is not closed under either addition or multiplication. For 3 + 4 = 7 ∉ S and 3·4 = 12 ∉ S, yet both 3 and 4 are in S. As another example, the set of all odd integers is closed under multiplication but is not closed under addition.
Some notation is useful when working with sets. When the elements are easily listed, the set will be denoted by writing the elements within brackets. Therefore {1, 3, 4} denotes the set containing only the numbers 1, 3, and 4. For larger sets, a set-building notation is used which denotes an arbitrary member of the set and states the conditions which must be satisfied by any member of the set. This notation is { ··· | ··· } and may be read as "the set of all ··· such that ··· ." Thus the set of odd integers could be written as {x | x is an odd integer}, "the set of all x such that x is an odd integer." Or it could be written as {2n + 1 | n is an integer}.

Problems

1. Write out the following notations in words: a. 7 ∈ R. b. √-6 ∉ R. c. {0, 5}. d. {x | x ∈ R, x < 0}. e. {x ∈ R | x² = -1}.
2. a. Show by example that the set of odd integers is not closed under addition. b. Prove that the set of odd integers is closed under multiplication.
3. Determine if the following sets are closed under addition or multiplication: a. {1, -1}. b. {5}. c. {x ∈ R | x < 0}. d. {2n | n is an integer}. e. {x ∈ R | x ≥ 0}.
4. Using the property of addition as a guide, give a formal definition of what it means to say that "addition of real numbers is commutative."
5. A distributive law is included in the properties of the real number system. State another distributive law which holds and explain why it was not included.

§2. The Vector Space 𝒱2

It would be possible to begin a study of linear algebra with a formal definition of an abstract vector space. However, it is more fruitful to consider an example of a particular vector space first. The formal definition will essentially be a selection of properties possessed by the example. The idea is the same as that used in defining a field by selecting certain properties of real numbers. The mathematical problem is to select enough properties to give the essential character of the example while at the same time not taking so many that there are few examples that share them. This procedure obviously cannot be carried out with only one example in hand, but even with several examples the resulting definition might appear arbitrary. Therefore one should not expect the example to point directly to the definition of an abstract vector space; rather it should provide a first place to interpret abstract concepts.

As with the real number system, a vector space will be more than just a collection of elements; it will also include the algebraic structure imposed by operations on the elements. Therefore to define the vector space 𝒱2, both its elements and its operations must be given.

The elements of 𝒱2 are defined to be all ordered pairs of real numbers and are called vectors. The operations of 𝒱2 are addition and scalar multiplication as defined below:

Vector addition: The sum of two vectors (a1, b1) and (a2, b2) is defined by: (a1, b1) + (a2, b2) = (a1 + a2, b1 + b2).

For example, (2, -5) + (4, 7) = (2 + 4, -5 + 7) = (6, 2).

Scalar multiplication: For any real number r, called a scalar, and any vector (a, b) in 𝒱2, r(a, b) is a scalar multiple and is defined by r(a, b) = (ra, rb).

For example, 5(3, -4) = (15, -20).

Now the set of all ordered pairs of real numbers together with the operations of vector addition and scalar multiplication forms the vector space 𝒱2. The numbers a and b in the vector (a, b) are called the components of the vector. Since vectors are ordered pairs, two vectors (a, b) and (c, d) are equal if their corresponding components are equal, that is, if a = c and b = d.

One point that should be made immediately is that the term "vector" may be applied to many different objects, so in other situations the term may apply to something quite different from an ordered pair of real numbers. In this regard it is commonly said that a vector has magnitude and direction, but this is not true for vectors in 𝒱2.

The strong similarity between the vectors of 𝒱2 and the names for points in the Cartesian plane will be utilized in time. However, these are quite different mathematical objects, for 𝒱2 has no geometric properties and the Cartesian plane does not have algebraic properties. Before relating the vector space 𝒱2 with geometry, much can be said of its algebraic structure.
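Because vectors in 𝒱2 are just ordered pairs, both operations translate directly into code. Here is a minimal Python sketch (an editorial illustration; the names vadd and smul are ours, not the text's):

    # Vector addition and scalar multiplication in V2, with vectors as tuples.
    def vadd(u, v):
        # (a1, b1) + (a2, b2) = (a1 + a2, b1 + b2)
        return (u[0] + v[0], u[1] + v[1])

    def smul(r, u):
        # r(a, b) = (ra, rb)
        return (r * u[0], r * u[1])

    print(vadd((2, -5), (4, 7)))  # (6, 2), the sum computed in the text
    print(smul(5, (3, -4)))       # (15, -20), the scalar multiple in the text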
Theorem 1.1 (Basic properties of addition in 𝒱2)  If U, V, and W are any vectors in 𝒱2, then

1. U + V ∈ 𝒱2
2. U + (V + W) = (U + V) + W
3. U + V = V + U
4. U + (0, 0) = U
5. For any vector U ∈ 𝒱2, there exists a vector -U ∈ 𝒱2 such that U + (-U) = (0, 0).

Proof  Each of these is easily proved using the definition of addition in 𝒱2 and the properties of addition for real numbers. For example, to prove part 3, let U = (a, b) and V = (c, d) where a, b, c, d ∈ R; then

U + V = (a, b) + (c, d)
= (a + c, b + d)    Definition of vector addition
= (c + a, d + b)    Addition in R is commutative
= (c, d) + (a, b)    Definition of addition in 𝒱2
= V + U.

The proof of 2 is similar, using the fact that addition in R is associative, and 4 follows from the fact that zero is an additive identity in R. Using the above notation, U + V = (a + c, b + d) and a + c, b + d ∈ R since R is closed under addition. Therefore U + V ∈ 𝒱2 if U, V ∈ 𝒱2 and 1 holds. Part 5 follows from the fact that every real number has an additive inverse; thus if U = (a, b),

U + (-a, -b) = (a, b) + (-a, -b) = (a - a, b - b) = (0, 0)

and (-a, -b) can be called -U.

Each property in Theorem 1.1 arises from a property of addition in R and gives rise to similar terminology. (1) states that 𝒱2 is closed under addition. From (2) and (3) we say that addition in 𝒱2 is associative and commutative, respectively. The fourth property shows that the vector (0, 0) is an identity for addition in 𝒱2. Therefore (0, 0) is called the zero vector of the vector space 𝒱2 and it will be denoted by 0. Finally, the fifth property states that every vector U has an additive inverse denoted by -U.

Other properties of addition and a list of basic properties for scalar multiplication can be found in the problems below.

Problems

1. Find the following vector sums: a. (2, -5) + (3, 2). b. (-6, 1) + (5, -1). c. (2, -3) + (-2, 3). d. (1/2, 1/3) + (-1/4, 2).
2. Find the following scalar multiples: a. ½(4, -5). b. 0(2, 6). c. 3(2, -1/3). d. (-1)(3, -6).
3. Solve the following equations for the vector U: a. U + (2, -3) = (4, 7). b. 3U + (2, 1) = (1, 0). c. (-5, 1) + U = (0, 0). d. 2U + (-4)(3, -1) = (1, 6).
4. Show that for all vectors U ∈ 𝒱2, U + 0 = U.
5. Prove that vector addition in 𝒱2 is associative; give a reason for each step.
6. Suppose an operation were defined on pairs of vectors from 𝒱2 by the formula (a, b)∘(c, d) = ac + bd. Would 𝒱2 be closed under this operation?
7. The following is a proof of the fact that the additive identity of 𝒱2 is unique, that is, there is only one vector that is an additive identity. Find the reasons for the six indicated steps.
Suppose there is a vector W ∈ 𝒱2 such that U + W = U for all U ∈ 𝒱2. Then, since 0 is an additive identity, it must be shown that W = 0.
Let U = (a, b) and W = (x, y)    a. ?
then U + W = (a, b) + (x, y) = (a + x, b + y)    b. ?
but U + W = (a, b)    c. ?
therefore a + x = a and b + y = b    d. ?
so x = 0 and y = 0    e. ?
and W = 0.    f. ?
8. Following the pattern in problem 7, prove that each vector in 𝒱2 has a unique additive inverse.
9. Prove that the following properties of scalar multiplication hold for any vectors U, V ∈ 𝒱2 and any scalars r, s ∈ R: a. rU ∈ 𝒱2. b. r(U + V) = rU + rV. c. (r + s)U = rU + sU. d. (rs)U = r(sU). e. 1U = U where 1 is the number 1.
10. Show that, for all U ∈ 𝒱2: a. 0U = 0. b. -U = (-1)U.
11. Addition in R and therefore in 𝒱2 is both associative and commutative, but not all operations have these properties. Show by examples that if subtraction is viewed as an operation on real numbers, then it is neither commutative nor associative. Do the same for division on the set of all positive real numbers.
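The identities in Theorem 1.1 and in the problems above can also be spot-checked numerically. A check of this kind is evidence rather than a proof, but it is a good way to catch a misunderstanding early. A short Python sketch of our own:

    import random

    def vadd(u, v):
        return (u[0] + v[0], u[1] + v[1])

    random.seed(0)
    for _ in range(1000):
        u, v, w = [(random.randint(-9, 9), random.randint(-9, 9)) for _ in range(3)]
        assert vadd(u, v) == vadd(v, u)                    # part 3: commutative
        assert vadd(u, vadd(v, w)) == vadd(vadd(u, v), w)  # part 2: associative
        assert vadd(u, (0, 0)) == u                        # part 4: zero vector
        assert vadd(u, (-u[0], -u[1])) == (0, 0)           # part 5: additive inverse
    print("all random checks passed")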
§3. Geometric Representation of Vectors of 𝒱2

The definition of an abstract vector space is to be based on the example provided by 𝒱2, and we have now obtained all the properties necessary for the definition. However, aside from the fact that 𝒱2 has a rather simple definition, the preceding discussion gives no indication as to why one would want to study it, let alone something more abstract. Therefore the remainder of this chapter is devoted to examining one of the applications of linear algebra, by drawing the connection between linear algebra and Euclidean geometry. We will first find that the Cartesian plane serves as a good model for algebraic concepts, and then begin to see how algebraic techniques can be used to solve geometric problems.

Let E2 denote the Cartesian plane, that is, the Euclidean plane with a Cartesian coordinate system. Every point of the plane E2 is named by an ordered pair of numbers, the coordinates of the point, and thus can be used to represent a vector of 𝒱2 pictorially. That is, for every vector U = (a, b), there is a point in E2 with Cartesian coordinates a and b that can be used as a picture of U. And conversely, for every point with coordinates (x, y) in the plane, there is a vector in 𝒱2 which has x and y as components. Now if the vectors of 𝒱2 are represented as points in E2, how are the operations of vector addition and scalar multiplication represented?

Suppose U = (a, b) and V = (c, d) are two vectors in 𝒱2; then U + V = (a + c, b + d), and Figure 1 gives a picture of these three vectors in E2. A little plane geometry shows that when the four vectors 0, U, V, and U + V are viewed as points in E2, they lie at the vertices of a parallelogram, as in Figure 2. Thus the sum of two vectors U and V can be pictured as the fourth vertex of the parallelogram having the two lines from the origin to U and V as two sides.

[Figure 1: the points U = (a, b), V = (c, d), and U + V = (a + c, b + d) plotted in E2.]
[Figure 2: the points 0, U, V, and U + V at the vertices of a parallelogram.]

To see how scalar multiples are represented, let U = (a, b) and r ∈ R; then rU = (ra, rb). If a ≠ 0, then the components of the scalar multiple rU satisfy the equation rb = (b/a)ra. That is, if rU = (x, y), then the components satisfy the equation of the line given by y = (b/a)x. Conversely, any point on the line with equation y = (b/a)x has coordinates (x, (b/a)x), and the vector (x, (b/a)x) equals (x/a)(a, b), which is a scalar multiple of U. In Figure 3 several scalar multiples of the vector U = (2, 1) are shown on the line which represents all the scalar multiples of (2, 1). If a = 0 and b ≠ 0, then the set of all multiples of U is represented by the vertical axis in E2. If U = 0, then all multiples are again 0. Therefore, in general, the set of all scalar multiples of a nonzero vector U is represented by the points on the line in E2 through the origin and the point representing the vector U.

[Figure 3: several scalar multiples of U = (2, 1), including 0U = 0, shown on the line they determine through the origin.]

This interpretation of scalar multiplication provides vector equations for lines through the origin.
Suppose ℓ is a line in E2 passing through the origin and U is any nonzero vector represented by a point on ℓ. Then, identifying points and vectors, every point on ℓ is tU for some scalar t. Letting P denote this variable point, P = tU, t ∈ R is a vector equation for ℓ, and the variable t is called a parameter.

Example 1  Find a vector equation for the line ℓ with Cartesian equation y = 4x.
ℓ passes through the origin and the point (1, 4). Therefore P = t(1, 4), t ∈ R is a vector equation for ℓ. If the variable point P is called (x, y), then (x, y) = t(1, 4) yields x = t, y = 4t, and eliminating the parameter t gives y = 4x. Actually there are many vector equations for ℓ. Since (-3, -12) is on ℓ, P = s(-3, -12), s ∈ R is another vector equation for ℓ. But s(-3, -12) = (-3s)(1, 4), so the two equations have the same graph.

Using the geometric interpretation of vector addition, we can write a vector equation for any line in the plane. Given a line ℓ through the origin, suppose we wish to find a vector equation for the line m parallel to ℓ passing through the point V. If U is a nonzero vector on ℓ, then tU is on ℓ for each t ∈ R. Now 0, tU, V, and tU + V are vertices of a parallelogram, Figure 4, and m is parallel to ℓ. Therefore tU + V is on m for each value of t. In fact, every point on m can be expressed as tU + V for some t, so P = tU + V, t ∈ R is a vector equation for the line m.

[Figure 4: the parallelogram with vertices 0, tU, V, and tU + V, showing the line m through V parallel to ℓ.]

Example 2  Find a vector equation for the line m passing through the point (1, 3) and parallel to the line ℓ with Cartesian equation x = 2y.
The point (2, 1) is on ℓ, so P = t(2, 1) + (1, 3), t ∈ R is a vector equation for m. Setting P = (x, y) and eliminating the parameter t from x = 2t + 1 and y = t + 3 gives the Cartesian equation x = 2y - 5. The same Cartesian equation for m can be obtained using the point-slope formula.

Notice the use of the vector U in the equation P = tU + V for m and P = tU for ℓ in Example 2. U determines the direction of m and ℓ, or rather its use results from the fact that they are parallel. Measurement of direction for lines is a relative matter; for example, slope only measures direction relative to given directions, whereas the fact that lines are parallel is independent of coordinate axes. For the vectors in 𝒱2, direction is best left as an undefined term. However, each nonzero vector of 𝒱2 can be used to express the direction of a class of parallel lines in E2.

Definition  If m is a line with vector equation P = tU + V, t ∈ R, then any nonzero scalar multiple of U gives direction numbers for m.

Example 3  The line ℓ with Cartesian equation y = 3x has P = t(1, 3) as a vector equation. So (1, 3), 2(1, 3) = (2, 6), and -3(1, 3) = (-3, -9) are all direction numbers for ℓ. Notice that 3/1 = 6/2 = -9/-3 = 3 is the slope of ℓ.

In general, if a line has slope p, then it has direction numbers (1, p).

Example 4  Find a vector equation for the line m with Cartesian equation y = 3x + 5.
m is parallel to y = 3x, so (1, 3) gives direction numbers for m. It passes through the point (-1, 2), so P = t(1, 3) + (-1, 2), t ∈ R is a vector equation for m. To check the result, let P = (x, y); then x = t - 1 and y = 3t + 2. Eliminating t yields y = 3x + 5.

Aside from providing a parameterization for lines, this vector approach will generalize easily to yield equations for lines in 3-space. In contrast, the approach to lines in the plane found in analytic geometry does not generalize to lines in 3-space.
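The check at the end of Example 4 can also be carried out numerically: every point produced by the vector equation should satisfy the Cartesian equation. A small Python sketch of our own (the helper name point is hypothetical):

    # Sample points from P = t(1, 3) + (-1, 2) and verify that y = 3x + 5 holds.
    def point(t, u=(1, 3), v=(-1, 2)):
        # P = tU + V, computed componentwise
        return (t * u[0] + v[0], t * u[1] + v[1])

    for t in (-2, -0.5, 0, 1, 3.5):
        x, y = point(t)
        assert y == 3 * x + 5
    print("every sampled point lies on y = 3x + 5")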
There are two possible points of confusion in this representation of vectors with points. The first is in the nature of coordinates. To say that a point P has coordinates (x, y) is not to say that P equals (x, y). An object is not equal to its name. However, a statement such as "consider the point with coordinates (x, y)" is often shortened to "consider the point (x, y)." But such a simplification in terminology should not be made until it is clear that points and coordinates are different. The second point of confusion is between vectors and points. There is often the feeling that "a vector really is a point, isn't it?" The answer is emphatically no. If this is not clear, consider the situation with a number line such as a coordinate axis. When the identification is made between real numbers and points on a line, no one goes away saying "but of course, numbers are really points!"

Problems

1. For each of the vectors (3, -4), (0, -1/2), and (2, 3) as U:
a. Plot the points in E2 which represent U, -U, ½U, 0U, and 2U.
b. Without using a formula, find a Cartesian equation for the line representing all scalar multiples of U.
2. Plot the points in E2 which represent the vectors 0, U, V, and U + V and sketch the parallelogram they determine when a. U = (1, 4); V = (3, 2). b. U = (1, 1); V = (-1, 1).
3. How are the points representing U, V, and U + V related if U is a scalar multiple of V and V ≠ 0?
4. Find a vector equation for the lines with the following Cartesian equations: a. y = x. b. y = 5x. c. 2x - 3y = 0. d. y = 0. e. y = 5x - 6. f. x = 4. g. 2x - 4y + 3 = 0.
5. Find direction numbers for each of the lines in problem 4.
6. Find a vector equation for the line through a. (5, -2) and parallel to 3x = 2y. b. (4, 0) and parallel to y = x. c. (1, 2) and parallel to x = 5.
7. Find Cartesian equations for the lines with the given vector equations. a. P = t(6, 2) + (7, 3), t ∈ R. b. P = t(5, 5) + (5, 5), t ∈ R. c. P = t(2, 8) + (1, 2), t ∈ R.
8. Show that if P = tU + V, t ∈ R and P = sA + B, s ∈ R are two vector equations for the same line, then U is a scalar multiple of A.

§4. The Plane Viewed as a Vector Space

Pictures of vectors as points in the Cartesian plane contain Euclidean properties which are lacking in 𝒱2. For example, the distance from the origin to the point (a, b) in E2 is √(a² + b²), but this number does not represent anything in 𝒱2. Similarly, angles can be measured in E2 while angle has no meaning in 𝒱2. In order to bring the geometry of E2 into the vector space, we could define the length of a vector (a, b) to be √(a² + b²), but this definition would come from the plane rather than the vector space. The algebraic approach is to add to the algebraic structure of the vector space so that the properties of length and angle in the plane will represent properties of vectors. The following operation will do just that.

Definition  Let U = (a, b) and V = (c, d); then the dot product of U and V is U ∘ V = (a, b) ∘ (c, d) = ac + bd.

For example,

(2, -6) ∘ (2, 1) = 2·2 + (-6)·1 = -2;
(3, -4) ∘ (-4, -3) = -12 + 12 = 0; and (1, 1) ∘ (1, 1) = 2.

Notice that the dot product of two vectors is a scalar or number, not a vector. Therefore the dot product is not a multiplication of vectors in the usual sense of the word; for when two things are multiplied, the product should be the same type of object.

The algebraic system consisting of all ordered pairs of real numbers together with the operations of addition, scalar multiplication, and the dot product is not really 𝒱2. This new space will be denoted by ℰ2. The vector space ℰ2 differs from 𝒱2 only in that it possesses the dot product and 𝒱2 does not. The similarity between the notations E2 and ℰ2 is not accidental; it will soon become evident that ℰ2 is essentially the Cartesian plane together with the algebraic structure of a vector space.

Notice that for any vector U = (a, b) in ℰ2, U ∘ U = a² + b². This is the square of what we thought of calling the length of U.

Definition  The length or norm of a vector U ∈ ℰ2, denoted ‖U‖, is defined by ‖U‖ = √(U ∘ U).

The notation ‖U‖ may be read "the length of U." In terms of components, if U = (a, b) then ‖(a, b)‖ = √(a² + b²). For example,

‖(2, -1)‖ = √(4 + 1) = √5,  ‖(3/5, 4/5)‖ = 1, and ‖0‖ = 0.
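Both the dot product and the norm translate directly into code. The following Python sketch (ours, not from the text) reproduces the computations above:

    import math

    def dot(u, v):
        # (a, b) . (c, d) = ac + bd
        return u[0] * v[0] + u[1] * v[1]

    def norm(u):
        # ||U|| = sqrt(U . U)
        return math.sqrt(dot(u, u))

    print(dot((2, -6), (2, 1)))    # -2
    print(dot((3, -4), (-4, -3)))  # 0
    print(norm((2, -1)))           # 2.236..., that is, sqrt(5)
    print(norm((3/5, 4/5)))        # 1.0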
Now with the length of a vector U = (a, b) equal to the distance between the origin and the point with coordinates (a, b) in E2, the line segment between the origin and (a, b) could be used to represent both the vector U and its length. But each nonzero vector of 𝒱2, and hence ℰ2, determines a direction in the Cartesian plane. Therefore the vector U = (a, b) can be represented pictorially by an arrow from the origin to the point with coordinates (a, b). Using this representation for vectors from ℰ2, the picture for addition of vectors in Figure 2 can be redrawn. In Figure 5, the sum U + V is represented by an arrow which is the diagonal of the parallelogram determined by 0, U, V, and U + V. This figure illustrates the parallelogram rule of addition, which states that the sum of two vectors, viewed as arrows, is the diagonal (at the origin) of the parallelogram they determine.

[Figure 5: U, V, and U + V as arrows from the origin, with U + V the diagonal of the parallelogram.]

An arrow is not necessarily the best way to represent a vector from ℰ2. Viewing vectors as arrows would not improve the picture of all the scalar multiples of a vector. The choice of which representation to use depends on the situation.

An arrow which is not based at the origin may be viewed as representing a vector. This arises from a representation of the difference of two vectors and leads to many geometric applications. For U, V ∈ ℰ2, U - V = U + (-V), so that viewing U, V, -V, and U - V as arrows in E2 gives a picture such as that in Figure 6. If these four vectors are represented by points and a few lines are added, then two congruent triangles with corresponding sides parallel are obtained, as shown in Figure 7. Therefore the line segment from V to U has the same length and is parallel to the line from the origin to U - V. In other words, the arrow from V to U has the same length and direction as the arrow representing the difference U - V. Therefore the arrow from V to U can also be used to represent U - V, as shown in Figure 8. This representation can be quite confusing if one forgets that the arrow is only a picture for an ordered pair of numbers; it is not the vector itself. For example, when U = (4, 1) and V = (3, 7), the difference U - V = (1, -6) is an ordered pair. The simplest representation of the vector (1, -6) is either as a point or as an arrow at the origin ending at the point with coordinates (1, -6). There is nothing in the vector (1, -6) which indicates that it should be viewed as some other arrow.

[Figure 6: U, V, -V, and U - V = U + (-V) as arrows from the origin.]
[Figure 7: the congruent triangles obtained when the four points are joined by lines.]
[Figure 8: the arrow from V to U used to represent the difference U - V.]
A vector equation for the line determined by two points can be obtained using the above representation of a vector difference. Suppose A and B are two points on a line ℓ in E2. Viewing these points as vectors, the difference B - A may be represented by an arrow from A to B; see Figure 9. Or B - A is on the line parallel to ℓ which passes through the origin. Therefore the difference B - A gives direction numbers for the line ℓ, and a vector equation for ℓ is P = t(B - A) + A, t ∈ R.

[Figure 9: the points A and B on the line ℓ, with B - A represented both by the arrow from A to B and by a point on the parallel line through the origin.]

Example 1  Find a vector equation for the line ℓ through (-2, 3) and (4, 6).
ℓ has direction numbers (6, 3) = (4, 6) - (-2, 3), and (-2, 3) is a point on the line. Thus P = t(6, 3) + (-2, 3), t ∈ R is a vector equation for ℓ. Several other equations could be obtained for ℓ, such as P = s(-6, -3) + (4, 6), s ∈ R. These two equations have the same graph, so for each particular value of t, there is a value of s which gives the same point. In this case this occurs when s = 1 - t.

Example 2  Find the midpoint M of the line segment joining the two points A and B in the plane E2.
If A, B, and M are viewed as vectors, then B - A can be thought of as an arrow from A to B. M is in the middle of this arrow, but ½(B - A) is not M. Rather, ½(B - A) can be represented by the arrow from A to M, as in Figure 10. Therefore ½(B - A) = M - A, or A must be added to ½(B - A) to obtain M; and M = A + ½(B - A) = ½(A + B).

[Figure 10: M at the middle of the arrow from A to B, with ½(B - A) represented by the arrow from A to M.]

M = ½(A + B) is a simple vector formula for the midpoint of a line segment which does not explicitly involve the coordinates of points. One of the main strengths of the vector approach to geometry is the ability to express relationships in a form free of coordinates. On the other hand, the vector formula can easily be converted to the usual expression for the coordinates of a midpoint in analytic geometry. For if the point A has coordinates (x1, y1) and B has coordinates (x2, y2), then M = ½(A + B) = (½(x1 + x2), ½(y1 + y2)). The vector form of the midpoint formula is used in the next example to obtain a short proof of a geometric theorem.

Example 3  Prove that the line joining the midpoints of two sides of a triangle is parallel to and half the length of the third side.
Suppose A, B, and C are the vertices of a triangle, M is the midpoint of side AB, and N is the midpoint of side BC, as in Figure 11. Then it is necessary to show that the line MN is half the length of and parallel to AC. In terms of vectors this is equivalent to showing that N - M = ½(C - A). But we know that M = ½(A + B) and N = ½(B + C), so

N - M = ½(B + C) - ½(A + B) = ½B + ½C - ½A - ½B = ½(C - A).

[Figure 11: triangle with vertices A, B, and C; M is the midpoint of side AB and N is the midpoint of side BC.]
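The identity proved in Example 3 holds for every choice of vertices, and a quick numerical check on one triangle can build confidence in the algebra. A Python sketch of our own (mid and sub are hypothetical helper names):

    # For one triangle, verify N - M = (C - A)/2 where M, N are midpoints.
    def mid(p, q):
        return tuple((a + b) / 2 for a, b in zip(p, q))

    def sub(p, q):
        return tuple(a - b for a, b in zip(p, q))

    A, B, C = (0, 0), (4, 2), (1, 5)
    M, N = mid(A, B), mid(B, C)
    assert sub(N, M) == tuple(x / 2 for x in sub(C, A))
    print("N - M = (C - A)/2 holds for this triangle")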
Problems

1. Compute the following dot products: a. (2, -1) ∘ (3, 4). b. (0, 0) ∘ (1, 7). c. (2, -1) ∘ (1, 2). d. (5, -1) ∘ (-4, 3).
2. Find the lengths of the following vectors from ℰ2: a. (1, 0). b. (1/√2, 1/√2). c. (-4, 3). d. (1, -3).
3. Show that ‖rU‖ = |r| ‖U‖ for any vector U ∈ ℰ2 and any scalar r.
4. Let U, V, W ∈ ℰ2. a. What is the meaning of U ∘ (V ∘ W)? b. Show that U ∘ (V + W) = U ∘ V + U ∘ W.
5. Use arrows from the origin in E2 to represent the vectors U, V, U + V, -V, and U - V = U + (-V) when a. U = (2, 4), V = (-2, 3). b. U = (5, 1), V = (2, -4).
6. Find a vector equation for the line through the given pair of points. a. (2, -5), (-1, 4). b. (4, 7), (10, 7). c. (6, 0), (3, 7).
7. Find the midpoint of each line segment joining the pairs of points in problem 6.
8. Let A and B be two points. Without using components, find a vector formula for the point Q which is on the line segment between A and B and one-fifth the distance to A from B.
9. Without using components, prove that the midpoints of the sides of any quadrilateral are the vertices of a parallelogram.
10. Let A, B, and C be the vertices of a triangle. Without using components, show that the point on a median which is 2/3 of the distance from the vertex to the opposite side can be expressed as ⅓(A + B + C). Conclude from this that the medians of a triangle are concurrent.
11. Prove without using components that the diagonals of a parallelogram bisect each other. Suggestion: Show that the midpoints of the diagonals coincide.

§5. Angle Measurement in ℰ2

The dot product was used to define length for vectors in ℰ2 after first finding what the length should be in order to have geometric meaning. The definition of angle can be motivated in the same way. We need the properties listed below, but a detailed discussion of them will be postponed to Chapter 6.

Theorem 1.2  Let U, V, and W ∈ ℰ2, and r ∈ R; then

1. U ∘ V = V ∘ U
2. U ∘ (V + W) = U ∘ V + U ∘ W
3. (rU) ∘ V = r(U ∘ V)
4. U ∘ U ≥ 0, and U ∘ U = 0 if and only if U = 0.

To motivate the definition of an angle in ℰ2, suppose U and V are two vectors such that one is not a scalar multiple of the other. Representing U, V, and U - V with arrows in E2 yields a triangle as in Figure 12. Then the angle θ between the arrows representing U and V is a well-defined angle inside the triangle. Since the three sides of the triangle can be expressed in terms of the given vectors, the law of cosines has the form

‖U - V‖² = ‖U‖² + ‖V‖² - 2‖U‖ ‖V‖ cos θ.

[Figure 12: the triangle formed by arrows representing U, V, and U - V, with θ the angle between U and V.]
Using Theorem 1.2 and the definition of length,

‖U - V‖² = (U - V) ∘ (U - V) = U ∘ U - U ∘ V - V ∘ U + V ∘ V
= ‖U‖² - 2U ∘ V + ‖V‖².

Therefore the law of cosines gives us the relation U ∘ V = ‖U‖ ‖V‖ cos θ. This relation ties together the angle between the arrows and the vectors they represent, making it clear how angles should be defined in ℰ2.

Definition  If U and V are nonzero vectors in ℰ2, the angle between them is the angle θ which satisfies 0 ≤ θ ≤ 180° and

cos θ = (U ∘ V)/(‖U‖ ‖V‖).

If U or V is the zero vector, then the angle between them is taken to be zero.

With this definition, angles in ℰ2 will have the usual geometric interpretation when the vectors of ℰ2 are represented by arrows in E2. The second part of the definition insures that an angle is defined for any pair of vectors, and we can write U ∘ V = ‖U‖ ‖V‖ cos θ for all vectors U, V from ℰ2.

Example 1  If U = (1, 2) and V = (-1, 3), then the angle between U and V satisfies

cos θ = ((1, 2) ∘ (-1, 3))/(‖(1, 2)‖ ‖(-1, 3)‖) = 1/√2.

Example 2  If U = (1, 1), V = (-1, 2), and θ is the angle between U and V, then cos θ = 1/√10, or θ = cos⁻¹(1/√10). It would be possible to obtain a decimal approximation for θ if required.

Example 3  When U = (4, 6) and V = (-3, 2), cos θ = 0. Therefore the angle between U and V is 90°, and if U and V are represented by arrows in E2, then the arrows are perpendicular.

Definition  Two nonzero vectors in ℰ2 are perpendicular or orthogonal if the angle between them is 90°. The zero vector is taken to be perpendicular or orthogonal to every vector.

The zero vector is defined to be perpendicular to every vector for convenience, for then it is not necessary to continually refer to it as a special case. Although the definition of perpendicular has a good geometric interpretation, it is easier to use the following algebraic condition.

Theorem 1.3  Two vectors U and V in ℰ2 are perpendicular if and only if U ∘ V = 0.

Proof  (⇒)* Suppose U and V are perpendicular. Then the angle θ between U and V is 90° and cos θ = 0, or either U or V is the zero vector and ‖U‖ or ‖V‖ is 0. In each case U ∘ V = ‖U‖ ‖V‖ cos θ = 0.
(⇐) Suppose U ∘ V = 0. Then since U ∘ V = ‖U‖ ‖V‖ cos θ, one of the three factors is 0. If ‖U‖ or ‖V‖ is 0, then U or V is the zero vector, so U and V are perpendicular. If cos θ = 0, then θ = 90° and U is again perpendicular to V.

*The statement "P if and only if Q" contains two assertions; first "P only if Q" or "P implies Q," which may be denoted by P ⇒ Q, and second "P if Q" or "P is implied by Q," which may be denoted by P ⇐ Q. Each assertion is the converse of the other. An assertion and its converse are independent statements; therefore a theorem containing an if and only if statement requires two separate proofs.

Example 4  Let ℓ be the line given by P = t(2, 5) + (7, 1), t ∈ R. Find a vector equation for the line m passing through (1, 0) which is perpendicular to ℓ.
ℓ has direction numbers (2, 5); therefore the direction numbers (a, b) for m must satisfy 0 = (2, 5) ∘ (a, b) = 2a + 5b. There are many solutions for this equation besides a = b = 0. Taking (a, b) = (5, -2) yields the equation P = t(5, -2) + (1, 0), t ∈ R, for the line m.
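The angle formula and the perpendicularity test of Theorem 1.3 are both one-line computations. This Python sketch (an editorial illustration; angle_deg is our own name) retraces Examples 1, 3, and 4:

    import math

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def angle_deg(u, v):
        # cos(theta) = U . V / (||U|| ||V||), for nonzero U and V
        c = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
        return math.degrees(math.acos(c))

    print(angle_deg((1, 2), (-1, 3)))  # 45 degrees (up to rounding): cos(theta) = 1/sqrt(2)
    print(angle_deg((4, 6), (-3, 2)))  # 90.0, as in Example 3
    print(dot((2, 5), (5, -2)))        # 0, so the lines in Example 4 are perpendicular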
Problems

1. Find the angles between the following pairs of vectors. a. (1, 5), (-10, 2). b. (1, -1), (-1, 3). c. (1, √3), (√6, √2).
2. Find a vector equation for the line passing through the given point and perpendicular to the given line. a. (0, 0), P = t(1, -3), t ∈ R. b. (5, 4), P = t(4, 4), t ∈ R. c. (2, 1), P = t(0, 7) + (4, 2), t ∈ R.
3. Show that, in ℰ2, the vectors (a, b) and (b, -a) are perpendicular.
4. The proof of Theorem 1.3 uses the fact that if U ∈ ℰ2 and ‖U‖ = 0, then U = 0. Show that this is true.
5. Give a vector proof of the Pythagorean theorem without the use of components. (The proof is simplest if the vertex with the right angle represents the zero vector of ℰ2.)
6. Show that the diagonals of a rhombus (a parallelogram with equal sides) are perpendicular without using components.
7. Without using components, prove that any triangle inscribed in a circle with one side a diameter is a right triangle having the diameter as hypotenuse. (Suppose the center of the circle represents the zero vector of ℰ2.)

§6. Generalization of 𝒱2 and ℰ2

At this time the examples provided by 𝒱2 and ℰ2 are sufficient to begin a general discussion of vector spaces. But before leaving the geometric discussion, a few ideas concerning 3-dimensional Euclidean geometry should be covered. The vector techniques developed for two dimensions generalize to n dimensions as easily as to three. Thus 𝒱2 can be generalized in one process to an infinite number of vector spaces.

Definition  An ordered n-tuple of numbers has the form (a1, ..., an) where a1, ..., an ∈ R, and (a1, ..., an) = (b1, ..., bn) if and only if a1 = b1, ..., an = bn.

Thus when n = 1, the ordered n-tuple is essentially a real number; when n = 2, it is an ordered pair; and when n = 3, it is an ordered triple.

Definition  For each positive integer n, the vector space 𝒱n consists of all ordered n-tuples of real numbers, called vectors, together with the operation of vector addition:

(a1, ..., an) + (b1, ..., bn) = (a1 + b1, ..., an + bn)

and the operation of scalar multiplication:

r(a1, ..., an) = (ra1, ..., ran) for any r ∈ R.
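Nothing in this definition depends on n being 2 or 3, and the same is true in code: the componentwise operations below, a sketch of our own, work for tuples of any length and reproduce the examples that follow.

    # Componentwise operations in Vn for arbitrary n.
    def vadd(u, v):
        return tuple(a + b for a, b in zip(u, v))

    def smul(r, u):
        return tuple(r * a for a in u)

    print(vadd((-2, 0, 5, 3), (8, -3, -6, 1)))  # (6, -3, -1, 4), the V4 example below
    print(smul(2, (0, 1, 0, -7, 3, 4)))         # (0, 2, 0, -14, 6, 8), the V6 example below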
This definition gives an infinite number of vector spaces, one for each positive integer. Thus in the vector space 𝒱4,

(-2, 0, 5, 3) + (8, -3, -6, 1) = (6, -3, -1, 4).

And in 𝒱6,

2(0, 1, 0, -7, 3, 4) = (0, 2, 0, -14, 6, 8).

The numbers in an ordered n-tuple from 𝒱n will be called components. Thus 4 is the second component of the vector (3, 4, √2, 0, 1) from 𝒱5.

Since the vectors of 𝒱3 are ordered triples, they can be represented by points in Cartesian 3-space. Such a space has three mutually perpendicular coordinate axes, with each pair of axes determining a coordinate plane. A point P is given coordinates (a, b, c) if a, b, and c are the directed distances from the three coordinate planes to P. If a rectangular box is drawn at the origin with height, width, and depth equal to |c|, |b|, and |a|, then P is at the vertex furthest from the origin, as in Figure 13. Now P can be used as a pictorial representation of the vector in 𝒱3 with components a, b, and c.

[Figure 13: the point P = (a, b, c) at the far vertex of a rectangular box at the origin of Cartesian 3-space, with coordinate axes x, y, and z.]

Since we live in a 3-dimensional world, we cannot draw such pictures for vectors from 𝒱n when n > 3. This does not affect the usefulness of these spaces, but at first you may wish to think of 𝒱n as either 𝒱2 or 𝒱3.

The vector notation used in the previous sections is free of components. That is, capital letters were used to denote vectors rather than ordered pairs. For example, the midpoint formula was written as M = ½(A + B) and the equation of a line as P = tU + V, t ∈ R. Now that there are other vector spaces, the question arises, when can these vectors be interpreted as coming from 𝒱n? The answer is that all the general statements about 𝒱2 carry over to 𝒱n. Thus it can be shown that addition in 𝒱n is associative and commutative or that r(U + V) = rU + rV for all U, V ∈ 𝒱n and r ∈ R. Everything carries over directly to 𝒱n.

The properties of a line are the same in all dimensions. A line is either determined by two points or there is a unique line parallel to a given line through a given point. Therefore, the various vector statements about lines in the plane can be regarded as statements about lines in higher dimensions. This means that the set of all scalar multiples of a nonzero vector from 𝒱n can be pictured as a line through the origin in n-dimensional Cartesian space.

Example 1  The three coordinate axes of 3-space have vector equations P = (t, 0, 0), P = (0, t, 0), and P = (0, 0, t) as t runs through the real numbers. The line passing through the point with coordinates (1, 3, -2) and the origin has the vector equation P = t(1, 3, -2), t ∈ R.

In particular, each nonzero vector in 𝒱n gives direction numbers for a set of parallel lines in n-dimensional Cartesian space. Notice that the measurement of slope used in the plane is a ratio requiring only two dimensions and cannot be applied to lines in three or more dimensions.

Example 2  The graph of the vector equation P = t(1, 3, -2) + (4, 0, 1), t ∈ R is the line with direction numbers (1, 3, -2), passing through the point with coordinates (4, 0, 1). It is also parallel to the line considered in Example 1.

Euclidean properties, that is, length and angle, are again introduced using the dot product.

Definition  The dot product of the n-tuples U = (a1, ..., an) and V = (b1, ..., bn) is U ∘ V = a1b1 + ··· + anbn. The algebraic system consisting of all ordered n-tuples, vector addition, scalar multiplication, and the dot product will be denoted by ℰn.

The space ℰ2 was obtained from 𝒱2 using the plane E2 as a guide, and 3-dimensional geometric ideas will provide a basis for viewing properties of ℰ3. But for higher dimensions there is little to build on. Therefore, for dimensions larger than three the procedure can be reversed, and properties of n-dimensional Cartesian space can be found as geometric properties of ℰn. That is, we can gain geometric insights by studying the vector spaces ℰn when n > 3.

As with 𝒱2 and the spaces 𝒱n, the properties of ℰ2 carry over directly to ℰn. The terms length, angle, and perpendicular are defined in just the same way. Thus the vectors (4, 1, 0, 2) and (-1, 2, 7, 1) are perpendicular in ℰ4 because

(4, 1, 0, 2) ∘ (-1, 2, 7, 1) = -4 + 2 + 0 + 2 = 0.

Arrows, triangles, and parallelograms look the same in two, three, or higher dimensions, so the geometric results obtained in the plane also hold in each of the other spaces.

Example 3  Find a vector equation for the line ℓ in 3-space passing through the points with coordinates (2, 5, -4) and (7, -1, 2).
Direction numbers for ℓ are given by (7, -1, 2) - (2, 5, -4) = (5, -6, 6), and ℓ passes through the point (7, -1, 2). Therefore, P = t(5, -6, 6) + (7, -1, 2), t ∈ R is a vector equation for ℓ. As in the 2-dimensional case, there are many other equations for ℓ. But notice that setting P = (x, y, z) yields three parametric equations:

x = 5t + 7,  y = -6t - 1,  z = 6t + 2,  t ∈ R.

It is not possible to eliminate t from these three equations and obtain a single Cartesian equation. In fact, three equations are obtained by eliminating t, which may be written in the form

(x - 7)/5 = (y + 1)/(-6) = (z - 2)/6.

These are called symmetric equations for ℓ. They are the equations of three planes that intersect in ℓ.
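Example 3 can be retraced in a few lines of code: compute the direction numbers, generate a point from the parametric equations, and confirm that the three symmetric-equation ratios agree. A Python sketch of our own:

    # The line through (2, 5, -4) and (7, -1, 2), as in Example 3.
    A, B = (2, 5, -4), (7, -1, 2)
    d = tuple(b - a for a, b in zip(A, B))
    print(d)  # (5, -6, 6), the direction numbers

    t = 2.0
    x, y, z = (t * di + bi for di, bi in zip(d, B))
    print((x - 7) / 5, (y + 1) / -6, (z - 2) / 6)  # all three ratios equal t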
26
1.A GEOMETRJC MODEL §6. Generalization of 'Y 2 and & 2 27

This 'examp1e suggests that the graph of a first degree equation in 3-space
is a plane not a line.

Next in complexity after points and lines in 3-space come the planes. We will use the following characterization of a plane in 3-space: a plane π is determined by a normal direction N and a point A; that is, π consists of all lines through A perpendicular to N. This is a perfect situation for the vector space approach. If P is any point on π, then viewing the points as vectors in ℰ₃, P − A is perpendicular to N. This gives rise to a simple vector equation, for P is on the plane π if and only if (P − A) ∘ N = 0. The direction N is called a normal to the plane, and it makes a nice picture if N is represented by an arrow perpendicular to π, as in Figure 14. However, it must be remembered that N is an ordered triple, not an arrow.

[Figure 14: a plane with its normal N drawn as an arrow perpendicular to the plane; for a point P on the plane, the vector P − A lies in the plane.]
Example 4 Find a vector equation and a Cartesian equation for the plane passing through the point A = (1, −2, 3) with normal N = (4, −1, 2).
A vector equation for this plane is

    [P − (1, −2, 3)] ∘ (4, −1, 2) = 0.

Now suppose the arbitrary point P has coordinates (x, y, z). Then the vector equation

    [(x, y, z) − (1, −2, 3)] ∘ (4, −1, 2) = 0

becomes

    (x − 1)4 + (y + 2)(−1) + (z − 3)2 = 0

or

    4x − y + 2z = 12.

That is, the Cartesian equation of this plane is a first degree or linear equation in the coordinates.
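The passage from the vector equation to the Cartesian equation is mechanical, so it can be traced in a few lines of Python. This sketch (ours, not the text's) expands (P − A) ∘ N = 0 for the data of Example 4.

    A = (1, -2, 3)                           # the given point
    N = (4, -1, 2)                           # the given normal
    d = sum(a * n for a, n in zip(A, N))     # A o N = 4 + 2 + 6 = 12

    def on_plane(P):
        # (P - A) o N = 0 is equivalent to P o N = d
        return sum(p * n for p, n in zip(P, N)) == d

    assert d == 12                           # the equation is 4x - y + 2z = 12
    assert on_plane((1, -2, 3)) and on_plane((3, 0, 0))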


This example suggests that the graph of a first degree equation in 3-space is a plane, not a line.

Theorem 1.4 The Cartesian equation of a plane in 3-space is a linear equation. Conversely, the locus of points (x, y, z) satisfying the linear equation ax + by + cz = d is a plane with normal (a, b, c).

Proof The first part is left to the reader; see problem 7 at the end of this section. For the converse, suppose A = (x₀, y₀, z₀) is a point on the locus; then ax₀ + by₀ + cz₀ = d. If P = (x, y, z) is a general point on the locus, then ax + by + cz = d, and putting the equations together gives

    ax + by + cz = ax₀ + by₀ + cz₀

or

    a(x − x₀) + b(y − y₀) + c(z − z₀) = 0

or

    (a, b, c) ∘ [(x, y, z) − (x₀, y₀, z₀)] = 0,

which is N ∘ (P − A) = 0 when N = (a, b, c). That is, P satisfies the equation of the plane passing through the point A with normal N.

The point of Theorem 1.4 is that planes, not lines, have linear Cartesian equations in ℰ₃. Since lines in the plane have linear equations, it is often assumed that lines in 3-space must also have linear equations. But a linear equation should not be associated with a line; rather it is associated with the figure determined by a point and a normal direction.

Definition A hyperplane in n-space is the set of all points P satisfying the vector equation (P − A) ∘ N = 0, where P, A, N ∈ ℰₙ and N ≠ 0. For n = 3 a hyperplane is simply a plane in ℰ₃.

Every hyperplane has a linear Cartesian equation and conversely. For P, N, A in the equation (P − A) ∘ N = 0, let P = (x₁, …, xₙ), N = (a₁, …, aₙ), and A ∘ N = d. Then (P − A) ∘ N = 0, or P ∘ N = A ∘ N, becomes a₁x₁ + ⋯ + aₙxₙ = d, a general linear equation in the n coordinates of P. The converse follows just as in the proof of Theorem 1.4.
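The derivation above works in any dimension, so it is easily expressed in code. A sketch (our illustration; the function name is invented), using the data of problem 9e below:

    def hyperplane(N, A):
        # (P - A) o N = 0 becomes a1*x1 + ... + an*xn = d with d = A o N
        d = sum(n * a for n, a in zip(N, A))
        return N, d

    # N = (3, 1, 0, 2) and A = (2, 1, 1, 0) give 3x1 + x2 + 2x4 = 7 in 4-space.
    coefficients, d = hyperplane((3, 1, 0, 2), (2, 1, 1, 0))
    assert coefficients == (3, 1, 0, 2) and d == 7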
This means that a line in the plane E² can be regarded as a hyperplane. For example, consider the line ℓ in E² with equation y = mx + b. Then

    0 = y − mx − b = (x, y) ∘ (−m, 1) − b
      = (x, y) ∘ (−m, 1) − (0, b) ∘ (−m, 1)
      = [(x, y) − (0, b)] ∘ (−m, 1) = (P − A) ∘ N.

Here A = (0, b) is a point on ℓ and N = (−m, 1) is the direction perpendicular to ℓ.

Problems

1. Perform the indicated operations and determine the vector space from which the vectors come.
a. (2, −1, 1) + (6, 2, 5). d. (1, 3, 2, 0, 1) − (1, 4, 3, 6, 5).
b. 4(2, −3, 1, 0) + 3(−1, 4, 2, 6). e. (2/3, −1/3, 2/3) ∘ (2/3, −1/3, 2/3).
c. (2, 4, −1) ∘ (3, −3, −6). f. (3, 1, −4, 2) ∘ (−1, 1, 3, 4).
2. Prove that U + V = V + U when U, V ∈ 𝒱₃.
3. Prove that (r + s)U = rU + sU when r, s ∈ R and U ∈ 𝒱ₙ.
4. Find the lengths of the following vectors from ℰ₃:
a. (1/2, 1/2, 1/2). b. (1, 3, −2). c. (1/3, 2/3, −2/3).
5. What is the angle between a diagonal of a cube in 3-space and an edge of the cube?
6. Find a vector equation for the line passing through the following pairs of points:
a. (7, 2, −4), (2, −1, 4). c. (1, 3, 0, 2), (0, −1, 2, 0).
b. (−4, 3, 0), (2, 1, 7). d. (3, 7, −1, −5, 0), (1, −1, 4, 2, 5).
7. Show that the Cartesian equation of a plane in 3-space is linear.
8. Use the corner of a room, the floor and two walls, to visualize the three coordinate planes in Cartesian 3-space. Determine where the points with coordinates (2, 1, 4), (2, −1, 4), and (2, 1, −4) would be located if the coordinates are given in feet. Then determine where the graphs with the following equations would be located.
a. P = (0, 0, t), t ∈ R. b. z = 0. c. z = 4. d. x = 3.
e. P = t(0, 5, −5) + (0, 3, 0), t ∈ R. f. y = −1. g. y = x.
h. P = t(1, 1, 0), t ∈ R. i. x + y + z = 3.
9. Find vector and Cartesian equations for the hyperplane with given normal N passing through the given point A.
a. N = (0, 0, 1), A = (1, 2, 0). d. N = (1, 0), A = (−7, 0).
b. N = (−3, 2, 1), A = (4, 0, 8). e. N = (3, 1, 0, 2), A = (2, 1, 1, 0).
c. N = (2, −3), A = (1, 1).
10. What is the graph of the Cartesian equation y = 5:
a. in E²? b. in 3-space? c. in 4-space?
11. Determine if the following pairs of lines intersect. That is, are there values for the parameters t and s which give the same point?
a. P = t(2, 1, −5), P = s(4, 3, 0), t, s ∈ R.
b. P = t(1, −1, 2), P = s(1, 3, −2) + (4, 4, 0), t, s ∈ R.
c. P = t(1, 3, 0) + (2, 0, 1), P = s(4, −1, 3) + (1, 1, 2), t, s ∈ R.
d. P = t(1, 0, 3) + (4, −5, −6), P = s(−3, 2, 6) + (0, 2, 0), t, s ∈ R.
e. P = t(1, −2, −1) + (3, −3, −1), P = s(3, 0, −4) + (4, 1, −3), t, s ∈ R.
12. Write equations in the form (P − A) ∘ N = 0 for the lines in the plane with the following Cartesian or vector equations:
a. y = 4x − 3. b. y = 7. c. P = t(3, 2) + (6, 1), t ∈ R.
d. P = t(1, 3) + (1, 1), t ∈ R. e. P = (t, t), t ∈ R.
13. A vector U ∈ ℰₙ is a unit vector if ‖U‖ = 1.
a. Show that for any nonzero vector U ∈ ℰₙ, U/‖U‖ is a unit vector.
b. Obtain unit vectors from (2, 2, 1), (3, 4), (1, 1, 1).
14. Let U be direction numbers for a line ℓ in 3-space. Show that if U is a unit vector then its three components are the cosines of the angles between U and the three positive coordinate axes. (The components of U are called direction cosines of the line ℓ.)
15. Find direction cosines for the line through
a. (2, 4, 6) and (3, 2, 4). b. (2, 3, 5) and (3, 3, 6). c. (0, 4, 1) and (0, 9, 1).
16. Three vertices of the box in Figure 13 are not labeled. What are the coordinates of these three points?
17. How might the vectors of 𝒱₁ be represented geometrically?
Real Vector Spaces

§1. Definition and Examples


§2. Subspaces ·
§3. Linear Dependence
§4. Bases and Dimension
§5. Coordinates
§6. Direct Sums

The space "/í 2 was introduced to illustrate the essential properties to be instead of real number is that the study of real vector spaces can easily be
included in the definition of an abstract vector space. Since this space itself generalized to complex vector spaces by sirnply changing the rneaning of the
appeared abstract and uninteresting, the dot product was added yielding C 2 term scalar. However, at this point little would be gained by allowing the
and a new approach to Euclidean geometry. We now return to the simple scalars to come from sorne other field, and since only real vector spaces will
properties of "/í 2 which lead to many different spaces and ideas; it will be be under consideration until Chapter 6, the adjective real will be dropped for
sorne time before we again find the need to add more algebraic structure. convenience.
The definition of a vector space is abstract in that it says nothing about
what vectors are; it only lists a few basic properties. The fact that "/í 2 is a
vector space, as defined above, is contained in a theorem and a problem
§1. Definition and Examples found in the preceding chapter. But rnany other algebraic systems also satisfy
the definition.
Definition A real vector space "/í is a set of elements U, V, W, ... , called
vectors, together with the following algebraic structure: Example 1 The n-tuple spaces "fí. are all vector spaces as defined
abo ve. The proof of this is very similar to the proof for "/í 2 , the only difference
being in the nurnber of cornponents toa vector. Since c. is obtained from "fí.
Addition: The sum U + V ls defined for every pair of vectors and
by adding the dot product, c. is a vector space for each positive integer n.
satisfies

l. U+ VE"fí (Ciosure) Example 2 Consider the set of all arrows or directed line segrnents

2. U+ (V+ W) = (U+ V) + W (Associative Law) OP frorn the órigin O to the point P in the plane E 2 • Define addition of arrows
with the parallelogram rule for addition. To define scalar multiplication,
3. U+ V= V+ U (Commutative Law) --> -->
suppose the length of OP is x. ForrE R, Jet rOP be
4. There exists a zero vector OE "/í such that U+ O= U foral! U E "/í.
5. For every vector U E "/í there is an addi'tive in verse -U E "/í such that i. the arrow in the sarne direction as OP with length rx if r > O;
U+ (-U)= O.
ii. the origin if r = O;
-->
111. the arrow in the opposite direction from OP with length -rx if
Sea lar multiplication: The scalar rnultiple rU is defined for every scalar
r <O.
rE R and every vector U E "/í, and satisfies

l. rU E "/í (Ciosure) Using a little plane geornetry one can show that the set of all arrows
together with these two operations satisfies the definition of a vector space.
2. r(U + V) = rU + rV Notice that this vector space is ncither "/í 2 nor C 2 •
(Distributive Laws)
(r + s)U = rU + sU
3. (rs)U = r(sU) (Associative Law) Example 3 The real number systern R is a vector space. In fact, the
conditions in the definition of a vector space becorne the field properties of R
4. l U = U, l the rnultiplicative identity in R. when U, V, and W are taken to be real nurnbers. Thus a real nurnber can be a
vector.
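For a concrete feeling for the definition, the operations of the n-tuple spaces 𝒱ₙ (Example 1 below) can be written out and the properties tested on samples. A minimal Python sketch (the names are ours), in which each property reduces to a property of real numbers:

    def add(U, V):
        # componentwise addition of n-tuples
        return tuple(u + v for u, v in zip(U, V))

    def scale(r, U):
        # componentwise scalar multiplication
        return tuple(r * u for u in U)

    U, V = (2, -1, 1), (6, 2, 5)
    assert add(U, V) == add(V, U) == (8, 1, 6)                    # commutativity
    assert scale(3, add(U, V)) == add(scale(3, U), scale(3, V))   # distributivity
    assert scale(1, U) == U                                       # 1U = U
    zero = (0, 0, 0)
    assert add(U, zero) == U and add(U, scale(-1, U)) == zero     # 0 and -U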
The word "real" in the terrn real vector space indicates that the scalars
are real nurnbers. It would be possible to choose the scalars frorn sorne other Example 4 Let R[t] denote the set of all polynomials in t with real
field such as the cornplex nurnber systern, in which case the system would be coefficients, for example, 2/3 + St - 4t 5 E R[t]. With the usual algebraic
called a "cornplex vector space." One advantage of using the term scalar rules for adding polynornials and multiplying a polynomial by a real nurnber,

R[t] is a vector space. To prove this it is necessary to prove that all nine properties of addition and scalar multiplication hold in R[t]. For example, to prove commutativity of addition it is necessary to show that P + Q = Q + P when

    P = a₀ + a₁t + ⋯ + aₙtⁿ   and   Q = b₀ + b₁t + ⋯ + bₙtⁿ.

This follows from the fact that addition in R is commutative, for the term (aₖ + bₖ)tᵏ in the sum P + Q becomes (bₖ + aₖ)tᵏ, a term in Q + P. The zero vector of R[t] is the number 0; since the elements of R[t] are not polynomial equations, a polynomial such as 3 + t cannot be zero.

This example begins to show how general the term "vector" is, for in this space polynomials are vectors. Thus 5 − 7t³ + 2t⁶ is a vector.

Example 5 Consider the set of all complex numbers a + bi, with a, b ∈ R, i = √−1, and equality given by a + bi = c + di if a = c, b = d. If addition is defined in the usual way,

    (a + bi) + (c + di) = a + c + (b + d)i,

and scalar multiplication is defined by

    r(a + bi) = ra + (rb)i   for r ∈ R,

then the complex numbers form a (real) vector space; see problem 3 at the end of this section.
Example 6 Let ℱ denote the set of all continuous functions with domain [0, 1] and range in R. To make ℱ into a vector space it is necessary to define addition and scalar multiplication and show that all the nine properties hold. For f and g in ℱ define f + g by

    (f + g)(x) = f(x) + g(x)   for all x ∈ [0, 1].

It is proved in elementary calculus that the sum of continuous functions is continuous, therefore ℱ is closed under addition. Commutativity follows from commutativity of addition in R, for if f, g ∈ ℱ,

    (f + g)(x) = f(x) + g(x)   Definition of addition in ℱ
               = g(x) + f(x)   Addition is commutative in R
               = (g + f)(x)    Definition of addition in ℱ.

Now two functions in ℱ are equal if and only if they give the same value for each value of x ∈ [0, 1], therefore f + g = g + f. Associativity of addition follows from the fact that addition in R is associative in the same way.
Let 0 denote the zero function in ℱ, that is, 0(x) = 0 for all x ∈ [0, 1]. Then 0 is the zero vector of ℱ, for given any f ∈ ℱ,

    (f + 0)(x) = f(x) + 0(x) = f(x) + 0 = f(x),

therefore f + 0 = f. Finally, if −f is defined by (−f)(x) = −[f(x)], one can show that −f is the additive inverse of f.
Scalar multiplication is defined in ℱ by

    (rf)(x) = r[f(x)]   for any f ∈ ℱ and r ∈ R.

Aside from closure, the properties of scalar multiplication which must be satisfied in ℱ, if it is to be a vector space, are proved using field properties of the real number system. See problem 5 at the end of this section.
Therefore ℱ is a vector space, and such continuous functions as x² + 4x, eˣ, sin x, and √(x³ + 1), x ∈ [0, 1], are vectors.
Vector spaces similar to ℱ can be obtained in endless variety. For example, the condition that the functions be continuous can be dropped or changed, say to differentiable, or the domain can be changed. In fact, other vector spaces, besides R, could be used for the domain or range, and in time such vector spaces of functions will become our main concern.

Example 7 A set containing only one element, say Z, can be made into a vector space by defining Z + Z = Z and rZ = Z for all r ∈ R. The conditions in the definition of a vector space are easily satisfied. Since every vector space must have a zero vector, Z is that vector, and this simple vector space is called a zero vector space.

Example 8 Consider the set of all sequences {aₙ} of real numbers which converge to zero, that is, limₙ→∞ aₙ = 0. With addition and scalar multiplication defined by

    {aₙ} + {bₙ} = {aₙ + bₙ}   and   r{aₙ} = {raₙ},

this set becomes a vector space. Thus the sequences {1/n} or 1, 1/2, 1/3, 1/4, … and {(−1/2)ⁿ} or −1/2, 1/4, −1/8, 1/16, … can be regarded as vectors.
Other vector spaces can be constructed similar to this one. For example, the set of all convergent infinite series could be used, in which case Σₙ₌₁^∞ (−1)ⁿ(1/n) = −1 + 1/2 − 1/3 + 1/4 − 1/5 + ⋯ would be a vector.
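The pointwise operations of Example 6 translate directly into code. Here is a small Python sketch (ours, with invented names) in which functions are combined exactly as the definitions prescribe.

    from math import sin, exp

    def f_add(f, g):
        # (f + g)(x) = f(x) + g(x)
        return lambda x: f(x) + g(x)

    def f_scale(r, f):
        # (rf)(x) = r * f(x)
        return lambda x: r * f(x)

    zero = lambda x: 0.0                 # the zero function, the zero vector of F
    h = f_add(sin, f_scale(2, exp))      # h(x) = sin x + 2e^x is a vector in F
    for x in (0.0, 0.25, 1.0):
        assert f_add(h, zero)(x) == h(x)              # f + 0 = f
        assert f_add(h, f_scale(-1, h))(x) == 0.0     # f + (-f) = 0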

The above examples should give an idea of the diversity of vector spaces. It is because of this diversity that an abstract vector space will be studied rather than a particular example. For the proof that a certain condition holds for the ordered pairs of 𝒱₂ cannot be used to claim that the condition holds for the functions of ℱ or the polynomials of R[t]. On the other hand, a condition shown to hold in an abstract vector space 𝒱 also holds in 𝒱₂, ℱ, and R[t], as well as in vector spaces constructed after the proof is completed. But since the vector space 𝒱 is abstract, nothing can be assumed about it without justification. Just how little is known about 𝒱 is indicated by the fact that it must be assumed that 1U = U for all U ∈ 𝒱. The vectors of 𝒱 cannot be described. In fact, the term "vector" now only means "member of a vector space," and a vector might be anything from a number to a function.
It should be noticed that two symbols appear repeatedly in the above examples. The zero vector was always denoted by 0, even though it might be the number 0, an n-tuple of zeros, or a zero function. Although it is possible to denote the zero vector of a space 𝒱 by 0_𝒱, the meaning of 0 is usually clear from the context in which it is used. The other symbol which can mean many things is the plus sign. Addition is a different operation in each vector space. The precise meaning of a plus sign must also be determined from the context.
Not all the examples presented above will be of equal interest. Aside from vector spaces to be obtained later, we will be interested primarily in 𝒱ₙ and ℰₙ, with many references to R[t] and ℱ. An examination of the examples and the problems below will show that a system forms a vector space by virtue of properties of real numbers, or more generally, because the real number system is itself a vector space. Future examples of vector spaces will also be based on the properties of R or some other vector space.
Problems

1. Find the sum of the following pairs of vectors.
a. (2, −4, 0, 2, 1), (5, 2, −4, 2, −9) ∈ 𝒱₅. c. 2 − i, 4i − 7 ∈ C.
b. 2 + 6t² − 3t⁴, 2t + 3t⁴ ∈ R[t]. d. cos x, 4 − √x ∈ ℱ.
2. Find the additive inverse for each vector in problem 1.
3. Show that the complex number system is a vector space, as claimed in Example 5.
4. Verify that scalar multiplication in R[t] satisfies the conditions required if R[t] is to be a vector space.
5. a. In Example 6, fill in the reasons in the chain of equalities yielding f + 0 = f.
b. Show that f + (−f) = 0, f ∈ ℱ, with reasons.
c. Verify that scalar multiplication in ℱ satisfies the conditions necessary for ℱ to be a vector space.
6. What is the zero vector in a. 𝒱₃? b. ℱ? c. C? d. R[t]? e. 𝒱₅? f. the space of sequences which converge to zero? g. the space of arrows in E² starting at the origin?
7. Each of the following systems fails to be a vector space. In each case find all the conditions in the definition of a vector space which fail to hold.
a. The set of all ordered pairs of real numbers with addition defined as in 𝒱₂ and scalar multiplication given by r(a, b) = (ra, b).
b. As in a except scalar multiplication is r(a, b) = (ra, 0).
c. The set of all ordered pairs with scalar multiplication as in 𝒱₂ and addition given by (a, b) + (c, d) = (a − c, b − d).
d. The set of all functions f from R to R such that f(0) = 1, together with the operations of ℱ. (cos x and x³ + 3x + 1 are in this set.)
8. A real 2 × 2 (read "two by two") matrix is defined to be an array of four numbers in the form
    (a b)
    (c d)
with a, b, c, d ∈ R. Using the vector spaces 𝒱ₙ as a guide, define addition and scalar multiplication for 2 × 2 matrices and prove that the resulting system is a vector space.

§2. Subspaces

In the study of an abstract algebraic system such as a vector space, subsystems which share the properties of the system are of major importance. The consideration of such subsystems in vector spaces will lead to the central concepts of linear algebra. An analogy can be drawn with geometry and the importance of lines in the plane or lines and planes in 3-space. Before stating the definition of a subspace, a little terminology concerning sets is needed.

Definition Let A and B be sets. A is a subset of B if every element of A is in B, that is, if x ∈ A implies x ∈ B. If A is a subset of B, write A ⊂ B for "A is contained in B" or B ⊃ A for "B contains A."

For example, the set of all integers is a subset of R, and {t⁴, 3t + 2} ⊂ R[t]. In general a set contains two simple subsets, itself and the "empty set," the set with no elements.
Definition Two sets A and B are equal if A ⊂ B and B ⊂ A.

Definition A subset 𝒮 of a vector space 𝒱 is a subspace of 𝒱 if 𝒮 is a vector space relative to the operations of 𝒱. (Note: 𝒮 is a capital script S.)

Every vector space 𝒱 contains two simple subspaces. Namely, 𝒱 itself, since every set is a subset of itself, and {0}, the set containing only the zero vector of 𝒱. {0} is easily seen to be a vector space in its own right and is therefore a subspace of 𝒱. {0} is called the zero subspace of 𝒱 and should not be confused with the zero vector space defined in the previous section. These two subspaces are called trivial or improper subspaces of 𝒱, and all other subspaces are called nontrivial or proper subspaces of 𝒱. Thus 𝒮 is a proper subspace of 𝒱 if it contains some nonzero vectors but not all the vectors of 𝒱.

Example 1 Let 𝒮 = {(0, r, 0) | r ∈ R}. Show that 𝒮 is a proper subspace of 𝒱₃.
The elements of 𝒮 are ordered triples, so 𝒮 is a subset of 𝒱₃. It is not difficult to show, using the operations of 𝒱₃, that 𝒮 satisfies all nine conditions in the definition of a vector space. For example, suppose U, V ∈ 𝒮; then there exist r, s ∈ R such that U = (0, r, 0), V = (0, s, 0), and U + V = (0, r + s, 0). Since R is closed under addition, r + s ∈ R and the sum U + V is in 𝒮. That is, 𝒮 is closed under addition. Since U, V ∈ 𝒱₃ and addition in 𝒱₃ is commutative, U + V = V + U in 𝒮. The remaining properties are proved similarly.
𝒮 is a proper subspace of 𝒱₃, for the vector (0, −5, 0) ∈ 𝒮, implying that 𝒮 ≠ {0}, and (0, 0, 7) ∉ 𝒮 but (0, 0, 7) ∈ 𝒱₃, so 𝒮 ≠ 𝒱₃.

Applying the definition directly is not the best way to determine if a subset is a subspace. However, a few basic facts are needed before obtaining a simpler method.

Theorem 2.1 Let 𝒱 be a vector space. Then for all vectors U ∈ 𝒱,
1. −U and 0 are unique.
2. 0U = 0 and r0 = 0, for all r ∈ R.
3. If rU = 0, then either r = 0 or U = 0.
4. −U = (−1)U.

Proof 1. To show that 0 is unique, suppose there is a vector X in 𝒱 such that U + X = U for all U ∈ 𝒱. Consider X + 0. By the definition of 0, X + 0 = X. But by the commutativity of addition, X + 0 = 0 + X, and 0 + X = 0 by assumption. Therefore X = 0.
To show that the additive inverse of U is unique, suppose U + Y = 0 for some Y ∈ 𝒱. Then

    −U = −U + 0 = −U + (U + Y) = ((−U) + U) + Y
       = (U + (−U)) + Y = 0 + Y = Y + 0 = Y.

2. To show that 0U = 0 note that 0U = (0 + 0)U = 0U + 0U. Therefore

    0 = 0U − 0U = (0U + 0U) − 0U = 0U + (0U − 0U) = 0U + 0 = 0U.

Can you find a similar proof for the statement r0 = 0?
3. Left to the reader.
4. To prove that −U = (−1)U, it need only be shown that (−1)U acts like the additive inverse of U, since by part 1 the additive inverse of U is unique. But U + (−1)U = 1U + (−1)U = (1 + (−1))U = 0U = 0. Therefore (−1)U is the additive inverse of U, −U.

Parts of Theorem 2.1 were obtained for 𝒱₂ in problems 7, 8, and 10 on page 7; however, the theorem is stated for an abstract vector space. Therefore these are properties of any algebraic system satisfying the definition of a vector space, be it 𝒱ₙ, R[t], or a system to be defined at some future time.

Definition Let S be a subset of a vector space 𝒱. S is closed under addition if U + V ∈ S for all U, V ∈ S, where + is the addition in 𝒱. S is closed under scalar multiplication if rU ∈ S for all U ∈ S and r ∈ R, where scalar multiplication is as defined in 𝒱.

Example 2 Show that S = {(a, 0) | a ∈ R, a > 0} ⊂ 𝒱₂ is closed under addition but not scalar multiplication.
Let U, V ∈ S; then U = (x, 0), V = (y, 0) for some x > 0 and y > 0. The sum x + y > 0, so (x + y, 0) ∈ S. But (x + y, 0) = U + V, therefore S is closed under addition. However, (−1)U = (−x, 0) and −x < 0. Therefore the scalar multiple (−1)U ∉ S, and S is not closed under scalar multiplication.
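Closure can be probed numerically. A quick Python sketch of Example 2 (our illustration): sums of members of S stay in S, while the scalar multiple (−1)U does not.

    def in_S(P):
        # membership in S = {(a, 0) | a in R, a > 0}
        return P[1] == 0 and P[0] > 0

    U, V = (3.0, 0.0), (0.5, 0.0)
    total = (U[0] + V[0], U[1] + V[1])
    assert in_S(U) and in_S(V) and in_S(total)    # closed under addition
    assert not in_S((-1 * U[0], -1 * U[1]))       # (-1)U is not in S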
The following theorem shows that it is only necessary to verify three conditions to prove that a subset of a vector space is a subspace.

Theorem 2.2 𝒮 is a subspace of a vector space 𝒱 if the following conditions hold:
1. 𝒮 is a nonempty subset of 𝒱.
2. 𝒮 is closed under addition.
3. 𝒮 is closed under scalar multiplication.

Proof Since it is given that 𝒮 is closed under addition and scalar multiplication, it must be shown that 𝒮 satisfies the remaining seven conditions in the definition of a vector space. Most of these follow because 𝒮 is a subset of 𝒱. For example, if U, V ∈ 𝒮, then U, V ∈ 𝒱 and since 𝒱 is a vector space, U + V = V + U; therefore, addition is commutative in 𝒮. Similarly, associativity of addition and the remaining properties of scalar multiplication hold in 𝒮. So it only remains to show that 𝒮 has a zero vector and that each element of 𝒮 has an additive inverse in 𝒮.
By part 1 there exists a vector V ∈ 𝒮 and by part 3, 0V ∈ 𝒮, but 0V = 0 by Theorem 2.1, so 0 ∈ 𝒮. Now suppose U is any vector in 𝒮. By 3, (−1)U ∈ 𝒮, and (−1)U = −U by Theorem 2.1, so −U ∈ 𝒮. Therefore 𝒮 is a vector space and hence a subspace of 𝒱.

Example 3 For a positive integer n, let Rₙ[t] denote the set of all polynomials in t of degree less than n. Rₙ[t] is a subset of R[t], and since tⁿ⁻¹ ∈ Rₙ[t], it is nonempty. Addition and scalar multiplication cannot increase the degree of polynomials, so Rₙ[t] is closed under the operations of R[t]. Therefore, for each positive integer n, Rₙ[t] is a subspace of R[t].

Example 4 Show that 𝒲 = {f ∈ ℱ | f(1/3) = 0} is a subspace of the vector space of functions ℱ.
𝒲 is nonempty since the function f(x) = 18x² − 2 is a member. To prove that 𝒲 is closed under the operations of ℱ, let f, g ∈ 𝒲. Then f(1/3) = g(1/3) = 0, so (f + g)(1/3) = f(1/3) + g(1/3) = 0 + 0 = 0. Therefore f + g ∈ 𝒲, and 𝒲 is closed under addition. For any r ∈ R, (rf)(1/3) = r(f(1/3)) = r0 = 0, so rf ∈ 𝒲 and 𝒲 is closed under scalar multiplication. Thus 𝒲 satisfies the hypotheses of Theorem 2.2 and is a subspace of ℱ. In this example it should be clear that the choice of 1/3 was arbitrary while the choice of 0 = f(1/3) was not.

Example 5 Show that 𝒮 = {f ∈ ℱ | f is differentiable and f′ = f} is a subspace of ℱ. Note that f ∈ 𝒮 if and only if y = f(x) is a solution of the differential equation dy/dx = y.
The zero function clearly satisfies dy/dx = y, so 𝒮 is nonempty. If f, g ∈ 𝒮, then (f + g)′ = f′ + g′, so f′ = f and g′ = g implies (f + g)′ = f + g. That is, 𝒮 is closed under addition. Similarly, for any scalar r, (rf)′ = rf′, so that if f′ = f, then (rf)′ = rf and 𝒮 is closed under scalar multiplication also. Therefore, 𝒮 is a subspace of ℱ; in fact, 𝒮 = {ceˣ | c ∈ R}.
It is not always the case that the solutions of a differential equation form a vector space, but when they do all the theory of vector spaces can be applied.

Example 6 If ℰₙ is viewed as Euclidean n-space, then lines through the origin are subspaces of ℰₙ.
A line through the origin is given by all scalar multiples of some nonzero vector U ∈ ℰₙ. Therefore the claim is that {rU | r ∈ R} is a subspace of ℰₙ. The set is nonempty because it contains U. Since rU + sU = (r + s)U and r(sU) = (rs)U for all r, s ∈ R, the set is closed under the operations of ℰₙ. Therefore each line through the origin is a subspace of ℰₙ.

Example 7 𝒯 = {a(2, 1, 3) + b(1, 4, 1) + c(1, −3, 2) | a, b, c ∈ R} is a proper subspace of 𝒱₃.
𝒯 ⊂ 𝒱₃ since 𝒱₃ is closed under addition and scalar multiplication. 𝒯 is nonempty, for (2, 1, 3) ∈ 𝒯 when a = 1, b = c = 0. For closure suppose U, V ∈ 𝒯; then there exist scalars such that

    U = a₁(2, 1, 3) + b₁(1, 4, 1) + c₁(1, −3, 2)

and

    V = a₂(2, 1, 3) + b₂(1, 4, 1) + c₂(1, −3, 2).

So

    U + V = [a₁ + a₂](2, 1, 3) + [b₁ + b₂](1, 4, 1) + [c₁ + c₂](1, −3, 2)

and

    rU = [ra₁](2, 1, 3) + [rb₁](1, 4, 1) + [rc₁](1, −3, 2).

Therefore the vectors U + V and rU satisfy the condition for membership in 𝒯, and 𝒯 is closed under addition and scalar multiplication. Thus 𝒯 is a subspace of 𝒱₃.
Since 𝒯 ≠ {0}, 𝒯 is a proper subspace of 𝒱₃ if not every ordered triple is in 𝒯. That is, given (r, s, t) ∈ 𝒱₃, do there exist scalars a, b, c such that

    a(2, 1, 3) + b(1, 4, 1) + c(1, −3, 2) = (r, s, t)?

This vector equation yields the following system of linear equations:

    2a + b + c = r
    a + 4b − 3c = s
    3a + b + 2c = t.

If c is eliminated from the first pair and last pair of equations, then

    7a + 7b = s + 3r
    11a + 11b = 2s + 3t

is obtained, and for some choices of r, s, t these equations have no solution. In particular, if r = 1, s = t = 0, no values for a and b can satisfy the equations. Therefore (1, 0, 0) ∉ 𝒯, and 𝒯 is a proper subspace of 𝒱₃.
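The elimination in Example 7 can be retraced exactly in code. In this Python sketch (ours), exact fractions show that for (r, s, t) = (1, 0, 0) the two reduced equations assign contradictory values to a + b, so no solution exists.

    from fractions import Fraction

    r, s, t = 1, 0, 0
    # After eliminating c:  7(a + b) = s + 3r  and  11(a + b) = 2s + 3t.
    first = Fraction(s + 3 * r, 7)        # a + b must equal 3/7
    second = Fraction(2 * s + 3 * t, 11)  # a + b must equal 0
    assert first != second                # contradiction: (1, 0, 0) is not in T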
Example 8 𝒰 = {a(2, −5) + b(3, −7) | a, b ∈ R} is the trivial subspace 𝒱₂ of the vector space 𝒱₂.
The proof that 𝒰 is a subspace of 𝒱₂ follows the pattern in Example 7. Assuming this is done, 𝒰 ⊂ 𝒱₂ since 𝒰 is a subspace of 𝒱₂. Therefore to show that 𝒰 = 𝒱₂, it is necessary to prove that 𝒱₂ ⊂ 𝒰. If (x, y) ∈ 𝒱₂, then it is necessary to find scalars a and b such that (x, y) = a(2, −5) + b(3, −7), or x = 2a + 3b and y = −5a − 7b. These equations are easily solved for a and b in terms of x and y, giving

    (x, y) = [−7x − 3y](2, −5) + [5x + 2y](3, −7).

Therefore (x, y) ∈ 𝒰. So 𝒱₂ ⊂ 𝒰 and 𝒰 = 𝒱₂.

Problems

1. Show that the following sets are subspaces of the indicated spaces.
a. {(a, b, 0) | a, b ∈ R} ⊂ 𝒱₃.
b. {(x, y, z) | x = 3y, x + y = z} ⊂ 𝒱₃.
c. {at³ + bt | a, b ∈ R} ⊂ R[t].
d. {f | f(x) = a sin x + b cos x, a, b ∈ R} ⊂ ℱ.
2. Show that a plane through the origin in 3-space represents a subspace of ℰ₃.
3. Prove that if a subspace 𝒮 of 𝒱₃ contains the vectors (1, 1, 1), (0, 1, 1), and (0, 0, 1), then 𝒮 = 𝒱₃.
4. Suppose a subspace 𝒮 of 𝒱₃ contains the following vectors; determine in each case if 𝒮 can be a proper subspace of 𝒱₃.
a. (1, 0, 0), (0, 1, 0), and (0, 0, 1). b. (2, 1, 0) and (3, 0, 1).
c. (1, 0, 2), (4, 0, −5), (0, 0, 1), and (6, 0, 0). d. (1, 1, 0), (1, 0, 1), and (0, 1, 1).
5. Write an arbitrary element from the following sets.
a. {(x, y, z) | x + 4y − z = 0}. b. {(a, b, c, d) | 3a = 2c, b = 5d}.
c. {(a, b, c) | a + b − c = 0 and 3a − b + 5c = 0}.
d. {(x, y, z) | x + 3y − 2z = 0 and 2x + 7y − 6z = 0}.
6. Determine if the following are subspaces.
a. {P ∈ R[t] | degree of P is 3} ⊂ R[t].
b. {(a, b) | a and b are integers} ⊂ ℰ₂.
c. {(2a + 3b, 4a + 6b) | a, b ∈ R} ⊂ 𝒱₂.
d. {(x, y) | y = x²} ⊂ 𝒱₂.
e. {f ∈ ℱ | f(x) = ax² or f(x) = ax³ for some a ∈ R} ⊂ ℱ.
f. {(x, y, z) | x + y + z = 1} ⊂ 𝒱₃.
7. Why is it impossible for a line that does not pass through the origin of the Cartesian plane to represent a subspace of 𝒱₂?
8. Prove that the vector spaces 𝒱₁ and R have no proper subspaces.
9. Determine if the following are subspaces of ℱ.
a. 𝒮 = {f | f′ = 0}. What is the set 𝒮?
b. 𝒮 = {f | f′ + 3f + 5 = 0}.
c. 𝒮 = {f | f″ + 4f′ − 7f = 0}.
d. 𝒮 = {f | (f′)² = f}.
10. Suppose 𝒮 and 𝒯 are subspaces of a vector space 𝒱. Define the sum of 𝒮 and 𝒯 by

    𝒮 + 𝒯 = {U + V | U ∈ 𝒮, V ∈ 𝒯},

the intersection of 𝒮 and 𝒯 by

    𝒮 ∩ 𝒯 = {U | U ∈ 𝒮 and U ∈ 𝒯},

and the union of 𝒮 and 𝒯 by

    𝒮 ∪ 𝒯 = {U | U ∈ 𝒮 or U ∈ 𝒯}.

a. Prove that the sum 𝒮 + 𝒯 is a subspace of 𝒱.
b. Prove that the intersection 𝒮 ∩ 𝒯 is a subspace of 𝒱.
c. Show that in general the union 𝒮 ∪ 𝒯 satisfies only two of the conditions in Theorem 2.2.
d. Find a necessary condition for 𝒮 ∪ 𝒯 to define a subspace of 𝒱.
11. For the proof of Theorem 2.1, fill in the reasons for the chain of equalities showing that
a. −U = Y in part 1. b. 0 = 0U in part 2. c. U + (−1)U = 0 in part 4.
12. Prove that rU = 0 implies r = 0 or U = 0.
13. Suppose 𝒱 is a vector space containing a nonzero vector. Prove that 𝒱 has an infinite number of vectors.
14. Suppose Z is an element of a vector space 𝒱 such that for some U ∈ 𝒱, U + Z = U. Prove that Z = 0.
§3. Linear Dependence

The previous problem set touches on several ideas that are of fundamental importance to linear algebra. In this section we will introduce terms for these ideas and begin to look at their consequences.

Definition If U₁, …, Uₖ are vectors from a vector space and r₁, …, rₖ are scalars, then the expression r₁U₁ + ⋯ + rₖUₖ is a linear combination of the k vectors.

The expressions 2(2, 1) + 3(3, −5) − ½(6, 2) = (10, −14) and 5(2, 1) − (6, 2) = (4, 3) are linear combinations of the three vectors (2, 1), (3, −5), and (6, 2) from 𝒱₂. In fact, every vector in 𝒱₂ can be expressed as a linear combination of these vectors; one way to do this is

    (a, b) = [17a + 3b](2, 1) + a(3, −5) − [6a + b](6, 2).

As further examples, the sets 𝒯 and 𝒰 in Examples 7 and 8 of the previous section consist of all possible linear combinations of three and two vectors respectively.

Definition Let S be a subset of a vector space 𝒱. The set of all possible linear combinations of elements in S is called the span of S and is denoted by ℒ(S). If S is the empty subset, ℒ(S) contains only the zero vector of 𝒱.

From the above discussion, it can be said that 𝒱₂ is the span of (2, 1), (3, −5), and (6, 2), or 𝒱₂ = ℒ{(2, 1), (3, −5), (6, 2)}.
The span of a single vector is simply the set of all scalar multiples of the vector. Thus if U is a nonzero vector in 𝒱₂, ℒ{U} can be thought of as the line in the plane through U and the origin.

Example 1 Suppose U, V ∈ 𝒱₃ are nonzero. Determine the possible geometric interpretations of ℒ{U, V}.
If one vector is a scalar multiple of the other, say U = rV, then any linear combination aU + bV becomes a(rV) + bV = (ar + b)V. Therefore ℒ{U, V} = {tV | t ∈ R}, a line through the origin.
If neither vector is a scalar multiple of the other, the scalar multiples of U and V determine two lines through the origin. A linear combination of U and V, aU + bV, is the sum of a point from each line, which by the parallelogram rule for addition is a point on the plane containing the two lines (see Figure 1). Therefore ℒ{U, V} is the plane determined by U, V, and the origin.
This example provides a geometric argument for the fact that the span of two vectors from 𝒱₃ cannot equal all of 𝒱₃.

[Figure 1: the plane spanned by U and V, with a linear combination aU + bV formed by the parallelogram rule.]

Example 2 Show that the span of {(2, −1, 5), (0, 3, −1)} contains the vector (3, 6, 5) but not (2, 5, 0).
(3, 6, 5) ∈ ℒ{(2, −1, 5), (0, 3, −1)} if there exist scalars x and y such that (3, 6, 5) = x(2, −1, 5) + y(0, 3, −1). This vector equation yields the linear equations 3 = 2x, 6 = −x + 3y, and 5 = 5x − y, which are satisfied by x = 3/2 and y = 5/2. Therefore (3, 6, 5) is in the span.
The vector (2, 5, 0) is in the span if there exist x, y ∈ R such that (2, 5, 0) = x(2, −1, 5) + y(0, 3, −1). This equation yields 2 = 2x, 5 = −x + 3y, and 0 = 5x − y. With x = 1, the other two equations become y = 2 and y = 5, therefore no solution exists. That is, (2, 5, 0) ∉ ℒ{(2, −1, 5), (0, 3, −1)}.
Example 3 ℒ{1, t, t², t³, …, tⁿ, …} = R[t].
In this case the set is infinite, but linear combinations are defined only for a finite number of vectors. Therefore ℒ{1, t, t², …} is the set of all finite linear combinations of the vectors 1, t, t², t³, …, tⁿ, …. This is precisely what polynomials are.

Theorem 2.3 If S is a subset of a vector space 𝒱, then the span of S is a subspace of 𝒱.

Proof The proof that ℒ(S) is nonempty and closed under addition is left to the reader. To show ℒ(S) is closed under scalar multiplication, let U ∈ ℒ(S); then by the definition of span there exist vectors V₁, …, Vₚ ∈ S and scalars r₁, …, rₚ such that U = r₁V₁ + ⋯ + rₚVₚ. So

    rU = r(r₁V₁ + ⋯ + rₚVₚ) = (rr₁)V₁ + ⋯ + (rrₚ)Vₚ.

Therefore rU is a linear combination of V₁, …, Vₚ, or rU ∈ ℒ(S).

Definition Let S be a subset of a vector space and 𝒰 the span of S. The vector space 𝒰 is said to be spanned or generated by S, and the elements of S span 𝒰 or are generators of 𝒰.

Therefore two of the above results can be restated as: R[t] is spanned by the vectors 1, t, t², …, and 𝒱₂ is spanned by (2, 1), (3, −5), and (6, 2). Every vector space spans itself, that is, ℒ(𝒱) = 𝒱. Therefore every vector space is spanned by some set. But the space itself is not a very interesting spanning set. An "interesting" spanning set would just span in the sense that if one vector were removed, it would no longer span.
Consider the vector space 𝒯 introduced in Example 7 of the previous section; 𝒯 = ℒ{(2, 1, 3), (1, 4, 1), (1, −3, 2)}. Although 𝒯 is spanned by the three vectors, it can be spanned by any two of them. For example, (2, 1, 3) = (1, 4, 1) + (1, −3, 2), so if U ∈ 𝒯, then

    U = a(2, 1, 3) + b(1, 4, 1) + c(1, −3, 2)
      = a[(1, 4, 1) + (1, −3, 2)] + b(1, 4, 1) + c(1, −3, 2)
      = [a + b](1, 4, 1) + [a + c](1, −3, 2).

Thus U ∈ ℒ{(1, 4, 1), (1, −3, 2)}, or 𝒯 ⊂ ℒ{(1, 4, 1), (1, −3, 2)}. The other containment is immediate, so 𝒯 = ℒ{(1, 4, 1), (1, −3, 2)}.

Definition A vector U is linearly dependent on the vectors U₁, …, Uₖ if U ∈ ℒ{U₁, …, Uₖ}, that is, if there are scalars a₁, …, aₖ such that U = a₁U₁ + ⋯ + aₖUₖ.

Thus the vector (2, 1, 3) is linearly dependent on (1, 4, 1) and (1, −3, 2) from the above considerations.
If a vector U is linearly dependent on V, then U is a scalar multiple of V. Thus if U, V ∈ 𝒱₂ and U is linearly dependent on V, then as points in the plane, U and V are collinear with the origin. If U, V, W ∈ 𝒱₃ are nonzero and W is linearly dependent on U and V, then either U, V, and W are collinear with the origin, or W is on the plane determined by U, V, and the origin.

Example 4 Show that the vector 8 − t + 7t² is linearly dependent on 2 − t + 3t² and 1 + t − t².
That is, find scalars a and b such that

    8 − t + 7t² = a(2 − t + 3t²) + b(1 + t − t²),

or that

    (8 − 2a − b) + (−1 + a − b)t + (7 − 3a + b)t² = 0.

All the coefficients in the zero polynomial are zero, so a and b satisfy

    8 − 2a − b = 0         2a + b = 8
    −1 + a − b = 0   or    −a + b = −1
    7 − 3a + b = 0         3a − b = 7.

(That is, polynomials are equal if their coefficients are equal.) These equations are satisfied by a = 3, b = 2, so that

    8 − t + 7t² = 3(2 − t + 3t²) + 2(1 + t − t²)

is the desired linear combination.

If U is linearly dependent on the vectors U₁, …, Uₖ, then U = a₁U₁ + ⋯ + aₖUₖ for some scalars a₁, …, aₖ. Therefore (−1)U + a₁U₁ + ⋯ + aₖUₖ = 0. That is, 0 is a linear combination of the k + 1 vectors, with at least one of the scalars nonzero. (Although −1 is not zero, all the other scalars a₁, …, aₖ could equal zero.)
Definition The finite set of vectors {V₁, …, Vₕ} is linearly dependent if there exist scalars b₁, …, bₕ, not all zero, such that b₁V₁ + ⋯ + bₕVₕ = 0. An infinite set is linearly dependent if it contains a finite linearly dependent subset.

The phrase "not all zero" causes much confusion. It does not mean "all not zero," and in some situations zero must be used as a coefficient. For example, the set {(1, 2), (2, 4), (1, 0)} is linearly dependent since 2(1, 2) + (−1)(2, 4) + 0(1, 0) = 0.
Every set containing the zero vector is linearly dependent; even the set {0}, for 1(0) = 0 is a linear combination satisfying the conditions of the definition.

Example 5 Determine if the set {(2, 5), (−1, 1), (−4, −3)} is linearly dependent.
Suppose

    x(2, 5) + y(−1, 1) + z(−4, −3) = 0 = (0, 0).

This vector equation yields the system of linear equations

    2x − y − 4z = 0   and   5x + y − 3z = 0.

If x = y = z = 0 is the only solution, the vectors are not linearly dependent. But there are many other solutions; for example, if x = 1, then y = −2, z = 1. Therefore (2, 5) − 2(−1, 1) + (−4, −3) = 0, and the vectors are linearly dependent.
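The nontrivial solution found in Example 5 can be confirmed mechanically; a short Python sketch (our illustration):

    # Example 5's relation: 1(2, 5) - 2(-1, 1) + 1(-4, -3) = (0, 0).
    x, y, z = 1, -2, 1
    assert (2*x - y - 4*z, 5*x + y - 3*z) == (0, 0)

    # Any multiple of (1, -2, 1) also works, so there are infinitely many
    # nontrivial solutions of the homogeneous system.
    for k in range(-3, 4):
        assert (2*k - (-2*k) - 4*k, 5*k + (-2*k) - 3*k) == (0, 0)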
Example 6 Remove one vector from the above set and determine if {(2, 5), (−1, 1)} is linearly dependent.
That is, are there scalars x, y, not both zero, such that

    x(2, 5) + y(−1, 1) = (0, 0)?

This vector equation yields 2x − y = 0 and 5x + y = 0, which together have only the solution x = y = 0. Therefore the set is not linearly dependent.

Definition A set of vectors is linearly independent if it is not linearly dependent.

Thus the set in Example 6 is linearly independent.
From our discussion of linear dependence and geometry we can say that a set of two vectors in 𝒱₂ is linearly independent if as points they are not collinear with the origin, or equivalently if they span 𝒱₂. A set of three vectors in 𝒱₃ is linearly independent if as points in 3-space they are not coplanar with the origin. We will see that in such a case the three vectors span 𝒱₃.

Theorem 2.4 The set of vectors {W₁, …, Wₙ} is linearly independent provided t₁W₁ + ⋯ + tₙWₙ = 0 implies t₁ = ⋯ = tₙ = 0, that is, provided the only linear combination yielding the zero vector is the one with all coefficients equal to zero.

Proof This follows directly from the definition of linear dependence for sets of vectors.

Example 7 Determine if the set of vectors {sin x, cos x} is linearly independent in the vector space ℱ.
Suppose a(sin x) + b(cos x) = 0(x) for some scalars a, b ∈ R. Then a(sin x) + b(cos x) = 0 for every value of x in the interval [0, 1]. But when x = 0, sin x = 0, so b = 0 and the equation becomes a(sin x) = 0 for all x ∈ [0, 1]. Now sin 1 ≠ 0 implies a = 0. Therefore, the only linear combination of these functions yielding the zero function is 0(sin x) + 0(cos x), and the set is linearly independent.

Example 8 {1, t, t², …, tⁿ, …} is linearly independent in R[t].
This is immediate, for a (finite) linear combination of these vectors is a polynomial with each term involving a different power of t. Such a polynomial is zero only if all the coefficients are zero.

We have now encountered several terms using the word linear: linear equation, linear combination, linear dependence, linear independence; and thus we begin to see why the study of vector spaces is called linear algebra. The concepts introduced in this section are basic to both the theory and the applications of linear algebra. Therefore it is important to gain a good understanding of their meaning.

Problems

1. Determine which of the following vectors are in the span of the set {(−2, 1, −9), (2, −4, 6)}. a. (0, 3, 3). b. (1, 0, 1). c. (0, 0, 0). d. (1, 0, 0). e. (1, −5, 0). f. (1, −4, 1).

2. Determine which of the following vectors are members of c. Show that each of the following vectors can be expressed in an infinite
3
.9?{1- t + 1, 31 2 + 21, 13 }. a. 12 • b. 1- l. c. 51 3 + 6t 2 + 4. number of ways as a linear combination of the vectors in S: (0, O, 0),
d. 13 +
1 + 1 + l.
2
e. 1 + 31 + 31- l.
3 2 (0, -1, O), (2, O, -1), and (6, 2, -3).
d. Prove that ..P(S) e 2'{(0, 1, O), (2, O, -1)}.
3. Which of the followíng sets span "Y 3 ?
e. Show that ..P(S) = ..P{(O, 1, 0), (2, O, -1)}.
a. {(1, 1, 0), (0, 1, 1), (1, 1, 1)}.
b. {(1, O, 0), (4, 2, 0), (0, 1, O), (0, O, 0), (O, O, 1)}. 13*. Suppose {U~> ... , Uk} is linearly independent and Vrt ..P{Vt. ... , Uk}.
c. {(1, 1, 0), (2, O, -1), (4, 4, O), (5, 3, -1)}. Prove that {V, U~> ... , Vd is linearly independent.
d. {(1, 1, 1), (J, 2, 3)}. 14*. Suppose "Y= ..P{U~> ... , Uk} and U 1 E ..P{U2, ... , Uk}. Prove that "Y=

4. Wríte out how the following statements are read. ..P{Uz, ... , Vk}.
a. UE .9?{V, W}. b. Y' e .9?{U, V}. c. Y'= .9?{U, V}. 15*. Suppose every vector ~-vector space "Y has a unique expression as a linear
d . .9?{Ut. ... , Ud. combination of the vectors U~> ... , u... Pro ve that the set {U t. . . • , U"} is
5. a. Prove that no finite set can span R[t]. linearly índependent.
b. Use parta to show that no finite set can span :F. 16. The fact that r(V 1 + · · · + Uk) = rU, + · · · + rUk for any k vectors in "Y
6. Why is it unnecessary to write linear combinatíons with parentheses, such as and any rE R was used in the proof of Theorem 2.3. Use induction to prove
+ r2U2) + r3U3) + ·· · + rkUk)?
(· · ·((r1U1 this fact.
7. Show that {(a, b), (e, d)} is linearly dependen! in 1' 2 if and only if ad- be=
O.
8. ?nder what conditions on a and b are the following sets linearly independent
m "Ya? §4. Bases and Dimension
a. {(1, a, 2), (1, 1, b)}. b. {(1, 2, 0), (a, b, 2), (1, O, 1)}. 2
c. {(a, 2, 3), (0, 4, 6), (0, O, b)}. The coordinate axes in the Cartesian plane E can be viewed as all scalar
multiples of the vectors (1, O) and (0, l) frorn 1f 2 • Therefore (1, O) and (0, l)
9. a. Write the statement "sorne vector in {U1 , ••• , U.} is a linear combination
span the plane or rather the vector space 1f 2 • A set of generators for a vector
. of the other vectors" in terms of the vectors and scalars. 2
space can be as irnportant as the coordinate systern in E • However, if a set
b. Show_ that the statement in a implies {U~t ... , U.} is linearly dependen!.
of generators is linear! y dependent, then a smaller set will also span. Therefore
c. Why JS the statement in part a not a satisfactory definition of linear de-
pendence? we shalllook for Iinearly independent sets that span a vector space.

10. Are the following sets linearly independent or dependen!?


a. {2 - 3i, 1 + i} in C. e. {eX, e2x, e3x} in ;¡¡;. Definition A basis for a vector space 1f is a linearly independent, order-
b. {cos2 X, 17, sin 2 x} in :!7. f. {1 + 31\1 4 - 5, 61} in R[t]. ed set that spans .Y.
c. {41,; 2 + t, 31 2 - 1} in R[t]. g. {tan x, sin x} in§.
d. {3x /(x + 1), -x/(2x + 2), 4x} in§.
Example 1 The set {(l, 0), (0, l)} is a basis for 1' 2 •
11. Let S= {(2, -4), (1, 3), (-6, -3)} e 1' 2 •
a. Show that S is Jinearly dependen!.
lf x(l, O) + y(O, 1) = O, then (x, y) = O = (0, O) and x =y =
O. There-
fore the set is linearly independent. And every vector (a, b) E 1f 2 can be writ-
b. Show that the vector (5, O) can be expressed as a linear combination of the
ten as (a, b) = a(l, O) + b(O, 1), so the set spans 1f 2 • The set {(1, O), (0, l)}
vectors in S in an infinite number of ways. Do the same for the vector
(0,-20). will be called the standard basis for 1f 2 •
c. Show that two vectors from S, say the first two, span "Y 2 •
d. Show that .9?(S) is the trivial subspace "Y 2 of -r 2 •
Example 2 1t has airead y been deterrnined in previous exarnples that
12. Let S= {(6, 2, -3), (-2, -4, 1), (4, -7, -2)} e 1' 3 •
a. Show that S is linearly dependent. *You should not go on to the next section section before working on problems 13, 14,
b. Show that .9?(S) is a proper subspace of 1'3 • Suggestion: consider the and 15. If the problems are not clear, start by looking at numerical examples from "fí z or
vector (0, O, 1).
"fí 3•

{1, t, t², …, tⁿ, …} spans R[t] and is linearly independent. Therefore this infinite set is a basis for R[t].

Example 3 Construct a basis for 𝒱₃.
We can start with any nonzero vector in 𝒱₃, say (1, −4, 2). The set {(1, −4, 2)} is linearly independent, but since its span is {r(1, −4, 2) | r ∈ R}, it does not span 𝒱₃. Since the vector (1, −4, 3) is clearly not in ℒ{(1, −4, 2)}, the set {(1, −4, 2), (1, −4, 3)} is linearly independent (problem 13, Section 3). Does this set span 𝒱₃? That is, given any vector (a, b, c), are there real numbers x, y such that (a, b, c) = x(1, −4, 2) + y(1, −4, 3)? If so, then x and y must satisfy a = x + y, b = −4x − 4y, and c = 2x + 3y. But these equations have a solution only if b = −4a. Therefore,

    ℒ{(1, −4, 2), (1, −4, 3)} = {(a, b, c) | b = −4a}.

This is not 𝒱₃, so any vector (a, b, c) with b ≠ −4a can be added to give a linearly independent set of three vectors. Choose the vector (−1, 3, 5) arbitrarily; then {(1, −4, 2), (1, −4, 3), (−1, 3, 5)} is linearly independent. This set spans 𝒱₃ if it is possible to find x, y, z ∈ R such that

    (a, b, c) = x(1, −4, 2) + y(1, −4, 3) + z(−1, 3, 5)

for any values of a, b, c, that is, for any (a, b, c) ∈ 𝒱₃. A solution is given by x = −29a − 8b − c, y = 26a + 7b + c, and z = −4a − b. Therefore {(1, −4, 2), (1, −4, 3), (−1, 3, 5)} is a basis for 𝒱₃.
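Example 3's closing formulas can be verified mechanically. This Python sketch (our illustration) confirms that they reproduce a sample of triples, so the three vectors span 𝒱₃.

    B = [(1, -4, 2), (1, -4, 3), (-1, 3, 5)]

    def combo(x, y, z):
        # x, y, z times the three vectors of B, summed componentwise
        return tuple(x * u + y * v + z * w for u, v, w in zip(*B))

    for (a, b, c) in [(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, -7, 4)]:
        x = -29 * a - 8 * b - c
        y = 26 * a + 7 * b + c
        z = -4 * a - b
        assert combo(x, y, z) == (a, b, c)   # the formulas of Example 3 check out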
The above construction can be viewed geometrically. First an arbitrary nonzero vector was chosen. The second vector was chosen from all vectors not on the line determined by the first vector and the origin, that is, the line given by P = t(1, −4, 2), t ∈ R. Then the third vector was chosen from all vectors not on the plane determined by the first two and the origin, that is, the plane with equation 4x + y = 0. Obviously it is possible to obtain many bases for 𝒱₃ with no vectors in common. The important point to be established is that every basis for 𝒱₃ has the same number of vectors.
Consider for a moment the vector space ℱ. If it has a basis, it cannot be obtained by inspection as with 𝒱₂ or R[t]. And since a basis for ℱ must be infinite (problem 5, Section 3), it is not possible to build a basis as in Example 3. But there is a theorem which states that every vector space has a basis. The proof of this theorem is beyond the scope of this course, but it means that ℱ has a basis even though we cannot write one out. Most of our study will concern vector spaces with finite bases, but the spaces R[t] and ℱ should be kept in mind to understand what is meant by "let 𝒱 be a vector space."
The next theorem will be used to prove that if a vector space 𝒱 has a finite basis, then all bases for 𝒱 have the same number of vectors.

Theorem 2.5 If {W₁, …, Wₖ} is a linearly dependent ordered set of nonzero vectors, then Wⱼ ∈ ℒ{W₁, …, Wⱼ₋₁} for some j, 2 ≤ j ≤ k. That is, one of the vectors is a linear combination of the vectors preceding it.

Proof Since W₁ ≠ 0, the set {W₁} is linearly independent. Now sets can be built from {W₁} by adding one vector at a time from W₂, …, Wₖ until a linearly dependent set is obtained. Since {W₁, …, Wₖ} is linearly dependent, such a set will be obtained after adding at most k − 1 vectors. Thus there is an index j, 2 ≤ j ≤ k, such that {W₁, …, Wⱼ₋₁} is linearly independent and {W₁, …, Wⱼ₋₁, Wⱼ} is linearly dependent. Then there exist scalars a₁, …, aⱼ, not all zero, such that

    a₁W₁ + ⋯ + aⱼ₋₁Wⱼ₋₁ + aⱼWⱼ = 0.

If aⱼ ≠ 0, we are done, for then

    Wⱼ = (−a₁/aⱼ)W₁ + ⋯ + (−aⱼ₋₁/aⱼ)Wⱼ₋₁.

But if aⱼ = 0, then a₁W₁ + ⋯ + aⱼ₋₁Wⱼ₋₁ = 0 with not all the coefficients equal to zero. That is, {W₁, …, Wⱼ₋₁} is linearly dependent, contradicting the choice of j. Therefore aⱼ ≠ 0 as desired, and the proof is complete.

Two false conclusions are often drawn from this theorem. It does not say that every linearly dependent set contains a vector which is a linear combination of the others or of those preceding it. For example, consider the linearly dependent sets {0} and {0, t} ⊂ R[t]. It also does not say that the last vector of a linearly dependent set is a linear combination of the vectors preceding it; consider the linearly dependent set {t, 3t, t²} ⊂ R[t].

Theorem 2.6 If a vector space 𝒱 has a finite basis, then every basis has the same number of vectors.

Proof Suppose B₁ = {U₁, …, Uₙ} and B₂ = {V₁, …, Vₘ} are bases for 𝒱. We will first show that n ≤ m by replacing the vectors in B₂

with vectors from B₁. Each step of the argument will use the fact that B₂ spans 𝒱 and B₁ is linearly independent.
To begin, U₁ ∈ 𝒱 and 𝒱 = ℒ{V₁, …, Vₘ}, therefore the ordered set {U₁, V₁, …, Vₘ} is linearly dependent. All the vectors in this set are nonzero (Why?), so by Theorem 2.5 some vector, say Vⱼ, is a linear combination of U₁, V₁, …, Vⱼ₋₁, and the set

    {U₁, V₁, …, Vⱼ₋₁, Vⱼ₊₁, …, Vₘ}

spans 𝒱 (problem 14, Section 3). Now renumber the vectors in this set so that V₁ no longer appears. Then {U₁, V₂, …, Vₘ} is a spanning set for 𝒱.
This argument can be repeated until the first n vectors of B₂ have been replaced by the vectors of B₁. A typical step is as follows: Suppose k of the V's have been replaced to obtain (with relabeling) {U₁, …, Uₖ, Vₖ₊₁, …, Vₘ}. This set spans 𝒱 and Uₖ₊₁ ∈ 𝒱, so {U₁, …, Uₖ, Uₖ₊₁, Vₖ₊₁, …, Vₘ} is linearly dependent. Therefore one of the vectors in this ordered set is a linear combination of the preceding vectors. This vector cannot be one of the U's since {U₁, …, Uₖ₊₁} ⊂ B₁, which is linearly independent, so it is one of the V's. (Note that the set is linearly dependent, so there must be at least one of the V's left.) Removing this vector and relabeling the remaining V's yields the set {U₁, …, Uₖ₊₁, Vₖ₊₂, …, Vₘ}, a spanning set for 𝒱. Thus for each vector from B₁, there must be a vector to remove from B₂, that is, n ≤ m.
Now interchanging the roles of B₁ and B₂, and using the fact that B₁ spans 𝒱 and B₂ is linearly independent, yields m ≤ n. So n = m.

Theorem 2.6 permits the following definition.

Definition If a vector space 𝒱 has a basis containing n vectors, then the dimension of 𝒱 is n; write dim 𝒱 = n. If 𝒱 = {0}, then dim 𝒱 = 0. In either case, 𝒱 is said to be finite dimensional. 𝒱 is infinite dimensional if it is not finite dimensional.

From the above examples, dim 𝒱₂ = 2 and dim 𝒱₃ = 3, whereas R[t] and ℱ are infinite dimensional.

Example 4 The dimension of 𝒱ₙ is n.
Consider the vectors E₁ = (1, 0, …, 0), E₂ = (0, 1, 0, …, 0), …, Eₙ = (0, …, 0, 1) in 𝒱ₙ. These n vectors are linearly independent, for if x₁E₁ + x₂E₂ + ⋯ + xₙEₙ = 0, then (x₁, x₂, …, xₙ) = (0, 0, …, 0) and x₁ = x₂ = ⋯ = xₙ = 0.
𝒱ₙ is spanned by these n vectors, for if U ∈ 𝒱ₙ, then U = (a₁, a₂, …, aₙ) for some a₁, a₂, …, aₙ ∈ R. But

    U = a₁E₁ + a₂E₂ + ⋯ + aₙEₙ,

so 𝒱ₙ = ℒ{E₁, E₂, …, Eₙ}. Thus {E₁, …, Eₙ} is a basis for 𝒱ₙ, and dim 𝒱ₙ = n. This basis will be called the standard basis for 𝒱ₙ and will be denoted by {Eᵢ}. In 𝒱₂, {Eᵢ} = {(1, 0), (0, 1)} and in 𝒱₃, {Eᵢ} = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}.

Example 5 The set {1, t, t², …, tⁿ⁻¹} can easily be shown to be a basis for Rₙ[t], the set of all polynomials in t of degree less than n. Therefore dim Rₙ[t] = n.

Several corollaries follow from Theorem 2.6.

Corollary 2.7 If dim 𝒱 = n and S is a set of n vectors that span 𝒱, then S is a basis for 𝒱.

Example 6 Show that {1 + t + t², 1 + t, 1} is a basis for R₃[t].
It can be determined by inspection that this set spans R₃[t], for

    a + bt + ct² = c(1 + t + t²) + (b − c)(1 + t) + (a − b)1.

The set has 3 vectors and dim R₃[t] = 3, so by Corollary 2.7, the set is a basis for R₃[t].

Corollary 2.8 If dim 𝒱 = n and S is a linearly independent subset of n vectors, then S is a basis for 𝒱.

Example 7 The set {(a, b), (c, d)} is linearly independent in 𝒱₂ if ad − bc ≠ 0 (problem 7, Section 3). Dim 𝒱₂ = 2, so by Corollary 2.8, {(a, b), (c, d)} is a basis for 𝒱₂ if ad − bc ≠ 0.

Corollary 2.9 If dim 𝒱 = n, then any subset of 𝒱 containing more than n vectors is linearly dependent.

With this corollary, several previous problems can be solved by inspection. For example, a set such as {(1, 5), (2, 7), (4, 0)} must be linearly dependent in 𝒱₂.

Theorem 2.10 Let 𝒮 be a proper subspace of a finite-dimensional vector space 𝒱. Then dim 𝒮 < dim 𝒱.

Proof Suppose dim 𝒱 = n; then no linearly independent subset of 𝒮 can contain more than n vectors by Corollary 2.9. Hence dim 𝒮 ≤ dim 𝒱. But if 𝒮 has a linearly independent subset of n vectors, then by Corollary 2.8 it is a basis for 𝒱 and 𝒮 = 𝒱. Since 𝒮 is a proper subspace in 𝒱, a linearly independent subset in 𝒮 cannot have more than n − 1 vectors, that is, dim 𝒮 < dim 𝒱.

Example 8 The dimension of R[t] is infinite, so Theorem 2.10 does not apply. In fact, ℒ{1, t², t⁴, …, t²ⁿ, …} is a proper subspace of R[t], yet it is also infinite dimensional.

Definition Let dim 𝒱 = n and suppose S is a linearly independent subset of 𝒱 containing k vectors, k < n. If B is a basis for 𝒱 which contains the vectors in S, then B is an extension of S.

Theorem 2.11 Any linearly independent subset of a finite-dimensional vector space can be extended to a basis of the vector space.

Proof Suppose dim 𝒱 = n, and S = {U₁, …, Uₖ} is linearly independent in 𝒱, with k < n. Since k < n, Corollary 2.7 implies that S does not span 𝒱. Therefore there exists a vector Uₖ₊₁ in 𝒱 but not in the span of S. Now {U₁, …, Uₖ, Uₖ₊₁} is linearly independent (see problem 13, Section 3) and it contains k + 1 vectors. If k + 1 = n, this is a basis for 𝒱 by Corollary 2.8. If k + 1 < n, then this procedure can be repeated, yielding a linearly independent set containing k + 2 vectors. When n − k vectors have been added in this fashion, a basis will be obtained for 𝒱 (by Corollary 2.8), which is an extension of S.

In general, a given basis for a vector space 𝒱 does not contain a basis for a particular subspace 𝒮. But Theorem 2.11 shows that if 𝒱 is finite dimensional, then it is always possible to extend a basis for 𝒮 to a basis for 𝒱.

Example 9 Extend the linearly independent set {1 + t − 2t², 2t + 3t²} to a basis for R₃[t].
That is, find a vector in R₃[t] which is not in the space 𝒯 = ℒ{1 + t − 2t², 2t + 3t²}. This could be done by trial and error, but instead we will characterize the set 𝒯. A vector a + bt + ct² is in 𝒯 if there exist real numbers x and y such that a + bt + ct² = x(1 + t − 2t²) + y(2t + 3t²). Therefore the equations a = x, b = x + 2y, and c = −2x + 3y must have a solution for all a, b, c ∈ R. This is possible only if b = a + 2y and c = −2a + 3y, or equivalently if 7a − 3b + 2c = 0. So 𝒯 = {a + bt + ct² | 7a − 3b + 2c = 0}, and any vector a + bt + ct² such that 7a − 3b + 2c ≠ 0 may be added. Using 1 + t + t², {1 + t − 2t², 2t + 3t², 1 + t + t²} is a basis for R₃[t] by Corollary 2.8, and it is an extension of the given linearly independent set.

Problems

1. Show that the vector space of complex numbers C has dimension 2.
2. Show that any nonzero real number constitutes a basis for R when viewed as a vector space.
3. Use Theorem 2.10 to prove that neither 𝒱₁ nor R has a proper subspace.
4. Determine if the following are bases:
a. {(2, 1, −5), (7, 4, 3)} for 𝒱₃.
b. {2 + 5i, 1 + 3i} for C.
c. {t, 3t² + 1, 4t + 2, 1 − 5t} for R₃[t].
d. {(10, −15), (6, −9)} for 𝒱₂.
e. {t³, 4t² − 6t + 5, 2t³ − 3t² + 5t − 9} for R₄[t].
f. {(2, 1), (−3, 4)} for 𝒱₂.
5. Determine which of the following statements are true. If false, find a counterexample.
a. Every linearly independent set is the basis of some vector space.
b. The set {U₁, …, Uₙ} is linearly independent if a₁U₁ + ⋯ + aₙUₙ = 0 and a₁ = ⋯ = aₙ = 0.
c. The dimension of a subspace is less than the dimension of the space.
6. Prove: a. Corollary 2.7. b. Corollary 2.8. c. Corollary 2.9.
7. Determine the dimension of the following vector spaces:
a. ℒ{(2, −3, 1), (1, 4, 2), (5, −2, 4)} ⊂ 𝒱₃.
b. ℒ{eˣ, e²ˣ, e³ˣ} ⊂ ℱ.
c. ℒ{t − 1, 1 − t², t² − t} ⊂ R₃[t].
8. Show that the vector space of 2 × 2 matrices as defined in problem 8, page 37, is four dimensional.
58 2. REAL VECTOR SPACES §5. Coordinates 59
,:
!'

9. Extend the following sets to bases for the indicated vector space: only necessary to prove that it is linearly independent. Therefore suppose
a. {(4, -7)} for f 2 • c. {t, t + 4} for R 3 [t].
2 there are scalars C; such that c1 U1 + · · · + c. u. = O. By uniqueness, there
b. {(2, -1, 3), (4, 1, 1)} for f 3• d. {t- 1, t 2 + 5} for R3 [t). is only one such expression for the zero vector and O = OU 1 + · · · + OU.-
Therefore c1 = · · · = c. = O and the set is linearly independent. Thus
10. Let [!' and ff be subspaces of a finite-dimensional vector space and prove that
dim([l' + ff) = dim [!' + dim ff - dim([l' n ff). Begin by choosing a {U1 , . . . , u.} is a basis for "Y,
basis for the subspace [!' n ff.
11. Use problem 10 to prove that two planes through the origin in 3-space intersect Definition Let B = {U1 , ••• , U.} be a basis for a (finite-dimensional)
in at least a Iine through the origin. vector space ·r. For W E "Y suppose W = a 1 U 1 + · · · + a.u., then the
~- scalars a 1 , ••. , a. are the coordinates of W with respect to the basis B. We
will write W:(a 1 , ••• , a.h for the statement "Whas coordinates a1 , ••• , a.
with respect to B."
§5. Coordinates
Bases were defined as ordered sets so that this coordina te notation would
Up to this point there has been no mention of coordinates with regard be meaningful.
to vector spaces. Care has been taken to call the numbers in an n-tu pie vector
components rather than coordinates as suggested by our experience with
analytic geometry. And although coordinates ha ve been used in the Cartesian Example 1 Consider the basis B = {(1, 3), ( -2, 4)} for 1' 2 .
plane and 3-space, they have not been considered on the same leve! as the {1, 3) = 1{1, 3) + 0(-2, 4), therefore the coordinates of {1, 3) with
discussion of vector spaces, for they are defined in terms of distances from respect to this basis are 1 andO, or {1, 3):{1, 0) 8 • Similarly ( -2, 4):(0, 1h.
Iines and planes which is not the aspect we will generalize to vector spaces. To find the coordinates of a vector such as (9, 7) with respect to B, it is neces-
However, a discussion of coordinates in vector spaces has direct application sary to so1ve the vector equation (9, 7) = x(1, 3) +y( -2, 4) for x and y.
to geometry and should even aid in understanding what coordinates mean The solution is x = 5 and y = -2, so the coordinates are given by (9, 7):
there. The theoretical basis for the definition of coordinates is in the follow- (5, -2)B.
ing fact.
Examp/e 2 Consider the basis B' = {(1, 3), (0, 2)} for 1' 2 .
Theorem 2.12 The set { U1 , ••• , U.} is a basis for the vector space "Y For this basis (1, 3) :(1, 0) 8 • and (0, 2) :(0, 1)a. and the coordina tes of any
if and only if every vector of "Y has a unique expression as a linear combina- other vector with respect to B' can be found by inspection. That is, the first
tion ofthe vectors U1 , •• • , u.. component of (0, 2) is zero, so the first coordinate of a vector with respect to
B' must be its first componen t. Thus (9, 7): (9, -10)8 •.

Proof (=>) Suppose {U1 , . . . , U.} is a basis for the vector space "Y
and U E "Y. Since a basis spans the space, there exist scalars such that U = Take care to notice that a vector, even an ordered pair, does not have
a1 U 1 + · · · + a. un. To see that this expression is unique, suppose U= coordinates per se. Coordinates can be found only when given a basis, and
b1 U1 + · · · + b.u•. Then our notation will always make the connection clear. This connection is
usually omitted in analytic geometry, and while this cou1d be done here one
O= U - U = (aí - b 1 )U1 + · · · + (a. - b.)Un. can afford to be s1oppy only when the ideas being passed over are clea/ But
this is not the case with coordinates. In fact, problems are caused by the
But a basis is linearly independent, so a1 = b1 , • •• , a. = b., and there is feeling that, having used coordinates in analytic geometry, there is nothing
only one expression for U as a linear compination of the vectors in a basis. new here. Actually coordinates can be used in geometry with litt1e under-
(<=) lf every vector of "Y has a unique expression as a linear combina- standing and no problem arises until it becomes necessary to change to
tion ofthe vectors in the set {U1 , ••. , u.}, then the set spans "Y, and it is another coordinate system. e
60 2. REAL VECTOR SPACES §5. Coordinates 61

This equation has the solution x = -1, y = z = 1, so 2t:( -1, l, 1)B. In the
same way it can be shown that 8t 2 + 3t + l :(3, -2, 5)B and 2:(1, 1, -1)B.

Example 4 The coordinates of a vector in "Yn with respect to the


standard basis {E¡} are the same as the components of the vector. Thus
(7, -1): (7, -l)(Etl and (4, 8, 2) :(4, 8, 2)¡E,¡·
It is worthwhile repeating that a vector does not have coordinates, it only
has coordinates with respect to a particular basis. Thus the numbers 4, 8, 2
are called the components of the vector (4, 8, 2) and not coordinates.

Example 5 B = {4 - i, i + 2} is a basis for C. The coordinates of


the basis vectors with respect to B are given by 4 - i: (1, O)o and i + 2 :(0, 1)a.
Solving the equation i = x(4 - i) + y(i + 2) for x and y gives i:( -l/3, 2/3)B.
Similarly l :(1/6, l/6)o and l + i:( -l/6, 5f6)B.

In Example 3, coordinates of polynomial vectors from R 3 [t] are ordered


Figure 2
triples, and in Example 5, coordinates of complex numbers from C are
ordered pairs. So the process of obtaining coordinates associates vectors in
lt is not difficult to represent a basis B for "Y 2 and coordinates with an n-dimensional vector space with ordered n-tuples. What is more, the
respect to B geometrically. Suppose B = {U, V}, then B determines two lines algebra of the vector space can be performed on the coordinates. Thus, in
in E 2 corresponding to 2' {U} and 2' {V}; see Figure 2. These lines might be Example 5, if the coordinates of 1 and i with respect to B are added as ele-
cal!ed "coordinate axes," and the coordinates of any vector W could be ments of "Y 2 , the result is (1/6, 1/6) + (-1/3, 2/3) = ( -1/6, 5/6), which are
found using the parallelogram rule for addition. That is·, if the line P W is the coordinates of the sum 1 + i with respect to B. n-tuples may be easier
parallel to 2' {V} and the line Q W is parallel to 2' {U}, then P = a U, Q = b V to handle than other vectors, such as polynomials, so that it will often be
implies that W:(a, b)a. simpler to study an n-tuple space rather than sorne other n-dimensional
Analytic geometry begins with a single coordinate system, but linear vector space. The basis for doing this is given by the next theorem.
algebra is at the other extreme. We begin by considering all possible bases for
a vector space, and the nature of the problem at hand will determine which Theorem 2.13 Let dim "Y = n, B be a basis for "Y, and U, V E"/"',
should be used. It is not only important to be able to find coordinates with rER. If U: (a 1 , ••• , an)B and V:(b 1 , ••• , bn)o, then
respect to any given basis, but the ease in handling coordinates will become
one of the main strengths of linear algebra. and rU:(ra 1 , ••• , ranh·

That is, the coordinates of a sum are given by the sum of the two n-tup1es of
Examp!e 3 It can be shown that B = {t 2 + 1, t + 1, t 2 + t} is
coordinates, and the coordinates of a scalar multiple are the scalar multiple
linearly independent and thus a basis for R 3 [t]. The coordinates of the basis
of the n-tuple of coordinates.
vectors with respect to B are easily found:

t2 + 1:(1, O, O)o, t + 1 :(0, 1, O)B, and t2 + t:(O, O, I)B. Proof Suppose B = {W 1 , ••• , Wn}, then by assumption

To find the coordinates of a vector such as 2t with respect to B, consider the U= a 1 W 1 + · · · + a"W" and V= b¡ W¡ + · · · + b"W".
equation Therefore,
x(t 2
+ 1) + y(t + 1) + z(t + 1) = 2
2t. U+ V= (a 1 + b¡)W¡ + · · · + (an + bn)Wn.
62 2. REAL VECTOR SPACES §5. Coordinates 63

Thus the coordina tes of U+ V with respect to B are a 1 + b 1, ••• , a, + b, corresponds one and only one vector in "/fí, and conversely. The last two
as desired. conditions insure that the algebraic structure of "f/ corresponds to the
For rU, algebraic structure of "/fí. For example, T(U) + T(V) = T(U + V) means
that the result obtained when two vectors, U and V, are added in f and then
rU=r(a 1 W 1 + + a,W,) = (ra 1)W 1 + · · · + (ra,)W, the corresponding vector, T(U + V) is found in ·tfí, is the same as the result
obtained when the vectors T(U) and T(V) corresponding to U and Vare added
or rU:(ra 1 , ••• , ra,)a. in "/fí. A vector space is a set of vectors together with the operations of addi-
tion and scalar multiplication, and an isomorphism establishes a total cor-
respondence between the vectors and the algebraic operations. Therefore,
Examp!e 6 B = { 1, t, t 2 , t 3 } is a basis for R4 [t], for which coordina tes
any algebraic result ob+ainable in "f/ could be obtained in ·ffí, and the two
can be found by inspection. vector spaces are indistinguishable as algebraic systems. Of course the
Let U= t 3 - 3t + 7 and V= -/ 3 + 2t 2 + 6t. Then the sum U+ V vectors in "f/ and •ffí may be quite different, as shown by the next example.
= 2t 2 + 3t + 7. Now U:(7, -3, O, 1) 8 , V:(O, 6, 2, -l)a, U+ V:(7, 3, 2, O)a
and in f 4 , (7, -3, O, 1) + (0, 6, 2, -1) = (7, 3, 2, 0). Thus addition of
vectors in R 4 [t] can be performed by adding their coordinates with respect to Examp/e 7 The vector space e is isomorphic to f 2 •
B in "f/ 4 . We will use the basis B = {!, i} for C to establish a correspondence T
from e to f 2· For each vector u E e, if U:(a, b)B, then set T(U) = (a, b).
Thus T(4- 7i) = (4, -7) and T(i) = (0, !).
In general, if di m "f/ = n and Bis a basis for "Y, then for each vector in
T is a "transformation" or "mapping" which sends complex numbers to
f there is an n-tuple of coordina tes with respect to B and conversely. Also
ordered pairs. Thus T satisfies the first condition in the definition of an iso-
the operations of addition and scalar multiplication in "Y can be performed
morphism.
on the coordinates as if they were vectors of "Y,. In other words, as sets the
The second condition requires that no two complex numbers are sent by
spaces "f/ and "Y, correspond andas algebraic systems they correspond. Such
T to the same ordered pair and that every ordered pair corresponds to sorne
a relationship is called an "isomorphism." The definition of this correspon-
dence is in four parts: the first two establish the correspondence between
complex number. To check the first part of this condition, suppose U, V E e
are sent to the same ordered pair. That is, T(U) :=(a, b) = T(V). Then by the
the sets and the last two state that the correspondence "preserves addition
definition of T, U = a + bi = V, and only one vector corresponds to (a, b).
and scalar multiplication."
For the second part, suppose W E f 2 . If W = (e, d), then e + di E e and
T(e + di) = W Thus T satisfies the second condition of the definition.
Definition A vector space f is isomorphie to the vector space 1fí, writ- The third condition of the definition requires that T "preserve addition."
ten "f/ ~ 1fí, if there exists a correspondence T from the elements off to Beca use of the way T is defined, this is established by Theorem 2.13. But to
those of 1fí which satisfies: check it directly, suppose U = a + bi and V= e + di, then U + V=
(a + e) + (b + d)i. Therefore T(U) = (a, b), T(V) = (e, d), and T(U + V) =
l. For each V E "f/, T(V) is a unique vector of 1fí. (a + e, b + d), so that

2. For each W E •ffí, there is a unique vector V E"!', such that T(V) =
T(U) + T(V) = (a, b) + (e, d) = (a + e, b + d) = T(U + V).
w.
3. Foral! U, V E"!', T(U) + T(V) = T(U + V). By a similar argument we can show that T satisfies the fourth condition
in the definition of isomorphism and say that T "preserves scalar multiplica-
4. For all U E f and rE R, rT(U) = T(rU). tion."

In Chapter 4 terminology will be adopted which will considerably


e
Thus is isomorphic to f 2 and as vector spaces they are indistinguish-
able; the difference in appearance of the vectors is of no algebraic significance.
shorten this definition. But for now it precisely states the nature of an iso-
This example is merely a special case of a theorem which states that there is
morphism. The first two conditions insure that for every vector in f there
essentially only one vector space for each finite dimension.
64 2. REAL VECTOR SPACES §6. Direct Sums 65

Theorem 2.14 If dim "f" = n, n # O, then "f" is isomorphic to 1',.. 4. Show that B = {t 3 , / 3 + t, t 2 + 1, t + 1} is a basis for R 4 [t] and find the
coordinates of the following vectors with respect to B:
a. / 2 + l. b. t 3 • c. 4 d. t 2 • e. t 3 - t 2 , f. t 2 - t.
Proof We will use the association of vectors with coordinates to
obtain a correspondence T from "f" to "f" •. Since "f" is a vector space, it has a 5. ·Find the coordinates of the following vectors from e with respect to the
basis B. Then if UE 1', U: (a 1, ••• , a.) 8 and we define T(U) by T(U) = basis B = {1 - 2i, i - 3}:
(a 1 , •• • , a.). Thus for U E "f", T(U) is a uniquely determined ordered n-tuple a. 3 - i. b. i. c. -5. d. 1 + 3i. e. 3 + 4i.
in "f"•. 6. Find the coordinates of (2, -3, 5) with respect to
To see that T satisfies the second condition of the definition of an iso- a. B 1 = {(1, O, 0), (1, 1, 0), (1, 1, 1)}.
morphism, first suppose T(U) = T(V) for sorne U, V E "f". Then U and V b. B 2 = {(1, -1, 0), (-4, 6, -10), (-1, 3, -9)}.
have the same coordinates with respect to B. But coordinates uniquely c. B3 = {(1, 1, 0), (0, 1, 1), (1, O, 1)}.
determine a vector, so U= V. Now suppose W E "f"., say W = (x 1, ••• , x.). d. B4 = {(1, O, 0), (0, 1, O), (0, -5, 5)}.
Then there exists a vector Z E "f" such that Z:(x 1 , . . • , x.) 8 • Therefore, 7. Suppose Bis a basis of "Y 2 such that (4, 1): (3, 2)a and (1, 1):(6, -3) 8 • Find
T(Z) = W, and every n-tuple is associated with a vector from "f". Thus condi- the basis B.
tion 2 is satisfied.
8. Find B if (1, 0):(5, -S)s and (0, !):( -1, 2)s.
The third and fourth conditions follow at once from Theorem 2.13, or
may be obtained directly as in Example 7. Therefore ·'Y is isomorphic to 1'•. 9. Suppose / 2 :(1, O, Os, t :(0, 1, Os, and 1 :(1, 1, O)n. Find the basis B for R 3 [t].
10. Find the basis B for R 2 [t] if 10- 1:(4, -3)n and 5t + 6:(1, 1) 8 •
To say that two vector spaces are isomorphic is to say that one cannot 11. Define a correspondence from R[t] to e by T(a + bt) = a + bi. Prove that T
be distinguished from the other by algebraic means. Thus there is essentially is an isomorphism from R 2 [t] to C.
only one vector space for each finite dimension, and the study of finite- 12. Prove that any two subspaces of "Y 3 represented by Iines through the origin
dimensional vector spaces could be carried out entirely with the spaces "f" •. are isomorphic.
However, this is not a practica! approach. For example, the space of 2 x 2 13. Read the following statements.
matrices is isomorphic to "f" 4 (by problem 8, Section 4), but there are many a. (4, 7):(1, 5)s. b. 2/ 2 - 2:(-2, 1, l)(t+2,3r2,4-r2),
situations in which the arrangement of numbers in a matrix is too useful to
14. Show that the following maps satisfy all but one of the four conditions in the
be lost by writing the numbers in a 4-tuple. The same is true for the polyno- definition of an isomorphism.
mial spaces R.[t]. An operation such as differentiation has a simple definition
a. Tr(a, b) = (a, b, O); T1 from "Y 2 to "Y 3 .
for polynomials, but its meaning would be lost if defined in terms of n-tuples. b. T2(a, b, e)= (a + b, 2a- e); T2 from "Y 3 to "Y 2 •
So even though two vector spaces may be algebraically the same, the fact that
15. Let T(a, b) = 2a +(a- b)t.
their elements are different,is sufficient to continue regarding them as distinct
a. Find T(l, O) and T(2, -4).
spaces. ··~
b. Prove that T is an isomorphism from "Y 2 to R 2 [t].
16. Prove that "Y 2 and "Y 3 are no! isomorphic. (The idea used to obtain a cor-
Problems respondence in Example 7 cannot be applied here, however, this does not
constitute a proof that no isomorphism exists.)
l. Find the coordinates of (2, -5) with respect to each of the following bases
for "Y 2:
a. {(6, -5), (2, 5)}. b. {(3, 1), ( -2, 6)}. c. {E;}. d. {(1, 0), (2, -5)}. §6. Direct Sums
2. Follow the same directions as in problem 1 for the vector (1, 1).
The su m of two subspaces !/ and f7 of a vector space "f" has been defined
3. Obtain by inspection coordinates of the following vectors with respect to the as!/ + f7 = {U+ VIU E!/ and V E f/}. !/ + f7 is again a subspace of "f"
basis B = {(4, 1), (1, O)}:
(problem 10 page 43), and the spaces !/ and f7 are called summands of the
a. (1, 0). b. (0, 1). c. (3, -6). d. (2, 8). e. (7, -5). sum !/ + f/.
66 2. REAL VECTOR SPACES §6. Direct Sums 67

Example 1 Let Y= 2'{(1, 1, O), (0, 1, 1)} and f7 = 2'{(1, O, 1), so Z E Y n f/. By assumption, U 1 -:1 U2 , so Z -:1 O and Y n f7 is not the
(1, 1, 1)}. Then the sum of Y and f7 is given by zero subspace. Therefore if a vector can be expressed as a sum of vectors
from Y and f7 in more then one way, then Y and f7 have a nontrivial
Y+ f7 = 2'{(1, 1, 0), (0, 1, 1), (1, O, 1), (1, 1, 1)}. intersection.
This sum equals "Y 3 because any three ofthe vectors are linearly independent.
If Y and f7 are regarded as planes through the origin in 3-space, then Example 2 If Y and f7 are the subspaces of "Y 3 given in Example 1,
Y + f7 expresses 3-space as the sum of two planes. then Y n f7 = 2'{(1, 2, 1)} #- {0}. Show that this implies that sorne vector
W E "Y 3 does not ha ve a unique expression as a sum of one vector from Y
In Example 1, each vector W E "Y 3 can be expressed as a su m of a vector and one vector from f7.
U E Y and a vector V E f/. Figure 3 shows such a situation. Choose W= (1, -1, -1). Then W= U1 + V 1 with U1 = (1,0, -1)EY
and V 1 = (0, -1, O) E f/. Now

W = U 1 + V1 +O= U 1 + V1 + (1, 2, 1)- (1, 2, 1)


= [(1, O, -1) + (1, 2, 1)] + [(0, -1, O) - (1, 2, 1)]
= (2, 2, O)+ (-1, -3, -1).

u Since (1, 2, 1) E Y n f7 and Y, f7 are subspaces, U2 = (2, 2, O) E Y and


,...•---

/
/
/
/
/

- --.w
/
Vz = ( -1, -3, -1) E f/. Thus W = U2 + V2 is another expression for W
But Y n f7 contains an infinite number of vectors, so there are an infinite
number of ways to express (1, -1, -1) as a su m of vectors from Y and f/.
/
/ Moreover since "Y 3 = Y + f/, any vector in "Y 3 could have been taken
for W

Theorem 2.15 Let Y and f7 be subspaces of a vector space "Y and "Y =
Y + f/. Every vector W E "Y has a unique expression as a su m W = U + V
with U E Y, V E f7 if and only if Y n f7 = {0}.

Proof ( =>) Suppose W E "Y and W = U + V, U E Y, V E f7 is


unique. Let Z E Y n f/. Then U+ Z E Y and V- Z E f7 since Y and f7
Figure 3
are closed under addition. Now W =(U+ Z) + (V- Z), and by unique-
ness U= U+ Z and V= V- Z, so Z =O and Y n f7 = {0}.
There is a close analogy between writing a vector space as the sum of (<=) The contrapositive* of this statement was proved just befo re
subspaces and writing a vector as a linear combination of vectors. In the Example 2. That is, if there is nota unique expression, then Y n f7 ,¡. {0}.
latter case, coordinates are obtained when there is a unique linear combina-
tion for each vector. For sums of subspaces, the most interesting use will
again be when there are unique expressions for vectors. Suppose this is not Definition Let Y, f7 be subspaces ofa vector space "Y. The su m Y+ f7
the case for "Y = Y + f/. That is, suppose there is a vector W E "Y such is said to be a direct sum and is written Y EB f/, if Y n f7 = {0}. If "f/ =
that W = U1 + V 1 and W = U2 + V2 with U1 , U2 E Y, V 1 , V2 E f/, all
distinct. Then O= W- W = U1 + V 1 - (U 2 - V2 ), implies U1 - U2 = *The contrapositive of the statement "P implies Q" is the statement "not Q implies
not P." A statement is equivalen! to its contrapositive so that P implies Q may be proved
V2 - V 1 • Call this vector Z, then Z = U1 - U2 E Y and 2á= V2 - V 1 E f/, by showing that the negation of Q implies the negation of P.
68 2. REAL VECTOR SPACES §6. Direct Sum!!:z, 69

f/ Ej;) !!T, then f/ Ej;) !!T is a direet sum deeomposition of "Y, and f/ and !!T rnay
be called direet summands of "Y.

Examp/e 3 B = {(4, -7), (2, 5)} is a basis for "Y 2 • Let f/ =


2' { (4, - 7)} and !!T = 2' { (2, 5)}. Consider the su m f/ + !!T.
Since B spans "Y 2 , "Y 2 = f/ + !!T. Now suppose Z E f/ n !!T, then
Z = a(4, -7) and Z = b(2, 5) for sorne a, bE R, and because Bis linearly
independent, Z = O. Therefore, the surn f/ + !!T is direct, and we can write
"Y 2 = [/ Etl !!T.

Geometrically, Example 3 shows that the Cartesian plane E2 may be


expressed as the direct sum of two lines through the origin. It should be clear
that the same result would be obtained using any two distinct lines through
the origin in E 2 • How does this relate to the idea of a coordinate system in Figure 4
E2?

Example 5 Let f/ be as in Example 4 and d/1 = .P{(l, O, 1)}. Since


Example 4 Suppose f/ = {(x, y, z)lx + y + z = O} and !!T = (1, O, 1) $ f/, f/ + !!T is a direct sum. To see that f/ Ej;) !!T = "Y 3 , consider
2'{(3, 1, 4)}. The sum f/ + !!T is direct, for if (a, b, e) E f/ n !!T, then the equation (a, b, e) = (x, y, -x -y) + (z, O, z). This has the solution
a + b + e = O, and for sorne scalar t, (a, b, e) = t(3, 1, 4). Eliminating a, x ,;, t( -a + b + e), y = b, z = f(a + b + e) for any values of a, b, e and
b, e gives 3t + t + 4t = O or t = O, so (a, b, e) = O and f/ n !!T = {0}. so "Y 3 e f/ EB !/. Therefore f/ EB !!T is another direct sum decomposition of
Thus we can write f/ Etl !!T. "Y 3·
Now is f/ Etl !!T a direct sum decomposition of "Y 3 ? That is, is f/ Etl !!T = In terms of this direct sum, (8, 9, -1) = (O, 9, -9) + (8, O, 8) with
"Y 3 ? Since (1, - 1, O) and (1, O, -1) span f/, the direct su m is 2' {(1, - 1, 0), (0, 9, -9) E f/ and (8, O, 8) E !!T.
(1, O, -1), (3, 1, 4)}. Since the three vectors in this set are linearly independ-
ent, "Y 3 = f/ Etl !!T.
Suppose one wished to find the expression for a vector, say (8, 9, -1), Combining the results of Examples 4 and 5 gives f/ Ej;) !!T = f/ EB d/1 but
using the sum f/ Etl !!T. There must be scalars x, y, t such that (8, 9, -1) = f7 =1 d/1, therefore there is no cancellation law for direct sums.
(x, y, -x-y) + t(3, 1, 4). This equation has the solution (8, 9, -1) =
(2, 7, -9) + (6, 2, 8) with (2, 7, -9) E f/ and (6, 2, 8) E§' as desired. Theorem 2.16 If f/ and f7 are subspaces of a finite-dimensional
vector space and f/ + §' is a direct sum, then
The deco.mposition obtained in Example 4 expresses 3-space as the direct
sum of aplane f/ anda line §', not in f/. (Why must both the plape and the dim(f/ EB f/) = dim f/ + dim !!T.
line pass through the origin ?) Given a point W, the points U E f/ and V E !!T
can be obtained geometrically. The point U is the intersection off/ and the Proof Suppose {U1 , .•. , Um} and {V1 , . . . , V.} arebasesforf/ and
line through W parallel to the line §'. And Vis the intersection of the line §' f/, respectively. Then dim f/ = m, dim !!T = n, and the proof will be com-
anda line through W parallel to the plane f/. Then, as illustrated in Figure 4, plete if it can be shown that { U1 , ..• , Um, V1 , ••• , V.} is a basis for f/ Etl !!T.
U + V = W by the parallelogram rule for addition. It is easily seen that the set spans f/ Etl !!T, therefore suppose
It should be clear that 3-space can be expressed as the direct sum of
any plane and a line not in the plane, if both pass through the origin.
~
~'
2. REAL VECTOR SPACES Review Problems 71
70
.'
a. Let Y and !Y be as in problem 6c. Find U E Y, V E !Y such that U + V=
j

Let Z = a 1 U1 + · ·· + amUm, so ZEY. But Z = -b 1 V 1 - • • · - b.Vm 7.


so z E :Y also. Thus z E Y 11 :Y = {O} and Z = O. Now the U's are linearly (-4, -1, 5).
i:1dependent so a 1 = ... = am = O, and the V's are linearly independent so 1 b. Let Y and !Y be as in problem 3b. Find U E Y and V E .9" such that
U + V = (0, - 6, 5).
b 1 = ... = b. = O. Therefore the set is linearly independent anda basis for
Y' Ei1 :Y. Since the basis contains m + n vectors, S. Let Y= {(a, b, c)ia + b- e= O} and !F = {(a, b, c)j3a- b- e= 0}.
Find U E Y and V E fF such that U + V = ( 1, 1, 0).
dim(Y EB :Y) = in +n = dim Y + dim :Y. 9. Prove that if !/ is a proper subspace of a finite-dimensional vector space -r,
then there exists a subspace fF such that f = Y <±J !F.
(This result follows directly from the fact that dim(Y + :Y) = dim Y + 10. Find a direct sum decomposition for R[t].
é.im :Y - dim(Y 11 :Y), from pro'mem 10, page 58.)

Unlike coordinates, it is not easy to see the value of a direct sum decom-
position at this time. However, there are many situations in which it is ad- Review Problems
' antageous to split a vector space up into subspaces vi a a direct sum.
1. Let S = { U 1 , ••• , U,}. Determine if the following statements are true or false.
a. If a 1 U1 + · · · + a, U, = O when a 1 = · · · = a, = O, then S is linearly
Problems independent.
b. Iff = .2'(S), then -r = a 1 U 1 + · · · +a, U, for sorne scalars a¡, ... , a,.
1. Show that .2'{(1, 7)} + .2'{(2, 3)} gives a direct sum decomposition of -r 2· c. If .2'(S) = .2'{U2, ... , U,}, then S= {U2, ... , U,}.
2. Show that the sum .2'{(1, 4, -2)} + .2'{(3, O, 1)} is direct. Does it give a d. If S is Jinearly dependen!, then a1 U1 + · · · +a, U, = O for sorne scalars
Q¡, . . . , On.
direct sum decomposition of -r 3?
e. If S linearly dependen!, then U, is a linear combina! ion of U¡, ... , U,_ 1 •
3. Let Y = .2'{U, V}, !Y = .2' {W, X} and determine if Y + !Y is a direct sum f. If S spans the vector space f, then dim f ¿ n.
in each of the following cases:
2. Suppose -r, 'ir are vector spaces, U, V are vectors, and a, b are scalars. In
a. U= (1, O, 0), V= (O, 1, 0), W = (0, O, 1), X= (1, 1, 1). our notation which of the following statements can be meaningful?
b. U= (2, -1, 3), V= (1, 2, 0), W= (-1, 2, -1), X= (2, -4, 2).
a. {a, b} E R. f. (a, b) E R. k. UE V. p. -r n "'fí'.
c. U= (1, O, O, O), V= (0, 1, O~ 0), W = (0, O, 1, O), X= (0, O, O, 1).
b. (a, b) E-r. g. 'ir e -r. l. {a} E R. q.-r=aU+bV.
d. U= (3, -2, 1, 0), V= (1, 3, O, 4), W = (4, 1, 1, 4), X= (2, 1, O, 1).
c.af. h. a, bE R. m. U e -r. r. Ua.
4. Let N= (1, -1, .3), Y= {U E .!31U • N= O} and !Y= .2'{N}. Show that d. -r + 11'. i. U+ V. n. (U, V) e f. s. f ={U, V}.
S 3 is the direct sum of the plane Y and the normalline !Y. -~' . e. fE 'ir. j. un f. o.fEf. t. U <:FJ V.
;,
5. In problem 4 each vector W E 1 3 can be expressed as W = V1 + V2 with :~
3. Suppose S= {2t + 2, 3t + 1, 3- 7t} e R[t].
V¡ E y and vl E !Y. Use the fact that w, V¡, v2, ando determine a rectangle
a. Show that S is linearly dependen!.
along with properties of the cosine function to show that b. Show that t may be expressed as a linear combination of the elements from
and
S in an infinite number of ways.
c. Show that .2'(S).= .2'{3! + 1, 3- 7!}.
6~ For the following spáces, determine if f 3 = Y +!Y. When this is the case 4. Suppose f is an n-dimensional vector space. Determine which of the follow-
determine if the sum is direct. ing statements are true. If false, find a counterexample.
a. Y= {(x, y; z)ix = 2y), !Y = {(x, y, z)iz = 0}. a. Any n vectors span f .
b. Y= {(x, y, z)ix =y, z =O}, !Y= {(x, y, z)ix =O, y= -z}. b. Any n + 1 vectors are linearly dependen! in f.
c. y = .2'{(4, 1, 3)}, !Y = .2'{(2, 1, 1), (1, o, 2)}. c. Any n vectors forma basis for f .
d. y= .2'{(2, -1, 1)}, !Y = .2'{(1, 5, 3), (3, 4, 4)}. d. Any n- 1 vectors are linearly independeiit in f .
72 2. REAL VECTOR SPACES

e. Any basis for r contains n vectors.


f. Any n + 1 vectors span r.
g. Any n - 1 vectors fail to span r.
5. Suppose .e/= {(x, y, z)J2x- y+ 4z =O} and ff = {(x, y, z)J5y- 3z =O}
are subspaces of r 3·
a. Show that the sum .e/ + ff is not direct.
b. Find u E g> and V E ff such that u + V= (2, 9, 6).
6. Prove or disprove: There exists a vector space containing exactly two vectors.
7. Find a counterexample for each of the following false statements.
a. If {U¡, ... , Un} is a basis for r and .e/ is a k-dimensional subspace of r,
then {U¡, ... , Uk} is a basis for .e/.
b. lf {U¡, U2 , V 1 , V2 } is a basis for .e/ and {U¡, Uz, W¡, Wz, W3} is a basis
for ff, then {U¡, U2 } is a basis for .e/ n ff.
c. Any system of n linear equations in n unknowns has a solution.
d. Any system of n linear equations in m unknowns has a solution provided
n:::;; m.
e. lf {U 1 , ••• , Un} is linearly independent in "Y, then dim r:::;; n,

Systems of Linear
Equations

§1. Notation
§2. Row-equivalence and Rank
§3. Gaussian Elimination
§4. Geometric lnterpretation
§5. Determinants: Definition
§6. Determinants: Properties and Applications
§7. Alternate Approaches to Rank
74 3. SYSTEMS OF LINEAR EQUATIONS §1. Notation 75

Many questions about vectors and vector spaces yield systems of linear were written with this notation, then a 11 = 2, a 12 = 3, a 13 = -5, a21 = O,
equations. Such collections of first degree polynomial equations in severa! a22 = 6, a 23 = 1, b1 = 4, and b2 = 7. The double index notation is very
unknowns occur repeatedly in the problems and examples of the previous useful and will be found throughout the remainder of the text.
chapter. If one Jooks back over these situations, it will be noticed that the The above system of linear equations may arise in severa! different ways.
problem was not always to find values that satisfy all the equations simulta- For example, Jet U1 = (a 11 , •... , a. 1), U2 = (a 12 , ••• , a. 2 ), ••• , Um =
neously. Often the problem was to show that there are no solutions or that a (a 1m, .•• , a.,.), and W = (b 1 , ••• , b11 ) be vectors from "Y •. Then the system
system has many solutions. In addition the number of equations in a system of equations (1) arises in the following situations:
was not always the same as the number of unknowns. However, even in the
simple case of two equations in two unknowns it is possible to have no solu- Determining if W E 2'{ U1 , ••• , Um}. That is, do there exist scalars
l.
tion, a unique solution, or many solutions as illustrated by the following three x 1, xm such thál"'W = x 1 U1 + · · · + x,.U,.?
••• ,

systems: 2. Determining ifthe set {U 1 , . • . , U,.} is linearly independent. That is,


do there exist scalars x 1 , ••• , xm, not all zero, such that x 1 U1 + · · · + x,.U111
2x + 3y = 4 2x + 3y = 4 2x + 3y = 4
= O? In this case W = O so b 1 = · · · = b. = O. ·
4x + 6y =O, X- y= 0, 4x + 6y = 8. 3. DtÚermining the coordinates of Wwith respect to the basis {U1 , . . • ,
u.}. In this casen = m and W:(x 1 , . • • , x.)¡u,, ... ,u"¡·
Systems of linear equations will continue to occur in our study of linear
algebra, so a general procedure is needed to determine if a system has solu- There are many other ways in which the system (1) might arise, but these
tions and if so· to find their values. Also, in the process of solving this prob- three are general enough to yield al! possible systems of linear equations.
lem, severa! important concepts will be obtained. Once the doubly indexed notation for coefficients is adopted, it turns out
that most of the remaining notation is superfluous. This is shown in the
following example.

§1. Notation
Example 1 The following is a system oftwo equations andan array
of numbers derived from the coefficients and the constant terms.
If different letters are used to denote the various coefficients in a general
system of n linear equations with m unknowns, we soon run out of symbols. X+ 2y = 1 2
Also, such notation would make it difficult to identify where particular coeffi-
cients occur in the system. Therefore a single symbol is used with a double
2x + 3y = 4 2 3 4
indexing system. The first index or subscript designates the equation in which
Using the technique of eliminating x to solve for y, one multiplies the first
the coefficient appears and the second index designates the unknown for
equation by -2 and adds the result to the second equation. Applying the
which it is the coefficient. Thus a system of n linear equations in m uriknowns
same procedure to the array, we add -2 times the first row to the second row.
may be written as
This yields the following set of equations and corresponding array of num-
bers.
a 1 ¡X¡ + a¡zXz + + a¡mXm = b¡
a21 x 1 + azzXz + + azmXm = b2
(1) X+ 2y = 1 2
a. 1x 1 + a.zXz + + a.mxm = b •• -y= 2 o -1 2

With this notation aij is the coefficient of xj, 1 :::; j :::; m, in the ith equation, Now multiply the new second equation (row) by -1 to obtain the following
1 :::; i :::; n. If the system equations and array.

2x 1 + 3x2 - 5x 3 =4 X+ 2y = 2 .1
6x 2 + x3 = 7 y= -2 o -2
76 3. SYS:rEMS OF LINEAR EQUATIONS §1. Notation 77

Finally multiply the new second equation (row) by -2 and add the result to (m + l)st column. Thus A* has the form
the first.

X 5 o 5
y= -2 o -2

The final array of numbers, if properly interpreted, gives the solution. Example 2 If A andA* denote the coefficient and augmented matri-
This example shows that the plus signs, the equal signs, and even the
ces of the system
symbols for the unknowns are not needed to solve a system with an elimina-
tion technique. It might be argued that, in the course of eliminating a variable,
2x + 3y - 5z + w = 4
one often separates the equations and would not continually write down both
equations. Although such a procedure might be fine for two equations in two x+ y -w=7
unknowns, it often leads to problems with larger systems having different ';.:
9x + 3z- w = 8
numbers of unknowns and equations. J
~ then
Definition An n x mmatrix, read "n by m matrix," is a rectangular array
of numbers with n rows and m columns. Let A be an n x m matrix with the
element in the ith row and jth column denoted aij. Then
-5o 1)-1 and
A*= ( ! b -5o 1 4)
2 3
-1 7 .
3 -1 3 -1 8
a 11 a 12
· · · alm)
A = (a;) = az¡ azz · ·.· a2m . The arra y of numbers considered in Example 1 is the augmented matrix
of the given system without the matrix notation. As shown in that example,
(
a. 1 a. 2 . . . anm the augmented matrix contains all the essential information of a system of
linear equations.
The following are examples of a 2 x. 3 matrix, a 3 x 1 matrix, and a With the next two definitions, matrices can be used to express any system
3 x 3 matrix, respectively: of linear equations as a simple equation of matrices.

(29 53 6)1 ,
-4o 8) Definition Let A = (a;) be an n x m matrix and B = (b 11 k) a p x q
(! 2 9
3 .
matrix, then A = B if n = p, m = q, and aij = bij for all i and j.

Definition The n x m matrix A = (a;) is the coefficient matrix for the Definition Let E= (e;) be an n x m matrix and F = (!;) 1
an m x 1
following system of n linear equations in m unknowns: matrix, then the product of E and F is the n x 1 matrix EF = (g ¡) with
g¡ = e¡¡J¡ + e¡z/2 + · · · + e¡mfm for each i, 1 ::; i::; n.

For example,
+ a.mxm = b,

The augmented (coefficient) matrix for this system, denoted A*, is the (
21 3
2 1
7) (- ;)1 = (2(3)
1(3)
+ 3(2) + 7(
+ 2(2) + 1( -1)
-1)) (5)6
=
n x (m + 1) matrix obtained from A by adding the constant terms as the
78 3. SYSTEMS OF LINEAR EQUATIONS §1. Notation 79

Now suppose A = (a;) is the n x m matrix of coefficients for the is a solution. That is,
system (!) of n linear equations in m unknowns, Jet X= (x). be an m x 1
matrix of the unknowns and let B = (b;) be the n x 1 matnx of constant
terms. Then the equations in (1) can be written as a single matrix equation
AX=B.

If a system of linear equations has a solution, it is said to be consistent.


Example 3 Using a matrix equation to write a system of linear
The problem of finding coordinates of a vector in "fí, with respect to sorne
equations, basis always leads to a consistent system of linear equations. If a system of
linear equations has no solution, it is inconsistent. An inconsistent system of
x1 - 3x 2 + x3 = O
linear equations is obtained when it is found that a vector from "f/'11 is not in
4x 1 + 6x 2 - x3 = 7 the span of a set.
If the constant terms in a system of linear equations are all zero, AX = O.
beco mes then there is always a solution, namely, the trivial solution X = O obtained by
setting all the unknowns equal to zero. Such a system is called homogeneous
beca use all the· nonzero terms have the same degree. A homogeneous system
-3
of linear equations is obtained when determining if a subset of "f/'11 is linearly
6
dependent. Notice that in such a case, with X¡ U¡ + ... + xmum = O, the
question is not: Is there a solution? but rather: Is there a solutio.n with not
and all the x's equal to zero? Such a solution, X = with e e
:f. O, for a homo-
geneous system AX = O is called a nontrivial solution.
2x + y= 1

X- 3y =2 Examp!e 4 Consider the homogeneous system oflinear equations


5x- y= 4
2x + 3y +z =O 3
or
beco mes X- 6y- Z = 0 -6

As with all homogeneous systems, this system has the trivial solution X = O
or x = y = z = O. But it also has many nontrivial solutions. For example,

Definition Suppose AX = Bis a system of n linear equations in the m


and X= ( ;)
unknowns X¡, ••• ' Xm. Then X¡ = C¡, ..• ' Xm = Cm or X= e= (e) is a
-10
e
solution of this system if A = B.
are both solutions.
For the second system in Example 3, If the abo ve solutions are written as ordered triples in "Y 3 , then one is
a scalar multiple of the other. Using the spaces "Y" as a guide, it is a simple
matter to turn the set of all n x m matrices, for any n and m, into a vector
space.
80 3. SYSTEMS OF LINEAR EQUATIONS §1. Notation 81

Definition Let vltnxm ={AlA is an n x m matrix}. lf A, BEvltnxm with That is, rC = (re) is a solution for AX = O, and we have shown that CE !J'
A =(a;) and B = (bii), setA + B = (c;j) where cii = aii + bii and rA = implies rC E !J' for any r E R.
(dii) where dii = ra;i, for any r E R. ·
!J' is called the solution spaee of the homogeneous system AX = O. Since
the solution space is a subspace of an m-dimensional vector space, it has a
Example 5 In vH 2 x 3• finite basis. This means that a finite number of solutions generate all solutions
by taking linear combinations.
-4
-1 D=G Example 6 lt can be shown that the solution space of the system
and
3x + 3y + 6z - w =O
4 3
(
o 2 1) ( o
o 8 = 12
s
o
4)
32 .
x-y+ 2z+w=0
5x + y+ lOz + w =O
Theorem 3.1 The set vH. xm of n x m matrices, together with the opera- is
tions of addition and scalar multiplication as defined above, is a vector space.

Proof The proof is almost identical to the proofthat "fí. is a vector


space and is left to the reader. ~{( -!}(i)}
The definitions and theorems of Chapter 2 referred to an abstract Therefore every solution is a linear combination of two vectors, or every
vector space, therefore they apply to vltnxm even though it was not in hand solution is of the form x = 3a, y = 2b, z = -a, w = 3b for sorne a, bE R.
at the time. This means that we can speak of linear! y dependent sets of n x m
matrices or the dimension of .,#• xm· (Can you find a simple basis for .,#• xm ?) Suppose !J' is the solution space of a homogeneous system of linear equa-
In particular, the following theorem suggested by Example 4 is not difficult to tions in m unknowns. Then the vectors of !J' are m x 1 matrices. These
prove. matrices are like m-tuples written vertically and therefore might be called
eolumn vectors. In contrast, an element of "fí" or .,# 1 x" wou1d be called a row
Theorem 3.2 Let AX = O be a homogeneous system of n linear equa- vector. We have found that a plane in 3-space has a linear Cartesian equa-
tion in m unknowns. Then the set of all solutions !J' = {ClAC= O} is a tion. Thus the solution space of one homogeneous linear equation in three
subspace of the vector space .,#m x 1 • variables can be viewed as a plane in 3-space passing through the origin. In
general, the column vectors of !J' and thus !J' itself can be represented by
geometric points. Such graphs of solutions are considered in problem 14
Proof !J' is nonempty because X = Ois a solution for AX = O. We below and in Section 4.
will show that !J' is closed under scalar multiplication and leave the proof
that it is closed under addition as a problem.
LetA =(a;), X= (xi) and suppose C =(e) E!/'. Then a¡ 1e 1 + · · · + Problems
a¡mcm = O for each i, 1 ~ i ~ n. If rE R, then using properties of the real
numbers, we obtain l. Show that each of the following questions Ieads toa system of linear equations.
a. Is 3t in the span of {1 + t, 3t 2 + t, 4t + 7}?
b. Is 2 + 3iE!t'{6 + 9i, 7- 2i}?.
82 3. SYSTEMS OF liNEAR EQUATIONS §2. Row-equivalence and Rank 83

c. ls (2, 1, -4) E .2'{(1, 3, 6), (2, 1, 1)}? 9. Give an argument involving the dimension of 1' 3 which shows that not all
d. What are the coordinates of (2, 1, -4) with respect to the basis {(1, 3, 6), systems of three linear equations in two unknowns can be consisten!.
(2, 1, 1), (1, o, 2)}?
10. Perform the indicated operations.
e. What are the coordinates of t 2 + 5 with respect to the basis {3t - 2,
2
t +4t-1,t 2 -t+6}?
2. Por each of the following sets of vectors, show that a homogeneous system of
(2 - 43 - 7)5 + (O3 27 81) ·
a. 1
linear equations arises in determining if the set is linearly dependent or in-
dependen!. 11. What is the zero vector of Ji3 x 2. J13 x 1, Ai¡ x 4• and Jlt 2 x 2?

a. {t 2 + 3t 3 , t 2 - 5, 2t + t 2 } in R4[t]. 12. The homogeneous system AX = O of n linear equations in m unknowns has the
b. {3 + 4i, 2i- 1, 6- i} 'rrfc. trivial solution X= O. What are the sizes of these two zero matrices?
c. {ex, e 2 X, e 5 x} in !F.
13. Complete the proof of Theorem 3.2.
d. {(2, 1, -3, 4), (5, -1, -4, 7), (1, -3, 2, 1)} in 1'4 •
14. Suppose AX = B not homogeneous. Let S = {e E A1m X dAC = B}. Call S
3. Using the double subscript notation a; 1 for coefficients, what are the values the solution set of the system.
of a¡¡, a23, a13, a 24, and a21 in the following system?
a. Show that S is not a subspace of Ji m x 1·
2x 1 - 5x2 + 6x4 = -6 b. If AX = B is a single linear equation in three unknowns, what is the
3x2 + X3 - 7x4 = 8.
geometric interpretation of S?
4. Find the coefficient matrix and the augmented matrix for c. If AX = B is two linear equations in three unknowns, what are the pos-
sible geometric interpretations of S?
a. the system of equations in problem 3.
b. 2x- y = 7 c. 2x 1 - 3x2 x 3 =O +
X + Y = 3 X1 - 2x3 = 0.
5x- y= 6.

5. Compute the following matrix products. §2. Row-equivalence and Rank

o 1
The technique of eliminating variables to solve a system of linear equa-
1 o
tions was not sufficient to han die all the systems that occurred in the previous
chapter. However, it does contain the essential ideas necessary to determine
6. Write the systems in problem 4 as matrix equations.
if a system is consistent and, when it is, to find all of the solutions. In order
7. Suppose the homogeneous system AX = O is. obtained in determining if the to see this, we must first put the procedure we have been using to solve sys-
set S e r. is linearly dependent. Find S given that A is tems of linear equations on a firm basis.

5 2 7) (i j). 2 5 7 8) 1 3)
4 9
a. ( 1 3 8 · b. e~
(21' 13 oO 62 . d. fl o17.
5
Definition Two systems oflinear equations are equivalent if every solu-
tion of one system is a solution of the other, and conversely.
8. Suppose AX = B is the system of linear equations obtained in determining if
WE.P{U¡, ... , Um}. Given the augmented matrix A*, find W, U¡, ... ,
Um E r •. That is, two systems with m unknowns, A 1 X = B 1 and A 2 X = B2 , are
equivalent provided A 1 e= B 1 if and only if A 2 e = B 2 for e E Amx 1•

a. A* = (¡ - ~ ~).
9 5 4)
*- 2 1 8
c.A-176"
(o 2 o Example 1 The following two pairs of equivalent systems are given
rr
84 3. SYSTEMS OF LINEAR EQUATIONS §2. Row-equivalence and Rank 85

without justification at this time: system by 1/2, an operation of type II yields

2x + 3y = -5 x=2 X- !Y+ 2z =2
and
5x- 2y = 16 y= -3, 6x - 9y + 6z = 7.

x-y+z=4 x-y+z=4
Finally adding -3 times the first equation of the original system to the
2x +y- 3z =2 and 3y- 5z = -6 second equation, a type III operation, gives
5x +y- 5z = 1 o= -13.
2x - 5y + 4z = 4
In the first case, the system is equivalent to equations which provide a unique 6y- 6z = -5.
solution. In the second, the system is equivalent to one that is obviously
inconsistent.
Proof It is sufficient to show that an equivalent system results when
one operation of each type is performed, and it should be clear that an opera-
Thus, determining if a system of linear equations is cónsistent and ob-
tion of type 1 yields an equivalent system. Therefore, suppose A 1 X = B 1 is
taining solutions amounts to finding an equivalent system in a particular
transformed to A 2 X = B 2 with an operation of type II, say the kth equation
form. The first fact to be established is that the technique of eliminating
of A 2 X = B2 is r times the kth equation of A 1 X = B 1 , r =f. O. If A 1 = (aij),
variables yields equivalent systems of equations.
X= (x), and B 1 = (b¡}, then the systems are identical except that the kth
equation of the first system is ak1 x 1 + · · · + akmxm = bk while the kth equa-
Theorem 3.3 If one system oflinear equations is obtained from another tion of the second is rak 1x 1 + · · · + rakmxm = rbk.
by a finite number of the following operations, then the two systems are Now if X= e= (e) is a solution of A 1 X = B 1, then X= e is a solu-
equivaient. tion of each equation in A 2 X = B2 except possibly the kth equation. But
ak¡C¡ + ... + akmcm = bk so rak¡C¡ + ... + rakmcm = rbk and X= e is
Type l. Interchange two equations. also a solution of this equation. Conversely, if X= e is a solution of A 2 X =
Type II. Multiply an equation by a nonzero scalar. B2, then since r =f. O, it can be divided out of the equality rak 1c 1 + · · · +
Type III. Add a scalar multiple of one equation to another equation. rakmcm = rbk so that X = e satisfies the kth equation of the first system.
Therefore X= e is a solution of A 1 X = B 1 , and the two systems are equiva-
lent.
Before proving the theorem consider sorne examples. The system
The proof that an operation of type III yields an equivalent system is
2x- 5y + 4z = 4
left to the reader. -._,.,

6x - 9y + 6z = 7
In Example 1 of the previous section, a system of linear equations was
beco mes solved using an elimination technique. We now see that a sequence of opera-
tions of types II and 111 was used. At the same time the procedure was per-
6x- 9y + 6z = 7 formed more simply on the augmented matrix. Combining this idea with the
above theorem gives a general approach to the solution of systems of linear
2x- 5y + 4z = 4 equations. When the word "equation" in the above three types of operations
is replaced by "row" and the operations are performed on matrices they are
using an operation of type l. Multiplying the first equation of the original caiied elementary row operations.
§2. Row-equivalence and Rank 87
86 3. SYSTEMS OF LINEAR EQUATIONS

Examp/e 2 Find the solution of An equivalent system can be obtained in which it is possible to read off
all the so1utions for this system.
2x- 3y = 4
X+ 2y = 7 -2 1 1) (1
-4 3 o lit o
-2
o
1) (1o o-2
-2 lit
o _;).
by performing elementary row operations on the augmented matrix.
An arrow with I, JI, or III below it will be used to indicate which type of Thus the system
elementary row operation has been used at each step.
X - 2y 3
~-

2 - 3 4) ( 1 2 7) ( 1 2 7) ( 1 2 7 ) ( 1 o 29/7) z = -2
( 1 2 7 -r 2 -3 4 lit o -7 -10 lT' o 1 10/7 lit o 1 10/7 .
is equivalent to the given system, and x = 3 + 2t, y = t, z = -2 is a solu-
First the two rows are interchanged, then -2 times the first row is added to
tion for each value of the parameter t. Writing
the second row, next the second row is multiplied by -1/7, and ·finally the
last row is multip1ied by -2 and added to the first row. The resulting matrix (x, y, z) = t(2, 1, 0) + (3, O, -2), tER,
is the augmented matrix for the system
shows that the two planes with the given equations intersect in a line.
X = 29/7

y= 10/7 If consistent, solve the system


Example 5
which has x = 29/7, y = 10/7 as its only solution. Since this system is equiva-
lent to the original system, the original system is consistent with the unique x+ y-z=1
solution x = 29/7, y = 10/7. +z =4
2x- 3y
4x- y- z = 6
Example 3 Show that
.t
:H~ ~H~ ~)
1 -1 1 -1

(~
2x+ y=3 1 -1
-3 1 -5 3 -5 3
4x + 2y = -1 -1 -1 -1 -1 -5 3

-1 1) (1 1 o 1/5)
.c(~ +H~ o
is inconsistent. -1 -2/5
Performing e1ementary row operations on the augmented matrix yields: -5 32--¡r>01 -3/5 1 -3/5 -~/5 .
o o o o o o o
2 1 3)....,.,.(1 1/2 3/2)='(1 1/2 3/2)....,.,.(1 1/2 3/2)
(4 2 -1 11 4 2 -1 111 o o -7 11 o o 1 . After the first two steps, which eliminate x from the second and third equa-
tions, a system is obtained with the Iast two equations identical. Although
Since the equation O = 1 is in a system equivalent to the given system, the the original system appeared to involve three conditions on x, y, and z, there
given system is inconsistent. are only two. All solutions for the given system are obtained from the final
system of two equations by letting z be any real number. This gives x =
Examp/e 4 Find all solutions for t + tl, y= -t + !t, and z = t with tE R.

X- 2y + Z=1 The above examples suggest that a complete answer to the problem of
2x - 4y + 3z = O. solving a system of linear equations is obtained by transforming the aug-
88 3. SYSTEMS OF LINEAR EQUATIONS §2. Row-equivalence and Rank 89

mented matrix into a special form. This form isolates sorne or all of the This implies that it should be possible to transform any matrix to echelon
unknowns. That is, certain unknowns appear only in one equation of the form with elementary row o-perations. The proof that this is the case simply
associated system of linear equations. Further, if an unknown, x;, appears in formalizes the procedure used in the examples.
only one equation, then none of the unknowns preceding it, xk with k < i,
are in that equation. Once such an equivalent system is found, it is clear if
it is consistent. And if consistent, then all solutions can be read directly from Theorem 3.4 Given an n x m matrix A, there is a matrix in echelon
it by assigning arbitrary values to the unknowns which are not isolated. The form that is row-equivalent to A.
basic characteristics of this special form for the augmented matrix are as
follows: Proof If A = O, then it is in eche Ion form. For A i:- O, the proof is
by induction on the number of rows in A. To begin, suppose the hth co1umn
l. The first nonzero entry of each row should be one. of A is the first column with a nonzero element and akh 'f:- O. Multiplying the
2. The column containing such a leading entry of 1, should have only kth row of A by Ifakh and interchanging the kth and 1st rows yields a matrix
O's as its other entries. with 1 as the leading nonzero entry in its first row. Then with elementary row
3. The leading 1's should m ove to the right in succeeding rows. That Ís, operations of type III, the other nonzero entries in the hth column can be
the leading 1 of each row should be in a column to the right, of the column replaced by O's. Call the matrix obtained A 1, then the first row of A 1 con-
containing the leading 1 of the row above. forms to the definition of echelon form and no column before the hth con-
4. A row containing on1y O's should come after aii rows with non- tains nonzero terms.
zero entries. For the induction step, suppose the matrix-AP = (b;) has been obtained
which conforms to the definition of echelon form in its first p rows, 1 ~
p < n, and bij = O for i > p and j < q, where bpq = 1 is the 1eading one in
Definition A matrix satisfying the four conditions above is in (row) the pth row. If there are no nonzero entries in the remaining rows, then AP
eche Ion form. is in eche1on form. Otherwise there exist índices r, s such that b, 5 'f:- O with
r > p (i.e., b,5 is in one of the remaining rows), and if i > p, j < s, then
The first and last matrices below are in echelon form, whi1e the middle bij = O (i.e., b,5 is in the first column with nonzero entries below the pth
two are not. row.) Now multiply the rth row of AP by 1/b,s and interchange the rth and
.·J (p + 1)st rows. This yields a matrix with 1 as the 1eading nonzero entry in the
(p + l)st row; by assumption, s > q, so this 1 is to the right of the leading l
o 1 5 o 6) 1 2 3 1 0 0
) ( I o 2 o 2 o) in the preceding row. Now use elementary row operations of type III to
0 1 4 , 0 0 1 , ~¿¿~;~.
) (
o o o 1 1' make all the other nonzero en tries in the sth column equal toO. If the matrix
(o o o o o (
001 010 000001 obtained is denoted by Ap+ 1 = (cij), then the first p + 1 rows of Ap+ 1 con-
form to the definition of a matrix in echelon form, and if i > p + 1, j < s,
then cij = O. Therefore the proof is complete by induction.
In the second matrix, two of the co1umns containing leading 1's, for the
second and third rows, contain other nonzero entries. In the third matrix,
the leading 1 in the third row is not in a column to the right of the column Example 6 The following sequence of elementary row operations
containing the 1eading 1 in the second row. illustrates the steps in the above proof.

Definition Two n x m matrices are row-equivalent if there exists a finite


sequence of elementary row operations that transforms one into the other.
(oo o2 o6 o4 ¡) ("o o o o ¡) ("o o o3o24)
8 ~ 1 3 2 4 ~
1
1
o o o 1 2 o o o 1 2 o o o 1 2
l 3 o 1 3 o
e(~ ~H~ ~H~ ~)
1 3 2
Examples 2 through 5 suggest that any consistent system of linear equa- o o 1 o o 1 o o
tions might be solved by transforming the augmented matrix to echelon form. o o o o o o o o o
90 3. SYSTEMS OF liNEAR EQUATIONS §2. Row-equivalence and Rank 91

If the first matrix is the matrix A in the proof, then the third would be A 1 , the the subspace of "f/'m spanned by the n rows of A. That is,
fourth A 2 , and the Jast A 3 which is in echelon form. In A the nonzero element
denoted akh in the proof is 2 with k = h = 2. In A 1 the element b,s is 1 with 2'{(a 1 ¡, O¡z, ... , O¡m), ... , (anl• 011 z, ... , Onm)}
r = 3 and s = 4. Notice that s = 4 > 2 = h as desired.
is the row space of A.
There are usually many sequences of elementary row operations which
transform a given matrix into one in echelon form, and it is not at all clear For example, the row space of the matrix
that different sequences will not Jead to different echelon forms. The proof
that there is in fact only one echelon form is suggested by the relationship
between an echelon form and the solution of a system of linear equations.
However, a formal proof of this will be omitted, see problem 14.
is'2'{(2, 3), (!, -2), (4, 8)} = "f/' 2 •
Theorem 3.5 Each n x m matrix is rowcequivalent to only one matrix The dimension of the row space of a matrix A is often called its "row
in echelon form. rank," but it is easily seen that this is just the rank of A as defined above.

Because of the last two theorems it is possible to refer to "the echelon Theorem 3.6 If A and B are row-equivalent, then they have the same
form" of a given matrix. Further the number of nonzero rows in the echelon row space.
form of a matrix A is determined only by A and not the procedure used to
obtain the echelon form. Proof lt need only be shown that if A is transformed to B with one
elementary row operation, then the row spaces are equal. We will considera
, Definition The rank of a matrix is the number of rows with nonzero type II operation here and leave the others to the reader. Therefore suppose
entries in its (row) echelon form. the kth row of B is r times the kth row of A for sorne nonzero scalar r. If
U1 , ••• , Uk, ... , u. are the rows of A, then U1 , ••• , rUk, ... , u. are the
rows of B. Now suppose X is in the row space of A, then
Thus, from Example 4,
X= a 1 U 1 + for sorne scalars a¡
-2 l
rankG -4 3 ol) = 2 = a 1 U1 + since r i= O.

and from Example 5, Therefore X is in the row space of B and the row space of A is contained in
the row space of B. The reverse containment is obtained similarly, proving

~2
1 -1 that the row space is unchanged by an elementary row operation of type II.
rnnk(! -3
-1 -1
1 :)
Now the row space of a matrix equals the row space of its echelon form,
and the rows of a matrix in echelon form are obviously linearly independent.
A second characterization of rank should be discussed before returning Therefore we have the following theorem.
to our general consideration of systems of linear equations.
·,·
Theorem 3.7 The rank of a matrix equals the dimension of its row
Definition Let A = (a¡) be an n x m matrix. The row space of A is space.
~.
92 3. SYSTEMS OF LINEAR EQUATIONS §2. Row-equivalens.e and Rank 93

Example 7 Find the dimension of the subspace !/ of "f" 4 given by


Problems
!/ = .Sf{(5, 1, -4, 7), (2, 1, -1, 1), (1, -1, -2, 5)}.
1. All the solutions of the system
The dimension of !/, is the rank of the matrix x+y+z=6
2x-y+z=3

A= ~
·(5 -4 7)
-1 1 .
4x +y+ 3z = 15
are of the form x = 3 - 2k, y = 3 - k, z = 3k for sorne real number k.
-1 -2 5 Determine which of the following systems also have these solutions.
a. x +y +z =6 b. 3x + 2z = 9 c. x + y - 2z = O
But the matrix
4x +y + 3z = 15. 3y + z = 3 3x +y + z = l.
X - 2y = -3.

2. Suppose the system A 2 X = B2 is obtained from A1X = B 1 by adding r times


the kth equation to the hth equation. Prove that the two systems are equiva-
lent. That is, prove that an operation·or type III yields an equivalent system.
is the echelon form for A. Therefore the dimension of !/ is 2. 3. Find the echelon form for the following matrices.

(i3 -1) .
Notice that in obtaining the dimension of !/ we have also obtained a
characterization of its elements. For U E!/ if and only if there are scalars 2o o) o -3 6 1 6) 1 4 1 3)
a and b such (1 o o
a. O 1 3 . b. ~ c. (oo -2 4 1 45 .
-1 2 1
d. 2 8 3 5 .
(1 4 2 7

4. Solve the following systems of linear equations by transforming the aug-


U= a(1, O, -1, 2) + b(O, 1, 1, -3) =(a, b, -a+ b, 2a- 3b).
mented matrix into echelon form.

Therefore ;¡ a. x +y+ z = 6 b. X+ y= 2 C. X+ y+ Z = 4
·,t!
.J x- 2y + 2z = -2 3~- y= 2 3x + 2y + z = 2
3x + 2y- 4z = 8. X - y= 0. x + 2y + 3z = 3.
!/ = {(x, y, z, w)jz =y - x, w = 2x- 3y}.
5. List all possible echelon forms for a 2 X 2 matrix. Do the same for 2 x 3 and
3 x 2 matrices.
Example 8 Determine if {(2, 2, -4), ( -3, -2, 8), (1, 4, 4)} is a basis
for "//' 3 . 6. Characterize the span of the three vectors in Example 8, as was done in
Consider the matrix A which has the given vectors as rows. Example 7.
7. Use the echelon form to find the row spaces of the following matrices. If the
space is not "Y," characterize itas was done in Example 7.

a.
(1 i)· b. G 211 -3
4.
-1)
c. 2 e -11 -~). d.
2
G o3 D·
The echelon form for A is e1
e. O 1 o
1 5 .
3 o 1 6
4) f. (21 -11
3 -1 6
42 -2)8.
o

(
1 o
o 1 -4)
2 .
8. Find the rank of each matrix in problem 7.
o o o 9. Explain why the rank of an n x m matrix cannot exceed eíther n or m.
10. Prove or disprove: If two matrices have the same rank, then they are row-
Therefore the set is not a basis for "//' 3 beca use it does not contain three equivalent.
linearly independent vectors.
11. Determine if the following sets are linearly independent by finding the rank
94 3. SYSTEMS OF LINEAR EQUATIONS §3. Gaussian Elimination 95

of a matrix as in Example 8. Therefore, if the unknowns x 2 and x 5 are assigned arbitrary values, these
a. {(6, 9), (4, 6)}. b. {(2, -1, 8), (0, 1, -2), (1, -1, 5)} equations determine values for x 1 , x 3 , and x 4 . Thus the solutions for this sys-
c. {(1, 3), (2, 5)}. d. {(1, O, 1), (0, 1, 1), (1, 1, 0)}. tem involve two parameters and may be written in the form
12. How is the statement "A E Ál n X m" read?
x 1 = 1 - 2s + t, x 2 = s,. x 3 = -2t, x 4 = 2- 3t, Xs = t
13. Suppose A E J!fnxm and Bis obtained from A with an elementary row opera-
tion of type III. Show that A and B ha ve the same row space. with s, tE R.
14. Suppose E= (e¡¡) and F = Cfu) aren x m matrices in echelon form. Assume
ekh -1- hh, but e¡¡ = /¡¡ for all i > k, 1 S: j S: m, and ek¡ = /. 1 for all j < h.
Transforming a matrix. into its echelon form with a sequence of ele-
That is, E -1- F and the fir~try at which they differ is in the k, h position, mentary row operations is often called row reduction of the matrix, and row
counting from the bottom. reduction of the augrhented matrix for a system of linear equations is caiied
Prove that the two systems ofhomogeneous equations EX= Oand FX =O Gaussian elimination or Gaussian reduction. If the system AX = B is con-
are not equivalent by finding a solution for one system which is not a solution sistent and the rank of A is k, then Gaussian elimination yields a system of k
for the other. Why does this prove that a matrix A cannot have both E and F equations in which each of k unknowns has been eliminated from all but one
as echelon forms? equation. As the preceding example shows, all solutions are easily read
directly from such a system. But Gaussian elimination also reveals if a system
is inconsistent, for it yields a system that includes the equation O = l. This
situation occurs in Example 3, pag~ 86 and Ieads to the firstgeneral theorem
§3. Gaussian Elimination on systems of linear equations.

We are now in a position to solve any consistent system of linear equa- Theorem 3.8 A system of linear equations is consistent if and only
tions. But equally important, we can find conditions for the existence and if the rank of its coefflcientmatrix equals the rank of its augmented matrix.
uniqueness of a solution in general.
The solutions of a consistent system of linear equations with augmented
matrix in echelon form can easily be obtained directly from the equations. Proof Elementary row operations do not mix entries from different
Each equation begins with an unknown that does not appear in any other columns. Therefore, when the augmented matrix A* of the system AX = B
equation. These unknowns can be expressed in terms of a constant and any is transformed into a matrix in echelon form with a sequence of elementary
unknowns (parameters) that do not appear at the begining of an equation. row operations, the matrix obtained by deleting the Iast column is the
echelon form for A. Hence the rank of A and the rank of A* are different if
and only if there is a leading 1 in th~ last column of the echelon form for A*,
Examp!e 1 Given a system with augmented matrix in echelon form that is, if and only if the equation O = 1 is in a system equivalent to the given
system AX =B. Therefore rank A # rank A* if and only if AX = B is
- x5 = 1
inconsistent.
+ 2x5 =O
x4 + 3x5 = 2, Example 2 Show that the system

the equations can be rewritten in the form x+y+z=1


x-y-z=2
x 1 = 1 - 2x2 + x5
x- 3y- 3z =O
x3 = -2x5
x4 = 2- 3x5 • is inconsistent.
96 3. SYSTEMS OF LINEAR EQUATIONS §3. Gaussian Elimination 97

The a~gmented matrix for the system therefore rank A* = rank A = 2 and the system is consistent. Since there are
three variables and rank A == 2, there is 1 = 3 - 2 parameter in the solutions.
And the solutions are given by x = 2 - 2t, y = t, z = 1, te R.

The relationship between rank and the number of parameters in a solu-


tion leads to the second general theorem.
is row-equivalent to

Theorem 3.9 A consistent system oflinear equations has a unique sol u-

(~ ~).
1 1
-2 -2 . tion if and only if the rank of the coefficient matrix equals the number of
o o -1 unknowns.

therefore the rank of A* is 3. The coefficient matrix A is row equivalent to Proof Let AX = B be a consistent system of linear equations in m
unknowns. There is a unique so1ution if and only if there are no parameters

(~
1 in the solution. That is, if and only if m - rank A = O as stated.
-2 '
.~

o
Corollary 3.1 O A consistent system oflinear equations with fewer equa-
So rank A = 2 ,¡. 3 = rank A* and the system is inconsistent. Notice that it tions than unknowns has more than one solution.
was not necessary to obtain the echelon forms to determine that the ranks
are not equal. Proof Let AX = B be a consistent system of n linear equations in
m unknowns with n < m. Then the rank of A cannot exceed the number
NOw suppose AX = B is a consistent system of n linear equations in m of rows, which is n, and the result follows from Theorem 3.9.
unknowns. Then the rank of A ( or A*) determines the number of parameters
needed to express the solution, for an unknown becomes a parameter in the Notice that this corollary does not say that a system with fewer equations
solution when, after Gaussian elimination, it does not begin an equation. than unknowns has more than one solution. For example, the system
Since the rank of A determines the number of equations in the row-reduced
system, there will be m - rank A parameters. Thus, in Example 1, there are 4x- 2y + 6z = 1
five unknowns (m = 5) and the rank of A is 3, so the solution should have
5 - 3 = 2 parameters, as is the case. 6x- 3y + 9z = 1

has no solutions.
Examp!e 3 Solve the system Recall that a homogeneous system, AX = O, is always consistent, having
the trivial solution X = O. Therefore the problem is to determine when a
X+ 2y + Z = 3 homogeneous system has nontrivial solutions.
2x + 4y - 2z = 2.
Corollary 3.11 The homogeneous system of linear equations AX =O
The augmented matrix A* for this system has echelon form: has a nontrivial solution if and only if the rank of A is Iess than the number
of unknowns.

(o1 2o o1 2)1 , Proof This is simply a special case of Theorem 3.9.


98 3. SYSTEMS OF LINEAR EQUATIONS §3. Gaussian Elimination 99

Examp!e 4 Determine if the set Second use the above criterion for the existence of nontrivial solutions
of homogeneous systems of linear equ11tions. Suppose x, y, z satisfy

x(2, -1, 3, 4) + y(1, 3, 1, -2) + z(3, 2, 4, 2) = O,


is Iinearly independent in R[t].
Suppose x, y, z are scahirs such that This vector equation yields the homogeneous system

2x+ y+ 3z =O
Collecting coefficients and setting them equal to O gives the homogeneous -x + 3y + 2z =O
system 3x + y+ 4z =O
4x + 2y + 2z =O 4x- 2y + 2z =O
3x + 4y- z =O
with coefficient matrix
- x - 5y + 4z = O,

and the set is linearly independent if this system has only the trivial solution,
that is, if the rank of the coeffi~ient matrix equa1s the number of unknowns.
2 13 23)
-1
But the coefficient matrix is row-equivalent to · ( 3
4 -2 2
1 4 '

1o
o 1
9) .'_,., and the set is linearly independent if the rank of this matrix is 3. A little com-
(o o -1 '
o putation shows that both of the above matrices have rank 2, therefore the
set is linearly dependent.
and so has rank 2. Thus the set is linearly dependent.
Notice that the two matrices in Example 5 are closely related. In fact,
Example5 Determine if the rows of one are the columns of the other.

{(2, -1, 3, 4), (1, 3, 1, -2), (3, 2, 4, 2)}


Definition The transpose of the n x m matrix A =(a;) is the m x n
is linearly independent in "Y 4 • matrix Ar = (bhk), where bhk = akh for all h and k.
This problem can be approached in two ways. First use the fact that the
rank of a matrix is the dimension of its row space and consider the matrix Thus each matrix in Example 5 is the transpose ofthe other. We will see
that, as in Example 5, a matrix and its transpose always have the same rank.

1
2 -1
3 4)
3 1 -2 .
We have seen that, since the dimension of "Yn is n, any set of n + 1
vectors in "Y" is linearly dependent. This fact can now be proved without
(3 2 4 2 reference to dimension.
If the rank of this matrix is 3, then the rows of the matrix and the set are
linearly independent. Theorem 3;12 A set of n + 1 vectors in "Yn is linear! y dependen t.
100 3. SYSTEMS OF LINEAR EOUATIONS §4. Geometríc lnterpretation 101

Proof Given {U,, ... , u.+,} e "~"'·· suppose x,, ... ' Xn+ 1 satisfy
the equation x 1 U 1 + ... + xn+tUn+l =O. This vector.equation ~ields_a
system of n (one for each component) homogeneous, !mear equations In §4. Geometric lnterpretation
n + 1 unknowns. Since there are fewer equations than unknowns, there
is not a unique solution, that is, there exists a nontrivial solution for Let AX = B be a system of n linear equations in m unknowns. A solu-
x 1 , ••• , Xn+ 1 and the set is linearly dependent. ·,~·· e
tion X = for this system can be viewed as a point in Cartesian m-space. For
e
although is an m x 1 matrix, its transpose is essentially an m-tuple of real
numbers. Call the set {e E .H.,x ¡jAe = B} the so/ution set for AX = B.
Problems Then the geometric question is: What is the na tu re of the graph of the solu-
tion set in m-space? Of course, if the system is inconsistent, the set is empty
In problems 1 through 10, use Gaussian elimination to determine if the system
and, as we saw in the first section, the solution set is a subspace of vi!m x 1 if
is consistent and to find all solutions if it is.
and only if B = O.
l. 2x- 3y = 12 2. 3x- 6y = 3 3. 2x- 6y = -1 Recall that the graph of a single linear equation in m unknowns is a
3x + 4y =l. 4x- By= 4. 4x + 9y = 5. hyperplane in m-space (a line in the plane if m = 2, and a plane in 3-space
4. -Sx + 15y= 3 5. 2x - 3y - 2z = 2 if m = 3). Therefore, a system of n linear equations in m unknowns can be
2x- 6y = 4. 6x --9y- 3z = -3. thought of as giving n hyperplanes in m-space, although they need not all be
3x + 2y + z = 6 distinct. A solution of the system of equations is then a point in common to
6. 2x + y - 6z = 1 7. 3x- 6y + 3z = 9 8.
x+y-z=2 2x- 4y + 2z = 6 2x- 4y + 5z = 3 all n hyperplanes, and the graph of the solution set is the intersection of the
+ 5z = hyperplanes.
3x + 2y- 7z =O. Sx - 10y 15. x+y-z=l.

9. 4x + 3y - z + w = -2 10. 2x +y - z + 3w = O
2x+y+z+w=2 5x - 8y + 5z - 3w = O Example 1 Determine how the planes with Cartesian equations
5x+3y+z+w=-2. x - 3y + 2z - 2w = O 4x - 2y + 6z = 1, 2x + y - 3z = 2, and 6x - 3y + 9z = 4 intersect in
3x - 2y + z + w = O. 3-space.
11. Determine if the following sets of vectors are linearly dependent. Gaussian elimination on the system of 3 equations in 3 unknowns yields
the inconsistent system
a. {2t- t3, 4 + t2, t + t2 + !3}.
b. {1 + 21 2 + 61 6 , t 2 + 14 + 31 6 , 1- 21 }.
4
x=O
c. {(2, 4, -3), (4, O, -5), ( -1, 2, 1)}.
d. {(1, O, -1, O), (0, 1, O, ___:_ 1), (O, 1, -1, O), (1, O, O, -1)}. y- 3z =O.
12. Find the transpose of ~-..,..,
0=1
S 2
6) (3l ~1 7)~ .
a. (~ !). b. (t ~ ~). c. W· d. 2 5 O .
(
3 4 2
e.
1 3 2
Therefore,
the planes
there is no point common to the three planes. In this case, two of
are parallel, with normal direction (2, ~ 1, 3). Since not all three
planes are parallel, the two parallel planes intersect the third plane in a pair
13. Find the rank of each matrix in problem 12 and show that it is equal to the of parallel Iines.
rank of its transpose.
14. Suppose a system of linear equations has more than one solution. Show that Now suppose the system A X= Bis consistent. Then the nature of the
it must have an infinite number of solutions. graph ofthe solution set depends on the number of parameters in the solution.
15. Suppose { V 1 , ••• , Vn} is a linear! y independent set in "Y n· Prove without refer- If there are no parameters, the system has a unique solution, which is repre-
ence to dimension that this set spans "~'n· sented by a single point. Thus if two equations in two unknowns ha ve a uni-
3. SYSTEMS OF LINEAR EOUATIONS §4. Geometric lnterpretation 103
102

que solution, they are represented by two nonparallel lines in the plane can be generalized to call the graph of the equation P = U+ tV + sW,
intersecting at a point. But in general a solution may involve one or more U, V, W E "f/' m with V and W linearly independent, aplane in m-space.
parameters.
Example 3 Consider the intersection of the three hyperplanes in
Example 2 Consider the three planes in 3-space given by 4-space given by the following system:

X- 2y + Z = 5 x + y + 2z + 5w = 5

3x + y- 4z = 1 3x +y+ 8z + 7w = 9
x + 5y- 6z = -9. x - y + 4z - 3w = -l.

Gaussian elimination yields Gaussian elimination yields

x-z= x + 3z + w =2
y- z = -2. y- z + 4w = 3.

Therefore the solution involves one parameter and is given by x = 1 + t, Therefore the solution requires two parameters, and the graph, having
y = t - 2, z = t, tE R. The nature ofthe solution set in 3-space is clear when two degrees of freedom, is a plane. To see this, write the solution as x =
the solutions are written as 2 - 3s - t, y = 3 + s - 4t, z = s, w = t with s, t E R or in vector fonn as

(x, y, z) = t(l, 1, 1) + (1, -2, O) with tE R. (x, y, z, w) = (2, 3, O, O) + s( -3, 1, 1, O) + t( -1, -4, O, 1), s, tE R.

So the three planes intersect in the line with direction numbers (1, 1, 1) This is the plane in 4-space through the point (2, 3, q, O) which is parallel to
·passing through the point (1, -2, 0). the plane through the origin spanned by ( -3, 1, 1, O) and ( -1, -4, O, 1).

In the preceding example, one parameter is needed to express al! the Gaussian elimination provides solutions in parametric form, but Car-
solutions, and the graph of the solution set is a !in e. The number of parameters tesian equations may be obtained by row n!duction of a matrix. Sometimes
needed to express ·a solution is geometrically described as the number of the Cartesian form is more useful. For example, the coefficients in a Carte-
degrees of freedom in the graph. Thus a line has one degree of freedom. In sian equation for aplane in 3-space provide direction numbers for a normal.
all of 3-space there are three degrees of freedom, and a single point has no
degrees of freedom. In 3-space, this leaves the plane which should have two
Example 4 Find the normal to the plane in 3-space given by
degrees of freedom. Consider the Cartesian equation of a plane in 3-space,
x + 2y - z = 3. The solutions of this equation may be written in the form
x = 3 - 2t + s, y = t, z = s with t, sE R. Here the two parameters s and t f/ = {(3a + 2b, 2a + 3b, 6a- b)la, bE R}.
exhibit the two degrees of freedom. Or the solution could be written as
Although f/ is de:fined using two parameters, it need not be aplane. But
(x, y, z) = (3, O, O) + t( -2, 1, O)+ s(1, O, 1), t, SE R.
(3a + 2b, 2a + 3b, 6a - b) = a(3, 2, 6) + b(2, 3, -1)
This shows that the plane passes through the point (3, O, O) and is parallel to
the plane through the origin spanned by ( -2, 1, O) ¡¡jl({ (1, O, 1). This idea and (3, 2, 6) and (2, 3, -1) are linearly independent, so there are in fact two
104 3. SVSTEMS OF LINEAR EQUATIONS §4. Geometric ln~pretation 105

degrees of freedom and fl' is a plane. Since fl' is the row space of Euclidean geometry. However, the algebraic view distinguishes one point,
the origin, whereas from the geometric perspective, al! points are identical.
In time it will be necessary to develop ideas that allow us to separate proper-
ties which depend on a figure's position from those that do not.

the Cartesian equation can be found by row reduction. The matrix


Problems

1. Determine the number of independent equations in the following systems and


identify the graph of the solution set.
is row-equivalent to a. 2x +y =
3 b. 3x + 9y = 3 c. 3x - y + 2z = O
2y = 1
X - X + 3y = 1 X+ 4y + Z = 0
4x + 7y = 7. 2x + 6y = 2. 2x- 5y + z =O.
d. 4x - 2y + 6z = O f.
-2x +y- 3z =O e. 2x + 4y - 6z + 2w = 2
y-z=O
6x - 3y + 9z = O. 3x + 6y - 9z + 3w = 3.
So[/'= .2"{(1, O, 4), (0, l, -3)} and (x, y, z)efl' provided (x, y, z) = X - z =O.
x(l, O, 4) + y(O, 1, -3). Therefore z must equal 4x- 3y, and the Cartesian 2. Determine how the planes, given by the following equations, intersect.
equation for fl' is 4x - 3y - z = O. Thus (4, - 3, - 1) is a normal direction. a. 2x + y - z = O, x + y + 3z = 2.
b. 2x - y + 3z = O, 6x - 3y + 9z = O.
A point P in m-space has m degrees of freedom. lf the point is required c. x + y + z = O, 2x - y = O, x + z = O.
to satisfy a single linear equation, then there is one constraint on the freedom d. x + y + z = 2, 2x + y - z = 3, x + 2y + 4z = 3.
of P and it has m - 1 degrees of freedom, that is, it Iies on a hyperplane. 3. Given three distinct planes in 3-space, write their equations in the form AX =
lf Pisto satisfy the consistent system AX = B of n equations in m unknowns, B. List all possible combinations of the values for the rank of A and the rank
then the number of constraints on P need not be n. Rather it is the rank of of A*, and describe the geometric condition in each case.
the coefficient matrix A. We will say that AX = B has k independent equa- 4. Find a Cartesian equation for the following sets by using elementary row
tions ifthe rank of the augmented matrix is k. Thus if P satisfies the consistent operations to simplify a matrix that has the given set as its row space.
system AX = B'with k independent equations, then there are k restraints on a. {(2a + 3b, 3a + Sb, 4a + Sb)/a, bE R}.
the freedom of P, or P has m - k degrees of freedom. Recall that the number b. {(a, 2a- 3b, 2b- a)/a, bE R}.
of parameters required in the solution of a consistent system of equations c. {(a + 4b + 3e, a+ b + 2e, 2a- b + 3e)/a, b, e E R}.
in m unknowns, is also m - rank A. For example, although three hyper- d. {(a+ b, 2a + 3b +e, a + 2e, 3a- 2b + 4e)/a, b, e E R}.
planes were given in Example 3, they did not consitute three constraints on the e. {(a + b + 2e, -a - b, a - b - 2e, b - a)/ a, b, e E R}.
points of the graph in 4-space. If they had, there would have been 1 = 4 - 3 5. Find the intersection of the given hyperplanes, and describe it geometrically.
degree offreedom, and the solution would have been a line instead of aplane. a. x +y+ z + w = 6 b. x +y+ 3z + 2w = -2
Suppose S is the solution set for a consistent system of linear equations. 2x +y+ 4z- 4w =O 2x + 3y + 8z + 4w = -3
Then the algebraic character of S depends on whether the system is homo- 3x + 2y + Sz - Sw = 2. x - y - z + 2w = -4.
geneous, for S is a subspace only if the system is homogeneous. Otherwise S X¡ + 2x2 + 5x3 - 2x 4 + 4x 5 = 3
is the translate of a subspace, see problem 6 below. In contrast, there is no c. 2x¡ + X 2 + 7x3 -- x 4 + 2x 5 = 3
important geometric distinction for the graph of S. For example, a Iine X¡ + 3x2 + 6x3 - 5x4 + 6x 5 = 2.
represents a subspace only if it passes through the origin. But al! Iines are 6. Suppose AX""' B is a consisten! system of k independent equations in m
essentially the same, and any point could have been chosen as the origin. The unknowns. Let S be the solution set for AX =B.
choice of a particular Cartesian coordinate system is required to obtain our a. Show that if P and Q are any two points in S, then the en tire line containing
geometric representation of vectors and the corresponding algebraic view of P and Q is in S.
106 3. SYSTEMS OF LINEAR EQUATIONS §5. Determinants: Definition 107

b. Let !!' be the solution space for the homogeneous system AX = O. Show Definition The determinant of a 2 x 2 matrix
that S may be written in the form S= {U + VI V E!!'} where U is any vector
in S. (Geometrically S is a translate of !!' and so can be thought of as being
parallel to !/'.)
7. Suppose S and !!'are as in problem 6, and AX = Bis given by:
2x- 3y = 1 is the number ad- be and will be denoted by IAI, det A, and by
6x- 9y = 3.

1~ ~1·
Show that S and !!' are parallellines, with !!' passing through the origin.
8. Suppose S and !!'are as ir:?'l'feblem 6, and AX = Bis given by
2x - 4y - 2z = -2 Thus
3x- 5y + z = -4
2x - 2y + 6z = -4.
a. Find S. Why does par! a of problem 6 hold here?
b. Find !!'.
1; !1= 2( 5) - 3(1) = 7

c. Show that S is parallel to !!'. and


9. Suppose S and !!' are as in problem 6, and AX = Bis given by
2x - 3y - 5z + 4w = -1
3x- 2y + w = 6 1~ ~/ = 24- 24 =o.
2x +y+ 7z- 4w = 11.
a. Find S and !!'. Since a 2 x 2 matrix has rank 2 if and only if its rows are linearly
b. Express S as the set {U + VI V E!!'} for sorne vector U. independent, we can make the following statement: For A E vlf 2 x 2 , rank
c. Why is the result in part a of problem 6 satisfied in this case? A = 2 if and only if det A f: O. This can be applied to the system of linear
equations

ax. +by= e
§5. Determinants: Definition ex+ dy =f

The determinant can be thought of as a function that assigns a real We know that this system has a unique solution if and only if the rank of the
number to each n x n matrix. There are severa) approaches to the definition coefficient matrix is 2, that is, if and only if
of this function, and although they differ considerably in their leve) of
abstraction, they alllead to the same properties. For our purposes, we must
be able to calculate the value of a determinan! and use determinants in
mathematical statements. Therefore we will take a simple computational
approach to the defiriition, even though it will then not be possible to prove Moreover, if there is a unique solution, it has the form
the central property of determinants.
To motívate the general definition, consider the expression ad - be, for
a, b, e, dE R. This expression has occurred severa) times in this and the previous
chapter. For example, we know that the set {(a, b), (e, d)} in "Y 2 is linearly
x _ ed - bf _/;
- ad - be -
:1
¡abf' y= af-ee =
ad- be a b ·
H~;
independent if and only if ad - be f: O. Je dJ e d
3. SYSTEMS OF LINEAR EQUATIONS §5. Determinants: Definition 109
108

This is an example of Cramer's rule for finding the unique solution of a sys- Thus
tem of n independent, consistent linear equations in n unknowns. Sine~ there
is no pressing need for the general form of this rule, it will be postponéd until IUoNI
A = bh = IIVII/iNil = 1u o NI
the proof can be given in matrix form (see Theorem 4.22, page 185). Notice
that the particular expression for the solution in terms of determinants is not = l(x, y) o (w, -z)l = lxw- yzl
the only one possible. A number in the form xw - yz can be written as a
determinant in many different ways. For example

xw- yz
l
-x¡
= zw. xyl = lyw -z . As a particular case, suppose U = (2, 5) and V = (7, - 3). Then

Examp/e 1 Let U= (x, y) and V= (z, w). Then the area of the
parallelogram determined by O, U, V and U + V in the plane is the absolute
value of and the area of the parallelogram with vertices (O , O) , (2 , 5) , (7 , - 3) , an d
(9, 2) is l-411 = 41.

¡; ~1· f!
Althou.gh an~ V are assumed to be independent in the above example,
Let b be the length of the base of the parallelogram and h the height, the calculatiOn IS vahd provided only that V =? O. When U and V are linear! y
as in Figure l. Then if A is the area of the parallelogram, A = bh and b = 1VI. dependent, the determinant is zero, but there is also no area.
The height can be found using the angle e between U and a vector N per- '· When it comes to defining the value of the determinant of a 3 x 3
matrix, there is no expression like ad - be to work from. Therefore suppose
pendicular to V. For then
we ask that determinants be defined so that their absolute value is the volume
¡u o NI !U o NI of the parallelepiped in 3-space determined by the origin and three independ-
h = 1 Ulllcos e¡ = 1 Ull¡l UIIIINif = liNil . ent vectors. That is, if A = (a 1, a 2 , a 3 ), B = (b 1 , b 2 , b3), and C = (e 1 , e2 , e3),
then the determinant of the matrix
Now since V= (z, w) we can let N= (w, -z) and IINII = Jw 2 + z2 = !lVII.

.. ..,_,
s~ould ~e defined so that its value is plus or minus the volume ofthe parallele-
piped wüh three adjacent sides determined by A, B, and C. By computing the
volume V of this parallelepiped, we will see how the determinant of a 3 x 3
o matrix might be defined. Let b be the area of the base determined by B and
C, and let h be the height (see Figure 2). Then V= bh. If N is a normal to the
~la~e ~f B and C, then the formula h = lA o NI/IINII holds in 3-space justas
1t d1d In the pl~ne. To find N, we need a Cartesian equation for the plane
fll{B, C}. That IS, we must characterize the row space of the matrix
V
Figure 1
3. SYSTEMS OF LINEAR EQUATIONS §5. Determinants: Definition 111
110

plane, and

lA o NI
V= IINIIIiNi! = lA o NI.

Therefore the volume is the absolute value of

e This expression provides the pattern for defining the determinant of 3 x 3


Figure 2 as well as n x n matrices. As an example, the determinant of the 3 x 3
matrix

If b 1c 2 - b 2 c 1 i= O, then the echelon forro for this matrix is

b3 c2 - b2 c3
o b 1 c2 - b2 c 1
should be
b1 c3 - b3 c1
o b 1 c2 - b2 c1

Thus
In order to use the above expression to define the determinant of a
3 x 3 matrix, it is necessary to know how to compute determinants of 2 x 2
matrices. This will be an essential characteristic of our definition. The defini-
tion of determinant for (n + 1) x (n + 1) matrices will use determinants of
and the equation for the plane 2{B, e} can be written as n x n matrices. Such a definition is called a definition by induction.
For 1 x 1 matrices, that is (r) with rE R, define the determinan! to be

det (r) = l(r)l = r.


Notice that a particular order has been chosen for the coefficients. Now we
can take the vector N to be So 1(- 3)1 = -3. Now suppose determinant has been defined for all n x n
matrices (n cou1d be 1). Let A = (a;) be an (n + 1) x (n + 1) matrix. Then
for each entry aii of A, define the minor of a¡i to be the determinant of the
n x n matrix obtained from A by deleting the ith row and the jth column.
The cofactor of a;i is defined to be ( -l)i+ i times the minor of a¡i and is de-
e
N is called the cross product of B and and is normal to the plane of B and e noted by Aii. Then the determinant of A, denoted by IAI or det A, is defined by
even if b 1 c 2 - b 2 c 1 = O. This could be shown directly, but it follows at once
from a property of determinants, see problem 2 of Section 6.
e
The fact that IINII = IIBIIIICII sin is proved in Section 3 ofthe appendix
on determinants. Using this equation, the area b equals IINII, just as in the The determinant of an n x · n matrix will often be called an n x n determinan t.
3. SYSTEMS OF LINEAR EOUATIONS §5. Determinants: Definition 113
112

Example4 Compute the determinant. of


Example 2 Let

2 1 3) 3 1
1 2 o o
o o)
A=
( -4 5 -1 .
o 4 -2 A=o13o·
(
o o 2 1
Then the minor of 4 is the 2 x 2 determinant
detA = 3A 11 + OA 12 + 1A 13 + OA 14

¡_¡ -il = 10. = 3(-1) 1 +1 1 3


2 o o
o+ 1(-1) 1 + 3
1 2 o
o 1 o
o o 1
o 2 1
The matrix
= 3(2)(-1)1+11~ ~1 + (-1)1+11¿ ~1 + 2(-1)1+ 2 1~ ~1
= 6(3) + 1 - 2(0) = 19.

is obtained by de1eting the third row and second co1umn, so the cofactor of lf it were not for the presence of severa! zeros in the above matrix, the
4, A32 , is ( -1)3+ 210 = -10. eva1uation of the 4 x 4 detern1inant wou1d involve a considerable amount of
The minor of 2 in A is arithmatic. A 2 x 2 determinant involves two products. A 3 x 3 determinant
is the sum of three 2 x 2 determinants, each multiplied by a real number
and therefore involves 3·2 products, each with 3 factors. Similarly, a 4 x 4
. determinant involves 4 · 3 · 2 = 24 products of four numbers each. In general,
to evaluate an n x n determinant it is necessary to compute n · (n - 1) ·
and the cofactor of 2 is A 11 = ( -1) 1+1(-6) = -6. . ... 3. 2 = n! products of n factors each. Fortunately there are short cuts,
The determinant of A is given by so that a determinant need not be evaluated in exactly the way it has been
defined. The following theorem is proved in the appendix on determinants,
det A = a 11 A 11 + a 12 A 12 + a 13A13 but at this point it is stated without proof. ·

=2(-1)1+1¡¡ =~1+1(-1)1+ 2 1-~ =~1+3(-1) 1 + 3 1-~ ¡J Theorem 3.13 Let A = (a;) be an n x n matrix, then for each i,
1 ::; i ::; n,
-12 - 8 - 48 = -68.

Example 3 Show that this definition of determinant agrees with the


This is called the expansion of the determinant along the ith row of A.
definition for 2 x 2 matrices.

And for eachj, 1 ::; j ::; n,


au a¡21 = a¡¡Au + a¡2A!2
J a21 a22
= a 11 (-1) 1 +1 l(azz)l + ad-1) 1+2l(a2t)l
This is the expansion of the determinant along thejth column of A.
114 3. SYSTEMS OF LINEAR EQUATIONS §5. Determinants: Definition 115

Example 5 Let
Problems

32 5 1 -17)
1 o Evaluate the following determinants:

/~
A=2oo 7
(
1 3 o 2 l.
/i il· 2. j(-5)j. 3. 127 o
2 931 . 4. 11o 51 291 . 5.
5 o 8 o o 1
1
2 g/·
1 2 8 -1 1 o o 1
The expansion of det A along the first row, as in the definition, results in four o o 2 o1 . 7. o 1 1 1 8. 111 2 331 .
3 x 3 determinants. But if det A is expanded along the 3rd column or the 3rd
6. o 3 7 o 1 o o. 5 4 1
row, then only one 3 x 3 cofactor need be evaluated. Expansion along the 2 1 3 -3 1 1 1 o
3rd column of A gives 9. Show that the four points A, B, e, and D are the vertices of a parallelogram
with D opposite A and find its area when
a. A = (0, 0), B = (2, 5), e= (1, -4), D = (3, 1).
detA =
3 1
1(-1) 1 + 3 2 O -1o = 2(-1)2+1 1 1
1 3 2 3
-1¡
2 ,; -2(2 + 3) = -10.
b. A= (2, -5), B = (1, 6), e= (4, 7), D = (3, 18).
c. A= (1, 1), B = (2, 4), e= (3, -2), D = (4, 1).
10. Find the volume of the parallelepiped in 3-space determined by the vectors
Expansion along the 3rd row of A gives a. (1, O, O), (0, 1, 0), (0, O, 1).
b. (3, 1, -1), (4, 2, 2), (-1, 2, -3).

detA
5 1

3
7

o 2
1
= 2(-1)3+ 1 1 O -1 = 2(-1) 1 + 2 1
3
-1¡
2 = -2(2 + 3) = -10.
11. Write out the six terms in the expansion of an arbitrary 3 x 3 determinan!
and notice that this sum contains every possible product that can be formed
by taking one element from each row and each column. Does this idea
generalize to n x n determinants?
The values are the same as predicted by Theorem 3.13. In the expansions, 12. Prove Corollary 3.15.
the first 3 x 3 minor was expanded along its 2nd row and the second along 13. Let A be an n x n matrix. Use induction to prove that jAr¡ = JAj.
its 2nd column. 14. For B, e E S 3 , Jet B X e denote the cross product of B and C. Compute the
following cross products:
Two corollaries follow easily from Theorem 3.13. a. (1, O, O) X (0, 1, O). d. (2, 3, 2) X (-2, 1, 0).
b. (0, 1, O) X (1, O, 0). e. (-4, 2, -8) x (6, -3, 12).
c. (1, 2, 3) X (1, 2, 3) .. f. (3, O, 2) X ( -5, O, 3).
Corollary 3.14 If a row or column of an n x n matrix A contains only 15. Suppose the definition of a 3 x 3 determinan! is extended to allow vectors as
O's, then det A = O. entries in the first row, then the value of such a determinan! is a linear com-
bination of the vectors.
Corollary 3.15 If A is an n x n matrix and Bis obtained from A by a. Show that if B = (b¡, b2, b3) and e= (e¡, c 2, c3), then
multiplying a row or column of A by a scalar r, then det B = r det A.
B X e=


E2
b2
E31
b3
l c1 c2 C3
The second corollary agrees with the idea that a determinant measures where {E¡, E2, E 3 } is the standard basis for 3 3 •
volume. For if the edges of a parallelogram or parallelepiped determined by b. Let A = (a¡, a2, a 3 ) and show that
one direction are mu1tiplied by r, r > O, then the area or volume is also
multiplied by r.
116 3. SYSTEMS OF LINEAR EQUATIONS §6. Determinants: Pr_9Perties and Applications 117

A o (B x e) is the scalar triple product of A, B, and e and the volume of two rows. Since n · > 2, at 1east one row is not involved in the interchange,
the parallelepiped determined by A, B, and e in G3 • say the kth is such a row and expand IBI along this row. Then IBI = bk 1 Bkl
16. Use a geometric argument to show that {A, B, e} is Iinearly dependent if + bk 2 Bk 2 + · · · + bknBkn· For each indexj, the cofactor Bk1 differs from the
A o (B x e)= O. cofactor AkJ of akJ in A on!y in that two rows are interchanged. These co-
factors are determinants of (n - 1) x (n - 1) matrices, so by the induction
17. a. Compute [(0, 1, 2) x (3, O, -1)] x (2, 4, 1), and (0, 1, 2) x [(3, O, -1) x
(2, 4, 1)]. assumption, BkJ = - AkJ· Since the kth row of B equa1s the kth row of A, the
b. What can be concluded from part a? above expansion for IBI becomes
c. What can be said about the expression A x B x e?

which is the negative of the expansion of IAI along the kth row of A.
Now 1 ho1ds for all 2 x 2 matrices, and if it ho1ds for all (n - 1) x
§6. Determinants: Properties and Applications (n - 1) matrices (n ~ 3), then it holds for all n x n matrices. Therefore 1
is proved by induction:
The evaluation of a determimint is greatly simplified if it can be expanded Part 2 was pro ved in the previous section (Corollary 3.15); and the induc-
along a row or column with only one nonzero entry. If no such row or tion proof of 3 is 1eft as a prob1em.
column exists, one could be obtained by performing elementary row opera-
tions on the matrix. The following theorem shows how such operations affect Examp!e 1
the value of the determinan t.
-5 2 -3 7 -4 o 7 4
Theorem 3.16 Suppose A is an n x n matrix and Bis obtained from 4 -2 1 = 4 -2 1 = 1(-1) 2 + 3 /
2
-
3
¡= -29.

A by a single elementary row operation, then 2 3 o 2 30

Here the second determinant is obtained from the first by adding three times
l. IBI = -IAI if the elementary row operation is of type I. the second row to the first row.
2. IBI = riAI if a row of A is'multiplied by a scalar r.
3. IBI = IAI if the elementary row operation is of type III.
Elementary row operations can be used to obtain a matrix with zeros in
a column but usually not in a row. To do this it is necessary to make changes
Proof The proof of 1 is by induction on the number of rows in A. in the columns of the matrix. Elementary eolumn operations are defined just
It is not possible to interchange two rows of a 1 x 1 matrix, so consider as elementary row operations except they are applied to the columns of a
2 x 2 matrices. If matrix rather than the rows.

B=(ea d)b '


;.;;

and ' Examp!e 2 The following transformation is made with elementary


column operations of type III:
then

IBI =eh- da= -(ad- be)= -IAI.


2 -1) (3 2 3)
1 -2 -----> 2 10----->
(-10 1203).
3 -5 4 3 1 -2 3 1
Therefore the property holds for 2 x 2 matrices. Now assume that 1 holds
for all (n - 1) x (n - 1) matrices with n 2: 3. Let A be an n x n matrix First 2 times the 2nd column is added to the 3rd and then -2 times the 2nd
A = (a;) and 1et B = (b;j) be the matrix obtained from A by interchanging column is added to the 1st.
118 3. SYSTEMS OF LINEAR EQUATJONS §6. Determinants: Properties and Applications 119

Theorem 3.17 Suppose A is an n x n matrix and Bis obtained from A The technique shown in Example 3 is the same as that used in finding the
with a single elementary column operation, then echelon form of a matrix, except that it is necessary to keep track of the
changes in value that arise when elementary row operations of type 1 and II
l. IBI = -lA if the elementary column operation is of type I.
J
are used.
2. IBI = rlAI if a column of A is multiplied by r. Elementary row operations change the value of a determinant by at
3. IBI = JAI if the elementary column operation is of type III. most a nonzero scalar multiple. Therefore, the determinant of an n x n
matrix A and the determinant of its echelon form can differ at most by a
Proof Performing an elementary column operation on a matrix has nonzero multiple. Thus if the echelon form for A has a row of zeros, then
the same effect as perform~an elementary row operation on its transpose. JAI =O or:
Therefore this theorem follows from the corresponding theorem for ele-
mentary row operations and the fact that lA TI = JAI for aii n x n. matrices A. Theorem 3.20 The rank of an n x n matrix A 1s n if and only if
IAI #O.
Theorem 3.18 If two rows or two columns of an n x n matrix A are
proportional, then IAI = O. Now using determinants, two results from the theory of systems of
linear equations can be restated.
Proof If A satisfies the hypothesis, then there is an elementary row or
column operation of type III which yields a matrix having a row or column Theorem 3.21 1. A system of n linear equations in n unknowns,
of zeros. Thus JAI = O. AX = B, has a unique solution if and only if !Al # O.
2. A homogeneous system of n linear equations in n unknowns AX = O
lt is possible to evaluate any determinant using only elementary row or has a nontrivial solution if and only if lA 1 = O.
elementary column operations, that is, without computing any cofactors. To
see how this can be done, we wiii say that the elements a¡¡ of a matrix A =
Examp!e 4 Determine if
(a;) are on the main diagonal of A. And that A is upper triangular if all the
elements below the main diagonal are zero, thl;\t is, if aii = O when i > j. {(2, 1' o, -1), (5, 8, 6, - 3), (3, 3, o, 4), (!' 2, o, 5)}
Lower triangular is similarly defined. Then we have the foiJowing theorem.
is a basis for "f/" 4 .
Theorem 3.19 If A =(a;) is an n x n upper (lower) triangular matrix, A set of 4 vectors is a basis for "f/" 4 if it is Jinearly independent, or equiva-
then JAI = a 11 a22 • • • a••. lently if the matrix · with the vectors as rows has rank 4, that is, a nonzero
determinant. But
Proof By induction and Jeft as a problem.
2 1 o -1 2 1 -1 o o
Example 3 Use elementary row operations to evaluate a determi- 5 8 6 -3 = 6(-1)2+3 3 3 4 -6 -3 3 7 =O.
nant. 3 3 o 4 1 2 5 -3 2 7
2 o 5
2 4 -6 1 2 -3 1 2 -3
3 1 -4 =23 1 -4 =20 -5 5 Since two column operations yield a 3 x 3 matrix with two columns proper-
2 2 5 2 2 5 o -2 11 tional, the determinant is zero and the set fails to be a basis.
1 2 -3 1 2 -3
-10 o 1 -1 = -10 o 1 -1
Problems
o -2 ll o o 9 rr
= -1 o. 1. 1 . 9 = -90. l. Use elementary row and column operations to simplify the evaluation of the
120 3. SYSTEMS OF LINEAR EQUATIONS §7. Alternate Approaches to Rank 121

following determinánts: matrix A, the rank is n provided IAI =F O, and IAI = O says only that the
1 1 o o rank is less than n. However, ~ven if IAI = O orA is n x m and the determi-
a.¡¡ -~ ¡¡. 2 1
b. 3 -1
14 2
c.
1 1 1 1
o
oo
1 1
1 1
o nant of A is not defined, it is possible to use determinants to find the rank of A.

1 1 o 1 2 4 -6 4 o 1 2 3 Oefinition A k x k submatrix of an n x m matrix A is a k x k matrix


o 1 1 o 2 6 -1 1· 1 2 3 o
d. 1 1 o o. e. 6 o 3 6 . f. 2 3 o 1 obtained by deleting n - k rows and m - k columns from A.
1 1 1 1 1 2 o 2 3 o 1 2
2. Let U, V E S 3 , and use problem 15 of Section 5 along with the properties of
The 2 x 3 matrix
determinants to prove
a. U X V= -V X U.
b. U x Vis perpendicular to both U and V.
c. lf U and V are Iinearly dependent, then U x V = O.
3. Use determinants to determine if the following sets are linearly independent: has three 2 x 2 submatrices;
a. {(4, 2, 3), (l, -6, 1), (2, 3, 1)}.
b. {3 + 4i, 2 + i}.
c. {t 5 + 2t, t 3 + t + 1, 2t 5 + 5t, t 3 + 1}.
4. Is the set {(4, O, O, O, 0), (3, -2, O, O, 0), (8, 2, 5, O, 0), (-3, 9, 4, 2, 0),
(0, 5, 3, 1, 7)} a basis for "Y 5 ?
It has six 1 x l submatrices; (2), (1),.(5), etc.
5. Give induction proofs for the following statements.
-~
a. An elementary row operation of type 111 does not affect the value of a
determinan t. Definition The determínant rank of an n x m matrix A is the largest
b. The determinant of an upper triangular n x n matrix is the product of the value of k for which sorne k x k submatrix of A has a nonzero determinan t.
elements on its main diagonal.
6.- Suppose A = (al)) is an n x n matrix with a11 #- O. Show that det A=
Examp!e 1 Let
(lf(a 11 )"- 2 ) det B, where B =(bu) is the (n- 1) X (n- 1) matrix with 2:::; i,
j:::; n, and b, 1 = a 11 a 11 - a 11 au. This procedure, called pivota/ condensation,
1
ínight be used by a computer to reduce the order of a determinant while
introducing only one fraction .. 3
-2
7. Use pivota) condensation to evaluate
2 1 3 2
Then
b. 52 47 221 .
1 3 2 1
a. 53 71 421 . c. 2 2 1 1
19 6 8 13 2 3 3 1 2 1 4 1 3
8. Prove that interchanging two columns in a determinant changes the sign of the 2 3 1 =o
determinant. (This does not require induction.) 2 -2 2

and

§7. Alternate Approaches to Rank

The rank of a matrix is defined as the number of nonzero rows in its


echelon forro, which equals the dimension of its row space. For an n x n Therefore the determinant rank of A is 2.
122 3. SYSTEMS OF LINEAR EQUATIONS §7. Alternate Approaches to Rank 123

The rank or row rank of A is also 2, for the echelon form for A is is the echelon form for A and the rank of A is 2. Now

4/5)
(~
o
1 -1/5 .
o o
is a 2 x 2 submatrix of B with nonzero determinant. A can be transformed to
Theorem 3.22 For any matrix A, the determinant rank of A equals the B with a sequence of elementary row operations; working backward we
rank of A. obtain a sequence that transforms the nonzero rows of B to the first two
rows of A (in this case any pair of rows in A could be obtained).
~.. .

Proof s
Let A be an n x m matrix with determinant rank and rank
! 2 o !) (! 2 2 1/2 5/2)
r. We will show first that s ::;: r and then that r ::;: s. 1/2 5/2)--d(l
( 0013¡¡¡>00 13 11 0 o -3/2 -9/2
If the determinant rank of A is s, then A has an s x s submatrix with
nonzero determinant. The s rows of this submatrix are therefore linearly 2 1/2 5J2)-d(2 4 1 5)
independent. Hence there are s rows of A that are linearly independent, and 10 1 8 JI 5 10 1 8 .
the dimension of the row space of A is at Jeast s. That is, the rank of A is at
least s or s ::;: r. Performing this sequence of operations on the submatrix e we obtain
For the second inequa!ity, the rank of A is r. Therefore the echelon form
for A, call it B, has r nonzero rows and r columns containing the leading 1's
for these rows. So B has an r x r submatrix, call it e, with 1's on the main
I o) (I 112) (I 112) (I 112) (2 1)
(O 1 ¡¡¡> o 1 Ir' o - 3/2 ¡¡¡> 5 1 Ir' 5 1 .
diagonal and O's e!sewhere. Thus ICI = 1 # O. Now the r nonzero rows of
B span the rows of A, therefore there is a sequence of elementary row opera- The matrix
tions which transforms these r rows to r rows of A, leaving the remaining
e
n - r rows all zero. This transforms the r x r submatrix toan r x r sub-
matrix of A, and since elementary row operations change the value of a
determinant by at most a nonzero multiple, this r x r submatrix of A has
nonzero determinant. That is, the determinant rank of A is at least r or r ::;: s. is a 2 x 2 submatrix of A, which must have determinant nonzero; in fact,
the determinant is - 3. Thus the determinant rank of A cannot be less than 2,
the rank of A.
Example 2 It may be helpful to look at an example of the pro-
cedure u sed in the second part of the preceding proof.
Let Determinant rank might be used to quickly show that the rank of a
3 x 3 matrix with zero determinant is 2, as in Example l. But in general,

A~G
4 it is not practica! to determine the rank of a matrix by finding the determinant
10
6 !) rank. For example, if the rank of a 3 x 5 matrix is 2, it would be necessary
to show that all ten, 3 x 3 submatrices have zero determinant before con-
sidering a 2 x 2 submatrix.
~hen In this chapter, matrices have been used primarily to express systems of
linear equations. Thus the rows of a matrix contain the important informa-
2 o
h(~ ~)
tion, and rank is defined in terms of rows. But in the next chapter, the
o 1 columns will contain the information needed, so the final approach to rank
o o is in terms of the columns.
124 3. SYSTEMS OF LINEAR EQUATIONS §7. Alternate Approaches to Rank 125

Definition The column space of an n x m matrix A is the subspace of Suppose {u 1' ... ' un -1} e "f/'n• lt is necessary to find a vector wE "f/' n

vltnx 1 spanned by the m columns of A. The column ran.'; of A is the dimension such that the vector equation x 1 U1 + · · · + Xn-I u._ 1 = W has no so1ution
of its column space. for x 1 , •.. , x.- 1 • This vector equation yields a system of n linear equations
in n - 1 unknowns. If the system is denoted by AX = B, then A is an
n x (n - 1) matrix, and we are 1ooking for B = wr so that the rank of A*
Examp/e 3 Let
is not equal to the rank of A.
Suppose the rank of A is k, then k ~ n - 1, and A has k Iinearly in-
dependent co1umns. But the columns of A are vectors in vll.x 1 , which has
dimension n. So there exists a column vector B in vil n x 1 which is independent
then the column space of A is of the columns of A. Let this column vector B determine W by W = Br.
Then the rank of the augmented matrix A* is k + 1, and for the vector W,
the system is inconsistent. That is

and the column rank of A is 2. If


and the n - l vectors do not span "~'··

Using elementary column operations, it is possible to define two new


relations between matrices following the pattern of row-equivalence. A
matrix A is column-equivalent to B if B can be obtained from A with a finite
then sequence of elementary column operations, and A is equivalent to B if B can
be obtained from A with a finite sequence of elementary row and column
operations. The re1ation of equivalence for matrices will appear 1ater in
another context.

is the column space of B, and the column rank of B is 2. Examp/e 5 Show that

The transpose of a matrix turns columns into rows, therefore the column
rank of a matrix A is the rank (row rank) of its transpose Ar. Also, transpos-
ing a k x k matrix does not affect its determinant (problem 13, page 115),
so the determinant ranks of Ar and A are equal. Therefore,"'
(~ L!)
is equiva1ent to
column rank A = (row) rank Ar = determinant rank Ar 1
o
o1 oo) .
= determinant rank A = (row) rank A, (
o o o
proving the following theorem.
Using column and row operations, we obtain

Theorem 3.23 The co1umn rank of a matrix equals its rank.


o 3)1 ~o
.(1 3) (1
4o -5 ~o o4 -53) ~o
(1
Example 4 Show that no set of n - 1 vectors can span the space "~'n· (i 4
4 -2
lli
o 4 -5
lli
o o o o
126 3. SYSTEMS OF LINEAR EQUATIONS Review Problems 127

4. Show that the line in the plane through the points with coordinates (x¡, y 1 )
Problems and (xz, Yz) has the equation

l. Find a matrix equivalent to the given matrix with 1's on the main diagonal and
;: ~~ ~1 =o.
O's elsewhere. 11 1 1
a. (~ i)· b. (22 -53 91) .
4 2 6
c. m. e d. 3 46 o)
2 42 o
3 1 2 1 5
2 . e. (; ~) 5. Use problem 4 to find an equation for the line through the points with co-
ordinates:

2. Find the column space and thus the rank of the following matrices: a. (1 ,0), (0, 1). b. (3, 4), (2, - 5). c. (5, 1), (8, 9).

6. Using problem 4 as a guide, devise a determinan! equation for the plane


a. (~ !)· b. ( -12 11 o
3) . c. (4o o
2 3)
o o 5
1 . d. (3o 21 21 . 5)
1 . through the three points with coordinates (x¡, y¡, Z¡), (xz, Y2, z2), and
1 5 6 2 o 1 3 (x 3 , y 3 , z 3 ). (What would happen in the equation if the three points were
3. Why is it impossible for the column space of A to equal the row space of A, collinear ?)
unless A is a 1 x 1 matrix?
7. Use elementary row operations to find a Cartesian equation for the follow-
4. The column space of a matrix A is essentially the same as the row space of Ar. ing planes:
Show that even for n x n matrices, the row space of A need not equal the row
a. {(8a + 8b, 4a + 8b, 8a + 11b)ia, bE R}.
space of AT by considering
b . .?{(5, 15, 7), (2, 6, 8), (3, 9, -2)}.
c. {(2a- 4b + 6c, a+ b- 2c, -?a+ 2b- c)ia, b, e E R}.
~)· (2~ ¿ =~).
1 5
a. A= ( 2 4 b. A = d . .?{(2, 3, -2), (-5, -4, 5)}.
1 -1 -4 10
5. How do you read AT? 8. Show that the plane through the points with coordinates (x¡, y¡, z 1 ),
(x2, Y2, z 2), and (x 3 , y 3 , z 3) has the equation
6. Use determinan! rank to prove that if U x V= Ofor U, V E g 3 , then U and V
are linearly dependen!. X - X¡ y - Yt Z- Z¡l
Xz - X¡ Yz - Yt Zz - Z¡ = 0.
7. Use the theory of systems of linear equations to prove that if {U1 , ••• , U;,} 1X3 - X 1 y3 - y1 Za - Z¡
span "Yn, then {U¡, ... , Un} is linearly independent. In your proof, write out
9. Gíven a system of n linear equatíons in n unknowns, why is one justified in
the systems used, showing the coefficients and the unknowns.
expecting a u ni que solution?
10. In the next chapter, we will find that matrix multiplication distributes over
matrix addition. That is, A( U + V) = A U + AV provided the products are
Review Problems defined. Use this fact to prove the following:
a. The solution set for a homogeneous system of linear equations is closed
l. Suppose AX = B is a system of n linear equations in 2 unknowns. Each under addition (Theorem 3.2, page 80). ·
equation determines a line in the plane. Consider all possible values for the b. lf X= C 1 is a solution for AX = B, and X= C 2 is a solution for AX = O,
ranks of A and A* and in each case determine how the n lines intersect. then X= C 1 +
C 2 is a solution for AX = B (problem 6, page 105).
2. Suppose AX = Ois a system of linear equations obtained in determining if the 11. lnstead of viewing both A and 8 as being fixed in the system of linear equa-
set S e R"[tJ is linearly independent. Find a set S for which this system is tions A X = B, consider only A to be fixed. Then A X = B defines a cor-
obtained if respondence from ,({m x 1 to ,({" x ¡.

a. A= (i ~). b. A=(~ ~)· 3 1


c. A= (2 -1 a. To what vectors do O), (_~), and (~) correspond in ,({ 3x1 if

Suppose S = {V¡, ... , Vm} e "Y n and m < n. Use the theory of linear equa-
3.
tions and column rank to prove that S does not span "Yn. A= (i D?
128 3. SYSTEMS OF LINEAR EQUATIONS

b. To what vectors do (z). C!), and


A= (b ¡
eD~)?
correspond in .-112 X 1 if

c. Suppose this correspondence satisfies the second condition in the definition


for an isomorphism, page 62. What can be concluded about the rank of A,
and the relative values of n and m?

Linear Transformations

§1. Definitions and Examples


§2. The lmage and Null Spaces
§3. Algebra of Linear Transformations
§4. Matrices of a Linear Transformation
§5. Composition of Maps
§6. Matrix Multiplication
§7. Elementary Matrices
130 4. LINEAR TRMISFORMATIONS §1. Definitions and Examples 131

A special type of transformation, the isomorphism, has been touched the sum of the images and the image of a scalar multiple is the scalar multiple
upon as a natural consequence of introducing coordinates. We now tuxn to of the image. We say that a linear transformation "preserves addition and
a general study of transformations between vector spaces. · scalar multiplication."
Almost none of the functions studied in calculus are linear maps from
the vector space R to R. In fact,f: R-+ R is linear only ifj(x) = ax for sorne
real number a.
§1. Definitions and Examples The two conditions in the definition of a linear transformation are in-
cluded in the definition of an isomorphism (see page 62). That is, isomor-
phisms are linear transformations. But the converse is not true.
Definition Let "f/ and~-be vector spaces. If T(V) is a unique vector in
ifí for each V E 1', then T is afunetion, transformation, or map from ·r,
the
domain, to ifí, the eodomain. If T(V) = W, then W is the image of V under Examp/e 1 Define T: i' 3 -+ i' 2 by T(x, y, z) = (2x + y, 4z). Show
Tand Vis a preimage of W. The set of all images in "'r, {T(V)IV E 1'}, is the that T is linear but not an isomorphism.
image or range of T. The notation T: i' -+ "'r will be used to denote a trans- If U= (a 1 , a2 , a3 ) and V= (b 1 , b2 , b3 ), then
formation from i' to "'r.
T(U + V) = T(a 1 + b1 , a 2 + b 2 , a 3 + b3 ) Definition of + in "f/ 3
According to this definition, each vector in the domain has one and only
one image. However, a vector in the codomain may have many preimages = (2[a 1 + b1] + a 2 + b2 , 4[a 3 + b3 ]) Definition of T
or non e at all. For example, consider the map T: i' 3 -+ "f/ 2 defined by = (2a 1 + a2 , 4a3 ) + (2b 1 + b2 , 4b 3 ) Definition of + in i' 2
T(a, b, e) =(a+ b- e, 2e- 2a- 2b). For this map, T(3, 1, 2) = (2, -4).
Therefore (2, -4) is the image of(3, 1, 2) and (3, 1, 2) is a preimage of(2, -4). = T(a 1 , a2 , a3 ) + T(b 1 , b2 , b 3 ) Definition of T
But T(l, 2, 1) = (2, -4), so (1, 2, 1) is also a preimage of (2, -4). Can you = T(U) + T(V).
find any other preimages of (2, -4)? On the other hand, (1, O) has no
preimage, for T(a, b, e) = (1, O)yields the inconsistent system a + b - e = l. Therefore T preserves addition of vectors. T also preserves scalar multiplica-
2C - 2a ..:.. 2b == -Ó. Thus sorne vectors in thé codomain "f/ 2 have many tion for if rE R, then
preimages, while others have none.
In the preceding definition, i' and 1f/ could be arbitrary sets and the
T(rU) = T(ra 1 , ra2 , ra 3 ) Definition of scalar multiplication in "f/ 3
definition would still be meaningful provided only that the word "vector"
were replaced by "element." In fact, elementary calculus is the study offunc- = (2ra 1 + ra 2 , 4ra 3 ) Definition of T
tions from a set of real numbers to the set of real numbers. But a transforma-
tion between vector spaces, instead of simply sets of vectors, should be more
= r(2a 1 , +a2 , 4a3 ) Definition of scalar multiplication in "f/ 2
than simpJy a rule giving a vector in 1f/ for each vector in 1', 'it should also = rT(a 1 , a2 , a 3 ) Definition of T ·
preserve the vector space structure. Therefore, we begin by restricting our
attention to maps that send sets as vector spaces to vector spaces.
= rT(U).
Thus T is a linear map. However, T is not an isomorphism for there is not
Definition Let "f/ and 1f/ be vector spaces and T: "f/ -+ "'f/". Tis a linear a unique preimage for every vector in the codomain "f/ 2 • Given (a, b) E "f/2 ,
tfansformation ora linear map if for every U, V E i ' and rE R: there exists a vector (x, y, z) E i' 3 such that T(x, y, z) = (a, b) provided
l. T(U + V)= T(U) + T(V); and . 2x + y = a and 4z = b. Given a and b, this system of two equations in the
2. T(rU) = rT(U). three unknowns, x, y, z, cannot have a unique solution. Therefore, if a
vector (a, b) has one ·preimage, then it has many. For example, T(3, O, 1)
Thus, under a linear transformation, the image of the sum of vectors is = (6, 4)_ = T(2, 2, 1).
132 4. LINEAR TRANSFORMATIONS §1. Definitions and Examples 133

J.
Definition A transformation T: "/'--+ "/Y is one to one, written 1-1, if The map in Example 2 is not onto for the image does not contain (1, 6, 3), a
T(U) = T(V) implies U = V. vector in the codomain.
The second condition in our definition of an isomorphism requires that
A map from "/' to "/Y is one to one if each vector in the codomain "/Y has an isomorphism be both one to one and onto. lt is now possible to give a
at most one preimage in the domain "/'. Thus the map in Example 1 is not simple restatement of the definition.
1-1. The property of being one to one is inverse to the property of being a
function. For T: "/'--+ "/Y is a function provided T(U) = W, and T(U) = X
implies W = X. In fact, the function notation T(U) is used only when U has Definition An isomorphism from a vector space "/' to a vector space
a unique image. "/Y is a one to one, onto, linear map from "/' to '"IY.
A map may be linear and one to one and still not be an isomorphism.
An ·isomorphism is a very special transformation in that isomorphic
Examp!e 2 The map T: "/' 2 --+ "/' 3 defined by T(x, y) = (x, 3x -y, 2x) vector spaces are algebraically identical. But an arbitrary linear transforma-
is linear and one to one, but not an isomorphism. tion may fail to be either one to one or onto.
To show that T preserves addition, let U= (a 1 , a 2 ) and V= (b 1 , b2 ),
then Example 3 The map T: "/' 2 --+ "/' 2 defined by T(a, b) = (0, O) is
linear, but it is neither one to one nor onto.
T(U + V) = T(a 1 + h1 , a2 + h2 )
= (a 1 + b 1, 3[a 1 + b¡] - [a 2 + h2 ], 2[a 1 + b¡]) Example 4 Let
= (a 1, 3a 1 - a2 , 2a 1) + (b 1 , 3b 1 - b2 , 2b 1)
= T(a 1 , a2 ) + T(b 1, b2 ) = T(U) + T(V).
Then T is a linear map from R[t] to itself which is onto but not 1-1.
The proof that T preserves scalar multiplication is similar, thus T is linear.
The image of a polynomial PE R[t] under T is the derivative of p with
To see that T is 1-1, suppose T(U) = T(V). Then (a 1 , 3a 1 - a 2 , 2a 1) =
respect to t. One can prove directly that T is linear, but this is also implied
(b 1 , 3b- b2 , 2b 1) yields a 1 = b 1 , 3a1 - a2 = 3b 1 - h2 , and 2a 1 = 2b 1 .
by the theorems on differentiation obtained in elementary calculus. T maps
These equations have a 1 = b1 , a2 = b2 as the only solution, so U= V and T
R[t] onto R[t]; for given any polynomial b0 + b 1t + .. · + bmtm E R[t], the
is 1-1. polynomial
However, T is not an isomorphism beca use sorne vectors in "/' 3 have no
preimage under T. Given (a, b, e) E"/' 3, there exists a vector (x, y) E"/' 2 such
that T(x, y) = (a, b, e) provided x = a, 3x - y = b, and 2x = c. For any
choice of a, b, e, this is a system of three linear equations in two unknowns,
x and y, which need not be consistent. In this case, if 2a f= e, there is no ''.•

solution. For example, (1, 6, 3) has no preimage in "/' 2 under T. Therefore is sent to it by T. Thus, every polynomial has a preimage and T is an onto
T does not satisfy the definition of an isomorphism. map. But T fails to be one to one. (Why?)

Definition A transformation T: "/' --+ '"IY is onto if for every vector A direct sum decomposition of "/' given by "/' = Y E& ff determines
W E "/Y, there exists a vector V E"/' such that T(V) = W. two linear transformations; P 1 : "/'--+ Y and P2 : "/'--+ ff. These projection
maps are defined by P 1(V) = U and PiV) = W where V= U+ W, U E y
and W E ff. P 1 and P2 are well-defined maps because, in a direct sum, U
T maps "/' onto the codomain "/Y if every vector in "/Y is the image of ~nd W are uniquely determined by V. The proof that a projection is linear
sorne vector in"/'. That is, T is onto if its image or range equals its codomain. IS left to the reader.
134 4. LINEAR TRANSFORMATIONS §1. Definitions and Examples 135

Example 5 Let Y = {(x, y, z)ix +y+ z = O} and ff =


2{(2, 5, 7)}. Then Y + ff is direct and g 3 = Y Etl ff expresses g 3 as the
direct su m of a plan e and a line. Thus there are two projections P 1 : ~ 3 ~ Y (-sin 1/>, cos if>) _)-- (0, 1)
and P 2 : g 3 ~ ff. Given (7, 3, 4) E g 3 , one finds that (7, 3, 4) = (5, -2, - 3)
+ (2, 5, 7) with (5, -2, -3) E Y and (2, 5, 7) E ff. Therefore, P¡(7, 3, 4)

= (5, -2, -3) and P 2 (7, 3, 4) = (2, 5, 7). (cos if>, sin if>)

A projection map is much more general than the perpendicular projec-


tion found in Euclidean geometry. However, a perpendicular projection can
be expressed as a projection map. For example, if ff in Example 5 is replaced
~-
(1, O)
by Sf{(l, l, 1)}, the subspace (line) normal to the plane Y, then P 1 (V) is at
the foot of the perpendicular dropped from V to the plane Y. That is, P 1
is the perpendicular projection of g 3 onto Y.
. A projection map is necessarily onto; for given any vector U E Y, U
is in "Y and U = U + O, OE ff. Therefore P 1(U) = U, and every vector in Y Figure 1
has a preimage in "Y. On the other hand, if ff '# {0}, then P 1 fails to be
1 - 1. This is easily seen, for every vector in ::T is sent to the same vector
lengths or the relative position of points. Thus a rotation of g 2 about the
by P 1 , namely O. That is, if W E :Y, then W = O + W with OE Y and
origin is a linear map, and we can use this fact to obtain a componentwise
WEff, so P 1 (W) =O.
The linearity conditions have a good geometríc interpretatíon. Gener- expression for T. Suppose cp is the angle of rotation with positive angles
ally, a linear map sends parallel lines into parallellines. To see this, consider measured in the counterclockwise direction. Then T(1, O) = (cos cp, sin cp)
the linee with vector equation P = k U + V, k E R, for U, V E "Y", U ;6 O. and T(O, 1) = (-sin cp, cos cp) as in Figure l. Now using the linearity of T,
If U and V are víewed as points in Euclidean space, then e is the line through
V with direction U. Now suppose T: "Yn ~ "Yn is a linear map, then T(a, b) = T(a(I, O) + b(O, 1))
= aT(I, O) + bT(O, 1)
T(P) = T(kU + V)= T(kU) + T(V) = kT(U) +T(V) for all k E R.
= a(cos cp, sin cp) + b( -sin cp, cos cp)
Therefore if T(U) '# O, then Tsends e into the line through T(V) with direc- = (a cos cp - b sin cp, a sin cp + b cos cp).
tion T( U). Now any line parallel toe has direction U and is sent toa line with
direction T( U). What would happen to the parallel line given by P = k U + W,
From the geometric point ofview, a rotation in g 2 ~ust be 1-1 and onto, but
k E R, if T(V) = T(W)? What would happen toe if T(U) =O?
this could a1so be obtained from the componentwise definition.
We can use this geometric interpretation to show that a rotation of
the pl~ne about the origin must be linear. Certainly a rotation sends parallel
lines into parallel lines, but the converse of the above argument is false.
That is, a map may send parallellines into parallellines without being linear; Problems
considera translation. However, if T: g 2 ~ g 2 is a rotation about the origin,
then T(O) = O and T moves geometric figures without distortion. Thus the l. Determine which of the following transfoimations are linear.
a. T: "f"2 _, 1"2 ; T(a, b) = (3a- b, b2).
parallelogram determined by O, U, V and U + Vis sent to the parallelogram
determined by O, T(U), T(V), and T(U + V). So by the parallelogra,m rule
b. T: 1"3 _, 1"2; T(a, b, e)= (3a 2e, b):+
for addition T(U + V)= T(U) + T(V), and a rotation preserves addition. c. T:Jt2x2--R;T(~ ~)=det(~ t).
T also preserves scalar multiplication beca use a rotation m oves a line passing + bt + et 2 ) = (a ~ b)t 2 + 4et- b.
d.. T: R3[t] _, R3[t]; T(a
through the origin, {rUir E R}, toa line through the origin without changing e. T: 1"3 _, 1"4; T(a, b, e) = (a - 5b, 7, a, 0).
136 4. LINEAR TRANSFORMATIONS §2. The lmage a·nd Null S paces 137

f. T: e- R; T(a + bi) = a + b. 12. The reftection of E 2 in the line with equation y = x is given by T(x, y) = (y, x).
g. T: e- e; T(a + bi) = i - a.. Is this reftection linear? 1-1? onto?
2. Suppose T:"f"- ifí. Show that Tis linear ifand only if T(aU + bV) d:aT(U) 13. Suppose T: "f" - ifí is linear and there exists a nonzero vector U e "f" such
+ bT(V) for all a, be R and all U, Ve "f". that T( U) = O. Prove that T is not one to one.
3. How do you read "T: "f"- ifí?" 14. Suppose T: f - ifí is a linear map, U¡, ... , Uk E "f" and r¡, ... , rk E R.
4. Show that each of the following maps is linear. Find all preimages fot the given Use induction to prove that if k ;;::: 2, then
vector W in the. codomain. T(r¡U¡ + ··· + rkUk) = r¡T(U¡) + ··· + rkT(Uk).
a. T(a, b) = (2a - b, 3b - 4a); W = ( -1, 5).
b. T(a, b, e) = (a + b, 3a- e); W = (0, O).
15. Suppose T: f - "'fí is linear and {V1, ... , Vn} is a basis for f. Pro ve that the
c. T(a, b) =(a- b, a+ b, 2a); W = (2, 5, 3).
image or range of T is !l' {T(V¡), ... , T(Vn)}. That is, the image of T is a
d. T(a + bt + e/ 2 ) =(a+ b)t + (e- a)t 2 ; W =t.
subspace of the codomain ifí.
e. T(a + bt + et 2 ) = b + 2et; W = t.
f. T(a + bt + et 2 ) =a + b +(b + e)t + (e- a)t 2 ; W = 3 - 3t 2 •
5. State what it would mean for each of the following maps to be onto.
a. T:"f" 3 -"f" 4 • b. T:vft2x3-vft2x2· c. T:R3[t]-R2[t]. §2. The lmage and Null Spaces
6. Show that each of the following maps is linear. Determine if the maps are one
to one or onto. The requirement that a transformation be linear and thus preserve the
a. T: vft2x3- vft2x2; r(~ ! /) = (d ~e b fe). algebraic structure of a vector space is quite restrictive. In particular, the
image of the zero vector under a linear map must be the zero vector. This
b. T: e- e; +
T(a bi) = O.
generally involves two different zero vectors. That is, if T: "//' ---+ i/1' is linear
c. T: 1'" 2 - "1'" 3 ; T(a, b) = (2a + b, 3b, b- 4a).
and 0,.-, 01¡r are the zero vectors in "f/' and "'Y, respectively, then T(O,.-) = Oif'.
d. T: R 3[t]- R 3[t]; T(a + bt + e/ 2 ) =a + b + (b + e)t + (a + e)t 2.
e. T: 1'" 3 - "1'" 3 ; T(a, b, e)= (2a + 4b, 2a + 3e, 4b- 3e). For, using any vector V e"//', we obtain T(O,.-) = T(OV) = OT(V) = 0 1¡r. The
f. T: R[t]- R[t]; T(ao + a¡t + · · · + ant") use of ditferent symbols for the zero vectors is helpful here but not necessary.
1 1 Knowing that T is a map from "//' to "'Y is sufficient to determine what O
= aot + -a
2 1 t + · · · +n-+- 1an t"+ •
2 1
means in T(O) = O.
7. Find the following images for the projection maps in Example 5. The requirement that a map preserve the vector space structure also
a. P 1 (18,2,8). b. Pi18,2,8). c. P¡(-1,-1,-12). guarantees that the image of a vector space is again a vector space. In general,
d. P 2 (0, O, 14). e. P 1 (x, y, -x-y). f. P2(x, y, -x-y). if T is a map and the set S is contained in its domain, then the set T[S]
8. a. Show that a projection map is a linear transformation. = {T(U)i U E S} is called the image of S. The claim is that if S is a vector
Suppose "f" = [!' + !!T but the sum is not direct. Wli.y isn't there a space, then sois the image T[S].
b.
projection map from "f" to [!'? ____,

9. a. What map defines the perpendicular projection of E 2 onto the x axis? Example 1 Define T: "f/' 2 ---+ "f/' 3 by
b. What map defines the perpendicular projection of E 2 onto the line with
equation y = 3x? T(a, b) = (2a - b, 3b - a, a + b).
10. a. Find the map T that rotates the plane through the angle rp = 60°. What
is T(!, O)? . If Y = 2' {(1, 2)}, then
b. Use the componentwise expression for a rotation to show that any
rotation of the plane about the origin is linear. T[ff] = { T( U)i U E S"} = { T(k, 2k)jk E R}
c. Prove that a rotation about the origin is one to one and onto.
11. A translation in the plane is given by T(x, y) = (x + h, y + k) for sorne
= {(2k - 2k, 3(2k) - k, k + 2k)lk E R}
h, k E R. Is a translation linear? 1-1? onto? = 2'{(0, 5, 3)}.
4. LINEAR TRANSFORMATIONS §2. The lmage and Null Spaces 139
138

Note that, in this case, the image of the subspace 8' of "Y 2 is a subspace of Examp/e 3 Consider the linear map T: "Y 2 -t "Y 2 given by T(a, b)
1'"3·
= (2a - b, 3b - 6a).
T does not map "Y 2 onto "Y 2 because
Theorem 4.1 If T: "Y -t 1/í is linear and 8' is a subspace of "Y, then
fr = {T(a, b)la, bE R} = {(2a - b, 3b- 6a)la, bE R}
the image of 8', T[Y'], is a subspace of 1/í.
= g-'{(2, -6), (-1, 3)} = 2'{(1, -3)} =1 "f/ 2 •
Proof It is necessary to show that T[Y'] is nonempty and closed
The image of T can be viewed as the Iine in E 2 with Cartesian equation
under addition and scalar multiplication. Since 8' is a subspace· of "Y, O E 8' y = - 3x. Each point on this Iine has an en tire line of preimages. To see this,
and T(O) = O because T is linear. Thus O E T[Y'] and T[Y'] is nonempty. To
consider an arbitrary point W = (k, - 3k) in the image. The set of all pre-
complete the proof it is sufficient to show that if U, V E T[.9"] and r, sE R,
images for W is
then rU + sV E T[Y']. (Why is this sufficient?) But U, V E T[Y'] implies that
there exist W, X E 8' such that T(W) = U and T(X) = V. And 8' is closed
{VE "Y 2 IT(V) = W} = {(x, y)IT(x, y)= (k, -3k)}
under addition and scalar multiplication, so r W + sX E 8'. Therefore
T(rW + sX) E T[Y']. Now T is linear, so T(rW + sX) = rT(W) + sT(X) = {(x, y)l2x -y = k}.
= rU + sV completing the proof.
Therefore the en tire line with equation y = 2x ..:.. k is sent by T to (k, - 3k).
In particular, this theorem shows that the image of T, T["Y], is a subspace If &k denotes this line and Pk = (k, - 3k), then T[&d = Pk. Every point of
cf the codomain 1/í. Thus it might be called the image space of T. We will the domain is on one of the· !in es &k, so the effect of T on the plane is to col-
denote the image space of T by f T· Using this notation, T: "Y -t 1f/ is onto lapse each line &k to the point Pk. This is illustrated in Figure 2. The line &0
if and only if fr = 1/í.
A map fails to be onto because of the choice of its codomain. If T: y
·r - t 1fí is not onto, the codomain might be changed to § r· Then T: "Y -t § r
is onto; or any transformation maps its domain onto its image. Hence if y
T: "Y -t 1fí is linear and one to one, then T: "Y - t § r is an isomorphism.

Example 2 Define T: "Y 2 -t "Y 3 by

T(a, b) = (2a + b, b- a, 3a + b).


Find the image space of T.

fr = {T(a, b)la, bE R}
= {(2a + b, b - a, 3a + 4b)la, bE R}
= g-'{(2, -1, 3), (1, 1, 4)}
= g-'{(1, o, 7/3), (0, 1, 5/3)}
= {(x, y, z)i7x + 5y - 3z = 0}.
1t can be shown that Tis linear and 1-1. Thus the fact that fr is represented
by a plan e agrees with the fact that § r is isomorpl:ílé to "Y 2 • Figure 2
140 4. LINEAR TRANSFORMATIONS §2. The lmage and ,N ull S:paces 141

passes through the origin and thus represents a subspace of the domain. This which has coefficient matrix
subspace determines much of the nature of T for every line parallel to it is

~
sent to a point. This idea generalizes to all linear maps. That is, knowing -1
what vectors are sent to O, tells much about a linear transformation. A= ( 1
The following theorem is easily proved and left to the reader. -1 o
If T: .Y --+ "'f/ is linear, then {V E "YI T(V) = O} is a sub- The rank of A .is 2, therefore there are non trivial solutions and .%r # {0}.
Theorem 4.2
So T is not one to one.
space of"Y.
A second source of information about a map is provided by the dimen-
Definition The null space .% r of a linear map T: .Y --+ 1Y is the set sion of its null space.
of all preimages for O;¡r. That is, .% r = {V E "YI T( V) = O}.

The null space is often ca Jled the kernel of the map. Theorem 4.4 If T: .Y --+ 1f/' is linear and .Y is finite dimensional, then
In Example 3, the null space of T is represented by the line 11 0 , and
every Iine parallel to the null space is sent to a single point. Example 5 and dim "f/' = dim.% r + dim .Fr.
problem 13 at the end of this section examine similar situations in 3-space.
As indicated, the null space provides information about a linear map. Proof Dimension is the number of vectors in a basis, therefore we
The simplest result is given by the next theorem. must choose bases for the three vector spaces. The best place to start is with
.% r. for then a basis for .Y can be obtained by extension. Therefore, suppose
Theorem 4.3 Suppose T: .Y --+ "'f/ is linear. T is one to one if and only {V¡, ... , Vk} is a basis for .% r· Then dim .% r =k. If.% r = {0}, then
if.% r = {0}. That is, T is 1-1 if and only if 0-¡r is the only vector that is k = O and this basis is empty. Since the dimension of .Y is finite, there exists
mapped to 0-nr by T. a basis V¡, ... , Vk> Vk+l• ... , v. for .Y. Then dim "f/' = n and we need a
basis for .Fr· If W E .FT• there exists a vector U E .Y such that T(U) = W,
Proof (=>) Suppose T is 1-l. We must show that {O} e.% r and and there exist scalars a 1 , ••• , a. such that U= a 1 V1 + · · · + a.Vw By
.% r e {0}. The first containment holds because T(O) = Ofor any linear map. problem 14, page 137,
So suppose U E.% T• that is, T(U) = O. Then since Tis 1-1, T(O) = O = T(U)
implies O = U and .% r e {0}. Thus .% r = {0}. W = T(U) = a 1 T(V1 ) + · · · + akT(Vk) + ak+ 1 T(Vk+ 1) + · · · + a.T(V.)
(<=) Suppose A'r = {0}. If T(U) = T(V) for sorne U, VE "Y, then = ak+ 1 T(Vk+ 1) + · · · + a.T(V.).
O= T(U)- T(V) = T(U- V). That is, U- VEA'r and U- V= O.
Therefore U= Vand T is 1-l. Therefore the set B = {T(Vk+ 1 ), ••• , T(V.)} spans .F r· To show that B is
linearly independent, suppose
Example 4 Determine if the map T: R 3 [t] --+ R 3 [t] is one to one if
rk+ 1 T(Vk+ 1) + · · · + r.T(V.) = O.
T(a + bt + ct 2 ) = a - b + (b - c)t + (e - a)t
2

Then
If a + bt + ct 2 E.% T• then T(a + bt + ct 2 ) = O. This yields a system
of three homogeneous linear equations in three unknowns:

a-b=O andrk+lVk+I + · · · + r.V.E.A'r. Thereforethereexistscalarsr 1, .•• , rkER


b-c=O such that

c-a=O
142 4. LINEAR TRANSFORMATIONS §2. The lmage and Null Spaces 143

(Why?) But {V 1 , ••• , V"} is Jinearly independent as a basis for 1"", so Severa! conclusions follow at once from Theorem 4.4 using the fact that
rk+t = · · · = r. =O and Bis a basis for .fy. Now dim %y+ dim . .fy the dimension of §y cannot exceed the dimension of the codomain.
= k + (n - k) = n = dim 1"" as claimed.
Corollary 4.5 Suppose T: 1""--+ 11' is linear and 1"" is finite dimensional,
Definition The dimension of the null space of T,% n is the nullity of then
T; and the dimension of the image space § r is the rank of T.
l. dim .fy::; dim 1"".
Thus, Theorem 4.4 states that the nullity of T plus the rank of T equals 2. T is 1-1 if and only if dim .fr = dim 1"".
.:he dimension of the doma'nt-of T. 3. T is onto if and only if dim § r = di m 11/' .
4. If dim 1"" > dim 11/', then T is not 1-1.
5. If dim 1"" < dim 11/', then T is not onto.
Example 5 Find the rank and nullity of the map T: 1"" 3 --+ 1"" 3 given 6. If 1"" = 11', then T is 1-1 if and only if T is onto.
by
T(a, b, e) = (2a - b + 3e, a - b + e, 3a - 4b + 2e). This corollary provides a simple proof that 1""" is not isomorphic to
when n # m, for any linear map from -r. to "f'"m is either not 1-1 when
"f'"m

The vector (x, y, z) is in% y if n > m, or it is not onto when n < m.


The result of problem 14, page 137, which was used in the proof of
2x- y+ 3z =O Theorem 4.4, is important enough to be stated as a theorem.

x-y+z=O
Theorem 4.6 Suppose T: 1"" --+ 11' is linear and {V 1 , ••• , Vn} is a
3x - 4y + 2z = O. basis for 1"". If V= a 1 V 1 + · · · + a" Vm then T(V) = a 1 T(V1 ) + · · · +
anT(Vn).
This hom~geneous system is equivalent to x + 2z = O, y + z = O. Therefore
.Aír = {(x, y, z)lx + 2z =O, y+ z =O}= 2'{(2, 1, -1)}. This states that a linear map is completely determined by its action on
a basis. If the images of the basis vectors are known, then the image of any
So the nullity of T is 1, and by Theorem 4.4, the rank of T is 2 = 3 - 1 other vector can be found. Again we see how restrictive the linearity require-
= dim 1""3 - dim %y. ment is. Compare with the functions studied in calculus for which even a
If T is thought of as a map sending points in 3-space to points in 3-space, large number of functional values need not determine the function.
then% y is a line g with equation P = t(2, 1, -1), tE R. This Iine is sent to The fact that a linear map is determined by its action on a basis means
the origin by T. Moreover, as in Example 3, every Iine parallel to t is that a linear map can be obtained by arbitrarily assigning images to the
sent to a single point. For a line paraiiel to t has the vector equation P vectors in a basis. Suppose B = {V1, ••• , V.} is a basis for 1"" and W1 , ••• , W.
t(2, 1, -1) + V for so me vector V E 1"" 3 : and are any n vectors from a vector space 11'. Then set T{V1) = W 1 , • •• , T(V.)
= W" and define T on the rest of 1"" by
T(P) = tT(2, 1, -1) + T(V) = O+ T(V) = T(V).
Further since the rank of this map is 2, we know that § r is a plane T(V) = a 1 W1 + ·· · + a"W" if V= a 1 V 1 + ··· + a"V".
through the origin. In fact,
It is not difficult to show that T is a linear map from 1"" to 11' which sends
.fr = {(2a- b + 3e, a-+ e, 3a- 4b + 2e)la, b, e E R}
b V¡ to W¡, 1 ::; i::; n .

= {(x,y, z)E1"" 3Ix- 5y + z = 0}.


Exainple 6 Given the basis {(<Z; 6), (0, !)} for 1""2, define a linear
(Be sure to check this computation.) map T: 1"" 2 --+ 1""3 which sends (2, 6) to (8, 10, 4) and (0, 1) to (- 1, 4, 2).
144 4. LINEAR TRANSFORMATIONS §2. The lmage and Null Spaces 145

Since (a, b) = ta(2, 6) + (b - 3a](O, 1), we define T by 3. Find T[9'] when


a. T(a, b, e)= (2a + b, 3b, 4b- e); !fl = 2'{(-1, 3, 5), (1, O, 4)}..
T(a, b) = ta(8, 10, 4) + [b - 3a]( -1, 4, 2)
b. Tas in parta and !fl = {(x, y, z)iy = x z}. +
c. T(a, b, e)= (a+ b-e, 2e- b); !fl = .P{(-1, 2, 1)}.
= (1a - b, 4b - 7a, 2b - 4a). d. Tas in parte and !fl = {(x, y, z)iz =O}.
4. Determine if T is 1-1 by examining the null space Y r·
Then T is seen to be a linear map from f 2 to f 3 and a. T(a + bt + e/ 2 ) = 2a + b +(a+ b + e)t 2 •
b. T(a + bt + e/ 2 ) =e+ (a- b)t + at 2 •
T(2, 6) = (14 - 6, 24 - 14, 12 - 8) = (8, 10, 4); c. T(a, b, e, d) = (e -· 3d, O, 2a +
b, e, O, e d).+
d. T(a + bt) =a+ 2at +(a- b)t 2 + bt 3 •
T(O, 1) = (-1, 4, 2) e. T(a, b, e) = (a + 2b + e, 3a - b + 2e).
5. How could one determine that the maps in parts a and e of problem 4 are not
as desired.
1-1 by inspection?

We will frequently want to find a linear map that has a certain effect on 6. Suppose 1' = !fl 0 ff and P: 1 ' - ff is a projection map. Find § P and .#'p.
the vectors of a basis. In such cases we will define the map on the basis and 7. Suppose T: 1 ' - if" is linear and S e if". r- 1 [S] (read "the complete in verse
"extend it Iinearly" to the entire vector space by the foregoing procedure. In image of S") is the set of all preimages for vectors in S, that is, r- 1 [S] =
general, if S e f and T: S~ 'iris given, then to extend T linear/y to f is {VEfiT(V) E S}.
to define Ton all off in such a way that T: f ~ "'fí is linear, and for each a. What is r- 1[{0}]?
U E S, T(U) has the given value. However, if S is not abasis for "!', then it b. Show that if !fl is a subspace of if", then r- 1[9'] is a subspace off.
may not be possible to extent T linearly to f ; see problem 10 at the end of c. What condition must be satisfied by r-
1
[ { W}] for any W E if" if T is to

map 1' onto if/"?


this section.
d. What condition must be satisfied by r- 1[ { W}] for any W E if" if T is to
lf T: f ~ 'ir is the extension of T: S ~ "'fí, then T: S ~ 'ir is the
be one to one?
"restriction of T toS." We will most often wish to restrict a map to a sub-
space of the domain: lf T: f ~'ir and !/ is a subspace of "!', then the 8. B = {(4, 3), (2, 2)} is a basis for 1' 2 • Define Ton B by T(4, 3) = (7, 5) and
T(2, 2) = (2, 2). Extend T linearly to 1' 2 and find T(a, b).
restriction of T to !/ is the map Ty: !/~'ir defined by Ty(U) = T(U) for
all U E!/. Since the domain of the restriction map T fl' is a vector space, if 9. Set T(4, 1) = (2, 1, -1), T(3, 1) = (4, -2, 1), and extend Tlinearly toa map
Tis linear, then sois Ty. from 1' 2 to 1'3 • What is T(a, b)?
10. a. If we set T(2, 4) = (6, -1, 3) and T(3, 6) = (3, O, 1), can T be extended
linear! y to a linear transformation from 1' 2 to 1' 3 ?
Problems b. If T(2, 4) = (2, O, 6) and T(3, 6) = (3, O, 9), can T be extended to a linear
transformation from 1' 2 to 1' 3 ? ..,.,_,

l. Find .Y r and Y r for the following linear maps: 11. Show that T: 1' z- 1' 2 is linear if and only if there exist scalars a, b, e, d such
a. T: 1' 3 ~ 1' 2 ; T(a, b, e) = (3a - 2e, b + e). that T(x, y) = (ax + by, ex + dy).
b. T: 1' 2 -1' 3 ; T(a, b) =(a- 4b, a+ 2b, 2a- b). 12. Prove that if T:f, -fm and the ith componen! of T(x 1, ... , x.) is ofthe
c. T: C - C; T(a + bi) = a + b. form a¡¡X1 + a;2X2 + · · · + a,.x. for sorne scalars a1 ., ••• , a 1., then T is
2
d. T: R 3 [t]- R 3 [t]; T(a + bt + et 2 ) =a+ 2e + (b- e)t +(a+ 2b)t • linear.
2. Let T: 1' 2 - 1' 2 be defined by T(a, b) = (4a - 2b, 3b - 6a) and find the 13. Define T: 1' 3 - 1' 3 by T(a, b, e) = (4a + 2b - 6e, O, 6a + 3b - 9c).
images of the following lines under T: a. Show that X r is represented by a plane through the origin.
a. {t(l, 5)lt E R}. d. {t(2, 1) + (3, 5)lt E R}. b. Show that every plane parallel to Y r is sent by T to a single point.
b. {t(l, 2)lt E R}. e. {1(5, 10) + (1, 4)lt E R}. c. Let !fl be the subspace normal to Y T· Show that the restriction map T!l'
c. {t(l, 2) + (3, 5)lt E R}. is 1-1 and the image space of T y is the image space of T.
f. Under what conditions is one of these Iines sent to a single point? d. Could !fl in part e be replaced by any other subspace?
146 4. LINEAR TRANSFORMATIONS §3. Algebra of linear Transformations 147

14. Suppose "Y is a finite-dimensional vector space and T: "Y - 1lí is linear but Example 1 Let S, TE Hom(1'" 2 , 1/ 3 ) be given by
not 1-1. Find a subspace fl' of "Y such that Ty is 1-1 and the image of the
restriction T y equals the image of T. Is there a unique choice for Y? ·
S(a, b) = (a - b, 2a, 3b - a) and T(a, b) = (b - a, b, a - b).
15. Suppose T: "Y- 1lí is linear and {U., ... , Un} is a basis for "Y. Show that the
vectors T( U 1 ), ••• , T( U,) span the image space ~ T· Then S + T is given by

[S+ T](a, b) = S(a, b) + T(a, b) = (O, 2a + b, 2b)

§3. Algebra of Linear Transformations and

Recall the construction of the vector space ff, in which vectors are con-
[rT](a, b) = (rb - ra, rb, ra - rb).
tinuous functions from [0, 1] to R. The algebraic structure of ff was defined
using addition and multiplication in the codomain of the functions, R. Example 2 Let S, TE Hom(R 3 [t]) be defined by
Further, the proof that ff is a vector space relied on the fact that R is itself a
vector space. This idea can easily be generalized to other collections of func- S(a + bt + ct 1 ) = (a + b)t + at 1 and T(a + bt + ct 2 ) = (e - a)t 1 - bt.
tions with images in a vector space. In particular, the algebraic structure of
a vector space can be introduced in the set of all linear transformations from Then
one vector space to another.
A linear transformation preserves the algebraic structure of a vector [S + TJ(a + bt + ct 2 ) = at + ct 1 •
space and thus belongs to a broad class of functions between algebraic
systems which preserve the structure of the system. Such functions are called For example,
homomorphisms. A map between the field of real numbers and the field of
complex numbers would be a homomorphism if it preserved the field struc-
ture. Since a linear transformation is an example of a homomorphism, the
set of alllinear transformations from 1/ to "/fí is often denoted Hom("f/, "'fí). The maps 7 S and 4T have the following effect on the vector 3 - 2t + t2:
[7S](3 - 2t + t 2 ) = 7[S(3 - 2t + t 2 )] = 7[t + 3t 2 J = 7t + 2lt 2
Definition If 1/ and "/11" are vector spaces, then
and
Hom("f/, "/fí) = {TITis a linear map from 1/ to "/fí}.

When 1/ = "'fí, Hom("f/, "//) will be written as Hom("f/).


Theorem 4.7 The set of linear transformations Hom('f", 111") together
Note that the statement TE Hom("f/, "/fí) is simply a short way to write with addition and scalar multiplication as defined above is a vector space.
"T is a linear transformation from 1/ to "/fí."
To give Hom('f", "/fí) the structure of a vector space, we must define
Proof It is necessary to prove that all nine conditions in the de-
operations of addition and scalar multiplication:
finition of a vector space hold, these are Iisted on page 32.
For S, TE Hom('f", "/fí) define S+ T by [S+ T](V) = S(V) + T(V)
Hom("f/, "'fí) is closed under addition if, for any two maps S, TE
for all V E 1'". For TE Hom(1'", "'fí) and r E R, define rT by [rT]( V) = r[T( V)J
Hom('f", 111"), S + T: 1'"-> "'fí and S + T is linear. S + T is defined for
forall Ver.
each V E 1' and since "/fí is closed under addition, the codomain of S + T is
148 4. LINEAR TRANSFORMATIONS §3. Algebra of Linear Transformations 149

ifí, therefore S + T: "Y--+ ifí. To show that S + T is linear, suppose U, that-r(S + T) = rS + rT forrE R and S, TE Hom("Y, ifí), it is necessary
V E "Y and a, b E R, then to show that [r(S + T)](V) = [rS + rT](V) for any V E "Y:

[S+ T](aU + bV) = S(aU + bV) + T(aU + bV) Definition of + in [r(S + T)](V) = r[(S + T)(V)] Definition of scalar multiplication in
Hom("Y, ifí) Hom("Y, "'fí)

= [a[S(U)] + b[S(V)]] + [a[T(U)] + b[T(V)]] S, TE Hom("Y, 1/í) = r[S(V) + T(V)] Definition of addition in Hom("Y, "'fí)

= [a[S(U)] + a[T(U)]J + [b[S(V)J + b[T(V)]] Commutativity and = r[S(V)] + r[T(V)] Distributive law in ifí
associativity of + in
= [rS](V) + [rT](V) Definition of scalar multiplication in
ifí
Hom{"Y, "'fí)
= a[S(U) + T(U)] + b[S(V) + T(V)J Distributive law in ifí
= [rS + rT](V) Definition of addition in Hom("Y, 1Y).
= a[[S + T](U)J + b[[S + T](V)] Definition of + in
Hom("Y, ifí). Therefore the maps r(S + T) and rS + rT yield the same image for each
vector V and so are equal.
Therefore S + T is linear and S + TE Hom("Y, ifí) . .
To prove that addition is commutative, that is, S + T = T + S for all
With Theorem 4.7 we have a new class of vector spaces and the term
S, TE Hom("Y, 1/í), it is necessary to show that [S + T](V) = [T + S]( V) for
vector applies toa linear transformation. Moreover, any definition or theorem
all V E "Y. That is, two maps are equal if and only if they ha ve the same effect
about abstract vector spaces now applies to a space of linear transformations,
on every vector. But Hom("Y, ifí).

[S + T](V) = S(V) + T(V) = T(V) + S(V) = [T + S](V).


Examp/e 3 Determine if the vectors T 1 , T 2 , T3 E Hom("Y 2 ) are
(At what point in this chain of equalities is the commutivity of addition in "/f/ linearly independent when T 1(a, b) = (2a, a - b), Tz(a, b) = (4a + b, a),
used ?) Therefore addition is commutative in Hom("Y, "/Y). The proof that and T 3 (a, b) = (b, 3a).
addition is associative in Hom("Y, ifí) is similar. Consider S = x 1 T 1 + x 2 T 2 + x 3 T 3 , a linear combination of these
Define the zero map 0: "Y--+ ifí by O(V) = 01,- for all V E "Y. The map O vectors. The vectors are linearly independent if S = O implies x 1 = x 2 = x 3
is easily seen to be linear and thus a member of Hom("Y, "/fí). Further = O. If S = O, then S(a, b) = O(a, b) = (0, O) for all (a, b) in "Y 2 • But S is
determined by what it does toa basis. Using {E¡} we have
[T + O](V) = T(V) + O( V) = T(V) + O = T(V) for all V E "Y.
-,., S(I, O) = (2x 1 + 4x 2 , x 1 + x 2 + 3x3 ) = (0, O)
Therefore T + O = T for any TE Hom("Y, "/Y), andO is the additive identity
for Hom("Y, 1/í). Notice that, in general, there are four different zeros as- and
sociated with Hom("Y, "/Y); the number zero, the zero vectors in "Y and 1fí,
and the zero map in Hom("Y, ifí). S(O, 1) = (x 2 + x3 , -x) = (0, 0).
Por the final property of addition, define - T by [- T](V) = - [T(V)]
for all V E "Y.It is easily shown that if TE Hom("Y, 1/í), then -TE Hom("Y, "/fí) These vector equations yield the homogeneous system
and T + (- T) = O. Therefore every element has an additive inverse in
Hom("Y, 1/í). 2x 1 + 4x 2 =O, x1 + x 2 + 3x3 = O, x2 + x3 =O, -X¡= 0,
To complete the proof, it is necessary to prove that all the properties
for scalar multiplication hold in Hom("Y, "'fí). Only one of the distributive which has only the trivial solution. Thus the set {T1 , T2 , T3 } is linear! y
laws will be proved here with the other properties left to the reader. To prove independent in Hom("Y 2 ).
150 4. LINEAR TRANSFORMATIONS §3. Algebra of Linear Transformations 151

Why does Example 3 show that the dimension of the vector space vector in {E;} and it sends the second vector in {E;} to zero. Extending this
Hom(1" 2) is at Ieast 3? map linearly to 1" 2 gives

Tu(a, b) = aT12 (1, O) + bT1i0, 1) = a(O, 1) + bO = (0, a).


Example 4 Consider the three vectors in Hom('i" 2 , 1" 3) given by
T 1 (a, b) =(a- b, a, 2a- 3b), Tia, b) = (3a- b, b, a+ b), and T 3 (a, b)
Yo u might check that the other maps are justas simple with T 11 (a, b) = (a, O),
= (a + b, ó - 2a, 7b - 3a). Is the set {T1 , T 2 , T3 } linearly independent?
T 21 (a, b) = (b, 0), and T 22 (a, b) = (0, b).
Suppose S= x 1 T 1 + x 2 T 2 + x 3 T3 =O. Then S(l, O) = S(O, 1)
The claim is that B = {T11 , TJi, T21 , T22 } is a basis for Hom(1" 2 ).
= (0, O, O) yields two equations in 1" 3 :
The proof that Bis linear! y independent follows the pattern set in Examples 3
and 4. To show that B spans Hom('i" 2 ), suppose TE Hom('i" 2 ). Then there
exist scalars aii such that T(l, O)= (a 11 , a 12 ) and T(O, 1) = (aw a 22 ).
(Since T is determined by its action on a basis, these scalars completely
and
determine T.) Now for any vector (x, y):

T(x, y) = xT(I, O) + yT(O, 1)


Setting components equal, we obtain a homogeneous system of six linear = (xa 11 + ya 21 , xa 12 + ya22 )
equations in three unknowns. But the rank of the coefficient matrix is 2, so
= a 11 (x, O)+ a 12 (0, x) + a 21 (y, O)+ a2 i0, y)
there are nontrivial solutions. That is, S = O does not imply that x 1 = x 2
= x 3 = O, and the set {T1, T2 , T3 } is linearly dependent. = a 11 T 11 (X, y)+ a 12 T 12 (x, y)+ a21 T 21 (x, y)+ a22 T 2 ix, y)

If one looks to the Iast two examples for general ideas, the first point to = [a 11 T11 + a 12 T12 + a 21 T21 + a 22 T 22 ](x, y).
notice· is that again the problem of determining linear dependence or inde-
pendence Ieads to a homogeneous system of linear equations. Second, the Therefore
number of homogeneous equations obtained, 4 and 6, is the product of the
dimension of the domain and the dimension of the codomain; 4 = 2 · 2 and
6 = 2·3. In general, ifthe maps come from Hom('f"., 'i"m), then each ofthe
and B spans Hom('i" 2 ). Thus Bis a basis for Hom('i" 2 ) and the dimension
n vectors in a basis for 1"n yields a vector equation in 1"m• and the m com-
is 4.
ponents in each vector equation yield m linear equations. Si·nce a system of
In showing that B is a basis for Hom('i" 2 ), we have discovered how to
nm homogeneous linear equations has a nontrivial solution if there are more
find the coordinates ofa map with respect to B. In fact, T: (a 11 , a 12 , a 21 , a 22 ) 8 •
than nm unknowns, we can conclude that dim Hom('f"., 1"m) :::; nm. We
Hence if T(x, y) = (3x - 4y, 2x), then T(l, O) = (3, 2) and T(O, 1) = ( -4, O),
could prove that the dimension of Hom('i" 2 ) is 4 by extending the set in
so T: (3, 2, -4, O)a. In the next section we will essentially arrange these co-
Example 3 to a basis. But it will be more informative to construct a standard
ordinates into a matrix to obtain a matrix representation for T.
basis for Hom('i" 2 ) from the standard basis for 1" 2 .
The technique for constructing a basis for Hom('i" 2 ) can easily be
generalized.
Example 5 Show that dim Hom('i" 2) = 4.
If we follow the pattern set in 1""' R[t], and J!t n x m of finding a simple
Theorem 4.8 If dim 1" = n and dim 1fí =m, then dim Hom('i", "'fí) =
basis, then we should loo k for a collection of simple nonzero maps in Hom(1" 2).
nm.
Using the standard basis {E1} for 1" 2 , we can obtain a map by choosing
images for E 1 and E 2 , and then extending linearly. The simplest choice is to
send one vector to zero and the other to either E 1 or E 2 • This yields four Proof 1t is necessary to find a basis for Hom('i", "'fí). This means
maps TIJ: 1" 2 ~ 1"2 with Tii(E¡) = Ei and Tcq(Ek) = (0, O) for k =f. i; defining nm linear maps from 1" to "'fí. Since all we know about 1" and 1fí
i,j = 1, 2. For example, T12 sends the first basis vector in {E;} to the second is their dimensions, the natural place to start is with bases for 1" and "'fí.
152 4. LINEAR TRANSFORMATIONS §3. Algebra qf Linear Tranyormations 153

So suppose {V 1 , ••• , V.} and {W 1 , • •• , Wm} are bases for "Y and "f//, Example 6 Suppose B = {T11 , T 12 , T 13 , T 21 , T 22 , T23 } is the basis
respectively. For each i, 1 :<; i :<; n, and eachj, 1 :<; j :<; m, we obtain a map for Hom("f/' 2 , "f/' 3 ) obtained using the bases B 1 = {(3, 1), (1, O)} for "f/' 2 and
Tu as follows: Let Tu(V;) = Wi, Tu(Vk) = O if k #- i, and extend Tu to a B2 = {(!, 1, 1), (l, 1, 0), (1, O, O)} for "f/' 3 • Find Tw T 13 and the coordinates
linear map from "Y to "ff/. This gives us nm vectors in Hom("Y, "f//), and it is of T with respect to B, given that T(x, y) = (3x - 2y, Sy, 4y - x).
only necessary to show that they are linearly independent and that they span For T21 , we know that T21 (3, 1) = (0, O, O) and T21 (1, O) = (1, 1, 1).
Hom("Y, "f//). Since (a, b): (b, a - 3b)a,, T21 is defined by
· Suppose a linear combination of these vectors is the zero map, say
T21 (a, b) = b(O, O, O) + [a- 3b](l, 1, 1) =(a - 3b, a- 3b, a- 3b).
m n
S= Y¡¡Tu + Y¡zT!2 + ... + '•mTnm = L L ruTu =O.
i=l j=l
For example, T 21 (15, 3) = (6, 6, 6). For T 13 ,

Then S(Vh) = O(Vh) = 011'" for each basis vector Vh. Therefore,
TJa, b) = bT13(3, 1) + [a - 3b]T13 (1, O)
= b(l, O, O) + (0, O, O)
= (b, o, 0).
m Therefore Tn(15, 3) = (3, O, 0).
= L [r!jT j(Vh) + r jT /Vh) + · · · + rhiTh/Vh) + · · · + '•iT.j(Vh)]
j=l
1 2 2 To find the coordinates of T with respect to B, notice that
m
T(3, 1) = (7, 5, l)
= L [O + o + ...
j=l
+ o + rhjwj + o + ... + O]
= l(l, l, l) + 4(1, + 2(1, O, O)
l, O)
= IT11 (3, l) + 4Tti3, l) + 2T13 (3, 1)

The W's form a basis for 1f/, so for each integer h, 1 :<; h :<; n, rh 1 = rh 2 = [IT11 + 4T12 + 2T13 ](3, 1).
= · · · = rhm = O and the maps Tii are linearly independent.
To prove that these maps span Hom("Y, "'f/), suppose TE Hom("Y, 1fí). So if T: (a 11 , a12 , a 13 , a 21 , a22 , a23 )a, then a 11 = l, a12 = 4, and a 13 = 2.
Since T(V;) E "ff/ and {W1, ... , W.,} spans ifí, there exist scalars a¡i such F or the second vector in B 1 , T(l, O) = (3, O, - l): (- l, l, 3)a,, therefore the
that T(V¡) = a¡¡ W 1 + a¡ 2 W 2 + · · · + a¡m Wm, for 1 :<; i :<; n. Now T and coordinates of T with respect to B are given by T: (l, 4, 2, -1, l, 3)a.
the linear combination Li= LJ=1 1 aiiTu have the same effect on the basis
{V1 , . . . , V.}, for if 1 :.:; h:.:; n, then In the process of obtaining the basis {Tu} for Hom("Y, "f//) we have
found how to obtain the coordinates a¡i of a map with respect to {Tu}. The
double index notation was natural since two bases were used, but it also
strongly suggests the matrix (a;). In the next section we will see how to as-
sociate the transpose of this matrix with the map.
m
:L ahiwi
= j=l
Problems

l. Suppose Tt. T2. TJ E Hom('i'" 2 , 1'" 3) are given by T 1 (a, b) = (3a- b, 2a + 4b,
Therefore T = Li'= Li=
1 1 aiiTii and the maps Tu span Hom("Y, if/). a- 8b}, Tz(a, b) =(a+ b, b, a- 4b), and T3 (a, b) =(a+ Sb, -2a,
Thus the nm maps Tu form a basis for Hom("Y, 1f/) and the dimemsion 3a- 8b). Find the following vectors:
of Hom("Y, 1fí) is nm. a. Tt + T2. b. STt. c. T3- 3T2 • d. T1- 4T2 + T3 •
154 4. LINEAR TRANSFORMATJONS §4. Matrices of a Linear Transformation 155

2. Determine if the following sets of vectors are linearly independent: Find the coordina tes of the following vectors with respect to B.
a. {T1, T2, T 3, T4 } e Hom('t'" 2); T 1(a, b) =(a+ b, a), T2(a, b) = e. T(a +bt) = (a+ b)t 2. g. T(a +
bt) = O.
(a, a + b), T 3(a, b) = (a + b, 0), T4(a, b) = (0, b). f. T(a + bt) = 3a- b + (4b- 3a)t + (7a + 5b)t 2.
b. {T¡, T2 , T3 } e HomW 2 ); T 1(a, b) =(a, b), T2(a, b) = (b, a), T3(a, b)
12. Use the bases B 1 = {(1, 2, 1), (0, 1, -1), (1, 4, O)} for 1""3 and B2 = {(0, 1),
= (2a - b, 2b - a).
(1, 2)} for 1"" 2 to define the maps in the basis B = {Tu, T12, T21, T22, T3¡,
c. {T1 , T2, T 3, T4 } e Hom('t'" 2, 1"" 3); T1(a, b) =(a, b, b), T2(a, b) =
T32 } for Hom('t'" 3, 1"" 2). Find the following images:
(a, a, b), T 3(a, b) = (b, a, a), T4(a, b) = (b, b, a).
a. T21 (1, 4, O) b. Tn(1, 4, 0). c. Tn(O, -5, 5).
d. {T1 , T2, T3 } e Hom('t'" 2, 1"" 3); T 1(a, b) =(a+ b, 2b, b- a), T2(a, b)
d. T22(3, 4, 5). e. Tu(1, O, 0).
= (a- b; 2a; a + b), T 3(a, b) = (a, a + b, b).
Find the coordina tes of the following maps with respect to B.
3. What does the senten~'let TE Hom('t'", 111')" mean? f. T(x, y, z) = (3x- y+ z, 2y- x + 4z).
4. Let S, TE Hom('t'", 111') and U, V E 1"". g. T(x, y, z) =(y- 2z, 3x + 4z).
a. Identify three different meanings for the plus sign in the equation 13. What are the 4 zeros used in the vector space
[S+ T](U + V)= [S+ T](U) +[S+ T](V). a. Hom('t'" 2, 1"" 4)? b. Hom(1""3, R3[t])? c. Hom(.Jt2x3, .Jt2x2)?
b. What are the meanings of the symbol O in O( U+ O)= O( U)= O?
5. a. Prove that Hom('t'", 111') is closed under scalar multiplication.
b. Prove that[a + b]T = aT + bTfor any TE Hom('t'", 111') anda, bE R.
6. Let B = {T 11 , T 12 , T21 , T22 } be the basis constructed for Hom('t'"2) in
§4. Matrices of a Linear Transformation
Example 5. Find the following images:
a. T12 (3, 5). b. T22C4, -6). We have seen that if "fí and 1f/ are finite-dimensional vector spaces,
c. [2T12 - 3T21 + Tn](l, 3). d. [5Tu + ?T12 + 4T2,](0, -3). then sois the space of maps Hom("fí, "/Y). Further each choice of bases for
Find the coordinates of the following vectors with respect to B:
"fí and 1f/ yields a basis for Hom("fí, "/fí). The coordinates of a linear trans-
e. T(a, b) = (4a, b - 3a). g. T(a, b) = (4a + b, 0).
formation T from "fí to 1f/ with respect to such a basis can be arranged as
f. T(a, b) = (0, 0). h. T(a, b) = (2a + 3b, a - 5b).
a matrix in severa! ways. The following example shows which way we should
7. Write out the following sums: choose and illustrates one of the major uses for such a matrix.
3 2
c. L; L; aiJ TIJ.
l= 1 J= 1
Examp!e 1 Suppose T: R 2 [t] -+ R 3 [t] is defined by
8. Write the following sums using the sigma notation: .
a. ruTu + r12T12 + r21T21 + r22T22.
b. a41T41 + a42T42 + a43T43 + a44T44· T(a + bt) = 3a + b + (a - 4b)t + (a + 2b)t 2 •
c. r 1auT11 + r 1a 12 T 12 + r2a21T21 + r2a22T22 + r3a31T31 + r3a32T32.
An arbitrary image vector has the form x + yt + zt 2 , and the equation
9. Suppose TlJ are the maps used in Example 6.
T(a + bt) = x + yt + zt 2 yields the system of linear equations:
a. Find T 12 • b. Find T2 2. c. T23(4, 7) = ? d. Tl!(5, 9) = ?
10. If B is the basis for Hom('t'" 2, 1"" 3) from Example 6, find the coordina tes of T 3a +b = x
with respect to B when
a. T(x, y) = (x - y, 4x + y, 3y). c. T(x, y) = (x + 2y, x, x - y). a-4b=y
b. T(x, y) = (5x - 3y, O, 2x + 2y). d. T(x, y) = (3x - 2y, 4y, 2x).
a+2b=z
11. Use the bases {1, t} and {1; t, t 2} for R 2[t] and R 3[t] to obtain the basis B =
{Tu, T12, T 13 , T21• T 22, T23} for Hom (R2[t], R3[t]). or in matrix form
a. What are the maps T 11 , T13, and T21?
b. What is the vector 2Tu + JT21 - 4Tl3?
Express the following· maps as a linear combination of the TlJ's.
c. T(a + bt) = at + bt 2. d. T(a + bt) = 2b + (a- 4b)t + 5at 2.
156 4. LINEAR TRANSFORMATIONS §4. Matrices of a Linear Transformation 157

.;;.,
Therefore the image of any vector a + bt in R 2 [t] can be found by matrix the two columns in A contain the coordinates with respect to B 2 of T(1) and
multiplication. As an example, T(t), the images ofthe two vectors in B 1•
Using Example 1 as a guide, we associate matrices with a mapas follows:
T(4 - 6t) = 12 - 6 + (4 + 24)t + (4 - 12)t 2
= 6 + 28! - 8! 2 ' Definition Let TE Hom(f, "/Y) and suppose B 1 = {V1 , ••• , Vn}
and B2 = {W1, •.• , Wm} are bases for f and "/Y, respectively. If
and T(Vi): (a 1i, a 2 i, ... , am) 82 for 1 ::; j ::;n, then the m x n matrix (a¡) is the
matrix of T with respect to the bases B1 and B2 •

3 1) (12- 6) ( 6)
(~ -~ (_!)= :~i~ = !~. Therefore A is the matrix of Twith respect to B1 and B 2 ifthejth co1umn
of A contains the coordinates with respect to B 2 for the image of the jth vector
in B 1 • Since a linear map T is determined by its action on any basis, the
Further since matrix for T with respect to B 1 and B 2 contains enough information, in co-
ordinate form, to determine the image of any vector under T.

Example 2 Suppose TE Hom(f 2 , f 3 ) is given by T(a, b) =


(2a + b, 3a + b, a + 2b). Find the matrix A of T with respect to the basis
we can conclude that T(1 + 3t) = 6 - llt + 7t 2 • B 1 = {(4, 1), (1, O)} in the domain and the basis B2 = {(1, 1, 1), (1, 1, 0),
In using the matrix (1, O, O)} in the codomain.
Since

T(4, 1) = (9, 13, 6): (6, 7, -4)a 2

and
to find images under T, it is clear that the vectors are not being used directly.
Instead the columns contain coordinates with respect to the bases B 1 = {1, t} ·" ..
T(1, O)= (2, 3, 1): (1, 2, -1h 2 ,
and B 2 = {1, t, t 2 }. Thus 4- 6t: (4, -6h, and 6 + 28t- 8! 2 : (6, 28, -8) 82 •
e$,

In general, the column vector the matrix A is

A= (
-4
~ ;)·
-1
contains the coordinates of a + bt with respect to B 1 and
Following Example 1, we should expect that if V: (a, b) 8 , and

contains the coordinates of x + yt + zt 2 with respect to B2 • Moreover, the


matrix A is related to these bases, for T(1) = 3 + t + t 2 has coordinates then T(V): (x, y, zh 2 • To check this, choose V= (4, -6) arbitrarily. Then
(3, 1, 1) 82 and T(t) = 1 - 4t + 2t 2 has coordinates (1, -4, 2}a 2• Therefore V= (4, -6): ( -6, 28)8 , and T(V) = (2, 5, -8): ( -8, 14, -4h 2 • In this case
158 4. LINEAR TRANSFORMATIONS §4. Matrices of a Linear Transformation 159

we do have Proof Suppose B 1 = {V1 , ••• , V.} and B 2 = {W1 , ••• , Wm}.
Then by assumption

n m
V= ~>Yj and T(V) = L y¡W¡.
¡~¡
j~l

The proof that coordinates for an image can always be obtained in this way Further, by the definition of the matrix A,
is given by Theorem 4.9.
m
T(V) =
¡~¡
L aijW¡.
Example 3 Let T be as in Example 2. Find the matrix A' of Twith
respect to the standard bases for "Y 2 and "Y 3 •
Therefore

T(I, 0) = (2, 3, 1): (2, 3, l)¡E,J


T(V) = rCt/Yi) = it1xiT(V)
and

T(O, 1) = (1, 1, 2): (1, 1, 2)¡E 1¡, = Ixi[faijwi]


j~l ¡~¡
= f[±auxi]w;
¡~¡ j~l

therefore

So y¡, the ith coordinilte of T(V) with respect to B 2 , is I:i~ 1 auxi. Now

n
Examples 2 and 3 show that ditferent bases may give rise to ditferent Yt L aljxi
j~l
a 11 x 1 + a 12 x 2 + + a¡nXn
matrices for a given map. Therefore, one cannot speak of a matrix for a n
map without reference to a choice of bases. As a further example, consider Y2 L1a 2 jXj
j~
a 21 X 1 + X 22 X 2 + + aznXn
the two bases {(1, 0), (0, 1)} and {(3, 1, 1), (1, -4, 2), (5, 1, 2)}, the matrix
for the map T of Examples 2 and 3 with respect to these two bases is n
Ym L amjXj
j~l
Om¡X¡ + am2X2 + + amnxn

all a¡z a¡. X¡

a21 a22 a2n Xz


Thus by a careful choice of bases, we have obtained a particularly simple
matrix for T. We could continue to produce matrices for T in infinite variety
by choosing ditferent bases, but not every 3 x 2 matrix can be a matrix for
ami am2 amn x.
T. Can yo u think of a 3 x 2 matrix which is not the matrix of T with respect
to sorne choice of bases? What about a matrix with a column of zeros?
that is, Y = AX.
Theorem 4.9 Let A = (a¡) be the matrix of T with respect to B 1 and
B 2 • If V: (x 1, • •• , x.)s, and T(V): (y 1, • •• , Ymh 2 , set X= (x 1 , • •• , x.f and Lo6sely, the matrix equation Y= AX states that the product of a matrix
Y= (y¡,··., Ymf, then Y= AX. A for T times coordina tes X for Vequals coordinates Y for T(V). Notice the
160 . 4. LII\!EAR TRANSFORMATIONS §4. Matrices of a Linear Transformation 161

similarity between the matrix equation for a linear transformation, Y = AX, the rank of A is 2. On the other hand, the image space of T is
and the equation for a linear function from R to R, y = ax. We should also
observe that the matrix equation AX = Y is in the form of a systeril of m fr = .P{(-1, 1, 2), (7, 7, 7), (2, 4, 2), (-3, 3, 6)}
linear equations in n unknowns, see problem 9 at the end of this section.
Suppose TE Hom("Y"' "Y m) has matrix A with respect to any basis = .P{(1, o, -1), (0, 1, 1)}.
B = {V1 , ••• , V.} for "Y. and the standard basis for "~'m· In this special
case, T( Vi) is the transpose of the jth column of A. Then since the rank of T So the rank of T is also 2. Notice that J r is spanned by the transposed
equals the dimension of fr = .P{T(V1), . . . , T(V.)}, the rank of Tequals columns of A'. This happens only because A was obtained using the standard
the (column) rank of A. basis in the codomain.

In general, if A is a matrix for T, the columns of A do not span J T• but


Example 4
the ranks of A and T must be equal.

T(a, b, e, d) = (a - b + 2e, 2a - b + d, a - 2e + d).


Theorem 4.10 Suppose TeHom("Y, 1Y) has A as its matrix with re-
Suppose A is the matrix of T with respect to spect to B 1 and B 2 , then rank T = rank A.

B = {(1, 2, O, 1), (3, O, 2, 1), (0, -2, O, 2), (1, O, -2, 1)} Proof If B 1 = {V 1 , ••• , V.}, then fr = .P{T(V1), ••• , T(V,,)}
and the jth column of A contains the coordinates of T(V) with respect to
in the domain and {E¡} in the codomain. Show that the rank of A equals the B 2 . The columns of A and the spanning vectors of J T are related by the
rank of T. isomorphism L: Amx 1 -> 1fí where L((a 1 , ••• , am)r) = W if W: (a 1 , ••• ,
First, am)B 2 • (The proof that Lis an isomorphism is the same as the proof that an
m-dimensional vector space is isomorphic to "Y'"' Theorem 2.14, page 64.)
T(l, 2, O, 1) = ( -1, 1, 2), lf A =(a¡), then T(V):(a 1i, ... , am)B 2 • Therefore L((a 1i, ... , am)T)
= T( V), or L sends the jth column of A to the jth spanning vector of J r·
T(3, O, 2, l) = (7, 7, 0), This means that the restriction of L to the column space of A is a 1-1, onto
T(O, - 2, O, 2) = (2, 4, 2) map from the column space of Ato fr. Therefore dim(column space of A)
= dim fr or rank A = rank T.
T(l, O, -2, 1) = ( -3, 3, 6).
Therefore Example 5 Find the rank of T: R 3 [t] -> R 3 [t] using a matrix for
T, if
7 2 -3\
A=
C' ; 7 4
o 2 ~r
T(a + bt + et 2) = 3a + b + Se + (2a + b + 4e)t + (a + b + 3e)t 2 •

The matrix of T with respect to the basis B = {1, t, t 2 } in both the domain
Since A is column-equivalent to and codomain is

o o
A'~ ( -1: o
~)·
o
162 4. LINEAR TRANSFORMATIONS §4. Matrices of a linear Transformation 163

and sorne situations a simple matrix is needed while in others the bases should be
simple. But in general, it is not possible to have both.

A'=
1 o o)
O 1 O
We have been considering various matrices for a particular linear trans-
formation. Suppose instead we fix a choice for the bases B 1 and B2 , and
( -1 2 o consider all maps in Hom('i'", 1fí). If di m 1'" = n and dim "/Y = m, then each
map TE Hom('i'", "/fí) determines an m x n matrix Ar which is the matrix
is column-equivalent toA. Hence the rank of A is 2 and rank T = 2. Since of Twith respect to B 1 and B2 • Thus there is a map <p defined from Hom('i'", 1fí)
di m J r ::::; di m R 3 [t], T is neither 1-1 nor onto. to J!t m X n which sends T to the matrix Ay.
The matrix A' can be used to find the image space for T. Since elementary
column operations do not change the column space of a matrix, the columns
Example 7 Consider the map <p :Hom('i'" 2 ) ~ Jt 2 x 2 obtained using
of A' span the column space of A and are coordinates of spanning vectors for
B 1 = {(2, 4), (5, -3)} and B 2 = {(!, 0), (3, 1)}. Then for TE Hom(1'" 2 ), cp(T)
.Ir. That is, .Ir= .P{U, V} where U: (1, O, -l)n and V: (0, 1, 2) 8 . These
is the matrix of T with respect to B 1 and B 2 •
coordinates give U= 1 - 12 and V= 1 + 21 2, so .Ir = 2'{1 - 12, t + 21 2 }.
Suppose T 1, T 2 , T 3 , T4 E Hom('i'" 2 ) are given by
This basis for J r is analogous to the special basis obtained for the image
space in Example 4.
The matrix A' is not the matrix of T with respect to {1, t, t 2 } for both
T 1(a, b) = (a, b), Tia, b) = (2a - b, a + b),
the dorriain and codomain. But bases could be found which would give A' T 3 (a, b) = (2a, b), T4 (a, b) = ( -b, a).
as the matrix for T. In fact, an even simpler matrix can be obtained.
Then

Example 6 Let T and A be as in Example 5. The matrix


cp(T¡) = (
-10
4
14)
-3 ' cp(T2) = ( -l~ ~).
o o)
(-8 19) -1;).
1
A"= O 1 O
( <p(T3) = 4 -3 , cp(T4) = ( -1~
o o o
is equivalent to A. That is, A" can be obtained from A with elementary row Notice that T 2 = T3 + T4 and
and co1umn operations. Find bases B 1 and B2 for R 3 [1] with respect to which
A" is the matrix of T.
Since the third co1umn of A" has all zeros, the third vector in B 1 must
be in .A' r = 2'{1 + 21 - 12 }. Therefore take 1 + 2t + t 2 as the third vector
of B 1 • Then the first two ve<...ors in B 1 may be chosen arbitrarily provided the That is,
set obtained is linearly independent. Say we take B 1 = {t, 12 , 1 + 2t- 12 }.
Now since T(t) = 1 + t + t 2 and T(t 2 ) = 5 + 4t + 3t 2 , these must be taken
as the first two vectors in B 2 • (Why do these two images have to be inde-
suggesting that cp preserves addition. Further the map 4T1 given by [4T.J(a, b)
pendent? T is not one to one.) The third vector in B2 may be any vector in
R 3 [t] so long as B2 is linearly independent. So Jet B2 = {1 + t + t 2 , 5 + 4t
= (4a, 4b) has
+ 3t 2 , 7}. Then A" is the matrix of T with respect to B 1 and B2 •
-40 56)
( 16 -12
In order to obtain a simple matrix for a map, such as A" in Example 6,
complicated bases may be needed, while the simple basis {1, t, t 2 } used in as its matrix with respect to B 1 and B2 • So <p(4T1) = 4[cp(T1)], suggesting
Example 5 produced a 3 x 3 matrix with no speeial form for its entries. In that <p is a linear transformation.
164 4. liNEAR TRANSFORMATIONS §4. Matrices of a Linear Tra~formation 165

The map q¡ · is essentially the correspondence that sends maps from 2. UsethebasesB 1 = {1,t}andB2 = {l,t,t 2 }tofindmatricesforthefollowing
Hom(-r, 11') to their coordinates with respect to a basis {T;J- The only linear trarisformations from Hom(Rz[t], RJ[t]).
difference is that instead of writing the coordinates in a row, they are broken a. T(a + bt) = a + bt. b. T(a + bt) = (a + b)t 2 •
up into the columris of a matrix. From this point of view, the correspondance c. T(a + bt) = O. d. T(a + bt) = 3a- b + (4b - 3a)t + (7a + 5b)t 2 •
qJ should be an isomorphism. 3. Find the matrix of TE Hom(i'" 2 , 1'" 3 ) with respect to the bases {(7, 3), (5, 2)}
and {(1, 3, 1), (1, 4, -2), (2, 6, 3)} when
a. T(a, b) = (0, O, 0).
Theorem 4.11 If dim -r = n and dim "/Y= m, then Hom(-r, 11') is b. T(a, b) = (5b- 2a, 20b- 8a, 4a- JOb).
isomorphic to Afmxn· c. T(a, b) = (a - 2b, O, b).

4.. Find the matrix of T(a, b, e) = (a + 3b, b - 2e, a + 6e) with respect to B 1 and
Proof Suppose B 1 and B2 are bases for -yr and 1/í, respectively. B 2 when;
For each TE Hom(-r, 1/í), let q¡(T) be the matrix of Twith respect to 8 1 and a. Bt = B2 ={E,}.
B 2 • lt must be shown that q¡: Hom(-r, 1/í)-> Afmxn is an isomorphism. b. B 1 = {(4, 1,,;-2), (3, -2, 1), (1, 1, O)}, B 2 = {E¡}.
The proof that q¡ is linear is straightfoward and left to the reader. c. B 1 = {(4, 1, -2). (3, -2, 1), (1, 1, 0)},
To show that (p is 1-1, su ppose TE .Al"'~'' then B 2 = {(7, 5, -8), (-3, -4, 9), (1, 2, 3)}.
d. B1 = {(2, 1, 0), (4, O, 1), ( -12, 4, 2) },
B 2 = {(5, 1, 2), (4, -2, 10), (1, O, 1)}.
q¡(T) = O..llmxn"
e. B 1 = {(1, O, 0), (0, 1, 0), (6, -2, -1)},
B 2 = {(1, O, 1), (3, 1, 0), (8, 4, 2) }.
That is, the matrix of T with respect to B 1 and B2 is the zero matrix. Hence
the matrix equation for T is Y = OX. Since every image under T has co- 5. Let A be the matrix of T(a, b, e) = (b -- 3a + 2e, 4e - 3a - b, a - e) with
ordinates Y = 0_11 .... ,, T is the zero map 0Hom( 1r, 1n- This means that the null respect to the standard basis for 1'" 3 • Use the matrix formula Y = AX to find:
space of q¡ contains only the zero vector, so T is 1-1. a. T(O, 1, 0). b. T(2, -1, 3). c. T(2, 2, 2).
To prove that q¡ is onto, Jet {Tij} be the basis for Hom(-r, ir) con- 6. Let A be the matrix for the map T(a, b) = (3a - b, b- 4a) with respect to the
structed using the bases B 1 and B2 • Then q¡(T;) is the m x n matrix with 1 basis {(1, 1), (1, O)} for 1'" 2 (in both the domain and codomain). Use the matrix
in the ith column and jth row, and zeros everywhere el se. The set of all such formula Y= AX to find: a. T(l, 1). b. T(4, 0). c. T(3, 2).
m x n matrices spans vffmxn• so qJ maps Hom(-r, 1f/") onto vffmxn· Thus (p d. T(O, 1).
is an isomorphism and Hom(-r, "'r) ~ ~tnxm· 7. Let A be the matrix of T(a + bt + ct 2 ) = 2a - e -\- (3a -\- b)t -\- (e -
a- 3b)t 2 with respect to {1, t, t 2 }. Use Y= AX to obtain: a. T(t).
Theorem 4.1 1 is an important result for it te lis us that the study of b. T(l + t 2 ). c. T(3t - 4). d. T(O).
linear transformations from one finite-dimensional vector space to another 8. Suppose A ís the matrix of T with respect to the basis B 1 = {V¡, ... , v.}
may be carried out in a space of matrices, and conversely. Computations in the doma in and B 2 = { W¡, ... , w.. }. What does it mean if:
with maps are often handled most easily with matrices; while severa! theo- a. The 3rd column of A has all zeros?
retical results for matrices are easily derived with linear maps. b. The 4th column of A has all zeros except for a 7 in the 2nd row?
c. The 1st column of A contains only two nonzero entries?
d. All nonzero entries of A occur in the first and last rows?
Problems e. A = (a,,;) and au = O unless i = j?
9. Suppose A as the matrix of a map Twith respect to B 1 and B 2 • View the matrix
1. Find the matrices of the following maps with respect to the standard bases for equation Y = AX as a system of linear equations in the coordinates of vectors.
the n-tuple spaces. What can be infered about T if:
a. T(a, b) = (a, b). b. T(a, b) = (0, a). a. AX = Y is consisten! for all Y?
c. T(a, b, e) = (0, e, 0). d. T(a, b) = (2a - b, 3a + 4b, a + b). b. AX = Y has a unique solution if consisten!?
166 4. LINEAR TRANSFORMATIONS §5. Composition of Maps 167

c. AX = Y is not a1ways consistent? Definition Let T: "f/" ~ 1/í and S: 1/í ~ ú/1. The eomposition of S
d. There is at 1east one parameter in the so1ution for AX = Y for any Y? and T is the map S o T: "f/" ~ ú/1 defined by [S o T](V) = S(T(V)) for every
10. The matrix vector VE "f/".

A'= (b ?) Let T: "f/" 3 ~ "f/" 2 and S: "f/" 2 ~ "f/" 3 be given.by


Example 1
is equivalent to the matrix A in problem 6. Find bases B 1 and B 2 for "Y 2 with
respect to which A' is the matrix for T.
T(a, b, e) = (a + b, a - e) and S(x, y) = (x - y, 2x -y, 3y).
11. The matrix
'WL._ 1 1
A'= ( O o go) Then
o o
is equivalent to the matrix A in prob1em 5. Find bases B 1 and B 2 for "Y a with [S o T](a, b, e) = S(T(a, b, e)) = S(a + b, a - e)
respect to which T has matrix A'.
= ([a + b] - [a - e], 2[a + b] - [a - e], 3[a - e])
12. Let T, A, and A' be as in Example 3. Therefore, A' is column-equiva1ent toA.
Find bases B 1 and B2 for "Y 4 and "Y a; respectively, so that A' is the matrix of = (b + e, a + 2b + e, 3a - 3e).
Twith respect to B 1 and B 2.
13. Let B1 = {(2, 1, 1), (- 1, 3, 1), (1, O, 2)} and B2 = {(4, 1), (1, O)}. Suppose
The composition S o T could be thought of as sending a vector V E "f/" 3 to
rp(T) is the matrix of Twith respect to B 1 andB 2 • Define T 1 and T 2 by T 1(a, b, e) [S o T](V) by first sending it to "f/" 2 with T and then sending T( V) to "f/" 3 with
= (3a - b + e, 3e - 4a) and Tia, b, e) = (e - b - 2a, a - 2e). S. When V= (1, 2, O) then T(l, 2, O) = (3, 1), S(3, 1) = (2, 5, 3), and
a. Find rp(Tt), rp(T2), rp(T1 + +
T 2), and rp(2T1 3T2 ). [S o T](l, 2, O) = (2, 5, 3). This is represented pictorially in Figure 3.
b. Show that rp(T1 + T 2) = rp(T1) + rp(T2). The composition T o S is also defined for the maps in Example 1, with
c. Show that rp(2T1 + 3T2) = 2rp(T1 ) + 3rp(T2 ).
14. Show that the map rp: Hom("Y, 11')---+ vffmxn is linear. [To S](x, y)= T(x- y, 2x- y, 3y) = (3x- 2y, x- 4y).

15. Suppose T, B~> and B 2 are as in Example 6, page 153.


Notice that not only are T o S and S o T not equal, but they have different
a. Find the matrix A of Twith respect to B 1 and B 2 •
b. How does A compare with the coordinates of T with respect to the basis domains and codomains, for T o S: "f/" 2 ~ "f/" 2 and S o T: "f/" 3 ~ "f/" 3 •
{TIJ}?

16. Suppose A = (a¡¡) is the matrix of TE Hom("Ya, "Y 2) with respect to the .y
bases B1 and B2, and B = {T11 , Tu, T21> T2 2, Ta 1, T32 } is the basis for 2
Hom("Y a, "Y 2) constructed using B 1 and B 2 • Write the coordina tes of Twith
respect to B in terms of the entries a1¡ in A.

§5. Composition of Maps

Suppose S and Tare maps such that the image of T is contained in the
domain of S. Then S and T can be "composed" to obtain a new map. This
operation yields a multiplication for sorne pairs of maps, and using the as-
sociation between matrices and maps, it leads directly to a general definition
for matrix multiplication. Fi\JUre 3
168 ::1) 4. LINEAR TRANSFORMATIONS §5. Composition of Maps 169

Example 2 Proof l and 3 are left to the reader. For 2, suppose V is in the
domain of T3 , then
T(a, b, e) = (a - e, O, b + e) and S(x, y)= (y, -y, x +y);
(T 1 o [T2 o T 3 ])(V) = T 1([T2 o T3 ](V)) = T 1(T2 (T3(V)))
Then S o T is undefined beca use the codomain of T is "f/ 3 while the domain
of S is "f/ 2 • However, ToS is defined, with [To S](x, y) = (-x, O, x). Thus = [T 1 o T2 ](T3 (V)) = ([T1 o T 2 ] o T3 )(V).
it may be possible to compose two maps in only one way.
·~··
Since the maps T 1 o [T2 o T 3 ] and [T1 o T 2 ] o T3 have the same effect o~ V,
they are equal, and composition of maps is associative.
Composed maps are encountered frequently in calculus. The chain rule
for differentiation is introduced in order to differentiate composed functions.
For example, the function y = Jsin x is the composition of z = f(x) and If S and T are linear maps from a vector space to itself, th~ product
y= g(z) wheref(x) = sin x and g(z) = JZ:.
For then, S o T is always defined. Further S o T: "f/ ~ "f/ is linear, so Hom("f/") is
closed under composition. Thus composition defines a multiplication on the
y= [g of](x) = g(f(x)) = g(sin x) = Jsin x. space of linear transformations Hom("f/"). This is a special situation for up
to this point we have considered only one other multiplicative operation
Composition may be viewed as a multiplication of maps. This is in within a vector space, namely, the cross product in ~ 3 •
contrast to scalar multiplication, which is only the multiplication of a Consider the algebraic system consisting of Hom("f/") together with
number times a map. The most obvious characteristic of this multiplication addition and multiplication given by composition, while disregarding scalar
is that it is not defined for all pairs of maps. But when composition is defined, multiplication. This algebraic system is not a vector space. 1t is rather a set
it satisfies sorne of the properties associated with multiplication of real of elements, maps, on which two operations, addition and multiplication,
numbers. are defined, and as such it is similar to the real number system. If we check
through the basic properties of the real numbers listed on page 2, we find
that addition of maps satisfies all the properties listed for addition in R. Of
Theorem 4.12
the properties listed for multiplication in R, we have seen, in Theorem 4.12,
l. The composition of linear maps is linear.
that Hom("f/") is closed under multiplication and that the associative law for
2. Composition of maps is assodative; if T 1 o [T2 o T3 ] and
multiplication holds in Hom("f/"). Moreover, the distributive laws hold in
[T 1 o T 2 ] o T 3 are defined, then
Hom("f/"). An algebraic system satisfying just these properties of addition
and multiplication is called a ring. Thus the set of linear maps, Hom("f/"),
together with addition and multiplication (composiotion) forma ring.
It is easy to see that Hom("f/") does not satisfy all the field properties
3. For linear maps, composition distributes over addition; if
listed for R. -....,.,

Example 3 Let S, TE Hom("f/" 2 ) be given by

S(a, b) = (2a - b, a + 3b) and T(a, b) = (a + b, 2b - a).


Similarly
Show that S o T i= T o S.
We have

provided the maps may be composed and added. [S o T](a, b) = S(T(a, b)) = S(a + b, 2b - a) = (3a, 7b - 2a),
170 4. LINEAR TRANSFORMATIONS §5. Composition of Maps 171

and That is, S o T = 1 as desired. Since composition is not commutative, it is


necessary to check that T o S = 1 also, in order to show that S is T - 1 •
[ToS]( a, b) = T(S(a, b)) = T(2a - b, a + 3b) = (3a + 2b, 1b),
Therefore S o T =1: ToS and multiplication is not commutative in Hom('i' 2 ). Example 5 Define TE Hom(R 3 [t], 1' 3 ) by
You might check to see that the same result is obtained with almost any pair
ofmaps from Hom(1' 2 ). T(a + bt + et 2 ) = (a + b + e, 2a + b + 2e, a + b).
A ring of maps does share one additional property with the real num- Given that T is invertible, find T- 1 •
bers; it has a multiplicative identity. We define the identity map 1: 1' ~ 1' If ~-
by /(V) = V for all V E 1'. It is then immediate that 1 E Hom('i').and 1 o T
= To 1 = Tfor any map TE Hom('i'). Since a multiplicative identity exists, T(a + bt + et 2 ) = (x, y, z),
the ring Hom('i') is called a ring with identity.
Since there is an identity for composition, we can look for multiplicative then
in verses.

Definition Given T: 1' ~ 11'. Suppose S: 11' ~ 1' satisfies S o T


= !.y and- ToS = /111'. Then S is the in verse of T. T is said to be in vertible, That is, a, b, and e must be expressed in terms of x, y, and z. The system
and its inverse is denoted by r- 1 •
a+ b +e= x, 2a + b + 2e =y, a+ b = z,

Example 4 Suppose T: 1' 2 ~ 1' 2 is given by T(a, b) = (2a + 3b, has the solution
3a + 4b). Show that T is invertible by finding T- 1 •
Write T(a, b) = (x, y). If a map. S exists such that S o T = /, then a= -2x +y+ z, b = 2:X- y, e=x-z
S(x, y) = (a, b). Therefore, S may be obtained by solving
for any x, y, and z. Therefore, T- 1 should be defined by
2a + 3b = x
3a + 4b =y r- 1(x, y, z) = -2x +y+ z + (2x- y)t + (x- z)t 2 •

for a and b in terms of x and y. These equations have the unique solution lt must now be shown that r-t o T = IR,[t] and T o r- 1 = 1-;r,. This is left
to the reader.
a= -4x + 3y
b = 3x- 2y. In the real number system, every nonzero number has a multiplicative
inverse. However, a similar statement cannot be made for the nonzero maps
Therefore S should be defined by in Hom('i' 2 ).

S(x, y) = (-4x + 3y, 3x - 2y).


Exf!mple 6 Suppose TE Hom('i' 2 ) is defined by T(a, b) = (a, O).
Using this definition, Show that even through T =1: O, T has no inverse.
Assume a map S e Hom('f" 2 ) exists such that S o T =l. Then S(T(V))
[S o T](a, b) = S(T(a, b)) = S(2a + 3b, 3a + 4b) = Vfor all V E 'f"2 • But .;V r =1: {O}; forexample, T(O, 1) = (0, O) = T(O, 2).
Therefore S(O, O) must be (0, 1) on the one hand and (0, 2) on the other. Since
= (-4(2a + 3b) + 3(3a + 4b), 3(2a + 3b) - 2(3a + 4b)) this violates the definition of a map, S cannot exist. In this case, S fails to
=(a, b). exist because T is not one to one.
172 4. LINEAR TRANSFORMATIONS §5. Composition of Maps 173

Suppose we had asked that ToS = l. Then T(S(W)) = W for all and if YEifí, then [ToS](Y) =-T(S(Y)) =Y. Thus S is r- 1 and T is
W E .Y 2 • But now ..fr f:. .Y 2 ; in particular, (0, 1) ~ ..f r· So if W is ta)<en to invertible.
be (0, 1), then W has no preimage under T. Therefore no matter how S(O, l)
is defined, T(S(O, l)) f:. (0, 1), and this map S cannot exist either. Here S Theorem 4.14 shows that a linear map is invertible if and only if it is an
fails to exist beca use T is not onto. isomorphism. However, the essential requirement is that the map be one to
one, for if T: r -> ifí is one to one but not onto, then T: r -> ..fT is both
Before generalizing the implications of this example, in Theorem 4.14, one to one and onto. So an invertible map may be obtained without changing
a few facts concerning inverses should be stated. They are not difficult to how T is defined on .Y. On the other hand, if T is not 1-l, then there is no
prove, so their proofs are left to the reader. such simple change which will correct the defect. Since being l-1 is essential
to inverting a map, a linear transformation is said to be nonsingular if it is
Theorem 4.13 one to one; otherwise it is singular. A singular map cannot be inverted even
l. If T is invertible, then T has a unique in verse. if its codomain is redefined, for it collapses a subspace of dimension greater
2. If T is linear and invertible, then T- 1 is linear and invertible with than zero to the single vector O.
(T-1)-1 = T. The property of being invertible is not equivalent to being nonsingular
3. If S and Tare invertible and S o T is defined, then S o T is in vertible or l-1. However, in the improtant case of a map from a finite-dimensional
with (S o T)- 1 = y-t o s- 1 • vector space to itself, the two conditions are identical.

Notice that the inverse of a product is the product of the inverses in the Theorem 4.15 If .Y is finite dimensional and TE Hom('"f'"), then the
reverse order. This change in order is necessary because composition is not following conditions are equivalent:
commutative. In fact, even if the composition (S- 1 o T- 1) o (S o T) is l. T is in vertible.
defined, it need not be the identity map. 2. T is l-1. (Nullity T = O or .Al r = {0}.)
3. · T is onto. (Rank T = dim .Y.)
Theorem 4.14 A linear map T: .Y -> "/fí is invertible if an only if it 4. If A is a matrix for T with respect to sorne choice of bases, then
is one to one and onto. det A f:. O.
5. T is nonsingular.

Proof (=>) Assume y-t exists. To see that T is 1-1, suppose


V E ,Al T· Then since y-t is linear, T- 1(T(V)) = r- 1(0) = O. But Proof These follow at once from Theorem 4.14 together with the
two facts:
T- 1(T(V)) = [T- 1 o T](V) = !(V) = V.
···~
dim .Y = rank T + nullity T and rank T = rank A.
Therefore V = O and .Al T o;= {O}, so T is 1-1.
- To see that T is onto, suppose W E 1/í. Since r-t is defined on 1r, there We can see how the conditions in Theorem 4.15 are related by con-
is a vector U E .Y, such that r-t (W) = U. But then . sidering an arbitrary map TE Hom(.Y 2 ). From problem 11, page 145, we
know that T(a, b) = (ra + sb, ta + ub) for sorne scalars r, s, t, u. If we write
T(U) = T(T- 1(W)) = [To T- 1](W) = l(W) = W. T(a, b) = (x, y), then r- 1 exists provided the equations

Thus "/fí e J T and T is onto. ra + sb = x


( <=) As sume T is 1-1 and onto. For each W E "/fí, there exists a vector
V E .Y, such that T(V) = W, because T maps .Y onto 1/í. Further V is
ta + ub =y
uniquely determined by W since Tis 1-1. Therefore we may define S: 1/í-> .Y
can be solved for a and b given any values for x and y. (This was done in
by S(W) = V where T(V) = W. Now if X E .Y, [S o T](X) = S(T(X)) = X
Example 5.) From our work with systems of linear equations, we know that
§5. Composition of Maps 175
174 4. LINEAR TRANSFORMATIONS --

3. Show that T 2 = 1 when T(a, b) =(a, 3a - b).


mch a solution exists if and only if the coefficient matrix
4. Is T 2 meaningful for any linear map T?

A= G~) 5. Suppose Te Hom('f", if") and S E Hom(if", ó/1). Prove that So TE Hom('f", ó/1).
That is, prove that the composition of linear maps is linear.

has rank 2. But A is the matrix of T with respect to the standard basis for 6. Suppose T 1 (a, b, e) = (a, b + e, a- e), T2(a, b) = (2a - b, a, b - a) and
"r' 2 • So T - t exists if and only if det A ,¡: O or rank T = 2. Further, the rank T 3 (a, b) = (0, 3a- b, 2b). Show that the maps T1o[T2 + T3] and T1oT2 +
c.f A is 2 if and only if the equations have a unique solution for every choice T1 o T3 are equal.
c.f x and y. That is, if and only if T is 1-1 and onto. - 7. a. Prove that the distributive law in problem 6 holds for any three linear
Returning to the ring of maps Hom('i'"), Examples 3 and 5 can be maps for which the expressions are defined.
generalized to show that if the dimension of ií is at least 2, then multiplica- b. Why are there two distributive Jaws given for the ring of maps Hom('f")
tion in Hom("fí) is not commutative and nonzero elements need not have when only one was stated for the field of real numbers R?
rr:ultiplicative inverses. So the ring Hom(ií) ditfers from the field of real s. Find all maps S E Hom('f" 2 ) slich that So T = ToS when
numbers in two important respects. Suppose we briefty consider the set of a. T(a, b) = (a, 0). c. T(a, b) = (2a - b, 3a).
nonsingular maps in Hom("fí) together with the single operation of c:omposi- b. T(a, b) = (2a, b). d. T(a, b) = (a + 2b, 3b - 2a).
tion. This algebraic system is similar to the set of all nonzero real numbers 9. Suppose TE Hom('f") and let f/ = {S e Hom('f")jSo T = ToS}.
together with the single operation of multiplication. The only ditference is a. Prove that f/ is a subspace of Hom(f).
that composition of maps is not commutative. That is, for the set of all non- b. Show that if dim f;;:::: 1, then dim f/;;:::: 1 also.
sbgular maps in Hom("fí) together with composition (!) the set is closed
10. Determine if T is invertible and if so find its inverse.
under composition; (2) composition is associative; (3) there is an identity
element; and (4) every element has an inverse under composition. Such an
a. T(a, b, e) = (a - 2b + e, 2a + b - e, b - 3a).
b. T(a, b) = (a- 2b, b - a).
algebraic system with one operation, satísfying these four properties is called c. T(a, b, e) = a+ e+ bt + (-a - b)t 2 •
a group. The nonzero real numbers together with multiplication also form d. T(a + bt) = (2a + b, 3a + b).
a group. However, since multiplication of numbers is commutatíve, this e. T(a + bt) =a+ 3at + bt 2.
system is called a commutative group. Still another commutative group is f. T(a, b, e) = (a - 2e, 2a + b, b + 3e).
formed by the set of all real numbers together with the single operation of 11. Suppose T: f - 11' is linear and invertible.
addition. a. Prove that T has only one inverse.
A general study of groups, rings, and fields is usually called modern b. Prove that T- 1 is linear.
algebra. We will not need to make a general study of such systems, but in
12. Show by example that the set of all nonsingular maps in Hom('f" 2) together
time we will have occasion to examine general fields, rings of polynomials,
with O is not closed under addition. Therefore the set of nonsingular maps
and groups of permutations.
together with addition and composition does not form a ring.
13. Find an example of a linear map from a vector space to itself which is
Problems a. one to one but not onto.
b. onto but not one to one.
l.Find SoTand/or ToS when 14. Suppose the composition S o T is one to one and onto.
a. T(a, b) = (2a- b, 3b), S(x, y)= (y- x, 4x). a. Prove that T is one to one.
b. T(a, b, e) = (a - b + e, a + e), S(x, y) = (x, x - y, y). b. Prove that S is onto.
c. T(a, b) =(a, 3b +a, 2a- 4b, b), S(a, b, e) = (2a, b). c. Show by example that T need not be onto and S need not be 1-1.
d. T(a + bt) = 2a- bt 2 , S(a + bt + et 2 ) =e+ bt 2 - 3at 3 •
2. For Te Hom('f"), T" is the map obtained by composing Twith itself n times. 15. Suppose risa finite-dimensional vector space and Te Hom('f"). Prove that
If 1"' = 1"' 2 and T(a, b) = (2a, b - a), find if there exists a map S E Hom('f") such that SoT = l, then ToS= l. That is,
a. T 2 • b. T 3 • c. T". d. T 2 - 2T. e. <1f + 3/. if SoT = 1 (orToS= 1), then T is nonsingular and S= T- 1 •
176 4. LINEAR TRANSFORMATIONS §6. Matrix Multiplication 177

16. Suppose S(a, b) = (4a + b, 3a + b) and T(a, b) = (3a + b, 5a + 2b). the matrix of S o Twith respect to {E¡} is
a. Find SoT.
b. Find s-t, r- 1, and (SoT)- 1.
c. Find r- 1oS- 1 and check that it equals (SoT)- 1.
d. Show that s- 1oT- 1oSoT '# /. That is, (SoT)- 1 '# s- 1oT- 1.
17. Prove that if S and Tare invertible and SoT is defined, then (SoT)- 1 = Therefore the product of As and Ar should be defined so that
T-1os-1.
18. We know from analytic geometry that if T. and Tp are rotations of the plane
through the angles a and p, respectively, then T. o Tp = T«+ P· Show that the
equation T. o Tp(l, O) = T. +P(l, O) yields the addition formulas for the sine and
cosine functions. Notice that the product should be defined with the same row times column
rule used to represent systems of linear equations in matrix form. For the
first column of As,r is the matrix product

§6. Matrix Multiplication


(a
11
Gz¡
a 12
Gzz
)(b")
b21
The. multiplication of linear transformations, given by composition in
conjunction with the correspondence between maps and matrices, yields a and the second column is
general definition for matrix multiplication. The product of matrices should
be defined so that the isomorphisms between spaces of maps and spaces of
matrices preserve multiplication as well as addition and scalar multiplication.
That is, if S and Tare maps with matrices A and B with respect to sorne choice In other words, the element in the ith row andjth column of As,r is the sum
of bases, then S o T should ha ve the matrix product AB as its matrix with of the products of the elements in the ith row of As times the elements in the
respect to the same choice of bases. An example will lead us to the proper jth column of Ar. This "row by column" rule will be used to define multipli-
definition. cation in general. The summation notation is very useful for this. If we set
As,r = (e;) above, then eii = a; 1 bu + a; 2 b2 j for each i and j, therefore
Example 1 Let S, TE Hom('f" 2 ) be given by 2

e;j = L a;kbkj·
S(x, y)= (a 11 x + a 12 y, a21 x + a22 y), k=l

T(x,y) = (b 11 x + b 12 y, b21 x + b22 y), A sum of this form will be characteristic of an entry in a matrix product.
Notice that in the sum L~= 1 a;kbkj• the elements a;k come from the ith row
then of the first matrix and the elements bkj come from the jth column of the

As= (a,, a,z)


az¡ an
and
second.

Definition Let A = (a;) be be an n x m matrix and B = (b;j) an


are the matrices of S and T with respect to the standard basis for 'f" 2 • And m x p matrix. The produet AB of A and Bis the n x p matrix (e;) where
sin ce
m

S o T(x, y)= ([a 11 b11 + a 12 b2¡]x + [a 11 b12 + a12 bn]y, eij = L a;kbkj = a; 1b 1j + a; 2b2j + · · · + a;mbmj•
k=l

[a 21 b 11 + a22 b2 ¡]x + [a21 b12 + a22 b22 ]y), 1 ::::; i ::::; n, 1 ::::; j::::; p.
178 4. LINEAR TRANSFORMATIONS §6. Matrix Multiplication 179

(41-2)(-; (o11 -14)2'


This definition yields AsAT = As.T when As, An and As.T are as in 3
- )
Example l. The definition also agrees with our previous definition of matrix
1 3 5 3 ~ =
product when B is m x l. Not all matrices can be multiplied, the product
AB is defined only if the number of columns in A equals the number of rows
in B. What does this correspond to in the composition of maps?
(-2
2
-3)(4
o 131 -2)
5
= (-58 =72 -19) 4.
3 1 13 6 -1
Example 2 Let
The complicated row by column definition for matrix multiplication is

A = (a;) = (i :)
2 -1
chosen to correspond to composition of linear transformations. Once this
correspondence is established in the next theorem, all the properties relating
to composition, such as associativity or invertibility, can be carried over to
matrix multiplication.
and

B = (bii) = ( -1
1 2
1
-1
2
o)
-2 .
Theorem 4.16 Let T: "Y ---+ 11/" and S: 1fí ---+ il/f be linear with dim "Y
= p, dim 1fí = m, and dim il/f = n. Suppose B 1 , B2 , B 3 are bases of "Y, 1fí,
il/f, respectively, and tllat AT is the m x p matrix of T with respect to B 1
Since A is 3 x 2 and B is 2 x 4, AB is defined but BA i s not. The product and B 2 , As is the n x m matrix of S with respect to B2 and B 3 , and As.T is
AB = (e;) is a 3 x 4 matrix with the n x p matrix of S o T with respect to B 1 and B 3 •
Then AsAT = As.T·

Proof The proof is of necessity a Iittle involved, but it only uses


Thus the definitions for a matrix of a map and matrix multiplication.
2
Let
c 11 = ,Lalkbkl = 2·1 + 4(-1) = -2
k=l

and
2
c34 = .L a3kbk4 = 2-o +e -1)·C -2) = 2. n
k=l
S(Wk) = L ahkuh
h=l
for 1 ~k~ m.
Computing all the entries cli gives

m
T(V) = .L bkjwk
k=l
for 1 ~ j ~p.

Example 3 Sorne other examples of matrix products are: And if As.T = (chi), then

(23 -1)(2
-4 1 3
2) (5o -83)' G)
= (1 3 O) = (i 1~ ~} for 1 :::; j ~p.
~. LINEAR TRANSFORMATIONS §6. Matrix Multiplication 181
180

The proof consists in finding a second expression for S o T(V). that cp(T) = A, cp(S) = B, and cp(U) = C. Using these maps,

+ = cp(T)[cp(S) + cp(U)]
So T(V) = S(T(Vi)) = s(~/kiwk) =J 1
bkiS(Wk)
A(B C)
= cp(T)cp(S + U) cp preserves addition

= f bki[h=lf ahkuh] h=lf [k=lf ahkbki] Uh.


k=l
=
= cp(To [S+ U]) cp preserves multiplication
= cp(T o S + T o U) Distributive law for maps
The last equality is obtained by rearranging the nm terms in the previous = cp(To S)+ cp(To U) cp preserves addition
sum. Now the two expressions for S o T(V) give
= cp(T)cp(S) + cp(T)cp(U) cp preserves multiplication
= AB + AC.
!he map cp carries the algebraic properties of Hom("Y.) to Jtt.x.,
But the U's are linearly independent, so chi = LZ'= 1 ahkbki for each h and j.
makmg J!tnxn a ring with identity. Thus cp: Hom("Y.) ~ J!fnxn is a (ring)
That is, (a 11 k)(bk) = (eh) or AsAT = AsoT
homomorphism. The identity in J!t n x n is cp(I), the n x n matrix with ¡ 's on the
If this proof appears mysterious, try showing that it works for the main diagonal and O's elsewhere. This matrix is denoted by I. and is called
the n x n identity matrix. If A is an n x n matrix, then Al. = l 11 A = A. If
spaces and maps in Example 1, or yo u might consider the case "Y = "Y 2 ,
Bis an n x m matrix with n -:f. m, then I.B = B and the product of B times
"'fí = "Y 3 , and ó!1 = "Y 2· I. is not defined, but Blm = B.
For all positive integers n and m, let cp: Hom("Ym, "Y.)~ J!fnxm with
cp(T) the matrix of T with respect to the standard bases. For any particular
n and m we know that cp is an isomorphism, and with the above theorem, Definition Let A be an n x n matrix and suppose there exists an
cp preserves multiplication where it is defined. That is, if TE Hom(Vp, "Y m) n x n matrix B such that AB = BA = !,.. Then Bis the inverse of A, written
and S E Hom("Y,., "Y.), then cp(S o T) = cp(S)cp(T). Thus cp can be used to A-l, and A is said to be nonsingular. Otherwise A is singular.
carry the properties for composition cif maps over to multiplication of
matrices.
T~eorem4.18 Let TEHom("Y.) and AEA'inxn be the matrix of T
w¡th respect to the standard basis for "Y., i.e., A = cp(T). Then A is non-
Theorem 4.17 Matrix multiplication is associative and distributive over singular if and only if T is nonsingular.
addition.
Corollary 4.19 An n x n matrix A 1s nol?.!fngular if an only if
That is,
IAI #O.
A(BC) = (AB)C, A(B + C) = AB + AC, (A + B)C = AC + BC
These two facts are easily proved and left as problems.
hold for all matrices A, B, and C for which the expressions are defined.
Example 4 Show that
Proof Consider one of the distributive laws. Suppose

A E J!tnxm•

Then there exist maps TE Hom("Y m• "Y.) and S, U E Hom("Y P' "Y m) such is nonsingular and find A-l.
182 4. LINEAR TRANSFORMATIONS §6. Matrix Multiplication 183

We have that JAI = 1, so A is nonsingular and there exist scalars a, b, e, d Consider the matrix B = (b;) obtained from A by replacing its kth row
such that with its hth row. (This is not an elementary row operation.) Then bli = áii
when i # k and bki = ahi> for all j. Since B and A differ only in their kth

(2 1)(a b) I (101o).
53ed
= 2
= rows, the cofactors of corresponding elements from their kth rows are equal.
That is, Bki = Aki' for 1 s j s n. Therefore, expanding JBI along the kth
· row yields,
This matrix equation yields four linear equations which have the unique
n n
solution a = 3, b = -1, e = - 5, and d = 2. One can verify that if
IBI L bkiBki =j=1
=j=1 L ahiAki·
This gives the first equation, for B has two equal rows, and JBJ = O.
The second equality contains the corresponding result for columns and
then AB = BA = I 2 and B = A - 1 . This is nota good way to find the inverse it may be obtained similarly.
of any matrix. In particular, we will see that the inverse of a 2 x 2 matrix A
can be written down directly from A. Using Theorem 4.21, we see that the sum of any row in A times a row
of cofactors is either O or JAJ. 1)erefore using a matrix with the rows of
The definition of A - 1 requires that AB = In and BA = In, but it is not cofactors written as columns yields the following:
necessary to check both products. Problem 15 on page 175 yields the fol-
lowing result for matrices.

Theorem 4.20 Let A, BE A nX n· If AB = In or BA = In> then A is (""


a2l

anl
a12
a22

an2
"')C" A,.
a2n

ann
Al2

Atn
A22

A2n
A,)
An2

Ann
nonsingular and B = A - 1•
o
The amount of computation needed to invert an n x n nonsingular
matrix increases an n increases, and although there is generally no quick way
to find an inverse, there are two general techniques that have several useful
~e IAI

o
o)
0

IAI
= IAIIn.

consequences. The first technique results from the following fact.


So when IAI #O, the matrix of cofa:ctors can be used to obtain A- 1 •

Theorem 4.21 Suppose A = (a;) is an n x n matrix with Aii the


Definition Let A = (a;) be an n x n matrix. The adjoint (matrix) of
cofactor of aii. If h # k, then A, denoted A•di, is the transpose of the cofactor matrix (A;). That is, A•dJ
n = (A;)T and the element in the ith row andjth column of A•dJ is the cofactor
and L a;hAik =O. of the element in the jth row and ith column of A.
i=1 .

(The first equation shows that the sum of the elements from the hth row Now the above matrix equation can be rewritten as AA•dJ = IAIIn.
Therefore if IAI # O, we can divide by IAI to obtain
times the cofactors from the kth row is zero, if h # k. Recall that if h = k,
then LJ= 1 ahiAhi = det A.)
or
Proof The proof consists in showing that LJ= 1 ahiAki is the
expansion of a determinant with two equal rows. This is the adjoint formula for finding the inverse of a matrix.
4. LINEAR TRANSFORMATIONS §6. Matrix Multiplication 185
184

Example 5. Consider the matrix The adjoint formula for the in verse of a matrix can be applied to certain
systems of linear equations. Suppose AX = B is a consistent system of n
independent linear equations in n unknowns. Then IAI #O and A- 1 exists.
A_
-
(25 1)
3 Therefore, we can multiply the equation AX = B by A - 1 to obtain X = A - 1B.
This is the unique solution for
from Example 4. Since IAI = 1,

A- 1 = A adi = (
-5
3 -1)2 . The solution X= A -.lB is the matrix form of Cramer's rule.

In general, for 2 x 2 matrices Theorem 4.22 (Cramer's rule) Let AX = B be a consistent sys-
tem of n independent linear equations in n unknowns with A = (a¡),
a b)adi = (
(e d -e
d -b),
a
X= (x¡), and B = (b). Then the unique solution is given by

au al,i-1 b¡ al,i+l aln


so if
a21 az,¡-1 b2 a2,i+ 1 a2n

anl an,j-1 bn an,j+ 1 ann


X¡= for 1 ::;j::; n.
IAI
is nonsingular, then
That is, X¡ is the ratio of two determinants with the determinant in the nu-
a b)-l 1 ( d -b)
merator obtained from IAI by replacing the jth column with the column of
(e d = ad - be -e a · constant terms, B.

Asan example, Proof The proof simply consists in showing that the jth entry in the
matrix solution X = A-lB has the required form.

(-22 3)-l
7
= _!_(7 -3) = (7/20 -3/20)
20 2 2 1/10 1/10 .

Example 6 Find the inverse of the matrix

l.
A=240.
2 3) Therefore
(1 2 1
We have

-1
5
-12)
6 =
( 2/3 5/6
-1/3 -1/6 -2)
1 .
for eachj, but b1A 1¡ + b2A 2¡ + · · · + bnAn¡ is the expansion, along thejth
column, of the determinant of the matrix obtained from A by replacing the
-3 6 o -1/2 1 jth column with B. This is the desired expression for x¡.
186 4. LINEAR TRANSFORMATIONS §7. Elementary Matrices 187

Example 7 Use Cramer's rule to solve the following system for 5. a. Prove Theorem 4.18.
x,y, andz: b. Prove Corollary 4.19.
c. Prove Theorem 4.20.
2x+y-z=3 6. Use Cramer's rule to solve tlie following systems of equations:
x + 3y + 2z = 1 a. 2x- 3y = 5 b. 3x- 5y- 2z = 1 c. 3a + 2c = 5
X + 2y = 4. X +y - Z = 4 4b + 5d = 1
x + 2y + 2z = 2. - x + 2y + z = 5. a - 2c + d = O
4b +Se= 5.
. The determinant ofthe coefficient matrix is 5, therefore the equations are 7. B = {(1, -2, -2), (1, -1, 1), (2, -3, O)} is a basis for "Y 3 • If the vector
independent and Cramer's rule can be applied: (a, b, e) has coordina tes x, y, z with respect to B, then they satisfy the following
system of linear equations:
3 1 -1 2 3 -1 2 3 x +y+ 2z =a, -2x- y- 3z = b, -2x +y =c.
1 3 2 1 1 2 3 1 Use the matrix form of Cramer's rule to find the coordinates of
2 2 2 1 2 2 2 2 a. (1, O, 0), b. (1, O, 3), c. (1, 1, 8),. d. (4, -6, 0),
X= Y= Z=
5 5 5 with respect to the basis B.
8. Let B = {(4, 1, 4), ( -3, 5, 2); (3, 2, 4)} and use the matrix form of Cramer's
This gives x = 12/5, y = -1, and z = 4/5. rule to find the coordinates with respect to B of a. (4, -1, 2).
b. (- 3, 5, 2). c. (0, 1' 0). d. (0, o, 2).
9. Let T(a, b) = (2a- 3b, 3b- 3a). If A is the matrix of T with respect to the
Problems standard basis for .Y2 , then A - 1 is the matrix of r- 1 with respect to the
standard basis. Find A and A - 1 and use A - 1 to find a. r- 1(0, 1).
l. Compute the following matrix products: b. r- 1 (-5, 6). c. r- 1 (1, 3). d. r- 1 (x,y).
a. (i3 -1)(1 4)
5 3 o. b. (2 1 3)( -¡). c.
-i
( 4 -1)(2 1)
~ 5 1. 10. Let T(a, b, e)= (2a + b + 3c, 2a.+ 2b +e, 4a + 2b + 4c) and use the same
technique as in problem 9 to find a. r- 1(0, o, 8). b. r- 1(9, 1, 12).
c. r- (x, y, z).
1
d. ( -i)(4 2 - 3). e. (64 6
9) ( -23 -6)
4 · f. (2
o 41 25) ( -16 -34 o.
3)
110 3 22
2. Let S(x, y) = (2x- 3y, x + 2y, y- 3x), T(x, y, z) = (x + y+ z, 2x- y
+ 3z) and suppose As, ATare the matrices of S and T with respect to the §7. Elementary Matrices
standard bases for "Y 2 and "Y 3 • Show that the product ATAs is the matrix of
ToS with respect to the standard bases.
The second useful technique for finding the inverse of a matrix is based
3. Use the adjoint formula to find the inverse of:
on the fact that every nonsingular n x n matrix is row-equivalent to In. That
a. (i ~). d. (~ l D· is, the n x n identity matrix is the echelon form for an n x n matrix ofrank n.
It is now possible to represent the elementary row operations that transform
(~ g ?). 1 2 o) a matrix to its echelon form with multiplication by special matrices called
e.
o 1 o
f.
(o 1 4.
3 o 1
"elementary matrices." Multiplication on the Ieft by an elementary matrix
will give an elementary row operation, while multiplication on the right
4. a. Prove that the product of two n x n nonsingular matrices is nonsingular. yields an elementary column operation. There are three basic types of ele-
b. Suppose At. A 2 , ••• , Ak are n x n nonsingular matrices. Show that elementary matrices, corresponding to the three types of elementary opera-
A1A2 • • • Ak is nonsingular and
tions, and each is obtained from In with an elementary row operation.
(A1A2 · · · Ak)- 1 = A;; 1A;;! 1 .,r. Ai 1 A¡ 1 • An ~ x n e/ementary matrix oftype I, Eh, k> is obtained by interchanging
4. liNEAR TRANSFORMATIONS §7. Elementary Matrices 189
188

the hth and kth rows of I•. Jf A is an n x m matrix, then multiplying A on Theorem 4.23 Elementary matrices are nonsingular, and the inverse
the left by Eh,k interchanges the hth and kth rows of A. of an elementary matrix is an elementary matrix of the s¡tme type.

Examp/e 1 Proof One need only check that

o
E,,,G ~) ~ (~ o
1 !)G ~) ~ Gil The notations used above for the various elementary matrices are useful
in defining the three types, but it.is generally not necessary to be so explicit
o
E,.{! ~)~ (l ~)(! ~)~(~
2 5 2 5 4 2 in representing an elementary matrix. For example, suppose A is an n x n
o
ü
7 1 1 7 7 J' nonsingular matrix. Then A is row-equivalent to 111 and there is a sequence of
4 2 o o 4 2 2 5 elementary row operations that transforms A to 1•. This can be expressed as
1 4 o o 4 4 EP· · ·E2 E 1 A = I. where E 1, E 2 , ••• , EP are the elementary matrices that
perform the required elementary row operations. In this case, it is neither
An n x n elementary matrix of lype JI, Ek), is obtained from I. by important nor useful to specify exactly which elementary matrices are being
multiplying the hth row by r, a nonzero scalar. Multiplication of a matrix on used. This example also provides a method for obtaining the inverse of A.
the left by an elementary matrix of type JI multiplies a row of the matrix by For if EP · · · E2 E 1 A = I., then the product EP · · · E2 E 1 is the inverse of A.
a nonzero scalar. Therefore

Examp/e 2

That is, the inverse of A can be obtained from ! 11 by performing the same
E1(1/2)(; 4 8) = ('/2
5 3 o n(; 4 8) (' 2 4)
53=453" sequence of elementary row operations on I. that must be applied to A in
order to transform A to !,. In practice it is not necessary to keep trae k of the
o
!) ~ (~ o ~)(! 1)~(-l
2 2 2
E,( -3)(! -2
5
-3 -2
5
6
5 -i) elementary matrices E 1 , • •• , EP, for the elementary row operations can be
performed simultaneously on A and /,.

An n x n elementary matrix of type lll, Eh,k(r), is obtained from I. by Examp/e 4 Use elementary row operations to find the inverse of
adding r times the hth row to the kth row. Multiplication of a matrix on the
Jeft by an elementary matrix of type III performs an elementary row opera-
tion of type III.
the matrix of Examples 4 and 5 in the previous section.
Examp/e 3 Place A and ! 2 together and transform A to ! 2 with elementary row
operations.

~ !) -~ n(~ ~ ~) G_¡ ~).


= ( = 1 1
~)-~~·G
1/2 1/2
~)~(¿
1/2 1/2
~)
G 3 o 3 o 1/2 -5/2

7)o (~o o~
= ~ ~ 7) (~ -~ ~ =
-¡¡4(~ 1/2 1/2
~)~(~
o 3 -1)
2 .
3 1 o 1 3 1 -5 -5
190 4. LINEAR TRANSFORMATIONS §7. Elementary Matrices 191

Therefore matrices. This expression can be rewritten in the form A = (EP · · · E 2 E 1)- 1 ,
and using prob1em 4 on page 186 we have A = E! 1 E2 1 · · • EP- 1 • Since the

(25 31) -l =(
-5
3 -1)2 . inverse of an elementary matrix is itself an elementary matrix, we have pro ved
the following.

This agrees with our previous results. Comparing the three ways in which
Theorem 4.24 Every nonsingular matrix can be expressed as a product
A -t has been found should make it clear that the adjoint formula provides
of elementary matrices.
the simplest way to obtain the inverse of a 2 x 2 matrix.
~.
The importan ce of this theorem líes in the existence of such a representa-
For larger matrices the use of elementary row operations can be simpler
tion. However, it is possible to find an explicit representation for a non-
than the adjoint method. However, it is much easier to locate and correct
singular matrix A by keeping track of the elementary matrices u sed to trans-
errors in the computation of the adjoint matrix than in a sequence of ele-
form A to 1•. This also provides a good exercise in the use of elementary
mentary row operations.
matrices.

Examp/e 5 Use elementary row operations to find


Example 6 Express

r121 2r 4

We have as a product of elementary matrices.

3 2 3o o o) ( 2 o o o)
1 o
2
~H~ 4 2 o o 1 o o1 23 o1 -2o o1
1 1 1
o o 1 1.0 OutO
G 4 2 o o
Therefore
2 o o 2 o o 1
.v(~ o o -1 1/2 o o 1 o -13 -3~2)
1 1 o o)
o (
lit o 1 o
3 1
1
1/2
1

(~ n = (~ 2
~r~cb ~r~(~ 1~3r~(~ -~r~
o o -2 -5
w(~ 1 o 1 3 -3!2)o o -1 1/2
= (~ ~)(~ nG ~)G ~).
So Is this expression unique?

r rc
1
1

2 4
2 o2 =
2 -53
1
o -1 -3)2
1/2 )·
This technique for finding an inverse is based on the fact that if A ís
If a matrix A is multiplied on the right by an elementary matrix, the
result is equivalent to performing an elementary column operation. Multiply-
ing A on the right by Eh,k interchanges the hth and kth columns of A, mul~
tiplication by Eh(r) on the righLmultiplies the hth column of A by r, and mul-
tiplication by Eh,k(r) on the right adds r times the kth column to the hth
nonsingular, then A -t = EP · • · E2 E1 for so me sequence of elementary column (note the change in ordt1f}.
192 ~LINEAR TRANSFORMATIONS §7. Elementary Matrices 193

Example 7 Let Theorem 4.26 If A and B aren x nmatrices, then IABI = IAIIBI.

o
A~(~ ~}
1
2 2 Proof Case l. Suppose Bis nonsingular. Then B = E1 E2 • • ·Eq
3 5 for sorne sequence of elementary matrices E 1 , ••• , Eq, and the proof is by
induction on q.

roo:)
then When q = 1, B = E 1 and IABI = IAIIBI by the lemma.
Assume IABI = IAIIBI for any matrix B which is the product of q...,.. 1
1 o

:)(~
o
AE,,, ~ G32
1
2
o o o =
o 1 o
o) 2 5 2
elementary matrices, q ~ 2. Then

5 3 2 5 IABI = IAE¡·. ·Eq-lEql = I(AE¡·. ·Eq-¡)Eql


o o 1
:·í.
= IAE1 • • ·Eq-d IEql By the lemma
o o
~ G ~)(~
:~

~)~G
o o .'f

~)
1 1
1 o = IAIIEt .. ·Eq-d IEql By assumption
AE,(l/2) 2 2 2 1 "'
3 5
o 1/2 3 5/2 = IAII(E¡" ·Eq-¡)Eql By the lemma
o o
= IAJIBI.
o o

~)(~ -~) ~ (l o o)
o
(~
1
1 o
AE4 ,z( -5) = 2 2
o 1 2 2 -4 . Case 2. Suppose Bis singular. Let S and T be the maps in Hom(f.)
,2 3 5 3 5 -6 which have A and B as their matrices with respect to the standard basis. B is
o o singular, therefore Tis singular which implies that S o Tis singular. [If T(U)
The fact that a nonsingular matrix can be expressed as a product of = T(V), then S o T(U) =S o T(V).] Therefore AB, the matrix of So Twith
elemenatry matrices is used to prove that the determinan t. of a pr~duct is the respect to the standard basis, is singular. Thus IABI = O, but B is singular,
product of the determinants. But first we need the followmg spec1al case. so IBI = O, and we have IABI = IAIIBI.
Since every n x n matrix is either singular or nonsingular, the proof is
complete.
Lemma 4.25 If A is an n x n matrix and E is an n x n elementary
matrix, then IAEI = IAIIEI = lEAl.
Corollary 4.27 If A is nonsingular, then IA- 1 1 = 1/IAI.
Proof First since an elementary matrix is obtained from I. with a
single elementary row operation, the properties of determinants give Corollary 4.28 If A and B are n x n matrices such that the rank of
AB is n, then A and B each have rank n. -..._,

The proofs for these corollaries are left to the reader.


Then AE and EA are obtained from A with a single column and row operation
The second corollary is a special case of the following theorem.
so
IAEh,kl = -IAI = IEh,kAI,
"";
Theorem 4.29 If A and B are matrices for which AB is defined, then
IAElr)l = riAI = IEh(r)AI, rank AB.::;; rank A and rank AB::;; rank B.
IAEh,k(r)l = IAI = IEh,k(r)AI.
Rather than prove this theorem directly, it is easier to prove the cor-
Combining these two sets of equalities proves that IAEI = IAIIEI = lEAl. responding result for linear transformations.
4. liNEAR TRANSFORMATIONS §7. Elementary Matrices 195
194

Theorem 4.30 Let S and Tbe linear maps defined on finite-dimensional 4. Use elementary row operations to find the inverse of:
vector spaces for which S o T is defined, then rank S o T ::; rank T and
rank S o T ::; rank S.
a.
3 1 4)
(21 O5 o2 . b. (~ ¡). c. (~ o~ ~ !).o
3 6
d. (~ ~
2 5 3
g).
5. We know that if A, BE ,({nxn and AB =In, then BA =In. What is wrong
Proof Suppose { U1 , ••• , U.} is a basis for the domain of T, and with the following "proof" of this fact?
rank T = k. Then dim .'l'{T(U1 ), • •• , T(U.)} = k, and since a linear map Since AB = I., (AB)A = InA = A = Ain. Therefore A(BA) - Ain = O or
cannq_t increase dimension, rank S o T = dim .'l'{S o T(U 1), • •• , S o T(U.)} A(BA - In) = O. But A :/= O since AB = In :/= O, therefore BA - In = O and
cannot exceed k. That is, rank S o T ::; rank T. On the other hand, BA =In.
--{S o T(U¡), ... , S o T(U.)} spans f S·T and is a subset of f s· Therefore 6. Len;--B ~ ,({n x n· Prove that if AB = Oand A is nonsingular, then B = O.
dim f s.T ::; dim f s or rank S o T ::; rank S. 7. a. Prove that if A is row-equivalent to B, then there exists a nonsingular
matrix Q such that B = QA.
Now Theorem 4.29 for matrices follows from Theorem 4.30 using the b. Show that Q is unique if A is n x n and nonsingular.
c. Show by example that Q need not be unique if A is n x n and singular.
isomorphism between linear transformations and matrices.
8. Find a nonsingular matrix Q such that B = QA when

a. A= (f j ~), (ó ? -~). b
B = A (~3 1)2 B (ó ?)
Problems
c. A=(~ ~ 1), B = (b ? -~~). ~ , ~ O O.

l. Let A = (~ j), B = (! ~), and C = (~ ! ~ ~)· Find the elementary


d. A = (f
4 6 -6
~), B = (b ?) .
o o o

matrix that performs each of the following operations and use it to carry out
the operation:
a. Add -5/2 times the first row of A to the second row.
9. Let A - - (1¡ :2 bo) and B = (o1 12 -_ 3)4 . Fmd
. the elementary matrix
b. Interchange the first and second rows of B. which performs the following elementary column operations and perform the
c. Divide the first row of A by 2. operatwn by multiplication: a. Divide the second column of A by 2.
d. Interchange the second and third rows of C. b. Subtract twice the first column of A from the second. c. Interchange
e. Subtract twice the first row of C from the second row. the first and second columns of B. d. Multiply the second column of B
f. Add -3 times the secm;td row of B to the third row. by 3 and add it to the third column. e. Interchange the second and third
g. Divide the third row of C by 4. columns of A.
2. a. Find the elementary matrices E1 used to perform the following sequence 10. Suppose B = AP, A, B, and P matrices with P nonsingular. Prove that A is
of elementary row operations: column-equivalent to B.
11. Find a nonsingular matrix P such that B = AP when
(~ ~)~(1 ~)~(ó -~)~(ó i)~(ó ?).
b. Compute the product E 4 E3E2E 1 and check that it is 5· 9(2 6) -l
.
a. A=(~ ~ ~), B =(b ? g). b. A=(! ~). B =(~ -6?).
c. Find the inverses of E¡, E2, E 3, and E4. 12. Find nonsingular matrices Q and p. such that B = QAP when
d. Compute the product E"i. 1 E2 1 Ei 1 E¡ 1 •
a. .
A as m 8a, B = (10 o o)
1 0 . b. .
A as m Se, B = (1g o
bgo) .
3. Write the following matrices as products of elementary matrices:

a. (~ _ 1 ~). b. (~ ¿). c. (~ ! !)· d. ·(? 1~)· c. A as in 11 b, B = (z ~).


4. LINEAR TRANSFORMATIONS Review Problems 197
196

13. Prove that if A is nonsingular, then IA- 1 1 = 1/IAI. 8. Let A be an n X n matrix such that A 2 = A, prove that either A = In orA is
singular. Find a 2 X 2 matrix A with all nonZero en tries such that A 2 = A.
14. Prove that rank AB ::;; rank A using the corresponding result for linear ~rans­
formations. 9. Suppose T(a, b, e)= (2a + 3b+ e, 6a + 9b + 3e, 5a- 3b +e).
15. Use rank to prove that if A, BE Al" x" and AB = l., then A is nonsingular. a. Show that the image space· J r is a plane. Use elementary row operations
on a matrix with J ras row space to find a Cartesian equation for J T·
b. Use Gaussian elimination to show that the null space .#'r is a line.
c. Show that any line parallel to .#' r is sent to a single point by T.

Review Problems d. Find bases for f 3 with respect to which (bo o~ g)o is the matrix for
T.
2 3) .
Let ~ 1 be the matrix of TE Hom(f 2• f (g 1~ g)
l.
( 3) with respect to the bases e. Find bases for f 3 with respect to which is the matrix for
{(0, 1), (1, 3)} and {(-2, 1, 0), (3, 1, -4), (0, -2, 3)} and find: a. T(2, 5). T.
b. T(l, 3). c. T(a: b).
2. Find the linear map T: f f which has 1. as its matrix with respect to
10. Suppose TE Hom(f). Prove that .#'r nJr = {O} if and only if .#'r2 =
3 - 3
.AI'r.
B1 and B2 if:
a. Bt =B2. 11. Define T: f 3 - f 3 by T(a, b, e) = (a+ b, b + e, a- e).
b. B1 = {(1, O, 1), (1, 1, 0), (0, O, 1) }, B2 = {(2, -3, 5), (0, 4, 2), ( -4, 3, 1) }. a. Show that T- 1 does not exist.
3. Prove the following properties of isomorphism for any vector spaces f, "/f', b. Find T- 1 [.'1'] if !7 = {(x, y, z)ix- y+ z = 0} and show that
T[T- 1 [.9"]] f= !7. (The complete inverse image of a set is defined in
and il/t:
problem 7, page 145.)
a. -r=-r.
b. If f =
"/f', then 1f' = f .
=
c. Iff 1f' and 1f' lllt, then f = lllt. =
c. Find T- 1 [5] if !T = {(x, y, z)i3x- 4y + z = 0}.
d. Find T- 1 [{(x, y, z)ix- y- z = 0}].

4. Define TE HomW 2) by T(a, b) = (2a- 3b, b- a). 12. For which values of k does T fail.to be an isomorphism?
a. Find the matrix A of T with respect to the standard basis. a. TE Hom(f 2 ); T(a, b) = (3a + b, ka- 4b).
b. Prove that there do not exist bases for f 2 with respect to which (Ó g) b. TE Hom('r 2); T(a, b) = (2a- kb. ka+ b).
c. TE Hom('r 3 ); T(a, b, e) =(ka+ 3e, 2a- kb, 3e- kb).

c.
is the matrix for T.
Find bases for 1'" 2 with respect to which
T.
(_j -v is the matrix for 13. The definition of determinant was motivated by showing that it gives area in
the plane and should give volume in 3-space. Relate this geometric idea with
the fact that S E Hom( g 2) or S E Hom( g 3) is singular if and only if any matrix
5. Suppose f and 1f' are finite-dimensional vector spaces. Prove thak.íl{ = "/Y for S has zero determinant.
if and only if dim f = dim "/Y.
14. For the given matrix A, show that sorne matrix in the sequence 1 A A 2
6. Let T(a, b, e) = (2a- 3e, b + 4e, Sa + 3b). A3 , ••• is a linear combination of those preceding it. "' ' '
a. Express .#' r as the span of independent vectors.

b. Find bases B1 and B2 for 1'" 3 with respect to which ( -! oo o)o


1 O is the a. A= (Ó j). b. A= G ~). c. A=(~=~~).
o o 2

7.
matrix of T.
Let T(a, b, e, d) = (a + 3b + d, 2b - e + 2d, 2a + 3e - 4d, 2a + 4b + e).
d. A=(~ -~ ~).
1 1 -1
e. A = (~ =1j). f. A~~ oo. oo o)
1 o o.
o 1 o
o
Find bases for f 4 with respect to which A is the matrix of T when

(80 0b0~ 0~)· b~ ~)·


15. Suppose A E vltnxn· Show that there exists an integer k::::;; n 2 and scalars
a. A = b. A = (?
0100
ao, a¡, ... , ak such that
aoln + a1A + a2A 2 + ... + akAk = O.
198 4. LINEAR TRANSFORMATIONS

16. a. Suppose TE Hom(1'") and the dimension of 1'" is n. Restate problem 15 in


terms of T.
b. Find two different proofs for this fact.
17. Suppose 1'" is a three-dimensional vector space, not necessarily 1'" 3, and ifí is
a vector space of infinite dimension. Show that there is a one to one, linear map
from 1'" to ifí.
18. Associate with each ordered pair of complex numbers (z, w) the 2 x 2 matrix
~ ~),
( -w z
where the bar denotes the complex conjuga te (defined on page 277).
Use this association to define the "quaternionic multiplication"
(z, w)(u, v) = (zu - wii, zv + wü).
a. Show that the definition of this product follows from a matrix product.
b. Use Theorem 4.17 to show that this multiplication is associative.
c. Show by example that quaternionic multiplication is not commutative.
19. a. Suppose A is n x n and B is n x m. Show that if A is nonsingular, then
rank AB = rank B.
b. Suppose C is n x m and D is m x m. Show that if D is nonsingular, then
rank CD = rank C.
c. Is it true that if rank E ::;; rank F, then rank EF ¿ rank E?

Change of Basis

§1. Change of Coordinates


§2. Change in the Matrix of a Map
§3. Equivalence Relations
§4. Similarity
§5. lnvariants for Similarity
200 5. CHANGE OF BASIS §1. Change in Coordinates 201

In this chapter we will restrict our attention to finite-dimensional vector and since {U1 , ••• , U.} is linearly independent, the coefficient of U; on the
spaces. Therefore, given a basis, each vector can be associated with, an left equals the coefficient of U; on the right. For example, the terms involving
ordered n-tuple of coordinates. Since coordinates do not depend ori the U 1 on the right are xíp 11 U 1 + x~p 12 U 1 + · · · + x~p 1 .U 1 , and this sum
vector alone, they .reflect in part the choice of basis. The situation is similar equals (L}= 1 p 1 ixj)U~o so that the equation becomes
to the meas ure of slope in analytic geometry, where the slope of a line depends
on the position of the coordinate axes. In order to determine the influence
of a choice of basis, we must examine how different choices affect the co-
ordinates of vectors and the matrices of maps.
~ pnJ.x~)
+ .. · + ( i..J J Un·
j=l

§1. Change in Coordinates Thus

n n n
Let B be a basis for the vector space "Y, with dim "Y = n. Then each X¡= LPtixj,
j=l
Xz = j=l
LPziXj, ... , Xn -- 1....
"' PnjXj•1
j=l
vector V E .Y has coordinates with respect to B, say V: (xt> . .. , x.h. If a
new basis B' is chosen for "Y, and V: (x~, ... , x~h·, then the problem is to The general relationship between the new and old coordinates,
determine how the new coordinates x; are related to the old coordinates X;.
Suppose B = {U 1, ••• , U.} and B' = {U;, ... , U~}, then n
X¡= LPiiXJ
j=l
X¡ U¡ + .. : + x.u. = V= XÍ u; + ... + x~U~.
should suggest the matrix product
So a relationship between the two sets o( coordinates can be obtained by
expressing the vectors from one basis in terms of the other basis. Therefore,
suppose that for eachj, l ~ j ~ n,
(l)
n
u;. = Piiui + ... + Pnjun = LPijui.
i= 1
If the column of coordinates for V with respect to B is denoted by X and the
That is, u;: (p!j• . .. 'Pn)B· The n X n matrix p = (p;) will be called the column of coordinates for V with respect to B' by X', then the matrix equa-
transition matrix from the basis B to the basis B'. Notice that the jth column tion (1) becomes X= PX'. That is, the product of the transition matrix P
of the transition matrix P contains the coordinates of Uj with respect to B. from B to B' times the coordina tes of a vector V with respect to B' yields the
The scalars Pii could be indexed in other ways, but this notation will conform coordinates of V with respect to B.
with the notation for matrices of a linear transformation.
Now the equation Examp/e 1 B = {(2, 1), (1, O)} and B' = {(0, 1), (1, 4)} are bases
for "Y 2 • By inspection (0, 1): (1, -2) 8 and (1, 4): (4, -7h, therefore the
x 1 U1 + .. · + x.u. = x~u; + ... + x~U~ transition matrix from B to B' is

beco mes

x 1 U1 + · · · + x.u. = x~(.tp¡¡U;) + ... + x~(.tP;.u;).


¡= 1 t-l To check Eq. (1) for this change in coordinates, consider a particular vector
202 5. CHANGE OF BASIS §1. Change in Coordinates 203

from "f" 2 , say (3, - 5). Since (3, - 5): (- 5, 13)a and (3, - 5): ( -17, 3)a., y

and __. (4, 3)8


_.- -- -- 1 (1, 1)8 .
1
Therefore (0, 2)8 1
1
(- 2 6)
s•s B' 1
PX' =( 1
-2
4)(-17)
-7 3
= (-5)
13
=X
~

as desired.

It might be helpful to work through the derivation of the formula (3, 0)8
X= PX' for a particular example, see problem 4 at the end of this section. (2 -
s1)B' (6 -
S' s• s3)B'
Figure 1
Examp/e 2 Suppose the problem is to find the coordinates of several
vectors in "f" 2 with respect to · the basis B = {(3, 4), (2, 2)}. Rather than
o::>tain the coordinates of each vector separately, the transition matrix from Then using our interpretation of vectors as points in the Cartesian plane E 2 ,
B to {E;} can be u sed to obtain coordinates with respect to B from the co- the two sets of coordinates give two names for each point in the plane. This
ordinates with respect to {E;}, i.e., the components. In this case the standard is shown in Figure 1 where the two sets of coordinate names are listed for a
basis is viewed as the new basis and the coordinates of E; with respect to B few points. The vectors represented by the points are not indicated, but their
are found to be (1, O): ( -1, 2)a and (0, 1): (1, -3/2)B. Therefore the co- components are the same as the coordinates with respect to the standard
o::dinates of an arbitrary vector (a, b) with respect to B are given by the basis B. The familiar x and y axes of the Cartesian coordinate system re-
product present the subspaces spanned by (1, O) and (0, 1), respectively. But the two
lines representing the span of each vector in B' can also be viewed as coor-
dina te axes, the x' axis representing .P{(3, 1)} and the y' axis representing
.P{(l, 2)}. These two axes define a non-Cartesian coordinate system in E 2
corresponding to B'. It is not diffi.cult to plot points in such a coordinate
Examp!e 3 Find the transition matrix P from the basis B system using the parallelogram rule for addition of vectors. For example, if
= {t + 1, 2t 2 , t - l} to B' = {4t 2 - 6t, 2t 2 - 2, 4t} for the space R 3 [t]. A is the point with coordinates (1/2, -1) in the x'y' system, then it is the
A little computation shows that 4t 2 - 6t :(- 3, 2, - 3)B, point representing the vector f(3, 1) + (-1)(1, 2). This point is plotted in
2
2t - 2:( -1, l, 1)a, and 4t: (2, O, 2)a. Therefore Figure 2 where the ordered pairs denote vectors, not coordinates. This x'y'
coordinate system looks strange because of our familarity with Euclidean

(-3 -1 2)
geometry, but most of the bases we have considered for "f" 2 would yield
P= 2 1 O. similar non-Cartesian coordinate systems in E 2 •
-3 1 2 The second way to view a change of basis is as the linear transformation
which sends the old basis to the new basis. With B = {U1, ••• , Un} and
A change of basis can be viewed either as a change in the names (co- B' ={U{, ... , U~}, there is a linear transformationfrom "f" to itselfwhich
ordinates) for vectors or as a transformation. The first view was used in sends U; to u¡ for each i. This map is defined by
de-riving the formula X = PX' and can be illustrated geometrically with the
space "f" 2 • Suppose Bis the standard basis for "f" 2 and B' = {(3, 1), (1, 2)}. T(x 1 U1 + · · · ~PJ = x 1 U~ + · · · + xp~.
204 5. CHANGE OF BASIS §1. Change in Coordinates 205

'i:

1
1
1
1
1
/
,_JA
/
/
/

Figure 2

Therefore T sends the vector having coordinates x 1, ••• , x. with respect to


B to the vector having the same coordinates with respect to B'. Consider the
above example in "Y' 2 • In this case, the map T is given by

T(a, b) = T(a(l, O) + b(O, 1)) = a(3, 1) + b(l, 2) = (3a + b, a + 2b).

Since T sends "Y' 2 to itself, it can be represented as a map from the plane to
the plane, as in Figure 3. T sends the x axis to the x' axis and the y axis to the
y' axis, so the xy coordinate system is used in the domain of T and the x'y'
system in the codomain. When viewed as a map, this change of basis for "1' 2 "
is again seen to be too general for Euclidean geometry, for not only isn't the '-'
,.-..
Cartesian coordinate system sent to a Cartesian system, but the unit square "
e, e o
with vertices (0, 0), (1, 0), (0, 1), (1, 1) is sent to the parallelogram with vertices •• CQ
(0, 0), (3, 1), (1, 2), (4, 3). Therefore the map T, arising from this change in
basis, distorts Iength, angle, and area. This does not imply that this change
--
,.-.. ,.-..

in basis is geometrically useless, for it provides an acceptable coordinate


change in both affine geometry and projective geometry.
Now to bring the two views for a change of basis together, suppose
B = {U1 , ••• , u.} and B' = {U{, . .. , U~} with P the transition matrix
from B to B' and T the linear transformation given by T(U;) = U[. Then
206 5. CHANGE OF BASIS §1. Change in Coordinates 207

P is the matrix of T with respect to the basis B in both the domain and (<=) Let B = {V1 , ••. , Vn} and Jet Tbe the map in Hom("f/) that has A
codomain. This is immediate for if A is the matrix of T with respect .to B, as its matrix with respect to B. (Why does such a map exist?) Then A non-
then thejth column of A contains the coordinates of T(Uj) with respect to B. singular implies that T is nonsingular. Thus T(V1 ) = U 1 , ••• , T(Vn) = Un
But T( U) = Uj and the coordina tes of Uj with respect to B are in the jth are linearly independent. Since the dimension of "/' is n, S is a basis for "/'.
column of P, so A = P. This shows that the transition matrix P is non-
singular, for T is nonsingular in that it sends a basis to a basis.
Problems
Theorem 5.1 If Pis the transition matrix from B to B', then
1. Find the transition matrix from B to B' when
l. P is nonsingular. a."-73-= {(2, 3), (0, 1) }, B' = {(6, 4), (4, 8) }.
2. X' = p-t X. That is, coordina tes of vectors with respect. to B' are b. B = {(5, 1), (1, 2)}, B' = {(1, 0), (0, 1)}.
obtained from coordinates with respect to B u pon multiplication by p- 1• c. B = {(1, 1, 1), (1, 1, 0), (1, O, 0)},
3. p-t is the transition matrix from B' to B. B' = {(2, O, 3), (-1, 4, 1), (3, 2, 5)}.
d. B = {t, 1, t 2 }, B' = {3 + 2t + t 2 , t 2 - 4, 2 + t }.
e. B = {!, t 2 + 1, t 2 + t}, B' = {t 2 + 3t- 3, 4t 2 + t + 2, t 2 - 2t + 1 }.
Proof 1 has been obtained above. 2 is obtained from the equation f. B = {T¡, T 2 , T3}, B' = {S¡, Sz, S3}, with T¡(a, b, e)= 2a, Tz(a, b, e)=
P X' = X on multiplication by the matrix p-t. For 3, suppose Q is the transi- a- b, T 3 (a, b, e)= e, S 1 (a, b, e)= 3a - b + 4c, Sz(a, b, e)= 4b +Se,
tion matrix from B' to B. Then thejth column of Q contains the coordinates S 3 (a, b, e)= a + b +c. [That is, B and B' are bases for Hom(1" 3 ).]
of Uj (thejth basis vector in B) with respect to B'. By 2 these coordinates are
2. Use a transition matrix and coordinates with respect to the standard basis to
given by p- 1X where X contains the coordinates of Uj with respect to B. But
find the coordinates of the given vectors with respect to the basis B =
X= (O, ... , O, 1, O, ... , O?= EJ. Therefore the product p-tx = p-t Ej {(3, 1), (5, 2) }.
is the jth column of p-t. That is, Q = p-I as claimed. a. (4, 3). b. (1, 3). c. (-3, -2). d. (a, b).
3. Use a transition matrix and coordina tes with respect to {E1 } in 1" 3 to find the
Exarriple 4 Find the transition matrix P from B = {(3, - 2), (4,- 2)} coordinates of the given vectors with respect to the basis B = {(2, -2, 1),
ro the standard basis for "/'2 • (3, o, 2), (3, 5, 3) }.
The transition matrix from {E¡} to B is a. (1, 1, 1). b. (2, 3, 2). c. (7, 1, 5).
4. Work through the derivation of the formula X= PX' for the following exam-
ples:
a. 1" = 1"2 , B = {(2, 1), (1, 0)}, B' = {(0, 1), (1, 4)}.
b. 1" = R3[t], B = {1, t, t 2 }, B' = {1 + t 2 , 1 + t, t + t 2 }.
By Theorem 5.1, this is p-t, therefore 5. a. Prove that (AB)T = BTAT and (A- 1 )T = (Al)- 1 •
b. Show that the relationships between coordinates with respect to two
-2)
3/2 .
different bases can be written with the coordinates as row vectors to give
(X¡, . •. , Xn) = (x~, ... , x~)pT
and
Theorem 5.2 Suppose dim "/' = n, B is a basis for "/', and (x~, ... , x~) =(X¡, ... , Xn)(P- 1)T.

S = {U 1, ••• , Un} e "/'. Let A be the n x n matrix containing the coor- 6. For each basis B' below, make two sketches for the change from the standard
dinat~s of Uj with respect to B in its jth column, 1 :::; j :::; n. Then S is a basis basis (B) for 1"2 to B', representing the change first as a renaming of points and
for "/' if and only if A is nonsingular. second as a transformation from the plane to the plane. How do the trans-
formations affect the geometry of the plane?
a. B' = {(1, 1), (4, -1) }. b. B' = {(1, 0), (2, 1) }.
Proof ( =>) lf S is a basis, A is the transition matrix from B to S and
c. B' = {(2, O), (O, 2) }. d. B' = {(1/Vi', 1/V'í.), ( -1/V't, 1/V'í.) }.
therefore nonsingular. e. B'= {(-1,0),(0,1)}. f. B' = {(-1, 0), (0, -1)}.
208 5. CHANGE OF BASIS §2. Change in the Matrix of a Map 209

7. Suppose B1 and B2 are bases for a vector space "Y and A is the matrix of the and
identity map 1: "Y - "Y with respect to Bt in the domain and B 2 in the
codomain. Show that A is the transition matrix from B2 to Bt. ·
8. Prove that a transition matrix is nonsingular without reference to the linear
A'= (11 -37 -713)
map derived from the change in basis.
is the matrix of Twith respect to Bí and B~. The transltion matrix from B 1
to Bí is

§2. Change in the Matrix of a Map

If T i.s a linear transformation from "Y to '"lf", then each choice of bases
for "Y and 1f" yields a matrix for T. In order to determine how two such
matrices are related, it is only necessary to combine the relationships in volved. and the transition matrix from B 2 to Bí is
Suppose A is the matrix of T with respect to B 1 in "Y and B 2 in 1f" andA' is
the matrix of Twith respect to B~ and B~. Then A' should be related toA by
transition matrices. Therefore let P be the transition matrix from the basis
Q = (31 5)2 '
B 1 to Bí for "Y and let Q be the transition matrix from the basis B 2 to Bí
for '"lf". Now the matrices of a map and the transition matrices are used with with
coordinates, so for U E "Y Jet X and X' be the coordinates of U with respect
to B 1 and Bí, respectively. Then T(U) E "/Y, so Iet Yand Y' be the coordina tes
of T(U) with respect to B 2 and Bí, respectively. These coordinates are related
as follows:
Now the formula Q- 1 AP =A' can be verified for this example:
X=PX', Y= QY', Y=AX, Y'= A'X'.

These equations yield QY' = Y= AX = A(PX') or Y'= Q- 1(A(PX')).


But Y' = (Q- 1AP)X' states that Q- 1AP is the matrix of T with respect to
Bí and Bí orA' = Q- 1 AP.

Examp/e 1 Let TE Hom("Y 3 , "1'2 ) be given by T(~ e)=


(3a - e, 4b + 2e). Let Theorem 5.3 A and A' are matrices of sorne linear map from "Y to
"/Y if an only if there exist nonsingular matrices P and Q such that A' =
B1 = {(1, 1, 1), (1, 1, O), (1, O, 0)}, B2 = {(O, 1), (1, - 2)}, Q- 1 AP.
Bí = {(2, -1, 3), (1, O, 2), (0, 1, 1)}, Bí = {(1, 1), (2, 1)}.
Proof (=>) This has been obtained above.
Then the matrix of T with respect to B 1 and B2 is (<=) If A and A' are n x m matrices, then di m "Y = m and di m ifí = n.
Choose a basis B 1 = {V1 , •.• , Vm} for "Y anda basis B 2 = {W1 , .•• , W.}
A= (102 103 6)3 ' for "/Y. Then there exists a map TE Hom("Y, "/Y) which has A as its matrix
with respect to B 1 and B2 • We now use P and Q to construct new bases for
210 5. CHANGE OF BASIS §2. Change in the Matrix of a Map 211

"f'" and if!'. If P = (pij), then there exist vectors v;, ... , V~ such that to the standard basis in both the domain and codomain. Then T is given by

T(a, b, e) = (3a + b, 4e - 2b, a + 3e).


Let B; = {v;, ... , V~}. Then P is the matrix of coordinates of the vectors
from B; with respect to the basis B 1 and Pis nonsingular, so by Theorem 5.2, Since !Al io O, T is nonsingular and the set
B; is a basis for "f'". Further the transition matrix from B 1 to B; is P. We define
B~ = {W;, ... , W~} similarly with W;: (q 1 ¡, q2 ¡, . . . , q11 ¡) 82 for 1 ::::; i:::; n, B = {T(I, O, 0), T(O, 1, 0), T(O, O, l)} ·
where Q = (q¡). Then B~ is a basis for if!' and the transition matrix from
= {(3, O, 1), (l, ~2, 0), (0, 4, 3)}
B 2 to B~ is Q. Thus, by the first part of this theorem, the matrix of T with
respect to B; and B~ is Q - I AP, completing the proof.
is a basis for 1""3 . Now the matrix of Twith respect to the standard basis in
the domain and B in the codomain is / 3 , therefore A is equivalent to / 3 • (How
The matrix equation A' = Q- 1 AP expresses the relationship between
might this example be generalized to a theorem ?)
two matrices for a map. Examination of this relation would reveal properties
shared by all possible matrices for a given map. If such an examination
encompassed all possible choices of bases, then Q- 1 would represent an Equivalence of matrices is obtained by considering a general transforma-
arbitrary nonsingular matríx and could be denoted more simply by Q. tion and arbitrary choices of bases. A very important relationship is obtained
in the special case of a map from a vector space to itself if the matrices are
obtained using the same basis in both the domain and codomain. To see how
Definition Two n x m matrices A and B are equivalen! if there exist
this relation should be defined, suppose TE Hom("f'") has the matrix A with
nonsíngular matrices P and Q such that B = QAP.
respect to the basis B and the matrix A' with respect to B'. Let P be the
transition matrix from B to B'. Then with the same change of basis in both
Recall that in Chapter 3 we called two matrices equivalent if one could the domain and the codomain, the relation A' = Q- 1 AP becomes A'
be obtained from the other using elementary row and column operations. = p- 1 AP.
1t should be clear that these two definitions are equivalent, as stated in pro-
blem 3 at the end of this section.
Definition Two n x n matrices A and B are similar if there exists a
Using the term equivalent, Theorem 5.3 can be restated as follows: Two
nonsingular matrix P such that B = p-I AP.
matrices are equivalent if and only if they are the matrices for sorne map with
respect to two choices of bases in the domain and codomain. In Chapter 4
severa! examples and problems produced different matrices for the same Examp/e 3 Suppose Te Hom("f'" 3 ) is defined by
map and hence equivalent matrices. Examples 2 through 6 on pages 157
to 162 contain severa! pairs of equivalent matrices. T(a, b, e) = (2a + 4e, 3b, 4a + Se).

Example 2 Show that the 3 x 3 matrix Let B be the standard basis and let B' = { ( l, O, 2), (0, 2, 0), (2, 1, - 1)}. Show
that the matrix of Twith respect to Bis similar to the matrix of Twith respect
3 1
A= O -2 4
o) to B'.
A little computation shows that the matrix of T with respect to Bis
(1 o 3
is equivalent to the identity matrix / 3 • 2 o 4)
A= O 3 O
Consider the map Te Hom(f 3 ) which has A as its matrix with respect (
4 o 8
rz::
212 5. CHANGE OF BASIS §2. Change in the Matrix of a M~ 213

and that similarity will lead to many useful concepts. Before begining such a study, it
would be well to consider the general class of relations, called equivalence
relations, to which both similarity and equivalence belong.
A'=
toO o3 Oo)
(o o o
Problems
is the matrix of T with respect to B'. The transition matrix from B to B' is
l. Let T(a, b) = (2a - b, 3a - 4b, a + b), Br = {(1, 0), (0, 1) }, B; =
o {(2, 3), (1, 2) }, B 2 = {(1, O, 0), (0, 1, 0), (0, O, 1) }, and B~ = {(1, 1, 1), (1, O, 0),

P~ (~ 2
o 3) ~
2.
(1, O, 1)}. Find the matrices A, A', P, Q, and show that A'= Q- 1 AP.
LetT(a, b) = (4a- b, 3a + 2b), B = {(1, 1), (-1, 0)}, and B' = {(0, 1),
:~ ( -1, 3) }. Find the matrices A, A', P and show that A' = p-r AP.
and f.
3. Show that A is equivalen! to B if and only if B can be obtained from A by
elementary row and columil operations.
o 2/l)
p-I = ('' 2~5 o 1/2 o .
-1/5
4. a.
b.
Show that (~ i) is equivalen! to / 2•

What condition would insure that a 2 x 2 matrix is equivalen! to the


identity matrix / 2 ?
Therefore we have
5. Show that (J ~ ~) is equivalen! to (ó ~ g) by
o o o o
p-'AP ~
.
r'(¡
4
o
o ~)(~
3
o
2
o -1
o2) - ('''
o
2/5
o
1/2
o
2/l)('"
-1~5
o
2~ o ~)
6 a.
b.
2 1
using a linear transformation.
using the result of problem 3.
6. Suppose TE Hom('Y"") has A as its matrix with respect to B. Prove that if
o3 o)O =A'.
~n o o
7.
A' = p-r AP for sorne nonsingular matrix P, then A' is the matrix of Twith
respect to sorne basis for 1'".
Find all n x n matrices which are similar to 1•.
8. In what ways do the relations of equivalence and similarity differ?
Theorem 5.4 Two n x n matrices are similar if and only if they are
matrices for sorne linear transformation from a vector space to itself 9. a. Show that (~ Ó) is similar to (Ó _ ~).
( _ ~ Ó) is not similar to any matrix of the form (~ ~),
using the same basis in the domain and codomain.
b. Show that
where x and y are real numbers.
Proof Half of this theorem has been obtained abo ve and· the other
half follows as in the proof of Theorem 5.3. 10. Let A= Gi =i). B = (Ó ~). and P = (¡ ~).
a. Show that B = p-I AP.
b. Let Pr and P 2 denote the two columns of P. Show that AP 1 = 4P 1 and
Similarity is a special case of the relation equivalence, or similar matrices
APz = 6Pz.
are also equivalen!. Therefore, since the matrices A and A' of Example 3 are
similar, they are equivalent. However, equivalen! matrices need not be similar. 11. Suppose
For example the equivalen! matrices A and / 3 of Example 2 are not similar;
P -¡ 13 P = / 3 # A for all nonsingular matrices P. We will find that simi-
larity is a much more complex relation than equivalence; and the study of
214 5. CHANGE OF BASIS 215
§3. Equival~nce Relations

Let Pi denote the jth column of P and prove that for each column, A Pi = which relates everything to everything else. This easily satisfies the definition
riPh 1 :$; j :$; n. (Notice that this generalizes the result obtained in problem
of an equivalence relation, but it is not very interesting.
10.)
12. Let A' be the matrix p-t AP of problem 11, with r's on the diagonal and O's
elsewhere. If A' is the matrix of TE Hom('f") with respect to the basis B' = Examp/e 1 Similarity is an equivalence relation on the set of all
{U{, ... , U~}, then what are the images of the vectors in B'? 11 x matrices. That is, if S= uffnxn andA ~ B means "A is similar to B"
11
for A BE ..Jt x , then ~ is an equivalence relation on S.
To justif~ ;his, it is necessary to verify that the three conditions in the
definition of an equivalence relation hold for similarity.
§3. Equival~e Relations Reftexive: For any 11 x n matrix A, A = 1;; 1 Al., therefore A is similar
to A and similarity is reftexive.
Symmetric: Suppose A is similar to B. Then there exists a nonsingular
An equivalence relation identifies elements that share certain properties.
matrix P such that B = P- 1 AP. Therefore PBP- 1 = PP- 1 APP- 1 =A, or
!t is much Iike the relation of equality except that equivalent objects need
A = Q- 1 BQ with Q = p- 1 . Q is nonsingular since it is the inverse of a
not be identical. The congruence relation between triangles in plane geometry
nonsingular matrix, so B is similar to A and similarity is symmetric.
provides a familiar example. Two congruent triangles are equal in aii respects
Transitive: Suppose A is similar to B and B is similar to C. Then B
~xcept for their position in the plane. Two such triangles cannot be caiied
= p- 1 AP and e= Q- 1 BQ, with P and Q nonsingular. So C = Q- 1 BQ
~qua!, yet the relationship between them satisfies three major properties of
che equality relation. First, every triangle is congruent to itself. Second, con-
= Q- 1( r 1AP)Q = (PQ)- 1 A(PQ). Since the product of nonsingular
matrices is nonsingular, A is similar to C and similarity is transitive.
gruence does not depend on the order in which two triangles are considered.
This completes the proof that similarity is an equivalence relation.
And third, if one triangle is congruent to a second and the second is congruent
to a third, then the first is congruent to the third. We will use these three
properties, shared by equality and congruence of triangles, to characterize Each equivalence relation identifies elements that share certain pro-
an equivalence relation. In order to define such a relation, we need a notation perties. For similarity, the collection of all matrices similar. toa given matri_x
for an arbitrary relation. Therefore suppose S is a set with a, b E S. Let the would contain all possible matrices for sorne linear map usmg the same basts
~ymbol ~ (tilde) denote a relation defined for elements of S; read a ~ b as in the domain and the codomain. Such a collection of equivalent elements is
"a is related to b." Essentiaiiy a relation is a collection of ordered pairs of called an equivalence class.
elements from S written as a ~ b. For example, the order relation for real
numbers is expressed in this way: 2 < 5.
Definition Suppose ~ is an equivalence relation defined on the set S.
For any element a in S, the set of all elements equivalent to a is called the
Definition A relation ~ defined on a set Sisan equivalenee relation on equivalenee classofa and will bedenoted by [a]. Therefore [a] = {bE Sla ~ b}.
S, if for all a, b, e E S
l. a ~ a for all a E S ~ is reflexive.
2. If a ~ b, then b ~ a Example 2 What is the quivalence class of
~ is svmmetrie.
3. If a ~ b and b ~ e, then a ~ e ~ is transitive.

This definition is quite general, for nothing is said about the set S; its
elements need not even by vectors-for the congruence relation above they for the equivalence relation of similarity?
are triangles. And further, the three properties give no hint as to how a Using the definition, one aílswer is
particular equivalence relation might be defined.
It should be clear that the relation of equality is an equivalence relation
on any set. At the other extreme is the relation defined by a ,..., b for all a, b E S
216 5. CHANGE OF BASIS §3. Equivalence Relations 217

Therefore complicated. For example,

[(~ ~) J (~ ~)}
= {
and

using P = 12 , and
Example 3 Isomorphism defines an equivalence relation on a set
of vector spaces. Suppose S is the set of all finite-dimensiona1 vector spaces
and 1et "f/ ~ 1fí for "//, "'Y E S mean""// is isomorphic to 1fí." Isomorphism
is refle¡¡:ive since l-¡r: "Y-+ "Y is an isomorphism and it is symmetric for if
using T: "f/ -+ 1fí is an isomorphism, then r-t: 1fí -+ "f/ is a1so an isomorphism.
Finally the relation is transitive for it is not difficult to show that if

p = ( 2 -1)
-1 1 .
T1 : "Y -+ 1fí and T2 : 1fí -+ ó/1 are isomorphisms, then T 2 o T1 : "Y -+ 0/t is
a1so an isomorphism.
What is the nature of an equivalence class for this relation? Since "f/ is
A second answer uses the map Tdefined by T(a, b) = (2a + b, 3a + 5b). isomorphic to "'Y if and only if dim "Y = dim 1fí, there is one equivalence
The matrix of T with respect to the standard basis is class for each dimension. With "Y 0 denoting the zero vector space, the
equivalence classes can easily be listed, ["Y 0 ], ["Y¡], ["Y 2 ], ..• , ["Y.], ... , and
every finite-dimensional vector space is in one of these equivalence classes;
R 3 [t] E["// 3], ..# 2 x 2 E ["Y4 ], Hom("f/ 2 , "Y 3 ) E ["Y 6 ], and so on.

Therefore,
Since an equivalence class in S contains equivalent elements, the char-

[G ;)J = {A E ..# 2 x 2 IA is the matrix of Twith respect to


acter of a class and the differences between classes provide information about
both the equivalence relation and the elements of S. The following theorem
gives the basic properties of equivalence classes.
sorne basis for "Y 2 }.

Using the basis B = {(2, -1), (1, 0)}, Theorem 5.5 Suppose ~ is an equivalence relation on the set S, then
1. a E [a] for every element a in S.
T(2, -1) = (3, 1): ( -1, 5)a, 2. a ~ b if and only if [a] = [b].
T(1, 0) = (2, 3): (- 3, 8)B, 3. For any two equivalence classes [a] and [b], either [a] = [b] or [a]
and [b] have no elements in common. -.,...,
and
Proof 1 and 3 are left as problems. For 2 there are two parts.
( =>) Suppose a ~ b. To pro ve that [a] = [b] it is necessary to show that
[a] e [b] and [b] e [a]. Therefore suppose x E [a], that is, a ~ x. By the
symmetry of ~, b ~a, and by transitivity, b ~a and a~ x imply that
Notice that neither of these two answers provides a simple way to b ~ x. Therefore x E [b] and [a] e [b]. Similarly [b] e [a]. Therefore if a ~ b,
determme. 1f. say (12 -3)1 or (_ 61 1).
1
.m [(23 1)]
IS
5
· then [a] = [b].
(<=) Suppose [a]= [b]. By 1, aE[a], so aE[b] or b ~a. Now by sym-
Not all equivalence classes of 2 x 2 matrices under similarity are so metry a ~ b, as desired.

L
;,
218 5. CHANGE OF BASIS §3. Equivalence Relations 219

This theorem shows that an equivalence relation defined on a set S diagonal. Denote this matrix by
divides S into a collection of equivalence classes in such a way that each
element of S is in one and only one of the classes. Such a multually exciusive
and totally exhaustive division is called a partition of S. For the equivalence
relation, elements within an equivalence class are essentially identical, and the
importan! dilferences are between the equivalence classes. where the zeros represen! blocks of zeros in the matrix.
Dueto notation, the element a appears to play a special role in the equiv-
alence class [a]. However, the preceding theorem shows that if e E [a], then
the class [a] could also be written as [e]. lt might be better to say that the Proof Suppose A is the matrix of TE Hom('i"", 1fí) with respect to
element a represents the equivalence class [a]. The equivalence class some.clloice of bases. Then it need only be shown that there are bases for
'f'" and 1fí with respect to which (~k ~) is the matrix of T. Since

di m 'f'" = dim.fo r + di m .Al' y,


could also be represented by
a basis B 1 = {U 1, ••• , Uk> Uk+ 1 , ••• , Um} can be obtained for 'f'", so that
{T(U 1), ••• , T(Uk)} is a basis for .for and {Uk+t• ... , Um} is a basis for
.Al' r· (If this is not clear, refer to the proof of Theorem 4.4 on page 141.)
and written as .fo r is a subspace of 1fí, so there is a basis for 1fí of the form B 2
= {T(U 1), ••• , T(Uk), Wk+t• ... , Wn}. Now the matrix of T with respect
to B 1 and B 2 contains O's everywhere except for 1's in the first k en tries of
its main diagonal. That is, the matrix of T with respect to B 1 and B2 is
Although -any element in an equivalence class can represen! that class, it is (~k ~). Since A and (~k ~) are matrices for T with respect to dilferent
often possible to pick a "best" or "simplest" representative. Thus the equiv-
bases, they are equivalen!.
alence classes of vector spaces under isomorphism were denoted ['f'"nl using
the n-tuple spaces as representatives. When one special element is chosen to
represen! each equivalence class in S, the choices are called canonical forms Theorem 5.6 shows that each equivalence class of matrices under
for the elements of S. Selecting one such element from each equivalence
dass may also be called a cross section of the equivalence classes. Choosing a
equivalence contains a matrix in the form (~k ~). That each equivalence
canonical form to represent each equivalence class is not done arbitrarily, dass contains exactly one matrix in this form follows from the fact that
rather the element chosen should exemplify the properties shared by the rank QAP = rank A, if Q and Pare nonsingular (problem 19, page 198).
elements of the class. lf a E S, then the canonical form in the class [a] is Therefore matrices of this form can be taken as canonical forms for matrices
called the canonical form for a. under equivalence. For example, the matrix
The relation of equivalence for n x m matrices provides a good example
of a choice of canonical forms. The proof that equivalence is an equivalence 1 3 ¡)
relation on the set vltnxm follows the pattern for similarity in Example l. To
choose a set of canonical forms, we pick exactly one matrix from each equiv-
alence class. A way to do this has been hinted at in severa! problems and
(l o 4 2
1 7 3

examples, and follows from the fact that equivalence depends only on rank. has rank 2, therefore

o o
Theorem 5.6 If A is an n x m matrix of rank k, then A is equivalen!
to the n x m matrix with only k nonzero entries appearing as 1's on the main (~ 1 o
o o ~)
220 5. CHANGE OF BASJS §3. Equivalence Relations 221

would be taken as its canonical form. The matrix to determine if two triangles are congruent by simply checking a single pro-
perty such as area or the length of a side.
Two invariants are easily obtained for similarity. First since similar ma-
trices are equivalent, rank is invariant under similarity. A second invariant is
Gil given by the determinant.

also has rank 2, so Theorem 5.7 If A and B are similar matrices, then IAI = IBI.

Proof There exists a nonsingular matrix P such that B = p- 1 AP.


Therefore

is its canonical form. IBI = IP- 1 API = lr 1 IIAIIPI = (JJIPI) IAIIPI = IAI.
It terms of linear maps, Theorem 5.6 states that, given any map T, there
It is not difficult to show that rank and determinant do not constitute a
is sorne choice ofbases for which the matrix of Thas the form (~k ~), where complete set of invariants for matrices under similarity. Consider the matrices
k is the rank of T.
The selection of matrices in the form ( ~ ~) for canonical forms under and

equivalence reflects the fact that two n x m matrices are equivalent if and
only if they ha ve the same rank. That is, the rank of a matrix is not affected Each has rank 2 and determinan! 1, but they are obviously not similar.
by equivalence, and this single property can be used to determine if one [P- 1 ! 2 P = 12 for any.nonsingular matrix P.]
n x m matrix is equivalent to another.
Problems
Definition Suppose ~ is an equivalence relation on S. A property is
an invariant or invariant under ~ provided that when a E S satisfies the pro- 1. Which of the properties reflexive, symmetric, transitive hold and which fail to
perty and a ~ b, then b also satisfies the property. hold for the following relations?
a. The order relation s for real numbers.
b. The relation ~ defined on real numbers by x ~ y if x 2 + y 2 = 4.
We have seen that not only is rank invariant under equivalence of c. The relation ~ defined on S = (1, 3} by 1 ~ 3, 1 ~ 1, 3 ~ 1 and 3 ~ 3.
matrices, but the rank of a particular matrix completely determines~hich d. The relation ~ defined on S= R by 1 ~ 3, 1 ~ 1, 3 ~ 1, 3 ~ 3.
equivalence class it belongs in. Dimension plays a similar role for the equiv- e. The relation ~ defined on S = R by a ~ b if ab ¿ O.
alence relation defined on finite-dimensional vector spaces by isomorphism. 2. What is wrong with the following "proof" that requiring an equivalence
Dimension 'is invariant under isomorphism, and the equivalence class to relation to be reflexive is superfulous?
which a vector space belongs is completely determined by its dimension. If a ~ b, then by symmetry b ~ a, and by transitivity, a ~ b and b ~ a
In fact, the invariance of dimension was used to choose the n-tuple spaces implies a ~ a. Therefore if ~ is symmetric and transitive, a ~ a for all a E S.
"~"n as canonical forms for vector spaces under isomorphism. Rank and
3. Define the relation ~ on the set of integers Zas follows: For a, b E Z, a ~ b
dimension are said to be "complete sets of invariants" for equivalence and if a - b = Se for some e E Z. That is, a ~ b if a - b is divisable by 5. For
isomorphism, respectively. In general, a complete set of im•ariants for an example, 20 ~ 40, 13 ~ -2, but 2 '"" 8.
equivalence relation is a list of properties that can be used to determine if a. Prove that ~ is an equivalence relation on Z.
two elements are equivalent. Often more than one invariant must be checked b. How many equivalence classes does Z have under ~ and what does each
to determine if two elements are equivalent. For example, it is not possible look like?
222 5. CHANGE OF BASIS §4. Similarity 223

c. What would be a "good" choice for canonical elements under ~? nor ha ve we found a complete set of invariants that could be used to deter-
d. What property of integers is invariant under this relation? mine when any two matrices are similar. With the results for equivalence in
4. Define ~ on R by a ~ b if a - b is rational. mind, one might hope that every n x n matrix is similar to a matrix with all
a. Prove that ~ is an equivalence relation on R. en tries off the main diagonal equal to zero. Although the problem of obtain-
b. Describe the equivalence classes [1], [2/3], h/2], and [n]. i ng a canonical form for similarity is not so easily resolved, there are many
c. How many equivalence classes are there for this relation? situations in which the matrices under consideration are similar to such
5. Let S be the set of all directed line segments in the plane E 2 • That is, an matrices.
elementPQ E S is the arrow from the point P to the point Q. Define ~ on S
by, PQ ~ XY if PQ and XY ha ve the same direction and Jength. Definition An n x n matrix A = (a;) is diagonal if aii = O when
a. Show that ~ is an equivalence relation on S.
""
i -:1 j. An n x n matrix is diagonalizab!e if it is similar to a diagonal matrix.
b. How might an equivalence class be drawn in E 2 ?
c. Choose a set of canonical forms for the equivalence classes.
d. How might t~e equivalence classes be related to the geometric use of Since a diagonal matrix A = (a¡) has zero entries off the main diagonal,
vectors? it is often written as diag (a11 , a22 , .•• , a••). Thus
6. Let T: "/1" 2 ~ "/1" 2 be given by T(a, b) = (2a- b, 3b - 6a). Define ~ on "Y 2 by
U"' Vif T(U) = T(V).
a. Prove that "' is an equivalence relation on "Yz. diag(2, 5, 3) =
2 o o)
O 5 O .
(
b. Give a geometric description of the equivalence classes. oo 3
7. Define T: "Y 3 ~ "Y 2 by T(a, b, e) = (b - 2a + 4c, 4a - 2b - 8c) and for
U, VE "/1" 3 , define U~ V by T(U) = T(V). Then ~ is an equivalence relation The matrix
on "Y 3 as in problem 6.
a. What is our usual designation for [O]? 2 o 4)
b. Describe the-equivalence classes for "' geometrically.
8. A matrix A is row-equivalent to B if B = QA for sorne nonsingular matrix Q
(
A= O 3 O
4 o 8
(problem 7a, page 000.)
a. Show that row-equivalence is an equivalence relation on .lfnxm· of Example 3, page 211, was found to be diagonalizable, for A is similar to
b. Find an invariant under row-equivalence. diag(lO, 3, 0).
c. Choose canonical forms for matrices under row-equivalence. Our immediate goal will be to determine when a matrix is diagonalizable.
d. How many equivalence classes does Jf 2 x 2 ha ve under row-equivalence? That not every n x n matrix is similar to a diagonal matrix can be seen by
9. Let S be the set of all triangles in the plane and Jet "' be the congruence considering the matrix
relation between triangles. List some properties oftriangles which are invariant
under this equivalence relation.

10. If (Í g) is the canonical form for an n x m matrix under equivalence, then


what are the sizes of the three blocks of zeros? This matrix is similar to matrices óf the form

(ae db)- (01 O)(a b) l (-ab


1

§4. O e d = ad - be a 2
Similarity
where a, b, e, and d satisfy ad - be -:1 O. Such a matrix is diagonal only if
We have seen that every matrix has a very simple canonical form under
equivalence, and that rank can be used t~etermine when two matrices are a2 = b2 -= O. But then ad- be= O, so(~ ~) is not similar toa diagonal
equivalen t. In contrast, we have not mentioned a cano ni cal forro for similarity matrix.
224 5. CHANGE OF BASIS §4. Similarity 225

What can be concluded from the fact that an n x n matrix A is similar Example 1 Given that 3 and 1 are the characteristic values for
to the diagonal matrix D = diag(A. 1 , A. 2 , ••• , A..)? One way to findout would
be to consider the map Te Hom(f.) having matrix A with respect to sorne
basis B and matrix D with respect to B'. Then the transition matrix P from
A= (-5 12)
-4 9 '
B to B' satisfies p- 1 AP =D. Since D is a diagonal matrix, the action of T
on the basis B' is particularly simple. For if B' = {Uí, ... , U~}, then T(Uí) find all characteristic vectors for A and a matrix P such that p- 1 AP is
= A. 1 Uí, ... , T(U~) = .A..U~. Ifthe equations T(Uj) = A.iu; are written using diagonal.
the matrix A, they become AP 1 = A. 1 P 1, ••• , AP. = A..P. where P 1 , ••• , P. A vector
are the coordinates of Uí, ... , U~ with respect to B. But by definition, the
jth column of P contains the coordinates of Uj with respect to B. This rela-
tionship, APi = .A.iPi, between A, the columns of P, and the diagonal entries V=(~)
of D gives a necessary and sufficient condition for an n x n matrix to be
diagonalizable.
is a characteristic vector for A corresponding to 3, if AV = 3 V or

Theorem 5.8 An n x n matrix A is diagonalizable if and only if


there exist n linearly independent column vectors P 1 , ••• , P. such that
AP1 = A. 1 P 1 , . •• , AP. = A..P. for sorne scalars A. 1 , ••• , A.•. Further, if Pis
the n x n matrix with Pi as itsjth column, then p - l AP = diag(A. 1 , A.2 , ... , A..). The equations - 8a + 12b = O and -4a + 6b = O are dependent, yielding
the condition 2a = 3b. Therefore, every vector of the form (3k, 2kV with
Proof (=>) This direction was obtained above. k i= Ois a characteristic vector for the characteristic value 3. A characteristic
vector V corresponding to 1 satisfies A V = V or
(<=) Suppose A Pi = A.iPi for 1 ::;; j ::;; n, and Pi is the jth column of the
n x n matrix P. Since the n columns of P are Iinearly independent by as-
sumption, P is nonsingular, and we must show that p-I AP is a diagonal
matrix.
Write P as (P 1 , • •• , P.). Then since matrix multiplication is row by
column, the product AP can be written as (AP 1 , • •• , AP.). Therefore, using This system is equivalent to a - 2b = O, so all vectors of the form (2k, k)T
the hypothesis, AP = (A. 1 P 1 , ••• , A.. P.). That is, AP is obtained from P by with k i= O are characteristic vectors of A corresponding to l.
multiplying each column of P by a scalar, or AP is obtained from P by n A is a 2 x 2 matrix and the vectors (3, 2)T and (2, l)T are two Iinearly
. ··~
elementary column operations of the second type, and the product of the n independent characteristic vectors for A. Therefore, by Theorem, 5.8, A is
elementary matrices that perform these operations is diag(A. 1 , ••• , A..). There- diagonalizable, and if
fore

AP = P diag(A. 1 , •.• , A..) or P- 1 AP = diag(A. 1 , .•• , A..).

then
Definition Suppose A is an n x n matrix and Vis a nonzero columri
vector such that A V = A. V for sorne scalar A.. Then A. is a characteristic value
of A and Vis a characteristic vector for A corresponding to A..
1
p- AP = diag(3, 1) = (~ ~).

Severa! other terms are used for characteristic value and characteristic Since there are many characteristic vectors for A, there are many ma-
vector. Most common is eigenvalue and eigenvector, but the adjectives pro- trices that diagonali~e .A. For example, (10, 5l and (6, 4l are linearly in-
per, Iatent, and principal can also be found. dependent charactenst¡c vectors corresponding to 1 and 3, respectively. So
226 5. CHANGE OF BASIS §4. Similarity 227

if Since the expansion of a determinan! contains all possible products


taking one element from each row and each column, the determinant of the
matrix A - Al. is an nth degree polynomial in A. .
º=e~ ~}
Definition For an n X n matrix A' det(A - u.) is the characteristic
then
polvnomial of A, det(A - Al.) = O is the characteristic equation of A, and the
roots of the equation det(A - Al.) = O are the characteristic roots of A.

The roots of a polynomial equation need not be real numbers. Yet in


Many characteristic vectors were obtained for each characteri'stic value our vector spaces, the scalars must be real numbers. Therefore, a character-
in Example l. In general, if ,1. is a characteristic value for an n x n matrix istic root of a matrix need not be a characteristic value. Such a situation
A, then there exists a nonzero vector V su eh that AV= ,1. V. This condition occurs with the matrix
can be rewritten as follows:

O= AV- A.V =AV- A.(!. V)= (A - U.)V.

(Why must the identity matrix be introduced ?) Therefore, if ,1. is a character- for the characteristic equation for A is A2 + 1 = O. In the next chapter,
;stic value for A, then the homogeneous system oflinear equations (A - AI.)X complex vector spaces are defined and the equation A V = AV with A E C can
= O has a nontrivial solution, namely, X= V =f. O. Since a homogeneous be considered.
~ystem with one nontrivial solution has an infinite number of solutions, we
see why there are many characteristic vectors for each value. Conversely,
given A. such that (A - AI.)X = O has a nontrivial solution X = W =f. O, Example 3 Determine if
then (A - AI.)W =O andA W = AW. Therefore, A is a characteristic value
for A if and only if the homogeneous system (A - AI.)X = Ohas a non trivial
solution. But (A - AI.)X = O is a system of n equations in n unknowns,
therefore it has a nontrivial solution if and only if the determinant of the
coefficient matrix A - Al. is zero. This pro ves:
A= (
-3
~ -1~ j)
is diagonalizable.
Theorem 5.9 Let A be an n x n matrix and A E R. A is a characteristic The characteristic equation, det(A - A/3 ) = O, for A is
value of A if and only if det(A - Al.) = O.
3-A 6 6
Examp/e 2 For the matrix A of Example 1,
o 2-A o =o or (2 - A)(A 2 + 3A) = o,
-3 -12 -6- A

det(A - A/2) =1( =~ 1~) -A(~ ni=~-~~ A 9 ~Al obtained by expanding along the second row. The characteristic roots 2, -3,
and O are real, so they are characteristic values for A. A is diagonalizable if
= A2 - 4A + 3. it has three Iinearly independent characteristic vectors. Solving the system of
linear equations A V = AV for each characteristic value Ayields the character-
The roots of the equation A2 - 4A + 3 = O are 3 and 1, the characteristié istic vectors (12, -5, 3l, (1, O, -1?, and (2, O, -If for 2, -3, andO,
values given for A in Example l. respectively. For exampl~when A = -3, the system of linear equations
228 5. CHANGE OF BASIS §4. Similarity 229

AV= 3Vis We have V 1 '# Osince V1 is a characteristic vector, and ..1. 1 '# A2 by assump-

u
tion, therefore a = O. But then O = a V 1 + b V2 = b V2 implies b = O, for
3 6 6 V2 is also a characteristic vectot". This contradicts the assumption b '# O,
( -3
o 2 or 5 so b = O. Now O = a V1 + b V2 = a V 1 implies a = O and the vectors V 1 and
-12 -12 v2 are linearly independent.

This system is equivalent to a + e = O and b == O, therefore (k, O, -k)T


Corollary 5.11 If A. 1 , ••• , A.k are distinct characteristic values for a
is a characteristic vector for each k i= O. The three characteristic vectors
matrix A and V 1 , ••• , Vk are corresponding characteristic vectors, then
1bove are easily seen to be independent, so A is diagonalizable. In fact,
V 1 , ••• , Vk are linearly independent.
if

Proof By induction.
~).
1
o
-1 -1 Corollary 5.12 An n x n matrix is diagonalizable if it has n distinct
(real) characteristic values.
then

p- 1 AP
2
= O -3 O .
o o) The converse of Corollary 5.12 is false. For example, In is obviously
diagonalizable but all n ofits characteristic values are the same; the solutions
(o o o of (1 - A.)" = O. As a less trivial example consider the following.

As the next theorem shows, the linear independence of the three charac- Example 4 Determine if
teristic vectors in the preceding example follows from the fact that the three
characteristic values are distinct.
-2
2
2
-1)
6 -1
Theorem 5.1 O Distinct characteristic values correspond to linearly
independent characteristic vectors, or if A. 1 i= A. 2 and V 1 and V2 are char-
acteristic vectors for A. 1 and A. 2 , respectively, then V 1 and V2 are linearly is diagonalizable. The characteristic equation of A, A. 3 - 12A. + 16 = O,
independent. has roots 2, 2, and -4. Since the roots are not distinct, Corollary 5.12 does
not apply, and it must be determined if A has three independent character-
istic vectors. It is not difficult to show that only one in<íept?ñdent character-
Proof Suppose a V1 + b V2 == O for a, b E R. It must be shown that istic vector corresponds to -4; therefore A is diagonalizable if there are two
a = b = O using the fact that A V 1 = A. 1 V 1 and AV2 = A. 2 V2. These three linearly independent characteristic vectors corresponding to 2. Such vectors
equations give must satisfy the equation

Nowassumeb i= Oand workfor acontradiction. lfb i= O, then V2 = ( -ajb)V1 ,


so
This system is equivalent to the single equation a + 2b - e = O. Since the
solutions for a single equation in three unknowns involve two parameters,
230 5. CHANGE OF BASIS §5. lnvariants for Similarity 231

there are two independent characteristic vectors for the characteristic value 2. 8. With the answer to problem 7 in mind, find at least two characteristic values
Thus A is diagonalizable. for each of the following matrices by inspection.

Examp!e 5 Determine if
a. (60 0! 3~). b. (¡0 0¡ 5~). c. (63 8~ 1~).
a. x 2 matrix can fail to ha ve any (real) characteristic values. Can there
(-i
9. A2

-~)
-2
be a 3 x 3 matrix with no (real) characteristic values?
A= 3
b. Find a 3 x 3 matrix with only one independent characteristic vector.
-5 7 -1
10. a. What are the characteristic values of a diagonal matrix?
is diagonalizable. The characteristic equation for A is A. 3 - 5l 2 + 8l - 4 ~- What does this say of the characteristic values of a diagonalizable matrix?
= O and it has roots 2, 2, and l. Again there is only one independent charac- 11. Suppose A is the matrix of TE Hom(f) with respect toa basis B.
teristic vector corresponding to 1, and it is necessary to determine if A has a. What can be said of TifO is a characteristic value for A?
two linearly independent characteristic vectors for 2. But (a, b, ef is a charac- b. What subspace off corresponds to the set of characteristic vectors for A
teristic vector for 2 if a - 2b = O and a - b + e = O. Since the solutions corresponding to the characteristic value O?
~cr this system depend on only one parameter, there do not exist two Iinearly c. If A = diag(a 1 , a 2 , ••• , a") and B = {U¡, ... , Un}, then what do the
mdependent characteristic vectors for 2, and A is not diagonalizable. vectors T(U 1 ), • • • , T(Un) equal?

There are many situations in which it is very useful to replace a matrix


by a diagonal matrix. A geometric example of this is considered in Chapter 7,
§5. lnvariants for Similarity
?ut th~re are severa) nongeometric applications. Sorne of these are surveyed
m Sectwn 6 of Chapter 8, which briefty considers applications of the Jordan
canonical forro. This section could be read at this point since aJordan forro A complete set of invariants for matrices under similarity would be a
for a diagonalizable matrix A is any diagonal matrix similar toA. list of matrix properties that could be checked to determine if two matrices
are similar. Such a list could also be used to choose a set of canonical forros
for matrices under similarity. The results of the last section do not yield such
Problems a complete list of invariants, but they provide a begining.
The characteristic values of a diagonal matrix D are the entries on its
l. Find a matrix Panda diagonal matrix b such that p-t AP = D when A is main diagonal. lf A is similar to D, then the characteristic values of A are
a. (_~ -~). b. (-¿ -~). c. (i o~ ~).
1 3
d. (-~ -~ -~).
3 6 -3
on the main diagonal of D. Therefore, the characteristic values and the
characteristic polynomial of a diagonalizable matrix are invariant under
2. Determine if the following matrices are diagonalizable. similarity. This result is easily extended to all n x n matrices.

a. (1 -t). b. (~0 0; 6~)- c. (! r =~ f).


00 01
d.
1 o 1)
(31 o2 4.
1
Theorem 5.13
mi al.
Similar matrices have the same characteristic polyno-

3. Find all characteristic vectors for the matrix in Example 4 and a matrix p
that diagonalizes A. Proof Suppose A and B are similarn x nmatriceswithB = P- 1AP.
4. Find all characteristic vectors for the matrix in Example 5. Then the characteristic polynomial of B is
5. Why is the zero vector excluded from being a characteristic vector?
6. Suppose A is an n x n matrix with all n roots of its characteristic equation
lB - Ainl = IP- 1 AP - Alnl = IP- 1 AP - l(P- 1lnP)I = IP- 1(A - Ain)PI
equal to r, is A then similar to diag (r, r, ... , r)? = IP- 1 IIA- AiniiPI =lA- Ainl•
7. Why are there many characteristic vectors for each characteristic value of a
matrix? which is the characteristic polynomial of A.
232 5. CHANGE OF BASIS §5. lnvariants for Similarity 233

Since the characteristic polynomial is invariant, its n coefficients provide and


n numerical invarients for an n x n matrix A under similarity.

Example 1 Consider the matrix In terms of the entries in A, s. = det A since it is the constant term in the
characteristic polynomial. Andan examination of how the terms in 2"- 1 are

~ -3
-~)·
obtained in the expansion of det(A - Al.) shows that s 1 = tr A. s._ 1 also has
A= ( ; a fairly simple expression in terms of the elements of A. (See problem 3 at
-1 6 the end of this section.)
We now have n + l numerical invariants for an n x n matrix under
The characteristic polynomial of A is
similarity, n from the characteristic polynomial and the rank. But this is nota
complete set of invariants. This can be seen by considering

So the numbers 18, -9, and 2 are invariants for A. That is, if Bis similar to and
A, then the characteristic polynomial of B will have these coefficients. Notice
that these three numerical invariants are real while two of the three charac- These two matrices cannot be similar, P- 1(3I2 )P = 312 for any P, yet they
teristic roots are complex. These three numerical invariants can be related both have rank 2, trace 6, and determinant 9. The search for additional
to both A and the characteristic roots of A. In this case, 18 is the determinant invariants is more easily conducted with linear transformations than with
of A and the product of the three characteristic roots; and 2 is the sum of the their matrices. Since similarity arose from the consideration of different
main diagonal entries of A, as well as the sum of the three characteristic matrices for the same map, all ofthe above results for matrices extend directly
roots. to maps.

Definition If A = (a¡) is an n x n matrix, then the trace of A, denoted Definition A scalar A. is a characteristic value for TE Hom(i'), if
tr A, is a 11 + a22 + · · · + a••. T(V) = A. V for sorne nonzero vector V E i'. A nonzero vector V E i' such
that T(V) = A. Vis a characteristic vector for T corresponding to the charac-
The determinant and the trace of A are two of the coefficients in the teristic value A..
characteristic polynomial. In general, if 2 1 , ••• , 2. are the n characteristic
roots of an n x n matrix A, then the characteristic polynomial for A may be Since matrices of a map with respect to different bases are similar, and
written in the forro it has been shown that the characteristic polynomial is invariant under
similarity, the following definition can be made.
lA - AI.I = (2 1 - 2)(2 2 - A.)·· ·(A..- A.)

Definition Let TE Hom(i') and suppose A is the matrix for T with


respect to sorne basis for i'. The characteristic polynomial and the charac-
Neglecting the minus signs, the coefficients s., s._ 1 , • •• , s 1 give n numerical teristic equation for Tare the characteristic polynomial and the characteristic
invariants for A. The defining equation shows that each of these n invariants equation for A, respectively. The characteristic roots of Tare the roots of
is a sum of products of characteristic roots. In fact, it is not difficult to see its characteristic polynomial.
that sk is the sum of all possible products containing k of the characteristic
roots 2 1 , ••• , 2•. The value of sk does not depend on the order in which the Theorem 5.9 gives the following result for maps:
roots are written down, thus sk is called the "kth symmetric function" of the
n characteristic roots. The functions s. and s 1 are simplest with
Theorem 5.14 Let TE Hom(i'). A is a characteristic value for T if
and only if A is a real characteristic root for T.
234 5. CHANGE OF BASIS §5. lnvariants for Similarity 235

Maps and matrices correspond vi a a basis. If A is the matrix of T with We will see that a diagonalizable map has a particularly simple action
respect to the basis B, then characteristic vectors for A are the coordinates on a vector space.
with respect to B of characteristic vectors for T. Suppose A. is a characteristic val ue for the map TE Hom('t'"), and Jet
9"(A.) = {V E 'f'"l T( V) = A. V}. That is, Y'(A.) is the set of all characteristic
vectors for A. plus the zero vector. lt is not difficult to show that Y'(A.) is a
Example 2 Find the characteristic values and characteristic vectors
subspace of Y'". The map T operates on Y'(A.) simply as scalar multiplication,
of the map TE Hom(R 3 [t]) defined by so T sends Y'(A.) into itself.
2
T(a + bt + et 2 ) = 2a - e + 3bt + (2e - a)t •

Definition Let TE Hom('t'"). A subspace Y' of Y'" is an invariant


2
The matrix of T with respect to the basis B = {1, t, t } is subspaee of T if T[Y'] e Y'.

2 o -1)
A =
(-1
O 3
o
O ,
2
An invariant subspace need not be a space of characteristic vectors
for example, Y'" is an invariant subspace for any map in Hom('t'").
The simple nature of a diagonalizable map m ay be stated as follows:
which gives (3 - ~YO -
A.) as the characteristic polynomial of T. Therefore If TE Hom('t'") is diagonalizable, then Y'" can be expressed as the direct su m
3, 3, and 1 are the characteristic values of T. A characteristic vector V for T of invariant subspaces of characteristic vectors. If there are more than two
corresponding to 3 has coordinates (x, y, z)a which satisfy distinct characteristic values, then this requires an extension of our definition
of direct sum.

Examp!e 3 For the map TE Hom(R 3 [t]) of Example 2, the invariant


subspaces of characteristic vectors are
This yields x = -z and y is arbitrary, so (h, k, -hh are coordinates of a
characteristic vector for T provided h 2 + k 2 #- O. Therefore every vector of Y'(3) = 2"{1 - t 2 , t} and
the forro h + kt - ht 2 , h2 + k 2 ,¡. O, is a characteristic vector for T cor-
responding to 3. T ís diagonalizable, so the sum Y'(3) + Y'(l) is R 3 [t] and Y'(3) n Y'{l)
The characteristic vectors for 1 could be found without using the matrix = {0}. That is, R 3 [t] = Y'(3) EB Y'(1). Thus given any polynomial PE R 3 [t],
A, for a + bt + ct 2 is a characteristic vector for 1 if there exist unique polynomials U E Y'(3) and V E Y'(1), such that P = U + V.
Then the action of Ton Pis given simply by T(P) = 3U + 1 V.

Thatis, Example 4 Let TE Hom('t'" 3 ) have the matrix

+ 3bt + (2e - a)t 2 = a + bt + ct 2 ,


2a - e
3 2 -1)
which yields a = e, b = O. So the vectors k + kt , k #- O, are characteristic
2
A=
(-23 -26 -12
vectbrs of T corresponding to l.
The matrix A above is diagonalizable, and therefore we should also call with respect to the standard basis. A is the diagonalizable matrix considered
:he map T diagonalizable. in Example 4, page 229, with characteristic values 2, 2, -4. Since T is
diagonalizable, "/!" 3 = Y'(2) EB Y'(- 4), where
Definition TE Hom("/1") is diagof}#lizable if the characteristic vectors
of Tspan "/!". Y'(2) = 2"{(0, 1, 2), (1, o, 1)} and Y'(-4) = 2"{{1, -2, 3)}
236 5. CHANGE OF BASIS §5. lnvariants for Similarity 237

are the invariant subspaces of characteristic vectors. Geometrically;- as a map U, T(U) = (0, -1, -1), and T 2 (U) = (2, -2, -6) are linearly independent.
from 3-space to 3-space, T sends the plane 9'(2) into itself on multiplication Therefore S"( U, T), the invariani subspace generated by U, is all of "Y 3 ; and
by 2 and the line 9'( -4) into itself on multiplication by -4. B' = {U, T(.U), T 2 (U)} is a basis fori/ 3 • Since T 3 (U)= 5T2 (U)- 8T(U) + 4U,
the matrix of T with respect to B' is
Even if a map is not diagonalizable, it acts simply on a subspace if it
has a characteristic value. Thus every map from "Y 3 to "Y 3 must opera te on
sorne subspace by scalar multiplication, for every third degree polynomial
equation has at least one real root. However, if a map on "Y is not diagonal-
izable, then there will not be enough invariant subspaces of characteristic
Compare the characteristic equation for A, ..t 3 - 5..t 2 + 8.-t - 4 = O, with
vectors to generate "Y. .
the relation T\U) - 5T 2 (U) + 8T(U) - 4U = Oobtained above. This sug-
By now you may suspect that finding a complete set of invariants for
gests that A' might be chosen as the canonical representative for the equiva-
similarity is not a simple task. Since this is the case and much can be done
lence class of A under similarity. In fact, you will find that almost any choice
without them, the problem of obtaining such a set and the canonical forros
for U leads to the same matrix. Of course, not every vector and its images
associated with it will be postponed. However, it is possible to give a general
can genera te "// 3 sin ce T has two independent characteristic vectors.
idea of how one might proceed. We ha ve said that a diagonalizable map on
The matrix A' above is simple and easily obtained using the concept
"Y determines a direct sum decomposition of "Y into invariant subspaces of
of an invariant subspace. However, there is another matrix similar to A,
characteristic vectors. The idea is to show that an arbitrary map on "Y de-
which in sorne sense is simpler than A' in that it is closer to being diagonal
termines an essentially unique direct sum decomposition of "// into invariant
and has the characteristic values on the main diagonal. (See problem 11
subspaces. There are usually many different invariant subspaces for any one
below.) Example 5 and problem 11 do not provide a general procedure for
map; so the problem is to find a process which selects subspaces in a well
obtaining a simple matrix similar to any given matrix. But they do indicate
defined way. 1t is possible to construct an invariant subspace by starting with
that the characteristic polynomial and its factors are somehow involved. A
any nonzero vector U E "Y and its images under T. To see how, let
complete solution to the problem of finding invariants and canonical forros
for matrices under similarity is obtained in Chapter 8.
for m = 2, 3, 4, ....
If the dimension of "Y is n, then there exists an integer k, k ::s; n, such that
Problems
the vectors U, T(U), T 2 (U), ... , Tk- 1(U) are linearly independent and the
vectors U, T(U), T 2 (U), ... , Tk- 1(U), Tk(U) are linearly dependent. (Why?) 1. Find the characteristic polynomial for each of the following maps.
Now since Tk(U) E !l'{U, T(U), ... , Tk- 1(U)}, the subspace 9'(U, T) a. T(a, b) = (3a + b, a - 2b).
= ff'{U, T(U), ... , Tk- 1(U)} is an invariant subspace of T. Although dif- b. T(a, b, e) = (a + 3b - 2c, 2a + b, 4b - 4c).
ferent vectors need not produce different in.variant subspaces, this procedure c. T(a + bt) = 2a + b + (3a - 4b)t.
can yield invariant subspaces in infinite variety. d. T(a + bt + ct 2 ) =a+ b- 3c + (2a- c)t 2 •
2. Use the determinan! and trace to obtain the characteristic polynomial for
Examp/e 5 Let TE Hom(i/ 3 ) have the matrix a. (i j). b. (_? Ó)· c. (b -?)· d. (~ i).
3. a. Find a general expression for the coefficient sn_ 1 in the characteristic

A=(-~ -~)
-2 polynomial of an n x n matrix A = (alj). Consider s2 for a 3 x 3 matrix
3 first.
-5 7 -l b. Use the determinan! (s3), s2, and the trace (s¡) to obtain the characteristic
polynomials of

(~1 -~o 1l). (211 425 -6


-3)2 . o1 o.
o)
with respect to the standard basis. (A is the nondiagonalizable matrix con-
sidered in Example 5, page 230.) Suppose we take U= (0, O, 1), then l. ii. iii.
(
o1
o 1 1
238 5. CHANGE OF BASIS Review Problems 239

c. Find the transition matrix P from {E;} to B~ and Q from {E;} to B~.
4. Show by example that the rank of an n x n matrix A is independent of the
d. Show that Q- 1 AP = A'.
characteristic polynomial. That is, the rank and the n values s, ... , sn give
n + 1 independent invariants for A under similarity. · 2. a. Let S be the set of all systems of linear equations in m unknowns. Prove
that equivalence of systems of linear equations defines an equivalence
S. Let T(a, b, e) =(a+ b, -a/2 + b + e/2, e- b).
rclation on S.
a. Find a line e, through the origin, that is an invariant subspace of T.
b. What would be a good choice of canonical forms for this relation?
b. Show that the plane through the origin with normal e is an invariant
subspace of T. 3. Let S= -ftnXJ• and define the relation ~ by A ~ B if IAI = IBI.
c. Use part b to conclude that a proper, invariant subspace of Tneed not be a. Show that ~ is an equivalence relation on S.
a subspace of characteristic vectors. b. How is this relation related to equivalence and similarity of matrices?
~ .. c. Choose a set of canonical forms for ~, when n = 3.
6. Without using matrices or linear equations, show that if T has A as a charac-
teristic value, then it has many characteristic vectors corresponding to A. 4. Show that every 2 x 2 matrix of the form (% ~) is diagonalizable.
7. Suppose A is a characteristic value for TE Hom(r).
5. Determine which rotations in the plane have invariant subspaces of charac-
a. Prove that .9'(A) = {V E r¡ T(V) = AV} is a subspace of r.
teristic vectors. That is, for which values of rp does T(a, b) = (a cos rp -
b. What is our usual notation for .9"(A) when A = O?
b sin rp, a sin rp + b cos rp) have such invariant subspaccs?
8. Determine if the following maps are diagonalizable.
6. Which of the following matrices are diagonalizable?
a. T(a, b, e) = (b, a, e). d. T(a, b) = (2a- b, a 3b). +
o o 1)
b. T(a + bt) = 4a + 4bt. e. T(a, b, e) = (2a, a- 2b, b + 3e).
c. T(a + bt + et 2 ) =a + e + (a+ 4b)t + 4et 2 •
a. (?0 2?08). b. (?0 2?28). c. (?2 0?18). d.
(o1 ol o1 .
9. Let T(a, b, e) = (2a + e, a + 3b - e, a + 2e). 7. Let .9', !T, and d/t be subspaces of a vector space r. Devise definitions for the
a. Show that T is diagonalizable. sum .9' + !T + t1ll and the direct su m .9' EB !T EB d/t which generalize the
b. Find invariant subspaces of characteristic vectors for T and show that r 3 definitions for the terms for two subspaces.
is the direct sum of these spaces.
8. Let TE Hom(1'""4) be given by
10. Supposé TE Hom(1'""3) has 1 as a characteristic value and .9'(1) is a line.
Suppose further that !T is an invariant subspace for Tsuch that r 3 = .9'(1)8:) !T.
T(a, b, e, d) =(a+ d, -3b, 2e- 3a + 3d, 3a- d).
Show that every plane n parallel to !T is sent into itself by T, i.e., T[n] e n. a. Show that T is diagonalizable.
b. Find the invariant subspaces of characteristic vectors .9'(2), .9'(- 2), and
11. Let T be the map in Example 5, page 236. .$"( -3).
a. Show that {U, T(U), T 2 (U)} is a basis for 1'""3 when U= (1, 1, O). c. Show that 1'""4 = .9'(2) ffi .9"(- 2) EB .9'(- 3). (See problem 7.)
b. Find the matrix of T with respect to the basis in part a.
c. ~ind the invariant subspace for T generated by U= (0, 1, 2) and its 9. Define TE Hom(1'"" 3 ) by
1mages. T(a, b, e) = (4a +
2b - 4e, 2a- 4c, 2a +
2b - 2e).
d. Find the matrix of Twith respect to the basis {(0, 1, 2), T(O, 1, 2), (1, l, 1)}. a. Find the characteristic polynomial of T.
b. Show that Tis not diagonalizable, even though the characteristic equation
has three distinct characteristic roots.
c. Let U= (1, O, O) and show that B ={U, T(U), T 2 (U)) is a basis for 1'"" 3 ;
Review Problems or that .9"( U, T) = r 3 • .
d. Find the matrix of T with respect to the basis B of part c.
10. Prove that if A has two linearly independent characteristic vectors correspond-
l. Let !['E Hom(r 3, r 2) be defined by T(a, b, e) = (2a- b + 4e, a+ 2b - 3e),
ing to the characteristic value ). = r, then (A - r) 2 is a factor of the charac-
and suppose A is the matrix of T with respect to the standard bases.
teristic polynomial of A. [A need not be diagonalizable.]
a. Find the canonical form A' for A under equivalence.
b. Find basis B~ and B~ for r 3 and r 2 such that A' is the matrix of T with 11. Give an example of a map T for which .9"( U, T) :f:. .9"( V, T) whenever U and
respect to B{ and B~. V are linearly in dependen t.
rr
240 .:;,; 5. CHANGE OF BASIS

12. If A=(-~ -~), A'= (-g ~), and P = (-~ -i), then P- 1 AP =
A'. Suppose A is the matrix of Te Hom(R2[t]) with respect to the pasis
B = {t, 1 - 5t}.
a. Find T(a + bt).
b. Use the matrix P to obtain a basis B' with respect to which A' is the
matrix of T.

lnner Product Spaces

§1. Definition and Examples


§2. The Norm and Orthogonality
§3. The Gram-Schmidt Process
§4. Orthogonal Transformations
§5. Vector Spaces over Arbitrary Fields
§6. Complex lnner Product Spaces and Hermitian Transformations
243
242 6. _1_1\f_NER PRODUCT SPACES §1. Definition and Examples

The vector space ~ 2 was introduced in Chapter 1 to pro vide an algebraic Therefore an inner product is linear in each variable and so is said to be
basis for the geometric properties of length and angle. However, the defini- bi/inear.
tion of an abstract vector space was molded on "Y 2 and did not include the
additional algebraic structure of tff 2 . tff 2 is an example of an inner product Example 1 The dot product in ~~~ is an inner product, and ~~~ or
space and we now turn to a general study of such spaces. This study will ("f/"11 , o) is an inner product space.
culmina te in showing that a certain type of matrix, called symmetric, is always Recall that if U= (a 1 , ••• , a11) and V= (b¡, ... , bn), then
diagonalizable. Inner product spaces also ha ve applications in areas outside
of geometry, for example, in the study of the function spaces encountered in 11

the field of functional analysis. U o V= a 1b 1 + ··· + G b 11 11 = L a¡b¡.


i=l

The dot product is symmetric because multiplication in R_ is commutative.


To check that the dot product is linear in the first vanable, suppose U
§1. Definition and Examples - (a a) V= (b ... b) and W = (e 1 , ••. , e11 ). For r, sER,
- ¡, · · · ' n' Í' ' n'
rU + sV = (ra 1 + sb 1 , ••• , ra11 + sb 11 ), therefore
Recall that the space ~ 2 was obtained from "Y 2 by adding the dot pro- n n t1
duct to the algebraic structure of "Y 2 • To generalize this idea we will require (rU + sV) o W = L (ra;
i= 1
+ sb;)e; = r .L a¡e¡ + s.~
•= 1
b¡C¡
that a similar operation, defined on an abstract vector space, satisfy the basic •-1
properties of a dot product Iisted in Theorem 1.2 on page 19. =rUoW+sVoW.

Definition Let "Y be a real vector space and Jet <, ) be a function The verification that the dot product is positive definite is left to the reader.
that assigns real numbers to pairs of vectors, that is, <U, V) E R for each
U, V E"//", <, ) is an inner produet if the following conditions hold for all Example 2 Let B = {(1, 3), (4, 2)}. For this basis, define <, >B
U, V, WE "Y and r, sER:
on "f/" 2 by
l. (U, V)= (V, U) <, )is symmetrie.
2. (rU + sV, W) = r(U, W) + s(V, W). (U, V)B = a 1 b 1 + azbz,
3. If U ,¡, O, then (U, U) > O; (0, O) = O ( , ) is positive
definite. ~here U: (a 1 , a 2 )a and V: (b 1 , b2 )B. Then <,
)Bis an inner product on "//" z·
The vector space "Y together with an inner product ( , ) is an inner produet We will justify that <, )Bis po~itive defi~ite here; Suppos; U: (a¡, a2~B•
spaee and will be denoted ("//",<, )) or simply tff. then u,¡, O if and only if the coordmates sat1sfy (a 1 ) + (a2) > O, that JS,
U,¡, O if and only if (U, U)B >O.
The inner product space ("f/" 2 , ( , )B) is not the same space as 8' 2
Since the inner product of two vectors is a scalar, an inner product is
= ("f/" 2 , o). For example, ((1, 3), (1, 3))B = 1, since (1, 3): (1, O)a bu~
sometimes called a sealar produet, and finite-dimensional inner product spaces
(1, 3)o(1, 3) = 10. As another examp1e, (-2, 4):(2, -1)B and (1, -7).
are often called Euclidean spaees.
An inner product is a function of two variables, th~ vectors U and V. ( -3, 1)a, therefore
If the second variable is held constant, say V = V0 , then a real-valued func- ({-2, 4), (1, -7))B = 2(-3) + (-1)1 = -7.
tion T is obtained, T(U) = (U, V0 ). The second condition above n:quires
that T be linear, thus it is said that an inner product is linear in the :first Obvious1y every basis for "Y; gives .rise to an inner product on "Y 2• and
variable. But using the symmetry of an inner product, there are many ways to make "Y 2 into an inner product space. However, not

+ sW) = (rV + sW, U)= r(V, U)+ s(W, U)


all such inner products are different; ( , )¡(l,OJ,(O,I)l and )¡¡o,1),(1,0)l <,
(U, rV both ha~e the same effect as the dot product on vectors from "//" 2· Are there
= r(U, V)+ s(U, W). any other bases B for "f/" 2 for which ( , )Bis the dot product?
244 6. INNER PRODUCT SPACES §1. Definition and Examples 245

The use of a basis to define an inner product can easily be applied to is a positive number (j such that jl(x) > k/2 for all values of x such that
any finite-dimensional vector space. A polynomial space provides a good p- (j < x < p + b. For simplicity, assume O< p - (j and p + (j < 1, then
example.
(/,/) = IJZ(x)dx = Jp-6f (x)dx + Jp+6 f (x)dx + JI /
2 2 2
(x)dx
Examp!e 3 Choose B = {1, t, t 2
as a basis for R 3 [t], and define
Jo o _p-6 p+6
the function ( , ) 8 by
}
p+6 J\x)dx > Jp+6 tk dx = bk > O.
;:::
J p-6 p-6

Thus ( , ) is positive definite and defines an inner product on $'.


It is easily shown that ( , ) 8 satisfies the conditions for an inner. product,
therefore R 3 [t] together with <, )
8 yields an inner product space of poly- Examp/e 5 Since polynomials can be thought of as continuous
nomials. In this space, functions, the integral can be used to define various inner products on a
polynomial space. For P, Q E R 3 [t], set
(t, t)B = 1,
and so on. (P(t), Q(t)) = f: P(t)Q(t) dt,

An inn_er product space need not be finite dimensional. The integral can then (R 3 [t], ( , ) ) is a different inner product space from the one con-
be used to provide an inner product for functions. structed in Example 3. Computing the inner product for any of the three
pairs of polynomials considered in Example 3 shows that this is another
Example 4 Consider the vector space ff of all continuous real- space, for
valued functions with domain [0, 1]. For any two functionsf, g E ff, define
(t, t) = J: 2
1 dt = 1/3,

= J:
(f, g) = J>(x)g(x) dx.
2 3
(t, t ) t dt = 1/4,
Thus
(3 + 2t, 4t + t 2) = 61/6.
(x, x) = J: 2
x dx = 1/3,
Examp/e 6 The function ( , ) defined on "f/ 2 by
(6x, .JI - x 2
) = J: 6x.JI - x 2 dx = 2,
··~

((a, b), (e, d)) = 3ae- ad -be+ 2bd

(sin nx, cos nx) = J: sin nx cos nx dx =! J: sin 2nx dx =O.


is an inner product.
( , ) is easily seen to be symmetric. For bilinearity, let r, sE R, then

This function is an inner product on ff if the three conditions in the (r(a 1 , b 1) + s(a 2 , b 2 ), (e, d))
definition hold. Symmetry is immediate and bilinearity follows easily from = ((ra 1 + sa 2 , rb 1 + sb 2 ), (e, d))
the basic properties of the integral. However, the fact that ( , ) is positive
definite is not obvious. It must be shown that (/,/) = J6[f(x)] 2 dx > Ofor = 3(ra 1 + sa 2 )e - + sa 2 )d- (rb 1 + sb 2 )e + 2(rb 1 + sb 2 )d
(ra 1
any nonzero function f But to say f :f. O only means there is a point p at = r(3a 1e- a 1d- b 1e + 2b 1d) + s(3a 2 e- a 2d- b 2e + 2b 2 d)
whichf(p) :f. O, and without using continuity it is not possible to prove that
(/,/)>O. Suppose/(p) :f. OandjZ(p) =k> O. Then by continuity, there = r((a 1 , b 1), (e, d)) + s((a 2 , b 2 ), (e, d)).
246 6. INNER PRODUCT SPACES §2. The Norm and Orthogonality 247

Thus <, )
is linear in the first variable, and since <, ) is symmetric, it is 8. Letf((a, b), (e, d)) =(a b)(i ~) (~).
bilinear. Finally, if (a, b) i= (0, 0), then
a. Find/((1, -2), (1, -2)) and/((5, -3), (-1, 14)).
b. Show that f is bilinear.
((a, b), (a, b)) = 3a 2
- 2ab + 2b 2
c. Does f define an inner product on "Y 2?
= 2a 2
+ (a - b) 2 + b2 > O 9. Suppose <,) is an inner product defined on "1'2. Let ((1, 0), (1, O)) = x,
((1, O), (0, 1)) =y, and ((O, 1), (0, 1)) = z. Show that
and the function is positive definite. The inner product space obtained from ((a, b), (e, d)) = xae + yad + ybe + zbd
"Y 2 using this inner product is neither ~ 2 nor the space obtained in Example 2,
for ((1, 0), (0, 1)) = 3 and ((1, 3), (1, 3)) = 13. =(a b)(~ ~) (d).
10. Find the matrix (~ ~) of problem 9 for
Problems a. The dot product.
b. The inner product in Example 2, page 243.
l. Evaluate the following dot products: c. The inner product in Example 6, page 245.
a. (2, 1)o(3, -4). b. (4, 1, 2)o(2, -10, 1). 11. Suppose x, y, z are arbitrary real numbers, not all zero. Does the bilinear
C. (-3, 2)o(-3, 2). d. (0, 1, 3, 1)o(2, 5, -3, 0). function
2. Let B = {(O, 1), (1, 3)} be a basis for "Y 2 and define<, ) 8 by< U, V) 8 = a1 a2 + f((a, b), (e, d)) =(a b)(; ~) (~)
b1b2 where V: (a 1, b1) 8 and U: (a 2, b 2)8 •
a. Evaluate i. ((0, 1), (0, 1))8 • ii. ((1, O), (1, 0)) 8 • define an inner product on "Y 2?
iii. ((1, 3), (1, 3))8 . iv. ((1, 3), (0, 1))8 •
v. ((2, 4), (1, 0))8 • vi. (( -2, 0), (1, 1))8 •
b. Show that < , )n is an inner product.
c. Show that if V: (a, b)n, then (V, (0, 1))8 =a and (V, (1, 3))8 = b. §2. The Norm and Orthogonality
3. Suppose "Y has B = {U¡, ... , Un} as a basis and < , ) 8 is defined by< V, W) 8 =
i
1=1
a¡b¡ where V: (a¡, ... , an) 8 and W: (b¡, ... , bn) 8 • When representing a vector U from ~ 2 geometrically by an arrow, the
a. Prove that < , )n is an inner product for "Y. length of the arrow is o .J U U.
This idea of length can easily be extended to
b. Show that V= (V, U1 ) 8 U1 + · · · +(V, Un)nUn. any inner product space, although there will not always be such a clear
geometric interpretation.
4. Let B = {1, 1 + t, 1 + t + ! 2 } and define <, ) 8 on R 3 [t] as in problem 3.
Compute the following inner products:
a. (1 + t, 1 + 1)8 • b. (1, 1 + t + t 2 ) 8 • Definition Let ("Y, <, )) be an inner product space and U e"//, then
2
C. (2t- 1 , 4 + t - 3t ) 8 • d. (3 - t 2, 2 + 4t + 6t 2) 8 •
2
the norm of U is 1! Ull = .J<U, U). U is a unit vector if 11 Ull = l.
5. Let ("Y, <, )) be an inner product space. Prove that (O, V) = O for every
vector V E "Y.
Consider the function space §' with the integral inner product. In this
6. Por u, V E "Y2• Jet ( ur, VT) be the 2 X 2 matrix with ur and vr as columns. space, the norm of a function f is (J¿j 2(x)dx)t. Thus llx3 11 = (Jáx 6 dx)t
Define b(U, V)= det(UT, vr).
= 1/.J"l and lli'll = .j:J¡(e2 - 1). Since the norm of x 3 is 1/.j7, the norm of
a. Find b((2, 3), (1, 4)), b((2, 3), (2, 3)), and b((1, 4), (2, 3)).
b. Show that b is a bilinear function.
.j7x3 is 1, and .J1x
3
is a unit vector in the inner product space ($', <, )).
In general, if U i= O, then (1/11 UII)U is a unit vector.
c. Show that b is neither symmetric nor positive definite.
A basic relation between the inner product of two vectors and their
7. Define b(J, g) = /(l)g(l) for f, g E F. norms is given by the Schwartz Inequality.
a. Find b(2x, 4x - 3) and b(x 2, sin x).
b. Show that b is bilinear. cr
c. Does b define an inner product on F? Theorem 6.1 (Schwartz Inequality) Suppose U and V are vectors
248 6. INNER PRODUCT SPACES §2. The Norm and Orthogonality 249

from an inner product space, then (U, V) 2 ::::; IIUII 2 11Vf or J(U, V)J ::::; but the geometric significance in tff 2 requires that the cosine function be
JJUIJJJVIJ. u sed .
.;;

Proof Consider the vector xU + V for any x e R. Since ( , ) is Definition If U and V are nonzero vectors in an inner product space,
positive definite, O ::::; (xU + V, xU + V). Using the bilinearity and sym- then the angle between U and Vis
metry of ( , ), we have

O~ x (U, U)+ 2x(U, V)+ (V, V)= IJUII 2 x 2 + 2(U, V)x + JJVf,
2
cos
-t( (U, V))
11 UJIIJVII .

for all real values of x. This means that the parabola with .equation With this definition it is possible to speak of the angle between two
2 2
y = JI UIJ x + 2( U, V)x + JI VIJ 2 is entirely above or tangent to the x axis, polynomials or two functions.
so there is at most one real value for x that yields y = Oand the discriminant If the definition of angle is extended to the zero vector by making the
2 2
(2( U, V) ) - 411 Ull ll VIJ 2 must be nonpositive. But (2( U, V) ) 2 - 411 UJ1 2 JJ VJJ 2 angle between it and any other vector zero, then the relation (U, V) =
::::; Ois the desired relation.
JI UJI 11 V/1 cos 8 holds for all vectors in an inner product space. For the space
tff 2 , the definition of angle was motivated by the law of cosines in the
Corollary 6.2 (Triangle Inequality) If U and V are elements of an Cartesian plane. Now in an arbitrary inner product space the law of cosines
inner product space, then JI U + VIl ::::; JI UJI + JI V JI. is a consequence of the definition of angle.

Proof Theorem 6.3 (Law of Cosines) If U and V are vectors in an inner


product space and 8 is the angle between them, then
2
IIU + Vll =(U+ V, U+ V)= IIUII 2 + 2(U, V)+ IIVII 2
2 2
~ IIUII + 211UII IJVII + IJVII 2
2
= (11 Ull + IIVII)
2

JIU- Vll 2 = IIUII + IJVII - 2JJUIIIJVII cos 8.

If the vectors U, V, U + V from tff 2 are represented by the si des of a Proof This follows at once from
triangle in the plane, then the triangle inequality states that the Iength of one
side of the triangle cannot exceed the sum of the Iengths of the other two IJU- VJI 2 =(U- V, U- V)= (U, U)- 2(U, V)+ (V, V).
si des.
The Schwartz inequality justifies generalizing the angle measure in tff 2 The angle between two vectors from an arbitrary inner product space
to any inner product space, for if U and V are nonzero vectors, then (U, V) 2 need not be interpreted geometrically. Therefore it will often not be of in-
~ IIUII JIVII implies that
2 2
terst to simply compute angles between vectors. However, the angle of 90°
or n/2 plays a central role in all inner product spaces, much as it does in
(U, V) Euclidean geometry.
-! ~ IJUIIIJVII ~ l.

Therefore, there exists exactly one angle 8 such that O ~ 8 ~ n and Definition Let ("//, ( , ) ) be an inner product space. U, V e "f/ are
orthogonal if (U, V) = O.
(U, V)
8
cos = IIUIJIIVIJ' The angle between nonzero orthogonal vectors is n/2, therefore ortho-
gonal vectors in tff 2 or tffn might also be called perpendicular. In !!F, with the
Notice that any function with [ -1, l] in its range could have been chosen, integral inner product used in Section l, the functions x and 3x - 2
§2. The Norm and Orthogonality 251
250 6. INNER PRODUCT SPACES

are orthogonal, as are the functions sin nx and cos nx. As another example, Example 1 {(1/.}2, 1/.}2), (1/.}2, -1/.}2)} is an orthonormal
consider the inner product space ("Y, ( , ) 8 ) where the inner product ( , ) 8 basis for rff 2 and {(1/.j2, O, 1/.}2), (1/.)3, 1j.J3, -1/.}3), ( -1/.)6, 2/.}6, 1/.}6)}
is defined using coordinates with respect to a basis B. In this space any two is an orthonormal basis for rff 3 •
distinct vectors from B are orthogonal. Thus (2, 1) and (1, O) are orthogonal
in the space ("1' 2 , ( , ) 8 ) with B = {(2, 1), (1, 0)}. Examp!e 2 An orthonormal basis for rff 2 can be written in one of
The definition of orthogonal imp1ies that the zero vector is orthogona1 the following two forros for sorne angle cp:
to every vector including itself. Can an inner product space have a nonzero
vector that is orthogona1 to itself? {(cos cp, sin cp), (-sin cp, cos cp)} or {(cos cp, sin (p), (sin cp, -cos cp)}.
The Pythagorean theorem can be stated in any inner product space and
is proved by 'mnply replacing V with .:.... V in the law of cosines. To verify this, suppose {U, V} is an orthonormal basis for cf 2 • If U
= (a, b); then 1 = IIUII 2 = a 2 + b2 implies that lal : : ; l. Therefore there
exists an ang1e esuch that a= cos 8. Then b2 = 1 - cos 8 imp1ies b = ±sin e.
2
Theorem 6.4 (Pythagorean Theorem) If U and V are orthogonal
Since ±sin e = sin ±e and cos e = cos ±e, there is an angle cp su eh that
vectors,then IJU+ Vll 2 = IJUIJ 2 + IIVIJ 2 •
U= (cos cp, sin cp). Similarly there exists an angle a such that V= (cos a,
Since an inner product space has more algebraic structure than a vector sin a). Now
space, one might exp~ct restrictions on the choice of a basis. To see what
night be required, consider the inner product space rff 2 • Since rff 2 is essentially
O = U o V= cos cp cosa + sin cp sin a = cos (cp - a),
the Cartesian plane viewed as im inner product space, a basis {U, V} for rff 2 so cp and a differ by an odd multiple of n/2 and cos a = (-sin cp )( ± 1) and
should yield a Cartesian coordinate system. That is, the coordinate axes sin a = (cos cp)(± 1). Therefore U o V= O implies that V= ±(-sin (p,
2'{U} and 2'{V} should be perpendicular. Further, the points rU and rV
cos cp).
should be r units from the origin. This requires that U and V be orthogonal
and unit vectors. Consider the change of basis from the standard basis for rff 2 to the first
form for an orthonormal basis in Example 2. If T: rff 2 ~ rff 2 is the linear
Definition A basis {V1 , ••• , Vn} for a vector space "Y is an orthonormal transformation obtained from this change, then
basis for the inner product space ("Y, ( , )) if (V¡, Vi) = O when i t= j and
(V¡, V¡)= 1, for 1 ::::; i,j::::; n. T(1, O) = (cos cp, sin cp) and T(O, 1) = (-sin cp, cos cp).

That is, an orthonormal basis is a basis consisting of unit vectors that Therefore T is given by
are mutually orthogonal. The standard basis {E¡} is an orthonormal basis
T(x, y) = (x cos cp -y sin cp, x sin cp + y cos cp).
for rffn, but there are many other orthonormal bases for rffn.
There is a useful notation for expressing the conditions in the above
That is, this change in basis is equivalent to a rotation through the angle
definition. The Kronecker delta is defined by
cp. We will see, in Example 2, page 262, that the change from the standard
basis to the second form for an orthonormal basis is equivalent to a reflec-
(jii = w if i t= j
if i =j.
tion in a line passing through the origin.

Using the Kronecker delta, {V1, ••• , Vn} is an orthonormal basis for the Examp!e 3 Let B = {(2, 5), (1, 4)} and Jet ( , ) 8 be the inner
inner ~roduct space ("Y, ( , )) if and only if (V¡, Vi) = (jii· The Kronecker product for "Y 2 defined using coordinates with respect to B. Then
delta ·1s useful any time a quantity is zero when two indexes are unequal.
For example, the n x n identity matrix may be written as In = (()ii). ((2, 5), (2, 5))B = 1 = ((1, 4), (J, 4))B
§2. The Norm and Orthogonality 253
252 6. INJIER PRODUCT SPACES

and Thus the orthogonal subset {(1, 2, 1), (1, -1, 1)} of rff 3 yields the Ofthonormal
set {(1/~6, 2/~6, 1/~6), (1/.J3, -1/~3, 1/~3)}.
((2, 5), (1, 4))8 = o.
Theorem 6.7 An orthogonal set is linearly independent.
Therefore Bis an orthonormal basis for the inner product space ("Y 2 , 8 ). <, )
Notice that the standard basis for "Y 2 is not an orthonormal basis for this
inner product space. For example, the norm of (1, O) is J4f j3, so (1, O) is Proof Suppose S is an orthogonal subset of the inner product
not a unit vector in this space. space ("/', ( , )). S is Jinearly independent if no finite subset is linearly
The result of this example can be extended to an arbitrary finite-dimini- dependent; note that "Y need not be finite dimensional.
sional inner product space. Therefore, suppose a 1 V1 + · · · + akVk =O for sorne k vectors from S.
Then
Theorem 6.5 Bis an orthonormal basis for ("/', <, )) if and only if o= (O, V;)= (a 1 V 1 + · · · + a¡V¡ + · · · + akVk, V;)
<, ) equals <, )
8 . That is, if U: (a 1 , • •• , anh and V: (b 1 , • •• , b.)a, then B = a 1 (V1 , V1 ) + · · · + a¡(V¡, V¡)+ · · · + ak(Vk, V;)
is orthonormal if and only if (U, V) = a 1b 1 + · · · + a.b•.
= a¡(V¡, V¡).
This theorem is easily proved. It shows that the orthonormal bases are
Since V¡ # O, a¡ = O for each i, 1 5 i 5 k, and S is linearly independent.
precisely those bases for which the inner product can be computed with co-
ordinates much as the dot product was defined in rff 2 •
The coordinates of a vector with respect to an orthonormal basis are Example 4 An orthonormal set of functions from ff with the
easily expressed in terms of the inner product. This follows at once from integral inner product can be constructed using the fact that

t
Theorem 6.5.

Corollary 6.6 If B = { V 1 , ••• , V.} is an orthonormal basis for


sin 2nnx dx = J: cos 2nnx dx =O
("Y, < , )),
then the ith coordina te of U with respect to Bis <U, V;). That
for any positive integer n.
is, U: ((U, V1), ••• , (U, V.)) 8 . .
For example,

In sorne circumstances it is eíther not possible or not necessary to


obtaín an orthonormal basis for an inner product space. In such a case one
(sin 2nnx, cos 2mnx) = J: sin 2mrx cos 2mnxdx

of the following types of sets is often useful.


= J: sin 2(n + m)nx dx + J: sin 2(~ m)nx dx
Definition Let S be a subset of nonzero vectors from an inner product
=O.
space ("Y, ( , ) ). S is an orthogonal set if (U, V) = O for all U, V E S with
U # V. An orthogonal set S is an orthonormal set if 1/ Ull = l for all U E S. Therefore the functions sin 2nnx and cos 2mnx are orthogonal for all positive
integers n and m. Further, since
An orthonormal basis is obviously both an orthonormal set and an
orthogonal set. On the other hand, an orthonormal subset of an inner product
space need not be a basis for the space. For example, the set {(l, O, O), (0, I, O)}
J~ sin 2
2nnx dx = ! J~ l - cos 4nnx dx = !,
is orthonormal in rff 3 but it is not a basís for rff 3 . An orthonormal set can be
obtaíned from an orthogonal set by simply dividing each vector by its norm. the norm of sin 2nnx is lj.J2 and .J2 sin 2mrx is a unit vector. Proceeding in
§3. The Gram-Schmidt Process 255
254 6. INNER PRODUCT SPACES

b. What is the norm of (2, -7) in the inner product space obtained in a?
this manner, it can be shown that
9. Let S= {(1, O, 2, 1), (2, 3, -1, 0), (-3, 2, O, 3)}.
{ 1, -)2 sin 2nx, -)2 cos 2nx, ... , -)2 sin 2mrx, -)2 cos 2nnx, ... } a. Show that S is an orthogonal subset of ~4·
b. Obtain an orthonormal set from S.
is an orthonormal subset of.? with the integral inner product. 10. Following Example 4:
By Theroem 6.7, this set is Jinearly independent. But since it can be a. Show that if n :F m, then sin 2nn:x is orthogonal to sin 2mn:x.
shown that it does not span, it is nota basis for .?. However, using the theory b. Find the Fourier series expansion of /(x) = x.
ofinfinite series, it can be shown that for every functionfin .?, if O < x < 1,
then

"< §3. The Gr~m-Schmidt Process


f(x) = a0 + 2 L (a. cos 2nnx + b. sin 2nnx)
n=l

where We have seen that orthonormal bases are the most useful bases for a
given inner product space. However, an inner product space often arises with
a0 = (/, 1), a. = (f(x), cos 2nnx), h. = (f(x), sin 2nnx). a basis that is not orthonormal. In such a case it is usually helpful to first
obtain an orthonormal basis from the given basis. The Gram-Schmidt process
This expression for f(x) is caiJed the Fourier series off provides a procedure for doing this. The existence of such a procedure will
also show that every finite-dimensional inner product space has an orthonor-
mal basis.
Problems Por an indication of how this procedure should be defined, suppose
!/ = .'l'{ U1 , U2 } is a 2-dimensional subspace of an inner product space. Then
1. Use the integral inner product for !F to find the norm of the prob1em is to find an orthonormal basis {V1 , V2 } for !/. Setting V1
a. x. b. 2x + l. c. cos 8n:x. = UtfiiU1 IIyields a unit vector. To obtain V2 ,consider W 2 = U2 + xV¡.
2. Verify that the Schwartz inequality holds for For any x E R, .'l'{V1 , W2 } = .'l'{U 1 , U2 }. Therefore it is necessary to find
a. ·U= (1, 3), V= (2, 1) in ~ 2 • <
a value for x such that W 2 , V1 ) = O. Then V 2 can be set equal to W2 /ll Wzll·
b. U= x, V= .../1 - x 2 in !F. (How do we know that W 2 cannot be O?) W2 is orthogonal to V1 if
3. Show that the Schwartz inequality for vectors in ~" yields the inequality
2
(t
1=1
a1b1) ::;; (t (a (t (b
1~1
2
1) )
1~1
2
1) )

for all real numbers a1, b1• Therefore when x = -(U2 , V1 ), (W2 , V¡) =O.
4. Prove that (U, V) 2 = 11 Ull 2 11 Vll 2 if and only if U and Vare linearly dependen t.
5. Find two orthonormal bases for ~2 containing the vector (3/5, 4/5). Example 1 Suppose !/ = 2{U1 , U2 } is the subspace of lf 3 spanned
by U 1 = (2, 1, -2) and U2 = (0, 1, 2). Using the above technique,
6. Show that each of the following sets is orthonormal in ~ 3 and for each set find
an orthonormal basis for ~ 3 containing the two vectors.
a. {(1/.Y2, O, -1/V2), (0, 1, O)}. V¡ = U¡/IIU.II = (2/3, 1/3, - 2/3)
b. {(-1/3, 2/3, -2/3), co. t/v'2, 1/v2)}.
c. {(2/7, 3/7, 6/7), (6/7, 2/7, -3/7)}. and W 2 is given by
7. Let S be a nonempty subset of an inner product space ("Y,<,)). Show that the
set of all vectors orthogonal to every vector in S is a subspace of "Y. W2 = U2 --: cu2 o _v.w.
8. B = {(4, 1), (2, 3)} is a basis for "Y 2 • = (0, 1, 2)- [(0, 1, 2) o (2/3, 1/3, -2/3)](2/3, 1/3, -2/3)
a. Find an inner product defined on "Y 2 for which B is an orthonormal
basis. = (2/3, 4/3, 4/3).
256 6. INNER PRODUCT SPACES §3. The Gram-Schmidt Process 257

Setting V2 = W2 /IIW2 1l = (l/3, 2/3, 2/3) yields Suppose now that it is desired to find an orthonormal basis for the
3-dimensional space ~{U 1 , U2 , U3 }. Using the above procedure, we can
{V1, V2 } = {(2/3, l/3, -2/3), (l/3, 2/3, 2/3)}, obtain V 1 , and V2 such that {V1 , V2 } is an orthonormal set and 2{V1 , V2 }
== 2{U1 , U2 }. To obtain a vector W3 from U3 orthogonal to V1 and V2 ,
an orthonormal basis for f/ constructed from U1 and U2 • one might consider subtracting the orthogonal projections of U3 on V1 and
V 2 from U3 • Therefore let
There · is a good geometric illustration which shows why W 2 =
U2 - (U2 , V 1 )V1 is orthogonal to V 1 • Since V 1 is a unit vector,

Then

Since cos 8 ís the ratio of the side adjacent to the angle 8, and the hy- (W3 , V1 ) = (U3 , V1 ) - (U3 , V1 )(V1 , V1 ) - (U3 , V2 )(V2 , V1 )
potenuse, this vector is represented by the point P at the foot of the per-
== (U3 , V1 ) - (U3 , V1 ) =O.
pendicular dropped from U2 to the line determined by Oand V1 , see Figure 1.
Thus W 2 = U2 - (U2 , V1 )V1 can be represented by the arrow from P
So W 3 is orthogonal to V1 . Similarly W3 is orthogonal to V2 , and setting
to U2 • That is, W 2 and V1 are represented by perpendicular arrows. The
V3 = W 3 /ll W3 !1 yields an orthonormal basis {V1 , V 2 , V3 } for ~{ U1 , U2 , U3 }.
vector (U2 , V1 )V1 is called the "orthogonal projection" of U2 on V1 •
This construction should indicate how the following theorem is proved by
Figure 1 shows that subtracting ~he orthogonal projection of U2 on V1 from
induction.
U2 yields a vector orthogonal to V1 • In general, the orthogonal projection
of one vector on another is defined as in lff 2 with V1 replaced by U1 /ll U 1 11.
Theorem 6.8 (Gram-Schmidt Process) Let 8 be a finite-dimensional
inner product space with basis {U 1 , ••• , Un}. Then 8 has an orthonormal
Defínition If U 1, U2 are vectors from an inner product space with
U 1 of. O, then the orthogonal projectíon of U2 on U 1 is the vector
basis W~o ... , v.} defined inductively by V1 = Udll U1 l and for each
k, 1 < k :::;; n, Vk = Wk/11 Wk!! where

Example 2 Use the Gram-Schmidt process to obtain an orthonor-


mal basis for
:·.;

2{(1, O, 1, 0), (3, 1, -1, 0), (8, -7, O, 3)} e 8 4 .

·.'·:·.
·.,:·!
Set V1 = (1/.j'l, O, 1/.j'l, 0). Then

W2 = (3, 1, -1, O) - [(3, 1, -1, O) o (1/.j'l, O, IJ.j'l, 0)](1/.)2, O, 1/.)2, O)


= (2, 1, -2, 0),

and V2 = W2 /IIW2 11 = (2/3, l/3, -2/3, O) is a unit vector orthogonal to


Figure 1 V1 . Finally subtracting the orthogonal projections of (8, -7, O, 3) on V1
258 6. INNER PRODUCT SPACES §3. The Gram-Schmidt Process 259

and v2 from (8, -7, o, 3) gives Proof These follow at once from the corresponding results for
vector spaces, using the Gram-Schmidt process.
JV3 = (8, -7, O, 3) - [(8, -7, O, 3) o (1/)2, O, 1/)2, 0)](1/)2, O, 1/)2, O)
- [(8, -7, O, 3) o (2/3, 1/3, -2/3, 0)](2/3, 1/3, -2/3, O) Examp!e 3 Suppose tff is the inner product space obtained by
= (8, -7, O, 3)- (4, O, 4, O)- (2, 1, -2, O)= (2, -8, -2, 3). defining <, )on 1/ 2 by

So V3 = W3 /IIW3 I = (2/9, -8/9, -2/9, 1/3) and the orthonormal basis is <(a, b), (e, d)) = 3ae - 2ad- 2be + Sbd.

{(1/Jl, O, 1/)2, 0), (2/3, 1/3, -2/3, 0), (2/9, -8/9, -2/9, 1/3)}. Find an orthonormal basis for tff. (W¡;.._

An orthonormal basis can be constructed from the standard basis for


In practice it is often easier to obtain an orthogonal basis such as -r2 A vector orthogonal to (1, O) is given by
.
{U 1 , W2 , ••• , Wn} and then divide each vector by its 1ength to get an or-
f1onorma1 basis. This postpones the introduction of radicals and even allows <(0, 1), (1, O)) ( /
for the elimination offractions by sca1ar multip1ication. That is, if(1/2, O, 1/3, O) (0, 1) -((!,O), (1, O)) (1, O)= (0, !) - -2/3, O)= (2 3, !).
i5 orthogonal toa set ofvectors, then sois (3, O, 2, 0). In Examp1e 2 the second
and third orthogonal vectors are given by Therefore {(1, 0), (2, 3)} is an orthogonal basis for tff, and dividing each
vector by its length gives the orthonormal basis {(1/)3, O), (2/)33, 3/)33)}
(3, 1, -1, O) o (1, O, 1, 0)( O O) for tff.
1 1
( 3, 1' - 1, O) - (1, O, 1, O) o (1, O, 1, O) ' ' '

= (3, 1, -1, O)- (1, O, 1, O)= (2, 1, -2, O) betinition Let Y be a subspace of an inner product space tff. The set
and
y1. = {V E tffl <U, V) = O for all U E Y}
(8, -7, O, 3) o (1, O, 1, O) )
( 8, - 7, O, 3) - (1, O, 1, O) o (1, O, 1, O) (1, O, 1' O is the orthogonal eomplement of Y. The notation gJ. is often read "9' perp."
_ (8, -7,0,3)o(2, 1, -2,0)( _ O)
2 1 2
(2, 1, -2, O) o (2, 1, - 2, O) ' ' ' Example 4 If 9' is the line .P {(1, 3, - 2)} in tff 3 , then what is the
orthogonal complement of 9'?
= (8, -7, O, 3)- (4, O, 4, O)- (2, 1, -2, O)
Suppose (x,y. z) E 9'1., then (x,y, z) o (1, 3, -2) = Oor x + 3y - 2z = O.
= (2, -8, -2, 3). Conversely, if a, b, e satisfy the equation a + 3b - 2e = O, then
(a, b, e) o (1, 3, -2) =O and (a, b, e) E 9'1.. Therefore gJ. is the plane with
1\ow the desired orthonormal basis is obtained from the orthogonal set Cartesian equation x + 3y - 2z = O. ( 1, 3, - 2) are direction numbers
{(1, O, 1, 0), (2, 1, -2, O), (2, -8, -2, 3)} when each vector is divided by its normal to this plane, so g1. is perpendicular to the line 9' in the geometric
length. sen se.

Corollary 6.9 As a consequence of problem 7, page 254, the orthogonal complement


of a subspace is also a subspace. So if 9' is a subspace of an inner product
l. Every finite-dimensional inner product space has an orthonormal space tff, then the subspace 9' + g>l. exists. This sum must be direct for if
basis. U E 9' and U E 9'1., then <U, U) = O and U = O(Why?) If tff is finite dimen-
2. An orthonormal subset of a &~lite-dimensional inner product space sional, then Corollary 6.9 can be used to show that 9' EB gJ. is a direct sum
tff can be extended to an orthonormal basis for tff. decomposition of tff.
260 6. INNER PRODUCT SPACES §4. Orthogonal Transformations 261

Theorem 6.1 O If g is a subspace of a finite-dimensional inner pro- 7. Let T be a linear map from tf 3 to tf 3 su eh that T( U)• T( V) = U • V for all
duct space ~. then g = [/ El3 f/.i. U, V E tf 3. Suppose a subspace ff? is 'invariant under T, and prove that ff?L
is also invariant.
8. a. Verify that T(a, b, e) = Ga - ~c. b, !a + !e) satisfies the hypotheses of
Proof As a subspace of C, f/ is an inner product space and there-
problem 7.
fore has an orthonormal basis {V1 , ••• , Vk}. This basis can be extended to
b. Find a Iine ff? and the plane ff?L that are invariant under this T.
an orthonormal basis {V1 , ••• , Vk, Vk+l• ... , V"} for C. The proof will be
complete if it can be shown that the vectors vk+ 1' ... ' vn E [/.l. (Why is 9. Verify that 8 2 = ff? (f) ff?L when ff? = 2' {(1, 4) }. What is the geometric
interpretation of this direct sum decomposition of the plane?
this?) Therefore consider Vj with j > k. If U E Y, then there exist scalars
a 1 , ••• , ak such that U= a 1 V 1 + · · · + akVk, so 10. Verify that 8 4 = ff? (f) ff?L when ff? is the plane spanned by (1, O, 2, 4) and
(2, O, -1, 0).

Therefore vj E g.L for k < j :S: n and g = g El3 g.l, §'4. Orthogonal Transformations

What type of map sends an inner product space into an inner product
Problems
space? Since an inner product space is a vector space together with an inner
product, the map should preserve both the vector space structure and the
l. Use the Gram-Schmidt process to obtain an orthonormal basis for the spaces
spanned by the following subsets of t!n. inner product. That is, it must be linear and the inner product of the images
a. {(2, O, 0), (3, O, 5) }. of two vectors must equal-the inner product of the two vectors.
b. {(2, 3), (1, 2)}.
c. {(1, -1, 1), (2, 1, 5)}.
d. {(2, 2, 2, 2), (3, 2, O, 3), (0, -2, O, 6)}.
D~finition Let ("fí, <,
)r) and ("'fí, <, )
1r) be inner product spaces
and T: "fí --+ "/fí a linear map. T is orthogonal if for all U, V E "fí,
2. Construct an orthonormal basis from {1, t, t 2 } for the inner product space

(R 3 [t]; ( , )) where (P, Q) = J: P(t)Q(t)dt.


(T(U), T(V)) 1r = (U, V>r·

Since the norm of a vector is defined using the inner product, it is pre-
3. Let e= (1" 2 , whcre thc inner product is given by
( , ))
served by an orthogonal map. Thus the image of a nonzero vector is also
((a, b), (e, d)) = 4ac - ad- be + 2bd. nonzero, implying that every orthogonal map is nonsingular.
Use thc Gram-Schmidt process to find an orthonormal basis for e.
a. starting with the standard basis for "f" 2 •
Example 1 The rotation about the origin in C2 through the angle
- ,., b. starting with the basis {(2, 2), (- 3, 7) }.
cp is an orthogonal transformation of lff 2 to itself.
4. Define an inner product on "f" 3 by It is not difficult to show that T(U) o T(V) = U o V for all U, V E lff 2
((a 1 , a 2, a 3), (b¡, bz, b3)) when
= a 1 b¡ - 2a¡bz - Za2b 1 + 5a 2b 2 + a 2b3 + a3b2 + 4a3b 3.
Use the Gram-Schmidt process to construct an orthonormal basis for the T(a, b) = (a cos cp - b sin cp, a sin cp + b cos cp).
inner product space (1" 3 , ( , )) from the standard basis for 1" 3 •
Use the fact that sin 2 cp + cos 2 cp = l.
5. Carefully write an induction proof for the Gram-Schmidt process.
6. Find ff?L for the following subsets of en: The distan ce between two points A and B in the plane ( or lff ") is the norm
a. ff? = 2'{(2, 1, 4), (1, 3, 1)}. b. ff? = 2'{(1, 3)}. of the difference A - B. If T: C 2 ~ lff 2 is orthogonal, then
c. [!? = 2'{(2, -1, 3), (-4, 2, -6)}. d. [!? = 2'{(1, o, --2)}.
e. ff? = 2'{(2, O, 1, 0), (1, 3, O, 2), (0, 2, 2, 0)}. IIT(A)- T(B)II = IIT(A- B)ll-= IIA- Bll.
·1
262 6. INNER PROI1UCT. SPACES §4. Orthogonal Transformations 263

That is, an orthogonal map of 8' 2 to itself preserves the Euclide'an distance given by
between points.
T(a, b) = (a cos 2rx + b sin 2o:, a sin 2rx - b cos 2a).
Example 2 The reftection of 8' 2 in a line through the origin is an
orthogonal transformation. Notice that Ttransforms the standard basis for 8' 2 to {(cos cp, sin cp), (sin cp,
Suppose T is the reftection of 8' 2 in the line .e with vector equation -cos cp)} when cp = 2a. This is the second form for an orthonormal basis
P = tU, tER, where U is a unit vector. For any V E 8' 2 , Jet W =(V o U) U; for t 2 obtained in Example 2, page 251.
.then W is the orthogonal projection of V on U. Since W is the midpoint of
thc line betw~V and T(V), W =!(V+ T(V)), see Figure 2. Therefore We will find in Theorem 6.13 that the only orthogonal transformations
T( V) = 2 W - V = 2( V o V) U - V. of t 2 to itself are rotations about the origin and reftections in lines through
The reflection T is linear for the origin.
The dot product was defined in $ 2 to introduce the geometric properties
T(r 1 V 1 + r2 V2) = 2[(r 1 V1+ r 1 V 2 ) o U]U- (r 1 V 1 + r 2 V 2 ) of length and angle. Since an orthogonal map from $ 2 to itself preserves the
= r1 2(V1 o U)V- r 1 V 1 + r2 2(V2 o U)U- r2 V 2 dot product, it preserves these geometric properties. However, the converse
is not true. That is, there are transformations of the plane into itself that
= r1 T(V1) + r2 T:Vz), preserve length and angle but are not linear. A translation given by T(V)
= V + P, P a nonzero, fixed vector, is such a transformation.
and T preserves the dot product for

T(V 1) o T(V2 ) = [2(V 1 o U) U - V 1] o [2(V2 o V)V- V 2 ] Theorem 6.11 Let TEHom('f", 11/) and suppose B = {V1 , •.• , V.} is
an orthonormal basis for (1'", ( , ).,). Then T is orthogonal if and only if
= 4(V 1 o U)(V2 o U) - 2(V1 o U)( U o V2 ) {T(V1 ), ••• , T(V.)} is an orthonormal set in (1/í, ( , ) 1y).
- 2(V2 o U)(V1 o U) + V1 o V1
Proof (=>) If T is orthogonal, then

Thus T is an orthogonal transformation. If U= (cosa, sin a), as in Figure


2, and V= (a, b), then a short computation shows that the reflection T is
(<=) Suppose (T(V¡), T(Vi));r = O¡i and let U, V E 1'". 1f U= L7 =1 a¡V¡

V
and V= LJ=
1 biVi, then

\y• n n

~/\
= La¡ L bj (T(V;), T(V))1Y
i= 1 j= 1

n n n
• T(V) =
i=l j=l
L a;b; =(U, V)B =(U,
L L, a;bAj = i=l .
V).,.

Definition Two inner product spaces are isomorphic ifthere exists an


Figure 2 orthogonal map from 01~ onto the other.
264 6. INNER-2PRODUCT SPACES §4. Orthogonal Transformations 265

Theorem 6.12 Any two n-dimensional inner product spaces are Taking the plus sign for B gives
isomorphic.
-sin cp)·
cos (/J
Proof Suppose re and re' are n-dimensional inner product spaces.
Let B = {V1, ••• , V.} and B' = {V;, ... , V~} be orthonormal bases for
Since A is thc matrix of T with respect to the standard basis,
rff and re', respectively. Set T(V¡) = V[, l ~ i ~ n, and extend T linearly to
aH of re. Then by Theorem 6.ll, T is orthogonal. Since T is clearly onto, the
spaces are isomorphic.
T(x, y) = (x cos cp -y sin ~o, x sin cp +y cos cp).

That is, T is the rotation about the origin, through the angle cp.
Thus any n-dimensional inner product space is algebraically indistin- Using the minus sign for B gives
guishable from re •. This result corresponds to the fact that every n-dimen-
sional vector space is isomorphic to the space of n-tuples, "Y•. 1t of course sin cp)
does not say that re. is the only n-dimensional inner product space. -cos cp '
Supposc T: re-+ re is an orthogonal map and B = {V 1 , ••• , V.} is an
orthonormal basis forre. What is the nature ofthe matrix A for Twith respect and T is given by
to B? If A =(a¡) then T(Vk): (a 1k, ... , a.k)n. The fact that <, ) <,
= >B
and (T(V¡), T(V)) = Dii implies a 1 ¡a 1 j + · · · + a.¡a.j = Dii. But this sum T(x, y) = (x cos <p +y sin cp, x sin cp -y cos cp).
is the entry from the ith row and jth column of the matrix product ATA.
That is, ArA = (bii) = !,., or the inverse of A is its transpose. That is, Tis the reftection in the line through the origin with direction numbers
(cos cp/2, sin cp/2).
Definition A nonsingular matrix A is orthogonal if A-! = Ar.
Examp/e 3 Determine the nature of the orthogonal map from re 2
to itself given by
Notice that the condition A-! = Ar implies that the n rows (or the n
transposed columns) of A forman orthonormal·basis forre., and conversely.
Using this idea, the orthogonal maps from re 2 to itself can easily be charac- T(x, y)= ( -!x + Jy, !x + fy).
terized.
The matrix of T with respect to the standard basis is

Theorem 6.13 If T: re 2 -+ re 2 is orthogonal, then T is either a rota- -4/5 3/5)


tion or a reftection. ( 3/5 4/5 .

With cos (/J = -4/5 and sin cp = 3/5, this is seen to be the matrix of a reftcc-
Proof If A is the matrix of T with respect to the standard basis,
tion in a line C. The slope of e is given by
:hen A is orthogonal. Thus if

then B = {(a, b), (e, d)} is an orthonormal basis for re 2 . Now Example 2, Since cp is in the second quadrant, cp/2 is in the first quadrant and the slope
page 251, states that there exists an angle cp such that of C is 3. Therefore T is the reftection in the Iine with equation y = 3x.

B = {(cos cp, sin <p), ±(-sin cp, cos cp)}. 1t should be clear that any reftection in a Iine through the origin of re 2
266 6. INNER PRODUCT SPACES §4. Orthogonal Transformations 267

has a matrix in the form (~ - n with respect to sorne orthonormal basis,


fcr one line (subspace) is fixed and another is refiected into itself. This result
This already says quite a bit about an arbitrary orthogonal transforma-
tion of rff 3 • But if

det(~ ~) = 1,
can also be obtained by considering the matrix A for T with respect to the
standard basis. Since

sin q>) then T y is a rotation in the plane !/ and there is an angle (P such that
-cos <p '
o
-si~cos
+1
~-
the characteristic equation for A is ..1? = l. Thus ± 1 are the characteristic A = O cos <P <P )·
roots of A, andA is similar to diag (1, -1). For the map in Example 3, the ( O sin q> <P

-n
orthonorma1 basis {(1/)10, -3/)W), (3/)10, 1/)10)} yields the matrix

(~
On the other hand, if
ror T.
The previous argument shows that if A is a matrix for a refiection in the
plane, then det A = - l. On the other hand, if A is the matrix of a rotation det(~ ~)=-1,
in the plane, then det A = cos 2 cp + sin 2 cp = L So the matrix of an or-
thogonal map in the plane has determinant ± l. This is a general property then Ty reflects the plane 2'{V2 , V3 } into itself. Therefore there exist vectors
of orthogonal matrices following from the fact that AAr = In. V~ and V~ such that B' = {V1 , V~, V~} is an orthonormal basis for rff 3 and
the matrix of T with respect to B' is
Theorem 6.14 If A is an orthogonal matrix, then det A ~ ± l.
Characierizing the orthogonal transformations of 3-space, rff 3 , is only a
little more difficult than for the plane. Suppose T: rff 3 ~ rff 3 is orthogonal.
-1
~)·
The characteristic polynomial of T must have a nonzero real characteristic
Since one of the matrices denoted by A' is similar to one denoted by A when
root r since it is a polynomial of odd degree. Let V 1 be a unit characteristic
<P = n, there are three dífferent types of matrix for T. Or we ha ve found that
vector corresponding to r, and let B = {V1 , V 2 , V3 } be an orthonormal basis
if T: rff 3 ~ rff 3 is orthogonal, then T has one of the following matrices with
for rff 3 • (B exists by Corollary 6.9 on page 258.) Then the matrix A of T
respect to sorne orthonormal basis {U 1 , U2 , U3 } for rff 3 :

el o o)
with respect to B has the form

(1o coso <P _,,: ~ )· o 1 o


O sin q> cos <P o o 1
or
Since A is orthogonal, its columns are orthonormal in rff 3 • Therefore r = ± l,
x =y = O, and ( ~ ~) is orthogonal. That is, the line 2' {V¡} is invariant
under T and its orthogonal complement, the plane !/ = 2'{V2 , V3 }, is sent
into itself by the orthogonal, restriction map n o
cos <P
sin q>
_,,:~)~(¿
cos <P o
o
cos <P
sin q>
-sin cp
cos <P
o )C'o oo o)
O 1 O.
1

If T has the first matrix, then T is called a rotation of rff 3 • Thus a rotation of
rff 3 is the identity on a line, 2'{U1 }, called the axis ofthe rotation. Further
268 6. INNER PRODUCT SPACES §4. Orthogonal Transformations 269

the rotation Trotates every plane parallel to 2'{U2 , U3 } into itself, through The sign of ({J is meaningful only in reference to a choice of basis. If the
the angle ([J. This angle lfJ is called the angle ofrotation of T, with the positive last two vectors in B are interchanged, the sign of ({J is changed, for then
direction being measured from U2 to U 3 • If T has the second matrix above, cos lfJ = 1/3, sin ({J = .JS/3. Thus the amount of rotation might be found
then Tis a reflection of re 3 in the plane 2'{U2 , U3 }. Including the third form without reference to the sense of the rotation. This can be done by simply
for the matrix of T we have the following. finding the angle between any nonzero vector V in 2'{(0, l, l)}J. and its
image T(V). If Vis taken to be (1, O, O), then the angle lfJ satisfies
Theorem 6.15 An orthogonal transformation of re 3 to itself is either U o T(U)
a rotation, a reflection in a plane, or the composition of a rotation and a cos lfJ = IJUIJIJT(U)I/ = (1, O, O) o (1/3, 2/3, -2/3) = 1/3.
refiection in a plane.
One further general result should be stated to complete this section.
An orthogonal transformation in re 2 or re 3 is a rotation if its matrix
with respect to an orthonormal basis has determinant + l. Therefore the
following definition is made. Theorem 6.16 The mattix of an orthogonal transformation with respect
to an orthonormal basis is orthogonal.
Conversely, given an n x n orthogonal matrix A and an n-dimensional
Definition Let T: re...... re. be an orthogonal map, B an orthonormal
inner product space re with an orthonormal basis B, there exists an orthogonal
basis forre., andA the matrix of Twith respect to B. Then Tis a rotation of
map T: re-+ re with matrix A with respect to B.
re" if det A = l.

Can yo u guess how a rotation acts in 4-space, re4 ? Proof The first statement has airead y been derived. Por the second,
suppose A = (a;) is orthogonal and B = {V1 , ••• , V.} is an orthonormal
Example 4 Suppose T: re 3 -+ re 3 is the orthogonal transformation basis forre. Define T by T(V) = aljV1 + · · · + a.iv• and extend Iinearly
having the matrix to all of re. Then T is linear and has matrix A with respect to B, so it is o ni y
necessary to show that T preserves the inner product. It is sufficient to show
1/3 2/3 that (T(V;), T(V)) = O¡i, and
-2/3)
-2/3 2/3 1/3
( 2/3 (T(V;), T(V)) = (T(V;), T(V))B = a 1 ¡a 1i + · · · + a,.;a,i
2/3 1/3

with respect to the standard basis. Determine how T acts on re 3 • which is the e!ement from the ith row and jth column of the product ATA.
Since T is orthogonal and det A = 1, T is a rotation. If U líes on the But ATA = I,., so (T(V;), T(V)) = O¡i and T preserves the inner product
axis of T, then T(U) = U. This equation has the solutions U= t(O, 1, 1), of re.
tE R. Therefore the axis of T is 2'{(0, 1, 1)} and T rotates each plane with___,
equation y + z = constant into itself. To find the angle of T, let
Problems
B = {(0, I.j'l, 1.)2), (0, -1/.)2, 1/.)2), (1, O, O)}
l. Determine the nature of the following orthogonal maps in rff 2 •
This is an orthonormal basis forre 3 with the first vector on the axis of rotation. a. T(x, y) = (h + h/3y, tv3x- ty).
Computing the matrix A of T with respect to this basis yields b. T(x, y) = Ctx + tv3y, -tv3x +{y).
c. T(x, y) = (-1-x + h. -·h - !y).
d. T(x, y) = (x, y). e. T(x, y) = (-y, -x).
1 o o)
A= O ij3 .JB/3 . 2. a. Show that the composition of two orthogonal maps is orthogonal.
( b. Show that the composition of two rotations in rffn is a rotation.
O -.JS/3 1/3
c. ls the composition of two reflections a reflection?
Therefore the angle ({J of T satisfies cos ({J = 1/3 and sin ({J = -.JS/3. 3. Suppose T: rff~rff is linear and preserves the norm, show that Tis orthogonal.
270 6. INNER PRODUCT SPACES §5. Vector Spaces over Arbitrary Fields 271

4. Fill in the missing entries to obtain an orthogonal matrix numbers. But there are many situations in which it is necessary to use scalars

a.
1/3 2/3 a )
b 2/3 e .
(2/3 d 2/3
b.
(1/VJ 1/,/3
b e 1/V2 .
a-) from other "number systems." For example, we have ignored complex roots
of a characteristic equaíion simply because of the requirement that a scalar
1/V6 d l/v'6
be a real number. Because of the existen ce of complex roots, it will be neces-
5. Determine which of the following maps of <ff 3 are orthogonal and give a
sary io introduce complex inner product spaces to provean importan! theorem
geometric description of those that are orthogonal.
about diagonalizability.
a. T(a, b, e) = (a/V'l + b/V'l, b/v''l- afv''l, -e).
b. T(a, b, e) = (b/V'l + efV'l, a, b/v''l + e/V'l). Before considering the complex case in particular, it is worthwhile to
c. T(a, b, e) = (bfv''l- ajy''l, afv''l + bfV'l, -e). see exactly how our definition of a vector space might be generalized. Instead
d. T(a, b, e) = (a/v''I- efv''I, b, afv''I + ejy'2). ofusing scalars from the real number system, the scalars could be chosen from
e. T(a, b, e) = (a/V'I + b/V'l, b/V'I- e/V'I, afV'I + e/yZ). any system that shares certain properties with the real numbers. The abstract ~­
6. Give two different proofs of the fact that the product of two n x n orthogonal algebraic system satisfying these properties is called a field.
matrices is orthogonal.
7. Determine how the orthogonal transformation T acts on <ff 3 if the matrix of Definition Let F be a set on which two operations called addition and
T with respect to the standard basis is multiplication are defined such that for all a, bE F, a + bE F and a·b
6/7 2/7 3/7) ( 6/7 2/7 - 3/7) = abE F. The system (F, +, ·) is afield if the following properties hold for
a. 2/7 3/7 -6/7 .
( -3/7 b.
2/7 3/7 6/7 . all a, b, e E F:
6/7 2/7 -3/7 6/7 -2/7
1/3 2/3 - 2/3) (2/3 1/3 - 2/3) l. a + (b + e) = (a + b) + e and a(be) = (ab)e.
c. 2/3 1/3
( -2/3 2/3 .
d. 2/3 -2/3 1/3 .
2/3 1/3 1/3 2/3 2/3 2. a + b = b + a and ab = ba.
-1/9 -8/9 4/9) ( 1/9 8/9 4/9) 3. There exists an element O E F, such that a + O = a, and there exists
e. ( -8/9 -1/9 -4/9 . f. 8/9 1/9 -4/9 .
4/9 -4/9 -7/9 -4/9 4/9 -7/9 and element 1 E F, 1 =1= O, such that, a·l = a.
4. For each a E F, there exists -a E F, such that a+ (-a) = O, and
8. Given PE rff,, the map T defined by T( V) = V + P for all V E rff" is cailed
for each a E F, if a =1= O, there exists a- 1 E F such that a ·a- 1 = l.
a translation.
5. a(b + e) = ab + ae.
a. Show that a translation preserves the distance between points.
b. When is a translation orthogonal?
lt should be clear that the real number system is a field. In fact, the pro-
9. Find a vector expression for a reflection of the plane <ff 2 in a Iine that does not
perties required of a field are the same as those Iisted for R on page 2 in
pass through the origin. Show that this map preserves distance between points
preparation for defining the vector space "Y 2 • It is not difficult to show that
but is not linear and does not preserve the dot product.
the complex numbers C = {a + biJa, bE R} form a field with addition and
10. Let T: rff 4 ~<1 4 be orthogonal and suppose Thas a real characteristic root. multiplicatio~ defined by
a. Show that T has two real characteristic roots.
b. Describe the possible action of T in terms of a rotation about a plane and (a + bi) + (e + di) = a + e + (b + d)i
reflections in hyperplanes.
ll. Prove Euler's theorem: If T: rffn~t!" is a rotation and n is odd, then 1 is a and
characteristic value for T. That is, any rotation of t!" fixes sorne Iine, when n
is odd. (a+ bi)·(e +di)= ae- bd + (ad + be)i.
There are many fields besides the real and complex numbers. For ex-
ample, the set of all rational numbers, denoted by Q, forros a field within
§5. Vector Spaces over Arbitrary Fields
R. The su m and product of rational numbers is ratio na!, so Q is closed under
addition and multiplication. The identities O and 1 of R are in Q, and the
Although the point has not bee'fi stressed, all the vector spaces considered additive and multiplicative in verses of rational numbers are rational numbers.
to this point have been real vector spaces. That is, the scalars have been real Thus Q is a "subfield" of R.
272 6. INNER PRODUCT SPACES §5. Vector Spaces over Arbitrary Fields :2! 273

The fields R, e, and Q are infinite fields in that they contain an infinite Similarly, the set of all polynomials in t with coefficients from F, denoted
number of elements. But there are also finite fields. The simplest finite fields, by F[t], can be turned into a vector space over the field F. The set of all n x m
denoted by zp, are the fields of integers modulo p, where p is a prime. zp matrices with elements from F, denoted vltnxm(F), forms a vector space over
is the set {0, 1, 2, ... , p - l} together with addition and multiplication F, and if "f/ and "/fí are vector spaces over F, then so is Hom("f/, 1/í).
defined using the operations in· the integers as follows:
Example 1 Consider the vector space "f/ JCZ2 ). Since the field con-
a+b=c if a +b= kp +e for sorne · k E Z, O ::::; e < p
tains only two elements, it is possible to list all the vectors in this space. They
are: (0, O, O), (1, O, O), (0, 1, 0), (0, O, !), (1, 1, 0), (1, O, 1), (0, 1, 1), (1, 1, 1).
and
Therefore "f/ 3 (Z2 ) is finite in the sense that there are only eight vectors. lt
ah = d if ah = hp +d for sorne h E Z, O ::::; d < p.· must a1so be finite dimensional since a basis could contain at most seven
vectors. However, the seven nonzero vectors are not linearly independent.
Tliat is, a + h and ab in Zr are remainders obtained when a + h and ab For example, the set {(1, 1, 0), (1, O, 1), (0, 1, 1)} is linearly dependent in
are divided by p in Z. For example, in Z 7 , 2 + 3 = 5, 6 + 5 = 4, 5·2 = 3, "f/ 3 (Z 2 ). In fact, (1, 1, O)+ (1, O, 1) + (0, 1, 1) = (1 + 1, 1 + !, 1 + 1)
5·6 = 2, and 2·3 = 6. A proofthat ZP is a field is omitted since these finite = (0, O, 0).
fields are introduced here simply to indicate the diversity of fields. The field
Z 5 is considered in problem 6 at the end of this section, and you might note Example 1 implies that the basic properties of "f/.(R) carry over to
that the elements of Z 5 are esseiltially the equivalence classes obtained in "f/.(F). In particular, the standard basis {E¡} for "f/.(R) is a basis for "f/.(F).
problem 3, page 221. The simplest field is Z 2 , which contains only O and l. Therefore the dimension of ·'fí.(F) is n.
In Z 2 , 1 + 1 =O, so 1 is its own additive inverse! This fact leads to sorne in-
teresting complications when statements are made for arbitrary fields.
The real number system and any field F share the properties used in de- Example 2 Consider the vector space "f/ z(C). The vectors in this
fining a vector space. Therefore the concept of a vector space can be gener- space are ordered pairs of complex numbers, such as (3 - i, 2i - 6), (1, 5),
alized by simply replacing R by Fin our original definition. This yields the (6i, i). Given an arbitraty vector (z, w) with z, w E e, (z, w) = z(!, O) + w(O, 1),
definition for a vector space "f/ over a jield F. This is an algebraic system so {E¡} spans "f/ 2 (e). And if
consisting of a set "//, whose elements are called vectors, a field F, whose
elements are called scalars, and operations of vector addition and scalar z(!, O) + w(O, 1) =O =(O, 0),
multiplication.
Given any field F, the set of ordered n-tu pies can be made into a vector then z = O and w = O by the definition of an ordered pair, so {E¡} is Iinearly
space over F for each positive integer n. Let independent. So {E¡} is a basis for "f/ z( C) and the vector space has dimension
2.
·-~
"f/.(F) = {(a 1, ... , a.)la 1, ••• , a. E F}

and define Example 3 Let T: "f/' 2 ( e) -> "f/ z( C) be defined by

(a 1, ••• , a.)+ (b 1, ••• , h.)= (a 1 + b 1, ••• , a.+ h.) T(z, w) = (iz + 3w, (1 - i)w).

c(a 1, ••• , a.) = (ca 1, ••• , ca.), for e E F.


The image of any vector under T is easily computed. For example;

The proof that "f/.(F) together with these two operations is a vector space
over F is exactly like the proof that "f'·. is a real vector space. In fact, "f/.(R)
T(2 + 3i, 3 - i) = (i(2 + 3i) + 3(3 - i), (1 - i)(3 - i))
is "f/n· = (6 - i, 4 - 4i).
274 __ 6. JNNER PRODUCT SPACES §5. Vector Spaces over Arbitrary Fields 275

T is a linear map, for if U= (z 1 , w1) and V= (z 2 , w2 ), then Example 3 serves to show that anything that has been done in real vector
spaces can be done in a complex vector space. C could ha ve been replaced by
T(U + V) = T(z 1 + z2 , w1 + W 2 ) an arbitrary field and a similar statement could be made for a vector space
over any field F. Notice this means that all the results on systems of linear
= (i[z 1 + z 2 ] + 3[w 1 + w2 ], (1 - i)[w 1 + w2 ]) equations apply to systems with coefficients from an arbitrary field. However,
= ([iz 1 + 3w 1 ] + [iz 2 + 3w2 ], (1 - i)w 1 + (1 - i)w2 ) when it comes to defining an inner product on an arbitrary vector space, the
nature of the field must be taken into account. One way to see why is to con-
= (iz 1 + 3w 1 , (1 - i)w 1) + (iz 2 + 3w2 , (1 - i)w 2 ) sider the requirement that an inner product be positive definite. This con-
= T(U) + T(V). dition requires an order relation on the elements of the field. But not all
fields can be ordered, in particular the complex numbers cannot be ordered,
Similarly, T(zU) = zT(U) for any z E C. see problem 15 below.
Since T(1, O) = (i, O) and T(O, 1) = (3, 1 - i), the matrix of T with
respect to the standard basis {E¡} is
Problems
A= ( O 1 3_¡ ) Evlt2xiC).
¡
l. Compute the following.
a. (2 + i, 3i- 1) - (i- 5, 2- 4i). [3 + i)( 1 + i, 2i - 4, 3 - i).
b.

(2ti 4 +-. 1~) (2i1 3-i


The determinant of A is i(1 - i) = 1 + i, so T is nonsingular and the matrix
(i+~
3
))(!¡;). o 2l++3i)
of r- 1 with respect to {Ei} is c. 3i2i d.
1 .
i.

A-1 = _1
1+i
(1 -o i -3) =
i
(-io (3i- 3)/2)
(1 + i)/2 .
2. a. Show that B = {(! + i, 3i), (2, 3 + 2i)} is a basis for "Y 2 (C).
b. Find the coordinates of (4i- 2, 6i- 9) and (4i, 7i- 4) with respect to
B.

Therefore 3. a. Determine if S = {(i, 1 - i, 2), (2, 1, - i), (5 - 2i, 4, - 1 - i)} is linear) y


independent in "Y 3 ( C).
r- 1 (a, b) = (-ia + t(3i- 3)b, t(l + i)b). b. Determine if (3 + i, 4, 2) is in the span of S.
4. Prove that· "Yn(F) is a vector space over F, noting aJI the points at which it is
~F
"You might check to see that either To T- 1 = 1 or T- 1 o T =l. necessary to use sorne field property of F.
:~;~
The characteristic polynomial of T is ..l
• 1~'!¡
5• Define Fn[t) as Rn[t] was defined and show that dim Fn[t] = n.
:1;
6. Find the following elements in Zs:
(i- .A.)(I - i - A.) = .A.2 - A. + 1+ i. a. 3 + 4, 2 + 3, and 3 + 3.
b. 2·4, 3·4, 4 2 , and 26 (note 6 rf= Z 5 ).
Since T is defined on a complex vector space, its characteristic values may c. -1, -4, and -3.
be complex i:mmbers. In this case i and 1 - i are the characteristic values for d. 2-1, 3- 1 , and 4- 1 •
T. These values are distinct, so T is diagonalizable, although its characteristic e. Prove th.·t Z 5 is a field.
vectors will hitve complex components. (1, O) is a characteristic vector cor- 7. Define Zn, for any positive integer n, in the same way Zp was defined. Are Z 4
responding to i, as is (i, O) or (z, O) for any complex number z #: O, and and z6 fields?
(3, 1 - 2i) is a characteristic vector corresponding to 1 - i. Therefore the 8. a. Find all vectors in 2'{(1, 4)}, if (1, 4) E "Y2(Z5 ).
matrix of Twith respect to the basis {(1, 0), (3, 1 - 2i)} is the diagonal matrix b. Is {(2, 3), (3, 2)} a basis for "Y 2 (Z 5 )?

G~~¡). _c. How many vectors are ther_e in the vector space "Y 2 (Z5 )?
d. Is {(2, 1, 3, 1), (4, 2, 1, 0), (1, 3, 4, 2)} linearly independent in "Y 4 (Z5 )?
e

.;,
276 6. INNER.JRODUCT SPACES §6. Complex lnner Product Spaces 277

9. Let T: "Y 2 (C)-"Y 2 (C) be defined by T(z, w) = (w, -z). b. Show that{(a, O)[a E R} is a subfield which is isomorphic to the field of
a. Show that T is linear. real numbers.
b. Find the characteristic equation of T. c. Show that (0, 1)2 = -l. Therefore if we set i = (0, 1), then i 2 = -t·
c. Find the characteristic values and corresponding characteristic vectors using the identification given in part b.
for T. d. Show that this field of ordered pairs is isomorphic to the field of complex
d. Find a basis for "Y 2 ( C) with respect to which T has a diagonal matrix. numbers.
10. Follow the directions from problem 9 for the map given by
T(z, w) = ([i- 1]z + [3i- l]w, 2iz + 4w).
11. Let T(a + bi, e + di) = (a + d- ci, b + (a - d)i). Show that T is not in §6. Complex lnner Product Spaces and Hermitian
Hom("Y z(C)). Transformations
12. Let "Y be the algebraic system whose elements are ordered pairs of real
numbers. Define addition as in "Y z(R) and scalar multiplication by z(a, b) =
(za, zb), z E C, a, bE R. Determine if "Y is a vector space over C. The dot product of ce. cannot be directly generalized to give an inner
prodilct on r.(C) because the square of a complex number need not be real,
13. Let "Y be the algebraic system whose elements are ordered pairs of complex let alone positive, e.g., (! - i) 2 = - 2i. However a positive defi.nite function
numbers. Define addition as in "Y z( C) and scalar multiplication by r(z, w) =
can be defined on "f/.(C) using the fact that the the product of a + bi and
(rz, rw), rE R, z, w E C.
a. Show that "Y is a vector space over R. _
a - bi is both real and nonnegative.
b. What is the dimension of "Y?
14. Let "Y be the algebraic system whose elements are ordered pairs of real Definition The conjugate of the complex number z = a + bi is
numbers. Define addition as in 1í z(R) and scalar multiplication by q(a, b) = z =a- bi.
(qa, qb), q E Q, a, bE R. (Q denotes the set of rationals.)
a. Show that "Y is a vector space over Q. Thus 5 - 3i = 5 + 3i, 2i- 7 = -2i- 7, and 6 = 6.
b. How does "Y differ from "Y z(Q)? All ofthe following properties of the complex conjugate are easily derived
c. Is 1í a subspace of "Y 2 (R)? from the definition.
d. Show that {(1, 0), ( v2, O)} is linearly independent in 11.
e. Is (n, n) in the subspace of 1í given by .P{(1, 0), (0, 1)}?
f. What can be concluded from parts d ande about the dimension of "Y? Theorem 6.17 Suppose z and w are complex numbers, then
15. A field F can be ordered if it contains a subset P (for positive) which is closed
l. z + w = + w. z
under addition and multiplication, and for every x E F exactly one of the
2. Z·W = Z·W.
following holds; x E P, x = O, or - x E P. Show that the field of complex 3. z.z
= a + b if z =a+ bi, a, bE R.
2 2

numbers cannot be ordered. 4. = z.z


5. z = z if and only if z E R.
16. For each nonzero vector U E rff 2 , suppose Bu is an angle satisfying
U= (a, b) = 11 U[[(a/[IU[[, b/IIU[[) = 11 U[[ (cos Bu, sin Bu).
Let Tou denote the rotation of C2 through the angle Bu, and define a multi- Definition For Z = (z 1, ••• , z.) and W = (w 1, ••• , w.) m "f/"(C),
plication in rffz by let Z o W = z1 ·w 1 + · · · + z.·w..
UV = [[ U[[Tou(V) if U-# Oand VE rff 2
and For two ordered triples of complex numbers,
OV = Owhen U = O.
That is, to multiply U times V, rotate Vthrough the angle Bu and then multiply
(3 - i, 2, 1 + i) o (4, 2 + i, 3i - 1)
by the length of U. = (3 - i)4 + 2(2 - i) + (1 + i)(- 3i - 1)
a. Show that the set of all ordered pairs in rff z together with vector addition
and the above multiplication is a field. = 18- IOi.
278 6. INNER PRODUCT SPACES §6. Complex lnner Product Spaces 279

We will call this operation the standard inner product for ordered n- <aZ t bW: U) = a<Z, U) + b<Z, U)
tuples of complex numbers and denote the vector space "f/n(C) together with for all a, b E C, and Z, W, U E "//.
this inner product by ~•. But it should be quickly noted that this inner pro-
duct does not satisfy all the properties of the inner product of lff•. Complex inner product spaces can be constructed from complex vector
spaces in the same ways that real inner product spaces were obtained from
real vector spaces.
Theorem 6.18 For all vectors Z, W, U E~.,
l. Zo WE C.
2. Zo W= WoZ. Examp!e 1 The set of all continuous complex-valued functions
3. Z o (W + U)= Z o W + Z o U and (Z + W) o U= Z o U+ W o U ~defined on [0, 1] might be denoted by .?(C). This is a complex vector space
4. (aZ) o W = a(Z o W) and Z o (a W) = a(Z o W) for all a E C. with the operations defined as in.?. Define < , ) on .?(C) by
5. Z o Z is real and /; o Z > O if Z =1 O.
<J, g) = J:!(x)g(x) dx,
Proof of (2) If Z = (z 1 , .•. , z.) and W = (w 1 , .•. , w.), then

n n n n -.--- so that
Z o w = L Z¡W¡ = L W¡Z¡ = L W¡Z¡ = i=l
L W¡Z¡ = i=l
L W¡Z¡ = w z. o
i=l i=l

Can you fill in a reason for each step?


i=l
<ix2 , 3 + ix) = J: ix 2 (3 - ix) dx

Notice that the dot product in lff. satisfies all the preceding properties
= J~ 3ix2 + x 3 dx = i + 1/4.
since the conjugate of a real number is the number itself. The dot product
of lff. was said to be symmetric and bilinear. The corresponding properties It can be shown that the function. < , ) is a complex inner product for the
for the inner product of ~.are numbered 2, 3, and 4 in Theorem 6.18, and space of functions .?( C).
thc inner product of ~. is said to be hermitian. Theorem 6.18 provides the
essential properties which should be required of any inner product on a com-
The definitions of terms such as norm, orthogonal, and orthonormal
plex vector space.
basis are all the same for either a real or a complex inner product space.
Therefore the Gram-Schmidt process could be applied to Iinearly independ-
Definition · Let "// be a complex vector space and <, )
a function such ent subsets of complex inner product spaces.
that <Z, W) E C for all Z, W E"//, which satisfies
l. <Z, W) = <W, Z).
Example 2 Given
2. <Z, aW +bU)= a<Z, W) + 5<Z, U), for all a, bE C, and z, W,
U E"'/.
3. <Z, Z) >O if Z =1 O.
f/ = 2{(1 + i, 3i, 2 - i), (2 - 3i, 10 + 2i, 5 - i)}
Thcn <, ) is a complex inner product on "//, and ("//, <, ))
is a complex
a subspace of ~ 3 , find an orthonormal basis for Y.
inner product space. <, ) may also be called a positive definite hermitian
We have 11(1 + i, 3i, 2 - i)l/ = 4, so t(l + i, 3i, 2 - i) is a unit vector.
product on "// and ("//, < , )) may be called a unitary space.
The orthogonal projection of (2 - 3i, 10 + 2i, 5 - i) on this vector is

The second condition in this definition shows that a complex inner [(2 - 3i, 10 + 2i, 5 - i) o t(I + i, 3i, 2 - i)] t(l + i, 3i, 2 - i)
product is not linear in both variables. However, you might verify that the
first and second conditions imply = [1 - 2i](l + i, 3i, 2 - i) = (3 - i, 6 + 3i, - 5i).
280 6. INNER PRODUCT SPACES §6. Complex lnner Product Spaces 281

Subtracting this projection from (2 - 3i, 10 + 2i, 5 - i) yields ( -l - 2i, Definition An n X n matrix A with entries from e is hermitian if
4 - i, 5 + 4i), which has norm .j63. Therefore an orthonormal basis for ¡¡r =A.
9'is
Thus
{t(l + i, 3i, 2 - i), (IJ.j63)( -l - 2i, 4 - i, 5 + 4i)}.
2- i
The primary reason for introducing complex inner product spaces here 2 3- 2i) 3
( 3+2i 5 '
is to show that matrices of a certain type are always diagonalizable. The next 1- i
step is to consider the following class of linear transformations.
are examples of hermitian matrices.
Definition Let ("Y, < , )) be a complex inner product space and
T: "Y -> "Y a linear map. T is hermitian if (T(U), V) = <U, T(V)) for all Theorem 6.20 Suppose TE Hom('?.) has matrix A with respect to the
vectors U, V E "Y. standard basis. T is hermitian if and only if A is hermitian.

The identity map and the zero map are obviously hermitian maps, and
Proof ( =>) This direction has been obtained abo ve.
we will soon see that there are many other examples. But first it should be
(<=) Given that A = (a¡) is hermitian, it is necessary to show that
shown why hermitian transformations are of interest.
T(U) o V= U o T(V) for all U, V E'?•.
Theorem 6.19 The characteristic values of a hermitian map are real.
Let U= (z 1 , ••• , z.) and V= (w 1 , ••• , w.), then
Proof Suppose T is a hermitian transformation defined on the
complex inner product space ("Y, <, )).
Then a characteristic val ue z for
T might well be complex. However, if W is a characteristic vector for z,
n n
+ L" a.iziw,
then
L a iziw
j= 1
1 1 + L a2 iziw2 +
j= 1 j=l
zJIWII 2 = z(W, W) = (zW, W) = (T(W), W) = (W, T(W))
= (W, zW) = z(W, W) = ziiWII 2 •
Since W is a nonzero vector, z = z, hence the characteristic value z is real.
From this theorem we see that if T: '?. -4 '?11 is hermitian, then a matrix
of T with respect to any basis has only real characteristic roots. This implies
that there is a collection of complex matrices that have only real charac-
teristic roots. To determine the nature of these matrices, suppose A = (aij) Thus T is hermitian and the proof is complete.
is the matrix of T with respect to the standard basis {E;}. Thep T(E)
= (a 1i, ... , a,.), and since T is hermitian, Therefore, every hermitian matrix gives rise to a hermitian transforma-
tion on '?.- Since a hermitian transformation has only real characteristic roots,
aii = E¡ o (a 1i, . .. , a.i) = E¡ o TtE) we have:
= T(E¡) o Ei = (a u, ... , a.;) o Ei = aii·
Theorem 6.21 The characteristic values or characteristic roots of a
Therefore, if we setA= (a¡), the matrix A satisfies ¡¡r = A. hermitian matrix are real.
282 6. INNER PRODUCT SPACES §6. Complex lnner Product Spaces 283

A matrix with real en tries is hermitian if it equals its transpose, because Tk+ 1 : 9"k+ 1 ~ 9"k+ 1 • As a map from 9"k+ 1 to itself, Tk+ 1 has n - k charac-
conjugation does not change a real number. An n x n matrix A is said tobe teristic values, which must be ).k+ 1 , ••• , ).n· Now if Vk+ 1 is a characteristic
symmetric if Ar = A. For a real symmetric matrix, Theorem 6.21 becorries: vector for Tk+ 1 corresponding to ..1.k+ 1 , Vk+ 1 is a characteristic vector for
T and vk+l E Sf{V¡, o Vk}J.. Therefore v,,
o o' Vk, vk+l are k+ 1
o o o'

linearly independent characteristic vectors for T, and the proof is complete


Theorem 6.22 The characteristic equation of a real symmetric matrix
by induction.
has only real roots.

Corollary 6.24 Every hermitian matrix is diagonalizable.


This means that the characteristic roots of matrices such as

21 41 73) This. corollary refers to diagonalizability o ver the field of complex


and
(3 7 o numbers. However, it implies that real symmetric matrices are diagonalizable
over the field of real numbers.

are all real. But even though


Theorem 6.25 lf A is an n x n real symmetric matrix, then there exists
a real orthogonal matrix P such that P- 1 AP is a real diagonal matrix.
1+
(4 + i
i4 +3 i)
Proof Since A is real and symmetric, A is hermitian and hence
is symmetric, it need not have real characteristic roots. diagonalizable over the field of complex numbers. That is, there is a matrix
The important fact is that not only are the characteristic roots of a real P with en tries from C such that P _, AP is diag (.A. 1 , ••• , .A.n), where .A. 1 , ••• , .A..
symmetric matrix real, but such a matrix is always diagonalizable. Again, are (real) characteristic roots of A. But the jth column of P is a solution of
the result is obtained first in the complex case. the system of linear equations AX = ).jX. Since ..1.j and the entries of A are
real, the entries of P must also be real. Therefore, A is diagonalizable over
the field of real numbers. From Theorem 6.23 we know that the columns of
Theorem 6.23 If T: ~. ~ ~. is hermitian, then Tis diagonalizable.
Pare orthogonal in rff., therefore it is only necessary to use unit characteristic
vectors in the columns of P to obtain an orthogonal matrix which díagonalízes
Proof T is diagonalízable if it has n linearly independent cbarac- A.
teristic vectors in ~ •. Let . 1. 1 , .•. , ..1.. be the n real characteristic values for
T. Suppose V1 is a characteristic vector corresponding to . 1. 1 • If n = 1, the
So far there has been little indication that symmetric matrices are of
proof is complete, otherwise continue by induction on the dimension of ~n·
particular importance. Symmetric matrices are associated with inner pro-
Suppose V1 , ••• , Vk, k < n, ha ve been obtained, with V, a characteristic
ducts, as suggested in problem 9, page 247. But because of Theorem 6.25
vector corresponding to ..1.¡, 1 s; i s; k, and V1, ••• , Vk linearly independent.
it is often advantageous to introduce a symmetric matrix whenever possible.
The proof consists in showing that there is a characteristic vector for ..1.k+ 1
in the orthogonal complement of .2"{V1 , ••• , Vd. Therefore, suppose
Y'k+t = S!"{V,, ... , Vk}J.andletTk+t betherestrictionofTtoY'k+t· That Examp/e 3 Find a rotation of the plane that transforms the poly-
is 11.+ 1(U) = T(U) for each U E Y'k+t· This map sends 9"k+ 1 into itself, for nomial x 2 + 6xy + y 2 into a polynomial without an xy or cross product
if U e 9"k+ 1 and i s; k, then term.
First notice that

That is, TH 1 (U) is in the orthogoaal complement of .2"{V1 , ••• , Vk}, and
284 . 6. INNER PRODUCT SPACES §6. Complex lnner Product Spaces 285

where 1t should be pointed out that the major iheorems of this section could
have been stated for an arbitrary finite-dimensional complex inner product

A =G i} space instead of the n-tuple space ~•. However, since each such space is
isom,orphic to one of the spaces ~., only notational changes would be in-
volved in the statements of such theorems.
The characteristic values for the symmetric matrix A are 4 and -2, and
(1/.}2, 1/.}2) and ( -1/.}2, 1/.}2) are corresponding unit characteristic
vectors. Constructing P from these vectors yields the orthogonal matrix Problems

P -- (1/.}2 -1/.}2) 1. Compute:


1(.)2 1/.)2 ' a. (2, 1 + i, 3i)o(2 - i, 4i, 2 - 3i) in ~ 3 •
b. (1 - i, 3, i + 2, 3i)o(5, 2i- 4, O, i) in ~4·
and · 2. Find the orthogonal complement off!> in·~ 3 when
a. f!> = .'l'{(i, 2, 1 + i)}.
b. f!> = .'l'{(2i, i, 4), (1 + i, O, 1 - i)}.
3. Show that if Z, W, U e~. anda, bE C, then Zo(aW +bU)= a(ZoW) +
li(Z o U).
Since det P = 1, Pis the matrix of a rotation in the p!ane given by 4. a. Define a complex inner product (, ) on .Y 2 ( C) such that B = {(3 + i, 2i),
(1, i - 2)} is an orthonormal basis for (1" 2 (C), <, )).
b. What is the norm of (4i, 2) in this complex inner product space?
5. Which of the following matrices are hermitian and/or symmetric?

or X= PX' with a.
2
(1 + i
1-
4i
¡) · b.
(21 2)
3 ·
c. (
2 +i i 2 + ¡)
3 ·

2 2i 4
~ li) · G ~) · ~ 3i 3
I 2

(x')·
d. (4 e. f. (2 ¡ )·
X'= y'. 6. Suppose A is hermitian, a. Why are the diagonal entries of A real? b. Show
that the determinan! of A is real.

a. Show that the matrix A = (~ =~) cannot be diagonalized over the


Now if the given polynomial is written in terms of the new variables x' and
7.
y', we obtain
field of real numbers.
b. Find a complex matrix P such that p- 1 AP is diagonal.
x 2 +-6xy + y2 = XrAX = (PXYA(PX') = X'rprAPX'
----.., 8. Suppose T: ~ 2 -~ 2 is given by T(z, w) = (2z + (1 + i)w, (1 - i)z + 3w).
1
= X'rp- APX' = (x', y')(~ -~)(;:) a. Find the matrix A of T with respect to the standard basis, and conclude
that T is hermitian.
b. Find the characteristic values of A and a matrix P such that p-t AP is
diagonal.
9. Do there exist matrices that cannot be diagonalized over the field of complex
The polynomial in x' and y' does not have an x'y' term as desired. If the
numbers?
graph of the equation x 2 + 6xy + y 2 = k is "equivalent" to the graph
of 4x' 2 - 2y' 2 = k for a constant k, then it would clearly be much easier to 10. In Example 3, page 283, it is stated that Pis the matrix of a rotation, yet since
no basis is mentioned, no map is defined.
find the second graph. 1t may be noted that no rotation is explicitly given
a. Define a rotation T: 8 2 -<ff 2 such that Pis the matrix of Twith respect to
here; this situation is investigated in problem 10 at the end of this section.
the standard basis for cff 2·
286 6. INNER PRODUCT SPACES Review Problems 287

b. Let B = {T(l, O), T(O, 1)}. Show that for each WE<ff2, if W: (x, y)<E;l 7. Suppose .'? and :Y are subspaces of an inner product space @'. Prove the
and (x, YV = P(x', yy, then W: (x', y')B. following statements:
c. Sketch the coordina te systems associat_ed with the bases {E;} and B in the a. .'7 e (.'7 1 ) 1 and .'? = (.'7 1 ) 1 if @' is finite dimensional.
plane, together with the curve that has the equation x 2 6xy y 2 = 36 + + b. (.'? + :Y)l = .'?l n :Y l.
c. (.'? n ff) 1 ::J f/' 1 + :Y 1 and (.'? n Y) = .'7 + Y if @' is finite dimen-
1 1 1
in coordinates with respect to {E1} and the equation 4x' - 2y' 2 = 36 in
2

coordinates with respect to B. sional.

11. Let G be the subset of tff 2 defined by 8. If TE Hom("f") and there exists T* E Hom("f") such that (V, T*(U)) =
(T(V), U) for all U, VE "Y, then T* is called the adjoint of T.
G = {V E@' 2/ V: (x, y)¡E,J and 5x 2 + 4xy + 2y 2 = 24 }.
a.- Find T* if TE Hom(@' 2 ) is given by T(a, b) = (2a + 3b, 5a - b).
a. Find an orthonormal basis B for @' 2 such that the polynomial equation b. Suppose "Y is finite dimensional with orthonormal basis B. If A is the
~ing G in coordinates with respect to B does not have a cross product matrix of Twith respect to B, determine how the matrix of T* with respect
term. to Bis related to A when "Y is a real inner product space; when "Y is a
b. Sketch the coordina te systems representing {E1 } and B in the plane along complex inner product space.
with the points representing the vectors in G. .c. Show that -fr> = %~ if TE Hom(tff,).
12. A linear map T defined on a real inner product space (''Y, (, )) is symmetrie if
(T(U), V)= (U, T(V)) fgr all U, Ver. 9. Partition n x n matrices into blocks (--~-:-z--) where r + s = n, A is r x r,
a. Show that the matrix of T with respect to any basis for "Y is symmetric if Bis r X S, e is S X r, and D is S X S.
T is symmetric.
a. . the product (A
Wnte ·e· !B1)(A2!Bz).
1
;D·---- ·e·-- :D···- 111 b loc kform.
b. Suppose T: "Y- "Y is orthogonal and prove that T ís symmetric if and 1: 1 2- 2
only if T 2 = ToT =l.
c. What are the possible symmetric, orthogonal transformations from @' 3 b. Suppose ( ~'-g) is symmetric and A is nonsingular. Find matrices E and
into itself?
Fsuch that (-J\{r(~i-~)(í!-f) = (~i~).
c. Use part b to obtain a formula for the determinan! of a symmetric matrix
-partitioned into such blocks.
Review Problems 10. Use the formula obtained in problem 9 to find the determinan! of the following
partitioned symmetric matrices.
= ae + be + ad + rbd an inner product --- (2 3'1 4) LJ tJ)
(~---1-.-i)·
l. For which values of r is ((a, b), (e, d))
on "f"z? a. b. }---}j} -1 · e.
(2 2 3 5 .
2. Use the Gram-Schmidt process to construct an orthonormal basis for 4 1:2 3 4 5 5 2
("f"2, (, )) from the standard basis when:
a. ((a, b), (e, d)) = 2ae- 3ad- 3bc + 5bd.
b. ((a, b), (e, d)) =(a b)(i 1) (~J·
3. Prove that (U+ V, U- V) = O if and only if 1/ U/1 = 1/ VI/. What is the
geometric interpretation of this statement? ·
4. Show that (A, B) = tr(ABT) defines an inner product on the space of all real
n x n matrices.
5. Prove the parallelogram law:
r, U+ V/1 2 + 1/ U- V/1 2 = 21/ U/1 2 + 21/ V/1 2

6. Suppose ("Y, ( , )) is a real finite-dimensionalinner product space. An element


of Hom ("Y, R) is called a linear funetional on "Y. Show that for every linear
functional T, there exists a vector U E "Y such that T( V) = (U, V) for all
VE "Y.
Second Degree Curves
and Surfaces

---..,

§1. Quadratic Forms


§2. Congruence
§3. Rigid Motions
§4. Conics
§5. Quadric Surfaces
§6. Chords, Rulings, and Centers
290 7. SECOND DEGREE CURVES ANO SURFACES §1. Quadratic Forms 291

Our geometric illustrations have been confined primarily to p.oints, For example, the second degree polynomial 2xi - x 1x 2 + 9x2 x 1 can be
lines, planes, and hyperplanes, which are the graphs of linear equations or written as
-;ystems of linear equations. Most other types of curves and surfaces are the
graphs of nonlinear equations and therefore lie outside the study of vector
~paces. However, linear techniques can be used to examine second degree
~olynomial equations and their graphs. Such a study will not only provide
a good geometric application of linear algebra, but it also leads naturally to Although XT AXis a 1 x 1 matrix, the matrix notation is dropped here for
severa! new concepts. simplicity. There should be no question about the meaning of the equation
All vector spaces considered in this chapter will be over the real number LL= 1 aiixixi = XT A X. This equation states that XT AX is a homogeneous
1ie1d and finite dimensional. second degree polynomial in x 1 , ••• , x., and the coefficient of X¡Xi
is the element from the ith row and thejth column of A. (In general, a poly-
nomial is said to be homogeneous of degree k if every term in the polynomial
is of degree k.)
§1. Ouadratic Forms
Definition A quadratieform on an n-dimensional vector space "//, with
Consider the most general second degree polynomia1 equation in the co- basis B, is an expression of the form Li~i= 1 aiixixi where x 1 , • •• , x. are
ordinates x 1 , ••• , x. of points in Euclidean n-space. Such an equation might the coordinates of an arbitrary vector with respect to B. If X= (x 1, ... , x.)T
be wrltten in the form andA = (a;), then

±
n n
L aiixixi +L b¡X¡ +e = O, where aii• b¡, e E R. aiixixi = XT AX
i,j= 1 i= 1
i,j;:::::: 1

For example, in the equation 2x~ - x 1 x 2 + 9x 2 x 1 - 2x 2 + 7 = O, n = 2, and A is the matrix of the quadratie form with respeet to B.
a 11 = 2, a 12 = -1, a 21 = 9, a22 = b 1 =O, b 2 = -2, and e= 7. The
g~ometric problem is to determine the nature of the graph of the points
Thus a quadratic form is a homogeneous second degree polynomial in
V\ hose coordina tes satisfy such an equation. We a1ready know that if x and y
coordinates, it is not simply a po1ynomial. In fact, any one homogeneous
are the coordinates of points in the Cartesian plane, then the graphs of second degree polynomial usually yields different quadratic forms when
1
x + y 2 - 4 = O and x 2 + 4x + y 2 = O are circles of radius 2. But what associated with different bases.
might the graph of 4yz + z 2 + x - 3z + 2 = Olook like in 3-space? Before
considering such questions we will examine the second degree terms of the
general equation. The first step is to write these terms, L7.i= 1 aijxixi, in a Example 1 x
Consider the polynomial 2xf - 1 x 2 + 9x2 x 1 • If x 1
form using familiar notation. Notice that the polynomial 'L?.i=l aijxixi, and x 2 are coordinates with respect to the basis B = {(2, 1), ( -1, 0)}, then
can be rewritten as this polynomial determines a quadratic form on "//2 • This quadratic form
assigns real numbers to each vector in "Y 2 • The value of the quadratic form
at (2, l) is 2 since (2, 1): (1, O)n gives x 1 = 1 and x 2 =O. And since ( -1, 1):
(1, 3)s, the value ofthe quadratic form at (1, 1) is 2(W - 1(3) + 9(3)1 = 26.
For an arbitrary vector (a, b), x 1 = b and x 2 = 2b - a (Why?). Therefore
suggesting a product of matrices. In fact, if we let X= (X¡, . .. , x.)T and the value of the quadratic form 2xi - x 1 x 2 + 9x2 x 1 at (a, b) is 18b2 - 8ab.
A = (a;j), then This gives the va1ue of the quadratic form at ( -1, 1) to be 18 + 8 or 26 as
n
before. -
L aijxixi = XTAX. Since the coordinates of (a, b) with. respect to the standard basis are a
i,j=l and b, the polynomial 18b2 - 8ab is also a quadratic form on 1'2 •
292 7. SECOND DEGREE CURVES ANO SURFACES §1. Quadratic Forms 293

2xi - x 1x 2 + 9x 2 x 1 and 18b 2 8ab are different quadratic forros in that.


- using a nonsymmetric matr.ix, but the polynomial 8x 2 + 5xy + 5yx + y 2
they are different polynoroials in coordinates for different bases, but they yields the same values for given values of x and y and can be expressed as
assign the saroe value to each vector in "Y 2 • That is, they both define a par-
ticular function froro "Y 2 to the real nurobers, called a quadratic function.

Definition A quadratic function q on a vector space "Y with basis B


using a symmetric matrix. Choosing a symmetric matrix to represent a quad-
is a function defined by ratic function will make it possible to find a simple representation by a change
n of basis, for every symmetric matrix is diagonalizable.
q(V) = L aiixixi
i,j== 1
for each V E "Y

Definition The matrix of a quadratic function q with respect toa basis B


where V: (x 1 , ••• , x.)n.
is the symroetric roatrix A = (a;) for which
That is, a quadratic function is defined in terms of a quadratic forro. n
The two quadratic forros abo ve determine a quadratic function q on "Y 2 q(V) = L aijxixi
for which q(2, I) = 2, q( -!, 1) = 26, and in general q(a, b) = 18b 2 - 8ab, i,j=1

so that q: "Y 2 -+R.


The relation between a quadratic function (r = q(V)) and a quadratic
(x
with V: 1 , •.• , x.)n.
forro (r = xrAX) is much the same as the relation between a linear trans- The quadratic forro 'L7.i= 1 aiixixi is symmetric if aii = aii for all i andj.
torroation (W = T(V)) and its roatrix representation (Y= AX). But there
are an infinite number of possible matrix representations for any quadratic Example 2 Recall the quadratic forro considered in ~xample 1 given
function, in contrast to the linear case where each basis determines a unique by 2xf - x 1 x 2 + 9x 2 x 1 in terros of the basis B. The quadratic function q
matrix for a roap. For example, the above quadratic function is given by the on "Y 2 deterroined by this quadratic forro is defined by
quadratic forros 18b 2 - 8ab, 18b 2 - 4ab - 4ba, and 18b 2 + 12ab - 20ba.
Therefore
q(V) = (x 1 x 2 ) (29 - 0I)(xx 21 ),

q(a, b) =(a b)(~ ~~)(~) = (a b)( -~ ~:)(~)


However, G-~) is not the roatrix of q with respect to B because it is not
=(a b)( -2~ ~D(~). syrometric. The matrix of q with respect to Bis(~ ~), since it is symmetric,
-~_.,•
:.• and when V: (x 1, x 2 ) 8 ,
However, out of all the possible matrix representations for a quadratic
function there is a best choice, for it is always possible to use a symroetric
matrix. This should be clear, for if in 'L7.i= 1 aiixixi, ahk ;f. akh for sorne h
and k, then the terros ahkxhxk + akhxkxh can be replaced by

We have defined quadratic functions in terms of coordinates, so the


:1
definition of each quadratic function is in part a refiection of the particular
which ha ve equal coefficients. For example, the polynomial 8x 2 + 7xy + 3yx basis used. But a quadratic function is a function of vectors, and as such it
+ y 2 can be written as should be possible to obtain a description without reference to a basis. An
analogous situation would be to have defined linear transforroations in
terros of roatrix roultiplication rather than requiring thero to be roaps that
preserve the algebraic structure of a vector space. Our coordinate ap-
294 7. SECOND DEGREE CURVES ANO SURFACES §1. Quadratic Forms 295

proach to guadratic functions followed naturally from an examination Examp!e 3 Construct a function b on ~ 2 as follows: Suppose T
or second degree polynomial eguations, but there should also be a co- is the linear map given by T(u, v) = (2u - v, 5v - 3u) and set b(U, V)
ordinate-free ·characterization. We can obtain such a characterization by = U o T(V) for all U, V E~ 2 . The linearity of T and the bilinearity of the dot
referring to bilinear functions. It might be noted that severa! inner pro- product guarantees that b is bilinear. Therefore the function b gives rise to a
ducís considered in Chapter 6 appear similar to guadratic forms. For example, bilinear form for each basis for ~ 2 • A little computation shows that
the inner product in Example 6 on page 245 is expressed as a polynomial
in the coordinates of vectors with respect to the standard basis.
b((x, y), (u, v)) = 2xu - xv - 3yu + 5yv.

Definition A bilinear form on an n-dimensional vector space 1/', with The right side of this eguation is a bilinear form in coordinates with respect
~
basis B, is an expression of the form 'L7.i=t aiixiyi were x 1 , ••• ,.xn and to {E;}. In this case, the eguations a;i = b(U;, U/J become
Y 1 , ••. , Yn are the .coordinates of two arbitrary vectors with respect to
B. A bilinear fot:m may be written as xrA Y with X= (x 1 , • . . , xnl, a 11 = b(E1, E 1) = (1, O) o T(1, O) = (1, O) o (2, 3) = 2,
Y= (y 1 , • •• , Yn)r, and A = (a;). A is the matrix of the bilinear form with
respect to B, and the bilinear form is symmetric if A ís symmetric. a 12 = b((l, 0), (O, 1)) = -1,

a21 = b((O, 1), (1, O))= -3,


Thus a bilinear form is a homogeneous second degree polynomial in the
coordinates of two vectors. The bÜinear form suggested by Example 6, page a 22 = b((O, 1), (0, 1)) = 5.
245, is given by 3x 1 y 1 - x 1 y 2 - x 2 y 1 + 2x2 Yz, where x 1 , x 2 and y 1 , y 2
are the coordina tes of two vectors from 1/' 2 with respect to the standard basis. If X is set egua! to Y in the bilinear form xrA Y, then the guadratic form
This bilinear form yields a number for each pair of vectors in 1/' 2 and was xr AX is obtained. This suggests that every bilinear form corresponds to a
uscd to obtain an inner product on "f/" 2 • However, a bilinear form need not guadratic form. Since each guadratic function q is given by q(V) = xr AX
determine an inner product. Consider the bilinear form 4x 1 y 2 - 3x2 y 2 with A symmetric, q should be associated with a symmetric bilinear function
delined on "/"' 2 in terms of coordina tes with respect to {E;}. This bilinear form b, given by b(U, V) = xrA Y. Using the fact that a symmetric bilinear func-
defines· a bilinear function from 1/' 2 to R, but it is neither symmetric nor tion may be defined without reference to coordinates, we have obtained a
positive definite. coordinate-free characterization of guadratic functions.
The name "bilinear form" is justified by the fact that a bilinear form is
simply a coordinate representation of a real-valued, bilinear function. That
is, a function b for which: Theorem 7.1 If b is a symmetric, bilinear function on 1/' and q is
defined by q(U) = b(U, U) for all U E 1/', then q is a quadratic function
l. b(U, V) E R for all U, V E 1/'. on 1/'.
2. For each W E 1/', the map T given by T(U) = b(U, W), for all Conversely, for every guadratic function q on 11", there is a unique sym-
U E 1/', is linear. metric, bilinear function b on 1/' such that q(U) = b(U, U) for all U E 1/'.
3. For each W E 1/', the map S given by S( U) = b(W, U), for all
U E 1/', is linear.
Proof If b is a symmetric bilinear function on 1/' with b(U, V)
It is not difficult to obtain this relationship between bilinear maps and = xr A Y for sorne basis, then xr A Y is a symmetric bilínear form on 1/'.
bilinear forms. It is only necessary to show that if b is a bilinear function on To obtain the symmetry, notice that
1/', and B = {U1 , ••• , Un} is a basis for 1/', then
xrAY = b(U, V)= b(V, U)= yT AX = (YTAXf = XTATY
n
b(U, V) = L aiixiyi
i,j= 1 implies that A = Ar. (Why is yr AX egua! to its transpose?) Thus the func-
<r tion q deftned by q( U) = b( U, U) = xr AXis a quadratic function expressed
where U: (x 1 , • •• , xn) 8 , V: (y 1 , • •• , Ynh, and aii = b(U;, U).
with a symmetric matrix.
296 7. SECOND DEGREE CURVES ANO SURFACES §1. Quadratic Forms 297

F_or the converse, notice that if b' is a symmetric bilinear function, then
Problems
b'(U + V, U + V) = b'(U, U) + 2b'(U, V) + b'(V, V).
l. Express the following polynomials in the form xrAX with A symmetric.
This suggests that given a quadratic function q, a function b might be defined a. x~ + 3xtx2 + 3x2xt - 5x~. b. 4x~ + 1x1x2.
c. x 2 + 5xy- y 2 + 6yz + 2z 2. d. 5x 2 - 2xi + 4y 2 + yz.
by
2. Find all matrices A that express the quadratic function q in the form
b(U, V) = t[q(U + V) - q(U) - q(V)].
(b) if q(a, b) = a 5ab + 7b
(a, b) A 2
-
2

This function is symmetric, bilinear and b(U, U) = q(U); the fact that it is
symmetric is obvious and the justification of the other two properti~s is left 3. Suppose A = (i ~) is the matrix of a quadratic function q with respect to a
as an exercise. For uniqueness, su pose there are two symmetric bilinear func- basis B. Find q(a, b) if Bis given by
tions b 1 and b2 such that for all U E "Y a. B = {E1}. b. B = {(3/5, 4/5), (4/5, -3/5)}.
c. B = {(l/v'2, l/v'2), (lfv'2, -l/v/2)}.
b 1(U, U) = q(U) = bz(U, U). d. B = {(,!2,1), (v/2, -l)}.
4. What is the difference between a quadratic form anda quadratic function?
Then for all U, V E "Y, 5. Write the double sum L:1.í= 1 aiJx,x1 in four ways using two summation signs.
For each of these ways, write out the sum when n = 2 without changing the
+
b 1 (U V, U+ V)= q(U) + 2b 1(U, V)+ q(V), order of the terms.
bz(U + V, U+ V) = q(U) + 2bi(U, V) + q(V), 6. Show that if q is a quadratic function on "Y, then q(rV) = r 2 q(V) for all rE R
and Ve"f/".
but both left sides equal q(U + V) so b 1(U, V) = bz(U, V) and b 1 = h2 • 7. Let b be the bilinear function of Example 3, page 295. Find the bilinear form
that arises from b when b( U, V) is expressed in terms of coordina tes with
respect to the basis
Example 4 Let b be the bilinear function defined on ~ 2 by b(U, V) a. B = {(4, 1), (1, -2)}. b. B = {(3, 2), (!, 1)}.
= U o T(V) with T(u, v) = (u + 4v, 4u - 3v). Then
8. Suppose b(U, V)= L:1.J=l aiJXtYJ with U: (Xt, ... , Xn)B, V: (Yt. ... , Yn) 8 ,
and B = {Ut. ... , U.}. Justify the equations aiJ = b(U1, U1).
b((x, y), (u, v)) = xu + 4xv + 4yu- 3yv = (x y)(! -~)(~). .,
'
.~
9. Find the quadratic function q that is obtained from the given symmetric
bilinear function b.
a. b((x, y), (u, v)) = 2xv + 2yu - 5yv.
which shows that b is symmetric. b determines a quadratic function q on
b. b((x, y, z), (u, v, w)) = 3xu- xw + 2yv- zu + 4yw -1- 4zv.
-~by q(U) = b(U, U) = U o T(U). In terms of coordinates with respect to
{E;}, 10. Find the symmetric bilinear form that is obtained from the given quadratic
form.
a. xi- 4xtX 2 - 4x 2 x 1 -1- 7x~ on 1&' 2 with basis {E¡}.
b. xf + x 1 x 2 + x 2 x 1 - xi- 6x 2 x 3 - 6x 3 x 2 on 1&' 3 with basis {E¡}.
11. Suppose q is a quadratic function on "Y and b is defined by
Notice that b(U, V)= i[q(U -1- V)- q(U)- q(V)].
Prove that bis a bilinear function on "Y.
12. Show that if b is a bilinear function on <&',, then there exists a linear map
T: rffn- Cn such that b(U, V)= U • T(V).
is the matrix of T, of b, and of q with respect to {E;}. 13. a. How might the term "linear form" be defined on a vector space "Y?
298 7. SECOND DEGREE CURVES ANO SURFACES §2. Congruence 299

b. What kind of map would ha ve a coordinate expression as a linear form? as its matrix with respect to {E¡}. Does there exista basis for which the matrix
c. Given a bilinear form :67.J= 1 a 11x 1y 1 in coordinates with respect to B, off is diagonal?
define b(U, V)= ~7.;= 1 a11x 1y1 where U: (xt, ... , Xn)B and V: (YI> ... , Ynh. That is, does there exist a nonisngular matrix P such that pr AP is
Use the results of parts a and b to show that b is a bilinear function.
diagonal? If

§2. Congruence p = (~ ~)
Th~ression of a quadratic function in terms of a quadratic form or then
a bilinear function in terms of a bilinear form depends on the choice of a
basis. If the basis is changed, one expects the polynomial form to change.
The problem again is to determine how such a change occurs and to discover
pTAP = (aab ++ 2ac
2
+ 3c
2bc + 3cd
2
ab
b2
+ 2ad + 3cd)
+ 2bd + 3d 2 .
a ·'best" polynomial to represent a given function. That is, we have another
equivalence relation to investigate, namely, the relation that states that two So pr AP is diagonal only if be = ad, since P must be nonsingular, A is not
pc-lynomials are equivalent if they represent the same function. Rather than congruent to a diagonal matrix.
work with the polynomials directly, it. is simpler to study their matrices. And This example shows that congruence differs from similarity, for the
since a quadratic function can be defined in terms of a symmetric bilinear matrix A is diagonalizable, that is, A is similar to the diagonal matrix
function, it would be simplest to consider the bilinear case first. diag(I, 3).
Suppose b is a bilinear function on "Y, having matrix A with respect to
the basis B. That is, b(U, V) = xr A Y for U, V E "Y where X and Y are the Now suppose q is a quadratic function given by q(U) = b(U, U) where
coordinates of U and Vwith respect to B.lf A' is the matrix of b with respect b is a symmetric bilinear function. If A and A' are two matrices for q with
to another basis B', then how is A' related toA? Suppose Pis the transition respect to two choices of basis, then A andA' are also matrices for b. There-
m atrix from · B to B'. Then using primes to denote coordinates with respect fore, two symmetric matrices are congruent if and only if they are matrices
to 8', X= PX' and Y= PY'. Therefore,
of sorne quadratic function with respect to two choices of basis.
b(U, V)= XrAY= (PX')rAPY' = X'r(PrAP)Y'.
Example 2 Let q(a, b) = 2a 2 + 8ab. The matrix of q with respect
But b(U, V)= X'r A' Y'. Since the equality X'r A' Y'= X'r(pr AP)Y' holds to {E¡} is
for aii X' and Y'(i.e., U and V), A' = pr AP.

Definition Two n x n matrices A and B are congruent if there exists a


nonsingular matrix P such that B = prAP.
What is the matrix of q with respect to B = {(2, 1), (3, -3)}?
That is, the relationship between two matrices for a bilinear function is The transition matrix from {E¡} to B is
called congruence. The converse also holds, so that two matrices are congruent
if and only if they are matrices of sorne bilinear function with respect to two
choices of basis.

Example 1 Suppose f is the bilinear function on "Y 2 having and

-3
3) = (24o -5~).
300 7. SECOND DEGREE CURVES..:J\ND SURFACES §2. Congruence 301

Therefore Q is a nonsingular n x n matrix with QT = Q, so A is congruent to

(
24
o
o)
-54
QT diag(J. 1, ••• , J..)Q
= QT diag(J. 1/A, ... , J.) JI;,, J.p+ 1/J- J.p+ 1 , ••• , J..f~, O, ... , O)
is the matrix of q with respect to B and q(a, b) = 24xi - 54x~ when = diag(l, ... , 1, -1, ... , -1, O, ... , 0).
(a, b):(x 1, x 2 ) 8 •

Theorem 7.2 Congruence JS an equivalence relation on the set of all Thus every real symmetric matrix is congruent to a diagonal matrix with
n x n matrices. + 1's, -1 's, and O's on the main diagonal. However, it must be proved that
no other combination is possible before this matrix can be called a canonical
form. A diagonal matrix with a different number ofO's on the main diagonal
The proof ofTheorem 7.2 is like the corresponding proofs for equivalence could not be congruent to A because rank is not changed on multiplication
and similarity of matrices, andas with these relations, one looks for invariants by a nonsingular matrix. Therefore it is only necessary to prove the following
under congruence in search for basic properties of bilinear and quadratic theorem.
functions. Certainly rank is invariant under congruence, but since A' = pT AP
implies det A' = (det P) 2 det A, the sign of det A is also invariant. Moreover,
if the matrix P is required to satisfy det P = ± 1, as when P is orthogonal, Theorem 7.3 If a real symmetric 11 x n matrix A is congruent to both
then the value of det A is invariant. To obtain a complete set of invariants
or a set of canonical forms for congruence is not an easy task, but if only D 1 = diag(l, ... , 1, -1, ... , -1, O, ... , O)
real symmetric matrices are considered, then the solution is almost in hand.
The restriction to real symmetric matrices is not unreasonable in that both
real inner products and quadratic functions are required to have real sym- and
metric matrices.
The first step in obtaining a canonical form for real symmetric matrices D 2 = diag(I, ... , 1, -1, ... , - 1, O, ... , 0),
under congrucnce is to use the fact that every real symmetric matrix can be diag-
onalized with an orthogonal matrix. Given an n x n real symmetric matrix A,
there exists an orthogonal matrix P such that p-t A P = diag(J. 1, ••• , J.,) then p = t.
where J. 1, ••• , )., are the n characteristic values of A. Suppose Pis chosen
so that the positive characteristic values, if any, are ). 1, ••• , ).P and the negative
Proof Suppose q is the quadratic function on "Y, which has D 1 as
characteristic values are ).p+ 1, ••• , )." where r is thc rank of A. Then if
its matrix with respect to the basis 8 1 = { U 1, ••• , U.}, and D 2 as its"'''''i'atrix
r < 11, J.,+ 1, ••• , J., are the zero characteristic values of A. Now diag(J. 1 , ••• , J.,)
with respect to 8 2 = {V1, ••• , V,}.
is congruent to the matrix
If W: (x 1 , ••• , x,}s 1 , then

diag(l, ... , 1, - !, ... , -1, O, ... , 0).


q(W) = xf + .. · +X~ - X~+ 1 - .. • - x;.

To see this, let Q be the matrix Thus q is positive on the subspace Y' = 2'{ U 1, ••• , UP}. That is, if Z E Y'
and Z f= O, then q(Z) > O. On the other hand, if W: (y 1 , ••• , Y.) 82 , then
diag(I/A, ... , 1/JJ.P, 1/J -),p+ 1, ••• , 1/J -}." 1, ... , 1).
q(W) = Yi + .. · +Y?' -y,\ 1 - .. • -y; .

.-~
302 7. SECOND DEGREE CURVES ANO SURFACES §2. Congruence 303

Therefore q is nonpositive on the subspace Example3 Find the canonical forro for the matrix

!!T = .2"{Vr+l• ... , V, V,+ 1 , ••• , V.}. o 3


A= 3 O
( -2 5
That is, if Z E !!T, then q(Z) :so; O. This means that if Z E Y n !!T, then
q(Z) = O and Z = O. Thus the dimension of Y + fJ is the sum of the
dimensions of Y and !!/, and we ~ave__ _ under congruence.
The characteristic equation for A is A. 3 - 38A. = O. This ·equation has
roots ±J38 andO, so the canonical forro is
n ~ dim(Y + !!T) = dim Y + dim !!T = p + (n - t).

Therefore O ~ p - t and t ~ p. 1
o -1
o o)o
Redefining Y and !!T yields p ~ t, completíng the proof. (
o o o
Thus there are two numerícal invaríants that completely determine or diag(l, -1, 0). That is, A has rank 2 and signature O.
whether two n x n real symmetric matrices are congruent. First the rank and Suppose A is the matrix of a quadratic function q on 1/3 with respect
second the number of positive or negative characteristic values. The second to the standard basis. Then q is given by
invaríant is defined as follows.
q(a, b, e) = 6ab - 4ae + IObe,

Definition The signature of an n x n real symmetric matrix A is the and a basis B could be found for 1/ 3 for which q(a, b, e) = xi - x~ when
number of positive characteristic values p minus the number of negative (a, b, e): (x 1 , x 2 , x 3 )a.
characteristic values r - p, where r = rank A. If the signature is denoted by
s, then s = p - (r - p) = 2p - r.
Examp/e 4 Find the canonical form for

Theorem 7.4 Two n x n real symmetric matrices are congruent if and


only if they have the same rank and signature.
-3 2)
4 1
1 2
Proof Two real symmetric n x n matrices have the same rank r
under congruence.
and signature s if an on!y if they are congruent to
The characteristic polynomial for A is -39 + 7A2 - ,1,3. Since only the
signs of the characteristic roots are needed, it is not necessary to solve for them.
diag(l, . . . , 1, - 1, ... , - 1, O, ... , O) If A. 1 , A. 2 , A. 3 are the three roots, then det A = .A 1 .A 2 A. 3 = -39, and either all
--.......- ~,---" ---....r--
p r-p n-r three roots are negative or one is negative and two are positive. But tr A
= ..1. 1 + ..1.2 + ,1.3 = 7, therefore at least one root is positive, and the
where p = t(s + r). Since congruence is an equivalence relation, this proves canonical form is diag(l, 1, -1).
the theorem.

Suppose ( , ) is an inner product defined on an n-dimensional, real


The matrices diag(l, ... , 1, -1, ... , -1, O, ... , O) are taken as the v.ector space "/C. As a bilinear function, ( , ) may be written in the from
canonical forms for real symmetric matrices under congruence. (U, V) ~ xrA Y where X and Y are coordinates of U and V with respect to
304 7. SECOND DEGREE CURVES ANO SURFACES §3. Rigid Motions 305

sorne basis. Since an inner product is symmetric, the matrix A must be 7. a. Does b((xt. x 2), Ú't. Y2)) = 2x1Y1 + 6x1Y2 + 6x2Y1 + l8x2Y2 define an
symmetric. :Now the requirement that <, ) be positive definite implies ~hat inner product on r 2?
b. Does
both the rank and the signature of A are equal to n, for if not, then there
exists a basis {U 1 , .• •. , U,.} in terms of which b((Xt, X2, X3), (Y t. Y2, y3)) = 4X¡y¡ + 2X¡YJ + 2X3Y1 + 3X2Y2 - X2Y3
- X3Y2 + 2x3Y3
define an inner product on "Y 3?
8. Show that if A is a symmetric matrix and B is congruent to A, then B is also
If r < n, (U,+ 1 , U,+ 1 ) =O, which contradicts positive definiteness; and symmetric.
< >
if p < n, then up+ ¡, up+ 1 = -1, which also contradicts positive definite- 9. Suppose A and B are real symmetric n x n matrices. Show that if A is congruent
ness. Therefore every inner product OQ. "f/ can be expressed in the form to B, then there exists a quadratic function on "Y, that has A and B as matrices
<U, V) = x 1 y 1 + · · · + x,.y,. with respect to sorne basis. This is not a new with respect to two choices of basis.
fact, we know that <, ) has such a coordinate expression in terms of any
orthonormal basis for ("//, <, )).It cotild however be stated in the terms of
this section by saying that a matrix u sed in any coordina te expression for any
inner product on "f/ Iies in the equivalence class containing the identity matrix, §3. Rigid Motions
>.,,
[!,.], or that any matrix for an inner product is congruent to the identity
matrix. The coordinate transformations required to express a quadratic func-
. ··1
tion in canonical form are too general to be used in studying the graph of a
second degree equation in Cartesian coordinates. A transformation of
-~
Problems Euclidean space is required to be a "rigid motion," that is, it should move
figures without distortion just as rigid objects are moved in the real world.
l. Let q(a, b) = a 2 + 16ab- 11b 2 for (a, b) E "f/ 2. In this section we will characterize and then define the "rigid motions" of
a. Find the matrix A of q with respect to {E1 }. Euclidean n-space.
b. Find the canonical form A' for A. The intuitive idea of a "rigid motion" suggests severa! properties. First
c. Find a basis B' with respect to which A' is the matrix of q. it should send Iines into Iines and second, it should be an isometry, that is, the
d. Express q in terms of coordinates with respect to· B'. distan ce between any two points of rff, should equal the distance between their
j
2. Follow the same directions as in problem 1 for the quadratic function defined images. The requirement that a transformation be an isometry, or distance
on "f/ 3 by q(a, b, e)= 9a 2 + 12ab + 12ac + 6b 2 + 12c 2 • 1 preserving, implies that it is one to one. The first condition is satisfied by any
3. Let q be the quadratic function defined on R 2 [t] by q(a + bt) = 14a 2 + 16ab + ,¡ linear map, and both conditions are satisfied by orthogonal transformations.
46 2 • However, not aii orthogonal transformations would be considered "rigid
a. Find the matrix A of q with respect to B = {1, t}. motions." For example, the reflection of 3-space in a plane transforms an
b. If B' = {2 - 5t, 3t - 1}, use the transition matrix from B to B' to find the object such as your left hand into its mirror image, your right hand, which is
matrix A' of q with respect to B'. not possible in the real world. Therefore a "rigid motion" should also
c. Write q in terms of coordinates with respect to B'.
preserve something Iike left or right handedness.
4. Prove that congruence is an equivalence relation on A!, x ,.
5. Suppose A and B are n x n matrices such that xr A Y = xrBY for all X,
Definition The order of the vectors in any basis for rff,. determines an
Y E .4, x t· Show that A = B.
orientalion for rff•. Two bases B and B' with transition matrix P determine the
6. For each of the following symmetric matrices, find the rank, the signature, and same orientation if det P > O and opposite orientations if det P < O.
the canonical form for congruence.

a. (_~ -~). b. (-~ -~). c. (f -io ó).


4 5
d. ·(-jo -i
-2 1
~).' For example, an orientation is given for the Cartesian plane or Cartesian
3-space. by the coordinate system related to the standard basis. Por the plane,
§3. Rigid Motions 307
306 7. SECOND DEGREE CURVES ANO SURFACES

X z

--~--~~-----4--X

y
-+-----~--Y

Figure 3
y

Figure 1 Definition Suppose T: $ 11 -> ~f. is a nonsingular linear transformation.


If B = {U 1 , ••• , U.} is any basis for ~f., let B' = {T(U 1), ••• , T(U11)}.
the orientation is usually established by choosing the second coordinate axis, Then T is orientation preserving if B and B' determine the same orientation
or basis vector, to be 90° from the first axis, measured in the counterclockwise for ~f •.
direction. All our diagrams in the Cartesian plane have used this orientation.
Figure 1 shows two choices of coordinate systems which give opposite This definition could be restated as follows:
orientations to the plane. There are only two possible orientations for the
piane; in fact, the definition of orientation implies that c. can be oriented in
Theorem 7.5 A linear transformation T:lf.-> ~f. is orientation
oniy two ways.
preserving if and only if the matrix of T with respoct to any basis for c. has
The orientation for Cartesian 3-space is usually given by the "right hand
rule." This rule states that if the first finger of the right hand points in the a positive determinant.
direction of the x axis, the first basis vector, and the second finger points in
the direction of the y axis, se,cond basis vector, then the thumb points in the Proof If B = {U~> ... , U.}' is a basis for $ "' then the matrix of T
direction of the z axis or third basis vector, see Figure 2. The other orientation with respect to Bis the transition matrix from B to B' = {T(U1), . . . , T(U.)}.
for 3-space follows the "left hand rule" in which the three coordinate direc- (Why must B' be a basis for each direction ?)
tions are aligned with the first two fingers and thumb of the left hand. A
coordinate system which gives a left-handed orientation to 3-space is il-
Iustrated in Figure 3. Now with this definition, a reflection of 3-space in a plane is not orienta-
tion preserving, for the matrix of such a reflection with respect to an orthonor-
z mal basis has determinant -1. Therefore, the third requirement of a "rigid
motion" is that it be orientation preserving. This eliminates reflections, so
that the only orthogonal transformations that satisfy all three conditions are
the rotations.
If the essential character of a "rigid motion" is that it move. figures
without distortion, then it is not necessary for the transformation to be linear.
A linear map must send the origin to itself, yet from the geometric viewpoint,
there is no distinguished point. It is here that Euclidean geometry differs
from the study of the vector space g m and nonlinear transformations must be
introduced.
A translation T: t.~ 1. given by T(U) == U+ Z for U E In and Z
X a fixed point in c., should be admitted as a "rigid motion." lt is easily seen
that a translation sends lines into lines and preserves distance. However, the
~Figure 2
§3. Rigid Motions 309
308 7. SECOND DEGREE CURVES ANO SURFACES

definition of orientation preserving must be modified to include such a nonlinear map.

Example 1. Suppose T: ℰ_2 → ℰ_2 is the translation given by

    T(a, b) = (a, b) + (2, 1) = (a + 2, b + 1).

How does T affect a given orientation?
Suppose ℰ_2 is oriented in the usual way by {E_i}. Since T moves the origin, the fact that {E_i} is defined in relation to O must be taken into consideration. That is, the coordinate system is determined by the three points O, E_1, and E_2. This coordinate system is moved by T to the coordinate system determined by T(O), T(E_1), and T(E_2). These two coordinate systems are illustrated in Figure 4, with x' and y' denoting the new system. Even though the new coordinate system does not have its origin at the zero vector of ℰ_2, it orients the plane in the same way as {E_i}; for the second axis (y') is 90° from the first (x') measured in the counterclockwise direction. Further, the directions of the new coordinate axes are given by T(E_1) - T(O) and T(E_2) - T(O), that is, by E_1 and E_2. This example shows how the definition of orientation preserving might be extended to nonlinear maps.
Definition. Suppose T: ℰ_n → ℰ_n is one to one and B = {U_1, ..., U_n} is a basis for ℰ_n. Set

    B' = {T(U_1) - T(O), ..., T(U_n) - T(O)}.

T is orientation preserving if B and B' determine the same orientation for ℰ_n.

This definition essentially defines a translation to be an orientation preserving map. If T is linear, it contains nothing new, for T(O) = O. But the definition insures that the composition of a rotation and a translation can be called orientation preserving.

[Figure 4: the coordinate system determined by O, E_1, E_2 and its image under the translation T, determined by T(O), T(E_1), T(E_2), with the new axes labeled x' and y'.]

Rotations and translations satisfy the three conditions deduced from the concept of a distortion-free transformation of Euclidean n-space. The characterization of a rigid motion will be complete upon showing that every such transformation is the composition of a translation and a rotation.

Theorem 7.6. Suppose T: ℰ_n → ℰ_n satisfies the following conditions:
1. T transforms lines into lines.
2. T is an isometry.
3. T is orientation preserving.
Then T is the composition of a translation and a rotation.

Proof. Suppose first that T(O) ≠ O. Then T may be composed with the translation which sends U to U - T(O). This composition sends O to O and satisfies the three given conditions. Therefore it is sufficient to assume that T(O) = O and show that T is a rotation.
By 2, T preserves distance, therefore

    ||T(U) - T(V)|| = ||U - V||    for all U, V ∈ ℰ_n.

This implies that T preserves the norm, for setting V = O yields ||T(U)|| = ||U||.
To show that T is a linear map, first consider T(rU) for some r ∈ R and U ∈ ℰ_n. The points O, U, and rU are collinear, so T(O) = O, T(U), and T(rU) are also collinear by 1. Hence T(rU) = kT(U) for some scalar k. Since T preserves the norm,

    ||T(rU)|| = ||rU|| = |r| ||U|| = |r| ||T(U)||.

But ||T(rU)|| = |k| ||T(U)||, so k = ±r. If k = -r, then T changes the relative position of the three points, violating 2. (If k = r = 0, then there is nothing to prove.) Therefore k = r, and T preserves scalar multiplication. In a similar way, using the parallelogram rule for addition, it can be shown that T preserves addition (see problem 6 at the end of this section). Thus T is a linear map.
Since T is linear and preserves the norm, T is an orthogonal transformation (problem 3, page 269). But T is required to preserve orientation, so its matrix with respect to any basis has a positive determinant. That is, T is a rotation, completing the proof.
The preceding discussion justifies the following definition.

Definition. A rigid motion of ℰ_n is a composition of a finite number of rotations and translations.

Much of Euclidean geometry can be viewed as a search for properties of figures that remain invariant when the figure is moved by a rigid motion. Here the term "figure" simply refers to a collection of points in a Euclidean space, such as a line, a circle, or a cube.

Definition. Two geometric figures in ℰ_n are congruent if there exists a rigid motion of ℰ_n that sends one onto the other.

It should be clear that congruence is an equivalence relation on any collection of figures. As a simple example, consider the set of all lines in the plane. Given any two lines, there is a rigid motion that maps one onto the other. Therefore any two lines in the plane are congruent. This means that there is no property, invariant under rigid motions, that distinguishes one line from another. Something like slope only measures relative position; it is not intrinsic to the line. Thus the set of all lines contains only one equivalence class under congruence.

As another example, consider the set of all circles in the plane. It is not difficult to show that two circles from this set are congruent if and only if they have the same radius. As with lines, circles can be in different relative positions, but two circles with the same radius are geometrically the same. The length of the radius gives a geometric invariant that distinguishes noncongruent circles. Thus the set of all circles in the plane is divided into an infinity of equivalence classes by congruence, one for each positive real number.

An invariant under congruence is an intrinsic geometric property, a property of the figure itself without reference to other objects. It is in this sense that Euclidean geometry searches for invariants under rigid motions. This is essentially what occurs in the study of congruent triangles in plane geometry. However, reflections are usually allowed in plane geometry because as 3-dimensional beings we can always pick up a triangle, turn it over, and put it back in the plane. Of course, a 4-dimensional being would view our right and left hands in just the same way.

Example 2. Suppose S is the collection of all sets G ⊂ ℰ_n such that G = {C ∈ ℰ_n | ACᵀ = B} for some consistent system of linear equations in n unknowns, AX = B. That is, S contains the solution sets for all consistent systems of linear equations in n unknowns. Examining how congruence either identifies or distinguishes between elements of S will answer three essentially identical questions: What types of geometric figures are in S? What is the nature of the equivalence classes of S under congruence? What are the geometric invariants of elements from S?
If AX = B is a single linear equation, then the set of all solutions, G, is a hyperplane. As with lines (hyperplanes in ℰ_2), any two hyperplanes in ℰ_n are congruent. Therefore one equivalence class contains all the hyperplanes in ℰ_n. At the other extreme, if A has rank n, then G is a single point. Since any two points are congruent (one can be sent to the other by a translation), one equivalence class contains all the points of ℰ_n.
In general, if the rank of A is r, then G may be expressed in terms of n - r parameters, so the geometric dimension of G is n - r. It can be shown that two graphs from S are congruent if and only if they have the same dimension. Therefore only one invariant is needed to distinguish between elements of S, either the rank of A or the dimension of the graph G. This means there are n equivalence classes or types of solution sets. When rank A = n, or G has dimension 0, the equivalence class is ℰ_n and the graphs are points; when rank A = n - 1, or G has dimension 1, the equivalence class is the set of all lines in n-space, and so on.

Other examples can be obtained by replacing the linear system AX = B, defining each graph G, by some other condition on the coordinates of points. In particular, if AX = B is replaced by a general second degree polynomial equation in n variables, then we are back to the problem posed at the beginning of this chapter. Now we can see that the problem is to use the geometric equivalence relation of congruence to study the possible graphs of second degree equations, and to find a complete set of geometric invariants which describe them. With the invariants will come a set of canonical forms which will amount to choosing a standard position for each type of graph.
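The single invariant in Example 2 is easy to compute. A minimal numpy sketch (the system below is our own illustration, assumed consistent, not one from the text):

    import numpy as np

    # A consistent system AX = B in n = 3 unknowns.
    A = np.array([[1.0, 2.0, -1.0],
                  [2.0, 4.0, -2.0]])   # rank 1: the second row is twice the first
    r = np.linalg.matrix_rank(A)
    n = A.shape[1]
    print(r, n - r)   # rank 1, so the solution set has dimension 2: a plane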
Problems

1. Show that for any n, ℰ_n can be oriented in only two ways.
2. Suppose T: ℰ_2 → ℰ_2 is given by T(a, b) = (a, b) + (-4, 3). Sketch the coordinate system determined by {E_i} and the coordinate system it is sent into by T.
3. Prove that a translation of ℰ_n:
   a. Maps lines onto lines.
   b. Preserves the distance between points.
4. Suppose T: ℰ_n → ℰ_n sends lines onto lines and is an isometry. Prove that T preserves the order of collinear points.
5. a. Show that any two lines in ℰ_n are congruent.
   b. Show that any two hyperplanes in ℰ_n are congruent.

6. Prove that if T: ℰ_n → ℰ_n is an isometry, sends lines onto lines, and T(O) = O, then T preserves vector addition.
7. Determine the nature of the equivalence classes under congruence of the set of all spheres in 3-space.
8. a. Let S denote the set of all triangles in the plane. Find a complete set of geometric invariants for S under congruence. Be sure to consider orientation.
   b. Do the same if S is the set of all triangles in 3-space.
§4. Conics

A general second degree polynomial in the n variables x_1, ..., x_n can be written as the sum of a symmetric quadratic form XᵀAX = Σ_{i,j=1}^n a_{ij}x_i x_j, a linear form BX = Σ_{i=1}^n b_i x_i, and a constant C. The graph G of the polynomial equation XᵀAX + BX + C = 0 is obtained by viewing x_1, ..., x_n as coordinates of points in Euclidean n-space with respect to some Cartesian coordinate system. This corresponds to viewing x_1, ..., x_n as coordinates of vectors in the inner product space ℰ_n with respect to some orthonormal basis. We will take x_1, ..., x_n to be coordinates with respect to the standard basis. Coordinates with respect to any other basis or coordinate system will be primed. Thus the graph G is the subset of ℰ_n given by {V ∈ ℰ_n | V: (x_1, ..., x_n)_{E_i} and XᵀAX + BX + C = 0 when X = (x_1, ..., x_n)ᵀ}. In general, when n = 2, the graphs are curves in the plane called conics or conic sections. This terminology is justified in problem 5 on page 329. When n = 3, the graphs are surfaces in 3-space called quadric surfaces. It should be noted that the general second degree equation includes equations that have degenerate graphs, such as points or lines, as well as equations that have no graph at all. The latter is the case with the equation (x_1)² + 1 = 0.

Rather than consider the set of all graphs for second degree polynomial equations, and determine conditions which identify congruent graphs, it is easier to work directly with the polynomial. After determining how a rigid motion might be used to simplify an arbitrary polynomial, it is then possible to choose a set of canonical forms for the equations. Choosing such a canonical set of equations amounts to choosing "standard positions" for the various types of graph relative to a given coordinate system, and once a simple collection of equations is obtained, it is not difficult to select a set of geometric invariants for the graphs under congruence.

The first point to notice is that a translation does not affect the coefficients of the second degree terms in XᵀAX + BX + C = 0. To see this, suppose a translation is given in coordinates by X = X' + Z, where Z is a fixed element of M_{n×1}; that is, x_1 = x'_1 + z_1, ..., x_n = x'_n + z_n. Then

    XᵀAX + BX + C = (X' + Z)ᵀA(X' + Z) + B(X' + Z) + C
                  = X'ᵀAX' + X'ᵀAZ + ZᵀAX' + ZᵀAZ + BX' + BZ + C
                  = X'ᵀAX' + B'X' + C',

where B' = 2ZᵀA + B and C' = ZᵀAZ + BZ + C. For example, if the polynomial x² + 3xy + 3yx + 7x + 2 undergoes the translation x = x' + 4, y = y' - 5, then it becomes

    (x' + 4)² + 3(x' + 4)(y' - 5) + 3(y' - 5)(x' + 4) + 7(x' + 4) + 2
        = x'² + 3x'y' + 3y'x' - 15x' + 24y' - 74,

and the coefficients of the second degree terms, 1, 3, 3, 0, remain the same. You might want to work this example out in the matrix form to confirm the general expression; then X = (x, y)ᵀ, Z = (4, -5)ᵀ, and so on.
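The effect of a translation on B and C is easy to check numerically. A minimal numpy sketch (ours) for the example just worked:

    import numpy as np

    # Coefficients of x^2 + 3xy + 3yx + 7x + 2 (see the example above).
    A = np.array([[1.0, 3.0],
                  [3.0, 0.0]])
    B = np.array([7.0, 0.0])
    C = 2.0
    Z = np.array([4.0, -5.0])        # the translation X = X' + Z

    B_new = 2 * Z @ A + B            # B' = 2Z^T A + B
    C_new = Z @ A @ Z + B @ Z + C    # C' = Z^T A Z + BZ + C
    print(B_new, C_new)              # [-15. 24.] -74.0; A itself is unchanged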
The other component of a rigid motion, a rotation, can always be used to obtain a polynomial without cross product terms, that is, terms of the form x_h x_k with h ≠ k. For since A is symmetric, there exists an orthogonal matrix P with determinant +1 such that PᵀAP = P⁻¹AP = diag(λ_1, ..., λ_n), where λ_1, ..., λ_n are the characteristic values of A. So the rotation of ℰ_n given in coordinates by X = PX' changes the polynomial XᵀAX + BX + C to X'ᵀ diag(λ_1, ..., λ_n)X' + BPX' + C, in which the second degree terms are perfect squares, λ_1 x'_1² + ··· + λ_n x'_n². The fact that such a polynomial can always be obtained is called the "Principal Axis Theorem." For a justification of this name, see Example 1 below.

Now consider the matrix A of coefficients for the second degree terms. Translations do not affect A and rotations transform A to a similar matrix, so any invariants of A under similarity are invariants of the polynomial under these coordinate changes. This means that similarity invariants of A are geometric invariants of the graph under rigid motions. These invariants include the rank and the characteristic values, as well as the determinant, the trace, and other symmetric functions of the characteristic values. For example, the sign of the determinant of A distinguishes three broad classes of conic sections. If a second degree polynomial equation in two variables is written in the form ax² + bxy + cy² + dx + ey + f = 0, then

    A = ( a    b/2 )
        ( b/2  c   )

and the determinant of A is ac - b²/4 = ¼(4ac - b²). Traditionally, rather than use det A, the invariant is taken as -4 det A = b² - 4ac and called the discriminant of the equation. The graph is said to be a hyperbola if b² - 4ac > 0, an ellipse if b² - 4ac < 0, and a parabola if b² - 4ac = 0.
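The discriminant test is a one-line computation. A small Python sketch (ours; it gives only the broad class, ignoring the degenerate cases listed later in Table I):

    def conic_type(a, b, c):
        """Classify ax^2 + bxy + cy^2 + dx + ey + f = 0 by b^2 - 4ac."""
        disc = b * b - 4 * a * c
        if disc > 0:
            return "hyperbola"
        if disc < 0:
            return "ellipse"
        return "parabola"

    print(conic_type(11, 6, 19))    # 'ellipse'   (Example 1 below)
    print(conic_type(2, -72, 23))   # 'hyperbola' (Example 3 below)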

Example 1 Determine the type of curve given by

llx 2 + 6xy + 19y 2 - 80 =O


and find a coordinate system for which the equation does not have a cross
product term.
We have b2 - 4ac = 62 - 4·11·19 < O, so the curve is an ellípse.
The symmetric matrix of coefficients for the second degree terms is

A= (113 193) .
Figure 5
A has characteristic values 10 and 20, so there exist coordinates x', y' in-
which the equation can be written as Now return to the general case of a second degree polynomial in n
variables. Given XT AX + BX + e, there is a rotation X= PX' that trans-
10x' 2 + 20y' 2 = 80. forms it to
n n
The coordinate system and the rotation yielding these coordinates can be L A¡x; 2 + L b;x; + e,
obtained from the characteristic vectors for A. One finds that (3, -I)T and i=l i=l

(J, 3l are characteristic vectors for 10 and 20, respectively, so if


where ). 1, • •• , ).n are the characteristic values of A. It is then possible to
obtain an equivalent polynomial in which no variable appears in both the
- ( 3/.JlO 1/.JlO) quadratic and the linear parts. If for sorne k, ).k and b~ are nonzero,
p - -1/.JIO 3/.JIO '
then completing a square gives
then P is orthogonal with det P = + 1, and A.kx? + b~xk. = A.k(xk. + tbk.P-ki - ib'/ P-k·
Therefore if in a translation x~ = xk. + 1_bk,f).k, then after the translation the
terms involving x~2 and xk. are replaced by A.kx% 2 minus a constant.

Thus the x', y' coordinate system corresponds to the basis B = {(3/.JlO, Example 2 Find a translation that transforms the polynomial
-1/.JIO), (1/.JIO, 3/.JIO)}. 2x 2 - 5y 2 + 3x + lOy into a polynomial without linear terms.
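The characteristic values and vectors in Example 1 can be recovered with any eigenvalue routine. A numpy sketch (ours; eigh may flip the signs of the vectors):

    import numpy as np

    A = np.array([[11.0, 3.0],
                  [3.0, 19.0]])
    values, vectors = np.linalg.eigh(A)   # eigh: A is symmetric
    print(values)                         # [10. 20.]
    # The columns of `vectors` are unit characteristic vectors, proportional
    # (up to sign) to (3, -1) and (1, 3): the principal axis directions.
    print(vectors)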
Now return to the general case of a second degree polynomial in n variables. Given XᵀAX + BX + C, there is a rotation X = PX' that transforms it to

    Σ_{i=1}^n λ_i x'_i² + Σ_{i=1}^n b'_i x'_i + C,

where λ_1, ..., λ_n are the characteristic values of A. It is then possible to obtain an equivalent polynomial in which no variable appears in both the quadratic and the linear parts. If for some k, λ_k and b'_k are nonzero, then completing a square gives

    λ_k x'_k² + b'_k x'_k = λ_k (x'_k + b'_k/(2λ_k))² - b'_k²/(4λ_k).

Therefore if a translation is given by x''_k = x'_k + b'_k/(2λ_k), then after the translation the terms involving x'_k² and x'_k are replaced by λ_k x''_k² minus a constant.

Example 2. Find a translation that transforms the polynomial 2x² - 5y² + 3x + 10y into a polynomial without linear terms.
Completing the squares gives

    2(x + 3/4)² - 2·9/16 - 5(y - 1)² + 5.

Therefore the translation given by (x', y') = (x + 3/4, y - 1) transforms the polynomial to 2x'² - 5y'² + 31/8.

Thus, in general, given a second degree polynomial there exists a rigid motion, the composition of a rotation and a translation, that transforms it to a polynomial in a rather simple form. In this form a variable may appear either in the quadratic portion as a square or in the linear portion, but not both. From this one could choose a set of canonical forms for polynomials under coordinate changes induced by rigid motions of ℰ_n. However, from the geometric point of view two polynomials should be judged equivalent if they differ by a nonzero multiple, for multiplying an equation by a nonzero constant does not affect its graph. Therefore a set of canonical forms for equations of graphs should be obtained using rigid motions and multiplication by nonzero scalars. This will be done for polynomials in two and three variables, establishing standard positions for conics and quadric surfaces.

A general second degree equation in two variables can be put into one of nine types of canonical forms by changing coordinates with a rigid motion and then multiplying the equation by a nonzero constant. For example, if the polynomial has a positive discriminant, then either x'²/a² - y'²/b² = 1 or x'²/a² - y'²/b² = 0 can be obtained, where a and b represent positive constants.

Example 3. The polynomial equation

    2x² - 72xy + 23y² + 140x - 20y + 50 = 0

has positive discriminant and therefore is equivalent to one of the above forms. Find the form and a coordinate transformation that produces it.
The symmetric coefficient matrix of the second degree terms,

    (  2   -36 )
    ( -36   23 ),

has 50 and -25 as characteristic values, with corresponding characteristic vectors (3, -4)ᵀ and (4, 3)ᵀ. The rotation given by X = PX' with

    P = (  3/5  4/5 )
        ( -4/5  3/5 )

yields the equation

    50x'² - 25y'² + 100x' + 100y' + 50 = 0.

Upon completion of the squares, the equation becomes

    50(x' + 1)² - 50 - 25(y' - 2)² + 100 + 50 = 0.

Therefore under the translation (x'', y'') = (x' + 1, y' - 2), it becomes 50x''² - 25y''² = -100. This is not quite in the first form above. But with a rotation through 90° given by (x''', y''') = (y'', -x''), the equation can be written in the form

    x'''²/4 - y'''²/2 = 1.

Here the a and b in the form given above are a = 2 and b = √2. The coordinate change required is the composition of all three transformations. Putting them together gives the relation

    (x, y) = (4x'''/5 - 3y'''/5 + 1, 3x'''/5 + 4y'''/5 + 2).

Let the four coordinate systems used in obtaining the canonical form for the equation be denoted by B = {E_i}, B' = {(3/5, -4/5), (4/5, 3/5)}, B'' the translate of B', and B''' obtained from B'' by a rotation through 90°. Then Figure 6 shows the graph of each equation in relation to the appropriate coordinate system. The hyperbola is in "standard position" with respect to the x''', y''' coordinate system.

[Figure 6: the hyperbola of Example 3 together with the coordinate systems corresponding to B, B', B'', and B'''.]
The other seven types of canonical forms for second degree equations in two variables are obtained as follows: If the discriminant is negative, the equation can be transformed to x'²/a² + y'²/b² = 1, x'²/a² + y'²/b² = -1, or x'²/a² + y'²/b² = 0, with 0 < b ≤ a, by using a rigid motion and multiplying by a nonzero constant. The graph in Example 1 shows an ellipse in standard position with respect to the x', y' coordinate system. For that curve, a = √8 and b = 2. If the discriminant is 0, the equation can be transformed to x'² = 4py', x'² = a², x'² = -a², or x'² = 0, with p and a positive. The graph of x'² = 4py' is the parabola in standard position with its vertex at the origin, focus at (0, p)_{x',y'}, and directrix the line y' = -p.
It is possible to determine which type of canonical form a given polynomial is equivalent to without actually finding the transformations. To see this we first express the entire polynomial in matrix form by writing the coordinates (x, y) in the form (x, y, 1). Then the general second degree polynomial in two variables can be written as

    ax² + bxy + cy² + dx + ey + f = (x y 1)( a    b/2  d/2 )( x )
                                           ( b/2  c    e/2 )( y )
                                           ( d/2  e/2  f   )( 1 ).

We will write this matrix product as X̂ᵀÂX̂, with X̂ = (x, y, 1)ᵀ and Â the 3 × 3 matrix. Using this notation a rotation with matrix P = (p_ij) would be written as

    ( x )   ( p_11  p_12  0 )( x' )
    ( y ) = ( p_21  p_22  0 )( y' )
    ( 1 )   ( 0     0     1 )( 1  )

or X̂ = P̂X̂'. Under the rotation X = PX', the polynomial XᵀAX + BX + C = X̂ᵀÂX̂ becomes (P̂X̂')ᵀÂ(P̂X̂') = X̂'ᵀ(P̂ᵀÂP̂)X̂'. This notation even makes it possible to express a translation, a nonlinear map, in matrix form. The translation (x', y') = (x + h, y + k) is given by

    ( x' )   ( 1  0  h )( x )
    ( y' ) = ( 0  1  k )( y )
    ( 1  )   ( 0  0  1 )( 1 )

and its inverse is given by

    ( x )   ( 1  0  -h )( x' )
    ( y ) = ( 0  1  -k )( y' )
    ( 1 )   ( 0  0   1 )( 1  ).

If a translation is written as X̂ = Q̂X̂', then the polynomial X̂ᵀÂX̂ is transformed to (Q̂X̂')ᵀÂ(Q̂X̂') = X̂'ᵀ(Q̂ᵀÂQ̂)X̂'.

Example 4. Perform the translation of Example 2 with matrix multiplication.
In Example 2, the translation (x', y') = (x + 3/4, y - 1) was used to simplify the polynomial

    2x² - 5y² + 3x + 10y = (x y 1)( 2    0   3/2 )( x )
                                  ( 0   -5   5   )( y )
                                  ( 3/2  5   0   )( 1 ).

This translation is given by

    ( x )   ( 1  0  -3/4 )( x' )
    ( y ) = ( 0  1   1   )( y' )
    ( 1 )   ( 0  0   1   )( 1  ).

One can check that

    (x' y' 1)(  1    0  0 )( 2    0   3/2 )( 1  0  -3/4 )( x' )
             (  0    1  0 )( 0   -5   5   )( 0  1   1   )( y' )
             ( -3/4  1  1 )( 3/2  5   0   )( 0  0   1   )( 1  )

        = (x' y' 1)( 2   0    0   )( x' )
                   ( 0  -5    0   )( y' )
                   ( 0   0  31/8  )( 1  )

        = 2x'² - 5y'² + 31/8.

This is the polynomial obtained in Example 2.
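The check suggested in Example 4 is a single matrix product. A numpy sketch (ours):

    import numpy as np

    # Q^T Â Q should equal diag(2, -5, 31/8), as in Example 4.
    A_hat = np.array([[2.0,  0.0, 1.5],
                      [0.0, -5.0, 5.0],
                      [1.5,  5.0, 0.0]])
    Q = np.array([[1.0, 0.0, -0.75],
                  [0.0, 1.0,  1.0],
                  [0.0, 0.0,  1.0]])
    print(Q.T @ A_hat @ Q)   # [[2, 0, 0], [0, -5, 0], [0, 0, 3.875]]; 31/8 = 3.875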

The composition of the rotation X = PX' followed by the translation (x'', y'') = (x' + h', y' + k') is given in matrix form by

    ( x )   ( p_11  p_12  -(p_11 h' + p_12 k') )( x'' )
    ( y ) = ( p_21  p_22  -(p_21 h' + p_22 k') )( y'' )
    ( 1 )   ( 0     0      1                   )( 1   ).

If this coordinate expression is written as X̂ = P̂X̂'', then the polynomial X̂ᵀÂX̂ is transformed to X̂''ᵀ(P̂ᵀÂP̂)X̂''. That is, a rigid motion transforms Â to P̂ᵀÂP̂. Since P̂ is a nonsingular matrix, the rank of Â is an invariant of the polynomial under rigid motions of ℰ_2. This invariant, together with the discriminant and the property of having a graph (an equation has a graph if the coordinates for some point satisfy the equation), distinguishes the nine types of canonical forms for second degree polynomials in two variables. This is summarized in Table I.

Table I

    b² - 4ac   Rank Â   Graph   Canonical form          Type of graph
    > 0        3        Yes     x²/a² - y²/b² = 1       Hyperbola
    > 0        2        Yes     x²/a² - y²/b² = 0       Intersecting lines
    < 0        3        Yes     x²/a² + y²/b² = 1       (Real) ellipse
    < 0        3        No      x²/a² + y²/b² = -1      Imaginary ellipse
    < 0        2        Yes     x²/a² + y²/b² = 0       Point ellipse
    = 0        3        Yes     x² = 4py                Parabola
    = 0        2        Yes     x² = a²                 Parallel lines
    = 0        2        No      x² = -a²                Imaginary lines
    = 0        1        Yes     x² = 0                  Coincident lines
Problems

1. Find the matrices A, B, and Â that are used to express the following polynomials in the form XᵀAX + BX + C and X̂ᵀÂX̂.
   a. 3x² - 4xy + 7y + 8.  b. 8xy - 4x + 6y.
2. Find a rotation that transforms 14x² + 36xy - y² - 26 = 0 to x'² - y'²/2 = 1. Sketch both coordinate systems and the graph of the two equations.
3. Find a rigid motion that transforms the equation

       16x² - 24xy + 9y² - 200x - 100y - 500 = 0

   to the canonical form for a parabola. Sketch the graph and the coordinate systems corresponding to the various coordinates used.
4. Write the polynomial 5x² - 3y² - 10x - 12y - 7 in the form X̂ᵀÂX̂ and transform it to a polynomial without linear terms with a translation given by X̂ = P̂X̂'.
5. For each of the following equations, use Table I, without obtaining the canonical form, to determine the nature of its graph.
   a. 2x² - 6xy + 4y² + 2x + 4y - 24 = 0.
   b. 4x² + 12xy + 9y² + 3x - 4 = 0.
   c. 7x² - 5xy + y² + 8y = 0.
   d. x² + 4xy + 4y² + 2x + 4y + 5 = 0.
   e. x² + 3xy + y² + 2x = 0.
6. The equations XᵀAX + BX + C = 0 and r(XᵀAX + BX + C) = 0 have the same graph for every nonzero scalar r. Should the fact that det A ≠ det(rA) have been taken into consideration when the discriminant was called a geometric invariant?
7. The line in ℰ_2 through (a, b) with direction numbers (α, β) can be written as X̂ = tK̂ + Û, t ∈ R, with

       X̂ = ( x ),  Û = ( a ),  K̂ = ( α )
           ( y )       ( b )       ( β )
           ( 1 )       ( 1 )       ( 0 ).

   a. Show that this line intersects the graph of X̂ᵀÂX̂ = 0 at the points for which the parameter t satisfies

       (K̂ᵀÂK̂)t² + (2K̂ᵀÂÛ)t + ÛᵀÂÛ = 0.

   b. Use the equation in part a to find the points at which the line through (-2, 3) with direction (1, -2) intersects the graph of x² + 4xy - 2y + 5 = 0.
8. Find the canonical form for the following and sketch the graph together with any coordinate systems used.
   a. 2x² + 4xy + 2y² = 36.  b. 28x² + 12xy + 12y² = 90.

§5. Quadric Surfaces

The general second degree equation in three variables may be written in the form X̂ᵀÂX̂ = 0, where Â is the symmetric matrix

    Â = ( a_11  a_12  a_13  b_1 )
        ( a_12  a_22  a_23  b_2 )
        ( a_13  a_23  a_33  b_3 )
        ( b_1   b_2   b_3   c   )

and X̂ = (x, y, z, 1)ᵀ.

If A = (a_ij), B = (b_1, b_2, b_3), and C = (c), then

    X̂ᵀÂX̂ = XᵀAX + 2BX + C,  with X = (x, y, z)ᵀ.

Notice that both A and Â are symmetric. If x, y, and z are viewed as Cartesian coordinates in three-space, then the graph of X̂ᵀÂX̂ = 0 is a quadric surface. In general, this is actually a surface, but as with the conic sections some equations either have no graph or their graphs degenerate to lines or points.
We know that for any equation X̂ᵀÂX̂ = 0 there exists a rigid motion which transforms it to an equation in which each variable that appears is either a perfect square or in the linear portion. Therefore it is not difficult to make a list of choices for canonical forms, as shown in Table II.

Table II

    Type of equation                  Name for graph
    x²/a² + y²/b² + z²/c² = 1         Ellipsoid
    x²/a² + y²/b² + z²/c² = -1        No graph (imaginary ellipsoid)
    x²/a² + y²/b² - z²/c² = -1        Hyperboloid of 2 sheets
    x²/a² + y²/b² - z²/c² = 1         Hyperboloid of 1 sheet
    x²/a² + y²/b² = 2z                Elliptic paraboloid
    x²/a² - y²/b² = 2z                Hyperbolic paraboloid
    x²/a² + y²/b² + z²/c² = 0         Point (imaginary elliptic cone)
    x²/a² + y²/b² - z²/c² = 0         Elliptic cone
    x²/a² + y²/b² = 1                 Elliptic cylinder
    x²/a² + y²/b² = -1                No graph (imaginary elliptic cylinder)
    x²/a² - y²/b² = 1                 Hyperbolic cylinder
    x² = 4py                          Parabolic cylinder
    x²/a² + y²/b² = 0                 Line (imaginary intersecting planes)
    x²/a² - y²/b² = 0                 Intersecting planes
    x² = a²                           Parallel planes
    x² = -a²                          No graph (imaginary parallel planes)
    x² = 0                            Coincident planes

This list contains 17 types of equations, ranked first according to the rank of A and then by the rank of Â. (This relationship is shown later in Table III.) Each of the possible combinations of signatures for A, linear terms, and constant terms is included in Table II. (Recall that a constant term can be absorbed in a linear term using a translation.) Although it may appear that there is no form for an equation of the type ax² + by + cz = 0, it is possible to reduce the number of linear terms in this equation by using a rotation that fixes the x axis. To see this, write by + cz in the form

    √(b² + c²) ( (b/√(b² + c²)) y + (c/√(b² + c²)) z ).

Then the rotation given by

    ( x' )   ( 1   0              0            )( x )
    ( y' ) = ( 0   b/√(b² + c²)   c/√(b² + c²) )( y )
    ( z' )   ( 0  -c/√(b² + c²)   b/√(b² + c²) )( z )

transforms ax² + by + cz = 0 to ax'² + √(b² + c²) y' = 0. Thus the given equation can be put into the standard form for a parabolic cylinder.
Quadric surfaces are not as familiar as conics, and at least one is rather complicated; therefore it may be useful to briefly consider the surfaces listed in Table II. Starting with the simplest, there should be no problem with those graphs composed of planes. For example, the two intersecting planes given by x²/a² - y²/b² = 0 are the planes with equations x/a + y/b = 0 and x/a - y/b = 0.
A cylinder is determined by a plane curve called a directrix and a direction. A line passing through the directrix with the given direction is called a generator, and the cylinder is the set of all points which lie on some generator. For the cylinders listed in Table II the generators are parallel to the z axis and the directrix may be taken as a conic section in the xy plane. These three types of cylinders are sketched in Figure 7, showing that the name of each surface is derived from the type of conic used as directrix.
The nature of the remaining quadric surfaces is most easily determined by examining how various planes intersect the surface. Given a surface and a plane, points which lie in both are subject to two constraints. Therefore a surface and a plane generally intersect in a curve. Such a curve is called a section of the surface, and a section made by a coordinate plane is called a trace. For example, the directrix is the trace in the xy plane for each of the three cylinders described above, and every section made by a plane parallel

to the xy plane is congruent to the directrix. Further, if a plane parallel to


the z axis intersects one of these cylinders, then the section is either one or
two generators. The sections of a surface in planes parallel to the xy plane
Generators
are often called leve/ curves. If you are familiar with contour maps, you will
recognize that the Iines on a contour map are the leve! curves of the land
surface given by parallel planes a fixed number of feet apart.
The traces of the ellipsoid x 2 fa 2 + y 2 fb 2 + z2 /c 2 = 1 in the three co-
y ordinate planes are ellipses, and a reasonably good sketch of the surface is
obtained by simply sketching the three traces, Figure 8. If planes parallel to
a coordinate plane intersect the ellipsoid, then the sections are ellipses, which
get smaller as the distance from the origin increases. For example, the leve!
curve in the plane z = k is the ellipse x 2 fa 2 + y 2 /b 2 = 1 - k 2 /c 2 , z = k.
···~· '. The elliptic cone x 2 /a 2 + y 2 /b 2 - z 2fc 2 = O is also easily sketched. The
traces in the planes y = O and x = O are pairs of lines intersecting at the
origin. And the leve! curve for z =k is the ellipse x 2fa 2 + y 2 /b 2 = k 2 fc 2 ,
z = k, as shown in Figure 9.
The hyperboloids given by x 2fa 2 + y 2 fb 2 - z2 fc 2 = ± 1 are so named
because the sections in planes parallel to the yz plane and the xz plane are
hyperbolas. (This could also be said of the elliptic cone, see problems 5 and
Elliptic cylinder Hyperbolic cylinder 8 at the end of this section.) In particular, the traces in x = O and y = O
are the hyperbolas given by y 2fb 2 - z2 /c 2 = ± 1, x =O and x 2 fa2 - z2 fc 2
= ± l, y= O, respectively. The hyperboloid of2 sheets, x 2fa 2 + y 2 fb 2 - z2 /c 2
z = - 1, has leve! curves in the plane z = k only if k 2 > c 2 • Such a section is
an ellipse with the equation x 2fa 2 + y 2 fb 2 = -1 + k 2 fc 2 , z =k. As k 2
Generator increases, these leve! curves increase in size so that the surface is in two parts
opening away from each other, Figure IO(a). For the hyperboloid of 1 sheet
with the equation x 2 fa 2 + y 2 fb 2 - z2 fc 2 = 1, there is an elliptic section for
Directrix every plane parallel to the xy plane. Therefore the surface is in one piece, as
shown in Figure IO(b).
--- --- J. ..
Sections ofthe paraboloids x 2fa 2 + y 2 fb 2 = 2z and x 2 fa 2 - y 2 fb 2 = 2z
in planes parallel to the yz and xz planes, i.e., x = k and y = k, are para bolas.
Planes parallel to the xy plane intersect the elliptic paraboloid x 2 fa 2 + y 2 fb 2
1 = 2z in ellipses given by x_2 fa 2 + y 2fb 2 = 2k, z = k, provided k ;::: O. When
k < O there is no section and the surface is bowl-shaped, Figure ll(a). For
1
the hyperboÜc paraboloid with equation x 2fa 2 - y 2 fb 2 = 2z every plane par-
X 1 allel to the xy plane intersects the surface in a hyperbola. The trace in the xy
plane is the degenera te hyperbola composed of two intersecting lines; while
x2 =4py,p >O above the xy plane the hyperbolas open in the direction of the x axis and
below they open in the direction of the y axis. The hyperbolic paraboloid is
Parabolic cylinder called a saddle surface because the region near the origin has the shape of a
Figure 7 saddle, Figure ll(b).

z
Trace in
y=O
Trace in
x=O
Section in
z=k
y y
y
Trace in
z=O

X
Ellipsoid
Figure 8

Hyperboloid of 2 sheets Hyperboloid of 1 sheet


(a) {b)
Figure 10

Section in
z=k,k>O

Trace in
y=O

X
Elliptic cone
Figure 9
Elliptic paraboloid Hyperbolic paraboloid
(a) (b)
Figure 11

Given the equation of a quadric surface, gr AX = O, one can determine surface except for ~he two hyperboloids. The hyperboloids are distinguished
the type of surface by obtaining values for at most five invariants. These .are by the sign of det A, which is an invariant because A is a 4 x 4 matrix.
the rank of A, the rank of A, the absolute value of the signature of A, the
existence of a graph, and the sign of the determinant of A. This is shown in
Table III. Problems

Table III l. Por the following equations find both the canonical form and the transfor-
mation used to obtain it.
Rank A Rank A Abs (sA) Graph Det A Type of graph a. x 2 - y 2 /4- z 2 /9 = l.
b. x 2 /5 - z 2 = l.
Yes <O Ellipsoid
3 c. 4z 2 + 2x + 3y + 8z = O.
No >O Imaginary ellipsoid
3 d. 3x 2 + 6y 2 + 2z 2 + 12x + 12y + 12z + 42 = O.
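The five invariants of Table III are all standard matrix computations. A numpy sketch (ours; the existence of a graph must still be checked separately):

    import numpy as np

    def quadric_invariants(A_hat):
        """Rank A, rank Â, abs(sA), and sign of det Â for a symmetric 4x4 Â."""
        A = A_hat[:3, :3]
        eig = np.linalg.eigvalsh(A)
        signature = int(np.sum(eig > 1e-9)) - int(np.sum(eig < -1e-9))
        return (np.linalg.matrix_rank(A),
                np.linalg.matrix_rank(A_hat),
                abs(signature),
                np.sign(np.linalg.det(A_hat)))

    # x^2/4 + y^2/9 - z^2 = 1, a hyperboloid of 1 sheet:
    A_hat = np.diag([0.25, 1/9, -1.0, -1.0])
    print(quadric_invariants(A_hat))   # (3, 4, 1, 1.0): the 1-sheet row above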
Problems

1. For the following equations find both the canonical form and the transformation used to obtain it.
   a. x² - y²/4 - z²/9 = 1.
   b. x²/5 - z² = 1.
   c. 4z² + 2x + 3y + 8z = 0.
   d. 3x² + 6y² + 2z² + 12x + 12y + 12z + 42 = 0.
   e. -3x² + 3y² - 12xz + 12yz + 4x + 4y - 2z = 0.
2. Show that every plane parallel to the z axis intersects the paraboloids x²/a² ± y²/b² = 2z in parabolas.
3. Show that every plane parallel to the z axis intersects the hyperboloids x²/a² + y²/b² - z²/c² = ±1 in hyperbolas.
4. If intersecting planes, parallel planes, and coincident planes are viewed as cylinders, then what degenerate conics would be taken as the directrices?
5. Justify the name "conic section" for the graph of a second degree polynomial equation in two variables by showing that every real conic, except for parallel lines, may be obtained as the plane section of the elliptic cone x²/a² + y²/b² - z²/c² = 0.
6. Find the canonical form for the following equations.
   a. 2xy + 2xz + 2yz - 8 = 0.  b. 2x² - 2xy + 2xz + 4yz = 0.
7. Determine the type of quadric surface given by
   a. x² + 4xy + 4y² + 2xz + 4yz + z² + 8x + 4z = 0.
   b. x² + 6xy - 4xz + yz + 4z² + 2x - 4z + 5 = 0.
   c. 3x² + 2xy + 2y² + 6yz + 7z² + 2x + 2y + 4z + 1 = 0.
   d. 2x² + 8xy - 2y² + 12xz + 4yz + 8z² + 4x + 8y + 12z + 2 = 0.
8. Show that the elliptic cone x²/a² + y²/b² - z²/c² = 0 is the "asymptotic cone" for both hyperboloids given by x²/a² + y²/b² - z²/c² = ±1. That is, show that the distance between the cone and each hyperboloid approaches zero as the distance from the origin increases.

§6. Chords, Rulings, and Centers

There are several ways in which a line can intersect or fail to intersect a given quadric surface, X̂ᵀÂX̂ = 0. To investigate this we need a vector

equation for a line in terms of X. Recall that the line through the point is an arbitrary direction, then RT AK = a2 fa 2 + f3 2 /a 2 + y2 fc 2 • This is zero
U = (a, b, e) with direction numbers V = (a, {3, y) has the equation X only if a = f3 = y = O, but as direction numbers, R =P O. Therefore, RT.A k
= tV + U or (x, y, z) = t(a, {3, y) + (a, b, e), tE R. Written in terms .óf =P O for all directions R.
4 x 1 matrices this equation becomes A line near the origin intersects this ellipsoid in two real points, far from
the origin it intersects the surface in two imaginary points, and somewhere
inbetween for each direction there are (tangent) lines which intersect the
ellipsoid in two coincident points.

If the line & given by X = tK + O intersects XT AX = O when t = t 1


The fourth entry in the direction, R, must be zero to obtain an equality; there- and t = t2 , then the point for which t = t(t 1 + t2 ) is the midpoint of the
fore direction numbers and points now have different representations. chord. This midpoint is always real, for even if t 1 and ! 2 are not real, they are
The line given by X = tK + O intersects the quadric surface given by complex conjugates. Thus if RT AK =P O, then every line with direction K
XT AX = O in points for which the parameter t satisfies contains the real midpoint of a chord. lt is not difficult to show that the
locus of all these midpoints is aplane. For suppose Mis the midpoint of the
(tK + O)T A(tK + 0) = O. chord on the line &. If & is parameterized so that X = M when t = O, then
the equation for & is X = tK + M. & intersects the surface XT AX = O
Using the properties of the transpose and matrix multiplication, and the at the points for which t satisfies kT AKt 2 + 2RT AMt + MT AM = O.
fact that for a 1 x 1 matrix Since these points are the endpoints of the chord, the coefficient of t in this
equation must vanish. That is, if & intersects the surface when t = t 1 and
t = t 2 , then !(t 1 + t 2 ) =O implies that ! 1 = -t2 • Therefore the mídpoínt
M satísfies RTAM = O. Conversely, if X = sK + Ois another parameteríza-
this condition may be written in the forro tion with RTAO = O, then & intersects the surface when s = ±s1 for sorne
s 1 and O is the midpoint of the chord. Thus we have proved the following.

The coefficients of this equation determine exactly how the line intersects Theorem 7.7 If RTAR =PO for the quadric surface given by XT AX =O,
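The quadratic in t is easy to solve numerically for a given line and surface. A numpy sketch (ours), applied to the unit sphere:

    import numpy as np

    def chord_parameters(A_hat, K_hat, U_hat):
        """Solve (K^T Â K)t^2 + (2 K^T Â U)t + U^T Â U = 0 for t."""
        a = K_hat @ A_hat @ K_hat
        b = 2 * K_hat @ A_hat @ U_hat
        c = U_hat @ A_hat @ U_hat
        return np.roots([a, b, c]) if a != 0 else None

    # x^2 + y^2 + z^2 = 1; the line through the origin with direction (1, 0, 0):
    A_hat = np.diag([1.0, 1.0, 1.0, -1.0])
    K_hat = np.array([1.0, 0.0, 0.0, 0.0])
    U_hat = np.array([0.0, 0.0, 0.0, 1.0])
    print(chord_parameters(A_hat, K_hat, U_hat))   # t = ±1: the chord is a diameter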
If the line ℓ given by X̂ = tK̂ + Û intersects X̂ᵀÂX̂ = 0 when t = t_1 and t = t_2, then the point for which t = ½(t_1 + t_2) is the midpoint of the chord. This midpoint is always real, for even if t_1 and t_2 are not real, they are complex conjugates. Thus if K̂ᵀÂK̂ ≠ 0, then every line with direction K̂ contains the real midpoint of a chord. It is not difficult to show that the locus of all these midpoints is a plane. For suppose M is the midpoint of the chord on the line ℓ. If ℓ is parameterized so that X̂ = M̂ when t = 0, then the equation for ℓ is X̂ = tK̂ + M̂. ℓ intersects the surface X̂ᵀÂX̂ = 0 at the points for which t satisfies K̂ᵀÂK̂t² + 2K̂ᵀÂM̂t + M̂ᵀÂM̂ = 0. Since these points are the endpoints of the chord, the coefficient of t in this equation must vanish. That is, if ℓ intersects the surface when t = t_1 and t = t_2, then ½(t_1 + t_2) = 0 implies that t_1 = -t_2. Therefore the midpoint M̂ satisfies K̂ᵀÂM̂ = 0. Conversely, if X̂ = sK̂ + Ô is another parameterization with K̂ᵀÂÔ = 0, then ℓ intersects the surface when s = ±s_1 for some s_1, and Ô is the midpoint of the chord. Thus we have proved the following.

Theorem 7.7. If K̂ᵀÂK̂ ≠ 0 for the quadric surface given by X̂ᵀÂX̂ = 0, then the locus of midpoints of all the chords with direction K̂ is the plane with equation K̂ᵀÂX̂ = 0.

Definition. The plane that bisects a set of parallel chords of a quadric surface is a diametral plane of the surface.

Example 2. Find the equation of the diametral plane for the ellipsoid x²/9 + y² + z²/4 = 1 that bisects the chords with direction numbers (3, 5, 4).

    K̂ᵀÂX̂ = (3, 5, 4, 0) diag(1/9, 1, 1/4, -1)(x, y, z, 1)ᵀ
          = (1/3, 5, 1, 0)(x, y, z, 1)ᵀ
          = x/3 + 5y + z.

Therefore, the diametral plane is given by x/3 + 5y + z = 0. Notice that the direction K̂ is not normal to the diametral plane in this case.
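The diametral plane of Example 2 is one matrix-vector product. A numpy check (ours):

    import numpy as np

    A_hat = np.diag([1/9, 1.0, 0.25, -1.0])
    K_hat = np.array([3.0, 5.0, 4.0, 0.0])
    print(K_hat @ A_hat)   # [0.333... 5. 1. 0.], i.e. x/3 + 5y + z = 0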

Suppose the line X̂ = tK̂ + Û does not contain a chord of X̂ᵀÂX̂ = 0; then K̂ᵀÂK̂ = 0 and the nature of the intersection depends on the values of K̂ᵀÂÛ and ÛᵀÂÛ. If K̂ᵀÂÛ ≠ 0, then the line intersects the surface in only one point. (Why isn't this a tangent line?)

Example 3. Determine how lines with direction numbers (0, 2, 3) intersect the hyperboloid of two sheets given by x² + y²/4 - z²/9 = -1.
For this surface Â = diag(1, 1/4, -1/9, 1). Therefore, with K̂ = (0, 2, 3, 0)ᵀ, K̂ᵀÂK̂ = 4/4 - 9/9 = 0, and lines with direction K̂ do not contain chords. For an arbitrary point Û = (a, b, c, 1)ᵀ, K̂ᵀÂÛ = 2b/4 - 3c/9 = b/2 - c/3. This is nonzero if (a, b, c) is not in the plane with equation 3y - 2z = 0. Therefore, every line with direction numbers (0, 2, 3) that is not in the plane 3y - 2z = 0 intersects the hyperboloid in exactly one point. Each line with direction (0, 2, 3) that is in this plane fails to intersect the surface. For such a line has the equation X̂ = tK̂ + Û with Û = (a, 0, 0, 1)ᵀ. In this case, the equation

    (K̂ᵀÂK̂)t² + (2K̂ᵀÂÛ)t + ÛᵀÂÛ = 0

becomes 0t² + 0t + a² + 1 = 0, which has no solution. In particular, when a = 0 the line is in the asymptotic cone, x² + y²/4 - z²/9 = 0, for the surface (see problem 8, page 329).

This example gives a third relationship between a line and a quadric surface. If K̂ᵀÂK̂ = K̂ᵀÂÛ = 0 but ÛᵀÂÛ ≠ 0, then there are no values for t, real or complex, that satisfy the equation K̂ᵀÂK̂t² + 2K̂ᵀÂÛt + ÛᵀÂÛ = 0. So the line X̂ = tK̂ + Û fails to intersect the surface X̂ᵀÂX̂ = 0. The last possibility is that K̂ᵀÂK̂ = K̂ᵀÂÛ = ÛᵀÂÛ = 0, for which all t ∈ R satisfy the equation and the line lies in the surface. So:

Definition. A line that lies entirely in a surface is a ruling of the surface.

Example 4. Show that a line with direction numbers (0, 0, 1) is either a ruling of the cylinder x²/4 - y²/9 = 1 or it does not intersect the surface.
For this surface Â = diag(1/4, -1/9, 0, -1). For the direction K̂ = (0, 0, 1, 0)ᵀ and any point Û, K̂ᵀÂK̂ = K̂ᵀÂÛ = 0, and ÛᵀÂÛ = 0 if and only if Û is on the surface. That is, X̂ = tK̂ + Û is either a ruling of the cylinder or it does not intersect the surface. Since Û could be taken as an arbitrary point on the cylinder, this shows that there is a ruling through every point of the hyperbolic cylinder.

Definition. A quadric surface is said to be ruled if every point on the surface lies on a ruling.

The definition of a cylinder implies that all cylinders are ruled surfaces; are any other quadric surfaces ruled? Certainly an ellipsoid is not, since every line contains a chord, but a cone appears to be ruled. For the cone given by x²/a² + y²/b² - z²/c² = 0, Â = diag(1/a², 1/b², -1/c², 0). So if K̂ is the direction given by K̂ = (α, β, γ, 0)ᵀ, then K̂ᵀÂK̂ = α²/a² + β²/b² - γ²/c². This vanishes if and only if the point (α, β, γ) is on the cone. Therefore, if Û = (α, β, γ, 1)ᵀ is a point on the cone, not the origin, then K̂ᵀÂK̂ = K̂ᵀÂÛ = ÛᵀÂÛ = 0 and the line X̂ = tK̂ + Û is a ruling through Û. Since all these rulings pass through the origin, every point lies on a ruling and the elliptic cone is a ruled surface.
Although the figures do not suggest it, there are two other ruled quadric surfaces. Consider first the two hyperboloids with equations of the form x²/a² + y²/b² - z²/c² = ∓1. For these surfaces Â = diag(1/a², 1/b², -1/c², ±1). Suppose Û = (x, y, z, 1)ᵀ is an arbitrary point on either of these surfaces, i.e., ÛᵀÂÛ = 0, and consider an arbitrary line through Û, X̂ = tK̂ + Û with K̂ = (α, β, γ, 0)ᵀ. These surfaces are ruled if for each point Û there exists a direction K̂ such that K̂ᵀÂK̂ = K̂ᵀÂÛ = ÛᵀÂÛ = 0. Suppose these conditions are satisfied; then

    K̂ᵀÂK̂ = α²/a² + β²/b² - γ²/c² = 0,
    K̂ᵀÂÛ = αx/a² + βy/b² - γz/c² = 0,
    ÛᵀÂÛ = x²/a² + y²/b² - z²/c² ± 1 = 0,

so

    αx/a² + βy/b² = γz/c² = (γ/c)(z/c)
                  = ±√(α²/a² + β²/b²) √(x²/a² + y²/b² ± 1).

Squaring both sides and combining terms yields

    (αy - βx)² ± (α²b² + β²a²) = 0.

Since a and b are nonzero and α = β = 0 implies γ = 0, there is no solution

Figure 13
Figure 12

For these surfaces


for K using the plus sign, which means that the hyperboloid of 2 sheets is not lfa 2 o o
a ruled surface. Thus the question becomes, can a and f3 be found so that
(ay - f3x) 2 - (a 2 b2 + f3 2 a2 ) = O, if so, then the hyperboloid of 1 sheet is a
ruled surface. If JyJ #- b, view this condition as a quadratic equation for a
is terms of /3,
~
A=
(
o ± l/b 2
o
o
o
o
o
o
-1 -f)
So if K= (a, f3, y, O)T, then K.TA.K. = a2 ja 2 ± f3 2 fb 2 • For the plus sign this
vanishes only if a = f3 = O. But K.T A. O = axja 2 ± f3yfb 2 - y. Therefore
there cannot be a ruling through O if O lies on the elliptic paraboloid. For
Then the discriminant in the quadratic formula is 4f3\a 2y 2 + x 2 b2 - a2 b2 ). the hyperbolic paraboloid, a, f3, y need only satisfy a2 Ja 2 - f3 2 Jb 2 = O and
This is always nonnegative for x 2 ja 2 + y 2 Jb 2 = 1 is the leve! curve oftP,e sur- axja 2 - f3yfb 2 - y = O. Again there are two rulings through O with the
face in z = O and all other leve! curves are larger. That is, if (x, y, z) is on the directions K = (a, b, xja- yjb, O)T and K = (a, -b, xja + yjb, O)T. Thus
surface, then x 2 ja 2 + y 2 jb 2 ~ l. Therefore if O is not on the trace in z = O the hyperbolic paraboloid is also a doubly ruled surface, as shown in Figure
and JyJ #- b, then there are two different solutions for a in terms of f3. Further, 13. Since a portion of a hyperbolic paraboloid can be made of straight lines,
for each solution there is only one value for y satisfying K.T A.R. = K.TAO the surface can be manufactured without curves. This fact has been u sed in the
= O, so in this case there are two rulings through O. Ifyou check the remain- design of roofs for sorne modern buildings, Figure 14.
ing cases, you will find that there are two distinct rulings through every point We will conclude this section with a determination of when a quadric
on the surface. Thus not only is the hyperboloid of 1 sheet a ruled surface, surface has one or more centers.
but it is "doubly ruled." This is illustrated in Figure 12, but it is much better
shown in a three-dimensional model. Definition A point U is a center of a quadric surface S if every line
Now turn to the two types of paraboloids given by x2 fa 2 ± lfb 2 = 2z. through U either

Therefore x, y, z are coordinates of a center provided

.( 2 3
AX = 3 6
¡' -1 o
-1 o

or

Figure 14

One finds that this system of equations is consistent, so the surface has at
l. contains a chord of S with midpoint U, or
least one center. In fact, there is a line of centers, for the solution is given
2. fails to intersect S, or
by x = 2t + 2, y= - t - 1, and z = t for any tE R. This quadric surface
3. is a ruling of S.
is an elliptic cylinder.
That is, a center is a point about which the quadric surface is symmetric.
If O is a center for the surface given by j(T AX = Oand K is sorne direc-
Examp/e 6 Determine if the hyperboloid of 2 sheets given by
tion, then each of the above three conditions implies that KT ,40 = O. For
x2 - 3xz + 4y 2 + 2yz + IOx - 12y + 4z= O has any centers.
KT AO to vanish for all directions K, implies that the first three entries of For this equation
AO must b!! zero, or AO = (0, O, O, *)T. Conversely, suppose P is a point
for which AP = (0, O, O, *f. Then for any direction K = (ex, {3, y, O)T,
-3 5)
A=( -3~
K AP = O. Therefore, the line X = tK + P (I) contains a chord with
ATAA A A A

o
_! ~ -~
midpoint P if KT AK ,¡. O, or (2) does not intersect the surface if KT AK = O and
and OT ,40 #- O, or (3) is a ruling of the surface if KT AK = OT AO = O.
5
This proves:

is the system of linear equations given by AX = (0, O, O, *)T. This system


Theorem 7.8 The point O is a center of the quadric surface given by has the unique snlution x =y= 1, z = 2. Therefore this hyperboloid has
j(T AX =O if and only if AO = (0, O, O, *)T. exactly one center.

Example 5 Determine if the quadric surface given by


Definition A quadric surface with a single center is a central quadric.
2x 2
+ 6xy + 6y 2
- 2xz + 2z 2
- 2x + 4z + 5 = O
The system of equations AX = (0, O, O, *)T has a unique solulion if
has any centers. and only if the rank of A is 3. Therefore, ellipsoids, hyperboloids, and eones
For this polynomial, are the only central quadric surfaces. A quadric surface has at least one
center if the rank of A equals the rank of the first three rows of A. That is
~
-1
A

A=
( ;

-1 O
o
2
-~)2 . when the system AX = (0, O, O, *f is consistent. Further, a surface with a~
least one center has a line of centers if rank A = 2 and a plane of centers if
-1 o 2 -5 rank A = l.
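Theorem 7.8 turns the search for centers into a linear system. A numpy sketch (ours) for Example 5; lstsq returns one solution of the consistent system:

    import numpy as np

    def a_center(A, B):
        """A solution of A X = -B^T, the center condition for
           X^T A X + 2BX + C = 0 (assumes the system is consistent)."""
        sol, *_ = np.linalg.lstsq(A, -B, rcond=None)
        return sol

    A = np.array([[ 2.0, 3.0, -1.0],
                  [ 3.0, 6.0,  0.0],
                  [-1.0, 0.0,  2.0]])
    B = np.array([-1.0, 0.0, 2.0])           # from the linear terms -2x + 4z
    print(a_center(A, B))                    # a point on the line of centers
    print(np.linalg.matrix_rank(A))          # rank 2: a line of centers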

Problems

1. Determine how the line intersects the quadric surface. If the line contains a chord, find the corresponding diametral plane.
   a. (x, y, z) = t(1, -1, 2) + (-2, -2, 2);  2x² + 2xy + 6yz + z² + 2x + 8 = 0.
   b. (x, y, z) = t(1, -1, 1) + (0, 1, -2);  2xy + 2xz - 2yz - 2z² + 8x + 6z + 2 = 0.
   c. (x, y, z) = t(2, √2, 0);  x² - 2xz + 2y² + 8yz + 8z² + 4y + 1 = 0.
2. Determine how the family of lines with direction numbers (1, -2, 1) intersects the quadric surface given by 4xy + y² - 2yz + 6x + 4z + 5 = 0.
3. Show that every diametral plane for a sphere is perpendicular to the chords it bisects.
4. For each surface find the direction numbers of the rulings through the given point.
   a. x²/5 + y²/3 - z²/2 = 0;  (5, 3, 4).
   b. x² - y²/8 = 2z;  (4, 4, 7).
   c. 4xy + 2xz - 6yz + 2y - 4 = 0;  (1, 0, 2).
   d. x² - 8xy + 2yz - z² - 6x - 6y - 6 = 0;  (-1, 0, 1).
5. Find all the centers of the following quadric surfaces.
   a. 3x² + 4xy + y² + 10xz + 6yz + 8z² + 4y - 6z + 2 = 0.
   b. x² + 2xy + y² + 2xz + 2yz + z² + 2x + 2y + 2z = 0.
   c. x² + 6xy - 4yz + 3z² - 8x + 6y - 14z + 13 = 0.
   d. 4xy + 6y² - 8xz - 12yz + 8x + 12y = 0.
6. Suppose Û is a point on the surface given by X̂ᵀÂX̂ = 0.
   a. What conditions should a direction K̂ satisfy for the line X̂ = tK̂ + Û to be tangent to the surface?
   b. Why should Û not be a center if there is to be a tangent line through Û? A point of a quadric surface which is not a center is called a regular point of the surface.
   c. What is the equation of the tangent plane to X̂ᵀÂX̂ = 0 at a regular point Û?
7. If the given point is a regular point for the surface, find the equation of the tangent plane at that point; see problem 6.
   a. x² + y² + z² = 1;  (0, 0, 1).
   b. x² + 6xy + 2y² + 4yz + z² - 8x - 6y - 2z + 6 = 0;  (1, 1, -1).
   c. 4xy - 7y² + 3z² - 6x + 8y - 2 = 0;  (1, 2, -2).
   d. 3x² - 5xz + y² + 3yz - 8x + 2y - 6z = 0;  (2, 4, 5).
8. Find the two lines in the plane x - 4y + z = 0, passing through the point (2, 1, 2), that are tangent to the ellipsoid given by x²/4 + y² + z²/4 = 1.

Review Problems

1. Find a symmetric matrix A and a matrix B that express the given polynomial as the sum of a symmetric bilinear form XᵀAX and a linear form BX.
   a. 3x² - 5xy + 2y² + 5x - 7y.
   b. x² + 4xy + 6y.
   c. 4x² + 4xy - y² + 10yz + 4x - 9z.
   d. 3xy + 3y² - 4xz + 2yz - 5x + y - 2z.
2. Prove that if S, T ∈ Hom(𝒱, R), then the function b defined by b(U, V) = S(U)T(V) is bilinear. Is such a function ever symmetric and positive definite?
3. Determine which of the following functions define inner products.
   a. b((x_1, x_2), (y_1, y_2)) = 5x_1y_1 + 6x_1y_2 + 6x_2y_1 + 9x_2y_2.
   b. b((x_1, x_2), (y_1, y_2)) = 6x_1y_1 + 7x_1y_2 + 3x_2y_1 + 5x_2y_2.
   c. b((x_1, x_2), (y_1, y_2)) = 4x_1y_1 + 3x_1y_2 + 3x_2y_1 + 2x_2y_2.
4. [The statement of this problem is illegible in the source.]
5. Transform the following equations for conics into canonical form. Sketch the curve and the coordinate systems used.
   a. x² - 4y² + 4x + 24y - 48 = 0.
   b. x² - y² - 10x - 4y + 21 = 0.
   c. 9x² - 24xy + 16y² - 4x - 3y = 0.
   d. x² + 2xy + y² - 8 = 0.
   e. 9x² - 4xy + 6y² + 12√5 x + 4√5 y - 20 = 0.
6. a. Define center for a conic.
   b. Find a system of equations that gives the centers of conics.
   c. Which types of conics have centers?
7. Find the centers of the following conics; see problem 6.
   a. 2x² + 2xy + 5y² - 12x + 12y + 4 = 0.
   b. 3x² + 12xy + 12y² + 2x + 4y = 0.
   c. 4x² - 10xy + 7y² + 2x - 4y + 9 = 0.
8. Identify the quadric surface given by
   a. 2x² + 2xy + 3y² - 4xz + 2yz + 2x - 2y + 4z = 0.
   b. x² + 2xy + y² + 2xz + 2yz + z² + 2x + 2y + 2z - 3 = 0.
   c. x² + 4xy + 3y² - 2xz - 3z² + 2y - 4z + 2 = 0.
9. Given the equation for an arbitrary quadric surface, which of the 17 types would it most likely be?
10. Find all diametral planes for the following quadric surfaces, if their equation is in standard form.
    a. Real elliptic cylinder.

    b. Elliptic paraboloid.
    c. Hyperbolic paraboloid.
11. How do lines with direction numbers (3, 6, 0) intersect the quadric surface with equation 4xz - 2yz + 6x = 0?
12. Find a rigid motion of the plane, in the form X̂ = P̂X̂', which transforms 52x² - 72xy + 73y² - 200x + 100y = 0 into a polynomial equation without cross product or linear terms. What is the canonical form for this equation?

8. Canonical Forms Under Similarity

§1. Invariant Subspaces
§2. Polynomials
§3. Minimum Polynomials
§4. Elementary Divisors
§5. Canonical Forms
§6. Applications of the Jordan Form

In Chapter 5 we found several invariants for similarity, but they were insufficient to distinguish between similar and nonsimilar matrices. Therefore it was not possible to define a set of canonical forms for matrices under similarity. However, there are good reasons for taking up the problem again. From a mathematical point of view, our examination of linear transformations is left open by not having a complete set of invariants under similarity. On the other hand, the existence of a canonical form that is nearly diagonal permits the application of matrix algebra to several problems. Some of these applications will be considered in the final section of this chapter.
Throughout this chapter 𝒱 will be a finite-dimensional vector space over an arbitrary field F.

§1. Invariant Subspaces

A set of canonical forms was easily found for matrices under equivalence. The set consisted of all diagonal matrices with 1's and then 0's on the main diagonal. It soon became clear that diagonal matrices could not play the same role for similarity. However, we can use the diagonalizable case to show how we should proceed in general.
Since two matrices are similar if and only if they are matrices for some map T ∈ Hom(𝒱), it will be sufficient to look for properties that characterize maps in Hom(𝒱), rather than n × n matrices. Therefore suppose T ∈ Hom(𝒱) is a diagonalizable map. Let λ_1, ..., λ_k, k ≤ dim 𝒱, be the distinct characteristic values of T. For each characteristic value λ_i, the subspace 𝒮(λ_i) = {V ∈ 𝒱 | T(V) = λ_i V} is invariant under T. To say that 𝒮(λ_i) is a T-invariant subspace means that if U ∈ 𝒮(λ_i), then T(U) ∈ 𝒮(λ_i). Call 𝒮(λ_i) the characteristic space of λ_i (it is also called the eigenspace of λ_i). Then the characteristic spaces of the diagonalizable map T yield a direct sum decomposition of 𝒱 in the following sense.

Definition. Let 𝒮_1, ..., 𝒮_h be subspaces of 𝒱. The sum of these spaces is

    𝒮_1 + ··· + 𝒮_h = {U_1 + ··· + U_h | U_i ∈ 𝒮_i, 1 ≤ i ≤ h}.

The sum is direct, written 𝒮_1 ⊕ ··· ⊕ 𝒮_h, if U_1 + ··· + U_h = U'_1 + ··· + U'_h implies U_i = U'_i when U_i, U'_i ∈ 𝒮_i, 1 ≤ i ≤ h. 𝒮_1 ⊕ ··· ⊕ 𝒮_h is a direct sum decomposition of 𝒱 if 𝒱 = 𝒮_1 ⊕ ··· ⊕ 𝒮_h.

The following theorem gives another characterization of when a sum is direct. The proof is not difficult and is left to the reader.

Theorem 8.1. The sum 𝒮_1 + ··· + 𝒮_h is direct if and only if for U_i ∈ 𝒮_i, 1 ≤ i ≤ h, U_1 + ··· + U_h = 0 implies U_1 = ··· = U_h = 0.

Now the above statement regarding characteristic spaces becomes:

Theorem 8.2. If T ∈ Hom(𝒱) is diagonalizable and λ_1, ..., λ_k are the distinct characteristic values of T, then 𝒱 = 𝒮(λ_1) ⊕ ··· ⊕ 𝒮(λ_k).

Proof. Since 𝒱 has a basis B of characteristic vectors, 𝒱 ⊂ 𝒮(λ_1) + ··· + 𝒮(λ_k); therefore 𝒱 = 𝒮(λ_1) + ··· + 𝒮(λ_k).
To show the sum is direct, suppose U_1 + ··· + U_k = 0 with U_i ∈ 𝒮(λ_i). Each U_i is a linear combination of the characteristic vectors in B corresponding to λ_i. Characteristic vectors corresponding to different characteristic values are linearly independent; therefore replacing each U_i in U_1 + ··· + U_k with this linear combination of characteristic vectors yields a linear combination of the vectors in B. Since B is a basis, all the coefficients vanish and U_i = 0 for each i. So by Theorem 8.1 the sum is direct.

Example 1. Suppose T ∈ Hom(𝒱_3) is given by

    T(a, b, c) = (3a - 2b - c, -2a + 6b + 2c, 4a - 10b - 3c).

T has three distinct characteristic values, 1, 2, and 3, and so is diagonalizable. Therefore 𝒱_3 = 𝒮(1) ⊕ 𝒮(2) ⊕ 𝒮(3). In fact one can show that

    𝒮(1) = ℒ{(1, -2, 6)},  𝒮(2) = ℒ{(0, 1, -2)},  𝒮(3) = ℒ{(1, -2, 4)},

and {(1, -2, 6), (0, 1, -2), (1, -2, 4)} is a basis for 𝒱_3.

Return to a diagonalizable map T with distinct characteristic values λ_1, ..., λ_k. Then T determines the direct sum decomposition 𝒱 = 𝒮(λ_1) ⊕ ··· ⊕ 𝒮(λ_k), and this decomposition yields a diagonal matrix for T as follows: Choose bases arbitrarily for 𝒮(λ_1), ..., 𝒮(λ_k) and let B denote the basis for 𝒱 composed of these k bases. Then the matrix of T with respect to B is the diagonal matrix with the characteristic values λ_1, ..., λ_k on the main diagonal.

B = {(l, -2, 6), (0, l, -2), (!, -2, 4)} and the matrix of Twith respect to in sorne well-defined way and then choosing bases which make the blocks
B is diag(l, 2, 3). relatively simple. The problem is in the selection of the subspaces, for it is
How might this be generalized if T is not diagonalizable? Suppose easy to generate T-invariant subspaces and obtain bases in the process.
enough T-invariant subspaces could be found to express "/'. as a direct sum For any nonzero U E "Y, there is an integer k :s; dim "Y such th.at the
of T-invariant subspaces. Then a basis B could be constructed for "Y from set {U,T(U), ... , Tk- 1(U)} is Iinearly independent and the set {U,T(U), ... ,
bases for the T-invariant subspaces in the decomposition. Since T sends Tk- 1(U), Tk(U)} is linearly dependent. Set ff(U, T) = .f!l{U, T(U), ... ,
each of these subspaces into itself, the matrix for T with respect to B will Tk- 1 (U)}. Then it is a simple matter to show that Y( U, T) is a T-invariant
have its nonzero entries confined to blocks along the main diagonal. To subspace called the T-cyclic subspace generated by U. Moreover the sct
see this suppose 9' 1 , ••• , 9'1, are T-invariant subspaces, although not neces- {U, T(U), ... , Tk- 1 (U)} is a good choice as a basis for Y(U, T). Problem4
sarily characteristic spaces, and "Y = 9' 1 $ · · · $ ffh. If dim 9' 1 =a,, at the end of this section pro vides an example of how a direct sum of T-cyclic
choose a basis B; = {U; 1 , ••• , U;~J for each 9';, and let subspaces Ieads to a simple matrix for T.
There is a polynomial implicit in the definition of the T-cyclic subspace
Y(U, T) which arises when a matrix is found for T using the vectors in
{U, T(U), ... , Tk- 1 (U)}. This polynomial is of fundamental importance
Then the matrix A of T with respect to B has its nonzero entries confined to in determining how we should select the T-invariant subspaces for a direct
blocks, A 1 , . • . , A 1, which contain the main diagonal of A. The block A; sum decomposition of "Y. Since Tk(U) E Y( U, T), there exist scalars
being the matrix of the restriction of T to 9'; with respect to the basis B,. ak-t• ... , a 1, a 0 such that
(Since 9' 1 is T-invariant, T: 9' 1 ..... 9' 1.)
Thus

A =
A
1

Az .
o) .
or

(
O A,
Therefore U determines a polynomial
We will cal! A a diagonal block matrix and denote it by diag(A 1 , ••• , A,).
A diagonal block matrix is also called a "direct su m of matrices" and denoted
by A 1 $ · · · $ A,, but this no'tation will not be used here.
such that when the symbol x is replaced by the map T, a linear map a(T)
is obtained which sends U toO. This implies that a(T) sends al! of Y( U, T)
Examp!e 2 Suppose T: éf 3 ..... éf 3 is rotation with axis .f!1 {U} = Y to Oor that Y( U, T) is contained in .Aía<T>• the null space of a(T).
and angle () #- O. Let f7 = gJ., then f7 is not a characteristic space for T,
but Y and f7 are T-invariant subspaces and éf 3 = Y $ .'Y. Now suppose
{U, V, W} is an orthonormal basis for !%' 3 with Y = .f!1 {V, W}. Then the Exampte 3 Consider the map TE Hom("f/ 3 ) given by T = 21.
matrix of T with respect to {U, V, W} is For any nonzero vector U E "Y 3 , T( U) = 2U. Therefore the polynomial
associated with U is a(x) = x- 2. Then a(T) = T -21 and

(~~ JJ = (~·i·
o; 8
e?~ e· ~-csi0-~5- ~)-
SIU u
a(T)U = (T- 2I)U = T(U)- 2U = 2U- 2U =O.

Since T = 21, Y(U, T) == .f!l{U} and .Aía(T) = "f/ 3 . Thus the null spacc
Thus a decomposition of "f/ into a direct sum of T-invariant subspaces contains the T-cydic space, but they need not be egua!.
gives rise to a diagonal block matrix. The procedure for obtaining canonical
forms under similarity now becomes one of selecting T-invariant subspaces The null space of a(T) in Example 3 is all of "Y 3 and, therefore, a T-
§1. lnvariant Subspaces 347
346 8. CANONICAL FORMS UNDER SIMILARITY

invariant subspace. But the fact is that the null space of such a polynomial in However, we cannot proceed without them. Therefore the needed properties
T is always T-invariant. are derived in the next section culminating in the unique factorization theorem
for polynomials. 1t would be posslble to go quickly through this discussion
and then refer back to it as questions arise.
Theorem 8.3 Suppose Te Hom(f) and bm, ... , b 1, b0 E F. Let
b(T) = bmTm + · · · + b1 T + b0!. Then b(T) E Hom(f) and Jlíb(T) is a
T-invariant subspace off. Problems

Proof Since Hom(f) is a vector space and closed under composi- 1. Suppose
tion, a simple induction argument shows that b(T) E Hom(f).
To show that .!V b(T) is T-invariant, let W E .!Vb(TJ• then (i1 j ~)
1 2
is the matrix of TE Hom(1" 3 ) with respect to {E¡}, for the given vector U, find
b(T)T(W) = (bmr + · · · + b 1 T + b0 J)T(W) [J'(U, T) and the polynomial a(x) associated with U.
= b"'Tm(T(W)) + + b 1 T(T(W)) + b0 J(T(W)) a. U= (1, O, O). b. (1, -1, 0). c. (-2, O, 1). d. (0, 1, O).
2. Let TE Hom(1" 3 ) ha ve matrix
= bmT(Tm(W)) + + b 1 T(T(W)) + b0 T(I(W))
= T(b'"Tm(W) + · · · + b 1 T(W) + b 0 1(W)) (f -~ ~)
= T(b(T) W) = T(O) = O. with respect to {E1}, find [J'(U, T) and the polynomial a(x) associated with U
when:
Therefore T(W) E Jlíb<TJ and .;Vb(TJ is T-invariant. a. U= (-1, O, 1). b. U= (2, O, -1). c. U= (1, 1, 1).
d. What is the characteristic polynomial of T?
Example 4 Let T be the map in Example 1 and b(x) = x 2 - 3x. 3. Show that [J'(U, T) is the smallest T-invariant subspace of 1" that contains U.
That is, if .'T is a T-invariant subspace and U E .'T, then [!'(U, T) e !T.
Find Jlíb(Tl"
b(T) = T 2 - 3T. A little computation shows that 4. Suppose
-1
b(T)(a, b, e)= (-2b- e, -4a + 2b + 2e, 8a- 8b- 6e), 1
1
o
and that Jlíb(TJ = .P{(I, -2, 4)}. In Example 1 we found that (1, -2, 4) is 1
a characteristic vector for T, which confirms that .;Vb(T) is T-invariant. is the matrix of TE Hom(J"s) with respect to {E¡}.
a. Show that 1" 5 = [J'(U¡, T) Ef) Y(Uz, T) EB [J'(U3, T) when
U1 = (0,0, :_1, 1, O), U 2 = (1, O, -1, O, 0), U3 = (-1, 2, 3, -1, 0).
Our procedure for obtaining canonical forros under similarity is to obtain
a direct sum decomposition of f into T-invariant subspaces and choose b. Find the diagonal block matrix of T with respect to the basis
bases for these subspaces in a well-defined way. The preceding discussion B ={U¡, T(U¡), Uz, U3, T(U3)}.
shows that every nonzero vector U generates a T-invariant subspace 9'( U, T) 5. Prove that for a diagonal block matrix,
and each polynomial a(x) yields a T-invariant subspace Jlía(T)· Although det diag (A 1 , A 2 , ••• , Ak) = (det A1 )(det Az) · · · (det Ak).
these two methods of producing T-invariant subspaces are related, Example
,, 6. Suppose TE Hom(1" 3 ) has
3 shows that they are not identical. Selecting a sequence of T-invariant sub-
-i)
.;

spaces will use both methods; it will be based on a polynomial associated


with T and basic properties of polynomials. These properties el ose! y parallel (~1 2
-~
2
properties of the integers and would be obtained in a general study of rings. as its matrix with respect to {E¡}. Find the null space Yv(T) when
348 8. CANONICAL" FORMS UNDER SIMILARITY §2. Polynomials 349

a. p(x) = x + 3. b. p(x) x 2 - 5x + 9.
=
field where a polynomial may vanish for every element and yet not be the
c. p(x) = x 2 - 4. d. p(x) = (x + 3)(x2 - 5x + 9). zero polynomial, see problem 1 at the end of this section.
7. Suppose TE Hom("Y 3 ) has Two operations are defined on the polynomials of F[x]. Addition in

( ~1 o~
1
i) F[x] is defined as in the vector space F[t], and multiplication is defined as
follows: If p(x) = a11 x" + · · · + a 1 x + a0 and q(x) = bmX" + · · ·b 1x + b0 ,
then
as its matrix with respect to {E¡}. Find ff P<T> when
a. p(x) = x- 2. b. p(x) = (x - 2) 2 • c. p(x) = x.
d. p(x) = x(x - 2). e. p(x) = x(x - 2) 2 • p(x)q(x) = cn+mx"+m + · · · + c 1x + c0
8. Suppose TE Hom("Y) and p(x) and q(x) are polynomials such that "Y =
with
ff P<T> + ff q<TJ· What can be concluded about the map p(T) o q(T)? .

9. Suppose TE Hom("Y 3 ) has ck = L aibi, 1 ::;; k ::;; n + m.


( -~o }1 ?)2
i+ j=k

For example, if p(x) = x 3 - x2 + 3x - 2 and q(x) = 2x 2 + 4x + 3, then


as its matrix with respect to {E¡}. Find ffv<r> when
a. p(x) = x- 2. b. p(x). = (x- 2) 2 • c. p(x) = x2 - 3x + 3. p(x)q(x) = 1·2x 5 + (1·4- 1·2)x4 + (1·3- 1·4 + 3·2)x3
d. p(x) = x - l. e. Find p(x) such that p(T) = O.
+ (- 1· 3 + 3 · 4 - 2 · 3)x 2 + (3 · 3 - 2 · 4)x - 2 · 3
3 2
= 2x 5 + 2x4 + 5x + 3x + x - 6.

§2. Polynomials
The algebraic system composed of F[x] together with the operations of addition
and multiplication of polynomia1s will also be denoted by F[x]. Thus F[x]
We have encountered polynomials in two situations, first as vectors in is nota vector space but is an algebraic system with the same type of opera-
the vector space R[t], for which t is an indeterminant symbol never allowed tions as the field F. In fact, F[x] satisfies all but one of the defining properties
to be a member of the field R, and then in potynomial equations used to find of a field-in general polynomials do not have multiplicative inverses in F[x].
characteristic values, for which the symbol J. was regarded an unknown
variable in the field. A general study of polynomials must combine both
Theorem 8.4 F[x] is a commutative ring with identity in which the can-
roles for the symbol used. Therefore suppose that the symbol x is an indeter-
cellation law for multiplication holds.
minant, which may be assigned values from the field F, and denote by F[x]
the set of all polynomials in x with coefficients from F. That is, p(x) E F[x] if
p(x) = a.x" + · · · + a 1x + a0 with a11 , • • • , a 1, a0 E F. Proof The additive properties for a ring are the same as the addi-
tive properties for the vector space F[t] and therefore carry over to F[x]. This
Ieaves six properties to be established. Let p(x), q(x), and r(x) be arbitrary
Definition Two polynomials from F[x], p(x) = a11 x" + · · · + a 1x + a0
elements from F[x], then the remaining properties for a ring are:
and q(x) = b111 x"' + · · · + b 1x + b0 are equal if n =manda¡= b¡ for all
i, 1 ::;; i ::;; n. The zero polynomial O(x) of F[x] satisfies the condition that if
l. p(x)q(x) E F[x]
O(x) = a.x" + · · · + a 1x + a 0 , then a" = · · · = a 1 = a0 =O.
2. p(x)(q(x)r(x)) = (p(x)q(x))r(x)
3. p(x)(q(x) + r(x)) = p(x)q(x) + p(x)r(x) and
The two different roles for x are nicely illustrated by the two statemen'ts (p(x) + q(x))r(x) = p(x)r(x) + q(x)r(x).
p(x) = O(x) and p(x) = O. The first means that p(x) is the zero polynomial, F[x] is commutative if
while the second will be regarded as an equation for x as an element in F, 4. p(x)q(x) = q(x)p(x).
that is, p(x) is the zero element ofF. The distinciton is very clear over a finite F[x] has an identity if
350 8. CANONICAL FORMS UNDER SIMILARITY §2. Polynomials 351
1
5. there exists l(x) E F[x] such that p(x)I(x) = I(x)p(x) = p(x). The nonzero elements ofF are polynomials of degree O in F[x]. F is a
The cancellation Jaw for multiplication holds in F[x] if field so that any nonzero element ofF is a factor of any polynomial in F[x].
1
6. p(x)q(x) = p(x)r(x) and p(x) #- O(x), then q(x) = r(x) arid if However, a polynomial such as 2x + 3 above can be factored into the pro-
q(x)p(x) = r(x)p(x) and p(x) #- O(x), then q(x) = r(x). duct of an element from F anda monic polynomial in only one way asid e from
i 1 order; 2x + 3 = 2(x + 3/2). Thus, given a monic polynomial, we will only
Parts 1 and 4 are obvious from the definition of multiplication and the look for monic factors.
corresponding properties ofF. Thus only half of parts 3 and 6 need be estab- In deriving our main result on factorization we will need two properties
lished, and this is left to the reader. For 5, take l(x) to be the multiplicative of the integers which are equivalent to induction, see problem 13 at the end
identity ofF. Parts 2 and 3 follow with a little work from the corresponding of this section.
properties in F. This Jeaves 6, which may be proved as follows:
Suppose p(x)q(x) = p(x)r(x), then p(x)(q(x) - r(x)) = O(x) ímd it is
Theorem 8.5 The set of positive integers is well ordered. That is, every
sufficient to show that if p(x)s(x) = O(x) and p(x) .¡.. O(x), then s(x) = O(x).
nonempty subset of the positive integers contains a smallest element.
Therefore, suppose p(x)s(x) = O(x), p(x) = a11 x" + · · · + a 1x + a0 with
a" #-O, so p(x) #- O(x), and s(x) = h111 X 111 + · · · + b 1x + b0 • Assume that
s(x) #- O(x) so that bm #-O. Thenp(x)s(x) = a11 b111 x"+m + terms in lower powers Proof The proof is by induction. Suppose A is a nonempty subset
of x. p(x)s(x) = O(x) implies that a11 b111 = O, but a11 #- O and F is a field, so of positive integers. Assume A has no smallest element and Jet S = {nln is
bm = O, contradicting the assumption tpat s(x) #- O(x). Therefore s(x) = O(x), a positive integer and n < a for all a E A}. Then 1 E S for if not then 1 E A
and the cancellation law holds in F[x]. and A has a smallest element. If k E S and k + 1 ~ S, then k + 1 E A and
again A has a smallest e1ement. Therefore k E S implies k + 1 E S and S is
the set of aiJ positive integers by induction. This contradicts the fact that A
Definition An integral domain is a commutative ring with identity in
is nonempty, therefore A has a smallest element as claimed.
which the cancellation law for multiplication holds.
Many who find induction difficult to believe, regard this principie of well
Hence a polynomial ring F[x] is an integral domain. The integers together ordering as "obvious." For them it is a paradox that the two principies are
with the arithmetic operations of addition and multiplication also com- equivalen t.
prise an integral domain. Actually the term integral domain is derived from There are two principies of induction for the positive integers. The
the fact that such an algebraic system is an abstraction of the integers. Our induction u sed in the abo ve proof m ay be called the "first principie of induc-
main objective in this section is to prove that there is a unique way to factor tion."
any polynomial just as there is a unique factorization of any integer into
primes. Since every element of Fis a polynomial in F[x], and Fis a field, there
Theorem 8.6 (Second Principie of lnduction) Suppose B is a set
are many factorizations that are not of interest. For example, 2x + 3
of positive integers such that
= 4(xf2 + 3/4) = t(lOx + 15).
l. 1 E B and
Definition Let p(x) = a11 x" + · · · + a 1x + a0 E F[x]. If p(x) .¡.. O(x) 2. If m is a positive integer such that k E B for all k < m, then m E B.
and an #- O, then an is the leading coefficient of p(x) and p(x) has degree n
denoted deg p.= n. [f the leading coefficient of p(x) is 1, then p(x) is monic. Then B is the set of all positive integers.
For a(x), b(x) E F[x], b(x) is a factor of a(x) or b(x) divides a(x), denoted
b(x)la(x), if there exists q(x) E F[x] such that a(x) = b(x)q(x).
Proof Let A = {nln is a positive integer and n ~ B}. If A is non-
empty, then by well ordering A has a smallest element, call it m. m ~ B so
There is no number that can reasonably be taken for the degree of the m #- 1 and if k is a positive integer Jess than m, then k E B. Therefore, by
zero polynomial. Howe\ler it is useful if the degree of O(x) is taken to be 2, m E B and the assumption that A is nonempty leads to a contradiction.
minus infinity. Thus B is the set of all positive integers.
352 8. CANONICAL FORMS UNDER SIMILARITY §2. Polynomials 353

As with the first principie of induction, both the well ordering property Corollary 8.8 (Remainder Theorem) Let e E F and p(x) E F[x]. Then
and the second principie of índuction can be applied to any set of integers the remainder of the divison p(x)j(x - e) is p(e).
obtained by taking all integers greater than sorne number. ·
Proof Since x - e E F[x], there exist polynomials q(x) and r(x)
Theorem 8.7 (Division Algorithm for Polynomials) Let a(x), b(x) such that p(x) = (x - e)q(x) + r(x) with deg r(x) < deg (x - e), but
E F[x] with b(x) -:1 O(x). Then there exist polynomials q(x), called the deg (x - e) = 1 so r(x) is an element ofF. Since r(x) is constant, its value
quotiel1t, and r(x), called the remail1der, such that can be obtained by setting x equal to e; p(e) = (e - e)q(e) + r(e). Thus
the remainder r(x) = p(e).
a(x) = b(x)q(x) + r(x) with deg r < deg b.
Corollary 8.9 (Factor Theorem) Let p(x) E F[x] and e E F, x - e
Moreover the polynomials q(x) and r(x) are uniquely determined by a(x)
is a factor of p(x) if and only if p(e) = O.
and b(x).

Proof If deg a < deg b, then q(x) = O(x) and r(x) = a(x) satisfy Definition Given p(x) E F[xJ and e E F. e is a root of p(x) if x - e is
the existence part of the theorem. Therefore, suppose deg a ~ deg b and a factor of p(x).
apply the second principie of induction to the set of non-negative integers
B = {11lif deg a = 11, then q(x) and r(x) exist}. A third corollary is easiiy proved using the first principie of induction.
If deg a = O, there is nothing to pro ve for a(x) and b(x) are in F and we
can take q(x) = a(x)jb(x), r(x) = O(x). So O E B. Corollary 8.10 If p(x) E F[xJ and deg p(x) =n > O, then p(x) has at
Now assume k E B if k < m, that is, that q(x) and r(x) exist for c(x) if most 11 roots in F.
dege =k. Leta(x) = a,xm + ··· + a 1x + a 0 andb(x) = b.:x!' + · · · + b 1x
+ b0 with m > 11. Consider the polynomial a'(x) = a(x) - (a,fb.)xm-nb(x).
Deg a' < in, so by the induction assumption there exist polynomials q'(x) Of course a polynomial need have no roots in a particular field. For
and r(x) such that a'(x) = b(x)q'(x) + r(x) with deg r < deg b. Therefore, exampie, x 2 + i has no roots in R. However, every polynomial in R[xJ has
if q(x) = q'(x) + (a,fb.)xm-n, then a(x) = b(x)q(x) + r(x) with deg r < at least one root in C, the complex numbers. This follows from the Funda-
deg b. Therefore k < m and k E B impiies m E B, and by the second principie mental Theorem of Algebra, since F[x] e C[x]. This theorem is quite dif-
of induction, B is the set of all non-negative integers. That is, q(x) and r(x) ficult to prove, and will only be stated here.
exist for all poiynomials a(x) such that deg a ~ deg b.
The proof of uniqueness is left to problem 5 at the end of this section. Theorem 8.11 (Fundamental Theorem of Algebra) If p(x) E C[x],
In practice the quotient and remainder are easily found by long division. then p(x) has at least one root in C.
-.,.,

Example 1 Find q(x) and r(x) if a(x) = 4x 3 + x2 - x +2 and Definition A field F is algebraically closed if every polynomial of
b(x) = x 2
+x - 2. positive degree in F[xJ has a root in F.
4x- 3
x2 +x - 2 l4x 3 + x1 - x + 2 The essential part of this definitíon is the requirement that the root be
4x 3 + 4x2 - 8x in F. Thus the complex numbers are algebraically closed while the real
- 3x2 + 7x + 2 numbers are no t. We will find that the simplest cano ni cal form under similarity
- 3x 2 - 3x + 6 is obtained when the matrix is considered to be over an algebraically closed
!Ox- 4 fiel d.
If the polynomial p(x) has e as a root, then x - e is a factor and there
Therefore q(x) = 4x - 3 and r(x) = !Ox - 4. exists a polynomial q(x) such that p(x) = (x - c)q(x). Since deg q =
354 8. CANONICAL FORMS UNDER SIMILARITY §2. Polynomials 355

deg p - 1, an induction argument can be used to extend Theorem 8.11 as Proof If S = {O(x)}, there is nothing to prove, therefore suppo~e
follows. S# {O(x)}. Then the well-ordering principie implies that S contains a poly-
nomial of least finite degree. lf d(x) denotes such a polynomial, then it ís
Corollary 8.12 If p(x) E C[x] and deg p = 11 > O, then p(x) has only necessary to show that d(x) divides every element of S.
exactly 11 roots. (The n roots of p(x) need not all be distinct.) If p(x) E S, then there exist polynomials q(x) and r(x) such that p(x)
= d(x)q(x) + r(x) with deg r < deg d. But r(x) is in the ideal since r(x)
= p(x) - q(x)d(x) and p(x), d(x) E S. Now since d(x) has the minimal finite
Thus an nth degree polynomial p(x) E C[x] can be factored into the degree of any polynomial in S and deg r < deg d, r(x) = O(x) and d(x)lp(x)
product ofn linear factors of the form x - e;, 1 s; i s; n, with p(e;) = O. A as desired.
Po'fynomial p(x) in R[x] is also in C[x] and it can be shown that any complex
roots of p(x) occur in conjugate pairs. Therefore p(x) can be expressed as a The ideal in Example 2 is principal with d(x) = x - 4 as a single
product of linear and quadratic factors in R[x]. generator. For an ideal generated by two polynomials, a single generator
Before obtaining a factorization for every polynomial in F[x], we must guaranteed by Theorem 8.13 is also a greatest common divisor.
investigate greatest common divisors of polynomials. This is done with the
aid of subsets called ideals.
Theorem 8.14 Any two nonzero polynomials a(x), b(x) E F[x] have
a greatest eommon divisor (g.c.d.) d(x) satisfying
Definition An ideal in F[x] is a nonempty subset S of F[x] such that
J. if p(x), q(x) E S, then p(x) - q(x) E S, and l. d(x)la(x) and d(x)/b(x)
2. if r(x) E F[x] and p(x) E S, then r(x)p(x) E S. 2. if p(x)la(x) and p(x)lb(x), then p(x)ld(x).

Examp/e 2 Show that S= {p(x) E R[x]/4 is a root of p(x)} is an Moreover there exist polynomials s(x), t(x) E F[x] such that
ideal in R[~].
If p(x) and q(x) E S, then (x - 4)/p(x) and (x - 4)/q(x) implies that d(x) = s(x)a(x) + t(x)b(x).
(x - 4)/(p(x) - q(x)), so p(x) - q(x) E S, and (x - 4)/r(x)p(x) for any
r(x) E R[x], so r(x)p(x) E S. Therefore both conditions are satisfied and S
Proof The set
is an ideal.
S = {u(x)a(x) + v(x)b(x)iu(x), v(x) E F[x]}
Any set of polynomials can be used to generate an ideal in F[x]. For
example, given a(x) and b(x) in F[x], define S by is an ideal in F[x] containing a(x) and b(x). Therefore, by Theorem 8.13,
there exists a nonzero polynomial d(x) E S such that d(x) divides every element
S = {u(x)a(x) + v(x)b(x)iu(x), v(x) E F[x]}. of S. In particular, d(x)la(x) and d(x)lb(x).
Since d(x) E S, there exist polynomials s(x), t(x) E F[x] such that d(x)
It is not difficult to check that S is an ideal in F[x] containing both a(x) and :_·. = s(x)a(x) + t(x)b(x), and it only remains to show that 2 is satisfied. But
b(x). Notice that if e is a root of both a(x) and b(x), then e is a root of every if p(x)la(x) and p(x)/b(x), then p(x) divides s(x)a(x) + t(x)b(x) = d(x).
J?Oiynomial in S. In fact this ideal is determined by the roots or factors com-
mon to a(x) and b(x). This is shown in the next two theorems.
Two polynomials will have many g.c.d.'s but they have a unique monic
g.c.d. In practice a g.c.d. is most easily found if the polynomials are given
Theorem 8.13 F[x] is a principal ideal doma in. That is, if S is an ideal in in a factored form. Thus the monic g.c.d. of a(x) = 4(x - 2) 3 (x 2 + 1) and
F[x], then there exists a polynomial d(x) such that S = {q(x)d(x)/q(x) E F[x]}. b(x) = 2(x - 2) 2 (x 2 + 1) 5 is (x - 2) 2 (x 2 + 1).
e
356 8. CANONICAL FORMS UNO~ SIMILARITY §2. Polynomials 357

Definition Two nonzero polynomials in F[x] are relatively prime if Theorem 8.17 (Unique Factorization Theorem) Every polynomial
their only common factors in F[x] are elements ofF. of positive degree in F[x] can be expressed as an element of F times a
product of irreducible monic polynomials. This expression is unique except
2
Thus x - 4 and x 2 - 9 are relatively prime even though each has for the order of the factors.
polynomial factors of positive degree. 1 f

Proof Every polynomial in F[x] can be expressed in a unique way


Theorem 8.15 If a(x) and b(x) are relatively prime in F[x], then there as ea(x) where e E F and a(x) is monic. Therefore it is sufficient to consider
exist s(x), t(x) E F[x] such that 1 = s(x)a(x) + t(x)b(x). a(x) a monic polynomial of positive degree. The existence proof uses thc
second principie of induction on the degree of a(x). If deg a = 1, then a(x)
is irreducible, therefore an expression exists when a(x) has degree l. So as-
.Proof Since a(x) and b(x) are relatively prime, 1 is a g.c.d .
sume every monic polynomial of degree less than m has an expression in terms
of irreducible mónic polynomials and suppose deg a ,;, m, m > l. If a(x)
We are finally ready to define the analogue in F[x] to a prime number
is irreducible over F, then it is expressed in the proper form. If a(x) is not
and to prove the analogue to the Fundamental Theorem of Arithmatic.
irreducible over F, then a(x) = p(x)q(x) where p(x) and q(x) are monic
Recall that this theorem states that every integer greater than 1 may be ex-
polynomials of positive degree. Since deg a = deg p + deg q (see problem
pressed as a product of primes in a unique way, up to order.
2 at the end of this section), p(;x) and q(x) have degrees less than m. So by thc
induction assumption p(x) and q(x) may be expressed as products of ir-
Definition A polynomial p(x) E F[x] of positive degree is irreducible reducible monic polynomials. This gives such an expression for a(x), which
o1•er F if p(x) is relatively prime to all nonzero polynomials in F[x] of degree has degree m. Therefore, by the second principie of induction, every monic
less than the degree of p(x). polynomial of positive degree may be expressed as a product of irreducible
monic polynomials.
That is, if deg p > O, then p(x) is irreducible over F if its only factors For uniqueness, suppose a(x) = p 1(x)P2(x) · · ·p¡,(x) and a(x) =
from F[x] are constant multiples of p(x) and constants from F. q1(x)qix) · · · qk(x) with p 1 (x), ... ,ph(x), q 1 (x), ... , qk(x) irreducible
monic polynomials.p 1(x)Ja(x) so p 1(x)/q 1(x) · · · qk(x). Since p 1(x) is irreducible_
Examp!es an extension of Theorem 8.16 implies that p 1 (x) divides one of the q's. Say
p 1(x)/q 1(x), then since both are irreducible and monic, p 1 (x) = q1 (x), and by
l. AII polynomials of degree 1 in F[x] are irreducible over F. (Why?)
the cancellation Jaw, pix)· · ·p¡,(x) = qix)· · ·qk(x). Continuing this proces~
2. The only irreducible polynomials in C[x] are polynomials of degree
l. (Why?) with an induction argument yields h = k and p 1(x), ... , Pk(x) equal to
q1(x), . .. , qk(x) in sorne order.
3. ax + bx + e E R[x] is irreducible over R if the discriminan!
2
2
b - 4ae <O.
4. x 2 + 1 is irreducible over R but not over C. Corollary 8.18 If a(x) E F[x] has positive degree, then there ~t
5. x - 2 is irreducible over th~ field of rationals but not over R or C.
2
distinct irreducible (over F) monic polynomials p¡(x), . .. , Pk(X), positive
integers e 1 , ... , ek andan element e E Fsuch that a(x) = epí'(x) p~2 (x) · · ·pZk(x).
This expression is unique up to the order of the polynomials p 1 , •• . , Pk·
Theorem 8.16 Suppose a(x), b(x), p(x) E F[x] with p(x) irreducible over
F. If p(x)/a(x)b(x), then either p(x)/a(x) or p(x)/b(x).
Definition If a(x) = epi'(x)· · ·pZk(x) with e E F, p 1(x), ... ,Pk(x)
distinct irreducible monic polynomials, and e 1 , ••• , ek positive integers, then
Proof If p(x)/a(x) we are done. Therefore, suppose p(x),(a(x).
e; is the multiplicity of p;(x), 1 =::; i =::; k.
Then p(x) and a(x) are relatively prime so there exist s(x), t(x) E F[x] such
that 1 = s(x)p(x) + t(x)a(x). Multiply this equation by b(x) to obtain
b(x) = (b(x)s(x))p(x) + t(x)(a(x)b(x)). Now p(x) divides the right side of this Finding roots for an arbitrary polynomial from R[x] or C[x] is a difficult
equation so p(x)/b(x), completing the proof. task. The quadratic fomula provides complex roots for any second degree
358 8. CANONICAL FORMS UNDER SIMILARITY §2. Polynomials 359

polynomial, and there are rather complicated formulas for third and fourth 9. Use Euclid's algorithm to find a g.c.d. of
degree polynomials. But there are no formulas for general polynomials of a. a(x) = 2x 5 - 2x 4 + x3 - 4x 2 - x - 2, b(x) = x 4 - x 3 - x - l.
degree larger than four. Therefore, unless roots can be found by good b. a(x) = 2x 4
+ 5x - 6x -
3 2
+ 6x 22, b(x) = x 4 + 2x 3 - 3x 2 - x + 8_
guesses, approximation techniques are necessary. One should not expect an c. a(x) = 6x4 + 7x3 + 2x2 + 8x + 4, b(x) = 6x 3 - x 2 + 4x + 3.
arbitrary polynomial to have rational roots, Jet alone integral roots. In fact, 10. Suppose p(x) = anx" + · · · + a 1x + a0 E R[x] has integral coefficients. Show
for a polynomial with rational coefficients there are few rational numbers that if the ratio na] number cfd is a root of p(x), then c/ao and d/an. How might
which could be roots, see problem 10. this be extended to polynomials with rational coefficients?
11. Using problem 10, list all possible rational roots of
a. 2x 5 + 3x4 - 6x + 6. c. x 4 - 3x3 + x - 8.
Problems
b. x" + l. d. 9x 3 + x - l.

l. Suppose p(x) E F[x] satisfies p(c) = O for all e E F. Show that if F is a finite 12. Is the field of integers modulo 2, that is the field containing only O and 1,
field, then p(x) need not be O(x). What if F were infinite? algebraica]] y closed?

2. Show that if a(x), b(x) E F[x], then deg ab = deg a + deg b. (Take r - oo = 13. Prove that the first principie of induction, the second principie of induction,
- oo for any rE R.) and the well-ordering principie are equivalen! properties of the positive
integers.
3. Let F be a fiel d. a. Show that the cancellation law for multiplication holds in
F. b. Suppose a, bE F and ab = O, show that either a = O or b = O. 14. Let "Y be a vector space over F, U E "Y, TE Hom("Y) and show that {p(T)U/
p(x) E F[x]} is the T-cyclic subspace Y( U, T).
4. Find the quotient and remainder when a(x) is divided by b(x).
a. a(x) = 3x4 + 5x 3 - 3x 2 + 6x + 1, b(x) = x 2 + 2x- l. 15. Let TE Hom("Y) and U E "Y.
b. a(x) = x 2 + 2x + 1, b(x) = x 3
- 3x. a. Show that S= {q(x) E F[x]/q(T)U =O} is an ideal in F[x].
c. a(x) = x 5 + x3 - 2x 2 + x - 2, b(x) = x 2 + l. b. Find the polynomial that generales S as a principal ideal if S#- {O(x)}.
d. a(x) = x 4 - 2x 3
- 3x 2 + 10x - 8, b(x) = x - 2.
16. You should be able to pro ve any of the following for Z, the integral domain of
5. Show that the quotient and remainder given by the division algorithm for integers.
polynomials are unique. a. The division algorithm for integers.
b. Z is a principal ideal domain.
6. Prove that any two nonzero polynomials a(x), b(x) E F[x] ha ve a least commom
c. If a, bE Z are relatively prime, then there exist s, tEZ such that 1 =
multiple (l.c.m.) m(x) satisfying
l. a(x)/m(x) and b(x)/m(x)
sa + tb.
d. The Fundamental Theorem of Arithmetic as stated on page 356.
2. if a(x)/p(x) and b(x)/p(x), then m(x)/p(x).
7. Find the monic g.c.d. and the monic l.c.m. of a(x) and b(x). 17. Extend the definition of F[x] to F[x, y] and F[x, y, z], the rings of polynomials
a. a(x) = x(x - 1)4 (x + 2) 2 , b(x) = (x- 1)3 (x + 2)4 (x + 5). in two and three variables, respectively.
b. a(x) = (x 2 - 16)(5x3 - 5), b(x) = 5(x- 4) 3(x- 1). a. Show that F[x, y] is not a principal ideal domain. For example, consider
the ideal generated by xy and x + y.
8. Prove Euclid's algorithm: If a(x), b(x) E F[x] with deg a¿deg b>O, and the b. Suppose S is the ideal in R[x, y, z] generated by a(x, y, z) = x 2 + y 2 - 4
polynomials q1(x), r1(x) E F[x] satisfy and b(x, y, z) = x 2 + y 2 - z, and Jet G be the graph of the intersection
a(x) = b(x)q 1(x) + r 1(x) O<deg r 1<deg b of a(x, y, z) =O and b(x, y, z) =O. Show that S can be taken asan alge-
b(x) = r1(x)q2(x) + r 2 (x) O<deg rz<deg r1 braic description of G in that every element of S is zero on G.
r¡(X) = r 2 (x)q3(x) + r3(x) O<deg r3<deg rz c. Show that the polynomials z - 4, x 2 + y 2 + (z - 4)2 - 4, and 4x2 +
4y2 - z 2 are in the ideal ofpart b. Note that the zeros ofthese polynomials
rk_z(x) = 'k-t(x)qk(x) + rk(x) give the plane, a sphere, anda cone through G.
'k-t(x) = rk(x)qk+t(x) + O(x), d. Give a geometric argument to show that the ideal S in part b is not
then rb) is a g.c.d. of a(x) and b(x). principal.
360 8. CANONICAL FORMS UNDER SIMILARITY §3. Mínimum Polynomials 361

Proof By the above remarks there exists a polynomial p(x) such


that deg p .::;; n 2 and p(T) = O. Therefore the set
§3. Minimum Polynomials
B = {klthere exists p(x) E F[x], deg p =k > O and p(T) = O}
Now return to the polynomial associated with the T-cyclic subspace
!/(U, T). Recall that if !/(U, T) = 2'{U, T(U), ... , Tk- 1(U)} and Tk(U) is nonempty, and since the positive integers are well ordered, B has a least
+ ak_ 1 Tk- 1(U) + ·· · + a 1 T(U)+ a0 U =O, then this equation may be element. Let m(x) be a monic polynomial of least positive degree such that
written as a(T)U = O where a(x) = xk + ak_ 1xk-l + · · · + a 1x + a0 • m(T) = O. This proof is completed in the same manner as the corresponding:
Notice that a(x) is monic and since !/(T, U) has dimension k, there does not properties were obtained for the T-minimum polynomial of U.
exist a nonzero polynomial b(x) of lower degree for which b(T)U = O. Is it clear that the symbol O in this proof denotes quite a different vector
Therefore we will call a(x) the T-minimum po/ynomial of U. · from O in the preceding proof?

Theorem 8.19 If a(x) is the T-minimum polynomial of U and Definition The minimunz poly11on1ial m(x) of a nonzero map TE
b(T)U = O, then a(x)Jb(x). The monic polynomial a(x) is uniquely determined Hom(i/") is the unique monic polynomial of lowest positive degree such
by u. that m(T) = OHom(t·¡·

Jf the mínimum polynomial of T is the key to obtaining a canonical


Proof By the division algorithm there exist polynomials q(x) and from under similarity, then a method of computation is required. Fortunately
r(x) such that b(x) = a(x)q(x) + r(x) with deg r < deg a. But r(T)U = O it is not necessary to search through all 11 2 + 1 maps /, T, ... , T" for a
2

for dependency. We will find that the degree of the mínimum polynomial of T
cannot exceed the dimension of "f/, Corollary 8.28, page 378. However,
r(T)U = (b(T) - q(T) o a(T))U = b(T)U - q(T)(a(T)U). this does leave 11 + 1 maps or matrices to consider.

If r(x) # O(x), then r(x) is a nonzero polynomial of degree less than deg a
and r(T)U = O which contradicts the n!marks preceding this theorem. Example 1 Find the mínimum polynomial of TE Hom("f/ 3 ) if the
Therefore a(x) divides b(x). This also gives the uniqueness of a(x), for if matrix of T with respect to {E¡} is
deg b = deg a, then they differ by a constant multiple. Since a(x) is monic,

A~H -~)
it is uniquely determined by U. 3
1
-1
The T-minimum polynomial of U determines a map that sends the
en tire subspace !/(U, T) to O. There is a corresponding polynomial for the M ultiplication gives
vector space that plays the central role in obtaining a canonical form under
similarity. We see that there is a polynomial in T that maps all of "f/ to O as
2
follows: If di m "f/ = n, then di m Hom("f/) = n 2 and the set {!, T, T 2 , ••• , T" } A2 =
(-2
-~ o
4
-2)
-6 and A3 = r-6
-~
o
o -14)
-10 .
is linearly dependent. That is, there is a linear combination of these maps (a -2 3 -2 9
polynomial in T) that is the zero map on "f/.
It is clear that the sets {/3 , A} and {/3 , A, A 2 } are linearly independent, and a
little computation shows that {/3 , A, A 2 , A 3 } is Jinearly dependent with
Theorem 8.20 Given a nonzero map TE Hom("f/), there exists a unique 3
A = 3A 2 - 4A + 4/3 . Therefore {!, T, T 2 } is linearly independent and
monic polynomial m(x) such that m(T) = OHom('f") and if p(T) = OHum("l·¡ T 3 - 3T 2 + 4T- 4/ = O, so the mínimum polynomial of T is x 3 - 3x 2
for sorne polynomial p(x) then m(x)Jp(x). + 4x- 4.
362 8. CANONICAL FORMS UNDER SIMILARITY §3. Minimum Polynomials 363

Another approach to finding the mínimum polynomial of a map is given Example 2 Use these tlieorems to find the m1mmum polynomial
in the next two theorems. of the map in Example l.
Consider the basis vector E 1 ; E 1 = (1, O, 0), T(E1 ) = (1, -1, 0), and
Theorem 8.21 Suppose TE Hom('f"') and U E 'f"'. If m(x) is the mín- T 2 (E¡) = ( -2, -2, 1) are linearly independent. T 3 (E 1) = (-6, -2, 3) and
imum polynomial of T and a(x) is the T-minimum polynomial of U, then T\E 1 ) = 3T 2 (E1 ) - 4T(E1 ) + 4E1 • Therefore the T-minimum polynomial
a(x)Jm(x). of E 1 is x 3 - 3x 2 - 4x + 4. This polynomial divides the mínimum poly-
nomial of T, m(x) and deg m ::::; 3, therefore m(x) = x 3 - 3x2 + 4x - 4.

Proof By the division algorithm there exist polynomials q(x) and


r(x) such that m(x) = a(x)q(x) + r(x) with deg r < deg a. Since m(T)U Examp!e 3 Find the mínimum polynomial of Te Hom('f"'~if the
= a(T)U = O and r(T) = m(T) - q(T) o a(T), we have r(T)U == O. But matrix of T with respect to {E;} is
a(x) is the T-minimum polynomial of U and deg r < deg a. Therefore r(x)

( -~ _; -~)-
= O(x) and a(x)Jm(x).

3 6 -1
Theorem 8.22 Let {U 1 , ••• , U11 } be a basis for 'f"' and Te Hom('f"'). If
a 1 (x), ... , a.(x) are the T-minimum polynomials of U 1 , ••• , U., respec-
E 1 = (1, O, 0), T(E1 ) = (3, -2, 3), T 2 (E¡) = (2, 4, -6), and T 2 (E1 )
tively, then the mínimum polynomial of T is the monic least common mul-
= -2T(E1 ) + 8E1 • Since E 1 and T(E 1 ) are linearly independent, the T-
tiple of a 1(x), . .. , a11 (x).
minimum polynomial of E 1 is x 2 + 2x - 8. Further computation shows that
E 2 and E 3 also have x 2 + 2x - 8 as their T-minimum polynomial, so this is
Proof Suppose m(x) is the mínimum polynomial of T and s(x) is the mínimum polynomial of T.
the monic l.c.m .. of a 1(x), ... , a.(x). Each polynomial a 1(x) divides s(x) so
s(T)U; = O for 1 ::::; i ::::; n. Therefore s(T) = O and m(x)Js(x).
The map Tin Example 3 was considered in Example 4, page 229. In that
To show that m(x) = s(x), suppose p 1(x), . .. , Pk(x) are the distinct
example we found that the characteristic polynomial of Tis - (x - 2) 2 (x + 4).
irreducible monic factors of the polynomials a 1(x), ... , a.(x). Then
Since the mínimum polynomial of T is (x - 2)(x + 4), the mínimum poly-
nomial of a map need not equa1 its characteristic polynomial. However, the
for 1 ::::; j::::; n mínimum polynomial of a map always divides its characteristic polynomial.
This is a consequence ofthe Cayley-Hamilton theorem, Theorem 8.31, page
with eii = O allowed. Since s(x) is the J.c. m. of a 1 (x), ... , a.(x), 380.
Recall that our approach to obtaining canonical forms under similarity
s(x) = p~'(x)p~(x)· · ·p~•(x) relies on finding a direct sum decomposition of 'f"' into T-invariant sub-
spaces. This decomposition is accomplished in two steps with the first given
where e1 = max{e 11 , e12 , • •• , e1k}. Now m(x)Js(x) implies that there exist by a factorization of the mínimum polynomial of T.
integers/1 , .•• ,fk such that
Theorem 8.23 (Primary Decomposition Theorem) · Suppose m(x) is
m(x) = p{'(x)p{ 2 (x)· · ·p{•(x). the mínimum polynomial of TE Hom('f"') and m(x) = p~'(x) · · ·pNx) where
p 1 (x), ... , Pk(x) are the distinct irreducible monic factors of m(x) with
m(x) divides s(x) and both are monic, so if m(x) # s(x), then.fh < eh for sorne multiplicities el> ... , ek, respectively. Then
index h and .fh < ehi for sorne indexj. But then ai(x) ,( m(x), con!fadicting
Theorem 8.21. Therefore m(x) = s(x) as claimed.
(C
364 8. CANONICAL FORMS UNDER SIMILARITY §3. Minimum Polynomials 365

Proof There are two. parts to the proof: (1) Show that the sum as follows. Suppose a basis B; is chosen for each null space .JVP,•'(TJ and
of the null spaces equals "f/" and (2) show that the sum is direct. these bases are combined in order to give a basis B for "//". Then the matrix
Define q;(x) for 1 ::;; i ::;; k by m(x) = pf'(x)q;(x). Then for any U e"//", O of T with respect to B is diag(A 1 , . . • , Ak), where A; is the matrix of the
= m(T)U = p'f'(T)(q;(T)U), therefore q¡(T)U E .!VP "i(TJ· This fact is used to restriction
obtain (1) as follows: The monic g.c.d. of q 1(x), .' .. , qk(x) is 1, so by an
extension of Theorem 8.15 there exist polynomials s1(x), ... , sk(x) such that T: .!Vp¡"I(T) -t .!Vp¡"i(T)

with respect to the basis B;. In general this cannot be regarded as a canonical
form because no invariant procedure is given for choosing the bases B;. The
Therefore second decomposition theorem will provide such a choice for each B;. How-
ever there is one case for which choosing the bases presents no problems,
namely, when the irreducible factors are alllinear and of multiplicity ,l.
Suppose the mínimum polynomial of T factors to p 1(x)· · ·Pk(x) with
and applying this map to the vector U yields p;(x) = x - A;, 1 ::;; i ::;; k, and A1 , . • • , Ak distinct elements from the field
F. Then the null space .!V p·,(T) is the characteristic space 9'(A;). Therefore if
U= l(U) = s1(T)(q 1(T)U) + · · · + sk(T)(qk(T)U). B; is any basis for .!V p¡(TJ• then the block A; is diag(A;, ... , A;) and the
diagonal block matrix diag(A 1, ••. , A k) is a diagonal matrix. This establishes
half of the following theorem.

Theorem 8.24 TE Hom("f/") is diagonalizable if and only if the mín-


imum polynomial of T factors into linear factors, each of multiplicity l.
which establishes (1).
For (2), suppose W¡ + ... + wk = owith W¡ E .!V P¡"i(T) for 1 ::;; i ::;; k.
q;(x) and pf'(x) are relatively prime, so there exist s(x) and t(x) such that Proof ( =>) If T is diagonalizable, then "f/" has a basis B
1 = s(x)q;(x) + t(x)pf'(x). Applying the map = {U¡, ... , U"} of characteristic vectors for T. Let A1 , ••• , Ak be the
distinct characteristic values for T and consider the polynomial
J = s(T) o q;(T) + t(T) o p'f'(T)

to W; E .!V P,'•(TJ yields


Since each vector Ui E B is a characteristic vector, q(T)Ui = O. (Multiplica-
= s(T)oq;(T)W; + = s(T)oq;(T)W; +O.
---
W; t(T)op'f'(T)W;

But pj(x)lq;(x) if j # i, so q¡(T)Wi =O if i ,¡. }, and we have


tion of polynomials is commutative, thus the factor involving the characteris-
tic value for Ui can be written last in q(T).) Therefore q(T) = O and the
mínimum polynomial of T divides q(x). But this implies the mínimum
polynomial has only linear factors of multiplicity l.
O= q¡(T)O = q¡(T)(W, + ··· + Wk) = q;(T)W;. (<=) This part of the proof was obtained above.

Therefore, W; = s(T)(q;(T)W;) = s(T)O =O, proving that the sum is direct


Examp/e 4 Suppose the matrix of TE Hom("f/" 3 ) with respect to
and completing the proof of the theorem.
{E;} is
It should be clear that none of the direct summands .!V P ••(TJ in this

-~)·
-2
decomposition can be the zero subspace. ' 3
This decomposition theorem provides a diagonal block matrix for T 7 -1
366 8. CANONICAL FORMS UNDER SIMILARITY §4. Elementary Divisors 367

Determine if T is diagonalizable by examining the mínimum polynomial of T. 8. For the map T of Example 4, page 365, the T-minimum polynomial of E1
Choosing E 1 arbitrarily, we find that the T-minimum polynomial of is (x - 2) 2 (x - 1). Use this fact to obtain a vector from E 1 which has as its
E 1 is x 3 - 5x 2 + 8x- 4 = (x- 2) 2 (x- 1). Since this polynomial divide~ the T-minimum polynomial
mínimum polynomial of T, the mínimum polynomial of T contains a linear a. (x- 2)2. b. x- l. c. x- 2.
factor with multiplicity greater than l. Therefore T is not diagonalizable. 9. Suppose TE Hom(fn(C)) and every nonzero vector in 'Yn(C) has the same
The same conclusion was obtained in Example 5, page 230. T-minimum polynomial. Show that this implies that T = zl for sorne z E C.
10. Suppose TE Hom{f) and the T-minimum polynomial of U is irreducible
o ver F. Pro ve that fl'( U, T) cannot be decomposed into a direct su m of nonzero
Problems T-invariant subspaces.
11. Let TE Hom(f) and suppose p 1(x) is the T-minimum polynomial of U1,
l. Suppose the matrix of TE Hom(f 3 ) with respect to {E;} is
U 1 E r, for i = 1, 2. Prove thatp 1{x) andp 2 (x) are relatively prime if and only

( -~6
1 6
-~ -i). if p 1 (x)p 2 (x) is the T-minimum polynomial of U 1 + U 2 •
12. A map TE Hom{f) is.aprojection (idempotent) if T 2 =T.
Find the T-minimum polynomial of the following vectors. a. Find the three possible mínimum polynomials for a projection.
a.(l,O,O). b.(l,0,-2). c.(-3,1,3). d.(l,-1,-1). b. Show that every projection is diagonalizable.

2. Suppose TE Hom(f 2 ) is given by T(a, b) = (3a- 2b, ?a- 4b). 13. Show that TE Hom(f) is a projection if and only if there exist subspaces
a. Find the T-minimum polynomial of (1, O) and (0, 1). [!' and :T such that f = fl' (j) :T and T(V) = U when V= U+ W with

b. Show that every nonzero vector in f 2 has the same T-minimum poly- UEfl', WE:T.
nomial. 14. Show that f = [!' 1 (j) ... (j) fl' k if and only if there exist k projection maps
c. If T is viewed as a map in Hom("J' 2 ( C)), does every nonzero vector have T¡, ... , Tk such that
the same T-minimum polynomial? i. T,(f) = fl';.
3. Find the mínimum polynomial for TE Hom(f n) given the matrix for T with ii. T1 o Tj = O when i ,t. j.
respect to the standard basis for f". iii. 1 ='= T 1 + · · · + Tk.
o
-~)·
-16
a. (-~ ~). b. (-~ -4)
15. Prove the primary decomposition theorem using the result of problem 14.
2. c. o
1 (bd. -8
o -5 (b Z).
o o
(_~ i). (~ -~ -~
-1 10
e. -1 f.
o o g. (-5 -8 -2)
2.
1 o -6"
1 -4
o 1 o -10 4 §4. Elementary Divisors
4. Use the mínimum po1ynomial to determine which of the maps in problem 3
are diagonalizable. · Given TE Hom('f"), the primary decomposition theorem expresses 'f" as
5. For each map Tin problem 3, find the direct sum decomposition off" given a direct su m of the null spaces .;Vp,••<T>• where p;(x) is an irreducible factor
by the primary decomposition theorem. of the mínimum polynomial of T with multiplicity e¡. The next problem is
6. Let T be the map in problem 3e, then the decomposition of f 3 obtained in to decompose each of these null spaces into a direct sum of T-invariant sub-
problem 5 is of the form fi'(U¡, T) (j) fi'(U2 , T) for sorne U 1 , U2 E r 3 • Find spaces in sorne well-defined way. There are severa! approaches to this pro-
the matrix of Twith respect to the basis {U 1 , T(U 1 ), U2}. blem, but none of them can be regarded as simple. The problem is somewhat
7. Suppose dim f = n, TE Hom{f), and the mínimum polynomial of T is simplified by assuming the field F is algebraically closed. In that case each
irreducible over F and of degree n. p;(x) is linear, and the most useful canonical form is obtained at once. How-
a. Show that if u E f, u# o, then r = fi'(U, T). ever this approach does not determine canonical forms for matrices over an
b. Find the matrix of Twith respect to the basis {U, T(U), ... , rn- 1 (U)}. arbitrary field, such as R, and so will not be followed here. Instead we will
c. What is this matrix if f = 1' 2 and the mínimum polynomial of T is choose a sequence of T-cyclic subspaces for each null space in such way that
+
x 2 - 5x 7? each subs¡;:llce is of maximal dimension.
368 8. CANONICAL FORMS UNDER §IMILARITY §4. Elementary Divisors 369

A decomposition must be found for each of the null spaces in the pri- == x!' + ah_ 1.x!'- 1 + · · · + a 1x + a0 is the h x h matrix
mary decomposition theorem. Therefore it will be sufficient to begin by
assuming that the mínimum polynomial of TE Hom('t") is p"(x) with p(x) o o o o -ao
irreducible over F. Then A'p•(T) = 1'". It is easy to show that 'f" contains a 1 o o o -a¡
T-cyclic subspace of maX:imal dimension. Since p"(x) is the· mínimum poly-
C(q) =
o o o -a2
nomial of T, there exists a vector U1 E 'f" such that p•- 1(T)U 1 =!= O. There-
fore, the mínimum polynomial of U 1 is p•(x). (Why?) Suppose the degree of o o o -ah-2
p(x) is b, then deg p• = eb and dim Y(U 1 , T) = eb. Since no element of 't" o o o 1 -ah-t
can have a T-minimum polynomial of degree greater than eb, Y(U1 , T) is ·
a T-cyclic subspace of maximal dimension in 1'". If 'f" = Y(U1, T), then we That is, the companion matrix of q(x) has 1's on the "subdiagonal"
have obtained the desired decomposition. below the main diagonal and negatives of coefficients from q(x) in the last
column.
Example 1 Suppose Ifthe mínimum polynomial of TE Hom('t") isp•(x) and degp• = dim 1'",
then the companion matrix of p•(x) is the matrix of T with respect to a basis
for 1'". In this case C(p•) can be taken as the canonical form for the matrices
of T. However, it is not always so simple. For example, if TE Hom('f" 3 ) has

-2 -4 2)
is the matrix of TE Hom('f" 3 ) with respect to the standard basis. The mínimum
polynomial of T is found to be x 3 - 6x 2 + l2x - 8 = (x - 2) 3 which is
( 2
-4
4
-4
-1
4

of the form p•(x) with p(x) = x - 2 and e = 3. The T-minimum polynomial


of E 1 is also (x- 2)3, so we may take U 1 = (1, O, 0). Then dim Y(U 1, T) as its matrix with respect to {E¡}, then the mínimum polynomial of T is
= 3 ·1 = 3 and Y( U1 , T) = 'f" 3 • Although Tis not diagonalizable, the matrix (x - 2) 2 • Therefore T is not diagonalizable, and if U 1 is chosen as above, the
of Twith respect to the basis {U 1, T(U 1), T 2 (U 1)} has the simple forro dimension of Y(U 1 , T) is only 2. So a T-cyclic subspace of maxímal dimen-
sion in 'f" 3 is a proper subspace of 'f" 3 •
Returning to the general.case, suppose Y( U 1, T) =!= 1'". Then we must
· s) .
o1 oo -12 find a vector U2 that generates a T-cyclic subspace of maxímal dímension
(o 1 6 satisfying Y(U 1, T) n Y(U2 , T) = {0}. The first step in searching for such a
vector is to observe that if V rf; Y(U 1, T), then when k is sufficiently large,
p\T)VEf!>(U1, T). In particular, p•(T)VEY(U 1, T). Thus for each Vrf;
Notice that the elements in the Jast column come from the mínimum poly- Y(U 1 , T), there exists an integer k, 1 :::; k :::; e, such that l(T)V E Y( U 1, ~
nomial of T, for and l- 1
(T) V rf; Y( U1 , T). Let r2 be the largest of all such integers k. Then
l :::; r2 :::; e and there exists a vector W E 'f" such that p''(T)W E Y(U 1 , T)
and p''- 1(T)W rf; Y(U 1, T). lf p''(T)W =O, then we may set U2 = Wand we
will be ab1e to show that the sum Y(U 1 , T) + Y(U2 , T) is dírect. However,
if p''(T) W =!= O, then we must subtract a vector from W to obtain a choice
Therefore this matrix for T does not depend on the choice of U 1, provided
for U2 • To find this vector, notice thatp''(T)VEY(U 1 , T) means that there
the T-minimum polynomial of U1 is (x- 2) 3 .
exists a polynomial f(x) E F[x] such that

Definition The companion matrix of the momc polynomial q(x) p''(T)W = f(T)U 1 • (l)
370 8. CANONICAL FORMS UNDER SIMILARITY §4. Elementary Divisors 371

(See problem 14, page 359.) Applying the map pe-"(T) to Eq. (1) gives F. Then there exist h vedors U 1 , ••• , Uh such that

"Y = 9'(U1 , T) E9 9'(U2 , T) E9 · · · E9 9'(Uh, T)

Since pe(x) is the T-minimum polynomial of U1 , pe(x)lpe-r'(x)f(x) and and 9'(U;, T) is a T-cyclic subspace of maximal dimension contained in
therefore p"(x)lf(x). Supposef(x) = pr'(x)q(x), then Eq. (1) becomes
9'(U;, T) E9 9'(U;+ 1 , T) E9 · · · E9 9'(Uh, T), 1:$i<h.
p''(T) W = f(T)U 1 = p''(T) o q(T)U 1 •
Further if p''(x), ... , p'•(x) are the T-minimum polynomials of U 1 , • •• , Uh,
Therefore if we set U2 = W- q(T)U 1, then respectively, then e = r 1 ;;::: • • · ;;::: rh ;;::: l.

pr'(T)U2 = p''(T)W- pr'(T) o q(T)U1 =O


It is too soon to claim that we have a method for choosing a canonical
as desired, and it is only necessary to check that p''- \T)U2 rf: 9'(U 1 , T). But form under similarity since the decomposition given by Theorem 8.25 is not
unique. But we will show that the T-minimum polynomials p''(x), ... , pr•(x)
pr'- 1 (T)U 2 = p''- 1(T)(W- q(T)U 1) are uniquely determined by T and thereby obtain a canonical form. Both
these points are illustrated in Example 2 below. The candidate for a canonical
= pr'- 1(T)W- pr'- 1 (T) o q(T)U¡. form is the diagonal block matrix given by this decomposition. Call {U¡,
T(U;), ... , T''b-t(U;)} a cyclic basis for the T-cyclic space 9'(U;, T), 1 :S i
1
So if p''- (T)U2 E 9'( U 1 , T), then so is pr'- 1 (T) W, contradicting the choice ::; h, where deg p = b. Then the restriction of T to 9'(U;, T) has the com-
of W. Therefore there exists a vector U2 with minirnum polynomial p''(x), panion matrix C(p'') as its matrix with respect to this cydic basis. Therefore,
2
1 :$ r :S e, such that pr,- 1(T)U2 rf: 9'(U 1 , T) and r2 is the largest integer if we build a cyclic basis B for "Y by combining these h cyclic bases in order,
such that pr'- 1 (T)V rf: 9'(U 1 , T) for sorne V E "Y. then the matrix of T with respect to Bis diag(C(p''), C(p''), . .. , C(p'•)).
It rernains to be shown that 9'(U1, T) 11 9'(U2 , T) = {0}. Therefore
suppose Vi= O is in the intersection. Then V= g(T)U2 for sorne g(x) E F[x].
Suppose the g.c.d. of g(x) and pr'(x) is p•(x). Then there exist polynomials Examp!e 2 If
s(x) and t(x) such that
6 -96 .-26)
p"(x) = s(x)g(x) + t(x)pr'(x). ( -1
-3 9 -3
Replacing x by T and applying the map p•(T) to U2 yields
is the matrix of TE Hom("Y 3 ) with respect to {E;}, then (x - 3) 2 is the
p•(T)U2 = s(T) o g(T)U2 + t(T) o pr'(T)U]. = s(T)V + O. minimum polynomial of T. Find the matrix of T with respect to a cyclic basis
for "Y 3 given by the secondary decomposition theorem.
Since Vis in the intersection, p•(T)U2 E 9'(U 1, T) and a;;::: r2 • But a< r2 ·">.
The degree of (x - 3) 2 is 2, so U1 must be a vector that generates a T-
for g(T)U2 = V i= O irnplies pr'(x), the T-minirnurn polynornial of U2 , does cyclic subspace of dimension 2. Clearly any of the standard basis vectors
not divide g(x). This contradiction shows that the intersection only contains O, could be used. (Why?) Ifwe take U 1 = (1, O, 0), then 9'(U1 , T) = ..'l'{(1, O, 0),
andsothesum9'(U1 , T) + 9'(U2 , T)isdirect.If"f" = 9'(U 1, T) E9 9'(U2 , T), (6, -1, 3)}. Since thedimension of "Y 3 is 3, U2 must generate a !-dimensional
we have obtained the desired decornposition; otherwise this procedure may T-cyclic subspace. Therefore U2 can be any characteristic ve.ct?r for t~e
be continued to prove the following. characteristic value 3 which is not in 9'(U1, T). The charactenstJc space 1s

9'(3) = {(a, b, c)la - 3b + 2c = O}


Theorem 8.25 (Secondary Decomposition Theorern) Suppose the
minimum polynornial of TE Hom("f") is p'(x) where p(x) is irreducible over and 9'(3) 11 9'(U 1, T) = ..'l'{(3, -1, -3)}. Therefore wemay set U2 = (3, 1, O)
§4. Elementary Divil;ors 373
372 8. CANONICAL FORMS UNDER SIMILARITY

the secondary decomposition theorem, and find the corresponding diagonal


and
block matrix.
U 1 must be a vector such that p 2(T)U 1 :f. O. Choosing E 1 arbitrarily
"Y 3 = 9'((1, O, 0), T) Ef) 9'((3, 1, 0), T)
(the definition of T does not suggest a choice for U 1) we find that
is a decomposition of "Y 3 given by the secondary decomposition theorem.
This decomposition determines the cyclic basis B = {U 1, T(U 1), U2 }.
p 2 (T)E 1 = (T 2 - 4T + 4/)E 1 = (-24, 16, 32, -32, O) :f. O.
Now the T-minimum polynomial of U 1 is (x - 3) 2 , so the restriction of
Therefore we set U1 = E 1 and obtain
T ot 9'(U 1, T) has C((x- 3) 2 ) as its matrix with respect to {U 1 , T(U 1)}.
And the T-minimum polynomial of u2 is X - 3, so C(x - 3) is the matrix
9'(U 1, T) = .P{(I, O, O, O, 0), (14, -12, 8, 8, -8), (28, -32, 64, O, -32)}.
of Ton 9'(U2 , T). Therefore the matrix cif Ton "Y 3 with respect to the cyclic
= .P{(1,0,0,0,0),(0, 1,2, -2,0),(0,0, -4,2, 1)}.
basis Bis
In looking for a vector W, dimensional restrictions leave only two possibili-
diag(C{(x - 3) 2 ), C(x - 3)) =
o -9 o) ties; either p(T)V E 9'(U 1, T) for all V rf: 9'(U1 , T) or there exists a vector V
(o o6 O3 .
_I_
such that p(T) V rf: 9'( U1 , T). Again we arbitrarily choose a vector not in
9'(U1 , T), say E 2 • Then
Note that almost any vector in "Y 3 could ha ve been chosen for U1 in p(T)E2 = (T- 2/)E2 = (8, -6, -2, 8, -4),
this example. That is, U1 could have been any vector not in the 2-dimensional
characteristic space 9'(3). And then U2 could have been any vector in 9'(3)
and it is easily shown that p(T)E2 r/: 9'(U 1, T). Therefore set W = E 2 and
not on a particular line. So the vectors U1 and U2 are far from uniquely
r = 2. Since p 2 (T) W = (- 6, 4, 8, -8, O) is not O, we cannot Jet '02 equal
determined. However, no matter how they are chosen, their T-minimum 2 2
W. [Note that p 2 (T)W = tT 2 (U 1) - T(U1) + U 1 , showing that p (T) E
polynomials will always be (x - 3) 2 and x - 3. 2
9'(U1 , T) as expected. Is it clearthat r2 = 2 is maximal? What wouldp (T)W rf:
Y(U 1, T) and r 2 ;:::: 3 imply about the dimension of -yr 5 ?] Still following the
The secondary decomposition theorem provides the basis for the prooffor the secondary decomposition theorem, it is now necessary to subtract
claim made in the last section that the degree of the mínimum polynomial an element of !/( U1 , T) from W. Equation (!)in this case is p 2 (T)W = f(T)U 1
for TE Hom("Y) cannot exceed the dimension of "Y. For if p•(x) is the with f(x) = tx 2 - x + l. Therefore the polynomial q(x) satisfying f(x)
mínimum polynomial of T, then = pr 2 (x)q(x) is the constant polynomial q(x) = 4 and we should take

dim "Y ;:::: dim 9'(U 1, T) = deg pr' = deg p•. U2 = W- q(T)U 1 = E 2 - 4E1 = (-4, 1, O, O, O).

In Example 2, p"(x) = (x - 3) 2 has degree 2 and "Y = "Y 3 has dimension 3. Then since dim 9'(U 1, T) = 3, dim 9'(U 2 , T) = 2, and dim ji" 5 = 5, we have
-yr 5 = 9'((1, O, O, O, 0), T) Ef) Y'(( -4, 1, O, O, 0), T).
Example 3 If the matrix of TE Hom("Y 5 ) is
2
The cyclic basis given by this decomposition is B = {U 1 , T(U 1), T (U 1),
14 8 -1 -6 2) U 2 , T(U 2 )} and the matrix of T with respect to Bis
-12 -4 2 8 -1
8 -2 o -9 o
( 8 8
-8 -4
o
o
o
4
2
o
with respect to {E¡}, then the mini mu m polynomial of T is p 3 (x) = (x - 2) 3 .
diag(C(p 3 ), C(p
2
)) = (! L-j L:).
o o o o -4
Find a direct sum decomposition for "Y 5 following the procedure given by o o o 1 4
374 8. CANONICAL FORMS UNDER SIMILARITY §4. Elementary Divisors 375

The two previous examples suggest quite strongly that the T-minimum The reverse inclusion is immediate so that
polynomials associated with !/(U 1 , T), ... , Y(Uh, T) are independent of
U1, • •• , U,, and that they are determined entirely by the map T. lf so, then
the matrix of Twith respect toa cyciic basis for "f/' can be taken as the canoni-
cal form for the equivalence ciass· under similarity containing all possible
Similar!y
matrices for T.

Theorem 8.26 The numbers h, r~> · · · rh in the secondary deposition


theorem depend only on T.
These direct su m decompositions satisfy the secondary decomposit~theorem
That is, if"f/' = !/(U¡, T) EB · · · EB !/(U~., T) is another decomposition and .;Vp(T) is a T-invariant space of djmension less than n, so the induction
for "Y satisfying the secondary decomposition theorem and the T-minimum
assumption applies a.nd h' = h.
polynomial of u; is pr';(x), then h' = h and r¡ = r¡, . .. 'r~ = rh.
Now consider the image space Jp(T) = p(T)["f/']. By a similar agrument
it can be shown that
Proof The proof uses the second principie of induction on the
dimension of "f/'. JP<T> = !/(p(T)U 1 , T) EB · · · EB !/(p(T)Uk, T)
If di m "f/' = 1, the theorem is clearly true.
Therefore suppose the theorem holds for all T-invariant spaces of
where k is the index such that rk > 1 and rk+ 1 = 1 or k = h with rh > l.
dimension less than n and Jet dim "f/' = n, n > l. Consider the null space of
Then the T-minimum polynómial of p(T)U¡ is pr'- 1(x), 1 ~ i ~k. Also,
p(T), .;Vp(T)· Jf .;V~(T) = "f/', then e = 1 and the dimension of each T-cyclic
space. is deg p. In this case hp = dim "f/' = h'p and the theorem is true. If
.!Vv<T> i= "Y, Jet Ve .;Vp(T)· Then Ve !/(U1 , T) EB · · · EB !/(Uh, T) implies Jp(T) = !f(p(T)U¡, T) EB · · · EB !/(p(T)U~., T)
there exist polynomials/1 (x), ... ,j,,(x) such that V= / 1 (T)U 1 + · · · + f,(T)Uh
and with k' defined similarly. Now Jp(T) is also a T-invariant space, and since
i= {0}, J p(T) is a proper subspace of "f/'. Therefore the induction as-
.;V p(T)
O= p(T)V = p(T) of1 (T)U 1 + .. · + p(T) oj,(T)Uh. sumption yields k' = k and r~ = r 1 , ••• , r~ = rk with r~+ 1 = · · · = r~
= rk+ 1 = · · · = rh = 1, which completes the proof of the theorem.
Since there is only one expression for O in a direct sum 1 p(T) o/;(T)U¡ = O
for 1 ~ i ~ h. Therefore the mínimum polynomial of U¡, pr;(x), divides
p(x)f¡(x), implying that pr'- 1(x) divides J¡(x). Thus there exist polynomials Definition If the mínimum polynomial of Te Hom("f/') is p•(x) with
q 1(x), ... , qh(x) such that p(x) irreducible over F, then the polynomials p•(x) = pr'(x), pr (x), ... , pr•(x)
2

from the secondary decomposition theorem are the p(x)-elementary divisors


ofT.

Therefore
From the comments that have been made above, it should be clear that
if p•(x) is the mínimum polynomial of T, then a diagonal block matrix can be
given for T once the p(x)-elementary divisors are known.
which means that
Example 4 lf TE Hom("f/' 8 ) has p(x)-elementary divisors
.!V p(T)
1
e: !/(pr' - (T)U1, T) EB · · · EB !/(pr'- 1(T)Uh, T). (x 2 - x + 3) 2 , x 2 - x + 3, x 2 - x + 3, then the diagonal block matrix
<r
376 8. CANONICAL FORMS UNDER SIMILARITY §5. Canonical Forms 377

of T with respect to a cyclic basis for "1'" 8 is 7. Find the (x + 1)-elementary divisors of TE Hom("Y4 ) given the matrix of T
with respect to {E1} and the fact that its mínimum polynomial is (x + 1)2 •
o o o
o o
-9 j
6 i
o
a. (-1_:o -~
-1
_¡ -i)
-2 -2 · b.
(g -g ó -i)
-9 10 -7 -4 ·
o 1 o -7 ¡ o o -1 o 1 o -2
.
d~agcc
(2
p ), c(p), C(p)) = 001
_____ .. _________________ ..2!
__ ro·----~3--' 8. Find a diagonal block matrix for a map with the given elementary divisors.
a. x 2 + x + 1, x 2 + x + 1, x 2 + x + l.
i:.................
1 1i
-¡--o·--..~-3
b. (x + 3)3, (x + 3) 2 , (x + 3) 2 •
c. (x 2 - 4x + 7) 2 , x 2 - 4x + 7.
o ¡1 9. Combine the primary and secondary decomposition theorems to find a diagonal
block matrix composed of companion matrices similar to
Herep(x) = x 2 - x + 3 andp 2(x) = x4 - 2x 3 + 7x2 - 6x + 9.
a. (-! =~ -f ~i).
-4 o -4 7
b. (~r
6
_: ~~ -~¡).
o 6 -10
Problems

l. Find the companion matrices of the following polynomials.


a. x - 5. b. (x- 5) 2 • c. (x- 5) 3 • d. x 2 - 3x + 4. §5. Canonical Forms
e. (x 2 - 3x + 4) 2 •
2. Suppose the companion matrix C(q) E vil h x h is the matrix of TE Hom( "Y) with The primary and secondary decomposition theorems determine a direct
respect to sorne basis B. sum decomposition of "1'" for any map TE Hom("f'"). This decomposition
a. Show that q(x) is the mínimum polynomial of T. yields a diagonal block matrix for T, which is unique up to the order of the
.~
b. Shów that ( -!)l'q(x) is the characteristic polynomial of T. blocks, and thus may be used as a canonical representation for the equiv-
3. Find a cyclic basis B for "Y 3 and the mati:ix of T with respect to B if the matrix alence class of matrices for T.
of T with respect to {E¡} is

a. (~
o
-1o - 2~)- b. (-i -: -T).
-4 -4 4
Theorem 8.27 Suppose m(x) is the mínimum polynomial of TE Hom("f'")
and m(x) = p1'(x) · · · p~"(x) is a factorization for which p 1(x), ... ,Pk(x) are
4. Suppose TE Hom("Y) and B is a basis for "Y. Let U1 , ••• , Uh be vectors distinct monic polynomials, irreducible over F. Then there exist unique
satisfying the secondary decomposition theorem with U2 , ••• , Uh obtained positive integers h;, rij with e¡ = r; 1 ::=:: • • • ::=:: r;h, ::=:: 1 for 1 :::; i :::; k and
fro~, ... , wh as in the proof. 1 :::; j :::; h¡ and there exist (not unique!y) vectors Uij E "f'" such that "f'" is
a. Why is there always a vector in B that can be used as U1 ? the direct sum of the T-cyclic subspaces [1-'(U;j, T) and the T-minimum
b. Show that W 2 , ••• , Wh can always be chosen from B. polynorniai Of ijij is p~U(x) for J :::; i :::; k, J :::; j :::; h¡.
5. Suppose TE Hom("Y6) and p•(x) is the mínimum polynomial of T. List all
possible diagonal block matrices for T (up to order) composed of companion
matrices if Proof The factorization m(x) = p~'(x)· · ·pk"(x) is unique up to
a. p"(x) = (x + 3)2. b. p"(x) = x- 4. c. p•(x) = (x 2 + x + 1) 2 • the order of the irreducible factors, and the primary decomposition theorem
d. p•(x) = (x + 4) .
2 3
e. p•(x) = (x- 1) • 3 gives
6. Suppose (x - e)' is the T-minimum polynomial of U, e E F.
a. Show that SI'( U, T) contains a characteristic vector for c. (1)
b. Show that SI'( U, T) cannot contain two linearly independent characteristic
vectors for c. -- - Now consider the restriction of T to each T-invariant space .!Vp,••<T>· Thc
378 8. CANONICAL FORMS UNDER SIMILARITY §5. Canonical Forms 379

mínimum polynomial of this restriction map is Pi'(x). So from the previous (<=) Let A be an n x n matrix with elementary divisors pi' 1(x) for
section we know that there exist unique positive integers h 1, e1 = r 11 ;:: • • • 1 ::;; j:::; h¡, 1 :::; i:::; k. Suppose A is the matrix of TE Hom("f/) with respect
;:: r¡h, ;:: 1 and there exist vectors U¡¡, ... ' uih¡ E .;Vp,••(T) e "Y with T- to sorne basis. Then there exists a basis for "Y with respect to which Thas the
minimum polynomials p~"(x), ... ,p?h•(x), respectively, such that matrix
D = diag(C(p~"), ... , C(p~"h•)) .
.!VP,"•<TJ = Y( U;¡, T) EB · · • EB Y(U1h,, T).
Thus A is similar to D. lf A' E J!t.x.(F) has the same elementary divisors
Replacing each null space in the sum (!) by such direct sums of T-cyclic
as A, then by the same argumentA' is similar to D and hence similar to A.
subspaces completes the proof of the theorem.
This theorem shows that the elementary divisors of a matrix form a
Corollary 8.28 The degree of the mimmum polynomial of complete set of invariants under similarity, and it provides a canonical form
TE Hom("Y) cannot exceed the dimension of "Y. for each matrix.

Proof
Definition If A E .H.x.(F) has elementary divisors Pi'i(x), 1 ::;; j::;; h;
k h¡ k k and 1 :::; i :::; k, then the diagonal block matrix
dim "Y= L L dim Y(U
i=l j=l
1j, T):?: L dim Y(U
i=l
11 , T) = L degp'f'
i=l
diag(C(p;"), ... , C(p;•h•), ... , C(pk"'), .. . , C(pk"h•))
= degm.
is a rational canonical form for A.
The collection of all p 1{x)-elementary divisors of T, 1 ::;; i ::;; k, will be
referred to as the e/ementary divisors of T. These polynomials are determined Two rational canonical forms for a matrix A differ only in the order of
by the map T and therefore must be reflected in any matrix for T. Since two the blocks. It is possible to specify an order, but there is Iittle need to do so
matrices are similar if and only if they are matrices for sorne map, we have here. It should be noticed that the rational canonical form for a matrix A
obtained a set of invariants for matrices under simjlarity. It is a simple matter depends on the nature of the field F. For example, if A E .H.x.CR), then A
to extend our terminology to n x n matrices with entries from the field F. is also in .lf. x .( C), and two quite different rational canonical forros can be
Every matrix A E .H.x.(F), determines a map from the vector space .H.x 1 (F) obtained for A by changing the base field. This is the case in Example 1
to itself with A(V) = AVfor all V E .H.x 1(F). Therefore we can speak of the below.
mínimum polynomial of A and of the elementary divisors of A. In fact, the The following result is included in the notion of a canonical form but
examples and problems of the two previous sections were stated in terms of might be stated for completeness.
matrices for simplicity and they show that the results for a map were first
obtained for the matrix.
Theorem 8.30 Two matrices from .lf.x.(F) are similar if and only if
they have the same rational canonical form.
Theorem 8.29 Two matrices from .lf.x.(F) are similar if and only if
they have the same elementary divisors.
Example 1 Find a rational canonical form for

o -1

A~(-~
Proof ( =>) lf A is the matrix of T with respect to the basis B, then -1
Y = AX is the matrix equation for W = T(V). Therefore, the elementary
divisors of A equal the elementary divisors of T. If A andA' are similar, then
they are matrices for sorne map T and hence they have the same elementary
o
o
o -1
3
4 o -4
-2
2
-io)
E .lf s x s(R).

divisors. <r -2 1 3 -2 -2
380 8. CANONICAL FORMS UNDER SI~LARITY §5. Canonical Forms 381

Suppose {E¡} denotes the standard basis for .H.x 1(R). Then a little The determinant of a diagonal block matrix is the product of the determinants
computation shows that the A-minimum polynomial of E1 and E4 is (x - .1) 2 , of the blocks (problem 5, page 347), so if we set bii = deg p~ii, then
the A-mínimum polynomial of E 2 is x 2 + 4 and the A-minimum polyncimial
of E 3 and Es is (x - 1) 2 (x 2 + 4). Therefore the mínimum polynomial of A
is m(x) = (x- 1) 2 (x 2 + 4). The polynomials p 1(x) = x- 1 and P2(x)
= x 2 + 4 are irreducible over R and m(x) = p~(x)p 2 (x). Thus p~(x) and But det(C(p~'1) - AI6 . ) is the characteristic polynomial of C(p/'1), and by
p 2 (x) are two elementary divisors of A. The degrees of the elementary divisors problem 2, page 376 this polynomial is ( -1) 6 iip?J(A.). So q(A.) =
of A must add up to 5, so A has a third elementary divisor of degree l. This ( -l)"p'¡"(A.) .. ·p~•••(A.). But if m(x) is the mínimum polynomial of A, then
must be x - l. Since the elementary divisors of A are p~(x), p 1 (x), and P2(x), m(x) = p~ 11 (x)pí21 (x) · · ·pí;"'(x). Therefore q(A.) = m(A.)s(A.) for sorne polynomial
a rational canonical form for A is s(A.), and m(A) = Oimplies q(A) = Oas desired.

Corollary 8.32 If p; 11 (x), ... , p~•"•(x) are the elementary divisors


for an n x n matrix A, then the characteristic polynomial of A is
( -1 )"p~li(A.) ... p~·"•(A.).

Although we have used the rational canonical form to prove the Caley-
If the matrix of this example were given as a member of .H 5 x 5 ( C), then Hamilton theorem, the form itself leaves something to be desired. As a di-
the mínimum polynomial would remain the same but its irreducible factors agonal block matrix most of its entries are O, but a general companion matrix
would change. Over the complex numbers, is quite different from a diagonal matrix. This is true even in an algebraically
closed field, for example, the companion matrix of (x - a) 4 is
m(x) = (x - 1) 2(x + 2i)(x - 2i).
o o o
Then (x - 1) 2 , x + 2i, and x - 2i are elementary divisors of A, and again 1 o o a4)
4a 3
(o 1 o 6a 2 .
there must be another elementary divisor of degree l. Why is it impossible -
for the fourth elementary divisor to be either x + 2i or x - 2i? With ele- o o 1 4a
mentary divisors (x - 1) 2 , x - 1, x + 2i, and x - 2i, a rational canonical
form for A E .H s x 5 ( C) is obtained from D by replacing the block ( ~ ~)
For practica! purposes it would be better to have a matrix that is more
- nearly diagonal. One might hope to obtain a diagonal block matrix similar
. h (2i
Wlt 0 -2i .
o) to the companion matrix C(p•) which has e blocks of C(p) on the diagonal.
Although this is not possible, there is a similar matrix, which comes el ose ro-
this goal.
Considera map TE Hom('i'") for which 'i'" = !/(U, T) and p•(x) is the
Theorem 8.31 (Cayley-Hamilton Theorem) An n x n matrix T-minimum polynomial of U. Then C(p•) is the matrix of T with respect to
[or a map in Hom('i'")] satisfies its characteristic equation. the cyclic basis {U, T(U), ... , T" 6 - 1 ( U)}, where b is the degree of p(x). \Ve
That is, if q(A.) = O is the characteristic equation for the matrix A, then will obtain a simpler matrix for T by choosing a more complicated basi;;.
q(A) =O. Supposep(x) = x 6 + a6 _ 1xb-t + ··· + a 1x + a 0 .ReplacexbyTandapply
the resulting map to U and obtain
Proof Suppose the rational canonical form for A is D =
diag(C(p~ 11 ), C(p~kh•)). Then since A is similar to D, q(A.) = det(D - Al.).
••• ,
382 8. CANONICAL FORMS UNDER SIMJLARITY §5. Canonical Forms 383

Now repeatedly apply the map p(T) to obtain the following relations: so on. Then using the second equation above, we have

/(T)U = p(T)Tb(U) + ab_ 1p(T)Tb- 1(U) + · · · + a 1 p(T)T(U) T(p(T)Tb- 1(U)) = p(T)Tb(U)


+ a0 p(T)U = p 2(T)U- ab_ 1 p(T)Tb- 1(U)- · · · - a0 p(T)U.

p\T)U = p 2 (T)Tb(U) + ab_ 1p 2 (T)Tb- 1(U) + · · · + a 1p 2 (T)T(U) Therefore


+ aop 2 (T)U

p•- 1(T)U = p•- 2 (T)Tb(U) + ab_ 1 p•-\T)Tb-!(U) + · · · + a 1p.- 2 (T)T(U) --.:._


+ GoPe- 2 (T) U.
Notice that the entries from the last column of C(p) appear in each equation,
and the p + 1 vectors on the right side of each equation are like the vectors
of a cyclic basis. Also the vector on the left side of each equation appears in Call this matrix the hypercompanion matrix of p'(x) and denote it by Ch(p•).
the next equation. Therefore, consider the following set of vectors:
Example 2 Jf p(x) = x 2 - 2x + 3, then p(x) is irreducible over
T(U) T 2 (U) yb-1(U)
R. Find the companion matrix ofp 3 (x), C(p 3 ), and the hypercompanion matrix
p(T)T(U) p(T)T 2 (U) p(T)Tb- 1(U)
of p 3 (x), Ch(p 3 ).
p 2 (T)T(U) p 2 (T)T 2 (U) pz(T)Tb-1(U)
o o o o o -27
p•-1(T)Tb-1(U). 1 o o o o 54
o o o
;

o 1 -63
There are eb vectors in this set, so if they are linearly independent, they form
a basis for "Y. A linear combination of these vectors may be written as q(T)U
:\.,
~~t
C(p3) =
o o 1 o o 44
for sorne polynomial q(x) E F[x]. Since the highest power of T occurs in
::; o o o 1 o -21
p•- 1 (T)Tb-!(U), the degree of q(x) cannot exceed eb- l. Therefore, ifthe set •.-
o o o o 6
:j.
is linearly dependent, there exists a nonzero polynomial q(x) of degree less ... ~
and
than p•(x) for which q(T)U = O. This contradicts the fact that p•(x) is the
T-minimum polynomial of U, so the set is a basis fGr "Y. Order these vectors o -3 oo o o
from left to right, row by row, starting at the top, and denote the basis by 2 oo o o
Be. We will call Be a canonical basis for "/'. Let Ae be the matrix of Twith re- Ch(p3) =
o 1 -3 o
o o
spect to this canonical basis Be. The first b - 1 vectors of Be are the same as o o 1 2 o o
the first b - 1 vectors of the cyclic basis, but o o o 1 ¡o -3
o o o o: 1 2

If the companion matrices of a rational canonical form are replaced


Therefore the b x b submatrix in the upper left comer of Ac is C(p), and by hypercompanion matrices for the same polynomials, then a new canonical
the entry in the p + 1st row and pth column is l. This pattern is repeated; form is obtained, which has its nonzero entries clustered more closely about
each equation above gives rise toa block C(p) and each equation except for the main diagonal.
the last adds a 1 on the subdiagonal between two blocks. For example, the
second row of elements in Be begins like a cyclic basis with T(p(T)U)
= p(T)T(U), T(p(T)T(U)) = p(T)T2 (U), T(p(T)T 2(U)) = p(T)T3(U), and Definition lf A E Anxn(F) has elementary divisors pj'1(x), 1 ::::; j ::::; h;
384 8. CANONICAl FORMS UNDER SIMllARITY §5. Canonical Forms 385

and 1 ::; i ::; k, then the diagonal block matrix respect to which the matrix of T is

is a classieal eanoniealform for A. p(x) = x- 2.

Example 3 Suppose TE Hom('i" 6 ) has elementary divisors


(x 2 + x + 2) 2 and (x- 5) 2 • Find a matrix D 1 for T in rational canonical
form and a matrix D 2 for T in classical canonical form. Example 5 Let

e
o o; o
o o o -4!
1 o o -4 !
o 1 o 5i
o -2
-1
1
¡_Q_ o
o -2
A= O
-1 -1
-3
o
8
1 -24-8)
3 -16 .
o o 1 =2 j o 1 -1 o 1 -1 o
·-····------------- --------··r·o·-- - 2s ·
o : 1 10 ·~1 ~ Find a classica1 canonical form for A as a member of v!t 4 x iR) and a J ordan
form for A as a member of v!t 4 x i C).
If p(x) is a first degree polynomial, say p(x) = x - e, then the hyper- The A-mínimum polynomial of (l/8, -1/8, O, O)T is easily found to be
companion matrix C¡,(pe) is particularly simple with the characteristic value x 4 + 8x 2 + 16. Since the degree of the mínimum polynomial for A cannot
e on the main diagonal and 1's on the subdiagonal. Thus if all irreducible exceed 4, x 4 + 8x 2 + 16 is the mínimum polynomia1 for A. Over the real
factors of the mínimum polynomial for A are linear, then the nonzero en tries numbers this po1ynomial factors into (x 2 + 4) 2 , whereas over the complex
in a classical canonical form for A are confined to the main and subdiagonals. numbers it factors into (x - 2i) 2 (x + 2i) 2 • Therefore the classical canonical
If A is not diagonalizable, then such a form is the best that can be hoped for form for A is
in computational and theoretical applications. In particular, this always
occurs over an algebraically closed field such as the complex numbers.
o1 -4:
o: oo oo)
Definition lf the mínimum polynomial (or characteristic polynomial) ( a-·········c:··o . ·-~4
of A E v!t" x .(F) factors into powers of first degree polynomials o ver F, then o o: 1 o
a classical canonical form for A is called a Jordan eanonieal form or simply
aJordan form. . .____, and
o

Gf' i)
Every matrix with complex entries has aJordan form. So every matrix
with real entries has aJordan form if complex entries are permitted. How-
o
-2i
lll .,{{4xiC).
ever, not every matrix in v!t" xn(R) has aJordan form in v!t" xnCR). For example,
1 -2i
the equivalence class of 6 x 6 real matrices containing D 1 and D 2 of Example
3 does not contain a matrix in Jordan form.
The second matrix is aJordan formas required.

Examp/e 4 Suppose TE Hom(-r 5 ) has elementary divisors (x - 2? If a map T has a matrix in Jordan form, then this fact can be used to ex-
and (x - 2) 2 , as in Example 3, page 372. Then a matrix for T in classical press T as the sum of a diagonalizable map and a map which becomes zero
canonical form is a Jordan form, and there is a canonical basis for 'i" 5 with when raised to sorne power.
386 8. CANONICAL FORMS UNDER SIMILARITY §5. Canonical Forms 387

Definition A map TE Hom('t'") is nilpotent if Tk = O for sorne positive 4. Find a classical canonical form for A as a member of vf( 4 x 4 (R) and a J ordan
integer k. The index of a nilpotent map is 1 if T = O and the integer k if form for A as a member of .J'f 4 x4(C) when
Tk = O but yk-I :1 O, k> J.

Nilpotent matrices are defined similary.


A = (-~i ~ of -~!).o
-1 1
5. Determine conditions which insure that the minimum polynomial of an n x n
Theorem 8.33 If the mínimum polynomial of TE Hom('i') has only matrix A equals (- 1)" times the characteristic polynomial of A.
linear irreducible factors, then there exists a nilpotent map T" and a diag- 6. Use the characteristic polynomial to find the minimum polynomial and the
onalizable map Td such that T = T" + Td. Jordan form of the following matrices from .Jt 3 x 3 (C). ~

Proof Since the minimum polynomia1 has only linear factors,


a. ( 6 -~
-1 1
-i).o b. (-i
-1
-~ -~)-
1 o
c. (=! ~ 16) ..
2 -2 -3
there exists a basis Be for 1' such that if A is the matrix for T with respect 7. Suppose the characteristic polynomial of A is (x- 2) 6 •
to Be, then A is in Jordan canonical form. Write A = N + D where D is a. Find all possible sets of elementary divisors for A.
the diagonal matrix containing the diagonal entries of A and N is the "sub- b. Which sets of elementary divisors are determined entirely by the number
diagonal matrix" containing the subdiagonal entries of A. It is easily shown of linear! y independent characteristic vectors for 2?
that N is a nilpotent matrix, problem 14 below. So if T" E Hom('i') has N
as its matrix with respect to Be and Td E Hom('i') has D as its matrix with 8. Compute (f ~)" for any positive integer n.
respect to Be, then T = T" + Td. 9. For the following matrices in Jordan form, determine by inspection their
elementary divisors and the number of linearly independent characteristic
Theorem 8.33 may be proved without reference to the existence of a vectors for each characteristic value.
Jordan form. This fact can then be used to derive the existence of a Jordan
form over.an algebraically closed field. Seé problem 16 below.
a. (! ~ ~ 8). b. (! f ~ 8). oo o)o.
2
o
0012 0002 o 2
Problems 10. Suppose the matrix of T with respect to {E¡} is

l. Given the elementary divisors of A E .lt" x n(R), find a rational canonical form
D 1 for A anda classical canonical form D2 for A.
a. (x- 3) 3 , x- 3, x- 3. c. (x 2
+ 16) , x + 16.
2 2 -5 2 1 1
(=i -~ ~ !) .
b. (x 2
+ 2
+
4) , (x 2) 2 • d. (x 2
+ 2x + 2) 3 • a. Show that the minimum polynomial of Tis (x 2 + 2) 2 •
b. With TE Hom(?'"4 (R)), find a canonical basis Be for 1'"4 begining with
2. Suppose the matrices of problem 1 are in .Átnxn(C). Determine the elementary (0, O, 1, -1) and find the matrix of T with respect to Be.
divisors of A and find a rational canonical form D 3 for A and a Jordan c. With TE Hom(1'"4 ( C)), find a canonical basis Be for 1'" 4 ( C) and the
canonical form D 4 for A. matrix of T with respect to Be.
3. Suppose the matrix of TE Hom(-r' 4 ) with respect to {Er} is 11. a. Show that every characteristic value of a nilpotent map is zero.

(!o -g o~ =f).
b. Suppose TE Hom(-r') is nilpotent, and dim 1'" = 4. List all posible sets
of elementary divisors for Tand find a matrix in Jordan form for each.
1 3 c. Prove that if Tis nilpotent, then Thas a matrix with O's offthe subdiagonal
and 1's. as the only nonzero en tries on the subdiagonal, for sorne choice
a. Show that the minimum polynomia1 of T is (x - 2) 4 •
of basis.
b. Find the canonical basis Be for 1"4 which begins with the standard basis
vector E 1 • 12. Show that the following matrices are nilpotent and find a Jordan form for
c. Show that th!l:lilatrix of Twith respect to Beis aJordan canonical form. each.
388 8. CANONICAL FORMS UNDER SIMILARITY §6. Applications of the Jordan Form 389

( 6 -4 3)
2) ideas. However, there are no simple geometric applications of the Jordan
a.
-3
6 -4 1 . b. (
2 -2
= 1 -1
1
1
1
1
= 1)
1 . c.
1
( -21 -21 -12 -4
5
3
3 - 1 .. 2 .
2 -1 .· 2
form for a nondiagonalizable matrix. On the other hand, there are manv
places where the Jordan form is used, and a brief discussion of them wiÍI
show how linear techniques are applied in other fields of study. Such an
13. Find a nilpotent map T. and a diagonalizable map T4 such that T = T,, + T4
application properly belongs in an orderly presentation of the field in ques-
when
a. T(a, b) = (5a - b, a + 3b). tion, so no attempt will be made here to give a rigorous presentation. Wc
b. T(a, b, e) = (3a- b - e, 4b + e, a - 3b). will briefly consider applications to differential equations, probability, and
numerical an,Jysis.
14. Suppose A is an n x n matrix with all en tries on and above the main diagonal -,(
equal to O. Show that A is nilpotent. Differential equations: Suppose we are given the problem of finding
x 1 , ••• , x. as functions of t when
15. Suppose the mimimum polynomial of TE Hom('Y) has only linear irreducible
factors. Let T 1 , ••• , Tk be the projection maps that determine a primary
decomposition of 'Y, as obtained in problem 15 on page 367. Thus if A.J> ..• , A.k D,x 1 = dx¡jdt = a 11 x 1 + · · · + a 1.x.,
are the k distinct characteristic values of T, then T1: 'Y- .k'<r-;.,o"•·
a. Prove that Td = A. 1 T 1 + · · · + A.k Tk is diagonalizable. for each i, 1 :s; i :s; n. If the aij's are all scalars, then this system of equa-
b. Prove that T,, = T- T 4 is nilpotent. Suggestion: Write T = To 1 = tions is cailed a homogeneous system of linear first-order differential equations
To T 1 + · · · + To Tk and show that for any positive integer r,
with constan! coefficients. The conditions x¡(t 0 ) = c1 are the initial conditions
T;, = (T- A. 1 /)'o T 1 + · · · + (T- A.kl)'o h at t = t 0 . Such a system may be written in matrix form by setting A = (a 1) ,
16. Combine problems llc and 15 to show how the decomposition T = Td + T,, X= (x¡, ... , x.f, and e= (e¡, ... ; c.)T. Then the system becomes
yields a matrix in Jordan form. (The statement of llc can be strengthened to D,X = AX or D,X(t) = AX(t) with X(t 0 ) = C.
specify how the 1's appear, and then proved without reference to a Jordan
form. See problem 17.)
17. Suppose TE Hom('Y) is nilpotent of index r and nullity s. Example 1 Such a system is given by
a. Show that {O}= .k' ro e .k'r e .1Vr' e · · · e ffrr ='Y.
b. Show that 'Y has a basis D,x = 4x - 3y - z, x(O) = 2
B 1 = {U 1 , ••• , Va., Ua 1 + 1 , ••• , U" 2, ••• , UQ,._,+J, ... , Uar}, D,y =X- z, y(O) = 1
with {U 1 , ••• , Ua 1 } a basis for% r; for i = 1, ... , r.
c. Prove that the set {UJ> ... , Ua 1 _ , T(Ua 1 +1), •.• , T(Ua 1 . , ) } may be D1z = -x + 2y + 3z, z(O) '= 4.
extended toa basis for% r' for i = 1, ... , r - l.
d. Use parte to obtain a basis B 2 = {W1 , . . . , Wa,, ... , War- 1 +1, ... , This system is written D,X = AX, with initial conditions at t = O given by
WaJ such that { W1, •.. , Wa;} is a basis for X r; for each i, and if W E 82
X(O) = C, if
..
is ...,.,
in !he basis for .1V' r', but not the basis for .!V r'- •, then

e.
T(W)E{Wa 1 _,+1, ... , Wa 1 _
and so on.
1 }, T 2 (W)E{Wa 1 - 3 +1, ... , Wa 1 _,},

Find a basis 8 3 for 'Y such that the matrix of T with respect to 83 is A= ( ~
-3
o
-¡)
-1,
-1 2 3
diag(N 1 , ••• , N 5 ), where N 1 has 1's on !he subdiagonal and O's elsewhere,
andifN1 isb1 X bl>thenr=b¡~b2~ ... ~bs~l.
A solution for the system D,X = AX with initial conditions X(t 0 ) = C
is a set on n functions j 1 , ••• J. su eh that x 1 = f 1(t), ... , x. = J.(t) satisfies
the equations and j 1(t 0 ) = c 1, ••• ,f,,(t 0 ) = c•. Usually D,X = AX cannot
§6. Applications of the Jordan Form be solved by inspection or by simple computation of antiderivatives. How-
ever, the system would be considerably simplified if the coefficient matrix
Our examples have been chosen primarily from Euclidean geometry to A cou!d be replaced by its Jordan form. Such a change in the coefficient
provide clarification of abstrae! concepts and a context in which to test new matrix amounts to a change in variables.
390 8. CANONICAL FORMS UNDER SIMILARITY §6. Applications of the Jordan Form 391

Theorem 8.34 Given the system of differential equations D1X = y'= 3e 21 , and z' = -e- 21 , therefore the solution of the original system is
AX with initial conditions X(t 0 ) = e, Jet X= PX' define a new set ofvari-
ables, P nonsingular. Then X is a solution of D1X = AX, X(t 0 ) = e if and
1

PX' = 2 1 1) ( e 21 )
only if X' is a solution of D 1X' = p-IAPX', X'(t 0 ) =p-Ie. X =
(21 11 32 -e-3e 21

Proof Since differentiation is a linear operation and Pis a matrix of


or
constants, P(D 1X') = DlPX'). Therefore D1X' = p-I APX' if and only if
D¡(PX') = A(PX'), that is, if and only if X= PX' satisfies the equation x = 2et + 3e2t - e-2t
~_D 1 X=AX.
y = et + 3e2t - 3e-2t
Example 2 Find x, y, z if z = 2e 1 + 3e 21 - 2e- 21 .

D 1X = D~(~)z = (~~10 ;
2
-~!)(~) = AX
-10 z
and
Example 3 Consider the system given in Example l. A Jordan
form for A is given by p- 1 AP with

For this problem t 0 = O. and p- 1 AP=


2
12 O.
o o)
The characteristic values of A are 1 and ±2 so A ís diagonalizable. (o o 3
In fact, if
Thus A is not diagonalizable. If new variables are given by X= PX', then
it is necessary to solve the system

then

p-IAP =
1
O 2
o Oo) . with
( o o -2
x'(O)) =P- 1e= (o1 1 1)(2) (5)5.
-11 1 =
Therefore if new variables x', y', z' are defined by X= PX', then it is neces- (y'(O)
z'(O) 1 -1 O 4 1
sary to solve D 1X' = diag(1, 2, -2)X' with initial conditions

The first and Iast equations are easily solved: x' = 5e 21 and z' = e • The
31

X'(O) =p-Ie= -14 -12 2) (4)1 = (


-5 31) .
21
second equation is D 1y' = x' + 2y' or D 1y' - 2y' = 5e . Using e- as an
21

( -1 o 1 3 -1 integrating factor,

The equations in the new coordinates are simply D1x' = x', D1y' = 2y', and or D¡(y'e- 21) = D¡(5t).
D 1z' = -2z', which have general solutions x' = ae1, y' = be21 , z' = ce- 21
Therefore y'e- 21
21
where a, b, e are arbitrary constants. But the initial conditions yield x' = e1, = 5t + b, but y' = 5 when t =O so y' = 5(t + 1)e ,
cr
392 8. CANONICAL FORMS UNDER SIMILARITY §6. Applications of the Jordan Form 393
:::!)

and the solution for the original system is Proof Let A.= (a;) and Ak = (aij,k)- The entries in A must be
bounded, say by M= max {laiill1 ::; i, j::; n}. Then the entries in A 2 are
2
-1
-1 1 2)( 5(t 5e+ 1)é' .' )
boimded by nM 2 for

1 -1 e3'

The general solution of D,x = ax is x = ce•' where e is an arbitrary


constan t. The surprising fact is that if e'A is properly defined, then the general An induction argument gives la;1,d ::; nk-tMk. Therefore,
solution of D,X = AXis given by e'AC where C is a column matrix of con-
stants. There is no point in proving this here, however computation of e'A
is most easily accomplished using aJordan form for A. ·
Computation of ~ and e'A: The matrix eA is defined by following the
pattern set for cf in the Maclaurin series expansion: But (nM)kfk! is the k+ 1st term in theseriesexpansionfore•M, whichalways
converges. Therefore, by the comparison test the seires Li"=o (1/k!)a;j,i:
"' xk
-¿-
-k=ok!'
converges absolutely, hence it converges for all i and j. That is, the series
Li"=o (lfk!)Ak converges.

Given any n x n matrix A, we would set Thus we may define


CX) 1
= L kiA\
k=O · and

if this infinite sum of matrices can be made meaningful.


If A is diagonal, then eA is easily compUted for

Definition Suppose Ak E Alnxn for k =O, 1, 2, ... , and Ak = (au,k).


If limk-ro a;i,k exists for all i and j, then we write
But if A is not diagonal, then eA can be computed using aJordan form for A.
where B = (bii) and bii = Iim aij,k·
k-ro

Theorem 8.36
Sk = L~=o A; is apartial sum of the infinite series of matrices Lf:o A¡. ----.
This series converges if limk-ro Sk exists. If limk-ro Sk = S, then we write
Li"=o Ak =S. Proof

Theorem 8.35 The series

~ 1 k
.t.. -A
k=ok!

converges for every A E ..4!, x w Example 4 Let A and P be as in Example 2. Then p-t AP =
394 8. CANONICAL FORMS UNDER SJMILARITY §6. Applications of the Jordan Form 395

dia$(1, 2, -2) and Consider

e'A =
N=(~ ~}
P(e'diag(l,2,-2))p-t
with
= P diag(e', e2 ', e- 2 ')P- 1
Then
Now with C as in Example 2,

is the solution obtained for the system of differential equations ·given in and
· Example 2.

The calculation of eA is more difficult if A is not diagonalizable. Sup-


pose J = diag(J 1 , ••• , J11 ) is aJordan forro for A where each block l; con-
tains 1's on the subdiagonal and a characteristic value of A on the diagonal.
Then eJ = diag(é', ... , eJ''). Thus it is sufficient to consider Jan n x n With
matrix with A. on the diagonal and 1'son the subdiagonal. Write J = A.I. + N,
then N is nilpotent of index n. It can be shown that eA+B = eAe8 provided
that AB = BA, therefore

o
1 X = e'AC gives the solution obtained for the system of differential equations
1 in Example 3.

1/(n - 2)! Markov chains: In probability theory, a stochastic process concerns


events that change with time. If the probability that an object will be in a
and certain state at time t depends only on the state it occupied at time t - 1,
then the process is called a Ma.rkov process. lf, in addition, there are only a
o
1 ...... o)o finite number of states, then the process is called a Markov chain. Possible
states for a Markow chain might be positions an object can occupy, or qual-
.. . o . ities that change with time, as in Example 6.
If a Markov chain involves n states, let au denote the probability that
.. . 1 an object in the ith state at time t was in the jth state at time t - l. Then
a;i is called a transition probability. A Markov chain with n states is then
characterized by the set of all n2 transition probabilities and the initial dis-
Exampfe 5 Let A and P be the matrices of Example 3. Then tribution in the various states. Suppose x 1 , ••• , x. give the distribution at
t = O, and set X = (x ¡, . . . , xnl, then the · distribution at time t = 1 is

p- 1 AP=
2 o o)
12 0.
given by the product AX, whére A = (aij) is the n x n matrix of transition
· probabilities. When t = 2, the distribution is given by A2 X, and in general,
(o o 3 the distribution when t = k is given by the product AkX.
(
396 8. CANONICAL FORMS UNDER SIMILARITY §6. Applications of the Jordan Form 397

The matrix A of transition probabilities is called a stochastíc matrix. X = (1/2, 1/2? gives the ínítíal dístribution. Therefore, the product
The sum of the elements in any column of a stochastic matrix must equal l.
This follows from the fact that the sum a 1j + a2 j + · · · + a.j is the pro-
AX = (5/6 l/3)(1/2) = (7/12)
bability that an element in the jth state at time t was in sorne state at time 1/6 2/3 1/2 5/12
t - l.
If A is the stochastic matrix for a Markov process, then the ij entry in indicates that after one generation, 7/12th of the population should be
Ak is the probability of moving from the jth state to the ith state in k steps. brown-eyed and 5/l2th blue-eyed. After k generations, the proportions would
If limk_,oo Ak exists, then the process approaches a "steady-state" distribu- be given by Ak(1/2, 1/2V.
tion. That is, if B = Iimk_,oo A\ then the distribution Y= BX is unchanged The stochastic matrix for this Markov chain is diagonalizable. If
by the process, and A Y = Y. This is the situation in Example 6.
If A is a diagonal matrix, then it is a simple matter to determine if a
steady state is approached. For if A = diag(d1, ••• , d.), then A k = diag(df, ... , p (2 -1)1 '
= 1
d~). So for a diagonal matrix, limk_,oo A exísts províded ld¡l < 1 or d¡ = 1
for all the diagonal entries d¡. This condition can easily be extended to diago- then
nalizable matrices.

Theorem 8.37 If B = p- 1 AP, and limk_,oo Ak exists, then

Therefore the hypotheses of Theorem 8.38 are satisfied, and


lim Bk = r 1
(lim Ak)P.
k-+oo k-+oo

lim Ak = P[lim(l O )k]p-:i = p(l O)p-1 = (2/3 2/3)


k->oo k-<:o O l/2 OO 1/3 1/3 ·
Proof.
Hence this chain approaches a steady-state distribution. In fact,
Iim Bk = lim (P- 1 AP)k = lim p-lAkP =r 1
(lim Ak)P.
k-+oo k-+oo k-+co k-+oo

2/3
( 1/3
2/3) (112) (2/3)
Using Theorem 8.37, we have the following condition. l/3 1/2 = 1/3 '

so that, in time, the distribution should approach two brown-eyed individuals


Theorem 8.38 If A is diagonalizable and IAI < l or A = l for each _ for each blue-eyed individual. Notice that, in this case, the steady-state
characteristic value A. of A, then limk_,oo Ak exists. ----::zcftstribution is independent of the initial distribution. That is, for any a with
O:-:; a:-:; 1,
Example 6 Construct a Markov chain as follows: Call being
brown-eyed, state 1, and blue-eyed, state 2. Then a transition occurs from 2/3 2/3)(1 - a) = (2/3)
one generation to the next. Suppose the probability of a brown-eyed parent ( 1/3 1/3 a l/3 ·
having a brown-eyed child is 5/6, so a 11 = 5/6; and the probability of having
a blue-eyed child is 1/6, so a 21 = 1/6. Suppose further that the probability ,";-·,
Numerical analysis: A stochastic matrix A must have 1 as a characteristic
of a blue-eyed parent having a brown-eyed child is 1/3 = a 12 ; and the pro- value (see problem 8 at the end of this section), therefore limk__,oo Ak cannot
bability of having a blue-eyed child is 2/3 = a22 . be O. In contrast, our application of the Jordan form to numerical analysis
If the initial population is half brown-eyed and half blue-eyed, then requires that Ak approach the zero matrix.
398 8. CANONICAL FORMS UNDER SJMILARITY §6. Applications of the Jordan Form 399

Theorem 8.39 Suppose A is a matrix with complex entries. Theorem 8.40 The series Ir'=
o Ak converges if and only iflimk-+oo A k= O.
Limk-oo Ak = O if and only if 1.1.1 < 1 for every characteristic value A of A. Further when the series converges,

00

Proof Jf J = diag(J 1, ••• , Jh) is a Jordan form for A, then L Ak =(In- A)-1.
limk-oo Ak =O if and only if limk .... oo Jk =O. But Jk = diag(J}, ... , J~), so it is k=O

sufficient to considera single Jordan block Un + N where A is a characteristic


val u e of A and N contains 1's on the su bdiagorial and O's elsewhere. That Proof Let sk be the partía! sum L~=O A''.
is, it mus! be shown that (=>) If Lk"=o Ak converges, say Lk"=o Ak =S, then for k> 1, Ak =
sk- sk-1 and ----
lim (Aln + N)k = O if and only if 1.1.1 < l.
k-+oo lim Ak = lim (Sk - Sk_ 1 ) =S- S = O.
k_.co k-?co
If k ;:::: n, then since the index of N is n,
(<=) Suppose limk .... oo A k = O and consider
(21" +N/= )._k¡"+ e)Ak-1N

+ (;}1k-2N2 + . , . + (n ~ J}.k-n+1Nn-1 Then

where (~) is the binomial coefficient h !(kk~ h)!. Therefore


or
J. k ·O o
G).~.k-1 Ak o
Taking the limit gives
(U.+ N)k = Ü)Ak-2 (~)Ak-1 o
I. = lim(I. - Ak+ 1 ) = lim (!. - A)Sk = (I. - A) lim Sk.
k-+co k-+-oo k-+-oo

( k )Ak-n+1 ( k )Ak-n+2 _A k
n- 1 n-2 Therefore the limit of the partial sums exists and is equal to (!. - A)- 1 ,
that is,
If limk .... "' (Al. + N)k = O, then IAI < 1 since ).k is on the diagonal. Con- CX)

versely, if IAI < 1, then L Ak =(l.- A)-1.


k=O

and lim(~);..k-i =O
k~co l Corollary 8.41 Lf=o Ak =(l.- A)- 1 if and only if I.AI < 1 for
every characteristic value .A of A.
follows from repeated applicalions of L'Hopital's rule for 1 :::; i :::; n,
so that limk-+oo (}..[n + Nl = O. The above theorems are employed in iterative techniques. For example,
suppose EX = Bis a system of n linear equations in n unknowns. If A = In - E,
Using this resuhe can show that the series Lf=o Ak behaves very much then X ~ AX + B is equivalel}t to the original system. Given a nonzero
like a geometric series of the form :L:,o ar", a and r scalars. vector X 0 , define Xk by Xk = AXk-l + B for k = 1, 2, .... If the sequence
400 8. CANONICAL FORMS UNDER SIMILARITY §6. Applications of the Jordan Form 401

of vectors X0 , X 1 , X 2 , • •• converges, then it converges toa solution of the approximately, say to four or five decimal places. Then this technique might
given system. For if limh-oo Xh = e, then provide a method for approximating a solution involving only matrix
multiplication and addition. This is obviously not something to be done by
e= Ae + B = (!,- E)e + B, hand, but a computer can be programmed to perform such a sequence of
operations.
or Ee = B. This is a typical iterative technique of numerical analysis.

Theorem 8.42 The sequen'ce X0 , X 1, X2 , ••• converges to a solution Problems


for Ee = B if IA.I < 1 for every characteristic value A. of A = /, - E.
l. Solve the following systems of differential equations by making a change of
variables.
Proof a. D,x = x - 6y, x(O) = 1
D,y = 2x - 6y, y(O) = 5.
X 1 = AX0 + B, b. D,x = 7x - 4y, x(O) = 2
D,y = 9x - 5y, y(O) = 5.
X2 = AX1 + B = A 2 X 0 + AB + B, c. D,x = x - 2y + 2z, x(O) = 3
D,y = 5y - 4z, y(O) = 1
X 3 = AX2 + B = A 3 X 0 + (A 2 + A + I,)B, D,z = 6y - 5z, z(O) = 2.
d. D,x = 5y - z, x(O) = 1
and by induction D,y = - X + 5y - z, y(O) = 2
D,z = -x + 3y + z, z(O) = l.
2. Compute eA and e'" if A is

o5 oo) . 3 o o o)
With the given assumption, a. (-1 4)
-2 5 · b. (~ ~). c. o4
(o 1 5
d.
1 3 o o
(o 1 3 o .
o o 1 3
3. For the systems D,X =, AX, X(O) = C, of problem 1, show that e'"C is the
solution obtained by making a change of variables.
4. An nth order linear homogeneous differential equation with constan! coeffi-
(!, - A) -J B is a solution for EX = B for cients has the form
(1)
1
Define x 1 = x, x2 = D,x, x 3 = Dtx, ... , Xn = D~- x, and consider the
homogeneous system of linear first-order differential equations
Notice that the solution obtained above agrees with Cramer's rule D,x, = x2
beca use D,x 2 = x 3
(2)
D,xa-I = x"
DrXn = - 011 - I X n - Gu-tXn-2 - • ·' - 01Xz - OoX¡.

What if E were singular? a. What is the form of the matrix A íf the system (2) is written as D,X =
AX?
Because of our choice of problems and examples, it may not be clear b. Show that x = f(t) is a solution of ( 1) if and only if x 1 = f(t) is included in
why one would elect to use such an iterative technique to solve a system of a solution of (2).
linear equations. But suppose the coefficients of a system were given only 5. Find x as a function of t if
402 8. CANONICAL FORMS UNDER SIMILARITY Review Problems 403

a. D~x- 5D,x + 6x =O, x(O) = 2, D,x(O) =-l. c. Find a matrix for T in rational canonical form.
b. D~x- 8D,x + 16x =O, x(O) = 7, D,x(O) = 34. d. Find a matrix for T in classical canonical form.
c. Dtx + 2Dlx- D,x- 2x =O, x(O) = 2, D,x(O) = 1, Dlx(O) =-;.:l.
4. Given TE Hom(-r), suppose there exist n T-invariant subspaces .!/ 1 e .!/ 2
6. Find the Jimk~"' A k if it exists when A is e··· e.!/n_ 1 e.!/n = -r, with dim .!/1 = i, lsisn. (Such a sequence of
a. (b 3~J b. (b 4~3).
c. (-1-2 ll/6). d. (~ -4)
-5·
subspaces exists if -r is over an algebraically closed field F.)
a. Show that there exists a basis B = {U 1 , • . • , Un} for -r su eh that for each
o o o)
e.
( 1/2 o
3 1
-1 o ~)· . f.
("
3/51
o o
2 o
o o
-4
o
2/3
1
o
g. ( -1/2
1 34
-1/2 2
-3)
-3/2
-1/2
o
i, 1sisn, {U 1, ... , U¡} is a basis for .!/¡.
b. Determine the form of the matrix A of T with respect to B .
c. Show that the entries on the main diagonal of A are the characteristic
values of T.
7. a. Show that if (x - 1) 2 is an elementary divisor of A, then limk~co Ak
d. Suppose p(x) E F[x]. Prove that Ji is a characteristic value for p(T) if and
does not exist. only if Ji == p(A.) for sorne characteristic value A. for T.
b. Show that if limk ~"' A k. = O, then det A< l. Is the converse true?
5. Find the companion matrix and the hypercompanion matrix of
8. Let A be a stochastic matrix and B = (1, 1, ... , 1). Show that 1 is a charac- a. (x + 2) 3 . b. (x 2 + x + 1)2 •
teristic value of A by considering the product BA.
6. Find rational and classical canonical forms for
9. Does limk~"' Ak exist for the following stochastic matrices?
3/4 1/3) (1/4 1/3 2/3)
2000)
1 1 o o 22 o1 oo oo)
a. ( 114 213 . b. 2/4 1/3 O .
1/4 1/3 1/3
a. ( 1 O 2 O·
-2 1 1 1
b. ( -1
7 -3
1
-1
1 o
2
o

.··..
10. Show that if E is singular, then A = In - E has a characteristic value A. such "' 7. Suppose (x + 4) 3 (x + 1)2 is the mínimum polynomial of a matrix A. Deter-
that IA.I;;o: 1. mine the possible elementary divisors of A and the corresponding number of
independent characteristic vectors if
a. Ais5x5. b. Ais7x7.
8. Suppose the mínimum polynomial of A is (x 2 + 4) 3 . Find a classical canonical
Review Problems form for A if
a. A E vlf 6 x 6(R).

l. Suppose (-1
-2 -9
~ 8)
-5
is the matrix of TE Hom(-r 3) with respect to
9. Suppose the mínimum polynomial of A is (x- 3)3 • Find the possible Jordan
forms for A if A is a. 3 x 3. b. 5 x 5.
{E¡}. 10. Suppose T is nilpotent and A is a matrix in Jordan form for T. Show that the
a. Show that x 3 - 1 is the T-minimum polynomial of E3. number of O's on the subdiagonal of A is one Jess than the nullity of T.
b. Find the vectors U 1 = (T- /)(E3 ) and U 2 = (T 2 + T + /)(E3 ). 11. Find the Jordan form for the nilpotent map TE Hom(-rn) if
c. Find the T-minimum polynomials of U1 and U2. a. n = 6, nullity of T = 4, and the index is 3.
d. Find the matrix of Twith respect to the basis {U¡, T(U¡), U2}. b. n = 7, nullity of T = 4~ and the index is 2.
e. ls there a matrix "for T in Jordan form? c. Show there are two possible forms if n = 7, nullity = index = 3.
Prove Theorem 8.1: .!/ 1 + · · · + .!/h is a direct sum if and only if U1 + ·· ·
D,(:) (i -~ =~)(:) (-~)=C.
2.
+ uh =o, with U¡ E.!/¡ for 1sish, implies U¡ =o for 1sish. 12. a. Solve = = AXif X(O) =

3. Suppose (-~
5
-r l¡ -!)
o 3 4
isthematrixofTEHom(-r4)withrespect
.:¡ b. Show that the solution is given by X= e'AC•
13. How might Theorem 8.38 be extended to include nondiagonalizable matrices?

to the standard basis.


a. Find the T-minimum polynomial of EJ.
b. What is the mínimum polynomial of T?
,.'·

APPENDIX

Determinants

§ 1.A Determinant Fu nction


§2. Permutations and Existence
§3. Cofactors, Existence, and Applications
406 APPENDIX: DETERMINANTS §1. A Determinant Function 407

A computational definition for the determinant was given in Chapter 3. then


This approach was sufficient for the purposes of computation, but it djd not
provide a proof of the important fact that a determinant may be ex¡:ianded
along any row or any column. Now that we are familiar with the concepts
of linearity and bilinearity, it is reasonable to take an abstract approach and
view the determinant as a multilinear furiction. We have also introduced
arbitrary fields since Chapter 3. So the determinant should be defined for If the second column of A is multiplied by 4, then the determinant of the
matrices in A nx n(F), that is, for matrices with en tries from an arbitrary new matrix is
field F.
dot(A,, 4A,, A,)~ dot(l -8 2) = (3
4 1 4 det 2
12 o 5
§1. A Determinant Function

And the function V~ det(A 1 , V, A 3 ) is given by


We begin by redefining determinants using as a guide the properties of
the determinant obtained in Chapter 3. By definition, det is a function that
assigns a real number to each real n x n matrix. However, because of the ·
simple effect of elementary row and column operations on a determinant, it
may well be viewed as a function ofthe n rows or the n columns. For example,
suppose A 1 , ••• , An are the n columns of the n x n matrix A, and write
This function is clearly a linear map from A 3 x 1(R) to R.
A = (A 1, ••• , An). Then det A = det(A 1 , ••• , An), and det takes on the
appearance of a function of n (vector) variables. Now the property relating
to an elementary column operation of type II takes the form Definition Let "Y be a vector space and fa function of n variables
U¡, ... ' un E "Y.fis multi/inear if the function
det(A 1 , ••• , rAk> ... , An) = r det(A 1 , ••• , Ak, ... , An).
V ~f(U1 , ... , Uk_ 1, V, Uk+!> ... , Un)
That is, as a function of n columns, the· function det preserves scalar mul-
tiplication in the kth variable. Or the function given by is linear for each k, 1 ::;; k ::;; n, and for each choice of vectors

V~ det(A 1, •.. , Ak_ 1 , V, Ak+ 1 , ••• , An) U¡, ... ' uk-1> uk+I> ... ' un E "Y.

from Anx 1(R) to R preserves scalar multiplication. We could go on to show Therefore the claim is that det is a multilinear function of the n columns
that this function is linear or that, as a function of the n columns, det is linear in an arbitrary matrix. We will not prove this, but simply note that this result
in each variable. was obtained for 2 x 2 matrices in problem 6, page 246.
We found that an elementary column operation of type 1, interchanging
columns, changes the sign of det A. This property will be introduced with
Example 1 Suppose
the following concept.

Definition A multilinear functionfis alternating if /(U1, ..• , Un) =O


whenever. at least two of the varüÍbles U¡, ... , Un are equal.
408 APPENDIX: DETERMINANTS §1. A Determinant Function 409

If two columns of a real matrix are equal, then the determinant is zero For the matrix with Ah + Ak in both its hth and kth columns,
so if the function det is multilinear, then it is alternating.
We are now prepared to redefine determinants. O= D(A 1, ••• , Ah+ Ak, ... , Ah+ Ak, ... , A.)
= D(A 1 , ••• , Ah, ... , Ah, ... , A.) + D(A 1, ••• , Ah, ... , Ak, ... , A.)
Definition A determinantfunction D from ~.x.(F) to Fis an alternat-
+ D(A 1 , ••• , Ak, ... , Ah, ... , A.) + D(A 1, ••• , A k> ••• , Ak, ... , A.)
ing multilinear function of the n columns for wlúch D(I.) = l.
= D(A 1 , ••• , Ah, ... , Ak, ... , A.) + D(A 1 , . , . , Ak, ... , Ah, ... , A.).
Therefore a determinant function D must satisfy the following conditions
This shows that the two functional values are additive inverses in F, establish-
for all A 1 , • •• , A. E ~.x 1(F) and all k, 1 s k s n:
ing the theorem.
l. D(A 1 , ••• ,Ak + Aí,, ... ,A.)
= D(A 1 , ••• , Ak> ... , A.) + D(A 1 , ••• , Aí,, ... , A.). Skew-symmetry is not equivalent to alternating. However, in a field
2. D(A 1 , ••• , cAk, ... , A.) = cD(A 1 , ••• , Ak> ... , A.), for all e E F. satisfying 1 + 1 ~ O, a skew-symmetric multilinear function is alternating,
3. D(A 1 , ••• , A.) = Oif at least two columns of A = (A 1 , ••• , A.) are see problem 6 at the end of this section.
equal. To find an expression for D(A), write each column Ak of A in the form
4. D(I.) = D(E 1 , ••• , E.) = 1 where E 1 , ••• , E. is the standard basis
for ~.x 1(F). These vectors are actually Ei, ... , EJ, but the transpose nota- n
tion will be dropped for simplicity. Ak = I;a;kE¡.
¡;¡

So little is required of a determinant function that one might easily For example, if
doubt that D could equal det. In fact there is no assurance that a determinant
function D exists. One cannot say, "of course one exists, det is obviously a
determinant function," for the proof of this statement relies on the unproved
result about expansions along arbitrary columns.
Assuming that a determinant function exists, is it clear why the condition
D(I.) = l is assumed? Suppose there are many aiternating multilinear func- then
tions of the columns of a matrix. This is actually the case, and such a con-
dition is required to obtain uniqueness and equality with det.
We will proceed by assuming that a determinant function exists and
search for a characterization in terms of the entries from the matrix A. The
first step is to note that D must be skew-symmetric. That is, interchanging two
variables changes the sign of the functional value.
When every column is written in this way it will be necessary to keep track
ofwhich column con tributes which vector E;. Therefore we introduce a second
Theorem A1 If D exists, then it is skew-symmetric. That is, inter- index and write
changing two columns of A changes D(A) to - D(A).

Ak = I" a;.kE;•.
Proof It must be shown that if 1 ::s; h < k ::s; n, then Ík= 1

D(A 1 , ••• , Ah, ... , Ak, ... , A.) = -D(A 1, ••. , Ak, ... , Ah, ... , A.). So in the above example, A 3 = 2E 1 , + 8E2 , + 5E3 ,. With this notation, A
-~
l
410 APPENDIX: DETERMINANTS 1'
§1. A Determinant Function 411
J
)
takes the form 11
But D must be skew-symmetric and D(E 1 , E 2 ) = l. Therefore,

A = (A 1 , ••• , A.) l D(A) = a11 a22 a21 a 12 •


l -

That is, if D exists, then it equals det on 2 x 2 matrices.


This example suggests the general situation, but it may be useful to
compute D(A) for an arbitrary 3 x 3 matrix using the above formula. You
Now if a determinant function D exists, it is linear in each variable. Therefore,
writing A in this way shows that should find that D(A) is again equal to det A when A ís 3 x 3.

n n n There are now two ways to proceed in showing that a de~I!linant func-
D(A) =. L a;, I L a; 22 • • • L1 a;.,.D(E;,, E; 2
, • •• , E;,,). · tion exists. The classical approach is to show that such a function can only
l! = l 12::::::1 i 11 :;:;
assign one value to each matrix of the form (E¡,, ... , E;). Of course, if two
columns are equal, then the val u e must be O. But if no two columns are equal,
Example 2 Use the above formula to compute D(A) when A is the then all the vectors E 1 , ••• , E. are in the matrix., So using enough inter-
2 x 2 matrix changes, the sequence E;,, ... , E;., can be rearranged to E 1 , ••• , E•. That is

D(E;,, ... , E;) = ± D(E¡, ... , En) = ±l.


In the next section we will show that this sign is determined only by the order
In this case the expression ofthe índices i 1, ••• , in. A more abstract approach to the problem of existence
will be considered in Section 3. Using this approach we will ignore this sign
problem and obtain an expression for D(A) in terms of cofactors.
Before proceeding to show that a determinant function exists, it should
be pointed out that if such a function exists, then it must be unique, for a
beco mes function is single valued. So if•D exists, then D(E:;,, ... , E;.,) has a unique
value for each choice of i 1 , ••• , in. This value depends only on the índices
i 1 , ••• , i. and the value of D(E 1 , ••• , E.). Since D(E1 , ••• , E.) is required
to be 1, we have:

Therefore Theorem A2 If a determinant function exists, then it is unique.

Problems

and 1. Assuming a determinant function exists, let {E¡} be the standard basis in
A 3 x 1 (F) and find
2 2 a. D(E¡, EJ, E 2). c. D(EJ, E2, E 1).
D(A) = L a¡¡~ il=l
it=l
L a;, 2 D(E;,, E;) b. D(E2 , E¡, E 2). d. D(E3, E~, E2).
2. Assuming a determinant function exists, let {E,} be the standard basis in
= + a11 a22 D(E¡, E2 )
a 11 a 12 D(E1 , E 1) .Jt4x t(F) and find

+ a21 a12 D(E2 , E 1) + a21 a22 D(E2 , E 2 ). a. D(E~, E3, E4, E2). c. D(E2, E4, E¡, E3).
b. D(E3, E2, El> E 3). d. D(E¡, E~, E~, E 1 ).
APPENDIX: DETERMINANTS §2. Permutations and Existence 413
412

3. Show that the function W- det(U¡, ... , Uk-t• W, Uk+t• ... , Un) is a linear identity in & •. And finally since permutations are 1 - 1, they have inverses,
function from .lfn x 1 (R) to R for any n - 1vectors U¡, ... , Uk- ¡, Uk+ ¡, . _.. , Un so &. is a group. The proof of noncommutativity is left as a problem.
from .((n X l (R).
4. Show that conditions 1 and 2 following the definition of a determinan! function In the previous section we found that D(A) is expressed in terms of
are equivalen! to conditions 1* and 2 if 1* is given by: D(E;,, . .. , E;.) and the value of D(E;,, . .. , E;) depends on the arrange-
1.* D(A¡, ... , Ah+ cAk, ... , An) = D(At. ... , A 1, ••• , An) ment of the integers i 1 , ••• , i •. If this arrangement contains all n integers,
then it is a permutation in&. given by CT(l) = i 1, ••• , CT(n) = i•. Therefore
for any two columns. Ah and A~, h -1' k, and any e E F.
the expression obtained for D(A) might be written as
5. Prove that if U¡, ... , Un are linearly dependen! in .ltnx 1(F) anda determinan!
function exists, then D(U¡, . .. , U,.) =O.
6. Suppose f is a multilinear function of n variables, and F is a field in which
1 + 1 -1' O. Prove that if f is skew-symmetric, then f is alternating.
Still assuming such a function D exists, the problem now is to determine
7. Could the definition of a determinan! function ha ve been made in terms of rows
how the sign of D(Ea(tl• ... , Ea(n)) depends on CT.
instead of columns?
8. Suppose f is a multilinear function such that f( U¡, ... , Un) = O whenever two
adjacent columns are equal. Prove that/is alternating. Definition A transposition is a permutation that interchanges two
elements and sends every other element to itself.
9. Suppose f is a skew-symmetric function of n variables. Pro ve that if f is linear·
in the first variable, then f is multilinear.
For a given permutation CT E&"' or i 1, ••• , i"' there is a sequence of
transpositions that rearranges i 1, ••• , Í11 into the normal order l, ... , n.
That is, there is a sequen ce of transpositions which when composed with the
permutation CT yields the identity permutation l.
§2. Permutations and Existence
~
A permutation is a one to one transformation of a finite Example 1 Use a sequence of transpositions to rearrange the
Definition
permutation 4, 2, 1, 5, 3 into l, 2, 3, 4, 5.
set onto itself.
This may be done with three transpositions.

We are intersted in permutations of the first n positive integers and will 4, 2, 1, 5, 3---71,2,4, 5, 3 interchange 4 and l
Jet &. denote the set of all permutations of {I, ... , n}. For example if
---7 1, 2, 4, 3, 5 interchange 5 and 3
CT(I) = 3, CT(2) = 4, CT(3) = 1, CT(4) = 5, and CT(5) = 2, then CT is a member of
& 5 . Anot~lement of & 5 is given by r( 1) = 2, r(2) = 1, r(3) = 3, r(4) = 5, ---7 1, 2, 3, 4, 5 interchange 4 and 3.
r(5) = 4. Since permutations are 1 - 1 maps, their compositions are 1 - 1,
so two permutations may be composed to obtain a permutation. For CT and But this sequence of transpositions is not unique, for"example:
r above, CT o r is the permutation 11 E & 5 given by .u( 1) = 4, .u(2) = 3, .u(3) = 1,
.u(4) = 2, and p(5) = 5. Find ro CT, and note that it is not ¡1. 4, 2, 1, 5, 3 """2,1' 4, 1, 2, 5, 3---¡;¡' 1, 4, 2, 5, 3 4:2' 1, 2, 4, 5, 3
---s,3' 1, 2, 4, 3, 5 ----¡;-y 1, 2, 3, 4, 5
Theorem A3 The set of permutations r!/ 11 together with the operation
of composition is a group. If n > 2, then the group is not commutative. where the numbers under each arrow indicate the pair interchanged by a
transposition.
Proof Closure of & 11 under composition was obtained above.
Associativity is a general property of composition. The identity map I is the Since both sequences in Example 3 contain an odd number oftransposi-
414 APPENDIX: DETERMINANTS §2. Permutations and Existence 415

tions, either sequence would give Theorem A4 Every transposition is an odd permutation.

Proof Suppose u E EJJ. is a transposition interchanging h and k


But if 4, 2, 1, 5, 3 can be transformed to 1, 2, 3, 4, 5 with an even number of with h < k. Then there are 2(k - h - 1) + 1 (an odd number) factors in
transpositions, then the determinant function D does not exist. Therefore Tii<i (xu(i) - xuu>) that must be multiplied by -1 to obtain Tii<i (x; - xi).
If h < i < k, then k - h - 1 of these factors are
it is necessary to prove that only an odd number of transpositions will do the
job. The general proof of such an "invariance of parity" relies on a special
polynomial.
The product of all factors X; - xi for 1 ~ i < j ~ n is written as
Another k - h - 1 factors in the wrong order are
n
TI (x;- xi).
i<j

and finally
Denote this polynomial by P(x 1, ••• , x.). Then

These are all the factors that must be multiplied by -1 since all other choices
and so on. for i < j yield u(i) < u(j). Thus u is an odd permutation.
Por u E EJJ., apply u to the índices in the polynomial P(x 1 , ••• , x.).
Then for the índices in xu(i) - xuU>• either u{i) < u(j) or u(j) < u(i) and
Xu(i) - xu(j) = -(xa(j) - Xa(i)). Therefore the factors in Tii<j (xu(i) - Xu(j)) Theorem A5 If i 1, ••• , i. gives an even (odd) permutation in EJJ. and
differ from the factors in Tií'<i (x; - x) by at most multiples of -1 and there exist m transpositions that transform i 1 , .•• , i. into 1, ... , n, then
mis even (odd).
P(xa(l)> . .. 'Xa(n)) = ±P(x¡, .. . 'x.).
Proof Suppose u is an even permutation given by a(l) = i 1, ••• , u(n)
Definition A permutation u E EJJ. is even if = i. and r 1 , ••• , rm are transpositions such that r"' o · · · o r 1 o u = J. Since
1 is an even permutation, app1ying the permutation rm o · · · o r 1 o u to the
P(xa(l\> ... 'Xa(n)) = P(x¡, ... 'x.) indicies in P(x 1 , • •• , :X.) leaves the polynomial unchanged. But u leaves
P(x 1 , ••• , x.) unchanged, and each transposition changes the sign. There-
and odd if fore the composition changes the polynomial to ( -1)mP(x 1 , ••• , x.) and
m must be even. Similarly m is odd if u is an odd permutation.
P(xa(l)• ... ' Xa(n)) = - P(x¡, ... ' x.).

Definition The sign or signum of a permutation u, denoted sgn u, is


Examp/e 2 Let u E fJJ 3 be given by u(1) = 3, u(2) = 2, and u(3) = 1, 1 if u is even and -1 if u is odd.
then u is odd for
·"'· If there exist m transpositions that transform a permutation into the
P(xa(l)> Xa(2)• Xa(3)) = (Xa(l) - Xa(2))(Xa(1) - Xa(3))(xa(2) - Xa(3))
normal order, then sgn u = ( -1)"', and we can write
= (x3 - Xz)(x3 - x 1)(x2 - x 1)
= (-1) 3 (x2 - x 3)(x1 - x 3)(x1 - x 2) P(xa(l)• . .. , Xa(n)) = (sgn u)P(x 1, . .. , x.).
rr
= -P(x 1, x 2, x 3). A value for m can be obtained by counting the number of inversions in the
416 APPENDIX: DETERMII\W.NTS §2. Permutations and Existence 417

permutation a. An inversion in a permutation occurs whenever an integer To show that D is alternating, suppose the hth and kth columns of A
precedes a smaller integer. Thus the permutation 2, 1, 5, 4, 3 contains four are equal. Then if -r is die transposition interchanging h and k,
inversions. Each inversion in a permutation can be turned around with a
single transposition. l;herefore if a has m inversions, then sgn a = ( -l)m. D(A) =L (sgn Jl)ap(I)I • • • ap(n)n + L (sgn t o Jl)a,.p(I)l • • • a<•p(n)n·
p.e.án J.UE&In

Example 3 The permutation given in Example 1 contains five For each J1 E d., sgn J1 = 1 and sgn -ro f1 = -l. Further since the hth and
inversions, therefore sgn (4, 2, 1, 5, 3) = ( -1) 5 = -l. The second sequence kth columns of A are equal,
of transpositions used to transform 4, 2, 1, 5, 3 to 1, 2, 3, 4, 5 is a sequence
that inverts each of the five inversions.

Now if E¡,, ... , E¡. is a rearrangement of E 1 , ••• , E., then the number
of interchanges required to obtain the standard order depends on whether Therefore D(A) = O, and D is alternating.
i 1 , • •• , i. is an even or odd permutation. Therefore
Problems
D(E¡" ... , E¡..) = sgn(i 1 , •.• , i.)D(E 1 , •• • , E.)
== sgn(i 1 , . . . , i.), l. Write out all the permutations in f!i'3 and f!i' 4 and determine which are even
and which áre odd.
and the formula for D(A) obtained on page 410 becomes 2. Let a, rE f!i' 5 be given by a(I) = 4, a(2) = 2, a(3) = 5, a(4) = 1, a(5) = J
and r(l) = 5, r(2) = 1, r(3) = 2, r(4) = 4, r(5) = 3.
D(A) = L (sgn
aefJJn
a)au(I)Iau(Z)Z • • • a .. (n)w Find a.ao-r. b.roa. c.a- 1 • d.r- 1 •
3. Suppose a E f!i' 3 satisfies a(l) = 2, a(2) = 3, and a(3) = l. Find all permuta-
Soto pro ve the existence of a determinant function it is only necessary to show tions in f!i' 3 that commute with a, that is, al! rE f!i' 3 for which a o r = ro a.
that this formula does in fact define such a function. 4. Prove that if n > 2, then f!i'. is not a commutative group.
5. Show that sgn a o r = (sgn a)(sgn r) for all a, rE f!i',..
Theorem A6 The function D: vfl.x.(F)--> Fdefined by 6. a. Write out P(x,, Xz, x 3 , x4).
b. Use this polynomial to find the sign of a and r if a is given by a(I) = 4,
D(A) = L (sgn a)a<I(l)l . . . au(n)n for A =(a¡)
a(2) = 2, a(3) = 1, a(4) = 3 and r is given by 1, 4, 3, 2.
qe!fl'n

7. How many elements are there in f!i'.?


is a determinant function.
8. a. Show that .91, = {,u E f!i' .¡,u is even} is a subgroup of f!i',.
b. Suppose rE f!i'. is a transposition. Show that each odd permutation a E f!i',
Proof Showing that D is multilinear and that D(I.) = 1 is not can be expressed as r o ,u for sorne ,u E .91,.
difficult and is left to the reader. 9. Find the number of inversions in the following permutations. Use the number
In proving that D is alternating we will use the alternating group of of inversions to determine the sign of each.
even permutations, d. = {Jl E &>.jsgn J1 = 1}. lf -r is a transposition in a. 5, 4, 3, 2, l. c. 5, 1,4,2,3.
&>., then b. 6, 5, 4, 3, 2, l. d. 4, 3, 6, 2, 1, 5.
10. Suppose A = (A 1 , ••• , A,) and B = (b 11 ) are n x n matrices.
a. Show that the product may be written as

see problem 8 below. AB =(f. b¡¡A¡, ... , 'f, b,A,).


i=l l=l
1
418 APPENDIX: DETERMINANTS §3. Cofactors, Existence, and Applications 419

b. Use the definition given for D in Thoerem A6 and part a to prove that Therefore
D(AB) = D(A)D(B).
11. Suppose 1 is an alternating multilinear function of k variables and a E &k.
Prove thatf(V. 0 ¡, . . . , V•<k>) = (sgn a)f(V" ... , Vk).
Now since CAkl, ... , A~kl) is the (n - 1) x (n - 1) matrix obtained from
A by deleting the kth row and the 1st column, we can give an inductive defini-
tion of D just as was done in Chapter 3.
§3. Cofactors, Existence, and Applications

<Si-:__

We will obtain another definition for D(A) by again assuming that a Theorem A7 Define D: J!tnxnCF)-+ Finductively by D(A) = a 11 when
determinant function exists. First, since D is to be linear in the first variable n = 1 andA= (a 11 ); and

'
D(A) = D(A¡, A2, . .. , An)
n
= L ak¡( -l)k+I D(A~kl, ... 'A~kl) wheri n > l.
k=!

Then D is a determinan! function for each n.


This expresses D(A) in terms of D(Ek, A 2 , ••• , An). But since D is alternating,
D(Ek, Az, . .. , An) does not involve the kth entries of A 2, . .. , Aw Therefore,
if A~k) denotes the hth column of A with the kth entry deleted, Aik) = Proof It is clear from the definition that D is linear in the first
L i*k a¡hE¡, and D(Ek> A 2 , •• • , AJ differs from D(Aj,k>, . .. , A~kl) by at
variable, and a straightfoward induction argument shows that D is linear in
most a sign. Write every other variable.
To prove that D is alternating, first note that the definition gives

Th¡; value of e can be found by setting the columns of A equal to basis vectors.
n(~ !) = a(-l)1+ D(d) +
1
c(-1) 1 + 2D(b) = ad- cb.
For
·-·
} So
D(Ek, E 1 , ••• , Ek_ 1 , Ek+ 1 , ••• , EJ
= cD(E{k>, ... , m".! 1 , EP') 1 , ••• , E!kl) D(~ ~) = ac - ca

= cD(/(n-!)x(n-1)) =C.
and D is alternating for 2 x 2 matrices. Therefore assume D is alternating
On the other hand, D must be skew-symmetric, so for (n - 1) x (n - 1) matrices, n > 2. If A is an n x n matrix with two
columns equal, then there are two cases.
D(Ek, E¡, ... , Ek-t• Ek+t• ... , En) Case/: Neither ofthe equal columns is the first column of A In this case
1 two columns of (Aj_k>, ... , A~k)) are equal, so by the induction assumption
= ( -l)k- D(E¡, .. . , Ek_ 1 , Ek, Ek+ 1 , ••• , En)
D(A~k>, ... , A~kl) = O for each k and D(A) = O.
= {-l)k-tD(IJ = (-l)k-1. Case l/: The first column of A equals sorne other column. _Since case 1
and the multilinearity of D imp1y that D is skew-symmetric in all variables
This gives e= (- v-t, but itwill be better to write e in the form ( -J)k+t. but the nrst, it is sufficient to assume the first two columns of A are equal.
420 APPENDIX: DETERMINANTS
§3. Cofactors, Existence, and Applications 421

Thus

D(A) = D(A 1 , A 1 , A 3 , ••• , A.)


n
= "'a
1-J kl
(-l)k+tD(A<k>
1 '
A<k)
3 ' · • · '
A<k>)
n ·
k=t n
= L ak,,D(A ¡, ... 'A¡,-¡, Ek> Ah+¡, ... ' A.)
k=!
To evaluate the terms in this expression, Jet AY·h) denote the jth column of
n
A with the hth and kth entries deleted, so
-- '-'
"' akh ( - I)k+hD(A<k>
1 , ••. ' A<kl
h-1, A<k>
h+ 1, ... ' A<k>)
n •
k=t
A(k,h)
J
= "'
,t_,
aIJ. .E.¡•
i*k,h For the last equality, note that if k < h, then
:::
Then '
-~ D(E 1, ... , Ek-t• Ek+ 1 , ••• , E¡,- 1, E¡,, Ek, Eh+ 1, ••• , E.)
= ( -l)h-(k+tl+tD(E1 , ••• , Ek-t• Ek, Ek+ 1 , ••• , E¡,- 1, Eh+ 1 , ••• , E.)
k-t = ( -l)h-kD(I.) = ( -It+k.
= L a¡,¡(-l)h+!D(A~k,h), ... , A~k,ll))
h= 1 The case k > h is handled similarly. Now the matrix
n
+ L a¡,¡( -1/'D(Ak,h), ... , A~k,hl),
h=k+t

and is obtained from A by deleting the kth row and the hth column. Therefore wc
will denote ( -Ir"D(A~k>, ... , A~':! 1 , A~~\, ... , A~k>) by Akh and call ir
D(A) = D(A 1, A 1 , A 3 , ..• , A.) the cofactor of akh· This terminology will agree with that of Chapter 3 when
n k-! we show that the functions D and det are equal. We will say that D(A 1
="' "'a a (-l)k+h+tD(A<k,hl
i.J ./..,¡ kL hl 3 ' · • ·'
A<k,hl)
n = L k= 1 ak 1,Akh is the expansion by cofactors of D(A) along the hth column of A.
k= 1 h= 1
Writing D(A) = L k= 1 ak,Ak 1, shows that all the terms containing a¡i for
n n
+ L L aklah1(- v+hD(Ak,h), ... , A~k,h)). any i and j are in the product aiiAii. But each ter m in D(A) contains exactly
k= 1 h=k+ 1 one element from each row. Therefore if the terms in D(A) are grouped ac-
cording to which element they contain from the ith row, then D(A) must
The product a; 1ai 1 appears twice in this expression for each i =1= j, and since ·~ the form
the coefficients are ( -l)í+ i+ 1 D(AU.i>
3 ' · • • '
A<n i,il) and (-J)i+ ;D(AU,i>
3 , · .. ,
A~i,i>), D(A) = O. Therefore D is alternating.
To complete the proof it is necessary to show that D(l.) = l. This is
proved by induction, noting that if k =1= 1, then the multilinearity of D implies The equality D(A) = L J= 1 a;iAii gives the expansion by cofactors of D(A)
that D(E~kl, . .. , E~k¡) = O. along the ith row of A.

Since there is only one determinant function, the functions defined in Theorem A8 The functions D and det are equal on vil,. x .(R).
Theorems A6 and A 7 must be equal. The first step in showing that this func-
tion equals det is to generalize the idea u sed above. If we write the hth column
Proof By induction on n. D and det are equal for 1 x 1 real ma-
422 APPENDIX: DETERMINANTS §3. Cofactors, Existence, and Applications 423

tríces. Therefore assume they are equal for (n - 1) x (n - 1) real matrices, Then replacing the columns of E's with A's gives
n > 1, and Jet A be a real n x n matrix. Then expanding D(A) along the first
row gives

n
D(A) = ~:aliA 1 i So det AB = det B det A = det A det B.
j=l
We may also prove Cramer's rule (Theorem 4.22) directly from the
n
1 > A< 1 l properties of the determinant function. Suppose AX = B is a system of n
- ~
-.t..Jalj (-1) 1 +iD(A<11 >, . . . , A<j-1, A<n1>)
j=!
j+l, ... ,
linear equations in n unknowns. If A = (A 1 , ••• , A"), then the system can
n be written in the form --~
= L a1 j( -1) 1 +idet (minor of a 1j) by induction assumption
j=1 n
= det A. or 1:x¡A¡ =B.
i= 1

We will now drop the notation D(A) and write det A for the deterrninant Now since det is an alternating multilinear function
of A E .41nxn(F). Theorem A8 shows that our two definitions of cofactor are
equivalent, and it supplies a proof for Theorem 3.13 which stated that det A
could be expanded by cofactors along any row or any column. We might now det(A 1 , ••• , Aj-!> .i
•=1
X¡A¡, Aj+ ¡, ... , An)
prove that the basic properties of determinants hold for matrices with enties
from an arbitrary fiel d. The fact that det Ar = det A is still most easily pro ved = xi det(A 1 , •.. , Aj~J> Ai, Aj+l• ... , An)
by induction, but the other properties can be obtained quite easily using the = xi det A.
fact that det is an alternating multilinear function. For example, the fact
that the determinant of a product is the product of the determinants (Theorem But L 7= 1 x ;A¡ = B, so
4.26) miglit be proved over an arbitrary field as follows:
Let A = (a¡) = (A 1 , ••• , A") and B = (bii) be n x n matrices. Then .¡. det(A 1 , ••• , Aj-!> B, Ai+ 1 , ••• , An) = Xj det A.
the jth column of AB has the form
Therefore if det A =1 O,

det(A 1, ••• , Ai_ 1 , B, Ai+I• ... , A")


xi = det A ·

It is now possible to derive the adjoint formula for the inverse of a


matrix using Cramer's rule, see problem 4 at the end of this section.
As a final application of the determinant function we will redefine the
Therefore
cross product in tC 3 and prove that

IIU X Vil= IIUIIIIVII sin e.

Since the function det is multilinear and skew symmetric, Definition The cross product of U, V E c3, denoted by u X V, ís the
vector in e3 that satisfies
for all W E <f3.
424 APPENDIX: DETERMINANTS §3. Cofactors, Existence. and Applications 425

The existence of a vector U x V for each pair of vectors from C3 is a We are also in a position to show that the cross product is invariant
consequence of problem 6, page 286. Since this is a new definition for the under rotations, which is equivalent to showing that the cross product of two
cross product, it is necessary to show that it yields the computational formula vectors can be computed using their coordinates with respect to any ortho-
given in Chapter 3. normal basis having the same orientation as the standard basis.

Theorem A9 Theorem A11 Suppose T: C 3 --+ C 3 is a rotation. Then

T(U) x T(V) = T(U x V) for all U, V E C 3 .

Proof Let P be the matrix of T with respect to the standard basis.


Proof This follows at once from the fact that the components of If the transpose notation is omitted for simplicity, then we can write T(W)
U x Vare given by U x V o E; when i = 1, 2, 3. For example, when i = 1, = PWfor any WE C 3 and

det(;~ ~~ ~) =
T(U) x T(V) o W = det(T(U), T(V), T(T- 1 (W)))
U x V o E1 = det(x 2Y2)
X3 Y3
= det(X2 X3).
h Y3 = det(PU, PV, PT- 1(W))
x3 Y3 O
= det P det(U, V, T- 1 (W))
This theorem shows that if the standard basis vectors are treated as =u X V o r- 1(W) Since det P = 1
numbers, then
= T(U x V) a T(T- 1 (W)) Since T is orthogonal
x, y, = T(U x V) o W.
U x V = det x 2 Y2
(
X3 Y3 Sin ce this holds for all W E C 3 , it implies that T( U) x T(V) = T( U x V).

The determinants of these two matrices are equal because the second matrix
is obtained from the first by interchanging two pairs of columns and then Theorem A12 If U and V are orthogonal unit vectors, then
taking the transpose. U x Vis a unit vector.
The basic properties of the cross product are now immediate conse-
quences of the fact that det is an alternating multilinear function. Proof There exists a rotation T that sends U to E 1 and V to E 2 •
therefore, using the fact that T is orthogonal we have
Theeorm A1 O If U, V, W E C 3 anda, bE R, then
1. U x Vis orthogonal to U and V. IIU X VIl = IIT(U X V)ll = IIT(U) X T(V)II = IIE, X E211 = IIE311 =l.
2. The cross product is skew-symmetric, that is, U x V = - V x U.
3. The cross product is bilinear, that is, Now the fact that 11 U x VIl = 11 Ull 11 VIl sin O follows from the next
theorem.
U x (aV + bW) = aU x V+ bU x W
Theorem A13 e
Suppose is the angle between U and V. Then there exists
and a unit vector N, orthogonal to U and V, such that

(aU + bV) x W = aU x W + bV x W. uX V= IIUIIIIVII sin e N.


426 APPENDIX: DETERMINANTS §3. Cofactors. Existence, and Applications 427

Proa! If U and V are Jinearly dependent, say V = rU, then 6. Prove that the cross product is a. skew-symmetric. b. bilinear.
U x V o W = det(U, rU, W) =O for all Wetff 3 , and U x V= O. In, this 7. a. Show that the parallelogram with vertices O, U, V, U + V in cf 3 has area
e
case, sin = O and the equation holds for any vector N. IIUx VIl.
Therefore suppose U and V are Jinearly independent. Use the Gram- b. Find the area of the parallelogram with vertices O, (2, 3, 1), (4, -5, -2),
Schmidt process to build an orthonormal basis {X, Y} for if{U, V} from and (6, -2, -1).
the basis {U, V}. Then X = U/IIUII and the angle between V and Y is either c. Use part a to find a geneneral formula for the area of the triangle with
90° - e ore -90°. Since {X, Y} is an orthonormal basis, vertices A, B, e E cf 3·
d. Find the area ofthe triangle with vertices (3,- 2, 6), (1, 5, 7), and (2, -4, 1).

V= (V o X)X +(V o Y)Y 8. Consider the set cf 3 together with the operation given by the cross product.
How el ose does (rf 3 , x) come to satisfying the definition of a group?
= (V o X)X + IIVIIII Yll COS ±(e- 90°)Y 9. a. Extend the definition of cross product to cf 4·
= (V o X) X + 11 VIl sin e Y. b. Suppose U, V, and Ware linearly in dependen! in cf 4 • Show that U x V x W
is normal to the hyperplane spanned by U, V, and W.
Therefore c. Find (1, O, O, O) x (0, 1, O, O) x (0, O, 1, O).
d. Find (2, O, 1, O) X (0, 3, O, 2) X (2, O, O, 1).
uX V= (IIUIIX) X ((V o X)X + IIVII sin e Y) 10. Suppose 1' is an n-dimensional vector space over F. A p-vector on 1' is an ex-
pression of the form rU1 1\ U 2 1\ • • • 1\ u. where rE F and U t. .•• , u. E 1',
= /IUIIIIV/1 sin(} X x Y. for 2 S p s n. p-vectors are required to satisfy
i. U1 1\ • • • 1\ (aUt + bU[) 1\ • · • 1\ u. = aU 1 1\ • • • 1\ U1 1\ • • • 1\ u.
X x Y is a unit vector by Theorem Al2. It is orthogonal to X and Y by + bU1 1\ • · • 1\ U{ 1\ • • • 1\ u. for i = 1, ... , panda, bE F.
Theorem A 10, and sois orthogonal to U and V. Therefore, setting N = X x Y ii. U1 1\ • • • 1\ Up = Oif any two vectors U1 and U1 are equal, i =,6 j.
U 1\ W, read "U wedge W," is called the exterior product of U and W. The
gives the desired equation. N o ti ce that when U and Vare linearly independent,
vector space of all p-vectors on 1', denoted by A•r, is the set of all finite sums
N= U x V/11 U x VIl, as expected.
of p-vectors on '"Y.
a. Show that in N'"Y 2 , (a, b) 1\ (e, d) = [ad- bc](1, O) 1\ (0, 1).
Corollary A14 The set {U, V} is linearly dependent in tff 3 if and only if b. Show that dimN1' 3 = 3 and that dimA 21'4 = 6.
U x V= 0. c. Prove that dimA"'"Y = 1 by showing that if {V¡, ... , Vn} is a basis for '"Y,
then {V1 1\ • • • 1\ Vn} is a basis for A"'"Y.
d. For TE Hom('"l'), defineAT: N'"!'- A"'"Y by
Problems AT('L rU1 f... • • • 1\ Un) = 'L rT(U1 ) 1\ • • • 1\ T(Un).
Show that th.ere exists a constant 1TI such that
l. Show that the formulas in Theorems A6 and A7 give the same expression for T(W¡) 1\ ••• 1\ T(Wn) = ITI wl 1\ ••• 1\ Wn for all Wt. ... ' Wn E '"Y.
D(A) when A is an arbitrary 3 x 3 matrix. Call 1TI the determinant of T. Notice that 1TI is defined without reference
toa matrix.
2. Using the definition for D given in Theorem A7,
a. · Prove that D is multilinear. e. Prove that if S, Te Hom('"Y), then IS o TI= ISIITI-
b. Prove that D(ln) = l. f. Prove that if A is the matrix of T with respect to sorne basis, then 1TI =
det A.
3. Show that when k > h,
D(A~> ... , Ah-~> Bk, Ah+~> ... , An)
= (-l)k+hD(A~k>, ... , A1':\, A1';\, ... , A~k>).
4. Use Cramer's rule to derive the adjoint formula for the inverse of a nonsingular
matrix.
5, Show that U X Vis orthogonal to U and V.
Answers and Suggestions
430 ANSWERS ANO SUGGESTIONS Chapter 1 431

e. Property of OE R and the fact that equations have unique solutions in


R. (The Iatter can be derived from the properties of addition.)
Chapter 1 f. Definition of the zero vector.
8. Por U= (a, b) E "f/' 2 , assume there is a vector TE "f/' 2 such that U+ T =O
§1 Page 4 and show that T = (-a, -b).
9. All these properties are proved using properties of real numbers. To show that
1. a. Seven is a real number. b. J- 6 is not a real number. (r + s)U = rU + sU, let U= (a, b), then
c. The set containing only O and 5. [r + s]U = [r + s](a, b)
d. The set of all x such that x is a negative real number. = ([r + s]a, [r + s]b) Definition of scalar multiplication in "Y 2
e. The set of all real numbers x such that x 2 = -l. Notice that this s~ 1 s = (ra + sa, rb + sb) Distributive law in R
empty. = (ra, rb) + (sa, sb) Definition of addition in "Y 2
= r(a, b) + s(a, b) Definition of scalar multiplication in "Y 2
2. a. 1 + 1 = 2. b. Consider the product (2n 1 + 1)(2n 2 + 1). = rU+ sU.
3. a. Closed under multiplication but not addition. b. Neither. 10. b. If U= (a, b), then from problem 8, -U= (-a, -b). But ( -1)U
c. Closed under addition but not multiplication. d. Both. e. Both. = (-1)(a, b) = ((-1)a, (-1)b) = (-q, -b).
4. Addition in R is eommutative if r + s = s + r for all r, sE R. 11. Subtraction would be commutative if x - y = y - x for all x, y E R. This
5. (r + s)t = rt + st can proved using the given distributive Iaw and the fact does not hold for 2 - 5 = -3 while 5 - 2 = 3. This shows why we use
that multiplication is commutative. Can you prove this? signed numbers.

§2 Page 6 §3 Page 12

1. b. If (x, y) is a point on the Iine representing all scalar multiples of {3, -4),
1. a. (5, -3). b. (-1, 0). c. o. d. (1/4, 7/3). then (x, y) = r(3, -4) for sorne rE R. Therefore x = 3r and y = -4r, so
2. a. (2, -5/2). b. o. c. (6, -1). d. ( -3, 6). x/3 = r = yf-4 and y= (-4/3)x.
3. a. (2, 10). b. ( -1/3, -1/3). c. (5, -1). d. (13/2, 1). 3. The three points are collinear with the origin.
4. Suppose U is the ordered pair (a, b), then 4. Remember that there are many equations for each Iine.
U+ O = (a, b) + (0, O) Definition of O a. P = (t, t), tE R. b. P = t(1, 5), tE R. c. P = t(3, 2), tE R.
d. P=(t,O),tER. e. P=t(l,5)+(1,-1),tER.
= (a + O, b + O) Definition of addition in "Y 2
= (a, b) Zero is the additive identity in R f. P = (4, t), tE R. g. P = t(l, 1/2) + {1/2, 1), tE R.
=u. 5. a. (1, 1). b. (1, 5). c. (3, 2). d. (1, 0). e. (1, 5).
f. (0, 1). g. (2, 1).
5. Set U= (a, b), V= (e, d), and W =(e,/) and perform the indicated opera-
tions, thus 6. a. P = t(2, 3) + (5, -2), /E R. b. P = 1(1, 1) + {4, O), tER.
U+ (V+ W) =(a, b) +((e, d) + (e,f)) c. P = t(O, 1) +
(1, 2), tER.
=(a, b) +(e+ e, d + /) Reason? 7; a. x - 3y +2= O. b. y =X. c. 4x - y - 2 = O.
C~ntinue until you obtain (U+ V)+ W, it should take four more steps, 8. The equations have the same graph, e, and the vectors V and B are on e.
w1th reasons, and the fact that addition is associative in R rnust be used. Thereforethereexistscalarss0 andt0 suchthat V= soA + BandB = toU + V.
6. No. The operation yields scalars (numbers) and vectors are ordered pairs of This leads to U = ( -s0 /t0 )A. What if t 0 = O?
numbers.
7. a. Vectors in "Y 2 are ordered pairs. §4 Page 18
b. Definition of vector addition in "Y 2 •
c. By assumption U + W = U. 1. a. 2. b. o. c. o. d. -23.
d. Definition of equality of ordered pai~s. 2. a. l. b. l. c. 5. d. ..)lO.
Chapter 2 433
432 ANSWERS ANO SUGGESTIONS

3. Let U= (a, b) and compute the length of rU, remember that .¡t:2 =Ir!. 9. a. [P- (1, 2, O)] o (0, O, 1) =O, z =O.
b. [P- (4, O, 8)] o ( -3, 2, 1) =O, 3x- 2y- z = 4.
4. a. lt has none. b. Use the same procedure as when proving the pro- c. [P- (1, 1)] o (2, -3) =O, 2x- 3y =-l.
perties of addition in "Y 2 • · .
d. [P- (-7, O)] o (1, O)= O, x = -7.
6. a. P = t(l, -3) + (2, -5), tER. b. P = 1(1, 0) + (4, 7), teR. e. [P- (2, 1, 1, O)] o (3, 1, O, 2) =O, 3x +y+ 2w = 7, P = (x, y, z, w).
c. P = /(3, -7) + (6, 0), tE R. 11. a. Y es at (0, O, 0). · b. Yes at (2, -2, 4) since the three equations t = s + 4,
7. a. (1/2, -1/2). b. (7, 7). c. (9/2, 7/2). -t = 3s + 4, and 2t = -2s are all satisfied by t = -s = 2. c. No for
a solution for any two of the equations does not satisfy the third. d. No.
8. Q= H +~B.
e. Yes at (1, 1, 1).
11. Notice that if A, ii, C, and D are the vertices of a parallelogram, with B 12. a. [P- (0, -3)] o (4, -1) =O. d. [P- (1, 1)] o (-3, 1) =O.
opposite D, then A - B = D - C. Write out what must be proved in vector b. [P- (0, 7)] o (0, 1) =O. e. po(1,-1)=0.
form and use this relation. c. [P- (6, 1)] o (-2, 3) =O.
13. b. ±(2/3, 2/3, 1/3), ±(3/5, 4/5), ±(lfVJ, 1/VJ, 1/VJ).
§5 Page 21 14. The angle between U and the positive x axis is the angle between U and
(1, O, O).
l. a. 90°. b. cos- 1 -2/'1/5. c. 30°. 15. a. ( -1/3, 2/3, 2/3) or (1/3, -2/3, - 2/3).
2. a. P = /(3, 1), tER. b. P = t(i, -1) + (5, 4), tER. b. ±(1/Vl, O, 1/Vl). c. ±(0, 1, 0).
c. P = t(l, O) + (2, 1), tE R. 16. (a, b, O), (a, O, e), (0, b, e).
l
7. lf A, B, C are the vertices of the triangle with A and B at the ends of the di~me­ 17. As points on a directed line.
ter and O is the center, then A = -B. Consider (A - C) o (B - C).
.j
1
§6 Page 28 i Chapter 2
¡
l. a. (8, 1, 6), "Y 3 or rff 3• d. (0, -1, -1, -6, -4), "Y 5 or &' 5 1
b. (5, O, 10, 18), "Y 4 or C4. e. 1, rffJ. '1 §1 Page 36
c. O, rff3. f. -6, rff4.
l. a. (7, -2, -4, 4, -8). b. 2 + 2t + 6t 2 •
2. Since U, Ve "Y3, Jet U= (a,. b, e), V= (d, e, f) and proceed as in the cor- c. 3i - 5. d. cos x + 4 - .yX:.
responding proof for "Y 2 • 2. a. ( -2, 4, O, -2, -1), ( -5, -2, 4, -2, 9). b. 3t 4 - 6t
2
- 2, -2t
4. a. .YJ/2. b. '1/14. c. l. - 3t 4 • c. ¡ - 2, 7 - 4i. d. -cos x, vx - 4.
5. Choose a coordínate system wíth a vertex of the cube at the origin and the 5. a. Definition of addition in ff; Definition of O; Property of zero in R.
coordinate axes along edges of the cube. cos- 1 1/'1/3. -~. f+ (-/)=O if (f + (-/))(x) = O(x) for all x E [0, 1].
(f + (-/))(x) =f(x) + (-/)(x) =/(x) + (-/(x)) =O= O(x).
6. a. P=t(5,3,-8)+(2,-1,4),tER.
d. p = t(2, 8, -5, -7, -5) + (1, -1, 4, 2, 5), tE R. What is the reason for each step?
c. The fact that ti' is closed under scalar multiplication is proveo in any
8. a. The intersection of the two walls. b. The ftoor. c. The plane introductory calculus course. To show that r(f +.g) = rf + rg, for all rE R
parallel to and 4 feet above the ftoor. d. The plane parallel to and 3 feet
and J, g E§, consider the value of [r(f + g)](x).
in front of the wall representíng the yz coordinate plane. e. The line on
[r(f + g)](x) = r[(f + g)(x)] Definition of scalar multiplication in ti'
the wall representing the yz coordina te plane which íntersects the ftoor and the
= r(f(x) + g(x)] Definition of addition in ti'
other wall in points 3 feet from the comer. f. The plane parallel to and 1
= r(f(x)] + r[g(x)] Distributive law in R
foot behind the wall representing the xz coordinate plane. g. The plane
which bísects the angle between the two walls. h. The íntersection of the
= [rf](x) + [rg](x) Definition of scalar multiplication in ti'
plane in part g with the ftoor. i. The plane which intersects the floor
= [rf + rg](x) Definition of addition in ti'.
and two walls in an equilateral triangle with side 3'1/2. Therefore the functions r(f + g) and rf + rg are equal.
435
434 ANSWERS ANO SUGGESTIONS Chapter 2

6. a. (0, O, 0). b. O(x) = O, for all x E [0, 1]. c. O = O + Oi. b. The function f(x) = -5/3 is in .'/, so !/ is nonempty. But !/ is neither
d. OE R. e. (0, O, O, O, O). f. {an} where an =O for all n. closed under addition nor scalar multiplication.
g. The origin. c. !/ is subspace. d. !/ is not a subspace.
7. a. The distributive law (r + s)U = rU + sU fails to hold. 10. a. Since !/ and fY are nonempty sois the sum !/ + !Y. If A, BE(.'/+ !Y),
b. 'If b -1 O, then 1(a, b) # (a, b). then A = ul + vl and B = Uz + Vz for sorne U¡, u2 E!/ and sorne
c. Addition is neither commutative nor associative. Vt. V2 E !Y. (Why is this true?) Now A+ B = (Ut + Vt) + (U2 + Vz)
d. Not closed under addition or scalar multiplication. There is no zero element = (U 1 + U2 ) + (V1 + V2 ). Since !/ and fY are vector spaces, U1 + Vz E!/
and there are no additive inverses. and vl + v2 E !Y, therefore !/ + fY is closed under addition. Closure under
scalar multiplication follows siinilarly.
8. Define (~ ~) + (; ~) = (~ t; ~ t {) and r (~ ~) = (~~ ~~). 11. a. Def. of O; Def: of Y; Assoc. of +; Comm. of +; D""-of- U; Comm.
of +; Def. of O. b. Def. of additive inverse; Result just established;
§2 Page 42 Assoc. of +; Def. of -OU; Def. of O.
12. Suppose rU = O, then either r = O or r #- O. Show that if r #- O, then U = O.
1. a. Call the set !/. (4, 3, O) E.'/, so!/ is nonempty. If U, VE!/, then U 13. If r -1 s, is rU # sU?
= (a, b, O) and U = (e, d, O) for sorne scalars a, b, e, dE R. Since the third 14. You want a chain of equalities yielding Z =O. Start with
componen! of U + V = (a + e, b + d, O) is zero, U + Vis in .'/, and !/ is
Z = Z + O= Z +(U+ (-U)) =
closed under addition. Similarly, !/ is closed under scalar multiplication.
(How does !/ compare with "/' 2 ?)
b. Call the set fY. If U E fY, then U= (a, b, e) with a= 3b anda+ b =c. §3 Page 49
Suppose rU = (x, y, z), then (x, y, z) = (ra, rb, re). Now x = ra = r(3b)
= 3(rb) = 3y and x +y = ra + rb = r(a + b) = re = z. Thus rU E fY 1. a. (0, 3, 3) is in thespan ifthereexist x,y E R such that (0, 3, 3) = x( -2, 1, -9)
and fY is closed under scalar multiplication. + y(2, -4, 6). This equation gives -2x + 2y =O, x- 4y = 3, and -9x
+ 6y = 3. These equations have the common solution x = y = -l. Thus
2. If N is normal to the plane, then the problem is to show that {U E ~3jU o N
=O} .is a subspace of ~ 3 • (0, 3, 3) is in the span.
b. Here the equations are -2x + 2y = 1, x- 4y =O, and -9x + 6y =l.
3. The question is, does the vector equation The first two equations are satisfied only by y = 1/6 and x = 2/3, but these
x(1, 1, 1) + y(O, 1, 1) + z(O, O, 1) =(a, b, e) values do not satisfy the third equation. Thus (1, O, 1) is not in the span.
have a solution for x, y, z in terms of a, b, e, for all possible scalars a, b, e? c. O is in the span of any set. d. Not in the span.
That is, is (a, b, e) E!/ for all (a, b, e) E"/' 3 ? The answer is yes, for x = a, e. X= 1, y= 3/2. f. X= 2/3, Y= 7/6.
y = b - a and z = e - b is a solution. 2. a. Suppose t 2 = x(t 3 - t + 1) + y(3t 2 + 2t) + zt 3 • Then collecting terms
4. a. lmproper: (a, b, e) = a(l, O, O)+ b(O, 1, O)+ e(O, O, 1). in t gives, (x + z)t 3 + (3y·- 1)t 2 + (2y- x)t + x =O. Therefore, x + z
b. Proper. F:or example, it need not contain (7, O, O). = o, 3y - 1 = O, 2y - x = O, and x = O. Since these equations imply
2
c. Proper. If b -1 O, then (a, b, e) need not be a member. y = 1/3 and y = O, there is no solution for x, y, and z, and the vector t is
d. lmproper: 2(a, b, e) =[a+ b- e](l, 1, O)+ [a- b + e](1, O, t) not in the span.
+[-a+ b + e](O, 1, 1). b. With x, y, zas above, x = -1, y= O, and z =l.
c. x = 4, y = 2; z = l. d. Not in the span.
5. a. (a, b, a+ 4b) for sorne a, bE R.
e. x = -1, y = 1, z = 2.
b. (x, 5y, ~x, y) for sorne x, y E R.
c. ( -t, 2t, 1) for sorne tE R. d. ( -4t, 2t, 1) for sorne 1 E R. 3. a. It spans; (a, b, e) ='Ib- e](I, 1, O)+ [b- a](O, 1, 1) + [a- b + e](1, 1, 1).
b. It spans with (a, b, e) = a(1, O, O)+ 0(4, 2, O)+ b(O, 1, O)+ (0, O, O)
6. Only one set forms a subspace.
+ e(O, O, 1).
8. Show that if !/ is a subspace of "/' ¡, and !/ contains a nonzero vector, then c. It does not span. From the vector equation (a, b, e) = x(l, 1, O)
"Y 1 e !/. + y(2, O, -1) + z(4, 4, O) + w(S, 3, -1) can be obtained the two equations
9. a. !/ is the subspace of all constant functions, it is essentially the vector -y-:- w = e and 2y + 2w =a- b. Therefore (a, b, e) is in the span only if
space R. <e a- b + 2e =O.
436 ANSWERS ANO SUGGESTIONS Chapter 2 437

d. It does not span. (a, b, e) is in the span only if a - 2b +e= O. 13. In problems such as this, you may not be able to complete a proof at first,
4. a. "U is a linear combination of Vand W," or "U is in the span of Vand but you should be able to start. Here you are given two conditions and asked
W" of "U is Iinearly dependent on V and W." to prove a. third. Start by writing out exactly what these conditions mean.
b. "!7 is contained in the span of U and V." In this situation, this entails writing out expressions involving linear com-
c. "!7 is spanned by U and V." binations and making statements about the coefficients. From these expres-
d. "The set of all linear combinations of the vectors U¡, . .. , Uk" or "the sions, you should be able to complete a proof, but without them a proof is
span of U1, . .. , Uk." impossible.
5. a. Your proof must hold for any finite subset of R[t], including a set such 14. The same rcmarks apply here as were given for problem 13.
as {t + 5t 6 , t 2 - 1}.
15. Note that for any n vectors, O = OU1 + · · · + OUn.
6. Beca use addition of vectors is associative.
16. The first principie of mathematical induction states that if S is a set of integers
7. {(a, b), (e, d)} is linearly dependent if and only if there exist x and y, not both which are all greater than or equal to b, and S satisfies:
zero, such that x(a, b) + y(e, d) = (0, 0). This vector equation yields the l. the integer b is in S,
system of linear equations
2. whenever n is in S, then n + 1 is also in S, then S contains all in-
ax + ey =O tegers greater than or equal to b.
bx + dy =O. There is a nice analogy for the first principie of induction. Suppose an
Eliminating one variable from each equation yields the system infinite collection of dominas is set up so that whenever one falls over, it knock~
(ad- be)x =O the next one over. The principie is then equivalent to saying that if the first
(ad- be)y = O, domino is pushed over, they all fall over.
which has x = y = O as the only solution if and only if ad- be ,p O. ! For this problem, we are asked to prove that the set S ='
'l {klr(Ur + · · · + Uk) = rU1 + · · · + rUd includes all positive integers.
8. a. a # 1 and b # 2. b. 2a - b - 4 # O. c. ab # O. 1
(Here b = 1, although it would be possible to take b = 2.) Thus there are
l
9. a. You cannot choose Ur or Un to be the "sorne vector," you must use U1 ¡ two parts to the proof:
where i is any integer between 1 and n. j l. Prove that the statement, r(Ur + · · · + Uk) = rUr + · · · + rU;:
c. Besides the fact that the statement in part a is rather cumbersome to write is true when k = l.
out, it does not imply that {O} is linear! y dependen t. 1 2. Prove that if the statement is true for k = n, then it is true for k
j
10. a. Independent. b. Dependent. c. Dependent. d. Dependent, = n +l. That is, prove that r(U1 + · · · + Un) = rU1 + · · · + rUn implies

e. Independent. f. Independent. g. Independent. r(U 1 + · · · +Un+ Un+r) = rU 1 + · · · + rUn + rUn+t·


11. a. The vector equation x(2, -4) + y(l, 3) + z(-6, 3) =O has many solu- For the second part, no ti ce that
tions. They can be found, in this case, by letting z have an arbitrary value and r(U1 + · · · + Un+ Un+r) = r([U 1 + · · · + Un]+ Un+l).
solving for x and y in terms of z. Each solution is of the form x = 3k, y = 6k,
z = 2k for sorne real number k.
§4 Page 57
b. (5,0) =-~. -4) + b(1,3) + e(-6, -3)witha = 3k + 3/2,b = 6k + 2,
and e = 2k, for any k E R. That is, (5, O) = (5, O) + O, and (5, O) = ~(2, -4)
+ 2(1, 3). l. Show that {1, i} is a basis.
For (0, - 20), a = 3k + 2, b = 6k - 4, and e = 2k, k E R. 3. The dimension of a subspace must be an integer.
d. .P(S) e "Y 2 is immediate, and part e gives "Y 2 e .P(S).
4. a. No; dim "f/3 = 3. b. Yes. Show that the set is linearly independent,
12. a. 3k(6, 2, -3) + 5k( -2, -4, 1)- 2k(r, -7, -2) =O for all k E R. then use problem 1 and Corollary 2.8.
b. If (0, O, 1) is in the span, then there must be scalars x, y, z such that c. No; by Corollary 2.9. d. No; 10(-9)- (-15)(6) =O.
6x - 2y + 4z = O and - 3x + y - 2z = l. e. No; it cannot span. f. Yes; 2(4)- 1(-3) = 11 #O.
c. Let x, y, z be the scalars needed in the linear combination. ForO, see part
a. For (0, -1, O); x = 3k + 1/10, y= 5k + 3/10, and z = -2k, k E R, 5. a. True. b. False. A counterexample: 0(1, 4) =O, but the set {(1, 4)}
For (-2, O, -1); x = 3k + 2/5, y= 5k + 1/5, and z = -2k, k E R. For is not linearly dependent. c. False.
(6, 2, -3); x = 3k + 1, y= 5k, and z = -2k, k E R. 6. a. Let dim "Y = n and S = {U1 , ••• , Un} be a spanning set for "Y. Suppose
438 ANSWERS ANO SUGGESTIONS Chapter 2 439

a, C.:1 + · · · + a. u. = O and show that if one ~f the scalars is nonzero, then 8. B = {(1, 4), (1/2, 5/2)}.
"Y JS spanned by n - 1 vectors. This implies that dim "Y ::::;; n - 1 < n.
b. Let dim "Y= n and S= {U¡, ... , U.} be linearly independent in "Y. 9. If B ={U, V, W}, then t 2 ~ U+ W, t = V+ W, and 1 = U+ V, so
Suppose S does not span "Y and show that this implies the existence of a Jine- = {it 2 - lt + t, -!t 2 + !t + !, -}t 2 + !t-
B !}.
arly independent set with n + 1 vectors within "Y•. 10. B = {4 + 2t, 2 + 3t}.
~ L ~ h 3. ~ 2 11. (Notice that T is not defined using coordinates.) To establish condition 2 in
the definition of isomorphism, suppose T(a, b) = T(e, d). Then a+ bt =e
8. Consider the set { (Ó g), (g Ó)• (? g), (g ?) }· + dt and a = e, b = d. Therefore, T(a, b) = T(e, d) implies (a, b) = (e, d).
9. a. Add any vector (a, b) such that 4b + 7a ,¡. O. For the other part of condition 2, suppose W E R 2 [t], say W = e + ft. Then
b. Add any vector (a, b, e) such that 2a - 5b - 3e ,¡. O. e, /E R, so there is a vector (e, f) E "1' 2 and T(e, f) = W. Thus condition 2
c. Add any vector a + bt + et 2 such that a - 4e ,¡. O. holds.
d. Add any vector a + bt + et 2 such that a + b - Se ,¡. O. ,
;!
12. lf !/ and Y are two such subspaces, then there exist nonzero vectors U, V E i'" 2
1 such that !/ = {tU/tER} and Y = {tV/t E R}.
10. !f !/ n .r has S= {U,, ... , Ud as a basis, then dim(!/ n Y) =k. g n y ~
!
JS a subspace of !/, therefore S can be extended to a basis of !/. Call this basis ¡ 13.
,, a. The coordinates of (4, 7) with respect to B are 1 and 5.
{U¡,···, Uk, Vk+t• ... , V.}, so dim !/ = n. Similarly S can be extended to ~ b. The vector 2t 2 - 2 has coordinates -2, 1, and 1 with respect to the
a basis {U¡, ... , Uk, Wk+t. ... , Wm} of Y, and dim y= m. basis {t + 2, 3tZ, 4- t 2 }.
You should now be able to show that the set {U1 , ••• , Uk, Vk+t. ... ,
14. a. Although T(a, b) = T(e, d) implies (a, b) = (e, d), not every vector of
V., Wk+l, ... , Wm} is a basis for !/+Y. Notice that if a 1 U1 + · · · + akUk
"Y 3 corresponds to a vector in "Y 2 , e.g. consider (0, O, 1). Thus condítion 2
+ bk+lVk+l + ··· + bnVn + ek+,Wk+, + ··· + emWm = O,thenthevector
does not hold.
~,_U,+'''+ akUk + bk+t Vk+l + · · · + bnVn = -(ek+lWk+l + · · · + emWm)
JS m both !/ and Y. Condition 3 holds for T(U + V) = T((a, b) + (e, d)) = T(a + e,
b + d) = (a + e, b + d, O) = (a, b, O) + (e, d, O) = T(a, b) + T(e, d)
11. Aplane through the origin in 3-space can be viewed as a 2-dimensional subspace
of "Y 3 • ·
= T(U) + T(V).
b. For condition 2, suppose W = (x, y) E "Y 2, then (x, O, 2x - y) is a
vector in "Y 3 and T(x, O, 2x - y) = (x, y). Therefore, half the condition holds.
§5 Page 64 But if T(a, b, e) = T(d, e,/), then (a, b, e) need not equal (d, e,/). For T(a, b, e)
= T(d, e, /) implies a + b = d + e and 2a - e = 2d- f. And these equa-
tions ha ve roan y solutions besides a = d, b = e, e =f. For example a = d + 1,
l. a. ~ ou. need to find a and b such that (2, - 5) = a(6, - 5) + b(2, 5). The
solutJOnJsa = -b = 1/2 ' so(2 ' -5)··(1/2 -1/2) {(6,- 5),(2,5))· b =e- 1, e = f + 2, thus T(4, 6, 3) = T(S, 5, 5) and condition 2 fails to hold.
T satisfies condition 4 since T(rU) = T(r(a, b, e)) = T(ra, rb, re) = (ra + rb,
• ,
b. (1/10, -17/20)¡(3,1),(- 2,6¡¡. c. (2, -5)CEtJ•
d. (0, 1)c(l,oJ.c2,-5JJ· 2(ra) - re) = r(a + b, 2a- e) = rT(a, b, e) = rT(U).

2. a. (3/40, 11/40)cc 6, _ 5¡,c 2, 5>J· b. (2/5, 1/10){(3,1),(-2,6)}• 15. a. 2 + t and 4 + 6t.


C. (1, 1){E¡}• d. (7/5, -1/5)((1,0),(2,-5)}• 16. Suppose T is a correspondence from "Y 2 to "Y 3 satisfying conditions 1, 3, and 4
3. a. (0, 1)s. b. (1, -4) 8 • C. ( - 6, 27)s. d. (8, - 30)8 • in the definition of isomorphism (the map in problem 14a is an example.) Show
e. (- 5, 27) 8 • that T fails to satisfy condition 2 by showing that the set {T( V)/ V E "Y 2 } cannot
a. t 2 + 1: (0, O, 1, O)s. contain three linearly independent vectors.
4. d. 12 : (-1, 1, 1, -1)8 •
b. t 2 : (1, O, O, 0) 8 • e. 13 - t 2 : (2, -1, -1, 1)s. i
c. 4: (4, -4, O, 4)8 • f. t 2 - t: (O, O, 1, -1)8 • §6 Page 70
5. a. (0, -1)s. b. (-3/5, -1/5)s. c. (1, 2)s. d. (-2, -1)s.
e. (-3, -2)8 • l. (a, b) E .s!'{(l, 7)} n .s!'{(2, 3)} if (a, b) = x(1, 7) and (a, b) = y(2, 3) for
6. a. somex, y E R. Thus x(l, 7) = y(2, 3), but {(1, 7), (2, 3)} is linearly independent,
(5, -8, 5)81 • b. (O, -1/2, 0)82 • c. (- 3, o, 5)s,. d. (2, 2, 1)s•. so x = y = O and the sum is direct. Since {(1, 7), (2, 3)} must span "Y 2 , for any
7. B = {(2/3, 5/21), (1, 1/7)}. ccW E "Y 2, there exist a, bE R such that W = a(l, 7) + b(2, 3). Now a(l, 7)
:¡'
440 ANSWERS ANO SUGGESTIO~ Chapter 3 441

Review Problems · Page 71

1. a, b, and e are false. d is true but incomplete and therefore not very useful. e
and f are false.
2. b, d, g, h, i, and p are meaningful uses of the notation. s is meaningful if U
= V = O. The other expressions are meaningless.
3. b. Note that showing S is linearly dependen! amounts to showing that O may
be expressed as a linear combination of the vectors in S in an infinite number
of ways. t = ( -1/4- 4k)(2t + 2) + (1/2 + 5k)(3t + 1) + k(3 - 7t) for any
kER.
c. Use the fact that 2t +2 = 1{3t + 1) + !(3- 7t). Compare with problem
14, on page 51.
4. Only three of the statements are true.
a. A counterexample for this statement is given by {0, !} in "Y = Rz[t] or
{(1, 3), (2, 6)} in "Y = "Y 2·

Figure A1 5. b. If U= (a, 2a + 4b, b) and V= (e, 3d, 5d), then there are many solutions
to the equations a + e = 2, 2a + 4b + 3d = 9, and b + 5d = 6. One solu-
tion is given by a = b = e = d = l.
E ~{(1, 7)} and b(2, 3) E~ {(2, 3)} so "Y 2 is contained in the sum and
"Yz = ~{(1, 7)} (8 ~{(2, 3)}. 7. a. For n = 3, k= 1, choose "Y= "1'3, {U¡, Uz, U3} ={E¡}, and Y==
2. No. ~{(4, 1, 7)}.

3. a. If (a, b, e) E Y n .'T, then there exist r, s, t, u E R such that


(a, b, e)= r(l, O, O)+ s(O, 1, O)== t(O, O, 1) + u(l, 1, 1).
Therefore (r; s, O)= (u, 11, t + 11) and Y n .'T = ~{(1, 1, 0)). Chapter 3
b. Direct sum. c. Direct sum. d. Y n .'T = ~{(4, 1, 1, 4)}.
5. If W = O, it is immediate, so assume W ¡. O. Let be the angle between W e §1 Page 81
e
and N, see Figure A l. Then cos = ± /1 V2 //!// W/1 with the sign depending on
e
whether is acule or obtuse. And V2 = ± /1 Vz/I(N//IN/1) with the sign depend- l. a. 3y =O b. 6x + 7y = 2 C. X+ 2y = 2
ing on whether W and N are on the same or opposite sides of the plane Y, x + y+ 4z = 3 9x- 2y = 3. 3x +y= 1
e
i.e., on whether is acute or obtuse.
x + 7z =O. 6x + y= -4.
6. a. "Y 3 =Y + .'T, the su m is not direct. b. "Y 3 ¡. Y + .'T. 2. a. 3y =O, x +y+ z =O, z =O, -5y =O.
c. "f/" 3 = y EEl .'T. d. "Y 3 f. y + !:T. b. 3x - y + 6z = O, 4x + 2y - z = O.
7. a. U= (4a, a, 3a) and V= (2b + e, b, b + 2e) for sorne a, b, e E R. U c. Suppose aex + be 2x + ee 5 x = O(x) for all x E [0, 1]. Then x = O give-;
= (- 20, -5, -15) and V = (16, 4, 20). a + b + e =O; x = 1 gives ea+ e 2b + e 5 e =O, and so on.
b. U = (1, -8, 6) and V = (- 1, 2, - 1).
3. a 11 = 2, a 23 = 1, a 13 =O, a24 = -7, and a21 =O.
8. U= (a, b, a + b) and V= (e, d, 3c - d) for sorne a, h, e, dE R. Since the
sum Y + ,o/" is not direct, there are many solutions. For example, taking
d =O gives U= (2, 1, 3) and V= (-1, O, -3).
4. a. A=(~
·-5 o
3 1 -~). A*=(~
-5
3
o
1 -76 -6)
8 .

9. Choose a basis for /fP and extend it to a basis for "Y. b. A=G
-1)
1 , A*=
-1
G -11 D·
10. Two possiblities are R 4 [t] (8 ~ {! 4 , t 5 , .•• , t", ... } and -1
~{1, t2, (4, . . . 't2", .. . } 8) ~{t, t3, ts, ... , ¡2n+t, .. . ). c. A= (i -3
o -i). A*= (i
-3
o
1
-2 8)·
442 ANSWERS ANO SUGGESTJONS Chapter 3 443

2. AtX = Bt and AzX = B 2 are the same except for their hth equations which
are ah 1Xt + · · · + ahmXm = bh and (ah! + rak¡)X¡ + · · · + (ahm + rakm)Xm
= bh + rbk, respectívely. Therefore, it is necessary to show that if A 1 e = B 1,
then X= e ís a solution of the hth equation in A 2 X = B 2 and if A 2 e = B 2 ,
then X= e is a solution of the hth equation of A 1 X = B 1 •
6. a. 2 -35
b ~). (g b} -i)· (g g b ~)·
(0
3' a. (g b. c. (g g -g b d.

c. (f -3
o 2)3 , x = 2, y = 3, and z = l.
~
4. a. From the echelon form (bo o~ oO
1 1
7. a. In determining if S e "Y. is.linearly dependent, one equation is obtained
1 o -1 o)
for each component. A is 2 X 3, so AX = O"has two equatíons and. n = 2.
Further there is one unknown for each vector. Sínce A has three columns S
b. (óo o~ o~).givesx=y=l. c. O 1
(o o
2 O , no solution.
o 1
must contain three vectors from "Y 2 • If the first equation is obtained from ;he
first component, then S= {(5, 1), (2, 3), (7, 8)}. 5. The only 2 X 2 matrices in echelon form are (& g), (g b), (b ~) and
b. {(2, 1), (4, 3)}.
c. {(2, 1, 2), (5, 3, 1), (7, O, 0), (8, 2, 6)}. (b Í), where k is any real number.
d. {(5, 2, 1, 6), (1, 4, o, 1), (3, 9, 7, 5)}.
6. Call the span .?. Then (x, y, z) E.?, if (x, y, z) = a(l, O, -4) + b(O, 1, 2)
8. a. W = (3, 2), Ut = (2, 1), and U2 = (-1, 1). =(a, b, -4a + 2b). Therefore !/ = {(x, y, z)/z = 2y- 4x}, i.e., !/ is the
b. W =(O, O, 0), U1 = (1, 2, 4), and U2 = (5, 3, 9). plane in 3-space wíth the equation 4x - 2y + z = O.
c. W = (4, 8, 6, 0), U1 = (9, 2, 1; 0), and U2 = (5, 1, 7, 2).
(g b), so the row space is "Y
10. a. (24 -1 8)
11 3 . b. (_!)· 7. a. The echelon form is 2•

b. (b ~ - ~) is the echelon form, therefore U is in the row space if


11. (g g). (g). (0, O, O, 0), and (& &) . o o
U = a(l, O,- 2)
o
+ b(O, 1, 3) and the row space is the set {(x, y, z)/2x- 3y + z
12. The zero in AX = O is n x 1 ; the zero in X = O is m x 1. =O}.
1 o
14. a. Or! S. b. A plane not through the origin. c. (o 1 ~), {(x, y, z)/z =x + 5y}.
c. S may be empty, or a plane or line not passing through the origin.
d. "Y 3• e. (b ~ g ~). {(x, y, z, w)/w = x + 2y + 3z}.
o o 1 3
§2 Page 93
f. (Ó ~ ~ ~), {(x, y, z, w)/z = 2x and w = 2x + 6y}.
1. a.
4 1 3
3
(1 1 1) ( 3--:_ ~k)
k -
_(3 - 2k
12 - Sk
+
+ 33 -- kk++ 9k3k) = ( 6)
15 ·
8. a. 2. b. 2. c. 2. d. 3. e. 3. f. 2.
3
Therefore all the so1utions satisfy this system.
11. a. The rank of (¡ ~) is 1, so the vectors are linearly dependen t.
b. Rank is 2. c. Rank is 2. d. Rank is 3.
b. (~
1 -2 o
~ f) (
3 2
f) ~)
3--:_ = ( holds for all values of k.
3k -3 .
12. "A is an n by m matrix."

3 13. Let U1o ... , u. be the rows of A and U¡, ... , Uh + rUk, ... , u. the rows
c. (13 11 -2)1 (3-Jklk) (6 9k) (O)
k = 12 =4k -¡, 1 for all k.
of B. Then it is necessary to show
2{U¡, ... , uk, ... , uh, ... , u.}
Therefore the solutions do not satisfy this system. e 2{U¡, .•• , Uk, ... , Uh + rUk, ... , U.}
444 ANSWERS ANO SUGGESTIONS Chapter 3 445

and §4 Page 105


:&'{U¡, ... , Uk, ... , Uh + rUk, ... , U.}
e .!&'{U1 , ••• , Uk, ... , Uh, ... , U.}. l. a. 2; a point in the plane. b. 1; a line in the plane.
c. 2; a line through the origin in 3-space.
14. The idea is to find a so!ution for one system which does not satisfy the other
d. 1; a plane through the origin in 3-space.
by setting sorne of the unknowns equal to O. Show that it is sufficient to con-
sider the following two cases.
e. 1; a hyperplane in 4-space. f. 3; the origin in 3-space.
Case 1 : ekh is the leading entry of 1 in the kth row of E, and the leading 2. a. In a line. b. The planes coincide.
1 in the kth row ofF is fic1 with j > h. c. At the origin. d. In a line.
Case 2: Neither ekh nor hh is a leading entry of l.
3. Rank A = rank A* = 3; the planes intersect in a point.
In the first case, show that x 1 must be a parameter in the so.Iutions of EX= O,
Rank A = rank A* = 2;· the planes intersect in a line.
but when XJ+ 1 = · · · = Xm = O in a so!ution for FX = O, then x1 = O.
(Rank A = rank A* = 1 is excluded since the planes are distinct.)
Rank A = 1, rank A* = 2; three parallel planes.
§3 Page 100 Rank A = 2, rank A* = 3; either two planes are parallel and intersect the
third in a pair of parallel lines, or each pair of planes intersects in one of thrce
l. X = 3, y = -2. parallellines.
2. X = 1 + 2t, y = t, tE R. 4. a. Set Y= {(2a + 3b, 3a j- 5b, 4a + 5b)/a, bE R}. Then Y = .!&'{(2, 3, 41,
= 1/2, y = 1/3.
3.
4.
X

Inconsistent, it is equivalent to
(3, 5, 5)} is the row space of A = (~ ~ 1). Since A is row-equivalent to

X- 3y = 0. (6? -~), Y= .!&'{(1, O, 5), (0, 1, -2)}. That is, E Y if (x, y, z) (x, y,;:)
0=1. = x(I, O, 5) + 1, -2). This gives the equation z = 5x-
y(O, 2y.
b. (6 _~ - ~) is row-equivalent to · (b ? _iW, and an equation is
5. Equivalent to
X- fy = -2
z = -3, x- 2y- 3z =O.
so x = 3t- 2, y = 2t, and z = -3, for any tE R. c. x- 3y + z =O. Notice that although the space is defined using 3 para-
meters, only two are necessary.
6. Inconsistent. 7. X = 3 + 2s - f, y = S, Z = f, S, tE R. d. 4x - 2y + 3z - w = O. e. x + y + z + w = O.
8. X= y= Z =l. 9. X = - 2t, y = 3t - 2, z = t, w = 4, tE R. 5. a. (x, y, z, w) = (4, O, O, 2) + t( -3, 2, 1, 0), tE R. This is the line with direc-
10. The system is equivalent to tion numbers (- 3, 2, 1, O) which passes through the point (4, O, O, 2).
x- ~z + w =O b. (x,y, z, w) = (-3, 1, O, 0) + t(-1, -2, 1, O)+ s(-2, O, O, 1), t, sER.
y- ~z w =O, + This is the plane in 4-space, parallel to .!&'{(-1, -2, 1, 0), (-2, O, O, 1)} and
therefore the solutions are given by x = s - t, y = 5s- t, z = 1s, w = t, ~sing through the point (- 3, 1, O, 0). .
for s, tE R. c. (x 1 , x 2 , x 3 , x 4 , x 5 ) = (1, 2, O, 1, O)+ t( -3, -1, 1, O, O)+ s(O, -2, O, O, 1),
t, sE R. The plane in 5-space through the point (1, 2, O, 1, O) and parallel to the
11. a. Independent. b, e, and d are dependent. plane.!&'{(-3·, -1, 1,0,0),(0, -2,0,0, !)}.

12. a. (i ~), b. (i ~)· c. (2, 5, 6).


d. (~ g ~)· 6. a. An arbitrary point on the Iine is given by t(P- Q) + P = (1 + t)P- tQ.
If S= (x 1), P = (p 1) and Q = (q¡), show that x 1 = (1 +
t)p 1 - tql> for
1 s j s n, satisfies an arbitrary equation in the system AX = B.
e. (~ ~ ~ i)· b. Recall that both S e {U+ V/ VE Y} and {U+ V/ VE Y} e S must be
established. For the first containment, notice that, for any W E S, W = U
13. a. 2. b. 2. c. l. d. 2. e. 3. +(W- U).
15. From the definition of the span of a set, obtain a system of n linear equations 8. a. S= {(-3, -?k, -1-· 4k, k)/kER}. This is the line with vector equa-
in n unknowns. Use the fact that {V1 , ••• , V.} is linearly independent to show tion P = t( -7, -4, 1) + (- 3, -4, 0), tE R.
that such a system is consistent. b. Y is the line with equation P = t( -7, -4, 1), tE R.
446 ANSWERS ANO SUGGESTIONS Chapter 3
447

9. a. S= {(4- 2a + b, 3- 3a.+ 2b, a, b)ja, bE R}; 17. a. (18, -c-5, -16) and (22, 8, -4).
!/ = 2"{(-2, -3, 1, 0), (1, 2, O, 1)}. b. The cross product is not an associative operation.
c. The expression is not defined -beca use the order in which the operations
b. S = {(4, 2, O, O) + (b - 2a, 2b - 3a, a, b)ja, bE R}.
are to be performed has not been indicated.
c. S is a plane.

§5 Page 115 §6 Page 119


l. 2. 2. -5. 3. 2. 4. l. 5. o. 6. O. 7. 2. 8. O.
1. a. 13. b. 15. c. o. d. l. e. -144. f. 96.
9. a. A parallelogram, since A is the origin and D = B C. Area = 13. + 3. a. The determinan! is - 11, so the set is linearly independent. b. The
b. A parallelogram since B - A = D - C. The area is the same as the area determinan! is -5. c. The determinant is O. ~-
of the parallelogram determined by B- A and C-A. The area is 34.
c. Area = 9. 4. The set yields a lower triangular matrix with all the elements on the main
diagonal nonzero.
10. a. l. b. 30.
5. a. Follow the pattern set in the proof of Theorem 3.12, that interchanging
12. Suppose the hth row of A is multiplied by r, A = (a 11 ) and B = (bu). Expand
rows changes the sign of a determinan!, and the proof that jAr¡ = jAj, problem
B along the hth row where bhi = rahi and Bhi = Ahi· The corollary can also
be proved using problem 11. 13, page 115.
1 a12 /a 11 atnlau
13. The statement is immediate for 1 x 1 matrices, therefore assume the pro- a2n- Oz¡(a¡nfau)
perty holds for all (n - 1) x (n - 1) matrices, n ¿ 2. To complete the proof 6
• jAj = a 11 O Ozz - az¡(a¡z/au) ·
by induction, it must be shown that if A = (au) is an n x n matrix, then O an2 - an1(a12/au)
jATj = jAj.
Let AT = (b 11 ), so b11 = a11 , then 7. a. 35 714 21
1¡3·1- 5·7 3. 4 - 5. 21-! 32 621
=33·6-9·7 3. 8 - 9. 2 - 3 -27
¡- = -34
jATj = b11B11 + b12B12 + · · · + b1nB1n 19 6 8
= a11B11 + Oz¡Bl2 + ... + OniBln·
The cofáctor B1 1 of b1 1 is ( -1)1+ 1M u, where MJJ is the minor of b11 • The
minor MJJ is the determinan! of the (n - 1) x (n - 1) matrix obtained from
b.,~
2
1~l=!l~:i=~::
1 3 2
5·2- 2·2,_ 51
5·3- 3·2 - .

AT by deleting the 1st row andjth column. But if the 1st column andjth row
of A are deleted, the transpose of Mu is obtained. Since these matrices are c.
1
2
3
2
2 1
i 1
=tz 2·3- 1·1
2·2-2·1
12·1- 3·1
2·2- 1·3 2·1- 1·2¡
2·1-2·3 2·1-2·3
2·2- 3·3 2·1- 3·2
(n - l) X (n - 1), the induction assumption implies that M 11 is equal to the 3 1 2 1
minor of a1 1 in A. That is, BJJ = (-1) 1 +1M 11 = (-J)i+ 1 M 11 = A11 • There- 5 1 0
for~ th~ above expansion becomes lAr¡ = a 11 A 11 + a21 A 21 + · ·. + an 1 An~>
=;¡11 2 - 4 - 2 11
=;:rs1¡-22 -101 = 10·
-24 -2o
-1 -S -4
wh1ch IS the expansion of jAj along the first column.
Now the statement is true for all 1 x 1 matrices, and if it is true for all
(n - 1) X (n - 1) matrices (n ¿ 2), it is true for all n x n matrices. There- §7 Page 126
fore the proof is complete by induction.

14. a. (!? g¡, -16 g¡, lb ?1) = (0, O, 1). b.


Note that parts a and b show that the cross product is not commutative.
(0, O, -1). l. a. (g !)· b. (g ! g)· c. (g)·
c. O. d. (- 2, -4, 8). e. O. f. (0, -19, 0). e. (b ?).
15. a. If the matrix is called A = (a 11 ), then
Auau = ~~: ~!lEt = [bzc3 - b3c2](1, O, 0).
16. A • (B x C) is the volume of the parallelepiped determined by A, B, and C.
Since there i~o volume, A, B, and C must lie in a plane that passes through 4. a. The row space of A= 2"{(1, O, 11/3), (0, 1, -1/3)} and the row space of
the origin. AT = 2'{(1, o, -1), (0, 1, 1)}.
448 ANSWERS ANO SUGGESTIONS Chapter 4 449

b. The row space of A = 2'{(1, O, 1), (O, 1, -2)} and the row space of AT 3. "T is a map from "Y to ir," or "Tsends "Y into 1/í."
= 2'{(1, O, -3), (0, 1, 1)}.
4. a. Suppose T(a, b) = (-1, 5). Then 2a- b = -1 and 3b- 4a = 5. These
5. "A transpose" or "the transpose of A." equations have a unique solution, and (1, 3) is the only preimage.
6. lf U= (a, b, e) and V= (d, e, f), then the components of U x Vare plus or b. {(1, -t, 3t)ft E R}. c. No preimage.
minus the determinants of the 2 x 2 submatrices of d. The problem is to determine which vectors in RJ[t] are sent to 1 by the map
+
T. That is, for what values of a, b, e, if any, is T(a + bl et 2 ) = t? This
(d ~ ¡). happens only if a + b = 1 and e - a = O. Therefore any vector in {r +
7. That is, if x 1 U1 + · · · + x.U. = W has a solution for al! W E "Y"' show that (1 - r)t + rt 2 fr E R} is a preimage of t under the map T.
x, U, + ··· + x U = O implies x 1 = · · · = x. =O.
11 11
+
e. {tt 2 kfk E R}. f. {3 +k- kl +
kt 2 fk E R}.
5. a. Por each ordered 4-tuple W, there is at Ieast one ordered triple V such
Review Problems Page 126 that T( V) = W.
b. Por each 2 x 2 matrix A, there ls a 2 x 3 matrix B, such that T(B) = A.
2. a. The two equations in two unknowns would be obtained in determining c. For each polynomial P of degree less than 2, there is a polynomial Q of
if {4 + t, 5 + 3t} (or {l + 41, 3 + 51}) is linearly independent. degree less than 3, such that T(Q) = P.
b. S = {2 + 91 + 12 , 8 + 31 + 71 2 } e R 3 [1].
c. S= {3 +
21, 1 - 1, 4 +
61} e R 2 [1]. 6. a. Onto but not 1-l. For example, T(Z Z ~) = (g g), for any scalar k.
5. a. 1
1 1 1
h ~ ;/
= 1 - y - x. So an equation is y = 1 - x.
b.
d.
Neither 1-1 nor onto.
1-1 and onto.
c. 1-1, but not onto.
e. Neither. f. 1-1, but not onto.

b. y= 9x- 23. c. 3y- 8x + 37 =O. 7. a. (14, -8, -6). b. (4, 10, 14). c. (1, 4, -5).
· f.
7. a. 5x + 6y - 8z = O.
b. 3x- y= O. d. (2, 5, 7). e. (x, y, -x-y). (0, O, 0).
c. 3x + 8y + 2z = O.
d. X+ z =O. 8. a. Suppose "Y = JI>(±) !T and V= U + W, with U E Y, W E !T. Then
9. If AX = B is a system of n linear equations in n unknowns, then there is a P 1 (V) = U. lf rE R, then rV = r(U + W) = rU + rW, and rU E Y, rWE !T.
unique solution if and only if the determinan! of A is nonzero. Since [Al may (Why?) So P 1(r V) = rU = rP 1(V), and P 1 preserves sea lar multiplication.
equal any real number, it is reasonable to expect that a given n x n matrix The proof that a projection map preserves addition is similar.
has nonzero determinan!. That is, it is the exceptional case when the value of 9. a. lf Y= 2'{(1, O)}, then !T = 2'{(0, 1)} is perpendicular to JI> and S 2
the determinan! of an n x n matrix is O, or any other particular number. =Y(±) !T. Thus P 1 is the desired projection. Since (a, b) = (a, O) + (0, b),

11. a. AG) = W· (~~)· W· and respectively.


P 1 (a, b) =(a, 0).
b. P 1 (a, b) = (a/10 +
3b/10, 3a/10 +
9b/10).
10. a. T(a, b) = (a/2 - y')b/2, y'3a/2 + b/2), T(l, O) = (1/2, y')/2).
b. (g), (g), and (?). respectively. c. To show that a rotation is 1-1 and onto, Jet (a, b) be any vector in the
c. The requir~nt is that, for each BE vlt" x"
there exists one and only one codomain, and suppose T(x, y) =(a, b). That is,
X E vil"' x 1 , such that AX = B. What does this say about A when B = O? Is (x COS e- y sin e, X sin e + )' COS e) = (a, b).
there always an X when m< n? This vector equation yields a system of linear equations. Consider the coef-
ficient matrix of this system and show that, for any a, b E R, there is a unique
solution for x and y.

Chapter 4 11. A translation is linear only if h = k = O. Every translation is both one to one
and onto.
§1 Page 135 12. The reflection is linear, 1-1, and onto.

l. a, e, e, and g are not linear. b, d, and f are linear maps. 13. Show that T(U) = T(O).

2. Remember that an if and only if statement requires a proof for both implica- 14. This is proved for n = 2 in problem 2. The induction assumption is that it
tions. is true for any linear combination of n - 1 terms, n ;:::: 3.
450 ANSWERS ANO SUGGESTIONS Chapter 4 451

§2 Page 144 §3 Page 153

l. a. §y= {(3a- 2e, b + e)ja, b, eER} = 2'{(3, 0), (O, 1), (-2, 1)} 1. a. [Tr + T 2 ](a, b) = (4a, 2a + 5b, 2a- 12b).
= "Y 2• b. [5T](a, b) = (15a- 5b, 10a + 20b, 5a- 40b).
JV' r = {(a, b, e)/3a- 2e =O, b + e =O} = {t(2, -3, 3)/t E R} c. [T3 - 3T2 ](a, b) = (2b- 2a, -2a- 3b, 4b).
= 2'{(2, -3, 3)}. d. 0Hom(1r 2 ,1!" 3 ).
b. ~T = {(x, y, z)/5x + 7y- 6z =O}; JV' r ={O}. 2. a. x1T1 + x 2Tz + X3T3 + x4T4 =O yields the equations x1 + x 2 + x 3
c. ~T = R; JV'r ={a- ai/aER} = 2'{1- i}. =O, x 1 + x 3 =O, x1 + Xz =O, and x2 + X4 =O. This system has only the
d. ~T = ..'l'{J -f- 12 , f -f- 21 2 ,2- 1} = ..'&'{1 -f- 12 ,2- !}. trivialsolution,sothevectorsarelinearlyindependent. b. 2T1 - Tz- T3
JV'r ={a+ bt + el 2/b =e, a= -2e} = ..'&'{2- 1- 12 }. = O. c. T 1 - Tz + T3 - T4 = O. d. T 1 + T 2 - 2T3 = O.
2. a. {(4!- 2(51), 3(51)- 61)/t E R} = {!(2, -3)/t E R} = 2'{(2, -3)}. 3. "Let T be a linear map from "Y to "/!'."
b. {0}. c. {(2, -3)}. d. 2'{(2, -3)}. e. {(-4, 6)}.
f. If it is parallel to 2'{(1, 2)}, i.e., the line with equation y= 2x. 4. b. Since we cannot multiply vectors, the first zero must be the zero map in
Hom("Y, "11'), the second is the zero vector in "Y, and the fourth is the zero vec- .
3. a. 2'{(1, O, -2), (0, 1, 1)} = {(x, y, z)/2x- y+ z = 0}. tor in "/1'.
b. 2'{(1, O, 1/2), (0, 1, 5/6)} = {(x, y, z)/3x + 5y- 6z =O}.
c. o. d. "Y 2· 5. a. If TE Hom("Y, 11') and r E R, then it is necessary to show that rT: "Y ~ "/1',
and that the map rT is linear.
4. a. (a, b, e) E JV' r if 2a + b = O and a + b + e = O. A homogeneous system b. Why is it necessary to show that [(a+ b)T](V) = [aT + bT](V) for all
of 2 equations in 3 unknowns has non trivial solutions, so T is not 1-1. V E "Y? Where is a distributive law in "/1' u sed?
b. One to one. c. T(a, -2a, O, O)= O, foral! a E R.
d. One to one. d. Not one to one. 6. a. (0, 3). b. (0, -6). c. ( -9, 5). d. ( -12, O).
e. (4, -3, O, 1) 8 • f. (0, O, O, O)s. g. (4, O, 1, 0) 8 •
5. Each is a map from a 3-dimensional space to a 2-dimensional space. h. (2, 1, 3, -5)s.
6. ~P = :T and JV'p = .?. 7. a. a12T1z + az2T22 + a32T32. b. a31T31 + a32T32.
7. a. JV' :f. b. Why is r- 1 [/T] nonempty? For closure under addition and c. auTu + a21T21 + a31T31 + a12T12 + a22T22 + a32T32.
scalar multiplication, is it sufficient to take U, V E r- 1[/T] and show that 2 2 4 3 2
rU + sVE r- 1 [/T] for any r, sER? 8. a. ¿:; ¿:; Yij Tij. b. I: a4 T4 . c. ¿:; .L; r1a 1¡Tu.
1=1 i=l }=1 1 1 1=1 J=l
c. T- 1[{W}] must contain at least one vector.
9. a. Tu(x, y) = (y, y, 0). b. T22(x, y) = (x - 3y, x - 3y, 0).
8. (x, y) = [x- y](4, 3) + [2y- hl(2, 2), therefore T(x, y) = (4x- 3y, 2x- y). c. T23 (4, 7) = (-17, O, O). d. Tu(5, 9) = (9, 9, 9)
9. (x, y) = [x- 3y](4, 1) + [4y - x](3, 1), so T(x, y) = (IOy- 2x, 3x - Ily, 10. a. T(3, 1) = (2, 13, 3): (3, 10, -ll)s 2 , therefore T(3, 1) = [3T11 + IOT1z
1y- 2x). + 4Tzz
- 11 T13](3, 1). And T(I, O) = (1, 4, O): (0, 4, - 3) 82 , so T(I, O) = [OTz¡
10. a. (2, 4) = i(3, 6), but T(2, 4) ;6 jT(3, 6). Therefore any extension of T - 3TZJ](I, 0). Therefore T = 3T11 + IOT12 - 11 T1z + OT21 + 4T22- 3Tz3,
and T: (3, 1O, -11, O, 4, - 3) 8 • Yo u should be obtaining the coordinates of
to all "Y z will fail to be linear.
vectors with respect to B 2 by inspection.
b. Yes, but not uniquely, for now T(2, 4) = ~T(3, 6). An extension can be
b. T: (8, -8, 12, 2, -2, 5) 8 • c. T: (2, 1, 2, 1, O, 0) 8 •
obtained by extending {(2, 4)} toa basis for "1' 2 , say to {(2, 4), (5, 7)}, and
d. T: (6, -2, 3, 2, -2, 3)s.
then assigning an image in "Y 3 to the second basis vector, say T(5, 7) = (0, 3, 8).
For these choices, T may be extended linearly to obtain the map 11. a. T 11 (a + bt) =a. T13 (a + bt) = at 2 • Tz 1 (a + bt) = b.
T(x, y) = GY - h, 2x - y, ~Y - ~x). b. [2T11 + 3T21 - 4Tn](a + bt) = 2a + 3b- 4at 2 •
c. T = T12 + T23. d. T = 2Tz1 + T1z - 4Tzz + 5Tt3·
11. Observe that since T(1, O) E "Y 2 , there exista, e E R such that T(l, O) = (a, e). e. (0, O, 1, O, O, 1)8 • f. (3, -3, 7, -1, 4, 5)8 •
13. a. JV' r = {(x, y, z)/2x +y- 3z = O}. g. (0, O, O, O, O, O)s.
b. Recall that aplane parallel to JV' T has the Cartesian equation 2x + y - 3z 12. a. (0, 0). b. (1, 2). c. (- 5, -1 0). d. Yo u must first find the
=k for sorne k E R; or that for any fixed vector U E "Y 3, {V+ U/ V E JV' r} coordina tes of (3, 4, 5) with respect to B 1 • Tzz(3, 4, 5) = - 2(1, 2) = (- 2, -4).
is a plane parallel to JV' T· cr: e. T12 (1, O, O)= (4, 8). f. T: (3, 2, 2, -2, 9, -1) 8 •
d. .? could be any subspace of dimension 1 which is not contained in JV' T· g. T: (7, O, -10, 3, -5, 4) 8 •

..... ~
452 ANSWERS ANO SUGGESTIONS ~ Chapter 4 453

13. a. O E R, (0, O) E "Y 2, (0, O, O, O) E "Y 4, and the map in Hom("Y 2, "Y 4) that 10. There are an infinite number of answers. Jf Bt = {E¡}, then B 2 must be
sends (x, y) to (0, O, O, O) for all (x, y) E "Y 2· {(3- 4), (-1, 1)}.
11. lf Bt = { Vt. V2, V3 }, then V3 must be in the null space of T. One possible
§4 Page 164 answerisB 1 = {{1,0,0),(0, 1,0),(1, 1, 1)}andB2 = {(-3, -3, 1),(1, ~I,O),
(1, O, 0)}. Check that B 1 and B2 are in fact bases and that they give the desired

l. a. (b ~) · b. (~ 8)· oo o)1 . d. (~ -~)·


result.
12. If B 1 = {V1 , V2, V3, V4 } and B2 = {W1 , W2, W3}, then V3, V4 .!V'r and
o o T(V¡} = Wt- W3, T(V2) = W2 W3. +
E

2. a. (g b)· b. (~ ~t 8)· d. ( -~ -1)· 13 • ( -5 7 2) ( o


26 -33 -3 ' -4
-312 -3)
12 '
(-522 -'21
4 -1)9 '
-10 5 -5)
!~).
18 ( .40 -30 30 .
3. a. (g g)· b. (b g)· c.
(
-3
-7 -8 14. Let B 1 = {V1 , ••• , V.} and B 2 = {W1 , ••• , Wm}. Jf ffJ(T) =A=
(ali),

4. a. (? b ~~)· b. (_~ -3 4)
-4 l .
c. (g b l)·
then T(V1) = L:r. 1 aiJ W1• To show that ffJ preserves scalar multiplication,
notice that [rT](V1) = r[T(V1)] = rl:7~t a 11 W, = .L;;". 1 ra, 1 W1• Therefore
9 1 [rTJ(V1 ):(ra 11 , ••• , ram¡) 82 and qJ(rT) = (ra 11). But (ra 11) = r(aiJ) = rA
= rlfi(T).
d. ande. (g b g)·
15. a. A= (i -~)·
5. A=(=~ -1 _1)· a. A (l) -l),
= ( so T(O, l, O)= (1, -1, 0).
16. If T: (b 11 , b 12 , b 13 , b21, bn, b23)s, then T = L:ff.¡ .L;f.¡ bhkThk· Therefore
b. (-1, 7, -1). c. (0, O, O). T(V1) = .L;~.¡ .L;f.¡ bhkThk(V¡) = .L;f.¡ blkWik·

6. A=(-~ -~). a. (1, l): (1, 0)¡ 0 , 1¡,(t.o¡¡, A(b) = (-~). and :._3(1, l) Hence T(V1): (b 1¡, b12, b1J)s 2 , and (%:~ %~~) = (~~; ~~:).
b13 b23 a3¡ a32
+ 5(1, O) '= (2, -3), so T(l, 1) = (2, -3).
b. A(~)=(-~~) and T(4, O)= (12, -16). §5 Page 174
c. Am =(-m and T(3, 2) =(7, -10). l. a. S o T(a, b) = (4b- 2a, 8a- 4b), To S(x, y) = (2y- 6x, 12x).
d. A(_D = (_~) and T(O, 1) = (-1, ]). b.
c.
S
S
o T(a, b, e)= (a- b +e, -b, a+ e), To S(x; y)= (2y, x +y).
o T is undefined, T o S(a, b, e) = (2a, 2a +
3b, 4a - 4b, b).
+bt) = -b - 6at 2 , T o S is undefined.
~ ~
d. S o T(a
7. The matrix of T with respect to B 1 and Bz is A = (
-1 -3
- b).
1 2. a. T 2(a, b) = (4a, b - 3a). d. [T 2 - 2TJ(a, b) = (0, -a - b).
3
b. T (a, b) = (8a, b - 7a). e. [T + 3/](a, b) = (5a, 4b - a).
a. ( ~ ~ -6)(~)=( ~).soT(t)=t-3! 2 • c. T"(a, b) = (2"a, b - [2" - l]a).
-l -3 l o -3
b. A(!, O, W= + t 2 ) = 1 + 3t.
(1, 3, O)r, so T(l 5. Since by definition, S o T: "Y - 11', it need only be shown that if U, V E "Y
c. A(-4, 3, Of = (-8, ·-9, -5)T, so
T(3t- 4) = -8- 9t- 5t 2 • and a, bE R, then [S o T](aU + b V) = a([S o Jl(U)) + b([S o T]( V)).
d. A(O, O, O)r = (0, O, O)r, so T(O) = O. 7. b. When multiplication is commutative, a(b + e) = (b + e)a.
8. a. T(V3 ) =O. b. T(V4 ) = 7W2 • c. T(V¡)isalinearcombinationof 8. If S E Hom("Y 2), then there exist scalars x, y, z, and w such that S(a, b)
two vectors in B2. d. Yr e ..'t'{W1 , Wm}. e. T(V¡) =a¡¡ W¡, ... , +
= (xa + yb, za wb) (problem 11, page 145), and since a linear map is
T(V,,) = a.,W, provided n::::; m. What if n >m? determined by its action on a basis, S o T = T o S if and only if S o T(E1)
9. a. T is onto. b. Tis 1-1. c. T is not onto. d. T is onto, but = T o S(E1), i = 1, 2, where {E¡} is the standard basis for "Y 2 .
not 1-1. a. {S/S(a, b) = (xa, wb) for sorne x, w E R}. b. Same as a.
454 ANSWERS ANO SUGGESTIONS Chapter 4 455

+ yb, za + wb)lx + 2y - w = O, 3y + z = 0}.


c.
d.
{S(a, b) = (xa
{S(a, b) = (xa + yb, za + wb)lx +y- w =O, y+ z = 0}. 7. a. ( ~ ¡ =~) (b) = ( ~),
therefore (1, O, 0): (3, 6, -4)a.
-4 -3 1 o -4
9. b. For any map T, T o 1 = 1 o T. b. (0,3,-1)8• C. (-3,2,1)8• d. (0,0,2)B.
10. a. Not invertible.
(=~9 =~10
2
b. Solving the system a - 2b = x, b - a =y for a and b, gives T- 1 (x, y) 8. a.
-23/2
J~~)(-i)=(=f),so(4,-1,2):(-2,-1,3)8 .
2 3
=(-x-2y,-x-y).
c. Suppose r- 1 (x + yt + zt 2 ) =(a, b, e), then a+ e= x, b =y, and b. (0, 1, 0)8. c. (-9, -2, 10)8. d. (21, 5, -23)8.
-a- b = z. So r- 1 (x + yt + zt 2 ) =(-y- z, y, X+ y+ z).- - - .
d. T- 1 (x, y) =y- X+ [3x- 2y]t. 9. A=(-~ -~) and A- =
1
(=~ ~2} 3 ). For a. (=~ ~2} 3 ) (b) =
e. A map from R 2 [t] to R 3 [t] is not invertible. ( -=2}3), therefore r- 1 (0, 1) = ( -1, - 2/3). b. ( -1, 1). c. ( -4, - 3).
f. r- 1 (x, y, z) = (-3x + 2y- 2z, 6x- 3y + 4z, -2x +y- z).
d. (-x-y, -x -1y).
11. a. Assume S 1 and S 2 are inverses for T and consider the fact that
S1 = S 1 o 1 = S 1 o [T o S 2 ]. 10. The matrices of T and r- 1
with respect to the standard basis for r3 are

13. The vector space cannot be finite dimensional in either a or b.


2
2
1
2
3)1 and (-3/21 1
-1/2
5/4)
-1 respectively.
(
4 2 4 1 o -1/2
15. If V= [To S](W), consider S(V). Where do you need problem 14? a. (10, -8, -4). b. (1, -2, 3).
16. a. S o T(a, b) = (17a +
6b, 14a 5b). + C. T- 1 (x, y, z) = ( - h - !Y+ ~z, X+ y- Z, X -iz).
b. s- 1 (a, b) =(a- b, 4b- 3a). r- 1
(a, b) = (2a- b, 3b- 5a).
(S o T)- 1 (a, b) = (5a- 6b, 17b--:- 14a).
§7 Page 194

§6 Page 186 o 1 o) 1/2 o)1 .


l. a. b.
(o1 oo o1 . c. ( o
1· a. (-118 128) · b. (6).
c. (~1 Ü· d. (_:~ ~: -2~).
12 6 -9 e. (-~o o~ 8). f.
(
o1 o1 oo) . g.
1
o
(o
o
1 g)·
19 15 16)
1 o -3 1 o 1/4
e. (g 8)· f. 30 -8 4 .
( 5 1 3 2. a. E¡= (162 ~), Ez = (-~ ~), Ea= (b -1~6)' E4 = (b -i).
3· a. 3/2 -2)
( -1/2 1 · b.
( 2/11
-3/11
1/11)
4/11 · -1~z). c. E 1-1 -
-
(2o o)1 , E-1
2 = 5 o)
1 , (1 E-1
3
(1
= o -6o) , E-1
4 = o (1 13) .
d. ~(-i -: -~)· e. (bo g ~)· -2 8)
1 -4. d. (~ ~).
-3 6 -1 1 o 6 1
3. There are many possible answers for each matrix.
4. a. Show that the product B- 1 A - 1 is the in verse of AB.
b. By induction on k for k~ 2. Use the fact that (A 1A 2 •• ·AkAk+t)- 1 a. (Ó ~) (? Ó) (Ó ~) (Ó -f)· b. G ~) (Ó -~) (Ó j).
(z l ?) (~ l ?) (8 ? l) (8 ! ?)(8 r ?)(8 r ?) .
= ((A1Az • · ·Ak)Ak+t)- 1

5. For Theorem 4.18 (=>) Suppose Tis nonsingular, then T- 1 exists. Let rp(T- 1) c.
= B and show that B = A- 1 •
( <=) Suppose A- 1 exists. Then thereexists Ssuch that q¡(S) = A- 1 (Why?)
Show that S= r- 1 •
d. (~ l g) (~ ~ ~) (~ ~ b) (~ r ~) (~ b ~) (g b ~).
6. a. (~) = ~r =
(i -
b. x = 14, y= 3, z = 13.
1
mH-i ~) (~) = e~m ·
c. a= 9/8, b = -3/8, e= 13/16, d = 1/2.
4. a. (l ~ ~
o1• o1 o)o -
o o 1
456 ANSWERS ANO SUGGESTIONS Chapter 4 457

b. (6, 5, -13). c. (9a- b, 26a- 7b, -46a + llb).


~) ~
1 o1 2 o1 -31 oo) ~ (1o o1 -22 o1 -31
(g 5 -4
-2
o -2 1 o o 6 -5 13 2. a. T is the identity map.
b. T(a, b, e) = (6a- 6b- 4c, -6a + 10b + 3c, 4a- 2b + e).
1 o 2 o 1 o ) (1 o o 5/3 -10/3 -1/3)
(g 1 -2 1 -3 o - o 1 o -2/3 4/3 1/3 . 3. a. Show that 1 is an isomorphism.
o 1 -5/6 13/6 1/6 o o 1 -5/6 13/6 1/6 b. If T: 1"' - 1f' is an isomorphism, show that T- 1 exists and is an iso-
3 1 4)-l
Therefore ( 1 O 2
2 5 o
( 5/3 -10/3 -1/3)
= -2/3
-5/6
4/3
13/6
1/3 .
1/6
b. -5/2 3/2 . ( 2 -1) morphism.
c. Show that the composition of two isomorphisms is an isomorphism.

c. (
-3/8
-1/2
3/16
o1
-1/2
1/8
1/2
-1/16
-17/12)
-1/3
21/24. d.
( 9/2
-3
8 -15/2)
-S 5 .
4. a. A= (_i
. (3, 4)}, B2 = {E1 }.
-n. b. The rank of A is 2 ,¡. l. d. B1 = {(5, 2),

3/8 o -1/8 1/12 2 3 -3


6. a . .iVr = .2'{(3, -8, 2)}.
5. Consider the product (1~ -1) (¡ ~). b. There are many possible answers. B 1 can be taken to be {(1, O, 0), (0, 1, 0),
6. Let q¡ be the isomorphism q¡: Hom(1"'n) - ,((n X n with q¡(S) = A and q¡(T) = B. (3, -8, 2) }, which is clearly a basis for 1"' 3 • Then since T(O, 1, O) = (O, 1, 3),
Show that T = O so that B = q¡(O) = O. the basis B 2 must have the form {U, (0, 1, 3), V}, with T(l, O, O) = 2U-
(0, 1, 3) + 3 V. There are now an infinite number of choices for B 2 given by thc
7. a. Since A is row-equivalent to B, there exists a sequence of elementary row
solutions for this equation, satisfying the condition that U, (0, 1, 3), V are
operations that transforms A to B. If this sequence of operations is performed linearly independent.
using the elementary matrices E¡, ... , Ek, show that Q = Ek · · ·E2E1 is
nonsingular and satisfies the equation B = QA. 8. For the first part, suppose A 2 =A and A is nonsingular. Show that this
b. Suppose Q1 A = Q2 A, if A is nonsingular, then A- 1 exists. implies A = 110 •

c. ( -1 ~) G ¡) = (ó ~) = (~ -~m G ¡) · The 2 x 2 matrix (~ ~) satisfies these conditions if a + d = 1 and


2
be= a- a •

e~ -2 8).
2
~)·
-4
8. a. ( 3/2
-1/2
-2)1 . b. ( -15 1 c. -1 9. a. §r = {(x, y, z)jy = 3x}. b. .iVr = .2'{(2, 1, -7)}.
-13 10 1
A-1. c. The line parallel to .iVr that passes through U has the vector equation
d.
P = t(2, 1, -7) +U, tER.
o -2 1 o
9. a. (g 1/2
o ~)· b. (g 1
o ?)· c. (o1 o
o o ?)· d. (g 1
o 1)· 11. b. T(a, b, e) E Y if (a + b) - (b + e) + (a - e) = O, or a = c. Therefore
T- 1 [.9] = {(a, b, c)ja =e} and T[T- 1 [.9]] = {T(a, b, a)ja,b E R} = 2'{(1, 1, O))
o
o o) #Y.
e. (g 1 o
1 .
c. T- 1[.'1] = {(a, b, c)j4a - b - Se = 0}. d. 1"' 3 •
10. See problem 7a. 12. a. k= -12. b. No values. c. k= O, -2.
. .._...,
(-~ ~Y)·
-4/3 13. Suppose S E Hom(C 2). We can show that it' S is nonsingular, then S sends
11. a. 1
o
b. (J -5)
2 . parallelograms to parallelograms; and if S is singular, then it collapses paral-
lelograms into lines.
12. a. With Q as in 8a, b. With Q as in Se, c. With P as in llb,
For U, V E @" 2 , write ( ur, vr) for the matrix with U and V as columns.
p = (1o o
1 -8)
2. p = (1o o 18)
1 -1~ . Q= ( 0 o 0o).
1 1 Then area (0, U, V, U+ V)= jdet(Ur, vr)¡ is the area of the parallelogram
o o 1 o o -5 6 1 with vertices O, U, V, and U + V.
If A is the matrix of S with respect to the standard basis, then (S(U))T
= AUr. That is, in this case, coordinates are simply the components of the
Review Problems Page 196 vector.
S sends the vertices O, U, V, U+ Vto O= S(O), S( U), S( V), S( U)+ S( V)
l. a. (2,5):(-1,2)¡<o. 1 ¡, 0 , 3 ¡¡and(~ 1)(-1) = (_i),soT(2,5)=4(-2,1,0) =S( U+ V). Now the area of the figure with these four vertices is given by
jdet((S(U))r, S(V)Y)j = jdet(AW, AVT)j = jdet A(W, vr)j
+ 7(3, 1, -4)- 3(0, -2, 3) = (13, 17, -37). = jdet A det(W, VT)j = jdet Al area (0, U, V, U+ V).
,.-.,
458 ANSWERS ANO SUGGESTJONS
o
V> Cll
'-' ,.-.,
11
,.-., .
..........
'-'
.......
So if det A
no area.
= O (and S is singular), then the image of every parallelogram has
-.....· .,
11 -
,.-.,.....;-
'-'
::>..
'-' ,.-.,
&-.-
• r<i'
1\ '-'
'-' 1 '
14. a. A2 = 4A - 3[2. b. A = 7A.
2
c. A = 2A •
3 2 h 1
1 ' 1 \
d. A 3 = 4A 2 - 9h e. A 3 = 4A 2 - A - 6/3 • f. A4 = O. 1 1 \
1 \
1
1 \
1 \
1 \
1 \
\,.-.,
Chapter 5 \ o
.
§1 Page 207
-
h
1. Por each part, except b, you should be able to find the coordinates ofthe vectors
in B' with respect to B by inspection.

(-~
1
a. (-~ ~). b. ( 2/9 -1/9)
-1/9 5/9 . c. 3
-V·
el
-5

-~)·
-1 2
(~
~
Cll
,.-.,
-4o 21) . -3 2 -2) 1 . f. -4
......
e.
d.
1 o 4 2 o 5 (i Cll
,.-., '-'
1
The transition matrix from B to {E¡} is (31 25r -- -5) , and ..... N M
2. a. (
-12 od;
......
3 '-'
od;
(1)
,......
(1)
.... ....
(_¡ -~)(j) = (-D, SO (4, 3): (-7, 5) 8 • ::;¡
.------
'-' ::;¡

---
C) C)

b. (-13,8)s. c. (4,-3)s. d. (2a-5b,3b-a)s. u: 1 u:


.....
'-'¡
1
1
1

(-~ ~ ~)-1 = (-~~ -j -~~).


1
3.
1 2 3 -·4 -1 6
a. (2, -2, 1)8 • b. (1, -1, 1) 8 • c. (2, O, l)s.
4. a. Por this choice, (0, 1) = (2, 1) - 2(1, O) and (1, 4) = 4(2, 1) - 7(1, 0).
If V: (xt> x2)s and V: (x~, x~)s•, then
V= x 1 (2, 1) + x2(1, O) = x~(O, 1) + x~(l. 4)
= xa(2, 1)- 2(1, O)]+ x~[4(2, 1)- 7(1, O)] ~~ ~
= [x~ 4xn(2, 1) + + [-
2xi - 7x~](1, O).
~ .......... ~
M,....
Therefore x 1 = x~ + 4x~ and x2 = -2xi - 7x~ or ::>.. ::>.. '-' '-'
Ql ,.-.,
,.-., Ql •
X= (~~) = (_i _j) (~D = PX'. ><
o
V>
.....
.....
'-' '-'
5. a. To prove (AB)T = BrAr, Jet A = (a u). B = (bu). Ar = (cu). Br = (du),
AB = (e 11 ), (ABY = Uu), and BTAr = (Bii). Then express fu and Bu in terms
S ~
Cll
~
....
6.
of the elements in A and B to show that fu = Bu·
a. The transformation changes Iength, angle, and area. See Figure A2.
b. The transformation is called a "shear." The points on one axis are fixed, Cll
,.-.,
1

o
'-'
o
-
'-'

so not alllengths are changed. See Figure A3.


c. The transformation is a "dilation." Lengths are doub!ed, but angles .....
'-'
remain u~anged.
460 ANSWERS ANO SUGGESTIONS Chapter 5 461

d. The transforrnation is a rotation through 45°, length, angle, and area y


rernain u.nchanged by a rotation.
e. The transforrnation is a reflection in the y axis. Length and area are. un-
T
changed but the sense of angles is reversed.
f. This transforrnatioil can be regarded as either a rotation through 180°
or a reflection in the origin.
8. Suppose Pis the transition rnatrix frorn B to B', where B and B' are bases for
"Y and dirn .:Y = n. Then the colurnns of P are coordinates of n independent
vectors with respect to B. Use an isornorphisrn between "Y and .,u. x 1 to show
that the n colurnns of Pare linearly independent. X

§2 Page 213
p

l.
A=(~ -1)
-4, A'= c61i -S)
-~ ' p
1) Q =(~'1
=(23 2'
1
o
~)·
Figure A4
1 o
2. A=G -3)1 ' .4' = (-11 -1~). p =G l)· d. The property of having a certain rernainder when divided by 5.
4. a. Por transitivity, suppose a ~ b and b ~ e, then there exist rational
3. If Pis nonsingular, Pis the product of elernentary matrices, P = E 1E 2 • • ·Ek,
numbers r, s such that a - b = r and b - e = s. The rationals are closed
and AP = AE1E2 · · ·Ek is a rnatrix obtained frorn A with k elernentary colurnn
under addition, so r + s is rational. Since a - e = r + s, a ~ e and ~ is
operations.
transitive.
4. a. Suppose Q = /2 , and P = (23 41) -l. b, [1] = [2/3] = {rir is rational}, h/2] = { v2 + rir is rational}.
c. The nurnber is uncountably infinite.
5. a. Jf T(a, b, e) =(a+ b + e, -3a + b + Se, 2a + b), then the rnatrix of
T with respect to {(1, O, 0), (0, 1, 0), (1, -2, 1)} and {(1, -3, 2), (1, 1, 1), 5. b. Given PQ and any point X E E 2 , there exists a point Y such that PQ ~ XY.
(1, O, O)} is the desired rnatrix. Therefore drawing all the line segments in (f<2l would completely fill the
7. lf I. = p-t AP, then PI.P- 1 = A. plane. So only representatives for [PQl can be drawn.
12. T(U_;) = rpj. c. The set of all arrows starting at the origin is a good set of canonical forros.
Compare with Example 2, page 33.
d. 1t is possible to define addition and scalar multiplication for equivalence
§3 Page 221 classes of directed line segments in such a way that the set of equivalence
[JJQ] + [XY] can be defined to
~

l. a. Reflexive and transitive, but not syrnrnetric. classes becomes a vector space. Por example,
b. Syrnrnetric, but neither reflexive nor transitive. be [PT] where ~ XY PS
and P, Q, S, Tare vertices of a parallelogram as in
c. The relation is an equivalence relation. Figure A4. In this vector space any arrow is a representative of sorne vector,
d. Syrnrnetric and transitive, but not reflexive. but the vectors are not arrows.
e. Reflexive and syrnmetric, but not transitive.
6. b. The lines with equations y = 2x + b, where bE R is arbitrary.
2. Suppose a is not related to anything. Consider problern Id.
7. a. K T· b. Al! planes with normal (2, -1, -4).
3. a. Por syrnmetry: if a ~ b, then a- b =Se, for sorne e E Z. Therefore 8. a. Row-equivalence is a special case of equivalence with P = !,..
b- a= S( -e). Since -e is an integer, b ~ a. This equivalence realtion 1
b. Rank beca use of the reason for part a.
is usually denoted a = b mod S, read "a is congruent to b modulo S."
c. The (row) echelon forro.
b. Five. Por exarnple the equivalence class with representative 7 is
[7] = {pE Zl7- p = Sk for sorne k E Z} d. An infinite number, since (g g), (8 b), (b ?) , and (b ~), for each
= {7 - SkJk E Z} = {2 + 5hJh E Z} = [2]. x E R, belong to different equivalence classes.
462 ANSWERS ANO SUGGESTIONS Chapter 5 463

9. A few are area, perimenter, being isoceles or having a certain value for the 10. a. The diagonal entries. b. They are invariant under similarity.
lengths of two si des and the included angle. 11. a. T is singular. b. ff T•
10. Write the matrix as (J~ g:). then 0 1 is k x (m- k), 02 is (n- k) x k, ~nd
03 is (n - k) x (m - k). §5 Page 237

l. a. -7- ,1. + .1. 2 • c. -11 + 2,1. + .1.2 •


§4 Page 230 d. -5..1.- .1. 3 •

l. For each part, there are many choices for P and the diagonal entries in D 2. a. 2- 5,1. + .1. 2 • d. -23- 3,1. + .1. 2

might be in any order. 3. a. sn-1 = 2.::1.1 A¡¡, where A;; is the cofactor of a¡¡.
a. P= (-~ -~). D = (~ -~). b. P = (-~ -D• D = (ci -?)· b. The characteristic polynomial of A is IAI - s2A. + (tr A)A. 2 - .1. 3 •
i. -2 + 6.1. + 2,1.- ,1. 3 , ii. 35,1.- ,1. 3 . iii. 1 - 3,1. + 3.1. 2 - ,1. 3 ,
c. P= (?o ¡1 -~).
-3
D = diag(3, 5, 0). 4. You should be able to find two matrices with the same characteristic poly-
nomial but different ranks.

d. P = (-Io -118
?),
2
D = diag(2, -4, 10).
5. a. The characteristic polynomial is 2- 4,1.
b. The plane has the equation x + z = O.
+ 3.1. 2
- .1. 3 • t = 2'{(1, O, 1)}.

2. a. There are no real characteristic roots, therefore it is not diagonalizable. 7. b. ff T•


b. The characteristic equation has three distinct real roots since the discri- 8. The maps in a, b, and e are diagonalizable.
minan! of the quadratic factor is 28. c. (a, b, e, d)T is a characteristic
9. b. .9'(1) = 2'{(1, -1, -1)} and .9'(3) = 2'{(1, O, 1), (0, 1, 0)}. Since
vector for 1, if 4a - 3c +
d =O and this equation has three independent ¡; {(1, -1, -1), (1, O, 1), (0, 1, O)} is a basis for "f/3, "f/3 = Y(l) (f) .9'(3). This
solutions. d. There are only two independent characteristic vectors.
·~·;:· also implies that Y(l) n .9'(2) = {0}, but to prove it directly, suppose
3. All vectors of the form (h, k, h + 2k)T, with h 2 + k 2 # O, are characteristic U E .9'(1) n .9'(3). Then U = (a, -a, -a) and U= (b, e, b) for sorne a, b, e E R.
vectors for 2 and (k, - 2k, 3k), with k # O, are characteristic vectors for -4. But then a = b = -a =e implies U= O and .9'(1) n .9'(3) = {0}. Thus

(61 ?2 -~).
the sum is direct.
lf P = then p- AP = diag(2, 2, -4).
1
10. Any plane n parallel to !T can be written as {U + Pi U E !T} for sorne fixed
3
PE Y(l).
4. (2k, k, -k)r, with k # O, are the characteristic vectors for 2, and (k, k, k)T,
k # O, are the characteristic vectors for l.
11. a. T(U) = (1, 2, 2), T 2 (U) = (-1, 3, 7). b. (?o g1 -:).
5
5. AO = ..1.0 for all matrices A and all real numbers A..
6. Not. necessarily, consider (f ~). c. T 2 (U)E2'{U, T(U)}. d. (! ~ ?)· Notice that the characteristic
7. The matrix A - U. is singular for each characteristic value ..1., therefore .the values for Tare on the main diagonal.
system of linear equations (A - U.)X = O has at most n - 1 independent
equations and n unknowns. Review Problems Page 238
8. a. 2, 4, and 3, for subtracting any of these numbers from the main diagonal
yields a singular matrix. b. 5 andO; forO, note that two columns are equal.
c. 7 and -2; subtracting -2 yields two equal rows. ···¡
l. a. (6 ? g). b. There are many answers. If B~ = {(1, O, 0), (0, 1, 0),
(1, -2, -1)} then B~ = {(2, 1), (-1, 2)}.
9. a. Every polynomial equation of odd degree has at least one real root.
2. b. For consisten! systems, choose the equations exhibiting the solutions
b. (?1 g1 og) and (6o -1g ?)o give two different types of examples. using the system with augmented matrix in echelon form. What about in-
consisten! systems?
464 ANSWERS ANO SUGGESTIONS :2> Chapter 6 465

3. c. (ó0 ?0 8) for each x in R.


X
8. a.
10. a.
8, O.
x = z = 1, y= O.
2 b. X= 13/100, y= -11/100, Z = 17/100.
5. The discriminant of the characteristic equation is -4 sin 8. Therefore a
C. X= 3, y= -1, X= 2.
rotation in the plane has characteristic values only if 8 = nn, where n is an
integer. There are two cases, n even or n odd. 11. f is symmetric and bilinear but it should be easy to find values for x, y, and
z for whichfis not positive definite.
6. b, e, and d are diagonalizab!e.
7. Definethe sum by !J'+ff+6lt={U+V+W!UE!J', VE§', WEd/t}.
Two equivalent definitions for the direct sum are: 9' + ff + q¡ is a direct §2 Page 254
sum if 9' n (ff + 6/1) = ff n (9' + 6/1) = q¡ n (9' + ti)= {O}. Or 9' + t7
+ q¡ is a direct sum iffor every vector V E -r, there exist unique vectors X E 9', 1. a. 1/VJ. b. .JB/3. c. 1/v'I.
Y E t/, Z E q¡ such that V= X+ Y+ Z. 2. a. (U, V) 2
= 25, IIUII = 10 and 11 Vll 2 = 5.
2

8. a. There are three distinct characteristic values. b. (U, V) 2 = 1/9, IIUII 2 = 1/3 and IIVII 2 = 2/3.
b. 9'(2) = 2'{(0, O, 1, 0), (1, O, O, 1)}, 9'(-2) = 2'{(1, O, 3, -3)}, and
9'(-3) = 2'{(0, 1, o, 0)}.
4. The direction ( <=) is not difficult. Por ( =),
consider the case (U, V) = 11 Ullll VIl
with u, V.¡. O. The problem is to find nonzero scalars a and b such that
c. Pollows since {(0, O, 1, 0), (1, O, O, 1), (1, O, 3, -3), (0, 1, O, O)} is a a U+ b V= O. If such scalars exist, then O = (a U+ b V, V) = 11 Vll(all Ull
basis for -r 4· + biiVII). Therefore a possible choice is a = IIVII and b = -11 VIl. Consider
(aU + bV, aU + bV) for this choice.
9. a. 8- 4A. + 2A. 2 - ,13. d. (?0.1g -~).
2 5. The second vector (x, y) must satisfy (x, y) o (3/5, 4/5) = O and x 2 + y2 = l.
Solutions: (4/5, -3/5) and (-4/5, 3/5).
12. a. T(a + bt) = -33a- 5b + (203a + 31b)t.
b. B' = {5- 29t, 1 - 7t}. 6. a. There are two vectors which complete an orthonormal basis, ±0/v'I, O,
1/v'J.). Note that these are the two possible cross products of the given
vectors.
b. ± 0/3v2)(4, 1, -1). c. ± (3/7, -6/7, 2/7).
Chapter 6 7. Prove that {Vj(V, U)= O for all U E S} is a subspace of "f/'.
8. a. Use ( , )n. b. Since (2, -7): (2, - 3)s, 11 (2, -7)11 = .JTI.
§1 Page 246 9. b. {(1/V6) (1, O, 2, 1), (1/.J14) (2, 3, -1, 0), (1/.JTI) (-3, 2, O, 3)}.
10. a. Use the identity 2 sin A sin B = cos (A - B) - cos (A + B). Why
l. a. 2 b. O. c. 13. d. -4.
isn't the integral zero when n = m?
2. a. i. l. ii. 10. iii. l. iv. O. v. 8. vi. -14. b. t - (1/n) I;;'= 1 (1/n) sin 2nnx. Notice that this series does not equal f(x)
b. Por linearity in the first variable, let r, sER and U, V, W E -r 2 . Suppose when x = O or x = 1; at these points the series equals H/(0) + /(1)).
U:(a 1 , a2)n, V:(bl> h 2)8 , and W:(c¡, c2)n, then (rU + sV):(ra 1 + sb 1 ,
ra2 + sb2)n (Why?). Using this fact, (rU + sV, W) 8 = (ra 1 + sb 1 )c 1 + (ra 2
+ sb2)c2 and r(U, W)n + s(V, W)n = r(a¡C¡ + a2c2) + s(b 1 c1 + b2c 2 ). §3 Page 260
4. a. l. b. o. c. 9. ·d. -14.
l. a. V1 = (1, O, O) and V2 = (0, O, 1) for W2 = (3, O, 5) -
5. Recall that OV = O. [(3, O, 5) o (1, O, 0)](1, O, O) = (0, O, 5).
6. a. 5, O, -5. b. Since b is not symmetric, it is necessary to show that b. {(2/.J13, 3/.J13), ( -3/./f3, 2/.J13)}.
b( U, V) is linear in each variable. c. {C1/V3, -1/V'J, I/V3), co.
1/v'I, 1/V'I)}.
7. a. 2, sin l. b. b(rf + sg, h) = (r/(1) + sg(1))h(1) = (r/(1))h(l) + sg(1)h(1) d. {(1/2, 1/2, 1/2, 1/2), (1/V6, O, -2/V6, 1/V6), (- 2/.J30, - 3/.J30, 1/.J30,
= rb(/, h) + sb(g, h), fill in the reasons. Since b is symmetric, this implies b 4/.J3Q)}.
is bilinear. c. Consider !(x) = sin nx. 2. {1, 2v3t- v3, 6v:St 2 - 6v:St + vs}.
466 ANSWERS ANO SUGGESTIONS Chapter 6 __ _ 467

teristic polynomial for T is -1 + A. + A. 2 - A. 3 = -(A. + 1)(A. - 2A. + 1),


2
3. a. {(1/2, 0), (l/._/2S, 2N7)}.
b. Check that your set is orthonormal. so the other two characteristic values are both l. Therefore T is the reflection
in the plane through the origin which has a characteristic vector corresponding
4. {(1, o, O), (2, t, O), (-2¡..¡3, -tN3, tN3)}.
to -1 as norm.al. (1, -2, 3) is such a vector, so T is the reflection of cf 3 in the
.? = {(x, y, z)/2x + y+ 4z = O, x + 3y + z =O}
1
6. a. plane x - 2y + 3z = O.
= 2 {(11, -2, -5)}. c. The determinant is -1, so -1 is a characteristic value for T. The charac-
b. 2{(3, -1)}. c. {(x, y, z)/2x- y+ 3z = 0}. teristic polynomial is as in part b. Therefore T is a reflection. T(1, -1, 1)
d. {(x, y, z)/x = 2z}. e. 2{( -2, -4, 4, 7)}. = - (1, -1, 1), hence T reflects cf 3 in the pla~e :Vith equati.on x - ~ + z = ~·
8. b. 2{(0, 1, O)} is invariant under T. 2{(0, 1, O)}T = {(x, y, z)/y = 0}. d. The determinant is -1 and the charactenst1c polynomJal of T IS -1 + ,A.
9. .? = 2{(4, -1)}. cf 2 =.? + [/ 1 because {(1, 4), (4, -1)} span~ The
1 + ~A.2 - A,3 = -(A.+ 1)(A.2 - 5A./3 + 1). Since the discriminant of the
second degree factor is negative, there is only one real characteristic root.
sum is direct, that is,.? n .? 1 = {0}, because {(1, 4), (4, -1)} is linearly Thus T is the composition of a reflection and a rotation. A characteristic
independent.
vector corresponding to -1 is (1, -3, 1). Therefore the reflection is in the
10. .? 1 = S!'{(O, 1, O, 0), (4, O, S, -5)}. plane x- 3y + z =O and the axis ofthe rotation is 2{(1, -3, 1)}. (1, O, -1)
1
E 2{(1, -3, 1)} so the angle of the rotation rp, disregarding its sense, is the
angle between (1, O, -1) and T(l, O, -1) = (4/3, 1/3, -1/3). Thus cos f/1 = 5/6.
§4 Page 269 e. T is the rotation with axis 2{(2, -2, 1)} and angle n.
f. Tis therotation with axis 2{(1, 1, O)} and angle f/1 satisfying cos f/1 = -7/9.
l. a. Since the matrix of T with respect to {E¡} is ( vM~ ~I~~)' T is a 9. Suppose T is the reflection of cf 2 in the line & with equation P = tU+ A,
reflection. The line reflected in is y = (1 v3)x. tE R. Consider Figure A5.
b. The rotation through the angle -n/3. 10. b. An orthonormal basis {U¡, U2, U3 , U4} can be found for cf4 for which
c. The rotation through rp where cos rp = -4/5 and sin rp = -3/5.
the matrix of T is either
d. The identity map, rp =O. e. Reflection in the line y= -x.
3. You need an expression for the inner product in terms of the norm. Consider
~ ~)· (8 b ? O ' oo) (1o o o oo)
-1 (-1o -1o o oo)
1/U- V// 2 • (og b
o b d o o o -1
O
o
O a e '
o b d
or O
o
O a e
o b d
1¡..¡3 1¡..¡3 1¡..¡3)
where (a e) = (c?s f/1 -sin f/1)
1/3 2/3 -2/3)
4. a. ( -2/3 2/3 1/3 . b.
( -1¡..¡2 o 1N2 . b d sm rp cos f/1
for sorne angle rp. The first matrix gives a
2/3 1/3 2/3 l/v6 -2¡..¡6 1N6
rotation in each plane parallel to 2{U3, U4 }, that is, a rotation about the
5. a. The composition of the reflection of cf 3 in the xy plane and the rotation plane 2 {U¡, U2 }, and the second matrix gives a reflection of E4 in the hyper-
with the z axis as the axis of rotation and angle -n/4. plane 2{U~> U2, U3}.
b. Not orthogonal. 11. This may be proved without using the property that complex roots come
c. The rotation with axis 2{(1 - v2, -1, O)} and angle n. in conjuga te pairs. Combine the facts det A-l = 1 and det(ln - A) =
d. The rotation about the y axis with imgle n/4. ( -l)"det(A - / 11) to show that det(A - /") = O.
e. Not orthogonal.
6. Use the definition of orthogonal for one, and problem 2 for the other. §5 Page 275
7. a. The determinant is 1, so Tis a rotation. (2, 1, O) is a characteristic vector
for 1, so the axis ofthe rotation is 2{(2, 1, O)}. The orthogonal complement of l. a. (7, 7i '---- 3). b. (2 + 4i, 2i- 14, 10).
the axis is the plane [/ = 2{(1, -2, 0), (0, O, 1)}; and the restriction Ty 6) ' . (1 - i S - 6i 1 + 4i )
c.
Si-
( Si d. 2 + 5i 13 - i 4 + 13i .
rotates.? through the angle rp satisfying cos rp = (0, O, 1) • T(O, O, 1) = 2/7. i 1 + 3i i- 1
The direction of this rotation can be determined by examining the matrix of
Twith respect to the orthonormal basis {(2N'S, 1/V''S, 0), (1/y''S, -2/y'S, 0),
(0, o, 1) }.
2. a. Det et ~
3 2
i ¡)
= 1 - i ,¡. O, so the set is linearly indep!mdent.
Since the dimension of 1'" 2 ( C) is 2, the set is a basis.
b. The determinant is -1, so -1 is a characteristic value for T. The charac- b. (1 + i, 3i): (4, -3) 8 , (4i, 7i- 4): (1 - i, 2i- 1)s.
:::V 468 ANSWERS ANO SUGGESTIONS Chapter 6 469

3. b. For every w E e, [1 + i -iw](i, 1 - i, 2) + [2 + (1 - 3)w](2, 1, -i)


+ w(5 - 2i, 4, -1 - i) = (3 + i, 4, 2).
6. a. 3 + 4 = 2, 2 + 3 = O, 3 + 3 = l.
b. 2·4 = 3, 3·4 = 2, 4 2 = 1, 26 = 4.
c. -1=4,-4=1,-3=2. d. 2- 1 =3,3- 1 =2,4- 1 =4.

~
7. 22 =O in Z 4 • Why does this imply that Z4 is nota field?
1
,-.. 8. a. .'t'{(l, 4)} = {(0, 0), (1, 4), (2, 3), (3, 2), (4, 1)}.
:::.. b. No. c. 25. d. No.
'-'
f-.
9. a. IfU = (z, w)andaE e, thenT(aU) = T(a(z, w)) = T(az,aw) = (aw, -az)
= a(w, -z) = aT(U). Similarly T(U + V)= T(U) + T(V).
b. A. 2 + l. c. If z is any nonzero complex number, then z(l, i) are
characteristic vectors for the characteristic value i, and z(l, -i) are charac-
teristic vectors for - i.
d. The matrix of Twith respect to the basis {(1, i), (1, -i)} is diag(i, -i).
10. b. A. 2 - (3 + i)A. + 2 + 6i.
c. The discriminant is -18i. To find .J-18i, solve (a+ bi) 2 = -18i for
a and b. If z is any nonzero complex number, z(i - 2, i) are characteristi::
vectors for the characteristic value 2i, and z(l + i, -2i) are characteristk
vectors for 3 - i.
d. The matrix of T with respect to the basis {(i- 2, i), (1 + i, -2i)} is
diag(2i, 3 - i).
13. b. Dim "1' = 4. Notice that "1' # "1' z(e).
14. b. Vectors such as (v2, O) and (1, e) are in "1' but they are not in "1' 2 (Q).
c. No. d. If (a/b)l + (e/d)v2 =O with a, b, e, dE Z and e # O, then
v2 = -ad/be but v2 rf= Q. e. No since n rf= Q. f. The dimension
is at least 3. In fact, "1' is an infinite-dimensional vector space even though the
vectors are ordered pairs of real numbers.
15. Suppose there exists a set P e e closed under addition and multiplication.
Show that either i E P or - i E P leads to both 1 and -1 being in P.
-___,
§6 Page 285

l. a. 4i- l. b. -4- lli.


2. a. {(x,y,z)J-ix+2y+(l-i)z=0}. b. .'t'{(i,2i,-l)}.
4. a. Use< ' >R· b. (4i, 2): (1 + i, -2)u, so Jj(4i, 2)jj = v6.
5. a. Neither. b. Both. c. Symmetric. d. Hermitian. e. Neither.
f. Neither.
7. a. The characteristic equation for A is ..1.2 + 9 = O, so A has no real charac-
teristic values.
b. If P = ( ~
3 3
1 ¡ 3 ¡),
then P- 1 AP = diag(3i, - 3i).
Chapter 7 471
470 ANSWERS ANO SUGGESTIONS

y' b. E=-A- 1 B,F=D-BTA- 1 B.


y
c. det (~ri····~) = det A det(D - BT A - B).
1

10. a. -277. b. 85. c. 162.

X Chapter 7
~

§1 Page 297

X2)(~ -~)(~J (~~)·


(0, -y'i4)B l. a.

c. (x
(X¡

y e
z) 562 -~
5/2 0) (X)
~ ~ .
b. (X¡

d.
X2)(7j2 762)

(x y z)
(
-1
5 o
O 4
1/2 l~2) (f)·
Figure A6
2. A = (~ ~) where x +y= -5.

3. a. 3a 2 + 2ab + 3b 2 • + 14ab + 51b 2 ).


8. a. (1 ~ i 1 t ¡). b. (l/25)(99a2
using (a, b):(Wa + 4b), t(4a- 3b)) 8 • c. 4a 2 + 2b 2 •
This is obtained
d. a2 + b •
2

b. ). = 1, 4. Jf P =(~ti 1
~ ¡), then p- AP = diag(l, 4).
1
5. When n = 2, the sum L.;}= 1 67 =1 a1Jx;x1 is
2 2
9. Yes. Examples may be found with real or complex entries. L.; a¡jX¡Xj + L.; a2jX2XJ = a¡¡X¡X¡ + a¡2X¡X2 + a21X2X1 + a22X2X2.
j~l j=l
10. a. T(a, b) = (afv''l.- bfv''J., ajy''J. + bfv''J.). Three other ways to write the sum are
b. B = {(1/V'l., 1/V'l.), (-1/V'l., 1/y''J.)}.
c. The curve is a hyperbola with vertices at the points with coordinates f; f;
i=l j=l
a¡jX¡Xj, t X¡ t O¡jXj,
i=l j=l
t Xj t a¡jX¡.
J=l 1=1
( ± 3, O)B or ( ± 3/ y''J., ± 3/ y'l.)¡E 1¡ and asymptotes given by y' = ± y''J.x' in 7. a. If b( U, V) = xrA Y with X and Y the coordina tes of U and V with respect
terms of coordina tes with respect to B.
to B, then b(U, V)= 21x1y¡ + 3x 1Yz + 21x2Yt + 30x2Yz· For example,
11. a. If B = {(2/VS, 1/VS), (1/v'S, -2/VS)}, then G ={Ve S 2 jV:(x', y') 8 a 12 = (4, 1) o T(l, -2) = (4, 1) o (4, -3) = 3.
and 6x' 2 + f 2 = 24}.
b. Figure A6.
b. 14x1 y¡ + 1XtY2 + 5x2Yt + 3X2Y2·
9. a. q(x, y)= 4xy- 5y2. b. q(x, y, z) = 3x 2 - 2xz + 2y + Syz.
2

Review Problems Page 286 10. a. X¡y¡ - 4XtY2 - 4X2Yt + ?x2Y2·


b. X¡YJ + X1Y2 + XzY1 - X2Y2 - 6XzY3 - 6X3Y2·
l. r >l. 12. Consider Example 4 on page 296.
2. a. {(1/V'l., 0), (3/v'l., y'J.)}. b. {(1/v'3, 0), (-1j,J15, ,J3í5)}. 13. a. lt should be a homogeneous 1st degree polynomial in the coordinates of
7. a. To see that sP need not equal (!1' 1 ) \ consider the inner product space of ·~ an arbitrary vector. Thus a linear form on "Y with basis B is an expression of
j the form a 1 x 1 + · · · + a.x. where X¡, ... , x. are the coordinates of an
continuous functions ff with the integral inner product. Let sP = R[x], the set
of all polynomial functions. Then using the Maclaurin series expansion for arbitrary vector with respect to B.
sin x, it can be shown that the sine function is in (!1'1 ) \ but it is not in !1'. b. If U:(x¡, ... ,x.)8 and Tis defined by T(U)= L:1= 1 a,x1 for Ue"Y,

8. a. T*(x, y)= (2x + 5y, 3x- y). b. AT; _AT~ then. Te Hom("Y, R) and (a¡, ... , a.) (JJ =AXis the matrix expression

9. a. (~1-;-l~~~····)· . ~;~;·t-~;~;). for T in terms of coordina tes with respect to B.


472 ANSWERS ANO SUGGESTIONS
Chapter 7 473

c. Show that if W is held fixed, then T( U) = b( U, W) and S( U) = b( W, U) y


are expressed as linear forros.
y'

§2 Page 304
----~--~---+---4--~~--._----~x
o
l. a.
c.
A=(~ -1V· b. A'= (ó -?)·
B' = {(2/5, 1/5), (-1/5'1/3, 2/5'1/3)}.
d. q(a, b) = xf- x~ if (a, b): (x¡, x 2)s··
2. b. The characteristic values are 9, 18, O; A'= diag(l, 1, 0). T(Ez)
c. B' = W/9, 2/9, -2/9), Cv2/9, v2/18, v2/9), c-2/3, 2/3, 1/3)}.
_____.___---------------~--------~ x'
d. q(a, b, e) = xf + x~ if (a, b, e): (x¡, x 2, x 3 )s··
T(O) T(E 1 )
3. a. A=c: ~).b. A'=pTAP=(-g g).
c. q(a + bt) = -4xf + 2x~ if a+ bt: (X¡, x2)s .. Figure A7
5. Consider Ietting X = Er and Y = EJ, E; and Ei from {E1 }.
6. a. r = 2, s =O, diag(l, -1). b. r = 2, s = -2, diag(-1, -1). b. Use the normal to obtain the rotation. In this case you rnust show that
c. det A = -3, sor= 3, tr A= 6, sos= 1, diag(l, 1, -1). your map sends the first hyperplane onto the second. (Why wasn't this necessary
d. The characteristic polynomial of A is -13 + 5). - 3). 2 - ). 3 • in a?)
Det A = -13, so r = 3. Tr A = -3, so the roots might be all negative or 6. Use the fact that O, U, V, U+ V, andO, T(U), T(V), T(V) + T(V) determine
only one might be negative. But the second symmetric function s 2 of the roots parallelograms to show that T(V + V)= T(V) + T(V).
is -5. If all the roots were negative, Sz would be positive so the canonical form 7. Show that two spheres are congruent if and only if they ha ve the sarne radius.
is diag(1, 1, -1).
Recall that X is on the sphere with center C and radius r if 11 X- Cll = r.
7. a. The ~atrix (~ 1 ~) is singular.
b. The characteristic polynomial for the matrix A of q with respect to {E1 } §4 Page 320
is 8- 21). + 11). 2 - ). 3 , This polynomial is positive for all values of).::;; O,
therefore all three characteristic roots are positive and A is congruent to 13 • A=(-~ -~), B ~ (0, 7), A=
3 -2
o o)
l. a. ( -2 7/2.
o 7/2 8
§3 Page 311 b. A= (~ ci), B = (-4, 6),
o
A= ( 4 o
4· -2) 3 .
-2 3 o.
l. If P is a transition ~. what are the possible signs of det P? 2. The characteristic values of A are 26 and 13 with corresponding characteristic
2. Figure A7. vectors (3, 2) and (- 2, 3). Therefore the rotation given by
3. a. A line has the equation P = tU+ V for sorne V, V E Cn, tE R. Show
that T(P) lies on sorne line e for all tE R and that if W E e, then T(tU + V)
(x)y (3/Vl3 -2/V!})
=
2/V13 3/Vl3 y'
(x')
= W for sorne tE R. yields 26x' 2 - 13y' 2 - 26 =O or x' 2 - y' 2 /2 = l. See Figure AS.
b. That is, does 11 T(V)- T(V)Ii = 11 V- VIl for all U, V E C,? 3. The rigid motion given by
4. Suppose A, B, and C are collinear with B between A and C. Show that if T(B)
is not between T(A) and T( C), a contradiction results. ({) = (-1¿~ ¡~~ ~/~15)(~::)
5. a. Given two lines e, and e2 find a translation which sends e1 to a Jíne
yields x" 2 = 4py" with p = 2. Let x', y' be coordinates with respect to B' =
through the origin. Show that there exists a rotation that takes the translated
{(4/5, -3/5), (3/5, 4/5)}, then B" is obtained frorn B' by the translation
line toa line parallel to e2 and then translate this line onto ez.
(x", y") = (x' - 2, y' + 3). See Figure A9.
474
ANSWERS ANO SUGGESTJONS Chapter 7 475

y'
4. .fTAA'= (x y 1)(_~ =¡ =~) ({)· If P= (g r -f) and g = P.f,'
then .f'T(FTAP).f' = 5x' 2 - 3y' 2 •
5. a. Intersecting lines. b. Parabola. c. Ellipse, (0, O) satisfies the
equation. d. No graph (imaginary parallellines). e. Hyperbola.
6. Probably, but since A is a 2 X 2 matrix, det(rA) = r 2 det A, so the sign of the
determinant is unchanged if r oF O.
7. b. The equation is -7t 2 + 28t- 21 =O. The values t = 1, 3 give (-1, 1)
and (1, - 3) as the points of intersection.
8. a. B' = {(1/V2, 1/v2), (-1/V2, 1/v2)}. The canonical form is x' 2 = 9.
See Figure AlO(a).
b. B' = {(1/.J10, -3/.J10), (3/.J10, 1/.JlO)}. The canonical form is x' 2 j32
+ y' 2 /( v'W = l. See Figure A10(b).

§5 Page 329

l. a. (yxz;') __ (0o1 g1 Óo) (x) , and multiplici:ition by -1 yields


~ x'
2
/4 + y' 2 j9 - z'2

Figure AS = -1.

b. ( ~) = (g ? -l) (~:). yields x'


2
/5 - y'
2
= l.
c. The translation (x', y', z') = (x- 2, y, z + 1) yields 4z' 2 + 2x' + 3y' = O,
and 4x" 2 - .J13y" =O or x" 2 = 4py", withp = .J13/16, results from

(x")
y" = (-2/V13
z"
O
3/Vl3
-3/Vl3 O
-2/V13 O z'
o1)(x')
y'

d. (x', y', z') = (x +2, y+ 1, z + 3), x' 2 /2 + y' 2 + z' 2 /3 = -l.


e. The rotation

(;) _(-~~~o -mo


z
1
- -2/3 2/3
=~~~ Z)(;:)
1/3 O z'
o 1 1
yields 9x' 2 - 9y'2 = 18z' or x' 2 - y'i = 2z'.
2. The plane y = mx + k, m, k E R, intersects the paraboloid x 2 fa 2 - y''fb 2 = 2z
in the curve given by
(1fa 2 - m 2 /b 2 )x2 - imxfb 2 + k 2 /b 2 = 2z, y = mx + k.
2
If 1fa # m 2 fb 2 , then there is a translation under which the equation becomes
x' 2 = 4pz', y'= m'x' + k'. This is the equation of a parabola in the given
plane: If l/a 2 = m2 /b 2 , the section is a Iine which might be viewed as a de-
x" generate parabola.
Figure A9
476 ANSWERS ANO SUGGESTIONS Chapter 7 477

4. Intersecting lines, parallel lines, and concident Iines.


y 6. a. x' 2 /8 + y' 2 /8- z' 2 /4 =-l.
7. a. Parabólic cylinder. b. Hyperboloid of 2 sheets.
c. Point. d. Intersecting planes.
8. Consider a section of the two surfaces in the plane given by y = mx. lt is suf-
ficient to show that as x approaches infinity, the difference between the z co-
ordinates approaches zero.

§6 Page 338

l. a .. (- 2, -2, 2) and (- 25/4, 9/4, -13/2). The diametral plane is given by


X + ?y - Z + 1 = 0.
b. (1, O, :-1) is a single point of intersection.
c. (- v'l/2, -1/2, 0), two conincident points of intersection. The diametral
plane is given by 2x + 2y2y + (4v'2 - 2)z + 2y'2 = O.
2. Suppose the line passes through the point O, then if O is not on the plane with
equation 4x + y - 2z = 5, the Iine intersects the surface in a single point. lf
y O is on this plane but not on the line with equations 4x +y - 2z = 5, ?x- 7z
= 15, then the line does _not intersect the surface. If O is on the Iine given above,
then the line through it with direction (l, -2, 1) is a ruling. (Does this give
an entire plane of rulings?)
4. a. For Á = diag(l/5, 1/3, -1/2, O) and O = (5, 3, 4, W we must find K
= (o:, p, y, O)T such that KrÁK = KrÁO = O. These equations can be com-
bined to yield (3y - 4P) 2 = O, o: = 2y - P. therefore K may be taken as
(5, 3, 4, O)r as expected.
b. K = (1, y'S, 4 - y'2, O)T or (1, - y'S, 4 + y'2, O)r.
c. K= (1, 1, 1, O)r or (9, 2, -12, O)T.
d. K= (O, 1, 2, O)T.
5. a. There are no centers. b. A plane, x + y + z = -1, of centers.
c. (1, 1, 3). d. A Iine, X= t( -3, 2, 1) + (3, -2, 0), of centers.
6. a. Kr AO = O, for then the Iine either intersects the surface in 2 coincident
points (Kr AK #- O) or it is a ruling (KT ÁK = 0).
b. If O is a center, then KTÁK = O for every direction.
c. The point X is on the tangent plane if X- Ois the direction of a tangent
or a ruling (X #- 0). Therefore (X - O)TAO = O. But O is on the surface,
so the equation is xrÁO =o. Is the converse c!ear? That if vr AO =o, the:~
Vis a point on the tangent plane.
7. a. z = l. b. (1, 1, -1) is a singular point-the center or vertex of the
x' cone. c. x- 8y- 6z + 3 =O. d. 2lx- 25y + 4z + 38 =O.
(b) 8. If A= diag(1/4, 1, 1/4, -1) and O= (2, 1, 2, l)T, then we must find K such
Figure A10 that (2KT A0) 2 - 4(KTÁK)(ÚTAO)= O. (Why?) The tangent lines are given
by X= t(1, O, 1) + (2, 1, 2) and X= t(9, 2, 1) + (2, 1, 2).
478 ANSWERS ·ANO SUGGESTIONS Chapter 8 479

Review Problems Page 339 c. Y(U, T) = .5!'{(1, 1, 1), (3, 6, 6)}; a(x) = x 2 - 7x + 9.
d. x 3 - 8x 2 + 16x- 9.
l. a. A= (-5~2 -5i2), B = (5, -7). b. A = G ~)' B = (O, 6). o -3~ o
L ..... } __ L..........
c. A = (io -I 5
~). B = (4, O, -9).
o
d. A =
o 3/2 -2)
3/2 3
(-2 1 o
1 , B =
4. b.
.:·........
-1:
---¡-·¡y····:.:.:T
(-5, 1, -2).
o ¡1 1
5. For the two block case, suppose A 1 E vltrxr and A 2 E vltsx., and use the fact
3. a. An inner product. b. Not symmetric. c. Not positive definite. that
4. a. -2; diag(-1, -1). b. O; diag(l, -1). c. 2; diag(l,J.._Q).
(At O) _ (At O)
O A2 -
O)
O I. O A2 .
(Ir
2
5. a. x' /16- y' /4 = 1; x' = x + 2, y' =y- 3.
2

b. x' 2 - y' 2 = O; x' = x - 5, y' = y + 2. 6. a. .5!'{(1, -3, 1)}. b. {(x, y, z)ix- 3y + z = 0}.
c. y'= 5x' 2 ; x = Wx'- 4y'), y= ~(4x' + 3y'). c. {0}. d. 1"3
d. x' 2 = 4.Y2; x = (I/V2)(x' +y'), y = (IJV2)(y' - x'). 7. a. .5!'{(0, 1, 0)}. b. .5!'{(0, 1, 0), (1, O, 1)}. c. .5!'{(2, 1, -2)}.
e. x' 2 /5 + y' 2 /10 = 1; x = (l/V5)(2x' + y' - 4), y = d. .5!'{(0, 1, 0), (2, 1, -2)}. e. 1" 3, for p(T) =O.
(1/VS)(-x' + 2y'- 3).
8. It is the zero map.
7. a. (4, - 2). b. ((x, y)i3x + 6y + 1 = 0}. c. (1, 1). 9. a. .5!'{(1, O, 2)}, b. .5!'{(1, O, 2)}, compare with 7a and b.
8. a. Hyperboloid of 2 sheets. b. Parallel planes. c. Hyperbolic c. .5!'{(1, o, 1), (0, 1, 0)}. d. {0}.
cylinder. e. Use problem 8 with parts a and c.
9. Ellipsoid or hyperboloid. See problem 9, page 127.
10. a. Any plane containing the z axis. §2 Page 358
b. Any plane parallel to the z axis.
c. Any· plane parallel to the z axis but not parallel to either trace of the l. Consider x 2 - xZ 2 [x] or in general the polynomial (x - e 1 )(x - e 2 ) • • •
E
surface in the xy plane. (x - eh) when F =
{e¡, ... , eh}.
11. Each line intersects the surface in a single point. 4. a. q(x) = 3x - x + 2, r(x) = x + 3.
2
c. q(x) = x 3 - 2, r(x) = x.
b. q(x) = O(x), r(x) = x 2 + 2x + l. d. q(x) = x 3 - 3x + 4, r(x) = O(x).
4/5 -3/5 11/5)
12. P = (3Ó5 ~5 2(5 . x' 2 /8 + y' 2 j2 = J. 5. Suppose a(x) = b(x)q 1 (x) + r 1 (x) with deg· r 1 < deg b and a(x) = b(x)q 2 (x)
+ rl(x) with deg r2 < deg b. Show that q¡(x) # q 2 (x) and r 1 (x) # r2(x) leads
to a contradiction involving degree.
6. Use the unique factorization theorem.
Chapter 8 7. a. g.c.d.: (x- 1)3(x + 2) 2 ; l.c.m.: x(x- 1)4 (x + 2) 4 (x 2 + 5).
b. g.c.d.: (x- 4)(x- 1); l.c.m.: (x 2 - 16)(x- 4)2(x 3 - 1).
§1 Page 347 8. The Jast equation implies that rk(x) is a g.c.d. of rix) and rk_ 1 (x), then the next
to the Jast equation implies that rk(x) is a g.c.d. of rk_ 1 (x) and rk-2(x).
l. a. Y(U, T) = .5!'{(1, O, 0), (3, 1, 1), (10, 8, 6)} and
9. a. x 2 + 1 is a g.c.d. The polynomials obtained are:
a(x) = x 3 - 8x 2 + 18x- 12 = (x- 2)(x 2 - 6x + 6).
q 1 (x) = 2x, r 1 (x) = x 3 - 2x 2 + x - 2; q 2 (x) = x + 1, r 2(x) = x 2 + 1;
b. Y( U, T) = S!'{ U}; a(x) = x- 2.
q3(x) = x - 2, r3(x) = O(x). ·
c. Y(U, T) = S!'{U, (-6, O, 0), (-18, -6, -6)}; a(x) as in parta.
b. Re1atively prime. q 1 (x) = 2, r 1 (x) = x 3 - 4x + 6: q 2 (x) = x + 2,
d. S!'(U, T) = .5!'{(0, 1, 0), (1, 3, 1)}: a(x) = x 2 - 6x + 6.
r2(x) = x 2 + x - 4; q 3(x) = x - 1, r3(x) = x + 2; q4(x) = x - 1, r4(x)
2. a. Y(U, T) = .5!'{(-1, O, 1), (0, 1, 2), (3, 7, 8)} and = -2; qs(x) = -!(x + 2), rs(x) = O(x).
a(x) = x 3 - 8x 2 + 16x- 9 = (x- 1)(x 2 - 7x + 9). c. ix + 1 is a g.c.d. q 1 (x) = x + 1, r 1 (x) = 2x 3 - x 2 + x + 1; q2(x) = 3a:;
b. Y(U, T) = S!'{ U}; a(x) = x - l. rl(x) = 2x 2 + x; q3(x) = x - 1, r3(x) = 2x + 1; q4(x) = x.
480 ANSWERS ANO SUGGESTIONS Chapter 8 481
~

10. Suppose c/d is in lowest terms, that is e and d are relatively prime as integers, 2. a. Show that if B = {U¡, ... , Uh}, then B = {U¡¡ T(U1 ), ••• , Th-. 1 (U 1 )f
and consider d"p(cfd). and if q(x) is not the mínimum polynomial of T,. then B is Iinearly dependen t.
11. a. ±6, ±3, ±3/2, ±2, ±1/2, ±1. c. ±8, ±4, ±2, ±l. b. Expand det(C(q)- Uh) along the Jast column and show that the cofactor
b. ±l. d. ± 1/9, ± 1/3, ±l. of the element in the kh position (occupied by -ak_ 1) is ( -:-1 )k+ll(- ;¡_)k- 1(I)k-l
when k< h and (-1)h+h(-J.)h,-l when k== h. ·
12. Consider x 2 + x + l.
3. a. Since the minimuin polynomial is (x- 2)3, 1"3 =S"( U¡, T) with U 1 any
13. Why is it only necessary to prove that well ordering implies the first principie
vector such that (T- 21) 2 U 1 # O. A possiblecyclic basis is {(0, O, 1); ( -1, 1, 2),
of induction? Given A such that 1 E A and k E A implies k + 1 E A, consider
(-6, 5, 4)}, which yields the matrix
the set B of all positive integers not in A.
17. c. z- 4 =a- b, x 2 + y 2 + (z- 4) 2 - 4 = (z- 3)a + (4- z)b,
4x 2 + 4y 2 - z 2 = (z + 4)b - za.
and
(?0 g1 -1~)-6 .
d. A curve in 3-space cannot be described by a single equation. b. The mínimum polynomial is (x- 2) 2• For U¡ 1{; .9'(2), choose u2 E .9'(2)
such that U 2 1{; .9'(U¡, T). For example, B = {(1, O, 0), (-2, 2; -4), (1, -1, O)i
§3 Page 366 yields the matrix

+ 7x- 3 = o -4 o)
l. a. x 3 - 5x 2
d. X - l.
(x- 1) 2 (x - 3). b. X- 3. c. (x- 1) 2 •
(o1 o4 o2 .
2. a. x2 + x + 2. c. If TE Hom(1"z(C)), then Tis diagonalizable. 4. b. Suppose U1 , ••• , U¡_ 1 have been chosen for 1 < i ::s;; h and 1"
3. a. (x + 4)(x - 3). b. (x + 2) 2 • c. x 3 - 3x 2 + 3x - l. = .9'(U1 , T) ® · · · ® .9'(U1 _ ¡, T). Then a maximal integer r 1 exists such that
d. (x 2 -1- 8x -1- 16)(x + 5). e. (x 2 + x + 1)(x + 2). p''- 1 (T) W 1{; .9' and p''(T) W E .9', for sorne W E 1". Show that p''- 1 (T) V= O
f. x 4 + 6x 2 + 9. g. x 2 - x- 2. for all V E B leads to a contradiction.

4. 3a and 3g. 5. a. diag(C(p 2 ), C(p 2 ), C(p 2 )), diag(C(p 2 ), C(p 2 ), C(p ), C(p )), diag( C(p 2 ), - 3/4 ).
b. 4h. c. diag(C(p 2 ), C(p)). d. C(p3).
5. a. .;V'T+4/Ef).JV'T-3/ =.9'(-4)(f).9'(3) = ..z'{(l, -1)}G;)..z'{(1,6)}.
e. diag(C(p 3), C(p 3)), diag(C(p 3), C(p 2 ), C(p)), diag(C(p 3), / 3).
b. .;V'(T+ 2J¡2 = 1" 2• C. .;V'(T-1)3 = 1" 3•
d. .;V'(T+41)2 G;).!V' T+SI = 2'{(1, O, 0), (0, 1, O)}® 2'{(0, O, 1)}. 6. a. (T- e!)' U= O implies (T- cl)((T- c!)r-I U)= O.
e. .!V'T 2 +T+I ® .!V'T+2I = 2'{(1, O, -1), (0, 1, O)}® 2'{(1, O, -2)}. 7. a. There are three linearly independent characteristic vectors for the charac-
f. .;V'(T2+3IJ2 = 1" 4• teristic value -1, therefore the (x + 1)-elementary divisors are (x + 1)Z,
g. .;V' T-21 ®.;V' T+I = {(x, y, z)!Sx -1- 10y"- 2z = O}® 2'{(1, -1, -1) }. X -1- 1, X -1- l.

6. (! =i -~)· 7. c. (? -~).
b. There are two linearly independent characteristic vectors for -1 so the
(x -1- 1)-elementary divisors are (x -1- 1) 2 , (x -1- 1) 2 •
o o -27: o
8. a. (2, -1, -5) = (T- I)E 1 • b. (3, 3, 3). c. (4, 2, -2). o -1 ! o 1 o -27!
1----.-..1 j
15. Define T1 = s1(T) o q 1(T), then 1 = T 1 -1- • • • -1- Tk and therefore each T 1 is
a projection. T1 o T1 = O for i cf- j follows from the fact that Ph(x)lq.(x) when 8. a.
--- :? :____: (:
-1; b.
o 1 -9!
----.. T-o _::_~r:

:o .. ...::"] j 1 -6:
h cf- g. m(x) = pfi(x)q¡(x) implies .!V'p('<TJ e T 1[1"], and it remains to obtain
o :1 -1
:o -9
the reverse containment. o : 1 -6
o o o -49: o
§4 Page 376 1 o o 56:
c.
o 1 o -30;
o o o 1 --------8!,- ó- --:_:-7
l. a. (5). b. (O1 -25)
lO. c.
(! o
1
125)
-75.
15
d. (o1 -4)
3 .
o :1 4

e.
o
\o1 oo1 oo
-16)
24
9. Suppose the given matrix is the matrix of TE Hom(-r 4) with respect to the
standard basis.
-17.
o o 1 6 a. The mínimum polynomial of T is (x + 1) 2 (x- 1) 2 • Therefore
482 ANSWERS ANO SUGGESTIONS Chapter 8 483

diag(C((x + 1) 2), C((x- 1)2)) is the matrix of T wíth respect to sorne basis. 4i o '_- o
This matrix is similar to the given matrix since both are matrices for T. 1 4i --
b. The mínimum polynomial of T is (x - 2) 2 (x + 4) and T has on\y one
characteristic vector for the characteristic value -4. Therefore T has only one ~4i o
1 -4i
(x - 4)-elementary divisor and because of dimension restrictions, there are
two (x - 2)-elementary divisors, (x - 2) 2 and x - 2. Thus diag(C((x - 2) ),
2 o ' -4i
d. (x + 1 - i) 3 and (x + 1 + i) 3 are the elementary divisors.
C(x - 2), C(x - 4)) is a diagonal block matrix similar to the given matrix.
o o 2 + 2i o
1 o 6i
§5 Page 386 D3 =
o-- 1 --3i-- 3--·ro :
o 2- 2i'
: 1 o -6i
o o 27 : o 3 o o: o o :o 1 -3- 3i
1 o -27 i 1 3 o!
9! o 1 3i i- 1 o o o
l. a. D¡ = O 1
--------::n - - ·-r--r·: 1 i- 1 o
o.. ·············1 ···-··········-··
o - ""fj o -------¡--3 i- 1
1-- ---o··
;·~¡-;___
o
o o o -16 : o o -4: o 1 -i- 1
1 o o o! 1 o: o o 1 -i -1
---------~-ro··-----¡·:
b. Dl = O 1 O -8 : b. Be= {(1, O, O, 0), (0, 1, O, 0), (-1, -2, O, 1), (0, 1, -1, -1)}.
o o 1 o: :1 o---: :.::.z- 3.
---:o --_:::4 o 4. The mínimum polynomial of A is x 2 + 9, i.e., A2 + 9h = O. A classical form
o • 1 -4 o 1 -2
o o o -256: o o -16- o o o o ver R is díag { (~ -6), (~ -6)} and aJordan forro over C is diag(3i, 3i,
1 o o o· 1 o: o o -3i, -3i).
o 1 o -32 o ·--c:-·<r -16
c. D¡= o o 1 o Dz = O O' 1 O 6. a. The characteristic polynomial is -(x- 2) 3 • Since there is only one
(j" _::_: 16 linearly independent characteristic vector for 2, the mínimum polynomial is
o 1 o o 1 (x- 2) 3 , and the Jordan forro is
o o o o o o -2 ¡ o o o o
(of ~1 g)·
-8
1 o o o o -24 1 -2!0 o o o
1 o o o ó 1'0 __:2•o o
D¡= o
-36 2
d. o o 1 o o -32' o o. 1 -2 : o o.
o o o 1 o -18 o o --o-- ·r. o· -i b. The characteristic polynomial is - (x - 2) 3 , but there are two linearly
o o o o 1 -6 o o o o: 1 -2 independent characteristic vectors for 2. So the mínimum polynomial is
(x - 2) 2 anda Jordan form is
2. a. No change, D3 = D 1 and D4 = Dz.
b. (x- 2i) 2 , (x + 2i)l, (x + 2) 2 ;
o 4 : o 2i o ; o (oi o~ g)·
2
1 4i : 1 2i : c. The characteristic polynomial is -(x - 1)(x- i)(x + i) and this must
---- -:-o- - "4 ; -2i o -
D3 = ¡¡ _ 4¡ D4 = 1 -2i • be the negative ofthe mínimum polynomial. AJordan form is diag(l, i, -i).
7. a. There are 11 possibilities (one for each partition of 6), from (x − 2)⁶ to
x − 2, x − 2, x − 2, x − 2, x − 2, x − 2.
   b. The first and last (as in part a) and (x − 2)², x − 2, x − 2, x − 2, x − 2.
8. Show by induction that
   ( 2 0 / 1 2 )ⁿ = ( 2ⁿ 0 / n2ⁿ⁻¹ 2ⁿ ).
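The induction step for 8, as a sketch:

\[
\begin{pmatrix} 2&0 \\ 1&2 \end{pmatrix}^{n+1}
= \begin{pmatrix} 2^n & 0 \\ n2^{n-1} & 2^n \end{pmatrix}
\begin{pmatrix} 2&0 \\ 1&2 \end{pmatrix}
= \begin{pmatrix} 2^{n+1} & 0 \\ n2^n + 2^n & 2^{n+1} \end{pmatrix}
= \begin{pmatrix} 2^{n+1} & 0 \\ (n+1)2^n & 2^{n+1} \end{pmatrix}.
\]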
9. Let k denote the number of independent characteristic vectors for 2.
   a. (x − 2)², (x − 2)², k = 2.
   b. (x − 2)³, x − 2, k = 2.
   c. (x − 2)², x − 2, x − 2, k = 3.
10. b. B_c = {(0, 0, 1, −1), (0, 1, 0, 0), (0, 0, −1, 0), (−1, −2, 0, −1)}.
    c. Let p₁(x) = x − √2 i and p₂(x) = x + √2 i, B_c = {U₁, …, U₄}. If
W = (0, 0, 1, −1), U₁ = p₂²(T)W and U₃ = p₁²(T)W, then
B_c = {(0, 2√2 i, −5, 4), (−1, −2, −√2 i, −1), (0, −2√2 i, −5, 4), (−1, −2, √2 i, −1)}.
11. b. There are five possible sets: x⁴; x³, x; x², x²; x², x, x; and x, x, x, x.
The corresponding Jordan forms are:
    ( 0 0 0 0 / 1 0 0 0 / 0 1 0 0 / 0 0 1 0 ),  ( 0 0 0 0 / 1 0 0 0 / 0 1 0 0 / 0 0 0 0 ),
    ( 0 0 0 0 / 1 0 0 0 / 0 0 0 0 / 0 0 1 0 ),  ( 0 0 0 0 / 1 0 0 0 / 0 0 0 0 / 0 0 0 0 ),
    and the zero matrix.
12. The elementary divisors for the given matrix are:
    a. x³.  b. x², x.  c. x², x².
13. a. If B_c = {(1, 0), (1, 1)}, the matrix of T with respect to B_c is ( 4 0 / 1 4 ).
T_n(a, b) = (a − b, a − b) is the map with matrix ( 0 0 / 1 0 ) with respect to B_c,
and T_d(a, b) = (4a, 4b) has ( 4 0 / 0 4 ) as its matrix with respect to B_c. Thus
T = T_n + T_d. Notice that this answer does not depend on the choice of basis
B_c. Will this always be true?
    b. The matrix of T with respect to B_c = {(0, 0, 1), (−1, 1, −2), (0, 1, −1)}
is ( 2 0 0 / 1 2 0 / 0 0 3 ). If ( 0 0 0 / 1 0 0 / 0 0 0 ) is the matrix of T_n with respect
to B_c, and diag(2, 2, 3) is the matrix of T_d with respect to B_c, then T = T_n + T_d,
and T_n(a, b, c) = (a − b − c, b − a + c, 2a − 2b − 2c) and T_d(a, b, c) = (2a, a + 3b,
2c − a − b).
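The nilpotent part in 13a can be verified directly; a sketch, with T_n written in the standard basis:

\[
T_n \;\longleftrightarrow\; N = \begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix}, \qquad
N^2 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},
\]

while T_d = 4I is already diagonal and commutes with T_n, as the decomposition requires.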
14. There are two approaches. Either show that a matrix with all characteristic
roots zero is nilpotent, or show that if A = (a_ij) and A² = (b_ij), then a_ij = 0 for
i ≤ j implies b_ij = 0 for i − 1 ≤ j, and continue by induction.
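A sketch of the inductive step in the second approach to 14:

\[
b_{ij} = \sum_{k} a_{ik} a_{kj} \neq 0 \quad\text{requires}\quad i > k > j \ \text{for some } k,
\]

which forces i ≥ j + 2, so b_ij = 0 whenever i − 1 ≤ j; continuing, the band of zeros widens by one with each multiplication by A, and Aᵐ = O once m reaches the size of A.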
§6 Page 401

1. a. P = ( 2 3 / 1 2 ), P⁻¹AP = ( −2 0 / 0 −3 ); x′ = −13e⁻²ᵗ, y′ = 9e⁻³ᵗ;
X = P(−13e⁻²ᵗ, 9e⁻³ᵗ)ᵀ.
   b. P = ( · ), P⁻¹AP = ( 1 0 / 1 1 ); x′ = 4eᵗ; solving Dy′ − y′ = 4eᵗ with
y′(0) = 9 yields y′ = (4t + 9)eᵗ; X = P(4eᵗ, (4t + 9)eᵗ)ᵀ.
   c. P = ( · ), P⁻¹AP = diag(1, 1, −1); X = P(5eᵗ, −6eᵗ, e⁻ᵗ)ᵀ.
   d. P = ( · ), P⁻¹AP = ( 2 0 0 / 1 2 0 / 0 1 2 ); X = PX′, x′ = e²ᵗ, y′ = (t + 3)e²ᵗ,
z′ = (t²/2 + 3t − 2)e²ᵗ.
4. a. Aᵀ = C(p) with p(λ) = λⁿ + a_{n−1}λⁿ⁻¹ + ⋯ + a₁λ + a₀.
5. a. X = 7e²ᵗ − 5e³ᵗ.  b. X = (6t + 7)e⁴ᵗ.  c. X = eᵗ + 2e⁻ᵗ − e⁻²ᵗ.
6. a. diag(1, 0).  b. Does not exist.  c. O; the characteristic values are 1/2 and 1/3.
   d. Does not exist.  e. ( · ).  f. ( · ).  g. ( 1 8 −6 / −1 5 −3 / −1 4 −2 ).
7. b. For the converse, consider diag(1/4, 3).
9. Both limits exist.
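All of 1a–1d follow one pattern; a sketch, writing Y for the vector of new coordinates (denoted x′, y′, z′ above):

\[
Y = P^{-1}X \;\Longrightarrow\; \dot Y = P^{-1}\dot X = P^{-1}AX = (P^{-1}AP)\,Y,
\]

so when P⁻¹AP = diag(λ₁, …, λ_n) the system decouples into ẏᵢ = λᵢyᵢ with solutions yᵢ = cᵢe^{λᵢt}, and X = PY; a Jordan block instead of a diagonal block yields the extra polynomial factors in t seen in 1b and 1d.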
Review Problems Page 402

1. d. ( · ).  e. No.
3. a. (x − 2)²(x + 1)².  b. Why is it also (x − 2)²(x + 1)²?
   c. diag(( 0 −4 / 1 4 ), ( 0 −1 / 1 −2 )).  d. diag(( 2 0 / 1 2 ), ( −1 0 / 1 −1 )).
4. c. Notice that if T(V) = λV and p(x) = Σᵢ₌₀ⁿ aᵢxⁱ, then p(T)V = Σᵢ₌₀ⁿ aᵢλⁱV
= p(λ)V.
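The step suppressed in 4c is that T(V) = λV gives TⁱV = λⁱV for every i; a sketch of the induction:

\[
T^{i+1}V = T(T^iV) = T(\lambda^i V) = \lambda^i\, T(V) = \lambda^{i+1} V .
\]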
5. a. ( 0 0 −8 / 1 0 −12 / 0 1 −6 ) and ( 3 0 0 / 1 3 0 / 0 1 3 ).  b. ( · ) and ( · ).
6. a. diag(( 2 0 / 1 2 ), ( 2 0 / 1 2 )).  b. diag(( 2 0 / 1 2 ), (2), (2)).
7. a. (x + 4)³, (x + 1)²; one independent vector each for −4 and −1.
   b. (x + 4)³, (x + 4)², (x + 1)²; 2 for −4 and 1 for −1.
   c. (x + 4)³, x + 4, (x + 1)², x + 1; 2 for −4 and 2 for −1.
   d. (x + 4)³, (x + 1)², (x + 1)²; 1 for −4 and 2 for −1.
   e. (x + 4)³, (x + 1)², x + 1, x + 1; 1 for −4 and 3 for −1.

8. a. ( 0 −4 0 0 / 1 0 0 0 / 0 1 0 −4 / 0 0 1 0 ).
   b. diag(( 2i 0 / 1 2i ), ( −2i 0 / 1 −2i )).
9. a. ( 3 0 0 / 1 3 0 / 0 1 3 ).  b. ( 3 0 0 / 1 3 0 / 0 0 3 ).
11. a. subdiag(1, 1, 0, 0, 0).  b. subdiag(1, 0, 1, 0, 1, 0).
    c. subdiag(1, 1, 0, 1, 0, 1) or subdiag(1, 1, 0, 1, 1, 0).
12. a. X = ( · ) diag(e²ᵗ, e²ᵗ, e⁻ᵗ) ( · )( · ).
13. Consider problem 8, page 387.
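The real block in 8a and the complex blocks in 8b describe the same map, since x² + 4 factors over C; a sketch of the characteristic value computation for the real block:

\[
\det\!\left( \begin{pmatrix} 0 & -4 \\ 1 & 0 \end{pmatrix} - \lambda I \right)
= \lambda^2 + 4 = (\lambda - 2i)(\lambda + 2i).
\]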

Appendix

§1 Page 411

1. a. −1.  b. 0.  c. −1.  d. 1.
2. a. 1.  b. 0.  c. −1.  d. 0.
7. Yes.

§2 Page 417
1. The even permutations in 𝒫₃ are 1, 2, 3; 3, 1, 2; and 2, 3, 1. The odd permuta-
tions are 1, 3, 2; 3, 2, 1; and 2, 1, 3.
2. a. 3, 4, 2, 1, 5.  b. 4, 1, 3, 5, 2.  c. σ.  d. 2, 3, 5, 4, 1.
3. ι, σ, σ².
6. a. (x₁ − x₂)(x₁ − x₃)(x₁ − x₄)(x₂ − x₃)(x₂ − x₄)(x₃ − x₄).
   b. For σ: (x₄ − x₂)(x₄ − x₁)(x₄ − x₃)(x₂ − x₁)(x₂ − x₃)(x₁ − x₃)
= (−1)⁴P(x₁, x₂, x₃, x₄), so sgn σ = 1. Sgn τ = (−1)³ = −1.
7. n!
8. b. If σ is odd, τ ∘ σ is even. Set τ ∘ σ = μ; then σ = τ⁻¹ ∘ μ = τ ∘ μ.
9. a. 10; +1.  b. 15; −1.  c. 6; +1.  d. 9; −1.

§3 Page 426

4. If AZ = I_n and Z = (Z₁, …, Z_n), then AZ_j = E_j for 1 ≤ j ≤ n.
7. b. 3√61.  c. ½‖(A − B) × (C − B)‖. Would ½‖(A − B) × (B − C)‖ also give
the area?  d. √11/2.
8. The only property it satisfies is closure.
9. a. For U, V, W ∈ 𝒱₄, let U × V × W be the vector satisfying (U × V × W) · Y
= det(Uᵀ, Vᵀ, Wᵀ, Yᵀ) for all Y ∈ 𝒱₄. Such a vector exists since the map
Y → det(Uᵀ, Vᵀ, Wᵀ, Yᵀ) is linear. The value of U × V × W may be obtained
by computing (U × V × W) · Eᵢ for 1 ≤ i ≤ 4.
   c. (0, 0, 0, 1).  d. (3, 4, −6, −6).
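The component computation in 9a amounts to a cofactor expansion; a sketch, with E₁, …, E₄ the standard basis as in the text:

\[
(U \times V \times W) \cdot E_i = \det(U^T, V^T, W^T, E_i^T), \qquad 1 \le i \le 4,
\]

and expanding along the last column shows the i-th component is (−1)^{i+4} times the 3 × 3 minor obtained by deleting the i-th row of the matrix with columns Uᵀ, Vᵀ, Wᵀ.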
Symbol Index†

† Numbers in parentheses refer to problem numbers.

Vector Spaces

ℰ(C)  278
ℰ₂  13
ℰ_n  24
ℱ  34
ℱ(C)  279
Hom(𝒱, 𝒲), Hom(𝒱)  146
ℐ_T  138
ℒ(S)  44
ℳ_{n×m}  80
ℳ_{n×m}(F)  273
𝒩_T  140
R[t]  33
R_n[t]  40
𝒮 ∩ 𝒯  43(10)
𝒮 + 𝒯  65
𝒮 ⊕ 𝒯  67
𝒮₁ ⊕ ⋯ ⊕ 𝒮_k  342
𝒮(U, T)  236, 345
𝒮(λ)  235
𝒮⊥  259
𝒱₂  5
𝒱_n  22
𝒱_n(F)  272
T_𝒲  144

Matrix Notation

A^adj  183
A_ij  111
Aᵀ  99
A⁻¹  181
A*  76
C(pᵉ)  368
C_e(pᵉ)  383
det A or |A|  111
D(A)  408
diag(d₁, …, d_n)  223
diag(A₁, …, A_k)  344
e^A  393
I_n  181
tr A  232

Maps

I or I_𝒱  170
T: 𝒱 → 𝒲  130
T[S]  137
T⁻¹  170
T⁻¹[S]  145(7)
T ∘ S  167

Vectors

‖U‖  13, 247
U × V  110, 423
U · V  13, 24
(U, V)  242
(U, V)  8, 246(3)
U: (a₁, …, a_n)_B  59

Miscellaneous

[a]  215
𝒜_n  416
a ~ b  214
b(x)|a(x)  350
∈  3
δ_ij  250
dim 𝒱  54
⇒ (⇐)  21
E₂  8
{E_i}  55
F[x]  348
g.c.d.  355
l.c.m.  358(6)
R  2
𝒫_n  412
Q  271
sgn σ  415
Z  221(3)
Z_p  272
z̄  277
1-1  132
A ⊂ B  37
𝒱 ≅ 𝒲  62
{… | …}  3
Subject Index†

† Numbers in parentheses refer to problem numbers.

A
Adjoint matrix, 183
Alternating function, 407
Alternating subgroup, 416
Angle, 20, 249
Associative laws, 2, 6, 32
Augmented matrix, 76

B
Basis, 51
  canonical, 382
  change of, 200
  cyclic, 371
  extension to a, 56
  orthonormal, 250
  standard, 55
Bilinear form, 294
Bilinear function, 243
Binary operation, 3

C
Canonical forms, 218
  classical, 384
  echelon, 88
  Jordan, 384
  rational, 379
Cayley-Hamilton theorem, 380
Center (geometric), 335, 339(6)
Central quadric surface, 337
Characteristic equation, 227, 233
Characteristic polynomial, 227, 233
Characteristic root, 227, 233
Characteristic space, 342
Characteristic value, 224, 233
Characteristic vector, 224, 233
Chord, 330
Closure under an operation, 3, 39
Codomain, 130
Coefficient matrix, 76
Cofactor, 111, 421
Column-equivalence, 125
Column operations, 117
Column rank, 124
Column space, 124
Column vector, 81
Commutative laws, 2, 6, 32
Companion matrix, 368
Complete inverse image, 145(7)
Components, 5, 23
Composition of maps, 167
Cone, 323
  asymptotic, 329(8)
Congruent figures, 310
Congruent matrices, 298
Conic, 312
Conic sections, 329(5)
Conjugate of a complex number, 277
Consistent equations, 79
Contrapositive, 67
Converse, 21
Coordinates, 59
Cramer's rule, 185, 423
Cross product, 110, 423
Cylinder, 322

D
Decomposition of a vector space
  direct sum, 67, 342
  primary, 363
  secondary, 370
Degree of freedom, 102
Degree of a polynomial, 350
Determinant, 107, 111
  expansion by cofactors, 113, 421
  of a product, 193, 422
  of a transposed matrix, 115(13)
Determinant function, 408, 416, 419
Diagonalizable map, 234
Diagonalizable matrix, 223
Diametral plane, 331
Differential equations, 389, 401(4)
Dimension, 54
Direction cosines, 29(14)
Direction numbers, 11, 24
Directrix, 322
Direct sum, 67, 342
Discriminant, 314
Distributive laws, 3, 32, 168, 180
Division algorithm for polynomials, 352
Divisor, 350
  elementary, 375, 378
Domain, 130
Dot product, 13, 24

E
Echelon form, 88
Eigenspace, 342
Eigenvector, 224
Elementary divisors, 375, 378
Elementary matrices, 187, 188
Ellipse, 314
Ellipsoid, 323
Equivalence class, 215
Equivalence relation, 214
Equivalent matrices, 125, 210
Equivalent systems of equations, 83
Euclidean space, 242
Euclid's algorithm, 358(8)
Euler's theorem, 270(11)
Extension
  to a basis, 56
  of a map, 144
Exterior product, 427(10)

F
Factor theorem, 353
Fields
  abstract, 271
  algebraically closed, 353
  complex number, 271
  finite, 272
  of integers modulo a prime, 272
  ordered, 276(15)
  rational number, 271
  real number, 3
Finite dimension, 54
Fourier series, 254
Functions, 130
  addition of, 34
  alternating, 407
  bilinear, 243
  determinant, 408, 416, 419
  equality of, 35
  linear, see Linear transformations
  multilinear, 407
  quadratic, 292
  scalar multiple of, 35
  skew-symmetric, 408
Fundamental theorem of algebra, 353

G
Gaussian elimination, 95
Generator of a cylinder, 322
Generators of a vector space, 46
Gram-Schmidt process, 257
Greatest common divisor, 355
Group, 174
  alternating, 416
  commutative, 174
  of nonsingular maps, 174
  permutation, 412

H
Hermitian inner product, 278
Homogeneous linear equations, 79
Homogeneous polynomial, 291
Homomorphism, 146
Hyperbola, 314
Hyperboloid, 323
Hypercompanion matrix, 383
Hyperplane, 27

I
Ideal, 354
Identity
  additive, 2, 6
  multiplicative, 2
Identity map, 170
Identity matrix, 181
Image
  of a set, 137
  of a vector, 130
Image space, 138
Inconsistent equations, 79
Index of nilpotency, 386
Induction
  first principle of, 437(16)
  second principle of, 351
Infinite dimension, 54
Inner product
  bilinearity of, 243
  complex, 278
  hermitian, 278
  positive definiteness of, 242
  real, 242
  symmetry of, 242
Integers modulo a prime, 272
Integral domain, 350
  of polynomials, 349
Intersection, 43(10)
Invariants, 220
  complete set of, 220
Invariant subspace, 235
Inverse
  additive, 2, 6, 32
  adjoint form for, 183
  computation with row operations, 189
  of a map, 170
  of a matrix, 181
  multiplicative, 2
Irreducible polynomial, 356
Isometry, 305
Isomorphism
  inner product space, 263
  vector space, 62, 133

J
Jordan form, 384

K
Kernel, 140
Kronecker delta, 250

L
Law of cosines, 249
Leading coefficient, 350
Least common multiple, 358(6)
Left-hand rule, 306
Length, 13
Level curve, 323
Line
  direction cosines for, 29(14)
  direction numbers for, 11, 24
  symmetric equations for, 25
  vector equation for, 10, 24
Linear combination, 44
Linear dependence, 47, 48
Linear equations
  consistent, 79
  criterion for solution of, 95
  equivalent, 83
  Gaussian elimination in, 95
  homogeneous, 79
  inconsistent, 79
  independent, 104
  nontrivial solution of, 79
  operations on, 84
  solution of, 78
  solution set of, 82(14), 101
  solution space of, 81
Linear form, 297(13)
Linear functional, 286(6)
Linear independence, 48
Linear transformations, 130
  addition of, 146
  adjoint of, 287(8)
  composition of, 167
  diagonalizable, 234
  extension of, 144
  group of nonsingular, 174
  hermitian, 280
  idempotent, 367(12)
  identity, 170
  image space of, 138
  inverse of, 170
  invertible, 170
  matrix of, 157
  nilpotent, 386
  nonsingular, 173
  null space of, 140
  one to one, 132
  onto, 132
  orientation preserving, 307, 308
  orthogonal, 261
  projection, 133, 367(12)
  rank of, 142
  restriction of, 144
  ring of, 169
  rotation, 135, 267, 268
  scalar multiplication of, 146
  singular, 173
  symmetric, 286(12)
  vector space of, 147
  zero, 148

M
Main diagonal, 118
Map, 130; see also Linear transformation
  translation, 270(8)
Markov chain, 395
Matrices, 76
  addition of, 80
  adjoint, 183
  augmented, 76
  of a bilinear form, 294
  coefficient, 76
  companion, 368
  congruence of, 298
  determinant of, 111
  diagonal, 223
  diagonal block, 344
  diagonalizable, 223
  echelon form of, 88
  elementary, 187, 188
  equality of, 77
  equivalence of, 125, 210
  hermitian, 281
  hypercompanion, 383
  identity, 181
  inverse, 181
  Jordan, 361
  of a linear transformation, 157
  main diagonal of, 118
  multiplication of, 77, 177
  nilpotent, 386
  nonsingular, 181
  orthogonal, 264
  of a quadratic form, 291
  of a quadratic function, 293
  rank of, 90
  scalar multiplication of, 80
  series of, 392
  similarity of, 211
  singular, 181
  stochastic, 396
  subdiagonal, 386
  subdiagonal of, 369
  symmetric, 282
  trace of, 232
  transition, 200
  transpose of, 99
  triangular, 118
  vector space of, 80
Matrix equation, 78
Matrix multiplication, 77, 177
Midpoint, 17
Minor, 111
Monic polynomial, 350
Multilinear function, 407
Multiplicity of a factor, 357

N
Nonsingular map, 173
Nonsingular matrix, 181
Nontrivial solution, 79
Norm, 13, 247
Normal to a plane, 26
n-tuple, 22
Nullity, 142
Null space, 140
Numerical analysis, 397

O
Orientation, 305
Orientation preserving map, 307, 308
Orthogonal complement, 259
Orthogonal map, 261
Orthogonal matrix, 264
Orthogonal set, 252
Orthogonal vectors, 20, 249
Orthonormal basis, 250
Orthonormal set, 252

P
Parabola, 314
Paraboloid, 323
Parallelepiped, volume in 3-space, 111
Parallelogram
  area in plane, 108
  area in 3-space, 427(7)
Parallelogram law, 286(5)
Parallelogram rule for addition of vectors, 14
Parameter, 10
Partition of a set, 218
Permutation, 412
  even and odd, 414
Perpendicular vectors, 20
Pivotal condensation, 120(6)
Plane, 103
  normal to, 26
Polynomials, 33, 348
  characteristic, 227, 233
  degree of, 350
  equality of, 348
  factors of, 350
  greatest common divisor of, 355
  homogeneous, 291
  irreducible, 356
  leading coefficient of, 350
  least common multiple of, 358(6)
  minimum, 361
  monic, 350
  relatively prime, 356
  roots of, 353
  skew-symmetric, 408
  T-minimum, 360
  zero, 348
Positive definite inner product, 242
Preimage, 130
Principal axis theorem, 313
Principal ideal domain, 354
Projection, 133, 367(12)
  orthogonal, 256
Proper subspace, 38
Pythagorean theorem, 250

Q
Quadratic form, 291
Quadratic function, 292
  symmetric, 293
Quadric surface, 312, 321
Quaternionic multiplication, 198(18)
Quotient, 352

R
Range, 130
Rank
  column, 124
  determinant, 121
  of a map, 142
  matrix, 90
  row, 91
Reflection, 262, 268
Regular point, 338(6)
Relation, 214
Relatively prime polynomials, 356
Remainder, 352
Remainder theorem, 353
Restriction map, 144
Right-hand rule, 306
Rigid motion, 305, 310
Ring, 169
  ideal in, 354
  with identity, 170
  of maps, 169
  of polynomials, 349
Root, 353
  characteristic, 227, 233
Rotation, 135, 267, 268
Row-equivalent matrices, 88
Row operations, 85
Row reduction, 95
Row space, 90
Row vector, 81
Ruled surface, 333
Ruling, 332

S
Scalar, 5, 272
Scalar multiplication, 5, 22, 32
  of maps, 146
  of matrices, 80
Scalar product, 242
Scalar triple product, 116(15)
Schwartz inequality, 247
Section of a surface, 323
Series of matrices, 392
Sets
  equality of, 38
  intersection of, 43(10)
  orthogonal, 252
  orthonormal, 252
  partition of, 218
  subsets of, 27
  union of, 43(10)
  well ordered, 351
Sign or signum, 415
Signature, 302
Similar matrices, 211
Singular map, 173
Singular matrix, 181
Skew-symmetric function, 408
Solution of a system of linear equations, 78
Span, 44, 46
Standard basis, 51, 55
Standard position for a graph, 312
Subdiagonal, 369
Submatrix, 121
Subset, 37
Subspaces, 37
  direct sum of, 67, 342
  invariant, 235
  nontrivial, 38
  proper, 38
  sum of, 43(10)
  T-cyclic, 345
  trivial, 38
  zero, 38
Sum
  direct, 67, 342
  of maps, 146
  of matrices, 80
  of subspaces, 43(10)
Summand, 65
Symmetric equations for a line, 25
Symmetric functions of characteristic roots, 232
Symmetric matrix, 282
Systems of linear equations, see Linear equations

T
T-cyclic subspace, 345
T-minimum polynomial of U, 360
Trace
  of a matrix, 232
  of a surface, 323
Transformation, 130; see also Linear transformations
Transition matrix, 200
Translation, 270(8)
Transpose, 99
Transposition, 413
Triangle inequality, 248
Trivial solution, 79
Trivial subspace, 38

U
Unique factorization theorem for polynomials, 357
Unit vector, 29(13), 247
Unitary space, 278

V
Vectors
  addition of, 5, 22, 32
  angle between, 20, 249
  characteristic, 224, 233
  column, 81
  components of, 5, 23
  coordinates of, 59
  linear dependence of, 48
  norm of, 13, 247
  orthogonal, 20, 249
  scalar multiplication of, 5, 22, 32
  scalar triple product of, 116(15)
  T-minimum polynomial of, 360
  unit, 29(13), 247
  zero, 6, 32
Vector spaces, 32, 272
  over an arbitrary field, 272
  basis for, 51
  dimension of, 54
  direct sum of, 67, 342
  generators of, 46
  intersection of, 43(10)
  isomorphic, 62, 133
  orthonormal basis for, 250
  real, 32
  subspace of, 37
  sum of, 43(10)
  zero, 35

W
Well ordered set, 351

Z
Zero map, 148
Zero subspace, 38
Zero vector, 6, 32
Zero vector space, 35
