LA Part1 2in1
Textbook:
“Elementary Linear Algebra,” 12th ed., H. Anton and C. Rorres.
(The content of this course comes from the above book and other references.)
Reference:
1. “Linear Algebra,” Friedberg, Insel, Spence.
Algebra
1. Elementary algebra
2. Linear Algebra
3. Algebra (Modern algebra, Abstract algebra)
- Abstraction and generalization (Ex. Vector Spaces)
v + w, cv (vectors in R2 or R3)
f (t) + g(t), cf (t) (functions)
M + P, cM (matrices)
Notation
1. R: the set of real numbers.
2. C: the set of complex numbers.
3. A vector v is denoted by a bold little character.
4. R2 and R3 are the sets of 2-D plane and 3-D space.
$$v=\begin{bmatrix}1\\3\end{bmatrix}\in\mathbb{R}^2,\qquad u=\begin{bmatrix}1\\4\\3\end{bmatrix}\in\mathbb{R}^3,\qquad w=\begin{bmatrix}1+i2\\4\\3\end{bmatrix},\quad w^H=[\,1-i2\ \ 4\ \ 3\,]$$
$$M=\begin{bmatrix}2+i3&1&1-i\\0&4&0\\0&0&3\end{bmatrix},\qquad M^H=\begin{bmatrix}2-i3&0&0\\1&4&0\\1+i&0&3\end{bmatrix}$$
Preview
1. Vectors in R^n (or C^n):
$$u=\begin{bmatrix}a_1\\a_2\\\vdots\\a_n\end{bmatrix},\qquad u^T=[\,a_1\ a_2\ \cdots\ a_n\,],\qquad u^H=\overline{u}^{\,T}=[\,\bar a_1\ \bar a_2\ \cdots\ \bar a_n\,]$$
Vector Space :
A vector space is a set of vectors, which will be defined later.
Inner product :
The inner product of two vectors x and y,
x · y, ⟨x, y⟩, ⟨x | y⟩
When we define an inner product in a vector space, we can use it
to define
1. the length (norm) of a vector ( ‖x‖² = ⟨x, x⟩ ), and
2. the orthogonality between vectors ( ⟨x, y⟩ = 0 ).
Linear combination: $c_1v_1+c_2v_2+\cdots+c_nv_n$
Linear transformation
$$v\in\mathbb{R}^n\ \mapsto\ w\in\mathbb{R}^m$$
Contents
1. Systems of Linear Equations and Matrices (Chap. 1)
2. Determinants (Chap. 2)
3. Euclidean Vector Spaces (R2, R3, Rn) (Chap. 3)
4. General Vector Spaces (Chap. 4)
5. Eigenvalues and Eigenvectors (Chap. 5)
6. Inner Product Spaces (Chap. 6)
7. Diagonalization and Quadratic Forms (Chap. 7)
8. Linear Transformations (Chap. 8)
9. Additional Topics
(including Singular Value Decomposition and Jordan Forms.)
Homework :
Chap 1: 1.1): 8, 12 1.2): 38 1.3): 36 1.4): 42, 46 1.5): 31 1.6): 18, 24
1.7): 40(a), 47 1.8): 16, 45 1.9): 12
■ Linear Equations
Consider a system of m linear equations in n unknowns,
$$\begin{aligned}a_{11}x_1+a_{12}x_2+\cdots+a_{1n}x_n&=b_1\\a_{21}x_1+a_{22}x_2+\cdots+a_{2n}x_n&=b_2\\&\ \,\vdots\\a_{m1}x_1+a_{m2}x_2+\cdots+a_{mn}x_n&=b_m\end{aligned}\qquad(1)$$
The kth column of coefficients and the right-hand side are
$$a_k=\begin{bmatrix}a_{1k}\\a_{2k}\\\vdots\\a_{mk}\end{bmatrix},\qquad b=\begin{bmatrix}b_1\\b_2\\\vdots\\b_m\end{bmatrix}\quad(\in\mathbb{R}^m)$$
respectively.
Eq. (1) also corresponds to a transformation Ax = b,
$$\begin{bmatrix}a_{11}&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\a_{m1}&a_{m2}&\cdots&a_{mn}\end{bmatrix}\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix}=\begin{bmatrix}b_1\\b_2\\\vdots\\b_m\end{bmatrix}$$
with the coefficient matrix and the augmented matrix
$$A=\begin{bmatrix}a_{11}&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2n}\\\vdots&\vdots&\ddots&\vdots\\a_{m1}&a_{m2}&\cdots&a_{mn}\end{bmatrix},\qquad [A\,|\,b]=\begin{bmatrix}a_{11}&a_{12}&\cdots&a_{1n}&b_1\\a_{21}&a_{22}&\cdots&a_{2n}&b_2\\\vdots&\vdots&\ddots&\vdots&\vdots\\a_{m1}&a_{m2}&\cdots&a_{mn}&b_m\end{bmatrix}$$
We will take elementary row operations on [A | b] to make it into a form that is easier to solve.
Remarks
1. By taking elementary row operations, we do not affect the solutions
of Ax = b.
2. Each elementary row operation is reversible.
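The two remarks above can be illustrated in Python (numpy assumed; the matrix `M` is a hypothetical example, not from the slides). Each operation is applied and then undone by its inverse operation, leaving the matrix unchanged:

```python
import numpy as np

# A hypothetical augmented matrix [A | b] to operate on.
M = np.array([[2., 4., -2.],
              [1., 3., 5.]])

# 1. Replacement: add a multiple of one row to another row.
M[1] = M[1] - 0.5 * M[0]    # R2 <- R2 - (1/2) R1
M[1] = M[1] + 0.5 * M[0]    # reversed by R2 <- R2 + (1/2) R1

# 2. Interchange: swap two rows (its own inverse).
M[[0, 1]] = M[[1, 0]]
M[[0, 1]] = M[[1, 0]]

# 3. Scaling: multiply a row by a non-zero constant.
M[0] = 2.0 * M[0]
M[0] = 0.5 * M[0]           # reversed by the reciprocal scaling

# All three operations were undone, so M is unchanged.
print(M)
```

Because every operation is reversible, the solution set of the system is preserved at each step.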
(Figure: a replacement operation ×3 is undone by ×(−3); a scaling ×2 is undone by ×1/2.)
Definition A matrix is in (row) echelon form if it has
the following three properties: (Forward Gaussian elimination)
$$\begin{bmatrix}1&*&*&*&*\\0&1&*&*&*\\0&0&0&1&*\\0&0&0&0&0\end{bmatrix}\quad\text{or}\quad\begin{bmatrix}\blacksquare&*&*&*&*\\0&\blacksquare&*&*&*\\0&0&0&\blacksquare&*\\0&0&0&0&0\end{bmatrix},\qquad\blacksquare\neq0$$
1. If there are any rows that consist entirely of zeros, then they are
grouped together at the bottom of the matrix.
2. If a row does not consist entirely of zeros, then the first non-zero
number in the row is a 1. We call this a leading 1 (or pivot).
3. Each leading 1 occurs farther to the right than the leading 1 in any row above it.
Definition A matrix is in reduced (row) echelon form if it is in (row) echelon form and, in addition, each leading 1 is the only non-zero entry in its column.
Example A row echelon form (R1) and a reduced row echelon form (R2).
$$R_1=\begin{bmatrix}1&*&*&*&*\\0&1&*&*&*\\0&0&0&1&*\\0&0&0&0&0\end{bmatrix},\qquad R_2=\begin{bmatrix}1&0&*&0&*\\0&1&*&0&*\\0&0&0&1&*\\0&0&0&0&0\end{bmatrix}$$
After a sequence of row operations, we can make a matrix in a
row echelon form. A matrix may have many row echelon forms.
(For example, adding the 3rd row of R1 to the 2nd row of R1, we obtain
another matrix in row echelon form.)
$$R_1=\begin{bmatrix}1&*&*&*&*\\0&1&*&*&*\\0&0&0&1&*\\0&0&0&0&0\end{bmatrix}$$
$$\begin{bmatrix}1&2&3&5&7\\0&1&4&6&8\\0&0&0&1&9\\0&0&0&0&0\end{bmatrix}\ \to\ \begin{bmatrix}1&2&3&5&7\\0&1&4&7&17\\0&0&0&1&9\\0&0&0&0&0\end{bmatrix}$$
$$R_2=\begin{bmatrix}1&0&*&0&*\\0&1&*&0&*\\0&0&0&1&*\\0&0&0&0&0\end{bmatrix}$$
• The pivots are always in the same positions in any echelon form
of A.
• We call these positions pivot positions.
• A pivot column is a column of A that contains a pivot position.
Example Reduced row-echelon forms:
$$\begin{bmatrix}1&0&0&4\\0&1&0&7\\0&0&1&-1\end{bmatrix},\quad\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix},\quad\begin{bmatrix}0&1&-2&0&1\\0&0&0&1&3\\0&0&0&0&0\\0&0&0&0&0\end{bmatrix},\quad\begin{bmatrix}0&0\\0&0\end{bmatrix}$$
Row echelon forms:
$$\begin{bmatrix}1&4&-3&7\\0&1&6&2\\0&0&1&5\end{bmatrix},\quad\begin{bmatrix}1&1&0\\0&1&0\\0&0&0\end{bmatrix},\quad\begin{bmatrix}0&1&2&6&0\\0&0&1&-1&0\\0&0&0&0&1\end{bmatrix}$$
$$[A\,|\,b]=\begin{bmatrix}a_{11}&a_{12}&\dots&a_{1n}&b_1\\a_{21}&a_{22}&\dots&a_{2n}&b_2\\\vdots&\vdots&&\vdots&\vdots\end{bmatrix}$$
1. If $a_{11}\neq 0$, perform row operations (add $-\tfrac{a_{21}}{a_{11}}$ times the first row to the second row, etc.) to create zeros below $a_{11}$:
$$\begin{bmatrix}a_{11}&a_{12}&\dots&a_{1n}&b_1\\0&a'_{22}&\dots&a'_{2n}&b'_2\\\vdots&\vdots&&\vdots&\vdots\end{bmatrix}$$
2. If $a_{11}=0$, interchange the first row with another row (say, the kth row) for which $a_{k1}\neq 0$:
$$\begin{bmatrix}0&\dots&\dots&b_1\\\vdots&&&\vdots\\a_{k1}&\dots&\dots&b_k\\\vdots&&&\vdots\end{bmatrix}$$
Then perform the operations in step 1 on this new matrix.
Repeating the same process on the remaining rows and columns reduces the augmented matrix to an echelon form,
$$[A\,|\,b]=\begin{bmatrix}a_{11}&a_{12}&\dots&a_{1n}&b_1\\0&a_{22}&\dots&a_{2n}&b_2\\\vdots&\vdots&&\vdots&\vdots\\0&a_{m2}&\dots&a_{mn}&b_m\end{bmatrix}\ \to\ \begin{bmatrix}\blacksquare&*&*&*&*\\0&\blacksquare&*&*&*\\0&0&0&\blacksquare&*\\0&0&0&0&0\end{bmatrix}$$
If the echelon form contains a row
$$[\,0\ \dots\ 0\ \mid b_k\,],\qquad b_k\neq 0$$
the system is inconsistent, corresponding to an equation such as
$$0x_1+0x_2=3$$
On the other hand, if the equations are consistent, we can further
make it a reduced echelon form. Begin with the rightmost pivot.
$$\begin{bmatrix}\blacksquare&*&*&*&*\\0&\blacksquare&*&*&*\\0&0&0&\blacksquare&*\\0&0&0&0&0\end{bmatrix}\to\begin{bmatrix}1&*&*&*&*\\0&1&*&*&*\\0&0&0&1&*\\0&0&0&0&0\end{bmatrix}\to\begin{bmatrix}1&*&*&0&*\\0&1&*&0&*\\0&0&0&1&*\\0&0&0&0&0\end{bmatrix}\to\begin{bmatrix}1&0&*&0&*\\0&1&*&0&*\\0&0&0&1&*\\0&0&0&0&0\end{bmatrix}$$
Example Consider a system of linear equations
3x2 − 6x3 + 6x4 + 4x5 = −5
3x1 − 7x2 + 8x3 − 5x4 + 8x5 = 9
3x1 − 9x2 + 12x3 − 9x4 + 6x5 = 15
The augmented matrix
$$[A\,|\,b]=\begin{bmatrix}0&3&-6&6&4&-5\\3&-7&8&-5&8&9\\3&-9&12&-9&6&15\end{bmatrix}$$
$$[A\,|\,b]\to\begin{bmatrix}3&-9&12&-9&6&15\\3&-7&8&-5&8&9\\0&3&-6&6&4&-5\end{bmatrix}\to\begin{bmatrix}3&-9&12&-9&6&15\\0&2&-4&4&2&-6\\0&3&-6&6&4&-5\end{bmatrix}\to\begin{bmatrix}3&-9&12&-9&6&15\\0&2&-4&4&2&-6\\0&0&0&0&1&4\end{bmatrix}\quad(\text{pivots }3,\,2,\,1)$$
⟹ Consistent.
$$\to\begin{bmatrix}3&-9&12&-9&0&-9\\0&2&-4&4&0&-14\\0&0&0&0&1&4\end{bmatrix}\to\begin{bmatrix}3&-9&12&-9&0&-9\\0&1&-2&2&0&-7\\0&0&0&0&1&4\end{bmatrix}\to\begin{bmatrix}3&0&-6&9&0&-72\\0&1&-2&2&0&-7\\0&0&0&0&1&4\end{bmatrix}\to\begin{bmatrix}1&0&-2&3&0&-24\\0&1&-2&2&0&-7\\0&0&0&0&1&4\end{bmatrix}$$
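The reduction can be checked numerically. Below is a minimal RREF routine, a sketch with partial pivoting rather than the slide's exact pivot choices; since the reduced row echelon form of a matrix is unique, it must agree with the matrix derived above:

```python
import numpy as np

def rref(M, tol=1e-10):
    """Reduce M to reduced row echelon form (returns a new array)."""
    A = np.array(M, dtype=float)
    rows, cols = A.shape
    r = 0
    for c in range(cols):
        if r == rows:
            break
        # choose the largest pivot in column c (partial pivoting)
        p = r + np.argmax(np.abs(A[r:, c]))
        if abs(A[p, c]) < tol:
            continue                      # no pivot in this column
        A[[r, p]] = A[[p, r]]             # interchange
        A[r] /= A[r, c]                   # scaling: make the pivot 1
        for i in range(rows):
            if i != r:
                A[i] -= A[i, c] * A[r]    # replacement: clear the column
        r += 1
    return A

Ab = np.array([[0, 3, -6, 6, 4, -5],
               [3, -7, 8, -5, 8, 9],
               [3, -9, 12, -9, 6, 15]])
print(rref(Ab))
# rows: [1, 0, -2, 3, 0, -24], [0, 1, -2, 2, 0, -7], [0, 0, 0, 0, 1, 4]
```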
$$\begin{aligned}x_1+2x_2&=5\\3x_1-x_2&=1\\4x_1+x_2&=6\end{aligned}\quad\Rightarrow\quad\begin{bmatrix}1&2&5\\3&-1&1\\4&1&6\end{bmatrix}\to\begin{bmatrix}1&2&5\\0&-7&-14\\0&-7&-14\end{bmatrix}\to\begin{bmatrix}1&2&5\\0&1&2\\0&0&0\end{bmatrix}$$
x1 = −24 + 2x3 − 3x4
x2 = −7 + 2x3 − 2x4
x5 = 4
x3 = t1 ∈ R (free variable)
x4 = t2 ∈ R (free variable)
The solution set of Ax = b is
$$x=\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\x_5\end{bmatrix}=\begin{bmatrix}-24\\-7\\0\\0\\4\end{bmatrix}+\begin{bmatrix}2\\2\\1\\0\\0\end{bmatrix}t_1+\begin{bmatrix}-3\\-2\\0\\1\\0\end{bmatrix}t_2,\qquad t_1,t_2\in\mathbb{R}$$
Note that
$$A\begin{bmatrix}-24\\-7\\0\\0\\4\end{bmatrix}=b,\qquad A\begin{bmatrix}2\\2\\1\\0\\0\end{bmatrix}=0,\qquad A\begin{bmatrix}-3\\-2\\0\\1\\0\end{bmatrix}=0$$
$$Ax=A\begin{bmatrix}-24\\-7\\0\\0\\4\end{bmatrix}+A\begin{bmatrix}2\\2\\1\\0\\0\end{bmatrix}t_1+A\begin{bmatrix}-3\\-2\\0\\1\\0\end{bmatrix}t_2=b+0+0=b$$
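This verification can also be done numerically (numpy assumed). The sketch checks that the particular solution maps to b, the two homogeneous vectors map to 0, and hence any member of the family solves the system:

```python
import numpy as np

A = np.array([[0, 3, -6, 6, 4],
              [3, -7, 8, -5, 8],
              [3, -9, 12, -9, 6]])
b = np.array([-5, 9, 15])

p  = np.array([-24, -7, 0, 0, 4])   # particular solution, A p = b
v1 = np.array([2, 2, 1, 0, 0])      # homogeneous solution, A v1 = 0
v2 = np.array([-3, -2, 0, 1, 0])    # homogeneous solution, A v2 = 0

print(A @ p)    # -> [-5  9 15], i.e. b
print(A @ v1)   # -> [0 0 0]
print(A @ v2)   # -> [0 0 0]

# Any member of the family p + t1*v1 + t2*v2 solves Ax = b.
x = p + 1.5 * v1 + (-2.0) * v2
print(np.allclose(A @ x, b))   # True
```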
Summary of the process of solving Ax = b
1. Perform elementary row operations on the augmented matrix
[A | b] to make it into an echelon form.
2. Determine if this system is consistent. If it is inconsistent, no solu-
tions exist. Otherwise, further make it into a reduced echelon form.
3. If there are no free variables, we can immediately obtain the unique
solution of this system from the reduced echelon form.
$$[A\,|\,b]\ \to\ \begin{bmatrix}1&&&&*\\&1&&&*\\&&\ddots&&\vdots\\&&&1&*\end{bmatrix}=[\,I\mid b'\,]$$
4. If there are free variables, solve the reduced system of equations for
the basic variables in terms of free variables.
(In this case, there are infinitely many solutions.)
$$\begin{bmatrix}1&0&-2&3&0&-24\\0&1&-2&2&0&-7\\0&0&0&0&1&4\end{bmatrix}$$
Example
$$\begin{aligned}x+y+2z&=9\\3x+6y-5z&=0\\2x+4y-3z&=1\end{aligned}\ \Rightarrow\ \begin{bmatrix}1&1&2&9\\3&6&-5&0\\2&4&-3&1\end{bmatrix}\Rightarrow\begin{bmatrix}1&1&2&9\\0&3&-11&-27\\0&2&-7&-17\end{bmatrix}\Rightarrow\begin{bmatrix}1&1&2&9\\0&1&-7/2&-17/2\\0&0&1&3\end{bmatrix}$$
$$\Rightarrow\begin{bmatrix}1&1&0&3\\0&1&0&2\\0&0&1&3\end{bmatrix}\Rightarrow\begin{bmatrix}1&0&0&1\\0&1&0&2\\0&0&1&3\end{bmatrix}\qquad\Rightarrow\ \begin{bmatrix}x\\y\\z\end{bmatrix}=\begin{bmatrix}1\\2\\3\end{bmatrix}$$
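The same unique solution can be obtained with a linear solver (numpy assumed):

```python
import numpy as np

A = np.array([[1., 1., 2.],
              [3., 6., -5.],
              [2., 4., -3.]])
b = np.array([9., 0., 1.])

x = np.linalg.solve(A, b)
print(x)   # -> [1. 2. 3.], i.e. x = 1, y = 2, z = 3
```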
Note for Ax = b, A: m × n. For example,
$$[A\,|\,b]\sim\begin{bmatrix}1&0&-2&3&0&-24\\0&1&-2&2&0&-7\\0&0&0&0&1&4\\0&0&0&0&0&0\end{bmatrix}\qquad(A:4\times 5)$$
Here there are 3 basic variables and 2 free variables.
(#: number)
(# of basic variables) + (# of free variables) = (# of variables) = n
(# of basic variables) = (# of effective equations) ≤ m
• Existence and uniqueness of solutions
Existence of solutions
1. The equation Ax = b has a solution (consistent) if and only if
the row echelon form of [A | b] does not have a row like
$$[\,0\ \dots\ 0\ \mid b'\,],\qquad b'\neq 0$$
2. Writing
$$Ax=[\,a_1\ a_2\ \cdots\ a_n\,]\begin{bmatrix}x_1\\\vdots\\x_n\end{bmatrix}=x_1a_1+x_2a_2+\cdots+x_na_n=b$$
Ax = b has a solution if and only if b is a linear combination of the columns of A.
• Homogeneous systems: Ax = 0
$$[A\,|\,0]=\begin{bmatrix}0&3&-6&6&4&0\\3&-7&8&-5&8&0\\3&-9&12&-9&6&0\end{bmatrix}$$
has solutions of the form
$$x=\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\x_5\end{bmatrix}=\begin{bmatrix}2\\2\\1\\0\\0\end{bmatrix}t_1+\begin{bmatrix}-3\\-2\\0\\1\\0\end{bmatrix}t_2,\qquad t_1,t_2\in\mathbb{R}$$
with
$$A\begin{bmatrix}2\\2\\1\\0\\0\end{bmatrix}=0,\qquad A\begin{bmatrix}-3\\-2\\0\\1\\0\end{bmatrix}=0$$
Therefore, Ax = 0.
Conclusions:
Ax = 0 has free variables.
⇐⇒ Ax = 0 has non-trivial solutions.
⇐⇒ Ax = b has infinitely many solutions, if it is consistent.
The solution set of a consistent system Ax = b is {p + vh | Avh = 0}, where p is any particular solution (Ap = b).
(See the previous example.)
Proof :
Let S1 = {x | Ax = b} be the solution set of Ax = b, and
S2 = {p + vh | Avh = 0}, Ap = b.
1. For any p + vh ∈ S2,
A(p + vh) = Ap + Avh = b + 0 = b
so p + vh ∈ S1. Therefore, S2 ⊆ S1.
2. On the other hand, let w ∈ S1. Then Aw = b. Note
A(w − p) = b − b = 0
Let vh = w − p; then Avh = 0 and
w = p + vh ∈ S2. Therefore, S1 ⊆ S2.
Example (equivalence relations)
1. A triangle A is similar to a triangle B.
2. A person A is a relative of a person B.
A ∼ A1 ∼ A2 ∼ · · · ∼ B
Note that
1. A matrix A is row equivalent to all its row echelon forms.
2. A matrix A is row equivalent to its reduced row echelon form.
3. All the row echelon forms of A are row equivalent.
A ∼ R1, A ∼ R2, · · ·
R1 ∼ R2, R2 ∼ R3, · · ·
Recall that a matrix can have many row echelon forms.
$$R_1=\begin{bmatrix}1&*&*&*&*\\0&1&*&*&*\\0&0&0&1&*\\0&0&0&0&0\end{bmatrix},\qquad R_2=\begin{bmatrix}1&0&*&0&*\\0&1&*&0&*\\0&0&0&1&*\\0&0&0&0&0\end{bmatrix}$$
$$C_1=\begin{bmatrix}1&0&*&0&*\\0&1&*&0&*\\0&0&0&1&*\\0&0&0&0&0\end{bmatrix},\qquad C_2=\begin{bmatrix}1&0&*&*&0\\0&1&*&*&0\\0&0&0&0&1\\0&0&0&0&0\end{bmatrix}$$
If a matrix A has two reduced row echelon forms C1 and C2, then
A ∼ C1 and A ∼ C2
(Equivalence relation)
1. A ∼ A. (Reflexivity)
2. If A ∼ B, then B ∼ A. (Symmetry)
3. If A ∼ B, B ∼ C then A ∼ C. (Transitivity)
Counterexamples: set inclusion (A ⊂ B) is not symmetric; the finger-guessing game (rock–paper–scissors) is not transitive.
When there are infinitely many solutions, one may want to find
a solution x̃ that has the minimum “length”.
$$A=\begin{bmatrix}a_{11}&\cdots&a_{1j}&\cdots&a_{1n}\\\vdots&&\vdots&&\vdots\\a_{i1}&\cdots&a_{ij}&\cdots&a_{in}\\\vdots&&\vdots&&\vdots\\a_{m1}&\cdots&a_{mj}&\cdots&a_{mn}\end{bmatrix}=[\,a_1\ a_2\ \cdots\ a_n\,]$$
and
$$a_j=\begin{bmatrix}a_{1j}\\a_{2j}\\\vdots\\a_{mj}\end{bmatrix}\in\mathbb{R}^m\ (\text{or }\mathbb{C}^m)\ \text{is the }j\text{th column of }A$$
Remarks
1. When m = n, A is a square matrix of order m.
3. An m × n zero matrix
$$O=\begin{bmatrix}0&\cdots&0\\\vdots&\ddots&\vdots\\0&\cdots&0\end{bmatrix}$$
A diagonal matrix D has non-zero entries only on the diagonal entries,
$$D=\begin{bmatrix}d_{11}&&O\\&\ddots&\\O&&d_{nn}\end{bmatrix}$$
The identity matrix
$$I_n=\begin{bmatrix}1&&O\\&\ddots&\\O&&1\end{bmatrix}=[\,e_1\ e_2\ \cdots\ e_n\,],\qquad e_1=\begin{bmatrix}1\\0\\\vdots\\0\end{bmatrix},\ e_2=\begin{bmatrix}0\\1\\\vdots\\0\end{bmatrix},\ \cdots,\ e_n=\begin{bmatrix}0\\\vdots\\0\\1\end{bmatrix}$$
Consider a transformation T from R^n to R^m,
T : Rn �→ Rm : v �→ T (v)
which maps v to T (v). The domain of T is Rn and the codomain is
Rm .
$$e_1=\begin{bmatrix}1\\0\\\vdots\\0\end{bmatrix},\ e_2=\begin{bmatrix}0\\1\\\vdots\\0\end{bmatrix},\ \cdots,\ e_n=\begin{bmatrix}0\\0\\\vdots\\1\end{bmatrix},\qquad v=\begin{bmatrix}v_1\\v_2\\\vdots\\v_n\end{bmatrix}=v_1e_1+v_2e_2+\cdots+v_ne_n$$
Then for a linear transformation T : R^n → R^m, we have
$$T(v)=T(v_1e_1+\cdots+v_ne_n)=[\,T(e_1)\ T(e_2)\ \cdots\ T(e_n)\,]\begin{bmatrix}v_1\\\vdots\\v_n\end{bmatrix}=Av$$
where
$$Av=[\,a_1\ a_2\ \cdots\ a_n\,]\begin{bmatrix}v_1\\\vdots\\v_n\end{bmatrix},\qquad a_k=T(e_k)$$
Now we consider the definition of matrix multiplication, which
corresponds to the composition of two linear transformations.
$$S\circ T:\mathbb{R}^p\to\mathbb{R}^m,\qquad x\mapsto z=A(Bx)=ABx$$
(Figure: T : R^p → R^n, x ↦ y = Bx; S : R^n → R^m, y ↦ z = Ay.)
Now for the m × n matrix A and n × p matrix B,
$$A=[\,a_1\ a_2\ \cdots\ a_n\,]=[a_{ij}],\ \ a_k\in\mathbb{R}^m,\qquad Ab_j=[\,a_1\ \cdots\ a_n\,]\begin{bmatrix}b_{1j}\\\vdots\\b_{nj}\end{bmatrix}$$
$$[AB]_{ij}=\sum_{k=1}^{n}a_{ik}b_{kj}=[\,a_{i1}\ \cdots\ a_{in}\,]\begin{bmatrix}b_{1j}\\\vdots\\b_{nj}\end{bmatrix}$$
Remarks
1. A = B if size(A) = size(B) = (m, n), and
[A]ij = [B]ij , 1 ≤ i ≤ m, 1 ≤ j ≤ n
4. A − B = A + (−B).
The algebraic properties of matrices,
1. A + B = B + A
2. (A + B) + C = A + (B + C)
3. A + O = A, A + (−A) = O
4. r(A + B) = rA + rB
5. (r + s)A = rA + sA
6. r(sA) = (rs)A, 1A = A
where A, B, and C are of the same sizes, and
r, s are real or complex numbers.
Note for two square matrices A and B, in general, AB ≠ BA.
$$[AB]_{ij}=[\,a_{i1}\ \cdots\ a_{in}\,]\begin{bmatrix}b_{1j}\\\vdots\\b_{nj}\end{bmatrix},\qquad [BA]_{ij}=[\,b_{i1}\ \cdots\ b_{in}\,]\begin{bmatrix}a_{1j}\\\vdots\\a_{nj}\end{bmatrix}$$
In addition,
$$AB=O\ \nRightarrow\ A=O\ \text{or}\ B=O$$
Cf. for any a, b ∈ R (or C), we have ab = 0 ⇒ a = 0 or b = 0.
For example,
$$A=\begin{bmatrix}1&1\\1&1\end{bmatrix},\qquad B=\begin{bmatrix}1&-1\\-1&1\end{bmatrix}$$
we have
$$AB=\begin{bmatrix}0&0\\0&0\end{bmatrix}=O$$
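A numerical check of this phenomenon, using the matrices of the example as reconstructed above (numpy assumed):

```python
import numpy as np

A = np.array([[1, 1],
              [1, 1]])
B = np.array([[1, -1],
              [-1, 1]])

print(A @ B)
# [[0 0]
#  [0 0]]  -- AB = O even though A != O and B != O
```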
AB = AC ⇏ B = C
Cf.
A+B =A+C ⇒ B =C
We define A0 = I, if A �= O.
Note that
(A + B)2 = (A + B)(A + B) = A2 + AB + BA + B 2
(A + B)3 = (A + B)(A2 + AB + BA + B 2) = · · ·
• The transpose of a matrix
(Figure: A maps u ∈ R^n to Au ∈ R^m; Aᵀ maps v ∈ R^m to Aᵀv ∈ R^n.)
$$A=\begin{bmatrix}a_{11}&\cdots&a_{1j}&\cdots&a_{1n}\\\vdots&&\vdots&&\vdots\\a_{m1}&\cdots&a_{mj}&\cdots&a_{mn}\end{bmatrix}=[\,a_1\ a_2\ \cdots\ a_n\,],\qquad A^T=\begin{bmatrix}a_1^T\\a_2^T\\\vdots\\a_n^T\end{bmatrix}$$
$$[(AB)^T]_{ij}=[AB]_{ji}=(j\text{th row of }A)\cdot(i\text{th column of }B)$$
$$[B^TA^T]_{ij}=(i\text{th row of }B^T)\cdot(j\text{th column of }A^T)=(i\text{th column of }B)\cdot(j\text{th row of }A)$$
so $(AB)^T=B^TA^T$.
We further have
$$(ABCD)^T=D^TC^TB^TA^T$$
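A quick numerical check of this reversal rule on random rectangular matrices (numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 4))
C = rng.standard_normal((4, 5))
D = rng.standard_normal((5, 2))

lhs = (A @ B @ C @ D).T
rhs = D.T @ C.T @ B.T @ A.T
print(np.allclose(lhs, rhs))   # True
```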
$$A^H=\overline{A}^{\,T}=\begin{bmatrix}\bar a_1^T\\\bar a_2^T\\\vdots\\\bar a_n^T\end{bmatrix}$$
• The inverse of a square matrix
Theorem If A is an invertible n × n matrix, then for each b in R^n,
the equation Ax = b has a unique solution x = A⁻¹b.
1 0 0 0 1 0 ∗ 0
1 0 1
∗
0 1 0 0
0 0 0 0
0 0 0 1 0 0 0 0
I=
0
,
0
0
1
84
Theorem Assume A and B are invertible matrices.
1. (A−1)−1 = A (So if C = A−1, then C −1 = A)
∵ AC = CA = I
Cf.
(ABCD)T = DT C T B T AT
From the above, since
$$(ABCD\cdots)^{-1}=\cdots D^{-1}C^{-1}B^{-1}A^{-1}$$
we have (let A = B = C = · · ·)
$$(A^n)^{-1}=(A^{-1})^n=A^{-n}$$
We also note
$$(cA)^{-1}=c^{-1}A^{-1}$$
Recall that a linear transformation T from R^n to R^m can be represented
by a matrix
A = [T (e1) T (e2) · · · T (en)]
and we can write T (x) = Ax.
(TA)−1 = TA−1
$$T_A:x\mapsto Ax,\qquad T_A^{-1}:Ax\mapsto x$$
(Figure: T maps x to y = Ax; T⁻¹ maps y back to x = A⁻¹y.)
Example Consider a linear transformation T on R³, defined as
$$T\Big(\begin{bmatrix}a\\b\\c\end{bmatrix}\Big)=\begin{bmatrix}a+b\\b+c\\c+a\end{bmatrix},\qquad A=[\,T(e_1)\ T(e_2)\ T(e_3)\,]=\begin{bmatrix}1&1&0\\0&1&1\\1&0&1\end{bmatrix}$$
Note
$$A\begin{bmatrix}a\\b\\c\end{bmatrix}=\begin{bmatrix}1&1&0\\0&1&1\\1&0&1\end{bmatrix}\begin{bmatrix}a\\b\\c\end{bmatrix}=\begin{bmatrix}a+b\\b+c\\c+a\end{bmatrix}=T\Big(\begin{bmatrix}a\\b\\c\end{bmatrix}\Big)$$
• Elementary matrices
Recall the three elementary row operations on a matrix.
1. (Replacement) Add to one row a multiple of another row.
2. (Interchange) Interchange two rows.
3. (Scaling) Multiply all entries in a row by a non-zero constant.
(Figure: a replacement operation ×3 is undone by ×(−3); a scaling ×2 is undone by ×1/2.)
1. (Replacement) (n = 3)
$$I=\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix},\quad E_1=\begin{bmatrix}1&0&0\\0&1&0\\5&0&1\end{bmatrix},\quad E_1^{-1}=\begin{bmatrix}1&0&0\\0&1&0\\-5&0&1\end{bmatrix}$$
2. (Interchange)
$$I=\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix},\quad E_2=\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix},\quad E_2^{-1}=E_2$$
3. (Scaling) Assume c ≠ 0,
$$I=\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix},\quad E_3=\begin{bmatrix}1&0&0\\0&c&0\\0&0&1\end{bmatrix},\quad E_3^{-1}=\begin{bmatrix}1&0&0\\0&c^{-1}&0\\0&0&1\end{bmatrix}$$
Theorem (Equivalent Statements of Matrix Inversion)
If A is an n × n matrix, then the following statements are equivalent.
a. A is invertible.
b. Ax = 0 has only the trivial solution. (x = 0)
c. The reduced row-echelon form of A is In. ( A ∼ In)
d. A is expressible as a product of elementary matrices.
Proof : ( (a)⇒(b)⇒(c)⇒(d)⇒(a) )
(b)⇒(c) : [A | 0] ∼ [In | 0], since if the reduced row-echelon form of the square matrix A were not In, it would contain a row of zeros, giving a free variable and hence non-trivial solutions.
(c)⇒(d) :
$$A\sim I_n\ \Rightarrow\ E_qE_{q-1}\cdots E_1A=I_n$$
$$E_{q-1}\cdots E_1A=E_q^{-1}$$
$$E_{q-2}\cdots E_1A=E_{q-1}^{-1}E_q^{-1}$$
$$\vdots$$
$$A=E_1^{-1}\cdots E_{q-1}^{-1}E_q^{-1}$$
(d)⇒(a) : Since $A=E_1^{-1}\cdots E_{q-1}^{-1}E_q^{-1}$, we have
$$A^{-1}=(E_1^{-1}E_2^{-1}\cdots E_q^{-1})^{-1}=E_q\cdots E_2E_1$$
• An algorithm for finding A⁻¹
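The algorithm row-reduces [A | I] to [I | A⁻¹]. A minimal sketch in Python (numpy assumed; partial pivoting is added for numerical stability, which the hand computation does not require):

```python
import numpy as np

def inverse_gauss_jordan(A):
    """Row-reduce the augmented matrix [A | I] to [I | A^{-1}]."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])         # [A | I]
    for c in range(n):
        p = c + np.argmax(np.abs(M[c:, c]))
        if abs(M[p, c]) < 1e-12:
            raise ValueError("A is not invertible")
        M[[c, p]] = M[[p, c]]             # interchange
        M[c] /= M[c, c]                   # make the pivot 1
        for r in range(n):
            if r != c:
                M[r] -= M[r, c] * M[c]    # clear the rest of the column
    return M[:, n:]

# The matrix of the transformation T(a,b,c) = (a+b, b+c, c+a)
# from the earlier example:
A = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [1., 0., 1.]])
print(np.allclose(inverse_gauss_jordan(A) @ A, np.eye(3)))   # True
```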
Proof :
1. BA = I
(a) First we prove that A is invertible.
(b) We can show that Ax = 0 has only the trivial solution.
Multiplying both sides by B, we have BAx = B0, or x = 0,
the trivial solution.
By the previous theorem (equivalent statements of matrix inver-
sion), we see that A is invertible and A−1 exists.
(c) Now since A is invertible and
BA = I
Multiplying both sides of the above by A⁻¹ from the right, we have
B = A⁻¹. By the previous results, B⁻¹ = (A⁻¹)⁻¹ = A.
Writing A⁻¹ = [x1 x2 ··· xn], the columns solve
$$Ax_1=\begin{bmatrix}1\\0\\\vdots\\0\end{bmatrix},\ Ax_2=\begin{bmatrix}0\\1\\\vdots\\0\end{bmatrix},\ \cdots,\ Ax_n=\begin{bmatrix}0\\0\\\vdots\\1\end{bmatrix}$$
$$D=\begin{bmatrix}d_{11}&&0\\&\ddots&\\0&&d_{nn}\end{bmatrix},\qquad D^{-1}=\begin{bmatrix}d_{11}^{-1}&&0\\&\ddots&\\0&&d_{nn}^{-1}\end{bmatrix}$$
The powers of D are
$$D^k=DD\cdots D=\begin{bmatrix}d_{11}^k&&0\\&d_{22}^k&\\0&&d_{nn}^k\end{bmatrix}$$
A square matrix whose entries below (above) the main diagonal are zero is called upper (lower) triangular:
$$\begin{bmatrix}*&*&*&*\\0&*&*&*\\0&0&*&*\\0&0&0&*\end{bmatrix}\ \text{(upper triangular)},\qquad\begin{bmatrix}*&0&0&0\\ *&*&0&0\\ *&*&*&0\\ *&*&*&*\end{bmatrix}\ \text{(lower triangular)}$$
Theorem Let U and L denote upper and lower triangular matrices,
respectively.
a. U T is lower triangular, while LT is upper triangular.
b. U1U2 is upper triangular, L1L2 is lower triangular.
c. If all the diagonal entries of U (or L) are non-zero, then U (or L) is
invertible.
d. If U (or L) is invertible, then U⁻¹ (or L⁻¹) is upper (lower) triangular.
Proof : (a) is clear.
(b) An entry of U1U2 below the diagonal is a sum of products in which one factor is always zero, e.g.
$$\begin{bmatrix}*&*&*&*\\0&*&*&*\\0&0&*&*\\0&0&0&*\end{bmatrix}\begin{bmatrix}*&*&*&*\\0&*&*&*\\0&0&*&*\\0&0&0&*\end{bmatrix}=\begin{bmatrix}x&x&x&x\\0&x&x&x\\0&0&x&x\\0&0&0&x\end{bmatrix}$$
The proofs of (c) and (d) will be given in the next chapter when we
discuss determinants.
• Symmetric matrices
Definition A square matrix A is called symmetric if AT = A.
$$M=\begin{bmatrix}a&e&h&p\\e&b&f&k\\h&f&c&g\\p&k&g&d\end{bmatrix}=M^T$$
is a symmetric matrix.
If A and B are symmetric, the product AB need not be symmetric:
$$(AB)^T=B^TA^T=BA\neq AB\ \text{in general}$$
On the other hand, BBᵀ is always symmetric:
$$(BB^T)^T=(B^T)^TB^T=BB^T$$
• Partitioned matrices
$$A=\begin{bmatrix}a_{11}&\cdots&a_{1n}\\\vdots&&\vdots\\a_{m1}&\cdots&a_{mn}\end{bmatrix}=[\,c_1\ c_2\ \cdots\ c_n\,]=\begin{bmatrix}r_1^T\\r_2^T\\\vdots\\r_m^T\end{bmatrix}$$
$$A=\begin{bmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{bmatrix},\qquad B=\begin{bmatrix}B_1\\B_2\end{bmatrix},\qquad C=[\,C_1\ C_2\,]$$
$$AB=\begin{bmatrix}A_{11}B_1+A_{12}B_2\\A_{21}B_1+A_{22}B_2\end{bmatrix},\qquad CA=[\,C_1A_{11}+C_2A_{21}\ \ \ C_1A_{12}+C_2A_{22}\,]$$
1. Note the order in multiplication of two matrices,
$$A_{11}B_1+A_{12}B_2\neq B_1A_{11}+B_2A_{12}$$
2. Similarly,
$$\begin{bmatrix}A\\B\end{bmatrix}C=\begin{bmatrix}AC\\BC\end{bmatrix},\qquad D\begin{bmatrix}A\\B\end{bmatrix}\neq\begin{bmatrix}DA\\DB\end{bmatrix}$$
Let
$$A=\begin{bmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{bmatrix}$$
then
$$A^T=\begin{bmatrix}A_{11}^T&A_{21}^T\\A_{12}^T&A_{22}^T\end{bmatrix}$$
Exercise
Find the transposes of
$$B=\begin{bmatrix}B_1\\B_2\end{bmatrix},\qquad C=[\,C_1\ C_2\,]$$
• Column-Row expansion of AB
Recall that for matrix multiplication,
$$[AB]_{ij}=[\,a_{i1}\ \cdots\ a_{in}\,]\begin{bmatrix}b_{1j}\\\vdots\\b_{nj}\end{bmatrix}=\sum_{k=1}^n a_{ik}b_{kj}$$
Write ($a_k\in\mathbb{R}^m$, $b_k\in\mathbb{R}^p$)
$$A=[\,a_1\ a_2\ \cdots\ a_n\,],\qquad B=\begin{bmatrix}b_1^T\\b_2^T\\\vdots\\b_n^T\end{bmatrix}$$
then
$$AB=a_1b_1^T+a_2b_2^T+\cdots+a_nb_n^T$$
which is called the column-row rule.
$$a_kb_k^T=\begin{bmatrix}a_{1k}\\a_{2k}\\\vdots\\a_{mk}\end{bmatrix}[\,b_{k1}\ b_{k2}\ \cdots\ b_{kp}\,],\qquad 1\le k\le n$$
Since
$$[a_kb_k^T]_{ij}=a_{ik}b_{kj}$$
we have
$$\big[a_1b_1^T+a_2b_2^T+\cdots+a_nb_n^T\big]_{ij}=\sum_{k=1}^n\big[a_kb_k^T\big]_{ij}=\sum_{k=1}^n a_{ik}b_{kj}=[AB]_{ij}$$
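The column-row rule is easy to confirm numerically: the sum of the n outer products a_k b_kᵀ reproduces AB (numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))   # columns a_1, a_2, a_3 in R^4
B = rng.standard_normal((3, 5))   # rows b_1^T, b_2^T, b_3^T

# Sum of outer products: a_1 b_1^T + a_2 b_2^T + a_3 b_3^T
S = sum(np.outer(A[:, k], B[k, :]) for k in range(3))

print(np.allclose(S, A @ B))   # True
```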
• Inverses of partitioned matrices
If $A=\begin{bmatrix}A_{11}&A_{12}\\O&A_{22}\end{bmatrix}$ is invertible,
$$A^{-1}=\begin{bmatrix}A_{11}^{-1}&-A_{11}^{-1}A_{12}A_{22}^{-1}\\O&A_{22}^{-1}\end{bmatrix}$$
Note that A is invertible implies that both A11 and A22 are invertible, and vice versa.
Exercise Check
$$\begin{bmatrix}A_{11}^{-1}&-A_{11}^{-1}A_{12}A_{22}^{-1}\\O&A_{22}^{-1}\end{bmatrix}\begin{bmatrix}A_{11}&A_{12}\\O&A_{22}\end{bmatrix}=I$$
Exercise Find the inverse of
$$M=\begin{bmatrix}A_{11}&O\\A_{21}&A_{22}\end{bmatrix}$$
Hint
$$M^T=\begin{bmatrix}A_{11}&O\\A_{21}&A_{22}\end{bmatrix}^T=\begin{bmatrix}A_{11}^T&A_{21}^T\\O&A_{22}^T\end{bmatrix},\qquad (M^T)^{-1}=(M^{-1})^T$$
• LU decomposition (factorization)
$$A=\underbrace{\begin{bmatrix}1&0&0&0\\ *&1&0&0\\ *&*&1&0\\ *&*&*&1\end{bmatrix}}_{L}\ \underbrace{\begin{bmatrix}\bullet&*&*&*&*\\0&\bullet&*&*&*\\0&0&0&\bullet&*\\0&0&0&0&0\end{bmatrix}}_{U}$$
A : m × n, L : m × m, U : m × n.
Ax = b
With A = LU, the system becomes L(Ux) = b, which is solved in two triangular steps:
$$Ly=b,\qquad Ux=y$$
− Algorithm for an LU factorization
Assume we can reduce a matrix A to an echelon form U by elementary
row operations, without the row interchange,
$$A\sim\cdots\sim U=\begin{bmatrix}\bullet&*&*&*&*\\0&\bullet&*&*&*\\0&0&0&\bullet&*\\0&0&0&0&0\end{bmatrix}$$
Eq · · · E 1 A = U
where each Ek is lower triangular, for example,
$$\begin{bmatrix}1&0&0&0\\-2&1&0&0\\0&0&1&0\\0&0&0&1\end{bmatrix},\qquad\begin{bmatrix}1&0&0&0\\0&1&0&0\\3&0&1&0\\0&0&0&1\end{bmatrix},\qquad\begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&2\end{bmatrix}$$
In
Eq · · · E1A = U
the matrix (Eq · · · E1) is lower triangular.
Then
A = (Eq · · · E1)−1U = LU
where
L = (Eq · · · E1)−1
is lower triangular.
$$A=(E_1^{-1}\cdots E_{q-1}^{-1}E_q^{-1})\,U$$
Example
$$A=\begin{bmatrix}2&4&-1&5&-2\\-4&-5&3&-8&1\\2&-5&-4&1&8\\-6&0&7&-3&1\end{bmatrix}$$
Reduce A while recording the multipliers in L (each sub-diagonal entry of L is the multiplier that eliminated the corresponding entry of A):
$$A\sim\begin{bmatrix}2&4&-1&5&-2\\0&3&1&2&-3\\0&-9&-3&-4&10\\0&12&4&12&-5\end{bmatrix}\sim\begin{bmatrix}2&4&-1&5&-2\\0&3&1&2&-3\\0&0&0&2&1\\0&0&0&4&7\end{bmatrix}\sim\begin{bmatrix}2&4&-1&5&-2\\0&3&1&2&-3\\0&0&0&2&1\\0&0&0&0&5\end{bmatrix}=U$$
$$L=\begin{bmatrix}1&0&0&0\\-2&1&0&0\\1&-3&1&0\\-3&4&2&1\end{bmatrix}$$
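The factors found above can be checked by multiplying them back together (numpy assumed; L and U are the matrices of the example as reconstructed):

```python
import numpy as np

A = np.array([[ 2,  4, -1,  5, -2],
              [-4, -5,  3, -8,  1],
              [ 2, -5, -4,  1,  8],
              [-6,  0,  7, -3,  1]])

L = np.array([[ 1,  0, 0, 0],
              [-2,  1, 0, 0],
              [ 1, -3, 1, 0],
              [-3,  4, 2, 1]])

U = np.array([[2, 4, -1, 5, -2],
              [0, 3,  1, 2, -3],
              [0, 0,  0, 2,  1],
              [0, 0,  0, 0,  5]])

print(np.array_equal(L @ U, A))   # True
```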
Example
6 −2 0 1 0 0
−1
3 7 5 1
A = 9 1 1 0
1 − 13 0 6 0 0
−1
3 7 5 1
∼ 9 1 → 1 0
1 − 13 0 6 0 0
2
0 8 5 3 1
∼ 0 1 → 9 1 0
1 − 13 0 6 0 0
1 2
0 8 5 3 1
∼ 0 1 → 9 2 0
1 − 13 0 6 0 0
1 2
0 0 1 3 8 1
∼ 0 1=U → 9 2 0 = L
Note that
1. LU decomposition doesn’t necessarily exist for every m × n
matrix A.
$$A\sim\cdots\sim\begin{bmatrix}1&2&4\\0&3&7\\0&0&1\end{bmatrix}$$
2. When row interchanges are needed, one factors P′A = LU for a permutation matrix P′, so that A = PLU, where P = (P′)⁻¹.
Definition For a square matrix A of size n, we define the trace of A as
$$\operatorname{tr}(A)=\sum_{k=1}^n a_{kk},\qquad A=[a_{ij}]$$
i.e., the sum of the diagonal entries $a_{11},a_{22},\dots,a_{nn}$.
Theorem
Suppose that A and B are two n × n square matrices. Then
tr(AB) = tr(BA)
Proof
Both AB and BA are n × n matrices.
$$\operatorname{tr}(AB)=\sum_{k=1}^n[AB]_{kk}=\sum_{k=1}^n\sum_{\ell=1}^n[A]_{k\ell}[B]_{\ell k}$$
$$\operatorname{tr}(BA)=\sum_{k=1}^n[BA]_{kk}=\sum_{k=1}^n\sum_{\ell=1}^n[B]_{k\ell}[A]_{\ell k}$$
The two double sums are equal (interchange the roles of k and ℓ).
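A quick numerical check of tr(AB) = tr(BA) on random square matrices (numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

print(np.isclose(np.trace(A @ B), np.trace(B @ A)))   # True
```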
Exercise
Suppose that A and B are two m × n matrices. Prove that
tr(B T A) = tr(AB T )
B T A is n × n, while AB T is m × m.
$$\operatorname{tr}(B^TA)=\sum_{k=1}^n[B^TA]_{kk}=\sum_{k=1}^n\sum_{\ell=1}^m[B^T]_{k\ell}[A]_{\ell k}=\sum_{k=1}^n\sum_{\ell=1}^m[B]_{\ell k}[A]_{\ell k}$$
$$\operatorname{tr}(AB^T)=\sum_{k=1}^m[AB^T]_{kk}=\sum_{k=1}^m\sum_{\ell=1}^n[A]_{k\ell}[B^T]_{\ell k}=\sum_{k=1}^m\sum_{\ell=1}^n[A]_{k\ell}[B]_{k\ell}$$
■ Determinants
The determinant of a square matrix A:
A �→ det(A) ∈ R (or C)
(Figure: the parallelogram spanned by a1 and a2 has area |det[a1 a2]|.)
$$A=[\,a_1\ a_2\ a_3\,],\qquad\text{Volume}=|\det(A)|$$
(Figure: the parallelepiped spanned by a1, a2, a3.)
Definition For a square matrix A, define
Aij : the submatrix formed by deleting the ith row and jth
column of A.
Example:
$$A=\begin{bmatrix}1&-2&5&0\\2&0&4&-1\\3&1&0&7\\0&4&-2&0\end{bmatrix},\qquad A_{32}=\begin{bmatrix}1&5&0\\2&4&-1\\0&-2&0\end{bmatrix}$$
1. For a 1 × 1 matrix A = [a11], we define det(A) = a11.
2. For an n × n matrix A = [aij], det(A) is defined as
$$\det(A)=a_{11}\det(A_{11})-a_{12}\det(A_{12})+\cdots+(-1)^{1+n}a_{1n}\det(A_{1n})=\sum_{j=1}^n(-1)^{1+j}a_{1j}\det A_{1j}$$
We call
(1) det(Aij) : the minor of aij, or the (i, j)-minor.
(2) $C_{ij}=(-1)^{i+j}\det(A_{ij})$ : the (i, j)-cofactor of A.
Example:
$$\det\begin{bmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{bmatrix}=a_{11}\det[a_{22}]-a_{12}\det[a_{21}]=a_{11}a_{22}-a_{12}a_{21}$$
• In fact, it can be proved that we can expand along any row, say
the ith row,
If A, B are triangular matrices,
$$A=\begin{bmatrix}a_{11}&a_{12}&\cdots&a_{1n}\\0&a_{22}&\cdots&a_{2n}\\\vdots&&\ddots&\vdots\\0&0&\cdots&a_{nn}\end{bmatrix},\qquad B=\begin{bmatrix}b_{11}&0&\cdots&0\\b_{21}&b_{22}&\cdots&0\\\vdots&&\ddots&\vdots\\b_{n1}&b_{n2}&\cdots&b_{nn}\end{bmatrix}$$
then, expanding along the first column,
$$\det A=a_{11}\cdot\det\begin{bmatrix}a_{22}&\cdots&a_{2n}\\&\ddots&\vdots\\0&\cdots&a_{nn}\end{bmatrix}=a_{11}a_{22}\cdots a_{nn},\qquad\det B=b_{11}b_{22}\cdots b_{nn}$$
In particular, for a diagonal matrix,
$$\det\begin{bmatrix}d_{11}&&0\\&\ddots&\\0&&d_{nn}\end{bmatrix}=d_{11}d_{22}\cdots d_{nn}$$
• Row operations and determinants
$$E_a=\begin{bmatrix}1&0&k\\0&1&0\\0&0&1\end{bmatrix}\ \text{(replacement)},\qquad E_b=\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix}\ \text{(interchange)},\qquad E_c=\begin{bmatrix}1&0&0\\0&r&0\\0&0&1\end{bmatrix},\ r\neq0\ \text{(scaling)}$$
It is clear that
$$\det E_a=1,\qquad\det E_b=-1,\qquad\det E_c=r$$
Let A be a square matrix. We will show that
det (EA) = (det E)(det A)
where E is an elementary matrix.
1. A multiple of one row of A is added to another row.
We conclude that
det (Ek A) = (det Ek ) (det A), k = a, b, c (5)
By (5), we have
det A = det (E1E2 · · · · · Ep)
= (det E1) det (E2 · · · · · Ep)
= (det E1)(det E2) · · · (det Ep) ≠ 0
• If A is not invertible, its reduced row-echelon form Ĩ contains a row of zeros:
$$E'_p\cdots E'_1A=\tilde I,\qquad A=E_1\cdots E_p\tilde I$$
For example,
$$\tilde I=\begin{bmatrix}1&2&0&0\\0&0&1&0\\0&0&0&1\\0&0&0&0\end{bmatrix},\qquad\det\tilde I=0$$
and
$$\det A=\det(E_1E_2\cdots E_p\tilde I)=\det(E_1E_2\cdots E_p)\det\tilde I=0$$
Theorem (Equivalent Statements of Matrix Inversion)
If A is invertible, let
A = E1E2 · · · Ep
By (5),
det AB = det (E1E2 · · · EpB)
= (det E1) [det (E2 · · · EpB)] = · · ·
= (det E1) (det E2) · · · · · (det Ep)(det B)
= (det E1E2 · · · Ep)(det B)
= (det A) (det B)
Therefore, we have
det AB = det BA = (det A) (det B)
although AB ≠ BA in general.
Theorem : If A is an n × n matrix, then det Aᵀ = det A.
Proof :
1. It holds for n = 1, since A = Aᵀ = [a11].
2. Assume it holds for (n − 1) × (n − 1) matrices, n ≥ 2.
3. For an n × n matrix A,
$$\det A=a_{11}C_{11}+a_{12}C_{12}+\cdots+a_{1n}C_{1n}$$
$$\det A^T=a_{11}C'_{11}+a_{12}C'_{21}+\cdots+a_{1n}C'_{n1}$$
where the $C'_{ij}$ are the cofactors of $A^T$.
Note that
1. det(A + B) ≠ det A + det B in general
2. det(cA) = cⁿ(det A), if A is of size n × n
• Cramer’s Rule
For an n × n matrix A, and b ∈ Rn,
A = [a1 · · · ai · · · an]
define
Ai(b) = [a1 · · · ai−1 b ai+1 · · · an]
namely, replace ai by b.
Example
$$I_i(x)=[\,e_1\ \cdots\ x\ \cdots\ e_n\,]=\begin{bmatrix}1&0&\cdots&x_1&\cdots&0\\0&1&\cdots&x_2&\cdots&0\\\vdots&&\ddots&\vdots&&\vdots\\0&0&\cdots&x_i&\cdots&0\\\vdots&&&\vdots&\ddots&\vdots\\0&0&\cdots&x_n&\cdots&1\end{bmatrix}\ \Rightarrow\ \det I_i(x)=x_i$$
Theorem (Cramer's Rule) : Let A be an n × n invertible matrix.
For any b in R^n, the unique solution x of Ax = b has entries given by
$$x_i=\frac{\det A_i(b)}{\det A},\qquad i=1,2,\cdots,n$$
Proof :
$$A\,I_i(x)=A[\,e_1\ \cdots\ x\ \cdots\ e_n\,]=[\,Ae_1\ \cdots\ Ax\ \cdots\ Ae_n\,]=[\,a_1\ \cdots\ b\ \cdots\ a_n\,]=A_i(b)$$
Taking determinants, $(\det A)(\det I_i(x))=\det A_i(b)$, i.e. $(\det A)\,x_i=\det A_i(b)$.
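A direct implementation of Cramer's rule (a sketch, numpy assumed), tested on the 3 × 3 system solved earlier (x + y + 2z = 9, 3x + 6y − 5z = 0, 2x + 4y − 3z = 1):

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b via x_i = det A_i(b) / det A (A invertible)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b          # replace the ith column by b
        x[i] = np.linalg.det(Ai) / d
    return x

A = [[1, 1, 2], [3, 6, -5], [2, 4, -3]]
b = [9, 0, 1]
print(cramer(A, b))   # -> [1. 2. 3.]
```

In practice Gaussian elimination is cheaper; Cramer's rule is mainly of theoretical interest.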
• A formula for A⁻¹ :
Assume A is invertible and A⁻¹ = B = [b1 b2 ··· bn]. Since AB = I, the jth column satisfies
$$Ab_j=e_j$$
so by Cramer's rule,
$$[b_j]_i=\frac{\det A_i(e_j)}{\det A}$$
and
$$\det A_i(e_j)=C_{ji}=(-1)^{j+i}\det(A_{ji})$$
Ref.
$$\det(A)=a_{i1}C_{i1}+a_{i2}C_{i2}+\cdots+a_{in}C_{in}=a_{1j}C_{1j}+a_{2j}C_{2j}+\cdots+a_{nj}C_{nj}$$
Since [bj]i = [B]ij, we have
$$[B]_{ij}=\frac{C_{ji}}{\det A}$$
Hence
$$A^{-1}=B=\frac{1}{\det A}\begin{bmatrix}C_{11}&C_{21}&\cdots&C_{n1}\\C_{12}&C_{22}&\cdots&C_{n2}\\\vdots&&&\vdots\\C_{1n}&C_{2n}&\cdots&C_{nn}\end{bmatrix}=\frac{1}{\det A}\operatorname{adj}(A)\qquad(6)$$
where
$$\operatorname{adj}(A)=\begin{bmatrix}C_{11}&C_{21}&\cdots&C_{n1}\\C_{12}&C_{22}&\cdots&C_{n2}\\\vdots&&&\vdots\\C_{1n}&C_{2n}&\cdots&C_{nn}\end{bmatrix}$$
For an upper triangular U, the submatrices used for the cofactors are again triangular, e.g.
$$U=\begin{bmatrix}*&*&*&*\\0&*&*&*\\0&0&*&*\\0&0&0&*\end{bmatrix},\qquad U_{12}=\begin{bmatrix}0&*&*\\0&*&*\\0&0&*\end{bmatrix},\qquad U_{23}=\begin{bmatrix}*&*&*\\0&0&*\\0&0&*\end{bmatrix}$$
For
$$A=[\,a_1\ a_2\,]=\begin{bmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{bmatrix},\qquad B=[\,b_1\ b_2\ b_3\,]$$
then
(1). the area of the parallelogram determined by a1 and a2 is |det A|,
(2). the volume of the parallelepiped determined by b1, b2 and b3 is |det B|.
(Figure: the parallelogram spanned by a1 and a2.)
1. It holds if
$$A=\begin{bmatrix}\ell_1&0\\0&\ell_2\end{bmatrix},\qquad B=\begin{bmatrix}\ell_1&0&0\\0&\ell_2&0\\0&0&\ell_3\end{bmatrix}$$
then det A = ℓ1ℓ2 and det B = ℓ1ℓ2ℓ3 are the corresponding area and volume, respectively.
(Figure: a rectangle with sides ℓ1 and ℓ2.)
(Figure: the shear a2 ↦ a2 − k a1 does not change the area of the parallelogram.)
$$[\,a_1\ \ a_2-ka_1\,]=\begin{bmatrix}\cos\phi&-\sin\phi\\\sin\phi&\cos\phi\end{bmatrix}\begin{bmatrix}s&0\\0&t\end{bmatrix}$$
Ref:
$$\big|\det[\,a_1\ a_2\ \cdots\ a_n\,]\big|$$
$$a_1,\qquad a_2-c_{21}a_1\perp a_1,\qquad a_3-c_{31}a_1-c_{32}a_2\perp a_1,a_2,\qquad\dots$$
■ Vector Spaces
For vectors in R^n (or C^n), we have the following definitions.
1. For two vectors u = (u1, u2, . . . , un)ᵀ and v = (v1, v2, . . . , vn)ᵀ in R^n, the sum u + v is defined by u + v = (u1 + v1, u2 + v2, . . . , un + vn)ᵀ.
Theorem If u = (u1, u2, . . . , un)ᵀ, v = (v1, v2, . . . , vn)ᵀ, and
w = (w1, w2, . . . , wn)ᵀ are vectors in R^n (or C^n) and c and d
are scalars in R (or C), then:
1. u + v = v + u
2. (u + v) + w = u + (v + w)
3. u + 0 = u (0 = (0, 0, . . . , 0)T )
4. u + (−u) = 0 ( −u = (−1)u )
5. c(u + v) = cu + cv
6. (c + d)u = cu + du
7. c(du) = (cd)u
8. 1u = u
The above properties will be used to define a vector space.
• Motivation
Consider some mathematical sets and the related linear operations.
1. Rn (or Cn)
v1 + v2 , cv1, v = (v1, v2, . . . , vn)T , vk ∈ R or C
3. Mmn(R), or Mmn(C)
M1 + M2, cM1
4. { f (t) ∈ R, t ∈ R }
f1(t) + f2(t), cf1(t)
Remarks
1. For the above sets, we have similar definitions of addition “+”
and scalar multiplication.
2. How to efficiently study further issues such as the concept of bases,
linear transformation, eigenvalues/eigenvectors, inner products, etc.?
The properties of the above theorem are actually common for the
following sets.
1. Rn (or Cn)
2. p(t) = a0 + a1t + a2t2 + · · · + antn, ak ∈ R or C
3. Mmn(R), or Mmn(C)
4. { f (t) ∈ R, t ∈ R }
5. { {s} = (s1, s2, . . .), sk ∈ R or C }
6. { X : Ω → R }
We use the properties of the vectors in R^n (or C^n) to define a vector space.
Axiom (or Definition )
A vector space V over a field F (R or C) is a nonempty set V of
vectors on which are defined two operations, called
addition (+) and scalar multiplication,
- addition: u + v
- scalar multiplication: cu
u, v ∈ V and c ∈ F (R or C)
For all u, v, w ∈ V and scalars c, d ∈ F (R or C),
1. (Closure) The sum of u and v, denoted by u + v, is in V .
2. (Closure) The scalar multiple of u by c, denoted by cu, is in V .
(Figure: for u, v ∈ V and a scalar c, the vectors u + v and cu remain in V.)
3. u + v = v + u
4. (u + v) + w = u + (v + w)
5. There is a zero vector 0 in V such that u + 0 = u for any u.
6. For each u in V , there exists a vector w such that u + w = 0
We use −u to denote w.
7. c(u + v) = cu + cv
8. (c + d)u = cu + du
9. c(du) = (cd)u
10. 1u = u
By the above definition of a vector space, it can be verified that
each of the following sets can be regarded as a vector space.
3. Mmn(R), or Mmn(C)
4. {f (t) ∈ R, t ∈ R}
Example A vector space of real-valued functions defined on (a, b),
denoted by F (a, b), is a vector space. The space F (a, b) is not
of finite dimension.
F (a, b) = {f (t) ∈ R | t ∈ (a, b)}
(V, F )
Example (Rn, R), (Cn, C), (F n, F ) are vector spaces.
Remarks
• Any kind of set can be a vector space if the above ten rules are
satisfied.
Fact: The zero vector 0 in a vector space V is unique.
Assume we have two zero vectors 01 and 02 in V .
Then
0 1 = 0 1 + 0 2 = 02
Similarly, the negative of u is unique: if u + w1 = 0 and u + w2 = 0, then
w1 = w1 + 0 = w1 + (u + w2) = (w1 + u) + w2 = 0 + w2 = w2
Therefore, in a vector space,
1. the zero vector 0,
2. and the negative vector of a vector u, denoted by −u,
are well-defined.
Theorem (Cancellation law for vector addition)
Let V be a vector space and u, v, w are vectors in V . If
u+w =v+w
then u = v.
Proof.
There exists a z ∈ V such that w + z = 0.
u = u + 0 = u + (w + z)
= (u + w) + z
= (v + w) + z
= v + (w + z)
=v+0=v
Theorem Let (V, F ) be a vector space, u a vector in V , and
k a scalar in F . Then
1. 0u = 0
2. k0 = 0
3. (−1)u = −u
4. If ku = 0, then k = 0 or u = 0
Proof
1. 0u + 0u = (0 + 0)u = 0u = 0u + 0
By the cancellation law for vector addition, we have 0u = 0.
2. k0 = k(0 + 0) = k0 + k0
k0 = k0 + 0
⇒ k0 + k0 = k0 + 0 ⇒ k0 = 0 (by the cancellation law)
Remarks
1. In the above, we define the vector space.
2. We use the abstract method to achieve generalization.
3. Abstraction in mathematics is the process of extracting the underly-
ing essence of a mathematical concept.
4. We have an abstract at the front of an article or a book.
5. When referring to an abstract topic, we can use the cases (or exam-
ples) of R2 or R3 to verify it.
Example Let V = R². Define addition and scalar multiplication
operations as follows. For u = (u1, u2), v = (v1, v2), we define
u + v = (u1 + v1, u2 + v2), ku = (ku1, 0)
The first nine rules of the axiom are satisfied. However, Axiom 10 fails
to hold.
1u = 1(u1, u2) = (u1, 0) �= u
With the usual operations, the polynomials of degree ≤ 2 behave like coordinate vectors:
$$(a_0+a_1t+a_2t^2)+(b_0+b_1t+b_2t^2)\ \leftrightarrow\ \begin{bmatrix}a_0+b_0\\a_1+b_1\\a_2+b_2\end{bmatrix},\qquad c\,(a_0+a_1t+a_2t^2)\ \leftrightarrow\ c\begin{bmatrix}a_0\\a_1\\a_2\end{bmatrix}$$
In the following, we will study the issues of subspaces and the dimension
of a vector space.
• Subspaces
Definition A subspace of a vector space V is a subset H of V
that is also a vector space.
Example V = R³, and H = R².
(Figure: a plane inside R³.)
To determine if a subset H of a vector space V is a subspace, we
only need to examine the following three rules, for all u, v ∈ H, and
c ∈ F,
1. 0 ∈ H
2. (Closure) u ∈ H, v ∈ H ⇒ u + v ∈ H
3. (Closure) u ∈ H, c ∈ F ⇒ cu ∈ H
For a subset H ⊂ V , H is a subspace if for all u, v ∈ H, and c ∈ F ,
1. 0 ∈ H
2. (Closure) u ∈ H, v ∈ H ⇒ u + v ∈ H
3. (Closure) u ∈ H, c ∈ F ⇒ cu ∈ H
Example The line {(x, y) | y = ax} through the origin is a subspace of R².
Example The subspaces of R³ include
• {0}
• Lines through the origin H = {(t, at, bt) | t ∈ R}, a, b ∈ R.
• Planes through the origin.
• R³
Example Assume m < n, and
W = {c0 + c1t + · · · + cmtm | c0, . . . , cm ∈ R}
V = {c0 + c1t + · · · + cntn | c0, . . . , cn ∈ R}
Then W is a (proper) subspace of V .
W ⊂ Rn
Example
P (a, b) ⊂ C ∞(a, b) ⊂ C m(a, b) ⊂ C 1(a, b) ⊂ C(a, b) ⊂ F (a, b)
Example Let Mnn be the space of square matrices of size n.
1. The set of n × n symmetric matrices (A = AT ) is a subspace of Mnn.
2. The set of n × n upper triangular matrices (U ) is a subspace of Mnn.
3. The set of n × n lower triangular matrices (L) is a subspace of Mnn.
4. The set of n × n diagonal matrices (D) is a subspace of Mnn.
Exercise Is H ∪ W a subspace of V ?
The subspaces of R³ include:
1. The x-axis, y-axis, and z-axis.
2. The xy-plane, yz-plane, zx-plane.
(Figure: two subspaces W and U; in fact, 0 ∈ W ∩ U.)
How to define the dimension of a vector space?
Consider W = {c1v1 + c2v2 + · · · + cpvp | ck ∈ F}. Then W is a subspace:
1. 0 ∈ W (Let c1 = · · · = cp = 0)
2. Assume u ∈ W , s ∈ W , k ∈ F ,
u = c1v1 + c2v2 + · · · + cpvp
s = d1v1 + d2v2 + · · · + dpvp
then u + s = (c1 + d1)v1 + (c2 + d2)v2 + · · · + (cp + dp)vp ∈ W
and ku = (kc1)v1 + (kc2)v2 + · · · + (kcp)vp ∈ W
• We call span{v1, v2, . . . , vp} the subspace spanned (or generated)
by {v1, v2, . . . , vp}.
If v1, v2, . . . , vp are linearly dependent, there are c1, c2, . . . , cp,
not all zero,
|c1|2 + |c2|2 + · · · + |cp|2 �= 0
such that
c1v1 + c2v2 + · · · + cpvp = 0
In this case, say, ck ≠ 0; then vk can be written as a linear combination of the other vectors.
(Figure: if c1v1 + c2v2 = 0 with c1, c2 not both zero, then v1 and v2 lie on one line, and any u = d1v1 + d2v2 stays on that line.)
Example The set of vectors {(1, 0, 0)ᵀ, (0, 1, 0)ᵀ, (0, 0, 1)ᵀ} is a linearly independent set,
$$c_1\begin{bmatrix}1\\0\\0\end{bmatrix}+c_2\begin{bmatrix}0\\1\\0\end{bmatrix}+c_3\begin{bmatrix}0\\0\\1\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}\ \Rightarrow\ c_1=c_2=c_3=0$$
while the set {(1, 0)ᵀ, (0, 1)ᵀ, (2, 3)ᵀ} is not an independent set,
$$\begin{bmatrix}2\\3\end{bmatrix}=2\begin{bmatrix}1\\0\end{bmatrix}+3\begin{bmatrix}0\\1\end{bmatrix}$$
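Independence of a finite set of vectors in R^n can be tested numerically by the rank of the matrix whose columns are those vectors (a sketch, numpy assumed): the columns are independent exactly when the rank equals the number of vectors.

```python
import numpy as np

E = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]])            # columns e1, e2, e3
print(np.linalg.matrix_rank(E))      # 3, equals 3 vectors -> independent

V = np.array([[1, 0, 2],
              [0, 1, 3]])            # columns (1,0), (0,1), (2,3)
print(np.linalg.matrix_rank(V))      # 2, less than 3 vectors -> dependent
```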
For a vector space V ,
v1 , . . . , v p ∈ V
x ∈ span{v1, . . . , vp} = W
Basis : For a vector space V , we want to find a set of vectors
S = {v1, v2, . . . , vn} such that
$$\begin{bmatrix}x\\y\end{bmatrix}=x\begin{bmatrix}1\\0\end{bmatrix}+y\begin{bmatrix}0\\1\end{bmatrix}$$
Suppose β = {v1, v2, . . . , vn} is a basis of a vector space V . For
each x ∈ V , we can write x = c1v1 + c2v2 + · · · + cnvn.
$$[x]_\beta=\begin{bmatrix}c_1\\c_2\\\vdots\\c_n\end{bmatrix}\in\mathbb{R}^n$$
(Figure: the coordinate map sends x ∈ V to [x]β ∈ R^n.)
For example, consider the space V of polynomials of degree smaller than 3. Let
β = {1, t, t²}
be a basis of V . For a polynomial p(t) = d0 + d1t + d2t², we have
$$[\,p(t)\,]_\beta=\begin{bmatrix}d_0\\d_1\\d_2\end{bmatrix}\in\mathbb{R}^3$$
A vector space V may have more than one basis. Any set of vectors
{v1, . . . , vn} in V can be a basis of V if it is linearly independent and
it spans V .
For example, in R2, we can choose β = {(1, 0), (0, 1)} as a basis,
or γ = {(1, 1), (1, −1)} as a basis.
Theorem A vector space V can have different bases, say,
β = {b1, . . . , bn}, γ = {f1, . . . , fm}.
However, they must have the same number of vectors, i.e., m = n.
Proof:
1. Suppose m > n. Consider the coordinates of f1, . . . , fm relative to β. The n × m system
$$\begin{bmatrix}f_{11}&f_{21}&\dots&f_{m1}\\f_{12}&f_{22}&\dots&f_{m2}\\\vdots&&&\vdots\end{bmatrix}\begin{bmatrix}c_1\\\vdots\\c_m\end{bmatrix}=0$$
has more unknowns than equations, so it has a non-trivial solution, contradicting the independence of γ. Hence m ≤ n; by symmetry, n ≤ m, so m = n.
The dimension of V , dim V , is defined as the number of vectors in a basis of V . This definition is well-defined, since all bases have the same number of vectors.
$$M_1=\begin{bmatrix}1&0\\0&0\end{bmatrix},\ M_2=\begin{bmatrix}0&1\\0&0\end{bmatrix},\ M_3=\begin{bmatrix}0&0\\1&0\end{bmatrix},\ M_4=\begin{bmatrix}0&0\\0&1\end{bmatrix}$$
since
$$\begin{bmatrix}a&b\\c&d\end{bmatrix}=a\begin{bmatrix}1&0\\0&0\end{bmatrix}+b\begin{bmatrix}0&1\\0&0\end{bmatrix}+c\begin{bmatrix}0&0\\1&0\end{bmatrix}+d\begin{bmatrix}0&0\\0&1\end{bmatrix}$$
In matrix form, linear independence of v1, . . . , vr says that
[v1 v2 · · · vr ](c1, c2, . . . , cr )T = 0
has only the trivial solution c1 = · · · = cr = 0.
Note that for a basis β of a vector space V , β is a set that
1. contains the maximum number of linearly independent vectors,
2. contains the minimum number of vectors that span V .
For instance, suppose we drop a vector, say v1, from β to obtain β′ = β \ {v1}. Then
v1 ∉ span(β′)
since β is an independent set.
Therefore,
span(β′) ≠ V
and β′ cannot span V .
Summary:
For a vector space V , a basis β = {v1, v2, . . . , vn}
1. is a linearly independent set,
2. spans (generates) V ,
3. has n = dim(V ) vectors.
In summary, for a vector space V of dimension n, a set S = {v1, v2, . . . , vn} of n vectors is a basis if
• S spans V , or
• S is linearly independent.
(When S has exactly n = dim(V ) vectors, either condition implies the other.)
To extend a linearly independent set S to a basis:
1. If span{S} = V , then S is a basis.
2. If span{S} ⊊ V , there is a vector u1 ∈ V , u1 ∉ span{S}. Let S1 = S ∪ {u1}. Note S1 is also an independent set.
3. If span{S1} = V , then S1 is a basis.
4. If span{S1} ⊊ V , there is a vector u2 ∈ V , u2 ∉ span{S1}. Let S2 = S1 ∪ {u2}. Continue in this way; since dim(V ) is finite, the process terminates with a basis.
Theorem If W is a subspace of a finite-dimensional vector space V , then
dim(W ) ≤ dim(V )
Proof :
• Let βW be a basis of W . Since βW ⊂ W ⊆ V and V is finite-dimensional, we have that βW is a finite set, and W is finite-dimensional.
• Since βW can be extended, if necessary, to become a basis βV of V , we have dim(W ) ≤ dim(V ) for βW ⊆ βV .
• If dim(W ) = dim(V ) = n, then βW is an independent set of n vectors in V , and βW becomes a basis for V . So V = W = span(βW ).
Note the above result may not be true if the dimensions of V and W are not finite, dim(V ) = dim(W ) = ∞.
For example, P (a, b) ⊊ C(a, b): the polynomials form a proper subspace of the continuous functions on (a, b), yet both spaces are infinite-dimensional.
• Change of basis in Rn
Let
ε = {e1, . . . , en} = { (1, 0, . . . , 0)T , (0, 1, . . . , 0)T , . . . , (0, 0, . . . , 1)T }
be the standard basis of Rn, and let β = {u1, . . . , un} and γ = {v1, . . . , vn} be two other bases of Rn.
Then, writing
[x]β = (c1, c2, . . . , cn)T , [x]γ = (d1, d2, . . . , dn)T
we have
x = [x]ε = (x1, x2, . . . , xn)T = [u1 · · · un][x]β = [v1 · · · vn][x]γ
so, with Pβ = [u1 · · · un] and Pγ = [v1 · · · vn],
1. x = Pβ [x]β = Pγ [x]γ
2. [x]γ = Pγ−1Pβ [x]β = Pβγ [x]β , where Pβγ = Pγ−1Pβ .
Example Consider a basis β = {u1, u2} of R2, where
u1 = (1, 0)T , u2 = (1, 2)T
If
[x]β = (−2, 3)T
then
x = −2u1 + 3u2 = (−2)(1, 0)T + 3 (1, 2)T = (1, 6)T = [x]ε
Example Consider a basis β = {u1, u2} of R2, where
u1 = (1, 0)T , u2 = (1, 2)T
If
x = [x]ε = (1, 6)T
assume [x]β = (c1, c2)T . Then
x = c1u1 + c2u2 = [1 1; 0 2](c1, c2)T = (1, 6)T
⇒ [x]β = (c1, c2)T = [1 1; 0 2]−1(1, 6)T = [1 −0.5; 0 0.5](1, 6)T = (−2, 3)T
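The 2 × 2 computation above can be reproduced numerically — a sketch using NumPy's linear solver instead of forming the explicit inverse:

```python
import numpy as np

# Columns of P_beta are the basis vectors u1 = (1,0)^T, u2 = (1,2)^T.
P_beta = np.array([[1.0, 1.0],
                   [0.0, 2.0]])
x = np.array([1.0, 6.0])      # coordinates in the standard basis

# Solve P_beta @ c = x for c = [x]_beta.
c = np.linalg.solve(P_beta, x)
print(c)                       # [-2.  3.]
```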
Example Let β = {u1, u2} = { (−9, 1)T , (−5, −1)T } and
γ = {v1, v2} = { (1, −4)T , (3, −5)T } be two bases in R2.
Assume [x]β = (2, 3)T , and let [x]γ = (d1, d2)T . Then
x = [u1 u2](2, 3)T = [v1 v2](d1, d2)T
and
[−9 −5; 1 −1](2, 3)T = [1 3; −4 −5](d1, d2)T ⇒ (d1, d2)T = (24, −19)T
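The same β-to-γ change of coordinates can be checked with a sketch in NumPy, using [x]γ = Pγ−1Pβ [x]β:

```python
import numpy as np

# Basis vectors as columns: beta = {(-9,1), (-5,-1)}, gamma = {(1,-4), (3,-5)}.
P_beta  = np.array([[-9.0, -5.0],
                    [ 1.0, -1.0]])
P_gamma = np.array([[ 1.0,  3.0],
                    [-4.0, -5.0]])

x_beta = np.array([2.0, 3.0])               # [x]_beta

# [x]_gamma = P_gamma^{-1} P_beta [x]_beta, computed via a linear solve.
x_gamma = np.linalg.solve(P_gamma, P_beta @ x_beta)
print(x_gamma)                               # [ 24. -19.]
```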
• Change of basis in a general vector space V
Example In the space of polynomials of degree smaller than 3, take the bases β = {1, t, t2} and γ = {1, 1 + t, t + t2}. For
p(t) = 2 + 3t + 4t2, [p]β = (2, 3, 4)T , [p]γ = (d0, d1, d2)T
writing p(t) = d0 · 1 + d1(1 + t) + d2(t + t2) and comparing coefficients gives
d0 + d1 = 2, d1 + d2 = 3, d2 = 4
⇒ d2 = 4, d1 = −1, d0 = 3
so [p]γ = (3, −1, 4)T .
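Identifying polynomials with coefficient vectors, the same computation is a linear solve — a sketch where the columns of P are the γ-basis polynomials 1, 1 + t, t + t² written in β-coordinates:

```python
import numpy as np

# gamma-basis polynomials in beta = {1, t, t^2} coordinates, as columns:
# 1 -> (1,0,0), 1+t -> (1,1,0), t+t^2 -> (0,1,1)
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

p_beta = np.array([2.0, 3.0, 4.0])     # p(t) = 2 + 3t + 4t^2

p_gamma = np.linalg.solve(P, p_beta)   # [p]_gamma
print(p_gamma)                          # [ 3. -1.  4.]
```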
Consider an m × n matrix A,
A = [c1 c2 · · · cn] = [r1T ; r2T ; . . . ; rmT ], ck ∈ Rm, rk ∈ Rn
Definition
• The column space of A: Col(A) = span{c1, c2, . . . , cn} ⊆ Rm
• The row space of A: Row(A) = span{r1, r2, . . . , rm} ⊆ Rn
• The null space of A is
Nul(A) = {x | Ax = 0} ⊆ Rn
(The matrix A maps Rn into Rm; Nul(A) is sent to 0, and the image of Rn is Col(A).)
For A : m × n and x = (x1, x2, . . . , xn)T ,
Col(A) = {Ax | x ∈ Rn}
= {x1c1 + x2c2 + · · · + xncn | xk ∈ R}
= span(c1, c2, · · · , cn)
Nul(A) = {x | Ax = 0}
Theorem Elementary row operations do not change the null space of a matrix.
Proof
For an elementary matrix E,
Nul(A) = {x | Ax = 0} = {x | EAx = 0}
since each elementary row operation is reversible, i.e., each elementary matrix is invertible.
In general, for an arbitrary matrix B we only have
{x | Ax = 0} ⊆ {x | BAx = 0}
or
Nul(A) ⊆ Nul(BA)
Theorem Elementary row operations do not change the row space of a matrix.
Proof sketch: write A = [r1T ; . . . ; rmT ] with rows rk ∈ Rn. Each row of EA is a linear combination of the rows of A, so Row(EA) ⊆ Row(A); since E is invertible, A = E−1(EA) gives the reverse inclusion Row(A) ⊆ Row(EA).
Example
A = [1 −3 4 −2 5 4; 2 −6 9 −1 8 2; −1 3 −4 2 −5 −4; 2 −6 9 −1 9 7] ∼ R = [1 −3 4 −2 5 4; 0 0 1 3 −2 −6; 0 0 0 0 1 5; 0 0 0 0 0 0]
Row(A) = Row(R)
= span{(1, −3, 4, −2, 5, 4), (0, 0, 1, 3, −2, −6), (0, 0, 0, 0, 1, 5)}
Elementary row operations also preserve the dependence relations among the columns: if A = [a1 a2 · · · an] ∼ R = [b1 b2 · · · bn], then
c1a1 + c2a2 + · · · + cnan = 0
⇔ c1b1 + c2b2 + · · · + cnbn = 0
since both conditions say c ∈ Nul(A) = Nul(R).
For example,
A = [−3 6 −1 1 −7; 2 −4 5 8 −4; 1 −2 2 3 −1] ∼ R = [1 −2 0 −1 3; 0 0 1 2 −2; 0 0 0 0 0]
Example For
A = [1 −3 4 −2 5 4; 2 −6 9 −1 8 2; −1 3 −4 2 −5 −4; 2 −6 9 −1 9 7] ∼ R = [1 −3 4 −2 5 4; 0 0 1 3 −2 −6; 0 0 0 0 1 5; 0 0 0 0 0 0]
we have
Row(A) = span{(1, −3, 4, −2, 5, 4), (0, 0, 1, 3, −2, −6), (0, 0, 0, 0, 1, 5)}
and, taking the pivot columns (1, 3 and 5) of A,
Col(A) = span{ (1, 2, −1, 2)T , (4, 9, −4, 9)T , (5, 8, −5, 9)T }
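As a computational sketch (not part of the text), SymPy's `rref` finds the pivot positions; the nonzero rows of the reduced form give a (different but equally valid) basis of Row(A), and the corresponding columns of A give a basis of Col(A):

```python
from sympy import Matrix

A = Matrix([[ 1, -3,  4, -2,  5,  4],
            [ 2, -6,  9, -1,  8,  2],
            [-1,  3, -4,  2, -5, -4],
            [ 2, -6,  9, -1,  9,  7]])

R, pivots = A.rref()      # reduced row echelon form and pivot column indices
print(pivots)              # (0, 2, 4) : columns 1, 3, 5 in 1-based counting

row_basis = [R.row(i) for i in range(len(pivots))]   # nonzero rows of R
col_basis = [A.col(j) for j in pivots]               # pivot columns of A
```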
The above examples illustrate how to find the bases of Col(A) and Row(A).
Example Find bases for Nul(A), Row(A) and Col(A) of the matrix
A = [−3 6 −1 1 −7; 2 −4 5 8 −4; 1 −2 2 3 −1]
Row reducing the augmented matrix,
[ A 0 ] ∼ [1 −2 0 −1 3 0; 0 0 1 2 −2 0; 0 0 0 0 0 0]
i.e.
x1 − 2x2 − x4 + 3x5 = 0
x3 + 2x4 − 2x5 = 0
0 = 0
Solving for the basic variables,
x1 = 2x2 + x4 − 3x5, x3 = −2x4 + 2x5
so
(x1, x2, x3, x4, x5)T = x2(2, 1, 0, 0, 0)T + x4(1, 0, −2, 1, 0)T + x5(−3, 0, 2, 0, 1)T
and
Nul(A) = {x | Ax = 0}
= { t1(2, 1, 0, 0, 0)T + t2(1, 0, −2, 1, 0)T + t3(−3, 0, 2, 0, 1)T | t1, t2, t3 ∈ R }
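A quick numerical sanity check (a sketch, not from the text) confirms that each spanning vector read off from the free variables is indeed in Nul(A):

```python
import numpy as np

A = np.array([[-3,  6, -1,  1, -7],
              [ 2, -4,  5,  8, -4],
              [ 1, -2,  2,  3, -1]])

# Spanning vectors of Nul(A), one per free variable x2, x4, x5:
v1 = np.array([ 2, 1,  0, 0, 0])
v2 = np.array([ 1, 0, -2, 1, 0])
v3 = np.array([-3, 0,  2, 0, 1])

for v in (v1, v2, v3):
    print(A @ v)          # each product is the zero vector [0 0 0]
```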
Note that the vectors
(2, 1, 0, 0, 0)T , (1, 0, −2, 1, 0)T , (−3, 0, 2, 0, 1)T
are linearly independent (look at entries 2, 4, 5), and hence form a basis of Nul(A).
From
A = [−3 6 −1 1 −7; 2 −4 5 8 −4; 1 −2 2 3 −1] ∼ R = [1 −2 0 −1 3; 0 0 1 2 −2; 0 0 0 0 0]
the pivot columns are columns 1 and 3, so a basis of Col(A) is
{a1, a3} = { (−3, 2, 1)T , (−1, 5, 2)T }
Theorem For any matrix A, the two spaces Row(A) and Col(A) have the same dimension.
Proof
Let A ∼ R, a row echelon form of A. Since Row(A) = Row(R), we have
dim(Row(A)) = dim(Row(R))
and since row operations preserve the dependence relations among columns,
dim(Col(A)) = dim(Col(R))
However,
dim(Col(R)) = dim(Row(R)) = number of pivots
For example,
A ∼ R = [1 −3 4 −2 5 4; 0 0 1 3 −2 −6; 0 0 0 0 1 5; 0 0 0 0 0 0]
has three pivots, so dim(Row(A)) = dim(Col(A)) = 3.
Definition For a matrix A, we define its rank as
rank(A) = dim(Col(A)) = dim(Row(A))
For an m × n matrix A, we have
rank(A) = dim(Col(A)) ≤ min(m, n)
For example, the 4 × 6 matrix
R = [1 −3 4 −2 5 4; 0 0 1 3 −2 −6; 0 0 0 0 1 5; 0 0 0 0 0 0]
has rank 3 ≤ min(4, 6).
Theorem (Dimension Theorem) For an m × n matrix A,
rank(A) + nullity(A) = n
where nullity(A) = dim(Nul(A)).
Proof
Let a row echelon form of A be R, and consider Ax = 0. Then
rank(A) = dim(Col(A)) = (# of pivots in R) = (# of basic variables)
nullity(A) = dim(Nul(A)) = (# of free variables)
and every one of the n variables is either basic or free. For example,
A ∼ R = [1 −2 0 −1 3; 0 0 1 2 −2; 0 0 0 0 0]
has 2 basic and 3 free variables: rank 2, nullity 3.
Example As in the previous examples,
A = [1 −3 4 −2 5 4; 2 −6 9 −1 8 2; −1 3 −4 2 −5 −4; 2 −6 9 −1 9 7] ∼ R = [1 −3 4 −2 5 4; 0 0 1 3 −2 −6; 0 0 0 0 1 5; 0 0 0 0 0 0]
so rank(A) = 3 and nullity(A) = 6 − 3 = 3, while
A′ = [−3 6 −1 1 −7; 2 −4 5 8 −4; 1 −2 2 3 −1] ∼ [1 −2 0 −1 3; 0 0 1 2 −2; 0 0 0 0 0] = R′
so rank(A′) = 2 and nullity(A′) = 5 − 2 = 3.
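The rank–nullity count for the first matrix can be verified with a quick NumPy sketch:

```python
import numpy as np

A = np.array([[ 1, -3,  4, -2,  5,  4],
              [ 2, -6,  9, -1,  8,  2],
              [-1,  3, -4,  2, -5, -4],
              [ 2, -6,  9, -1,  9,  7]])

r = np.linalg.matrix_rank(A)
n = A.shape[1]
print(r, n - r)      # rank 3, nullity 3: rank + nullity = n = 6
```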
For the linear system Ax = b (A : m × n), we have
1. Ax = b is consistent ⇔ b ∈ Col(A)
2. the solution set of Ax = 0 is Nul(A) = {x | x1c1 + · · · + xncn = 0}
Example For Ax = b with A : m × n, row reducing the augmented matrix [A b] may produce a pattern such as
[1 0 2b2 − b1; 0 1 b2 − b1; 0 0 b3 − 3b2 + 2b1; 0 0 b4 − 4b2 + 3b1; 0 0 b5 − 5b2 + 4b1]
so the system is consistent only for those b with
b3 − 3b2 + 2b1 = 0, b4 − 4b2 + 3b1 = 0, b5 − 5b2 + 4b1 = 0
In summary, for A : m × n,
rank(A) + nullity(A) = n
rank(A) ≤ min(m, n)
Overdetermined (m > n): the system Ax = b has more equations than unknowns. Since dim(Col(A)) = rank(A) ≤ n < m, Col(A) is a proper subspace of Rm, so Ax = b is inconsistent for at least one b ∈ Rm.
Underdetermined (m < n): the system Ax = b has fewer equations than unknowns. Since rank(A) ≤ m < n, we have nullity(A) = n − rank(A) > 0, so whenever Ax = b is consistent it has infinitely many solutions.
Let Ax = b be a consistent linear system of m equations in n unknowns (A : m × n).
• If A has rank r, then dim( Nul(A) ) = n − r = k, so the general solution of Ax = b contains k = n − r free parameters.
For the above results, recall the previous theorem:
Theorem Suppose Ax = b has a solution p (Ap = b). Then the solution set of Ax = b can be expressed as
{p + vh | vh ∈ Nul(A)}
Theorem If A is an m × n matrix, then the following are equivalent.
1. Ax = 0 has only the trivial solution. (Nul(A) = {0})
2. The column vectors of A are linearly independent.
3. Ax = b has at most one solution (none or one) for every b ∈ Rm.
(Write A = [c1 c2 . . . cn], ck ∈ Rm, and x = (x1, x2, . . . , xn)T ; then Ax = 0 reads x1c1 + x2c2 + · · · + xncn = 0, and Ax = b reads x1c1 + x2c2 + · · · + xncn = b.)
Theorem (Equivalent Statements of Matrix Inversion)
If A is an n × n matrix, then the following statements are equivalent.
a. A is invertible.
b. The column vectors of A are linearly independent.
c. The row vectors of A are linearly independent.
d. The column vectors of A span Rn.
e. The row vectors of A span Rn.
f. A has rank n.
g. A has nullity 0.
Proof
Recall that for a square matrix A of size n,
A is invertible
⇔ Ax = 0 has only the trivial solution x = 0.
⇔ The column vectors of A are linearly independent. (b)
(the above is by (1) and (2) of the last Theorem)
⇔ Col(A) = Rn. (d)
⇔ rank(A) = n. (f)
Furthermore, by the Dimension Theorem, since A : n × n, we have
rank(A) = n ⇔ nullity(A) = 0 (g)
and since rank(AT ) = rank(A) = n,
⇔ the row vectors of A are linearly independent. (c)
⇔ the row vectors of A span Rn. (e)
For example,
A ∼ [1 −5 7 3 2; 0 0 1 −1 4; 0 0 0 0 0] (row-echelon form)
has rank(A) = 2, and rank(A) + nullity(A) = n gives nullity(A) = 5 − 2 = 3.
Theorem rank(AB) ≤ min( rank(A), rank(B) ).
Proof
Each column of AB is a linear combination of the columns of A, so Col(AB) ⊆ Col(A) and rank(AB) ≤ rank(A).
For the other bound, note that
Nul(AB) ⊇ Nul(B), since Bx = 0 ⇒ ABx = 0 (A : m × p, B : p × n)
⇒ nullity(AB) ≥ nullity(B)
⇒ rank(AB) ≤ rank(B)
since
rank(AB) + nullity(AB) = rank(B) + nullity(B) = n
Note in general, rank(AB) ≠ rank(BA).
For example, with
A = [1 1; 0 0], B = [1 0; −1 0]
we have
AB = [1 1; 0 0][1 0; −1 0] = [0 0; 0 0]
while
BA = [1 0; −1 0][1 1; 0 0] = [1 1; −1 −1]
so rank(AB) = 0 and rank(BA) = 1.
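The asymmetry of this example is easy to confirm numerically (a sketch):

```python
import numpy as np

A = np.array([[1, 1],
              [0, 0]])
B = np.array([[ 1, 0],
              [-1, 0]])

print(np.linalg.matrix_rank(A @ B))   # 0: AB is the zero matrix
print(np.linalg.matrix_rank(B @ A))   # 1
```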
• Linear Transformation
Consider a transformation T : V ↦ W , v ↦ T (v).
V : domain, W : codomain
T (V ): range, T (V ) = {T (v) | v ∈ V } ⊆ W
Definition
A transformation T : V ↦ W is called a linear transformation if for all vectors u and v in V and all scalars c, we have
1. T (u + v) = T (u) + T (v)
2. T (cu) = cT (u)
Note that a linear transformation sends 0V to 0W :
T (0V ) = T (0 · 0V ) = 0 · T (0V ) = 0W
For a linear transformation T : V ↦ W , we have
T (c1v1 + c2v2 + · · · + cnvn)
= c1T (v1) + c2T (v2) + · · · + cnT (vn)
Example Let T = I, the identity operator. Then
T (v1 + v2) = v1 + v2 = T (v1) + T (v2)
T (cv) = cv = cT (v)
so I is a linear transformation.
Exercise
Consider the mapping T : V ↦ W such that T (v) = w0 for every v ∈ V , where w0 is a constant vector in W . Is T a linear transformation?
Example
Consider the differentiation operation D on the space of polynomials,
V = { c0 + c1t + c2t2 + · · · + cntn | c0, . . . , cn ∈ R }
D(p(t)) = p′(t). D is a linear transformation, since (f + g)′ = f ′ + g′ and (cf )′ = cf ′.
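In coordinates relative to the basis {1, t, t², t³}, differentiation becomes multiplication by a matrix — a sketch for n = 3, where the matrix D below maps the coefficient vector of p to that of p′:

```python
import numpy as np

# Differentiation on polynomials of degree <= 3, in the basis {1, t, t^2, t^3}:
# d/dt maps t^k to k * t^(k-1), giving a matrix with 1, 2, 3 on the superdiagonal.
D = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3],
              [0, 0, 0, 0]])

p = np.array([2, 3, 4, 0])     # p(t) = 2 + 3t + 4t^2
print(D @ p)                    # [3 8 0 0] : p'(t) = 3 + 8t
```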
Exercise
Define a transformation T from the space of n × n matrices Mn(R) to R, as
T (A) = det(A)
where A ∈ Mn(R). Determine if T is a linear transformation.
For a linear transformation T : Rn ↦ Rm, we can express T (x) as
T (x) = Ax
where
A = [T (e1) T (e2) · · · T (en)]
and
e1 = (1, 0, . . . , 0)T , e2 = (0, 1, . . . , 0)T , . . . , en = (0, 0, . . . , 1)T
Since any x ∈ Rn can be expressed as
x = (x1, x2, . . . , xn)T = x1e1 + x2e2 + · · · + xnen
we have
T (x) = x1T (e1) + x2T (e2) + · · · + xnT (en)
= [T (e1) T (e2) · · · T (en)](x1, x2, . . . , xn)T , T (ek ) ∈ Rm
= Ax
Remarks
1. Every linear transformation T from Rn to Rm corresponds to a matrix A such that T (x) = Ax, with
A = [T (e1) T (e2) · · · T (en)]
Examples
1. Reflection operator in R3
– Reflection about the xy-plane: (x1, y1, z1) ↦ (x1, y1, −z1),
[1 0 0; 0 1 0; 0 0 −1](x, y, z)T = (x, y, −z)T
Note
T ((1, 0, 0)T ) = (1, 0, 0)T , T ((0, 1, 0)T ) = (0, 1, 0)T , T ((0, 0, 1)T ) = (0, 0, −1)T
2. Projection operator on R3
– Orthogonal projection onto the xy-plane: (x1, y1, z1) ↦ (x1, y1, 0),
[1 0 0; 0 1 0; 0 0 0](x, y, z)T = (x, y, 0)T
Note
T ((1, 0, 0)T ) = (1, 0, 0)T , T ((0, 1, 0)T ) = (0, 1, 0)T , T ((0, 0, 1)T ) = (0, 0, 0)T
3. Rotation operator
Define a linear operator on R2 that rotates a vector x counter-clockwise through an angle θ:
T (x) = [cos θ − sin θ; sin θ cos θ] x
Note
T ((1, 0)T ) = (cos θ, sin θ)T , T ((0, 1)T ) = (− sin θ, cos θ)T
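The rotation matrix can be built from the images of the standard basis vectors, as above — a NumPy sketch:

```python
import numpy as np

def rotation(theta):
    """Standard matrix of the counter-clockwise rotation by theta in R^2.

    Its columns are the images T(e1) = (cos t, sin t) and T(e2) = (-sin t, cos t).
    """
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

R = rotation(np.pi / 2)             # rotate by 90 degrees
print(R @ np.array([1.0, 0.0]))     # approximately [0, 1]
```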
Definition Let T : V ↦ W be a linear transformation.
• The kernel (or null space) of T is defined as
Ker (T ) = N (T ) = {x | T (x) = 0} ⊆ V
• The range of T is R(T ) = T (V ) = {T (x) | x ∈ V } ⊆ W
• Nullity (T ) = dim (N (T ))
• Rank (T ) = dim (R(T ))
(T maps V into W ; N (T ) is sent to 0, and the image of V is R(T ).)
Recall the analogous spaces for a matrix A : m × n,
Nul(A) = {x | Ax = 0} ⊆ Rn
Col(A) = {Ax | x ∈ Rn} ⊆ Rm
Definition A linear transformation T : V ↦ W is said to be one-to-one if T maps distinct vectors in V to distinct vectors in W . That is,
x1 ≠ x2 ⇒ T (x1) ≠ T (x2)
or equivalently
T (x1) = T (x2) ⇒ x1 = x2
Theorem A linear transformation T is one-to-one if and only if N (T ) = {0}.
(⇒) If T is one-to-one and x ∈ N (T ), then T (x) = 0 = T (0), so x = 0.
(⇐)
Suppose N (T ) = {0} but T is not one-to-one; then there exist x1 and x2 such that x1 ≠ x2 but T (x1) = T (x2). Then T (x1 − x2) = 0, and x1 − x2 ∈ N (T ), but x1 − x2 ≠ 0, which indicates N (T ) ≠ {0}, a contradiction.
As before,
N (T ) = {x | T (x) = 0}
R(T ) = {T (x) | x ∈ V }
Theorem (Dimension Theorem for linear transformations) If T : V ↦ W is a linear transformation and dim(V ) = n, then
dim(R(T )) + dim(N (T )) = n
Proof:
Let S = {v1, . . . , vp} be a basis of N (T ), p ≤ n, so that
T (v1) = · · · = T (vp) = 0
Extend S to a basis {v1, . . . , vp, vp+1, . . . , vn} of V . One can show that {T (vp+1), . . . , T (vn)} is a basis of R(T ), and hence
dim(R(T )) + dim(N (T )) = (n − p) + p = n
If a linear transformation T : V ↦ W is one-to-one and onto, then for each y ∈ W there is exactly one x ∈ V with
T (x) = y
We can define an inverse of T , T −1 : W ↦ V , by
T −1(y) = x
In this case we say V and W are isomorphic, and T is an isomorphism. Note that
V is isomorphic to W ⇔ W is isomorphic to V
Moreover, if T1 : V ↦ W and T2 : W ↦ Z are isomorphisms, then so is the composition T2 ◦ T1 : V ↦ Z, with inverse T1−1 ◦ T2−1.
In particular, every n-dimensional vector space V is isomorphic to Rn, via the coordinate map u ↦ [u]β for a basis β of V .
Example V = {c0 + c1x + · · · + cnxn | ck ∈ R} is isomorphic to Rn+1, via
c0 + c1x + · · · + cnxn ↦ (c0, c1, . . . , cn)T
Example The space of 2 × 2 matrices is isomorphic to R4, via [a b; c d] ↦ (a, b, c, d)T : with
M1 = [1 0; 0 0], M2 = [0 1; 0 0], M3 = [0 0; 1 0], M4 = [0 0; 0 1]
we have
[a b; c d] = aM1 + bM2 + cM3 + dM4
Proof : We have proved (a) ⇔ (b) ⇔ (c). It is clear that (b) ⇔ (d). In addition,
(a) ⇔ nullity(A) = 0 ⇔ N (T ) = {0} ⇔ (e)
Recall that when a square matrix A is not invertible, its reduced row echelon form looks like
[1 0 2; 0 1 1; 0 0 0] or [1 2 0; 0 0 1; 0 0 0]
For example, with R = [1 0 2; 0 1 1; 0 0 0],
R(2, 1, 0)T = (2, 1, 0)T = R(0, 0, 1)T
so the map x ↦ Rx is not one-to-one.
• Affine Transformation
Definition An affine transformation from Rn to Rm is a mapping of the form
S(u) = T (u) + f0, u ∈ Rn
where T is a linear transformation from Rn to Rm and f0 is a constant vector in Rm.
Example
The mapping
S(u) = [0 1; −1 0] u + (1, 1)T
is an affine transformation on R2.
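The example above is a rotation followed by a translation — a minimal sketch:

```python
import numpy as np

# Affine map S(u) = T(u) + f0 with the matrix and constant vector
# from the example: T(u) = [[0, 1], [-1, 0]] u, f0 = (1, 1)^T.
T  = np.array([[ 0, 1],
               [-1, 0]])
f0 = np.array([1, 1])

def S(u):
    return T @ u + f0

print(S(np.array([2, 5])))    # [6 -1]
```

Note that S is not a linear transformation when f0 ≠ 0, since S(0) = f0 ≠ 0.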
Appendix
• The development of number systems
1. Natural numbers N (1, 2, 3, . . .): the equation x + 5 = 3 has no solution in N.
2. Integers Z: the equation 2x = 1 has no solution in Z.
3. Rational numbers Q: the equation x2 = 2 has no solution in Q (√2 is irrational).
4. Real numbers R: the equation x2 = −1 has no solution in R.
5. Complex numbers C ( a + ib, i = √−1 ):
The equation
xn + an−1xn−1 + · · · + a1x + a0 = 0
has n roots in C. (Fundamental theorem of algebra)
• Fields
A field is a set with addition and multiplication satisfying, among other axioms, associativity
(a + b) + c = a + (b + c), (a · b) · c = a · (b · c)
and distributivity
a · (b + c) = a · b + a · c
Example R, C, and Z2 = {0, 1} are examples of fields.
• About finding a basis for Col(A)
Consider again
A = [−3 6 −1 1 −7; 2 −4 5 8 −4; 1 −2 2 3 −1] ∼ R = [1 −2 0 −1 3; 0 0 1 2 −2; 0 0 0 0 0]
Write A = [a1 a2 a3 a4 a5], so that Ax = 0 reads
Ax = x1a1 + x2a2 + x3a3 + x4a4 + x5a5 = 0 (11)
The solutions of Ax = 0 are
(x1, x2, x3, x4, x5)T = x2(2, 1, 0, 0, 0)T + x4(1, 0, −2, 1, 0)T + x5(−3, 0, 2, 0, 1)T , x2, x4, x5 ∈ R (12)
where x1 and x3 are basic variables, and x2, x4, x5 are free variables.
1. From R, the pivot columns of A are columns 1 and 3.
2. If we choose x2 = 1, x4 = x5 = 0, then (11) reduces to
x1a1 + a2 + x3a3 = 0
and by (12), x1 = 2 and x3 = 0, so a2 = −2a1.
3. If we choose x4 = 1, x2 = x5 = 0, then (11) reduces to
x1a1 + x3a3 + a4 = 0
and by (12), x1 = 1 and x3 = −2, so a4 = −a1 + 2a3.
4. If we choose x5 = 1, x2 = x4 = 0, then (11) reduces to
x1a1 + x3a3 + a5 = 0
and by (12), x1 = −3 and x3 = 2, so a5 = 3a1 − 2a3.
Hence every non-pivot column of A is a linear combination of the pivot columns a1 and a3, and {a1, a3} is a basis of Col(A).