
Lecture Notes on Linear Algebra

S. K. Panda

IIT Kharagpur

October 25, 2019

Contents
1 Vector Space
  1.1 Definition and Examples
  1.2 Subspace
  1.3 Linear Combination and Linear Span
  1.4 Linear Dependency and Independency
  1.5 Basis and Dimension

2 Inner Product Space
  2.1 Definition and Example
  2.2 Existence of Inner Product
  2.3 Orthogonal and Orthonormal
  2.4 Orthogonal Complement

3 Linear Transformation
  3.1 Definition, Examples and Properties
  3.2 Matrix Representation
  3.3 Matrix Representation Under Change of Basis

4 Eigenvalues and Eigenvectors
  4.1 Recall Eigenvalues and Eigenvectors of Matrices
  4.2 Eigenvalues (Characteristic values) and Eigenvectors (Characteristic vectors) of Operators
  4.3 Annihilating Polynomials
  4.4 Invariant Subspaces
  4.5 Types of Operators
  4.6 Jordan Canonical Form

5 Definiteness of Hermitian Matrices

1 Vector Space
1.1 Definition and Examples
Definition 1.1. A set V over a field F is a vector space if x + y is defined in V for all x, y ∈ V
and αx is defined in V for all x ∈ V, α ∈ F such that

1) + is commutative and associative.
2) Each x ∈ V has an inverse w.r.t. +.
3) An identity element 0 exists in V for +.
4) 1x = x holds for all x ∈ V, where 1 is the multiplicative identity of F.
5) (αβ)x = α(βx) and (α + β)x = αx + βx hold for all α, β ∈ F, x ∈ V.
6) α(x + y) = αx + αy holds for all α ∈ F, x, y ∈ V.

Example 1.1. The following examples must be verified by the reader.

1. R3 is a vector space over R. Here (x + αy)i := xi + αyi .

2. Mm,n := {Am×n : aij ∈ C} is a vector space over C. Here


[A + αB]ij := aij + αbij .

3. RS := {functions from S to R} is a vector space over R. Here (f + αg)(s) := f (s) + αg(s).

4. The set of all real sequences (being RN ) is a vector space over R.

5. The set of all bounded real sequences is a vector space over R.

6. The set of all convergent real sequences is a vector space over R.

7. {(an ) | an ∈ R, an → 0} is a vector space over R.

8. The set of all eventually 0 sequences is a vector space over R. We call (xn ) eventually 0 if
∃k s.t. xn = 0 for all n ≥ k.

9. C((a, b), R) := {f : (a, b) → R | f is continuous} is a vector space over R.

10. C2((a, b), R) := {f : (a, b) → R | f'' is continuous} is a vector space over R.

11. C ∞ ((a, b), R) := {f : (a, b) → R | f infinitely differentiable} is a vector space.

12. {f : (a, b) → R | f'' + 3f' + 5f = 0} is a vector space over R.

13. R[x]= {p(x) | p(x) is a real polynomial in x} is a vector space over R.

14. R[x; 5]= {p(x) ∈ R[x] | degree of p(x) ≤ 5} is a vector space over R.

15. The set {0} is the smallest vector space over any field.

16. {An×n | aij ∈ R, a11 = 0} is a vector space over R.

17. {An×n | aij ∈ R, A upper triangular} is a vector space over R.

18. {A ∈ Mn×n (R) | A is symmetric} is a vector space over R. What about the skew-symmetric
ones? How about A^t = 2A?

Remark 1.1. Any field F is a vector space over the same field F.

Remark 1.2. The field F matters when deciding whether a non-empty set is a vector space under
a given addition and scalar multiplication. The following is an example.

Example 1.2. We have already seen that R is a vector space over R. But R is not a vector space
over C.

1.2 Subspace
Definition 1.2. Let (V, +, ·, F) be a vector space and let W ⊆ V. Then W is called a subspace of
V if W is a vector space over the same field F under the same operations: the restriction of + to
W × W and the restriction of · to F × W.

Theorem 1.1. Let V be a vector space over F and let W ⊆ V be non-empty. Then W is a subspace of V if and only
if αx + βy ∈ W for all x, y ∈ W and all α, β ∈ F.

Proof. Left to the reader.

Example 1.3. The following examples must be verified by the reader.

1. {0} and V are subspaces of any vector space V, called the trivial subspaces.

2. The vector space R over R does not have a nontrivial subspace.


Take a subspace W ≠ {0}. Then ∃a ∈ W s.t. a ≠ 0. So αa ∈ W for each α ∈ R. So W = R.

3. The vector space R over Q has many nontrivial subspaces. Two of them are
Q[√2] = {a + b√2 : a, b ∈ Q} and Q[√3] = {a + b√3 : a, b ∈ Q}.

4. Let a ∈ R2, a ≠ 0. Then the set {x ∈ R2 | a^t x = 0} is a nontrivial subspace of R2. Geometrically,


this set represents a straight line passing through the origin. In fact, these are the only
nontrivial subspaces of R2 .
Theorem 1.2. Let {Ut | Ut is a subspace of V} be a nonempty class of subspaces. Then ⋂t Ut is a subspace of V.

Proof. Let x, y ∈ ⋂t Ut and let α, β ∈ F. Then x, y ∈ Ut for each t. Since Ut is a subspace of V
for each t, αx + βy ∈ Ut for each t. Therefore αx + βy ∈ ⋂t Ut . Hence ⋂t Ut is a subspace of V.

Remark 1.3. The union of two subspaces of a vector space need not be a subspace of that vector
space. The following is an example.

Example 1.4. Let V = R2. Let U = {x ∈ R2 | x2 = 0} and let W = {x ∈ R2 | x1 = 0}. We
can easily check that both are subspaces of R2. But U ∪ W = {x ∈ R2 | x1 x2 = 0} is not a
subspace.

Exercise 1.1. Give two subspaces U, W of R3 for each of the following: a) U ∪ W is a subspace of
R3. b) U ∪ W is not a subspace of R3.

The following theorem supplies a necessary and sufficient condition for the union of two sub-
spaces to again be a subspace.
Theorem 1.3. Let U, W be two subspaces of V. Then U ∪ W is a subspace of V if and only if
either U ⊆ W or W ⊆ U.
Proof. First we assume that U ∪ W is a subspace of V. We show that either U ⊆ W or W ⊆ U. Suppose
that U ⊄ W and W ⊄ U. Then there exist two vectors x and y such that x ∈ U but not in W
and y ∈ W but not in U. Therefore x, y ∈ U ∪ W, so x + y ∈ U ∪ W. This implies that either
x + y ∈ U or x + y ∈ W. If x + y ∈ U, then (x + y) − x = y ∈ U, which is not possible. If x + y ∈ W, then
(x + y) − y = x ∈ W, which is not possible. Therefore either U ⊆ W or W ⊆ U.
Conversely, suppose U ⊆ W or W ⊆ U. Then either U ∪ W = W or U ∪ W = U. Hence U ∪ W is a subspace
of V.
Remark 1.4. If U is a subspace of W and W is a subspace of V, then U is a subspace of V.
Definition 1.3. If U, W are subspaces of V, then U + W := {u + w | u ∈ U, w ∈ W} and this is
called the sum of U and W.
Theorem 1.4. If U, W are subspaces of V, then U + W is the smallest subspace of V which
contains both U and W.
Proof. First we show that U + W is a subspace of V. Let x, y ∈ U + W and α, β ∈ F. Then x = x1 + x2
for some x1 ∈ U and x2 ∈ W, and y = y1 + y2 for some y1 ∈ U and y2 ∈ W. Take αx + βy =
α(x1 + x2 ) + β(y1 + y2 ) = (αx1 + βy1 ) + (αx2 + βy2 ). Since U and W are subspaces of V,
αx1 + βy1 ∈ U and αx2 + βy2 ∈ W. Therefore αx + βy ∈ U + W. Hence U + W is a subspace of V.
Now we show that U, W ⊆ U + W. Let x ∈ U. Then x = x + 0 with 0 ∈ W, so x ∈ U + W. Hence
U ⊆ U + W. Similarly we can show that W ⊆ U + W.
Now we show that U + W is the smallest subspace of V which contains both U and W. Suppose
that there is a subspace Z containing U and W such that Z ⊊ U + W. Then there exists a vector
x ∈ U + W which is not in Z. Since x ∈ U + W, x = x1 + x2 where x1 ∈ U and x2 ∈ W. Notice that
x1 , x2 ∈ Z. Therefore x = x1 + x2 ∈ Z. This is a contradiction. Hence U + W is the smallest
subspace of V which contains both U and W.
Definition 1.4. When U ∩ W = {0}, the sum U + W is called the internal direct sum of U and W. Notation:
U ⊕ W.
Remark 1.5. If V = U ⊕ W, then W is called complement of U.
Theorem 1.5. Let U, W be two subspaces of V. Then V = U ⊕ W iff for each v ∈ V there exist
a unique u ∈ U and a unique w ∈ W such that v = u + w.
Proof. First we assume that V = U ⊕ W, that is, V = U + W and U ∩ W = {0}. Let x ∈ V. Then there
exist x1 ∈ U and x2 ∈ W such that x = x1 + x2 . Now we show that x1 and x2 are unique. Suppose
there are vectors y1 (≠ x1 ) ∈ U and y2 (≠ x2 ) ∈ W such that x = y1 + y2 . Then x1 + x2 = y1 + y2 ,
which implies x1 − y1 = y2 − x2 . Therefore x1 − y1 , y2 − x2 ∈ U ∩ W. This is only possible when
x1 − y1 = y2 − x2 = 0. Then x1 = y1 and x2 = y2 .
Conversely, suppose that for each v ∈ V there exist a unique u ∈ U and a unique w ∈ W such that v = u + w.
Then V = U + W. If we show that U ∩ W = {0}, then we are done. Let x ∈ U ∩ W. Then x = x + 0 where x ∈ U
and 0 ∈ W, and x = 0 + x where 0 ∈ U and x ∈ W. By uniqueness this is possible only when x = 0.
Then U ∩ W = {0}. Hence V = U ⊕ W.

Definition 1.5. Let V, W be vector spaces over F. V × W is the Cartesian product of V and W.
Addition and scalar multiplication on V × W are defined by

(v1 , w1 ) + (v2 , w2 ) := (v1 + v2 , w1 + w2 )

and

α(v, w) := (αv, αw), α ∈ F.

Exercise 1.2. Show that V × W is a vector space over F.

1.3 Linear Combination and Linear Span


Definition 1.6. (Linear Combination) Let xi ∈ V, αi ∈ F, i = 1, · · · , k. We call Σ_{i=1}^{k} αi xi a linear
combination of x1 , · · · , xk .

Theorem 1.6. The set {Σ_{i=1}^{k} αi xi | αi ∈ F} is a subspace of V.

Proof. Let x, y ∈ {Σ_{i=1}^{k} αi xi | αi ∈ F} and α, β ∈ F. Therefore x = Σ_{i=1}^{k} αi xi with αi ∈ F for i = 1, . . . , k
and y = Σ_{i=1}^{k} βi xi with βi ∈ F for i = 1, . . . , k. Then

αx + βy = α Σ_{i=1}^{k} αi xi + β Σ_{i=1}^{k} βi xi = Σ_{i=1}^{k} (ααi + ββi )xi .

This implies αx + βy ∈ {Σ_{i=1}^{k} αi xi | αi ∈ F}. Hence {Σ_{i=1}^{k} αi xi | αi ∈ F} is a subspace of V.
Definition 1.7. (Linear Span) Let S ⊆ V. By the span of S we mean {Σ_{i=1}^{m} αi xi | αi ∈ F, xi ∈ S, m ∈ N}. The
span of S is denoted by ls(S).

Theorem 1.7. Let S ⊆ V. Then ls(S) is a subspace of V.

Proof. The proof is similar to the proof of Theorem 1.6. ls(φ) = {0}.

Theorem 1.8. Let V be a vector space and S ⊆ V. Let L = {U | U is a subspace of V, S ⊆ U}.
Then ls(S) = ⋂_{U∈L} U.

1.4 Linear Dependency and Independency
Definition 1.8. Let x1 , . . . , xk ∈ V. If some linear combination of the xi with coefficients not all zero equals 0, then we
say x1 , . . . , xk are linearly dependent. We say x1 , . . . , xk are linearly independent if they are
not linearly dependent.
Example 1.5. The following examples must be verified by the reader.

1. In R2, the vectors x = (1, 2)^t, e1 , e2 are linearly dependent, as x − e1 − 2e2 = 0.

2. v1 = (3, 0, −3)^t, v2 = (−1, 1, 2)^t, v3 = (4, 2, −2)^t, v4 = (2, 1, 1)^t are linearly dependent in R3, as
2v1 + 2v2 − v3 + 0v4 = 0.

3. v1 = (1, 1, 1)^t, v2 = (−1, 1, 1)^t are linearly independent in R3. How?
Remark 1.6. Thus φ is linearly independent and any set containing 0 is linearly dependent.
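As a quick numerical sanity check of Example 1.5, linear dependence can be tested by computing the rank of the matrix whose columns are the given vectors: the vectors are linearly independent exactly when the rank equals the number of vectors. The following small sketch assumes Python with NumPy; it is an illustration, not part of the original notes.

import numpy as np

# Vectors of Example 1.5(2) as the columns of V
V = np.array([[3, -1, 4, 2],
              [0, 1, 2, 1],
              [-3, 2, -2, 1]])
print(np.linalg.matrix_rank(V))          # 3 < 4 columns, so linearly dependent
print(V @ np.array([2, 2, -1, 0]))       # the relation 2v1 + 2v2 - v3 + 0v4 = 0

# Vectors of Example 1.5(3) as the columns of W
W = np.array([[1, -1],
              [1, 1],
              [1, 1]])
print(np.linalg.matrix_rank(W))          # 2 = number of columns, so linearly independent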
Theorem 1.9. Let S = {x1 , · · · , xn }. Then S is linearly dependent iff there exists k s.t. xk is a linear
combination of x1 , · · · , xk−1 , xk+1 , . . . , xn (that is, there exists xk such that xk ∈ ls(S − {xk })).
Proof. Suppose x1 , · · · , xn are linearly dependent. Then there exist scalars α1 , . . . , αn ∈ F, not all zero, such
that Σ_{i=1}^{n} αi xi = 0. Without loss of generality we assume that αk ≠ 0. Then xk = −(1/αk ) Σ_{i=1, i≠k}^{n} αi xi .
So xk is a linear combination of x1 , · · · , xk−1 , xk+1 , . . . , xn .
Conversely, suppose ∃k s.t. xk is a linear combination of x1 , · · · , xk−1 , xk+1 , . . . , xn . That means xk =
c1 x1 + c2 x2 + · · · + ck−1 xk−1 + ck+1 xk+1 + · · · + cn xn for some scalars ci ∈ F for i = 1, . . . , k − 1, k +
1, . . . , n. Then c1 x1 + c2 x2 + · · · + ck−1 xk−1 − xk + ck+1 xk+1 + · · · + cn xn = 0. This implies S is
linearly dependent.
Remark 1.7. The above theorem gives you a technique to check whether a set is linearly dependent
or not: you just have to check whether some vector from the set is a linear combination of the
remaining vectors of S.
The following theorem says something more.
Theorem 1.10. Let x1 , · · · , xn be linearly dependent, x1 ≠ 0. Then ∃k > 1 s.t. xk is a linear
combination of x1 , · · · , xk−1 .

Proof. Consider {x1 }, {x1 , x2 }, · · · , {x1 , x2 , · · · , xn } one by one. Take the smallest k > 1 s.t.
Sk = {x1 , · · · , xk } is linearly dependent. So Sk−1 is linearly independent (since k is the smallest).
In that case, if Σ_{i=1}^{k} αi xi = 0 with α1 , . . . , αk ∈ F not all zero, then αk ≠ 0; otherwise Sk−1 would be
linearly dependent. Therefore xk is a linear combination of x1 , · · · , xk−1 .
Theorem 1.11. Every subset of a finite linearly independent set is linearly independent.
Proof. Let S be a linearly independent set, say S = {x1 , . . . , xk }. Let S1 ⊆ S. We have to show
that S1 is linearly independent. Suppose S1 is linearly dependent. Then there exists a vector, say
xm in S1 , such that xm ∈ ls(S1 − {xm }). Since S1 − {xm } ⊆ S − {xm }, we get xm ∈ ls(S − {xm }).
Hence S is linearly dependent, a contradiction. Therefore S1 is linearly independent.
Remark 1.8. The above theorem says that if you can find a subset of a set S which
is linearly dependent, then S is linearly dependent. But if you find a subset of S which is
linearly independent, then you cannot conclude anything about S.
Remark 1.9. Every subset of a finite linearly dependent set need not be linearly dependent. In
R2, the vectors x = (1, 2)^t, e1 , e2 are linearly dependent, as x − e1 − 2e2 = 0. The subset {e1 , e2 }
is linearly independent.
Till now we have seen linear independence and dependence for finite sets. Next we are going
to define linear independence and dependence for infinite sets, and we use the concepts of linear
independence and dependence of finite sets to do so.
Definition 1.9. Let V be a vector space and let S ⊆ V be infinite. We say S is linearly dependent
if it contains a finite linearly dependent set, otherwise it is linearly independent.
Remark 1.10. Theorem 1.11 is also true if S is infinite.
Theorem 1.12. Let S = {v1 , . . . , vn } ⊆ V and T ⊆ ls(S) such that m = |T | > |S|. Then T is
linearly dependent.
The above theorem is quite important. It says that if S is a finite subset of a
vector space containing n elements, then any finite subset of ls(S) containing more than n elements
is linearly dependent. For example, consider the vector space R3. Take S = {(1, 1, 0), (1, 0, 0)}.
Take T = {(1, 1, 0), (1, 0, 0), (3, 1, 0)}. You can easily check that T ⊆ ls(S) and T contains three
elements. So T is linearly dependent.
Corollary 1. Any n + 1 vectors in Rn are linearly dependent.
Proof. Follows as Rn = ls(e1 , . . . , en ).
The following corollary is quite important. It gives you a technique to extend a
linearly independent set to a larger linearly independent set. This technique will be used frequently
throughout this course. So keep it in your mind.
Corollary 2. Let S ⊆ V be linearly independent, x ∈ V \ S. Then S ∪ {x} is linearly independent
iff x ∉ ls(S).
Corollary 3. Let S ⊆ V be linearly independent. Then ls(S) = V iff each proper superset of S is
linearly dependent.

1.5 Basis and Dimension
Definition 1.10. Let V be a vector space and B ⊆ V. Then B is called a basis for V if the
following hold:
i) B is linearly independent,
ii) ls(B) = V.
Theorem 1.13. Every vector space has a basis.
Proof. To prove this theorem we need Zorn’s lemma. So I skip the proof at this point of time.
Remark 1.11. Every non-trivial vector space has more than one basis. That means a basis is not
unique.
Let S ⊆ T . We say S is a maximal subset of T having a property (P) if
i) S has (P)
ii) no proper superset of S in T has (P).
Example 1.6. Let T = {2, 3, 4, 7, 8, 10, 12, 13, 14, 15}. Then a maximal subset of T of consecu-
tive integers is S = {2, 3, 4}. Other maximal subsets are {7, 8}, {10}, {12, 13, 14, 15}. The subset
{12, 13} is not maximal.
Definition 1.11. A subset S ⊆ V is called maximal linearly independent if
i) S is linearly independent
ii) no proper super set of S is linearly independent.
Example 1.7. The following examples must be verified by the reader.
1. In R3 , the set {e1 , e2 } is linearly independent but not maximal linearly independent.
2. In R3 , the set {e1 , e2 , e3 } is maximal linearly independent.
3. Let S ⊆ Rn be linearly independent and |S| = n. Then S is maximal linearly independent.
Theorem 1.14. If a set S ⊆ V is maximal linearly independent, then ls(S) = V.
Proof. First we assume that S is a maximal linearly independent set. We show that ls(S) = V.
Suppose ls(S) ≠ V. Then there exists α ∈ V which is not in ls(S). Take S1 = S ∪ {α}.
Claim: The set S1 is linearly independent. Suppose it is dependent. Then S1 has a finite
subset, say R, which is linearly dependent. There are two cases.
CASE I: α ∉ R. Then R is a finite subset of S, so R is linearly independent
(every subset of a linearly independent set is linearly independent and S is linearly independent), a contradiction.
CASE II: α ∈ R. Let R = {α1 , . . . , αk , α}. Since R is linearly dependent, there exist
c1 , c2 , . . . , ck , c, not all zero, in F such that
c1 α1 + c2 α2 + · · · + ck αk + cα = 0.
If c = 0, then ci = 0 for i = 1, . . . , k (since {α1 , . . . , αk } is a subset of S, which is linearly
independent). So c ≠ 0. Then α = −(1/c)(c1 α1 + c2 α2 + · · · + ck αk ). This implies α ∈ ls(S), a
contradiction.
Hence we have proved our claim that S1 is linearly independent.
We notice that S ⊊ S1 , a contradiction to the maximality of S. Hence ls(S) = V.

Theorem 1.15. If a subset S ⊆ V is a basis of V, then S is a maximal linearly independent set.

Proof. Suppose that S is not maximal. Then there exists a linearly independent set S1 such
that S ⊊ S1 . Since ls(S) = V, we have S1 ⊆ ls(S) and |S1 | > |S|. By Theorem 1.12, S1 is linearly dependent, a
contradiction. Hence S is a maximal linearly independent set.

Remark 1.12. It is clear that every basis is a maximal linearly independent set.

Remark 1.13. Let V be a vector space and let B be a basis of V. For each x ∈ V there exist unique α1 , . . . , αk ∈
B and unique nonzero c1 , . . . , ck ∈ F such that x = c1 α1 + c2 α2 + · · · + ck αk .

Definition 1.12. Let V be a vector space. Then V is called finite dimensional if it has a basis
B which is finite. The dimension of V is the cardinality of B and it is denoted by dim(V).

Remark 1.14. dim({0}) = 0.

Theorem 1.16. Let S, T be two bases of a finite dimensional vector space V. Then |S| = |T |.

Proof. Since T ⊆ ls(S) and T is linearly independent, Theorem 1.12 gives |T | ≤ |S|.
Similarly |S| ≤ |T |. Hence |S| = |T |.

Theorem 1.17. Let V be a finite-dimensional vector space with dim(V) = n. If U is a subspace
of V, then dim(U) ≤ dim(V), and equality holds if and only if U = V.

Theorem 1.18 (Basis Deletion Theorem). If V = ls({α1 , . . . , αk }), then some of the αi can be
removed to obtain a basis for V.

Proof. Let S = {α1 , . . . , αk }. If S is linearly independent, then S is a basis of V. If S is linearly
dependent, then there exists αt ∈ S such that αt is a linear combination of the rest of the vectors of S.
Take S1 = S − {αt }. It is clear that ls(S1 ) = V. If S1 is linearly independent, then S1 is a basis of V;
otherwise we repeat the argument on S1 . Since S is finite, after finitely many steps we obtain a subset
Sp ⊆ S which is linearly independent and spans V. Therefore Sp is a basis of V.
The following is an application of Basis deletion theorem.

Theorem 1.19. Let V be a finite dimensional vector space and let dim V = n. Let S = {α1 , . . . , αn } ⊆
V be such that ls(S) = V. Then S is a basis for V.

Proof. If we show that S is linearly independent, then we are done. If S is linearly dependent, then by the
Deletion theorem we can reduce S to a basis for V which contains fewer than n elements, a
contradiction to dim(V) = n. Hence S is a basis of V.

Theorem 1.20 (Basis Extension Theorem). Every linearly independent set of vectors in a
finite-dimensional vector space V can be extended to a basis of V .

Proof. Let dim(V) = n. Let S = {α1 , . . . , αk } be a linearly independent set. If ls(S) = V, then
S is a basis of V. If ls(S) ≠ V, then there exists a vector β1 ∈ V which is not in ls(S). Take S1 = S ∪ {β1 }.
By Corollary 2, the set S1 is linearly independent. If ls(S1 ) = V, then S1 is a basis for V. If
ls(S1 ) ≠ V, then there exists a vector β2 ∈ V which is not in ls(S1 ). Take S2 = S1 ∪ {β2 }. By Corollary 2,
the set S2 is linearly independent. If ls(S2 ) = V, then S2 is a basis for V. If ls(S2 ) ≠ V, then
there exists a vector β3 ∈ V which is not in ls(S2 ). Take S3 = S2 ∪ {β3 }. Continuing this process we get
linearly independent subsets Sp = {α1 , . . . , αk , β1 , . . . , βp }. The process must stop after at most n − k steps,
since a linearly independent subset of V cannot contain more than n vectors. When it stops, ls(Sp ) = V,
so Sp is a basis of V.
The following is an application of Basis extension theorem.

Theorem 1.21. Let V be a finite dimensional vector space and dim V = n. Let S = {α1 , . . . , αn } ⊆
V such that S is linearly independent. Then S is a basis of V.

Proof. If we show that ls(S) = V, then we are done. If ls(S) ≠ V, then by the Extension
theorem we can extend S to a basis for V which contains at least n + 1 elements, a
contradiction to dim(V) = n. Hence S is a basis of V.

Theorem 1.22. If U, W are subspaces of a finite-dimensional vector space V, then dim(U + W) =


dim(U) + dim(W) − dim(U ∩ W).

Proof. Since V is finite dimensional, U and W are both finite dimensional. Let B = {v1 , . . . , vk }
be a basis of U ∩ W. By using the Extension theorem we extend B to a basis of U, which is
{v1 , . . . , vk , u1 , . . . , um }, and to a basis of W, which is {v1 , . . . , vk , w1 , . . . , wp }.
Let x ∈ U + W. Then x = x1 + x2 for some x1 ∈ U and x2 ∈ W. Therefore

x1 = Σ_{i=1}^{k} ai vi + Σ_{j=1}^{m} bj uj  and  x2 = Σ_{i=1}^{k} ci vi + Σ_{j=1}^{p} dj wj .

Then x = Σ_{i=1}^{k} ai vi + Σ_{j=1}^{m} bj uj + Σ_{i=1}^{k} ci vi + Σ_{j=1}^{p} dj wj .
This implies x ∈ ls({v1 , . . . , vk , u1 , . . . , um , w1 , . . . , wp }).
We now show that {v1 , . . . , vk , u1 , . . . , um , w1 , . . . , wp } is linearly independent.
To show that, we take

Σ_{i=1}^{k} fi vi + Σ_{j=1}^{m} gj uj + Σ_{l=1}^{p} hl wl = 0.    (1)

Therefore,

Σ_{i=1}^{k} fi vi + Σ_{j=1}^{m} gj uj = − Σ_{l=1}^{p} hl wl .

This implies Σ_{i=1}^{k} fi vi + Σ_{j=1}^{m} gj uj ∈ U ∩ W. Then Σ_{i=1}^{k} fi vi + Σ_{j=1}^{m} gj uj = Σ_{i=1}^{k} αi vi for some αi ∈ F. Therefore

Σ_{i=1}^{k} (fi − αi )vi + Σ_{j=1}^{m} gj uj = 0.

Then fi − αi = 0 for i = 1, . . . , k and gj = 0 for j = 1, . . . , m, as {v1 , . . . , vk , u1 , . . . , um } is linearly
independent. Putting the values gj = 0 in Equation (1), we get

Σ_{i=1}^{k} fi vi + Σ_{l=1}^{p} hl wl = 0.

Therefore fi = 0 for i = 1, . . . , k and hl = 0 for l = 1, . . . , p, as {v1 , . . . , vk , w1 , . . . , wp } is linearly
independent. Hence {v1 , . . . , vk , u1 , . . . , um , w1 , . . . , wp } is linearly independent.
Then {v1 , . . . , vk , u1 , . . . , um , w1 , . . . , wp } is a basis of U + W. Therefore dim(U + W) =
(k + m) + (k + p) − k = dim(U) + dim(W) − dim(U ∩ W).
Definition 1.13. Let V be a finite dimensional vector space and let B = {x1 , . . . , xn } be a basis.
Let x ∈ V. Then there exist unique α1 , . . . , αn ∈ F such that x = α1 x1 + · · · + αn xn . Then
(α1 , . . . , αn ) is called the co-ordinate vector of x with respect to B.
Remark 1.15. A vector space is infinite dimensional if it is not finite dimensional.

2 Inner Product Space


2.1 Definition and Example
Let V be a vector space over the field C. An inner product is a mapping ⟨·, ·⟩ : V × V → C which
satisfies the following conditions.

1. ⟨x, x⟩ ≥ 0 for all x ∈ V and ⟨x, x⟩ = 0 iff x = 0.

2. ⟨x, y⟩ is the complex conjugate of ⟨y, x⟩ for all x, y ∈ V.
3. ⟨αx, y⟩ = α⟨x, y⟩ for all x, y ∈ V and for all α ∈ C.
4. ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩ for all x, y, z ∈ V.
An inner product space is a vector space along with an inner product.
Example 2.1. The following examples must be verified by the reader.

1. Let V = Rn and F = R. The following is an inner product on V: ⟨x, y⟩ = Σ_{i=1}^{n} xi yi . Hence
(Rn , ⟨·, ·⟩) is an inner product space.

2. Let V = Cn and F = C. The following is not an inner product on V: ⟨x, y⟩ = Σ_{i=1}^{n} xi yi .

3. Let V = Cn and F = C. The following is an inner product on V: ⟨x, y⟩ = Σ_{i=1}^{n} xi conj(yi ), where
conj(yi ) denotes the complex conjugate of yi . Hence (Cn , ⟨·, ·⟩) is an inner product space.

4. V = Mn (R) and F = R. The following is an inner product on V: ⟨A, B⟩ = trace(AB^t ). Hence
(Mn (R), ⟨·, ·⟩) is an inner product space.

5. V = C[a, b] and F = R. The following is an inner product on V: ⟨f, g⟩ = ∫_a^b f (x)g(x)dx.
Hence (C[a, b], ⟨·, ·⟩) is an inner product space.

6. V = C(R) (the continuous functions from R to R) and F = R. The following is not an inner product on V:
⟨f, g⟩ = ∫_a^b f (x)g(x)dx.

7. V = Mn (C) and F = C. The following is not an inner product on V: ⟨A, B⟩ = trace(AB^t ).

2.2 Existence of Inner Product
Theorem 2.1. Every non-trivial finite dimensional vector space (over R or C) can be equipped with an inner product.

Question: What happens if we consider an infinite dimensional vector space?

2.3 Orthogonal and Orthonormal


Definition 2.1. Let (V, ⟨·, ·⟩) be an inner product space. Let u, v ∈ V. Then u and v are called
orthogonal if ⟨u, v⟩ = 0.

Definition 2.2. Let (V, ⟨·, ·⟩) be an inner product space. A subset S of V is said to be orthogonal
if ⟨u, v⟩ = 0 for all u, v ∈ S with u ≠ v.

Theorem 2.2. Let (V, ⟨·, ·⟩) be an inner product space. Let S = {α1 , . . . , αk } be an orthogonal
subset of V and let αi ≠ 0 for i = 1, . . . , k. Then S is linearly independent.

Remark 2.1. The converse of the above theorem is not true. The following is an example.

Example 2.2. Consider R2 with the inner product ⟨x, y⟩ = Σ_{i=1}^{2} xi yi . Take u = (1, 0)
and v = (1, 1). These two vectors are linearly independent but not orthogonal.

Definition 2.3. Let (V, ⟨·, ·⟩) be an inner product space. Let {α1 , α2 , . . . , αn } ⊆ V. Then
{α1 , α2 , . . . , αn } is called orthonormal if ⟨αi , αj ⟩ = 0 for i ≠ j and ⟨αi , αi ⟩ = 1.

The next immediate question is whether we can construct an orthogonal set from a finite linearly
independent set. The answer is yes: the Gram-Schmidt process constructs an orthogonal set from a
linearly independent finite set.

Theorem 2.3. (Gram-Schmidt Orthogonalization Theorem) Let (V, ⟨·, ·⟩) be an inner product
space. Let {α1 , α2 , . . . , αn } be a linearly independent set. Then there exists an orthogonal set
{β1 , β2 , . . . , βn } such that ls({β1 , β2 , . . . , βn }) = ls({α1 , α2 , . . . , αn }).

Proof. Since {α1 , α2 , . . . , αn } is a linearly independent set, αi ≠ 0 for i = 1, . . . , n.

Step 1.

Put
β1 = α1 ,
β2 = α2 − (⟨α2 , β1 ⟩/⟨β1 , β1 ⟩) β1 ,
...
βn = αn − (⟨αn , β1 ⟩/⟨β1 , β1 ⟩) β1 − · · · − (⟨αn , βn−1 ⟩/⟨βn−1 , βn−1 ⟩) βn−1 .

Step 2.

Check ⟨βi , βj ⟩ = 0 for i ≠ j, i, j ∈ {1, . . . , n}.

Step 3.

Show ls({β1 , β2 , . . . , βn }) = ls({α1 , α2 , . . . , αn }).

This part can be proved by induction.
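The following is a small computational sketch of the Gram-Schmidt step above, assuming Python with NumPy and the standard dot product on Rn; the example vectors are made up, and the code is an illustration rather than part of the original notes.

import numpy as np

def gram_schmidt(vectors):
    """Orthogonalize a list of linearly independent vectors in R^n.

    Uses beta_j = alpha_j - sum_{i<j} (<alpha_j, beta_i>/<beta_i, beta_i>) beta_i
    (implemented in the modified form, which is equivalent in exact arithmetic).
    """
    betas = []
    for alpha in vectors:
        beta = np.array(alpha, dtype=float)
        for b in betas:
            beta -= (np.dot(beta, b) / np.dot(b, b)) * b
        betas.append(beta)
    return betas

# Example: orthogonalize alpha1 = (1, 1, 1), alpha2 = (1, 0, 1), alpha3 = (1, 1, 0)
betas = gram_schmidt([[1, 1, 1], [1, 0, 1], [1, 1, 0]])
for i, b1 in enumerate(betas):
    for b2 in betas[i + 1:]:
        assert abs(np.dot(b1, b2)) < 1e-12   # pairwise orthogonal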

Question 1. What happens if we have an infinite linearly independent set?

Question 2. Can we construct an orthogonal set from a linearly dependent set?

Theorem 2.4. Every non-trivial finite dimensional inner product space has an orthonormal basis.

Theorem 2.5. Let (V, ⟨·, ·⟩) be an inner product space. Let {α1 , α2 , . . . , αn } be an orthonormal
basis. Then for each x ∈ V we have x = Σ_{i=1}^{n} ⟨x, αi ⟩αi .

2.4 Orthogonal Complement


Definition 2.4. Let (V, ⟨·, ·⟩) be an inner product space. Let S ⊆ V. Then the set {u ∈ V |
⟨u, v⟩ = 0 for all v ∈ S} is called the orthogonal complement of S and it is denoted by S⊥.

Theorem 2.6. Let (V, ⟨·, ·⟩) be a finite dimensional inner product space. Let S be a subspace of V.
Then (S⊥)⊥ = S.

Remark 2.2. Check whether the above theorem is true or not for infinite dimensional inner product
space.

Theorem 2.7. Let (V, ⟨·, ·⟩) be a finite dimensional inner product space. Let W be a subspace of
V. Then V = W ⊕ W⊥.

Proof. There are two steps.

Step 1.

To show W ∩ W⊥ = {0}.

Let x ∈ W ∩ W⊥, that is, x ∈ W and x ∈ W⊥. Then ⟨x, x⟩ = 0, which implies x = 0. Hence
W ∩ W⊥ = {0}.

Step 2.

To show V = W ⊕ W⊥.

It is clear that W ⊕ W⊥ ⊆ V. We now show that V ⊆ W ⊕ W⊥. Take x ∈ V.

Take a basis {u1 , . . . , uk } of W. We extend it to a basis {u1 , . . . , uk , v1 , . . . , vm } of V.
We transform {u1 , . . . , uk , v1 , . . . , vm } into an orthogonal set {α1 , . . . , αk , β1 , . . . , βm } by using the
Gram-Schmidt process.
Then x = Σ_{i=1}^{k} ai αi + Σ_{j=1}^{m} bj βj where ai , bj ∈ F for i = 1, . . . , k and j = 1, . . . , m.
Let x1 = Σ_{i=1}^{k} ai αi and x2 = Σ_{j=1}^{m} bj βj . It is clear that x1 ∈ W. We can easily check that
⟨x2 , w⟩ = 0 for every w ∈ W, so x2 ∈ W⊥. Then x = x1 + x2 ∈ W ⊕ W⊥. This implies V ⊆ W ⊕ W⊥. Hence
V = W ⊕ W⊥.

Remark 2.3. Let (V, ⟨·, ·⟩) be a finite dimensional inner product space. Let W be a subspace of
V. For each x ∈ V there exist unique x1 ∈ W and x2 ∈ W⊥ such that x = x1 + x2 . The vector x1
is called the orthogonal projection of x on W.

Theorem 2.8. Let (V, ⟨·, ·⟩) be a finite dimensional inner product space. Let W be a subspace of V.
Let {α1 , . . . , αk } be an orthogonal basis of W. Then the orthogonal projection of any vector x ∈ V
on W is Σ_{i=1}^{k} (⟨x, αi ⟩/⟨αi , αi ⟩) αi .
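As an illustration of Theorem 2.8, the sketch below computes the orthogonal projection of a vector onto a subspace of R3 spanned by an orthogonal basis, assuming Python with NumPy and the standard dot product; the data are made up for the example.

import numpy as np

def orthogonal_projection(x, orthogonal_basis):
    """Projection of x onto W = span(orthogonal_basis); the basis vectors are pairwise orthogonal."""
    x = np.array(x, dtype=float)
    proj = np.zeros_like(x)
    for a in orthogonal_basis:
        a = np.array(a, dtype=float)
        proj += (np.dot(x, a) / np.dot(a, a)) * a
    return proj

# W = span{(1, 1, 0), (1, -1, 0)}, an orthogonal basis of the xy-plane in R^3
p = orthogonal_projection([2.0, 3.0, 5.0], [[1, 1, 0], [1, -1, 0]])
print(p)            # [2. 3. 0.], and (2, 3, 5) - p = (0, 0, 5) lies in W-perp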

Definition 2.5. Let (V, ⟨·, ·⟩) be an inner product space. Let x ∈ V. The norm of x is
denoted by ||x|| and is equal to (⟨x, x⟩)^{1/2}.

Theorem 2.9. (Cauchy-Schwarz Inequality) Let (V, ⟨·, ·⟩) be an inner product space. Let x, y ∈ V.
Then |⟨x, y⟩| ≤ ||x|| ||y||. Equality holds if and only if x and y are linearly dependent.

Proof. If y = 0, then the inequality is trivial.

If y ≠ 0, take x − ty where t ∈ C. Then

0 ≤ ⟨x − ty, x − ty⟩
  = ⟨x, x⟩ + ⟨x, −ty⟩ + ⟨−ty, x⟩ + ⟨−ty, −ty⟩
  = ⟨x, x⟩ − t̄⟨x, y⟩ − t⟨y, x⟩ + |t|^2 ⟨y, y⟩.

Put t = ⟨x, y⟩/⟨y, y⟩. Then

0 ≤ ⟨x, x⟩ − |⟨x, y⟩|^2 /⟨y, y⟩ − |⟨x, y⟩|^2 /⟨y, y⟩ + |⟨x, y⟩|^2 /⟨y, y⟩
  = ⟨x, x⟩ − |⟨x, y⟩|^2 /⟨y, y⟩.

Hence ⟨x, x⟩ − |⟨x, y⟩|^2 /⟨y, y⟩ ≥ 0, and therefore |⟨x, y⟩| ≤ (⟨x, x⟩)^{1/2} (⟨y, y⟩)^{1/2} .

The inequality used in this proof is 0 ≤ ⟨x − ty, x − ty⟩. If |⟨x, y⟩| = ||x|| ||y|| holds, then
⟨x − ty, x − ty⟩ = 0 for t = ⟨x, y⟩/⟨y, y⟩. This says that x = ty. Then x and y are linearly dependent.
Exercise: Let V be an inner product space.

1. ⟨0, v⟩ = 0 for all v ∈ V.

2. ⟨u, v⟩ = 0 for all v ∈ V =⇒ u = 0.

3 Linear Transformation
3.1 Definition, Examples and Properties
Definition 3.1. Let V and W be two vector spaces over the same field F. A mapping T : V → W
is said to be a linear transformation if T (αx + βy) = αT (x) + βT (y) for all x, y ∈ V and for all
α, β ∈ F.

Convention 3.1. If T : V → W is a linear transformation and x ∈ V, then the T (x) is usually


denoted by T x, i.e., T x := T (x) for all x ∈ V.

Example 3.1. The following examples must be verified by the reader.

1. Let T : R2 → R be defined by T (x, y) = x. Then T is a linear transformation.

2. Let A ∈ Mm×n (R), let T : Rn → Rm be defined by T (x) = Ax, x ∈ Rn . Then T is a linear


transformation.
3. Let T : C[a, b] → R be defined by T (f ) = ∫_a^b f (x)dx. Then T is a linear transformation.

4. Let V and W be vector spaces over the same field F. Let T : V → W be defined by T (x) = 0,
x ∈ V. This transformation is called Zero transformation.

5. Let V be vector space over the field F. Let T : V → V be defined by T (x) = x, x ∈ V. This
transformation is called identity transformation.

6. Let V be a vector space over the field F and let λ ∈ F. Let T : V → V be defined by
T (x) = λx, x ∈ V. This transformation is called a scalar transformation.

Theorem 3.1. Let T : Rn → R be a linear transformation. Then T has the following form:
T (x1 , . . . , xn ) = Σ_{i=1}^{n} αi xi for some αi ∈ R, i = 1, . . . , n, and for all (x1 , . . . , xn ) ∈ Rn.

Theorem 3.2. Let T : Rn → Rm be a linear transformation. Then there exist linear transforma-
tions Ti : Rn → R for i = 1, . . . , m such that T (x) = (T1 (x), . . . , Tm (x)) for all x ∈ Rn .

Definition 3.2. Let T : V → W be a linear transformation.

1. N (T ) := {x ∈ V : T (x) = 0}. N (T ) is a subspace of V. The subspace N (T ) is called the
null space of T .

2. R(T ) := {T (x) : x ∈ V}. R(T ) is a subspace of W. The subspace R(T ) is called the range
space of T .

Definition 3.3. The dim(R(T )) is called the rank of T and dim(N (T )) is called the nullity of T .

Theorem 3.3. Let T : V → V be a linear transformation. Then T is one-one if and only if
N (T ) = {0}.

Theorem 3.4. Let T : V → W be a linear transformation. Then the following are true.

1. If u1 , . . . , un are in V such that T (u1 ), . . . , T (un ) are linearly independent in W, then u1 , . . . , un


are linearly independent in V.

2. If T is one-one and u1 , . . . , un are linearly independent in V, T (u1 ), . . . , T (un ) are linearly


independent in W.

Theorem 3.5. Let T : V → W be a linear transformation. Then the following are true.

1. If B is a basis of V, then R(T ) ⊆ span(T (B)).

2. dim(R(T )) ≤ dim(V).

3. If T is one-one, then dim(R(T )) = dim(V).

4. If V and W are finite dimensional such that dim(V) = dim(W), then T is one-one if and only
if T is onto.

Theorem 3.6. (Rank Nullity Theorem) Let V be a finite dimensional vector space over the field
F. Let W be a vector space over the field F. Let T : V → W be a linear transformation. Then
Nullity(T ) + Rank(T ) = dim(V).
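For the transformation T (x) = Ax of Example 3.1(2), rank(T) is the rank of A and nullity(T) is the dimension of the null space of A, so the theorem can be checked numerically. A small sketch assuming Python with NumPy and SciPy; the matrix is made up for the example.

import numpy as np
from scipy.linalg import null_space

A = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [1., 0., 1., 0.]])        # T : R^4 -> R^3, T(x) = A x

rank = np.linalg.matrix_rank(A)         # dim R(T)
nullity = null_space(A).shape[1]        # dim N(T); the columns form a basis of N(T)
print(rank, nullity, rank + nullity)    # 2 2 4 = dim(R^4)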

Definition 3.4. Let T : V → W be a linear transformation. Then T is said to be isomorphism if


T is bijective (one-one+onto).

Definition 3.5. Let V and W be two vector spaces over the same field F. Then V and W are said
to be isomorphic if there is an isomorphism from V to W.

Theorem 3.7. Let V and W be two finite dimensional vector spaces over the same field F. Let V
and W be isomorphic. Then dim(V) = dim(W).

Notation: We use L(V, W) to denote set of all linear transformation from V to W.

Theorem 3.8. Show that L(V, W) is a vector space over the field F.

Remark 3.1. We have seen that a field is also a vector space.

Definition 3.6. The space L(V, F) is called the dual space of V and it is denoted by V∗. Elements
of V∗ are usually denoted by lower case letters f , g, etc.

Example 3.2. The following example must be checked by the reader.

1. Let T : Mn×n → R be defined as T (A) = trace(A), A ∈ Mn×n .

2. Let T : C[0, 1] → R be defined as T (f ) = ∫_0^1 f (x)dx.

Theorem 3.9. Let V be a finite dimensional space and B = {v1 , . . . , vn } be an ordered basis of V.
For each j ∈ {1, . . . , n}, let fj : V → F be defined by fj (x) = αj for x = Σ_{j=1}^{n} αj vj . Then the
following are true.

1. f1 , . . . , fn are in V∗ and they satisfy fi (vj ) = δij for i, j ∈ {1, . . . , n}.

2. {f1 , . . . , fn } is a basis of V∗.
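In Rn, with an ordered basis given by the columns of an invertible matrix B, the functionals of Theorem 3.9 can be read off from the rows of B^{-1}: if x = Σ αj vj then α = B^{-1}x, so fj (x) = (B^{-1}x)_j. A small numerical sketch, assuming Python with NumPy; the basis below is made up for the example.

import numpy as np

# Ordered basis {v1, v2, v3} of R^3, stored as the columns of B
B = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [1., 0., 1.]])
Binv = np.linalg.inv(B)

# fj(x) = j-th coordinate of x in the basis B = j-th row of Binv applied to x
x = np.array([2., 3., 4.])
coords = Binv @ x
print(np.allclose(B @ coords, x))        # x = sum_j coords[j] * vj
print(np.allclose(Binv @ B, np.eye(3)))  # fi(vj) = delta_ij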

Definition 3.7. Let V be a finite dimensional space and B = {v1 , . . . , vn } be an ordered basis of V.
The basis {f1 , . . . , fn } of V∗ such that fi (vj ) = δij for i, j ∈ {1, . . . , n} is called the dual basis
corresponding to B.

Theorem 3.10. If V is finite dimensional, then V and V∗ are isomorphic.

Definition 3.8. Let T : V → W be a linear transformation and let S : W → U be a linear
transformation. Then S ◦ T defined by (S ◦ T )(v) = S(T (v)) is also a linear transformation from V
to U.

Definition 3.9. Let V and W be two vector spaces over the same field F. Let T : V → W be a
linear transformation. Then T is invertible if T is bijective (one-one+onto).

Remark 3.2. Let T : V → W be an invertible linear transformation. Then T −1 is also a bijective


linear transformation from W to V.

3.2 Matrix Representation


Let V and W be finite dimensional vector spaces over the same field F and let B1 = {u1 , . . . , un }
and B2 = {v1 , . . . , vm } be ordered bases of V and W, respectively. Let T : V → W be a linear
transformation. For each j ∈ {1, . . . , n}, let a1j , . . . , amj in F be such that

T (uj ) = Σ_{i=1}^{m} aij vi .

Let x ∈ V. There exist α1 , . . . , αn ∈ F such that x = Σ_{j=1}^{n} αj uj . Then

T (x) = T ( Σ_{j=1}^{n} αj uj ) = Σ_{j=1}^{n} αj T (uj ) = Σ_{j=1}^{n} αj Σ_{i=1}^{m} aij vi = Σ_{i=1}^{m} ( Σ_{j=1}^{n} aij αj ) vi .

Definition 3.10. The matrix A = (aij ) in the above discussion is called the matrix representa-
tion of T with respect to the ordered bases B1 , B2 of V and W, respectively. This matrix is usually
denoted by [T ]B1 B2 , that is, [T ]B1 B2 = (aij ).

Remark 3.3. The word ordered basis is very significant here, for if the order of the basis is changed, the
entries aij change their positions and so the corresponding matrix will be different.

Example 3.3. Let T be a linear transformation from R2 to R2 defined by T (x1 , x2 ) = (−x2 , x1 ).
Let B1 = {α1 = (1, 0), α2 = (0, 1)} and B2 = {β1 = (1, 2), β2 = (1, −1)} be ordered bases for R2.
Then find out [T ]B1 B2 = (aij ).
Sol: T (α1 ) = T (1, 0) = (0, 1) = 1/3 (1, 2) − 1/3 (1, −1) = 1/3 β1 − 1/3 β2 .

T (α2 ) = T (0, 1) = (−1, 0) = −1/3 (1, 2) − 2/3 (1, −1) = −1/3 β1 − 2/3 β2 .

Therefore

[T ]B1 B2 = (aij ) = (  1/3  −1/3
                       −1/3  −2/3 ).
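The computation in Example 3.3 can be checked numerically: the j-th column of [T ]B1 B2 is the coordinate vector of T (αj ) with respect to B2, i.e. the solution of a linear system whose coefficient matrix has the β's as columns. A sketch assuming Python with NumPy:

import numpy as np

A = np.array([[0., -1.],
              [1.,  0.]])                 # T(x1, x2) = (-x2, x1) in the standard basis
B1 = np.array([[1., 0.], [0., 1.]]).T     # columns alpha1, alpha2
B2 = np.array([[1., 2.], [1., -1.]]).T    # columns beta1, beta2

# Column j of [T]_{B1,B2} solves B2 @ col = T(alpha_j)
T_B1B2 = np.linalg.solve(B2, A @ B1)
print(T_B1B2)    # approximately [[ 1/3, -1/3], [-1/3, -2/3]]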
Remark 3.4. Let V and W be finite dimensional vector spaces over the same field F. Let dim(V) =
n and dim(W) = m. Assume that B = {u1 , . . . , un } and B′ = {v1 , . . . , vm } are ordered bases of V
and W respectively.

1. We have seen that for each linear transformation T : V → W, we have a matrix A ∈ Mm×n (F)
such that [T ]B,B′ = A.

2. Let A ∈ Mm×n (F). Then there exists a linear transformation T : V → W such that A =
[T ]B,B′ , namely the one given by T (uj ) = Σ_{i=1}^{m} aij vi for j = 1, . . . , n.

Theorem 3.11. Let V and W be finite dimensional vector spaces over the same field F. Let
dim(V) = n and dim(W) = m. Then L(V, W) is isomorphic to Mm×n (F).
Proof. Hint: Define ζ : L(V, W) → Mm×n (F) such that ζ(T ) = [T ]B,B′ where B = {u1 , . . . , un }
and B′ = {v1 , . . . , vm } are ordered bases of V and W respectively.
Show that ζ is an isomorphism from L(V, W) to Mm×n (F).
Corollary 4. Let V and W be finite dimensional vector spaces over the same field F. Let dim(V) =
n and dim(W) = m. Then dimension of L(V, W) = mn.
Theorem 3.12. Let V be a finite dimensional vector space over the field F. Let S and T be
two linear transformations from V to V. Let B be an ordered basis of V. Then [S ◦ T ]B,B =
[S]B,B [T ]B,B .
Theorem 3.13. Let V be a finite dimensional vector space over the field F. Let T be an
invertible linear transformation from V to V. Let B be an ordered basis of V. Then [T^{−1} ]B,B =
([T ]B,B )^{−1} .

3.3 Matrix Representation Under Change of Basis


Theorem 3.14. Let V be an n-dimensional vector space over the field F. Let T be a linear
transformation from V to V. Let B = {u1 , . . . , un } and B′ = {v1 , . . . , vn } be two ordered bases of V.
Then there exists a non-singular matrix P such that [T ]B,B = P^{−1} [T ]B′,B′ P .
Proof. Let S : V → V be the linear transformation such that S(ui ) = vi for i = 1, . . . , n.
First we show that S is invertible. Let x ∈ Ker(S) and write x = α1 u1 + · · · + αn un , αi ∈ F. Then S(x) = 0 gives

α1 S(u1 ) + · · · + αn S(un ) = 0, that is, α1 v1 + · · · + αn vn = 0.

Since {v1 , . . . , vn } is linearly independent, αi = 0 for all i, so x = 0. Hence Ker(S) = {0} and
S is one-one. Since V is finite dimensional, S is also onto.
Let [T ]B,B = A = (aij ). Then T (uj ) = Σ_{i=1}^{n} aij ui .
Therefore (S ◦ T ◦ S^{−1} )(vj ) = (S ◦ T )(uj ) = S(T (uj )) = S( Σ_{i=1}^{n} aij ui ) = Σ_{i=1}^{n} aij vi .
Hence [S T S^{−1} ]B′,B′ = (aij ) = [T ]B,B , that is,

[S]B′,B′ [T ]B′,B′ ([S]B′,B′ )^{−1} = [T ]B,B .

Therefore [T ]B,B = P^{−1} [T ]B′,B′ P , where P = ([S]B′,B′ )^{−1} .

Remark 3.5. In the above theorem we have seen that if T : V → V is an operator and B and B′
are two bases of V, then the matrix [T ]B,B is similar to the matrix [T ]B′,B′ .
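Numerically, this similarity can be seen by taking A = [T ]E,E in the standard basis E of Rn and a second basis B′ whose vectors are the columns of an invertible matrix Q; then [T ]B′,B′ = Q^{-1}AQ, and the two matrices have the same characteristic polynomial and eigenvalues. A sketch assuming Python with NumPy; A and Q are made up for the example.

import numpy as np

A = np.array([[2., 1.],
              [0., 3.]])                  # [T] in the standard basis of R^2
Q = np.array([[1., 1.],
              [1., 2.]])                  # columns form another basis B' of R^2

T_Bprime = np.linalg.inv(Q) @ A @ Q       # [T] in the basis B'
print(np.poly(A))                         # characteristic polynomial coefficients
print(np.poly(T_Bprime))                  # same coefficients: the matrices are similar
print(np.sort(np.linalg.eigvals(A)), np.sort(np.linalg.eigvals(T_Bprime)))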

Let V and W be finite dimensional vector spaces over the same field F and let T : V → W
be a linear transformation. Let B1 = {u1 , . . . , un } and B1′ = {u1′ , . . . , un′ } be two bases of V and
let B2 = {v1 , . . . , vm } and B2′ = {v1′ , . . . , vm′ } be two bases of W. One may want to know the relation
between [T ]B1 B2 and [T ]B1′ B2′ .

For this purpose we consider the linear transformations T1 : V → V and T2 : W → W such that

T1 (ui ) = ui′ for i = 1, . . . , n and T2 (vj ) = vj′ for j = 1, . . . , m.

Theorem 3.15. [T ]B1 B2 = [T2 ]B2 B2 [T ]B1′ B2′ ([T1 ]B1 B1 )^{−1} .

4 Eigenvalues and Eigenvectors


4.1 Recall Eigenvalues and Eigenvectors of Matrices
Definition 4.1. Let A ∈ Mn×n (F). A scalar λ is said to be an eigenvalue of A if there exists a
non-null vector x ∈ Fn such that Ax = λx. Any such (non-zero) x is called an eigenvector of A
corresponding to the eigenvalue λ.

Remark 4.1. Let A ∈ Mn×n (F). Then det(xI − A) is a polynomial in x whose coefficients
come from F. For example, consider the matrix

A = ( 1 1 1
      0 2 2
      0 0 3 ).

Then det(xI − A) = x^3 − 6x^2 + 11x − 6.
Theorem 4.1. Let A ∈ Mn×n (F) and λ ∈ F. Then λ is an eigenvalue of A if and only if λ is a
root of the polynomial det(xI − A).
Definition 4.2. Let A ∈ Mn×n (F). Then the polynomial det(xI − A) is called characteristic
polynomial and the equation det(xI − A) = 0 is called characteristic equation.
Theorem 4.2. Let A ∈ Mn×n (F). Then the constant term and the coefficient of x^{n−1} in det(xI − A)
are (−1)^n det(A) and −trace(A), respectively.
 
Remark 4.2. For example, consider the matrix

A = ( 1 1 1
      0 2 2
      0 0 3 ).

Then det(xI − A) = x^3 − 6x^2 + 11x − 6. Here trace(A) = 6 and det(A) = 6. It is clear that the coefficient
of x^2 is −trace(A) and the constant term is (−1)^3 det(A).
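This can be checked numerically: np.poly(A) returns the coefficients of the characteristic polynomial det(xI − A) from the highest degree down to the constant term. A sketch assuming Python with NumPy:

import numpy as np

A = np.array([[1., 1., 1.],
              [0., 2., 2.],
              [0., 0., 3.]])

coeffs = np.poly(A)                  # [1, -6, 11, -6] for x^3 - 6x^2 + 11x - 6
print(coeffs)
print(coeffs[1], -np.trace(A))       # coefficient of x^2 equals -trace(A) = -6
print(coeffs[-1], (-1)**3 * np.linalg.det(A))   # constant term equals (-1)^n det(A) = -6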
Remark 4.3. The following question is quite natural. Does every matrix have an eigenvalue? This
question is similar to the following question. Let P (x) ∈ P(x, F), where P(x, F) is the set of all polynomials
with coefficients coming from F. Does P (x) have a root in F? The answer is no. For example,
consider the polynomial x^2 + 1 in P(x, R); this polynomial does not have any root in R. Now we are
able to answer our 1st question. If x^2 + 1 is the characteristic polynomial of a matrix A ∈ M2 (R),
then A does not have eigenvalues. One such matrix is

A = ( 0 1
     −1 0 ).
Theorem 4.3. The following are true.
1. Let A ∈ Mn×n (C). Then A has at least one eigenvalue.

Proof. The characteristic polynomial of A is PA (x) = det(xI − A). The fundamental theorem of


algebra says that PA (x) has at least one root. Hence A has at least one eigenvalue.
2. Similar matrices have the same characteristic polynomial. But the converse is not true.

Proof. Let A and B be two similar matrices. Then there exists a nonsingular matrix P
such that P −1 AP = B. You can easily check that det(B − xI) = det(P −1 AP − xI) =
det(P −1 AP − xP −1 P ) = det(P −1 ) det(A − xI) det(P ) = det(A − xI).
   
Converse is not true. Consider the matrix A with rows (0, 1) and (0, 0), and let B be the 2 × 2 zero matrix.
These two matrices have the same characteristic polynomial but they are not similar.
3. Let A, B ∈ Mn (F). Then AB and BA have the same characteristic polynomial (see the numerical check after this list).

Proof. More generally, let A ∈ Mm×n (F) and B ∈ Mn×m (F), where m ≤ n. Then det(xI − AB) =
x^{n−m} det(xI − BA); for m = n this gives the claim.

4. Let f (x) be a polynomial and λ be an eigenvalue of A corresponding to the eigenvector x. Then
f (λ) is an eigenvalue of f (A) corresponding to the eigenvector x. But the converse is not true.

Proof: It is very trivial.
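A quick numerical check of item 3 above, assuming Python with NumPy; the matrices are randomly generated for the example.

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

print(np.poly(A @ B))    # characteristic polynomial coefficients of AB
print(np.poly(B @ A))    # the same coefficients (up to rounding) for BA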

Definition 4.3. 1. Let A ∈ Mn×n (F) and let λ be an eigenvalue of A. The number of times λ
appears as a root of the characteristic equation of A is called the algebraic multiplicity of λ.

2. Let A ∈ Mn×n (F) and let λ be an eigenvalue of A. Let S be the set of all eigenvectors of A
corresponding to λ. Then S ∪ {0} is a subspace of Fn. This subspace is called the eigenspace
of A corresponding to λ and it is denoted by Eλ (A). Note that Eλ (A) = Null(A − λI).

3. Let A ∈ Mn×n (F) and let λ be an eigenvalue of A. Then dim(Eλ (A)) is called the geometric
multiplicity of λ with respect to A.

Theorem 4.4. Let A ∈ Mn×n (F) and let λ1 and λ2 be two distinct eigenvalues. Then Eλ1 (A) ∩
Eλ2 (A) = {0}.

Proof. The proof is trivial.

Theorem 4.5. Let A ∈ Mn×n (F) and let λ be an eigenvalue of A. Then the algebraic multiplicity
of λ is greater than or equal to the geometric multiplicity.

Proof. Let the geometric multiplicity of λ be m, that is, dim(Eλ (A)) = m. Let B1 = {x1 , . . . , xm }
be a basis of Eλ (A). Since Eλ (A) is a subspace of Fn, we extend B1 to a basis of Fn,
say, B = {x1 , . . . , xm , xm+1 , . . . , xn }. Then Axi = λxi for i = 1, . . . , m, and for k = m + 1, . . . , n
the vector Axk can be written as a linear combination of B, say Axk = c1,k x1 + c2,k x2 + · · · + cn,k xn .
Let P = [x1 x2 . . . xn ], which is nonsingular as B is a basis. Then AP = P M where

M = ( λIm  C
       0   D ),

C is a matrix of size m × (n − m) and D is a matrix of size (n − m) × (n − m).
Then P^{−1} AP = M , so A and M are similar and have the same characteristic polynomial. The characteristic
polynomial of M is divisible by (x − λ)^m, so the multiplicity of λ as a root is at least m. Hence the algebraic
multiplicity of λ is greater than or equal to the geometric multiplicity.

Definition 4.4. A matrix is said to be diagonalizable if there exists a nonsingular matrix P such
that P −1 AP is a diagonal matrix. That is A is similar to a diagonal matrix.

Remark 4.4. If A is diagonalizable, then the eigenvalues of A are the diagonal entries of that
diagonal matrix. Not every matrix is diagonalizable.

Theorem 4.6. Let A ∈ Mn (F) and let λ1 , . . . λk be the distinct eigenvalues of A. Then the following
are equivalent.

1. A is diagonalizable.

2. A has n linearly independent eigenvectors.

3. Algebraic multiplicity(λi )=geometric multiplicity(λi ) for i = 1, . . . , k.

4. Eλ1 (A) ⊕ Eλ2 (A) ⊕ Eλ3 (A) ⊕ · · · ⊕ Eλk (A) = Fn .
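The equivalence between diagonalizability and matching multiplicities can be explored numerically: for each eigenvalue, compare the number of times it occurs (algebraic multiplicity) with the dimension of its eigenspace (geometric multiplicity). A sketch assuming Python with NumPy and SciPy; the matrices are made up, the second one being the standard non-diagonalizable Jordan-type example.

import numpy as np
from scipy.linalg import null_space

def multiplicities(A, lam, tol=1e-9):
    eigvals = np.linalg.eigvals(A)
    algebraic = int(np.sum(np.abs(eigvals - lam) < tol))
    geometric = null_space(A - lam * np.eye(A.shape[0]), rcond=tol).shape[1]
    return algebraic, geometric

A = np.array([[2., 0.], [0., 2.]])   # diagonalizable: both multiplicities of 2 equal 2
B = np.array([[2., 1.], [0., 2.]])   # not diagonalizable: algebraic 2, geometric 1
print(multiplicities(A, 2.0))        # (2, 2)
print(multiplicities(B, 2.0))        # (2, 1)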

Notation: Let A ∈ Mn (C). Then A∗ = Ā^T (the conjugate transpose of A). For the real case, A∗ = A^T.
Definition 4.5. Let A ∈ Mn (C). Then A is called a normal matrix if A∗ A = AA∗ . For the real case the
condition is A^T A = AA^T.
Definition 4.6. Let A ∈ Mn (C). Then A is called Hermitian if A∗ = A. For the real case it is called
real-symmetric and A^T = A.
Theorem 4.7. Let A be Hermitian (or real-symmetric). Then the following are true.

1. Each eigenvalue of Hermitian ( or real symmetric) matrix is real.

2. The eigenvectors corresponding to distinct eigenvalues of Hermitian (or real-symmetric)


matrix are orthogonal.
Proof. Trivial.
Definition 4.7. Let A ∈ Mn (C). Then A is called Skew-Hermitian if A∗ = −A. For the real case it is
called real-skew-symmetric and A^T = −A.
Theorem 4.8. Each eigenvalue of a Skew-Hermitian (or real skew-symmetric) matrix is either 0 or purely imag-
inary.
Proof. Trivial.
Definition 4.8. Let A ∈ Mn (C). Then A is called Unitary if A∗ A = AA∗ = I. For the real case it is
called orthogonal and A^T A = AA^T = I.
Theorem 4.9. Let A be unitary (or real orthogonal). Then the following are true.
1. The column (row) vectors of A form an orthonormal set.

2. Each eigenvalue of A has unit modulus.


Proof. Trivial.
Remark 4.5. We can notice that Hermitian (real-symmetric), Skew-Hermitian (Skew-Symmetric),
Unitary (orthogonal) matrices are normal matrices.
Theorem 4.10 (Schur Decomposition). Let A ∈ Mn (C). Then there exists a unitary matrix S
such that S ∗ AS = U , where U is an upper triangular matrix.
Proof. Since A is a complex matrix, A has at least one eigenvalue. Let x1 be an eigenvector of
A corresponding to an eigenvalue λ. Extend x1 to a basis {x1 , x2 , . . . , xn } of Cn. By using the Gram-Schmidt
process, we transform {x1 , x2 , . . . , xn } into an orthonormal basis {y1 , . . . , yn } (with y1 a unit multiple of x1 ).
Let S1 = (y1 , y2 , . . . , yn ). Then

A S1 = S1 ( λ  v
            0  B ),

where v is a row vector of size 1 × (n − 1) and B ∈ Mn−1 (C). Therefore

S1∗ A S1 = ( λ  v
             0  B ).

Applying the induction hypothesis to B, we have a unitary matrix T of size n − 1 such that
T∗ B T is upper triangular. Let

S2 = ( 1  0
       0  T ).

You can easily check that S2 is a unitary matrix.

Then S2∗ S1∗ A S1 S2 is upper triangular. Consider S = S1 S2 . The product of two unitary matrices
is unitary. Hence S∗ A S is upper triangular.
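The Schur decomposition is available in SciPy; the sketch below (assuming Python with NumPy and SciPy, and a made-up matrix) computes a unitary Z and upper triangular U with A = Z U Z∗.

import numpy as np
from scipy.linalg import schur

A = np.array([[0., 2., 1.],
              [1., 0., 3.],
              [0., 0., 2.]])

U, Z = schur(A, output='complex')        # A = Z @ U @ Z.conj().T, U upper triangular, Z unitary
print(np.allclose(Z @ U @ Z.conj().T, A))
print(np.allclose(Z.conj().T @ Z, np.eye(3)))
print(np.allclose(np.tril(U, -1), 0))    # strictly lower part of U is (numerically) zero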

Definition 4.9. A matrix A is said to be unitarily (orthogonally) diagonalizable if there exists a


unitary matrix (orthogonal) S such that S ∗ AS = D.
Theorem 4.11 (Spectral Theorem). Let A ∈ Mn (C) and let A be normal. Then there exists a
unitary matrix S such that S∗ AS = D, where D is a diagonal matrix. That is, each normal matrix is
unitarily diagonalizable.
Proof. By using Schur's theorem, we have S∗ AS = U , where U is upper triangular. Since A is
normal, you can easily prove that U is normal. The matrix U is normal and upper triangular, hence
U is diagonal.
Corollary 5. Hermitian, Skew-Hermitian, and unitary matrices are unitarily diagonalizable. Similarly, real-
symmetric matrices are orthogonally diagonalizable (real skew-symmetric and orthogonal matrices are unitarily
diagonalizable over C).
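For Hermitian (or real-symmetric) matrices, NumPy provides this unitary diagonalization directly via np.linalg.eigh, which returns real eigenvalues and an orthonormal set of eigenvectors. A sketch with a made-up real-symmetric matrix:

import numpy as np

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])           # real-symmetric, hence normal

w, S = np.linalg.eigh(A)               # eigenvalues w (real); columns of S are orthonormal eigenvectors
print(np.allclose(S.T @ A @ S, np.diag(w)))   # S* A S = D
print(np.allclose(S.T @ S, np.eye(3)))        # S is orthogonal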

4.2 Eigenvalues (Characteristic values) and eigenvectors (characteristic vec-


tors) of operators
Definition 4.10. 1. Let V be a vector space over the field F and let T be a linear operator from V to
V. A scalar λ ∈ F is called an eigenvalue of T if there exists a non-zero vector v ∈ V such that T (v) = λv.
2. Equivalently, λ ∈ F is an eigenvalue of T if T − λI is not one-one.
Example 4.1. T : R2 → R2 defined by T (x1 , x2 ) = (−x2 , x1 ). This linear operator does not have
eigenvalues.
Example 4.2. T : R2 → R2 defined by T (x1 , x2 ) = (x1 , 0). This linear operator has two eigenvalues,
which are 0 and 1.
Definition 4.11. Let T be a linear operator on the finite-dimensional space V over the field F.
Let A be a matrix representation of T with respect to some basis of V. Then the polynomial
PT (x) = det(A − xI) is called characteristic polynomial of T .
Remark 4.6. The above definition makes sense only because similar matrices have the same char-
acteristic polynomial. Otherwise, for different bases there are different matrix
representations of T , and hence PT (x) would not be unique.
We recall the notion of the co-ordinate vector of a vector with respect to a basis. Let V be a finite-dimensional
vector space over the field F and let x ∈ V. The co-ordinate vector of x with respect to a basis B is denoted by
[x]B .
Theorem 4.12. Let T be a linear operator on the finite-dimensional space V over the field F. Let
A be a matrix representation of T with respect to some basis of V. Then λ is an eigenvalue of T if
and only if λ is an eigenvalue of A.
Proof. First we assume that λ is an eigenvalue of T . Let x be an eigenvector of T corresponding
to the eigenvalue λ. Then T (x) = λx. Let B be the basis of V such that [T ]B = A. Then
[T (x)]B = A[x]B , so λ[x]B = A[x]B . Since x is a non-zero vector, its co-ordinate vector
[x]B is a non-zero vector in Fn. Hence λ is an eigenvalue of A and a corresponding
eigenvector is the co-ordinate vector of x with respect to the basis B.
Similarly we can show that if λ is an eigenvalue of A, then λ is an eigenvalue of T .

Corollary 6. Let T be a linear operator on the finite-dimensional space V over the field F. The
eigenvalues of T are the zeros of its characteristic polynomial.

Definition 4.12. Let T be a linear operator on the finite-dimensional space V over the field F
and let λ be an eigenvalue of T . Let S be the set of all eigenvectors of T corresponding to λ. Then
S ∪ {0} is a subspace of V. This subspace is called the eigenspace of T corresponding to λ and it is
denoted by Eλ (T ). Note that Eλ (T ) = Null(T − λI).
The dim(Eλ (T )) is called the geometric multiplicity of λ with respect to T .

Definition 4.13. Let T be a linear operator on the finite-dimensional space V. We say that T is
diagonalizable if there is a basis for V each vector of which is a characteristic vector of T .

Remark 4.7. If there is an ordered basis B = {u1 , . . . , un } for V in which each ui is a characteristic
vector of T , then the matrix of T in the ordered basis B is diagonal: if T (ui ) = ci ui for i = 1, . . . , n,
then

[T ]B = ( c1  0  · · ·  0
          0  c2  · · ·  0
          ...
          0   0  · · ·  cn ).

We certainly do not require that the scalars c1 , . . . , cn be distinct.

Theorem 4.13. Let T be a linear operator on the finite-dimensional space V. Let λ1 , . . . λk be the
distinct eigenvalues of T . Then the following are equivalent.

1. T is diagonalizable.

2. Algebraic multiplicity(λi ) =geometric multiplicity(λi ) for i = 1, . . . , k.

3. Eλ1 (T ) ⊕ Eλ2 (T ) ⊕ Eλ3 (T ) ⊕ · · · ⊕ Eλk (T ) = V.

4. T has n linearly independent eigenvectors.

Proof. The proof is similar as matrix case.

4.3 Annihilating Polynomials


Let T be a linear operator on the finite-dimensional space V over the field F. Let p and q be two
polynomials in P(x, F). Then
(p + q)(T ) = p(T ) + q(T ) and (pq)(T ) = p(T )q(T ).

Definition 4.14. Let T be a linear operator on the finite-dimensional space V over the field F.
Let p be in P(x, F). Then p is called an annihilating polynomial of T if p(T ) = 0.

The question is whether every linear operator T on a finite-dimensional space V over the field F has
such a polynomial. The answer is affirmative.

Theorem 4.14. Let T be a linear operator on the finite-dimensional space V over the field F. Then
T always has an annihilating polynomial.

Proof. Let dim(V) = n. Then dim(L(V, V)) = n^2. That is, any subset of L(V, V) which contains
n^2 + 1 elements is linearly dependent.
We consider the set {I, T, T^2 , T^3 , . . . , T^{n^2} }, which is a subset of L(V, V) and
contains n^2 + 1 elements. Hence {I, T, T^2 , T^3 , . . . , T^{n^2} } is linearly dependent. Therefore there
exist scalars c1 , . . . , c_{n^2+1} , not all zero, such that c1 I + c2 T + c3 T^2 + · · · + c_{n^2+1} T^{n^2} = 0. Then
P (x) = c1 + c2 x + c3 x^2 + · · · + c_{n^2+1} x^{n^2} is an annihilating polynomial of T .
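The same idea gives a way to compute the minimal polynomial of a matrix numerically: find the smallest k for which {I, A, A^2, . . . , A^k} is linearly dependent, and solve for the monic dependence relation. A sketch assuming Python with NumPy; the matrix is made up, and for it the minimal polynomial (x − 2)^2 has lower degree than the characteristic polynomial (x − 2)^3.

import numpy as np

def minimal_polynomial(A, tol=1e-9):
    """Monic coefficients [1, -c_{k-1}, ..., -c_0] of the minimal polynomial of A."""
    n = A.shape[0]
    powers = [np.eye(n).flatten()]
    for k in range(1, n + 1):
        powers.append(np.linalg.matrix_power(A, k).flatten())
        M = np.column_stack(powers)              # columns: I, A, ..., A^k viewed as vectors
        if np.linalg.matrix_rank(M, tol=tol) < k + 1:
            # A^k is a combination of lower powers: solve M[:, :k] c = vec(A^k)
            c, *_ = np.linalg.lstsq(M[:, :k], powers[-1], rcond=None)
            return np.concatenate(([1.0], -c[::-1]))   # x^k - c_{k-1} x^{k-1} - ... - c_0
    return None

A = np.array([[2., 1., 0.],
              [0., 2., 0.],
              [0., 0., 2.]])
print(minimal_polynomial(A))    # approximately [1, -4, 4], i.e. x^2 - 4x + 4 = (x - 2)^2
print(np.poly(A))               # characteristic polynomial (x - 2)^3, which it divides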

Remark 4.8. Let T be a linear operator on the finite-dimensional space V over the field F. Let S
be the set of all annihilating polynomials of T , that is, S = {p(x) ∈ P(x, F) | p(T ) = 0}. Then S is a
subspace of P(x, F). Furthermore, it is an infinite dimensional subspace.

Definition 4.15. Let T be a linear operator on the finite-dimensional space V over the field F.
The minimal polynomial for T is the (unique) monic generator of the ideal of polynomials over F
which annihilate T .

Remark 4.9. The minimal polynomial p for the linear operator T is uniquely determined by these
three properties :

(1) p is a monic polynomial over the scalar field F.

(2) p(T ) = 0.

(3) No polynomial over F which annihilates T has smaller degree than p has.

Definition 4.16. If A is an n × n matrix over F, we define the minimal polynomial for A in an
analogous way.

Remark 4.10. If the operator T is represented in some ordered basis by the matrix A, then T
and A have the same minimal polynomial.

Theorem 4.15. Let T be a linear operator on an n-dimensional vector space V [or, let A be an
n × n matrix]. The characteristic and minimal polynomials for T [for A] have the same roots,
except for multiplicities.

Proof. Let mT (x) be the minimal polynomial for T . Let λ be a scalar. We want to show that
mT (λ) = 0 if and only if λ is a characteristic value of T .
First, suppose mT (λ) = 0. Then mT (x) = (x − λ)q(x), where q is a polynomial. Since deg q <
deg mT , the definition of the minimal polynomial mT tells us that q(T ) ≠ 0. Then there exists a
vector v ∈ V such that q(T )(v) ≠ 0. Let u = q(T )(v). Since mT (T ) = 0, we have mT (T )(v) = 0, that is,
(T − λI)q(T )(v) = 0. This implies that λ is an eigenvalue of T with corresponding eigenvector u = q(T )(v).
Now, suppose that λ is a characteristic value of T , say T (v) = λv with v (≠ 0) ∈ V. Then
mT (T )(v) = mT (λ)v, and since mT (T ) = 0 and v ≠ 0, we get mT (λ) = 0.

Corollary 7. Let T be a linear operator on an n-dimensional vector space V. The minimal poly-
nomial of T divides the characteristic polynomial of T .

Theorem 4.16. Let T be a linear operator on an n-dimensional vector space V. Let P (x) be an
annihilating polynomial of T over F. Then the minimal polynomial of T divides P (x).

Proof. You can easily prove it by using the division algorithm.
We will prove the following two theorems later.

Theorem 4.17. Let T be a linear operator on an n-dimensional vector space V. Then T is
diagonalizable if and only if the minimal polynomial of T is a product of distinct linear factors.

Theorem 4.18 (Cayley Hamilton Theorem). Let T be a linear operator on an n-dimensional vector
space V. If PT (x) is the characteristic polynomial for T , then PT (T ) = 0.
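A numerical check of the Cayley-Hamilton theorem in matrix form: evaluate the characteristic polynomial at A and verify that the result is the zero matrix. A sketch assuming Python with NumPy; the matrix is made up.

import numpy as np

A = np.array([[1., 2., 0.],
              [3., 1., 1.],
              [0., 1., 2.]])

coeffs = np.poly(A)                       # characteristic polynomial, highest degree first
n = A.shape[0]
P_of_A = sum(c * np.linalg.matrix_power(A, n - i) for i, c in enumerate(coeffs))
print(np.allclose(P_of_A, np.zeros((n, n))))   # True: P_A(A) = 0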

4.4 Invariant Subspaces


Definition 4.17. Let V be a vector space and T a linear operator on V. If W is a subspace of V,
we say that W is invariant under T if for each vector α in W the vector T (α) is in W, i.e., if T (W)
is contained in W.

Remark 4.11. When V is finite-dimensional, the invariance of W under T has a simple matrix
interpretation, and perhaps we should mention it at this point. Let B′ = {v1 , . . . , vr } be a basis
of W. We extend B′ to a basis B = {v1 , . . . , vr , vr+1 , . . . , vn } for V. Let A = [T ]B so that
T (vj ) = Σ_{i=1}^{n} Aij vi . Since W is invariant under T , the vector T (vj ) ∈ W for j ≤ r. This means
Aij = 0 if j ≤ r and i > r. Hence A has the block form

A = ( B  C
      0  D ),

where B is an r × r matrix, C is an r × (n − r) matrix and D is an (n − r) × (n − r) matrix. The matrix B is
precisely the matrix of the induced operator TW in the ordered basis B′.
   
Lemma 4.1. Let

A = ( B  C
      0  D )

and let P (x) be a polynomial. Then

P (A) = ( P (B)  E
            0    P (D) ),

where E is some matrix of size r × (n − r).

Theorem 4.19. Let W be an invariant subspace for T . The characteristic polynomial for the
restriction operator TW divides the characteristic polynomial for T . The minimal polynomial for
TW divides the minimal polynomial for T .

Proof. Using previous lemma and corollary, one can easily prove it.

4.5 Types of Operators


4.6 Jordan Canonical form

5 Definiteness of Hermitian Matrices
