Professional Documents
Culture Documents
Algebra (Ferrar) PDF
Algebra (Ferrar) PDF
Algebra (Ferrar) PDF
64431
co
co
>
OVY-
75
2-I2-6&
5,000.
5 ^
1
F37 A
Accession No.
Author
last
marked below.
ALGEBRA
A TEXT-BOOK OP
DETERMINANTS, MATRICES, AND ALGEBRAIC FORMS
BY
W.
L.
FERRAR
M.A., F.R.S.E.
SECOND EDITION
Amen
House, London
E.CA
BOMBAY CALCUTTA MADRAS KARACHI KUALA LUMPUR CAPE TOWN IBADAN NAIROBI ACCRA
on latent vectors.
W.
HERTFORD COLLEGE, OXFORD,
1957.
L. F.
have tried to provide a text-book of the more elementary properties of determinants, matrices, and algebraic forms. The sections on determinants and matrices,
this
book
to be read at school.
The book
as a whole
is
University teaching in
mathematics should, in my view, provide at least two things. The first is a broad basis of knowledge comprising such theories and theorems in any one branch of mathematics as are of constant application in other branches. is incentive and opportunity to acquire a detailed of some one branch of mathematics. The books knowledge
The second
available
if
make reasonable provision for the latter, especially the student has, as he should have, a working knowledge of at least one foreign language. But we are deplorably lacking in
books that cut down each topic, I will not say to a minimum, but to something that may reasonably be reckoned as an essential part of an undergraduate's mathematical education. Accordingly, I have written this book on the same general plan as that adopted in my book on convergence. I have
included topics commonly required for a university honours course in pure and applied mathematics I have excluded topics
:
am
indebted
may
well serve as
my readers to further algebraic reading. Without that the list exhausts my indebtedness to others, pretending I may note the following: Scott and Matthews, Theory of
a guide for
Determinants
;
Bocher, Introduction
;
to
Modern Algebraic Theories Aitken, Determinants and Matrices Turnbull, The Theory of Determinants, Matrices, and Invariants', Elliott, Introduction to the Algebra of Quantics Turnbull and Aitken, The Theory of Canonical Matrices', Salmon, Modern
;
PREFACE
*
vii
Higher Algebra (though the modern' refers to some sixty years back) Burnside and Panton, Theory of Equations. Further,
;
though the reference will be useless to my readers, I gratefully acknowledge my debt to Professor E. T. Whittaker, whose invaluable research lectures' on matrices I studied at Edinburgh
*
many
years ago.
this book are many. I hope they are, all of them, deliberate. It would have been easy to fit in something about the theory of equations and eliminants, or to digress at
one of several possible points in order to introduce the notion of a group, or to enlarge upon number rings and fields so as to
give
some hint of modern abstract algebra. A book written expressly for undergraduates and dealing with one or more of these topics would be a valuable addition to our stock of
university text-books, but I think little is to be gained by references to such subjects when it is not intended to develop
them
seriously.
In part, the book was read, while still in manuscript, by my friend and colleague, the late Mr. J. Hodgkinson, whose excellent lectures on Algebra will be remembered by many Oxford
men.
In the exacting task of reading proofs and checking references I have again received invaluable help from Professor E. T. Copson, who has read all the proofs once and checked
deeply grateful to him for this work and, in particular, for the criticisms which have enabled me to remove some notable faults from the text.
nearly
all
the examples.
am
Finally, I wish to
thank the
both
on its publishing and its printing side, for their excellent work on this book. I have been concerned with the printing of mathematical work (mostly that of other people !) for many years, and I still marvel at the patience and skill that go to the printing of
a mathematical book or periodical.
W.
HERTFORD COLLEGE, OXFORD,
September 1940.
L. F.
CONTENTS
PART
I
......
.
. .
DETERMINANTS
5 ELEMENTARY PROPERTIES OF DETERMINANTS .28 THE MINORS OF A DETERMINANT 41 HI. THE PRODUCT OF Two DETERMINANTS 55 IV. JACOBI'S THEOREM AND ITS EXTENSIONS 59 V. SYMMETRICAL AND SKEW-SYMMETRICAL DETERMINANTS
I.
.
II.
PART
II
MATRICES
^VI. DEFINITIONS AND ELEMENTARY PROPERTIES VII. RELATED MATRICES VIII. THE RANK OF A MATRIX IX. DETERMINANTS ASSOCIATED WITH MATRICES
.
.
67
.83 .93
.
109
PART
/
III
119 135
XI. THE POSITIVE-DEFINITE FORM XII. THE CHARACTERISTIC EQUATION AND CANONICAL FORMS
.
.
144
..... ......
.
.160
.
169
200
PART
Number
In its initial stages algebra is little more than a generalization of elementary arithmetic. It deals only with the positive can all remember the type of problem integers, 1, 2, 3,...
.
We
that began
'let
if
x came to 3| we
The numbers used in this book may be either real or complex and we shall assume that readers have studied, to a greater or a lesser extent, the precise definitions of these numbers and
the rules governing their addition, subtraction, multiplication,
and
2.
division.
Number
rings
2,
....
(1)
Let r, s denote numbers selected from (1). Then, whether denote the same or different numbers, the numbers
and
belong to
(1).
(1) is
shared by other
sets of
a~\-b^J6,
(2)
where a and b belong to (1), have the same property; if r and s A set of numbers belong to (2), then so do r+s, rs, and r X laving this property is called a KING of numbers.
6'.
1
4702
PRELIMINARY NOTE
3.
Number
3.1.
fields
and every Consider the set of numbers comprising number of the form p/q, where p and q belong to (1) and q is not
zero, that is to say,
(3)
Let r,s denote numbers selected from (3). Then, when s is not zero, whether r and s denote the same or different numbers,
the numbers
all
r+g>
_ s>
rxs>
^s
is
The property
others
:
among many
(4)
all
complex numbers;
all real
numbers
(rational
and
irrational);
(5)
(6)
Each of the
and
(6) constitutes
field.
DEFINITION. A set of numbers, real or complex, is said to form a FIELD OF NUMBERS when, if r and s belong to the set and s is not
zero
>
r+s,
set.
rs,
rxs,
r-^-s
(1) is
not a
field; for,
whereas
it
contains
and
2, it
|.
the work
field is
In the early part of the book this aspect of the matter need not be emphasized: in some of the later chapters the essence of the theorem is that all the operations envisaged by the theorem can be carried out within the confines of any given field of numbers. In this preliminary note we wish to do no more than give a
PRELIMINARY NOTE
formal definition of a
field
Matrices
is
called a matrix.
Thus
22
*2m
is
n we speak of a square matrix of order n. Associated with any given square matrix of order n there are a number of algebraical entities. The matrix written above,
a matrix.
When m
with
(i)
m=
n, is associated
*2n
nl
(ii)
2 2
*,**
\
I I
in the 2n variables
(iv)
ars^ys
\
r=l s=l
2 2 Or^r^
where
(v)
rr
PRELIMINARY NOTE
The
theories of the matrix
and of
its
closely knit together. The plan of expounding these theories that I have adopted is, roughly, this: Part I develops properties of the determinant; Part II develops the algebra of matrices,
referring
back to Part
I for
any
result
may
associated forms.
CHAPTER
Introduction
1.1. In the following chapters it is assumed that most readers will already be familiar with determinants of the second and third orders. On the other hand, no theorems about such
is
complete
in itself.
Until the middle of the last century the use of determinant notation was practically unknown, but once introduced it
gained such popularity that it is now employed in almost every branch of mathematics. The theory has been developed to such an extent that few mathematicians would pretend to a knowledge of the whole of it. On the other hand, the range of theory that is of constant application in other branches of mathematics is relatively small, and it is this restricted range that the book
covers.
1.2.
Determinants
a^x+b^j
are satisfied
0,
a 2 x+b 2 y
y,
=
least,
being different
one of them, at
y)~b l (a 2 x~\-b 2 y)
0,
and so
Similarly, (n L b 2
(a l b 2
~a
b l )x
0.
a 2^i)#
0>
is
an(l so a i^z
a z^i
0.
usually written as
ai a2
The term a l b 2
is
Since
is
said
The determinant has one obvious property. If, in interchange simultaneously a x and b l9 a 2 and 6 2 we get
,
we
bl
a2
b2 ai
instead of
c^
b2
#2^1-
That is, the interchange of two columns of (1) reproduces the same terms, namely a l b 2 and (i 2 b lt but in a different order and
wifrh the opposite signs.
Again,
let
numbers
#,
y and
y
z,
not
the three
(2)
equations
0,
0,
(3) (4)
0;
and
(4),
a^b 2 )x
(b 2 c 3
(c 2
b 3 c 2 )z
c.^a 2 )z
a 3 b 2 )y
(2),
a3
= 0, = 0,
a3 b 2 )}
and
so,
from equation
c3
a 2 )+c 1 (a 2 b 3
0.
We
by A;
may
be written as a 3 ^2 C l>
a l ^2 C 3
a i ^3 C 2~t" a 2 ^3 C l
is
a 2 ^1 C 3 + a 3 ^1 C 2
0.
zA
By
Since
0.
=
as
0.
6L
b2 63
in
which form
it is
The term a x
62 c3
is
a 2 x+b 2 y+...+k 2
= z =
0,
0,
the equations are to be satisfied by a set of values It is suggested that this x, 2/,..., z which are not all zero. function of the coefficients may be conveniently denoted by
if all
x
bl
which form it may be referred to as a determinant of order n, kn called the leading diagonal. and #! 6 2 Just as we formed a determinant of order three (in 1.2) by using determinants of order two, so we could form a determinant of order four by using those of order three, and proceed step by step to a definition of a determinant of order n. But this is not the only possible procedure and we shall arrive at our
in
definition
by another path.
determinants of
We
such a
way
The expression
(1) is
of the form
wherein the
to
r, s,
the values
taken over the six possible ways of assigning 1, 2, 3 in some order and without repetition.
The leading diagonal term a^b^c^ is prefixed by +. (III) As with the determinant of order 2 (1.2), the interchange of any two letters throughout the expression (1) reproduces the same set of terms, but in a different order and with the opposite signs prefixed to them. For example, when a and b are interchanged! in (1), we get
(II)
(1),
Determinants of order n
2.1. Having observed (1.5) three essential properties of a determinant of the third order, we now define a determinant
of order n.
bi
b,
. .
Ji
*
fc
J2
(1)
n
is that
b ,i
jn
kn
conditions:
(I)
it is
an expression of
the
form
(2)
2 i a/A-"^0>
ivherein the
sum
is
to r, s,..., 6 the
values
taken over the n\ possible ways of assigning 1, 2,..., n in some order, and without
prefixed by the
repetition;
(II)
is
sign
+;
any
other term is such that the inter-
change of any two letters^, throughout (2) reproduces the same Throughout we use the phrase 'interchange a and 6* to denote the
,
.
simultaneous interchanges a t and b l9 a 2 and 6 2 a s an d & 3 ..., a n and b n J See previous footnote. The interchange of p and </, say, means the simultaneous interchanges
p and
qi t
p 2 and
y^,...,
p and
fi
qn .
and with
Before proceeding
definition yields
The proof that divided into four main steps. First step. Let the letters a, 6,..., k, which correspond to the columns of (1), be written down in any order, say
a's, 6's,..., &'s
...,
d, g, a,
...,
p,
...,
q,
....
(A)
is
An
an ADJACENT INTERCHANGE. Take any two letters p and q, having, say, m letters between them in the order (A). By w+1 adjacent interchanges, in each of which p is moved one place to the right, we reach a stage at which p comes next after q\ by m further adjacent interchanges, in each of which q is moved one place to the left, we reach a stage at which the order (A) is reproduced save that p and q have changed places. This stage has been reached by means of 2m+l adjacent
called
interchanges.
all
we end
if
initial signs.
Accordingly,
the
condition (III) of the definition is satisfied for adjacent interchanges of letters, it is automatically satisfied for every inter-
change of letters. Second step. The conditions (I), (II), the determinant (of the second order)
(III) fix
the value of
to be a l b 2
a2 b l
For, by
(I)
and
(II),
and, by so that
(III),
we cannot have a l b 2 +a 2 b l
step. Assume, then, that the conditions (I), (II), (III) 1. are sufficient to fix the value of a determinant of order n 4702 C
Third
10
By
the determinant
suffix 1
;
An
has the
this set of
terms
8
is
<*i2b
wherein,
(i)
Ct...k e9
,
(3)
by
(I),
as applied to
is
An
the
sum
(nI)\
possible
assigning to
6 the values 2,
3,...,
in
(II),
as applied to
. . .
Aw
the term 6 2 c 3
kn
is
prefixed
rt ,
by
Finally,
(iii)
letters 6,
c,...,
k changes
fix the
Hence, by our hypothesis that the conditions (I), (II), (III) value of a determinant of order n 1, the terms of (2) in which a has the suffix 1 are given by
(3 a)
on our assumption that a determinant of order is the conditions (I) (II) (III) fixes the signs of all by terms in (2) that contain a 1 b 2t a^g,..., ab n Fourth step. The interchange of a and b in (2) must, by condition (III), change all the signs in (2). Hence the terms of (2) in which 6 has the suffix 1 are given by
This,
nl
defined
(3b)
for (3 b) fixes the sign of a term fi^c, ... k Q to be the opposite of the sign of the term a l b8 cl ... kg in (3 a).
now show
The adjacent interchanges 6 with c, that (2) must take the form
with
rf,
...,
j with k
11
63
bn
-...
+ (-!)"a3 63
(4)
an bn
Jn
That
to say, if conditions (I), (II), (III) define uniquely a determinant of order n 1, then they define uniquely a deteris
minant of order n. But they do define uniquely a determinant of order 2, and hence, by induction, they define uniquely a determinant of any order.
Rule for determining the sign of a given term. If in a term a r b s ...k there are X suffixes less than n that come after n, we say that there are \ inversions with respect to n. For example, in the term a 2 b 3 c^d lJ there is one inversion
2.2.
}1 tl
with respect to 4. Similarly, if there are A n-1 suffixes less than n 1 that come after n 1, we say that there are X n _ l inversions
with respect to n
1
;
and so
on.
The sum
is
number of
a
=
A6
b c
1
e f
(5}
and 5 come
after 6,
5,
A5 A4
A3
= = =
0,
3,
come
after 4,
2,
A2
1,
Aj
0;
is
the total
If ar b 8
number of
...
inversions
+ 3+2+1 =
1
8.
interchanges of letters and, on restoring alphabetical order, make n the last suffix. For example, in (5), where Xn 2, the
12
and / with d
f
give, in
On
we have
5 are in their
and the
the last suffix, A^_ x adjacent Similarly, interchanges of letters followed by a restoration of alphabetical order will then make n 1 the (n l)th suffix; and so on.
letters
make
i.e.
By
(III),
the sign
to be prefixed to
,
any term of
(2)
is
(1)^, where N,
the total
number
of inversions of suffixes.
Let
^m^
be
n.
In the term
let there
jji
suffixes greater
than
of these ^ m greater suffixes in evaluating N, accounts for one inversion with respect and, to each of them. It follows that
suffix
Then the
3.
Aw
,
o/
2.1
can be expanded
I(-l)
a,
Na
b s ...k g
where
(ii)
r,
,...,
0;
bn
Cn
h h
This theorem has been proved in
2.
13
A
bl
00
determinant
is
unaltered in value
when rows
that is to
say
^
fC<\
By Theorem
where
a,
1,
is
2 (-!)%&.
/?,...,
/c
..
(7)
is
6,...,
the total
number of inversions of
letters.
[There are /x n inversions with respect to k in ajS.../c if there are fi n letters after k that come before k in the alphabet; and so on: ^i+^2+*--+M/i * s the total number of inversions of letters.]
Now
If
(7),
say
*
(
M (-l)
we
A>
8)
we
In
(9) (-\)Uar b8 ...j ke come after k, so that in (9) there that come before 8. Thej*e are are p n suffixes greater than letters that come before j in the alphabet but after it in Fn-i (8), so there are ^ n _^ suffixes greater than t that come before 2.3 that M, which is it in (9); and so on. It follows from defined as 2 /*n> * s equal to N, where N is the total number of (8) there are
letters that \i n
inversions of suffixes in
(9).
the expansion of the second determinant of the enunciation, may also be written as
(7),
Thus
which
is
2(-i)%A-^>
is, by Theorem 1, the expansion of the of the enunciation; and Theorem 2 is proved.
which
first
determinant
THEOREM
in a
3.
or of two rows,
by
I.
It follows at once
the definition of
An
that an
by
14
COROLLARY. .// a column (roiv) is moved past an even number of Columns (rows), then the value of the determinant is unaltered;
in particular
6i 6, d,
d,
d,
bt
di
If a column (row) is moved past an odd number of columns (rows), then the value of the determinant is thereby multiplied
by -l.
For a column can be moved past an even (odd) number of columns by an even (odd) number of adjacent interchanges. In the particular example, abed can be changed into cabd by lirst interchanging 6 and c, giving ticbd, and then interchanging a and c.
3.11.
as
3,
(ii)
of
Theorem
first
row.
By
there
a similar expansion
a*
A2
63 64
c2
3
r/
a$
fc
C2 C4
^2
a4
c4
&4
d4
its first
first
row, the
dl
a
b
Similarly,
the
first
determinant
may
be
written as
b l Cj d l 6 3 c 3 dz
dl
ci!
bl
dl
a3
15
Again, by Theorem 2, we may turn columns into rows and rows into columns without changing the value of the determinant. Hence there are corresponding expansions of the
determinant by each of
its
columns.
We
3.2.
Chapter
II,
1.
THEOREM
its
4.
rows, identical,
value
columns, or rows, leaves its value unaltered. But, by Theorem 3, if its value is x, its value after the interchange of the two columns, or rows, 0. is -~x. Hence x x, or 2x
identical
// each element of one column, or row, is multiplied by a factor K, the value of the determinant is thereby multiplied by K.
5.
THEOREM
This
is
definition, for
...k.
n,
=K
ab
3.3.
THEOREM
6.
is
equal to the
different
sum
2n
particular,
is the
sum
a2
62
is
This again
is
the algebraic
sum
16
of a determinant is unaltered if to each element of one column (or row) is added a constant multiple of the corresponding element of another column (or row); in particular,
7.
THEOREM
The value
a,+A6 2
Proof. Let the determinant be of two distinct columns of A.
AH
of
2.1,
x and
?/
the letters
Let
be the determinant
formed from A n by replacing each xr of the x column by xr -{-\yr where A is independent of r. Then, by Theofem 1,
,
az
aj b v 62
xt
2/2
2/2
*n
%n
2/n
But the second of these is zero since, by Theorem 5, it is A times a determinant which has two columns identical. Hence A^ = A COROLLARIES OF THEOREM 7. There are many extensions of Theorem 7. For example, by repeated applications of Theorem 7
.
;i
it
each column (or row) of a determinant fixed multiples of the SUBSEQUENT columns (or rows) and leave the value of the determinant unaltered; in particular,
We may
add
to
i o
61
<
60
63+^3
There
is
QUENT.
Another extension of Theorem
column
(or row) and
column
(or
row)
to
every other
b1
a 2 +A6 2 a3
17
many others. But experience rather than set rule the better guide for further extensions. The practice of adding multiples of columns or rows at random is liable to lead
is
A==
a l -\-Xb 1 a 2 +A6 2
is,
by Theorem
6,
A6 3
all
MC3
Theorem
6,
such as
A6
5.
Hence the
deter-
minant A
is
equal to
NOTE. One of the more curious errors into which one is led by adding 0. In the multiples of rows at random is a fallacious proof that A example just given, 'subtract second column from first, add second and = l,i/ = l,a choice third, add first and third', corresponds toA= l,/Lt of values that will (wrongly, of course) 'prove' that A = 0: the manipulation of the columns has not left the value unaltered, but multiplied the determinant by a zero factor, 1-f A/zv.
3.5. Applications of
Theorems
1-7.
7
A n we form
,
new
is
obtained by taking
times the
18
first
ra times the
times the
refers to
EXAMPLES
1.
87
42
18
17
3
7
-1
4
45 50
91
3
6
-5
The working that follows is an illustration of how one may deal with an isolated numerical determinant. For a systematic method of computing determinants (one which reduces the order of the determinant from, say, 6 to 5, from 5 to 4, from 4 to 3, from 3 to 2) the reader is referred to Whittaker and Robinson, The Calculus of Observations
(London, 1926), chapter v.
On
writing
c{
cl
2c a
c3
Cg
c 3 -f-3c4 ,
by Theorem
I.
On expanding
2.
Prove thatf
1
.
a
yot.
On
taking
ca
c lf c' 3
=
1
c3
c 2 , the
determinant becomes
j8
y~]3
a(j3-y)
fo
y(a-j3)
t This determinant, and others that occur in this set of examples, can be evaluated quickly by using the Remainder Theorem. Here they are intended
as exercises on
1-3.
19
and
5, is
equal to
i.e.
3.
Prove that
^
O (D
/
(-/J)
a)
\2
A \j
(<x-y)
iQ ID
(a-S)
/O
v/3
0.
y)
\2
0/
SS\2
(y
_ a) 2
2
(y
_)i
2
(y-8)
2
(8-a)
(8-j8)
(8-y)
studied the multiplication of determinants wo shall see (Examples IV, 2) that the given determinant is 'obviously' zero, being the product of two determinants each having a column of zeros.
When we have
But, at present,
r( r%
r's
we
treat
it
as an exercise on
factor a
factor
/3
/?
Theorem
from
r(
7.
Take
5),
=^
7*2
(Theorem
r %~ r z
r3
r4
factor
Then A/(a-)8)(j8-y)(y-S)
equal to
-y
0-fy-28
y-8
In this take
/?
c{
=
y;
c^it
c2,
c.^
c2
c 3 , c3
c3
c4
factors
a,
/?,
becomes
2
2 2
2
(j8-a)(y-j8)(8-y)
2
2
a+j3-28 )3+y-28
y-8
28-jS-y
S-y
2S-a-
00 00 22
r^
r^
r' 2
^2
~ r3
becomes
a-y
j8-8
y-8
28-jS-y
8-y
which, on expanding by the first row, is zero since the determinant of order 3 that multiplies a y is one with a complete row of zeros.
4.
Prove that
3
(a-j8)
3
(a-y)
(j8-y)
0.
(j8-a) (y-<*)
3
(y-0)
20
y+a
r' 3
a+,
r^r*
1
1
ft
0.
a
8
7.
y
8+a
y+8
a
Prove that
2
a+/?
y
2
-)8
a 2 -y 2
2 jS
-8 2
82
0.
y
Q
2 ft
r-<**
HINT. Use Theorem Prove that
2a j8 + a y+a
2_^2 2_^2
2_ y 2
2 y ~g2
6.
8.
a2j8
a-f-y
a4~8
=?=
0.
j8+y
2y
8
y+jS
y+8
28
+y
\/
9.
Prove that
1
1
)3
y
y
2
y8a
a2
10.
2
j8
82
Prove that
^
-a -b -d
c
e
c e
Of
ct c2
directly if
we
consider
it
as
bl
b2
63 64
C3
C4
whose expansion is 2 (~ l)^ar^ c e^u ^ ne value of N being determined by the rule of 2.2. There are, however, a number of zero terms in the
expansion of A. The non-zero terms are a 2/ 2 [a 2 &!C 4 c?3 and so is prefixed by (
6 e
2
I)
],
2
,
2
,
each prefixed by +,
21
two terms aebf [one a 2 64 c r rf3 the other a3 6 X c4 e/2 and so each prefixed
by(-l)
3 ],
-f-
Hence
1,
d
e
a
e
d
is
a 2 d 2 -f 6
2
-f-
2bcef
2cafd
y
l 2
2/
is
bed
d
a
b
x-\~c
x-\-c
as a product of factors.
HINT.
Two
linear factors
factor.
is
Cj-f- Ac 2 ,
Ca
c 2 -f^c3 ,
c3
+ vc
l9
ci
c4 .
Notation for determinants The determinant A of 2.1 is sufficiently indicated by its leading diagonal and it is often written as (a<ib 2 c3 ...k n ). The use of the double suffix notation, which we used on p. 3
4.
rt
of the Preliminary Note, enables one to abbreviate still further. The determinant that has ara as the element in the rth row
and the 5th column may be written as \ars Sometimes a determinant is sufficiently indicated by
\.
its first
22
row; thus (a>ib 2 c 3 ...k n ) may be indicated by the notation is liable to misinterpretation.
5.
but
alternants.
a3 a
2
3
j8
ft*
/3
oc
1111
y 2 y y
S3 S2
is
equal to (oc /?)(a y)(a 8)(j8 y)()S 8)(y S); the correspondn ~ l n ~" ... 1), is equal to the ^ ing determinant of order n, namely (oc
product of the differences that can be formed from the letters a, /?,..., K appearing in the determinant, due regard being paid
to alphabetical order in the factors.
called
ALTERNANTS.
On expanding A 4 we
,
see that
it
may
be regarded
6 in the variables
(a)
as a
a,
j8,
(b)
as a non-homogeneous polynomial of highest degree 3 in a, the coefficients of the powers of a being functions of
OL
j8,
since
it
to a polynomial in a, the determinant has a factor a /?. a similar argument, the difference of any two of a, j3, y, 8
By
is
facto? of
A4
so that
Since
A 4 is
the factor
y, S,
is
K, and
in
A4
it is 1.
Hence
K=
and
The
last
product
is
conveniently written as
(>/?, y>S)>
and
23
corresponding product of differences of the n letters K as (a, /?,..., *). This last product contains /?,...,
factors
also
is
and the degree of the determinant (a n ~ 1 j3 " 2 ... 1) is (n~ l)+(r& 2 + .., + !, so that the argument used for A 4
/l
readily extended to
An
'5.2.
==
^e product is taken over the n-th roots of unity. determinant is called a CIRCULANT.
Such a
Let
cos
2kn --J-^sm n n
,
2&7T
(A;
rt
l,2,...,n).
In
replace the
c
first
column
cx
by a new column
c[,
where
=
A
unaltered.
fact
The
first
that aj n
column of the new determinant is, on using the -1 = a, 1 and writing a 1 +a 2 co+...+an a) n
a 1 +a 2 a)+a 3 aj 2 +...+a H o} n - 1
a,
Hence the
first
24
w - 1 which is therefore a>+...-fa n aj (Theorem 5) a cu factor of A. This is true for 1, 2,. ..,?&), so that o^ (A;
= a +a
1
= Jf
,
(o 1
+a
cu Jb
+...+aw a>2-
1
).
(11)
Moreover, since A and JJ are homogeneous of degree n in the must be independent of these variables a lt a 2 ,..., a n the factor
variables
and so
is
a numerical constant.
(1 1),
Comparing the
coefficients of aj
we
see that
K=
1.
6.
of
are said to be a
2,..., w-
PERMUTATION
in
some
order.
We
can
1, 2,...,
n by
suitable
interchanges of pairs; but the set of interchanges leading from For example, the one order to the other is not unique.
denoted by
/I 3 2 6 4 5\
6 4 5\
II 2 3 4 6 5\
\1 3 2 6 4 5J'
\1 2 3 6 4 5/
\1 2 3 4 6 5/'
\1 2 3 4 5 6/'
whereby we first put 1 in the first place, then 2 place, and so on. But we may arrive at the same
first
in the second
final result
by
fifth,
and
so on:
or
we can proceed
/4 3 2 6 1 5\
\4 3 2
6 5/'
1
\4 3
2 6 5/'
as
first
steps towards
moving
the reader will see for himself, there is a wide variety of sets of interchanges that will ultimately change 432615 into 123456.
suppose that there are interchanges in any ONE way of going from r 5,,.., 6 to 1, 2,..., n. Then, by condition III in the definition of a determinant (vide 2.1), the term ar b 8 ...tc0
t
Now
in the expansion of A n (a 1 6 2 "-^n) * s prefixed by the sign 1)^. But, as we have proved, the definition of A n by the ( conditions I, II, III is unique. Hence the sign to be prefixed
any given term is uniquely determined, and therefore the sign (1)^ must be the same whatever set of interchanges is
to
r, $,...,
to
1, 2,...,
nl
Thus,
if
one
way
of
26
going from r, 5,..., 6 to 1, 2,..., n involves an odd (even) number of interchanges, so does every way of going from r, 5,..., d
to
1, 2,...,?i.
Accordingly, every permutation may be characterized either as EVEN, when the change from it to the standard order 1, 2,..., n
can be effected by an even number of interchanges, or as ODD, when the change from it to the standard order 1, 2,..., n can be effected by an odd number of interchanges.
6.2. It can be proved, without reference to determinants, one way of changing from r, 5,..., 6 to 1, 2,..., n involves
that, if
an odd (even) number of interchanges, so does every way of effecting the same result. When this has been done it is ar &s ...&0, where legitimate to define A n = (a\b 2 ...k n ) as the plus or minus sign is prefixed to a term according as its suffixes form an even or an odd permutation of 1, 2,..., n. This is, in fact, one of the common ways of defining a determinant.
7. Differentiation
When
variable x, the rule for obtaining its differential coefficient follows. If A denotes the determinant
as
and the elements are functions of x, d&/dx is the sum of the n determinants obtained by differentiating the elements of one row (Or column) of A and leaving the elements of the other rows (or columns) unaltered. For example,
nI
2x
x
2x
X
4702
a;
26
from Theorem
since
V(-D^ dx
-4
dx
dx
k,
dk/dx
ing
is
the
sum
Ti1
EXAMPLES
V/l.
II
Prove that
1
a2
l'
y
c3
f
^jC^ where s r
^Cj,, cj
a r +j3 r +y r
y/2.
Prove that
1 1
ft
y
aj
3.
Prove that
a
(y-h)
first
Prove that each of the determinants of the fourth order whose rows are (i) 1, j8+y-f 8, a 2 a 3 (ii) 1, a, jS 2 +y 2 -f S 2 y8, is equal to if(a,j8,y,8). Write down other determinants that equal
4.
, ;
,
6.
-a,
where
a>
1.
27
first
row
is
a3
...
a 2n
,
is
aa a a -a n+2 ,...).
7.
++8)
the product of a circulant of order n (with first row ai+a n+l and a skew circulant of order n (with first row a x a n+lf
first
row
9.
By
show
t/"
in
sum
degree p,
is
equal to
HINT. The first three factors are obtained as in Theorem 8. The degree of the determinant in a, /?, y is /owr; so the remaining factor must be linear in a, ]8, y and it must be unaltered by the interchange of any two letters (for both the determinant and the product of the first three factors are altered in sign by such an interchange). 2 Alternatively, consider the coefficient of 8 in the alternant (a,/?, y, S).
11.
Prove that
1
y*
03
12.
Extend the
results of
Examples 10 and
11 to determinants of
higher order.
CHAPTER
1
.
II
minors
obtained by deleting the row and the column containing ar is called the minor of ar and so for other letters. Such minors are called FIRST MINORS; the determinant of order n 2 obtained by deleting the two rows with suffixes r, s and the two columns with letters a, b is called a SECOND MINOR; and so on for third, fourth,... minors. We
the determinant of order
I
;
shall
first minors by a r j8r ,... can expand A by any row or column (Chap. for example, on expanding by the first column,
denote the
We
/t
I,
3.11);
A*
or,
ia 1 -a 2a2 +...
+ (-l)^- a
1
/l
/i
(1]
An
1.2.
= _a
a 2 +& 2 &-...
+ (-l)^
*2
(2;
The above notation requires a careful of sign in its use and it is more convenient CO-FACTORS. They are defined as the numbers
consideration
to
introduce
A ,B
r
,...,Kr
(r=l,2,...,n)
such that
An
ai
A 1 +a 2 A 2 -}-...+a n A n
A m by its
= a n A n +b n Bn +...+kn K n
these being the expansions of A w by
its
various rows.
29
Ar B
,
ry
...
are obtained
by
/?r ,...
a simple matter to determine the sign for any minor. For example, to determine whether C2 is particular +)>2 or y2 we observe that, on interchanging the first two
rows of A n
69 b,
k,
a3 a
while,
by the
so that
C2
y2
Or
that
3 is
+S 3
A n is
unaltered (Theorem
it
3,
row up
the
until
becomes thf
first
on expanding by
first
so formed,
But
and
so /)3
An
S3
.
1.4. If
we
co-factor
A rs of ars in
use the double suffix notation (Chap. I, 1.3, ( \ars is, by the procedure of
\
4),
l)
the
r +s ~ 2
is,
A rs
Aw
is
l)
r+s
We
we
have seen
in
that, with
=
(3)
If
by
where
s is
as
bs
...
A:
s,
1, 2,..., w,
other than
...
;
r,
ks that
A B
r,
r ,...,
are unaffected
30
such a change in the rth row of minant, (3) takes the form
Q
=aA
s
(4)
proved
in the
summed up
The sum of elements of a row (or column) multiplied by their awn co-factors is A; the sum of elements of a row multiplied by the corresponding co-factors of another row is zero; the sum of elements of a column multiplied by the corresponding co-factors of another column is zero.
We also
THEOREM
The determinant A
(a l b 2
...
kn )
may
be
& A
where r
is
(5)
,
(6)
is
any one of
. . .
the
numbers
1, 2,...,
n and x
any one of
(7)
the letters a, 6,
k.
Moreover,
= a A +b E +...+k K = y X +y X +...+y Xn
8
r 3 r
s
r)
tl
(8)
1,
2,...,
two different numbers taken from two different letters taken from a, 6,..., k. x, y are
where
r, s are
n and
3.
Preface to
4-6
to a group of problems that
We
full
come now
depend
for their
discussion on the implications of 'rank' in a matrix. This full discussion is deferred to Chapter VIII. But even the more
elementary aspects of these problems are of considerable 4-6. importance and these are set out in
4.
The
4.1.
solution of
non-homogeneous
linear equations
The equations
a l x+b l y+c^
0,
=
c2
are said to be
homogeneous
equations
a^x+b^y
c ly
31
?/,..., t
so
ar x+br y+...+kr t
are
it is
lr
(r
l,2,...,ra)
all satisfied,
these equations are said to be CONSISTENT. If the equanot possible so to choose the values of x, /,...,
,
y =
0,
3x
2y
For example,
x+y
2,
are satisfied
when x
xy =
0,
3x~2y =
x, y,...,
are consistent, since all three equations 2, 1, y 1; on the other hand, x+y
6 are inconsistent.
n non-homogeneous
t,
linear equations, in
the n variables
ar z+b r y+...+kr t
lr
(r
...
1,2,.. .,71).
(9)
Let A (a 1 b 2 ar b n ... in A.
,
...k n )
and
10,
let
A n Bn
r
be the co-factors of
the result
Since,
by Theorem
Ja A =
r
A,
2b A =
r
0,...,
(9)
by
its
corresponding
and
adding
that
is,
is
A#
(/!
6 2 c3
...
i n ),
(10)
the determinant obtained by writing I for a in A. Similarly, the result of multiplying each equation
(9)
by
its
corresponding
and adding is
liB!+...+l n Bn
Ay
and so
on.
(a,l 2 c 3
...
fcj,
(11)
have a unique solution given by (10), (11), and their analogues. In words, the solution is A x is equal to the determinant obtained by putting I for a in A;
the equations
(9)
*
.
When A
for b in A;
and so on'.f
When A
of (10), (11), and their analogues is also zero. determinants are zero the equations (9) may or
sistent:
to a later chapter.
f Some readers may already be this result the differences of form are unimportant.
32 5.
The solution of homogeneous linear equations THEOREM 11. A necessary and sufficient condition that not all zero, may be assigned to the n variables x, ?/,..., t
the
values,
so that
n homogeneous
equations
ar x+b r y+...+kr t
hold simultaneously
5.1.
is (a l b 2
...
(r
l,2,...,n)
(12)
kn )
0.
NECESSARY.
satisfied
by values
,
of#, ?/,..., not all zero. LetA^E (a l b 2 ...&) and let A r JBr ,... be the co-factors of ar b r ,... in A. Multiply each equation (12) by its
,
corresponding
Similarly, AT/
least
one of x,
and add; the result is, as in 4.3, A# == 0. 0, Az = 0,..., A = 0. But, by hypothesis, at be zero. ?/,..., t is not zero and therefore A must
Ar
0.
5.21.
In the
first
Omit
place suppose,
from
(12)
...
nl eqxiations
=
2,3,...,rc),
further,
that
A^^to.
(13)
b r y+c r z
+ + k = -a
(r
coefficients
4.3,
in place of A,
we obtain
Az =
x
(b 2
d4
...
kn )x
= C^x,
...,
= A&
But
1
B^,
C^f,
where ^
0),
satisfying the
equations
4 1 +6 1 5 1 +...
0,
and hence
is a sufficient I. This proves that A corresponding to r 0. to hold provided also that condition for our result
omitted equation
and some other first minor, say C89 is not zero, an of the letters a and c, of the letters x and z, and of interchange the suffixes 1 and s will give the equations (12) in a slightly changed notation and, in this new notation, A l (the C8 of the and if any one old notation) is not zero. It follows that if A
If
Al =
A ^
THE MINORS OF
first
A,
DETERMINANT
all zero,
33
minor of A
y,...,
t
is
may
be assigned
to x,
all satisfied.
5.22. Suppose
now
that
arid that
every
first
minor of
A is zero; or, proceeding to the general case, suppose that all minors with more than R rows and columns vanish, but that at least one minor with R rows and dplumns does not vanish.
ing suffixes) so that one non-vanishing minor of R rows is the in the determinant (a x 6 2 ... e n+l ), where e first minor of a
letters
and interchang-
denotes the (JK+lJth letter of the alphabet. Consider, instead of (12), the R+l equations
ar x+b r y+...+e r \
(r
1,2,...,
R+l),
.
(12')
where A denotes the (/?+l)th variable of the set x, y,... The determinant A' EE (a^ b 2 ... e R+l ) = 0, by hypothesis, while the minor of a t in A' is not zero, also by hypothesis. Hence, by
5.21, the equations (12') are satisfied
when
A
= 4i, y=B^
...,
= J0;,
(14)
where A[,
B[,.,.
Moreover,
A[ 7-0.
Further,
if
E+ < r < n,
1
,
being the determinant formed by putting ar b ry ... for a v b lt ... in A', is a determinant of order JR+1 formed from the coefficients
zero.
of the equations (12); as such its value is, by our hypothesis, Hence the values (14) satisfy not only (12 ), but also
7
ar x+br y+...+er X
(R+l
<r<
n).
for x,
Hence the n equations (12) are satisfied if we put the values (14) variables in (12) other y,..., A and the value zero for all
in (14) is not zero.
34
THE MINORS OF A
If
and
if,
further,
no
first
minor of
of the
row (column);
Ar
"A.
=^= -=K B
S
'
~K;
ar x+br y+...+kr t
(r=l,2,...,n).
(15)
for,
They Theorem
are
satisfied
by x
= Av
y = B
1
l9 ...,
K^,
by
10,
^ = A = 0, = 2,3,...,n). K =
(r
2, 3,..., n,
and consider
b r y+cr z+...+kr t
4.3,
= -a
(r
s).
in place of
...
k n )y
k n )z
(6i c 2
...
= =
x,
(a l c 2
(6 X
2
...
kn )x,
...
d3
i w )a;,
and
no
s
suffix s.
s
That
s
is,
we have
....
A y=B
(15),
A z = C x,
x, y,...,
t
Hence
= Kv = A v y = Bv -4^! = ^^, A C = A C
x
...,
8,
....
(16)
=A
r,
y=S
is
r,
...,
= K,,
is
as a solution of (15).
proved.
A B =A B
r 8 s
If n
fi rs t
r
minor
by
zero,
A B
8
8,
etc.,
equation
dr
A A8
5:
7?
-~^r V' K
8
if / (17) V
35
and
EXAMPLES
*
1
.
III
By
a rx + b ry + c rz + d r t prove that
ar
(r
=
<
0, 1,2, 3),
2 {^\{a~
a
ar
<*'
'
>
c){a~d}
*> 3) "
Obtain identities in
of
a.
o, 6, c,
%/2. If
a h
9
h
b
f
c
g, f, c in A,
prove that
2
aG
3.
-\-2hFG + bF -}-2gGC-\-2fFC-}-cC
2
<7A.
0, etc.
EC = F GH =
AF,...,
of Example 2 is equal to zero, prove that and hence that, when C -f- 0,
so also are
v/
Prove also that, if a,..., h are real numbers and C is negative, A and B. [A = be / 2 F = ghaf, etc.] 4. Prove that the three lines whose cartesian equations are a r x+b r y+cr (r = l,2,3)
,
jbhen
if
(Oi& 2 c 3)
0.
HINT. Make equations homogeneous and use Theorem 1 1 5. Find the conditions that the four planes whose cartesian equations = 1, 2, 3, 4) should have a finite point in are ar x-\-b r y-\~c T z-\-dr (r
common.
6.
given circles
(r
1, 2, 3)
0.
ffi
03
36
By writing
+y =
0,
prove that the circle in question is the locus of the point whose polars with respect to the three given circles are concurrent.
7.
circles
8.
Express in determinantal form the condition that four given may have a common orthogonal circle.
Write down
P
3
3
XZ
as a product of factors, and, by considering the minors of x, x 2 in the determinant and the coefficients of x, x 2 in the product of the factors,
Extend the
results of
Example
7. Laplace's
expansion of a determinant
7.1.
The determinant
^(-lfa
where the sum
the values
is
1, 2,...,
b a ct ...k e
(18)
r, 5,...,
in
is
the total
number of
s,...,6.
show that the expansions of A n by its rows and columns are but special cases of a more general procedure. The terms in (18) that contain ap b q when p and q are fixed, con,
We now
stitute
sum
ap bq (AB)pq
is
= ap bq ^c
...k e
(19)
in
to
2)! ways of assigning some order, excluding p and q. ,..., Also, an interchange of any two letters throughout (19) will reproduce the same set of terms, but with opposite signs pre-
6 the values
1, 2,...,
37
determinant, either (AB)pq or (AB)pq is equal to the determinant of order n 2 obtained by deleting from A w the first and
second columns and also the pth and qth rows. Denote this determinant by its leading diagonal, say (c d u ...k0), where u,..., 6 are 1, 2,..., n in that order but excluding p and q.
t
,
Again, since (18) contains a set of terms ap bq (AB) pq and an interchange of a and 6 leaves (AB) pq unaltered, it follows (from the definition of a determinant) that (18) also contains a set of terms aq bp (AB)pq Thus (18) contains a set of terms
.
(ap b q
-aq bp )(AB) pq
(c t
du
...k e ),
a q bp
may
is
be denoted by
(ap b q ), so that
terms in (18)
(<*
p b q )(c d u ...k e ).
i
the terms of (18) are accounted for if we take possible pairs of numbers p, q from 1, 2,..., n. Hence
Moreover,
all
all
An
IKA)M-^)>
where the sum is taken over all possible pairs p, q. In this form the fixing of the sign is a simple matter, for the leading diagonal term ap bq of the determinant (ap b q ) is prefixed by plus, as is the leading diagonal term c d u ...k0 of the determinant (c d u ...ko). Hence, on comparison with (18),
t
A
where
(i)
= 2(-l)*(aA)M-^)>
1, 2,...,
(20)
the
t,
(ii)
(iii)
...,
6 are
N
p,
is
the total
q, t,u,..., 0.
The sum
7.2.
When
once the argument of 7.1 has been grasped argument extends to expansions such as
.k e )
9
it is
(21)
and so
on.
We
38
an expansion of A by second minors, (21) an expansion by third minors, (22) an expansion by fourth minors, and so on.
Various modifications of
7.1 are often used.
In
7.1
we
expanded A by
;l
its first
it
by any two
(or
two columns: we may, in fact, expand more) columns or rows. For example, A w is
equal to
Co
as
we
see
by interchanging a and
and
also 6
and
d.
The
Laplace expansion of the latter determinant by its first two columns may be considered as the Laplace expansion of A w by
its
third
7.3.
Determination of sign in a Laplace expansion The procedure of p. 29, 1.4, enables us to calculate with ease
the sign appropriate to a given term of a Laplace expansion. Consider the determinant A |ars |, that is, the determinant
its
rth
(23)
A(r l9 r 2 8 l9 8 2 ) denotes the determinant obtained by the r x th and r 2 th rows as also the ^th and s 2 th deleting columns of A, and (ii) the summation is taken over all possible
'
where
(i)
pairs r v r 2
(r x
<
r 2 ).
Now, by
that
where A(r 1 ;5 1 )
is
appears in the
(r 2
<r
<s
2,
l)th
39
in the expansion o
St
are given
by
for &(r v r 2 ]s v s 2 )
is
obtained from
A^
by
deleting the
row
Nri+ri-Ui+Sa
(24)
That
is
expansion
numbers of
sum of the row and column in the first factor of the term. appearing The rule, proved above for expansions by second minors, easily extends to Laplace expansions by third, fourth,..., minors.
(
l)
where a
is the
the elements
7.4. As an exercise to ensure that the import of (20) has been grasped, the reader should check the following expansions of determinants.
a2 a3
62
c2
63
c3
ci
3
*i
The
signs are,
by
(20),
by
(24),
c2
c3
d2 ds
c?4
e2
e3
e.
c4
The
sign
is
most
easily fixed
by
(24),
which gives
it
to be
l)
l+5+1+2 .
40
'(in)
ca Ca
&2
c3
c4
&4 ^4
(a 2 & 3 c 4 ), factor of a r in A a
.
whcro Aj
Aa
== (0364^),
A3
-E (a^b l c 2 ) t
and A\
is
the co-
may
The convention
single
is
that
when a
is
term
(as
here r
literal suffix is repeated in the repeated) then the single term shall the terms that correspond to different
is applicable only when, by the one knows the range of values of the suffix. In what context, follows \^e suppose that each suffix has the range of values
ars xs
an
which denotes
8
2a =
1
rs
xsy
(3)
n
bjs>
which denotes
2a
n
r
rj
^s>
W
xr xs(^)
n
rs
ars xr xs
which denotes
r
^L 2 8=1a =l
In the
last
example both
and
are repeated
and
so
we must
sum with
The repeated suffix is often called a 'DUMMY SUFFIX', a curious, but almost universal, term for a suffix whose presence implies the summation. The nomenclature is appropriate because the meaning of the symbol as a whole does not depend on what letter is used for the 'dummy'; for example, both
ars xs
and
arj x^
42
On
W*
W.
r
(6)
^a =
1
rs
xs
(r=
l,2,...,n).
Thus, in
(6), r is
of the values
1, 2,...,
to be thought of as a suffix free to take any one n; and no further summation of all such
in considering
linear equations.
2. Consider
n variables x
(i
ari x t
= = (r
1, 2,...,
n)
l,2,...,n).
When we
the forms
substitute
Xt
(7)
*>isX,
(<=l,2,,..,n),
(8)
become
ari b i8
(r=l,2,...,n).
(9)
Now
where
consider the
n equations
crs E=
\crg
\
ari b ia
(11)
If the determinant!
of values
(i
But,
when
the
equations
^ ^^ = ^^
=
is
\
= 0, then (10) is satisfied by a set which are not all zero (Theorem 11). 1, 2,...,?i) satisfy equations (10), we have the n i
=
Q>
(12)
and
so
all
EITHER
OB
f The reader who is unfamiliar with the summation convention is recommended to write the next few lines in full and to pick out the coefficient of
9
in
i-l
{
Compare Chap.
I,
4 (p. 21).
43
t>x.
are satisfied
is, \b ra
=
\
by a
\crs
\
all zero;
that
0.
Hence
\b rs
if
0,
\ar8 \,
is
\
zero.
\,
\,
The determinant \c rs i.e. \ari b is is of degree n and of degree n in the ft's, and so it is indicated that
|cr,|
in the a's
= *Klx|6J,
(13)
where k
By
arr
when
--
s,
1, it is
indicated that
|cl
= l|X|6w
|.
(14)
is an indication rather than a proof of the theorem contained in (14). In the next section we important give two proofs of this theorem. Of the two proofs, the second is
The foregoing
it is
certainly the
more
artificial.
3.
/*
THEOREM
13.
Let
\afit
\b ra
where
3.1
First proof
suffix
In this proof, we shall use only the Greek letter a as a implying summation: a repeated Roman letter
dummy
will not
stands for
a 12 fc 21
\cpq
44
in the expansion of this determinant. This we can do by cona sidering all a lx other than a lr to be zero, all a 2x other than 2s
to be zero,
and so
on.
Doing
this,
we
\cpq
a lr brl
a ir br2 a 2s bs2
2s
b sn
anz bzl
a nz bz2
a nz bzn
b rl
b r2
bsl
bs2
Z2
'
Unless the numbers r, 5,..., z are all different, the *&' determinant is zero, having two rows identical. When r, $,..., z are the numbers 1, 2,..., n in some order, the '6' determinant is N equal to (l) \bpq where N is the total number of inversions of the suffixes in r, 5,..., z (p. 11, 2.2, applied to rows). Hence
\,
is
the values
taken over the n\ ways of assigning is the 1, 2,..., n in some order and
number of
r,
5,...,
z.
But the
is
pq
3.2. Second proof a proof that uses a single suffix notation. Consider, in the first place,
A
Then
(17)
for
a single term.
45
deter-
we
In the determinant
(17)
take
3.5.
Then
(17)
becomes
AA'
=
1
a1
+6
a2
aij8 1
+6
-1
and
so,
on expanding the
last
determinant by
its first
two
(18)
columns,
(a l b 2 ...k n ), A'
!
= (a^
ki
* n )- The determinant
.
.
kn
in
in each
element
if
of
principal diagonal
and
elsewhere,
is
unaltered
we
write
= r +a
2
rn + 1
=
.
.
Hence
Ki
I
When we expand
its first
n columns,
-10.0
-1
.
7.3)
-1
an
...-
(_l)n
Moreover, 2n is an even number and so the sign to be prefixed to the determinant is always the plus sign. Hence
ln
*n
is
a /i
Kn
(19)
which
4.
another
13.
called the
In
forming the first row of the determinant on the right of (19) we multiply the elements of the first row of A by the elements
of the successive columns of A'; in forming the second row of the determinant we multiply the elements of the second
row of A by the elements of the successive columns of A'. The process is most easily fixed in the mind by considering (18). A determinant is unaltered in value if rows and columns are interchanged, and so we can at once deduce from (18) that, if
A'
=
A*
then (18)
may
AA' AA'
=
(20)
=
(21)
f See Chapter VI.
47
are multiplied
by the
elements of the rows of A'; in (21) the elements of the columns of A are multiplied by the elements of the columns of A'. The
of these is referred to as multiplication by rows, the second as multiplication by columns. The extension to determinants of order n is immediate:
first
several examples of the process occur later in the book. In the examples that follow multiplication is by rows: this is a matter
of taste, and
many
by columns or by
however,
by applying the
rules
shall arrange for multiplying determinants to particular examples. the examples in groups and we shall indicate the method of solution
We
one example
in
each group.
2
(<*-y)
The determinant
is
the product
by rows
1
of the
two determinants
2a
-2)8
a
ft
a2
1 1
-2y
of these
]8).
7
is
By Theorem
second to
(j8
8, tlx) first
equal to
2(/?
y)(y
a)(a
/?)
and the
y)(y
a)(a
[In applying Theorem 8 we write down the product of the differences and adjust the numerical constant by considering the diagonal term of
the determinant.]
\/2.
The determinant
of
Example
3, p. 19, is
the product
by rows of the
two determinants
and so
3.
is
zero.
Prove that the determinant of order n which has (ar a g )* as the element in the rth row and 5th column is zero when n > 3.
Evaluate the determinant of order n which has (o r element in the rth row and 5th column: (i) when n =
4.
3
a,)
4,
(ii)
as the
when
n >
4.
48
'5.
/6.
results of
Examples
3 3
and 4
to higher powers of a r
a8
(a-x)
(6
(c
(a-?/)
(b
2/)
3 3
(a-z)
(6
z)
(c
a3
63
c3
x) x)
(c
y)
z)
z3
=
7.
9a6rx^/z(6
c)(c~a)(a
b)(y
z)(z
x)(x
y).
(a-x)
(b
2 2
(a-?/)
(6
i/)
2 2
x)
(c-x)
is
(c-y)
a2
62
c2
zero
when k
1,
and evaluate
it
when k
and when k
>
3.
if s r
1
ft
a
a
8
82 S3
Hence (by Theorem 8) prove that the first determinant is equal to the 2 product of the squares of the differences of a, y, 8; i.e. {(a,/?,y, 8)}
,
.
9.
Prove that
(x
a)(x
j3)(x
y)(x
8) is
a factor of
==
X2
X3
X4
s r in
and
S2
8
3
4
82
83
8*
y
If
A!, then
.
we have anything but O's in the first four places of the last row of we shall get unwanted terms in the last row of the product A! A 2 So we try 0, 0, 0, 0, 1 as the last row of A! and then it is not hard to see that, in order to give A! A 2 = A, the last column of A 2 must
be
\j x,
x 2 x 3 x4
,
,
The
factors of
follow from
Theorem
8.
'.
49
111
3
j8
111
ft
-}-jS
a3
y 3 y
Note how the sequence of indices in the columns of the first determinant affects the sequence of suffixes in the columns of the last
determinant.
v/11.
54
So
SS
So
*1
*2
*8
X*
where
12.
sr
a r -f
-f y
r
.
Extend the
results of
Examples 8-11
to determinants of higher
orders.
13. Multiply the determinants of the fourth order whose rows are given by 2 x r y r x* + y* 1 r *H2/ r - 2 *r -22/r 1;
and
o r,
=
(x r
1,
2,
3,
4.
xa ) z + (y r
ya Y>
where Hence prove that the determinant |o r zero whenever the four points (<K r ,y T ) are
,,|,
concyclic.
14. Find a relation between the mutual distances of a sphere.
five points
on
15.
By
a + ib
c-}~id
-(cid)
ib
prove that the product of a sum of four squares by a sum of four squares is itself a sum of four squares.
16.
3 s three and hence prove that (a* + b + c*-3abc)(A* + B +C*-3ABC) 3 3 3 + F + Z 3XYZ. may be written in the form 2 62 a 2 6c, B ca, C = c a&, then Prove, further, that if
3 3 3 Express a -f 6 + c
5.2) of
order
A =
17. If
-hy,
ti
of(x,y);Sn
Y = hx-\ by; (x^y^, (x ;y two sets of values + yiY Sl2 XiX t -\-y Y etc.; prove that*S\ = S2l
2 2)
19
l
2t
and that
4702
Sn
/*
50
ISA. (Harder.) Using the summation convention, let S^.a rs x rx s where a: 1 x 2 ,..., x n are n independent variables (and not powers of x), r a rs x%; AjU = x ^X r be a given quadratic form in n variables. Let r ^ wliere (rrj, a^,..., x%), A = !,. ^> denotes n sets of values of the variables
or
1
,
x 2 ,..., x n
Prove that
18s. (Easier.) Extend the result of Example 17 to three variables (.r, T/, z) and the coordinates of three points in space, (x lt y lt z^), (x z i/ 2 z 2 ),
,
5. Multiplication of
arrays
two arrays
y\
in
which the number of columns exceeds the number of rows, and multiply them by rows in the manner of multiplying determinants. The result is the determinant
A
If
=
(1)
we expand A and pick out the terms involving, say, A 72 we see ^ia ^ they are /^yo^jCg b 2 c 1 ). Similarly, the terms b 2 c l ). Hence A contains a term in P*yi are (b l c 2
where (b 1 c 2 ) denotes the determinant formed by the last two columns of the first array in (A) and (^72) the corresponding determinant formed from the second array.
It follows that A,
(ftiC 2 )(j8 1
when expanded,
y2
)
contains terms
(2)
+ (c
moreover
(2)
accounts for
all
the expansion of A. Hence A is (2); that is, the sum of all the products of corresponding determinants of order two that can be formed from the arrays in (A).
Now
and
t
consider a,
b,...,k,...,t,
<
JV;
notation in Greek
51
obtained on multiplying
a
IS
/i
*n
^n
EE
(3)
This determinant, when expanded, contains a term (put nth letter, equal to zero)
all
(4)
which
the product of the two determinants (a l b 2 ... k n ) and (! j8 2 ... K n ). Moreover (4) includes every term in the expansion of A containing the. letters a, 6,..., k and no other roman letter.
is
Thus the full expansion of A consists of a sum of such products, the number of such products being NC the number of ways of choosing n distinct letters from N given letters. We have therett
,
fore
THEOREM
The determinant A,
of order n, which
that have
is
obtained
on multiplying by rows two arrays roles, where n < N, is the sum of all
columns and n
>
The determinant
adding
so
formed
is,
in fact, the
product by rows
to each array.
For example,
52
is
ft
az
bz
<*3
and
is
also obtained
determinants
*>
ft
EXAMPLES
1.
The arrays
1
1
000
-2.r 4
-2</ 4
1 1
*J
2/J
*4
2/ 4
have 5 rows and 4 columns. By Theorem 15, the determinant obtained on multiplying them by rows vanishes identically. Show that this result gives a relation connecting the mutual distances of any four points in a plane.
2.
3.
If a,
)3,
y,...
denotes the
sum
are the roots of an equation of degree n, and of the rth powers of the roots, prove that
if
sr
8, p. 48,
and Theorem
14.
4. Obtain identities by evaluating (in two ways) the determinant obtained on squaring (by rows) the arrays
(i)
a
a'
b
b'
c
c'
(ii)
abed
b'
c'
a'
df
5.
let
S ^ax z +
...
+ 2fyz +
...,
X ^ ax + hy+gz,
==
hx + by+fz,
Z^
gx+fy-\-cz
53
B,...,
be the co -factors of
2/2
a, 6,...,
h in the discriminant of S.
^22/1-
Let
2/iZ 2
2i
*?
Zi# 2
2 2 ^i
^12/2
Then
Solution.
The determinant
is
which
is
2i
z2
2
2/a
F2
and
so
is
But
(Yj
2 ),
when written
in full, is
x 2 + by i +/z 2
which
is
xl
*2
yL
2/2
Zi
h
9
f
C
Z2
is
the
sum
6.
The
It
is
sometimes useful to be able to write down the product of a determinant of order n by a determinant of order m. The
We see that the result is true by considering the first and last
determinants when expanded by their last columns. The
first
54
by the
two determinants
the signs having the same arrangement in both summations. A further example is
d,
ft
^4
C4
^4
a result which becomes evident on considering the Laplace expansion of the last determinant by its third and fourth
columns.
X
3
X
#2
&2
C2 c3
ft
a3
63
al
+ *3ft
a 3 a2
This
is,
perhaps, easier
if
little less
elegant.
JACOBI'S
I
.
Jacobi's theorem
THEOREM
16.
Ar B
,
r ,...
,
a determinant
A a
_ =
(a l o 2
...
/c
x
)t
).
Then
If we multiply
A' == (A 1
...
Kn =
)
A"- 1
--
at
bl
X'
A'
A,
B,
an
bn
B
AA'
we obtain
A
r
--
for a r
A +b B8 + ...-{- k
8 r
equal to
A when r
when A
then
is
and
is
s.
Hence
so that,
AA'
^
s
0,
A'
=
0,
is
A A
71
,
71
-1
.
But when A
is
r
0,
A B
A s Br =
A'
For, by
Theorem
12, if
of A' by
its first
two columns
sum
of zeros.
.
Hence when A
0,
71
-1
DEFINITION. A'
is called the
of A.
THEOREM
17.
With
to
1
in A'
is
equal
the notation of /l ~ 2
;
and
and
suffixes.
When we
56
JACOBI'S
THEOREM AND
A
.
.
ITS
a.
EXTENSIONS
we obtain
wherein
are zero.
all
terms to the
left
Thus
A B9
When A = A = 0, the
0, this
result follows
gives the result stated in the theorem. When by the argument used when we con16;
sidered A' in
Theorem
if
each
B C
r
B C
8
is
zero.
A',
Moreover,
we
where
is
we multiply by rows
(1)
where the j's occur in the rth row and the only other non-zero terms are A's in the leading diagonal. The value of (1) is
General form of Jacobi's theorem Complementary minors. In a determinant A, of order n, the elements of any given r rows and r columns, where r < n, form a minor of the determinant. When these same rows
2.
2.1.
and columns are excluded from A, the elements that are left form
57
which,
sign
complementary minor of the original The sign to be prefixed is determined by the rule that minor. a Laplace expansion shall always be of the form
prefixed,
is
called the
where yn ^r
is
Thus, in
*2
c2
d2
(1)
by
its first
+(d 1
C 3 )(b 2
d4 ).
THEOREM 18. With the notation of Theorem 16, let $Jlr be a minor of A' having r rows and columns and let yn _r be the complementary minor of the corresponding minor of A. Then
CYT>
MJf
v /
fl
~~T
^"^ Z\
Ar
1
.
In
particular, if
is the
determinant
then t in the
dispose of the easy particular cases of the theorem. If r 1, (3) is merely the definition of a first minor; for example, in (1) above is, by definition, the determinant (b 2 c 3 d4 ). l
Let us
first
we see by Theorem 12. 0. and whenever A Accordingly, It remains to prove that (3) is true when r > 1 and A ^ 0. So let r > 1, A ^ 0; further, let /zr be the minor of A whose elements correspond in row and column to those elements of
If r
1
>
and A
0,
0,
as
1
(3) is
true whenever r
Then, by the definition of complementary minor, due regard being paid to the sign, one Laplace expansion
A' that compose 2Rr
4702
.
58
JACOBI'S
THEOREM AND
ITS
r-
EXTENSIONS
form
other elements other elements
It
thus sufficient to prove the theorem when 9W is formed from the first r rows and first r columns of A'; this we now do. be the next Let -4,..., E be the first r letters, and let JP,...,
is
AI
FI
Aj_
an
en
fn
kn
first
nr elements of the leading diagonal of the and other elements of the nr rows
1 's
all
last
we obtain
A
fr
If Kr
/I
fr+l
fn
*>+!
If
b Kn
That is to say,
and so
(3) is
true whenever r
>
and A
^ 0.
CHAPTER V
SYMMETRICAL AND SKEW-SYMMETRICAL DETERMINANTS
The determinant for every r and s. The determinant
1.
\a
is
\
= aw
ar8
\ars
is
s.
of the definition that the elements in the leading diagonal of #.. such a determinant must all be zero, for a^
2.
THEOREM
19.
zero.
The value of a determinant is unaltered when rows and columns are interchanged. In a skew-symmetrical determinant such an interchange is equivalent to multiplying each row of the original determinant by 1, that is, to multiplyHence the value of the whole determinant by (l) n ing a skew-symmetrical determinant of order n is unaltered when
.
it is
multiplied
by
w
(
l)
and
if
is
be zero.
3. Before considering determinants of even order
we
shall
examine the
first
odd
order.
Let
A r8
a skew-symmetrical deterrow and 5th column of |arg 1 of odd order; then A ra is ( minant l)*- ^, that is, = A r8 For A n and A v differ only in considering column A^ for row and row for column in A, and, as we have seen in 2, a column of A is (1) times the corresponding row;
|,
.
A rs we
n1
We
is
that
require yet another preliminary result: this time one often useful in other connexions.
60
SYMMETRICAL AND
THEOREM
20.
The determinant
au
a 22
(1)
s
(2)
is the co-factor
ra
of
a^ in
\ars \.
The one term of the expansion that involves no X and no Y is \ars \S. The term involving Xr Ys is obtained by putting S, all X other than JC r and all Y other than 7, to be zero. Doing A rs But, considerthis, we see that the coefficient of X r Y is s the Laplace expansion of (1) by its rth and (n-fl)th rows, ing we see that the coefficients of X r YH and of a rs S are of opposite A rs Moreover, when s sign. Hence the coefficient of X r Y is A is expanded as in Theorem 1 (i) every term must involve Hence (2) accounts for all the either S or a product X r Y s
,
.
. .
THEOREM
the square of
is
In Theorem
of order
let
is
2?i
1
;
20, let \a rs
Yr
r,
0,
be a skew-symmetrical determinant is zero (Theorem 19). Further, so that the determinant (1) of Theorem 20
\
now
is
ZA X X
rs
r
s.
(3)
Since
= 0, A A = A A (Theorem 12), or A* = A A or A = A AJA Hence (3) is and A /A n = AJA Z V4 ...iV4 nB ), (Z V4 (4) where the sign preceding X is (1)^, chosen so that
\a rs
rl
\
rs
sr
rr
ss
rr
ss ,
ls ,
rs
lr
tl .
11
22
Now
^4
n ,..., A nn
minants of order 2n
elements, of degree
1,
SKEW-SYMMETRICAL DETERMINANTS
then
(4) will
61
But
-n
and
is
an
a?
a perfect square. Hence our theorem is true when n It follows by induction that the theorem is true for
1.
all
integer values of n.
6.
The
Pfaffian
polynomial whose square is equal to a skew-symmetrical determinant of even order is called a Pfaffian. Its properties and relations to determinants have been widely studied.
Here we give merely sufficient references for the reader to study the subject if he so desires. The study of Pfaffians,
warrant
interesting though it may be, is too special in its appeal to its inclusion in this book.
G. SALMON, Lessons Introductory 1885), Lesson V.
S.
to the
Sir T.
to the History of Determinants, vols. i-iv This history covers every topic of determinant theory; it is delightfully written and can be recommended as a reference to anyone who is seriously interested. It is too detailed and exacting for the beginner.
MUIR, Contributions
(London, 1890-1930).
EXAMPLES VI (MISCELLANEOUS)
1.
Prove that
where
A rs
is
the co-factor of a rs in
\a r8
If a rs
a 8r and
=
\
\a ra
\.
0,
may
be written as
62
2.
EXAMPLES
If
VI
ar$
,
S=
n
r
xt
and
A;
<
n,
prove
that
a fct
is
independent of
a?!,
*..., xk .
1,
Theorem
10.
A4
and
A\,
JS 21 1
-^a
J3
Al
-^j
0,
B?
C\
BJ
CJ
B\
(7}
BJ
Cl
BJ
C\
17): use
HINT. B\C\
4.
B\C\
a4 A $ (by Theorem
Theorem
10.
is
a determinant of order
co-factor of afj in A,
and A
7^ 0.
ri
the
whenever
a n +x
=
+x
0.
6.
Prove that
1 1 1
sin a
sin/?
cos a
cos/?
sin2a
sin
2/9
cos2a
cos2j8
1
1
siny sin 8
sine
cosy
cos 8
sin2y
sin 28
cos2y
cos 28 cos2e
cose
sin2e
and the summation convention be used 6. Prove that if A = |ari (Greek letters only being used as dummies), then the results of Theorem 10 may be expressed as
|
t^r =
&**A mr
= =A=
ar ^A tti
Qr-A*..
when when
=56
a, *.
EXAMPLES
Prove also that transformation
then
7.
VI
63
if
variables
r
X by
the
^ X = r* &x = A^X^
8
9
t
Prove Theorem 9 by forming the product of the given circulant by the determinant (o^, 0*3,..., o>J), where Wi ... a) n are distinct nth roots
of unity.
8. Prove that, determinant
if
fr (a)
gr (a)
hr (a) when r
1,
2,
3,
then the
whose elements are polynomials in x, contains (x a) 2 as a factor. HINT. Consider a determinant with c{ = c^ c,, c^ = c, c a and use
the remainder theorem.
9. If the elements of a determinant are polynomials in x, and if r columns (rows) become equal when x o, then the determinant has r~ l as a factor. (xa)
10.
as
Verify Theorem 9 when n = 5, o x = a 4 = a 5 = 1, a, = a, and a 1 by independent proofs that both determinant and product are
(
equal to
o i +o + 3) ( a _
l)4( a
+ 3a 3 + 4a + 2a-f 1).
PART
II
MATRICES
4702
real or complex, either as separate x lt # 2 ,..., x n or as a single entity x that can be broken up into its n separate pieces (or components) if we wish to do so. For example, in ordinary three-dimensional space, given a set of axes through and a point P determined by its coordinates with respect to those axes, we can think of the vector OP (or of the point P) and denote it by x, or we can
entities
'
give prominence to the components x ly x^ # 3 of the vector along the axes and write x as (x v x 2 ,x^).
a single entity we shall refer to merely* a slight extension of a common practice in dealing with a complex number z, which is, in fact, a pair of real numbers x, y."\
it
as a 'number'.
X X
X =a
r
rl
x l +ar2 x 2 +...+arn x n
(r
1,...,
n),
(I)
wherein the ars are given constants. [For example, consider a change of axes in coordinate geometry where the xr and r are the components of the same
vector referred to different sets of axes.] Introduce the notation A for the set of n 2 numbers, real or
complex,
nn a
n ia a
n ln a
and understand by the notation Ax a 'number' whose components are given by the expressions on the right-hand side of equations as (1). Then the equations (1) can be written symbolically
X = Ax.
f
(2)
iii.
Cf. G.
68
with but
little
practice, the
sym-
when
equations such as
2.
(1)
a familiar fact of vector geometry that the sum of a JT 3 and a vector Y with vector JT, with components v 2
X X
components Y Y2 73
ly
,
is
a vector Z, or
X+Y,
with components
Consider
now a
'number' x
X given by
(2)
X = Ax,
where A has the same meaning as in
'number'
1.
given by
Y = Bx
2
where
numbers
.
b ln b nn
and
(3) is
r
n equations
(
Y = ^*l+&r2*2+-" + &rn#n
*>
2 i-i
n )'
by comparison with vectors in a plane or in denote by X-\-Y the 'number with components r r space,
naturally,
9
We
X +Y
But
X +Y =
r
r
(arl
+6rl K+(ar2 +6
a +...
+ (a n+&rnK,
r
numbers
a ln+ b ln
anl+ b nl
an2+
ft
n2
ann+ b nn
We are thus led to the idea of defining the sum of two symbols such as A and J5. This idea we consider with more precision
in the next section.
69
Matrices 3.1. The sets symbolized by A and B in the previous section have as many rows as columns. In the definition that follows we do not impose this restriction, but admit sets wherein the number of rows may differ from the number of columns.
,
DEFINITION
1.
An
array of
mn
a i2
tfoo
. .
am a9~
a
aml
am2
is called
a MATRIX.
of order n.
When
m=
the
array
is called
a SQUARE
MATRIX
A, or
is frequently denoted by a single letter other symbol one cares to choose. For by any example, a common notation for the matrix of the definition is [arj. The square bracket is merely a conventional symbol
In writing, a matrix
a,
or
(to
mark the
fact that
we
and is conveniently read as 'the matrix'. As we have seen, the idea of matrices comes from linear substitutions, such as those considered in 2. But there is one important difference between the A of 2 and the A that
denotes a matrix. In substitutions,
A is thought of as operating
and so
on some 'number*
x.
The
We now
introduce a
number of
make
precise the
A + B,
DEFINITION 2. Two matrices A, B are CONFORMABLE FOR ADDITION when each has the same number of rows and each has the same number of columns.
DEFINITION 3. ADDITION. The sum of a matrix A, with an element ar8 in its r-th row and s-th column, and a matrix B, with an dement brt in its r-ih row and s-th column, is defined only when
70
A,
row and
s-th
column.
The sum
is
denoted by A-}-B.
T^i]'
[
The matrices
are distinct; the
first
ai
2
[aj
order
2,
U
e.g.
OJ
K
[a 2
Ol
rfri
cJ cj
- K+&I
L
ql.
C 2J
OJ
[6 2
a 2+ 6 2
DEFINITION
are those of
4.
The matrix
I.
is that
multiplied by
5.
DEFINITION
SUBTRACTION.
A~B is
A~}-(B).
matrix
A where A is the one-rowed For example, the matrix the matrix [~a, b]. Again, by [a, b] is, by definition,
5,
*
Definition
ai
ci
and
this,
by Definition
3, is
equal to
c2
DEFINITION
6.
^1
is called
DEFINITION 7. The matrices A and B are said to be equal, and we write A B, when the two matrices are conformable for addition and each element of A is equal to the corresponding element
ofB.
It follows
'A
BW
from Definitions 6 and 7 that 'A mean the same thing, namely, each ar3
.
=
is
B' and
equal to
the corresponding b rs
DEFINITION 8. MULTIPLICATION BY A NUMBER. When r a number, real or complex, and A is a matrix, rA is defined
element of A.
is to
71
A=
"!
bjl,
then
3.4
=
we
36
In virtue of Definitions
3, 5,
and
8,
2A
3
instead of instead of
A-\-A,
5
2 A,
and so
reader,
on.
By
we can
an easy extension, which we shall leave to the consider the sum of several matrices and write,
A+A+A-\-(l-\-i)A.
is based directly on the addition and subtraction of their elements, which are real or complex numbers, the laws* that govern addition in ordinary algebra also govern the addition
of matrices.
algebra are
:
in
is
(a+b)+c = a+(b+c),
either
(ii)
b+a;
(iii)
= a-\~b. = ra+r&, 6) (a The matrix equation (A + B) + C = A + (B+C) is an immediate consequence of (a-\-b)-\-c = a-\-(b-\-c), where the small
r(a+6)
letters denote the elements standing in, say, the rth row and 5th column of the matrices A, B, C and either sum is denoted B-\-A by A-\-B-\-C. Similarly, the matrix equation A-{-B
;
an immediate consequence of a-{-b = b+a. Thus the associative and commutative laws are satisfied for addition and subtraction, and we may correctly write
is
A+B+(C+D) = A+B+C+D
and so
on.
On
we may
write
r(A
+ B) = rA+rB
72
when
= ra+rb
RA RB
3
b are numbers, real or complex, we have not yet when assigned meanings to symbols such as R(A-}-B), are all matrices. This we shall do in the sections that It, A,
when
r* a,
follow.
4.
= 6 z +6 z = 621*1+628
11
1
12
2,
*a>
where the a and b are given constants. They enable us to express xl9 x 2 in terms of the a, 6, and Zj, z 2 in fact,
;
If
we
1,
we may
tions(l)as
x==Ay>
z
y=Bz
and write
We
x=zABz,
provided we interpret bolic form of equations be the matrix
(2 a)
that
is,
we must
AB
2 2
6 21 6 21
0n& 12 +a 12 6 22
a 21 6 12 +a 22 6 22 ~|. J
^
'scalar pro-
This gives us a direct lead to the formal definition of the product A B of two matrices A and B. But before we give this
formal definition
it will
scalar product or inner product of two numbers DEFINITION 9. Let x, y be two 'numbers', in the sense of 1, having components x ly # 2 ,..., xn and ylt y2 ,..., yn Then
5.
.
The
73
INNER PRODUCT
the inner
numbers.
If the two
and
m ^ n,
product
is
not defined.
In using this definition for descriptive purposes, we shall permit ourselves a certain degree of freedom in what we call
a 'number'. Thus, in dealing with two matrices [ars ] and we shall speak of
[6 rJ,
as the inner product of the first row of [ars ] by the second column of [brs ]. That is to say, we think of the first row of a 12 ,.--> a ln ) and of [ar8] as a 'number' having components (a n
,
[b rs \
as a 'number' having
components
6.
Matrix multiplication
6.1.
DEFINITION
10.
Two
when
matrices A,
the
are
CONFORMABLE
AB
number of columns in
is
number of rows in B.
DEFINITION 11. PRODUCT. The product AB is defined only when the matrices A, B are conformable for this product: it is then defined as the matrix ivhose element in the i-th row and k-th
column
of B.
It
is
is the
has as
an immediate consequencef of the definition that many rows as A and as many columns as B.
best
AB
The
way
conformal
is
of seeing the necessity of having the matrices to try the process of multiplication on two nonif
A = Ki,
UJ
AB
and a column of
B= p [b
two
1 2
Cl i,
cj
consists of one letter
consists of
letters, so
that
we cannot
One
f The reader is recommended to work out a few products for himself. or two examples follow in 6.1, 6.3.
470'J
74
On
the
BA = fa ^l.fa
L&2
2J
column.
THEOREM BA.
J2.
The matrix
As we have
if
just seen,
we can form
AB
and
BA
only
of
rows of A.
When these products are formed the elements in the ith row and &th column are, respectively,
in
AB,
A
B
by the kth
in
BA,
by the 4th
sa4l
column of A.
Consequently, the matrix matrix as BA.
6.3.
AB
is
AB
Pre -multiplication and post-multiplication. Since and BA are usually distinct, there can be no precision in
by B'
until it is clear
whether
it
shall
mean
AB or
BA.
Accordingly,
we
multiplication and pre-multiplication (and thereafter avoid the use of such lengthy words as much as we can!). The matrix A
post-multiplied
by
is
the matrix
AB\
the matrix
pre-
multiplied
by
B is the matrix
BA.
Some
Lc
^J
lP
^J
h
b
glx pi
f\
c\
=
pw;-|-Ay+0z"|. l^x-i-by-^fzl
\y\ LzJ
[gz+fy+cz]
aft instead
75
law for multiplication. One of the the product of complex numbers is the dislaws that govern ab--ac. This tributive law, of which an illustration is a(6+c)
distributive
The
law also governs the algebra of matrices. For convenience of setting out the work we shall consider only square matrices of a given order n and we shall use A, B ...
t
We
is
down
in the tth
In
this notation
we may
write
])
ik\
(Definition 11)
3)
= AB+AC
Similarly,
(Definition 11).
6.5. The associative law for multiplication. Another law that governs the product of complex numbers is the associative law, of which an illustration is (ab)c a(6c), either being com-
abc.
We shall now consider the corresponding relations between three square matrices A, B, C> each of order n. We shall show that
(AB)C
= A(BC)
symbol
let
and thereafter we
of them.
ABC
to denote either
Let
B=
[6 ifc ],
[c ik ]
and
EC =
[yik ],
where, by
Definition 11,
"*
76
row and
i ^
Similarly, let
AB =
[j8 /A; ],
Then
is
b il c at
(2)
(2) gives the same terms as (1), though order of arrangement; for example, the term in a different 3 in (1) is the same as the term 2, j corresponding to I
corresponding to
3,
Hence A(BC)
to denote either.
2 in
(2).
ABC
We
6.6.
use
2
,
A*,... to
AA, AAA,
etc.
ciple involved in the previous work has been grasped, it is best to use the summation convention (Chapter III), whereby
n
b i} c jk
denotes
^b
n
n
it
ik
By
extension,
au bv cjk
denotes
j 2a
i/
6 fl c
#
it
is
and once the forms (1) and (2) of 6.5 have been studied, becomes clear that any repeated suffix in such an expression a 'dummy' and may be replaced by any other; for example,
Moreover, the use of the convention makes obvious the law of formation of the elements of a product ABC...Z\ this product of matrices is, in fact,
[
a il bli
im-
zik\>
the
i,
77
for multiplication
The third law that governs the multiplication of comnumbers is the commutative law, of which an example is plex ab = ba. As we have seen (Theorem 22), this law does not hold for matrices. There is, however, an important exception.
The square matrix of order n that has unity in its leading diagonal places and zero elsewhere is called THE UNIT MATRIX of order n. It is denoted by /.
DEFINITION
12.
in / is usually clear from rarely necessary to use distinct symbols for unit matrices of different orders.
n and I
the unit
matrix of
order n; then
1C
/
CI
/a
C
....
Also
= /3 =
Hence / has the properties of unity in ordinary algebra. Just we replace 1 X x and a; X 1 by a; in ordinary algebra, so we replace /X C and Cx I by C in the algebra of matrices. C kl Further, if Jc is any number, real or complex, kl C and each is equal to the matrix kC.
as
.
.
be noted that if A is a matrix of'ft rows, B a matrix of n columns, and / the unit matrix of order n, then IA A and BI B, even though A and B are not square matrices. If A is not square, A and / are not con7.2. It
may
also
The
division law
is
governed also by the division law, which states that when the product xy is zero, either x or y, or both, must be zero. This law does not govern matrix products. Use OA 0, but the to denote the null matrix. Then AO Ordinary algebra
AB =
or
is
A=
[a
61,
B=
F b
2b
1,
OJ
[-a
2a\
78
is
is the zero matrix, although neither the product the zero matrix.
AB
nor
Again,
For example,
01,
if
A=
TO AB={Q [O
9.
a [a
o
61,
B=
BA =
Oj
01,
[a
f
Oj
6 2 1.
ab
2
OJ
[-a
-a6j
A,
J3,...,
representing
be added, multiplied, and, in a large measure, as though they represented ordinary numbers. manipulated The points of difference between ordinary numbers a, 6 and
matrices,
matrices A,
(i)
#are
(ii)
= 6a, AB is not usually the same as BA\ whereas ab = implies that either a or 6 (or both) is does not necessarily imply zero, the equation AB =
whereas ab
that either
in
or
Theorem
29.
10.
The determinant of a square matrix 10.1. When A is a square matrix, the determinant
that has
the same elements as the matrix, and in the same places, is called the determinant of the matrix. It is usually denoted by
1-4
.
1
Thus,
if
column, so that
K-*l-
A=
It
is
that
if
and
are square matrices of order n, and if the two matrices, the determinant of the matrix
B; that
is,
\AB\
Equally,
\BA\
= = =
\A\x\B\.
\B\x\A\.
Since \A\ and \B\ are numbers, the commutative law holds for
their product
and
\A\
\B\
\B\
\A\.
79
BA |, although the matrix AB is not the same matrix BA. This is due to the fact that the value of as the a determinant is unaltered when rows and columns are inter|
changed, whereas such an interchange does not leave a general matrix unaltered.
10.2.
The beginner should note that 10.1, both A and B |-4 x \B\, of
\
in
the
equation
It is
are matrices.
= k\A
|,
where k
is
a number.
For simplicity of statement, let us suppose A to be a square matrix of order three and let us suppose that k = 2. The theorem '\A + B\ = \A\+\B\' is manifestly false (Theorem 6, p. 15) and so \2A\ 2\A\. The true theorem is easily found. We have
A=
i
a!3
22
^23
33J
say, so that
2A
=
2a, *3i
2a 12
^22
(Definition
8).
2a 32
2a 33 J
=
is
&\A\.
evident on applying 10.1 to the matrix product 21 x A, where 21 is twice the unit matrix of order three and so is
020
[2
01.
2J
EXAMPLES VII
Examples 1-4 are intended as a kind of mental arithmetic. 1. Find the matrix A + B when
(i)
A =
Fl
21,
=
= B=
(ii)
U 4j A=\l -1 2 -2
L3
[5 17
61;
8J
[3
-3 -4
6
6].
-3
2
3],
(iii)
A=
[I
[4
80
Ana.
f 6 [6
81,
(ii)
F4
6
-41,
(iii)
[5
9].
L10
12J
-6
-8j
L8
2.
Is
it
A + B when
(i)
A
A A
(ii)
(iii)
has 3 rows, B has 4 rows, has 3 columns, D has 4 columns, has 3 rows, B has 3 columns ?
Ans.
3.
(i), (ii),
No;
(iii)
only
if
3 rows.
AB when it possible to define the matrix BA have the properties of Example 2? A, Ans. AB (i) if A has 4 cols.; (ii) if B has 3 rows; (iii) depends on thenumber of columns of A and the number of rows of B.
Is
BA
4.
(i)
(i) if
has 3
cols.; (ii) if
has 4 rows;
(iii)
always.
Form
the products
11,
A^\l
LI
when
(ii)
A =
\l
21,
B=
F4 15
51. 6j
1J
L2
3J
Examples 5-8 deal with quaternions in their matrix form. 5. If i denotes *J( l) and /, i, j, k are defined as
i
01,
ij
t=ri
Lo
i,
01,
y=ro
L-i
k,
n,
oj
&-ro
it
i,
*i,
oj
j.
kj
ik
=
c,
/.
bi
cj
dk, a,
6,
d being
QQ'
(a*
+ 6 + c + d 2 )/.
2
2
al-pi-yy-Sk,
a,
j9,
y, 8
being
QPP'Q'
[see
7]
=
8.
1
i-
Q'p'pg.
/,
i,
j,
k are respectively
r
l-i
0100 0010
9.
oooiJ
By
00011 LooioJ
I
-1000
=
I,
Oi,
00-10
L-ioooJ
0100
01,
ij
n,
oj
Lo
l-i
81
together with matrices al-\-bi, albi, where a, 6 are real numbers, show that the usual complex number may be regarded as a sum of matrices whose elements are real numbers. In particular, show that
(a/-f &i)(c/-Mt)
10.
(oc-6d)/-h(ad+6c)i.
Prove that
y
z].ra
[x
h
b
0].p]
f\ \y\
c\
[ax+&t/
\h
19
\z\
that
is,
one column.
11.
A*-B* = (A-B)(A + B) 2 2 is true only if AB = BA and that aA + 2hAB + bB cannot, in general, linear factors unleash B = BA. be written as the product of two
12. If Aj, A 2 are numbers, real or complex, and of order n, then
is
a square matrix
where /
law).
is
^'-A^I-A^+AiA,/
l
>
= PtX n + PiX*^
= p^(A -Ai I)...(A -A w 1), where A p A n are the roots of the equation /(A) = 0. are numbers, then BA 14. If B = X4-f/i/, where A and
then
/(A)
...,
/*
AB.
The use
15.
rp^
p lt p l9 l Pn\ ?i P
Pat
2>3J
LPti
may
be denoted by
Pj
where
PU
denotes r^ n
p ltl, pn],
Plf denotes
Ptl denotes
4709
|>81
Pu denotes
82
Q,
Qn
>
p+Q =
NOTE. The first 'column' of PQ as it is written above is merely a shorthand for the first two columns of the product PQ when it is obtained directly from [> ] X [gfa].
ifc
by putting for 1 and i in Example 6 the tworowed matrices of Example 9. 17. If each P and Q T8 is a matrix of two rows and two columns, r8
16.
Prove Example
1, 2, 3,
1, 2, 3,
-i
the z'th and yth rows Prove that the effect of pre -multiply ing a square matrix A by Ii} will be the interchange of two rows of A> and of postmultiplying by Jjj, the interchange of two columns of A. of the unit matrix /.
Deduce that
19.
7 2j
9
/,
Ivk lkj l^
Ikj.
IfH=
If H
affects in the position indicated, then by replacing row$ and affects by replacing col v by col.j Acol.j. rowi-J-ftroWj
HA
AH
by an element h by
is a matrix of order n obtained from the unit matrix by 20. is the the rth unity in the principal diagonal by fc, then replacing is the result of result of multiplying the rth row of A by k, and multiplying the rth column of A by k.
HA
AH
cos
sin 01,
["
cos 2
<
cos [cossin
is
sin 2
(/>
Lcos^sin^
zero
22.
when
and
differ
by an odd multiple of
When
I
(A 1 ,A 2 ,A 3 )
and
two
lines
A,A 2
A}
AAi
A 2 A3
A2 A3
the lines
A
I
is
zero
23.
if
and only
if
and
m are perpendicular.
first
Prove that
L2
L when
denotes the
The transpose
1.1.
of a
13.
matrix
DEFINITION
which, for r
is called the
1, 2,...,
If A is a matrix ofn columns, the matrix n, has the r-th column of A as its r-th row
It is
(or the
If
A=
A',
is
said to be SYMMETRICAL.
applies
to
all
The
definition
rectangular matrices.
For
example,
is
the transpose of
n [1
[2
si,
ej
for the
are considering square matrices of order n we shall, throughout this chapter, denote the matrix by writing down the element in the ith row and kth column. In this notation,
if
When we
A =
[a ik ],
then A'
[a ki ]', for the element in the ith row is that in the kth row and ith column
THEOREM
23.
LAW
and
(A BY
B'A';
AB
is the
product of
in
1.1, let
so that
A A =
(
[a ik ],
[ttfcj,
B=
B'
[b ik l [b ki ].
Then, denoting a matrix by the element in the ith row and kth column, and using the summation convention, we have
AB =
[a ti b ik l
(AB)'
{a kl
b^.
84
RELATED MATRICES
the ith row of
is
Now
A'
akv
flfc 2 >"'>
b u b 2i ,... b ni and the kth column of &kn> the inn^r product of the two is
is
, 9
Hence, on reverting to the use of the summation convention, the product of the matrices B and A' is given by
f
B'A'
=
,...,
COEOLLAKY. // A,
transpose of the product
AB...K
product K'...B'A'.
By
the theorem,
(ABCy
and
2.
= =
C'(AB)'
C'B'A',
so,
step
by
step, for
The Kronecker
The symbol
B ik
S ifc
is
delta
defined
i
when
= k.
a particular instance of a class of symbols that is extensively used in tensor calculus. In matrix algebra it is often convenient to use [8 ik] to denote the unit matrix, which, as we have already said, has 1 in the leading diagonal positions (when
It
is i
3.
adjoint matrix
Transformations as a guide to a correct definition. Let 'numbers' X and x be connected by the equation
X = Ax,
where
(1)
symbolizes a matrix [aik\. When A 0, x can be expressed uniquely in terms \a ik of (Chapter II, 4.3); for when we multiply each equation
X =
i
aik xk
(la)
BELATED MATRICES
by the
co-factor
85
n,
i(
and sum
for
1, 2,...,
we obtain
(Theorem
10)
That
or
is,
xt ==
T
t-i
*,
=
fc=i
(^w/A)**.
(2 a) in
1
(? a)
- A~ X,
(1),
which
is
A" 1
to
mean
provided we interpret
a square matrix of order n\ let A, or \A\ denote the determinant \aik \; and let ^i r, denote the co-factor of ar8 in A.
3.2.
[aik ] be
Formal
A=
i*
== 0;
=56
0.
DEFINITION
15.
TAe matrix [A ki ]
[a ik ].
[aife ] is
is
the
ADJOINT, or the
ADJUGATB, MATRIX O/
DEFINITION
matrix [A ki /&]
16.
is
When
a non-singular matrix,
of [a ik ],
the
that
Notice that, in the last two definitions, it is A ki and not A ik is to be found in the ith row and fcth column of the matrix.
3.3.
Properties of the reciprocal matrix. The reason for the reciprocal matrix' may be found in the properties to be enunciated in Theorems 24 and 25.
the
name
THEOREM
n and
if
24.
// [a ik ]
A r8 is
the
and
where I
is the
= Mx[4K/A] = /, [A /&]x[aik = I,
k{ ]
(i)
(2)
86
RELATED MATRICES
The element
in the ith
which (by Theorem 10,f p. 30) is zero when i ^ k and is unity when i = k. Thus the product is [S ik] = /. Hence (1) is proved. Similarly, using now the sum convention, we have
and so
3.4.
(2) is
proved.
justified,
when A denotes
M'
for
in
we have shown by
-i
(1)
and
(2)
of
Theorem 24
that, with
such a notation,
and
A~*A
by
its
/.
That
is
reciprocal
is
equal
l But, though we have shown that A~ is a reciprocal of A, we have not yet shown that it is the only reciprocal. This we
do in Theorem
25.
25. // A is a non-singular square matrix, there is one matrix which, when multiplied by A, gives the unit only
THEOREM
matrix.
/ and be any matrix such that the matrix [A ki /&]. Then, by Theorem 24, AA~ l
Let
AR
1
let
A~ l
denote
A(B-A-i)
steps)
A- A(R-A~
l
l
)
= A~
24),
.Q
(p. 75,
6.5),
that
is
(by
(2)
of
Theorem
I(R-A-i)
Hence,
f
0,
i.e.R-A- l
1
.
(p. 77,
7.1).
if
AR
/,
must be A'
6, p. G2.
RELATED MATRICES
Similarly,
if
87
RA =
/,
we have
1
and
that
so
is
Hence,
3.5.
i.e.
l
A' 1
In virtue of Theorem 25 we are justified in speaking of not merely as a reciprocal of A, but as the reciprocal of A. Moreover, the reciprocal of the reciprocal of A is A itself, or,
in symbols,
(A24,
For
since,
by Theorem
= A. AA~ = A~ A
1
)"
/, it
follows that
is
a reciprocal of
A~ l
and, by Theorem
25,
it
must be
the
reciprocal.
4.
for matrices
2
,
We have already used the notations A ^4 3 to stand for AA, A A A,... When r and s are positive integers it is inherent
,...
A XA
8
)
.
With
this notation
(
(1) in
>
0,
t,
and
> >
r
method.
Let
t
r+k, where k
>
r
0.
Then
r ~k
A
But
and,
when
A A~
r
r+l
=
l
^4 r
-1
l
/^- r+1
r
.4 r
-1
.4- r + 1 ,
so that
Hence
,4- r
fc
-'.
88
Similarly,
RELATED MATRICES
we may prove
and
that
9
(A*)
= A"
negative integers.
(2)
whenever
5.
s are positive or
The
reciprocal of
a product of factors
is the
(ABC)~
(A-*B-*)-*
Proof.
= =
C- l B~ lA-\
BA.
factors,
we have
(p. 75,
(B-*A~ ).(AB)
= =
B- 1A-*AB
B-*B
6.5)
Hence B~ lA~ l must be the reciprocal of AB. For a product of three factors, we then have
25, it
and so on
for
THEOREM
27.
(A')~
Proof.
(A-
1
)'.
is,
by
definition,
BO that
(A-
Moreover,
definition of
A'^-iy 1
/.
it
is-a reciprocal
must
the reciprocal.
RELATED MATRICES
6.
89
Singular matrices
If
a singular matrix, its determinant is zero and so we cannot form the reciprocal of A, though we can, of course, form the adjugate, or adjoint, matrix. Moreover,
is
THEOREM
is the zero
28.
its
adjoint
matrix.
If
Proof.
[a ik \
and
\a ik
=
\
0,
then
both when
7^
k and when
Of*]
k.
X [ A ki]
and
A ki\ x [*]
are both of
them
null matrices.
The division law for non- singular matrices 7.1. THEOREM 29. (i) // the matrix A is non-singular, implies B = 0. equation A B =
7.
(ii)
the
//
^4
Me matrix
0.
is
AB
reci-
implies
Proof,
(i)
^4 is
non-singular,
it
has a
it
But A~ 1 AB
(ii)
=
B
IB
is
B, and so
B
it
0.
Since
non-singular,
follows that
has a reciprocal
B~ l and
,
Since
^4 7?
0, it
ABB- - Ox B- 1 1
0.
But
ABB' = AI =
1
A, and so
^4
0.
7.2.
'//
4702
The
AB
Mew. eiMer
-4
=
N
0,
or JB
0, or
BOJ?H
90
7.3.
RELATED MATRICES
THEOREM
30.
//
is
is
a non-
(1)
and
there is
A=
Moreover,
Proof.
YB.
(2)
X=
B^A,
Y=
AB~*.
1
BB~ A
1
==
satisfies
(1);
for
and
- BR-A = = Bis
B(R-B~*A)
so that
RB-*A the zero matrix. Similarly, Y = ^4 J5- is the only matrix that satisfies
1
(2),
7.4.
certain circumstances
when
B are not square matrices of the same order. COROLLARY. // A is a matrix of m rows and n columns, B
and
X=
m
n
9
B~ 1A
a non-singular
= AC"
1
.
columns and
has
m rows,
is
B~ l and A
are
B~ 1A, which
a matrix of
m rows
B.B-^A
where /
is
= IA=A,
m
may
rows and be proved
Further,
A=
BR,
must be a matrix of
n columns, and the first part of the corollary as we proved the theorem.
RELATED MATRICES
The second part of the
way.
7.5.
91
corollary
may
The
corollary
is
down
[a^]
is
a non-singular
n
lL
fc-1
a ik x k
= l,...,rc)
=
Ax, where
6,
(3)
may
x stand
The
t= fc-i i>2/*
(i
= l,...,n)
b'
(4)
may be
The
y'A where
y
6', y'
stand
A~ l b,
is
y'
b'A~ l
An
interesting application
EXAMPLES VIII
1.
When
A =Ta
g
h
b
gl,
B=
r* t
\vi
*, y*
*
*3 1,
f\
c\
9
Ui
form the products AB, BA, A'B' B'A' and verify the
in
result enunciated
Theorem
23, p. 83.
HINT. Use
2.
X=
Form
these products
4.
Prove Theorem
(A'
.
1
)',
by considering Theorem
its
when
A~ l
transpose
is
a sym.
metrical matrix.
HINT. Use Theorem 23 and consider the transpose of the matrix AA' 6. The product of the matrix [c* ] by its adjoint is equal to Ja^l/.
4jfc
92
RELATED MATRICES
of submatrices (compare Example
15, p. 81)
|>],
Use
7.
Let
FA
11,
M~
f/x
11,
N=
where
A, p, v
are non-zero
LO
AJ
LO
/J
HINT. In view of Theorem 25, the last part is proved by showing that the product of A and the last matrix is /. ~ 8. Prove that if f(A) p A n + p l A n l + ...+p n where p ,..., p n are numbers, then/(^4) may bo written as a matrix
,
\f(L)
f(M)
f(N)\
and
that, when/(^4)
is
non-singular, {f(A)}~
in the
l//(Af), l/f(N)
f(L),f(M),f(N).
g(A) are two polynomials in A and if g(A) is non-singular, the extension of Example 8 in which/ is replaced by//gr. prove
9.
If/(^4),
Miscellaneous exercises
10.
Solve equations
(4), p. 91, in
(A')-*b,
23,
= b'A^
1
prove that
1
(A )- - (A- )'. 11. A is a non-singular matrix of order n, x and y are single-column matrices with n rows, V and m' are single-row matrices with n columns;
1
Ax,
(^l- )^.
1
Vx
m'y.
3
= T^- m =
1
,
when a
I
i
linear
transformation
2/^
= 2 a ik xk k=
l
l,
2,
3)
changes
(^,^0,^3),
(2/i)2/22/a)
a plane.
12. If the
elements a lk of
are real,
and
if
AA
'
0,
then
.4
0.
13. The elements a lk of A are complex, the elements d lk of A are the conjugate complexes of a ifc arid AA' = 0. Prove that ^L == J" = 0. 14. ^4 and J5 are square matrices of the same order; A is symmetrical. Prove that B'AB is symmetrical. [Chapter X, 4.1, contains a proof
,
of
this.]
Definition of rank
1.1. The minors of a matrix. Suppose we are given any matrix A, whether square or not. From this matrix delete all elements save those in a certain r rows and r columns. When r > 1, the elements that remain form a square matrix of order r and the determinant of this matrix is called a minor of A of
order
in
r.
Each
individual element of
is,
when considered
of order
1.
this
For
example,
the determinant
is
a minor of
of order
1.2.
2; a, 6,...
1.
any minor of order r-\-l (r ^l) can be expanded by its iirst row (Theorem 1) and so can be written as a sum of multiples of minors of order r. Hence, if every minor of order r is zero, then every minor of order r+1 must also be zero. The converse is not true; for instance, in the example given in 1.1, the only minor of order 3 is the determinant that contains all the elements of the matrix, and this is zero if a = b,
d
Now
e,
and g
= h;
2,
is
not
necessarily zero.
1.3.
single
element of a matrix
is
zero, there will be at least one set of minors of the same order which do not all vanish. For any given matrix with n rows and
there
columns, other than the matrix with every element zero, 1.2, a definite positive integer p is, by the argument of
such that
and n, not all minors of order EITHER p is less than both vanish, but all minors of order p+1 vanish, p
94
OR p is equal to| min(m, n), not all minors of order p vanish, and no minor of order p+ 1 can be formed from the matrix.
This
number
definition
RANK of the matrix. But the much more compact and we shall take can be made
p
is
called the
DEFINITION
zero
'
17.
is the largest
r are
For a non-singular square matrix of order n, the rank is n\ for a singular square matrix of order n, the rank is less than n. It is sometimes convenient to consider the null matrix, all of whose elements are zero, as being of zero rank.
2.
Linear dependence
2.1. In the remainder of this chapter we shall be concerned with linear dependence and its contrary, linear independence; in particular, we shall be concerned with the possibility of
expressing one row of a matrix as a sum of multiples of certain other rows. We first make precise the meanings we shall attach
to these terms.
Let a lt ..., am be 'numbers', in the sense of Chapter VI, each having n components, real or complex numbers. Let the components be ^
*
(a u ,...,
field J
a ln ),
...,.
(a ml ,...,
a mn ).
are said to be
Let
be any
of numbers.
Then a l9 ..., a m
0.
F if there is an equation
(1)
all
Zll+- + Cm =
wherein
all
the
I's
belong to
and not
The equation
(1)
implies the
n equations
^a lr +...+/m a mr
-0
(r
= l,...,n).
respect to
F is linear
When
when
m = n, m 7^ n,
min(m, n)
m=
=
min(m, n)
2.
the smaller of
m and n.
95
wherein
all
the
J's
belong to F.
The equation
('
(2)
implies the
n equations
*lr
= Jr + -+'m amr
= If-i
)-
field
to be the 2.2. Unless the contrary is stated we shall take of all numbers, real or complex. Moreover, we shall use
the phrases 'linearly dependent', 'linearly independent', 'is a sum of multiples' to imply that each property holds with
respect to the field of
example,
b
'the
and c* will not necessarily integers but may be any numbers, /! and 1 2 are
For all numbers, real or complex. "number" a is a sum of multiples of the "numbers" imply an equation of the form a = Z 1 6-f/ 2 c, where
real or complex.
As on previous occasions, we shall allow ourselves a wide interpretation of what we shall call a 'number'-, for example, in Theorem 31, which follows, we consider each row of a matrix
as a 'numbet'.
3.
Rank and
3.1.
if A
THEOREM
and
linear dependence 31. If A is a matrix of rank r, where r 1, has more than r rows, we can select r rows of A and
sum
Let
r
have
(i)
There is at least one non-zero determinant of order n that can be formed from A. Taking letters for columns and suffixes for rows, let us label the elements of A so that one such nonzero determinant
is
denoted by
==
6,
bn
kn
of any other row.
With
n+0
be the
suffix
96
2+-+Mn = b n+6)
Z lf Z ,..., l n 2
(1)
in
given by
Hence the rowf with suffix n-{-9 can be expressed of multiples of the n rows of A that occur in A. This n m. proves the theorem when r 0. Notice that the argument breaks down if A
and
so on.
as a
sum
<
(ii)
Let
is
<
n and,
also, r
< m.
r.
There
of
As
in
(i),
take
letters for
columns and suffixes for rows, and label the elements so that one non-zero minor of order r is denoted by
M=
ax
^
bl
Pi
b2
br
p2
Pr
'r
With
rows of
have
suffixes
r+1, r+2,..., m.
Consider
now
the determinant
ar+0
r and where oc is the letter where 6 is any integer from 1 to ANY column of A. then D has two columns If a is one of the letters a,..., identical and so is zero. D is a minor of If a is not one of the letters a,..., p, then and since the rank of A is r D is again zero. A of order r+1;
m>,
of
letters
have
suffix
tt
we may write
equations n
(1) as
97
Hence D is zero when a is the letter of ANY column of A, and if we expand D by its last column, we obtain the equation
^ ar^+A
r
a1 +A 2 a 2 +...+Ar ar
= 0,
(2)
and where M, Aj,..., A^ are all independent of ex. Hence (2) expresses the row with suffix r-\-Q as a sum of symbolically, multiples of the r rows in
where
M^
AI
-~~
A2
Ar
This proves the theorem when r is less than Notice that the argument breaks down if
3.2.
THEOREM
q rows of
32.
to select
<
as a
sum
Suppose that, contrary to the theorem, we can select q rows A and then express every other row as a sum of multiples of them. Using suffixes for rows and letters for columns, let us label the elements of A so that the selected q rows are denoted by = 1,..., m, where m is the number of Pit /2>---> /V Then, for k
of
= *lkPl+
k k
and
Now
be
(3) expresses our hypothesis in symbols. consider ANY (arbitrary) minor of A of order r. Let the
= > q,
1,..., q,
(3) is
3)
8/fc
letters of its
&!,...,
kr
let
the suffixes of
its
rows
which
is
o
o
.
v
each of which has r
4708
e.
en
q columns of zeros. O
98
Accordingly, if our supposition were true, every minor of A of order r would be zero. But this is not so, for since the rank
of
is
r there
,
is
of order r that
is
not
is
zero.
established.
4.
Non -homogeneous
4.1.
linear equations
We now
consider
n
^a *=i
a set of
of equations
values of x
it
bi
(i
l,...,
m),
...,
(1)
xn
Such a
set
may either be consistent, that is, at least one set of may be found to satisfy all the equations, or they
x can be found the equations. To determine whether the equais,
may
be inconsistent, that
set of values of
we
A=
an
a ln
"I,
==
bl
We
the augmented matrix of A. Let the ranks of A, B be r, r' respectively. Then, since every r' or r A has r' minor of A is a minor of B, either r [if
call
<
r,
We
4.2.
shall
prove that
when
r'
and are
inconsistent
when
Since
r
r'
Let r
<
r'.
< ^
r'.
m,
<
m.
We
as a
can select
rows
of
sum
of multiples
of these r rows.
Let us number the equations (1) so that the selected r rows become the rows with suffixes 1,..., r. Then we have equations
of the form
ar+0,i
=
(1)
wherein the A's are independent of t. Let us now make the hypothesis that the equations
are
99
to
Then, on multiplying
n,
(2)
we
get
br+e
*ie*>i+-+\e*>r
(3)
The equations., (2) (3) together imply that we can express any row of B as a sum of multiples of the first r rows, which is contrary to the fact that the rank of B is r' (Theorem 32).
and
(1)
cannot be consistent.
(1)
Let
r'
r.
If r
= ra,
may
be written as
(5)
below.
If r
<
row as
we can select r rows of B and express every other a sum of multiples of these r rows. As before, number
ra,
i
the equations so that these r rows correspond to Then every set of x that satisfies
1,...,
r.
%ax = b
t
l,...,r)
(4)
satisfies all
the equations of
(1).
a denote the matrix of the coefficients of x in (4). as we shall prove, at least one minor of a of order r must Then, have a value distinct from zero. For, since the kth row of A
let
Now
can be written as
A lkPl+'"
\
+ *rkPr>
\
every minor of A of order r is the product of a determinant wherein i has the values 1,..., r a determinant \a ik l^ifc'l by
\,
that is, every minor of A of (put q order r has a minor of a of order r as a- factor. But at least
r in
the work of
of order r
3.2);
one minor of
is
a of order r is not zero. Let the suffixes of the variables x be numbered so that the first r columns of a yield a non-zero minor of order r. Then, in this notation, the equations (4) become
r
a u xt
e=i
n
&,
J au x *=r+i
(*
= >> r),
l
(5)
side
wherein the determinant of the coefficients on the left-hand The summation on the right-hand side is is not zero.
If
present only
>
r,
when n we may
>
r.
Hence,
if
r'
and
100
r (1) are consistent and certain sets of n x can be given arbitrary values, f of the variables If n = r, the equations (5), and so also the equations (1), are consistent and have a unique solution. This unique solution
>
the equations
may
7.5) as
x
where
of
(5),
= A?b,
>
A:
6
is
is
and x
5.
is the single-column matrix with elements b v ..., b n the single^column matrix with elements xl3 ... xn
9
.
Homogeneous
The equations
linear equations
IaXi =
t
,
(*
!,..., HI),
(6)
which form a set of m homogeneous equations in n unknowns xn may be considered as an example of equations (1) a?!,...,
when
all
The
results of
4 are immediately
applicable.
Since
and
is equal to the rank of A, are zero, the rank of so equations (6) are always consistent. But if r, the rank
all 6^
of A,
is
4.3,
the only
given by
6
x
Hence, when r x2 that is, x l
4.3,
= A^b,
0.
= n the only solution of (6) is given by x = 0, = = = xn = 0. When r < n we can, as when we obtained equations (5) of
...
(6) to
the solution of
(t
t~i
,!*,=
<=r-fl
au xt
1,..., r),
(7)
by
of the variables may be given arbitrary values; the f Not all sets of criterion is that the coefficients of the remaining r variables should provide
nr
a non-zero determinant of order r. J The notation of (7) is not necessarily that of (6): the labelling of the elements of A and the numbering of the variables x has followed the procedure
laid
down
in
4.3.
101
assigning arbitrary values to xr+l ,..., x n and then solving (7) for Once the values of xr+v ..., xn have been assigned, the #!,..., xr
.
(7) determine x v ..., xr uniquely. In particular, if there are more variables than equations, i.e. n w, then r < n and the equations have a solution other x l = ... = x n 0. than
equations
>
6.
Fundamental
6.1.
sets of solutions
the rank of A, is less than n, we obtain n r distinct solutions of equations (6) by solving equations (7) for the sets of values
r,
When
nr
== 0>
Xr+2
(8)
*n
0,
Xn
0,
...,
#n
1.
We
shall use
to denote the single-column matrix whose eleare x v ..., x n The particular solutions of (6) obtained
.
we denote by
X*
As we
is
(*
= l,2,...,n-r).
X
(9)
of (6), which shall prove in 6.2, the general solution obtained by giving #rfl ,..., x n arbitrary values in (7), is furnished by the formula
X=Yaw*X =
fc
fc
(10)
That is to say, (a) every solution of (6) is a sum of multiples of the particular solutions X^. Moreover, (6) no one X^ is a sum of multiples of the other X^.. A set of solutions that has the
properties (a) TIONS.
and
(6) is
called a
There
is
If
we
replace the numbers on the right of the equality signs in (8) by the elements of any non-singular matrix of order nr, we are led to a fundamental set of solutions of (6). If we replace these
numbers by the elements of a singular matrix of order nr, solutions of (6) that do not form a fundawe are led to
nr
shall
prove in
6.4.
102
6.2.
made
in
6.1.
We
are
seen, every solution of such a set is a solution of a set of equations that may be written in the form r n n ~ n x (i 1 C7\ u <r 1 v? r\
Z^^u^i
JL
\*
";>
{/
We
shall, therefore,
original equations but shall consider only As in 6.1, we use to denote x v ...,
x n considered as the
XA
to denote the
particular solutions of equations (7) obtained from the sets of values (8). Further, we use l k to denote the kill column of (8)
considered as a one-column matrix, that is, I k has unity in its kth row and zero in the remaining 1 rows. Let A denote l the matrix of the coefficients on the left-hand side of equations
nr
denote the reciprocal of A v In one step of the proof we use XJJ+ 5 to denote xa ,..., x a+8 considered as the elements of a one-column matrix.
(7)
and
let
A^ 1
The general solution of (7) is obtained by giving arbitrary values to xr+1 ,..., x n and then solving the equations (uniquely, since A l is non-singular) for x l9 ..., x r This solution may be
.
XBut the
last
Xr+k
matrix
is
Many
The remainder of
readers will prefer to omit these proofs, at least on a 6 does not make easy reading for a beginner.
first
reading.
103
(7)
we have proved
may
be repre-
X^lVfcX*.
6.3.
(10)
A fundamental set contains n r solutions* 6.31. We first prove that a set of less than nr solutions
,
cannot form a fundamental set. Let Y lv .., Yp where p < nr, be any p distinct solutions of (7). Suppose, contrary to what we wish to prove, that Yt form
-
a fundamental
set.
Then
X*
where the
=.!/.*
Y,
(k
I,...,
n-r),
(9).
6.1
By Theorem
31
applied to the matrix [A /A.], whose rank cannot exceed p, we as a sum of multiples of p of them (at can express every most). But, as we see from the table of values (8), this is
fc
is
Y^ cannot form a fundamental set. solu6.32. We next prove that in any set of more than is a sum of multiples of the retions one solution (at least) mainder. Let Yj,..., Yp where p > nr, be any p distinct solutions of
nr
(7).
Then, by
(10),
where p ik
(at most).
is
-.
Exactly as in
sum
of multiples of
nr
6.31,
of
we them
and
from
6.31
set
must contain
nr
if
solutions.
6.4.
Proof of statements
made
in
6.1 (continued).
6.41. If [c ik ] is a non-singular matrix of order X^ is the solution of (7) obtained by taking the values
nr,
and
xr+l
c iV
xr+2
C i2>
>
xn
104
then,
in
6.2,
On
3.1),
we
get
where
Hence, by form
every solution of (7) may be written in the n-r n-r X xr+i 2, yik Xfc
1=1
fc=l
V
not possible to express any one X^ as a sum of multiples of the remainder. For if it were, (11) would give equations of the formf
Moreover,
it is
6.31,
would imply
could be expressed as a
solutions
is
sum
of multiples of the
set.
Hence the
XJ form a fundamental
6.42. If [c ik ]
a suitable labelling of
that
c
p+8,k
=2
8
1
where p
is
<
r.
But, by
and
t as a
can be written
105
an equation
set of solutions.
as a
sum
INDEPENDENT. By what we have proved in 6.41 and 6.42, a set of solutions that is derived from a non-singular matrix [c ik \ is linearly independent and a set of solutions that is derived from a singular
it
matrix [c ik ] is not linearly independent. Accordingly, if a set of n r solutions is linearly independent, must be derived from a non-singular matrix [c ik ] and so, by
such a set of solutions
is
6.41,
a fundamental
set of
set.
r solutions is
fundamental set. Moreover, any set of n r solutions that is not linearly independent is not a fundamental set, for it must be derived from
a singular matrix
[c tk ~\.
matrix
1. Prove that the rank of a matrix is unaltered by any of the following changes in the elements of the matrix (i) the interchange of two rows (columns),
:
the multiplication of a row (column) by a non-zero constant, the addition of any two rows. HINT. Finding the rank depends upon showing which minors of the matrix are non-zero determinants. Or, more compactly, use Examples VII, 18-20, and Theorem 34 of Chapter IX.
(ii)
(iii)
2. When every minor of [a ik ] of order r-f-1 that contains the first rows and r columns is zero, prove that there are constants A^ such that the kth row of [a ik ] is given by
Pk
= 2 \kPii-l
or otherwise) that every minor of [a ik ]
3.2,
r+ 1
is
then zero.
106
3.
and
2, prove that if a minor of order r is non-zero every minor obtained by bordering it with the elements from an arbitrary column and an arbitrary row are zero, then the rank of the
matrix
4.
is r.
Rule for symmetric square matrices. If a principal minor of order r is non-zero and all principal minors of order r-f- 1 are zero, then
the rank of the matrix
is r.
(See p. 108.)
results for
Prove the rule by establishing the following a ik in (i) If A ik denotes the co -factor of
(7
=
U
tl
a*r
a ir
a8, a t$
a a
is the complementary minor of then A 88 A it A^^L^ MC, where 88it atg a tt a t9 ast in C (Theorem 18). = and every corresponding A 88 = 0, then every (ii) If every (7 = Atg = and (Example 2) every minor of the complete matrix Ait
[a ik ] of order
(iii)
r+1
is
zero.
If all principal minors of orders r-f 1 and r -f 2 are zero, then the rank is r or less. (May be proved by induction.)
Numerical examples
5.
2458
2 3J 3 [1347],
1
1
(iii)
3
1
(iv)
658
.3872.
Ans.
(i)
Prove that the three points (xr yr z r ), r = 1, collinear according as tho rank of the matrix
, ,
in
a plane are
(x,y,z).
2, 3,
is
2 or
3.
107
7. The coordinates of an arbitrary point can be written (with the notation of Example 6) in the form
1, 2,
HINT.
Theorem
8.
coordinates
9.
Give the analogues of Examples 6 and 7 for the homogeneous (x, y, z, w) of a point in three-dimensional space.
configuration of three planes in space, whose cartesian equa-
The
a
is
ilL
x+ a ia t/-f a i8 z =
b{
(i
1, 2, 3),
determined by the ranks of the matrices A =s [a ik ] and the augmented matrix B (4.1). Check the following results. ^ 0, the rank of A is 3 and the three planes meet in (i) When \a ijc a finite point. (ii) When the rank of A is 2 and the rank of B is 3, the planes have
\
no
of
finite
The planes
if
every
A ik
is
zero,
is 1.
When
or
the rank of
is 2,
either (a)
(b)
parallel,
not.
Since |a^|
0,
(Theorem 12), and if the planes 2 and 3 intersect, their line of interHence in section has direction cosines proportional to n -4 ia lz (a) the planes meet in three parallel lines and in (b) the third plane
meets the two parallel planes in a pair of parallel lines. The configuration is three planes forming a triangular prism; as a special case, two faces of the prism may be parallel. B is 2, one equation (iii) When the rank of A is 2 and the rank of is a sum of multiples of the other two. The planes given by these two equations are not parallel (or the rank of A would be 1), and so the configuration is three planes having a common line of intersection. (iv) When the rank of A is 1 and the rank of B 10 2, the three planes
are parallel.
(v)
When
the rank of
is 1, all
three equa-
108
10. Prove that the rectangular cartesian coordinates (X 19 2 J 3 ) of the orthogonal projection of (x l9 x 2 # 3 ) on a line through the origin having direction cosines (J lf J 2 Z 3 ) are given by
, , ,
Ma h
i
l\
LMa
where X, x are
11. If
Mi
single
column matrices.
sponding m matrix,
M denotes a corre-
X2 - L
and
and that the point (L-\- M)x is the projection of # on a line in the plane if and only if I and in are perpendicular. of the lines I and
is
CHAPTER IX
DETERMINANTS ASSOCIATED WITH MATRICES
1.
The rank
of a product of matrices
1.1. Suppose that A is a matrix of n columns, and B a matrix of n rows, so that the product A B can be formed. When A has has n^ rows n^ rows and B has n 2 columns, the product
AB
and n2 columns.
If a ik b ik are typical elements of A, B,
,
C ik ==
is
AB
may, with a
suitable labelling of
21
U 22
When
^ n), A
b nl
b nn
is
the
a. *m
*nl
in
that
is,
A
/
is
by a /-rowed
minor of B.
When
^ n,
ii
A
-
is
am
aln
15),
bn
m
nf
/
%
Hence (Theorem
U
/
'
when
> n,
is
zero;
and when
<
w,
is
the
sum
t
of order
the products of corresponding determinants that can be formed from the two arrays (Theorem 14).
of
all
Hence every minor of A B of order greater than n is zero, and every minor of order / ^ n is the sum of products of a /-rowed minor of A by a /-rowed minor of B.
110
Accordingly,
THEOREM
rank
For,
33.
AB
of either factor.
by what we have shown, (i) every minor of AB with more than n rows is zero, and the ranks of A and B cannot exceed n, (ii) every minor of A B of order t < n is the product or the sum of products of minors of A and B of order t. If all
the minors of
or if
all
the minors of
then so are
all
minors of
AB of order r.
1 ,2. There is one type of multiplication which gives a theorem of a more precise character.
rows and n columns and B is a non// A has singular square matrix of order n, then A and AB have the same rank; and if C is a non-singular square matrix of order m, then
THEOREM
34.
and
CA
If the rank of
But A = AB
n,
and the rank of AB is />, then/o ^ r. B~ and so r, the rank of A, cannot exceed p and which are the ranks of the factors AB and B" 1 Hence p = r.
is
CA
is similar.
characteristic equation; latent roots 2.1. Associated with every square matrix A of order n
The
is
the
is
matrix
A/,
where /
is
n and A
any number,
real or complex.
In
.
full, this
.
matrix
is
ra n
a 12
a 21
a 22
a n2
this
aln a 2n ann~~
of the form
an\
'
'
The determinant of
/(A)
matrix
n
(X
is
(-l)
0,
+p
is,
n -l
where the
pr
(1)
roots
that
of
(2) \A-\I\ = 0, are called the LATENT ROOTS of the matrix A and the equation itself is called the CHARACTERISTIC EQUATION of A.
111
THEOREM
35.
satisfies
its
own
if
then
A"+p l A*- l +...+p n _ l A+p n I = 0. The adjoint matrix of A XI, say J5, is a matrix
in X of degree
whose
the
ele-
or
less,
coeffi.
A-i,
(3)
are matrices
in
joint =
Now, by Theorem
24, the product of a matrix by its addeterminant of the matrix x unit matrix. Hence
(A-\I)B
\A-\I\xI
Since
(1).
B is given by
(3),
we have
This equation
is
true for
all
values of A and
we may
therefore
equate coefficients! of
A; this gives us
.i
=(-!)"!>!/,
equations
+ ...+
JPll )a li
(t,
1,...,
n),
so that the equating of coefficients is really the ordinary algebraical procedure of equating coefficients, but doing it n 8 times for each power of A.
112
Hence we have
= A n I+A n - p I+...+A. Pn . I+pn I = (-l)-{-A-Bn . +A^(-Bn _ +ABn _ )+...+ +A(-B +AB )+AB = 0.
l
ll
A, so
that
\A-XI\
it
follows that
(Example
13, p. 81),
(A-XilKA-Xtl^A-Xtl) =
It does not follow that
A+ Pl A-*+...+p
A
=
is
0.
Xr I
the
zero matrix.
3.
// g(t) is a polynomial in t, and A is a square matrix, then the determinant of the matrix g(A) is equal to
THEOREM
36.
the product
Let
so that
g(t)
g(A)
10,
r=l
The theorem
also holds
when g(A)
is
is
a rational function
non -singular.
3.2. Further, let g^A) and g 2 (A) be two polynomials in the matrix A, g 2 (A) being non-singular. Then the matrix
that
is
113
a rational function of A with a non-singular denominator g2 (A ). Applying the theorem to this function we obtain, on
writing
g^A^g^A)
= g(A),
r-l
From
this follows
36.
THEOREM
the matrix
DEFINITION
matrix are
(i) (ii)
18.
the interchange of
a row
(or
column) by
a constant
(iii)
row (or column) of a constant multiple of the elements of another row (or column).
the addition to the elements of one
Let A be a matrix of rank r. Then, as we shall prove, it can be changed by elementary transformations into a matrix of the
form
7
I'
"J o
where / denotes the unit matrix of order r and the zeros denote
null matrices, in general rectangular.
elementary transformations of type (i) a matrix such that the minor formed by the by replace elements common to its first r rows and r columns is not equal to zero. Next, we can express any other row of A as a sum of
In the
first place, f
multiples of these r rows; subtraction of these multiples of the rows from the row in question will ther6fore give a row of
zeros.
(iii),
follows
is
4702
114
matrix unaltered
before us
is
1)
P
o o
denotes a matrix, of r rows and columns, such that \P\ 7^0, and the zeros denote null matrices. By working with columns where before we worked with rows, this can be transformed by elementary transformations, of type
where
(iii),
to
P
full,
C"|
h'-of
Finally, suppose
P is,
Q-\
in
U-t
^2
(i)
and
and
so, step by step, to a matrix having zeros in all places other than the principal diagonal and non-zeros| in that diagonal. A final series of elementary transformations, of type (ii), presents us with /, the unit matrix of order r.
4.2.
As Examples VII,
4.1
envisaged in
tion of
A by non-singular matrices.
t If a diagonal element were zero the rank of these transformations leave the rank unaltered.
would be
less
than
r: all
115
_r/
where
Tr
matrices.
4.3.
DEFINITION
19.
Two
matrices are
EQUIVALENT
if it is
possible to
to the other
by a chain of elementary
transformations.
are equivalent, they have the same rank J3 2 Cv C2 such that there are non-singular matrices l9
If
A! and
A2
and
BI A 1 C
From
this it follows that
= /;- B A
2
C2
Ai
= B^BtA&C^ - LA M,
2
= B 1 B 2 and so is non-singular, and = C2 (7f l say, where L and is also non-singular. Accordingly, when two matrices are equivalent each can be obtained from the other through pre- and post-multiplication
by non-singular
Conversely,
matrices,
matrices.
if
A2
is
of rank
r,
and
are non-singular
and
then, as we shall prove, we can pass from A 2 to A l9 or from A! to 2 by elementary transformations. Both A 2 and l are
A2
to
to T r TT by elementary transformations and so, by using the inverse r operations, we can pass from I' to A by elementary transformations. Hence we can pass from A 2 to A v or from A to A by elementary transformations.
l
2y
We
4.4. The detailed study of equivalent matrices is of fundamental importance in the more advanced theory. Here we have done no more than outline some of the immediate con-
Theory of Canonical
CHAPTER X
ALGEBRAIC FORMS: LINEAR TRANSFORMATIONS
1.
Number
fields
We
a number
field
liminary note.
DEFINITION 1. A set of numbers, real or complex, a FIELD of numbers, or a number field, when, if r and to the set and s is not zero,
r+s,
also belong to tKe
set.
is called
s belong
rs,
rxs,
r~s
fields are
the
the
field
field
of
of of
of
all
rational real
numbers (gr
say);
numbers;
(iii)
the
the
field
field
all
must contain each and every number that is contained in $r (example (i) above); it must contain 1, since it contains the quotient a/a, where a is any number of the set; it must contain 0, 2, and every integer, since it contains the difference 1 1, the sum 1 + 1, and so on; it must contain
Every number
field
it
by
2.
another.
Linear and quadratic forms 2.1. Let g be any field of numbers and let a^, b^,... be determinate numbers of the field; that is, we suppose their valuefi to be fixed. Let xr yr ,... denote numbers that are not to be thought of as fixed, but as free to be any, arbitrary, numbers from a field g 1? not necessarily the same as g. The numbers we call CONSTANTS in g; the symbols xr yr ,... we call aip &#> VARIABLES in $ v
,
An expression such as

a_1 x_1 + a_2 x_2 + ... + a_n x_n     (1)

is said to be a LINEAR FORM in the variables x_i; an expression such as

Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij x_i y_j     (2)

is said to be a BILINEAR FORM in the variables x_i and y_j; an expression such as

Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij x_i x_j   (a_ij = a_ji)     (3)

is said to be a QUADRATIC FORM in the variables x_i. In each case, when we wish to stress the fact that the constants a_ij belong to a certain field, F say, we refer to the form as one with coefficients in F. A form in which the variables are necessarily real numbers is said to be a 'FORM IN REAL VARIABLES'; one in which both coefficients and variables are necessarily real numbers is said to be a 'REAL FORM'.

2.2. It should be noticed that the term 'quadratic form' is used of (3) only when a_ij = a_ji. This restriction of usage is dictated by experience, which shows that the consequent theory is more compact when such a restriction is imposed. The restriction is, moreover, not a real one: for an expression such as Σ Σ b_ij x_i x_j, in which b_ij is not necessarily equal to b_ji, is the same function of the variables as the quadratic form (3) when we write a_ij = a_ji = ½(b_ij + b_ji).

For example, x² + 3xy + 2y² is a quadratic form in the two variables x and y, having coefficients

a_11 = 1,   a_12 = a_21 = 3/2,   a_22 = 2.

2.3.
Matrices associated with linear and quadratic forms. The symmetrical square matrix A = [a_ij], having a_ij in its ith row and jth column, is associated with the quadratic form (3): it is symmetrical because a_ij = a_ji, and so A', the transpose of A, is equal to A. In the sequel it will frequently be necessary to bear in mind that the matrix of a quadratic form is symmetrical.

We associate with the bilinear form (2) the matrix [a_ij] of m rows and n columns; and with the linear form (1) we associate the single-row matrix [a_1,..., a_n]. More generally, with the set of m linear forms

Σ_{j=1}^{n} a_ij x_j   (i = 1,..., m)     (4)

we associate the matrix [a_ij] of m rows and n columns.

2.4. Notation. We denote the associated matrix [a_ij] of any one of (2), (3), or (4) by the single letter A. We may then conveniently abbreviate the bilinear form (2) to A(x, y), the quadratic form (3) to A(x, x), and the m linear forms (4) to Ax.

The first two are merely shorthand notations; the third, though it also can be so regarded, is better envisaged as the product of the matrix A by the matrix x, a single-column matrix having x_1,..., x_n as elements: the matrix product Ax, which has as many rows as A and as many columns as x, is then a single-column matrix of m rows having the m linear forms as its elements.
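The following short numerical illustration (my own, assuming numpy is available) shows both readings of the notation: Ax as a column of linear forms, and x'Ax as the value of the quadratic form.

```python
# The matrix A of x^2 + 3xy + 2y^2, the product Ax, and the form as x'Ax.
import numpy as np

A = np.array([[1.0, 1.5],     # a12 = a21 = 3/2, as in the example of 2.2
              [1.5, 2.0]])
x = np.array([[2.0],          # single-column matrix of the variables
              [1.0]])

print(A @ x)                  # single-column matrix of the two linear forms
print((x.T @ A @ x).item())   # value of the quadratic form: 4 + 6 + 2 = 12
```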
2.5. Matrix expression of the forms. As in 2.4, let x denote a single-column matrix with elements x_1,..., x_n; then x', the transpose of x, is a single-row matrix with elements x_1,..., x_n. Let A denote the matrix of the quadratic form Σ Σ a_rs x_r x_s. Then Ax is the single-column matrix having

a_r1 x_1 + ... + a_rn x_n

as the element in its rth row, and the product x'Ax (the product of a single-row matrix, a square matrix, and a single-column matrix) is a matrix of one row and one column whose sole element is the quadratic form Σ Σ a_rs x_r x_s. Similarly, when x and y are single-column matrices having x_1,..., x_n and y_1,..., y_n as row elements, the bilinear form (2) is the sole element of the matrix product x'Ay.

DEFINITION 2. The DISCRIMINANT of the quadratic form

Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij x_i x_j   (a_ij = a_ji)

is the determinant of its coefficients, namely |a_ij|.
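For instance (an illustration of my own, assuming numpy), the discriminant of x² + 3xy + 2y² is the determinant of its coefficient matrix:

```python
# Discriminant of x^2 + 3xy + 2y^2: det [[1, 3/2], [3/2, 2]] = 2 - 9/4 = -1/4.
import numpy as np
print(np.linalg.det(np.array([[1.0, 1.5], [1.5, 2.0]])))   # -0.25
```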
3. Linear transformations

3.1. DEFINITION 3. The set of equations

X_i = Σ_{j=1}^{n} a_ij x_j   (i = 1,..., n),     (1)

wherein the a_ij are given constants and the x_j are variables, is said to be a LINEAR TRANSFORMATION connecting the variables x_j and the variables X_i. When the a_ij are constants in a given field F we say that the transformation has coefficients in F; when the a_ij are real we say the transformation is real.

The determinant |a_ij| is called the MODULUS OF THE TRANSFORMATION.

DEFINITION 4. A transformation is said to be NON-SINGULAR when its modulus is not zero, and is said to be SINGULAR when its modulus is zero. We sometimes speak of (1) as a transformation from the x_i to the X_i or, briefly, x → X.

3.2. The transformation (1) is a matrix equation

X = Ax,     (2)

in which X and x denote single-column matrices with elements X_1,..., X_n and x_1,..., x_n respectively.

When A is a non-singular matrix, it has a reciprocal A⁻¹ (Chap. VII, 3) and

A⁻¹X = A⁻¹Ax = x,     (3)

which expresses x directly in terms of X. Also (A⁻¹)⁻¹ = A, and so, with a non-singular transformation, it is immaterial whether it be given in the form x → X or in the form X → x. Moreover, given any X whatsoever, there is one and only one corresponding x, and it is given by x = A⁻¹X.

When A is a singular matrix, there are X for which no corresponding x can be defined. For in such a case r, the rank of the matrix A, is less than n, and we can select r rows of A and express every other row as a sum of multiples of these rows (Theorem 31). Accordingly, (1) gives relations

X_k = Σ_i λ_ki X_i   (k = r+1,..., n),     (4)

wherein the λ_ki are constants, so the relations (4) will hold whatever the values of x. For example, in a singular linear transformation in two variables the pair X, Y must satisfy a fixed linear relation, and for a pair X, Y that does not satisfy that relation there is no corresponding pair of values x, y.
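A small numerical illustration of the singular case (my own example, assuming numpy; the matrix below is not taken from the text): when the matrix has rank 1, only those X satisfying X_1 = 2X_2 arise from some x.

```python
import numpy as np

A = np.array([[2.0, 4.0],
              [1.0, 2.0]])
print(np.linalg.matrix_rank(A))                # 1, less than n = 2: singular
try:
    np.linalg.solve(A, np.array([1.0, 0.0]))   # ask for an x with X = (1, 0)
except np.linalg.LinAlgError as err:
    print("no corresponding x:", err)          # (1, 0) does not satisfy X_1 = 2*X_2
print(A @ np.array([3.0, -1.0]))               # X = (2, 1) does arise, from x = (3, -1)
```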
3.3. The product of two transformations. Let x, y, z be 'numbers' (Chap. VI, 1), each with n components, and let all suffixes be understood to run from 1 to n. Use the summation convention.

The transformation

x_i = a_ij y_j     (1)

may be written as x = Ay (3.2), and the transformation

y_j = b_jk z_k     (2)

may be written as y = Bz.

If in (1) we substitute for the y_j from (2), we obtain

x_i = a_ij b_jk z_k,     (3)

a transformation whose matrix is the product AB (compare Chap. VI, 4). Thus the result of the two successive transformations x = Ay and y = Bz may be written as

x = ABz.

Since AB is a matrix, we adopt a definition that accords with this fact.

DEFINITION 5. The transformation x = ABz is called the PRODUCT OF THE TWO TRANSFORMATIONS x = Ay, y = Bz. [It is the result of the successive transformations x = Ay, y = Bz.]

The modulus of the transformation (3) is the determinant |a_ij b_jk|, that is, the product of the two determinants |A| and |B|. Similarly, if we have three transformations in succession, say

x = Ay,   y = Bz,   z = Cu,

the resulting transformation is x = ABCu and the modulus of this transformation is the product of the three moduli |A|, |B|, and |C|; and so for any finite number of transformations.
4. Transformation of quadratic forms

4.1. Let the transformation be

x_i = Σ_{k=1}^{n} b_ik X_k   (i = 1,..., n)     (1)

and let the quadratic form be

A(x, x) = Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij x_i x_j   (a_ij = a_ji).     (2)

When we substitute for the x_i by means of (1), we obtain a quadratic expression in X_1,..., X_n; this expression is the result of applying the transformation (1) to the form (2).

THEOREM 37. If a transformation x = BX is applied to a quadratic form A(x, x), the result is a quadratic form C(X, X) whose matrix C is given by

C = B'AB,

where B' is the transpose of B.

First proof. We give first a proof that depends entirely on matrix theory. By hypothesis, A(x, x) is given by the matrix product x'Ax (compare 2.5). Since x = BX, we have x' = X'B' (Theorem 23). Hence

x'Ax = X'B'ABX = X'CX,     (3)

where C = B'AB.

Moreover, (3) is not merely a quadratic expression in X_1,..., X_n, but is, in fact, a quadratic form having c_ij = c_ji. To prove this we observe that, when C = B'AB,

C' = (B'AB)' = B'A'B (Theorem 23, Corollary) = B'AB (since A' = A) = C,

so that the matrix C is symmetrical.

Second proof. Let all suffixes run from 1 to n and use the summation convention. The quadratic form is

a_ij x_i x_j   (a_ij = a_ji),     (4)

and the transformation

x_i = b_ik X_k     (5)

expresses (4) in the form

c_kl X_k X_l.     (6)

We may calculate the values of the c's by carrying out the actual substitutions for the x in terms of the X. We have

a_ij x_i x_j = a_ij b_ik X_k b_jl X_l,

so that

c_kl = b_ik a_ij b_jl,

which is equal to the element in the kth row and lth column of the matrix product B'AB, since b_ik is the element in the kth row and ith column of B'. Moreover, since a_ij = a_ji, the sum b_il a_ij b_jk, which is c_lk, is the same sum as b_ik a_ij b_jl, or c_kl. Thus (6) is a quadratic form, having c_kl = c_lk.
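A quick numerical check of Theorem 37 (my own illustration, assuming numpy; the random matrices are not from the text):

```python
# Transforming A(x, x) by x = BX gives a form in X whose matrix is B'AB.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)); A = (A + A.T) / 2      # a symmetric matrix
B = rng.standard_normal((3, 3))                         # matrix of the transformation
X = rng.standard_normal(3)

x = B @ X
lhs = x @ A @ x                   # A(x, x) evaluated at x = BX
rhs = X @ (B.T @ A @ B) @ X       # C(X, X) with C = B'AB
print(np.isclose(lhs, rhs))       # True
print(np.allclose(B.T @ A @ B, (B.T @ A @ B).T))   # C is again symmetric
```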
4.2. THEOREM 38. If transformations x = CX, y = BY are applied to a bilinear form A(x, y), the result is a bilinear form D(X, Y) whose matrix D is given by

D = C'AB,

where, as in Theorem 37, C' is the transpose of C.

First proof. As in Theorem 37, we may write the bilinear form as a matrix product, namely x'Ay. It follows at once, since x' = X'C' and y = BY, that

x'Ay = X'C'ABY = X'DY,

where D = C'AB.

Second proof. This proof is given chiefly to show how much detail is avoided by the use of matrices: it will also serve as an exercise in handling the summation notation.

The bilinear form may be written as

A(x, y) = Σ_i x_i ( Σ_k a_ik y_k ),

so that the matrix of this bilinear form, having a_ik in the ith row and kth column, is A. If the y's alone are transformed, by y = BY, the coefficient of x_i Y_k in the resulting bilinear form in x and Y is Σ_j a_ij b_jk. This matrix is therefore AB, where A is [a_ik] and B is [b_ik].

Now transform the x's by putting

x_i = Σ_k c_ik X_k   (i = 1,..., m),     (3)

leaving the y's unchanged. Then

Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij x_i y_j = Σ_{j=1}^{n} Σ_{k=1}^{m} ( Σ_{i=1}^{m} c_ik a_ij ) X_k y_j.     (4)

The sum Σ_i c_ik a_ij is the element in the kth row and jth column of the transpose of C, that is of C' as given by (3), multiplied into A; that is to say, the matrix of the bilinear form (4) is C'A. The theorem follows on applying the two transformations in succession.
4.4. THEOREM 39. Let the quadratic form A(x, x) = Σ Σ a_ij x_i x_j be changed by the transformation

x_i = Σ_{k=1}^{n} l_ik X_k   (i = 1,..., n),

whose modulus, the determinant |l_ik|, is equal to M; let the resulting form be

Σ_{i=1}^{n} Σ_{j=1}^{n} λ_ij X_i X_j   (λ_ij = λ_ji).     (2)

Then, in symbols,

|λ_ij| = M² |a_ij|.

The content of this theorem is the statement 'WHEN A QUADRATIC FORM IS CHANGED TO NEW VARIABLES, THE DISCRIMINANT IS MULTIPLIED BY THE SQUARE OF THE MODULUS OF THE TRANSFORMATION'.

The result is an immediate corollary of Theorem 37. By that theorem, the matrix of the form (2) is C = L'AL, where L is the matrix of the transformation. The discriminant of (2), that is, the determinant |C|, is (Theorem 10) given by the product of the three determinants |L'|, |A|, and |L|. But |L| = M and, since a determinant is unaltered by the interchange of rows and columns, |L'| is also equal to M. Hence |C| = M²|A|.

This theorem is of fundamental importance; it will be used many times in the chapters that follow.
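Theorem 39 is easy to confirm numerically. The check below is my own (assuming numpy; the random matrices are illustrative only):

```python
# det(L'AL) equals M^2 * det(A), where M = det(L).
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)); A = (A + A.T) / 2   # matrix of a quadratic form
L = rng.standard_normal((4, 4))                      # matrix of the transformation
M = np.linalg.det(L)

print(np.isclose(np.linalg.det(L.T @ A @ L), M**2 * np.linalg.det(A)))   # True
```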
5. Hermitian forms

5.1. In its simplest manifestation a HERMITIAN BILINEAR FORM is given by

Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij x_i ȳ_j,     (1)

wherein the coefficients a_ij belong to the field of complex numbers and are such that

ā_ij = a_ji.     (2)

The bar denotes, as in the theory of the complex variable, the conjugate complex; that is, if z = α + iβ, where α and β are real, then z̄ = α − iβ.

A matrix A = [a_ij] that satisfies (2) is said to be a HERMITIAN MATRIX; a form such as (1) is said to be a HERMITIAN FORM. The theory of these forms is very similar to that of ordinary bilinear and quadratic forms. Theorems concerning Hermitian forms are, for the most part, indicated in the examples at the ends of this and the following chapters.
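A numerical illustration (mine, assuming numpy; compare Example 12 below, which asks for a proof): when the matrix equals its own conjugate transpose, the value of the Hermitian form is always real.

```python
# For a Hermitian matrix A, the value conj(x)' A x is real for every complex x.
import numpy as np

A = np.array([[2.0, 1 - 2j],
              [1 + 2j, 3.0]])          # A equals its conjugate transpose
x = np.array([1 + 1j, 2 - 3j])

print(np.conj(x) @ A @ x)              # (21+0j): the imaginary part vanishes
```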
6. Cogredient and contragredient sets

6.1. When two sets of variables x_1,..., x_n and y_1,..., y_n are related to two other sets X_1,..., X_n and Y_1,..., Y_n by the same transformation, say

x = AX,   y = AY,     (1)

the sets x, y (equally, X, Y) are said to be COGREDIENT sets of variables. When a set z_1,..., z_n is related to a set Z_1,..., Z_n by a transformation whose matrix is the transpose of A, that is, by Z = A'z, the sets x and z (equally, X and Z) are said to be CONTRAGREDIENT sets of variables.
A change of triangle of reference in plane geometry provides an example of the one kind of transformation accompanied by the other. Let (1) denote the transformation of homogeneous point-coordinates produced by such a change. The sets of variables (x_1, y_1, z_1) and (x_2, y_2, z_2), regarded as the coordinates of two distinct points, are cogredient sets: the coordinates of the two points referred to the new triangle of reference are (X_1, Y_1, Z_1) and (X_2, Y_2, Z_2), and each is obtained by putting the appropriate suffix in the equations (1).
Again, let

lx + my + nz = 0     (2)

be regarded as the equation of a given line α in a system of homogeneous point-coordinates (x, y, z) with respect to a given triangle of reference. Then the line (tangential) coordinates of α are (l, m, n). Take a new triangle of reference and suppose that (1) is the transformation for the point-coordinates. Then (2) becomes

LX + MY + NZ = 0,     (3)

where L, M, N are linear forms in l, m, n whose matrix of coefficients in (3) is the transpose of the matrix of the coefficients in (1). Hence point-coordinates (x, y, z) and line-coordinates (l, m, n) are contragredient variables when the triangle of reference is changed. [Notice that (1) is X, Y, Z → x, y, z whilst (3) is l, m, n → L, M, N.]
The notation of the foregoing becomes more compact when we consider n dimensions, a point x with coordinates (x_1,..., x_n) and a 'flat' l with coordinates (l_1,..., l_n): the transformation x = AX of the point-coordinates is then accompanied by the transformation L = A'l of the flat-coordinates.
6.2. Another example of contragredient sets, one that contains the previous example as a particular case, is provided by differential operators. Let F(x_1,..., x_n) be expressed as a function of X_1,..., X_n by means of the transformation x = AX, where A denotes the matrix [a_ij]. Then

∂F/∂X_i = Σ_j (∂F/∂x_j)(∂x_j/∂X_i) = Σ_j a_ji ∂F/∂x_j.

That is, if x = AX, the operators ∂/∂x_1,..., ∂/∂x_n transform into ∂/∂X_1,..., ∂/∂X_n by the matrix A': the operators ∂/∂x are contragredient to the variables x.
7.1. Consider the transformation

X_i = Σ_{j=1}^{n} a_ij x_j   (i = 1,..., n).     (1)

Is it possible to choose x_1,..., x_n so that X_i = λ x_i (i = 1,..., n)? For such a result to hold, we must have

λ x_i = Σ_{j=1}^{n} a_ij x_j   (i = 1,..., n),     (2)

and this demands (Theorem 11), when one of x_1,..., x_n differs from zero, that λ be a root of the equation

Δ(λ) ≡ | a_11 − λ   a_12   ...   a_1n ;   a_21   a_22 − λ   ...   a_2n ;   ...   ;   a_n1   a_n2   ...   a_nn − λ | = 0.     (3)
This equation is called the CHARACTERISTIC EQUATION of the transformation (1); any root of the equation is called a CHARACTERISTIC NUMBER of the transformation.

Let λ_1 be a characteristic number. Then there are values of x_1,..., x_n, not all zero (Theorem 11), that satisfy equations (2) when λ = λ_1. If the determinant Δ(λ_1) is of rank (n−1), there is, apart from a common multiplier, only one set x_1,..., x_n that satisfies (2); if the rank of Δ(λ_1) is less, there are more such sets (Chap. VIII, 5). A set x_1,..., x_n, not all zero, that satisfies (2) when λ is equal to a characteristic number λ_r is called a POLE CORRESPONDING TO λ_r.
7.2. If (x_1, x_2, x_3) and (X_1, X_2, X_3) are homogeneous coordinates referred to a given triangle of reference in a plane, then (1), with |A| ≠ 0, may be regarded as a method of generating a one-to-one correspondence between the variable point (x_1, x_2, x_3) and the variable point (X_1, X_2, X_3). A pole of the transformation is then a point that corresponds to itself.

We shall not elaborate the geometrical implications† of such transformations; a few examples are given below.
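Numerically, the characteristic numbers of (1) are the eigenvalues of the matrix A and a pole is an eigenvector; the sketch below (my own, assuming numpy) illustrates this for a small matrix of my own choosing.

```python
# Characteristic numbers and a pole of a transformation X = Ax.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
roots, vectors = np.linalg.eig(A)
print(roots)                           # characteristic numbers: 3 and 1 (in some order)
pole = vectors[:, 0]
print(A @ pole, roots[0] * pole)       # the two agree: this point is a pole
```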
EXAMPLES X

1. Prove that all numbers of the form a + b√3, where a and b are integers (or zero), constitute a ring; and that all numbers of the form a + b√3, where a and b are the ratios of integers (or zero), constitute a field.
2. Express ax² + 2hxy + by² as a quadratic form (in the sense of the definition of 2.1) and show that its discriminant is ab − h².

3.
is
transformations
z
4.
+w + n
3 7?
\ 3 X+fjL 3
Y+VsZ
, ,
Prove that, in solid geometry, if i^j^lq and i a J 2 k 2 are two unit frames of reference for cartesian coordinates and if, for r = 1,2,
if
jr
kf
jr
k,
ir ,
k,
ir
-j
r,
then the transformation of coordinates has unit modulus. [Omit if the vector notation is not known.]
5.
Verify
Theorem
37,
quadratic form
is
l 2 X-{-m 2 Y, by actual and the transformation is x = l^X-^m^^ y substitution on the one hand and the evaluation of the matrix product (B'AB of the theorem) on the other.
6.
5,
when
the
bilinear
form
is
7.
ABC
Z),
E,
referred to
(
as triangle of reference are (x^y^z^ (# 2 2/2 2 2) an(l Prove that, if (l,m,n} are line -coordinates referred to (L, N) are line -coordinates referred to DEF, then
ABC
x ^y^^)-
and
AZ
= XiL+XtM+XaN,
etc.,
f9 ...
x, y, z
>
X, Y, Z and then
f Cf. R. M. Winger, An Introduction to Protective Geometry (New York, 1922), or, for homogeneous coordinates in three dimensions, G. Darboux, Principes de gtomttrie analytique (Gauthier-Villars, Paris, 1917).
In the transformation
and
y,
deiined
by
y=Bx,
are introduced.
Y = BX
1
.
(|B|ytO),
Y=
Ct/,
where
C = BAB"
> y
is
given by
9. The transformations # = JB-X", y = -B^ change the Hermitianf a lj x i yj (with A' A) into c^-X^Iy. Prove that
C=
10.
BX,
(A'
gate complex x
where C
11.
=
JB
is
together with
its
conjuf
A) into c^X^Xj (C
C),
B'AB.
and
the quadratic form A (x, x) (or by singular transformation x = every coefficient of C belongs to ft.
Prove that,
BX
belong to a given field ft. transformed by the nonBx) into C(X,X), then
Prove that, if each xr denotes a complex variable and a rs are com= 1,..., n), the Hermitian numbers such that a rs a 8r (r 1,..., n; 8 plex form a rs x r xs is a real number.
12.
character.]
if
by
the-
y'
then every straight line transforms into a straight line. Prove also that there are in general three points that are transformed into themselves and three lines that are transformed into themselves. Find the conditions to be satisfied by the coefficients in order that
every point on a given line may be transformed into itself. 14. Show that the transformation of Example 13 can, in general, by a change of triangle of reference be reduced to the form
X'
Hence, or otherwise,
Z,' = Y' - J8F, aX, yZ. show that the transformation is a homology (plane
perspective, collineation)
if
will
make
al o3
of rank one.
&i
cl
^2
^
c3
ca
63
HINT.
ing lines
f
itself.
of rank one there is a line I such that In such a case any two correspond-
employed
in
Examples
9, 10,
and
12.
15.
two dimensions, the homology which the point (x{ y{ z{) is the
f
centre
16.
px+a(\x+fjLy-\~vz)
Show
of
it
involutory
(i.e.
two applications
if
[Remember
(z',y',z').]
CHAPTER XI
THE POSITIVE-DEFINITE FORM
1. Definite real forms

1.1. DEFINITION 6. The real quadratic form

Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij x_i x_j   (a_ij = a_ji)

is said to be POSITIVE-DEFINITE if it is positive for every set of real values of x_1,..., x_n other than the set x_1 = x_2 = ... = x_n = 0. It is said to be NEGATIVE-DEFINITE if it is negative for every set of real values of x_1,..., x_n other than x_1 = x_2 = ... = x_n = 0.
For example, 3x_1² − 2x_2² is not positive-definite, while 3x_1² + 2x_2² is positive-definite. The second is positive for every pair of real values x_1, x_2 other than x_1 = x_2 = 0; the first is positive for some pairs of real values, negative for others, and is zero when x_1 = √2 and x_2 = √3.

An example of a real form that can never be negative but is not positive-definite is given by

(3x_1 − 2x_2)² + x_3².

This quadratic form is zero when x_1 = 2, x_2 = 3, and x_3 = 0, and so it does not come within the scope of Definition 6. The point of excluding such a form from the definition is that, whereas it appears as a function of three variables, x_1, x_2, x_3, it is essentially a function of two variables, namely,

X_1 = 3x_1 − 2x_2,   X_2 = x_3.

A positive-definite form in x_1,..., x_n is still positive-definite when it is expressed in terms of a new set of variables, provided only that the two sets are connected by a real non-singular transformation. This we now prove.
THEOREM 40. A real positive-definite form in the n variables x_1,..., x_n is a positive-definite form in the n variables X_1,..., X_n, provided that the two sets of variables are connected by a real, non-singular transformation.

Let B(x, x) be the form and let the real non-singular transformation be X = Ax. Then x = A⁻¹X. Let B(x, x) become C(X, X) when expressed in terms of X. Since B(x, x) is positive-definite, C(X, X) is positive for every X save that which corresponds to x = 0. But X = Ax, where A is non-singular, and so the only X that corresponds to x = 0 is X = 0. Hence C(X, X) is positive for every X other than X = 0; that is, C(X, X) is positive-definite.

NOTE. The equation X = Ax is a matrix equation, x being a single-column matrix whose elements are x_1,..., x_n.
1.2. An obvious example of a positive-definite form in n variables is

a_11 x_1² + ... + a_nn x_n²   (a_rr > 0).

We now show that every positive-definite form in n variables can be transformed into a form of this obvious type.

THEOREM 41. Every real positive-definite form, B(x, x), can be transformed by a real transformation of unit modulus into a form

c_11 X_1² + ... + c_nn X_n²,

wherein each c_rr is positive.

The manipulations that follow are typical of others that occur in later work. Here we give them in detail; later we shall refer back to this section and omit as many of the details as clarity permits.
B(x,x)
22 byx^
r=l
r<8
(by
bit )
The terms of
(1)
6ii^i+26 12 ^ 1 ^2 +...+26 ln a; 1 ^n
Moreover, since (1) xn ... and x 2
is
(2)
positive-definite, it is positive
0;
when x l
=1
hence 6 n
>
0.
(2)
may
be written as|
t It
is
n
2,...,
),
(3)
X =x
r
= (r
we have
B(x,x)
2 ,...,
X^
(4)
say.
The transformation
1
(3) is
biJbi
Hence (Theorem
j8 22
>
X =
X =
r
when
X is positive-definite ^ 2).
and
we have
J 22
P22
Let this
writing
+ form in X quadratic
Jl
/
a quadratic form in
3 ,...,
3 ,...,
Xn
Xn be 2 2 yrs^r^sJ
P2n
^22
/?
^hen,
on
T7
^1>
xr
[
Y"
/Q
/-*23
V"
i^
(5)
fe
7 - Xr
r
(r
3,...,
n),
we have
;.
(6)
The transformation
y 33
(5) is
>
(4),
form
(6) is positive-definite
by Theorem and
a quadratic form in
Z4
,...,
Zn
(7)
wherein 6 U
j3 22 ,
y33
are positive.
As a preparation
for the
next
that
i2
J 22
*8
+
733
733
6 1 if
8)
= x+ b* x+ _ + b* x
f2
x *+... + +
X +&x
J22
n,
We have thus transformed B(x,x) into (8), which is of the form required by the theorem; moreover, the transformation x -> | is (9), which is of unit modulus.
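The repeated completion of the square used in Theorem 41 is easy to carry out numerically. The sketch below is my own code (assuming numpy), not the book's working: it produces the positive coefficients c_11,..., c_nn for a positive-definite matrix.

```python
# Repeatedly complete the square on the leading variable (Theorem 41).
import numpy as np

def diagonal_coefficients(B):
    B = np.array(B, dtype=float)
    coeffs = []
    while B.size:
        c = B[0, 0]                    # positive for a positive-definite form
        coeffs.append(c)
        # the substitution X_1 = x_1 + (b_12/b_11)x_2 + ... has unit modulus and
        # leaves a form in one fewer variable, whose matrix is computed below
        B = B[1:, 1:] - np.outer(B[1:, 0], B[0, 1:]) / c
    return coeffs

B = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]
print(diagonal_coefficients(B))        # [2.0, 1.5, 1.333...] : all positive
```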
2. Necessary and sufficient conditions for a positive-definite form

THEOREM 42. A set of necessary and sufficient conditions for the form Σ Σ a_ij x_i x_j to be positive-definite is

a_11 > 0,   | a_11  a_12 ; a_21  a_22 | > 0,   ...,   | a_11 ... a_1n ; ... ; a_n1 ... a_nn | > 0;     (1)

that is, the leading diagonal minors of the discriminant, of orders 1, 2,..., n, must all be positive.
1.2,
positive-definite, then, as in
there
is
transformation
~ar~
n9
P22
is
transformed into
(2)
.+ Knn Xl
and
in this
form a n
(2) is
/? 22 ,...,
of the form
an
j8 22
...
and so
is
positive.
(1)
Now
consider
when x n
a ln _^
0.
nl variables x v
=
an
j8 22
...
By
xn _ v
to
nl terms
a n-I,n-l
and so
is
positive.
Similarly,
on putting x n
and x n _ l
an
j8 22
=
...
0,
to
2 terms,
set of conditions is necessary. the set of conditions holds, then, in the Conversely, and we may write, as in 1.2, place, a n
and so
on.
first
>
A(x,x)
au *
-f
X X3 +...,
2
(3)
say,
Aj
=
=
%i-\
-- #2
an
(fc
'
~ an
^N'
Z*
Before
**
> 1).
we can proceed we must prove that /? 22 is positive. for k > 2. Consider the form and transformation when x k ~ The discriminant of the form (3) is then a u j3 22 the modulus
,
of the transformation
is
unity,
is
a u a 22
af 2
two
variables x
and x 2 only,
a ilr22
tt
ll tt 22
tt
!2>
and so
may
by hypothesis. Hence
/? 22
is
positive,
and we
73
,...,
Yn
where
v = A v *i
i>
72 =
7 =
fc
-3L
fc
(A->2).
That
is,
we may
write
A(x,x)
a 11
(4)
where
!!
and
jS 22
are positive.
and the transformation from X to The discriminant of (4) is a n /? 22 y 33 the modulus of the transformation is unity, and the discriminant of (3) is
Consider the forms
for k
Y when x k ~
>
3.
13
which is positive by hypothesis. Hence, by Theorem 39 applied to a form in the three variables x lt x 2 # 3 only, the product a n ^22 733 ^ s e(l ual t ^he determinant (5), and so is positive.
,
Accordingly, y33
is
positive.
We may
we may
j
write
A (x
x) in the
/A\
lorm
wherein a n
,
V9
9.
V2
j8 22 ,...,
*n
Mi
(7)
P22
The form
(6) Ls positive-definite in
(7) is
X
}
ly ...,
Xn
transformation
y
non-singular,
A(x
x)
is
positive-definite in
x ly ... xn (Theorem
40).
y
2.2. We have considered the variables in the order x_1, x_2,..., x_n and we have begun with a_11. We might have considered the variables in the order x_n, x_{n−1},..., x_1 and begun with a_nn. We should then have obtained a different set of necessary and sufficient conditions, namely,

a_{n,n} > 0,   | a_{n,n}  a_{n,n−1} ; a_{n−1,n}  a_{n−1,n−1} | > 0,   ... .

Equally, any permutation of the order of the variables will give rise to a set of necessary and sufficient conditions.

2.3. The form A(x, x) is negative-definite if the form {−A(x, x)} is positive-definite. Accordingly, A(x, x) is negative-definite if and only if

a_11 < 0,   | a_11  a_12 ; a_21  a_22 | > 0,   | a_11  a_12  a_13 ; a_21  a_22  a_23 ; a_31  a_32  a_33 | < 0,   ...,

the signs of the leading minors alternating.
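The leading-minor test of Theorem 42 translates directly into a few lines of code. This is my own sketch (assuming numpy), with a matrix chosen for illustration:

```python
# Theorem 42: a real symmetric matrix gives a positive-definite form exactly
# when every leading diagonal minor is positive.
import numpy as np

def leading_minors(A):
    A = np.asarray(A, dtype=float)
    return [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

A = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
print(leading_minors(A))                       # [2.0, 3.0, 4.0] : all positive
print(all(m > 0 for m in leading_minors(A)))   # True: the form is positive-definite
```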
EXAMPLES XI
I
.
(i)
(ii)
is
is
not.
2.
^22
3 3
r-i
o n xr xt
(a rs
asr )
-i
is
where
K
:r
is
a positive-definite form in r 2
:r 3
whose discriminant
is
an
x%,
HINT.
X = a^x^a^x^-a^x^ X =
2
3.
in
2
.r,
y, z
F
is
(a l x-\-b l y-}-c l
a l^l
4.
C lJ
Extension of Example 3. Prove that a sum of squares of r distinct linear forms in n variables is a quadratic form in n variables of zero discriminant whenever r < n.
5.
Harder.
is,
in
general, equal to r: and that the exception to the general rule arises when the r distinct forms are not linearly independent.
6.
is
2
=-!
<**,
(r
l,...,w)
HINT. Compare Examples 3 and 4. 7. Prove that the discriminant of the quadratic form
r^s
is
^(xr ~xs Y
(r,s
l,...,n)
of rank
8.
n1.
minimum at xr = otr provided that 22/wfi6 1S a positive-definite form and each /j is zero. Write down conditions that /(#, y) may be a minimum at the point x == a, y ft.
If /(#!,..., # n ) is a function of n variables, /, denotes denotes d 2f/8x i dxj evaluated at xr a r (r = l,...,n), prove that / has a
9. By Example 12, p. 133, a Hermitian form has a real value for every set of values of the variables. It is said to be positive -definite when
this value
iCj
is positive for every set of values of the variables other than =...= x n = 0. Prove Theorem 40 for a positive -definite Hermitian form and any non -singular transformation.
by a transformation
Prove that a positive-definite Hermitian form can be transformed of unit modulus into a form
c u -Xi -X\+ ... +c nn wherein each c rr is positive. HINT. Compare the proof of Theorem 4 1
11.
x n %n
.
The determinant
\A
== [a^j
is
Hermitian form
22 a
O'
3,
Hermitian form
_ XX-\~YY
Y=
a 2 x-\-b^y-\- c 2 z,
4, 5,
where
is zero.
X=
c^tf-f b^y^c^z,
and
6.
CHAPTER XII
THE λ EQUATION AND CANONICAL FORMS

1. The λ equation of two quadratic forms

1.1. The discriminant of a quadratic form A(x, x) is multiplied by M² when the variables x are changed to variables X by a transformation of modulus M (Theorem 39).

If A(x, x) and C(x, x) are any two distinct quadratic forms in n variables and λ is an arbitrary parameter, the discriminant of the form A(x, x) − λC(x, x) is likewise multiplied by M² when the variables are submitted to a transformation of modulus M. Hence, if a transformation
x = BX, of modulus M = |b_ij|, changes

Σ a_rs x_r x_s,   Σ c_rs x_r x_s   into   Σ α_rs X_r X_s,   Σ γ_rs X_r X_s,

then

|α_rs − λγ_rs| = M² |a_rs − λc_rs|.

The equation |A − λC| = 0, or, in full,

| a_11 − λc_11   ...   a_1n − λc_1n ;   ...   ;   a_n1 − λc_n1   ...   a_nn − λc_nn | = 0,

is called THE λ EQUATION OF THE TWO FORMS. What we have said above may be summarized in the theorem:

THEOREM 43. The roots of the λ equation of any two quadratic
forms are unaltered when the variables are changed by a non-singular transformation. The coefficient of each power λ^r (r = 0, 1,..., n) is multiplied by the square of the modulus of the transformation.

1.2. The λ equation of A(x, x) and the form x_1² + ... + x_n² is called the CHARACTERISTIC EQUATION of the matrix A. It is, in full, the equation

| a_11 − λ   a_12   ...   a_1n ;   a_21   a_22 − λ   ...   a_2n ;   ...   ;   a_n1   a_n2   ...   a_nn − λ | = 0.

The roots of this equation are called the LATENT ROOTS of the matrix A. The equation itself may be denoted by |A − λI| = 0, where I is the unit matrix of order n.
The characteristic equation has no zero root when |a_ik| ≠ 0. When |a_ik| = 0, so that the rank of the matrix [a_ik] is r < n, there are, as we shall prove later, n − r zero roots.
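The latent roots are what modern numerical libraries call eigenvalues. The short check below is my own (assuming numpy; the rank-one matrix is chosen for illustration):

```python
# The latent roots of A are the roots of |A - lambda*I| = 0; for a matrix of
# rank r, n - r of them are zero, as stated above.
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [3.0, 6.0, 9.0]])            # rank 1
print(np.linalg.matrix_rank(A))            # 1
print(np.round(np.linalg.eigvals(A), 10))  # one root 14, two zero roots
```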
2. The reality of the latent roots

2.1. THEOREM 44. If [c_ik] is the matrix of a positive-definite form and [a_ik] is the matrix of any real quadratic form, then all the roots of the equation

| a_11 − λc_11   a_12 − λc_12   ...   a_1n − λc_1n ;   a_21 − λc_21   a_22 − λc_22   ...   a_2n − λc_2n ;   ...   ;   a_n1 − λc_n1   ...   a_nn − λc_nn | = 0

are real.

Let λ be any root of the equation. Then the determinant vanishes and (Theorem 11) there are numbers Z_1, Z_2,..., Z_n, not all zero, such that

Σ_{s=1}^{n} (a_rs − λc_rs) Z_s = 0   (r = 1,..., n);

that is,

λ Σ_{s=1}^{n} c_rs Z_s = Σ_{s=1}^{n} a_rs Z_s   (r = 1,..., n).     (1)

Multiply the equation (1) by Z̄_r, the conjugate complex of Z_r, and sum for r = 1,..., n. On writing Z_r = X_r + iY_r, where i = √(−1) and X_r and Y_r are real, we obtain terms

Z_r Z̄_r = X_r² + Y_r²,   Z_r Z̄_s + Z_s Z̄_r = 2(X_r X_s + Y_r Y_s).

Hence the sum obtained from (1) is

λ { Σ_r c_rr (X_r² + Y_r²) + 2 Σ_{r<s} c_rs (X_r X_s + Y_r Y_s) }
   = Σ_r a_rr (X_r² + Y_r²) + 2 Σ_{r<s} a_rs (X_r X_s + Y_r Y_s),

or, in the notation of quadratic forms,

λ {C(X, X) + C(Y, Y)} = A(X, X) + A(Y, Y).     (2)

Since C(x, x) is positive-definite and since the X_r and Y_r are not all zero, the coefficient of λ in equation (2) is positive. Moreover, since each a_ik is real, the right-hand side of (2) is real, and hence λ must be real.

NOTE. If the coefficient of λ in (2) were zero, then (2) would tell us nothing about λ. It is to preclude this that we require C(x, x) to be positive-definite.
COROLLARY. If both A(x, x) and C(x, x) are positive-definite, every root of |A − λC| = 0 is positive; for (2) then expresses each root as the quotient of two positive numbers.

2.2. When c_rr = 1 and c_rs = 0 (r ≠ s), the form C(x, x) is x_1² + x_2² + ... + x_n² and is positive-definite. Thus Theorem 44 contains, as a special case, the following theorem, one that has many applications.

THEOREM 45. When [a_ik] is the matrix of a real quadratic form, every root of |A − λI| = 0, that is, of

| a_11 − λ   a_12   ...   a_1n ;   a_21   a_22 − λ   ...   a_2n ;   ...   ;   a_n1   a_n2   ...   a_nn − λ | = 0,

is real. When [a_ik] is the matrix of a positive-definite form, every such root is positive.
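Theorems 44 and 45 can be confirmed numerically. The sketch below (my own, assuming numpy; the random matrices are illustrative) uses the fact that, when |C| ≠ 0, the roots of |A − λC| = 0 are the eigenvalues of C⁻¹A.

```python
# With C positive-definite and A real symmetric, every root of |A - lambda*C| = 0 is real.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4)); A = (A + A.T) / 2        # real symmetric
M = rng.standard_normal((4, 4)); C = M @ M.T + np.eye(4)  # positive-definite
roots = np.linalg.eigvals(np.linalg.inv(C) @ A)
print(np.allclose(roots.imag, 0))                          # True: every root is real
```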
3. Canonical forms

3.1. In this section we prove that, if A = [a_ik] is a square matrix of order n and rank r, then there is a non-singular transformation from variables x_1,..., x_n to variables ξ_1,..., ξ_n that changes A(x, x) into

g_1 ξ_1² + ... + g_r ξ_r²,     (3)

where g_1,..., g_r are distinct from zero. We show, further, that if the a_ik are elements in a particular field F (such as the field of real numbers), then so are the coefficients of the transformation and the resulting g_i. We call (3) a CANONICAL FORM.

The first step is to show that any given quadratic form can be transformed into one whose coefficient of x_1² is not zero. This is done in 3.2 and 3.3.
3.2. Elementary transformations. We shall have occasion to use two types of transformation.

TYPE I. The transformation

x_1 = X_r,   x_r = X_1,   x_s = X_s   (s ≠ 1, r)

is non-singular; its modulus is −1, as is seen by writing the determinant in full and interchanging the first and rth columns. Moreover, each coefficient of the transformation is either 1 or 0, and so belongs to every field of numbers F.

If in the quadratic form Σ Σ a_rs x_r x_s one of the numbers a_11,..., a_nn is not zero, then either a_11 ≠ 0 or a suitable transformation of type I will change the quadratic form into Σ Σ b_rs X_r X_s, wherein b_11 is not zero.

TYPE II. The transformation in n variables x_1,..., x_n

x_1 = X_1 + X_s,   x_s = X_1 − X_s,   x_t = X_t   (t ≠ s, 1)

is non-singular. Its matrix (i) has 1 in the first and sth places of the first column and 0 elsewhere in that column, (ii) has 1 in the first and −1 in the sth place of the sth column and 0 elsewhere in that column, and (iii) has 1 in each remaining diagonal position and 0 elsewhere; its modulus is −2.

If, in the quadratic form Σ Σ b_rs x_r x_s, all the numbers b_11,..., b_nn are zero but one b_1s is not zero, then a suitable transformation of type II changes the form into Σ Σ c_rs X_r X_s, wherein c_11 is not zero.
3.3.
The
first
xs
(ars
= aj.
(1)
If one a^ is not zero, a transformation of type I changes (1) into a form B(X, X) in which 6 n is not zero.
If every arr is zero but at least one ars is not zero, a transformation of type I followed by one of type II changes (1) into
a form C(Y Y) in which c n is not zero. The product of these two transformations changes (1) directly into C(Y,Y) and the 2. modulus of this transformation is
y
We
summarize these
results in a theorem.
THEOREM 46. Every quadratic form A(x, x), with coefficients in a given field F, and having one ars not zero, can be transformed into a form by a non-singular transformation with coefficients in
B(X, X) whose
3.4.
coefficient
fc
is
not zero.
DEFINITION
the
is
defined to be
rank of
matrix of
its coefficients.
with
quadratic form in n variables and of rank r, in a given field F, can be transformed by a noncoefficients singular transformation, with coefficients in F, into the form
47.
1
THEOREM
ZJ+...+
Z,
is
(1)
where
oc
v ...,
ar are numbers in
n n
ij
equal
to zero.
^ 2a
xi xj\ an ^
let
CANONICAL FORMS
149
[%]. If every a^ is zero, the rank of A is zero and there is nothing to prove. If one a^ is not zero (Theorem 46), there is a non-singular
transformation with coefficients in
that changes
A (x, x)
into
where
c^
^ 0.
This
may
be written as
coefficients in
F,
-X"i<-)
-- JT
i
2 +...H
--^ n
i
>
Y^X,
(=2,...,*),
Moreover, the transformation direct from x to y is the product of the separate transformations employed; hence it is nonsingular and has its coefficients in F, and every b is in F.
If every b_ij in (2) is zero, then (2) reduces to the form (1); the question of rank we defer until the end. If one b_ij is not zero, we may change the form Σ Σ b_ij Y_i Y_j in the n−1 variables Y_2,..., Y_n in the same way as we have just changed the original form in n variables. That is to say, there is a non-singular transformation

Y_i = Σ_{j=2}^{n} l_ij Z_j   (i = 2,..., n),     (3)

with coefficients in F, that changes Σ Σ b_ij Y_i Y_j into

α_2 Z_2² + Σ_{i=3}^{n} Σ_{j=3}^{n} μ_ij Z_i Z_j,   where α_2 ≠ 0.

The equations (3), together with Z_1 = Y_1, constitute a non-singular transformation of the n variables Y_1,..., Y_n into Z_1,..., Z_n. Hence there is a non-singular transformation, the product of all the transformations so far employed, which has coefficients in F and changes A(x, x) into

α_1 Z_1² + α_2 Z_2² + Σ_{i=3}^{n} Σ_{j=3}^{n} μ_ij Z_i Z_j,     (4)

wherein α_1 ≠ 0 and α_2 ≠ 0.

On proceeding in this way, one of two things must happen. Either we arrive at a form

α_1 X_1² + ... + α_k X_k² + Σ_{i=k+1}^{n} Σ_{j=k+1}^{n} d_ij X_i X_j,     (5)

wherein k < n, α_1 ≠ 0,..., α_k ≠ 0, but every d_ij = 0, in which case (5) reduces to a form of the type (1); or we arrive after n−1 steps at a form α_1 X_1² + ... + α_n X_n². In either case A(x, x) is changed into a form

α_1 X_1² + ... + α_k X_k²   (k ≤ n)     (6)

by a product of transformations each of which is non-singular and has its coefficients in the given field F.

It remains to prove that the number k in (6) is equal to r, the rank of the matrix A. Let x = BZ denote the complete transformation, whereby we obtain

A(x, x) = α_1 Z_1² + ... + α_k Z_k².     (7)

Since the transformation (7) is non-singular, the matrix B'AB has the same rank as A (Theorem 34), that is, r. But the matrix of the quadratic form (7) consists of α_1,..., α_k in the first k diagonal places and zeros elsewhere, so that its rank is k. Hence k = r.

COROLLARY. If A(x, x) is a quadratic form in n variables, with coefficients in a given field F, and if its discriminant is zero, it can be transformed by a non-singular transformation, with coefficients in F, into a form

α_1 X_1² + ... + α_k X_k²   (in n−1 variables at most),

where α_1,..., α_k are numbers in F.

This corollary is, in effect, contained in the theorem itself; since the discriminant of A(x, x) is zero, the rank of the form is n−1 or less. We state the corollary with a view to its immediate use in the next section, wherein F is the field of real numbers.
4. The simultaneous reduction of two real quadratic forms

THEOREM 48. Let A(x, x) and C(x, x) be two real quadratic forms in n variables, and let C(x, x) be positive-definite. Then there is a real, non-singular transformation that changes A(x, x) and C(x, x) simultaneously into

λ_1 α_1 Y_1² + ... + λ_n α_n Y_n²   and   α_1 Y_1² + ... + α_n Y_n²,

where the α_r are all positive and λ_1,..., λ_n are the roots, all real, of |A − λC| = 0.

The roots of |A − λC| = 0 are all real (Theorem 44). Let λ_1 be any one root. Then A(x, x) − λ_1 C(x, x) is a real quadratic form whose discriminant is zero and so (Theorem 47, Corollary) there is a real non-singular transformation from x_1,..., x_n to Y_1,..., Y_n such that
where the
numbers. Let
this
same transformation,
when
C(x,x)=i Jy^r,. =U =l
i
(1)
Then we have,
.
for
an arbitrary
A,
l
A(x,x)XC(x,x)
= A(x,x)-X
C(x,x)+(X 1 -X)C(x,x)
,.
(2)
it is
Since C(x, x)
is
also
is
40).
Hence y u
(2) in
1.2)
real quadratic
Z2
,...,
Zn
we have
(3)
A(x
where 6 and
This
is
x)
=
.
denote quadratic forms in Z 2 ,..., Zn step towards the forms we wish to establish. we proceed to the next step we observe two things: Before variables Z 2 ,..., Z n is a positive-definite form in the (i)
tfj
the
first
ifj
nl
for
is
(1),
which
(ii)
positive-definite in
The
roots of
= 0, together with A = A account \A--\C\ = 0; for the forms (3) derive from
|0
l9
...
Yn
A^|
t,
A(x
x)
and
(Theorem
and
so
0,
that
is,
of
. .
o,
*\C\
is
**-,
=
0.
If A x
is
I^ACI =
a repeated root of
A0|
0,
then
also a root of
\B
0.
ifi
in place
where
2 is
positive, A 2
is
a root of
C/ ,..., 2
Un is real and non-singular. Z l we have a real, nonthe equation C/j adjoin singular transformation between Z v ... Z n and Uly ... Un Hence there is a real non-singular transformation (the product of all
tion between
Z2
,...,
Z n and
When we
l9 ...,
Un such that
C(x,x)=
where
o^
and A 2 are roots, not necessarily and F are real quadratic forms. As at \A-XC\\ f the first stage, we can show that F is positive-definite and that the roots of \fXF\ = together with X l and A 2 account for
and a 2 are
positive; Aj
distinct, of
all
the roots of
\AXC\ =
0.
Thus we may proceed step by step to the reduction of A(x, x) and C(x, x) by real non-singular transformations to the two forms

A(x, x) = λ_1 α_1 Y_1² + λ_2 α_2 Y_2² + ... + λ_n α_n Y_n²,
C(x, x) = α_1 Y_1² + α_2 Y_2² + ... + α_n Y_n²,

wherein each α_r is positive and λ_1,..., λ_n account for all the roots of |A − λC| = 0.

On making the further real transformation X_r = √α_r · Y_r, the two forms become

A(x, x) = λ_1 X_1² + ... + λ_n X_n²,   C(x, x) = X_1² + ... + X_n².

5. Orthogonal transformations

DEFINITION 9. A real transformation that transforms x_1² + ... + x_n² into X_1² + ... + X_n² is called an ORTHOGONAL TRANSFORMATION.
We
shall
examine such
transformations in Chapter XIII; we shall see that they are necessarily non-singular. Meanwhile, we note an important theorem.
THEOREM 49. A real quadratic form A(x, x) in n variables can be reduced by an orthogonal transformation to the form

λ_1 X_1² + ... + λ_n X_n²,

where λ_1,..., λ_n are the roots of |A − λI| = 0.

[The proof consists in writing x_1² + ... + x_n² for C(x, x) in the work of § 4.]
6. The number of non-zero latent roots

If B is the matrix of a transformation that reduces A(x, x) to

λ_1 X_1² + ... + λ_n X_n²,     (1)

then (Theorem 37) B'AB is the matrix of the form (1) and this has λ_1,..., λ_n in its leading diagonal and zero elsewhere. Its rank is therefore the number of λ's that are not zero. But since B is non-singular, the rank of B'AB is equal to the rank of A (Theorem 34). Hence the rank of A is equal to the number of its latent roots that differ from zero; when the rank is r < n, the characteristic equation has n − r zero roots.
7. The signature of a quadratic form

7.1. As we proved in Theorem 47 (3.4), a real quadratic form of rank r can be reduced by a real non-singular transformation to the form

α_1 X_1² + ... + α_r X_r²,     (1)

wherein α_1,..., α_r are real and not zero. As a glance at 3.3 will show, there are, in general, many different ways of effecting such a reduction; even at the first step we have a wide choice as to which non-zero a_rs we select to become the non-zero b_11 or c_11 as the case may be. The theorem we shall now prove establishes the fact that, starting from the one given form A(x, x), the number of positive α's and the number of negative α's in (1) is independent of the method of reduction.

THEOREM 50. If a real quadratic form is reduced by two real, non-singular transformations, x = B_1 X and x = B_2 Y say, to the forms

α_1 X_1² + ... + α_r X_r²,     (2)
β_1 Y_1² + ... + β_r Y_r²,     (3)

then the number of positive α's is equal to the number of positive β's and the number of negative α's is equal to the number of negative β's.

Let n be the number of variables x_1,..., x_n in the initial quadratic form; let μ be the number of positive α's and ν the number of positive β's. Let the variables X, Y be so numbered that the positive α's and β's come first. Then, since (2) and (3) are transformations of the same initial form, we have

α_1 X_1² + ... + α_r X_r² = β_1 Y_1² + ... + β_r Y_r².     (4)
Now suppose, if possible, that μ > ν. Then the n + ν − μ equations

Y_1 = 0, ..., Y_ν = 0,   X_{μ+1} = 0, ..., X_n = 0     (5)

are homogeneous equations in the n variables† x_1,..., x_n. There are fewer equations than there are variables and so (Chap. VIII, 5) the equations have a solution x_1 = x_1',..., x_n = x_n' in which not every x_r' is zero. Let X_r', Y_r' denote the values of X_r, Y_r when x = x'. Then, from (4) and (5),

α_1 X_1'² + ... + α_μ X_μ'² = β_{ν+1} Y_{ν+1}'² + ... + β_r Y_r'²,

in which the left-hand side is not negative (the α's concerned being positive) and the right-hand side is not positive (the β's concerned being negative); this is impossible unless each X' and Y' concerned is zero. Hence either we have a contradiction or

X_1' = 0, ..., X_μ' = 0   and, from (5),   X_{μ+1}' = 0, ..., X_n' = 0.     (6)

But (6) states that the n equations

X_i ≡ Σ_{k=1}^{n} b_ik x_k = 0   (i = 1,..., n),

written in full, have a solution x_r = x_r' in which x_1',..., x_n' are not all zero, which is impossible since the matrix B_1 is non-singular. Accordingly, the assumption μ > ν leads to a contradiction. Similarly, the assumption that ν > μ leads to a contradiction. Hence μ = ν, and the theorem is proved.
7.2. One of the ways of reducing a form of rank r is by the orthogonal transformation of Theorem 49. This gives the form

λ_1 X_1² + ... + λ_r X_r²,

where λ_1,..., λ_r are the non-zero roots of the characteristic equation. Hence the number of positive α's, or β's, in Theorem 50 is the number of positive latent roots of the form.

† Each X and Y is a linear form in x_1,..., x_n.
7.3. We have proved that, associated with every quadratic form, are two numbers P and N, the number of positive and of negative coefficients in any canonical form. The sum of P and N is the rank of the form.

DEFINITION 8. The number P − N is called the SIGNATURE of the form.
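By 7.2, P and N can be read off from the latent roots. The sketch below is my own (assuming numpy), with a small matrix chosen for illustration:

```python
# Rank P + N and signature P - N from the latent roots of a real symmetric matrix.
import numpy as np

A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0]])
roots = np.linalg.eigvalsh(A)                  # real, since A is symmetric
P = int(np.sum(roots > 1e-12))
N = int(np.sum(roots < -1e-12))
print("rank:", P + N, "signature:", P - N)     # rank 3, signature 1
```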
7.4. We conclude with a theorem which states that any two quadratic forms having the same rank and signature are, in a certain sense, EQUIVALENT.

THEOREM 51. Let A_1(x, x), A_2(y, y) be two real quadratic forms having the same rank r and the same signature s. Then there is a real non-singular transformation x = By that transforms A_1(x, x) into A_2(y, y).

When A_1(x, x) is reduced to its canonical form it becomes

α_1 X_1² + ... + α_μ X_μ² − β_{μ+1} X_{μ+1}² − ... − β_r X_r²,     (1)

where the α's and β's are positive, where μ = ½(s + r), and where the transformation from x to X is real and non-singular. The real transformation

ξ_i = √α_i · X_i   (i = 1,..., μ),   ξ_i = √β_i · X_i   (i = μ+1,..., r),   ξ_i = X_i   (i = r+1,..., n)

changes (1) to

ξ_1² + ... + ξ_μ² − ξ_{μ+1}² − ... − ξ_r².     (2)

There is, then, a real non-singular transformation, say x = C_1 ξ, that changes A_1(x, x) into (2). Equally, there is a real non-singular transformation, say y = C_2 ξ, that changes A_2(y, y) into (2). Or, on considering the reciprocal process, (2) is changed into A_2(y, y) by the transformation ξ = C_2⁻¹ y. Hence the transformation

x = C_1 C_2⁻¹ y

changes A_1(x, x) into A_2(y, y).
A 2 (y,y).
EXAMPLES XII
1.
C(x, x) have
r
1,...,
c ra
when
k and 8
1,...,
n.
Prove that the A equation of the two forms has k roots equal to unity.
2. Write down the λ equation of the two forms ax² + 2hxy + by² and a'x² + 2h'xy + b'y², and prove, by elementary methods, that the roots are real when a > 0 and ab − h² > 0.
HINT. Let
=
The condition for real roots
(ab'
a(a?
in A
2
becomes
2
)
0,
aV^-a^-ftX^-ftMft-a,) >
When ab
h2
0.
>
0,
a x and
/3 t
be proved by writing c^ y iS. y-j-i8, In fact, the A roots are real save when a lt ft l9 roots with suffix 1 separate those with suffix 2.
3.
a. 2 ,
/? 2
where
A(x
are
4.
x)
=-
foe?
all positive.
=
is
negative.
r
a r ,xr x,;
lk
X =
a k-i.k
<*kk
X,
xk
X,
xk
subtracting from the last row multiplies of the other rows and then, for A k by subtracting from the last column multiples of the other columns, prove that A k and f k are independent of the variables x t ... x k 0. when k < n. Prove also that A n
By
6.
5,
.
and with
.
a lk
7.
Uso the
result of
Dk is zero,
Example
An
of
Example
5 to
>!,...,
Consider what happens when the rank, r are all distinct from zero.
of
is less
than n and
y/'S.
ara^r-^ with Transform 4cc a x 3 -f- 2# 3 ^ -f 6^ x 2 into a form a n i=. 0, and hence (by the method of Chap. XI, 1.2) find a canonical form and the signature of the form.
9.
22
4, that
2
is
of rank 3 and signature 1. Verify Theorem 60 for any two independent reductions of this form to a canonical form.
10.
is
if
and only if its rank does not exceed HINT. Use Theorem 47.
is
2.
= BX,
12.
13. trove that the discriminant of a Hermitian form A(x,x), when it transformed to new variables by means of transformations x = BX,
for
Hermitian forms.
all
\A
is
the roots of
n
r-l
2,xr xr
Prove the analogues of Theorems 46 and 47 for Hermitian forms. A(x,x), C(x,x) are Hermitian forms, of which C is positivedefinite. Prove that there is a non-singular transformation that expresses the two forms as
13.
14.
A!
XX
l
4-
...
-f A n
X n 3T n
XX+
l
...
-f
X n X nt
Show
where A^...,^ are the roots of \A-\C\ 15. A n example of some importance in analytical dynamics.^ that when a quadratic form in m-\-n variables, 2T = ^ara xr xt (arB = a^),
all real.
is
and are
expressed in terms of lf ..., m # m +i-., #m+n where are no terms involving the product of a by an x.
,
dT/dxr there
,
Solution. The result is easily proved by careful manipulation. Use the summation convention: let the range of r and s be 1,..., m; let the range of u and t be m-f 1,..., m+n. Then 2T may be expressed as
2T = art xr x8 + 2aru xr y u + a ut y u y
tt
where, as an additional distinguishing mark, we have written y instead of a: whenever the suffix exceeds m. We are to express T in terms of the y's and new variables f, given by
t Cf.
77:
The Routhian
function.
owe
this
example to Mr.
Hodgkinson.
Multiply by A 8r the co -factor of asr in w, and add: we got o
,
9
|ara
|,
a determinant of order
Now
2T =
= 2AT =
where ^(i/, y) denotes a quadratic form in the t/'s only. But, since ar8 = a8T we also have A 8r = A r8 Thus the term multiplying y u in the above may be written as
,
.
ru
T8
a8U A 8r
r,
and
this
is
zero, since
both r and
8 are
dummy suffixes.
Hence,
2T may
2T = (AJAfafr+b^vt
whore r, a run from
1
to
m and u,
run from
m+
to
m+n.
CHAPTER XIII
ORTHOGONAL TRANSFORMATIONS
1. The orthogonal matrix

1.1. We recall the definition given in Chapter XII.

DEFINITION 9. A transformation x = AX that transforms x_1² + ... + x_n² into X_1² + ... + X_n² is called an ORTHOGONAL TRANSFORMATION; its matrix A is called an ORTHOGONAL MATRIX.

The most familiar example occurs in analytical geometry. When (x, y, z) are the coordinates of a point referred to rectangular axes Ox, Oy, Oz and (X, Y, Z) are its coordinates referred to rectangular axes OX, OY, OZ, whose direction-cosines with regard to the former axes are (l_1, m_1, n_1), (l_2, m_2, n_2), (l_3, m_3, n_3), the two sets of coordinates are connected by the equations

x = l_1 X + l_2 Y + l_3 Z,   y = m_1 X + m_2 Y + m_3 Z,   z = n_1 X + n_2 Y + n_3 Z.

Moreover, x² + y² + z² = X² + Y² + Z² = OP², where P is the point in question.

1.2. If A = [a_rs] is the matrix of an orthogonal transformation x = AX, the mark of its special property is readily obtained. On substituting x_i = Σ_s a_is X_s in Σ x_i² and requiring the result to be identically Σ X_i², we find

Σ_{i=1}^{n} a_is a_it = 1   (s = t),     Σ_{i=1}^{n} a_is a_it = 0   (s ≠ t).     (1)

These are the relations that mark an orthogonal matrix: they are equivalent to the matrix equation

A'A = I,     (2)

as is seen on forming the product A'A.
1.3. When (2) is satisfied A' is the reciprocal of A (Theorem 25), so that AA' is also equal to the unit matrix, and from this fact follow the relations (obtained by writing AA' = I in full)

Σ_{i=1}^{n} a_si a_ti = 1   (s = t),     Σ_{i=1}^{n} a_si a_ti = 0   (s ≠ t).     (3)

When n = 3 and the transformation is the change of axes noted in 1.1, the relations (1) and (3) take the well-known forms

l_1² + m_1² + n_1² = 1,   l_1 l_2 + m_1 m_2 + n_1 n_2 = 0,

and so on.
1.4. We now embody important properties of orthogonal matrices in a series of theorems.

THEOREM 52. A necessary and sufficient condition for a matrix A to be orthogonal is A'A = I; every orthogonal matrix is non-singular.

THEOREM 53. The product of two orthogonal transformations is an orthogonal transformation.

Let x = AX, X = BY be orthogonal transformations. Then AA' = I, BB' = I, and hence

(AB)(AB)' = AB B'A' (Theorem 23) = AIA' = AA' = I,

so that AB is an orthogonal matrix and the theorem is proved.

THEOREM 54. The determinant of an orthogonal matrix is either +1 or −1.

If AA' = I, then |A| . |A'| = 1. But |A'| = |A|, and hence |A|² = 1, so that |A| = ±1.

THEOREM 55. If λ is a latent root of an orthogonal matrix A, so also is 1/λ.

For an orthogonal matrix, AA' = I, and so

A' = A⁻¹.     (4)
By Theorem 36, Corollary, the characteristic (latent) roots of A' are the reciprocals of those of A. But the characteristic equation of A' is a determinant which, on interchanging its rows and columns, becomes the characteristic equation of A. Hence the n latent roots of A are the reciprocals of the n latent roots of A, and the theorem follows. (An alternative proof is given in Example 6 at the end of this chapter.)
of
an orthogonal matrix
be an orthogonal matrix the equa-
A may
tions (1) of
must be
satisfied.
There are
1) __ n(n-\-l)
n(n
of these equations. There are n 2 elements in A. We may therefore expectf that the number of independent constants necessary to define
completely will be
n
If the general orthogonal matrix of order n is to be expressed in terms of some other type of matrix, we must look for a type that has ½ n(n−1) independent elements. Such a matrix is the skew-symmetric matrix of order n; that is, a matrix S for which S' = −S, which has zeros in the diagonal and ½ n(n−1) independent elements elsewhere.

† The method of counting constants indicates what results to expect: it rarely proves those results, and in the crude form we have used above it proves nothing.
THEOREM 56. If S is a skew-symmetric matrix of order n, then, provided that I + S is non-singular,

O = (I − S)(I + S)⁻¹

is an orthogonal matrix of order n.

Since S is skew-symmetric, S' = −S, so that

(I − S)' = I + S,   (I + S)' = I − S,

and when I + S is non-singular it has a reciprocal (I + S)⁻¹. Hence, the non-singular matrix O being defined by O = (I − S)(I + S)⁻¹, we have

O' = {(I + S)⁻¹}' (I − S)' (Theorem 23) = {(I + S)'}⁻¹ (I − S)' (Theorem 27) = (I − S)⁻¹ (I + S),     (5)

and so

OO' = (I − S)(I + S)⁻¹ (I − S)⁻¹ (I + S).

Now I + S and I − S are commutative, and so also are their reciprocals; hence, from (5),

OO' = (I − S)(I − S)⁻¹ (I + S)⁻¹ (I + S) = I,

and O is orthogonal (Theorem 52).

NOTE. When the elements of S are real, I + S cannot be singular. This fact, proved as a lemma in 2.2, was well known to Cayley, who discovered Theorem 56.
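The construction of Theorem 56 is easy to try numerically. The sketch below is mine (assuming numpy; the skew-symmetric matrix is chosen for illustration), and it also exhibits Theorems 54 and 58 in passing.

```python
# Cayley's construction: O = (I - S)(I + S)^{-1} is orthogonal for skew-symmetric S.
import numpy as np

S = np.array([[0.0, 2.0, -1.0],
              [-2.0, 0.0, 3.0],
              [1.0, -3.0, 0.0]])          # S' = -S
I = np.eye(3)
O = (I - S) @ np.linalg.inv(I + S)

print(np.allclose(O.T @ O, I))            # True: O is orthogonal
print(np.round(np.linalg.det(O), 10))     # 1.0 (Theorem 54: determinant is +1 or -1)
print(np.abs(np.linalg.eigvals(O)))       # each latent root has modulus 1 (Theorem 58)
```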
2.2. LEMMA. If S is a real skew-symmetric matrix, I + S is non-singular.

Write down S and add x to each diagonal element, so obtaining a determinant Δ whose value is a polynomial in x; e.g., with n = 3,

Δ = | x   s_12   s_13 ;   −s_12   x   s_23 ;   −s_13   −s_23   x |.
The differential coefficient of Δ with respect to x is the sum of n determinants (Chap. I, 7), each of which is equal to a skew-symmetric determinant of order n−1 when x = 0; similarly, the second differential coefficient is a sum of n(n−1) determinants, each of which is equal to a skew-symmetric determinant of order n−2 when x = 0; and so on. By Maclaurin's theorem, we then have

Δ = Δ_0 + x Δ_1 + (x²/2!) Δ_2 + ... + x^n,     (6)

where Δ_0 is a skew-symmetric determinant of order n, Δ_1 is a sum of skew-symmetric determinants of order n−1, Δ_2 is a sum of skew-symmetric determinants of order n−2, and so on. But a skew-symmetric determinant of odd order is equal to zero and one of even order is a perfect square (Theorems 19, 21). Hence

Δ = P_0 + P_2 x² + ... + x^n   (n even),   Δ = P_1 x + P_3 x³ + ... + x^n   (n odd),     (7)

where P_0, P_1,... are either squares or the sums of squares, and so are, in general, positive and, though they may be zero in special cases, they cannot be negative. The lemma follows on putting x = 1.
zero in special cases, they cannot be negative. 1. The lemma follows on putting x
2.3.
THEOREM
57.
Every
can be
J(I-S)(I+S)-*
where
S is a skew-symmetric
matrix and
is
a matrix having
i1
is
As a preliminary
true for
all
to the proof
we
establish a
lemma
that
LEMMA. Given a
matrix J, having
so that
1 is
it is possible to choose a square matrix each diagonal place and zero elsewhere, in
JA.
Multiplication by J merely changes the signs of the elements of a matrix row by row; e.g.
"1
. .
"IFa,,
a a 32
a 1Ql a 33J
a 11
a, 21
-i
L
a 12
W 31
"~" i
"33 J
or,
Bearing this in mind, we see that, either the lemma is true for every possible combination of row signs, we must have
0.
(8)
But we can show that (8) is impossible. Suppose it is true. Then, adding the two forms of (8), (i) with plus in the first row, (ii) with minus in the first row, we obtain
=
a 2n
for every possible
....
-\-a nn -\- 1
(9)
a nn +l
combination of row signs. We can proceed a like argument, reducing the order of the determinant by by 0^+1 0: but it is unity at each step, until we arrive at
and
a nn -\- 1
= 0.
Hence
Proof of Theorem 57. Let A be a given orthogonal matrix with real elements. If A has a latent root 1, let = A : be a matrix whose latent roots are all different from JV A
1,
where Jt
is
1
is
of the lemma.
Now Jj
having
* non-singular and its reciprocal Jf is also a matrix in the diagonal places and zero elsewhere, so that
where
=
A
/f
l9
Moreover,
of the type required by Theorem 57. being derived from A by a change of signs of
*
and
is
certain rows,
matrix
A^I
l
when
is
orthogonal and we have chosen it so that the is non-singular. Hence it remains to prove that a. real orthogonal matrix such that -4 1 +/ is nonis
singular,
so that
To do
= (I-AJV+AJ-*.
(10)
we
may
4
*
'
A' *
A' A l /, since A l is orthogonal. Thus A l Further, A l A l 1 and we may work with A v A(, and / and A( are commutative
f
numbers and so
obtain,
on using
the relation
A' l l
/,
2I-2A A' 1
1
r\
S'; that
is,
(10),
and
so, since
I+S
is
non-singular J and /
S,
I+S
are com-
mutative,
we have
Al =(I-S)(I+S)-\
(12)
the order|| of the two matrices on the right of (12) being immaterial.
We
2.4.
in so far as
they relate
we
see that
is
A = J(I-S)(I+S)t
||
p. 163. p. 163.
Compare
2. 2.
is
and
be so written.'
2.5. THEOREM 58. The latent roots of a real orthogonal matrix are of unit modulus.

Let A = [a_rs] be a real orthogonal matrix. Then (Chap. X, 7), if λ is a latent root of the matrix, the equations

λ x_r = a_r1 x_1 + ... + a_rn x_n   (r = 1,..., n)     (13)

have a solution x_1,..., x_n other than x_1 = ... = x_n = 0. These x are not necessarily real, but since each a_rs is real we have, on taking the conjugate complex of (13),

λ̄ x̄_r = a_r1 x̄_1 + ... + a_rn x̄_n   (r = 1,..., n).

Multiply corresponding equations together and sum for r = 1,..., n; by the relations (1) of 1.2, we obtain

λ λ̄ Σ_{r=1}^{n} x_r x̄_r = Σ_{r=1}^{n} x_r x̄_r.

But not all of x_1,..., x_n are zero, and therefore Σ x_r x̄_r > 0. Hence λ λ̄ = 1, which proves the theorem.
EXAMPLES XIII
1.
r-i
4 4
i 4
-*
i
Li
is
4-44 -4J
4
orthogonal. Find
its
latent roots
and
verify that
Theorems
54, 55,
and
J 2.
[45 i
Ll
is
-I
orthogonal.
3.
-c
a
-61,
-a\
I
'l+a*-6 2 -c*
-2(c-j-a&)
l-f.
2(c-a6)
2(ac
a a_{_&2 + c 2
6)
a
+a l + + c t
fci
I_}_
2(a-6c) a 2_|_62 + c 2
2(o+6c)
2
1-a 2
.l+a
Hence
4.
-f&
+c
form associated with A; S is a skew-symmetric matrix such that I-j-SA is non-singular. Prove that, when SA = AS and
J-<^
F>
5.
(Harder. )J
^L is
and/?
symmetric,
<S
skew-symmetric,
(^4-f A^)-
^-^),
tf'^ + S)/?
Prove also that, when
= A + S,
R'AE = A,
X=
RY,
F'^F.
X'AX =
6. Prove Theorem 55 by the following method. Let A be an orthogonal matrix. Multiply the determinant \A \I\ by \A\ and, in the resulting determinant, put A' = I/A. 7.
transformation x
= UX, x
UX which makes
mark
of a unitary
called a unitary transformation. Prove that the transformation is the matrix equation UU' = /.
is
8.
Prove that the product of two unitary transformations Prove that the modulus
is itself
unitary.
9.
equation
10.
MM =
M of a unitary transformation
satisfies
the
/.
is
form e
i<x
of the
f Compare Lamb, Higher Mechanics, chapter i, examples 20 and 21, where the transformation is obtained from kinematical considerations. J These results are proved in Turnbull, Theory of Determinants, Matrices,
and Invariants.
Introduction
1.1.
requires a complete book, and books devoted exclusively to that study already exist. All we attempt here is to introduce
the ideas and to develop them sufficiently for the reader to be able to employ them, be it in algebra or in analytical geometry.
1.2.
We
In
Theorem 39 we proved that when the variables of a quadratic form are changed by a linear transformation, the discriminant of the form is multiplied by the square of the modulus of the transformation. Multiplication by a power of the modulus, not
necessarily the square, as a result of a linear transformation is the mark of what is called an invariant'. Strictly speaking, the
'
in fact, applied to
word should mean something that does not change at all anything whose only change after a
it is,
linear
transformation of the variables is multiplication by a power of the modulus of the transformation. Anything that does not change at all after a linear transformation of the variables is
called
an 'ABSOLUTE INVARIANT'.
1.3. Definition of an algebraic form. Before we can give a precise definition of invariant' we must explain certain technical terms that arise.
DEFINITION 10. A sum

Σ (k!/(α! β! ... λ!)) a_{αβ...λ} x^α y^β ... t^λ,     (1)

wherein the a's are arbitrary constants and the sum is taken over all integer or zero sets of values of α, β,..., λ which satisfy the conditions

0 ≤ α ≤ k,   ...,   0 ≤ λ ≤ k,   α + β + ... + λ = k,     (2)

is called an ALGEBRAIC FORM of degree k in the variables x, y,..., t.
For example,

Σ_{r=0}^{n} (n!/(r!(n−r)!)) a_r x^{n−r} y^r     (3)

is an algebraic form of degree n in x and y. The numerical coefficients k!/(α! β! ... λ!) in (1) and n!/(r!(n−r)!) in (3) are not essential, but they lead to considerable simplifications in the resulting theory of the forms.
shall use F(a, x) and similar notations, to denote an algebraic form; in the notation (f>(b,X), F(a,x) the single a symbolizes the various constants in (1) and wish to (3) and the single x symbolizes the variables. If we
1.4.
Notations.
We
such as
mark
number of variables
is
\//v fl/\
\71
9
n,
we
shall use
F(a,x) n
An
alternative notation
(
IA
9
\
/
0'
1.5***?
fy /
which
is
(3),
and
]
WIT*
^i
or
\a
( oc
which is used to denote the form (1). In this notation the index marks the degree of the form, while the number of variables is either shown explicitly, as in (4), or is inferred from the context.
Clarendon type, such as x or X, will be used to denote singlecolumn matrices with n rows, the elements in the rows of x or X being the variables of whatever algebraic forms are under
discussion.
The standard
to variables
linear transformation
...
y ,
Xv X n x = X
r l
namely,
rl
+...+lrn
Xn
(r
1,...,
n),
(5) (6)
will
be denoted by
M X,
coefficients
lrs
in (5).
As
in pre-
vious chapters, \M\ will denote the determinant whose elements are the elements of the matrix
On making
the substitutions
(5) in
a form
9
F(a,x)
(7)
we obtain a form of degree k in X l9 ... X n this form we denote by G(A,X) = (A ,...)(X lt ..,,X n )". (8) The constants A in (8) depend on the constants a in (7) and on the coefficients lrs By its mode of derivation,
.
For example,
let
F(a,x)
Xi
a n xl+2a l2 x l x 2 +a 22 x%,
=
~
l-ft
Xi~\-li 2
X2
X2
^21^1
^22
then
F(a,x)
G(A,X)
#11^
where
An =
last thing
we
shall
wish to do
will
be to
calculate the actual expressions for the A's in terms of the a's: be sufficient for us to reflect that they could be calculated
's
are linear
1.5. Definition of an invariant.

DEFINITION 11. A function of the coefficients a of the algebraic form F(a, x) is said to be an INVARIANT of the form if, whatever the matrix M of the transformation (6) may be, the same function of the coefficients A of the form G(A, X) is equal to the function (of the coefficients a) multiplied by a power of |M|, the determinant of M.

For instance, in the example of 1.4,

A_11 A_22 − A_12² = |M|² (a_11 a_22 − a_12²),

a result that may be proved either by laborious calculation or by an appeal to Theorem 39. Hence, in accordance with Definition 11, a_11 a_22 − a_12² is an invariant of the algebraic form a_11 x_1² + 2a_12 x_1 x_2 + a_22 x_2².
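The 'laborious calculation' can be delegated to a computer algebra system. The check below is my own (assuming sympy is available) and is not the book's working:

```python
# Under x1 = l11*X1 + l12*X2, x2 = l21*X1 + l22*X2 the quantity a11*a22 - a12**2
# is reproduced, multiplied by the square of the modulus l11*l22 - l12*l21.
import sympy as sp

a11, a12, a22, l11, l12, l21, l22, X1, X2 = sp.symbols('a11 a12 a22 l11 l12 l21 l22 X1 X2')
x1 = l11*X1 + l12*X2
x2 = l21*X1 + l22*X2
form = sp.expand(a11*x1**2 + 2*a12*x1*x2 + a22*x2**2)

A11 = form.coeff(X1, 2).coeff(X2, 0)
A22 = form.coeff(X2, 2).coeff(X1, 0)
A12 = form.coeff(X1, 1).coeff(X2, 1) / 2
M = l11*l22 - l12*l21

print(sp.simplify(A11*A22 - A12**2 - M**2*(a11*a22 - a12**2)))   # 0
```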
Again, we may consider not merely one form F(a,x) and its transform G(A,X), but several forms F^a, #),..., Fr (a, x) and
their transforms
as before,
we
write
G (A,X)
r
x in terms of
in
MX.
transformation
M X,
r
if,
of the
the
same function of
G (A,X)
is
coefficients a) multiplied
For example,
F^a, x)
if
2
and
a 2 x+b 2 y,
(9)
is,
so that
F2 (a
we
x)
G2 (A,X)
(a 2 a l
+b
(
see that
a l oc l +b l
X
(11)
12,
al b 2
a2 bl
is
a joint-
invariant of the two forms a 1 x-\-b l y, a 2 x-\-b 2 y. This example is but a simple case of the rule for forming the 'product' of two transformations: if we think of (9) as the trans-
and
to variables x
(10) as the transformation from x and y to is merely a statement of the result proved in
1.6.
Covariants.
An
invariant
is
a function of the
cients
only. Certain functions which depend both on the coefficients and on the variables of a form F(a, x), or of a num-
ber of forms
F (a, x),
r
unaltered, save for multiplication by a power of \M\, when the X. Such funcvariables are changed by a substitution x
For example, let u and v be forms in the variables x and In these, make the substitutions x
y.
so that u,
by the
du
du
du
dv
j
r>
dv
1
,,
I
dv
2
T"~"
J
\r
a^
__ ~~
m
du
du
du
2
,
^x
"dy dy'
dY
dv
" V1
dv
l
dx
""*dy*
dX
dv
~dX
dY
(12)
dv
dY
(12) is best seen if
1
The import of
we
write
n
it
in full.
\
Let
(13)
and,
Then
X
The function
depending on the constants a r br of the forms u, v and on the variables x, y, is, apart from the factor l^m^l^m^ unaltered
,
when the
ingly,
1.7.
4
constants ar br are replaced by the constants A r r and the variables x, y replaced by the variables X, Y. Accord-
we say
that (14)
is
v.
is
defined as
a function of the coefficients a, of a form F(a,x) which is equal to the same function of the coefficients A, of the transform G(A,X), multiplied by a factor that depends only on the constants of the transformation',
in question
In the present, elementary, treatment of the subject we have left aside the possibility of the factor being other than a power of \M\. It is, however, a point of some interest to note that the wider definition can be adopted with the same ultimate restriction on the nature of the
factor,
2.
Examples
2.1.
of invariants
and covariants
x,
w be
n forms
in the
n variables
wx
u,,
w,,
dujdy,..., is called
the Jacobian of
the n forms. It
is
usually written as
It
is,
as
^U^Vy^jW)
i~
.
IT'
^
\
2.2.
in the variables x,
.
y,...,
d u/dx
11 uxy uvv
2
,
d u/dx8y,...
wxz
^yz
Then the
deter-
uxx uvx
it
(2)
in fact, if
we
H(u\X)
For,
calling
\M\*H(u;x).
,
(3)
by considering the Jacobian of the forms ux uyy ..., and them u v u^..., we find that 2.1 gives
(4)
But
and
if
8u
y
then
= m X+w
1
F+...,
__
a _
a _
Hence
xx YX
U ZX
U YZ
U ZY
X
UxY
UyY
as
may
last
two determinants by
That
Y\ ,Aj
is
to say,
I
,.
Mil
D2>"a(A,r,...)
r,
which, on using
(4),
yields
H(u',X)= \M\*H(u\x).
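The covariance of the Hessian can be verified symbolically. The sketch below is my own (assuming sympy; the cubic u is an arbitrary illustration, not the book's example) and checks the relation just obtained in two variables:

```python
# After x = l1*X + m1*Y, y = l2*X + m2*Y the Hessian determinant picks up |M|^2.
import sympy as sp

x, y, X, Y, l1, m1, l2, m2 = sp.symbols('x y X Y l1 m1 l2 m2')
u = x**3 + x*y**2 + 2*y**3                       # any binary form will do

H = sp.hessian(u, (x, y)).det()                  # H(u; x)
subs = {x: l1*X + m1*Y, y: l2*X + m2*Y}
H_new = sp.hessian(u.subs(subs), (X, Y)).det()   # Hessian of the transformed form
M = l1*m2 - m1*l2

print(sp.simplify(H_new - M**2 * H.subs(subs)))  # 0
```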
2.3,
of
2.1
1,...,
n)
an invariant.
is
It
is
forms.
As we have seen
the eliminant
in Theorem 11, the vanishing of a necessary and sufficient condition for the
(r
...
equations
^^...^^ =
= =x
1,...,
tl
0.
u of
Hessian
is
independent of the
variables and so
ar invariant.
When
we have
determinant
form.
3 2 u/8x r dx s
2ar r8
is, apart from a power of 2, the the discriminant of the quadratic namely,
in
two variables
is
Invariants and covariants of a binary cubic. A form is usually called a binary form thus the general
:
binary cubic
^3+3,^+3^2+^3.
it
is
(7)
We
can write down one covariant and deduce from The Hessian of (7)
apart from numerical
factors,
a second
(2.2) a
a 2 x+a3 y
i.e.
(
a?)#
+K
(8) is
a x a 2 )xy+ (a x tf 3
(8):
afty*.
(8)
The discriminant of
an invariant of
we may expect
that
it is
to find (we prove a general theorem later, invariant of (7) also.f The discriminant is
3.6)
an
a a2
i(a a 3
f
a\
K^a^
a^a^
precise
a^)
(9)
the phrases
The reader will clarify his ideas if ho writes down the *is an invariant of (7)', 'is an invariant of (8)'.
meaning of
and
itself, is,
a x
2(a Q a 2
i.e.
+ 2a xy+a
x
al)x+ (a a z ~a^ a 2 )y
3&Q&!
ct
<
(a
a x x 2 + 2a 2 xy+a3 y 2 a 3 a x a 2 )x+2(a 1 a 3 a\
(GQ
&3
-f-
3(2&J # 3
a 2 a3
a a|
2a 2 )y 3
(10)
connects
shall show that an algebraical relation and (10). Meanwhile we note an interesting (7), (8), (9), (and in the advanced theory, an important) fact concerning the coefficients of the co variants (8) and (10).
At a
later stage
we
If
we
write
c Q x*-{-c l xy-{-lc 2
+c 1
a;
?/+-c 2 :r?/
2t
-|-Lt
.
c3
and so on
not a disordered set of numbers, as a first glance at (10) would suggest, but are given in terms of c by means of the formula
_
where p
that
CQ
is
the degree of the form u. Thus, when we start with the cubic (7), for which p 3, and take the covariant (8), so
#o a 2~ a v
we
and
da 2
is
a\)x
is
written as
,
+c
x y + Jc a xy* + Jc 3 y 3
2
t
Compare
3.6, 3.7.
2.6.
method
u
of
forming
F(a,x) ==
(a Q ,...
a m )(x l9 ... x n ) k
9
(12)
say J
0(a ,...,aJ,
//
is
an invariant of F(a,x).
That
is
to say,
when we
substitute
JfX
in (12)
we obtain a form
0(A X)
9
u
the relation
(Ao
...
A m )(X
l9
...
X n )*
A
(13)
(12)
and the
of (13) satisfy
fixed constant, independent of the matrix M. Now (12) typifies any form of degree k in the n variables and (14) may be regarded as a statement concerning a?!,..., x n
where
s is
some
if
F(a' x)
y
=
=
((*Q
...
xn ) k
transformed by x
v
J/X
into
G(A',X)
a',
the coefficients
Equally, when A is an arbitrary constant, u-\-\v is a form of degree k in the variables x v ... x n It may be written as
y .
(a Q +Xa' ,...
a m +A<
)(.r 1 ,...,
xn
k
)
M X,
l9
it
becomes
9
(A
that
is,
,...,
A m )(X
(A
l9 ...,
Xn
9
k
)
Xn
)" 9
+\A^
...
A m +XA'm )(X
Xn
k
)
.
Just as (15) may be regarded as being a mere change of of the form notation in (14), so, by considering the invariant
</>
u-\-Xv
we have
9
4(A Q +XA^..
which
is
A m +XA'm )=
(16)
Now
a mere change of notation in (14). each side of (16) is a polynomial in A whose coefficients
values of A and so the coefficients of A, A, A 2 ,... on each side of (16) are equal. Hence the coefficient of each power of A in
the expansion of
<f>(a Q
+Xa^... a m +\a m )
y
is
a joint-invariant of u and
<f>r(
a Qi-"> a m
%>>
|
a m)>
(A Q ,...,
Am
AQ,...,
A'J
s \M ^ r K,..., a m
m <,..., a' ).
(17)
The
Then
dX
db m dX
=
and the value of
(18)
i$ +
c)6
-+^
db m
when
is
given by
(o
oj d
+...+<4
U(a
c*a,,,/
,...,a
m + Aam)
(17),
a polynomial in
A.
we
have
(a ,...,aJ.
(20)
180
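The device of 2.6 may be seen at work on two binary quadratics: expanding the discriminant of u + λv in powers of λ, the coefficient of λ is the joint-invariant ab′ + a′b − 2hh′ of Example 3 below. A short sketch (sympy; a1, b1, h1 here stand for a′, b′, h′):

```python
import sympy as sp

lam = sp.Symbol('lam')
a, b, h, a1, b1, h1 = sp.symbols('a b h a1 b1 h1')    # a1, b1, h1 stand for a', b', h'

def disc(p, q, r):
    # discriminant (apart from a numerical factor) of p*x**2 + 2*q*x*y + r*y**2
    return p*r - q**2

poly = sp.expand(disc(a + lam*a1, h + lam*h1, b + lam*b1))   # discriminant of u + lam*v
print(poly.coeff(lam, 1))     # -> a*b1 + a1*b - 2*h*h1, the joint-invariant
```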
EXAMPLES XIV A

Examples 1–4 can be worked by straightforward algebra and without appeal to any general theorem. In these examples the transformation is taken to be

x = l_1X + m_1Y,   y = l_2X + m_2Y,

so that |M| is equal to l_1m_2 − l_2m_1; the forms ax + by, ax² + 2bxy + cy² are transformed into AX + BY, AX² + 2BXY + CY², and so for other forms.

1. Prove that ab′ − a′b is a joint-invariant of the linear forms ax + by, a′x + b′y.
   Ans. AB′ − A′B = |M|(ab′ − a′b).

2. Prove that ab′² − 2ba′b′ + ca′² is a joint-invariant of the forms ax² + 2bxy + cy², a′x + b′y.
   Ans. AB′² − 2BA′B′ + CA′² = |M|²(ab′² − 2ba′b′ + ca′²).

3. Prove that ab′ + a′b − 2hh′ is a joint-invariant of the two quadratic forms ax² + 2hxy + by², a′x² + 2h′xy + b′y².
   Ans. AB′ + A′B − 2HH′ = |M|²(ab′ + a′b − 2hh′).

4. Prove that b′(ax + by) − a′(bx + cy) is a covariant of the forms ax² + 2bxy + cy², a′x + b′y.
   Ans. B′(AX + BY) − A′(BX + CY) = |M|{b′(ax + by) − a′(bx + cy)}.

5. Prove the result of Example 4 by considering the Jacobian of the two forms ax² + 2bxy + cy², a′x + b′y.

6. Prove that (ab′ − a′b)x² + (ac′ − a′c)xy + (bc′ − b′c)y² is, apart from a numerical factor, the Jacobian of the forms ax² + 2bxy + cy², a′x² + 2b′xy + c′y².

7, 8. Exercises on 2.6. [The statements are only partly recoverable; Example 8 concerns an expression containing the term (b_2c_1 − ...)(ax + hy + gz).]

9. Prove that ... is an invariant of ..., first showing that ab − h² is an invariant of ax² + 2hxy + by².

10. Prove the result of Example 9 by considering the joint-invariant ... .

11. Prove that abc + 2fgh − af² − bg² − ch², i.e. the determinant

Δ ≡ | a  h  g |
    | h  b  f |
    | g  f  c |,

is an invariant of the quadratic form ax² + by² + cz² + 2fyz + 2gzx + 2hxy, and find two joint-invariants of this form and of the form a′x² + ... + 2h′xy.
Ans. Θ ≡ a′A + b′B + c′C + 2f′F + 2g′G + 2h′H,
     Θ′ ≡ aA′ + bB′ + cC′ + 2fF′ + 2gG′ + 2hH′,
where A, A′,... denote co-factors of a, a′,... in the respective determinants. These joint-invariants are of some importance in analytical geometry.

12. Find the joint-invariants of the quadratic form ax² + by² + cz² + 2fyz + 2gzx + 2hxy and the linear form lx + my + nz. What is the geometrical significance of this joint-invariant? Prove that the second joint-invariant, indicated by Θ′ of Example 11, is identically zero.

13. Prove that

| l²    lm         m²   |
| 2ll′  lm′ + l′m  2mm′ |  =  (lm′ − l′m)³.
| l′²   l′m′        m′²  |

14. Prove that ∂⁴u/∂x²∂y² is a covariant of a form u whose degree exceeds 4 and an invariant of a form u of degree 4.
HINT. Multiplication by the determinant of Example 13 gives a determinant whose first row is ... . Compare 2.2: here x = lX + mY, y = l′X + m′Y.

15. Prove that ace + 2bcd − ad² − b²e − c³, i.e. the determinant

| a  b  c |
| b  c  d |
| c  d  e |,

is an invariant of (a, b, c, d, e)(x, y)⁴.
HINT. Multiply the initial determinant by ... ; the factor is |M|⁶.

16. Prove that the determinant

| u_xx  u_xy  u_yy |
| v_xx  v_xy  v_yy |
| w_xx  w_xy  w_yy |

is a covariant of the three binary forms u, v, w.
3. Properties of invariants

3.1. Invariants are homogeneous forms in the coefficients. Let I(a_0,..., a_m) be an invariant of

u ≡ (a_0,..., a_m)(x_1,..., x_n)^k.

Then, whatever the transformation x = MX changing u into (A_0,..., A_m)(X_1,..., X_n)^k may be,

I(A_0,..., A_m) = |M|^w I(a_0,..., a_m),    (1)

wherein w depends only on I. Consider the particular transformation

x_1 = λX_1,  x_2 = λX_2,  ...,  x_n = λX_n,

for which |M| = λ^n. Since every term of u is of degree k in the variables, the values of A_0,..., A_m are a_0λ^k,..., a_mλ^k. Hence (1) becomes

I(a_0λ^k,..., a_mλ^k) = λ^{nw} I(a_0,..., a_m).    (2)

That is to say, I is homogeneous in a_0,..., a_m;† if its degree is q, then (2) gives kq = nw, and w is given by kq/n.

† The result is a well-known one. If the reader does not know the result, a proof can be obtained by assuming that I is the sum of I_1, I_2,..., each homogeneous and of degrees q_1, q_2,..., and then showing that the assumption contradicts (2).

3.2. Weight. In the binary form a_0x^k + ka_1x^{k−1}y + ... + a_ky^k the suffix r of the coefficient a_r is called its WEIGHT, and the weight of a product of coefficients is the sum of the weights of its factors. A polynomial in the coefficients, every term of which has the same weight, is
said to be ISOBARIC. For example, a_0a_2 − a_1² is isobaric: we refer to a_1² as being of weight 2 × 1, and a_0a_2 as of weight 0 + 2, so that each term is of weight 2, and the polynomial is of that weight.

3.3. Invariants of a binary form are isobaric. Let the binary form be

u ≡ Σ_{r=0}^{k} {k!/r!(k−r)!} a_r x^{k−r} y^r,    (4)

and let I(a_0,..., a_k) be an invariant of u, of degree q in the coefficients. Then, by 3.1 (there being now two variables, so that n = 2),

I(A_0,..., A_k) = |M|^{½kq} I(a_0,..., a_k)    (5)

whatever M may be. Consider the particular transformation

x = X,   y = λY,

for which |M| = λ. The values of A_0, A_1,..., A_k for this particular transformation are

a_0,  a_1λ,  ...,  a_rλ^r,  ...,  a_kλ^k.

That is, the power of λ associated with each coefficient A_r is equal to the weight of the coefficient. Thus, for this particular transformation, the left-hand side of (5) is a polynomial in λ such that the coefficient of a power λ^w is a function of the a's of weight w. But the right-hand side of (5) is merely

λ^{½kq} I(a_0,..., a_k),    (6)

which contains the single power λ^{½kq} of λ. Hence the only power of λ that can occur on the left-hand side of (5) is λ^{½kq}, and its coefficient must be a function of the a's of weight ½kq; that coefficient of the left-hand side of (5) must be I(a_0,..., a_k). Hence I(a_0,..., a_k) is of weight ½kq.
3.4. Notice that the weight of a polynomial in the a's must be an integer, so that we have, when I(a_0,..., a_k) is a polynomial invariant of a binary form,

I(A_0,..., A_k) = |M|^{½kq} I(a_0,..., a_k),    (7)

and ½kq, the weight of a polynomial, is an integer. Accordingly, the index ½kq of (7) must be an integer.

3.5. The extension to n variables is immediate. In the general form

u ≡ (a)(x_1,..., x_n)^k    (8)

the suffix of a coefficient corresponding to the power of x_n in its term is called the weight of the coefficient, and the weight of a product of coefficients is defined to be the sum of the weights of its constituent factors.

Let I(a) denote a polynomial invariant of (8), of degree q in the a's. Let the transformation x = MX change (8) into

(A)(X_1,..., X_n)^k.    (9)

Then, by 3.1,

I(A) = |M|^{kq/n} I(a)    (10)

whatever M may be. Take the particular transformation

x_1 = X_1,  ...,  x_{n−1} = X_{n−1},  x_n = λX_n,

for which |M| = λ. As in 3.3, we can prove that I(a) must be isobaric and of weight kq/n. Moreover, since the weight is necessarily an integer, kq/n must be an integer.
EXAMPLES XIV B

Examples 2–4 are the extensions to covariants of results already proved for invariants. The proofs of the latter need only slight modification.

1. If C(a, x) is a covariant of u, and C(a, x) is the sum of terms of different degrees in the variables, say

C(a, x) = C_{r_1}(a, x) + C_{r_2}(a, x) + ...,

where the index denotes degree in the variables, then each separate term C_r(a, x) is itself a covariant.
HINT. Since C(a, x) is a covariant, C(A, X) = |M|^s C(a, x). Substitute tx_1,..., tx_n for x_1,..., x_n and, consequently, tX_1,..., tX_n for X_1,..., X_n; then

Σ_r C_r(A, X) t^r = |M|^s Σ_r C_r(a, x) t^r.

Each side is a polynomial in t and we may equate coefficients of like powers of t.

2. A covariant of degree m in the variables is homogeneous in the coefficients.
HINT. As in 3.1, take x_r = λX_r; then A_r = a_rλ^k, |M| = λ^n, and, C being of degree m in the variables, C(aλ^k, X)_m = λ^{ns+m} C(a, X)_m. If C is of degree q in the coefficients this gives kq = ns + m.

3. If in a covariant of a binary form in x, y we consider x to have weight unity and y to have weight zero (x² of weight 2, etc.), then a covariant C(a, x)_m of degree q in the coefficients a is isobaric and of weight ½(kq + m).
HINT. Take x = X, y = λY, as in 3.3.

4. If in a covariant of a form in x_1,..., x_n we consider x_1,..., x_{n−1} to have zero weight and x_n to have weight unity (x_n² of weight 2, etc.), then a covariant C(a, x)_m of degree q in the coefficients a is isobaric and of weight {kq + (n−1)m}/n.
HINT. Compare 3.5.
3.6. Invariants of a covariant. Let u(a, x)_k be a given form of degree k in the n variables x_1,..., x_n. Let C(a, x)_w be a covariant of u, of degree w in the variables and of degree q in the coefficients a. The coefficients in C(a, x)_w are not the actual a's of u, but functions of them; call them b. The transformation x = MX changes u into u(A, X)_k and, since C is a covariant, there is a relation

C(A, X)_w = |M|^s C(a, x)_w.

Thus the transformation changes C(a, x)_w into |M|^{−s} C(A, X)_w. But C is a homogeneous function of degree q in the a's, and so also in the A's. Hence

C(a, x)_w → |M|^{−s} C(A, X)_w = C(A|M|^{−s/q}, X)_w.    (11)

Now let I(b) be an invariant of C, regarded as a form in x_1,..., x_n with coefficients b; then, for the transformation x = MX,

I(B) = |M|^σ I(b),    (12)

where B denotes the coefficients of the transformed covariant. Take for these the coefficients furnished by (11). If I(b) is of degree r in the b's, then, when expressed in terms of the a's, it is a function homogeneous and of degree rq in the coefficients a, and by (11) and (12) it is multiplied by a power of |M| when the a's are replaced by the A's. That is, I(b), expressed in terms of the a's, is an invariant of the original form u(a, x).

The generalities of the foregoing work are not easy to follow. The reader should study them in relation to the particular example which we now give; it is taken from 2.5. The cubic u has a covariant

C(a, x) ≡ (a_0a_2 − a_1²)x² + (a_0a_3 − a_1a_2)xy + (a_1a_3 − a_2²)y²;    (11a)

that is to say, the transformation x = MX changes C(a, x) into |M|^{−2}C(A, X), which corresponds to (11) above. Take for I(b) the discriminant of this quadratic covariant, which corresponds to (12a); and we see that (9) of 2.5 is an invariant of the cubic.
3.7. Covariants of a covariant. The argument of 3.6, using a covariant C(b, x) of a form of degree w in n variables where 3.6 uses I(b), will prove that C(b, x), when expressed in terms of the a's, is a covariant of the original form u(a, x)_k.

If I(a) is an invariant of a form u(a, x)_k, it is immediately obvious, from the definition of an invariant, that the square, cube,... of I(a) are also invariants of u. Thus there is an unlimited number of invariants; and so for covariants. On the other hand, there is, for a given form u, only a finite number of invariants which cannot be expressed rationally and integrally in terms of invariants of equal or lower degree. Equally, there is only a finite number of covariants which cannot be expressed rationally and integrally in terms of invariants and covariants of equal or lower degree in the coefficients of u. Invariants or covariants which cannot be so expressed are called irreducible. The theorem may then be stated: 'The number of irreducible covariants and invariants of a given form is finite.' This theorem and its extension to the joint covariants and invariants of any given system of forms is sometimes called the Gordan–Hilbert theorem. Its proof is beyond the scope of the present book.

Though the irreducible covariants and invariants are finite in number, there may be an algebraical relation between them, even though no one of its terms can be expressed as a rational and integral function of the rest. For example, a cubic u has three irreducible covariants, namely itself, a covariant G, and a covariant H; it has one irreducible invariant, namely Δ. We shall see in 4.3 that these are connected by an algebraical relation.
4. Canonical forms

4.1. The quadratic ax² + 2hxy + by² will, in general, take the form aX² + b_1Y², a form obtained by writing

ax² + 2hxy + by² = a(x + hy/a)² + (b − h²/a)y²

and making the substitution X = x + (h/a)y, Y = y. The general quadratic form was considered in Theorem 47, p. 148. We now show that the cubic

a_0x³ + 3a_1x²y + 3a_2xy² + a_3y³    (1)

may, in general, be written as pX³ + qY³; the exceptions are cubics that contain a squared factor.

There are several ways of proving this: the method that follows is, perhaps, the most direct. We pose the problem: 'Is it possible to find p, q, α, β so that (1) is identically equal to

p(x + αy)³ + q(x + βy)³?'    (2)

It is possible if, for general values of α and β with α ≠ β, we can choose p and q to satisfy the four equations

p + q = a_0,   pα + qβ = a_1,   pα² + qβ² = a_2,   pα³ + qβ³ = a_3.    (3)

The first two of these determine p and q; we have then to arrange that the values so determined satisfy the remaining two.
The third and fourth equations of (3) follow from the first two if α, β also satisfy

α² + Pα + Q = 0,   β² + Pβ + Q = 0,

for then

α(α² + Pα + Q) = 0,   β(β² + Pβ + Q) = 0,

so that the third equation is a linear consequence of the first two, and the fourth of the second and third. That is, the third and fourth equations are linear consequences of the first two if α, β are the roots of

t² + Pt + Q = 0,

where P, Q satisfy the two equations

a_2 + Pa_1 + Qa_0 = 0,   a_3 + Pa_2 + Qa_1 = 0.

These determine P and Q; α, β are then the roots of t² + Pt + Q = 0 and, provided they are distinct, p, q can then be determined from the first two equations of (3).

Thus (1) can be expressed in the form (2) provided that

(a_1a_2 − a_0a_3)² − 4(a_0a_2 − a_1²)(a_1a_3 − a_2²)    (4)

is not equal to zero. We may readily see what cubics are excluded by the proviso that (4) is not zero. If (4) is zero, the two quadratics

a_0x² + 2a_1xy + a_2y²,   a_1x² + 2a_2xy + a_3y²

have a common linear factor;† but

a_0x³ + 3a_1x²y + 3a_2xy² + a_3y³ = x(a_0x² + 2a_1xy + a_2y²) + y(a_1x² + 2a_2xy + a_3y²),

so that this common linear factor is also a factor of the cubic, and the cubic has a repeated linear factor. Such a cubic may be written as X²Y, where X and Y are linear in x and y.

† The reader may verify this by putting y = 1: if (4) vanishes, the quadratics a_0x² + 2a_1x + a_2 = 0, a_1x² + 2a_2x + a_3 = 0 have a common root; then a_0x² + 2a_1x + a_2 = 0 and a_0x³ + 3a_1x² + 3a_2x + a_3 = 0 have a common root, and so the latter has a repeated root (compare F′(x) with F(x) = 0).
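The reduction of 4.1 is entirely mechanical: P and Q are found from the two linear equations, α and β are the roots of t² + Pt + Q = 0, and p, q then follow from the first two equations of (3). The sketch below carries this out for one illustrative cubic (sympy; the cubic 2x³ + 9x²y + 15xy² + 9y³ is an arbitrary choice with no squared factor).

```python
import sympy as sp

x, y, t = sp.symbols('x y t')

a0, a1, a2, a3 = 2, 3, 5, 9                       # u = 2x^3 + 9x^2 y + 15x y^2 + 9y^3
u = a0*x**3 + 3*a1*x**2*y + 3*a2*x*y**2 + a3*y**3

# P, Q from  a2 + P*a1 + Q*a0 = 0  and  a3 + P*a2 + Q*a1 = 0
P, Q = sp.symbols('P Q')
P0, Q0 = tuple(sp.linsolve([a2 + P*a1 + Q*a0, a3 + P*a2 + Q*a1], (P, Q)))[0]

alpha, beta = sp.solve(t**2 + P0*t + Q0, t)       # alpha, beta are the roots

# p, q from the first two equations of (3)
p, q = sp.symbols('p q')
p0, q0 = tuple(sp.linsolve([p + q - a0, p*alpha + q*beta - a1], (p, q)))[0]

# the cubic is identically p(x + alpha*y)^3 + q(x + beta*y)^3
assert sp.expand(u - p0*(x + alpha*y)**3 - q0*(x + beta*y)**3) == 0
print(p0, q0, alpha, beta)                        # p = q = 1; alpha, beta are 1 and 2
```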
4.2. The general quartic

(a_0,..., a_4)(x, y)⁴    (5)

may be expressed as the product of two quadratic factors,

(a′x² + 2h′xy + b′y²)(a″x² + 2h″xy + b″y²).    (6)

Let us, at first, preclude quartics that have a squared factor, linear in x and y. Then, by a linear transformation, not necessarily real, we may (by applying Theorem 48, without insisting on the transformation being real) write the two factors of (6) in the forms

λ_1X² + μ_1Y²,   λ_2X² + μ_2Y²,

so that the quartic becomes

(λ_1X² + μ_1Y²)(λ_2X² + μ_2Y²).    (7)

Moreover, since a squared factor is precluded, none of λ_1, μ_1, λ_2, μ_2 is zero and, on making a final substitution

X = k_1X′,   Y = k_2Y′,    (8)

we may write the quartic as

X⁴ + 6mX²Y² + Y⁴.

This is the canonical form for the general quartic. A quartic having a squared factor, hitherto excluded from our discussion, may be written as X²(aX² + 2hXY + bY²).
4.3. Relations among the invariants and covariants of a form are fairly easy to detect when the canonical form is considered. For example, the cubic

u ≡ a_0x³ + 3a_1x²y + 3a_2xy² + a_3y³

is known (2.5) to have a covariant

H ≡ (a_0a_2 − a_1²)x² + ...    [(8) of 2.5],

a covariant

G ≡ (a_0²a_3 − 3a_0a_1a_2 + 2a_1³)x³ + ...    [(10) of 2.5],

and an invariant

Δ ≡ (a_0a_3 − a_1a_2)² − 4(a_0a_2 − a_1²)(a_1a_3 − a_2²).

In these write X = x + αy, Y = x + βy, so that (4.1) u = pX³ + qY³. For the canonical form the covariants and the invariant are easily computed: H becomes pqXY, G becomes pq(pX³ − qY³), and Δ becomes p²q², and direct multiplication shows that

Δu² = G² + 4H³

when u is in the canonical form. To pass back to the general cubic we must take account of the powers of |M| introduced by the substitution x = MX. The factor of H is |M|², because H is a Hessian. The factor |M|³ of G is most easily determined by considering the transformation x = λX, y = λY, which gives A_0 = a_0λ³,..., A_3 = a_3λ³, so that G, being of degree 2 in the coefficients, is multiplied by λ⁶, or |M|³. The factor |M|⁶ of Δ may be determined in the same way. Thus each side of the relation is multiplied by |M|⁶, and the relation, once proved for the canonical form, holds for every cubic that can be reduced to that form. Accordingly, we have proved that, unless Δ = 0,

Δu² = G² + 4H³.    (9)

If Δ = 0, both G and H are, apart from numerical factors, powers of the repeated linear factor of u, and (9) may then be verified directly.
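Since (9) is a polynomial identity in a_0,..., a_3, it can also be confirmed by direct expansion; a sketch (sympy, illustrative only):

```python
import sympy as sp

x, y = sp.symbols('x y')
a0, a1, a2, a3 = sp.symbols('a0 a1 a2 a3')

u = a0*x**3 + 3*a1*x**2*y + 3*a2*x*y**2 + a3*y**3
H = (a0*a2 - a1**2)*x**2 + (a0*a3 - a1*a2)*x*y + (a1*a3 - a2**2)*y**2
G = sp.expand((sp.diff(u, x)*sp.diff(H, y) - sp.diff(u, y)*sp.diff(H, x)) / 3)
Delta = (a0*a3 - a1*a2)**2 - 4*(a0*a2 - a1**2)*(a1*a3 - a2**2)

assert sp.expand(Delta*u**2 - G**2 - 4*H**3) == 0     # the identity (9)
```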
5. Geometrical invariants

5.1. Projective invariants. Let a point P, having homogeneous coordinates x, y, z (say areal, or trilinear, to be definite) referred to a triangle ABC in a plane π, be projected into a point P′ in a plane π′. Let A′B′C′ be the projection on π′ of ABC, and let x′, y′, z′ be the homogeneous coordinates of P′ referred to A′B′C′. Then, as many books on analytical geometry prove (in one form or another), there are constants l, m, n such that

x = lx′,   y = my′,   z = nz′.

Let X, Y, Z be the homogeneous coordinates of P′ referred to a triangle in π′ other than A′B′C′. Then there are relations

x′ = λ_1X + μ_1Y + ν_1Z,   y′ = λ_2X + μ_2Y + ν_2Z,   z′ = λ_3X + μ_3Y + ν_3Z,

so that the projection of a figure in π on to a plane π′ gives

x = l_1X + m_1Y + n_1Z,
y = l_2X + m_2Y + n_2Z,    (1)
z = l_3X + m_3Y + n_3Z,

wherein, since concurrent lines project into concurrent lines, the determinant

| l_1  m_1  n_1 |
| l_2  m_2  n_2 |
| l_3  m_3  n_3 |

is not zero. Thus projection leads to the type of transformation we have been considering in the earlier sections of the chapter. Geometrical properties of figures that are unaltered by projection (projective properties) may be expected to correspond to invariants or covariants of algebraic forms and, conversely, any invariant or covariant of algebraic forms may be expected to have a projective interpretation.

A transformation in two variables,

x = l_1X + m_1Y,   y = l_2X + m_2Y,    (2)

may be considered as the form taken by (1) when only lines through the vertex C of the original triangle of reference are in question; for such an equation as ax² + 2hxy + by² = 0 corresponds to a pair of lines through C; it becomes

AX² + 2HXY + BY² = 0,

say, after transformation by (2), which is a pair of lines through a vertex of the triangle of reference in the projected figure.

We shall not attempt any systematic development of the geometrical approach to invariant theory: we give merely a few indications. The condition that the two pairs of lines ax² + 2hxy + by² = 0, a′x² + 2h′xy + b′y² = 0 should form a harmonic pencil is ab′ + a′b − 2hh′ = 0: this represents a property that is unaltered by projection and, as we have seen, ab′ + a′b − 2hh′ is a joint-invariant of the two forms expressed algebraically. Again, the equation (a, b, c, d, e)(x, y)⁴ = 0 represents four lines through a vertex, and we may expect the projective properties of this set of lines to be expressed by invariants of the quartic. The condition that the four lines should form a harmonic pencil is

J ≡ ace + 2bcd − ad² − b²e − c³ = 0,

and, as we have seen (Example 15, p. 181), J is an invariant of the quartic. The condition that the four lines form an equianharmonic pencil, i.e. that the cross-ratios ρ (the six values arising from the different orders of the four lines) are equal, is ρ² − ρ + 1 = 0, and this is expressed by

I ≡ ae − 4bd + 3c² = 0.

Moreover, I is an invariant of the quartic.

Again, in geometry, the Hessian of a given curve is the locus of a point whose polar conic with respect to the curve is a pair of lines: the locus meets the curve only at the inflexions and multiple points of the curve. Now proper conics project into proper conics and line-pairs into line-pairs, so that the definition turns upon projective properties. Similarly the Jacobian of three curves is the locus of points whose polar lines with respect to the three curves are concurrent; this geometrical definition turns upon projective properties of the curves and, as we should expect, the Jacobian of any three algebraic forms is a covariant.

In view of their close connexion with the geometry of projection, the invariants and covariants we have hitherto been considering are sometimes called projective invariants and covariants.
5.2. Orthogonal invariants. The transformation

x = l_1X + m_1Y + n_1Z,   y = l_2X + m_2Y + n_2Z,   z = l_3X + m_3Y + n_3Z,    (3)

wherein (l_1, m_1, n_1), (l_2, m_2, n_2), and (l_3, m_3, n_3) are the direction-cosines of three mutually perpendicular lines, is a particular type of linear transformation. It leaves unchanged certain functions of the coefficients of

ax² + by² + cz² + 2fyz + 2gzx + 2hxy    (4)

that are not left unchanged by the general linear transformation. If (3) transforms (4) into

a_1X² + b_1Y² + c_1Z² + 2f_1YZ + 2g_1ZX + 2h_1XY,    (5)

we have, since (3) also transforms x² + y² + z² into X² + Y² + Z², that the values of λ for which (4) − λ(x² + y² + z²) is a pair of linear factors are the values of λ for which (5) − λ(X² + Y² + Z²) is a pair of linear factors. These values of λ are the roots of

| a−λ  h    g   |          | a_1−λ  h_1    g_1  |
| h    b−λ  f   | = 0,     | h_1    b_1−λ  f_1  | = 0,
| g    f    c−λ |          | g_1    f_1    c_1−λ |

respectively. Hence, comparing coefficients of the powers of λ,

a + b + c = a_1 + b_1 + c_1,   A + B + C = A_1 + B_1 + C_1,

where A, B, C,... denote co-factors (A = bc − f², etc.), and the discriminant Δ of (4) is equal to the discriminant of (5). That is to say, a + b + c and A + B + C are invariant under orthogonal transformation; they are not invariant under the general linear transformation. On the other hand, the discriminant Δ is an invariant under all linear transformations.

Since (3) is equivalent to a change of rectangular axes, a geometrical interpretation of orthogonal invariants is to be sought among properties that depend solely on measurement. For example,

(λλ′ + μμ′ + νν′) / {√(λ² + μ² + ν²)√(λ′² + μ′² + ν′²)}    (6)

is an invariant of the two linear forms λx + μy + νz, λ′x + μ′y + ν′z under any orthogonal transformation: for (6) is the cosine of the angle between the two planes. Such examples have but little algebraic interest.
6. Arithmetic invariants

6.1. There are certain properties of forms (or curves) that are obviously unchanged when the variables undergo the transformation x = MX: such are the degree of a form, and so on. One of the less obvious† of these arithmetic invariants, as they are called, is the rank of the matrix formed by the coordinates of a number of points.

Let the components of x be x, y,..., z, n in number. Let m points (or particular values of x) be given; say

(x_1,..., z_1),  ...,  (x_m,..., z_m).

Suppose the matrix of these coordinates is of rank r (< m). Then (Theorem 31) we can choose r rows and express the others as sums of multiples of these r rows. For convenience of writing, suppose all rows after the rth can be expressed as sums of multiples of the first r rows. Then there are constants λ_1k,..., λ_rk such that, for any letter t of x,..., z,

t_{r+k} = λ_1k t_1 + ... + λ_rk t_r.    (1)

† This also is of importance in geometry.

The transformation x = MX has an inverse X = M⁻¹x or, written in full, say

X = c_11 x + ... + c_1n z,   ...,   Z = c_n1 x + ... + c_nn z.

Writing (X_1,..., Z_1), ..., (X_m,..., Z_m) for the transforms of the given points, and supposing that T is any one of X,..., Z, we have, by (1), since each of X,..., Z is a sum of fixed multiples c of x, y,..., z,

T_{r+k} = λ_1k T_1 + ... + λ_rk T_r.    (2)

Hence (1) implies (2); conversely, by using x = MX, (2) implies (1). By Theorems 31 and 32 it follows that the ranks of the two matrices are equal.
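The invariance of the rank is easily exhibited numerically; in the sketch below (numpy; the four points and the matrix M are illustrative choices) the rank of the matrix of coordinates is unaltered when every point x is replaced by M⁻¹x.

```python
import numpy as np

pts = np.array([[1., 2., 3.],      # rows are points in three homogeneous coordinates
                [2., 4., 6.],      # a multiple of the first row, so the rank is 2
                [0., 1., 1.],
                [1., 3., 4.]])

M = np.array([[1., 2., 0.],
              [0., 1., 1.],
              [1., 0., 3.]])       # any non-singular matrix

new_pts = pts @ np.linalg.inv(M).T           # each row x replaced by M^{-1} x

assert np.linalg.matrix_rank(pts) == np.linalg.matrix_rank(new_pts) == 2
```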
6.2. Transvectants. We conclude with a method of forming covariants and invariants of binary forms which is derived from the consideration of particular values of the variables. Let (x_1, y_1), (x_2, y_2) be two cogredient pairs of variables (x, y), each subject to the transformation

x = lX + mY,   y = l′X + m′Y,    (3)

wherein |M| = lm′ − l′m. Then

∂/∂X_i = l ∂/∂x_i + l′ ∂/∂y_i,   ∂/∂Y_i = m ∂/∂x_i + m′ ∂/∂y_i   (i = 1, 2)

(compare Chap. X, 6.2). If now the operators in question operate on a function of the two pairs of variables, we have

∂²/∂X_1∂Y_2 − ∂²/∂Y_1∂X_2 = (lm′ − l′m)(∂²/∂x_1∂y_2 − ∂²/∂y_1∂x_2),    (4)

so that (4) is an invariant operator. Let u_1, v_1 be binary forms u, v when we put x = x_1, y = y_1; u_2, v_2 the same forms when we put x = x_2, y = y_2; and U_1, V_1, U_2, V_2 the corresponding forms when u, v are expressed in terms of X, Y and the particular values X_1, Y_1 and X_2, Y_2 are introduced. Then, operating on the product U_1V_2, we have

(∂²/∂X_1∂Y_2 − ∂²/∂Y_1∂X_2)(U_1V_2) = (lm′ − l′m)(∂²/∂x_1∂y_2 − ∂²/∂y_1∂x_2)(u_1v_2),

and so on. That is, for any integer r,

(∂²/∂X_1∂Y_2 − ∂²/∂Y_1∂X_2)^r (U_1V_2) = (lm′ − l′m)^r (∂²/∂x_1∂y_2 − ∂²/∂y_1∂x_2)^r (u_1v_2).

These results are true for all pairs (x_1, y_1) and (x_2, y_2). We may, then, replace both pairs by (x, y) and so obtain the theorem that

Σ_{s=0}^{r} (−1)^s (r!/s!(r−s)!) {∂^r u/∂x^{r−s}∂y^s}{∂^r v/∂x^s∂y^{r−s}}    (5)

is a covariant (possibly an invariant) of the two forms u and v. It is called the rth TRANSVECTANT of u and v. Finally, when we take v = u in (5), we obtain covariants of the single form u. The transvectant is the basis of a systematic attack on the invariants and covariants of a given set of forms.
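For instance, the second transvectant of a binary quadratic with itself is, apart from a numerical factor, its discriminant. The sketch below (sympy; illustrative only) applies (5) directly:

```python
import sympy as sp

x, y = sp.symbols('x y')

def transvectant(u, v, r):
    # the rth transvectant (5), apart from a numerical factor
    return sp.expand(sum((-1)**i * sp.binomial(r, i)
                         * sp.diff(u, x, r - i, y, i)
                         * sp.diff(v, x, i, y, r - i)
                         for i in range(r + 1)))

a, b, c = sp.symbols('a b c')
u = a*x**2 + 2*b*x*y + c*y**2
print(transvectant(u, u, 2))       # -> 8*a*c - 8*b**2, a multiple of the discriminant
```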
7. Further reading

The present chapter has done no more than sketch some of the elementary results from which the theory of invariants and covariants starts. The following books deal more fully with the subject:

E. B. ELLIOTT, An Introduction to the Algebra of Quantics.
GRACE and YOUNG, The Algebra of Invariants (Cambridge, 1902).
WEITZENBÖCK, Invariantentheorie (Groningen, 1923).
H. W. TURNBULL, The Theory of Determinants, Matrices, and Invariants (Glasgow, 1928).
EXAMPLES XIV C

1. Write down the most general function of the coefficients in a_0x³ + 3a_1x²y + 3a_2xy² + a_3y³ which is (i) homogeneous and of degree 2 in the coefficients, isobaric, and of weight 3; (ii) homogeneous and of degree 3 in the coefficients, isobaric, and of weight 4.
HINT. (i) Three is the sum of 3 and 0 or of 2 and 1; the possible products are numerical multiples of a_0a_3 and a_1a_2.
Ans. (i) αa_0a_3 + βa_1a_2, where α, β are numerical constants; (ii) αa_0a_1a_3 + βa_0a_2² + γa_1²a_2.

2. What is the most general function of the coefficients in (a_0,..., a_4)(x, y)⁴ which is ... ?

3. What is the weight of |a_rs|, the discriminant of the quadratic form a_11x_1² + a_22x_2² + a_33x_3² + 2a_12x_1x_2 + 2a_23x_2x_3 + 2a_31x_3x_1?
HINT. Rewrite the form as Σ a_{r,s,t} x_1^r x_2^s x_3^t, summed for r + s + t = 2. Thus a_11 = a_{2,0,0}, a_23 = a_{0,1,1}, and so on.

4. Prove that (ab′ − a′b)² + 4(ah′ − a′h)(bh′ − b′h) is a joint-invariant of the forms ax² + 2hxy + by², a′x² + 2h′xy + b′y².

5. Verify the result of 3.3 in respect of (i) the discriminant of a quadratic form, (ii) the invariant Δ (4.3) of a cubic, (iii) the invariants I, J (5.1) of a quartic.

6. Show that the general cubic ax³ + 3bx²y + 3cxy² + dy³ can be reduced to the canonical form AX³ + DY³ by the substitution X = px + qy, Y = p′x + q′y, where the Hessian of the cubic is

(ac − b²)x² + (ad − bc)xy + (bd − c²)y² = (px + qy)(p′x + q′y).

HINT. The Hessian of the reduced form is a multiple of XY. The method fails when the Hessian is a perfect square or vanishes identically; in the latter case (ac = b², bd = c²) the cubic is a cube.

7. Find the equation of the double lines of the involution determined by the two line-pairs

ax² + 2hxy + by² = 0,   a′x² + 2h′xy + b′y² = 0.

The required pair a_1x² + 2h_1xy + b_1y² = 0 is harmonic to each of the given pairs, so that

ab_1 + a_1b − 2hh_1 = 0,   a′b_1 + a_1b′ − 2h′h_1 = 0.

8. If two conics S, S′ are such that a triangle can be inscribed in S′ and circumscribed to S, then the invariants Δ, Θ, Θ′, Δ′ (Example 11, p. 180) satisfy the relation Θ² = 4ΔΘ′, independently of the choice of triangle.
HINT. Take the triangle as triangle of reference and verify the relation directly; the general result follows by linear transformation.

9. Prove that the rank of the matrix of coefficients a_rs of m linear forms a_r1x_1 + ... + a_rnx_n (r = 1,..., m) in n variables is an arithmetic invariant.

10. Prove that, if the points (x_r, y_r, z_r) (r = 1, 2, 3) are transformed by x = MX, the determinant whose rows are (x_r, y_r, z_r) is an invariant.

11. Find a joint-invariant, linear in the coefficients of each form, of the two binary forms (a_0,..., a_n)(x, y)^n, (b_0,..., b_n)(x, y)^n.
Ans. a_0b_n − na_1b_{n−1} + ½n(n−1)a_2b_{n−2} − ... .

12. Prove, by Theorems 34 and 37, that the rank of a quadratic form is unaltered by any non-singular linear transformation of the variables.
CHAPTER XV

LATENT VECTORS

1. Introduction

1.1. Vectors in three dimensions. In three-dimensional geometry, with rectangular axes Oξ_1, Oξ_2, and Oξ_3, a vector OP has components (ξ_1, ξ_2, ξ_3); its length is

√(ξ_1² + ξ_2² + ξ_3²).    (1)

Two vectors, with components (ξ_1, ξ_2, ξ_3) and (η_1, η_2, η_3), are orthogonal if

ξ_1η_1 + ξ_2η_2 + ξ_3η_3 = 0.    (2)

Finally, if e_1, e_2, e_3 denote the vectors with components

(1, 0, 0),   (0, 1, 0),   (0, 0, 1),

a vector ξ with components (ξ_1, ξ_2, ξ_3) may be written as

ξ = ξ_1e_1 + ξ_2e_2 + ξ_3e_3.

Each of e_1, e_2, e_3 is of unit length, by (1), and the three vectors are, by (2), mutually orthogonal. They are, of course, unit vectors along the axes.

1.2. The algebraical precision of matrix notation is available to us if we use a single-column matrix to represent a vector. A vector OP is represented by a single-column matrix ξ whose elements are ξ_1, ξ_2, ξ_3; the length of the vector is defined to be √(ξ′ξ), in matrix form. The transpose ξ′ of ξ is a single-row matrix with elements ξ_1, ξ_2, ξ_3, and ξ′ξ is a single-element matrix whose element is ξ_1² + ξ_2² + ξ_3². The condition for ξ and η to be orthogonal is, in matrix notation,

ξ′η = 0.    (2)

The matrices e_1, e_2, e_3 are defined as the single-column matrices with elements (1, 0, 0), (0, 1, 0), (0, 0, 1), and

ξ = ξ_1e_1 + ξ_2e_2 + ξ_3e_3.

The change from three dimensions to n dimensions is immediate. With n variables, a vector is a single-column matrix ξ with elements ξ_1, ξ_2,..., ξ_n. The LENGTH of the vector is DEFINED to be √(ξ′ξ). Two vectors ξ and η are ORTHOGONAL when

ξ_1η_1 + ξ_2η_2 + ... + ξ_nη_n = 0,    (4)

which, in matrix notation, may be expressed in either of the two forms

ξ′η = 0,   η′ξ = 0.

A UNIT VECTOR ξ is one for which ξ′ξ = 1. Here, and later, 1 denotes the unit single-element matrix.† Finally, we define the unit vector e_r as the single-column matrix with 1 in the rth row and zero elsewhere; in this notation the n vectors e_r (r = 1,..., n) are mutually orthogonal and each is of unit length.

† A product AB is defined (cf. p. 73) only when the number of columns of A is equal to the number of rows of B; thus ξ′η is a single-element matrix, while ηξ′ is a square matrix of order n.
1.3. Orthogonal matrices. In this subsection we use x_r to denote vectors.

THEOREM 59. Let x_1, x_2,..., x_n be n mutually orthogonal unit vectors. Then the square matrix X whose columns are the n vectors x_1, x_2,..., x_n is an orthogonal matrix.

Proof. Let X′ be the transpose of X. Then the product X′X is a square matrix of order n, and the element in the rth row and sth column of the product is, on using the summation convention,

x_{αr} x_{αs}.

When r ≠ s this is zero, since x_r and x_s are orthogonal; when r = s it is 1, since x_r is a unit vector. Hence X′X = I and X is an orthogonal matrix.

The proof may also be set out thus. The element in the rth row of X′ and sth column of X is the single-element matrix x′_r x_s. Let r ≠ s; then x′_r x_s = 0, since x_r and x_s are orthogonal. Let r = s; then x′_r x_r = 1, since x_r is a unit vector. Hence X′X = I and X is an orthogonal matrix.

COROLLARY. In the notation of the theorem [so that X′ = X⁻¹], let y_r = X′x_r; then y_r = e_r.

Proof. Let Y be the square matrix whose columns are y_1,..., y_n. Since y_r = X′x_r and the columns of X are x_1,..., x_n, Y = X′X = I, and the corollary is established.
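Theorem 59 may be illustrated numerically; in the following sketch (numpy; the three columns are an arbitrary mutually orthogonal set) the matrix built from orthogonal unit columns satisfies X′X = I, so that X′ = X⁻¹.

```python
import numpy as np

X = np.array([[1.,  1.,  1.],
              [1., -1.,  1.],
              [1.,  0., -2.]])          # columns are mutually orthogonal
X /= np.linalg.norm(X, axis=0)          # make each column a unit vector

assert np.allclose(X.T @ X, np.eye(3))          # X'X = I
assert np.allclose(X.T, np.linalg.inv(X))       # X' = X^{-1}
```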
1.4. Linear dependence. As in 1.3, x_r denotes a vector with elements x_{1r}, x_{2r},..., x_{nr}. The vectors x_1,..., x_m are LINEARLY DEPENDENT if there are numbers l_1,..., l_m, not all zero, for which

l_1x_1 + ... + l_mx_m = 0.    (1)

If (1) is true only when l_1 = ... = l_m = 0, the vectors are linearly independent.

LEMMA 1. Any n + k vectors of n elements are linearly dependent.

Proof. Let there be given n + k vectors. Let A be a matrix having these vectors as columns. The rank r of A cannot exceed n and, by Theorem 31 (with columns for rows), we can select r columns of A and express the others as sums of multiples of the selected r columns; hence the vectors are linearly dependent.

In particular, suppose that among the given vectors there are n, say x_1,..., x_n, for which the determinant |x_rs| is not zero. Writing

x_s = Σ_r x_rs e_r   (s = 1,..., n),    (2)

we may, since |x_rs| ≠ 0, solve (2) for the unit vectors and obtain

e_r = Σ_s X_rs x_s   (r = 1,..., n).    (3)

Any vector x_p (p > n) is Σ_r x_rp e_r and therefore, by (3), is a sum of multiples of x_1,..., x_n.

LEMMA 2. Any n mutually orthogonal unit vectors are linearly independent.

Proof. Let x_1,..., x_n be mutually orthogonal unit vectors. Suppose there are numerical multiples k_r for which

k_1x_1 + ... + k_nx_n = 0.    (4)

Pre-multiply by x′_s, where s is any one of 1,..., n. Then k_s x′_s x_s = 0 and hence, since x′_s x_s = 1, k_s = 0. Hence (4) is true only when k_1 = ... = k_n = 0.

Aliter. By Theorem 59, X′X = I and hence the determinant |X| ≠ 0. The columns x_1,..., x_n of X are therefore linearly independent.
2. Latent vectors

2.1. Definition. We lead up to the definition of a latent vector, and its logical justification, by a theorem.

THEOREM 60. Let A be a given square matrix of n rows and columns and λ a numerical constant. The matrix equation

Ax = λx,    (1)

in which x is a single-column matrix of n elements, has a solution with at least one element of x not zero if and only if λ is a root of the equation

|A − λI| = 0.    (2)

Proof. Written in full, (1) becomes the n linear equations

a_{t1}ξ_1 + a_{t2}ξ_2 + ... + a_{tn}ξ_n = λξ_t   (t = 1,..., n).

These linear equations in the n variables ξ_1,..., ξ_n have a non-zero solution if and only if λ is a root of the equation [Theorem 11]

| a_11 − λ   a_12     ...   a_1n   |
| a_21       a_22 − λ ...   a_2n   |  = 0,    (3)
| .......................          |
| a_n1       a_n2     ...   a_nn − λ |

and (3) is merely (2) written in full.†

† Chapter X, 7.
DEFINITION. The roots in λ of |A − λI| = 0 are the LATENT ROOTS of the matrix A; when λ is a latent root, a non-zero vector x satisfying Ax = λx is a LATENT VECTOR of A corresponding to the root λ.

2.2. THEOREM 61. Let x_r and x_s be latent vectors of A corresponding to distinct latent roots λ_r and λ_s. Then x_r and x_s are always linearly independent and, when A is symmetrical, x_r and x_s are orthogonal.

Proof. By hypothesis,

Ax_r = λ_r x_r,   Ax_s = λ_s x_s,   λ_r ≠ λ_s.    (1)

(i) Suppose that there are constants k_r, k_s such that

k_r x_r + k_s x_s = 0.    (2)

Then, since Ax_s = λ_s x_s, (2) gives

A(k_r x_r + k_s x_s) − λ_s(k_r x_r + k_s x_s) = 0,

that is, by (1),

k_r(λ_r − λ_s)x_r = 0.

By hypothesis x_r is not the zero vector and λ_r ≠ λ_s; hence k_r = 0, and (2) then reduces to k_s x_s = 0, so that k_s = 0. Accordingly, (2) is true only if k_r = k_s = 0, and x_r and x_s are linearly independent.

(ii) Now let A′ = A. By the first equation in (1),

x′_s A x_r = λ_r x′_s x_r.    (3)

Also, transposing the second equation in (1), x′_s A′ = λ_s x′_s, and so, since A′ = A,

x′_s A x_r = λ_s x′_s x_r.    (4)

From (3) and (4),

(λ_r − λ_s) x′_s x_r = 0,

and, since λ_r − λ_s ≠ 0, x′_s x_r = 0. Hence x_r and x_s are orthogonal.
3. The orthogonal reduction of a real quadratic form. We have seen (Theorem 49, p. 153) that a real quadratic form ξ′Aξ in the variables ξ_1,..., ξ_n can be reduced to the form

λ_1η_1² + λ_2η_2² + ... + λ_nη_n².

We now prove that, when A has distinct latent roots, the reduction can be effected by an orthogonal transformation.

THEOREM 62. Let A be a real symmetric matrix whose latent roots λ_1,..., λ_n are all distinct. Then there are n real unit latent vectors x_1,..., x_n corresponding to these roots. If X is the matrix whose columns are x_1,..., x_n, the transformation ξ = Xη from the variables ξ to the variables η is a real orthogonal transformation and

ξ′Aξ = λ_1η_1² + ... + λ_nη_n².

Proof. The roots λ_1,..., λ_n are necessarily real (Theorem 45) and so the elements of a latent vector x_r satisfying

Ax_r = λ_r x_r    (1)

can be found in terms of real numbers;† and, if the length of any one such vector is k, the vector k⁻¹x_r is a real unit vector satisfying (1). Hence there are n real unit latent vectors x_1,..., x_n, and the matrix X having these vectors in its n columns has real numbers as its elements.

By Theorem 61, x_r is orthogonal to x_s when r ≠ s, and so

x′_r x_s = δ_rs,    (2)

where δ_rr = 1 and δ_rs = 0 when r ≠ s (r, s = 1,..., n). As in the proof of Theorem 59, the element in the rth row and sth column of the product X′X is δ_rs; whence

X′X = I

and the transformation ξ = Xη is orthogonal. Moreover

ξ′Aξ = η′X′AXη,    (3)

and the discriminant of the form in the variables η_1,..., η_n is the matrix X′AX. Now the columns of AX are Ax_1,..., Ax_n and these, from (1), are λ_1x_1,..., λ_nx_n. Hence the element in the rth row and sth column of X′AX is

(i) λ_s x′_r x_s = 0 when r ≠ s, by (2);
(ii) λ_r x′_r x_r = λ_r when r = s.

Thus X′AX has λ_1,..., λ_n as its elements in the principal diagonal and zero elsewhere. The form η′(X′AX)η is therefore

λ_1η_1² + ... + λ_nη_n².

† When x_r is a solution and a is any number, real or complex, ax_r is also a solution. For our present purposes, we leave aside all complex values of a.
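Theorem 62 may be checked numerically; the sketch below (numpy; the symmetric matrix A, with latent roots 1, 2, 4, is an illustrative choice) verifies that the matrix of unit latent vectors is orthogonal and that X′AX is the diagonal matrix of latent roots.

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])            # real symmetric; latent roots 1, 2, 4

lams, X = np.linalg.eigh(A)             # latent roots and unit latent vectors (columns)

assert np.allclose(X.T @ X, np.eye(3))          # X is orthogonal
assert np.allclose(X.T @ A @ X, np.diag(lams))  # X'AX = diag(latent roots)
```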
3.2. When the latent roots of A are not all distinct, the theorem remains true whether A has repeated roots or not. It is in fact true that, whatever the multiplicities of the roots of |A − λI| = 0, there are n mutually orthogonal real unit latent vectors x_1,..., x_n of A; if X is the matrix with these vectors as columns, the transformation ξ = Xη is a real orthogonal transformation that gives

ξ′Aξ = λ_1η_1² + ... + λ_nη_n²,

wherein λ_1,..., λ_n are the roots (some of them now repeated) of the characteristic equation |A − λI| = 0.

The proof will not be given here. The fundamental difficulty is to prove† that, when λ_1 (say) is a k-ple root of |A − λI| = 0, there are k linearly independent latent vectors corresponding to λ_1. The setting up of a system of n mutually orthogonal unit vectors is then effected by Schmidt's orthogonalization process‡ or by the equivalent process illustrated below.

Let y_1, y_2, y_3 be three given real linearly independent vectors. Choose k_1 so that x_1 = k_1y_1 is a unit vector. Choose α so that y_2 + αx_1 is orthogonal to x_1; that is,

x′_1(y_2 + αx_1) = 0;    (1)

since x′_1x_1 = 1, α = −x′_1y_2.§ Since y_2 and x_1 are linearly independent, the vector y_2 + αx_1 is not zero and has a non-zero length, l_2 say. Put

x_2 = (y_2 + αx_1)/l_2.

Then x_2 is a unit vector orthogonal to x_1. Now determine β and γ so that

y_3 + βx_1 + γx_2

is orthogonal to x_1 and to x_2; that is, since x_1 and x_2 are orthogonal unit vectors,

x′_1(y_3 + βx_1 + γx_2) = 0,  so that  β = −x′_1y_3,    (2)
x′_2(y_3 + βx_1 + γx_2) = 0,  so that  γ = −x′_2y_3.    (3)

The vector y_3 + βx_1 + γx_2 is not zero, since x_1, x_2, y_3 are linearly independent; on dividing it by its length we have a unit vector x_3 which, by (2) and (3), is orthogonal to x_1 and x_2. Thus the three x vectors are mutually orthogonal unit vectors.

Moreover, if

Ay_1 = λy_1,   Ay_2 = λy_2,   Ay_3 = λy_3,

then also

Ax_1 = λx_1,   Ax_2 = λx_2,   Ax_3 = λx_3,

so that x_1, x_2, x_3 are latent vectors whenever y_1, y_2, y_3 are latent vectors corresponding to one and the same latent root λ.

† Quart. J. Math. (Oxford) 18 (1947) 183–5 gives a proof by J. A. Todd; another treatment is given, ibid., pp. 186–92, by W. L. Ferrar. Numerical examples are easily dealt with by actually finding the vectors; see Examples XV, 11 and 14.
‡ W. L. Ferrar, Finite Matrices, Theorem 29, p. 139.
§ More precisely, α is the numerical value of the element in the single-entry matrix x′_1y_2. Both x′_1x_1 and x′_1y_2 are single-entry matrices.
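The process just described is easily mechanized; a rough sketch of the orthogonalization step follows (numpy; the vectors y_1, y_2, y_3 are an arbitrary linearly independent set).

```python
import numpy as np

def orthogonalize(vectors):
    """Schmidt's process: replace the given linearly independent vectors
    by mutually orthogonal unit vectors spanning the same space."""
    xs = []
    for y in vectors:
        for x in xs:
            y = y - (x @ y) * x          # remove the component along each earlier x
        xs.append(y / np.linalg.norm(y))
    return xs

y1, y2, y3 = (np.array(v, float) for v in ([1, 1, 0], [1, 0, 1], [0, 1, 1]))
x1, x2, x3 = orthogonalize([y1, y2, y3])

assert np.allclose([x1 @ x2, x1 @ x3, x2 @ x3], 0)     # mutually orthogonal
assert np.allclose([x1 @ x1, x2 @ x2, x3 @ x3], 1)     # unit vectors
```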
4. Collineation

Let x be a single-column matrix with elements ξ_1,..., ξ_n, and so for other letters. Let these elements be the current coordinates of a point in a system of n homogeneous coordinates, and let A be a given square matrix of order n. Then the matrix relation

y = Ax    (1)

associates with each point x a point y.

Now refer the points to a new coordinate system, the new coordinates X and Y of the points x and y being given by

x = TX,   y = TY,

where T is a non-singular square matrix. Then (1) is expressed by

TY = ATX

or, since T is non-singular, by

Y = T⁻¹ATX.    (2)

The effect of replacing A in (1) by a matrix of the type T⁻¹AT amounts to considering the same geometrical relation expressed in a different coordinate system. The study of such replacements, A by T⁻¹AT, is important in projective geometry. Here we shall prove only one theorem, the analogue of Theorem 62.
4.1. THEOREM 63. Let the latent roots λ_1,..., λ_n of A be all distinct, and let t_1,..., t_n be latent vectors of A corresponding to these roots. Let T be the square matrix whose columns are t_1,..., t_n. Then (i) T is non-singular, and (ii)

T⁻¹AT = diag(λ_1,..., λ_n).

Proof. (i) We prove that T is non-singular by showing that t_1,..., t_n are linearly independent. Let k_1,..., k_n be numbers such that

k_1t_1 + ... + k_nt_n = 0.    (1)

Then, since At_n = λ_nt_n,

(A − λ_nI)(k_1t_1 + ... + k_nt_n) = 0,

that is,

k_1(λ_1 − λ_n)t_1 + ... + k_{n−1}(λ_{n−1} − λ_n)t_{n−1} = 0.

Since λ_1,..., λ_n are all distinct, this is a relation

c_1k_1t_1 + ... + c_{n−1}k_{n−1}t_{n−1} = 0,    (2)

wherein the numbers c_1,..., c_{n−1} are all non-zero. A repetition of the same argument, with n−1 for n, gives a like relation in t_1,..., t_{n−2}, and so, step by step, we come finally to

a_1k_1t_1 = 0,   a_1 ≠ 0.

By hypothesis, t_1 is a non-zero vector and therefore k_1 = 0. We can repeat the same argument to show that a linear relation

k_2t_2 + ... + k_nt_n = 0

implies k_2 = 0; and so on. Hence (1) holds only if

k_1 = k_2 = ... = k_n = 0.

This proves that t_1,..., t_n are linearly independent and, these being the columns of T, the rank of T is n and T is non-singular.

(ii) Moreover, the columns of AT are

At_1,..., At_n,  that is,  λ_1t_1,..., λ_nt_n,

so that† AT = T diag(λ_1,..., λ_n), and therefore

T⁻¹AT = diag(λ_1,..., λ_n).

† If necessary, write the matrices out in full: the matrix whose columns are λ_1t_1,..., λ_nt_n is

[ t_11λ_1  t_12λ_2  ...  t_1nλ_n ]
[ ...                             ]  =  T diag(λ_1,..., λ_n).
[ t_n1λ_1  t_n2λ_2  ...  t_nnλ_n ]

When A is not symmetrical, it does not follow that a k-ple root of |A − λI| = 0 gives rise to k linearly independent latent vectors.
If it so happens that A has n linearly independent latent vectors t_1,..., t_n and T has these vectors as columns, then, as in the proof of the theorem,

T⁻¹AT = diag(λ_1,..., λ_n),

whether the latent roots are distinct or not. On the other hand, a matrix need not possess n linearly independent latent vectors. We illustrate by simple examples.

(i) Let

A = [ α  1 ]
    [ 0  α ].

The characteristic equation is (α − λ)² = 0. A latent vector x must satisfy Ax = αx; that is,

αx_1 + x_2 = αx_1,   αx_2 = αx_2.

From the first of these, x_2 = 0, and the only non-zero vectors to satisfy Ax = αx are those with x_2 = 0: they are numerical multiples of the single vector with elements (1, 0).

(ii) Let

A = [ α  0 ]
    [ 0  α ].

A latent vector x must again satisfy Ax = αx. This now requires merely αx_1 = αx_1, αx_2 = αx_2, which every vector satisfies: the two unit vectors e_1, e_2 are linearly independent latent vectors, and all vectors are latent vectors of A.

5. Commutative matrices and latent vectors

Two matrices A and B may or may not commute; they may or may not have common latent vectors. We conclude this chapter with two relatively simple theorems that connect the two possibilities. Throughout we take A and B to be square matrices of order n.
5.1. THEOREM 64. Let A and B have n linearly independent latent vectors t_1,..., t_n in common. Then AB = BA.

Proof. Let t_1,..., t_n be latent vectors of A corresponding to latent roots λ_1,..., λ_n, say; as latent vectors of B let them correspond to latent roots μ_1,..., μ_n, say. That is,

At_r = λ_r t_r,   Bt_r = μ_r t_r   (r = 1,..., n).

As in 4.1 (ii), p. 210, T being the matrix whose columns are t_1,..., t_n,

AT = TL,   T⁻¹AT = L ≡ diag(λ_1,..., λ_n),
BT = TM,   T⁻¹BT = M ≡ diag(μ_1,..., μ_n).

It follows that

T⁻¹ABT = (T⁻¹AT)(T⁻¹BT) = LM = ML = (T⁻¹BT)(T⁻¹AT) = T⁻¹BAT,

and, on pre-multiplying by T and post-multiplying by T⁻¹, AB = BA.

5.2. We now ask, conversely, whether matrices that commute necessarily have latent vectors in common. Suppose that AB = BA. Let t be a latent vector of A corresponding to a latent root λ. Then At = λt and

ABt = BAt = λBt.    (1)

[If B is non-singular, Bt ≠ 0, since B⁻¹Bt = t ≠ 0.] Hence either Bt = 0 or Bt is a latent vector of A corresponding to λ; and if, in particular, Bt is a multiple of t, say Bt = kt, then t is a latent vector of B corresponding to the latent root k of B.

Suppose there are not more than m linearly independent latent vectors of A corresponding to λ, say t_1,..., t_m. By 1.4, m ≤ n. Then, since

At_r = λt_r   (r = 1,..., m),

any sum of multiples, not all zero,

k_1t_1 + ... + k_mt_m    (2)

is a latent vector of A corresponding to λ. By our hypothesis that there are not more than m such vectors which are linearly independent, every latent vector of A corresponding to λ is of the form (2).

As in (1),

ABt_r = λBt_r   (r = 1,..., m),

so that Bt_r is either zero or a latent vector of A corresponding to λ; it is therefore of the form (2). Hence there are constants k_sr such that

Bt_r = Σ_{s=1}^{m} k_sr t_s   (r = 1,..., m).

Let K denote the matrix

[ k_11 ... k_1m ]
[ ...            ]
[ k_m1 ... k_mm ].

Then there are numbers l_1,..., l_m, not all zero, and a number d, such that

Σ_{r=1}^{m} k_sr l_r = d l_s   (s = 1,..., m);

for we may take d to be a latent root of K and (l_1,..., l_m) a corresponding latent vector. With these values,

B(Σ_r l_r t_r) = Σ_s (Σ_r k_sr l_r) t_s = d Σ_s l_s t_s.

Hence d is a latent root of B and Σ l_r t_r is a latent vector of B corresponding to it; being of the form (2), it is also a latent vector of A corresponding to λ. We have thus proved

THEOREM 65. Let AB = BA. Then to each distinct latent root λ of A there corresponds at least one latent vector of A which is also a latent vector of B.†

† For a comprehensive treatment of commutative matrices and their latent vectors see S. N. Afriat, Quart. J. Math. (2) 5 (1954) 82–85.
EXAMPLES
XV
Referred to rectangular axes Oxyz the direction cosines of OX, OY, (l^m^nj, (I 2r 2 ,n 2 ), (/ 3 ,W3,n 3 ). The vectors OX, OY, OZ are mutually orthogonal. Show that
1.
OZ
are
/!
1
"2
is
-6,6, -6),
(c,0,
-c,0),
(0,d,0, -d).
Verify that they are mutually orthogonal and find positive values of a, 6, c, d that make these vectors columns of an orthogonal matrix. 3. The vectors x and y are orthogonal and T is an orthogonal matrix.
Prove that
4.
X = Tx
and
Ty
bt
The
vectors a and
b have components
and (2,3,4,5).
Find
is
*, so that
= b+
fc
a
l lt
1)
and d
lt
2, a
,n 2
the
for which,
four
5.
c^
=
,
c1
+n
a,
of a square matrix
and
A' A =
diag( a ?,a?,ai,aj).
Show
is
-2),
(0,0,1)
-2),
(0,2,2),
(0,0,3).
Find
0],
1
-2
3J
is a + (0,1,-1), 1, = (0,0,1) and that any values of the constants k v and fc r 8. Prove that, if x is a latent vector of A corresponding to a latent root A of A and C = TAT~ l then
showing (i) that the first has only two linearly independent latent vectors; (ii) that the second has three linearly independent latent vectors
(1,0,1), 1 2 li latent vector for
k^ k^
LATENT VECTORS
and Tx.
9.
is
215
a latent vector of
if
also that,
Find
and 7 y. linearly independent, so also are three unit latent vectors of the symmetrical matrix
4
x and y an
01
-3
[2
in the
1
-3J
form
<>'.
2.r 2
KU)-
K-i)
.
- 2X 2 -2F 2 -4Z 2
10.
x, y, z
= 2X -Y*-Z*.
2
11.
form
are
-1, -1,
Prove that, corresponding to A 1, ~ ^ -f 2/~i~ 2 (i) (x, y,z) is a latent vector whenever and (1,0, 1) are linearly independent and (ii) (1,- 1,0)
.<;
( ;
that
1,0)
and
(\-\-k t -~k %
\)
are orthogonal
when
V2'~V2'
are orthogonal unit latent vectors. Prove that a unit latent vector corresponding to A
is
"
Verify that when
(-L .1
WS'VS'VS/'
T is
~ TX,
where x
12.
r=
(x,y,z) and
--
X=
(X,Y
F,
z
Z),
Prove that
With n
a: Z, y variables x l9 ...,x n ,
~ X is an orthogonal transformation.
A^,
...,
show that
xa
xl
\vhero
tion.
13.
a,]8,..../c is
^C a ,
.r n
= XK
a permutation of
l,2,...,n, is
an orthogonal transforma-
2yz+2zx
14.
(F
-Z
)V2.
Whenr =
1,2,3,
2, 3,
4 and
X'.4X
Example
1 1
ia
the
216
LATENT VECTORS
A
are
1,
1,
1,
and that
(1,1,1,1)
-1,0),
(1,-1,-1,1),
.
are mutually orthogonal latent vectors of A Find the matrix T whose columns are unit vectors parallel to the above ~- TX, (i.e. numerical multiples of thorn) and show that, when x
mr' J A. *~\. A.
.
V3 V
^
ji\.
V2 2
Y 2 -l V-.V 2 ^V 4. 3
*
i"
15.
TX whereby
16.
latent vectors.
17.
square matrix A, of order n, has n mutually orthogonal unit Prove that A is symmetrical. Find the latent vectors of the matrices
120
[1
18.
1],
[-2
1
-1
2
1
-51.
1
3J
L3
(1,1,0),
6J
The matrix
(1,2,3)
,
.
has the 1 The matrix 1 0, corresponding to the latent roots A same latent vectors corresponding to the latent roots /x 1,2,3. Find
the elements of
19.
A and B and verify that A B =- BA The square matrices A and B are of order n; A has n distinct latent roots A 1 ,...,A W and B has non-zero latent roots /i 1 ,...,/x n not necessarily BA. Prove that there is a matrix T for which distinct; and AB
.
diag(A 1 ,...,A n ),
a
*!
= 6=
i, c
Work
out
4.
6.
=--
-J;
~iV;
cf.
3.2.
1,2,3.
7.
(i)
=
2,
1,
1),
(0,0,1).
9.
4; use
12.
x = TX
formation gives J x 2 ^ 2 X 2 therefore T'T 13. A = 0, i^2; unit latent vectors are
;
gives
trans-
=
f
/.
W2
JL <A
""V2'
/'
-1\ V2'2'V2/'
\2'2'
V2/'
Use Theorem
t
62.
is
pains.
LATENT VECTORS
16.
217
t\s as
Let
t lv ..,t n
As
in
4.1,
T~
AT =
columns.
T and so
TAT =
A
diagonal matrix
is its
diag(X lt ...,X n ).
own
transpose and so
17.
(i)
(ii)
is.
A = A'. and T'AT = (T'AT)' = T'A'T A = -1,3,4; vectors (3, -1,0), (1,1, -4), (2,1,0). A - 1,2,3; vectors (2, -I,- 1), (1, 1, - 1), (1. 0, - 1). B = n i 01. A = n -i 0],
-j
LO
2 f
3J
-1J
LO
has n linearly independent latent vectors (Theorem 63), which 19. are also latent vectors of (Theorem 65). Finish as in proof of Theorem
64.
INDEX

Numbers refer to pages.

Array 50.
Binary cubic 188; binary form 176, 190; binary quartic 190.
Canonical form 146, 151, 188.
Characteristic equation: of matrix 110; of quadratic form 144; of transformation 131.
Circulant 23.
Co-factor 28.
Cogredient 129.
Collineation 209.
Definitions: determinant 8; matrix 69; Pfaffian 61; quadratic form 120; rank of matrix 94; reciprocal matrix 85; unit matrix 77; weight 182.
Determinant: adjugate 55; as matrix product 122; co-factors 28; definition 7, 8, 26; minors (q.v.) 28.
Elliott, E. B. 198.
Equations: homogeneous 12, 36; fundamental solutions 56.
Grace and Young 198.
Hermitian form 129.
Kronecker delta 84.
Latent roots 131; latent vectors 204.
Linear: substitution 67; transformation 122.
Matrix: addition 69; adjoint, adjugate 85; characteristic equation 110; commutative 211; definition 69; rank 93; reciprocal 85.
Minors 28.
Permutations 24.
Pfaffian 61.
Quadratic form: as matrix product 122; canonical form 146, 151, 206; definition 120; rank 148; signature 154.
Rank: of matrix 93; of quadratic form 148.
Reciprocal matrix 85.
Reversal law: for reciprocal 88; for transpose 83.
Rings, number 1.
Summation convention 41.
Transformation: characteristic equation 131; elementary 147; latent roots 131; modulus 122; non-singular 122; of quadratic form 124, 206; orthogonal 153, 160, 206; pole of 131; product 123.
Weight 182, 184.