
PERIYAR INSTITUTE OF DISTANCE EDUCATION

(PRIDE)

PERIYAR UNIVERSITY
SALEM - 636 011.

M.Sc. MATHEMATICS
FIRST YEAR
PAPER - I : ALGEBRA

Prepared by :
M. Rajasekaran, M.Sc., M.Phil.,
Govt Arts College (Autonomous)
Salem – 636 007.

Unit - I
GROUP THEORY
Definition (Group)
A nonempty set of elements G is said to form a group if in G there is
defined a binary operation, called the product and denoted by '.', such that
1) a, b ∈ G implies that a.b ∈ G (closure law)
2) a, b, c ∈ G implies that a.(b.c) = (a.b).c (associative law)
3) there exists an element e ∈ G such that a.e = e.a = a for all a ∈ G
(the existence of an identity element)
4) for every a ∈ G there exists an element a⁻¹ ∈ G such that a.a⁻¹ = a⁻¹.a = e
(a⁻¹ is called the inverse of a)
The group is denoted by (G, .). It is denoted by G itself whenever there is
no confusion about the product. Moreover, the product a.b of two elements a, b
of G is usually written as ab, whenever there is no confusion of notation.
Definition:
A group G is said to be abelian (or commutative) if for every
a, b ∈ G, ab = ba (commutative law).
When a group G is finite, that is, the number of elements of G is finite,
we say that G is a finite group. The number of elements of G is denoted by
O(G), called the order of the group G.
Now, we shall see a few examples of a group.
Example:
Let G = S₃ be the set of all one-to-one mappings of the set {x₁, x₂, x₃}
onto itself. Under the product of composition of mappings, G is a group of
order 6. This group is also known as the "symmetric group of degree 3".
Now we shall see some elementary properties of groups whose proofs are
already familiar to you.
1. The identity element of a group is unique.
2. Each element in a group has a unique inverse.
3. If a is an element in a group, then (a⁻¹)⁻¹ = a.
4. (ab)⁻¹ = b⁻¹a⁻¹ for all a, b in the group G.
5. Cancellation laws are valid in a group, i.e., ab = ac ⟹ b = c and
ba = ca ⟹ b = c.
6. In a group G the equations ax = b and ya = b have unique solutions
(a, b ∈ G).
The following theorem gives another, equivalent definition of a group G,
when G is finite.
Theorem 1:
Suppose a finite set G is closed under an associative product and that
both cancellation laws are valid in G. Then G is a group.

Proof:
Let G = {g₁, g₂, ......, gₙ} be a finite set. Let gᵢ ∈ G. Consider the set
G' = {gᵢg₁, gᵢg₂, ........, gᵢgₙ}.
The elements of G' are distinct, because if gᵢg_r = gᵢg_s
then g_r = g_s by the left cancellation law.
Also gᵢg₁, gᵢg₂, ....... are all in G. But the number of elements in G is
n, and the elements of G' are distinct and belong to G. Therefore, G = G'.
Since g₂ ∈ G = G', g₂ = gᵢg_r for some r, 1 ≤ r ≤ n.
Therefore, x = g_r is a solution of the equation gᵢ.x = g₂.
We consider G'' = {g₁.gᵢ, g₂.gᵢ, ......., gₙ.gᵢ}.
As before, G'' = G. Therefore y.gᵢ = g₂ has a solution in G.
Thus for any a, b ∈ G the equations a.x = b and y.a = b are solvable in G,
and this forces the existence of an identity element and of inverses.
Therefore, G is a group.
Definition (Subgroup)
A nonempty subset H of a group G is said to be a subgroup of G if,
under the operation in G, H itself forms a group.
If H is a subgroup of G and K is a subgroup of H, then K is a subgroup
of G. The following theorem characterizes the subgroups of a group.
Theorem 2:
A nonempty subset H of a group G is a subgroup of G if and only if,
1) a, b ∈ H implies that ab ∈ H
2) a ∈ H implies that a⁻¹ ∈ H.
Proof:
If H is a subgroup of G, then it is obvious that (1) and (2) must hold (by
definition H itself is a group under the same operation of G).
Converse:
Since the associative law holds for G, it holds for its subset H also. So
the proof will be complete if we prove that the identity e of G belongs to H. By
(2), for any a ∈ H, a⁻¹ ∈ H. Therefore by (1), a⁻¹a ∈ H. That is, e ∈ H. Hence the
theorem.
Note:
The conditions (1) and (2) mentioned in the above theorem can be
combined into a single condition and reworded as follows: "A
nonempty subset H of a group G is a subgroup of G if and only if
a, b ∈ H ⟹ ab⁻¹ ∈ H." This condition is very helpful in deciding
whether a nonempty subset of a group is a subgroup or not.
In the case of finite subgroups, the sufficient condition is weaker, and it is
mentioned below.
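The combined condition of the note can be turned into a direct computational test. Below is a small sketch (the helper name is ours, not from the text) that checks ab⁻¹ ∈ H for every pair, here in the additive group Z₁₂, where ab⁻¹ reads a − b (mod 12):

```python
def is_subgroup(H, n):
    """The note's criterion in (Z_n, +): H is a nonempty subset and,
    for all a, b in H, a * b^(-1) -- here a - b (mod n) -- lies in H."""
    return bool(H) and all((a - b) % n in H for a in H for b in H)

H = {0, 3, 6, 9}                        # the multiples of 3 in Z_12
assert is_subgroup(H, 12)
assert not is_subgroup({0, 3, 5}, 12)   # 3 - 5 = 10 (mod 12) is missing
```

The single condition replaces the separate closure and inverse checks, which is exactly why it is convenient in practice.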
Theorem 3:
If H is a nonempty finite subset of a group G and H is closed under
multiplication, then H is a subgroup of G.
Proof: By Theorem 2, it is enough if we prove that whenever a ∈ H, a⁻¹ ∈ H.
Suppose that a ∈ H.
Then by the closure law, a² = a.a ∈ H,
a³ = a².a ∈ H.
Thus all positive integral powers of a are in H,
i.e., a^m ∈ H (m a positive integer).
Thus the infinite collection of elements a, a², a³, ......., a^m, ..... must all fit
into H, which is a finite subset of G. Therefore, there must be some repetitions
among these elements. That is, for some r, s with r > s > 0, a^r = a^s.
By the cancellation law, a^(r−s) = e.
Hence e ∈ H. Also, since r − s ≥ 1, r − s − 1 ≥ 0, so a^(r−s−1) ∈ H, and
a.a^(r−s−1) = a^(r−s) = e.
Therefore, a⁻¹ = a^(r−s−1) ∈ H.
Hence the theorem.
Definition (CENTRALIZER OF AN ELEMENT)
If G is a group, N(a) = {x ∈ G : xa = ax}, the set of all those elements
of G which commute with the particular element a of G, is called the normalizer
or centralizer of a in G.
Theorem 4:
N(a) is a subgroup of G.
Proof: Let g₁, g₂ be any two elements of N(a). Then g₁.a = a.g₁ and
g₂.a = a.g₂. From
g₂.a = a.g₂, g₂⁻¹.(g₂.a).g₂⁻¹ = g₂⁻¹.(a.g₂).g₂⁻¹
i.e., (g₂⁻¹.g₂).(a.g₂⁻¹) = (g₂⁻¹.a).(g₂.g₂⁻¹)
i.e., e.(a.g₂⁻¹) = (g₂⁻¹.a).e
i.e., a.g₂⁻¹ = g₂⁻¹.a.
Therefore, g₂⁻¹ ∈ N(a).
Now, (g₁.g₂⁻¹).a = g₁.(g₂⁻¹.a)
= g₁.(a.g₂⁻¹)
= (g₁.a).g₂⁻¹
= (a.g₁).g₂⁻¹
= a.(g₁.g₂⁻¹)
Therefore g₁.g₂⁻¹ ∈ N(a).
Therefore, N(a) is a subgroup.

Definition (CENTRALIZER OF A SUBGROUP)
If H is a subgroup of G, then by the centralizer C(H) of H we mean the
set {x ∈ G : xh = hx for all h ∈ H}.
Theorem 5: C(H) is a subgroup of G.
Proof: Let a, b ∈ C(H). Then ah = ha and bh = hb for all h ∈ H.
For all h ∈ H,
(ab⁻¹)h = a(b⁻¹h)
= a((h⁻¹b)⁻¹)
= a((bh⁻¹)⁻¹), since h⁻¹ ∈ H
= a(hb⁻¹)
= (ah)b⁻¹
= (ha)b⁻¹
= h(ab⁻¹)
Thus a, b ∈ C(H) ⟹ ab⁻¹ ∈ C(H).
Therefore, C(H) is a subgroup of G.
Definition (CENTRE OF A GROUP) The centre Z of a group G is defined
by Z = {z ∈ G : zx = xz for all x ∈ G}, the set of all those elements of G
which commute with each and every element of G.
Theorem 6:
Z is a subgroup of G.
Proof:
Clearly e ∈ Z. Therefore Z is nonempty.
Let a, b ∈ Z. Then a.x = x.a and b.x = x.b for all x ∈ G.
Also (ab⁻¹)x = a(b⁻¹x)
= a(xb⁻¹), since b ∈ Z
= (ax)b⁻¹
= (xa)b⁻¹, (a ∈ Z)
= x(ab⁻¹)
Therefore ab⁻¹ ∈ Z, and Z is a subgroup.
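In a small group, all three of these subgroups can be found by brute force. The sketch below (our own helpers, using the same tuple encoding of S₃ as before) computes the normalizer N(a) of an element, the centralizer C(H) of a subgroup, and the centre Z of S₃:

```python
from itertools import permutations

S3 = list(permutations(range(3)))

def mul(p, q):
    """Composition of mappings: apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))

def N(a):
    """Normalizer (centralizer) of the element a: all x with xa = ax."""
    return {x for x in S3 if mul(x, a) == mul(a, x)}

def C(H):
    """Centralizer of a subgroup H: all x commuting with every h in H."""
    return {x for x in S3 if all(mul(x, h) == mul(h, x) for h in H)}

# Centre: elements commuting with everything in the group
Z = {x for x in S3 if all(mul(x, g) == mul(g, x) for g in S3)}

e = (0, 1, 2)
r = (1, 2, 0)                        # a 3-cycle
assert N(r) == {e, r, mul(r, r)}     # N(r) is exactly the powers of r
assert Z == {e}                      # the centre of S3 is trivial
assert C({e}) == set(S3)             # every element commutes with e
```

That Z = {e} for S₃ illustrates that the centre can be as small as the improper subgroup {e} in a non-abelian group.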
In group theory there are certain groups which can be generated by a
single element; such groups are called 'cyclic groups'.

Definition: (CYCLIC GROUP)
For some choice a ∈ G, if G = (a) = {a^n : n = 0, ±1, ±2, .....}, then G is called
a cyclic group generated by a. The element a is called a generator of G.
Theorem 7: Every cyclic group is abelian, but not conversely.
Proof: Let G = (g) be a cyclic group. Let x, y ∈ G. Then for some integers m
and n we can write x = g^m and y = g^n.
Therefore
x.y = g^m.g^n = g^(m+n) = g^(n+m)
= g^n.g^m = y.x.
Therefore, G is abelian.
To prove that the converse is not true, namely, that an abelian group is
not necessarily a cyclic group, we shall give a counterexample.
KLEIN 4-GROUP:
The Klein 4-group V = {e, a, b, c}, whose multiplication table is given
below, is abelian but not cyclic:

.  | e  a  b  c
---+-----------
e  | e  a  b  c
a  | a  e  c  b
b  | b  c  e  a
c  | c  b  a  e

e = e^2 = e^3 = e^4; a^2 = e, so a^3 = a, a^4 = e;
b^2 = e, b^3 = b, b^4 = e; c^2 = e, c^3 = c, c^4 = e.
From the above equations it is clear that no element of V generates all
the elements of V.
Therefore, V is not cyclic, even though it is abelian.
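The table above can be encoded directly and the two claims, abelian but not cyclic, verified mechanically (a sketch; the dictionary encoding and the helper generated are ours):

```python
# Multiplication table of the Klein 4-group V = {e, a, b, c}
table = {
    ('e', 'e'): 'e', ('e', 'a'): 'a', ('e', 'b'): 'b', ('e', 'c'): 'c',
    ('a', 'e'): 'a', ('a', 'a'): 'e', ('a', 'b'): 'c', ('a', 'c'): 'b',
    ('b', 'e'): 'b', ('b', 'a'): 'c', ('b', 'b'): 'e', ('b', 'c'): 'a',
    ('c', 'e'): 'c', ('c', 'a'): 'b', ('c', 'b'): 'a', ('c', 'c'): 'e',
}
V = ['e', 'a', 'b', 'c']
mul = lambda x, y: table[(x, y)]

# Abelian: xy = yx for all x, y
assert all(mul(x, y) == mul(y, x) for x in V for y in V)

def generated(x):
    """All powers of x, i.e., the cyclic subgroup (x)."""
    powers, p = {'e'}, x
    while p not in powers:
        powers.add(p)
        p = mul(p, x)
    return powers

# Not cyclic: no single element generates all four elements
assert all(generated(x) != set(V) for x in V)
```

Each non-identity element generates only the two-element subgroup {e, x}, which is why no generator of all of V exists.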
Example:
G = {1, −1, i, −i} is a cyclic group generated by i. Note that −i is also a
generator of G.
This example clearly shows that a cyclic group may have more than one
generator.
The following theorem gives a condition for the existence of non-trivial
subgroups in a group.
Theorem 8:
Every finite group of composite order possesses a proper subgroup.
Proof:
Let G be a finite group of composite order.
Let O(G) = mn, where m and n are positive integers such that m > 1, n > 1.
Case (1): when G is cyclic.
Let a be a generator of G. Then O(a) = O(G) = mn. Therefore, O(a^m) = n.
Let H = {a^m, a^(2m), ..., a^(nm) = e}.
H is clearly a nonempty subset of G.
Also, for 1 ≤ r ≤ n, 1 ≤ s ≤ n, let a^(rm) and a^(sm) be two elements of H.
Then
a^(rm).a^(sm) = a^((r+s)m).
By the division algorithm, r + s = np + q, where 0 ≤ q < n. Therefore
a^(rm).a^(sm) = a^((np+q)m) = (a^(nm))^p.a^(qm) = e^p.a^(qm) = a^(qm) ∈ H
(for q = 0 this is e = a^(nm) ∈ H).
That is, H is closed under the composition.
Therefore, H is a subgroup of G.
Now, the elements of H are distinct, for if r and s are two integers such
that 1 ≤ s < r ≤ n, then
a^(rm) = a^(sm) ⟹ a^(rm).a^(−sm) = e
⟹ (a^m)^(r−s) = e, with 0 < r − s < n = O(a^m), a contradiction.
Therefore O(H) = n. Since 1 < n < mn, H is a proper subgroup of G.
Case (2): when G is not cyclic.
In this case the order of every element of G is less than mn. So there
exists an element, say b ≠ e, of G whose order is k, with 2 ≤ k < mn. Hence H = (b)
is a cyclic subgroup of order k < mn. Therefore H is a proper subgroup of G.
Theorem 9:
Every subgroup of a cyclic group is itself a cyclic group.

Proof:
Let G = (a) be a cyclic group generated by an element a and let H be any
subgroup of G.
If H = G or H = {e}, then obviously H is cyclic. So consider the case
when H is a nontrivial subgroup. Since H is a subgroup of G, every element of
H is an integral power of a.
Let m be the least positive integer such that a^m ∈ H.
Now, let a^s be an arbitrary element of H. Then, by the division algorithm,
s = mq + r, where 0 ≤ r < m,
or r = s − mq.
Therefore a^r = a^(s−mq).
Now, H being a subgroup,
a^s ∈ H, a^m ∈ H ⟹ a^s ∈ H, a^(−m) ∈ H
⟹ a^s ∈ H, (a^m)^(−q) ∈ H
⟹ a^s.a^(−mq) ∈ H
⟹ a^(s−mq) ∈ H
⟹ a^r ∈ H.
Since m is the least positive integer such that a^m ∈ H, and since
0 ≤ r < m, r = 0. Therefore s = mq.
Therefore an arbitrary element a^s ∈ H can be expressed as
a^s = a^(mq) = (a^m)^q.
That is, H is generated by a^m.
That is, H = (a^m).
That is, H is cyclic.
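The proof's recipe, take the least positive exponent m occurring in H, then H = (a^m), can be watched in action in the additive cyclic group Z₁₂ = (1), where "powers of a" become multiples (a sketch with our own variable names):

```python
n = 12                      # G = Z_12, cyclic under addition mod 12
H = {0, 4, 8}               # a subgroup of G

# Least positive "exponent" appearing in H
m = min(x for x in H if x > 0)

# The cyclic subgroup (a^m): all multiples of m mod n
generated = {(m * k) % n for k in range(n)}

assert m == 4
assert generated == H       # H is cyclic, generated by a^m
```

Running the same two lines on any subgroup of Z₁₂ (e.g., {0, 6} or {0, 2, 4, 6, 8, 10}) reproduces the subgroup exactly, as the theorem predicts.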
Definition :(CONGRUENCE RELATION)
Let G be a group and H a subgroup of G. For a, b ∈ G we say "a is
congruent to b mod H", written as a ≡ b (mod H), if ab⁻¹ ∈ H.
Theorem 10: The relation a ≡ b (mod H) is an equivalence relation.
Proof: We have to prove that the congruence relation is
(1) reflexive
(2) symmetric
(3) transitive
i.e., we have to show that
I. a ≡ a (mod H)
II. a ≡ b (mod H) ⟹ b ≡ a (mod H)
III. a ≡ b (mod H) and b ≡ c (mod H)
⟹ a ≡ c (mod H)
(1) Since H is a subgroup, e ∈ H; that is, a.a⁻¹ = e ∈ H. Therefore, by
definition, a ≡ a (mod H).
(2) a ≡ b (mod H) ⟹ a.b⁻¹ ∈ H.
Since H is a subgroup,
⟹ (a.b⁻¹)⁻¹ ∈ H
⟹ (b⁻¹)⁻¹.a⁻¹ ∈ H
⟹ b.a⁻¹ ∈ H
⟹ b ≡ a (mod H)
(3) a ≡ b (mod H) and b ≡ c (mod H)
⟹ ab⁻¹ ∈ H and bc⁻¹ ∈ H.
Since H is a subgroup,
⟹ (ab⁻¹).(bc⁻¹) ∈ H
⟹ a(b⁻¹b)c⁻¹ ∈ H
⟹ aec⁻¹ ∈ H
⟹ ac⁻¹ ∈ H
⟹ a ≡ c (mod H)
Hence the theorem.
Definition (COSETS)
If H is a subgroup of G and a ∈ G, then Ha = {ha : h ∈ H} is called a right
coset of H in G.
Theorem 11:
For all a ∈ G, Ha = {x ∈ G : a ≡ x (mod H)}.
Proof:
Suppose we denote by [a] the set {x ∈ G : a ≡ x (mod H)}; then we shall
prove that Ha = [a].
First we shall prove Ha ⊆ [a].
If h ∈ H, then a.(ha)⁻¹ = a.(a⁻¹h⁻¹) = (aa⁻¹)h⁻¹ = eh⁻¹ = h⁻¹ ∈ H.
Therefore, from the definition of congruence mod H,
a(ha)⁻¹ ∈ H ⟹ a ≡ ha (mod H).
That is, ha ∈ [a] for every h ∈ H.
Therefore Ha ⊆ [a]. ...(1)
Suppose now that x ∈ [a].
Therefore ax⁻¹ ∈ H, from the definition of congruence mod H.
Therefore (ax⁻¹)⁻¹ = xa⁻¹ ∈ H.
That is, xa⁻¹ = h for some h ∈ H.
That is, x = ha, and so x ∈ Ha.
That is, [a] ⊆ Ha. ...(2)
From (1) and (2), Ha = [a].
Hence the theorem.
Theorem 12:
Any two right cosets of H in G are either identical or have no element in
common (i.e., they are disjoint).
Proof:
Congruence mod H is an equivalence relation. Hence, by Theorem 11,
Ha, a right coset, is an equivalence class. Equivalence classes form a partition
of G, so that any two right cosets are either identical or have no element in
common.
Theorem 13:
There is a one-to-one correspondence between any two right cosets of H in G.
Proof:
Define f : Ha → Hb such that (ha)f = hb, where h ∈ H.
f is onto, because for any hb ∈ Hb there exists ha ∈ Ha such that
(ha)f = hb.
Also, if h₁b = h₂b, then by the cancellation law h₁ = h₂, so that h₁a = h₂a,
whence f is one-to-one.
Therefore, Ha and Hb are in one-to-one correspondence with each other.
Hence the theorem.
We know that the order of a group G denoted by O(G) is the number of
elements in G. Applying the above theorems we get the following interesting
and important result concerning the order of a group and its subgroup.
Theorem 14 (Lagrange's Theorem):
If G is a finite group and H is a subgroup of G, then O(H) is a divisor of O(G).
Proof:
Let G be a finite group and H be a subgroup of G.
Let O(G) = n and O(H) = m.
We know that G is partitioned into right cosets of H.
Since G is finite, let k be the number of distinct right cosets of H in G.
Since He = H is a right coset, the number of elements in each coset is O(H) = m.
Therefore the total number of elements in the k right cosets is km. Since any
a ∈ G is in the unique right coset Ha, the right cosets fill out G. Therefore,
G = H ∪ Ha₁ ∪ ........... ∪ Ha_(k−1)
⟹ O(G) = O(H) + O(Ha₁) + ................. + O(Ha_(k−1))
⟹ O(G) = O(H) + O(H) + ..................... + O(H) (k terms)
⟹ n = km
⟹ O(H) divides O(G).
Hence the theorem.
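Lagrange's counting argument, G splits into k disjoint right cosets each of size O(H), can be replayed concretely (a sketch, names ours) in the additive group Z₁₂ with H = {0, 6}:

```python
n = 12
G = set(range(n))
H = {0, 6}                       # a subgroup of Z_12, O(H) = 2

# Right cosets Ha = {h + a : h in H}, written additively
cosets = {frozenset((h + a) % n for h in H) for a in G}

assert all(len(c) == len(H) for c in cosets)    # each coset has O(H) elements
assert set().union(*cosets) == G                # the cosets fill out G
assert len(cosets) * len(H) == len(G)           # n = k.m, so O(H) | O(G)
```

Here k = 6 cosets of size m = 2 tile the n = 12 elements, which is exactly the equation n = km from the proof.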
Definition:
If H is a subgroup of G, the index of H in G is the number of distinct right
cosets of H in G. We shall denote it by i_G(H) = O(G)/O(H).
Note that the converse of Lagrange's theorem need not be true.
Definition:
If G is a group and a ∈ G, the order (or period) of a is the least positive
integer m such that a^m = e. If no such integer exists, we say that a is of infinite
order. We use the notation O(a) for the order of a.
Corollary 1:
If G is a finite group and a ∈ G, then O(a) divides O(G).
Proof:
Let O(G) = n. Since G is finite, O(a) is finite. Let O(a) = m. Then m is the
least positive integer such that a^m = e. Consider the set
H = {a, a², ....................., a^m = e}.
H is clearly a nonempty subset of G.
For any two elements a^i, a^j of H, a^i.a^j = a^(i+j) = a^(mq+r), where by the
division algorithm i + j = mq + r, 0 ≤ r < m.
Therefore a^i.a^j = a^(mq).a^r = (a^m)^q.a^r = e^q.a^r = a^r ∈ H
(for r = 0 this is e = a^m ∈ H).
That is, H is a nonempty subset of the finite group G such that H is closed
under the composition in G. Consequently, by Theorem 3 of this lesson, H is a
subgroup of G.
Now, all the elements of H are distinct, because if i and j are any two
distinct integers such that 1 ≤ j < i ≤ m, then
a^i = a^j ⟹ a^i.a^(−j) = a^j.a^(−j)
⟹ a^(i−j) = a^0 = e
⟹ i − j is a positive integer less than m such that a^(i−j) = e, which is a
contradiction.
Therefore a^i ≠ a^j. That is, all elements of H are distinct, and so O(H) = m.
Therefore, by Lagrange's theorem, O(H) must be a divisor of O(G);
that is, O(G)/O(H) = k, i.e., n/m = k.
That is, O(G)/O(a) = k, which shows that the order of a is a divisor of O(G).
Corollary 2:
If G is a finite group and a ∈ G, then a^O(G) = e.
Proof:
By the previous corollary, O(a) divides O(G), so O(G) = m.O(a) for some
positive integer m.
Therefore a^O(G) = a^(O(a).m) = (a^O(a))^m = e^m = e.
Definition: Euler function φ
The Euler function φ(n) is defined for all positive integers n as follows:
φ(1) = 1; for n > 1, φ(n) = the number of positive integers less than n and
relatively prime to n.
For example, φ(8) = 4, since 1, 3, 5, 7 are the only numbers less than 8 and
relatively prime to 8. Note that for a prime number p, φ(p) = p − 1.
Corollary 3: Euler’s Theorem:
If n is a positive integer and a is relatively prime to n, then
a^φ(n) ≡ 1 (mod n).
Proof:
The set G = {a₁, a₂, ..................., a_φ(n)} consisting of the positive
integers less than n and relatively prime to n forms a group under multiplication
mod n. This group has order φ(n). Applying Corollary 2 to this group,
we have a^φ(n) ≡ 1 (mod n).
Corollary 4: Fermat's Theorem
If p is a prime number and a is any integer, then a^p ≡ a (mod p).
Proof:
If p is a prime number, then φ(p) = p − 1.
Case (1)
When (a, p) = 1, by Euler's theorem, a^φ(p) ≡ 1 (mod p).
That is, a^(p−1) ≡ 1 (mod p).
That is, a^p ≡ a (mod p).
Case (2)
When (a, p) ≠ 1,
then p | a, so that a ≡ 0 (mod p),
whence a^p ≡ 0 ≡ a (mod p), which is the required result.
Hence the theorem.
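Both corollaries are easy to spot-check numerically. The sketch below recomputes φ(n) straight from its definition (the function name phi is ours) and then tests Euler's and Fermat's congruences over a range of values:

```python
from math import gcd

def phi(n):
    """Euler function: phi(1) = 1; otherwise count 1 <= k < n, gcd(k, n) = 1."""
    return 1 if n == 1 else sum(1 for k in range(1, n) if gcd(k, n) == 1)

assert phi(8) == 4                       # 1, 3, 5, 7

# Euler: a^phi(n) = 1 (mod n) whenever (a, n) = 1
for n in range(2, 30):
    for a in range(1, n):
        if gcd(a, n) == 1:
            assert pow(a, phi(n), n) == 1

# Fermat: a^p = a (mod p) for every integer a and prime p
for p in (2, 3, 5, 7, 11):
    assert all(pow(a, p, p) == a % p for a in range(-10, 20))
```

The three-argument pow performs modular exponentiation efficiently, so the check remains fast even for much larger n.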
Corollary 5:
If G is a finite group whose order is a prime number p, then G is a
cyclic group.
Proof:
First we shall claim that G has no nontrivial subgroups.
On the contrary, if H is a nontrivial subgroup of G, then O(H) must
divide O(G) = p.
That is, O(H) = 1 or O(H) = p (since the only divisors of a prime number
are 1 and itself).
That is, H = {e} or H = G.
Now let a ≠ e, a ∈ G, and let H = (a). Clearly H is a subgroup of G, and
H ≠ {e} since a ≠ e.
Therefore H = (a) = G.
That is, G is a cyclic group.
Hence the theorem.
Product of two subgroups.
Let G be a group. If H, K are two subgroups of G, let us define
HK = {x ∈ G : x = hk, h ∈ H, k ∈ K}.
Note that, in general, HK need not be equal to KH. Also, in general,
HK need not be a subgroup of G. The following theorem gives a necessary and
sufficient condition for HK to be a subgroup of G.
Theorem 15:
Let H and K be subgroups of a group G. Then HK is a subgroup of G if
and only if HK = KH.
Proof:
First we suppose that HK = KH.
That is, if h ∈ H and k ∈ K, then hk = k₁h₁ for some k₁ ∈ K, h₁ ∈ H. (Note
that it need not be that k₁ = k and h₁ = h.) To prove that HK is a subgroup we
must show that
1. it is closed, and
2. every element in HK has its inverse in HK. We shall prove these one by one.
(1) Suppose
x = hk ∈ HK and
y = h'k' ∈ HK.
Then xy = hkh'k' = h(kh')k'. But kh' ∈ KH = HK, say kh' = h₂k₂, where
h₂ ∈ H and k₂ ∈ K. Therefore xy = h(h₂k₂)k' = (hh₂)(k₂k') ∈ HK, since
hh₂ ∈ H and k₂k' ∈ K. Therefore HK is closed.
(2) Also, x⁻¹ = (hk)⁻¹ = k⁻¹h⁻¹ ∈ KH = HK.
Therefore x⁻¹ ∈ HK.
Therefore HK is a subgroup.
Conversely, let HK be a subgroup of G.
Then for any h ∈ H, k ∈ K, h⁻¹k⁻¹ ∈ HK,
and so kh = (h⁻¹k⁻¹)⁻¹ ∈ HK.
That is, KH ⊆ HK. Now, if x is any element of HK, then x⁻¹ ∈ HK, say
x⁻¹ = hk, and so x = (x⁻¹)⁻¹ = (hk)⁻¹ = k⁻¹h⁻¹ ∈ KH.
So HK ⊆ KH; thus HK = KH.
Hence the theorem.
In the case of an abelian group G, trivially HK = KH, and hence we have the
following result.
Corollary 6:
If H, K are subgroups of the abelian group G, then HK is a subgroup of G.
The following theorem gives a formula to find the number of elements
of HK where H and K are finite subgroups of G.
Theorem 16:
If H and K are finite subgroups of G of orders O(H) and O(K)
respectively, then O(HK) = O(H).O(K) / O(H ∩ K).
Proof:
Case (1): When H ∩ K = {e}.
Since H ∩ K = {e}, O(H ∩ K) = 1, so that
we need only prove O(HK) = O(H).O(K).
Now, O(HK) will fail to equal O(H).O(K) only if, when all the
elements hk, h ∈ H, k ∈ K, are listed, there is some collapsing; that is, some
element in the list appears at least twice. Equivalently, for some
h ≠ h₁ in H and some k, k₁ in K, hk = h₁k₁.
Then h₁⁻¹h = k₁k⁻¹.
Now, since h ∈ H and h₁⁻¹ ∈ H, h₁⁻¹h ∈ H; since k ∈ K, k⁻¹ ∈ K, so
k₁k⁻¹ ∈ K. Therefore h₁⁻¹h = k₁k⁻¹ implies h₁⁻¹h ∈ H ∩ K = {e},
therefore h = h₁,
a contradiction.
Therefore there is no collapsing, and hence O(HK) = O(H).O(K),
and the theorem is proved in this case.
Case (2): When {e} ≠ H ∩ K.
First we shall see how often a given element hk appears as a product in
the list; we claim it appears O(H ∩ K) times. To see this, first note that if
u ∈ H ∩ K, then hk = (hu)(u⁻¹k), where hu ∈ H (since h ∈ H, u ∈ H ∩ K ⊆ H)
and u⁻¹k ∈ K (since u⁻¹ ∈ H ∩ K ⊆ K, k ∈ K). Thus hk is duplicated in the
product at least O(H ∩ K) times. However, if
hk = h'k', then h⁻¹h' = k(k')⁻¹ = u, say, and u ∈ H ∩ K; so h' = hu from
h⁻¹h' = u, and k' = u⁻¹k from k(k')⁻¹ = u. Therefore hk appears in the list of HK
exactly O(H ∩ K) times. Thus the number of distinct elements in HK is the
total number in the listing of HK, i.e., O(H).O(K), divided by the number of
times each given element appears, namely O(H ∩ K). That is,
O(HK) = O(H).O(K) / O(H ∩ K).
Hence the theorem.
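The counting formula can be confirmed directly. In the additive group Z₁₂ take H = the multiples of 2 and K = the multiples of 3 (a sketch; the example and names are ours, with "products" hk written additively):

```python
n = 12
H = {x for x in range(n) if x % 2 == 0}   # O(H) = 6
K = {x for x in range(n) if x % 3 == 0}   # O(K) = 4

HK = {(h + k) % n for h in H for k in K}  # the set of "products" hk
meet = H & K                              # H intersect K = {0, 6}

# O(HK) = O(H).O(K) / O(H ∩ K): 12 == 6 * 4 / 2
assert len(HK) == len(H) * len(K) // len(meet)
```

The raw listing has 6 × 4 = 24 entries, but each element appears exactly O(H ∩ K) = 2 times, leaving the 12 distinct elements the formula predicts.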
Normal Subgroups
In general, for a subgroup, a left coset need not be equal to the corresponding
right coset. There is a special type of subgroup for which left and right cosets
coincide. Such subgroups are called "normal subgroups".
Definition (Normal subgroups)
Let N be a subgroup of G. Then N is said to be a "normal subgroup" of
G if for every g ∈ G and n ∈ N,
gng⁻¹ ∈ N;
that is, gNg⁻¹ ⊆ N for every g ∈ G. Clearly the whole group G and the
subgroup consisting of the identity element e alone are improper normal
subgroups of G.
It is easy to see that every subgroup of an abelian group is normal. To
see this, let G be abelian and N a subgroup of G. Let g be any element of G and
n any element of N.
Consider
gng⁻¹ = g(ng⁻¹)
= g(g⁻¹n) (since G is abelian)
= (gg⁻¹)n = en = n ∈ N,
so that gng⁻¹ ∈ N for every g ∈ G and n ∈ N.
Lemma: A subgroup N of G is normal in G if and only if gNg⁻¹ = N for every
g ∈ G.
Proof:
Suppose that gNg⁻¹ = N for every g ∈ G. Then clearly gNg⁻¹ ⊆ N, so
that N is normal in G (by definition).
Conversely, suppose that N is normal in G. Then, by definition, for
g ∈ G,
gNg⁻¹ ⊆ N. ...(A)
Also g ∈ G ⟹ g⁻¹ ∈ G.
Therefore, again by the definition of a normal
subgroup, g⁻¹N(g⁻¹)⁻¹ ⊆ N,
i.e., g⁻¹Ng ⊆ N. This implies that g(g⁻¹Ng)g⁻¹ ⊆ gNg⁻¹, so that
N ⊆ gNg⁻¹ for every g ∈ G. ...(B)
The relations (A) and (B) imply that for g ∈ G, N = gNg⁻¹, which is the
required condition.
Theorem 17:
A subgroup N of G is normal in G if and only if every left coset of N in G
is a right coset of N in G.
Proof:
Let N be a normal subgroup of G. Then, by the previous lemma, for
every g ∈ G, gNg⁻¹ = N, so that (gNg⁻¹)g = Ng for all g ∈ G. This implies
that gN = Ng for all g ∈ G, so that each left coset gN is the right coset Ng.
Conversely, suppose that each left coset of N in G is a right coset of N in
G. Let g be any element of G. Then gN = Ng₁ for some g₁ ∈ G. Since e ∈ N,
g = ge ∈ gN
⟹ g ∈ Ng₁ (since gN = Ng₁).
But g ∈ Ng also, so Ng and Ng₁ have an element in common, whence
Ng = Ng₁ = gN.
Thus we have, for every g ∈ G, gN = Ng,
⟹ gNg⁻¹ = N for every g ∈ G.
This completes the proof of the theorem.
Theorem 18:
A subgroup N of G is a normal subgroup of G if and only if the product
of two right cosets of N in G is again a right coset of N in G.
Proof:
Let N be a normal subgroup of G. Let g₁, g₂ be any two elements of G.
Then Ng₁ and Ng₂ are two right cosets of N in G. Now consider
(Ng₁)(Ng₂) = N(g₁N)g₂
= N(Ng₁)g₂ (since N is normal, g₁N = Ng₁)
= NN(g₁g₂)
= N(g₁g₂) (since NN = N).
Since g₁, g₂ ∈ G ⟹ g₁g₂ ∈ G, it follows that (Ng₁)(Ng₂) is also a right
coset of N in G.
Conversely, suppose that the product of two right cosets of N in G is
again a right coset of N in G. Let g be any element of G. Then g⁻¹ ∈ G, so that
Ng and Ng⁻¹ are two right cosets of N in G. By our hypothesis, Ng.Ng⁻¹ is
also a right coset of N in G; moreover e = (eg)(eg⁻¹) ∈ Ng.Ng⁻¹ and
e ∈ N = Ne. Now, using the fact that if two right cosets have one element in
common then they are identical, we see that for every g ∈ G, Ng.Ng⁻¹ = N.
This implies that n₁gn₂g⁻¹ ∈ N for every g ∈ G and n₁, n₂ ∈ N.
⟹ n₁⁻¹(n₁gn₂g⁻¹) ∈ n₁⁻¹N
⟹ gn₂g⁻¹ ∈ n₁⁻¹N = N, since n₁ ∈ N ⟹ n₁⁻¹ ∈ N
⟹ gn₂g⁻¹ ∈ N for every g ∈ G and n₂ ∈ N.
Hence N is a normal subgroup of G, completing the proof.
Problem:
Show that the intersection of two normal subgroups of G is a normal
subgroup of G.
Solution:
Let H and K be two normal subgroups of a group G. Then, since H and K
are subgroups of G, their intersection H ∩ K is also a subgroup of G. We have
only to prove that H ∩ K is normal in G. Let g be any element of G and n be
any element of H ∩ K. Now n ∈ H ∩ K ⟹ n ∈ H, n ∈ K. Since H is normal
in G, it follows that g ∈ G, n ∈ H ⟹ gng⁻¹ ∈ H. Similarly we get gng⁻¹ ∈ K,
so it follows that gng⁻¹ ∈ H ∩ K for g ∈ G and n ∈ H ∩ K. Hence H ∩ K is
a normal subgroup of G.

Problem:
If H is a subgroup of G and N is a normal subgroup of G, prove that
H ∩ N is a normal subgroup of H.
Solution:
Let x be any element of H and a be any element of H ∩ N. Then a ∈ H
and a ∈ N. Since N is normal in G,
x ∈ G and a ∈ N ⟹ xax⁻¹ ∈ N. Also H is a subgroup of G; therefore it
follows that x ∈ H, a ∈ H ⟹ xax⁻¹ ∈ H. Thus xax⁻¹ ∈ H ∩ N for x ∈ H,
a ∈ H ∩ N, so that H ∩ N is a normal subgroup of H.
Problem:
Suppose that N and M are two normal subgroups of G and that
N ∩ M = {e}. Show that for any n ∈ N, m ∈ M, nm = mn.
Solution:
Consider the element nmn⁻¹m⁻¹. Since N is a normal subgroup of G,
m ∈ G, n⁻¹ ∈ N ⟹ mn⁻¹m⁻¹ ∈ N.
Also n ∈ N, so that n(mn⁻¹m⁻¹) = nmn⁻¹m⁻¹ ∈ N (closure law in N).
Again, since M is normal in G,
n ∈ G, m ∈ M ⟹ nmn⁻¹ ∈ M.
Also m⁻¹ ∈ M, so that (nmn⁻¹)m⁻¹ = nmn⁻¹m⁻¹ ∈ M (closure law in M).
Thus nmn⁻¹m⁻¹ ∈ N and nmn⁻¹m⁻¹ ∈ M.
Therefore nmn⁻¹m⁻¹ ∈ N ∩ M. But N ∩ M = {e}.
Therefore nmn⁻¹m⁻¹ = e, i.e., (nm)(mn)⁻¹ = e,
which implies that nm = mn.


Problem:
If N is a normal subgroup of G and H is any subgroup of G, prove that
NH is a subgroup of G.
Solution:
We first prove that NH = HN. Let nh ∈ NH, where
n ∈ N, h ∈ H.
We can write nh = h(h⁻¹nh)
(since H is a subgroup of G, h ∈ H ⟹ h⁻¹ ∈ H).
But since N is normal in G, h⁻¹nh ∈ N.
Hence it follows that nh ∈ HN. Thus NH ⊆ HN. ...(A)
On the other hand, let hn ∈ HN, where h ∈ H, n ∈ N. We can write, as
before, hn = (hnh⁻¹)h.
But hnh⁻¹ ∈ N, since N is normal in G, so that hn ∈ NH. Thus
HN ⊆ NH. ...(B)
(A) and (B) ⟹ HN = NH.
Since N and H are two subgroups of G such that NH = HN, by Theorem 15
NH is also a subgroup of G.
Problem:
If N and M are normal subgroups of G, prove that NM is also a normal
subgroup of G.
Solution:
Applying the same argument as in the previous problem, we see that NM
is a subgroup of G. Now we shall show that NM is normal in G.
Let g ∈ G and nm ∈ NM, where n ∈ N, m ∈ M. Consider
g(nm)g⁻¹ = (gng⁻¹)(gmg⁻¹), since e = g⁻¹g.
Since N is normal in G, gng⁻¹ ∈ N, and
since M is normal in G, gmg⁻¹ ∈ M,
for g ∈ G. Hence it follows that g(nm)g⁻¹ ∈ NM
for g ∈ G, n ∈ N and m ∈ M, so that NM is a normal subgroup of G.
Quotient Groups:
Let N be any normal subgroup of G and let a ∈ G. Since N is normal in G,
the left coset aN is equal to the right coset Na; thus there is no distinction
between right and left cosets, and in this case we can simply say that Na is a
coset of N in G. Let G/N denote the collection of all cosets of N in G, i.e.,
G/N = {Na : a ∈ G}. We shall now prove:
Theorem 19:
If G is a group and N is a normal subgroup of G, then G/N is a group
with respect to multiplication of cosets.
Proof:
Let Na, Nb ∈ G/N, where a, b ∈ G. Consider (Na)(Nb) = N(aN)b = N(Na)b,
since N is normal, = NN(ab) = N(ab).
Since ab ∈ G, N(ab) is also a coset of N in G, so that N(ab) ∈ G/N. Thus
G/N is closed with respect to multiplication of cosets.
To prove associativity, let X = Na, Y = Nb, Z = Nc, where a, b, c ∈ G, be
any three elements of G/N. Consider
(XY)Z = (Na.Nb)Nc = (Nab)Nc = N(ab)c
= Na(bc) (since (ab)c = a(bc))
= Na.N(bc) = Na(Nb.Nc) = X(YZ),
so that the product in G/N is associative. Consider the element
N = Ne ∈ G/N. Let X = Na, a ∈ G, be any element of G/N. Then
X.N = Na.Ne = N(ae) = Na = X and N.X = Ne.Na = N(ea) = Na = X,
so that the coset N is the identity element for G/N. Finally, to prove the
existence of inverses, let X = Na ∈ G/N, where a ∈ G. Then Na⁻¹ ∈ G/N
(since a ∈ G ⟹ a⁻¹ ∈ G). Further,
Na.Na⁻¹ = N(aa⁻¹) = Ne = N,
and Na⁻¹.Na = N(a⁻¹a) = Ne = N.
The coset Na⁻¹ is the inverse of Na in G/N. Thus G/N is a group with
respect to coset multiplication. The group G/N seen above
is called the "quotient group" or "factor group" of G modulo N.
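The construction of G/N can be carried out explicitly for G = Z₁₂ and N = {0, 4, 8} (a sketch; frozensets stand for cosets, and the helper names are ours):

```python
n, N = 12, frozenset({0, 4, 8})

def coset(a):
    """The coset Na, written additively: N + a."""
    return frozenset((x + a) % n for x in N)

G_mod_N = {coset(a) for a in range(n)}

def coset_mul(X, Y):
    """(Na)(Nb) = N(a+b); any representatives work because N is normal."""
    return coset(next(iter(X)) + next(iter(Y)))

assert len(G_mod_N) == n // len(N)          # O(G/N) = O(G)/O(N) = 4
assert coset_mul(coset(1), coset(3)) == N   # N(1+3) = N4 = N, the identity
assert all(coset_mul(X, N) == X for X in G_mod_N)   # N = Ne is the identity
```

The key point mirrored in coset_mul is well-definedness: the product does not depend on which representatives of the cosets are picked, and it is normality of N that guarantees this.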
Index of a subgroup:
If G is a finite group and N is a normal subgroup of G, then
O(G/N) is called the index of N in G. It is also denoted by the symbol i_G(N).
O(G/N) = number of distinct right cosets of N in G
= (number of elements in G)/(number of elements in N) (by Lagrange's theorem)
= O(G)/O(N).
Definition:
A mapping φ from a group G into a group Ḡ is said to be a
"homomorphism" if for all a, b ∈ G,
φ(ab) = φ(a)φ(b).
Note that the product ab in the term φ(ab) is computed in G using the
product of elements of G, while in the term φ(a)φ(b) the product is that of
elements in Ḡ.
If φ is a homomorphism of G 'onto' Ḡ, then Ḡ is said to be a
homomorphic image of G.
Examples of homomorphisms:
(1) Let φ(x) = ē for all x ∈ G, where ē denotes the identity element of
the group Ḡ. Then for x, y ∈ G, φ(xy) = ē = ē.ē = φ(x)φ(y) (by the definition
of φ), so that φ is a homomorphism of G into Ḡ.
(2) Let G be a group and let, for every x ∈ G,
φ(x) = x.
If x, y ∈ G, then
φ(xy) = xy
= φ(x)φ(y) (by definition of φ),
so that φ is a homomorphism of G onto G.
(3) Let G be the group of all real numbers under addition and Ḡ be the
group of non-zero real numbers with the product being ordinary multiplication
of real numbers. Define φ : G → Ḡ
by φ(a) = 2^a for every a ∈ G.
For a, b ∈ G, consider
φ(a + b) = 2^(a+b) (by definition of φ)
= 2^a.2^b
= φ(a).φ(b) (by definition of φ),
i.e., φ(a + b) = φ(a).φ(b) for all a, b ∈ G,
so that φ is a homomorphism. Since 2^a is always positive, the image of φ is
not all of Ḡ, so that φ is a homomorphism of G into Ḡ but not onto Ḡ.
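Example (3) can be checked numerically. The sketch below samples a few real values of a and b and confirms the composition-preservation property (the sampling and the floating-point tolerance handling are ours):

```python
import math

def phi(a):
    """The homomorphism of example (3): (R, +) into (R*, x)."""
    return 2.0 ** a

samples = [-2.5, -1.0, 0.0, 0.5, 3.0]
for a in samples:
    for b in samples:
        # phi(a + b) = phi(a).phi(b), up to floating-point rounding
        assert math.isclose(phi(a + b), phi(a) * phi(b))

assert phi(0.0) == 1.0                  # identity of (R, +) maps to identity of (R*, x)
assert all(phi(a) > 0 for a in samples) # the image misses the negatives: not onto
```

The last line is the numerical face of the remark that the image of φ is only the positive reals, a proper subgroup of Ḡ.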
Theorem 20:
Let G be a group and N a normal subgroup of G. Define the mapping
φ from G to G/N by
φ(x) = Nx for all x ∈ G.
Then φ is a homomorphism of G onto G/N.
Proof:
Consider the map φ : G → G/N such that φ(x) = Nx for all x ∈ G.
To show that φ is onto, let X be any element of G/N. Then X can be
written as X = Ny for some y ∈ G, so that X = φ(y). Hence the mapping φ is
onto.
Next we prove the composition-preservation property required of a
homomorphism. Consider x, y ∈ G. We have
φ(xy) = N(xy) (by definition of φ)
= (Nx)(Ny) (by the property of cosets)
= φ(x)φ(y) (by definition of φ),
so that φ is a homomorphism of G onto G/N.
Note: From the above theorem we see that every quotient group of a group is
a homomorphic image of the group. The mapping φ : G → G/N such that
φ(x) = Nx for every x ∈ G
is called the "natural mapping" of G onto G/N.
Theorem 21:
Let φ be a homomorphism of a group G into a group Ḡ. Then
(1) φ(e) = ē, the unit element of Ḡ;
(2) φ(x⁻¹) = φ(x)⁻¹ for all x ∈ G.
Proof:
(1) Let x ∈ G; then φ(x) ∈ Ḡ.
φ(x)ē = φ(x) (since ē is the identity of Ḡ)
= φ(xe) (since e is the identity of G),
i.e., φ(x)ē = φ(x)φ(e) (since φ is a homomorphism).
Using the cancellation property in the group Ḡ, we obtain
φ(e) = ē, proving (1).
(2) Let x be any element of G. Then x⁻¹ ∈ G. Consider ē = φ(e) = φ(xx⁻¹)
= φ(x)φ(x⁻¹) (since φ is a homomorphism),
and similarly ē = φ(x⁻¹)φ(x).
Hence, by the definition of the inverse of φ(x) in Ḡ, we
have φ(x⁻¹) = φ(x)⁻¹.
Hence the theorem.


Definition [KERNEL OF A HOMOMORPHISM]
Let φ be a homomorphism of a group G into a group Ḡ. Then the set
K_φ = {x ∈ G : φ(x) = ē}, where ē is the identity element of Ḡ, is called the
kernel of φ.
Thus the kernel K_φ is the set of all those elements of G which are
mapped onto the identity ē of Ḡ. Since φ(e) = ē, it follows that e ∈ K_φ, so K_φ
is not empty.
Examples:
(a) Let G be the multiplicative group of all n × n non-singular matrices
with real entries and Ḡ the multiplicative group of all non-zero real
numbers. Define the map
φ : G → Ḡ,
φ(A) = |A| for A ∈ G.
Here |A| denotes the determinant of the matrix A.
If A, B are any two elements of G, then φ(AB) = |AB|
= |A||B| (by a well-known property of determinants)
= φ(A)φ(B) (by definition of φ),
so that φ is a homomorphism of G into Ḡ. Also, if x is any non-zero real
number, then there exists an n × n matrix in G whose determinant is equal to x;
therefore φ is onto. Since the identity element of Ḡ is
1, it follows that the kernel K_φ of φ is the subgroup of G consisting of the
matrices with determinant equal to 1.
(b) Let G be a group and N a normal subgroup of G. Let φ be the mapping
φ : G → G/N
such that φ(x) = Nx for every x ∈ G. We have already seen that φ is a
homomorphism of G onto G/N. We shall now determine the kernel of φ.
Let K_φ be the kernel of this homomorphism φ. The identity element of
the quotient group G/N is the coset N. Therefore, by definition,
K_φ = {y ∈ G : φ(y) = N}.
We shall prove that K_φ = N.
Let k ∈ K_φ. Then φ(k) = N, the unit element of G/N. But by the definition
of φ, φ(k) = Nk, so Nk = N, which gives k ∈ N. Conversely, if n ∈ N, then
φ(n) = Nn = N, so n ∈ K_φ. Hence K_φ = N.
Theorem 22:
If φ is a homomorphism of a group G into a group Ḡ with kernel K_φ, then K_φ is a normal subgroup of G.
Proof:
Let φ be a homomorphism of a group G into a group Ḡ. Let e, ē be the identity elements of G, Ḡ respectively. Then the kernel of φ is
  K_φ = { x ∈ G : φ(x) = ē }.
We shall first show that K_φ is a subgroup of G. For this we show that K_φ is closed under multiplication and that every element of K_φ has its inverse in K_φ.
Let x, y ∈ K_φ. Then φ(x) = ē and φ(y) = ē. Consider
  φ(xy) = φ(x)φ(y) = ē ē = ē,
so that xy ∈ K_φ. Thus x, y ∈ K_φ ⇒ xy ∈ K_φ.
Next let x ∈ K_φ. Then φ(x) = ē, and by Theorem 21,
  φ(x⁻¹) = φ(x)⁻¹ = ē⁻¹ = ē,
which shows that x⁻¹ ∈ K_φ. Thus K_φ is a subgroup of G.
Finally, to prove that K_φ is normal in G, let g ∈ G and k ∈ K_φ. Then φ(k) = ē. Consider
  φ(gkg⁻¹) = φ(g)φ(k)φ(g⁻¹)   (since φ is a homomorphism)
           = φ(g) ē φ(g⁻¹)
           = φ(g)φ(g)⁻¹   (by Theorem 21)
           = ē,
which implies that gkg⁻¹ ∈ K_φ for all g ∈ G and k ∈ K_φ. Therefore K_φ is normal in G.
This completes the proof.
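As a small computational check (our own example, not from the text): the sign map on S3 is a homomorphism onto {1, −1}; its kernel A3 is closed under products and inverses and absorbs conjugation, exactly as Theorem 22 asserts.

```python
from itertools import permutations

# Model S3 as tuples p where p[i] is the image of i; (p∘q)(i) = p[q[i]].
def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    # position j of the result holds the i with p[i] = j
    return tuple(sorted(range(len(p)), key=lambda i: p[i]))

def sign(p):
    """+1 for an even permutation, -1 for an odd one."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

G = list(permutations(range(3)))   # the group S3

# sign is a homomorphism of S3 onto {1, -1}
assert all(sign(compose(p, q)) == sign(p) * sign(q) for p in G for q in G)

# Its kernel is A3, the even permutations; check normality: g k g^-1 stays in K
K = [p for p in G if sign(p) == 1]
assert all(compose(g, compose(k, inverse(g))) in K for g in G for k in K)
print(len(K))  # 3
```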

Definition
A homomorphism φ from a group G into a group Ḡ which is one-to-one and onto is called an "isomorphism" of G onto Ḡ.
Definition
Two groups G, G* are said to be "isomorphic" if there is an isomorphism of G onto G*. In this case we write G ≈ G*.
Assignment
Verify the following:
1. G ≈ G
2. G ≈ G* implies G* ≈ G
3. G ≈ G*, G* ≈ G** imply G ≈ G**
Theorem 23:
A homomorphism φ of G into Ḡ with kernel K_φ is an isomorphism of G into Ḡ if and only if K_φ = (e).
Proof:
Suppose K_φ = (e). We have to prove that φ is one-to-one.
Let φ(a) = φ(b). Then
  φ(a)φ(b)⁻¹ = ē ⇒ φ(a)φ(b⁻¹) = ē
  ⇒ φ(ab⁻¹) = ē ⇒ ab⁻¹ ∈ K_φ
  ⇒ ab⁻¹ = e ⇒ a = b,
so φ is one-to-one.
Conversely, suppose φ is an isomorphism of G into Ḡ. Then φ is one-to-one. Let a ∈ K_φ. Then φ(a) = ē, the unit element of Ḡ, so φ(a) = φ(e) (since φ(e) = ē), whence a = e (since φ is one-to-one).
Thus a ∈ K_φ ⇒ a = e. In other words, e is the only element of G which belongs to K_φ. Therefore K_φ = (e).
Theorem 24:
Let φ be a homomorphism of a group G onto a group Ḡ with kernel K. Then G/K ≈ Ḡ.
Proof:
Consider the natural map η : G → G/K such that η(g) = Kg for g ∈ G. Now define the mapping ψ : G/K → Ḡ as follows: if X ∈ G/K, X = Kg, then ψ(X) = φ(g).
We first show that the mapping ψ is well defined; that is, if g, g′ ∈ G and Kg = Kg′, then ψ(Kg) = ψ(Kg′). For this consider:
  Kg = Kg′ ⇒ Kg(g′)⁻¹ = K ⇒ g(g′)⁻¹ ∈ K
  ⇒ φ(g(g′)⁻¹) = ē, the identity element of Ḡ
  ⇒ φ(g)φ((g′)⁻¹) = ē   (since φ is a homomorphism)
  ⇒ φ(g)φ(g′)⁻¹ = ē
  ⇒ φ(g)φ(g′)⁻¹φ(g′) = ē φ(g′)
  ⇒ φ(g) = φ(g′)
  ⇒ ψ(Kg) = ψ(Kg′)   (by definition of ψ).
Therefore ψ is well defined.
We next prove that ψ is onto. Let x̄ be any element of Ḡ. Then, since φ is onto, x̄ = φ(g) for some g ∈ G, so that x̄ = φ(g) = ψ(Kg). Thus for any x̄ ∈ Ḡ there exists an element Kg ∈ G/K such that ψ(Kg) = x̄. Therefore ψ is onto.
We now show that ψ is one-to-one. Suppose ψ(Kg) = ψ(Kf) for any two elements Kg, Kf ∈ G/K. Then this implies that φ(g) = φ(f) (by definition of ψ).
  ⇒ φ(g)φ(f)⁻¹ = φ(f)φ(f)⁻¹
  ⇒ φ(g)φ(f⁻¹) = ē   (since φ is a homomorphism)
  ⇒ φ(gf⁻¹) = ē   (since φ is a homomorphism)
  ⇒ gf⁻¹ ∈ K
  ⇒ Kg = Kf,
so that ψ is one-to-one.
Finally we show that ψ is a homomorphism of G/K onto Ḡ. Let Kg, Kf be any two elements of G/K, where g, f ∈ G. Consider
  ψ((Kg)(Kf)) = ψ(Kgf)   (by the property of cosets)
             = φ(gf)   (by definition of ψ)
             = φ(g)φ(f)   (since φ is a homomorphism)
             = ψ(Kg)ψ(Kf)   (by definition of ψ).
Thus ψ is a homomorphism of G/K onto Ḡ. This proves that G/K ≈ Ḡ.
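The theorem can be watched in action on a small example of our own choosing: take φ : Z12 → Z4 with φ(x) = x mod 4 (groups under addition). The kernel is {0, 4, 8}, and the map Kg ↦ φ(g) is well defined and a bijection of the four cosets onto Z4.

```python
# phi: Z12 -> Z4, phi(x) = x mod 4, both groups under addition mod n
n, m = 12, 4
phi = lambda x: x % m

K = sorted(x for x in range(n) if phi(x) == 0)               # kernel {0, 4, 8}
# cosets K + g, stored as frozensets so equal cosets coincide
cosets = {frozenset((k + g) % n for k in K) for g in range(n)}

# psi(K + g) = phi(g) is well defined: all members of a coset share one image
images = {c: {phi(x) for x in c} for c in cosets}
assert all(len(im) == 1 for im in images.values())

# psi is a bijection of G/K onto Z4
assert len(cosets) == m
assert {next(iter(im)) for im in images.values()} == set(range(m))
print(len(cosets))  # 4
```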
Theorem 25:
Let φ be a homomorphism of G onto Ḡ with kernel K. For a subgroup H̄ of Ḡ, let H be defined by
  H = { x ∈ G : φ(x) ∈ H̄ }.
Then H is a subgroup of G and H ⊇ K. If H̄ is normal in Ḡ, then H is normal in G. Moreover, this association sets up a one-to-one mapping from the set of all subgroups of Ḡ onto the set of all subgroups of G which contain K.
Proof:
We first prove that H is a subgroup of G. Let x, y ∈ H. Then φ(x) ∈ H̄ and φ(y) ∈ H̄, so that φ(xy) = φ(x)φ(y) ∈ H̄, i.e., xy ∈ H. Thus x, y ∈ H ⇒ xy ∈ H. Hence H is closed under the product in G.
If x ∈ H, then φ(x) ∈ H̄ and so φ(x)⁻¹ = φ(x⁻¹) ∈ H̄, which shows that x⁻¹ ∈ H. Thus x ∈ H ⇒ x⁻¹ ∈ H. Hence H is a subgroup of G. Let x ∈ K. Then φ(x) = ē, where ē ∈ H̄. Hence φ(x) ∈ H̄, which implies that x ∈ H. Thus H ⊇ K.
We next prove that if H̄ is normal in Ḡ, then H is also normal in G. Let g ∈ G and h ∈ H. Then
  φ(ghg⁻¹) = φ(g)φ(h)φ(g⁻¹) = φ(g)φ(h)φ(g)⁻¹ ∈ H̄,
since φ(g) ∈ Ḡ, φ(h) ∈ H̄ and H̄ is normal in Ḡ. Thus φ(ghg⁻¹) ∈ H̄ ⇒ ghg⁻¹ ∈ H ⇒ H is normal in G.
We next prove that this association sets up a one-to-one mapping from the set of all subgroups of Ḡ onto the set of all subgroups of G which contain K.
Suppose L is a subgroup of G and L ⊇ K. Let
  L̄ = { x̄ ∈ Ḡ : x̄ = φ(l), l ∈ L }.
We now prove that L̄ is a subgroup of Ḡ. Let φ(l₁), φ(l₂) ∈ L̄. Then l₁, l₂ ∈ L. Now
  l₁, l₂ ∈ L ⇒ l₁l₂⁻¹ ∈ L   (since L is a subgroup of G)
  ⇒ φ(l₁l₂⁻¹) ∈ L̄   (by definition of L̄)
  ⇒ φ(l₁)φ(l₂⁻¹) ∈ L̄   (since φ is a homomorphism)
  ⇒ φ(l₁)φ(l₂)⁻¹ ∈ L̄,
which proves that L̄ is a subgroup of Ḡ. Let
  T = { y ∈ G : φ(y) ∈ L̄ }.
Then, since l ∈ L ⇒ φ(l) ∈ L̄ ⇒ l ∈ T, we obtain that L ⊆ T. Also T is a subgroup of G containing K, as seen in the first part of the proof. We now claim that L = T. We must show that T ⊆ L. Let t ∈ T. Then φ(t) ∈ L̄, so φ(t) = φ(l) for some l ∈ L, whence φ(tl⁻¹) = ē and tl⁻¹ ∈ K ⊆ L; therefore t = (tl⁻¹)l ∈ L. Consequently L = T. This completes the proof.

__
Theorem 26: Let  be a homomorphism of G onto G . With kernel K and let
__ __
 __
 __ __
N be a normal subgroup G . Let N   x  G / ( x)  N  Then G / N  G N
 
Equivalently G / N 
G / K 
K / N 

28
__ __ __

Since N is normal in G is normal in G. Define the map  : G  G __


N
__
Such that  ( y)  N  ( g ) for all g  G . We prove that  is a
__ __ __ __ __ __ __
homomorphism of G onto G / N . Let N h  G/ N . Then g  G . Hence there is
__
an element g  G such that  ( g )  g , since  is a homomorphism of G onto
__ __ __ __ __ __ __ __
G . Now,  ( g )  N  ( g )  N g. Thus for N g  G/ N , there is an element
__ __
g  G such that  ( g )  N g
Therefore  is onto.
Again let
T1( xy )  T1 x T1 y 
T2 ( xy )  T2 x T2 y 
T1T2 ( xy )  T1 T2  xy  
 T1  T2 ( x )T2 ( y ) 
 T1 T2 ( x )  T1 T2 ( y ) 
 T1T2 ( x ) T1T2 ( y )
a, b  G. then
( ab )  N( ab )
 N( a )( b )   N( a )  N( b )
  ( a )( b )
G
Hence  is a homomorphism of G onto . Next note that N is the
N
G
identity element of . If g  G then kernel of if and only if ( g )  N .
N
Now, N( g )  N  ( g )  N  g  G . Therefore kernel  of is N.
G
This  is a homomorphism of G onto with kernel N. Then by
N
theorem (1) follows that, this completes the proof of the first part of the
theorem. Finally nothing as a consequence of theorem (1) that G   G / K  and
N N / K 

We obtain that N 
G / K 
N / K
Hence the theorem.

Automorphisms:
Definition:
An isomorphism of a group G onto itself is called an "automorphism" of G. Thus a one-to-one map T of G onto itself is an automorphism of G if T(ab) = T(a)T(b) for every a, b ∈ G.
Theorem 27: (Group of automorphisms of a group)
Let G be a group and let A(G) denote the collection of all automorphisms of G. Then A(G) is also a group with respect to composition of functions as the product.
Proof:
Consider A(G) = { T : T is an automorphism of G }.
Let T₁, T₂ ∈ A(G). Then T₁, T₂ are one-to-one mappings of G onto itself. Therefore T₁T₂ is also a one-to-one mapping of G onto itself. Let x, y be any two elements of G. Then we have
  T₁(xy) = T₁(x)T₁(y)
  T₂(xy) = T₂(x)T₂(y)
  T₁T₂(xy) = T₁(T₂(xy))
           = T₁(T₂(x)T₂(y))
           = T₁(T₂(x)) T₁(T₂(y))
           = T₁T₂(x) T₁T₂(y).
Therefore T₁T₂ is also an automorphism of G. Thus A(G) is closed with respect to composition of functions as the product. We know that composition in the set of one-to-one mappings of G onto itself satisfies the associative law; hence the product of mappings in A(G), a subset of that set, is also associative.
Let i denote the mapping of G which sends every element onto itself; that is, i(x) = x for all x ∈ G. Then i is an automorphism of G and i ∈ A(G). If T ∈ A(G), then clearly iT = T = Ti. Thus the identity map i ∈ A(G) and i is the unit element in A(G).
Finally we establish the existence of inverses. Let T ∈ A(G). Since T is a one-to-one mapping of G onto itself, T⁻¹ exists and is also a one-to-one mapping of G onto itself. We show that T⁻¹ is an automorphism of G. Let x, y ∈ G. Then
  T(T⁻¹(x)T⁻¹(y)) = T(T⁻¹(x)) T(T⁻¹(y)) = i(x)i(y) = xy = i(xy),
so that T⁻¹(x)T⁻¹(y) = T⁻¹(xy).
Hence T⁻¹ ∈ A(G). Thus A(G) is a group with respect to composition of mappings. We now prove:
Theorem 28:
Let g be a fixed element of a group G. Then the mapping T_g : G → G defined by T_g(x) = g⁻¹xg for every x ∈ G is an automorphism of G.
Proof:
To prove that T_g is one-to-one, let x, y be any two elements of G such that T_g(x) = T_g(y). Then
  T_g(x) = T_g(y) ⇒ g⁻¹xg = g⁻¹yg   (by definition of T_g)
  ⇒ x = y   (by the cancellation laws in G).
Therefore the mapping T_g is one-to-one. Next, to prove that T_g is onto, let y be any element of G. Then gyg⁻¹ ∈ G and we have
  T_g(gyg⁻¹) = g⁻¹(gyg⁻¹)g = y   (by the associative law and the property of identity in G).
This proves that T_g is onto. Finally, let x, y ∈ G. Then
  T_g(xy) = g⁻¹(xy)g
          = (g⁻¹xg)(g⁻¹yg)   (since e = gg⁻¹)
          = T_g(x)T_g(y).
Thus T_g is an automorphism of G. This proposition leads to:
Definition:
Let G be a group. The automorphism T_g : G → G defined by T_g(x) = g⁻¹xg for every x ∈ G is known as an "inner automorphism" of G.
An automorphism which is not inner is called an "outer automorphism".
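A quick machine check of Theorem 28 (our own example): conjugation by a transposition in S3, with permutations modelled as tuples, is one-to-one, onto, and composition preserving.

```python
from itertools import permutations

# Permutations as tuples: p[i] is the image of i; (p∘q)(i) = p[q[i]].
def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    return tuple(sorted(range(len(p)), key=lambda i: p[i]))

G = list(permutations(range(3)))
g = (1, 0, 2)                       # the transposition swapping 0 and 1

# T_g(x) = g^{-1} x g, as in Theorem 28
T_g = lambda x: compose(inverse(g), compose(x, g))

images = [T_g(x) for x in G]
assert sorted(images) == sorted(G)   # one-to-one and onto
assert all(T_g(compose(x, y)) == compose(T_g(x), T_g(y)) for x in G for y in G)
print("T_g is an automorphism of S3")
```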

Theorem 29:
For an abelian group the only inner automorphism is the identity mapping, whereas for non-abelian groups there exist non-trivial inner automorphisms.
Proof:
Suppose G is an abelian group with identity element e and T_g is an inner automorphism of G. Then if x ∈ G we have
  T_g(x) = g⁻¹xg = g⁻¹gx   (since G is abelian)
         = ex = x.
Thus T_g(x) = x for every x ∈ G, so that T_g is the identity mapping of G.
Next let G be a non-abelian group. Then there exist at least two elements a, b ∈ G such that ba ≠ ab, so a⁻¹ba ≠ b, i.e., T_a(b) ≠ b. Hence T_a is not the identity mapping of G. Thus for non-abelian groups there always exist non-trivial inner automorphisms.
We now prove the following general result.
Theorem 30:
Let G be a group. Let I(G) = { T_g ∈ A(G) : g ∈ G }. Then I(G) is a normal subgroup of the group of automorphisms of G and is isomorphic to the quotient group G/Z, where Z is the centre of G; i.e.
  I(G) ≈ G/Z.
Proof:
Let A(G) denote the group of all automorphisms of G. Then I(G) ⊆ A(G). Let a ∈ G. We shall first show that T_{a⁻¹} = (T_a)⁻¹, i.e. the inner automorphism T_{a⁻¹} is the inverse function of the inner automorphism T_a.
If x ∈ G, then we have
  (T_a T_{a⁻¹})(x) = T_a(T_{a⁻¹}(x)) = T_a((a⁻¹)⁻¹xa⁻¹)
                  = T_a(axa⁻¹)
                  = a⁻¹(axa⁻¹)a = x.
Therefore T_a T_{a⁻¹} is the identity function on G, so that T_{a⁻¹} = (T_a)⁻¹.
We next prove that I(G) is a subgroup of A(G). Let T_a, T_b be any two elements of I(G). Then a, b ∈ G. If x ∈ G, then
  (T_a(T_b)⁻¹)(x) = (T_a T_{b⁻¹})(x) = T_a(T_{b⁻¹}(x))
                 = T_a(bxb⁻¹) = a⁻¹(bxb⁻¹)a
                 = (b⁻¹a)⁻¹x(b⁻¹a)
                 = T_{b⁻¹a}(x),
which implies that
  T_a(T_b)⁻¹ = T_{b⁻¹a} ∈ I(G),
since b⁻¹a ∈ G. Thus T_a, T_b ∈ I(G) ⇒ T_a(T_b)⁻¹ ∈ I(G), which proves that I(G) is a subgroup of A(G).
We now prove that I(G) is normal in A(G). Let T ∈ A(G) and T_a ∈ I(G). If x ∈ G, then consider
  (T T_a T⁻¹)(x) = T(T_a(T⁻¹(x)))
               = T(a⁻¹ T⁻¹(x) a)
               = T(a⁻¹) T(T⁻¹(x)) T(a)   (since T is composition preserving)
               = T(a⁻¹) x T(a)   (since T(T⁻¹(x)) = x)
               = T(a)⁻¹ x T(a)
               = c⁻¹xc, where T(a) = c ∈ G
               = T_c(x).
Thus T T_a T⁻¹ = T_c ∈ I(G), since c ∈ G. Therefore I(G) is a normal subgroup of A(G).
Finally we prove that I(G) is isomorphic to G/Z. For this we show that I(G) is a homomorphic image of G and Z is the kernel of the corresponding homomorphism. Then by Theorem 24 (the fundamental theorem on homomorphisms of groups) it follows that G/Z ≈ I(G).
Consider the mapping ψ : G → I(G) defined by ψ(a) = T_{a⁻¹} for every a ∈ G. Then ψ is onto: for any T_a ∈ I(G) we have a ∈ G, so a⁻¹ ∈ G and ψ(a⁻¹) = T_{(a⁻¹)⁻¹} = T_a.
Next we have to prove that ψ(ab) = ψ(a)ψ(b) for every a, b ∈ G. We have ψ(ab) = T_{(ab)⁻¹}, by definition. If x ∈ G, then
  T_{(ab)⁻¹}(x) = ((ab)⁻¹)⁻¹ x (ab)⁻¹
              = (ab)x b⁻¹a⁻¹
              = a(bxb⁻¹)a⁻¹
              = T_{a⁻¹}(bxb⁻¹)
              = T_{a⁻¹}(T_{b⁻¹}(x)) = (T_{a⁻¹}T_{b⁻¹})(x).
Thus
  ψ(ab) = T_{(ab)⁻¹} = T_{a⁻¹}T_{b⁻¹} = ψ(a)ψ(b).
Thus ψ is a homomorphism of G onto I(G). We now show that Z is the kernel of ψ.
The identity function i on G is the identity of the group I(G). Let K be the kernel of ψ. Then we have
  z ∈ K ⇔ ψ(z) = i ⇔ T_{z⁻¹}(x) = i(x) for every x ∈ G
       ⇔ (z⁻¹)⁻¹ x z⁻¹ = x for every x ∈ G
       ⇔ zxz⁻¹ = x for every x ∈ G
       ⇔ zx = xz for every x ∈ G
       ⇔ z ∈ Z.
Therefore K = Z.
This completes the proof of the theorem.
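The conclusion I(G) ≈ G/Z can be confirmed numerically on S3 (our own example): the centre of S3 is trivial, so there should be exactly |S3|/|Z| = 6 distinct inner automorphisms.

```python
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    return tuple(sorted(range(len(p)), key=lambda i: p[i]))

G = list(permutations(range(3)))

# record each inner automorphism T_g as the tuple of its values on G
def inner(g):
    return tuple(compose(inverse(g), compose(x, g)) for x in G)

I_G = {inner(g) for g in G}               # the distinct inner automorphisms
Z = [z for z in G if all(compose(z, x) == compose(x, z) for x in G)]

assert len(Z) == 1                        # Z(S3) is trivial
assert len(I_G) == len(G) // len(Z)       # |I(G)| = |G/Z|
print(len(I_G))  # 6
```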

Theorem 31:
Let G be a group and φ an automorphism of G. If a ∈ G is of order o(a) > 0, then o(φ(a)) = o(a).
Proof:
Suppose that φ is an automorphism of a group G and suppose that a ∈ G has order n (that is, aⁿ = e but aᵐ ≠ e for 0 < m < n). Then
  φ(a)ⁿ = φ(aⁿ) = φ(e) = e.
If φ(a)ᵐ = e for some 0 < m < n, then φ(aᵐ) = φ(a)ᵐ = e = φ(e), which implies that aᵐ = e since φ is one-to-one. This is a contradiction. This proves the theorem.
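A check of the theorem on the additive group Z12 (our own example): the map φ(x) = 5x mod 12 is an automorphism of (Z12, +) since gcd(5, 12) = 1, and it preserves the order of every element.

```python
n = 12
phi = lambda x: (5 * x) % n

def order(a):
    """Additive order: least k > 0 with k*a = 0 (mod n)."""
    k, s = 1, a % n
    while s != 0:
        s = (s + a) % n
        k += 1
    return k

# phi is one-to-one, onto, and preserves addition
assert sorted(phi(x) for x in range(n)) == list(range(n))
assert all(phi((a + b) % n) == (phi(a) + phi(b)) % n for a in range(n) for b in range(n))
# o(phi(a)) = o(a) for every element
assert all(order(phi(a)) == order(a) for a in range(1, n))
print("orders preserved")
```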


Cayley’s Theorem and Permutation Groups
Theorem 32: (Cayley’s Theorem)
Every group is isomorphic to a subgroup of A(S) for some appropriate S.
Proof:
Let G be a group and let A(S) denote the group of all one-to-one mappings of a set S onto itself under composition of mappings. For the set S, let us use the elements of G, i.e., let us put S = G. For a ∈ G, define f_a : G → G by
  (x)f_a = xa for every x ∈ G.
Let y ∈ G. Then y = (ya⁻¹)f_a. Thus for every y ∈ G there exists ya⁻¹ ∈ G such that y = (ya⁻¹)f_a. Therefore f_a maps S onto itself. Further, if x, y ∈ S and (x)f_a = (y)f_a for a ∈ G, then we have xa = ya (by definition of f_a), so x = y (by the cancellation property in G); hence f_a is one-to-one. Thus we have proved that for every a ∈ G, f_a ∈ A(S).
Next let us compute f_{ab} for a, b ∈ G. For any x ∈ S = G,
  (x)f_{ab} = x(ab) = (xa)b = ((x)f_a)f_b = (x)(f_a f_b).
This implies that f_{ab} = f_a f_b.
Now define ψ : G → A(S) by ψ(a) = f_a for every a ∈ G. We show that ψ is a homomorphism. For a, b ∈ G, consider
  ψ(ab) = f_{ab} = f_a f_b = ψ(a)ψ(b),
so that ψ preserves composition in G and A(S). Therefore ψ is a homomorphism.
Next we show that the kernel of ψ is (e). Let g₀ ∈ K. Then ψ(g₀) = f_{g₀} is the identity map on S, so that for e ∈ G, (e)f_{g₀} = e. But (e)f_{g₀} = eg₀ (by definition of f_{g₀}). Comparing these two expressions for (e)f_{g₀} we conclude that g₀ = e, whence K = (e). Hence ψ is an isomorphism of G into A(S), so that G is isomorphic to the subgroup ψ(G) of A(S), completing the proof.

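The Cayley construction can be carried out explicitly for a small group of our own choosing, say G = Z4 under addition mod 4: each a yields the permutation (x)f_a = x + a, and a ↦ f_a is a homomorphism with trivial kernel.

```python
n = 4
f = lambda a: tuple((x + a) % n for x in range(n))   # (x)f_a = x + a (mod 4)

# mappings are written on the right in the text: (x)(p q) = ((x)p)q
comp = lambda p, q: tuple(q[p[x]] for x in range(n))

# each f_a is a permutation of S = G
assert all(sorted(f(a)) == list(range(n)) for a in range(n))
# f_{a+b} = f_a f_b, so a -> f_a preserves composition
assert all(f((a + b) % n) == comp(f(a), f(b)) for a in range(n) for b in range(n))
# trivial kernel: only a = 0 gives the identity permutation
assert [a for a in range(n) if f(a) == tuple(range(n))] == [0]
print("Z4 embeds in A(S)")
```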
Theorem 33:
Let G be a group, H a subgroup of G, and S the set of all right cosets of H in G. Then there is a homomorphism θ of G into A(S), and the kernel of θ is the largest normal subgroup of G which is contained in H.
Proof:
Let S = { Hg : g ∈ G }. For g ∈ G let T_g : S → S be defined by
  (Hx)T_g = Hxg.
We prove that T_g ∈ A(S). T_g is clearly a mapping from S into S. Also if, for x, y ∈ G, (Hx)T_g = (Hy)T_g, then
  Hxg = Hyg ⇒ (xg)(yg)⁻¹ ∈ H ⇒ xgg⁻¹y⁻¹ ∈ H ⇒ xy⁻¹ ∈ H
  ⇒ Hx = Hy,
so T_g is one-to-one. Further, any right coset Hx ∈ S can be written as
  Hx = H(xg⁻¹)g = (Hxg⁻¹)T_g,
which proves that T_g is onto. That is, T_g ∈ A(S). We now prove that T_{gh} = T_g T_h for g, h ∈ G. For Hx ∈ S consider
  (Hx)T_{gh} = Hx(gh) = (Hxg)h = ((Hx)T_g)T_h = (Hx)(T_g T_h),
so T_{gh} = T_g T_h.
Define the mapping θ : G → A(S) such that θ(g) = T_g for g ∈ G. Then θ is a homomorphism of G into A(S), for
  θ(gh) = T_{gh} = T_g T_h = θ(g)θ(h) for all g, h ∈ G,
so that θ preserves compositions. Let K be the kernel of θ. If g₀ ∈ K, then θ(g₀) = T_{g₀} is the identity map on S, so that for every X ∈ S, (X)T_{g₀} = X. Since every element of S is a right coset of H in G, we have
  (Ha)T_{g₀} = Ha for every a ∈ G,
i.e. Hag₀ = Ha (by the definition of T_{g₀}) for every a ∈ G.
On the other hand, if b ∈ G is such that Hxb = Hx for every x ∈ G, retracing our argument we obtain that b ∈ K. Thus
  K = { b ∈ G : Hxb = Hx for all x ∈ G }.
Using this characterization of K, we show that K is the largest normal subgroup of G which is contained in H. For this we show that if N is a normal subgroup of G which is contained in H, then N is contained in K.
Since K is the kernel of a homomorphism of G, K is a normal subgroup of G. Further, if b ∈ K, then (Ha)T_b = Ha for every a ∈ G; in particular
  Hb = Heb = He = H,
whence b ∈ H. Thus K ⊆ H.
Finally, let N be a normal subgroup of G which is contained in H. Let n ∈ N, a ∈ G. Then ana⁻¹ ∈ N ⊆ H, so that Hana⁻¹ = H, and thus Han = Ha for all a ∈ G. Therefore n ∈ K, by our characterization of K.
This completes the proof.

Definition:
Let S be a finite set having n distinct elements x_1, x_2, x_3, …, x_n. A one-to-one mapping of S onto itself is called a "permutation of degree n".
Let θ : S → S be a one-to-one mapping of S onto itself. Then θ is a permutation of degree n. Let θ(x_k) = x_{i_k}, k = 1, 2, 3, …, n. We write this permutation by the symbol
  ( x_1     x_2     x_3     ……  x_n     )
  ( x_{i_1} x_{i_2} x_{i_3} ……  x_{i_n} )
i.e., each element in the second row is the θ-image of the element of the first row lying directly above it.

Example:
Let the permutation θ on {1, 2, 3, 4} be represented as
  θ = ( 1 2 3 4 )
      ( 2 4 1 3 )
Then θ(1) = 2, θ(2) = 4, θ(3) = 1 and θ(4) = 3.
Product of two Permutations:

Let σ, τ be any two permutations in S_n and let them be represented by
  σ = ( a_1 a_2 …… a_n )   and   τ = ( b_1 b_2 …… b_n )
      ( b_1 b_2 …… b_n )             ( c_1 c_2 …… c_n )
Then στ is defined by
  στ = ( a_1 a_2 …… a_n )
       ( c_1 c_2 …… c_n )
This means that σ takes a_1 into b_1, while τ takes b_1 into c_1, so that στ takes a_1 into c_1, and so on.
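The product rule is easy to mechanize. Below, σ is the example permutation given earlier in the text and τ is a second, hypothetical permutation of degree 4; the product applies σ first and then τ, following the convention above.

```python
sigma = {1: 2, 2: 4, 3: 1, 4: 3}    # the example permutation from the text
tau   = {1: 2, 2: 3, 3: 4, 4: 1}    # a hypothetical second permutation

# sigma tau takes a into sigma(a), then into tau(sigma(a))
product = {a: tau[sigma[a]] for a in sigma}
print(product)  # {1: 3, 2: 1, 3: 2, 4: 4}
```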
Unit - II
RINGS
In the previous four lessons we had introduced the basic notion of
groups and studied their important properties, which are highly useful for our
further studies in Modern Algebra. In this and the subsequent two lessons we
introduce the notion of rings and study their essential properties, besides other
relevant facts. Just as groups, rings are algebraic systems which serve as
building blocks for various structures encountered in Modern Algebra. While
the abstract notion of a group is founded on the set of bijections of a set onto
itself, the notion of a ring will be seen to have arisen as generalizations from
the set of integers and the additive and multiplicative properties in integers. We
know that groups are one-operational structures, i.e., structures in which only one
operation (generally called multiplication) is defined. But in the case of rings,
there are two operations usually called addition and multiplication. Despite this
difference, the analysis of rings will follow the same pattern as for groups. In
that we will be interested in studying analogs of homomorphisms, normal
subgroups, factor or quotient groups etc. We begin with a formal and complete
definition of a ring.
Definition 1:
A ring R = <R, +, ·> is a system of double composition, i.e., a set R in which two binary operations + and · (respectively called addition and multiplication) are defined, which associate with every pair of elements a, b (distinct or otherwise) of R unique elements of R, viz. a + b and a·b (or simply ab), respectively called their sum and product, such that the following axioms are satisfied:
A 0 : a + b  R for every a, b  R . (closure property w.r.t. addition)
A1 : a + (b + c)= (a + b) + c for every a, b, c  R . (associativity of
addition)
A 2 : There exists an element 0  R called the zero or the null
element of R such that a + 0= 0 + a=a for every a  R .
(existence of the zero element)
A 3 : Corresponding to any element a of R, there exists an element –a in R, called the negative of a, such that a + (–a) = (–a) + a = 0. (existence of additive inverses)
A 4 : a + b = b + a for every a, b  R . (commutativity of addition)
M 0 : a·b  R for every a, b  R . (closure property w.r.t. multiplication)
M1 : a (b c) = (a b) c for every a, b, c  R . (associativity of
multiplication).
D1 : a (b + c) = ab + ac for every a, b, c  R (left distributive law).
D 2 : (b+c) a = b a + ca for every a, b, c  R (right distributive law)

Remark on the above definition:
The axioms A 0 – A 4 given above show that, for the ring R, <R, +> is an additive abelian group. This group is called the additive group of the ring R and is usually denoted by R⁺. So also the axioms M 0 and M 1 show that <R, ·> is a semigroup, and this semigroup is called the multiplicative semigroup of the ring R. The axioms D 1 and D 2 show that the multiplication ‘·’ in the ring R is both left distributive and right distributive over addition ‘+’ in R.
In view of this remark we may define the notion of a ring more briefly as follows.
Definition 2:
A ring R = <R, +, ·> is an algebraic structure in which two binary (or algebraic) operations, viz. addition ‘+’ and multiplication ‘·’, are defined, such that
(i)  R,   is an abelian group (called the additive group of the ring
R),
(ii)  R,.  is a semigroup (called the multiplicative semigroup of the
ring R),and
(iii) the multiplication is both left distributive and right distributive
over addition.
We give below some additional axioms, some of which may or may not be satisfied in a ring R.
M 2 : a b = ba for every a, b  R (Commutativity of multiplication)
M 3 : There exists an element 1  R , called the unit element or the identity in R, such that a·1 = 1·a = a for every a  R . (Existence of the multiplicative identity)
M 4 : R has at least two elements and the non-zero elements of R form a
group under multiplication. (Multiplicative group structure of the
non-zero elements)
M 5 : For some non-zero element a of R, there exists an element b  R , b  0 , such that a·b = 0 or b·a = 0. (Existence of zero divisors)
M 6 : If a, b  R and ab = 0, then a = 0 or b = 0. (absence of non-trivial zero divisors)
Definition 3:
A ring R in which axiom M 2 is satisfied is called a commutative ring
and a ring R in which axiom M 2 is not valid (i.e. in which there are at least two
elements a, b such that ab  ba ) is called a non-commutative ring.

Note: Throughout our lessons, unless otherwise stated, we always mean by a ring a not necessarily commutative (associative) ring. Some authors (for instance, refer to “General Algebra” by A.G. Kurosh) exclude the axiom M 1 (viz. associativity of multiplication) from the definition of a ring, so that according to them a ring need not be associative. Examples of rings that are not necessarily associative can be given.
Definition 4:
A ring R in which axiom M 3 holds good is called a ring with identity
(or unity or unit element).
Definition 5:
A ring R in which axiom M 4 holds good is called a field (not
necessarily commutative), more precisely a division ring or a skew-field or an
S-field. If axiom M 2 is also satisfied (besides axiom M 4 ), then the ring R is
called a commutative field or simply a field.
We will study about fields in great detail in the last chapter.
Definition 6:
A ring R in which axioms M 2 and M 6 are satisfied is called an integral domain. In other words, a commutative ring (with at least two elements) which has no non-trivial zero divisors is called an integral domain.
Note: Some authors like N. Jacobson (vide: his book “Lectures in Abstract
Algebra”, Vol,1) do not insist on the axiom M 2 (i.e., commutativity) in the
definition of an integral domain.
We will now give a good many standard examples of different types of rings:
Example 1:
The set Z of all integers (positive, negative and zero) is a
(commutative) integral domain with identity under the usual operations of
addition and multiplication of integers.
Example 2:
The set nZ of all multiples of a fixed integer n (  1, –1) is a (commutative) integral domain without identity, the operations being the usual addition and multiplication of integers.
Example 3:
All rational numbers, so also all real numbers and all complex numbers, form (commutative) fields Q, R, C, the operations being the usual operations of addition and multiplication of the numbers, rational, real or complex, as the case may be.
Example 4:
The set Z[i] of all Gaussian integers, viz. all complex numbers of the form a + bi where a, b  Z , is a (commutative) integral domain with identity under the usual operations of addition and multiplication of complex numbers. Z[i] is only an integral domain and not a field, since the inverse of a Gaussian integer a + bi (  0 ) is not a Gaussian integer, unless the Gaussian integer is  1 or  i .
Example 5:
All real numbers of the form a  b c , where c is a fixed positive integer
which is not a perfect square and a, b  Z form a commutative integral
domain, which is not a field.
Example 6:
Let <R, +> be any non-trivial additive abelian group. Define the trivial multiplication ‘·’ on R by setting a·b = 0, the zero (or null) element of the group <R, +>, for all a, b  R . Then one easily checks that <R, +, ·> is a commutative ring, every one of whose elements  0 is a zero divisor.
All the above examples are infinite commutative rings provided in
Example 6, the additive group is an infinite group. We will now give examples
of finite rings, which are either commutative or non-commutative.
Example 7:
Let n be a fixed positive integer and let Z n denote the set of all residue classes modulo n. For example, if n = 4, Z 4 = { 0̄, 1̄, 2̄, 3̄ }, where ī denotes, for i = 0, 1, 2, 3, the residue class modulo 4 determined by i, viz. the set of all integers which are congruent to i modulo 4. Now < Z n , ⊕, ⊙ > is a finite commutative ring with identity, viz. 1̄, if ⊕ and ⊙ respectively are the usual addition and multiplication modulo n. (Some authors use the notation + n , · n instead of ⊕ and ⊙.) This ring will turn out to be a finite field iff n is a prime, while it will contain non-trivial divisors of zero iff n is not a prime. For example, Z 5 is a field of 5 elements and Z 6 is a commutative ring with identity containing 2̄, 3̄, 4̄ as the only non-trivial divisors of zero, since 2̄ ⊙ 3̄ = 0̄ and 3̄ ⊙ 4̄ = 0̄.
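These claims about Z6 and Z5 can be verified exhaustively by a short computation (an illustration, not part of the text):

```python
# A non-zero a is a zero divisor in Z_n if a*b = 0 (mod n) for some non-zero b.
def zero_divisors(n):
    return sorted({a for a in range(1, n) for b in range(1, n) if (a * b) % n == 0})

assert zero_divisors(6) == [2, 3, 4]      # the only non-trivial zero divisors of Z6
assert zero_divisors(5) == []             # Z5 has none ...
# ... and every non-zero element of Z5 has a multiplicative inverse, so Z5 is a field
assert all(any((a * b) % 5 == 1 for b in range(1, 5)) for a in range(1, 5))
print(zero_divisors(6))  # [2, 3, 4]
```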
Example 8:
Let F n be the set of all n  n matrices [aij ] , where the aij are elements chosen from a field F. For example, F can be taken as Z m , mentioned in the last example, where m is a prime, in which case F n will have m^(n²) elements. Because matrix multiplication is both left distributive and right distributive over matrix addition (these facts are supposed to be known; otherwise one may refer to any standard book on matrices), F n , under the usual operations of matrix addition and matrix multiplication, becomes a finite non-commutative ring with identity, viz. the identity matrix [ ij ] , where  ij = 0 if i  j and  ii = 1, the unit element of the field F. This ring F n contains non-trivial divisors of zero. For instance, the matrix
  E = [ 1 0 … 0 ]
      [ 0 0 … 0 ]
      [ … … … … ]
      [ 0 0 … 0 ]
in F n is a divisor of zero, since for any matrix A in F n whose first row consists entirely of zeros,
  EA = the zero matrix,
and similarly, for any matrix B in F n whose first column consists entirely of zeros,
  BE = the zero matrix.
Thus the matrix E in F n turns out to be a non-trivial two-sided zero-divisor. Hence F n contains at least one non-trivial divisor of zero; in fact, it contains many more.
Note: If in the above example F is the field of rational (or real or complex) numbers, F n turns out to be an example of an infinite, non-commutative ring with identity and with (many) non-trivial divisors of zero.
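The two products above can be checked directly in the 2 × 2 case (our own small instance of the example):

```python
# E has 1 in the (1,1) position and 0 elsewhere.
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

E = [[1, 0], [0, 0]]
A = [[0, 0], [5, 7]]     # zero first row, arbitrary entries below it
B = [[0, 3], [0, 9]]     # zero first column, arbitrary entries beside it

zero = [[0, 0], [0, 0]]
assert matmul(E, A) == zero   # E is a left zero divisor
assert matmul(B, E) == zero   # ... and a right zero divisor
print("E is a two-sided zero divisor")
```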
It will be subsequently shown that any finite integral domain is a field. This being so, to give an example of a non-commutative integral domain which is not a field, we need to look for an example of an infinite non-commutative integral domain. Actually we give below an important example of an infinite skew-field, viz. the non-commutative field of quaternions (in the form of matrices), and this skew-field is also a non-commutative integral domain; one can show that a field and a skew-field cannot contain non-trivial divisors of zero.
Example 9: (The division ring or skew-field of quaternion’s):
 u v
Let F denote the set of all complex matrices of the form  
 v u 
where u, v are complex numbers and u, v denote their conjugates. One
can easily check that F is a skew-field, under the usual operations of matrix
addition and matrix multiplication. In fact, it is not difficult to verify that the
sum and product of any two matrices in F, besides the zero matrix, the identity
matrix and the negative of any matrix in F all belong to F. also for any non-
 u v
zero   , the multiplicative inverse belongs to F, the inverse being
 v u 
1 u v 
 
u 2  v2  v u 
Note: The above mentioned division ring of quaternions can also be realized as a non-commutative algebra of dimension 4 over the field of real numbers.
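As a small numerical sanity check of Example 9 (a sketch added here, not part of the lesson text; the sample values of u and v are our choices), one can multiply a quaternion matrix by the claimed inverse and verify that the product is the 2x2 identity:

```python
# 2x2 complex matrices as nested lists; plain Python, no libraries needed.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

u, v = 1 + 2j, 3 - 1j
M = [[u, v], [-v.conjugate(), u.conjugate()]]

# The claimed inverse from Example 9, with scalar 1/(|u|^2 + |v|^2).
d = abs(u) ** 2 + abs(v) ** 2
Minv = [[u.conjugate() / d, -v / d], [v.conjugate() / d, u / d]]

P = matmul(M, Minv)
print(all(abs(P[i][j] - (1 if i == j else 0)) < 1e-12
          for i in range(2) for j in range(2)))  # True
```

The same check with the factors in the other order also yields the identity, since the determinant $|u|^2 + |v|^2$ is non-zero whenever the matrix is non-zero.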
Having listed certain important examples of rings, let us now mention
certain familiar properties of rings.
Lemma 1:
If R is a ring and a, b ∈ R, then
(1) a0 = 0a = 0,
(2) a(-b) = (-a)b = -(ab),
(3) (-a)(-b) = ab. If, further, R is a ring with identity 1, then
(4) (-1)a = -a,
(5) (-1)(-a) = a.
Proof:
(1) a0 = a(0+0) = a0 + a0, which yields a0 = 0, using the cancellation law in the group <R,+>. Similarly, 0a = 0 follows.
(2) 0 = a0 = a[b+(-b)] = ab + a(-b), which yields a(-b) = -(ab), by definition of inverse in the group <R,+>. Similarly (-a)b = -(ab) follows.
(3) Using (2), we get (-a)(-b) = -[a(-b)] = -[-(ab)] = ab. (4) and (5) are only special cases of (2) and (3).
Making use of the additive group <R,+> of the ring R, we define, as usual, the multiple na of any element a ∈ R, where n is an integer, by setting
na = a + a + ... + a (n terms), if n is a positive integer,
na = (-a) + (-a) + ... + (-a) (-n terms), if n is a negative integer,
na = 0, if n is the integer 0.
With this familiar notion of a multiple of any element a, we can state the following (easily proved) facts: if a, b ∈ R and m, n be any two integers (positive or negative or zero), then (i) m(a+b) = ma + mb, (ii) m(-a) = -(ma), (iii) (ma)(nb) = (mn)(ab).
Note: The result (i) above is a consequence of the abelian nature of the additive group <R,+> of the ring, and the result (iii) is a consequence of the two distributive laws valid in a ring. So also, in order to prove a0=0 and 0a=0 we need respectively the left and right distributive laws. The following generalizations of the distributive laws can be easily proved:
Lemma 2:
If $a_1, a_2, \ldots, a_m, b_1, b_2, \ldots, b_n \in R$, where R is a ring, then
$$(a_1 + a_2 + \cdots + a_m)(b_1 + b_2 + \cdots + b_n) = a_1b_1 + a_1b_2 + \cdots + a_1b_n + a_2b_1 + \cdots + a_2b_n + \cdots + a_mb_1 + \cdots + a_mb_n.$$
More briefly, the above fact may be stated as
$$\Big(\sum_{i=1}^{m} a_i\Big)\Big(\sum_{j=1}^{n} b_j\Big) = \sum_{i=1}^{m}\sum_{j=1}^{n} a_i b_j.$$
Note that since addition is commutative in the ring R it is immaterial in what order the terms in the sum on the right hand side of the result (stated above) are written.
Lemma 3: Any finite integral domain is a field.
Proof:
Let D = <D, +, .> be a finite integral domain. To prove that D is a field it is enough if we prove that the non-zero elements of D form a group under multiplication '.'. For this purpose, it suffices to show that D* = D - {0} is a semigroup under multiplication and that the equations ax = b, ya = b have solutions x, y ∈ D*, given a, b ∈ D* (vide alternative Definition No. 2 for a group mentioned in Lesson 2). From the definition of an integral domain it follows that if a, b ∈ D* (so that a, b ∈ D and a, b ≠ 0), then ab ∈ D*. Hence D* is a semigroup under multiplication. Suppose now the finitely many elements of D* are a1, a2, ..., an (all distinct). If a is an arbitrary element in D*, consider the elements aa1, aa2, ..., aan. Since <D*, .> is a semigroup, all the above elements are in D* and further they are all distinct; for otherwise, suppose a.ai = a.aj with i ≠ j. Then
a.(ai - aj) = a.[ai + (-aj)] = a.ai + [-(a.aj)], by Lemma 1,
= a.ai - a.aj = 0,
so that ai - aj = 0, since a ≠ 0 and D is an integral domain. Thus a.ai = a.aj ⟹ ai = aj, a contradiction. This shows that all the n elements aa1, aa2, ..., aan are distinct and belong to D*. Hence the above n elements are the same as the n elements a1, a2, ..., an but for possibly the order. That is, given any b ∈ D* there exists an element ai ∈ D* such that aai = b. Similarly we can show that the equation ya = b has a solution y in D*. Thus <D*, .> turns out to be a group, as desired. Hence the Lemma.
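The counting argument in the proof can be watched in action (an added sketch, not part of the text) in the finite integral domain Z_7: multiplication by any fixed nonzero a permutes the nonzero classes, so every equation ax = b is solvable.

```python
# In Z_7, the map x -> a*x (mod 7) hits every nonzero residue class,
# exactly as in the proof of Lemma 3: the products a*a_1, ..., a*a_n
# are distinct, hence are the a_i again in some order.
n = 7
nonzero = set(range(1, n))
for a in nonzero:
    products = {(a * x) % n for x in nonzero}
    assert products == nonzero  # a permutation of the nonzero classes

print("every nonzero element of Z_7 is invertible")
```

Replacing 7 by a composite modulus such as 6 breaks the assertion (e.g. 2.x mod 6 only takes even values), which is exactly why the hypothesis "integral domain" is needed.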
Note: The above proof must be reminiscent of the proof we had given for Problem 4 in Lesson 2. In fact, if we make use of this problem, namely that any finite cancellative semigroup is a group, this Lemma follows almost automatically in view of the following.
Remark:
A (commutative) ring R with at least one non-zero element is an integral domain if and only if the non-zero elements of R form a cancellative semigroup under multiplication. In fact, if R is an integral domain and R* = R - {0}, then by definition of an integral domain it follows that <R*, .> is a semigroup. Further, if a ∈ R*, b, c ∈ R and ab = ac, then a(b-c) = ab - ac = 0, which implies b - c = 0 as a ≠ 0. Thus <R*, .> turns out to be a (commutative, cancellative) semigroup. Conversely, suppose <R*, .> is a (commutative) cancellative semigroup. Then for a, b ≠ 0 in R, ab ≠ 0, which means that R has no non-trivial divisors of zero. (In other words, for a, b ∈ R, ab = 0 always implies that at least one of a or b is equal to zero.) Thus R becomes an integral domain. We will now formally define the notion of divisors of zero, or zero divisors, in a ring R.
Definition 7:
Let R be a ring (not necessarily commutative) and a ∈ R. Then a is said to be a left zero divisor if there exists an element b ≠ 0 in R such that ab = 0. Similarly, a is said to be a right zero divisor if there exists an element c ≠ 0 in R such that ca = 0. The element a is said to be a two-sided zero divisor if it is both a left zero divisor and a right zero divisor. (Note that 0 is always a zero divisor in R, if R has at least one non-zero element.) We now show in passing that no field or skew-field contains a non-trivial zero divisor. This property of a field or a skew-field follows from the following more general result.
Result:
Let R be a ring with identity e (not necessarily commutative) and a ∈ R. Then a has no multiplicative left (right) inverse in R if a is a left (right) zero divisor.
Proof:
Suppose a has a multiplicative left inverse, say a⁻¹ ∈ R, such that a⁻¹a = e. Then a cannot be a left zero divisor; for otherwise there exists an element b ≠ 0 in R such that ab = 0. But this implies that a⁻¹(ab) = a⁻¹0 = 0, whence (a⁻¹a)b = 0, i.e. eb = 0 or b = 0, which is a contradiction to the assumption that b ≠ 0. Thus a left zero divisor cannot have a left inverse. Similarly, it can be shown that a right zero divisor cannot have a right inverse. Hence the result.
Problem 1:
Prove that in a ring R the product of any two (and hence the product of any finitely many) left (right) zero divisors is again a left (right) zero divisor.
Problem 2:
The units (or invertible elements) in a ring with identity form a group under multiplication.
(By a unit in a ring R with identity e (say) is meant an element u which has a multiplicative inverse v in R, such that uv = vu = e.) We have the following important fact concerning the ring Zn of residue classes modulo n (refer to Example 7), as a corollary under Lemma 3.
Corollary: The ring Zn is a field if and only if n is a prime.
Proof:
First assume that n is a prime. We show that Zn is a field. The elements of Zn are the n residue classes $\bar{0}, \bar{1}, \ldots, \overline{n-1}$ where, as usual, $\bar{i}$ denotes the residue class consisting of all integers (of the form $\lambda n + i$, $\lambda \in Z$) which leave the remainder i on division by n, i.e. which are congruent to i modulo n. Suppose for $\bar{a}, \bar{b} \in Z_n$, $\bar{a}\bar{b} = \bar{0}$; then ab ≡ 0 (mod n). Since n is a prime, the above implies that either a or b is ≡ 0 (mod n), i.e. either $\bar{a}$ or $\bar{b}$ is equal to $\bar{0}$. Thus Zn is a finite (commutative) ring containing no non-trivial zero divisors, i.e. Zn is a finite integral domain and hence is a field. Conversely, suppose Zn is a (finite) field. We prove that n must be a prime. Suppose not; then we can write n = pq where 1 < p, q < n (i.e. p and q are proper divisors of n). Now $\bar{p}\bar{q} = \overline{pq} = \bar{n} = \bar{0}$, but $\bar{p} \neq \bar{0}$ and $\bar{q} \neq \bar{0}$ since 1 < p, q < n. This means that Zn has non-trivial zero divisors, namely $\bar{p}, \bar{q}$. But Zn, being a field, cannot contain non-trivial zero divisors and so we have a contradiction. Hence n must be a prime, as claimed.
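The corollary can be checked by brute force for small n (an added sketch, not part of the text; the helper name `is_field` is ours):

```python
# Z_n is a field iff every nonzero residue class has a multiplicative
# inverse modulo n; search all candidates b for each a.
def is_field(n):
    return all(any((a * b) % n == 1 for b in range(1, n))
               for a in range(1, n))

print(is_field(7), is_field(6))  # True False
```

For n = 6 the classes of 2 and 3 are exactly the non-trivial zero divisors exhibited in the proof (2.3 ≡ 0 mod 6), and neither has an inverse.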
Definition 8:
Let R be a ring. Then R is said to be of characteristic m (> 0) if m is the least positive integer such that ma = 0 for all a ∈ R. R is said to be of characteristic zero if there exists no positive integer m such that ma = 0 for all a ∈ R.
Note: In the case of an integral domain R, if for some positive integer m and for some non-zero element a ∈ R, ma = 0, then mx = 0 for all x ∈ R. To see this fact, consider
(mx)a = m(xa) (by the right distributive law)
= x(ma) (by the left distributive law)
= x0 = 0.
Hence mx = 0, since a ≠ 0 and R is an integral domain.
Thus in the case of an integral domain R, the characteristic of R is the least positive integer m, if it exists, such that ma = 0 for some non-zero element a in R. If no such positive integer m exists, then R is of characteristic zero.
We will prove the following problem, which gives definitive information about the characteristic of an integral domain.
Problem 3:
Prove that the characteristic of an integral domain is either zero or a prime.
Proof:
Let R be an integral domain. If the characteristic is zero, there is nothing to prove. (However, we may note that when a ring R, not necessarily an integral domain, is of characteristic zero, then R must be an infinite ring; for otherwise, if the ring R is finite and has only n elements, then by a corollary under Lagrange's theorem on groups we get that na = 0 for all a ∈ R, which means that the characteristic of R is either n or less than n, in fact a divisor of n, and not zero.) Suppose now the integral domain R is of characteristic n ≠ 0. Then we claim that n must be a prime; for otherwise n = pq where 1 < p, q < n. Then for any non-zero element a in R,
(pa)(qa) = (pq)a² (using the two distributive laws)
= na² = 0 (n is the characteristic of R).
The above implies that either pa = 0 or qa = 0. However both the possibilities are impossible, since both p, q are < n and pa = 0 will imply px = 0 for all x ∈ R. So also qa = 0 will imply qx = 0 for all x ∈ R. Thus n must be a prime and the problem has been established.
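Definition 8 and Problem 3 can be illustrated in the rings Z_n (an added sketch; the helper name `characteristic` is ours, not the text's):

```python
# Least positive m with m*a = 0 for every residue class a of Z_n.
def characteristic(n):
    for m in range(1, n + 1):
        if all((m * a) % n == 0 for a in range(n)):
            return m

# Z_5 is an integral domain and its characteristic 5 is prime; Z_6 is
# not an integral domain, and its characteristic 6 is indeed composite,
# consistent with Problem 3.
print(characteristic(5), characteristic(6))  # 5 6
```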
The following problem, for which the hint given may be used, proves that the commutativity of addition in a ring R with identity is a consequence of the remaining axioms which define R.
Problem 4:
Let R = <R, +, .> be an algebraic system satisfying all the axioms for a ring with identity, with the possible exception of the commutativity of addition +; i.e. <R,+> is a group, not necessarily abelian, <R,.> is a semigroup with identity (known as a monoid), and multiplication '.' is both left distributive and right distributive over addition '+'. Then show that R is actually a ring.
(Hint: Expand (a+b)(e+e) in two ways, using the distributive laws, where a, b ∈ R and e is the identity in the monoid <R,.>.)
Homomorphisms:
Just as for groups, we now introduce the important notion of a homomorphism between rings.
Definition 9:
A mapping φ of a ring R into R' is said to be a homomorphism of R into R' if
1. φ(a+b) = φ(a) + φ(b), and
2. φ(a.b) = φ(a).φ(b), for all a, b ∈ R.
In the above definition it must be noted (as in the case of groups) that the + and . that occur on the left hand sides of the above conditions (1) and (2) are the operations + and . in the ring R, while the + and . that occur on the right hand sides are those of the ring R'.
If there exists a homomorphism of a ring R onto a ring R', we write R ~ R' and say that R' is a homomorphic image of R. A homomorphism of a ring R onto a ring R' is also known as an epimorphism (more precisely, a ring epimorphism). If the homomorphism φ of a ring R into a ring R' is one-to-one, i.e. φ(a) = φ(b) ⟹ a = b, where a, b ∈ R, then the homomorphism φ is said to be a monomorphism (more precisely, a ring monomorphism). A homomorphism φ: R → R' which is both a monomorphism and an epimorphism is called an isomorphism, and in such a case we write R ≅ R' and say that the ring R is isomorphic with the ring R'. It is easy to see that any two rings which are isomorphic have the same algebraic properties pertaining to addition and multiplication. For instance, if the rings R and R' are isomorphic, then (1) R is commutative iff R' is, (2) R has identity iff R' has, (3) R is an integral domain iff R' is, (4) R is a field iff R' is, (5) R is finite iff R' is, and so on. Such a statement as above cannot be made when there exists only a homomorphism and not an isomorphism of the ring R into (or onto) the ring R'. We will say more about this a little later.
In what follows let φ: R → R' be a homomorphism of a ring R into a ring R'. Since φ preserves the additive structures of the rings R and R', making use of the known properties of a group homomorphism, we have the following facts: (1) φ(0) = 0, where the 0 on the left hand side is the zero element in R and the 0 on the right hand side is the zero element in R'. (2) φ(-a) = -φ(a) for all a ∈ R.
Definition 10: For the homomorphism φ: R → R', the set K = {a ∈ R | φ(a) = 0, the zero in R'} is (as in the case of groups) called the kernel of the homomorphism φ.
One can immediately notice (from group theory) that φ is a monomorphism iff K is trivial, i.e. K contains only the zero element of R. Further K, under addition, is a (normal) subgroup of the additive group <R,+> of the ring R. Also, for any a ∈ K and r ∈ R, both ra and ar are in K, since φ(ra) = φ(r)φ(a) = φ(r).0 = 0 and similarly φ(ar) = 0. These properties of K are used to define the notion of an ideal in the ring R, this notion being analogous to the notion of normal subgroups in Group Theory.
Definition 11:
Let R be any ring. A non-empty subset A of R is said to be a two-sided ideal, or simply an ideal, of R if
1. x, y ∈ A ⟹ x - y ∈ A (module property)
2. x ∈ A, r ∈ R ⟹ rx ∈ A (left-multiple property)
3. x ∈ A, r ∈ R ⟹ xr ∈ A (right-multiple property)
Note: A is said to be a left ideal of R if conditions (1) and (2) stated above are satisfied, and is said to be a right ideal of R if conditions (1) and (3) are satisfied. Thus an ideal of R is both a left ideal and a right ideal of R. Also one may note that when the ring R is commutative the notions of left ideal and right ideal coincide, and all ideals in R are two-sided ideals.
Definition 12:
A non-empty subset A of a ring R is said to be a sub-ring of R if A is also a ring under the same operations of + and . as for R.
It is easy to prove that a non-empty subset A of a ring R is a sub-ring of R iff
(i) x, y ∈ A ⟹ x - y ∈ A
(ii) x, y ∈ A ⟹ xy ∈ A
For the ring Z of integers, all multiples of a fixed integer n (say) form the subset nZ, which is clearly a sub-ring of Z. So also all the integers form a sub-ring of the ring Z[i] of Gaussian integers. The real numbers of the form a + b√c, where a, b ∈ Z and c is a fixed positive integer which is not a perfect square, form a sub-ring of the field R of real numbers. Many examples of sub-rings can be given. One can also show that every ideal A in a ring R (even a one-sided ideal) is a sub-ring of R, though not conversely. We give below examples of (i) a sub-ring which is not an ideal (even one-sided), (ii) a left ideal which is not a right ideal, (iii) a right ideal which is not a left ideal, besides examples of two-sided ideals.
Example (a):
Let F2 denote the (non-commutative) ring of all real 2×2 matrices (vide: Example 8). Let A be the set of all real upper triangular matrices of the form $\begin{pmatrix} a & b \\ 0 & c \end{pmatrix}$; then one can easily check that A is only a sub-ring of F2 and is neither a left nor a right ideal of F2. Similarly the set B of all real lower triangular matrices of the form $\begin{pmatrix} a & 0 \\ b & c \end{pmatrix}$ (which are transposes of the matrices in A) is also only a sub-ring of F2 and is neither a left nor a right ideal of F2. On the other hand, if C is the set of all real matrices of the form $\begin{pmatrix} a & b \\ 0 & 0 \end{pmatrix}$, one can easily show that C is a right ideal and not a left ideal of F2. Similarly the transposes of the above mentioned matrices, viz. the set D of all real matrices of the form $\begin{pmatrix} a & 0 \\ b & 0 \end{pmatrix}$, is a left ideal and not a right ideal of F2. Analogous examples can be given from the (more general) ring Fn of all n×n real (or rational or complex) matrices.
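The claim about the set C in Example (a) is easy to test numerically (an added sketch; the two sample matrices are our choices):

```python
# C consists of 2x2 matrices whose second row is zero. Right multiplication
# by anything keeps the second row zero; left multiplication generally
# does not, so C is a right ideal but not a left ideal of F2.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def in_C(M):
    return M[1] == [0, 0]

X = [[1, 2], [0, 0]]   # an element of C
R = [[5, 6], [7, 8]]   # an arbitrary element of the ring F2

print(in_C(matmul(X, R)))  # True:  X*R stays in C
print(in_C(matmul(R, X)))  # False: R*X leaves C
```

Transposing everything gives the corresponding check for the left ideal D.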
Example (b):
In the ring Z of integers the subset nZ, where n is a fixed integer, is not only a sub-ring of Z but even a two-sided ideal in Z. It can be shown that every ideal in Z is of this form, viz. nZ = {nm | m ∈ Z} for some fixed integer n. For this reason Z is a principal ideal ring, about which we will say more in the next lesson.
Example (c):
Let R be any ring, not necessarily commutative. If a is any fixed element in R, the set aR = {ax | x ∈ R} can be shown to be a right ideal of R. In general aR need not be a left ideal of R. Similarly the set Ra = {xa | x ∈ R} is a left ideal of R and is, in general, not a right ideal of R. Moreover, in general, a need not belong to either the right ideal aR or the left ideal Ra. For example, if R is the (commutative) ring of all even integers, the subset 2R (of all multiples of 4) is an ideal of R not containing 2. However, if the element a in the ring R belongs to the centre, viz. the set Z(R) = {z ∈ R | xz = zx for all x ∈ R}, then aR = Ra will be a two-sided ideal of R.
The following standard problem gives an important characterization of a division ring in terms of right or left ideals. (One may also note in passing that any ring R, not necessarily commutative but containing at least one non-zero element, always has two ideals, viz. the null ideal (0) containing the zero element 0 only and the unit ideal, which is the whole ring R.)
Problem 5:
Let R be a ring with identity element, not necessarily commutative. Then R is a division ring if and only if the only right (left) ideals of R are the null ideal (0) and the unit ideal R.
Proof for the problem:
Suppose the only right ideals in R are the null ideal (0) and the unit ideal R. Then we have to show that R is a division ring. For this, it suffices to show that the non-zero elements of R form a group under multiplication; and for this purpose, in our present case, since R has the (multiplicative) identity element e (say), it is enough if we show that every non-zero element in R has a (multiplicative) right (or left) inverse and that the product of any two non-zero elements in R is again a non-zero element. First we will show that if a, b ∈ R and a, b ≠ 0, then ab ≠ 0. Now aR is a right ideal of R and, since a = ae ∈ aR and a ≠ 0, aR = R, as R contains only the two right ideals (0) and R. Similarly bR = R. Suppose, if possible, ab = 0. Then (ab)R = (0). But (ab)R = a(bR) = aR = R ≠ (0). Hence we obtain a contradiction and so ab ≠ 0. Thus the non-zero elements of R form, under multiplication, a semigroup with identity e. Also, since aR = R, there exists an element a' (a right inverse of a) in R such that aa' = e. Hence, by Definition No. 1 for a group (vide: Lesson 2), the non-zero elements in R form a group under multiplication, as required. Conversely, suppose R is a division ring. Then we prove that R has only the two ideals, namely the null ideal (0) and the unit ideal R. Suppose A is a non-null ideal of R and a ∈ A, a ≠ 0. Then for any b ∈ R there exists an element x ∈ R such that ax = b. (This fact follows from the definition of a division ring.) Now since A is an ideal and a ∈ A, we get that b = ax ∈ A. But b is an arbitrary element of R, whence it follows that R ⊆ A ⊆ R. Thus A = R, the unit ideal, as desired. Hence the problem.
Bearing in mind the definition of an ideal (left or right or two-sided), it is not difficult to show that if A and B are both left (or right or two-sided) ideals in a ring R, then A ∩ B is also a left (or right or two-sided) ideal in R. More generally, the intersection of any non-empty family of left (or right or two-sided) ideals in R is also a left (or right or two-sided) ideal in R. This enables us to define the left (or right or two-sided) ideal generated by any non-empty subset M of R as the intersection of all left (or right or two-sided) ideals of R which contain M. The above definition is meaningful inasmuch as there is at least one left (right, two-sided) ideal in R, viz. the unit ideal R, which contains M. When M is the singleton set {a}, the left (right, two-sided) ideal generated by M is called the principal left (right, two-sided) ideal generated by the element a and is usually denoted by (a)_l, (a)_r or (a), as the case may be.
Problem 6:
If R is a ring with identity, not necessarily commutative, then show that, for an element a ∈ R, aR is the right ideal generated by a, Ra is the left ideal generated by a, and the set of all finite sums of the form $\sum ras$, with r, s ∈ R, is the two-sided ideal generated by a.
However, if R has no identity element, then show that
(i) the set {na + ra | n ∈ Z, r ∈ R} is the left ideal (a)_l,
(ii) the set {na + ar | n ∈ Z, r ∈ R} is the right ideal (a)_r, and
(iii) the set {na + Σ ras | n ∈ Z, Σ being a finite sum with r, s ∈ R} is the two-sided ideal (a).
It may be noted that the left (right, two-sided) ideal generated by the non-
empty subset M of R is the set-theoretically smallest left (right or two-sided)
ideal in R which contains M.
Problem 7:
If A, B be two left (right, two-sided) ideals in a ring R, not necessarily commutative, then show that A + B = {a + b | a ∈ A, b ∈ B} is also a left (right, two-sided) ideal in R, which contains A ∪ B.
Problem 8:
If A, B be two ideals in a ring R, let AB denote the set of all elements that can be written as finite sums of elements of the form ab with a ∈ A, b ∈ B. Then prove that AB is also an ideal in R and that AB ⊆ A ∩ B.
Note: The above Problems 6, 7, 8, which concern the algebra of ideals, are almost straightforward and follow from the definition of ideals. So you may try to do them all.
We saw earlier that the definition of an ideal was motivated by the module and multiple properties of the kernel of any ring homomorphism. Just as for groups, where we mentioned the close relationship that exists between group homomorphisms, normal subgroups and factor groups, we proceed to explain, in the case of a ring, a similar relationship that exists between ring homomorphisms, ideals and factor rings (or residue-class rings). We now introduce the concept of a factor ring (or quotient ring or residue-class ring).
Factor ring or quotient ring: Let R be a ring and A be an ideal (i.e. a two-sided ideal) of R. A being a (normal) subgroup of the additive group <R, +>, we can talk of cosets A + x, x ∈ R. These cosets, which are now called residue classes of A, form, as we know from group theory, a group under the well-defined addition given by (A+x)+(A+y) = A+(x+y) for x, y ∈ R. Now we define multiplication of residue classes by setting (A+x)(A+y) = A+xy. That the above definition of multiplication of residue classes is well-defined can be seen as follows:
Suppose A + x = A + x', A + y = A + y', where x, y, x', y' ∈ R.
Then we show (A + x)(A + y) = (A + x')(A + y'), i.e.
A + xy = A + x'y'.
Now A + x = A + x' ⟹ x - x' ∈ A, ….(1)
A + y = A + y' ⟹ y - y' ∈ A. ….(2)
Since A is an ideal, x - x' ∈ A ⟹ (x - x')y ∈ A
⟹ xy - x'y ∈ A. ….(3)
Again, y - y' ∈ A ⟹ x'(y - y') ∈ A
⟹ x'y - x'y' ∈ A. ….(4)
The above implications (1), (2), (3), (4) show that
(xy - x'y) + (x'y - x'y') ∈ A,
i.e. xy - x'y' ∈ A,
whence A + xy = A + x'y', as desired.
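The well-definedness argument can be mirrored numerically in Z with the ideal A = 5Z (an added sketch, not part of the text):

```python
# Changing the representatives x, y of the cosets A + x, A + y by
# multiples of 5 never changes the coset of the product xy, i.e.
# xy - x'y' is always divisible by 5.
A = 5
for x in range(-6, 7):
    for y in range(-6, 7):
        for s in (-2, 1, 3):        # x' = x + 5s
            for t in (-1, 2):       # y' = y + 5t
                xp, yp = x + A * s, y + A * t
                assert (x * y - xp * yp) % A == 0

print("the coset A + xy does not depend on the representatives chosen")
```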
Now we make the formal definition of the factor ring R/A.
Definition 13:
Let R be any ring, not necessarily commutative, and let A be a two-sided ideal of R. Let R/A = {A + x | x ∈ R} be the set of all residue classes of A (or modulo A). Define addition + and multiplication . on R/A by setting, for x, y ∈ R, (A+x)+(A+y) = A+(x+y) and (A+x)(A+y) = A+xy. Then R/A becomes a ring under the above operations of addition and multiplication of residue classes of A, and this ring is called the quotient ring or factor ring or residue-class ring of R modulo A.
We already know from group theory that, if we restrict ourselves to the additive structures only, then the mapping φ: R → R/A defined by φ(x) = A + x for x ∈ R is the canonical (group) homomorphism. The same mapping turns out to be a ring epimorphism also if we take the ring structures of R and R/A into consideration. Thus we have the following important example of a ring epimorphism.
For any two-sided ideal A in R the mapping φ: R → R/A defined by φ(x) = A + x is a ring epimorphism, known as the canonical (ring) homomorphism. In fact, for x, y ∈ R, φ(x + y) = A + (x + y) = (A + x) + (A + y) = φ(x) + φ(y) and φ(xy) = A + xy = (A + x)(A + y) = φ(x).φ(y), so that φ preserves both the operations of addition and multiplication. Moreover φ is clearly surjective, as every element in R/A is of the form A + x and φ(x) = A + x.
Applying the same procedure as for groups one can establish the following (analogous) fundamental ring homomorphism theorem:
Theorem 1:
Let φ: R → R' be a homomorphism of a ring R onto a ring R' and let K be the kernel of φ. Then K is a two-sided ideal of R such that the residue-class ring R/K is isomorphic with R'. (In other words, any homomorphic image of a ring R is isomorphic with a residue-class ring R/K, where K is the kernel of the homomorphism.) Conversely, if K is any two-sided ideal of R, then there exists a homomorphism (viz. the canonical homomorphism φ: R → R/K) of the ring R onto the ring R/K such that the kernel of the homomorphism φ is precisely K.
(Write down the proof of this basic theorem yourself!)
The following theorem is analogous to the group-theoretical results,
namely Lemma 9 and Theorem 5 of Lesson 3 and its proof is an exact verbatim
translation of the proofs for the above mentioned group-theoretical results into
the language of rings. So this theorem is also stated without proof and you are
invited to write down the proof.
Theorem 2:
Let φ: R → R' be a homomorphism of a ring R onto a ring R' and let K be the kernel of φ. If A' be any sub-ring of R', let A = {x ∈ R | φ(x) ∈ A'}. Then A is a sub-ring of R containing K, and A is an ideal in R iff A' is an ideal in R'. Moreover, there is a one-to-one correspondence between the ideals A' of R' and the ideals A (as defined earlier) of R which contain K. Further, if the ideals A' of R' and A of R correspond under the above mentioned one-to-one correspondence, then R/A is isomorphic with R'/A'.
We conclude this lesson with some standard examples of ring
homomorphisms and certain observations on them:
Example (i):
Let R and R' be two arbitrary rings and let φ: R → R' be the mapping defined by φ(x) = 0 for all x ∈ R. It is quite easy to see that φ is (trivially) a homomorphism. This homomorphism is called the zero-homomorphism and is never an epimorphism unless R' is the trivial ring consisting of the zero element only. Similarly φ is never a monomorphism unless R is the trivial ring consisting of the zero element only.
Example (ii):
Let R be a ring and I: R → R be the identity mapping given by I(x) = x for all x ∈ R. Then clearly I is an isomorphism and is really the identity automorphism of the ring R.
Note: An automorphism of a ring R is an isomorphism of R onto itself.
Example (iii):
Let R = Z[i] be the ring (in fact, an integral domain) of all Gaussian integers a + bi, where a, b ∈ Z. Consider the mapping φ: R → R defined by φ(a + bi) = a - bi. Then one can easily check that φ is an automorphism of R, which is not the identity automorphism. So also, if R is the field C of complex numbers, the mapping which maps every complex number z into its conjugate $\bar{z}$ is an automorphism, which is not the identity automorphism. Similarly, if R is the ring (in fact, an integral domain) of all real numbers of the form a + b√c, where a, b ∈ Z and c is a fixed positive integer which is not a perfect square (vide: Example 5), the mapping φ: R → R defined by φ(a + b√c) = a - b√c can be easily verified to be an automorphism of R, which is not the identity automorphism.
Example (iv):
If, as usual, Z denotes the ring (in fact, an integral domain) of integers and Zn (where n is a fixed positive integer) denotes the ring of residue classes modulo n (vide: Example 7), consider the mapping φ: Z → Zn defined by φ(m) = $\bar{m}$, where $\bar{m}$ denotes the residue class modulo n determined by the integers which are congruent to m modulo n. Since φ(λn + m) = $\bar{m}$ for all λ ∈ Z, this mapping φ is certainly not one-to-one, though it is onto. Since one can easily check that $\bar{p} + \bar{q} = \overline{p+q}$ and $\bar{p}\bar{q} = \overline{pq}$, where p, q ∈ Z, the mapping φ is a ring epimorphism, whose kernel is the ideal nZ of Z. Thus Zn is a homomorphic image of Z, with kernel nZ, whence, using the basic homomorphism Theorem 1, we get that Z/nZ ≅ Zn. Earlier we had mentioned that Zn is a field if and only if the integer n is a prime. Thus by choosing a non-prime n we find that the homomorphic image of an integral domain, viz. Z, need not be an integral domain, as Zn contains non-trivial zero divisors when n is not a prime. So also, one can show that the homomorphic image of a ring with non-trivial zero divisors can be a ring without non-trivial zero divisors.
Example (v):
Let R be the set of all continuous real-valued functions defined on the closed unit interval I = [0, 1]. Define addition '+' and multiplication '.' on R, as usual, by setting, for f, g ∈ R, (f+g)(x) = f(x)+g(x) and (f.g)(x) = f(x)g(x) for all x ∈ I. Then, since the sum and product of two real-valued continuous functions on I are also real-valued continuous functions, one can easily check that R becomes a commutative infinite ring with identity under the above defined operations of + and '.'. Now let F be the field of real numbers. Define a mapping φ: R → F by setting φ(f) = f(α), where f ∈ R and α is a fixed real number in I (hence f(α) is the value of the function f = f(x) when x = α). Then it is not difficult to verify that φ is a ring epimorphism, whose kernel consists of all functions in R which vanish at x = α. That the mapping φ is surjective follows from the fact that the ring R contains all constant functions.
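Example (v) can be imitated directly in code (an added sketch; the evaluation point α = 0.5 and the two sample functions are our choices):

```python
# Evaluation at a fixed point alpha respects both ring operations:
# phi(f + g) = phi(f) + phi(g) and phi(f.g) = phi(f) * phi(g).
alpha = 0.5
def phi(f):
    return f(alpha)

f = lambda x: x * x + 1
g = lambda x: 3 * x

f_plus_g = lambda x: f(x) + g(x)    # the ring sum f + g
f_times_g = lambda x: f(x) * g(x)   # the ring product f.g

print(phi(f_plus_g) == phi(f) + phi(g))   # True
print(phi(f_times_g) == phi(f) * phi(g))  # True
```

The kernel of this φ consists of exactly those functions vanishing at α, as noted in the example.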
Solved problems:
1. Prove that any field is an integral domain though not vice-versa.
Solution:
It suffices to show that a field F has no zero divisors.
Let a, b ∈ F such that a ≠ 0 and ab = 0.
ab = 0 ⟹ a⁻¹(ab) = a⁻¹0 (a ≠ 0 ⟹ a⁻¹ exists)
⟹ (a⁻¹a)b = 0
⟹ 1.b = 0
⟹ b = 0.
Similarly let ab = 0 and b ≠ 0.
Since b ≠ 0, b⁻¹ exists and we have
b⁻¹(ab) = b⁻¹0
⟹ (b⁻¹b)a = 0 (F is commutative)
⟹ 1.a = 0
⟹ a = 0.
Thus F has no zero divisors.
Hence every field is an integral domain.
But the converse is not true.
For example the ring of integers which is an integral domain is not a
field since the only invertible elements of the ring of integers are 1 & -1 .
2. A ring R is said to be a Boolean ring if a² = a for all a ∈ R. Prove that any Boolean ring is of characteristic 2 and is commutative.
Solution:
Let R be a Boolean ring.
Now a ∈ R ⟹ a + a ∈ R
⟹ (a+a)² = a+a (given)
⟹ (a+a)(a+a) = a+a
⟹ (a+a)a + (a+a)a = a+a (left distributive law)
⟹ a² + a² + a² + a² = a+a (right distributive law)
⟹ (a+a) + (a+a) = a+a (a² = a)
⟹ (a+a) + (a+a) = (a+a) + 0
⟹ a+a = 0 (left cancellation law)
∴ Any Boolean ring is of characteristic 2.
Now (a+b)² = a+b
⟹ (a+b)(a+b) = a+b
⟹ (a+b)a + (a+b)b = a+b (left dist. law)
⟹ a² + ba + ab + b² = a+b (right dist. law)
⟹ (a + ba) + (ab + b) = a+b (a² = a; b² = b)
⟹ (a+b) + (ba+ab) = (a+b) + 0 (commutativity & associativity of addition)
⟹ ba + ab = 0 (left cancellation law)
⟹ ba + ab = ba + ba (R is of characteristic 2)
⟹ ab = ba (left cancellation law)
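A concrete Boolean ring on which both conclusions can be observed (an added sketch, not part of the text): the subsets of a finite set, with symmetric difference as addition and intersection as multiplication.

```python
from itertools import chain, combinations

# All subsets of {0, 1, 2}; + is symmetric difference (^), . is
# intersection (&). Every element is idempotent: A & A == A.
universe = (0, 1, 2)
subsets = [frozenset(c) for c in
           chain.from_iterable(combinations(universe, r) for r in range(4))]

print(all(a & a == a for a in subsets))                       # a.a = a
print(all(a ^ a == frozenset() for a in subsets))             # a + a = 0 (characteristic 2)
print(all(a & b == b & a for a in subsets for b in subsets))  # ab = ba (commutative)
```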
3. If R is a ring and A is a non-empty subset of R, let r(A) = {x ∈ R | ax = 0 for all a ∈ A}. Show that r(A) is a right ideal of R.
Solution:
0 ∈ R is such that a0 = 0 for all a ∈ A.
∴ 0 ∈ r(A), so r(A) is non-empty.
Let x1, x2 ∈ r(A) ⟹ ax1 = 0; ax2 = 0 for all a ∈ A.
Now a(x1 - x2) = ax1 - ax2 = 0 - 0 = 0 for all a ∈ A.
∴ x1 - x2 ∈ r(A)
∴ r(A) is a group under addition.
In order to show that r(A) is a right ideal of R, it suffices to show that x ∈ r(A), y ∈ R ⟹ xy ∈ r(A). But
x ∈ r(A) ⟹ ax = 0 for all a ∈ A
⟹ a(xy) = (ax)y
= 0y
= 0
Hence xy ∈ r(A).
Thus x ∈ r(A), y ∈ R ⟹ xy ∈ r(A).
∴ r(A) is a right ideal of R.
4. If A be any ideal in a ring R; let [R:A] denote the det
x  R / rx  A, for all r  R  . Prove that [R:A] is an ideal of R and that
A  [ R : A] .
Solution:
Since 0 ∈ R is such that r0 = 0 ∈ A for all r ∈ R, [R:A] is non-empty.
Let x₁, x₂ be two elements of [R:A].
Then rx₁ ∈ A for all r ∈ R and rx₂ ∈ A for all r ∈ R.
Since A is an ideal,
rx₁ ∈ A, rx₂ ∈ A ⇒ rx₁ − rx₂ ∈ A
⇒ r(x₁ − x₂) ∈ A
⇒ (x₁ − x₂) ∈ [R:A].
Let x be any element of [R:A] and s be any element of R. Then
rx ∈ A for all r ∈ R
⇒ (rx)s ∈ A for all r ∈ R (A is an ideal)
⇒ r(xs) ∈ A for all r ∈ R
⇒ xs ∈ [R:A].
Also rx ∈ A for all r ∈ R
⇒ (rs)x ∈ A for all r ∈ R (replacing r by rs)
⇒ r(sx) ∈ A for all r ∈ R
⇒ sx ∈ [R:A].
Thus x ∈ [R:A], s ∈ R ⇒ xs, sx ∈ [R:A]. Hence [R:A] is an ideal of R.
Also
y ∈ A ⇒ ry ∈ A for all r ∈ R (A is an ideal)
⇒ y ∈ [R:A]
∴ A ⊆ [R:A].
5. If R is a ring and A is a left ideal of R, let
λ(A) = {x ∈ R : xa = 0 for all a ∈ A}. Prove that λ(A) is a two-sided ideal
of R.
Solution:
Since 0 ∈ R is such that 0a = 0 for all a ∈ A, λ(A) ≠ ∅.
Let x₁, x₂ ∈ λ(A). Then x₁a = 0 for all a ∈ A,
and x₂a = 0 for all a ∈ A.
We have (x₁ − x₂)a = x₁a − x₂a = 0 − 0 = 0 for all a ∈ A
∴ x₁ − x₂ ∈ λ(A).
Let x ∈ λ(A); r ∈ R.
Then xa = 0 for all a ∈ A
⇒ r(xa) = r0 for all a ∈ A
⇒ (rx)a = 0 for all a ∈ A
⇒ rx ∈ λ(A).
Also, since A is a left ideal,
ra ∈ A for all r ∈ R, a ∈ A
⇒ x(ra) = 0 (∵ x ∈ λ(A))
i.e. (xr)a = 0 for all a ∈ A
⇒ xr ∈ λ(A).
Hence λ(A) is a two-sided ideal of R.
6. If R is a ring with unit element 1 and φ is a homomorphism of R onto R′,
prove that φ(1) is the unit element of R′.
Solution: Since φ is a homomorphism of R onto R′, R′ is a homomorphic
image of R.
If 1 is the unit element of R, then φ(1) ∈ R′.
Let a′ ∈ R′. Then a′ = φ(a) for some a ∈ R (φ is onto).
φ(1)a′ = φ(1)φ(a)
= φ(1·a)
= φ(a)
= a′.
Also
a′φ(1) = φ(a)φ(1)
= φ(a·1)
= φ(a)
= a′.
Hence φ(1) is the unit element of R′.
7. If R is a ring with unit element 1 and φ is a homomorphism of R into an
integral domain R′ such that Ker φ ≠ R, prove that φ(1) is the unit element of
R′.
Solution:
φ is a homomorphism of the ring R into the integral domain R′, and
Ker φ = {x : x ∈ R and φ(x) = 0 ∈ R′}.
Since Ker φ ≠ R, there exists a ∈ R such that φ(a) ≠ 0 ∈ R′.
We have φ(1)φ(a) = φ(1·a) = φ(a).
Let b′ be any element of R′.
Now φ(a)[φ(1)b′] = [φ(a)φ(1)]b′
= [φ(1)φ(a)]b′ (R′ is commutative)
= φ(a)b′
⇒ φ(a)[φ(1)b′] − φ(a)b′ = 0
⇒ φ(a)[φ(1)b′ − b′] = 0
⇒ φ(1)b′ − b′ = 0 (∵ φ(a) ≠ 0 and R′ is without zero divisors)
⇒ φ(1)b′ = b′ = b′φ(1) (R′ is commutative)
∴ φ(1) is the unit element of R′.
Exercise
1. An element a in a ring R is said to be nilpotent if aⁿ = 0 for some positive
integer n. Does the ring Z₁₀₈ of all residue classes modulo 108 have non-zero
nilpotents? If so, what are they?
2. If R is a non-commutative ring with a unique left identity, i.e. an element e
such that ea = a for all a ∈ R, then show that the left identity is also the
identity for R. (Hint: consider (ae − a + e)b for a, b ∈ R.)
3. If R is any ring, prove that the centre Z(R) = {z ∈ R : xz = zx for all x ∈ R} is
a sub-ring of R. Determine Z(R) when R is the ring of all n × n real
matrices and show that it is the set of all scalar matrices.
4. What can you say about the intersection of a left ideal and a right ideal of a
ring R? Substantiate your contention by means of suitable examples.
5. Let R be a ring with identity 1. Define new operations ⊕ and ⊙ on R by
setting a ⊕ b = a + b − 1, a ⊙ b = a + b − ab, and let R̄ = ⟨R, ⊕, ⊙⟩.
Prove that R̄ is also a ring with identity and that R is isomorphic with R̄.
6. Let R be any ring and R × R = {(x, y) : x, y ∈ R} be the Cartesian product
of R with itself. Define componentwise addition and multiplication on
R × R by setting (a,b) + (c,d) = (a+c, b+d) and (a,b)·(c,d) = (ac, bd). Prove
that, under these operations, R × R is a ring with non-trivial divisors of zero,
assuming that R has at least one non-zero element. Show further that R × R
is a ring which is not an integral domain and which, however, has
a homomorphic image that is an integral domain.
7. Give the complete proofs of Theorems 1 and 2, stated in this lesson.
8. Complete the proof for the statements, left unproved, in Example V,
concerning the ring of all real-valued continuous functions defined on I. Do
the same for Example (iii).
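For Exercise 1 above, a quick computation settles the question (a sketch, not a worked solution from the text): since 108 = 2²·3³, a residue x is nilpotent modulo 108 exactly when every prime dividing 108 divides x, i.e. when 6 | x.

```python
# Nilpotent elements of Z_108. Since 108 = 2^2 * 3^3, any multiple of 6
# cubes to a multiple of 216, hence to 0 mod 108; so exponent 3 suffices.
n = 108
nilpotents = [x for x in range(n) if any(pow(x, k, n) == 0 for k in range(1, 4))]
assert nilpotents == list(range(0, n, 6))   # exactly the multiples of 6
print(len(nilpotents) - 1, "non-zero nilpotents")
```

So Z₁₀₈ has 17 non-zero nilpotents, namely 6, 12, …, 102.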
In the last Section, we introduced the basic and important concepts of
ideals, residue class rings and homomorphisms, and also mentioned
their inter-relations. In this Section we propose to study certain special types of
ideals and rings and discuss some of their important properties.
Theorem 1 on ring homomorphisms (given in the last lesson) gives
us complete information about all the possible homomorphic images of a
given ring. Applying this theorem and recalling the fact that the only ideals in
any division ring are the trivial ones, viz. the null ideal (0) and the unit ideal R,
we can state that any division ring (and in particular, any field) has only trivial
homomorphic images, namely itself and the zero ring, consisting of only the
zero element. Thus a field is the most desirable kind of ring, in that it cannot be
simplified further by applying a homomorphism to it. Now we define two types
of ideals in a commutative ring R and indicate their inter-relations, apart from
giving suitable examples to illustrate these ideals; it happens that the residue class
ring modulo one of these types of ideals is always a field.
Definition 1:
An ideal P in a commutative ring R is said to be a prime ideal if P is
not the unit ideal and, whenever ab ∈ P, where a, b ∈ R, then at least one of the
factors a or b is in P.
Definition 2:
An ideal M in a ring R is said to be a maximal ideal of R if M is not the
unit ideal and whenever N is an ideal in R such that M ⊆ N ⊆ R, then either
N = M or N = R.
The meaning of maximal ideal (given above) is that an ideal of R is
maximal if it is impossible to squeeze an ideal between it and the whole ring R.
It can be proved, by using Zorn's Lemma or the Axiom of Choice, that any
commutative ring with identity has at least one maximal ideal. In fact, any proper ideal
is contained in some maximal ideal, and this can be proved. Moreover, a ring may
possess many maximal ideals, and this fact will be illustrated with the help of the ring
Z of integers.
Theorem 1:
Let R be a commutative ring. Then an ideal P of R is a prime ideal iff
the residue class ring R/P is an integral domain.
Proof:
Suppose R/P is an integral domain. We prove that P is a prime ideal. For
this it suffices to show that xy ∈ P implies that at least one of the factors x or y is
in P (by the definition of a prime ideal). Now xy ∈ P implies xy + P = P, which, in
turn, implies (x + P)(y + P) = P (here, note that the elements of R/P are residue classes
x + P where x ∈ R, and that (x + P)(y + P) = xy + P). But P is the zero element in the ring
R/P, which is assumed to be an integral domain. Hence either x + P = P or y + P = P,
which means that either x ∈ P or y ∈ P, as desired. Therefore P is a prime
ideal. Conversely, assume that P is a prime ideal. We show that R/P is an integral
domain. For this it is enough if we show that (x + P)(y + P) = P (the zero element
of R/P) implies that either x + P = P or y + P = P. Now (x + P)(y + P) = P implies xy + P = P,
whence xy ∈ P. Since P is a prime ideal, this means that either x or y is in P,
whence either x + P or y + P is the same as P, as required. Thus R/P becomes an
integral domain. (Note that R being commutative, R/P is also a commutative ring,
and so R/P is a commutative integral domain, as required in the
definition of an integral domain.)
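Theorem 1 can be seen at work in the ring Z (an illustration, not part of the proof): Z/5Z has no zero divisors, since 5Z is a prime ideal, while Z/6Z does, since 2·3 ∈ 6Z with neither factor in 6Z.

```python
# Zero divisors of Z/nZ: pairs of non-zero residues whose product is 0.
def zero_divisors(n):
    return [(a, b) for a in range(1, n) for b in range(1, n) if (a * b) % n == 0]

assert zero_divisors(5) == []       # 5Z prime  => Z/5Z is an integral domain
assert (2, 3) in zero_divisors(6)   # 6Z not prime => Z/6Z has zero divisors
```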
Before obtaining the important result concerning the "iff" condition for
an ideal to be a maximal ideal of a commutative ring with identity (vide:
Theorem 2), we list a few examples of prime ideals and non-prime ideals in the
ring Z; incidentally, these examples will justify the term "prime ideal".
Take the ring Z of integers. If A is any non-null ideal of Z, we show
that A is a principal ideal, generated by some suitable integer a ∈ A. Since A is
a non-null ideal, A contains non-zero integers. If the non-zero integer m ∈ A,
then, by the definition of an ideal, it follows that −m ∈ A. Of the two integers m and −m,
one of them is certainly positive. Thus A contains positive integers. Let a be the
smallest of the positive integers in A. (Such an integer a exists, as the set of all
positive integers is known to be well ordered.) We now claim that this positive
integer a generates the ideal A. For, if m ∈ A, then, by the division algorithm
available in Z, we can find two integers q and r such that m = qa + r, where
0 ≤ r < a. Since m, qa ∈ A and A is an ideal, we find that m − qa ∈ A, i.e.,
r ∈ A. Hence r = 0; for otherwise there exists a positive integer r < a in A,
contradicting the fact that a is the least positive integer in A. Thus r = 0 and we
get m = qa, which means that A = aZ is the principal ideal in Z generated by a.
Because of this property of Z, viz. that any ideal in Z is a principal ideal, Z is a
principal ideal ring; this concept of a principal ideal ring will be defined
subsequently. Now we show that a non-null ideal P in Z is a prime ideal iff it is
generated by a prime number. Suppose P = pZ is the ideal generated by the prime p
and suppose ab ∈ P where a, b ∈ Z. Then p | ab, which implies (since p is a prime)
that either p | a or p | b. This means that either a or b is in P = pZ. Hence, by definition,
P becomes a prime ideal. On the other hand, any ideal A of Z generated by a non-prime
is not a prime ideal. In fact, suppose A = aZ is an ideal generated by a, which is not a
prime; then we can choose two integers x and y such that a | xy, yet a does not
divide either x or y. For example, suppose a = 12 (a non-prime); we can choose
x = 4, y = 9 so that a | xy, though a divides neither x nor y. Hence the ideal A
cannot be a prime ideal, as xy ∈ A but neither x nor y is in A. Thus we see
that any non-null ideal in Z is a prime ideal iff it is generated by a prime number. This
property is one of the important properties of any Euclidean ring, a concept to
be defined subsequently, and we will see later that Z is a Euclidean ring.
Similarly, in the case of the ring Z[x] of all polynomials in x over Z (the ring Z[x]
will be considered in the next lesson), any ideal is a prime ideal iff it is an ideal
generated by an irreducible polynomial.
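Both facts from this discussion can be checked on concrete ideals of Z (a sketch with sample integers chosen for illustration): the ideal generated by 8 and 12 is principal, generated by its least positive element, and 12Z is not prime.

```python
# (1) The ideal of Z generated by 8 and 12 (sampled over a finite range of
#     coefficients) is generated by its least positive element.
span = {8 * r + 12 * s for r in range(-20, 21) for s in range(-20, 21)}
a = min(x for x in span if x > 0)
assert a == 4                          # the least positive element, gcd(8, 12)
assert all(x % a == 0 for x in span)   # every element is a multiple of a

# (2) 12Z is not a prime ideal: 4 * 9 = 36 lies in 12Z, but neither 4 nor 9 does.
assert (4 * 9) % 12 == 0 and 4 % 12 != 0 and 9 % 12 != 0
```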
Now we will obtain the "iff" condition for an ideal M to be a maximal
ideal in R. In this connection we have the following important theorem.
Theorem 2:
Let R be a commutative ring with identity. Then an ideal M in R is a
maximal ideal iff the residue class ring R/M is a field.
Proof:
Suppose M is an ideal of R such that the residue class ring R/M is a field.
Since a field has only two ideals, viz. the null ideal and the unit ideal, the
residue class ring R/M has only the two ideals, viz. the zero ideal (M) and the unit
ideal R/M. Now by Theorem 2 in Lesson 5, there is a one-to-one correspondence
between the set of ideals of R/M and the set of ideals of R which contain M. The
ideal M of R corresponds to the zero ideal (M) of R/M, and the unit ideal R of R
corresponds to the unit ideal R/M of R/M. Hence there is no ideal between M and
R other than these two, whence it follows that M is a maximal ideal.
Conversely, suppose M is a maximal ideal. We show that R/M is a field. By the
one-to-one correspondence mentioned above, it follows that (since there is no
ideal between M and R other than these two) the residue class ring R/M has only
two ideals, namely the null ideal (M) and the unit ideal R/M. Hence (by Problem 5,
Lesson 5) R/M is a division ring, and since R is a commutative ring, R/M turns out
to be even a field. This concludes the proof of the theorem.
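In Z, Theorem 2 says that Z/nZ is a field exactly when nZ is maximal. A small check (sample moduli chosen for illustration) tests the field property by looking for inverses of every non-zero residue:

```python
# Z/nZ is a field iff every non-zero residue has a multiplicative inverse.
def is_field(n):
    return all(any((a * b) % n == 1 for b in range(n)) for a in range(1, n))

assert is_field(7)         # 7Z is maximal in Z, so Z/7Z is a field
assert not is_field(12)    # 12Z is not maximal (12Z is contained in 4Z), no field
```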
Corollary: In any commutative ring with identity, every maximal ideal is a
prime ideal.
Proof:
Let R be a commutative ring with identity and M a maximal ideal of R.
Then, by the above theorem, R/M is a field, and hence an integral domain. Therefore, by
applying Theorem 1, M turns out to be a prime ideal and the corollary is
proved.
Note:
Even though any maximal ideal (in a commutative ring with identity) is
always a prime ideal, the converse of the statement is false. In fact, we can give
an example of a commutative ring with identity in which not all prime ideals
are maximal. However, in the case of the ring Z of integers and also in the case
of any finite commutative ring (with identity), all prime ideals are also
maximal. These facts are proved in the following problems.
Problem 1:
Give an example of a commutative ring with identity in which not all
prime ideals are maximal.
Solution:
Let Z[x] (as usual) denote the set of all polynomials in the indeterminate
(or variable) x over Z (i.e. with coefficients from the ring Z of integers). One
can easily see that Z[x] becomes a commutative ring with identity, under the
usual (high-school) operations of addition and multiplication of polynomials.
Let P denote the subset of Z[x] consisting of all polynomials in Z[x] which are
divisible by x, i.e. polynomials which lack the constant term. One can easily
verify that P is an ideal in Z[x]; in fact it is the principal ideal generated by the
polynomial x. This ideal P is moreover a prime ideal, because, whenever
uv ∈ P, where u, v ∈ Z[x], it is easy to see that at least one of the polynomials u
or v must lack the constant term; i.e. at least one of the polynomials u or v is in P.
Thus P becomes a prime ideal in Z[x]. However, P is not a maximal ideal, as
we show presently. Let N be the set of all polynomials in Z[x] whose constant
terms are either zero or an even integer. It is not difficult to check that N is an
ideal (even a prime ideal) in Z[x] with the property that P ⊂ N ⊂ Z[x], so
that N turns out to be a proper ideal in Z[x] which contains P properly. Thus P
cannot be a maximal ideal, and Z[x] is an example of a commutative ring with
identity in which not all prime ideals are maximal.
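The chain P ⊂ N ⊂ Z[x] can be made concrete (an illustrative sketch, representing a polynomial in Z[x] by its list of coefficients [c₀, c₁, …]): membership in P and N depends only on the constant term c₀.

```python
# P = polynomials with zero constant term; N = polynomials whose constant
# term is even. Witnesses show each containment is proper.
def in_P(p):
    return p[0] == 0

def in_N(p):
    return p[0] % 2 == 0

x_poly = [0, 1]   # the polynomial x
two = [2]         # the constant polynomial 2
one = [1]         # the constant polynomial 1

assert in_P(x_poly) and in_N(x_poly)   # x lies in both P and N
assert in_N(two) and not in_P(two)     # 2 witnesses that P is properly inside N
assert not in_N(one)                   # 1 witnesses that N is a proper ideal
```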
Remark:
The ideal P, referred to above, is properly contained in many other
proper ideals of Z[x]. In fact, if p is any prime and N_p denotes the set of all
polynomials in Z[x] whose constant terms are divisible by p, one can easily
check that N_p is a proper ideal (even a prime ideal) in Z[x] which contains P
properly. This ideal N_p is the ideal generated by the set {x, p} in Z[x].
Problem 2:
Prove that in the ring of integers all prime ideals are maximal.
Solution:
Let P be any prime ideal in the ring Z of integers. Then it is known that
there exists a prime number p which generates the ideal P, i.e. P = pZ. Now by
Example (iv), given in Lesson 5, we know that Z/pZ ≅ Z_p, where Z_p is the ring
of residue classes modulo p. Since p is a prime, Z_p is a field (vide: Lesson 5).
Hence Z/pZ is a field and so, by Theorem 2, the ideal pZ, i.e. P, becomes a
maximal ideal.
Note:
The above problem can also be proved from elementary considerations,
without using Theorem 2 or the basic homomorphism theorem of ring theory.
Problem 3:
If R is any finite commutative ring (with identity), then any prime ideal
of R is also maximal.
Solution:
Let P be any prime ideal of the ring R. Then, by Theorem 1, R/P is an
integral domain. Since R is a finite ring, R/P is also a finite integral domain and
hence becomes a field (vide Lemma 3 in Lesson 5). Hence, by applying
Theorem 2, we get that P is even a maximal ideal, R being a commutative ring
with identity.
Note:
One can show, by means of a suitable example, that Theorem 2 is false
when the ring R is either not commutative or does not possess an identity element.
However, one can show that, if R is a ring, not necessarily commutative, nor
possessing an identity element, and if M is a maximal ideal of R, then R/M is a
division ring.
Problem 4:
Let R be the ring of all real-valued continuous functions defined on
the closed unit interval (see Example V, Lesson 5). Prove that
M_r = {f(x) ∈ R : f(r) = 0} is a maximal ideal of R, where 0 ≤ r ≤ 1.
[Note: It can be shown that every maximal ideal of R is of this form
M_r, so that the maximal ideals of R are in one-to-one correspondence with the
points of the unit interval.]
Solution:
That M_r is an ideal of R is easy and straightforward to see: by
Example V, Lesson 5, M_r is the kernel of the epimorphism ψ_r : R → F
defined by setting ψ_r(f) = f(r), where f = f(x) ∈ R and F is the field of real
numbers, and so M_r is an ideal of R. To show that M_r is a maximal ideal of R, it
is enough if we show that whenever N is an ideal of R such that N contains M_r
and N ≠ M_r, then N = R. Now, since N ≠ M_r, there exists a function g(x) ∈ N,
g(x) ∉ M_r. Since g(x) ∉ M_r, g(r) ≠ 0; say g(r) = α ≠ 0. Take
h(x) = g(x) − α. Then h(x) ∈ R is such that h(r) = g(r) − α = 0, whence
h(x) ∈ M_r ⊆ N. Thus both g(x), h(x) ∈ N, so that α = g(x) − h(x) ∈ N, N
being an ideal of R. Since α ≠ 0, α⁻¹ exists in R and 1 = α·α⁻¹ ∈ N. Hence for
any element t(x) ∈ R, t(x) = t(x)·1 ∈ N, which means that R ⊆ N; i.e.
N = R, as required. Hence M_r is a maximal ideal.
The above problems have furnished us with a large number of maximal
ideals in the ring Z of integers and the ring R of all real-valued continuous
functions defined on I, the closed unit interval.
The field of quotients of an integral domain:
We are familiar with the method by which the set Q of rational numbers
is constructed out of the set Z of integers. Remembering that Q is a field and Z
is an integral domain, we have therefore a construction of a field out of an
integral domain such that the constructed field contains, as a subdomain, the
given integral domain. A similar procedure is adopted so as to "imbed a given
integral domain in a field", as per the following definition.
Definition 3:
A ring R is said to be imbedded in a ring R′ if there exists an
isomorphism of R onto a subring of R′. (In other words, R′ contains an
isomorphic image of R as a subring.) If R and R′ possess identities e and e′,
it is required that the isomorphism map e onto e′.
We now have the following important and fundamental theorem.
Theorem 3: Every integral domain can be imbedded in a field.
Proof:
Let R be any integral domain. Let M be the set of all ordered pairs (a, b),
where a, b ∈ R and b ≠ 0. We now define a relation ~ on M by setting
(a, b) ~ (c, d) if and only if ad = bc. We show that this relation is an
equivalence relation. For this we need only check whether this relation is
reflexive, symmetric and transitive, i.e. whether, for (a, b), (c, d), (e, f) ∈ M: (i)
(a, b) ~ (a, b); (ii) (a, b) ~ (c, d) ⇒ (c, d) ~ (a, b); and (iii) (a, b) ~ (c, d) and
(c, d) ~ (e, f) ⇒ (a, b) ~ (e, f). Now (i) and (ii) follow easily from the definition of the
relation and the fact that the integral domain is a commutative ring. To see the
truth of (iii) we proceed as follows: (a, b) ~ (c, d) and (c, d) ~ (e, f) imply
ad = bc and cf = de. From ad = bc we get adf = bcf, and from cf = de we get bcf = bde.
So we get adf = bde, whence, using the commutativity of multiplication and the
cancellation law valid in the integral domain, we obtain af = be, which implies
(a, b) ~ (e, f), as required. Thus the relation ~ turns out to be an equivalence
relation and so defines a partition of M into equivalence classes. We denote, as
is customary, by [a, b] the equivalence class determined by the pair (a, b) ∈ M,
and we let F denote the set of all these equivalence classes. We
define suitable operations of addition and multiplication on F and make F into
the desired field in which the given integral domain is imbedded. In fact, we
define addition '+' and multiplication '.' on F as below: for [a, b],
[c, d] in F, let [a, b] + [c, d] = [ad + bc, bd],
[a, b].[c, d] = [ac, bd].   … (1)
Since both b and d are non-zero elements of R and R is an integral domain,
bd ≠ 0, so that the symbols [ad + bc, bd] and [ac, bd] are permissible and they
belong to F. Since the above definitions of addition and multiplication on F
depend on the representatives that are chosen in the equivalence classes, we
need to check that the above definitions are well defined. For this we need to
examine whether [a, b] = [a′, b′] and [c, d] = [c′, d′] imply
[a, b] + [c, d] = [a′, b′] + [c′, d′] and
[a, b].[c, d] = [a′, b′].[c′, d′].
Now [a, b] = [a′, b′], [c, d] = [c′, d′] imply
(a, b) ~ (a′, b′), (c, d) ~ (c′, d′);
i.e. ab′ = a′b and cd′ = c′d.  … (2)
To prove that [a, b] + [c, d] = [a′, b′] + [c′, d′] we need to show that
[ad + bc, bd] = [a′d′ + b′c′, b′d′], i.e. (ad + bc)b′d′ = (a′d′ + b′c′)bd, i.e.
ad b′d′ + bc b′d′ = a′d′bd + b′c′bd.  … (3)
From (2) we get ab′dd′ = a′bdd′ and bb′cd′ = bb′c′d, whence, by
adding and using the commutativity of multiplication, we obtain the required
result (3). Similarly, to prove [a, b].[c, d] = [a′, b′].[c′, d′] we need to show that
[ac, bd] = [a′c′, b′d′], i.e. acb′d′ = a′c′bd. This follows from (2) by
multiplying the equations in (2) and using the commutativity of multiplication.
Thus we have verified that both the operations of addition and multiplication
defined on F by equations (1) are well-defined. Now we claim that F is a field
under these operations of addition and multiplication and that R can be
imbedded in F. In the first place, we can easily check that F is an abelian group
under addition. In fact, addition on F is easily seen to be commutative. It is also
associative; for, take [a, b], [c, d], [e, f] in F. Then an easy computation will show
that
([a, b] + [c, d]) + [e, f] = [(ad + bc)f + (bd)e, (bd)f] and
[a, b] + ([c, d] + [e, f]) = [a(df) + b(cf + de), b(df)].
Since the right-hand sides of the above equations are clearly equal, the
associativity of addition on F follows. Then the equivalence class
[0, b] ∈ F acts as the additive identity, as can be easily verified, and for any
[a, b] ∈ F, [−a, b] acts as its additive inverse,
since [a, b] + [−a, b] = [ab + b(−a), b²] = [0, b²] = [0, b],
the additive identity of F. Thus ⟨F, +⟩ is an
abelian group. Secondly, we examine the multiplicative structure of F and show
that the non-zero elements of F, viz. [a, b] where a, b ∈ R and a, b are both
non-zero, form a group under multiplication. In fact, the commutativity and
associativity of multiplication and the absence of non-trivial zero divisors in
the integral domain R force us to conclude that the non-zero elements of F form
a commutative semigroup. Further, the element [a, a] in F acts as the identity for
multiplication, since it can be verified that [a, a].[x, y] = [x, y].[a, a] = [x, y] for any
[x, y] ∈ F. Here we may note that [a, a] = [c, d] iff c = d (using the cancellation
laws valid in the integral domain R). Again, for any non-zero element [x, y] in
F (which means that both x and y are non-zero elements of R), there exists an
inverse, viz. [y, x] in F, such that [y, x].[x, y] = the identity [xy, xy]. Thus the non-
zero elements of F form a multiplicative abelian group. Thirdly, we can check
that multiplication in F is distributive over addition in F. Since multiplication in
F is commutative, one need check only one of the distributive laws, and we proceed to
check the left distributive law:
[a, b].([c, d] + [e, f]) = [a, b].[c, d] + [a, b].[e, f].  … (4)
The left-hand side of (4), on computation, reduces to [a(cf + de), b(df)].  … (5)
The right-hand side of (4), on the other hand, reduces to
[(ac)(bf) + (bd)(ae), (bd)(bf)] = [b(acf + ade), b(bdf)].  … (6)
Using the commutative and associative properties of multiplication in R, we
find that (5) and (6) are equal in F. Thus the multiplication in F is distributive over
addition in F, and we have therefore verified all the field axioms in the case of
F. Hence ⟨F, +, .⟩ becomes a field, as earlier claimed. To conclude the proof of
the theorem, it remains for us to show that the given integral domain R can be
imbedded in the field F. For this purpose we need to identify certain elements of F
with elements of R. In fact, let R′ = {[ab, b] ∈ F : a ∈ R}. We claim that R′ is a
subring of F which is isomorphic with R; once we establish this claim it
follows that R is embeddable in F. We note that [ab, b] = [x, y] iff aby = bx, iff ay = x
(using the cancellation law valid in R), so that all the elements of the form [ab, b],
where b is arbitrary and a is fixed, are equal, and only these are equal.
We identify the element [ab, b] of F with the element a of R, via the mapping φ defined by
φ(a) = [ab, b]. That the above mapping φ is well defined, one-to-one and onto R′
is clear. Further, φ(a + a′) = [(a + a′)b, b] = [ab, b] + [a′b, b] = φ(a) + φ(a′) and
φ(aa′) = [aa′b, b] = [ab, b].[a′b, b] = φ(a)φ(a′),
as can be verified. Thus φ preserves both the
operations in R, and hence φ is an isomorphism of R onto R′. Hence R is
imbedded in F, as required.
Note: The above proof of Theorem 3 appears to be very lengthy, and it is so
because we have included in the proof the verification of almost all the field
axioms in the case of F, though many of the verifications are routine and could
have been omitted. The field F constructed in the course of the proof is called
the field of quotients of R. When R = Z, the ring of integers, the field F is the
familiar field Q of rational numbers.
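A minimal sketch of this construction for R = Z, using exactly the equivalence and the operations (1) from the proof (the class name and representatives are illustrative only):

```python
# Field of quotients of Z: (a,b) ~ (c,d) iff ad = bc,
# [a,b] + [c,d] = [ad+bc, bd] and [a,b].[c,d] = [ac, bd].
class Quot:
    def __init__(self, a, b):
        assert b != 0          # second entry must be non-zero
        self.a, self.b = a, b

    def __eq__(self, other):   # equality of classes: (a,b) ~ (c,d) iff ad = bc
        return self.a * other.b == self.b * other.a

    def __add__(self, other):
        return Quot(self.a * other.b + self.b * other.a, self.b * other.b)

    def __mul__(self, other):
        return Quot(self.a * other.a, self.b * other.b)

half, third = Quot(1, 2), Quot(1, 3)
assert half + third == Quot(5, 6)
assert half * third == Quot(1, 6)
assert Quot(2, 4) == Quot(1, 2)    # different representatives, same class
assert Quot(3, 3) * half == half   # [a, a] acts as the multiplicative identity
```

The classes [a, b] behave exactly like the fractions a/b, which is why this construction yields Q when R = Z; Python's own fractions.Fraction is a normalized version of the same idea.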
Problem 5:
Let R be an integral domain and a, b ∈ R. Suppose aⁿ = bⁿ and
aᵐ = bᵐ, where m and n are two relatively prime positive integers. Then prove that a = b.
Solution:
In the first place we note that, under the conditions stated in the problem,
a = 0 if and only if b = 0. This follows from the fact that xᵐ = 0 for x ∈ R implies
x = 0, R being an integral domain. Thus when a = 0, b = 0 also, so that a = b = 0. So
let us assume that a ≠ 0, in which case b ≠ 0. We will prove that a = b in this
case also. Now, since the integers m and n are relatively prime, there exist
integers p and q such that pm + qn = 1. As m and n are both positive integers, of
the two integers p and q, one of them is positive and the other is negative. (If
q = 0, then pm = 1 forces m = 1, and a = b follows at once from aᵐ = bᵐ.) We may
assume, without loss of generality, that p is positive and q is negative. Let
q = −q′, where q′ is positive. Now pm + qn = 1 gives pm = 1 + q′n, so that
a^(pm) = a^(1+q′n) = a.(aⁿ)^(q′) and b^(pm) = b^(1+q′n) = b.(bⁿ)^(q′).
Since aᵐ = bᵐ, we have a^(pm) = (aᵐ)^p = (bᵐ)^p = b^(pm); and since
aⁿ = bⁿ, we have (aⁿ)^(q′) = (bⁿ)^(q′) = b^(q′n). Hence
a.b^(q′n) = b.b^(q′n),
which implies, on cancelling b^(q′n) ≠ 0 (R being an integral
domain), a = b. Thus the problem follows.
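The conclusion can be checked by brute force in the integral domain Z (an illustration only; the search range is arbitrary). Note that a single even exponent would not suffice, since squares cannot distinguish a from −a:

```python
# If a^m = b^m and a^n = b^n with gcd(m, n) = 1, then a = b in Z.
m, n = 2, 3   # relatively prime exponents
for a in range(-5, 6):
    for b in range(-5, 6):
        if a**m == b**m and a**n == b**n:
            assert a == b

# One even exponent alone is not enough: (-2)^2 == 2^2 although -2 != 2.
assert (-2)**m == 2**m and -2 != 2
```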
Note: The need for the introduction of the positive integer q′ arises in the
above proof because a^(qn) has no meaning (in general) when q is negative.
Euclidean ring:
The type of rings we propose to study now is motivated by the
properties of several rings like the ring Z of integers the ring Z[i] of Gaussian
integers and the polynomial rings (about which we will study in the next and
final lesson on rings) The Euclidean ring possesses some of the characteristic
and important properties of the above mentioned rings.
Definition 4:
A Euclidean ring is an integral domain R in which, to every element
a ≠ 0 in R, a non-negative integer d(a) [called a semi-valuation or a d-value of a] is
associated, satisfying the following conditions: (i) for all a, b ∈ R, both non-zero,
d(a) ≤ d(ab); (ii) (division algorithm) if a, b ∈ R and a ≠ 0, then there exist
elements q, r ∈ R such that b = qa + r, where either r = 0 or d(r) < d(a).
Note:
For the zero element of R no d-value is assigned in our definition. An
example of a Euclidean ring is the ring Z of integers. In fact, it is known that Z
is an integral domain, and for any m ∈ Z, |m|, the numerical value of m,
serves as the d-value of m, as both the conditions (i) and (ii) stipulated
above in the definition are clearly satisfied. Later we will show that the ring
Z[i] of Gaussian integers is also a Euclidean domain. We will now prove that
any Euclidean ring is a principal ideal ring. First we define the notion of a
principal ideal ring.
Definition 5:
A principal ideal ring is an integral domain R with identity in which
every ideal is principal, i.e. if A is any ideal in R, then there exists an element
a ∈ R which generates A; i.e. every element in A is of the form ra with r ∈ R.
Theorem 4: Any Euclidean ring is a principal ideal ring
Proof:
Let R be a Euclidean ring. Since R is already an integral domain, in
order to prove that R is a principal ideal ring it suffices to show that R has an identity
element and that any ideal of R is principal. Let A be any ideal of R. If A is the null
ideal, there is nothing to prove. We assume therefore that A is a non-null ideal.
For each a ∈ A, a ≠ 0, there is a d-value d(a), which is a non-negative integer.
Choose the element, say a, whose d-value d(a) is the least among the d-values of
the non-zero elements in A. We claim that this a generates A. For, if b ∈ A, then,
by the division algorithm which exists in the Euclidean ring R, there exist
elements q, r ∈ R such that b = qa + r, where either r = 0 or d(r) < d(a). We assert
that r = 0; for, if not, d(r) < d(a) and r = b − qa ∈ A (A is an ideal and b, qa ∈ A).
This contradicts the fact that d(a) is the least among the d-values of the non-zero
elements in A. Hence r = 0 and we get b = qa. Thus every element in A is of the
form qa for some q ∈ R. Applying this fact to the unit ideal R, we find that there
exists an element, say u, in R such that any element of R is a multiple, say qu, of u
for some q ∈ R. In particular, u = eu for some e ∈ R. We claim that this element e is
the identity element of R. In fact, for any element x in R, we can write x = qu, and so
x = qu = q(eu) = (qe)u = (eq)u = e(qu) = ex. Hence it follows that e is the identity
element of R. Now that we have proved that R has an identity and that for any ideal A
of R there exists a suitable element a ∈ A such that every element of A is a
multiple qa of a, we can conclude that any ideal A of R is a principal ideal,
generated by some element a ∈ A. Hence R becomes a principal ideal ring.
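The key step of this proof can be watched in the Euclidean ring Z with d(m) = |m| (a sketch; the generators 18, 30 and the sampling range are arbitrary illustrations): the element of least d-value in an ideal generates it.

```python
# Sample the ideal of Z generated by 18 and 30 over a finite coefficient range.
A = {18 * r + 30 * s for r in range(-15, 16) for s in range(-15, 16)}

a = min(abs(x) for x in A if x != 0)   # least d-value, with d(m) = |m|
assert a == 6                          # this is gcd(18, 30)

# As in the proof: dividing any element of A by a leaves remainder 0.
assert all(x % a == 0 for x in A)
```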
Note: Even though any Euclidean ring is a principal ideal ring, as we have
shown just now, the converse is false: an example of a principal ideal ring
which is not a Euclidean ring can be given. We can also show that the ring Z[x] of
all polynomials in x over the ring Z of integers cannot be a Euclidean ring. In
fact, Z[x] contains an ideal, viz. the ideal generated by {2, x}, which is not
principal; this fact can be proved. Thus Z[x] is not a Euclidean ring,
even though it is an integral domain.
Some authors call a Euclidean ring a Euclidean domain and a
principal ideal ring a principal ideal domain, since both these rings are
integral domains. In order to obtain some further properties of Euclidean rings,
we introduce the familiar notions of divisibility, prime element, greatest
common divisor, relatively prime elements, etc. We will assume hereafter, for
this purpose, that any element considered in the Euclidean ring R is non-zero,
unless otherwise stated.
If a, b ∈ R, then we say that a is a divisor of b (or b is a multiple of a)
whenever there exists an element c in R such that b = ca. We exhibit this fact by
writing a | b. One can easily check the following facts: (i) for a, b, c ∈ R, if
a | b and b | c, then a | c; (ii) if a | b and a | c, then a | (b ± c); and (iii) if a | b, then
a | rb for r ∈ R.
Definition 6:
If a, b ∈ R (where R is a ring), then an element d of R is said to be a
greatest common divisor (abbreviated as g.c.d.) of a and b if: (1) d | a and d | b; (2)
whenever, for c ∈ R, c | a and c | b, then c | d. We denote any g.c.d. of a and b by
(a, b).
Lemma 1:
Let R be a Euclidean ring. Then any two elements a, b of R have a g.c.d.,
say d, and there exist elements λ, μ ∈ R such that d = λa + μb.
Proof:
Let A = {ra + sb : r, s ∈ R}. We claim that A is an ideal of R. For, let
x, y ∈ A, so that x = r₁a + s₁b, y = r₂a + s₂b for some r₁, r₂, s₁, s₂ ∈ R. Then
(x − y) = (r₁ − r₂)a + (s₁ − s₂)b and, for any r ∈ R, rx = (rr₁)a + (rs₁)b, so that
both x − y and rx are in A. Thus A turns out to be an ideal and hence, R being a
Euclidean ring, there exists an element d ∈ A which generates the ideal A, i.e.
A = ⟨d⟩. But both a and b are in A, as we can write a = 1a + 0b and b = 0a + 1b;
hence d | a and d | b. Further, since d ∈ A, there exist λ, μ ∈ R such that
d = λa + μb, which shows that if c | a, c | b for c ∈ R, then c | d. Hence d is a
greatest common divisor of a and b, and the Lemma is proved.
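For R = Z the Lemma is effective: the extended Euclidean algorithm produces a g.c.d. d of a and b together with the elements λ and μ with d = λa + μb. A minimal sketch (sample inputs chosen for illustration):

```python
# Extended Euclidean algorithm over Z: returns (d, lam, mu)
# with d = gcd(a, b) and d = lam*a + mu*b.
def ext_gcd(a, b):
    if b == 0:
        return a, 1, 0                    # gcd(a, 0) = a = 1*a + 0*0
    d, x, y = ext_gcd(b, a % b)
    return d, y, x - (a // b) * y         # back-substitute the quotient

d, lam, mu = ext_gcd(84, 30)
assert d == 6
assert lam * 84 + mu * 30 == 6            # d is expressed as lam*a + mu*b
```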
Note: The above proof of the Lemma is valid obviously even in the case of a
principal ideal ring. However in the case of Euclidean ring there exists an
algorithm or a method for determining the g.c.d. (a,b) of any given pair of
elements a,b of the ring, while in the case of a principal ideal ring such an
algorithm may not exist. The follows with problem explains the algorithm for
determining the g.c.d (a,b) of any given pair of elements a,b of Euclidean ring.
Problem 8:
Explain how the g.c.d. (a, b) of a given pair of elements a, b of a
Euclidean ring R can be found.
Solution:
Invoking the "division algorithm" which exists in R, we can determine
elements q0, r1, q1, r2, …, q_{n-1}, rn etc. in R such that
b = q0·a + r1 where d(r1) < d(a)   …(1)
a = q1·r1 + r2 where d(r2) < d(r1)   …(2)
r1 = q2·r2 + r3 where d(r3) < d(r2)   …(3)
... ... ... ... ...
r_{n-2} = q_{n-1}·r_{n-1} + rn where d(rn) < d(r_{n-1})   …(n)
Thus we have a strictly decreasing sequence of non-negative integers, viz.
d(a), d(r1), d(r2), …, d(r_{n-1}), d(rn). This sequence must terminate at a finite
stage, say at the (n+1)th stage, so that we get the (n+1)th equation r_{n-1} = qn·rn.
Here we have taken rn+1 = 0. Now we assert that rn = (a, b). In fact, the (n+1)th
equation shows that rn | r_{n-1}; using this fact, the nth equation shows that
rn | r_{n-2}; whence the (n−1)th equation, viz. r_{n-3} = q_{n-2}·r_{n-2} + r_{n-1}, shows that
rn | r_{n-3}. Proceeding in the same manner, the (n−2)th equation shows that
rn | r_{n-4}, etc. Finally we get from the 2nd and the 1st equations that rn is a common
divisor of both a and b. On the other hand, if d in R is a common divisor of both
a and b, then from equation (1) we find that d is a divisor of r1, so that d is a
common divisor of both a and r1, whence from equation (2) we have that d is a
divisor of r2. Arguing similarly, we can show from equation (3) that d is a divisor
of r3, r4, etc., and finally from equation (n) we find that d is a divisor of rn. Thus
we have established that rn is a common divisor of both a and b and that any
common divisor d of both a and b is a divisor of rn. Hence, by definition, rn
turns out to be a g.c.d. of a and b and the problem is answered.
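For the ring Z the chain of equations (1)–(n) can be traced directly. The sketch below (the helper name is ours) returns the list of remainders r1, …, rn, the last of which is a g.c.d.; when a | b the chain is empty and a itself is a g.c.d.

```python
# Remainder chain of Problem 8 in the Euclidean ring Z (d-value: absolute value).
# Returns [r1, r2, ..., rn]; the last non-zero remainder rn is a g.c.d. (a, b).
def remainder_chain(a, b):
    chain = []
    while True:
        r = b % a          # b = q*a + r with d(r) < d(a)
        if r == 0:
            break
        chain.append(r)
        b, a = a, r        # next equation: previous divisor becomes the dividend
    return chain

print(remainder_chain(24, 66))   # [18, 6]; the last entry 6 is gcd(66, 24)
```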
Definition 7:
Let R be a commutative ring with identity 1. An element u ( 0) in R is
said to be a unit (or an invertible element) in R iff there exists an element v in R
such that uv=1.
Lemma 2:
Let R be an integral domain with identity. Suppose for a, b ∈ R we have that
a | b and b | a. Then a = ub where u is a unit in R.
Proof:
Since a | b we can write b = xa for some x ∈ R. Similarly, since b | a, we
have a = yb for some y ∈ R. Thus b = xa = x(yb) = (xy)b, which shows that xy = 1, on
cancelling b, which is permissible, R being an integral domain.
Hence y turns out to be a unit in R, and a = yb, proving the lemma.
Such pairs of elements a, b as above are called associates, as per the
following definition.
Definition:
Let R be a commutative ring with identity. Two elements a and b of R
are said to be associates if b = ua for some unit u in R. Clearly, if a and b are
associates in a ring, then each is a divisor of the other, and they generate the
same ideal, provided the ring is a commutative ring with identity. Also, it can
be shown that in a Euclidean ring any two given greatest common divisors of
two elements are associates.
Lemma 3:
If R is a Euclidean ring and a, b ∈ R, then d(a) < d(ab) provided b is not a
unit in R.
Proof:
We had seen earlier, in the proof of Theorem 4 (viz. every ideal in a
Euclidean ring is a principal ideal), that for any non-null ideal A in R, any
element in A whose d-value is the least among the d-values of the non-zero
elements of A is a generator of A. Suppose now A = <a> is the ideal generated
by a; then ba ∈ A and d(a) ≤ d(ba). Suppose the strict inequality fails; then d(ba) = d(a)
implies that ba is also a generator of A. This means a = x(ba) for some x ∈ R,
whence by cancelling a (which is permissible as R is an integral domain) we
get 1 = xb. This shows that b is a unit, which is a contradiction to the data that b is
not a unit in R. Thus d(a) < d(ba), as required.
Definition 9:
In a Euclidean ring R an element p is said to be a prime if (1) p is not a
unit and (2) whenever p = ab for a, b ∈ R, then either a or b is a unit.
The above definition shows that an element p ∈ R is a prime iff its only
divisors are units and associates of p; in other words, any prime element of R
admits only trivial factorizations. We now prove an important property
concerning the factorization of any element in a Euclidean ring, and ultimately
establish the unique factorization property.
Lemma 4:
Let R be a Euclidean ring. Then every element of R is either a unit or
can be written as a product of a finite number of prime elements of R.
Proof:
We prove the result by induction on d(a), where a ∈ R. Suppose
d(a) = d(1). Then a must be a unit. In fact, for any non-zero b ∈ R, one of the
properties of the d-value shows that (since b = 1b) d(1) ≤ d(b). Now by the
division algorithm in R there exist elements q, r ∈ R such that 1 = qa + r, where
either r = 0 or d(r) < d(a). Since d(1) is the least among the d-values, we conclude
r = 0, so that 1 = qa, which shows that a is a unit and the lemma follows. We
assume therefore that the lemma is true for all x ∈ R for which d(x) < d(a) and
prove that the lemma is true for a. This would complete the induction and the
lemma would have been proved. If a is a prime element in R there is nothing to
prove. So we may assume that a is not a prime and that we can write a = bc
where b, c ∈ R and both b and c are not units. Now by the last lemma we get
d(b) < d(bc) = d(a) and d(c) < d(bc) = d(a). Thus, by our induction hypothesis, we
can write b = p1·p2·…·pm and c = p1'·p2'·…·pn', where the p's and p''s are
all prime elements in R. Hence we get a = bc = p1·p2·…·pm·p1'·p2'·…·pn'. Thus a
has been written as a product of a finite number of prime elements and the lemma
follows by induction.
Definition 10: Two elements a, b in a Euclidean ring are said to be relatively
prime (or prime to each other, or coprime) whenever a g.c.d. of a and b is a unit.
Since any associate of a g.c.d. is easily seen to be also a g.c.d., and since 1 and
any unit are associates, we find that we can take (a, b) = 1 whenever a and b are
relatively prime. Further, we will find that the prime elements in a Euclidean
ring play the same role that prime numbers play in number theory. In fact we
obtain familiar number-theoretical results in the language of Euclidean rings,
and we obtain many of them below:
Lemma 5:
Suppose R is a Euclidean ring and a, b, c ∈ R. If a | bc and (a, b) = 1,
then a | c.
Proof:
Since (a, b) = 1 we find, by Lemma 1, that there exist elements
λ, μ ∈ R such that λa + μb = 1. So we can write
c = 1c = (λa + μb)c = λac + μbc. But by data a | bc, whence we can assume that
bc = ad for some d ∈ R. Hence, from c = λac + μbc we obtain
c = a(λc + μd), which shows that a | c as required.
Just as in the case of integers, we can show that if p is any prime
element in a Euclidean ring R and a ∈ R, then either p | a or (p, a) = 1 (or a unit).
In fact, by the definition of the g.c.d., (p, a) is a divisor of p, and p being a prime
this means that (p, a) is either p or 1. So if (p, a) ≠ 1, then (p, a) = p, which
means that p | a. Hence the assertion. We use this fact in proving the following
important property of any prime element in R.
Lemma 6:
If p is any prime element in a Euclidean ring R and p | ab, where
a, b ∈ R, then p must be a divisor of at least one of a or b.
Proof:
Suppose p is not a divisor of a; we will show that p | b. Since p is a prime
element and p does not divide a, (p, a) = 1. Hence by Lemma 5 we have to
conclude that p | b, for by data p | ab.
Corollary:
If a prime element p of a Euclidean ring is a divisor of a product
a1·a2·…·an, then p must be a divisor of at least one of the elements a1, a2, …, an.
We now come to the important fact that any Euclidean ring (and
similarly any principal ideal domain) satisfies the following Unique
Factorization Theorem.
Theorem 5:
Let R be a Euclidean ring and a ≠ 0 be a non-unit in R. Suppose
a = p1·p2·…·pn = p1'·p2'·…·pm', where the pi and pi' are prime elements of R.
Then n = m, each pi for 1 ≤ i ≤ n is an associate of some pj', and each pj' is an
associate of some pi. (In short, if any element a of a Euclidean ring R has two
possible factorizations into prime factors, then both are the same but for order
and unit factors.)
Proof:
The given relation a = p1·p2·…·pn = p1'·p2'·…·pm' shows that p1
must divide p1'·p2'·…·pm' and hence, by the above corollary, p1 must divide some
pi'. Since both p1 and pi' are prime elements in R and p1 | pi', p1 and pi' must be
associates, so that we can take pi' = u1·p1 where u1 is a unit in R. Thus
p1·p2·…·pn = p1'·…·p'_{i-1}·(u1·p1)·p'_{i+1}·…·pm'; after cancelling off p1
we will be left with p2·…·pn = u1·p1'·…·p'_{i-1}·p'_{i+1}·…·pm'. Repeat the same
argument with p2, then p3, and so on, successively cancelling; at the
nth stage we would obtain 1 equal to the product of certain units and a certain
number, viz. m − n, of the p''s, provided n ≤ m. But since the p''s are not units, the
above equality will force cancellation of all the p''s, and this means that m = n.
Further, we have incidentally proved that every pi has some pj' as an associate and
vice versa. Hence the theorem follows.
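Over the familiar Euclidean ring Z, the theorem is the classical uniqueness of prime factorization. The following sketch (trial division; a helper of our own, not from the text) produces the factorization whose uniqueness the theorem guarantees — any two runs agree up to order and unit (sign) factors.

```python
# Prime factorization in the Euclidean ring Z by trial division.
# Returns the prime factors of a > 1 in non-decreasing order.
def prime_factors(a):
    factors = []
    p = 2
    while p * p <= a:
        while a % p == 0:       # p | a: split off one prime factor
            factors.append(p)
            a //= p
        p += 1
    if a > 1:                   # what remains is itself prime
        factors.append(a)
    return factors

print(prime_factors(360))   # [2, 2, 2, 3, 3, 5], i.e. 360 = 2^3 * 3^2 * 5
```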
Applying the earlier proved Lemma 4, the above theorem shows that
every non-zero element in a Euclidean ring R is either a unit or can be uniquely
factorised as a product of prime elements of R, the factorization being unique but
for order and unit factors. Because of this property any Euclidean ring turns out
to be a unique factorization domain (U.F.D.); the notion of a U.F.D. will be
considered in the next lesson.
We will close this lesson with two important results, one concerning the
maximal ideals of a Euclidean ring, the other providing an important example,
viz. Z[i], of a Euclidean ring.
Theorem 6:
An ideal M of a Euclidean ring R is a maximal ideal iff the ideal M is
the principal ideal generated by a prime element of R.
Proof:
Since any ideal in a Euclidean ring is a principal ideal, we can take the ideal
M = <m>, the ideal generated by an element m ∈ R. In the first place we will
prove that when m is not a prime element, the ideal M is not maximal.
Since m is not a prime element we can write m = ab, where a, b ∈ R and both a
and b are non-units. We claim that, if A = <a>, the ideal generated by a, then M
is properly contained in the proper ideal A of R, whence it will follow that M is
not a maximal ideal. In fact, if x ∈ M then x = rm for some r ∈ R, so that
x = r(ab) = (rb)a ∈ A. Thus M ⊆ A. However M ≠ A, for otherwise a ∈ M and
we should have a = λm for some λ ∈ R. This means a = λ(ab), whence
λb = 1 (on cancelling a), which is not possible, as b is not a unit in R. Thus M is
properly contained in the ideal A. Moreover this ideal A is not the unit ideal R, for
otherwise 1 ∈ A and this implies 1 = μa for some μ ∈ R. But this again is
impossible, since a is not a unit. Thus M ⊂ A ⊂ R and the ideal A is neither M
nor R. Hence M cannot be a maximal ideal.
Conversely, suppose m is a prime element in R; we will prove that the
ideal M is maximal. In fact, suppose N = <n>, the ideal generated by n ∈ R, is an
ideal such that M ⊆ N ⊆ R; we will prove that either N = M or N = R, whence it
will follow that M is maximal. Since M ⊆ N, m ∈ N = <n>, so that we can
write m = xn for some x ∈ R. But m is a prime element, and so either x is a unit,
in which case n is an associate of m, or n is a unit. When x is a unit, m and n are
associates and it can be easily shown that <m> = <n>, i.e. M = N. On the other hand,
when n is a unit, we can write 1 = np for some p ∈ R, so that 1 ∈ N, N
being the ideal generated by n; hence for every r ∈ R, r = 1·r ∈ N. Thus
R ⊆ N ⊆ R, i.e. N = R. Therefore whenever M ⊆ N ⊆ R, either N = M or N = R.
This means that M is a maximal ideal of R, and the proof of the theorem is concluded.
Corollary:
In a Euclidean ring R any prime ideal ≠ R is also a maximal ideal.
Proof:
Let P ≠ R be a prime ideal and let P be generated by the element p ∈ R.
We claim that p must be a prime element. For suppose p is not a prime; then we
can write p = ab, where a, b ∈ R and both a and b are not units. (We may note
here that p cannot be a unit, for then P = R, which is not the case.) Now a does
not belong to P; for otherwise, if a ∈ P, then a = pc for some c ∈ R and we obtain
p = ab = pcb. This means that cb = 1, which is not possible, since by our assumption
b is not a unit in R. Thus a ∉ P. Similarly we can show that b ∉ P. However
ab = p ∈ P. Hence P cannot be a prime ideal, contradicting our assumption that
P is a prime ideal. Thus the prime ideal P can be generated only by a prime
element. Hence, by the last Theorem, P is also maximal and the corollary
follows.
Note: Since in any commutative ring with identity any maximal ideal is also a
prime ideal, we conclude from the above corollary that in any Euclidean ring R
there is no difference between maximal and prime ideals.
Theorem 7:
The ring Z[i] of Gaussian integers is a Euclidean ring.
Proof:
We recall that Z[i] = {α + βi | α, β ∈ Z, i² = −1} is a commutative ring with
identity under the usual operations of addition and multiplication of complex
numbers. Since Z[i] is clearly a subring of the field of complex numbers, Z[i] is
an integral domain. Therefore, in order to show that Z[i] is a Euclidean ring, we
need to define a suitable d-value d(a) for each non-zero element a in Z[i]
possessing the required two properties, viz. (i) d(ab) ≥ both d(a) and d(b), and (ii)
given a, b ∈ Z[i] with a ≠ 0, there exist elements q, r ∈ Z[i] such that
b = qa + r, where either r = 0 or d(r) < d(a). We will now show that |a|² = a·ā for
a ∈ Z[i] serves as a suitable d-value of a; here ā denotes, as usual, the complex
conjugate of a, and |a| is the modulus of a. In fact, if we take d(a) = |a|², where
a ∈ Z[i], clearly d(a) is a non-negative integer such that for a, b ∈ Z[i],
d(ab) = |ab|² = |a|²·|b|², i.e. d(ab) = d(a)·d(b) ≥ both d(a) and d(b). Thus the required
property (i) of the d-value in a Euclidean ring is satisfied. It remains for us to
verify that property (ii) also holds good. Now Z[i] can be imbedded in the field
Q[i] of all Gaussian numbers of the form x + yi, where x, y ∈ Q. Given
a, b ∈ Z[i] with a ≠ 0, we need to determine q, r ∈ Z[i] such that b = qa + r, where
either r = 0 or d(r) < d(a). Since Q[i] is the field of quotients of the integral
domain Z[i], b/a ∈ Q[i] and we can write b/a = x + yi, where x, y ∈ Q.
Let q1, q2 be the integers nearest to x and y respectively, so that we can
write x = q1 + θ1, y = q2 + θ2, where θ1, θ2 ∈ Q and |θ1| ≤ 1/2, |θ2| ≤ 1/2. Let
q = q1 + iq2 ∈ Z[i]. Then b = qa + r with r = (θ1 + iθ2)a ∈ Z[i], as can be
easily checked. Now
d(r) = |r|² = |θ1 + iθ2|²·|a|² = (θ1² + θ2²)·d(a) ≤ (1/4 + 1/4)·d(a) = d(a)/2,
since both |θ1| and |θ2| are ≤ 1/2. Hence d(r) < d(a). Thus we have determined elements
q, r ∈ Z[i] such that b = qa + r, where either r = 0 (which will be the case when
θ1 = θ2 = 0) or d(r) < d(a). This completes the proof of the theorem.
We will now illustrate the above procedure of finding the required
q, r ∈ Z[i] in a particular case, say when a = 3 + 4i and b = 4 + 7i. Then
b/a = (4 + 7i)/(3 + 4i) = (4 + 7i)(3 − 4i)/25 = (40 + 5i)/25 = 8/5 + (1/5)i,
so that, in the notation of the earlier work, we have x = 8/5, y = 1/5. We have to take
therefore q1 = 2, θ1 = −2/5; q2 = 0, θ2 = 1/5; q = q1 + iq2 = 2. Then
b − qa = (4 + 7i) − 2(3 + 4i) = −2 − i = r, and d(r) = (−2)² + (−1)² = 5, which is
certainly < d(a), since d(a) = 3² + 4² = 25.
Thus we can write 4 + 7i = 2(3 + 4i) + (−2 − i), which is in the form b = qa + r with
d(r) < d(a).
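The procedure just illustrated — rounding b/a to the nearest Gaussian integer — can be sketched with Python's complex numbers (the function name `gauss_divmod` is ours; floats suffice for small inputs, though exact rational arithmetic would be preferable for large ones).

```python
# Division with remainder in Z[i] (d-value: squared modulus), following
# the rounding procedure of Theorem 7.
def gauss_divmod(b, a):
    """Return (q, r) with b = q*a + r and |r|^2 < |a|^2, for Gaussian integers."""
    t = b / a                                   # exact quotient in Q[i], as a complex float
    q = complex(round(t.real), round(t.imag))   # nearest Gaussian integer
    r = b - q * a
    return q, r

q, r = gauss_divmod(4 + 7j, 3 + 4j)
print(q, r)   # (2+0j) (-2-1j): q = 2, r = -2 - i, matching the worked example
```

Note that Python's `round` breaks ties toward even integers, but a tie still gives |θ| = 1/2 ≤ 1/2, so the bound d(r) ≤ d(a)/2 is unaffected.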
In the next lesson we will use this fact, that the ring Z[i] of Gaussian
integers is a Euclidean ring, to establish a very important number-theoretical
result: namely, that if p is a prime number of the form 4n + 1, then there exist
integers a and b such that p = a² + b².
Solved problems:
1. If an ideal A of a ring R contains a unit, prove that the ideal A must be the
unit ideal R.
Solution:
Let u ∈ A be a unit. Then there exists v ∈ R such that uv = 1.
Since A is an ideal of R, uv = 1 ∈ A.
Now 1 ∈ A, x ∈ R ⟹ 1·x = x ∈ A (A is an ideal).
Thus R ⊆ A. Also A ⊆ R.
Hence A = R.
2. Let R be a Euclidean ring and a ∈ R. Prove that a is a unit in R iff d(a) = d(1),
where 1 is the identity in R.
Solution:
Let a be a unit in R.
To prove: d(a) = d(1).
By the definition of a Euclidean ring,
d(1·a) ≥ d(1)
⟹ d(a) ≥ d(1).
Since a is a unit in R, a⁻¹ exists and a·a⁻¹ = 1.
Hence d(1) = d(a·a⁻¹) ≥ d(a).
Therefore d(a) = d(1).
Conversely, given d(a) = d(1),
to prove: a is a unit in R.
By the division algorithm valid in R, we can find q, r ∈ R such that 1 = qa + r,
where either r = 0 or d(r) < d(a).
Now, since d(a) = d(1) and d(r) = d(r·1) ≥ d(1), we have d(r) ≥ d(a), which rules
out d(r) < d(a).
Therefore r = 0, so that 1 = qa, whence a is a unit in R.
3. If R is a commutative ring with identity and if M is a maximal ideal in
R, prove that the residue class ring R/M is a field.
Solution:
Since R is a commutative ring with identity, R/M is also a commutative
ring with identity.
The zero element of R/M is M.
To prove: R/M is a field, i.e. every non-zero element of R/M is invertible.
Let M + r be any non-zero element of R/M.
Then M + r ≠ M, i.e. r ∉ M.
Claim: M + r is invertible.
If (r) is the principal ideal of R generated by r, then M + (r) is also an ideal
of R. Since r ∉ M, M is properly contained in M + (r). But M is a maximal
ideal of R. Hence M + (r) = R, and since 1 ∈ R, 1 = m + λr for some m ∈ M, λ ∈ R.
⟹ 1 − λr = m ∈ M
⟹ M + 1 = M + λr = (M + λ)(M + r)
⟹ M + λ = (M + r)⁻¹ ⟹ M + r is invertible.
Hence R/M is a field.
4. Prove that in an integral domain R with identity, the elements a, b of R
generate the same ideal iff a and b are associates.
Solution:
Given: (a) = (b), where (a) = {xa | x ∈ R} is the principal ideal generated by a
and (b) is the principal ideal of R generated by b.
To prove: a and b are associates.
Now (a) ⊆ (b)
⟹ a ∈ (b)
⟹ a = rb for some r ∈ R
⟹ b | a.
Similarly (b) ⊆ (a)
⟹ b ∈ (a)
⟹ b = sa for some s ∈ R
⟹ a | b.
Now a | b ⟹ there exists c ∈ R such that b = ac,
and b | a ⟹ there exists d ∈ R such that a = bd.
⟹ b = ac = (bd)c = b(dc)
⟹ b·1 = b(dc)
⟹ b(1 − dc) = 0
⟹ 1 − dc = 0 (since b ≠ 0 and R is without zero divisors)
⟹ 1 = dc.
Therefore both c and d are units in R.
Thus a = bd where d is a unit.
Hence a and b are associates.
Given: a and b are associates.
To prove: (a) = (b).
Since a and b are associates, a = bu where u is a unit in R.
Now a = bu ⟹ b | a.
Again a = bu ⟹ b = au⁻¹ ⟹ a | b.
Now a | b ⟹ b = pa for some p ∈ R.
Let y ∈ (b); then y = qb for some q ∈ R
= q(pa)
= (qp)a
∈ (a) (since qp ∈ R).
⟹ (b) ⊆ (a).
Similarly b | a ⟹ (a) ⊆ (b).
Hence (a) = (b).
In the last lesson we had introduced the concept of a Euclidean ring
and established some of its properties, many of which resemble the familiar
properties of integers. At the end of the lesson we had shown that the ring Z[i]
of all Gaussian integers a + bi, where a, b ∈ Z, is a Euclidean ring, by
prescribing a suitable d-value. Now we apply the properties of a Euclidean ring
to this ring Z[i] and derive Fermat's theorem in number theory, viz. that any
prime number of the form 4n + 1 can be exhibited as a sum of two squares.
Before proving this theorem we require two lemmas.
Lemma 1:
Let p be a prime number. Suppose c is an integer relatively prime to p,
and suppose there exist integers x and y such that cp = x² + y². Then p can be
written as the sum of two squares, that is, p = a² + b² for some pair of integers a
and b.
Proof:
Since any (ordinary) integer can be considered as a Gaussian integer
with imaginary part absent, we find that the ring Z of integers is a subring of the
ring Z[i] of Gaussian integers. It may happen that an integer p which is a prime
element in Z (i.e. a prime number) need not be prime in Z[i]; for we can write
2 = (1 + i)(1 − i), and both 1 + i and 1 − i are neither units nor associates of 2 in Z[i],
since the only units in Z[i] are 1, −1, i, −i, as can be easily shown. Now we claim
that the prime number p considered in the Lemma is not a prime in Z[i]. For
suppose not. Then cp = x² + y² = (x + yi)(x − yi); this factorization, being
possible in Z[i], shows by Lemma 6 of the last lesson that, since p is a prime
divisor of the L.H.S. and hence of the R.H.S., p must divide at least one of x + yi
or x − yi. If p | x + yi, then (x + yi) = p(u + vi), which clearly implies (by taking
complex conjugates of both sides or otherwise) that (x − yi) = p(u − vi).
Hence (x + yi)(x − yi) = p²(u + vi)(u − vi). Thus p² | (x² + y²) = cp, whence it
follows that p | c. However this is not possible, as, by assumption, p and c are
relatively prime. Hence p cannot be a divisor of x + yi.
Similarly p cannot be a divisor of x − yi. Thus p cannot be a prime element in
Z[i], as claimed. We can therefore take p = (a + bi)(m + ni), where a + bi, m + ni ∈ Z[i]
and both a + bi and m + ni are not units. Since one can easily show that the Gaussian
integer α + βi is a unit in Z[i] iff α² + β² = 1, we find that both a² + b² and
m² + n² are > 1, as both a + bi and m + ni are not units in Z[i]. Now from
p = (a + bi)(m + ni), by taking complex conjugates of both sides or otherwise, we
get p = (a − bi)(m − ni). Hence
p² = (a + bi)(m + ni)(a − bi)(m − ni) = (a² + b²)(m² + n²). Therefore a² + b²
is a divisor of p². But the only possible (positive) integer divisors of p² are 1, p
and p², p being a prime. Hence a² + b² = 1 or p or p². But a² + b² > 1, so
a² + b² ≠ 1. Further, a² + b² cannot be p², for then m² + n² = 1, which cannot be,
since m² + n² > 1. Thus the only possibility is that a² + b² = p = m² + n²,
whence the lemma is proved.
Now, if an odd prime number is divided by 4, the remainder can be
either 1 or 3 only, as it clearly cannot be 0 or 2. Thus the set of odd primes can
be divided into two disjoint classes, viz. (1) those odd primes of the form 4n + 1,
which on division by 4 leave remainder 1, and (2) those odd primes of the form
4n + 3, which on division by 4 leave remainder 3. We will now prove that any odd
prime number of the form 4n + 1 can be exhibited as a
sum of two squares, and we leave it as an exercise to show that no odd prime
number of the form 4n + 3 has this property. For proving this main
theorem we require the following lemma, which concerns number theory, and in
the proof of this lemma we make use of the well-known Wilson's theorem in
number theory, viz. for any integer p > 1, p is a prime number iff (p − 1)! + 1 ≡ 0
(mod p). We now state the required lemma and prove the same.
Lemma 2:
If p is any prime number of the form 4n + 1, then there exists an integer x
such that x² + 1 ≡ 0 (mod p).
Proof:
Take x = 1·2·3·…·((p−1)/2). Here (p−1)/2 is an even number, since p = 4n + 1
for some integer n. Therefore we can also write x = (−1)(−2)(−3)·…·(−(p−1)/2),
the number of factors being even.
Now p − k ≡ −k (mod p), so that we get
x² = (−1)(−2)(−3)·…·(−(p−1)/2) · 1·2·3·…·((p−1)/2)
≡ (p−1)(p−2)(p−3)·…·(p − (p−1)/2) · 1·2·3·…·((p−1)/2) (mod p)
= 1·2·3·…·((p−1)/2) · ((p+1)/2)·((p+3)/2)·…·(p−2)(p−1) (mod p)
(by rearranging the factors)
= (p − 1)! (mod p)
≡ −1 (mod p), since by Wilson's theorem
(p − 1)! ≡ −1 (mod p), p being a prime.
Thus the integer x we have taken has the property x² + 1 ≡ 0 (mod p),
as required, and the lemma has been established. As an illustration of the above
lemma, take p = 4·4 + 1 = 17, a prime of the form 4n + 1. Then one can check by
actual verification that x = 1·2·3·4·5·6·7·8 (here (p−1)/2 = 8), viz. x = 40320 ≡ 13
(mod 17), satisfies the congruence relation x² = 13² ≡ −1 (mod 17).
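The construction in the proof — x = ((p−1)/2)! — can be checked numerically. This is a verification sketch on the text's example p = 17 and a few other primes of the form 4n + 1; the function name is our own.

```python
# Numerical check of Lemma 2: for p = 4n + 1, x = ((p-1)/2)! satisfies
# x^2 + 1 = 0 (mod p), by Wilson's theorem.
from math import factorial

def wilson_square_root_of_minus_one(p):
    """Return x = ((p-1)/2)! mod p; x^2 = -1 (mod p) when p is prime, p = 4n+1."""
    return factorial((p - 1) // 2) % p

x = wilson_square_root_of_minus_one(17)
print(x, (x * x + 1) % 17)   # 13 0, confirming 13^2 = -1 (mod 17) as in the text

for p in (5, 13, 17, 29, 37, 41):   # primes of the form 4n + 1
    x = wilson_square_root_of_minus_one(p)
    assert (x * x + 1) % p == 0
```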
Theorem 1:
If p is any prime number of the form 4n + 1, then p can be expressed as a
sum of two squares; i.e., p = a² + b² for some integers a, b.
Proof:
By the above lemma we can find an integer x such that x² ≡ −1 (mod p).
We claim that we can choose this integer x such that x ≤ p/2 and still
x² ≡ −1 (mod p). In fact, if the integer x we had initially taken is > p, we can
divide x by p and obtain the remainder r; then r < p and x ≡ r (mod p), so that
x² ≡ r². If r ≤ p/2, then we have determined the required integer, viz. r, as
r² ≡ x² ≡ −1 (mod p) and r ≤ p/2. Suppose r > p/2; then p − r < p/2, and since
p − r ≡ −r (mod p), we get (p − r)² ≡ (−r)² = r² ≡ −1 (mod p). In either case we
obtain an integer x such that x ≤ p/2 and x² + 1 is divisible by p, i.e. x² + 1 = cp for
some integer c. But x² + 1 ≤ (p/2)² + 1 < p², by our choice of x, whence we find that
cp = x² + 1 < p². Thus c < p, and p being a prime, c and p are relatively prime.
Therefore we can now apply Lemma 1 and conclude that p = a² + b² for some
integers a, b. The theorem is therefore proved.
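A constructive route from Lemma 2 to Theorem 1 runs through the Gaussian integers: for x with x² ≡ −1 (mod p), a g.c.d. of p and x + i in Z[i] has norm p, so its real and imaginary parts give the two squares. This is a standard method, not the text's own argument, and all helper names below are ours (floats suffice for small primes).

```python
# Writing a prime p = 4n + 1 as a sum of two squares via a gcd in Z[i].
from math import factorial

def gauss_divmod(b, a):
    # division with remainder in Z[i], rounding b/a to the nearest Gaussian integer
    t = b / a
    q = complex(round(t.real), round(t.imag))
    return q, b - q * a

def gauss_gcd(a, b):
    # Euclidean algorithm in Z[i]
    while b != 0:
        _, r = gauss_divmod(a, b)
        a, b = b, r
    return a

def two_squares(p):
    """For a prime p of the form 4n + 1, return (a, b) with p = a^2 + b^2."""
    x = factorial((p - 1) // 2) % p              # x^2 = -1 (mod p), by Lemma 2
    g = gauss_gcd(complex(p, 0), complex(x, 1))  # gcd(p, x + i) in Z[i]
    return abs(int(round(g.real))), abs(int(round(g.imag)))

a, b = two_squares(13)
print(a, b)   # 3 2, and 3^2 + 2^2 == 13
```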
Polynomial Rings:
We have already seen in the last lesson how the ring Z of integers and the
ring Z[i] of Gaussian integers can be recognized as Euclidean rings. We are
now going to obtain yet another important example of a Euclidean ring. By
Z[x] we denote the set of all polynomials in x (an indeterminate) over Z, i.e. with
coefficients from Z, and it is not difficult to show that Z[x] is a commutative ring
with identity (even an integral domain) under the usual (high school)
operations of addition and multiplication of polynomials. However this ring
Z[x] is not a Euclidean ring, and this fact will be proved later. If, instead of Z,
we take either Q, the field of rationals, or R, the field of reals, or C, the field of
complex numbers, and consider the set Q[x] or R[x] or C[x] of all polynomials
in x over Q or R or C, as the case may be, we obtain rings, under the usual
operations of addition and multiplication of polynomials, and these rings
Q[x], R[x], C[x] are all Euclidean rings, a suitable d-value being given by the degree of
the polynomial. More generally, we can construct the ring F[x] of all (so-called)
polynomials in x over a field F and prove that F[x] is a Euclidean ring. So also
we can construct the ring R[x] of all polynomials over any ring R and study its
properties. We propose to study these things in this lesson; these ideas
will be necessary for the purpose of Galois theory, which will be treated in the
last lesson in Algebra.
Definition 1:
Let R be any ring and let x denote an indeterminate (or variable). Let
R[x] denote the set of all symbols, or formal sums, of the form
a0 + a1x + a2x² + … + anxⁿ, where ai ∈ R for i = 0, 1, 2, …, n, n an arbitrary non-
negative integer. The operations of addition '+' and multiplication '·' are
defined on R[x] as follows. If f(x), g(x) ∈ R[x], where
f(x) = a0 + a1x + … + anxⁿ, g(x) = b0 + b1x + … + bmx^m,
then f(x) + g(x) = h(x) = c0 + c1x + c2x² + … + cpx^p, with ci = ai + bi for all i,
where p is the maximum of the integers n and m (taking at = 0 if t > n, bt = 0 if
t > m), and
f(x)·g(x) = k(x) = d0 + d1x + d2x² + … + d_{m+n}x^{m+n},
where di = a0bi + a1b_{i-1} + … + atb_{i-t} + … + aib0 for all i (with the same
convention as earlier, viz. at = 0 if t > n, bt = 0 if t > m). Then it is possible to
show that R[x] is a ring under the above defined operations of addition '+' and
multiplication '·'. This ring R[x] is called the polynomial ring or, more precisely,
the ring of polynomials in x over R. The elements of R[x] are called polynomials
in x over R.
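The coefficient-wise operations of Definition 1 can be sketched as follows, representing a0 + a1x + … + anxⁿ by its coefficient list. The illustration is over Z, though any ring's elements would do; the helper names `poly_add` and `poly_mul` are our own.

```python
# Addition and multiplication in R[x] on coefficient lists [a0, a1, ..., an].
def poly_add(f, g):
    p = max(len(f), len(g))
    # c_i = a_i + b_i, taking missing coefficients as 0
    return [(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
            for i in range(p)]

def poly_mul(f, g):
    # d_i = sum over t of a_t * b_(i - t)
    d = [0] * (len(f) + len(g) - 1)
    for t, a in enumerate(f):
        for s, b in enumerate(g):
            d[t + s] += a * b
    return d

f = [1, 2]        # 1 + 2x
g = [3, 0, 1]     # 3 + x^2
print(poly_add(f, g))   # [4, 2, 1], i.e. 4 + 2x + x^2
print(poly_mul(f, g))   # [3, 6, 1, 2], i.e. 3 + 6x + x^2 + 2x^3
```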
Remark:
In the above definition it is understood that, for the polynomials f(x)
and g(x), the coefficients an and bm are not zero in R, so that the definition of
f(x)·g(x) can be precisely given. We may also point out here that the
expressions for elements in R[x], like a0 + a1x + … + anxⁿ, are only formal
sums (or symbols), since no meanings as such can be attached to the terms
aixⁱ, nor to the sum of such terms. To avoid this difficulty, we may denote
any element in R[x] as an infinite tuple (a0, a1, …, an, …), where all
ai ∈ R and all but finitely many ai are zero. With this form for any element in
R[x], addition and multiplication on R[x] are then defined as follows:
(a0, a1, …, an, …) + (b0, b1, …, bn, …) = (c0, c1, …, cp, …), where
ci = ai + bi for all i;
(a0, a1, …, an, …) · (b0, b1, …, bn, …) = (d0, d1, …, dr, …),
where di = Σ_{j=0}^{i} aj·b_{i-j} for all i.
With these definitions of addition and multiplication, one can show that
R[x] is a ring. Further, if we denote by x the infinite tuple (0, 1, 0, 0, …), on the
assumption that the ring R has identity 1, we can easily check that for any
positive integer i, xⁱ = (0, 0, …, 0, 1, 0, …) with 1 at the ith place (counting the
places from 0), and that the infinite tuples of the form (a0, 0, 0, …) form a subring
of R[x] isomorphic with R. So if we identify any such element (a0, 0, 0, …) with
the element a0, we can write any arbitrary element (a0, a1, …, an, 0, 0, …) of R[x]
in the form of the polynomial a0 + a1x + a2x² + … + anxⁿ. We have, however,
avoided this method of introducing the notion of R[x] in order to present the
ideas in a less abstract manner.
We will now derive certain familiar and important properties of the ring
R[x]. We will omit the proofs of some of the properties, as these can be
easily established from the definition of R[x] or are immediate
consequences of the definitions.
Properties of R[x]
1. The ring R can be embedded in R[x]; for all the so-called constant
polynomials, say a0x⁰ = a0, together with the zero polynomial 0, form a
subring of R[x] which is obviously isomorphic with R.
2. R[x] is a commutative ring iff R is, and this fact is easily proved.
3. If R has identity 1, R[x] also has identity, viz. 1·x⁰ = 1.
4. If R is an integral domain so is R[x], and conversely.
Proof of (4):
For the purpose of the proof of this property we define the notion of the degree
of any polynomial in R[x]. If f(x) is a non-zero polynomial and
f(x) = a0 + a1x + a2x² + … + anxⁿ, we say that f(x) is of degree n, provided
an ≠ 0. For the zero polynomial we do not assign any degree. We now make
the claim that the degree of the product of polynomials f(x) and g(x), which are
of degrees n and m respectively, cannot exceed n + m, and that it is precisely n + m
when R is an integral domain. In fact, let f(x) be the same as the polynomial
earlier mentioned and g(x) be the polynomial b0 + b1x + … + bmx^m, where
bm ≠ 0, the degree of g(x) being m. By the definition of multiplication in R[x],
we find that the coefficient of x^t in the product f(x)·g(x) is Σ_{i=0}^{t} ai·b_{t-i}. Now
ai = 0 whenever i > n, and bj = 0 whenever j > m. Suppose now t > m + n. If
0 ≤ i ≤ n, then t − i > m, so that b_{t-i} = 0; hence ai·b_{t-i} = 0. On the other hand, if
i > n, then ai = 0, so that again ai·b_{t-i} = 0. Thus when t > m + n all the terms
in Σ_{i=0}^{t} ai·b_{t-i} are zero, and hence the coefficient of x^t in the polynomial f(x)·g(x)
is zero, i.e. this term is absent. Hence the degree of f(x)·g(x) is at most m + n,
and the highest-degree coefficient in this product is obviously an·bm. Even this
coefficient may be absent, if the ring R has non-trivial divisors of zero and
an, bm are chosen in R such that an ≠ 0, bm ≠ 0, but an·bm = 0. In such cases the
degree of the product of the two polynomials will actually be less than the sum of the
degrees of the two polynomials. However, when R is an integral domain the
degree of f(x)·g(x) is precisely m + n, as an·bm ≠ 0, since both an and bm are ≠ 0.
Incidentally this shows that the product of any two non-zero polynomials in
R[x] can never be the zero polynomial; hence R[x] is an integral domain.
Further it is evident that when R[x] is an integral domain, so is R. Thus we have
shown that the ring R[x] is an integral domain iff R is.
Now we have the following theorem, which incidentally furnishes us
with many examples of a Euclidean ring.
Theorem 2:
If F is any field, the ring F[x] of all polynomials in x over F is a
Euclidean ring.
Proof:
Since F is a field, it is also an integral domain and hence, as we had seen
earlier, the ring F[x] is also an integral domain. So, in order to show that F[x] is
a Euclidean ring it is enough if we are able to assign a d-value to each non-zero
polynomial f(x) ∈ F[x] such that the d-value (which must be a non-negative
integer) has the required two properties mentioned in the definition of a
Euclidean ring, namely: (i) if f(x), g(x) are any two non-zero polynomials in
F[x], then d(f(x).g(x)) ≥ both d(f(x)) and d(g(x)); and (ii) given any
pair of polynomials a(x) ≠ 0, b(x) in F[x], we can write b(x) = q(x)a(x) + r(x),
where either r(x) is the zero polynomial or d(r(x)) < d(a(x)). For
this purpose we take the degree of any non-zero polynomial f(x) as d(f(x)).
Then it is clear that d(f(x)) is a non-negative integer, and since
d(f(x).g(x)) = d(f(x)) + d(g(x)) ≥ both d(f(x)) and d(g(x)), the property (i) required
of the d-value is satisfied. It remains for us to show that property (ii), viz. the
division algorithm, is also satisfied by this d-value. We establish this property (ii) by
induction on d(b(x)). For the sake of fixing our ideas let us assume that
a(x) = a_0 + a_1 x + ... + a_n x^n with a_n ≠ 0 and b(x) = b_0 + b_1 x + ... + b_m x^m, with
b_m ≠ 0. Then d(a(x)) = n and d(b(x)) = m. When m < n, the property is trivially
satisfied, since we can obviously write b(x) = q(x)a(x) + r(x) where q(x) is the
zero polynomial and r(x) = b(x), so that d(r(x)) = d(b(x)) < d(a(x)), as required.
Therefore we now have a basis for the induction proof, so we may assume as
induction hypothesis that the property is true whenever d(b(x)) < m and then
prove the truth of the property when d(b(x)) = m. Now let m ≥ n, where n is the
degree of a(x). Since a_n ≠ 0 in F, there exists the inverse a_n^(-1) in F such that
a_n^(-1) a_n = 1, the identity in the field F. A straightforward simplification will
show that the polynomial b(x) − b_m a_n^(-1) x^(m−n) a(x) is of degree ≤ m−1. If this
difference is the zero polynomial we may take q(x) = b_m a_n^(-1) x^(m−n) and r(x) = 0;
otherwise, by the induction hypothesis, we can write it as q_1(x)a(x) + r(x) with r(x)
the zero polynomial or d(r(x)) < d(a(x)), and then b(x) = (b_m a_n^(-1) x^(m−n) + q_1(x)) a(x) + r(x),
which is the required form. This completes the induction, and F[x] is a Euclidean ring.
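The induction step of the proof is exactly the familiar long-division step. A minimal sketch of the division algorithm over Q, using Python's exact fractions.Fraction (the helper name poly_divmod is illustrative):

```python
from fractions import Fraction

def poly_divmod(b, a):
    """Divide b(x) by a(x) over Q: return (q, r) with b = q*a + r and
    deg r < deg a. Polynomials are coefficient lists [c0, c1, ...]; a != 0."""
    b = [Fraction(c) for c in b]
    a = [Fraction(c) for c in a]
    while a and a[-1] == 0:              # normalize: strip trailing zeros of a
        a.pop()
    n = len(a) - 1                       # deg a
    q = [Fraction(0)] * max(len(b) - n, 1)
    r = b[:]
    while len(r) >= len(a) and any(r):
        while r and r[-1] == 0:
            r.pop()
        if len(r) < len(a):
            break
        # the step of the proof: subtract (r_m / a_n) x^(m-n) * a(x)
        coeff = r[-1] / a[-1]
        shift = len(r) - len(a)
        q[shift] = coeff
        for i, c in enumerate(a):
            r[shift + i] -= coeff * c
    return q, r

# b(x) = x^3 + 2x + 5 divided by a(x) = x^2 + 1 gives q = x, r = x + 5
q, r = poly_divmod([5, 2, 0, 1], [1, 0, 1])
assert q[:2] == [0, 1] and r[:2] == [5, 1]
```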
Problem 1:
Let F and K be two fields with F ⊂ K (i.e. K is an extension of the field
F). Suppose f(x), g(x) ∈ F[x] are relatively prime in F[x]. Prove that they are
relatively prime in K[x] also.
Solution:
Since F ⊂ K, it is clear that F[x] ⊂ K[x]. Moreover both F[x] and K[x] are
Euclidean rings. Since f(x) and g(x) in F[x] are relatively prime in F[x], their
g.c.d. in F[x] is 1, the identity in F[x] being 1, the identity in F. So, there exist
polynomials λ(x), μ(x) ∈ F[x] such that λ(x).f(x) + μ(x).g(x) = 1. This
relation holds good in K[x] also, since all the polynomials
λ(x), μ(x), f(x), g(x) are in K[x] as well. Hence if d(x) is any common divisor
of f(x) and g(x) in K[x], the above relation shows that d(x) is a divisor of the
identity element 1, which means that d(x) is a (non-zero) constant polynomial
in K[x]. Thus the g.c.d. of f(x) and g(x) in K[x] is also 1 or any unit. Therefore
f(x) and g(x) continue to be relatively prime even in K[x].
Problem 2:
Let F be the field of real numbers. Prove that F[x]/⟨x^2+1⟩ is a field isomorphic
to the field of complex numbers.
Solution:
Let A denote the ideal ⟨x^2+1⟩, generated by x^2+1, in the
Euclidean ring F[x]. Then A = { (x^2+1) f(x) : f(x) ∈ F[x] }. The residue
classes modulo A in F[x], i.e. the elements of F[x]/A, can be taken as A + (a+bx)
where a, b ∈ F. In fact, if p(x) ∈ F[x], then the residue class A + p(x) is the same
as A + (a+bx), where a+bx is the (unique) remainder obtained on dividing p(x) by
x^2+1: p(x) = q(x)(x^2+1) + r(x), where r(x) is the zero polynomial or d(r(x)) = degree of r(x) is
less than 2, which is the degree of x^2+1. Thus r(x) can be taken as a+bx,
where a, b ∈ F, with both a and b zero if r(x) is the zero polynomial. Now
p(x) − r(x) ∈ ⟨x^2+1⟩, and so the residue classes
p(x) + ⟨x^2+1⟩ and r(x) + ⟨x^2+1⟩ coincide, where r(x) = a+bx, a, b ∈ F. We
may denote these residue classes modulo A briefly by [a+bx], so that
F[x]/A = { [a+bx] : a, b ∈ F }. Now we claim that the mapping ψ : F[x]/A → C
(where C denotes the field of complex numbers) defined by
ψ([a+bx]) = a+bi is an isomorphism of the field F[x]/A onto the field C.
Clearly the mapping ψ is onto. Also ψ is one-to-one and well-defined, and this
fact is seen from the following chain of "iff" conditions:

    [a+bx] = [c+dx] ⟺ (a+bx) + A = (c+dx) + A
                    ⟺ (a+bx) − (c+dx) ∈ A = ⟨x^2+1⟩
                    ⟺ x^2+1 is a divisor of (a−c) + (b−d)x
                    ⟺ a−c = 0 and b−d = 0
                    ⟺ a = c and b = d
                    ⟺ a+bi = c+di
                    ⟺ ψ([a+bx]) = ψ([c+dx]).

Further, ψ is a (ring) homomorphism. In fact,

    ψ([a+bx] + [c+dx]) = ψ([(a+c) + (b+d)x]) = (a+c) + (b+d)i
                       = (a+bi) + (c+di) = ψ([a+bx]) + ψ([c+dx]),

which shows that ψ preserves addition. And

    ψ([a+bx].[c+dx]) = ψ([(a+bx)(c+dx)])
                     = ψ([ac + (ad+bc)x + bd x^2])
                     = ψ([(ac−bd) + (ad+bc)x])        (as bd(x^2+1) ∈ A)
                     = (ac−bd) + (ad+bc)i
                     = (a+bi)(c+di)
                     = ψ([a+bx]).ψ([c+dx]),

which shows that ψ preserves multiplication also. Hence our claim that ψ is an
isomorphism of the field F[x]/A onto the field C is established, and the problem is answered.
Note: Since x^2+1 is obviously irreducible over the field F of real
numbers, the ideal ⟨x^2+1⟩ is a maximal (and prime) ideal, and the residue class
ring F[x]/⟨x^2+1⟩ is a field.
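The isomorphism in the solution above can be exercised concretely. A sketch that models the residue class of a+bx as the pair (a, b), with multiplication reduced by x^2 ≡ −1 (the names mul_mod and psi are ours, not from the text):

```python
def mul_mod(u, v):
    """Multiply residue classes a+bx and c+dx in F[x]/<x^2+1>, using x^2 = -1."""
    a, b = u
    c, d = v
    # (a+bx)(c+dx) = ac + (ad+bc)x + bd x^2  ->  (ac - bd) + (ad + bc)x
    return (a * c - b * d, a * d + b * c)

def psi(u):
    """The isomorphism psi([a+bx]) = a+bi onto the complex numbers."""
    a, b = u
    return complex(a, b)

u, v = (2.0, 3.0), (1.0, -4.0)
# psi preserves multiplication ...
assert psi(mul_mod(u, v)) == psi(u) * psi(v)
# ... and addition (residue classes add componentwise):
assert psi((u[0] + v[0], u[1] + v[1])) == psi(u) + psi(v)
```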
Polynomials over the field of rational numbers:
In this section, unless otherwise stated, let F denote the field Q of
rationals. We now specialize to the (Euclidean) ring F[x] of polynomials with
rational coefficients. In most of the results we obtain in this section, the
polynomials are those in Z[x], i.e., those with integer coefficients. The main
result established here is Gauss' Lemma, and this will find an application in
proving an important theorem in the next and final section of this lesson, viz.
"if the ring R is a U.F.D., so is the ring R[x]". We begin with two definitions.
Definition 2:
A polynomial f(x) = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n, where the
coefficients a_i are integers, is said to be primitive if the g.c.d. of the coefficients
a_0, a_1, ..., a_n is 1.
For instance 3 + 4x + 6x^2 + 5x^3 is a primitive polynomial, while
3 + 6x + 9x^2 + 12x^3 + 3x^4 is not a primitive polynomial.
Definition 3:
If f(x) is any polynomial, as in the above definition, then the content of
f(x) is the g.c.d. of the coefficients a_0, a_1, ..., a_n.
The content of any primitive polynomial can be taken as 1. It is easy to
see that if d is the content of a polynomial f(x) in Z[x], then we can write
f(x) = d g(x), where g(x) is a primitive polynomial.
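The content and primitivity of the two examples in the text can be computed directly (a small sketch; content and is_primitive are illustrative names):

```python
from math import gcd
from functools import reduce

def content(coeffs):
    """Content of a polynomial in Z[x]: the g.c.d. of its coefficients."""
    return reduce(gcd, (abs(c) for c in coeffs))

def is_primitive(coeffs):
    return content(coeffs) == 1

# The two examples from the text:
assert is_primitive([3, 4, 6, 5])            # 3 + 4x + 6x^2 + 5x^3
assert content([3, 6, 9, 12, 3]) == 3        # 3 + 6x + 9x^2 + 12x^3 + 3x^4
# Writing f(x) = d * g(x) with g(x) primitive:
assert [c // 3 for c in [3, 6, 9, 12, 3]] == [1, 2, 3, 4, 1]
```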
Lemma 3:
The product of any two primitive polynomials in Z[x] is also primitive.
Proof: Let the two primitive polynomials be
f(x) = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n and g(x) = b_0 + b_1 x + b_2 x^2 + ... + b_m x^m.
Suppose the lemma is not true. Then all the coefficients in the polynomial
f(x).g(x) must be divisible by some integer > 1 and hence by some prime
divisor, say p, of that integer. We will now show that this is impossible. The
coefficient of x^t in the product polynomial f(x).g(x) is

    c_t = a_0 b_t + a_1 b_{t−1} + ... + a_t b_0, for 0 ≤ t ≤ m+n.

Since f(x) is primitive, not all coefficients a_i in f(x) are divisible by this
prime p. Let a_j be the first coefficient in f(x) which is not divisible by p.
Similarly, since g(x) is also primitive, we can find the first coefficient, say b_k, in
g(x) which is not divisible by p. (Here 0 ≤ j ≤ n and 0 ≤ k ≤ m.) Now the
coefficient of x^(j+k) in f(x).g(x) is

    c_{j+k} = (a_0 b_{j+k} + a_1 b_{j+k−1} + ... + a_{j−1} b_{k+1}) + a_j b_k + (a_{j+1} b_{k−1} + a_{j+2} b_{k−2} + ... + a_{j+k} b_0).

By our choice, p is a divisor of a_0, a_1, ..., a_{j−1} and hence
p | (a_0 b_{j+k} + a_1 b_{j+k−1} + ... + a_{j−1} b_{k+1}). Similarly p is a divisor of b_{k−1}, b_{k−2}, ..., b_0
and hence p | (a_{j+1} b_{k−1} + a_{j+2} b_{k−2} + ... + a_{j+k} b_0). But by our assumption p | c_{j+k}.
Hence the expression for c_{j+k} shows that p | a_j b_k. But since p is a prime, p
must be a divisor of at least one of a_j and b_k, which is a contradiction to our choice
of a_j and b_k. Thus f(x).g(x) must be primitive, and the lemma is proved.
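Lemma 3 can be illustrated (not proved) by a quick check on a pair of primitive polynomials whose coefficients nonetheless share many factors pairwise (helper names are ours):

```python
from math import gcd
from functools import reduce

def content(coeffs):
    """Content of a polynomial in Z[x]: the g.c.d. of its coefficients."""
    return reduce(gcd, (abs(c) for c in coeffs))

def poly_mul(f, g):
    """Multiply coefficient lists [a0, a1, ...] over Z."""
    prod = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            prod[i + j] += a * b
    return prod

# Two primitive polynomials (contents 1) ...
f = [6, 10, 15]      # 6 + 10x + 15x^2 : gcd(6,10,15) = 1
g = [10, 15, 6]      # 10 + 15x + 6x^2 : gcd(10,15,6) = 1
assert content(f) == content(g) == 1
# ... whose product is again primitive, as Lemma 3 asserts:
assert content(poly_mul(f, g)) == 1
```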
Now we are in a position to establish Gauss' lemma, which is stated as a
theorem below.
Theorem 3: (Gauss' lemma)
If a primitive polynomial f(x) ∈ Z[x] can be factored over F, i.e., can be
factored as a product of two polynomials with rational coefficients, then f(x) can
be factored as the product of two polynomials with integer coefficients.
Proof:
Assume f(x) = a(x).b(x), where a(x), b(x) ∈ F[x]. Suppose d_1 is a
common multiple of the denominators of the coefficients of a(x). We can write
a(x) = a_1(x)/d_1, where a_1(x) has integer coefficients. If c_1 is the content of
a_1(x), then a_1(x) = c_1 λ(x), where λ(x) is a primitive polynomial, and so
a(x) = (c_1/d_1) λ(x). Similarly, treating b(x), we can write b(x) = (c_2/d_2) μ(x), where
c_2, d_2 are integers and μ(x) is a primitive polynomial.
Hence
a(x).b(x) = (c/d) λ(x).μ(x), where c/d = (c_1/d_1).(c_2/d_2), i.e., f(x) = (c/d) λ(x).μ(x), so that
d f(x) = c λ(x) μ(x). Now f(x), λ(x), μ(x) are all in Z[x] and c, d ∈ Z. Since
both λ(x) and μ(x) are primitive, by the last lemma λ(x).μ(x) is also
primitive. Also f(x) is primitive. Thus, equating the contents of both sides of
the equation d f(x) = c λ(x) μ(x), we find d = c. Hence c/d = 1 and
f(x) = a(x)b(x) = λ(x) μ(x). Thus f(x) has been factored as a product of two
polynomials, viz. λ(x) and μ(x), with integer coefficients.
Hence the theorem.
As an immediate consequence of the above theorem we have the
following corollary.
Corollary:
Any primitive polynomial irreducible over Z continues to be irreducible over F,
the field of rationals.
The following theorem gives a sufficient condition for the irreducibility
of a polynomial in Z[x].
Theorem 4: (The Eisenstein Criterion)
Let f(x) = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n be a polynomial in Z[x], of
degree n. If there exists a prime number p such that
i) p is a divisor of a_i for i = 0, 1, 2, ..., n−1,
ii) p is not a divisor of a_n,
iii) p^2 is not a divisor of a_0,
then f(x) is irreducible over Z and hence irreducible over F.
Proof:
Without loss of generality we may suppose that f(x) is primitive, for
otherwise, if d (≠ 1) is the content of f(x), then we may divide each coefficient
in f(x) by d and obtain a primitive polynomial f_1(x), say. Since p is not a
divisor of a_n, p will not be a prime divisor of d, and so the conditions (i), (ii), (iii)
imposed on the coefficients of f(x) still hold good in the case of f_1(x) also. We
may then treat f_1(x) instead of f(x) and establish the theorem. Thus we may
take f(x) to be a primitive polynomial. Suppose now f(x) can be factorised as a
product of two polynomials over F; then by Gauss' lemma f(x) can be factorised
as a product of two polynomials over Z. Thus, if we assume that the theorem is
false, that is, f(x) is not irreducible, then we can write
f(x) = (b_0 + b_1 x + ... + b_r x^r).(c_0 + c_1 x + ... + c_s x^s), where the b's and c's are
integers and r > 0, s > 0 such that r + s = n. Equating the constant terms on both
sides of the above identity we get a_0 = b_0 c_0. Now, by the data, the prime p is a
divisor of a_0 and hence is a divisor of at least one of b_0 and c_0. But since p^2 is
not a divisor of a_0, p cannot be a divisor of both b_0 and c_0. So let us assume that p
is a divisor of b_0 and not a divisor of c_0. Now not all the coefficients
b_0, b_1, ..., b_r can be divisible by p, for otherwise all the coefficients in f(x) would
then be divisible by p, and this is false as p does not divide a_n. So, let b_k be the
first coefficient that is not divisible by p. By our assumptions this means that
p | b_0, b_1, ..., b_{k−1} and 0 < k ≤ r < n. Now, from the factorization of f(x),
equating the coefficients of x^k we obtain a_k = b_0 c_k + b_1 c_{k−1} + ... + b_k c_0. Since
0 < k < n, in the above equation we find that p is a divisor of the integers
a_k, b_0 c_k, b_1 c_{k−1}, ..., b_{k−1} c_1 (by our assumption p is a divisor of a_i for i = 0, 1, 2, ..., n−1
and also of b_0, b_1, ..., b_{k−1}). This fact implies that p | b_k c_0. But this is impossible
since, by our assumption, p is neither a divisor of b_k nor a divisor of c_0. This
contradiction proves the theorem.
As an application of the above theorem we prove, in the following
problem, the standard result viz. "the cyclotomic polynomial
1 + x + x^2 + ... + x^(p−1), where p is a prime, is irreducible over F".
Problem 3:
Prove that the polynomial 1 + x + x^2 + ... + x^(p−1), where p is a prime
number, is irreducible over the field F of rational numbers.
Solution:
Let f(x) denote the polynomial 1 + x + x^2 + ... + x^(p−1). Since all the
coefficients in f(x) are 1, the Eisenstein criterion for irreducibility cannot as
such be applied to f(x). So, remembering the obvious fact that f(x) is
irreducible iff f(x+1) is irreducible, we apply the criterion to f(x+1) and establish that f(x+1) is
irreducible. Now f(x+1) = 1 + (x+1) + (x+1)^2 + ... + (x+1)^(p−1), which, on
summing the G.P., can also be written as equal to ((x+1)^p − 1)/((x+1) − 1).
Thus

    f(x+1) = ((x+1)^p − 1)/x
           = x^(p−1) + C(p,1) x^(p−2) + C(p,2) x^(p−3) + ... + C(p,r) x^(p−r−1) + ... + C(p,p−2) x + C(p,p−1),

where C(p,r) = p(p−1)(p−2)...(p−r+1)/(1.2.3.....r) and the constant term is C(p,p−1) = p.
Consider now the integer

    C(p,r) = p(p−1)(p−2)...(p−r+1)/(1.2.3.....r), where 1 ≤ r ≤ p−1.

Since p is a prime, none of the factors 2, 3, ..., r in the denominator can
divide p, which is in the numerator. However, since C(p,r) is an integer, these
factors must cancel against the remaining factors in the numerator, viz.
(p−1)(p−2)...(p−r+1). Hence (p−1)(p−2)...(p−r+1)/(1.2.3.....r) is an integer, and therefore the
integer C(p,r) is divisible by p for 1 ≤ r ≤ p−1. Thus the prime p is a
divisor of all the coefficients in f(x+1) except the (leading) coefficient
of x^(p−1); further, p^2 is not a divisor of the constant term, which is C(p,p−1) = p.
Thus the prime p satisfies all the conditions required for the Eisenstein criterion.
Hence we conclude that the polynomial f(x+1) is irreducible over F, and so f(x)
is also irreducible over F.
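The coefficients of f(x+1) and the divisibility claims of the solution can be verified numerically (a sketch; the function name is ours, and the binomial coefficients are supplied by math.comb):

```python
from math import comb

def shifted_cyclotomic_coeffs(p):
    """Coefficients of f(x+1) = ((x+1)^p - 1)/x, lowest degree first,
    where f(x) = 1 + x + ... + x^(p-1) and p is prime."""
    # ((x+1)^p - 1)/x has coefficient C(p, r+1) on x^r
    return [comb(p, r + 1) for r in range(p)]

p = 7
coeffs = shifted_cyclotomic_coeffs(p)
assert coeffs[-1] == 1                       # leading coefficient of x^(p-1)
assert coeffs[0] == p                        # constant term C(p,1) = p; p^2 does not divide it
assert all(c % p == 0 for c in coeffs[:-1])  # p divides every other coefficient
```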
Polynomial Rings over Commutative Rings:
This is the last section of the lesson, and the main result proved in this
section is: if a ring R is a U.F.D. (the definition of a U.F.D. is given in this
section), then so is R[x]. Suppose R denotes an integral domain with identity;
then we can talk about divisibility, units, associates, prime or irreducible elements,
g.c.d., etc. in R. Whenever a = bc, where a, b, c ∈ R, we say that b and c are divisors
of a and write b | a, c | a. An element u ∈ R is said to be a unit in R if there exists
an element v in R (viz. the inverse of u) such that uv = 1. If a = ub, where u is a unit,
the elements a and b of R are said to be associates. One can easily see that a and b
are associates in R iff each of a, b divides the other. An element p ∈ R is said to be
a prime element (or an irreducible element) if p is not a unit and if the only
divisors of p are units or associates of p. Now we come to the important
definition of a U.F.D.
Definition 4:
An integral domain R with identity is said to be a unique factorization
domain (abbreviated as U.F.D.) if (i) any non-zero element in R is either a unit
or can be factorised as a product of finitely many prime (or irreducible)
elements of R; (ii) the factorization mentioned above is unique but for order
and associates of the prime elements.
In the last section, we had shown (in a theorem) that the above two
defining properties of a U.F.D. are enjoyed by a Euclidean ring. Thus any
Euclidean ring, and in particular Z (the ring of integers), Z[i] (the ring of all
Gaussian integers) and F[x] (the ring of all polynomials in x over any field F),
are all examples of a U.F.D. However the converse is false. That is, a
U.F.D. need not be a Euclidean ring. In fact, one can show that Z[x], the ring of
all polynomials in x over Z, is only a U.F.D. and not a Euclidean ring. So also,
the ring F[x_1, x_2] of all polynomials in the two indeterminates over a field F is
only a U.F.D. and not a Euclidean ring. The symbol F[x_1, x_2] denotes F[x_1][x_2],
which results from F by adjoining first x_1 to obtain the ring F_1 = F[x_1] and then
adjoining x_2 to F_1, so as to obtain F_1[x_2] = F[x_1][x_2]. The fact that Z[x] is a U.F.D.
follows from classical algebra, in which we have studied standard properties of
polynomials in x with integer coefficients. So also, one can recognize that the
ring F[x_1, x_2] is a U.F.D. However both Z[x] and F[x_1, x_2] are not Euclidean
rings. For, in the case of Z[x], we have the example of a prime ideal which is
not maximal (Refer Lesson 6). Thus Z[x] cannot be a Euclidean ring, for
we know that in a Euclidean ring all prime ideals are maximal. Again, in the
case of the ring F[x_1, x_2], the ideal generated by x_1 and x_2 cannot be a principal
ideal, as can be shown. Hence F[x_1, x_2] cannot be a Euclidean ring, for it is
known that in a Euclidean ring all ideals are principal ideals.
We list below some of the divisibility properties of a U.F.D., and these
can be easily established just as in the case of the ring Z. Let R denote a
U.F.D. in all that follows here. Then:
(i) For a, b ∈ R, there exists a g.c.d., denoted by (a, b), for the pair of elements
a, b, and it is unique but for associates. In fact, by the definition of a U.F.D. we can write

    a = p_1^(α_1) p_2^(α_2) ... p_m^(α_m),  b = q_1^(β_1) q_2^(β_2) ... q_n^(β_n),

where the p's and the q's are distinct primes and the α's and β's are positive
integers. It may or may not happen that some or all of the primes p_i are the
same as the primes q_j and vice-versa. Let us suppose that, after relabeling if
necessary the p's and the q's, p_i and q_i are the only primes which are associates,
for 1 ≤ i ≤ k, where k ≤ min(m, n). With these assumptions made, one can easily check that

    (a, b) = p_1^(γ_1) p_2^(γ_2) ... p_k^(γ_k), where γ_i = min(α_i, β_i) for i = 1, 2, ..., k.

If, on the other hand, no prime p_i is an associate of any q_j and vice-versa,
the g.c.d. (a, b) = 1. Thus the g.c.d. (a, b) can be determined.
(ii) If a and b are relatively prime in R, i.e. (a, b) = 1, then for
c ∈ R, a | bc ⟹ a | c. (This fact is easily proved.)
(iii) If a is a prime element in R, then for b, c ∈ R, a | bc ⟹ a | b or a | c.
(This is a corollary of the result (ii).)
It is possible to establish an analogue of Gauss' lemma for the case of
R[x] when R is a U.F.D., just as when R = Z. For this purpose we make the
following analogous definitions.
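The g.c.d. construction via minimum exponents described in (i) above can be sketched in the prototypical U.F.D., Z (trial-division factorization; function names are illustrative):

```python
def factorize(n):
    """Prime factorization of n > 0 as {prime: exponent} (trial division)."""
    factors, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def gcd_from_factorizations(a, b):
    """g.c.d. built exactly as in (i): common primes to the minimum exponent."""
    fa, fb = factorize(a), factorize(b)
    g = 1
    for p in fa:
        if p in fb:
            g *= p ** min(fa[p], fb[p])
    return g

assert gcd_from_factorizations(360, 300) == 60   # 2^3*3^2*5 and 2^2*3*5^2 -> 2^2*3*5
assert gcd_from_factorizations(8, 15) == 1       # no common primes
```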
Definition 5:
If f(x) = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n is a polynomial in R[x], then the
content of f(x), denoted by c(f), is the g.c.d. of the coefficients
a_0, a_1, a_2, ..., a_n.
One can note that c(f) is unique but for unit factors.
Definition 6:
A polynomial f(x) in R[x] is said to be a primitive polynomial if c(f) = 1
or a unit.
It is quite easy to see that for any polynomial f(x) in R[x] one can write
f(x) = c(f).f_1(x), where f_1(x) is a primitive polynomial. Moreover the above
decomposition of f(x) as a product of the element c(f) of R and the primitive
polynomial f_1(x) in R[x] is unique but for unit factors.
Lemma 4:
If R is a U.F.D, then the product of any two primitive polynomials in
R[x] is also primitive.
Since we had remarked earlier that any polynomial f(x) is R[x] can be
written as f(x)=c(f), f1 ( x) where f1 ( x) is a primitive polynomial in R[x], the
above leema leads us to the following corollary.
Corollary:
If d ( x), g ( x)  R[ x], , then c( fg )  c( f )c( g ) (upto units) and
generally, if f1 ( x) f 2 ( x)....... f n ( x)  R[ x] , then
c( f1 f 2 ....... f n )  c( f1 )(cf 2 )......c( f n ) (upto units)
Now we state and prove analogue of Gauss’ lemma for the case of a
U.E.D.
Lemma 5:
Let R be a U.F.D. and f(x) in R(x) be primitive. Suppose F is the field
of quotients of R, in which R can be imbedded. Then f(x) is irreducible over F
(i.e. as an element in F[x]). Iff it is irreducible over R (i.e. as an element in
R[x]).
Proof:
Since any element of F can be taken as a/b, where a, b ∈ R and b ≠ 0, the
ring R[x] can be considered as a sub-ring of F[x]. Hence if the (primitive)
polynomial f(x) in R[x] is irreducible over F, then certainly f(x) is irreducible
over R, as otherwise any possible non-trivial factorization of f(x) over R is
clearly valid over F also. Conversely, suppose f(x) is irreducible over R. We
prove that it is irreducible over F by reductio ad absurdum. So assume that f(x) has a non-trivial
factorization over F, i.e., let f(x) = g(x) h(x), where g(x) and h(x) are non-constant
polynomials over F. (Here we make use of the easily provable fact that the only
units in R[x], where R is any integral domain with identity, are the units in R.)
Since the coefficients of g(x) are all quotients of the form a/b with
a, b ∈ R and b ≠ 0, we can replace all the coefficients of g(x) by quotients with
the same denominator, say d_1 ∈ R, and so we can write g(x) = g_1(x)/d_1, where
g_1(x) ∈ R[x]. Similarly we can write h(x) = h_1(x)/e_1, where
e_1 ∈ R, h_1(x) ∈ R[x]. Let c(g_1) = a_1, c(h_1) = b_1, so that we can take
g_1(x) = a_1 g_2(x), h_1(x) = b_1 h_2(x), where g_2(x), h_2(x) ∈ R[x] and both are
primitive. Now

    f(x) = g(x) h(x) = g_1(x) h_1(x)/(d_1 e_1) = (a_1 b_1)/(d_1 e_1) . g_2(x) h_2(x).

Since both g_2(x) and h_2(x) are primitive, we get, in view of the last Lemma 4, that
the product g_2(x).h_2(x) is also primitive. Thus we can write f(x) = (p/q) f_1(x),
where p = a_1 b_1, q = d_1 e_1, and f_1(x) = g_2(x).h_2(x) is a primitive polynomial in R[x]. Hence
q f(x) = p f_1(x), and equating the contents of both sides we get q = p, since
both f(x) and f_1(x) are primitive. This means that f(x) = f_1(x) = g_2(x) h_2(x),
and so f(x) admits a non-trivial factorization over R, as both g_2(x) and
h_2(x) are clearly non-constant polynomials in R[x]. But this is a contradiction
to our initial assumption that f(x) is irreducible over R, and therefore the lemma
is proved.
Before we can establish the main and final theorem of this section, we
need one more lemma, which proves the unique factorization of any primitive
polynomial in R[x].
Lemma 6:
If R is a U.F.D. and f(x) in R[x] is a primitive polynomial, then f(x) can
be factored in a unique manner as a product of irreducible polynomials in R[x]
(the uniqueness being but for order and unit factors).
Proof:
Suppose F is the field of quotients of the integral domain R. Then the
ring R[x] can be considered as a sub-ring of the ring F[x], so that f(x) in R[x]
can be considered as a polynomial in F[x]. But, F being a field, we know that
F[x] is a Euclidean ring and hence is a U.F.D. Therefore, as an element of F[x],
f(x) admits a unique factorization into irreducible polynomials over F, say
f(x) = f_1(x) f_2(x) ... f_k(x), where each f_i(x) is a polynomial irreducible
over F. As we had mentioned in the proof of the last lemma, each f_i(x) can be
written as (c_i/d_i) p_i(x), where c_i, d_i ∈ R and p_i(x) is a primitive polynomial in
R[x]. But since f_i(x) is irreducible over F, so is p_i(x), and hence, by the last
lemma, p_i(x) is irreducible over R. Thus we are able to write

    f(x) = f_1(x) f_2(x) ... f_k(x) = (c_1 c_2 ... c_k)/(d_1 d_2 ... d_k) . p_1(x) p_2(x) ... p_k(x),

where each p_i(x) is a primitive, irreducible polynomial in R[x] and c_i, d_i ∈ R for all
i = 1, 2, ..., k. The above equation gives
d_1 d_2 ... d_k f(x) = c_1 c_2 ... c_k p_1(x) p_2(x) ... p_k(x). Now, by the data, f(x) is primitive;
p_1(x) p_2(x) ... p_k(x) is also primitive, as each p_i(x) is primitive and the
product of primitive polynomials is also primitive over R (by Lemma 4).
Therefore, equating the contents of both sides of the last equation, we get
d_1 d_2 ... d_k = c_1 c_2 ... c_k, and so we obtain f(x) = p_1(x) p_2(x) ... p_k(x). That is,
we have factored f(x) as a product of irreducible polynomials p_i(x) in R[x].
Further, this factorization is unique to within order and unit factors, since F[x] is
a U.F.D. (as earlier mentioned) and each p_i(x) in R[x], being primitive and
irreducible, is irreducible over F. Hence the lemma follows.
Now we have all the necessary materials for the proof of our main
theorem, which we now state and prove.
[Note: For the proof of this main theorem, one must first establish
Lemmas 5 and 6 and then prove the theorem; Lemma 4 may also be
quoted.]
Theorem 5:
If R is a U.F.D., so is R[x].
Proof:
Let f(x) be any arbitrary polynomial over R. Then we can write f(x) in a
unique way as f(x) = c(f) f_1(x), where c(f) is the content of f(x) and f_1(x) is a
primitive polynomial in R[x]. Now c(f) is in R and is unique but for unit factors.
Also, since f_1(x) is primitive, by the last lemma f_1(x) can be factored in a
unique manner as a product of irreducible polynomials in R[x], unique but for
the order of the factors. Again, since c(f) is in R and R is a U.F.D., c(f) admits a
unique factorization as a product of prime elements in R, and we claim that this
is the only factorization for c(f) when it is factored over R[x] (i.e., as an element
of R[x]), and that the primes in R are irreducible even as elements of R[x]. In
fact, suppose we factorize any element of R, in particular c(f), over R[x] and obtain
c(f) = a_1(x).a_2(x).....a_m(x). Then, equating the degrees of both sides,
considering c(f) as an element of R[x], we get
0 = deg(a_1(x)) + deg(a_2(x)) + ... + deg(a_m(x)). Since the degree of
any non-zero polynomial is a non-negative integer, the above equation implies
that each a_i(x) is a polynomial of degree 0, i.e., each a_i(x) is an element of R.
Hence the only possible factorizations of c(f) are those it can have as an element
of R. Hence we have obtained the unique factorization of f(x) by putting
together the unique factorization of c(f) as an element of R and the unique
factorization of the primitive polynomial f_1(x) over R. This concludes the
proof of the theorem.
Just as we had defined the ring R[x] of all polynomials in a single
indeterminate x over R, we can define the ring R[x_1, x_2, ..., x_n] of all
polynomials in n indeterminates (or variables) x_1, x_2, ..., x_n over R. This can be
defined successively as follows. Let R_1 = R[x_1] be the ring obtained from R by
the (ring) adjunction of the one indeterminate x_1. Again, by adjoining another indeterminate x_2 to
R_1, we obtain another ring R_2 = R_1[x_2] = R[x_1][x_2]. We denote this ring R_2 by
R[x_1, x_2] and call it the ring of all polynomials in the two variables x_1, x_2 with
coefficients in R. Similarly we may adjoin yet another variable x_3 to R_2 and
obtain the ring R_3 = R_2[x_3]. Proceeding in this manner we obtain the ring
R[x_1, x_2, ..., x_n] of all polynomials in the n variables (or indeterminates)
x_1, x_2, ..., x_n over R. When R is an integral domain,
R[x_1], R[x_1, x_2], R[x_1, x_2, x_3], ..., and in general R[x_1, x_2, ..., x_n], are all
integral domains. By successive application of the last theorem the
following corollary follows:
Corollary:
If R is a U.F.D., so is R[x_1, x_2, ..., x_n].
Again, when F is a field it is
known that F[x_1] is a Euclidean domain and so is a U.F.D. Hence by
successive application of the theorem we obtain the following corollary also.
Corollary:
If F is any field, then the ring F[x_1, x_2, ..., x_n] is a U.F.D.
Solved problems:
1. Find all the units of the ring Z[i] of Gaussian integers.
Solution:
Z[i] = { a + ib : a, b ∈ Z } is the ring of Gaussian integers, and
1 + 0i is its unit (identity) element.
Let x + iy be a unit and x' + iy' be its inverse.
Then (x + iy)(x' + iy') = 1 + 0i
⟹ (xx' − yy') + i(xy' + yx') = 1 + 0i.
Equating real and imaginary parts,

    xx' − yy' = 1 and xy' + yx' = 0.

Squaring and adding we get

    x^2 x'^2 + y^2 y'^2 + x^2 y'^2 + y^2 x'^2 = 1
⟹ (x^2 + y^2)(x'^2 + y'^2) = 1
⟹ x^2 + y^2 = 1 (since a product of two positive integers can be equal
to 1 iff each of them is 1)
⟹ x^2 = 0, y^2 = 1 or x^2 = 1, y^2 = 0.

Then x = 0, y = ±1 or x = ±1, y = 0.
The only units of Z[i] are 0 ± i and ±1 + 0i,
i.e., 1, −1, i, −i.
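A brute-force cross-check of this result, searching a small box of Gaussian integers for inverses (the helper is_unit and the search bound are ours):

```python
def is_unit(a, b, bound=3):
    """Check whether a+bi is a unit in Z[i] by searching for an inverse x+iy
    with |x|, |y| <= bound (a brute-force sketch)."""
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            # (a+bi)(x+iy) = (ax - by) + (ay + bx)i must equal 1 + 0i
            if a * x - b * y == 1 and a * y + b * x == 0:
                return True
    return False

units = [(a, b) for a in range(-2, 3) for b in range(-2, 3)
         if (a, b) != (0, 0) and is_unit(a, b)]
assert sorted(units) == [(-1, 0), (0, -1), (0, 1), (1, 0)]   # i.e. -1, -i, i, 1
```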
2. If R is an integral domain and if F is its field of quotients, prove that any
element f(x) in F[x] can be written as g(x)/a, where g(x) ∈ R[x] and a ∈ R.
Solution:
F = { p/q : p ∈ R, 0 ≠ q ∈ R }.
Let f(x) ∈ F[x]. Then

    f(x) = a_0/b_0 + (a_1/b_1) x + ... + (a_n/b_n) x^n,

where a_0, a_1, ..., a_n ∈ R and b_0, b_1, ..., b_n are non-zero elements of R.
Now b_0, b_1, ..., b_n are non-zero elements of F,
so the product b_0 b_1 ... b_n is also a non-zero element of F and hence is invertible.
Therefore

    f(x) = (b_0 b_1 ... b_n)/(b_0 b_1 ... b_n) . (a_0/b_0 + (a_1/b_1) x + ... + (a_n/b_n) x^n)
         = [(a_0 b_1 b_2 ... b_n) + (b_0 a_1 b_2 ... b_n) x + ... + (b_0 b_1 ... b_{n−1} a_n) x^n]/(b_0 b_1 ... b_n)
         = g(x)/a,

where
g(x) = (a_0 b_1 b_2 ... b_n) + (b_0 a_1 b_2 ... b_n) x + ... + (b_0 b_1 ... b_{n−1} a_n) x^n ∈ R[x] and a = b_0 b_1 ... b_n ∈ R.
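The construction of the solution, multiplying through by the product of the denominators, can be sketched as follows (the helper name clear_denominators is ours; for simplicity a is taken as the product of all denominators rather than their l.c.m.):

```python
from fractions import Fraction
from math import prod

def clear_denominators(coeffs):
    """Write f(x) in Q[x] as g(x)/a with g(x) in Z[x] and a in Z,
    taking a = product of all the denominators, as in the solution above."""
    coeffs = [Fraction(c) for c in coeffs]
    a = prod(c.denominator for c in coeffs)
    g = [int(c * a) for c in coeffs]      # each c*a is an integer
    return g, a

# f(x) = 1/2 + (2/3) x + (3/4) x^2  ->  g(x)/a with a = 2*3*4 = 24
g, a = clear_denominators([Fraction(1, 2), Fraction(2, 3), Fraction(3, 4)])
assert a == 24 and g == [12, 16, 18]
assert all(Fraction(gi, a) == c for gi, c in
           zip(g, [Fraction(1, 2), Fraction(2, 3), Fraction(3, 4)]))
```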
3. Prove that x^4 + 2x + 2 is irreducible over the field of rational numbers.
Solution:
Let f(x) = 2 + 2x + 0x^2 + 0x^3 + 1x^4.
Now f(x) is a polynomial with integer coefficients. Also 2 is a prime
number such that 2 divides each of the coefficients of f(x) except the coefficient
1 of the last term x^4, and 2^2 is not a divisor of the constant term 2. Hence, by
the Eisenstein criterion of irreducibility, f(x) is irreducible over the field of
rational numbers.
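The three conditions of the Eisenstein criterion can be packaged as a simple check (a sketch; the function tests one given prime only, and its name is ours):

```python
def eisenstein_applies(coeffs, p):
    """Check Eisenstein's criterion for f = sum coeffs[i] x^i with the prime p:
    p | a_i for i < n, p does not divide a_n, p^2 does not divide a_0."""
    a0, an = coeffs[0], coeffs[-1]
    return (all(c % p == 0 for c in coeffs[:-1])
            and an % p != 0
            and a0 % (p * p) != 0)

# f(x) = 2 + 2x + 0x^2 + 0x^3 + x^4 with p = 2 (the problem above):
assert eisenstein_applies([2, 2, 0, 0, 1], 2)
# x^2 - 1 = (x-1)(x+1): p = 2 fails, since 2 does not divide -1
assert not eisenstein_applies([-1, 0, 1], 2)
```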
Exercises
1. Show that the units in a commutative ring with identity form a group under
multiplication.
2. Prove that the set N of all polynomials in Z[x] whose constant terms are
divisible by a fixed prime p is a prime ideal of the ring Z[x]. Is this ideal N a
maximal ideal?
3. Prove that, in an integral domain R with identity, two elements a, b of R
generate the same ideal iff a and b are associates.
4. Prove that, in a Euclidean ring, any two greatest common divisors of a pair of
elements of the ring are associates.
5. Find the greatest common divisor in Z[i] of:
a. 3+4i and 4-3i
b. 11+7i and 18-i
6. If R is a commutative ring, let N = { x ∈ R : x^n = 0 for some integer n }.
Prove:
a. N is an ideal of R.
b. In R̄ = R/N, if x̄^m = 0̄ for some m, then x̄ = 0̄.
7. Let R be a commutative ring and suppose that A is an ideal of R. Let
N(A) = { x ∈ R : x^n ∈ A for some n }.
Prove:
a. N(A) is an ideal of R which contains A.
b. N(N(A)) = N(A).
Note: N(A) is often called the radical of A.
8. Prove that no prime of the form 4n+3 can be written as a^2 + b^2, where a and
b are integers.
9. Prove that x^2 + x + 1 is irreducible over F, the field of integers mod 2.
10. Let D be a Euclidean ring and F its field of quotients. Prove the analogue of
Gauss' Lemma for polynomials with coefficients in D factored as products of
polynomials with coefficients in F.
11. If R is an integral domain with identity, prove that any unit in R[x] must
already be a unit in R.
12. Prove that when F is a field, F[x_1, x_2] is not a principal ideal ring.
13. Prove that a principal ideal ring is a unique factorization domain.
14. If Z is the ring of integers, show that Z[x_1, x_2, ..., x_n] is a unique
factorization domain.
Unit - III
VECTOR SPACES AND MODULES
Definition:
A non-empty set V is said to be a vector space over a field F if V is an
abelian group under an operation denoted by ‘+’ and if, for every
α ∈ F, v ∈ V, there is defined an element, written αv, in V satisfying the
following conditions:
(1) α(v + w) = αv + αw
(2) (α + β)v = αv + βv
(3) α(βv) = (αβ)v
(4) 1·v = v
Here α, β ∈ F and v, w ∈ V, and 1 represents the unit element of F
under multiplication.
Note (1): The ‘+’ in condition (1) above is that of the group (V, +), while the
‘+’ on the left hand side of condition (2) is that of the field F and the ‘+’ on
the right hand side is that of the group (V, +).
Note (2): If V is a vector space over F, then there is a mapping
F × V → V satisfying the above four conditions. This mapping is usually called
scalar multiplication. We often refer to the elements of F as scalars and the
elements of V as vectors.
Example 1:
Let F be a field and let K be a field which contains F as a subfield. We
consider K as a vector space over F, using addition in the field K as the ‘+’ of
the vector space and by defining, for α ∈ F, v ∈ K, αv to be the product of α
and v as elements in the field K. The axioms (1), (2), (3) for a vector space are
consequences of the right distributive law, left distributive law and
associative law, respectively, which hold for K as a ring.
Example 2:
Let F be a field and let V be the totality of all ordered n-tuples
(α1, α2, α3, …, αn) where αi ∈ F. Two elements
(α1, α2, …, αn) and (β1, β2, …, βn) of V are declared to be equal if and
only if αi = βi for each i = 1, 2, 3, …, n. We now introduce the operation ‘+’ in
V as (α1, α2, …, αn) + (β1, β2, …, βn) = (α1 + β1, α2 + β2, …, αn + βn)
(componentwise addition). Thus V becomes an abelian group under this
addition ‘+’. Now let us define the map F × V → V (i.e. scalar multiplication in
V) by γ(α1, α2, …, αn) = (γα1, γα2, …, γαn) where γ ∈ F. One can
easily verify that V is a vector space over F, and this vector space is usually
denoted by F^(n).
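The componentwise operations of Example 2 are easy to mimic on a machine. The sketch below is an illustration only, not part of the text: it models F^(n) with F taken to be the rational field, using Python's fractions module so that all arithmetic is exact (the helper names vec_add and scal_mul are ours).

```python
# A numeric model of Example 2: the space F^(n) of ordered n-tuples over
# the rational field, with exact arithmetic via fractions.Fraction.
from fractions import Fraction

def vec_add(u, v):
    """Componentwise addition (a1,...,an) + (b1,...,bn)."""
    return tuple(a + b for a, b in zip(u, v))

def scal_mul(c, v):
    """Scalar multiplication c(a1,...,an) = (c*a1,...,c*an)."""
    return tuple(c * a for a in v)

v = (Fraction(4), Fraction(5), Fraction(6))
a, b = Fraction(2), Fraction(3)
# Axiom (2) of the definition: (a + b)v = av + bv
assert scal_mul(a + b, v) == vec_add(scal_mul(a, v), scal_mul(b, v))
```

The same two functions verify the other axioms at sample points in the obvious way.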
Definition : Subspace
If V is a vector space over F and if W ⊆ V, then W is called a subspace of V
if, under the same operations as in V, W itself forms a vector space over F.
Equivalently, W is a subspace of V iff w1, w2 ∈ W and α, β ∈ F always imply
that αw1 + βw2 ∈ W. (Supply the proof yourself.)
Lemma 1:
If V is a vector space over F, then
(1) α·0 = 0
(2) 0·v = 0
(3) (−α)v = −(αv)
(4) if v ≠ 0, then αv = 0 implies that α = 0,
where α ∈ F and v ∈ V; here 0 represents the zero for addition in V and 0
represents the zero for addition in F.
Proof:
1) Since α0 = α(0 + 0) = α0 + α0, we get α0 = 0.
2) Since 0v = (0 + 0)v = 0v + 0v, we get 0·v = 0.
3) Since 0 = 0·v = (α + (−α))v = αv + (−α)v, we get (−α)v = −(αv).
4) If αv = 0 and α ≠ 0, then 0 = α⁻¹0 = α⁻¹(αv) = (α⁻¹α)v = 1·v = v.
If W is a subspace of the vector space V (over a field F), then since (V, +)
is an abelian group, the cosets of W in V, viz. v + W where v ∈ V, form under
addition of cosets a group V/W. Now V/W can be realized as a vector space over
F, as the following lemma shows.
Lemma 2:
If V is a vector space over F and if W is a subspace of V, then V/W is a
vector space over F, where for v1 + W, v2 + W ∈ V/W and α ∈ F we define
(v1 + W) + (v2 + W) = (v1 + v2) + W
α(v1 + W) = αv1 + W.
Proof: The commutativity of addition in V assures that V/W is an abelian
group. Now let us show that the map F × V/W → V/W given by α(v + W) = αv + W
is well defined. If v + W = v′ + W, then we must show that
αv + W = αv′ + W. Since v + W = v′ + W, v − v′ is in W. Since W is a subspace,
α(v − v′) must also be in W, which implies, by Lemma 1, that αv − αv′ ∈ W, so
that αv + W = αv′ + W. Now let us verify the four axioms for a vector
space:
1. α[(v1 + W) + (v2 + W)] = α[(v1 + v2) + W] = α(v1 + v2) + W
= (αv1 + αv2) + W = (αv1 + W) + (αv2 + W) = α(v1 + W) + α(v2 + W)
2. (α + β)(v1 + W) = (α + β)v1 + W = (αv1 + βv1) + W
= (αv1 + W) + (βv1 + W) = α(v1 + W) + β(v1 + W)
3. (αβ)(v1 + W) = (αβ)v1 + W = α(βv1) + W = α(β(v1 + W))
4. 1(v1 + W) = 1·v1 + W = v1 + W
for v1, v2 ∈ V and α, β ∈ F.
Hence V/W is a vector space. This vector space V/W is called the quotient
space of V by W.
Just as for groups and rings, we now define the notion of homomorphism
between vector spaces.
Definition:
If U and V are vector spaces over F, then a mapping T of U into V is said
to be a homomorphism if
1. T(u1 + u2) = T(u1) + T(u2)
2. T(αu1) = αT(u1), for all u1, u2 ∈ U and all α ∈ F.
If T, in addition, is one-to-one, we call it an isomorphism. The kernel of
T is defined as {u ∈ U | T(u) = 0}, where 0 is the identity element for addition in
V. One can verify that the kernel of T is a subspace of U and that T is an
isomorphism if and only if its kernel is {0}. Two vector spaces are said to be
isomorphic if there is an isomorphism of one onto the other.
Theorem 1:
If T is a homomorphism of U onto V with kernel W, then V is
isomorphic to U/W. Conversely, if U is a vector space and W a subspace of U,
then there is a homomorphism of U onto U/W whose kernel is precisely W. The
proof is similar to the proof given in group theory.
Definition:
Let V be a vector space over F and let U1, U2, …, Un be subspaces of V. V is
said to be the internal direct sum of U1, U2, …, Un if every element v ∈ V can be
written in one and only one way as v = u1 + u2 + … + un where ui ∈ Ui.
Given any finite number of vector spaces over F, say V1, V2, …, Vn,
consider the set V of all ordered n-tuples (v1, v2, …, vn) where vi ∈ Vi.
We declare two elements (v1, …, vn) and (v1′, …, vn′) of V to be equal
if and only if vi = vi′ for each i. We add two elements componentwise, i.e.
(v1, …, vn) + (w1, …, wn) = (v1 + w1, …, vn + wn). Also, if γ ∈ F and (v1, …, vn) ∈ V,
we define γ(v1, …, vn) = (γv1, …, γvn). One can see that V is a vector space
over F under these operations. We call V the external direct sum of V1, V2, …, Vn
and denote it by writing V = V1 ⊕ V2 ⊕ … ⊕ Vn.
Theorem 2:
If V is the internal direct sum of V1 ,V2 ...Vn then V is isomorphic to the
external direct sum of V1 ,V2 ...Vn .
Proof:
Any given v ∈ V can be written, by assumption, in one and only one way as
v = u1 + u2 + … + un where ui ∈ Vi. Define the mapping
T : V → V1 ⊕ V2 ⊕ … ⊕ Vn by T(v) = (u1, u2, …, un). Since v has a unique
representation, T is well defined. One can easily verify that T is a one-to-one,
onto homomorphism. Hence the proof.
Linear Independence and Bases
Definition: If V is a vector space over F and if v1, v2, …, vn ∈ V, then any
element of the form λ1v1 + … + λnvn, where λi ∈ F, is called a linear combination
over F of v1, v2, …, vn.
Definition:
If S is a nonempty subset of the vector space V, then L(S), the linear
span of S, is the set of all linear combinations of finite sets of elements of S.
Lemma 3: L(S) is a subspace of V.
Proof: If v and w are in L(S), then
v = λ1s1 + λ2s2 + … + λnsn and w = μ1t1 + μ2t2 + … + μmtm, where the λ's and μ's are in
F and the si and tj (for i = 1, 2, …, n and j = 1, 2, …, m) are all in S. Thus, for
α, β in F,
αv + βw = α(λ1s1 + λ2s2 + … + λnsn) + β(μ1t1 + μ2t2 + … + μmtm)
= (αλ1)s1 + … + (αλn)sn + (βμ1)t1 + … + (βμm)tm,
and so is again in L(S). Hence L(S) is a subspace of V.
Note: It is not difficult to see that this subspace L(S) is the (set-theoretically)
smallest subspace of V which contains the set S.
Lemma 4:
If S and T are subsets of V, then
(1) S ⊆ T implies L(S) ⊆ L(T)
(2) L(S ∪ T) = L(S) + L(T)
(3) L(L(S)) = L(S)
Proof:
(1) Take S ⊆ T; we show L(S) ⊆ L(T). Take any element
w ∈ L(S). Then w = λ1s1 + λ2s2 + … + λnsn, where λi ∈ F and si ∈ S for
i = 1, 2, …, n. Now S ⊆ T gives si ∈ T, which implies w ∈ L(T). Hence L(S) ⊆ L(T).
(2) Now let us show that L(S ∪ T) ⊆ L(S) + L(T). Let v be any element
in L(S ∪ T). Then
v = λ1s1 + λ2s2 + … + λnsn + μ1t1 + μ2t2 + … + μmtm, where λi and μj ∈ F,
si ∈ S for i = 1, 2, …, n and tj ∈ T for j = 1, 2, …, m, which shows that
v ∈ L(S) + L(T). Hence L(S ∪ T) ⊆ L(S) + L(T).
The other inclusion also follows immediately.
Note: If W1 and W2 are two subspaces of the vector space V, it is easy to see
that W1 + W2 = {w1 + w2 | w1 ∈ W1, w2 ∈ W2} is a subspace of V, called the sum or
join of the two subspaces W1 and W2. One can also check that W1 + W2 is the
same as L(W1 ∪ W2) and that W1 ∪ W2 is not in general a subspace of V.
(3) Now let us show L(L(S)) = L(S). Let w be any element in L(S); then
w ∈ L(L(S)), since w = 1·w where 1 ∈ F and w ∈ L(S). Hence
L(S) ⊆ L(L(S)). Conversely, take any element
v ∈ L(L(S)); then v = α1w1 + α2w2 + … + αnwn, where αi ∈ F and wi ∈ L(S).
Since wi ∈ L(S),
wi = β1^i s1^i + β2^i s2^i + … + βki^i ski^i,
where βj^i ∈ F and sj^i ∈ S for j = 1, 2, …, ki. Therefore
v = α1(β1^1 s1^1 + … + βk1^1 sk1^1) + … + αn(β1^n s1^n + … + βkn^n skn^n)
= (α1β1^1)s1^1 + … + (α1βk1^1)sk1^1 + … + (αnβ1^n)s1^n + … + (αnβkn^n)skn^n,
which is a linear combination of elements of S. Hence v ∈ L(S), i.e.,
L(L(S)) ⊆ L(S). Hence L(S) = L(L(S)).
Definition: The vector space V is said to be finite dimensional (over F) if there
is a finite subset S in V such that V=L(S)
Definition:
If V is a vector space and if v1, …, vn are in V, we say that they are
linearly dependent over F if there exist elements λ1, λ2, …, λn in F, not all of
them 0, such that
λ1v1 + λ2v2 + … + λnvn = 0.
If the vectors v1, …, vn are not linearly dependent over F, they are said to
be linearly independent over F. We shall often contract the phrase “linearly
dependent over F” to “linearly dependent”.
It is clear from the definition of linear dependence that the vectors
v1, …, vn of the vector space V are linearly independent over F iff any linear
relation of the form λ1v1 + λ2v2 + … + λnvn = 0 always implies that
λ1 = λ2 = … = λn = 0.
In F^(3) it is easy to verify that (1,0,0), (0,1,0) and (0,0,1) are linearly
independent.
But (1,1,0), (3,1,3), (5,3,3) are linearly dependent because
2(1,1,0) + 1(3,1,3) − 1(5,3,3) = (0,0,0).
Lemma 5: If v1, v2, …, vn ∈ V are linearly independent, then every element in
their linear span has a unique representation in the form λ1v1 + λ2v2 + … + λnvn
with λi ∈ F.
Proof:
By definition, every element in the linear span is of the form
λ1v1 + λ2v2 + … + λnvn. To show uniqueness we must show that if
λ1v1 + λ2v2 + … + λnvn = μ1v1 + μ2v2 + … + μnvn, then
λ1 = μ1, λ2 = μ2, …, λn = μn. But in that case
(λ1 − μ1)v1 + … + (λn − μn)vn = 0, and since v1, v2, …, vn are linearly independent,
λ1 − μ1 = 0, …, λn − μn = 0. So λi = μi for i = 1, 2, …, n.
Theorem 3:
If v1, v2, …, vn are in V, then either they are linearly independent or some
vk is a linear combination of the preceding ones, viz. v1, v2, …, vk−1.
Proof: If v1, v2, …, vn are linearly independent, there is nothing to prove.
Therefore suppose that α1v1 + α2v2 + … + αnvn = 0 where not all the α's are 0.
Let k be the largest integer for which αk ≠ 0. Since αi = 0 for i > k,
α1v1 + α2v2 + … + αkvk = 0. Since αk ≠ 0,
vk = αk⁻¹(−α1v1 − α2v2 − … − αk−1vk−1)
= (−αk⁻¹α1)v1 + (−αk⁻¹α2)v2 + … + (−αk⁻¹αk−1)vk−1.
Thus vk is a linear combination of its predecessors.
Corollary 1:
If v1, v2, …, vn in V have W as linear span and if v1, v2, …, vk are linearly
independent, then we can find a subset of v1, v2, …, vn of the form
{v1, v2, …, vk, vi1, …, vir} consisting of linearly independent elements whose linear
span is also W.
Proof:
If v1, v2, …, vn are linearly independent we are done. If not, weed out from
this set the first vj which is a linear combination of its predecessors; since
v1, v2, …, vk are linearly independent, j > k. The subset so constructed,
v1, v2, …, vk, …, vj−1, vj+1, …, vn, has n − 1 elements. Clearly its linear span is contained
in W. But we claim that it is actually equal to W. For, given w ∈ W, w can be
written as a linear combination of v1, v2, …, vn. But in this linear combination
we can replace vj by a linear combination of v1, …, vj−1. Hence w is a linear
combination of v1, v2, …, vk, …, vj−1, vj+1, …, vn. Continuing this weeding-out process,
we reach a subset {v1, v2, …, vk, vi1, …, vir} whose linear span is still W but in which
no element is a linear combination of the preceding ones. By the previous
theorem, they are linearly independent.
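The "weeding out" argument of this proof is effectively an algorithm. A hypothetical sketch, not part of the text (the function name weed_out is ours, and NumPy's rank test stands in for the dependence check):

```python
# An algorithmic reading of Corollary 1: scan the vectors in order and
# discard each one lying in the linear span of those kept so far.
import numpy as np

def weed_out(vectors):
    kept = []
    for v in vectors:
        trial = kept + [v]
        if np.linalg.matrix_rank(np.array(trial, dtype=float)) == len(trial):
            kept.append(v)  # v is not a combination of its predecessors
    return kept

# (5,3,3) = 2(1,1,0) + (3,1,3) is weeded out; the rest stay.
assert weed_out([(1, 1, 0), (3, 1, 3), (5, 3, 3), (0, 0, 1)]) == \
    [(1, 1, 0), (3, 1, 3), (0, 0, 1)]
```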
Corollary 2:
If V is a finite dimensional vector space, then it contains a finite set
v1, v2, …, vn of linearly independent elements whose linear span is V.
Proof:
Since V is finite dimensional, it is the linear span of a finite number of
elements u1, u2, …, um. By the previous corollary we can find a subset of these,
denoted by v1, v2, …, vn, consisting of linearly independent elements whose linear
span must be V.
Definition:
A subset S of a vector space V is called a basis of V if S consists of
linearly independent elements and V = L(S).
Note that if V is a finite dimensional vector space and if u1, u2, …, um span V
then, by Corollary 1, some subset of u1, u2, …, um forms a basis of V.
Lemma 6:
If {v1, v2, …, vn} is a basis of V over F and if w1, w2, …, wm in V are linearly
independent over F, then m ≤ n.
Proof:
Since v1, v2, …, vn is a basis for V, the vector wm is a linear combination
of v1, v2, …, vn. Hence the vectors wm, v1, v2, …, vn are linearly dependent, and they
span V. Thus some proper subset of these, viz.
{wm, vi1, vi2, …, vik} where k ≤ n − 1, forms a basis. In doing so, we have included
wm and removed at least one vi in the new basis. Repeating this process with
wm−1, we get a basis of the form {wm−1, wm, vj1, vj2, …, vjs} where s ≤ n − 2. Keeping
up this procedure we reach a basis of the form {w2, …, wm−1, wm, vα, vβ, …}. Since
w1 is not a linear combination of w2, …, wm−1, wm, this basis must actually
include some v. To get to this basis we have introduced m − 1 w's, each such
introduction having cost us at least one v, and yet a v is left. Thus m − 1 ≤ n − 1
and hence m ≤ n.
Corollary 3:
If V is finite dimensional over F, then any two bases of V have the same
number of elements.
Proof:
Let {v1, v2, …, vn} be one basis of V over F and let {w1, w2, …, wm} be
another basis. In particular w1, …, wm are linearly independent over F. Hence by
the previous lemma m ≤ n. Now, interchanging the roles of the v's and w's, we get
n ≤ m. Hence n = m.
Definition:
The number of vectors in a basis of a finite dimensional vector space V
(which, by Corollary 3, is the same for every basis) is called the dimension of
the vector space.
Lemma 7:
If a vector space V is finite dimensional over F and if u1, …, um in V are
linearly independent, then we can find vectors um+1, …, um+r in V such that
{u1, u2, …, um, um+1, …, um+r} is a basis of V.
Proof:
Since V is finite dimensional, it has a basis; let {v1, v2, …, vn} be a
basis of V. Since these span V, the vectors u1, …, um, v1, …, vn also span V.
Hence by Corollary 1 there is a subset of these of the form
{u1, …, um, vi1, …, vir} which consists of linearly independent elements and
spans V. To prove the lemma, put
um+1 = vi1, …, um+r = vir.
Lemma 8:
If V is finite dimensional and W is a subspace of V, then W is finite
dimensional, dim W ≤ dim V and
dim V/W = dim V − dim W.
Proof:
Since V is finite dimensional, let n = dim V. Then any n + 1 elements of
V are linearly dependent; in particular any n + 1 elements of W are linearly
dependent. Thus we can find a largest set of linearly independent elements in
W, say w1, w2, …, wm, with m ≤ n.
If w ∈ W, then {w1, w2, …, wm, w} is a linearly dependent set. Hence
αw + α1w1 + … + αmwm = 0, where not all the coefficients are 0. If α = 0, by the linear
independence of w1, …, wm we would get that each αi = 0, a contradiction. Thus
α ≠ 0, and so w = −α⁻¹(α1w1 + … + αmwm). Since w is any vector in W,
w1, w2, …, wm span W. So W has a basis of m elements, where m ≤ n. Hence dim W
≤ dim V. Now let {w1, w2, …, wm} be a basis of W. We can fill this out to a basis
{w1, w2, …, wm, v1, …, vr} of V, where m + r = dim V
and m = dim W. Denote any element of V/W by v̄, the coset v + W
determined by v ∈ V. Since any vector v ∈ V is of the form
v = α1w1 + α2w2 + … + αmwm + β1v1 + … + βrvr, where the α's and β's are in F, we get
v̄ = β1v̄1 + … + β rv̄r because, for i = 1, 2, …, m, w̄i = wi + W = 0
(wi being in W). Thus v̄1, …, v̄r span V/W. We claim that they are linearly
independent. For if β1v̄1 + … + βrv̄r = 0, then β1v1 + … + βrvr ∈ W, and so
β1v1 + … + βrvr = λ1w1 + … + λmwm, which, by the linear independence of the set
{w1, …, wm, v1, …, vr}, forces β1 = … = βr = λ1 = … = λm = 0. Hence we have shown that
V/W has a basis of r elements, namely v̄1, v̄2, …, v̄r, and so dim V/W = r = dim V − m =
dim V − dim W.
Corollary 4:
If A and B are finite dimensional subspaces of a vector space V, then
A + B is finite dimensional and
dim(A + B) = dim A + dim B − dim(A ∩ B).
Proof:
Since we can show, as in group theory, that (A + B)/B ≅ A/(A ∩ B), by equating
the dimensions of both sides we get that
dim(A + B) − dim B = dim A − dim(A ∩ B).
Hence dim(A + B) = dim A + dim B − dim(A ∩ B).
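A quick numeric check of this dimension formula, not part of the text, with dimensions computed as row ranks for a hypothetical example of two coordinate planes of R³ meeting in a line:

```python
# dim(A+B) = dim A + dim B - dim(A ∩ B), checked for two planes in R^3.
import numpy as np

A = np.array([[1, 0, 0], [0, 1, 0]], dtype=float)   # the xy-plane
B = np.array([[0, 1, 0], [0, 0, 1]], dtype=float)   # the yz-plane

dim_A = np.linalg.matrix_rank(A)                    # 2
dim_B = np.linalg.matrix_rank(B)                    # 2
dim_sum = np.linalg.matrix_rank(np.vstack([A, B]))  # dim(A + B) = 3
# So dim(A ∩ B) = dim A + dim B - dim(A + B) = 1: the common line (y-axis).
assert dim_A + dim_B - dim_sum == 1
```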
Dual Spaces
Notation: Hom(V, W)
Let V and W be two vector spaces over a field F. Hom(V, W) denotes the
set of all vector space homomorphisms of V into W.
Lemma 9 :
Hom (V, W) is a vector space over F under suitable operations.
Proof:
Let us define the operation ‘+’ in Hom(V, W) as follows. For S and T
in Hom(V, W), S + T is defined by (S + T)(v) = S(v) + T(v) for any vector v ∈
V. Hence S + T is a mapping from V into W. In fact S + T is a homomorphism.
For, if v1, v2 ∈ V then
(S + T)(v1 + v2) = S(v1 + v2) + T(v1 + v2) [by definition]
= S(v1) + S(v2) + T(v1) + T(v2) [S, T are homomorphisms]
= S(v1) + T(v1) + S(v2) + T(v2)
= (S + T)(v1) + (S + T)(v2).
Hence S + T preserves addition in V. Moreover, for α ∈ F, v ∈ V,
(S + T)(αv) = S(αv) + T(αv)
= αS(v) + α(T(v)) [S and T are vector space homomorphisms]
= α[S(v) + T(v)]
= α[(S + T)(v)],
i.e., S + T preserves scalar multiplication.
Hence S + T ∈ Hom(V, W).
Under this operation ‘+’, Hom(V, W) becomes an abelian group, having the
0-map defined by 0(v) = 0 for all v ∈ V as additive identity and, for any
S ∈ Hom(V, W), its inverse being −S, given by (−S)(v) = −[S(v)] for all v in V.
To make Hom(V, W) into a vector space over F, let us define scalar
multiplication by setting, for any λ ∈ F and any S ∈ Hom(V, W), the map
λS from V into W as (λS)(v) = λ(S(v)). We claim λS is a homomorphism.
In fact, for v1, v2 ∈ V,
λS(v1 + v2) = λ(S(v1 + v2)) = λ[S(v1) + S(v2)]
= λ(S(v1)) + λ(S(v2))
= λS(v1) + λS(v2).
Further, for α ∈ F, λS(αv) = λ(S(αv))
= λ(αS(v))
= (λα)S(v)
= (αλ)S(v) [F is a field]
= α(λS)(v).
Hence λS preserves both addition and scalar multiplication, and so
λS ∈ Hom(V, W).
Now we will show that Hom(V, W) is a vector space over F.
Let λ, μ ∈ F; S, T ∈ Hom(V, W) and v ∈ V.
Then (λ(S + T))(v) = λ[(S + T)(v)] = λ[S(v) + T(v)]
= λS(v) + λT(v) = (λS + λT)(v).
Hence λ(S + T) = λS + λT.
((λ + μ)S)(v) = (λ + μ)(S(v)) = λ(S(v)) + μ(S(v))
= (λS)(v) + (μS)(v)
= (λS + μS)(v).
Hence (λ + μ)S = λS + μS.
((λμ)S)(v) = (λμ)(S(v)) = λ(μ(S(v))) = (λ(μS))(v).
Hence (λμ)S = λ(μS).
(1S)(v) = 1·(S(v)) = S(v), i.e., 1·S = S.
Hence Hom(V, W) is a vector space over F.
The following important theorem gives a definite information about the
dimension of Hom (V,W) when both V and W have finite dimensions over F.
Theorem 4:
If V and W have dimensions m and n respectively over F, then Hom
(V,W) is of dimension mn over F.
Proof:
We are going to prove the theorem by actually finding a basis of Hom
(V, W) consisting of mn elements. Let {v1, v2, …, vm} be a basis of V over F and
{w1, w2, …, wn} be a basis of W over F. If v ∈ V then v = α1v1 + … + αmvm, where
α1, α2, …, αm are uniquely defined elements of F. Now we define the maps Tij :
V → W by Tij(v) = αiwj, where i = 1, …, m and j = 1, …, n. Hence we have in all mn
maps from V to W. We claim that they are all homomorphisms. For if
x1, x2 ∈ V, where x1 = α1v1 + … + αmvm and x2 = β1v1 + … + βmvm, then
Tij(x1 + x2) = (αi + βi)wj = αiwj + βiwj
= Tij(x1) + Tij(x2).
Further, for
k ∈ F, Tij(kx1) = (kαi)wj = k(αiwj)
= kTij(x1).
Hence Tij ∈ Hom(V, W).
Our claim is that these mn maps Tij form a basis of Hom(V, W). First
let us show that any given S ∈ Hom(V, W) can be written as a linear
combination of the Tij. We see that S(v1) ∈ W, and so S(v1) can be written as a
linear combination of w1, w2, …, wn. Let us write
S(v1) = α11w1 + α12w2 + … + α1nwn for some suitable α11, α12, …, α1n ∈ F.
Similarly we can write S(vi) = αi1w1 + αi2w2 + … + αinwn for i = 1, …, m, where
αij ∈ F. Consider
S0 = α11T11 + α12T12 + … + α1nT1n + α21T21 + … + αi1Ti1 + … + αm1Tm1 + … + αmnTmn.
Let us compute S0(vk) for the basis vector vk:
S0(vk) = α11T11(vk) + α12T12(vk) + … + αmnTmn(vk).
By the definition of Tij, we have
Tij(vk) = 0 if i ≠ k and Tij(vk) = wj if i = k.
Hence S0(vk) = αk1w1 + αk2w2 + … + αknwn, which is nothing but S(vk).
Hence the homomorphisms S0 and S agree on a basis of V, so S0 = S.
Since S0 is a linear combination of the Tij, S is also such a linear combination.
Hence the Tij span Hom(V, W) over F.
In order to show that these Tij form a basis, we have to show that they
are linearly independent. Suppose that
β11T11 + β12T12 + … + β1nT1n + … + βi1Ti1 + … + βinTin + … + βm1Tm1 + … + βmnTmn = 0
(the zero homomorphism), with the βij all in F. Applying this to vk we get
0 = βk1w1 + βk2w2 + … + βknwn, since Tij(vk) = 0 if i ≠ k
and Tij(vk) = wj if i = k.
Since w1, …, wn are linearly independent, βk1 = βk2 = … = βkn = 0.
Taking in turn the vectors v1, v2, …, vm, we get βkj = 0 for all k and j. Thus the Tij
are linearly independent over F. Hence {Tij ; i = 1, …, m; j = 1, …, n} forms a basis
of Hom(V, W) over F. Hence the dimension of Hom(V, W) = mn.
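In coordinates, the maps Tij of this proof are just the matrix units: with m = dim V and n = dim W, Tij corresponds to the n × m matrix with a single 1 in position (j, i), and the expansion S0 reads a map off from its matrix entries. A sketch, illustrative only and not part of the text:

```python
# The basis maps T_ij of Theorem 4 in coordinates: T_ij is the matrix with
# a single 1 in position (j, i); any linear map S is recovered as the
# combination sum_ij a_ij T_ij, mirroring dim Hom(V, W) = mn.
import numpy as np

m, n = 3, 2   # m = dim V, n = dim W

def T(i, j):
    """Matrix of T_ij with respect to the chosen bases (0-indexed)."""
    mat = np.zeros((n, m))
    mat[j, i] = 1.0
    return mat

S = np.array([[1.0, 2.0, 3.0],   # an arbitrary map V -> W as an n x m matrix
              [4.0, 5.0, 6.0]])
S0 = sum(S[j, i] * T(i, j) for i in range(m) for j in range(n))
assert np.array_equal(S, S0)     # S agrees with its expansion S0
```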
Corollary 5:
The dimension of Hom(V, V) is m² if the dimension of V is m.
The result follows by taking W = V in the theorem.
Corollary 6:
If the dimension of V over F is m, then the dimension of Hom(V, F) is m. As a
vector space, F is of dimension 1 over F. Applying the theorem with W
= F, we get dimension of Hom(V, F) = m.
Definition:
If V is a vector space, then its dual space is Hom(V, F), and this space is
denoted by V̂.
Note: Any element of V̂ is called a linear functional on V.
Lemma 10:
If V is finite dimensional and 0 ≠ v ∈ V, then there is an element
f ∈ V̂ such that f(v) ≠ 0.
Proof:
Since V is finite dimensional, let {v1, v2, …, vn} be a basis. Let v̂i be the
element of V̂ defined by
v̂i(vj) = 0 if i ≠ j
= 1 if i = j,
so that v̂i(α1v1 + α2v2 + … + αnvn) = αi.
In fact the v̂i are nothing but the Tij introduced in the previous theorem.
Hence {v̂1, v̂2, …, v̂n} forms a basis of V̂.
If 0 ≠ v ∈ V, we can find a basis of the form {v1 = v, v2, …, vn}, so that there
is an element in V̂, namely v̂1, such that v̂1(v) = v̂1(v1) = 1 ≠ 0.
Hence the result.
Definition:
Since Hom(V, F) = V̂ is a vector space over F, we can form
Hom(V̂, F), which we denote by V̂̂ and call the second dual of V.
Lemma 11:
If V is finite dimensional, then there is an isomorphism of V onto V̂̂.
Proof: Given an element v ∈ V, associate with it the element Tv of V̂̂ defined by
setting, for f ∈ V̂, Tv(f) = f(v). One can easily see that Tv ∈ Hom(V̂, F). Now we
define the map
ψ : V → V̂̂ as follows.
For any v ∈ V, ψ(v) = Tv. We claim ψ is a homomorphism. For
ψ(v1 + v2) = T(v1+v2). But T(v1+v2)(f) = f(v1 + v2) = f(v1) + f(v2) = Tv1(f) + Tv2(f).
Hence T(v1+v2) = Tv1 + Tv2. Therefore ψ(v1 + v2) = ψ(v1) + ψ(v2). Similarly
we can show that, for λ ∈ F, ψ(λv) = Tλv = λTv = λψ(v). Hence ψ is a vector
space homomorphism of V into V̂̂. Since V is finite dimensional, it is enough
to show that the kernel of ψ is trivial.
If ψ(v) = 0, then Tv = 0, i.e. Tv(f) = f(v) = 0 for all f ∈ V̂, which by Lemma 10
implies that v = 0. Hence ψ is an isomorphism. Since V is finite dimensional, V̂ is
finite dimensional and V̂̂ is also finite dimensional, of the same dimension as V.
Hence ψ is onto.
Definition:
If W is a subspace of V, then the annihilator of W is
A(W) = {f ∈ V̂ | f(w) = 0 for all w ∈ W}.
Note: We can show that A(W) is a subspace of V̂. In fact, for any f, g ∈ A(W)
and α, β ∈ F, we find that αf + βg ∈ A(W), since for w ∈ W
(αf + βg)(w) = αf(w) + βg(w)
= α·0 + β·0 (as f, g ∈ A(W))
= 0 + 0 = 0.
Theorem 5:
If V is finite dimensional and W is a subspace of V, then Ŵ is
isomorphic to V̂/A(W), and dim A(W) = dim V − dim W.
Proof:
If f ∈ V̂, let f̄ be the restriction of f to W; thus f̄ is defined on W
by f̄(w) = f(w) for w ∈ W.
Since f ∈ V̂, f̄ ∈ Ŵ. Let us define the map T : V̂ → Ŵ by
T(f) = f̄ for every f ∈ V̂. Clearly T is a homomorphism, for
T(f + g) = f̄ + ḡ = T(f) + T(g) and T(λf) = λf̄ = λT(f). Now let us find
the kernel of T. If f is in the kernel of T, then the restriction of f to W must be 0.
Hence the kernel of T is A(W).
We now claim that the mapping T is onto. For that we must show that
any given element h ∈ Ŵ is the restriction of some f ∈ V̂, so that h = f̄.
If {w1, w2, …, wm} is a basis of W, then it can be expanded to a basis of V
of the form {w1, w2, …, wm, v1, v2, …, vr}, where m + r = dim V. Let W1 be the
subspace of V spanned by v1, v2, …, vr. Thus V = W ⊕ W1. If
h ∈ Ŵ, define f ∈ V̂ as follows: any v ∈ V can be written
uniquely as v = w + w1, where w ∈ W and w1 ∈ W1; define f(v) = h(w). Then it
is clear that f is in V̂ and that f̄ = h.
Hence T is onto, so Ŵ ≅ V̂/A(W).
Hence dim Ŵ = dim V̂ − dim A(W).
But we have already proved that dim Ŵ = dim W and dim V̂ = dim V.
Hence dim W = dim V − dim A(W),
so dim A(W) = dim V − dim W.
We close this lesson after providing solutions to some standard problems:
Problem 1:
If T is an isomorphism of a vector space V onto a vector space W (both
over the same field F), prove that T maps any basis of V onto a basis of W.
Solution:
For our purpose we can assume that V is of finite dimension, say n.
Then, because of the isomorphism, W is also of finite dimension n. Let
{v1, v2, …, vn} be a basis of V. We show that T(v1), T(v2), …, T(vn) is a basis for
W. First we show that these vectors are linearly independent. Suppose there exist
scalars λ1, λ2, …, λn in F such that λ1T(v1) + λ2T(v2) + … + λnT(vn) = 0 (of W);
then, using the fact that T is a vector space homomorphism, we get
T(λ1v1 + λ2v2 + … + λnvn) = 0, which implies that λ1v1 + λ2v2 + … + λnvn = 0, as T is an
isomorphism. But v1, v2, …, vn are linearly independent vectors in V, whence
λi = 0 for i = 1, 2, …, n. This means that T(v1), T(v2), …, T(vn) are linearly
independent in W (over F). It remains for us to prove that these vectors span
W. For this we need to show that, given any w ∈ W, there exist scalars
λ1, λ2, …, λn in F such that w = λ1T(v1) + λ2T(v2) + … + λnT(vn). [Note that
T(v1), T(v2), …, T(vn) span a subspace of W.] Since the mapping
T : V → W is onto, there is v ∈ V such that T(v) = w. Now we can write
v = λ1v1 + λ2v2 + … + λnvn (using the basis of V) for suitable scalars
λi (i = 1, 2, …, n). Hence w = T(v) = T(λ1v1 + λ2v2 + … + λnvn)
= λ1T(v1) + λ2T(v2) + … + λnT(vn), as desired.
The problem is thus proved.
Note:
If T is only a homomorphism and not an isomorphism of V onto W, T
need not map a basis of V onto a basis of W. In the above problem T is an
isomorphism of V not only into, but also onto, W; therefore T maps a basis of V
not only into, but also onto, a basis of W. To see this fact, remember that T
has an inverse T⁻¹ : W → V which is also an isomorphism of W onto V.
Problem 2:
If V is a finite dimensional vector space and T is an isomorphism of V
into V, prove that T must map V onto V.
Solution:
Let V be of finite dimension n and let {v1, v2, …, vn} be a basis of V. As
in the previous problem, T(v1), T(v2), …, T(vn) are all distinct and linearly
independent. Since their number is n, the dimension of V, T(v1), T(v2), …, T(vn)
form a basis of V. Hence, for any arbitrary element v in V, there exist (unique)
scalars λ1, λ2, …, λn in the field such that
v = λ1T(v1) + λ2T(v2) + … + λnT(vn) = T(λ1v1 + λ2v2 + … + λnvn), T being a
vector space homomorphism. Thus there is an element
v′ = λ1v1 + λ2v2 + … + λnvn in V such that v = T(v′).
This shows that T is onto, and the problem follows.
Problem 3:
Let Vn = {p(x) ∈ F[x] | deg p(x) < n}. Define T by
T(α0 + α1x + … + αn−1x^(n−1)) = α0 + α1(x + 1) + α2(x + 1)² + … + αn−1(x + 1)^(n−1).
Prove that T is an isomorphism of Vn onto itself.
Inner Product spaces
When we studied vector spaces V over a field F in the last lesson,
the field F played no important role. But in this lesson we take F to be
either the field of real numbers or the field of complex numbers.
Definition 1:
The vector space V over F is said to be an inner product space if there
is defined, for any two vectors u, v in V, an element (u, v) in F, called the inner
product of u and v, such that
1) (u, v) is the complex conjugate of (v, u);
2) (u, u) ≥ 0, and (u, u) = 0 if and only if u = 0;
3) (αu + βv, w) = α(u, w) + β(v, w), for any u, v, w in V and α, β in F.
Note 1:
If we take the field F to be the field of real numbers, condition (1)
simply means that (u, v) = (v, u). If we take the field F to be the field of complex
numbers, then condition (1) assures that (u, u) equals its own conjugate, which
implies that (u, u) is real, and hence condition (2) makes sense even if we take F
as the field of complex numbers.
Note 2: Let us find the value of (u, αv + βw) where u, v, w are in V and
α, β are in F:
(u, αv + βw) = the conjugate of (αv + βw, u) (by condition (1))
= the conjugate of [α(v, u) + β(w, u)] (by condition (3))
= ᾱ · (the conjugate of (v, u)) + β̄ · (the conjugate of (w, u))
= ᾱ(u, v) + β̄(u, w).
Example:
In F^(2), for u = (α1, α2) and v = (β1, β2) let us define
(u, v) = α1β̄1 + α2β̄2.
Now the conjugate of (v, u) = β1ᾱ1 + β2ᾱ2 is β̄1α1 + β̄2α2 = α1β̄1 + α2β̄2
= (u, v).
Hence condition (1) is satisfied.
Also (u, u) = α1ᾱ1 + α2ᾱ2 = |α1|² + |α2|² ≥ 0, and (u, u) = 0 only if both
α1 and α2 are zero, i.e. if u = (0, 0), the zero in F^(2).
Hence property (2) is also satisfied.
Again, if α, β ∈ F and w = (γ1, γ2) in F^(2), consider
(αu + βv, w) = ((αα1 + ββ1, αα2 + ββ2), (γ1, γ2))
= (αα1 + ββ1)γ̄1 + (αα2 + ββ2)γ̄2
= α(α1γ̄1 + α2γ̄2) + β(β1γ̄1 + β2γ̄2)
= α(u, w) + β(v, w).
Thus property (3) is also verified. So (u, v) as taken in this example serves
as an inner product for F^(2).
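The verification above can be repeated numerically at a few sample vectors of F^(2) with F the complex field. An illustration only, not part of the text (the helper name ip is ours):

```python
# The inner product of the example, checked against the three conditions
# of Definition 1 at sample complex vectors.
def ip(u, v):
    """(u, v) = a1*conj(b1) + a2*conj(b2) on F^(2)."""
    return u[0] * v[0].conjugate() + u[1] * v[1].conjugate()

u, v, w = (1 + 2j, 3j), (2 - 1j, 1 + 1j), (0.5j, 2 + 0j)
a, b = 2 + 1j, -1j
assert ip(u, v) == ip(v, u).conjugate()             # condition (1)
assert ip(u, u).imag == 0 and ip(u, u).real > 0     # condition (2)
au_bv = (a * u[0] + b * v[0], a * u[1] + b * v[1])  # the vector au + bv
assert abs(ip(au_bv, w) - (a * ip(u, w) + b * ip(v, w))) < 1e-12  # condition (3)
```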
Hereafter let V denote an inner product space over F.
Definition 2:
If v is in V, then the length of v (or norm of v), written ‖v‖, is
defined by
‖v‖ = √(v, v).
In the example, if v = (α1, α2) then ‖v‖ = √(|α1|² + |α2|²).
Lemma 1:
For u1, u2, v1, v2 ∈ V and α1, α2, β1, β2 ∈ F we have
(α1u1 + β1v1, α2u2 + β2v2) = α1ᾱ2(u1, u2) + α1β̄2(u1, v2) + β1ᾱ2(v1, u2) + β1β̄2(v1, v2).
In particular (αu + βv, αu + βv) = αᾱ(u, u) + αβ̄(u, v) + βᾱ(v, u) + ββ̄(v, v).
Proof:
(α1u1 + β1v1, α2u2 + β2v2) = α1(u1, α2u2 + β2v2) + β1(v1, α2u2 + β2v2) (by
condition (3))
= α1[ᾱ2(u1, u2) + β̄2(u1, v2)] + β1[ᾱ2(v1, u2) + β̄2(v1, v2)] (by Note 2)
= α1ᾱ2(u1, u2) + α1β̄2(u1, v2) + β1ᾱ2(v1, u2) + β1β̄2(v1, v2).
Thus the first result is proved. The particular case follows by taking
α1 = α2 = α, β1 = β2 = β, u1 = u2 = u and v1 = v2 = v.
Note:
The first result in this lemma can be proved to be equivalent to
condition (3) in Definition 1. So it can replace condition (3) and we have an
equivalent definition for an inner product space.
Corollary:
‖αu‖ = |α| · ‖u‖.
Proof:
‖αu‖² = (αu, αu)
= αᾱ(u, u) [by Lemma 1, taking β = 0 in the particular case]
= |α|² ‖u‖².
Hence ‖αu‖ = |α| · ‖u‖.

Lemma 2:
If a, b, c are real numbers such that a > 0 and aλ² + 2bλ + c ≥ 0 for all
real numbers λ, then b² ≤ ac.
Proof:
aλ² + 2bλ + c can be written as follows:
aλ² + 2bλ + c = (1/a)(aλ + b)² + (c − b²/a).
Since this value is ≥ 0 for all values of λ, in particular when λ = −b/a
we have c − b²/a ≥ 0, and so b² ≤ ac.
a
Theorem 1: If u, v are in V then |(u, v)| ≤ ‖u‖·‖v‖.
Proof: Let θ denote the zero vector of V. By the property (3) of an inner product we find that, for any v in V, (θ, v) = (θ + θ, v) = (θ, v) + (θ, v), whence we see that (θ, v) = 0. Similarly (v, θ) = 0. Therefore if u = θ or v = θ, then (u, v) = 0 and
‖u‖·‖v‖ = 0. Hence the result is true when u = θ or v = θ.
Now let us assume that u ≠ θ and v ≠ θ. For the moment let us assume (u, v) to be real. For every real λ we have
0 ≤ (λu + v, λu + v) = λ²(u, u) + 2λ(u, v) + (v, v) [by Lemma 1].
Let a = (u, u), b = (u, v) and c = (v, v). Then a > 0, and by Lemma 2, b² ≤ ac. Hence
(u, v)² ≤ (u, u)(v, v) = ‖u‖²·‖v‖².
Hence |(u, v)| ≤ ‖u‖·‖v‖.
Suppose now that (u, v) = α (say) is not real. By assumption (u, v) ≠ 0, so α ≠ 0 and u/α is meaningful. Hence by the property (3) of the inner product we get
(u/α, v) = (1/α)(u, v) = (1/α)·α = 1,
which is real. Since we have proved the inequality for the real case,
1 = |(u/α, v)| ≤ ‖u/α‖·‖v‖ = (1/|α|)·‖u‖·‖v‖.
Hence |α| ≤ ‖u‖·‖v‖, i.e. |(u, v)| ≤ ‖u‖·‖v‖.
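As a quick sanity check of Theorem 1 (an illustration, not part of the text), the following Python sketch tests the inequality on many random vectors in C³ under the standard inner product:

```python
import math
import random

def ip(u, v):
    # standard inner product on C^n
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm(u):
    # length of u: square root of (u, u), which is real and non-negative
    return math.sqrt(ip(u, u).real)

random.seed(1)
for _ in range(1000):
    u = tuple(complex(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(3))
    v = tuple(complex(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(3))
    # Theorem 1: |(u, v)| <= ||u|| * ||v||  (tiny slack for rounding)
    assert abs(ip(u, v)) <= norm(u) * norm(v) + 1e-9
print("Cauchy-Schwarz held in 1000 random trials")
```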
Definition 3:
If u, v are in V, then u is said to be orthogonal to v if (u, v) = 0. We note that if u is orthogonal to v, then v is orthogonal to u; for (v, u) is the conjugate of (u, v), and the conjugate of 0 is 0.
Definition 4:
If W is a subspace of V, the orthogonal complement of W, denoted by W⊥, is defined by W⊥ = {x ∈ V : (x, w) = 0 for all w ∈ W}.
Lemma 3:
W⊥ is a subspace of V.
Proof:
To show W⊥ is a subspace, we must show that if a, b ∈ W⊥ and α, β ∈ F, then αa + βb ∈ W⊥. Now, for w in W,
(αa + βb, w) = α(a, w) + β(b, w) = α·0 + β·0 = 0,
since a, b ∈ W⊥ gives (a, w) = 0 = (b, w).
We note also that if w ∈ W ∩ W⊥ then (w, w) = 0; hence w = 0. Thus W ∩ W⊥ = {0}.
Definition 5:
The set of vectors {vᵢ} in V is said to be an orthonormal set if
1. each vᵢ is of length 1, i.e. (vᵢ, vᵢ) = 1;
2. for i ≠ j, (vᵢ, vⱼ) = 0.
Lemma 4:
If {vᵢ} is an orthonormal set, then the vectors {vᵢ} are linearly independent. Further, if w = α₁v₁ + ... + αₙvₙ, then αᵢ = (w, vᵢ) for i = 1, 2, ..., n.
Proof:
In order to show that the {vᵢ} are linearly independent, it suffices to show that if α₁v₁ + α₂v₂ + ... + αₙvₙ = 0 then all the α's are zero. So let α₁v₁ + α₂v₂ + ... + αₙvₙ = 0. Then for each i,
0 = (α₁v₁ + α₂v₂ + ... + αₙvₙ, vᵢ)
= α₁(v₁, vᵢ) + α₂(v₂, vᵢ) + ... + αₙ(vₙ, vᵢ)
= αᵢ(vᵢ, vᵢ) (since {vᵢ} is an orthonormal set, (vⱼ, vᵢ) = 0 for j ≠ i)
= αᵢ.
Thus αᵢ = 0 for each i, as required.
If w = α₁v₁ + ... + αₙvₙ, then we find (w, vᵢ) = αᵢ, exactly as above.
Lemma 5:
If {v₁, v₂, ..., vₙ} is an orthonormal set in V and if w is in V, then
u = w − (w, v₁)v₁ − (w, v₂)v₂ − ... − (w, vₙ)vₙ is orthogonal to each of v₁, v₂, ..., vₙ.
Proof:
Consider
(u, vᵢ) = (w, vᵢ) − (w, v₁)(v₁, vᵢ) − (w, v₂)(v₂, vᵢ) − ... − (w, vᵢ)(vᵢ, vᵢ) − ... − (w, vₙ)(vₙ, vᵢ)
= (w, vᵢ) − 0 − 0 − ... − (w, vᵢ) − ... − 0
(since {v₁, v₂, ..., vₙ} is an orthonormal set)
= (w, vᵢ) − (w, vᵢ) = 0.
Hence u is orthogonal to vᵢ; since i was arbitrary, u is orthogonal to each of v₁, v₂, ..., vₙ.
Theorem 2:
Let V be a finite-dimensional inner product space. Then V has an orthonormal set as a basis.
[Gram-Schmidt orthogonalisation process]
Proof:
Since V is of finite dimension, let us assume that the dimension of V over F is n, and let {v₁, v₂, ..., vₙ} be a basis of V. Using this basis let us construct an orthonormal set {w₁, w₂, ..., wₙ}, i.e. a set such that each wᵢ is of length 1 and (wᵢ, wⱼ) = 0 if i ≠ j.
Let w₁ = v₁/‖v₁‖. Then
(w₁, w₁) = (v₁/‖v₁‖, v₁/‖v₁‖) = (1/‖v₁‖²)(v₁, v₁) = ‖v₁‖²/‖v₁‖² = 1.
Now with the help of w₁ and v₂ we construct w₂ such that ‖w₂‖ = 1 and (w₂, w₁) = 0. So we try to find the value of α such that αw₁ + v₂ is orthogonal to w₁,
i.e. (αw₁ + v₂, w₁) = 0,
i.e. α(w₁, w₁) + (v₂, w₁) = α + (v₂, w₁) = 0.
Hence α = −(v₂, w₁). So let us take u₂ = −(v₂, w₁)w₁ + v₂. Then u₂ is orthogonal to w₁. Since v₁, v₂ are linearly independent, w₁ and v₂ are linearly independent, as w₁ is a multiple of v₁. Hence u₂ ≠ 0.
Now let w₂ = u₂/‖u₂‖. Then {w₁, w₂} is an orthonormal set.
Let us now take u₃ = −(v₃, w₁)w₁ − (v₃, w₂)w₂ + v₃.
Then (u₃, w₁) = −(v₃, w₁)(w₁, w₁) − (v₃, w₂)(w₂, w₁) + (v₃, w₁)
= −(v₃, w₁)·1 − 0 + (v₃, w₁)
= 0.
Similarly (u₃, w₂) = 0. Hence u₃ is orthogonal to both w₁ and w₂. Since v₁, v₂, v₃ are linearly independent, and since w₁, w₂ are in the linear span of v₁, v₂, the vectors w₁, w₂ and v₃ are linearly independent.
Hence u₃ ≠ 0. Now let w₃ = u₃/‖u₃‖. Then {w₁, w₂, w₃} is an orthonormal set.
Proceeding similarly, we can construct w₁, w₂, ..., wᵢ in the linear span of v₁, ..., vᵢ such that {w₁, w₂, ..., wᵢ} is an orthonormal set. To construct wᵢ₊₁, let us proceed as before.
Let uᵢ₊₁ = −(vᵢ₊₁, w₁)w₁ − (vᵢ₊₁, w₂)w₂ − ... − (vᵢ₊₁, wᵢ)wᵢ + vᵢ₊₁. Now for 1 ≤ j ≤ i,
(uᵢ₊₁, wⱼ) = −(vᵢ₊₁, w₁)(w₁, wⱼ) − (vᵢ₊₁, w₂)(w₂, wⱼ) − ... − (vᵢ₊₁, wᵢ)(wᵢ, wⱼ) + (vᵢ₊₁, wⱼ)
= −(vᵢ₊₁, wⱼ)(wⱼ, wⱼ) + (vᵢ₊₁, wⱼ)
(since {w₁, w₂, ..., wᵢ} is an orthonormal set, (wₖ, wⱼ) = 0 if k ≠ j and (wⱼ, wⱼ) = 1)
= −(vᵢ₊₁, wⱼ) + (vᵢ₊₁, wⱼ)
= 0.
Hence uᵢ₊₁ is orthogonal to each of w₁, w₂, ..., wᵢ. Since v₁, v₂, ..., vᵢ₊₁ are linearly independent and since w₁, w₂, ..., wᵢ are in the linear span of v₁, v₂, ..., vᵢ, the vectors w₁, w₂, ..., wᵢ, vᵢ₊₁ are linearly independent. Hence uᵢ₊₁ ≠ 0. Let wᵢ₊₁ = uᵢ₊₁/‖uᵢ₊₁‖.
Thus {w₁, w₂, ..., wᵢ₊₁} is an orthonormal set. Continuing in this way, from the given basis {v₁, v₂, ..., vₙ} we construct an orthonormal set {w₁, w₂, ..., wₙ}. Since we have already shown that an orthonormal set is linearly independent, we have thus formed a basis {w₁, w₂, ..., wₙ} of V which is orthonormal.
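The construction in the proof can be written out directly. The following Python sketch (an illustration, not part of the original text) applies it to real vectors with the ordinary dot product:

```python
import math

def gram_schmidt(vectors):
    """Orthonormalise a list of linearly independent real vectors,
    following the proof: subtract the projections on the w's already
    built, then normalise the remainder."""
    ws = []
    for v in vectors:
        u = list(v)
        for w in ws:
            c = sum(a * b for a, b in zip(v, w))    # (v, w)
            u = [a - c * b for a, b in zip(u, w)]   # subtract (v, w)w
        n = math.sqrt(sum(a * a for a in u))        # ||u||
        assert n > 1e-12, "input vectors must be linearly independent"
        ws.append([a / n for a in u])
    return ws

w = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
# check orthonormality: (w_i, w_j) = 1 if i == j else 0
for i in range(3):
    for j in range(3):
        d = sum(a * b for a, b in zip(w[i], w[j]))
        assert abs(d - (1.0 if i == j else 0.0)) < 1e-9
```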
Theorem 3:
If V is a finite-dimensional inner product space and if W is a subspace of V, then V = W ⊕ W⊥, i.e. V = W + W⊥ and W ∩ W⊥ = {0}. We say that V is the direct sum of W and W⊥.
Proof:
Since V is finite-dimensional, W is also finite-dimensional, and hence by the previous theorem we can construct an orthonormal set {w₁, ..., wᵣ} in W which is a basis of W. Given any v in V, we show that we can write v as a sum of two elements, one from W and the other from W⊥. In fact, let
v₀ = v − (v, w₁)w₁ − (v, w₂)w₂ − ... − (v, wᵣ)wᵣ.
Then, by Lemma 5, v₀ is orthogonal to each wᵢ. Hence v₀ is orthogonal to every element in W, so v₀ ∈ W⊥.
Now we write
v = ((v, w₁)w₁ + (v, w₂)w₂ + ... + (v, wᵣ)wᵣ) + v₀
= v₁ + v₀, where v₁ = (v, w₁)w₁ + ... + (v, wᵣ)wᵣ ∈ W and v₀ ∈ W⊥.
We have already shown that W ∩ W⊥ = {0}.
Hence V = W ⊕ W⊥.
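The decomposition v = v₁ + v₀ of the proof is easy to see in a small example. The following Python sketch (an illustration, not from the text) takes W to be the xy-plane in R³:

```python
# Theorem 3 in R^3: W = xy-plane with orthonormal basis {w1, w2};
# split v into v1 in W plus v0 in the orthogonal complement W-perp.

def ip(u, v):
    return sum(a * b for a, b in zip(u, v))

w1, w2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)   # orthonormal basis of W
v = (2.0, -3.0, 7.0)

# v1 = (v, w1)w1 + (v, w2)w2 lies in W
c1, c2 = ip(v, w1), ip(v, w2)
v1 = tuple(c1 * a + c2 * b for a, b in zip(w1, w2))
# v0 = v - v1 lies in W-perp: it is orthogonal to w1 and w2
v0 = tuple(a - b for a, b in zip(v, v1))

assert ip(v0, w1) == 0 and ip(v0, w2) == 0
assert tuple(a + b for a, b in zip(v1, v0)) == v
print("v =", v1, "+", v0)
```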
Modules
When we defined a vector space, we took an abelian group V and a field F and defined a map F × V → V. In this section we take a ring R instead of F and define a map R × M → M.
Definition 6:
Let R be a ring. An abelian group M is said to be an R-module if for every r in R and m in M we associate an element, denoted by rm, in M such that for all a, b in M and r, s in R,
1. r(a + b) = ra + rb
2. r(sa) = (rs)a
3. (r + s)a = ra + sa.
The "+" on the left-hand side of (3) is addition in R, and the "+" on the right-hand side is addition in M.
Definition 7:
If R has a unit element 1 and if 1·m = m for every element m in M, then M is called a unital R-module. Since we have multiplied the elements of R from the left, we call M a left R-module. Similarly we can multiply the elements of R from the right and define the analogous notion of a right R-module. Throughout this section we consider only left R-modules.
Example 1:
Let G be an additive abelian group and let us take the ring R to be the ring of integers. Then for any integer n and any a in G, na ∈ G, where, as usual, na denotes a + a + ... + a (n times) when n is a positive integer, (−a) + (−a) + ... + (−a) (−n times) when n is a negative integer, and the zero element 0 of the group G when n is zero. The module axioms are trivially satisfied. Hence G becomes a unital module over the ring of integers.
Example 2:
Every associative ring R is a module over itself, the additive abelian group being the group (R, +) of the ring R, and the multiplication ra for r ∈ R, a ∈ (R, +) being the same as the ring multiplication.
Example 3:
If R is an associative ring, then every left ideal M of R can be considered as an R-module. For r ∈ R and a ∈ M we have ra ∈ M, and the conditions for a module are trivially satisfied.
Definition 8:
Let M be an R-module. A subgroup A of M is called a submodule of M if whenever r ∈ R and a ∈ A, then ra ∈ A.
Definition 9:
If M is an R-module and if M₁, M₂, ..., Mₛ are submodules of M, then M is said to be the direct sum of M₁, M₂, ..., Mₛ if every element m ∈ M can be expressed uniquely as m = m₁ + m₂ + ... + mₛ, where mᵢ ∈ Mᵢ for i = 1, 2, ..., s. In such a case we write
M = M₁ ⊕ M₂ ⊕ ... ⊕ Mₛ.
In order to derive the main theorem of this section, we need three more definitions.
Definition 10:
An R-module M is said to be cyclic if there is an element m0  M , such
that every element m in M can be written as m  rm0 where r  R .
Definition 11:
An R-module M is said to be finitely generated if there exist finitely many elements a₁, ..., aₙ in M such that every m ∈ M can be expressed as m = r₁a₁ + r₂a₂ + ... + rₙaₙ, where rᵢ ∈ R for i = 1, 2, ..., n.
Definition 12:
Let M be a finitely generated R-module. The generating sets having as few elements as possible are called minimal generating sets, and the number of elements in such a generating set is called the rank of M.
Now we state and prove the main theorem in this section.
Theorem 4:
Let R be a Euclidean ring. Then any finitely generated R-module M is the direct sum of a finite number of cyclic submodules.
Proof:
First, instead of proving the theorem for a module M over an arbitrary Euclidean ring, we shall prove it for a module M over the ring of integers (which is Euclidean), and then indicate at the end the modification needed for a general Euclidean ring.
We give the proof by induction on the rank of M.
If M has rank 1, then M is generated by a single element; hence M is cyclic and the theorem is proved. Now let us assume that the rank of M is q and that the theorem is true for all modules of rank q − 1. Let {a_1, a_2, ..., a_q} be a minimal generating set for M. If every relation n_1a_1 + n_2a_2 + ... + n_qa_q = 0 implies that n_1a_1 = n_2a_2 = ... = n_qa_q = 0, then the theorem is true; for M then becomes the direct sum of the M_i (i = 1, 2, ..., q), where M_i is the cyclic submodule generated by a_i.
Hence we may assume that there is some minimal generating set {b_1, b_2, ..., b_q} of M and integers r_1, r_2, ..., r_q such that r_1b_1 + r_2b_2 + ... + r_qb_q = 0, in which not all of r_1b_1, r_2b_2, ..., r_qb_q are 0. Among all such relations for all minimal generating sets there is a smallest positive integer occurring as a coefficient. Let this integer be s_1, and let the generating set for which it occurs be {a_1, a_2, ..., a_q}, with s_1 occurring as the coefficient of a_1. Hence we have
s_1a_1 + s_2a_2 + ... + s_qa_q = 0.   ... (1)
We claim that if
r_1a_1 + r_2a_2 + ... + r_qa_q = 0   ... (2)
then s_1 divides r_1. Since s_1, r_1 are integers, we can write r_1 = ms_1 + t where 0 ≤ t < s_1. Multiplying equation (1) by m and subtracting from (2) we get
ta_1 + (r_2 − ms_2)a_2 + ... + (r_q − ms_q)a_q = 0.
By the minimal choice of s_1 we must have t = 0; hence s_1 divides r_1.
We claim further that s_1 divides s_i for i = 2, ..., q. If not, let us assume that s_1 does not divide s_2, so that s_2 = m_2s_1 + t where 0 < t < s_1. Now {a_1' = a_1 + m_2a_2, a_2, a_3, ..., a_q} also generates M, and from (1) we have
s_1a_1 + s_2a_2 + ... + s_qa_q = 0,
i.e. s_1(a_1' − m_2a_2) + s_2a_2 + ... + s_qa_q = 0,
i.e. s_1a_1' + (s_2 − s_1m_2)a_2 + s_3a_3 + ... + s_qa_q = 0,
i.e. s_1a_1' + ta_2 + s_3a_3 + ... + s_qa_q = 0.
By the choice of s_1, either t = 0 or t ≥ s_1, contradicting 0 < t < s_1. Hence t = 0, so s_1 divides s_2; similarly s_1 divides each s_i. Let us write s_i = m_is_1.
Now consider the q elements a_1*, a_2, ..., a_q, where
a_1* = a_1 + m_2a_2 + m_3a_3 + ... + m_qa_q.
They generate M. Hence if M_1 is the cyclic submodule generated by a_1* and M_2 is the submodule of M generated by a_2, ..., a_q, then M = M_1 + M_2. In order to show that M is the direct sum of M_1 and M_2, we must show that M_1 ∩ M_2 = {0}. It is enough to show that x + y = 0, where x ∈ M_1 and y ∈ M_2, implies x = 0.
So let r_1a_1* + r_2a_2 + r_3a_3 + ... + r_qa_q = 0. Substituting for a_1* we get
r_1a_1 + (r_2 + r_1m_2)a_2 + (r_3 + r_1m_3)a_3 + ... + (r_q + r_1m_q)a_q = 0.
We have already shown that in such a relation s_1 | r_1. Moreover, by (1),
s_1a_1* = s_1a_1 + s_1m_2a_2 + ... + s_1m_qa_q = s_1a_1 + s_2a_2 + ... + s_qa_q = 0,
and since s_1 divides r_1 it follows that r_1a_1* = 0. Hence M_1 ∩ M_2 = {0}.
Since M_2 has rank q − 1, by the induction hypothesis M_2 is the direct sum of cyclic modules. Also M_1 is the cyclic module generated by a_1*. Hence M is the direct sum of cyclic modules.
Now the proof can be modified for any Euclidean ring as follows: instead of taking s_1 to be the smallest positive integer, we take the element of the ring R occurring in such relations whose d-value is minimal, and whenever we write r_1 = m_1s_1 + t, we take either t = 0 or d(t) < d(s_1). The proof can then be completed.
Corollary:
Any finite abelian group is the direct product of cyclic groups.
Proof:
Since any finite abelian group is finitely generated, the result follows from the above theorem.
Theorem 5:
The number of non-isomorphic abelian groups of order p^n is p(n), where p(n) denotes the number of partitions of n.
Proof:
Let G be an abelian group of order p^n. By the previous corollary, G is the direct product of cyclic groups G_1, G_2, ..., G_k of orders p^(n_1), p^(n_2), ..., p^(n_k) respectively.
Now p^n = o(G) = o(G_1)·o(G_2)···o(G_k) (direct product)
= p^(n_1)·p^(n_2)···p^(n_k)
= p^(n_1 + n_2 + ... + n_k).
Thus n = n_1 + n_2 + ... + n_k. Hence, given an abelian group of order p^n, we get a partition of n. On the other hand, given a partition of n, say n = n_1 + n_2 + ... + n_k, let G_1 be a cyclic group of order p^(n_1), G_2 of order p^(n_2), ..., G_k of order p^(n_k). Let G be the external direct product of G_1, ..., G_k. Then G is an abelian group of order p^n.
Hence for each partition there is an abelian group, and for each abelian group of order p^n there is a partition. Hence there is a one-to-one correspondence between the partitions of n and the non-isomorphic abelian groups of order p^n. Hence the number of non-isomorphic abelian groups of order p^n is p(n).
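The correspondence can be made concrete by listing partitions. The following Python sketch (an illustration, not part of the text) enumerates them; each partition (n_1, n_2, ..., n_k) of n stands for the abelian group Z_{p^(n_1)} × ... × Z_{p^(n_k)}:

```python
def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples; each one
    corresponds to one abelian group of order p^n."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

# abelian groups of order p^3, for any prime p: exactly three of them
assert list(partitions(3)) == [(3,), (2, 1), (1, 1, 1)]
# p(n) for n = 1..7
assert [sum(1 for _ in partitions(n)) for n in range(1, 8)] == [1, 2, 3, 5, 7, 11, 15]
```

For instance, p(3) = 3 gives the three groups Z_{p³}, Z_{p²} × Z_p and Z_p × Z_p × Z_p of order p³.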
Solved Problems:
1) Let V be the set of all continuous complex-valued functions on the closed interval [a, b]. Show that (f, g) = ∫ₐᵇ f(t)ḡ(t) dt defines an inner product on V.
Solution:
(i) The conjugate of (f, g) = the conjugate of ∫ₐᵇ f(t)ḡ(t) dt
= ∫ₐᵇ f̄(t)g(t) dt
= ∫ₐᵇ g(t)f̄(t) dt
= (g, f).
(ii) (αf + βg, h) = ∫ₐᵇ (αf(t) + βg(t)) h̄(t) dt
= α ∫ₐᵇ f(t)h̄(t) dt + β ∫ₐᵇ g(t)h̄(t) dt
= α(f, h) + β(g, h).
(iii) f(t)f̄(t) = |f(t)|² = F(t), say, where F(t) is a real-valued, non-negative continuous function of t.
∴ (f, f) = ∫ₐᵇ f(t)f̄(t) dt = ∫ₐᵇ F(t) dt ≥ 0.
Also (f, f) = 0 ⇒ ∫ₐᵇ F(t) dt = 0 ⇒ F(t) = 0 for all t (F being continuous and non-negative) ⇒ f(t) = 0 for all t ⇒ f is the zero function.
Hence it is an inner product.
2) Let an inner product in V over the real field be defined as (p, q) = ∫₀¹ p(t)q(t) dt. Starting from the basis {1, t, t²} of V, obtain an orthonormal basis.
Solution:
Let {v₁, v₂, v₃} = {1, t, t²}.
Choose u₁ = v₁ = 1. Then ‖u₁‖² = (u₁, u₁) = ∫₀¹ 1·1 dt = 1, so w₁ = u₁/‖u₁‖ = 1.
Next let u₂ = v₂ − (v₂, w₁)w₁. Here (v₂, w₁) = ∫₀¹ t·1 dt = 1/2, so
u₂ = t − 1/2.
‖u₂‖² = (u₂, u₂) = ∫₀¹ (t − 1/2)² dt = [(t − 1/2)³/3]₀¹ = 1/24 + 1/24 = 1/12.
Hence w₂ = u₂/‖u₂‖ = √12 (t − 1/2).
Next let u₃ = v₃ − (v₃, w₁)w₁ − ((v₃, u₂)/(u₂, u₂)) u₂.
Here (v₃, w₁) = ∫₀¹ t²·1 dt = 1/3, and
(v₃, u₂) = ∫₀¹ t²(t − 1/2) dt = [t⁴/4 − t³/6]₀¹ = 1/4 − 1/6 = 1/12,
so that (v₃, u₂)/(u₂, u₂) = (1/12)/(1/12) = 1. Hence
u₃ = t² − 1/3 − (t − 1/2) = t² − t + 1/6.
‖u₃‖² = (u₃, u₃) = ∫₀¹ (t² − t + 1/6)² dt = 1/180.
Hence w₃ = u₃/‖u₃‖ = √180 (t² − t + 1/6).
The required orthonormal basis is therefore
{w₁, w₂, w₃} = {1, √12(t − 1/2), √180(t² − t + 1/6)}.
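The same computation can be carried out exactly with rational arithmetic. The following Python sketch (an illustration, not from the text) represents polynomials as coefficient lists and uses ∫₀¹ tᵏ dt = 1/(k + 1):

```python
from fractions import Fraction as Fr

def ip(p, q):
    """(p, q) = integral from 0 to 1 of p(t)q(t) dt, where
    p = [p0, p1, ...] means p0 + p1*t + p2*t^2 + ..."""
    s = Fr(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            s += Fr(a) * Fr(b) / (i + j + 1)   # ∫ t^(i+j) dt = 1/(i+j+1)
    return s

basis = [[1], [0, 1], [0, 0, 1]]                # 1, t, t^2
us = []
for v in basis:                                  # Gram-Schmidt, unnormalised
    u = [Fr(c) for c in v] + [Fr(0)] * (3 - len(v))
    for w in us:
        c = ip(v, w) / ip(w, w)
        u = [a - c * b for a, b in zip(u, w)]
    us.append(u)

# u2 = t - 1/2 and u3 = t^2 - t + 1/6, with squared norms 1/12 and 1/180
assert us[1] == [Fr(-1, 2), Fr(1), Fr(0)]
assert us[2] == [Fr(1, 6), Fr(-1), Fr(1)]
assert ip(us[1], us[1]) == Fr(1, 12)
assert ip(us[2], us[2]) == Fr(1, 180)
```

Dividing each uᵢ by its norm reproduces the orthonormal basis {1, √12(t − 1/2), √180(t² − t + 1/6)} found above.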
Exercise:
1. Prove that the following define an inner product in the respective vector space:
a. In F⁽ⁿ⁾ define, for u = (α₁, ..., αₙ) and v = (β₁, ..., βₙ),
(u, v) = α₁β̄₁ + α₂β̄₂ + ... + αₙβ̄ₙ.
b. In F⁽²⁾, given u = (α₁, α₂) and v = (β₁, β₂), define
(u, v) = 2α₁β̄₁ + α₁β̄₂ + α₂β̄₁ + α₂β̄₂.
c. If F is the real field, find all 4-tuples of real numbers (a, b, c, d) such that, for u = (α₁, α₂), v = (β₁, β₂) in F⁽²⁾,
(u, v) = aα₁β₁ + bα₂β₂ + cα₁β₂ + dα₂β₁
defines an inner product.
2. If dim V = n and if {w₁, w₂, ..., wₘ} is an orthonormal set in V, prove that there exist vectors wₘ₊₁, ..., wₙ such that {w₁, w₂, ..., wₘ, wₘ₊₁, ..., wₙ} is an orthonormal set.
3. Let V be the set of real functions y = f(x) satisfying d²y/dx² + 9y = 0.
a. Prove that V is a two-dimensional real vector space.
b. In V define (y, z) = ∫₀^π y z dx. Find an orthonormal basis of V.
UNIT - IV
EXTENSION FIELDS AND GALOIS THEORY
Field extension. Definition:
Suppose F is a field. Then a field K is said to be an extension of F if F is a subfield of K.
In the chapter on vector spaces we have shown that if F is a subfield of a field K, then K can be regarded as a vector space over F under the ordinary field operations in K. The dimension of the vector space K(F) will play an important role in this chapter. Throughout this chapter K will denote an extension of F.
Degree of a field extension. Definition:
Let K be an extension of the field F. The dimension of K as a vector space over F, i.e. the dimension of the vector space K(F), is called the degree of K over F. We shall always denote the degree of K over F by [K : F].
Finite field extension. Definition:
Let K be an extension of the field F. Then K is said to be a finite extension of F if the degree of K over F is finite. Thus K is a finite extension of F if the vector space K(F) is finite-dimensional.
In this chapter we shall devote particular attention to finite field extensions because of their importance in the study of the theory of equations. Before proceeding further we give some illustrations of field extensions.
Illustrations:
1. If F is any field, then F can be regarded as a subfield of itself; therefore F can be thought of as an extension of F. The dimension of the vector space F(F) is one; in fact the unit element 1 of F is a basis of this vector space. Thus the degree of F over F is one, i.e. [F : F] = 1.
2. The field C of complex numbers is a finite extension of the field R of real numbers, and [C : R] = 2. The set {1, i}, where i = √(−1), is a basis of the vector space C(R). If a, b ∈ R, then
a·1 + b·i = 0 ⇒ a = 0, b = 0.
Therefore the set {1, i} is linearly independent over R. Also if a + ib ∈ C, then a + ib is a linear combination of 1, i over R; thus the set {1, i} generates C(R). Therefore the set {1, i} is a basis for the vector space C(R).
3. Let Q be the field of rational numbers. The field Q(√2) = {a + b√2 : a, b ∈ Q} is a finite extension of Q. We have [Q(√2) : Q] = 2. In fact the set {1, √2} is a basis of Q(√2) regarded as a vector space over the field Q.
4. The field
Q(√2, √3) = {a + b√2 + c√3 + d√2√3 : a, b, c, d ∈ Q}
is a finite extension of Q. We have [Q(√2, √3) : Q] = 4, as can easily be seen: the set {1, √2, √3, √2√3} is a basis of Q(√2, √3) thought of as a vector space over the field Q.
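Illustration 3 can be modelled directly in code. The following Python sketch (an illustration, not part of the text) represents an element a + b√2 of Q(√2) by the rational pair (a, b) and checks that sums, products and inverses stay in this form, which is what makes {1, √2} a basis and Q(√2) a field:

```python
from fractions import Fraction as Fr

class QSqrt2:
    """Element a + b*sqrt(2) of Q(sqrt 2), with a, b rational."""
    def __init__(self, a, b):
        self.a, self.b = Fr(a), Fr(b)
    def __add__(self, o):
        return QSqrt2(self.a + o.a, self.b + o.b)
    def __mul__(self, o):
        # (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2
        return QSqrt2(self.a * o.a + 2 * self.b * o.b,
                      self.a * o.b + self.b * o.a)
    def inverse(self):
        # 1/(a + b√2) = (a - b√2)/(a² - 2b²); a² - 2b² ≠ 0 since √2 ∉ Q
        d = self.a * self.a - 2 * self.b * self.b
        return QSqrt2(self.a / d, -self.b / d)
    def __eq__(self, o):
        return (self.a, self.b) == (o.a, o.b)

x = QSqrt2(1, 1)                      # 1 + √2
assert x * x == QSqrt2(3, 2)          # (1 + √2)² = 3 + 2√2
assert x * x.inverse() == QSqrt2(1, 0)
```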
Transitivity of finite extension:
Theorem 1:
If L is a finite extension of K and if K is a finite extension of F, then L is a finite extension of F; moreover, [L : F] = [L : K][K : F].
Proof:
Let K be a subfield of L and F be a subfield of K, i.e. L ⊇ K ⊇ F. Let [L : K] = m and [K : F] = n.
Suppose {u₁, u₂, ..., uₘ} is a basis of L over K and {w₁, w₂, ..., wₙ} is a basis of K over F. Then u₁, ..., uₘ ∈ L and w₁, ..., wₙ ∈ K; since K ⊆ L, we have w₁, ..., wₙ ∈ L. Consequently the mn elements uᵢwⱼ, where i = 1, ..., m and j = 1, ..., n, are all in L. We shall prove that the set of these mn elements forms a basis of L over F. Then we shall have [L : F] = mn, i.e. L will also be a finite extension of F, and also we shall have [L : F] = [L : K][K : F]. Thus the theorem will be proved.
First we shall show that the set {uᵢwⱼ} generates L over F. Let r be any element of L. Since {u₁, ..., uₘ} is a basis of L(K), r can be expressed as a linear combination of u₁, ..., uₘ with coefficients in K. So we have
r = Σᵢ kᵢuᵢ, kᵢ ∈ K.   ... (1)
Now kᵢ ∈ K and {w₁, w₂, ..., wₙ} is a basis of K(F); therefore we have
kᵢ = Σⱼ fᵢⱼwⱼ, fᵢⱼ ∈ F.   ... (2)
From (1) and (2) we have
r = Σᵢ (Σⱼ fᵢⱼwⱼ) uᵢ = Σᵢ Σⱼ fᵢⱼ (uᵢwⱼ), fᵢⱼ ∈ F.
Thus r is a linear combination of the elements uᵢwⱼ over F. Therefore the set of mn elements {uᵢwⱼ} generates the vector space L(F).
Now we shall show that the set {uᵢwⱼ} is linearly independent over F. Suppose
Σᵢ Σⱼ fᵢⱼ (uᵢwⱼ) = 0, fᵢⱼ ∈ F. Then Σᵢ (Σⱼ fᵢⱼwⱼ) uᵢ = 0
⇒ Σⱼ fᵢⱼwⱼ = 0 for i = 1, ..., m, since {u₁, ..., uₘ} is a basis of L(K) and each Σⱼ fᵢⱼwⱼ ∈ K
⇒ fᵢⱼ = 0 for i = 1, ..., m, j = 1, ..., n, since {w₁, ..., wₙ} is a basis of K(F) and each fᵢⱼ ∈ F
⇒ the set {uᵢwⱼ} is linearly independent over F.
Hence the set {uᵢwⱼ} is a basis of L over F. This proves the theorem.
Theorem 2:
If L is a finite extension of F and K is a subfield of L which contains F, then [K : F] | [L : F], i.e. [K : F] is a divisor of [L : F].
Proof:
Let L, K, F be three fields with L ⊇ K ⊇ F. Suppose further that [L : F] is finite and equal to n. Let {u₁, ..., uₙ} be a basis of L over F; then {u₁, ..., uₙ} generates L over F.
Since F ⊆ K, any linear combination of u₁, ..., uₙ over F is also a linear combination over K. Therefore the set {u₁, ..., uₙ} also generates L over K, though it may not be linearly independent over K. Since L(K) is generated by a finite set, it is a finite-dimensional vector space, and so [L : K] is finite. Further, K(F) is a subspace of L(F); since [L : F] is finite, [K : F] is finite. (Recall that each subspace of a finite-dimensional vector space is also finite-dimensional.) Now by Theorem 1 we have [L : F] = [L : K][K : F] ⇒ [K : F] is a divisor of [L : F].
Note:
From Theorem 2 we conclude that if [L : F] is a prime number, then there can be no field K properly contained between L and F. In other words, if [L : F] is prime and K is any subfield of L containing F, then either K = L or K = F.
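Theorem 1 can be seen at work in the tower Q ⊆ Q(√2) ⊆ Q(√2, √3): the product basis is {1, √2} × {1, √3} = {1, √2, √3, √6}. The following Python sketch (a numerical illustration, not a proof and not part of the text) checks that no small integer combination of these four numbers vanishes, which is consistent with their linear independence over Q:

```python
import itertools
import math

# Product basis of Q(√2, √3) over Q, as in Theorem 1 with the bases
# {1, √2} of Q(√2) over Q and {1, √3} of Q(√2, √3) over Q(√2).
basis = [1.0, math.sqrt(2), math.sqrt(3), math.sqrt(6)]

# No non-trivial combination with integer coefficients in [-5, 5]
# comes close to zero.
for coeffs in itertools.product(range(-5, 6), repeat=4):
    if any(coeffs):
        assert abs(sum(c * b for c, b in zip(coeffs, basis))) > 1e-9
print("no small integer relation among 1, √2, √3, √6")
```

A finite search over floating-point values is of course only evidence; the actual independence is what the degree count [Q(√2, √3) : Q] = 2·2 = 4 asserts.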
Field adjunctions:
Suppose K is an extension of a field F and let a ∈ K. Let C be the collection of all subfields of K containing both F and a. C is not empty, because K itself is in C. Now the intersection of an arbitrary collection of subfields of K is also a subfield of K. Let F(a) denote the intersection of all those subfields of K which are members of C. Then F(a) is a subfield of K. Obviously F(a) contains both F and a, because each member of C contains both F and a; thus F(a) is itself a member of C. Further, if E is any subfield of K containing both F and a, then F(a) is contained in E, the reason being that F(a) is the intersection of the members of C and E is a member of C. Thus F(a) is a subfield of K containing both F and a, and is itself contained in every subfield of K containing both F and a. Hence F(a) is the smallest subfield of K containing both F and a. We call F(a) the subfield obtained by adjoining a to F; the process is called field adjunction.
Constructive description of F(a). Suppose K is an extension of a field F and let a ∈ K. Let
U = { (k₀aⁿ + k₁aⁿ⁻¹ + ... + kₙ) / (l₀aᵐ + l₁aᵐ⁻¹ + ... + lₘ) : the kᵢ and lᵢ are elements of F, l₀aᵐ + l₁aᵐ⁻¹ + ... + lₘ is not the zero element of K, and n, m are any non-negative integers }.
Obviously U contains both F and a. It can be easily seen that
(i) α, β ∈ U ⇒ α − β ∈ U;
(ii) α, β ∈ U, β ≠ 0 ⇒ α/β ∈ U.
Thus U is a subfield of K containing both F and a. This implies that F(a) ⊆ U.
Further, any subfield of K which contains both F and a must, by virtue of closure under addition and multiplication, contain all the elements k₀aⁿ + k₁aⁿ⁻¹ + ... + kₙ where each kᵢ ∈ F. Being a subfield of K, it must then also contain all quotients of such elements, i.e. it must contain U. Since F(a) is a subfield of K containing both F and a, F(a) must contain U, i.e. U ⊆ F(a).
Now U ⊆ F(a) and F(a) ⊆ U ⇒ U = F(a).
Simple field extension. Definition:
The extension K of a field F is called a simple extension of F if K = F(a) for some a in K.
Let K be an extension of a field F and let a, b ∈ K. Let T = F(a). Since F(a) is a subfield of K, K is also an extension of F(a). Let W be the subfield of K obtained by adjoining b to F(a); then W = (F(a))(b). We shall write (F(a))(b) as F(a, b). Similarly we can describe F(b, a). We have
F(a, b) = (F(a))(b)
= the smallest subfield of K containing both F(a) and b
= the smallest subfield of K containing F, a and b, because any subfield of K which contains both F and a must contain F(a).
Similarly F(b, a) = the smallest subfield of K containing F, a and b.
Since the subfield of K generated by F, a and b is unique, F(a, b) = F(b, a). Thus F(a, b) is the subfield of K obtained by adjoining both a and b to F.
Similarly, if a₁, a₂, ..., aₙ ∈ K, then F(a₁, a₂, ..., aₙ) will be described as the subfield of K generated by F, a₁, ..., aₙ; in other words, F(a₁, a₂, ..., aₙ) is the smallest subfield of K containing F as well as a₁, a₂, ..., aₙ.
Algebraic field extensions.
Let q(x) ∈ F[x], the ring of polynomials in x over F, say
q(x) = α₀xᵐ + α₁xᵐ⁻¹ + ... + αₘ. Suppose b ∈ K, where K is any extension of F. Then by q(b) we shall mean the element
q(b) = α₀bᵐ + α₁bᵐ⁻¹ + ... + αₘ of K. Sometimes q(b) is also called the value of q(x) obtained by substituting b for x. The element b is said to satisfy q(x) if q(b) = 0; we then also say that b is a root of q(x).
Algebraic element. Definition:
Let K be an extension of a field F. An element a ∈ K is said to be algebraic over F if there is a non-zero polynomial p(x) ∈ F[x] for which p(a) = 0.
In other words, a ∈ K is said to be algebraic over F if there exist elements α₀, α₁, ..., αₙ in F, not all 0, such that
α₀aⁿ + α₁aⁿ⁻¹ + ... + αₙ = 0.
Transcendental element. Definition:
Let K be an extension of a field F. An element a ∈ K is said to be transcendental over F if it is not algebraic over F.
Definition:
A complex number is said to be an algebraic number if it is algebraic over the field of rational numbers.
A complex number which is not algebraic is called transcendental. The number e is transcendental.
Minimal polynomial of an algebraic element. Definition:
Let K be an extension of a field F and let a ∈ K be algebraic over F. Suppose p(x) is a polynomial over F of lowest positive degree such that p(a) = 0. Then p(x) is called a minimal polynomial for a over F.
Let us impose the restriction on a minimal polynomial for a over F that it should be monic, i.e. that in it the coefficient of the highest power of x should be 1. Then we can speak of the minimal polynomial for a over F, because it will be unique.
Theorem 3:
Let a ∈ K be algebraic over F. Then any two minimal monic polynomials for a over F are equal.
Proof:
Let xⁿ + α₁xⁿ⁻¹ + ... + αₙ and xⁿ + β₁xⁿ⁻¹ + ... + βₙ be two minimal monic polynomials for a over F. Then
aⁿ + α₁aⁿ⁻¹ + ... + αₙ = 0 = aⁿ + β₁aⁿ⁻¹ + ... + βₙ
⇒ (aⁿ + α₁aⁿ⁻¹ + ... + αₙ) − (aⁿ + β₁aⁿ⁻¹ + ... + βₙ) = 0
⇒ (α₁ − β₁)aⁿ⁻¹ + (α₂ − β₂)aⁿ⁻² + ... + (αₙ − βₙ) = 0
⇒ a satisfies the polynomial q(x) = (α₁ − β₁)xⁿ⁻¹ + ... + (αₙ − βₙ) belonging to F[x].
But q(x) must be the zero polynomial, because the minimal polynomial for a over F is of degree n, while q(x), if it were not the zero polynomial, would be of degree less than n.
⇒ α₁ − β₁ = 0, ..., αₙ − βₙ = 0 ⇒ α₁ = β₁, ..., αₙ = βₙ
⇒ xⁿ + α₁xⁿ⁻¹ + ... + αₙ = xⁿ + β₁xⁿ⁻¹ + ... + βₙ.
This completes the proof of the theorem.
Irreducibility of the minimal polynomial:
Theorem 4:
Let a ∈ K be algebraic over F and let p(x) be a minimal polynomial for a over F. Then p(x) is irreducible over F.
Proof:
p(x) is a polynomial in F[x] of smallest positive degree such that p(a) = 0. Suppose p(x) is not irreducible over F. Then p(x) can be resolved into non-trivial factors: let p(x) = f(x)g(x), where f(x) and g(x) are polynomials of positive degree in F[x], each of degree less than that of p(x). We have
p(a) = f(a)g(a) ⇒ 0 = f(a)g(a)
⇒ f(a) = 0 or g(a) = 0
⇒ a satisfies f(x) or g(x)
⇒ p(x) is not a minimal polynomial for a over F, because deg f(x) < deg p(x) and deg g(x) < deg p(x).
Since p(x) is a minimal polynomial for a over F, our assumption that p(x) is not irreducible over F is wrong. Hence p(x) must be irreducible over F.
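For polynomials of degree 2 or 3 over Q, irreducibility amounts to having no rational root, and rational roots can be searched for by the rational root theorem. The following Python sketch (an illustration, not part of the text) applies this to x² − 2, confirming it is irreducible over Q and hence is the minimal polynomial of √2 over Q:

```python
from fractions import Fraction as Fr

def rational_roots(coeffs):
    """All rational roots of an integer polynomial c0 + c1 x + ... + cn x^n.
    By the rational root theorem, any root p/q (in lowest terms) has
    p dividing c0 and q dividing cn, so only finitely many candidates."""
    c0, cn = coeffs[0], coeffs[-1]
    cands = set()
    for p in range(1, abs(c0) + 1):
        if c0 % p == 0:
            for q in range(1, abs(cn) + 1):
                if cn % q == 0:
                    cands.update({Fr(p, q), Fr(-p, q)})
    def val(r):
        return sum(Fr(c) * r**i for i, c in enumerate(coeffs))
    return sorted(r for r in cands if val(r) == 0)

# x^2 - 2 has no rational root, so (being of degree 2) it is
# irreducible over Q.
assert rational_roots([-2, 0, 1]) == []
# by contrast, x^2 - 1 = (x - 1)(x + 1):
assert rational_roots([-1, 0, 1]) == [-1, 1]
```

For degree 4 and above the absence of rational roots no longer suffices; a quartic can factor into two irreducible quadratics.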
Degree of an algebraic element. Definition:
Let K be an extension of the field F. The element a ∈ K is said to be algebraic of degree n over F if it satisfies a non-zero polynomial over F of degree n but no non-zero polynomial of lower degree.
Thus a ∈ K is algebraic of degree n over F if the minimal polynomial for a over F is of degree n.
Algebraic extension. Definition:
The extension K of F is called an algebraic extension of F if every element of K is algebraic over F.
If there exists a ∈ K such that a is not algebraic over F, then K is called a transcendental extension of F.
The field C of complex numbers is an algebraic extension of R, the field of real numbers. The field R of real numbers is not an algebraic extension of the field Q of rational numbers. In fact, π is an element of R which is not algebraic over Q: it was proved by Lindemann in 1882 that there exists no non-zero polynomial with rational coefficients satisfied by π.
If F is any field, then F is an algebraic extension of F: each a ∈ F satisfies the non-zero polynomial x − a in F[x].
Theorem 5:
Let K be an extension of a field F and let a ∈ K be algebraic of degree n over F. Then
F(a) = {α₀ + α₁a + α₂a² + … + α_(n−1) a^(n−1) : α₀, α₁, …, α_(n−1) ∈ F}.
Also the expression for each element of F(a) in the form α₀ + α₁a + … + α_(n−1) a^(n−1) is unique.
Proof:
If n = 1, then a ∈ F. Therefore in this case F(a) = F and the result of the theorem is obviously true.
So let us take n > 1. Since a ∈ K is algebraic of degree n over F, the minimal polynomial for a over F is of degree n. Let p(x) = xⁿ + α₁x^(n−1) + … + αₙ be the minimal polynomial for a over F. Then
aⁿ + α₁a^(n−1) + … + αₙ = 0
⇒ aⁿ = −(α₁a^(n−1) + α₂a^(n−2) + … + αₙ)   …(1)
⇒ a^(n+1) = −(α₁aⁿ + α₂a^(n−1) + … + αₙa)   [on multiplying both sides by a]
⇒ a^(n+1) = α₁(α₁a^(n−1) + α₂a^(n−2) + … + αₙ) − (α₂a^(n−1) + … + αₙa)   [putting the value of aⁿ from (1)]
⇒ a^(n+1) is a linear combination of the elements 1, a, …, a^(n−1) over F.
Continuing the above process we can show that a^(n+k), for k ≥ 0, is a linear combination over F of 1, a, …, a^(n−1). Now let
T = {α₀ + α₁a + … + α_(n−1) a^(n−1) : α₀, α₁, …, α_(n−1) ∈ F}.
We shall prove that T = F(a). For this we shall first prove that T is a subfield of K. Let
u = α₀ + α₁a + … + α_(n−1) a^(n−1),   v = r₀ + r₁a + … + r_(n−1) a^(n−1)
be any two elements in T.
u − v = (α₀ − r₀) + (α₁ − r₁)a + … + (α_(n−1) − r_(n−1)) a^(n−1) ∈ T.
Now let 0 ≠ u = α₀ + α₁a + … + α_(n−1) a^(n−1) be in T.
Let q(x) = α₀ + α₁x + … + α_(n−1) x^(n−1) ∈ F[x].
Then q(a) = u ≠ 0.
We claim that q(x) and p(x) are relatively prime. Any common divisor of p(x) and q(x) in F[x] divides p(x); since p(x) is irreducible over F, such a divisor is either a unit or an associate of p(x). But an associate of p(x) has degree n, while q(x) is a non-zero polynomial of degree at most n − 1, so no associate of p(x) can divide q(x). Therefore q(x) and p(x) must be relatively prime. Consequently we can find polynomials s(x) and t(x) in F[x] such that p(x) s(x) + q(x) t(x) = 1.
Putting x = a in this relation, we get p(a) s(a) + q(a) t(a) = 1
⇒ q(a) t(a) = 1   [p(a) = 0]
⇒ u t(a) = 1   [q(a) = u]
⇒ t(a) is the inverse of u.
Now in t(a) all powers of a higher than n − 1 can be replaced by linear combinations of 1, a, …, a^(n−1) over F. Therefore t(a) ∈ T; thus t(a) = u⁻¹ ∈ T. Now in the product u⁻¹v all powers of a higher than n − 1 can be replaced by linear combinations of 1, a, …, a^(n−1) over F.
Therefore u⁻¹v ∈ T. Thus T is a subfield of K.
Now from the definition of T it is obvious that both F and a are in T. Also, by virtue of closure under addition and multiplication, any subfield of K which contains both F and a must contain T. Thus T is the smallest subfield of K containing both F and a. Hence T = F(a).
Now let u ∈ T. Further let u = α₀ + α₁a + … + α_(n−1) a^(n−1) and also u = r₀ + r₁a + … + r_(n−1) a^(n−1). Then
α₀ + α₁a + … + α_(n−1) a^(n−1) = r₀ + r₁a + … + r_(n−1) a^(n−1)
⇒ (α₀ − r₀) + (α₁ − r₁)a + … + (α_(n−1) − r_(n−1)) a^(n−1) = 0
⇒ a satisfies the polynomial h(x) = (α₀ − r₀) + (α₁ − r₁)x + … + (α_(n−1) − r_(n−1)) x^(n−1) belonging to F[x]
⇒ h(x) must be the zero polynomial, because otherwise a will not be of degree n over F [note that if h(x) ≠ 0, then deg h(x) < n]
⇒ α₀ − r₀ = 0, α₁ − r₁ = 0, …, α_(n−1) − r_(n−1) = 0
⇒ α₀ = r₀, α₁ = r₁, …, α_(n−1) = r_(n−1)
⇒ the expression for u in the form α₀ + α₁a + … + α_(n−1) a^(n−1) is unique.
Theorem 6:
Let K be an extension of a field F and let a ∈ K be algebraic over F. Suppose a satisfies an irreducible polynomial p(x) in F[x]. Then p(x) must be a minimal polynomial for a over F.
Proof:
Let M = {f(x) ∈ F[x] : f(a) = 0}.
We claim that M is an ideal of F[x]. The proof is as follows.
Let f(x), g(x) ∈ M. Then f(a) = 0, g(a) = 0.
Let s(x) = f(x) − g(x). Then s(a) = f(a) − g(a) = 0 − 0 = 0
⇒ s(x) ∈ M.
Also let f(x) ∈ M and h(x) ∈ F[x]. Then f(a) = 0.
Let t(x) = f(x) h(x). Then t(a) = f(a) h(a) = 0 · h(a) = 0
⇒ t(x) ∈ M.
Hence M is an ideal of F[x]. Obviously M ≠ F[x], because if f(x) = 1 ∈ F[x], then f(a) = 1 ≠ 0. Thus f(x) = 1 is not in M. Therefore M ≠ F[x].
Now p(x) is an irreducible polynomial in F[x]. Therefore the ideal N = (p(x)) of F[x] generated by p(x) is a maximal ideal, because F[x] is a Euclidean ring. We have p(x) ∈ M because it is given that p(a) = 0. Now if n(x) = m(x) p(x) is any element of (p(x)), then n(a) = m(a) p(a) = 0; this implies that n(x) is in M.
Thus N ⊆ M. Therefore M is an ideal of F[x] contained between N and F[x], i.e. N ⊆ M ⊆ F[x].
Since N is a maximal ideal and M ≠ F[x], we must have N = M, i.e. M = (p(x)).
Now suppose that p(x) is not a minimal polynomial for a over F. Let q(x) be a polynomial in F[x] of degree lower than that of p(x) and satisfied by a. Since q(a) = 0, we have q(x) ∈ M
⇒ q(x) = p(x) r(x) for some r(x) ∈ F[x]. But this result is absurd because deg q(x) < deg p(x). Hence p(x) must be a minimal polynomial for a over F.
Theorem 7:
Let K be an extension of a field F. Then the element a ∈ K is algebraic over F if and only if F(a) is a finite extension of F.
Proof:
Suppose F(a) is a finite extension of F. Then to prove that a is algebraic over F, let [F(a):F] = m. Since F(a) is a field and a ∈ F(a), the m + 1 elements 1, a, a², …, a^(m−1), a^m are all in F(a). Since the dimension of the vector space F(a) over F is m, these m + 1 elements of F(a) are linearly dependent over F. So there exist elements α₀, α₁, α₂, …, α_m ∈ F, not all 0, such that
α₀ · 1 + α₁a + α₂a² + … + α_m a^m = 0
⇒ a satisfies the non-zero polynomial f(x) = α₀ + α₁x + α₂x² + … + α_m x^m ∈ F[x]
⇒ a is algebraic over F.
This proves the 'if' part of the theorem. Now we shall prove the 'only if' part of the theorem. It is given that a ∈ K is algebraic over F and we are to prove that F(a) is a finite extension of F. Let p(x) be a polynomial over F of lowest positive degree satisfied by a. Let deg p(x) = n. Then a is algebraic of degree n over F. Therefore
F(a) = {α₀ + α₁a + … + α_(n−1) a^(n−1) : α₀, α₁, …, α_(n−1) ∈ F}.
From this we see that F(a) is a vector space over F spanned by the elements 1, a, a², …, a^(n−1). These elements of F(a) are also linearly independent over F. Because r₀ · 1 + r₁a + r₂a² + … + r_(n−1) a^(n−1) = 0 with rᵢ ∈ F
⇒ a satisfies the polynomial q(x) = r₀ + r₁x + … + r_(n−1) x^(n−1) belonging to F[x]
⇒ q(x) must be the zero polynomial, otherwise p(x) will not be a polynomial of lowest positive degree satisfied by a [note that deg p(x) = n while deg q(x) < n if q(x) ≠ 0]
⇒ r₀ = 0, r₁ = 0, r₂ = 0, …, r_(n−1) = 0
⇒ 1, a, a², …, a^(n−1) ∈ F(a) are linearly independent over F.
Thus the n elements 1, a, a², …, a^(n−1) constitute a basis for F(a) over F. Hence [F(a):F] = n. Therefore F(a) is a finite extension of F.
Remark:
If [F(a):F] = m, then a is algebraic over F. Let the degree of a over F be n; then we have [F(a):F] = n, therefore m = n. Hence if [F(a):F] = m, then a is algebraic of degree m over F.
Theorem 8:
Let K be an extension of a field F and let a ∈ K be algebraic of degree n over F. Then [F(a):F] = n.
Proof:
This theorem is nothing but the 'only if' part of the previous theorem.
Theorem 9:
Every finite extension K of a field F is algebraic.
Proof:
Suppose K is a finite extension of F. Then K will be an algebraic extension of F if every element a in K is algebraic over F. Let [K:F] = m.
Since K is a field and a ∈ K, the m + 1 elements 1, a, a², …, a^m are all in K. Since the dimension of the vector space K over F is m, these m + 1 elements of K are linearly dependent over F. So there exist elements α₀, α₁, …, α_m ∈ F, not all zero, such that
α₀ · 1 + α₁a + α₂a² + … + α_m a^m = 0
⇒ a satisfies the non-zero polynomial p(x) = α₀ + α₁x + α₂x² + … + α_m x^m ∈ F[x]
⇒ a is algebraic over F.
Hence K is an algebraic extension of F.
Remark:
If [K:F] = m, then each element a in K is algebraic over F and the degree of a over F will be ≤ m.
Theorem 10:
Let K be an extension of a field F and let a₁, a₂, …, aₙ be n elements in K algebraic over F. Then F(a₁, a₂, …, aₙ) is a finite extension of F and consequently an algebraic extension of F.
Proof:
We have F ⊆ F(a₁) ⊆ F(a₁, a₂) ⊆ … ⊆ F(a₁, a₂, …, aₙ) ⊆ K.
Since a_k is algebraic over F, it is also algebraic over F(a₁, a₂, …, a_(k−1)), which is a superfield of F. Note that any non-zero polynomial over F is also a non-zero polynomial over F(a₁, a₂, …, a_(k−1)).
Now a_k is algebraic over F(a₁, a₂, …, a_(k−1))
⇒ F(a₁, a₂, …, a_(k−1))(a_k) is a finite extension of F(a₁, a₂, …, a_(k−1))
⇒ F(a₁, a₂, …, a_(k−1), a_k) is a finite extension of F(a₁, a₂, …, a_(k−1))
⇒ [F(a₁, a₂, …, a_k) : F(a₁, a₂, …, a_(k−1))] is finite, for each k. Now
[F(a₁, a₂, …, aₙ) : F]
= [F(a₁, a₂, …, aₙ) : F(a₁, a₂, …, a_(n−1))] [F(a₁, a₂, …, a_(n−1)) : F(a₁, a₂, …, a_(n−2))] …… [F(a₁) : F]
is finite. Hence F(a₁, a₂, …, aₙ) is a finite extension of F. Consequently F(a₁, a₂, …, aₙ) is an algebraic extension of F.
Theorem 11:
Let K be an extension of a field F. Then the elements in K which are algebraic over F form a subfield of K. In other words, if a, b in K are algebraic over F, then a ± b, ab and a/b (if b ≠ 0) are algebraic over F.
Proof:
Suppose a, b ∈ K are algebraic over F. Since b is algebraic over F, it is also algebraic over F(a), which is a superfield of F. Note that any non-zero polynomial over F is also a non-zero polynomial over F(a).
Now b is algebraic over F(a) ⇒ F(a)(b) is a finite extension of F(a), i.e. [F(a,b):F(a)] is finite.
Also [F(a):F] is finite because a is algebraic over F.
Now [F(a,b):F] = [F(a,b):F(a)] [F(a):F] = finite.
So F(a,b) is a finite extension of F; consequently F(a,b) is an algebraic extension of F. Now F(a,b) is a field and a, b ∈ F(a,b); therefore a ± b, ab and a/b (if b ≠ 0) are all in F(a,b).
Hence the elements of K algebraic over F form a subfield of K.
Theorem 12:
If a and b in K are algebraic over F of degrees m and n respectively, then a ± b, ab and a/b (if b ≠ 0) are algebraic over F of degree at most mn.
Proof:
Since a is algebraic of degree m over F, F(a) is of degree m over F, i.e. [F(a):F] = m. Again, b is algebraic of degree n over F; therefore b is algebraic of degree at most n over F(a), which is a superfield of F. This implies that the subfield F(a)(b), i.e. F(a,b), of K is of degree at most n over F(a), i.e. [F(a,b):F(a)] ≤ n.
Now [F(a,b):F] = [F(a,b):F(a)] [F(a):F] ≤ mn.
Since [F(a,b):F] is finite, F(a,b) is a finite extension of F and so it is an algebraic extension of F. Since F(a,b) is a field, a, b ∈ F(a,b) ⇒ a ± b, ab, a/b (if b ≠ 0) are all in F(a,b). Hence a ± b, ab, a/b (if b ≠ 0) are algebraic of degree at most mn over F.
Theorem 13: Transitivity of algebraic extensions.
If L is an algebraic extension of K and K is an algebraic extension of F, then L is an algebraic extension of F.
Proof:
Let a be any arbitrary element in L. If we prove that a is algebraic over F, then L will be an algebraic extension of F. Since a ∈ L and L is an algebraic extension of K, a satisfies some polynomial
xⁿ + α₁x^(n−1) + α₂x^(n−2) + … + αₙ, where α₁, α₂, …, αₙ are in K. Now K
is an algebraic extension of F; therefore α₁, α₂, …, αₙ are algebraic over F. So M = F(α₁, α₂, …, αₙ) is a finite extension of F. Now a satisfies the polynomial xⁿ + α₁x^(n−1) + … + αₙ, whose coefficients α₁, α₂, …, αₙ are in M = F(α₁, α₂, …, αₙ).
Therefore a is algebraic over M; consequently M(a) is a finite extension of M. Now M(a) is a finite extension of M and M is a finite extension of F, so M(a) is a finite extension of F [since [M(a):F] = [M(a):M][M:F]]. Since a ∈ M(a), it follows that a is algebraic over F. Hence L is an algebraic extension of F, and the proof is complete.
Roots of polynomials:
Let F be any field and let p(x) be any polynomial in F[x]. Our aim is now to find a field K which is an extension of F and in which p(x) has a root.
Root of a polynomial. Definition:
Let F be any field and let p(x) ∈ F[x]. Then an element a lying in some extension field of F is called a root of p(x) if p(a) = 0.
Theorem 14: Remainder Theorem.
If p(x) ∈ F[x] and if K is an extension of F, then for any element c ∈ K, p(x) = (x − c) q(x) + p(c), where q(x) ∈ K[x] and where deg q(x) = deg p(x) − 1.
Proof:
We have F ⊆ K
⇒ F[x] ⊆ K[x]
⇒ p(x) ∈ K[x]   [p(x) ∈ F[x]]
Now the polynomials p(x) and x − c are both in K[x]. Therefore by the division algorithm there exist polynomials q(x) and r(x) in K[x] such that
p(x) = (x − c) q(x) + r(x),
where either r(x) = 0 or deg r(x) is less than the degree of x − c. But the degree of x − c is 1; therefore either r(x) = 0 or deg r(x) = 0. Hence r(x) is a constant polynomial in K[x], i.e. r(x) is simply an element, say r, in K. Thus
p(x) = (x − c) q(x) + r   …(1)
⇒ p(c) = (c − c) q(c) + r   [putting x = c on both sides]
⇒ p(c) = 0 · q(c) + r ⇒ p(c) = r.
Therefore p(x) = (x − c) q(x) + p(c).
Now suppose deg p(x) = n and deg q(x) = m. The degree of the polynomial on the right hand side of (1) is then m + 1. By the definition of equality of two polynomials we must have
n = m + 1 ⇒ m = n − 1 ⇒ deg q(x) = deg p(x) − 1.
Corollary: Factor theorem.
If a ∈ K is a root of p(x) ∈ F[x], where F ⊆ K, then in K[x], (x − a) | p(x).
Proof:
Let p(x) ∈ F[x] and let a ∈ K, where K is an extension field of F. Then by the remainder theorem, in K[x] we have
p(x) = (x − a) q(x) + p(a)
= (x − a) q(x) + 0   [a is a root of p(x) ⇒ p(a) = 0]
= (x − a) q(x).
Therefore in K[x], x − a is a divisor of p(x). Thus (x − a) | p(x) in K[x].
Multiple root. Definition:
Let F be any field and let p(x) ∈ F[x]. If K is any extension of F, then a ∈ K is said to be a root of p(x) of multiplicity m if in K[x], (x − a)^m is a divisor of p(x) whereas (x − a)^(m+1) is not a divisor of p(x).
A root of multiplicity 1 is called a simple root, and a root of multiplicity > 1 is called a multiple root.
Now we are going to face an important problem. Suppose p(x) ∈ F[x] and K is any extension field of F. The problem is: how many roots can p(x) have in K? If a is a root of p(x) in K of multiplicity m, then for this counting purpose we shall count it as m roots and not as one root.
Theorem 15:
A polynomial of degree n over a field can have at most n roots in any extension field.
Proof:
We shall prove the theorem by induction on n, the degree of the polynomial p(x).
To start the induction, let p(x) be a polynomial of degree one over any field F. Let p(x) = a₀x + a₁, where a₀, a₁ ∈ F and a₀ ≠ 0. Let a be a root of p(x) in some extension field of F. Then p(a) = a₀a + a₁ = 0. This gives a = −a₁/a₀, which is a unique element of F. Thus in this case p(x) has the unique root −a₁/a₀, i.e. p(x) has one and exactly one root in any extension field of F. In this way the theorem is true when p(x) is of degree 1.
Now assume as our induction hypothesis that the theorem is true in any field for all polynomials of degree less than n. Let p(x) be a polynomial of degree n over a field F. Let K be any extension field of F. If p(x) has no root in K, the theorem is obviously true, because then the number of roots of p(x) in K is zero, which is definitely at most n. So let us suppose that p(x) has at least one root, say a ∈ K. Let a be a root of multiplicity
 x  a
m
m. Then in k[x], we have is a divisor of
P( x )  deg  x  a   deg P( x )  m  n. Since in K[x] we have  x  a 
m m

is a divisor of P(x) therefore let P( x )   x  a  q( x ) where q( x )  K [ x ] .


m

We have deg q( x )  deg P( x )  deg  x  a   n  m which is definitely


m

less than n because 1  m  n .


Now a is a root of P(x) of multiplicity m. Therefore  x  a 
m 1
is not a
divisor of P( x )   x  a  q( x ) . This implies that (x-a) is not a divisor of
m

q(x). Therefore a is not a root of q(x) [see corollary to theorem 14]. Now let
b  a be a root of P(x) in K. Then on putting x=b in P( x )   x  a  q( x ) ,
m

we get 0  P( b )   b  a  q( b ) .
m

Since k is a field and 0   b  a   k and q( b )  k, therefore we must


m

have q( b )  0  b is a root of q(x) in k. Thus any root of P(x), in k, other


than a must also be a root of q(x) in K.
Now q(x) is of degree n-m which is less q(x) has at most n-m roots in k
and none of these roots is equal to a because a is not a root of q(x). Thus q(x)
has at most n-m roots other than a in k.
 P(x) has at most n-m roots other than a in k.
 P(x) has at most (n-m)+m=n roots in k, the root a of P(x) of
multiplicity m being counted m times
The proof is now complete by induction.
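Example: over a finite field the bound of the theorem can be verified exhaustively. A Python sketch over the field Z/7Z of integers modulo 7 (an illustration only):

```python
def roots_mod_p(coeffs, p):
    """Roots in Z/pZ of a polynomial; coeffs[i] is the coefficient of x^i."""
    return [x for x in range(p)
            if sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p == 0]

# x^3 - x has exactly 3 roots in Z/7Z, namely 0, 1 and 6 (= -1): three <= degree 3.
assert roots_mod_p([0, -1, 0, 1], 7) == [0, 1, 6]
# x^2 + 1 has 2 roots mod 5 but none mod 7; in every case #roots <= degree.
assert roots_mod_p([1, 0, 1], 5) == [2, 3]
assert roots_mod_p([1, 0, 1], 7) == []
```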
Theorem 16:
If p(x) is a polynomial in F[x] of degree n ≥ 1 and is irreducible over F, then there is an extension E of F, such that [E:F] = n, in which p(x) has a root.
Proof:
Let F[x] be the ring of polynomials over F and let V = (p(x)) be the ideal of F[x] generated by p(x). Since p(x) is irreducible over F, V is a maximal ideal of F[x]. Consequently E = F[x]/V is a field. We shall show that the field E satisfies the conclusions of the theorem. First we shall show that E can be regarded as an extension of F, even though E does not contain the elements of F in their original form. For this we shall show that the field F can be imbedded in the field E.
Let σ be the mapping from F into E defined by σ(α) = V + α for all α ∈ F.
σ is one-one. Let α, β ∈ F. We have
σ(α) = σ(β) ⇒ V + α = V + β ⇒ α − β ∈ V
⇒ α − β = f(x) p(x) for some f(x) ∈ F[x]   [note that V is the ideal generated by p(x)]
 f ( x )  0 because if f ( x )  0, then the polynomial f(x)P(x) is of
positive degree and so it cannot be equal to the constant polynomial
      0    
 is one-one
Also
      V        V     V     ( a )  ( B ), and      V    V   
V           
Thus  is an isomorphism from F into E. Let F* be the image of F into
E under the mapping  i.e Let F* is a subfield of E isomorphic of F. If we
identify F and F* i.e. if we identify   F with V    F * then E can be
regarded as an extension of F. Now we claim that E is a finite extension of F.
n 1
For this we shall prove that the n elements V  1,V  x,V  x ,...V  x from
2

a basis of E over F.
[ E : F ]  n
Finally we shall show that p(x) has a root in E. Let
p(x) = a₀ + a₁x + a₂x² + … + aₙxⁿ, where a₀, a₁, …, aₙ ∈ F.
First let us regard p(x) as a polynomial over E with the help of the identification we have made between F and F*. So let us replace a₀ by V + a₀, a₁ by V + a₁, …, aₙ by V + aₙ. Then
p(x) = (V + a₀) + (V + a₁) x + … + (V + aₙ) xⁿ.
We shall show that V + x ∈ E satisfies p(x). We have
p(V + x) = (V + a₀) + (V + a₁)(V + x) + (V + a₂)(V + x)² + … + (V + aₙ)(V + x)ⁿ
= (V + a₀) + (V + a₁)(V + x) + (V + a₂)(V + x²) + … + (V + aₙ)(V + xⁿ)
[note that (V + x)² = (V + x)(V + x) = V + x², and so on]
= (V + a₀) + (V + a₁x) + (V + a₂x²) + … + (V + aₙxⁿ)
[by definition of multiplication of cosets we have (V + f(x))(V + g(x)) = V + f(x) g(x)]
= V + (a₀ + a₁x + a₂x² + … + aₙxⁿ)
[by definition of addition of cosets we have (V + f(x)) + (V + g(x)) = V + (f(x) + g(x))]
= V + p(x)
= V, since p(x) ∈ V
= the zero element of the field E. [Note that the zero element of the field F[x]/V is nothing but the coset V itself.]
Thus V + x satisfies p(x). Therefore V + x is a root of p(x). The proof of the theorem is now complete.
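Example: the construction of the theorem can be carried out explicitly for F = Z/2Z and the irreducible polynomial p(x) = x² + x + 1. The quotient E = F[x]/(p(x)) is then a field with 4 elements, of degree 2 over F, in which the coset V + x is a root of p(x). A Python sketch, representing the coset V + (a + bx) by the pair (a, b) (an illustration only):

```python
# Elements of E = F2[x]/(x^2 + x + 1), cosets V + (a + b*x), stored as (a, b).

def add(u, v):
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd x^2, then reduce using the
    # relation x^2 = x + 1 that holds in the quotient (over F2, -1-x = 1+x).
    a, b = u
    c, d = v
    const, lin, sq = a * c, a * d + b * c, b * d
    return ((const + sq) % 2, (lin + sq) % 2)

X = (0, 1)    # the coset V + x
one = (1, 0)
# p(V + x) = (V + x)^2 + (V + x) + 1 = V, the zero element of E:
assert add(add(mul(X, X), X), one) == (0, 0)
# E is a field with 4 elements: every non-zero element has an inverse.
nonzero = [(1, 0), (0, 1), (1, 1)]
for u in nonzero:
    assert any(mul(u, v) == one for v in nonzero)
```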
Corollary:
If f(x) ∈ F[x], then there is a finite extension E of F in which f(x) has a root. Moreover, [E:F] ≤ deg f(x).
Proof:
Let p(x) be an irreducible factor of f(x). Let f(x) = p(x) q(x). We have
deg p(x) ≤ deg f(x).
Let a be a root of p(x) in some extension field K of F. Then p(a) = 0, and we have f(a) = p(a) q(a) = 0 · q(a) = 0.
Thus any root of p(x) in some extension field of F is also a root of f(x) in that extension field.
Since p(x) is irreducible over F, by the above theorem there is an extension E of F with [E:F] = deg p(x) ≤ deg f(x) in which p(x) has a root.
Hence there is a finite extension E of F with [E:F] ≤ deg f(x) in which f(x) has a root.
Theorem 17:
Let f(x) ∈ F[x] be of degree n ≥ 1. Then there is a finite extension E of F of degree at most n! in which f(x) has n roots (and so, a full complement of roots).
Proof:
We shall prove the theorem by induction on n, the degree of f(x). To start the induction, let f(x) ∈ F[x] be of degree 1. Let f(x) = a₀x + a₁, where a₀, a₁ ∈ F and a₀ ≠ 0. Now F itself is an extension of F and [F:F] = 1. Also −a₁/a₀ ∈ F is a root of a₀x + a₁. Thus if deg f(x) = 1, then there is a finite extension F of F of degree at most 1! = 1 in which f(x) has one root.
Now assume as our induction hypothesis that the theorem is true in any field for all polynomials of degree less than n. Let f(x) be a polynomial of degree n over a field F. By the corollary to theorem 16, there is an extension E₀ of F with [E₀:F] ≤ n in which f(x) has a root, say α. By the factor theorem, in E₀[x], f(x) factors as f(x) = (x − α) q(x), where deg q(x) = deg f(x) − 1 = n − 1. Now q(x) is a polynomial over E₀ of degree n − 1. Since deg q(x) is less than n, by our induction hypothesis there is an extension E of E₀ of degree at most (n − 1)! in which q(x) has n − 1 roots. Since any root of f(x) is either α or a root of q(x), we obtain in E all n roots of f(x).
Now E is an extension of E₀ and E₀ is an extension of F, so E is an extension of F. We have [E:F] = [E:E₀][E₀:F] ≤ (n − 1)! · n = n!.
Thus E is a finite extension of F of degree at most n! in which f(x) has n roots.
The proof of the theorem is now complete by induction.
Splitting field or decomposition field. Definition:
If f(x) ∈ F[x], a finite extension E of F is said to be a splitting field over F for f(x) if over E (that is, in E[x]), but not over any proper subfield of E, f(x) can be factored as a product of linear factors. The field F is called the base field or the initial field.
Theorem 18:
There exists a splitting field for every f(x) ∈ F[x].
Proof:
Let f(x) ∈ F[x] be of degree n. First, by theorem 17, there exists a finite extension E of F of degree at most n! in which f(x) has n roots.
Let f(x) = a₀xⁿ + a₁x^(n−1) + … + aₙ, a₀ ≠ 0.
Let α₁, …, αₙ be the n roots in E of f(x). Then by the factor theorem f(x) can be factored over E as f(x) = a₀(x − α₁)(x − α₂) … (x − αₙ).
In this way f(x) splits up completely over E as a product of first degree factors. Thus we see that there exists a finite extension E of F which decomposes f(x) as a product of linear factors. Consequently a finite extension of F of minimal degree exists which also possesses this property. This minimal extension will be a splitting field for f(x), because no proper subfield of this minimal extension can split f(x) as a product of linear factors.
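Example: for f(x) = x³ − 2 over Q the splitting field is Q(∛2, ω), where ω = e^(2πi/3); its degree over Q is 6 = 3!, the worst case permitted by theorem 17. A numerical Python sketch of the factorisation into linear factors (an illustration only):

```python
import cmath

w = cmath.exp(2j * cmath.pi / 3)   # primitive cube root of unity
r = 2 ** (1 / 3)                   # real cube root of 2
roots = [r, r * w, r * w ** 2]     # the three roots of x^3 - 2

# Over Q(cbrt(2), w) the polynomial splits completely:
# x^3 - 2 = (x - r)(x - r*w)(x - r*w^2).  Check numerically at a few points.
for x in (0.5, -1.7, 3.2):
    product = 1
    for root in roots:
        product *= x - root
    assert abs(product - (x ** 3 - 2)) < 1e-9
```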
Another way of defining the splitting field:
An extension E of a field F is said to be a splitting field of f(x) ∈ F[x] if f(x) is expressible as f(x) = a₀(x − α₁)(x − α₂) … (x − αₙ), where a₀ ∈ F, α₁, …, αₙ ∈ E and E = F(α₁, α₂, …, αₙ).
Uniqueness of the splitting field:
Now we shall show that the splitting field of a polynomial is unique apart from isomorphism. Let E₁ and E₂ be two splitting fields of f(x) ∈ F[x]. Let
f(x) = a₀(x − α₁)(x − α₂) … (x − αₙ) over E₁
and f(x) = a₀(x − β₁)(x − β₂) … (x − βₙ) over E₂.
We shall show that the fields F(α₁, …, αₙ) and F(β₁, …, βₙ) are isomorphic by an isomorphism leaving every element of F fixed. Before proving this main theorem we shall first prove some pre-requisite results.
Continuation of an isomorphic mapping. Definition:
Let F and F′ be two isomorphic fields and let E and E′ be extension fields of F and F′ respectively. An isomorphism σ : E → E′ is called a continuation of the isomorphism τ : F → F′ if σ(α) = τ(α) for all α in F. Let τ be an isomorphism of a field F into a field F′. For the sake of convenience we shall denote the image of any α ∈ F under τ by α′. Thus τ(α) = α′; here α′ is simply another way of writing the image of α under τ.
Theorem 19:
Let τ be an isomorphism of a field F onto a field F′ such that τ(α) = α′ for every α ∈ F. Show that there is an isomorphism τ* of F[x] onto F′[t] with the property that τ*(α) = α′ for every α ∈ F.
Proof:
τ is an isomorphism of a field F onto a field F′. For any α ∈ F we write τ(α) = α′. Let us define a mapping τ* from F[x] into F′[t] as follows:
For an arbitrary polynomial f(x) = α₀ + α₁x + … + αₙxⁿ ∈ F[x] we define τ* by
f(x) τ* = (α₀ + α₁x + … + αₙxⁿ) τ*
= α₀′ + α₁′t + … + αₙ′tⁿ = f′(t).
We shall show that the mapping τ* satisfies the conclusions of the theorem.
τ* is one-one:
Let f(x) = α₀ + α₁x + … + αₙxⁿ, g(x) = β₀ + β₁x + … + β_m x^m be any two elements of F[x]. We have f(x) τ* = g(x) τ*
⇒ α₀′ + α₁′t + … + αₙ′tⁿ = β₀′ + β₁′t + … + β_m′t^m
⇒ n = m and αᵢ′ = βᵢ′ for each i
⇒ n = m and αᵢ = βᵢ for each i   [τ is one-one]
⇒ f(x) = g(x) ⇒ τ* is one-one.
τ* is onto:
Let α₀′ + α₁′t + … + αₙ′tⁿ be any element of F′[t], where α₀′, α₁′, …, αₙ′ ∈ F′.
Since τ is onto F′, there exist α₀, α₁, …, αₙ ∈ F such that τ(α₀) = α₀′, τ(α₁) = α₁′, …, τ(αₙ) = αₙ′.
Now α₀ + α₁x + … + αₙxⁿ ∈ F[x] and we have
(α₀ + α₁x + … + αₙxⁿ) τ* = α₀′ + α₁′t + … + αₙ′tⁿ; therefore the mapping τ* is onto.
τ* preserves additions:
Let f(x) = α₀ + α₁x + … + αₙxⁿ, g(x) = β₀ + β₁x + … + β_m x^m be any two elements of F[x]. Without loss of generality let n ≥ m. First let us take the case when n > m. We have
[f(x) + g(x)] τ* = [(α₀ + β₀) + (α₁ + β₁)x + … + (α_m + β_m)x^m + α_(m+1)x^(m+1) + … + αₙxⁿ] τ*
= (α₀ + β₀)′ + (α₁ + β₁)′t + … + (α_m + β_m)′t^m + α′_(m+1)t^(m+1) + … + αₙ′tⁿ
= (α₀′ + β₀′) + (α₁′ + β₁′)t + … + (α_m′ + β_m′)t^m + α′_(m+1)t^(m+1) + … + αₙ′tⁿ
= (α₀′ + α₁′t + … + αₙ′tⁿ) + (β₀′ + β₁′t + … + β_m′t^m)
= f(x) τ* + g(x) τ*.
Similarly when n = m we can show that [f(x) + g(x)] τ* = f(x) τ* + g(x) τ*.
τ* preserves multiplications: We have
[f(x) g(x)] τ* = [(α₀ + α₁x + … + αₙxⁿ)(β₀ + β₁x + … + β_m x^m)] τ*
= [α₀β₀ + (α₀β₁ + α₁β₀)x + … + αₙβ_m x^(n+m)] τ*
= (α₀β₀)′ + (α₀β₁ + α₁β₀)′t + … + (αₙβ_m)′t^(n+m)
= α₀′β₀′ + (α₀′β₁′ + α₁′β₀′)t + … + αₙ′β_m′t^(n+m)
[τ preserves additions and multiplications, i.e. (α + β)′ = α′ + β′ and (αβ)′ = α′β′]
= (α₀′ + α₁′t + … + αₙ′tⁿ)(β₀′ + β₁′t + … + β_m′t^m)
= (f(x) τ*)(g(x) τ*).
Hence τ* is an isomorphism of F[x] onto
F′[t]. Further, if f(x) ∈ F[x] is simply taken as α, where α ∈ F, then by the definition of τ* we have τ*(α) = α′.
Note:
From the above theorem we conclude that factorizations of f(x) in F[x] result in like factorizations of f(x) τ* = f′(t) in F′[t], and vice versa. In particular, f(x) is irreducible in F[x] if and only if f′(t) is irreducible in F′[t].
Theorem 20:
Let τ be an isomorphism of a field F onto a field F′ such that τ(α) = α′ for every α ∈ F. For an arbitrary polynomial f(x) = α₀ + α₁x + … + αₙxⁿ ∈ F[x] let us define f′(t) = α₀′ + α₁′t + … + αₙ′tⁿ ∈ F′[t]. If f(x) is irreducible in F[x], show that there is an isomorphism τ** of F[x]/(f(x)) onto F′[t]/(f′(t)) with the property that τ**(α) = α′ for every α ∈ F.
Proof:
Let τ* be the function from F[x] into F′[t] defined by the formula f(x) τ* = f′(t) for every f(x) ∈ F[x]. Then by theorem 19 the mapping τ* is an isomorphism of F[x] onto F′[t]. Let f(x) be irreducible in F[x]; then f′(t) is irreducible in F′[t]. Let V = (f(x)) be the ideal of F[x] generated by f(x) and V′ = (f′(t)) be the ideal of F′[t] generated by f′(t). These ideals are maximal because f(x) and f′(t) are irreducible. Therefore F[x]/V and F′[t]/V′ are both fields.
Let us define a mapping τ** from F[x]/V into F′[t]/V′ by the formula
[V + g(x)] τ** = V′ + g(x) τ* = V′ + g′(t) for every g(x) ∈ F[x].
The mapping τ** is well defined:
For this we are to show that if V + g(x) = V + h(x), then [V + g(x)] τ** = [V + h(x)] τ**, where g(x), h(x) ∈ F[x]. We have
V + g(x) = V + h(x)
⇒ g(x) − h(x) ∈ V
⇒ g(x) − h(x) = k(x) f(x) for some k(x) ∈ F[x]
⇒ [g(x) − h(x)] τ* = [k(x) f(x)] τ*
⇒ g(x) τ* − h(x) τ* = (k(x) τ*)(f(x) τ*)   [τ* is an isomorphism]
⇒ g′(t) − h′(t) = k′(t) f′(t)
⇒ g′(t) − h′(t) ∈ V′ ⇒ V′ + g′(t) = V′ + h′(t)
⇒ [V + g(x)] τ** = [V + h(x)] τ**.
Therefore the mapping τ** is well defined.
The mapping τ** is one-one:
Let g(x), h(x) ∈ F[x]. Then [V + g(x)] τ** = [V + h(x)] τ**
⇒ V′ + g′(t) = V′ + h′(t)
⇒ g′(t) − h′(t) ∈ V′
⇒ g′(t) − h′(t) = k′(t) f′(t) for some k′(t) ∈ F′[t]
⇒ g(x) τ* − h(x) τ* = (k(x) τ*)(f(x) τ*)
⇒ [g(x) − h(x)] τ* = [k(x) f(x)] τ*
⇒ g(x) − h(x) = k(x) f(x)
⇒ g(x) − h(x) ∈ V ⇒ V + g(x) = V + h(x)
⇒ τ** is one-one.
The mapping τ** is onto:
Since the mapping τ* is onto, corresponding to any polynomial g′(t) in F′[t] we have a polynomial g(x) in F[x]. Therefore for any V′ + g′(t) ∈ F′[t]/V′ there is V + g(x) ∈ F[x]/V such that [V + g(x)] τ** = V′ + g′(t).
τ** preserves additions and multiplications:
Let g(x), h(x) ∈ F[x]. We have
[(V + g(x)) + (V + h(x))] τ** = [V + (g(x) + h(x))] τ**
= V′ + [g(x) + h(x)] τ*
= V′ + [g(x) τ* + h(x) τ*]
= V′ + [g′(t) + h′(t)]
= [V′ + g′(t)] + [V′ + h′(t)]
= [V + g(x)] τ** + [V + h(x)] τ**.
A similar computation shows that τ** preserves multiplications. Thus τ** is an isomorphism of F[x]/V onto F′[t]/V′.
In theorem 16 we have shown that the field F can be imbedded in the field F[x]/V
by identifying the element α ∈ F with the residue class (coset) V + α in F[x]/V.
Similarly we can consider F′ to be contained in F′[t]/V′. With this identification, for any α ∈ F we have
τ**(α) = [V + α] τ**   [α has been identified with V + α]
= V′ + α τ* = V′ + α′
= α′   [α′ has been identified with V′ + α′].
Theorem 21:
Let τ be an isomorphism of a field F onto a field F′ such that τ(α) = α′ for every α ∈ F. Let f(x) = α₀ + α₁x + … + αₙxⁿ be an irreducible polynomial in F[x] which is mapped onto the polynomial f′(t) = α₀′ + α₁′t + … + αₙ′tⁿ in F′[t]. If v is a root of f(x) in some extension field of F and w is a root of f′(t) in some extension field of F′, then the field F(v) is isomorphic to the field F′(w) by an isomorphism σ such that
(1) σ(v) = w,
(2) σ(α) = α′ for every α ∈ F.
Note:
The condition (2) may also be stated as: the isomorphism σ is a continuation of the isomorphism τ.
Proof:
τ is an isomorphism of the field F onto the field F' such that τ(α) = α' for every α ∈ F. The irreducible polynomial f(x) = α₀ + α₁x + ... + αₙxⁿ in F[x] is mapped onto the irreducible polynomial f'(t) = α'₀ + α'₁t + ... + α'ₙtⁿ in F'[t]. We have deg f(x) = n = deg f'(t).
Let v be a root of f(x) in some extension field of F. Then v is algebraic over F. Since f(x) is irreducible in F[x], f(x) is a minimal polynomial for v over F; consequently, by Theorem 5, we have
F(v) = {λ₀ + λ₁v + ... + λₙ₋₁v^{n-1} : λ₀, λ₁, ..., λₙ₋₁ ∈ F}.
Also the expression λ₀ + λ₁v + ... + λₙ₋₁v^{n-1} for an element of F(v) is unique.
Similarly,
F'(w) = {λ'₀ + λ'₁w + ... + λ'ₙ₋₁w^{n-1} : λ'₀, λ'₁, ..., λ'ₙ₋₁ ∈ F'},
and the expression λ'₀ + λ'₁w + ... + λ'ₙ₋₁w^{n-1} for an element of F'(w) is unique.
Let c be an arbitrary element of F(v). Then c can be written as c = λ₀ + λ₁v + ... + λₙ₋₁v^{n-1}, where λ₀, λ₁, ..., λₙ₋₁ are unique elements of F. Let us define a mapping σ from F(v) into F'(w) by the formula
σ(c) = σ(λ₀ + λ₁v + ... + λₙ₋₁v^{n-1}) = λ'₀ + λ'₁w + ... + λ'ₙ₋₁w^{n-1}.
Since the expression for c in the form λ₀ + λ₁v + ... + λₙ₋₁v^{n-1} is unique, the mapping σ is well defined.
σ is one-one. Let c = λ₀ + λ₁v + ... + λₙ₋₁v^{n-1} and d = r₀ + r₁v + ... + rₙ₋₁v^{n-1} be any two elements of F(v). Then
σ(c) = σ(d)
⇒ λ'₀ + λ'₁w + ... + λ'ₙ₋₁w^{n-1} = r'₀ + r'₁w + ... + r'ₙ₋₁w^{n-1}
⇒ λ'₀ = r'₀, λ'₁ = r'₁, ..., λ'ₙ₋₁ = r'ₙ₋₁
[since the expression for an element of F'(w) in the form λ'₀ + λ'₁w + ... + λ'ₙ₋₁w^{n-1} is unique]
⇒ λ₀ = r₀, λ₁ = r₁, ..., λₙ₋₁ = rₙ₋₁        [τ is one-one]
⇒ c = d.
∴ σ is one-one.
σ is onto. Let λ'₀ + λ'₁w + ... + λ'ₙ₋₁w^{n-1} be any element of F'(w), where λ'₀, λ'₁, ..., λ'ₙ₋₁ ∈ F'. Since τ is onto F', there exist λ₀, λ₁, ..., λₙ₋₁ ∈ F such that τ(λ₀) = λ'₀, τ(λ₁) = λ'₁, ..., τ(λₙ₋₁) = λ'ₙ₋₁. Now λ₀ + λ₁v + ... + λₙ₋₁v^{n-1} ∈ F(v) and we have
σ(λ₀ + λ₁v + ... + λₙ₋₁v^{n-1}) = λ'₀ + λ'₁w + ... + λ'ₙ₋₁w^{n-1}.
∴ σ is onto F'(w).
σ preserves additions. We have
σ(c + d) = σ((λ₀ + r₀) + (λ₁ + r₁)v + ... + (λₙ₋₁ + rₙ₋₁)v^{n-1})
= (λ₀ + r₀)' + (λ₁ + r₁)'w + ... + (λₙ₋₁ + rₙ₋₁)'w^{n-1}
= (λ'₀ + r'₀) + (λ'₁ + r'₁)w + ... + (λ'ₙ₋₁ + r'ₙ₋₁)w^{n-1}        [τ preserves additions]
= (λ'₀ + λ'₁w + ... + λ'ₙ₋₁w^{n-1}) + (r'₀ + r'₁w + ... + r'ₙ₋₁w^{n-1})
= σ(c) + σ(d).
σ preserves multiplications. We have
cd = λ₀r₀ + (λ₀r₁ + λ₁r₀)v + ... + λₙ₋₁rₙ₋₁v^{2n-2},
where the powers vⁿ, ..., v^{2n-2} are to be reduced to the standard form by means of the relation f(v) = 0; correspondingly, the powers wⁿ, ..., w^{2n-2} are reduced by means of f'(w) = 0. Since τ carries the coefficients of f(x) onto those of f'(t) and preserves additions and multiplications, the two reductions match coefficient by coefficient, and we get
σ(cd) = λ'₀r'₀ + (λ'₀r'₁ + λ'₁r'₀)w + ... + λ'ₙ₋₁r'ₙ₋₁w^{2n-2}
= (λ'₀ + λ'₁w + ... + λ'ₙ₋₁w^{n-1})(r'₀ + r'₁w + ... + r'ₙ₋₁w^{n-1})
= σ(c)σ(d).
Hence σ is an isomorphism of F(v) onto F'(w). Now v ∈ F(v) can be uniquely written as
v = 0 + 1·v + 0v² + ... + 0v^{n-1},
so σ(v) = 0' + 1'w + 0'w² + ... + 0'w^{n-1} = w        [τ(1) = 1].
Also, if β ∈ F, then β ∈ F(v) can be uniquely written as β = β + 0v + 0v² + ... + 0v^{n-1}, so
σ(β) = β' + 0'w + 0'w² + ... + 0'w^{n-1} = β'.
This completes the proof of the theorem.
Corollary:
If f(x) ∈ F[x] is irreducible and if a, b are two roots of f(x), then F(a) is isomorphic to F(b) by an isomorphism which leaves every element of F fixed.
Proof:
In the above theorem replace the field F' by F, i.e. take F' = F, and take the isomorphism τ to be the identity map of F, i.e. τ(α) = α for every α ∈ F. Take v = a and w = b. Then F(a) is isomorphic to F(b) by an isomorphism σ such that (1) σ(a) = b, (2) σ(β) = β for every β ∈ F, i.e. σ leaves every element of F fixed.
The reader should now rewrite the proof of the above theorem for this special case, because the corollary is in itself very important.
Theorem 22:
Let τ be an isomorphism of a field F onto a field F' defined by τ(α) = α' for every α ∈ F. Corresponding to a polynomial f(x) = α₀ + α₁x + ... + αₙxⁿ in F[x], let f'(t) = α'₀ + α'₁t + ... + α'ₙtⁿ be the corresponding polynomial in F'[t]. Then the splitting fields E and E' of f(x) ∈ F[x] and f'(t) ∈ F'[t], respectively, are isomorphic by an isomorphism φ with the property that φ(α) = α' for every α ∈ F.
Proof:
We shall prove the theorem by induction on the degree of the splitting field over the initial field. To start the induction, let [E:F] = 1. Then E = F, and f(x) resolves into a product of linear factors over F itself. By Theorem 19, f'(t) must also resolve into a product of linear factors over F' itself. Therefore F' is a splitting field for f'(t), i.e. E' = F'. Hence φ = τ provides us with an isomorphism of E onto E' coinciding with τ on F.
Now assume, as our induction hypothesis, that the theorem is true for any field F₀ and any polynomial g(x) ∈ F₀[x] provided the degree of some splitting field E₀ of g(x) over the initial field F₀ is less than n, i.e. [E₀:F₀] < n.
Let [E:F] = n > 1, where E is a splitting field of f(x) over F. If f(x) resolved into a product of linear factors over F itself, then F would be a splitting field for f(x) and we should have E = F, i.e. [E:F] = 1, contrary to [E:F] = n > 1. Therefore f(x) ∈ F[x] must have an irreducible factor p(x) ∈ F[x] of degree r > 1. Let p'(t) be the corresponding irreducible factor of f'(t).
Now E is a splitting field for f(x) ∈ F[x], so a full complement of roots of f(x) lies in E; hence a full complement of roots of p(x) lies in E [p(x) is a factor of f(x)]. Let v ∈ E be a root of p(x); then [F(v):F] = r = deg p(x) [by Theorem 7]. Similarly there is a w ∈ E' such that p'(w) = 0. By Theorem 21 there is an isomorphism σ of F(v) onto F'(w) such that σ(α) = α' for every α ∈ F.
Now [E:F] = [E:F(v)][F(v):F], so
[E:F(v)] = [E:F]/[F(v):F] = n/r < n        [since r > 1].
Now let F₀ = F(v) and F'₀ = F'(w). Since F₀ ⊃ F, f(x) ∈ F[x] can also be regarded as a polynomial in F₀[x]. We claim that E is a splitting field for f(x) ∈ F₀[x]: obviously f(x) ∈ F₀[x] resolves into a product of linear factors over E, and no proper subfield of E containing F₀, and hence F, can split f(x) into linear factors, because we have assumed that E is a splitting field of f(x) over F. Hence E is a splitting field for f(x) ∈ F₀[x]; similarly, E' is a splitting field for f'(t) ∈ F'₀[t].
Now σ is an isomorphism of F₀ onto F'₀, E is a splitting field for f(x) ∈ F₀[x], and E' is a splitting field for f'(t) ∈ F'₀[t]. Since [E:F₀] = [E:F(v)] < n, by our induction hypothesis there is an isomorphism φ of E onto E' such that φ(a) = σ(a) for all a ∈ F₀.
Now α ∈ F ⇒ α ∈ F₀. Therefore for every α ∈ F we have φ(α) = σ(α) = α'.
The proof is now complete by induction.
Corollary (uniqueness of the splitting field):
Any two splitting fields of the same polynomial over a given field F are isomorphic by an isomorphism leaving every element of F fixed.
Proof:
In the proof of Theorem 22 take F' = F and take τ to be the identity mapping of F, i.e. τ(α) = α for every α ∈ F. Then τ is an isomorphism of F onto F which leaves every element of F fixed. If E₁ and E₂ are two splitting fields for f(x) ∈ F[x], take E = E₁ and E' = E₂ in the above theorem; then E₁ and E₂ are isomorphic by an isomorphism leaving every element of F fixed.
More about roots
Derivative of a polynomial over a field.
Definition:
Let f(x) = a₀ + a₁x + a₂x² + ... + aₙ₋₁x^{n-1} + aₙxⁿ be a polynomial over a field F. Then the derivative of f(x), denoted by f'(x), is defined as the polynomial
f'(x) = a₁ + 2a₂x + ... + (n−1)aₙ₋₁x^{n-2} + naₙx^{n-1} in F[x].
Note:
Here by 2 we mean 1+1; by n we mean 1+1+...+1 (n times). The field F is arbitrary: it may be a field of finite characteristic or a field of characteristic zero. If F is a field of finite characteristic p, then the derivative of the polynomial xᵖ is px^{p-1} = 0·x^{p-1} = 0. Recall that p = 1+1+...+1 (p times) = 0, because the characteristic of the field is p. Thus we see that even the derivative of a non-constant polynomial can be zero if F is a field of finite characteristic.
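This computation is easy to carry out on coefficient lists. Below is a minimal sketch in plain Python (not part of the text): a polynomial is stored as the list of its coefficients, coeffs[i] being the coefficient of x^i, and the formal derivative is taken with the integer factors reduced mod p to model a field of characteristic p.

```python
def formal_derivative(coeffs, p=None):
    """Formal derivative of a0 + a1*x + ... + an*x^n; coeffs[i] is the coefficient of x**i.
    If p is given, coefficients are reduced mod p, modelling characteristic p."""
    deriv = [i * coeffs[i] for i in range(1, len(coeffs))]
    if p is not None:
        deriv = [c % p for c in deriv]
    while deriv and deriv[-1] == 0:   # strip trailing zeros: the zero polynomial becomes []
        deriv.pop()
    return deriv

# Characteristic 0: the derivative of x^5 is 5x^4.
print(formal_derivative([0, 0, 0, 0, 0, 1]))        # [0, 0, 0, 0, 5]

# Characteristic 5: the derivative of x^5 is 5x^4 = 0, a non-constant
# polynomial whose derivative is the zero polynomial.
print(formal_derivative([0, 0, 0, 0, 0, 1], p=5))   # []
```

The second call exhibits exactly the phenomenon of the note above: over GF(5) the non-constant polynomial x⁵ has zero derivative.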
Theorem 23:
Let F be a field and let f(x) ∈ F[x] be such that f'(x) = 0. Then:
1) if characteristic F = 0, then f(x) = α ∈ F, i.e. f(x) is a constant polynomial;
2) if characteristic F = p ≠ 0, then f(x) = g(xᵖ) for some polynomial g(x) ∈ F[x], i.e. f(x) is a polynomial in xᵖ.
Proof:
Let f(x) = a₀ + a₁x + a₂x² + ... + aᵢxⁱ + ... + aₙxⁿ over the field F. Then we have
f'(x) = a₁ + 2a₂x + ... + iaᵢx^{i-1} + ... + naₙx^{n-1}.
It is given that f'(x) = 0, the zero polynomial. Therefore
a₁ + 2a₂x + ... + iaᵢx^{i-1} + ... + naₙx^{n-1} = 0 + 0x + ... + 0x^{n-1}.
By the definition of equality of polynomials, we get
a₁ = 0, 2a₂ = 0, ..., iaᵢ = 0, ..., naₙ = 0.
Case (i): characteristic F = 0.
We have iaᵢ = 0, i.e. aᵢ + aᵢ + ... (i times) = 0. So the order of aᵢ, regarded as an element of the additive group of F, is at most i. Hence aᵢ = 0, because in a field of characteristic 0 each non-zero element is of infinite order and zero is the only element of finite order.
Thus a₁ = 0, a₂ = 0, ..., aᵢ = 0, ..., aₙ = 0, so f(x) = a₀ ∈ F, i.e. f(x) is constant.
Case (ii): characteristic F = p ≠ 0.
In this case each non-zero element of F is of order p and the zero element of F is of order 1. From the theory of groups we know that if a is an element of order p, then ma = 0 (where 0 is the identity of the additive group of F) implies that p is a divisor of m. Now iaᵢ = 0
⇒ either aᵢ = 0 or, if aᵢ ≠ 0, then p is a divisor of i,
⇒ either aᵢ = 0 or, if aᵢ ≠ 0, then i = λp, where λ is some positive integer.
Therefore in this case the term xⁱ occurs in f(x) with non-zero coefficient only if it is of the form (xᵖ)^λ. Thus f(x) will be a polynomial in xᵖ.
Theorem 24:
For any f(x), g(x) ∈ F[x] and any α ∈ F,
(i) (f(x) + g(x))' = f'(x) + g'(x);
(ii) (αf(x))' = αf'(x);
(iii) (f(x)g(x))' = f'(x)g(x) + f(x)g'(x).
Proof:
(i) Let f(x) = a₀ + a₁x + a₂x² + ... + aₙxⁿ and g(x) = b₀ + b₁x + b₂x² + ... + bₘxᵐ. Without loss of generality we can take n ≥ m. We have
f(x) + g(x) = (a₀ + b₀) + (a₁ + b₁)x + (a₂ + b₂)x² + ... + (aₘ + bₘ)xᵐ + (aₘ₊₁ + 0)x^{m+1} + ... + (aₙ + 0)xⁿ,
where the terms aₘ₊₁, ..., aₙ are absent if n = m. Now
(f(x) + g(x))' = (a₁ + b₁) + 2(a₂ + b₂)x + ... + m(aₘ + bₘ)x^{m-1} + (m+1)aₘ₊₁xᵐ + ... + naₙx^{n-1}.
Now if a, b ∈ F and k is any integer, then k(a + b) = ka + kb. Therefore
(f(x) + g(x))' = (a₁ + 2a₂x + ... + naₙx^{n-1}) + (b₁ + 2b₂x + ... + mbₘx^{m-1})
= f'(x) + g'(x).
(ii) Let f(x) = a₀ + a₁x + a₂x² + ... + aₙxⁿ. We have αf(x) = αa₀ + αa₁x + αa₂x² + ... + αaₙxⁿ, so
(αf(x))' = αa₁ + 2(αa₂)x + ... + n(αaₙ)x^{n-1}
= α(a₁ + 2a₂x + ... + naₙx^{n-1}) = αf'(x).
(iii) Since the product of f(x) with g(x) is a linear combination over F of products of powers of x, by virtue of results (i) and (ii) it is sufficient to prove the result (iii) for products of two powers of x. Let f(x) = xʳ, g(x) = xˢ, where r and s are positive integers. We have f(x)g(x) = xʳxˢ = x^{r+s}. Therefore
(f(x)g(x))' = (r + s)x^{r+s-1} = rx^{r+s-1} + sx^{r+s-1}
= (rx^{r-1})xˢ + xʳ(sx^{s-1}) = f'(x)g(x) + f(x)g'(x).

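These three rules can be spot-checked numerically on coefficient lists; the sketch below (plain Python, not part of the text) verifies the sum and product rules for two sample polynomials with integer coefficients.

```python
def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists (coeff of x**i at index i)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def poly_add(f, g):
    n = max(len(f), len(g))
    return [(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0) for i in range(n)]

def deriv(f):
    """Formal derivative: i * f[i] becomes the coefficient of x**(i-1)."""
    return [i * f[i] for i in range(1, len(f))]

f = [1, -4, 0, 2]   # 1 - 4x + 2x^3
g = [3, 1, 5]       # 3 + x + 5x^2

# (f + g)' = f' + g'
assert deriv(poly_add(f, g)) == poly_add(deriv(f), deriv(g))
# (f g)' = f' g + f g'
assert deriv(poly_mul(f, g)) == poly_add(poly_mul(deriv(f), g), poly_mul(f, deriv(g)))
```

The same check works over GF(p) if every coefficient operation is reduced mod p.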
Conditions for multiple roots.
Theorem 25:
The polynomial f(x) ∈ F[x] has a multiple root if and only if f(x) and f'(x) have a nontrivial common factor.
Proof:
Before proving the main theorem let us first prove an important result, namely: if f(x) and g(x) in F[x] have a non-trivial common factor in K[x], where K is some extension of F, then they must have a non-trivial common factor in F[x].
We shall prove this result by contradiction. Suppose f(x) and g(x) have a non-trivial common factor in K[x] but no non-trivial common factor in F[x]. Then f(x) and g(x) are relatively prime as elements of F[x]. Therefore there exist two polynomials a(x) and b(x) in F[x] such that
a(x)f(x) + b(x)g(x) = 1.
Since K ⊃ F, the polynomials a(x), b(x), f(x), g(x) can all be regarded as polynomials in K[x]. Therefore a(x)f(x) + b(x)g(x) = 1 implies that f(x) and g(x) are relatively prime as elements of K[x]: if f(x) and g(x) had a common factor of positive degree, then this factor would divide 1, which is impossible. Thus we get a contradiction. Hence f(x) and g(x) must have a nontrivial common factor in F[x].
Now we come to the proof of the main theorem. By virtue of the result just proved we can assume, without loss of generality, that the roots of f(x) all lie in F (otherwise extend F to K, the splitting field of f(x)).
First suppose that α is a multiple root of f(x), of multiplicity m > 1. Then (x − α)ᵐ is a divisor of f(x), so f(x) = (x − α)ᵐ q(x). We have
f'(x) = (x − α)ᵐ q'(x) + m(x − α)^{m-1} q(x)        [note that the derivative of (x − α)ᵐ is m(x − α)^{m-1}]
= (x − α)^{m-1} r(x), since m > 1.
Thus (x − α) is a common factor of f(x) and f'(x), so f(x) and f'(x) have a non-trivial common factor.
Conversely, suppose that f(x) and f'(x) have a non-trivial common factor; we must prove that f(x) has a multiple root. Suppose, on the contrary, that f(x) has no multiple root. Let the roots of f(x) be α₁, α₂, ..., αₙ, where the αᵢ are all distinct. Without loss of generality we can assume f(x) to be monic. Then
f(x) = (x − α₁)(x − α₂)...(x − αₙ).
Therefore
f'(x) = Σᵢ₌₁ⁿ (x − α₁)...(x − αᵢ)¯...(x − αₙ), where the bar denotes that the term is omitted.
Our claim is that no root of f(x) is a root of f'(x), for
f'(αᵢ) = Πⱼ≠ᵢ (αᵢ − αⱼ) ≠ 0,
since α₁, ..., αₙ are all distinct. Now if f(x) and f'(x) have a non-trivial common factor, then (the roots of f(x) all lying in F) they have a common root. Therefore, since f(x) and f'(x) have no common root, they have no non-trivial common factor. Thus we have proved that if f(x) has no multiple root, then f(x) and f'(x) have no non-trivial common factor. Hence if f(x) and f'(x) have a nontrivial common factor, then f(x) must have a multiple root. This proves the other side of the theorem.
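The criterion is effective: a multiple root shows up as a nontrivial gcd of f(x) and f'(x). The sketch below (plain Python over the rationals, running the Euclidean algorithm on coefficient lists; not part of the text) contrasts a polynomial with a repeated root against one with distinct roots.

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Divide f by g over the rationals; coefficient lists, index i = coeff of x**i."""
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    q = [Fraction(0)] * max(1, len(f) - len(g) + 1)
    while len(f) >= len(g) and any(f):
        shift = len(f) - len(g)
        factor = f[-1] / g[-1]
        q[shift] = factor
        f = [f[i] - factor * g[i - shift] if i >= shift else f[i] for i in range(len(f))]
        while f and f[-1] == 0:
            f.pop()
    return q, f   # quotient, remainder

def poly_gcd(f, g):
    while any(g):
        f, g = g, poly_divmod(f, g)[1]
    lead = f[-1]
    return [c / lead for c in f]   # normalise to a monic gcd

def deriv(f):
    return [i * Fraction(f[i]) for i in range(1, len(f))]

# f(x) = (x-1)^2 (x-2) = x^3 - 4x^2 + 5x - 2 has the multiple root 1 ...
f = [-2, 5, -4, 1]
assert poly_gcd(f, deriv(f)) == [-1, 1]   # gcd is x - 1: a nontrivial common factor

# ... while g(x) = (x-1)(x-2)(x-3) = x^3 - 6x^2 + 11x - 6 has distinct roots.
g = [-6, 11, -6, 1]
assert poly_gcd(g, deriv(g)) == [1]       # gcd is 1: no multiple roots
```

Computing the gcd in F[x] rather than in a splitting field is exactly what the preliminary result of the proof justifies.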
Theorem 26:
If f(x) ∈ F[x] is irreducible, then:
i) if the characteristic of F is 0, f(x) has no multiple roots;
ii) if the characteristic of F is p ≠ 0, f(x) has a multiple root only if it is of the form f(x) = g(xᵖ) for some g(x) ∈ F[x].
Proof:
Suppose f(x) has a multiple root. Then by Theorem 25, f(x) and f'(x) have a non-trivial common factor in F[x]. Since f(x) is irreducible in F[x], its only non-trivial factor in F[x] is f(x) itself. Thus f(x) is a factor of f'(x). Now if f'(x) ≠ 0, then deg f'(x) is less than deg f(x), and so f(x) cannot be a factor of f'(x). Therefore f(x) | f'(x) is possible only if f'(x) = 0. In characteristic 0 this implies that f(x) is a constant, which has no roots; in characteristic p ≠ 0, this forces f(x) = g(xᵖ) for some g(x) ∈ F[x].
Theorem 27:
If F is a field of characteristic p ≠ 0, then the polynomial x^{pⁿ} − x ∈ F[x], for n ≥ 1, has distinct roots.
Proof:
Let f(x) = x^{pⁿ} − x. Then f'(x) = pⁿx^{pⁿ−1} − 1.
Now by p ∈ F we mean 1+1+...+1 (p times). Since F is of characteristic p, the order of 1 as an element of the additive group of F is p. Therefore pⁿx^{pⁿ−1} = 0, and f'(x) = −1.
Now we see that f(x) and f'(x) have no nontrivial common factor. Therefore, by Theorem 25, f(x) has no multiple roots.
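For n = 1 the theorem can be checked directly inside GF(p); the sketch below (plain Python, with p = 7 chosen arbitrarily; not part of the text) confirms that x^p − x has p distinct roots, one for each element of the field, in line with Fermat's little theorem.

```python
# In characteristic p, f(x) = x^p - x has f'(x) = p*x^(p-1) - 1 = -1,
# so gcd(f, f') = 1 and f has no multiple roots. Indeed every element
# of GF(p) is a root, giving p distinct roots for a degree-p polynomial.
p = 7
roots = [a for a in range(p) if (a**p - a) % p == 0]
assert roots == list(range(p))   # all p elements of GF(7) are roots, all distinct
print(roots)
```

For n > 1 the same count holds in GF(pⁿ), but constructing that field is beyond this small sketch.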

UNIT - V
CANONICAL FORMS
A Decomposition of V: Jordan Form
Let V be a finite-dimensional vector space over F and let T be an arbitrary element in A_F(V). Suppose that V₁ is a subspace of V invariant under T. Then T induces a linear transformation T₁ on V₁ defined by uT₁ = uT for every u ∈ V₁. Given any polynomial q(x) ∈ F[x], we claim that the linear transformation induced by q(T) on V₁ is precisely q(T₁); in particular, if q(T) = 0 then q(T₁) = 0. Thus T₁ satisfies any polynomial satisfied by T over F. What can be said in the opposite direction?
Lemma 1:
Suppose that V = V₁ ⊕ V₂, where V₁ and V₂ are subspaces of V invariant under T. Let T₁ and T₂ be the linear transformations induced by T on V₁ and V₂, respectively. If the minimal polynomial of T₁ over F is p₁(x) while that of T₂ is p₂(x), then the minimal polynomial for T over F is the least common multiple of p₁(x) and p₂(x).
Proof:
If p(x) is the minimal polynomial for T over F then, as we have seen above, both p(T₁) and p(T₂) are zero, whence p₁(x) | p(x) and p₂(x) | p(x). But then the least common multiple of p₁(x) and p₂(x) must also divide p(x).
On the other hand, if q(x) is the least common multiple of p₁(x) and p₂(x), consider q(T). For v₁ ∈ V₁, since p₁(x) | q(x), v₁q(T) = v₁q(T₁) = 0; similarly, for v₂ ∈ V₂, v₂q(T) = 0. Given any v ∈ V, v can be written as v = v₁ + v₂, where v₁ ∈ V₁ and v₂ ∈ V₂, in consequence of which
vq(T) = (v₁ + v₂)q(T) = v₁q(T) + v₂q(T) = 0.
Thus q(T) = 0 and T satisfies q(x). Combined with the result of the first paragraph, this yields the lemma.
Corollary 1:
If V = V₁ ⊕ ... ⊕ V_k, where each Vᵢ is invariant under T, and if pᵢ(x) is the minimal polynomial over F of Tᵢ, the linear transformation induced by T on Vᵢ, then the minimal polynomial of T over F is the least common multiple of p₁(x), p₂(x), ..., p_k(x).
We leave the proof of the corollary to the reader.
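The corollary can be checked on a small block-diagonal example. The sketch below (plain Python with hand-rolled matrix helpers; not part of the text) takes a 2×2 nilpotent block, whose minimal polynomial is x², together with a 1×1 block whose minimal polynomial is x − 1, and confirms that neither polynomial alone annihilates the combined matrix while their least common multiple x²(x − 1) does.

```python
def mat_mul(A, B):
    n, m, k = len(A), len(B[0]), len(B)
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)] for i in range(n)]

def mat_poly(coeffs, A):
    """Evaluate c0*I + c1*A + c2*A^2 + ... for a square matrix A."""
    n = len(A)
    P = [[0] * n for _ in range(n)]
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # A^0 = I
    for c in coeffs:
        for i in range(n):
            for j in range(n):
                P[i][j] += c * power[i][j]
        power = mat_mul(power, A)
    return P

def is_zero(M):
    return all(c == 0 for row in M for c in row)

# T = diag(N, 1): N acts on V1 with minimal polynomial x^2,
# and the 1x1 block acts on V2 with minimal polynomial x - 1.
T = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 1]]

# p1(x) = x^2 and p2(x) = x - 1 do not annihilate T individually ...
assert not is_zero(mat_poly([0, 0, 1], T))    # x^2
assert not is_zero(mat_poly([-1, 1], T))      # x - 1
# ... but their least common multiple x^2 (x - 1) = x^3 - x^2 does.
assert is_zero(mat_poly([0, 0, -1, 1], T))
```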

Let T ∈ A_F(V) and suppose that p(x) in F[x] is the minimal polynomial of T over F. By a known lemma we can factor p(x) in F[x] in a unique way as
p(x) = q₁(x)^{l₁} q₂(x)^{l₂} ... q_k(x)^{l_k},
where the qᵢ(x) are distinct irreducible polynomials in F[x] and where l₁, l₂, ..., l_k are positive integers. Our objective is to decompose V as a direct sum of subspaces invariant under T such that on each of these the linear transformation induced by T has, as minimal polynomial, a power of an irreducible polynomial. If k = 1, V itself already does this for us. So, suppose that k > 1.
Let V₁ = {v ∈ V | vq₁(T)^{l₁} = 0}, V₂ = {v ∈ V | vq₂(T)^{l₂} = 0}, ..., V_k = {v ∈ V | vq_k(T)^{l_k} = 0}. It is a triviality that each Vᵢ is a subspace of V. In addition, Vᵢ is invariant under T, for if u ∈ Vᵢ, since T and qᵢ(T) commute,
(uT)qᵢ(T)^{lᵢ} = (uqᵢ(T)^{lᵢ})T = 0T = 0.
By the definition of Vᵢ, this places uT in Vᵢ. Let Tᵢ be the linear transformation induced by T on Vᵢ.
Theorem 1:
For each i = 1, 2, ..., k, Vᵢ ≠ (0) and V = V₁ ⊕ V₂ ⊕ ... ⊕ V_k. The minimal polynomial of Tᵢ is qᵢ(x)^{lᵢ}.
Proof:
If k = 1 then V = V₁ and there is nothing that needs proving. Suppose then that k > 1.
We first want to prove that each Vᵢ ≠ (0). Towards this end, we introduce the k polynomials:
h₁(x) = q₂(x)^{l₂} q₃(x)^{l₃} ... q_k(x)^{l_k},
h₂(x) = q₁(x)^{l₁} q₃(x)^{l₃} ... q_k(x)^{l_k},
...,
hᵢ(x) = Π_{j≠i} qⱼ(x)^{lⱼ},
...,
h_k(x) = q₁(x)^{l₁} q₂(x)^{l₂} ... q_{k-1}(x)^{l_{k-1}}.
Since k > 1, hᵢ(x) ≠ p(x), whence hᵢ(T) ≠ 0. Thus, given i, there is a v ∈ V such that w = vhᵢ(T) ≠ 0. But
wqᵢ(T)^{lᵢ} = v(hᵢ(T)qᵢ(T)^{lᵢ}) = vp(T) = 0.
In consequence, w ≠ 0 is in Vᵢ, and so Vᵢ ≠ (0). Another remark about the hᵢ(x) is in order now: if vⱼ ∈ Vⱼ for j ≠ i, since qⱼ(x)^{lⱼ} | hᵢ(x), vⱼhᵢ(T) = 0.
The polynomials h₁(x), h₂(x), ..., h_k(x) are relatively prime. (Prove!) Hence by Lemma 3.9.4 we can find polynomials a₁(x), ..., a_k(x) in F[x] such that
a₁(x)h₁(x) + ... + a_k(x)h_k(x) = 1.
From this we get a₁(T)h₁(T) + ... + a_k(T)h_k(T) = 1, whence, given v ∈ V,
v = v·1 = v(a₁(T)h₁(T) + ... + a_k(T)h_k(T)) = va₁(T)h₁(T) + ... + va_k(T)h_k(T).
Now, each vaᵢ(T)hᵢ(T) is in Vhᵢ(T), and since we have shown above that Vhᵢ(T) ⊂ Vᵢ, we have now exhibited v as v = v₁ + ... + v_k, where each vᵢ = vaᵢ(T)hᵢ(T) is in Vᵢ. Thus V = V₁ + V₂ + ... + V_k.
We must now verify that this sum is a direct sum. To show this, it is enough to prove that if u₁ + u₂ + ... + u_k = 0 with each uᵢ ∈ Vᵢ, then each uᵢ = 0. So, suppose that u₁ + u₂ + ... + u_k = 0 and that some uᵢ, say u₁, is not 0. Multiply this relation by h₁(T); we obtain
u₁h₁(T) + ... + u_kh₁(T) = 0h₁(T) = 0.
However, uⱼh₁(T) = 0 for j ≠ 1, since uⱼ ∈ Vⱼ; the equation thus reduces to u₁h₁(T) = 0. But u₁q₁(T)^{l₁} = 0, and since h₁(x) and q₁(x)^{l₁} are relatively prime, we are led to u₁ = 0 (Prove!), which is, of course, inconsistent with the assumption that u₁ ≠ 0. So far we have succeeded in proving that V = V₁ ⊕ V₂ ⊕ ... ⊕ V_k.
To complete the proof of the theorem, we must still prove that the minimal polynomial of Tᵢ on Vᵢ is qᵢ(x)^{lᵢ}. By the definition of Vᵢ, since Vᵢqᵢ(T)^{lᵢ} = 0, we have qᵢ(Tᵢ)^{lᵢ} = 0, whence the minimal polynomial of Tᵢ must be a divisor of qᵢ(x)^{lᵢ}, thus of the form qᵢ(x)^{fᵢ} with fᵢ ≤ lᵢ. By the corollary to the known lemma, the minimal polynomial of T over F is the least common multiple of q₁(x)^{f₁}, ..., q_k(x)^{f_k} and so must be q₁(x)^{f₁} ... q_k(x)^{f_k}. Since this minimal polynomial is in fact q₁(x)^{l₁} ... q_k(x)^{l_k}, we must have f₁ ≥ l₁, f₂ ≥ l₂, ..., f_k ≥ l_k. Combined with the opposite inequality above, this yields the desired result lᵢ = fᵢ for i = 1, 2, ..., k, and so completes the proof of the theorem.
If all the characteristic roots of T should happen to lie in F, then the minimal polynomial of T takes on the especially nice form
p(x) = (x − λ₁)^{l₁} ... (x − λ_k)^{l_k},
where λ₁, ..., λ_k are the distinct characteristic roots of T. The irreducible factors qᵢ(x) above are merely qᵢ(x) = x − λᵢ. Note that on Vᵢ, Tᵢ has only λᵢ as a characteristic root.

Corollary 2:
If all the distinct characteristic roots λ₁, ..., λ_k of T lie in F, then V can be written as V = V₁ ⊕ ... ⊕ V_k, where Vᵢ = {v ∈ V | v(T − λᵢ)^{lᵢ} = 0} and where Tᵢ has only one characteristic root, λᵢ, on Vᵢ.


Let us go back to the theorem for a moment; we use the same notation Tᵢ, Vᵢ as in the theorem. Since V = V₁ ⊕ ... ⊕ V_k, if dim Vᵢ = nᵢ, by Lemma 6.5.1 we can find a basis of V such that in this basis the matrix of T is of the form

    [ A₁               ]
    [      A₂          ]
    [           ⋱      ]
    [              A_k ]

where each Aᵢ is an nᵢ × nᵢ matrix and is in fact the matrix of Tᵢ.
What exactly are we looking for? We want an element in the similarity class of T which we can distinguish in some way. In light of the known theorem this can be rephrased as follows: we seek a basis of V in which the matrix of T has an especially simple (and recognizable) form.
By the discussion above, this search can be limited to the linear transformations Tᵢ; thus the general problem can be reduced from the discussion of general linear transformations to that of the special linear transformations whose minimal polynomials are powers of irreducible polynomials. For the special situation in which all the characteristic roots of T lie in F we do it below. The general case, in which we put no restrictions on the characteristic roots of T, will be done in the next section.
We are now in the happy position where all the pieces have been constructed and all we have to do is to put them together. This results in the highly important and useful theorem in which is exhibited what is usually called the Jordan canonical form. But first a definition.
Definition: The matrix

    [ λ   1            0 ]
    [     λ   1          ]
    [          ⋱       1 ]
    [ 0                λ ]

with λ's on the diagonal, 1's on the superdiagonal, and 0's elsewhere, is a basic Jordan block belonging to λ.
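Concretely, an m × m basic Jordan block J belonging to λ satisfies (J − λI)^m = 0 while (J − λI)^{m−1} ≠ 0, so its minimal polynomial is (x − λ)^m. A minimal sketch (plain Python, not part of the text):

```python
def jordan_block(lam, m):
    """m x m basic Jordan block belonging to lam: lam on the diagonal, 1 on the superdiagonal."""
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(m)] for i in range(m)]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(n)) for j in range(n)] for i in range(n)]

m, lam = 4, 5
J = jordan_block(lam, m)
N = [[J[i][j] - (lam if i == j else 0) for j in range(m)] for i in range(m)]  # N = J - lam*I

# N is nilpotent of index m: N^(m-1) != 0 but N^m = 0,
# so the minimal polynomial of J is (x - lam)^m.
P = N
for _ in range(m - 2):
    P = mat_mul(P, N)
assert any(c != 0 for row in P for c in row)   # N^(m-1) != 0
P = mat_mul(P, N)
assert all(c == 0 for row in P for c in row)   # N^m == 0
```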

Theorem 2:
Let T ∈ A_F(V) have all its distinct characteristic roots, λ₁, ..., λ_k, in F. Then a basis of V can be found in which the matrix of T is of the form

    [ J₁               ]
    [      J₂          ]
    [           ⋱      ]
    [              J_k ]

where each

    Jᵢ = [ B_{i1}                    ]
         [        B_{i2}             ]
         [                ⋱          ]
         [                  B_{irᵢ}  ]

and where B_{i1}, ..., B_{irᵢ} are basic Jordan blocks belonging to λᵢ.
Proof:
Before starting, note that an m × m basic Jordan block belonging to λ is merely λ + Mₘ, where Mₘ is as defined at the end of the known lemma. By the combination of the known lemma and the corollary to the known theorem, we can reduce to the case when T has only one characteristic root λ, that is, T − λ is nilpotent. Thus T = λ + (T − λ), and since T − λ is nilpotent, by the known theorem there is a basis in which its matrix is of the form

    [ M_{n₁}                ]
    [            ⋱          ]
    [               M_{n_r} ]

But then the matrix of T is of the form

    [ λ + M_{n₁}                    ]     [ B_{n₁}                ]
    [              ⋱                ]  =  [            ⋱          ]
    [                 λ + M_{n_r}   ]     [               B_{n_r} ]

using the remark made first in this proof about the relation of a basic Jordan block and the Mₘ's. This completes the theorem.
Using the known theorem we could arrange things so that in each Jᵢ the size of B_{i1} ≥ size of B_{i2} ≥ .... When this has been done, the matrix

    [ J₁               ]
    [           ⋱      ]
    [              J_k ]

is called the Jordan form of T. Note that the known theorem, for nilpotent matrices, reduces to the earlier known theorem.
We leave as an exercise the following: two linear transformations in A_F(V) which have all their characteristic roots in F are similar if and only if they can be brought to the same Jordan form.
Thus the Jordan form acts as a "determiner" for similarity classes of this type of linear transformation.
In matrix terms the known theorem can be stated as follows: let A ∈ Fₙ and suppose that K is the splitting field of the minimal polynomial of A over F; then an invertible matrix C ∈ Kₙ can be found so that CAC⁻¹ is in Jordan form. We leave the few small points needed to make the transition from the known theorem to its matrix form, just given, to the reader.
One final remark: if A ∈ Fₙ and if, in Kₙ, where K is the splitting field of the minimal polynomial of A over F,

    CAC⁻¹ = [ J₁               ]
            [      J₂          ]
            [           ⋱      ]
            [              J_k ]

where each Jᵢ corresponds to a different characteristic root, λᵢ, of A, then the multiplicity of λᵢ as a characteristic root of A is defined to be nᵢ, where Jᵢ is an nᵢ × nᵢ matrix. Note that the sum of the multiplicities is exactly n.
Clearly we can similarly define the multiplicity of a characteristic root of
a linear transformation.
Canonical Forms: Rational Canonical Form
The Jordan form is the one most generally used to prove theorems about linear transformations and matrices. Unfortunately, it has one distinct, serious drawback in that it puts requirements on the location of the characteristic roots. True, if T ∈ A_F(V) (or A ∈ Fₙ) does not have its characteristic roots in F, we need but go to a finite extension, K, of F in which all the characteristic roots of T lie and then bring T to Jordan form over K. In fact, this is a standard operating procedure; however, it proves the result in Kₙ and not in Fₙ. Very often the result in Fₙ can be inferred from that in Kₙ, but there are many occasions when, after a result has been established for A ∈ Fₙ, considered as an element in Kₙ, we cannot go back from Kₙ to get the desired information in Fₙ.
Thus we need some canonical form for elements in A_F(V) (or in Fₙ) which presumes nothing about the location of the characteristic roots of its elements, a canonical form and a set of invariants created in A_F(V) itself using only its elements and operations. Such a canonical form is provided by the rational canonical form, which is described below in the known theorem.
Let T ∈ A_F(V); by means of T we propose to make V into a module over F[x], the ring of polynomials in x over F. We do so by defining, for any polynomial f(x) in F[x] and any v ∈ V, f(x)v = vf(T). We leave to the reader the verification that, under this definition of multiplication of elements of V by elements of F[x], V becomes an F[x]-module.
Since V is finite-dimensional over F, it is finitely generated over F, hence all the more so over F[x], which contains F. Moreover, F[x] is a Euclidean ring; thus, as a finitely generated module over F[x], by the known theorem, V is the direct sum of a finite number of cyclic submodules. From the very way in which we have introduced the module structure on V, each of these cyclic submodules is invariant under T; moreover, in such a submodule M there is an element m₀ such that every element m in M is of the form m = m₀f(T) for some f(x) ∈ F[x].
To determine the nature of T on V it will therefore be enough for us to know what T looks like on a cyclic submodule. This is precisely what we intend, shortly, to determine.
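The module action can be made concrete: fixing a matrix T and using row vectors (matching the text's vT convention), the polynomial f(x) acts on v as vf(T). A minimal sketch in plain Python (not part of the text):

```python
def vec_mat(v, A):
    """Row vector times square matrix."""
    n = len(A)
    return [sum(v[t] * A[t][j] for t in range(n)) for j in range(n)]

def module_action(coeffs, v, T):
    """f(x) acting on v as v*f(T), with f(x) = coeffs[0] + coeffs[1]*x + ..."""
    result = [0] * len(v)
    power = v[:]                      # v * T^0
    for c in coeffs:
        result = [r + c * p for r, p in zip(result, power)]
        power = vec_mat(power, T)     # advance to v * T^(k+1)
    return result

T = [[0, 1],
     [0, 0]]                          # nilpotent: T^2 = 0
v = [1, 0]

# (x + 3) acting on v is v*(T + 3I) = [3, 1]
assert module_action([3, 1], v, T) == [3, 1]
# x^2 acting on any vector gives v*T^2 = 0
assert module_action([0, 0, 1], v, T) == [0, 0]
```

With this action, the module axioms for V over F[x] reduce to ordinary matrix algebra identities.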
But first we carry out a preliminary decomposition of V, as we did in the known theorem, according to the decomposition of the minimal polynomial of T as a product of irreducible polynomials.
Let the minimal polynomial of T over F be p(x) = q₁(x)^{e₁} ... q_k(x)^{e_k}, where the qᵢ(x) are distinct irreducible polynomials in F[x] and where each eᵢ > 0; then, as we saw earlier in the known theorem, V = V₁ ⊕ V₂ ⊕ ... ⊕ V_k, where each Vᵢ is invariant under T and where the minimal polynomial of T on Vᵢ is qᵢ(x)^{eᵢ}. To settle the nature of a cyclic submodule for an arbitrary T we see, from this discussion, that it suffices to settle it for a T whose minimal polynomial is a power of an irreducible one.
We prove the
Lemma 2:
Suppose that T, in A_F(V), has as minimal polynomial over F the polynomial p(x) = γ₀ + γ₁x + ... + γ_{r-1}x^{r-1} + xʳ. Suppose, further, that V, as a module (as described above), is a cyclic module (that is, is cyclic relative to T). Then there is a basis of V over F such that, in this basis, the matrix of T is

    [  0     1     0    ...    0       ]
    [  0     0     1    ...    0       ]
    [                   ⋱              ]
    [  0     0     0    ...    1       ]
    [ −γ₀   −γ₁   −γ₂   ...  −γ_{r-1}  ]
Proof:
Since V is cyclic relative to T, there exists a vector v in V such that every element u in V is of the form u = vf(T) for some f(x) in F[x].
Now if for some polynomial s(x) in F[x] we have vs(T) = 0, then for any u in V,
us(T) = (vf(T))s(T) = vs(T)f(T) = 0;
thus s(T) annihilates all of V and so s(T) = 0. But then p(x) | s(x), since p(x) is the minimal polynomial of T. This remark implies that v, vT, vT², ..., vT^{r-1} are linearly independent over F; for if not, then α₀v + α₁vT + ... + α_{r-1}vT^{r-1} = 0 with α₀, ..., α_{r-1} in F. But then
v(α₀ + α₁T + ... + α_{r-1}T^{r-1}) = 0,
hence by the above discussion p(x) | (α₀ + α₁x + ... + α_{r-1}x^{r-1}), which is impossible, since p(x) is of degree r, unless
α₀ = α₁ = ... = α_{r-1} = 0.
Since Tʳ = −γ₀ − γ₁T − ... − γ_{r-1}T^{r-1}, we immediately have that T^{r+k}, for k ≥ 0, is a linear combination of 1, T, ..., T^{r-1}; and so f(T), for any f(x) ∈ F[x], is a linear combination of 1, T, ..., T^{r-1} over F. Since any u in V is of the form u = vf(T), we get that u is a linear combination of v, vT, ..., vT^{r-1}.
We have proved, in the above two paragraphs, that the elements v, vT, ..., vT^{r-1} form a basis of V over F. In this basis, as is immediately verified, the matrix of T is exactly as claimed.
Definition:
If f  x    0  1x  ......   r 1x r 1  x r is in F[x], then the r × r matrix.

174
 0 1 0 0 
 
 0 0 1 0 
 
 
 0 0 0 1 
  1 .  r 1 
 0
is called the companion matrix of f(x). We write it as G(f(x))
Note that the known lemma says that if V is cyclic relative to T and if the minimal polynomial of T in F[x] is p(x), then for some basis of V the matrix of T is C(p(x)).
Note further that the matrix C(f(x)), for any monic f(x) in F[x], satisfies f(x) and has f(x) as its minimal polynomial.
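As a numerical sanity check (not part of the text), the following sketch builds the companion matrix of the monic cubic f(x) = −6 + 11x − 6x² + x³ and verifies that f(C(f(x))) = 0; the helper names are ours.

```python
# Verify numerically that the companion matrix C(f) of a monic polynomial
# f(x) = g0 + g1*x + ... + g_{r-1}*x^{r-1} + x^r satisfies f(C) = 0.
from fractions import Fraction

def companion(coeffs):
    """Companion matrix of the monic polynomial with low-order coefficients
    `coeffs`, so f(x) = coeffs[0] + coeffs[1]*x + ... + x^r."""
    r = len(coeffs)
    C = [[Fraction(0)] * r for _ in range(r)]
    for i in range(r - 1):
        C[i][i + 1] = Fraction(1)            # superdiagonal of 1's
    for j in range(r):
        C[r - 1][j] = -Fraction(coeffs[j])   # last row: -g0, -g1, ..., -g_{r-1}
    return C

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(coeffs, C):
    """Evaluate f(C) = g0*I + g1*C + ... + g_{r-1}*C^{r-1} + C^r."""
    n = len(C)
    I = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    result = [[Fraction(0)] * n for _ in range(n)]
    P = I                                    # current power C^k
    for g in coeffs:
        result = [[result[i][j] + g * P[i][j] for j in range(n)] for i in range(n)]
        P = matmul(P, C)
    # add the leading (monic) term C^r, which P now holds
    return [[result[i][j] + P[i][j] for j in range(n)] for i in range(n)]

# f(x) = -6 + 11x - 6x^2 + x^3 = (x - 1)(x - 2)(x - 3)
coeffs = [Fraction(-6), Fraction(11), Fraction(-6)]
C = companion(coeffs)
Z = poly_of_matrix(coeffs, C)
assert all(Z[i][j] == 0 for i in range(3) for j in range(3))  # f(C) = 0
```

The exact `Fraction` arithmetic makes the check f(C) = 0 hold with equality rather than up to rounding error.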
We now prove the very important
Theorem 3:
If T in A F  V  has a minimal polynomial p(x) = q(x)e, where q(x) is a
monic, irreducible polynomial in F[x], then a basis of V over F can be found in
which the matrix of T is of the form

 
 C q  x e1  


 
C qx 2
e
 

 
 


 
C qx r
e
 


where e  e1  e2  .....  e r
Proof:
Since V, as a module over F[x], is finitely generated, and since F[x] is Euclidean, we can decompose V as V = V1 ⊕ ··· ⊕ Vr, where the Vi are cyclic modules. The Vi are thus invariant under T; if Ti is the linear transformation induced by T on Vi, its minimal polynomial must be a divisor of p(x) = q(x)^e, so it is of the form q(x)^ei. We can renumber the spaces so that e1 ≥ e2 ≥ ··· ≥ er.
Now q(T)^e1 annihilates each Vi, hence annihilates V, whence q(T)^e1 = 0. Thus e1 ≥ e; since e1 is clearly at most e, we get e1 = e.
By the known lemma, since each Vi is cyclic relative to T, we can find a basis of Vi such that the matrix of the linear transformation Ti on Vi is C(q(x)^ei). Thus, by the known theorem, a basis of V can be found such that the matrix of T in this basis is

    ( C(q(x)^e1)                                )
    (             C(q(x)^e2)                    )
    (                          ...              )
    (                               C(q(x)^er)  )
Corollary 3:
If T in A F  V  has minimal polynomial p  x   q1  x  1 ............q k  x  k
l l

over F, where q1  x  ,........, q k  x  are irreducible distinct polynomials in F[x],


then a basis of V can be found in which the matrix of T is of the form
 R1 
 
 R2 
 
 
 Rk 
where each

 
 C q  x ei1
i  

Ri   
 

 
C q i  x  iri
e
 

where ei  ei1  ei2  .......  eiri .
Proof:
By the known theorem, V can be decomposed into the direct sum V = V1 ⊕ ··· ⊕ Vk, where each Vi is invariant under T and where Ti, the linear transformation induced by T on Vi, has minimal polynomial qi(x)^ei. Applying the theorem just proved to each Ti, we obtain the corollary. If the degree of qi(x) is di, note that the sum of all the di·eij is n, the dimension of V over F.
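To illustrate the dimension count, here is a small sketch with hypothetical elementary divisors (the data is invented for illustration): each block C(qi(x)^e_ij) is a square matrix of size di·e_ij, and the block sizes must sum to n = dim V.

```python
# Hypothetical elementary divisors: q1(x)^2 and q1(x)^1 with deg q1 = 1,
# and q2(x)^1 with deg q2 = 2.  Each companion block C(qi(x)^e_ij) has
# size d_i * e_ij, and these sizes add up to n = dim V.
elementary_divisors = [
    (1, [2, 1]),  # (deg q1, exponents e_11 >= e_12)
    (2, [1]),     # (deg q2, exponent e_21)
]
block_sizes = [d * e for d, exps in elementary_divisors for e in exps]
n = sum(block_sizes)
assert block_sizes == [2, 1, 2]
assert n == 5  # T acts on a 5-dimensional space over F
```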
Definition:
The matrix of T in the statement of the above corollary is called the rational canonical form of T.
Definition:
The polynomials q1(x)^e11, q1(x)^e12, ..., q1(x)^e1r1, ..., qk(x)^ek1, ..., qk(x)^ekrk in F[x] are called the elementary divisors of T.
Definition:
If dim_F(V) = n, then the characteristic polynomial of T, p_T(x), is the product of its elementary divisors.
We shall be able to identify the characteristic polynomial just defined with another polynomial which we shall explicitly construct. The characteristic polynomial of T is a polynomial of degree n lying in F[x]. It has many important properties, one of which is contained in the
Remark:
Every linear transformation T ∈ A_F(V) satisfies its characteristic polynomial. Every characteristic root of T is a root of p_T(x).
Note 1:
The first sentence of this remark is the statement of a very famous theorem, the Cayley-Hamilton theorem. However, to call it that in the form we have given is a little unfair. The meat of the Cayley-Hamilton theorem is the fact that T satisfies p_T(x) when p_T(x) is given in a very specific, concrete form, easily constructible from T. However, even as it stands the remark does have some meat in it, for since the characteristic polynomial is a polynomial of degree n, we have shown that every element in A_F(V) does satisfy a polynomial of degree n lying in F[x]. Until now, we had only proved this (in Theorem 6.4.2) for linear transformations having all their characteristic roots in F.
Note 2:
As stated, the second sentence really says nothing, for whenever T satisfies a polynomial, every characteristic root of T satisfies this same polynomial; thus p_T(x) would be nothing special if what is stated in the remark were all that held true for it. However, the actual story is the following: every characteristic root of T is a root of p_T(x), and conversely, every root of p_T(x) is a characteristic root of T; moreover, the multiplicity of any root of p_T(x), as a root of the polynomial, equals its multiplicity as a characteristic root of T. We could prove this now, but we defer the proof until later, when we shall be able to do it in a more natural fashion.
Proof of the Remark:
We only have to show that T satisfies p_T(x), but this becomes almost trivial. Since p_T(x) is the product of q1(x)^e11, q1(x)^e12, ..., q1(x)^e1r1, ..., qk(x)^ek1, ..., qk(x)^ekrk, and since e11 = e1, e21 = e2, ..., ek1 = ek, p_T(x) is divisible by p(x) = q1(x)^e1 ··· qk(x)^ek, the minimal polynomial of T. Since p(T) = 0 it follows that p_T(T) = 0.
We have called the set of polynomials arising in the rational canonical form of T the elementary divisors of T. It would be highly desirable if these determined the similarity class of T in A_F(V), for then the similarity classes in A_F(V) would be in one-to-one correspondence with sets of polynomials in F[x]. We propose to do this, but first we establish a result which implies that similar linear transformations have the same elementary divisors.
Theorem 4:
Let V and W be two vector spaces over F and suppose that ψ is a vector space isomorphism of V onto W. Suppose that S ∈ A_F(V) and T ∈ A_F(W) are such that, for any v ∈ V, (vS)ψ = (vψ)T. Then S and T have the same elementary divisors.
Proof:
We begin with a simple computation. If v ∈ V, then (vS^2)ψ = ((vS)S)ψ = ((vS)ψ)T = ((vψ)T)T = (vψ)T^2. Clearly, if we continue in this pattern we get (vS^m)ψ = (vψ)T^m for any integer m ≥ 0, whence, for any polynomial f(x) ∈ F[x] and for any v ∈ V, (v f(S))ψ = (vψ) f(T).
If f(S) = 0 then (vψ) f(T) = 0 for any v ∈ V, and since ψ maps V onto W, we would have W f(T) = (0), in consequence of which f(T) = 0. Conversely, if g(x) ∈ F[x] is such that g(T) = 0, then for any v ∈ V, (v g(S))ψ = 0, and since ψ is an isomorphism, this results in v g(S) = 0 for all v ∈ V. This, of course, implies that g(S) = 0. Thus S and T satisfy the same set of polynomials in F[x], hence must have the same minimal polynomial
p(x) = q1(x)^e1 q2(x)^e2 ··· qk(x)^ek,
where q1(x), ..., qk(x) are distinct irreducible polynomials in F[x].
If U is a subspace of V invariant under S, then Uψ is a subspace of W invariant under T, for (Uψ)T = (US)ψ ⊂ Uψ. Since U and Uψ are isomorphic, the minimal polynomial of S1, the linear transformation induced by S on U, is the same, by the remarks above, as the minimal polynomial of T1, the linear transformation induced on Uψ by T.
Now, since the minimal polynomial of S on V is p(x) = q1(x)^e1 q2(x)^e2 ··· qk(x)^ek, as we have seen in the known theorem and its corollary, we can take as the first elementary divisor of S the polynomial q1(x)^e1, and we can find a subspace V1 of V which is invariant under S such that:
1. V = V1 ⊕ M, where M is invariant under S.
2. The only elementary divisor of S1, the linear transformation induced on V1 by S, is q1(x)^e1.
3. The other elementary divisors of S are those of the linear transformation S2 induced by S on M.
We now combine the remarks made above and assert:
1. W = W1 ⊕ N, where W1 = V1ψ and N = Mψ are invariant under T.
2. The only elementary divisor of T1, the linear transformation induced by T on W1, is q1(x)^e1 (which is an elementary divisor of T, since the minimal polynomial of T is p(x) = q1(x)^e1 ··· qk(x)^ek).
3. The other elementary divisors of T are those of the linear transformation T2 induced by T on N.
Since N = Mψ, M and N are isomorphic vector spaces over F under the isomorphism ψ2 induced by ψ. Moreover, if u ∈ M then (uS2)ψ2 = (uS)ψ = (uψ)T = (uψ2)T2; hence S2 and T2 are in the same relation vis-à-vis ψ2 as S and T were vis-à-vis ψ. By induction on dimension (or by repeating the argument), S2 and T2 have the same elementary divisors. But since the elementary divisors of S are merely q1(x)^e1 together with those of S2, while those of T are merely q1(x)^e1 together with those of T2, S and T must have the same elementary divisors, thereby proving the theorem.


The known theorem and its corollary gave us the rational canonical form and gave rise to the elementary divisors. We should like to push this further and to be able to assert some uniqueness property. This we do in
Theorem 5:
The elements S and T in A_F(V) are similar in A_F(V) if and only if they have the same elementary divisors.
Proof:
In one direction this is easy, for suppose that S and T have the same elementary divisors. Then there are two bases of V over F such that the matrix of S in the first basis equals the matrix of T in the second (and each equals the matrix of the rational canonical form). But as we have seen several times earlier, this implies that S and T are similar.
We now wish to go in the other direction. Here, too, the argument resembles closely that used in the proof of the known theorem. Having been careful with details there, we can afford to be a little sketchier here.
We first remark that, in view of the known theorem, we may reduce from the general case to that of a linear transformation whose minimal polynomial is a power of an irreducible polynomial. Thus, without loss of generality, we may suppose that the minimal polynomial of T is q(x)^e, where q(x) is irreducible in F[x] of degree d.
The rational canonical form tells us that we can decompose V as V = V1 ⊕ ··· ⊕ Vr, where the subspaces Vi are invariant under T and where the linear transformation induced by T on Vi has as matrix C(q(x)^ei), the companion matrix of q(x)^ei, with e = e1 ≥ e2 ≥ ··· ≥ er. What we are really trying to prove is the following: if V = U1 ⊕ U2 ⊕ ··· ⊕ Us, where the Uj are invariant under T and where the linear transformation induced by T on Uj has as matrix C(q(x)^fj), with f1 ≥ f2 ≥ ··· ≥ fs, then r = s and e1 = f1, e2 = f2, ..., er = fr. (Prove that proving this is equivalent to proving the theorem!)


Suppose then that we do have the two decompositions described above, V = V1 ⊕ ··· ⊕ Vr and V = U1 ⊕ ··· ⊕ Us, and that some ei ≠ fi. Then there is a first integer m such that em ≠ fm, while e1 = f1, ..., e_(m−1) = f_(m−1). We may suppose that em > fm.
Now q(T)^fm annihilates Um, U_(m+1), ..., Us, whence
V q(T)^fm = U1 q(T)^fm ⊕ ··· ⊕ U_(m−1) q(T)^fm.
However, it can be shown that the dimension of Ui q(T)^fm, for i ≤ m, is d(fi − fm) (Prove!), whence
dim(V q(T)^fm) = d(f1 − fm) + ··· + d(f_(m−1) − fm).
On the other hand, V q(T)^fm ⊃ V1 q(T)^fm ⊕ ··· ⊕ Vm q(T)^fm, and since Vi q(T)^fm has dimension d(ei − fm) for i ≤ m, we obtain that
dim(V q(T)^fm) ≥ d(e1 − fm) + ··· + d(em − fm).
Since e1 = f1, ..., e_(m−1) = f_(m−1) and em > fm, this contradicts the equality proved above. We have thus proved the theorem.
Corollary 4:
Suppose the two matrices A, B in Fn are similar in Kn, where K is an extension of F. Then A and B are already similar in Fn.
Proof:
Suppose that A, B ∈ Fn are such that B = C⁻¹AC with C ∈ Kn. We consider Kn as acting on K^(n), the vector space of n-tuples over K. Thus F^(n) is contained in K^(n), and although it is a vector space over F it is not a vector space over K. The image of F^(n) in K^(n) under C need not fall back in F^(n), but at any rate F^(n)C is a subset of K^(n) which is a vector space over F. (Prove!) Let V be the vector space F^(n) over F, W the vector space F^(n)C over F, and for v ∈ V let vψ = vC. Now A ∈ A_F(V) and B ∈ A_F(W), and for any v ∈ V, (vA)ψ = vAC = vCB = (vψ)B, whence the conditions of Theorem 6.7.2 are satisfied. Thus A and B have the same elementary divisors; by Theorem 6.7.3, A and B must be similar in Fn.
A word of caution: the corollary does not state that if A, B ∈ Fn are such that B = C⁻¹AC with C ∈ Kn, then C must of necessity be in Fn; this is false. It merely states that if A, B ∈ Fn are such that B = C⁻¹AC with C ∈ Kn, then there exists a (possibly different) D ∈ Fn such that B = D⁻¹AD.
Hermitian, Unitary and Normal Transformations
In our previous considerations about linear transformations, the specific
nature of the field F has played a relatively insignificant role. When it did
make itself felt it was usually in regard to the presence or absence of
characteristic roots. Now, for the first time, we shall restrict the field F – generally it will be the field of complex numbers, but at times it may be the field of real numbers – and we shall make heavy use of the properties of real and complex numbers. Unless explicitly stated otherwise, in all of this section F will denote the field of complex numbers.
We shall also be making extensive and constant use of the notions and results about inner product spaces. The reader would be well advised to review and to digest thoroughly the material before proceeding.
One further remark about the complex numbers: Until now we have
managed to avoid using results that were not proved in the book. Now,
however, we are forced to deviate from this policy and to call on a basic fact
about the field of complex numbers, often known as “the fundamental theorem
of algebra”, without establishing it ourselves. It displeases us to pull such a
basic result out of the air, to state it as a fact, and then to make use of it.
Unfortunately, it is essential for what follows, and to digress to prove it here would take us too far afield. We hope that the majority of readers will have seen it proved in a course on complex variable theory.
Fact 1.
A polynomial with coefficients which are complex numbers has all its
roots in the complex field.
Equivalently, Fact 1 can be stated in the form that the only nonconstant irreducible polynomials over the field of complex numbers are those of degree 1.
Fact 2.
The only irreducible, nonconstant, polynomials over the field of real
numbers are either of degree 1 or of degree 2.
The formula for the roots of a quadratic equation allows us to prove
easily the equivalence of Facts 1 and 2.
The immediate implication, for us, of Fact 1 is that every linear transformation which we shall consider will have all its characteristic roots in the field of complex numbers.
In what follows, V will be a finite-dimensional inner product space over F, the field of complex numbers; the inner product of two elements u, w of V will be written, as it was before, as (u, w).
Lemma 3:
If T ∈ A(V) is such that (vT, v) = 0 for all v ∈ V, then T = 0.
Proof:
Since (vT, v) = 0 for all v ∈ V, given u, w ∈ V, ((u + w)T, u + w) = 0. Expanding this out and making use of (uT, u) = (wT, w) = 0, we obtain
(uT, w) + (wT, u) = 0 for all u, w ∈ V …… (1)
Since equation (1) holds for arbitrary w in V, it still must hold if we replace in it w by iw, where i^2 = −1; but (uT, iw) = −i(uT, w), whereas ((iw)T, u) = i(wT, u). Substituting these values in (1) and canceling out i leads us to
−(uT, w) + (wT, u) = 0 …… (2)
Adding (1) and (2) we get (wT, u) = 0 for all u, w ∈ V, whence, in particular, (wT, wT) = 0. By the defining properties of an inner product space, this forces wT = 0 for all w ∈ V, hence T = 0. (Note: If V is an inner product space over the real field, the lemma may be false. For example, let V = {(α, β) | α, β real}, where the inner product is the dot product. Let T be the linear transformation sending (α, β) into (−β, α). A simple check shows that (vT, v) = 0 for all v ∈ V, yet T ≠ 0.)
Definition:
The linear transformation T ∈ A(V) is said to be unitary if (uT, vT) = (u, v) for all u, v ∈ V.
A unitary transformation is one which preserves all the structure of V: its addition, its multiplication by scalars, and its inner product. Note that a unitary transformation preserves length, for
‖v‖ = √(v, v) = √(vT, vT) = ‖vT‖.
Is the converse true? The answer is provided in
Lemma 4:
If (vT, vT) = (v, v) for all v ∈ V, then T is unitary.
Proof:
The proof is in the spirit of that of the known lemma. Let u, v ∈ V; by assumption, ((u + v)T, (u + v)T) = (u + v, u + v). Expanding this out and simplifying, we obtain
(uT, vT) + (vT, uT) = (u, v) + (v, u) …… (1)
for u, v ∈ V. In (1) replace v by iv; computing the necessary parts, this yields
(uT, vT) − (vT, uT) = (u, v) − (v, u) …… (2)
Adding (1) and (2) results in (uT, vT) = (u, v) for all u, v ∈ V, hence T is unitary.
We characterize the property of being unitary in terms of its action on a basis of V.
Theorem 6:
The linear transformation T on V is unitary if and only if it takes an orthonormal basis of V into an orthonormal basis of V.
Proof:
Suppose that {v1, ..., vn} is an orthonormal basis of V; thus (vi, vj) = 0 for i ≠ j, while (vi, vi) = 1. We wish to show that if T is unitary, then {v1T, ..., vnT} is also an orthonormal basis of V. But (viT, vjT) = (vi, vj) = 0 for i ≠ j and (viT, viT) = (vi, vi) = 1, thus indeed {v1T, ..., vnT} is an orthonormal basis of V.
On the other hand, if T ∈ A(V) is such that both {v1, ..., vn} and {v1T, ..., vnT} are orthonormal bases of V, and if u, w ∈ V, then
u = Σ_(i=1)^n αi vi,  w = Σ_(i=1)^n βi vi,
whence, by the orthonormality of the vi's,
(u, w) = Σ_(i=1)^n αi β̄i.
However,
uT = Σ_(i=1)^n αi viT and wT = Σ_(i=1)^n βi viT,
whence, by the orthonormality of the viT's,
(uT, wT) = Σ_(i=1)^n αi β̄i = (u, w),
proving that T is unitary.


The known theorem states that a change of basis from one orthonormal basis to another is accomplished by a unitary linear transformation.
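A numerical sketch (the matrix U is our own example of a unitary transformation on C²): applying U on the right to row vectors, in line with the vT convention used throughout, preserves the Hermitian inner product.

```python
# U is unitary: its rows form an orthonormal set in C^2.
s = 1 / 2 ** 0.5
U = [[s, 1j * s],
     [1j * s, s]]

def act(v, M):
    # row vector v acted on the right by the matrix M (the vT convention)
    return [v[0] * M[0][j] + v[1] * M[1][j] for j in range(2)]

def inner(u, w):
    # Hermitian inner product on C^2
    return sum(a * b.conjugate() for a, b in zip(u, w))

u, w = [1 + 1j, 2], [0.5j, -1]
assert abs(inner(act(u, U), act(w, U)) - inner(u, w)) < 1e-12  # (uU, wU) = (u, w)
assert abs(inner(act(u, U), act(u, U)) - inner(u, u)) < 1e-12  # length preserved
```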
Lemma 5:
If T ∈ A(V), then given any v ∈ V there exists an element w ∈ V, depending on v and T, such that (uT, v) = (u, w) for all u ∈ V. This element w is uniquely determined by v and T.
Proof:
To prove the lemma, it is sufficient to exhibit a w ∈ V which works for all the elements of a basis of V. Let {u1, ..., un} be an orthonormal basis of V; we define
w = Σ_(i=1)^n λ̄i ui, where λi = (uiT, v).
An easy computation shows that (ui, w) = (uiT, v), hence the element w has the desired property. That w is unique can be seen as follows: suppose that (uT, v) = (u, w1) = (u, w2); then (u, w1 − w2) = 0 for all u ∈ V, which forces, on putting u = w1 − w2, w1 = w2.
Definition:
If T ∈ A(V), then the Hermitian adjoint of T, written T*, is defined by (uT, v) = (u, vT*) for all u, v ∈ V.
Given v ∈ V, we have obtained above an explicit expression for vT* (as w), and we could use this expression to prove the various desired properties of T*. However, we prefer to do it in a "basis-free" way.
Lemma 6:
If T ∈ A(V), then T* ∈ A(V). Moreover,
1. (T*)* = T;
2. (S + T)* = S* + T*;
3. (λS)* = λ̄S*;
4. (ST)* = T*S*;
for all S, T ∈ A(V) and all λ ∈ F.
Proof:
We must first prove that T* is a linear transformation on V. If u, v, w are in V, then
(u, (v + w)T*) = (uT, v + w) = (uT, v) + (uT, w) = (u, vT*) + (u, wT*) = (u, vT* + wT*),
in consequence of which (v + w)T* = vT* + wT*. Similarly, for λ ∈ F,
(u, (λv)T*) = (uT, λv) = λ̄(uT, v) = λ̄(u, vT*) = (u, λ(vT*)),
whence (λv)T* = λ(vT*). We have thus proved that T* is a linear transformation on V.
To see that (T*)* = T, notice that (u, v(T*)*) = (uT*, v), and since (x, y) is the complex conjugate of (y, x), this equals the conjugate of (v, uT*) = the conjugate of (vT, u) = (u, vT), for all u, v ∈ V; whence v(T*)* = vT, which implies that (T*)* = T. We leave the proofs of (S + T)* = S* + T* and of (λT)* = λ̄T* to the reader. Finally,
(u, v(ST)*) = (uST, v) = (uS, vT*) = (u, vT*S*)
for all u, v ∈ V; this forces v(ST)* = vT*S* for every v ∈ V, which results in (ST)* = T*S*.
As a consequence of the lemma, the Hermitian adjoint defines an adjoint on A(V).
The Hermitian adjoint allows us to give an alternative description for
unitary transformations in terms of the relation of T and T*.
Lemma 7:
T ∈ A(V) is unitary if and only if TT* = 1.
Proof:
If T is unitary, then for all u, v ∈ V, (u, vTT*) = (uT, vT) = (u, v), hence TT* = 1. On the other hand, if TT* = 1, then (uT, vT) = (u, vTT*) = (u, v), hence T is unitary. We shall soon give an explicit matrix criterion that a linear transformation be unitary.
Theorem 7:
If {v1, ..., vn} is an orthonormal basis of V and if the matrix of T ∈ A(V) in this basis is (αij), then the matrix of T* in this basis is (βij), where βij = ᾱji.
Proof:
Since the matrices of T and T* in this basis are, respectively, (αij) and (βij), we have
viT = Σ_(j=1)^n αij vj and viT* = Σ_(j=1)^n βij vj.
Now
βij = (viT*, vj) = (vi, vjT) = (vi, Σ_(k=1)^n αjk vk) = ᾱji,
by the orthonormality of the vi's. This proves the theorem.
The theorem is very interesting to us in light of what we did earlier in Section 6.8. For the abstract Hermitian adjoint on the inner product space V, when translated into matrices in an orthonormal basis of V, becomes nothing more than the explicit, concrete Hermitian adjoint we defined there for matrices.
Using the matrix representation in an orthonormal basis, we claim that T ∈ A(V) is unitary if and only if, whenever (αij) is the matrix of T in this orthonormal basis, then
Σ_(k=1)^n αik ᾱjk = 0 for i ≠ j,
while
Σ_(k=1)^n |αik|^2 = 1.
In terms of dot products on complex vector spaces, this says that the rows of the matrix of T form an orthonormal set of vectors in F^(n) under the dot product.
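A small check of this criterion (the sample matrices are ours): the row "Gram" sums Σ_k α_ik ᾱ_jk equal 1 for i = j and 0 for i ≠ j exactly when the rows are orthonormal.

```python
# Rows-orthonormal criterion for unitarity, tested on one unitary and one
# non-unitary matrix.
s = 1 / 2 ** 0.5
U = [[s, s], [s, -s]]     # a real unitary (orthogonal) matrix
V = [[1, 1], [0, 1]]      # not unitary

def row_gram(M):
    n = len(M)
    return [[sum(M[i][k] * M[j][k].conjugate() for k in range(n))
             for j in range(n)] for i in range(n)]

G = row_gram(U)
assert abs(G[0][0] - 1) < 1e-12 and abs(G[1][1] - 1) < 1e-12
assert abs(G[0][1]) < 1e-12 and abs(G[1][0]) < 1e-12
H = row_gram(V)
assert H[0][0] != 1   # rows of V are not orthonormal, so V is not unitary
```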
Definition:
T ∈ A(V) is called self-adjoint or Hermitian if T* = T. If T* = −T we call T skew-Hermitian.
Given any S ∈ A(V),
S = (S + S*)/2 + i((S − S*)/2i),
and since (S + S*)/2 and (S − S*)/2i are Hermitian, S = A + iB where both A and B are Hermitian.
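The decomposition S = A + iB can be verified numerically; the sample matrix S is ours.

```python
# Split a complex matrix S into Hermitian parts A = (S + S*)/2 and
# B = (S - S*)/(2i), then verify that A, B are Hermitian and S = A + iB.
S = [[1 + 2j, 3],
     [4j, 5 - 1j]]

def conj_t(M):
    return [[M[j][i].conjugate() for i in range(2)] for j in range(2)]

Sstar = conj_t(S)
A = [[(S[i][j] + Sstar[i][j]) / 2 for j in range(2)] for i in range(2)]
B = [[(S[i][j] - Sstar[i][j]) / 2j for j in range(2)] for i in range(2)]

for M in (A, B):   # both parts are Hermitian
    Mstar = conj_t(M)
    assert all(abs(Mstar[i][j] - M[i][j]) < 1e-12
               for i in range(2) for j in range(2))
# and S is recovered as A + iB
assert all(abs(S[i][j] - (A[i][j] + 1j * B[i][j])) < 1e-12
           for i in range(2) for j in range(2))
```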
Using matrix calculations, we proved that any complex characteristic root
of a Hermitian matrix is real; in light of Fact 1, this can be changed to read:
Every characteristic root of a Hermitian matrix is real. We now re-prove this
from the more uniform point of view of an inner-product space.
Theorem 8:
If T ∈ A(V) is Hermitian, then all its characteristic roots are real.
Proof:
Let λ be a characteristic root of T; thus there is a v ≠ 0 in V such that vT = λv. We compute:
λ(v, v) = (λv, v) = (vT, v) = (v, vT*) = (v, vT) = (v, λv) = λ̄(v, v);
since (v, v) ≠ 0 we are left with λ = λ̄, hence λ is real.
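A quick concrete check (the matrix is ours): for a 2 × 2 Hermitian matrix, the characteristic polynomial x² − tr·x + det has real coefficients and nonnegative discriminant, so both characteristic roots are real.

```python
# Characteristic roots of a 2x2 Hermitian matrix via the quadratic formula.
A = [[2, 1 - 1j],
     [1 + 1j, 3]]
assert A[0][1] == A[1][0].conjugate()          # A is Hermitian
tr = A[0][0] + A[1][1]                         # trace = 5
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]    # 6 - |1 - 1j|^2 = 4
assert tr.imag == 0 and det.imag == 0          # real coefficients
disc = tr.real ** 2 - 4 * det.real             # 25 - 16 = 9
assert disc >= 0                               # real roots
roots = ((tr.real + disc ** 0.5) / 2, (tr.real - disc ** 0.5) / 2)
assert roots == (4.0, 1.0)
```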

We want to describe canonical forms for unitary, Hermitian, and even
more general types of linear transformations which will be even simpler than
the Jordan form. This accounts for the next few lemmas which, although of
independent interest, are for the most part somewhat technical in nature.
Lemma 8:
If S ∈ A(V) and if vSS* = 0, then vS = 0.
Proof:
Consider (vSS*, v); since vSS* = 0,
0 = (vSS*, v) = (vS, v(S*)*) = (vS, vS),
since (S*)* = S. In an inner product space, this implies that vS = 0.
Corollary 5:
If T is Hermitian and vT^k = 0 for some k ≥ 1, then vT = 0.
Proof:
We show that if vT^(2^m) = 0 then vT = 0; for if S = T^(2^(m−1)), then S* = S and vSS* = vT^(2^m) = 0, whence (vSS*, v) = 0 implies that 0 = vS = vT^(2^(m−1)). Continuing down in this way, we obtain vT = 0. If vT^k = 0, then vT^(2^m) = 0 for 2^m ≥ k, hence vT = 0.
We introduce a class of linear transformations which contains, as
special cases, the unitary, Hermitian and skew-Hermitian transformations.
Definition:
T ∈ A(V) is said to be normal if TT* = T*T.
Instead of proving the theorems to follow for unitary and Hermitian transformations separately, we shall instead prove them for normal linear transformations and derive, as corollaries, the desired results for the unitary and Hermitian ones.
Lemma 9:
If N is a normal linear transformation and if vN = 0 for v ∈ V, then vN* = 0.
Proof:
Consider (vN*, vN*); by definition,
(vN*, vN*) = (vN*N, v) = (vNN*, v),
since NN* = N*N. However, vN = 0, whence, certainly, vNN* = 0. In this way we obtain (vN*, vN*) = 0, forcing vN* = 0.
Corollary 6:
If λ is a characteristic root of the normal transformation N and if vN = λv, then vN* = λ̄v.
Proof:
Since N is normal, NN* = N*N; therefore,
(N − λ)(N − λ)* = (N − λ)(N* − λ̄) = NN* − λN* − λ̄N + λλ̄ = N*N − λ̄N − λN* + λ̄λ = (N* − λ̄)(N − λ) = (N − λ)*(N − λ),
that is to say, N − λ is normal. Since v(N − λ) = 0, applying the lemma to the normal transformation N − λ gives v(N − λ)* = 0, hence vN* = λ̄v.
The corollary states the interesting fact that if λ is a characteristic root of the normal transformation N, then not only is λ̄ a characteristic root of N*, but any characteristic vector of N belonging to λ is a characteristic vector of N* belonging to λ̄, and vice versa.
Corollary 7:
If T is unitary and if λ is a characteristic root of T, then |λ| = 1.
Proof:
Since T is unitary it is normal. Let λ be a characteristic root of T and suppose that vT = λv with v ≠ 0 in V. By the corollary above, vT* = λ̄v, thus v = vTT* = λvT* = λλ̄v, since TT* = 1. Thus we get λλ̄ = 1, which, of course, says that |λ| = 1.
We pause to see where we are going. Our immediate goal is to prove that a normal transformation N can be brought to diagonal form by a unitary one. If λ1, ..., λk are the distinct characteristic roots of N, then, using Theorem 6.6.1, we can decompose V as V = V1 ⊕ ··· ⊕ Vk, where for vi ∈ Vi, vi(N − λi)^ni = 0. Accordingly, we want to study two things, namely, the relation of vectors lying in different Vi's and the very nature of each Vi. When these have been determined, we will be able to assemble them to prove the desired theorem.
Lemma 10:
If N is normal and if vN^k = 0, then vN = 0.
Proof:
Let S = NN*; S is Hermitian, and by the normality of N, vS^k = v(NN*)^k = vN^k(N*)^k = 0. By the corollary to the known lemma, we deduce that vS = 0, that is to say, vNN* = 0. Invoking the known lemma itself yields vN = 0.
Corollary 8:
If N is normal and if, for λ ∈ F, v(N − λ)^k = 0, then vN = λv.
Proof:
From the normality of N it follows that N − λ is normal, whence by applying the lemma just proved to N − λ we obtain the corollary.
In line with the discussion just preceding the last lemma, this corollary shows that every vector in Vi is a characteristic vector of N belonging to the characteristic root λi. We have determined the nature of Vi; now we proceed to investigate the interrelation between two distinct Vi's.
Lemma 11:
Let N be a normal transformation and suppose that λ and μ are two distinct characteristic roots of N. If v, w are in V and are such that vN = λv, wN = μw, then (v, w) = 0.
Proof:
We compute (vN, w) in two different ways. As a consequence of vN = λv,
(vN, w) = (λv, w) = λ(v, w).
From wN = μw, using the known lemma we obtain wN* = μ̄w, whence
(vN, w) = (v, wN*) = (v, μ̄w) = μ(v, w).
Comparing the two computations gives us λ(v, w) = μ(v, w), and since λ ≠ μ, this results in (v, w) = 0.
All the background work has been done to enable us to prove the basic
and lovely.
Theorem 9:
If N is a normal linear transformation on V, then there exists an orthonormal basis, consisting of characteristic vectors of N, in which the matrix of N is diagonal. Equivalently, if N is a normal matrix there exists a unitary matrix U such that UNU⁻¹ (= UNU*) is diagonal.
Proof:
We fill in the informal sketch we have made of the proof prior to proving Lemma 10.
Let N be normal and let λ1, ..., λk be the distinct characteristic roots of N. By the corollary to the known theorem we can decompose V = V1 ⊕ ··· ⊕ Vk, where every vi ∈ Vi is annihilated by (N − λi)^ni. By the corollary to the known lemma, Vi consists only of characteristic vectors of N belonging to the characteristic root λi. The inner product of V induces an inner product on Vi; by Theorem 4.4.2 we can find a basis of Vi orthonormal relative to this inner product.
By the known lemma, elements lying in distinct Vi's are orthogonal. Thus putting together the orthonormal bases of the Vi's provides us with an orthonormal basis of V. This basis consists of characteristic vectors of N, hence in this basis the matrix of N is diagonal.
We do not prove the matrix equivalent, leaving it as a problem; we only
point out that two facts are needed:
1. A change of basis from one orthonormal basis to another is
accomplished by a unitary transformation.
2. In a change of basis the matrix of a linear transformation is changed by
conjugating by the matrix of the change of basis.
Both corollaries to follow are very special cases of the theorem just proved, but since each is so important in its own right we list them as corollaries in order to emphasize them.
Corollary 9:
If T is a unitary transformation, then there is an orthonormal basis in which the matrix of T is diagonal; equivalently, if T is a unitary matrix, then there is a unitary matrix U such that UTU⁻¹ (= UTU*) is diagonal.
Corollary 10:
If T is a Hermitian linear transformation, then there exists an orthonormal basis in which the matrix of T is diagonal; equivalently, if T is a Hermitian matrix, then there exists a unitary matrix U such that UTU⁻¹ (= UTU*) is diagonal.
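A worked instance of the corollary, with matrices chosen by hand: T = [[0, 1], [1, 0]] is Hermitian, its orthonormal characteristic vectors (1, 1)/√2 and (1, −1)/√2 form the rows of a unitary U, and UTU* is diagonal.

```python
# Diagonalize the Hermitian matrix T by the unitary U whose rows are
# orthonormal characteristic vectors of T (using the vT row convention).
s = 1 / 2 ** 0.5
T = [[0, 1],
     [1, 0]]
U = [[s, s],       # characteristic vector for root  1
     [s, -s]]      # characteristic vector for root -1

def conj_t(M):
    return [[M[j][i].conjugate() for i in range(2)] for j in range(2)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

D = matmul(matmul(U, T), conj_t(U))
assert abs(D[0][0] - 1) < 1e-12 and abs(D[1][1] + 1) < 1e-12  # diag(1, -1)
assert abs(D[0][1]) < 1e-12 and abs(D[1][0]) < 1e-12
```

The diagonal entries 1 and −1 are exactly the (real) characteristic roots, as Lemma 12 below predicts for a Hermitian matrix.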
The theorem just proved is the basic result for normal transformations, for it sharply characterizes them as precisely those transformations which can be brought to diagonal form by unitary ones. It also shows that the distinction between normal, Hermitian and unitary transformations is merely a distinction caused by the nature of their characteristic roots. This is made precise in
Lemma 12:
The normal transformation N is
1. Hermitian if and only if its characteristic roots are real
2. Unitary if and only if its characteristic roots are all of absolute value 1.
Proof:
We argue using matrices. If N is Hermitian, then it is normal and all its characteristic roots are real. If N is normal and has only real characteristic roots, then for some unitary matrix U, UNU⁻¹ = UNU* = D, where D is a diagonal matrix with real entries on the diagonal. Thus D* = D; since D* = (UNU*)* = UN*U*, the relation D* = D implies UN*U* = UNU*, and since U is invertible we obtain N* = N. Thus N is Hermitian.
We leave the proof of the part about unitary transformation to the reader.
If A is any linear transformation on V, then tr(AA*) can be computed by using the matrix representation of A in any basis of V. We pick an orthonormal basis of V; in this basis, if the matrix of A is (αij), then that of A* is (βij), where βij = ᾱji. A simple computation then shows that
tr(AA*) = Σ_(i,j) |αij|²,
and this is 0 if and only if each αij = 0, that is, if and only if A = 0. In a word, tr(AA*) = 0 if and only if A = 0. This is a useful criterion for showing that a given linear transformation is 0. It is illustrated in
Lemma 13:
If N is normal and AN = NA, then AN* = N*A.
Proof:
We want to show that X = AN* - N*A is 0; what we shall do is prove
that tr(XX*) = 0, and deduce from this that X = 0.
Since N commutes with A and with N*, it must commute with AN* - N*A; thus

    XX* = (AN* - N*A)(NA* - A*N) = (AN* - N*A)NA* - (AN* - N*A)A*N
        = N(AN* - N*A)A* - (AN* - N*A)A*N.

Being of the form NB - BN (with B = (AN* - N*A)A*), the trace of XX* is 0.
Thus X = 0, and AN* = N*A.
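Both the trace criterion and Lemma 13 can be checked on concrete matrices. The following is an illustrative sketch, assuming NumPy; the matrices A, N, and A1 are my own examples, not the text's.

```python
import numpy as np

# Trace criterion: tr(A A*) is the sum of |a_ij|^2, so it is 0 only for A = 0.
A = np.array([[1.0, 2 + 1j],
              [0.0, -3j]])
assert np.isclose(np.trace(A @ A.conj().T).real, np.sum(np.abs(A) ** 2))

# Lemma 13: N below is normal (N N* = N* N), and A1 = 2I + 3N commutes
# with N by construction; the lemma then forces A1 to commute with N*.
N = np.array([[0.0, -1.0],
              [1.0,  0.0]])
A1 = 2 * np.eye(2) + 3 * N
assert np.allclose(N @ N.conj().T, N.conj().T @ N)   # N is normal
assert np.allclose(A1 @ N, N @ A1)                   # A1 N = N A1

X = A1 @ N.conj().T - N.conj().T @ A1                # should vanish
```

As in the proof, one could equally well verify that tr(XX*) vanishes and conclude X = 0 from the trace criterion.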
We have just seen that N* commutes with all the linear transformations
that commute with N, when N is normal; this is enough to force N* to be a
polynomial expression in N. However, this can be shown directly as a
consequence of a theorem proved earlier.
The linear transformation T is Hermitian if and only if (vT, v) is real for
every v ∈ V. Of special interest are those Hermitian linear transformations for
which (vT, v) ≥ 0 for all v ∈ V. We call these nonnegative linear
transformations, and denote the fact that a linear transformation is nonnegative
by writing T ≥ 0. If T ≥ 0 and, in addition, (vT, v) > 0 for v ≠ 0, then we call T
positive (or positive definite) and write T > 0. We wish to distinguish these
linear transformations by their characteristic roots.
Lemma 14:
The Hermitian linear transformation T is nonnegative (positive) if and
only if all of its characteristic roots are nonnegative (positive).
Proof:
Suppose that T ≥ 0; if λ is a characteristic root of T, then vT = λv for
some v ≠ 0. Thus 0 ≤ (vT, v) = (λv, v) = λ(v, v); since (v, v) > 0, we deduce
that λ ≥ 0.
Conversely, if T is Hermitian with nonnegative characteristic roots, then
we can find an orthonormal basis {v_1, ..., v_n} of V consisting of
characteristic vectors of T. For each v_i, v_iT = λ_i v_i,
where λ_i ≥ 0. Given v ∈ V, v = Σ α_i v_i, hence vT = Σ α_i v_iT = Σ α_i λ_i v_i. But then

    (vT, v) = (Σ α_i λ_i v_i, Σ α_i v_i) = Σ λ_i α_i ᾱ_i

by the orthonormality of the v_i's. Since
λ_i ≥ 0 and α_i ᾱ_i ≥ 0, we get that (vT, v) ≥ 0, hence T ≥ 0.
The corresponding “positive” results are left as an exercise.
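A quick numeric illustration of Lemma 14 (a sketch assuming NumPy; both matrices are arbitrary examples): a symmetric matrix with nonnegative roots never takes a negative value as a quadratic form, while one with a negative root does.

```python
import numpy as np

T_pos = np.array([[2.0, 1.0],
                  [1.0, 2.0]])      # roots 1 and 3: nonnegative
T_ind = np.array([[1.0, 2.0],
                  [2.0, 1.0]])      # roots 3 and -1: indefinite

assert np.all(np.linalg.eigvalsh(T_pos) >= 0)
assert np.any(np.linalg.eigvalsh(T_ind) < 0)

# (vT, v) >= 0 for every v when the roots are nonnegative...
rng = np.random.default_rng(0)
vs = rng.standard_normal((100, 2))
forms_pos = np.einsum('ij,jk,ik->i', vs, T_pos, vs)  # batch of (vT, v) values

# ...but some v makes the indefinite form negative, e.g. v = (1, -1).
v = np.array([1.0, -1.0])
neg_value = v @ T_ind @ v           # 1 - 2 - 2 + 1 = -2
```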
Lemma 15:
T ≥ 0 if and only if T = AA* for some A.
Proof:
We first show that AA* ≥ 0. Given v ∈ V, (vAA*, v) = (vA, vA) ≥ 0;
hence AA* ≥ 0.
On the other hand, if T ≥ 0 we can find a unitary matrix U such that

    UTU* = diag(λ_1, ..., λ_n),

where each λ_i is a characteristic root of T, hence each λ_i ≥ 0. Let

    S = diag(√λ_1, ..., √λ_n).

Since each λ_i ≥ 0, each √λ_i is real, whence S is Hermitian. Therefore
U*SU is Hermitian; but

    (U*SU)^2 = U*S^2U = U* diag(λ_1, ..., λ_n) U = T.

We have represented T in the form AA*, where A = U*SU.
Notice that we have actually proved a little more; namely, if in
constructing S above we had chosen the nonnegative √λ_i for each λ_i, then S,
and with it U*SU, would have been nonnegative. Thus every T ≥ 0 is the square
of a nonnegative linear transformation; that is, every T ≥ 0 has a nonnegative
square root. This nonnegative square root can be shown to be unique.
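The construction in the proof can be followed literally with NumPy (an illustrative sketch; the matrix T is my own example): diagonalize, take the nonnegative square roots of the characteristic roots, and conjugate back.

```python
import numpy as np

T = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # symmetric, roots 1 and 3, so T >= 0

roots, V = np.linalg.eigh(T)               # T = V diag(roots) V'
S = V @ np.diag(np.sqrt(roots)) @ V.T      # the nonnegative square root of T
```

Here `S` plays the role of U*SU in the proof: it is symmetric, its roots are the √λ_i ≥ 0, and S^2 = T.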
We close this section with a discussion of unitary and Hermitian
matrices over the real field. In this case, the unitary matrices are called
orthogonal, and satisfy QQ' = 1. The Hermitian ones are just symmetric, in this
case.
We claim that a real symmetric matrix can be brought to diagonal form
by an orthogonal matrix. Let A be a real symmetric matrix. We can consider A
as acting on a real inner-product space V. Considered as a complex matrix, A is
Hermitian, and thus all its characteristic roots are real. If these are λ_1, ..., λ_k,
then V can be decomposed as V = V_1 ⊕ ... ⊕ V_k, where v_i(A - λ_i) = 0 for
v_i ∈ V_i, and where, for v_i ∈ V_i and v_j ∈ V_j with i ≠ j, (v_i, v_j) = 0. Thus we
can find an orthonormal basis of V consisting of characteristic vectors of A.
The change of basis from the orthonormal basis (1, 0, ..., 0), (0, 1, 0, ..., 0), ...,
(0, ..., 0, 1) to this new basis is accomplished by a real unitary matrix, that is,
by an orthogonal matrix, proving our contention.
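In matrix terms (a sketch assuming NumPy; the matrix A is illustrative): `eigh` applied to a real symmetric matrix returns exactly such an orthonormal basis of characteristic vectors, that is, an orthogonal change of basis diagonalizing A.

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
assert np.allclose(A, A.T)         # A is real symmetric

roots, Q = np.linalg.eigh(A)       # columns of Q: orthonormal eigenvectors
D = Q.T @ A @ Q                    # diagonal, with the real roots of A
```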
To determine canonical forms for the real orthogonal matrices over the
real field is a little more complicated, both in its answer and its execution. We
proceed to this now; but first we make a general remark about all unitary
transformations.
If W is a subspace of V invariant under the unitary transformation T, is
it true that W', the orthogonal complement of W, is also invariant under T?
Let w ∈ W and x ∈ W'; thus (wT, xT) = (w, x) = 0. Since W is invariant under
T and T is regular, WT = W, whence xT, for x ∈ W', is orthogonal to all of W.
Thus indeed W'T ⊂ W'. Recall that V = W ⊕ W'.
Let Q be a real orthogonal matrix; then T = Q + Q^{-1} = Q + Q' is
symmetric, hence has real characteristic roots. If these are λ_1, ..., λ_k, then V
can be decomposed as V = V_1 ⊕ ... ⊕ V_k, where v ∈ V_i implies vT = λ_i v.
The V_i's are mutually orthogonal. We claim each V_i is invariant under Q.
(Prove!) Thus to discuss the action of Q on V, it is enough to describe it on
each V_i.
On V_i, since λ_i v = vT = v(Q + Q^{-1}), multiplying by Q yields
v(Q^2 - λ_iQ + 1) = 0. Two special cases present themselves, namely λ_i = 2 and
λ_i = -2 (which may, of course, not occur), for then v(Q ∓ 1)^2 = 0, leading to
v(Q ∓ 1) = 0. On these spaces Q acts as 1 or as -1.
If λ_i ≠ ±2, then Q has no characteristic vectors on V_i; hence for
v ≠ 0 in V_i, v and vQ are linearly independent. The subspace W they generate is
invariant under Q, since vQ^2 = λ_i vQ - v. Now V_i = W ⊕ W', with W' (the
orthogonal complement of W in V_i) invariant under Q. Thus we can get V_i as
a direct sum of two-dimensional mutually orthogonal subspaces invariant
under Q. To find canonical forms of Q on V_i (hence on V), we must merely
settle the question for 2 × 2 real orthogonal matrices.
Let Q be a 2 × 2 real orthogonal matrix satisfying Q^2 - λQ + 1 = 0;
suppose that

    Q = ( α  β )
        ( γ  δ ).

The orthogonality of Q implies

    α^2 + β^2 = 1,    (1)
    γ^2 + δ^2 = 1,    (2)
    αγ + βδ = 0.      (3)

Since Q^2 - λQ + 1 = 0, the determinant of Q is 1, hence

    αδ - βγ = 1.      (4)
We claim that equations (1)-(4) imply that α = δ, β = -γ. Since
α^2 + β^2 = 1, |α| ≤ 1, whence we can write α = cos θ for some real angle θ; in
these terms β = sin θ. Therefore the matrix Q looks like

    (  cos θ   sin θ )
    ( -sin θ   cos θ ).
All the spaces used in all our decompositions were mutually orthogonal;
thus by picking orthonormal bases of each of these we obtain an orthonormal
basis of V. In this basis the matrix of Q is block diagonal, as shown in
Figure 1: some entries +1 down the diagonal, then some entries -1, then a
sequence of 2 × 2 rotation blocks

    (  cos θ_i   sin θ_i )
    ( -sin θ_i   cos θ_i ),    i = 1, ..., r,

down the diagonal.
Figure 1
Since we have gone from one orthonormal basis to another, and since this
is accomplished by an orthogonal matrix, given a real orthogonal matrix Q we
can find an orthogonal matrix T such that TQT^{-1} (= TQT') is of the form
just described.
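A small check of this canonical form (an illustrative sketch with NumPy; the sizes and angles are arbitrary choices of mine): assembling the block-diagonal matrix of Figure 1 from a +1, a -1 and one rotation block yields a real orthogonal matrix, and conjugating it by any orthogonal T produces another orthogonal matrix with the same canonical form.

```python
import numpy as np

theta = 0.8
R = np.array([[np.cos(theta),  np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

# Block-diagonal canonical form: a +1, a -1, and one rotation block.
Q = np.zeros((4, 4))
Q[0, 0] = 1.0
Q[1, 1] = -1.0
Q[2:, 2:] = R

assert np.allclose(Q @ Q.T, np.eye(4))     # Q is orthogonal

# Conjugating by an orthogonal T (here a rotation in the first two
# coordinates) gives T Q T^{-1} = T Q T', again orthogonal.
phi = 0.5
T = np.eye(4)
T[0, 0] = T[1, 1] = np.cos(phi)
T[0, 1] = np.sin(phi)
T[1, 0] = -np.sin(phi)

Q2 = T @ Q @ T.T
```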
Real quadratic forms
We close the chapter with a brief discussion of quadratic forms over the
field of real numbers.
Let V be a real inner-product space and suppose that A is a (real)
symmetric linear transformation on V. The real-valued function Q(v) defined on
V by Q(v) = (vA, v) is called the quadratic form associated with A.
If we consider, as we may without loss of generality, that A is a real n
× n symmetric matrix (α_ij) acting on F^(n), and that the inner product of
(γ_1, ..., γ_n) and (δ_1, ..., δ_n) in F^(n) is the real number
γ_1δ_1 + γ_2δ_2 + ... + γ_nδ_n, then for an arbitrary vector v = (x_1, ..., x_n) in F^(n) a
simple calculation shows that

    Q(v) = (vA, v) = α_11 x_1^2 + ... + α_nn x_n^2 + 2 Σ_{i<j} α_ij x_i x_j.
On the other hand, given any quadratic function in n variables,

    β_11 x_1^2 + ... + β_nn x_n^2 + 2 Σ_{i<j} β_ij x_i x_j,

with real coefficients β_ij, we clearly realize it as the quadratic form
associated with the real symmetric matrix C = (β_ij).
In real n-dimensional Euclidean space such quadratic functions serve to
define the quadric surfaces. For instance, in the real plane the form
αx^2 + βxy + γy^2 gives rise to a conic section (possibly with its major axis
tilted). It is not too unnatural to expect that the geometric properties of this
conic section should be intimately related with the symmetric matrix

    (  α    β/2 )
    ( β/2    γ  ),

with which its quadratic form is associated.
Let us recall that in elementary analytic geometry one proves that by a
suitable rotation of axes the equation αx^2 + βxy + γy^2 can, in the new
coordinate system, assume the form λ_1 x'^2 + λ_2 y'^2. Recall that

    λ_1 + λ_2 = α + γ  and  λ_1λ_2 = αγ - β^2/4.

Thus λ_1, λ_2 are the characteristic roots of the matrix

    (  α    β/2 )
    ( β/2    γ  ).

The rotation of axes is just a change of basis by an orthogonal
transformation, and what we did in the geometry was merely to bring the
symmetric matrix to its diagonal form by an orthogonal matrix. The nature of
αx^2 + βxy + γy^2 as a conic was basically determined by the size and sign of its
characteristic roots λ_1, λ_2.
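For the concrete form x^2 + xy + y^2 (so α = γ = 1, β = 1), the associated matrix and its roots can be computed directly (a sketch assuming NumPy):

```python
import numpy as np

# Matrix of the quadratic form x^2 + xy + y^2: beta/2 = 0.5 off the diagonal.
M = np.array([[1.0, 0.5],
              [0.5, 1.0]])

roots = np.linalg.eigvalsh(M)      # ascending: [0.5, 1.5]

# Checks: root sum = alpha + gamma, root product = alpha*gamma - beta^2/4.
```

Both roots are positive, so the conic x^2 + xy + y^2 = c (for c > 0) is an ellipse, with a tilted major axis.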
A similar discussion can be carried out to classify quadric surfaces in 3-
space, and indeed quadric surfaces in n-space. What essentially determines the
geometric nature of the quadric surface associated with

    β_11 x_1^2 + ... + β_nn x_n^2 + 2 Σ_{i<j} β_ij x_i x_j

is the size of the characteristic roots of the matrix (β_ij). If we were not
interested in the relative flatness of the quadric surface (e.g., if we consider an
ellipse as a flattened circle), then we could ignore the size of the nonzero
characteristic roots, and the determining factor for the shape of the quadric
surface would be the number of 0 characteristic roots and the number of
positive (and negative) ones.
These things motivate, and at the same time will be clarified in, the
discussion that follows, which culminates in Sylvester's law of inertia.
Let A be a real symmetric matrix and let us consider its associated quadratic
form Q(v) = (vA, v). If T is any nonsingular real linear transformation, then given
v ∈ F^(n), v = wT for some w ∈ F^(n), whence

    (vA, v) = (wTA, wT) = (wTAT', w).

Thus A and TAT' effectively define the same quadratic form. This prompts the
Definition:
Two real symmetric matrices A and B are congruent if there is a
nonsingular real matrix T such that B=TAT’.
Lemma 16:
Congruence is an equivalence relation.
Proof:
Let us write, when A is congruent to B, A ≅ B.
1. A ≅ A, for A = 1A1'.
2. If A ≅ B, then B = TAT' where T is nonsingular; hence A = SBS',
where S = T^{-1}. Thus B ≅ A.
3. If A ≅ B and B ≅ C, then B = TAT' while C = RBR'; hence
C = RTAT'R' = (RT)A(RT)', and so A ≅ C.
Since the relation satisfies the defining conditions for an equivalence
relation, the lemma is proved.
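The three steps of the proof translate directly into a numeric check (illustrative, assuming NumPy; the matrix A and the transforms T, R are arbitrary examples):

```python
import numpy as np

A = np.diag([2.0, -5.0, 0.0])              # a real symmetric matrix

# Reflexivity: A = 1 A 1'.
assert np.allclose(np.eye(3) @ A @ np.eye(3).T, A)

# Symmetry: if B = T A T' with T nonsingular, then A = S B S' for S = T^{-1}.
T = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 1.0]])            # unit triangular, hence nonsingular
B = T @ A @ T.T
S = np.linalg.inv(T)
assert np.allclose(S @ B @ S.T, A)

# Transitivity: C = R B R' = (R T) A (R T)'.
R = np.array([[1.0, 0.0, 0.0],
              [4.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
C = R @ B @ R.T
```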
The principal theorem concerning congruence is its characterization,
contained in Sylvester’s law.
Theorem 10:
Given the real symmetric matrix A there is an invertible matrix T such that

    TAT' = diag(I_r, -I_s, 0_t),

where I_r and I_s are respectively the r × r and s × s unit matrices and where 0_t is
the t × t zero matrix. The integers r + s, which is the rank of A, and r - s,
which is the signature of A, characterize the congruence class of A. That is,
two real symmetric matrices are congruent if and only if they have the same
rank and signature.
Proof:
Since A is real symmetric, its characteristic roots are all real; let
λ_1, ..., λ_r be its positive characteristic roots and λ_{r+1}, ..., λ_{r+s} its negative
ones. By the discussion at the end of Section 6.10 we can find a real
orthogonal matrix C such that

    CAC^{-1} = CAC' = diag(λ_1, ..., λ_r, λ_{r+1}, ..., λ_{r+s}, 0_t),

where t = n - r - s. Let D be the real diagonal matrix shown in Figure 2:

    D = diag(1/√λ_1, ..., 1/√λ_r, 1/√(-λ_{r+1}), ..., 1/√(-λ_{r+s}), I_t).

Figure 2
A simple computation shows that

    DCAC'D' = diag(I_r, -I_s, 0_t).
Thus there is a matrix of the required form in the congruence class of A.
Our task is now to show that this is the only matrix in the congruence class of
A of this form or, equivalently, that

    L = diag(I_r, -I_s, 0_t)  and  M = diag(I_r', -I_s', 0_t')

are congruent only if r = r', s = s', and t = t'.
Suppose that M = TLT' where T is invertible. By Lemma 6.1.3 the rank of
M equals that of L; since the rank of M is n - t' while that of L is n - t, we get
t = t'.
Suppose that r ≠ r', say r < r' (the argument for r' < r is symmetric);
since n = r + s + t = r' + s' + t' and since t = t', we
must have s > s'. Let U be the subspace of F^(n) of all vectors having the first r
and last t coordinates 0; U is s-dimensional, and for u ≠ 0 in U, (uL, u) < 0.
Let W be the subspace of F^(n) of all vectors whose (r' + 1), ..., (r' + s')
components are all 0; on W, (wM, w) ≥ 0 for any w ∈ W. Since T is invertible,
and since W is (n - s')-dimensional, WT is (n - s')-dimensional. For w ∈ W,
(wM, w) ≥ 0; hence (wTLT', w) ≥ 0; that is, (wTL, wT) ≥ 0. Therefore
(uL, u) ≥ 0 for all u in WT. Now

    dim(WT) + dim(U) = (n - s') + s > n;

thus by the corollary to Lemma
4.2.6, WT ∩ U ≠ (0). This, however, is nonsense, for if x ≠ 0 is in WT ∩ U then, on one
hand, being in U, (xL, x) < 0, while on the other, being in WT, (xL, x) ≥ 0.
Thus r = r', and so s = s'.
The rank, r + s, and signature, r – s, of course, determine r, s and so
t = n - (r + s), whence they determine the congruence class.
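Sylvester's law can be tested numerically by counting the signs of the characteristic roots. The `inertia` helper below is a hypothetical utility of mine (not part of the text), and the sketch assumes NumPy:

```python
import numpy as np

def inertia(A, tol=1e-9):
    """Return (r, s, t): counts of positive, negative and zero characteristic
    roots of the real symmetric matrix A; rank = r + s, signature = r - s."""
    w = np.linalg.eigvalsh(A)
    return (int(np.sum(w > tol)), int(np.sum(w < -tol)),
            int(np.sum(np.abs(w) <= tol)))

A = np.diag([3.0, 1.0, -2.0, 0.0])         # r = 2, s = 1, t = 1

# Congruence by a nonsingular T changes the roots but not (r, s, t).
T = np.array([[1.0, 2.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0, 5.0],
              [0.0, 0.0, 0.0, 1.0]])       # unit triangular: det = 1
B = T @ A @ T.T
```

Here B has quite different entries (and roots) from A, yet shares its rank 3 and signature 1, as the theorem asserts.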