
A proof using matrices of endomorphisms

As was mentioned above, the matrix p(A) in the statement of the theorem is obtained by first evaluating
the determinant and then substituting the matrix A for t; doing that substitution into the matrix
$tI_n - A$ before evaluating the determinant is not meaningful. Nevertheless, it is possible to give an
interpretation where p(A) is obtained directly as the value of a certain determinant, but this requires a
more complicated setting, one of matrices over a ring in which one can interpret both the entries
$A_{i,j}$ of A, and all of A itself. One could take for this the ring M(n, R) of n × n matrices over R,
where the entry $A_{i,j}$ is realised as $A_{i,j} I_n$, and A as itself. But considering matrices with
matrices as entries might cause confusion with block matrices, which is not intended, as that gives
the wrong notion of determinant (recall that the determinant of a matrix is defined as a sum of
products of its entries, and in the case of a block matrix this is generally not the same as the
corresponding sum of products of its blocks!). It is clearer to distinguish A from
the endomorphism φ of an n-dimensional vector space V (or free R-module if R is not a field) defined
by it in a basis $e_1, \ldots, e_n$, and to take matrices over the ring End(V) of all such endomorphisms.
Then φ ∈ End(V) is a possible matrix entry, while A designates the element of M(n, End(V)) whose i, j
entry is the endomorphism of scalar multiplication by $A_{i,j}$; similarly $I_n$ will be interpreted as an
element of M(n, End(V)). However, since End(V) is not a commutative ring, no determinant is
defined on M(n, End(V)); this can only be done for matrices over a commutative subring of End(V).

Now the entries of the matrix $\varphi I_n - A$ all lie in the subring R[φ] generated by the identity and φ,
which is commutative. Then a determinant map M(n, R[φ]) → R[φ] is defined, and $\det(\varphi I_n - A)$
evaluates to the value p(φ) of the characteristic polynomial of A at φ (this holds independently of the
relation between A and φ); the Cayley–Hamilton theorem states that p(φ) is the null endomorphism.
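
The order of operations described above (first expand the determinant into a polynomial, then substitute the matrix) can be checked numerically. The following is a minimal sketch, not part of the proof: it assumes the Faddeev–LeVerrier recursion as one convenient way to obtain the coefficients of p(t) = det(tI_n − A), and the 3 × 3 matrix `A` and the helper names `char_poly` and `evaluate_at_matrix` are illustrative choices.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(A):
    """Coefficients [c_0, ..., c_n] of p(t) = det(t*I_n - A).

    Uses the Faddeev-LeVerrier recursion (an assumption of this sketch,
    chosen only because it avoids symbolic determinants).
    """
    n = len(A)
    coeffs = [Fraction(0)] * (n + 1)
    coeffs[n] = Fraction(1)
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        # M_k = A * M_{k-1} + c_{n-k+1} * I_n
        AM = mat_mul(A, M)
        M = [[AM[i][j] + (coeffs[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        # c_{n-k} = -trace(A * M_k) / k
        AMk = mat_mul(A, M)
        coeffs[n - k] = -sum(AMk[i][i] for i in range(n)) / k
    return coeffs


def evaluate_at_matrix(coeffs, A):
    """Evaluate the polynomial at the matrix A by Horner's scheme."""
    n = len(A)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        shifted = mat_mul(result, A)
        result = [[shifted[i][j] + (c if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result


A = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]   # illustrative matrix
p = char_poly(A)                         # t^3 - 9t^2 + 26t - 25
p_of_A = evaluate_at_matrix(p, A)        # the 3 x 3 zero matrix
```

The determinant is expanded once, into scalar coefficients, and only afterwards is A substituted; at no point is A placed inside the matrix whose determinant is taken.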
In this form, the following proof can be obtained from that of (Atiyah & MacDonald 1969, Prop. 2.4)
(which in fact is the more general statement related to the Nakayama lemma; one takes for
the ideal in that proposition the whole ring R). The fact that A is the matrix of φ in the
basis e1, ..., en means that

$$\varphi(e_i) = \sum_{j=1}^{n} A_{j,i}\, e_j \quad \text{for } i = 1, \ldots, n.$$

One can interpret these as n components of one equation in $V^n$, whose members can be written
using the matrix-vector product $M(n, \operatorname{End}(V)) \times V^n \to V^n$ that is defined as usual, but with
individual entries ψ ∈ End(V) and v in V being "multiplied" by forming ψ(v); this gives:

$$\varphi I_n \cdot E = A^{\operatorname{tr}} \cdot E,$$

where $E \in V^n$ is the element whose component i is ei (in other words it is the
basis e1, ..., en of V written as a column of vectors). Writing this equation as

$$(\varphi I_n - A^{\operatorname{tr}}) \cdot E = 0 \in V^n$$

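one recognizes the transpose of the matrix $\varphi I_n - A$ considered above, and its
determinant (the matrix being regarded as an element of M(n, R[φ])) is also p(φ),
since a determinant is unchanged by transposition. To derive from this equation
that p(φ) = 0 ∈ End(V), one left-multiplies by the adjugate matrix of $\varphi I_n - A^{\operatorname{tr}}$, which is
defined in the matrix ring M(n, R[φ]), giving

$$\begin{aligned}
0 &= \operatorname{adj}(\varphi I_n - A^{\operatorname{tr}}) \cdot \bigl((\varphi I_n - A^{\operatorname{tr}}) \cdot E\bigr) \\
  &= \bigl(\operatorname{adj}(\varphi I_n - A^{\operatorname{tr}}) \cdot (\varphi I_n - A^{\operatorname{tr}})\bigr) \cdot E \\
  &= \bigl(\det(\varphi I_n - A^{\operatorname{tr}})\, I_n\bigr) \cdot E \\
  &= (p(\varphi)\, I_n) \cdot E;
\end{aligned}$$

the associativity of matrix-matrix and matrix-vector multiplication used in the first
step is a purely formal property of those operations, independent of the nature of the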
entries. Now component i of this equation says that p(φ)(ei) = 0 ∈ V; thus p(φ)
vanishes on all ei, and since these elements generate V it follows that p(φ) = 0 ∈
End(V), completing the proof.
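
The steps of this proof can be traced concretely for n = 2. The sketch below is an illustration under stated assumptions rather than a general implementation: V is taken to be R² with its standard basis, endomorphisms are stored as 2 × 2 nested lists, and the matrix `A` and all helper names are hypothetical choices made here. Each entry of the matrix `N` (standing for φI_2 − A^tr) is itself an endomorphism lying in R[φ].

```python
def mul(X, Y):
    """Compose two endomorphisms of V = R^2 stored as 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]


def scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]


def apply(X, v):
    """Apply the endomorphism X to the vector v."""
    return [sum(X[i][k] * v[k] for k in range(2)) for i in range(2)]


ID = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]

A = [[1, 2], [3, 4]]    # illustrative matrix
phi = A                 # phi(e_i) = sum_j A[j][i] e_j, so its matrix is A
E = [[1, 0], [0, 1]]    # E[i] is the basis vector e_i

# N = phi*I_2 - A^tr, with entries N[i][j] = delta_ij * phi - A[j][i] * id
N = [[add(scale(int(i == j), phi), scale(-A[j][i], ID)) for j in range(2)]
     for i in range(2)]

# Matrix-vector product: (N . E)_i = sum_j N[i][j](e_j), a vector of V
NE = []
for i in range(2):
    component = [0, 0]
    for j in range(2):
        image = apply(N[i][j], E[j])
        component = [component[k] + image[k] for k in range(2)]
    NE.append(component)

# Adjugate of a 2x2 matrix over the commutative ring R[phi]
adj = [[N[1][1], scale(-1, N[0][1])],
       [scale(-1, N[1][0]), N[0][0]]]

# adj . N, computed entry by entry inside R[phi]
adj_N = [[add(mul(adj[i][0], N[0][j]), mul(adj[i][1], N[1][j]))
          for j in range(2)] for i in range(2)]

# det(N) = N00*N11 - N01*N10, which is p(phi)
det_N = add(mul(N[0][0], N[1][1]), scale(-1, mul(N[0][1], N[1][0])))
```

Here `NE` is the zero element of V², `adj_N` has `det_N` on its diagonal and the zero endomorphism elsewhere, and `det_N` is the zero endomorphism, matching the chain of equalities above.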
One additional fact that follows from this proof is that the matrix A whose
characteristic polynomial is taken need not be identical to the value φ substituted
into that polynomial; it suffices that φ be an endomorphism of V satisfying the initial
equations

$$\varphi(e_i) = \sum_{j=1}^{n} A_{j,i}\, e_j \quad \text{for } i = 1, \ldots, n$$

for some sequence of elements e1, ..., en that generate V (which space might
have smaller dimension than n, or, in case the ring R is not a field, might not be
a free module at all).
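
A small instance of this remark, constructed here purely for illustration (it does not come from the cited sources): take V one-dimensional, n = 2, generators e1 = e2 = 1, and φ multiplication by a scalar c. The initial equations then say that every column of A sums to c, and the conclusion p(φ) = 0 says that c is a root of the characteristic polynomial. The matrix `A` and the helper `char_poly_2x2` are assumptions of this sketch.

```python
def char_poly_2x2(A):
    """Return p as a function t -> det(t*I_2 - A) for a 2x2 matrix A."""
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return lambda t: t * t - trace * t + det


A = [[1, 2], [3, 2]]   # illustrative matrix whose columns both sum to 4
c = 4                  # phi is multiplication by c on V = R
e = [1, 1]             # two generators of the one-dimensional space V

# Check phi(e_i) = sum_j A[j][i] * e_j for i = 1, 2
relations_hold = all(
    c * e[i] == sum(A[j][i] * e[j] for j in range(2)) for i in range(2)
)

p = char_poly_2x2(A)   # p(t) = t^2 - 3t - 4
value = p(c)           # p(phi) acts on V as multiplication by p(c) = 0
```

Although V has dimension 1 and A is 2 × 2, the relations hold and p(c) vanishes, as the remark predicts.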

A bogus "proof": p(A) = det(AI_n − A) = det(A − A) = 0
One persistent elementary but incorrect argument[18] for the theorem is to
"simply" take the definition

$$p(\lambda) = \det(\lambda I_n - A)$$

and substitute A for λ, obtaining

$$p(A) = \det(A I_n - A) = \det(A - A) = 0.$$

There are many ways to see why this argument is wrong. First, in the
Cayley–Hamilton theorem, p(A) is an n × n matrix. However, the right-hand
side of the above equation is the value of a determinant, which is
a scalar. So they cannot be equated unless n = 1 (i.e. A is just a
scalar). Second, in the expression $\det(\lambda I_n - A)$, the variable λ actually
occurs at the diagonal entries of the matrix $\lambda I_n - A$. To illustrate,
consider the characteristic polynomial in the previous example again:

$$\det\begin{pmatrix} \lambda - 1 & -2 \\ -3 & \lambda - 4 \end{pmatrix}.$$

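If one substitutes the entire matrix A for λ in those positions, one obtains

$$\det\begin{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} - 1 & -2 \\ -3 & \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} - 4 \end{pmatrix},$$
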
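in which the "matrix" expression is simply not a valid one. Note,
however, that if scalar multiples of identity matrices instead of
scalars are subtracted in the above, i.e. if the substitution is
performed as

$$\det\begin{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} - I_2 & -2 I_2 \\ -3 I_2 & \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} - 4 I_2 \end{pmatrix},$$
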
then the determinant is indeed zero, but the expanded
matrix in question does not evaluate to $A I_n - A$; nor can
its determinant (a scalar) be compared to p(A) (a matrix).
So the argument that $p(A) = \det(A I_n - A) = 0$ still does not apply.

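The claim that the block substitution has determinant zero can be checked directly. The sketch below assumes the example matrix is A = (1 2; 3 4), builds the expanded 4 × 4 matrix whose blocks are δ_ij·A − A_ij·I_2, and evaluates its determinant by cofactor expansion; the function names are illustrative.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def block_substitution(A):
    """Expanded matrix whose (i, j) block is delta_ij * A - A[i][j] * I_2."""
    size = len(A)
    big = [[0] * (size * size) for _ in range(size * size)]
    for i in range(size):
        for j in range(size):
            for r in range(size):
                for s in range(size):
                    block_entry = (A[r][s] if i == j else 0) \
                        - (A[i][j] if r == s else 0)
                    big[size * i + r][size * j + s] = block_entry
    return big


A = [[1, 2], [3, 4]]          # assumed example matrix
B = block_substitution(A)     # a 4 x 4 matrix, not the 2 x 2 matrix A*I_2 - A
d = det(B)                    # a single scalar, here 0
```

The result `d` is the number 0, whereas p(A) is a 2 × 2 matrix, and `B` is a nonzero 4 × 4 matrix rather than A − A; so the vanishing determinant does not rescue the bogus argument.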
Actually, if such an argument held, it should also hold
when other multilinear forms instead of the determinant are
used. For instance, if we consider the permanent function
and define $q(\lambda) = \operatorname{perm}(\lambda I_n - A)$, then by the same argument, we
should be able to "prove" that q(A) = 0. But this statement
is demonstrably wrong: in the 2-dimensional case, for
instance, the permanent of a matrix is given by

$$\operatorname{perm}\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad + bc.$$

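So, for the matrix A in the previous example,

$$\begin{aligned}
q(\lambda) &= \operatorname{perm}(\lambda I_2 - A) = \operatorname{perm}\begin{pmatrix} \lambda - 1 & -2 \\ -3 & \lambda - 4 \end{pmatrix} \\
           &= (\lambda - 1)(\lambda - 4) + (-2)(-3) = \lambda^2 - 5\lambda + 10.
\end{aligned}$$

Yet one can verify that

$$q(A) = A^2 - 5A + 10 I_2 = 12 I_2 \neq 0.$$
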
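The permanent counterexample is easy to reproduce. This sketch assumes the example matrix is A = (1 2; 3 4) and uses the 2 × 2 formulas det(λI_2 − A) = λ² − (a + d)λ + (ad − bc) and perm(λI_2 − A) = λ² − (a + d)λ + (ad + bc); the helper names are illustrative.

```python
def mat_mul_2x2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def eval_monic_quadratic(c1, c0, A):
    """Return A^2 + c1*A + c0*I_2 for a 2x2 matrix A."""
    A2 = mat_mul_2x2(A, A)
    return [[A2[i][j] + c1 * A[i][j] + (c0 if i == j else 0)
             for j in range(2)] for i in range(2)]


A = [[1, 2], [3, 4]]   # assumed example matrix
(a, b), (c, d) = A

# p(t) = det(t*I_2 - A)  = t^2 - (a + d) t + (ad - bc) = t^2 - 5t - 2
p_of_A = eval_monic_quadratic(-(a + d), a * d - b * c, A)

# q(t) = perm(t*I_2 - A) = t^2 - (a + d) t + (ad + bc) = t^2 - 5t + 10
q_of_A = eval_monic_quadratic(-(a + d), a * d + b * c, A)

print(p_of_A)   # [[0, 0], [0, 0]]
print(q_of_A)   # [[12, 0], [0, 12]]
```

The same formal substitution "works" for neither polynomial; p(A) vanishes because of the Cayley–Hamilton theorem, not because of the substitution, while q(A) = 12I_2 does not vanish at all.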
One of the proofs of the Cayley–Hamilton
theorem above bears some similarity to the
argument that $p(A) = \det(A I_n - A) = 0$. By introducing a
matrix with non-numeric coefficients, one can
actually let A live inside a matrix entry, but
then $A I_n$ is not equal to A, and the
conclusion is reached differently.

Proofs using methods of abstract algebra
Basic properties of Hasse–Schmidt
derivations on the exterior algebra $\bigwedge M$ of
some B-module M (supposed to be free and of
finite rank) have been used by Gatto &
Salehyan (2016, §4) to prove the Cayley–Hamilton
theorem. See also Gatto & Scherbak
(2015).
