Cayley-Hamilton Theorem: A Simple, Natural Argument

SUMMARY

The discussion centers on the Cayley-Hamilton Theorem, which states that any square matrix A satisfies its own characteristic polynomial f(X) = det[XI-A]. Participants debate the validity of a proof using Lagrange's expansion formula for determinants and the classical adjoint. Key points include the distinction between evaluating polynomials with matrix coefficients and matrices with polynomial coefficients, as well as the implications of non-commutativity in algebra. The proof's simplicity is contrasted with more complex approaches found in standard texts by authors such as Artin and Bourbaki.

PREREQUISITES
  • Understanding of the Cayley-Hamilton Theorem
  • Familiarity with determinants and the classical adjoint
  • Knowledge of polynomial rings and matrix algebra
  • Concepts of non-commutative algebra
NEXT STEPS
  • Study the proof of the Cayley-Hamilton Theorem in various algebra texts
  • Explore the implications of non-commutativity in polynomial evaluations
  • Learn about the relationship between matrices and polynomials in algebra
  • Investigate Cramer’s Rule and its applications in linear algebra
USEFUL FOR

Mathematicians, algebra students, and educators interested in linear algebra, particularly those seeking a deeper understanding of the Cayley-Hamilton Theorem and its proofs.

mathwonk (Science Advisor, Homework Helper):
If A is an n by n matrix of constants from a commutative ring k, and I is the n by n identity matrix, then Lagrange's expansion formula for determinants implies that

adj[XI-A].[XI-A] = f(X).I,

where f(X) = det[XI-A] is the characteristic polynomial of A, and adj denotes the classical adjoint, whose entries are plus or minus the (n-1) by (n-1) minors of XI-A.

Since setting X = A makes the left side equal to zero, we also have f(A) = 0. QED for Cayley-Hamilton. (I.e. any square matrix A satisfies its own characteristic polynomial f(X) = det[XI-A].)
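as a concrete sanity check of the displayed identity (only the identity itself, not the substitution step that follows it), here is a minimal sympy sketch with one hypothetical 2 by 2 example:

```python
# minimal check of adj[XI-A].[XI-A] = f(X).I for one example matrix
import sympy as sp

X = sp.symbols('X')
A = sp.Matrix([[1, 2], [3, 4]])        # hypothetical example matrix
I = sp.eye(2)
M = X * I - A                          # XI - A, entries in k[X]
f = M.det()                            # characteristic polynomial f(X) = det[XI-A]

# the classical adjoint (adjugate) times the matrix equals det times identity
lhs = (M.adjugate() * M).applyfunc(sp.expand)
rhs = (f * I).applyfunc(sp.expand)
print(lhs == rhs)                      # True
print(sp.expand(f))                    # X**2 - 5*X - 2 for this example
```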


Q: is this a correct argument?

[hint: how could anything so simple and natural not be correct?]:biggrin:
 
I do not really understand what you're doing. How can you multiply the (n-1)x(n-1) matrix adj[XI-A] by the nxn matrix (XI-A)?
 
Never mind, adj[XI-A] is an nxn matrix. Now I understand your proof, and I think it's correct, yes.
 
if this simple proof, using only Cramer's rule and the remainder theorem, is correct, why do standard books (Artin, Bourbaki, Lang, Hungerford, Jacobson, van der Waerden, Rotman, Sah, Birkhoff-Mac Lane), and my own grad algebra notes, not give it?

these sources instead appeal to deeper results like decomposition of modules over PIDs, or Jordan form, or rational canonical form, or diagonalization, or existence of a partial decomposition into cyclic subspaces, or to tedious, unenlightening computations.:confused:
 
Actually, I do object to your argument. For square matrices we have adj(A).A = det(A).I, so indeed adj(XI-A).(XI-A) = det(XI-A).I; but how do you conclude that det(XI-A) = f(X) when X is a matrix, rather than an element of k? I am pretty sure that's not true in general.

By the same reasoning, here is my shorter proof: det(XI-A) = f(X), and putting X = A gives f(A) = det(0) = 0. That can't be correct.
 
puzzling, isn't it? hint: one is working in two different rings as one pleases here: namely polynomials with matrix coefficients, and matrices with polynomial coefficients. the calculation adj[XI-A].[XI-A] = f(X).I is usually justified by Cramer's rule in the ring of matrices with polynomial coefficients, but the substitution f(A) is done in the ring of polynomials with matrix coefficients. Are these rings isomorphic?
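for what it's worth, here is one way to see the isomorphism being asked about, sketched for n = 2: a matrix of polynomials regroups, degree by degree, into a polynomial with matrix coefficients:

```latex
% regrouping M_n(k[X]) --> (M_n(k))[X], illustrated for n = 2:
\begin{pmatrix} X^2 + 1 & 2X \\ 3 & X \end{pmatrix}
=
\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} X^2
+
\begin{pmatrix} 0 & 2 \\ 0 & 1 \end{pmatrix} X
+
\begin{pmatrix} 1 & 0 \\ 3 & 0 \end{pmatrix}
```

the regrouping respects products because X, viewed as the scalar matrix XI, is central; the trap is that substituting a particular matrix for X is a ring map out of (M_n(k))[X] only when that matrix commutes with all the coefficients.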


[actually the short proof above is essentially the one in Hefferon's free web notes, but he gives it as an unenlightening computation, without explaining what is going on. word for word the same proof appears in an old book by Evar Nering, so maybe he just copied it? at any rate he works, as does Nering, at times in one of the rings above and at times in the other, never saying why the arguments do not depend on the ring, i.e. never making explicit the isomorphism mentioned above.]

(your argument, which as you say is wrong, operates in the ring of matrices with matrix coefficients!):rolleyes:
 
remark: you also have to deal with non-commutativity issues. i.e. suppose f(X) is a polynomial and A is an element that does not commute with the coefficients of f. what does f(A) mean? is this a problem above?
 
here is the relevant non-commutative algebra: if f(X) is a polynomial with coefficients in a non-commutative ring, but X commutes with every element, then there are two meanings for f(A), where A is an element of the coefficient ring: one can evaluate f at A from the right or from the left.

I.e. one can take f(A) = a0 + a1 A + a2 A^2 + ... + an A^n, the right evaluation, or one can take

f(A) = a0 + A a1 + A^2 a2 + ... + A^n an, the left evaluation.

then the basic high school root-factor theorem has two versions: namely, if (X-A) divides f(X) from the left, then the left evaluation of f at A is zero, and similarly for right division.
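for completeness, here is a one-line check of the right-hand version, assuming only that X is central, writing q(X) = sum_i b_i X^i:

```latex
% right version: if f(X) = q(X)\,(X - A), with X central, then
f(X) \;=\; \sum_i b_i X^{\,i+1} \;-\; \sum_i (b_i A)\, X^{\,i},
\qquad\text{so}\qquad
f_{\mathrm{right}}(A) \;=\; \sum_i b_i A^{\,i+1} \;-\; \sum_i (b_i A)\, A^{\,i} \;=\; 0
```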

thus since one has the Cramer's rule identity adj(XI-A).(XI-A) = f(X).I, where f is the characteristic polynomial of A, (XI-A) divides f(X).I from the right, so f(A) = 0, where f(A) is the evaluation of f at A from the right. one also knows that (XI-A).adj(XI-A) = f(X).I, so also f(A) = 0 where f(A) is the evaluation from the left; and of course, since f has scalar coefficients, these two evaluations are the same.

check this out: this is to me the simplest, most elementary proof of Cayley-Hamilton, and it is very hard to find in books.
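here is a minimal sympy sketch of both points, with hypothetical 2 by 2 matrices: the two evaluations genuinely differ when A fails to commute with the coefficients, and right division by (X - A) kills the right evaluation:

```python
# left vs right evaluation of a polynomial with matrix coefficients
import sympy as sp

A  = sp.Matrix([[0, 1], [0, 0]])
B0 = sp.eye(2)
B1 = sp.Matrix([[1, 0], [1, 1]])       # B1 does not commute with A

# f(X) = B0 + B1*X: the two evaluations at A differ
right_eval = B0 + B1 * A               # a0 + a1*A
left_eval  = B0 + A * B1               # a0 + A*a1
print(right_eval == left_eval)         # False: evaluation depends on the side

# factor theorem, right version: take f(X) = q(X)*(X - A) with q(X) = B1,
# i.e. f(X) = -B1*A + B1*X; then the right evaluation of f at A vanishes
f0, f1 = -B1 * A, B1                   # coefficients of f(X) = f0 + f1*X
print(f0 + f1 * A == sp.zeros(2, 2))   # True
```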
 
