Cayley-Hamilton Theorem: A Simple, Natural Argument


Discussion Overview

The discussion revolves around the Cayley-Hamilton theorem, specifically exploring a proof that utilizes Lagrange's expansion formula for determinants and the properties of adjugate matrices. Participants examine the validity of the proof and discuss various mathematical concepts related to matrix theory, including polynomial evaluations and non-commutativity.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Mathematical reasoning

Main Points Raised

  • One participant presents a proof of the Cayley-Hamilton theorem using Lagrange's expansion formula, claiming that any square matrix satisfies its own characteristic polynomial.
  • Another participant questions the multiplication of matrices involved in the proof, initially expressing confusion but later acknowledging understanding and agreeing with the proof's correctness.
  • A different participant raises concerns about the absence of this proof in standard algebra texts, suggesting that these sources prefer more complex approaches involving module decomposition or canonical forms.
  • One participant objects to the argument presented, specifically challenging the assumption that the determinant of a matrix can be equated to the characteristic polynomial when the variable is a matrix rather than an element of the underlying field.
  • Another participant introduces the idea of working within different rings (polynomials with matrix coefficients vs. matrices with polynomial coefficients) and questions the isomorphism between these rings.
  • Further discussion includes the implications of non-commutativity in polynomial evaluations, raising questions about the meaning of evaluating polynomials at matrices that do not commute with their coefficients.
  • A participant elaborates on the non-commutative algebra relevant to the discussion, explaining how polynomial evaluations can differ based on the order of multiplication and how this relates to the proof of the Cayley-Hamilton theorem.

Areas of Agreement / Disagreement

Participants express a mix of agreement and disagreement regarding the proof's validity. While some find the proof convincing, others raise significant concerns about its assumptions and the mathematical framework used, indicating that multiple competing views remain unresolved.

Contextual Notes

Participants highlight limitations related to the assumptions made about the rings involved in the proof and the implications of non-commutativity, which remain unresolved within the discussion.

mathwonk
If A is an n by n matrix of constants from the commutative ring k, and I is the identity n by n matrix, then Lagrange's expansion formula for determinants implies that

adj[XI-A].[XI-A] = f(X).I, where f(X) = det[XI-A] is the characteristic polynomial of A, and adj denotes the classical adjoint, whose entries are + or - the (n-1) by (n-1) minors of XI-A.

Since setting X=A makes the left side equal to zero, we also have f(A) = 0. QED for Cayley-Hamilton (i.e. any square matrix A satisfies its own characteristic polynomial f(X) = det[XI-A]).


Q: is this a correct argument?

[hint: how could anything so simple and natural not be correct?]:biggrin:
 
I do not really understand what you're doing. How can you multiply the (n-1)x(n-1) adj[XI-A] matrix with the nxn (XI-A) matrix?
 
Never mind, adj[XI-A] is an nxn matrix. Now I understand your proof and I think it's correct, yes.
 
if this simple proof, using only Cramer's rule and the remainder theorem, is correct, why do standard books including Artin, Bourbaki, Lang, Hungerford, Jacobson, Van der Waerden, Rotman, Sah, Birkhoff-Mac Lane, and my own grad algebra notes, not give it?

these sources instead appeal to deeper results like decomposition of modules over PIDs, or Jordan form, or rational canonical form, or diagonalization, or existence of partial decomposition into cyclic subspaces, or tedious unenlightening computations. :confused:
 
Actually, I do object to your argument. For square matrices we have adj(A).A = det(A).I. So indeed adj(xI-A).(xI-A) = det(xI-A).I, but how do you conclude det(xI-A) = f(x) for x a matrix, instead of an element of k? I am pretty sure that's not true in general.

My shorter proof: det(xI-A)=f(x), putting x=A gives f(A)=det(0)=0. Can't be correct.
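
A small numerical sketch of the objection above, assuming 2x2 integer matrices stored as nested lists (the helper names and the particular matrices A and B are illustrative, not from the thread). It compares "substituting a matrix inside the determinant" with evaluating the polynomial f at that matrix:

```python
# All names here are hypothetical helpers for 2x2 matrices as nested lists.
def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def char_poly_at_matrix(A, B):
    """Evaluate f at the matrix B, where f(x) = det(xI - A) = x^2 - tr(A) x + det(A)."""
    tr, d = A[0][0] + A[1][1], det2(A)
    B2 = matmul(B, B)
    I = [[1, 0], [0, 1]]
    return [[B2[i][j] - tr * B[i][j] + d * I[i][j] for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [0, 0]]

# Reading "det(xI - A)" with x replaced by the matrix B yields a scalar ...
naive = det2([[B[i][j] - A[i][j] for j in range(2)] for i in range(2)])
# ... whereas f(B) is a matrix, and it is not naive * I.
f_B = char_poly_at_matrix(A, B)
# f(A) itself does vanish, as Cayley-Hamilton asserts.
f_A = char_poly_at_matrix(A, A)
```

For these particular matrices the two readings give different kinds of objects (a number versus a matrix), which is why det(xI-A) = f(x) cannot simply be used with x a matrix, even though f(A) = 0 still holds.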
 
puzzling, isn't it? hint: one is working in two different rings as one pleases here: namely polynomials with matrix coefficients, and matrices with polynomial coefficients. the calculation adj[XI-A].[XI-A] = f(X).I is usually justified by Cramer's rule, in the ring of matrices with polynomial coefficients. But the substitution f(A) is done in the ring of polynomials with matrix coefficients. Are these rings isomorphic?


[actually the short proof above is essentially the one in Hefferon's free web notes, but he gives it as an unenlightening computation, without explaining what is going on. word for word the same proof appears in an old book by Ivar Nering, so maybe he just copied it? at any rate he works, as does Nering, at times in one of the rings above, at times in the other, never saying why the arguments do not depend on the ring, i.e. never making clear the isomorphism mentioned above.]

(your argument which as you say is wrong, operates in the ring of matrices with matrix coefficients!):rolleyes:
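
The two rings mentioned in this post can be compared concretely. The sketch below assumes 2x2 matrices and polynomials stored as coefficient lists, lowest degree first; every function name is hypothetical. It regroups a matrix of polynomials as a polynomial with matrix coefficients and checks, on one example, that multiplying before or after regrouping gives the same result:

```python
def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def mul_matrix_of_polys(P, Q):
    """Product in the ring of 2x2 matrices with polynomial entries."""
    return [[poly_add(poly_mul(P[i][0], Q[0][j]), poly_mul(P[i][1], Q[1][j]))
             for j in range(2)] for i in range(2)]

def to_matrix_coeffs(P, degree):
    """Regroup as a list of constant matrices, one per power of X."""
    def coef(p, d):
        return p[d] if d < len(p) else 0
    return [[[coef(P[i][j], d) for j in range(2)] for i in range(2)]
            for d in range(degree + 1)]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mul_poly_of_matrices(F, G):
    """Product in the ring of polynomials with matrix coefficients (X central)."""
    out = [[[0, 0], [0, 0]] for _ in range(len(F) + len(G) - 1)]
    for i, M in enumerate(F):
        for j, N in enumerate(G):
            prod = matmul(M, N)
            out[i + j] = [[out[i + j][r][c] + prod[r][c] for c in range(2)]
                          for r in range(2)]
    return out

# Example with A = [[1, 2], [3, 4]]: XI - A and its adjugate, entry by entry.
XI_minus_A = [[[-1, 1], [-2]], [[-3], [-4, 1]]]
adjugate = [[[-4, 1], [2]], [[3], [-1, 1]]]

route_1 = to_matrix_coeffs(mul_matrix_of_polys(adjugate, XI_minus_A), 2)
route_2 = mul_poly_of_matrices(to_matrix_coeffs(adjugate, 1),
                               to_matrix_coeffs(XI_minus_A, 1))
```

Both routes give the coefficients of (X^2 - 5X - 2).I here. One example does not prove the isomorphism, but it shows what the regrouping map is and why products are expected to correspond.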
 
remark, you have to deal also with non commutativity issues. i.e. suppose f(x) is a polynomial and A is an element that does not commute with the coefficients of f. what does f(A) mean? is this a problem above?
 
here is the relevant non commutative algebra: if f(X) is a polynomial with coefficients in a non commutative ring, but X commutes with every element, then there are two meanings to f(A) for A an element of the ring of coefficients. one can evaluate f from the right or from the left, at A.

I.e. one can take a0 + a1 A + a2 A^2 +...+ an A^n, the right evaluation, or one can take

a0 + A a1 + A^2 a2 +...+A^n an, the left evaluation.

then the basic high school root-factor theorem has two versions: namely, if (X-A) divides f(X) from the left, then the left evaluation of f at A is zero, and similarly for right division.
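
The two evaluations and the left version of the root-factor theorem can be tried out numerically. This is only a sketch under stated assumptions: 2x2 integer matrices as nested lists, and arbitrarily chosen matrices A, Q0, Q1; the function names are hypothetical.

```python
def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(M, N):
    return [[M[i][j] + N[i][j] for j in range(2)] for i in range(2)]

def matneg(M):
    return [[-M[i][j] for j in range(2)] for i in range(2)]

def right_eval(coeffs, A):
    """a0 + a1 A + a2 A^2 + ... (powers of A on the right)."""
    total, power = [[0, 0], [0, 0]], [[1, 0], [0, 1]]
    for a in coeffs:
        total = matadd(total, matmul(a, power))
        power = matmul(power, A)
    return total

def left_eval(coeffs, A):
    """a0 + A a1 + A^2 a2 + ... (powers of A on the left)."""
    total, power = [[0, 0], [0, 0]], [[1, 0], [0, 1]]
    for a in coeffs:
        total = matadd(total, matmul(power, a))
        power = matmul(power, A)
    return total

A = [[1, 2], [3, 4]]
Q0 = [[0, 1], [0, 0]]
Q1 = [[1, 0], [1, 1]]

# f(X) = (X - A)(Q0 + Q1 X), so X - A divides f from the left.
# Expanding with X central: f0 = -A Q0, f1 = Q0 - A Q1, f2 = Q1.
f = [matneg(matmul(A, Q0)),
     matadd(Q0, matneg(matmul(A, Q1))),
     Q1]

left_value = left_eval(f, A)    # telescopes to the zero matrix
right_value = right_eval(f, A)  # need not vanish
```

With these choices the left evaluation is zero while the right evaluation is not, so the side on which (X-A) divides really does matter when the coefficients fail to commute with A.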

thus, since Cramer's rule gives adj(XI-A)(XI-A) = f(X).I, where f is the characteristic polynomial of A, XI-A divides f(X).I from the right, so f(A) = 0 where f(A) is the evaluation of f at A from the right. one also knows that (XI-A) adj(XI-A) = f(X).I, so also f(A) = 0 where f(A) is the evaluation from the left, and of course since f has scalar coefficients, these are the same. check this out. this is to me the simplest, most elementary proof of Cayley-Hamilton, and is very hard to find in books.
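
The argument above can be traced step by step on a single example. The sketch assumes a 2x2 integer matrix, for which adj(XI-A) = XI + (A - tr(A) I); the names are hypothetical and the check covers only this one matrix, not the general theorem.

```python
def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(M, N):
    return [[M[i][j] + N[i][j] for j in range(2)] for i in range(2)]

def scale(c, M):
    return [[c * M[i][j] for j in range(2)] for i in range(2)]

I2 = [[1, 0], [0, 1]]
A = [[1, 2], [3, 4]]
tr = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]

# adj(XI - A) = q0 + q1 X as a polynomial with matrix coefficients (2x2 case only).
q0 = matadd(A, scale(-tr, I2))
q1 = I2

# Coefficients of (q0 + q1 X)(X - A), with X commuting with everything:
# F0 = -q0 A, F1 = q0 - q1 A, F2 = q1.
F = [scale(-1, matmul(q0, A)),
     matadd(q0, scale(-1, matmul(q1, A))),
     q1]

# Cramer's rule predicts F = f(X).I with f(X) = X^2 - tr X + det.
predicted = [scale(det, I2), scale(-tr, I2), I2]

# Right evaluation at A: F0 + F1 A + F2 A^2, which telescopes to zero.
A2 = matmul(A, A)
f_of_A = matadd(F[0], matadd(matmul(F[1], A), matmul(F[2], A2)))
```

Since the coefficients F0, F1, F2 come out as scalar multiples of the identity, the right evaluation coincides with the ordinary f(A), which is the final step of the argument.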
 