Why Does the Cayley-Hamilton Theorem Seem Intuitively Obvious?

  • Thread starter: Benn
  • Tags: Hamilton Theorem
Benn
In my linear algebra course, we just finished proving the Cayley-Hamilton theorem (if p(x) = det(A - xI), then p(A) = 0).

The theorem seems obvious: if you plug A into p, you get det(A - AI) = det(0) = 0. But, of course, you can't do that (this is especially clear when you consider what A - xI looks like: x is a real number, and you can't put a matrix in its place entry by entry).

Is there any way to salvage the idea of just plugging in A? Or is it just a coincidence that it seems so obvious?
 
We know p(x) is the characteristic polynomial of A. The meaning of p(A) is to substitute A into the characteristic polynomial itself, not into the formula det(A - xI). Also, note that the 0 in the expression p(A) = 0 isn't the scalar 0; it's the zero matrix.
 
Yes, thank you.

I understand that. But I wasn't sure if it was just a coincidence that the theorem seemed so obvious when we considered the characteristic polynomial to be det (A-xI), or if there was some way to make rigorous the idea of plugging in A.
 
Personally, I find the notation p(A) fairly loose in terms of rigor, since p as a function is defined with the reals as its domain. The most rigor you can bring to plugging in A is to define exactly what it means, namely substituting A into the expanded characteristic polynomial.
 
If ##p(x) = c_{n}x^{n} + \dots + c_{1}x + c_{0}## where ##p## is defined from ##\mathbb{R}## to ##\mathbb{R}## and ##c_{i} \in \mathbb{R}##, then ##p: \{ m \times m \text{ matrices} \} \rightarrow \{ m \times m \text{ matrices} \}## is defined by ##p(A) = c_{n}A^{n} + \dots + c_{1}A + c_{0}I##. ... I'm completely happy with that.
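
To make that definition concrete, here is a minimal pure-Python sketch that evaluates ##p(A) = c_{n}A^{n} + \dots + c_{1}A + c_{0}I## by Horner's rule and checks it on one 2x2 example. The helper names (`mat_mul`, `poly_at_matrix`, `char_poly_2x2`) and the sample matrix are assumptions chosen for illustration; for a 2x2 matrix, det(A - xI) expands to x^2 - tr(A) x + det(A).

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, A):
    """Evaluate c_n A^n + ... + c_1 A + c_0 I, with coeffs = [c_n, ..., c_0].

    Horner's rule: multiply the running result by A, then add c * I.
    """
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += c  # adding c * I only touches the diagonal
    return result


def char_poly_2x2(A):
    """Coefficients of det(A - xI) = x^2 - tr(A) x + det(A) for a 2x2 matrix."""
    (a, b), (c, d) = A
    return [1, -(a + d), a * d - b * c]


A = [[1, 2], [3, 4]]            # illustrative matrix, not from the thread
coeffs = char_poly_2x2(A)       # x^2 - 5x - 2
p_of_A = poly_at_matrix(coeffs, A)
print(coeffs)                   # [1, -5, -2]
print(p_of_A)                   # [[0, 0], [0, 0]]
```

Note that the result is the 2x2 zero matrix, not the scalar 0 that the naive substitution det(A - AI) would produce.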

I must not have been clear in my question. I'm asking for a proof using the idea of 'plugging A into det(A - xI)', or an explanation of why there isn't one, not a clarification of the statement of the theorem.
 
Well, for one thing, p(A) is a matrix (that happens to have all entries = 0), while det(I.A - A) is a scalar (that happens to equal zero).

More generally, suppose we have p(x) = det(xB + C) = det(Bx + C) for n x n matrices B and C, and suppose we happen to have p(A) = 0. Does it follow that det(AB + C) = 0 or that det(BA + C) = 0? Conversely, if either det(AB + C) = 0 or det(BA + C) = 0, does it follow that p(A) = 0? I am not 100% sure of the answers, but I have my doubts that any of the answers is "yes". (If this turns out to be right, then it would be a sheer accident that you happen to get the correct result that p(A) = 0 in the special case that B = I and C = -A.)
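
One way to probe these questions is to try small cases. The sketch below is only an illustration under stated assumptions: it is restricted to 2x2 matrices, the helper names are hypothetical, and the matrices B = I, C = -diag(1, 2), A1 = diag(2, 1), A2 = diag(1, 3) are chosen here and do not come from the thread. With these choices p(x) = det(xB + C) = x^2 - 3x + 2.

```python
def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def mat_mul(X, Y):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_add(X, Y):
    """Entrywise sum of two 2x2 matrices."""
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]


def pencil_coeffs(B, C):
    """Coefficients [c2, c1, c0] of p(x) = det(xB + C) for 2x2 B and C."""
    c1 = (B[0][0] * C[1][1] + C[0][0] * B[1][1]
          - B[0][1] * C[1][0] - C[0][1] * B[1][0])
    return [det2(B), c1, det2(C)]


def poly_at_matrix(coeffs, A):
    """Evaluate c2 A^2 + c1 A + c0 I by Horner's rule."""
    result = [[0, 0], [0, 0]]
    for c in coeffs:
        result = mat_mul(result, A)
        result[0][0] += c
        result[1][1] += c
    return result


B = [[1, 0], [0, 1]]
C = [[-1, 0], [0, -2]]
p = pencil_coeffs(B, C)              # [1, -3, 2], i.e. (x - 1)(x - 2)

A1 = [[2, 0], [0, 1]]                # p(A1) = 0, yet det(A1 B + C) != 0
p_A1 = poly_at_matrix(p, A1)
det_A1 = det2(mat_add(mat_mul(A1, B), C))

A2 = [[1, 0], [0, 3]]                # det(A2 B + C) = 0, yet p(A2) != 0
p_A2 = poly_at_matrix(p, A2)
det_A2 = det2(mat_add(mat_mul(A2, B), C))

print(p_A1, det_A1)                  # [[0, 0], [0, 0]] -1
print(p_A2, det_A2)                  # [[0, 0], [0, 2]] 0
```

In this particular example neither implication holds, which is consistent with the doubts above: p(A) = 0 and det(AB + C) = 0 are different conditions in general. Since B = I here, AB = BA, so the same counts for det(BA + C).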

RGV
 