Why Does the Cayley-Hamilton Theorem Seem Intuitively Obvious?

Benn · Nov 10, 2012

In my linear algebra course, we just finished proving the Cayley-Hamilton theorem (if p(x) = det(A - xI), then p(A) = 0).

The theorem seems obvious: if you plug A into p, you get det(A - AI) = det(0) = 0. But, of course, you can't do that (this is especially clear when you consider what A - xI looks like: x sits in the diagonal entries, and you can't subtract a matrix from a real number).

Is there any way to salvage the idea of just plugging in A? Or is it just a coincidence that it seems so obvious?

bins4wins · Nov 10, 2012

Benn said:

In my linear algebra course, we just finished proving the Cayley-Hamilton theorem (if p(x) = det(A - xI), then p(A) = 0).

The theorem seems obvious: if you plug A into p, you get det(A - AI) = det(0) = 0. But, of course, you can't do that (this is especially clear when you consider what A - xI looks like: x sits in the diagonal entries, and you can't subtract a matrix from a real number).

Is there any way to salvage the idea of just plugging in A? Or is it just a coincidence that it seems so obvious?

We know p(x) is the characteristic polynomial of A. The meaning of p(A) is to substitute A into the expanded characteristic polynomial, not into the formula det(A - xI). Also, note that the 0 in the expression p(A) = 0 isn't the scalar 0; it's the zero matrix.

Benn · Nov 10, 2012

bins4wins said:

We know p(x) is the characteristic polynomial of A. The meaning of p(A) is to substitute A into the expanded characteristic polynomial, not into the formula det(A - xI). Also, note that the 0 in the expression p(A) = 0 isn't the scalar 0; it's the zero matrix.

Yes, thank you.

I understand that. But I wasn't sure if it was just a coincidence that the theorem seemed so obvious when we considered the characteristic polynomial to be det (A-xI), or if there was some way to make rigorous the idea of plugging in A.

bins4wins · Nov 10, 2012

Personally, I find the notation p(A) fairly loose in terms of rigor, since p, as a function, has the reals as its domain. The most rigor you can put into plugging in A is to define exactly what it means: substitute A into the expanded characteristic polynomial expression.

Benn · Nov 10, 2012

bins4wins said:

Personally, I find the notation p(A) fairly loose in terms of rigor, since p, as a function, has the reals as its domain. The most rigor you can put into plugging in A is to define exactly what it means: substitute A into the expanded characteristic polynomial expression.

If ##p(x) = c_{n}x^{n} + \dots + c_{1}x + c_{0}## where ##p## is defined from ##\mathbb{R}## to ##\mathbb{R}## and ##c_{i} \in \mathbb{R}##, then ##p: \{ m \times m \text{ matrices} \} \rightarrow \{ m \times m \text{ matrices} \}## is defined by ##p(A) = c_{n}A^{n} + \dots + c_{1}A + c_{0}I##. ... I'm completely happy with that.

I must not have been clear in my question. I'm asking for a proof using the idea of 'plugging A into det(A - xI)', or an explanation of why there isn't one, not a clarification of the statement of the theorem.
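The definition above can be checked numerically on a small case. The following is only a sketch in plain Python for a 2x2 integer matrix; the helper names (`poly_at_matrix`, `char_poly_2x2`, and so on) and the sample matrix are illustrative assumptions, not anything from the course or the thread. It uses the 2x2 expansion det(A - xI) = x^2 - tr(A) x + det(A).

```python
# Sketch: evaluate p(A) = c_n A^n + ... + c_1 A + c_0 I for a square integer matrix.
# All helper names here are hypothetical, chosen only for this illustration.

def identity(m):
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]

def mat_mul(X, Y):
    m = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def mat_add(X, Y):
    m = len(X)
    return [[X[i][j] + Y[i][j] for j in range(m)] for i in range(m)]

def scale(c, X):
    return [[c * entry for entry in row] for row in X]

def poly_at_matrix(coeffs, A):
    """coeffs = [c_0, c_1, ..., c_n]; returns c_0 I + c_1 A + ... + c_n A^n (Horner's rule)."""
    m = len(A)
    result = [[0] * m for _ in range(m)]
    for c in reversed(coeffs):
        result = mat_add(mat_mul(result, A), scale(c, identity(m)))
    return result

def char_poly_2x2(A):
    """Coefficients [c_0, c_1, c_2] of det(A - xI) = x^2 - tr(A) x + det(A), 2x2 only."""
    (a, b), (c, d) = A
    return [a * d - b * c, -(a + d), 1]

A = [[1, 2], [3, 4]]
coeffs = char_poly_2x2(A)          # [-2, -5, 1], i.e. p(x) = x^2 - 5x - 2
print(poly_at_matrix(coeffs, A))   # [[0, 0], [0, 0]]
```

Note that the constant term is multiplied by the identity matrix, which is exactly the ##c_{0}I## in the definition; this confirms the theorem for one matrix but says nothing about why it holds.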

Ray Vickson · Nov 10, 2012

Benn said:

If ##p(x) = c_{n}x^{n} + \dots + c_{1}x + c_{0}## where ##p## is defined from ##\mathbb{R}## to ##\mathbb{R}## and ##c_{i} \in \mathbb{R}##, then ##p: \{ m \times m \text{ matrices} \} \rightarrow \{ m \times m \text{ matrices} \}## is defined by ##p(A) = c_{n}A^{n} + \dots + c_{1}A + c_{0}I##. ... I'm completely happy with that.

I must not have been clear in my question. I'm asking for a proof using the idea of 'plugging A into det(A - xI)', or an explanation of why there isn't one, not a clarification of the statement of the theorem.

Well, for one thing, p(A) is a matrix (that happens to have all entries equal to 0), while det(IA - A) is a scalar (that happens to equal zero).
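The scalar-versus-matrix mismatch shows up clearly if the same "plug in A" reasoning is tried on the trace instead of the determinant. This comparison is an illustration added here, not something from the thread: take q(x) = tr(A - xI) = tr(A) - m x for an m x m matrix A. A minimal sketch in plain Python, with an arbitrary sample matrix:

```python
# Sketch: the naive substitution applied to q(x) = tr(A - xI) = tr(A) - m*x.
# The trace analogue and the sample matrix are assumptions made for illustration.

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

A = [[1, 2], [3, 4]]
m = len(A)

# Naive substitution: form A - A*I first, then take the trace. The result is a scalar.
zero = [[A[i][j] - A[i][j] for j in range(m)] for i in range(m)]
naive = trace(zero)

# Proper evaluation of the polynomial q at A: q(A) = tr(A)*I - m*A, a matrix.
q_of_A = [[(trace(A) if i == j else 0) - m * A[i][j] for j in range(m)]
          for i in range(m)]

print(naive)    # 0
print(q_of_A)   # [[3, -4], [-6, -3]]
```

The naive substitution gives 0 here too, yet q(A) is not the zero matrix, so "substitute A and get 0" cannot by itself be a valid argument.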

More generally, suppose we have p(x) = det(xB + C) = det(Bx + C) for n×n matrices B and C, and suppose we happen to have p(A) = 0. Does it follow that det(AB + C) = 0 or that det(BA + C) = 0? Conversely, if either det(AB + C) = 0 or det(BA + C) = 0, does it follow that p(A) = 0? I am not 100% sure of the answers, but I doubt that any of the answers is "yes". (If this turns out to be right, then it would be a sheer accident that you happen to get the correct result that p(A) = 0 in the special case that B = -I and C = A.)
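Both questions can be probed with small examples. The sketch below, in plain Python, assumes the particular choice B = I and C = -M for a 2x2 matrix M (so p(x) = det(xI - M) = x^2 - tr(M) x + det(M)); the matrices and helper names are illustrative choices, not from the thread.

```python
# Sketch with B = I and C = -M (an illustrative choice), so that
# p(x) = det(xB + C) = det(xI - M) = x^2 - tr(M) x + det(M) for a 2x2 matrix M.

def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def p_at(M, A):
    """Return the matrix p(A) = A^2 - tr(M) A + det(M) I, where p(x) = det(xI - M)."""
    tr, dt = M[0][0] + M[1][1], det2(M)
    A_sq = mat_mul(A, A)
    return [[A_sq[i][j] - tr * A[i][j] + (dt if i == j else 0) for j in range(2)]
            for i in range(2)]

def det_naive(M, A):
    """Return det(AB + C) = det(A - M); this equals det(BA + C) because B = I."""
    return det2([[A[i][j] - M[i][j] for j in range(2)] for i in range(2)])

# Case 1: p(A) is the zero matrix, but det(AB + C) is not 0.
M1, A1 = [[1, 0], [0, 2]], [[2, 0], [0, 1]]
# Case 2: det(AB + C) is 0, but p(A) is not the zero matrix.
M2, A2 = [[0, 0], [0, 0]], [[1, 0], [0, 0]]

print(p_at(M1, A1), det_naive(M1, A1))   # [[0, 0], [0, 0]] -1
print(p_at(M2, A2), det_naive(M2, A2))   # [[1, 0], [0, 0]] 0
```

In these two cases neither implication holds, which is consistent with the doubts expressed above.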

RGV
