I forgot to mention there is a geometrical way of looking at this. Suppose the matrix is diagonal, with scalars on the diagonal; then each a_ii stretches any object that the matrix acts on in the i direction. For example, if a_11 is 5, then any object acted on by the matrix will be stretched by a factor of 5 in the x direction, if you are using an x,y,z coordinate system. In a diagonal matrix the eigenvalues are precisely the entries on the diagonal.
I completely left out the geometric intuition I sought to express - stretching in the x direction, then the y, is the same as stretching in the y direction, then the x - the two operations commute, and so do any matrices associated with the operations.
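To make the stretching picture concrete, here is a minimal sketch in plain Python. The helper `matmul` and the names `stretch_x` and `stretch_y` are my own, and the factors 5 and 2 are just example values: it stretches in x and in y, in both orders, and compares the results.

```python
def matmul(P, Q):
    """Multiply two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Example values only: stretch by 5 along x, by 2 along y.
stretch_x = [[5, 0], [0, 1]]
stretch_y = [[1, 0], [0, 2]]

xy = matmul(stretch_y, stretch_x)  # stretch in x first, then in y
yx = matmul(stretch_x, stretch_y)  # stretch in y first, then in x

print(xy == yx)  # True
print(xy)        # [[5, 0], [0, 2]]
```

Either order gives the same diagonal matrix, with the two stretch factors sitting on the diagonal.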
More generally: two diagonalizable matrices, A and B, commute if and only if they are "simultaneously diagonalizable": that is, if there exists some invertible matrix M such that M A M^{-1} = D_1 and M B M^{-1} = D_2, where D_1 and D_2 are diagonal matrices (if A and B are not diagonalizable, one has to work with Jordan Normal Form instead, and the statement is more delicate).
It follows then that A = M^{-1} D_1 M and B = M^{-1} D_2 M.
Then AB = M^{-1} D_1 M M^{-1} D_2 M = M^{-1} D_1 D_2 M
= M^{-1} D_2 D_1 M (since all diagonal matrices commute)
= M^{-1} D_2 M M^{-1} D_1 M = BA.
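The computation above can be checked on a small example. This is only a rough sketch: the particular M, D1 and D2 are arbitrary choices of mine, and `matmul` and `inverse_2x2` are hypothetical helpers written for this post. It uses exact fractions so that the comparison is not affected by rounding.

```python
from fractions import Fraction

def matmul(P, Q):
    """Multiply two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_2x2(M):
    """Exact inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("M is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

# Arbitrary example data (assumptions, not from the thread).
M = [[2, 1], [1, 1]]      # invertible, determinant 1
D1 = [[3, 0], [0, 7]]     # diagonal
D2 = [[-1, 0], [0, 4]]    # diagonal

M_inv = inverse_2x2(M)
A = matmul(matmul(M_inv, D1), M)  # A = M^{-1} D1 M
B = matmul(matmul(M_inv, D2), M)  # B = M^{-1} D2 M

# Neither A nor B is diagonal, yet they commute.
print(matmul(A, B) == matmul(B, A))  # True
```

Note this only illustrates one direction: matrices built this way commute. It says nothing about the converse.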
It wasn't a proof, just geometric intuition, given by the fact that it doesn't matter which order something is stretched in - the final result will be the same, and the fact that diagonal matrices stretch any object they act on, by the degree given by the ith eigenvalue, in the ith direction. There is an excellent geometric discussion of how eigenvalues and eigenvectors act on an object available here: http://hverrill.net/courses/linalg/linalg8.html
I don't have anything better to add to it.
Ok... I can prove that if A is not defective then A and B commute iff they are simultaneously diagonalizable. Does anyone know how to do it when both are defective?
Here's a sketch of what I have so far:
Assume A is not defective and finite dimensional. Then choose a basis that diagonalizes A.
The (i,k)-th entry of AB is A_ii B_ik.
The (i,k)-th entry of BA is B_ik A_kk.
So AB = BA iff, for all (i, k), B_ik = 0 or A_ii = A_kk.
If A_ii = A_jj, then A is a scalar multiple of the identity on the subspace spanned by the i-th and j-th basis vectors, so we can replace these two vectors with basis vectors that diagonalize B over that subspace.
By repeating this process, we can produce a basis that simultaneously diagonalizes both A and B.
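The entrywise condition in the sketch can be tried out directly. Below is a small illustration in plain Python, assuming A is already diagonal. The function names and the example matrices are made up for this post, and indices are 0-based in the code. It compares "AB = BA" against "B_ik = 0 or A_ii = A_kk for all (i, k)".

```python
def matmul(P, Q):
    """Multiply two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutes(A, B):
    return matmul(A, B) == matmul(B, A)

def entry_condition(A, B):
    """For diagonal A: check B[i][k] == 0 or A[i][i] == A[k][k] for all (i, k)."""
    n = len(A)
    return all(B[i][k] == 0 or A[i][i] == A[k][k]
               for i in range(n) for k in range(n))

# A is diagonal with a repeated eigenvalue (2, 2) and a distinct one (5).
A = [[2, 0, 0], [0, 2, 0], [0, 0, 5]]

# Nonzero entries only where the corresponding diagonal entries of A agree.
B_good = [[1, 3, 0], [4, 1, 0], [0, 0, 9]]

# Same, except entry (0, 2) is nonzero although A[0][0] != A[2][2].
B_bad = [[1, 3, 6], [4, 1, 0], [0, 0, 9]]

print(commutes(A, B_good), entry_condition(A, B_good))  # True True
print(commutes(A, B_bad), entry_condition(A, B_bad))    # False False
```

In the first case B only mixes the two basis vectors on which A acts as the same scalar, which is exactly the subspace where the basis can still be changed without disturbing A.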
I don't know what to do if A is defective or infinite dimensional. (though I admit not having taken a crack at modifying the above proof to use transfinite induction to tackle the infinite dimensional case)
That proof should work for the infinite dimensional case as is, because it already uses induction in repeatedly replacing basis vectors, two by two, with ones that diagonalize B on the corresponding subspace, as long as A is not defective. The proof when both A and B are defective is not easy. Good luck.