I'm trying to find a way to simplify a complicated proof. The worst step of the proof involves a product of five 4×4 matrices. I'm hoping, perhaps naively, that if I could understand why the result of this operation is so simple, I may be able to explain the proof to others without actually doing the matrix multiplication.
The theorem is a statement about subgroups of ##\mathrm{GL}(\mathbb R^4)##. I will denote the standard basis for ##\mathbb R^4## by ##\{e_0,e_1,e_2,e_3\}##, and label rows and columns of matrices from 0 to 3. (This convention is often used in relativity.) The goal is to find all groups ##G\subset\mathrm{GL}(\mathbb R^4)## such that the subgroup of G that consists of all matrices of the form
$$\begin{pmatrix}* & * & * & *\\ 0 & * & * & *\\ 0 & * & * & *\\ 0 & * & * & *\end{pmatrix}$$ is equal to the group of all matrices of the form
$$\begin{pmatrix}1 & 0 & 0 & 0\\ 0 & & & \\ 0 & & R & \\ 0 & & & \end{pmatrix}$$ where R is a member of SO(3). I will use the notation ##U(R)## for such a matrix. The theorem says that this assumption implies that G is either the group of Galilean boosts or the group of Lorentz boosts.
I will describe some of the key steps of the proof. The first step is the observation that for all ##\Lambda\in G##, there exist ##R,R'\in\mathrm{SO}(3)## such that the 20,30,21,31 components of ##\Lambda## are all zero. (This is the lower left 2×2 corner). This isn't particularly hard. If we write
$$\Lambda=\begin{pmatrix}a & b^T\\ c & D\end{pmatrix}$$ where a is a number, b,c are 3×1 matrices, and D is a 3×3 matrix, we have
$$U(R)\Lambda U(R') =\begin{pmatrix}a & b^TR'\\ Rc & RDR'\end{pmatrix}.$$ So all we need to do is to choose R,R' such that the last two rows of R are orthogonal to c, and the first column of R' is orthogonal to the last two rows of RD.
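For anyone who wants to see the block formula in action without grinding through it by hand, here is a quick numerical sanity check in Python (standard library only; the helper names `matmul`, `U`, `rot12` and all the numerical entries are made up for illustration):

```python
import math

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def U(R):
    """Embed a 3x3 rotation R as the 4x4 block matrix diag(1, R)."""
    return [[1, 0, 0, 0]] + [[0] + row for row in R]

def rot12(t):
    """Rotation by angle t in the 1-2 plane (a member of SO(3))."""
    return [[math.cos(t), -math.sin(t), 0],
            [math.sin(t),  math.cos(t), 0],
            [0, 0, 1]]

# An arbitrary Lambda = [[a, b^T], [c, D]] with made-up entries.
a = 2.0
b = [0.3, -1.1, 0.7]
c = [1.2, 0.5, -0.4]
D = [[1.0, 0.2, 0.0],
     [0.1, 0.9, 0.3],
     [0.4, 0.0, 1.1]]
Lam = [[a] + b] + [[c[i]] + D[i] for i in range(3)]

R, Rp = rot12(0.4), rot12(-1.3)
lhs = matmul(matmul(U(R), Lam), U(Rp))

# Check the block formula: the 00 entry a is untouched, the first
# column below it is Rc, and the first row right of it is b^T R'.
Rc = [sum(R[i][k] * c[k] for k in range(3)) for i in range(3)]
bR = [sum(b[k] * Rp[k][j] for k in range(3)) for j in range(3)]
assert abs(lhs[0][0] - a) < 1e-12
assert all(abs(lhs[i + 1][0] - Rc[i]) < 1e-12 for i in range(3))
assert all(abs(lhs[0][j + 1] - bR[j]) < 1e-12 for j in range(3))
```

The point of the block form is visible here: the scalar a is invariant under the whole ##U(R)XU(R')## action, and the column c and row ##b^T## transform independently of each other.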
The next step is to show that if the lower left is all zeroes, then so is the upper right. This is simple once you have seen the trick. (I will post it on request). The proof then focuses on the subgroup that consists of matrices such that the lower left and upper right are both all zeroes. It contains a matrix of the form
$$M=\begin{pmatrix} a & b & 0 & 0\\ -vd & d & 0 & 0\\ 0 & 0 & k & 0\\ 0 & 0 & 0 & \pm k\end{pmatrix}.$$ This is where things get interesting (because it gets so complicated that you can fill many pages with nothing but matrix multiplication). Define
$$F(t)=\begin{pmatrix}1 & 0 & 0 & 0\\ 0 & \cos t & -\sin t & 0\\ 0 & \sin t & \cos t & 0\\ 0 & 0 & 0 & 1\end{pmatrix}$$ Now the idea is to consider the matrix ##M^{-1}F(t)M##, which doesn't have the pretty form that M does, and then bring it to the pretty form by a transformation ##X\mapsto U(R)XU(R')##. This is where the magic happens. We choose R,R' in the simplest possible way to ensure that the lower left of ##U(R)M^{-1}F(t)MU(R')## will be all zeroes, and the result turns out to be even simpler than M! This is the magic I would like to be able to explain.
It turns out that ##U(R)M^{-1}F(t)MU(R')## is of the form
$$\begin{pmatrix}f & g & 0 & 0\\ -wf & f & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\end{pmatrix}.$$ Now the question is, how did this happen? How did we get rid of the lower left diagonal elements and why are the upper left diagonal elements equal?
I've been doing the matrix multiplication in Mathematica. (I will post the code on request). Its result for ##M^{-1}F(t)M## is
$$\left(
\begin{array}{cccc}
\frac{a+b v \cos (t)}{a+b v} & \frac{b-b \cos (t)}{a+b v} & \frac{b k \sin (t)}{a d+b v d} & 0 \\
\frac{2 a v \sin ^2\left(\frac{t}{2}\right)}{a+b v} & \frac{b v+a \cos (t)}{a+b v} & -\frac{a k \sin (t)}{a d+b v d} & 0 \\
-\frac{d v \sin (t)}{k} & \frac{d \sin (t)}{k} & \cos (t) & 0 \\
0 & 0 & 0 & 1
\end{array}
\right)$$ If we write this as
$$\begin{pmatrix}r & q^T\\ p & S\end{pmatrix},$$ we have
$$U(R)M^{-1}F(t)MU(R') =\begin{pmatrix}r & q^TR'\\ Rp & RSR'\end{pmatrix}.$$ Since p is in the 1-2 plane, we choose R to be a rotation in the 1-2 plane. And then we can choose R' to be a rotation in the 1-2 plane as well, since all we're trying to do is to "zero out" the lower left. When R and R' are chosen this way, we end up with the very simple result above.
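The whole construction can be checked numerically. The sketch below (Python, standard library only; the values of a, b, v, d, k, t are arbitrary made-up numbers, taking the +k sign in M) builds M and F(t), computes ##M^{-1}F(t)M##, spot-checks two entries against the Mathematica output above, and then finds the two 1-2 plane rotation angles explicitly:

```python
import math

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Made-up sample values; nothing below depends on the specific numbers.
a, b, v, d, k, t = 1.3, 0.4, 0.7, 1.1, 0.9, 0.8

M = [[a, b, 0, 0],
     [-v * d, d, 0, 0],
     [0, 0, k, 0],
     [0, 0, 0, k]]
# M is block diagonal, so its inverse can be written down directly;
# the upper-left 2x2 block has determinant d*(a + b*v).
det = d * (a + b * v)
Minv = [[d / det, -b / det, 0, 0],
        [v * d / det, a / det, 0, 0],
        [0, 0, 1 / k, 0],
        [0, 0, 0, 1 / k]]
F = [[1, 0, 0, 0],
     [0, math.cos(t), -math.sin(t), 0],
     [0, math.sin(t), math.cos(t), 0],
     [0, 0, 0, 1]]

N = matmul(matmul(Minv, F), M)  # N = M^{-1} F(t) M

# Spot-check two entries against the closed form from Mathematica.
assert abs(N[0][0] - (a + b * v * math.cos(t)) / (a + b * v)) < 1e-12
assert abs(N[2][0] - (-d * v * math.sin(t) / k)) < 1e-12
# p = (N[1][0], N[2][0], N[3][0]) lies in the 1-2 plane:
assert abs(N[3][0]) < 1e-12

def U12(s):
    """U(R) for R a rotation by angle s in the 1-2 plane."""
    return [[1, 0, 0, 0],
            [0, math.cos(s), -math.sin(s), 0],
            [0, math.sin(s), math.cos(s), 0],
            [0, 0, 0, 1]]

# Choose R to rotate p onto the 1-axis, zeroing components 20 and 30.
theta = -math.atan2(N[2][0], N[1][0])
A = matmul(U12(theta), N)
# Choose R' so that component 21 of the final product vanishes;
# component 31 then vanishes automatically, because the last row of
# U(R)N is already (0, 0, 0, 1).
psi = math.atan2(-A[2][1], A[2][2])
final = matmul(A, U12(psi))

# The lower-left 2x2 corner (components 20, 30, 21, 31) is now zero.
assert all(abs(final[i][j]) < 1e-9 for i in (2, 3) for j in (0, 1))
```

One thing this makes explicit is why rotations in the 1-2 plane suffice: the last row and column of ##M^{-1}F(t)M## are already ##(0,0,0,1)##, so a single orthogonality condition determines R', and the remaining conditions hold for free.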
The theorem (stated in a very awkward way) and its proof (sans matrix multiplication details) can, as far as I know, only be found in this book. Unfortunately, it's not possible to view all the pages, so I had to go to a library to check it out. I think I understand the proof well enough to answer questions about it.
One final comment: the raw result of the computation doesn't actually have all zeroes in the upper right. Instead, we use the fact that those entries must be zero (by the lemma that says that if the lower left is all zeroes, then so is the upper right) to determine a relationship between the variables, and use that to further simplify the nonzero components.