# Possible mistake in an article (rotations and boosts).

1. Jan 7, 2013

### Fredrik

Staff Emeritus
This is a linear algebra question, but it's about an article about Minkowski spacetime, so I think it's appropriate to post it here. The article is The rich structure of Minkowski space by Domenico Giulini. The detail I'm asking about is at the top of page 16.

The article describes the 3+1-dimensional version of the "nothing but relativity" argument that's been discussed here in a few threads recently. (The idea is to prove that the group of functions that make a coordinate change from one global inertial coordinate system to another is either the group of Galilean boosts or the Lorentz group. So at the start, we do not assume that spacetime is Minkowski spacetime. We just assume that spacetime is some structure with underlying set $\mathbb R^4$.)

The article assumes that the group has two subgroups, one corresponding to rotations, and one corresponding to boosts. The rotations are 4×4 matrices
$$R(D)=\begin{pmatrix}1 & 0\\ 0 & D\end{pmatrix},$$ where the zeroes are a 3×1 matrix and a 1×3 matrix, and D is a member of SO(3). The boosts can be expressed as a function of velocity, and the relationship between boosts and rotations is assumed to be
$$B(Dv)=R(D)B(v)R(D^{-1}).$$ Now the author claims that by choosing v to be a multiple of $e_1$, and D to be an arbitrary rotation around the 1 axis, we can see that
$$B(v)=\begin{pmatrix}A & 0\\ 0 & \alpha I\end{pmatrix},$$ where A is a 2×2 matrix, I is the 2×2 identity matrix, and $\alpha$ is a real number. This result looks wrong to me. I want to know if I'm missing something. So here's my argument:

First write
$$B(v)=\begin{pmatrix}K & L\\ M & N\end{pmatrix}.$$ We have
$$R(D)=\begin{pmatrix}I & 0\\ 0 & D'\end{pmatrix},$$ where I is the 2×2 identity matrix and D' is a member of SO(2). So $$R(D^{-1})=\begin{pmatrix}I & 0\\ 0 & D'^{-1}\end{pmatrix}$$ and
\begin{align}\begin{pmatrix}K & L\\ M & N\end{pmatrix} &=B(v)=B(Dv)=R(D)B(v)R(D^{-1})\\
&=\begin{pmatrix}I & 0\\ 0 & D'\end{pmatrix}\begin{pmatrix}K & L\\ M & N\end{pmatrix} \begin{pmatrix}I & 0\\ 0 & D'^{-1}\end{pmatrix} =\begin{pmatrix}K & LD'^{-1}\\ D'M & D'ND'^{-1}\end{pmatrix}
\end{align} Now it's easy to see that L=M=0. For example, we have M=D'M for all D' in SO(2), and if we e.g. choose D' to be a rotation by $\pi/2$, we can easily see that M=0.

However, the same choice of D in the equation for N yields that N is of the form
$$N=\begin{pmatrix}a & b\\ -b & a\end{pmatrix},$$ and this is a number times a member of SO(2), but that member doesn't have to be the identity. And I don't think that there's a way to get b=0 by choosing another D', because $D'ND'^{-1}$ will just be (a number times) a product of 3 rotations by angles θ,λ,-θ that add up to λ, and this turns the equation $N=D'ND'^{-1}$ into N=N, which tells us nothing.
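The claims about M and N can also be checked numerically. The following is only a minimal sketch in plain Python, not part of the original argument; the helper names (`matmul`, `rotation`, `conjugate`, `close`) and the sample values of a and b are assumptions made for illustration.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation(theta):
    """Counterclockwise rotation by theta, a member of SO(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def conjugate(N, theta):
    """Return D' N D'^{-1} for D' = rotation(theta)."""
    return matmul(matmul(rotation(theta), N), rotation(-theta))

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to floating-point tolerance."""
    return all(abs(X[i][j] - Y[i][j]) < tol
               for i in range(2) for j in range(2))

a, b = 2.0, 0.7                      # arbitrary sample values (assumption)
N = [[a, b], [-b, a]]
angles = [0.3, 1.0, math.pi / 2, 2.5, math.pi]

# N of the form [[a, b], [-b, a]] is unchanged by every conjugation tried.
n_is_invariant = all(close(conjugate(N, t), N) for t in angles)

# A matrix that is not of that form is changed by at least one of them.
generic = [[1.0, 2.0], [3.0, 4.0]]
generic_is_invariant = all(close(conjugate(generic, t), generic)
                           for t in angles)

# M = D'M with D' a rotation by pi turns M into -M, so M = -M and M = 0.
M = [[1.0, 2.0], [3.0, 4.0]]
minus_M = matmul(rotation(math.pi), M)

print(n_is_invariant, generic_is_invariant)
```

With these sample values the first flag should come out true and the second false, which is consistent with the conclusion that rotations alone cannot force b=0.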

2. Jan 8, 2013

### strangerep

I think you're right -- the paper is missing something. This type of derivation usually proceeds by assuming that the spatial axes of both frames (anchored at the common spatiotemporal origin) are adjusted to coincide.

So spatial isotropy is used twice -- in different ways: firstly to get rid of L and M completely,
by insisting that those 2D rotations must be an automorphism of the group.

But then, one says that we can multiply the B(v) matrix by a matrix consisting of 1 and $N^{-1}$, and silently redefine the product matrix as B(v) .... :-)
This is what one means by "adjusting the transverse axes to coincide in both frames".

Last edited: Jan 8, 2013
3. Jan 8, 2013

### Fredrik

Staff Emeritus
Thank you. By the way, if we're allowed to choose D' as a reflection instead of a (proper) rotation, we can easily get the result that N is a number times the identity matrix. But the author doesn't seem to assume any kind of reflection invariance.
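A short worked check of the reflection remark, under the assumption (not made in the article) that the relation $B(Dv)=R(D)B(v)R(D^{-1})$ also holds for a reflection $\sigma$ of the 2-3 plane that leaves v fixed. The symbol $\sigma$ is introduced here only for illustration. Starting from the form of N already obtained from rotations,
$$\sigma=\begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}=\sigma^{-1},\qquad \sigma N\sigma^{-1}=\begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}\begin{pmatrix}a & b\\ -b & a\end{pmatrix}\begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}=\begin{pmatrix}a & -b\\ b & a\end{pmatrix}.$$
Requiring $N=\sigma N\sigma^{-1}$ then gives $b=-b$, so $b=0$ and $N=aI$.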

4. Jan 8, 2013

### strangerep

Yes -- I wasn't overly impressed by that paper. Lots of fancy-schmancy language, but a bit superficial, imho. His mention of FL transformations and Manida's work reveals to me that he hasn't delved very deeply into that, nor tried to develop Manida's work further. But it seems lots of papers are like that: mere re-heatings and re-arrangements of various other stuff -- just enough so that it can't easily be called plagiarism.
A lot of the work involved in finding the relativity group from first principles (without cheating by knowing the answer) involves thoroughly analyzing and teasing apart the independent parts, such as separating boosts from rotations, from translations, etc, and separating the identity-connected parts from discrete parts. When considering boosts, one usually assumes that $v=0$ corresponds to the identity transformation, but in fact one should be more careful not to exclude possible discrete transformations thereby.

5. Jan 9, 2013

### tom.stoer

What's the idea behind it? Take all Lie groups with 4-dim. rep. and "derive" in some way that spacetime symmetry is Lorentz symmetry? This is strange for several reasons:
- it misses Poincare invariance
- it misses lessons from GR where some/all these symmetries become local gauge symmetries
- it misses diffeomorphism invariance
- there is no mathematical reason to single out SO(3,1), so everything is contained in additional assumptions; are there any guiding principles not already using the result SO(3,1)?

6. Jan 9, 2013

### strangerep

What "it" are you referring to? (BTW, Tom, it would be helpful if you gave a little more quoted context when replying to threads. It's not always clear what you're actually replying to.)

I guess you didn't follow the previous thread(s) where Fredrik and I discussed this subject.
Anyway, the focus is on deriving the maximal group (of coordinate transformations) that preserves inertial motion (i.e., unaccelerated observers) from the relativity principle alone,
i.e., without the light principle. It doesn't miss Poincare invariance. GR is not relevant here since we're only talking about inertial motion.

Last edited: Jan 9, 2013
7. Jan 9, 2013

### Fredrik

Staff Emeritus
The idea is to see what the principle of relativity says about theories of physics in which space and time are represented by $\mathbb R^4$ equipped with global inertial coordinate systems. (Yes, these are theories that are less sophisticated than GR, but the point isn't to find exciting new theories. It's to explain what the less sophisticated ones have in common, and to show the power of the principle of relativity). No assumptions are made about those coordinate systems other than that the functions (permutations of $\mathbb R^4$) that change coordinates from one global inertial coordinate system to another have properties that are suggested by the principle of relativity, and the idea that an inertial coordinate system takes the world line of a non-accelerating particle to a straight line. The most important assumption is that these functions form a group. The claim is that a small set of additional "inspired by the principle of relativity" assumptions imply that this group is either the Galilean group or (isomorphic to) the Poincaré group.

One of the assumptions is that these functions take straight lines to straight lines. (This one isn't inspired by the principle of relativity. Instead it should be thought of as part of what we mean by "inertial coordinate system"). This implies that if T is such a function, there's a linear $\Lambda$ and a vector y such that $T(x)=\Lambda x+y$. (To prove this is by far the hardest part of the argument).

The most important of the other assumptions is that the set of all of these functions form a group. Because of the above, the set of linear functions in that group is a subgroup. What we are talking about is how to find that subgroup.

Last edited: Jan 9, 2013
8. Jan 9, 2013

### Fredrik

Staff Emeritus
The task is easier when we take spacetime to be $\mathbb R^2$ instead of $\mathbb R^4$. I will show one way to find the subgroup that consists of linear proper orthochronous coordinate transformations. I'm confident about the calculation, but I'd like to discuss the assumptions. I'll explain what I want to discuss later, in another post.

First note that if
$$\Lambda=\begin{pmatrix}a & b\\ c & d\end{pmatrix},$$ then
$$\Lambda^{-1}=\frac{1}{ad-bc}\begin{pmatrix}d & -b\\ -c & a\end{pmatrix}.$$ The velocity associated with $\Lambda$ can be determined by examining what $\Lambda^{-1}$ does to a point on the time axis. I'm using the convention that the time coordinate is the upper one. Since
$$\Lambda^{-1}\begin{pmatrix}1\\ 0\end{pmatrix}=\begin{pmatrix}(\Lambda^{-1})_{00}\\ (\Lambda^{-1})_{10}\end{pmatrix},$$ the velocity of $\Lambda$ is
$$\frac{(\Lambda^{-1})_{10}}{(\Lambda^{-1})_{00}}=\frac{-c}{d}=\frac{-\Lambda_{10}}{\Lambda_{11}}.$$
Similarly, the velocity of $\Lambda^{-1}$ is c/a.

Let G be a non-trivial subgroup of $GL(\mathbb R^2)$ such that the following statements are true.
1. For all $\Lambda\in G$, $\Lambda$ is proper and orthochronous.
2. There's a function $v:G\to\mathbb R$ such that for all $\Lambda\in G$, $v(\Lambda)=-\Lambda_{10}/\Lambda_{11}$.
3. For all $\Lambda\in G$, $v(\Lambda^{-1})=-v(\Lambda)$.
4. For all $\Lambda\in G$, $v(\Lambda)=0\Rightarrow\Lambda=I$.
5. For all $\Lambda\in G$, $(\Lambda^{-1})_{00}=\Lambda_{00}$.

Let $\Lambda\in G$ be arbitrary. Assumption 3 says that
$$\frac{c}{a}=v(\Lambda^{-1})=-v(\Lambda)=\frac{c}{d}.$$ So if c≠0, then d=a. However, if c=0, then $v(\Lambda)=0$, and now assumption 4 says that $\Lambda=I$, which implies that d=a=1. So we always have d=a.

Define $\gamma=a$, $v=-c/a$ (with apologies for giving the symbol v a second meaning) and $\alpha=b/a$. Note that assumption 3 implies that $v=v(\Lambda)$ (where the v on the left is the new one, and the v on the right is the old one). We have
$$\Lambda=a\begin{pmatrix}1 & b/a\\ c/a & d/a\end{pmatrix}=\gamma\begin{pmatrix}1 & \alpha\\ -v & 1\end{pmatrix}.$$ For all $\Lambda',\Lambda''\in G$,
$$G\ni \Lambda'\Lambda'' =\gamma'\gamma''\begin{pmatrix}1 & \alpha'\\ -v' & 1\end{pmatrix}\begin{pmatrix}1 & \alpha''\\ -v'' & 1\end{pmatrix} =\gamma'\gamma''\begin{pmatrix}1-\alpha'v'' & \alpha''+\alpha'\\ -v'-v'' & -v'\alpha''+1\end{pmatrix}.$$ This implies that $1-\alpha'v''=-v'\alpha''+1$. Note that since this holds for all $\Lambda''$ including one with $v''\neq 0$, it implies that if $v'=0$, then $\alpha'=0$. We will use this in a minute.

Let $\Lambda',\Lambda''$ be arbitrary members of $G$ such that $v'\neq 0$ and $v''\neq 0$. Now the result $1-\alpha'v''=-v'\alpha''+1$ implies that
$$\frac{\alpha''}{v''}=\frac{\alpha'}{v'}.$$ This means that $\alpha'/v'$ has the same value for all $\Lambda'\in G$ with $v'\neq 0$. Denote this value by -K. Together with the earlier result that $v'=0$ implies $\alpha'=0$, this gives $\alpha'=-Kv'$ for all $\Lambda'\in G$. So
$$\Lambda=\gamma\begin{pmatrix}1 & -Kv\\ -v & 1\end{pmatrix},\quad\Lambda^{-1}=\frac{1}{\gamma(1-Kv^2)}\begin{pmatrix}1 & Kv\\ v & 1\end{pmatrix}.$$ Assumption 5 tells us that
$$\gamma=\frac{1}{\gamma(1-Kv^2)}.$$ Note that if K>0, this implies that $1-Kv^2>0$. If we define $c=1/\sqrt{K}$, this is equivalent to $v\in(-c,c)$.

Assumption 1 implies that $\gamma>0$, so we have
$$\gamma=\frac{1}{\sqrt{1-Kv^2}}.$$ When K=0, $\Lambda$ is a proper and orthochronous Galilean boost. When $K=1$, $\Lambda$ is a proper and orthochronous Lorentz boost.

A few things that I didn't show here: We must have K≥0, because K<0 contradicts that G is a group or that all its members are orthochronous. For all K≥0, let $G_K$ be a set whose members are all the "Lambdas" of the form found above, with all the velocities that are consistent with the value of K. Then $G_K$ is a group that satisfies the assumptions. For all K>0, $G_K$ is isomorphic to $G_1$, i.e. the restricted Lorentz group.
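The properties of the matrices found above can be spot-checked numerically. This is a hedged sketch in plain Python, not a proof; the function names (`boost`, `matmul`, `inverse`, `velocity`) and the sample values of K and the velocities are assumptions chosen for illustration. The composition rule $(v'+v'')/(1+Kv'v'')$ used as the expected value is what the product matrix above gives when $\alpha=-Kv$ is substituted.

```python
import math

def boost(v, K):
    """Lambda = gamma * [[1, -K v], [-v, 1]] with gamma = 1/sqrt(1 - K v^2)."""
    gamma = 1.0 / math.sqrt(1.0 - K * v * v)
    return [[gamma, -gamma * K * v], [-gamma * v, gamma]]

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(L):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = L
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def velocity(L):
    """v(Lambda) = -Lambda_10 / Lambda_11 (the time coordinate is index 0)."""
    return -L[1][0] / L[1][1]

K, v1, v2 = 1.0, 0.6, -0.3           # sample values (assumption)
L1, L2 = boost(v1, K), boost(v2, K)

# Assumption 3: the inverse has the opposite velocity.
print(velocity(L1), velocity(inverse(L1)))

# Assumption 5: the 00 components of Lambda and its inverse agree.
print(L1[0][0], inverse(L1)[0][0])

# Closure: the product has equal diagonal entries and is again a boost.
L12 = matmul(L1, L2)
composed_velocity = velocity(L12)
expected = (v1 + v2) / (1 + K * v1 * v2)
print(composed_velocity, expected)
```

Setting `K = 0.0` in the same sketch gives the Galilean case, where the composed velocity reduces to the plain sum of the two velocities.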

Last edited by a moderator: Jan 10, 2013
9. Jan 9, 2013

### Fredrik

Staff Emeritus
Here's what I want to discuss:
• What's the best way to justify assumption 3?
• Can we change the assumptions to go directly for the Galilei and Lorentz groups instead of their restricted (proper and orthochronous) subgroups, without complicating the calculation too much?
• Is this whole thing worth doing?
These are some of my thoughts.

A. It's not clear to me what assumptions about space we're using here. Just reflection invariance? Are we also using translation invariance? Consider two identical point particles with different constant velocities in otherwise empty (1-dimensional) space. Is it obvious that if the velocity of A in a coordinate system comoving with B is v, then the velocity of B in a coordinate system comoving with A is -v if the systems have the same orientation and v if they have opposite orientations?

B. If we consider the group that includes reflections of space or time instead of its restricted subgroup, we will get results like d=±a instead of d=a, and these extra minus signs will make the calculation awkward. So maybe the easiest way to find the group is to start by defining the time-reversal operator T and the parity operator P, and prove that any member is equal to a product of T's and P's and a member of the restricted subgroup. Then we find the restricted subgroup as above.

C. I'm undecided. I would say that the argument doesn't help us understand Galilean spacetime, and that it doesn't help us understand Minkowski spacetime. It only helps us understand how the two can be viewed as different versions of the same thing. I think that's interesting, but the argument would be much cooler if a person who doesn't know SR could have come up with it, and I doubt it. [strike]Assumption 4[/strike] Assumption 5 looks like something that only a person who understands time dilation and relativity of simultaneity would make. It ensures that statements about the ticking rate of the other guy's clocks will be symmetrical. E.g. both observers may say "The time displayed by your clock increases slower than the time coordinates assigned to its world line using my clock".

[strike]Assumption 4[/strike] Assumption 5 is a natural way to turn an aspect of the principle of relativity into a mathematically precise statement. Absolute simultaneity on the other hand, isn't in any way an aspect of the principle of relativity. This suggests that [strike]assumption 4[/strike] assumption 5 should be on the list and that absolute simultaneity shouldn't. But the reason we decided to consider $\mathbb R^2$ (or $\mathbb R^4$) as the underlying set of spacetime in the first place is that it's natural to think of $\mathbb R$ as time and $\mathbb R^3$ as space. So why are we suddenly making a weaker assumption than that? Because we know what answer we want!

Edit: When I referenced "assumption 4" in three different places under C above, I really meant "assumption 5". I'm sorry about that.

Last edited: Jan 9, 2013
10. Jan 9, 2013

### strangerep

If I understand your notation correctly, your assumption 3 is equivalent to this: for every transformation of this form with parameter v, the inverse transformation is given by -v.
That's indeed used as an assumption in most published treatments I've seen, but it's unnecessary. If one merely assumes that there is some parameter $v^-$ that corresponds to the inverse transformation, then it is possible to deduce (in this case) that $v^- = - v$. One need simply multiply the forward and inverse transformations, insist that the result is the identity, and then equate coefficients of like powers of the coordinates in the resulting equation to get several constraints, one of which turns out to be $v^- = - v$.
It turns out that this assumption is not necessary. Having established that $v$ means the velocity of (say) B's origin relative to A's origin, and that the transverse axes in both frames have been adjusted to coincide, then it's possible to deduce that the relative velocity of A relative to B is indeed $-v$ -- merely from working through the group-theoretic calculations, and remembering the spatial isotropy assumption.

In any group decomposition, one must typically find the identity-connected parts and the discrete parts, and investigate them separately. I don't think there's any shortcut around this.

Your derivation can be streamlined/simplified to some extent, and with fewer starting assumptions. I see it as a worthwhile exercise to find the minimal set of (physical) assumptions that yield the result.

But (potentially) more useful is to see how the group theoretic approach can reveal unexpected invariant constant(s) in more general cases.

This is just an arbitrary choice of $v=0$ as the parameter value corresponding to the identity transformation. You could potentially choose some other value, but the physics wouldn't make sense because at some point you must establish that $v$ corresponds to what we intuitively mean by "relative velocity" between the observers. (This is more transparent if one uses explicit coordinate formulas rather than just the matrices as you've done in your posts.)

11. Jan 9, 2013

### Fredrik

Staff Emeritus
I apologize. I meant assumption 5, not 4. The one that says that every $\Lambda$ has the same 00 (top left) component as its inverse. Assumption 5 is about time dilation and that sort of thing. (The 00 component is the "gamma" of the transformation).

Yes, these are the assumptions in words:

1. This one is just saying that I have decided to find the restricted (=proper and orthochronous) subgroup first.
2. Every $\Lambda$ has a finite velocity.
3. If the velocity of $\Lambda$ is v, then the velocity of $\Lambda^{-1}$ is -v.
4. The only $\Lambda$ with velocity 0 is the identity.
5. Every $\Lambda$ has the same 00 (top left) component as its inverse.

I don't immediately see how to fill in the details. Even if I did, I think I like my approach better. Note that I'm not making any assumptions about how many parameters there are.

I don't see what you have in mind here. In my approach, $v(\Lambda^{-1})=-v(\Lambda)$ is the spatial isotropy assumption. (In 1+1 dimensions, spatial isotropy and reflection invariance are the same thing).

If you can prove that one of my assumptions is unnecessary, I'm very interested in seeing the proof. Just keep in mind that if you use an assumption like "isotropy", you must turn that assumption into a precise mathematical statement and include it on the list.

Last edited: Jan 9, 2013
12. Jan 9, 2013

### VantagePoint72

Fredrik, I realize this thread has gone off in a different direction, but I believe you've made an error in your reasoning at the end here and that the article is correct. For v along the $x_1$ axis and D a rotation around that axis, the equation $B(Dv)=R(D)B(v)R(D^{-1})$ (which holds for all rotations) reduces to $B(v)=R(D)B(v)R(D^{-1})$, as you noted. That means the equation:
$\begin{pmatrix}K & L\\ M & N\end{pmatrix} = \begin{pmatrix}K & LD'^{-1}\\ D'M & D'ND'^{-1}\end{pmatrix}$
must hold for arbitrary $D'$ in SO(2). You've just chosen particular values, which does not tell us anything. The way to proceed from here is to note that $L=LD'^{-1}$ is satisfied for arbitrary $D'$ in SO(2) only for $L=0$, since in particular the equivalence after rotation by π requires $L=-L$. Likewise for $M$.

On the other hand, $N = D'ND'^{-1}$ does require that $N$ is a multiple of the unit matrix, as the article says. This follows from Schur's lemma, since $N$ commutes with every group representative of SO(2) in an irreducible representation. So the author's statement is correct.

When you said that a rotation by π/2 requires:
$N=\begin{pmatrix}a & b\\ -b & a\end{pmatrix}$
which is not a multiple of the unit matrix, that is strictly true—but you are not taking into account all the other restrictions imposed by the other possible choices of rotation angle.

13. Jan 9, 2013

### Fredrik

Staff Emeritus
Thanks for taking the time to look at this. I'm not convinced yet, but it's possible that I drew the wrong conclusion from the result of the final step of my argument below.

I did realize that these equalities hold for arbitrary D' in SO(2). That implies that they hold for the specific choices of D' that make the calculations easy. So let's write
$$L=\begin{pmatrix}a & b\\ c & d\end{pmatrix}$$ and choose D' to be a rotation by $\pi/2$. We have
$$\begin{pmatrix}a & b\\ c & d\end{pmatrix}=\begin{pmatrix}a & b\\ c & d\end{pmatrix}\begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}=\begin{pmatrix}b & -a\\ d & -c\end{pmatrix},$$
and now we immediately see that a=b, b=-a, c=d, and d=-c. These results imply that a=b=c=d=0, and therefore that L=0.

I don't know Schur's lemma. It's one of these theorems I've come across several times but never really studied. I will take another look at it.

Right. That specific choice of D' gives us enough information about the relationship between the components of N to allow us to eliminate two of the variables from the notation. I didn't assume that a single choice of D' would give us enough information to conclude that N is a multiple of the identity. What I thought was that another choice of D' would give me additional information, and allow me to conclude that b=0. At the end of the post, I argued that this is impossible. The problem is that for all non-zero real numbers s,
$$N=\begin{pmatrix}a & b\\ -b & a\end{pmatrix}=s\begin{pmatrix}\frac a s & \frac b s\\ \frac{-b}{s} & \frac a s\end{pmatrix}.$$ In particular, this holds when $s=\sqrt{a^2+b^2}$. With this choice of s, the right-hand side above is s times a matrix whose rows are orthonormal. This matrix is in SO(2), so it's a rotation by some angle λ. So we can write $N=sR(\lambda)$, where $R(\lambda)$ denotes a rotation by λ. D' is a rotation by an arbitrary angle $\theta$. We are free to choose that angle as we see fit, but no matter what we choose, we will always have
$$N=D'ND'^{-1}=R(\theta)sR(\lambda)R(-\theta)=sR(\theta+\lambda-\theta)=sR(\lambda)=N.$$ My interpretation of this was that no choice of θ can give me additional information about the components of N.

Maybe that's just the wrong conclusion. I looked at the Wikipedia article on Schur's lemma and if I understand it correctly, it does imply that this N is a multiple of the identity.

Edit: I've been playing around with Wolfram Alpha a bit. It's pretty cool that I can check my (partial) result for N simply by typing
Code (Text):
{{cos v, -sin v},{sin v, cos v}}*{{a,b},{c,d}}*{{cos v, sin v},{-sin v, cos v}} where v=pi/2
and examining the result. Then I tried
Code (Text):
{{cos v, -sin v},{sin v, cos v}}*{{a,b},{-b,a}}*{{cos v, sin v},{-sin v, cos v}} where v=pi/2
with several different choices of the angle v, and I just got the result
$$\begin{pmatrix}a & b\\ -b & a\end{pmatrix}$$ every time, and this tells me nothing.

I also see that Wikipedia requires the vector space to be complex. These things have brought me back to thinking that Giulini got that detail wrong in his article. Maybe Schur's lemma simply doesn't apply here?
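One way to see why the choice of field matters, offered as a sketch rather than a full statement of Schur's lemma (the symbol $J$ is notation introduced here): every N of the form found above can be written as
$$N=aI+bJ,\qquad J=\begin{pmatrix}0 & 1\\ -1 & 0\end{pmatrix},\qquad J^2=-I,$$
so the real matrices that commute with all of SO(2) behave like the complex numbers $a+bi$ rather than like real multiples of the identity. Over the complex numbers, the vectors $(1,\mp i)^T$ are common eigenvectors of every rotation $R(\theta)$, with
$$R(\theta)\begin{pmatrix}1\\ \mp i\end{pmatrix}=e^{\pm i\theta}\begin{pmatrix}1\\ \mp i\end{pmatrix},\qquad N\begin{pmatrix}1\\ \mp i\end{pmatrix}=(a\mp ib)\begin{pmatrix}1\\ \mp i\end{pmatrix}.$$
The two-dimensional representation is therefore reducible over $\mathbb C$, and N acts as a scalar on each one-dimensional invariant subspace separately, but the two scalars $a-ib$ and $a+ib$ need not be equal. The "multiple of the identity" conclusion would require an irreducible complex representation, which this is not.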

Last edited: Jan 9, 2013
14. Jan 9, 2013

### VantagePoint72

Yes, it looks like you're right. I'm not exactly sure how Schur's lemma requires the complex field as it seems pretty generally stated. But clearly something is wrong. Either way, I think it's a fair wager that this is exactly the mistake the author made.

15. Jan 9, 2013

### Fredrik

Staff Emeritus
OK, thank you LastOneStanding.

Back to the discussion about the assumptions that go into the "nothing but relativity" argument... I have realized that one thing that I've been taking for granted is questionable. Does the set of orthochronous transformations really form a subgroup? How do we even define "orthochronous" here? How about this? $T:\mathbb R^n\to\mathbb R^n$ is said to be orthochronous if $(T(x))_0>(T(y))_0$ for all $x,y\in\mathbb R^n$ such that $x_0>y_0$. I don't see how this implies that the set of orthochronous maps is closed under composition.

This is a bit tricky even when we know that the group is the Lorentz group. I did that proof here. Unfortunately it's an old thread with several LaTeX errors (which were unnoticeable before the upgrade to MathJax).

Maybe there's no easier way to prove that there's a proper subgroup and an orthochronous subgroup, than to first determine the full group and then use that result to prove that the group has these subgroups. If that's the case, the list of assumptions will have to be changed.

Edit: The calculation that attempts to find the full group right away is a lot uglier. But I just realized that we can appeal to the principle of relativity and make the existence of an orthochronous subgroup one of our assumptions. If Alice agrees with Bob about the temporal order of any two events, and Bob agrees with Charlie about the temporal order of any two events, then Alice should agree with Charlie about the temporal order of any two events.

Last edited: Jan 9, 2013
16. Jan 9, 2013

### micromass

I fixed LaTeX in that post. If there are good posts whose LaTeX is messed up, then please report them.

17. Jan 9, 2013

### Fredrik

Staff Emeritus
OK, here's another attempt. (Just a sketch. I'm leaving out some details). We're looking for a non-trivial group G that's a subgroup of $GL(\mathbb R^2)$.

Assumption: There's a $v:G\to\mathbb R$ such that $v(\Lambda)=-\frac{\Lambda_{10}}{\Lambda_{11}}$ for all $\Lambda\in G$.

This one implies that both $\Lambda_{11}$ and $\Lambda_{00}$ are non-zero. (The latter because $\Lambda_{00}=(\det\Lambda)(\Lambda^{-1})_{11}$).

Assumptions: $(\Lambda^{-1})_{00}=\Lambda_{00}$ and $(\Lambda^{-1})_{11}=\Lambda_{11}$.

Let $\Lambda\in G$ be arbitrary, and denote its components by a,b,c,d, as in post #8. The assumptions imply that $\det\Lambda=\pm 1$ and that $d=(\det\Lambda)a$. It's now trivial to prove that $G_p$, defined by $G_p=\{\Lambda\in G|\det\Lambda>0\}$, is a subgroup of G.
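The step from the two assumptions to $\det\Lambda=\pm 1$ can be spelled out as follows (a sketch using the inverse formula from post #8, with $a\neq 0$ as established above):
$$a=\Lambda_{00}=(\Lambda^{-1})_{00}=\frac{d}{\det\Lambda},\qquad d=\Lambda_{11}=(\Lambda^{-1})_{11}=\frac{a}{\det\Lambda}.$$
Substituting the second equality into the first gives
$$a=\frac{a}{(\det\Lambda)^2}\quad\Rightarrow\quad(\det\Lambda)^2=1\quad\Rightarrow\quad\det\Lambda=\pm 1,\qquad d=(\det\Lambda)\,a.$$
That $G_p$ is a subgroup then follows from $\det(\Lambda\Lambda')=\det\Lambda\,\det\Lambda'$ and $\det(\Lambda^{-1})=1/\det\Lambda$.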

Assumption: The set $G_o$ defined by $G_o=\{\Lambda\in G|\Lambda\text{ is orthochronous}\}$ is a subgroup of $G$.

The intersection of two subgroups is a subgroup, so the set H defined by $H=G_o\cap G_p$ is a subgroup. Let $\Lambda$ be an arbitrary member of H, and denote its components by a,b,c,d as above. We have d=a, and it's easy to show that a>0. (Use that $\Lambda$ is orthochronous).
$$\Lambda=a\begin{pmatrix}1 & b/a\\ c/a & 1\end{pmatrix}.$$ We define $\gamma=a$ and $\alpha=b/a$. Since
$$\frac{c}{a}=\frac{c}{d}=-v(\Lambda),$$ it's convenient to also define v=-c/a. Now we have
$$\Lambda=\gamma\begin{pmatrix}1 & \alpha\\ -v & 1\end{pmatrix}.$$ From here, everything is the same as in post #8. We use that H is closed under matrix multiplication to prove that $\alpha/v$ has the same value for all $\Lambda\in H$ with v≠0. If we denote that value by -K, we have $\alpha=-Kv$. Then we use the assumption about the 00 component to determine $\gamma$.

This time, I can prove $v(\Lambda^{-1})=-v(\Lambda)$ as a theorem, but only because I added another assumption, the one about the 11 component. Apparently this has also enabled me to drop the assumption that the only member of H with zero velocity is the identity.

Last edited: Jan 10, 2013
18. Jan 9, 2013

### Fredrik

Staff Emeritus
Thank you. I usually do, but for some reason I chose not to do it with this one. I was probably just embarrassed about the number of LaTeX errors in there.

19. Jan 10, 2013

### strangerep

That was also my initial reaction to Fredrik's opening post, but then I realized that's wrong. Take an arbitrary (identity-connected) matrix in SO(2), e.g.,
$$D ~:=~ \begin{pmatrix} \cos\theta & \sin\theta \cr - \sin\theta & \cos\theta \end{pmatrix}$$ and then find all 2x2 matrices N which commute with D. The answer is any N of the form:
$$N ~=~ \begin{pmatrix} a & b \cr - b & a \cr \end{pmatrix}$$
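This can be verified directly. As a sketch, write a general matrix with entries $p,q,r,t$ (names chosen here only for illustration) and compute the commutator with D:
$$N=\begin{pmatrix}p & q\\ r & t\end{pmatrix},\qquad ND-DN=\sin\theta\begin{pmatrix}-(q+r) & p-t\\ p-t & q+r\end{pmatrix}.$$
For any $\theta$ with $\sin\theta\neq 0$, the commutator vanishes exactly when $r=-q$ and $t=p$, which is the form above with $a=p$ and $b=q$. Nothing in this condition forces $b=0$.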

20. Jan 10, 2013

### strangerep

Yes -- I'm thinking of the 3+1 case.
It means that the equations which define all possible transformations are invariant under arbitrary SO(3) rotations. Another way of saying this is that the defining equations do not involve a distinguished 3-vector parameter.

In the case of boosts, we assume that SO(3) by itself has already been found by a decomposition process (such as: "first find all the transformations that leave the time coordinate invariant"). Then fix an arbitrary direction in space $\widehat u$ (corresponding to a velocity direction), and derive the boost transformations for that case. Rotations around $\widehat u$ must leave the defining equations invariant.

Having found these transformations for a specific $\widehat u$, apply general SO(3) rotations to get the general case. (I recall this is called "completing the orbit" in the language of Wigner's little group method.)

21. Jan 10, 2013

### strangerep

Yes, and you're denying yourself the use of one powerful technique: if you decompose the transformations such that, for each type, you're trying to determine a 1-parameter Lie subgroup, then you can exploit the fact that this subgroup is Abelian.

So, in the case of boosts in 1+3D, one fixes a direction $\widehat u$ as I said before, and the parameter then becomes a real number $u$ multiplying the (fixed) 3-vector $\widehat u$.

Then, imposing the requirement that transformations with different parameters $u, u'$ must commute, one determines quickly which of the unknown functions are in fact constant.

Actually, one can do even better if you've already established that the unknown functions are even in $u$. [The Levy-Leblond paper we discussed in another thread shows how to do this -- see the discussion leading up to his equations (15a)-(15c)]. Then the appearance of that common factor $\gamma$ emerges as a consequence.

22. Jan 10, 2013

### TrickyDicky

But why are you using 1+1 dimensions? Even if it is usually true that the one dimensional case generalizes to the higher dimensional case this is not always so.

23. Jan 10, 2013

### Fredrik

Staff Emeritus
1. I expect it to be much easier than the 3+1-dimensional case.
2. I expect to be able to use some of what I've found studying the 1+1-dimensional case when I do the 3+1-dimensional case.
3. Most of the interesting stuff about relativity is present in 1+1 dimensions. (Time dilation, length contraction, relativity of simultaneity, the twin paradox,...)

24. Jan 10, 2013

### Fredrik

Staff Emeritus
I agree, but I think these statements are too vague. They are certainly more precise than e.g. "space is the same in all directions", but I want all of my assumptions to be exact mathematical statements.

I expect that I will have to do something like this in the 3+1-dimensional case.

Note that in the 1+1-dimensional case, I don't have to assume anything about parameters. I've been hoping that I won't have to in the 3+1-dimensional case either, but I have realized that at the very least, I have to assume that there's a subgroup corresponding to rotations.

I may continue to deny myself the use of some powerful techniques, because if it's possible, I'd like to find a derivation that can be understood by people who haven't studied some of the more advanced stuff. If I explain what a group is, my arguments for the 1+1-dimensional case can be understood by anyone who has taken a course in linear algebra.

If we do make assumptions about parameters, we have to make them mathematically precise, and explain how they can be thought of as precise statements of some aspect of the principle of relativity or isotropy (or maybe invariance under spatial reflections or invariance under time-reversal). It's not clear to me how to do this, but I haven't spent a lot of time on the 3+1-dimensional case yet.

I don't have an easy way to access all of the references. There's a university library I can use, but I have to physically go there. I might do that next week.

25. Jan 10, 2013

### Fredrik

Staff Emeritus
I'm quite pleased with the derivation and the assumptions in #17. Note that none of my assumptions is about invariance under spatial reflections or time reversal!

The only problem with #17 is that it doesn't show how to determine G from the restricted group that we find at the end. I could use some help with that. These are some of my thoughts:

I defined $G_p=\{\Lambda\in G|\det\Lambda>0\}$. Now define
$$P=\begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}.$$ For all $\Lambda\in G$, we have $\det(P\Lambda)=-\det(\Lambda)$. If $P\in G$, then this implies that $G-G_p=\{P\Lambda|\Lambda\in G_p\}$, so we have $G=G_p\cup\{P\Lambda|\Lambda\in G_p\}$.

But what if $P\notin G$? Then $\Lambda\in G_p$ doesn't imply that $P\Lambda\in G$ (at least not in an obvious way), so $G-G_p$ may be a proper subset of $\{P\Lambda|\Lambda\in G_p\}$. The statement that $P\in G$ can be thought of as a mathematically precise version of "space is the same in both directions", so I thought that $P\notin G$ would imply that $G=G_p$. I'm thinking that if space is not invariant under spatial reflections, then our final result for G should be the proper Galilean group or proper Lorentz group.

So is it possible to prove that $P\notin G$ implies that $G-G_p=\emptyset$?

Edit: I may have figured this out. I noticed something 5 seconds after posting that may or may not solve the problem. Thinking about it now...

It turned out to be quite easy. Suppose that $P\notin G$. I will prove that $G-G_p$ is empty by deriving a contradiction from the assumption that it's not. So suppose that it's not. Let $\Lambda\in G-G_p$ be arbitrary. Then $\det(\Lambda)=-1$, and $\det(P\Lambda)=1$. So $P\Lambda\in G_p$. Define $\Lambda'=P\Lambda$. We have $P=\Lambda'\Lambda^{-1}\in G$, and this contradicts the assumption that $P\notin G$.

So it is easy to determine $G$ from $G_p$. If $P\notin G$ (if the two directions of space are not equivalent), we have $G=G_p$. If $P\in G$ (if the two directions of space are equivalent), we have $G=G_p\cup\{P\Lambda|\Lambda\in G_p\}$.

Last edited: Jan 10, 2013