Possible mistake in an article (rotations and boosts).

Fredrik · Jan 7, 2013

This is a linear algebra question, but it's about an article about Minkowski spacetime, so I think it's appropriate to post it here. The article is The rich structure of Minkowski space by Domenico Giulini. The detail I'm asking about is at the top of page 16.

The article is describing the 3+1-dimensional version of the "nothing but relativity" argument that's been discussed here in a few threads recently. (The idea is to prove that the group of functions that change coordinates from one global inertial coordinate system to another is either the group of Galilean boosts or the Lorentz group. So at the start, we do not assume that spacetime is Minkowski spacetime. We just assume that spacetime is some structure with underlying set ℝ⁴.)

The article assumes that the group has two subgroups, one corresponding to rotations, and one corresponding to boosts. The rotations are 4×4 matrices
$$R(D)=\begin{pmatrix}1 & 0\\ 0 & D\end{pmatrix},$$ where the zeroes are a 3×1 matrix and a 1×3 matrix, and D is a member of SO(3). The boosts can be expressed as a function of velocity, and the relationship between boosts and rotations is assumed to be
$$B(Dv)=R(D)B(v)R(D^{-1}).$$ Now the author claims that by choosing v to be a multiple of e₁, and D to be an arbitrary rotation around the 1 axis, we can see that
$$B(v)=\begin{pmatrix}A & 0\\ 0 & \alpha I\end{pmatrix},$$ where A is a 2×2 matrix, I is the 2×2 identity matrix, and ##\alpha## is a real number. This result looks wrong to me. I want to know if I'm missing something. So here's my argument:

First write
$$B(v)=\begin{pmatrix}K & L\\ M & N\end{pmatrix}.$$ We have
$$R(D)=\begin{pmatrix}I & 0\\ 0 & D'\end{pmatrix},$$ where I is the 2×2 identity matrix and D' is a member of SO(2). So $$R(D^{-1})=\begin{pmatrix}I & 0\\ 0 & D'^{-1}\end{pmatrix}$$ and
\begin{align}\begin{pmatrix}K & L\\ M & N\end{pmatrix} &=B(v)=B(Dv)=R(D)B(v)R(D^{-1})\\
&=\begin{pmatrix}I & 0\\ 0 & D'\end{pmatrix}\begin{pmatrix}K & L\\ M & N\end{pmatrix} \begin{pmatrix}I & 0\\ 0 & D'^{-1}\end{pmatrix} =\begin{pmatrix}K & LD'^{-1}\\ D'M & D'ND'^{-1}\end{pmatrix}
\end{align} Now it's easy to see that L=M=0. For example, we have M=D'M for all D' in SO(2), and if we choose D' to be a rotation by ##\pi/2##, we can easily see that M=0.

However, the same choice of D in the equation for N yields that N is of the form
$$N=\begin{pmatrix}a & b\\ -b & a\end{pmatrix},$$ and this is a number times a member of SO(2), but that member doesn't have to be the identity. And I don't think that there's a way to get b=0 by choosing another D', because ##D'ND'^{-1}## will just be (a number times) a product of 3 rotations by angles θ,λ,-θ that add up to λ, and this turns the equation ##N=D'ND'^{-1}## into N=N, which tells us nothing.
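A quick numerical sanity check of this, as a minimal sketch in plain Python rather than a proof. The helper names (`matmul`, `rot4`, `max_abs_diff`) and the sample entries of B are my own illustrative choices, not anything taken from the article. The sketch builds R(D) for a rotation by θ around the 1 axis and confirms that a B(v) whose lower-right block has b≠0 still satisfies ##B(v)=R(D)B(v)R(D^{-1})## for a range of angles.

```python
import math

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rot4(theta):
    """R(D) for a rotation by theta around the 1 axis: identity on the
    (t, x1) block, an SO(2) rotation D' on the (x2, x3) block."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, c, -s],
            [0, 0, s, c]]

def max_abs_diff(X, Y):
    """Largest absolute entrywise difference between two matrices."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Hypothetical B(v): arbitrary K block, L = M = 0, and N = [[a, b], [-b, a]]
# with b != 0 (all numbers are made up for illustration).
a, b = 0.7, 0.4
B = [[1.3, -0.5, 0, 0],
     [-0.5, 1.3, 0, 0],
     [0, 0, a, b],
     [0, 0, -b, a]]

angles = [0.1, 0.5, math.pi / 2, 2.0, math.pi, 4.0]
# rot4(-t) is the inverse of rot4(t), so this is R(D) B R(D^{-1}) - B.
worst = max(max_abs_diff(matmul(matmul(rot4(t), B), rot4(-t)), B)
            for t in angles)
print(worst)  # stays at rounding-error level even though b != 0
```

If the article's claim followed from rotations alone, a matrix like this one would have to violate the covariance condition for some angle, and it doesn't for any of the angles tried.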

strangerep · Jan 8, 2013

I think you're right -- the paper is missing something. This type of derivation usually proceeds by assuming that the spatial axes of both frames (anchored at the common spatiotemporal origin) are adjusted to coincide.

So spatial isotropy is used twice -- in different ways: firstly to get rid of L and M completely, by insisting that those 2D rotations must be an automorphism of the group.

But then, one says that we can multiply the B(v) matrix by a matrix consisting of 1 and ##N^{-1}##, and silently redefine the product matrix as B(v) ... :-)
This is what one means by "adjusting the transverse axes to coincide in both frames".

Fredrik · Jan 8, 2013

Thank you. By the way, if we're allowed to choose D' as a reflection instead of a (proper) rotation, we can easily get the result that N is a number times the identity matrix. But the author doesn't seem to assume any kind of reflection invariance.
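To spell that out (a short worked check, assuming that the reflection ##D'=\operatorname{diag}(1,-1)## of the two transverse axes is admitted as a symmetry, which the article does not state): with N already of the form found in my first post,
$$D'ND'^{-1}=\begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}\begin{pmatrix}a & b\\ -b & a\end{pmatrix}\begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}=\begin{pmatrix}a & -b\\ b & a\end{pmatrix},$$ so ##N=D'ND'^{-1}## forces b=-b, i.e. b=0 and N=aI.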

strangerep · Jan 8, 2013

Fredrik said:

[...] the author doesn't seem to assume any kind of reflection invariance.

Yes -- I wasn't overly impressed by that paper. Lots of fancy-schmancy language, but a bit superficial, imho. His mention of FL transformations and Manida's work reveals to me that he hasn't delved very deeply into that, nor tried to develop Manida's work further. But it seems lots of papers are like that: mere re-heatings and re-arrangements of various other stuff -- just enough so that it can't easily be called plagiarism.

if we're allowed to choose D' as a reflection instead of a (proper) rotation, we can easily get the result that N is a number times the identity matrix.

A lot of the work involved in finding the relativity group from first principles (without cheating by knowing the answer) involves thoroughly analyzing and teasing apart the independent parts, such as separating boosts from rotations, from translations, etc, and separating the identity-connected parts from discrete parts. When considering boosts, one usually assumes that ##v=0## corresponds to the identity transformation, but in fact one should be more careful not to exclude possible discrete transformations thereby.

tom.stoer · Jan 9, 2013

what's the idea behind it? take all Lie groups with 4-dim. rep. and "derive" in some way that spacetime symmetry is Lorentz symmetry? this is strange for several reasons:
- it misses Poincare invariance
- it misses lessons from GR where some/all these symmetries become local gauge symmetries
- it misses diffeomorphism invariance
- there is no mathematical reason to single out SO(3,1), so everything is contained in additional assumptions; are there any guiding principles not already using the result SO(3,1)?

strangerep · Jan 9, 2013

tom.stoer said:

what's the idea behind it?

What "it" are you referring to? (BTW, Tom, it would be helpful if you gave a little more quoted context when replying to threads. It's not always clear what you're actually replying to.)

I guess you didn't follow the previous thread(s) where Fredrik and I discussed this subject.
Anyway, the focus is on deriving the maximal group (of coordinate transformations) that preserves inertial motion (i.e., unaccelerated observers) from the relativity principle alone, i.e., without the light principle. It doesn't miss Poincare invariance. GR is not relevant here since we're only talking about inertial motion.

[Edit: the older thread is: https://www.physicsforums.com/showthread.php?t=651640 ]

Fredrik · Jan 9, 2013

The idea is to see what the principle of relativity says about theories of physics in which space and time are represented by ##\mathbb R^4## equipped with global inertial coordinate systems. (Yes, these are theories that are less sophisticated than GR, but the point isn't to find exciting new theories. It's to explain what the less sophisticated ones have in common, and to show the power of the principle of relativity). No assumptions are made about those coordinate systems other than that the functions (permutations of ##\mathbb R^4##) that change coordinates from one global inertial coordinate system to another have properties that are suggested by the principle of relativity, and the idea that an inertial coordinate system takes the world line of a non-accelerating particle to a straight line. The most important assumption is that these functions form a group. The claim is that a small set of additional "inspired by the principle of relativity" assumptions imply that this group is either the Galilean group or (isomorphic to) the Poincaré group.

One of the assumptions is that these functions take straight lines to straight lines. (This one isn't inspired by the principle of relativity. Instead it should be thought of as part of what we mean by "inertial coordinate system"). This implies that if T is such a function, there's a linear ##\Lambda## and a vector y such that ##T(x)=\Lambda x+y##. (To prove this is by far the hardest part of the argument).

The most important of the other assumptions is that the set of all of these functions form a group. Because of the above, the set of linear functions in that group is a subgroup. What we are talking about is how to find that subgroup.

Fredrik · Jan 9, 2013

The task is easier when we take spacetime to be ℝ² instead of ℝ⁴. I will show one way to find the subgroup that consists of linear proper orthochronous coordinate transformations. I'm confident about the calculation, but I'd like to discuss the assumptions. I'll explain what I want to discuss later, in another post.

First note that if
$$\Lambda=\begin{pmatrix}a & b\\ c & d\end{pmatrix},$$ then
$$\Lambda^{-1}=\frac{1}{ad-bc}\begin{pmatrix}d & -b\\ -c & a\end{pmatrix}.$$ The velocity associated with ##\Lambda## can be determined by examining what ##\Lambda^{-1}## does to a point on the time axis. I'm using the convention that the time coordinate is the upper one. Since
$$\Lambda^{-1}\begin{pmatrix}1\\ 0\end{pmatrix}=\begin{pmatrix}(\Lambda^{-1})_{00}\\ (\Lambda^{-1})_{10}\end{pmatrix},$$ the velocity of ##\Lambda## is
$$\frac{(\Lambda^{-1})_{10}}{(\Lambda^{-1})_{00}}=\frac{-c}{d}=\frac{-\Lambda_{10}}{\Lambda_{11}}.$$
Similarly, the velocity of ##\Lambda^{-1}## is c/a.
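As an illustration of these two velocity formulas, here is a small sketch in plain Python. The test matrix is a standard 1+1-dimensional Lorentz boost in units where the invariant speed is 1; that is an assumption made only for the example, since nothing up to this point requires ##\Lambda## to have that form. The function names are hypothetical.

```python
import math

def inverse2(m):
    """Inverse of a 2x2 matrix ((a, b), (c, d)), time coordinate first."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def velocity(m):
    """Velocity of m: slope x/t of the image of (1, 0) under m^{-1}."""
    inv = inverse2(m)
    return inv[1][0] / inv[0][0]

def lorentz_boost(u):
    """Example input only: boost with velocity u, invariant speed 1."""
    g = 1.0 / math.sqrt(1.0 - u * u)
    return ((g, -g * u), (-g * u, g))

lam = lorentz_boost(0.6)
print(velocity(lam))            # agrees with -c/d = -lam[1][0] / lam[1][1]
print(velocity(inverse2(lam)))  # agrees with  c/a =  lam[1][0] / lam[0][0]
```

For this particular input the two velocities come out equal and opposite, but that is a property of the example matrix, not something the formulas guarantee in general.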

Let G be a non-trivial subgroup of GL(ℝ²) such that the following statements are true.

1. For all ##\Lambda\in G##, ##\Lambda## is proper and orthochronous.
2. There's a function ##v:G\to\mathbb R## such that for all ##\Lambda\in G##, ##v(\Lambda)=-\Lambda_{10}/\Lambda_{11}##.
3. For all ##\Lambda\in G##, ##v(\Lambda^{-1})=-v(\Lambda)##.
4. For all ##\Lambda\in G##, ##v(\Lambda)=0\Rightarrow\Lambda=I##.
5. For all ##\Lambda\in G##, ##(\Lambda^{-1})_{00}=\Lambda_{00}##.

Let ##\Lambda\in G## be arbitrary. Assumption 3 says that
$$\frac{c}{a}=v(\Lambda^{-1})=-v(\Lambda)=\frac{c}{d}.$$ So if c≠0, then d=a. However, if c=0, then ##v(\Lambda)=0##, and now assumption 4 says that ##\Lambda=I##, which implies that d=a=1. So we always have d=a.

Define ##\gamma=a##, ##v=-c/a## (with apologies for giving the symbol v a second meaning) and ##\alpha=b/a##. Note that assumption 3 implies that ##v=v(\Lambda)## (where the v on the left is the new one, and the v on the right is the old one). We have
$$\Lambda=a\begin{pmatrix}1 & b/a\\ c/a & d/a\end{pmatrix}=\gamma\begin{pmatrix}1 & \alpha\\ -v & 1\end{pmatrix}.$$ For all ##\Lambda',\Lambda''\in G##,
$$G\ni \Lambda'\Lambda'' =\gamma'\gamma''\begin{pmatrix}1 & \alpha'\\ -v' & 1\end{pmatrix}\begin{pmatrix}1 & \alpha''\\ -v'' & 1\end{pmatrix} =\gamma'\gamma''\begin{pmatrix}1-\alpha'v'' & \alpha''+\alpha'\\ -v'-v'' & -v'\alpha''+1\end{pmatrix}.$$ This implies that ##1-\alpha'v''=-v'\alpha''+1##. Note that since this holds for all ##\Lambda''## including one with ##v''\neq 0##, it implies that if ##v'=0##, then ##\alpha'=0##. We will use this in a minute.

Let ##\Lambda',\Lambda''## be arbitrary members of ##G## such that ##v'\neq 0## and ##v''\neq 0##. Now the result ##1-\alpha'v''=-v'\alpha''+1## implies that
$$\frac{\alpha''}{v''}=\frac{\alpha'}{v'}.$$ This means that ##\alpha'/v'## has the same value for all ##\Lambda'\in G##. Denote this value by -K. We have ##\alpha'=-Kv'## for all ##\Lambda'\in G##. So
$$\Lambda=\gamma\begin{pmatrix}1 & -Kv\\ -v & 1\end{pmatrix},\quad\Lambda^{-1}=\frac{1}{\gamma(1-Kv^2)}\begin{pmatrix}1 & Kv\\ v & 1\end{pmatrix}.$$ Assumption 5 tells us that
$$\gamma=\frac{1}{\gamma(1-Kv^2)}.$$ Note that if K>0, this implies that ##1-Kv^2>0##. If we define ##c=1/\sqrt{K}##, this is equivalent to ##v\in(-c,c)##.

Assumption 1 implies that ##\gamma>0##, so we have
$$\gamma=\frac{1}{\sqrt{1-Kv^2}}.$$ When K=0, ##\Lambda## is a proper and orthochronous Galilean boost. When ##K=1##, ##\Lambda## is a proper and orthochronous Lorentz boost.

A few things that I didn't show here: We must have K≥0, because K<0 contradicts that G is a group or that all its members are orthochronous. For all K≥0, let ##G_K## be a set whose members are all the "Lambdas" of the form found above, with all the velocities that are consistent with the value of K. Then ##G_K## is a group that satisfies the assumptions. For all K>0, ##G_K## is isomorphic to ##G_1##, i.e. the restricted Lorentz group.
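The closure claim can be spot-checked numerically. The following is a minimal sketch in plain Python, not a proof: `boost(K, v)` (a name introduced here for illustration) builds the matrix found above, and the loop multiplies two such matrices and confirms that the product is again of the same form, with its velocity read off via ##v(\Lambda)=-\Lambda_{10}/\Lambda_{11}##. The sample values of K and v are arbitrary.

```python
import math

def boost(K, v):
    """gamma * [[1, -K v], [-v, 1]] with gamma = 1/sqrt(1 - K v^2)."""
    g = 1.0 / math.sqrt(1.0 - K * v * v)
    return ((g, -g * K * v), (-g * v, g))

def matmul2(x, y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def velocity(m):
    """v(m) = -m_10 / m_11, the velocity function from the assumptions."""
    return -m[1][0] / m[1][1]

def is_close(x, y, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(x[i][j] - y[i][j]) < tol
               for i in range(2) for j in range(2))

for K in (0.0, 1.0, 4.0):      # K = 0: Galilean, K > 0: Lorentz-like
    v1, v2 = 0.3, -0.45        # both satisfy K v^2 < 1 for these K
    prod = matmul2(boost(K, v1), boost(K, v2))
    w = velocity(prod)
    # The product is the boost with the velocity read off from it ...
    assert is_close(prod, boost(K, w))
    # ... and that velocity follows the usual composition rule.
    assert abs(w - (v1 + v2) / (1.0 + K * v1 * v2)) < 1e-12
    print(K, w)
```

For K=0 the composition rule reduces to plain addition of velocities, and for K=1 it is the standard relativistic velocity addition formula.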

Fredrik · Jan 9, 2013

Here's what I want to discuss:

A. What's the best way to justify assumption 3?
B. Can we change the assumptions to go directly for the Galilei and Lorentz groups instead of their restricted (proper and orthochronous) subgroups, without complicating the calculation too much?
C. Is this whole thing worth doing?

These are some of my thoughts.

A. It's not clear to me what assumptions about space we're using here. Just reflection invariance? Are we also using translation invariance? Consider two identical point particles with different constant velocities in otherwise empty (1-dimensional) space. Is it obvious that if the velocity of A in a coordinate system comoving with B is v, then the velocity of B in a coordinate system comoving with A is -v if the systems have the same orientation and v if they have opposite orientations?

B. If we consider the group that includes reflections of space or time instead of its restricted subgroup, we will get results like d=±a instead of d=a, and these extra minus signs will make the calculation awkward. So maybe the easiest way to find the group is to start by defining the time-reversal operator T and the parity operator P, and prove that any member is equal to a product of T's and P's and a member of the restricted subgroup. Then we find the restricted subgroup as above.

C. I'm undecided. I would say that the argument doesn't help us understand Galilean spacetime, and that it doesn't help us understand Minkowski spacetime. It only helps us understand how the two can be viewed as different versions of the same thing. I think that's interesting, but the argument would be much cooler if a person who doesn't know SR could have come up with it, and I doubt it. [strike]Assumption 4[/strike] Assumption 5 looks like something that only a person who understands time dilation and relativity of simultaneity would make. It ensures that statements about the ticking rate of the other guy's clocks will be symmetrical. E.g. both observers may say "The time displayed by your clock increases slower than the time coordinates assigned to its world line using my clock".

[strike]Assumption 4[/strike] Assumption 5 is a natural way to turn an aspect of the principle of relativity into a mathematically precise statement. Absolute simultaneity on the other hand, isn't in any way an aspect of the principle of relativity. This suggests that [strike]assumption 4[/strike] assumption 5 should be on the list and that absolute simultaneity shouldn't. But the reason we decided to consider ℝ² (or ℝ⁴) as the underlying set of spacetime in the first place is that it's natural to think of ℝ as time and ℝ³ as space. So why are we suddenly making a weaker assumption than that? Because we know what answer we want!

Edit: When I referenced "assumption 4" in three different places under C above, I really meant "assumption 5". I'm sorry about that.

strangerep · Jan 9, 2013

Fredrik said:

A. What's the best way to justify assumption 3?
[...]
A. It's not clear to me what assumptions about space we're using here.

If I understand your notation correctly, your assumption 3 is equivalent to this: for every transformation of this form with parameter v, the inverse transformation is given by -v.
That's indeed used as an assumption in most published treatments I've seen, but it's unnecessary. If one merely assumes that there is some parameter ##v^-## that corresponds to the inverse transformation, then it is possible to deduce (in this case) that ##v^- = - v##. One need simply multiply the forward and inverse transformations, insist that the result is the identity, and then equate coefficients of like powers of the coordinates in the resulting equation to get several constraints, one of which turns out to be ##v^- = - v##.

Is it obvious that if the velocity of A in a coordinate system comoving with B is v, then the velocity of B in a coordinate system comoving with A is -v if the systems have the same orientation and v if they have opposite orientations?

It turns out that this assumption is not necessary. Having established that ##v## means the velocity of (say) B's origin relative to A's origin, and that the transverse axes in both frames have been adjusted to coincide, then it's possible to deduce that the relative velocity of A relative to B is indeed ##-v## -- merely from working through the group-theoretic calculations, and remembering the spatial isotropy assumption.

B. If we consider the group that includes reflections of space or time instead of its restricted subgroup, we will get results like d=±a instead of d=a, and these extra minus signs will make the calculation awkward. So maybe the easiest way to find the group is to start by defining the time-reversal operator T and the parity operator P, and prove that any member is equal to a product of T's and P's and a member of the restricted subgroup. Then we find the restricted subgroup as above.

In any group decomposition, one must typically find the identity-connected parts and the discrete parts, and investigate them separately. I don't think there's any shortcut around this.

C. Is this whole thing worth doing?
I'm undecided. I would say that the argument doesn't help us understand Galilean spacetime, and that it doesn't help us understand Minkowski spacetime. It only helps us understand how the two can be viewed as different versions of the same thing. I think that's interesting, but the argument would be much cooler if a person who doesn't know SR could have come up with it, and I doubt it.

Your derivation can be streamlined/simplified to some extent, and with fewer starting assumptions. I see it as a worthwhile exercise to find the minimal set of (physical) assumptions that yield the result.

But (potentially) more useful is to see how the group theoretic approach can reveal unexpected invariant constant(s) in more general cases.

Assumption 4 looks like something that only a person who understands time dilation and relativity of simultaneity would make.

This is just an arbitrary choice of ##v=0## as the parameter value corresponding to the identity transformation. You could potentially choose some other value, but the physics wouldn't make sense because at some point you must establish that ##v## corresponds to what we intuitively mean by "relative velocity" between the observers. (This is more transparent if one uses explicit coordinate formulas rather than just the matrices as you've done in your posts.)

Fredrik · Jan 9, 2013

strangerep said:

This is just an arbitrary choice of ##v=0## as the parameter value corresponding to the identity transformation.

I apologize. I meant assumption 5, not 4. The one that says that every ##\Lambda## has the same 00 (top left) component as its inverse. Assumption 5 is about time dilation and that sort of thing. (The 00 component is the "gamma" of the transformation).

strangerep said:

If I understand your notation correctly, your assumption 3 is equivalent to this: for every transformation of this form with parameter v, the inverse transformation is given by -v.

Yes, these are the assumptions in words:

1. This one is just saying that I have decided to find the restricted (=proper and orthochronous) subgroup first.
2. Every ##\Lambda## has a finite velocity.
3. If the velocity of ##\Lambda## is v, then the velocity of ##\Lambda^{-1}## is -v.
4. The only ##\Lambda## with velocity 0 is the identity.
5. Every ##\Lambda## has the same 00 (top left) component as its inverse.

strangerep said:

That's indeed used as an assumption in most published treatments I've seen, but it's unnecessary. If one merely assumes that there is some parameter ##v^-## that corresponds to the inverse transformation, then it is possible to deduce (in this case) that ##v^- = - v##. One need simply multiply the forward and inverse transformations, insist that the result is the identity, and then equate coefficients of like powers of the coordinates in the resulting equation to get several constraints, one of which turns out to be ##v^- = - v##.

I don't immediately see how to fill in the details. Even if I did, I think I like my approach better. Note that I'm not making any assumptions about how many parameters there are.

strangerep said:

It turns out that this assumption is not necessary. Having established that ##v## means the velocity of (say) B's origin relative to A's origin, and that the transverse axes in both frames have been adjusted to coincide, then it's possible to deduce that the relative velocity of A relative to B is indeed ##-v## -- merely from working through the group-theoretic calculations, and remembering the spatial isotropy assumption.

I don't see what you have in mind here. In my approach, ##v(\Lambda^{-1})=-v(\Lambda)## is the spatial isotropy assumption. (In 1+1 dimensions, spatial isotropy and reflection invariance are the same thing).

strangerep said:

Your derivation can be streamlined/simplified to some extent, and with fewer starting assumptions. I see it as a worthwhile exercise to find the minimal set of (physical) assumptions that yield the result.

If you can prove that one of my assumptions is unnecessary, I'm very interested in seeing the proof. Just keep in mind that if you use an assumption like "isotropy", you must turn that assumption into a precise mathematical statement and include it on the list.

VantagePoint72 · Jan 9, 2013

Fredrik said:

However, the same choice of D in the equation for N yields that N is of the form
$$N=\begin{pmatrix}a & b\\ -b & a\end{pmatrix},$$ and this is a number times a member of SO(2), but that member doesn't have to be the identity. And I don't think that there's a way to get b=0 by choosing another D', because ##D'ND'^{-1}## will just be (a number times) a product of 3 rotations by angles θ,λ,-θ that add up to λ, and this turns the equation ##N=D'ND'^{-1}## into N=N, which tells us nothing.

Fredrik, I realize this thread has gone off in a different direction, but I believe you've made an error in your reasoning at the end here and that the article is correct. For v along the ##x_1## axis and a rotation around that axis, the equation ##B(Dv)=R(D)B(v)R(D^{-1})## (which holds for all rotations) reduces to ##B(v)=R(D)B(v)R(D^{-1})##, as you noted. That means the equation:
##\begin{pmatrix}K & L\\ M & N\end{pmatrix} = \begin{pmatrix}K & LD'^{-1}\\ D'M & D'ND'^{-1}\end{pmatrix}##
must hold for arbitrary ##D'## in SO(2). You've just chosen particular values, which does not tell us anything. The way to proceed from here is to note that ##L=LD'^{-1}## is satisfied for arbitrary ##D'## in SO(2) only for ##L=0##, since in particular the equivalence after rotation by π requires ##L=-L##. Likewise for ##M##.

On the other hand, ##N = D'ND'^{-1}## does require that ##N## is a multiple of the unit matrix, as the article says. This follows from Schur's lemma, since ##N## commutes with every group representative of SO(2) in an irreducible representation. So the author's statement is correct.

When you said that a rotation by π/2 requires:
##N=\begin{pmatrix}a & b\\ -b & a\end{pmatrix}##
which is not a multiple of the unit matrix, that is strictly true—but you are not taking into account all the other restrictions imposed by the other possible choices of rotation angle.

Fredrik · Jan 9, 2013

LastOneStanding said:

Fredrik, I realize this thread has gone off in a different direction, but I believe you've made an error in your reasoning at the end here and that the article is correct. For v along the ##x_1## axis and a rotation around that axis, the equation ##B(Dv)=R(D)B(v)R(D^{-1})## (which holds for all rotations) reduces to ##B(v)=R(D)B(v)R(D^{-1})##, as you noted. That means the equation:
##\begin{pmatrix}K & L\\ M & N\end{pmatrix} = \begin{pmatrix}K & LD'^{-1}\\ D'M & D'ND'^{-1}\end{pmatrix}##
must hold for arbitrary ##D'## in SO(2). You've just chosen particular values, which does not tell us anything. The way to proceed from here is to note that ##L=LD'^{-1}## is satisfied for arbitrary ##D'## in SO(2) only for ##L=0##, since in particular the equivalence after rotation by π requires ##L=-L##. Likewise for ##M##.

Thanks for taking the time to look at this. I'm not convinced yet, but it's possible that I drew the wrong conclusion from the result of the final step of my argument below.

I did realize that these equalities hold for arbitrary D' in SO(2). That implies that they hold for the specific choices of D' that make the calculations easy. So let's write
$$L=\begin{pmatrix}a & b\\ c & d\end{pmatrix}$$ and choose D' to be a rotation by ##\pi/2##. We have
$$\begin{pmatrix}a & b\\ c & d\end{pmatrix}=\begin{pmatrix}a & b\\ c & d\end{pmatrix}\begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}=\begin{pmatrix}b & -a\\ d & -c\end{pmatrix},$$
and now we immediately see that a=b, b=-a, c=d, and d=-c. These results imply that a=b=c=d=0, and therefore that L=0.

LastOneStanding said:

On the other hand, ##N = D'ND'^{-1}## does require that ##N## is a multiple of the unit matrix, as the article says. This follows from Schur's lemma, since ##N## commutes with every group representative of SO(2) in an irreducible representation. So the author's statement is correct.

I don't know Schur's lemma. It's one of these theorems I've come across several times but never really studied. I will take another look at it.

LastOneStanding said:

When you said that a rotation by π/2 requires:
##N=\begin{pmatrix}a & b\\ -b & a\end{pmatrix}##
which is not a multiple of the unit matrix, that is strictly true—but you are not taking into account all the other restrictions imposed by the other possible choices of rotation angle.

Right. That specific choice of D' gives us enough information about the relationship between the components of N to allow us to eliminate two of the variables from the notation. I didn't assume that a single choice of D' would give us enough information to conclude that N is a multiple of the identity. What I thought was that another choice of D' would give me additional information, and allow me to conclude that b=0. At the end of the post, I argued that this is impossible. The problem is that for all non-zero real numbers s,
$$N=\begin{pmatrix}a & b\\ -b & a\end{pmatrix}=s\begin{pmatrix}\frac a s & \frac b s\\ \frac{-b}{s} & \frac a s\end{pmatrix}.$$ In particular, this holds when ##s=\sqrt{a^2+b^2}##. With this choice of s, the right-hand side above is s times a matrix whose rows are orthonormal. This matrix is in SO(2), so it's a rotation by some angle λ. So we can write ##N=sR(\lambda)##, where ##R(\lambda)## denotes a rotation by λ. D' is a rotation by an arbitrary angle ##\theta##. We are free to choose that angle as we see fit, but no matter what we choose, we will always have
$$N=D'ND'^{-1}=R(\theta)sR(\lambda)R(-\theta)=sR(\theta+\lambda-\theta)=sR(\lambda)=N.$$ My interpretation of this was that no choice of θ can give me additional information about the components of N.

Maybe that's just the wrong conclusion. I looked at the Wikipedia article on Schur's lemma (link) and if I understand it correctly, it does imply that this N is a multiple of the identity.

Edit: I've been playing around with Wolfram Alpha a bit. It's pretty cool that I can check my (partial) result for N simply by typing

Code:

 {{cos v, -sin v},{sin v, cos v}}*{{a,b},{c,d}}*{{cos v, sin v},{-sin v, cos v}} where v=pi/2

and examining the result. Then I tried

Code:

{{cos v, -sin v},{sin v, cos v}}*{{a,b},{-b,a}}*{{cos v, sin v},{-sin v, cos v}} where v=pi/2

with several different choices of the angle v, and I just got the result
$$\begin{pmatrix}a & b\\ -b & a\end{pmatrix}$$ every time, and this tells me nothing.

I also see that Wikipedia requires the vector space to be complex. These things have brought me back to thinking that Giulini got that detail wrong in his article. Maybe Schur's lemma simply doesn't apply here?

VantagePoint72 · Jan 9, 2013

Yes, it looks like you're right. I'm not exactly sure how Schur's lemma requires the complex field as it seems pretty generally stated. But clearly something is wrong. Either way, I think it's a fair wager that this is exactly the mistake the author made.

Fredrik · Jan 9, 2013

OK, thank you LastOneStanding.

Back to the discussion about the assumptions that go into the "nothing but relativity" argument... I have realized that one thing that I've been taking for granted is questionable. Does the set of orthochronous transformations really form a subgroup? How do we even define "orthochronous" here? How about this? ##T:\mathbb R^n\to\mathbb R^n## is said to be orthochronous if ##(T(x))_0>(T(y))_0## for all ##x,y\in\mathbb R^n## such that ##x_0>y_0##. I don't see how this implies that the set of orthochronous maps is closed under composition.

This is a bit tricky even when we know that the group is the Lorentz group. I did that proof here. Unfortunately it's an old thread with several LaTeX errors (which were unnoticeable before the upgrade to MathJax).

Maybe there's no easier way to prove that there's a proper subgroup and an orthochronous subgroup, than to first determine the full group and then use that result to prove that the group has these subgroups. If that's the case, the list of assumptions will have to be changed.

Edit: The calculation that attempts to find the full group right away is a lot uglier. But I just realized that we can appeal to the principle of relativity and make the existence of an orthochronous subgroup one of our assumptions. If Alice agrees with Bob about the temporal order of any two events, and Bob agrees with Charlie about the temporal order of any two events, then Alice should agree with Charlie about the temporal order of any two events.

micromass · Jan 9, 2013

Fredrik said:

This is a bit tricky even when we know that the group is the Lorentz group. I did that proof here. Unfortunately it's an old thread with several LaTeX errors (which were unnoticeable before the upgrade to MathJax).

I fixed LaTeX in that post. If there are good posts whose LaTeX is messed up, then please report them.

Fredrik · Jan 9, 2013

OK, here's another attempt. (Just a sketch. I'm leaving out some details). We're looking for a non-trivial group G that's a subgroup of GL(ℝ²).

Assumption: There's a ##v:G\to\mathbb R## such that ##v(\Lambda)=-\frac{\Lambda_{10}}{\Lambda_{11}}## for all ##\Lambda\in G##.

This one implies that both ##\Lambda_{11}## and ##\Lambda_{00}## are non-zero. (The latter because ##\Lambda_{00}=(\det\Lambda)(\Lambda^{-1})_{11}##).

Assumptions: ##(\Lambda^{-1})_{00}=\Lambda_{00}## and ##(\Lambda^{-1})_{11}=\Lambda_{11}##.

Let ##\Lambda\in G## be arbitrary, and denote its components by a,b,c,d, as in post #8. The assumptions imply that ##\det\Lambda=\pm 1## and that ##d=(\det\Lambda)a##. It's now trivial to prove that ##G_p##, defined by ##G_p=\{\Lambda\in G|\det\Lambda>0\}##, is a subgroup of G.
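To fill in the step about the determinant (a short worked check that uses only the formula for ##\Lambda^{-1}## from post #8, with ##\Delta## introduced here as shorthand for ##\det\Lambda=ad-bc##): the two assumptions say that
$$(\Lambda^{-1})_{00}=\frac{d}{\Delta}=a,\qquad(\Lambda^{-1})_{11}=\frac{a}{\Delta}=d.$$ The first equation gives ##d=\Delta a##. Inserting this into the second gives ##a=\Delta d=\Delta^2 a##, and since ##a=\Lambda_{00}\neq 0##, we get ##\Delta^2=1##, i.e. ##\det\Lambda=\pm 1##.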

Assumption: The set ##G_o## defined by ##G_o=\{\Lambda\in G|\Lambda\text{ is orthochronous}\}## is a subgroup of ##G##.

The intersection of two subgroups is a subgroup, so the set H defined by ##H=G_o\cap G_p## is a subgroup. Let ##\Lambda## be an arbitrary member of H, and denote its components by a,b,c,d as above. We have d=a, and it's easy to show that a>0. (Use that ##\Lambda## is orthochronous).
$$\Lambda=a\begin{pmatrix}1 & b/a\\ c/a & 1\end{pmatrix}.$$ We define ##\gamma=a## and ##\alpha=b/a##. Since
$$\frac{c}{a}=\frac{c}{d}=-v(\Lambda),$$ it's convenient to also define v=-c/a. Now we have
$$\Lambda=\gamma\begin{pmatrix}1 & \alpha\\ -v & 1\end{pmatrix}.$$ From here, everything is the same as in post #8. We use that H is closed under matrix multiplication to prove that ##\alpha/v## has the same value for all ##\Lambda\in H## with v≠0. If we denote that value by -K, we have ##\alpha=-Kv##. Then we use the assumption about the 00 component to determine ##\gamma##.

This time, I can prove ##v(\Lambda^{-1})=-v(\Lambda)## as a theorem, but only because I added another assumption, the one about the 11 component. Apparently this has also enabled me to drop the assumption that the only member of H with zero velocity is the identity.

Fredrik · Jan 9, 2013

micromass said:

I fixed LaTeX in that post. If there are good posts whose LaTeX is messed up, then please report them.

Thank you. I usually do, but for some reason I chose not to do it with this one. I was probably just embarrassed about the number of LaTeX errors in there.

strangerep · Jan 10, 2013

LastOneStanding said:

[...] ##N = D'ND'^{-1}## does require that ##N## is a multiple of the unit matrix, as the article says. This follows from Schur's lemma, since ##N## commutes with every group representative of SO(2) in an irreducible representation.

That was also my initial reaction to Fredrik's opening post, but then I realized that's wrong. Take an arbitrary (identity-connected) matrix in SO(2), e.g.,
$$
D ~:=~ \begin{pmatrix}
\cos\theta & \sin\theta \cr
- \sin\theta & \cos\theta
\end{pmatrix}
$$ and then find all 2x2 matrices N which commute with D. The answer is any N of the form:
$$
N ~=~ \begin{pmatrix}
a & b \cr
- b & a \cr
\end{pmatrix}
$$
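This is easy to check numerically. The following is a minimal Python sketch (helper names and sample numbers are hypothetical): it builds ##D## for one sample angle, takes an ##N## of the stated form that is not a multiple of the identity, and compares ##ND## with ##DN##. A matrix with symmetric off-diagonal part is included for contrast.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation(theta):
    """The SO(2) matrix D from the post."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def commutator_norm(n, d):
    """Largest absolute entry of N D - D N."""
    nd, dn = matmul(n, d), matmul(d, n)
    return max(abs(nd[i][j] - dn[i][j]) for i in range(2) for j in range(2))

D = rotation(0.7)                      # arbitrary sample angle
N_good = [[2.0, 3.0], [-3.0, 2.0]]     # form [[a, b], [-b, a]], not a multiple of I
N_bad = [[2.0, 3.0], [3.0, 2.0]]       # not of that form, for contrast

print(commutator_norm(N_good, D))      # zero up to rounding
print(commutator_norm(N_bad, D))       # clearly non-zero
```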

strangerep · Jan 10, 2013

Fredrik said:

In 1+1 dimensions, spatial isotropy and reflection invariance are the same thing

Yes -- I'm thinking of the 3+1 case.

Just keep in mind that if you use an assumption like "isotropy", you must turn that assumption into a precise mathematical statement and include it on the list.

It means that the equations which define all possible transformations are invariant under arbitrary SO(3) rotations. Another way of saying this is that the defining equations do not involve a distinguished 3-vector parameter.

In the case of boosts, we assume that SO(3) by itself has already been found by a decomposition process (such as: "first find all the transformations that leave the time coordinate invariant"). Then fix an arbitrary direction in space ##\widehat u## (corresponding to a velocity direction), and derive the boost transformations for that case. Rotations around ##\widehat u## must leave the defining equations invariant.

Having found these transformations for a specific ##\widehat u##, apply general SO(3) rotations to get the general case. (I recall this is called "completing the orbit" in the language of Wigner's little group method.)

strangerep · Jan 10, 2013

Fredrik said:

The calculation that attempts to find the full group right away is a lot uglier.

Yes, and you're denying yourself the use of one powerful technique: if you decompose the transformations such that, for each type, you're trying to determine a 1-parameter Lie subgroup, then you can exploit the fact that this subgroup is Abelian.

So, in the case of boosts in 1+3D, one fixes a direction ##\widehat u## as I said before, and the parameter then becomes a real number ##u## multiplying the (fixed) 3-vector ##\widehat u##.

Then, imposing the requirement that transformations with different parameters ##u, u'## must commute, one determines quickly which of the unknown functions are in fact constant.

Actually, one can do even better if you've already established that the unknown functions are even in ##u##. [The Levy-Leblond paper we discussed in another thread shows how to do this -- see the discussion leading up to his equations (15a)-(15c)]. Then the appearance of that common factor ##\gamma## emerges as a consequence.

TrickyDicky · Jan 10, 2013

Fredrik said:

In my approach, ##v(\Lambda^{-1})=-v(\Lambda)## is the spatial isotropy assumption. (In 1+1 dimensions, spatial isotropy and reflection invariance are the same thing).

But why are you using 1+1 dimensions? Even if it is usually true that the one dimensional case generalizes to the higher dimensional case this is not always so.

Fredrik · Jan 10, 2013

TrickyDicky said:

But why are you using 1+1 dimensions? Even if it is usually true that the one dimensional case generalizes to the higher dimensional case this is not always so.

1. I expect it to be much easier than the 3+1-dimensional case.
2. I expect to be able to use some of what I've found studying the 1+1-dimensional case when I do the 3+1-dimensional case.
3. Most of the interesting stuff about relativity is present in 1+1 dimensions. (Time dilation, length contraction, relativity of simultaneity, the twin paradox,...)

Fredrik · Jan 10, 2013

strangerep said:

It means that the equations which define all possible transformations are invariant under arbitrary SO(3) rotations. Another way of saying this is that the defining equations do not involve a distinguished 3-vector parameter.

I agree, but I think these statements are too vague. They are certainly more precise than e.g. "space is the same in all directions", but I want all of my assumptions to be exact mathematical statements.

strangerep said:

In the case of boosts, we assume that SO(3) by itself has already been found by a decomposition process (such as: "first find all the transformations that leave the time coordinate invariant"). Then fix an arbitrary direction in space ##\widehat u## (corresponding to a velocity direction), and derive the boost transformations for that case. Rotations around ##\widehat u## must leave the defining equations invariant.

Having found these transformations for a specific ##\widehat u##, apply general SO(3) rotations to get the general case. (I recall this is called "completing the orbit" in the language of Wigner's little group method.)

I expect that I will have to do something like this in the 3+1-dimensional case.

strangerep said:

Yes, and you're denying yourself the use of one powerful technique: if you decompose the transformations such that, for each type, you're trying to determine a 1-parameter Lie subgroup, then you can exploit the fact that this subgroup is Abelian.

So, in the case of boosts in 1+3D, one fixes a direction ##\widehat u## as I said before, and the parameter then becomes a real number ##u## multiplying the (fixed) 3-vector ##\widehat u##.

Note that in the 1+1-dimensional case, I don't have to assume anything about parameters. I've been hoping that I won't have to in the 3+1-dimensional case either, but I have realized that at the very least, I have to assume that there's a subgroup corresponding to rotations.

I may continue to deny myself the use of some powerful techniques, because if it's possible, I'd like to find a derivation that can be understood by people who haven't studied some of the more advanced stuff. If I explain what a group is, my arguments for the 1+1-dimensional case can be understood by anyone who has taken a course in linear algebra.

If we do make assumptions about parameters, we have to make them mathematically precise, and explain how they can be thought of as precise statements of some aspect of the principle of relativity or isotropy (or maybe invariance under spatial reflections or invariance under time-reversal). It's not clear to me how to do this, but I haven't spent a lot of time on the 3+1-dimensional case yet.

strangerep said:

The Levy-Leblond paper we discussed in another thread shows how to do this -- see the discussion leading up to his equations (15a)-(15c).

I don't have an easy way to access all of the references. There's a university library I can use, but I have to physically go there. I might do that next week.

Fredrik · Jan 10, 2013

I'm quite pleased with the derivation and the assumptions in #17. Note that none of my assumptions is about invariance under spatial reflections or time reversal!

The only problem with #17 is that it doesn't show how to determine G from the restricted group that we find at the end. I could use some help with that. These are some of my thoughts:

I defined ##G_p=\{\Lambda\in G|\det\Lambda>0\}##. Now define
$$P=\begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}.$$ For all ##\Lambda\in G##, we have ##\det(P\Lambda)=-\det(\Lambda)##. If ##P\in G##, then this implies that ##G-G_p=\{P\Lambda|\Lambda\in G_p\}##, so we have ##G=G_p\cup\{P\Lambda|\Lambda\in G_p\}##.

But what if ##P\notin G##? Then ##\Lambda\in G_p## doesn't imply that ##P\Lambda\in G## (at least not in an obvious way), so ##G-G_p## may be a proper subset of ##\{P\Lambda|\Lambda\in G_p\}##. The statement that ##P\in G## can be thought of as a mathematically precise version of "space is the same in both directions", so I thought that ##P\notin G## would imply that ##G=G_p##. I'm thinking that if space is not invariant under spatial reflections, then our final result for G should be the proper Galilean group or proper Lorentz group.

So is it possible to prove that ##P\notin G## implies that ##G-G_p=\emptyset##?

Edit: I may have figured this out. I noticed something 5 seconds after posting that may or may not solve the problem. Thinking about it now...

It turned out to be quite easy. Suppose that ##P\notin G##. I will prove that ##G-G_p## is empty by deriving a contradiction from the assumption that it's not. So suppose that it's not. Let ##\Lambda\in G-G_p## be arbitrary. Then ##\det(\Lambda)=-1##, and ##\det(P\Lambda)=1##. So ##P\Lambda\in G_p##. Define ##\Lambda'=P\Lambda##. We have ##P=\Lambda'\Lambda^{-1}\in G##, and this contradicts the assumption that ##P\notin G##.

So it is easy to determine ##G## from ##G_p##. If ##P\notin G## (if the two directions of space are not equivalent), we have ##G=G_p##. If ##P\in G## (if the two directions of space are equivalent), we have ##G=G_p\cup\{P\Lambda|\Lambda\in G_p\}##.

Fredrik · Jan 10, 2013

Now there's only one thing remaining. We need to do something similar with the orthochronous subgroup ##G_o##. I would like to define
$$T=\begin{pmatrix}-1 & 0\\ 0 & 1\end{pmatrix},$$ and proceed as we did with ##G_p## and ##P##. But to do this, we need to prove that ##T\Lambda## is orthochronous if and only if ##\Lambda## is not. I'm not convinced that this is possible. I have proved one of the implications, so I'll post that proof here. But I still need to think about the other one.

Suppose that ##\Lambda## is orthochronous. Then
$$\Lambda_{00}=\left(\Lambda\begin{pmatrix}1\\ 0\end{pmatrix}\right)_0 >\left(\Lambda\begin{pmatrix}0\\ 0\end{pmatrix}\right)_0=0.$$ We need to prove that there exist ##x,y\in\mathbb R^2## such that ##x_0>y_0## and ##(T\Lambda x)_0\leq (T\Lambda y)_0##. The choice
$$x=\begin{pmatrix}1\\ 0\end{pmatrix},\quad y=\begin{pmatrix}0\\ 0\end{pmatrix}$$ will get the job done.
$$\left(T\Lambda\begin{pmatrix}1\\ 0\end{pmatrix}\right)_0=-\Lambda_{00}<0=\left(T\Lambda\begin{pmatrix}0\\ 0\end{pmatrix}\right)_0.$$

strangerep · Jan 10, 2013

Fredrik said:

I expect that I will have to do something like this in the 3+1-dimensional case.

I just realized it's even more straightforward than I thought. The whole point of this exercise is to find a group of coordinate transformations that preserve inertial motion, i.e., for which the original and new velocities are constant. (For this we don't need to add in the extra red herring about straight lines in 3-space.)

So... after identifying the general group as affine, one can proceed by first finding all transformations which leave the velocity unchanged. Possibly a first step in this is to find the transformations which preserve zero velocity. Then progress to preserving nonzero velocity. Somewhere in there, space and time translations will emerge, so they can be factored out in subsequent steps by restricting to transformations which preserve the spatiotemporal origin.

Then progress to transformations which change the velocity -- this can be split into 2 cases: preserving velocity magnitude, and preserving velocity direction. This makes the rotation subgroup come out in the wash. Probably at this point, the requirement of preserving spatial angles at the origin must be invoked, else you get squeezing transformations which scissor the axes in some way.

Then progress to transformations which change the velocity. Change of velocity direction has already been done, hence only change of velocity magnitude need be done in this step. This is also why the 1+1D case is useful: by performing an orthogonal decomposition with respect to an arbitrarily chosen velocity direction, one can re-use much of the boost calculation from the 1+1D case. :-)

I don't have an easy way to access all of the references. There's a university library I can use, but I have to physically go there. I might do that next week.

The Levy-Leblond paper is accessible online: type in "levy-leblond 1976 lorentz" to Google Scholar. ;-)

EDIT: BTW, your 1st assumption in post #17 could be turned into something more convincing by using instead an assumption that ##v## does indeed correspond to our intuitive notion of what we mean by "relative velocity" between the old and new frames. See the Levy-Leblond paper near his eq(9).

Fredrik · Jan 10, 2013

strangerep said:

I just realized it's even more straightforward than I thought. The whole point of this exercise is to find a group of coordinate transformations that preserve inertial motion, i.e., for which the original and new velocities are constant. (For this we don't need to add in the extra red herring about straight lines in 3-space.)

I find that last comment very odd. The requirement that each transformation must take straight lines to straight lines is far more obvious than the (equivalent) requirement that each transformation is an affine map. An inertial coordinate system is supposed to be a coordinate system that takes the world lines of non-accelerating particles to straight lines. This makes it extremely natural to assume that for all global inertial coordinate systems ##x,y:M\to\mathbb R^4## (where M is spacetime), ##x\circ y^{-1}## is a permutation of ##\mathbb R^4## that takes straight lines to straight lines. This is the natural starting point for all of this.

strangerep said:

The Levy-Leblond paper is accessible online: type in "levy-leblond 1976 lorentz" to Google Scholar. ;-)

Thanks for this tip. I was sure I had tried that a few weeks ago and found nothing, but now I see several different pdf files. Maybe I was looking for another Levy-Leblond paper that time.

strangerep said:

EDIT: BTW, your 1st assumption in post #17 could be turned into something more convincing by using instead an assumption that ##v## does indeed correspond to our intuitive notion of what we mean by "relative velocity" between the old and new frames. See the Levy-Leblond paper near his eq(9).

I think the assumption is perfect as it is. If S and S' are global inertial coordinate systems, then the number that our intuition tells us to call the velocity of S' in S, is 1 divided by the slope of the t' axis in a spacetime diagram for S, with the t axis drawn in the "up" direction. If ##\Lambda## is the transformation from S to S', then it's natural to also call that number the velocity of ##\Lambda##. To find it, we apply ##\Lambda^{-1}## to the S' coordinate pair of an arbitrary point on the t' axis, e.g. ##\begin{pmatrix}1\\ 0\end{pmatrix}##. This gives us the S coordinates of that point, ##\begin{pmatrix}(\Lambda^{-1})_{00}\\ (\Lambda^{-1})_{10}\end{pmatrix}##. Since the t' axis is a straight line through 0, this means that the number that we want to call "the velocity of ##\Lambda##" is
$$\frac{(\Lambda^{-1})_{10}}{(\Lambda^{-1})_{00}}=-\frac{\Lambda_{10}}{\Lambda_{11}},$$ as I already concluded in post #8.

This only explains why we like to use the word "velocity" for this number. This is something that should be explained before the assumptions. (This is what I tried to do at the beginning of #8, but perhaps I didn't include enough details). The actual assumption says that there's a ##v:G\to\mathbb R## such that ##v(\Lambda)=-\Lambda_{10}/\Lambda_{11}## for all ##\Lambda\in G##. This is the mathematical statement that makes the idea "every one of these transformations has a velocity" precise.
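The identity between the two expressions for the velocity can be checked with a few lines of Python. This is only a sketch: the function names and the two sample matrices are hypothetical, and any invertible matrix with non-zero 11 and inverse-00 components would do.

```python
def inverse(lam):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = lam
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def velocity_from_axis(lam):
    """dx/dt of the line through 0 and Lambda^{-1} e_0 (the t' axis seen in S)."""
    inv = inverse(lam)
    t, x = inv[0][0], inv[1][0]   # S coordinates of the S' point (1, 0)
    return x / t

def velocity_from_components(lam):
    """The closed form -Lambda_10 / Lambda_11."""
    return -lam[1][0] / lam[1][1]

boost = [[1.25, -0.75], [-0.75, 1.25]]   # Lorentz boost with v = 0.6 in units with c = 1
generic = [[2.0, 1.0], [3.0, 4.0]]       # arbitrary invertible matrix

for lam in (boost, generic):
    print(velocity_from_axis(lam), velocity_from_components(lam))
```

For the boost both numbers come out as 0.6 up to rounding, which is the velocity one would read off a spacetime diagram.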

strangerep · Jan 10, 2013

Fredrik said:

I find that last comment very odd. The requirement that each transformation must take straight lines to straight lines is far more obvious than the (equivalent) requirement that each transformation is an affine map. An inertial coordinate system is supposed to be a coordinate system that takes the world lines of non-accelerating particles to straight lines. This makes it extremely natural to assume that for all global inertial coordinate systems ##x,y:M\to\mathbb R^4## (where M is spacetime), ##x\circ y^{-1}## is a permutation of ##\mathbb R^4## that takes straight lines to straight lines. This is the natural starting point for all of this.

The physical starting point is that $$
\frac{d^2x'^{\,i}}{dt'^{\,2}} ~=~ 0 ~=~ \frac{d^2x^i}{dt^2} ~,
$$ i.e., ##v'^{\,i}## and ##v^i## are both (triplets of) constants.
There is no mention of (e.g.,) dx/dy = const in this.

That we have a Euclidean 3-space near the origin can be handled by angle-preservation at the origin according to a Euclidean 3-metric. Then, once we've found that ordinary translations are part of the group that preserves zero acceleration, preservation of straight lines in spatial 3-space follows as a result, afaict.

Anyway, from the rest of your reply I get the feeling we're approaching this with different emphases. (Mine is the physical emphasis of what a local observer can do.)

Fredrik · Jan 10, 2013

I am now convinced that my definition of "orthochronous" is inappropriate. I was thinking that it should mean that the transformation preserves the temporal order of events in the sense that if ##x,y\in\mathbb R^2## are such that ##x_0>y_0##, then ##(\Lambda x)_0>(\Lambda y)_0##. But that inequality is equivalent to
$$0<(\Lambda x)_0-(\Lambda y)_0=(\Lambda(x-y))_0=\Lambda_{00}(x_0-y_0)+\Lambda_{01}(x_1-y_1),$$ and unless ##\Lambda_{01}=0##, this inequality can be violated by some choices of ##x_1,y_1##. Of course, now that I've realized that, I see that I should have expected this, because an orthochronous Lorentz transformation preserves the temporal order of timelike separated events, not the temporal order of all events.
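A concrete instance, as a minimal Python sketch with hypothetical sample numbers: for a Lorentz boost with velocity 0.6 (units with c=1), the time order relative to the origin is kept for a timelike separated event but reversed for a spacelike separated one.

```python
# Lorentz boost with v = 0.6, gamma = 1.25, in units with c = 1 (sample choice).
lam = [[1.25, -0.75], [-0.75, 1.25]]

def time_component(lam, x):
    """(Lambda x)_0 for a 2x2 matrix and an event x = (x_0, x_1)."""
    return lam[0][0] * x[0] + lam[0][1] * x[1]

x_timelike = (1.0, 0.5)    # x_0 > 0 and |x_1| < |x_0|
x_spacelike = (0.1, 1.0)   # x_0 > 0 but |x_1| > |x_0|

# The comparison event is y = (0, 0), whose image has time component 0.
print(time_component(lam, x_timelike))    # positive: order preserved
print(time_component(lam, x_spacelike))   # negative: order reversed
```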

This is a pretty annoying complication. I guess I will have to define "timelike separated" half-way through the calculation instead of after it.

Edit: Waaait a minute...I think I see a way to deal with this. I said earlier that the calculations in #17 and #8 get ugly unless we focus on the restricted subgroup, but most of the ugliness is from not focusing on the subgroup of proper transformations. So I think the way to go here is to not make any assumptions about the existence of an orthochronous subgroup. We just write
$$\Lambda=\sigma\gamma\begin{pmatrix}1 & \alpha\\ -v & 1\end{pmatrix},$$ where ##\gamma=|a|##, ##\sigma=\operatorname{sgn}(a)##, and then we proceed as in post #8. The sigma doesn't contribute to the calculation that determines that ##\alpha/v## has the same value K for all ##\Lambda\in G## with v≠0. So we're going to find the whole group G without any complications.

I think I have now worked out everything in the 1+1-dimensional case, except how to go from "preserves constant velocity lines" to "preserves straight lines".

Fredrik · Jan 10, 2013

strangerep said:

The physical starting point is that $$
\frac{d^2x'^{\,i}}{dt'^{\,2}} ~=~ 0 ~=~ \frac{d^2x^i}{dt^2} ~,
$$ i.e., ##v'^{\,i}## and ##v^i## are both (triplets of) constants.
There is no mention of (e.g.,) dx/dy = const in this.

OK, agreed. This is even more natural than the requirement about straight lines. But if this requirement doesn't imply the one about straight lines, I think we're just going to have to introduce other mathematical assumptions inspired by the principle of relativity until we have enough to derive the statement about straight lines.

strangerep said:

Anyway, from the rest of your reply I get the feeling we're approaching this with different emphases. (Mine is the physical emphasis of what a local observer can do.)

Yes, I don't care about local stuff at this point. I'm only concerned with theories of physics in which spacetime is a 4-dimensional vector space (or affine space), and the set of functions that change coordinates from one global inertial coordinate system to another is a group that satisfies mathematical statements inspired by the principle of relativity and symmetry ideas (translation, isotropy and maybe parity and/or time reversal).

strangerep · Jan 11, 2013

Fredrik said:

I think I have now worked out everything in the 1+1-dimensional case, except how to go from "preserves constant velocity lines" to "preserves straight lines".

In the special case of 1+1D they're the same, aren't they?

Fredrik · Jan 11, 2013

strangerep said:

In the special case of 1+1D they're the same, aren't they?

Yes. I started thinking about that an hour after making that comment, and it turned out to be very easy to prove that "preserves lines with zero coordinate acceleration" implies "preserves straight lines that aren't parallel to the x axis". And since we're assuming bijectivity, this means that the lines parallel to the x axis, i.e. the ones with infinite speed, are preserved as well. For example, the x-axis is sent to the infinite-speed line through ##\Lambda(0)##.

So unless I've made a blunder somewhere, I now have a complete proof for the 1+1-dimensional case.

For the 3+1-dimensional case, the only part of the proof that I have completed is the theorem that says that if X is a finite-dimensional vector space such that 2≤dim X<∞, and T:X→X is a bijection that takes straight lines to straight lines, then T is an affine map. I have also done some calculations, like finding a boost for an arbitrary velocity from a boost with a velocity in the 1 direction. But I still haven't decided what exactly I want my assumptions to be.

Fredrik · Jan 12, 2013

Fredrik said:

I defined ##G_p=\{\Lambda\in G|\det\Lambda>0\}##. Now define
$$P=\begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}.$$ For all ##\Lambda\in G##, we have ##\det(P\Lambda)=-\det(\Lambda)##. If ##P\in G##, then this implies that ##G-G_p=\{P\Lambda|\Lambda\in G_p\}##, so we have ##G=G_p\cup\{P\Lambda|\Lambda\in G_p\}##.

But what if ##P\notin G##?
...
...is it possible to prove that ##P\notin G## implies that ##G-G_p=\emptyset##?

Edit: ...
Suppose that ##P\notin G##. I will prove that ##G-G_p## is empty by deriving a contradiction from the assumption that it's not. So suppose that it's not. Let ##\Lambda\in G-G_p## be arbitrary. Then ##\det(\Lambda)=-1##, and ##\det(P\Lambda)=1##. So ##P\Lambda\in G_p##. Define ##\Lambda'=P\Lambda##. We have ##P=\Lambda'\Lambda^{-1}\in G##, and this contradicts the assumption that ##P\notin G##.

I have found two mistakes in my proof for the 1+1-dimensional case. What I said in the quote above is wrong. If ##P\notin G## and ##\Lambda\in G-G_p##, then there's no reason to think that ##P\Lambda\in G##.

Maybe I'm missing something obvious, but I'm starting to think that ##P\notin G## doesn't imply that ##G=G_p##. Is it possible that ##P\notin G## (the mathematical statement corresponding to the idea that the laws of physics are not invariant under spatial reflections) only implies that ##G_p## doesn't determine ##G##? Then the final conclusion won't be that "G+translations" is either the Galilean group or the Poincaré group. It will be that it's a group that has the restricted version of one of those two as a subgroup.

Edit: Nope, the correct final conclusion is that "G+translations" is a subgroup of the Galilean group or the Poincaré group. This is clear when I work with G the whole way instead of restricting my attention to a subgroup. The calculation turned out to be a lot less ugly than I thought it would be, so I will do it this way instead. There's no reason to consider the subgroups as early in the calculation as I did.
The other mistake is in the proof that's supposed to rule out the possibility that K<0. I will have to take another look at that.

Fredrik · Jan 12, 2013

New version of (the main part of) the argument for the 1+1-dimensional case.

We're looking for a non-trivial subgroup of GL(ℝ²), whose members are functions that change coordinates from one global inertial coordinate system to another. We are going to make a small number of assumptions about this group. They will be mathematical statements that can be thought of as making the following ideas precise.

1. Each inertial observer has a finite velocity in the coordinates used by another.
2. For each event x, there are at least two velocities at which an experiment performed at x can test the accuracy of predictions.
3. For each event x, statements that two arbitrary inertial observers at x make about how the other guy's measurements relate to his own coordinate assignments must be symmetrical. (For example, if one of them can say "Your clock is slow by a factor of ##\gamma##.", the other one must be able to say the same thing).
4. Positive velocities do not add up to negative velocities unless there's a reflection involved.

It's easy to make the first idea precise, if we first define the velocity of an arbitrary ##\Lambda## in the group. Suppose that S and S' are global inertial coordinate systems, and that ##\Lambda## is the transformation from S to S'. We are going to define the velocity of ##\Lambda## as the "##\Delta x/\Delta t##" of the line that ##\Lambda^{-1}## sends the 0 axis to, because that line is the image in S of the world line of the S' observer. It's also the unique straight line through 0 and ##\Lambda^{-1}e_0##. $$\Lambda^{-1}e_0 =\Lambda^{-1}\begin{pmatrix}1\\ 0\end{pmatrix} =\begin{pmatrix}(\Lambda^{-1})_{00}\\ (\Lambda^{-1})_{10}\end{pmatrix}.$$ So we define the velocity of ##\Lambda## as ##(\Lambda^{-1})_{10}/(\Lambda^{-1})_{00}##.

The second idea can be made precise by requiring that there's a member of the group that has a non-zero velocity. (This assumption is necessary in the step that determines the relationship between the off-diagonal components).

The third idea can be made precise by requiring that ##(\Lambda^{-1}e_\mu)_\mu=(\Lambda e_\mu)_\mu## for all ##\mu\in\{0,1\}##. This is equivalent to saying that ##\Lambda^{-1}## and ##\Lambda## have the same diagonal elements.

Theorem: Suppose that ##G## is a subgroup of ##\operatorname{GL}(\mathbb R^2)## such that the following statements are true.

1. There's a ##V:G\to\mathbb R## such that ##V(\Lambda)=\frac{(\Lambda^{-1})_{10}}{(\Lambda^{-1})_{00}}## for all ##\Lambda\in G##.
2. ##V(G)\neq\{0\}##.
3. For all ##\Lambda\in G## and all ##\mu\in\{0,1\}##, we have ##(\Lambda^{-1})_{\mu\mu}=\Lambda_{\mu\mu}##.
4. For all ##\Lambda',\Lambda''\in G## with positive velocities and positive determinants, we have ##V(\Lambda'\Lambda'')>0##.

Then there's a ##K\geq 0## such that
$$G\subset\left\{\left.\frac{\sigma}{\sqrt{1-Kv^2}}\begin{pmatrix}1 & -Kv\\ -\rho v & \rho\end{pmatrix}\right|\ \sigma,\rho\in\{-1,1\},\ v\in\mathbb R,\ 1-Kv^2>0\right\}.$$
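As a sanity check on the statement (not part of the proof), here is a minimal Python sketch that builds the four matrices on the right-hand side for one sample pair (K, v) and checks two things: that each has the same diagonal as its inverse, and that its velocity, computed as ##(\Lambda^{-1})_{10}/(\Lambda^{-1})_{00}## as in the prose definition above, equals v. The function names and sample numbers are hypothetical.

```python
import math
from itertools import product

def member(v, K, sigma, rho):
    """sigma/sqrt(1 - K v^2) * [[1, -K v], [-rho v, rho]]."""
    g = sigma / math.sqrt(1.0 - K * v * v)
    return [[g, -g * K * v], [-g * rho * v, g * rho]]

def inverse(lam):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = lam
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def velocity(lam):
    """(Lambda^{-1})_10 / (Lambda^{-1})_00."""
    inv = inverse(lam)
    return inv[1][0] / inv[0][0]

K, v = 1.0, 0.6      # sample values with 1 - K v^2 > 0
checks = []
for sigma, rho in product((-1, 1), repeat=2):
    lam = member(v, K, sigma, rho)
    inv = inverse(lam)
    same_diagonal = all(abs(inv[i][i] - lam[i][i]) < 1e-12 for i in range(2))
    checks.append((sigma, rho, same_diagonal, velocity(lam)))

for row in checks:
    print(row)
```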

Proof: Let ##\Lambda\in G## be arbitrary. Denote its components by a,b,c,d. We have
$$\Lambda=\begin{pmatrix}a & b\\ c & d\end{pmatrix},\quad \Lambda^{-1}=\frac{1}{ad-bc}\begin{pmatrix}d & -b\\ -c & a\end{pmatrix}.$$ Note that
$$V(\Lambda)=\frac{(\Lambda^{-1})_{10}}{(\Lambda^{-1})_{00}}=-\frac{c}{d},\qquad V(\Lambda^{-1})=\frac{\Lambda_{10}}{\Lambda_{00}}=\frac{c}{a}.$$ This means that assumption 1 implies that a≠0, d≠0. Assumption 3 implies that
$$a=\frac{d}{\det\Lambda},\quad d=\frac{a}{\det\Lambda}=\frac{d}{(\det\Lambda)^2}.$$ Since d≠0, this implies that ##\det\Lambda=\pm 1##.

Define ##\rho=\det\Lambda##, ##\gamma=|a|##, ##\sigma=\operatorname{sgn}(a)##, ##\alpha=b/a## and ##v=-\rho c/a##. Note that since ##a=\rho d##, this v is the velocity of ##\Lambda##.
$$\Lambda=\begin{pmatrix}a & b\\ c & \rho a\end{pmatrix} =a\begin{pmatrix}1 & b/a\\ c/a & \rho\end{pmatrix} =\sigma\gamma\begin{pmatrix}1 & \alpha\\ -\rho v & \rho \end{pmatrix}.$$ Let ##\Lambda',\Lambda''\in G## be arbitrary.
\begin{align}
&G\ni \Lambda'\Lambda'' =\sigma'\sigma''\gamma'\gamma''\begin{pmatrix}1 & \alpha'\\ -\rho' v' & \rho'\end{pmatrix}\begin{pmatrix}1 & \alpha''\\ -\rho'' v'' & \rho''\end{pmatrix} =\sigma'\sigma''\gamma'\gamma''\begin{pmatrix}1-\alpha'\rho''v'' & \alpha''+\alpha'\rho''\\ -\rho' v'-\rho'\rho''v'' & -\rho' v'\alpha''+\rho'\rho''\end{pmatrix}\\
&\rho'\rho'' =(\det\Lambda')(\det\Lambda'') =\det(\Lambda'\Lambda'')
=\frac{(\Lambda'\Lambda'')_{11}}{(\Lambda'\Lambda'')_{00}} =\frac{-\rho' v'\alpha''+\rho'\rho''}{1-\alpha'\rho''v''}.\end{align} Suppose that ##\rho'\rho''=1##. Then we have ##\rho'=\rho''## and
$$1=\frac{-\rho'v'\alpha''+1}{1-\alpha'\rho''v''}.$$ This is equivalent to ##\alpha'\rho''v'' =\rho'v'\alpha''##. Since ##\rho'=\rho''##, this is equivalent to ##\alpha' v''=v'\alpha''##.

Now suppose that ##\rho'\rho''=-1##. Then we have ##\rho'=-\rho''## and $$-1=\frac{-\rho'v'\alpha''-1}{1-\alpha'\rho''v''}.$$ This is equivalent to ##\alpha'\rho''v''=-\rho'v'\alpha''##. Since ##\rho'=-\rho''##, this is equivalent to ##\alpha' v''=v'\alpha''##.

So ##\alpha'v''=v'\alpha''## for all ##\Lambda',\Lambda''\in G##. If every member of ##G## has velocity 0, then this result tells us nothing. This is why we included assumption 2. It ensures that the result we just obtained implies the following.

1. For all ##\Lambda'\in G##, if ##v'=0##, then ##\alpha'=0##.
2. For all ##\Lambda',\Lambda''\in G## such that ##v'\neq 0## and ##v''\neq 0##,
$$\frac{\alpha''}{v''}=\frac{\alpha'}{v'}.$$

The second result implies that ##\alpha'/v'## has the same value for all ##\Lambda'\in G## such that ##v'\neq 0##. Denote this value by -K. The second result implies that if ##v\neq 0##, then ##\alpha=-Kv##. The first result implies that ##\alpha=-Kv## also when ##v=0##.

These results imply that
$$\Lambda=\sigma\gamma\begin{pmatrix}1 & -Kv\\ -\rho v & \rho\end{pmatrix},\quad\Lambda^{-1}=\frac{1}{\sigma\gamma(\rho-\rho Kv^2)}\begin{pmatrix}\rho & Kv\\ \rho v & 1\end{pmatrix} =\frac{\sigma}{\gamma(1-Kv^2)}\begin{pmatrix}1 & \rho Kv\\ v & \rho\end{pmatrix}.$$ Now assumption 3 tells us that
$$\sigma\gamma=\frac{\sigma}{\gamma(1-Kv^2)}.$$ Note that if K>0, this implies that ##1-Kv^2>0## (because ##\gamma^2>0##). If we define ##c=1/\sqrt{K}##, this is equivalent to ##v\in(-c,c)##.

Since ##\gamma=|\Lambda_{00}|>0##, the result above implies that
$$\gamma=\frac{1}{\sqrt{1-Kv^2}},$$ and therefore that
$$\Lambda=\frac{\sigma}{\sqrt{1-Kv^2}}
\begin{pmatrix}1 & -Kv\\ -\rho v & \rho\end{pmatrix}.$$ We will prove that K≥0 by deriving a contradiction from the assumption that K<0. So suppose that K<0.

We will prove that there's a ##\bar\Lambda\in G## such that ##\det\bar\Lambda>0## and ##V(\bar\Lambda)>0##. Let ##\Lambda## be an arbitrary member of G such that v≠0. (Assumption 2 ensures that such a ##\Lambda## exists). If ##\det\Lambda>0## and ##v>0##, we just define ##\bar\Lambda=\Lambda##. If ##\det\Lambda>0## and ##v<0##, then ##\det\Lambda^{-1}>0##, and ##V(\Lambda^{-1})=-v>0##. So we can just define ##\bar\Lambda=\Lambda^{-1}##. Now suppose that ##\det\Lambda<0##. For all ##\Lambda',\Lambda''\in G##,
$$V(\Lambda'\Lambda'') =-\frac{(\Lambda'\Lambda'')_{10}}{ (\Lambda'\Lambda'')_{00}} =\frac{\rho'v'+\rho''v''}{1-|K|v'v''}.$$ This implies that
$$V(\Lambda^2)=\rho\frac{2v}{1-|K|v^2}\neq 0.$$ We also have ##\det(\Lambda^2)=(\det\Lambda)^2=1##, so if ##V(\Lambda^2)>0##, we can define ##\bar\Lambda=\Lambda^2##. If ##V(\Lambda^2)<0##, then we can define ##\bar\Lambda=(\Lambda^2)^{-1}##.

Now let ##\Lambda## be an arbitrary member of ##G## such that ##\det\Lambda>0## and ##v>0##. Define ##\beta=\sqrt{|K|}v## and ##c=1/\sqrt{|K|}##. Let ##\theta## be the unique member of ##(-\pi/2,\pi/2)## such that ##\tan\theta=\beta##.\begin{align}
&\cos\theta>0\\
&1=\cos^2\theta+\sin^2\theta=\cos^2\theta(1+\beta^2)\\
&\cos\theta=\frac{1}{\sqrt{1+\beta^2}}\\
&\sin\theta =\frac{\sin\theta}{\cos\theta}\cos\theta =\frac{\beta}{\sqrt{1+\beta^2}}\\
&\Lambda =\frac{\sigma}{\sqrt{1-Kv^2}}
\begin{pmatrix}1 & -Kv\\ -v & 1\end{pmatrix} =\frac{\sigma}{\sqrt{1-Kv^2}}
\begin{pmatrix}1 & \beta/c\\ -c\beta & 1\end{pmatrix}
=\sigma\begin{pmatrix}\cos\theta & \frac{1}{c}\sin\theta\\ -c\sin\theta & \cos\theta\end{pmatrix}
\end{align} Denote the right-hand side by ##T(\theta)##. The above implies that
$$\Lambda^2 =T(\theta)^2 =T(2\theta),$$ and (by induction) that for all ##n\in\mathbb Z^+##,
$$\Lambda^n=T(n\theta).$$ Let ##n\in\mathbb Z^+## be such that ##n\theta\in(\pi/2,\pi)##. (Such an n exists because ##0<\theta<\pi/2##.)
$$V(\Lambda^n)=V(T(n\theta))=\frac{c\sin(n\theta)}{\cos(n\theta)}<0.$$ This contradicts assumption 4.
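The sign flip can be seen numerically. The following minimal Python sketch uses hypothetical sample values K=-1, v=0.5 and ##\sigma=\rho=1##; it multiplies ##\Lambda## by itself and records the velocity ##-\Lambda_{10}/\Lambda_{11}## of each power. Here ##\theta=\arctan 0.5\approx 0.46##, so ##4\theta\approx 1.85## is the first multiple that lies between ##\pi/2## and ##\pi##.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def velocity(lam):
    """-Lambda_10 / Lambda_11 (equals V(Lambda) when det Lambda = 1)."""
    return -lam[1][0] / lam[1][1]

K, v = -1.0, 0.5                          # sample values with K < 0
g = 1.0 / math.sqrt(1.0 - K * v * v)
lam = [[g, -g * K * v], [-g * v, g]]      # sigma = rho = 1

power = lam
velocities = [velocity(power)]            # velocities of Lambda^1 .. Lambda^4
for _ in range(3):
    power = matmul(power, lam)
    velocities.append(velocity(power))

print(velocities)   # the first three are positive, the fourth is negative
```

With these numbers the velocities are tan(θ), tan(2θ), tan(3θ) and tan(4θ), i.e. 1/2, 4/3, 11/2 and -24/7 up to rounding.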
