# Simplest derivation of Lorentz Transformation

1. Nov 10, 2012

### absurdist89

I'm just getting started on relativity. I watched this a couple of days ago -

But I didn't like the way Lorentz Transformation was derived (the assumption about the nature of the final transformations, to be more specific). I tried reading Einstein's original paper for a better derivation but it was kind of hard for me (I'm not really very comfortable with differential equations). Then I found this - http://arxiv.org/pdf/physics/0606103v4.pdf

Is the derivation in the paper correct?

I'm only interested in movement along x-direction, not any arbitrary velocity. Basically, the derivation goes like this -

1) Consider a light clock with two mirrors facing each other a distance L0 apart. Each unit of time corresponds to a cycle of light pulse getting fired from the first mirror, getting reflected and returning. Time period of this clock in rest frame T0 = 2L0/c

2) Let's say the clock is moving with speed v in a direction parallel to the plane of the mirrors. Time period of the moving clock from rest frame is T = T0/p where p = sqrt(1 - v**2/c**2).

3) Consider another clock moving with same speed v but in a direction perpendicular to the plane of mirrors. Here, the paper makes an interesting claim - "because the observer moving with the clocks sees that the clocks tick at the same rate, so should the observer at rest". So the time period of the second clock too is T in the rest frame. Is this necessarily true?

4) The only way 3 can happen is if L (length of moving clock in rest frame) = L0 * p.

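5) From here the paper derives the transformation using x = vt + p*x' and p*x = vt' + x' (by step 4, a length that is x' in the moving frame is measured as p*x' in the rest frame, and vice versa).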

The claim in 3 seems reasonable to me because I can't find a reason to think that two phenomena which happen to take the same time in a moving frame should take different amounts of time in the rest frame. But I want someone more knowledgeable to confirm that my thinking is correct.
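A quick numerical check of steps 2 to 4 may help here. The sketch below assumes the usual light-path geometry, which is not spelled out above: a zigzag path for the clock moving parallel to the mirror planes, and closing speeds c - v and c + v for the clock moving perpendicular to them. The function names are made up for illustration.

```python
import math

def p_factor(v, c=1.0):
    """p = sqrt(1 - v^2/c^2), as defined in step 2."""
    return math.sqrt(1.0 - (v / c) ** 2)

def period_transverse(L0, v, c=1.0):
    """Clock moving parallel to the mirror planes (step 2).

    In the rest frame, c*T/2 is the hypotenuse of a right triangle with
    legs L0 and v*T/2, which gives T = 2*L0 / sqrt(c^2 - v^2).
    """
    return 2.0 * L0 / math.sqrt(c ** 2 - v ** 2)

def period_longitudinal(L, v, c=1.0):
    """Clock moving perpendicular to the mirror planes (step 3).

    Outbound, the pulse chases the far mirror (closing speed c - v);
    on the way back it meets the near mirror (closing speed c + v).
    """
    return L / (c - v) + L / (c + v)

if __name__ == "__main__":
    L0, v = 1.0, 0.6
    p = p_factor(v)
    print("transverse clock:           ", period_transverse(L0, v))
    print("longitudinal, length L0:    ", period_longitudinal(L0, v))
    print("longitudinal, length L0 * p:", period_longitudinal(L0 * p, v))
```

Under these assumptions the two moving clocks agree in the rest frame only when the longitudinal clock has length L0 * p, which is the content of step 4.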

Last edited by a moderator: Sep 25, 2014
2. Nov 10, 2012

### harrylin

Hi, welcome to Physics Forums.

I didn't look at that paper (sorry), no time for it now and not peer reviewed; but did you also look at Einstein's "simple derivation"?
- http://www.bartleby.com/173/a1.html

Last edited by a moderator: Sep 25, 2014
3. Nov 10, 2012

### absurdist89

Thanks, the link seems good enough. Although, the second link (the thread) seemed like a troll :)

4. Nov 10, 2012

### Fredrik

Staff Emeritus
I don't have time to examine someone else's derivation right now, but here's one of mine. (Start reading at "This is how I do these things"). What I like about this one is that it's derived from mathematical assumptions that can be thought of as the statements that make Einstein's postulates mathematically precise.

By the way, when you link to arXiv, it's better to link to the page that presents the article than to the actual article (like this), because that page usually contains information about where the article has been published, if it has been published.

5. Nov 10, 2012

### bcrowell

Staff Emeritus
6. Nov 11, 2012

### absurdist89

Thanks all for your replies :)

@Fredrik: Will keep that in mind next time.

7. Nov 15, 2012

### Fredrik

Staff Emeritus
I'm working on my own version of the "nothing but relativity" argument. (I just like to do things my own way). I will post it in a new thread when I'm done. In the meantime, I would just like to say that I'm not quite buying Pal's argument for why K<0 must be ruled out. (I disagree with the argument, not with the conclusion).

We're looking for a group of coordinate transformations that satisfy a few technical requirements that can be thought of as aspects of the principle of relativity. What we find is that there's a real number K such that an arbitrary member of the group can be written as
$$\Lambda(v)=\gamma(v)\begin{pmatrix}1 & -Kv\\ -v & 1\end{pmatrix},\qquad \gamma(v)=\frac{1}{\sqrt{1-Kv^2}}.$$ When K>0, the inequality $\gamma^2>0$ implies that $v^2<1/K$. When K<0, $\gamma^2>0$ doesn't tell us anything about v.

It turns out that the "gamma" of any member of the group is ≥1. The "gamma" of the composite transformation $\Lambda(u)\Lambda(v)$ is
$$\gamma(u)\gamma(v)(1+Kuv).$$ For this to be ≥1, we must have 1+Kuv>0. And when K<0, we can violate this inequality by plugging in values of u,v of the same sign with magnitudes greater than $1/\sqrt{|K|}$. Pal makes this observation, and then immediately rejects groups with K<0 (because $\gamma^2>0$ didn't imply that there's a speed limit in the case K<0).
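The composite "gamma" can be checked numerically. This is only a sketch: the helper names `boost`, `matmul` and `composite_gamma` are invented here, and the 2×2 matrices are plain nested tuples built from the formula for $\Lambda(v)$ given above.

```python
import math

def boost(v, K):
    """Lambda(v) = gamma(v) * [[1, -K*v], [-v, 1]] as a nested tuple."""
    g = 1.0 / math.sqrt(1.0 - K * v * v)
    return ((g, -g * K * v), (-g * v, g))

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def composite_gamma(u, v, K):
    """Claimed top-left entry of Lambda(u) Lambda(v)."""
    gu = 1.0 / math.sqrt(1.0 - K * u * u)
    gv = 1.0 / math.sqrt(1.0 - K * v * v)
    return gu * gv * (1.0 + K * u * v)

if __name__ == "__main__":
    for (u, v, K) in [(0.5, 0.3, 1.0), (0.5, 0.3, -1.0), (2.0, 2.0, -1.0)]:
        product = matmul(boost(u, K), boost(v, K))
        print(u, v, K, product[0][0], composite_gamma(u, v, K))
```

The last test case uses K=-1 with u=v=2, where 1+Kuv is negative, so the composite's "gamma" comes out negative.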

But what if the group simply doesn't contain any $\Lambda(v)$ with problematic values of v? The fact that we didn't get a speed limit from $\gamma^2>0$ doesn't imply that there isn't one. (It's possible that our assumptions imply that there is one when K<0). So have we really ruled out the possibility that there's a set $S\subset \mathbb R$ such that $\{\Lambda(v)|v\in S\}$ is a group? I don't think so.

It seems to me that we need a better argument, and maybe even an additional technical assumption, about continuity of the map $v\mapsto\Lambda(v)$, or connectedness of its domain.

Last edited: Nov 15, 2012
8. Nov 15, 2012

### strangerep

I didn't find Pal's treatment very well-written. Did you check the references he mentions for Levy-Leblond (1976) and Mermin? (The former seems to be available as a downloadable pdf from several sources.)

Afaik, the sign of $K$ must also be regarded as an empirically-determined feature.

Last edited: Nov 15, 2012
9. Nov 15, 2012

### Fredrik

Staff Emeritus
I didn't even read Pal's article. I just tried to understand the idea, and then prove it myself. But I had to consult the article for a couple of details I got stuck on, in particular (in my notation) $\alpha/v=\text{constant}$ and K≥0. His argument for the former was convincing, but his argument for the latter was not.

I actually looked for Mermin's article the other day, but only found a web page that wanted me to pay for it. I may look again if I can't develop the ideas I'm describing below to a proof.

I think K<0 can be ruled out mathematically. Suppose that there is a set $S\subset\mathbb R$ such that $G_K=\{\Lambda(v)|v\in S\}$ is a group. Now Pal's argument shows that $v=\pm 1/\sqrt{|K|}$ can't be in S, because $1+Kv^2=1-|K|v^2=0$, and this isn't >0 as it needs to be for $\Lambda(v)^2$ to be in the group. So now we've found real numbers that can't be in S.

Suppose that S is non-empty and let $u\in S$ be arbitrary. Now Pal's argument shows that for $v=1/(|K|u)$, we have $\Lambda(u)\Lambda(v)\notin G_K$. This implies that $\Lambda(v)\notin G_K$ and $v\notin S$.

I'm hoping to be able to use a similar argument to show that S is empty, or that $\mathbb R-S$ is dense in $\mathbb R$ or something like that.

Edit: I thought about it some more, and I'm starting to think that this approach too fails to rule out K<0. All I see is that for every "small" velocity in S, there's a "large" velocity that's not in S, and vice versa. (By "small" and "large", I mean that the speed is below or above $1/\sqrt{|K|}$). So it's plausible that there are many choices of S that make $G_K$ a group. In particular, I don't see a way to rule out the possibility $S=(-1/\sqrt{|K|},1/\sqrt{|K|})$. But one thing I do see is that if all velocities in that interval are in S, then no velocity that's not in that interval is in S.

I'm still not sure what the correct conclusion is. I need to fiddle around with velocity addition and similar stuff a bit longer.

Last edited: Nov 15, 2012
10. Nov 15, 2012

### robphy

Consider the eigenvectors that arise from your sign-choices for K.

11. Nov 15, 2012

### Fredrik

Staff Emeritus
Eigenvectors of $\Lambda(v)$? $\Lambda(v)$ is invertible if and only if $1-Kv^2\neq 0$. When K<0, we have $1-Kv^2=1+|K|v^2>0$, so $\Lambda(v)$ is invertible for all v. This implies that it doesn't have any non-zero eigenvectors.

Edit: Oops. That last sentence is wrong. Thanks strangerep. It's when $\Lambda(v)-\lambda$ (where λ is the eigenvalue) is invertible that we don't have any non-zero eigenvectors.

Last edited: Nov 15, 2012
12. Nov 15, 2012

### strangerep

If that's what he meant, the eigenvalues in the case $K<0$ are a complex-conjugate pair, unless I'm mistaken. Similarly, the eigenvectors have complex components. But I must be missing the tacit implication of robphy's point -- presumably it has something to do with complex eigenvectors being "bad" in this context?

Last edited: Nov 15, 2012
13. Nov 15, 2012

### Fredrik

Staff Emeritus
When I wrote this I had temporarily forgotten that I had already ruled out the possibility $S=(-1/\sqrt{|K|},1/\sqrt{|K|})$. When K<0, $1/\sqrt{|K|}$ is a forbidden velocity, because if v has that value, the "gamma" (top left component) of $\Lambda(v)^2$ is 0, so $\Lambda(v)^2\notin G_K$. This implies that a lot of other velocities are forbidden. The velocity addition rule is $$u\oplus v=\frac{u+v}{1+Kuv}=\frac{u+v}{1-|K|uv}.$$
Any velocity v such that $v\oplus v$ is a forbidden velocity, is a forbidden velocity (i.e. can't be a member of S). The first example of this is $v=(-1+\sqrt{2})/\sqrt{|K|}$. (That's the part of this that I had figured out and forgotten about before I wrote the stuff in the quote above). We have $v\oplus v=1/\sqrt{|K|}$, so $\Lambda(v)$ can't be in the group (because its square isn't). We can of course keep doing this. There must be a velocity u such that $u\oplus u=(-1+\sqrt{2})/\sqrt{|K|}$. That velocity is forbidden too. And there must be a velocity w such that $w\oplus w=u$, and so on.
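As a sanity check on that first example, the arithmetic can be written out. With $v=(-1+\sqrt{2})/\sqrt{|K|}$ we have $|K|v^2=3-2\sqrt{2}$, so
$$v\oplus v=\frac{2v}{1-|K|v^2}=\frac{2(\sqrt{2}-1)/\sqrt{|K|}}{2\sqrt{2}-2}=\frac{1}{\sqrt{|K|}}.$$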

I haven't proved it, but it looks like there's going to be a sequence of positive forbidden velocities that goes to 0.

And it doesn't end here. Now suppose that there's a non-zero v in S. Then every real number w such that $v\oplus w$ is a member of that sequence, will be a forbidden velocity. For every non-zero velocity in S, there are infinitely many that are not in S.

I think this means that we can rule out K<0 mathematically, but it looks like we need an additional assumption. It's probably sufficient to require that there's an ε>0 such that the interval (-ε,ε) is a subset of S, i.e. that the domain of the map $v\mapsto\Lambda(v)$ contains an open interval that contains 0.

Last edited: Nov 16, 2012
14. Nov 15, 2012

### robphy

K=-1 gives Euclidean space, which has no eigenvectors... no preferred directions.
K=+1 gives Minkowski spacetime, which has two eigenvectors...preserving the lightlike directions.
K=0 gives Galilean spacetime, with one eigenvector... preserving the spacelike directions.
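One way to see this count is through the discriminant of the characteristic polynomial of $\Lambda(v)$. The sketch below is an illustration under that reading. It takes the matrix form $\gamma(v)\begin{pmatrix}1 & -Kv\\ -v & 1\end{pmatrix}$ used earlier in the thread, and the function names are hypothetical.

```python
import math

def boost(v, K):
    """Lambda(v) = gamma(v) * [[1, -K*v], [-v, 1]] as a nested tuple."""
    g = 1.0 / math.sqrt(1.0 - K * v * v)
    return ((g, -g * K * v), (-g * v, g))

def real_eigendirections(M, tol=1e-12):
    """Number of real eigen-directions of a 2x2 matrix.

    Uses the discriminant trace^2 - 4*det of the characteristic polynomial:
    positive -> 2 directions, negative -> 0 (complex pair), zero -> 1,
    unless M is a multiple of the identity (then every direction qualifies).
    """
    (a, b), (c, d) = M
    disc = (a + d) ** 2 - 4.0 * (a * d - b * c)
    if disc > tol:
        return 2
    if disc < -tol:
        return 0
    if abs(b) <= tol and abs(c) <= tol and abs(a - d) <= tol:
        return math.inf
    return 1

if __name__ == "__main__":
    v = 0.5
    for K in (-1.0, 0.0, 1.0):
        print("K =", K, "real eigen-directions:", real_eigendirections(boost(v, K)))
```

For this matrix the discriminant works out to $4\gamma^2Kv^2$, so for $v\neq 0$ its sign is the sign of K.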

15. Nov 15, 2012

### strangerep

Did you mean no real eigenvectors?
Ok. So the choice among $K=-1,0,+1$ is indeed determined only empirically. :-)

16. Nov 16, 2012

### Fredrik

Staff Emeritus
When K=-1, we get
$$\Lambda(v)=\frac{1}{\sqrt{1+v^2}}\begin{pmatrix}1 & v\\ -v & 1\end{pmatrix},$$
with infinitely many forbidden values of v. When $0<v<1/\sqrt{|K|}$, this transformation rotates the axes counterclockwise by the angle $\arctan(1/v)$, but it also rescales them. For every forbidden velocity u, the direction of the vector (1, -u) is special in the sense that it can't be the world line of an inertial observer, and the direction of the vector (1,u) is special in the sense that it can't be the simultaneity line of an inertial observer. So there are infinitely many "special" directions.

I don't see a reason to say that we're dealing with Euclidean space, other than the fact that the axes get rotated by the same angle as they are being stretched out.

That may still be the case, but before we can say that, we must at least find a set $S\subset\mathbb R$ (other than S={0}) such that $\{\Lambda(v)|v\in S\}$ is a group. Even if there is such an S, since every forbidden velocity implies that there are other forbidden velocities, it seems plausible to me that when we analyze this properly, we will find that the set of forbidden velocities is dense in the smallest interval of $\mathbb R$ that contains S, or something like that.

17. Nov 16, 2012

### robphy

Isn't $\det(\Lambda(v))=1$?
If $v=\tan\theta$, then $\frac{1}{\sqrt{1+v^2}}=\cos\theta$ and $\frac{v}{\sqrt{1+v^2}}=\sin\theta$
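A short numerical sketch of this point, assuming the K=-1 matrix $\frac{1}{\sqrt{1+v^2}}\begin{pmatrix}1 & v\\ -v & 1\end{pmatrix}$ from the previous post (the function names are illustrative only):

```python
import math

def boost_k_minus_one(v):
    """Lambda(v) for K = -1: (1/sqrt(1+v^2)) * [[1, v], [-v, 1]]."""
    g = 1.0 / math.sqrt(1.0 + v * v)
    return ((g, g * v), (-g * v, g))

def rotation_form(theta):
    """[[cos, sin], [-sin, cos]], the same matrix written with v = tan(theta)."""
    return ((math.cos(theta), math.sin(theta)),
            (-math.sin(theta), math.cos(theta)))

def det(M):
    (a, b), (c, d) = M
    return a * d - b * c

if __name__ == "__main__":
    v = 0.75
    theta = math.atan(v)
    print(boost_k_minus_one(v))
    print(rotation_form(theta))
    print("det =", det(boost_k_minus_one(v)))
```

With determinant 1 and orthonormal rows, the matrix is a pure rotation with no rescaling.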

18. Nov 16, 2012

### Fredrik

Staff Emeritus
Wow, you're right. I totally missed that. OK, so $\Lambda(v)$ is a proper rotation by an angle θ, defined by $v=\tan\theta$. But there's still the problem of infinitely many forbidden velocities (and therefore forbidden angles).

1 is a forbidden velocity, because the "gamma" (top left component) of $\Lambda(1)^2$ is 0. The velocity addition rule is
$$u\oplus v=\frac{u+v}{1-uv}.$$ This ensures that the v such that $v\oplus v=1$ is a forbidden velocity. ($-1+\sqrt{2}$ is such a velocity). And then the u such that $u\oplus u=v$ is forbidden too. And so on. Also, whenever $u\in S, v\notin S$, the w such that $u\oplus w=v$ is forbidden (i.e. not in S). And whenever $u\in S$, we have $1/u\notin S$ (because the "gamma" of $\Lambda(u)\Lambda(1/u)$ is 0).

So it looks like the set S will either be ={0} or full of holes.

19. Nov 16, 2012

### robphy

I must be missing something.
Why is v=1 or v>1 not allowed for the K=-1 case?
When v=1, isn't 1/sqrt(1+v^2)=1/sqrt(2), as expected since cos(45 deg)=1/sqrt(2)?

just the tangent-of-a-sum identity from Euclidean trigonometry.

The way I see it... (which may be different from your formulation)
I have a metric of the form ds^2=dt^2-K*dy^2, which is invariant under your Lambda,

for K=1 (minkowski) there is an infinity of "inaccessible velocities" ("non-timelike"...those whose square-norm is non-positive... i.e. null and spacelike)... and there are two disjoint groups of accessible velocities (future-timelike and past-timelike)

for K=0 (galilean) there is one "inaccessible velocity" (those whose square-norm is non-positive... i.e. spacelike-and-null (infinite-velocity) ).. and, again, there are two disjoint groups of accessible velocities (future-timelike and past-timelike)

for K=-1 (euclidean) there are no "inaccessible velocities [directions]"... all directions are allowed.... so a proper-rotation can take a unit-vector from the origin to any other unit-vector from the origin.

The "null vectors" are the eigenvectors, which delineate an inaccessible boundary for the "accessible" directions.

Last edited: Nov 16, 2012
20. Nov 16, 2012

### Fredrik

Staff Emeritus
The problem with $\Lambda(1)$ is that
$$\Lambda(1)^2=\begin{pmatrix}0 & 1\\ -1 & 0\end{pmatrix}.$$ If $\Lambda(1)$ is in G, then so is $\Lambda(1)^2$, but this transformation is a rotation by $\pi/2$, so it takes the old time axis to the new space axis. This contradicts the assumption that we're dealing with a group of orthochronous coordinate transformations of 1+1-dimensional spacetime.

A rotation by less than π/2 preserves the temporal order of points on the time axis. A rotation by more than π/2 reverses it. So velocities >1 are ruled out too. $\Lambda(v)^2$ isn't orthochronous for any $v\geq 1$.

21. Nov 16, 2012

### Fredrik

Staff Emeritus
How about this? Suppose that K=-1 and that there's a set $S\subset\mathbb R$ such that $G=\{\Lambda(v)|v\in S\}$ is a group. Since $$\Lambda(u)\Lambda(v) =\gamma(u)\gamma(v) \begin{pmatrix}1-uv & *\\ -u -v & *\end{pmatrix},$$ we know that when $u,v\in S$, we have $1-uv\neq 0$ and
$$\Lambda(u)\Lambda(v)=\gamma(u)\gamma(v)(1-uv)\begin{pmatrix}1 & *\\ \frac{-u-v}{1-uv} & *\end{pmatrix}.$$ This means that the "gamma" of $\Lambda(u)\Lambda(v)$ is $\gamma(u)\gamma(v)(1-uv)$ and that the velocity addition rule is
$$u\oplus v=\frac{u+v}{1-uv}.$$ We know that the gamma of an arbitrary $\Lambda(v)$ is >0 (because by assumption, we're dealing with a group of orthochronous transformations), so this result for the gamma of $\Lambda(u)\Lambda(v)$ implies that 1-uv>0. This implies that no v such that $v \geq 1$ can be a member of S. (Because if v≥1, then $v^2\geq 1$, and this ensures that the gamma of $\Lambda(v)^2$, which is $\gamma(v)^2(1-v^2)$, is ≤0). So now we have ruled out all velocities ≥1.

Define $c_1=-1+\sqrt{2}$. This is the positive velocity such that $c_1\oplus c_1=1$. This implies that $c_1\notin S$, because $\Lambda(c_1)^2=\Lambda(1)\notin G$. Now note that $v\oplus v=2v/(1-v^2)$ is an increasing function of v on the interval (0,1). This implies that for all v such that $c_1\leq v<1$, we have $v\oplus v\geq c_1\oplus c_1=1$, so $\Lambda(v)^2=\Lambda(u)$ for some $u\geq 1$. So $\Lambda(u)\notin G$. This implies that $\Lambda(v)\notin G$, and that $v\notin S$. So now we have ruled out all velocities that are $\geq c_1=-1+\sqrt{2}$.

Define $c_2$ as the positive velocity such that $c_2\oplus c_2=c_1$. The same argument now rules out all velocities $\geq c_2$. We can define a sequence of forbidden velocities in the following way: Define $c_0=1$. For all positive integers n, define $c_n$ as the positive velocity such that $c_n\oplus c_n=c_{n-1}$. Once we have ruled out all velocities $\geq c_k$ for some k, we can use the same argument to rule out all velocities $\geq c_{k+1}$.

If we can show that the sequence $c_1,c_2,\dots$ goes to 0, we will have proved that S doesn't contain any positive numbers. Then we can rule out all the negative velocities as well, because if $v\in S$, then $-v\in S$. (This is because $\Lambda(-v)=\Lambda(v)^{-1}$). So the only possible S is S={0}.
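A numerical sketch of the sequence, for K=-1, suggests that it does go to 0. Solving $2v/(1-v^2)=w$ for the positive root gives $v=w/(1+\sqrt{1+w^2})$. By the tangent double-angle identity, this is the half-angle step $\tan\theta\mapsto\tan(\theta/2)$, so one would expect $c_n=\tan(\pi/2^{n+2})$. The helper names below are invented for this illustration.

```python
import math

def oplus(u, v):
    """Velocity addition for K = -1."""
    return (u + v) / (1.0 - u * v)

def half(w):
    """Positive v with oplus(v, v) == w, for w > 0.

    Positive root of w*v^2 + 2*v - w = 0, written in the numerically
    stable form w / (1 + sqrt(1 + w^2)).
    """
    return w / (1.0 + math.sqrt(1.0 + w * w))

def forbidden_sequence(n_terms):
    """Return [c_0, c_1, ..., c_n_terms] with c_0 = 1."""
    c = [1.0]
    for _ in range(n_terms):
        c.append(half(c[-1]))
    return c

if __name__ == "__main__":
    for n, c_n in enumerate(forbidden_sequence(8)):
        # Second column: the conjectured closed form tan(pi / 2^(n+2)).
        print(n, c_n, math.tan(math.pi / 2 ** (n + 2)))
```

If the closed form holds, each step roughly halves $c_n$ once the terms are small, so the sequence converges to 0.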

22. Nov 16, 2012

### robphy

Clearly there is no Lambda-invariant sense of causal-structure in the K=-1 (Euclidean) case... since there is no invariant division between past and future, as one has in Galilean and Minkowski.

Stated another way, in Galilean and Minkowski,
for each vector with positive square-norm (in my metric signature) [regarded as a "timelike vector"],
a vector orthogonal to that vector [using the metric] doesn't have positive-square-norm.

However, in Euclidean geometry,
every nonzero vector (especially those orthogonal to a positive-square-norm vector) has positive-square-norm....
what you might have wanted to call "spacelike" is actually "timelike".
That is, there is no differentiation of vector types in Euclidean Geometry.

Last edited: Nov 16, 2012
23. Nov 16, 2012

### Fredrik

Staff Emeritus
I don't think we should start talking about metrics until after we've obtained the final result.

The point of this argument is that a few mathematical assumptions that can be thought of as mathematically precise statements of aspects of Galileo's principle of relativity lead to the conclusion that the set of functions that change coordinates (properly and orthochronously) from one inertial coordinate system to another, is a group that's either trivial, the group of Galilean boosts, or isomorphic to the Lorentz group.

Once we have obtained that result, we can start thinking about the mathematical structure (metrics and stuff) of spacetime. We obviously want to add translations to these groups, so now the choice is between the Galilean group and the Poincaré group.

Assuming that all this works in 3+1 dimensions as well, in the K=0 case, there is no metric on $\mathbb R^4$ that gets the job done. (This shows that it would have been premature to assume that there's a metric on spacetime at an earlier stage). Because of this, I think we either have to define spacetime as just the pair $(\mathbb R^4,G)$ where G is the Galilean group, or as a fiber bundle in the category of manifolds (or vector spaces), where each fiber is the manifold (or vector space) $\mathbb R^3$ with the Euclidean metric. I think G can be thought of as a "property" of this bundle in some way, but I haven't really thought that through.

In the K>0 case, there's a simple way to avoid having to define spacetime as the pair $(\mathbb R^4,P)$, where P is the Poincaré group. Since the Poincaré group is the isometry group of the Minkowski metric, we define spacetime to be Minkowski spacetime.

I see that you still consider K<0 to be a mathematical possibility. If we had started without the assumption that we're dealing with a group of orthochronous transformations, then maybe K<0 would have made it into the final result. I may have to think about that when I'm done with the orthochronous case.

Last edited: Nov 16, 2012
24. Nov 16, 2012

### Fredrik

Staff Emeritus
Your comments about rotations gave me a simple way to prove that S={0} in the K=-1 case. Let v be an arbitrary real number such that 0<v<1, and define θ by $\tan\theta=v$. $\Lambda(v)$ is a counterclockwise rotation by the angle θ. Let k be the smallest positive integer such that kθ>π. $\Lambda(v)^k$ (which must be in the group because a group is closed under multiplication) is a rotation by an angle β such that π<β<2π, and therefore isn't an orthochronous transformation, so it can't be in the group. This implies that $\Lambda(v)$ isn't in the group, and that $v\notin S$.

Edit: I think I got it for an arbitrary K<0 as well.
We define $c=1/\sqrt{|K|}$, $\beta=v/c$ and $\tan\theta=\beta$. Now we have
$$\Lambda(v) =\frac{1}{\sqrt{1+|K|v^2}}\begin{pmatrix}1 & |K|v\\ -v & 1\end{pmatrix} = \frac{1}{\sqrt{1+\beta^2}}\begin{pmatrix}1 & \beta/c\\ -c\beta & 1\end{pmatrix} = \begin{pmatrix}\cos\theta & \frac{1}{c}\sin\theta\\ -c\sin\theta & \cos\theta\end{pmatrix}.$$ If we denote that last matrix by $T(\theta)$, we can show that $T(\bar\theta)T(\theta)=T(\bar\theta+\theta)$. This implies (by induction) that $\Lambda(v)^k=T(k\theta)$. When k is the smallest positive integer such that $k\theta>\pi$, we have $\pi<k\theta<3\pi/2$ (because $0<\theta<\pi/2$ for v>0), so the "gamma" of $\Lambda(v)^k$, which is $\cos(k\theta)$, is negative, and $\Lambda(v)^k$ isn't orthochronous. This implies that $v\notin S$.
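The power argument can be tried numerically. The sketch below builds $\Lambda(v)=\gamma(v)\begin{pmatrix}1 & -Kv\\ -v & 1\end{pmatrix}$ directly from the general formula earlier in the thread. It then multiplies it by itself until the top-left entry (the "gamma") is no longer positive. The function names and the `max_power` cutoff are assumptions made for the illustration.

```python
import math

def boost(v, K):
    """Lambda(v) = gamma(v) * [[1, -K*v], [-v, 1]] as a nested tuple."""
    g = 1.0 / math.sqrt(1.0 - K * v * v)
    return ((g, -g * K * v), (-g * v, g))

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def first_bad_power(v, K, max_power=10_000):
    """Smallest k >= 1 such that the top-left entry of Lambda(v)^k is <= 0.

    Returns None if no such k is found up to max_power.
    """
    L = boost(v, K)
    M = L
    for k in range(1, max_power + 1):
        if M[0][0] <= 0.0:
            return k
        M = matmul(M, L)
    return None

if __name__ == "__main__":
    for (v, K) in [(0.5, -1.0), (0.25, -4.0), (0.01, -1.0)]:
        theta = math.atan(v * math.sqrt(abs(K)))
        print(v, K, first_bad_power(v, K), (math.pi / 2) / theta)
```

For K<0 and v>0 the top-left entry of the k-th power should be $\cos(k\theta)$ with $\tan\theta=v\sqrt{|K|}$, so the first non-orthochronous power appears once $k\theta$ reaches $\pi/2$. That is earlier than the $k\theta>\pi$ used in the post, but it leads to the same conclusion.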

Last edited: Nov 16, 2012
25. Nov 16, 2012

### strangerep

Did you mean you want to deduce that it's a group?

Normally, people just recognize that what we mean physically by boosting to various velocities (in a fixed direction) does indeed correspond to the properties of a 1-parameter Lie group. We want to be able to boost in arbitrarily small increments, compose boost transformations, etc.

By starting from the relativity principle, the most general transformations that map an unaccelerated observer to another unaccelerated observer are the FL transformations. Demanding good behaviour everywhere reduces this to the affine group. Demanding that they form a 1-parameter Lie group with velocity being the parameter is then sufficient to obtain the boost subgroup (again, in a fixed direction) if you also assume spatial isotropy in the 1+3D case, or parity invariance in the 1+1D case.

I sense you still want to figure this out for yourself, so let me know if/when you want the full answer as sketched above... :-)