Derivation of the Lorentz Transformations

NanakiXIII · Dec 26, 2006

I've been going through "Relativity", a translation of a book by Albert Einstein about the Special and General theories of Relativity. It is stressed that the book should be understandable to anyone with a high school education. "A clear explanation that anyone can understand," it says on the cover. Well, I'm feeling rather stupid at the moment, understanding quite little of the book.

At the moment though, I'm not interested in most of the book, only the "Simple Derivation of the Lorentz Transformations" in the appendix. I'm finding it far from simple and I'm hoping someone could help me with it.

First of all, I'll not assume anyone to actually have this book, so here is a link to an online version (of poorer quality) of the derivation: http://www.bartleby.com/173/a1.html

My first problem starts at equation (3) (and (4)). It's not all that obvious to me why we're adding that constant.

The rest of my problems start after equation (7). I (mathematically) understand everything up to (7), but then I get lost. I have no idea, for example, where the equation between (7) and (7a) is coming from. Given that equation, I can understand (7a), but then going on to (7b), I'm lost again. Why are the two snapshots identical?

Then there's (8). I'm willing to believe that by inserting a and b, you get those equations, but how does one acquire the value of b?

The last mystery (I haven't gone on to combining everything yet) is (8a). I'm clueless as to what they did to get this equation. I'm sure they did something with the equations of (8), but what exactly... I don't see it.

I hope I explained things clearly enough and I hope someone will be able to help me.

JM · Dec 26, 2006

I have read that derivation too, and I find it useless. It's all arithmetic and no physics. The only derivation I find useful is the one in Einstein's 1905 paper "On the Electrodynamics of Moving Bodies", available in the Dover book The Principle of Relativity. It's not so easy to follow, but it uses math no higher than college level.

Galileo · Dec 27, 2006

NanakiXIII said:

I've been going through "Relativity", a translation of a book by Albert Einstein about the Special and General theories of Relativity. It is stressed that the book should be understandable to anyone with a high school education. "A clear explanation that anyone can understand," it says on the cover. Well, I'm feeling rather stupid at the moment, understanding quite little of the book.

By 'clear explanation' they probably mean something like not (too) technical. It's really only very clear to someone who already understands the subject. Try an introductory physics book on special relativity. Some methods of approach may suit you better than others.

My first problem starts at equation (3) (and (4)). It's not all that obvious to me why we're adding that constant.

We have to find a relation between the coordinates (x,t) of the 'stationary' system K and the coordinates (x',t') of the 'moving' system K'. By the second postulate a ray of light has the same speed c in both K and K', so for a ray sent from the origin along the positive x-axis we must have both x-ct=0 and x'-ct'=0.
(Now there is something the book leaves out, and that is that the transformation should be linear.)
All you know is that if you enter x=ct into the transformation, you should get x'=ct'. By linearity, x'-ct' and x-ct should be proportional.
The equation [itex]x'-ct'=\lambda (x-ct)[/itex] is linear and has exactly that property: whenever x-ct=0 we have x'-ct'=0. Equation (4) follows in the same way for a ray travelling along the negative x-axis, with a second constant [itex]\mu[/itex].
I know that's merely a repetition of the book. Maybe someone else can explain it more clearly.
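
As a sketch of why linearity forces the proportionality, write the most general linear transformation and demand that x=ct gives x'=ct'. The coefficients A, B, C, D are hypothetical placeholder names, not the book's notation:

[tex]
\begin{aligned}
x' &= Ax + Bct, \qquad ct' = Cx + Dct \\
x' - ct' &= (A-C)\,x + (B-D)\,ct \\
x = ct \;\Rightarrow\; x' - ct' = 0 &\quad\text{requires}\quad (A-C) + (B-D) = 0 \\
\Rightarrow\quad x' - ct' &= (A-C)(x - ct) \equiv \lambda\,(x-ct)
\end{aligned}
[/tex]

Running the same argument for a ray along the negative x-axis (x=-ct must give x'=-ct') yields x'+ct'=(A+C)(x+ct), which has the form of (4) with [itex]\mu=A+C[/itex].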

The rest of my problems start after equation (7). I (mathematically) understand everything up to (7), but then I get lost. I have no idea, for example, where the equation between (7) and (7a) is coming from. Given that equation, I can understand (7a), but then going on to (7b), I'm lost again. Why are the two snapshots identical?

We measure the length of the unit rod in K' from the standpoint of K. So we measure the endpoints at a certain moment in time in K (t=0). The endpoints are related by: x'=ax (I've entered t=0).
The left endpoint is at x=0 of course, the other at x=1/a. So the length is 1/a as seen from K.
Now if we reverse the roles, we look at a unit rod in K as seen from K'. We should get the same length 1/a by symmetry, or rather, by the principle of relativity. So we look at the endpoints at a moment in time in K' (t'=0). For this we need to invert the Lorentz equations. We want x,t in terms of x' and t'. Since in this case t'=0, we get:
x'=ax-bct
0=act-bx
Now we eliminate t to find the relation between x and x' (multiply the top one by a and the bottom one by b and add them). We get:
ax'=(a^2-b^2)x
or
x'=a(1-b^2/a^2)x
So again one endpoint is at x'=0 of course; the other (x=1) is at x'=a(1-v^2/c^2) (where bc/a=v is used). Since this should be equal to 1/a by the principle of relativity, we find [itex]a^2=1/(1-v^2/c^2)[/itex].

Then there's (8). I'm willing to believe that by inserting a and b, you get those equations, but how does one acquire the value of b?

From a and the expression bc/a=v.
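
Spelled out as a sketch, taking the positive root for a (an assumption, so that x' increases with x) and using [itex]\gamma[/itex] as shorthand for [itex]1/\sqrt{1-v^2/c^2}[/itex]:

[tex]
\begin{aligned}
a &= \frac{1}{\sqrt{1-v^2/c^2}} = \gamma, \qquad b = \frac{av}{c} = \frac{\gamma v}{c} \\
x' &= ax - bct = \gamma\,(x - vt) \\
ct' &= act - bx = \gamma\left(ct - \frac{v}{c}\,x\right) \;\Rightarrow\; t' = \gamma\left(t - \frac{vx}{c^2}\right)
\end{aligned}
[/tex]

These should match the two equations of (8).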

The last mystery (I haven't gone on to combining everything yet) is (8a). I'm clueless as to what they did to get this equation. I'm sure they did something with the equations of (8), but what exactly... I don't see it.

I hope I explained things clearly enough and I hope someone will be able to help me.

Simply use the Lorentz-equations and calculate (x')^2-(ct')^2 and it will be the same as x^2-(ct)^2.

(Introduce [itex]\gamma=1/\sqrt{1-v^2/c^2}[/itex] and use [itex]\gamma^2(1-v^2/c^2)=1[/itex] in your calculation so it won't be very messy.)
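
A sketch of that calculation, assuming the equations of (8) written in the form [itex]x'=\gamma(x-vt)[/itex], [itex]ct'=\gamma(ct-vx/c)[/itex]:

[tex]
\begin{aligned}
(x')^2 - (ct')^2 &= \gamma^2\left[(x-vt)^2 - \left(ct - \frac{vx}{c}\right)^2\right] \\
&= \gamma^2\left[x^2 - 2xvt + v^2t^2 - c^2t^2 + 2xvt - \frac{v^2x^2}{c^2}\right] \\
&= \gamma^2\left(1-\frac{v^2}{c^2}\right)\left[x^2 - (ct)^2\right] \\
&= x^2 - (ct)^2
\end{aligned}
[/tex]

The cross terms cancel, and the remaining factor [itex]\gamma^2(1-v^2/c^2)[/itex] is exactly 1.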

NanakiXIII · Dec 27, 2006

Thanks for your explanations, Galileo, they've helped me quite a bit, filling in some of the details I was missing. I couldn't gather, for example, that I was to invert the equations for that equation between (7) and (7a). Thanks for that. I still have some questions, however.

About (8a). I'll take your word that calculating those values will get me the same results (I haven't the patience to try), but I'm really looking to know how they got from (8) to (8a) instead of how to check (8a) by means of (8).

I also have a new question. Equation (6) is used quite a lot in the reasoning that comes after it, but it is constructed under the condition that x'=0. Does it still hold true if x'≠0? It just seems rather odd that (6) is used to get (8). In (8) you get an equation for x', while x' is permanently 0 in (6). I hope my messy explanation is clear enough to communicate my problem. At any rate, what am I seeing wrong here?

Any light-shedding on my problems would be greatly appreciated.

Galileo · Dec 27, 2006

NanakiXIII said:

About (8a). I'll take your word that calculating those values will get me the same results (I haven't the patience to try), but I'm really looking to know how they got from (8) to (8a) instead of how to check (8a) by means of (8).

The only way I can think of at the moment to derive it IS to check it using the Lorentz equations. It's an interesting and important result, but it's not obvious. It's just easy to check that it is indeed so. It means that for an event in spacetime, observers in K and K' will find different coordinates (x,t) and (x',t'), but the combinations x^2-(ct)^2 and (x')^2-(ct')^2 will be the same. It means we can ascribe some absolute meaning to it (since it's independent of your choice of inertial system).
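
One route that might feel more like a derivation than a check, offered only as a sketch: multiply (3) and (4) together, and use [itex]a=(\lambda+\mu)/2[/itex], [itex]b=(\lambda-\mu)/2[/itex] from the step leading to (5). Whether this is the route the book has in mind is an assumption.

[tex]
\begin{aligned}
(x'-ct')(x'+ct') &= \lambda\mu\,(x-ct)(x+ct) \\
(x')^2 - (ct')^2 &= \lambda\mu\left[x^2-(ct)^2\right] \\
\lambda\mu &= a^2 - b^2 = a^2\left(1-\frac{v^2}{c^2}\right) = 1
\end{aligned}
[/tex]

The last line uses b/a=v/c from (6) and [itex]a^2=1/(1-v^2/c^2)[/itex] from (7b).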

I also have a new question. Equation (6) is used quite a lot in the reasoning that comes after it, but it is constructed under the condition that x'=0. Does it still hold true if x'≠0? It just seems rather odd that (6) is used to get (8). In (8) you get an equation for x', while x' is permanently 0 in (6). I hope my messy explanation is clear enough to communicate my problem. At any rate, what am I seeing wrong here?

It's gotten from the obvious relation that, if we fix a point in K', it will move away from K at the same speed as K' moves away from K, namely v. So it isn't necessary to pick the point x'=0, any other point will do fine.

I don't know how much math experience you have, but it follows quite directly:
If you fix x', say x'=L (so it's like the coordinate of the endpoint of a fixed rod of length L in K'), then from (5):
L=ax-bct
or
x=(L/a)+(bc/a)t
which is the equation for uniform motion with speed bc/a, so evidently bc/a=v always.
Incidentally, notice that at t=0 the endpoint is at [itex]L/a=L/\gamma[/itex], which means the length of the rod is shorter by that factor gamma as seen from K, which is of course the phenomenon of Lorentz contraction.
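
As a quick numerical illustration (the speed v=0.6c is just a value picked for convenience):

[tex]
\gamma = \frac{1}{\sqrt{1-0.6^2}} = \frac{1}{\sqrt{0.64}} = 1.25, \qquad \frac{L}{\gamma} = 0.8\,L
[/tex]

So a rod of rest length 1 m in K' would be measured as 0.8 m long from K.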

NanakiXIII · Dec 27, 2006

Thanks again, Galileo, for your explanations. I'm not sure what you mean about fixing x', but the first, simple explanation came through.

Going through the derivation again, I've stumbled on yet another problem I'm hoping someone could help me with. To obtain the equations on (5), equations (3) and (4) are added and subtracted. What puzzles me is why one should want to do that. Why add and subtract (3) and (4)?

bernhard.rothenstein · Dec 28, 2006

Lorentz transformation

NanakiXIII said:

Thanks again, Galileo, for your explanations. I'm not sure what you mean about fixing x', but the first, simple explanation came through.

Going through the derivation again, I've stumbled on yet another problem I'm hoping someone could help me with. To obtain the equations on (5), equations (3) and (4) are added and subtracted. What puzzles me is why one should want to do that. Why add and subtract (3) and (4)?

Have a look at Yuan Zhong Zhang, Special Relativity and Its Experimental Foundations (World Scientific, 1996), to see how a guessed form of the transformation equations, together with some physical conditions imposed on it, leads to expressions for the 16 coefficients.

NanakiXIII · Dec 28, 2006

I haven't yet been able to find that particular piece, but from your description of it, I gather that the derivation isn't entirely based on pure logic and mathematics, am I correct?

In the meantime, I've continued on through the derivation and stumbled yet again upon problems. I hope someone will be able to clear things up for me.

First of all, I was expecting to see the equations of (8) and then similar ones for y' and z'. It would seem the most logical approach, or am I mistaken in this?

Then there's (11a). There's a large piece of text between (11) and (11a), but what I gather is that (11) needs to be generalized. (11a), however, is just (11) with sigma=1. I'd sooner call that specializing the equation: simplifying it to an extended version of (8a), so that it only holds for the x-axis. Perhaps I'm just reading it completely wrong (as you may or may not have noticed, English isn't my native language), or perhaps just misunderstanding.

In general, the entire course of the derivation from (10) onwards seems rather vague. I can follow things through to (11), but the point of it all seems to escape me. If anyone could clarify things a little, I'd much appreciate it.

Jheriko · Dec 28, 2006

I found http://www.mth.uct.ac.za/omei/gr/chap1/node4.html from http://www.mth.uct.ac.za/omei/gr/ useful when I was first learning SR. It uses a physical example of light being emitted and reflected off of something back towards the emitter as a starting point.

I own the book you mention, I found it okay to read but the translation is quite poor in places. I've lent it to someone recently and they said that they couldn't get into it at all. I can't actually remember whether or not there is a derivation of the Lorentz transformation in the book or whether it is just stated that "these are the transformations you must use to get the right answers" (EDIT: just noticed you mentioned the appendix there).

I was also bitterly disappointed with the section on General Relativity, it barely scratches the surface.

Galileo · Dec 28, 2006

NanakiXIII said:

Going through the derivation again, I've stumbled on yet another problem I'm hoping someone could help me with. To obtain the equations on (5), equations (3) and (4) are added and subtracted. What puzzles me is why one should want to do that. Why add and subtract (3) and (4)?

Because you want x' and t' (or ct') in terms of x and t. So you have to solve the linear system:
[tex] x'-ct'=\lambda (x-ct)[/tex]
[tex] x'+ct'=\mu (x+ct)[/tex]
for x' and t'. The fastest way is clearly to add and subtract the two. Then a and b are introduced for notational convenience. You could keep on working with lambda and mu, but that's a bit clumsy because of the lengthy expressions you might get.
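
Written out, as a sketch of that step:

[tex]
\begin{aligned}
\text{adding:}\quad 2x' &= (\lambda+\mu)\,x - (\lambda-\mu)\,ct \\
\text{subtracting the first from the second:}\quad 2ct' &= (\lambda+\mu)\,ct - (\lambda-\mu)\,x \\
\text{with } a=\tfrac{\lambda+\mu}{2},\; b=\tfrac{\lambda-\mu}{2}:\quad x' &= ax - bct, \qquad ct' = act - bx
\end{aligned}
[/tex]

The last line is the pair of equations labelled (5).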

Galileo · Dec 28, 2006

NanakiXIII said:

First of all, I was expecting to see the equations of (8) and then similar ones for y' and z'. It would seem the most logical approach, or am I mistaken in this?

The given Lorentz transformation holds when K and K' have the same orientation in space and K' moves in the positive x-direction (called the standard configuration). Coordinates perpendicular to the direction of motion are not changed.
To obtain the transformation for a system K' going, say, in the positive y-direction just switch x and y.
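
For concreteness, a sketch of both cases, with [itex]\gamma=1/\sqrt{1-v^2/c^2}[/itex]. The first line is the standard configuration; the second is what the swap gives for K' moving along the positive y-axis:

[tex]
\begin{aligned}
\text{motion along } x:&\quad x' = \gamma\,(x - vt), \quad y' = y, \quad z' = z, \quad t' = \gamma\left(t - \frac{vx}{c^2}\right) \\
\text{motion along } y:&\quad x' = x, \quad y' = \gamma\,(y - vt), \quad z' = z, \quad t' = \gamma\left(t - \frac{vy}{c^2}\right)
\end{aligned}
[/tex]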

Then there's (11a). There's a large piece of text between (11) and (11a), but what I gather is that (11) needs to be generalized. (11a), however, is just (11) with sigma=1. I'd sooner call that specializing the equation: simplifying it to an extended version of (8a), so that it only holds for the x-axis. Perhaps I'm just reading it completely wrong (as you may or may not have noticed, English isn't my native language), or perhaps just misunderstanding.

It says the transformation should be generalized to arbitrary configurations (orientations of K and K' and an arbitrary velocity direction). But these can always be obtained by starting from the standard one and applying rotation transformations on K and K'.

It says the relation (11a) always holds for arbitrary Lorentz transformations.

NanakiXIII · Dec 28, 2006

Thanks again, Galileo. Looking through things again, I think I've come to understand that (11a) is simply an equation that holds true, not one that is used for calculations. Am I correct?

I understand that if the system would be moving along one of the other axes, you could switch the variables, but what if it's not moving along an axis, what if it's moving in a random direction? Is that what you meant when you talked about applying rotation transformations? What are rotation transformations?

Galileo · Dec 28, 2006

NanakiXIII said:

Thanks again, Galileo. Looking through things again, I think I've come to understand that (11a) is simply an equation that holds true, not one that is used for calculations. Am I correct?

It is not used for the derivation of the Lorentz transformation, correct. It's a consequence of it and it's probably mentioned because it's an important result.

I understand that if the system would be moving along one of the other axes, you could switch the variables, but what if it's not moving along an axis, what if it's moving in a random direction? Is that what you meant when you talked about applying rotation transformations? What are rotation transformations?

Ok, in short. Suppose K' is stationary wrt K (their origins coincide at all times), but K' is rotated with respect to K. Then K and K' will use different spatial coordinates to describe a point in space. The coordinates are related by a linear transformation (just as with a Lorentz transformation).
Check here for more: http://en.wikipedia.org/wiki/Coordinate_rotation

If you want the Lorentz transformation for arbitrary orientations and velocity, then simply apply rotations to K and K' such that their orientations are the same and the x-axes point along the direction of motion, then apply the 'standard' Lorentz transformation, then rotate them back to the original orientations. The principle is simple, but the math is tedious. I haven't seen anyone use it in practice.
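
In symbols, a sketch of that recipe. The names are labels assumed here for illustration: R and R' are the spatial rotations that bring the axes of K and K' into standard configuration, and B(v) is the standard transformation along x:

[tex]
X' = (R')^{-1}\,B(v)\,R\,X, \qquad X = (ct, x, y, z)
[/tex]

For the special case where the axes of K and K' are parallel (R'=R) and K' moves with velocity vector [itex]\mathbf{v}[/itex], this works out to the well-known vector form

[tex]
\begin{aligned}
\mathbf{r}' &= \mathbf{r} + \left[(\gamma-1)\,\frac{\mathbf{r}\cdot\mathbf{v}}{v^2} - \gamma t\right]\mathbf{v} \\
t' &= \gamma\left(t - \frac{\mathbf{r}\cdot\mathbf{v}}{c^2}\right)
\end{aligned}
[/tex]

which leaves the components of [itex]\mathbf{r}[/itex] perpendicular to [itex]\mathbf{v}[/itex] unchanged and reduces to the standard equations when [itex]\mathbf{v}[/itex] points along x.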

NanakiXIII · Dec 28, 2006

Ah, I get it, thanks. One question arises from your answer, however. If no one uses this method in practice, how is it done usually?

Galileo · Dec 30, 2006

NanakiXIII said:

Ah, I get it, thanks. One question arises from your answer, however. If no one uses this method in practice, how is it done usually?

I'm not sure what you mean. Lorentz transformations are used all the time, but mostly the standard ones. For a particular problem you usually pick your axes such that the equations are simple (i.e. in standard configuration). So the complicated ones, where the systems' axes are not parallel and the velocity direction is arbitrary, are not used in practice AFAIK. (Maybe in working with particle colliders, though.)

NanakiXIII · Dec 30, 2006

Ah, I understand from your description, thank you.

greensmith · Jun 6, 2011

Thnx, Galileo, I had exactly the same question!
