# Lorentz transformation derivation. What exactly is wrong?

This is probably a stupid mistake I am making, but I can't figure it out. My apologies in advance...
I am familiar with the textbook derivation of the Lorentz transformation (I don't have any problem with it). It starts out by stating:

$$x^2 + y^2 + z^2 - c^2t^2 = x'^2 + y'^2 + z'^2 - c^2t'^2$$

meaning that a sphere of light radiating from the point where the two origins coincide should have the same radius in both coordinate systems. It is also assumed that $x'$ and $t'$ can be expressed as linear combinations of $x$ and $t$ ($x' = a_1x + a_2t$ and $t' = b_1x + b_2t$), while $y' = y$ and $z' = z$. After some boring algebraic manipulation, $a_1$, $a_2$, $b_1$ and $b_2$ are found.
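For reference, the "boring algebraic manipulation" can be sketched as follows. This is one common route and not necessarily the one used in the poster's textbook. It assumes $y' = y$ and $z' = z$ (so those terms cancel), that the primed origin moves at speed $v$ (so $a_2 = -v a_1$), and that $a_1$ and $b_2$ are positive.

$$
\begin{aligned}
(a_1x + a_2t)^2 - c^2(b_1x + b_2t)^2 &= x^2 - c^2t^2 \quad \text{for all } x, t \\
\text{coefficient of } x^2: \quad a_1^2 - c^2b_1^2 &= 1 \\
\text{coefficient of } xt: \quad a_1a_2 - c^2b_1b_2 &= 0 \\
\text{coefficient of } t^2: \quad a_2^2 - c^2b_2^2 &= -c^2 \\
\text{with } a_2 = -va_1: \quad a_1 = b_2 = \gamma, \quad b_1 &= -\gamma v/c^2, \quad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}}
\end{aligned}
$$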

So I thought: why bother with y and z coordinates since they are the same?
So let's concentrate on events happening along the x and x' axes. I don't need the sphere, I just need to consider the ray of light along the axis and write instead:

$$x - ct = x' - ct'$$

But that is obviously different from

$$x^2 - c^2t^2 = x'^2 - c^2t'^2$$

So of course it does not lead anywhere. My naive question is then: where's the flaw in my reasoning?

Bill_K
You need to consider both directions: $x - ct = x' - ct'$ and $x + ct = x' + ct'$.

Multiply them together and you get $(x - ct)(x + ct) = (x' - ct')(x' + ct')$, or $x^2 - c^2t^2 = x'^2 - c^2t'^2$.

Yes, but why does it all go wrong if I restrict my analysis to a light beam moving to the right? I'm confused :(

Bill_K
> Yes, but why does it all go wrong if I restrict my analysis to a light beam moving to the right? I'm confused :(

You started out with a three-dimensional sphere enclosing the origin. If you restrict that to two dimensions, you get a two-dimensional "sphere" (a circle) enclosing the origin.

If you further restrict that to one dimension, you get a one-dimensional sphere. The "sphere in one dimension" still encloses the origin, but it is disconnected - it consists of two points: one point lying along the positive and one point lying along the negative axis. You need to keep both of them, i.e. consider waves traveling in both directions.

If you fail to consider both, you really do get transformations more general than the Lorentz transformation. The result will not be reflection symmetric (x → -x) which is also an essential part of the Lorentz group.

Thank you for your response. I have done the math and I get the same transformation but with the idiotic value for γ:

$$\gamma = \frac{1}{1 + v/c}$$

I still don't understand why. I just started with the assumption: "let there be a light ray moving along the x axis in the positive direction". Then I consider that each of the observers sees the ray moving at the same speed c, and everything follows from there...

Dale
Mentor
Unless you post your derivation it will just be a guessing game.

Okay, there it goes:

$$x' = a_1x + a_2t$$
$$t' = b_1x + b_2t$$

But the point $x' = 0$ moves at speed $v$ in the unprimed system. Setting $x' = 0$ in the first equation:

$$0 = a_1x + a_2t \quad\therefore\quad v = -a_2/a_1$$

Substituting this back into the expression for $x'$:

$$x' = a_1(x - vt)$$

Likewise, the point $x = 0$ moves at speed $-v$ in the primed system. Dividing the expressions for $x'$ and $t'$ at $x = 0$:

$$x'/t' = a_2/b_2 = -v$$

and because $v = -a_2/a_1$ we get

$$a_1 = b_2$$

A photon moving along the positive x axis will be measured at speed $c$ by both observers:

$$x' - ct' = x - ct \qquad (6)$$

Substituting the expressions for $x'$ and $t'$ (with $b_2 = a_1$) into this condition:

$$a_1(x - vt) - c(b_1x + a_1t) = x - ct$$

Rearranging:

$$(a_1 - cb_1)x - (a_1v + a_1c)t = x - ct$$

This is an identity which holds true for every x, t so we can equate the coefficients:

$$a_1 - cb_1 = 1$$
$$a_1v + a_1c = c$$

So

$$a_1 = \frac{1}{1 + v/c} = \gamma, \qquad a_2 = -v\gamma, \qquad b_1 = -\gamma v/c^2, \qquad b_2 = \gamma$$

which is the same transformation, but with the absurd value for $\gamma$:

$$\gamma = \frac{1}{1 + v/c}$$
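
The result of this derivation can be checked numerically. The sketch below is illustrative only: the function names are made up for this example, units are chosen so that c = 1, and b1 is taken as the value that solves a1 - c b1 = 1. It compares the "one-ray" transformation with the standard Lorentz transformation on an arbitrary event.

```python
import math

C = 1.0  # units with c = 1 (an illustrative choice)

def one_ray_transform(x, t, v, c=C):
    """Transformation obtained by demanding x' - ct' = x - ct for every event."""
    g = 1.0 / (1.0 + v / c)               # the "absurd" gamma from the post
    return g * (x - v * t), g * (t - v * x / c**2)

def lorentz_transform(x, t, v, c=C):
    """Standard Lorentz transformation along x."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (x - v * t), g * (t - v * x / c**2)

def interval(x, t, c=C):
    """The quantity x^2 - c^2 t^2 that the textbook derivation keeps invariant."""
    return x**2 - (c * t) ** 2

# An arbitrary event that is NOT on a light ray, and a sample velocity.
x, t, v = 2.0, 1.0, 0.6

xr, tr = one_ray_transform(x, t, v)
xl, tl = lorentz_transform(x, t, v)

print("x - ct preserved by one-ray transform:", xr - C * tr, "vs", x - C * t)
print("interval under one-ray transform:     ", interval(xr, tr), "vs", interval(x, t))
print("interval under Lorentz transform:     ", interval(xl, tl), "vs", interval(x, t))
```

For this sample event the one-ray transformation keeps x - ct fixed but shrinks the interval from 3 to about 0.75, whereas the Lorentz transformation keeps the interval at 3.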

Bill_K
Your Eq (6) is false. For a Lorentz transformation it is NOT true that $x - ct = x' - ct'$. It is only true that (for a light ray) $x - ct = 0$ if and only if $x' - ct' = 0$. In fact, one is a constant times the other: $x - ct = k(x' - ct')$. At the same time, using the relationship for left-going light rays, we can conclude that $x + ct = (1/k)(x' + ct')$. Multiplying the two together, the $k$'s cancel, and $x^2 - c^2t^2 = x'^2 - c^2t'^2$.

My Eq (6) was inspired by the same reasoning used in the textbook derivation, which imagines light being emitted at x = x' = t = t' = 0. It is reasoned that both observers can express the fact that the light moves away at speed c in their own coordinates:

$$x^2 + y^2 + z^2 - c^2t^2 = 0$$

and

$$x'^2 + y'^2 + z'^2 - c^2t'^2 = 0$$

and therefore the derivation starts with:

$$x^2 + y^2 + z^2 - c^2t^2 = x'^2 + y'^2 + z'^2 - c^2t'^2$$

I thought I was saying the same thing with my Eq (6), the difference being that it was simpler because my light ray moved along the x axis. We're both saying that each observer sees the light moving away at speed c.

What I don't quite get is why the concept expressed in the above equation is valid in the textbook derivation while my concept in Eq (6) is not... I don't understand your k, because you say it comes from the Lorentz transformation. But the Lorentz transformation is precisely what I am trying to derive from first principles, so how can it be used as an argument in the derivation itself?

Bill_K
> I don't understand your k, because you say it comes from the Lorentz transformation. But the Lorentz transformation is precisely what I am trying to derive from first principles, so how can it be used as an argument in the derivation itself?

You can't use it as an argument in the derivation, but you must not throw it away! Throwing it away (i.e. setting it equal to 1) is unjustified, and guaranteed to get you the wrong answer.

I just gave it a name k to avoid writing out its value, but it's easy to understand what it represents physically -

$$x - ct = k(x' - ct')$$

it's just the Doppler shift factor. And you can easily calculate its value from the Lorentz transformation:

$$x = \gamma(x' - vt')$$
$$t = \gamma(t' - vx'/c^2)$$

implies

$$x - ct = \gamma(1 + v/c)(x' - ct'), \quad \text{so} \quad k = \gamma(1 + v/c)$$

As I said, for the left-moving rays you get

$$x + ct = \gamma(1 - v/c)(x' + ct')$$

and this time the factor is $\gamma(1 - v/c)$, which happens to equal $1/k$.
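
The two factors can be checked numerically. This is a minimal sketch with made-up function names, in units where c = 1 by default, using the transformation exactly as written above (unprimed coordinates expressed in terms of primed ones).

```python
import math

def doppler_factors(v, c=1.0):
    """Return (k, 1/k) as gamma*(1 + v/c) and gamma*(1 - v/c)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (1.0 + v / c), gamma * (1.0 - v / c)

def unprimed_from_primed(x_p, t_p, v, c=1.0):
    """x = gamma (x' - v t'),  t = gamma (t' - v x' / c^2)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x_p - v * t_p), gamma * (t_p - v * x_p / c**2)

v, c = 0.6, 1.0
k, k_inv = doppler_factors(v, c)

# An arbitrary event in the primed frame (not restricted to a light ray).
x_p, t_p = 3.0, 1.0
x, t = unprimed_from_primed(x_p, t_p, v, c)

print("k * (1/k)      =", k * k_inv)
print("x - ct         =", x - c * t, " k (x' - ct')     =", k * (x_p - c * t_p))
print("x + ct         =", x + c * t, " (1/k) (x' + ct') =", k_inv * (x_p + c * t_p))
```

Because the product of the two factors is 1, multiplying the right-going and left-going relations recovers the invariance of x² - c²t².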

A more sophisticated way of understanding this: define two null vectors $v_1$ and $v_2$ with components

$$v_1 = (1, 1, 0, 0), \qquad v_2 = (1, -1, 0, 0)$$

These are the propagation vectors of the right- and left-going light rays respectively, and they are eigenvectors of the Lorentz transformation, with eigenvalues $k$ and $1/k$. Under a Lorentz transformation $v_1$ is stretched by a factor $k$, while $v_2$ is reduced by the same factor.
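
The eigenvector statement can be illustrated with a 2×2 boost acting on the (ct, x) components only. This is a sketch under assumed conventions: the helper names are hypothetical, and the sign chosen for the off-diagonal terms decides which of the two rays picks up k and which picks up 1/k.

```python
import math

def boost_matrix(beta):
    """2x2 boost acting on (ct, x); beta = v/c. Sign convention is an assumption."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [[gamma, -gamma * beta],
            [-gamma * beta, gamma]]

def apply(m, vec):
    """Multiply a 2x2 matrix by a 2-component vector."""
    return [m[0][0] * vec[0] + m[0][1] * vec[1],
            m[1][0] * vec[0] + m[1][1] * vec[1]]

beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta**2)
k = gamma * (1.0 + beta)          # Doppler factor

boost = boost_matrix(beta)
right_ray = apply(boost, [1.0, 1.0])    # (ct, x) components of v1
left_ray = apply(boost, [1.0, -1.0])    # (ct, x) components of v2

# Each null vector is mapped to a multiple of itself.
print("v1 ->", right_ray, " scale factor:", right_ray[0])
print("v2 ->", left_ray, " scale factor:", left_ray[0])
print("k =", k, " 1/k =", 1.0 / k)
```

With this sign convention the right-going vector is scaled by 1/k and the left-going one by k; boosting with -beta swaps the two. Either way the eigenvalues are reciprocals of each other.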

pervect
Staff Emeritus
As someone here pointed out to me a while ago, another way of thinking about 'k' is that light rays are the eigenvectors of the Lorentz transform, and k, the Doppler shift, is the corresponding eigenvalue.

Thus null vectors, or light rays (x, t), must be mapped by the Lorentz transform to other null vectors (x', t'). They don't have to be the same null vector, though; they can (and generally do) differ by a multiplicative constant. This is k, the Doppler factor.

Because the Lorentz transformation is linear, no mapping of a light ray other than multiplication by a constant k is possible.

This may or may not help the OP, but I thought I'd mention it.

Thank you for your patience. I can see it clearly now. More to myself than to anyone else, I would explain it like this:
When the textbook considers the sphere of light, each observer can write:

$$x^2 + y^2 + z^2 - c^2t^2 = 0$$
$$x'^2 + y'^2 + z'^2 - c^2t'^2 = 0$$

These equations are both right for the photons moving away. However, when we write:

$$x^2 + y^2 + z^2 - c^2t^2 = x'^2 + y'^2 + z'^2 - c^2t'^2$$

what we are doing is proposing that this will hold for every event, not just for that ray of light. We are proposing an invariant.

When I write for my right-moving light ray:

$$x - ct = 0, \qquad x' - ct' = 0$$

That's fine, but when I say:

$$x' - ct' = x - ct$$

I am proposing x - ct as an invariant, which of course it's not. It does not even hold for the left-moving photon, as you said, let alone an arbitrary event!

This illustrates what the Lorentz transformation derivation really means. The derivation states a number of reasonable hypotheses, but it's not a derivation in the mathematical sense. There is no a priori justification for introducing the invariant, only that the result is consistent and works. The same goes for assuming that x and t are linear combinations of x' and t', and that the y and z coordinates are not affected by the motion along x.

Fredrik
Staff Emeritus