# Deriving Lorentz transformations

#### Erland

Since $\mathbf{T}^{-1}$ shares the same property, we have $l(\mathbf{v})=l(\mathbf{T}^{-1}(\mathbf{T}(\mathbf{v})))=k\,l(\mathbf{T}(\mathbf{v}))=k^2\,l(\mathbf{v})$, whence $k^2=1$.
But $k$ depends upon $T$ and there might be another $k$ for $T^{-1}$.

#### facenian

But $k$ depends upon $T$ and there might be another $k$ for $T^{-1}$.
First, I must say that for $\mathbf{T}^{-1}(\mathbf{v})$ I should have used $\mathbf{T}^{-1}(-\mathbf{v})$, but if space is to be isotropic then $\mathbf{T}$ (and its inverse) can only depend on $|\mathbf{v}|$; for a similar reason $\mathbf{T}^{-1}(v)=\mathbf{T}(v)$, i.e., isotropy of space demands $\mathbf{T}^{-1}=\mathbf{T}$.

#### Samy_A

Homework Helper
It remains to figure out:
1. Which extra conditions do we need to ensure that $k=1$?
I'm not sure if you mean mathematical or physical conditions.

Purely mathematically, you showed that $T$ is represented as a matrix by
$\begin{pmatrix} a & b \\ b& a \end{pmatrix}$ or $\begin{pmatrix} a & b \\ -b& -a \end{pmatrix}$.
$k=\pm \det(A)$
So $k=1$ would mean the form $\begin{pmatrix} a & b \\ b& a \end{pmatrix}$ with $a^2-b^2=1$.
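As a sanity check, here is a small numerical sketch in Python (with arbitrary sample values for $a$, $b$ and the event; `mink` is just a helper name) of the claim that the Minkowski length picks up the factor $k=\pm\det(A)$ for the two matrix forms:

```python
import numpy as np

def mink(t, x):
    """Squared Minkowski length t^2 - x^2 (units with c = 1)."""
    return t**2 - x**2

a, b = 1.7, 0.9        # arbitrary matrix entries
t, x = 2.0, -0.5       # arbitrary event

# First form [[a, b], [b, a]]: k = a^2 - b^2 = +det(A)
A1 = np.array([[a, b], [b, a]])
t1, x1 = A1 @ [t, x]
assert np.isclose(mink(t1, x1), (a**2 - b**2) * mink(t, x))
assert np.isclose(np.linalg.det(A1), a**2 - b**2)

# Second form [[a, b], [-b, -a]]: k = a^2 - b^2 = -det(A)
A2 = np.array([[a, b], [-b, -a]])
t2, x2 = A2 @ [t, x]
assert np.isclose(mink(t2, x2), (a**2 - b**2) * mink(t, x))
assert np.isclose(np.linalg.det(A2), -(a**2 - b**2))
```

Both forms scale the interval by $a^2-b^2$; only the sign of the determinant distinguishes them, which is the case treated next.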


#### strangerep

Although it is false that all linear transformations $T:\Bbb R^2\to \Bbb R^2$, which map (Minkowski) null vectors (w.r.t. the standard basis) to null vectors must preserve Minkowski length, what is true is that the Minkowski length is multiplied with a constant factor by such a transformation. [...]
Yes, you're talking about (uniform) dilations (aka scaling transformations). The largest group which maps null vectors to null vectors is the conformal group, which contains the Poincare group, dilations, and (so-called) "special conformal transformations". The latter are non-linear, so not relevant here.

Also consider the distinction between the O(3) and SO(3) groups...

In physically-motivated derivations of Lorentz transformations, one assumes that all inertial observers set up their local coordinate systems using identical rods and clocks. I.e., they use the same units. This removes the dilational degree of freedom from the transformations.

#### Erland

I'm not sure if you mean mathematical or physical conditions
I would prefer mathematically formulated assumptions which are motivated physically.
Purely mathematically, you showed that $T$ is represented as a matrix by
$\begin{pmatrix} a & b \\ b& a \end{pmatrix}$ or $\begin{pmatrix} a & b \\ -b& -a \end{pmatrix}$.
$k=\pm \det(A)$
So $k=1$ would mean the form $\begin{pmatrix} a & b \\ b& a \end{pmatrix}$ with $a^2-b^2=1$
It could also be

$\begin{pmatrix} a & b \\ -b& -a \end{pmatrix}$

with $k=a^2-b^2=-\det(A)$ and $\det(A)=-1$.

But I think we can rule out this case by a continuity/connectedness assumption: for each such matrix $A$, we assume that there is a continuous path $h:[0,1]\to \Bbb M_{22}$ (the space of $2\times2$ matrices; we assume that the range of $h$ contains only the "right" kind of invertible matrices) with $h(0)=I$ (the identity matrix, corresponding to relative velocity $0$) and $h(1)=A$. The entries of these matrices are then continuous functions on $[0,1]$. This can be motivated physically by the argument that it should be possible to accelerate any object continuously from rest to any velocity (less than light speed).
For such a matrix

$B=\begin{pmatrix} a & b \\ c& d \end{pmatrix}$

we have $ad+bc=1>0$ at $t=0$, where $B=I$, while in the second case above, $ad+bc=-a^2-b^2<0$ at $t=1$, where $B=A$. But $ad+bc$ is a continuous function on $[0,1]$, so for some $t\in[0,1]$ we would have $ad+bc=0$. However, $ad+bc=\pm(a^2+b^2)$, and this can only be $0$ if $a=b=0$, which does not give an invertible matrix and hence is excluded. It follows that we must always have the first case:

$A=\begin{pmatrix} a & b \\ b& a \end{pmatrix}.$
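The connectedness argument can be illustrated numerically. A Python sketch (assuming the standard boost form for the path; `boost` is my own helper, not from the thread): along a continuous path of boosts from the identity, $ad+bc=a^2+b^2$ stays $\ge 1$, so it can never pass through $0$ to reach the second form, where $ad+bc<0$.

```python
import numpy as np

def boost(v):
    """2D Lorentz boost of the first form (a = gamma, b = -gamma*v), with a^2 - b^2 = 1."""
    g = 1.0 / np.sqrt(1.0 - v**2)
    return np.array([[g, -g * v], [-g * v, g]])

# Continuous path h(s) from h(0) = I to h(1) = boost(0.8).
for s in np.linspace(0.0, 1.0, 101):
    B = boost(s * 0.8)
    a, b, c, d = B.ravel()
    # ad + bc = a^2 + b^2 >= 1 along the whole path, so it never vanishes,
    # and the path cannot cross into the second form (where ad + bc < 0).
    assert a * d + b * c >= 1.0 - 1e-12
```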

#### Samy_A

Homework Helper
It could also be

$\begin{pmatrix} a & b \\ -b& -a \end{pmatrix}$

with $k=a^2-b^2=-\det(A)$ and $\det(A)=-1$.
Oh yes, of course.
I don't know what the fuss is all about in this thread, but my posts in the thread below might be helpful.
Yes, you're talking about (uniform) dilations (aka scaling transformations). The largest group which maps null vectors to null vectors is the conformal group, which contains the Poincare group, dilations, and (so-called) "special conformal transformations". The latter are non-linear, so not relevant here.

Also consider the distinction between the O(3) and SO(3) groups...

In physically-motivated derivations of Lorentz transformations, one assumes that all inertial observers set up their local coordinate systems using identical rods and clocks. I.e., they use the same units. This removes the dilational degree of freedom from the transformations.
Many thanks for the link and the explanation.

Concerning "what the fuss is all about" (only talking for myself of course):
For the layman in SR, it is sometimes very enlightening to read a more basic approach (as done here). I learned a lot reading this thread.


#### facenian

I don't know what the fuss is all about in this thread, but my posts in the thread below might be helpful.
Very interesting post. However, I would modify the demonstration because it has one flaw. The problem I see is the assertion that since straight lines must transform into straight lines, the transformation must be linear. I think this is a common mistake (see, for instance, "The Special Theory of Relativity" by Aharoni).
The amendment I propose is summarized as follows:
The principle of Relativity only implies the conformal group, and this means $ds'^2=\alpha(x^\mu,\vec{v})\,ds^2$. Here is where homogeneity of space and time comes in, demanding that $\alpha$ be independent of the spacetime variables $x^\mu$; isotropy of space then limits the dependence $\alpha(\vec{v})$ to $\alpha(v)$. An argument like the one given in Landau and Lifshitz, Volume 2, now proves $\alpha=1$.
Finally, it can be shown (see, for instance, "Gravitation and Cosmology" by S. Weinberg) that the only transformations that leave $ds^2$ invariant are linear transformations.
So, I guess, taking pieces from these three authors, a clear-cut demonstration can be built.
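A minimal version of that last $\alpha=1$ step (my own sketch; the full argument in Landau and Lifshitz also composes three frames to exclude a genuine $v$-dependence) runs as follows. Applying the transformation with velocity $\vec v$ and then its inverse (velocity $-\vec v$, with the same $\alpha(v)$ by isotropy) must return the original interval:
$$ds''^2=\alpha(v)\,ds'^2=\alpha(v)^2\,ds^2=ds^2,$$
so $\alpha(v)^2=1$, i.e. $\alpha(v)=\pm1$; continuity in $v$ together with $\alpha(0)=1$ then forces $\alpha(v)=1$.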

#### Fredrik

Staff Emeritus
Gold Member
The problem I see is the assertion that since straight lines must transform into straight lines, the transformation must be linear.
I think the correct statement is that if $T:\mathbb R^n\to\mathbb R^n$ is a bijection that takes straight lines to straight lines, then there's a linear bijection $\Lambda:\mathbb R^n\to\mathbb R^n$ and an $a\in\mathbb R^n$ such that $T(x)=\Lambda x+a$ for all $x\in\mathbb R^n$. So if we add the requirement that $T(0)=0$, $T$ must be linear.

#### facenian

I think the correct statement is that if $T:\mathbb R^n\to\mathbb R^n$ is a bijection that takes straight lines to straight lines, then there's a linear bijection $\Lambda:\mathbb R^n\to\mathbb R^n$ and an $a\in\mathbb R^n$ such that $T(x)=\Lambda x+a$ for all $x\in\mathbb R^n$. So if we add the requirement that $T(0)=0$, $T$ must be linear.
First, if $\mathbf{a}\neq 0$ the transformation is still linear, but never mind that; I think you just missed it.
On the other hand, the problem here is not the existence of a linear transformation; the problem is to conclude that the transformation must be linear.


#### Fredrik

Staff Emeritus
Gold Member
First, if $\mathbf{a}\neq 0$ the transformation is still linear, but never mind that; I think you just missed it.
Consider the T defined by $T(x)=x+(1,0,0,0)$. It's not linear, since
\begin{align*}
&T(2(1,1,1,1))=T(2,2,2,2)=(2,2,2,2)+(1,0,0,0)=(3,2,2,2),\\
&2T(1,1,1,1)=2((1,1,1,1)+(1,0,0,0))=2(2,1,1,1) =(4,2,2,2).
\end{align*}
On the other hand, the problem here is not the existence of a linear transformation; the problem is to conclude that the transformation must be linear.
Right, and you can do that if you add the assumption that $T(0)=0$. Without that assumption, the correct conclusion is that $T-T(0)$ is linear.
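Both points are easy to check numerically. A Python sketch (arbitrary sample vectors; this illustrates the statement, not the proof):

```python
import numpy as np

a = np.array([1.0, 0.0, 0.0, 0.0])

def T(x):
    """A translation: takes straight lines to straight lines, but T(0) != 0."""
    return x + a

x = np.array([1.0, 1.0, 1.0, 1.0])
y = np.array([0.5, -2.0, 3.0, 0.0])

# T fails homogeneity, so it is not linear:
assert not np.allclose(T(2 * x), 2 * T(x))

# But T - T(0) is linear (here it is just the identity map):
L = lambda z: T(z) - T(np.zeros(4))
assert np.allclose(L(2 * x), 2 * L(x))
assert np.allclose(L(x + y), L(x) + L(y))
```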

#### facenian

Consider the $T$ defined by $T(x)=x+(1,0,0,0)$. It's not linear, since
Yes, I'm sorry, you're right; I was thinking of another kind of linearity, one that allows for "linear but not homogeneous".
I don't know if your approach is relevant to our discussion; maybe it is too advanced for me.

#### Fredrik

Staff Emeritus
Gold Member
I don't know if your approach is relevant to our discussion; maybe it is too advanced for me.
The theorem is relevant to any approach to SR that says "takes straight lines to straight lines" instead of "is linear". (Edit: The last sentence in this post explains why).

Unfortunately the proof is very long. I will only mention a few things from the notes I made a few years ago.

There's a version of this theorem that deals with affine spaces rather than vector spaces. It's called "the fundamental theorem of affine geometry". I studied a proof of that theorem in a book on affine spaces, and sort of "translated" it into a proof about vector spaces.

Let T be a permutation of $\mathbb R^4$ that takes straight lines to straight lines. This assumption is not sufficient to ensure that T is linear, but it is sufficient to ensure that T is affine, i.e. that there's a linear bijection $\Lambda:\mathbb R^4\to\mathbb R^4$ and a vector $a$ such that $T(x)=\Lambda x+a$ for all $x\in\mathbb R^4$. The key steps of the proof are as follows:

1. Define $\Lambda=T-T(0)$ and prove that $\Lambda$ is a bijection that takes straight lines to straight lines.
2. Prove that for all x,y such that {x,y} is linearly independent, we have $\Lambda(x+y)=\Lambda(x)+\Lambda(y)$.
3. Prove that for all x and all real numbers k, we have $\Lambda(kx)=k\Lambda(x)$.
4. Prove that for all x,y such that {x,y} is linearly dependent, we have $\Lambda(x+y)=\Lambda(x)+\Lambda(y)$.

Step 3 breaks up into a trivial case and a difficult case. If x=0, the proof is trivial. If x≠0, the strategy is to prove that there's a function $f:\mathbb R\to\mathbb R$ such that:
(a) $\Lambda(kx)=f(k)\Lambda(x)$.
(b) f is bijective.
(c) f is a field homomorphism.

Statements (b) and (c) say that f is a field automorphism. This result is useful because it's possible to prove that the only field automorphism on ℝ is the identity map.
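For completeness, the standard sketch of that last fact: a field automorphism $f:\mathbb R\to\mathbb R$ satisfies $f(1)=1$, hence fixes every rational; moreover
$$x>0 \;\Rightarrow\; f(x)=f(\sqrt{x}\cdot\sqrt{x})=f(\sqrt{x})^2>0,$$
so $f$ preserves order. An order-preserving map that fixes the dense set $\mathbb Q$ must be the identity: for any $x$ and rationals $p<x<q$ we get $p=f(p)<f(x)<f(q)=q$, and squeezing $p,q\to x$ gives $f(x)=x$.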

It's a trivial corollary of this very non-trivial theorem that a permutation that takes straight lines to straight lines and 0 to 0 is linear.


#### strangerep

The principle of Relativity only implies the conformal group [...]
That's incorrect. The principle of relativity implies the group of fractional-linear transformations.

If one also invokes the light principle, and applies it by finding the largest group that preserves the (vacuum) Maxwell eqns, one finds the conformal group.

Taken together, the common subgroup consists of linear transformations.

Ref: Fock & Kemmer, "Space, Time & Gravitation", 2nd ed. 1964.

#### Erland

The 2D Lorentz transformation can be derived from the following mathematical assumptions, which all have physical motivations.

It can be proved that there is a unique one-parameter family of transformations $L_v: \Bbb R^2\to \Bbb R^2$, defined for all $v\in (-1,1)$, satisfying:

1. For each fixed $(t,x)\in \Bbb R^2$, the mapping $H:(-1,1)\to \Bbb R^2$ given by $H(v)=L_v(t,x)$ is continuous.
2. Each $L_v$ ($v\in (-1,1)$) is a bijection, and its inverse is $L_w$ for some $w\in (-1,1)$.
3. For each $v\in(-1,1)$: Each line $(t,x)=(t_0,x_0)+s(1,a)$ ($s\in \Bbb R$), with $a\in (-1,1)$ and $t_0,x_0\in\Bbb R$, is mapped by $L_v$ to a line $(t',x')=(t'_0,x'_0)+r(1,b)$ ($r\in \Bbb R$), for some $b\in (-1,1)$ and $t'_0,x'_0\in\Bbb R$.
4. $L_v(0,0)=(0,0)$, for all $v\in (-1,1)$.
5. $L_0$ is the identity transformation on $\Bbb R^2$.
6. For each $v\in (-1,1)$: $L_v(1,v)=(t',0)$, for some $t'\in \Bbb R$.
7. For each $v\in (-1,1)$: $L_v(1,1)=(r,\pm r)$ and $L_v(1,-1)=(s,\pm s)$ for some $r,s\in \Bbb R$.
8. For each $v\in (-1,1)$: If $L_v(t,x)=(t',x')$, for some $t,x,t',x'\in \Bbb R$, then either $L_{-v}(t,-x)=(t',x')$ or $L_{-v}(t,-x)=(t',-x')$.

Each $L_v$ ($v\in (-1,1)$) is then given by $L_v(t,x)=(1/\sqrt{1-v^2})(t-vx, -vt+x)$ for all $(t,x)\in\Bbb R^2$.

One need not assume that $L_v$ is linear, for this follows: it was proved by micromass, strangerep and Fredrik in an old thread
for the general case when all lines are mapped onto lines, and it can be proved that it suffices to look at "timelike" lines.

Physical motivations:

1. It is possible to accelerate an object continuously to any speed less than light speed, through inertial frames.
2. The two frames are interchangeable. A consequence of the special principle of relativity.
3. An object which is not being acted upon by a force, and hence moves with uniform rectilinear (timelike) motion w.r.t. one frame, moves just as freely w.r.t. the other frame. A consequence of the special principle of relativity.
4. Just an arbitrary practical convention about how we put marks on our rods and synchronize our clocks.
5. If $v=0$, the frames coincide.
6. The relative velocity of Frame 2 w.r.t Frame 1 is $v$.
7. A consequence of the invariance of the light speed.
8. This is about spatial isotropy. The transformation should still be valid if we change the directions of the spatial axes. See an earlier post by strangerep in this thread.
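The claimed family, and a few of the listed properties, can be spot-checked numerically. The following Python sketch (units with $c=1$, sample $v=0.6$) verifies properties 2, 5, 6 and 7 for the formula above:

```python
import numpy as np

def L(v):
    """L_v(t, x) = (t - v x, -v t + x) / sqrt(1 - v^2), as a matrix acting on (t, x)."""
    g = 1.0 / np.sqrt(1.0 - v**2)
    return np.array([[g, -g * v], [-g * v, g]])

v = 0.6
assert np.allclose(L(v) @ L(-v), np.eye(2))        # property 2: the inverse is L_{-v}
assert np.allclose(L(0.0), np.eye(2))              # property 5: L_0 is the identity

tp, xp = L(v) @ [1.0, v]
assert np.isclose(xp, 0.0)                         # property 6: L_v(1, v) = (t', 0)

for c in (1.0, -1.0):                              # property 7: light rays -> light rays
    tr, xr = L(v) @ [1.0, c]
    assert np.isclose(abs(xr), abs(tr))
```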

But it becomes more complex in 4D spacetime...


#### Fredrik

Staff Emeritus
Gold Member
That's incorrect. The principle of relativity implies the group of fractional-linear transformations.
What assumptions are made about the domains of these transformations in this approach? (Apologies if you have already told me. In my defense, the thread that Erland linked to above is 3 years old.) If we assume (as I do) that these transformations are permutations of $\mathbb R^4$ (and that they take 0 to 0), we get the stronger result that they are linear.

#### strangerep

If we assume (as I do) that these transformations are permutations of $\mathbb R^4$ (and that they take 0 to 0), we get the stronger result that they are linear.
Yes.

Afaict, it's possible to make sense of the FL transformations if one restricts the domain to the interiors of the null bicone of each observer. But that's beyond the scope of this thread.

#### sweet springs

In §3 of Einstein's first paper on special relativity, "On the Electrodynamics of Moving Bodies" (A. Einstein, June 30, 1905, https://www.fourmilab.ch/etexts/einstein/specrel/www/),
he deals with $-v$. I should appreciate it if someone could explain how Einstein deduced it.
He writes: "Since the relations between x', y', z' and x, y, z do not contain the time t, the systems K and K' are at rest with respect to one another, and it is clear that the transformation from K to K' must be the identical transformation."


#### facenian

That's incorrect. The principle of relativity implies the group of fractional-linear transformations.
Yes, I was assuming that the PR includes the LP, but that is not a good convention.
Thanks for the reference.

#### facenian

The theorem is relevant to any approach to SR that says "takes straight lines to straight lines" instead of "is linear". (Edit: The last sentence in this post explains why).
Very interesting, and yes, it is relevant to this discussion.

#### facenian

The 2D Lorentz transformation can be derived from the following mathematical assumptions, which all have physical motivations.
I think it's a good warm-up to attack the really important case in 4D.
In that respect, I think strangerep and Fredrik gave the answers.

#### Erland

I think it's a good warm-up to attack the really important case in 4D.
The main difficulty in going to 4D is my point 8, about isotropy. How to formulate this mathematically in a sufficiently simple manner?

#### samalkhaiat

Very interesting post. However I would modify the demonstration because it has one flaw. The problem I see is the assertion that since straight lines must transform into straight lines, the transformation must be linear. I think this is a common mistake(for an example see, for instance, "The Special Theory of Relativity" by Aharoni)
The term “linear” in that thread is not the same as linearity in the algebraic sense: $f(ax + by) = af(x) + bf(y)$. Linearity there means polynomial of degree one in the coordinates, i.e., solutions to the system of second-order PDEs
$$\frac{\partial^{2}F^{\sigma}}{\partial x^{\mu}\partial x^{\nu}} = 0 .$$
This should have been clear to you because the inhomogeneous relation $F(x)=Ax + b$ is not linear in the algebraic sense, but it is a linear relation in the sense used in analytic geometry. Having said this, one can still speak of linear Poincaré transformations: with every element $(\Lambda , a)$ of the Poincaré group, we can associate a $5 \times 5$ matrix $\Gamma$ defined by
$$\Gamma = \begin{pmatrix} \Lambda^{\mu}{}_{\nu} & a^{\mu} \\ 0_{1\times 4} & 1 \end{pmatrix} . \ \ \ \ \ \ \ \ (1)$$
Then the multiplication law in the Poincaré group $(\Lambda_{2},a_{2})\cdot (\Lambda_{1},a_{1}) = (\Lambda_{2}\Lambda_{1} , a_{2} + \Lambda_{2}a_{1})$ shows that the correspondence $(\Lambda , a) \to \Gamma$ is an isomorphism of the Poincaré group onto the subgroup of $GL(5,\mathbb{R})$ consisting of matrices of the form (1), where $\Lambda$ satisfies $\Lambda^{T}\eta \Lambda = \eta$ and $a$ is an arbitrary 4-vector. The Poincaré group can, therefore, be identified with this matrix Lie group. And Minkowski spacetime $M^{4}$ can be identified with the hyperplane $x^{4}=1$ in $\mathbb{R}^{5}$ with coordinates $(x^{0}, x^{1}, \dots , x^{4})$. Then the linear operator (1) acts on this hyperplane as the corresponding Poincaré transformation $(\Lambda , a)$.
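The isomorphism claim is easy to verify numerically. A Python sketch (the sample elements are arbitrary; `Gamma` is my own helper name for the matrix in (1)):

```python
import numpy as np

def Gamma(Lam, a):
    """The 5x5 matrix of eq. (1) associated with the Poincare element (Lam, a)."""
    G = np.eye(5)
    G[:4, :4] = Lam
    G[:4, 4] = a
    return G

# Sample elements: a boost along x, and a pure translation.
v = 0.6
g = 1.0 / np.sqrt(1.0 - v**2)
Lam1 = np.eye(4)
Lam1[0, 0] = Lam1[1, 1] = g
Lam1[0, 1] = Lam1[1, 0] = -g * v
a1 = np.array([1.0, 2.0, 0.0, -1.0])
Lam2, a2 = np.eye(4), np.array([0.0, 0.0, 3.0, 0.0])

# Multiplication law: (Lam2, a2).(Lam1, a1) = (Lam2 Lam1, a2 + Lam2 a1)
assert np.allclose(Gamma(Lam2, a2) @ Gamma(Lam1, a1),
                   Gamma(Lam2 @ Lam1, a2 + Lam2 @ a1))

# Gamma acts on the hyperplane x^4 = 1 as x -> Lam x + a:
x = np.array([0.3, -1.2, 0.5, 2.0])
out = Gamma(Lam1, a1) @ np.append(x, 1.0)
assert np.allclose(out[:4], Lam1 @ x + a1) and np.isclose(out[4], 1.0)
```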

The principle of Relativity only implies the conformal group and this means
This is incorrect. The conformal group does not even act on Minkowski space $M^{4}$. It acts on the (conformally) compactified version of Minkowski space $\bar{M}^{4}\cong S^{3}\times S^{1} / Z_{2}$.

An argument like the one given in Landau and Lifshitz Volume 2 now proves $\alpha=1$
Did you read the whole post? I used an argument similar to Landau's to show $\alpha = 1$.

Finally, it can be shown (see, for instance, "Gravitation and Cosmology" by S. Weinberg) that the only transformations that leave $ds^2$ invariant are linear transformations.
The theorem you are talking about is the following:
“The coordinate transformation from one Minkowski chart to another is a Poincaré transformation”, or equivalently stated, “The Poincaré group is the maximal symmetry group of Minkowski spacetime”.
Again, you should read my posts in that thread, because I proved the infinitesimal version of this theorem there.

#### strangerep

The main difficulty in going to 4D is my point 8, about isotropy. How to formulate this mathematically in a sufficiently simple manner?
Well, here is a sketch to get you started...

Pick an arbitrary (fixed) direction in 3-space, denoted by the unit vector $\widehat {\bf v}$. We consider a transformation to a frame with 3-velocity ${\bf v} \equiv v \widehat {\bf v}$, where the nonbold $v$ is a real number.

Write the linear transformations in the form: $$t' ~=~ a(v)t + {\bf b}({\bf v})\cdot {\bf x} ~,~~~~ {\bf x'} ~=~ {\bf d}({\bf v})t + {\bf E}({\bf v}) {\bf x} ~,~~~~~~~ (1)$$ where ${\bf b}, {\bf d}$ are 3-vector-valued functions and ${\bf E}$ is a $3\times 3$ matrix-valued function.

Since ${\bf b}({\bf v})$ is 3-vector-valued and depends only on ${\bf v}$, it is necessarily of the form $b(v){\bf v}$, where the nonbold $b$ is a new function, now scalar-valued. A similar argument applies to ${\bf d}({\bf v})$. So we can rewrite the transformation equations (1) as: $$t' ~=~ a(v)t + b(v) {\bf v}\cdot {\bf x} ~,~~~~ {\bf x'} ~=~ d(v){\bf v}t + {\bf E}({\bf v}) {\bf x} ~.~~~~~~~ (2)$$
Now decompose ${\bf x}$ wrt $\widehat {\bf v}$ as $${\bf x} ~=~ {\bf x}_\| + {\bf x}_\perp$$ into parts parallel and perpendicular to $\widehat {\bf v}$. I.e., ${\bf x}_\| = x_\| \widehat {\bf v}$ and $\widehat {\bf v} \cdot {\bf x}_\perp = 0$. Also decompose ${\bf x}'$ similarly.

I invite you (or any other readers of this thread) to continue the above via the following exercises.

Exercise 1 (Easy): Using $\widehat {\bf v} \cdot {\bf x} = v x_\|$, what form do the transformation equations now take?

Exercise 2 (Harder): Contracting both sides of the ${\bf x'}$ equations with $\widehat {\bf v}$, and using the definition of spatial isotropy I gave earlier, deduce 2 distinct equations governing the transformations for $x'_\|$ and ${\bf x}'_\perp$ separately.

[Edit: Heh, who am I kidding? No one's going to do those.]
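Not a full solution, but the expected endpoint of the exercises can be checked numerically: the standard boost along $\widehat{\bf v}$ acts as a 1+1 boost on the parallel part while leaving the perpendicular part untouched. A Python sketch (arbitrary unit direction and event; units with $c=1$):

```python
import numpy as np

v = 0.6
vhat = np.array([1.0, 2.0, -2.0]) / 3.0          # arbitrary unit 3-vector
g = 1.0 / np.sqrt(1.0 - v**2)                    # gamma factor

t, x = 1.3, np.array([0.4, -1.0, 2.5])           # arbitrary event
x_par = (vhat @ x) * vhat                        # part of x parallel to vhat
x_perp = x - x_par

# Standard boost along vhat:
tp = g * (t - v * (vhat @ x))
xp = x_perp + g * (x_par - v * t * vhat)

# The parallel part transforms like a 1+1 boost; the perpendicular part is untouched:
assert np.isclose(vhat @ xp, g * ((vhat @ x) - v * t))
assert np.allclose(xp - (vhat @ xp) * vhat, x_perp)

# And the Minkowski interval is preserved:
assert np.isclose(tp**2 - xp @ xp, t**2 - x @ x)
```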


#### facenian

The term “linear” in that thread is not the same as linearity in the algebraic sense: $f(ax + by) = af(x) + bf(y)$. Linearity there means polynomial of degree one in the coordinates, i.e., solutions to the system of second-order PDEs
$$\frac{\partial^{2}F^{\sigma}}{\partial x^{\mu}\partial x^{\nu}} = 0 .$$
Yes, and this is precisely the kind of linearity I was referring to. However, I reaffirm my only objection to your demonstration, i.e., the assertion that straight lines transforming into straight lines requires the transformation to be linear (this is the only argument you should attack).
This is incorrect. The conformal group does not even act on Minkowski space $M^{4}$. It acts on the (conformally) compactified version of Minkowski space $\bar{M}^{4}\cong S^{3}\times S^{1}/Z_{2}$.
You are right, I should have said that the light principle(LP) only implies that the transformation is conformal
Did you read the whole post? I used an argument similar to Landau's to show $\alpha = 1$.
As for the rest of your comments, I was only sketching a demonstration and citing known authors because other people may read the post. Yes, I read the post, and I noticed you used a similar argument.
