# Deriving Lorentz transformations

#### Erland

PeterDonis, this seems quite advanced. I need to study this more to fully understand it. But isn't it much simpler in the SR case, where we have linearity and flatness?

Anyway, is it correct that in defining the Lorentz group, one assumes that the spacetime distance $dx^2+dy^2+dz^2-c^2dt^2$ is preserved by the Lorentz transformations?
I am not willing to take this as an axiom (for physics), except when this distance is 0, since this is just another way of stating the invariance of the light speed. Otherwise, it seems in no way obvious and no simple consequence of Einstein's postulates.

#### PeterDonis

Mentor
isn't it much simpler in the SR case, where we have linearity and flatness?
No. The Killing vector fields I described are for the SR case, flat Minkowski spacetime.

is it correct that in defining the Lorentz group, one assumes that the spacetime distance $dx^2+dy^2+dz^2-c^2dt^2$ is preserved by the Lorentz transformations?
It is often presented that way, but IIRC you don't actually have to make that assumption. The Lorentz group can be defined by the Killing vector fields that generate it and their commutation relations. Showing that the group of transformations on spacetime that have those commutation relations preserves the spacetime interval can then, IIRC, be derived as a theorem.

As a warmup exercise, consider the simpler case of the rotation group SO(3). This group preserves ordinary Euclidean distances in Euclidean 3-space, i.e., it preserves $ds^2 = dx^2 + dy^2 + dz^2$. But I don't think you have to assume that in defining the group; I think you can define the group by its commutation relations, and then derive as a consequence the fact that the transformations in the group preserve Euclidean distances.

I am not willing to take this as an axiom (for physics), except when this distance is 0, since this is just another way of stating the invariance of the light speed.
You can construct a basis for Minkowski spacetime using only null vectors, i.e., any vector can be expressed as a linear combination of null vectors, so invariance of null intervals is sufficient to establish invariance of all intervals.

#### Dale

Mentor
You can construct a basis for Minkowski spacetime using only null vectors, i.e., any vector can be expressed as a linear combination of null vectors, so invariance of null intervals is sufficient to establish invariance of all intervals.
Wow, excellent point.

How does that translate to curved spacetime where the basis vectors would only span the tangent space and not spacetime which is a manifold and not a vector space?

#### PeterDonis

Mentor
How does that translate to curved spacetime where the basis vectors would only span the tangent space and not spacetime which is a manifold and not a vector space?
The tangent space is still Minkowski spacetime, so as long as we are just looking at infinitesimal line elements, everything works the same way.

Once we go beyond infinitesimal line elements, there is, as you know, no such thing as a global "Lorentz transformation" in curved spacetime. Any coordinate transformation must still preserve arc lengths along curves (and other geometric invariants), but there are no global coordinate charts that have all the properties of inertial charts on Minkowski spacetime. So the viewpoint that Lorentz transformations are defined as the ones that preserve spacetime intervals doesn't really have an analogue, globally, in curved spacetime.

This, btw, is another reason to learn the definition of isotropy (and other symmetries) in terms of Killing vector fields; those definitions carry over just fine to curved spacetime. An isotropic curved spacetime is simply one which has a 3-parameter group of KVFs at every event with the commutation relations of SO(3). That's all there is to it.

#### Erland

No. The Killing vector fields I described are for the SR case, flat Minkowski spacetime.
OK, sorry for my misunderstanding. I know I should study differential geometry more, but I have no good book about this available at the moment. I want to thank you all for trying to enlighten me. I don't mean to be difficult, but I do question things I don't find obvious.
It is often presented that way, but IIRC you don't actually have to make that assumption. The Lorentz group can be defined by the Killing vector fields that generate it and their commutation relations. Showing that the group of transformations on spacetime that have those commutation relations preserves the spacetime interval can then, IIRC, be derived as a theorem.

As a warmup exercise, consider the simpler case of the rotation group SO(3). This group preserves ordinary Euclidean distances in Euclidean 3-space, i.e., it preserves $ds^2 = dx^2 + dy^2 + dz^2$. But I don't think you have to assume that in defining the group; I think you can define the group by its commutation relations, and then derive as a consequence the fact that the transformations in the group preserve Euclidean distances.
Maybe I should know this, but what do you mean by "commutation relations"? Do you simply mean relations of the type $AB=BA$, or $A^{-1}B^{-1}AB=I$, where $A$ and $B$ are rotations (in this case)? Or do you mean the set of all commutators $A^{-1}B^{-1}AB$, or the subgroup generated by these (the commutator subgroup), given without explicitly displaying the $A$s and $B$s forming the commutators? Or do you mean something else?
You can construct a basis for Minkowski spacetime using only null vectors, i.e., any vector can be expressed as a linear combination of null vectors, so invariance of null intervals is sufficient to establish invariance of all intervals.
I find it in no way obvious that just because the Minkowski lengths of x and y are preserved by L, the same is true for all linear combinations of x and y. It is of course true, but I know that just because I already know what L looks like.

#### strangerep

Very good, but this then leads to the question: Which are the equations in the theory to which this applies?
It must apply to every equation in the theory under consideration. The question then is how to specify a "theory".

In classical dynamics, this is done by choosing a specific Lagrangian, in the context of an Action principle and the calculus of variations. The Lagrangian then determines everything (including the equations of the theory) by extremizing the action. In practice, most people therefore just concentrate on symmetries of the Lagrangian. In this sense, one says that "the Lagrangian is the theory" (with the underlying framework of Newtonian space and time being understood, together with the calculus of variations).

In the foundations of relativity, one is interested in how events perceived by one observer $O$ may be reconciled with how another observer $O'$ perceives those same events. For the specific case where the observers are unaccelerated (i.e., inertial), and in motion relative to each other (with $O'$ having relative velocity $v$ wrt $O$), and with the origins of their spatiotemporal reference frames coinciding, the equations of this "mini theory" are simply the coordinate transformations between the 2 coordinate systems. Assuming the transformations to be linear, and restricting ourselves to the 1+1D case, they are of the form $$t' ~=~ A(v) t + B(v) x ~,~~~~~~ x' = C(v)t + D(v) x ~, ~~~~~~ (1)$$where A,B,C,D are unknown functions to be determined.

One uses various physically-motivated criteria to restrict the form of $A,B,C,D$. One criterion is that $v=0$ corresponds to the identity transformation $t'=t, x'=x$. Another is that $$\left. \frac{dx'}{dt'}\right|_0 ~=~ -v ~~~~ \mbox{if}~~ \left. \frac{dx}{dt}\right|_0 ~=~ 0 ~.$$ Another criterion (assumption) is that the transformations form a 1-parameter Lie (semi)group, with $v$ being the parameter. This implies (among other things) that 2 successive transformations with parameters $v,v'$ must commute, and the composition of the transformations must be equivalent to a single transformation with some parameter $v'' = v''(v,v')$, to be determined.
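
As a sanity check on these criteria (not part of the derivation itself), here is a minimal Python sketch that tests them against the standard Lorentz boost with $c=1$, which is assumed here as a known solution. The helper names `boost`, `compose` and `close` are made up for illustration, and the formula used for $v''$ is the usual relativistic velocity addition.

```python
import math

def boost(v):
    """Coefficients (A, B, C, D) of t' = A t + B x, x' = C t + D x for the
    standard Lorentz boost with c = 1 (assumed here as a known solution)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return (g, -g * v, -g * v, g)

def compose(m2, m1):
    """Apply m1 first, then m2 (2x2 matrix product of the coefficients)."""
    a1, b1, c1, d1 = m1
    a2, b2, c2, d2 = m2
    return (a2 * a1 + b2 * c1, a2 * b1 + b2 * d1,
            c2 * a1 + d2 * c1, c2 * b1 + d2 * d1)

def close(m, n):
    """Compare two coefficient tuples up to floating-point rounding."""
    return all(math.isclose(p, q, abs_tol=1e-12) for p, q in zip(m, n))

v1, v2 = 0.6, 5.0 / 13.0
v_combined = (v1 + v2) / (1.0 + v1 * v2)   # candidate for v''(v, v')

# v = 0 gives the identity transformation.
identity_ok = close(boost(0.0), (1.0, 0.0, 0.0, 1.0))
# Two successive transformations commute ...
commute_ok = close(compose(boost(v1), boost(v2)), compose(boost(v2), boost(v1)))
# ... and are equivalent to a single transformation with parameter v''.
single_ok = close(compose(boost(v2), boost(v1)), boost(v_combined))
# A point at rest in the unprimed frame (dx/dt = 0) has dx'/dt' = C/A = -v.
A, B, C, D = boost(v1)
velocity_ok = math.isclose(C / A, -v1)

print(identity_ok, commute_ok, single_ok, velocity_ok)
```

All four flags should come out `True`; the point is only that the criteria are mutually consistent, not that they single out this solution.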

The spatial isotropy assumption plays a role as follows. In 1+1D, it means the equations of the theory must be invariant under a reversal of all spatial vectors, i.e. under $x \to -x$, $x' \to -x'$, $v \to -v$. Performing this reversal on (1), we get $$t' ~=~ A(-v) t - B(-v) x ~,~~~~~~ -x' = C(-v)t - D(-v) x ~. ~~~~~~ (2)$$ The equations (2) must be equivalent to (1). So, after a little algebra, we find the constraints: $$A(v) = A(-v) ~,~~~~ -B(-v) = B(v) ~,~~~~ -C(-v) = C(v) ~,~~~~ D(v) = D(-v) ~.$$

Further, since $v=0$ must correspond to the identity transformation, we can (without loss of generality) substitute $B(v) = v E(v)$ and $C(v) = v F(v)$, where $E(v), F(v)$ are 2 new unknown functions.

The benefit of the above is that we now have transformation equations where the unknown functions $A,D,E,F$ are all even in $v$. This fact can be used in subsequent steps of the derivation (but forgive me if I don't reproduce the entire thing here).
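
To make the parity bookkeeping concrete, here is a small sketch that checks it on one known instance of form (1), the standard Lorentz boost with $c=1$. That choice is an assumption made purely for illustration, and `coefficients` is a hypothetical helper name. The expectation is that $A$ and $D$ are even, $B$ and $C$ are odd, and hence $E = B/v$ and $F = C/v$ are even.

```python
import math

def coefficients(v):
    """(A, B, C, D) for the standard Lorentz boost with c = 1, used here only
    as one known instance of the general linear form (1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g, -g * v, -g * v, g

v = 0.6
A, B, C, D = coefficients(v)
Am, Bm, Cm, Dm = coefficients(-v)   # same functions evaluated at -v

even_A_D = math.isclose(A, Am) and math.isclose(D, Dm)
odd_B_C = math.isclose(B, -Bm) and math.isclose(C, -Cm)
# E(v) = B(v)/v and F(v) = C(v)/v inherit evenness from the odd B and C.
even_E_F = math.isclose(B / v, Bm / (-v)) and math.isclose(C / v, Cm / (-v))

print(even_A_D, odd_B_C, even_E_F)
```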

Also I wonder, how can this be used to prove that, for example, distances in directions perpendicular to the direction of motion between the frames are not changed by the transformation? (In this case, only rotations fixing the direction of motion should be used above.)
Here, you're talking about the more general 1+3D case, and what you say cannot be proven. Instead, one assumes that a rotation of the transformed axes around the boost direction has been performed (if necessary) to ensure that they're aligned with the original axes. (Strictly speaking, one might also have to perform a parity reversal as well.)

And how is it used to prove that $v_{12}=-v_{21}$?
Well, that requires a fair bit more work, applying the assumptions I outlined above to derive further constraints on the unknown functions. I'm happy to work through that detail if you wish -- provided we treat it like a homework exercise: i.e., you must do at least as much of the work as I do, and show it here on PF.


#### PeterDonis

Mentor
what do you mean by "commutation relations"?
The best way to answer that is to give an example; SO(3) will do. This is, as I said, a 3-parameter group of transformations, which means that every rotation in the group can be expressed as a linear combination of three "basis" rotations, which are called "generators". If we call the three generators $J^1$, $J^2$, $J^3$, then these three obey the following commutation relations ($[A, B]$ is the commutator of the objects $A$ and $B$, i.e., $[A, B] = AB - BA$):

$$[J^1, J^2] = i J^3$$

$$[J^2, J^3] = i J^1$$

$$[J^3, J^1] = i J^2$$

These three relations can be expressed more compactly as

$$[J^i, J^j] = i \epsilon^{ijk} J^k$$

where $\epsilon^{ijk}$ is the completely antisymmetric symbol in 3 dimensions, i.e., $\epsilon^{123} = 1$, and even permutations of the indexes have the same sign while odd ones have the opposite sign.
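
For anyone who wants to see these relations hold for actual matrices, here is a minimal sketch. It assumes one particular matrix representation of the generators, $(J^k)_{ab} = -i\,\epsilon^{kab}$ (a common convention, but a choice made here, not something fixed by the relations themselves), with indices running over 0, 1, 2 instead of 1, 2, 3.

```python
def eps(i, j, k):
    """Completely antisymmetric symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def generator(k):
    """3x3 matrix with entries (J^k)_{ab} = -i * eps(k, a, b) (assumed convention)."""
    return [[-1j * eps(k, a, b) for b in range(3)] for a in range(3)]

def matmul(p, q):
    """Plain 3x3 matrix product."""
    return [[sum(p[a][c] * q[c][b] for c in range(3)) for b in range(3)]
            for a in range(3)]

def commutator(p, q):
    """[p, q] = pq - qp."""
    pq, qp = matmul(p, q), matmul(q, p)
    return [[pq[a][b] - qp[a][b] for b in range(3)] for a in range(3)]

J = [generator(k) for k in range(3)]

def relation_holds(i, j):
    """Check [J^i, J^j] = i * sum_k eps(i, j, k) J^k entry by entry."""
    lhs = commutator(J[i], J[j])
    rhs = [[sum(1j * eps(i, j, k) * J[k][a][b] for k in range(3))
            for b in range(3)] for a in range(3)]
    return lhs == rhs

all_hold = all(relation_holds(i, j) for i in range(3) for j in range(3))
print(all_hold)
```

The entries are only $0$, $\pm 1$ and $\pm i$, so the comparison is exact and no rounding tolerance is needed.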

I find it in no way obvious that just because the Minkowski lengths of x and y are preserved by L, the same is true for all linear combinations of x and y.
A linear combination of the intervals x and y is another interval, and which interval it is is independent of the choice of coordinates. (This is just the Minkowski spacetime analogue of vector addition in ordinary Euclidean space.) So the Minkowski length of the linear combination can only depend on the lengths of x and y, and the coefficients of the linear combination. But the lengths of x and y, by hypothesis, are not changed by L, and the coefficients of the linear combination define which linear combination it is, so they can't be changed by L. Therefore, the Minkowski length of the linear combination can't be changed by L.

#### Erland

A linear combination of the intervals x and y is another interval, and which interval it is is independent of the choice of coordinates. (This is just the Minkowski spacetime analogue of vector addition in ordinary Euclidean space.) So the Minkowski length of the linear combination can only depend on the lengths of x and y, and the coefficients of the linear combination.
This is wrong, both in Euclidean and Minkowski space. As a counterexample in 2d Minkowski space, with $c=1$, take $x=(3,0)$ and $y=(5,4)$. Their Minkowski lengths are $l(x)=3^2-0^2=9$ and $l(y)=5^2-4^2=9$. If what you wrote is true, we would have $l(x+x)=l(x+y)$, since $x+x$ is a linear combination of $x$ and $x$ with the same coefficients as in the linear combination $x+y$ of $x$ and $y$, and $x$ and $y$ have the same Minkowski lengths. But $l(x+x)=6^2-0^2=36$ and $l(x+y)=8^2-4^2=48$.
To find a counterexample in Euclidean space is even more trivial.
You might object that we must only take linear combinations of mutually orthogonal basis vectors. But you wrote in the earlier post that we should take a basis of lightlike vectors in Minkowski space. However, in 2d Minkowski space, and I am quite sure in 3d and 4d also, we cannot find two lightlike, linearly independent vectors which are orthogonal to each other. Two lightlike, linearly independent vectors in 2d must be two nonzero vectors $(u,u)$ and $(v,-v)$. But they are not orthogonal to each other in the Minkowski sense, since $uv-u(-v)=2uv\neq 0$.
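
The arithmetic in the counterexample and in the orthogonality remark is easy to check mechanically. A minimal sketch, with `l`, `add` and `dot` as ad hoc helper names and sample values $u=3$, $v=7$ chosen arbitrarily:

```python
def l(v):
    """Minkowski length t^2 - x^2 of v = (t, x), with c = 1."""
    t, x = v
    return t * t - x * x

def add(v, w):
    """Componentwise vector sum."""
    return (v[0] + w[0], v[1] + w[1])

def dot(v, w):
    """Minkowski inner product in 1+1 dimensions."""
    return v[0] * w[0] - v[1] * w[1]

x, y = (3, 0), (5, 4)
print(l(x), l(y))                   # the two lengths agree
print(l(add(x, x)), l(add(x, y)))   # the lengths of the sums do not

# Two independent null vectors (u, u) and (v, -v) have inner product 2uv.
u, v = 3, 7
null_dot = dot((u, u), (v, -v))
print(null_dot == 2 * u * v)
```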

#### PeterDonis

Mentor
take $x=(3,0)$ and $y=(5,4)$.
These are not null vectors. Your original hypothesis was that invariance of the interval only applied to null intervals, so $x$ and $y$ should be null intervals.

If what you wrote is true, we would have $l(x+x)=l(x+y)$
Of course this is trivially true for any pair of null intervals $x$ and $y$, since their lengths are always zero. So what my statement really amounts to is saying that, for the case under discussion ($x$ and $y$ null intervals), the length of the linear combination depends only on the coefficients of the linear combination.

$x+x$ is a linear combination of $x$ and $x$ with the same coefficients as in the linear combination $x+y$ of $x$ and $y$
Um, what? That doesn't even make sense. The coefficients of a linear combination of $x$ and $y$ are the numbers multiplying $x$ and $y$ in the linear combination, in order; i.e., for the linear combination $ax + by$, the coefficients are $a, b$. So $x + x$ has coefficients $2, 0$, while $x + y$ has coefficients $1, 1$. So these are two different linear combinations.

in 2d Minkowski space, and I am quite sure in 3d and 4d also, we cannot find two lightlike, linearly independent vectors which are orthogonal to each other.
Basis vectors don't have to be orthogonal, they just have to be linearly independent. Finding a basis of four null vectors just means finding four null vectors that are linearly independent. (You are correct that we can't find two lightlike, linearly independent null vectors that are orthogonal; two null vectors can only be orthogonal if they are collinear.) The basis vectors in the case you're used to, an inertial frame, are all orthogonal, but that does not mean basis vectors always have to be orthogonal; they just happen to be in that particular case.
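
To illustrate that such a basis exists in 4D, here is a sketch with one possible choice of four null vectors, written as $(t, x, y, z)$ with $c=1$. The particular vectors are an example picked here, not the only option. Linear independence is checked through a nonzero determinant.

```python
def interval(v):
    """Squared Minkowski interval t^2 - x^2 - y^2 - z^2 (c = 1)."""
    t, x, y, z = v
    return t * t - x * x - y * y - z * z

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

# One possible (hypothetical) choice of four null vectors.
null_basis = [
    [1, 1, 0, 0],
    [1, -1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
]

all_null = all(interval(v) == 0 for v in null_basis)
independent = det(null_basis) != 0
print(all_null, independent)
```

Note that none of these vectors are mutually orthogonal, which is fine for a basis.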

#### Erland

Of course this is trivially true for any pair of null intervals $x$ and $y$, since their lengths are always zero. So what my statement really amounts to is saying that, for the case under discussion ($x$ and $y$ null intervals), the length of the linear combination depends only on the coefficients of the linear combination.
$l(x)$ here means the Minkowski length of $x$. Let $x=(1,1)$ and $y=(1,-1)$. These are both null vectors. Yet, $l(x+x)=l(2,2)=0$ while $l(x+y)=l(2,0)=4\neq 0$.
Um, what? That doesn't even make sense. The coefficients of a linear combination of $x$ and $y$ are the numbers multiplying $x$ and $y$ in the linear combination, in order; i.e., for the linear combination $ax + by$, the coefficients are $a, b$. So $x + x$ has coefficients $2, 0$, while $x + y$ has coefficients $1, 1$. So these are two different linear combinations.
Linear combinations are defined for all finite sequences of vectors. It doesn't matter if some of them happen to be equal or not. So $x+x$ can be considered as a linear combination of the pair $(x,x)$, with coefficients $(1,1)$ (another possibility is $(0,2)$, and there are infinitely many more, since the pair $(x,x)$ is linearly dependent). Likewise $x+y$ can be considered as a linear combination of $(x,y)$ with coefficients $(1,1)$ (and infinitely many more possibilities, if $(x,y)$ is linearly dependent). So, the coefficients are the same in both cases.
Basis vectors don't have to be orthogonal, they just have to be linearly independent. Finding a basis of four null vectors just means finding four null vectors that are linearly independent. (You are correct that we can't find two lightlike, linearly independent null vectors that are orthogonal; two null vectors can only be orthogonal if they are collinear.) The basis vectors in the case you're used to, an inertial frame, are all orthogonal, but that does not mean basis vectors always have to be orthogonal; they just happen to be in that particular case.
It is not clear to me what you actually claim. Do you claim that every linear transformation $T$ which maps (Minkowski) null vectors to null vectors preserves Minkowski length? If so, you are wrong.
For example, there is an invertible linear transformation $T:\Bbb R^2 \to \Bbb R^2$ such that $T(1,1)=(2,2)$ and $T(1,-1)=(1,-1)$. We see easily that it maps null vectors to null vectors: $T(x,x)=(2x,2x)$ and $T(x,-x)=(x,-x)$, and all null vectors are of these types.
Now $(2,0)=(1,1)+(1,-1)$ and $T(2,0)=T(1,1)+T(1,-1)=(2,2)+(1,-1)=(3,1)$. But $l(2,0)=4$ and $l(T(2,0))=l(3,1)=8$. Thus, $T$ does not preserve Minkowski lengths.

So, not all linear transformations which map null vectors to null vectors preserve Minkowski lengths. The question is then whether you are talking not about all linear transformations, but about some particular class of transformations. If so, which class and why? We cannot a priori choose the Lorentz transformations, for that would be circular.
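
The counterexample can be verified directly. In this sketch, `T` is built by splitting a vector into its components along $(1,1)$ and $(1,-1)$ and then applying the two defining conditions. That construction is one way of writing the map, chosen here for readability.

```python
def l(v):
    """Minkowski length t^2 - x^2 of v = (t, x), with c = 1."""
    t, x = v
    return t * t - x * x

def T(v):
    """Linear map fixed by T(1, 1) = (2, 2) and T(1, -1) = (1, -1)."""
    t, x = v
    p = (t + x) / 2   # coefficient of (1, 1)
    q = (t - x) / 2   # coefficient of (1, -1)
    return (2 * p + q, 2 * p - q)

null_samples = [(1, 1), (1, -1), (-3, -3), (5, -5)]
null_preserved = all(l(T(v)) == 0 for v in null_samples)

before = l((2, 0))
after = l(T((2, 0)))
print(null_preserved, before, after)
```

The null samples stay null, while the length of $(2,0)$ changes from 4 to 8.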

#### PeterDonis

Mentor
Linear combinations are defined for all finite sequences of vectors. It doesn't matter if some of them happen to be equal or not.
You're missing my point. $x + x$ and $x + y$ are both linear combinations, sure. But they are different linear combinations. (If that fact is not obvious to you--and apparently it's not--then I strongly suggest a review of basic vector algebra, because it certainly should be obvious.) So we should expect them to result in vectors with different Minkowski lengths.

#### Erland

Peter, we must have misunderstood each other in some fundamental way, but I don't know in which way...

Let me recapitulate what you wrote in an earlier post:
A linear combination of the intervals x and y is another interval, and which interval it is is independent of the choice of coordinates. (This is just the Minkowski spacetime analogue of vector addition in ordinary Euclidean space.) So the Minkowski length of the linear combination can only depend on the lengths of x and y, and the coefficients of the linear combination. But the lengths of x and y, by hypothesis, are not changed by L, and the coefficients of the linear combination define which linear combination it is, so they can't be changed by L. Therefore, the Minkowski length of the linear combination can't be changed by L.
So you claim that "the Minkowski length of the linear combination can only depend on the lengths of x and y, and the coefficients of the linear combination" (my emphasis).

In other words, you mean that if we have a linear combination ax+by, then its Minkowski length, l(ax+by), depends only on l(x), l(y), a, and b. So, if we have two other vectors u and v, with the same lengths as x and y, respectively, that is, l(u)=l(x) and l(v)=l(y), and take a linear combination of u and v with the same coefficients as before: au+bv, then l(au+bv)=l(ax+by).
This is simply wrong, even if we restrict ourselves to the case when l(x)=l(y)=l(u)=l(v)=0. (Counterexample: x=(1,1), y=(1, -1), u=(2,2), v=(1,-1), a=b=1.)
If you don't understand this, it's you, not me, who needs to review linear algebra.

But if the above is a misinterpretation of what you meant, please let us know what you really meant!

#### PeterDonis

Mentor
In other words, you mean that if we have a linear combination ax+by, then its Minkowski length, l(ax+by), depends only on l(x), l(y), a, and b.
Yes, but I see that the word "depends" hides an ambiguity, so that the statement as I gave it can be taken in a stronger sense than I intended (and the stronger sense is false, as you say). Let me try to restate.

Suppose we pick two null vectors $x$ and $y$ as a basis for 2-D Minkowski spacetime. For concreteness, let's use $x = (1, 1)$ and $y = (1, -1)$ in ordinary inertial coordinates. Then any other vector $v$ can be expressed as a linear combination $v = ax + by$. The length of $v$ is then given by some function $f$ of $l(x)$, $l(y)$, $a$, and $b$; i.e., $l(v) = f(l(x), l(y), a, b)$; in our case, since $l(x) = l(y) = 0$, we can simplify this to $l(v) = f(a, b)$. If we then Lorentz transform, the basis vectors in the new inertial frame will have different numerical components; for example, if the Doppler factor of the transformation is 2, then in the new frame we will have $x = (2, 2)$ and $y = (0.5, -0.5)$. But the vector $v$ will still be given by $ax + by$, and the length of $v$ will still be given by $l(v) = f(a, b)$.

Now suppose we change our minds and decide to use the null vector $u = (2, 2)$ instead of $x$ as the first of our two basis vectors. Then any vector $v$ can be expressed as a linear combination $v = cu + dy$. The length of $v$ is then given by some function $l(v) = g(c, d)$, where $g$ is a different function from $f$ above. But this different formula for the length of $v$ will still be preserved by a Lorentz transformation.
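
For concreteness, the functions $f$ and $g$ can be worked out explicitly in this 2-D example: $l(ax+by) = (a+b)^2-(a-b)^2 = 4ab$ and $l(cu+dy) = (2c+d)^2-(2c-d)^2 = 8cd$. Below is a minimal sketch checking both, and checking that $f$ is unchanged by the boost with Doppler factor 2. The sample coefficients $a=3$, $b=5$ and the helper names are arbitrary choices for illustration.

```python
from fractions import Fraction

def l(v):
    """Minkowski length t^2 - x^2 of v = (t, x), with c = 1."""
    t, x = v
    return t * t - x * x

def combo(a, p, b, q):
    """Return the linear combination a*p + b*q of 2-component vectors."""
    return (a * p[0] + b * q[0], a * p[1] + b * q[1])

def doppler_boost(v, k):
    """Lorentz boost with Doppler factor k: scales the (1, 1) direction
    by k and the (1, -1) direction by 1/k."""
    t, x = v
    p, q = Fraction(t + x) / 2, Fraction(t - x) / 2   # null components
    return (k * p + q / k, k * p - q / k)

x, y = (1, 1), (1, -1)
a, b = 3, 5                      # arbitrary sample coefficients
v = combo(a, x, b, y)
f_ab = 4 * a * b                 # f(a, b) = 4ab

k = Fraction(2)
xb, yb = doppler_boost(x, k), doppler_boost(y, k)   # (2, 2) and (1/2, -1/2)
vb = combo(a, xb, b, yb)         # same coefficients, boosted basis

u = (2, 2)
c, d = Fraction(a, 2), b         # same vector v expressed as c*u + d*y
g_cd = 8 * c * d                 # g(c, d) = 8cd
same_vector = combo(c, u, d, y) == v

print(l(v), f_ab, l(vb), g_cd, same_vector)
```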

#### Fredrik

Staff Emeritus
Gold Member
The point of a "derivation" of the Lorentz transformation is to show how, in principle, one could have used simpler ideas to guess that the Lorentz transformation would be a useful ingredient in a theory of physics. It shows that a version of SR can be found by someone who doesn't already know what the theory looks like.

If you take Minkowski spacetime as a starting point, then you're doing something very different. You are showing that Minkowski spacetime contains all the mathematics we need to state such a theory in a nice way.

Both of these things are interesting to me. The latter is an important thing to study if you want a thorough understanding of the mathematics of SR. The former is an important thing to study if you want to understand what SR has in common with, and how it differs from, pre-relativistic classical mechanics.

Minkowski spacetime can be defined as a smooth manifold, an affine space, or a vector space. These options give us equivalent theories of physics. The vector space approach is ugly in the sense that it makes one event in spacetime (the 0 vector) mathematically special, even though it's not physically special. (The theory predicts that an experiment carried out there has the same results as if it's carried out somewhere else). But the vector space approach has the advantage that it makes the mathematics much simpler.

So let's define (the vector space version of) Minkowski spacetime as the pair $(M,g)$, where $M$ is $\mathbb R^4$ with the usual vector space structure, and $g$ is the map from $M\times M$ into $\mathbb R$ defined by $g(x,y)=x^T\eta y$ for all $x,y$ in $M$. Now the claim that "space is isotropic" can be interpreted in the following way: The group of all vector space automorphisms of $M$ that preserve $g$ (i.e. the Lorentz group) has a subgroup that consists of all the linear operators on $M$ that can be written in the form
$$\begin{pmatrix}1 & 0 & 0 & 0\\ 0 & & & \\ 0 & & R &\\ 0 & & &\end{pmatrix},$$ where $R\in\operatorname{SO}(3)$.

Of course, we can only do this if we have already defined Minkowski spacetime, so it can't be part of one of those "derivations" of the Lorentz transformation from simpler ideas. To incorporate isotropy in such a derivation, I would do the following. We're trying to find a theory of physics in which spacetime is a mathematical structure with underlying set $M=\mathbb R^4$. We want this theory to involve global inertial coordinate systems, i.e. maps $x:M\to\mathbb R^4$ that correspond to inertial (i.e. non-accelerating) observers. We want these global inertial coordinate systems to be such that if x and y are global inertial coordinate systems, then the map $x\circ y^{-1}$ is a bijection from M to M that takes straight lines to straight lines (because the motion of an inertial observer should always be a straight line in the coordinate system used by another inertial observer). It can be shown that this implies that these maps are linear. The proof is quite long. We also want this set to be a group. (This is easy to justify by physical principles). Now we can interpret the requirement of isotropy as the choice to only look for groups that have a subgroup that consists of those transformations that can be written in the form above.

It's very difficult to complete a derivation of this type. I have only seen one attempt to really carry this out, and I didn't fully understand it. The result should be that the group is either the Poincaré group or the Galilean group.

#### Erland

Suppose we pick two null vectors $x$ and $y$ as a basis for 2-D Minkowski spacetime. For concreteness, let's use $x = (1, 1)$ and $y = (1, -1)$ in ordinary inertial coordinates. Then any other vector $v$ can be expressed as a linear combination $v = ax + by$. The length of $v$ is then given by some function $f$ of $l(x)$, $l(y)$, $a$, and $b$; i.e., $l(v) = f(l(x), l(y), a, b)$; in our case, since $l(x) = l(y) = 0$, we can simplify this to $l(v) = f(a, b)$.
Ok, but then it is not meaningful to write $f(l(x),l(y),a,b)$, since your $f$ is a function of $x,y,a,b$, not of $l(x),l(y),a,b$, for such a function should not change its value if we change $x$ and $y$ without changing $l(x)$ and $l(y)$ (and $a$ and $b$).

But never mind. I think that much of the confusion arises from the fact that we can view an invertible linear transformation $(t,x) \mapsto (t',x')$ (in 2D) in two ways: either as a mapping of a vector to another vector, or as changing the coordinates of a vector to coordinates of the same vector in another basis.

Consider a 2D vector space $V$ and a basis $B=(\mathbf e,\mathbf f)$ of $V$. Then every vector $\mathbf v\in V$ can be uniquely written in the form $\mathbf v=t\mathbf e+x\mathbf f$.
In the first case, we have an invertible linear transformation $T: V\to V$ such that $T(\mathbf v)=T(t \mathbf e + x \mathbf f)=t'\mathbf e + x'\mathbf f$, i.e. $(t',x')$ are the coordinates of the mapped vector $T(\mathbf v)$ in the same basis as before.
In the second case, we have another basis $B'=(\mathbf e',\mathbf f')$ of $V$ such that $\mathbf v=t'\mathbf e'+x'\mathbf f'$, i.e. $(t',x')$ are the coordinates of the same vector $\mathbf v$ as before, in the new basis $B'=(\mathbf e',\mathbf f')$.

Now, consider the scalar function $l: V\to \Bbb R$, defined for vectors expressed in the basis $B$ by $l(t\mathbf e +x \mathbf f)=t^2-x^2$. Our problem can then be formulated in two equivalent ways, one for each of the two viewpoints.

1. If $l(T(\mathbf v))=0$ holds for all $\mathbf v\in V$ such that $l(\mathbf v)=0$, must then $l(T(\mathbf v))=l(\mathbf v)$ hold for all $\mathbf v\in V$?

2. Let $l': V\to \Bbb R$ be defined for vectors expressed in the basis $B'$ by $l'(t'\mathbf e'+x'\mathbf f')=(t')^2-(x')^2$.
If then $l'(\mathbf v)=0$ holds for all $\mathbf v\in V$ such that $l(\mathbf v)=0$, must then $l'(\mathbf v)=l(\mathbf v)$ hold for all $\mathbf v\in V$? (that is: is the Minkowski length given by the same formula in both bases?).

We know of course that the answers to these questions are "yes" if $T$ is a Lorentz transformation (in case 1) or if the coordinate transformation is given by Lorentz's formulas (in case 2). But the answers are easily seen to be "no" in general. It is easy to find examples of linear transformations / changes of bases for which the answers are "no". I gave some examples in earlier posts. Surely, you must agree about that, Peter?

And since the answers are "no" in general, the question is what extra conditions we must impose on the transformations / changes of bases to make the answers "yes"...


#### Erland

I'm happy to work through that detail if you wish -- provided we treat it like a homework exercise: i.e., you must do at least as much of the work as I do, and show it here on PF.
Well, I'm working on it too, despite lack of time. We'll see how it proceeds. Good post, anyway!

#### Erland

Fredrik, I am thinking along similar lines as you.

#### PeterDonis

Mentor
such a function should not change its value if we change $x$ and $y$ without changing $l(x)$ and $l(y)$ (and $a$ and $b$).
Why not? Changing the basis vectors changes the function--at least, that's how I was viewing it. Bear in mind that I'm not making any particular physical or mathematical claim here; I'm simply trying to clarify what I intended to say when I responded to your original question about whether establishing that some group of transformations maps null vectors to null vectors is sufficient to establish that it preserves the lengths of all vectors.

I also take Fredrik's point, however, that if we take Minkowski spacetime as a starting point, asking for a "derivation" of the Lorentz transformations is moot--the symmetry group of Minkowski spacetime is what it is. And if we are talking about any transformation mapping null vectors to null vectors, we must already know what a null vector is, which means we are already assuming something that probably amounts to assuming Minkowski spacetime.

I think that much of the confusion arises from the fact that we can view an invertible linear transformation $(t,x) \mapsto (t',x')$ (in 2D) in two ways: either as a mapping of a vector to another vector, or as changing the coordinates of a vector to coordinates of the same vector in another basis.
I agree that it's important to be careful about distinguishing these two views. I'll have to ponder some more to see if the intuition I was groping towards can be formulated in a way that addresses all of these concerns.

#### facenian

Very good, but this then leads to the question: Which are the equations in the theory to which this applies? This should be specified in a stringent exposition.
The equations to which it applies are those which represent physical laws. I think this is very important because this restricts the possible laws of physics.
And how is it used to prove that $v_{12}=-v_{21}$?
Regarding this point I would ask: shouldn't it be $v_{12}=v_{21}$? I think this is an interesting question but more appropriate to be discussed in front of a blackboard.

#### Erland

Regarding this point I would ask: shouldn't it be $v_{12}=v_{21}$? I think this is an interesting question but more appropriate to be discussed in front of a blackboard.
If two frames have their x-axes aligned and pointing in the same direction, and move in the x-direction, then they have opposite relative velocities: $v_{12}=-v_{21}$. If I see you go east, you see me go west.

#### facenian

If two frames have their x-axes aligned and pointing in the same direction, and move in the x-direction, then they have opposite relative velocities: $v_{12}=-v_{21}$. If I see you go east, you see me go west.
That's right, I withdraw my objection regarding the sign, but the original question was why symmetry or isotropy regarding both observers requires $v_{12}=-v_{21}$, and I think it would be a lot easier to explain in front of a blackboard. Anyway, I think at this point you already got it.

#### PeroK

Homework Helper
Gold Member
2018 Award
If two frames have their x-axes aligned and pointing in the same direction, and move in the x-direction, then they have opposite relative velocities: $v_{12}=-v_{21}$. If I see you go east, you see me go west.
This highlights my earlier point. Unless you assume symmetry how do you know two observers can align their x axes?

#### Erland

It turns out that PeterDonis wasn't so wrong after all. Although it is false that all linear transformations $T:\Bbb R^2\to \Bbb R^2$ which map (Minkowski) null vectors (w.r.t. the standard basis) to null vectors must preserve Minkowski length, what is true is that the Minkowski length is multiplied by a constant factor by such a transformation.
More precisely: If $l(T(\mathbf v))=0$ for all $\mathbf v\in \Bbb R^2$ such that $l(\mathbf v)=0$, then there is a constant $k\in\Bbb R$ such that $l(T(\mathbf v))=k\,l(\mathbf v)$ for all $\mathbf v \in \Bbb R^2$.

To see this, let such a transformation $T$ be given by $T(\mathbf v)=T(t,x)=(at+bx,ct+dx)$ (so $c$ is not light speed here, the latter is $1$).
Then: $l(T(\mathbf v))=(at+bx)^2-(ct+dx)^2=(a^2-c^2)t^2-(d^2-b^2)x^2+2(ab-cd)tx.$
$(1,1)$ and $(1,-1)$ are null vectors and they are mapped to null vectors. Thus:
$(a^2 -c^2) - (d^2-b^2) +2(ab-cd)=0$ and
$(a^2-c^2)-(d^2-b^2)-2(ab-cd)=0$.
Adding and subtracting these equations, we obtain $a^2-c^2=d^2-b^2$ and $ab-cd=0$. So, putting $k=a^2-c^2$ we obtain $l(T(\mathbf v))=k(t^2-x^2)=k\,l(\mathbf v)$. This holds for all $\mathbf v=(t,x)\in \Bbb R^2$.
(I suspect that there is a general property of quadratic forms lurking in the background, but I cannot figure out what it is.)

If we also assume that $T$ is invertible, then the "lines" given by $(t,x)=u(1,1)$ and $(t,x)=u(1,-1)$ must be mapped onto each other (in some combination). From this, we can show that $c=\pm b$ and $d=\pm a$ (same sign at both places).
Now, if we interpret $T$ as a coordinate transformation between inertial frames, then $v=-c/d=-b/a$ is the velocity of Frame 2 relative to Frame 1 (so $a\neq 0$). Inverting the transformation and interpreting this in the corresponding way, we obtain that the velocity of Frame 1 relative to Frame 2 is $-v$. In other words: $v_{12}=-v_{21}$!

It remains to figure out:
1. Which extra conditions do we need to ensure that $k=1$?
2. Do we need any extra conditions (e.g. spatial isotropy) to generalize this to 4D-space?
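
The argument above can be spot-checked numerically. This is a minimal sketch with hypothetical sample coefficients $a=3$, $b=-1$, taking the sign choice $c=b$, $d=a$; exact rational arithmetic is used so that equalities can be tested directly. It checks $l(T(\mathbf v)) = k\,l(\mathbf v)$ on a few vectors and compares the frame velocities of $T$ and $T^{-1}$.

```python
from fractions import Fraction

def l(v):
    """Minkowski length t^2 - x^2 of v = (t, x), with light speed 1."""
    t, x = v
    return t * t - x * x

def apply(m, v):
    """Apply the matrix m = ((a, b), (c, d)) to v = (t, x)."""
    (a, b), (c, d) = m
    t, x = v
    return (a * t + b * x, c * t + d * x)

def inverse(m):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def frame_velocity(m):
    """Velocity -c/d of the line x' = 0, i.e. of the primed frame's origin."""
    (_, _), (c, d) = m
    return -c / d

# Hypothetical sample coefficients with c = b and d = a (so ab = cd).
a, b = Fraction(3), Fraction(-1)
m = ((a, b), (b, a))
k = a * a - b * b   # k = a^2 - c^2

samples = [(1, 0), (0, 1), (2, 5), (-4, 3), (7, 7)]
scaled = all(l(apply(m, v)) == k * l(v) for v in samples)

v12 = frame_velocity(m)            # Frame 2 relative to Frame 1
v21 = frame_velocity(inverse(m))   # Frame 1 relative to Frame 2
print(k, scaled, v12, v21)
```

Here $k=8$, so this particular $T$ is not a Lorentz transformation, yet the velocities still come out opposite.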

#### facenian

This highlights my earlier point. Unless you assume symmetry how do you know two observers can align their x axes?
Doesn't this have to do with the supposed Euclidean geometry?

#### facenian

It remains to figure out:
1. Which extra conditions do we need to ensure that $k=1$?
Since $\mathbf{T}^{-1}$ shares the same property, we have $l(\mathbf{v})=l(\mathbf{T}^{-1}(\mathbf{T}(\mathbf{v})))=k\,l(\mathbf{T}(\mathbf{v}))=k^2\,l(\mathbf{v})$, whence $k^2=1$.
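
A small numerical sketch of the scale factors involved, under one reading of the argument: the step $k^2=1$ uses that $\mathbf{T}^{-1}$ scales $l$ by the same factor $k$ as $\mathbf{T}$, which is taken here as an extra assumption (for instance a reciprocity condition between the two frames). For a general null-preserving map the inverse scales $l$ by $1/k$ instead. The example restricts itself to maps that rescale the two null directions by factors $s_1$ and $s_2$, for which $k = s_1 s_2$. The helper names are made up.

```python
from fractions import Fraction

def l(v):
    """Minkowski length t^2 - x^2 of v = (t, x), with light speed 1."""
    t, x = v
    return t * t - x * x

def scale_null(v, s1, s2):
    """Multiply the (1, 1) component of v by s1 and the (1, -1) component by s2."""
    t, x = v
    p, q = Fraction(t + x) / 2, Fraction(t - x) / 2
    return (s1 * p + s2 * q, s1 * p - s2 * q)

def factor(s1, s2, v=(2, 0)):
    """Ratio l(T(v)) / l(v) for a sample non-null vector v."""
    return l(scale_null(v, s1, s2)) / l(v)

s1, s2 = Fraction(2), Fraction(1)
k = factor(s1, s2)                  # the map T from the earlier counterexample
k_inv = factor(1 / s1, 1 / s2)      # its inverse
k_boost = factor(Fraction(2), Fraction(1, 2))   # s1 * s2 = 1

print(k, k_inv, k * k_inv, k_boost)
```

In this family, requiring $\mathbf{T}$ and $\mathbf{T}^{-1}$ to have the same factor forces $s_1 s_2 = \pm 1$, and the positive case is the Doppler-type rescaling.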
