Deriving Lorentz transformations

PeterDonis · Dec 24, 2015

Erland said:

Linear combinations are defined for all finite sequences of vectors. It doesn't matter if some of them happen to be equal or not.

You're missing my point. ##x + x## and ##x + y## are both linear combinations, sure. But they are different linear combinations. (If that fact is not obvious to you--and apparently it's not--then I strongly suggest a review of basic vector algebra, because it certainly should be obvious.) So we should expect them to result in vectors with different Minkowski lengths.

[Edited to delete my second comment/question, it was already answered in your previous post.]

Erland · Dec 24, 2015

Peter, we must have misunderstood each other in some fundamental way, but I don't know in which way...

Let me recapitulate what you wrote in an earlier post:

PeterDonis said:

A linear combination of the intervals x and y is another interval, and which interval it is is independent of the choice of coordinates. (This is just the Minkowski spacetime analogue of vector addition in ordinary Euclidean space.) So the Minkowski length of the linear combination can only depend on the lengths of x and y, and the coefficients of the linear combination. But the lengths of x and y, by hypothesis, are not changed by L, and the coefficients of the linear combination define which linear combination it is, so they can't be changed by L. Therefore, the Minkowski length of the linear combination can't be changed by L.

So you claim that "the Minkowski length of the linear combination can only depend on the lengths of x and y, and the coefficients of the linear combination" (my emphasis).

In other words, you mean that if we have a linear combination ax+by, then its Minkowski length, l(ax+by), depends only on l(x), l(y), a, and b. So, if we have two other vectors u and v, with the same lengths as x and y, respectively, that is, l(u)=l(x) and l(v)=l(y), and take a linear combination of u and v with the same coefficients as before: au+bv, then l(au+bv)=l(ax+by).
This is simply wrong, even if we restrict ourselves to the case when l(x)=l(y)=l(u)=l(v)=0. (Counterexample: x=(1,1), y=(1, -1), u=(2,2), v=(1,-1), a=b=1.)
If you don't understand this, it's you, not me, who needs to review linear algebra.

But if the above is a misinterpretation of what you meant, please let us know what you really meant!
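
The counterexample above is quick to check numerically. This is a minimal sketch, assuming the 2D convention l(t, x) = t² − x² used later in the thread; the helper names `minkowski_length` and `combine` are made up for illustration.

```python
def minkowski_length(v):
    """Squared Minkowski 'length' l(t, x) = t^2 - x^2 of a 2D vector."""
    t, x = v
    return t * t - x * x

def combine(a, p, b, q):
    """Return the linear combination a*p + b*q of the 2D vectors p and q."""
    return (a * p[0] + b * q[0], a * p[1] + b * q[1])

x, y = (1, 1), (1, -1)
u, v = (2, 2), (1, -1)
a = b = 1

# All four vectors are null, so l(u) = l(x) and l(v) = l(y).
assert all(minkowski_length(w) == 0 for w in (x, y, u, v))

len_xy = minkowski_length(combine(a, x, b, y))  # x + y = (2, 0)
len_uv = minkowski_length(combine(a, u, b, v))  # u + v = (3, 1)
print(len_xy, len_uv)  # 4 8
```

Same lengths and same coefficients, yet the two combinations have different lengths, which is the point of the counterexample.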

PeterDonis · Dec 25, 2015

Erland said:

In other words, you mean that if we have a linear combination ax+by, then its Minkowski length, l(ax+by), depends only on l(x), l(y), a, and b.

Yes, but I see that the word "depends" hides an ambiguity, so that the statement as I gave it can be taken in a stronger sense than I intended (and the stronger sense is false, as you say). Let me try to restate.

Suppose we pick two null vectors ##x## and ##y## as a basis for 2-D Minkowski spacetime. For concreteness, let's use ##x = (1, 1)## and ##y = (1, -1)## in ordinary inertial coordinates. Then any other vector ##v## can be expressed as a linear combination ##v = ax + by##. The length of ##v## is then given by some function ##f## of ##l(x)##, ##l(y)##, ##a##, and ##b##; i.e., ##l(v) = f(l(x), l(y), a, b)##; in our case, since ##l(x) = l(y) = 0##, we can simplify this to ##l(v) = f(a, b)##. If we then Lorentz transform, the basis vectors in the new inertial frame will have different numerical components; for example, if the Doppler factor of the transformation is 2, then in the new frame we will have ##x = (2, 2)## and ##y = (0.5, -0.5)##. But the vector ##v## will still be given by ##ax + by##, and the length of ##v## will still be given by ##l(v) = f(a, b)##.

Now suppose we change our minds and decide to use the null vector ##u = (2, 2)## instead of ##x## as the first of our two basis vectors. Then any vector ##v## can be expressed as a linear combination ##v = cu + dy##. The length of ##v## is then given by some function ##l(v) = g(c, d)##, where ##g## is a different function from ##f## above. But this different formula for the length of ##v## will still be preserved by a Lorentz transformation.
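
To make the two functions concrete, here is a small sketch under the convention l(t, x) = t² − x². Working the algebra by hand gives f(a, b) = 4ab for the basis x = (1, 1), y = (1, −1), and g(c, d) = 8cd for the basis u = (2, 2), y. These closed forms are not stated in the post; they are an assumption of this example that the code then checks. The boost with Doppler factor 2 is modelled as scaling the (1, 1) direction by 2 and the (1, −1) direction by 1/2, and all helper names are illustrative.

```python
from fractions import Fraction

def l(v):
    """l(t, x) = t^2 - x^2."""
    t, x = v
    return t * t - x * x

def comb(a, p, b, q):
    """Linear combination a*p + b*q of 2D vectors."""
    return (a * p[0] + b * q[0], a * p[1] + b * q[1])

def boost(v, k=Fraction(2)):
    """Boost with Doppler factor k: scales (1, 1) by k and (1, -1) by 1/k."""
    t, x = v
    plus = Fraction(t + x) / 2    # component along (1, 1)
    minus = Fraction(t - x) / 2   # component along (1, -1)
    plus, minus = k * plus, minus / k
    return (plus + minus, plus - minus)

x, y, u = (1, 1), (1, -1), (2, 2)
f = lambda a, b: 4 * a * b   # claimed length of a*x + b*y
g = lambda c, d: 8 * c * d   # claimed length of c*u + d*y

a, b = 3, 5
v = comb(a, x, b, y)                      # (8, -2)
assert l(v) == f(a, b) == 60

# The same vector in the basis (u, y) has c = a/2 and d = b.
c, d = Fraction(a, 2), b
assert comb(c, u, d, y) == v and g(c, d) == l(v)

# After the boost the components change, but a, b and the length do not.
xb, yb = boost(x), boost(y)
assert (xb, yb) == ((2, 2), (Fraction(1, 2), Fraction(-1, 2)))
assert comb(a, xb, b, yb) == boost(v)
assert l(boost(v)) == f(a, b)
```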

Fredrik · Dec 25, 2015

The point of a "derivation" of the Lorentz transformation is to show how, in principle, one could have used simpler ideas to guess that the Lorentz transformation would be a useful ingredient in a theory of physics. It shows that a version of SR can be found by someone who doesn't already know what the theory looks like.

If you take Minkowski spacetime as a starting point, then you're doing something very different. You are showing that Minkowski spacetime contains all the mathematics we need to state such a theory in a nice way.

Both of these things are interesting to me. The latter is an important thing to study if you want a thorough understanding of the mathematics of SR. The former is an important thing to study if you want to understand what SR has in common with, and how it differs from, pre-relativistic classical mechanics.

Minkowski spacetime can be defined as a smooth manifold, an affine space, or a vector space. These options give us equivalent theories of physics. The vector space approach is ugly in the sense that it makes one event in spacetime (the 0 vector) mathematically special, even though it's not physically special. (The theory predicts that an experiment carried out there has the same results as if it's carried out somewhere else). But the vector space approach has the advantage that it makes the mathematics much simpler.

So let's define (the vector space version of) Minkowski spacetime as the pair (M,g), where M is ##\mathbb R^4## with the usual vector space structure, and g is the map from M×M into ##\mathbb R## defined by ##g(x,y)=x^T\eta y## for all x,y in M. Now the claim that "space is isotropic" can be interpreted in the following way: The group of all vector space automorphisms of M that preserve g (i.e. the Lorentz group) has a subgroup that consists of all the linear operators on M that can be written in the form
$$\begin{pmatrix}1 & 0 & 0 & 0\\ 0 & & & \\ 0 & & R &\\ 0 & & &\end{pmatrix},$$ where ##R\in\operatorname{SO}(3)##.
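
As a sanity check, one can verify numerically that a matrix of this block form preserves ##g##, i.e. that ##\Lambda^T\eta\Lambda=\eta##. The sketch below assumes the signature (+, −, −, −) for ##\eta## and uses a rotation about the z axis as the sample ##R##; both choices, and the helper names, are assumptions made for illustration.

```python
import math

# Assumed signature (+, -, -, -).
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]

def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def embed_rotation_z(theta):
    """4x4 matrix diag(1, R), with R a rotation by theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0],
            [0, c, -s, 0],
            [0, s, c, 0],
            [0, 0, 0, 1]]

L = embed_rotation_z(0.7)
lhs = matmul(matmul(transpose(L), ETA), L)   # L^T eta L
max_error = max(abs(lhs[i][j] - ETA[i][j])
                for i in range(4) for j in range(4))
print(max_error < 1e-12)  # True
```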

Of course, we can only do this if we have already defined Minkowski spacetime, so it can't be part of one of those "derivations" of the Lorentz transformation from simpler ideas. To incorporate isotropy in such a derivation, I would do the following. We're trying to find a theory of physics in which spacetime is a mathematical structure with underlying set ##M=\mathbb R^4##. We want this theory to involve global inertial coordinate systems, i.e. maps ##x:M\to\mathbb R^4## that correspond to inertial (i.e. non-accelerating) observers. We want these global inertial coordinate systems to be such that if x and y are global inertial coordinate systems, then the map ##x\circ y^{-1}## is a bijection from M to M that takes straight lines to straight lines (because the motion of an inertial observer should always be a straight line in the coordinate system used by another inertial observer). It can be shown that this implies that these maps are linear. The proof is quite long. We also want this set to be a group. (This is easy to justify by physical principles). Now we can interpret the requirement of isotropy as the choice to only look for groups that have a subgroup that consists of those transformations that can be written in the form above.

It's very difficult to complete a derivation of this type. I have only seen one attempt to really carry this out, and I didn't fully understand it. The result should be that the group is either the Poincaré group or the Galilean group.

Erland · Dec 25, 2015

PeterDonis said:

Suppose we pick two null vectors ##x## and ##y## as a basis for 2-D Minkowski spacetime. For concreteness, let's use ##x = (1, 1)## and ##y = (1, -1)## in ordinary inertial coordinates. Then any other vector ##v## can be expressed as a linear combination ##v = ax + by##. The length of ##v## is then given by some function ##f## of ##l(x)##, ##l(y)##, ##a##, and ##b##; i.e., ##l(v) = f(l(x), l(y), a, b)##; in our case, since ##l(x) = l(y) = 0##, we can simplify this to ##l(v) = f(a, b)##.

Ok, but then it is not meaningful to write ##f(l(x),l(y),a,b)##, since your ##f## is a function of ##x,y,a,b##, not of ##l(x),l(y),a,b##, for such a function should not change its value if we change ##x## and ##y## without changing ##l(x)## and ##l(y)## (and ##a## and ##b##).

But never mind. I think that much of the confusion arises from the fact that we can view an invertible linear transformation ##(t,x) \mapsto (t',x')## (in 2D) in two ways: either as a mapping of a vector to another vector, or as changing the coordinates of a vector to coordinates of the same vector in another basis.

Consider a 2D vector space ##V## and a basis ##B=(\mathbf e,\mathbf f) ## of ##V##. Then every vector ##\mathbf v\in V ## can be uniquely written in the form ##\mathbf v=t\mathbf e+x\mathbf f##.
In first case, we have an invertible linear transformation ##T: V\to V## such that ##T(\mathbf v)=T(t \mathbf e + x \mathbf f)=t'\mathbf e + x'\mathbf f##, i.e. ##(t',x')## are the coordinates of the mapped vector ##T(\mathbf v)## in the same basis as before.
In the second case, we have another basis ##B'=(\mathbf e',\mathbf f')## of ##V## such that ##\mathbf v=t'\mathbf e'+x'\mathbf f'##, i.e. ##(t',x')## are the coordinates of the same vector ##\mathbf v## as before, in the new basis ##B'=(\mathbf e',\mathbf f')##.

Now, consider the scalar function ##l: V\to \Bbb R##, defined for vectors expressed in the basis ##B## by ##l(t\mathbf e +x \mathbf f)=t^2-x^2##. Our problem can then be formulated in two equivalent ways, one for each of the two viewpoints.

1. If ##l(T(\mathbf v))=0## holds for all ##\mathbf v\in V## such that ##l(\mathbf v)=0##, must then ##l(T(\mathbf v))=l(\mathbf v)## hold for all ##\mathbf v\in V##?

2. Let ##l': V\to \Bbb R## be defined for vectors expressed in the basis ##B'## by ##l'(t'\mathbf e'+x'\mathbf f')=(t')^2-(x')^2##.
If then ##l'(\mathbf v)=0## holds for all ##\mathbf v\in V## such that ##l(\mathbf v)=0##, must then ##l'(\mathbf v)=l(\mathbf v)## hold for all ##\mathbf v\in V##? (that is: is the Minkowski length given by the same formula in both bases?).

We know of course that the answers to these questions are "yes" if ##T## is a Lorentz transformation (in case 1) or if the coordinate transformation is given by Lorentz's formulas (in case 2). But the answers are easily seen to be "no" in general. It is easy to find examples of linear transformations / changes of bases for which the answers are "no". I gave some examples in earlier posts. Surely, you must agree about that, Peter?

And since the answers are "no" in general, the question is what extra conditions we must impose on the transformations /changes of bases to make the answers "yes"...

Erland · Dec 25, 2015

strangerep said:

I'm happy to work through that detail if you wish -- provided we treat it like a homework exercise: i.e., you must do at least as much of the work as I do, and show it here on PF.

Well, I'm working on it too, despite lack of time. We'll see how it proceeds. Good post, anyway!

Erland · Dec 25, 2015

Fredrik, I am thinking along similar lines as you.

PeterDonis · Dec 25, 2015

Erland said:

such a function should not change its value if we change ##x## and ##y## without changing ##l(x)## and ##l(y)## (and ##a## and ##b##).

Why not? Changing the basis vectors changes the function--at least, that's how I was viewing it. Bear in mind that I'm not making any particular physical or mathematical claim here; I'm simply trying to clarify what I intended to say when I responded to your original question about whether establishing that some group of transformations maps null vectors to null vectors is sufficient to establish that it preserves the lengths of all vectors.

I also take Fredrik's point, however, that if we take Minkowski spacetime as a starting point, asking for a "derivation" of the Lorentz transformations is moot--the symmetry group of Minkowski spacetime is what it is. And if we are talking about any transformation mapping null vectors to null vectors, we must already know what a null vector is, which means we are already assuming something that probably amounts to assuming Minkowski spacetime.

Erland said:

I think that much of the confusion arises from the fact that we can view an invertible linear transformation ##(t,x) \mapsto (t',x')## (in 2D) in two ways: either as a mapping of a vector to another vector, or as changing the coordinates of a vector to coordinates of the same vector in another basis.

I agree that it's important to be careful about distinguishing these two views. I'll have to ponder some more to see if the intuition I was groping towards can be formulated in a way that addresses all of these concerns.

facenian · Dec 26, 2015

Erland said:

Very good, but this then leads to the question: Which are the equations in the theory to which this applies? This should be specified in a stringent exposition.

The equations to which it applies are those which represent physical laws. I think this is very important because this restricts the possible laws of physics.

Erland said:

And how is it used to prove that ##v_{12}=-v_{21}##?

Regarding this point I would ask: shouldn't it be ##v_{12}=v_{21}##? I think this is an interesting question, but one more appropriately discussed in front of a blackboard.

Erland · Dec 26, 2015

facenian said:

Regarding this point I would ask: shouldn't it be ##v_{12}=v_{21}##? I think this is an interesting question, but one more appropriately discussed in front of a blackboard.

If two frames have their x-axes aligned and pointing in the same direction, and move in the x-direction, then they have opposite relative velocities: v₁₂=-v₂₁. If I see you go east, you see me go west.

facenian · Dec 26, 2015

Erland said:

If two frames have their x-axes aligned and pointing in the same direction, and move in the x-direction, then they have opposite relative velocities: v₁₂=-v₂₁. If I see you go east, you see me go west.

That's right, I withdraw my objection regarding the sign. But the original question was why symmetry or isotropy regarding both observers requires ##v_{12}=-v_{21}##, and I think that would be a lot easier to explain in front of a blackboard. Anyway, I think at this point you already got it.

PeroK · Dec 26, 2015

Erland said:

If two frames have their x-axes aligned and pointing in the same direction, and move in the x-direction, then they have opposite relative velocities: v₁₂=-v₂₁. If I see you go east, you see me go west.

This highlights my earlier point. Unless you assume symmetry how do you know two observers can align their x axes?

Erland · Dec 26, 2015

It turns out that PeterDonis wasn't so wrong after all. Although it is false that all linear transformations ##T:\Bbb R^2\to \Bbb R^2## which map (Minkowski) null vectors (w.r.t. the standard basis) to null vectors must preserve Minkowski length, what is true is that such a transformation multiplies the Minkowski length by a constant factor.
More precisely: If ##l(T(\mathbf v))=0## for all ##\mathbf v\in \Bbb R^2## such that ##l(\mathbf v)=0##, then there is a constant ##k\in\Bbb R## such that ##l(T(\mathbf v))=k\,l(\mathbf v)## for all ##\mathbf v \in \Bbb R^2##.

To see this, let such a transformation ##T## be given by ##T(\mathbf v)=T(t,x)=(at+bx,ct+dx)## (so ##c## is not light speed here, the latter is ##1##).
Then: ##l(T(\mathbf v))=(at+bx)^2-(ct+dx)^2=(a^2-c^2)t^2-(d^2-b^2)x^2+2(ab-cd)tx.##
##(1,1)## and ##(1,-1)## are null vectors and they are mapped to null vectors. Thus:
##(a^2 -c^2) - (d^2-b^2) +2(ab-cd)=0## and
##(a^2-c^2)-(d^2-b^2)-2(ab-cd)=0##.
Adding and subtracting these equations, we obtain ##a^2-c^2=d^2-b^2## and ##ab-cd=0##. So, putting ##k=a^2-c^2## we obtain ##l(T(\mathbf v))=k(t^2-x^2)=k\,l(\mathbf v)##. This holds for all ##\mathbf v=(t,x)\in \Bbb R^2##.
(I suspect that there is a general property of quadratic forms lurking in the background, but I cannot figure out what it is.)
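
The result is easy to test on concrete matrices. The following is a minimal sketch with integer entries chosen for illustration (the matrices `M`, `N`, `S` and the helper names are not from the thread). It checks that a null-preserving ##T## rescales ##l## by ##k=a^2-c^2##, that ##k## can be negative, and that a shear, which is not null-preserving, is excluded.

```python
def l(v):
    """l(t, x) = t^2 - x^2."""
    t, x = v
    return t * t - x * x

def apply(M, v):
    """Apply T(t, x) = (a t + b x, c t + d x) for M = ((a, b), (c, d))."""
    (a, b), (c, d) = M
    t, x = v
    return (a * t + b * x, c * t + d * x)

def preserves_null(M):
    # Every null vector is a multiple of (1, 1) or (1, -1), so by linearity
    # it is enough to test these two.
    return l(apply(M, (1, 1))) == 0 and l(apply(M, (1, -1))) == 0

def scale_factor(M):
    """k = a^2 - c^2, as in the derivation above."""
    (a, b), (c, d) = M
    return a * a - c * c

samples = [(1, 0), (0, 1), (2, 1), (-3, 5), (4, 4)]

M = ((3, 1), (1, 3))   # a^2 - c^2 = d^2 - b^2 = 8 and ab - cd = 0
assert preserves_null(M)
assert all(l(apply(M, v)) == scale_factor(M) * l(v) for v in samples)

N = ((1, 3), (3, 1))   # also null-preserving, but with k = -8
assert preserves_null(N) and scale_factor(N) == -8
assert all(l(apply(N, v)) == -8 * l(v) for v in samples)

S = ((1, 1), (0, 1))   # a shear: sends (1, 1) to (2, 1), which is not null
assert not preserves_null(S)
```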

If we also assume that ##T## is invertible, then the "lines" given by ##(t,x)=u(1,1)## and ##(t,x)=u(1,-1)## must be mapped onto each other (in some combination). From this, we can show that ##c=\pm b## and ##d=\pm a## (same sign at both places).
Now, if we interpret ##T## as a coordinate transformation between inertial frames, then ##v=-c/d=-b/a## is the velocity of Frame 2 relative to Frame 1 (so ##a\neq 0##). Inverting the transformation and interpreting this in the corresponding way, we obtain that the velocity of Frame 1 relative to Frame 2 is ##-v##. In other words: ##v_{12}=-v_{21}##!
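
The velocity bookkeeping can be checked with exact fractions. This sketch assumes the same-sign case ##c=b##, ##d=a##, with sample entries chosen for illustration. The velocity is read off as ##-c/d## from the matrix, and again from its inverse. As a side observation (not from the post), the same computation applied to the reflected form ##c=-b##, ##d=-a## returns ##+v## rather than ##-v##, so the sign statement appears to rely on the same-sign case.

```python
from fractions import Fraction

def inverse(M):
    """Inverse of the 2x2 matrix M = ((a, b), (c, d)), in exact arithmetic."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return ((d / det, -b / det), (-c / det, a / det))

def frame_velocity(M):
    """Velocity x/t of the points with x' = 0, i.e. -c/d."""
    (_a, _b), (c, d) = M
    return Fraction(-c, d)

M = ((5, 3), (3, 5))              # same-sign case: c = b, d = a
v = frame_velocity(M)             # -b/a
v_back = frame_velocity(inverse(M))
print(v, v_back)                  # -3/5 3/5

M_reflected = ((5, 3), (-3, -5))  # reflected case: c = -b, d = -a
w = frame_velocity(M_reflected)
w_back = frame_velocity(inverse(M_reflected))
print(w, w_back)                  # -3/5 -3/5
```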

It remains to figure out:
1. Which extra conditions do we need to ensure that ##k=1##?
2. Do we need any extra conditions (e.g. spatial isotropy) to generalize this to 4D-space?

facenian · Dec 26, 2015

PeroK said:

This highlights my earlier point. Unless you assume symmetry how do you know two observers can align their x axes?

Doesn't this have to do with the assumed Euclidean geometry?

facenian · Dec 26, 2015

Erland said:

It remains to figure out:
1. Which extra conditions do we need to ensure that ##k=1##?

Since [itex]\mathbf{T}^{-1}[/itex] shares the same property, we have [itex]l(\mathbf{v})=l(\mathbf{T}^{-1}(\mathbf{T}(\mathbf{v})))=k\,l(\mathbf{T}(\mathbf{v}))=k^2l(\mathbf{v})[/itex], whence [itex]k^2=1[/itex].

Erland · Dec 26, 2015

facenian said:

Since [itex]\mathbf{T}^{-1}[/itex] shares the same property, we have [itex]l(\mathbf{v})=l(\mathbf{T}^{-1}(\mathbf{T}(\mathbf{v})))=k\,l(\mathbf{T}(\mathbf{v}))=k^2l(\mathbf{v})[/itex], whence [itex]k^2=1[/itex].

But ##k## depends upon ##T## and there might be another ##k## for ##T^{-1}##.
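
This objection can be made concrete. In the sketch below (sample matrices and helper names are illustrative assumptions), the factor for ##T^{-1}## comes out as ##1/k##, so composing ##T## with ##T^{-1}## only yields ##k\cdot(1/k)=1##, not ##k^2=1##.

```python
from fractions import Fraction

def scale_factor(M):
    """k = a^2 - c^2 for a null-preserving M = ((a, b), (c, d))."""
    (a, b), (c, d) = M
    return a * a - c * c

def inverse(M):
    """Inverse of a 2x2 matrix, in exact arithmetic."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return ((d / det, -b / det), (-c / det, a / det))

D = ((2, 0), (0, 2))   # pure dilation by 2
B = ((3, 1), (1, 3))   # null-preserving, with k = 8

k_D, k_D_inv = scale_factor(D), scale_factor(inverse(D))
k_B, k_B_inv = scale_factor(B), scale_factor(inverse(B))
print(k_D, k_D_inv)    # 4 1/4
print(k_B, k_B_inv)    # 8 1/8
```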

facenian · Dec 26, 2015

Erland said:

But ##k## depends upon ##T## and there might be another ##k## for ##T^{-1}##.

First, I must say that for [itex]\mathbf{T}^{-1}(\mathbf{v})[/itex] I should have used [itex]\mathbf{T}^{-1}(-\mathbf{v})[/itex], but if space is to be isotropic then T (and its inverse) can only depend on [itex]|\mathbf{v}|[/itex]. For a similar reason [itex]\mathbf{T}^{-1}(v)=\mathbf{T}(v)[/itex], i.e., isotropy of space demands [itex]\mathbf{T}^{-1}=\mathbf{T}[/itex].

Samy_A · Dec 26, 2015

Erland said:

It remains to figure out:
1. Which extra conditions do we need to ensure that ##k=1##?

I'm not sure if you mean mathematical or physical conditions.

Purely mathematically, you showed that ##T## is represented as a matrix by
##\begin{pmatrix}
a & b \\
b& a
\end{pmatrix}## or ##\begin{pmatrix}
a & b \\
-b& -a
\end{pmatrix}##.
##k=\pm \det(A)##
So ##k=1## would mean the form ##\begin{pmatrix}
a & b \\
b& a
\end{pmatrix}## with ##a^2-b^2=1##.

strangerep · Dec 26, 2015

Erland said:

Although it is false that all linear transformations ##T:\Bbb R^2\to \Bbb R^2## which map (Minkowski) null vectors (w.r.t. the standard basis) to null vectors must preserve Minkowski length, what is true is that such a transformation multiplies the Minkowski length by a constant factor. [...]

Yes, you're talking about (uniform) dilations (aka scaling transformations). The largest group which maps null vectors to null vectors is the conformal group, which contains the Poincare group, dilations, and (so-called) "special conformal transformations". The latter are non-linear, so not relevant here.

Also consider the distinction between the O(3) and SO(3) groups...

In physically-motivated derivations of Lorentz transformations, one assumes that all inertial observers set up their local coordinate systems using identical rods and clocks. I.e., they use the same units. This removes the dilational degree of freedom from the transformations.

Erland · Dec 26, 2015

Samy_A said:

I'm not sure if you mean mathematical or physical conditions

I would prefer mathematically formulated assumptions which are motivated physically.

Purely mathematically, you showed that ##T## is represented as a matrix by
##\begin{pmatrix}
a & b \\
b& a
\end{pmatrix}## or ##\begin{pmatrix}
a & b \\
-b& -a
\end{pmatrix}##.
##k=\pm \det(A)##
So ##k=1## would mean the form ##\begin{pmatrix}
a & b \\
b& a
\end{pmatrix}## with ##a^2-b^2=1##.

It could also be

##\begin{pmatrix}
a & b \\
-b& -a
\end{pmatrix}##

with ##k=a^2-b^2=-\det(A)## and ##\det(A)=-1##.

But I think we can rule out this case by a continuity/connectedness assumption: For each such matrix ##A##, we assume that there is a continuous path ##h:[0,1]\to \Bbb M_{22}## (space of ##2\times2##-matrices, and we assume that the range of ##h## only contains the "right" kind of invertible matrices) with ##h(0)=I## (identity matrix, corresponding to relative velocity ##0##) and ##h(1)=A##. The elements in these matrices are then continuous functions on ##[0,1]##. This can be motivated physically by the argument that it should be possible to accelerate any object continuously from rest to any velocity (less than light speed).
For such a matrix

##B=\begin{pmatrix}
a & b \\
c& d
\end{pmatrix}##

we have ##ad+bc=1>0## for ##t=0## and ##B=I##, and, in the second case above, ##ad+bc=-a^2-b^2<0## for ##t=1## and ##B=A##. But ##ad+bc## is a continuous function on ##[0,1]##, so for some ##t\in[0,1]## we must have ##ad+bc=0##, but ##ad+bc=\pm(a^2+b^2)##, and this can only be ##0## if ##a=b=0##, which does not give an invertible matrix and hence is excluded. It follows that we must always have the first case:

##A=\begin{pmatrix}
a & b \\
b& a
\end{pmatrix}.##
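
A small numerical illustration of the quantity ##ad+bc## used in this argument. As an assumed sample path from ##I##, the sketch takes boosts with rapidity growing from 0 (this parametrisation is an illustrative choice, not part of the post). Along it ##ad+bc=\cosh^2\varphi+\sinh^2\varphi## stays at or above 1, while any matrix of the reflected form has ##ad+bc<0##.

```python
import math

def ad_plus_bc(M):
    (a, b), (c, d) = M
    return a * d + b * c

def boost_path(s, rapidity):
    """Sample path h(s): boost with rapidity s * rapidity, so h(0) = I."""
    phi = s * rapidity
    ch, sh = math.cosh(phi), math.sinh(phi)
    return ((ch, -sh), (-sh, ch))   # form ((a, b), (b, a))

# Sample ad + bc along the path for s = 0, 0.01, ..., 1.
samples = [ad_plus_bc(boost_path(i / 100, 1.5)) for i in range(101)]
assert samples[0] == 1.0    # at h(0) = I
assert min(samples) >= 1.0  # never crosses zero on this path

# A matrix of the reflected form ((a, b), (-b, -a)) has ad + bc = -(a^2 + b^2).
reflected = ((5, 3), (-3, -5))
assert ad_plus_bc(reflected) == -34
```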

samalkhaiat · Dec 26, 2015

I don't know what the fuss is all about in this thread, but my posts in the thread below might be helpful.
https://www.physicsforums.com/threads/conformal-group-poincare-group.420204/

Samy_A · Dec 27, 2015

Erland said:

It could also be

##\begin{pmatrix}
a & b \\
-b& -a
\end{pmatrix}##

with ##k=a^2-b^2=-\det(A)## and ##\det(A)=-1##.

Oh yes, of course.

samalkhaiat said:

I don't know what the fuss is all about in this thread, but my posts in the thread below might be helpful.
https://www.physicsforums.com/threads/conformal-group-poincare-group.420204/

strangerep said:

Yes, you're talking about (uniform) dilations (aka scaling transformations). The largest group which maps null vectors to null vectors is the conformal group, which contains the Poincare group, dilations, and (so-called) "special conformal transformations". The latter are non-linear, so not relevant here.

Also consider the distinction between the O(3) and SO(3) groups...

In physically-motivated derivations of Lorentz transformations, one assumes that all inertial observers set up their local coordinate systems using identical rods and clocks. I.e., they use the same units. This removes the dilational degree of freedom from the transformations.

Many thanks for the link and the explanation.

Concerning "what the fuss is all about" (only talking for myself of course):
For the layman in SR, it is sometimes very enlightening to read a more basic approach (as done here). I learned a lot reading this thread.

facenian · Dec 27, 2015

samalkhaiat said:

I don't know what the fuss is all about in this thread, but my posts in the thread below might be helpful.
https://www.physicsforums.com/threads/conformal-group-poincare-group.420204/

Very interesting post. However, I would modify the demonstration because it has one flaw. The problem I see is the assertion that since straight lines must transform into straight lines, the transformation must be linear. I think this is a common mistake (for an example see, for instance, "The Special Theory of Relativity" by Aharoni).
The amendment I propose is summarized as follows:
The principle of Relativity only implies the conformal group, and this means [itex]ds'^2=\alpha(x^\mu,\vec{v})ds^2[/itex]. Here is where homogeneity of space and time comes in, demanding that [itex]\alpha[/itex] be independent of the spacetime variables [itex]x^\mu[/itex]. Isotropy of space then limits the dependence of [itex]\alpha(\vec{v})[/itex] to [itex]\alpha(v)[/itex]. An argument like the one given in Landau and Lifshitz, Volume 2, now proves [itex]\alpha=1[/itex].
Finally, it can be shown (see, for instance, "Gravitation and Cosmology" by S. Weinberg) that the only transformations that leave [itex]ds^2[/itex] invariant are linear transformations.
So, I guess that by taking pieces from these three authors, a clear-cut demonstration can be built.

Fredrik · Dec 27, 2015

facenian said:

The problem I see is the assertion that since straight lines must transform into straight lines, the transformation must be linear.

I think the correct statement is that if ##T:\mathbb R^n\to\mathbb R^n## is a bijection that takes straight lines to straight lines, then there's a linear bijection ##\Lambda:\mathbb R^n\to\mathbb R^n## and an ##a\in\mathbb R^n## such that ##T(x)=\Lambda x+a## for all ##x\in\mathbb R^n##. So if we add the requirement that ##T(0)=0##, ##T## must be linear.

facenian · Dec 27, 2015

Fredrik said:

I think the correct statement is that if ##T:\mathbb R^n\to\mathbb R^n## is a bijection that takes straight lines to straight lines, then there's a linear bijection ##\Lambda:\mathbb R^n\to\mathbb R^n## and an ##a\in\mathbb R^n## such that ##T(x)=\Lambda x+a## for all ##x\in\mathbb R^n##. So if we add the requirement that ##T(0)=0##, ##T## must be linear.

First, if [itex]\mathbf{a}\neq 0[/itex] the transformation is still linear, but never mind that; I think you just missed it.
On the other hand the problem here is not the existence of a linear transformation, the problem is to conclude that the transformation must be linear.

Fredrik · Dec 27, 2015

facenian said:

First, if [itex]\mathbf{a}\neq 0[/itex] the transformation is still linear, but never mind that; I think you just missed it.

Consider the T defined by ##T(x)=x+(1,0,0,0)##. It's not linear, since
\begin{align*}
&T(2(1,1,1,1))=T(2,2,2,2)=(2,2,2,2)+(1,0,0,0)=(3,2,2,2),\\
&2T(1,1,1,1)=2((1,1,1,1)+(1,0,0,0))=2(2,1,1,1) =(4,2,2,2).
\end{align*}

facenian said:

On the other hand the problem here is not the existence of a linear transformation, the problem is to conclude that the transformation must be linear.

Right, and you can do that if you add the assumption that T(0)=0. Without that assumption, the correct conclusion is that T-T(0) is linear.
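
Both statements can be checked directly on the example above. The following sketch reproduces the two computed values for ##T(x)=x+(1,0,0,0)## and then tests ##\Lambda=T-T(0)## on a couple of sample vectors. The helper names and the second test vector are illustrative assumptions, and for this particular ##T## the map ##\Lambda## is simply the identity.

```python
A_SHIFT = (1, 0, 0, 0)

def T(x):
    """The example map: translation by (1, 0, 0, 0)."""
    return tuple(xi + ai for xi, ai in zip(x, A_SHIFT))

def minus_value_at_zero(f):
    """Return the map x -> f(x) - f(0)."""
    f0 = f((0, 0, 0, 0))
    return lambda x: tuple(fi - f0i for fi, f0i in zip(f(x), f0))

def scale(k, x):
    return tuple(k * xi for xi in x)

def add(x, y):
    return tuple(xi + yi for xi, yi in zip(x, y))

x = (1, 1, 1, 1)
y = (0, 2, -1, 3)

# T is not linear: T(2x) differs from 2 T(x).
assert T(scale(2, x)) == (3, 2, 2, 2)
assert scale(2, T(x)) == (4, 2, 2, 2)

# Lambda = T - T(0) is homogeneous and additive on these samples.
Lam = minus_value_at_zero(T)
assert Lam(scale(2, x)) == scale(2, Lam(x))
assert Lam(add(x, y)) == add(Lam(x), Lam(y))
```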

facenian · Dec 27, 2015

Fredrik said:

Consider the T defined by ##T(x)=x+(1,0,0,0)##. It's not linear, since

Yes, I'm sorry, you're right. I was thinking of another kind of linearity, one that allows for "linear and not homogeneous".
I don't know if your approach is relevant to our discussion; maybe it is too advanced for me.

Fredrik · Dec 27, 2015

facenian said:

I don't know if your approach is relevant to our discussion; maybe it is too advanced for me.

The theorem is relevant to any approach to SR that says "takes straight lines to straight lines" instead of "is linear". (Edit: The last sentence in this post explains why).

Unfortunately the proof is very long. I will only mention a few things from the notes I made a few years ago.

There's a version of this theorem that deals with affine spaces rather than vector spaces. It's called "the fundamental theorem of affine geometry". I studied a proof of that theorem in a book on affine spaces, and sort of "translated" it into a proof about vector spaces.

Let T be a permutation of ##\mathbb R^4## that takes straight lines to straight lines. This assumption is not sufficient to ensure that T is linear, but it is sufficient to ensure that T is affine, i.e. that there's a linear bijection ##\Lambda:\mathbb R^4\to\mathbb R^4## and a vector ##a## such that ##T(x)=\Lambda x+a## for all ##x\in\mathbb R^4##. The key steps of the proof are as follows:

1. Define ##\Lambda=T-T(0)## and prove that ##\Lambda## is a bijection that takes straight lines to straight lines.
2. Prove that for all x,y such that {x,y} is linearly independent, we have ##\Lambda(x+y)=\Lambda(x)+\Lambda(y)##.
3. Prove that for all x and all real numbers k, we have ##\Lambda(kx)=k\Lambda(x)##.
4. Prove that for all x,y such that {x,y} is linearly dependent, we have ##\Lambda(x+y)=\Lambda(x)+\Lambda(y)##.

Step 3 breaks up into a trivial case and a difficult case. If x=0, the proof is trivial. If x≠0, the strategy is to prove that there's a function ##f:\mathbb R\to\mathbb R## such that:
(a) ##\Lambda(kx)=f(k)\Lambda(x)##.
(b) f is bijective.
(c) f is a field homomorphism.

Statements (b) and (c) say that f is a field automorphism. This result is useful because it's possible to prove that the only field automorphism on ℝ is the identity map.

It's a trivial corollary of this very non-trivial theorem that a permutation that takes straight lines to straight lines and 0 to 0 is linear.

strangerep · Dec 27, 2015

facenian said:

The principle of Relativity only implies the conformal group [...]

That's incorrect. The principle of relativity implies the group of fractional-linear transformations.

If one also invokes the light principle, and applies it by finding the largest group that preserves the (vacuum) Maxwell eqns, one finds the conformal group.

Taken together, the common subgroup consists of linear transformations.

Ref: Fock & Kemmer, "Space, Time & Gravitation", 2nd ed. 1964.

Erland · Dec 27, 2015

The 2D Lorentz transformation can be derived from the following mathematical assumptions, which all have physical motivations.

It can be proved that there is a unique one parameter family of transformations ##L_v: \Bbb R^2\to \Bbb R^2##, defined for all ##v\in (-1,1)##, satisfying:

1. For each fixed ##(x,t)\in \Bbb R^2##, the mapping ##H:(-1,1)\to \Bbb R^2## given by ##H(v)=L_v(x,t)## is continuous.
2. Each ##L_v## (##v\in (-1,1)##) is a bijection, and its inverse is ##L_w## for some ##w\in (-1,1)##.
3. For each ##v\in(-1,1)##: Each line ##(t,x)=(t_0,x_0)+s(1,a)## (##s\in \Bbb R##), with ##a\in (-1,1)## and ##t_0,x_0\in\Bbb R##, is mapped by ##L_v## to a line ##(t',x')=(t'_0,x'_0)+r(1,b)## (##r\in \Bbb R##), for some ##b\in (-1,1)## and ##t'_0,x'_0\in\Bbb R##.
4. ##L_v(0,0)=(0,0)##, for all ##v\in (-1,1)##.
5. ##L_0## is the identity transformation on ##\Bbb R^2##.
6. For each ##v\in (-1,1)##: ##L_v(1,v)=(t',0)##, for some ##t'\in \Bbb R##.
7. For each ##v\in (-1,1)##: ##L_v(1,1)=(r,\pm r)## and ##L_v(1,-1)=(s,\pm s)## for some ##r,s\in \Bbb R##.
8. For each ##v\in (-1,1)##: If ##L_v(t,x)=(t',x')##, for some ##t,x,t',x'\in \Bbb R##, then either ##L_{-v}(t,-x)=(t',x') ## or ##L_{-v}(t,-x)=(t',-x')##.

Each ##L_v## (##v\in (-1,1) ##) is then given by ##L_v(t,x)=(1/\sqrt{1-v^2})(t-vx, -vt+x)## for all ##(t,x)\in\Bbb R^2##.
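
The closed form can be spot-checked against several of the listed assumptions. This is a minimal floating-point sketch in units where light speed is 1. The sample values ##v=0.6## and the event ##(2, 0.5)##, the tolerance, and the helper names are assumptions made for illustration, and a numerical check at one value of ##v## is of course not a proof of uniqueness.

```python
import math

def lorentz(v, event):
    """L_v(t, x) = gamma * (t - v x, -v t + x), with |v| < 1."""
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (t - v * x), gamma * (-v * t + x))

def close(p, q, tol=1e-12):
    return all(abs(a - b) <= tol for a, b in zip(p, q))

def l(event):
    t, x = event
    return t * t - x * x

v = 0.6
e = (2.0, 0.5)
tp, xp = lorentz(v, e)

assert close(lorentz(-v, (tp, xp)), e)               # 2: the inverse is L_{-v}
assert lorentz(v, (0.0, 0.0)) == (0.0, 0.0)          # 4: origin is fixed
assert close(lorentz(0.0, e), e)                     # 5: L_0 is the identity
assert abs(lorentz(v, (1.0, v))[1]) < 1e-12          # 6: (1, v) maps to x' = 0
r = lorentz(v, (1.0, 1.0))
s = lorentz(v, (1.0, -1.0))
assert abs(abs(r[0]) - abs(r[1])) < 1e-12            # 7: light rays stay null
assert abs(abs(s[0]) - abs(s[1])) < 1e-12
assert close(lorentz(-v, (e[0], -e[1])), (tp, -xp))  # 8: isotropy condition
assert abs(l((tp, xp)) - l(e)) < 1e-12               # Minkowski length is kept
```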

One need not assume that ##L_v## is linear, for this follows from a result proved by micromass, strangerep and Fredrik in an old thread
https://www.physicsforums.com/threa...formations-are-the-only-ones-possible.651640/
for the general case when all lines are mapped onto lines, and it can be proved that it suffices to look at "timelike" lines.

Physical motivations:

1. It is possible to accelerate an object continuously to any speed less than light speed, through inertial frames.
2. The two frames are interchangeable. A consequence of the special principle of relativity.
3. An object which is not being acted upon by a force, and hence moves with uniform rectilinear (timelike) motion w.r.t. one frame, is moving as freely w.r.t. the other frame. A consequence of the special principle of relativity.
4. Just an arbitrary practical convention about how we put marks on our rods and synchronize our clocks.
5. If ##v=0##, the frames coincide.
6. The relative velocity of Frame 2 w.r.t Frame 1 is ##v##.
7. A consequence of the invariance of the light speed.
8. This is about spatial isotropy. The transformation should still be valid if we change the directions of the spatial axes. See an earlier post by strangerep in this thread.

But it becomes more complex in 4D spacetime...

Fredrik · Dec 27, 2015

strangerep said:

That's incorrect. The principle of relativity implies the group of fractional-linear transformations.

What assumptions are made about the domains of these transformations in this approach? (Apologies if you have already told me. In my defense, the thread that Erland linked to above is 3 years old.) If we assume (as I do) that these transformations are permutations of ##\mathbb R^4## (and that they take 0 to 0), we get the stronger result that they are linear.

strangerep · Dec 27, 2015

Fredrik said:

If we assume (as I do) that these transformations are permutations of ##\mathbb R^4## (and that they take 0 to 0), we get the stronger result that they are linear.

Yes.

Afaict, it's possible to make sense of the FL transformations if one restricts the domain to the interiors of the null bicone of each observer. But that's beyond the scope of this thread.

sweet springs · Dec 28, 2015

In §3 of Einstein's first paper on special relativity,
ON THE ELECTRODYNAMICS OF MOVING BODIES
By A. Einstein, June 30, 1905
https://www.fourmilab.ch/etexts/einstein/specrel/www/
he deals with -v. I should appreciate it if someone could explain how Einstein deduced it.
He writes: "Since the relations between x', y', z' and x, y, z do not contain the time t, the systems K and K' are at rest with respect to one another, and it is clear that the transformation from K to K' must be the identical transformation."

facenian · Dec 28, 2015

strangerep said:

That's incorrect. The principle of relativity implies the group of fractional-linear transformations.

Yes, I was assuming that the PR includes the LP, but that is not a good convention.
Thanks for the reference.

facenian · Dec 28, 2015

Fredrik said:

The theorem is relevant to any approach to SR that says "takes straight lines to straight lines" instead of "is linear". (Edit: The last sentence in this post explains why).

Very interesting and yes it is relevant to this discussion
