
## Showing that Lorentz transformations are the only ones possible

 Quote by Fredrik I realized something interesting when I looked at the statement of the theorem they're proving. They're saying that if ##\Lambda## takes straight lines to straight lines, there's a 4×4 matrix A, two 4×1 matrices y,z, and a number c, such that $$\Lambda(x)=\frac{Ax+y}{z^Tx+c}.$$ If we just impose the requirement that ##\Lambda(0)=0##, we get y=0. And if z≠0, there's always an x such that the denominator is 0. So if we also require that ##\Lambda## must be defined on all of ##\mathbb R^4##, then the theorem says that ##\Lambda## must be linear.
Yes, that's what I tried to explain in earlier posts.
 Both of these requirements are very natural if what we're trying to do is to explain e.g. what the principle of relativity suggests about theories of physics that use ##\mathbb R^4## as a model of space and time.
But it gets trickier if you take a more physics-first approach to the foundations: by itself the relativity principle doesn't give you (flat) ##\mathbb R^4## as a model of space and time -- you've got to make some other assumptions about omnipresent rigid rods and standard clocks which might not be so reasonable in the large.
 For now, I'll just try to figure out the best way to use the two additional assumptions I suggested above to simplify the problem.
If you mean "just assume linearity", the best physicist-oriented proof I've seen is in Rindler's SR textbook.
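The fractional-linear form quoted above is easy to probe numerically. The sketch below is illustrative only: the matrix A, vectors y, z and constant c are arbitrary example values, not anything from the thread. It checks that such a map sends collinear points to collinear points, and that setting y = 0 and z = 0 reduces it to a linear map fixing the origin.

```python
# Numeric sketch (illustrative): a fractional-linear map
#   x -> (A x + y) / (z^T x + c)
# sends collinear points to collinear points, and reduces to a linear map
# fixing the origin when y = 0 and z = 0. A, y, z, c are arbitrary examples.

def fl_map(A, y, z, c, x):
    """Apply the fractional-linear map (A x + y) / (z.x + c) in R^2."""
    denom = sum(zi * xi for zi, xi in zip(z, x)) + c
    Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
    return [(Ax[i] + y[i]) / denom for i in range(2)]

def collinear(p, q, r, tol=1e-9):
    """Cross-product test: True if the 2D points p, q, r lie on one line."""
    return abs((q[0]-p[0])*(r[1]-p[1]) - (q[1]-p[1])*(r[0]-p[0])) < tol

A = [[2.0, 1.0], [0.5, 3.0]]
y = [1.0, -2.0]
z = [0.3, 0.7]
c = 5.0

# Three points on the line t -> (1, 2) + t*(3, -1):
pts = [[1 + 3*t, 2 - t] for t in (0.0, 0.5, 2.0)]
imgs = [fl_map(A, y, z, c, p) for p in pts]

print(collinear(*imgs))                                    # True
print(fl_map(A, [0.0, 0.0], [0.0, 0.0], 1.0, [0.0, 0.0]))  # [0.0, 0.0]
```

Imposing ##\Lambda(0)=0## forces y = 0, and requiring the map to be defined everywhere forces z = 0, exactly as in the quoted argument.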

Mentor
 Quote by strangerep If you mean "just assume linearity", the best physicist-oriented proof I've seen is in Rindler's SR textbook.
I meant that I would like to prove that if ##\Lambda## is a permutation of ##\mathbb R^4## (or ##\mathbb R^2##) that takes straight lines to straight lines, and 0 to 0, then ##\Lambda## is linear. I think I know how to do the rest after that, at least in 1+1 dimensions.
 It can be shown by geometric inspection that the Lorentz transformation is the only transformation accounting for the invariant speed of light. We begin with a graphical representation of three examples of observers moving at arbitrarily selected speeds with respect to the black inertial frame of reference. The speed of light in the black inertial reference system is already known to have the value c and is represented by the world line of a single photon (the green line slanted at 45 degrees in the black frame).

Next, we ask what orientation the X1 axis of each observer must have for the speed of light to be invariant among the inertial frames. By trial-and-error inspection, the only admissible orientations of the X1 axis are those for which the photon world line bisects the angle between the X1 axis and the X4 axis, as shown below. Based on this result, we wish to derive the coordinate transformations between any two arbitrarily selected frames.

Again by geometric inspection, we identify a right triangle to which we can apply the Pythagorean theorem. Notice that we have selected two of the moving observer frames entirely arbitrarily, and then found a new black inertial frame in which these two frames move in opposite directions with the same speed. This is a perfectly general situation, since for any pair of observers moving relative to each other, such a reference frame can always be found. Having derived the time dilation, the length contraction follows easily by similar-triangle inspection.
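The bisection property described in this post can be cross-checked with one line of algebra: a boost leaves the photon world line ct = x invariant, i.e. (1, 1) is an eigenvector of the boost matrix. A minimal numeric sketch (illustrative; the β values are arbitrary):

```python
import math

# Illustrative check: a 1+1D Lorentz boost maps the photon world line onto
# itself, so the null direction (1, 1) in (ct, x) coordinates stays null.

def boost(beta):
    """2x2 boost matrix acting on (ct, x) for velocity v = beta*c."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, -g*beta], [-g*beta, g]]

def apply(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1],
            M[1][0]*v[0] + M[1][1]*v[1]]

for beta in (0.2, 0.5, 0.9):
    ct, x = apply(boost(beta), [1.0, 1.0])
    print(ct == x)  # True: the image still satisfies ct' = x'
```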

 Quote by strangerep Yeah, it took me several months (elapsed time) before I understood what's going on here. You should see the original version in Fock's textbook -- it's even more obscure. The crucial idea here is that the straight line is being parameterized in terms of an arbitrary real ##\lambda##. Also think of ##x_0^i## as an arbitrary point on the line so that ##\lambda## and ##v^i## generate the whole line. Then they adopt a confusing notation that ##x## is an abbreviation for the 3-vector with components ##x^i##. Using a bold font would have been more helpful. But persevering with their notation, ##x = x(\lambda) = x_0 + \lambda v##. Since we want the transformed ##x'^{\,i}## to be a straight line also, in general parameterized by a different ##\lambda'## and ##v'##, we can write $$x'^{\,i}(x) ~=~ x'^{\,i}(x_0) ~+~ \lambda'(\lambda) \, v'^{\,i}$$ where the first term on the RHS is to be understood as what ##x_0## is mapped into. I.e., think of ##x'^{\,i}## as a mapping. It might have been more transparent if they'd written ##x'^{\,i}_0## and then explained why this can be expressed as ##x'^{\,i}(x_0)##. Confusing? Yes, I know that only too well. I guess it becomes second nature when one is working in this way all the time. Fock also does a lot of this sort of thing.
Ok, but a little bit further down in the proof, the author seems to use this, which is based upon a particular representation of a particular line, to draw conclusions about other lines at other positions; this is where he introduces a function f(x,v), and I don't understand this at all.

And still, the conclusion of the theorem seems wrong to me. It is nowhere stated that we must have n>1, and for n=1 the function f(x)=x^3+x seems to contradict the theorem, since it is a differentiable bijection from R (a line) onto itself, with a differentiable inverse, but f does not have the required form.

 Quote by Erland Ok, but a little bit further down in the proof, the author [Guo et al] seems to use this, which is based upon a particular representation of a particular line, to draw conclusions about other lines at other positions; this is where he introduces a function f(x,v), and I don't understand this at all.
From their equation
$$v^j v^k \, \frac{\partial^2 x'^{\,i}}{\partial x^j \partial x^k} (x_0 + \lambda v) ~=~ v^j\,\frac{\partial x'^{\,i}}{\partial x^j} \, \frac{\,\frac{d^2 \lambda'}{d\lambda^2}\,}{d\lambda'/d\lambda} ~,$$
we see that ##\frac{d^2 \lambda'}{d\lambda^2}/\frac{d\lambda'}{d\lambda}## at ##(x^i)## depends not only on ##x^i## but also on ##v^i##. Therefore, there must exist a function ##f(x,v)## such that
$$v^j v^k \, \frac{\partial^2 x'^{\,i}}{\partial x^j \partial x^k} ~=~ v^j \, \frac{\partial x'^{\,i}}{\partial x^j} \,f(x,v) ~.$$
Strictly, ##f(x,v)## also depends on ##\lambda##, but this dependence is suppressed in the notation here, since we only need the fact that ##f## depends at least on ##x## and ##v##.
 And still, the conclusion of the theorem seems wrong to me. It is nowhere stated that we must have n>1, and for n=1, the function f(x)=x^3+x seems to contradict the theorem, [...]
No, that's ##n=2##, not ##n=1##.
Think of the (x,y) plane. A straight line on this plane can be expressed as
$$y ~=~ y(x) ~=~ y_0 + s x$$ for some constants ##y_0## and ##s##.
Alternatively, the same straight line can be expressed in terms of a parameter ##\lambda## and constants ##v_x, v_y## as
$$y = y(\lambda) ~=~ y_0 + \lambda v_y ~,~~~~~ x = x(\lambda) ~=~ \lambda v_x ~,$$ and eliminating ##\lambda## gives the previous form, with ##s = v_y/v_x##.
That's what's going on here: straight lines are expressed in the parametric form. Your cubic cannot be expressed in this form, hence is in no sense a straight line.
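To see concretely that the graph of the cubic is not a straight line in the plane, one can run the standard cross-product collinearity test on three of its points. This is an illustrative sketch; the sample points are arbitrary:

```python
# Three points on the graph of f(x) = x**3 + x fail the collinearity test,
# so the graph is not a straight line in the (x, y) plane.
f = lambda x: x**3 + x
p, q, r = [(x, f(x)) for x in (0.0, 1.0, 2.0)]  # (0,0), (1,2), (2,10)
cross = (q[0]-p[0])*(r[1]-p[1]) - (q[1]-p[1])*(r[0]-p[0])
print(cross)  # 6.0 -- nonzero, hence not collinear
```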

 Quote by strangerep From their equation $$v^j v^k \, \frac{\partial^2 x'^{\,i}}{\partial x^j \partial x^k} (x_0 + \lambda v) ~=~ v^j\,\frac{\partial x'^{\,i}}{\partial x^j} \, \frac{\,\frac{d^2 \lambda'}{d\lambda^2}\,}{d\lambda'/d\lambda} ~,$$ we see that ##\frac{d^2 \lambda'}{d\lambda^2}/\frac{d\lambda'}{d\lambda}## at ##(x^i)## depends not only on ##x^i## but also on ##v^i##. Therefore, there must exist a function ##f(x,v)## such that $$v^j v^k \, \frac{\partial^2 x'^{\,i}}{\partial x^j \partial x^k} ~=~ v^j \, \frac{\partial x'^{\,i}}{\partial x^j} \,f(x,v) ~.$$ Strictly, ##f(x,v)## also depends on ##\lambda##, but this dependence is suppressed in the notation here, since we only need the fact that ##f## depends at least on ##x## and ##v##.
It is precisely this I don't understand. If we are talking about a single line and its image, then ##v## is a constant vector, a direction vector of the line, and then it doesn't seem meaningful to take a function depending upon it.
If, on the other hand, we are talking about several, perhaps all, lines and their images, then the problem is that the parametric equations of the lines are not unique: we can freely choose between points on the line and parallel direction vectors, and it is hard to see how we can associate one such choice for the image line with one for the original line in a consistent way. How can ##f(x,v)## then be well defined?
 Quote by strangerep No, that 's ##n=2##, not ##n=1##. [---] Your cubic cannot be expressed in this form, hence is in no sense a straight line.
No, I am not talking about the curve ##y=f(x)=x^3+x## in ##R^2##. I talk about ##f## as a transformation from ##R^1## to itself. In ##R^1##, there is only one line, ##R^1## itself, and it is mapped onto itself by ##f##.

 Quote by Erland No, I am not talking about the curve ##y=f(x)=x^3+x## in ##R^2##. I talk about ##f## as a transformation from ##R^1## to itself. In ##R^1##, there is only one line, ##R^1## itself, and it is mapped onto itself by ##f##.
Remember that for the one-dimensional case it doesn't make sense to single out mappings of straight lines to straight lines since they all are "straight lines": curvature for one-dimensional objects is only extrinsic, unlike what happens in higher-dimensional spaces.
So even if you want to restrict the function to the real line, you need the 2-dimensional representation, as strangerep pointed out, if you want to make any distinction between linearity and non-linearity of lines (curves).

Mentor
 Quote by TrickyDicky Remember that for the one dimensional case it doesn't make sense to single out mappings of straight lines to straight lines since they all are "straight lines",
That's precisely why it's disturbing that the theorem doesn't assume that the dimension of the vector space is at least 2. Since every ##f:\mathbb R\to\mathbb R## takes straight lines to straight lines, the theorem says that there are numbers a,b such that
$$f(x)=ax+b$$
for all x in the domain. Actually it says that there are numbers a,b,c,d such that
$$f(x)=\frac{ax+b}{cx+d}$$
for all x in the domain, but since we're considering an f with domain ℝ, we must have c=0, and this allows us to define a'=a/d, b'=b/d. Since there are lots of other functions from ℝ to ℝ, the theorem is wrong.

It's possible that the only problem with the theorem is that it left out a statement that says that the dimension of the vector space must be at least 2, but then the proof should contain a step that doesn't work in 1 dimension. (I still haven't studied the proof, so I have no opinion).
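Fredrik's n = 1 counterexample can be stated numerically: an affine map ax + b has constant increments over equal steps, and f(x) = x^3 + x does not. A small illustrative check:

```python
# f(x) = x**3 + x is a bijection of the real line (strictly increasing),
# but it is not of the form a*x + b: its increments over unit steps vary.
f = lambda x: x**3 + x
slopes = [f(x + 1) - f(x) for x in (0, 1, 2)]
print(slopes)  # [2, 8, 20] -- not constant, so f is not affine
```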
 One-dimensional vector spaces? That would be scalars. In linear algebra, vector spaces are assumed to be of dimension 2 or higher, aren't they?

Mentor
 Quote by TrickyDicky One dimensional vector spaces? That would be scalars, in linear algebra the vector spaces are assumed to be of dimension 2 or higher, aren't they?
No, they can even be 0-dimensional. That would be a set with only one member. (Denote that member by 0. Define addition and scalar multiplication by 0+0=0, and a0=0 for all scalars a. The triple ({0},addition,scalar multiplication) satisfies the definition of a vector space). 0-dimensional vector spaces are considered "trivial". ℝ is a 1-dimensional real vector space.

 Quote by Fredrik No, they can even be 0-dimensional. That would be a set with only one member. (Denote that member by 0. Define addition and scalar multiplication by 0+0=0, and a0=0 for all scalars a. The triple ({0},addition,scalar multiplication) satisfies the definition of a vector space). 0-dimensional vector spaces are considered "trivial". ℝ is a 1-dimensional real vector space.
Sure, I'm not saying they can't be defined in those dimensions; by "assumed" I referred to the dimensions usually found in linear transformations involving velocities.
Mentor
 I think most theorems in linear algebra hold for any finite-dimensional vector space. But I'm sure there are some that only hold when the dimension is ≥2, and some that only hold when it's ≥3.

 Quote by Erland [...] If, on the other hand, we are talking about several, perhaps all, lines and their images, then the problem is that the parametric equations of the lines are not unique, we can freely choose between points on the line and parallel direction vectors, and it is hard to see how we can associate one such choice for the image line with one for the original line in a consistent way. How can then ##f(x,v)## be well defined?
We're talking about all lines and their images. The idea is that, for any given line, pick a parameterization, and find mappings such that the image is still a (straight) line, in some parameterization of the same type. The ##f(x,v)## is defined in terms of whatever parameterization we chose initially.

 No, I am not talking about the curve ##y=f(x)=x^3+x## in ##R^2##. I talk about ##f## as a transformation from ##R^1## to itself. In ##R^1##, there is only one line, ##R^1## itself, and it is mapped onto itself by ##f##.
But that case is irrelevant to the physics applications here, since there's only one component ##x^i## (which I'll just write as ##x##); the notion of velocity cannot be defined, because one needs at least ##n=2## to write ##dx/dt##.

In your ##n=1## objection, ##x'## is parallel (or antiparallel) to ##x##. Afaict, this means that the 2nd derivatives in the proof such as
$$\frac{\partial^2 x'{^i}}{\partial x^j \, \partial x^k}$$
always vanish. Probably this is a degenerate case, though I haven't tracked it through to find precisely where this affects things. The authors are interested in ##dx/dt## which is an ##n\ge 2## case, hence probably didn't bother with that subtlety. Maybe the proof should have a caveat about ##n\ge 2##, but for the intended physics applications, this doesn't change anything.

BTW, note that Stepanov's proof does not use the parameterization technique used by Guo et al, but rather works directly with 1+1D spacetime, requiring that the condition of zero acceleration is preserved. This is more physically intuitive, and less prone to subtle oversights.
 I may as well go ahead and complete the derivation of the Lorentz transformation (boost). Continuing from the previous time dilation derivation (post #37), we identify congruent triangles from which an easy derivation of the length contraction follows.
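As a numeric companion to the geometric derivation (illustrative only; a standard 1+1D boost with β = 0.6 is assumed), the time dilation factor comes out as γ = 1.25:

```python
import math

# A clock at rest at x = 0 ticks at ct = 1 in its own frame. Boosting to a
# frame in which the clock moves with speed beta*c gives ct' = gamma.

def gamma(beta):
    return 1.0 / math.sqrt(1.0 - beta**2)

beta = 0.6
g = gamma(beta)                 # 1/sqrt(1 - 0.36) = 1.25
ct, x = 1.0, 0.0                # one tick of the clock at rest
ct_prime = g*ct - g*beta*x      # boosted time coordinate of the tick
print(ct_prime)                 # approximately 1.25
```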

 Quote by strangerep The most common reason is so-called homogeneity of space and time. By this, the authors mean that position-dependent (and time-dependent) dilations (scale changes) are ruled out arbitrarily. Personally, I prefer a different definition of spacetime homogeneity: i.e., that it should look the same wherever and whenever you are. IOW, it must be a space of constant curvature. This includes such things as deSitter spacetime, and admits a larger class of possibilities. But another way that various authors reach the linearity assumption is to start with the most general transformations preserving inertial motion, which are fractional-linear transformations. (These are the most general transformations which map straight lines to straight lines -- see note #1.) They then demand that the transformations must be well-defined everywhere, which forces the denominator in the FL transformations to be restricted to a constant, leaving us with affine transformations. In the light of modern cosmology, these arbitrary restrictions are becoming questionable. -------- Note #1: a simpler version of Fock's proof can be found in Appendix B of this paper: http://arxiv.org/abs/gr-qc/0703078 by Guo et al. An even simpler proof for the case of 1+1D can also be found in Appendix 1 of this paper: http://arxiv.org/abs/physics/9909009 by Stepanov. (Take the main body of this paper with a large grain of salt, but his Appendix 1 seems to be ok, though it still needs the reader to fill in some of the steps -- speaking from personal experience. :-)
I think this post exposes the central problem. Lorentz transformations are strongly related to a pragmatic necessity: inertial observers must have the impression that the essential properties of space are preserved (one particular example is the length element).

Conversely, does it mean that non-inertial observers must use transformations different from the Lorentz ones? If so, which ones?
Mentor

Anyone see a simple proof of the following less general statement? If ##\Lambda:\mathbb R^n\to\mathbb R^n## is a bijection that takes straight lines to straight lines, and takes 0 to 0, then ##\Lambda## is linear. Feel free to add assumptions about differentiability of ##\Lambda## if you think that's necessary.

I've got almost nothing so far. I can see that given an arbitrary vector x and an arbitrary real number t, there's a real number s such that ##\Lambda(tx)=s\Lambda(x)##. This means that there's a function ##s:\mathbb R^n\times\mathbb R\to\mathbb R## such that ##\Lambda(tx)=s(x,t)\Lambda(x)## for all x,t. For all x, we have ##0=\Lambda(0)=\Lambda(0x)=s(x,0)\Lambda(x)##. This implies that ##s(x,0)=0## for all ##x\neq 0##. We should be able to choose our s such that s(0,0)=0 as well.

I don't see how to proceed from here, and I don't really see how to begin with the evaluation of ##\Lambda(x+y)## where x,y are arbitrary. One idea I had was to let r be a number such that x+y is on the line through rx and ry. (If x,y are non-zero, there's always such a number. And if one of x,y is zero, there's nothing to prove.) Then there's a number t such that $$\Lambda(x+y)=(1-t)\Lambda(rx)+t\Lambda(ry)=(1-t)s(x,r)\Lambda(x)+ts(y,r)\Lambda(y).$$ But I don't see how to use this. If we want to turn the above into a "For all x,y" statement, we must write t(x,y) instead of t.

By the way, one of the reasons why I think there should be a simple proof is that this was an exercise in the book I linked to in post #27. Unfortunately the author didn't even mention that the map needs to take 0 to 0, so there's definitely something wrong with the exercise, but perhaps that omission is the only thing wrong with it. The author also assumed that the map is a surjection (onto a vector space W), rather than a bijection.
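A tiny sanity check on the statement of this conjecture (illustrative; the map below is an arbitrary example): a nonlinear bijection of the plane that fixes the origin must bend some straight line, so it cannot serve as a counterexample.

```python
# The bijection (x, y) -> (x, y + x**2) fixes the origin but bends the x-axis:
# the images of three points on the x-axis are not collinear.
L = lambda p: (p[0], p[1] + p[0]**2)
p, q, r = [L((t, 0.0)) for t in (0.0, 1.0, 2.0)]  # (0,0), (1,1), (2,4)
cross = (q[0]-p[0])*(r[1]-p[1]) - (q[1]-p[1])*(r[0]-p[0])
print(cross)  # 2.0 -- nonzero, so this map does not take lines to lines
```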

 Quote by Fredrik Anyone see a simple proof of the following less general statement? If ##\Lambda:\mathbb R^n\to\mathbb R^n## is a bijection that takes straight lines to straight lines, and takes 0 to 0, then ##\Lambda## is linear. Feel free to add assumptions about differentiability of ##\Lambda## if you think that's necessary.
A priori, per definition, a bijection is a surjection and an injection. I don't see why this should imply the linearity of that bijection.

 By the way, one of the reasons why I think there should be a simple proof is that this was an exercise in the book I linked to in post #27. Unfortunately the author didn't even mention that the map needs to take 0 to 0, so there's definitely something wrong with the exercise, but perhaps that omission is the only thing wrong with it. The author also assumed that the map is a surjection (onto a vector space W), rather than a bijection.
The exercise (1.3.1) on page 9 is not so complicated: if T is a linear transformation and x, y and z are collinear vectors, then you have α, β and λ (for example in ℝ) such that α·x = β·y = λ·z. Consequently, T(α·x) = T(β·y) = T(λ·z), and linearity implies α·T(x) = β·T(y) = λ·T(z), so that T(x), T(y) and T(z) are also collinear.
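The direction argued here (linear maps preserve collinearity) is also easy to check on an example; the particular map and line below are arbitrary illustrations:

```python
# An arbitrary linear map T on R^2 sends three collinear points to three
# collinear points (the cross-product test gives exactly zero).
T = lambda p: (2*p[0] + p[1], p[0] - p[1])
line = [(1 + 2*t, 3 - t) for t in (0.0, 1.0, 2.5)]  # points on one line
p, q, r = [T(v) for v in line]
cross = (q[0]-p[0])*(r[1]-p[1]) - (q[1]-p[1])*(r[0]-p[0])
print(cross)  # 0.0 -- the images are collinear
```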

Now I think we are very far from the initial question, which was to prove the uniqueness of the Lorentz transformations. There are several levels in the different contributions proposed so far: 1) at one level, contributions try to re-derive the Lorentz transformations (LTs), but this does not answer the initial question; 2) at the other level, indications are given concerning the logic going from the preservation of the length element (post 1) to the LTs. An answer to the initial question would thus consist in testing the uniqueness of the logic followed.
