Deriving Lorentz transformations

SUMMARY

The discussion centers on the symmetry of relative speeds in the context of Lorentz transformations in special relativity. Participants argue that if one observer measures an object moving at velocity ##v##, the object must measure the observer's velocity as equal in magnitude and opposite in direction (##v_{12}=-v_{21}##) by the symmetry principle. The conversation works through the assumptions needed to derive the Lorentz transformations, including spatial isotropy, linearity, and the group behavior of velocity boost transformations. The conclusion emphasizes that the inverse transformation's parameter must be the negative of the original, reinforcing the symmetry of relative motion.

PREREQUISITES
  • Understanding of Lorentz transformations in special relativity
  • Familiarity with the symmetry principle in physics
  • Knowledge of spatial isotropy and its implications
  • Basic concepts of velocity boost transformations
NEXT STEPS
  • Study the derivation of Lorentz transformations in detail
  • Explore the implications of spatial isotropy in physical theories
  • Learn about the symmetry principle and its applications in relativity
  • Investigate the mathematical framework of Lie groups and semigroups in physics
USEFUL FOR

Physicists, students of relativity, and anyone interested in the mathematical foundations of motion and symmetry in the universe.

  • #31
Erland said:
Very good, but this then leads to the question: Which are the equations in the theory to which this applies?
It must apply to every equation in the theory under consideration. The question then is how to specify a "theory".

In classical dynamics, this is done by choosing a specific Lagrangian, in the context of an Action principle and the calculus of variations. The Lagrangian then determines everything (including the equations of the theory) by extremizing the action. In practice, most people therefore just concentrate on symmetries of the Lagrangian. In this sense, one says that "the Lagrangian is the theory" (with the underlying framework of Newtonian space and time being understood, together with the calculus of variations).

In the foundations of relativity, one is interested in how events perceived by one observer ##O## may be reconciled with how another observer ##O'## perceives those same events. For the specific case where the observers are unaccelerated (i.e., inertial), and in motion relative to each other (with ##O'## having relative velocity ##v## wrt ##O##), and with the origins of their spatiotemporal reference frames coinciding, the equations of this "mini theory" are simply the coordinate transformations between the 2 coordinate systems. Assuming the transformations to be linear, and restricting ourselves to the 1+1D case, they are of the form $$t' ~=~ A(v) t + B(v) x ~,~~~~~~ x' = C(v)t + D(v) x ~, ~~~~~~ (1)$$where A,B,C,D are unknown functions to be determined.

One uses various physically-motivated criteria to restrict the form of A,B,C,D. One criterion is that ##v=0## corresponds to the identity transformation ##t'=t, x'=x##. Another is that $$\left. \frac{dx'}{dt'}\right|_0 ~=~ -v ~~~~ \mbox{if}~~ \left. \frac{dx}{dt}\right|_0 ~=~ 0 ~.$$ Another criterion (assumption) is that the transformations form a 1-parameter Lie (semi)group, with ##v## being the parameter. This implies (among other things) that 2 successive transformations with parameters ##v,v'## must commute, and the composition of the transformations must be equivalent to a single transformation with some parameter ##v'' = v''(v,v')##, to be determined.

The spatial isotropy assumption plays a role as follows. In 1+1D, it means the equations of the theory must be invariant under a reversal of all spatial vectors. Performing this reversal on (1), we get $$t' ~=~ A(-v) t - B(-v) x ~,~~~~~~ -x' = C(-v)t - D(-v) x ~. ~~~~~~ (2)$$ The equations (2) must be equivalent to (1). So, after a little algebra, we find the constraints: $$A(v) = A(-v) ~,~~~~ B(v) = -B(-v) ~,~~~~ C(v) = -C(-v) ~,~~~~ D(v) = D(-v) ~.$$

Further, since ##B## and ##C## are odd and must vanish at ##v=0## (the identity transformation), we can (without loss of generality) substitute ##B(v) = v E(v)## and ##C(v) = v F(v)##, where ##E(v), F(v)## are 2 new unknown functions.

The benefit of the above is that we now have transformation equations where the unknown functions ##A,D,E,F## are all even in ##v##. This fact can be used in subsequent steps of the derivation (but forgive me if I don't reproduce the entire thing here).
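As a sanity check on these criteria, here is a minimal sympy sketch, jumping ahead to the familiar 1+1D boost ##t'=\gamma(t-vx)##, ##x'=\gamma(x-vt)## with ##c=1## (an assumption, since the derivation has not reached that form yet): it satisfies the identity at ##v=0##, the condition ##dx'/dt'|_0=-v##, and the even/odd constraints just derived.

```python
# Minimal symbolic sanity check: the standard 1+1D Lorentz boost (assumed form,
# c = 1) satisfies the criteria listed above, with A(v) = D(v) = gamma even and
# B(v) = C(v) = -gamma*v odd.
import sympy as sp

t, x, v = sp.symbols('t x v', real=True)
gamma = 1 / sp.sqrt(1 - v**2)

A = gamma            # coefficient of t in t'
B = -gamma * v       # coefficient of x in t'
C = -gamma * v       # coefficient of t in x'
D = gamma            # coefficient of x in x'

# identity transformation at v = 0
assert (A.subs(v, 0), B.subs(v, 0), C.subs(v, 0), D.subs(v, 0)) == (1, 0, 0, 1)

# parity constraints: A, D even in v; B, C odd in v
assert sp.simplify(A - A.subs(v, -v)) == 0
assert sp.simplify(D - D.subs(v, -v)) == 0
assert sp.simplify(B + B.subs(v, -v)) == 0
assert sp.simplify(C + C.subs(v, -v)) == 0

# the origin of the unprimed frame (x = 0, so dx/dt = 0) moves with dx'/dt' = -v
tp = A * t
xp = C * t
assert sp.simplify(sp.diff(xp, t) / sp.diff(tp, t) + v) == 0
print("all criteria satisfied")
```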
Also I wonder, how can this be used to prove that, for example, distances in directions perpendicular to the direction of motion between the frames are not changed by the transformation? (In this case, only rotations fixing the direction of motion should be used above.)
Here, you're talking about the more general 1+3D case, and what you say cannot be proven. Instead, one assumes that a rotation of the transformed axes around the boost direction has been performed (if necessary) to ensure that they're aligned with the original axes. (Strictly speaking, one might also have to perform a parity reversal as well.)

And how is it used to prove that ##v_{12}=-v_{21}##?
Well, that requires a fair bit more work, applying the assumptions I outlined above to derive further constraints on the unknown functions. I'm happy to work through that detail if you wish -- provided we treat it like a homework exercise: i.e., you must do at least as much of the work as I do, and show it here on PF. :wink:
 
  • #32
Erland said:
what do you mean by "commutation relations"?

The best way to answer that is to give an example; SO(3) will do. This is, as I said, a 3-parameter group of transformations, which means that every rotation in the group can be obtained by exponentiating a linear combination of three "basis" elements, which are called "generators". If we call the three generators ##J^1##, ##J^2##, ##J^3##, then these three obey the following commutation relations (##[A, B]## is the commutator of the objects ##A## and ##B##, i.e., ##[A, B] = AB - BA##):

$$
[J^1, J^2] = i J^3
$$

$$
[J^2, J^3] = i J^1
$$

$$
[J^3, J^1] = i J^2
$$

These three relations can be expressed more compactly as

$$
[J^i, J^j] = i \epsilon^{ijk} J^k
$$

where ##\epsilon^{ijk}## is the completely antisymmetric symbol in 3 dimensions, i.e., ##\epsilon^{123} = 1##, and even permutations of the indices have the same sign while odd ones have the opposite sign.
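A minimal numerical check of these relations, using the standard 3×3 (spin-1) matrices ##(J^k)_{ij}=-i\,\epsilon_{kij}## as one concrete choice of generators:

```python
# Check numerically that the 3x3 generators (J^k)_{ij} = -i * epsilon_{kij}
# satisfy [J^i, J^j] = i * epsilon^{ijk} J^k.
import numpy as np

def levi_civita():
    eps = np.zeros((3, 3, 3))
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, j, k] = 1.0    # even permutations
        eps[i, k, j] = -1.0   # odd permutations
    return eps

eps = levi_civita()
J = [-1j * eps[k] for k in range(3)]   # three 3x3 generator matrices

for i in range(3):
    for j in range(3):
        commutator = J[i] @ J[j] - J[j] @ J[i]
        rhs = sum(1j * eps[i, j, k] * J[k] for k in range(3))
        assert np.allclose(commutator, rhs)
print("[J^i, J^j] = i eps^{ijk} J^k verified")
```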

Erland said:
I find it in no way obvious that just because the Minkowski lengths of x and y are preserved by L, the same is true for all linear combinations of x and y.

A linear combination of the intervals x and y is another interval, and which interval it is is independent of the choice of coordinates. (This is just the Minkowski spacetime analogue of vector addition in ordinary Euclidean space.) So the Minkowski length of the linear combination can only depend on the lengths of x and y, and the coefficients of the linear combination. But the lengths of x and y, by hypothesis, are not changed by L, and the coefficients of the linear combination define which linear combination it is, so they can't be changed by L. Therefore, the Minkowski length of the linear combination can't be changed by L.
 
  • #33
PeterDonis said:
A linear combination of the intervals x and y is another interval, and which interval it is is independent of the choice of coordinates. (This is just the Minkowski spacetime analogue of vector addition in ordinary Euclidean space.) So the Minkowski length of the linear combination can only depend on the lengths of x and y, and the coefficients of the linear combination.
This is wrong, both in Euclidean and Minkowski space. As a counterexample in 2d Minkowski space, with ##c=1##, take ##x=(3,0)## and ##y=(5,4)##. Their Minkowski lengths are ##l(x)=3^2-0^2=9## and ##l(y)=5^2-4^2=9##. If what you wrote is true, we would have ##l(x+x)=l(x+y)##, since ##x+x## is a linear combination of ##x## and ##x## with the same coefficients as in the linear combination ##x+y## of ##x## and ##y##, and ##x## and ##y## have the same Minkowski lengths. But ##l(x+x)=6^2-0^2=36## and ##l(x+y)=8^2-4^2=48##.
To find a counterexample in Euclidean space is even more trivial.
You might object that we must only take linear combinations of mutually orthogonal basis vectors. But you wrote in the earlier post that we should take a basis of lightlike vectors in Minkowski space. However, in 2d Minkowski space, and I am quite sure in 3d and 4d also, we cannot find two lightlike, linearly independent vectors which are orthogonal to each other. Two lightlike, linearly independent vectors in 2d must be two nonzero vectors ##(u,u)## and ##(v,-v)##. But they are not orthogonal to each other in the Minkowski sense, since ##uv-u(-v)=2uv\neq 0##.
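A tiny script confirming the arithmetic of this counterexample, with the convention ##l(t,x)=t^2-x^2## used above:

```python
# Numerical check of the counterexample: same Minkowski lengths, same coefficients,
# different lengths of the linear combinations (c = 1, l(t, x) = t^2 - x^2).
def l(v):
    t, x = v
    return t**2 - x**2

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

x = (3, 0)
y = (5, 4)
assert l(x) == l(y) == 9           # same Minkowski length
assert l(add(x, x)) == 36          # but l(x + x) = 36 ...
assert l(add(x, y)) == 48          # ... while l(x + y) = 48
```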
 
  • #34
Erland said:
take ##x=(3,0)## and ##y=(5,4)##.

These are not null vectors. Your original hypothesis was that invariance of the interval only applied to null intervals, so ##x## and ##y## should be null intervals.

Erland said:
If what you wrote is true, we would have ##l(x+x)=l(x+y)##

Of course this is trivially true for any pair of null intervals ##x## and ##y##, since their lengths are always zero. So what my statement really amounts to is saying that, for the case under discussion (##x## and ##y## null intervals), the length of the linear combination depends only on the coefficients of the linear combination.

Erland said:
##x+x## is a linear combination of ##x## and ##x## with the same coefficients as in the linear combination ##x+y## of ##x## and ##y##

Um, what? That doesn't even make sense. The coefficients of a linear combination of ##x## and ##y## are the numbers multiplying ##x## and ##y## in the linear combination, in order; i.e., for the linear combination ##ax + by##, the coefficients are ##a, b##. So ##x + x## has coefficients ##2, 0##, while ##x + y## has coefficients ##1, 1##. So these are two different linear combinations.

Erland said:
in 2d Minkowski space, and I am quite sure in 3d and 4d also, we cannot find two lightlike, linearly indepent vectors which are orthogonal to each other.

Basis vectors don't have to be orthogonal, they just have to be linearly independent. Finding a basis of four null vectors just means finding four null vectors that are linearly independent. (You are correct that we can't find two linearly independent null vectors that are orthogonal; two null vectors can only be orthogonal if they are collinear.) The basis vectors in the case you're used to, an inertial frame, are all orthogonal, but that does not mean basis vectors always have to be orthogonal; they just happen to be in that particular case.
 
  • #35
PeterDonis said:
Of course this is trivially true for any pair of null intervals ##x## and ##y##, since their lengths are always zero. So what my statement really amounts to is saying that, for the case under discussion (##x## and ##y## null intervals), the length of the linear combination depends only on the coefficients of the linear combination.
##l(x)## here means the Minkowski length of ##x##. Let ##x=(1,1)## and ##y=(1,-1)##. These are both null vectors. Yet, ##l(x+x)=l(2,2)=0## while ##l(x+y)=l(2,0)=4\neq 0##.
Um, what? That doesn't even make sense. The coefficients of a linear combination of ##x## and ##y## are the numbers multiplying ##x## and ##y## in the linear combination, in order; i.e., for the linear combination ##ax + by##, the coefficients are ##a, b##. So ##x + x## has coefficients ##2, 0##, while ##x + y## has coefficients ##1, 1##. So these are two different linear combinations.
Linear combinations are defined for all finite sequences of vectors. It doesn't matter if some of them happen to be equal or not. So ##x+x## can be considered as a linear combination of the pair ##(x,x)##, with coefficients ##(1,1)## (another possibility is ##(0,2)##, and there are infinitely many more, since the pair ##(x,x)## is linearly dependent). Likewise ##x+y## can be considered as a linear combination of ##(x,y)## with coefficients ##(1,1)## (and infinitely many more possibilities, if ##(x,y)## is linearly dependent). So, the coefficients are the same in both cases.
Basis vectors don't have to be orthogonal, they just have to be linearly independent. Finding a basis of four null vectors just means finding four null vectors that are linearly independent. (You are correct that we can't find two lightlike, linearly independent null vectors that are orthogonal; two null vectors can only be orthogonal if they are collinear.) The basis vectors in the case you're used to, an inertial frame, are all orthogonal, but that does not mean basis vectors always have to be orthogonal; they just happen to be in that particular case.
It is not clear to me what you actually claim. Do you claim that every linear transformation ##T## which maps (Minkowski) null vectors to null vectors, preserve Minkowski length? If so, you are wrong.
For example, there is an invertible linear transformation ##T:\Bbb R^2 \to \Bbb R^2## such that ##T(1,1)=(2,2)## and ##T(1,-1)=(1,-1)##. We see easily that it maps null vectors to null vectors: ##T(x,x)=(2x,2x)## and ##T(x,-x)=(x,-x)##, and all null vectors are of these types.
Now ##(2,0)=(1,1)+(1,-1)## and ##T(2,0)=T(1,1)+T(1,-1)=(2,2)+(1,-1)=(3,1)##. But ##l(2,0)=4## and ##l(T(2,0))=l(3,1)=8##. Thus, ##T## does not preserve Minkowski lengths.

So, not all linear transformations which map null vectors to null vectors preserve Minkowski lengths. The question is then whether you are not talking about all linear transformations, but about some particular class of transformations. If so, which class, and why? We cannot a priori choose the Lorentz transformations, for that would be circular.
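A short numerical sketch of this counterexample: the matrix of ##T## is recovered from its action on the two null basis vectors, and it visibly rescales the length of ##(2,0)##.

```python
# A linear map T that sends null vectors to null vectors but does not preserve
# the Minkowski length, defined by T(1,1) = (2,2) and T(1,-1) = (1,-1).
import numpy as np

eta = np.diag([1.0, -1.0])                      # 2D Minkowski metric, c = 1

def l(v):
    return v @ eta @ v                          # Minkowski "length" t^2 - x^2

basis = np.array([[1.0, 1.0], [1.0, -1.0]]).T   # columns are the null basis vectors
images = np.array([[2.0, 2.0], [1.0, -1.0]]).T  # columns are their images
T = images @ np.linalg.inv(basis)

for v in [np.array([1.0, 1.0]), np.array([3.0, 3.0]), np.array([2.0, -2.0])]:
    assert np.isclose(l(T @ v), 0.0)            # null vectors stay null

v = np.array([2.0, 0.0])
print(l(v), l(T @ v))                           # prints 4.0 and 8.0: not preserved
```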
 
  • #36
Erland said:
Linear combinations are defined for all finite sequences of vectors. It doesn't matter if some of them happens to be equal or not.

You're missing my point. ##x + x## and ##x + y## are both linear combinations, sure. But they are different linear combinations. (If that fact is not obvious to you--and apparently it's not--then I strongly suggest a review of basic vector algebra, because it certainly should be obvious.) So we should expect them to result in vectors with different Minkowski lengths.

[Edited to delete my second comment/question, it was already answered in your previous post.]
 
  • #37
Peter, we must have misunderstood each other in some fundamental way, but I don't know in which way...

Let me recapitulate what you wrote in an earlier post:
PeterDonis said:
A linear combination of the intervals x and y is another interval, and which interval it is is independent of the choice of coordinates. (This is just the Minkowski spacetime analogue of vector addition in ordinary Euclidean space.) So the Minkowski length of the linear combination can only depend on the lengths of x and y, and the coefficients of the linear combination. But the lengths of x and y, by hypothesis, are not changed by L, and the coefficients of the linear combination define which linear combination it is, so they can't be changed by L. Therefore, the Minkowski length of the linear combination can't be changed by L.

So you claim that "the Minkowski length of the linear combination can only depend on the lengths of x and y, and the coefficients of the linear combination" (my emphasis).

In other words, you mean that if we have a linear combination ##ax+by##, then its Minkowski length ##l(ax+by)## depends only on ##l(x)##, ##l(y)##, ##a##, and ##b##. So, if we have two other vectors ##u## and ##v##, with the same lengths as ##x## and ##y##, respectively, that is, ##l(u)=l(x)## and ##l(v)=l(y)##, and take a linear combination of ##u## and ##v## with the same coefficients as before, ##au+bv##, then ##l(au+bv)=l(ax+by)##.
This is simply wrong, even if we restrict ourselves to the case ##l(x)=l(y)=l(u)=l(v)=0##. (Counterexample: ##x=(1,1)##, ##y=(1,-1)##, ##u=(2,2)##, ##v=(1,-1)##, ##a=b=1##.)
If you don't understand this, it's you, not me, who need to repeat linear algebra.

But if the above is a misinterpretation of what you meant, please let us know what you really meant!
 
  • #38
Erland said:
In other words, you mean that if we have a linear combination ##ax+by##, then its Minkowski length ##l(ax+by)## depends only on ##l(x)##, ##l(y)##, ##a##, and ##b##.

Yes, but I see that the word "depends" hides an ambiguity, so that the statement as I gave it can be taken in a stronger sense than I intended (and the stronger sense is false, as you say). Let me try to restate.

Suppose we pick two null vectors ##x## and ##y## as a basis for 2-D Minkowski spacetime. For concreteness, let's use ##x = (1, 1)## and ##y = (1, -1)## in ordinary inertial coordinates. Then any other vector ##v## can be expressed as a linear combination ##v = ax + by##. The length of ##v## is then given by some function ##f## of ##l(x)##, ##l(y)##, ##a##, and ##b##; i.e., ##l(v) = f(l(x), l(y), a, b)##; in our case, since ##l(x) = l(y) = 0##, we can simplify this to ##l(v) = f(a, b)##. If we then Lorentz transform, the basis vectors in the new inertial frame will have different numerical components; for example, if the Doppler factor of the transformation is 2, then in the new frame we will have ##x = (2, 2)## and ##y = (0.5, -0.5)##. But the vector ##v## will still be given by ##ax + by##, and the length of ##v## will still be given by ##l(v) = f(a, b)##.

Now suppose we change our minds and decide to use the null vector ##u = (2, 2)## instead of ##x## as the first of our two basis vectors. Then any vector ##v## can be expressed as a linear combination ##v = cu + dy##. The length of ##v## is then given by some function ##l(v) = g(c, d)##, where ##g## is a different function from ##f## above. But this different formula for the length of ##v## will still be preserved by a Lorentz transformation.
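A small symbolic sketch of this restatement, using the null basis ##x=(1,1)##, ##y=(1,-1)## from the post: the length of ##v=ax+by## comes out as ##f(a,b)=4ab##, and the same formula holds after the null directions are rescaled by a Doppler factor ##D## and ##1/D## (taken here as the assumed effect of a boost on the two null basis vectors).

```python
# With the null basis x = (1, 1), y = (1, -1), l(a*x + b*y) = 4*a*b, and the same
# formula survives a boost that rescales the null directions by D and 1/D.
import sympy as sp

a, b, D = sp.symbols('a b D', positive=True)

def l(vec):
    t, x = vec
    return t**2 - x**2

x_vec, y_vec = (1, 1), (1, -1)
v = (a*x_vec[0] + b*y_vec[0], a*x_vec[1] + b*y_vec[1])
assert sp.expand(l(v) - 4*a*b) == 0            # f(a, b) = 4ab

# after a boost with Doppler factor D, the basis components become D*x and y/D
x_new, y_new = (D, D), (1/D, -1/D)
v_new = (a*x_new[0] + b*y_new[0], a*x_new[1] + b*y_new[1])
assert sp.simplify(l(v_new) - 4*a*b) == 0      # same f(a, b): invariant
print("l(a*x + b*y) = 4ab in both frames")
```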
 
  • #39
The point of a "derivation" of the Lorentz transformation is to show how, in principle, one could have used simpler ideas to guess that the Lorentz transformation would be a useful ingredient in a theory of physics. It shows that a version of SR can be found by someone who doesn't already know what the theory looks like.

If you take Minkowski spacetime as a starting point, then you're doing something very different. You are showing that Minkowski spacetime contains all the mathematics we need to state such a theory in a nice way.

Both of these things are interesting to me. The latter is an important thing to study if you want a thorough understanding of the mathematics of SR. The former is an important thing to study if you want to understand what SR has in common with, and how it differs from, pre-relativistic classical mechanics.

Minkowski spacetime can be defined as a smooth manifold, an affine space, or a vector space. These options give us equivalent theories of physics. The vector space approach is ugly in the sense that it makes one event in spacetime (the 0 vector) mathematically special, even though it's not physically special. (The theory predicts that an experiment carried out there has the same results as if it's carried out somewhere else). But the vector space approach has the advantage that it makes the mathematics much simpler.

So let's define (the vector space version of) Minkowski spacetime as the pair (M,g), where M is ##\mathbb R^4## with the usual vector space structure, and g is the map from M×M into ##\mathbb R## defined by ##g(x,y)=x^T\eta y## for all x,y in M. Now the claim that "space is isotropic" can be interpreted in the following way: The group of all vector space automorphisms of M that preserve g (i.e. the Lorentz group) has a subgroup that consists of all the linear operators on M that can be written in the form
$$\begin{pmatrix}1 & 0 & 0 & 0\\ 0 & & & \\ 0 & & R &\\ 0 & & &\end{pmatrix},$$ where ##R\in\operatorname{SO}(3)##.
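A brief numerical illustration of this subgroup, with ##\eta## chosen as ##\mathrm{diag}(1,-1,-1,-1)## to match the ##t^2-x^2## convention used earlier in the thread (a choice, since the signature isn't fixed above): a rotation ##R\in\operatorname{SO}(3)## embedded in the block form shown preserves ##g(x,y)=x^T\eta y##.

```python
# Embed an SO(3) rotation as a 4x4 block matrix and check that it preserves
# g(x, y) = x^T eta y, with eta = diag(1, -1, -1, -1).
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])

def rotation_z(theta):
    """A sample element of SO(3): rotation about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

R = rotation_z(0.7)
Lam = np.block([[np.eye(1), np.zeros((1, 3))],
                [np.zeros((3, 1)), R]])          # the 4x4 block form shown above

rng = np.random.default_rng(0)
x, y = rng.normal(size=4), rng.normal(size=4)
assert np.isclose(x @ eta @ y, (Lam @ x) @ eta @ (Lam @ y))
print("block rotation preserves g(x, y)")
```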

Of course, we can only do this if we have already defined Minkowski spacetime, so it can't be part of one of those "derivations" of the Lorentz transformation from simpler ideas. To incorporate isotropy in such a derivation, I would do the following. We're trying to find a theory of physics in which spacetime is a mathematical structure with underlying set ##M=\mathbb R^4##. We want this theory to involve global inertial coordinate systems, i.e. maps ##x:M\to\mathbb R^4## that correspond to inertial (i.e. non-accelerating) observers. We want these global inertial coordinate systems to be such that if x and y are global inertial coordinate systems, then the map ##x\circ y^{-1}## is a bijection from ##\mathbb R^4## to ##\mathbb R^4## that takes straight lines to straight lines (because the motion of an inertial observer should always be a straight line in the coordinate system used by another inertial observer). It can be shown that this implies that these maps are affine (and hence linear if we also require that they take 0 to 0). The proof is quite long. We also want the set of these maps ##x\circ y^{-1}## to be a group. (This is easy to justify by physical principles.) Now we can interpret the requirement of isotropy as the choice to only look for groups that have a subgroup that consists of those transformations that can be written in the form above.

It's very difficult to complete a derivation of this type. I have only seen one attempt to really carry this out, and I didn't fully understand it. The result should be that the group is either the Poincaré group or the Galilean group.
 
  • #40
PeterDonis said:
Suppose we pick two null vectors ##x## and ##y## as a basis for 2-D Minkowski spacetime. For concreteness, let's use ##x = (1, 1)## and ##y = (1, -1)## in ordinary inertial coordinates. Then any other vector ##v## can be expressed as a linear combination ##v = ax + by##. The length of ##v## is then given by some function ##f## of ##l(x)##, ##l(y)##, ##a##, and ##b##; i.e., ##l(v) = f(l(x), l(y), a, b)##; in our case, since ##l(x) = l(y) = 0##, we can simplify this to ##l(v) = f(a, b)##.
Ok, but then it is not meaningful to write ##f(l(x),l(y),a,b)##, since your ##f## is a function of ##x,y,a,b##, not of ##l(x),l(y),a,b##, for such a function should not change its value if we change ##x## and ##y## without changing ##l(x)## and ##l(y)## (and ##a## and ##b##).

But never mind. I think that much of the confusion arises from the fact that we can view an invertible linear transformation ##(t,x) \mapsto (t',x')## (in 2D) in two ways: either as a mapping of a vector to another vector, or as changing the coordinates of a vector to coordinates of the same vector in another basis.

Consider a 2D vector space ##V## and a basis ##B=(\mathbf e,\mathbf f) ## of ##V##. Then every vector ##\mathbf v\in V ## can be uniquely written in the form ##\mathbf v=t\mathbf e+x\mathbf f##.
In first case, we have an invertible linear transformation ##T: V\to V## such that ##T(\mathbf v)=T(t \mathbf e + x \mathbf f)=t'\mathbf e + x'\mathbf f##, i.e. ##(t',x')## are the coordinates of the mapped vector ##T(\mathbf v)## in the same basis as before.
In the second case, we have another basis ##B'=(\mathbf e',\mathbf f')## of ##V## such that ##\mathbf v=t'\mathbf e'+x'\mathbf f'##, i.e. ##(t',x')## are the coordinates of the same vector ##\mathbf v## as before, in the new basis ##B'=(\mathbf e',\mathbf f')##.

Now, consider the scalar function ##l: V\to \Bbb R##, defined for vectors expressed in the basis ##B## by ##l(t\mathbf e +x \mathbf f)=t^2-x^2##. Our problem can then be formulated in two equivalent ways, one for each of the two viewpoints.

1. If ##l(T(\mathbf v))=0## holds for all ##\mathbf v\in V## such that ##l(\mathbf v)=0##, must then ##l(T(\mathbf v))=l(\mathbf v)## hold for all ##\mathbf v\in V##?

2. Let ##l': V\to \Bbb R## be defined for vectors expressed in the basis ##B'## by ##l'(t'\mathbf e'+x'\mathbf f')=(t')^2-(x')^2##.
If then ##l'(\mathbf v)=0## holds for all ##\mathbf v\in V## such that ##l(\mathbf v)=0##, must then ##l'(\mathbf v)=l(\mathbf v)## hold for all ##\mathbf v\in V##? (that is: is the Minkowski length given by the same formula in both bases?).

We know of course that the answers to these questions are "yes" if ##T## is a Lorentz transformation (in case 1) or if the coordinate transformation is given by Lorentz's formulas (in case 2). But the answers are easily seen to be "no" in general. It is easy to find examples of linear transformations / changes of bases for which the answers are "no". I gave some examples in earlier posts. Surely, you must agree about that, Peter?

And since the answers are "no" in general, the question is what extra conditions we must impose on the transformations /changes of bases to make the answers "yes"...
 
  • #41
strangerep said:
I'm happy to work through that detail if you wish -- provided we treat it like a homework exercise: i.e., you must do at least as much of the work as I do, and show it here on PF. :wink:
Well, I'm working on it too, despite lack of time. We'll see how it proceeds. Good post, anyway!
 
  • #42
Fredrik, I am thinking along similar lines as you.
 
  • #43
Erland said:
such a function should not change its value if we change ##x## and ##y## without changing ##l(x)## and ##l(y)## (and ##a## and ##b##).

Why not? Changing the basis vectors changes the function--at least, that's how I was viewing it. Bear in mind that I'm not making any particular physical or mathematical claim here; I'm simply trying to clarify what I intended to say when I responded to your original question about whether establishing that some group of transformations maps null vectors to null vectors is sufficient to establish that it preserves the lengths of all vectors.

I also take Fredrik's point, however, that if we take Minkowski spacetime as a starting point, asking for a "derivation" of the Lorentz transformations is moot--the symmetry group of Minkowski spacetime is what it is. And if we are talking about any transformation mapping null vectors to null vectors, we must already know what a null vector is, which means we are already assuming something that probably amounts to assuming Minkowski spacetime.

Erland said:
I think that much of the confusion arises from the fact that we can view an invertible linear transformation ##(t,x) \mapsto (t',x')## (in 2D) in two ways: either as a mapping of a vector to another vector, or as changing the coordinates of a vector to coordinates of the same vector in another basis.

I agree that it's important to be careful about distinguishing these two views. I'll have to ponder some more to see if the intuition I was groping towards can be formulated in a way that addresses all of these concerns.
 
  • #44
Erland said:
Very good, but this then leads to the question: Which are the equations in the theory to which this applies? This should be specified in a stringent exposition.
The equations to which it applies are those which represent physical laws. I think this is very important because this restricts the possible laws of physics.
Erland said:
And how is it used to prove that ##v_{12}=-v_{21}##?
Regarding this point I would ask: shouldn't it be ##v_{12}=v_{21}##? I think this is an interesting question, but more appropriate to be discussed in front of a blackboard.
 
  • #45
facenian said:
Regarding this point I would ask: shouldn't it be ##v_{12}=v_{21}##? I think this is an interesting question, but more appropriate to be discussed in front of a blackboard.
If two frames have their x-axes aligned and pointing in the same direction, and move in the x-direction, then they have opposite relative velocities: ##v_{12}=-v_{21}##. If I see you go east, you see me go west.
 
  • #46
Erland said:
If two frames have their x-axes aligned and pointing in the same direction, and move in the x-direction, then they have opposite relative velocities: ##v_{12}=-v_{21}##. If I see you go east, you see me go west.
That's right, I withdraw my objection regarding the sign, but the original question was why symmetry or isotropy regarding both observers requires ##v_{12}=-v_{21}##, and I think explaining it would be a lot easier in front of a blackboard. Anyway, I think at this point you already got it.
 
  • #47
Erland said:
If two frames have their x-axes aligned and pointing in the same direction, and move in the x-direction, then they have opposite relative velocities: ##v_{12}=-v_{21}##. If I see you go east, you see me go west.
This highlights my earlier point. Unless you assume symmetry, how do you know two observers can align their x-axes?
 
  • #48
It turns out that PeterDonis wasn't so wrong after all. Although it is false that all linear transformations ##T:\Bbb R^2\to \Bbb R^2## which map (Minkowski) null vectors (w.r.t. the standard basis) to null vectors must preserve Minkowski length, what is true is that the Minkowski length is multiplied by a constant factor by such a transformation.
More precisely: If ##l(T(\mathbf v))=0## for all ##\mathbf v\in \Bbb R^2## such that ##l(\mathbf v)=0##, then there is a constant ##k\in\Bbb R## such that ##l(T(\mathbf v))=k\,l(\mathbf v)## for all ##\mathbf v \in \Bbb R^2##.

To see this, let such a transformation ##T## be given by ##T(\mathbf v)=T(t,x)=(at+bx,ct+dx)## (so ##c## is not light speed here, the latter is ##1##).
Then: ##l(T(\mathbf v))=(at+bx)^2-(ct+dx)^2=(a^2-c^2)t^2-(d^2-b^2)x^2+2(ab-cd)tx.##
##(1,1)## and ##(1,-1)## are null vectors and they are mapped to null vectors. Thus:
##(a^2 -c^2) - (d^2-b^2) +2(ab-cd)=0## and
##(a^2-c^2)-(d^2-b^2)-2(ab-cd)=0##.
Adding and subtracting these equations, we obtain ##a^2-c^2=d^2-b^2## and ##ab-cd=0##. So, putting ##k=a^2-c^2## we obtain ##l(T(\mathbf v))=k(t^2-x^2)=k\,l(\mathbf v)##. This holds for all ##\mathbf v=(t,x)\in \Bbb R^2##.
(I suspect that there is a general property of quadratic forms lurking in the background, but I cannot figure out what it is.)

If we also assume that ##T## is invertible, then the "lines" given by ##(x,t)=u(1,1)## and ##(x,t)=u(1,-1)## must be mapped onto each other (in some combination). From this, we can show that ##c=\pm b## and ##d=\pm a## (same sign at both places).
Now, if we interpret ##T## as a coordinate transformation between inertial frames, then ##v=-c/d=-b/a## is the velocity of Frame 2 relative to Frame 1 (so ##a\neq 0##). Inverting the transformation (in the case ##c=b##, ##d=a##; the reflected case ##c=-b##, ##d=-a## is ruled out further down by a continuity argument) and interpreting it in the corresponding way, we obtain that the velocity of Frame 1 relative to Frame 2 is ##-v##. In other words: ##v_{12}=-v_{21}##!
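Since this algebra is easy to get wrong, here is a sympy sketch that checks both claims symbolically: ##l(T\mathbf v)-k\,l(\mathbf v)## is a combination of the two null conditions (hence vanishes whenever they do), and in the case ##c=b##, ##d=a## the inverse transformation gives the opposite relative velocity.

```python
# Symbolic check: a linear map preserving the null cone scales the Minkowski length
# by k = a^2 - c^2, and for the form (a, b; b, a) the relative velocities are opposite.
import sympy as sp

t, x, a, b, c, d = sp.symbols('t x a b c d', real=True)

def l(vec):
    return vec[0]**2 - vec[1]**2

T = sp.Matrix([[a, b], [c, d]])
v = sp.Matrix([t, x])

null1 = l(T * sp.Matrix([1, 1]))     # vanishes if (1, 1) stays null
null2 = l(T * sp.Matrix([1, -1]))    # vanishes if (1, -1) stays null
k = a**2 - c**2

# l(Tv) - k*l(v) is a combination of null1 and null2, hence zero when both vanish
identity = l(T * v) - k * l(v) \
    - sp.Rational(1, 2) * (null1 + null2) * x**2 \
    - sp.Rational(1, 2) * (null1 - null2) * t * x
assert sp.expand(identity) == 0

# velocity reciprocity in the case c = b, d = a
Tb = sp.Matrix([[a, b], [b, a]])
v12 = -Tb[1, 0] / Tb[1, 1]                       # velocity of Frame 2 rel. Frame 1
Tinv = Tb.inv()
v21 = sp.simplify(-Tinv[1, 0] / Tinv[1, 1])      # velocity of Frame 1 rel. Frame 2
assert sp.simplify(v12 + v21) == 0               # i.e. v21 = -v12
print("k =", k, " v12 =", v12, " v21 =", v21)
```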

It remains to figure out:
1. Which extra conditions do we need to ensure that ##k=1##?
2. Do we need any extra conditions (e.g. spatial isotropy) to generalize this to 4D-space?
 
  • #49
PeroK said:
This highlights my earlier point. Unless you assume symmetry how do you know two observers can align their x axes?
doesn't this have to do with the supposed Euclidean geometry?
 
  • #50
Erland said:
It remains to figure out:
1. Which extra conditions do we need to ensure that ##k=1##?
Since ##\mathbf{T}^{-1}## shares the same property, we have ##l(\mathbf{v})=l(\mathbf{T}^{-1}(\mathbf{T}(\mathbf{v})))=k\,l(\mathbf{T}(\mathbf{v}))=k^2\,l(\mathbf{v})##, whence ##k^2=1##.
 
  • #51
facenian said:
Since ##\mathbf{T}^{-1}## shares the same property, we have ##l(\mathbf{v})=l(\mathbf{T}^{-1}(\mathbf{T}(\mathbf{v})))=k\,l(\mathbf{T}(\mathbf{v}))=k^2\,l(\mathbf{v})##, whence ##k^2=1##.
But ##k## depends upon ##T## and there might be another ##k## for ##T^{-1}##.
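A concrete illustration of this objection, using a pure dilation (anticipating the remark about dilations further down): the scale factors of ##T## and ##T^{-1}## need not coincide, so the quoted argument only gives ##k\,k'=1##.

```python
# For the dilation T(v) = 2*v, both T and T^{-1} map null vectors to null vectors,
# but their scale factors differ: k = 4 for T, k' = 1/4 for T^{-1}, with k*k' = 1.
import numpy as np

eta = np.diag([1.0, -1.0])

def l(v):
    return v @ eta @ v

T = 2.0 * np.eye(2)
Tinv = np.linalg.inv(T)

v = np.array([5.0, 3.0])                     # an arbitrary (non-null) test vector
k = l(T @ v) / l(v)                          # scale factor of T
k_inv = l(Tinv @ v) / l(v)                   # scale factor of T^{-1}
print(k, k_inv, k * k_inv)                   # prints 4.0, 0.25, 1.0
```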
 
  • #52
Erland said:
But ##k## depends upon ##T## and there might be another ##k## for ##T^{-1}##.
First, I must say that for ##\mathbf{T}^{-1}(\mathbf{v})## I should have used ##\mathbf{T}^{-1}(-\mathbf{v})##, but if space is to be isotropic then ##\mathbf{T}## (and its inverse) can only depend on ##|\mathbf{v}|##; for a similar reason ##\mathbf{T}^{-1}(v)=\mathbf{T}(v)##, i.e., isotropy of space demands ##\mathbf{T}^{-1}=\mathbf{T}##.
 
  • #53
Erland said:
It remains to figure out:
1. Which extra conditions do we need to ensure that ##k=1##?
I'm not sure if you mean mathematical or physical conditions.

Purely mathematically, you showed that ##T## is represented as a matrix by $$\begin{pmatrix} a & b \\ b & a \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} a & b \\ -b & -a \end{pmatrix},$$ with ##k=\pm \det(A)##. So ##k=1## would mean the form $$\begin{pmatrix} a & b \\ b & a \end{pmatrix}$$ with ##a^2-b^2=1##.
 
  • #54
Erland said:
Although it is false that all linear transformations ##T:\Bbb R^2\to \Bbb R^2## which map (Minkowski) null vectors (w.r.t. the standard basis) to null vectors must preserve Minkowski length, what is true is that the Minkowski length is multiplied by a constant factor by such a transformation. [...]
Yes, you're talking about (uniform) dilations (aka scaling transformations). The largest group which maps null vectors to null vectors is the conformal group, which contains the Poincare group, dilations, and (so-called) "special conformal transformations". The latter are non-linear, so not relevant here.

Also consider the distinction between the O(3) and SO(3) groups... :oldwink:

In physically-motivated derivations of Lorentz transformations, one assumes that all inertial observers set up their local coordinate systems using identical rods and clocks. I.e., they use the same units. This removes the dilational degree of freedom from the transformations.
 
  • #55
Samy_A said:
I'm not sure if you mean mathematical or physical conditions
I would prefer mathematically formulated assumptions which are motivated physically.
Purely mathematically, you showed that ##T## is represented as a matrix by $$\begin{pmatrix} a & b \\ b & a \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} a & b \\ -b & -a \end{pmatrix},$$ with ##k=\pm \det(A)##. So ##k=1## would mean the form $$\begin{pmatrix} a & b \\ b & a \end{pmatrix}$$ with ##a^2-b^2=1##.
It could also be $$\begin{pmatrix} a & b \\ -b & -a \end{pmatrix}$$ with ##k=a^2-b^2=-\det(A)## and ##\det(A)=-1##.

But I think we can rule out this case by a continuity/connectedness assumption: For each such matrix ##A##, we assume that there is a continuous path ##h:[0,1]\to \Bbb M_{22}## (the space of ##2\times2## matrices; we assume that the range of ##h## only contains the "right" kind of invertible matrices) with ##h(0)=I## (the identity matrix, corresponding to relative velocity ##0##) and ##h(1)=A##. The elements of these matrices are then continuous functions on ##[0,1]##. This can be motivated physically by the argument that it should be possible to accelerate any object continuously from rest to any velocity (less than light speed).
For such a matrix $$B=\begin{pmatrix} a & b \\ c & d \end{pmatrix}$$ we have ##ad+bc=1>0## for ##t=0## and ##B=I##, and, in the second case above, ##ad+bc=-a^2-b^2<0## for ##t=1## and ##B=A##. But ##ad+bc## is a continuous function on ##[0,1]##, so for some ##t\in[0,1]## we must have ##ad+bc=0##. However, ##ad+bc=\pm(a^2+b^2)##, and this can only be ##0## if ##a=b=0##, which does not give an invertible matrix and hence is excluded. It follows that we must always have the first case: $$A=\begin{pmatrix} a & b \\ b & a \end{pmatrix}.$$
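A quick numerical illustration of why the sign of ##ad+bc## does the work here, using the familiar boost family as the continuous path (one concrete choice of ##h##): along it ##ad+bc## stays positive, while any invertible matrix of the second form has ##ad+bc=-(a^2+b^2)<0##.

```python
# Along a continuous family of boosts h(s) starting at the identity, ad + bc stays
# positive; a matrix of the second form (a, b; -b, -a) has ad + bc < 0, so it
# cannot be reached without ad + bc passing through zero.
import numpy as np

def boost(v):
    """1+1D boost of the first form (a, b; b, a) with a = gamma, b = -gamma*v."""
    gamma = 1.0 / np.sqrt(1.0 - v**2)
    return np.array([[gamma, -gamma * v],
                     [-gamma * v, gamma]])

def ad_plus_bc(M):
    return M[0, 0] * M[1, 1] + M[0, 1] * M[1, 0]

# path from the identity (v = 0) to a boost with v = 0.9
path = [boost(s * 0.9) for s in np.linspace(0.0, 1.0, 50)]
assert all(ad_plus_bc(M) > 0 for M in path)

second_form = np.array([[2.0, 1.0], [-1.0, -2.0]])    # invertible, second form
print(ad_plus_bc(second_form))                         # prints -5.0 = -(a^2 + b^2)
```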
 
  • #57
Erland said:
It could also be $$\begin{pmatrix} a & b \\ -b & -a \end{pmatrix}$$ with ##k=a^2-b^2=-\det(A)## and ##\det(A)=-1##.
Oh yes, of course.
samalkhaiat said:
I don't know what the fuss is all about in this thread, but my posts in the thread below might be helpful.
https://www.physicsforums.com/threads/conformal-group-poincare-group.420204/
strangerep said:
Yes, you're talking about (uniform) dilations (aka scaling transformations). The largest group which maps null vectors to null vectors is the conformal group, which contains the Poincare group, dilations, and (so-called) "special conformal transformations". The latter are non-linear, so not relevant here.

Also consider the distinction between the O(3) and SO(3) groups... :oldwink:

In physically-motivated derivations of Lorentz transformations, one assumes that all inertial observers set up their local coordinate systems using identical rods and clocks. I.e., they use the same units. This removes the dilational degree of freedom from the transformations.
Many thanks for the link and the explanation.

Concerning "what the fuss is all about" (only talking for myself of course):
For the layman in SR, it is sometimes very enlightening to read a more basic approach (as done here). I learned a lot reading this thread.
 
  • #58
samalkhaiat said:
I don't know what the fuss is all about in this thread, but my posts in the thread below might be helpful.
https://www.physicsforums.com/threads/conformal-group-poincare-group.420204/
Very interesting post. However, I would modify the demonstration because it has one flaw. The problem I see is the assertion that since straight lines must transform into straight lines, the transformation must be linear. I think this is a common mistake (see, for instance, "The Special Theory of Relativity" by Aharoni).
The amendment I propose is summarized as follows:
The principle of Relativity only implies the conformal group, and this means ##ds'^2=\alpha(x^\mu,\vec{v})\,ds^2##. Here is where homogeneity of space and time comes in, demanding that ##\alpha## be independent of the spacetime variables ##x^\mu##; isotropy of space then limits the dependence of ##\alpha(\vec{v})## to ##\alpha(v)##. An argument like the one given in Landau and Lifshitz, Volume 2, now proves ##\alpha=1##.
Finally, it can be shown (see, for instance, "Gravitation and Cosmology" by S. Weinberg) that the only transformations that leave ##ds^2## invariant are linear transformations.
So, I guess that by taking pieces from these three authors, a clear-cut demonstration can be built.
 
  • #59
facenian said:
The problem I see is the assertion that since straight lines must transform into straight lines, the transformation must be linear.
I think the correct statement is that if ##T:\mathbb R^n\to\mathbb R^n## is a bijection that takes straight lines to straight lines, then there's a linear bijection ##\Lambda:\mathbb R^n\to\mathbb R^n## and an ##a\in\mathbb R^n## such that ##T(x)=\Lambda x+a## for all ##x\in\mathbb R^n##. So if we add the requirement that ##T(0)=0##, ##T## must be linear.
 
  • #60
Fredrik said:
I think the correct statement is that if ##T:\mathbb R^n\to\mathbb R^n## is a bijection that takes straight lines to straight lines, then there's a linear bijection ##\Lambda:\mathbb R^n\to\mathbb R^n## and an ##a\in\mathbb R^n## such that ##T(x)=\Lambda x+a## for all ##x\in\mathbb R^n##. So if we add the requirement that ##T(0)=0##, ##T## must be linear.
First, if ##\mathbf{a}\neq 0## the transformation is still linear, but never mind that, I think you just missed it.
On the other hand the problem here is not the existence of a linear transformation, the problem is to conclude that the transformation must be linear.
 
