Deriving Lorentz transformations

abhinavjeet · Dec 17, 2015

Why is relative speed taken to be symmetrical, i.e., why is the speed of one frame of reference as measured from a second frame equal to the speed of the second frame as measured from the first?

PeroK · Dec 17, 2015

abhinavjeet said:

Why is relative speed taken to be symmetrical, i.e., why is the speed of one frame of reference as measured from a second frame equal to the speed of the second frame as measured from the first?

Which one do you think should be the greater?

abhinavjeet · Dec 17, 2015

Well, I think that if I observe an object moving with velocity v, then how can I know how the object observes me?
Also, if the speeds are equal, can I say that in my reference frame
X=vt
and in his frame X'=vt'?
Please forgive me if I'm mistaken.
I'm not so good at physics...

PeroK · Dec 17, 2015

abhinavjeet said:

Well, I think that if I observe an object moving with velocity v, then how can I know how the object observes me?
Also, if the speeds are equal, can I say that in my reference frame
X=vt
and in his frame X'=vt'?
Please forgive me if I'm mistaken.
I'm not so good at physics...

That wasn't the question. The question is:

If you observe an object moving with velocity ##v## does that object measure your velocity as ##> v## or ##< v##? (Assuming it's not ##= v##)

abhinavjeet · Dec 17, 2015

I don't have any reason to prefer >v or <v

PeroK · Dec 17, 2015

abhinavjeet said:

I don't have any reason to prefer >v or <v

Then it must be ##=v##.

Let's say you see an object moving towards you at speed ##v## and it sees you moving towards it at speed ##v_1 > v##. Then, unless there is a lack of symmetry in the universe, you must see it moving towards you at ##v_2 > v_1##. That can't be. You have the same argument if ##v_1 < v##

The symmetry of the universe implies equality of relative velocities.

Dale · Dec 17, 2015

abhinavjeet said:

I don't have any reason to prefer >v or <v

Yes, exactly. This is called a symmetry principle.

bcrowell · Dec 17, 2015

I have a discussion of this in my SR book http://lightandmatter.com/sr/ , section 1.4, example 11, "Observers agree on their relative speeds."

DrSirius · Dec 17, 2015

bcrowell said:

I have a discussion of this in my SR book http://lightandmatter.com/sr/ , section 1.4, example 11, "Observers agree on their relative speeds."

The book seems excellent. I am glad to have found out about that example; it is the first time I have seen it explicitly stated that the question is not completely obvious or trivial. A teacher of mine solved it by printing a sheet of paper and flipping it over to show the symmetry. IMHO, that is not completely trivial and the matter is a little more subtle than that.

sweet springs · Dec 17, 2015

Let the velocity of IFR 2 measured in IFR 1 be ##V_{12}##.
Let the velocity of IFR 1 measured in IFR 2 be ##V_{21}##.
Let X be the axis along the velocity. Reversing the direction of X changes the sign of ##V_{12}##.
Reversing the direction of X also exchanges the roles of IFR 1 and IFR 2, in the sense that one moves along X and the other moves opposite to it.
So we may have a reason to say ##V_{12}=-V_{21}##.

strangerep · Dec 17, 2015

Afaik, one only needs the following assumptions:

1) The usual relativity principle,

2) An assumption of 3D spatial isotropy,

3) An assumption that the axes transverse to the boost direction remain unchanged (as usual when deriving Lorentz transformations), [Edit: meaning that the rotational freedom allowed by spatial isotropy is exploited to align the transverse axes of the original and transformed frames.]

4) An assumption that velocity boost transformations form a 1-parameter Lie semigroup (possibly a group).

From (1)-(4) one can derive that, for a boost transformation with parameter ##v##, the inverse transformation must have parameter ##-v##. Additional hand waving intuition is not necessary.

[Edit: I modified this post slightly in an attempt to soften my previous tone.]
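As a quick consistency check of the statement that the inverse of a boost with parameter ##v## is the boost with parameter ##-v##, here is a minimal Python sketch. It assumes the standard 1+1D Lorentz boost (the very thing a derivation like this is aiming at, so this is a check and not a proof), units with ##c = 1##, and the hypothetical helper names `boost`, `matmul` and `is_identity`.

```python
import math

C = 1.0  # speed of light in natural units (assumption for this sketch)

def boost(v):
    """2x2 matrix acting on (t, x) for the standard 1+1D Lorentz boost."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return [[g, -g * v / C**2],
            [-g * v, g]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def is_identity(m, tol=1e-12):
    """True if m equals the 2x2 identity within the tolerance."""
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))

# Boosting by v and then by -v should return the original coordinates.
round_trip = matmul(boost(-0.6), boost(0.6))
print(is_identity(round_trip))
```

Composing two boosts with the same sign of ##v## does not give the identity, which is what singles out ##-v## as the inverse parameter in this check.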

bcrowell · Dec 17, 2015

strangerep said:

3) An assumption that the axes transverse to the boost direction remain unchanged (as usual when deriving Lorentz transformations),

This isn't a necessary assumption. This fact can be derived.

strangerep said:

4) An assumption that velocity boost transformations form a 1-parameter Lie semigroup (possibly a group).

From (1)-(4) one can derive that, for a boost transformation with parameter ##v##, the inverse transformation must have parameter ##-v##. Additional hand waving intuition is not necessary.

By assuming a Lie group you are assuming what you claim to prove. To justify this assumption, you will find yourself having to reason about physics. Physical reasoning does not equate to "hand waving intuition."

[Mentors note: This post has been edited as part of some overall thread moderation]

strangerep · Dec 17, 2015

bcrowell said:

strangerep said:

3) An assumption that the axes transverse to the boost direction remain unchanged (as usual when deriving Lorentz transformations),

This isn't a necessary assumption. This fact can be derived.

I suspect we are talking at cross purposes on this point. Please see my edit in post #11. If that doesn't put the discussion back on track, then please explain how this is "derived". (I did look in your book, but maybe I wasn't looking in the right place.)

By assuming a Lie group you are assuming what you claim to prove.

I do not see this. What do you think I "claim to prove".

To justify this assumption [Lie (semi)group], you will find yourself having to reason about physics.

Sure.

bcrowell · Dec 18, 2015

@strangerep: Sorry for my grumpy tone in my previous post.

strangerep said:

I suspect we are talking at cross purposes on this point. Please see my edit in post #11. If that doesn't put the discussion back on track, then please explain how this is "derived". (I did look in your book, but maybe I wasn't looking in the right place.)

I see. From your original version, I thought you were talking about an assumption that there is no transverse Lorentz contraction. From your edited version, I can see that you just mean there is no rotation between the two frames.

strangerep said:

I do not see this. What do you think I "claim to prove".

Maybe I'm misunderstanding this as well. Is the complete argument written up somewhere?

In #4, you only assume a semigroup, but then at the end you refer to "the inverse transformation." Are you claiming that based on assumptions 1-4 you can prove that it's a group, and not just a semigroup?

My general comments on the outline you've presented are:

(1) I don't see the point of doing this in 3+1 dimensions rather than 1+1. Any successful argument is going to require an assumption of spatial isotropy, but that assumption can be expressed in one spatial dimension, where it's equivalent to parity symmetry.

(2) Proving some group-theoretical facts doesn't establish anything about the physics unless you provide some "glue" between the math and the physics, which you haven't done. The mathematical assumptions require physical justification, and the mathematical results require physical interpretation. Using fancy math may give the superficial impression of rigor, but fancy math without the glue is actually less rigorous than simple math with the glue. In any case, all you seem to be claiming to prove is that "for a boost transformation with parameter v, the inverse transformation must have parameter −v." This is easy to prove in even the simplest treatment of SR using high school algebra. The gee-whiz stuff about Lie semigroups comes off to me as obscurantism.

strangerep · Dec 18, 2015

bcrowell said:

@strangerep: Sorry for my grumpy tone in my previous post.

Thank you. Actually, I see now that (the original version of) my post could indeed be perceived as supercilious. I take your remarks on board for the future.

Maybe I'm misunderstanding this as well. Is the complete argument written up somewhere?

My writeup of the complete argument is still not in final form (and no one else has yet proofread the draft). Probably, I should keep my mouth shut until this occurs, but I don't always have the necessary self control.

In #4, you only assume a semigroup, but then at the end you refer to "the inverse transformation." Are you claiming that based on assumptions 1-4 you can prove that it's a group, and not just a semigroup?

I believe so, yes. (Strictly speaking, there's a couple of other inputs: i.e., that the original and transformed origins coincide, and some physical input to establish that the effect of the transformations does indeed correspond to a natural interpretation of what "velocity boost" means in terms of coordinates.)

(1) I don't see the point of doing this in 3+1 dimensions rather than 1+1. Any successful argument is going to require an assumption of spatial isotropy, but that assumption can be expressed in one spatial dimension, where it's equivalent to parity symmetry.

I wanted to investigate the most general case, to be sure I wasn't accidentally overlooking something.

(2) Proving some group-theoretical facts doesn't establish anything about the physics unless you provide some "glue" between the math and the physics, which you haven't done. The mathematical assumptions require physical justification, and the mathematical results require physical interpretation. Using fancy math may give the superficial impression of rigor, but fancy math without the glue is actually less rigorous than simple math with the glue.

I agree with all of the above. One of the reasons I haven't finished my writeup is that it's challenging to do this well, and without subtly introducing extra assumptions.

In any case, all you seem to be claiming to prove is that "for a boost transformation with parameter v, the inverse transformation must have parameter −v." This is easy to prove in even the simplest treatment of SR using high school algebra. The gee-whiz stuff about Lie semigroups comes off to me as obscurantism.

Well, I'm trying to write a treatise covering the most general (fractional-linear) case, with as few assumptions as possible. I decided to be more careful about the semigroup stuff when I investigated time displacement transformations in the same framework. The advanced investigations along related lines that I'm aware of always seem to assume time reversal symmetry. (E.g., Bacry & Levy-LeBlond, "Possible Kinematics", JMP vol 9, 1968, p1605.) Relaxing that assumption, one must investigate the physical domain (in velocity phase space) on which all the transformations are well-defined, and ask on what domain the inverse transformations are also well-defined. I found some interesting consequences (beyond the scope of this thread), and this motivated me to relax the group assumption for other transformations such as boosts and spatial displacements, and investigate how much could be derived starting from only a semigroup assumption. Whether this is perceived as "obscurantism" depends on one's interests, I suppose.

Erland · Dec 22, 2015

Symmetry, spatial isotropy, how are these principles formulated in a mathematically stringent way?

Early in this thread, it was claimed that the relative speeds ##v_1## of Frame 1 w.r.t. Frame 2 and ##v_2## of Frame 2 w.r.t. Frame 1 must be equal because of "symmetry". But then, suppose an object is at rest w.r.t Frame 1. Then, it has speed ##v_1>0## w.r.t Frame 2. Why doesn't this violate symmetry?

Also, I am reading bcrowell's interesting book and looked up the example he mentioned, but I don't understand how "spatial isotropy" is applied in that case.

strangerep · Dec 22, 2015

Erland said:

Symmetry, spatial isotropy, how are these principles formulated in a mathematically stringent way?

Let's take spatial isotropy as an example. Heuristically, this means there is no distinguished (or "preferred") direction in 3-space. In mathematically more precise terms, I'd express it as follows:

A theory is spatially isotropic iff every equation of the theory remains invariant when all quantities in the equation are substituted by their (respective) rotated counterparts. To understand this in more detail, let's consider an equation in a theory, written as $$F(t,{\bf x}, {\bf v}, ...) ~=~ 0 ~~~~~~ (1) ~,$$ where the bold symbols denote 3-vectors. Now consider an arbitrary 3D rotation matrix ##\bf M##, and substitute all 3-vectors in (1) by their rotated counterparts, i.e., ##{\bf Mx}, {\bf Mv}##, etc. So we have $$F(t,{\bf Mx}, {\bf Mv}, ...) ~=~ 0 ~~~~~~ (2) ~.$$ If (2) is equivalent to (1), i.e., if by purely algebraic manipulations we can make (2) look exactly like (1), then (1) is called a spatially isotropic equation. If every equation in the theory has this property, then the theory has the property of spatial isotropy.
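To make the rotation test concrete, here is a small Python sketch. It compares a scalar expression, ##{\bf x \cdot p}##, with an expression that singles out the z direction, and evaluates both before and after rotating the vectors. The function names (`rotate_about_x`, `dot`, `z_only`) and the sample vectors are assumptions chosen purely for illustration.

```python
import math

def rotate_about_x(vec, angle):
    """Rotate a 3-vector about the x-axis by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = vec
    return (x, c * y - s * z, s * y + c * z)

def dot(a, b):
    """Ordinary Euclidean dot product, a rotational scalar."""
    return sum(ai * bi for ai, bi in zip(a, b))

def z_only(a, b):
    """Hypothetical anisotropic expression: it prefers the z direction."""
    return a[2] * b[2]

x = (1.0, 2.0, 3.0)
p = (0.5, -1.0, 2.0)
angle = 0.7

x_rot = rotate_about_x(x, angle)
p_rot = rotate_about_x(p, angle)

# Change of each expression under the rotation.
scalar_change = abs(dot(x_rot, p_rot) - dot(x, p))
anisotropic_change = abs(z_only(x_rot, p_rot) - z_only(x, p))
print(scalar_change, anisotropic_change)
```

An equation built only from expressions like `dot` passes the test in the definition above; one containing something like `z_only` does not.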

A similar idea applies to other symmetries: if we can take any equation in the theory, substitute the various quantities therein by their transformed counterparts, and by purely algebraic manipulations make the new equation look exactly like the original, then the theory has that symmetry.

Sometimes it's obvious that an equation is spatially isotropic, e.g., if it contains only scalar expressions explicitly, like (say) ##{\bf x \cdot \bf p}##.

BTW, although I've only explained the case with 3-vectors explicitly, the same idea applies if there are higher rank tensors (or spinors) in the theory -- one simply has to use the correct rotation transformation formula for each type of quantity appearing.

strangerep · Dec 22, 2015

Erland said:

Early in this thread, it was claimed that the relative speeds ##v_1## of Frame 1 w.r.t. Frame 2 and ##v_2## of Frame 2 w.r.t. Frame 1 must be equal because of "symmetry". But then, suppose an object is at rest w.r.t Frame 1. Then, it has speed ##v_1>0## w.r.t Frame 2. Why doesn't this violate symmetry?

By introducing "an object" (which I'll denote by ##O##), you must now consider 4 relative velocities: ##v_{12}##, ##v_{21}##, ##v_{O1}##, ##v_{O2}##. (Here, my notation ##v_{AB}## denotes velocity of "A" relative to "B". More precisely, ##v_{12}## denotes the velocity of the origin of Frame 1 relative to the origin of Frame 2, where we assume the origins coincide momentarily.)

We have ##v_{21} = -v_{12}## by spatial isotropy (or rather, parity reversal in the 1+1D case), but that (by itself) doesn't say anything about the speed of O relative to either frame. So even if ##v_{O1} = 0##, it doesn't necessarily mean ##v_{O2} = 0##, since ##v_{12} \ne 0##.
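The bookkeeping with the four velocities can be sketched numerically. The snippet below assumes the standard 1+1D relativistic velocity composition law (not derived in this thread), units with ##c = 1##, and a hypothetical helper `velocity_in_frame2`; the variable names mirror the ##v_{AB}## notation above.

```python
C = 1.0  # speed of light in natural units (assumption for this sketch)

def velocity_in_frame2(v_O1, v_12):
    """Velocity of O relative to Frame 2, given its velocity relative to
    Frame 1 and the velocity v_12 of Frame 1 relative to Frame 2
    (standard 1+1D relativistic composition)."""
    return (v_O1 + v_12) / (1.0 + v_O1 * v_12 / C**2)

v_12 = 0.6      # Frame 1 relative to Frame 2
v_21 = -v_12    # Frame 2 relative to Frame 1, by the symmetry argument

# An object at rest in Frame 1 moves with v_12 relative to Frame 2.
v_O2 = velocity_in_frame2(0.0, v_12)

# Treating the origin of Frame 2 as the "object": it moves with v_21
# relative to Frame 1 and must be at rest relative to Frame 2.
v_22 = velocity_in_frame2(v_21, v_12)
print(v_O2, v_22)
```

The symmetry statement constrains only ##v_{12}## and ##v_{21}##; the object's two velocities are then fixed by where the object happens to sit, which is the boundary-condition point.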

Erland · Dec 23, 2015

strangerep said:

Let's take spatial isotropy as an example. Heuristically, this means there is no distinguished (or "preferred") direction in 3-space. In mathematically more precise terms, I'd express it as follows:

A theory is spatially isotropic iff every equation of the theory remains invariant when all quantities in the equation are substituted by their (respective) rotated counterparts. To understand this in more detail, let's consider an equation in a theory, written as $$F(t,{\bf x}, {\bf v}, ...) ~=~ 0 ~~~~~~ (1) ~,$$ where the bold symbols denote 3-vectors. Now consider an arbitrary 3D rotation matrix ##\bf M##, and substitute all 3-vectors in (1) by their rotated counterparts, i.e., ##{\bf Mx}, {\bf Mv}##, etc. So we have $$F(t,{\bf Mx}, {\bf Mv}, ...) ~=~ 0 ~~~~~~ (2) ~.$$ If (2) is equivalent to (1), i.e., if by purely algebraic manipulations we can make (2) look exactly like (1), then (1) is called a spatially isotropic equation. If every equation in the theory has this property, then the theory has the property of spatially isotropy.

Very good, but this then leads to the question: Which are the equations in the theory to which this applies? This should be specified in a stringent exposition.

Also I wonder, how can this be used to prove that, for example, distances in directions perpendicular to the direction of motion between the frames are not changed by the transformation? (In this case, only rotations fixing the direction of motion should be used above.)

And how is it used to prove that ##v_{12}=-v_{21}##?

Dale · Dec 23, 2015

Erland said:

Early in this thread, it was claimed that the relative speeds ##v_1## of Frame 1 w.r.t. Frame 2 and ##v_2## of Frame 2 w.r.t. Frame 1 must be equal because of "symmetry". But then, suppose an object is at rest w.r.t Frame 1. Then, it has speed ##v_1>0## w.r.t Frame 2. Why doesn't this violate symmetry?

The presence of the object breaks the symmetry. This is not because of asymmetry in the laws, but asymmetry in the boundary conditions.

Erland · Dec 23, 2015

DaleSpam said:

The presence of the object breaks the symmetry. This is not because of asymmetry in the laws, but asymmetry in the boundary conditions.

Can you, in a stringent way, state the law(s) / symmetry principle(s) used here, from which it follows that the relative speed between the frames is the same w.r.t both frames...?

PeroK · Dec 23, 2015

Erland said:

Very good, but this then leads to the question: Which are the equations in the theory to which this applies? This should be specified in a stringent exposition.

Also I wonder, how can this be used to prove that, for example, distances in directions perpendicular to the direction of motion between the frames are not changed by the transformation? (In this case, only rotations fixing the direction of motion should be used above.)

And how is it used to prove that ##v_{12}=-v_{21}##?

It seems to me that trying to answer his question requires a mixture of mathematical axioms and a physical explanation of why those axioms are adopted. For SR, for example, there is no problem in simply taking as an axiom the differential geometry of spacetime. Spatial symmetry and isotropy come along with the axiom.

But, if you are trying to justify why space is isotropic, then what do you take as your axioms?

Erland · Dec 23, 2015

PeroK said:

It seems to me that trying to answer his question requires a mixture of mathematical axioms and a physical explanation of why those axioms are adopted.

Yes, I think so too.

For SR, for example, there is no problem in simply taking as an axiom the differential geometry of spacetime. Spatial symmetry and isotropy come along with the axiom.

I don't understand what you mean. How does "the differential geometry of spacetime" imply the isotropy w.r.t the Lorentz transformation?

But, if you are trying to justify why space is isotropic, then what do you take as your axioms?

Isotropy here means, I suppose, that the Lorentz transformation is somehow invariant w.r.t. rotations. I don't think it can be really justified by more "basic" physical principles, it just seems obvious. The problem is to formulate it stringently and economically. I have been considering something like this:

We assume that the two frames, called the "stationary" and the "moving" frame, move w.r.t each other along their x- and x'-axes, respectively, and that L(0,0,0,0)=(0,0,0,0). Consider the events 1, 2, and 3, with coordinates (0,1,0,0), (0,0,1,0), and (0,-1,0,0), respectively, in the stationary frame. The spatial vectors (x_i,y_i,z_i) of these events (i=1,2,3) in the stationary frame are perpendicular to the direction of relative motion of the frames, they have the same lengths (1) and the angles between these vectors of events 1 and 2, and 2 and 3, respectively, are the same (90 degrees). Also, the events are simultaneous in the stationary frame. Let (x_i',y_i',z_i',t_i'), i=1,2,3, be the coordinates of the events 1,2, and 3, respectively, in the moving frame.
Given this, we now assume that the (squares of) the lengths (x_i')²+(y_i')²+(z_i')² are equal, that the x_i':s are equal, and that the t_i':s are equal, (i=1,2,3), and that the angles between the spatial vectors (x_i',y_i',z_i'), for the pairs 1,2 and 2,3 respectively, are equal.
All this is justified by "isotropy". Using this and linearity of the Lorentz transform L, it is not hard to prove that, after a suitable rotation of the spatial coordinates in the moving frame, there is a constant a such that if L(x,y,z,t)=(x',y',z',t'), then y'=ay and z'=az.

But I am not entirely satisfied with this. It feels a little bit too cumbersome. Can someone come up with something better, simpler, and more beautiful?

Dale · Dec 23, 2015

Erland said:

Can you, in a stringent way, state the law(s) / symmetry principle(s) used here, from which it follows that the relative speed between the frames is the same w.r.t both frames...?

In a stringent way, no. In a rough handwaving way:

Given inertial reference frames A and B with B moving at speed v wrt A then, by the principle of relativity there is nothing which distinguishes A from B, therefore by the principle of relativity we can exchange A and B and state that we have A moving at speed v wrt B.

PeterDonis · Dec 23, 2015

Erland said:

How does "the differential geometry of spacetime" imply the isotropy w.r.t the Lorentz transformation?

The "stringent" way of talking about isotropy is by means of Killing vector fields. Saying that a spacetime is "isotropic" means that, at every event in the spacetime, there is a 3-parameter group of spacelike Killing vector fields that have the commutation relations of the rotation group SO(3).

Erland said:

Isotropy here means, I suppose, that the Lorentz transformation is somehow invariant w.r.t. rotations.

The connection with Lorentz transformations, in the "stringent" way of talking that I describe above, is that the rotation group SO(3) is a subgroup of the Lorentz group SO(3, 1), which is a six-parameter group of Killing vector fields having three spacelike generators (generating the SO(3) subgroup) and three timelike generators (generating "boosts", which do not form a subgroup because the commutator of two boosts is a rotation--physically, this shows up as Thomas precession). So Lorentz transformations preserve the same invariants that are preserved by the spatial rotation group SO(3).
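The statement that the commutator of two boosts is a rotation can be checked directly on 4x4 generator matrices. The sketch below uses one common real-matrix convention acting on ##(t, x, y, z)## with ##c = 1##; the sign conventions and the helper names (`make`, `matmul`, `commutator`, `neg`) are assumptions for illustration, and other conventions differ by factors of ##i## or overall signs.

```python
def make(entries):
    """Build a 4x4 integer matrix from a {(row, col): value} dictionary."""
    m = [[0] * 4 for _ in range(4)]
    for (i, j), val in entries.items():
        m[i][j] = val
    return m

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def neg(a):
    return [[-a[i][j] for j in range(4)] for i in range(4)]

# Boost generators mix t with one spatial coordinate (symmetric matrices).
K_x = make({(0, 1): 1, (1, 0): 1})
K_y = make({(0, 2): 1, (2, 0): 1})

# Rotation generators mix two spatial coordinates (antisymmetric matrices).
J_x = make({(2, 3): -1, (3, 2): 1})
J_y = make({(1, 3): 1, (3, 1): -1})
J_z = make({(1, 2): -1, (2, 1): 1})

# Two boosts do not close among themselves: their commutator is a rotation.
print(commutator(K_x, K_y) == neg(J_z))
# Two rotations do close: their commutator is again a rotation.
print(commutator(J_x, J_y) == J_z)
```

In this convention the rotations form a closed set under commutation while the boosts do not, which is the algebraic content of the remark about Thomas precession.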

More info here:

https://en.wikipedia.org/wiki/Lorentz_group

Erland · Dec 23, 2015

PeterDonis, this seems quite advanced. I need to study this more to fully understand it. But isn't it much simpler in the SR case, where we have linearity and flatness?

Anyway, is it correct that in defining the Lorentz group, one assumes that the spacetime distance dx²+dy²+dz²-c²dt² is preserved by the Lorentz transformations?
I am not willing to take this as an axiom (for physics), except when this distance is 0, since this is just another way of stating the invariance of the light speed. Otherwise, it seems in no way obvious and no simple consequence of Einstein's postulates.

PeterDonis · Dec 23, 2015

Erland said:

isn't it much simpler in the SR case, where we have linearity and flatness?

No. The Killing vector fields I described are for the SR case, flat Minkowski spacetime.

Erland said:

is it correct that in defining the Lorentz group, one assumes that the spacetime distance dx²+dy²+dz²-c²dt² is preserved by the Lorentz transformations?

It is often presented that way, but IIRC you don't actually have to make that assumption. The Lorentz group can be defined by the Killing vector fields that generate it and their commutation relations. Showing that the group of transformations on spacetime that have those commutation relations preserves the spacetime interval can then, IIRC, be derived as a theorem.

As a warmup exercise, consider the simpler case of the rotation group SO(3). This group preserves ordinary Euclidean distances in Euclidean 3-space, i.e., it preserves ##ds^2 = dx^2 + dy^2 + dz^2##. But I don't think you have to assume that in defining the group; I think you can define the group by its commutation relations, and then derive as a consequence the fact that the transformations in the group preserve Euclidean distances.

Erland said:

I am not willing to take this as an axiom (for physics), except when this distance is 0, since this is just another way of stating the invariance of the light speed.

You can construct a basis for Minkowski spacetime using only null vectors, i.e., any vector can be expressed as a linear combination of null vectors, so invariance of null intervals is sufficient to establish invariance of all intervals.
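Here is a minimal Python sketch of one such null basis. It assumes ##c = 1##, signature ##(-,+,+,+)##, component order ##(t, x, y, z)##, and a particular choice of four null vectors picked only for illustration. It shows that the vectors are null and linearly independent, and that a timelike vector can be written as a combination of them. It does not by itself address whether invariance of null intervals fixes all other intervals, which is the point taken up in the later posts.

```python
def minkowski_norm2(v):
    """Squared interval with signature (-,+,+,+) and c = 1."""
    t, x, y, z = v
    return -t * t + x * x + y * y + z * z

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

# Four null vectors (one illustrative choice among many).
null_basis = [
    [1, 1, 0, 0],
    [1, -1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
]

all_null = all(minkowski_norm2(n) == 0 for n in null_basis)
independent = det(null_basis) != 0

# The unit timelike vector along t is half the sum of the first two.
time_axis = [0.5 * a + 0.5 * b for a, b in zip(null_basis[0], null_basis[1])]
print(all_null, independent, time_axis, minkowski_norm2(time_axis))
```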

Dale · Dec 23, 2015

PeterDonis said:

You can construct a basis for Minkowski spacetime using only null vectors, i.e., any vector can be expressed as a linear combination of null vectors, so invariance of null intervals is sufficient to establish invariance of all intervals.

Wow, excellent point.

How does that translate to curved spacetime where the basis vectors would only span the tangent space and not spacetime which is a manifold and not a vector space?

PeterDonis · Dec 23, 2015

DaleSpam said:

How does that translate to curved spacetime where the basis vectors would only span the tangent space and not spacetime which is a manifold and not a vector space?

The tangent space is still Minkowski spacetime, so as long as we are just looking at infinitesimal line elements, everything works the same way.

Once we go beyond infinitesimal line elements, there is, as you know, no such thing as a global "Lorentz transformation" in curved spacetime. Any coordinate transformation must still preserve arc lengths along curves (and other geometric invariants), but there are no global coordinate charts that have all the properties of inertial charts on Minkowski spacetime. So the viewpoint that Lorentz transformations are defined as the ones that preserve spacetime intervals doesn't really have an analogue, globally, in curved spacetime.

This, btw, is another reason to learn the definition of isotropy (and other symmetries) in terms of Killing vector fields; those definitions carry over just fine to curved spacetime. An isotropic curved spacetime is simply one which has a 3-parameter group of KVFs at every event with the commutation relations of SO(3). That's all there is to it.

Erland · Dec 23, 2015

PeterDonis said:

No. The Killing vector fields I described are for the SR case, flat Minkowski spacetime.

OK, sorry for my misunderstanding. I know I should study differential geometry more, but I have no good book about this available for the moment. I want to thank you all for trying to enlighten me. I don't mean to be difficult, but I do question things I don't find obvious.

It is often presented that way, but IIRC you don't actually have to make that assumption. The Lorentz group can be defined by the Killing vector fields that generate it and their commutation relations. Showing that the group of transformations on spacetime that have those commutation relations preserves the spacetime interval can then, IIRC, be derived as a theorem.

As a warmup exercise, consider the simpler case of the rotation group SO(3). This group preserves ordinary Euclidean distances in Euclidean 3-space, i.e., it preserves ##ds^2 = dx^2 + dy^2 + dz^2##. But I don't think you have to assume that in defining the group; I think you can define the group by its commutation relations, and then derive as a consequence the fact that the transformations in the group preserve Euclidean distances.

Maybe I should know this, but what do you mean by "commutation relations"? Do you simply mean relations of the type AB=BA, or A^-1B^-1AB=I, where A and B are rotations (in this case)? Or do you mean the set of all commutators: A^-1B^-1AB, or the subgroup generated by these (the commutator subgroup) given without explicitly displaying the A:s and B:s forming the commutators, or do you mean something else?

You can construct a basis for Minkowski spacetime using only null vectors, i.e., any vector can be expressed as a linear combination of null vectors, so invariance of null intervals is sufficient to establish invariance of all intervals.

I find it in no way obvious that just because the Minkowski lengths of x and y are preserved by L, the same is true for all linear combinations of x and y. It is of course true, but I know that just because I already know what L looks like.

strangerep · Dec 23, 2015

Erland said:

Very good, but this then leads to the question: Which are the equations in the theory to which this applies?

It must apply to every equation in the theory under consideration. The question then is how to specify a "theory".

In classical dynamics, this is done by choosing a specific Lagrangian, in the context of an Action principle and the calculus of variations. The Lagrangian then determines everything (including the equations of the theory) by extremizing the action. In practice, most people therefore just concentrate on symmetries of the Lagrangian. In this sense, one says that "the Lagrangian is the theory" (with the underlying framework of Newtonian space and time being understood, together with the calculus of variations).

In the foundations of relativity, one is interested in how events perceived by one observer ##O## may be reconciled with how another observer ##O'## perceives those same events. For the specific case where the observers are unaccelerated (i.e., inertial), and in motion relative to each other (with ##O'## having relative velocity ##v## wrt ##O##), and with the origins of their spatiotemporal reference frames coinciding, the equations of this "mini theory" are simply the coordinate transformations between the 2 coordinate systems. Assuming the transformations to be linear, and restricting ourselves to the 1+1D case, they are of the form $$t' ~=~ A(v) t + B(v) x ~,~~~~~~ x' = C(v)t + D(v) x ~, ~~~~~~ (1)$$where A,B,C,D are unknown functions to be determined.

One uses various physically-motivated criteria to restrict the form of A,B,C,D. One criterion is that ##v=0## corresponds to the identity transformation ##t'=t, x'=x##. Another is that $$\left. \frac{dx'}{dt'}\right|_0 ~=~ -v ~~~~ \mbox{if}~~ \left. \frac{dx}{dt}\right|_0 ~=~ 0 ~.$$ Another criterion (assumption) is that the transformations form a 1-parameter Lie (semi)group, with ##v## being the parameter. This implies (among other things) that 2 successive transformations with parameters ##v,v'## must commute, and the composition of the transformations must be equivalent to a single transformation with some parameter ##v'' = v''(v,v')##, to be determined.
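The two consequences of the (semi)group assumption can be illustrated with the known end result. The sketch below assumes the standard 1+1D Lorentz boost and the standard composition law ##v'' = (v + v')/(1 + vv'/c^2)##, neither of which has been derived at this point in the argument, so it only shows what the unknown ##v''(v,v')## turns out to be in the Lorentz case. Units with ##c = 1## and the helper names are assumptions.

```python
import math

C = 1.0  # speed of light in natural units (assumption for this sketch)

def boost(v):
    """2x2 matrix acting on (t, x) for the standard 1+1D Lorentz boost."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return [[g, -g * v / C**2],
            [-g * v, g]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

def composed_parameter(v, w):
    """Parameter of the single boost equivalent to boosting by v and by w."""
    return (v + w) / (1.0 + v * w / C**2)

v, w = 0.3, 0.5
one_order = matmul(boost(v), boost(w))
other_order = matmul(boost(w), boost(v))

print(close(one_order, other_order))                       # successive boosts commute
print(close(one_order, boost(composed_parameter(v, w))))   # and equal a single boost
```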

The spatial isotropy assumption plays a role as follows. In 1+1D, it means the equations of the theory must be invariant under a reversal of all spatial vectors. Performing this reversal on (1), i.e., substituting ##x \to -x##, ##x' \to -x'##, ##v \to -v##, we get $$t' ~=~ A(-v) t - B(-v) x ~,~~~~~~ -x' = C(-v)t - D(-v) x ~. ~~~~~~ (2)$$ The equations (2) must be equivalent to (1). Multiplying the second equation of (2) by ##-1## and comparing coefficients with (1), we find the constraints: $$A(v) = A(-v) ~,~~~~ -B(-v) = B(v) ~,~~~~ -C(-v) = C(v) ~,~~~~ D(v) = D(-v) ~.$$

Further, since ##v=0## must correspond to the identity transformation, we can (without loss of generality) substitute ##B(v) = v E(v)## and ##C(v) = v F(v)##, where ##E(v), F(v)## are 2 new unknown functions.

The benefit of the above is that we now have transformation equations where the unknown functions ##A,D,E,F## are all even in ##v##. This fact can be used in subsequent steps of the derivation (but forgive me if I don't reproduce the entire thing here).
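As a sanity check on the parity constraints, one can plug in the coefficients of the standard Lorentz boost, ##A = D = \gamma##, ##B = -\gamma v/c^2##, ##C = -\gamma v##. These are assumed here as the known final answer rather than derived, with ##c = 1## and hypothetical helper names. In that case ##A## and ##D## are even, ##B## and ##C## are odd, and ##E = B/v##, ##F = C/v## are even.

```python
import math

C_LIGHT = 1.0  # speed of light in natural units (assumption for this sketch)

def gamma(v):
    return 1.0 / math.sqrt(1.0 - (v / C_LIGHT) ** 2)

# Coefficients of t' = A t + B x, x' = C t + D x for the standard boost.
def A(v): return gamma(v)
def B(v): return -gamma(v) * v / C_LIGHT**2
def C(v): return -gamma(v) * v
def D(v): return gamma(v)

# Rescaled functions defined by B = v E and C = v F (valid for v != 0).
def E(v): return B(v) / v
def F(v): return C(v) / v

def is_even(f, v, tol=1e-12):
    return abs(f(v) - f(-v)) < tol

def is_odd(f, v, tol=1e-12):
    return abs(f(v) + f(-v)) < tol

v = 0.6
print([is_even(f, v) for f in (A, D, E, F)])  # the even functions
print([is_odd(f, v) for f in (B, C)])         # the odd functions
```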

Also I wonder, how can this be used to prove that, for example, distances in directions perpendicular to the direction of motion between the frames are not changed by the transformation? (In this case, only rotations fixing the direction of motion should be used above.)

Here, you're talking about the more general 1+3D case, and what you say cannot be proven. Instead, one assumes that a rotation of the transformed axes around the boost direction has been performed (if necessary) to ensure that they're aligned with the original axes. (Strictly speaking, one might also have to perform a parity reversal as well.)

And how is it used to prove that ##v_{12}=-v_{21}##?

Well, that requires a fair bit more work, applying the assumptions I outlined above to derive further constraints on the unknown functions. I'm happy to work through that detail if you wish -- provided we treat it like a homework exercise: i.e., you must do at least as much of the work as I do, and show it here on PF.

PeterDonis · Dec 23, 2015

Erland said:

what do you mean by "commutation relations"?

The best way to answer that is to give an example; SO(3) will do. This is, as I said, a 3-parameter group of transformations, which means that every rotation in the group can be built from three "basis" infinitesimal rotations, which are called "generators" (more precisely, every rotation is the exponential of a linear combination of the generators). If we call the three generators ##J^1##, ##J^2##, ##J^3##, then these three obey the following commutation relations (##[A, B]## is the commutator of the objects ##A## and ##B##, i.e., ##[A, B] = AB - BA##):

$$
[J^1, J^2] = i J^3
$$

$$
[J^2, J^3] = i J^1
$$

$$
[J^3, J^1] = i J^2
$$

These three relations can be expressed more compactly as

$$
[J^i, J^j] = i \epsilon^{ijk} J^k
$$

where ##\epsilon^{ijk}## is the completely antisymmetric symbol in 3 dimensions, i.e., ##\epsilon^{123} = 1##, and even permutations of the indexes have the same sign while odd ones have the opposite sign.
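These relations can be verified directly with explicit matrices. The sketch below assumes one particular choice of generators, the 3x3 Hermitian matrices ##(J^k)_{ab} = -i \epsilon^{kab}##. This representation is an assumption made for the example, not something stated above. The code uses 0-based indices, so `J[0]`, `J[1]`, `J[2]` stand for ##J^1, J^2, J^3##.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

# Assumed representation: (J^k)_{ab} = -i * eps(k, a, b).
J = [[[-1j * eps(k, a, b) for b in range(3)] for a in range(3)]
     for k in range(3)]

def matmul(X, Y):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(X[a][c] * Y[c][b] for c in range(3)) for b in range(3)]
            for a in range(3)]

def commutator(X, Y):
    """[X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[a][b] - YX[a][b] for b in range(3)] for a in range(3)]

def rhs(i, j):
    """i * sum_k eps(i, j, k) J^k, the right-hand side of the compact relation."""
    return [[sum(1j * eps(i, j, k) * J[k][a][b] for k in range(3))
             for b in range(3)] for a in range(3)]

# Check all nine index pairs (i, j), including the trivial i == j cases.
relations_hold = all(commutator(J[i], J[j]) == rhs(i, j)
                     for i in range(3) for j in range(3))
print(relations_hold)
```

All matrix entries here are 0 or ##\pm i##, so the arithmetic is exact and the comparison with `==` is safe.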

Erland said:

I find it in no way obvious that just because the Minkowski lengths of x and y are preserved by L, the same is true for all linear combinations of x and y.

A linear combination of the intervals x and y is another interval, and which interval it is is independent of the choice of coordinates. (This is just the Minkowski spacetime analogue of vector addition in ordinary Euclidean space.) So the Minkowski length of the linear combination can only depend on the lengths of x and y, and the coefficients of the linear combination. But the lengths of x and y, by hypothesis, are not changed by L, and the coefficients of the linear combination define which linear combination it is, so they can't be changed by L. Therefore, the Minkowski length of the linear combination can't be changed by L.

Erland · Dec 24, 2015

PeterDonis said:

A linear combination of the intervals x and y is another interval, and which interval it is is independent of the choice of coordinates. (This is just the Minkowski spacetime analogue of vector addition in ordinary Euclidean space.) So the Minkowski length of the linear combination can only depend on the lengths of x and y, and the coefficients of the linear combination.

This is wrong, both in Euclidean and Minkowski space. As a counterexample in 2d Minkowski space, with ##c=1##, take ##x=(3,0)## and ##y=(5,4)##. Their Minkowski lengths are ##l(x)=3^2-0^2=9## and ##l(y)=5^2-4^2=9##. If what you wrote is true, we would have ##l(x+x)=l(x+y)##, since ##x+x## is a linear combination of ##x## and ##x## with the same coefficients as in the linear combination ##x+y## of ##x## and ##y##, and ##x## and ##y## have the same Minkowski lengths. But ##l(x+x)=6^2-0^2=36## and ##l(x+y)=8^2-4^2=48##.
To find a counterexample in Euclidean space is even more trivial.
You might object that we must only take linear combinations of mutually orthogonal basis vectors. But you wrote in the earlier post that we should take a basis of lightlike vectors in Minkowski space. However, in 2d Minkowski space, and I am quite sure in 3d and 4d also, we cannot find two lightlike, linearly independent vectors which are orthogonal to each other. Two lightlike, linearly independent vectors in 2d must be two nonzero vectors ##(u,u)## and ##(v,-v)##. But they are not orthogonal to each other in the Minkowski sense, since ##uv-u(-v)=2uv\neq 0##.
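The arithmetic in this counterexample is easy to reproduce. A minimal sketch, assuming the convention used above (##c=1##, vectors written as ##(t,x)##, "length" ##l = t^2 - x^2##); the helper names and the sample values ##u=2##, ##v=7## are chosen only for illustration:

```python
def length(p):
    """Minkowski 'length' l(p) = t^2 - x^2 in 1+1D with c = 1."""
    t, x = p
    return t * t - x * x

def inner(p, q):
    """Minkowski inner product <p, q> = t_p t_q - x_p x_q."""
    return p[0] * q[0] - p[1] * q[1]

def add(p, q):
    """Componentwise sum of two vectors."""
    return (p[0] + q[0], p[1] + q[1])

x, y = (3, 0), (5, 4)
lengths = (length(x), length(y))                  # the two lengths agree
sums = (length(add(x, x)), length(add(x, y)))     # the two sums do not

# Two independent null directions (u, u) and (v, -v) have inner product 2uv.
u, v = 2, 7
null_inner = inner((u, u), (v, -v))
print(lengths, sums, null_inner, 2 * u * v)
```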

PeterDonis · Dec 24, 2015

Erland said:

take ##x=(3,0)## and ##y=(5,4)##.

These are not null vectors. Your original hypothesis was that invariance of the interval only applied to null intervals, so ##x## and ##y## should be null intervals.

Erland said:

If what you wrote is true, we would have ##l(x+x)=l(x+y)##

Of course this is trivially true for any pair of null intervals ##x## and ##y##, since their lengths are always zero. So what my statement really amounts to is saying that, for the case under discussion (##x## and ##y## null intervals), the length of the linear combination depends only on the coefficients of the linear combination.

Erland said:

##x+x## is a linear combination of ##x## and ##x## with the same coefficients as in the linear combination ##x+y## of ##x## and ##y##

Um, what? That doesn't even make sense. The coefficients of a linear combination of ##x## and ##y## are the numbers multiplying ##x## and ##y## in the linear combination, in order; i.e., for the linear combination ##ax + by##, the coefficients are ##a, b##. So ##x + x## has coefficients ##2, 0##, while ##x + y## has coefficients ##1, 1##. So these are two different linear combinations.

Erland said:

in 2d Minkowski space, and I am quite sure in 3d and 4d also, we cannot find two lightlike, linearly independent vectors which are orthogonal to each other.

Basis vectors don't have to be orthogonal, they just have to be linearly independent. Finding a basis of four null vectors just means finding four null vectors that are linearly independent. (You are correct that we can't find two lightlike, linearly independent null vectors that are orthogonal; two null vectors can only be orthogonal if they are collinear.) The basis vectors in the case you're used to, an inertial frame, are all orthogonal, but that does not mean basis vectors always have to be orthogonal; they just happen to be in that particular case.

Erland · Dec 24, 2015

PeterDonis said:

Of course this is trivially true for any pair of null intervals ##x## and ##y##, since their lengths are always zero. So what my statement really amounts to is saying that, for the case under discussion (##x## and ##y## null intervals), the length of the linear combination depends only on the coefficients of the linear combination.

##l(x)## here means the Minkowski length of ##x##. Let ##x=(1,1)## and ##y=(1,-1)##. These are both null vectors. Yet, ##l(x+x)=l(2,2)=0## while ##l(x+y)=l(2,0)=4\neq 0##.
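Both values follow from the standard expansion of a quadratic form. Writing ##\langle x, y \rangle## for the Minkowski inner product (notation introduced here for convenience), one has

$$
l(ax + by) ~=~ a^2\, l(x) + b^2\, l(y) + 2ab\, \langle x, y \rangle ~.
$$

For null ##x## and ##y## this reduces to ##2ab \langle x, y \rangle##, so the result depends on the inner product of the two vectors as well as on the coefficients. With ##x=(1,1)## and ##y=(1,-1)## we get ##\langle x, y \rangle = 1\cdot 1 - 1\cdot(-1) = 2## and hence ##l(x+y) = 2\cdot 2 = 4##, whereas ##\langle x, x \rangle = l(x) = 0## gives ##l(x+x)=0##.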

PeterDonis said:

Um, what? That doesn't even make sense. The coefficients of a linear combination of ##x## and ##y## are the numbers multiplying ##x## and ##y## in the linear combination, in order; i.e., for the linear combination ##ax + by##, the coefficients are ##a, b##. So ##x + x## has coefficients ##2, 0##, while ##x + y## has coefficients ##1, 1##. So these are two different linear combinations.

Linear combinations are defined for all finite sequences of vectors. It doesn't matter if some of them happen to be equal or not. So ##x+x## can be considered as a linear combination of the pair ##(x,x)##, with coefficients ##(1,1)## (another possibility is ##(0,2)##, and there are infinitely many more, since the pair ##(x,x)## is linearly dependent). Likewise ##x+y## can be considered as a linear combination of ##(x,y)## with coefficients ##(1,1)## (and infinitely many more possibilities, if ##(x,y)## is linearly dependent). So, the coefficients are the same in both cases.

PeterDonis said:

Basis vectors don't have to be orthogonal, they just have to be linearly independent. Finding a basis of four null vectors just means finding four null vectors that are linearly independent. (You are correct that we can't find two lightlike, linearly independent null vectors that are orthogonal; two null vectors can only be orthogonal if they are collinear.) The basis vectors in the case you're used to, an inertial frame, are all orthogonal, but that does not mean basis vectors always have to be orthogonal; they just happen to be in that particular case.

It is not clear to me what you actually claim. Do you claim that every linear transformation ##T## which maps (Minkowski) null vectors to null vectors preserves Minkowski length? If so, you are wrong.
For example, there is an invertible linear transformation ##T:\Bbb R^2 \to \Bbb R^2## such that ##T(1,1)=(2,2)## and ##T(1,-1)=(1,-1)##. We see easily that it maps null vectors to null vectors: ##T(x,x)=(2x,2x)## and ##T(x,-x)=(x,-x)##, and all null vectors are of these types.
Now ##(2,0)=(1,1)+(1,-1)## and ##T(2,0)=T(1,1)+T(1,-1)=(2,2)+(1,-1)=(3,1)##. But ##l(2,0)=4## and ##l(T(2,0))=l(3,1)=8##. Thus, ##T## does not preserve Minkowski lengths.
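This example can be checked mechanically. In the sketch below, the matrix `M` is the matrix of ##T## in the standard basis, worked out from the two conditions ##T(1,1)=(2,2)## and ##T(1,-1)=(1,-1)## by linearity. The variable names and the extra sample null vectors are assumptions of the example.

```python
from fractions import Fraction

def length(p):
    """Minkowski 'length' l(p) = t^2 - x^2 (1+1D, c = 1)."""
    return p[0] ** 2 - p[1] ** 2

# Matrix of T in the standard basis, fixed by T(1,1) = (2,2), T(1,-1) = (1,-1):
# T(1,0) = (3/2, 1/2) and T(0,1) = (1/2, 3/2).
half = Fraction(1, 2)
M = ((3 * half, half),
     (half, 3 * half))

def T(p):
    """Apply the linear map T to the vector p = (t, x)."""
    t, x = p
    return (M[0][0] * t + M[0][1] * x, M[1][0] * t + M[1][1] * x)

det = M[0][0] * M[1][1] - M[0][1] * M[1][0]   # nonzero, so T is invertible

# Null vectors stay null under T ...
null_preserved = all(length(T(p)) == 0
                     for p in [(1, 1), (1, -1), (5, 5), (-3, 3)])

# ... but the length of the timelike vector (2, 0) changes.
before, after = length((2, 0)), length(T((2, 0)))
print(det, null_preserved, before, after)
```

Exact rational arithmetic via `Fraction` avoids any floating-point doubt about whether a length is exactly zero.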

So, not all linear transformations which map null vectors to null vectors preserve Minkowski lengths. The question is then whether you are talking not about all linear transformations, but about some particular class of transformations. If so, which class and why? We cannot a priori choose the Lorentz transformations, for that would be circular.
