# I How to bridge the gap between these approaches to SR?

1. Aug 28, 2016

### leo.

For quite a long time now I've been having trouble bridging the gap between two different approaches to Special Relativity.

The first approach is the traditional one. It is the approach that Einstein presented in his paper and that is taught in most basic textbooks. In this approach, Special Relativity is introduced by means of two postulates (the relativity principle and the constancy of the speed of light), and the Lorentz transformations arise as the transformations between inertial reference frames that obey the two postulates.

In this approach, for example, it is the postulates that force us to consider that time may be reference-frame dependent, because the relativity of simultaneity is a consequence of the postulates alone.

On the other hand there is the modern approach. In the modern approach Special Relativity is introduced directly as a theory of the structure of spacetime. The spacetime metric is introduced right away and the Lorentz transformations are defined as the transformations of spacetime which keep the spacetime metric invariant.

In this approach, nothing leads up to the new spacetime structure or to the metric. It all appears out of thin air, just as much as the postulates do in the original approach.

Now I want to make something clear here. I do know that the only important thing is that both approaches work. One can look into the second approach, and just use it.

The point is that IMHO, there's a huge gap between the two approaches.

The original approach by Einstein is well motivated in his paper. Talking about Electrodynamics, Einstein is able to motivate quite well the need for his two postulates. It all flows quite smoothly.

The second approach is not that nice to grasp. It doesn't seem to flow smoothly from anything else. It is usually not motivated, it is just presented as: "the mathematical framework is this because it works" and nothing more is said.

Furthermore, it is not at all obvious that Einstein's postulates are totally equivalent to invariance of the spacetime metric.

In that sense, how can we bridge the gap between the two approaches? How can we present the second approach in a quite smooth and natural way as the first approach was presented by Einstein?

2. Aug 28, 2016

### QuantumQuest

The key here is the kind of geometry that is used. It is hyperbolic, and the metric reflects that. Einstein's postulates, in turn, lead to this same kind of geometry.

3. Aug 28, 2016

### Staff: Mentor

Einstein's postulates imply the Lorentz transforms, and the Lorentz transforms imply the invariance of the spacetime interval. Is that not sufficient?
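For concreteness, here is a sketch of the second implication in 1+1 dimensions. It assumes the standard boost along x with $\gamma = 1/\sqrt{1 - v^2/c^2}$; the symbols are the usual textbook ones, not anything specific to this thread.

```latex
\begin{aligned}
x' &= \gamma\,(x - v t), \qquad t' = \gamma\left(t - \frac{v x}{c^2}\right), \\
c^2 t'^2 - x'^2
  &= \gamma^2\left(c^2 t^2 - 2 v x t + \frac{v^2 x^2}{c^2}\right)
   - \gamma^2\left(x^2 - 2 v x t + v^2 t^2\right) \\
  &= \gamma^2\left(1 - \frac{v^2}{c^2}\right)\left(c^2 t^2 - x^2\right)
   = c^2 t^2 - x^2 .
\end{aligned}
```

The cross terms cancel and the remaining factor is exactly $1/\gamma^2$, so the interval has the same value in both frames.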

Einstein's approach only seems smooth and natural because, with the advantage of hindsight, we're able to avoid a number of wrong turns and unnecessary detours.
- Relativistic mass accepted then discarded.
- Applicability of SR to accelerated coordinate systems missed the first time around.
- Value of Minkowski model only appreciated after the fact.

The historical approach got us to the destination, but now that we know what the destination is, we can find easier paths to it. Once you know that the Minkowski spacetime geometry ensures the invariance of the speed of light, it's not unnatural to take the assumption that the universe is Minkowskian as a starting point equivalent to Einstein's postulates. Of course it would never occur to anyone to try that unprompted if they hadn't already seen the historically derived answer - but that's what hindsight is for.

4. Aug 28, 2016

### robphy

5. Aug 29, 2016

### strangerep

Heh, well, I'm from the camp that thinks the 2nd approach is "cheating".

Mathematicians tend to prefer the 2nd approach because it's more axiomatic, whereas the 1st approach is more physically motivated.

Once you have the symmetry group (i.e., Poincare transformations), the spacetime can be constructed as a homogeneous space for the group. (This is in fact an aspect of Klein's Erlangen program, which featured briefly in a recent Insights article.)

So I wouldn't even bother starting from Minkowski spacetime, but that's just my personal taste.

BTW, there's actually a "zero'th" approach which starts from the Relativity Principle, but does not assume the Light Principle. I.e., it doesn't assume invariance of the speed of light. A sketch of this approach appears in this Wiki page (section "From group postulates"). A full exposition of this derivation is rather long, however.

6. Aug 29, 2016

### vanhees71

It depends on whom you are teaching SR to. As a first encounter, I think the Einstein way is the best. You start with the tension between the classical (Newtonian) space-time structure and electromagnetism, via the observation that Maxwell's equations are not Galileo-invariant. Einstein's ingenious insight (after many years of thinking about the problem!) was to keep precisely the essence of what's right about both theories: From Newtonian mechanics he took the principle of inertia and the corresponding existence of a class of preferred reference frames, namely the inertial frames (even in GR this holds with the qualification that it becomes a local concept, and locality should be emphasized in relativistic physics anyway!). Then from Maxwell's equations and the assumption that they should be form-invariant under transformations from one inertial reference frame to another he concludes that the only thing which goes wrong in using the Galileo transformation is that the speed of light wouldn't be the same. So the essence of making the principle of special relativity and Maxwell's equations consistent is to postulate the constancy of the speed of light. No matter how the light source moves (with constant velocity with respect to an inertial frame of reference!) the speed of light (i.e., the phase velocity of electromagnetic waves in a vacuum) should be the same. From this he derives the Lorentz transformations. From this it is only a very small step to analyze the correct mathematical framework, i.e., Minkowski geometry (spacetime = pseudo-Euclidean affine manifold with a pseudo-metric of signature (1,3)).

The "axiomatic approach", starting right away from Minkowski space, is a shortcut for advanced students that emphasizes the math rather than the physics.

There's a 3rd approach, which is a bit lengthy compared to the other two but for me provides the most mathematical fun: You just postulate the principle of special relativity, the homogeneity of space and time (translation invariance) and isotropy of space for any inertial observer and assume that the space-time symmetry transformations build a continuously connected group. This leads to precisely two spacetime structures. You can guess easily, which two that might be ;-)).

7. Aug 29, 2016

### micromass

Staff Emeritus
Do you have any good reference on this?

8. Aug 29, 2016

### vanhees71

9. Aug 29, 2016

### soviet1100

I've only recently studied general relativity, so I might not be as qualified as the others to make a properly informed statement, but I'll add my own two cents.

I was first taught SR using lights and clocks and trains and paradoxes and Maxwell equations. It didn't work well, and it made ugly one of the most beautiful, if not the most beautiful, subjects in all of physics. The same goes for electrodynamics taught starting from Coulomb's law and the Biot-Savart law etc. and subsequently fixing one stupid mistake after the other (this is the approach followed in, e.g., Griffiths).

The approach that I would vehemently (militantly?) advocate is that taken in Landau & Lifshitz's Classical Theory of Fields (Vol. 2). First postulate the existence of a maximal velocity and the equivalence of inertial frames. These postulates aren't at all like the postulates of QM in that these are easily digestible and there exists a wealth of experimental support that one can easily find in papers and understand. Once you do that, you can systematically reason/derive the invariant interval between events. L&L do this on page 4; I'm not reproducing their argument here. This essentially fixes the 'metric', if you want to call it that. But you can argue the case for the invariant interval from purely physical perspectives. Once that's done, the only other "assumption" is the holy grail of physics, the principle of extremal action. The equivalence of inertial frames postulate imposes strong constraints (Lorentz invariance) on possible terms in Lagrangians. From these basic ingredients, one can derive all of SR (including Lorentz transformations, concepts of simultaneity etc.), E&M, and then beautifully generalise to GR. When this approach is used in GR, the beauty is breathtaking. Without mention of the mathematical formalism of manifolds or charts or covariant derivatives or geodesics, all these concepts just fall plumply into your lap from just the action principle. It's actually a great platform to intuitively understand some basic concepts of diff geo. It's a shame that the action principle is not emphasised enough, if at all, in undergraduate university courses.
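To give a flavour of how the action principle takes over once the interval is fixed, here is a sketch of the free-particle case as I understand the L&L treatment (one spatial dimension shown for brevity; the overall constant $-mc$ is fixed by matching the Newtonian limit).

```latex
\begin{aligned}
ds^2 &= c^2\,dt^2 - dx^2, \\
S &= -mc \int ds = -mc^2 \int \sqrt{1 - \frac{v^2}{c^2}}\;dt, \\
L &= -mc^2 \sqrt{1 - \frac{v^2}{c^2}} \;\approx\; -mc^2 + \tfrac{1}{2} m v^2 \quad (v \ll c), \\
p &= \frac{\partial L}{\partial v} = \frac{m v}{\sqrt{1 - v^2/c^2}} .
\end{aligned}
```

The only Lorentz-invariant quantity available along a worldline is $ds$, which is why the action has to be proportional to its integral; the familiar kinetic energy then appears as the first correction in $v/c$.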

{begin rant} This is how physics is supposed to be taught, from scratch: starting with simple comprehensible experimental data, translating them into mathematical requirements of a theory, then choosing a mathematical structure, building it, and then finally making testable predictions with it. Violations of this are most prominent in the teaching of QM, where everyone usually just postulates out of the blue that a system is described by a vector in a Hilbert space with dynamics governed by the Schrodinger equation etc. And the reason why such is the case is usually summed up in a single statement: "because it ultimately agrees with experiment". I doubt someone looked at an electron diffraction apparatus and then straight away concluded that the electron lives in a Hilbert space and wrote down the Schrodinger equation. One book that does it justice is of course Dirac's book (read in conjunction with Ballentine's chapter 3) {end rant}.

Last edited: Aug 29, 2016

10. Aug 30, 2016

### vanhees71

I couldn't agree more with what you said about relativity and electrodynamics. I don't understand why they still copy 19th-century textbooks concerning E&M. I think the final word on this kind of textbook has been written by Jackson and that's it. For the 21st century, LL is the right way, although one also needs to provide a step between the inductive way the subject is treated in the experimental-physics course and theory, including a thorough treatment of vector calculus, which is the main obstacle for undergraduate students learning the subject.

Concerning QT, I'd say you need a bit of historical introduction to motivate the Hilbert-space "axiomatics". It's too abstract to just be thrown in in the first lecture. That doesn't help to gain understanding of why this abstraction is necessary. I think the best approach is the heuristics via de Broglie-Einstein ("wave particle dualism") with the clear hint that this is not satisfactory, and then coming as quickly as possible to Dirac's representation-free formalism with abstract operators and vectors on Hilbert space (and no, the pure states are not represented by the vectors but by rays!).

11. Aug 30, 2016

### leo.

Thanks for all the answers. I agree that the first approach is more physically motivated and easier to understand from a physical standpoint. On the other hand, I'm starting to think that the best way to motivate the second approach would then be to see historically how people got there.

And this is indeed the sort of "gap" that I have the impression exists between the two approaches. I mean, in most of the presentations I've seen up to now things go more or less along these lines:
1. Einstein's original approach is presented (or just a recap of it), deriving the relativity of simultaneity, length contraction, time dilation, and hence, the Lorentz transformations (in truth just the boosts).
2. Without any motivation the spacetime metric (and hence the spacetime interval) is defined and it is shown to be able to classify the separation of events (timelike, spacelike or lightlike).
3. The spacetime interval is shown to be invariant under boosts.
4. It is immediately said that the Lorentz transformations can be just characterized directly as the transformations preserving the spacetime interval.
I mean, going from Classical Mechanics and Electrodynamics to Einstein's approach is, as I said, smooth and natural. Following Einstein's approach is also going in a quite natural direction.
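To make steps 2 and 3 of the list above concrete, here is a minimal numerical sketch in 1+1 dimensions with units where c = 1. The function names (`boost`, `interval`, `classify`) and the sample events are made up for illustration; the signature (+, -) is one common convention.

```python
import math

def boost(t, x, beta):
    """Boost along x with speed beta, in units where c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * (t - beta * x), gamma * (x - beta * t)

def interval(t, x):
    """Squared interval from the origin, signature (+, -)."""
    return t ** 2 - x ** 2

def classify(t, x, tol=1e-9):
    """Label the separation between the origin and the event (t, x)."""
    s2 = interval(t, x)
    if abs(s2) <= tol:
        return "lightlike"
    return "timelike" if s2 > 0 else "spacelike"

# Illustrative events: one of each kind.
events = [(2.0, 1.0), (1.0, 2.0), (1.0, 1.0)]
for t, x in events:
    tp, xp = boost(t, x, beta=0.6)
    # The interval, and therefore the classification, should agree
    # (up to rounding) before and after the boost.
    print(classify(t, x), classify(tp, xp), interval(t, x), interval(tp, xp))
```

This only checks the direction "boosts preserve the interval"; it says nothing yet about the converse, which is the part I find non-obvious.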

I feel that the gap appears in two distinct situations: when the metric tensor is presented (which is at first a quite strange thing to define and usually has no motivation presented) and later when it is said that the spacetime metric is enough to rebuild the entire theory.

I mean, that the spacetime metric is enough to rebuild the entire theory is not at all obvious. If we look at it carefully, it seems to be just another object defined in the theory that happens to be invariant under boosts. I can't see how it is obvious that the other way around (that the Lorentz transformations can be characterized as the ones keeping the metric fixed) works and rebuilds the theory.

Is there one way to fill those gaps? I mean, is it possible to:
1. Motivate the introduction of the spacetime metric properly, so that it doesn't appear to just come out of thin air.
2. Motivate and make it quite reasonable that defining the Lorentz transformations as the ones preserving the spacetime metric we get the same transformations arising from Einstein's postulates and we are able to rebuild the entire theory.
Of course, as I said, this motivation might be historical, might be just to say how historically people did this for the first time. Perhaps the only possible way to do it is to explain historically what was done, but I never found this data.

12. Aug 30, 2016

### robphy

Did you see my reference to the Bondi k-calculus in #4?
There the metric and the Lorentz transformation are not in the foreground of the discussion.
It is the principles of relativity, with focus on the radar method and the Doppler effect.
(Secretly, the approach is using the eigenbasis of the Lorentz Transformation.)

https://archive.org/details/RelativityCommonSense
https://en.wikipedia.org/wiki/Bondi_k-calculus
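A small sketch of the "eigenbasis" remark, assuming 1+1 dimensions and c = 1. The helper names are hypothetical; the point is only that the two light-ray directions are eigenvectors of the boost, with eigenvalues k and 1/k, where k is the Doppler factor used in the k-calculus.

```python
import math

def boost_matrix(beta):
    """Boost acting on (t, x) in units where c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[gamma, -gamma * beta],
            [-gamma * beta, gamma]]

def apply(m, vec):
    """Multiply a 2x2 matrix by a 2-component vector."""
    t, x = vec
    return (m[0][0] * t + m[0][1] * x, m[1][0] * t + m[1][1] * x)

beta = 0.6
k = math.sqrt((1 + beta) / (1 - beta))   # Bondi k-factor (Doppler factor)
L = boost_matrix(beta)

right_moving = (1.0, 1.0)    # light ray x = t
left_moving = (1.0, -1.0)    # light ray x = -t

# Each light-ray direction is only rescaled, never rotated:
# the right-moving one by 1/k, the left-moving one by k.
print(k, apply(L, right_moving), apply(L, left_moving))
```

Which direction gets k and which gets 1/k depends on the sign convention chosen for the boost.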

13. Aug 30, 2016

### vanhees71

How do you like my PF-FAQ article?

http://th.physik.uni-frankfurt.de/~hees/pf-faq/srt.pdf

There I use a mixture of Einstein's postulates and the argument via the pseudo-Euclidean affine structure of spacetime (it's in fact not a metric but a pseudometric in the sense of a pseudo-Riemannian manifold, of which Minkowski space, i.e., the pseudo-Euclidean affine space with signature (1,3), is a special case).

14. Aug 30, 2016

### samalkhaiat

15. Aug 31, 2016

### Battlemage!

I saw one book where it focused on general linear transformations, then it noted (if I recall correctly) that the Galilean transformation (with c=infinity) and Lorentz transformation are the only possible linear transformations that could work with the assumption of the relativity principle.

I think a fun approach would be to start with some general transformation, for example you might have x coordinates transform by x' = f(ax + bt), then show that the Galilean transformation is a special case of it, and then show that the Lorentz transformation is as well; and finally, show that the Galilean transformation is itself a special case of the Lorentz transformation.

At least with that approach you'd touch a little bit on the math structure, while not sacrificing the central component: the principle of relativity.

16. Aug 31, 2016

### robphy

...but that's not a true statement.
One could argue that a Galilean boost is a certain limit (but not a special case) of a Lorentz boost.

17. Aug 31, 2016

### Battlemage!

How so? Aren't all limits a special case of a particular function? Anyway, maybe I should update my book collection. This entire idea came from Rindler's Essential Relativity, which says on page 65 "Thus we are left with LT's having arbitrary positive "lightsquare" V^2, and GT's. The latter are really particular cases of the former (corresponding to V^2 = ∞) [...]", so just fyi I didn't just make that up.

Anyway, a limit is what I meant. You could write the Galilean transformation identically, with gamma as 1/sqrt(1-(v^2/c^2)), and then write "with c->infinity" afterwards and have the same thing. Kind of like saying some constant c is a polynomial but with x=0. I mean, P(x) = c for all x in R, c = constant.

So, my point is, you can have a generic coordinate transformation: x' = ƒ(v)*(ax + bt), t' = g(v)*(et + fx), and then simply replace the coefficients and/or functions with what is needed to get both the GT and LT.
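As a sketch of where such a derivation ends up (this is the standard one-parameter result as I understand it, written with Rindler's "lightsquare" as $V^2$; I'm not reproducing the book's steps):

```latex
x' = \frac{x - v t}{\sqrt{1 - v^2/V^2}}, \qquad
t' = \frac{t - v x / V^2}{\sqrt{1 - v^2/V^2}} .
```

Setting $V^2 = c^2$ gives the LT, and letting $V^2 \to \infty$ gives $x' = x - vt$, $t' = t$, i.e. the GT.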

My retelling of it, due to my lack of experience, was technically incorrect, however as I said I got the idea from a pretty well known book: an old version of Rindler's Essential Relativity, pages 64-67 and . It considered what SR would look like without the c speed limit postulate, and then went super abstract with generic transformations, pointing out that the principle of relativity is only preserved with a GT or LT transformation (and pointing out the differences in between universes with each one).

So ignoring the technical failures of yours truly, I still like the idea of approaching it from a "top down" view instead of inexplicably (in the eyes of the student) jumping from GT's to LT's. The first time I saw the GT equations I thought, "well that makes sense." First time I saw the LT equations I was like "wat..."

18. Aug 31, 2016

### robphy

My answer to "Aren't all limits a special case of a particular function?" is NO.
In 1+1 Minkowski spacetime, a Lorentz boost (in rectangular coordinates with natural units) is symmetric... a Galilean boost is not.

Yes, when you write the Lorentz Transformations in a certain way and then take the limit of this "maximum signal speed parameter" as it goes to infinity,
you can get to the Galilean limit. (If you write it a different way, you get the so-called [Lewis] Carroll limit. So, what you write down [to describe what limit is being taken] before taking the limit is important.)
These are discussed in http://scitation.aip.org/content/aip/journal/jmp/9/10/10.1063/1.1664490
"Possible Kinematics" - Henri Bacry and Jean-Marc Lévy-Leblond
J. Math. Phys. 9, 1605 (1968)

https://arxiv.org/abs/1402.0657 "Carroll versus Newton and Galilei: two dual non-Einsteinian concepts of time"
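Here is a rough numerical sketch of the points above, acting on (t, x) in 1+1 dimensions. The function names and the sample numbers are made up for illustration. The Galilean limit holds v fixed while c grows; the Carroll limit (as I understand it) holds b = v/c^2 fixed while c shrinks.

```python
import math

def lorentz_boost(v, c):
    """2x2 boost acting on (t, x), in ordinary units."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return [[g, -g * v / c ** 2],
            [-g * v, g]]

def is_symmetric(m, tol=1e-12):
    """True if the off-diagonal entries of a 2x2 matrix agree."""
    return abs(m[0][1] - m[1][0]) <= tol

# Natural units (c = 1): the Lorentz boost matrix is symmetric.
print(is_symmetric(lorentz_boost(0.6, 1.0)))

# A Galilean boost t' = t, x' = x - v t is not.
galilean = [[1.0, 0.0], [-0.6, 1.0]]
print(is_symmetric(galilean))

# Galilean limit: hold v fixed, let c become very large.
v = 0.6
near_galilean = lorentz_boost(v, 1e8)
print(near_galilean)   # close to [[1, 0], [-v, 1]]

# Carroll limit: hold b = v / c**2 fixed, let c become very small.
b = 0.6
c_small = 1e-8
near_carroll = lorentz_boost(b * c_small ** 2, c_small)
print(near_carroll)    # close to [[1, -b], [0, 1]]
```

The two limiting matrices are transposes of each other in shape: one mixes t into x, the other mixes x into t, which is why what is held fixed before taking the limit matters.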

In this regard, I also advocate a top-down approach... this approach is based upon the Cayley-Klein Geometries (which likely include [subsume] these "relativity without the speed of light postulate" approaches... including Rindler, Mermin (1984) http://scitation.aip.org/content/aapt/journal/ajp/52/2/10.1119/1.13917 and Ignatowski (1910) https://en.wikisource.org/wiki/Translation:Some_General_Remarks_on_the_Relativity_Principle .)

Of the nine Cayley-Klein geometries, three are well known to mathematicians (Euclidean, Elliptic, and Hyperbolic); the other six are three Lorentz-signature spacetimes (Minkowski, de Sitter, and anti-de Sitter) and their Galilean limits.

In this scheme, one can write a "unified", "generalized", (or "meta"-)expression for a "boost"... of which Minkowski, Galilean, Euclidean, deSitter, etc... are special but distinct (non-overlapping) cases. The appropriate "experiment" can then be used to pick out the case under study.
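A minimal sketch of what such a "meta" boost could look like for the three flat cases, assuming a single parameter E with E = 1 (Minkowski), E = 0 (Galilean) and E = -1 (Euclidean), and units where the maximum signal speed is 1. The function names and this particular parametrization are illustrative assumptions, not a standard API.

```python
import math

def gen_cos_sin(E, theta):
    """Generalized cosine/sine pair (C, S) satisfying C**2 - E*S**2 == 1."""
    if E > 0:
        r = math.sqrt(E)
        return math.cosh(r * theta), math.sinh(r * theta) / r
    if E < 0:
        r = math.sqrt(-E)
        return math.cos(r * theta), math.sin(r * theta) / r
    return 1.0, theta   # E == 0: the Galilean case

def gen_boost(E, theta, t, x):
    """Apply the generalized boost with parameter theta to the event (t, x)."""
    C, S = gen_cos_sin(E, theta)
    return C * t + E * S * x, S * t + C * x

def gen_interval(E, t, x):
    """The quadratic form preserved by gen_boost for this E."""
    return t ** 2 - E * x ** 2

for E in (1, 0, -1):
    tp, xp = gen_boost(E, 0.7, 2.0, 1.0)
    print(E, gen_interval(E, 2.0, 1.0), gen_interval(E, tp, xp))
```

For E = 1 this is a Lorentz boost with rapidity theta, for E = -1 an ordinary rotation by angle theta, and for E = 0 a Galilean boost with velocity theta that leaves t untouched.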

Last edited: Sep 2, 2016

19. Aug 31, 2016

### Battlemage!

I just wonder if there is a way to teach it this way without needing the student to understand more than the basics of linear algebra, because by the time a student usually starts getting into these types of non-euclidean geometries, they've already been acquainted with the basics of special relativity.

As an undergrad I have barely seen non-euclidean geometry, but I saw the basics of SR my first year. Of course it was inexplicable to me from a general math perspective. It made sense to talk about c as a speed limit and then derive the transformations from that, but at the time the only explanation was "well, we observe c=constant therefore SR." And then I've been told later on in GR that isn't strictly speaking true, since the notion of a speed is not easily defined in the first place. I just don't know if GR is where a student should be introduced to non-euclidean geometry. It's like saying "well, we kind of lied to you before but here is how things really are..."

Then again maybe that is just too much that early.

20. Aug 31, 2016

### robphy

One of my ideas is to start by slowly developing aspects of the non-euclidean geometry already inherent in a PHY 101 position vs time graph.
Although the simulation already starts at the "E=1" case [the punchline],

(Based on an earlier post of mine https://www.physicsforums.com/threads/minkowski-diagram.876359/#post-5503686 )

https://www.desmos.com/calculator/ti58l2sair [my t axis is horizontal ]

The Lorentz Transformations mentioned above preserve the unit hyperbola (that is, under such a Lorentz Transformation, the events on the future unit hyperbola are mapped to the same hyperbola), whose asymptotes are lightlike---that is to say, all frames of reference agree on the same value of the speed of light (maximum signal speed).

An inertial worldline (the t'-axis) drawn from the "center" of the hyperbola into the future light cone intersects the hyperbola at an event.
Think of this worldline as a radius vector from the center.
The tangent line to the hyperbola here is "spacetime-perpendicular" (perpendicular in Minkowski spacetime geometry) to the worldline.
Think "tangent to a circle is perpendicular to radius".
Physically, this tangent line determines the x'-axis.... That observer's "space" direction is perpendicular to that observer's "time" direction.
Additionally, this tangent line has the "same value of t' " for that observer---this is that observer's line of simultaneity.
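The last few statements can be checked numerically. This is a small sketch with c = 1 and made-up names: the event sits on the unit hyperbola at rapidity theta, and the tangent direction there is the derivative of the event with respect to theta.

```python
import math

def minkowski_dot(a, b):
    """Dot product of two (t, x) vectors with signature (+, -)."""
    return a[0] * b[0] - a[1] * b[1]

theta = 0.8                                       # rapidity of the observer
event = (math.cosh(theta), math.sinh(theta))      # on t^2 - x^2 = 1
tangent = (math.sinh(theta), math.cosh(theta))    # d(event)/d(theta)

# "Tangent is perpendicular to radius", in the Minkowski sense.
print(minkowski_dot(event, tangent))

beta = math.tanh(theta)
gamma = math.cosh(theta)

def t_prime(t, x):
    """Time coordinate assigned by the observer moving with speed beta."""
    return gamma * (t - beta * x)

# Every event on the tangent line gets the same t' (close to 1 here),
# so the tangent line is a line of simultaneity for this observer.
for lam in (-1.0, 0.0, 2.5):
    p = (event[0] + lam * tangent[0], event[1] + lam * tangent[1])
    print(lam, t_prime(*p))
```

Note that the ordinary Euclidean dot product of `event` and `tangent` is not zero, which is why the two lines do not look perpendicular on the diagram.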

Try changing the E-slider.... to the Euclidean case (when E= -1) and the Galilean case (when E=0).