# Can SR be derived from causality alone?

friend
I'm wondering if causality is enough to derive the Minkowski metric and the Lorentz transformations. It seems to me that, in order to ensure that some set of events maintains its causal relationships under a transformation to a moving frame of reference, there must be some restrictions on the metric in such a space and on the types of transformations of reference frames allowed in that space. Otherwise you could transform to some reference frame in which the causality is reversed. So is it a well-established part of relativity that the Minkowski metric and/or the Lorentz transformations can be derived solely from causality? If not, is anyone working on this? Thanks.

soothsayer
I don't see how causality alone could ensure Minkowski space and the Lorentz transformation. Under a Galilean transformation, time is absolute across all reference frames: if event A happened before event B in one frame (whether or not event A caused event B), then event A precedes event B in ALL Galilean frames, and by exactly the same amount of time as in the original frame. Essentially, Galilean relativity also guarantees causality, since time is treated separately from space and undergoes no transformation, so there's no need for the Lorentz transformation based on causality alone. At least, none that I can see.
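To make the point above concrete, here is a minimal sketch comparing how the two kinds of boost treat the time gap between a pair of events. The event coordinates, the units (c = 1), and the helper names are assumptions chosen purely for illustration; they do not come from any post in this thread.

```python
import math

def galilean_boost(t, x, v):
    """Galilean boost along x: time is untouched, position shifts by v*t."""
    return t, x - v * t

def lorentz_boost(t, x, v, c=1.0):
    """Lorentz boost along x with relative speed v, where |v| < c."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c**2), gamma * (x - v * t)

def time_gap(boost, a, b, v):
    """Time of event b minus time of event a, as seen in the boosted frame."""
    ta, _ = boost(a[0], a[1], v)
    tb, _ = boost(b[0], b[1], v)
    return tb - ta

# Hypothetical events given as (t, x), in units where c = 1.
A = (0.0, 0.0)
B_TIMELIKE = (2.0, 1.0)   # |dx| < dt: a signal from A could reach it
B_SPACELIKE = (1.0, 2.0)  # |dx| > dt: no signal at speed <= c connects it to A

for v in (-0.9, 0.0, 0.9):
    print(
        f"v={v:+.1f}",
        "Galilean gap:", time_gap(galilean_boost, A, B_SPACELIKE, v),
        "Lorentz timelike gap:", round(time_gap(lorentz_boost, A, B_TIMELIKE, v), 3),
        "Lorentz spacelike gap:", round(time_gap(lorentz_boost, A, B_SPACELIKE, v), 3),
    )
```

The Galilean gap never changes, so every pair of events keeps its order. Under the Lorentz boost only the spacelike pair can change sign, and those are exactly the pairs that cannot be linked by cause and effect.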

Staff Emeritus
Gold Member
You can do it in a system of postulates based on (1) causality, (2) relativity of simultaneity, and (3) symmetry of spacetime. You need 2 to rule out Galilean relativity, and 1 to rule out a theory in which a boost along the x-axis is simply a rotation of the x-t plane. Here is a treatment I wrote along these lines: http://lightandmatter.com/html_books/0sn/ch07/ch07.html This is not original with me. Here are some other references that take a similar approach:

W. v. Ignatowsky, Phys. Zeits. 11 (1911) 972

Rindler, Essential Relativity: Special, General, and Cosmological, 1979, p. 51

Palash B. Pal, "Nothing but Relativity," Eur. J. Phys. 24 (2003) 315-319, http://arxiv.org/abs/physics/0302045v1
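As a rough illustration of why both postulates (1) and (2) are needed, symmetry-based derivations of this kind are usually summarized as leaving a one-parameter family of boosts, labelled here by a constant K. The sketch below assumes that standard textbook form; the function names and the sample numbers are hypothetical and not taken from the references. K = 0 gives the Galilean case, K > 0 the Lorentz case with c = 1/sqrt(K), and K < 0 a rotation of the x-t plane.

```python
import math

def boost(t, x, v, K):
    """Boost from the assumed one-parameter family.

    K = 0 : Galilean.  K > 0 : Lorentz with c = 1/sqrt(K).
    K < 0 : rotation of the x-t plane (for K = -1, v is the tangent of the angle).
    """
    gamma = 1.0 / math.sqrt(1.0 - K * v * v)
    return gamma * (t - K * v * x), gamma * (x - v * t)

def boost_twice(event, v, K):
    """Apply the same boost two times in a row to an event (t, x)."""
    t, x = boost(event[0], event[1], v, K)
    return boost(t, x, v, K)

# The cause sits at the origin (0, 0), which every boost maps to itself.
# The effect happens later at the same place.
EFFECT = (1.0, 0.0)

for label, K, v in (("Galilean", 0.0, 2.0), ("Lorentz", 1.0, 0.9), ("rotation", -1.0, 2.0)):
    t_effect, _ = boost_twice(EFFECT, v, K)
    print(f"{label:9s} K={K:+.0f}: effect happens at t' = {t_effect:.3f}")
```

In the rotation case two successive boosts turn the x-t plane by more than 90 degrees, so the effect ends up at a negative time, before its cause. That is the possibility the causality postulate excludes, while relativity of simultaneity excludes K = 0.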

friend
> I don't see how causality alone could ensure Minkowski space and the Lorentz transformation. Under a Galilean transformation, time is absolute across all reference frames: if event A happened before event B in one frame (whether or not event A caused event B), then event A precedes event B in ALL Galilean frames, and by exactly the same amount of time as in the original frame. Essentially, Galilean relativity also guarantees causality, since time is treated separately from space and undergoes no transformation, so there's no need for the Lorentz transformation based on causality alone. At least, none that I can see.

Perhaps there is something about how the passage of time and lengths are compared. How do I in my frame of reference know how fast time passes for you in your frame? How do I know that a meter to me is a meter to you? To get that information, I'd need some channel of communication between us, which implies a chain of cause and effect from you to me.

TrickyDicky
E. C. Zeeman, "Causality Implies the Lorentz Group," J. Math. Phys. 5 (4), April 1964, pp. 490-493

soothsayer
> Perhaps there is something about how the passage of time and lengths are compared. How do I in my frame of reference know how fast time passes for you in your frame? How do I know that a meter to me is a meter to you? To get that information, I'd need some channel of communication between us, which implies a chain of cause and effect from you to me.

There would need to be a chain of cause and effect for information transfer between any two reference frames regardless of whether Galilean or Lorentz transformation is used. The fact that we need to exchange information causally in order to determine empirically what a meter is in each of our frames, and what a second is in each of our frames, is not exclusive to special relativity.

friend
> There would need to be a chain of cause and effect for information transfer between any two reference frames regardless of whether Galilean or Lorentz transformation is used. The fact that we need to exchange information causally in order to determine empirically what a meter is in each of our frames, and what a second is in each of our frames, is not exclusive to special relativity.

It seems the absolute time and space dimensions of the Galilean transformation are imposed and not derived. What information can we obtain from observation to show that this absolute coordinate system is real? It is not derived from observation, which relies on causation. And so it is not derived from causation.

Aimless
I remember hearing in a talk once upon a time that knowing the causal structure of spacetime was sufficient to give you the metric up to a (constant?) scale factor; however, I've been unable to track down a reference for that statement.

soothsayer
> It seems the absolute time and space dimensions of the Galilean transformation are imposed and not derived. What information can we obtain from observation to show that this absolute coordinate system is real? It is not derived from observation, which relies on causation. And so it is not derived from causation.

It's derived empirically, based on simple observations of the classical world perceived by Galileo and Newton, and from simple logic. The Lorentz transformation is imposed in the same way the Galilean transformation is imposed: they were both mathematical theories that needed empirical data and experimentation to support them.

Imagine we live in a world where the speed of light is not absolute, and Galilean transformations are physical reality at all relative velocities: where would causality be violated? What I'm saying is this: from sin²x + cos²x = 1 we can derive sin²x = 1 - cos²x, and vice versa. Here, by contrast, it's a one-way street. From causality alone, we can derive any number of transformations, none of which is preferred without further restrictions, as bcrowell stated earlier.

friend
I found the following paper on the arXiv:

http://arxiv.org/abs/1005.4172

Abstract:
"We present a novel derivation of special relativity based on the information physics of events comprising a causal set. We postulate that events are fundamental, and that some events have the potential to receive information about other events, but not vice versa. (This is causality) This leads to the concept of a partially-ordered set of events, which is called a causal set. Quantification proceeds by selecting two chains of coordinated events, each of which represents an observer, and assigning a valuation to each chain. Events can be projected onto each chain by identifying the earliest event on the chain that can be informed about the event. In this way, each event can be quantified by a pair of numbers, referred to a pair, that derives from the valuations on the chains. Pairs can be decomposed into a sum of symmetric and antisymmetric pairs, which correspond to time-like and space-like coordinates. From this pair, we derive a scalar measure and show that this is the Minkowski metric. The Lorentz transformations follow, as well as the fact that speed is a relevant quantity relating two inertial frames, and that there exists a maximal speed, which is invariant in all inertial frames. All results follow directly from the Event Postulate and the adopted quantification scheme."

When events are fundamental, and one event can influence another along a chain of events, this is a description of causality. The paper derives the Minkowski metric, the Lorentz transformations, and the speed of light, all from causality.

However, I'm not so sure about his method. When he says, "Events can be projected onto each chain by identifying the earliest event on the chain that can be informed about the event", this seems to already assume a Minkowski-like metric, right? Any help with these concepts would be appreciated.
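One possible reading of the abstract's "pairs" can be sketched numerically. The assumption here, which the abstract itself does not confirm, is that a pair (p, q) behaves like a pair of light-cone (radar) readings, that the scalar measure is the product p*q, and that a change of observer rescales p by a factor k and q by 1/k. All function names are hypothetical.

```python
import math

def decompose(p, q):
    """Split the pair (p, q) into a symmetric part (s, s) and an antisymmetric part (a, -a)."""
    s = (p + q) / 2.0  # time-like coordinate under this reading
    a = (p - q) / 2.0  # space-like coordinate under this reading
    return s, a

def scalar_measure(p, q):
    """Assumed scalar built from the pair; algebraically equal to s**2 - a**2."""
    return p * q

def rescale(p, q, k):
    """Assumed change of observer: stretch one valuation by k, shrink the other by 1/k."""
    return k * p, q / k

def k_factor(beta):
    """Rescaling factor corresponding to relative speed beta = v/c (Doppler-type factor)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))

p, q = 5.0, 2.0
s, a = decompose(p, q)
print("s, a:", s, a)
print("s^2 - a^2:", s * s - a * a, " p*q:", scalar_measure(p, q))

p2, q2 = rescale(p, q, k_factor(0.6))
print("after rescaling:", decompose(p2, q2), " p*q:", scalar_measure(p2, q2))
```

Under these assumptions the product p*q is unchanged by the rescaling and equals s² - a², which has the form of the Minkowski interval, and the rescaled (s, a) agree with a Lorentz boost at speed beta. Whether the paper's quantification scheme really reduces to this is exactly the open question in this post.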

> I found the following paper on the arXiv:
>
> http://arxiv.org/abs/1005.4172
>
> When events are fundamental, and one event can influence another along a chain of events, this is a description of causality. The paper derives the Minkowski metric, the Lorentz transformations, and the speed of light, all from causality.
>
> However, I'm not so sure about his method. When he says, "Events can be projected onto each chain by identifying the earliest event on the chain that can be informed about the event", this seems to already assume a Minkowski-like metric, right? Any help with these concepts would be appreciated.

This paper is not inconsistent with the claim that causality alone is not enough to derive SR. The key is that the derivation in this paper requires that events can be partially ordered but not totally ordered. This is just a disguised (and elegant!) way of incorporating bcrowell's axiom (2): relativity of simultaneity.

nitsuj
> You can do it in a system of postulates based on (1) causality, (2) relativity of simultaneity, and (3) symmetry of spacetime.

For me that sums it up (not sure why symmetry of spacetime as opposed to isotropic spacetime / spacetime interval ...). It appears to precede time/length measures ("mere*" observations), but introduces length/time measures to explain the dichotomy of spacetime-separated events & causal events.

*The comparisons between length/time measures seem moot to the event itself (causal or not). The interval is what matters here, i.e. isotropic spacetime. I don't think the "universe" cares if Jack & Diane measure time/length in different proportions; the only "concern" is the causal connection between them (and it's not important if there is no causal connection). Of course, observation is a causal connection, i.e. relativity of simultaneity is a human (conscious?) concern, but otherwise meaningless from a physical perspective. I'm thinking #2 could be dropped from the list, no?

Oops, I wasn't sure what was meant by the Minkowski metric. It looks like it's what describes spacetime from a length/time (measurement) perspective, so with that now understood, and answering my own question: of course #2 is required, since it implies measures, i.e. quantified comparisons.

> I'm thinking #2 could be dropped from the list, no?

Some form of it is needed to rule out Galilean relativity. Note that Galilean relativity = SR if c=∞. You need something extra to rule out c=∞.
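The statement that Galilean relativity is SR with c=∞ can be checked numerically. This is only a sketch with arbitrarily chosen sample values (the event, the relative speed v, and the list of c values are assumptions for illustration): it measures how far a Lorentz boost is from the Galilean one as c is made larger.

```python
import math

def lorentz_boost(t, x, v, c):
    """Lorentz boost along x with relative speed v and invariant speed c."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c**2), gamma * (x - v * t)

def galilean_boost(t, x, v):
    """Galilean boost along x: absolute time, shifted position."""
    return t, x - v * t

def deviation(c, event=(1.0, 3.0), v=0.5):
    """Largest coordinate difference between the Lorentz and Galilean images of an event."""
    tl, xl = lorentz_boost(event[0], event[1], v, c)
    tg, xg = galilean_boost(event[0], event[1], v)
    return max(abs(tl - tg), abs(xl - xg))

for c in (1.0, 10.0, 100.0, 1000.0):
    print(f"c = {c:7.1f}   deviation from Galilean = {deviation(c):.3e}")
```

The deviation shrinks roughly like 1/c², which is why causality by itself cannot pick out a finite c: every value of c, including the limit, keeps cause before effect.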

friend
What intrigues me is when he says, "Quantification proceeds by selecting two chains of coordinated events, each of which represents an observer, and assigning a valuation to each chain. Events can be projected onto each chain by identifying the earliest event on the chain that can be informed about the event."

These "chains of events" make me think of the paths of Feynman's path integral, or perhaps the path of least action in the action integral. Perhaps the two chains onto which he projects an unaffiliated event are slight deviations in a particular path in one of these integral formulations. And perhaps his projection procedure only ensures that causality is maintained for events that are between a path and its slight deviation. Then perhaps if alternative paths of causality are required along the points on some topology or manifold, a Minkowski metric is required.

nitsuj
> Some form of it is needed to rule out Galilean relativity. Note that Galilean relativity = SR if c=∞. You need something extra to rule out c=∞.

Ah okay, I was thinking that isotropic spacetime implies an interval, and that #2 introduces the concept of the two components of an interval. Now I see that an interval doesn't inherently mean invariance, which requires a finite length over time (and isotropic spacetime).

TrickyDicky
I think one would need first a good definition of causality, even if its meaning looks obvious.

Also, I consider Galilean transformations to be agnostic about causality. It is well known that classical mechanics is time-reversible; only by introducing an omniscient observer that prescribes absolute time is there causality. The only difference from the Lorentz case is that in Galilean relativity time is not a coordinate/dimension: we are dealing with Euclidean space and time as a parameter. In Minkowski space, time is a dimension, so causality is intrinsic to the spacetime structure, and for spacetimes one need not rule out c=∞ because it is implicit in the presence of a time dimension that c must be finite. Introducing observers in a Lorentzian spacetime then automatically leads to relativity of simultaneity.

So I'd say that causality implies the Lorentz group and, to answer the OP, I think causality is enough to derive the Minkowski metric and the Lorentz transformations, because it is the only way to introduce the time dimension and it doesn't need an external omniscient observer to prescribe it through a parameter.

soothsayer
> So I'd say that causality implies the Lorentz group and, to answer the OP, I think causality is enough to derive the Minkowski metric and the Lorentz transformations, because it is the only way to introduce the time dimension and it doesn't need an external omniscient observer to prescribe it through a parameter.

But that's exactly it, isn't it? Galilean relativity demands there is a preferred frame, an "external omniscient observer", if you will, and time is, accordingly, a parameter, not a dimension. This is on the same level as special relativity, which demands there is no preferred frame, and a time dimension follows accordingly from that. What makes either choice preferred from the standpoint of causality? Of course, today we know that special relativity is correct and that the Galilean idea of a preferred reference frame is silly, but that doesn't mean the latter can't preserve causality just as well as the former. As an earlier poster put it, yes, if we take c to be finite, we need Minkowski space and the Lorentz transformation to preserve causality (actually, I'm not certain there aren't still other options at that level), but the Galilean transformation is just a Lorentz transformation for c = ∞.

nitsuj
> I think one would need first a good definition of causality, even if its meaning looks obvious.
>
> Also, I consider Galilean transformations to be agnostic about causality. It is well known that classical mechanics is time-reversible; only by introducing an omniscient observer that prescribes absolute time is there causality. The only difference from the Lorentz case is that in Galilean relativity time is not a coordinate/dimension: we are dealing with Euclidean space and time as a parameter. In Minkowski space, time is a dimension, so causality is intrinsic to the spacetime structure, and for spacetimes one need not rule out c=∞ because it is implicit in the presence of a time dimension that c must be finite. Introducing observers in a Lorentzian spacetime then automatically leads to relativity of simultaneity.
>
> So I'd say that causality implies the Lorentz group and, to answer the OP, I think causality is enough to derive the Minkowski metric and the Lorentz transformations, because it is the only way to introduce the time dimension and it doesn't need an external omniscient observer to prescribe it through a parameter.

The time dimension is derived from c being finite, right? Said differently, a finite c defines the spatial/temporal dimensions.

Thinking of this more, I don't believe causality alone can "produce" metrics. It seems to be a purely physical concept (defining causality here), and "ignores" an observer perspective. I find it gets particularly confusing when introducing the "what's observed/measured is physical reality" school of thought.

Einstein's two SR postulates do "produce" the Minkowski metric (if I am understanding the Minkowski metric correctly).

soothsayer
Yeah, I think TrickyDicky has it a bit backward. The only difference between Galilean and Lorentz transformations is that c is infinite in the former, but finite in the latter. You don't derive a limit on c based on the presence of a time dimension, you infer the presence of a time dimension based on the fact that c is finite, which in turn yields Minkowski space.

TrickyDicky
> Thinking of this more, I don't believe causality alone can "produce" metrics. It seems to be a purely physical concept (defining causality here), and "ignores" an observer perspective. I find it gets particularly confusing when introducing the "what's observed/measured is physical reality" school of thought.

That is why I said a working definition of causality is needed here. Too many concepts are implicit in the word, and they could be different for different people.

> Einstein's two SR postulates do "produce" the Minkowski metric (if I am understanding the Minkowski metric correctly).

Although historically the two postulates came first, they are actually logically derived from Minkowski spacetime.

TrickyDicky
> Yeah, I think TrickyDicky has it a bit backward. The only difference between Galilean and Lorentz transformations is that c is infinite in the former, but finite in the latter. You don't derive a limit on c based on the presence of a time dimension, you infer the presence of a time dimension based on the fact that c is finite, which in turn yields Minkowski space.

I think this sequence is not the right one, but it is logical that a discussion on causality includes disagreements on what's backward and what's forward.

But it all depends on what you want to use as the initial postulate. I think people understand SR better if you start with Minkowski space:
Causality→Minkowski space→time dimension and finite c.

soothsayer
> I think this sequence is not the right one, but it is logical that a discussion on causality includes disagreements on what's backward and what's forward.

Ha!

> But it all depends on what you want to use as the initial postulate. I think people understand SR better if you start with Minkowski space:
> Causality→Minkowski space→time dimension and finite c.

I guess I just don't follow that progression very easily. I feel like the idea of a finite c that is constant in all reference frames is really where most educators start. From there you pick up relativity of simultaneity, Lorentz transformations, Minkowski space, and then discussions on why causality is definitely preserved in this space. It feels more like a check at the end of a theory rather than something you can use to arrive at ONE correct theory. At least, this is how I best understand it.

soothsayer
> Although historically the two postulates came first, they are actually logically derived from Minkowski spacetime.

And I feel as though the "causality postulate" is also logically derived from Minkowski space, not vice versa.

TrickyDicky
> I guess I just don't follow that progression very easily. I feel like the idea of a finite c that is constant in all reference frames is really where most educators start. From there you pick up relativity of simultaneity, Lorentz transformations, Minkowski space, and then discussions on why causality is definitely preserved in this space. It feels more like a check at the end of a theory rather than something you can use to arrive at ONE correct theory. At least, this is how I best understand it.

You are right that most educators go about it like that; see this for instance:
http://www.pantaneto.co.uk/issue33/henry.htm

> And I feel as though the "causality postulate" is also logically derived from Minkowski space, not vice versa.

Intuition might be misleading. I also find it more natural to think about causality starting from Minkowski space rather than the opposite, because we associate causality with Minkowski space (or Lorentzian manifolds in general). But you can also associate causality with other spaces, like Euclidean space and Galilean relativity plus a preferred frame that gives absolute time, or others, and all of those seem to require additional assumptions. To have causality without other assumptions, the simplest way is a space with a time dimension, that is to say, with a Lorentzian signature.

nitsuj
> Although historically the two postulates came first, they are actually logically derived from Minkowski spacetime.

I'd reiterate they are postulates; "assumptions" of our observations. The metric continues this into a mathematically usable/structured format. I don't know details or much really about the metric beyond it "describing" dimensions. The +++- thingy, I think.

But yeah, they can be derived from, or "produce", the said metric.

I guess the quoted statement isn't really stating anything, well besides the postulates came first, and are fundamental to the metric.

Finding this a neat topic to think about, causality and what it means. I'm caught up thinking of it being purely about observation. In the case of causality there is only one "true" order, despite observed non-congruent order of causal events in a hypothetical scenario.

soothsayer
> You are right that most educators go about it like that; see this for instance:
> http://www.pantaneto.co.uk/issue33/henry.htm

Totally agree, Minkowski space is the best way to go about teaching SR, but doesn't that require a higher level of mathematical skill than most students are capable of when they first need to learn SR?

And also, again, in this context, Minkowski space is not derived from anything, but rather postulated and then tested by seeing whether c is finite and constant, whether causality is preserved, etc. It's kind of working backwards, in a very intuitive and instructive way such as is often done in physics education, but working backwards nonetheless, and to me it doesn't constitute a formal derivation.

> Intuition might be misleading. I also find it more natural to think about causality starting from Minkowski space rather than the opposite, because we associate causality with Minkowski space (or Lorentzian manifolds in general). But you can also associate causality with other spaces, like Euclidean space and Galilean relativity plus a preferred frame that gives absolute time, or others, and all of those seem to require additional assumptions. To have causality without other assumptions, the simplest way is a space with a time dimension, that is to say, with a Lorentzian signature.

I see what you mean here: that causality follows much more simply from Minkowski space than from any other spaces. I might be inclined to agree to that.

soothsayer
> I'd reiterate they are postulates; "assumptions" of our observations. The metric continues this into a mathematically usable/structured format. I don't know details or much really about the metric beyond it "describing" dimensions. The +++- thingy, I think.

You got it. The time component of the metric has the opposite sign. A read through the link TrickyDicky posted might provide some interesting insight (the idea of time being measured in imaginary numbers, speaking colloquially). Basically, the negative sign distinguishes time from space because in our space dimensions you can travel backward and forward, but in the time dimension you cannot. Nevertheless, it IS a dimension that is used and wedded with space in SR and GR, so it is included in the metric, not merely an imaginary parameter, as was often previously thought.

nitsuj
> You got it. The time component of the metric has the opposite sign. A read through the link TrickyDicky posted might provide some interesting insight (the idea of time being measured in imaginary numbers, speaking colloquially). Basically, the negative sign distinguishes time from space because in our space dimensions you can travel backward and forward, but in the time dimension you cannot. Nevertheless, it IS a dimension that is used and wedded with space in SR and GR, so it is included in the metric, not merely an imaginary parameter, as was often previously thought.

Oh I've been down that road. I remember when, after a fair bit of reading about SR, I heard of these "imaginary numbers". I forget the details, but remember I "came to terms" with the term lol.

Same issue when I heard gravity was a "fictitious" force, but I accept it now too. But wow, talk about misleading :) I guess it's just the way I read the term: the effect isn't "fictitious", just compared to other forces it is. Perhaps "imposter force" :)

soothsayer
> Same issue when I heard gravity was a "fictitious" force, but I accept it now too. But wow, talk about misleading :)

Oh yeah, that was a weird one for me too, but it totally made sense when I heard it. Though, I've talked to some people here who tell me gravity isn't exactly the same as a pseudo force, mathematically, just that it could be thought of as one. Don't remember the distinction, but maybe I could dig through my old threads and find it. Wasn't too long ago...

soothsayer
Found it! Can't seem to quote it because all the LaTeX he used won't come out right, for whatever reason, but it had to do with the fact that fictitious forces could be transformed away, whereas gravity cannot.

friend
I previously mentioned the paper that tries to derive SR from causality alone:

http://arxiv.org/abs/1005.4172

But I had my doubts that it is valid, for it seems his projection procedure assumes a speed of light to begin with. His "projection" procedure depends on the angle at which events are projected onto a line of causality. Two unaffiliated events, depending on the angle at which they are projected onto a line, could be seen as time-like, space-like, or light-like. This makes his breakdown into symmetric and antisymmetric parts, from which he derives the SR metric, dependent on the angle of projection, which is just another way of specifying the speed of light. So his derivation really depends on the speed of light, just like other derivations.

Staff Emeritus
Gold Member
> And I feel as though the "causality postulate" is also logically derived from Minkowski space, not vice versa.

You can do it either way around: https://www.physicsforums.com/showthread.php?t=534862

Staff Emeritus
Gold Member
> I previously mentioned the paper that tries to derive SR from causality alone:
>
> http://arxiv.org/abs/1005.4172
>
> But I had my doubts that it is valid, for it seems his projection procedure assumes a speed of light to begin with. His "projection" procedure depends on the angle at which events are projected onto a line of causality. Two unaffiliated events, depending on the angle at which they are projected onto a line, could be seen as time-like, space-like, or light-like. This makes his breakdown into symmetric and antisymmetric parts, from which he derives the SR metric, dependent on the angle of projection, which is just another way of specifying the speed of light. So his derivation really depends on the speed of light, just like other derivations.

I don't see where they assume a speed of light. I don't think their use of the word "projection" implies any preexisting concept of angle. AFAICT the only notion of measure that they assume is that every observer has a clock that ticks once per event.

The whole treatment smells to me like a different presentation of what Geroch does in his cute and idiosyncratic popularization General Relativity from A to B. The five cases in figure 3 look to me like they probably map directly onto the cases that Geroch does.

If the whole thing isn't a complete swindle, then there has to be some point at which they rule out two possibilities: (1) that spacetime is Galilean, and (2) that spacetime isn't flat.

> Given any pair of events, it is not necessarily true that one can be informed about the other. In this case, we say that the events are incomparable and write A||C.

In Galilean relativity, A||C never occurs. I suppose they must assume somewhere that A||C sometimes occurs, and clearly this should be a postulate, but they not only fail to state it explicitly as a postulate, they seem to make it very difficult to tell at what point they've made use of it.
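A small sketch of what "incomparable" means in practice, under stated assumptions: events are given hypothetical (t, x) coordinates, "can be informed" is modelled as "reachable by a signal no faster than c", and the Galilean case is modelled as c = ∞ with instantaneous signals allowed. None of this is taken from the paper, which works with order relations rather than coordinates.

```python
import math

def can_inform(a, b, c):
    """True if a signal no faster than c can carry information from event a to event b."""
    (ta, xa), (tb, xb) = a, b
    dt = tb - ta
    if math.isinf(c):
        # Assumed Galilean limit: any event that is not later can inform b.
        return dt >= 0
    return dt >= 0 and abs(xb - xa) <= c * dt

def incomparable(a, b, c):
    """True if neither event can inform the other (written A||C in the quoted passage)."""
    return not can_inform(a, b, c) and not can_inform(b, a, c)

# Hypothetical events as (t, x).
A = (0.0, 0.0)
B = (2.0, 1.0)  # inside A's future light cone when c = 1
C = (1.0, 5.0)  # outside A's light cone when c = 1

print("finite c:   A||B =", incomparable(A, B, 1.0), "  A||C =", incomparable(A, C, 1.0))
print("infinite c: A||B =", incomparable(A, B, math.inf), "  A||C =", incomparable(A, C, math.inf))
```

With a finite c the pair A, C is incomparable, so the events form a partial order only. With c = ∞ no pair is ever incomparable, which is why the existence of incomparable events works as an extra postulate that rules out the Galilean case.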

Flatness likewise seems to be something that they either don't understand is a nontrivial assumption or are aware of but want to sweep under the rug.

Well, these are the issues I'd have complained about if I'd been asked to referee the paper...and it doesn't seem to have been published in a peer-reviewed journal, even though they posted it to the arXiv in 2010...

friend
> I don't see where they assume a speed of light. I don't think their use of the word "projection" implies any preexisting concept of angle. AFAICT the only notion of measure that they assume is that every observer has a clock that ticks once per event.

If you go to Fig 3, page 6, of http://arxiv.org/abs/1005.4172, the lower middle drawing has two events, yellow dots, 1 and 2, between two vertical lines which represent chains of events. They project a line from dot 1 through dot 2 to the right side chain and call these events light-like separated. However, it seems arbitrary to draw the line from dot 1 through dot 2. That depends on the angle of the line from dot 1. If the lines from the dots were drawn more horizontally, then event (dot) 1 would arrive at the right chain sooner than event 2. If the lines from the dots were drawn more vertically, event 2 would arrive at the right chain sooner than event 1. When the lines reach the chains depends on the vertical and horizontal distance of the dots to the chains and on the angle used to project them. It is not explained why these lines of projection have the same angle going left as going right, or why every dot has lines of projection of the same angle. It seems obvious that what is drawn are 2 dimensional light-cones from the dots, which already assumes the Minkowski-like metric and a speed of light. There is no new information in this paper that I can see.
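The dependence on the drawing angle described above can be put into numbers. This is only a sketch of that reading of the figure: it assumes the dots and chains have (t, x) coordinates and that a projection line has a slope set by a signal speed, which is precisely the assumption under dispute. The coordinates and the function name are hypothetical.

```python
def project(event, chain_x, signal_speed):
    """Earliest time on a static chain at position chain_x that can be informed about the event."""
    t, x = event
    return t + abs(chain_x - x) / signal_speed

# Hypothetical layout: two dots between a left chain at x = 0 and a right chain at x = 4.
DOT_1 = (0.0, 1.0)  # (t, x)
DOT_2 = (1.0, 2.0)
RIGHT_CHAIN_X = 4.0

for speed in (2.0, 1.0, 0.5):  # steeper slope in the drawing means a slower signal
    t1 = project(DOT_1, RIGHT_CHAIN_X, speed)
    t2 = project(DOT_2, RIGHT_CHAIN_X, speed)
    if t1 < t2:
        verdict = "dot 1 arrives first"
    elif t1 > t2:
        verdict = "dot 2 arrives first"
    else:
        verdict = "both arrive together (light-like)"
    print(f"signal speed {speed}: dot 1 -> {t1}, dot 2 -> {t2}: {verdict}")
```

Only one choice of slope makes the two dots land on the same event of the right chain. In this coordinate picture, calling them light-like separated amounts to fixing that slope.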

Staff Emeritus