# The Foundations of Relativity

1. Dec 20, 2005

### Oxymoron

I'm trying to understand more about the foundations upon which general relativity lies.
The first thing I need to clarify is the space in which GR works. I've read that GR merges time and space into spacetime, which to my understanding means that, when considered separately, time and space are invariant.
This spacetime must then be much like any other mathematical space, such as Euclidean space or Hilbert space, in that notions of distance must be defined.
In Euclidean space, $$\mathbb{R}^3$$, you have the Euclidean metric,
$$\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2$$
which obeys all the axioms of a metric and defines the distance between two points in the Euclidean space.
However, I have read that the metric in spacetime, the spacetime metric, is defined as
$$\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2 - c^2 \Delta t^2$$
This spacetime metric raises some issues with me. If this is the metric of spacetime then points in spacetime must be represented as a 4-vector. Are these points actually called events by physicists?
The Euclidean plane has structure, right, in the form of the Euclidean metric:
1. $$d(x,y) \geq 0$$
2. $$d(x,y) = 0 \Leftrightarrow x = y$$
3. $$d(x,y) = d(y,x)$$
4. $$d(x,z) \leq d(x,y) + d(y,z)$$
In a similar fashion I am interested to know what the structure on spacetime is? Are there similar axioms for the spacetime metric?
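
As a rough numerical illustration of the question above, the sketch below checks the four axioms for the Euclidean distance on a few sample points and then evaluates the spacetime interval for two pairs of distinct events. The function names and sample points are made up for the illustration, and units with $c = 1$ are assumed.

```python
import itertools
import math


def euclidean_distance(p, q):
    """Ordinary distance between two points of R^3."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def interval_squared(e1, e2, c=1.0):
    """Spacetime interval between events (x, y, z, t), signature (+,+,+,-)."""
    dx, dy, dz, dt = (b - a for a, b in zip(e1, e2))
    return dx**2 + dy**2 + dz**2 - (c * dt) ** 2


def check_metric_axioms(points, tol=1e-12):
    """Check the four metric axioms for euclidean_distance on sample points."""
    d = euclidean_distance
    for x, y, z in itertools.product(points, repeat=3):
        assert d(x, y) >= 0                      # axiom 1
        assert (d(x, y) == 0) == (x == y)        # axiom 2
        assert d(x, y) == d(y, x)                # axiom 3
        assert d(x, z) <= d(x, y) + d(y, z) + tol  # axiom 4
    return True


points = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0), (-3.0, 0.5, 4.0)]
print(check_metric_axioms(points))

origin = (0.0, 0.0, 0.0, 0.0)
# Two distinct events joined by a light ray: the interval vanishes.
print(interval_squared(origin, (1.0, 0.0, 0.0, 1.0)))
# Same place, later time: the interval is negative.
print(interval_squared(origin, (0.0, 0.0, 0.0, 2.0)))
```

The interval comes out zero or negative for distinct events, so it cannot satisfy axioms 1 and 2 as they stand; whatever structure spacetime has, it is not a metric in the metric-space sense.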

Last edited: Dec 20, 2005
2. Dec 20, 2005

### Oxymoron

Ok, maybe I jumped in too quickly for myself.
In Euclidean space, $\mathbb{R}^3$, you have the distance metric between any pair of points to be
$$\Delta s^2 = (\underline{x}_2 - \underline{x}_1)^2$$
Then you can construct Euclidean transformations, where the distance is invariant, resulting in what are called affine transformations. I learnt that transformations can be expressed by matrices. So, such a Euclidean transformation can be expressed like
$$\Delta \underline{x}' = A\Delta \underline{x}$$
So the 'transformed' vector equals the product of some matrix pertaining to the transformation and the original vector. Then by letting one of the pairs of points be the origin you can get
$$\underline{x}' = A\underline{x} + a$$
where $a$ is some constant.
Now the problem I have comes when I take this one step further. Consider Galilean space. I have found that it has some structure; three axioms:
1. Time Intervals $\Delta t = t_2 - t_1$
2. Spatial Distance $\Delta s = |\underline{x}_2 - \underline{x}_1|$
3. Motions of inertial particles (rectilinear motion) $\underline{x}(t) = \underline{u}t + \underline{x}_0$

And by a similar method done in Euclidean space you can see that

$$t' = t + a$$

$$\underline{x}' = A\underline{x} - \underline{v}t + \underline{b}$$

BUT, why don't Galilean transformations preserve the light cone at the origin? Why must we formulate the Minkowski metric to take care of this?

3. Dec 20, 2005

Staff Emeritus

The metric you cite is actually that of "flat" Minkowski space of special relativity, not the more general metric of Einstein's pseudo-Riemannian manifold. From the Minkowski metric you can define the causal structure ("light cones") of his spacetime and then define the Lorentz transformations as the set of linear transformations that leave that metric invariant.

To do the pseudo-Riemannian geometry you need a more general metric, a symmetric tensor $$g_{\mu\nu}, \mu, \nu = 0,...,3$$. From the derivatives of this wrt the coordinates you define the Levi-Civita Connection $$\Gamma^{\rho}_{\sigma\tau}$$, then the covariant derivative and finally the curvature tensor. This is not too difficult to grasp, but you really should read up on manifolds first.
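
For reference, the chain of definitions sketched above can be written out as follows. This uses one common sign and index convention for the curvature tensor (other texts differ), with $g^{\rho\lambda}$ the inverse metric, $\partial_\sigma$ the coordinate derivative, $V^\rho$ an arbitrary vector field, and summation over repeated indices:

$$\Gamma^{\rho}_{\sigma\tau} = \tfrac{1}{2}\, g^{\rho\lambda}\left(\partial_\sigma g_{\lambda\tau} + \partial_\tau g_{\lambda\sigma} - \partial_\lambda g_{\sigma\tau}\right)$$

$$\nabla_\sigma V^{\rho} = \partial_\sigma V^{\rho} + \Gamma^{\rho}_{\sigma\tau} V^{\tau}$$

$$R^{\rho}{}_{\sigma\mu\nu} = \partial_\mu \Gamma^{\rho}_{\nu\sigma} - \partial_\nu \Gamma^{\rho}_{\mu\sigma} + \Gamma^{\rho}_{\mu\lambda}\Gamma^{\lambda}_{\nu\sigma} - \Gamma^{\rho}_{\nu\lambda}\Gamma^{\lambda}_{\mu\sigma}$$

For the flat Minkowski metric in rectangular coordinates the components of $g_{\mu\nu}$ are constants, so every $\Gamma^{\rho}_{\sigma\tau}$ vanishes and so does the curvature.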

4. Dec 20, 2005

### robphy

Why? They just don't. The eigenvectors of the Galilean transformation are purely spatial vectors. For Galilean, this means that t=constant...which means that there is a notion of absolute time and absolute simultaneity. On the other hand, the eigenvectors of the Lorentz Transformation (which preserves the Minkowski metric) are lightlike vectors that are tangent to the light cone... which means that the speed of light is absolute.
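
A small numerical sketch can make this concrete. It assumes one space dimension, units with $c = 1$, and a boost speed of 0.6 picked arbitrarily; the function names are illustrative only. A displacement along a light ray ($\Delta x = c\,\Delta t$) is pushed off the light cone by a Galilean boost, while a Lorentz boost only rescales it.

```python
import math

C = 1.0  # units with c = 1 (an assumption of this sketch)


def interval_squared(dx, dt):
    """Delta s^2 = dx^2 - c^2 dt^2 in one space dimension."""
    return dx**2 - (C * dt) ** 2


def galilean_boost(dx, dt, v):
    """x' = x - v t, t' = t applied to a displacement (dx, dt)."""
    return dx - v * dt, dt


def lorentz_boost(dx, dt, v):
    """Standard Lorentz boost along x applied to a displacement (dx, dt)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * (dx - v * dt), gamma * (dt - v * dx / C**2)


v = 0.6
light_ray = (1.0, 1.0)      # dx = c * dt, so the interval is zero
purely_spatial = (1.0, 0.0)  # dt = 0

# The Galilean boost leaves the purely spatial displacement unchanged...
print(galilean_boost(*purely_spatial, v))
# ...but moves the light ray off the cone (the interval is no longer zero).
print(interval_squared(*galilean_boost(*light_ray, v)))

# The Lorentz boost maps the light ray to a multiple of itself,
# so it stays on the cone.
print(lorentz_boost(*light_ray, v))
print(interval_squared(*lorentz_boost(*light_ray, v)))
```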

Last edited: Dec 20, 2005
5. Dec 20, 2005

### Oxymoron

Ok, so obviously Galilean spacetime works fine for classical mechanics but not for special relativity. Am I correct to assume that the spacetime interval

$$\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2 - c^2 \Delta t^2$$

is the metric in Minkowski space? The way I see it is that this way of defining the distance between two events (points) in Minkowski space also incorporates the time interval between the two as well.

When events satisfy $\Delta s^2 = 0$ then we say that they are connected by a light signal?

It seems to me that by introducing the Minkowski metric we have combined space and time into spacetime and yet we have split spacetime, via the light cone from an event, into two pieces: one which is cut off from the event (by the absolute speed of light) and one which receives information about the event (from the transmission of the light signal).

A problem I am having is visualising the light cone. If I create an event, say I create a photon. Then the photon spreads out from where it is created in all three spatial directions at the speed of light. I'm not sure if this is right, but I tend to imagine the point of origin (the tip of the cone) as where I created the photon. In space the light spreads out as a sphere until it consumes the entire universe. Points in space outside the sphere do not know of what is happening inside the sphere until it reaches that point and that time is restricted by the speed of light. But this notion is spherical, not conical.

EDIT:
I'm babbling here. Surely light cones are 4-dimensional and I am trying to picture them as 3-dimensional objects on 2-dimensional paper, so clearly I'm getting the wrong impression. If anyone has a good description of them or knows of any I would be appreciative.

The eigenvectors of the Galilean transformation. The Galilean transformations are

$$\underline{x}' = A\underline{x} + \underline{a}(t)$$

right? I played around with Newton's first law of motion and came to the conclusion that A is a constant matrix, that is, its time-derivatives are all zero. What could this mean? Well, I read what you wrote and it makes sense - I hope! - the eigenvectors of the Galilean transformation are indeed spatial. Using the Galilean transformations on Newton's First Law of motion there is absolute time and simultaneity. If this is the case then how can an event behave as it does in special relativity? Is this why the Galilean transformation does not preserve the light cone?

If so, changing to the Lorentz transformation

$$\underline{x}' = L\underline{x} + \underline{a}(t)$$

My question at this stage is, does the Lorentz transformation preserve the light cone at any event?

Last edited: Dec 20, 2005
6. Dec 21, 2005

### Oxymoron

I finally discovered the true definition of Minkowski spacetime - in terms of the metric defined on it. Please correct me if any of my understanding is flawed. The metric is a nondegenerate, symmetric, indefinite, bilinear form:

$$\bold{g} : \mathscr{M} \times \mathscr{M} \rightarrow \mathbb{R}$$

Within Minkowski spacetime we may define the Lorentz inner product as being

$$g(v,w) := v\cdot w = v_1w_1 + v_2w_2 + v_3w_3 - v_4w_4$$

Once we include the inner product does Minkowski spacetime become an inner product space, you know, like Hilbert space?

Vectors in $\mathscr{M}$ are either spacelike, timelike, or null if the inner product is positive, negative, or zero respectively.

Now let's collect all the null vectors into one set and call it the null cone, or light cone:

$$C_N(x_0) = \{x\in\mathscr{M} \,:\, (x - x_0)\cdot(x - x_0) = 0\}$$

So the light cone is a surface in Minkowski space consisting of all those vectors whose inner product with $x_0$ is zero. The Lorentz inner product tells us the spacetime interval between two events right? If the inner product, $g(v,x_0)$ is zero then the spacetime interval between the event $v$ and $x_0$ is simultaneous. Therefore the light cone is the collection of events in spacetime which occur simultaneously to an observer at $x_0$.

If I spontaneously create a photon and call the event $x_0$ then the photon radiates outward in all spatial directions. I should be able to construct a light cone whose point is at $x_0$.

All the events inside the future light cone of the event happen AFTER the event - there is a strict time ordering of events inside the future cone. We can even say that the event $x_0$ can cause things to happen, and if it does the new event must be within the future cone. So events which are caused by $x_0$ must be reachable by information at speeds less than $c$. We then say that these new events are caused by $x_0$ and are timelike vectors.

If the new events inside the future light cone are all timelike then their Lorentz inner product must be negative. But what does this mean? If the Lorentz inner product is meant to tell us the spacetime interval between two events, and this interval is negative, does it mean that the new event (caused by $x_0$ and thus inside the future light cone) is further away from $x_0$ in time than in space?

7. Dec 21, 2005

### robphy

...where the rightmost expression uses rectangular components and the (+,+,+,-) signature convention.
...inner product with itself (that is, its square norm) is...
...the null vectors at a point (event) [say, $x_0$] of M.
It's not "all those vectors whose inner product with $x_0$"... but "all those vectors at event $x_0$ whose inner product with itself "...
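
To make the corrected definition concrete, here is a minimal sketch that classifies a vector by its inner product with itself, using rectangular components ordered $(x, y, z, t)$, the (+,+,+,-) signature and units with $c = 1$. The function names and the tolerance are assumptions of the sketch, not anything from the thread.

```python
def lorentz_inner(v, w):
    """g(v, w) = v1 w1 + v2 w2 + v3 w3 - v4 w4 (time in the fourth slot)."""
    return v[0] * w[0] + v[1] * w[1] + v[2] * w[2] - v[3] * w[3]


def classify(v, tol=1e-12):
    """Label a vector by the sign of its square norm g(v, v)."""
    q = lorentz_inner(v, v)
    if abs(q) <= tol:
        return "null"
    return "spacelike" if q > 0 else "timelike"


def on_light_cone(x, x0, tol=1e-12):
    """True if the displacement from event x0 to event x is a null vector."""
    displacement = tuple(a - b for a, b in zip(x, x0))
    return classify(displacement, tol) == "null"


print(classify((1.0, 0.0, 0.0, 0.0)))  # purely spatial displacement
print(classify((0.0, 0.0, 0.0, 1.0)))  # purely temporal displacement
print(classify((3.0, 4.0, 0.0, 5.0)))  # 9 + 16 - 25 = 0

x0 = (1.0, 1.0, 1.0, 1.0)
print(on_light_cone((4.0, 5.0, 1.0, 6.0), x0))
```

Note that `on_light_cone` takes two events and forms their displacement first; only that displacement, a vector, is fed to the inner product.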

Last edited: Dec 21, 2005
8. Dec 21, 2005

### robphy

There may be some confusion here. The arguments of the metric are vectors. So, when you write $g(v,x_0)$, then v and x_0 are vectors. However, x_0 is a point (event)... unless you somehow want to use something like a so-called position vector in spacetime... however, you now have to specify an origin of spacetime position... but then your inner product $g(v,x_0)$ depends on that choice of origin... probably not what you want.

"That two events are simultaneous" is an observer dependent concept. Using something similar to your notation $g(v,x_0)$, I can clarify this.
At event x_0, let v be a unit-timelike vector (representing an observer's 4-velocity). Suppose there is an event x_1 such that the displacement vector $\Delta x=x_1-x_0$ has inner-product zero with v: $g(v,x_1-x_0)=0$... then x_1 and x_0 are simultaneous according to the observer with 4-velocity v. Looking at this in more detail, the 4-vector $\Delta x=x_1-x_0$ must be a spacelike vector... to observer-v, it is in fact a purely spatial vector... in other words, it is orthogonal to v.

So, "Therefore the light cone is the collection of events in spacetime which occur simultaneously to an observer at $x_0$" is NOT correct.
In Minkowski space, the future light cone traces out the set of events that can be reached by a light signal sent from the vertex event (what it broadcasts)... the past light cone traces out the events that reach the vertex event by a light signal (what it literally "sees").
I would probably say "a flash of light"..."sending out many photons outward in all spatial directions".
I would say "can be influenced by $x_0$"... since an event P can be influenced by many events (not just $x_0$) in the past light cone of P.
Again... you need to distinguish points (events) from vectors.
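
The simultaneity criterion can be tried out numerically. The sketch below assumes components ordered $(x, y, z, t)$, the (+,+,+,-) signature, units with $c = 1$, and motion along the x axis only; the helper names and the sample events are invented for the illustration.

```python
import math


def lorentz_inner(v, w):
    """g(v, w) with signature (+,+,+,-) and time in the fourth slot."""
    return v[0] * w[0] + v[1] * w[1] + v[2] * w[2] - v[3] * w[3]


def four_velocity(beta):
    """Unit timelike 4-velocity of an observer moving along x at speed beta."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (gamma * beta, 0.0, 0.0, gamma)


def simultaneous(x0, x1, v, tol=1e-12):
    """Events x0 and x1 are simultaneous for observer v if g(v, x1 - x0) = 0."""
    delta = tuple(b - a for a, b in zip(x0, x1))
    return abs(lorentz_inner(v, delta)) <= tol


x0 = (0.0, 0.0, 0.0, 0.0)
x1 = (1.0, 0.0, 0.0, 0.5)  # spacelike separated from x0: 1 - 0.25 > 0

print(simultaneous(x0, x1, four_velocity(0.5)))  # observer moving at 0.5 c
print(simultaneous(x0, x1, four_velocity(0.0)))  # observer at rest
```

The same pair of events passes the test for the moving observer and fails it for the observer at rest, which is the observer dependence described above.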

9. Dec 21, 2005

### Oxymoron

So we must treat events differently to these timelike and spacelike vectors? Or is it simply that the two do not compute right in the metric? I'm confused as to why the event $x_0$ is not a vector in Minkowski space. Wait, maybe not. Elements of Minkowski space are events; vectors in Minkowski space are not events, right?

Is v inside the future light cone? Surely it must be outside the cone? In which case, no matter what the observer's 4-velocity is, so long as it is spacelike, the observer will see $x_0$ and $x_1$ simultaneously.

At this point I was about to ask "What if the observer was inside the cone, whose 4-velocity was purely time-like. Would the two events appear simultaneous now?". Surely, $\bold{g}(v,\Delta x)$ is still zero, so the events are on the surface of the light cone, but now the observer must see one event happen AFTER the other. If this is right, could you explain.

One extra question: what is $\eta_{ab}$, the thing which can be 1, -1, or 0 depending on a and b?

10. Dec 21, 2005

### dicerandom

Correct. Events are points in the Minkowski space, vectors (can) connect events.

v must be in the light cone, otherwise the velocity would be greater than that of light. In order for two events to be seen simultaneously they must have a purely spatial separation in the observer's reference frame; as robphy said, the vector connecting the events must be normal to the world line (the velocity vector) of the observer. If we restrict ourselves to 1D motion there will only be one velocity which will see two spacelike separated events as simultaneous. Events which are not spacelike separated cannot be seen as simultaneous by any observers.

I don't see how $\bold{g}(v,\Delta x)$ could be zero if the two events are on the lightcone. $\bold{g}(\Delta x,\Delta x)$ would be zero since they're lightlike separated, but v is a different vector.

$\eta_{\mu \nu}$ is the metric of flat Minkowski space. It's diagonal, with all 1's in the diagonal slots except for the -1 in the time position (either the upper left or bottom right, depending on which book you're looking at). Note that by convention Greek indices imply a range over all four dimensions whereas Roman indices imply a range over only the spatial dimensions.
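
Written out with the time slot placed last (the bottom-right choice, matching the ordering $v_1w_1 + v_2w_2 + v_3w_3 - v_4w_4$ used earlier in the thread, so here the indices run over 1 to 4), this reads:

$$\eta_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \qquad g(v,w) = \eta_{\mu\nu}\, v^{\mu} w^{\nu} = v^1w^1 + v^2w^2 + v^3w^3 - v^4w^4$$

So the entries are 1 or -1 when the two indices are equal and 0 when they differ.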

Last edited: Dec 21, 2005
11. Dec 21, 2005

### Oxymoron

Hmmm, I'm still confused.

If we have an event $x_0$ from which we construct a light cone then future events which occur on the surface of the light cone have Lorentz inner product zero. Future events which are inside the cone are connected by time-like vectors, and those outside are connected by space-like vectors.

Right so far? (probably not)

My question here is: how do I define an observer? Can he be anywhere - as in, inside the cone, outside, at the event? I mean, could an observer be at the very location of the event at the time it occurs? Or could he be in the event's future - perhaps on the surface of the light cone or inside it or outside it.

12. Dec 21, 2005

### robphy

An observer's 4-velocity v is a unit-timelike vector tangent to the observer's worldline. It points into the interior region enclosed by the future light cone (the "chronological future" of the vertex event). One can roughly interpret v as one unit of time along that observer's worldline. So, the 4-velocity v is never spacelike [and never null].
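
As a quick check of the "unit-timelike" statement, take rectangular components ordered $(x, y, z, t)$, the (+,+,+,-) signature and units with $c = 1$ (all assumptions of this aside). For an observer with ordinary velocity $\vec{\beta}$, $|\vec{\beta}| < 1$:

$$v = \gamma\,(\beta_x, \beta_y, \beta_z, 1), \qquad \gamma = \frac{1}{\sqrt{1 - |\vec{\beta}|^2}}$$

$$g(v,v) = \gamma^2\left(|\vec{\beta}|^2 - 1\right) = -1$$

The square norm is $-1$ for every allowed $\vec{\beta}$, and $\gamma$ diverges as $|\vec{\beta}| \to 1$, so no vector of this form lies on or outside the cone.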

13. Dec 21, 2005

### Oxymoron

So that is why the null vectors are sometimes called lightlike vectors!! Because if v were on the surface then v = c, and all of a sudden every event in the future is simultaneous to that observer. v cannot be outside the cone because then v > c.

14. Dec 21, 2005

### Oxymoron

If I cause an event $x_0$ and I am the observer at that point, can I safely say that there is a future event $x_1$ which lies on the light cone of the original event? To me, if there is such an event and $x_0$ causes that event to occur, then my velocity must be c. Since my 4-vector is timelike and if I observe $x_0$ causing $x_1$ (which is on the surface) then my velocity is c.

15. Dec 21, 2005

### dicerandom

Looks good so far

The observer can be anywhere, yes. However, what we generally do is define a worldline for an observer, i.e. a path that the observer will follow through spacetime. In simple cases the observer has constant velocity and the worldline is just a straight line; in more complicated situations the observer can undergo accelerations and the worldline can be curved. At any point along the worldline we define what is called the MCRF (Momentarily Comoving Reference Frame), which is a reference frame that is moving with uniform velocity equal to the instantaneous velocity of the observer. The observer's velocity vector is then the unit vector which points along the time axis of this reference frame, i.e. it is a vector of unit length which is tangent to the observer's worldline.

Right. I'd be careful about saying what happens when v=c though: technically the theory doesn't extend to that point, but if you look at the limiting behavior as v->c that is how it seems things would be.

There are in fact an infinite number of such events. Suppose that you're out in space with a flashlight, you turn your flashlight on and some time later a friend who is some distance away sees that you turned your light on. You turning your light on would be your event $x_0$ and your friend seeing the light would be an event $x_1$ which lies on the lightcone from $x_0$, yet you didn't have to move anywhere.

Last edited: Dec 21, 2005
16. Dec 21, 2005

### Oxymoron

...if you were moving at v = c, would you know instantly that your friend saw your light flash - instead of some time later? To me it seems that the time it takes for information to get around inside the light cone depends on the observer's velocity too?

Also, just say that you were that friend, waiting for me to flash the light. At time=0 you must clearly be outside the cone. Is this the reason why the light cone is hard to picture: in 3 dimensions you don't realize that the cone itself is 'expanding', because the time dimension along which it expands is compressed? An observer who at $t = t_1$ observes the initial event (and is therefore inside the cone at this time) is outside the cone initially.

17. Dec 21, 2005

### JesseM

You can't travel at v=c, and things that do move at c (like photons) do not have their own reference frame in relativity, so you can't say what things will look like from their point of view.
It's easier to visualize if you drop the number of dimensions by 1, so you have a 2D space and one time dimension. If you represent time as the vertical dimension, then any horizontal slice through this 3D spacetime will give you all of 2D space at a single instant in time. When an event happens, the light moves outward in an expanding 2D circle, so with time as the third dimension this looks like a cone, with a horizontal slice through the cone being the circle that light from the event has reached at a given time (see the illustration in the Wikipedia article here). Of course, in 3D space it would actually be an expanding sphere instead, and the "cone" would be a 4D one that we humans can't actually visualize.
Yes, if you picture each observer's worldline as a vertical line in this 3D spacetime (or a slanted line, if an observer is moving through space), then only when the second observer's worldline enters the cone does he see the event; at earlier time slices the expanding circle of light from the event has not reached him.
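
The picture of a worldline entering the cone can be sketched in a few lines. This assumes two space dimensions plus time, units with $c = 1$, a flash at the origin at $t = 0$, and a friend at rest 3 units away; the function name and the numbers are made up for the illustration.

```python
import math


def relative_to_future_cone(x, y, t, c=1.0):
    """Locate the event (x, y, t), t >= 0, relative to the future light cone
    of a flash emitted at the origin at t = 0."""
    distance = math.hypot(x, y)  # how far away the event is in space
    light_front = c * t          # radius of the expanding circle at time t
    if math.isclose(distance, light_front):
        return "on the cone"
    return "outside the cone" if distance > light_front else "inside the cone"


# The friend's worldline is a vertical line: fixed position, increasing t.
friend_x, friend_y = 3.0, 0.0
for t in [0.0, 1.0, 2.0, 3.0, 4.0]:
    print(t, relative_to_future_cone(friend_x, friend_y, t))
```

The friend is outside the cone for $t < 3$, sees the flash at $t = 3$ when the worldline crosses the cone, and is inside it afterwards.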

18. Dec 21, 2005

### Oxymoron

I was just wondering something, please forgive me as it's a little off-topic. Photons are light-like. Now, this may be crazy, but those theoretical particles, tachyons, whose speeds are greater than c: do they behave as spacelike vectors? Also, if a tachyon did exist, would its speed be infinite, not just "greater than c"?

I'll get back on topic next post.

19. Dec 23, 2005

### Oxymoron

Can somebody explain the need for contravariant tensors and covariant tensors? I mean, at first glance, the only difference I see between the two is that the indices are swapped around. Does it actually make a physical difference whether the indices are subscripted or superscripted?

20. Dec 23, 2005

### robphy

Many physical quantities are naturally described ("born", if you will) by (say) contravariant tensors... many others by covariant... and the rest, mixed. At an abstract level, it is usually the geometry of the mathematical model of the physical quantity that dictates the type.

For example, the unit-4-velocity is a vector [a contravariant tensor] tangent to the worldline. The electromagnetic field is a 2-form [a totally antisymmetric covariant tensor]...which can be written as the curl of a potential. When there is a nondegenerate metric around, one can do index gymnastics and raise and lower indices.... however, one should really be aware of the natural description of the quantity... or else its physical meaning could be obscured in all of the shuffling.
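
The "index gymnastics" can be illustrated in the simplest setting. The sketch assumes flat Minkowski space in rectangular coordinates with components ordered $(x, y, z, t)$, the (+,+,+,-) signature and $c = 1$; the names `ETA`, `lower`, `raise_index` and `contract` are invented here.

```python
# Minkowski metric eta_{mu nu}, time in the last slot.
ETA = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, -1],
]


def lower(v_up):
    """v_mu = eta_{mu nu} v^nu: turn a contravariant vector into a covector."""
    return [sum(ETA[m][n] * v_up[n] for n in range(4)) for m in range(4)]


def raise_index(v_down):
    """v^mu = eta^{mu nu} v_nu; for this metric the inverse equals ETA itself."""
    return [sum(ETA[m][n] * v_down[n] for n in range(4)) for m in range(4)]


def contract(covector, vector):
    """The scalar w_mu v^mu, which needs no metric once the types match."""
    return sum(a * b for a, b in zip(covector, vector))


# 4-velocity for speed 0.6 along x: gamma = 1.25, gamma * beta = 0.75.
v_up = [0.75, 0.0, 0.0, 1.25]
v_down = lower(v_up)

print(v_down)                  # only the time component changes sign
print(contract(v_down, v_up))  # g(v, v) for a unit timelike vector
print(raise_index(v_down) == v_up)
```

Here the two kinds of components differ only by a sign in the time slot, which is why the distinction is easy to overlook; with a general $g_{\mu\nu}$ the lowered components can differ much more from the raised ones.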

In Euclidean space, the simplicity of the metric and volume-form can sometimes blur the distinction among various "directional quantities"... so that we get away with thinking of a lot of these quantities as simple "vectors". For example, the cross-product of two vectors is not really a vector...without the metric and the volume form. A physical example: the electric field and the magnetic field are not fundamentally contravariant vectors.