Understanding the Structure and Transformation of Tensors in Spacetime

Oxymoron · Dec 20, 2005

I'm trying to understand more about the foundations upon which general relativity rests.
The first thing I need to clarify is the space in which GR works. I've read that GR merges time and space into spacetime, which to my understanding means that, when considered separately, time and space are invariant.
This spacetime must then be much like any other mathematical space, such as Euclidean space or Hilbert space, in that notions of distance must be defined.
In Euclidean space, [tex]\mathbb{R}^3[/tex], you have the Euclidean metric,
[tex]\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2[/tex]
which obeys all the axioms of a metric and defines the distance between two points in the Euclidean space.
However, I have read, that the metric in spacetime, the spacetime metric, is defined as
[tex]\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2 - c^2 \Delta t^2[/tex]
This spacetime metric raises some issues with me. If this is the metric of spacetime then points in spacetime must be represented as a 4-vector. Are these points actually called events by physicists?
The Euclidean plane has structure, right, in the form of the Euclidean metric:
1. [tex]d(x,y) \geq 0[/tex]
2. [tex]d(x,y) = 0 \Leftrightarrow x = y[/tex]
3. [tex]d(x,y) = d(y,x)[/tex]
4. [tex]d(x,z) \leq d(x,y) + d(y,z)[/tex]
In a similar fashion I am interested to know what the structure on spacetime is? Are there similar axioms for the spacetime metric?

Oxymoron · Dec 20, 2005

Ok, maybe I jumped in too quickly for myself.
In Euclidean space, [itex]\mathbb{R}^3[/itex], the distance between any pair of points is given by
[tex]\Delta s^2 = (\underline{x}_2 - \underline{x}_1)^2[/tex]
Then you can construct Euclidean transformations, where the distance is invariant, resulting in what are called affine transformations. I learned that transformations can be expressed by matrices. So, such a Euclidean transformation can be expressed like
[tex]\Delta \underline{x}' = A\Delta \underline{x}[/tex]
So the 'transformed' vector equals the product of some matrix pertaining to the transformation and the original vector. Then by letting one of the pairs of points be the origin you can get
[tex]\underline{x}' = A\underline{x} + a[/tex]
where [itex]a[/itex] is some constant vector.
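
A minimal numerical sketch of the statement above, assuming that the matrix A is a rotation (distance is preserved when A is orthogonal, not for an arbitrary matrix). The function names, the angle, and the sample points are hypothetical choices made for illustration only.

```python
import math

def rotate_z(theta):
    """Rotation about the z-axis by angle theta (an orthogonal matrix A)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def affine(A, a, x):
    """Apply x' = A x + a to a 3-component point x."""
    return [sum(A[i][j] * x[j] for j in range(3)) + a[i] for i in range(3)]

def distance(p, q):
    """Euclidean distance between points p and q."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

A = rotate_z(0.7)            # hypothetical rotation angle
a = [1.0, -2.0, 0.5]         # hypothetical translation vector
p, q = [1.0, 2.0, 3.0], [-1.0, 0.0, 4.0]

d_before = distance(p, q)
d_after = distance(affine(A, a, p), affine(A, a, q))
print(d_before, d_after)     # the two values agree up to rounding
```

The translation a drops out of the difference of two points, which is why only A appears in the transformation of [itex]\Delta \underline{x}[/itex].
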
Now the problem I have comes when I take this one step further. Consider Galilean space. I have found that it has some structure; three axioms:
1. Time Intervals [itex]\Delta t = t_2 - t_1[/itex]
2. Spatial Distance [itex]\Delta s = |\underline{x}_2 - \underline{x}_1|[/itex]
3. Motions of inertial particles (rectilinear motion) [itex]\underline{x}(t) = \underline{u}t + \underline{x}_0[/itex]

And by a similar method done in Euclidean space you can see that

[tex]t' = t + a[/tex]

[tex]\underline{x}' = A\underline{x} - \underline{v}t + \underline{b}[/tex]

BUT, why don't Galilean transformations preserve the light cone at the origin? Why must we formulate the Minkowski metric to take care of this?

selfAdjoint · Dec 20, 2005

Oxymoron said:

I'm trying to understand more about the foundations upon which general relativity rests.
The first thing I need to clarify is the space in which GR works. I've read that GR merges time and space into spacetime, which to my understanding means that, when considered separately, time and space are invariant.
This spacetime must then be much like any other mathematical space, such as Euclidean space or Hilbert space, in that notions of distance must be defined.
In Euclidean space, [tex]\mathbb{R}^3[/tex], you have the Euclidean metric,
[tex]\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2[/tex]
which obeys all the axioms of a metric and defines the distance between two points in the Euclidean space.
However, I have read, that the metric in spacetime, the spacetime metric, is defined as
[tex]\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2 - c^2 \Delta t^2[/tex]
This spacetime metric raises some issues with me. If this is the metric of spacetime then points in spacetime must be represented as a 4-vector. Are these points actually called events by physicists?
The Euclidean plane has structure, right, in the form of the Euclidean metric:
1. [tex]d(x,y) \geq 0[/tex]
2. [tex]d(x,y) = 0 \Leftrightarrow x = y[/tex]
3. [tex]d(x,y) = d(y,x)[/tex]
4. [tex]d(x,z) \leq d(x,y) + d(y,z)[/tex]
In a similar fashion I am interested to know what the structure on spacetime is? Are there similar axioms for the spacetime metric?

The metric you cite is actually that of "flat" Minkowski space of special relativity, not the more general metric of Einstein's pseudo-Riemannian manifold. From the Minkowski metric you can define the causal structure ("light cones") of his spacetime and then define the Lorentz transformations as the set of linear transformations that leave that metric invariant.
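
As a rough numerical check of that last sentence, the sketch below applies a Lorentz boost in one space dimension and confirms that the interval is unchanged. It assumes units with c = 1 and displacements written as (dt, dx); the helper names and the boost speed 0.6 are illustrative, not taken from the thread.

```python
import math

def boost(beta):
    """Lorentz boost along x with speed beta (c = 1), acting on columns (t, x)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta],
            [-gamma * beta, gamma]]

def apply(M, e):
    """Multiply the 2x2 matrix M by the column e = (t, x)."""
    return [M[0][0] * e[0] + M[0][1] * e[1],
            M[1][0] * e[0] + M[1][1] * e[1]]

def interval(e):
    """Delta s^2 = Delta x^2 - Delta t^2 for a displacement e = (dt, dx)."""
    dt, dx = e
    return dx * dx - dt * dt

L = boost(0.6)
for e in ([2.0, 1.0], [1.0, 1.0], [1.0, 3.0]):   # timelike, null, spacelike
    print(interval(e), interval(apply(L, e)))    # each pair agrees up to rounding
```
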

To do the pseudo-Riemannian geometry you need a more general metric, a symmetric tensor [tex]g_{\mu\nu}, \mu, \nu = 0,...,3[/tex]. From the derivatives of this with respect to the coordinates you define the Levi-Civita connection [tex]\Gamma^{\rho}_{\sigma\tau}[/tex], then the covariant derivative and finally the curvature tensor. This is not too difficult to grasp, but you really should read up on manifolds first.
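
For reference, the chain of definitions described above can be written out as follows. This is the standard textbook form, given here as a sketch; sign and index-ordering conventions for the curvature tensor differ between books, so treat the last line as one common choice rather than the only one.

```latex
% Levi-Civita connection from first derivatives of the metric
\Gamma^{\rho}_{\sigma\tau}
  = \tfrac{1}{2}\, g^{\rho\lambda}
    \left( \partial_{\sigma} g_{\lambda\tau}
         + \partial_{\tau} g_{\lambda\sigma}
         - \partial_{\lambda} g_{\sigma\tau} \right)

% Covariant derivative of a vector field
\nabla_{\sigma} V^{\rho}
  = \partial_{\sigma} V^{\rho} + \Gamma^{\rho}_{\sigma\tau} V^{\tau}

% Riemann curvature tensor (one common convention)
R^{\rho}{}_{\sigma\mu\nu}
  = \partial_{\mu} \Gamma^{\rho}_{\nu\sigma}
  - \partial_{\nu} \Gamma^{\rho}_{\mu\sigma}
  + \Gamma^{\rho}_{\mu\lambda} \Gamma^{\lambda}_{\nu\sigma}
  - \Gamma^{\rho}_{\nu\lambda} \Gamma^{\lambda}_{\mu\sigma}
```

For the constant Minkowski metric all derivatives of [itex]g_{\mu\nu}[/itex] vanish, so the connection and the curvature are both zero, which is the sense in which that space is "flat".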

robphy · Dec 20, 2005

Oxymoron said:

Consider Galilean space. I have found that it has some structure; three axioms:
1. Time Intervals [itex]\Delta t = t_2 - t_1[/itex]
2. Spatial Distance [itex]\Delta s = |\underline{x}_2 - \underline{x}_1|[/itex]
3. Motions of inertial particles (rectilinear motion) [itex]\underline{x}(t) = \underline{u}t + \underline{x}_0[/itex]
And by a similar method done in Euclidean space you can see that
[tex]t' = t + a[/tex]
[tex]\underline{x}' = A\underline{x} - \underline{v}t + \underline{b}[/tex]
BUT, why don't Galilean transformations preserve the light cone at the origin? Why must we formulate the Minkowski metric to take care of this?

Why? They just don't. The eigenvectors of the Galilean transformation are purely spatial vectors. For Galilean, this means that t=constant...which means that there is a notion of absolute time and absolute simultaneity. On the other hand, the eigenvectors of the Lorentz Transformation (which preserve the Minkowski metric) are lightlike vectors that are tangent to the light cone... which means that the speed of light is absolute.
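
The eigenvector statement can be checked directly in one space dimension. The sketch below assumes units with c = 1 and columns ordered (t, x); the matrices, the helper `is_eigenvector`, and the boost speed 0.6 are illustrative assumptions.

```python
import math

def matvec(M, e):
    """Multiply a 2x2 matrix by a 2-component column (t, x)."""
    return [M[0][0] * e[0] + M[0][1] * e[1],
            M[1][0] * e[0] + M[1][1] * e[1]]

def is_eigenvector(M, e, tol=1e-12):
    """Return (True, eigenvalue) if M e is parallel to e, else (False, None)."""
    Me = matvec(M, e)
    # e and Me are parallel exactly when their 2D cross product vanishes.
    if abs(Me[0] * e[1] - Me[1] * e[0]) > tol:
        return False, None
    k = 0 if abs(e[0]) >= abs(e[1]) else 1   # divide by the larger component
    return True, Me[k] / e[k]

v = 0.6                                       # hypothetical boost speed
gamma = 1.0 / math.sqrt(1.0 - v * v)

galilean = [[1.0, 0.0], [-v, 1.0]]            # t' = t,  x' = x - v t
lorentz = [[gamma, -gamma * v], [-gamma * v, gamma]]

spatial = [0.0, 1.0]                          # purely spatial direction
light_right = [1.0, 1.0]                      # x = +t, a right-moving light ray
light_left = [1.0, -1.0]                      # x = -t, a left-moving light ray

print(is_eigenvector(galilean, spatial))      # eigenvector of the Galilean boost
print(is_eigenvector(galilean, light_right))  # not an eigenvector: the ray is tilted
print(is_eigenvector(lorentz, light_right))   # eigenvector of the Lorentz boost
print(is_eigenvector(lorentz, light_left))    # eigenvector of the Lorentz boost
print(is_eigenvector(lorentz, spatial))       # not an eigenvector: simultaneity shifts
```

Under the Galilean boost the ray x = t becomes x' = (1 - v) t, so the light cone is not mapped to itself; under the Lorentz boost both light rays keep their direction and are only rescaled.
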

Oxymoron · Dec 20, 2005

selfAdjoint said:

The metric you cite is actually that of "flat" Minkowski space of special relativity, not the more general metric of Einstein's pseudo-Riemannian manifold. From the Minkowski metric you can define the causal structure ("light cones") of his spacetime and then define the Lorentz transformations as the set of linear transformations that leave that metric invariant.

Ok, so obviously Galilean spacetime works fine for classical mechanics but not for special relativity. Am I correct to assume that the spacetime interval

[tex]\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2 - c^2 \Delta t^2[/tex]

is the metric in Minkowski space? The way I see it is that this way of defining the distance between two events (points) in Minkowski space also incorporates the time interval between the two as well.

When events satisfy [itex]\Delta s^2 = 0[/itex] then we say that they are connected by a light signal?

It seems to me that by introducing the Minkowski metric we have combined space and time into spacetime and yet we have split spacetime, via the light cone from an event, into two pieces: one which is cut off from the event (by the absolute speed of light) and one which receives information about the event (from the transmission of the light signal).

A problem I am having is visualising the light cone. If I create an event, say I create a photon. Then the photon spreads out from where it is created in all three spatial directions at the speed of light. I am not sure if this is right, but I tend to imagine the point of origin (the tip of the cone) as where I created the photon. In space the light spreads out as a sphere until it consumes the entire universe. Points in space outside the sphere do not know of what is happening inside the sphere until it reaches that point and that time is restricted by the speed of light. But this notion is spherical, not conical.

EDIT:
I'm babbling here. Surely light cones are 4-dimensional and I am trying to picture them as 3-dimensional objects on 2-dimensional paper so clearly I am getting the wrong impression. If anyone has a good description of them or knows of any I would be appreciative.

robphy said:

Why? They just don't. The eigenvectors of the Galilean transformation are purely spatial vectors. For Galilean, this means that t=constant...which means that there is a notion of absolute time and absolute simultaneity. On the other hand, the eigenvectors of the Lorentz Transformation (which preserve the Minkowski metric) are lightlike vectors that are tangent to the light cone... which means that the speed of light is absolute.

The eigenvectors of the Galilean transformation. The Galilean transformations are

[tex]\underline{x}' = A\underline{x} + \underline{a}(t)[/tex]

right? I played around with Newton's first law of motion and came to the conclusion that A is a constant matrix, that is, its time-derivatives are all zero. What could this mean? Well, I read what you wrote and it makes sense - I hope! - the eigenvectors of the Galilean transformation are indeed spatial. Using the Galilean transformations on Newton's First Law of motion there is absolute time and simultaneity. If this is the case then how can an event behave as it does in special relativity? Is this why the Galilean transformation does not preserve the light cone?

If so, changing to the Lorentz transformation

[tex]\underline{x}' = L\underline{x} + \underline{a}(t)[/tex]

tells me that the space is affine. (not sure about this).

My question at this stage is, does the Lorentz transformation preserve the light cone at any event?

Oxymoron · Dec 21, 2005

I finally discovered the true definition of Minkowski spacetime - in terms of the metric defined on it. Please correct me if any of my understanding is flawed. The metric is a nondegenerate, symmetric, indefinite, bilinear form:

[tex]\bold{g} : \mathscr{M} \times \mathscr{M} \rightarrow \mathbb{R}[/tex]

Within Minkowski spacetime we may define the Lorentz inner product as being

[tex]g(v,w) := v\cdot w = v_1w_1 + v_2w_2 + v_3w_3 - v_4w_4[/tex]

Once we include the inner product does Minkowski spacetime become an inner product space, you know, like Hilbert space?

Vectors in [itex]\mathscr{M}[/itex] are either spacelike, timelike, or null if the inner product is positive, negative, or zero respectively.

Now let's collect all the null vectors into one set and call it the null cone, or light cone:

[tex]C_N(x_0) = \{x\in\mathscr{M} \,:\, v\cdot w = 0\}[/tex]

So the light cone is a surface in Minkowski space consisting of all those vectors whose inner product with [itex]x_0[/itex] is zero. The Lorentz inner product tells us the spacetime interval between two events right? If the inner product, [itex]g(v,x_0)[/itex] is zero then the spacetime interval between the event [itex]v[/itex] and [itex]x_0[/itex] is simultaneous. Therefore the light cone is the collection of events in spacetime which occur simultaneously to an observer at [itex]x_0[/itex].

If I spontaneously create a photon and call the event [itex]x_0[/itex] then the photon radiates outward in all spatial directions. I should be able to construct a light cone whose point is at [itex]x_0[/itex].

All the events inside the future light cone of the event happen AFTER the event - there is a strict time ordering of events inside the future cone. We can even say that the event [itex]x_0[/itex] can cause things to happen, and if it does the new event must be within the future cone. So events which are caused by [itex]x_0[/itex] must be reachable by information at speeds less than [itex]c[/itex]. We then say that these new events are caused by [itex]x_0[/itex] and are timelike vectors.

If the new events inside the future light cone are all timelike then their Lorentz inner product must be negative. But what does this mean? If the Lorentz inner product is meant to tell us the spacetime interval between two events, and this interval is negative, does it mean that the new event (caused by [itex]x_0[/itex] and thus inside the future light cone) is further away from [itex]x_0[/itex] in time than in space?

robphy · Dec 21, 2005

Oxymoron said:

I finally discovered the true definition of Minkowski spacetime - in terms of the metric defined on it. Please correct me if any of my understanding is flawed. The metric is a nondegenerate, symmetric, indefinite, bilinear form:
[tex]\bold{g} : \mathscr{M} \times \mathscr{M} \rightarrow \mathbb{R}[/tex]
Within Minkowski spacetime we may define the Lorentz inner product as being
[tex]g(v,w) := v\cdot w = v_1w_1 + v_2w_2 + v_3w_3 - v_4w_4[/tex]

...where the rightmost expression uses rectangular components and the (+,+,+,-) signature convention.

Oxymoron said:

Once we include the inner product does Minkowski spacetime become an inner product space, you know, like Hilbert space?
Vectors in [itex]\mathscr{M}[/itex] are either spacelike, timelike, or null if the inner product is positive, negative, or zero respectively.

...inner product with itself (that is, its square norm) is...

Oxymoron said:

Now let's collect all the null vectors into one set and call it the null cone, or light cone:
[tex]C_N(x_0) = \{x\in\mathscr{M} \,:\, v\cdot w = 0\}[/tex]
So the light cone is a surface in Minkowski space consisting of all those vectors whose inner product with [itex]x_0[/itex] is zero.

...the null vectors at a point (event) [say, [itex]x_0[/itex]] of M.
It's not "all those vectors whose inner product with [itex]x_0[/itex]"... but "all those vectors at event [itex]x_0[/itex] whose inner product with itself "...
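
To make the correction concrete, here is a small sketch that classifies a vector by its inner product with itself, using the rectangular components and the (+,+,+,-) signature quoted above. It assumes c = 1 and components ordered (x, y, z, t); the function names are hypothetical.

```python
def lorentz_product(v, w):
    """g(v, w) in components (x, y, z, t) with signature (+,+,+,-), c = 1."""
    return v[0] * w[0] + v[1] * w[1] + v[2] * w[2] - v[3] * w[3]

def classify(v, tol=1e-12):
    """Classify a vector by the sign of its square norm g(v, v)."""
    n = lorentz_product(v, v)
    if abs(n) <= tol:
        return "null"
    return "spacelike" if n > 0 else "timelike"

print(classify([1.0, 0.0, 0.0, 0.0]))   # spacelike
print(classify([0.0, 0.0, 0.0, 1.0]))   # timelike
print(classify([3.0, 4.0, 0.0, 5.0]))   # null

# Two different null vectors need not be orthogonal to each other:
a = [1.0, 0.0, 0.0, 1.0]
b = [-1.0, 0.0, 0.0, 1.0]
print(classify(a), classify(b), lorentz_product(a, b))
```

The last line shows why the cone has to be defined through g(v, v) = 0 for each vector separately: both a and b are null, yet their mutual inner product is nonzero.
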

robphy · Dec 21, 2005

Oxymoron said:

The Lorentz inner product tells us the spacetime interval between two events right? If the inner product, [itex]g(v,x_0)[/itex] is zero then the spacetime interval between the event [itex]v[/itex] and [itex]x_0[/itex] is simultaneous. Therefore the light cone is the collection of events in spacetime which occur simultaneously to an observer at [itex]x_0[/itex].

There may be some confusion here. The arguments of the metric are vectors. So, when you write [itex]g(v,x_0)[/itex], then v and x_0 are vectors. However, x_0 is a point (event)... unless you somehow want to use something like a so-called position vector in spacetime... however, you now have to specify an origin of spacetime position... but then your inner product [itex]g(v,x_0)[/itex] depends on that choice of origin... probably not what you want.

"That two events are simultaneous" is an observer-dependent concept. Using something similar to your notation [itex]g(v,x_0)[/itex], I can clarify this.
At event x_0, let v be a unit-timelike vector (representing an observer's 4-velocity). Suppose there is an event x_1 such that the displacement vector [itex]\Delta x=x_1-x_0[/itex] has inner-product zero with v: [itex]g(v,x_1-x_0)=0[/itex]... then x_1 and x_0 are simultaneous according to the observer with 4-velocity v. Looking at this in more detail, the 4-vector [itex]\Delta x=x_1-x_0[/itex] must be a spacelike vector... to observer-v, it is in fact a purely spatial vector... in other words, it is orthogonal to v.

So, "Therefore the light cone is the collection of events in spacetime which occur simultaneously to an observer at [itex]x_0[/itex]" is NOT correct.
In Minkowski space, the future light cone traces out the set of events that can be reached by a light signal from the vertex event (what it broadcasts)... the past light cone traces out the events from which a light signal reaches the vertex event (what it literally "sees").
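
The simultaneity criterion can be tried out numerically. The sketch below assumes c = 1, components ordered (x, y, z, t) with signature (+,+,+,-), and observers moving along x; the two events and the speed 0.6 are made-up values chosen so the arithmetic is easy to follow.

```python
import math

def g(v, w):
    """Lorentz inner product, components (x, y, z, t), signature (+,+,+,-)."""
    return v[0] * w[0] + v[1] * w[1] + v[2] * w[2] - v[3] * w[3]

def four_velocity(beta):
    """Unit timelike 4-velocity of an observer moving along x with speed beta."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [gamma * beta, 0.0, 0.0, gamma]

x0 = [0.0, 0.0, 0.0, 0.0]
x1 = [5.0, 0.0, 0.0, 3.0]      # 5 units away in x and 3 units later in t
dx = [b - a for a, b in zip(x0, x1)]

at_rest = four_velocity(0.0)
moving = four_velocity(0.6)

print(g(dx, dx))               # positive: the separation is spacelike
print(g(at_rest, dx))          # nonzero: not simultaneous for the observer at rest
print(g(moving, dx))           # zero up to rounding: simultaneous for this observer
```

The same pair of events is simultaneous for one observer and not for the other, which is the observer dependence described above.
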

Oxymoron said:

If I spontaneously create a photon and call the event [itex]x_0[/itex] then the photon radiates outward in all spatial directions. I should be able to construct a light cone whose point is at [itex]x_0[/itex].

I would probably say "a flash of light"..."sending out many photons outward in all spatial directions".

Oxymoron said:

All the events inside the future light cone of the event happen AFTER the event - there is a strict time ordering of events inside the future cone. We can even say that the event [itex]x_0[/itex] can cause things to happen, and if it does the new event must be within the future cone. So events which are caused by [itex]x_0[/itex] must be reachable by information at speeds less than [itex]c[/itex]. We then say that these new events are caused by [itex]x_0[/itex] and are timelike vectors.

I would say "can be influenced by [itex]x_0[/itex]"... since an event P can be influenced by many events (not just [itex]x_0[/itex]) in the past light cone of P.

Oxymoron said:

If the new events inside the future light cone are all timelike then their Lorentz inner product must be negative. But what does this mean? If the Lorentz inner product is meant to tell us the spacetime interval between two events, and this interval is negative, does it mean that the new event (caused by [itex]x_0[/itex] and thus inside the future light cone) is further away from [itex]x_0[/itex] in time than in space?

Again... you need to distinguish points (events) from vectors.

Oxymoron · Dec 21, 2005

robphy said:

There may be some confusion here. The arguments of the metric are vectors. So, when you write [itex]g(v,x_0)[/itex], then v and x_0 are vectors. However, x_0 is a point (event)... unless you somehow want to use something like a so-called position vector in spacetime... however, you now have to specify an origin of spacetime position... but then your inner product [itex]g(v,x_0)[/itex] depends on that choice of origin... probably not what you want.

So we must treat events differently to these timelike and spacelike vectors? Or is it simply that the two do not compute right in the metric? I am confused at why the event, [itex]x_0[/itex] is not a vector in Minkowski space? Wait, maybe not. Elements of Minkowski space are events; vectors in Minkowski space are not events, right?

robphy said:

"That two events are simultaneous" is an observer-dependent concept. Using something similar to your notation [itex]g(v,x_0)[/itex], I can clarify this.
At event x_0, let v be a unit-timelike vector (representing an observer's 4-velocity). Suppose there is an event x_1 such that the displacement vector [itex]\Delta x=x_1-x_0[/itex] has inner-product zero with v: [itex]g(v,x_1-x_0)=0[/itex]... then x_1 and x_0 are simultaneous according to the observer with 4-velocity v. Looking at this in more detail, the 4-vector [itex]\Delta x=x_1-x_0[/itex] must be a spacelike vector... to observer-v, it is in fact a purely spatial vector... in other words, it is orthogonal to v.

So, "Therefore the light cone is the collection of events in spacetime which occur simultaneously to an observer at [itex]x_0[/itex]" is NOT correct.
In Minkowski space, the future light cone traces out the set of events that can be reached by a light signal from the vertex event (what it broadcasts)... the past light cone traces out the events from which a light signal reaches the vertex event (what it literally "sees").

Is v inside the future light cone? Surely it must be outside the cone? In which case, no matter what the 4-velocity of the observer, so long as it is spacelike, will see [itex]x_0[/itex] and [itex]x_1[/itex] simultaneously.

At this point I was about to ask "What if the observer was inside the cone, whose 4-velocity was purely time-like. Would the two events appear simultaneous now?". Surely, [itex]\bold{g}(v,\Delta x)[/itex] is still zero, so the events are on the surface of the light cone, but now the observer must see one event happen AFTER the other. If this is right, could you explain.

One extra question. What is [itex]\eta_{ab}[/itex], the thing which can be 1, -1, or 0 depending on a and b.

dicerandom · Dec 21, 2005

Oxymoron said:

Elements of Minkowski space are events; vectors in Minkowski space are not events, right?

Correct. Events are points in the Minkowski space, vectors (can) connect events.

Oxymoron said:

Is v inside the future light cone? Surely it must be outside the cone? In which case, no matter what the 4-velocity of the observer, so long as it is spacelike, will see [itex]x_0[/itex] and [itex]x_1[/itex] simultaneously.

v must be in the light cone, otherwise the velocity would be greater than that of light. In order for two events to be seen simultaneously they must have a purely spatial separation in the observer's reference frame; as robphy said, the vector connecting the events must be normal to the world line (the velocity vector) of the observer. If we restrict ourselves to 1D motion there will only be one velocity which will see two spacelike separated events as simultaneous. Events which are not spacelike separated cannot be seen as simultaneous by any observers.
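
For 1D motion that single velocity can be written down explicitly. A sketch under the assumptions c = 1 and the usual boost formula dt' = gamma (dt - beta dx): setting dt' = 0 gives beta = dt/dx, which is below the speed of light only for a spacelike separation. The function names and sample separations are hypothetical.

```python
import math

def simultaneity_speed(dt, dx):
    """Speed beta of the 1D observer for whom two events are simultaneous.

    Returns None unless the separation is spacelike (|dt| < |dx|).
    """
    if abs(dt) >= abs(dx):
        return None
    return dt / dx

def boosted_dt(beta, dt, dx):
    """Time separation dt' = gamma (dt - beta dx) in the boosted frame."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (dt - beta * dx)

beta = simultaneity_speed(1.0, 4.0)      # spacelike separation
print(beta, boosted_dt(beta, 1.0, 4.0))  # the boosted time separation vanishes

print(simultaneity_speed(5.0, 3.0))      # timelike separation: None
print(simultaneity_speed(4.0, 4.0))      # lightlike separation: None
```
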

Oxymoron said:

At this point I was about to ask "What if the observer was inside the cone, whose 4-velocity was purely time-like. Would the two events appear simultaneous now?". Surely, [itex]\bold{g}(v,\Delta x)[/itex] is still zero, so the events are on the surface of the light cone, but now the observer must see one event happen AFTER the other. If this is right, could you explain.

I don't see how [itex]\bold{g}(v,\Delta x)[/itex] could be zero if the two events are on the lightcone. [itex]\bold{g}(\Delta x,\Delta x)[/itex] would be zero since they're lightlike separated, but v is a different vector.

Oxymoron said:

One extra question. What is [itex]\eta_{ab}[/itex], the thing which can be 1, -1, or 0 depending on a and b.

[itex]\eta_{\mu \nu}[/itex] is the metric of flat Minkowski space. It's diagonal, with 1's in all the diagonal slots except for the -1 in the time position (either the upper left or bottom right, depending on which book you're looking at). Note that by convention Greek indices imply a range over all four dimensions whereas Roman indices imply a range over only the spatial dimensions.
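
A short sketch of [itex]\eta_{\mu\nu}[/itex] as an explicit matrix, covering both placements of the time slot mentioned above. It assumes c = 1; the function names and the `time_first` flag are invented for this illustration.

```python
def minkowski_metric(time_first=True):
    """4x4 flat metric with -1 in the time slot and +1 in the spatial slots."""
    time_index = 0 if time_first else 3
    return [[(-1 if i == time_index else 1) if i == j else 0
             for j in range(4)] for i in range(4)]

def inner(eta, v, w):
    """eta_{mu nu} v^mu w^nu, summed over all four values of each index."""
    return sum(eta[m][n] * v[m] * w[n] for m in range(4) for n in range(4))

eta_a = minkowski_metric(time_first=True)    # components ordered (t, x, y, z)
eta_b = minkowski_metric(time_first=False)   # components ordered (x, y, z, t)

v_a = [2.0, 1.0, 0.0, 0.0]                   # t = 2, x = 1
v_b = [1.0, 0.0, 0.0, 2.0]                   # the same vector, time written last

print(inner(eta_a, v_a, v_a), inner(eta_b, v_b, v_b))   # same value in both layouts
```

Where the -1 sits is only a bookkeeping choice about component order; the inner product itself does not change.
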

Oxymoron · Dec 21, 2005

Hmmm, I am still confused.

If we have an event [itex]x_0[/itex] from which we construct a light cone then future events which occur on the surface of the light cone have Lorentz inner product zero. Future events which are inside the cone are connected by time-like vectors, and those outside are connected by space-like vectors.

Right so far? (probably not)

My question here is: how do I define an observer? Can he be anywhere - as in, inside the cone, outside, at the event? I mean, could an observer be at the very location of the event at the time it occurs? Or could he be in the event's future - perhaps on the surface of the light cone or inside it or outside it.

robphy · Dec 21, 2005

Oxymoron said:

Is v inside the future light cone? Surely it must be outside the cone? In which case, no matter what the 4-velocity of the observer, so long as it is spacelike, will see [itex]x_0[/itex] and [itex]x_1[/itex] simultaneously.

An observer's 4-velocity v is a unit-timelike vector tangent to the observer's worldline. It points into the interior region enclosed by the future light cone (the "chronological future" of the vertex event). One can roughly interpret v as one unit of time along that observer's worldline. So, the 4-velocity v is never spacelike [and never null].
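
As an illustration, the sketch below builds the 4-velocity from an ordinary 3-velocity and checks that its square norm is -1 in the (+,+,+,-) convention. Units with c = 1 and components ordered (x, y, z, t) are assumed; the function names and sample velocities are hypothetical.

```python
import math

def four_velocity(u):
    """4-velocity (x, y, z, t components) for a 3-velocity u with |u| < 1."""
    speed2 = sum(ui * ui for ui in u)
    if speed2 >= 1.0:
        raise ValueError("an observer must move slower than light")
    gamma = 1.0 / math.sqrt(1.0 - speed2)
    return [gamma * u[0], gamma * u[1], gamma * u[2], gamma]

def square_norm(v):
    """g(v, v) with signature (+,+,+,-)."""
    return v[0] ** 2 + v[1] ** 2 + v[2] ** 2 - v[3] ** 2

samples = ([0.0, 0.0, 0.0], [0.6, 0.0, 0.0], [0.3, -0.4, 0.5], [0.0, 0.0, 0.999])
for u in samples:
    v = four_velocity(u)
    print(u, square_norm(v))   # close to -1 in every case
```

No choice of 3-velocity below c produces a null or spacelike result, and the function refuses speeds at or above c.
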

Oxymoron · Dec 21, 2005

Posted by Dicerandom

v must be in the light cone, otherwise the velocity would be greater than that of light.

So that is why the null vectors are sometimes called lightlike vectors! Because if v were on the surface then v = c and all of a sudden every event in the future is simultaneous to that observer. v cannot be outside the cone because then v > c.

Oxymoron · Dec 21, 2005

If I cause an event [itex]x_0[/itex] and I am the observer at that point. Then can I safely say that there is a future event [itex]x_1[/itex] which lies on the light cone of the original event? To me, if there is such an event and [itex]x_0[/itex] causes that event to occur, then my velocity must be c. Since my 4-vector is timelike and if I observe [itex]x_0[/itex] causing [itex]x_1[/itex] (which is on the surface) then my velocity is c.

dicerandom · Dec 21, 2005

Oxymoron said:

Hmmm, I am still confused.
If we have an event [itex]x_0[/itex] from which we construct a light cone then future events which occur on the surface of the light cone have Lorentz inner product zero. Future events which are inside the cone are connected by time-like vectors, and those outside are connected by space-like vectors.
Right so far? (probably not)

Looks good so far

Oxymoron said:

My question here is: how do I define an observer? Can he be anywhere - as in, inside the cone, outside, at the event? I mean, could an observer be at the very location of the event at the time it occurs? Or could he be in the event's future - perhaps on the surface of the light cone or inside it or outside it.

The observer can be anywhere, yes. However what we generally do is define a worldline for an observer, i.e. a path that the observer will follow through spacetime. In simple cases the observer has constant velocity and it's just a straight line, however in more complicated situations the observer can undergo accelerations and the worldline can be curved. At any point along the worldline we define what is called the MCRF (Momentarily Comoving Reference Frame), which is a reference frame that is moving with uniform velocity equal to the instantaneous velocity of the observer. The observer's velocity vector is then the unit vector which points along the time axis of this reference frame, i.e. it is a vector of unit length which is tangential to the observer's worldline.

Oxymoron said:

So that is why the null vectors are sometimes called lightlike vectors! Because if v were on the surface then v = c and all of a sudden every event in the future is simultaneous to that observer. v cannot be outside the cone because then v > c.

Right

I'd be careful about saying what happens when v=c though, technically the theory doesn't extend to that point but if you look at the limiting behavior as v->c that is how it seems things would be.

Oxymoron said:

If I cause an event [itex]x_0[/itex] and I am the observer at that point. Then can I safely say that there is a future event [itex]x_1[/itex] which lies on the light cone of the original event? To me, if there is such an event and [itex]x_0[/itex] causes that event to occur, then my velocity must be c. Since my 4-vector is timelike and if I observe [itex]x_0[/itex] causing [itex]x_1[/itex] (which is on the surface) then my velocity is c.

There are in fact an infinite number of such events. Suppose that you're out in space with a flashlight, you turn your flashlight on and some time later a friend who is some distance away sees that you turned your light on. You turning your light on would be your event [itex]x_0[/itex] and your friend seeing the light would be an event [itex]x_1[/itex] which lies on the lightcone from [itex]x_0[/itex], yet you didn't have to move anywhere.
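
The flashlight picture translates into a few lines of arithmetic. This is only a sketch: it assumes c = 1, components ordered (x, y, z, t), friends at rest, and made-up positions for them.

```python
import math

def interval(a, b):
    """Delta s^2 between events a and b, components (x, y, z, t), c = 1."""
    dx, dy, dz, dt = (bi - ai for ai, bi in zip(a, b))
    return dx * dx + dy * dy + dz * dz - dt * dt

def reception_event(position):
    """Event at which a friend at rest at `position` sees a flash from the origin at t = 0."""
    distance = math.sqrt(sum(p * p for p in position))
    return [position[0], position[1], position[2], distance]  # arrival time = distance / c

x0 = [0.0, 0.0, 0.0, 0.0]                                        # turning the flashlight on
friends = [[3.0, 4.0, 0.0], [0.0, -5.0, 12.0], [1.0, 0.0, 0.0]]  # hypothetical positions

for pos in friends:
    x1 = reception_event(pos)
    print(pos, interval(x0, x1))      # 0.0 for each friend: x1 lies on the light cone

# The person holding the flashlight stays at the origin. A later event on that
# worldline is timelike separated from x0, so nobody has to move at c.
later_self = [0.0, 0.0, 0.0, 2.0]
print(interval(x0, later_self))       # negative
```

Every direction and distance gives another event on the cone, which is the "infinite number of such events" in the reply above.
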

Oxymoron · Dec 21, 2005

Posted by Dicerandom

There are in fact an infinite number of such events. Suppose that you're out in space with a flashlight, you turn your flashlight on and some time later a friend who is some distance away sees that you turned your light on. You turning your light on would be your event [itex]x_0[/itex] and your friend seeing the light would be an event [itex]x_1[/itex] which lies on the lightcone from [itex]x_0[/itex], yet you didn't have to move anywhere.

...if you were moving at v = c, would you know instantly that your friend saw your light flash - instead of some time later? To me it seems that the time it takes for information to get around inside the light cone depends on the observer's velocity too?

Also, just say that you were that friend, waiting for me to flash the light. At time=0 you must clearly be outside the cone. Is this the reason why the light cone is hard to picture, because in 3 dimensions you don't realize that the cone itself is 'expanding' - that is, this expanding in time dimension is compressed. Because an observer who at [itex]t = t_1[/itex] observes the initial event (and therefore at this time inside the cone) is outside the cone initially.

JesseM · Dec 21, 2005

Oxymoron said:

...if you were moving at v = c, would you know instantly that your friend saw your light flash - instead of some time later? To me it seems that the time it takes for information to get around inside the light cone depends on the observer's velocity too?

You can't travel at v=c, and things that do move at c (like photons) do not have their own reference frame in relativity, so you can't say what things will look like from their point of view.

Oxymoron said:

Also, just say that you were that friend, waiting for me to flash the light. At time=0 you must clearly be outside the cone. Is this the reason why the light cone is hard to picture, because in 3 dimensions you don't realize that the cone itself is 'expanding' - that is, this expanding in time dimension is compressed.

It's easier to visualize if you drop the number of dimensions by 1, so you have a 2D space and one time dimension. If you represent time as the vertical dimension, then any horizontal slice through this 3D spacetime will give you all of 2D space at a single instant in time. When an event happens, the light moves outward in an expanding 2D circle, so with time as the third dimension this looks like a cone, with a horizontal slice through the cone being the circle that light from the event has reached at a given time (see the illustration in the wikipedia article here). Of course, in 3D space it would actually be an expanding sphere instead, and the "cone" would be a 4D one that we humans can't actually visualize.

Oxymoron said:

Because an observer who at [itex]t = t_1[/itex] observes the initial event (and therefore at this time inside the cone) is outside the cone initially.

Yes, if you picture each observer's worldline as a vertical line in this 3D spacetime (or a slanted line, if an observer is moving through space), then only when the second observer's worldline enters the cone does he see the event; at earlier time slices the expanding circle of light from the event has not reached him.
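
The moment at which a worldline enters the cone is easy to compute if the picture is reduced further to one space dimension. A sketch assuming c = 1, a flash at x = 0, t = 0, and an observer on the straight worldline x(t) = x_start + u t; the names and numbers are illustrative.

```python
def reception_time(x_start, u):
    """Time at which the worldline x(t) = x_start + u t meets the light front |x| = t."""
    if abs(u) >= 1.0:
        raise ValueError("observer speed must be below c = 1")
    if x_start >= 0:
        return x_start / (1.0 - u)    # met by the right-moving front x = t
    return -x_start / (1.0 + u)       # met by the left-moving front x = -t

def inside_cone(x_start, u, t):
    """True once the observer at time t is on or inside the future light cone."""
    return t >= 0 and abs(x_start + u * t) <= t

print(reception_time(4.0, 0.0))       # observer at rest 4 units away
print(reception_time(4.0, 0.5))       # observer moving away: the light takes longer

print(inside_cone(4.0, 0.0, 3.0))     # False: the light has not arrived yet
print(inside_cone(4.0, 0.0, 5.0))     # True: the worldline is now inside the cone
```
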

Oxymoron · Dec 21, 2005

I was just wondering something, please forgive me as it's a little off-topic. Photons are light-like. Now, this may be crazy, but those theoretical particles, tachyons, whose speeds are greater than c: do they behave as spacelike vectors? Also, if a tachyon did exist, would its speed be infinite, not just "greater than c"?

I'll get back on topic next post.

Oxymoron · Dec 23, 2005

Can somebody explain the need for contravariant tensors and covariant tensors. I mean, at first glance, the only difference I see between the two is that the indices are swapped around. Does it actually make a physical difference whether the indices are subscripted or superscripted?

robphy · Dec 23, 2005

Oxymoron said:

Can somebody explain the need for contravariant tensors and covariant tensors. I mean, at first glance, the only difference I see between the two is that the indices are swapped around. Does it actually make a physical difference whether the indices are subscripted or superscripted?

Many physical quantities are naturally described ("born", if you will) by (say) contravariant tensors... many others by covariant... and the rest, mixed. At an abstract level, it is usually the geometry of the mathematical model of the physical quantity that dictates the type.

For example, the unit-4-velocity is a vector [a contravariant tensor] tangent to the worldline. The electromagnetic field is a 2-form [a totally antisymmetric covariant tensor]...which can be written as the curl of a potential. When there is a nondegenerate metric around, one can do index gymnastics and raise and lower indices... however, one should really be aware of the natural description of the quantity... or else its physical meaning could be obscured in all of the shuffling.

In Euclidean space, the simplicity of the metric and volume-form can sometimes blur the distinction among various "directional quantities"... so that we get away with thinking of a lot of these quantities as simple "vectors". For example, the cross-product of two vectors is not really a vector...without the metric and the volume form. A physical example: the electric field and the magnetic field are not fundamentally contravariant vectors.

pervect · Dec 23, 2005

Oxymoron said:

Can somebody explain the need for contravariant tensors and covariant tensors? I mean, at first glance, the only difference I see between the two is that the indices are swapped around. Does it actually make a physical difference whether the indices are subscripted or superscripted?

Most of the time people prefer to work in orthonormal Cartesian coordinate systems. In these cases, there is no difference between covariant and contravariant tensors, because the metric is an identity matrix.

In some situations, however, one cannot use an identity matrix for the metric. Relativity is an example: since [itex]ds^2 = dx^2 + dy^2 + dz^2 - dt^2[/itex], the minus sign before the [itex]dt^2[/itex] means the metric is not an identity matrix.

In these situations, one has to worry about covariant vs contravariant tensors. You can think of them as the machinery necessary to include the minus sign in front of the dt^2, or the machinery necessary to work in any coordinate system, including ones that are not orthonormal.

If you abstract out "coordinate choice" issues from "coordinate independent" issues, all the issues related to covariance and contravariance are related to coordinate choices. Thus [itex]G_{uv} = 8 \pi T_{uv}[/itex] relates to the same physics as [itex]G^{uv} = 8 \pi T^{uv}[/itex]; the covariance and contravariance issues are ultimately all related to the choice of coordinates.

In order to maintain comprehensibility, though, certain very strong conventions are used - for instance, the space-time coordinates of an event are always written superscripted. The subscripted space-time coordinates of an event are then determined from the superscripted coordinates by the machinery of the tensor transformations via the metric at that location. The value of the metric depends on some more coordinate choice issues.
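
As a rough illustration of how the metric turns superscripted components into subscripted ones, here is a minimal Python sketch (not part of the original post). It assumes inertial coordinates ordered (x, y, z, t) with c = 1 and the metric diag(+1, +1, +1, -1) quoted above; the names `ETA`, `lower` and `interval` are made up for this example.

```python
# Minkowski metric in inertial coordinates, ordered (x, y, z, t), with c = 1.
# The signature (+, +, +, -) matches ds^2 = dx^2 + dy^2 + dz^2 - dt^2.
ETA = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, -1],
]

def lower(v_up, metric=ETA):
    """Return the covariant components v_i = g_ij v^j."""
    n = len(v_up)
    return [sum(metric[i][j] * v_up[j] for j in range(n)) for i in range(n)]

def interval(v_up, metric=ETA):
    """Return the contraction v_i v^i = g_ij v^i v^j."""
    v_down = lower(v_up, metric)
    return sum(a * b for a, b in zip(v_down, v_up))

dx_up = [3, 0, 0, 5]      # displacement components (dx, dy, dz, dt)
print(lower(dx_up))       # [3, 0, 0, -5]: only the time component flips sign
print(interval(dx_up))    # 9 - 25 = -16
```

With an identity metric `lower` would return its input unchanged, which is why the distinction is invisible in orthonormal Cartesian coordinates.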

robphy · Dec 23, 2005

pervect said:

If you abstract out "coordinate choice" issues from "coordinate independent" issues, all the issues related to covariance and contravariance are related to coordinate choices. Thus [itex]G_{uv} = 8 \pi T_{uv}[/itex] relates to the same physics as [itex]G^{uv} = 8 \pi T^{uv}[/itex]; the covariance and contravariance issues are ultimately all related to the choice of coordinates.

Technically speaking, if your metric tensor happens to be degenerate (so that it has no inverse) [for example in the Newton-Cartan case] then you can't raise indices so easily. Of course, one could take for granted an invertible metric tensor as one often does... but it's probably a good idea not to take such things for granted. In my opinion, it's always a good idea to see the scaffolding to appreciate just what went into the construction of tensorial expressions.

From a measurement point of view... suppose you didn't know the metric (maybe you are not able to determine it right now)... then certain expressions that use the metric couldn't be determined... but those that didn't need it can be determined. So, one should know which expressions need the metric and which don't.

For example, one can formulate electrodynamics without the use of a metric
http://arxiv.org/abs/physics/9907046

Here's a reference for the Newton-Cartan formalism
p.44 of http://arxiv.org/abs/gr-qc/0506065 (note the comments on p. 47)

Here's some motivation for this general viewpoint:
http://www.ucolick.org/~burke/forms/draw.ps
http://arxiv.org/abs/gr-qc/9807044
http://journals.tubitak.gov.tr/physics/issues/fiz-99-23-5/fiz-23-5-7-9903-44.pdf

The opening paragraph of http://www.bgu.ac.il/~rsegev/Papers/JMP2002AIP.pdf gives a nice motivation: "Since one cannot assume that the metric tensor is known in advance, it would be preferable, at least from the theoretical point of view, to have a formulation of the theory that does not rely on the metric structure."

Indeed... in many approaches to quantum gravity, the metric isn't available. So, there is [at a deep level] some physics that distinguishes a natural-born tensorial expression from its index-raised-or-lowered analogue.

pmb_phy · Dec 23, 2005

Oxymoron said:

I'm trying to understand more about the foundations upon which general relativity lies.
The first thing I need to clarify is the space in which GR works. I've read that GR merges time and space into spacetime, ...

"Spacetime," the union of space and time, was Minkowski's idea which he revealed in 1908.

..which to my understanding means that, when considered separately, time and space are invariant.

I don't follow you here. What does that mean?

This spacetime must then be much like any other mathematical space, such as Euclidean space or Hilbert space, in that notions of distance must be defined.

No. Spacetime is a manifold and Hilbert space is a vector space. These are two different uses of the term "space."

In Euclidean space, [tex]\mathbb{R}^3[/tex], you have the Euclidean metric,
[tex]\Delta s^2 = x^2 - y^2[/tex]
which obeys all the axioms of a metric and defines the distance between two points in the Euclidean space.

That is incorrect. The spatial distance between two events is defined in flat spacetime as
[tex]\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2[/tex]

However, I have read, that the metric in spacetime, the spacetime metric, is defined as
[tex]\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2 - c^2 \Delta t^2[/tex]

Yes. That is correct for Minkowski coordinates, i.e. an inertial frame of reference. It is not valid in general.

This spacetime metric raises some issues with me. If this is the metric of spacetime then points in spacetime must be represented as a 4-vector.

You're speaking of the 4-position X = (ct, x, y, z). This is the spacetime displacement between two events, one of which is called the "origin" and assigned the 4-position X = (0, 0, 0, 0).

Are these points actually called events by physicists?

Yes. A point in spacetime is called an "event."

The Euclidean plane has structure, right, in the form of the Euclidean metric:
1. [tex]d(x,y) \geq 0[/tex]
2. [tex]d(x,y) = 0 \Leftrightarrow x = y[/tex]
3. [tex]d(x,y) = d(y,x)[/tex]
4. [tex]d(x,z) \leq d(x,y) + d(y,z)[/tex]
In a similar fashion I am interested to know what the structure on spacetime is? Are there similar axioms for the spacetime metric?

Similar but not identical. The first two and the last are invalid in relativity.
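
To see how those three axioms fail, here is a small Python sketch (an illustration added here, not Pete's own working). It assumes one space dimension, units with c = 1, and events written as (t, x); the function names are hypothetical.

```python
import math

def interval_sq(a, b):
    """Squared interval between events a = (t, x) and b = (t, x), with c = 1."""
    dt = b[0] - a[0]
    dx = b[1] - a[1]
    return dx**2 - dt**2

def proper_time(a, b):
    """Proper time along a straight timelike (or null) path from a to b."""
    s2 = interval_sq(a, b)
    if s2 > 0:
        raise ValueError("events are spacelike separated")
    return math.sqrt(-s2)

A, B, C = (0, 0), (5, 3), (10, 0)

# Axiom 2 fails: two distinct events on a light ray have zero interval.
print(interval_sq((0, 0), (1, 1)))    # 0

# Axiom 1 fails: the squared interval can be negative.
print(interval_sq(A, B))              # 9 - 25 = -16

# Axiom 4 fails: the direct path A -> C is "longer" in proper time
# than the detour A -> B -> C (the reverse of the triangle inequality).
print(proper_time(A, C), proper_time(A, B) + proper_time(B, C))   # 10.0 8.0
```

Symmetry (axiom 3) still holds, since swapping the two events only changes the signs of dt and dx, which are squared.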

Pete

Oxymoron · Dec 24, 2005

Thanks Pete for your input. That certainly cleared up some of my earlier issues.

Ok, unfortunately I am still struggling with the notion of indices here.

Take the Kronecker delta for example.

[tex]\delta_{ij} = \delta^{ij} = \delta^i_j[/tex]

In Euclidean space with rectangular coordinates, the Kronecker delta may be written with its indices superscripted or subscripted - it doesn't make any difference. By the way, correct me if I am wrong with any of my assumptions. So, for example, [itex]\delta_{ij}x_ix_j = (x_1)^2 + 0 + 0 + 0 + (x_2)^2 + 0 + 0 + 0 + (x_3)^2 = x_ix_i[/itex], so the Kronecker delta just removes the non-diagonal terms. But what about, say, [itex]\delta^i_jx_ix_j[/itex]? To me it seems as though it has the same effect regardless of whether the [itex]i[/itex] or the [itex]j[/itex] sits superscripted or subscripted. Is the Kronecker delta a tensor? Does changing the position of the indices ever change what it does? Is this too simple an example to illustrate the need for the different indices?

Now consider the rectangular coordinate system. Coordinates of a point are always denoted by

[tex](x^1,x^2,\dots,x^n)[/tex]

is there any reason why the subscripts are now replaced by superscripts when regarding tensors? For example, now the distance between two points in rectangular coordinates is

[tex]\sqrt{(x^1 - y^1)^2 + (x^2-y^2)^2 + \dots + (x^n - y^n)^2} = \sqrt{\delta_{ij}\Delta x^i \Delta x^j}[/tex]

The way I see it is that normally [itex]\Delta x^i \Delta x^j[/itex] is summed over all combinations of [itex]i[/itex] and [itex]j[/itex] up to [itex]n[/itex]. But sticking in the Kronecker delta, we only sum when [itex]i=j[/itex], all other terms go to zero - which is handy when we are finding the distance.
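
A quick way to check this numerically is the following Python sketch (added for illustration; the helper names `delta` and `distance` are made up here). It carries out the full double sum over i and j and relies on the Kronecker delta to kill the off-diagonal terms.

```python
import math

def delta(i, j):
    """Kronecker delta: 1 when the indices agree, 0 otherwise."""
    return 1 if i == j else 0

def distance(x, y):
    """Euclidean distance computed as sqrt(delta_ij dx^i dx^j)."""
    n = len(x)
    dx = [x[i] - y[i] for i in range(n)]
    # Full double sum over i and j; only the i == j terms survive.
    total = sum(delta(i, j) * dx[i] * dx[j]
                for i in range(n) for j in range(n))
    return math.sqrt(total)

print(distance((1, 2, 3), (4, 6, 3)))   # sqrt(9 + 16 + 0) = 5.0
```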

Why have the indices superscripted now? Do they get in the way later on? Any reason?

My question here is, if we were to 'transform' the coordinate system into a slightly different rectangular coordinate system, say by changing the basis or something, would the distance be unchanged? Are there any requirements of the coordinate system which makes this possible? I mean, if distance is preserved under a coordinate transformation - this is just like saying that an operator preserves the metric, hence isometric. Is this the same thing?

The way that I am teaching myself tensors is to work with coordinate transformations. (Is this a good way to start?) Now I am at the point where I should be able to learn the difference between a contra and a co-variant tensor. Say that I have a vector field [itex]\bold{V}[/itex] defined on some subset of [itex]\mathbb{R}^n[/itex]. So my elements of the vector field are vectors, which we know can be written as [itex]V^i[/itex] with respect to some 'admissible' coordinate system. Each element [itex]V^i[/itex] is a real-valued function. Now let me assume I have at hand two admissible coordinate systems, that is, I should be able to transform between the two without changing my metric, my method of prescribing distances. Now let the vector field be written in terms of its [itex]n[/itex] components:

[tex]\bold{V} = (V^1,V^2,\dots,V^n)[/tex].

Each component, [itex]V^i[/itex] can of course be written as a real-valued function. Let's call each one [itex]T^i[/itex]. Now since we have two coordinate systems let's express all this in the following way

For the [itex](x^i)[/itex] system we have

[tex]T^1,T^2,\dots, T^n[/tex]

and for the [itex](\bar{x}^i)[/itex] system we have

[tex]\bar{T}^1,\bar{T}^2,\dots,\bar{T}^n[/tex]

Now at this point I am faced with the following idea: that [itex]\bold{V}[/itex] is itself a contravariant tensor of order one provided that its components [itex]T^i[/itex] and [itex]\bar{T}^i[/itex] relative to the respective coordinate systems obey a given law of transformation.

This means to me that the vector field is the tensor all along!

Oxymoron · Dec 24, 2005

Let's say that this transformation was

[tex]\bar{T}^i = T^r \frac{\partial\bar{x}^i}{\partial x^r}[/tex]

In fact, it is this transformation which makes the vector field a contravariant tensor of rank one.

If the law of transformation was say

[tex]\bar{T}_i = T_r \frac{\partial x^r}{\partial\bar{x}^i}[/tex]

Now the vector field is a covariant tensor of rank one.

My question is, what is the difference between the two transformation laws? I mean, the only visible difference is that some of the indices have been lowered and we are now differentiating with respect to a different coordinate system.

There is a method of transforming from one system to another. Then, if the transformation is bijective, we could transform back to the original via the inverse. Now, is this like the transformation laws here? I mean, the vector field is said to be a contravariant tensor if the first law holds - meaning we can transform to a different coordinate system. THEN, the vector field is a covariant tensor IF we can transform back?

I may be able to see why the indices are changing now. So you can tell which way the coordinate transformation is operating.

selfAdjoint · Dec 24, 2005

Oxymoron said:

Let's say that this transformation was
[tex]\bar{T}^i = T^r \frac{\partial\bar{x}^i}{\partial x^r}[/tex]
In fact, it is this transformation which makes the vector field a contravariant tensor of rank one.
If the law of transformation was say
[tex]\bar{T}_i = T_r \frac{\partial x^r}{\partial\bar{x}^i}[/tex]
Now the vector field is a covariant tensor of rank one.
My question is, what is the difference between the two transformation laws? I mean, the only visible difference is that some of the indices have been lowered and we are now differentiating with respect to a different coordinate system.
There is a method of transforming from one system to another. Then, if the transformation is bijective, we could transform back to the original via the inverse. Now, is this like the transformation laws here? I mean, the vector field is said to be a contravariant tensor if the first law holds - meaning we can transform to a different coordinate system. THEN, the vector field is a covariant tensor IF we can transform back?
I may be able to see why the indices are changing now. So you can tell which way the coordinate transformation is operating.

In the covariant change formula you multiply by partials of the old coordinates with respect to the new ones. In the contravariant formula you multiply by partials of the new variables with respect to the old ones. The matrices [itex](\frac{\partial x^{\mu}}{\partial x'^{\nu}})[/itex] and [itex](\frac{\partial x'^{\mu}}{\partial x^{\nu}})[/itex] are inverses of each other.
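
Here is a minimal Python sketch of the two laws (an illustration only, not taken from any reference). It assumes the old coordinates are Cartesian (x, y) and the new ones are polar (r, theta) in the plane; all function names are hypothetical. It checks that the two matrices of partials multiply to the identity, and that the contraction of a covariant with a contravariant vector comes out the same in both systems.

```python
import math

def jac_new_wrt_old(x, y):
    """Partials of the new coordinates (r, theta) w.r.t. the old (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],        # dr/dx,     dr/dy
            [-y / r2, x / r2]]     # dtheta/dx, dtheta/dy

def jac_old_wrt_new(x, y):
    """Partials of the old coordinates (x, y) w.r.t. the new (r, theta)."""
    r = math.hypot(x, y)
    return [[x / r, -y],           # dx/dr, dx/dtheta
            [y / r, x]]            # dy/dr, dy/dtheta

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def to_new_contravariant(v, x, y):
    """Contravariant law: Vbar^i = (d xbar^i / d x^r) V^r."""
    j = jac_new_wrt_old(x, y)
    return [sum(j[i][r] * v[r] for r in range(2)) for i in range(2)]

def to_new_covariant(w, x, y):
    """Covariant law: Wbar_i = (d x^r / d xbar^i) W_r."""
    k = jac_old_wrt_new(x, y)
    return [sum(w[r] * k[r][i] for r in range(2)) for i in range(2)]

x, y = 3.0, 4.0
print(matmul(jac_new_wrt_old(x, y), jac_old_wrt_new(x, y)))  # ~ identity

v, w = [1.0, 2.0], [5.0, 7.0]
v_bar = to_new_contravariant(v, x, y)
w_bar = to_new_covariant(w, x, y)
# The contraction W_i V^i is the same number in both coordinate systems.
print(sum(a * b for a, b in zip(w, v)),
      sum(a * b for a, b in zip(w_bar, v_bar)))
```

Because the components transform with mutually inverse matrices, the factors cancel in the contraction, which is one way to see why the two index positions are kept distinct.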

pervect · Dec 24, 2005

At a fundamental level, there are two types of quantities, which transform differently (oppositely).

These quantities are called vectors (aka contravariant vectors), and one-forms (aka covariant vectors).

There is a duality relationship between these quantities. A one-form is a linear map that takes a vector to a scalar.

If you review vector spaces, you should see some mention of "dual spaces". The dual of a (finite-dimensional) vector space (defined by a linear mapping of the vector space to a scalar as I mentioned above) is always a vector space of the same dimension as the original vector space. The interesting fact is that the dual of a dual recovers the original vector space, which is why the operation is named the way it is. (I'm not going to attempt to prove this interesting statement, but it's reasonably well known and you should be able to find a proof if you look for it).

Tensors can be regarded as a multi-linear map from vectors and dual vectors to a scalar. (This is an alternative definition to defining them by their transformation properties).

Take a look at Baez's GR outline

http://math.ucr.edu/home/baez/gr/outline2.html

for more details as to how to approach tensors from a vector / one-form approach. (Baez calls the one-forms cotangent vectors).

I'm used to being able to freely interconvert vectors and one-forms by means of the metric. I'll have to ponder robphy's remarks about the cases where this is not always possible. Meanwhile, in most situations, a metric exists, and via the metric it is possible to convert vectors to one-forms, and vice versa.

The origin of the metric is the existence of the dot product of two vectors, a product that should give the squared "length" of a vector when the vector is "dotted" with itself. This dot product also commutes in most physical situations.

The dot product, A (dot) B, associates with every vector A a linear map from the vector B to a scalar by definition (since it maps two vectors to a scalar).

It also associates with every vector B a linear map from the vector A to a scalar by the same logic.

When the dot product commutes, these two maps are equivalent, and one simply says that the dot product associates a vector with a one-form (or a vector with a dual vector, a tangent vector with a cotangent vector, etc. etc.)

Oxymoron · Dec 24, 2005

If you review vector spaces, you should see some mention of "dual spaces". The dual of a vector space (defined by a linear mapping of the vector space to a scalar as I mentioned above) is always a vector space of the same dimension as the original vector space. The interesting fact is that the dual of a dual recovers the original vector space, which is why the operation is named the way it is. (I'm not going to attempt to prove this interesting statement, but it's reasonably well known and you should be able to find one if you look for it).

Tensors can be regarded as a multi-linear map from vectors and dual vectors to a scalar. (This is an alternative definition to defining them by their transformation properties).

Ok, I should be able to understand this. Suppose we take [itex]N[/itex] vector spaces over the reals: [itex]V_1, V_2, \dots, V_N[/itex]. Now let's define a map which takes all [itex]N[/itex] vector spaces to a single real number:

[tex]T:V_1 \times V_2 \times \dots \times V_N \rightarrow \mathbb{R}[/tex]

This is a 'multi' linear functional. The collection of such multilinear functionals forms its own vector space:

[tex]V_1^* \otimes \dots \otimes V_N^*[/tex]

At this stage what makes us able to identify every vector space [itex]V_i[/itex] with its double dual, [itex]V_i^{**}[/itex]? Because the tensor product of the vector spaces [itex]V_1,\dots,V_N[/itex], denoted by

[tex]V_1 \otimes V_2 \otimes \dots \otimes V_N[/tex]

is a set of linear functionals which maps from [itex]V_1^* \otimes \dots \otimes V_N^*[/itex] to [itex]\mathbb{R}[/itex].

Now when we speak of a tensor of type (r,s), what do the r and s mean? Well, I thought that it is a map

[tex]T:V_1^* \times V_2^* \times \dots \times V_r^* \times V_1 \times V_2 \times \dots \times V_s \rightarrow \mathbb{R}[/tex]

which is a tensor (is this right, do we call such multilinear functionals tensors?)

In this case the r in "a tensor of type (r,s)" is the number of dual spaces we map from, and the s is the number of vector spaces. So a tensor of type (0,1) is

[tex]T:V \rightarrow \mathbb{R}[/tex]

and hence it is just a linear map, simple as that.

A tensor of type (2,2) would be

[tex]T:V_1^* \times V_2^* \times V_1 \times V_2 \rightarrow \mathbb{R}[/tex]

which is some strange map which I can't think of anything it applies to. But a tensor of type (0,2) would be a bilinear map.

Then if r=0 and s=n then the tensor is a covariant tensor of rank n and if r=n and s=0 the tensor is a contravariant tensor of rank n. So in this way a tensor is simply a collection of linear functionals and being covariant means we map from the product of vector spaces and the contravariant means we map from the product of the dual vector spaces.
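
To make the "tensor as a multilinear map" picture concrete, here is a small Python sketch of a type (0,2) tensor (an illustration under assumptions: a single two-dimensional vector space, vectors stored as lists of components, and made-up names such as `make_tensor_02`). It builds the map from a component table and spot-checks linearity in the first slot.

```python
def make_tensor_02(components):
    """Build a type (0,2) tensor: the bilinear map T(u, v) = T_ij u^i v^j."""
    def tensor(u, v):
        return sum(components[i][j] * u[i] * v[j]
                   for i in range(len(u)) for j in range(len(v)))
    return tensor

def add(u, v):
    """Component-wise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]

def scale(c, u):
    """Scalar multiple of a vector."""
    return [c * a for a in u]

# Components chosen to resemble a 2D metric with one minus sign.
g = make_tensor_02([[1, 0], [0, -1]])

u, v, w = [1, 2], [3, 4], [5, 6]
print(g(u, v))                                  # 1*3 - 2*4 = -5

# Linearity in the first slot: g(2u + w, v) == 2 g(u, v) + g(w, v)
print(g(add(scale(2, u), w), v), 2 * g(u, v) + g(w, v))   # -19 -19
```

The same check with the roles of the two slots swapped verifies linearity in the second argument.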

pervect · Dec 25, 2005

Suppose we take N vector spaces over the reals...

Not quite - you need only one vector space [itex]\mathbb{V}[/itex] which has several different vectors [itex]v_i \in \mathbb{V}[/itex]

Oxymoron · Dec 25, 2005

Not quite - you need only one vector space which has several different vectors

Are you sure? In the book I am reading, the formulation of a tensor as a multilinear map requires several vector spaces [itex]V_1,\dots,V_N[/itex]. Maybe I am wrong and I don't understand their version of 'vector space'.

From "A Course in Modern Mathematical Physics"

Let [itex]V_1,V_2,\dots,V_N[/itex] be vector spaces over [itex]\mathbb{R}[/itex]. A map

[tex]T: V_1 \times V_2 \times \dots \times V_N \rightarrow \mathbb{R}[/tex]

is a multilinear map. Multilinear maps can be added and multiplied by scalars in the usual fashion and form a vector space, denoted

[tex]V_1^* \otimes V_2^* \otimes \dots \otimes V_N^*[/tex]

called the tensor product of the dual spaces [itex]V_1^*,V_2^*,\dots,V_N^*[/itex].
