Understanding the Structure and Transformation of Tensors in Spacetime

Oxymoron
I'm trying to understand more about the foundations upon which general relativity lies.
The first thing I need to clarify is the space in which GR works. I've read that GR merges time and space into spacetime, which to my understanding means that, when considered separately, time and space are invariant.
This spacetime must then be much like any other mathematical space, such as Euclidean space or Hilbert space, in that notions of distance must be defined.
In Euclidean space, \mathbb{R}^3, you have the Euclidean metric,
\Delta s^2 = x^2 - y^2
which obeys all the axioms of a metric and defines the distance between two points in the Euclidean space.
However, I have read that the metric in spacetime, the spacetime metric, is defined as
\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2 - c^2 \Delta t^2
This spacetime metric raises some issues with me. If this is the metric of spacetime then points in spacetime must be represented as a 4-vector. Are these points actually called events by physicists?
The Euclidean plane has structure, right, in the form of the Euclidean metric:
1. d(x,y) \geq 0
2. d(x,y) = 0 \Leftrightarrow x = y
3. d(x,y) = d(y,x)
4. d(x,z) \leq d(x,y) + d(y,z)
In a similar fashion, I am interested to know what the structure on spacetime is. Are there similar axioms for the spacetime metric?
 
Ok, maybe I jumped in too quickly for myself.
In Euclidean space, \mathbb{R}^3, you have the distance metric between any two pairs of points to be
\Delta s^2 = (\underline{x}_2 - \underline{x}_1)^2
Then you can construct Euclidean transformations, where the distance is invariant, resulting in what are called affine transformations. I learned that transformations can be expressed by matrices. So, such a Euclidean transformation can be expressed like
\Delta \underline{x}' = A\Delta \underline{x}
So the 'transformed' vector equals the product of some matrix pertaining to the transformation and the original vector. Then by letting one of the pairs of points be the origin you can get
\underline{x}' = A\underline{x} + a
where a is some constant vector.
Now the problem I have comes when I take this one step further. Consider Galilean space. I have found that it has some structure; three axioms:
1. Time Intervals \Delta t = t_2 - t_1
2. Spatial Distance \Delta s = |\underline{x}_2 - \underline{x}_1|
3. Motions of inertial particles (rectilinear motion) \underline{x}(t) = \underline{u}t + \underline{x}_0

And by a similar method done in Euclidean space you can see that

t' = t + a

\underline{x}' = A\underline{x} - \underline{v}t + \underline{b}

BUT, why don't Galilean transformations preserve the light cone at the origin? Why must we formulate the Minkowski metric to take care of this?
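To convince myself that they really don't preserve it, I tried a quick numerical sketch of my own (nothing authoritative - 1+1 dimensions, units with c = 1, and an arbitrary boost speed v = 0.6):

Code:
import numpy as np

v = 0.6                                   # illustrative boost speed, c = 1
gamma = 1 / np.sqrt(1 - v**2)

# Both matrices act on column vectors (x, t):
galilean = np.array([[1.0, -v],
                     [0.0, 1.0]])         # x' = x - vt, t' = t
lorentz  = np.array([[gamma, -gamma*v],
                     [-gamma*v, gamma]])  # x' = gamma(x - vt), t' = gamma(t - vx)

def interval2(e):                         # Ds^2 = Dx^2 - Dt^2 for components (x, t)
    return e[0]**2 - e[1]**2

light = np.array([1.0, 1.0])              # on the light cone: Ds^2 = 0
print(interval2(light))                   # 0.0
print(interval2(galilean @ light))        # (1-v)^2 - 1 != 0: the cone is NOT preserved
print(interval2(lorentz @ light))         # 0.0 (up to rounding): the cone IS preserved

So numerically the Galilean boost drags a lightlike vector off the cone while the Lorentz boost does not - but I still don't see why it has to be that way.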
 
Oxymoron said:
I'm trying to understand more about the foundations upon which general relativity lies.
The first thing I need to clarify is the space in which GR works. I've read that GR merges time and space into spacetime, which to my understanding means that, when considered separately, time and space are invariant.
This spacetime must then be much like any other mathematical space, such as Euclidean space or Hilbert space, in that notions of distance must be defined.
In Euclidean space, \mathbb{R}^3, you have the Euclidean metric,
\Delta s^2 = x^2 - y^2
which obeys all the axioms of a metric and defines the distance between two points in the Euclidean space.
However, I have read that the metric in spacetime, the spacetime metric, is defined as
\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2 - c^2 \Delta t^2
This spacetime metric raises some issues with me. If this is the metric of spacetime then points in spacetime must be represented as a 4-vector. Are these points actually called events by physicists?
The Euclidean plane has structure, right, in the form of the Euclidean metric:
1. d(x,y) \geq 0
2. d(x,y) = 0 \Leftrightarrow x = y
3. d(x,y) = d(y,x)
4. d(x,z) \leq d(x,y) + d(y,z)
In a similar fashion, I am interested to know what the structure on spacetime is. Are there similar axioms for the spacetime metric?


The metric you cite is actually that of "flat" Minkowski space of special relativity, not the more general metric of Einstein's pseudo-Riemannian manifold. From the Minkowski metric you can define the causal structure ("light cones") of his spacetime and then define the Lorentz transformations as the set of linear transformations that leave that metric invariant.

To do the pseudo-Riemannian geometry you need a more general metric, a symmetric tensor g_{\mu\nu}, \mu, \nu = 0,...,3. From the derivatives of this wrt the coordinates you define the Levi-Civita Connection \Gamma^{\rho}_{\sigma\tau}, then the covariant derivative and finally the curvature tensor. This is not too difficult to grasp, but you really should read up on manifolds first.
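If it helps to see the first step of that chain (metric -> connection) concretely, here is a small symbolic sketch of my own - sympy, with the metric of an ordinary unit 2-sphere chosen purely as an example whose answer is well known; nothing here is specific to GR:

Code:
import sympy as sp

th, ph = sp.symbols('theta phi')
x = [th, ph]
g = sp.Matrix([[1, 0],
               [0, sp.sin(th)**2]])      # round metric on the unit 2-sphere
ginv = g.inv()
n = 2

# Gamma^r_{s t} = (1/2) g^{r l} (d_s g_{l t} + d_t g_{l s} - d_l g_{s t})
Gamma = [[[sp.simplify(sum(ginv[r, l] * (sp.diff(g[l, t], x[s])
                                         + sp.diff(g[l, s], x[t])
                                         - sp.diff(g[s, t], x[l])) / 2
                           for l in range(n)))
           for t in range(n)]
          for s in range(n)]
         for r in range(n)]

print(Gamma[0][1][1])   # Gamma^theta_{phi phi} = -sin(theta)*cos(theta)
print(Gamma[1][0][1])   # Gamma^phi_{theta phi} = cos(theta)/sin(theta)

The covariant derivative and the curvature tensor are then built from these coefficients in the same mechanical way.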
 
Oxymoron said:
Consider Galilean space. I have found that it has some structure; three axioms:
1. Time Intervals \Delta t = t_2 - t_1
2. Spatial Distance \Delta s = |\underline{x}_2 - \underline{x}_1|
3. Motions of inertial particles (rectilinear motion) \underline{x}(t) = \underline{u}t + \underline{x}_0
And by a similar method done in Euclidean space you can see that
t' = t + a
\underline{x}' = A\underline{x} - \underline{v}t + \underline{b}
BUT, why don't Galilean transformations preserve the light cone at the origin? Why must we formulate the Minkowski metric to take care of this?

Why? They just don't. The eigenvectors of the Galilean transformation are purely spatial vectors. For Galilean, this means that t=constant...which means that there is a notion of absolute time and absolute simultaneity. On the other hand, the eigenvectors of the Lorentz Transformation (which preserve the Minkowski metric) are lightlike vectors that are tangent to the light cone... which means that the speed of light is absolute.
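You can check the eigenvector claim directly. A minimal numpy sketch of my own (1+1 dimensions, components (x, t), c = 1, illustrative speed v = 0.6):

Code:
import numpy as np

v = 0.6
gamma = 1 / np.sqrt(1 - v**2)

galilean = np.array([[1.0, -v],
                     [0.0, 1.0]])         # x' = x - vt, t' = t
lorentz  = np.array([[gamma, -gamma*v],
                     [-gamma*v, gamma]])

w, V = np.linalg.eig(galilean)
print(w)   # [1. 1.]: a degenerate eigenvalue
print(V)   # both columns ~ (1, 0): the only eigendirection is purely spatial (t = 0)

w, V = np.linalg.eig(lorentz)
print(w)   # gamma(1 -/+ v): the relativistic Doppler factors
print(V)   # columns ~ (1, 1) and (1, -1): lightlike directions tangent to the cone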
 
The metric you cite is actually that of "flat" Minkowski space of special relativity, not the more general metric of Einstein's pseudo-Riemannian manifold. From the Minkowski metric you can define the causal structure ("light cones") of his spacetime and then define the Lorentz transformations as the set of linear transformations that leave that metric invariant.

Ok, so obviously Galilean spacetime works fine for classical mechanics but not for special relativity. Am I correct to assume that the spacetime interval

\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2 - c^2 \Delta t^2

is the metric in Minkowski space? The way I see it is that this way of defining the distance between two events (points) in Minkowski space also incorporates the time interval between the two as well.

When two events satisfy \Delta s^2 = 0, do we say that they are connected by a light signal?

It seems to me that by introducing the Minkowski metric we have combined space and time into spacetime and yet we have split spacetime, via the light cone from an event, into two pieces: one which is cut off from the event (by the absolute speed of light) and one which receives information about the event (from the transmission of the light signal).


A problem I am having is visualising the light cone. If I create an event, say I create a photon. Then the photon spreads out from where it is created in all three spatial directions at the speed of light. I am not sure if this is right, but I tend to imagine the point of origin (the tip of the cone) as where I created the photon. In space the light spreads out as a sphere until it consumes the entire universe. Points in space outside the sphere do not know of what is happening inside the sphere until it reaches that point and that time is restricted by the speed of light. But this notion is spherical, not conical.

EDIT:
I'm babbling here. Surely light cones are 4-dimensional and I am trying to picture them as 3-dimensional objects on 2-dimensional paper, so clearly I am getting the wrong impression. If anyone has a good description of them or knows of any I would be appreciative.

Why? They just don't. The eigenvectors of the Galilean transformation are purely spatial vectors. For Galilean, this means that t=constant...which means that there is a notion of absolute time and absolute simultaneity. On the other hand, the eigenvectors of the Lorentz Transformation (which preserve the Minkowski metric) are lightlike vectors that are tangent to the light cone... which means that the speed of light is absolute.

"The eigenvectors of the Galilean transformation." The Galilean transformations are

\underline{x}' = A\underline{x} + \underline{a}(t)

right? I played around with Newton's first law of motion and came to the conclusion that A is a constant matrix, that is, its time-derivatives are all zero. What could this mean? Well, I read what you wrote and it makes sense - I hope! - the eigenvectors of the Galilean transformation are indeed spatial. Using the Galilean transformations on Newton's first law of motion, there is absolute time and simultaneity. If this is the case then how can an event behave as it does in special relativity? Is this why the Galilean transformation does not preserve the light cone?

If so, changing to the Lorentz transformation

\underline{x}' = L\underline{x} + \underline{a}(t)

tells me that the space is affine. (not sure about this).

My question at this stage is, does the Lorentz transformation preserve the light cone at any event?
 
I finally discovered the true definition of Minkowski spacetime - in terms of the metric defined on it. Please correct me if any of my understanding is flawed. The metric is a nondegenerate, symmetric, indefinite, bilinear form:

\bold{g} : \mathscr{M} \times \mathscr{M} \rightarrow \mathbb{R}

Within Minkowski spacetime we may define the Lorentz inner product as being

g(v,w) := v\cdot w = v_1w_1 + v_2w_2 + v_3w_3 - v_4w_4

Once we include the inner product does Minkowski spacetime become an inner product space, you know, like Hilbert space?


Vectors in \mathscr{M} are either spacelike, timelike, or null if the inner product is positive, negative, or zero respectively.

Now let's collect all the null vectors into one set and call it the null cone, or light cone:

C_N(x_0) = \{x\in\mathscr{M} \,:\, x\cdot x_0 = 0\}

So the light cone is a surface in Minkowski space consisting of all those vectors whose inner product with x_0 is zero. The Lorentz inner product tells us the spacetime interval between two events right? If the inner product, g(v,x_0) is zero then the spacetime interval between the event v and x_0 is simultaneous. Therefore the light cone is the collection of events in spacetime which occur simultaneously to an observer at x_0.

If I spontaneously create a photon and call the event x_0 then the photon radiates outward in all spatial directions. I should be able to construct a light cone whose point is at x_0.

All the events inside the future light cone of the event happen AFTER the event - there is a strict time ordering of events inside the future cone. We can even say that the event x_0 can cause things to happen, and if it does the new event must be within the future cone. So events which are caused by x_0 must be reachable by information at speeds less than c. We then say that these new events are caused by x_0 and are timelike vectors.

If the new events inside the future light cone are all timelike then their Lorentz inner product must be negative. But what does this mean? If the Lorentz inner product is meant to tell us the spacetime interval between two events, and this interval is negative, does it mean that the new event (caused by x_0 and thus inside the future light cone) is further away from x_0 in time than in space?
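Playing with numbers seems to bear this out. Here is a little sketch of mine (c = 1, the (+,+,+,-) convention with time in the fourth slot, sample displacements picked by hand):

Code:
import numpy as np

eta = np.diag([1.0, 1.0, 1.0, -1.0])      # (+,+,+,-), time as the 4th component, c = 1

def classify(d):
    s2 = d @ eta @ d                      # Lorentz inner product of d with itself
    if s2 < 0:
        return 'timelike', s2
    if s2 > 0:
        return 'spacelike', s2
    return 'null', s2

print(classify(np.array([1.0, 0, 0, 2.0])))   # ('timelike', -3.0): more time than space
print(classify(np.array([2.0, 0, 0, 1.0])))   # ('spacelike', 3.0): more space than time
print(classify(np.array([1.0, 0, 0, 1.0])))   # ('null', 0.0): a light signal

So the sign does seem to compare the time separation with the spatial separation - but I'd like confirmation that I'm reading it correctly.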
 
Oxymoron said:
I finally discovered the true definition of Minkowski spacetime - in terms of the metric defined on it. Please correct me if any of my understanding is flawed. The metric is a nondegenerate, symmetric, indefinite, bilinear form:
\bold{g} : \mathscr{M} \times \mathscr{M} \rightarrow \mathbb{R}
Within Minkowski spacetime we may define the Lorentz inner product as being
g(v,w) := v\cdot w = v_1w_1 + v_2w_2 + v_3w_3 - v_4w_4
...where the rightmost expression uses rectangular components and the (+,+,+,-) signature convention.
Oxymoron said:
Once we include the inner product does Minkowski spacetime become an inner product space, you know, like Hilbert space?
Vectors in \mathscr{M} are either spacelike, timelike, or null if the inner product is positive, negative, or zero respectively.
...inner product with itself (that is, its square norm) is...
Oxymoron said:
Now let's collect all the null vectors into one set and call it the null cone, or light cone:
C_N(x_0) = \{x\in\mathscr{M} \,:\, x\cdot x_0 = 0\}
So the light cone is a surface in Minkowski space consisting of all those vectors whose inner product with x_0 is zero.
...the null vectors at a point (event) [say, x_0] of M.
It's not "all those vectors whose inner product with x_0"... but "all those vectors at event x_0 whose inner product with itself "...
 
Oxymoron said:
The Lorentz inner product tells us the spacetime interval between two events right? If the inner product, g(v,x_0) is zero then the spacetime interval between the event v and x_0 is simultaneous. Therefore the light cone is the collection of events in spacetime which occur simultaneously to an observer at x_0.
There may be some confusion here. The arguments of the metric are vectors. So, when you write g(v,x_0), then v and x_0 are vectors. However, x_0 is a point (event)... unless you somehow want to use something like a so-called position vector in spacetime... however, you now have to specify an origin of spacetime position... but then your inner product g(v,x_0) depends on that choice of origin... probably not what you want.

"That two events are simultaneous" is an observer dependent concept. Using your something similar to your notation g(v,x_0), I can clarify this.
At event x_0, let v be a unit-timelike vector (representing an observer's 4-velocity). Suppose there is an event x_1 such that the displacement vector \Delta x=x_1-x_0 has inner-product zero with v: g(v,x_1-x_0)=0... then x_1 and x_0 are simultaneous according to the observer with 4-velocity v. Looking at this in more detail, the 4-vector \Delta x=x_1-x_0 must be a spacelike vector... to observer-v, it is in fact a purely spatial vector... in other words, it is orthogonal to v.

So, "Therefore the light cone is the collection of events in spacetime which occur simultaneously to an observer at x_0" is NOT correct.
In Minkowski space, the future light cone traces out the set of events that can be reached by a light signal at the vertex event (what it broadcasts)... the past light cone traces out the events that reach the vertex event by a light signal (what it literally "sees").
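Here is a quick numerical illustration of that orthogonality (my own sketch - 1+1 dimensions, components (x, t), c = 1, boost speed 0.6 chosen arbitrarily):

Code:
import numpy as np

eta = np.diag([1.0, -1.0])                # (+,-) on components (x, t), c = 1

v = 0.6
gamma = 1 / np.sqrt(1 - v**2)
u = gamma * np.array([v, 1.0])            # 4-velocity of an observer moving at speed v

print(u @ eta @ u)                        # -1.0: unit timelike
dx = np.array([1.0, v])                   # a displacement between two events
print(u @ eta @ dx)                       # 0.0: simultaneous according to THIS observer
print(dx @ eta @ dx)                      # 1 - v^2 > 0: spacelike, as it must be

Note that for a different boost speed the same displacement dx is no longer orthogonal to u - simultaneity is observer-dependent.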
Oxymoron said:
If I spontaneously create a photon and call the event x_0 then the photon radiates outward in all spatial directions. I should be able to construct a light cone whose point is at x_0.
I would probably say "a flash of light"..."sending out many photons outward
in all spatial directions".
Oxymoron said:
All the events inside the future light cone of the event happen AFTER the event - there is a strict time ordering of events inside the future cone. We can even say that the event x_0 can cause things to happen, and if it does the new event must be within the future cone. So events which are caused by x_0 must be reachable by information at speeds less than c. We then say that these new events are caused by x_0 and are timelike vectors.
I would say "can be influenced by x_0"... since an event P can be influenced by many events (not just x_0) in the past light cone of P.
Oxymoron said:
If the new events inside the future light cone are all timelike then their Lorentz inner product must be negative. But what does this mean? If the Lorentz inner product is meant to tell us the spacetime interval between two events, and this interval is negative, does it mean that the new event (caused by x_0 and thus inside the future light cone) is further away from x_0 in time than in space?
Again... you need to distinguish points (events) from vectors.
 
There may be some confusion here. The arguments of the metric are vectors. So, when you write g(v,x_0), then v and x_0 are vectors. However, x_0 is a point (event)... unless you somehow want to use something like a so-called position vector in spacetime... however, you now have to specify an origin of spacetime position... but then your inner product g(v,x_0) depends on that choice of origin... probably not what you want.

So we must treat events differently from these timelike and spacelike vectors? Or is it simply that the two do not combine properly in the metric? I am confused about why the event x_0 is not a vector in Minkowski space... wait, maybe I see it. Elements of Minkowski space are events; vectors in Minkowski space are not events, right?

"That two events are simultaneous" is an observer-dependent concept. Using something similar to your notation g(v,x_0), I can clarify this.
At event x_0, let v be a unit-timelike vector (representing an observer's 4-velocity). Suppose there is an event x_1 such that the displacement vector \Delta x=x_1-x_0 has inner-product zero with v: g(v,x_1-x_0)=0... then x_1 and x_0 are simultaneous according to the observer with 4-velocity v. Looking at this in more detail, the 4-vector \Delta x=x_1-x_0 must be a spacelike vector... to observer-v, it is in fact a purely spatial vector... in other words, it is orthogonal to v.

So, "Therefore the light cone is the collection of events in spacetime which occur simultaneously to an observer at " is NOT correct.
In Minkowski space, the future light cone traces out the set of events that can be reached by a light signal at the vertex event (what it broadcasts)... the past light cone traces out the events that reach the vertex event by a light signal (what it literally "sees").

Is v inside the future light cone? Surely it must be outside the cone? In which case, no matter what the observer's 4-velocity is, so long as it is spacelike, he will see x_0 and x_1 as simultaneous.

At this point I was about to ask "What if the observer was inside the cone, whose 4-velocity was purely timelike. Would the two events appear simultaneous now?". Surely, \bold{g}(v,\Delta x) is still zero, so the events are on the surface of the light cone, but now the observer must see one event happen AFTER the other. If this is right, could you explain?


One extra question. What is \eta_{ab}, the thing which can be 1, -1, or 0 depending on a and b?
 
  • #10
Oxymoron said:
Elements of Minkowski space are events; vectors in Minkowski space are not events, right?

Correct. Events are points in the Minkowski space, vectors (can) connect events.
Oxymoron said:
Is v inside the future light cone? Surely it must be outside the cone? In which case, no matter what the observer's 4-velocity is, so long as it is spacelike, he will see x_0 and x_1 as simultaneous.

v must be in the light cone, otherwise the velocity would be greater than that of light. In order for two events to be seen simultaneously they must have a purely spatial separation in the observer's reference frame; as robphy said, the vector connecting the events must be normal to the world line (the velocity vector) of the observer. If we restrict ourselves to 1D motion there will only be one velocity which will see two spacelike separated events as simultaneous. Events which are not spacelike separated cannot be seen as simultaneous by any observer.

Oxymoron said:
At this point I was about to ask "What if the observer was inside the cone, whose 4-velocity was purely timelike. Would the two events appear simultaneous now?". Surely, \bold{g}(v,\Delta x) is still zero, so the events are on the surface of the light cone, but now the observer must see one event happen AFTER the other. If this is right, could you explain?

I don't see how \bold{g}(v,\Delta x) could be zero if the two events are on the lightcone. \bold{g}(\Delta x,\Delta x) would be zero since they're lightlike separated, but v is a different vector.

Oxymoron said:
One extra question. What is \eta_{ab}, the thing which can be 1, -1, or 0 depending on a and b?

\eta_{\mu \nu} is the metric of flat Minkowski space; it's diagonal, with all 1's in the diagonal slots except for the -1 in the time position (either the upper left or bottom right, depending on which book you're looking at). Note that by convention Greek indices imply a range over all four dimensions whereas Roman indices imply a range over only the spatial dimensions.
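For concreteness, here's a throwaway sketch (mine, not from any book) showing the two placements of the -1; the interval comes out the same as long as the components are ordered to match:

Code:
import numpy as np

# Two common conventions for flat eta_{mu nu} (c = 1):
eta_tlast  = np.diag([1.0, 1.0, 1.0, -1.0])    # time in the bottom-right slot
eta_tfirst = np.diag([-1.0, 1.0, 1.0, 1.0])    # time in the upper-left slot

d_tlast  = np.array([1.0, 0.0, 0.0, 2.0])      # components ordered (x, y, z, t)
d_tfirst = np.array([2.0, 1.0, 0.0, 0.0])      # the same displacement as (t, x, y, z)

print(d_tlast @ eta_tlast @ d_tlast)           # -3.0
print(d_tfirst @ eta_tfirst @ d_tfirst)        # -3.0: identical either way

(Some books also flip the overall sign, (+,-,-,-) instead of (-,+,+,+); then every interval changes sign, but the timelike/spacelike/null classification is unaffected.)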
 
  • #11
Hmmm, I am still confused.:redface:

If we have an event x_0 from which we construct a light cone then future events which occur on the surface of the light cone have Lorentz inner product zero. Future events which are inside the cone are connected by time-like vectors, and those outside are connected by space-like vectors.

Right so far? (probably not)

My question here is: how do I define an observer? Can he be anywhere - as in, inside the cone, outside, at the event? I mean, could an observer be at the very location of the event at the time it occurs? Or could he be in the event's future - perhaps on the surface of the light cone or inside it or outside it.
 
  • #12
Oxymoron said:
Is v inside the future light cone? Surely it must be outside the cone? In which case, no matter what the observer's 4-velocity is, so long as it is spacelike, he will see x_0 and x_1 as simultaneous.
An observer's 4-velocity v is a unit-timelike vector tangent to the observer's worldline. It points into the interior region enclosed by the future light cone (the "chronological future" of the vertex event). One can roughly interpret v as one unit of time along that observer's worldline. So, the 4-velocity v is never spacelike [and never null].
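A quick check of my own, in the spirit of the earlier sketches (c = 1, (+,+,+,-) with time last): whatever the ordinary velocity with |v| < 1, the resulting 4-velocity has square norm -1.

Code:
import numpy as np

eta = np.diag([1.0, 1.0, 1.0, -1.0])      # (+,+,+,-), time as the 4th slot, c = 1

def four_velocity(v3):
    v3 = np.asarray(v3, float)
    gamma = 1 / np.sqrt(1 - v3 @ v3)      # requires |v| < 1
    return gamma * np.append(v3, 1.0)     # u = gamma (v_x, v_y, v_z, 1)

for v3 in ([0, 0, 0], [0.6, 0, 0], [0.5, 0.5, 0.5]):
    u = four_velocity(v3)
    print(u @ eta @ u)                    # -1.0 every time (up to rounding)

The positive time component gamma >= 1 is what puts u inside the future cone rather than the past one.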
 
  • #13
Posted by Dicerandom

v must be in the light cone, otherwise the velocity would be greater than that of light.

So that is why the null vectors are sometimes called lightlike vectors! Because if v were on the surface then v = c, and all of a sudden every event in the future is simultaneous to that observer. And v cannot be outside the cone because then v > c.
 
  • #14
If I cause an event x_0 and I am the observer at that point, can I safely say that there is a future event x_1 which lies on the light cone of the original event? To me, if there is such an event and x_0 causes that event to occur, then my velocity must be c. Since my 4-vector is timelike, if I observe x_0 causing x_1 (which is on the surface) then my velocity is c.
 
  • #15
Oxymoron said:
Hmmm, I am still confused.:redface:
If we have an event x_0 from which we construct a light cone then future events which occur on the surface of the light cone have Lorentz inner product zero. Future events which are inside the cone are connected by time-like vectors, and those outside are connected by space-like vectors.
Right so far? (probably not)

Looks good so far :smile:

Oxymoron said:
My question here is: how do I define an observer? Can he be anywhere - as in, inside the cone, outside, at the event? I mean, could an observer be at the very location of the event at the time it occurs? Or could he be in the event's future - perhaps on the surface of the light cone or inside it or outside it.

The observer can be anywhere, yes. However what we generally do is define a worldline for an observer, i.e. a path that the observer will follow through spacetime. In simple cases the observer has constant velocity and it's just a straight line; however, in more complicated situations the observer can undergo accelerations and the worldline can be curved. At any point along the worldline we define what is called the MCRF (Momentarily Comoving Reference Frame), which is a reference frame that is moving with uniform velocity equal to the instantaneous velocity of the observer. The observer's velocity vector is then the unit vector which points along the time axis of this reference frame, i.e. it is a vector of unit length which is tangent to the observer's worldline.
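As a concrete (hand-rolled) example of that unit tangent, take the standard uniformly accelerated worldline x(tau) = cosh(a tau)/a, t(tau) = sinh(a tau)/a, with c = 1 and an illustrative a = 1; differentiating with respect to proper time tau gives a tangent that is unit timelike everywhere:

Code:
import numpy as np

a = 1.0                                   # proper acceleration (illustrative value)
eta = np.diag([1.0, -1.0])                # (+,-) on components (x, t), c = 1

for tau in np.linspace(-2.0, 2.0, 5):
    # u = d/dtau of (cosh(a tau)/a, sinh(a tau)/a):
    u = np.array([np.sinh(a * tau), np.cosh(a * tau)])
    print(u @ eta @ u)                    # -1.0 at every point: the MCRF time axis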

Oxymoron said:
So that is why the null vectors are sometimes called lightlike vectors! Because if v were on the surface then v = c, and all of a sudden every event in the future is simultaneous to that observer. And v cannot be outside the cone because then v > c.

Right :smile: I'd be careful about saying what happens when v=c though, technically the theory doesn't extend to that point but if you look at the limiting behavior as v->c that is how it seems things would be.

Oxymoron said:
If I cause an event x_0 and I am the observer at that point, can I safely say that there is a future event x_1 which lies on the light cone of the original event? To me, if there is such an event and x_0 causes that event to occur, then my velocity must be c. Since my 4-vector is timelike, if I observe x_0 causing x_1 (which is on the surface) then my velocity is c.

There are in fact an infinite number of such events. Suppose that you're out in space with a flashlight: you turn your flashlight on, and some time later a friend who is some distance away sees that you turned your light on. You turning your light on would be your event x_0 and your friend seeing the light would be an event x_1 which lies on the lightcone from x_0, yet you didn't have to move anywhere.
 
  • #16
Posted by Dicerandom

There are in fact an infinite number of such events. Suppose that you're out in space with a flashlight: you turn your flashlight on, and some time later a friend who is some distance away sees that you turned your light on. You turning your light on would be your event x_0 and your friend seeing the light would be an event x_1 which lies on the lightcone from x_0, yet you didn't have to move anywhere.

...if you were moving at v = c, would you know instantly that your friend saw your light flash - instead of some time later? To me it seems that the time it takes for information to get around inside the light cone depends on the observer's velocity too?

Also, just say that you were that friend, waiting for me to flash the light. At time t=0 you must clearly be outside the cone. Is this the reason why the light cone is hard to picture - because in 3 dimensions you don't realize that the cone itself is 'expanding', that is, the expansion in the time dimension is compressed? After all, an observer who at t = t_1 observes the initial event (and is therefore at this time inside the cone) is outside the cone initially.
 
  • #17
Oxymoron said:
...if you were moving at v = c, would you know instantly that your friend saw your light flash - instead of some time later? To me it seems that the time it takes for information to get around inside the light cone depends on the observer's velocity too?
You can't travel at v=c, and things that do move at c (like photons) do not have their own reference frame in relativity, so you can't say what things will look like from their point of view.
Oxymoron said:
Also, just say that you were that friend, waiting for me to flash the light. At time t=0 you must clearly be outside the cone. Is this the reason why the light cone is hard to picture - because in 3 dimensions you don't realize that the cone itself is 'expanding', that is, the expansion in the time dimension is compressed?
It's easier to visualize if you drop the number of dimensions by 1, so you have a 2D space and one time dimension. If you represent time as the vertical dimension, then any horizontal slice through this 3D spacetime will give you all of 2D space at a single instant in time. When an event happens, the light moves outward in an expanding 2D circle, so with time as the third dimension this looks like a cone, with a horizontal slice through the cone being the circle that light from the event has reached at a given time (see the illustration in the Wikipedia article on light cones). Of course, in 3D space it would actually be an expanding sphere instead, and the "cone" would be a 4D one that we humans can't actually visualize.
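If you want to draw the 2+1-dimensional picture yourself, here is a short matplotlib sketch of my own (the cone surface is just the set radius = t, i.e. c = 1):

Code:
import numpy as np
import matplotlib.pyplot as plt

phi = np.linspace(0, 2 * np.pi, 60)
t = np.linspace(0, 1, 30)
P, T = np.meshgrid(phi, t)
X, Y = T * np.cos(P), T * np.sin(P)       # the circle of light has radius c*t, c = 1

ax = plt.figure().add_subplot(projection='3d')
ax.plot_surface(X, Y, T, alpha=0.5)       # future cone: the expanding circle
ax.plot_surface(X, Y, -T, alpha=0.5)      # past cone: the shrinking incoming circle
ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_zlabel('t')
plt.show()

Each horizontal slice t = const of the plotted surface is exactly the circle the light has reached at that instant.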
Oxymoron said:
After all, an observer who at t = t_1 observes the initial event (and is therefore at this time inside the cone) is outside the cone initially.
Yes, if you picture each observer's worldline as a vertical line in this 3D spacetime (or a slanted line, if an observer is moving through space), then only when the second observer's worldline enters the cone does he see the event; at earlier time slices the expanding circle of light from the event has not reached him.
 
  • #18
I was just wondering something, please forgive me as it's a little off-topic. Photons are lightlike. Now, this may be crazy, but what about those theoretical particles, tachyons, whose speeds are greater than c - do they behave as spacelike vectors? Also, if a tachyon did exist, would its speed be infinite, not just "greater than c"?

I'll get back on topic next post.
 
  • #19
Can somebody explain the need for contravariant tensors and covariant tensors? I mean, at first glance, the only difference I see between the two is that the indices are swapped around. Does it actually make a physical difference whether the indices are subscripted or superscripted?
 
  • #20
Oxymoron said:
Can somebody explain the need for contravariant tensors and covariant tensors? I mean, at first glance, the only difference I see between the two is that the indices are swapped around. Does it actually make a physical difference whether the indices are subscripted or superscripted?
Many physical quantities are naturally described ("born", if you will) by (say) contravariant tensors... many others by covariant... and the rest, mixed. At an abstract level, it is usually the geometry of the mathematical model of the physical quantity that dictates the type.

For example, the unit-4-velocity is a vector [a contravariant tensor] tangent to the worldline. The electromagnetic field is a 2-form [a totally antisymmetric covariant tensor]...which can be written as the curl of a potential. When there is a nondegenerate metric around, one can do index gymnastics and raise and lower indices... however, one should really be aware of the natural description of the quantity... or else its physical meaning could be obscured in all of the shuffling.

In Euclidean space, the simplicity of the metric and volume-form can sometimes blur the distinction among various "directional quantities"... so that we get away with thinking of a lot of these quantities as simple "vectors". For example, the cross-product of two vectors is not really a vector...without the metric and the volume form. A physical example: the electric field and the magnetic field are not fundamentally contravariant vectors.
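The "gymnastics" themselves are mechanically trivial once a nondegenerate metric is given - a throwaway sketch of my own with the flat metric, (+,+,+,-) and c = 1:

Code:
import numpy as np

eta = np.diag([1.0, 1.0, 1.0, -1.0])      # metric: lowers indices
eta_inv = np.linalg.inv(eta)              # inverse metric: raises them

u = np.array([0.0, 0.0, 0.0, 1.0])        # contravariant components u^mu
u_low = eta @ u                           # u_mu = eta_{mu nu} u^nu
print(u_low)                              # [ 0.  0.  0. -1.]: the sign flip lives here
print(eta_inv @ u_low)                    # raising undoes lowering: back to u^mu

The point is that this shuffle is always available but not always meaningful - the natural type of the quantity is the physical datum.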
 
  • #21
Oxymoron said:
Can somebody explain the need for contravariant tensors and covariant tensors? I mean, at first glance, the only difference I see between the two is that the indices are swapped around. Does it actually make a physical difference whether the indices are subscripted or superscripted?

Most of the time people prefer to work in orthonormal Cartesian coordinate systems. In these cases, there is no difference between covariant and contravariant tensors, because the metric is an identity matrix.

In some situations, however, one cannot use an identity matrix for the metric. Relativity is an example: since ds^2 = dx^2 + dy^2 + dz^2 - dt^2, because of the minus sign in front of the dt^2 the metric is not an identity matrix.

In these situations, one has to worry about covariant vs contravariant tensors. You can think of them as the machinery necessary to include the minus sign in front of the dt^2, or the machinery necessary to work in any coordinate system, including ones that are not orthonormal.

If you abstract out "coordinate choice" issues from "coordinate independent" issues, all the issues related to covariance and contravariance are related to coordinate choices. Thus G_{\mu\nu} = 8 \pi T_{\mu\nu} relates to the same physics as G^{\mu\nu} = 8 \pi T^{\mu\nu}; the covariance and contravariance issues are ultimately all related to the choice of coordinates.

In order to maintain comprehensibility, though, certain very strong conventions are used - for instance, the space-time coordinates of an event are always written superscripted. The subscripted space-time coordinates of an event are then determined from the superscripted coordinates by the machinery of the tensor transformations via the metric at that location. The value of the metric depends on some more coordinate choice issues.
 
  • #22
pervect said:
If you abstract out "coordinate choice" issues from "coordinate independent" issues, all the issues related to covariance and contravariance are related to coordinate choices. Thus G_{\mu\nu} = 8 \pi T_{\mu\nu} relates to the same physics as G^{\mu\nu} = 8 \pi T^{\mu\nu}; the covariance and contravariance issues are ultimately all related to the choice of coordinates.
Technically speaking, if your metric tensor happens to be degenerate (so that it has no inverse) [for example in the Newton-Cartan case] then you can't raise indices so easily. Of course, one could take for granted an invertible metric tensor as one often does... but it's probably a good idea not to take such things for granted. In my opinion, it's always a good idea to see the scaffolding to appreciate just what went into the construction of tensorial expressions.

From a measurement point of view... suppose you didn't know the metric (maybe you are not able to determine it right now)... then certain expressions that use the metric couldn't be determined... but those that didn't need it can be determined. So, one should know which expressions need the metric and which don't.

For example, one can formulate electrodynamics without the use of a metric
http://arxiv.org/abs/physics/9907046

Here's a reference for the Newton-Cartan formalism
p.44 of http://arxiv.org/abs/gr-qc/0506065 (note the comments on p. 47)

Here's some motivation for this general viewpoint:
http://www.ucolick.org/~burke/forms/draw.ps
http://arxiv.org/abs/gr-qc/9807044
http://journals.tubitak.gov.tr/physics/issues/fiz-99-23-5/fiz-23-5-7-9903-44.pdf

http://www.bgu.ac.il/~rsegev/Papers/JMP2002AIP.pdf gives a nice motivation in its opening paragraph: "Since one cannot assume that the metric tensor is known in advance, it would be preferable, at least from the theoretical point of view, to have a formulation of the theory that does not rely on the metric structure."

Indeed... in many approaches to quantum gravity, the metric isn't available. So, there is [at a deep level] some physics that distinguishes a natural-born tensorial expression from its index-raised-or-lowered analogue.
 
  • #23
Oxymoron said:
I'm trying to understand more about the foundations upon which general relativity lies.
The first thing I need to clarify is the space in which GR works. I've read that GR merges time and space into spacetime, ...
"Spacetime," the union of space and time, was Minkowski's idea which he revealed in 1908.
...which to my understanding means that, when considered separately, time and space are invariant.
I don't follow you here. What does that mean?
This spacetime must then be much like any other mathematical space, such as Euclidean space or Hilbert space, in that notions of distance must be defined.
No. Spacetime is a manifold and Hilbert space is a vector space. These are two different uses of the term "space."
In Euclidean space, \mathbb{R}^3, you have the Euclidean metric,
\Delta s^2 = x^2 - y^2
which obeys all the axioms of a metric and defines the distance between two points in the Euclidean space.
That is incorrect. The spatial distance between two events is defined in flat spacetime as
\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2
However, I have read that the metric in spacetime, the spacetime metric, is defined as
\Delta s^2 = \Delta x^2 + \Delta y^2 + \Delta z^2 - c^2 \Delta t^2
Yes. That is correct for Minkowski coordinates, i.e. an inertial frame of reference. It is not valid in general.
This spacetime metric raises some issues with me. If this is the metric of spacetime then points in spacetime must be represented as a 4-vector.
You're speaking of the 4-position X = (ct, x, y, z). This is the spacetime interval between two events, one of which is called the "origin" and assigned the 4-position X = (0, 0, 0, 0).
Are these points actually called events by physicists?
Yes. A point in spacetime is called an "event."
The Euclidean plane has structure, right, in the form of the Euclidean metric:
1. d(x,y) \geq 0
2. d(x,y) = 0 \Leftrightarrow x = y
3. d(x,y) = d(y,x)
4. d(x,z) \leq d(x,y) + d(y,z)
In a similar fashion, I am interested to know what the structure on spacetime is. Are there similar axioms for the spacetime metric?
Similar but not identical. The first two and the last are invalid in relativity.
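For instance (a small numerical sketch of my own, c = 1, using the interval above with time as the fourth component):

Code:
import numpy as np

def interval(e1, e2):
    # squared spacetime interval, components (x, y, z, t), c = 1
    d = np.asarray(e2, float) - np.asarray(e1, float)
    return d[0]**2 + d[1]**2 + d[2]**2 - d[3]**2

O = (0, 0, 0, 0)
print(interval(O, (0, 0, 0, 1)))   # -1.0 < 0: axiom 1 fails
print(interval(O, (1, 0, 0, 1)))   #  0.0 for two DISTINCT events: axiom 2 fails
# The triangle inequality (axiom 4) fails too; only symmetry (axiom 3) survives.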

Pete
 
  • #24
Thanks Pete for your input. That certainly cleared up some of my earlier issues.

Ok, unfortunately I am still struggling with the notion of indices here.

Take the Kronecker delta for example.

\delta_{ij} = \delta^{ij} = \delta^i_j

In Euclidean space with rectangular coordinates, the Kronecker delta may be written with its indices superscripted or subscripted - it doesn't make any difference. By the way, correct me if I am wrong with any of my assumptions. So, for example, \delta_{ij}x_ix_j = (x_1)^2 + 0 + 0 + 0 + (x_2)^2 + 0 + 0 + 0 + (x_3)^2 = x_ix_i, so the Kronecker delta just removes the non-diagonal terms. But what about, say, \delta^i_jx_ix_j? To me it seems as though it has the same effect regardless of whether the i or the j sits superscripted or subscripted. Is the Kronecker delta a tensor? Does changing the position of the indices ever change what it does? Is this too simple an example to illustrate the need for the different indices?

Now consider the rectangular coordinate system. Coordinates of a point are always denoted by

(x^1,x^2,\dots,x^n)

is there any reason why the subscripts are now replaced by superscripts when regarding tensors? For example, now the distance between two points in rectangular coordinates is

\sqrt{(x^1 - y^1)^2 + (x^2-y^2)^2 + \dots + (x^n - y^n)^2} = \sqrt{\delta_{ij}\Delta x^i \Delta x^j}

The way I see it is that normally \Delta x^i \Delta x^j is summed over all combinations of i and j up to n. But sticking in the Kronecker delta, we only sum when i=j, all other terms go to zero - which is handy when we are finding the distance.

Why have the indices superscripted now? Do they get in the way later on? Any reason?

My question here is, if we were to 'transform' the coordinate system into a slightly different rectangular coordinate system, say by changing the basis or something, would the distance be unchanged? Are there any requirements of the coordinate system which make this possible? I mean, if distance is preserved under a coordinate transformation, this is just like saying that an operator preserves the metric and hence is isometric. Is this the same thing?
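I convinced myself of the first part numerically, with a random orthogonal matrix standing in for the change of rectangular coordinates (my own throwaway sketch):

Code:
import numpy as np

rng = np.random.default_rng(0)
A, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # a random orthogonal matrix
x, y = rng.standard_normal(3), rng.standard_normal(3)

dx = x - y
print(dx @ dx)                    # delta_{ij} Dx^i Dx^j in the old coordinates
print((A @ dx) @ (A @ dx))        # the same number in the new coordinates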


The way that I am teaching myself tensors is to work with coordinate transformations. (Is this a good way to start?) Now I am at the point where I should be able to learn the difference between a contravariant and a covariant tensor. Say that I have a vector field \bold{V} defined on some subset of \mathbb{R}^n. So the elements of the vector field are vectors, which we know can be written as V^i with respect to some 'admissible' coordinate system. Each element V^i is a real-valued function. Now let me assume I have at hand two admissible coordinate systems, that is, I should be able to transform between the two without changing my metric, my method of prescribing distances. Now let the vector field be written in terms of its n components:

\bold{V} = (V^1,V^2,\dots,V^n).

Each component, V^i can of course be written as a real-valued function. Let's call each one T^i. Now since we have two coordinate systems let's express all this in the following way

For the (x^i) system we have

T^1,T^2,\dots, T^n

and for the (\bar{x}^i) system we have

\bar{T}^1,\bar{T}^2,\dots,\bar{T}^n

Now at this point I am faced with the following idea: that \bold{V} is itself a contravariant tensor of order one provided that its components T^i and \bar{T}^i relative to the respective coordinate systems obey a given law of transformation.

This means to me that the vector field was the tensor all along!
 
  • #25
Let's say that this transformation was

\bar{T}^i = T^r \frac{\partial\bar{x}^i}{\partial x^r}

In fact, it is this transformation which makes the vector field a contravariant tensor of rank one.

If instead the law of transformation was, say,

\bar{T}_i = T_r \frac{\partial x^r}{\partial\bar{x}^i}

Now the vector field is a covariant tensor of rank one.


My question is, what is the difference between the two transformation laws? I mean, the only visible difference is that some of the indices have been lowered and we are now differentiating with respect to a different coordinate system.

There is a method of transforming from one system to another. Then, if the transformation is bijective, we could transform back to the original via the inverse. Now, is this like the transformation laws here? I mean, the vector field is said to be a contravariant tensor if the first law holds - meaning we can transform to a different coordinate system. THEN, the vector field is a covariant tensor IF we can transform back?

I may be able to see why the indices are changing now. It is so you can tell which way the coordinate transformation is operating.
 
  • #26
Oxymoron said:
Let's say that this transformation was
\bar{T}^i = T^r \frac{\partial\bar{x}^i}{\partial x^r}
In fact, it is this transformation which makes the vector field a contravariant tensor of rank one.
If instead the law of transformation was, say,
\bar{T}_i = T_r \frac{\partial x^r}{\partial\bar{x}^i}
Now the vector field is a covariant tensor of rank one.
My question is, what is the difference between the two transformation laws? I mean, the only visible difference is that some of the indices have been lowered and we are now differentiating with respect to a different coordinate system.
There is a method of transforming from one system to another. Then, if the transformation is bijective, we could transform back to the original via the inverse. Now, is this like the transformation laws here? I mean, the vector field is said to be a contravariant tensor if the first law holds - meaning we can transform to a different coordinate system. THEN, the vector field is a covariant tensor IF we can transform back?
I may be able to see why the indices are changing now. It is so you can tell which way the coordinate transformation is operating.


In the covariant change formula you multiply by partials of the old coordinates with respect to the new ones. In the contravariant formula you multiply by partials of the new variables with respect to the old ones. (\frac{\partial x^{\mu}}{\partial x'^{\nu}}) and (\frac{\partial x'^{\mu}}{\partial x^{\nu}}) are inverse operations.
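A concrete check of the "inverse" statement, using Cartesian -> polar coordinates in the plane (a sketch of my own, with sympy):

Code:
import sympy as sp

x, y = sp.symbols('x y', positive=True)
r = sp.sqrt(x**2 + y**2)
th = sp.atan2(y, x)

# d(new)/d(old): this matrix appears in the contravariant law
J = sp.Matrix([[sp.diff(r, x), sp.diff(r, y)],
               [sp.diff(th, x), sp.diff(th, y)]])

# d(old)/d(new): this one appears in the covariant law
J_inv = sp.simplify(J.inv())

print(sp.simplify(J * J_inv))     # the identity matrix: the two laws undo each other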
 
  • #27
At a fundamental level, there are two types of quantities, which transform differently (oppositely).

These quantities are called vectors (aka contravariant vectors), and one-forms (aka covariant vectors).

There is a duality relationship between these quantities. A one-form is a linear map which takes a vector to a scalar.

If you review vector spaces, you should see some mention of "dual spaces". The dual of a vector space (defined by a linear mapping of the vector space to a scalar, as I mentioned above) is always a vector space of the same dimension as the original vector space. The interesting fact is that the dual of a dual recovers the original vector space, which is why the operation is named the way it is. (I'm not going to attempt to prove this interesting statement, but it's reasonably well known and you should be able to find a proof if you look for one.)

Tensors can be regarded as a multi-linear map from vectors and dual vectors to a scalar. (This is an alternative definition to defining them by their transformation properties.)

Take a look at baez's GR outline

http://math.ucr.edu/home/baez/gr/outline2.html

for more details as to how to approach tensors from a vector / one-form approach. (Baez calls the one-forms cotangent vectors.)

I'm used to being able to freely interconvert vectors and one-forms by means of the metric. I'll have to ponder robphy's remarks about the cases where this is not always possible. Meanwhile, in most situations a metric exists, and via the metric it is possible to convert vectors to one-forms, and vice versa.

The origin of the metric is the existence of the dot product of two vectors, a product that should give the squared "length" of a vector when the vector is "dotted" with itself. This dot product also commutes in most physical situations.

The dot product, A (dot) B, associates with every vector A a linear map from the vector B to a scalar by definition (since it maps two vectors to a scalar).

It also associates with every vector B a linear map from the vector A to a scalar by the same logic.

When the dot product commutes, these two maps are equivalent, and one simply says that the dot product associates a vector with a one-form (or a vector with a dual vector, a tangent vector with a cotangent vector, etc. etc.)
 
  • #28
If you review vector spaces, you should see some mention of "dual spaces". The dual of a vector space (defined by a linear mapping of the vector space to a scalar, as I mentioned above) is always a vector space of the same dimension as the original vector space. The interesting fact is that the dual of a dual recovers the original vector space, which is why the operation is named the way it is. (I'm not going to attempt to prove this interesting statement, but it's reasonably well known and you should be able to find a proof if you look for one.)

Tensors can be regarded as a multi-linear map from vectors and dual vectors to a scalar. (This is an alternative definition to defining them by their transformation properties.)

Ok, I should be able to understand this. Suppose we take N vector spaces over the reals: V_1, V_2, \dots, V_N. Now let's define a map which takes all N vector spaces to a single real number:

T:V_1 \times V_2 \times \dots \times V_N \rightarrow \mathbb{R}

This is a 'multi' linear functional. The collection of such multilinear functionals forms its own vector space:

V_1^* \otimes \dots \otimes V_N^*

At this stage what makes us able to identify every vector space V_i with its double dual, V_i^{**}? Because the tensor product of the vector spaces V_1,\dots,V_N, denoted by

V_1 \otimes V_2 \otimes \dots \otimes V_N

is a set of linear functionals which maps from V_1^* \otimes \dots \otimes V_N^* to \mathbb{R}.


Now when we speak of a tensor of type (r,s), what do the r and s mean? Well, I thought that a map

T:V_1^* \times V_2^* \times \dots \times V_r^* \times V_1 \times V_2 \times \dots \times V_s \rightarrow \mathbb{R}

is a tensor (is this right - do we call such multilinear functionals tensors?).

In this case the r in "a tensor of type (r,s)" is the number of dual vector spaces we map from (and s the number of vector spaces). So a tensor of type (0,1) is

T:V \rightarrow \mathbb{R}

and hence it is just a linear map, simple as that.

A tensor of type (2,2) would be

T:V_1^* \times V_2^* \times V_1 \times V_2 \rightarrow \mathbb{R}

which is some strange map for which I can't think of an application. But a tensor of type (0,2) would be a bilinear map.

Then if r=0 and s=n the tensor is a covariant tensor of rank n, and if r=n and s=0 the tensor is a contravariant tensor of rank n. So in this way a tensor is simply a collection of linear functionals, and being covariant means we map from the product of vector spaces while being contravariant means we map from the product of the dual vector spaces.
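To make this concrete for myself, a type (0,2) tensor is easy to code up - it is just a bilinear machine that eats two vectors (here I reuse the Lorentz inner product from earlier as the example):

Code:
import numpy as np

eta = np.diag([1.0, 1.0, 1.0, -1.0])

def g(v, w):
    """A (0,2) tensor: a bilinear map V x V -> R."""
    return v @ eta @ w

v = np.array([1.0, 2.0, 0.0, 1.0])
w = np.array([0.0, 1.0, 0.0, 3.0])
print(g(2 * v, w), 2 * g(v, w))             # equal: linear in the first slot
print(g(v + w, w), g(v, w) + g(w, w))       # equal: additive in the first slot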
 
  • #29
Suppose we take N vector spaces over the reals...

Not quite - you need only one vector space \mathbb{V} which has several different vectors v_i \in \mathbb{V}
 
  • #30
Not quite - you need only one vector space \mathbb{V} which has several different vectors v_i \in \mathbb{V}

Are you sure? In the book I am reading, the formulation of a tensor as a multilinear map requires several vector spaces V_1,\dots,V_N. Maybe I am wrong and I don't understand their version of 'vector space'.

From "A Course in Modern Mathematical Physics"

Let V_1,V_2,\dots,V_N be vector spaces over \mathbb{R}. A map

T: V_1 \times V_2 \times \dots \times V_N \rightarrow \mathbb{R}

is a multilinear map. Multilinear maps can be added and multiplied by scalars in the usual fashion and form a vector space, denoted

V_1^* \otimes V_2^* \otimes \dots \otimes V_N^*

called the tensor product of the dual spaces V_1^*,V_2^*,\dots,V_N^*.
 
  • #31
Oxymoron said:
Are you sure? In the book I am reading, the formulation of a tensor as a multilinear map requires several vector spaces V_1,\dots,V_N. Maybe I am wrong and I don't understand their version of 'vector space'.

These N vector spaces are isomorphic copies of a single vector space... for example, V_1 and V_2 must have the same dimensionality [which doesn't seem to be required according to what has been written].
 
  • #32
Oxymoron said:
Are you sure? In the book I am reading, the formulation of a tensor as a multilinear map requires several vector spaces V_1,\dots,V_N. Maybe I am wrong and I don't understand their version of 'vector space'.

Yes. The vectors v_i all live in the same vector space. In GR this is usually the tangent space of some manifold. So we start with a manifold (which we haven't defined in this thread, that's a whole topic in itself - but for an illustrative example, picture a general manifold to be a curved n-dimensional surface, and for a specific example imagine that we have the 2-d surface of some 3-d sphere).

Given the manifold, there is also some tangent space on this manifold (for the example above, just imagine a plane that's tangent to the sphere). This tangent space is the vector space V that all the vectors "live in". There is also a "dual space" V* with the same number of dimensions that the duals of the vectors live in.

A tensor is just the functional you described which maps a certain number of dual vectors and a certain other number of vectors to a scalar - but all the vectors v_i live in the same vector space V, and all the dual vectors v_j live in the same dual space V*.

This is really a very minor distinction, otherwise you seem to be on the right track. But since you seem to be a mathematician (or at least approaching the topic in the same way that a mathematician does), I thought I'd try to be extra precise.

I have to hope that I have not violated Born's dictum here, which is to never write more precisely than you can think - I usually take a more "physical" approach than the rather abstract approach I am taking here.
 
  • #33
robphy said:
These N vector spaces are isomorphic copies of a single vector space... for example, V_1 and V_2 must have the same dimensionality [which doesn't seem to be required according to what has been written].

Right - there is nothing in the general definition that says that the N vector spaces can't be N copies of the same vector space, but there also is nothing in the general definition that says that the N vector spaces have to be N copies of the same vector space.

Also, this definition only works for finite-dimensional vector spaces, like in relativity. For tensor products of infinite-dimensional vector spaces, which occur in quantum theory, a different definition is needed. (The two definitions agree for finite-dimensional vector spaces.)

How is the book "A Course in Modern Mathematical Physics"? I'm fairly sure that I will soon order it.

Regards,
George
 
  • #34
George Jones said:
Right - there is nothing in the general definition that says that the N vector spaces can't be N copies of the same vector space, but there also is nothing in the general definition that says that the N vector spaces have to be N copies of the same vector space.

Ah, yes... so when permitting distinct vector spaces, one would probably use different sets of indices... as is done with "soldering forms".
 
  • #35
Posted by pervect:

Yes. The vectors v_i all live in the same vector space. In GR this is usually the tangent space of some manifold. So we start with a manifold (which we haven't defined in this thread, that's a whole topic in itself - but for an illustrative example, picture a general manifold to be a curved n-dimensional surface, and for a specific example imagine that we have the 2-d surface of some 3-d sphere).

Given the manifold, there is also some tangent space on this manifold (for the example above, just imagine a plane that's tangent to the sphere). This tangent space is the vector space V that all the vectors "live in". There is also a "dual space" V* with the same number of dimensions that the duals of the vectors live in.

A tensor is just the functional you described which maps a certain number of dual vectors and a certain other number of vectors to a scalar - but all the vectors v_i live in the same vector space V, and all the dual vectors v_j live in the same dual space V*.

This is really a very minor distinction, otherwise you seem to be on the right track. But since you seem to be a mathematician (or at least approaching the topic in the same way that a mathematician does), I thought I'd try to be extra precise.

Perfect, exactly what I wanted to hear! Well written. BTW, you are correct, I am a mathematician - well at least I have just graduated from a Bachelor of Maths. Anyway, this description really helps.

Posted by George Jones:

How is the book "A Course in Modern Mathematical Physics"? I'm fairly sure that I will soon order it.

I ordered it over the internet about 3 weeks ago. It bridges the gap between undergraduate and graduate mathematical physics really well. I found it very well structured and written. It's about 600 pages and starts off with group theory and vector spaces. Then it moves into inner product spaces and algebras. Then it moves on to exterior algebra, which I found very interesting, and then has a chapter on tensors - introducing them in two different ways (which we have been discussing here) - and finishes with applications to special relativity. The second part of the book (the final 300 pages) starts with topology and measure theory and some work on distribution functions which sets the stage for applications to quantum theory. There is also a chapter on Hilbert spaces. Finally there are chapters on differential geometry, forms, and manifolds (which I haven't read yet), finishing with Riemannian curvature, connections, homology and a final chapter on Lie groups (all of which I haven't read). To sum up, it's my favourite book at the moment. Very well written.
 
  • #36
Ok, so I am pretty sure I understand what covariant and contravariant tensors are. A covariant tensor of type (0,2) is a map

T:V\times V \rightarrow \mathbb{R}

Now if you take, say, a two-dimensional vector space V which has two basis vectors \{e_1,e_2\}, then the dual vector space V^*, which (as robphy and pervect pointed out) has the same dimension as V, also has two basis vectors. Now am I correct in assuming that the basis vectors of the dual space are written in Greek? As in, \{\epsilon_1,\epsilon_2\}?

So a vector \omega \in V^* may be written as a sum over its basis vectors: \omega = \alpha\epsilon_1 + \beta\epsilon_2?

Extending this idea to n-dimensional vector spaces we have that e_1,e_2,\dots,e_n is a basis for V and \epsilon_1,\epsilon_2,\dots,\epsilon_n is a basis for V^*.

As we have already discussed, I assume that when writing, say, the product of the two basis vectors, (\epsilon)(e), with the indices included we would write

\epsilon^i(e_j)

So I would write the i index superscripted because the \epsilon basis vector came from the dual vector space, and the j index is subscripted because e_j came from the vector space. Is this the reason for superscripting and subscripting indices - to make a distinction about which space we are in? Because after all, they are by no means identical bases, even if the vector space and its dual are equal?

My last question for now is, why is the product

\langle \epsilon^i,e_j \rangle = \delta_j^i

equal to the Kronecker delta? The Kronecker delta equals 1 if the indices are the same, and zero if the indices are different. Let's say that the vector space V has n dimensions and the dual space V^* has m dimensions. Then,

\delta_j^i = 1 + 1 + \dots + 1 + 1^* + 1^* + \dots + 1^*

where there are n 1's in the first sum and m 1*'s in the second sum. Therefore

\delta_j^i = n + m = \dim(V) + \dim(V^*)

which should equal the product of the basis vectors, \epsilon^i and e_j. Could this be the reason?
 
  • #37
Oxymoron said:
Ok, so I am pretty sure I understand what covariant and contravariant tensors are. A covariant tensor of type (0,2) is a map
T:V\times V \rightarrow \mathbb{R}

Now if you take, say, a two dimensional vector space V which has two basis vectors \{e_1,e_2\}, then the dual vector space V^*, which (as robphy and pervect pointed out) has the same dimension as V, also has two basis vectors. Now am I correct in assuming that the basis vectors of the dual space are written in Greek? As in, \{\epsilon_1,\epsilon_2\}?

Usually basis one-forms are written as \{\omega^1, \omega^2 \} - a different Greek letter choice and, more importantly, superscripted rather than subscripted.

So a vector \omega \in V^* may be written as a sum of its basis components: \omega = \alpha\epsilon_1 + \beta\epsilon_2?

If you write out a vector as a linear sum of multiples of the basis vectors as you do above, it's traditional to write simply

x^i \, e_i. Repeating the index i implies a summation, i.e.

\sum_{i=1}^{n} x^i e_i
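
If it helps, the convention can be made concrete with a tiny numpy sketch (my own illustration; the names x and e are arbitrary):

import numpy as np

# Columns of e are the basis vectors e_1, e_2, e_3 of R^3.
e = np.eye(3)

# Components x^i of a vector in that basis.
x = np.array([2.0, -1.0, 5.0])

# x^i e_i written out as an explicit sum over i...
v_explicit = sum(x[i] * e[:, i] for i in range(3))

# ...and as a single contraction over the repeated index i.
v_einsum = np.einsum('i,ji->j', x, e)

assert np.allclose(v_explicit, v_einsum)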

Extending this idea to n-dimensional vector spaces we have that e_1,e_2,\dots,e_n is a basis for V and \epsilon_1,\epsilon_2,\dots,\epsilon_n is a basis for V^*.

If you write out a one-form in terms of the basis one-forms, it's
x_i \omega^i

Is this the reason for superscripting and subscripting indices - to make a distinction about which space we are in? Because after all, they are by no means identical bases, even if the vector space and its dual are equal?

Yes. It also leads to fairly intuitive tensor manipulation rules when you get used to the notation.

My last question for now is, why is the product
\langle \epsilon^i,e_j \rangle = \delta_j^i
equal to the Kronecker delta? The Kronecker delta equals 1 if the indices are the same, and zero if the indices are different. Let's say that the vector space V has n dimensions and the dual space V^* has m dimensions. Then,
\delta_j^i = 1 + 1 + \dots + 1 + 1^* + 1^* + \dots + 1^*
where there are n 1's in the first sum and m 1*'s in the second sum. Therefore
\delta_j^i = n + m = \dim(V) + \dim(V^*)
which should equal the product of the basis vectors, \epsilon^i and e_j. Could this be the reason?

In an orthonormal basis, e_i \cdot e_j = \delta_j^i. This is not true in a general basis, only in an orthonormal basis.

\omega^1(e_1) is just different notation for e_1 \cdot e_1 (once the metric is used to identify V* with V), so it will be unity only if the basis is normalized. Similarly, only when the basis vectors are orthogonal will \omega^i(e_j) be zero for i not equal to j.
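
To see the distinction concretely, here is a small numpy sketch of my own (nothing canonical about the numbers): for a dual basis defined by \omega^i(e_j) = \delta^i_j, the pairing gives the Kronecker delta in any basis, whereas the dot products e_i \cdot e_j reproduce it only when the basis is orthonormal.

import numpy as np

# Columns of E form a deliberately non-orthonormal basis e_1, e_2 of R^2.
E = np.array([[1.0, 1.0],
              [0.0, 2.0]])

# Row i of W represents the dual basis covector omega^i; requiring
# omega^i(e_j) = delta^i_j makes W the matrix inverse of E.
W = np.linalg.inv(E)

# The dual pairing is the identity (Kronecker delta) for ANY basis.
print(W @ E)        # [[1. 0.], [0. 1.]]

# The dot products e_i . e_j are not, since this basis is neither
# orthogonal nor normalized.
print(E.T @ E)      # [[1. 1.], [1. 5.]]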
 
Last edited:
  • #38
Posted by Pervect.

Usually basis one-forms are written as...

What is a one-form?
 
  • #39
Oxymoron said:
What is a one-form?
A 1-form is a linear mapping (i.e. a function) which maps vectors to scalars. If "a" is a 1-form, "B" a vector, and "s" the scalar, then the typical notation is

s = <a, B>

Pete
 
  • #40
Oxymoron said:
Because after all, they are by no means identical bases, even if the vector space and its dual are equal?

A vector space and its dual are not equal, but they have equal dimension. Any 2 vector spaces that have equal dimension are isomorphic, but without extra structure (like a metric), there is no natural basis independent isomorphism.

The Kronecker delta equals 1 if the indices are the same, and zero if the indices are different.

Yes.

Let's say that the vector space V has n dimensions and the dual space V^* has m dimensions. Then,
\delta_j^i = 1 + 1 + \dots + 1 + 1^* + 1^* + \dots + 1^*
where there are n 1's in the first sum and m 1*'s in the second sum. Therefore
\delta_j^i = n + m = \dim(V) + \dim(V^*)

Careful - this isn't true.

My last question for now is, why is the product
\langle \epsilon^i,e_j \rangle = \delta_j^i
equal to the Kronecker delta?

Given an n-dimensional vector space V, the dual space V* is defined as

V* = \left\{ f: V \rightarrow \mathbb{R} \,|\, f \text{ is linear} \right\}.

The action of any given linear mapping between vector spaces is pinned down by finding/defining its action on a basis for the vector space that is the domain of the mapping. A dual vector is a linear mapping from the vector space V to \mathbb{R}, so this is true in this case.

Let \left\{ e_{1}, \dots , e_{n} \right\} be a basis for V, and define \omega^{i} by: 1) \omega^{i} : V \rightarrow \mathbb{R} is linear; 2) \omega^{i} \left( e_{j} \right) = \delta^{i}_{j}. Now let v = v^{j} e_{j} (summation convention) be an arbitrary element of V. Then

\omega^{i} \left( v \right) = \omega^{i} \left( v^{j} e_{j} \right) = v^{j} \omega^{i} \left( e_{j} \right) = v^{j} \delta^{i}_{j} = v^{i}.

Consequently, \omega^{i} is clearly an element of V*, and \omega^{i} \left( e_{j} \right) = \delta^{i}_{j} by definition!
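
George's computation \omega^{i}(v) = v^{i} can also be checked numerically; a minimal sketch, assuming V = \mathbb{R}^2 and representing the dual basis by the rows of a matrix inverse (my own representation choice):

import numpy as np

# Columns of E are a basis e_1, e_2 of R^2; rows of W are the dual
# basis omega^1, omega^2, so that W @ E is the identity.
E = np.array([[1.0, 1.0],
              [0.0, 2.0]])
W = np.linalg.inv(E)

# A vector given by its components v^j in the basis: v = v^j e_j.
v_comps = np.array([3.0, -1.0])
v = E @ v_comps

# omega^i(v) returns exactly the i-th component v^i.
assert np.allclose(W @ v, v_comps)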

Exercise: prove that \left\{\omega^{1}, \dots , \omega^{n} \right\} is a basis for the vector space V*.

What is a one-form?

I like to make a distinction between a tensor and a tensor field. A tensor field (of a given type) on a differentiable manifold M is the smooth assignment of a tensor at each p \in M.

A one-form is a dual vector field. Note, however, that some references call a dual vector a one-form. See the thread "One-forms" (https://www.physicsforums.com/showthread.php?t=96073) in the Linear & Abstract Algebra forum. I tried to sum up the situation in posts #11 and #23.

Regards,
George
 
Last edited by a moderator:
  • #41
Posted by George Jones:

Careful - this isn't true.

I read what I wrote again and it does sound wrong. For one, how can the dimension of V differ from that of V^*, as I implied with my m and n? Still, I would like some clarification on exactly where it goes wrong.

Exercise: prove that \{\omega^1,\dots,\omega^n\} is a basis for the vector space V^*.

Well, the \omega^i are linearly independent, from what you wrote, namely that

\omega^i(e_j) = \delta^i_j

I'm not sure how to show that they span, though. I mean, I could probably do it, but I am not sure which vectors and bases to use.

Posted by George Jones:

A one-form is a dual vecor field. Note, however, that some references call a dual vector a one-form. See the thread "One-forms" in the Linear & Abstract Algebra forum. I tried to sum up the situation in posts #11 and #23.

I read that, and it makes sense, especially as I was almost up to that point. If you have a tensor of type (0,2) then it can be written as

T:V\times V \rightarrow \mathbb{R}

which is covariant. If \omega and \rho are elements of V^* (which means that they are simply linear functionals over V right?) then we can define their 'tensor' product as

\omega \otimes \rho (u,v) = \omega(u) \rho(v)

At first this was hard to get my head around. My first thought was that \omega \otimes \rho was being multiplied by (u,v) - so what was this (u,v) thing? But then I thought, this is just like

\phi(u,v) = \phi(u)\phi(v)

in group theory - it's just a mapping, where \omega \otimes \rho is the 'symbol' (playing the role of \phi) for the tensor product acting on two arguments from V \times V.
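
To convince myself, I tried a quick numpy sketch (all the numbers are arbitrary, just my own check that the definition behaves like an outer product):

import numpy as np

# Two covectors omega, rho on R^3, stored by their components.
omega = np.array([1.0, 2.0, 0.0])
rho   = np.array([3.0, -1.0, 4.0])

# Two vectors to feed into the (0,2) tensor omega (x) rho.
u = np.array([0.5, 1.0, 2.0])
v = np.array([1.0, 0.0, -1.0])

# The definition: (omega (x) rho)(u, v) = omega(u) * rho(v).
lhs = omega.dot(u) * rho.dot(v)

# Equivalently, the tensor has components T_ij = omega_i rho_j (an
# outer product), and T(u, v) = u^i T_ij v^j.
T = np.outer(omega, rho)
rhs = u @ T @ v

assert np.isclose(lhs, rhs)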

My next question at this stage is: how does one define a basis on V\times V? (Can I assume that the notation V^{(0,2)} means V \times V?)

Well, if V has dimension n, then V^{(0,2)} has dimension n^2. So let \epsilon^i be a basis of V, then

\epsilon^i \otimes \epsilon^j

forms a basis for V^{(0,2)}, yes?

Now, is \epsilon^i \otimes \epsilon^j a tensor? My main issue with dealing with these basis vectors is that I want to define the metric tensor next, and I am thinking that a sound understanding of how to define bases for these vector spaces and tensors is a logical stepping stone.
 
Last edited by a moderator:
  • #42
Oxymoron said:
My next question at this stage is: how does one define a basis on V\times V? (Can I assume that the notation V^{(0,2)} means V \times V?)

I've never seen that notation used, at least in physics.

Well, if V has dimension n, then V^{(0,2)} has dimension n^2. So let \epsilon^i be a basis of V, then
\epsilon^i \otimes \epsilon^j
forms a basis for V^{(0,2)}, yes?

In tensor notation we use subscripts for basis vectors, so we'd usually write that e_i is a basis of V (we would write \omega^i as a basis of V*).
Now, is \epsilon^i \otimes \epsilon^j a tensor?

e_i \otimes e_j is an element of V \otimes V, not a map from V \otimes V to a scalar.
 
Last edited:
  • #43
Oxymoron said:
Still, I would like some clarification on exactly where it goes wrong.

As you said,
\delta^{i}_{j} = \begin{cases} 0, & \text{if } i \neq j \\ 1, & \text{if } i = j \end{cases},
but \delta^{i}_{j} is not expressed as a sum. \delta^{i}_{j} can be used in sums, e.g.,
\sum_{i = 1}^{n} \delta^{i}_{j} = 1,
and
\sum_{i = 1}^{n} \delta^{i}_{i} = n.
I'm not sure how to show that they span, though. I mean, I could probably do it, but I am not sure which vectors and bases to use.

First, let me fill in the linear independence argument. \left\{\omega^{1}, \dots , \omega^{n} \right\} is linearly independent if 0 = c_{i} \omega^{i} implies that c_{i} = 0 for each i. The zero on the left is the zero function, i.e., 0(v) = 0 for all v \in V. Now let the equation take e_{j} as an argument:
0 = c_{i} \omega^i \left( e_{j} \right) = c_{i} \delta^{i}_{j} = c_{j}.
Since this is true for each j, \left\{\omega^{1}, \dots , \omega^{n} \right\} is a linearly independent set of covectors.

Now show spanning. Let f : V \rightarrow \mathbb{R} be linear. Define scalars f_{i} by f_{i} = f \left( e_{i} \right). Then
f \left( v \right) = f \left( v^{i} e_{i} \right) = v^{i} f \left( e_{i} \right) = v^{i} f_{i} .
Now show that f_{i} \omega^{i} = f:
f_{i} \omega^{i} \left( v \right) = f_{i} \omega^{i} \left( v^{j} e_{j} \right) = f_{i} v^{j} \omega^{i} \left( e_{j} \right) = f_{i} v^{j} \delta^{i}_{j} = f_{i} v^{i} = f \left( v \right).
Since this is true for every v, f_{i} \omega^{i} = f.
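
The spanning argument can be checked numerically too; a sketch of my own, assuming V = \mathbb{R}^3 with a basis stored as the columns of a matrix:

import numpy as np

rng = np.random.default_rng(0)

# A non-orthonormal basis of R^3 (columns) and its dual basis (rows).
E = np.array([[1.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 3.0]])
W = np.linalg.inv(E)          # row i of W is omega^i

# An arbitrary linear functional f, acting as f(v) = f_row . v.
f_row = rng.normal(size=3)

# The scalars f_i = f(e_i), exactly as in the proof.
f_i = np.array([f_row.dot(E[:, j]) for j in range(3)])

# Assemble f_i omega^i as a single covector...
f_rebuilt = f_i @ W

# ...and check that it agrees with f on an arbitrary vector v.
v = rng.normal(size=3)
assert np.isclose(f_rebuilt.dot(v), f_row.dot(v))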

it's just a mapping, where \omega \otimes \rho is the 'symbol' (playing the role of \phi) for the tensor product acting on two arguments from V \times V.

Exactly!

Becoming used to the abstractness of this approach takes a bit of time and effort.

My next question at this stage is: how does one define a basis on V\times V? (Can I assume that the notation V^{(0,2)} means V \times V?)

I think you mean V* \otimes V*. V \times V is the set of ordered pairs where each element of the ordered pair comes from V. As a vector space, this is the external direct sum of V with itself. What you want is the space of all (0,2) tensors, i.e.,
V* \otimes V* = \left\{ T: V \times V \rightarrow \mathbb{R} \,|\, T \text{ is bilinear} \right\}.
Well, if V has dimension n, then V^{(0,2)} has dimension n^2. So let \epsilon^i be a basis of V, then
\epsilon^i \otimes \epsilon^j
forms a basis for V^{(0,2)}, yes?

Yes.

Now, is \epsilon^i \otimes \epsilon^j a tensor?

Yes, \epsilon^i \otimes \epsilon^j is, for each i and j, an element of V* \otimes V*. \epsilon^i and \epsilon^j are specific examples of your \omega and \rho.
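
To make the n^2 basis tensors concrete, a small sketch of my own: the numbers T_{ij} = T(e_i, e_j) are exactly the n^2 coefficients in the expansion T = T_{ij}\, \epsilon^i \otimes \epsilon^j.

import numpy as np

# Any (0,2) tensor on R^2 is fixed by a 2x2 matrix of components.
M = np.array([[2.0, 1.0],
              [0.0, -3.0]])

def T(u, v):
    # The bilinear map T(u, v) = u^i M_ij v^j.
    return u @ M @ v

# Feeding in the basis vectors recovers the components T_ij = T(e_i, e_j).
e = np.eye(2)
comps = np.array([[T(e[:, i], e[:, j]) for j in range(2)]
                  for i in range(2)])
assert np.allclose(comps, M)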

Regards,
George
 
Last edited:
  • #44
Thanks George and Pervect. Your answers helped me a lot. So much, in fact, that I have no further queries on that. Which is good.

But now I want to move along and talk about the metric tensor. Is a metric tensor similar to metric functions in, say, topology or analysis? Do they practically do the same thing? That is, define a notion of distance between two objects? Or are they something completely abstract?

I had a quick look over the metric tensor and there seems to be several ways of writing it. The first method was to introduce an inner product space. Then to define a functional as

g\,:\,V\times V \rightarrow \mathbb{R}

defined by

g(u,v) = u\cdot v

As we have already discussed, this is a bilinear covariant tensor of degree two. Now this is not the general metric tensor I have read about; instead it is the metric tensor of the inner product. Are the two different? Or is the metric tensor always intertwined with some sort of inner product?

I understand that to introduce the idea of a metric we need some mathematical tool which represents distance. In this case the inner product usually represents the 'length' of an element. Is this the reason for introducing the metric tensor like this? Could you go further and, instead of an inner product, simply define the metric tensor via an arc length or something more general like that?
 
  • #45
What is an inner product? I ask this because I want to compare and contrast whatever answer you give with a "metric" tensor.

Regards,
George
 
  • #46
In my understanding, to have an inner product you need a vector space V over the field \mathbb{R}. Then the inner product on the vector space is a bilinear mapping:

g\,:\,V\times V \rightarrow \mathbb{R}

which is symmetric, distributive, and definite.
 
  • #47
So, isn't an inner product on a vector space V a (0,2) tensor, i.e., an element of V* \otimes V*?

Too exhausted to say more - took my niece and nephew skating (their first time; 3 and 4 years old), and pulling them around the ice completely drained me.

Regards,
George
 
Last edited:
  • #48
Well, I think of distances as being quadratic forms, and quadratic forms are in one-to-one correspondence with symmetric bilinear forms.

http://mathworld.wolfram.com/SymmetricBilinearForm.html

the definition of which leads you directly to your definition M : V \otimes V \rightarrow \mathbb{R}, except for the requirement of symmetry.

There's probably something deep to say about symmetry, but I'm not quite sure what it is. In GR we can think of the metric tensor as always being symmetric, so if you accept the symmetry as being a requirement, you can go directly from quadratic forms to symmetric bilinear forms.

Of course you have to start with the assumption that distances are quadratic forms; I'm not sure how to justify something that fundamental offhand.

[add]
I just read there may be a very small difficulty with the above argument, see for instance

http://www.answers.com/topic/quadratic-form
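
For what it's worth, the passage back from the quadratic form Q(v) = g(v,v) to the symmetric bilinear form (polarization) is easy to check numerically; a sketch with a symmetric matrix g of my own choosing:

import numpy as np

# A symmetric bilinear form on R^3, represented by a symmetric matrix.
g = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, -1.0],
              [0.0, -1.0, 1.0]])

def Q(v):
    # The associated quadratic form Q(v) = g(v, v).
    return v @ g @ v

# Polarization: g(u, v) = (Q(u + v) - Q(u) - Q(v)) / 2.
rng = np.random.default_rng(1)
u, v = rng.normal(size=3), rng.normal(size=3)
assert np.isclose((Q(u + v) - Q(u) - Q(v)) / 2, u @ g @ v)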
 
Last edited:
  • #49
The metric tensor helps us "lower" or "raise" indices, thus allowing us to make scalars out of tensors. For example, say we want a scalar out of two rank 1 tensors A^\mu, B^\nu. We can go for

g_{\mu\nu}A^\mu B^\nu.

This is usually the inner product between A and B.

EDIT: The metric has other important functions too.
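
A minimal numpy sketch of this contraction, using the flat Minkowski metric with signature (-,+,+,+) (the signature and the components of A and B are just my choices):

import numpy as np

# Flat Minkowski metric, signature (-, +, +, +).
g = np.diag([-1.0, 1.0, 1.0, 1.0])

A = np.array([2.0, 1.0, 0.0, 0.0])   # components A^mu
B = np.array([3.0, 0.0, 1.0, 0.0])   # components B^nu

# The scalar g_{mu nu} A^mu B^nu as an explicit double contraction.
s = np.einsum('mn,m,n->', g, A, B)
print(s)                              # -(2)(3) + 0 + 0 + 0 = -6

# Lowering an index: A_mu = g_{mu nu} A^nu, after which a plain sum
# over the repeated index gives the same scalar.
A_lower = np.einsum('mn,n->m', g, A)
print(A_lower @ B)                    # -6 again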
 
Last edited:
  • #50
Hi,

It seems nobody has yet answered one particular part of your original question, which is whether the geometry of spacetime can also be given a distance formulation. It can (in the achronal case), and here are the axioms:
(a) d(x,y) >= 0 and d(x,x) = 0
(b) d(x,y) > 0 implies that d(y,x) = 0.
(c) d(x,z) >= d(x,y) + d(y,z) if d(x,y)d(y,z) > 0

Notice that d can also take the value infinity. d gives a partial order, defined by x < y if and only if d(x,y) > 0, as you can easily verify. There is an old approach to general relativity based upon (a suitably differentiable and causally stable) d: the world function formulation of Synge.
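
These axioms can be played with numerically. A sketch in 1+1 Minkowski space, under my own reading of the setup: take d(x,y) to be the proper time from x to y when y lies in the causal future of x, and 0 otherwise (units with c = 1):

import math

def d(p, q):
    # p, q are events (t, x); d is the proper time from p to q if q is
    # in the causal future of p, and 0 otherwise.
    dt, dx = q[0] - p[0], q[1] - p[1]
    tau2 = dt * dt - dx * dx
    return math.sqrt(tau2) if (dt > 0 and tau2 >= 0) else 0.0

p, q, r = (0.0, 0.0), (2.0, 1.0), (5.0, 1.5)

print(d(p, q), d(q, p))   # axiom (b): d(p, q) > 0 forces d(q, p) = 0
# Axiom (c): the reversed triangle inequality when both legs are timelike.
assert d(p, r) >= d(p, q) + d(q, r)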

Cheers,

Careful
 