I Confusion about index notation and operations of GR

Put1demerde
Hello,
I am an undergrad currently trying to understand General Relativity. I am reading Sean Carroll's Spacetime and Geometry, and I understand the physics (to a certain degree), but I am having trouble understanding the notation as well as the ideas behind tensors, dual vectors, and the mathematics underlying them. Mainly, I cannot visualize the mathematical operations performed on tensors. Are there any good references I could look into that simply explain the notation? I think once I get this the pace should increase, but for now I'm stuck.
Thanks!
 
Put1demerde said:
Hello,
I am an undergrad currently trying to understand General Relativity. I am reading Sean Carroll's Spacetime and Geometry, and I understand the physics (to a certain degree), but I am having trouble understanding the notation as well as the ideas behind tensors, dual vectors, and the mathematics underlying them. Mainly, I cannot visualize the mathematical operations performed on tensors. Are there any good references I could look into that simply explain the notation? I think once I get this the pace should increase, but for now I'm stuck.
Thanks!
If you don’t have a very firm grasp on linear algebra and vector calculus, I’d recommend starting with “Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach” by the Hubbards. After that, “Tensor Calculus” (Schaum’s Outline) by David Kay is great if you’re like me and you need a lot of examples to work through to really get a good grasp on the concepts. After that, you should be ready to grasp most of the concepts in many GR books.
 
Put1demerde said:
Hello,
I am an undergrad currently trying to understand General Relativity. I am reading Sean Carroll's Spacetime and Geometry, and I understand the physics (to a certain degree), but I am having trouble understanding the notation as well as the ideas behind tensors, dual vectors, and the mathematics underlying them. Mainly, I cannot visualize the mathematical operations performed on tensors. Are there any good references I could look into that simply explain the notation? I think once I get this the pace should increase, but for now I'm stuck.
Thanks!

I'm not sure if this helps, but you can visualize vectors as little arrows, and their duals as "stacks of plates". The former idea, vectors as little arrows, is probably pretty familiar. The latter is a bit unusual, but the point is that the number of plates in the stack that the little arrow pierces is a scalar; this mimics the way that dual vectors are maps from vectors to scalars. The stack of plates defines a map from the little arrows (vectors) to a number (the number of plates the arrow pierces).

This particular visual aid is described in MTW's big black book, "Gravitation". <<amazon link>>.

Some of the other notation in MTW might be confusing though, the semicolon notation in particular. There may be other books or sources that use the "stacks of plates" idea.

A toy example might help. Suppose you have a 3-d hill, and a 2-d contour map of the hill. The "stacks of plates" idea is like the spacing of the lines of the contour map. Vectors, as little arrows, are displacements. Be warned, the idea of vectors as displacements is dangerous in general, so this may not be the ideal example. Still, it might give you the push you need to appreciate the graphical symbolism of vectors as arrows and their duals as stacks of plates. The mathematical point here is that the gradient of the hill, ##\nabla_a##, is an example of a dual vector, because it's subscripted, making it different from the vectors, which are superscripted.

With an understanding of vectors and dual vectors as geometric objects, the arrows and plate stacks, one next needs to know that the metric tensor can convert vectors into their duals, and the inverse metric can convert dual vectors into vectors. Put another way, they raise and lower indices.
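A quick numerical sketch of that last point may help (the metric and components here are made up purely for illustration): lowering an index is nothing more than a matrix-vector contraction with the metric components.

```python
# Illustrative sketch: lowering an index with the flat (Minkowski) metric,
# signature (-,+,+,+).  v_a = eta_ab u^b, summed over b.

eta = [[-1, 0, 0, 0],
       [ 0, 1, 0, 0],
       [ 0, 0, 1, 0],
       [ 0, 0, 0, 1]]

u_up = [2, 1, 0, 3]   # contravariant components u^b (made up)

# Lower the index: v_a = sum over b of eta_ab u^b
v_down = [sum(eta[a][b] * u_up[b] for b in range(4)) for a in range(4)]
print(v_down)  # [-2, 1, 0, 3]
```

Only the time component changes sign here, because this particular metric is diagonal with entries (-1, 1, 1, 1); a general metric would mix the components.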

I'm going to stop here for now, and see how you react. Anyway, MTW has its own notational issues, but it can offer some nice insights, too.
 
pervect said:
I'm not sure if this helps, but you can visualize vectors as little arrows, and their duals as "stacks of plates". The former idea, vectors as little arrows, is probably pretty familiar. The latter is a bit unusual, but the point is that the number of plates in the stack that the little arrow pierces is a scalar; this mimics the way that dual vectors are maps from vectors to scalars. The stack of plates defines a map from the little arrows (vectors) to a number (the number of plates the arrow pierces).

This particular visual aid is described in MTW's big black book, "Gravitation". <<amazon link>>.

Some of the other notation in MTW might be confusing though, the semicolon notation in particular. There may be other books or sources that use the "stacks of plates" idea.

A toy example might help. Suppose you have a 3-d hill, and a 2-d contour map of the hill. The "stacks of plates" idea is like the spacing of the lines of the contour map. Vectors, as little arrows, are displacements. Be warned, the idea of vectors as displacements is dangerous in general, so this may not be the ideal example. Still, it might give you the push you need to appreciate the graphical symbolism of vectors as arrows and their duals as stacks of plates. The mathematical point here is that the gradient of the hill, ##\nabla_a##, is an example of a dual vector, because it's subscripted, making it different from the vectors, which are superscripted.

With an understanding of vectors and dual vectors as geometric objects, the arrows and plate stacks, one next needs to know that the metric tensor can convert vectors into their duals, and the inverse metric can convert dual vectors into vectors. Put another way, they raise and lower indices.

I'm going to stop here for now, and see how you react. Anyway, MTW has its own notational issues, but it can offer some nice insights, too.

Thank you so much! This makes a lot more sense. I picked up a copy of "Gravitation" a couple days ago to use but I haven't looked through it enough.

So far, everything you've said makes sense except the gradient ##\nabla_a## being subscripted. I don't understand the convention that something being subscripted/superscripted makes it a certain mathematical object (vector or dual vector). Wouldn't it be that because it is a dual vector, it is subscripted? (I don't mean to be nitpicky; this is definitely something I am having trouble understanding, the whole superscript/subscript thing, and I want to solidify my intuition.) So how do the subscripts/superscripts apply to tensors?

Also, on raising and lowering indices: if I want to raise an index of a tensor (for example ##X_{\mu\nu}##), I would use the Minkowski metric tensor ##\eta^{\rho\sigma}## (although it seems this is only for flat spacetime, and the metric tensor ##g_{\mu\nu}## replaces it for curved spacetime). For instance, to raise one of the indices, we would perform:

##\eta^{\rho\nu} X_{\mu\nu} = X_{\mu}{}^{\rho}## ##\rightarrow## I hope this is right... (my intuition says that the repeated index ##\nu## will cancel, and because it is the index on the right, the new raised index will appear on the right)

But then what if I want to find the components of ##X_{\mu}{}^{\nu}##? I can't just switch the indices I don't want, or is this the purpose of dummy indices? I doubt I can have a metric tensor with the same indices.

Moreover, what's the difference between performing ##\eta^{\rho\nu} X_{\mu\nu} = X_{\mu}{}^{\rho}## and ##\eta_{\rho\nu} X_{\mu\nu} = X_{\mu}{}^{\rho}## if ##\eta^{\rho\nu} = \eta_{\rho\nu}##? I know this may seem like absolute blasphemy and I know it's not correct, but why?

Also, how would the math work out for this? If I know the components of ##X_{\mu\nu}##, how could I work out the new components of ##X_{\mu}{}^{\nu}##?

I'm sorry for so many questions
 
Pencilvester said:
If you don’t have a very firm grasp on linear algebra and vector calculus, I’d recommend starting with “Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach” by the Hubbards. After that, “Tensor Calculus” (Schaum’s Outline) by David Kay is great if you’re like me and you need a lot of examples to work through to really get a good grasp on the concepts. After that, you should be ready to grasp most of the concepts in many GR books.

Thank you for the suggestion! I learn best from examples and working out problems so I will definitely look into this.
 
Put1demerde said:
Thank you so much! This makes a lot more sense. I picked up a copy of "Gravitation" a couple days ago to use but I haven't looked through it enough.

So far, everything you've said makes sense except the gradient ##\nabla_a## being subscripted. I don't understand the convention that something being subscripted/superscripted makes it a certain mathematical object (vector or dual vector). Wouldn't it be that because it is a dual vector, it is subscripted? (I don't mean to be nitpicky; this is definitely something I am having trouble understanding, the whole superscript/subscript thing, and I want to solidify my intuition.) So how do the subscripts/superscripts apply to tensors?

Also, on raising and lowering indices: if I want to raise an index of a tensor (for example ##X_{\mu\nu}##), I would use the Minkowski metric tensor ##\eta^{\rho\sigma}## (although it seems this is only for flat spacetime, and the metric tensor ##g_{\mu\nu}## replaces it for curved spacetime). For instance, to raise one of the indices, we would perform:

##\eta^{\rho\nu} X_{\mu\nu} = X_{\mu}{}^{\rho}## ##\rightarrow## I hope this is right... (my intuition says that the repeated index ##\nu## will cancel, and because it is the index on the right, the new raised index will appear on the right)

But then what if I want to find the components of ##X_{\mu}{}^{\nu}##? I can't just switch the indices I don't want, or is this the purpose of dummy indices? I doubt I can have a metric tensor with the same indices.

Moreover, what's the difference between performing ##\eta^{\rho\nu} X_{\mu\nu} = X_{\mu}{}^{\rho}## and ##\eta_{\rho\nu} X_{\mu\nu} = X_{\mu}{}^{\rho}## if ##\eta^{\rho\nu} = \eta_{\rho\nu}##? I know this may seem like absolute blasphemy and I know it's not correct, but why?

Also, how would the math work out for this? If I know the components of ##X_{\mu\nu}##, how could I work out the new components of ##X_{\mu}{}^{\nu}##?

I'm sorry for so many questions

Just a couple of points:

1) Don't expect this to be easy. I think getting to grips with this is quite challenging. Be prepared to get knocked back a few times.

2) The gradient doesn't magically become a dual vector just because of the position of the script! There's a bit of mathematics needed to prove that the gradient behaves like a dual vector.

3) The concept of vectors as "directional derivatives" is key and, again, not that easy to grasp. Be prepared to work on this for a while.

4) Fundamentally, the superscripting and subscripting is designed to keep track of what basis or bases you are using. In a lot of standard linear algebra the basis and dual basis coincide, hence there is no need to use upper and lower indices. In GR you do not have that simplicity, hence the need for the upper and lower indexing.
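To make point 4 concrete, here is a small sketch (the basis and components are made up purely for illustration): in a non-orthonormal basis, the "upper" (contravariant) and "lower" (covariant) components of the same vector differ, precisely because the basis and dual basis no longer coincide.

```python
# Contravariant vs covariant components in a non-orthonormal 2D basis.
# Basis vectors, expressed in ordinary Cartesian coordinates (made up):
e1 = (1.0, 0.0)
e2 = (1.0, 1.0)          # not orthogonal to e1

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1]

# Metric components are the pairwise dot products: g_ij = e_i . e_j
g = [[dot(e1, e1), dot(e1, e2)],
     [dot(e2, e1), dot(e2, e2)]]   # [[1, 1], [1, 2]]

v_up = [1.0, 1.0]                  # v = 1*e1 + 1*e2 (contravariant components)

# Covariant components: v_i = g_ij v^j
v_down = [sum(g[i][j] * v_up[j] for j in range(2)) for i in range(2)]
print(v_up, v_down)   # [1.0, 1.0] [2.0, 3.0] -- they no longer agree
```

Had the basis been orthonormal, g would have been the identity matrix and the two sets of components would coincide, which is why the distinction never shows up in introductory linear algebra.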
 
Put1demerde said:
Thank you so much! This makes a lot more sense. I picked up a copy of "Gravitation" a couple days ago to use but I haven't looked through it enough.

So far, everything you've said makes sense except the gradient ##\nabla_a## being subscripted. I don't understand the convention that something being subscripted/superscripted makes it a certain mathematical object (vector or dual vector). Wouldn't it be that because it is a dual vector, it is subscripted? (I don't mean to be nitpicky; this is definitely something I am having trouble understanding, the whole superscript/subscript thing, and I want to solidify my intuition.) So how do the subscripts/superscripts apply to tensors?

It's a bit tricky to explain properly. I'm going to focus on the components, because I find that easiest. Let u be a vector, and v be its dual. Then ##u^a## gives the components of the vector, where a is an index that usually takes values in the range {0,1,2,3}. These components multiply the basis vectors themselves: when one is talking about the vector u, the components (which are superscripted) multiply the actual vectors of some chosen basis.

If we think of vectors as little arrows, we have some set of little arrows, that form a basis when we work with vectors. So the first step is to create this basis. Then the weighted sum of these basis vectors, weighted by the components ##u^a##, form an expression for an arbitrary vector in that basis.

For the dual vector, its components are given by ##v_a##, where again a is in the range {0,1,2,3}.
Also, on raising and lowering indices: if I want to raise an index of a tensor (for example ##X_{\mu\nu}##), I would use the Minkowski metric tensor ##\eta^{\rho\sigma}## (although it seems this is only for flat spacetime, and the metric tensor ##g_{\mu\nu}## replaces it for curved spacetime). For instance, to raise one of the indices, we would perform:

##\eta^{\rho\nu} X_{\mu\nu} = X_{\mu}{}^{\rho}## ##\rightarrow## I hope this is right... (my intuition says that the repeated index ##\nu## will cancel, and because it is the index on the right, the new raised index will appear on the right)

So far this is right, though ##g_{ab}## would lower indices, and the inverse metric ##g^{ab}## would raise indices. In flat spacetime, ##\eta_{ab}## lowers indices and ##\eta^{ab}## raises them.
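As a quick check of the rule (with made-up components, purely for illustration), here is the contraction ##\eta^{\rho\nu} X_{\mu\nu}## carried out explicitly in two dimensions:

```python
# Raising the second index of a rank-2 tensor in 2D flat spacetime.
eta_inv = [[-1, 0],
           [ 0, 1]]   # eta^{rho nu}; in this diagonal case eta^{ab} = eta_{ab}

X = [[1, 2],
     [3, 4]]          # made-up components X_{mu nu} (mu = row, nu = column)

# X_mu^rho = sum over nu of eta^{rho nu} X_{mu nu}
X_mixed = [[sum(eta_inv[rho][nu] * X[mu][nu] for nu in range(2))
            for rho in range(2)]
           for mu in range(2)]
print(X_mixed)  # [[-1, 2], [-3, 4]]
```

Because the repeated index ##\nu## is summed over, it disappears from the result, and the free index ##\rho## takes its place as the new (upper) index, exactly as in the formula above.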

But then what if I want to find the components of ##X_{\mu}{}^{\nu}##? I can't just switch the indices I don't want, or is this the purpose of dummy indices? I doubt I can have a metric tensor with the same indices.

I don't quite understand this. Does my remark about raising indices with the inverse metric and lowering them with the metric help any?

Moreover, what's the difference between performing ##\eta^{\rho\nu} X_{\mu\nu} = X_{\mu}{}^{\rho}## and ##\eta_{\rho\nu} X_{\mu\nu} = X_{\mu}{}^{\rho}## if ##\eta^{\rho\nu} = \eta_{\rho\nu}##? I know this may seem like absolute blasphemy and I know it's not correct, but why?

I can tell you it's wrong, but I can't tell you why you think it might be right. I can tell you that if we take a more complex, but still flat, spacetime metric like

$$g_{ab} = \begin{bmatrix}
2 & 0 \\ 0 & 1
\end{bmatrix}$$

its inverse is

$$g^{ab} =
\begin{bmatrix}
\frac{1}{2} & 0 \\ 0 & 1
\end{bmatrix}$$

So you have to pay attention to the rules to get results that aren't nonsense whenever you have a flat-space metric other than ##\eta_{ab}##.
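A short sketch of that warning (with made-up components): raising with the inverse metric and mistakenly "raising" with the metric itself give different answers as soon as the metric isn't its own inverse.

```python
# The diagonal metric from above and its inverse.
g     = [[2,   0], [0, 1]]     # g_ab
g_inv = [[0.5, 0], [0, 1]]     # g^ab

w_down = [4.0, 3.0]            # dual-vector components w_a (made up)

# Correct: raise with the inverse metric, w^a = g^{ab} w_b
w_up = [sum(g_inv[a][b] * w_down[b] for b in range(2)) for a in range(2)]

# Wrong: contracting with g_ab instead does NOT raise the index
w_wrong = [sum(g[a][b] * w_down[b] for b in range(2)) for a in range(2)]

print(w_up)     # [2.0, 3.0]
print(w_wrong)  # [8.0, 3.0] -- different, so the distinction matters
```

With ##\eta_{ab}## the two happen to agree numerically, which is exactly what makes the shortcut tempting, and exactly why it breaks for any other metric.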

Another useful case to consider is polar coordinates ##(r, \phi)##, where the metric is

$$g_{ab} = \begin{bmatrix} 1 & 0 \\ 0 & r^2 \end{bmatrix}$$
and the inverse metric is
$$g^{ab} = \begin{bmatrix} 1 & 0 \\ 0 &\frac{1}{ r^2} \end{bmatrix}$$
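For instance (an illustrative sketch, with made-up gradient components), here is how the inverse polar metric converts a gradient, which is a dual vector, into an ordinary vector at the point ##r = 2##:

```python
# Raising the index of a gradient in polar coordinates at r = 2.
r = 2.0
g_inv = [[1.0, 0.0],
         [0.0, 1.0 / r**2]]   # g^ab evaluated at this point

# Components of the gradient df (a dual vector): (df/dr, df/dphi), made up
df_down = [3.0, 8.0]

# The corresponding vector: (grad f)^a = g^{ab} df_b
grad_up = [sum(g_inv[a][b] * df_down[b] for b in range(2)) for a in range(2)]
print(grad_up)  # [3.0, 2.0] -- the phi component is rescaled by 1/r^2
```

Note that the answer depends on where you are: the metric components are functions of position, so the same dual-vector components raise to different vector components at different values of r.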
 