***** WARNING: LONG POST *****
NOTE: The questions I ask in this post are rhetorical. It was just an easy way to describe what I was thinking at the time. Please don't waste your time trying to answer them. Today I have answers to most of them anyway.
FreeThinking said:
... it would definitely be helpful if the authors would at least get the terminology correct and/or give more steps in the derivation. Their cavalier use of the terms "covariant" and "directional" derivatives have caused me no end of grief.
PeterDonis said:
What in particular is confusing you about those terms? Can you give an example of a usage of them that you find confusing from MTW (or another textbook if that's easier)?
I want to be clear that, despite how it may sometimes sound, I am in no way, shape, or form blaming MTW or any other author or book for my difficulties in understanding this subject. Every reader/student has a different level of background, understanding, and skill. Every author must decide for themselves who their target audience is. I just happen to be on the outer fringes of the target audience of MTW, so their book is very challenging for me. It is simply a fact, not a complaint, that MTW, and others as well, seem to have used the terms "covariant derivative" and "directional derivative", as well as other terms, in a way that has confused me. I am not talking about the occasional typographical mistake, but only about what appear to be deliberate usages.
For examples of what I mean, look at the following:
MTW page 208 and 209, which is section 8.5, Parallel transport, covariant derivative, connection coefficients, geodesics. On page 208, 2nd paragraph, they say that the gradient of a tensor field is ## \boldsymbol \nabla \boldsymbol T ##. So that defines the gradient operator as ## \boldsymbol \nabla ##. In elementary calculus, including 3-D, Cartesian vector calculus, I was taught that the gradient operator is defined as ## \boldsymbol \nabla \equiv \left ( \frac {\partial} {\partial x^i} \right ) {\boldsymbol {\hat x}}_i ##, a differential operator that is a vector. We write it that way to emphasize that the partial derivative does not operate on the basis vector. A very important point is that the gradient operator has to include the basis vectors of the coordinate system being used because they will be used to form the dot product with the basis vectors in the tangent vector of the curve to produce the total derivative, with respect to the parameter of the curve, of the given scalar field along a given curve. Later I learned that this doesn't work well in an arbitrary, curvilinear coordinate system, so it is convenient to change the definition of the gradient operator to be a one-form ## \boldsymbol \nabla \equiv \left ( \frac {\partial} {\partial \xi^\gamma} \right ) {\boldsymbol {\widetilde \xi}}^\gamma##.
So, if that's true, then letting ##\boldsymbol T = T^\rho_\lambda {\boldsymbol e}_\rho \otimes {\boldsymbol \omega}^\lambda ##, we can write ## \boldsymbol \nabla {\boldsymbol T} = \left (\frac {\partial{\boldsymbol T}} {\partial \xi^\gamma} \right ) \otimes {\boldsymbol {\widetilde \xi}}^\gamma = \left ( {\nabla}_\gamma T^\rho_\lambda \right ) {\boldsymbol e}_\rho \otimes {\boldsymbol \omega}^\lambda \otimes {\boldsymbol {\widetilde \xi}}^\gamma = \left ( T^\rho_{\lambda;\gamma} \right ) {\boldsymbol e}_\rho \otimes {\boldsymbol \omega}^\lambda \otimes {\boldsymbol {\widetilde \xi}}^\gamma ##.
Note that the nabla symbol without the subscript is bolded while the one with the subscript is not bolded. The expressions ## \nabla_\gamma T^\beta_\alpha ## and ## T^\beta_{\alpha ; \gamma} ## are what I understood to be the covariant derivative of the components of the given tensor. Nothing else is the covariant derivative, just those two ways of expressing it. In particular, ## \boldsymbol \nabla ## is definitely NOT the covariant derivative. You cannot take the covariant derivative of a tensor, only of the components of a tensor. That was my understanding when I arrived at MTW's front door. Yet, in the next paragraph they say, "First define the 'covariant derivative' ## \boldsymbol \nabla_{\boldsymbol u} \boldsymbol T ## of ## \boldsymbol T ## along a curve ## P(\lambda) ##, whose tangent vector is ## \boldsymbol u = \frac {dP}{d\lambda} ##." The expression ## \boldsymbol \nabla_{\boldsymbol u} \boldsymbol T ## is not a covariant derivative. It contains a covariant derivative, but it is not itself a covariant derivative, at least according to my understanding of the definition of a covariant derivative. In fact, it is not even a gradient, per the definition above given by MTW themselves. It is a directional derivative. It contains a gradient, which contains a covariant derivative, but neither the directional derivative nor the gradient is itself a covariant derivative.
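For what it's worth, here is a minimal sketch of how the three objects seem to fit together. It assumes a general basis ## {\boldsymbol e}_\gamma ## with dual ## {\boldsymbol \omega}^\gamma ##, a tangent vector ## \boldsymbol u = u^\gamma {\boldsymbol e}_\gamma ##, and that the derivative index occupies the last slot. The slot ordering is my assumption about the convention, not something quoted from MTW.

```latex
% Sketch only: the derivative slot is assumed to be the LAST slot of \nabla T.
\begin{align*}
\boldsymbol\nabla \boldsymbol T
  &= T^\rho{}_{\lambda;\gamma}\,
     \boldsymbol e_\rho \otimes \boldsymbol\omega^\lambda \otimes \boldsymbol\omega^\gamma
  && \text{(gradient: one more slot than } \boldsymbol T\text{)} \\
\boldsymbol\nabla_{\boldsymbol u} \boldsymbol T
  &= \boldsymbol\nabla \boldsymbol T(\ldots,\ldots,\boldsymbol u)
   = u^\gamma\, T^\rho{}_{\lambda;\gamma}\,
     \boldsymbol e_\rho \otimes \boldsymbol\omega^\lambda
  && \text{(derivative along } \boldsymbol u\text{: same rank as } \boldsymbol T\text{)} \\
\boldsymbol\nabla_\gamma \boldsymbol T
  \equiv \boldsymbol\nabla_{\boldsymbol e_\gamma} \boldsymbol T
  &= T^\rho{}_{\lambda;\gamma}\,
     \boldsymbol e_\rho \otimes \boldsymbol\omega^\lambda
  && \text{(the special case } \boldsymbol u = \boldsymbol e_\gamma\text{)}
\end{align*}
```

Read this way, the bold ## \boldsymbol \nabla_{\boldsymbol u} ## acts on the whole tensor, and the semicolon expression ## T^\rho_{\lambda;\gamma} ## is what that operation looks like in components for a chosen basis. That would make the two usages of "covariant derivative" two views of the same operation rather than two different things.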
Now one could say, "Well, they're defining the covariant derivative to be the directional derivative along the curve, etc." Or, one could also say, "Well, let's not get too hung up on the exact wording, it should be clear what they mean from the equations." What I would say is, "Fine. If we're going to define terms differently from other books and/or ignore the wording & just look at the math, I can do that. But it would be helpful to me if that plan were explicitly stated ahead of time." Over the past six months or so, my understanding of the terms & notation has improved greatly. So now as I review MTW trying to pick up where I left off before all the confusion set in, I'm beginning to see how to interpret their wording correctly and so I'm now less confused about what they're saying. But that's due to my very hard-won, expanded insight that required a lot of study & self-help outside of MTW. Again, I'm not criticizing them, I'm just pointing out that this was a point of confusion for me when I first encountered it because I was still not sure of MTW's definition of things.
Also, starting with the last two paragraphs at the bottom of page 208, we establish that ## \boldsymbol {e}_\beta ## and ## \boldsymbol {\omega}^\alpha ## are general bases dual to each other. Continuing onto page 209, equation 8.19a says that ## {\boldsymbol \nabla}_\gamma \equiv {\boldsymbol \nabla}_{{\boldsymbol e}_\gamma} ## . Then further down the page, equation 8.20 defines ## T^\beta_{\alpha,\gamma} \equiv {\boldsymbol \nabla}_\gamma T^\beta_\alpha \equiv \partial_{{\boldsymbol e}_\gamma} T^\beta_\alpha \equiv \partial_\gamma T^\beta_\alpha ##. With a general basis, not a local Lorentz frame, why are we defining the directional derivative ## {\boldsymbol \nabla}_{{\boldsymbol e}_\gamma} \equiv {\boldsymbol \nabla}_\gamma ## to be a partial derivative ## T^\beta_{\alpha,\gamma} \equiv \partial_\gamma T^\beta_\alpha ##? If we were using a coordinate basis, say ## \left \lbrace {\boldsymbol {\xi}_\gamma} \right \rbrace ##, it would make sense since ## {\boldsymbol {\xi}_\gamma} \equiv {\boldsymbol \nabla}_{{\boldsymbol e}_\gamma} ##, the directional derivative operator along the coordinate curve ## {\boldsymbol {\xi}_\gamma} ##. Perhaps if we stare at this section long enough, it might dawn on us what they actually mean. I get the gist of the section. I understand the gamma correction terms, but I'm just not sure what justifies the way they write some of the equations. While writing this post, I tried to work through equation 8.19a & 8.20, but I'm still not getting the same result they seem to get. So, this confused me when I first encountered it and it still seems to be confusing me now.
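One reading that seems to make (8.19a) and (8.20) consistent in a general basis is sketched below. I offer it as a sketch, not as a statement of what MTW intend. It rests on two assumptions that I label here: (1) the component functions ## T^\beta_\alpha ## are ordinary scalar functions, and on a scalar function the covariant derivative along a vector is just the directional derivative; (2) the connection coefficients are defined with the differentiating index last, ## {\boldsymbol \nabla}_\gamma {\boldsymbol e}_\beta = \Gamma^\mu_{\beta\gamma} {\boldsymbol e}_\mu ##. The symbol ## e_\gamma{}^\mu ##, the components of ## {\boldsymbol e}_\gamma ## in some coordinate basis ## \partial / \partial x^\mu ##, is a hypothetical name introduced only for this illustration.

```latex
% Assumption: e_\gamma{}^\mu are the components of the basis vector e_\gamma
% in a coordinate basis \partial/\partial x^\mu (hypothetical notation).
\begin{align*}
T^\beta{}_{\alpha,\gamma}
  &\equiv \partial_{\boldsymbol e_\gamma} T^\beta{}_\alpha
   = e_\gamma{}^\mu\, \frac{\partial T^\beta{}_\alpha}{\partial x^\mu}
  && \text{(directional derivative of a scalar function)} \\[4pt]
\boldsymbol\nabla_\gamma \boldsymbol e_\beta
  &= \Gamma^\mu{}_{\beta\gamma}\, \boldsymbol e_\mu ,
  \qquad
  \boldsymbol\nabla_\gamma \boldsymbol\omega^\alpha
   = -\,\Gamma^\alpha{}_{\mu\gamma}\, \boldsymbol\omega^\mu \\[4pt]
\boldsymbol\nabla_\gamma
  \left( T^\beta{}_\alpha\, \boldsymbol e_\beta \otimes \boldsymbol\omega^\alpha \right)
  &= \left( T^\beta{}_{\alpha,\gamma}
        + \Gamma^\beta{}_{\mu\gamma}\, T^\mu{}_\alpha
        - \Gamma^\mu{}_{\alpha\gamma}\, T^\beta{}_\mu \right)
     \boldsymbol e_\beta \otimes \boldsymbol\omega^\alpha
  && \text{(Leibniz rule, then relabel dummy indices)} \\
  &\equiv T^\beta{}_{\alpha;\gamma}\,
     \boldsymbol e_\beta \otimes \boldsymbol\omega^\alpha
\end{align*}
```

Under this reading the comma is a plain partial derivative ## \partial / \partial \xi^\gamma ## only when ## {\boldsymbol e}_\gamma ## is a coordinate basis vector. In a general basis it is the directional derivative of the component function along ## {\boldsymbol e}_\gamma ##, and the gamma correction terms account for the change of the basis itself.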
Continuing on to page 210, 1st paragraph, equation 8.22: And we're back now with something that exactly matches what I would expect for the directional derivative of the tensor field ## \boldsymbol T ##, but again MTW calls it the covariant derivative. OK, is this just what they call it, or have I misunderstood something? Considering some of the ways they have defined the directional derivative as described in my previous paragraphs above, as a newbie, I was just not sure what I was not understanding. So not only did the words not match, even the equations did not seem totally consistent to me.
On page 253, it says: "Any 'rule' ## \boldsymbol \nabla ##, for producing new vector fields from old, ... is called by differential geometers a 'symmetric covariant derivative.'" So maybe their constantly calling the bold nabla the covariant derivative is standard. But then, what do we call the part with the semicolon? You know, the part with the gamma correction terms? Then on page 255, last paragraph in the box, labeled "B.", it says: "The machine ## \boldsymbol \nabla ## differs from a tensor in two ways. ...(2) ## \boldsymbol \nabla ## is not a linear machine (whereas a tensor must be linear!)". What? I thought the whole point of the covariant derivative was to have a derivative that produced results that could be used as the components of a tensor. If the covariant derivative is a tensor or has the components of a tensor, it's linear. Yes? No? Which is it? So, that really sent me scrambling all the way back to Schaum's Outline on Vector Analysis by Murray R. Spiegel where I first encountered the covariant derivative as that thing with the gamma correction terms. Yes, well, OK. The thing with the gamma correction terms is supposed to transform like the components of a tensor. So what does MTW mean when they say ## \boldsymbol \nabla ## is not a tensor?
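A sketch of how both statements can hold at once, as far as I can tell: the operator ## \boldsymbol \nabla ##, with the field being differentiated left as an open slot, fails linearity over scalar functions, while ## \boldsymbol \nabla \boldsymbol v ## for a fixed field ## \boldsymbol v ## is linear in its remaining slots. Here ## f ## is an arbitrary smooth scalar function and ## \boldsymbol \sigma ## an arbitrary one-form, both introduced only for illustration.

```latex
% f: arbitrary smooth scalar function; sigma: arbitrary one-form (illustrative symbols).
\begin{align*}
\boldsymbol\nabla_{f \boldsymbol u}\, \boldsymbol v
  &= f\, \boldsymbol\nabla_{\boldsymbol u} \boldsymbol v
  && \text{(linear in the direction slot)} \\
\boldsymbol\nabla_{\boldsymbol u} (f \boldsymbol v)
  &= f\, \boldsymbol\nabla_{\boldsymbol u} \boldsymbol v
     + (\partial_{\boldsymbol u} f)\, \boldsymbol v
  && \text{(Leibniz rule: the extra term breaks linearity in } \boldsymbol v\text{)} \\[4pt]
\langle \boldsymbol\sigma,\ \boldsymbol\nabla_{\boldsymbol u} \boldsymbol v \rangle
  &= \sigma_\alpha\, v^\alpha{}_{;\gamma}\, u^\gamma
  && \text{(for fixed } \boldsymbol v\text{: linear in } \boldsymbol\sigma \text{ and } \boldsymbol u\text{)}
\end{align*}
```

So, on this reading, the semicolon components ## v^\alpha_{;\gamma} ## are the components of the tensor ## \boldsymbol \nabla \boldsymbol v ##, which is consistent with what I learned from Spiegel, while the bare operator ## \boldsymbol \nabla ## is not a tensor because of the ## (\partial_{\boldsymbol u} f)\, \boldsymbol v ## term.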
By the time I reached page 271, equation 11.8, I was too confused about what the bold nabla symbol meant. Trying to understand (11.8), I tried several different definitions of ## \boldsymbol \nabla ##, but nothing worked. Somewhere around here and slightly beyond, I was not understanding the math at all, so I Googled. One of the things that popped up was Carroll's Lecture Notes. On pages 75 through 77 I found enough information to eventually figure out a consistent set of terms & notations for all these derivatives that actually seems to give me the same answers as Carroll and makes some sense of MTW. I don't know if my set of definitions is the same as the mainstream or if they even make sense to anyone else, but they make sense to me. The task(s) I'm working on now is to go back through MTW, identify all the places that confused me before, and reread & rewrite MTW's text & math in my notation to see if it makes MTW any more sensible to me. Preliminary experience indicates that MTW can be rewritten to make more sense to me, but there are still a lot of places that don't. So, I may have finally found the right definitions and can start making some progress.
This post is not meant to be a rigorous proof that MTW is inconsistent or sloppy in their notation. Probably anyone who has any business reading MTW would not be confused because they are either smart enough to figure it out without a lot of hand-holding (that ain't me) or they've already mastered the math (also not me) & MTW is just intended to show them how to apply it to general relativity. So most readers here will probably wonder how I could have gotten so confused. But for this particular, self-guided hobbyist, it was enough to hold me up for quite a while. And now that I'm in review & recover mode, I see better what they mean, but I still feel that some of their wording & even the math is inconsistent in some places. I'm confident that that fact means I still don't know what I'm doing.
Finally, I'd like to echo one sentiment of an earlier poster:
vanhees71 said:
The overall concept of MTW and the presentation of the material, however, is outstanding.
If I did not agree, I would never have spent so much time trying to understand it. I have found no other book that covers as much material, even if the coverage is a challenge for the likes of me. Well, it keeps me off the streets at night.