Why does MTW keep calling the "product rule" the "chain rule"?

FreeThinking · Jul 30, 2019

MTW p. 257, exercises 10.2 through 10.5: These exercises all deal with this familiar property of derivatives: ## \nabla (AB) = (\nabla A) B + A (\nabla B) ##. I learned this was called the "product rule". I learned that ## \frac{d}{dx} f(y(x)) = \frac{df}{dy} \frac{dy}{dx} ## is called the "chain rule". MTW keeps using the term "chain rule" for what I learned as the "product rule". I've Googled and such, and all the hits use the terms in the way I expect. Why is MTW naming them differently? What am I not understanding?

They also do this in other places in the book.

Thanks.

fresh_42 · Jul 30, 2019

My guess is that it is either a spleen or a leftover from differentials as boundary operators of (co-)chain complexes: a rule for how to deal with chains or cochains, so to speak.

The product rule is also called Leibniz rule or Jacobi identity or boundary operator. Chain rule is new to me.

strangerep · Jul 30, 2019

FreeThinking said:

MTW p. 257, exercises 10.2 through 10.5: These exercises all deal with this familiar property of derivatives: ## \nabla (AB) = (\nabla A) B + A (\nabla B) ##. I learned this was called the "product rule". [...]

Perhaps because different authors wrote different parts?

E.g., near the top of p. 182, and near the top of p. 216, they do call it the "product rule". But on p. 76, near eq. (3.21), they use the name "product rule" for writing tensor products in component form.

FreeThinking · Jul 30, 2019

@fresh_42: Thank you for your response. Unfortunately, I have no idea what you said. I looked up the word "spleen" but could find no definition that fit the current context. And I have no idea what "(co-)chain complexes" are. Whatever they are, I hope that's not what MTW meant or I might as well give up any attempt to understand this subject. :-(

Calling the "product rule" the "Leibniz rule" is also in keeping with my understanding, although I can't remember it being called the "Jacobi identity" or "boundary operator". I will have to research those. Thanks for that info.

If the chain rule is new to you, then I didn't state it well in my post, so let me try again. The chain rule I'm talking about is from elementary calculus and refers to how you take the derivative of a function composition, namely:

Let ## y = g(x) ## be a differentiable function of x, and let ## z = f(y) ## be a differentiable function of y. Then the derivative of z with respect to x is just ## \frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx} ##. At least that's my understanding of the Google hits I get on it. The above equation appears nowhere in the exercises or other places where they say "chain rule".

My problem is that what MTW calls the "chain rule" looks more like the product rule to me: ## d(fg) = (df)\, g + f\, dg ##. All my hits on Google seem to confirm my view. That is the equation that appears in all those places where they say "chain rule".

Since MTW are 3 of the best physics/mathematics minds of our age, I figured the failure in understanding is most certainly mine. I just didn't know where to go or who to ask, since I can find nothing explaining it on the internet.

Thanks.

PeterDonis · Jul 30, 2019

FreeThinking said:

Since MTW are 3 of the best physics/mathematics minds of our age, I figured the failure in understanding is most certainly mine.

What you're failing to understand isn't anything of substance: you clearly understand how to take derivatives of products of two functions and of function compositions. Which of those you use the terms "product rule" or "chain rule" to refer to is a matter of terminology, not math or physics. I don't think anyone except MTW themselves can answer the question of why MTW chose particular terminology in a particular place in the book. I don't understand why they picked that particular terminology either. But I wouldn't spend much time worrying about that.

fresh_42 · Jul 30, 2019

FreeThinking said:

Thank you for your response. Unfortunately, I have no idea what you said. I looked up the word "spleen" but could find no definition that fit the current context. And I have no idea what "(co-)chain complexes" are. Whatever they are, I hope that's not what MTW meant or I might as well give up any attempt to understand this subject. :-(

Maybe I chose the wrong word. We use it in my language, and since it sounds and is written like English, I thought it would exist in English too. I meant quirk or personal peculiarity.

Chain complexes are described in Wikipedia. The boundary operator of the Chevalley-Eilenberg complex for Lie algebras follows the same rules as differential operators, aka covariant derivatives, on n-forms do. They are closely related. This would have been a possibility for the wording, but as I said, it is a guess.

FreeThinking said:

If the chain rule is new to you, then I didn't state it well in my post, so let me try again.

What is new to me is calling the Leibniz rule the chain rule. The Jacobi identity is the product rule for vector fields.

FreeThinking · Jul 30, 2019

@strangerep: Thanks for the additional references. I'll check them out.

I can believe different authors wrote different parts. In fact, I would suspect that more than a few graduate students got thrown into the mix as well.

My problem is that MTW is my first exposure to a lot of the math with which I'm struggling, so I don't know enough to spot the problems if they're there. I have compared many of their important equations to other sources and have found no discrepancies that I can detect. That has led me to trust them.

So, do you think it's safe to assume they were that sloppy with their nomenclature and I can just quit worrying about it? If so, I find that prospect very distressing.

Thanks.

PeterDonis · Jul 30, 2019

FreeThinking said:

do you think it's safe to assume they were that sloppy with their nomenclature and I can just quit worrying about it?

I think you can trust the equations in MTW, and those are what contain the substantive content. If the ordinary language discussion uses a term that doesn't seem appropriate to you for a particular equation, you can just ignore the term and look at the equation. Nothing of substance will be affected.

FreeThinking said:

If so, I find that prospect very distressing.

Why? Why does the ordinary language matter if you have the equations?

PeterDonis · Jul 30, 2019

FreeThinking said:

MTW is my first exposure to a lot of the math

MTW is probably not the best first textbook on GR to learn from. It is certainly comprehensive, but for that very reason it contains a lot of material that really isn't necessary if you are just trying to learn the basics of GR.

If you haven't looked at Sean Carroll's lecture notes on GR, you might give them a try:

https://arxiv.org/abs/gr-qc/9712019
They cover the basics, including the basics of the math--manifolds, tensors, differential geometry--in a much more focused way that might be easier as an introduction.

FreeThinking · Jul 30, 2019

@fresh_42: Ok, now I understand you. I've never known the word "spleen" to mean that, but apparently it is used, at least in the U.S. (I don't know about anywhere else), to mean "complain" or "rant". So, thanks for your guess. It was certainly better than anything I was coming up with.

FreeThinking · Jul 30, 2019

@PeterDonis: Thanks. That makes me feel better about my confusion.

It's well after midnight here & I'm losing coherence, so I'll be brief. I think for someone like me who is studying on their own as a hobby with only the internet to ask questions of, the text becomes more important than it might be for someone who already understands it well. As to the equations, some are there but the intermediate steps are not. These textbooks are written for students much smarter than me. I can deal with that, but it would definitely be helpful if the authors would at least get the terminology correct and/or give more steps in the derivation. Their cavalier use of the terms "covariant" and "directional" derivatives has caused me no end of grief.

If you want to hear the whole, sad story, let me know. Otherwise I'll just leave it that, for me, the words really do matter when the intermediate steps of the derivation are completely left out.

Thanks again, Peter. I'll take your advice and not worry about it any further.

FreeThinking · Jul 30, 2019

@PeterDonis: P.S. Sorry, I forgot to mention that Carroll's lecture notes are practically burned on my screen, I've been studying them so long. And a lot of other printed & online resources. They've helped, but it has still been very difficult for a dummy like me.

Good night, and thanks again.

Orodruin · Jul 31, 2019

I have known many people who mix up the product rule and the chain rule nomenclature. MTW would not be the first nor the last. I have probably done so myself at some point in time.

jbriggs444 · Jul 31, 2019

FreeThinking said:

@fresh_42: Ok, now I understand you. I've never known the word "spleen" to mean that, but apparently it is used, at least in the U.S. (I don't know about anywhere else), to mean "complain" or "rant". So, thanks for your guess. It was certainly better than anything I was coming up with.

I'd only ever heard the word used in that sense as part of the idiom "vent [one's] spleen". Apparently originating from a historical idea that the organ called the spleen is the repository of one's anger.

Michael Price · Jul 31, 2019

Orodruin said:

I have known many people who mix up the product rule and the chain rule nomenclature. MTW would not be the first nor the last. I have probably done so myself at some point in time.

I am surprised to hear that - they are quite different things and almost self-descriptive. I wouldn't even expect an undergrad to mix them up, let alone MTW.

Orodruin · Jul 31, 2019

Michael Price said:

I am surprised to hear that - they are quite different things and almost self-descriptive. I wouldn't even expect an undergrad to mix them up, let alone MTW.

I see it among undergraduates on a fairly frequent basis (not in the first or second year, they still have it fresh). However, as has been mentioned here, the main thing is getting the maths right, not the English nomenclature. The product rule holds whether youcalk it the wrong thing or not.

kent davidge · Jul 31, 2019

Orodruin said:

youcalk it the wrong thing or not

was this on porpuse?

PeterDonis · Jul 31, 2019

FreeThinking said:

it would definitely be helpful if the authors would at least get the terminology correct and/or give more steps in the derivation. Their cavalier use of the terms "covariant" and "directional" derivatives has caused me no end of grief.

What in particular is confusing you about those terms? Can you give an example of a usage of them that you find confusing from MTW (or another textbook if that's easier)?

PeterDonis · Jul 31, 2019

FreeThinking said:

As to the equations, some are there but the intermediate steps are not.

Yes, that's true, textbooks will often leave out intermediate steps in the derivation, or assign them as homework problems instead of giving them in the main text. MTW does this fairly often. The homework help forums here can be useful if you're stuck on a particular problem.

FreeThinking · Jul 31, 2019

Orodruin said:

I see it among undergraduates on a fairly frequent basis (not in the first or second year, they still have it fresh). However, as has been mentioned here, the main thing is getting the maths right, not the English nomenclature. The product rule holds whether youcalk it the wrong thing or not.

I agree that the main thing is to understand the maths, not the verbiage. This discussion is helping me on both counts.

MTW and every other book & article on the subject are what they are, and we're stuck with them. My problem was that I'm such a novice, especially with the non-coordinate symbols. (I had been through a lot of this before using the old component-based contravariant/covariant tensors, with the focus on their transformation equations, before I read Schutz's First Course & discovered the "new" Cartanian way.) So I figured, who am I to say that all these experts in the field are saying it wrong? But now that so many of you seem to think that's not such an unreasonable or arrogantly ignorant view to take, it tells me that perhaps I'm not as confused as I thought I was. I always try to avoid blaming my confusion on the expert authors since 99.99%+ of the time, I'm the one getting it wrong, not them.

Thanks.

FreeThinking · Jul 31, 2019

PeterDonis said:

What in particular is confusing you about those terms? Can you give an example of a usage of them that you find confusing from MTW (or another textbook if that's easier)?

Thank you for that invitation. I'll take you up on it, but it will take me a while to compose that post.

Now that I have finally figured out at least one way to derive Carroll's equation 3.71 on page 77 of his lecture notes, I'm going back and reviewing both Carroll & MTW to see if I understand things better now. As I do, I'll collect examples of where and how I got confused.

SiennaTheGr8 · Jul 31, 2019

Product rule follows from the chain rule anyway.
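As a numerical sanity check of this claim (not a proof), here is a minimal sketch that treats the product as the composition of x ↦ (f(x), g(x)) with F(u, v) = u·v and applies the multivariate chain rule. The helper names `derivative` and `product_rule_via_chain_rule`, the test functions sin and exp, and the sample point 0.7 are all illustrative choices, not anything taken from MTW.

```python
import math


def derivative(func, x, h=1e-6):
    """Central finite-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


def product_rule_via_chain_rule(f, g, x):
    """Differentiate h(x) = F(f(x), g(x)) with F(u, v) = u * v
    using the multivariate chain rule: h' = F_u * f' + F_v * g'."""
    u, v = f(x), g(x)
    dF_du = v  # partial derivative of u*v with respect to u
    dF_dv = u  # partial derivative of u*v with respect to v
    return dF_du * derivative(f, x) + dF_dv * derivative(g, x)


x0 = 0.7  # arbitrary sample point (assumption for illustration)
via_chain = product_rule_via_chain_rule(math.sin, math.exp, x0)
direct = derivative(lambda x: math.sin(x) * math.exp(x), x0)
exact = math.exp(x0) * (math.cos(x0) + math.sin(x0))  # closed-form derivative
print(via_chain, direct, exact)
```

The three printed numbers should agree to within finite-difference error, which is consistent with the product rule being the special case F(u, v) = u·v of the multivariate chain rule.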

dsaun777 · Aug 1, 2019

PeterDonis said:

MTW is probably not the best first textbook on GR to learn from. It is certainly comprehensive, but for that very reason it contains a lot of material that really isn't necessary if you are just trying to learn the basics of GR.

If you haven't looked at Sean Carroll's lecture notes on GR, you might give them a try:

https://arxiv.org/abs/gr-qc/9712019
They cover the basics, including the basics of the math--manifolds, tensors, differential geometry--in a much more focused way that might be easier as an introduction.

Those notes are brilliant for anyone wanting to start studying GR. Thank you. Is there anything similar for quantum field theory?

Michael Price · Aug 1, 2019

SiennaTheGr8 said:

Product rule follows from the chain rule anyway.

I don't find that proof terribly convincing. And it still doesn't excuse MTW mixing them up, which I am still amazed by.

PeroK · Aug 1, 2019

SiennaTheGr8 said:

Product rule follows from the chain rule anyway.

Perhaps if you ignore mathematical niceties like:

The product rule requires the functions share a domain.

The chain rule requires the range of one function to be a subset of the domain of the other.

Function composition is fundamentally different from the simple product of functions.

If MTW had confused "proper" and "coordinate" time, I don't think they would have got off so lightly.

SiennaTheGr8 · Aug 1, 2019

PeroK said:

Perhaps if you ignore mathematical niceties

I do!

PAllen · Aug 2, 2019

PeroK said:

Perhaps if you ignore mathematical niceties like:

The product rule requires the functions share a domain.

The chain rule requires the range of one function to be a subset of the domain of the other.

Function composition is fundamentally different from the simple product of functions.

It's perfectly easy to make the derivation of product rule from the multivariate chain rule completely rigorous.
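A minimal sketch of the kind of derivation meant here, under the assumption that f and g are differentiable on a common open interval and that F is the multiplication map on ## \mathbb{R}^2 ## (the symbols F, h, u, v are introduced only for this illustration):

```latex
% Assumed setup: f, g differentiable on a common open interval I;
% F(u, v) = u v is defined on all of R^2; h is the composition x -> (f(x), g(x)) -> F.
\begin{align}
h(x) &= F\bigl(f(x), g(x)\bigr), \qquad F(u, v) = u v, \\
\frac{dh}{dx} &= \frac{\partial F}{\partial u}\,\frac{df}{dx} + \frac{\partial F}{\partial v}\,\frac{dg}{dx}
              = g(x)\, f'(x) + f(x)\, g'(x).
\end{align}
```

The inner map x ↦ (f(x), g(x)) takes values in ## \mathbb{R}^2 ##, which is the whole domain of F, so the range and domain conditions quoted above are met automatically once f and g share a domain.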

vanhees71 · Aug 2, 2019

It's fascinating how one can get caught up in a debate about a somewhat sloppy naming of a standard mathematical theorem known already in high school, as if there's nothing else of real interest in this great textbook on GR...

fresh_42 · Aug 2, 2019

vanhees71 said:

It's fascinating how one can get caught up in a debate about a somewhat sloppy naming of a standard mathematical theorem known already in high school, as if there's nothing else of real interest in this great textbook on GR...

The reason is: Some like the thought "Look, even the big ones make mistakes!" - not that they ever had claimed otherwise - and others jump in to save their reputation - not that it would be necessary. And I have been fascinated by the mathematical questions whether the chain rule can be associated with chain complexes, or simpler, why the bilinear, associative ##f\circ g## as a multiplication instruction doesn't automatically show that the chain rule is an instance of the product rule. But I admit, wrong forum.

vanhees71 · Aug 3, 2019

Yes, well, I have written enough manuscripts to be very lenient toward trivial typos or just a bad formulation. The overall concept of MTW and the presentation of the material, however, is outstanding.

Orodruin · Aug 3, 2019

vanhees71 said:

Yes, well, I have written enough manuscripts to be very lenient toward trivial typos or just a bad formulation. The overall concept of MTW and the presentation of the material, however, is outstanding.

That moment when you are going to explain to your students that a time integral is missing in a course book and show the "correct" version in your own book just to realize that - although you remembered the dt unlike the other book - the integral sign with the limits is missing from your book as well ...

fresh_42 · Aug 3, 2019

vanhees71 said:

Yes, well, I have written enough manuscripts to be very lenient toward trivial typos or just a bad formulation. The overall concept of MTW and the presentation of the material, however, is outstanding.

I forgot to mention the "internet effect". I remember a thread which I thought would be closed instantly, or at least after a few answers. It felt as if it lasted eternally. The question was whether zero is a real number, or some nonsense like this.

The internet effect: The more ridiculous the subject, the longer the thread! (*)

_________
(*) I do not claim that the reverse is true. So go ahead with QM interpretation discussions!

vanhees71 · Aug 5, 2019

Though, the QM interpretation discussions tend to justify your observed rule about the length of threads ;-))). SCNR.

FreeThinking · Aug 6, 2019

SiennaTheGr8 said:

Product rule follows from the chain rule anyway.

Ok. That's why I asked the question, rather than assuming they were wrong. That's the kind of thing I meant when I asked, "What am I missing?" So the two rules are connected in this way. I just never spotted that in all my Googling.

I will say that the connection seems a bit obscure to me, personally, because I've been reading a lot of physics books for a very long time and I don't remember any other books saying "chain rule" and then writing equations that were simply the "product rule" without any sign of a "chain" in them. I've only seen this in MTW. But, at least, this reference shows that there is some justification for the connection, so now I consider myself better educated on the matter.

Thank you very much, SiennaTheGr8 .

FreeThinking · Aug 8, 2019

***** WARNING: LONG POST *****

NOTE: The questions I ask in this post are rhetorical. It was just an easy way to describe what I was thinking at the time. Please don't waste your time trying to answer them. Today I have answers to most of them anyway.

FreeThinking said:

... it would definitely be helpful if the authors would at least get the terminology correct and/or give more steps in the derivation. Their cavalier use of the terms "covariant" and "directional" derivatives has caused me no end of grief.

PeterDonis said:

What in particular is confusing you about those terms? Can you give an example of a usage of them that you find confusing from MTW (or another textbook if that's easier)?

I want to be clear that, despite how it may sometimes sound, I am in no way, shape, or form blaming MTW or any other author or book for my difficulties in understanding this subject. Every reader/student has a different level of background, understanding, and skill. Every author must decide for themselves who their target audience is. I just happen to be on the outer fringes of the target audience of MTW, so their book is very challenging for me. It is simply a fact, not a complaint, that MTW, and others as well, seem to have used the terms "covariant derivative" and "directional derivative", as well as other terms, in a way that has confused me. I am not talking about the occasional typographical mistakes, but only what appears to be deliberate uses.

For examples of what I mean, look at the following:

MTW page 208 and 209, which is section 8.5, Parallel transport, covariant derivative, connection coefficients, geodesics. On page 208, 2nd paragraph, they say that the gradient of a tensor field is ## \boldsymbol \nabla \boldsymbol T ##. So that defines the gradient operator as ## \boldsymbol \nabla ##. In elementary calculus, including 3-D, Cartesian vector calculus, I was taught that the gradient operator is defined as ## \boldsymbol \nabla \equiv \left ( \frac {\partial} {\partial x^i} \right ) {\boldsymbol {\hat x}}_i ##, a differential operator that is a vector. We write it that way to emphasize that the partial derivative does not operate on the basis vector. A very important point is that the gradient operator has to include the basis vectors of the coordinate system being used because they will be used to form the dot product with the basis vectors in the tangent vector of the curve to produce the total derivative, with respect to the parameter of the curve, of the given scalar field along a given curve. Later I learned that this doesn't work well in an arbitrary, curvilinear coordinate system, so it is convenient to change the definition of the gradient operator to be a one-form ## \boldsymbol \nabla \equiv \left ( \frac {\partial} {\partial \xi^\gamma} \right ) {\boldsymbol {\widetilde \xi}}^\gamma##.

So, if that's true, then letting ##\boldsymbol T = T^\rho_\lambda {\boldsymbol e}_\rho \otimes {\boldsymbol \omega}^\lambda ##, we can write ## \boldsymbol \nabla {\boldsymbol T} = \left (\frac {\partial{\boldsymbol T}} {\partial \xi^\gamma} \right ) \otimes {\boldsymbol {\widetilde \xi}}^\gamma = \left ( {\nabla}_\gamma T^\rho_\lambda \right ) {\boldsymbol e}_\rho \otimes {\boldsymbol \omega}^\lambda \otimes {\boldsymbol {\widetilde \xi}}^\gamma = \left ( T^\rho_{\lambda;\gamma} \right ) {\boldsymbol e}_\rho \otimes {\boldsymbol \omega}^\lambda \otimes {\boldsymbol {\widetilde \xi}}^\gamma ##.
Note that the nabla symbol without the subscript is bolded while the one with the subscript is not bolded. The expressions ## \nabla_\gamma T^\beta_\alpha ## and ## T^\beta_{\alpha ; \gamma} ## are what I understood to be the covariant derivative of the components of the given tensor. Nothing else is the covariant derivative, just those two ways of expressing it. Especially, ## \boldsymbol \nabla ## is definitely NOT the covariant derivative. You cannot take the covariant derivative of a tensor; only the components of a tensor. That was my understanding when I arrived at MTW's front door. Yet, in the next paragraph they say, "First define the "covariant derivative" ## \boldsymbol \nabla_{\boldsymbol u} \boldsymbol T ## of ## \boldsymbol T ## along a curve ## P(\lambda) ##, whose tangent vector is ## \boldsymbol u = \frac {dP}{d\lambda} ##." The expression ## \boldsymbol \nabla_{\boldsymbol u} \boldsymbol T ## is not a covariant derivative. It contains a covariant derivative, but it is not itself a covariant derivative, at least according to my understanding of the definition of a covariant derivative. In fact, it is not even a gradient, per the definition above given by MTW themselves. It is a directional derivative. It contains a gradient, which contains a covariant derivative, but neither of them are covariant derivatives.
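For reference, here is a sketch of how the three objects discussed above are commonly related. It assumes a single basis ## \boldsymbol e_\rho ## with dual basis ## \boldsymbol \omega^\lambda ## in every slot (rather than the mixed bases used above), and the index convention in which the differentiation index comes last on ## \Gamma ##; both are assumptions made for this illustration.

```latex
% Assumptions: one basis e_rho / omega^lambda for all slots; differentiation index last on Gamma;
% the comma denotes the derivative of a component function along e_gamma.
\begin{align}
\boldsymbol{\nabla}\boldsymbol{T}
  &= T^\rho{}_{\lambda;\gamma}\; \boldsymbol{e}_\rho \otimes \boldsymbol{\omega}^\lambda \otimes \boldsymbol{\omega}^\gamma
  && \text{(gradient: one extra slot)}, \\
T^\rho{}_{\lambda;\gamma}
  &= T^\rho{}_{\lambda,\gamma} + \Gamma^\rho{}_{\mu\gamma}\, T^\mu{}_\lambda - \Gamma^\mu{}_{\lambda\gamma}\, T^\rho{}_\mu
  && \text{(its components)}, \\
\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{T}
  &= u^\gamma\, T^\rho{}_{\lambda;\gamma}\; \boldsymbol{e}_\rho \otimes \boldsymbol{\omega}^\lambda
  && \text{(extra slot filled with } \boldsymbol{u}\text{)}.
\end{align}
```

Read this way, ## \boldsymbol \nabla_{\boldsymbol u} \boldsymbol T ## is the gradient with its last slot contracted against ## \boldsymbol u ##. Many texts appear to use "covariant derivative" for any of the three lines, which may account for much of the clash in terminology described here.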

Now one could say, "Well, they're defining the covariant derivative to be the directional derivative along the curve, etc." Or, one could also say, "Well, let's not get too hung up on the exact wording, it should be clear what they mean from the equations." What I would say is, "Fine. If we're going to define terms differently from other books and/or ignore the wording & just look at the math, I can do that. But it would be helpful to me if that plan were explicitly stated ahead of time." Over the past six months or so, my understanding of the terms & notation has improved greatly. So now as I review MTW trying to pick up where I left off before all the confusion set in, I'm beginning to see how to interpret their wording correctly and so I'm now less confused about what they're saying. But that's due to my very hard-won, expanded insight that required a lot of study & self-help outside of MTW. Again, I'm not criticizing them, I'm just pointing out that this was a point of confusion for me when I first encountered it because I was still not sure of MTW's definition of things.

Also, starting with the last two paragraphs at the bottom of page 208, we establish that ## \boldsymbol {e}_\beta ## and ## \boldsymbol {\omega}^\alpha ## are general bases dual to each other. Continuing onto page 209, equation 8.19a says that ## {\boldsymbol \nabla}_\gamma \equiv {\boldsymbol \nabla}_{{\boldsymbol e}_\gamma} ## . Then further down the page, equation 8.20 defines ## T^\beta_{\alpha,\gamma} \equiv {\boldsymbol \nabla}_\gamma T^\beta_\alpha \equiv \partial_{{\boldsymbol e}_\gamma} T^\beta_\alpha \equiv \partial_\gamma T^\beta_\alpha ##. With a general basis, not a local Lorentz frame, why are we defining the directional derivative ## {\boldsymbol \nabla}_{{\boldsymbol e}_\gamma} \equiv {\boldsymbol \nabla}_\gamma ## to be a partial derivative ## T^\beta_{\alpha,\gamma} \equiv \partial_\gamma T^\beta_\alpha ##? If we were using a coordinate basis, say ## \left \lbrace {\boldsymbol {\xi}_\gamma} \right \rbrace ##, it would make sense since ## {\boldsymbol {\xi}_\gamma} \equiv {\boldsymbol \nabla}_{{\boldsymbol e}_\gamma} ##, the directional derivative operator along the coordinate curve ## {\boldsymbol {\xi}_\gamma} ##. Perhaps if we stare at this section long enough, it might dawn on us what they actually mean. I get the gist of the section. I understand the gamma correction terms, but I'm just not sure what justifies the way they write some of the equations. While writing this post, I tried to work through equation 8.19a & 8.20, but I'm still not getting the same result they seem to get. So, this confused me when I first encountered it and it still seems to be confusing me now.

Continuing on to page 210, 1st paragraph, equation 8.22: And we're back now with something that exactly matches what I would expect for the directional derivative of the tensor field ## \boldsymbol T ##, but again MTW calls it the covariant derivative. Ok, is this just what they call it, or have I misunderstood something? Considering some of the ways they have defined the directional derivative as described in my previous paragraphs above, as a newbie, I was just not sure what I was not understanding. So not only did the words not match, even the equations did not seem totally consistent to me.

On page 253, it says: Any "rule" ## \boldsymbol \nabla ##, for producing new vector fields from old, ... is called by differential geometers a "symmetric covariant derivative." So maybe their constantly calling the bold nabla the covariant derivative is standard. But then, what do we call the part with the semicolon? You know, the part with the gamma correction terms? Then on page 255, last paragraph in the box, labeled "B.", it says: "The machine ## \boldsymbol \nabla ## differs from a tensor in two ways. ...(2) ## \boldsymbol \nabla ## is not a linear machine (whereas a tensor must be linear!)". What? I thought the whole point of the covariant derivative was to have a derivative that produced results that could be used as the components of a tensor. If the covariant derivative is a tensor or has the components of a tensor, it's linear. Yes? No? Which is it? So, that really sent me scrambling all the way back to Schaum's Outline on Vector Analysis by Murray R. Spiegel where I first encountered the covariant derivative as that thing with the gamma correction terms. Yes, well, OK. The thing with the gamma correction terms is supposed to transform like the components of a tensor. So what does MTW mean when they say ## \boldsymbol \nabla ## is not a tensor?
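One possible reading of the "not a linear machine" remark, sketched under the assumption that ## \boldsymbol \nabla ## is viewed as a map taking a vector field ## \boldsymbol v ## and a direction ## \boldsymbol u ## to ## \boldsymbol \nabla_{\boldsymbol u} \boldsymbol v ##, with f an arbitrary smooth function (f is introduced only for this illustration):

```latex
% Assumption: f is any smooth scalar function; u, v are vector fields.
\begin{align}
\boldsymbol{\nabla}_{f\boldsymbol{u}}\, \boldsymbol{v}
  &= f\, \boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{v}
  && \text{(linear in the direction slot)}, \\
\boldsymbol{\nabla}_{\boldsymbol{u}} (f \boldsymbol{v})
  &= f\, \boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{v} + (\partial_{\boldsymbol{u}} f)\, \boldsymbol{v}
  && \text{(extra term: not linear in } \boldsymbol{v}\text{)}.
\end{align}
```

On this reading, the operator ## \boldsymbol \nabla ## fails to be a tensor because of the extra term in the second line, which is the same product (Leibniz) rule this thread started with. The output ## \boldsymbol \nabla \boldsymbol v ## for a fixed field ## \boldsymbol v ## is still a tensor, with the semicolon components.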

By the time I reached page 271, equation 11.8, I was too confused about what the bold nabla symbol meant. Trying to understand (11.8), I tried several different definitions of ## \boldsymbol \nabla ##, but nothing worked. Somewhere around here and slightly beyond, I was not understanding the math at all, so I Googled. One of the things that popped up was Carroll's Lecture Notes. On pages 75 through 77 I found enough information to eventually figure out a consistent set of terms & notations for all these derivatives that actually seems to give me the same answers as Carroll and makes some sense of MTW. I don't know if my set of definitions is the same as the mainstream or if they even make sense to anyone else, but they make sense to me. The task(s) I'm working on now is to go back through MTW, identify all the places that confused me before, and reread & rewrite MTW's text & math in my notation to see if it makes MTW any more sensible to me. Preliminary experience indicates that MTW can be rewritten to make more sense to me, but there are still a lot of places that don't. So, I may have finally found the right definitions and can start making some progress.

This post is not meant to be a rigorous proof that MTW is inconsistent or sloppy in their notation. Probably anyone who has any business reading MTW would not be confused because they are either smart enough to figure it out without a lot of hand-holding (that ain't me) or they've already mastered the math (also not me) & MTW is just intended to show them how to apply it to general relativity. So most readers here will probably wonder how I could have gotten so confused. But for this particular, self-guided hobbyist, it was enough to hold me up for quite a while. And now that I'm in review & recover mode, I see better what they mean, but I still feel that some of their wording & even the math is inconsistent in some places. I'm confident that that fact means I still don't know what I'm doing.

Finally, I'd like to echo one sentiment of an earlier poster:

vanhees71 said:

The overall concept of MTW and the presentation of the material, however, is outstanding.

If I did not agree, I would never have spent so much time trying to understand it. I have found no other book that covers as much material, even if the coverage is a challenge for the likes of me. Well, it keeps me off the streets at night.

FreeThinking · Aug 9, 2019

FreeThinking said:

Also, starting with the last two paragraphs at the bottom of page 208, we establish that ## \boldsymbol {e}_\beta ## and ## \boldsymbol {\omega}^\alpha ## are general bases dual to each other. Continuing onto page 209, equation 8.19a says that ## {\boldsymbol \nabla}_\gamma \equiv {\boldsymbol \nabla}_{{\boldsymbol e}_\gamma} ## . Then further down the page, equation 8.20 defines ## T^\beta_{\alpha,\gamma} \equiv {\boldsymbol \nabla}_\gamma T^\beta_\alpha \equiv \partial_{{\boldsymbol e}_\gamma} T^\beta_\alpha \equiv \partial_\gamma T^\beta_\alpha ##. With a general basis, not a local Lorentz frame, why are we defining the directional derivative ## {\boldsymbol \nabla}_{{\boldsymbol e}_\gamma} \equiv {\boldsymbol \nabla}_\gamma ## to be a partial derivative ## T^\beta_{\alpha,\gamma} \equiv \partial_\gamma T^\beta_\alpha ##? If we were using a coordinate basis, say ## \left \lbrace {\boldsymbol {\xi}_\gamma} \right \rbrace ##, it would make sense since ## {\boldsymbol {\xi}_\gamma} \equiv {\boldsymbol \nabla}_{{\boldsymbol e}_\gamma} ##, the directional derivative operator along the coordinate curve ## {\boldsymbol {\xi}_\gamma} ##. Perhaps if we stare at this section long enough, it might dawn on us what they actually mean. ... I tried to work through equation 8.19a & 8.20, but I'm still not getting the same result they seem to get.

Ok, I've stared at it a while longer, and here's what I'm seeing:

Based on how MTW defines things, as described above, I get ## \boldsymbol \nabla_\gamma T^\beta_\alpha = \Lambda^\mu_\gamma T^\beta_{\alpha,\mu} ##, using ## \boldsymbol e_\gamma = \Lambda^\sigma_\gamma \boldsymbol \xi_\sigma ## where ## \boldsymbol \xi_\sigma ## are the coordinate basis vectors. But if I replace the ## \boldsymbol e_\gamma ## with ## \boldsymbol \xi_\gamma ##, I get ## \boldsymbol \nabla_\gamma T^\beta_\alpha = T^\beta_{\alpha,\gamma} ## which seems to be what MTW says it should be.

But, I see several problems with this:

MTW has used ## \boldsymbol \nabla ## in such a way that it generates gamma correction terms when applied to a general tensor. But applying it to just the components of a tensor does not generate those components unless we interpret it as the semicolon operator, which they do not seem to do in (8.20).
MTW just defined ## \boldsymbol e_\beta ## to be a general basis, not necessarily a coordinate basis. Yet in (8.20) the ## \Lambda^\sigma_\gamma ## needed to define the general basis is nowhere to be found. It is as if MTW has suddenly changed ## \boldsymbol e_\beta ## to be a coordinate basis.

This is a case where the math itself confuses me even if we ignore the text. Which is why, when I arrive at other places in MTW that use the nabla symbol, I'm never sure what they mean at that particular point. I have to work the problem multiple ways until I stumble on the same result.

So, this is a question I would like to have answered: How is one to think about this? Is it a typo? Have they just switched back to using e as a coordinate basis? Is ## \boldsymbol \nabla_\gamma ## intended to be just the simple, elementary, partial derivative at this particular point in the text? Or, which I always consider to be the most likely case, what am I not understanding?

PeterDonis · Aug 9, 2019

FreeThinking said:

Based on how MTW defines things, as described above, I get ##\boldsymbol \nabla_\gamma T^\beta_\alpha = \Lambda^\mu_\gamma T^\beta_{\alpha,\mu}##, using ##\boldsymbol e_\gamma = \Lambda^\sigma_\gamma \boldsymbol \xi_\sigma## where ##\boldsymbol \xi_\sigma## are the coordinate basis vectors.

I don't understand what you're doing here. The Lorentz transformation ##\Lambda## doesn't appear anywhere in the section of MTW you're referring to, and anyway you don't use a Lorentz transformation to transform from local inertial coordinates to general curvilinear coordinates.

FreeThinking said:

but if I replace the ##\boldsymbol e_\gamma## with ##\boldsymbol \xi_\gamma##, I get ##\boldsymbol \nabla_\gamma T^\beta_\alpha = T^\beta_{\alpha,\gamma}## which seems to be what MTW says it should be.

I don't understand what you're doing here either. It doesn't help that you're throwing in your own notation ##\boldsymbol \xi_\gamma##, which doesn't appear anywhere in MTW. MTW always uses ##\boldsymbol e## for the basis vectors, not ##\boldsymbol \xi##.

FreeThinking said:

MTW has used ##\boldsymbol \nabla## in such a way that it generates gamma correction terms when applied to a general tensor. But applying it to just the components of a tensor does not generate those components unless we interpret it as the semicolon operator, which they do not seem to do in (8.20).

You are confused. You don't apply the ##\boldsymbol \nabla## operator to the components of a tensor.

##\boldsymbol \nabla##, by itself, with no subscripts, is a differential operator that takes an ##(m, n)## tensor (a tensor with ##m## upper indexes and ##n## lower indexes, or, in MTW's coordinate-free terminology, a tensor with ##m## slots that accept 1-forms and ##n## slots that accept vectors) to an ##(m, n+1)## tensor. (This is all explained in section 3.5 of MTW.) In other words, if I have a tensor ##\boldsymbol T##, then ##\boldsymbol \nabla \boldsymbol T## is another tensor with one more lower index. Applying ##\boldsymbol \nabla## by itself to the components of a tensor makes no sense.

If I want to express ##\boldsymbol \nabla \boldsymbol T## in component notation, then if ##\boldsymbol T## is a ##(1, 1)## tensor, i.e., in components it is ##T^\alpha{}_\beta##, then ##\boldsymbol \nabla \boldsymbol T## will be ##T^\alpha{}_{\beta ; \gamma}##.

MTW also use the notation ##\boldsymbol \nabla_{\boldsymbol u}##, i.e., ##\boldsymbol \nabla## with a subscript, to denote a different operator, the directional derivative along the 4-vector ##\boldsymbol u##. In component notation, ##\boldsymbol \nabla_{\boldsymbol u} \boldsymbol T## is ##u^\gamma T^\alpha{}_{\beta ; \gamma}##.

In neither case described above do we apply the operator ##\boldsymbol \nabla## (with or without a subscript) to the components of a tensor.
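
As a rough sketch of how this looks when written out in components for a ##(1, 1)## tensor: the lines below assume MTW's index ordering (differentiation index last on the connection coefficients), and they use ##e_\gamma{}^\mu## for the expansion coefficients of ##\boldsymbol e_\gamma## in some coordinate basis. That symbol is notation introduced here purely for illustration; it is not MTW's.

$$T^\alpha{}_{\beta ; \gamma} = T^\alpha{}_{\beta , \gamma} + \Gamma^\alpha{}_{\mu \gamma} T^\mu{}_\beta - \Gamma^\mu{}_{\beta \gamma} T^\alpha{}_\mu , \qquad T^\alpha{}_{\beta , \gamma} \equiv \partial_{\boldsymbol e_\gamma} T^\alpha{}_\beta = e_\gamma{}^\mu \frac{\partial T^\alpha{}_\beta}{\partial x^\mu}$$

$$\left( \boldsymbol \nabla_{\boldsymbol u} \boldsymbol T \right)^\alpha{}_\beta = u^\gamma T^\alpha{}_{\beta ; \gamma}$$

Read this way, the comma term is just the ordinary derivative of the component functions (which are scalar functions) along ##\boldsymbol e_\gamma##, and it reduces to ##\partial / \partial x^\gamma## only when the basis is a coordinate basis. The ##\Gamma## terms arise from differentiating the basis vectors and basis 1-forms, not from differentiating the components.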

FreeThinking said:

MTW just defined ##\boldsymbol e_\beta## to be a general basis, not necessarily a coordinate basis. Yet in (8.20) the ##\Lambda^\sigma_\gamma## needed to define the general basis is nowhere to be found

I don't know where you're getting this from. You don't use a Lorentz transformation to go to general curvilinear coordinates. See above.

FreeThinking said:

This is a case where the math itself confuses me even if we ignore the text.

Have you encountered covariant derivatives in other textbooks? Have they confused you there?

For example, Carroll discusses covariant derivatives in his lecture notes. Were you able to follow his presentation?

FreeThinking · Aug 11, 2019

Peter: My apologies. I was trying to be brief and may have pulled an MTW on you. Just ignore it for now and don't spend any more time on it. I'm working on a longer version that will hopefully explain things more clearly. I've got things going on so it may be a few days before I can post it. I want to make sure I get it right this time.

FreeThinking · Aug 16, 2019

I have a question.

MTW says that the covariant derivative is a machine with slots that accepts inputs and produces an output. Looking specifically on page 255, Box 10.3, part A, sub-parts 3 through 5, here's how I interpret what they're saying there:

## \boldsymbol \nabla ## is a machine, called the covariant derivative, with 3 slots. If we plug certain types of objects into the appropriate slots, we get out new machines depending on which slots we fill. These new machines also have slots that accept the proper kind of object. Depending on which slots we fill, we get different outputs: One selection gives us a directional derivative, another selection gives us a gradient, and filling all the slots gives us a number.

The key point of all of this is that the machine called the "directional derivative" and the machine called the "gradient" are both instances of the more general machine called "covariant derivative". A "directional derivative" is a "covariant derivative", but a "covariant derivative" is not necessarily a "directional derivative".

It's analogous to how a car, a truck, and a bus are each an instance of a motor vehicle, but not every motor vehicle is a car, nor is every one a truck, etc.

So, when MTW writes the term "covariant derivative" but then writes a mathematical expression that looks for all the world like a directional derivative, this practice is consistent with their definition of the "covariant derivative" being the generator of other kinds of derivatives.

Is this the correct view to take of how and why MTW keeps calling expressions that are the directional derivative by the name "covariant derivative"?

fresh_42 · Aug 16, 2019

All derivatives are directional derivatives by construction. However, we can consider the direction as a variable, a slot to be filled. This makes it a covariant derivative, since the direction hasn't been specified yet. The tricky point here, and what MTW tried to describe via the slot machine picture, is the fact that a derivative can be considered from many different perspectives, each resulting in a different object.

It is the path from the narrow high school perspective as ##f'(x)## being a "slope" to ##D_p f(v)## being a covariant derivative. As you can see, we can consider the differential process ##D##, the evaluation at a certain point ##p##, or in a certain direction ##v## or all of them to get a number ##D_pf(v)##. Even the function ##f## can be considered as a variable for the process ##D##.
It is always more or less the same thing, only differing in the point of view. But the objects are different as well. We have e.g. ##D_p(f+g)(v) =D_p(f)(v)+D_p(g)(v)## and ##D_p(f)(v+w)=D_p(f)(v)+D_p(f)(w)## but the same cannot be done on the location level ##p##. We also have ##D_p(f\cdot g)(v)=f(p) \cdot D_p(g)(v)+D_p(f)(v)\cdot g(p)## but this is not true on the direction level ##v##. So depending on what you consider variable, you get different results from the slot machine ##D##. And in the end, you can even consider all these on the component level with different coordinate systems.
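
A small numerical sketch may make the slot picture concrete. It estimates ##D_p f(v)## by a central difference and checks additivity in the function slot, additivity in the direction slot, and the standard product rule ##D_p(f\cdot g)(v)=f(p)\, D_p(g)(v)+D_p(f)(v)\, g(p)##. The sample functions `f` and `g`, the points, the directions, and the helper name `directional_derivative` are all made up for illustration and come from nowhere in MTW.

```python
import math


def directional_derivative(f, p, v, h=1e-6):
    """Central-difference estimate of D_p f(v) for a scalar function f on R^n."""
    forward = [pi + h * vi for pi, vi in zip(p, v)]
    backward = [pi - h * vi for pi, vi in zip(p, v)]
    return (f(forward) - f(backward)) / (2 * h)


# Hypothetical sample functions on R^2, chosen only for illustration.
def f(x):
    return x[0] ** 2 * x[1]


def g(x):
    return math.sin(x[0]) + x[1]


D = directional_derivative
p = [1.0, 2.0]   # location slot
q = [0.5, 1.0]   # a second location
v = [0.5, -1.0]  # direction slot
w = [2.0, 0.25]  # a second direction

# Additive in the function slot: D_p(f + g)(v) vs D_p f(v) + D_p g(v)
gap_function = D(lambda x: f(x) + g(x), p, v) - (D(f, p, v) + D(g, p, v))

# Additive in the direction slot: D_p f(v + w) vs D_p f(v) + D_p f(w)
v_plus_w = [a + b for a, b in zip(v, w)]
gap_direction = D(f, p, v_plus_w) - (D(f, p, v) + D(f, p, w))

# Product rule: D_p(f g)(v) vs f(p) D_p g(v) + D_p f(v) g(p)
gap_product = D(lambda x: f(x) * g(x), p, v) - (f(p) * D(g, p, v) + D(f, p, v) * g(p))

# Not additive in the location slot: D_{p+q} f(v) vs D_p f(v) + D_q f(v)
p_plus_q = [a + b for a, b in zip(p, q)]
gap_location = D(f, p_plus_q, v) - (D(f, p, v) + D(f, q, v))

print(gap_function, gap_direction, gap_product, gap_location)
```

The first three gaps vanish up to finite-difference error, while the last one does not, which is the sense in which the location slot behaves differently from the function and direction slots.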

Here's a list I once gathered:
https://www.physicsforums.com/insights/journey-manifold-su2mathbbc-part/
and "slope" wasn't even mentioned. If you want to read more, have a look at
https://www.physicsforums.com/insights/the-pantheon-of-derivatives-i/

FreeThinking · Aug 25, 2019

@fresh_42: Thanks for your reply.

The first part of your first paragraph sounds to me like you are saying the exact opposite of what I said:

Me: Dir deriv is a case of the more general covar deriv.
You: Covar deriv is a case of the more general dir deriv.

The last part of your first paragraph sounds to me like the "derivative" (with no adjectives) is the main thing & what kind of derivative (with adjectives) we get depends on how we look at it. Seems like what we call things just depends on our mood at the time. Ok, if that's the case, I'll adapt.

The rest of your post and the two references you gave will certainly keep me quiet for a while, which is always a worthy goal. I had already skimmed through those Insights some months ago trying to answer my own questions, but they seemed well over my head. I'll give them a closer look in light of what I've learned since last I read them.

Finally, since my last post, I have picked up a copy of Hobson. On first, preliminary skimming, it appears to me that he uses terms & symbols much closer to what I am used to from previous books I've read. So I'm going to concentrate on Hobson for a while before I return to re-reading MTW.

So, I've got a lot of reading to do now. Thanks to everyone for your help.

fresh_42 · Aug 25, 2019

FreeThinking said:

Me: Dir deriv is a case of the more general covar deriv.
You: Covar deriv is a case of the more general dir deriv.

This is because "more general" is without meaning! A covariant derivative is certainly a rather abstract construction. E.g. Wikipedia says: "A covariant derivative is a certain connection on a tensor bundle". So in this sense a covariant derivative is more general, as almost nothing is specified. If we say directional derivative, then we automatically ask: which direction? which function?, and this is less abstract.

What I wanted to say is that, whatever you take, at its core a derivative is a linear approximation of something curved. I was referring to this underlying principle, which is "more general".

If you choose specific examples, then the covariant derivative is probably "more general", but this is semantics as long as you do not define a measure for generality.

We always have derivative = (a topological object ##\mathcal{T}##, a location ##P##, a linear approximation ##D##, a flow ##\phi(t)## with direction ##v##). The topological object is usually the only part which isn't considered variable. This means depending on what is variable, the thing has different names:

##(\mathcal{T}, - , - , - , -) = \text{ manifold }##
##(\mathcal{T}, - , - , \phi(t) , -) = \text{ flow }##
##(\mathcal{T}, - , -, \phi(t), v ) = \text{ vector field }##
##(\mathcal{T}, - , D, - , - ) = \text{ differentiation }##
##(\mathcal{T}, - , D , - , - ) = \text{ derivation }##
##(\mathcal{T}, - , D , - , v ) = \text{ tangent bundle }##
##(\mathcal{T}, - , D, \phi(t) , - ) = \text{ cotangent bundle }##
##(\mathcal{T}, - , D, \phi(t), v) = \text{ connection }##
##(\mathcal{T}, P , -, - , v) = \text{ tangent }##
##(\mathcal{T}, P , D, - , -) = \text{ section }##
##(\mathcal{T}, P , D, - , v) = \text{ directional derivative }##
##(\mathcal{T}, P , D, - , x_i) = \text{ partial derivative }##
##(\mathcal{T}, P , D, - , (x_1,\ldots,x_n) ) = \text{ total differential }##
##(\mathcal{T}, P , D, \phi(t) , v) = \text{ slope }##

However, please, please, do not take this literally, even less than your "more general". This list is very loosely speaking and only meant to stress from how many different sides you can look at what in the end is only a slope. I pressed some terms into the scheme, that's why it does not serve as a definition - only as an impression! The most general term in the sense of common language is probably 'section of a fiber bundle'.

FreeThinking · Aug 27, 2019

Ok, thanks. This is way, way, way over my head, but I get your drift. There's more than one way to look at this, and one can get very abstract about it. For now, I'm going to have to stick to a more concrete view until I get more experience with it all.

I've been reading Hobson and I find that he definitely uses the same terms & notations that I picked up from previously read books. Coupled with his providing more steps in the derivations, I find him much easier to follow than MTW, even easier than "A first course ..." by Schutz. He defines the covariant derivative as I first learned it: ## \nabla_\gamma T^\rho_\lambda = T^\rho_{\lambda ; \gamma} = T^\rho_{\lambda , \gamma} + \Gamma^\rho_{\sigma \gamma} T^\sigma_\lambda - \Gamma^\sigma_{\lambda \gamma} T^\rho_\sigma ## .
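
To convince myself that the formula does what it should, here is a minimal sketch that applies it in plane polar coordinates ##(r, \theta)##, which is a coordinate basis, so the comma really is a plain partial derivative. The Christoffel symbols are the standard ones for the flat plane, with the differentiation index last. The test tensor is ##\boldsymbol e_x \otimes \boldsymbol d x##, constant in Cartesian terms, so its covariant derivative should vanish even though its polar components vary. The function names and the sample point are my own choices for illustration, not anything from Hobson or MTW.

```python
import math


def christoffel_polar(r):
    """G[rho][sigma][gamma] = Gamma^rho_{sigma gamma} for the flat plane in (r, theta)."""
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r       # Gamma^r_{theta theta}
    G[1][0][1] = 1.0 / r  # Gamma^theta_{r theta}
    G[1][1][0] = 1.0 / r  # Gamma^theta_{theta r}
    return G


def covariant_derivative(T, x, h=1e-6):
    """Return cov[rho][lam][gamma] = T^rho_{lam;gamma} at x = [r, theta].

    T maps a point to a 2x2 nested list T[rho][lam]; partial derivatives
    are estimated by central differences.
    """
    G = christoffel_polar(x[0])
    T0 = T(x)
    cov = [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
    for gamma in range(2):
        x_plus, x_minus = list(x), list(x)
        x_plus[gamma] += h
        x_minus[gamma] -= h
        T_plus, T_minus = T(x_plus), T(x_minus)
        for rho in range(2):
            for lam in range(2):
                # Comma term: T^rho_{lam,gamma}
                value = (T_plus[rho][lam] - T_minus[rho][lam]) / (2 * h)
                for sigma in range(2):
                    value += G[rho][sigma][gamma] * T0[sigma][lam]  # +Gamma^rho_{sigma gamma} T^sigma_lam
                    value -= G[sigma][lam][gamma] * T0[rho][sigma]  # -Gamma^sigma_{lam gamma} T^rho_sigma
                cov[rho][lam][gamma] = value
    return cov


def constant_cartesian_tensor(x):
    """Polar components of e_x (x) dx, a tensor that is constant in Cartesian terms."""
    r, theta = x
    V = [math.cos(theta), -math.sin(theta) / r]  # e_x in the basis (e_r, e_theta)
    W = [math.cos(theta), -r * math.sin(theta)]  # dx in the basis (dr, dtheta)
    return [[V[a] * W[b] for b in range(2)] for a in range(2)]


cov = covariant_derivative(constant_cartesian_tensor, [2.0, 0.7])
max_abs = max(abs(cov[a][b][c]) for a in range(2) for b in range(2) for c in range(2))
print(max_abs)
```

The partial derivatives of the polar components alone are not zero here; it is only after the two ##\Gamma## terms are added that everything cancels, up to finite-difference error.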

I'll keep a copy of your post & keep it handy as I read your Insights. Thanks for taking so much time & effort for me. I sincerely appreciate it.
