The Chain Rule, death to anyone that breaks the rule

In summary: the poster is reviewing the proof of Stokes' theorem in Stewart (which appears to follow Swokowski and Salas & Hille almost word for word, perhaps because successive authors continued the same text over time) and gets stuck on the chain-rule step, where the partial derivative of Q with respect to x picks up the extra term (∂Q/∂z)(∂z/∂x). The replies explain that this does not come from the linearization of the tangent plane directly, but from the general (multivariable) chain rule applied with z regarded as a function of x and y.
  • #1
Cyrus
Ok, so I am reviewing multivariable calculus now that I have some time (why is it taking me so long to grasp some of these concepts!? :mad: ), and I am reading the proof of Stokes' theorem. The book I use is Stewart, but it seems to be copied word for word from Swokowski, which in turn follows S. L. Salas and Einar Hille (maybe each new author continued the publishing over time?).

Here is what's throwing me a curve. At one point, they show that [tex]\int\int_S curl\, \vec{F} \cdot d\vec{S} = \int_c \vec{F} \cdot d\vec{r}[/tex]

They assume that [tex] \vec{F}=P\hat{i}+Q\hat{j}+R\hat{k}[/tex]

I will just put the proof up to avoid confusion when I refer to it:

(1) [tex] \int_c \vec{F} \cdot d\vec{r} = \int^b_a (P\frac{dx}{dt} +Q\frac{dy}{dt}+ R\frac{dz}{dt}) dt [/tex]

(2) [tex] ' ' = \int^b_a [P \frac{dx}{dt} +Q \frac{dy}{dt}+ R( \frac{\partial z}{\partial x} \frac{dx}{dt} + \frac{\partial z}{\partial y} \frac{dy}{dt})] dt [/tex]

(3) [tex]' ' = \int^b_a [(P+R\frac{\partial z}{\partial x})\frac{dx}{dt} + (Q+R\frac{\partial z}{\partial y})\frac{dy}{dt}] dt [/tex]

(4) [tex]' '=\int_{c1} (P+R\frac{\partial z}{\partial x})dx + (Q+R\frac{\partial z}{\partial y})dy [/tex]

(5) [tex]' '=\int\int_D[\frac{\partial}{\partial x}(Q+R\frac{\partial z}{\partial y})-\frac{\partial}{\partial y}(P+R\frac{\partial z}{\partial x})]dA [/tex]

By Green's Theorem. Then using the chain rule again and remembering that P,Q, and R are functions of x,y and z, and that z is a function itself of x and y, we get:

(6) [tex] \int_c \vec{F} \cdot d\vec{r} = \int\int_D [( \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z} \frac{\partial z}{\partial x}+\frac{\partial R }{\partial x}\frac{\partial z }{\partial y }+ \frac{\partial R }{\partial z }\frac{\partial z }{\partial x }\frac{\partial z }{\partial y }+ R \frac{\partial^2 z }{\partial x \partial y }) -( \frac{\partial P }{\partial y }+ \frac{\partial P }{\partial z } \frac{\partial z }{\partial y }+ \frac{\partial R }{\partial y } \frac{\partial z }{\partial x } + \frac{\partial R }{\partial z } \frac{\partial z }{\partial y } \frac{\partial z }{\partial x }+R \frac{\partial^2 z }{\partial y \partial x })]dA [/tex]


TOO MUCH TYPING! :yuck:

Ok, time to get back on track.

I will walk through each step, (1)-(6), and describe what they are doing. If I am wrong along the way, let me know. I will also explain what part is throwing me off.

(1) This is just standard notation derived previously when doing the line integral of a vector field. The reason for the dt outside the parentheses is that we have x as a function of t, x = f(t), and when doing a line integral we want to integrate P with respect to dx, not dx/dt (dx/dt is the rate at which x changes, but we are interested in the change of x itself, not its rate of change). That is why the extra dt sits outside the parentheses: dx/dt * dt = dx, so we end up integrating with respect to dx. Similar arguments hold for dy/dt and dz/dt.

(2) This part is fine too, because z is a function of x and y, and x and y are in turn functions of t. So when we take the total derivative of z, we get the expression inside the parentheses. This makes sense, since it was derived earlier in the book: we did the linearization of the TANGENT PLANE and arrived at an equation for dz; we divided that entire equation by dt, took the limit, and we get the result shown in (2). (A quick symbolic check of this step is given right after step (5) below.)

(3) Easy, Easy, Easy, just move things around and factor out differentials

(4) Seems like the trick played here is that they UN-parameterized the integral with respect to t, writing it back in terms of x and y again.

(5) Now they just apply Green's theorem, which they can do because they use the plane curve c1, the projection of the boundary curve onto the xy plane, so it only varies in x and y.
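As a sanity check on step (2), here is a minimal sympy sketch (my own check, with arbitrary made-up choices of z(x,y), x(t), and y(t); nothing here is from the book) verifying that d/dt of z(x(t),y(t)) matches the chain-rule expansion used above:

[code]
# Minimal sympy sketch: check d/dt z(x(t), y(t)) = z_x * dx/dt + z_y * dy/dt.
# The specific z, x(t), y(t) below are arbitrary, made-up illustrations.
import sympy as sp

t, X, Y = sp.symbols('t X Y')
x = sp.cos(t)                 # made-up parametrization x(t)
y = sp.sin(t)                 # made-up parametrization y(t)
z = X**2 * Y + sp.exp(Y)      # made-up surface z(x, y), written in the symbols X, Y

# left side: differentiate the full composition z(x(t), y(t)) with respect to t
lhs = sp.diff(z.subs({X: x, Y: y}), t)

# right side: (dz/dx) dx/dt + (dz/dy) dy/dt, evaluated along the curve
rhs = (sp.diff(z, X).subs({X: x, Y: y}) * sp.diff(x, t)
       + sp.diff(z, Y).subs({X: x, Y: y}) * sp.diff(y, t))

print(sp.simplify(lhs - rhs))  # prints 0: the two expressions agree
[/code]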

Enter Confusion:

Performing the partial derivative in step (6) is messing me up. Since the procedure is the same for all the partials, let's just deal with the first group inside the brackets [ ] (the terms being subtracted follow the same differentiation procedure).

Now we have the partial derivative of Q with respect to x. Q is a function of x AND of z, and of course z is itself a function of x.

But how did they get the part:

[tex] \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z} \frac{\partial z}{\partial x} [/tex].

I understand you have to take the derivative of the x part and the z part. But how did they arrive at this equation? It does not resemble the linearization, i.e. [tex]dz=\frac{\partial f}{\partial x}dx+\frac{\partial f}{\partial y}dy[/tex]. What proof can I turn to, so that I can say: ah yes, this is why you take the derivative this way?
 
  • #2
You're missing a couple of lines there, #4 and 5.
 
  • #3
It's just an application of the chain rule. See 'The Chain Rule (General version)' on page 793. You might as well consider x and y as functions x(x,y), y(x,y) and use that y is constant w.r.t. x (it doesn't really depend on x). It looks weird, but in this way you can readily use the form presented in the book if it's not immediately obvious:

[tex]\frac{\partial}{\partial x} Q(x(x,y),y(x,y),z(x,y))=\frac{\partial Q}{\partial x}\frac{\partial x}{\partial x}+\frac{\partial Q}{\partial y}\frac{\partial y}{\partial x}+\frac{\partial Q}{\partial z}\frac{\partial z}{\partial x}=[/tex]


[tex]\frac{\partial Q}{\partial x}\cdot 1+\frac{\partial Q}{\partial y}\cdot 0+\frac{\partial Q}{\partial z}\frac{\partial z}{\partial x}=\frac{\partial Q}{\partial x}+\frac{\partial Q}{\partial z}\frac{\partial z}{\partial x}[/tex]
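For what it's worth, a quick sympy check of this identity with made-up choices of Q and z (just an illustration, not from the book):

[code]
# Check: d/dx of Q(x, y, z(x, y)) equals dQ/dx + (dQ/dz)(dz/dx).
# Q and Z below are arbitrary, made-up examples.
import sympy as sp

x, y, z = sp.symbols('x y z')
Q = x*sp.sin(z) + y**2 * z       # made-up Q(x, y, z)
Z = x**2 * y                     # made-up z = z(x, y)

lhs = sp.diff(Q.subs(z, Z), x)   # differentiate the composite directly
rhs = (sp.diff(Q, x) + sp.diff(Q, z) * sp.diff(Z, x)).subs(z, Z)

print(sp.simplify(lhs - rhs))    # prints 0
[/code]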
 
  • #4
Galileo said:
It's just an application of the chain rule. See 'The Chain Rule (General version)' on page 793. You might as well consider x and y as functions x(x,y), y(x,y) and use that y is constant w.r.t. x (it doesn't really depend on x). It looks weird, but in this way you can readily use the form presented in the book if it's not immediately obvious:

[tex]\frac{\partial}{\partial x} Q(x(x,y),y(x,y),z(x,y))=\frac{\partial Q}{\partial x}\frac{\partial x}{\partial x}+\frac{\partial Q}{\partial y}\frac{\partial y}{\partial x}+\frac{\partial Q}{\partial z}\frac{\partial z}{\partial x}=[/tex]


[tex]\frac{\partial Q}{\partial x}\cdot 1+\frac{\partial Q}{\partial y}\cdot 0+\frac{\partial Q}{\partial z}\frac{\partial z}{\partial x}=\frac{\partial Q}{\partial x}+\frac{\partial Q}{\partial z}\frac{\partial z}{\partial x}[/tex]


Funny you should say that, I was thinking about it in terms of what you said last night right before I went to bed, but I wasn't sure. The thing I was not sure about was the second factor in each term, where you have dx/dx, dy/dx and dz/dx. Is the reason you have partial x / partial x and not dx/dx that you made x a function of x and y?
 
  • #5
Galileo said:
It's just an application of the chain rule. See 'The Chain Rule (General version)' on page 793. You might as well consider x and y as functions x(x,y), y(x,y) and use that y is constant w.r.t. x (it doesn't really depend on x). It looks weird, but in this way you can readily use the form presented in the book if it's not immediately obvious:

[tex]\frac{\partial}{\partial x} Q(x(x,y),y(x,y),z(x,y))=\frac{\partial Q}{\partial x}\frac{\partial x}{\partial x}+\frac{\partial Q}{\partial y}\frac{\partial y}{\partial x}+\frac{\partial Q}{\partial z}\frac{\partial z}{\partial x}=[/tex]


[tex]\frac{\partial Q}{\partial x}\cdot 1+\frac{\partial Q}{\partial y}\cdot 0+\frac{\partial Q}{\partial z}\frac{\partial z}{\partial x}=\frac{\partial Q}{\partial x}+\frac{\partial Q}{\partial z}\frac{\partial z}{\partial x}[/tex]
The reason why this looks "weird" is the use of sloppy notation, as Galileo is, of course, fully aware.
One might use a pedantic notation here:
1. Let us have a function [tex]Q'(x',y',z')[/tex]
2. Let [tex](x',y',z')\in\mathbb{R}^{3}[/tex] be related to [tex](x,y)\in\mathbb{R}^{2}[/tex] as follows:
x'=X(x,y)=x, y'=Y(x,y)=y, z'=Z(x,y)
3. We may now define a function Q(x,y) as follows:
[tex]Q(x,y)=Q'(X(x,y),Y(x,y),Z(x,y))[/tex]
4. We also define: [tex]\vec{x}'=(x',y',z'), \vec{X}(x,y)=(X(x,y),Y(x,y),Z(x,y))[/tex]
5. Thus, we have:
[tex]\frac{\partial{Q}}{\partial{x}}=(\frac{\partial{Q'}}{\partial{x'}}\frac{\partial{X}}{\partial{x}}+\frac{\partial{Q'}}{\partial{y'}}\frac{\partial{Y}}{\partial{x}}+\frac{\partial{Q'}}{\partial{z'}}\frac{\partial{Z}}{\partial{x}})\mid_{\vec{x}'=\vec{X}(x,y)}[/tex]

6. Now, one might ask oneself if pedantic notation is really worthwhile..:wink:
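For a concrete (made-up) illustration of this notation: take [tex]Q'(x',y',z')=x'z'[/tex] and [tex]Z(x,y)=xy[/tex] (with X(x,y)=x, Y(x,y)=y as above), so that [tex]Q(x,y)=Q'(x,y,xy)=x^{2}y[/tex]. Directly, [tex]\frac{\partial Q}{\partial x}=2xy[/tex]; the formula in step 5 gives [tex]z'\cdot 1+0\cdot 0+x'\cdot y\mid_{\vec{x}'=\vec{X}(x,y)}=xy+xy=2xy[/tex], so the two computations agree.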
 
  • #6
arildno said:
The reason why this looks "weird" is the use of sloppy notation, as Galileo is of course, fully aware of.
One might use a pedantic notation here:
1. Let us have a function [tex]Q'(x',y',z')[/tex]
2. Let [tex](x',y',z')\in\mathbb{R}^{3}[/tex] be related to [tex](x,y)\in\mathbb{R}^{2}[/tex] as follows:
x'=X(x,y)=x,y'=Y(x,y)=y,z'=Z(x,y)
3. We may now define a function Q(x,y) as follows:
[tex]Q(x,y)=Q'(X(x,y),Y(x,y),Z(x,y))[/tex]
4. We also define: [tex]\vec{x}'=(x',y',z'), \vec{X}(x,y)=(X(x,y),Y(x,y),Z(x,y))[/tex]
5.Thus, we have:
[tex]\frac{\partial{Q}}{\partial{x}}=(\frac{\partial{Q'}}{\partial{x'}}\frac{\partial{X}}{\partial{x}}+\frac{\partial{Q'}}{\partial{y'}}\frac{\partial{Y}}{\partial{x}}+\frac{\partial{Q'}}{\partial{z'}}\frac{\partial{Z}}{\partial{x}})\mid_{\vec{x}'=\vec{X}(x,y)}[/tex]

6. Now, one might ask oneself if pedantic notation is really worthwhile..:wink:

You are losing me at line 4, can you clarify that please? Why is x' a vector? It's a scalar, based on your notation.
 
  • #7
No, I just use [tex]\vec{x}'[/tex] to designate a point in [tex]\mathbb{R}^{3}[/tex]
Any such point can be represented with the aid of a vector having three components; the components of the vector [tex]\vec{x}'[/tex] are the scalars x',y',z'.
 
  • #8
i recommend you write out a proof yourself for integration around a rectangle. it is very easy, following from precisely two facts: FTC and Fubini. I.e. Fubini reduces the two dimensional integral to one dimensional integrals, and then FTC does those.

Then all other cases are obtained from that by the chain rule, which is a separate issue, having nothing to do with Stokes theorem, only with how to generalize it technically.
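Along these lines, here is a small sympy sketch (my own, with arbitrary made-up P and Q) that verifies Green's theorem on a rectangle by computing the iterated (Fubini) double integral and the four boundary edge integrals directly:

[code]
# Green's theorem on the rectangle [0,1] x [0,2]:
# the boundary integral of P dx + Q dy (counterclockwise) equals the
# double integral of (dQ/dx - dP/dy), done as iterated integrals (Fubini).
# P and Q are arbitrary, made-up choices.
import sympy as sp

x, y = sp.symbols('x y')
P = x*y**2               # made-up P(x, y)
Q = sp.sin(x) + x*y      # made-up Q(x, y)
a, b, c, d = 0, 1, 0, 2  # rectangle [a,b] x [c,d]

# double integral of (dQ/dx - dP/dy), iterated: first in x, then in y
double_integral = sp.integrate(
    sp.integrate(sp.diff(Q, x) - sp.diff(P, y), (x, a, b)), (y, c, d))

# boundary integral over the four edges, traversed counterclockwise
bottom = sp.integrate(P.subs(y, c), (x, a, b))   # y = c, dy = 0
right  = sp.integrate(Q.subs(x, b), (y, c, d))   # x = b, dx = 0
top    = sp.integrate(P.subs(y, d), (x, b, a))   # y = d, right to left
left   = sp.integrate(Q.subs(x, a), (y, d, c))   # x = a, top to bottom
boundary_integral = bottom + right + top + left

print(sp.simplify(double_integral - boundary_integral))  # prints 0
[/code]

The two ingredients here (iterated integration and the FTC doing each one-dimensional integral) are exactly the ones named above.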
 
  • #9
Before, you had x'=X(x,y), so x' was a scalar function. Now you have x' as a vector. How come you changed it into a vector function? I thought x' was only the scalar component in the x direction.
 
  • #10
Is [tex]\vec{x}'[/tex] the same symbol as x'?
Doesn't look like it to me, at least.
I could equally well have called my vector [tex]\vec{v}'=(x',y',z')[/tex]
if you are happier with that.
 
  • #11
arildno said:
Is [tex]\vec{x}'[/tex] the same symbol as x'?
Doesn't look so for me, at least.
I could equally well have called my vector [tex]\vec{v}'=(x',y',z')[/tex]
if you are happier with that.

Only if I squint real hard. :tongue2:

I see what you're saying now. It looks as if [tex]\vec{x}'[/tex] and [tex] \vec{X} [/tex] are the same thing. Why was it necessary to use both of them in the proof? Oh, and I think [tex] \vec{s}' [/tex] would have been a little bit clearer. Calling the position vector x confused me with the x component of a vector.
 
  • #12
With [tex]\vec{x}'[/tex], I designate a POINT (element) in [tex]\mathbb{R}^{3}[/tex].

With (x,y) (or, if you like, [tex]\vec{x}[/tex], no prime/apostrophe here!), I designate a POINT (element) in [tex]\mathbb{R}^{2}[/tex].

The statement [tex]\vec{x}'=\vec{X}(x,y)[/tex] says that we have a mapping [tex]\vec{X}:\mathbb{R}^{2}\to\mathbb{R}^{3}[/tex], i.e., a point (x,y) in the plane (the input value) is mapped onto a point (x',y',z') (the output value) in [tex]\mathbb{R}^{3}[/tex] in accordance with some rule.
 
  • #13
I see. :-)
 
  • #14
x'=X(x,y)=x, y'=Y(x,y)=y, z'=Z(x,y)=z. Here you are reusing the variables x and y when you write x'=X(x,y)=x. I guess the thing to remember is that the x in X(x,y)=x on the right of the equals sign is not the same as the variable x inside the parentheses. It has a double meaning that I should be VERY careful about.
 
  • #15
Our "machine", or, function, X has input (x,y), and spits out the x-value of (x,y) as the output (i.e, X can be regarded as the operation of projecting (x,y) onto the x-axis.)
That is, we have specified a rule so that given input, we may calculate output.

This output is then what x' is set equal to.
 
  • #16
Right, just to be clear, the x's I was referring to are the x inside the (x,y) of X(x,y) and the x in X(x,y)=x on the right side of the equals sign. Here we are reusing the variable x for two different purposes. I didn't mean the capital X. (Sorry if I was not clear on that.)
 
  • #17
It was I who wasn't clear about it from the start (sorry about that), but I think you've got it.
 
  • #18
this stuff is trivial, or ought to be if done right. Stewart et al are just clogging it up with notation.

Look, if W is a one form, (i.e. something like Pdx + Q dy), then the "curl" is just dW = dP^dx + dQ^dy which is expanded out by the rule that dP = dP/dx dx + dP/dy dy, and dx^dx = 0 = dy^dy, and dy^dx = -dx^dy, as usual. so dW = (dQ/dx - dP/dy)dx^dy.

Then greens theorem, which is repeated integration plus FTC, says the integral of W over the boundary of the rectangle equals the integral of dW over the rectangle.


then the so called "Stokes theorem" just says this remains true for surfaces parametrized by a rectangle. that's all. and this is just the chain rule.

I.e. let f(s,t) = (x(s,t), y(s,t), z(s,t)) be a map from the rectangle R in the s,t plane into x,y,z space.

Then by definition, the boundary of f(R), is f of the boundary of R. and if W is a one form in x,y,z space, then curl of the pullback f*W, of W to s,t space, is the pullback of the curl of W. (this is the chain rule). i.e. f*(dW) = d(f*W)

Hence the integral of W over the boundary of f(R) equals the integral of f*W over the boundary of R, which by Green equals the integral of d(f*W) over R, which equals the integral of f*(dW) over R, which by definition equals the integral of dW over f(R).

If we write dot product for integral and b for boundary, we get just this:

<W,b(f(R))> = <W,f(bR)> = <f*W, b(R)> = <d(f*W),R> = <f*(dW),R> = <dW,f(R)>. done.

I.e. the first equation is true because f(b(R)) = b(f(R)), the second by definition of how to integrate over a parametrized curve, the third is true by Green's theorem, the fourth is true by the chain rule which implies that f*(dW) = d(f*W), and the last is true by definition.

that's it.
 
  • #19
Ok, so far so good now I hope. Can you explain this part to me though.

We have:

[tex] \partial Q = \frac{\partial Q'}{\partial x'} \partial x + \frac{\partial Q'}{\partial y'} \partial y [/tex].

Let's just assume right now that I am working in two dimensions so this resembles the linearization of the tangent plane approximation better. Then the equation I wrote is very similar; however, we don't have dx anymore, we have [tex] \partial x [/tex]. But when we did the linearization with the tangent plane approximation, we used dx. Why is it not written in terms of dx?

I.e. not written as:

[tex] \partial Q = \frac{\partial Q'}{\partial x'} dx + \frac{\partial Q'}{\partial y'} dy [/tex].
 
  • #20
if you mean me, i do not know how to write curly d's so i use square d's for everything. but as you suggest it should be dQ = (curlydQ)/(curlydx) dx + (curlydQ)/(curlydy) dy.
 
  • #21
mathwonk said:
if you mean me, i do not know how to write curly d's so i use square d's for everything. but as you suggest it should be dQ = (curlydQ)/(curlydx) dx + (curlydQ)/(curlydy) dy.

No sorry I was not referring to your post.
 
  • #22
You should study carefully what mathwonk has provided you with; I'll look in on your question afterwards.
 
  • #23
Ok, I'll look at it now.
 
  • #24
Mathwonk, can you change your post so that I can tell the difference between the partials and the d's? I am reading through, but I don't know when and where you mean dx or [tex]\partial x[/tex]. Thanks.
 
  • #25
doesn't matter, only one possibility is possible in each case. i.e. if f(x,y) is a function of two variables, then df/dx obviously means partial wrt x.
 
  • #26
Ok, let's go back a few steps perhaps. Could you please show me a simple proof of the GENERAL version of the chain rule. I am having trouble seeing how that is derived. If you can help me with that, then it will be a great help.
 
  • #27
you do not need this to follow the proof of the stokes theorem. all you need is this:


definition: if (x,y,z) = f(s,t) = (x(s,t), y(s,t),z(s,t)).

and if W = Adx + Bdy +Cdz, then
f*W = A dx/ds ds + A dx/dt dt + B dy/ds ds + B dy/dt dt + C dz/ds ds + C dz/dt dt

= [A dx/ds + Bdy/ds +Cdz/ds] ds + [Adx/dt +Bdy/dt +Cdz/dt] dt

= Pds +Qdt.


Then d(f*W) = [dQ/ds - dP/dt] ds^dt.

and dW = [dB/dx-dA/dy] dx^dy + [dC/dy-dB/dz]dy^dz + [dA/dz -dC/dx]dz^dx.

Then check that f*(dW) = d(f*W).

this is the only missing step in the proof of the stokes theorem.

the chain rule itself is much harder to prove. this is just a tedious, lengthy, but mechanical calculation. the chain rule requires an idea to prove it.

i know how, but it will not shed any light on the current question.
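For concreteness, a sympy sketch of this pullback definition with made-up choices of A, B, C and of the parametrization (my own illustration, not part of the proof):

[code]
# Expand f*W = [A x_s + B y_s + C z_s] ds + [A x_t + B y_t + C z_t] dt = P ds + Q dt
# for made-up A, B, C and a made-up map f(s,t) = (x(s,t), y(s,t), z(s,t)).
import sympy as sp

s, t, x, y, z = sp.symbols('s t x y z')

A, B, C = y*z, x*z, x*y              # made-up coefficients of W = A dx + B dy + C dz
xf, yf, zf = s + t, s*t, s - t**2    # made-up parametrization f(s, t)

def pull(expr):
    # compose a function of (x, y, z) with the parametrization f(s, t)
    return expr.subs({x: xf, y: yf, z: zf})

P = pull(A)*sp.diff(xf, s) + pull(B)*sp.diff(yf, s) + pull(C)*sp.diff(zf, s)
Q = pull(A)*sp.diff(xf, t) + pull(B)*sp.diff(yf, t) + pull(C)*sp.diff(zf, t)

print(sp.expand(P))   # coefficient of ds in f*W
print(sp.expand(Q))   # coefficient of dt in f*W
[/code]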
 
  • #28
actually that's too complicated. just assume W = Adx. then the same proof will work for the general W.
 
  • #29
OK, we've at least identified the place in the argument where the messy chain rule calculation occurs: in showing that f*(dW) = d(f*W).

I) Let's do this for a trivially simple case: W = dx.

Then if f(s,t) = (x(s,t),y(s,t),z(s,t)), f*dx just means composing x with f:

i.e. then f*dx = ∂x/∂s ds + ∂x/∂t dt,

then d(f*dx) = d(∂x/∂s)^ds + d(∂x/∂t)^dt

= [(∂^2x/∂s^2) ds + (∂^2x/∂s∂t) dt]^ds + [(∂^2x/∂t∂s) ds + (∂^2x/∂t^2) dt]^dt

= (since ds^ds = 0 = dt^dt) (∂^2x/∂t∂s - ∂^2x/∂s∂t) ds^dt = 0, by equality of mixed partials. so there is something else going on here besides the chain rule.

now that we see d(f*dx) = 0 we claim also that f*(d(dx)) = 0, but that is immediate because d(dx) is already zero, i.e. this equals d(1)^dx = (0)^dx = 0, since 1 is the coefficient function of dx.



II) Now let's try pulling back a function A(x,y,z) by f, and differentiating. it is longer but still mindless calculation.

i.e. f*A = A(x(s,t),y(s,t),z(s,t)), so d(f*A) = ∂A/∂s ds + ∂A/∂t dt,

where by the chain rule this time:
∂A/∂s = (∂A/∂x) (∂x/∂s) + (∂A/∂y) (∂y/∂s) + (∂A/∂z) (∂z/∂s), etc. for t.


now compute in the other order: i.e. first take
dA = (∂A/∂x) dx + (∂A/∂y) dy + (∂A/∂z) dz, and then take f* of that.

i.e. substitute in dx = ∂x/∂s ds + ∂x/∂t dt, etc. for y, z.


this gives ultimately,
f*(dA) = (∂A/∂x) [ ∂x/∂s ds + ∂x/∂t dt] + (∂A/∂y) [ ∂y/∂s ds + ∂y/∂t dt] + ... etc. for z

= [(∂A/∂x) (∂x/∂s) + (∂A/∂y) (∂y/∂s) + (∂A/∂z) (∂z/∂s)] ds + ... etc. for t,

= ∂A/∂s ds + ∂A/∂t dt = d(f*A), as claimed.


III) Now we can do it for Adx, i.e. then d(Adx) = dA^dx, so

f*(d(Adx)) = f*(dA)^f*(dx) = (by the previous argument) d(f*A)^f*dx

= d(f*A)^f*dx + f*A ^ d(f*dx) (since we know the second term is zero from above)

= (by the Leibniz rule for d) d(f*A ^ f*dx) = d(f*(Adx)). done.


IV) Then we are done for any W = Adx + Bdy + Cdz, by linearity of f* and d.




Well checking all this is tedious, but mindless, and can at least be separated out from the rest of the argument.

i.e. the whole thing boils down to just f*(dW) = d(f*W), which is the chain rule plus the equality of mixed partials.

The advantage of this point of view is that you only need to prove this relation once, and then can use it in calculations of many types in many settings, instead of embedding it in a proof of stokes theorem.

I.e. you should realize what it is you are proving so you can use it again.
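Here is a sympy sketch (my own, with arbitrary made-up choices of A and of the map f) that checks the key identity f*(dW) = d(f*W) for W = A dx by comparing the ds^dt coefficients of both sides, following (I)-(III) above:

[code]
# Check f*(d(A dx)) = d(f*(A dx)) for made-up A(x,y,z) and f(s,t).
import sympy as sp

s, t, x, y, z = sp.symbols('s t x y z')

Afun = x*y + sp.exp(z)               # made-up coefficient A of W = A dx
xf, yf, zf = s*t, s + t, s**2 - t    # made-up parametrization f(s, t)

def pull(expr):
    # compose a function of (x, y, z) with f(s, t)
    return expr.subs({x: xf, y: yf, z: zf})

# f*(A dx) = (A o f)(dx/ds ds + dx/dt dt) = p ds + q dt
p = pull(Afun) * sp.diff(xf, s)
q = pull(Afun) * sp.diff(xf, t)
d_of_pullback = sp.diff(q, s) - sp.diff(p, t)    # ds^dt coefficient of d(f*W)

# f*(dA) = u ds + v dt, substituted term by term (without using the identity)
pairs = [(x, xf), (y, yf), (z, zf)]
u = sum(pull(sp.diff(Afun, w)) * sp.diff(wf, s) for w, wf in pairs)
v = sum(pull(sp.diff(Afun, w)) * sp.diff(wf, t) for w, wf in pairs)
# ds^dt coefficient of f*(dW) = f*(dA) ^ f*(dx)
pullback_of_d = u * sp.diff(xf, t) - v * sp.diff(xf, s)

print(sp.simplify(d_of_pullback - pullback_of_d))  # prints 0
[/code]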
 
  • #30
could we change the title of this thread to make it a little less apocalyptic?
 
  • #31
Hey arildno, I think I am getting somewhere, can you please help me out?

Let's start with the linearization of a function of three variables,

w=b(x,y,z), with x=f(s,t), y=g(s,t), z=h(s,t)

Then the linearization approximation to the tangent surface is defined as:

[tex] \Delta w = \frac{\partial w}{\partial x} \Delta x + \frac{\partial w}{\partial y} \Delta y + \frac{\partial w}{\partial z} \Delta z [/tex]

if we divide this by delta s and take the limit we get:

[tex] \lim_{\Delta s \rightarrow 0} \frac{\Delta w}{\Delta s} = \frac{\partial w}{\partial x} \lim_{\Delta s \rightarrow 0} \frac{\Delta x}{\Delta s} + \frac{\partial w}{\partial y} \lim_{\Delta s \rightarrow 0} \frac{\Delta y}{\Delta s} + \frac{\partial w}{\partial z} \lim_{\Delta s \rightarrow 0} \frac{\Delta z }{\Delta s}[/tex]

which in turn becomes:

[tex] \frac{\partial w}{\partial s}= \frac{\partial w}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial w}{\partial y} \frac{\partial y}{\partial s} + \frac{\partial w}{\partial z} \frac{\partial z}{\partial s} [/tex]

Now for what you stated, we change W into Q',
x=f(s,t) y=g(s,t) z=h(s,t)

changes to

x=X(s,t) y=Y(s,t) z=Z(s,t)

and (s,t) changes to

(x,y)

and plugging these changes back into the final equation yields:

[tex] \frac{\partial Q'}{\partial x}= \frac{\partial Q'}{\partial X} \frac{\partial X}{\partial x} + \frac{\partial Q'}{\partial Y} \frac{\partial Y}{\partial x} + \frac{\partial Q'}{\partial Z} \frac{\partial Z}{\partial x} [/tex]


Ok, so at least I am one step closer to your final equation, but how do I go from this equation to your equation (5) in your original post?


P.S. I am sorry about the title, mathwonk, it won't let me edit it anymore! Boo. Also, I hope you don't get the impression that I am not listening to your help, mathwonk; I am going to look at your alternative method. I just want to tackle my weakness with the chain rule first. I don't want to jump into your method yet, because that would just mean putting off knowing the chain rule very well, which won't help me in the long run.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*********************************************************
EDIT! WHOOPS! [tex] \frac{\partial Q'}{\partial X} [/tex] has no meaning! X is a function of x and y, not an argument of Q'! D'OH!

Let me repaste the correct text below and ignore the one above!

and plugging these changes back into the final equation yields:

[tex] \frac{\partial Q'}{\partial x}= \frac{\partial Q'}{\partial x} \frac{\partial X}{\partial x} + \frac{\partial Q'}{\partial y} \frac{\partial Y}{\partial x} + \frac{\partial Q'}{\partial z} \frac{\partial Z}{\partial x} [/tex]


Ok, so at least I am one step closer to your final equation, but how do I go from this equation to your equation (5) in your original post?



Now we're in VERY close agreement, except you have the partial of Q and I have the partial of Q' on the left-hand side. ?? HMMMMM...
**********************************************************~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

EDIT NUMBER TWO! S*%%#. LOL. AHHHHHHHHHHHHHH Now I see the reason for your use of the variable x'. It needs to be there to distinguish it from the x. Hence the problem with my dQ'/dx: the dx on the bottom is NOT the same dx as in the dQ'/dx of the first fraction on the right-hand side of the equals sign. This is because we are overusing the variable x. It should be more like this:

x'=X(x,y)=x, y'=Y(x,y)=y, and z'=Z(x,y)=z

so that we may write the equation as follows:



[tex] \frac{\partial Q'}{\partial x}= \frac{\partial Q'}{\partial x'} \frac{\partial X}{\partial x} + \frac{\partial Q'}{\partial y'} \frac{\partial Y}{\partial x} + \frac{\partial Q'}{\partial z'} \frac{\partial Z}{\partial x} [/tex]
 
  • #32
you want to know the proof of the chain rule? look: the definition of the derivative of f at a is that it is a linear function L such that L(v) is tangent to f(a+v)-f(a) at v = 0.

i.e. the difference quotient [f(a+v) - f(a) - L(v)]/|v| approaches zero as v does.

call a function o(v) such that o(v)/|v| goes to zero as v does "little oh", and write it o(v).

A function such that the quotient O(v)/|v| is bounded as v approaches zero, "big oh" and write it as O(v).

Then the basic rules are these: linear combinations of O's are also O, and likewise for o's; compositions of o's and O's are always o if even one "factor" is o; and a product of two O's is o.


Then the chain rule is as follows;

assume L is the derivative of f and M is the derivative of g, then

f(a+v) - f(a) - L(v) = o(v), so f(a+v) - f(a) = L(v) + o(v).

Hence M(f(a+v) - f(a)) =M(L(v)) + M(o(v)).

since f(a+v) = f(a) + [f(a+v)-f(a)], we have

g(f(a+v)) - g(f(a)) -M([f(a+v)-f(a)]) = o(f(a+v)- f(a)) = o(O(v)) = o(v).

but also M(f(a+v) - f(a)) = M(L(v)) + M(o(v)), from above,

so g(f(a+v)) - g(f(a)) - M([f(a+v)-f(a)])

= g(f(a+v)) - g(f(a)) - M(L(v)) - M(o(v)) = o(v).

hence g(f(a+v)) - g(f(a)) - M(L(v)) = M(o(v)) + o(v) = o(v) + o(v) = o(v).

hence by definition, the derivative of g(f) at a is M(L).

i.e. the derivative of a composition is the composition, as linear maps, of the derivatives. hence as matrices it is the matrix product (dot products of rows with columns), as you are computing above:

i.e. dw/ds = (dw/dx, dw/dy, dw/dz).(dx/ds, dy/ds, dz/ds), and so on...
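A sympy sketch (arbitrary made-up maps, just an illustration of the statement above) checking that the Jacobian matrix of the composition g∘f is the product of the Jacobians:

[code]
# D(g o f)(a) = Dg(f(a)) * Df(a), checked symbolically for made-up f and g.
import sympy as sp

s, t, x, y, z = sp.symbols('s t x y z')

f = sp.Matrix([s*t, s + t**2, sp.sin(s)])   # made-up f: R^2 -> R^3
g = sp.Matrix([x*y + z, sp.exp(x) - y*z])   # made-up g: R^3 -> R^2

Df = f.jacobian([s, t])                     # 3x2 Jacobian of f
Dg = g.jacobian([x, y, z])                  # 2x3 Jacobian of g

subs_f = {x: f[0], y: f[1], z: f[2]}
composite = g.subs(subs_f)                  # g o f as a function of (s, t)
D_composite = composite.jacobian([s, t])

product = Dg.subs(subs_f) * Df              # Dg evaluated at f(s,t), times Df

print((D_composite - product).applyfunc(sp.simplify))  # prints the zero matrix
[/code]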
 
  • #33
Apart from that I would say:

[tex]Q(x,y)=Q'(X(x,y),Y(x,y),Z(x,y))[/tex]
with derivative relation I've posted earlier.
The function on your right-hand side, Q' has three arguments, whereas the function on your left-hand side, Q, has two.

The fact that in our particular case we've related (x',y',z') to (x,y) doesn't change the functional form of Q'.
 
  • #34
So is what I did ok? Do I simply have to replace [tex] \frac{\partial Q'}{\partial x}[/tex] with [tex] \frac{\partial Q}{\partial x}[/tex]? Do they both mean the same thing, are they equivalent?

I don't understand what you mean by 'functional form of Q'. Sorry.

Or are there problems in my proof?
 
  • #35
Remember that the function Q' has basically the arguments x', y', z'; it is only when we let x',y',z' be functions of x,y themselves that we can say that Q' is a function of x and y.

This is expressed best by defining a NEW function Q(x,y) which equals Q' whenever x', y', z' are functions of x,y.
 
