Help explaining the chain rule please

In summary, the conversation discusses the use of the chain rule to calculate the second partial derivative of a function with two variables. The conversation also highlights the importance of being rigorous with notation and fixing certain variables in order to correctly perform the chain rule.
  • #1
Boltzman Oscillation
Homework Statement
Given these two statements:
y =v(x,t)
v(x(s,t),t) = y(s,t)
find:


the second partial derivative of y with respect to t
Relevant Equations
chain rule
I had already calculated the first partial derivative to equal the following:
$$\frac{\partial y}{\partial t} = \frac{\partial v}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial v}{\partial t}$$
Applying the chain rule again for the second partial derivative, I get:
$$\frac{\partial^2 y}{\partial t^2} = \frac{\partial}{\partial t} \frac{\partial v}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial v}{\partial x} \frac{\partial}{\partial t} \frac{\partial x}{\partial t} + \frac{\partial^2 v}{\partial t^2}$$
How would I continue? I can easily perform the last two terms:
$$\frac{\partial v}{\partial x} \frac{\partial}{\partial t} \frac{\partial x}{\partial t} = \frac{\partial v}{\partial x} \frac{\partial^2 x}{\partial t^2} $$
and
$$\frac{\partial}{\partial t } \frac{\partial v}{\partial t} = \frac{\partial^2 v}{\partial t^2} $$
How would I perform the first term? Again:

$$y =v(x,t)$$
$$v(x(s,t),t) = y(s,t)$$
 
  • #2
You could begin by adopting a more rigorous notation. In physics, one often sloppily uses the same name for functions that give the same quantity but are parameterized by different variables. Here you need to be more rigorous.

Let's start with ##y=v\left(x,t\right)##. Are you interested in ##\frac{\partial^2 y}{\partial t^2}##? Be more rigorous and specify not only what is varied, but also what is fixed. So you want:

##\left(\frac{\partial^2 y}{\partial t^2}\right)_{x}##

Next, you introduce an additional relationship: ##x=x\left(s, t\right)##. This relationship would not change

##\left(\frac{\partial^2 y}{\partial t^2}\right)_{x}##

because ##x=const## there by definition. But then you also introduce ##\tilde{y}\left(s,t\right)=v\left(x\left(s, t\right), t\right)##. Note that I introduced a different name for ##\tilde{y}## since it is a *different* function, though physically it may give you the same quantity. You could now define:

##\left(\frac{\partial^2 \tilde{y}}{\partial t^2}\right)_{s}##

So which partial derivative do you want ##\left(\frac{\partial^2 \tilde{y}}{\partial t^2}\right)_{s}## or ##\left(\frac{\partial^2 y}{\partial t^2}\right)_{x}##?
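
To see that the two really are different objects, here is a small illustration. The functions ##v=x\,t## and ##x=s+t## are made up for this example and are not taken from the problem:

$$v(x,t) = x\,t, \qquad x(s,t) = s + t \quad\Rightarrow\quad \tilde{y}(s,t) = (s+t)\,t$$
$$\left(\frac{\partial^2 y}{\partial t^2}\right)_{x} = 0, \qquad \left(\frac{\partial^2 \tilde{y}}{\partial t^2}\right)_{s} = 2$$

Same physical quantity, same variable ##t## being varied, but a different answer depending on what is held fixed.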
 
  • #3
I am not too sure as the book does not specify but I will guess that s must stay constant.
 
  • #4
So then it becomes easy:

##\left(\frac{\partial^2 \tilde{y}}{\partial t^2}\right)_s=\left(\frac{\partial^2 v\left(x\left(s, t\right), t\right)}{\partial t^2}\right)_s##

Now simply apply the chain rule twice. I am happy to check your work, but you will have to do it. If this is a question you have to hand in for marking, I would provide answers for both options (keeping ##s## or ##x## fixed). It is not your fault the question is ambiguous.
 
  • #5
Boltzman Oscillation said:
I am not too sure as the book does not specify but I will guess that s must stay constant.

What @Cryo is saying is included in my insight on the chain rule, which you may be interested in:

https://www.physicsforums.com/insights/demystifying-chain-rule-calculus/
In terms of this problem I would note that (and this is also in the above link) partial derivatives are functions too. In this case you have that ##v(x, t)## is a function of two variables. The partial derivative ##\frac{\partial v}{\partial x}## is also a function of two variables. So, you could let:

##f(x, t) = \frac{\partial v}{\partial x}(x, t)##

Which emphasises that you have just another function here. And, in fact, using @Cryo's notation you have here:

##\tilde{f}(s, t) = \frac{\partial v}{\partial x}(x(s, t), t)##

Now you can perform the same chain rule on ##\tilde{f}## as you did on the original function.
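
As a sketch of what that looks like, assuming ##s## is the variable held fixed and ##v## is smooth enough for its second partial derivatives to exist, the chain rule applied to ##\tilde{f}## has the same shape as the first-derivative formula in post #1, with ##f = \frac{\partial v}{\partial x}## in place of ##v##:

$$\left(\frac{\partial \tilde{f}}{\partial t}\right)_s = \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial t} = \frac{\partial^2 v}{\partial x^2}\frac{\partial x}{\partial t} + \frac{\partial^2 v}{\partial t\,\partial x}$$

Here the derivatives of ##v## on the right are taken in the ##\{x, t\}## coordinates and then evaluated at ##x = x(s, t)##.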
 
  • #6
It isn't homework. I am trying to learn partial derivatives from a book by Weinberger. I already know the answer, though; it is in the book. I will attempt what you told me to do.
 
  • #7
Boltzman Oscillation said:
I am trying to learn partial derivatives via a book by a Weinberger.

Funnily enough, it is this type of exercise, i.e. two sets of coordinates ##\{s,\,t\}## and ##\{x,\,t\}## with one shared variable, that made me sort out partial differentiation in my head.

The mantra I remembered is that partial differentiation is more about what you keep fixed, than what you change (to differentiate), so if you encounter such problems, always be clear what is being kept fixed.

Another trick I found useful is to give the shared variable a different name in each system, i.e. you have two coordinate systems ##\{s,\,t\}## and ##\{x,\,t\}##, but nothing is stopping you from defining ##\{s,\,t\}\to \{s,\,\tilde{t}\}##, and then enforcing the constraint ##t=\tilde{t}## later on, so then:

##\left(\frac{\partial f}{\partial t}\right)_x=\left(\frac{\partial \tilde{t}}{\partial t}\right)_x\left(\frac{\partial f}{\partial \tilde{t}}\right)_s + \left(\frac{\partial s}{\partial t}\right)_x\left(\frac{\partial f}{\partial s}\right)_{\tilde{t}}##

but now ##\tilde{t}=t## and ##\left(\frac{\partial \tilde{t}}{\partial t}\right)_x=1##, so:

##\left(\frac{\partial f}{\partial t}\right)_x=\left(\frac{\partial f}{\partial t}\right)_s + \left(\frac{\partial s}{\partial t}\right)_x\left(\frac{\partial f}{\partial s}\right)_t##

Which is basically what you had in the original post
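
This identity can be sanity-checked numerically. The sketch below is only an illustration: the coordinate change x = s*t + t^3 and the test quantity f = s^2 * t are arbitrary, hypothetical choices (not taken from the problem), and the derivative at fixed x is approximated by a central difference.

```python
# Hypothetical coordinate change and test quantity, chosen only for illustration.

def s_of(x, t):
    # Inverse of x = s*t + t**3, valid for t != 0.
    return x / t - t**2

def g(s, t):
    # The quantity f written in {s, t} coordinates.
    return s**2 * t

def F(x, t):
    # The same quantity written in {x, t} coordinates.
    return g(s_of(x, t), t)

def lhs(x, t, h=1e-6):
    # (df/dt) at fixed x, approximated by a central difference.
    return (F(x, t + h) - F(x, t - h)) / (2 * h)

def rhs(x, t):
    s = s_of(x, t)
    df_dt_s = s**2               # (df/dt) at fixed s
    df_ds_t = 2 * s * t          # (df/ds) at fixed t
    ds_dt_x = -x / t**2 - 2 * t  # (ds/dt) at fixed x
    return df_dt_s + ds_dt_x * df_ds_t

x0, t0 = 1.3, 0.8
print(lhs(x0, t0), rhs(x0, t0))  # the two values should agree closely
```

The point of the check is that the term with ##\left(\frac{\partial s}{\partial t}\right)_x## is needed: dropping it leaves only ##\left(\frac{\partial f}{\partial t}\right)_s##, which is a different number.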
 
  • #8
I am having trouble applying the chain rule twice, as suggested in post #4. Hopefully you can help me further.

I have the following:
$$\frac{\partial v}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial v}{\partial t} $$
I will now attempt to take the partial derivative of this expression with respect to t. Let's start with the second part since it is easier, namely:
$$\frac{\partial v}{\partial t} _s $$
The chain rule, as I understand it, is the following:
1. The derivative of the outside, times the inside.
Plus
2. The outside times the derivative of the inside.
So I will perform these two steps, the first one:
$$\frac{\partial}{\partial t}\frac{\partial}{\partial t}v(x(s,t),t) = (\frac{\partial^2 v}{\partial t^2})_s$$
Now I need to sum this by step 2 which is equal to:
$$ (\frac{\partial}{\partial t})(\frac{\partial v}{\partial t}) = (\frac{\partial}{\partial t})(\frac{\partial v}{\partial x})(\frac{\partial x}{\partial t}) = \frac{\partial^2 v}{\partial x\partial t}\frac{\partial x}{\partial t} $$
Summing what I got for steps one and two, I get:
$$\frac{\partial^2 v}{\partial t^2} +\frac{\partial^2 v}{\partial x\partial t}\frac{\partial x}{\partial t}$$
What did I do wrong?
 
  • #9
You started well and then you dropped all the rigour which created confusion. Partial derivatives are specific to a coordinate system and you need to keep track of what is fixed. So let us make it explicit. We have two coordinate systems ##\{x,\tilde{t}\}## and ##\{s,t\}## and we also have ##\tilde{t}=t##, so ##\left(\frac{\partial \tilde{t}}{\partial t}\right)_s=1## and ##\left(\frac{\partial \tilde{t}}{\partial s}\right)_t=0##. [I did it for ##f## not ##v##, too late to correct now]

Now let's take a derivative of a generic function ##f=f\left(x,\tilde{t}\right)##:

##\left(\frac{\partial f}{\partial t}\right)_s=\left(\frac{\partial x}{\partial t}\right)_s \left(\frac{\partial f}{\partial x}\right)_{\tilde{t}} + \left(\frac{\partial \tilde{t}}{\partial t}\right)_s \left(\frac{\partial f}{\partial \tilde{t}}\right)_x=\left(\frac{\partial x}{\partial t}\right)_s \left(\frac{\partial f}{\partial x}\right)_{\tilde{t}} + \left(\frac{\partial f}{\partial \tilde{t}}\right)_x##

So far similar to yours, but I kept the subscripts of what is fixed, which allows me to keep track of the coordinate system in use.

The second derivative is applied the same way (use the product rule):

##\left(\frac{\partial \left(\frac{\partial f}{\partial t}\right)_s}{\partial t}\right)_s=\left(\frac{\partial \left(\frac{\partial x}{\partial t}\right)_s}{\partial t}\right)_s \left(\frac{\partial f}{\partial x}\right)_{\tilde{t}} + \left(\frac{\partial x}{\partial t}\right)_s \left(\frac{\partial \left(\frac{\partial f}{\partial x}\right)_{\tilde{t}}}{\partial t}\right)_s + \left(\frac{\partial \left(\frac{\partial f}{\partial \tilde{t}}\right)_x}{\partial t}\right)_s##

So here it is already different from:

Boltzman Oscillation said:
Now I need to sum this by step 2 which is equal to

Since I have three terms

Let's consider just the last term. I will now use the following notation: ##\left(\frac{\partial f}{\partial x}\right)_{\tilde{t}}=\frac{\partial f}{\partial x_{\tilde{t}}}##.

## \left(\frac{\partial \left(\frac{\partial f}{\partial \tilde{t}}\right)_x}{\partial t}\right)_s=\frac{\partial x}{\partial t_s}\frac{\partial^2 f}{\partial x_{\tilde{t}} \partial \tilde{t}_x} + \frac{\partial \tilde{t}}{\partial t_s}\frac{\partial^2 f}{\partial \tilde{t}_x \partial \tilde{t}_x}##

Now we can drop the extra notation (because we are at the end):

## \left(\frac{\partial \left(\frac{\partial f}{\partial \tilde{t}}\right)_x}{\partial t}\right)_s=\frac{\partial x\left(s,t\right)}{\partial t}\frac{\partial^2 f\left(x,t\right)}{\partial t \partial x } + \frac{\partial^2 f\left(x,t\right)}{\partial t^2}##

Which is similar to yours, but there are two more terms to consider
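
As a sketch of where this leads, assuming ##f## is smooth enough that the mixed partial derivatives commute, and dropping the tildes only at the very end, the middle term expands in exactly the same way as the last one:

$$\left(\frac{\partial \left(\frac{\partial f}{\partial x}\right)_{\tilde{t}}}{\partial t}\right)_s = \frac{\partial x}{\partial t}\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial t\,\partial x}$$

Collecting all three terms then gives:

$$\left(\frac{\partial^2 f}{\partial t^2}\right)_s = \frac{\partial^2 x}{\partial t^2}\frac{\partial f}{\partial x} + \left(\frac{\partial x}{\partial t}\right)^2\frac{\partial^2 f}{\partial x^2} + 2\,\frac{\partial x}{\partial t}\frac{\partial^2 f}{\partial t\,\partial x} + \frac{\partial^2 f}{\partial t^2}$$

On the left, ##f## means ##f\left(x\left(s,t\right),t\right)## differentiated at fixed ##s##. On the right, the derivatives of ##x## are taken at fixed ##s##, and the derivatives of ##f## are taken in the ##\{x,t\}## coordinates.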
 
  • #10
I arrive at your conclusion, but I do not think I arrived at it correctly. Can you help me once more?

I am given:
## \left(\frac{\partial \left(\frac{\partial f}{\partial \tilde{t}}\right)_x}{\partial t}\right)_s##
I will begin by using the chain rule on the "inside" portion, namely:
$$\frac{\partial f}{\partial \tilde{t}}_x = (\frac{\partial f}{\partial x})_x(\frac{\partial x}{\partial \tilde{t}})_x + (\frac{\partial f}{\partial \tilde{t}})_x$$
Now all of the terms on the right should have x held fixed, because the term on the left has x held fixed, right?
If this is true then I can move on to the next portion which is plugging in what we just obtained into the first equation mentioned here, thus:

## \left(\frac{\partial \left(\frac{\partial f}{\partial \tilde{t}}\right)_x}{\partial t}\right)_s = \Bigg[\frac{\partial}{\partial t} \Big[(\frac{\partial f}{\partial x})_x(\frac{\partial x}{\partial \tilde{t}})_x+(\frac{\partial f}{\partial \tilde{t}})_x \Big]\Bigg]_s##

which then equals
## = \frac{\partial x}{\partial t} \frac{\partial^2 f}{\partial t \partial x} + \frac{\partial^2 f}{\partial t^2}##

Did I follow this through correctly?
 
  • #11
Boltzman Oscillation said:
I am given:
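##\left(\frac{\partial \left(\frac{\partial f}{\partial \tilde{t}}\right)_x}{\partial t}\right)_s##
I will begin by using the chain rule on the "inside" portion, namely:
$$\frac{\partial f}{\partial \tilde{t}}_x = (\frac{\partial f}{\partial x})_x(\frac{\partial x}{\partial \tilde{t}})_x + (\frac{\partial f}{\partial \tilde{t}})_x$$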

I think problems start here. What is the point of this line? ##f=f\left(x,\tilde{t}\right)## so ##\frac{\partial f}{\partial \tilde{t}_x}## is already the simplest derivative you can have. Also, ##\frac{\partial x}{\partial \tilde{t}_x}=0## identically, by definition. So all you do here is state ##\frac{\partial f}{\partial \tilde{t}_x}=\frac{\partial f}{\partial \tilde{t}_x}##

Then we move onto:

##\frac{\partial }{\partial t_s} \, \frac{\partial f}{\partial \tilde{t}_x} ##

To evaluate it simply re-write ##\frac{\partial }{\partial t_s} ##. Currently it is for ##\{s,t\}## coordinates. Re-write it for ##\{x,\tilde{t}\}## coordinates:

##\frac{\partial }{\partial t_s} = \frac{\partial x}{\partial t_s} \frac{\partial }{\partial x_{\tilde{t}}} + \frac{\partial \tilde {t}}{\partial t_s} \frac{\partial }{\partial \tilde{t}_x} = \frac{\partial x}{\partial t_s} \frac{\partial }{\partial x_{\tilde{t}}} + \frac{\partial }{\partial \tilde{t}_x} ##

So

##\frac{\partial }{\partial t_s} \, \frac{\partial f}{\partial \tilde{t}_x} = \frac{\partial x}{\partial t_s} \frac{\partial^2 f }{\partial x_{\tilde{t}}\partial \tilde{t}_x} + \frac{\partial^2 f }{\partial \tilde{t}_x^2} ##

It would seem our final results match, but I think this is only because you have conveniently dropped the subscripts and tildes too early.
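
One way to check a full result independently of the notation is a numerical test. The sketch below is only an illustration under stated assumptions: `v` and `x_of` are made-up smooth test functions (not from the thread or the book), and the formula being checked is the standard expanded form of the second derivative at fixed s, x_tt*v_x + x_t^2*v_xx + 2*x_t*v_xt + v_tt, which assumes the mixed partials of v commute.

```python
import math

# Hypothetical test functions, chosen only for illustration.
def v(x, t):
    return math.sin(x) * t**2

def x_of(s, t):
    return s * t + t**3

def y_tilde(s, t):
    # y~(s, t) = v(x(s, t), t)
    return v(x_of(s, t), t)

def second_derivative_formula(s, t):
    # Expanded chain-rule result, every factor evaluated at x = x(s, t).
    x = x_of(s, t)
    x_t = s + 3 * t**2            # (dx/dt) at fixed s
    x_tt = 6 * t                  # (d^2x/dt^2) at fixed s
    v_x = math.cos(x) * t**2      # dv/dx at fixed t
    v_xx = -math.sin(x) * t**2    # d^2v/dx^2 at fixed t
    v_xt = 2 * t * math.cos(x)    # mixed partial of v
    v_tt = 2 * math.sin(x)        # d^2v/dt^2 at fixed x
    return x_tt * v_x + x_t**2 * v_xx + 2 * x_t * v_xt + v_tt

def second_derivative_numeric(s, t, h=1e-4):
    # Central second difference in t with s held fixed.
    return (y_tilde(s, t + h) - 2 * y_tilde(s, t) + y_tilde(s, t - h)) / h**2

s0, t0 = 0.7, 0.4
print(second_derivative_formula(s0, t0), second_derivative_numeric(s0, t0))
```

If the two printed numbers agree to several digits, the bookkeeping of what is held fixed was probably done consistently. If they disagree, a term such as the one with the squared ##\frac{\partial x}{\partial t}## factor has likely been lost.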
 
  • #12
I understand more clearly now. I still need a lot more practice. Thank you for everything Cryo!
 
  • #13
Boltzman Oscillation said:
Thank you for everything Cryo!
Good luck
 

1. What is the chain rule in calculus?

The chain rule is a formula used in calculus to find the derivative of a composite function. It allows us to calculate the rate of change of a function that is made up of multiple smaller functions.

2. How do you apply the chain rule?

To apply the chain rule, you must first identify the inner and outer functions of the composite function. Then, you take the derivative of the outer function and multiply it by the derivative of the inner function. This will give you the derivative of the entire composite function.

3. Why is the chain rule important?

The chain rule is important because it allows us to find the derivative of more complex functions by breaking them down into smaller, simpler functions. It is a fundamental concept in calculus and is used in many real-world applications, such as physics, engineering, and economics.

4. Can you give an example of the chain rule?

Sure, let's say we have the function f(x) = (2x + 3)^2. To find the derivative of this function, we can use the chain rule by first identifying the inner function as u = 2x + 3 and the outer function as u^2. The derivative of the outer function with respect to u is 2u, and the derivative of the inner function with respect to x is 2. So, the derivative of the entire function is 2(2x + 3)(2) = 4(2x + 3) = 8x + 12.
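
The worked example above can be double-checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the numerical derivative is a simple central difference, used here only as a cross-check.

```python
def f(x):
    return (2 * x + 3) ** 2

def chain_rule_derivative(x):
    u = 2 * x + 3      # inner function
    return 2 * u * 2   # (derivative of u^2 with respect to u) * (derivative of u with respect to x)

def numeric_derivative(x, h=1e-6):
    # Central difference approximation of f'(x).
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 1.0
print(chain_rule_derivative(x0), 8 * x0 + 12, numeric_derivative(x0))
```

All three values should agree at any chosen point, up to the small error of the finite difference.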

5. Are there any common mistakes when using the chain rule?

Yes, there are a few common mistakes when using the chain rule. One is forgetting to take the derivative of the outer function. Another is not correctly identifying the inner and outer functions. It's important to carefully analyze the function and make sure you are applying the chain rule correctly.
