Rigorously understanding chain rule for sum of functions

In summary, the conversation discusses the use of the chain rule in understanding and computing the Euler-Lagrange equation. The issue at hand involves finding the derivative of the composed function G, where G = F o g, at ε = 0. The chain rule is used to obtain the result, and the conversation also explores the use of different definitions of the derivative for a functional.
  • #1
Avatrin
In my quest to understand the Euler-Lagrange equation, I've realized I have to understand the chain rule first. So, here's the issue:

We have [itex]g(\epsilon) = f(t) + \epsilon h(t)[/itex]. We have to compute [itex]\frac{\partial F(g(\epsilon))}{\partial \epsilon}[/itex]. This is supposed to be equal to [itex]\frac{\partial F(f)}{\partial f}h(t)[/itex] when [itex]\epsilon = 0[/itex]. However, this does not make any sense to me. Doing the computations and using the chain rule, I get:
$$\frac{\partial F(g(0))}{\partial \epsilon} = \lim_{\epsilon \to 0}\frac{F(g(\epsilon)) - F(g(0))}{g(\epsilon) -g(0) } \frac{g(\epsilon) -g(0)}{\epsilon} = \lim_{\epsilon \to 0}\frac{F(f(t)+\epsilon h(t)) - F(f(t))}{\epsilon h(t) } h(t) $$
On an intuitive level I can understand it. I can think of [itex]f(t)+\epsilon h(t)[/itex] as [itex]f+\Delta f[/itex] since [itex]h[/itex] can be any arbitrary function, and that allows me to use the other definition of the derivative. However, it does not seem like a very rigorous way of doing it.

How can I show that [itex]\frac{\partial F(g(0))}{\partial \epsilon} = \frac{\partial F(f)}{\partial f}h(t)[/itex] using the definition of the derivative? Or, rather, a definition of the derivative..?
 
  • #2
The notation is not quite right, and that can easily cause confusion. The expression ##\frac{\partial F(g(0))}{\partial\epsilon}## has two problems. First, the thing after the ##\partial## in the numerator needs to be a function, but what is there is not a function but a value, i.e. a number: ##F(g(0))##. Secondly, all the functions you have presented are single-variable functions, so there is no need for the partial derivative symbol ##\partial##.

The second issue is easily fixed by replacing ##\partial## by ##d##. To fix the first issue, let's first define a function that is the 'function of a function' we are interested in. We define a single-variable function ##G## such that ##G(\epsilon)=F(g(\epsilon))##. Then we want to calculate ##G'(0)##, where the prime symbol ##'## indicates differentiation.

These things are easier to write and work with if we use the function composition symbol ##\circ##: if ##\phi,\psi## are single-variable functions, then ##\phi\circ\psi## is the single-variable function that, given an input ##x##, returns the value ##\phi(\psi(x))##. With this notation, we have ##G=F\circ g##.

The chain rule tells us that ##\left(F\circ g\right)'=(F'\circ g)g'##, which means that, evaluated at ##x##, this gives ##F'(g(x))\times g'(x)##.

In the OP, it appears we want to evaluate the derivative of ##G## at ##\epsilon=0##. The chain rule tells us that this is:
$$ G'(0) = \left(F\circ g\right)'(0) = F'(g(0))g'(0) $$
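Then, differentiating the expression you gave for ##g## with respect to ##\epsilon## (with ##t## held fixed), we see that ##g'(\epsilon)=h(t)##. Substituting that in, together with ##g(0)=f(t)##, we get:
$$ G'(0) = F'(g(0))h(t) = F'(f(t))h(t) $$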
which appears to be the result sought.
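As a sanity check on the pointwise reading of this result, here is a minimal numerical sketch. It assumes ##F## is an ordinary differentiable function of one real variable and that ##t## is held fixed; the particular choices of `F`, `f`, `h` and the value of `t` are hypothetical and not taken from the thread.

```python
import math

# Hypothetical outer function F and its derivative (illustrative choice only).
def F(u):
    return math.sin(u)

def dF(u):
    return math.cos(u)

# Hypothetical f and h; any differentiable f and any h would do here.
def f(t):
    return t ** 2

def h(t):
    return math.exp(-t)

def G(eps, t):
    """G(eps) = F(g(eps)) with g(eps) = f(t) + eps * h(t), t held fixed."""
    return F(f(t) + eps * h(t))

def G_prime_at_zero(t, step=1e-6):
    """Central finite-difference estimate of dG/d(eps) at eps = 0."""
    return (G(step, t) - G(-step, t)) / (2 * step)

t = 0.7
numeric = G_prime_at_zero(t)   # finite-difference estimate of G'(0)
exact = dF(f(t)) * h(t)        # chain-rule value F'(f(t)) * h(t)
print(numeric, exact)
```

The two printed values should agree to many digits, which is consistent with ##G'(0) = F'(f(t))h(t)## when ##\epsilon## is the only variable.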
 
  • #3
andrewkirk said:
The chain rule tells us that ##\left(F\circ g\right)'=(F'\circ g)g'##, which means that, evaluated at ##x##, this gives ##F'(g(x))\times g'(x)##.

In the OP, it appears we want to evaluate the derivative of ##G## at ##\epsilon=0##. The chain rule tells us that this is:
$$ G'(0) = \left(F\circ g\right)'(0) = F'(g(0))g'(0) $$
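Then, differentiating the expression you gave for ##g## with respect to ##\epsilon## (with ##t## held fixed), we see that ##g'(\epsilon)=h(t)##. Substituting that in, together with ##g(0)=f(t)##, we get:
$$ G'(0) = F'(g(0))h(t) = F'(f(t))h(t) $$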
which appears to be the result sought.
No, that is what every resource on the Euler-Lagrange equation tells me. However, since we are differentiating with respect to [itex]\epsilon[/itex] we should have that [itex]\Delta g = \Delta \epsilon h[/itex].

The issue is that none of those resources tell me why what you are writing is true. I can prove the chain rule. However, none of the proofs I know seem to apply for this particular case, and the notation used by both of us hides that. That is why I specified that I want a proof that uses the definition of the derivative.
 
  • #4
Avatrin said:
However, none of the proofs I know seem to apply for this particular case
Why do you think the chain rule doesn't apply? We have a composed function ##G=F\circ g## and we want to differentiate it. That's exactly what the Chain Rule is about.

Perhaps it would help if you tried to re-express what it is that you have been asked to prove. As per my previous post, the statement in the OP doesn't make sense, as the thing it purports to differentiate is not a function.
 
  • #5
Avatrin said:
However, this does not make any sense to me. Doing the computations and using the chain rule, I get:
$$\frac{\partial F(g(0))}{\partial \epsilon} = \lim_{\epsilon \to 0}\frac{F(g(\epsilon)) - F(g(0))}{g(\epsilon) -g(0) } \frac{g(\epsilon) -g(0)}{\epsilon} = \lim_{\epsilon \to 0}\frac{F(f(t)+\epsilon h(t)) - F(f(t))}{\epsilon h(t) } h(t) $$
On an intuitive level I can understand it. I can think of [itex]f(t)+\epsilon h(t)[/itex] as [itex]f+\Delta f[/itex] since [itex]h[/itex] can be any arbitrary function, and that allows me to use the other definition of the derivative. However, it does not seem like a very rigorous way of doing it.

How can I show that [itex]\frac{\partial F(g(0))}{\partial \epsilon} = \frac{\partial F(f)}{\partial f}h(t)[/itex] using the definition of the derivative? Or, rather, a definition of the derivative..?
Do you have some other definition of the derivative that you prefer? The limit of the ratio of deltas that you used above seems like the original definition, and it is rigorous.
 
  • #6
andrewkirk said:
Perhaps it would help if you tried to re-express what it is that you have been asked to prove. As per my previous post, the statement in the OP doesn't make sense, as the thing it purports to differentiate is not a function.

FactChecker said:
Do you have some other definition of the derivative that you prefer? The limit of the ratio of deltas that you used above seems like the original definition, and it is rigorous.
So, you two together make me wonder whether my textbook is misleading me, since the derivative of a functional may be defined differently from the definition I have been using above. I am trying to differentiate a functional, since this technically falls under the calculus of variations. However, my textbook claims you only need multivariable calculus to understand the equation. Using those definitions, it does not make sense that we do this: [itex]\Delta \epsilon g = \Delta f[/itex] and call it a day.

Like I said in my original post, it makes sense on an intuitive level, but it just doesn't seem very rigorous unless it is defined that way. Skimming through a few articles on functional derivatives online, it does indeed seem to be the case that the functional derivative is defined in a way that computationally makes it equal to [itex]\Delta \epsilon g = \Delta f[/itex] (at least in this case).
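
For reference, one common definition of the derivative of a functional in a direction ##h## (often called the first variation or Gateaux derivative) is sketched below. The symbols ##J##, ##L##, ##a##, ##b## are not from the thread; they assume the usual Euler-Lagrange setting, in which the functional is the integral of a continuously differentiable Lagrangian ##L## and both ##f## and ##h## are continuously differentiable on ##[a,b]##.
$$ \delta F[f;h] = \left.\frac{d}{d\epsilon}F[f+\epsilon h]\right|_{\epsilon=0} = \lim_{\epsilon \to 0}\frac{F[f+\epsilon h]-F[f]}{\epsilon} $$
Since ##\epsilon## is a real number, this is an ordinary one-variable derivative of the map ##\epsilon \mapsto F[f+\epsilon h]##. Under the smoothness assumptions above, differentiating under the integral sign and applying the multivariable chain rule to the integrand gives
$$ J[f]=\int_a^b L(t,f(t),f'(t))\,dt \quad\Longrightarrow\quad \left.\frac{d}{d\epsilon}J[f+\epsilon h]\right|_{\epsilon=0} = \int_a^b\left(\frac{\partial L}{\partial f}h(t)+\frac{\partial L}{\partial f'}h'(t)\right)dt $$
where the partial derivatives of ##L## are evaluated at ##(t,f(t),f'(t))##.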
 
  • #7
Part of the trouble is you have not defined any of your terms. If you still want help, I suggest you post definitions of the terms ##f,g,h,F,\epsilon## and explain what it is that you are trying to prove. It may also help to post an image of the page(s) that you are having trouble understanding.
 
  • #8
andrewkirk said:
Part of the trouble is you have not defined any of your terms. If you still want help, I suggest you post definitions of the terms ##f,g,h,F,\epsilon## and explain what it is that you are trying to prove. It may also help to post an image of the page(s) that you are having trouble understanding.
The very first sentence in my OP says I am trying to understand the Euler-Lagrange equation. So:

F is a functional
f and h are continuous functions
g I did define in my original post
##\epsilon## is a variable

I am just trying to understand a step in the standard derivation of the Euler-Lagrange equation (it is the derivation I see on every website I have been through, including Wikipedia).
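
To make that step concrete, here is a small numerical sketch. It assumes the functional has the usual integral form ##J[f]=\int_a^b L(t,f,f')\,dt##; the Lagrangian, the interval ##[0,1]##, and the functions `f` and `h` below are hypothetical choices made only for illustration. It compares a finite-difference derivative of ##\epsilon \mapsto J[f+\epsilon h]## at ##\epsilon=0## with the integral ##\int_a^b \left(\frac{\partial L}{\partial f}h+\frac{\partial L}{\partial f'}h'\right)dt## that appears in the standard derivation.

```python
import math

# Hypothetical Lagrangian L(t, y, dy) = dy^2/2 - y^2/2 (illustrative only).
# Its partial derivatives are dL/dy = -y and dL/d(dy) = dy.
def lagrangian(t, y, dy):
    return 0.5 * dy ** 2 - 0.5 * y ** 2

# Hypothetical f and h on [0, 1], with their derivatives.
def f(t):
    return math.sin(t)

def df(t):
    return math.cos(t)

def h(t):
    return t * (1.0 - t)

def dh(t):
    return 1.0 - 2.0 * t

def integrate(func, a=0.0, b=1.0, n=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / n
    return width * sum(func(a + (i + 0.5) * width) for i in range(n))

def J(eps):
    """J[f + eps*h] for the hypothetical Lagrangian above."""
    return integrate(
        lambda t: lagrangian(t, f(t) + eps * h(t), df(t) + eps * dh(t))
    )

eps = 1e-4
# Ordinary one-variable derivative of eps -> J[f + eps*h] at eps = 0.
finite_difference = (J(eps) - J(-eps)) / (2 * eps)
# Integral of (dL/dy * h + dL/d(dy) * h') along f.
first_variation = integrate(lambda t: -f(t) * h(t) + df(t) * dh(t))
print(finite_difference, first_variation)
```

The two numbers should agree closely. Note that only an ordinary derivative with respect to the real variable ##\epsilon## is taken; no division by ##h(t)## is needed.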
 
  • #9
It may help to realize that as long as limits and division are defined, many proofs from Calc 1 are likely to be just as valid in a new application. If you have doubts, you should go step-by-step through the Calc 1 proofs and see which specific steps might be problematic in the new context. Then you can address those steps specifically.

In fact, derivatives and calculus can be used in many relatively abstract settings with the same theorems holding.
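
As a sketch of what such a step-by-step check might look like in the pointwise case, assume ##F## is an ordinary differentiable function of one real variable and ##t## is held fixed (assumptions not stated in the thread). The step that looks problematic in the Calc 1 argument is the division by ##g(\epsilon)-g(0)=\epsilon h(t)##, which fails when ##h(t)=0##. A standard way around it is to introduce an auxiliary function ##\Phi## (a name chosen here for illustration):
$$ \Phi(u)=\begin{cases}\dfrac{F(u)-F(u_0)}{u-u_0}, & u\neq u_0\\[2ex] F'(u_0), & u=u_0\end{cases}\qquad\text{with } u_0=g(0)=f(t) $$
Because ##F## is differentiable at ##u_0##, ##\Phi## is continuous there, and ##F(u)-F(u_0)=\Phi(u)(u-u_0)## holds for every ##u##, including ##u=u_0##. Therefore, for ##\epsilon\neq 0##,
$$ \frac{F(g(\epsilon))-F(g(0))}{\epsilon}=\Phi(g(\epsilon))\,\frac{g(\epsilon)-g(0)}{\epsilon}=\Phi(f(t)+\epsilon h(t))\,h(t)\;\longrightarrow\; F'(f(t))\,h(t)\quad\text{as }\epsilon\to 0 $$
The only division is by ##\epsilon##, so the argument goes through whether or not ##h(t)## vanishes.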
 

1. What is the chain rule for the sum of functions?

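Strictly speaking, there is no separate chain rule for sums; two standard rules are combined. The sum rule states that the derivative of a sum of functions is the sum of their derivatives. The chain rule states that ##(F\circ g)' = (F'\circ g)\,g'##. In the thread above, ##g(\epsilon) = f(t) + \epsilon h(t)## is a sum, so the sum rule gives ##g'(\epsilon) = h(t)##, and the chain rule then gives the derivative of ##F(g(\epsilon))##.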

2. Why is it important to understand the chain rule for the sum of functions?

It lets us find the rate of change of a composite function whose inner function is a sum, such as ##F(f + \epsilon h)##. This is the key step in deriving the Euler-Lagrange equation, and similar computations appear throughout physics, economics, and engineering.

3. How do you apply the chain rule for the sum of functions?

First differentiate the inner sum term by term to get ##g'##. Then evaluate the derivative of the outer function at the inner function and multiply: ##(F\circ g)'(x) = F'(g(x))\,g'(x)##. Keep track of which variable you are differentiating with respect to; in the thread above it is ##\epsilon##, with ##t## held fixed.

4. Can the chain rule for the sum of functions be used for more than two functions?

Yes. The sum rule extends to any finite number of terms, so an inner function ##g = g_1 + g_2 + \dots + g_n## has derivative ##g' = g_1' + g_2' + \dots + g_n'##, and the chain rule then applies to ##F\circ g## exactly as before.

5. Are there any common mistakes when using the chain rule for the sum of functions?

One common mistake is forgetting to apply the chain rule to a composite function that appears inside the sum. Another is losing track of the variable of differentiation, for example differentiating ##f(t) + \epsilon h(t)## with respect to ##t## when the derivative with respect to ##\epsilon## is wanted.
