# Chain Rule Troubles

1. Jun 3, 2015

### Kinta

1. The problem statement, all variables and given/known data
I'm unable to see fully how the following equality is determined: $$\frac {\partial f(y + \alpha \eta, \, y' + \alpha \eta', \, x)}{\partial \alpha} = \eta \frac {\partial f}{\partial y} + \eta' \frac {\partial f}{\partial y'}$$ where $y = y(x)$, $\eta = \eta (x)$, and primes indicate derivatives in $x$.

2. Relevant equations
I know that, generally, a result of the chain rule is that, with $h = h(s,t)$ and $k = k(s,t)$, $$\frac {\partial g(h, k)}{\partial t} = \frac {\partial g}{\partial h} \frac {\partial h}{\partial t} + \frac {\partial g}{\partial k} \frac {\partial k}{\partial t}.$$

3. The attempt at a solution
When I try to apply the general method to this problem I get $$\frac {\partial f(y + \alpha \eta, \, y' + \alpha \eta', \, x)}{\partial \alpha} = \eta \frac {\partial f}{\partial (y + \alpha \eta)} + \eta' \frac {\partial f}{\partial (y' + \alpha \eta')} + (0) \frac {\partial f}{\partial x}.$$

So, the core of my trouble lies in not understanding how the partials in the denominators come to be without any $\alpha$ or $\eta$ terms.

2. Jun 3, 2015

### ShayanJ

The confusion arises because you're not differentiating w.r.t. the same $y$ and $y'$ that appear in the function's arguments. Indeed we have:
$\frac{\partial}{\partial \alpha} (f(w,z,x)|_{w=y+\alpha \eta,z=y'+\alpha \eta'})=\eta \frac{\partial f}{\partial w}|_{w=y}+\eta' \frac{\partial f}{\partial z}|_{z=y'}$

3. Jun 3, 2015

### Kinta

It is not clear to me why I should be differentiating w.r.t. only $y$ and $y'$. I'll try to share how I'm interpreting the general result that I gave compared to what you've given me.

How I interpret the general form: "We have a function $g$ whose arguments $h$ and $k$ are also functions of two variables $s$ and $t$. If we wish to take the partial derivative of $g$ w.r.t. $t$, we take the product of the partial derivative of $g$ w.r.t. the function $h$ and the partial derivative of $h$ w.r.t. the variable $t$ and add the product of the partial derivative of $g$ w.r.t. the function $k$ and the partial derivative of $k$ w.r.t. the variable $t$."

How I interpret your answer: "We have a function $f$ whose arguments $w$, $z$, and $x$ are functions of $x$ and $\alpha$ (with the third argument, of course, remaining unchanged in changing $\alpha$). If we wish to take the partial derivative of $f$ w.r.t. $\alpha$, we take the product of the partial derivative of $f$ w.r.t. only part of the function $w$ and the partial derivative of $w$ w.r.t. the variable $\alpha$, add the product of the partial derivative of $f$ w.r.t. only part of the function $z$ and the partial derivative of $z$ w.r.t. the variable $\alpha$, and add the product of the partial derivative of $f$ w.r.t. the function/variable $x$ and the partial derivative of $x$ w.r.t. the variable $\alpha$. The third part of this sum is zero."

These two interpretations of mine conflict. Where am I misinterpreting the information and how so?

4. Jun 3, 2015

### ShayanJ

I don't understand what you mean by "only part of".

5. Jun 3, 2015

### PeroK

Perhaps I can expand on what ShayanJ says above. First, take a step back and see what is meant by those partial derivatives. We need to tighten up the notation to see what's going on.

$f$ is a function of three variables. Let's call these $X, Y$ and $Z$. Now, we have three functions that can be obtained by taking the partial derivatives of $f$ w.r.t. each of these variables:

$\frac {\partial f}{\partial X}, \ \frac {\partial f}{\partial Y}$ and $\frac {\partial f}{\partial Z}$

Each of these is, incidentally, also a function of the same three variables. Now we can imagine that $X, Y, Z$ are functions of other variables: in this case $X$ is a function of $y$ and $\alpha$, $Y$ is a function of $y'$ and $\alpha$, and $Z$ is simply $x$.

Now, we can apply the chain rule to get:

$\frac {\partial f}{\partial \alpha} = \frac {\partial f}{\partial X} \frac {\partial X}{\partial \alpha} + \frac {\partial f}{\partial Y} \frac {\partial Y}{\partial \alpha} + \frac {\partial f}{\partial Z} \frac {\partial Z}{\partial \alpha} = \eta \frac {\partial f}{\partial X} + \eta' \frac {\partial f}{\partial Y}$

So, you can see that in your post:

$\frac {\partial f}{\partial y}$ actually meant "the partial derivative of $f$ w.r.t. the first of its variables" and similarly for $\frac {\partial f}{\partial y'}$.

No doubt $f$ was earlier defined as $f(y, y', x)$, hence the (very slightly) loose notation where $y$ is technically serving two purposes.

Does this help?

6. Jun 3, 2015

### Kinta

I mean exactly that. To me, it appears that you're arbitrarily chopping off the $\alpha \eta$ portion of the function's first argument when taking the partial of the function with respect to it and chopping off the $\alpha \eta'$ portion of the function's second argument when taking the partial of the function with respect to it.

If I understand what you're conveying, then my confusion is only solidified. What I think you're noting is that the $y$ and $y'$ terms on the left-hand side of the very first equation I posted are unequal to those on the right-hand side of it. This would clear things right up for me had the author of the text, from which I'm getting this equation, not previously defined this function as $f(Y,Y',x)$. In the text, this is written about three lines prior to the first equation that I gave. I would like to believe that, because he wrote it this way so recently, if he meant that $\frac {\partial f}{\partial \alpha} = \eta \frac {\partial f}{\partial Y} + \eta' \frac {\partial f}{\partial Y'}$, he would've written it this way. However, if the community here sees no sense in the literal form of the equation in which he gives it, I may have to assume it was an error.

I appreciate the help I'm getting with this.

7. Jun 3, 2015

### Ray Vickson

Part of your problem arises from notation; it would be clearer if you used function names that did not involve the derivative. That is, let $f_1(u,v) =\partial f(u,v) /\partial u$ and $f_2(u,v) =\partial f(u, v) / \partial v$. Note that $f_1,\, f_2$ are just two functions of $u$ and $v$; never mind that they were obtained from $f$ by differentiation! In particular, the function $f_1(y + a z, y' + a z')$ is obtained by substituting $u = y + az$ and $v = y' + a z'$ into the function $f_1(u,v)$, etc. Another way of saying this is that $$\frac{\partial f(y + az, y' + a z')}{\partial a} = z\, \left. \frac{\partial f(u,v)}{\partial u} \right|_{u=y+az, v = y' + az'} + z' \,\left. \frac{\partial f(u,v)}{\partial v} \right|_{u=y+az, v = y' + az'}$$
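A minimal numerical sketch of the "differentiate first, then substitute" reading described above. The function `f(u, v) = u**2 * v + v**3` and all sample values are hypothetical choices made only for illustration; `f1` and `f2` are its hand-computed partial derivatives with respect to the first and second slots.

```python
def f(u, v):
    # Hypothetical example function (not from the thread).
    return u**2 * v + v**3

def f1(u, v):
    # Partial derivative of f with respect to its first slot.
    return 2 * u * v

def f2(u, v):
    # Partial derivative of f with respect to its second slot.
    return u**2 + 3 * v**2

# Assumed sample values standing in for y, y', z, z' at some fixed point.
y, yp = 0.8, -1.2
z, zp = 0.5, 2.0
a = 0.25

# Chain rule: evaluate f1 and f2 at the substituted arguments.
u, v = y + a * z, yp + a * zp
chain_rule = z * f1(u, v) + zp * f2(u, v)

# Independent check: central difference of a -> f(y + a z, y' + a z').
h = 1e-6
numeric = (f(y + (a + h) * z, yp + (a + h) * zp)
           - f(y + (a - h) * z, yp + (a - h) * zp)) / (2 * h)

print(chain_rule, numeric)
```

The two printed numbers should agree to within the finite-difference error, which is the content of the displayed equation: `f1` and `f2` are ordinary functions, and the shifted arguments are simply plugged into them.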

8. Jun 3, 2015

### PeroK

I think you need to just take a fresh look at what a function is. Let's take an example:

$f(X, Y, Z) = X^2 + 2Y^2 + 3Z$

Now, that is the definition of a function. $X, Y, Z$ are essentially dummy variables. That means that, for example:

$f(y, y', x) = y^2 + 2y'^2 + 3x$

And

$f(\cos(x), \sin(y), \tan(z)) = \cos^2(x) + 2\sin^2(y) + 3\tan(z)$

Now:

$\frac{\partial f}{\partial X} = 2X$, $\frac{\partial f}{\partial Y} = 4Y$ and $\frac{\partial f}{\partial Z} = 3$

And, if we have $f(y, y', x)$, which simply means we are using $f$ as a function to operate on these variables, then:

$\frac{\partial f}{\partial y} = 2y$, $\frac{\partial f}{\partial y'} = 4y'$ and $\frac{\partial f}{\partial x} = 3$

In fact, we can deconstruct the original equation a little more, by defining:

$g(\alpha, y, y', x) = f(y + \alpha \eta, \, y' + \alpha \eta', \, x)$

And, in fact, then we have:

$\frac {\partial g}{\partial \alpha} = \eta \frac {\partial f}{\partial X} + \eta' \frac {\partial f}{\partial Y}$

If you want, you could use the $f$ I gave above and use it to define $g$, and use the partial derivatives to verify this equation in this case.
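A sketch of how that check might go, using the example $f(X, Y, Z) = X^2 + 2Y^2 + 3Z$ from this post and treating $\eta$ and $\eta'$ as fixed numbers at a given $x$. Substituting the shifted arguments gives

$$g(\alpha, y, y', x) = (y + \alpha \eta)^2 + 2(y' + \alpha \eta')^2 + 3x,$$

and differentiating directly with respect to $\alpha$ gives

$$\frac {\partial g}{\partial \alpha} = 2\eta (y + \alpha \eta) + 4\eta' (y' + \alpha \eta').$$

On the other side, $\frac{\partial f}{\partial X} = 2X$ and $\frac{\partial f}{\partial Y} = 4Y$, evaluated at $X = y + \alpha \eta$ and $Y = y' + \alpha \eta'$, give

$$\eta \frac {\partial f}{\partial X} + \eta' \frac {\partial f}{\partial Y} = \eta \cdot 2(y + \alpha \eta) + \eta' \cdot 4(y' + \alpha \eta'),$$

which is the same expression. At $\alpha = 0$ both sides reduce to $2\eta y + 4\eta' y'$.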

9. Jun 3, 2015

### ShayanJ

Yeah... sorry, that was a mistake. I should have written the same $w=y+\alpha \eta, z=y'+\alpha \eta'$ under the derivatives!
In PeroK's post, $Y$ isn't the same as $y$!

10. Jun 3, 2015

### Kinta

I'm pretty sure that I understand all of this and that I understood it before your post (I don't mean for this to come across as arrogant). What was your motivation for posting this? I mean to ask what, from my posts, seems to indicate that it would be instructive to review the definition of a function? It seems to me that you may see a problem of mine of which I'm not even aware.

Is there anything truly inherently wrong with my attempted solution as given in my original post or is it just notationally clunky?

11. Jun 3, 2015

### Fredrik

This answer is not wrong. The difference between what you wrote and the "correct" answer is a notational convention.

I think the least confusing notation for the three partial derivatives of a function $f:\mathbb R^3\to\mathbb R$ is $D_1f$, $D_2f$, $D_3f$. The most popular notation $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, $\frac{\partial f}{\partial z}$ is actually pretty strange, since e.g. the first one is defined as the function $g:\mathbb R^3\to\mathbb R$ such that for all $r,s,t\in\mathbb R$,
$$g(r,s,t)=\lim_{h\to 0}\frac{f(r+h,s,t)-f(r,s,t)}{h}.$$ As you can see, the variable "x" isn't involved in the definition, so why should it be part of the notation for this function? The only reasonable answer is that we usually use the symbol x to represent the number that we plug into the first variable slot.

The motivation for the "correct" result in your problem is something like this: Let's say that we're dealing with some expression that involves the variables $y$, $y'$ and $x$, something like $3y+y'x^2$. This expression is said to be a "function of" $y$, $y'$ and $x$ because its value (the number the expression represents) is determined by the values of these three variables. Now it's convenient to introduce an actual function $f:\mathbb R^3\to\mathbb R$ defined by
$$f(r,s,t)=3r+st^2$$ for all $r,s,t\in\mathbb R$. Now the original expression is equal to $f(y,y',x)$. Because of this, it makes sense to denote $D_1f$, $D_2f$, $D_3f$ by $\frac{\partial f}{\partial y}$, $\frac{\partial f}{\partial y'}$, $\frac{\partial f}{\partial x}$. When you evaluate
$$\frac{d}{d\alpha} f(y+\alpha\eta,y'+\alpha\eta',x)$$ the answer will contain partial derivatives of f, and we should denote them by $\frac{\partial f}{\partial y}$, $\frac{\partial f}{\partial y'}$, $\frac{\partial f}{\partial x}$, because that's the notation we decided to use before we decided to evaluate this particular total derivative.
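The slot-based view can be sketched numerically. The snippet below uses the example $f(r,s,t)=3r+st^2$ from this post; the helper `D` (a central-difference estimate of the partial derivative in a given slot) and all sample values are assumptions made for illustration, not part of the original discussion.

```python
def f(r, s, t):
    # The example function from the post: f(r, s, t) = 3r + s t^2.
    return 3 * r + s * t**2

def D(slot, func, args, h=1e-6):
    """Central-difference estimate of the partial derivative of func
    in the given slot (1-based, matching D_1 f, D_2 f, D_3 f)."""
    up = list(args)
    dn = list(args)
    up[slot - 1] += h
    dn[slot - 1] -= h
    return (func(*up) - func(*dn)) / (2 * h)

# Assumed sample values for y, y', x, eta, eta' at one fixed point.
y, yp, x = 1.5, -0.7, 2.0
eta, etap = 0.4, 1.1
alpha = 0.3

def g(a):
    # The one-variable function alpha -> f(y + alpha*eta, y' + alpha*eta', x).
    return f(y + a * eta, yp + a * etap, x)

# Left-hand side: derivative of g with respect to alpha.
lhs = (g(alpha + 1e-6) - g(alpha - 1e-6)) / 2e-6

# Right-hand side: slot derivatives evaluated at the shifted arguments.
point = (y + alpha * eta, yp + alpha * etap, x)
rhs = eta * D(1, f, point) + etap * D(2, f, point)

print(lhs, rhs)
```

Note that `D` never refers to a variable name such as `y` or `x`, only to a slot number, which is the sense in which $\frac{\partial f}{\partial y}$ is just a label for $D_1f$.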

Last edited: Jun 3, 2015
12. Jun 3, 2015

### PeroK

Your first post is essentially correct, so all this was perhaps an attempt to explain the slightly loose notation that the author was using. And perhaps that sums it up: his is a bit loose and yours is clunky and mine is pedantic!

I wasn't sure why you started to use things like $Y'$ in your last post, which suggested you didn't quite understand what I was getting at. Hopefully it's all clear now.

13. Jun 3, 2015

### Kinta

I think everything is crystal clear now. Thanks, everybody!