Partial derivative of composition

Discussion Overview

The discussion revolves around the interpretation of the partial derivative of a function \( C(x, y) \) where \( y \) is a function of \( x \). Participants explore the implications of differentiating with respect to \( x \) while considering \( y \) as dependent on \( x \), discussing the correct application of the gradient operator in this context.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants propose that the term \( \nabla_x C(x_0, y(x_0)) \) should be interpreted as taking the partial derivative of \( C \) with respect to \( x \) and substituting \( y(x_0) \) for \( y \) and \( x_0 \) for \( x \).
  • Others argue that one must be cautious about the function being differentiated, suggesting that it defines a new function \( g(x) = C(x, y(x)) \) and that the derivative of \( g \) includes both the partial derivatives of \( C \) and the derivative of \( y(x) \).
  • A later reply questions the meaning of the subscript in \( \nabla_x \), suggesting that if \( C(x, y(x)) \) is treated as a function \( g(x) \), the subscript may become irrelevant.
  • Participants discuss whether \( C \) should be considered as a function of two vector variables or reduced to a single vector variable, depending on the context of the problem.
  • There is a suggestion that clarity in notation is essential, and ambiguity should be avoided to ensure proper interpretation of the derivatives involved.

Areas of Agreement / Disagreement

Participants express differing interpretations regarding the treatment of the function \( C \) and the application of the gradient operator. There is no consensus on the correct approach, and multiple competing views remain present in the discussion.

Contextual Notes

Limitations include the potential ambiguity in notation and the dependence on the context of the problem being discussed. The discussion does not resolve the mathematical steps involved in differentiating the function.

SchroedingersLion
Hi guys,

suppose we have a function ##C(x, y)## into the real numbers. Suppose also that ##y=y(x)##, i.e. ##y## is a function of ##x##.

Now in my script, I have a term ##\nabla_x C(x_0, y(x_0)) ##. From my point of view, this means that you take the partial derivative of ##C(x,y)## with respect to x and then insert ##y(x_0)## for ##y##, and ##x_0## for ##x##.
I would not consider the x-derivative of ##y(x)##, since then the subscript at the nabla operator wouldn't make any sense as this would simply be ##\nabla C(x_0, y(x_0))##.

Is this line of thinking correct?

Best

SL.
 
SchroedingersLion said:
From my point of view, this means that you take the partial derivative of ##C(x,y)## with respect to x and then insert ##y(x_0)## for ##y##, and ##x_0## for ##x##.
I agree.
I would not consider the x-derivative of ##y(x)##, since then the subscript at the nabla operator wouldn't make any sense as this would simply be ##\nabla C(x_0, y(x_0))##.
I would interpret the last expression as a vector with the two partial derivatives as components.
 
SchroedingersLion said:
Hi guys,

suppose we have a function ##C(x, y)## into the real numbers. Suppose also that ##y=y(x)##, i.e. ##y## is a function of ##x##.

In this case you must be careful about what function you are talking about and how you are differentiating. Technically, you have now defined a new function of a single variable:
$$g(x) = C(x, y(x))$$
And, the derivative of ##g## is given by:
$$g'(x) = \frac{\partial C}{\partial x}+ \frac{\partial C}{\partial y}y'(x)$$
It would make no sense, of course, to take the gradient of ##g##. And:
$$g'(x_0) = \frac{\partial C}{\partial x}(x_0, y(x_0)) + \frac{\partial C}{\partial y}(x_0, y(x_0))y'(x_0)$$

However, you have another function which is the gradient of ##C##:
$$\nabla C = \frac{\partial C}{\partial x} \hat x + \frac{\partial C}{\partial y} \hat y$$
Note that this function is also a function of two variables. And, you can now define another function by:
$$h(x) = \nabla C(x, y(x)) = \frac{\partial C}{\partial x}(x, y(x)) \hat x + \frac{\partial C}{\partial y}(x, y(x)) \hat y$$
And, of course:
$$h(x_0) = \nabla C(x_0, y(x_0)) = \frac{\partial C}{\partial x}(x_0, y(x_0)) \hat x + \frac{\partial C}{\partial y}(x_0, y(x_0)) \hat y$$
In the context of your question, therefore, you need to decide which function you are dealing with: ##g## or ##h##.
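
To make the difference between the two readings concrete, here is a minimal numerical sketch for the scalar case. The choices ##C(x, y) = x^2 y## and ##y(x) = \sin x## are illustrative assumptions, not taken from the thread, and the helper names are hypothetical. It compares "differentiate ##C## with respect to ##x## with ##y## held fixed, then substitute ##y(x_0)##" against the derivative of ##g(x) = C(x, y(x))##.

```python
import math

# Illustrative assumptions (not from the thread):
# C(x, y) = x**2 * y and y(x) = sin(x).
def C(x, y):
    return x**2 * y

def y_of_x(x):
    return math.sin(x)

def central_diff(f, t, h=1e-6):
    """Central finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

def partial_x_then_substitute(x0):
    """dC/dx with y held fixed, evaluated at (x0, y(x0))."""
    y0 = y_of_x(x0)  # freeze y at its value before differentiating
    return central_diff(lambda x: C(x, y0), x0)

def total_derivative(x0):
    """g'(x0) for g(x) = C(x, y(x)); here y varies with x."""
    return central_diff(lambda x: C(x, y_of_x(x)), x0)

x0 = 1.0
partial = partial_x_then_substitute(x0)  # analytically 2*x0*sin(x0)
total = total_derivative(x0)             # analytically 2*x0*sin(x0) + x0**2*cos(x0)

# The gap between the two is (dC/dy) * y'(x0) = x0**2 * cos(x0).
print(partial, total, total - partial)
```

The two numbers differ by exactly the chain-rule term ##\frac{\partial C}{\partial y} y'(x_0)##, so which one the script means has to be settled from context.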
 
Thanks for the responses!

The two of you seem to disagree about my interpretation of what to do.

@PeroK
I should have noted that ##x## and ##y## are each in ##R^d## so the partial derivatives should be gradients.
From your explanations, I still don't see what the subscript then means in ##\nabla_x##. If ##C(x, y(x))## implies the redefinition of ##C(x,y)## as a function ##g(x)##, then the subscript of the gradient would be useless, right?
Therefore, my original idea:
SchroedingersLion said:
From my point of view, this means that you take the partial derivative of ##C(x,y)## with respect to x and then insert ##y(x_0)## for ##y##, and ##x_0## for ##x##.
 
SchroedingersLion said:
Thanks for the responses!

The two of you seem to disagree about my interpretation of what to do.

@PeroK
I should have noted that ##x## and ##y## are each in ##R^d## so the partial derivatives should be gradients.
From your explanations, I still don't see what the subscript then means in ##\nabla_x##. If ##C(x, y(x))## implies the redefinition of ##C(x,y)## as a function ##g(x)##, then the subscript of the gradient would be useless, right?
Therefore, my original idea:

Okay, so you actually have a function of two vector variables, e.g. for ##d = 3##: ##C(\vec x, \vec y) = C(x_1, x_2, x_3, y_1, y_2, y_3)##.

There's the same issue. Are you first reducing ##C## to a function of a single vector ##\vec x##, using ##\vec y = \vec y(\vec x)##? Or, are you taking the gradient of ##C## with respect to ##\vec x## and then plugging in ##\vec x_0, \vec y_0##?

Which one of these it is depends on the context of what you are doing.
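
For reference, the two readings in the vector case can be written side by side. This is a sketch assuming ##\vec y(\vec x)## is differentiable. The symbol ##J_{\vec y}## for its Jacobian, with entries ##(J_{\vec y})_{ji} = \partial y_j / \partial x_i##, is notation introduced here, and ##g(\vec x) = C(\vec x, \vec y(\vec x))## follows the earlier post.

Taking the gradient with respect to the first argument only and then substituting:
$$\nabla_{\vec x} C(\vec x_0, \vec y(\vec x_0)) = \left.\left(\frac{\partial C}{\partial x_1}, \dots, \frac{\partial C}{\partial x_d}\right)\right|_{(\vec x, \vec y) = (\vec x_0, \vec y(\vec x_0))}$$
Reducing to a function of ##\vec x## first and then taking its gradient:
$$\nabla g(\vec x_0) = \nabla_{\vec x} C(\vec x_0, \vec y(\vec x_0)) + J_{\vec y}(\vec x_0)^{T} \, \nabla_{\vec y} C(\vec x_0, \vec y(\vec x_0))$$
The second expression contains the first plus a chain-rule term, so the two agree only when that extra term vanishes at ##\vec x_0##.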
 
PeroK said:
Okay, so you actually have a function of two vector variables, e.g. for ##d = 3##: ##C(\vec x, \vec y) = C(x_1, x_2, x_3, y_1, y_2, y_3)##.

There's the same issue. Are you first reducing ##C## to a function of a single vector ##\vec x##, using ##\vec y = \vec y(\vec x)##? Or, are you taking the gradient of ##C## with respect to ##\vec x## and then plugging in ##\vec x_0, \vec y_0##?

Which one of these it is depends on the context of what you are doing.

That's essentially what I don't know. I had hoped there was some clear convention =(
 
SchroedingersLion said:
That's essentially what I don't know. I had hoped there was some clear convention =(

I don't think it's a matter of convention. It's a matter of the author being unambiguous. You should be able to work out which one it is from what follows.
 
