Partial derivative of composition

  • #1
Hi guys,

suppose we have a function ##C(x, y)## into the real numbers. Suppose also that ##y=y(x)##, i.e. ##y## is a function of ##x##.

Now in my lecture notes, I have a term ##\nabla_x C(x_0, y(x_0)) ##. From my point of view, this means that you take the partial derivative of ##C(x,y)## with respect to x and then insert ##y(x_0)## for ##y##, and ##x_0## for ##x##.
I would not consider the x-derivative of ##y(x)##, since then the subscript at the nabla operator wouldn't make any sense as this would simply be ##\nabla C(x_0, y(x_0))##.

Is this line of thinking correct?

Best

SL.
 

Answers and Replies

  • #2
From my point of view, this means that you take the partial derivative of ##C(x,y)## with respect to x and then insert ##y(x_0)## for ##y##, and ##x_0## for ##x##.
I agree.
I would not consider the x-derivative of ##y(x)##, since then the subscript at the nabla operator wouldn't make any sense as this would simply be ##\nabla C(x_0, y(x_0))##.
I would interpret the last expression as the vector with the two partial derivatives as components.
 
  • #3
PeroK
suppose we have a function ##C(x, y)## into the real numbers. Suppose also that ##y=y(x)##, i.e. ##y## is a function of ##x##.

In this case you must be careful about what function you are talking about and how you are differentiating. Technically, you have now defined a new function of a single variable:
$$g(x) = C(x, y(x))$$
And, the derivative of ##g## is given by:
$$g'(x) = \frac{\partial C}{\partial x}+ \frac{\partial C}{\partial y}y'(x)$$
It would make no sense, of course, to take the gradient of ##g##. And:
$$g'(x_0) = \frac{\partial C}{\partial x}(x_0, y(x_0)) + \frac{\partial C}{\partial y}(x_0, y(x_0))y'(x_0)$$
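A quick numerical check may make the difference concrete. This is only a sketch with a made-up example, ##C(x, y) = x^2 y## and ##y(x) = 3x##, neither of which comes from the original question; the helper `central_diff` is likewise just an illustrative finite-difference routine.

```python
# Hypothetical example functions (not from the thread): C(x, y) = x^2 * y, y(x) = 3x
def C(x, y):
    return x**2 * y

def y_of_x(x):
    return 3.0 * x

def central_diff(f, t, eps=1e-6):
    """Central finite-difference approximation of f'(t)."""
    return (f(t + eps) - f(t - eps)) / (2.0 * eps)

x0 = 2.0
y0 = y_of_x(x0)

# Partial derivatives of C: vary one argument, hold the other fixed at its value on the curve
partial_x = central_diff(lambda x: C(x, y0), x0)
partial_y = central_diff(lambda y: C(x0, y), y0)

# Derivative of y(x) and of the composed function g(x) = C(x, y(x))
y_prime = central_diff(y_of_x, x0)
g_prime = central_diff(lambda x: C(x, y_of_x(x)), x0)

print("dC/dx at (x0, y(x0)):", partial_x)
print("g'(x0):              ", g_prime)
print("chain rule sum:      ", partial_x + partial_y * y_prime)
```

For this choice the partial derivative is ##2 x_0 y(x_0) = 24## while ##g'(x_0) = 9 x_0^2 = 36##, so the two readings give different numbers.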

However, you have another function which is the gradient of ##C##:
$$\nabla C = \frac{\partial C}{\partial x} \hat x + \frac{\partial C}{\partial y} \hat y$$
Note that this function is also a function of two variables. And, you can now define another function by:
$$h(x) = \nabla C(x, y(x)) = \frac{\partial C}{\partial x}(x, y(x)) \hat x + \frac{\partial C}{\partial y}(x, y(x)) \hat y$$
And, of course:
$$h(x_0) = \nabla C(x_0, y(x_0)) = \frac{\partial C}{\partial x}(x_0, y(x_0)) \hat x + \frac{\partial C}{\partial y}(x_0, y(x_0)) \hat y$$
In the context of your question, therefore, you need to decide which function you are dealing with: ##g## or ##h##.
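As a concrete illustration (a toy example of my own, not taken from your notes), take ##C(x, y) = x^2 y## and ##y(x) = 3x##. Then:
$$g(x) = 3x^3, \quad g'(x_0) = 2x_0\, y(x_0) + x_0^2 \cdot 3 = 9x_0^2$$
$$h(x_0) = 2x_0\, y(x_0) \hat x + x_0^2 \hat y = 6x_0^2 \hat x + x_0^2 \hat y$$
The ##\hat x## component of ##h(x_0)##, namely ##6x_0^2##, is the partial derivative evaluated on the curve. It differs from ##g'(x_0) = 9x_0^2## by exactly the chain-rule term ##\frac{\partial C}{\partial y}(x_0, y(x_0))\, y'(x_0) = 3x_0^2##.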
 
  • #4
Thanks for the responses!

The two of you seem to disagree on whether my interpretation is right.

@PeroK
I should have noted that ##x## and ##y## are each in ##\mathbb{R}^d##, so the partial derivatives should be gradients.
From your explanations, I still don't see what the subscript then means in ##\nabla_x##. If ##C(x, y(x))## implies the redefinition of ##C(x,y)## as a function ##g(x)##, then the subscript of the gradient would be useless, right?
Therefore, my original idea:
From my point of view, this means that you take the partial derivative of ##C(x,y)## with respect to x and then insert ##y(x_0)## for ##y##, and ##x_0## for ##x##.
 
  • #5
PeroK
Okay, so you actually have a function of two vector variables: ##C(\vec x, \vec y) = C(x_1, x_2, x_3, y_1, y_2, y_3)##.

There's the same issue. Are you first reducing ##C## to a function of a single vector ##\vec x##, using ##\vec y = \vec y(\vec x)##, and then taking its gradient? Or, are you taking the gradient of ##C## with respect to ##\vec x## only, holding ##\vec y## fixed, and then plugging in ##\vec x = \vec x_0, \vec y = \vec y(\vec x_0)##?

Which one of these it is depends on the context of what you are doing.
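To spell out the two readings in the vector case: if ##g(\vec x) = C(\vec x, \vec y(\vec x))##, the chain rule gives
$$\nabla g(\vec x) = \nabla_{\vec x} C(\vec x, \vec y(\vec x)) + J_{\vec y}(\vec x)^T \, \nabla_{\vec y} C(\vec x, \vec y(\vec x))$$
where ##J_{\vec y}## is the Jacobian of ##\vec y## with entries ##\partial y_i / \partial x_j##. The first term alone is the "partial gradient, then plug in" reading. Below is a rough numerical sketch with a made-up example, ##C(\vec x, \vec y) = \vec x \cdot \vec y## and ##\vec y(\vec x) = 2\vec x##; these functions and the helper `grad` are assumptions for illustration only.

```python
# Hypothetical example (not from the thread): C(x, y) = x . y, y(x) = 2x
def C(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def y_of_x(x):
    return [2.0 * xi for xi in x]

def grad(f, x, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at the point x."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((f(xp) - f(xm)) / (2.0 * eps))
    return g

x0 = [1.0, -2.0, 0.5]
y0 = y_of_x(x0)

# Reading 1: gradient with respect to x only, y held fixed at y(x0)
partial_grad = grad(lambda x: C(x, y0), x0)

# Reading 2: gradient of the composed function g(x) = C(x, y(x))
total_grad = grad(lambda x: C(x, y_of_x(x)), x0)

print("partial gradient:", partial_grad)
print("total gradient:  ", total_grad)
```

Here the partial gradient is ##\vec y(\vec x_0) = 2\vec x_0## and the total gradient is ##4\vec x_0##, so the extra Jacobian term contributes another ##2\vec x_0##.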
 
  • #6
That's essentially what I don't know. I had hoped there was some clear convention =(
 
  • #7
PeroK
I don't think it's a matter of convention. It's a matter of the author being unambiguous. You should be able to work out which one it is from what follows.
 
