Partial derivative of composition

  • #1
SchroedingersLion
Hi guys,

suppose we have a function ##C(x, y)## into the real numbers. Suppose also that ##y=y(x)##, i.e. ##y## is a function of ##x##.

Now in my lecture notes, I have a term ##\nabla_x C(x_0, y(x_0)) ##. From my point of view, this means that you take the partial derivative of ##C(x,y)## with respect to ##x## and then insert ##y(x_0)## for ##y##, and ##x_0## for ##x##.
I would not consider the x-derivative of ##y(x)##, since then the subscript at the nabla operator wouldn't make any sense as this would simply be ##\nabla C(x_0, y(x_0))##.

Is this line of thinking correct?

Best

SL.
 

Answers and Replies

  • #2
SchroedingersLion said: "From my point of view, this means that you take the partial derivative of ##C(x,y)## with respect to x and then insert ##y(x_0)## for ##y##, and ##x_0## for ##x##."
I agree.
SchroedingersLion said: "I would not consider the x-derivative of ##y(x)##, since then the subscript at the nabla operator wouldn't make any sense as this would simply be ##\nabla C(x_0, y(x_0))##."
I would interpret the last expression as a vector with the two partial derivatives as components.
 
  • #3
PeroK
Science Advisor
Homework Helper
Insights Author
Gold Member
2022 Award
SchroedingersLion said: "suppose we have a function ##C(x, y)## into the real numbers. Suppose also that ##y=y(x)##, i.e. ##y## is a function of ##x##."

In this case you must be careful about what function you are talking about and how you are differentiating. Technically, you have now defined a new function of a single variable:
$$g(x) = C(x, y(x))$$
And, the derivative of ##g## is given by:
$$g'(x) = \frac{\partial C}{\partial x}+ \frac{\partial C}{\partial y}y'(x)$$
It would make no sense, of course, to take the gradient of ##g##. And:
$$g'(x_0) = \frac{\partial C}{\partial x}(x_0, y(x_0)) + \frac{\partial C}{\partial y}(x_0, y(x_0))y'(x_0)$$
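As a quick sanity check of the chain rule above, here is a worked example with a hypothetical choice of functions (not taken from the thread), ##C(x, y) = x^2 y## and ##y(x) = x^3##, so that ##g(x) = x^5##:
$$\frac{\partial C}{\partial x}(x, y(x)) = 2x \cdot x^3 = 2x^4, \qquad \frac{\partial C}{\partial y}(x, y(x))\, y'(x) = x^2 \cdot 3x^2 = 3x^4$$
$$g'(x) = 2x^4 + 3x^4 = 5x^4$$
The partial derivative alone gives only ##2x^4##, which is not the derivative of ##g##.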

However, you have another function which is the gradient of ##C##:
$$\nabla C = \frac{\partial C}{\partial x} \hat x + \frac{\partial C}{\partial y} \hat y$$
Note that this function is also a function of two variables. And, you can now define another function by:
$$h(x) = \nabla C(x, y(x)) = \frac{\partial C}{\partial x}(x, y(x)) \hat x + \frac{\partial C}{\partial y}(x, y(x)) \hat y$$
And, of course:
$$h(x_0) = \nabla C(x_0, y(x_0)) = \frac{\partial C}{\partial x}(x_0, y(x_0)) \hat x + \frac{\partial C}{\partial y}(x_0, y(x_0)) \hat y$$
In the context of your question, therefore, you need to decide which function you are dealing with: ##g## or ##h##.
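The difference between the two readings can also be checked numerically. The following is a minimal sketch, assuming scalar ##x## and ##y## and using made-up example functions (`C`, `y_of`, and all helper names are hypothetical and not from the thread). It compares the ##x##-component of ##h##, i.e. the partial derivative evaluated at ##(x_0, y(x_0))##, with ##g'(x_0)##, and checks both against central finite differences.

```python
import math

# Hypothetical example functions, chosen only for illustration.
def C(x, y):
    return x**2 * y + math.sin(y)

def y_of(x):
    return x**3

def dC_dx(x, y):
    # Partial derivative of C with respect to x, with y held fixed.
    return 2 * x * y

def dC_dy(x, y):
    # Partial derivative of C with respect to y, with x held fixed.
    return x**2 + math.cos(y)

def dy_dx(x):
    return 3 * x**2

def partial_x_at(x0):
    # Reading 1: differentiate C in x first, then insert y = y(x0).
    return dC_dx(x0, y_of(x0))

def total_derivative_at(x0):
    # Reading 2: g'(x0) for g(x) = C(x, y(x)), via the chain rule.
    y0 = y_of(x0)
    return dC_dx(x0, y0) + dC_dy(x0, y0) * dy_dx(x0)

def central_difference(f, x0, h=1e-6):
    # Symmetric finite-difference approximation of f'(x0).
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

x0 = 1.2
y0 = y_of(x0)

# Freeze y at y(x0) before differentiating: this approximates reading 1.
fd_partial = central_difference(lambda x: C(x, y0), x0)
# Let y vary with x while differentiating: this approximates reading 2.
fd_total = central_difference(lambda x: C(x, y_of(x)), x0)

print("partial derivative at (x0, y(x0)):", partial_x_at(x0), fd_partial)
print("total derivative g'(x0):          ", total_derivative_at(x0), fd_total)
```

The only difference between the two finite-difference calls is whether ##y## is frozen at ##y(x_0)## or allowed to follow ##x##, which is exactly the choice between the two readings.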
 
  • #4
SchroedingersLion
Thanks for the responses!

The two of you seem to disagree about whether my interpretation is correct.

@PeroK
I should have noted that ##x## and ##y## are each in ##\mathbb{R}^d##, so the partial derivatives should be gradients.
From your explanations, I still don't see what the subscript then means in ##\nabla_x##. If ##C(x, y(x))## implies the redefinition of ##C(x,y)## as a function ##g(x)##, then the subscript of the gradient would be useless, right?
Therefore, my original idea:
From my point of view, this means that you take the partial derivative of ##C(x,y)## with respect to x and then insert ##y(x_0)## for ##y##, and ##x_0## for ##x##.
 
  • #5
PeroK
Okay, so you actually have a function of two vector variables. For ##d = 3##, say: ##C(\vec x, \vec y) = C(x_1, x_2, x_3, y_1, y_2, y_3)##.

There's the same issue. Are you first reducing ##C## to a function of a single vector ##\vec x##, using ##\vec y = \vec y(\vec x)##? Or, are you taking the gradient of ##C## with respect to ##\vec x## and then plugging in ##\vec x_0## and ##\vec y(\vec x_0)##?

Which one of these it is depends on the context of what you are doing.
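Written out in components, as a sketch assuming ##\vec x, \vec y \in \mathbb{R}^d## and using the name ##g(\vec x) = C(\vec x, \vec y(\vec x))## for the reduced function, the two options are as follows. Taking the gradient with respect to ##\vec x## first and then plugging in gives, for each component ##i##:
$$\left[\nabla_{\vec x} C(\vec x_0, \vec y(\vec x_0))\right]_i = \frac{\partial C}{\partial x_i}(\vec x_0, \vec y(\vec x_0))$$
Reducing to a function of ##\vec x## first and then taking the gradient gives, by the chain rule:
$$\frac{\partial g}{\partial x_i}(\vec x_0) = \frac{\partial C}{\partial x_i}(\vec x_0, \vec y(\vec x_0)) + \sum_{j=1}^{d} \frac{\partial C}{\partial y_j}(\vec x_0, \vec y(\vec x_0)) \frac{\partial y_j}{\partial x_i}(\vec x_0)$$
The two agree only when the sum vanishes, for example when ##\vec y## does not actually depend on ##\vec x##.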
 
  • #6
SchroedingersLion
That's essentially what I don't know. I had hoped there was some clear convention =(
 
  • #7
PeroK
I don't think it's a matter of convention. It's a matter of the author being unambiguous. You should be able to work out which one it is from what follows.
 
