# Partial derivative of composition

SchroedingersLion
Hi guys,

suppose we have a function ##C(x, y)## into the real numbers. Suppose also that ##y=y(x)##, i.e. ##y## is a function of ##x##.

Now in my script, I have a term ##\nabla_x C(x_0, y(x_0)) ##. From my point of view, this means that you take the partial derivative of ##C(x,y)## with respect to x and then insert ##y(x_0)## for ##y##, and ##x_0## for ##x##.
I would not consider the x-derivative of ##y(x)##, since then the subscript at the nabla operator wouldn't make any sense as this would simply be ##\nabla C(x_0, y(x_0))##.

Is this line of thinking correct?

Best

SL.

Mentor
From my point of view, this means that you take the partial derivative of ##C(x,y)## with respect to x and then insert ##y(x_0)## for ##y##, and ##x_0## for ##x##.
I agree.
I would not consider the x-derivative of ##y(x)##, since then the subscript at the nabla operator wouldn't make any sense as this would simply be ##\nabla C(x_0, y(x_0))##.
I would interpret the last expression as a vector with the two partial derivatives as components.

PeroK
In this case you must be careful about what function you are talking about and how you are differentiating. Technically, you have now defined a new function of a single variable:
$$g(x) = C(x, y(x))$$
The derivative of ##g## is given by the chain rule, with both partial derivatives evaluated at ##(x, y(x))##:
$$g'(x) = \frac{\partial C}{\partial x}(x, y(x)) + \frac{\partial C}{\partial y}(x, y(x))y'(x)$$
Since ##g## is a function of a single variable, it would make no sense, of course, to take the gradient of ##g##. At the point ##x_0##:
$$g'(x_0) = \frac{\partial C}{\partial x}(x_0, y(x_0)) + \frac{\partial C}{\partial y}(x_0, y(x_0))y'(x_0)$$

However, you have another function which is the gradient of ##C##:
$$\nabla C = \frac{\partial C}{\partial x} \hat x + \frac{\partial C}{\partial y} \hat y$$
Note that this function is also a function of two variables. And, you can now define another function by:
$$h(x) = \nabla C(x, y(x)) = \frac{\partial C}{\partial x}(x, y(x)) \hat x + \frac{\partial C}{\partial y}(x, y(x)) \hat y$$
And, of course:
$$h(x_0) = \nabla C(x_0, y(x_0)) = \frac{\partial C}{\partial x}(x_0, y(x_0)) \hat x + \frac{\partial C}{\partial y}(x_0, y(x_0)) \hat y$$
In the context of your question, therefore, you need to decide which function you are dealing with: ##g## or ##h##.
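To make the difference concrete, here is a minimal numerical sketch. The functions ##C(x, y) = x^2 y## and ##y(x) = 3x## and the point ##x_0 = 1## are made up for illustration and are not from the thread, and the derivatives are approximated by central finite differences rather than computed exactly. It evaluates the partial derivative ##\frac{\partial C}{\partial x}(x_0, y(x_0))##, the two components of ##h(x_0)##, and ##g'(x_0)##:

```python
# Illustrative choices (assumptions, not from the thread):
# C(x, y) = x**2 * y and y(x) = 3*x, evaluated at x0 = 1.

def C(x, y):
    return x**2 * y

def y_of_x(x):
    return 3.0 * x

def central_diff(f, t, eps=1e-6):
    """Central finite-difference approximation of f'(t)."""
    return (f(t + eps) - f(t - eps)) / (2.0 * eps)

x0 = 1.0
y0 = y_of_x(x0)

# Partial derivative of C w.r.t. its first argument: y is frozen at y(x0).
dC_dx = central_diff(lambda x: C(x, y0), x0)

# Partial derivative of C w.r.t. its second argument: x is frozen at x0.
dC_dy = central_diff(lambda y: C(x0, y), y0)

# h(x0): the gradient of C evaluated at (x0, y(x0)), as a pair of components.
grad_C = (dC_dx, dC_dy)

# g'(x0): derivative of g(x) = C(x, y(x)), where y moves along with x.
g_prime = central_diff(lambda x: C(x, y_of_x(x)), x0)

# Chain-rule reconstruction of g'(x0) from the two partials and y'(x0).
y_prime = central_diff(y_of_x, x0)
chain_rule = dC_dx + dC_dy * y_prime

print("dC/dx at (x0, y(x0)):", dC_dx)
print("grad C at (x0, y(x0)):", grad_C)
print("g'(x0):", g_prime)
print("chain rule:", chain_rule)
```

With these choices the partial derivative comes out close to 6 while ##g'(x_0)## comes out close to 9, so the two readings really do give different numbers whenever ##\frac{\partial C}{\partial y} y'## is nonzero.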

SchroedingersLion
Thanks for the responses!

The two of you seem to disagree about my interpretation.

@PeroK
I should have noted that ##x## and ##y## are each in ##\mathbb{R}^d##, so the partial derivatives should be gradients.
From your explanations, I still don't see what the subscript then means in ##\nabla_x##. If ##C(x, y(x))## implies the redefinition of ##C(x,y)## as a function ##g(x)##, then the subscript of the gradient would be useless, right?
Therefore, my original idea:
From my point of view, this means that you take the partial derivative of ##C(x,y)## with respect to x and then insert ##y(x_0)## for ##y##, and ##x_0## for ##x##.

PeroK
Okay, so you actually have a function of two vector variables. Taking ##d = 3## for concreteness: ##C(\vec x, \vec y) = C(x_1, x_2, x_3, y_1, y_2, y_3)##.

There's the same issue. Are you first reducing ##C## to a function of a single vector ##\vec x##, using ##\vec y = \vec y(\vec x)##? Or, are you taking the gradient of ##C## with respect to ##\vec x## and then plugging in ##\vec x_0## and ##\vec y_0 = \vec y(\vec x_0)##?

Which one of these it is depends on the context of what you are doing.
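For reference, the two readings are related by the multivariable chain rule, where ##J_{\vec y}## denotes the Jacobian matrix of ##\vec y## with entries ##\partial y_i / \partial x_j##:
$$\nabla g(\vec x) = \nabla_{\vec x} C(\vec x, \vec y(\vec x)) + J_{\vec y}(\vec x)^T \, \nabla_{\vec y} C(\vec x, \vec y(\vec x))$$
Here is a rough numerical sketch of both readings. The choices ##d = 2##, ##C(\vec x, \vec y) = \vec x \cdot \vec y## and ##\vec y(\vec x) = 2 \vec x## are assumptions picked only for illustration, and the gradients are finite-difference approximations:

```python
# Illustrative choices (assumptions, not from the thread):
# d = 2, C(x, y) = x . y (dot product), y(x) = 2 x, evaluated at x0 = (1, 2).

def C(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def y_of_x(x):
    return [2.0 * xi for xi in x]

def grad(f, v, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at the list v."""
    result = []
    for i in range(len(v)):
        up = list(v)
        dn = list(v)
        up[i] += eps
        dn[i] -= eps
        result.append((f(up) - f(dn)) / (2.0 * eps))
    return result

x0 = [1.0, 2.0]
y0 = y_of_x(x0)

# Reading 1: gradient over the x-slot only, with y frozen at y(x0).
grad_x_C = grad(lambda x: C(x, y0), x0)

# Reading 2: gradient of the reduced function g(x) = C(x, y(x)).
grad_g = grad(lambda x: C(x, y_of_x(x)), x0)

# Gradient over the y-slot, with x frozen at x0 (needed for the chain rule).
grad_y_C = grad(lambda y: C(x0, y), y0)

# For y(x) = 2x the Jacobian is 2*I, so J^T grad_y_C is simply 2 * grad_y_C.
chain_rule = [gx + 2.0 * gy for gx, gy in zip(grad_x_C, grad_y_C)]

print("grad_x C at (x0, y(x0)):", grad_x_C)
print("grad g at x0:", grad_g)
print("chain rule:", chain_rule)
```

With these choices, reading 1 gives approximately ##(2, 4)## and reading 2 gives approximately ##(4, 8)##; the second term of the chain rule accounts for the difference.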

SchroedingersLion
That's essentially what I don't know. I had hoped there was some clear convention =(