Partial derivative of composition

SchroedingersLion · Mar 10, 2020

Hi guys,

Suppose we have a real-valued function ##C(x, y)##. Suppose also that ##y=y(x)##, i.e. ##y## is a function of ##x##.

Now, in my lecture notes I have the term ##\nabla_x C(x_0, y(x_0))##. From my point of view, this means that you take the partial derivative of ##C(x,y)## with respect to ##x## and then insert ##y(x_0)## for ##y## and ##x_0## for ##x##.
I would not include the ##x##-derivative of ##y(x)##, since then the subscript on the nabla operator would be pointless: the expression would simply be ##\nabla C(x_0, y(x_0))##.

Is this line of thinking correct?

Best

SL.

mfb · Mar 11, 2020

SchroedingersLion said:

From my point of view, this means that you take the partial derivative of ##C(x,y)## with respect to ##x## and then insert ##y(x_0)## for ##y## and ##x_0## for ##x##.

I agree.

I would not include the ##x##-derivative of ##y(x)##, since then the subscript on the nabla operator would be pointless: the expression would simply be ##\nabla C(x_0, y(x_0))##.

I would interpret the last expression as the vector with the two partial derivatives as components.

PeroK · Mar 11, 2020

SchroedingersLion said:

Hi guys,

Suppose we have a real-valued function ##C(x, y)##. Suppose also that ##y=y(x)##, i.e. ##y## is a function of ##x##.

In this case you must be careful about what function you are talking about and how you are differentiating. Technically, you have now defined a new function of a single variable:
$$g(x) = C(x, y(x))$$
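And the derivative of ##g## is given by the chain rule, with both partial derivatives evaluated at ##(x, y(x))##:
$$g'(x) = \frac{\partial C}{\partial x}(x, y(x)) + \frac{\partial C}{\partial y}(x, y(x))y'(x)$$
It would make no sense, of course, to take the gradient of ##g##, since it is a function of a single variable. At ##x_0##: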
$$g'(x_0) = \frac{\partial C}{\partial x}(x_0, y(x_0)) + \frac{\partial C}{\partial y}(x_0, y(x_0))y'(x_0)$$

However, you have another function which is the gradient of ##C##:
$$\nabla C = \frac{\partial C}{\partial x} \hat x + \frac{\partial C}{\partial y} \hat y$$
Note that this function is also a function of two variables. And, you can now define another function by:
$$h(x) = \nabla C(x, y(x)) = \frac{\partial C}{\partial x}(x, y(x)) \hat x + \frac{\partial C}{\partial y}(x, y(x)) \hat y$$
And, of course:
$$h(x_0) = \nabla C(x_0, y(x_0)) = \frac{\partial C}{\partial x}(x_0, y(x_0)) \hat x + \frac{\partial C}{\partial y}(x_0, y(x_0)) \hat y$$
In the context of your question, therefore, you need to decide which function you are dealing with: ##g## or ##h##.
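
To make the distinction concrete, here is a minimal numerical sketch. The functions ##C(x, y) = x^2 y## and ##y(x) = x^3##, the point ##x_0 = 1.5##, and the helper `central_diff` are assumptions chosen purely for illustration; none of them come from the original question. For this choice, the partial derivative with ##y## held fixed is ##2 x_0\, y(x_0) = 2 x_0^4##, while ##g(x) = x^5## gives ##g'(x_0) = 5 x_0^4##, so the two readings give different numbers.

```python
# Hypothetical example functions (assumed for illustration only).
def C(x, y):
    return x**2 * y

def y_of_x(x):
    return x**3

def central_diff(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 1.5
y0 = y_of_x(x0)

# Reading 1: partial derivative of C with respect to x,
# with y frozen at the value y(x0) before differentiating.
partial_x = central_diff(lambda x: C(x, y0), x0)

# Reading 2: derivative of the composite g(x) = C(x, y(x)),
# where y is allowed to vary with x.
total = central_diff(lambda x: C(x, y_of_x(x)), x0)

# Chain rule: g'(x0) = dC/dx + dC/dy * y'(x0), all evaluated at (x0, y(x0)).
partial_y = central_diff(lambda y: C(x0, y), y0)
dy_dx = central_diff(y_of_x, x0)
chain_rule = partial_x + partial_y * dy_dx

print("partial derivative, y frozen:", partial_x, "analytic:", 2 * x0**4)
print("derivative of g:             ", total, "analytic:", 5 * x0**4)
print("chain-rule reconstruction:   ", chain_rule)
```

The difference between the two results is exactly the term ##\frac{\partial C}{\partial y}(x_0, y(x_0))\,y'(x_0)##, which the first reading leaves out.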

SchroedingersLion · Mar 11, 2020

Thanks for the responses!

The two of you seem to disagree about my interpretation of what to do.

@PeroK
I should have noted that ##x## and ##y## are each in ##\mathbb{R}^d##, so the partial derivatives should be gradients.
From your explanations, I still don't see what the subscript then means in ##\nabla_x##. If ##C(x, y(x))## implies the redefinition of ##C(x,y)## as a function ##g(x)##, then the subscript of the gradient would be useless, right?
Therefore, my original idea:

SchroedingersLion said:

From my point of view, this means that you take the partial derivative of ##C(x,y)## with respect to ##x## and then insert ##y(x_0)## for ##y## and ##x_0## for ##x##.

PeroK · Mar 11, 2020

SchroedingersLion said:

Thanks for the responses!

The two of you seem to disagree about my interpretation of what to do.

@PeroK
I should have noted that ##x## and ##y## are each in ##\mathbb{R}^d##, so the partial derivatives should be gradients.
From your explanations, I still don't see what the subscript then means in ##\nabla_x##. If ##C(x, y(x))## implies the redefinition of ##C(x,y)## as a function ##g(x)##, then the subscript of the gradient would be useless, right?
Therefore, my original idea:

Okay, so you actually have a function of two vector variables: ##C(\vec x, \vec y) = C(x_1, x_2, x_3, y_1, y_2, y_3)##.

There's the same issue. Are you first reducing ##C## to a function of a single vector ##\vec x##, using ##\vec y = \vec y(\vec x)##? Or are you taking the gradient of ##C## with respect to ##\vec x## and then plugging in ##\vec x_0## and ##\vec y_0 = \vec y(\vec x_0)##?

Which one of these it is depends on the context of what you are doing.
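
For the vector case, the two readings differ by a Jacobian term. Writing ##g(\vec x) = C(\vec x, \vec y(\vec x))##, the chain rule gives
$$\nabla g(\vec x) = \nabla_{\vec x} C(\vec x, \vec y(\vec x)) + J(\vec x)^T \nabla_{\vec y} C(\vec x, \vec y(\vec x)), \qquad J_{ji} = \frac{\partial y_j}{\partial x_i}$$
The sketch below checks this numerically for ##d = 2##. The functions ##C(\vec x, \vec y) = \vec x \cdot \vec y + |\vec x|^2## and ##\vec y(\vec x) = (x_1 x_2,\; x_2^2)##, the point ##\vec x_0 = (1, 2)##, and the helper `grad` are all assumptions made up for illustration, not taken from the thread. By hand, the partial gradient at that point is ##(4, 8)## and the gradient of ##g## is ##(6, 17)##.

```python
# Hypothetical example with d = 2 (functions assumed for illustration only).
def C(x, y):
    # C(x, y) = x . y + |x|^2
    return x[0] * y[0] + x[1] * y[1] + x[0] ** 2 + x[1] ** 2

def y_of_x(x):
    # y(x) = (x_1 * x_2, x_2^2), written with 0-based indices
    return [x[0] * x[1], x[1] ** 2]

def grad(f, x, h=1e-6):
    """Central finite-difference gradient of a scalar function f at the point x."""
    result = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        result.append((f(x_plus) - f(x_minus)) / (2 * h))
    return result

x_pt = [1.0, 2.0]
y_pt = y_of_x(x_pt)

# Reading 1: gradient with respect to x only, y frozen at y(x_pt).
partial_grad = grad(lambda x: C(x, y_pt), x_pt)

# Reading 2: gradient of the composite g(x) = C(x, y(x)).
total_grad = grad(lambda x: C(x, y_of_x(x)), x_pt)

print("partial gradient, y frozen:", partial_grad)
print("gradient of g:             ", total_grad)
```

If the text that follows the expression in the notes uses ##\vec y## as an independent argument, the first quantity is the one meant; if ##\vec y## is always tied to ##\vec x##, the second one may be.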

SchroedingersLion · Mar 11, 2020

PeroK said:

Okay, so you actually have a function of two vector variables: ##C(\vec x, \vec y) = C(x_1, x_2, x_3, y_1, y_2, y_3)##.

There's the same issue. Are you first reducing ##C## to a function of a single vector ##\vec x##, using ##\vec y = \vec y(\vec x)##? Or are you taking the gradient of ##C## with respect to ##\vec x## and then plugging in ##\vec x_0## and ##\vec y_0 = \vec y(\vec x_0)##?

Which one of these it is depends on the context of what you are doing.

That's essentially what I don't know. I had hoped there was some clear convention =(

PeroK · Mar 11, 2020

SchroedingersLion said:

That's essentially what I don't know. I had hoped there was some clear convention =(

I don't think it's a matter of convention. It's a matter of the author being unambiguous. You should be able to work out which one it is from what follows.
