Proof of Multivariable Chain Rule in higher dimensions

Summary
The discussion centers on proving the multivariable chain rule for functions F: R^m → R^n and G: R^p → R^m, specifically that (F ∘ G)'(x) = F'(G(x)) G'(x). The initial approach involves extending the single-variable chain rule to higher dimensions using the mean value theorem and analyzing the behavior of the functions as h approaches zero. Participants engage in clarifying the notation and structure of the proof, emphasizing the need for summation over the components of the functions involved. There is a focus on ensuring proper indexing and the application of partial derivatives in the context of multivariable calculus. The conversation highlights the complexity of generalizing the proof for arbitrary dimensions.
SpY]

Homework Statement



Let \textbf{F}: \textbf{R}^m \rightarrow \textbf{R}^n and \textbf{G}: \textbf{R}^p \rightarrow \textbf{R}^m

Prove that ({\textbf{F} \circ \textbf{G}})'(\textbf{x}) = {\textbf{F}}'(\textbf{G}(\textbf{x})) \, {\textbf{G}}'(\textbf{x})
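
Here I take \textbf{F}'(\textbf{x}) to mean the Jacobian matrix of partial derivatives, so the claim is that the n \times p Jacobian of \textbf{F} \circ \textbf{G} equals the product of an n \times m matrix with an m \times p matrix:

[\textbf{F}'(\textbf{x})]_{ij} = \frac {\partial f_i}{\partial x_j}(\textbf{x}), \qquad i = 1, \dots, n, \quad j = 1, \dots, m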

Homework Equations


Assume the single-variable chain rule; that is, for
f, g: \textbf{R} \rightarrow \textbf{R}

\frac {d(f \circ g)}{dt}(t) = \frac {df}{dt} \big]_{g(t)} \frac {dg}{dt}(t)
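
As a quick sanity check of this notation, take for instance f(x) = x^2 and g(t) = \sin t:

\frac {d(f \circ g)}{dt}(t) = \frac {d}{dt} \left( \sin^2 t \right) = 2 \sin t \cos t = \frac {df}{dt} \big]_{g(t)} \frac {dg}{dt}(t)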


The Attempt at a Solution


I figured I would use the single-variable result by first extending it to \textbf{R}^2, a sort of sub-proof which uses the mean value theorem:

Let f: \textbf{R}^2 \rightarrow \textbf{R} and \textbf{G}: \textbf{R} \rightarrow \textbf{R}^2

Then

f(\textbf{G}(t+h)) - f(\textbf{G}(t)) = f(G_1(t+h), G_2(t+h)) - f(G_1(t), G_2(t+h)) + f(G_1(t), G_2(t+h)) - f(G_1(t), G_2(t))
The second and third terms cancel each other, so nothing has changed; I will use this splitting below.

Then by the first mean value theorem,
\exists k_1, k_2 \in (0,h) such that

G_1 (t+h) - G_1 (t) = h{G_1}'(t+k_1)

G_2 (t+h) - G_2 (t) = h{G_2}'(t+k_2)
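
The form of the mean value theorem being used here: for a function g: \textbf{R} \rightarrow \textbf{R} that is differentiable on the relevant interval, and for h > 0,

g(t+h) - g(t) = h \, g'(t+k) \quad \text{for some } k \in (0, h)

(For h < 0 the intermediate point lies in (h, 0) instead, which does not change the argument.)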

Expanding the first two terms above by substituting for G_1(t+h):
f(G_1(t+h), G_2(t+h)) - f(G_1(t), G_2(t+h))

= f(h{G_1}'(t+k_1) + G_1(t), G_2(t+h))- f(G_1(t), G_2(t+h))

= h{G_1}'(t+k_1) \frac {\partial f}{\partial x_1} \big]_{(p_1 + G_1(t), G_2(t+h))}

Where p_1 \in (0, h{G_1}'(t+k_1))
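
This step is the mean value theorem once more, applied in the first argument only: writing (just for this step) \phi(s) = f(G_1(t) + s, \, G_2(t+h)) and a = h{G_1}'(t+k_1), and assuming a > 0 and that \frac {\partial f}{\partial x_1} exists on the relevant segment, it gives

\phi(a) - \phi(0) = a \, \phi'(p_1) = h{G_1}'(t+k_1) \, \frac {\partial f}{\partial x_1} \big]_{(p_1 + G_1(t), \, G_2(t+h))} \quad \text{for some } p_1 \in (0, a)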

Similarly, for the last two terms, substituting for G_2(t+h):
f(G_1(t), G_2(t+h)) - f(G_1(t), G_2(t))

= f(G_1(t), h{G_2}'(t+k_2) + G_2(t)) - f(G_1(t), G_2(t))

= h{G_2}'(t+k_2) \frac {\partial f}{\partial x_2} \big]_{(G_1(t), p_2 + G_2(t))}

Where p_2 \in (0, h{G_2}'(t+k_2))

Combining this all together and dividing by h:

\frac {f(\textbf{G}(t+h)) - f(\textbf{G}(t))}{h}

= {G_1}'(t+k_1) \frac {\partial f}{\partial x_1} \big]_{(p_1 + G_1(t), G_2(t+h))} + {G_2}'(t+k_2) \frac {\partial f}{\partial x_2} \big]_{(G_1(t), p_2 + G_2(t))}

Now as h \rightarrow 0, also k_1, k_2, p_1, p_2 \rightarrow 0, since they are contained in intervals that shrink with h. Assuming the partial derivatives of f and the derivatives of G_1, G_2 are continuous (so the evaluation points can be passed to the limit), the LHS becomes the derivative of the composition:

{(f \circ \textbf{G})}'(t) =\lim_{h \to 0} \frac {f(\textbf{G}(t+h)) - f(\textbf{G}(t))}{h}

= \lim_{h \to 0} \left( {G_1}'(t+k_1) \frac {\partial f}{\partial x_1} \big]_{(p_1 + G_1(t), G_2(t+h))} + {G_2}'(t+k_2) \frac {\partial f}{\partial x_2} \big]_{(G_1(t), p_2 + G_2(t))} \right)

= {G_1}'(t) \frac {\partial f}{\partial x_1} \big]_{\textbf{G}(t)} + {G_2}'(t) \frac {\partial f}{\partial x_2} \big]_{\textbf{G}(t)}

= {f}'(\textbf{G}(t)) \, {\textbf{G}}'(t)
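
Written as a matrix product, this is the 1 \times 2 Jacobian of f times the 2 \times 1 Jacobian of \textbf{G}, which is the shape of the identity I am trying to prove in general:

{f}'(\textbf{G}(t)) \, {\textbf{G}}'(t) = \begin{pmatrix} \frac {\partial f}{\partial x_1} \big]_{\textbf{G}(t)} & \frac {\partial f}{\partial x_2} \big]_{\textbf{G}(t)} \end{pmatrix} \begin{pmatrix} {G_1}'(t) \\ {G_2}'(t) \end{pmatrix}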

I've tried generalizing this for any n, but it gets rather long, so I'm not sure how to put it concisely. After that, I don't know how to take it to the general proof (any m, n) as required.

Thanks
 
Rather than going into all the limit stuff, I think there is an easier way.

Let i \in \{1, ..., n\}, j \in \{1, ..., m\}, k \in \{1, ..., p\}.

First we need to have that:

\frac {\partial} {\partial x_k} f_i(g_1(x), ..., g_m(x)) = \sum_{j=1}^m \frac {\partial} {\partial x_j} f_i(g_1(x), ..., g_m(x)) \cdot \frac {\partial} {\partial x_k} g_j(x)

Then we can apply the definitions and say:

(F \circ G)'(x)
= \left( \frac {\partial} {\partial x_k} f_i(g_1(x), ..., g_m(x)) \right)
= \left( \sum_{j=1}^m \frac {\partial} {\partial x_j} f_i(g_1(x), ..., g_m(x)) \cdot \frac {\partial} {\partial x_k} g_j(x) \right)
= \left( \frac {\partial} {\partial x_j} f_i(g_1(x), ..., g_m(x)) \right) \left( \frac {\partial} {\partial x_k} g_j(x) \right)
= F'(G(x)) G'(x)
 
Hmmm ok so let me get this straight: the i refers to elements in f, j to elements in g, and k for partial derivatives in \frac {\partial} {\partial x_k}? Where f: \textbf{R}^n \rightarrow \textbf{R} and g: \textbf{R}^m \rightarrow \textbf{R}? (just to be specific on domains here)

Then shouldn't your first line read

\sum_{i=1}^n \frac {\partial} {\partial x_k} f_i(g_1(x), ..., g_m(x)) = \sum_{j=1}^m \sum_{i=1}^n \frac {\partial} {\partial x_k} f_i(g_1(x), ..., g_m(x)) \cdot \frac {\partial} {\partial x_k} g_j(x)

Because you need to sum over the components of f, otherwise f_i is meaningless, and then you end up with a double sum on the right (over f and g).

Also, shouldn't your first partial derivative on the right be \frac {\partial} {\partial x_k} rather than with respect to x_j? Otherwise it runs up to \frac {\partial} {\partial x_m} because of the sum; or should there be a \sum_{k=1}^p somewhere?

I'm having trouble following your last line as well, because you expand the partial derivative using a sum, then just take the sum away keeping the same indices. Throughout you have the variables i, j, k without the sum in front, when you should be summing to n, m, p.

Thanks for the effort though. If a mentor or homework helper could give input it would be appreciated.
 
