Understanding the Chain Rule in Vector Calculus for Gradient of Scalar Functions


Discussion Overview

The discussion revolves around the application of the chain rule in vector calculus, specifically for taking the gradient of scalar functions that depend on vector inputs. Participants explore different notations, mathematical expressions, and coordinate systems related to this topic.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Conceptual clarification

Main Points Raised

  • One participant seeks clarification on the expression for the gradient of a scalar function f(A) where A is a vector, referencing a source that uses nabla notation.
  • Another participant provides a detailed mathematical expression for the gradient of f(A) in terms of its components and the components of the vector A, suggesting that matrix math can confirm the validity of the expression.
  • A third participant asks how the chain rule would be represented in spherical coordinates, indicating interest in the application of the discussed concepts in different coordinate systems.
  • A later reply introduces an alternative notation for the chain rule, explaining the components involved and how they relate to the gradient of the composition of functions, while questioning the necessity of a particular formulation.

Areas of Agreement / Disagreement

Participants present various interpretations and formulations of the chain rule without reaching a consensus. Different notations and approaches are discussed, but no agreement is established on a single preferred method or representation.

Contextual Notes

The discussion includes various mathematical expressions and notations that may depend on specific conventions or assumptions not fully articulated by participants. The transition between coordinate systems and the implications for the chain rule are also not resolved.

daudaudaudau
Hi. I was looking for a chain rule in vector calculus for taking the gradient of a function such as f(A), where A is a vector and f is a scalar function. I found the following expression on Wikipedia, but I don't understand it. It seems to take the gradient of f, evaluate it at A, and then multiply by nabla A? Can anyone tell me what's going on?

[Image: the chain-rule expression from Wikipedia]
 
Consider ##A = a(x,y,z) \hat x + b(x,y,z) \hat y + c(x,y,z) \hat z## and ##f(A) = f(a,b,c).##
Then ##\nabla f(A) = \hat x \left[\frac{\partial f}{\partial a}\frac{\partial a}{\partial x}+\frac{\partial f}{\partial b}\frac{\partial b}{\partial x}+\frac{\partial f}{\partial c}\frac{\partial c}{\partial x} \right] +
\hat y \left[\frac{\partial f}{\partial a}\frac{\partial a}{\partial y}+\frac{\partial f}{\partial b}\frac{\partial b}{\partial y}+\frac{\partial f}{\partial c}\frac{\partial c}{\partial y} \right]+
\hat z \left[\frac{\partial f}{\partial a}\frac{\partial a}{\partial z}+\frac{\partial f}{\partial b}\frac{\partial b}{\partial z}+\frac{\partial f}{\partial c}\frac{\partial c}{\partial z} \right].##
The gradient of ##f## with respect to its own arguments, evaluated at ##A##, is ##\nabla f = \hat x \frac{\partial f}{\partial a}+ \hat y \frac{\partial f}{\partial b}+ \hat z \frac{\partial f}{\partial c}.##
Also, ## \nabla A = \pmatrix{\frac{\partial a}{\partial x} &\frac{\partial b}{\partial x} & \frac{\partial c}{\partial x}\\
\frac{\partial a}{\partial y} &\frac{\partial b}{\partial y} & \frac{\partial c}{\partial y} \\
\frac{\partial a}{\partial z} &\frac{\partial b}{\partial z} & \frac{\partial c}{\partial z}}##
So if you carry out the matrix math and rearrange the terms, you will see that the equation you found is true.
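The matrix form can also be checked numerically. The following is a minimal sketch, not part of the original reply: the fields `A` and `f`, the sample point, and all function names are hypothetical choices made only for illustration. It multiplies the matrix ##\nabla A## (rows ##\partial/\partial x, \partial/\partial y, \partial/\partial z##; columns ##a, b, c##) by the partial derivatives of ##f## with respect to ##a, b, c##, and compares the result with a central-difference gradient of the composite function.

```python
import math

def A(x, y, z):
    """Example vector field A = (a, b, c); chosen only for illustration."""
    return (x * y, y + z, z * math.sin(x))

def f(a, b, c):
    """Example scalar function of the components of A."""
    return a * a + b * c

def grad_f(a, b, c):
    """Partial derivatives of f with respect to its arguments a, b, c."""
    return (2 * a, c, b)

def nabla_A(x, y, z):
    """Matrix of partials: rows are d/dx, d/dy, d/dz; columns are a, b, c."""
    return [
        [y, 0.0, z * math.cos(x)],  # d/dx of (a, b, c)
        [x, 1.0, 0.0],              # d/dy of (a, b, c)
        [0.0, 1.0, math.sin(x)],    # d/dz of (a, b, c)
    ]

def chain_rule_gradient(p):
    """Gradient of f(A(p)) from the chain rule: nabla_A times grad_f at A(p)."""
    df = grad_f(*A(*p))
    return [sum(row[j] * df[j] for j in range(3)) for row in nabla_A(*p)]

def numerical_gradient(p, h=1e-6):
    """Central-difference gradient of the composite function f(A(p))."""
    grad = []
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        grad.append((f(*A(*plus)) - f(*A(*minus))) / (2 * h))
    return grad

point = (0.7, -1.2, 2.0)  # arbitrary sample point
analytic = chain_rule_gradient(point)
numeric = numerical_gradient(point)
print("chain rule :", analytic)
print("finite diff:", numeric)
```

The two printed vectors should agree to within the finite-difference error, which is what "carrying out the matrix math" amounts to at a single point.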
 
Thanks. How would that look in spherical coordinates?
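
One way to approach the spherical case, sketched here under the assumption that ##a##, ##b##, ##c## are treated as scalar fields of ##(r,\theta,\phi)##, with ##\theta## the polar angle and ##\phi## the azimuth (the physics convention, which the thread does not specify): the chain rule keeps the same form, and only the gradient of each scalar field changes.
$$\nabla f(A) = \frac{\partial f}{\partial a}\nabla a + \frac{\partial f}{\partial b}\nabla b + \frac{\partial f}{\partial c}\nabla c, \qquad \nabla a = \hat r\,\frac{\partial a}{\partial r} + \hat\theta\,\frac{1}{r}\frac{\partial a}{\partial \theta} + \hat\phi\,\frac{1}{r\sin\theta}\frac{\partial a}{\partial \phi},$$
and likewise for ##\nabla b## and ##\nabla c##.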
 
I'm just going to answer the first question in a different notation. I like this version of the chain rule: ##(f\circ g)_{,i}(x) =f_{,j}(g(x)) g^j{}_{,i}(x)##. Here ##_{,i}## denotes partial differentiation with respect to the ##i##th variable, and ##g^i## denotes the real-valued function that takes ##x## to the ##i##th component of ##g(x)##. I'm using the convention of omitting summation sigmas, since the sum is always over the index that appears twice. For example, if I write ##X^i_k Y^k_j##, it means ##\sum_{k=1}^n X^i_k Y^k_j##.

Note that the ##i##th component of ##\nabla(f\circ A)(x)## is ##(f\circ A)_{,i}(x)##.
$$(f\circ A)_{,i}(x)= f_{,j}(A(x))A^j{}_{,i}(x) =\nabla f(A(x))\cdot A_{,i}(x) =(\nabla f\circ A)(x)\cdot A_{,i}(x).$$
I suppose we could also write this as
$$\nabla(f\circ A)(x) =(\nabla f\circ A)(x)\cdot\nabla A(x),$$
but I don't see why we'd want to.
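
As a worked instance of the summation convention, assume ##n=3## and write ##A=(a,b,c)## and ##x=(x_1,x_2,x_3)## (an identification of symbols made here for illustration, not stated in the reply). The ##i=1## component then expands to
$$(f\circ A)_{,1}(x) = f_{,1}(A(x))\,A^1{}_{,1}(x) + f_{,2}(A(x))\,A^2{}_{,1}(x) + f_{,3}(A(x))\,A^3{}_{,1}(x) = \frac{\partial f}{\partial a}\frac{\partial a}{\partial x_1} + \frac{\partial f}{\partial b}\frac{\partial b}{\partial x_1} + \frac{\partial f}{\partial c}\frac{\partial c}{\partial x_1},$$
with the partial derivatives of ##f## evaluated at ##A(x)##. The ##i=2## and ##i=3## components follow by replacing ##x_1## with ##x_2## and ##x_3##.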
 
