Undergrad Why Does the Gradient Point Towards the Greatest Increase?

SUMMARY

The discussion centers on the mathematical proof that the gradient vector points in the direction of the greatest increase of a scalar function at a given point. The participants clarify that the expression del(f)*dl = df is maximized when the gradient del(f) and the direction vector dl align. They emphasize that the total differential does not account for the directionality of the increments, leading to potential confusion. The conversation also touches on the interpretation of differentials in the context of parametric curves and the implications of negative derivatives in multivariable functions.

PREREQUISITES
  • Understanding of multivariable calculus, specifically gradients and directional derivatives.
  • Familiarity with the concept of total differentials in calculus.
  • Knowledge of parametric curves and their representation in mathematical notation.
  • Basic comprehension of differential forms and their applications in calculus.
NEXT STEPS
  • Study the properties of gradients in multivariable calculus.
  • Learn about the geometric interpretation of the gradient vector and its significance in optimization.
  • Explore the theory of differential forms and their role in advanced calculus.
  • Investigate the relationship between directional derivatives and the gradient in various contexts.
USEFUL FOR

Students and professionals in mathematics, physics, and engineering who are looking to deepen their understanding of gradients, optimization techniques, and the mathematical foundations of multivariable functions.

kidsasd987
Hi, I am looking for a proof that explains why the gradient is a vector that points in the direction of greatest increase of a scalar function at a given point p.

http://math.stackexchange.com/quest...always-be-directed-in-an-increasing-direction

I understand the proof there. But the idea is that del(f)*dl = df is maximized when del(f) and dl point in the same direction. So we first have to consider the direction of dl in order to determine where del(f) points.
Suppose we have a multivariable function f(x1, x2, x3, ..., xn),
and say that the derivative with respect to xj is negative at p0
(while the derivatives with respect to all the other variables are positive).

This indicates that the peak is to the left on the graph (in the negative xj direction relative to p0).
Then del(f)*dl = df is maximized when the xj term is df/dxj*(-dxj), because that gives a positive increment df, since the derivative is negative at p0.

We can do this because dl is a vector quantity, so we can choose its direction as we like.
But the total differential doesn't take this into account. It just multiplies each partial derivative by a small increment dxi of the corresponding variable, and the increments all have the same sign.

I think this is a contradiction.
 
I think what's confusing you is the use of ##dx_{j}##. In this context, it doesn't mean a small positive infinitesimal quantity, as it would in the context of integrating over an area, as in ##dA = dx\,dy##. Here, ##dx_{j}## represents a displacement (positive or negative) in the ##x_{j}## coordinate. The author of the post you linked likely uses this notation because it agrees with the theory of differential forms. This works if you always remember that the dx's aren't the fundamental changes and treat them instead as parametric differentials. In other words, let ##\vec{c} = (c_{1}, \dots, c_{n})## be the point at which you want to take the gradient of ##f(x_{1}, \dots, x_{n})## (which must be smooth and non-critical at ##\vec{c}##). Consider an arbitrary smooth regular curve through ##\vec{c}## given by ##\gamma(\epsilon)##, defined over a closed interval ##-a \leq \epsilon \leq +a##, where ##\gamma(0) = \vec{c}##. For an infinitesimally small increment ##d\epsilon## in the curve parameter ##\epsilon##, the displacement from the point ##\vec{c}## along the curve is given by the differential ##d\vec{\mathcal{l}} = \sum_{i} \dot{\gamma}_{i}(0)\, \vec{e}_{i}\, d\epsilon##. In that sense, each component differential ##dx_{i} = \dot{\gamma}_{i}(0)\, d\epsilon## can be either positive or negative, depending on the velocity of the curve ##\gamma## as it passes through the point ##\vec{c}##.
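
As a worked illustration of the sign issue, here is a short sketch. The function ##f(x, y) = x - 2y## is an arbitrary choice for illustration and does not come from the linked proof; ##d\epsilon > 0## is assumed. The first line is just the chain rule along the curve ##\gamma##, and the last line picks the curve whose velocity at ##\vec{c}## is the unit vector along the gradient.

```latex
\begin{align*}
df &= \left.\frac{d}{d\epsilon} f(\gamma(\epsilon))\right|_{\epsilon=0} d\epsilon
    = \sum_{i} \frac{\partial f}{\partial x_{i}}(\vec{c})\, \dot{\gamma}_{i}(0)\, d\epsilon
    = \nabla f(\vec{c}) \cdot d\vec{\mathcal{l}}, \\
f(x, y) &= x - 2y \;\Rightarrow\; \nabla f = (1, -2), \qquad df = dx - 2\,dy, \\
d\vec{\mathcal{l}} &= \frac{(1, -2)}{\sqrt{5}}\, d\epsilon
    \;\Rightarrow\; dx = \frac{d\epsilon}{\sqrt{5}} > 0, \quad
    dy = -\frac{2\, d\epsilon}{\sqrt{5}} < 0, \quad
    df = \sqrt{5}\, d\epsilon .
\end{align*}
```

The component ##dy## comes out negative exactly because ##\partial f / \partial y## is negative, so the total differential already accounts for the direction of each increment, and ##df## is still positive.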
 
Twigg said:
I think what's confusing you is the use of ##dx_{j}##. [...]
Thank you for your reply.
But it still seems unclear to me somehow. I may be asking stupid questions, but everything starts to look suspicious when I think about something too much.

1.
I thought that the increments dxi should be all positive or all negative, because that would give the maximum df.
It is obvious that the increment df depends on the direction of the increment hv at a given point p0.
If we look at the graph of f vs. xj, because the derivative is negative, f should increase in the negative direction.
For each coordinate x1 to xn, we are free to choose the direction (either negative or positive) so as to increase the total differential along dl.
But the differentials dxi have to have the same sign to maximize dl.
2.

By the parametrized differential dx, do you mean dx = lim(eps->0)|eps|?
 
In a very simplistic approach, ##df = \nabla f \cdot d\vec{s}##, where ##d\vec{s} = dx\,\hat{i} + dy\,\hat{j} + dz\,\hat{k}##. This implies ##df = |\nabla f|\,|d\vec{s}| \cos\theta##, where ##\theta## is the angle between ##\nabla f## and ##d\vec{s}##. For a fixed step length ##|d\vec{s}|##, the maximum occurs when ##\theta = 0##, i.e. when ##d\vec{s}## points along ##\nabla f##.
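
The ##\cos\theta## argument can be sanity-checked numerically. The following is only a minimal sketch: the test function ##f(x, y) = x^2 - 3y##, the point (1, 0), the step length, and the 0.1 degree angle grid are all arbitrary assumptions chosen for illustration, not anything from the thread. It scans fixed-length steps in every direction and compares the best one with the normalized gradient.

```python
import math

def f(x, y):
    # Hypothetical test function (an assumption for illustration);
    # its partial derivative with respect to y is negative everywhere.
    return x**2 - 3.0 * y

def grad_f(x, y):
    # Analytic gradient of the test function above.
    return (2.0 * x, -3.0)

def change_in_f(p, angle, step=1e-5):
    # Change in f for a displacement of fixed length `step` at the given angle.
    dx, dy = step * math.cos(angle), step * math.sin(angle)
    return f(p[0] + dx, p[1] + dy) - f(p[0], p[1])

p0 = (1.0, 0.0)  # the gradient here is (2, -3)

# Scan directions on a 0.1 degree grid and keep the one with the largest df.
angles = [2.0 * math.pi * k / 3600 for k in range(3600)]
best_angle = max(angles, key=lambda a: change_in_f(p0, a))
best_dir = (math.cos(best_angle), math.sin(best_angle))

# Unit vector along the gradient, for comparison.
g = grad_f(*p0)
norm = math.hypot(g[0], g[1])
grad_dir = (g[0] / norm, g[1] / norm)

print("best direction found:", best_dir)
print("normalized gradient: ", grad_dir)
```

Up to the resolution of the angle grid, the two printed directions should agree, and the y-component of the best step is negative, matching the negative partial derivative in y.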
 
