Why Does the Gradient Point Towards the Greatest Increase?


Discussion Overview

The discussion revolves around understanding why the gradient of a scalar function points in the direction of the greatest increase at a given point. Participants explore the mathematical and conceptual implications of the gradient in the context of multivariable functions, addressing potential contradictions and clarifications regarding the use of differentials.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • One participant seeks a proof for why the gradient vector indicates the direction of greatest increase, referencing a linked post for context.
  • Another participant clarifies that the notation used for differentials (##dx_{j}##) represents displacements that can be positive or negative, rather than strictly positive infinitesimals.
  • There is a discussion about the implications of choosing directions for the differentials and how this affects the total differential, with one participant expressing confusion about whether all increments should have the same sign to maximize the differential.
  • A later reply introduces a simplified expression for the differential, suggesting that maximum increase occurs when the angle between the gradient and the displacement vector is zero.

Areas of Agreement / Disagreement

Participants express differing views on the interpretation of differentials and their signs in relation to maximizing the total differential. The discussion remains unresolved, with no consensus on the implications of these interpretations.

Contextual Notes

Participants highlight potential confusion regarding the interpretation of differentials in the context of the gradient and its relationship to directional derivatives. There are also mentions of assumptions about the smoothness of functions and the nature of the variables involved.

kidsasd987
Hi, I am looking for a proof that explains why the gradient is a vector that points in the direction of greatest increase of a scalar function at a given point p.

http://math.stackexchange.com/quest...always-be-directed-in-an-increasing-direction

I understand the proof there. The idea is that ##\nabla f \cdot d\vec{l} = df## is maximized when ##\nabla f## and ##d\vec{l}## point in the same direction. But then we first have to consider the direction of ##d\vec{l}## in order to determine where ##\nabla f## points.
Suppose there is a multivariable function ##f(x_1, x_2, x_3, \ldots, x_n)##,
and suppose the partial derivative with respect to ##x_j## is negative at ##p_0##
(while the partial derivatives with respect to all the other variables are positive).

This indicates that, along ##x_j##, the peak is to the left on the graph (in the negative direction from ##p_0##).
Then ##\nabla f \cdot d\vec{l} = df## is maximized by using ##\frac{\partial f}{\partial x_j}(-dx_j)##, because that gives a positive increment ##df##, since the derivative is negative at ##p_0##.

We can do this because ##d\vec{l}## is a vector quantity, so we can choose its direction as we like.
But the total differential doesn't take this into account. It just multiplies each partial derivative by a small increment ##dx_i## of the corresponding variable, and the increments all have the same sign.

I think this is a contradiction.
 
I think what's confusing you is the use of ##dx_{j}##. In this context, it doesn't mean a small positive infinitesimal quantity, as it would when integrating over an area, as in ##dA = dx\,dy##. Here, ##dx_{j}## represents a displacement (positive or negative) in the ##x_{j}## coordinate. The author of the post you linked likely uses this notation because it agrees with the theory of differential forms. This works if you always remember that the ##dx##'s aren't the fundamental changes and treat them instead as parametric differentials. In other words, let ##\vec{c} = (c_{1}, \ldots, c_{n})## be the point at which you want to take the gradient of ##f(x_{1}, \ldots, x_{n})## (which must be smooth and non-critical at ##\vec{c}##). Consider an arbitrary smooth regular curve through ##\vec{c}## given by ##\gamma(\epsilon)##, defined over a closed interval ##-a \leq \epsilon \leq +a##, where ##\gamma(0) = \vec{c}##. For an infinitesimally small increment ##d\epsilon## in the curve parameter ##\epsilon##, the displacement from the point ##\vec{c}## along the curve is given by the differential ##d\vec{\ell} = \sum_{i} \dot{\gamma}_{i}(0)\, \vec{e}_{i}\, d\epsilon##. In that sense, each component differential ##dx_{i} = \dot{\gamma}_{i}(0)\, d\epsilon## can be either positive or negative, depending on the velocity of the curve ##\gamma## as it passes through ##\vec{c}##.
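A small numerical sketch may make the curve picture concrete. It is only an illustration: the test function `f`, the point `c`, and the straight-line curves ##\gamma(\epsilon) = \vec{c} + \epsilon \vec{v}## are assumptions chosen for the example, not something taken from the linked post.

```python
import numpy as np

def f(x):
    # Hypothetical test function: increases with x[0], decreases with x[1].
    return x[0] ** 2 - 3.0 * x[1]

def grad_f(x):
    # Analytic gradient of the test function above.
    return np.array([2.0 * x[0], -3.0])

c = np.array([1.0, 2.0])   # point where the gradient is evaluated
d_eps = 1e-6               # small increment of the curve parameter

def df_along(velocity):
    # Straight-line curve gamma(eps) = c + eps * velocity, so gamma'(0) = velocity.
    velocity = np.asarray(velocity, dtype=float)
    dl = velocity * d_eps            # displacement; its components are the dx_i
    exact = f(c + dl) - f(c)         # actual change of f along the curve
    linear = grad_f(c) @ dl          # first-order prediction grad f . dl
    return dl, exact, linear

for v in ([1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]):
    dl, exact, linear = df_along(v)
    print(v, dl, exact, linear)
```

Of the three velocities tried, the one with a negative second component gives the largest increase, because the partial derivative along that coordinate is negative at `c`.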
 
Twigg said:
I think what's confusing you is the use of ##dx_{j}##. In this context, it doesn't mean a small positive infinitesimal quantity [...] ##dx_{j}## is meant to represent a displacement (be it positive or negative) in the ##x_{j}## coordinate. [...]
Thank you for your reply.
But it still seems somewhat unclear to me. I may be asking stupid questions, but everything starts to look suspicious when I think too much about something.

1. I thought that the increments ##dx_i## should be all positive or all negative, because that would give the maximum ##df##.
It is obvious that the increment ##df## depends on the direction of the increment ##hv## at a given point ##p_0##.
If we look at the graph of ##f## versus ##x_j##, because the derivative is negative, ##f## should increase in the negative direction.
For each coordinate ##x_1## to ##x_n##, we are free to choose the direction (either negative or positive) to increase the total differential sum of ##dl##.
But the differentials ##dx_i## have to have the same sign to maximize ##dl##.

2. By the parametrized differential ##dx##, do you mean ##dx = \lim_{\epsilon \to 0} |\epsilon|##?
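The sign question in point 1 can be checked by brute force. The sketch below assumes a hypothetical gradient at ##p_0## with one negative component and tries every combination of signs for increments of equal magnitude; the values of `grad` and `step` are made up for illustration.

```python
import itertools
import numpy as np

# Hypothetical gradient at p0: the partial derivative along x_2 is negative,
# the others are positive (mirroring the situation described in the question).
grad = np.array([2.0, -3.0, 1.0])
step = 1e-3  # magnitude |dx_i| used for every coordinate

best_signs, best_df = None, -np.inf
for signs in itertools.product([1.0, -1.0], repeat=grad.size):
    dx = step * np.array(signs)   # each dx_i is +step or -step
    df = grad @ dx                # total differential: sum_i (df/dx_i) * dx_i
    if df > best_df:
        best_signs, best_df = signs, df

# A step of the same length taken exactly along the gradient, for comparison.
dx_grad = step * np.sqrt(grad.size) * grad / np.linalg.norm(grad)
df_grad = grad @ dx_grad

print(best_signs, best_df, df_grad)
```

The winning sign pattern matches the signs of the partial derivatives rather than being all positive or all negative, and the step along the gradient itself does at least as well as any of the equal-magnitude sign patterns.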
 
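In a very simplistic approach, ##df = \nabla f \cdot d\vec{s}##, where ##d\vec{s} = dx\,\hat{i} + dy\,\hat{j} + dz\,\hat{k}##. This implies ##df = |\nabla f|\,|d\vec{s}| \cos\theta##, where ##\theta## is the angle between ##\nabla f## and ##d\vec{s}##. For a step of fixed length ##|d\vec{s}|##, the maximum occurs when ##\cos\theta = 1##, i.e. ##\theta = 0##, so ##d\vec{s}## points along ##\nabla f##.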
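The same conclusion can be written without angles, using the Cauchy-Schwarz inequality. This is a standard argument sketched here under the assumptions that the step length ##|d\vec{s}|## is held fixed and ##\nabla f \neq 0##; the scale factor ##\lambda## is a symbol introduced only for this sketch.

```latex
\begin{aligned}
df &= \nabla f \cdot d\vec{s}
    = \sum_{i} \frac{\partial f}{\partial x_i}\, dx_i
    \;\le\; |\nabla f|\,|d\vec{s}|, \\
\text{with equality iff}\quad
d\vec{s} &= \lambda\, \nabla f,\ \lambda > 0
\quad\Longrightarrow\quad
dx_i = \lambda\, \frac{\partial f}{\partial x_i}.
\end{aligned}
```

Each ##dx_i## in the maximizing step therefore carries the sign of its own partial derivative, so a negative ##\partial f / \partial x_j## goes with a negative ##dx_j##, and the increments need not all share one sign.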
 
