Directional derivative and the gradient - confused.

In summary, the thread discusses the definition of the directional derivative. The original poster quotes the limit definition from Apostol and believes it does not depend on the size of the vector y, while the gradient formula grad(f)(x)·y clearly does. It turns out that both expressions depend on the length of y: the limit as quoted is the derivative with respect to a vector, which is linear in y, and it equals the directional derivative only when y is a unit vector. The thread also covers the proof that f'(x;-y) = -f'(x;y), where the original attempt forgot to change the sign of h in the denominator after substituting h --> -h.
  • #1
Tomer
Hello, thanks for reading!

I am slightly confused. According to the definition of the directional derivative, calculated at the point x in the direction y,
f'(x;y) = [itex]\lim_{h\rightarrow0}\frac{f(\vec{x} + h\vec{y})-f(\vec{x})}{h}[/itex]
According to this definition, the directional derivative seems not to depend on the size of the vector y, which makes intuitive sense.

It is also true for differentiable functions: f'(x;y) = [itex]grad(f)(x)\cdot\vec{y}[/itex]
However, this definition seems to depend on the size of y.
What am I missing?

Furthermore, I can't seem to be able to prove that if f'(x;y) >0 for a certain x,y then f'(x;-y)<0 (which I think is true, and can also be verified using the gradient relation).

This is probably incredibly dumb but I thought of it and I can't seem to understand it.
 
  • #2
Tomer said:
Hello, thanks for reading!

I am slightly confused. According to the definition of the directional derivative, calculated at the point x in the direction y,
f'(x;y) = [itex]\lim_{h\rightarrow0}\frac{f(\vec{x} + h\vec{y})-f(\vec{x})}{h}[/itex]
According to this definition, the directional derivative seems not to depend on the size of the vector y, which makes intuitive sense.

It is also true for differentiable functions: f'(x;y) = [itex]grad(f)(x)\cdot\vec{y}[/itex]
However, this definition seems to depend on the size of y.
What am I missing?
Both definitions you have quoted are only for unit vectors, i.e. iff [itex]|\vec{y}| = 1[/itex]. Indeed, both do depend on the length of the vector, [itex]\vec{y}[/itex]. If you want to use a non-normalized vector, [itex]\vec{v} : |\vec{v}|\neq0[/itex], then you must use the following definition

[tex]\nabla_{\vec{v}} f(\vec{x}) = \lim_{h\rightarrow0+} \frac{f(\vec{x}+h\vec{v}) - f(\vec{x})}{h|\vec{v}|}[/tex]

or the equivalent gradient definition.

Tomer said:
Furthermore, I can't seem to be able to prove that if f'(x;y) >0 for a certain x,y then f'(x;-y)<0 (which I think is true, and can also be verified using the gradient relation).
Well, what exactly have you tried?
 
  • #3
Thanks for replying.

Well, I'm reading Apostle (volume 2), and here's the definition he gives for the directional derivative at the point [itex]\vec{x}[/itex] and direction [itex]\vec{v}[/itex]:

Given a scalar field f: S -> R, where S[itex]\subseteq R^{n}[/itex], let [itex]\vec{x}[/itex] be an interior point of S and let [itex]\vec{v}[/itex] be an arbitrary point in [itex]R^{n}[/itex]. The derivative of f at [itex]\vec{x}[/itex] with respect to [itex]\vec{v}[/itex] is denoted by the symbol
f'([itex]\vec{x}[/itex]; [itex]\vec{v}[/itex]) and is defined by the equation:

f'([itex]\vec{x}[/itex]; [itex]\vec{v}[/itex]) = [itex]\lim_{h\rightarrow0}\frac{f(\vec{x} + h\vec{v}) - f(\vec{x})}{h}[/itex]

He doesn't mention anywhere the fact that [itex]\vec{v}[/itex] is a unit vector. Is it wrong then? I made sure I copied the definition 1-1.

When I follow your definition (or the world's :-) ) I see the dependence of course.

About what I tried:

Well, I tried playing with the one-sided limits, and I actually seem to get the unwanted result: that the two derivatives in the directions [itex]\vec{v}[/itex] and [itex]-\vec{v}[/itex] are equal:

f'([itex]\vec{x}[/itex]; [itex]\vec{v}[/itex]) = [itex]\lim_{h\rightarrow0^{+}}\frac{f(\vec{x} + h\vec{v}) - f(\vec{x})}{h}[/itex] = [itex]\lim_{h\rightarrow0^{-}}\frac{f(\vec{x} + h(-\vec{v})) - f(\vec{x})}{h}[/itex] = f'([itex]\vec{x}[/itex]; [itex]-\vec{v}[/itex])
(if we ignore the ||v|| factor for a second; it shouldn't have an effect)

The first and last equalities follow from the fact that both one-sided limits exist and are equal to the limit itself.
The middle equality follows from a direct substitution h --> -h.

So, did I shake the foundations of maths or am I missing something? 0_0
 
  • #4
Tomer said:
Well, I'm reading Apostle (volume 2), and here's the definition he gives for the directional derivative at the point [itex]\vec{x}[/itex] and direction [itex]\vec{v}[/itex]:

Given a scalar field f: S -> R, where S[itex]\subseteq R^{n}[/itex], let [itex]\vec{x}[/itex] be an interior point of S and let [itex]\vec{v}[/itex] be an arbitrary point in [itex]R^{n}[/itex]. The derivative of f at [itex]\vec{x}[/itex] with respect to [itex]\vec{v}[/itex] is denoted by the symbol
f'([itex]\vec{x}[/itex]; [itex]\vec{v}[/itex]) and is defined by the equation:

f'([itex]\vec{x}[/itex]; [itex]\vec{v}[/itex]) = [itex]\lim_{h\rightarrow0}\frac{f(\vec{x} + h\vec{v}) - f(\vec{x})}{h}[/itex]

He doesn't mention anywhere the fact that [itex]\vec{v}[/itex] is a unit vector. Is it wrong then? I made sure I copied the definition 1-1.

When I follow your definition (or the world's :-) ) I see the dependence of course.
As quoted, the definition is obviously false.
Tomer said:
Well, I tried playing with the one-sided limits, and I actually seem to get the unwanted result: that the two derivatives in the directions [itex]\vec{v}[/itex] and [itex]-\vec{v}[/itex] are equal:

f'([itex]\vec{x}[/itex]; [itex]\vec{v}[/itex]) = [itex]\lim_{h\rightarrow0^{+}}\frac{f(\vec{x} + h\vec{v}) - f(\vec{x})}{h}[/itex] = [itex]\lim_{h\rightarrow0^{-}}\frac{f(\vec{x} + h(-\vec{v})) - f(\vec{x})}{h}[/itex] = f'([itex]\vec{x}[/itex]; [itex]-\vec{v}[/itex])
(if we ignore the ||v|| factor for a second; it shouldn't have an effect)

The first and last equalities follow from the fact that both one-sided limits exist and are equal to the limit itself.
The middle equality follows from a direct substitution h --> -h.

So, did I shake the foundations of maths or am I missing something? 0_0
In the second equality, you forgot to change the sign of the h in the denominator.
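
To spell the substitution out (a sketch, assuming the two-sided limit exists): putting h = -k in the quotient for [itex]-\vec{v}[/itex] changes the sign of the argument inside f and of the denominator at the same time, so

[tex]f'(\vec{x}; -\vec{v}) = \lim_{h\rightarrow0}\frac{f(\vec{x} + h(-\vec{v})) - f(\vec{x})}{h} = \lim_{k\rightarrow0}\frac{f(\vec{x} + k\vec{v}) - f(\vec{x})}{-k} = -f'(\vec{x}; \vec{v})[/tex]

The minus sign in the result comes entirely from the denominator.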
 
  • #5
How disturbing. Apostle uses it again and again and all the extensions are defined similarly.

Aside from that - I realize the equalities are trivial, which only strengthens the fact that this definition is false. h must remain strictly positive then.
I cannot see therefore a way to manipulate the "-" sign inside the function. You think you could give a tip? This is not homework, I'm learning the material alone.
 
  • #6
Oh - heck, I just realized I have made a stupid mistake. I forgot that the denominator h also changes sign to -h, which gives me the minus I wanted!
I'm sorry, that was pretty dumb, as I suspected :-)

I'm still confused about the definition though. So h must remain positive? I realize this makes sense, I just cannot fathom how Apostle would write a false definition. Since the gradient was also defined through partial derivatives, which are defined through the relation I've written, I would expect everything to remain consistent, but then there's this thing with the dependence of the directional derivative on the size of the direction vector.
 
  • #7
Tomer said:
Oh - heck, I just realized I have made a stupid mistake. I forgot that the denominator h also changes sign to -h, which gives me the minus I wanted!
I'm sorry, that was pretty dumb, as I suspected :-)
Not a problem, we all make stupid mistakes. Even I didn't catch it first time through.
Tomer said:
I'm still confused about the definition though. So h must remain positive? I realize this makes sense, I just cannot fathom how Apostle would write a false definition. Since the gradient was also defined through partial derivatives, which are defined through the relation I've written, I would expect everything to remain consistent, but then there's this thing with the dependence of the directional derivative on the size of the direction vector.
Yes, it is convention that h is non-negative. You can easily understand why. If we did not put a restriction on the sign of h, then the directional derivative with respect to [itex]\vec{u}[/itex] could be in the direction [itex]-\vec{u}[/itex] or equally [itex]\vec{u}[/itex]. Do you see?

Something to point out: the directional derivative does not depend on the length of the vector. It is only your incorrect definition that does. The directional derivative is defined such that the direction vector is normalized and hence independent of length.

As for Apostol's "incorrect" definition do not worry about it. He could have said elsewhere that the vector must have unit length and you have just missed it. The other option is that your edition contains a typo that propagated through the text.
 
  • #8
Apostol :-)

Alright, I'll assume v is a unit vector from now on.

Thank you very much!
 
  • #9
Hootenanny said:
Something to point out: the directional derivative does not depend on the length of the vector. It is only your incorrect definition that does. The directional derivative is defined such that the direction vector is normalized and hence independent of length.

How standard is the definition of directional derivative? There are pages on the web (e.g. Wolfram's) that agree with you and there are pages that use Apostol's definition. Since Apostol's notation includes the v, it can be regarded as being more general than the definition that assumes v is a unit vector. I'm curious whether Apostol has any internal inconsistency in his book with respect to directional derivatives.
 
  • #10
Stephen Tashi said:
How standard is the definition of directional derivative? There are pages on the web (e.g. Wolfram's) that agree with you and there are pages that use Apostol's definition. Since Apostol's notation includes the v, it can be regarded as being more general than the definition that assumes v is a unit vector. I'm curious whether Apostol has any internal inconsistency in his book with respect to directional derivatives.

That's exactly what bothers me.
Since the gradient itself is also defined through partial derivatives, which are defined through directional derivatives in the direction of the axes - I would think the gradient is somehow also "altered" to be consistent with the rest (which is pretty disturbing as well). I also followed the proofs roughly and found myself agreeing with them (not that that means much). I made sure of it - he doesn't mention anywhere that v is a unit vector.

However, the really disturbing thing is, as I mentioned, the contradiction between his definition of the directional derivative (in which, by the way, "h" isn't even restricted to be positive, which seems totally wrong!) apparently not depending on ||v||, whereas the gradient relation grad(f)·v has an obvious dependence on it.

Since reading Hootenanny's answer I just assume v is a unit vector.
 
  • #11
In the derivative of a real valued function of a real number, h isn't required to be positive. f(x+h) - f(x) and h both change "direction" when h is given a different sign, so it doesn't matter. Is there something about the case of a function of a vector that is different?
 
  • #12
Tomer said:
Hello, thanks for reading!

I am slightly confused. According to the definition of the directional derivative, calculated at the point x in the direction y,
f'(x;y) = [itex]\lim_{h\rightarrow0}\frac{f(\vec{x} + h\vec{y})-f(\vec{x})}{h}[/itex]
According to this definition, the directional derivative seems not to depend on the size of the vector y, which makes intuitive sense.

Your definition depends on the length of the vector y. It is equal to df(y), which is linear in y. The direction that y determines is represented by the unit vector that points along y. Your definition is the directional derivative when y is that unit vector.
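
A quick numerical check of this linearity, as a minimal sketch: the test function `f`, the point, and the vectors below are hypothetical choices made only for illustration, and `difference_quotient` is just the quotient from the quoted definition evaluated at a small fixed h.

```python
def f(x, y):
    # Hypothetical test function, chosen only for illustration.
    return x**2 * y

def difference_quotient(func, point, v, h=1e-6):
    """Forward quotient (f(p + h*v) - f(p)) / h from the quoted definition."""
    px, py = point
    vx, vy = v
    return (func(px + h * vx, py + h * vy) - func(px, py)) / h

point = (1.0, 2.0)
# Analytic gradient of f is (2xy, x^2), evaluated at the point.
grad = (2 * point[0] * point[1], point[0] ** 2)

v = (3.0, 4.0)    # length 5
v2 = (6.0, 8.0)   # same direction, twice the length
u = (0.6, 0.8)    # unit vector along v

d_v = difference_quotient(f, point, v)
d_v2 = difference_quotient(f, point, v2)
d_u = difference_quotient(f, point, u)
grad_dot_v = grad[0] * v[0] + grad[1] * v[1]

print(d_v, d_v2, d_u, grad_dot_v)
```

Up to the discretisation error in h, the quotient for `v` agrees with grad(f)·v, doubling the vector doubles the value, and the unit vector `u` gives the value for `v` divided by |v| = 5.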
 
  • #13
Stephen Tashi said:
In the derivative of a real valued function of a real number, h isn't required to be positive. f(x+h) - f(x) and h both change "direction" when h is given a different sign, so it doesn't matter. Is there something about the case of a function of a vector that is different?

I'd be happy if someone would clear that up. I'm also confused.
 
  • #14
The difference is that in higher dimensional space, 2, 3, or even higher dimensions, we have more "degrees of freedom". In one dimension, we can only approach a point "from the left" or "from the right". In two dimensions, we can approach along an infinite number of lines or upon curves such as parabolas or spirals. You may have seen examples of functions of two variables where approaching along the x or y axes gives the same thing but along a slanted line another. There are even examples where approaching a point along any straight line gives the same limit, but approaching along a parabola gives a different result.

Here is an example from Salas, Hille, and Etgen, edition 4, page 860:
[tex]f(x,y)= \frac{x^2y}{x^4+ y^2}[/tex]
as (x,y) goes to (0,0).
Along any line y= mx, that becomes
[tex]f(x, mx)= \frac{x^2(mx)}{x^4+ (mx)^2}= \frac{mx^3}{x^4+ m^2x^2}= \frac{mx}{x^2+ m^2}[/tex]
For m not 0, the limit of that, as x goes to 0, is 0 no matter what m is (and for m = 0 the function is identically 0 along the line).
(The vertical line cannot be written "y= mx" but it is x= 0 so the function becomes
[tex]f(0, y)= \frac{0}{0+ y^2}= 0[/tex]
for all non-zero y so the limit is still 0.)

But if we approach (0, 0) along the curve [itex]y= x^2[/itex] we have
[tex]f(x, x^2)= \frac{x^2(x^2)}{x^4+ (x^2)^2}= \frac{x^4}{2x^4}[/tex]
which, for x not 0, is just 1/2. The limit as x approaches 0 is 1/2, not 0.
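
The two behaviours can also be seen numerically. The following is a minimal sketch that simply evaluates the function above along one line and along the parabola; the helper names `along_line` and `along_parabola` and the slope m = 2 are choices made here for illustration.

```python
def f(x, y):
    # The function from the example: x^2 y / (x^4 + y^2), for (x, y) != (0, 0).
    return x**2 * y / (x**4 + y**2)

def along_line(m, t):
    # Value of f at the point (t, m*t) on the line y = m x.
    return f(t, m * t)

def along_parabola(t):
    # Value of f at the point (t, t^2) on the parabola y = x^2.
    return f(t, t**2)

for t in (1e-1, 1e-2, 1e-3):
    print(t, along_line(2.0, t), along_parabola(t))
```

As t shrinks, the values along the line shrink towards 0 while the values along the parabola stay at about 1/2.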

Notice, by the way, that the standard definition of the "derivative of f at [itex]x_0[/itex]",
[tex]\lim_{h\to 0}\frac{f(x_0+h)- f(x_0)}{h}[/tex]
can be restated as
[tex]\lim_{x\to x_0}\frac{f(x)- f(x_0)}{d(x, x_0)}[/tex]
where [itex]x= x_0+ h[/itex] and, in one dimension, [itex]d(x, x_0)= |x- x_0|[/itex] is just the distance between x and [itex]x_0[/itex].

That is, we must have a "vector space" structure on the range space (values of f) because we must be able to subtract [itex]f(x)- f(x_0)[/itex] and multiply by the number, [itex]1/d(x, x_0)[/itex]. But we do NOT need a "vector space structure" on the domain space! We only need to be able to calculate the distance between two points- we need a metric space. That's why, normally, we do not think of the arguments of functions as vectors but just talk about "functions of several variables".
 
  • #15
Couldn't have said it better myself, Halls.

Just to clarify a further point regarding the unit direction vector. It is entirely legitimate to take a derivative along a vector, and this vector needn't be normalised. In that case, the value of the resulting derivative scales with the length of the vector along which we are taking the derivative. On the other hand, when taking the directional derivative, or the derivative in a given direction, the reference vector is assumed to be normalised, since we are only interested in the direction and do not want to "scale" the derivative with respect to the length of the reference vector.

The distinction between directional derivative and the derivative along a vector is an important one, and I should have made this point earlier. Unfortunately, the two terms are often used synonymously.
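
A worked example may make the distinction concrete (the function, point, and vector are picked here purely for illustration). Take [itex]f(x,y) = x^2 + 3y[/itex] at [itex]\vec{x} = (1,2)[/itex], so that [itex]\nabla f(\vec{x}) = (2,3)[/itex], and let [itex]\vec{v} = (3,4)[/itex] with [itex]|\vec{v}| = 5[/itex]. The derivative along [itex]\vec{v}[/itex] and the directional derivative in the direction of [itex]\vec{v}[/itex] are then

[tex]\nabla f(\vec{x})\cdot\vec{v} = 2\cdot 3 + 3\cdot 4 = 18, \qquad \nabla f(\vec{x})\cdot\frac{\vec{v}}{|\vec{v}|} = \frac{18}{5} = 3.6[/tex]

Replacing [itex]\vec{v}[/itex] by [itex]2\vec{v}[/itex] doubles the first number to 36 and leaves the second unchanged.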
 
  • #16
I agree that for a function of several variables, there can be different values of limits at a point depending on how we approach that point.

But my "point" is that when a definition refers to "limit" in the singular, it is defining something in the case when the value of the limit is independent of how we approach. With respect to the scalar h approaching zero, if there is more than one value of the limit depending on the direction of approach then the definition doesn't apply and there is no defined value of whatever-we-were-defining at such a point.

I gather that Apostol's definition of the directional derivative of a scalar valued function f(.) calculated at the point x in the direction y is [itex] f_y(x) = \lim_{h \rightarrow 0} \frac{f(\vec{x} + h\vec{y})-f(\vec{x})}{h}[/itex].

There are various facts about this definition:
1. The definition does not assert that the scalar [itex] h > 0 [/itex]
2. The definition implies that [itex]f_{y}(x) = -f_{-y}(x)[/itex] when both limits exist.
3. The value of [itex] f_{y}(x) [/itex] depends on the magnitude of the vector [itex]y[/itex]

As I see it, none of these facts contradict the others.

For example, the left and right handed limits can be equal
[tex]\lim_{h \rightarrow 0_+} \frac{f(\vec{x} + h\vec{y})-f(\vec{x})}{h} = \lim_{h \rightarrow 0_-} \frac{f(\vec{x} + h\vec{y})-f(\vec{x})}{h} [/tex]
without contradicting the fact that
[tex] \lim_{h \rightarrow 0} \frac{f(\vec{x} + h(-\vec{y}))-f(\vec{x})}{h} = - \lim_{h \rightarrow 0} \frac{f(\vec{x} + h\vec{y})-f(\vec{x})}{h} [/tex]
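
Both statements can be checked numerically on a smooth function. This is only a sketch: the function `f`, the point `x0`, and the vector `y` are hypothetical, and a small fixed |h| stands in for the limit.

```python
def f(x, y):
    # Hypothetical smooth test function, for illustration only.
    return x * y + y**2

def quotient(point, v, h):
    # The quotient (f(p + h*v) - f(p)) / h for a given nonzero h of either sign.
    px, py = point
    vx, vy = v
    return (f(px + h * vx, py + h * vy) - f(px, py)) / h

x0 = (1.0, 2.0)
y = (3.0, 4.0)
minus_y = (-3.0, -4.0)

right = quotient(x0, y, 1e-6)              # h approaching 0 from above
left = quotient(x0, y, -1e-6)              # h approaching 0 from below
reversed_dir = quotient(x0, minus_y, 1e-6) # same point, direction -y

print(right, left, reversed_dir)
```

The one-sided values for `y` agree with each other to within the discretisation error, and the value for `minus_y` is their negative.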
 
  • #17
Actually, upon re-reading Apostol's definition as posted above, he does not mention the term "directional derivative". It is Tomer that asserts that this is the directional derivative. Apostol only refers to the derivative with respect to a vector. In this case, Apostol's definition is fine as long as we are mindful of the distinction between directional derivative and the derivative with respect to a vector.
 
  • #18
Stephen Tashi said:
[tex]\lim_{h \rightarrow 0_+} \frac{f(\vec{x} + h\vec{y})-f(\vec{x})}{h} = \lim_{h \rightarrow 0_-} \frac{f(\vec{x} + h\vec{y})-f(\vec{x})}{h} [/tex]
And
[tex]\lim_{h \rightarrow 0_+} \frac{f(\vec{x} + h\vec{y})-f(\vec{x})}{h} = \lim_{h \rightarrow 0_-} \frac{f(\vec{x} + h\vec{y})-f(\vec{x})}{h} = - \lim_{h \rightarrow 0_+} \frac{f(\vec{x} + h(-\vec{y}))-f(\vec{x})}{h}[/tex]
So Stephen, do these expressions represent the directional derivative with respect to [itex]\vec{y}[/itex] or [itex]-\vec{y}[/itex]?
 
  • #19
So Stephen, do these expressions represent the directional derivative with respect to [itex]\vec{y}[/itex] or [itex]-\vec{y}[/itex]?

The derivative with respect to [itex] \vec y [/itex] is:
[tex]f_{\vec y}(\vec x) = \lim_{h \rightarrow 0} \frac{f(\vec{x} + h\vec{y})-f(\vec{x})}{h} [/tex]

The derivative with respect to [itex] -\vec y [/itex] is:
[tex]f_{-\vec y}(\vec x) = \lim_{h \rightarrow 0} \frac{f(\vec{x} + h(-\vec{y}))-f(\vec{x})}{h} [/tex]
 
  • #20
You're right, Hootenanny, I wasn't aware of the difference and wasn't careful with my words!
 

What is a directional derivative?

A directional derivative is a measure of how a function changes in the direction of a given vector. It is used in multivariate calculus to find the rate of change of a function at a specific point in a particular direction.

What is the difference between directional derivative and partial derivative?

The partial derivative measures the rate of change of a function along one coordinate axis. The directional derivative generalizes this to the rate of change in any given direction; a partial derivative is the directional derivative along a coordinate unit vector.

How is the directional derivative calculated?

The directional derivative is calculated by taking the dot product of the gradient of the function and the unit vector in the direction of interest. This can also be represented as the product of the magnitude of the gradient and the cosine of the angle between the gradient and the direction vector.

What does the gradient represent?

The gradient represents the direction and magnitude of the steepest ascent of a function at a given point. It is a vector that points in the direction of greatest increase, and its magnitude is the rate of change in that direction.

How is the gradient used in optimization problems?

The gradient is used in optimization problems to find the direction of steepest increase of a function; the negative gradient gives the direction of steepest decrease. By repeatedly taking small steps along the gradient (gradient ascent) or against it (gradient descent), we can approach a local maximum or minimum of the function, provided the step size is chosen suitably.
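
A minimal sketch of gradient descent, under stated assumptions: the quadratic objective, the starting point, the step size 0.1, and the iteration count are all hypothetical values chosen so that the iteration converges.

```python
def grad_f(x, y):
    # Gradient of the hypothetical objective f(x, y) = (x - 1)^2 + 2*(y + 2)^2.
    return 2 * (x - 1), 4 * (y + 2)

def gradient_descent(start, step=0.1, iterations=200):
    x, y = start
    for _ in range(iterations):
        gx, gy = grad_f(x, y)
        # Step against the gradient to decrease f.
        x -= step * gx
        y -= step * gy
    return x, y

minimum = gradient_descent((5.0, 5.0))
print(minimum)
```

For this objective the iterates approach the minimiser (1, -2). A step size that is too large would make the same loop diverge, which is why the step has to be chosen with some care.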
