Numerical computation of the derivative

In summary: the relative step ##\varepsilon x## is an attempt to avoid floating-point pitfalls, but floating-point arithmetic introduces inaccuracy in its own right.
  • #1
Gaussian97
Homework Helper
TL;DR Summary
Question about an algorithm to compute the derivative of a function.
I'm not sure if this is the correct forum for this question, or whether I should post it in a math forum. But I was looking at some code when I found a 'strange' implementation to compute the derivative of a function, and I wanted to know if any of you has an idea of why such an implementation is used.

The formula used is
$$f'(x) = \frac{f((1+\varepsilon)x)-f((1-\varepsilon)x)}{2\varepsilon x}$$
Of course, ##\varepsilon## should be a small number. I know that there are many ways to implement the derivative of a function numerically, and obviously, the formula above does indeed converge to the derivative in the limit ##\varepsilon\to 0## in the case of differentiable functions.

My question is if someone has also used this formula instead of the usual ##f'(x) = \frac{f(x+\varepsilon)-f(x-\varepsilon)}{2\varepsilon}## or if someone knows any advantage for using this alternative formula.

Something that may be important is that the formula is used to compute derivatives of functions that are only defined in the interval ##(0,1)##, so I thought maybe this formula has some advantage when ##x \sim 0## or ##x \sim 1##?
For example, this formula has the advantage that even if ##x<\varepsilon## the argument will never be smaller than 0, which probably is one of the reasons for using it.

Does anyone have any information?
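For concreteness, here is a minimal sketch of the two variants in Python (the function names and the ##\log## example are mine, not from the original code):

Python:
import math

def deriv_absolute_step(f, x, eps=1e-6):
    # Usual central difference with a fixed step eps.
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def deriv_relative_step(f, x, eps=1e-6):
    # The variant in question: the step eps*x scales with x.
    return (f((1 + eps) * x) - f((1 - eps) * x)) / (2 * eps * x)

x = 2e-6                                   # near the lower end of (0, 1)
print(1 / x)                               # exact derivative of log at x
print(deriv_absolute_step(math.log, x))    # fixed step is half of x here: visibly off
print(deriv_relative_step(math.log, x))    # step eps*x is tiny relative to x: much closer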
 
  • Like
Likes JD_PM and homeyn
  • #2
Gaussian97 said:
The formula used is
$$f'(x) = \frac{f((1+\varepsilon)x)-f((1-\varepsilon)x)}{2\varepsilon x}$$
Does anyone have any information?
Maybe they are just avoiding a divide by zero?
 
  • #3
valenumr said:
Maybe they are just avoiding a divide by zero?
Uh, actually, that doesn't seem right at all. The denominator should not contain x. If epsilon is really small, it won't make a huge difference, but it is wrong. Just think of a simple function like y = 4x, which should give an exact result for any epsilon. It doesn't work with x in the denominator.
 
  • #4
Gaussian97 said:
so maybe I thought this formula has some advantage when ##x \sim 0## or ##x \sim 1##?
Quite the contrary. This formula has a distinct disadvantage when ##x=0##. It will not give the derivative there. It will be undefined.
EDIT: I missed the fact that ##x## is restricted to (0,1).
 
Last edited:
  • Like
Likes valenumr
  • #5
Gaussian97 said:
The formula used is
$$f'(x) = \frac{f((1+\varepsilon)x)-f((1-\varepsilon)x)}{2\varepsilon x}$$
My question is if someone has also used this formula instead of the usual ##f'(x) = \frac{f(x+\varepsilon)-f(x-\varepsilon)}{2\varepsilon}## or if someone knows any advantage for using this alternative formula.

Floating point arithmetic is inherently inaccurate - particularly if you're dealing with very small or very large numbers. You don't want [itex]1/\epsilon[/itex] to overflow, you don't want [itex]x \pm \epsilon [/itex] to be rounded to [itex]x[/itex] and you don't want [itex]|f(x + \epsilon) - f(x - \epsilon)|[/itex] to be rounded to zero.

This is an attempt to avoid those problems.
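For example (an illustrative snippet, not from the code in question; the magnitudes are outside the (0,1) range discussed here, but they show the rounding effect in double precision):

Python:
eps = 1e-9
x = 1e8

# For a large x, x + eps rounds back to exactly x, so the difference
# quotient's numerator collapses to zero.
print(x + eps == x)           # True: eps is below the spacing of doubles near 1e8
print((x + eps) - (x - eps))  # 0.0

# A relative step eps*x stays resolvable at any magnitude of x.
print(x * (1 + eps) == x)     # False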
 
  • Like
Likes PhDeezNutz and jim mcnamara
  • #6
pasmith said:
Floating point arithmetic is inherently inaccurate - particularly if you're dealing with very small or very large numbers. You don't want [itex]1/\epsilon[/itex] to overflow, you don't want [itex]x \pm \epsilon [/itex] to be rounded to [itex]x[/itex] and you don't want [itex]|f(x + \epsilon) - f(x - \epsilon)|[/itex] to be rounded to zero.

This is an attempt to avoid those problems.
I'm still thinking this is wrong. If x is limited to the interval (0,1), x+e is guaranteed to be less than e.
 
  • #7
valenumr said:
I'm still thinking this is wrong. If x is limited to the interval (0,1), x+e is guaranteed to be less than e.
Oops, x*e
 
  • #8
I'm going to go with: this has no justification. It is flat out wrong. I already gave a counterexample where you can choose any value for epsilon that would work with the correct math. It's a little weird that you can take the limit and it's mostly okay, but I don't see how it's applicable.
 
  • #9
valenumr said:
I'm going to go with: this has no justification. It is flat out wrong. I already gave a counterexample where you can choose any value for epsilon that would work with the correct math. It's a little weird that you can take the limit and it's mostly okay, but I don't see how it's applicable.
Offer only applies to polynomials of degree two or higher.
 
  • #10
pasmith said:
Floating point arithmetic is inherently inaccurate - particularly if you're dealing with very small or very large numbers. You don't want [itex]1/\epsilon[/itex] to overflow, you don't want [itex]x \pm \epsilon [/itex] to be rounded to [itex]x[/itex] and you don't want [itex]|f(x + \epsilon) - f(x - \epsilon)|[/itex] to be rounded to zero.

This is an attempt to avoid those problems.
I think that is a good point for large values of ##x##, where ##x+\epsilon## would be truncated to ##x##. I don't see the other point about ##f(x+\epsilon)## or ##1/\epsilon##. The OP states that ##x## is restricted to (0,1).
 
  • #11
FactChecker said:
I think that is a good point for large values of ##x##, where ##x+\epsilon## would be truncated to ##x##. I don't see the other point about ##f(x+\epsilon)## or ##1/\epsilon##. The OP states that ##x## is restricted to (0,1).
I'm at a loss coming up with a case where this is valid. I mean, mathematically, it's just wrong.
 
  • #12
FactChecker said:
I think that is a good point for large values of ##x##, where ##x+\epsilon## would be truncated to ##x##. I don't see the other point about ##f(x+\epsilon)## or ##1/\epsilon##. The OP states that ##x## is restricted to (0,1).
And that still doesn't justify x in the denominator.
 
  • #13
valenumr said:
And that still doesn't justify x in the denominator.
If ##x## is in (0,1) and we let ##h=x\epsilon##, this seems just like the standard definition.
 
  • #14
FactChecker said:
If ##x## is in (0,1) and we let ##h=x\epsilon##, this seems just like the standard definition.
I've never seen this definition of a derivative, and it really only "works" for functions of degree two or higher. And it's still not right, but might give okay answers in the extreme limit. Again: pick a linear function and do the math. It won't matter how big or small epsilon is; the answer will be exact if you remove x from the denominator. For all higher-order functions, the answer will be close if epsilon is small, but it won't be correct.
 
  • #15
valenumr said:
Maybe they are just avoiding a divide by zero?
Mmm... I don't see how this would work; if ##\varepsilon## is so small that it gives problems with division by zero, then since ##x<1##, dividing by ##x\varepsilon## would be even worse, no?
valenumr said:
Uh, actually, that doesn't seem right at all. The denominator should not contain x. If epsilon is really small, it won't make a huge difference, but it is wrong. Just think of a simple function like y=4x which should have exact solutions for epsilon. It doesn't work with x in the denominator.
Mmm... Mathematically, for a fixed ##x## the expression for ##\varepsilon\to 0## converges to the derivative. Actually, for the case you mention ##f(x)=4x## we would have
$$f'(x)=\frac{4(1+\varepsilon)x - 4(1-\varepsilon)x}{2\varepsilon x} = \frac{4x(1+\varepsilon-1+\varepsilon)}{2\varepsilon x}=4$$
so it gives the correct result independent of the value of ##\varepsilon##.

pasmith said:
Floating point arithmetic is inherently inaccurate - particularly if you're dealing with very small or very large numbers. You don't want [itex]1/\epsilon[/itex] to overflow, you don't want [itex]x \pm \epsilon [/itex] to be rounded to [itex]x[/itex] and you don't want [itex]|f(x + \epsilon) - f(x - \epsilon)|[/itex] to be rounded to zero.

This is an attempt to avoid those problems.
Yes, I also had this kind of thing in mind, but as I said, because ##x\varepsilon < \varepsilon##, if ##1/\varepsilon## overflows then ##1/(x\varepsilon)## should be even worse, no? Also, similarly, if ##x\pm \varepsilon## gets rounded to ##x##, again the substitution ##\varepsilon \to x\varepsilon## should make things worse.
 
  • #16
Gaussian97 said:
Actually, for the case you mention ##f(x)=4x## we would have
$$f'(x)=\frac{4(1+\varepsilon)x - 4(1-\varepsilon)x}{2\varepsilon x} = \frac{4x(1+\varepsilon-1+\varepsilon)}{2\varepsilon x}=4$$
so it gives the correct result independent of the value of ##\varepsilon##.
No, I think your assertion is wrong in the general case. My point was that if you actually choose a non-zero value for epsilon, it fails as an approximation. Look at your equation and choose a small but non-zero value for epsilon and a large-ish value for x. It doesn't make sense.
Eh, the divide by zero was just a first thought thinking out loud. It doesn't actually make sense.
 
  • #17
Let's start with the well-known expression ## f'(x) \approx \frac{f(x+h)-f(x-h)}{2h} ##. The accuracy of the approximation clearly depends on the (i) magnitude of ## h ##, and also (ii) on the characteristics of the function in the neighborhood of ## x ## (and in particular the magnitude of the discarded terms of the Taylor expansion). We have no control over (ii) and so in order to attempt to achieve the desired accuracy we choose the value of ## h ## to use at a particular ## x ## using some method ## h(x) ##.

Whoever has implemented the algorithm has clearly decided that setting ## h = \varepsilon x ## where ## \varepsilon ## is presumably constant* gives the level of accuracy they desire over the range in which they wish to approximate ## f'(x) ##. Without knowing much about ## f(x) ## or anything else it is impossible to judge whether this is a good choice or not, and certainly not possible to state that this leads to an 'incorrect' approximation.

*[Edit] or is perhaps a factor which is reduced iteratively in order to estimate accuracy
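As a rough sketch of that footnoted idea (the stopping rule and the names below are my own assumptions, not taken from the code being discussed):

Python:
import math

def deriv_adaptive(f, x, eps=1e-2, tol=1e-8, max_halvings=40):
    # Shrink the relative step until the estimate stops changing appreciably.
    prev = (f((1 + eps) * x) - f((1 - eps) * x)) / (2 * eps * x)
    for _ in range(max_halvings):
        eps /= 2
        cur = (f((1 + eps) * x) - f((1 - eps) * x)) / (2 * eps * x)
        if abs(cur - prev) <= tol * max(1.0, abs(cur)):
            return cur
        prev = cur
    return prev

print(deriv_adaptive(math.sin, 0.3), math.cos(0.3))  # the two should agree closely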
 
Last edited:
  • Like
Likes sysprog
  • #18
pbuk said:
Let's start with the well-known expression ## f'(x) \approx \frac{f(x+h)-f(x-h)}{2h} ##. The accuracy of the approximation clearly depends on the (i) magnitude of ## h ##, and also (ii) on the characteristics of the function in the neighborhood of ## x ## (and in particular the magnitude of the discarded terms of the Taylor expansion). We have no control over (ii) and so in order to attempt to achieve the desired accuracy we choose the value of ## h ## to use at a particular ## x ## using some method ## h(x) ##.

Whoever has implemented the algorithm has clearly decided that setting ## h = \varepsilon x ## where ## \varepsilon ## is presumably constant gives the level of accuracy they desire over the range in which they wish to approximate ## f'(x) ##. Without knowing much about ## f(x) ## or anything else it is impossible to judge whether this is a good choice or not, and certainly not possible to state that this leads to an 'incorrect' approximation.
But the question remains: why would one do that? It is obviously inaccurate in the general case. I wonder what set of functions this may apply to, and also why it might be computationally efficient.
 
  • #19
Thread closed temporarily for Moderation and cleanup...
 
  • #20
Off topic posts removed and thread is re-opened. Thanks.
 
  • #21
pbuk said:
Let's start with the well-known expression ## f'(x) \approx \frac{f(x+h)-f(x-h)}{2h} ##. The accuracy of the approximation clearly depends on the (i) magnitude of ## h ##, and also (ii) on the characteristics of the function in the neighborhood of ## x ## (and in particular the magnitude of the discarded terms of the Taylor expansion). We have no control over (ii) and so in order to attempt to achieve the desired accuracy we choose the value of ## h ## to use at a particular ## x ## using some method ## h(x) ##.

Whoever has implemented the algorithm has clearly decided that setting ## h = \varepsilon x ## where ## \varepsilon ## is presumably constant gives the level of accuracy they desire over the range in which they wish to approximate ## f'(x) ##. Without knowing much about ## f(x) ## or anything else it is impossible to judge whether this is a good choice or not, and certainly not possible to state that this leads to an 'incorrect' approximation.
Well, for sure the method will work better or worse depending on the particular function. I was asking just in case any of you knew some good reason to use such a method instead of the usual one, like the already mentioned fact that using ##f((1-\varepsilon)x)## works for arbitrarily small values of ##x## while ##f(x-\varepsilon)## gives problems for ##x\leq \varepsilon## (which may just be the only reason).
##\varepsilon## is supposed to be a fixed small constant.
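To make the domain point concrete, a small illustration with a function that is only defined for positive arguments (the use of ##\log## is just my example):

Python:
import math

eps = 1e-6
x = 1e-7          # a valid point in (0, 1), but smaller than eps

# Fixed step: x - eps < 0, so the evaluation leaves the domain entirely.
try:
    math.log(x - eps)
except ValueError as err:
    print("fixed step fails:", err)

# Relative step: (1 - eps)*x is still positive, so the formula stays usable.
print(math.log((1 - eps) * x))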

valenumr said:
No, I think your assertion is wrong in the general case. My point was that if you actually choose a non-zero value for epsilon, it fails as an approximation. Look at your equation and choose a small but non-zero value for epsilon and a large-ish value for x. It doesn't make sense.
I'm not sure I understand your point here.
 
  • #22
Gaussian97 said:
Yes, I also have this kind of thing in mind, but as I said, because ##x\varepsilon < \varepsilon##, if ##1/\varepsilon## overflows then ##1/(x\varepsilon)## should be even worse, no? Also similarly, if ##x\pm \varepsilon## gets round to ##x##, again the substitution ##\varepsilon \to x\varepsilon## should make things worse.

This is true.

However there is also the problem that using [itex]x \pm \epsilon[/itex] with a fixed [itex]\epsilon[/itex] doesn't give a good local approximation when [itex]|x| < \epsilon[/itex]; using [itex]x \pm \epsilon x[/itex] avoids this, and in choosing [itex]\epsilon[/itex] one has to strike a balance between having a good local approximation and avoiding floating-point errors.
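A quick way to see that balance numerically (the function and the values below are just an illustration, not from the original code):

Python:
import math

def d_rel(f, x, eps):
    return (f((1 + eps) * x) - f((1 - eps) * x)) / (2 * eps * x)

# For f = exp at a fixed x, the error first falls (truncation ~ eps^2) and
# then rises again once cancellation/rounding in the numerator dominates.
x = 0.5
for eps in (1e-2, 1e-4, 1e-6, 1e-8, 1e-10, 1e-12):
    err = abs(d_rel(math.exp, x, eps) - math.exp(x))
    print(f"eps = {eps:.0e}   error = {err:.2e}")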
 
  • #23
Gaussian97 said:
Well, for sure the method will work better or worse depending on the particular function. I was asking just in case any of you knew some good reason to use such a method instead of the usual one, like the already mentioned fact that using ##f((1-\varepsilon)x)## works for arbitrarily small values of ##x## while ##f(x-\varepsilon)## gives problems for ##x\leq \varepsilon## (which may just be the only reason).
##\varepsilon## is supposed to be a fixed small constant.

I'm not sure I understand your point here.
Ok, that was hasty and misguided. You are absolutely correct. But it doesn't work for, say, a quadratic. Thinking about it more, though, is it just perhaps including the division by x as an algorithmic step? Also, if x is small, it would otherwise amplify the derivative, would it not?
 
  • #24
FactChecker said:
I think that is a good point for large values of ##x##, where ##x+\epsilon## would be truncated to ##x##. I don't see the other point about ##f(x+\epsilon)## or ##1/\epsilon##. The OP states that ##x## is restricted to (0,1).
It is still possible to choose a value of e suitable for large values of x, but I see that if x is small, it can cause a large deviation from the true derivative, so I'm curious as well as to why this is the choice.
 
  • #25
valenumr said:
I've never seen this definition of a derivative, and it really only "works" for functions of degree two or higher. And it's still not right, but might give okay answers in the extreme limit. Again: pick a linear function and do the math. It won't matter how big or small epsilon is; the answer will be exact if you remove x from the denominator. For all higher-order functions, the answer will be close if epsilon is small, but it won't be correct.
If ##x## is in (0,1) and ##h=\epsilon x##, then
##f'(x) = \lim_{\epsilon \to 0}\frac{f((1+\epsilon)x)-f((1-\epsilon)x)}{2\epsilon x} = \lim_{h \to 0}\frac{f(x+h)-f(x-h)}{2h}##
I think that this is a valid definition of the derivative of ##f## at ##x##.
 
  • Like
Likes pbuk
  • #26
FactChecker said:
If ##x## is in (0,1) and ##h=\epsilon x##, then
##f'(x) = \lim_{\epsilon \to 0}\frac{f((1+\epsilon)x)-f((1-\epsilon)x)}{2\epsilon x} = \lim_{h \to 0}\frac{f(x+h)-f(x-h)}{2h}##
I think that this is a valid definition of the derivative of ##f## at ##x##.
Well, to be fair, everything I said was completely wrong. But let's consider f(x) = x^2. Now I think the limit is constant (==1?), when it should be 2x. I'll defer to the fact that there may be a good reason for this specific implementation, but I don't see it being correct, and I don't see how it could be more efficient or numerically more accurate.
 
  • #27
valenumr said:
Well, to be fair, everything I said was completely wrong. But let's consider f(x) = x^2. Now I think the limit is constant (==1?), when it should be 2x. I'll defer to the fact that there may be a good reason for this specific implementation, but I don't see it being correct, and I don't see how it could be more efficient or numerically more accurate.
Actually, if you compute the value for ##f(x)=x^2## you get ##f'(x)=2x##, which is not only correct, but independent of ##\varepsilon##. Indeed you can prove that the error of the algorithm goes like
$$\frac{x^2\varepsilon^2}{6}f'''(x)+\mathcal{O}(\varepsilon^4)$$
so it gives exact results for all quadratic polynomials.
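A quick numerical check of both claims (the polynomials are just my own examples):

Python:
def d_rel(f, x, eps):
    return (f((1 + eps) * x) - f((1 - eps) * x)) / (2 * eps * x)

def quad(t):
    return 3 * t**2 - 2 * t + 1   # derivative: 6t - 2

def cube(t):
    return t**3                   # derivative: 3t^2

x = 0.3
print(d_rel(quad, x, 1e-2) - (6 * x - 2))        # ~0 up to rounding, even for a large eps
for eps in (1e-1, 1e-2, 1e-3):
    print(eps, d_rel(cube, x, eps) - 3 * x**2)   # error shrinks ~100x per step, i.e. ~eps^2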
 
  • Like
Likes FactChecker
  • #28
Gaussian97 said:
Actually, if you compute the value for ##f(x)=x^2## you get ##f'(x)=2x##, which is not only correct, but independent of ##\varepsilon##. Indeed you can prove that the error of the algorithm goes like
$$\frac{x^2\varepsilon^2}{6}f'''(x)+\mathcal{O}(\varepsilon^4)$$
so it gives exact results for all quadratic polynomials.
I'm on my phone, but I get the numerator as (x^2 + 2ex + e^2) - (x^2 - 2ex + e^2), which simplifies to 4ex. I think that's right. So divide by 2ex and you get 2 (not 1, but still a constant). I am a numerical person, I do lots of "computer math", so I'd really like to understand this more if it is correct.
 
  • #29
valenumr said:
I'm on my phone, but I get the numerator as (x^2 + 2ex + e^2) - (x^2 - 2ex + e^2), which simplifies to 4ex. I think that's right. So divide by 2ex and you get 2 (not 1, but still a constant). I am a numerical person, I do lots of "computer math", so I'd really like to understand this more if it is correct.
No, the numerator is $$(1 + 2\varepsilon + \varepsilon^2)x^2 - (1 - 2\varepsilon + \varepsilon^2)x^2 = 4\varepsilon x^2$$
 
  • #30
Gaussian97 said:
No, the numerator is $$(1 + 2\varepsilon + \varepsilon^2)x^2 - (1 - 2\varepsilon + \varepsilon^2)x^2 = 4\varepsilon x^2$$
OOOOH 🤯, I see it now. I was substituting very wrongly.
 
  • #31
valenumr said:
OOOOH 🤯, I see it now. I was substituting very wrongly.
So out of curiosity and going back to the original question... Is this an efficient and / or more accurate approach? And if so, why? We don't know anything about f(x), and you provided the error limits. Is that optimal, or at least pretty good?
 
  • #32
The main difference between this "two-sided" limit and the common "one-sided" limit is that this will give the "average slope" at a point where the derivative has valid but unequal left- and right-side values. That is desired in some contexts.
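A small illustration of that point (the kink function and the forward-difference comparison are my own example):

Python:
def d_rel(f, x, eps=1e-6):
    return (f((1 + eps) * x) - f((1 - eps) * x)) / (2 * eps * x)

def d_forward(f, x, eps=1e-6):
    return (f((1 + eps) * x) - f(x)) / (eps * x)

def kink(t):
    return abs(t - 0.5)      # slope -1 to the left of 0.5, +1 to the right

print(d_rel(kink, 0.5))      # ~0, the average of -1 and +1
print(d_forward(kink, 0.5))  # ~1, the one-sided (right) slope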
 
  • #33
FactChecker said:
The main difference between this "two-sided" limit and the common "one-sided" limit is that this will give the "average slope" at a point where the derivative has valid but unequal left- and right-side values. That is desired in some contexts.
I got that part right away, but it was the x in the denominator that lost me. Now that I'm on track, I'm curious about the implementation, and this is up my alley as a computational guy.
 
  • #34
valenumr said:
I got that part right away, but it was the x in the denominator that lost me. Now that I'm on track, I'm curious about the implementation, and this is up my alley as a computational guy.
I think that, given equal amounts of round-off and truncation errors, this will tend to give a more accurate estimate at the midpoint value than a one-sided calculation would. This is just a gut feeling on my part and my knowledge of the numerical issues is too old to be reliable.
 
  • Like
Likes pbuk
  • #35
FactChecker said:
I think that, given equal amounts of round-off and truncation errors, this will tend to give a more accurate estimate at the midpoint value than a one-sided calculation would. This is just a gut feeling on my part and my knowledge of the numerical issues is too old to be reliable.
Ok, thanks. I think if I dig into @Gaussian97 post #27, it will shake out.
 
