Numerical computation of the derivative

AI Thread Summary
The discussion centers on a numerical method for computing the derivative of a function using the formula f'(x) = (f((1+ε)x) - f((1-ε)x)) / (2εx), where ε is a small number. Participants debate the validity of this approach compared to the more common formula f'(x) = (f(x+ε) - f(x-ε)) / (2ε). Concerns are raised about the potential mathematical inaccuracies, particularly when x approaches 0, as the formula becomes undefined at that point. Some argue that this method may help avoid issues with floating-point arithmetic, while others assert that it may not yield accurate results for all functions. Ultimately, the conversation highlights the complexities and considerations in numerical differentiation techniques.
Gaussian97
Homework Helper
TL;DR Summary
Question about an algorithm to compute the derivative of a function.
I'm not sure if this is the correct forum to post this question, or whether I should post it in a math forum. But I was looking at some code when I found a 'strange' implementation to compute the derivative of a function, and I wanted to know if any of you has an idea of why such an implementation is used.

The formula used is
$$f'(x) = \frac{f((1+\varepsilon)x)-f((1-\varepsilon)x)}{2\varepsilon x}$$
Of course, ##\varepsilon## should be a small number. I know that there are many ways to implement the derivative of a function numerically, and obviously, the formula above does indeed converge to the derivative in the limit ##\varepsilon\to 0## in the case of differentiable functions.

My question is whether anyone else has used this formula instead of the usual ##f'(x) = \frac{f(x+\varepsilon)-f(x-\varepsilon)}{2\varepsilon}##, or whether anyone knows of any advantage to using this alternative formula.

Something that may be important is that the formula is used to compute derivatives of functions that are only defined in the interval ##(0,1)##, so I thought maybe this formula has some advantage when ##x \sim 0## or ##x \sim 1##?
For example, this formula has the advantage that even if ##x<\varepsilon## the argument will never be smaller than 0, which is probably one of the reasons for using it.

Does anyone have any information?
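
For concreteness, here is a minimal sketch of the two formulas side by side (the function names and the test function are just for illustration, not from the code in question):
Python:
# Minimal sketch (illustrative names): the relative-step formula from the code
# versus the usual absolute-step central difference.
import math

def deriv_relative(f, x, eps=1e-6):
    # f'(x) ~ [f((1+eps)*x) - f((1-eps)*x)] / (2*eps*x)
    return (f((1 + eps) * x) - f((1 - eps) * x)) / (2 * eps * x)

def deriv_absolute(f, x, eps=1e-6):
    # f'(x) ~ [f(x+eps) - f(x-eps)] / (2*eps)
    return (f(x + eps) - f(x - eps)) / (2 * eps)

f = math.log                  # only defined for positive arguments
x = 1e-8                      # here x < eps
print(deriv_relative(f, x))   # arguments stay positive; result ~ 1/x
# deriv_absolute(f, x) would call f with a negative argument and raise an error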
 
  • Like
Likes JD_PM and homeyn
Gaussian97 said:
The formula used is
$$f'(x) = \frac{f((1+\varepsilon)x)-f((1-\varepsilon)x)}{2\varepsilon x}$$
[...]
Does anyone have any information?
Maybe they are just avoiding a divide by zero?
 
valenumr said:
Maybe they are just avoiding a divide by zero?
Uh, actually, that doesn't seem right at all. The denominator should not contain x. If epsilon is really small, it won't make a huge difference, but it is wrong. Just think of a simple function like y = 4x, which should give an exact result for any epsilon. It doesn't work with x in the denominator.
 
Gaussian97 said:
so I thought maybe this formula has some advantage when ##x \sim 0## or ##x \sim 1##?
Quite the contrary. This formula has a distinct disadvantage when ##x=0##. It will not give the derivative there. It will be undefined.
EDIT: I missed the fact that ##x## is restricted to (0,1).
 
Last edited:
  • Like
Likes valenumr
Gaussian97 said:
The formula used is
$$f'(x) = \frac{f((1+\varepsilon)x)-f((1-\varepsilon)x)}{2\varepsilon x}$$
[...]
My question is whether anyone else has used this formula instead of the usual ##f'(x) = \frac{f(x+\varepsilon)-f(x-\varepsilon)}{2\varepsilon}##, or whether anyone knows of any advantage to using this alternative formula.

Floating point arithmetic is inherently inaccurate - particularly if you're dealing with very small or very large numbers. You don't want ##1/\epsilon## to overflow, you don't want ##x \pm \epsilon## to be rounded to ##x##, and you don't want ##|f(x + \epsilon) - f(x - \epsilon)|## to be rounded to zero.

This is an attempt to avoid those problems.
 
  • Like
Likes PhDeezNutz and jim mcnamara
pasmith said:
Floating point arithmetic is inherently inaccurate - particularly if you're dealing with very small or very large numbers. You don't want ##1/\epsilon## to overflow, you don't want ##x \pm \epsilon## to be rounded to ##x##, and you don't want ##|f(x + \epsilon) - f(x - \epsilon)|## to be rounded to zero.

This is an attempt to avoid those problems.
I'm still thinking this is wrong. If x is limited to the interval (0,1), x+e is guaranteed to be less than e.
 
valenumr said:
I'm still thinking this is wrong. If x is limited to the interval (0,1), x+e is guaranteed to be less than e.
Oops, x*e
 
I'm going to go with: this has no justification. It is flat-out wrong. I already gave a counterexample where you can choose any value for epsilon that would work with the correct math. It's a little weird that you can take the limit and it's mostly okay, but I don't see how it's applicable.
 
valenumr said:
I'm going to go with: this has no justification. It is flat-out wrong. I already gave a counterexample where you can choose any value for epsilon that would work with the correct math. It's a little weird that you can take the limit and it's mostly okay, but I don't see how it's applicable.
Offer only applies to polynomials of degree two or higher.
 
  • #10
pasmith said:
Floating point arithmetic is inherently inaccurate - particularly if you're dealing with very small or very large numbers. You don't want ##1/\epsilon## to overflow, you don't want ##x \pm \epsilon## to be rounded to ##x##, and you don't want ##|f(x + \epsilon) - f(x - \epsilon)|## to be rounded to zero.

This is an attempt to avoid those problems.
I think that is a good point for large values of ##x##, where ##x+\epsilon## would be truncated to ##x##. I don't see the other point about ##f(x+\epsilon)## or ##1/\epsilon##. The OP states that ##x## is restricted to (0,1).
 
  • #11
FactChecker said:
I think that is a good point for large values of ##x##, where ##x+\epsilon## would be truncated to ##x##. I don't see the other point about ##f(x+\epsilon)## or ##1/\epsilon##. The OP states that ##x## is restricted to (0,1).
I'm at a loss coming up with a case where this is valid. I mean, mathematically, it's just wrong.
 
  • #12
FactChecker said:
I think that is a good point for large values of ##x##, where ##x+\epsilon## would be truncated to ##x##. I don't see the other point about ##f(x+\epsilon)## or ##1/\epsilon##. The OP states that ##x## is restricted to (0,1).
And that still doesn't justify x in the denominator.
 
  • #13
valenumr said:
And that still doesn't justify x in the denominator.
If ##x## is in (0,1) and let ##h=x\epsilon##, this seems just like the standard definition.
 
  • #14
FactChecker said:
If ##x## is in (0,1) and let ##h=x\epsilon##, this seems just like the standard definition.
I've never seen this definition of a derivative, and it really only "works" for functions of degree two or higher. And it's still not right, but might give okay answers in the extreme limit. Again: pick a linear function and do the math. It won't matter how big or small epsilon is; the answer will be exact, if you remove x from the denominator. For all higher-order functions, the answer will be close if epsilon is small, but it won't be correct.
 
  • #15
valenumr said:
Maybe they are just avoiding a divide by zero?
Mmm... I don't see how this would work: if ##\varepsilon## is so small that it gives problems with division by zero, then since ##x<1##, dividing by ##x\varepsilon## would be even worse, no?
valenumr said:
Uh, actually, that doesn't seem right at all. The denominator should not contain x. If epsilon is really small, it won't make a huge difference, but it is wrong. Just think of a simple function like y=4x which should have exact solutions for epsilon. It doesn't work with x in the denominator.
Mmm... Mathematically, for a fixed ##x## the expression for ##\varepsilon\to 0## converges to the derivative. Actually, for the case you mention ##f(x)=4x## we would have
$$f'(x)=\frac{4(1+\varepsilon)x - 4(1-\varepsilon)x}{2\varepsilon x} = \frac{4x(1+\varepsilon-1+\varepsilon)}{2\varepsilon x}=4$$
so it gives the correct result independent of the value of ##\varepsilon##.

pasmith said:
Floating point arithmetic is inherently inaccurate - particularly if you're dealing with very small or very large numbers. You don't want ##1/\epsilon## to overflow, you don't want ##x \pm \epsilon## to be rounded to ##x##, and you don't want ##|f(x + \epsilon) - f(x - \epsilon)|## to be rounded to zero.

This is an attempt to avoid those problems.
Yes, I also have this kind of thing in mind, but as I said, because ##x\varepsilon < \varepsilon##, if ##1/\varepsilon## overflows then ##1/(x\varepsilon)## should be even worse, no? Also similarly, if ##x\pm \varepsilon## gets rounded to ##x##, again the substitution ##\varepsilon \to x\varepsilon## should make things worse.
 
  • #16
Gaussian97 said:
Mmm... I don't see how this would work: if ##\varepsilon## is so small that it gives problems with division by zero, then since ##x<1##, dividing by ##x\varepsilon## would be even worse, no?

Mmm... Mathematically, for a fixed ##x## the expression for ##\varepsilon\to 0## converges to the derivative. Actually, for the case you mention ##f(x)=4x## we would have
$$f'(x)=\frac{4(1+\varepsilon)x - 4(1-\varepsilon)x}{2\varepsilon x} = \frac{4x(1+\varepsilon-1+\varepsilon)}{2\varepsilon x}=4$$
so it gives the correct result independent of the value of ##\varepsilon##.

Yes, I also have this kind of thing in mind, but as I said, because ##x\varepsilon < \varepsilon##, if ##1/\varepsilon## overflows then ##1/(x\varepsilon)## should be even worse, no? Also similarly, if ##x\pm \varepsilon## gets rounded to ##x##, again the substitution ##\varepsilon \to x\varepsilon## should make things worse.
No, I think your assertion is wrong in the general case. My point was that if you actually choose a non-zero value for epsilon, it fails as an approximation. Look at your equation and choose a small but non-zero value for epsilon and a large(ish) value for x. It doesn't make sense.
Gaussian97 said:
Mmm... I don't see how this would work: if ##\varepsilon## is so small that it gives problems with division by zero, then since ##x<1##, dividing by ##x\varepsilon## would be even worse, no?
[...]
Eh, the divide by zero was just a first thought thinking out loud. It doesn't actually make sense.
 
  • #17
Let's start with the well-known expression ## f'(x) \approx \frac{f(x+h)-f(x-h)}{2h} ##. The accuracy of the approximation clearly depends on the (i) magnitude of ## h ##, and also (ii) on the characteristics of the function in the neighborhood of ## x ## (and in particular the magnitude of the discarded terms of the Taylor expansion). We have no control over (ii) and so in order to attempt to achieve the desired accuracy we choose the value of ## h ## to use at a particular ## x ## using some method ## h(x) ##.

Whoever has implemented the algorithm has clearly decided that setting ## h = \varepsilon x ## where ## \varepsilon ## is presumably constant* gives the level of accuracy they desire over the range in which they wish to approximate ## f'(x) ##. Without knowing much about ## f(x) ## or anything else it is impossible to judge whether this is a good choice or not, and certainly not possible to state that this leads to an 'incorrect' approximation.

*[Edit] or is perhaps a factor which is reduced iteratively in order to estimate accuracy
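
For illustration, a sketch of the ##h(x)## idea (the names and the cubic test function are made up, not the code in question):
Python:
# Sketch: the step size used at each x comes from a rule h(x),
# here either the relative rule h = eps*x or a fixed step h = eps.

def central_diff(f, x, step_rule):
    h = step_rule(x)
    return (f(x + h) - f(x - h)) / (2 * h)

eps = 1e-6
relative_step = lambda x: eps * x   # the choice made in the quoted code
fixed_step = lambda x: eps          # the "usual" choice

g = lambda x: x**3                  # g'(0.5) = 0.75 exactly
print(central_diff(g, 0.5, relative_step))
print(central_diff(g, 0.5, fixed_step))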
 
Last edited:
  • Like
Likes sysprog
  • #18
pbuk said:
Let's start with the well-known expression ## f'(x) \approx \frac{f(x+h)-f(x-h)}{2h} ##. The accuracy of the approximation clearly depends on the (i) magnitude of ## h ##, and also (ii) on the characteristics of the function in the neighborhood of ## x ## (and in particular the magnitude of the discarded terms of the Taylor expansion). We have no control over (ii) and so in order to attempt to achieve the desired accuracy we choose the value of ## h ## to use at a particular ## x ## using some method ## h(x) ##.

Whoever has implemented the algorithm has clearly decided that setting ## h = \varepsilon x ## where ## \varepsilon ## is presumably constant gives the level of accuracy they desire over the range in which they wish to approximate ## f'(x) ##. Without knowing much about ## f(x) ## or anything else it is impossible to judge whether this is a good choice or not, and certainly not possible to state that this leads to an 'incorrect' approximation.
But the question remains: why would one do that? It is obviously inaccurate in the general case. I wonder what set of functions this may apply to, and also why it might be computationally efficient.
 
  • #19
Thread closed temporarily for Moderation and cleanup...
 
  • #20
Off topic posts removed and thread is re-opened. Thanks.
 
  • #21
pbuk said:
Let's start with the well-known expression ## f'(x) \approx \frac{f(x+h)-f(x-h)}{2h} ##. The accuracy of the approximation clearly depends on the (i) magnitude of ## h ##, and also (ii) on the characteristics of the function in the neighborhood of ## x ## (and in particular the magnitude of the discarded terms of the Taylor expansion). We have no control over (ii) and so in order to attempt to achieve the desired accuracy we choose the value of ## h ## to use at a particular ## x ## using some method ## h(x) ##.

Whoever has implemented the algorithm has clearly decided that setting ## h = \varepsilon x ## where ## \varepsilon ## is presumably constant gives the level of accuracy they desire over the range in which they wish to approximate ## f'(x) ##. Without knowing much about ## f(x) ## or anything else it is impossible to judge whether this is a good choice or not, and certainly not possible to state that this leads to an 'incorrect' approximation.
Well, for sure the method will work better or worse depending on the particular function. I was asking just in case any of you knew some good reason to use such a method instead of the usual one, like the already mentioned fact that using ##f((1-\varepsilon)x)## works for arbitrarily small values of ##x## while ##f(x-\varepsilon)## gives problems for ##x\leq \varepsilon## (which maybe is the only reason).
##\varepsilon## is supposed to be a fixed small constant.

valenumr said:
No, I think your assertion is wrong in the general case. My point was that if you actually choose a non-zero value for epsilon, it fails as an approximation. Look at your equation and choose a small but non-zero value for epsilon and a large(ish) value for x. It doesn't make sense.
I'm not sure I understand your point here.
 
  • #22
Gaussian97 said:
Yes, I also have this kind of thing in mind, but as I said, because ##x\varepsilon < \varepsilon##, if ##1/\varepsilon## overflows then ##1/(x\varepsilon)## should be even worse, no? Also similarly, if ##x\pm \varepsilon## gets rounded to ##x##, again the substitution ##\varepsilon \to x\varepsilon## should make things worse.

This is true.

However, there is also the problem that using ##x \pm \epsilon## with a fixed ##\epsilon## doesn't give a good local approximation when ##|x| < \epsilon##; using ##x + \epsilon x## avoids this, and in choosing ##\epsilon## one has to strike a balance between having a good local approximation and avoiding floating-point errors.
 
  • #23
Gaussian97 said:
Well, for sure the method will work better or worse depending on the particular function. I was asking just in case any of you knew some good reason to use such a method instead of the usual one, like the already mentioned fact that using ##f((1-\varepsilon)x)## works for arbitrarily small values of ##x## while ##f(x-\varepsilon)## gives problems for ##x\leq \varepsilon## (which maybe is the only reason).
##\varepsilon## is supposed to be a fixed small constant.

I'm not sure I understand your point here.
Ok, that was hasty and misguided. You are absolutely correct. But it doesn't work for, say, a quadratic. Thinking about it more, though, is it just perhaps including the division by ##x## as an algorithmic step? Also, if ##x## is small, it would otherwise amplify the derivative, would it not?
 
  • #24
FactChecker said:
I think that is a good point for large values of ##x##, where ##x+\epsilon## would be truncated to ##x##. I don't see the other point about ##f(x+\epsilon)## or ##1/\epsilon##. The OP states that ##x## is restricted to (0,1).
It is still possible to choose a value of ##\epsilon## suitable for large values of ##x##, but I see that if ##x## is small, it can make a large deviation from the true derivative, so I'm curious as well as to why this is the choice.
 
  • #25
valenumr said:
I've never seen this definition of a derivative, and it really only "works" for functions of degree two or higher. And it's still not right, but might give okay answers in the extreme limit. Again: pick a linear function and do the math. It won't matter how big or small epsilon is; the answer will be exact, if you remove x from the denominator. For all higher-order functions, the answer will be close if epsilon is small, but it won't be correct.
If ##x## is in (0,1) and ##h=\epsilon x##, then
##f'(x) = \lim_{\epsilon \to 0}\frac{f((1+\epsilon)x)-f((1-\epsilon)x)}{2\epsilon x} = \lim_{h \to 0}\frac{f(x+h)-f(x-h)}{2h}##
I think that this is a valid definition of the derivative of ##f## at ##x##.
 
  • Like
Likes pbuk
  • #26
FactChecker said:
If ##x## is in (0,1) and ##h=\epsilon x##, then
##f'(x) = \lim_{\epsilon \to 0}\frac{f((1+\epsilon)x)-f((1-\epsilon)x)}{2\epsilon x} = \lim_{h \to 0}\frac{f(x+h)-f(x-h)}{2h}##
I think that this is a valid definition of the derivative of ##f## at ##x##.
Well, to be fair, everything I said was completely wrong. But let's consider f(x) = x^2. Now I think the limit is constant (==1?), when it should be 2x. I'll defer to the fact that there may be a good reason for this specific implementation, but I don't see it being correct, and I don't see how it could be more efficient or numerically more accurate.
 
  • #27
valenumr said:
Well, to be fair, everything I said was completely wrong. But let's consider f(x) = x^2. Now I think the limit is constant (==1?), when it should be 2x. I'll defer to the fact that there may be a good reason for this specific implementation, but I don't see it being correct, and I don't see how it could be more efficient or numerically more accurate.
Actually, if you compute the value for ##f(x)=x^2## you get ##f'(x)=2x##, which is not only correct, but independent of ##\varepsilon##. Indeed you can prove that the error of the algorithm goes like
$$\frac{x^2\varepsilon^2}{6}f'''(x)+\mathcal{O}(\varepsilon^4)$$
so it gives exact results for all quadratic polynomials.
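A quick numerical sanity check of this (just a sketch, not code from the original implementation):
Python:
# Check: the relative-step central difference is exact for quadratics,
# and for other smooth functions the error shrinks roughly like eps**2.
import math

def deriv_relative(f, x, eps):
    return (f((1 + eps) * x) - f((1 - eps) * x)) / (2 * eps * x)

x = 0.3
quad = lambda t: 2 * t**2 - 3 * t + 1     # exact derivative: 4*0.3 - 3 = -1.8
print(deriv_relative(quad, x, 1e-2))       # -1.8, independent of eps

for eps in (1e-2, 1e-3, 1e-4):
    err = abs(deriv_relative(math.sin, x, eps) - math.cos(x))
    print(eps, err)                        # error drops by roughly 100x per step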
 
  • Like
Likes FactChecker
  • #28
Gaussian97 said:
Actually, if you compute the value for ##f(x)=x^2## you get ##f'(x)=2x##, which is not only correct, but independent of ##\varepsilon##. Indeed you can prove that the error of the algorithm goes like
$$\frac{x^2\varepsilon^2}{6}f'''(x)+\mathcal{O}(\varepsilon^4)$$
so it gives exact results for all quadratic polynomials.
I'm on my phone, but I get the numerator as (x^2 + 2ex + e^2) - (x^2 - 2ex + e^2), which simplifies to 4ex. I think that's right. So divide by 2ex and you get 2 (not 1, but still a constant). I am a numerical person, I do lots of "computer math", so I'd really like to understand this more if it is correct.
 
  • #29
valenumr said:
I'm on my phone, but I get the numerator as (x^2 + 2ex + e^2) - (x^2 - 2ex + e^2), which simplifies to 4ex. I think that's right. So divide by 2ex and you get 2 (not 1, but still a constant). I am a numerical person, I do lots of "computer math", so I'd really like to understand this more if it is correct.
No, the numerator is $$(1 + 2\varepsilon + \varepsilon^2)x^2 - (1 - 2\varepsilon + \varepsilon^2)x^2 = 4\varepsilon x^2$$
 
  • #30
Gaussian97 said:
No, the numerator is $$(1 + 2\varepsilon + \varepsilon^2)x^2 - (1 - 2\varepsilon + \varepsilon^2)x^2 = 4\varepsilon x^2$$
OOOOH 🤯, I see it now. I was substituting very wrongly.
 
  • #31
valenumr said:
OOOOH 🤯, I see it now. I was substituting very wrongly.
So out of curiosity and going back to the original question... Is this an efficient and / or more accurate approach? And if so, why? We don't know anything about f(x), and you provided the error limits. Is that optimal, or at least pretty good?
 
  • #32
The main difference between this "two-sided" limit and the common "one-sided" limit is that it will give the "average slope" at a point where the derivative has valid but unequal left- and right-side values. That is desired in some contexts.
 
  • #33
FactChecker said:
The main difference between this "two-sided" limit versus the common "one-sided" limit is that this will give the "average slope" at a point where the derivative has valid but unequal left and right side values. That is desired in some contexts.
I got that part right away, but it was the x in the denominator that lost me. Now that I'm on track, I'm curious about the implementation, and this is up my alley as a computational guy.
 
  • #34
valenumr said:
I got that part right away, but it was the x in the denominator that lost me. Now that I'm on track, I'm curious about the implementation, and this is up my alley as a computational guy.
I think that, given equal amounts of round-off and truncation errors, this will tend to give a more accurate estimate at the midpoint value than a one-sided calculation would. This is just a gut feeling on my part and my knowledge of the numerical issues is too old to be reliable.
 
  • Like
Likes pbuk
  • #35
FactChecker said:
I think that, given equal amounts of round-off and truncation errors, this will tend to give a more accurate estimate at the midpoint value than a one-sided calculation would. This is just a gut feeling on my part and my knowledge of the numerical issues is too old to be reliable.
Ok, thanks. I think if I dig into @Gaussian97 post #27, it will shake out.
 
  • #36
@Gaussian97, any chance you could post the code in question?
 
  • #37
So thinking about this a little more, I think ##1 \pm \varepsilon## is pretty well defined computationally. I'm not even sure this would run into problems except for very large values of ##x##. And as long as ##f(x)## isn't very "flat" around ##x##, this is actually very good.
 
  • Like
Likes FactChecker
  • #38
sysprog said:
@Gaussian97, any chance you could post the code in question?
Well, the code in question is more than 2000 lines, but the relevant part is:
Python:
# The i and Q variables are not relevant.
def deriv(f, i, Q, x0, epsilon):
    # Relative step: h is proportional to x0.
    h = epsilon * x0
    result = (f(i, x0 + h, Q**2) - f(i, x0 - h, Q**2)) / (2 * h)
    return result

# The function we want to differentiate is:
import lhapdf

pdf = lhapdf.mkPDF(PDFname)  # PDFname (the PDF set name string) is defined elsewhere in the code
f = pdf.xfxQ2                # from the LHAPDF Python interface
I don't know if this gives any extra information.
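
For anyone without LHAPDF installed, the routine can be exercised with a stand-in of the same call signature (purely illustrative; `fake_xfxQ2` and its shape are made up):
Python:
# Illustrative stand-in with the same (i, x, Q2) call signature,
# so the deriv routine above can be tried without LHAPDF.
def fake_xfxQ2(i, x, Q2):
    return x * (1 - x)**3                     # toy shape on (0, 1); i, Q2 ignored

def deriv(f, i, Q, x0, epsilon):              # same routine as above
    h = epsilon * x0
    return (f(i, x0 + h, Q**2) - f(i, x0 - h, Q**2)) / (2 * h)

print(deriv(fake_xfxQ2, 1, 10.0, 0.2, 1e-4))  # exact value: (1-x)^2 (1-4x) = 0.128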
 
Last edited by a moderator:
  • Like
Likes sysprog
  • #39
FactChecker said:
I think that, given equal amounts of round-off and truncation errors, this will tend to give a more accurate estimate at the midpoint value than a one-sided calculation would. This is just a gut feeling on my part and my knowledge of the numerical issues is too old to be reliable.
Yes, expanding the Taylor series for each side of the symmetric difference shows that the ## \frac{h^2}{2!} f''(x) ## term cancels, and you are left with an error of ## O(h^3) ## in the difference instead of ## O(h^2) ## as with a one-sided (Newton) method.
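
Writing the standard expansions out explicitly:
$$f(x \pm h) = f(x) \pm h f'(x) + \frac{h^2}{2!} f''(x) \pm \frac{h^3}{3!} f'''(x) + \mathcal{O}(h^4),$$
so the even-order terms cancel in the difference and
$$\frac{f(x+h) - f(x-h)}{2h} = f'(x) + \frac{h^2}{6} f'''(x) + \mathcal{O}(h^4).$$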

Gut feelings are great, but it's good to remind yourself of the maths every few decades :-p
 
  • Like
Likes sysprog and BvU
  • #40
Gaussian97 said:
The formula used is
$$f'(x) = \frac{f((1+\varepsilon)x)-f((1-\varepsilon)x)}{2\varepsilon x}$$
[...]
So I'm still thinking about this a lot... I think the ##(1 \pm \varepsilon)## factor is a very accurate way to do this. It removes scale; that's the best way I can put it. I actually like this a lot, and now that I get it, I'm going to rewrite some code.
 
  • #41
pasmith said:
Floating point arithmetic is inherently inaccurate - particularly if you're dealing with very small or very large numbers. You don't want ##1/\epsilon## to overflow, you don't want ##x \pm \epsilon## to be rounded to ##x##, and you don't want ##|f(x + \epsilon) - f(x - \epsilon)|## to be rounded to zero.

This is an attempt to avoid those problems.
If x is in (0,1), doesn't multiplying by it aggravate those problems?
 
  • Like
Likes valenumr
  • #42
FactChecker said:
If x is in (0,1), doesn't multiplying by it aggravate those problems?
I'm still trying to make sense of that fact.
 
  • #43
Unless there is some reason to include ##x## as a factor in ##h = x\epsilon##, I would prefer that the characteristics of the numerical estimate be unaffected by a horizontal translation, i.e. the nature of the estimate of ##f'(x+c)## at ##x=x_0## would be the same as the nature of the estimate of ##f'(x)## at ##x=x_0+c##.
 
Last edited:
  • #44
It is not that we need to make the step size (which I am going to call ## h ## because ## \epsilon ## is always machine epsilon) small near zero - we need to make the step size as small as we can everywhere: at least small enough that we can be sure that the ## O(h^3) ## term disappears below ## \epsilon ##.

However, the smaller we make the step size in relation to ## x ##, the bigger the problem we are going to have with roundoff error: bear in mind that when we try to calculate values of ## f(x + h) ## we are actually calculating ## f(x + h + \epsilon(x + h)) ##.

As a result we can see that we can in general minimise the errors introduced by ## \epsilon ## by making ## h ## proportional to ## x ##, and this is what the quoted code does. Exactly what proportion to choose depends on how badly behaved ## f(x) ## is, but given that we are trying to make the ## O(h^3) ## term disappear, something around ## \epsilon^{1 \over 3} ## (adjusted for the range of ## f'''(x) ##) is probably a good place to start.
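
A sketch of that heuristic (the cube-root scaling is the standard textbook starting point for central differences; the names here are made up):
Python:
# Sketch of the step-size heuristic described above: h proportional to x,
# with the constant of proportionality near the cube root of machine epsilon.
import math
import sys

EPS = sys.float_info.epsilon            # ~2.2e-16 for doubles

def central_diff(f, x):
    h = x * EPS ** (1.0 / 3.0)          # h ~ 6e-6 * x
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.7
print(central_diff(math.exp, x), math.exp(x))   # estimate vs. exact value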
 
Last edited:
  • Like
Likes valenumr
  • #45
pbuk said:
It is not that we need to make the step size (which I am going to call ## h ## because ## \varepsilon ## is always machine epsilon) small near zero - we need to make the step size as small as we can everywhere: at least small enough that we can be sure that the ## O(h^3) ## term disappears below ## \varepsilon ##.

However, the smaller we make the step size in relation to ## x ##, the bigger the problem we are going to have with roundoff error: bear in mind that when we try to calculate values of ## f(x + h) ## we are actually calculating ## f(x + h + \varepsilon(x + h)) ##.

As a result we can see that we can in general minimise the errors introduced by ## \varepsilon ## by making ## h ## proportional to ## x ##, and this is what the quoted code does. Exactly what proportion to choose depends on how badly behaved ## f(x) ## is, but given that we are trying to make the ## O(h^3) ## term disappear, something around ## \varepsilon^{1 \over 3} ## (adjusted for the range of ## f'''(x) ##) is probably a good place to start.
Yeah, for some reason I initially interpreted the function as ##f(x + \varepsilon)##, etc. It makes a lot more sense as ##f((1 + \varepsilon)x)##.
 
  • #46
Gaussian97 said:
I don't know if this gives any extra information.
I think that if it had been written in something other than an interpreted language, it potentially could have. Thanks for the reply.
 
  • #47
pbuk said:
It is not that we need to make the step size (which I am going to call ## h ## because ## \varepsilon ## is always machine epsilon) small near zero - we need to make the step size as small as we can everywhere: at least small enough that we can be sure that the ## O(h^3) ## term disappears below ## \varepsilon ##.

However, the smaller we make the step size in relation to ## x ##, the bigger the problem we are going to have with roundoff error: bear in mind that when we try to calculate values of ## f(x + h) ## we are actually calculating ## f(x + h + \varepsilon(x + h)) ##.

As a result we can see that we can in general minimise the errors introduced by ## \varepsilon ## by making ## h ## proportional to ## x ##, and this is what the quoted code does. Exactly what proportion to choose depends on how badly behaved ## f(x) ## is, but given that we are trying to make the ## O(h^3) ## term disappear, something around ## \varepsilon^{1 \over 3} ## (adjusted for the range of ## f'''(x) ##) is probably a good place to start.
It seems pretty clear that this is what the implementation is trying to do, but after thinking about it more, I agree with @FactChecker, for a couple of reasons. I think the implementation in some cases introduces more problems than it solves, though perhaps not in this specific case. For example, there are problems with negative values of ##x##, and there is the artificial discontinuity introduced at ##x=0##. Also, if you take something like ##f(x) = \sin(x)##, the answers for ##f'(x)## and ##f'(x + 2n\pi)## will not be exactly the same. Obviously, minimizing the step based on the magnitude of ##x## might be more important, but there are other ways to accomplish this without introducing the ##\varepsilon x## term or the division by ##x##. For example, ##h## should be ##O(x \cdot 2^{-24})##, so a constant value can be chosen based on the floating-point exponent of ##x## that will still guarantee ##x \pm h \neq x##, without all the weird side effects.
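
To illustrate what I mean, a sketch (the helper name is made up; `math.frexp` just exposes the binary exponent):
Python:
# Sketch: pick a power-of-two step from the binary exponent of x, so that
# x + h and x - h are guaranteed to differ from x, with no division by x.
import math

def step_from_exponent(x, bits=24):
    _, e = math.frexp(x)                 # x = m * 2**e with 0.5 <= |m| < 1
    return math.ldexp(1.0, e - bits)     # h = 2**(e - bits)

def central_diff(f, x, bits=24):
    h = step_from_exponent(x, bits)
    return (f(x + h) - f(x - h)) / (2 * h)

print(central_diff(math.sin, 0.3), math.cos(0.3))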
 
  • #48
But the implementation is specifically and only for functions over the domain (0, 1) so none of that applies.
 
  • #49
pbuk said:
But the implementation is specifically and only for functions over the domain (0, 1) so none of that applies.
Right, I understood that. But also in that case you can just pick ##h## to be ##2^{-24}## (or maybe ##2^{-23}##?) for all ##x##, for 32-bit floats.
 
  • #50
valenumr said:
Right, I understood that. But also in that case you can just pick ##h## to be ##2^{-24}## (or maybe ##2^{-23}##?) for all ##x##, for 32-bit floats.
Well, that's actually wrong if you use x very close to zero, but that can be accounted for.
 
