Numerical computation of the derivative

AI Thread Summary
The discussion centers on a numerical method for computing the derivative of a function using the formula f'(x) = (f((1+ε)x) - f((1-ε)x)) / (2εx), where ε is a small number. Participants debate the validity of this approach compared to the more common formula f'(x) = (f(x+ε) - f(x-ε)) / (2ε). Concerns are raised about the potential mathematical inaccuracies, particularly when x approaches 0, as the formula becomes undefined at that point. Some argue that this method may help avoid issues with floating-point arithmetic, while others assert that it may not yield accurate results for all functions. Ultimately, the conversation highlights the complexities and considerations in numerical differentiation techniques.
Gaussian97
Homework Helper
TL;DR Summary
Question about an algorithm to compute the derivative of a function.
I'm not sure if this is the correct forum to post this question, or whether I should post it in a math forum. But I was looking at some code when I found a 'strange' implementation to compute the derivative of a function, and I wanted to know if any of you has an idea of why such an implementation is used.

The formula used is
$$f'(x) = \frac{f((1+\varepsilon)x)-f((1-\varepsilon)x)}{2\varepsilon x}$$
Of course, ##\varepsilon## should be a small number. I know that there are many ways to implement the derivative of a function numerically, and obviously, the formula above does indeed converge to the derivative in the limit ##\varepsilon\to 0## in the case of differentiable functions.

My question is whether anyone else has used this formula instead of the usual ##f'(x) = \frac{f(x+\varepsilon)-f(x-\varepsilon)}{2\varepsilon}##, or whether anyone knows of any advantage to using this alternative formula.

Something that may be important is that the formula is used to compute derivatives of functions that are only defined in the interval ##(0,1)##, so I thought maybe this formula has some advantage when ##x \sim 0## or ##x \sim 1##?
For example, this formula has the advantage that even if ##x<\varepsilon## the argument will never be smaller than 0, which is probably one of the reasons for using it.

Does anyone have any information?
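
For concreteness, here is a minimal sketch of the two formulas side by side (the function names and the test function are just for illustration, not from the code in question):
Python:
# Minimal sketch (illustrative names): the relative-step formula from the code
# versus the usual absolute-step central difference.
import math

def deriv_relative(f, x, eps=1e-6):
    # f'(x) ~ [f((1+eps)*x) - f((1-eps)*x)] / (2*eps*x)
    return (f((1 + eps) * x) - f((1 - eps) * x)) / (2 * eps * x)

def deriv_absolute(f, x, eps=1e-6):
    # f'(x) ~ [f(x+eps) - f(x-eps)] / (2*eps)
    return (f(x + eps) - f(x - eps)) / (2 * eps)

f = math.log                  # only defined for positive arguments
x = 1e-8                      # here x < eps
print(deriv_relative(f, x))   # arguments stay positive; result ~ 1/x
# deriv_absolute(f, x) would call f with a negative argument and raise an error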
 
  • Like
Likes JD_PM and homeyn
Gaussian97 said:
The formula used is
$$f'(x) = \frac{f((1+\varepsilon)x)-f((1-\varepsilon)x)}{2\varepsilon x}$$
[...]
Does anyone have any information?
Maybe they are just avoiding a divide by zero?
 
valenumr said:
Maybe they are just avoiding a divide by zero?
Uh, actually, that doesn't seem right at all. The denominator should not contain x. If epsilon is really small, it won't make a huge difference, but it is wrong. Just think of a simple function like y = 4x, which should give an exact result for any epsilon. It doesn't work with x in the denominator.
 
Gaussian97 said:
so I thought maybe this formula has some advantage when ##x \sim 0## or ##x \sim 1##?
Quite the contrary. This formula has a distinct disadvantage when ##x=0##. It will not give the derivative there. It will be undefined.
EDIT: I missed the fact that ##x## is restricted to (0,1).
 
Last edited:
  • Like
Likes valenumr
Gaussian97 said:
The formula used is
$$f'(x) = \frac{f((1+\varepsilon)x)-f((1-\varepsilon)x)}{2\varepsilon x}$$
[...]
My question is whether anyone else has used this formula instead of the usual ##f'(x) = \frac{f(x+\varepsilon)-f(x-\varepsilon)}{2\varepsilon}##, or whether anyone knows of any advantage to using this alternative formula.

Floating point arithmetic is inherently inaccurate - particularly if you're dealing with very small or very large numbers. You don't want ##1/\epsilon## to overflow, you don't want ##x \pm \epsilon## to be rounded to ##x##, and you don't want ##|f(x + \epsilon) - f(x - \epsilon)|## to be rounded to zero.

This is an attempt to avoid those problems.
 
  • Like
Likes PhDeezNutz and jim mcnamara
pasmith said:
Floating point arithmetic is inherently inaccurate - particularly if you're dealing with very small or very large numbers. You don't want ##1/\epsilon## to overflow, you don't want ##x \pm \epsilon## to be rounded to ##x##, and you don't want ##|f(x + \epsilon) - f(x - \epsilon)|## to be rounded to zero.

This is an attempt to avoid those problems.
I'm still thinking this is wrong. If x is limited to the interval (0,1), x+e is guaranteed to be less than e.
 
valenumr said:
I'm still thinking this is wrong. If x is limited to the interval (0,1), x+e is guaranteed to be less than e.
Oops, x*e
 
I'm going to go with: this has no justification. It is flat-out wrong. I already gave a counterexample where you can choose any value for epsilon that would work with the correct math. It's a little weird that you can take the limit and it's mostly okay, but I don't see how it's applicable.
 
valenumr said:
I'm going to go with: this has no justification. It is flat-out wrong. I already gave a counterexample where you can choose any value for epsilon that would work with the correct math. It's a little weird that you can take the limit and it's mostly okay, but I don't see how it's applicable.
Offer only applies to polynomials of degree two or higher.
 
  • #10
pasmith said:
Floating point arithmetic is inherently inaccurate - particularly if you're dealing with very small or very large numbers. You don't want ##1/\epsilon## to overflow, you don't want ##x \pm \epsilon## to be rounded to ##x##, and you don't want ##|f(x + \epsilon) - f(x - \epsilon)|## to be rounded to zero.

This is an attempt to avoid those problems.
I think that is a good point for large values of ##x##, where ##x+\epsilon## would be truncated to ##x##. I don't see the other point about ##f(x+\epsilon)## or ##1/\epsilon##. The OP states that ##x## is restricted to (0,1).
 
  • #11
FactChecker said:
I think that is a good point for large values of ##x##, where ##x+\epsilon## would be truncated to ##x##. I don't see the other point about ##f(x+\epsilon)## or ##1/\epsilon##. The OP states that ##x## is restricted to (0,1).
I'm at a loss coming up with a case where this is valid. I mean, mathematically, it's just wrong.
 
  • #12
FactChecker said:
I think that is a good point for large values of ##x##, where ##x+\epsilon## would be truncated to ##x##. I don't see the other point about ##f(x+\epsilon)## or ##1/\epsilon##. The OP states that ##x## is restricted to (0,1).
And that still doesn't justify x in the denominator.
 
  • #13
valenumr said:
And that still doesn't justify x in the denominator.
If ##x## is in (0,1) and let ##h=x\epsilon##, this seems just like the standard definition.
 
  • #14
FactChecker said:
If ##x## is in (0,1) and let ##h=x\epsilon##, this seems just like the standard definition.
I've never seen this definition of a derivative, and it really only "works" for functions of degree two or higher. And it's still not right, but might give okay answers in the extreme limit. Again: pick a linear function and do the math. It won't matter how big or small epsilon is; the answer will be exact, if you remove x from the denominator. For all higher-order functions, the answer will be close if epsilon is small, but it won't be correct.
 
  • #15
valenumr said:
Maybe they are just avoiding a divide by zero?
Mmm... I don't see how this would work: if ##\varepsilon## is so small that it gives problems with division by zero, then since ##x<1##, dividing by ##x\varepsilon## would be even worse, no?
valenumr said:
Uh, actually, that doesn't seem right at all. The denominator should not contain x. If epsilon is really small, it won't make a huge difference, but it is wrong. Just think of a simple function like y=4x which should have exact solutions for epsilon. It doesn't work with x in the denominator.
Mmm... Mathematically, for a fixed ##x## the expression for ##\varepsilon\to 0## converges to the derivative. Actually, for the case you mention ##f(x)=4x## we would have
$$f'(x)=\frac{4(1+\varepsilon)x - 4(1-\varepsilon)x}{2\varepsilon x} = \frac{4x(1+\varepsilon-1+\varepsilon)}{2\varepsilon x}=4$$
so it gives the correct result independent of the value of ##\varepsilon##.

pasmith said:
Floating point arithmetic is inherently inaccurate - particularly if you're dealing with very small or very large numbers. You don't want ##1/\epsilon## to overflow, you don't want ##x \pm \epsilon## to be rounded to ##x##, and you don't want ##|f(x + \epsilon) - f(x - \epsilon)|## to be rounded to zero.

This is an attempt to avoid those problems.
Yes, I also have this kind of thing in mind, but as I said, because ##x\varepsilon < \varepsilon##, if ##1/\varepsilon## overflows then ##1/(x\varepsilon)## should be even worse, no? Also similarly, if ##x\pm \varepsilon## gets rounded to ##x##, again the substitution ##\varepsilon \to x\varepsilon## should make things worse.
 
  • #16
Gaussian97 said:
Mmm... I don't see how this would work: if ##\varepsilon## is so small that it gives problems with division by zero, then since ##x<1##, dividing by ##x\varepsilon## would be even worse, no?

Mmm... Mathematically, for a fixed ##x## the expression for ##\varepsilon\to 0## converges to the derivative. Actually, for the case you mention ##f(x)=4x## we would have
$$f'(x)=\frac{4(1+\varepsilon)x - 4(1-\varepsilon)x}{2\varepsilon x} = \frac{4x(1+\varepsilon-1+\varepsilon)}{2\varepsilon x}=4$$
so it gives the correct result independent of the value of ##\varepsilon##.

Yes, I also have this kind of thing in mind, but as I said, because ##x\varepsilon < \varepsilon##, if ##1/\varepsilon## overflows then ##1/(x\varepsilon)## should be even worse, no? Also similarly, if ##x\pm \varepsilon## gets rounded to ##x##, again the substitution ##\varepsilon \to x\varepsilon## should make things worse.
No, I think your assertion is wrong in the general case. My point was that if you actually choose a non-zero value for epsilon, it fails as an approximation. Look at your equation and choose a small but non-zero value for epsilon and a large(ish) value for x. It doesn't make sense.
Gaussian97 said:
Mmm... I don't see how this would work: if ##\varepsilon## is so small that it gives problems with division by zero, then since ##x<1##, dividing by ##x\varepsilon## would be even worse, no?
[...]
Eh, the divide by zero was just a first thought thinking out loud. It doesn't actually make sense.
 
  • #17
Let's start with the well-known expression ## f'(x) \approx \frac{f(x+h)-f(x-h)}{2h} ##. The accuracy of the approximation clearly depends on the (i) magnitude of ## h ##, and also (ii) on the characteristics of the function in the neighborhood of ## x ## (and in particular the magnitude of the discarded terms of the Taylor expansion). We have no control over (ii) and so in order to attempt to achieve the desired accuracy we choose the value of ## h ## to use at a particular ## x ## using some method ## h(x) ##.

Whoever has implemented the algorithm has clearly decided that setting ## h = \varepsilon x ## where ## \varepsilon ## is presumably constant* gives the level of accuracy they desire over the range in which they wish to approximate ## f'(x) ##. Without knowing much about ## f(x) ## or anything else it is impossible to judge whether this is a good choice or not, and certainly not possible to state that this leads to an 'incorrect' approximation.

*[Edit] or is perhaps a factor which is reduced iteratively in order to estimate accuracy
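
For illustration, a sketch of the ##h(x)## idea (the names and the cubic test function are made up, not the code in question):
Python:
# Sketch: the step size used at each x comes from a rule h(x),
# here either the relative rule h = eps*x or a fixed step h = eps.

def central_diff(f, x, step_rule):
    h = step_rule(x)
    return (f(x + h) - f(x - h)) / (2 * h)

eps = 1e-6
relative_step = lambda x: eps * x   # the choice made in the quoted code
fixed_step = lambda x: eps          # the "usual" choice

g = lambda x: x**3                  # g'(0.5) = 0.75 exactly
print(central_diff(g, 0.5, relative_step))
print(central_diff(g, 0.5, fixed_step))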
 
Last edited:
  • Like
Likes sysprog
  • #18
pbuk said:
Let's start with the well-known expression ## f'(x) \approx \frac{f(x+h)-f(x-h)}{2h} ##. The accuracy of the approximation clearly depends on the (i) magnitude of ## h ##, and also (ii) on the characteristics of the function in the neighborhood of ## x ## (and in particular the magnitude of the discarded terms of the Taylor expansion). We have no control over (ii) and so in order to attempt to achieve the desired accuracy we choose the value of ## h ## to use at a particular ## x ## using some method ## h(x) ##.

Whoever has implemented the algorithm has clearly decided that setting ## h = \varepsilon x ## where ## \varepsilon ## is presumably constant gives the level of accuracy they desire over the range in which they wish to approximate ## f'(x) ##. Without knowing much about ## f(x) ## or anything else it is impossible to judge whether this is a good choice or not, and certainly not possible to state that this leads to an 'incorrect' approximation.
But the question remains: why would one do that? It is obviously inaccurate in the general case. I wonder what set of functions this may apply to, and also why it might be computationally efficient.
 
  • #19
Thread closed temporarily for Moderation and cleanup...
 
  • #20
Off topic posts removed and thread is re-opened. Thanks.
 
  • #21
pbuk said:
Let's start with the well-known expression ## f'(x) \approx \frac{f(x+h)-f(x-h)}{2h} ##. The accuracy of the approximation clearly depends on the (i) magnitude of ## h ##, and also (ii) on the characteristics of the function in the neighborhood of ## x ## (and in particular the magnitude of the discarded terms of the Taylor expansion). We have no control over (ii) and so in order to attempt to achieve the desired accuracy we choose the value of ## h ## to use at a particular ## x ## using some method ## h(x) ##.

Whoever has implemented the algorithm has clearly decided that setting ## h = \varepsilon x ## where ## \varepsilon ## is presumably constant gives the level of accuracy they desire over the range in which they wish to approximate ## f'(x) ##. Without knowing much about ## f(x) ## or anything else it is impossible to judge whether this is a good choice or not, and certainly not possible to state that this leads to an 'incorrect' approximation.
Well, for sure the method will work better or worse depending on the particular function. I was asking just in case any of you knew some good reason to use such a method instead of the usual one, like the already mentioned fact that using ##f((1-\varepsilon)x)## works for arbitrarily small values of ##x## while ##f(x-\varepsilon)## gives problems for ##x\leq \varepsilon## (which maybe is the only reason).
##\varepsilon## is supposed to be a fixed small constant.

valenumr said:
No, I think your assertion is wrong in the general case. My point was that if you actually choose a non-zero value for epsilon, it fails as an approximation. Look at your equation and choose a small but non-zero value for epsilon and a large(ish) value for x. It doesn't make sense.
I'm not sure I understand your point here.
 
  • #22
Gaussian97 said:
Yes, I also have this kind of thing in mind, but as I said, because ##x\varepsilon < \varepsilon##, if ##1/\varepsilon## overflows then ##1/(x\varepsilon)## should be even worse, no? Also similarly, if ##x\pm \varepsilon## gets rounded to ##x##, again the substitution ##\varepsilon \to x\varepsilon## should make things worse.

This is true.

However, there is also the problem that using ##x \pm \epsilon## with a fixed ##\epsilon## doesn't give a good local approximation when ##|x| < \epsilon##; using ##x + \epsilon x## avoids this, and in choosing ##\epsilon## one has to strike a balance between having a good local approximation and avoiding floating-point errors.
 
  • #23
Gaussian97 said:
Well, for sure the method will work better or worse depending on the particular function. I was asking just in case any of you knew some good reason to use such a method instead of the usual one, like the already mentioned fact that using ##f((1-\varepsilon)x)## works for arbitrarily small values of ##x## while ##f(x-\varepsilon)## gives problems for ##x\leq \varepsilon## (which maybe is the only reason).
##\varepsilon## is supposed to be a fixed small constant.

I'm not sure I understand your point here.
Ok, that was hasty and misguided. You are absolutely correct. But it doesn't work for, say, a quadratic. Thinking about it more, though, is it just perhaps including the division by ##x## as an algorithmic step? Also, if ##x## is small, it would otherwise amplify the derivative, would it not?
 
  • #24
FactChecker said:
I think that is a good point for large values of ##x##, where ##x+\epsilon## would be truncated to ##x##. I don't see the other point about ##f(x+\epsilon)## or ##1/\epsilon##. The OP states that ##x## is restricted to (0,1).
It is still possible to choose a value of ##\epsilon## suitable for large values of ##x##, but I see that if ##x## is small, it can make a large deviation from the true derivative, so I'm curious as well as to why this is the choice.
 
  • #25
valenumr said:
I've never seen this definition of a derivative, and it really only "works" for functions of degree two or higher. And it's still not right, but might give okay answers in the extreme limit. Again: pick a linear function and do the math. It won't matter how big or small epsilon is; the answer will be exact, if you remove x from the denominator. For all higher-order functions, the answer will be close if epsilon is small, but it won't be correct.
If ##x## is in (0,1) and ##h=\epsilon x##, then
##f'(x) = \lim_{\epsilon \to 0}\frac{f((1+\epsilon)x)-f((1-\epsilon)x)}{2\epsilon x} = \lim_{h \to 0}\frac{f(x+h)-f(x-h)}{2h}##
I think that this is a valid definition of the derivative of ##f## at ##x##.
 
  • Like
Likes pbuk
  • #26
FactChecker said:
If ##x## is in (0,1) and ##h=\epsilon x##, then
##f'(x) = \lim_{\epsilon \to 0}\frac{f((1+\epsilon)x)-f((1-\epsilon)x)}{2\epsilon x} = \lim_{h \to 0}\frac{f(x+h)-f(x-h)}{2h}##
I think that this is a valid definition of the derivative of ##f## at ##x##.
Well, to be fair, everything I said was completely wrong. But let's consider f(x) = x^2. Now I think the limit is constant (==1?), when it should be 2x. I'll defer to the fact that there may be a good reason for this specific implementation, but I don't see it being correct, and I don't see how it could be more efficient or numerically more accurate.
 
  • #27
valenumr said:
Well, to be fair, everything I said was completely wrong. But let's consider f(x) = x^2. Now I think the limit is constant (==1?), when it should be 2x. I'll defer to the fact that there may be a good reason for this specific implementation, but I don't see it being correct, and I don't see how it could be more efficient or numerically more accurate.
Actually, if you compute the value for ##f(x)=x^2## you get ##f'(x)=2x##, which is not only correct, but independent of ##\varepsilon##. Indeed you can prove that the error of the algorithm goes like
$$\frac{x^2\varepsilon^2}{6}f'''(x)+\mathcal{O}(\varepsilon^4)$$
so it gives exact results for all quadratic polynomials.
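A quick numerical sanity check of this (just a sketch, not code from the original implementation):
Python:
# Check: the relative-step central difference is exact for quadratics,
# and for other smooth functions the error shrinks roughly like eps**2.
import math

def deriv_relative(f, x, eps):
    return (f((1 + eps) * x) - f((1 - eps) * x)) / (2 * eps * x)

x = 0.3
quad = lambda t: 2 * t**2 - 3 * t + 1     # exact derivative: 4*0.3 - 3 = -1.8
print(deriv_relative(quad, x, 1e-2))       # -1.8, independent of eps

for eps in (1e-2, 1e-3, 1e-4):
    err = abs(deriv_relative(math.sin, x, eps) - math.cos(x))
    print(eps, err)                        # error drops by roughly 100x per step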
 
  • Like
Likes FactChecker
  • #28
Gaussian97 said:
Actually, if you compute the value for ##f(x)=x^2## you get ##f'(x)=2x##, which is not only correct, but independent of ##\varepsilon##. Indeed you can prove that the error of the algorithm goes like
$$\frac{x^2\varepsilon^2}{6}f'''(x)+\mathcal{O}(\varepsilon^4)$$
so it gives exact results for all quadratic polynomials.
I'm on my phone, but I get the numerator as (x^2 + 2ex + e^2) - (x^2 - 2ex + e^2), which simplifies to 4ex. I think that's right. So divide by 2ex and you get 2 (not 1, but still a constant). I am a numerical person, I do lots of "computer math", so I'd really like to understand this more if it is correct.
 
  • #29
valenumr said:
I'm on my phone, but I get the numerator as (x^2 + 2ex + e^2) - (x^2 - 2ex + e^2), which simplifies to 4ex. I think that's right. So divide by 2ex and you get 2 (not 1, but still a constant). I am a numerical person, I do lots of "computer math", so I'd really like to understand this more if it is correct.
No, the numerator is $$(1 + 2\varepsilon + \varepsilon^2)x^2 - (1 - 2\varepsilon + \varepsilon^2)x^2 = 4\varepsilon x^2$$
 
  • #30
Gaussian97 said:
No, the numerator is $$(1 + 2\varepsilon + \varepsilon^2)x^2 - (1 - 2\varepsilon + \varepsilon^2)x^2 = 4\varepsilon x^2$$
OOOOH 🤯, I see it now. I was substituting very wrongly.
 
  • #31
valenumr said:
OOOOH 🤯, I see it now. I was substituting very wrongly.
So out of curiosity and going back to the original question... Is this an efficient and / or more accurate approach? And if so, why? We don't know anything about f(x), and you provided the error limits. Is that optimal, or at least pretty good?
 
  • #32
The main difference between this "two-sided" limit and the common "one-sided" limit is that it will give the "average slope" at a point where the derivative has valid but unequal left- and right-side values. That is desired in some contexts.
 
  • #33
FactChecker said:
The main difference between this "two-sided" limit versus the common "one-sided" limit is that this will give the "average slope" at a point where the derivative has valid but unequal left and right side values. That is desired in some contexts.
I got that part right away, but it was the x in the denominator that lost me. Now that I'm on track, I'm curious about the implementation, and this is up my alley as a computational guy.
 
  • #34
valenumr said:
I got that part right away, but it was the x in the denominator that lost me. Now that I'm on track, I'm curious about the implementation, and this is up my alley as a computational guy.
I think that, given equal amounts of round-off and truncation errors, this will tend to give a more accurate estimate at the midpoint value than a one-sided calculation would. This is just a gut feeling on my part and my knowledge of the numerical issues is too old to be reliable.
 
  • Like
Likes pbuk
  • #35
FactChecker said:
I think that, given equal amounts of round-off and truncation errors, this will tend to give a more accurate estimate at the midpoint value than a one-sided calculation would. This is just a gut feeling on my part and my knowledge of the numerical issues is too old to be reliable.
Ok, thanks. I think if I dig into @Gaussian97 post #27, it will shake out.
 
  • #36
@Gaussian97, any chance you could post the code in question?
 
  • #37
So thinking about this a little more, I think ##1 \pm \varepsilon## is pretty well defined computationally. I'm not even sure this would run into problems except for very large values of ##x##. And as long as ##f(x)## isn't very "flat" around ##x##, this is actually very good.
 
  • Like
Likes FactChecker
  • #38
sysprog said:
@Gaussian97, any chance you could post the code in question?
Well, the code in question is more than 2000 lines, but the relevant part is:
Python:
# The i and Q variables are not relevant.
def deriv(f, i, Q, x0, epsilon):
    # Relative step: h is proportional to x0.
    h = epsilon * x0
    result = (f(i, x0 + h, Q**2) - f(i, x0 - h, Q**2)) / (2 * h)
    return result

# The function we want to differentiate is:
import lhapdf

pdf = lhapdf.mkPDF(PDFname)  # PDFname (the PDF set name string) is defined elsewhere in the code
f = pdf.xfxQ2                # from the LHAPDF Python interface
I don't know if this gives any extra information.
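
For anyone without LHAPDF installed, the routine can be exercised with a stand-in of the same call signature (purely illustrative; `fake_xfxQ2` and its shape are made up):
Python:
# Illustrative stand-in with the same (i, x, Q2) call signature,
# so the deriv routine above can be tried without LHAPDF.
def fake_xfxQ2(i, x, Q2):
    return x * (1 - x)**3                     # toy shape on (0, 1); i, Q2 ignored

def deriv(f, i, Q, x0, epsilon):              # same routine as above
    h = epsilon * x0
    return (f(i, x0 + h, Q**2) - f(i, x0 - h, Q**2)) / (2 * h)

print(deriv(fake_xfxQ2, 1, 10.0, 0.2, 1e-4))  # exact value: (1-x)^2 (1-4x) = 0.128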
 
Last edited by a moderator:
  • Like
Likes sysprog
  • #39
FactChecker said:
I think that, given equal amounts of round-off and truncation errors, this will tend to give a more accurate estimate at the midpoint value than a one-sided calculation would. This is just a gut feeling on my part and my knowledge of the numerical issues is too old to be reliable.
Yes, expanding the Taylor series for each side of the symmetric difference shows that the ## \frac{h^2}{2!} f''(x) ## term cancels, and you are left with an error of ## O(h^3) ## in the difference instead of ## O(h^2) ## as with a one-sided (Newton) method.
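
Writing the standard expansions out explicitly:
$$f(x \pm h) = f(x) \pm h f'(x) + \frac{h^2}{2!} f''(x) \pm \frac{h^3}{3!} f'''(x) + \mathcal{O}(h^4),$$
so the even-order terms cancel in the difference and
$$\frac{f(x+h) - f(x-h)}{2h} = f'(x) + \frac{h^2}{6} f'''(x) + \mathcal{O}(h^4).$$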

Gut feelings are great, but it's good to remind yourself of the maths every few decades :-p
 
  • Like
Likes sysprog and BvU
  • #40
Gaussian97 said:
The formula used is
$$f'(x) = \frac{f((1+\varepsilon)x)-f((1-\varepsilon)x)}{2\varepsilon x}$$
[...]
So I'm still thinking about this a lot... I think the ##(1 \pm \varepsilon)## factor is a very accurate way to do this. It removes scale; that's the best way I can put it. I actually like this a lot, and now that I get it, I'm going to rewrite some code.
 
  • #41
pasmith said:
Floating point arithmetic is inherently inaccurate - particularly if you're dealing with very small or very large numbers. You don't want ##1/\epsilon## to overflow, you don't want ##x \pm \epsilon## to be rounded to ##x##, and you don't want ##|f(x + \epsilon) - f(x - \epsilon)|## to be rounded to zero.

This is an attempt to avoid those problems.
If x is in (0,1), doesn't multiplying by it aggravate those problems?
 
  • Like
Likes valenumr
  • #42
FactChecker said:
If x is in (0,1), doesn't multiplying by it aggravate those problems?
I'm still trying to make sense of that fact.
 
  • #43
Unless there is some reason to include ##x## as a factor in ##h = x\epsilon##, I would prefer that the characteristics of the numerical estimate be unaffected by a horizontal translation, i.e. the nature of the estimate of ##f'(x+c)## at ##x=x_0## would be the same as the nature of the estimate of ##f'(x)## at ##x=x_0+c##.
 
Last edited:
  • #44
It is not that we need to make the step size (which I am going to call ## h ## because ## \epsilon ## is always machine epsilon) small near zero - we need to make the step size as small as we can everywhere: at least small enough that we can be sure that the ## O(h^3) ## term disappears below ## \epsilon ##.

However, the smaller we make the step size in relation to ## x ##, the bigger the problem we are going to have with roundoff error: bear in mind that when we try to calculate values of ## f(x + h) ## we are actually calculating ## f(x + h + \epsilon(x + h)) ##.

As a result we can see that we can in general minimise the errors introduced by ## \epsilon ## by making ## h ## proportional to ## x ##, and this is what the quoted code does. Exactly what proportion to choose depends on how badly behaved ## f(x) ## is, but given that we are trying to make the ## O(h^3) ## term disappear, something around ## \epsilon^{1 \over 3} ## (adjusted for the range of ## f'''(x) ##) is probably a good place to start.
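
A sketch of that heuristic (the cube-root scaling is the standard textbook starting point for central differences; the names here are made up):
Python:
# Sketch of the step-size heuristic described above: h proportional to x,
# with the constant of proportionality near the cube root of machine epsilon.
import math
import sys

EPS = sys.float_info.epsilon            # ~2.2e-16 for doubles

def central_diff(f, x):
    h = x * EPS ** (1.0 / 3.0)          # h ~ 6e-6 * x
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.7
print(central_diff(math.exp, x), math.exp(x))   # estimate vs. exact value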
 
Last edited:
  • Like
Likes valenumr
  • #45
pbuk said:
It is not that we need to make the step size (which I am going to call ## h ## because ## \varepsilon ## is always machine epsilon) small near zero - we need to make the step size as small as we can everywhere: at least small enough that we can be sure that the ## O(h^3) ## term disappears below ## \varepsilon ##.

However, the smaller we make the step size in relation to ## x ##, the bigger the problem we are going to have with roundoff error: bear in mind that when we try to calculate values of ## f(x + h) ## we are actually calculating ## f(x + h + \varepsilon(x + h)) ##.

As a result we can see that we can in general minimise the errors introduced by ## \varepsilon ## by making ## h ## proportional to ## x ##, and this is what the quoted code does. Exactly what proportion to choose depends on how badly behaved ## f(x) ## is, but given that we are trying to make the ## O(h^3) ## term disappear, something around ## \varepsilon^{1 \over 3} ## (adjusted for the range of ## f'''(x) ##) is probably a good place to start.
Yeah, for some reason I initially interpreted the function as ##f(x + \varepsilon)##, etc. It makes a lot more sense as ##f((1 + \varepsilon)x)##.
 
  • #46
Gaussian97 said:
I don't know if this gives any extra information.
I think that if it had been written in something other than an interpreted language, it potentially could have. Thanks for the reply.
 
  • #47
pbuk said:
It is not that we need to make the step size (which I am going to call ## h ## because ## \varepsilon ## is always machine epsilon) small near zero - we need to make the step size as small as we can everywhere: at least small enough that we can be sure that the ## O(h^3) ## term disappears below ## \varepsilon ##.

However, the smaller we make the step size in relation to ## x ##, the bigger the problem we are going to have with roundoff error: bear in mind that when we try to calculate values of ## f(x + h) ## we are actually calculating ## f(x + h + \varepsilon(x + h)) ##.

As a result we can see that we can in general minimise the errors introduced by ## \varepsilon ## by making ## h ## proportional to ## x ##, and this is what the quoted code does. Exactly what proportion to choose depends on how badly behaved ## f(x) ## is, but given that we are trying to make the ## O(h^3) ## term disappear, something around ## \varepsilon^{1 \over 3} ## (adjusted for the range of ## f'''(x) ##) is probably a good place to start.
It seems pretty clear that this is what the implementation is trying to do, but after thinking about it more, I agree with @FactChecker, for a couple of reasons. I think the implementation in some cases introduces more problems than it solves, though perhaps not in this specific case. For example, there are problems with negative values of ##x##, and there is the artificial discontinuity introduced at ##x=0##. Also, if you take something like ##f(x) = \sin(x)##, the answers for ##f'(x)## and ##f'(x + 2n\pi)## will not be exactly the same. Obviously, minimizing the step based on the magnitude of ##x## might be more important, but there are other ways to accomplish this without introducing the ##\varepsilon x## term or the division by ##x##. For example, ##h## should be ##O(x \cdot 2^{-24})##, so a constant value can be chosen based on the floating-point exponent of ##x## that will still guarantee ##x \pm h \neq x##, without all the weird side effects.
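
To illustrate what I mean, a sketch (the helper name is made up; `math.frexp` just exposes the binary exponent):
Python:
# Sketch: pick a power-of-two step from the binary exponent of x, so that
# x + h and x - h are guaranteed to differ from x, with no division by x.
import math

def step_from_exponent(x, bits=24):
    _, e = math.frexp(x)                 # x = m * 2**e with 0.5 <= |m| < 1
    return math.ldexp(1.0, e - bits)     # h = 2**(e - bits)

def central_diff(f, x, bits=24):
    h = step_from_exponent(x, bits)
    return (f(x + h) - f(x - h)) / (2 * h)

print(central_diff(math.sin, 0.3), math.cos(0.3))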
 
  • #48
But the implementation is specifically and only for functions over the domain (0, 1) so none of that applies.
 
  • #49
pbuk said:
But the implementation is specifically and only for functions over the domain (0, 1) so none of that applies.
Right, I understood that. But also in that case you can just pick ##h## to be ##2^{-24}## (or maybe ##2^{-23}##?) for all ##x##, for 32-bit floats.
 
  • #50
valenumr said:
Right, I understood that. But also in that case you can just pick ##h## to be ##2^{-24}## (or maybe ##2^{-23}##?) for all ##x##, for 32-bit floats.
Well, that's actually wrong if you use x very close to zero, but that can be accounted for.
 
