Convergence criterion for Newton-Raphson

In summary, the Newton-Raphson algorithm has two methods for convergence: one based on the function value and one based on the change in the root. The first method only allows for a maximum absolute error, while the second method allows for both relative change and step size. Each method has its own advantages and potential issues, and a combination of both methods may be more robust but also more computationally expensive.
  • #1
Kyouran
TL;DR Summary
Convergence criterion Newton-Raphson
The Newton-Raphson algorithm is well-known:

##x_{n+1} = x_n - \frac{f(x_n)}{f'(x_{n})}##

Looking at a few implementations online, I have encountered two methods for convergence:

1) The first method uses the function value of the last estimate itself, ##f(x_n)## or ##f(x_{n+1})##. Since the function value is zero at the root, this restricts us to specifying a maximum absolute error on ##f(x)##; a relative error with respect to zero doesn't make much sense.

2) The second method uses the change in the root estimate over the last iteration. If this change falls below a certain threshold value, the algorithm is assumed to have converged. Here one has a bit more freedom: one can look at the relative change ##(\frac{x_{n+1}}{x_n}-1)## in the root, or at the step size ##x_{n+1}-x_n##.

Obviously, the two criteria (or three, if you count the last two as distinct) are different, so the question is: what are the consequences of choosing one criterion over the other?
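For concreteness, here is a minimal Python sketch of the iteration with both stopping tests. It is not taken from any of the implementations mentioned; the function name `newton` and the parameters `ftol`/`xtol` are illustrative.

```python
def newton(f, fprime, x0, ftol=None, xtol=None, max_iter=50):
    """Newton-Raphson with either stopping criterion.

    ftol: stop when |f(x)| < ftol         (criterion 1, residual test)
    xtol: stop when |x_new - x| < xtol    (criterion 2, step-size test;
          abs(x_new / x - 1) < xtol would be the relative variant)
    """
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if ftol is not None and abs(fx) < ftol:          # criterion 1
            return x
        x_new = x - fx / fprime(x)                       # Newton step
        if xtol is not None and abs(x_new - x) < xtol:   # criterion 2
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

# Both criteria find sqrt(2) as the root of x^2 - 2:
print(newton(lambda x: x*x - 2, lambda x: 2*x, 1.0, ftol=1e-12))
print(newton(lambda x: x*x - 2, lambda x: 2*x, 1.0, xtol=1e-12))
```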
 
  • #2
mfb
The question is what do you need. Do you need the function value to be small (take 1) or do you need the deviation in x to be small (take 2)?
 
  • #3
mfb said:
The question is what do you need. Do you need the function value to be small (take 1) or do you need the deviation in x to be small (take 2)?
I suppose either would do when you just want the root. Then again, take a function that comes very close to zero yet doesn't actually cross it: something like ##x^2 + 0.000001## with a tolerance of ##0.0001## would still be reported as having a root under the first criterion, but not under the second.
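A quick numerical check of this scenario, as a sketch using the numbers suggested above:

```python
# f(x) = x^2 + 1e-6 has no real root; its minimum value is 1e-6.
f = lambda x: x * x + 1e-6
df = lambda x: 2 * x

x = 1.0
for _ in range(20):
    x_new = x - f(x) / df(x)
    # Criterion 1 fires once |f| < 1e-4, i.e. for |x| < ~0.00995,
    # reporting a "root" even though f is never zero.
    if abs(f(x_new)) < 1e-4:
        print("criterion 1 'converged' at x =", x_new)
        break
    # Criterion 2 never fires here: the step (x^2 + 1e-6) / (2|x|)
    # is at least 1e-3 (its minimum, attained at |x| = 1e-3).
    if abs(x_new - x) < 1e-4:
        print("criterion 2 converged at x =", x_new)
        break
    x = x_new
```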
 
  • #4
For these functions the method won't converge nicely anyway. Sure, if your cutoff is large then you get a result from the method (in both cases), but if you try to improve it, the estimate will get worse.
 
  • #5
What I'm wondering is whether the second method is more robust, as it doesn't seem to have the particular problem I mentioned. Perhaps there are downsides to the second method as well, where the first method may perform better; I just haven't figured that out. My overall goal is to get an idea of the tradeoff being made.
 
  • #6
The second method basically does half of the next step, so typically you are "a bit closer" (you have checked how large the next step would be), but you also did more calculation.
 
  • #7
Kyouran said:
so the question here is what are the consequences of choosing one criterion over the other.
They're both going to converge at the same rate (usually quadratically, or linearly), but there is a big difference in speed if ##f(z)## is computationally expensive to evaluate, say millions of floating-point operations per evaluation: the criterion ##|f(z_{n+1})| < p## costs a function evaluation, while ##|z_{n+1}-z_n| < p## costs a single subtraction. That matters when, say, searching for a million roots of a million-degree polynomial.
 
  • #9
Well, I suppose for the second method, if the slope is quite steep locally, then ##x_{n+1}## may be close to ##x_n## even though the function value is still far from zero. This gives a risk of premature convergence. The first method does not have that problem, but has the problem mentioned earlier. Combining both could make it more robust in that sense, but yes, it would be more costly computationally.
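A sketch of that risk (the function and starting point here are purely illustrative): near a pole of ##\tan x## the slope is enormous, so the Newton step ##f/f' = \sin x \cos x## is tiny even though there is no root nearby.

```python
import math

f = math.tan
df = lambda x: 1.0 / math.cos(x) ** 2

x = 1.5707                  # just below pi/2, where tan is very steep
x_new = x - f(x) / df(x)    # Newton step = sin(x) * cos(x), ~1e-4 here

print(abs(x_new - x))       # ~1e-4: criterion 2 with tol 1e-3 stops here
print(abs(f(x_new)))        # ~5e3: criterion 1 sees we are nowhere near a root

# Combining both tests guards against both failure modes, at the cost of
# an extra function evaluation per iteration:
converged = abs(x_new - x) < 1e-3 and abs(f(x_new)) < 1e-3
print(converged)            # False: the combined criterion keeps iterating
```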
 
  • #10
Kyouran said:
Well, I suppose for the second method, if the slope is quite steep locally, then ##x_{n+1}## may be close to ##x_n## even though the function value is still far from zero. This gives a risk of premature convergence. The first method does not have that problem, but has the problem mentioned earlier.
This was addressed in the first reply to your original question:
mfb said:
The question is what do you need. Do you need the function value to be small (take 1) or do you need the deviation in x to be small (take 2)?

and I am not sure any more needs to be said, except perhaps to deal with

Kyouran said:
Combining both could make it more robust in that sense, but yes, it would be more costly computationally.

it could be amended to "Do you need the function value to be small (take 1), do you need the deviation in x to be small (take 2), or do you need both (take both 1 and 2)?"

Note that if your convergence test is ## |f(x_i)| < \varepsilon ##, you no longer have a guarantee of termination even when ## |x_i - x| \le \varepsilon_0 x ## (## \varepsilon_0 ## is machine epsilon here and ## x ## is the exact solution): when ##|f'|## is large at the root, the computed value of ##f## at even the best representable ##x_i## can stay above ## \varepsilon ##. Partly because of this, we usually use test 2.
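A sketch of that floating-point effect (the factor ##10^{12}## is chosen here just to make it visible): when ##|f'|## is large at the root, the computed ##|f(x_i)|## can stay far above ##\varepsilon## even at the double closest to the exact root, while the step test triggers immediately.

```python
import math

# f(x) = 1e12 * sin(x) has a root at pi, where the slope is -1e12.
f = lambda x: 1e12 * math.sin(x)
df = lambda x: 1e12 * math.cos(x)

x = math.pi                  # the closest double to the exact root pi
print(abs(f(x)))             # ~1.2e-4: test 1 with eps = 1e-8 never passes
print(abs(f(x) / df(x)))     # ~1.2e-16: the Newton step is below one ulp,
                             # so test 2 (and the iteration) stops at once
```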
 

1. What is the convergence criterion for Newton-Raphson?

The convergence criterion for Newton-Raphson is a mathematical condition used to decide when the iterative process of the Newton-Raphson method has reached an acceptable solution. It is typically based on the difference between successive iterates and is used to ensure that the method stops at an acceptable approximation of the root.

2. How is the convergence criterion calculated?

The convergence criterion is typically evaluated by taking the absolute value of the difference between the current and previous iterates. This value is compared to a predetermined tolerance, and if it falls below the tolerance, the method is considered to have converged.

3. Why is the convergence criterion important in the Newton-Raphson method?

The convergence criterion is important in the Newton-Raphson method because it ensures that the method converges to the correct solution. Without a proper convergence criterion, the method may continue to iterate indefinitely without reaching a satisfactory solution.

4. How does the choice of convergence criterion affect the accuracy of the Newton-Raphson method?

The choice of convergence criterion can significantly affect the accuracy of the Newton-Raphson method. If the tolerance level is set too high, the method may converge quickly but with a less accurate solution. On the other hand, if the tolerance level is set too low, the method may take longer to converge but with a more accurate solution.

5. Can the convergence criterion be adjusted during the Newton-Raphson method?

Yes, the convergence criterion can be adjusted during the Newton-Raphson method. If the method is not converging to the desired solution, the tolerance level can be decreased to allow for more iterations. Conversely, if the method is converging too slowly, the tolerance level can be increased to speed up the process.
