Questions on Newton's Method: Investigating Convergence & Initial Guesses


Discussion Overview

The discussion revolves around Newton's method for finding roots of functions, focusing on its convergence properties, the behavior of derivatives at roots, and strategies for selecting initial guesses. Participants explore theoretical aspects, practical implications, and numerical considerations related to the method.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants question whether Newton's method will always fail if the first derivative does not exist at the root, citing counterexamples such as ##f(x) = |x|##, for which the method converges in one step regardless of the initial guess.
  • Others argue that the behavior of Newton's method is highly dependent on the function's characteristics, noting that if the function is well-behaved, the method can be reliable, but if not, it may yield unexpected results.
  • There is a discussion about the lack of a general rule for selecting good initial guesses, with some suggesting that the complexity of functions makes it difficult to predict convergence behavior.
  • Participants mention that examining the function for ill-behaved regions is crucial before applying Newton's method, and that sometimes algebraic manipulation can help find a better-behaved function to work with.
  • Some suggest that if a function does not converge within a set number of iterations, it may be beneficial to restart the method with a different initial guess.

Areas of Agreement / Disagreement

Participants express differing views on the reliability of Newton's method in the presence of ill-behaved derivatives and the effectiveness of initial guesses. There is no consensus on a definitive approach for selecting initial guesses or on the method's failure conditions.

Contextual Notes

Participants highlight that the behavior of numerical methods can vary significantly between theoretical analysis and practical implementation, particularly when considering floating-point precision and the complexity of functions.

Who May Find This Useful

This discussion may be useful for individuals interested in numerical methods, particularly those exploring root-finding algorithms and their practical applications in computational settings.

PFuser1232
I have three questions regarding Newton's method.
https://en.m.wikipedia.org/wiki/Newton-Raphson#Failure_of_the_method_to_converge_to_the_root
According to this Wikipedia article, "if the first derivative is not well behaved in the neighborhood of a particular root, the method may overshoot, and diverge from that root."
However, the example cited, namely ##|x|^a## for ##0 < a < \frac{1}{2}##, is a case where the first derivative is not well behaved at the root, despite being defined for all ##x \neq 0##. So my question is: will Newton's method always fail if the first derivative doesn't exist at the root of the function?
Also, suppose we're given a function ##f(x)## and we're asked to find the root of the function, which happens to lie in the interval ##[a, b]##. Is there a general rule of thumb for finding a good initial guess that won't result in Newton's method diverging? What if ##f(x)## has a critical number in ##[a, b]##? Will the method always fail in that case? Or is there some chance it won't (depending on the "direction of convergence", for lack of a better term)?
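To make the cited failure concrete, here is a small sketch (my own illustration, not from the article): for ##f(x) = |x|^a## with ##a = 1/3##, the Newton step works out to ##x_{n+1} = x_n(1 - 1/a) = -2x_n##, so the iterates double in magnitude and diverge from the root at 0 for any nonzero start.

```python
# Newton's method on f(x) = |x|**a with a = 1/3, the Wikipedia failure case.
# Analytically the Newton step gives x_{n+1} = x_n * (1 - 1/a) = -2 * x_n,
# so the iterates double in magnitude and diverge for any x0 != 0.

def f(x, a=1/3):
    return abs(x) ** a

def fprime(x, a=1/3):
    # Derivative of |x|**a, defined for all x != 0
    sign = 1.0 if x > 0 else -1.0
    return a * abs(x) ** (a - 1) * sign

x = 0.01  # start close to the root at 0
iterates = [x]
for _ in range(5):
    x = x - f(x) / fprime(x)
    iterates.append(x)

print(iterates)  # magnitudes double each step: 0.01, -0.02, 0.04, -0.08, ...
```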
 
Remember you are talking numerical methods. So using words like "always" is kind of a slippery concept. The behaviour of a particular method, in abstract and considered using arbitrary precision hand-calculations, is sometimes very different from the behaviour of some code slapped together using floating-point variables of limited precision. With that caveat in mind...

If the function is well behaved then you can usually depend on Newton's method. If it is not, then you probably cannot. You might accidentally get some kind of answer, and that might accidentally resemble the answer you wanted. But you should not assume things will go that way. Generally, you will do some simple examination of the function you are passing to Newton's method and look for those ill-behaved places. Any time the function or the derivative behaves strangely you probably want to do something else. Maybe you can do some algebra to convert the problem into one where you are looking for the root of a well-behaved function instead. For example, if you find the function ##f(x,a)=|x|^a## troublesome, maybe some algebra will tell you that the roots of ##f(x,1)## will satisfy your requirements. Or some other such manipulation. If the computer program does not like ##f(x)##, maybe it likes a ##g(x)## that has the same roots but behaves nicely.

As to initial guesses: it depends on the nature of the function. There is probably no efficient algorithm that will work for an arbitrary function; it depends on what information you can supply in advance. For example, if you knew there was exactly one root on the interval ##[a, b]##, then you could do a simple binary search to get a quick approximation good to a couple of digits. However, there is no promise in advance, for an arbitrary function, that there is exactly one root in an interval. By comparing the signs of ##f(a)## and ##f(b)## you can tell whether a continuous function has an odd or an even number of zeroes in the interval: opposite signs mean an odd number, the same sign means an even number (possibly zero). After that, you need to know something about the general character of the function to have an idea how to approach it.
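The sign comparison and binary search just described can be sketched like this (a minimal illustration; the function, step count, and names are my own choices):

```python
# Bracketing sketch: if f(a) and f(b) have opposite signs, a continuous f
# has an odd number of roots in (a, b), and a few bisection steps narrow
# the bracket enough to give Newton's method a good starting point.

def bisect_guess(f, a, b, steps=10):
    """Assumes f(a) and f(b) have opposite signs; returns a rough root estimate."""
    if f(a) * f(b) > 0:
        raise ValueError("same sign at both endpoints: even number of roots (possibly none)")
    for _ in range(steps):
        m = (a + b) / 2
        if f(a) * f(m) <= 0:
            b = m  # sign change in the left half
        else:
            a = m  # sign change in the right half
    return (a + b) / 2

# Example: x**2 - 2 has exactly one root in [1, 2]; ten bisection steps give
# a couple of correct digits, enough to start Newton in its fast regime.
guess = bisect_guess(lambda x: x * x - 2, 1.0, 2.0)
print(guess)  # close to sqrt(2) ~ 1.414
```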

Usually that means the person writing the computer program builds some of this knowledge into the solver. Then the solver, if it is really clever, may try a couple of different methods to explore the character of the function before it looks for roots. It may, for example, sample the function over the interval to explore the variability. Just as one example. Or it may know, for example, that the tangent function must be dealt with carefully near 90 degrees.

It is straightforward to construct examples of functions that will confuse any particular numerical method. And it is up to the programmer to detect such strange cases and take appropriate action to account for them. Generally speaking, this will be part of the determination of the domain of applicability of the program.
 
DEvens said:

That was a thorough, well-written answer.
Thanks!
 
MohammedRady97 said:
will Newton's method always fail if the first derivative doesn't exist at the root of the function?
No. For ##f(x) = |x|##, Newton's method will always find the root in one step, regardless of the initial value ##x_0##.
Also, suppose we're given a function ##f(x)## and we're asked to find the root of the function, which happens to lie in the interval ##[a, b]##. Is there a general rule of thumb for finding a good initial guess that won't result in Newton's method diverging?
No. In general, the functions are complicated and the sequence of iterates ##x_i## is unknown in advance, so there is no way to guarantee that. If the problem area is far from the root, you may be lucky and the ##x_i## will stay away from the problem area. But my experience is that it often takes a few restarts at different initial points to get to the root. If you are writing a program to implement Newton's method, I recommend setting a reasonable iteration limit and restarting at another random point if it hasn't converged to a root within the limit.
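A minimal sketch of that iteration-limit-plus-restart strategy (the function name, tolerances, and restart budget are illustrative choices, not prescribed):

```python
# Newton's method with an iteration cap; if it has not converged within the
# cap (or hits a zero derivative), restart from a random point in [a, b].
import math
import random

def newton_with_restarts(f, fprime, a, b, tol=1e-10,
                         max_iter=50, max_restarts=20, seed=0):
    rng = random.Random(seed)
    x = (a + b) / 2  # first attempt from the midpoint
    for _ in range(max_restarts):
        for _ in range(max_iter):
            fx = f(x)
            if abs(fx) < tol:
                return x
            d = fprime(x)
            if d == 0:  # flat spot: Newton step undefined, give up and restart
                break
            x = x - fx / d
        x = rng.uniform(a, b)  # restart at another random point
    raise RuntimeError("no convergence within the restart budget")

# Example: cos(x) - x has a single root near 0.739 in [0, 1]
root = newton_with_restarts(lambda x: math.cos(x) - x,
                            lambda x: -math.sin(x) - 1, 0.0, 1.0)
print(root)  # ~ 0.7390851332
```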
 
