# Make Intelligent Initial Guesses for Newton-Raphson Programmatically


#### person123

TL;DR Summary
I'm looking to numerically solve equations in cases where the Newton-Raphson method only converges for certain initial guesses, so I'm wondering if there are ways to perform NR repeatedly with initial guesses that are more likely to converge.
I'm writing code to numerically solve a single-variable equation, currently with the Newton-Raphson method. Right now, I'm just using an initial guess of 1 and reporting a failure if it doesn't converge. While it usually works, it does of course fail for many functions with asymptotes or other discontinuous behavior. I can improve things slightly by just incrementing the initial guess (e.g. 1, -1, 2, -2, etc.) and performing Newton-Raphson repeatedly until it finds a solution or fails after n attempts. I'm wondering if there's a more intelligent method though, perhaps one that uses previous attempts to determine a better initial guess for the next try.
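Here's roughly what I mean, as a sketch (the helper names are my own placeholders, and I approximate the derivative with a finite difference since the function is only available numerically):

```python
import math

def newton(f, x0, tol=1e-10, max_iter=50, h=1e-7):
    """Newton-Raphson with a finite-difference derivative.
    Returns None on failure instead of diverging forever."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        dfx = (f(x + h) - fx) / h          # numerical derivative
        if dfx == 0 or not math.isfinite(dfx):
            return None
        x -= fx / dfx
        if not math.isfinite(x):
            return None
    return None

def solve_with_retries(f, n=10):
    """Retry from the guesses 1, -1, 2, -2, ... until one converges."""
    for k in range(1, n + 1):
        for x0 in (k, -k):
            root = newton(f, x0)
            if root is not None:
                return root
    return None
```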

To be clear, I know there are always going to be some functions with roots the method can't find -- I'm just looking for an efficient method that finds roots in most cases. I'm also just looking to find one root, not every root.

I've been using Newton-Raphson since that's what I'm familiar with, but if an entirely different algorithm would be better for this, I would be happy to try something else.

Are there any existing methods for something like this? Does anyone have ideas how this could be done?

Thanks!

WWGD
Maybe what they write about here:

https://www.sciencedirect.com/science/article/abs/pii/S0096300306015712

Basically, if you want an approximation for the roots of a polynomial, you find another polynomial whose coefficients are similar enough in value and whose roots are exactly solvable. Then you apply a first- or second-order perturbation approximation. This is also explained in a more elementary way in many lecture notes that can be found with a Google search.

I should have clarified that I want this for any arbitrary function; it looks like the article only applies to polynomials. Maybe I could use Lagrange polynomials to approximate the function as a polynomial and approximate the roots of that, but my sense is that this just makes things more complicated and would completely miss asymptotes of the actual function.

"Intelligent attempts" require some information about the function. When you say you want it to work for general functions, even with discontinuities and asymptotes, then I am afraid that you eliminate anything except some standard set of initial points. Of course, you could use the results of some initial points to define the following set of initial guesses. Pattern search methods might help.

person123
"Intelligent attempts" require some information about the function. When you say you want it to work for general functions, even with discontinuities and asymptotes, then I am afraid that you eliminate anything except some standard set of initial points. Of course, you could use the results of some initial points to define the following set of initial guesses. Pattern search methods might help.
Yes, I agree, I would have to start by just guessing standard points. I'm hoping though that I could use the function values at previous guesses and/or previous Newton-Raphson attempts to learn something about the function, guiding future guesses. For example, if the function is only finite at a certain region, maybe I could get it to continue guessing just in that region, and that would be more efficient.

I see a Wikipedia article on pattern search for optimization, but nothing for root finding. Would the idea be to turn it into an optimization problem with the objective function being f(x)?

I've also thought about starting by just guessing points until I find a value below 0 and a value above 0. Then I would use the bisection method or something similar first. For functions like tan(x), I think this would prevent values from blowing up compared to Newton-Raphson. Then, once it gets close to the root, I could switch to Newton-Raphson to speed things up. If, on the other hand, the function values start getting farther and farther from 0, it's probably an asymptote, so I would go back to searching for points. I think this would generally work, but I'm not sure about the details (e.g. when to switch to Newton-Raphson, when to conclude it's likely an asymptote, and whether I could use past guesses to inform later ones).

FactChecker
I see a Wikipedia article on pattern search for optimization, but nothing for root finding.
Good point. I missed that. You could turn a root-finding problem of ##f(x)## into a minimization problem of ##f(x)^2##.

You could try sampling the sign of the function at a set of points in an interval of interest for a root. For example, if you were looking for roots for ##x\in [0,1]##, you could sample the function at ##x=0, 0.1, 0.2, \ldots##. If you see the sign of the function change between any two consecutive points, then you could start your Newton-Raphson somewhere between those two points. I don't know if you will do much better than cooking up random heuristics without restricting the functions more.
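Something like this sketch (the interval endpoints and grid size are arbitrary choices):

```python
def find_bracket(f, a, b, n=11):
    """Sample f at n evenly spaced points on [a, b]; return the first
    consecutive pair with a sign change, or None if there is none."""
    xs = [a + i * (b - a) / (n - 1) for i in range(n)]
    vals = [f(x) for x in xs]
    for (x0, f0), (x1, f1) in zip(zip(xs, vals), zip(xs[1:], vals[1:])):
        if f0 == 0:
            return (x0, x0)      # landed exactly on a root
        if f0 * f1 < 0:
            return (x0, x1)      # sign change: a root lies in between
    return None
```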

FactChecker
Are you sure Newton-Raphson (or Newton's method) is as useful or interesting as you seem to think it is?

This MSE thread is a good explanation of why it might not be: https://math.stackexchange.com/ques...een-some-basic-numerical-root-finding-methods.
I realize that my terminology was actually off -- I meant the secant method, not Newton's method (I never take derivatives of the functions symbolically; I just approximate the derivative using two points).

That being said, Brent's method, mentioned in the thread, looks promising to me, and is vaguely similar to what I was considering: combining bisection with the secant method (which I wrongly called Newton's method).

I'm going to try it, in combination with something like what @Haborix suggested, and I'll post how it works.
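For reference, the core idea as I understand it can be sketched as a simplified Dekker-style step: take a secant step when it stays inside the bracket, otherwise fall back to bisection. (This is a rough sketch, not the full Brent algorithm with inverse quadratic interpolation; all names are my own.)

```python
def dekker(f, a, b, tol=1e-12, max_iter=100):
    """Find a root in [a, b] given f(a) * f(b) <= 0: secant step when it
    lands inside the bracket, plain bisection otherwise."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        if abs(fa) < abs(fb):       # keep b as the endpoint with smaller |f|
            a, b, fa, fb = b, a, fb, fa
        if abs(fb) < tol:
            return b
        m = (a + b) / 2
        # Secant step from the two bracket endpoints, guarding division by zero.
        s = b - fb * (b - a) / (fb - fa) if fb != fa else m
        # Accept the secant point only if it lies strictly inside the bracket.
        x = s if min(a, b) < s < max(a, b) else m
        fx = f(x)
        # Shrink the bracket while preserving the sign change.
        if fa * fx < 0:
            b, fb = x, fx
        else:
            a, fa = x, fx
    return b
```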

Bear in mind that once you have straddled the root, bisection is guaranteed to find a double precision result in less than 64 iterations. If you need to spend many more than that locating the neighbourhood of a root then the savings from attempting to accelerate this with a secant or higher order method are not going to be significant, and if the derivative is not smooth you may well make things worse.

What is the problem you are actually trying to solve?
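To illustrate the iteration count (a rough sketch): for a bracket like [1, 2], the midpoint rounds into an endpoint after about 52 halvings, the width of a double's mantissa, consistent with the bound above.

```python
def bisect_count(f, a, b):
    """Bisect until the midpoint is no longer distinguishable from an
    endpoint in double precision; returns (root, iteration_count)."""
    fa = f(a)
    count = 0
    while True:
        m = (a + b) / 2
        if m == a or m == b:    # no representable point strictly between a and b
            return m, count
        fm = f(m)
        count += 1
        if fa * fm <= 0:
            b = m
        else:
            a, fa = m, fm
```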

SammyS
Bear in mind that once you have straddled the root, bisection is guaranteed to find a double precision result in less than 64 iterations. If you need to spend many more than that locating the neighbourhood of a root then the savings from attempting to accelerate this with a secant or higher order method are not going to be significant, and if the derivative is not smooth you may well make things worse.

What is the problem you are actually trying to solve?
Good point. I think that realistic ideas like this can only be used if the OP drops the requirement that the algorithm must work in spite of discontinuities. That was too much to ask for.

SammyS
| | Behaviour: smooth | Behaviour: smooth at first or second etc. derivative | Behaviour: pathological |
| --- | --- | --- | --- |
| Cost to calculate: cheap | Doesn't matter | Bisection if bracket available, otherwise quadratic, cubic etc. | Bisection |
| Cost to calculate: expensive | Secant | Quadratic, cubic etc. | Problem-specific, may be intractable |

What is the problem you are actually trying to solve?
I'm making a website that links systems of equations with dynamic visuals. Usually it solves the system of equations symbolically with a CAS I made. However, when it can't solve an equation symbolically and it only contains a single variable, I make it solve it numerically as a backup.

For most of the examples I made, the secant method works fine, but it still fails in some cases. See the page for modeling a 3-bar linkage. (Click "RUN" in the upper left corner to solve and show the visual). For ##\theta## less than 0.86, it works fine, but after that, the method fails despite there being a solution. Here's a desmos plot, note that there's a very narrow window for the method to converge to the solution.

I could of course tailor the method to this particular case, but this is just an example -- I don't know what systems someone would want to solve. I know it will fail on some, but I'd like to be successful for as many as I can, while running quickly.
| | Behaviour: smooth | Behaviour: smooth at first or second etc. derivative | Behaviour: pathological |
| --- | --- | --- | --- |
| Cost to calculate: cheap | Doesn't matter | Bisection if bracket available, otherwise quadratic, cubic etc. | Bisection |
| Cost to calculate: expensive | Secant | Quadratic, cubic etc. | Problem-specific, may be intractable |
Got it, I'm leaning more and more heavily toward just doing the bisection method.

I should have clarified that I want this for any arbitrary function; it looks like the article only applies to polynomials. Maybe I could use Lagrange polynomials to approximate the function as a polynomial and approximate the roots of that, but my sense is that this just makes things more complicated and would completely miss asymptotes of the actual function.
Maybe a Taylor series, at least the first few terms, considering the region of convergence, etc., to estimate roots?

I'm writing code to numerically solve a single variable equation, currently with Newton Raphson's method.
How is the arbitrary function (that defines the equation) to be presented to the subprogram that tries to solve it? For example, in C++ programming, the subprogram could access the subprogram for the function by having a pointer-to-function passed to it. If your program architecture is that general, I don't see how you are passing any information that gives hints about the location of the zeroes of particular functions.

If the subprogram that solves the equation is also given the initial guesses, then are you asking how to provide those initial guesses for an "arbitrary" function? It seems to me that's equally hopeless. I think you need to specialize to questions like: if I have such-and-such a particular type of function, how can I make initial guesses about its zeroes?

FactChecker
The OP mentions discontinuities, singularities, etc. I wonder what the plan is to handle things like that. If there is a change of sign at a discontinuity or singularity, will it be considered a zero? If a change of sign is used, what will happen with even-ordered zeros, like ##f(x)=x^2##?
IMO, the OP should be reduced to a more reasonable problem where we can give better, specific advice.

WWGD
I went with the bisection method, guessing points first upward and then downward from 0, and running the method whenever the sign changes. It worked somewhat better than Newton-Raphson since it can't diverge, but it still fails in cases like ##f(x)=x^2##, as @FactChecker pointed out. There might not be much more I can do, and I'm okay with these occasional failures. Thanks everyone!
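For concreteness, what I ended up with looks roughly like this (the scan range, step count, and iteration limits are arbitrary choices I made):

```python
import math

def scan_for_bracket(f, start, stop, n=500):
    """Walk from start to stop in n steps; return the first interval where f
    changes sign, skipping points where f is not finite (asymptote gaps)."""
    prev = None
    for i in range(n + 1):
        x = start + i * (stop - start) / n
        fx = f(x)
        if not math.isfinite(fx):
            prev = None              # a gap breaks any pending bracket
            continue
        if prev is not None and prev[1] * fx < 0:
            return prev[0], x
        prev = (x, fx)
    return None

def bisect(f, a, b, iters=80):
    """Plain bisection on a sign-change bracket."""
    fa = f(a)
    for _ in range(iters):
        m = (a + b) / 2
        fm = f(m)
        if fa * fm <= 0:
            b = m
        else:
            a, fa = m, fm
    return (a + b) / 2

def find_root(f):
    """Scan upward from 0, then downward, bisecting on the first sign change.
    Misses even-order roots like f(x) = x**2, which never change sign."""
    for lo, hi in ((0.0, 50.0), (0.0, -50.0)):
        bracket = scan_for_bracket(f, lo, hi)
        if bracket is not None:
            return bisect(f, *bracket)
    return None
```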