Error bounds for Newton's Method

mcastillo356
TL;DR Summary
I would like to understand this theorem, or at least its statement. The textbook says the proof relies on the Mean-Value Theorem, but doesn't provide it, because it "is of little practical use". My question is simply to understand why, if ##K/(2L)<1##, the theorem shows that ##x_n## converges quickly to ##r## once ##n## becomes large enough that ##|x_n-r|<1##.
Hi, PF

Sometimes it is not easy to find roots of functions. Newton gave a nice clue: the Newton's Method formula ##x_{n+1}=x_n-\dfrac{f(x_n)}{f'(x_n)}##. Now that I have understood and practiced it, my concern is to comprehend what I've sketched in the summary. This is all taken from "Calculus, 9th edition, by Robert A. Adams and Christopher Essex". The following theorem gives sufficient conditions for the Newton approximations to converge to a root ##r## of the equation ##f(x)=0## if the initial guess ##x_0## is sufficiently close to the root ##r##.

Theorem 2 Error bounds for Newton's Method
Suppose ##f##, ##f'## and ##f''## are continuous on an interval ##I## containing ##x_n##, ##x_{n+1}##, and a root ##x=r## of ##f(x)=0##. Suppose also that there exist ##K>0## and ##L>0## such that for all ##x## in ##I## we have

(i) ##|f''(x)|\leq{K}## and

(ii) ##|f'(x)|\geq{L}##

Then

(a) ##|x_{n+1}-r|\leq{\dfrac{K}{2L}|x_{n+1}-x_n|^2}## and

(b) ##|x_{n+1}-r|\leq{\dfrac{K}{2L}|x_n-r|^2}##

(i) and (ii) assert that near ##r## the slope of ##y=f(x)## is not too small in size and does not change too rapidly. If ##K/(2L)<1##, the theorem shows that ##x_n## converges quickly to ##r## once ##n## becomes large enough that ##|x_n-r|<1##.
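Part (b) can be checked numerically. Here is a minimal sketch (my own illustration, in the spirit of the book's ##\sqrt{7}## example, not code from Adams & Essex), using ##f(x)=x^2-7## on ##I=[2,3]##, where ##|f''(x)|=2=K## and ##|f'(x)|=2x\geq 4=L##, so ##K/(2L)=1/4<1##:

```python
import math

def newton(f, fp, x0, n_steps):
    """Run n_steps of Newton's method and return every iterate."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(xs[-1] - f(xs[-1]) / fp(xs[-1]))
    return xs

# f(x) = x^2 - 7 on I = [2, 3]: |f''(x)| = 2 = K, |f'(x)| = 2x >= 4 = L
f = lambda x: x * x - 7
fp = lambda x: 2 * x
K, L = 2.0, 4.0
r = math.sqrt(7)

xs = newton(f, fp, x0=3.0, n_steps=4)
for x_n, x_next in zip(xs, xs[1:]):
    err_next = abs(x_next - r)
    bound = (K / (2 * L)) * abs(x_n - r) ** 2  # part (b) of the theorem
    print(f"|x_(n+1) - r| = {err_next:.3e} <= {bound:.3e}")
    assert err_next <= bound + 1e-14  # slack for floating-point rounding
```

The printed errors roughly square at each step (about 0.021, then 8e-5, then 1e-9), which is the "quick convergence" the remark after the theorem refers to.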

Thanks in advance!
 
Hi,

No quick responders so I'll try to give it a kick-off:

Is the proof of the theorem as presented here clear to you?

##\ ##
 
Thanks, @BvU. It's 2:13 AM now, but I will remain sleepless. I was stuck; now I'm hopeful.
 
Hi, @BvU, I'm still not familiar with Taylor's Theorem; I just found a proof that relies on the Mean-Value Theorem. Could you help me work out the aspects I find difficult?
file:///C:/Users/usuario/Desktop/Newton-MVT.pdf
But first of all, is this link ok?
 
Link looks OK to me. Can't say I find the top half of page 2 easy to follow, though...

##\ ##
 
@BvU, the same feeling. The second page is difficult for me:

Since ##T'(\hat{x})=0##, in the neighborhood of the root we will be able to select a decay constant ##c<1##, so the root is an attractive fixed point of ##T##. In particular let ##D=[\hat{x}-r,\hat{x}+r]## where ##|T'(x)|\leq{c<1}## for all ##x\in{D}##. (If ##f'(\hat{x})\neq{0}## we can always find an ##r## such that the inequality holds for the given constant ##c##.) Then if ##u,v\in{D}##, the Mean Value Theorem gives ##T(u)-T(v)=T'(\tilde{u})(u-v)## for some ##\tilde{u}\in{D}## between ##u## and ##v##. Thus ##|T(u)-T(v)|\leq{c|u-v|}## for all ##u,v\in{D}##. In particular, ##|x_n-\hat{x}|\leq{c\;|x_{n-1}-\hat{x}|}\leq{...}\leq{c^{n}|x_0-\hat{x}|}##

How can I get to this last expression from the previous one?
 
1- ##|T(u)-T(v)|\leq{c\;|u-v|}##, ##u,v\in{D_T}##

2- Domain of T: ##D=[\hat{x}-r,\hat{x}+r]##, where ##|T'(x)|\leq{c<1},\;\forall{x\in{D}}##

3- Recall MVT: ##\exists{\;u<\tilde{u}<v}##, s.t. ##T'(\tilde{u})=\dfrac{T(u)-T(v)}{u-v}##

4- Take absolute values on 3: ##|T'(\tilde{u})|=\dfrac{|T(u)-T(v)|}{|u-v|}##

5- Set ##u=x_{n-1}## and ##v=\hat{x}##. Remember we are looking for ##T(\hat{x})=\hat{x}##

6- Since ##T(x_{n-1})=x_n## and ##T(\hat{x})=\hat{x}##, this reads ##|x_n-\hat{x}|=|T(x_{n-1})-T(\hat{x})|\leq{c\,|x_{n-1}-\hat{x}|}##

7- Recursively, ##|x_n-\hat{x}|\leq\;c\;|x_{n-1}-\hat{x}|\leq...\leq\;c^n\;|x_0-\hat{x}|##

Right? Take it as a question.
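Steps 1-7 can also be checked numerically. A minimal sketch, with my own choice of ##f(x)=x^2-7## (not from the pdf), using the Newton map ##T(x)=x-\dfrac{x^2-7}{2x}##:

```python
import math

# Newton iteration viewed as a fixed-point map T for f(x) = x^2 - 7
T = lambda x: x - (x * x - 7) / (2 * x)
root = math.sqrt(7)  # the fixed point x-hat

# On D = [2.5, 2.8] one computes |T'(x)| = |1/2 - 7/(2 x^2)| <= 0.06,
# so c = 0.1 < 1 is a (generous) contraction constant on D.
c, x0 = 0.1, 2.8

x, errs = x0, []
for n in range(1, 5):
    x = T(x)
    errs.append(abs(x - root))
    # step 7: |x_n - x_hat| <= c^n |x_0 - x_hat|
    assert errs[-1] <= c ** n * abs(x0 - root)
```

The error is crushed far faster than the geometric bound ##c^n|x_0-\hat{x}|## predicts, because for Newton the contraction factor itself shrinks near the root; that is exactly the quadratic convergence.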
 
mcastillo356 said:
How can I get to this late expression from the previous one?
Yeah, sorry I didn't understand what the question was here. I'm sure you can see the unrolling from ##n## via ##n-1## all the way to ##0##, adding a factor ##c## at each step.

And ##T(u)-T(v)## to ##x_{n} - \hat x## doesn't seem nonsensical either...

mcastillo356 said:
Right? Take it as a question.
I'm not very good at rubber stamping approvals :wink: . All I can say is I can't find fault, but that doesn't really help you.

Thought I'd posted a snippet from Burden & Faires: Numerical Analysis, 9th ed., but perhaps it was removed by a mentor for copyright reasons? Anyway, they follow the same thread of reasoning to prove a theorem 2.6 (using a fixed point theorem 2.4).

More important is a concluding remark there: the proof is all well and good, but it doesn't tell us how big ##D## is. So its practical value is minimal. That means that in real life one starts up Newton with the best available initial guess and, if it doesn't converge, one retreats to a plan B to improve the starting point.

This all sounds corny for a 1D case, but for applications like flowsheeting you can have over ##10^5## equations with as many unknowns, and each and every one of those ##10^5## can throw you off a convergent track: there are minute numerical discontinuities (e.g. caused by if-statements in physical properties), numerical singularities, numerical differentiation, ##0=0## equations, and so on.
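The "plan B" retreat can be sketched in 1-D (a toy safeguard of my own devising, not the Burden & Faires algorithm): try the Newton step, and whenever it escapes a known sign-change bracket ##[a,b]## or ##f'## vanishes, fall back to one bisection step:

```python
def newton_with_fallback(f, fp, x0, a, b, tol=1e-12, max_iter=50):
    """Newton's method with a bisection 'plan B'.

    Assumes f(a) and f(b) have opposite signs. If a Newton step
    leaves [a, b] or f' vanishes, bisect the bracket instead.
    """
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        d = fp(x)
        if d != 0:
            x_new = x - fx / d          # plan A: the Newton step
        if d == 0 or not (a <= x_new <= b):
            x_new = 0.5 * (a + b)       # plan B: bisect the bracket
        # shrink the bracket so the fallback stays valid
        if f(a) * f(x_new) <= 0:
            b = x_new
        else:
            a = x_new
        x = x_new
    return x

# Root of x^2 - 7 in [0, 10], from the deliberately bad start x0 = 0.01
# (the first Newton step shoots to x ≈ 350, so bisection takes over once)
root = newton_with_fallback(lambda x: x * x - 7, lambda x: 2 * x,
                            x0=0.01, a=0.0, b=10.0)
```

In the ##10^5##-unknown setting this becomes trust regions, line searches, damping and the like, but the principle is the same: keep a cheap fallback that cannot diverge.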

##\ ##
 
  • #10
BvU said:
I'm sure you can see the unrolling from ##n## via ##n-1## all the way to ##0##, adding a factor ##c## at each step. [...] in real life one starts up Newton with the best available initial guess and, if it doesn't converge, one retreats to a plan B to improve the starting point.
@BvU, thanks. My intention now is to turn to a maths-oriented forum and put this thread on standby until I've clarified everything, i.e., understood the quadratic convergence proof provided, and Example 1, where ##\sqrt{7}## is approximated. I don't want to bore this site.
##D## will be one of the questions, but for now the target is achieved:
If ##x_0\in{D}##, then so are all of the ##x_n\in{D}##, and ##x_n\rightarrow{\hat{x}}## as ##n\rightarrow{\infty}##
Two personal remarks: I hope PF checks this thread at all times, and I hope it is all consistent, coherent, and useful to someone else. Ultimately that's what my vanity asks for.
 
  • #11
BvU said:
Is the proof of the theorem as presented here clear to you?

BvU said:
Thought I'd posted a snippet from Burden & Faires: Numerical analysis 9th ed, but perhaps it was removed by a mentor for copyright reasons ?
I don't see any history of your post being edited. Are you talking about the link in the first quote of yours that I copied?
 
  • #12
In that case I must have lost my draft.
[image attachment]

treated in the same way as in the link in #5, showing all the hypotheses in
[image attachment]

are satisfied. Then for the error estimate
[image attachment]


---

Coming back to #4: the entire Newton method is nevertheless based on the Taylor expansion of ##f## around ##p## (with ##\, f(p)=0\,##). So getting familiar with Taylor (Maclaurin) series is more or less required (and very useful for many other applications as well).
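For anyone joining the thread, the Taylor link is short (the standard first-order expansion with Lagrange remainder, ##\xi## between ##p## and ##x_n##): expand ##f## about ##x_n##, use ##f(p)=0##, drop the quadratic remainder, and solve the linear part for ##p##:

```latex
0 = f(p) = f(x_n) + f'(x_n)\,(p - x_n) + \frac{f''(\xi)}{2}\,(p - x_n)^2
\quad\Longrightarrow\quad
p \approx x_n - \frac{f(x_n)}{f'(x_n)} =: x_{n+1}
```

The discarded remainder is also where the error constant in Theorem 2 comes from: ##K## bounds ##|f''(\xi)|## from above and ##L## bounds ##|f'(x_n)|## from below, which gives the ##\frac{K}{2L}## factor.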

##\ ##
 
  • #13
BvU said:
the entire Newton method is nevertheless based on the Taylor expansion of ##f## around ##p## (with ##\, f(p)=0\,##). So getting familiar with Taylor (Maclaurin) series is more or less required (and very useful for many other applications as well).
Sure, but I still haven't got to grips with it, though I'll soon have to.

I've picked out two different quotes. About the one in the first post:
The casual expressions "not too small" and "too rapidly" kept me wondering. (i) and (ii) are just bounds.

Regarding the pdf, there is a typo in Example 1: it should be ##\sqrt{7}## instead of ##\sqrt{2}##. And for me it would be more comprehensible with ##|f''(x)|\leq B## instead of ##|f''(x)|< B##, and ##|f'(x)|\geq A## instead of ##|f'(x)|> A##, when it comes to introducing ##A## and ##B##.

Thanks, @BvU, PF!
 
  • #14
mcastillo356 said:
Regarding the pdf, there is a typo in Example 1. It should be ##\sqrt{7}## instead of ##\sqrt{2}##.
Yes, I see that the author inadvertently wrote ##\sqrt 2## in one place.
mcastillo356 said:
And for me it's more comprehensible if it would be ##|f''(x)|\leq B## instead of ##|f''(x)|< B##, and ##|f'(x)|\geq A## instead of ##|f'(x)|> A##, when it comes to introduce ##A## and ##B##.
This is of little consequence. All it is saying is that ##|f''(x)|## is bounded above and that ##|f'(x)|## is bounded below, away from zero. It makes no difference whether ##<## or ##\le## is used for the first, and similarly for the bound on ##f'(x)##.
 