1. Limited time only! Sign up for a free 30min personal tutor trial with Chegg Tutors
    Dismiss Notice
Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Homework Help: Convergence of sequence : x + cosx

  1. Aug 10, 2012 #1
    1. The problem statement, all variables and given/known data
    xn+1 = xn + cosxn , n>=1
    where x0 E [π/4 , 3π/4] = D.
    Show it converges, find rate of convergence.

    2. Relevant equations
    contraction theorem

    3. The attempt at a solution
    Setting a function f(x) = x+cosx we have f'(x) = 1 - sinx, f''(x)= -cosx.
    Now f' >= 0, so f is increasing.

    For x E D, the first derivative is less than one.
    Since f is increasing , f(π/4) = 1.4925 > π/4 and f(3π/4) = 1.6490 < 3π/4 , f is a mapping D -> D.
    So f is a contraction in D, thus the sequence xn+1 will converge to a fixed point x* Ε D.

    But doesn't the above hold for x0 only? Don't i have to prove that for all n, xn E D ?
  2. jcsd
  3. Aug 10, 2012 #2
    [itex] x_0 [/itex] is any point in D. The index n doesn't really matter; the recurrence is just a formal way of describing how to move "forward". That is, [itex] x_n=f^n(x_0) [/itex] where f^n denotes the composition with itself n times.

    In fact, look at the web page that you quoted in the relevant equations... "for any x in M..."
    Last edited: Aug 10, 2012
  4. Aug 10, 2012 #3
    Ah okay thanks.
    Strangely enough, i have done the same exact thing in other identical exercises, and that question didn't arise there.
    Thank you.

    So my proof is consistent?

    Edit: may i add another question on sequences.
    if you have an "backwards" sequence, of the form yn-1 = f(yn) , when asked to show convergence, do you proceed as if it was a 'forward' sequence, or do you have to solve for yn and continue from there?
  5. Aug 10, 2012 #4
    Yes. If you haven't seen a proof of the Banach fixed-point theorem, take a look at it here: http://en.wikipedia.org/wiki/Banach_fixed-point_theorem

    It isn't too difficult a proof to follow, and it will explain why every point in the space converges to the fixed point (it comes down to the fact that, in your case, the set D equipped with the standard metric, is complete).

    That depends what you mean. Let me give you an example...
    It is obvious that this sequence fixes the point 0. You can also verify that [itex] f [/itex] is a contraction mapping.

    The "backwards" sequence is [itex]y_{n-1}=2y_n=g(y_n)[/itex].

    To show this is actually the backwards sequence, just compose f with g, and show that the output is the same as the input. ie. [itex]g(f(x_n))=x_n[/itex].

    However, [itex] g [/itex] is NOT a contraction mapping, even though it has a fixed point.

    I challenge you to prove the following:
    Let [itex] f:D\rightarrow D [/itex] be a contraction mapping with a fixed point [itex] x^* [/itex], where [itex] D [/itex] is a closed interval on the real line. If there exists [itex] g:D\rightarrow D[/itex] with [itex]g(f(x))=f(g(x))=x[/itex], then [itex] g(x^*)=x^*[/itex], but [itex] g[/itex] it is not a contraction mapping.
    (HINT: Inverse Function theorem)

    In summary, I claim that if you have a sequence [itex]x_{n+1}=f(x_n)[/itex] for which f is a contraction mapping, then if a reverse sequence exists, then it does not induce a contraction mapping. You can think of this geometrically in the sense that, if a contraction mapping "contracts" a set to a point, then the reverse of this mapping should "blow up" everything near this point, into the original set.

    But for the question you actually asked... yeah, it doesn't matter if the sequence you're looking at is "backwards" or "forwards". In general, you can think of a "backwards" sequence as a "forward" sequence where the labelling is done in reverse... that is, you start at x0 and go to x(-1), x(-2) etc., instead of x0, x1, x2...
    Last edited: Aug 10, 2012
  6. Aug 11, 2012 #5
    Got it.

    In general, since the max|f'(x)| < 1 , there will be a point b in the inverse where f'(b) = 1/f'(a) >1, since f'(a) <= max|f'(x)| < 1

    Makes sense. Thanks again.
  7. Aug 11, 2012 #6
    I hope it's ok to add more questions on sequences.. (i reckon it's better forum-wise than making new threads.If not , i'm sorry + close this thread).
    Maybe a moderator could append " and general sequence questions".

    When examining the (global) convergence of Newton-Raphson[/PLAIN] [Broken] method.
    Having a function f(x), with first and second derivatives positive, and a positive root r,
    the sequence-building function of the method for approximating the root of f is

    xn+1= h(xn) = xn - f(xn) / f'(xn)

    Now in calculating the derivative of h, my textbook arrives to the easily rememberd shortcut of
    h'(xn) = f(xn) f''(xn) / [f'(xn)]2 ,
    so for xn > r , f(xn) >0 and h'(xn)>0 => h increasing (*)
    and for xn < r , f(xn) < 0 and h'(xn) <0 => h decreasing (**)

    Wouldn't this mean that h would diverge from the root of f?
    Despite that h is always decreasing (in this case, where f',f'' >0 and r >0) and approaches r.

    if xn >r , then f(xn) >0 which means that
    h(xn) = xn - f(xn)/f'(xn) = xn - [positive] =>h(xn) < xn (h decreasing)

    if xn <r , then f(xn) < 0 which means that
    h(xn) = xn - f(xn)/f'(xn) = xn - [negative] =>h(xn) > xn (h increasing)

    which are opposite from (*),(**) tell us.
    What am i missing?
    Last edited by a moderator: May 6, 2017
  8. Aug 11, 2012 #7
    I read your post, and I will admit, I was a bit confused too. I needed to refresh myself on properties of contraction mappings on the real line and what not. I think the main reason you're having trouble is that you're forgetting what the function h actually is...

    Let's abandon this [itex] x_n [/itex] business for a while. We have
    [tex] h(x)=x-\frac{f(x)}{f'(x)}. [/tex]
    So what do we know about h? Well, first off, our root r is a fixed point. Indeed, h(r)=r-f(r)/f'(r), but since r is a root of f, f(r)=0, so h(r)=r. What about the derivative of h at r? We have
    so h'(r)=0.

    What does h look like in a small region about x=r? Let's pick an easy example for f (this isn't a big deal, since given your assumptions on f and its derivatives, you can always derive similar estimates to the ones I'm about to). Suppose in a sufficiently small open ball around r, we can approximate f by a linear function: f~x-r. So its derivative f'~1, and f''~0 in this same small ball. Hence, in this small ball, h~x-(x-r)/1, therefore h~r, and h'~0. See the image below; the vertical bars denote this open interval (in reality, this interval would likely be much smaller, but I've drawn it larger so we can actually see it).


    In particular, in a sufficiently small open ball around r, we have [itex] |h'(x)|<1 [/itex]. Denote this ball by D. Knowing this, what can you say about the behaviour of h in D? What happens to the sequence [itex] x_{n+1}=f(x_n) [/itex] if [itex]x_0\in D[/itex]?

    EDIT: I drew f wrong (it's derivative is negative to the left of the interval), and forgot to label it, but that's ok.
    Last edited: Aug 11, 2012
  9. Aug 11, 2012 #8
    As for why there's this confusion over the derivative of h being negative to the left of the root, and positive to the right of the root, I would discuss this by analogy with ordinary differential equations (I'm going to assume you've taken an elementary course in ODEs)

    In ODEs, if you have a 1-dimensional homogeneous ODE x'=f(x), then you would think of the function f as telling you which direction your point x is going to move. For example, if f(r)>0, then a point sitting at x=r will move to the right (ie. positive). If f(r)<0, then such a point will move to the left (ie. negative). If f(r)=0, then a point starting at x=r doesn't move; r is a fixed point, or equilibrium.

    You might recall that to determine the stability of an equilibrium r, you take the derivative of f, and evaluate at r. If f'(r)>0, then it's unstable, and if f'(r)<0, then it's stable. Suppose it's stable, then f'(r)<0. If f' is continuous (which it should be, as this is often a requirement for standard ODE theorems to be applied), then there's an open ball around r, call it B, such that if x is in B, then f'(x)<0. But this doesn't mean that points to the left of r will necessarily move away from the equilibrium... f' at points near the equilibrium say NOTHING about its stability: only f'(r) matters.

    This is similar to discrete processes (eg. Newton's Method), except that the stability criterion is that we have to have |f'(r)|<1. See the proof of the banach fixed point theorem if you want to know why this is.

    So in the end, there are is one big thing to remember to avoid confusion...
    1) f'(x) doesn't tell you where x is going to move. f'(r) at a fixed point r determines it's stability (equivalently, Lipschitz constant in a sufficiently small neighbourhood of r).
    Last edited: Aug 11, 2012
  10. Aug 11, 2012 #9
    Also, what you have written here is
    x>r => h(x)<x
    x<r => x<h(x)

    However, this does not imply that h is increasing or decreasing. We know that h'(x)<0 for x<r, and h'(x)>0 for x>r. See the image below: this is not a contradiction.


    These questions are really helpful for me, since in about two weeks I'm going to be TA-ing a first-year calculus course which has a section on discrete systems, as well as Newton's Method. This is good review :)
  11. Aug 13, 2012 #10
    Thank you so much.
    I guess i was confused with the output of h(x) being also the input (points in the x axis) for f.

    Your illustrations have cleared things out a bit :)

    Edit again =(
    I have forgotten to ask, in the problem in the 1st post, about the rate of convergence r.

    f(x) = x + cosx , the fixed point is x* = f(x*) => x* = x* + cos(x*) => x* = pi/2 or 3pi/2 (latter rejected as x* must E [pi/4 , 3pi/4] )
    since f'(x*) = 1 - sin(pi/2) = 0 , r>1.

    How do i proceed from here?

    alternatively taking the limit as n-> +inf:
    lim (xn+1 - x*) / (xn - x*)p
    = lim (xn + cosxn - pi/2) / (xn - pi/2)p = lim 1/(xn - pi/2)p-1 + cosxn / (xn - pi/2) = ... ?
    Last edited: Aug 13, 2012
  12. Aug 13, 2012 #11
    I thought it would be something like that.

    I'm going to assume that the only definition of 'rate of convergence' you've looked at is this one:
    [tex] r=\lim_{n\rightarrow\infty}\frac{|x_{n+1}-x^*|}{|x_n-x^*|^p}
    where the order of the convergence is p>1. This is referred to as "superlinear convergence of order p". You seem to be assuming right now that your sequence converges superlinearly. Perhaps you should start by showing that it does converge superlinearly (ie. p=1), since this is fairly easy (hint: [itex]\lim_{n\rightarrow\infty} |a_n|=|\lim_{n\rightarrow\infty}a_n|[/itex], and try L'Hopital's rule). A sequence converges superlinearly if r=0 for p=1 (as above).

    Also, you seem to be forgetting your absolute value bars... this is causing you to break some rules (you end up splitting it into two fractions, but you can't, unless you apply the triangle inequality).

    I claim that it converges quadratically (p=2) with rate r=1/2. To show this, use the fact that |ab|=|a||b| along with the hint I suggested above. If you get stuck, let me know, and show me your work. However, I would ask that for the rest of this post, you make an effort to typeset your equations in LaTeX. It is much easier for me to read.
  13. Aug 14, 2012 #12
    Actually i got that from a worked example.
    My textbook instructions on RoC are:
    At least linear convergence , when exist C<1 , N such that:
    \left| x_{n+1} - x^*\right|\le C \left|x_n - x^* \right| , \forall n\in \mathbb N
    Convergence of order p , p >1 when exists C>0 such that:
    \left| x_{n+1} - x^*\right|\le C \left|x_n - x^* \right|^p
    It also adds that applying the mean value theorem and taking the limit:
    [tex]\lim_{n\rightarrow\infty}\frac{x_{n+1} - x^*}{x_{n} - x^*} = \phi^'(x^*)[/tex] , (where φ is the contraction mapping function)
    if the derivative at the fixed point different that 0 , the convergence order is exactly one.
    if different than 0, p> 1. (that's where i got the superlinear convergence thing)

    And if for a convergent sequence:
    [tex] \lim_{n\rightarrow\infty}\frac{(x_{n+1} - x^*)}{(x_{n} - x^*)^p} = a , a \ne 0[/tex] , then the convergence order is exactly p

    continuing in next post
  14. Aug 14, 2012 #13
    I showed superilnear convergence with the 3rd formula above.
    The limit:

    \lim_{n\rightarrow\infty}\frac{\left|x_{n+1} - x^*\right|}{\left|x_{n} - x^*\right|}=
    \left|\lim_{n\rightarrow\infty}\frac{x_{n+1} - x^*}{x_{n}-x^*}\right|=\left|\lim_{n\rightarrow\infty}\phi^'(k_n)\right| = \phi(x^*) = cos(\frac{\pi}{2}) = 0 [/tex] , where kn between xn,x*

    [tex]\lim_{n\rightarrow\infty} \frac{\left|x_{n+1} - x^*\right|}{\left|x_{n} - x^*\right|^2} =
    \frac{x_{n+1} - x^*}{(x_n - x^*)^2}

    \lim_{n\rightarrow\infty}\frac{cosx_n}{(x_n - x^*)^2}\right| +
    \left|\lim_{n\rightarrow\infty}\frac{x_n}{(x_n - x^*)^2}\right| -
    \left|\lim_{n\rightarrow\infty}\frac{x^*}{(x_n - x^*)^2}\right|
    = [/tex][tex]
    \left|\lim_{n\rightarrow\infty}\frac{cosx_n}{(x_n - x^*)^2}\right| +
    \left|\frac{x^*}{(x^* - x^*)^2}\right| - \left|\frac{x^*}{(x^* - x^*)^2}\right|
    =\left|\lim_{n\rightarrow\infty}\frac{cosx_n}{(x_n - x^*)^2}\right|

    Since xn -> x*= pi/2, i get 0 / 0.By de L'Hospital's rule i can get to:
    [tex]\left|\lim_{n\rightarrow\infty}\frac{cosx_n}{(x_n - x^*)^2}\right| =
    \left|\lim_{n\rightarrow\infty}- \frac{sinx_n}{2(x_n - x^*)}\right| = ..?

    where i'm stuck. Dividing by xn doesn't rid of the zero denominator
  15. Aug 14, 2012 #14
    Why are you splitting this up with triangle inequality? Let's assume that you continued along this line of reasoning, and eventually got your limit to converge to something. Say
    [tex]\lim_{n\rightarrow\infty} \frac{\left|x_{n+1} - x^*\right|}{\left|x_{n} - x^*\right|^2} \leq k [/tex] for some k>0. You have an inequality because you used the triangle inequality. Does this mean that you have proven that your sequence converges quadratically? No: just because this limit is less than or equal to k, doesn't mean it isn't zero. And if it's zero, then that means that the convergence is greater than order 2.

    Also, I should tell you that you applied the triangle inequality incorrectly. I assume you did something like this: [itex] |cosx_n+x_n-x^*|\leq|cosx_n|+|x_n-x^*|\leq|cosx_n|+|x_n|-|x^*|[/itex]. However, that last application is incorrect; what if we had (for the sake of argument) [itex]x^*=\pi/2[/itex] and [itex] x_n =\pi/4 [/itex]. Then [itex]|x_n-x^*|=\pi/4 [/itex], but [itex] |x_n|-|x^*|=\pi/4-\pi/2=-\pi/4[/itex]. So this logic would imply [itex] \pi/4\leq-\pi/4 [/itex], which is incorrect.

    Let's start over, but this time, we aren't going to split up the numerator. I'm going to denote [itex] L:=\lim_{n\rightarrow\infty}[/itex], just to make things easier to type.

    [tex]L\frac{|x_{n+1}-x^*|}{|x_n-x^*|^2}=\left|L\frac{x_{n+1}-x^*}{(x_n-x^*)^2}\right|=\left|L\frac{\cos x_n + x_n - x^*}{(x_n-x^*)^2}\right|[/tex]
    The limit of the top is zero, and the limit of the bottom is zero. We apply L'Hopital's rule...

    [tex]...=\left|L\frac{-\sin x_n +1}{2(x_n-x^*)}\right|.[/tex]
    Once again, the limit on the top is zero, and the limit on the bottom is zero.
  16. Aug 14, 2012 #15
    Well, you just unveiled a pretty tragic gap in my knowledge.Sloppiness, sloppiness everywhere.

    okay, the last fraction you apply de l hospital's again and you're there.

    So to summarize :
    To show that a sequence's convergence is of order p, you have to show that
    \lim_{n\rightarrow\infty} \frac{\left|x_{n+1} - x^*\right|}{\left|x_{n} - x^*\right|^p} = k
    , and in that case , if k >0 then the sequence converges with order p , with a rate of convergence k.

    If i'm not mistaken, p shows overall the speed of convergence.
    What does the "ratio" , intuitively show?
    (i'm asking this because in my textbook (greek) ,the order is named "class of convergence" and the ratio is just an unnamed constant ,by which to distinguish sub/superlinearity.)

    You claimed that p was 2.Er, how did you come up to that claim?
    Am i supposed to try iteratively values for p until i reach a non-existent limit?
    Do you have any general strategy advice?

    also,how do you make tex code not force a new line?
    Last edited: Aug 14, 2012
  17. Aug 14, 2012 #16
    Don't beat yourself up over it too much. I think we've all been there at some point or another :smile:

    Yes, p gives you a sense of how fast the convergence is in an asymptotic sense. If p=2, we say the convergence is quadratic, p=3 is cubic, etc.

    What the ratio intuitively means can be inferred by expressing it in a way similar to how you did in an earlier post;
    [tex] \lim_{n\rightarrow\infty}|x_{n+1}-x^*|=k\lim_{n\rightarrow\infty}|x_n-x^*|^p[/tex]
    where k is the rate of convergence as calculated (eg. for your problem, it was k=1/2 for p=2).

    What this means is that, for n sufficiently large (or equivalently, if x_n is close to x*), we have
    [tex] |x_{n+1}-x^*|\approx k|x_n-x^*|^p[/tex]

    Now, if [itex]x_n[/itex] and [itex]x^*[/itex] are very close, then [itex]|x_n-x^*|[/itex] is going to be very small. That is, [itex]|x_n-x^*|<<1[/itex]. But then [itex]|x_n-x^*|^p[/itex] is going to be even smaller. As p increases, this number decreases.

    To summarize, what it says is that, asymptotically, as your sequence tends toward the fixed point, the distance between the fixed point and the "next" step is equal to the p'th power of the distance between the fixed point and the "current" step, up to multiplication by a constant k.

    How I reached that claim... I think I did the calculation with p=1, and then noticed that if I set p=2, I would get another indeterminate form (0/0). Then I did it, and got the answer.

    In general, although I haven't done many problems like this, I would guess that L'Hopital's rule is used fairly often. So what you could do, is go about it like this... Let p be arbitrary, and as large as necessary. We have
    [tex] \lim_{n\rightarrow\infty}\frac{|f(x_n)-x^*|}{|x_n-x^*|^p}=\left|\lim_{n\rightarrow\infty}\frac{f(x_n)-x^*}{(x_n-x^*)^p}\right|.[/tex]

    This is already an indeterminate form of type 0/0, since f(x_n)=x_{n+1} and x_n both tend to x* as n approaches infinity. So you can apply L'Hopital's rule right off the bat, and get

    [tex] \left|\lim_{n\rightarrow\infty}\frac{f(x_n)-x^*}{(x_n-x^*)^p}\right|= \left|\lim_{n\rightarrow\infty}\frac{f'(x_n)}{p(x_n-x^*)^{p-1}}\right|[/tex]

    If the limit of the numerator is zero, then you can apply L'Hopital's rule again, since the denominator will always have zero limit. And you can keep doing over and over so long as the limit of the top is zero. However, by choosing p carefully, if the top limit is ever nonzero, you can make it so that the bottom limit is nonzero as well, once you apply L'hopital's rule enough (ie. p times)

    The above outline should help you prove the following, if you so desire.

    If the m'th derivative of f(x_n) exists and is nonzero, and all previous derivatives are zero (in the limit as n approaches infinity), then the order of convergence is m, and the rate is [itex]\frac{|h|}{m!}[/itex], where [itex]\lim_{n\rightarrow\infty}f^{(m)}(x_n)=h[/itex].

    Use the command itex, instead of tex. Eg.
  18. Aug 15, 2012 #17
    Ah, i solved an identical one yesterday,so i can't have the "i figured this out on my own" satisfaction.
    Via taylor expansion:
    f(x_n) = x_{n+1} = f(x^*) + (x_n-x^*)f'(x^*)+...+(x_n-x^*)^{(m-1)}\frac{f^{(m-1)}(x^*) }{(m-1)!} +(x_n-x^*)^m \frac{f^{(m)}(k) }{m!} =\\

    =x^* + (x_n-x^*)^m \frac{f^{(m)}(k) }{m!}
    \iff \frac{x_{n+1} - x^*}{(x_n-x^*)^m } = \frac{f^{(m)}(k) }{m!} \\
    \iff \left | \lim_{n\rightarrow\infty} \frac{x_{n+1} - x^*}{(x_n-x^*)^m } \right|= \left | \lim_{n\rightarrow\infty} \frac{f^{(m)}(k) }{m!} \right| = \frac{\left| h \right|}{m!}

    where k between [itex]x_n , x^* [/itex].


    To apply the general strategy, here's trying to find the order of convergence for [itex]f(x_n) = x_n - sinx_n[/itex]. [itex]L [/itex] standing for [itex]\lim_{n\to\infty}[/itex] and [itex]x^* = 0[/itex] the fixed point:
    [tex]L \frac{\left|x_n-sinx_n-x^*\right|}{\left|x_n-x^*\right|^p} = \frac{0}{0} =
    L \frac{\left|1-cosx_n\right|}{p\left|x_n-x^*\right|^{p-1}} = \frac{0}{0} =
    L \frac{\left|sinx_n\right|}{p(p-1)\left|x_n-x^*\right|^{p-2}} = \frac{0}{0} =
    L \frac{\left|cosx_n\right|}{p(p-1)(p-2)\left|x_n-x^*\right|^{p-3}}
    for [itex]p = 3 , L \frac{\left|cosx_n\right|}{3(3-1)(3-2)\left|x_n-x^*\right|^{3-3}} =

    So [itex]f(x_n)[/itex] converges with p= 3 and r = 1/6. Could you verify/correct please?]


    To wrap things up ,I presume convergence here is similar if not identical to asymtotic growth of functions. (i'm a computer science undergrad).
    k is like the machine-dependent constant term "neglected", and when we say that a function [itex]f \in O(n^p)[/itex] , all factors [itex]n^k , k<p[/itex] are discarded.
    Similarly , when a sequence converges with an order of p, it is "also true" that it converges with speed [itex] k <p[/itex].

    Anyway thanks for all your help.Since you cleared all of my "approximation of non linear functions" chapter questions, i'll make another thread if any more arise.
    Last edited: Aug 15, 2012
  19. Aug 15, 2012 #18
    I can confirm that I get the same answer for f(x_n)=x_n-sin(x_n); the order is p=3, and r=1/6.

    Yes, the principle is very similar to asymptotic growth of functions. The only difference is that, in asymptotic growth, you traditionally look at the behaviour of the function as the argument gets large. In this case, we are looking at the behaviour not of the function, but its distance from a particular point, as the argument approaches the particular point. If you want, you can look at it like this:

    Let d(x_n)=|f(x_n)-x*|, and let g(x_n)=|x_n-x*|^p. Then [itex]d(x_n)\in O(g(x_n))[/itex] as n grows large (not in the sense of x_n->infinity), since we have d(x_n)=g(x_n) as n grows to infinity. Since we can never have d(x_n)>g(x_n), we are allowed to write [itex]d(x_n)\leq r g(x_n) [/itex] for sufficiently large n. It's not the usual interpretation, but that's the feel of it.

    And of course, you are very welcome. This discussion has been illuminating :smile:
Share this great discussion with others via Reddit, Google+, Twitter, or Facebook