
Is this problem ill-conditioned or not?

  1. Apr 28, 2012 #1
    Say we want to know the zero of a function f, then of course we can use the iterative Newton-Raphson method, given by [itex]x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}[/itex].

    But say f typically takes really big values, and say you can, without much trouble, rewrite your problem as finding the zero of a function g whose values are much smaller. Is it then "better" to use [itex]x_{n+1} = x_n - \frac{g(x_n)}{g'(x_n)}[/itex] instead of [itex]x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}[/itex]? Or does it make no difference? After all, the "bigness" of f doesn't seem to influence the relative error on x: if [.] denotes the computed value of a quantity, then [itex][f(x_n)] = (1+\epsilon_r^{(1)})f(x_n)[/itex] and [itex][f'(x_n)] = (1+\epsilon_r^{(2)})f'(x_n)[/itex], so [itex]\frac{[f(x_n)]}{[f'(x_n)]} \approx \frac{f(x_n)}{f'(x_n)}(1+\epsilon_r^{(1)}-\epsilon_r^{(2)})[/itex], a relative error of at most about [itex]2\epsilon_r[/itex], so no problem there.

    On the other hand, although the method with the big f might be well conditioned(?), it might still be better to use g, since you'll avoid an overflow?
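    (Just to check my own reasoning numerically: below is a quick sketch with a made-up f and a scaled-down version g = f/c. The function and the value of c are arbitrary; in exact arithmetic the scale factor cancels in f/f', so the iterates should be identical.)

[code]
c = 1.0e12
f  = lambda x: c * (x**2 - 2.0)      # huge values, zero at sqrt(2)
df = lambda x: c * (2.0 * x)
g  = lambda x: x**2 - 2.0            # same zero, ordinary-sized values
dg = lambda x: 2.0 * x

def newton(fun, dfun, x0, n=8):
    x = x0
    for _ in range(n):
        x = x - fun(x) / dfun(x)     # the scale factor cancels in fun/dfun
    return x

print(newton(f, df, 1.0))            # 1.4142135623...
print(newton(g, dg, 1.0))            # identical iterates, same answer
[/code]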
     
  3. Apr 28, 2012 #2

    Office_Shredder

    Staff Emeritus
    Science Advisor
    Gold Member

    Simply scaling your function (for example using g(x) = f(x)/100000) does nothing to help your problem. From an analytic standpoint there's no reason to switch functions unless g(x) has a shape better suited to solving the problem, and that in itself has nothing to do with the relative magnitudes of f(x) and g(x). Besides, 99.9% of the time, given a function f(x), it would be impossible to construct a wholly different function g(x) that you know has the same zeros as f(x).

    Picking a function with smaller values might help if f(x) takes on values which are obscenely large, but if g(x) ends up taking values that are obscenely small you'll have the same problem in the other direction.
     
  4. Apr 28, 2012 #3
    Thanks for replying. Maybe I'll make my situation a bit more concrete:

    I need to find the eigenvalues of a certain matrix, i.e. solve [itex]|A - \lambda I| = 0[/itex], but it turns out that the elements of the matrix A are multiplied by a large constant, i.e. [itex]A = \alpha B[/itex] with B having 1's and 2's as elements and [itex]\alpha[/itex] being pretty big. So I was wondering if it would be beneficial to solve instead [itex]|B - \tilde \lambda I| = 0[/itex] and then at the end set [itex]\lambda = \alpha \tilde \lambda[/itex].

    Intuitively I expected it to make a difference, but thinking about it more closely I can't really justify why it would. (NB: to evaluate the determinant I'm using a recursive formula, exploiting that A is tridiagonal.) Perhaps it does make a difference, just not in the Newton-Raphson iteration itself but in the evaluation of the determinant, by avoiding catastrophic cancellations(?)
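    To make this concrete, here is a rough sketch of what I mean (the recurrence below is the standard three-term formula for a tridiagonal determinant; the matrix size, the value of [itex]\alpha[/itex] and the trial [itex]\lambda[/itex] are just placeholders):

[code]
import numpy as np

alpha = 1.0e80                        # placeholder for the big constant
n = 12
a = 2.0 * np.ones(n)                  # diagonal of B (1's and 2's in my case)
b = 1.0 * np.ones(n - 1)              # off-diagonal of B

def det_tridiag(diag, off, lam):
    # D_k = (diag_k - lam) * D_{k-1} - off_{k-1}^2 * D_{k-2}
    d_prev, d = 1.0, diag[0] - lam
    for k in range(1, len(diag)):
        d_prev, d = d, (diag[k] - lam) * d - off[k - 1] ** 2 * d_prev
    return d

lam = 1.0                             # trial eigenvalue, order 1
print(det_tridiag(alpha * a, alpha * b, lam))   # blows up to inf/nan
print(det_tridiag(a, b, lam / alpha))           # stays finite
[/code]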
     
    Last edited: Apr 28, 2012
  5. Apr 28, 2012 #4

    AlephZero

    Science Advisor
    Homework Helper

    The amount of numerical ill conditioning when doing operations on a matrix usually depends on the "condition number", the ratio of the largest to the smallest singular value (for a symmetric matrix, the ratio of the largest to smallest eigenvalue magnitude). A bigger condition number means the matrix is more nearly "numerically singular". But unless ##\log_{10}## of the condition number is comparable to the number of significant digits used in the calculation (about 16, for most serious computer calculations), this is not usually an issue. There are various ways to estimate the condition number without actually finding the eigenvalues. One way is to start by doing an LU factorization of the matrix, which is a well behaved process even if the L and U factors it produces are very badly conditioned.

    Scaling all the terms of a matrix doesn't change its condition number. For some numerical methods of finding eigenvalues, there is an advantage in solving ##|(A + sI) - (\lambda+s)I| = 0##, in other words adding or subtracting a constant ##s## to the original eigenvalues, and making the matrix more or less diagonally dominant.
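    Both of those facts are easy to check numerically. A minimal sketch (the matrix below is just a random symmetric example; numpy's cond is the ratio of the largest to smallest singular value):

[code]
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
A = A + A.T                                  # random symmetric test matrix

# Scaling every entry leaves the condition number unchanged.
print(np.linalg.cond(A), np.linalg.cond(1.0e6 * A))

# Shifting by s*I shifts every eigenvalue by s and nothing else.
s = 3.7
print(np.sort(np.linalg.eigvalsh(A)))
print(np.sort(np.linalg.eigvalsh(A + s * np.eye(5)) - s))
[/code]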

    There was a significant breakthrough in the general subject of error analysis in the 1960s, when Wilkinson realized that instead of asking "how wrong is the numerical answer compared with the exact answer?", which is usually mathematically intractable, there is an easier question that is just as useful in practice: "if the answers I got are the exact answers to the wrong problem, how wrong is the wrong problem?"

    For example if you are solving the matrix equation ##Ax = b## and you get an approximate answer ##x'##, trying to estimate ##x - x'## is hard. But in a practical physics or engineering situation where ##A## and/or ##b## are not known exactly, it can be more useful to ask whether there is a "small" vector ##b'## such that ##Ax' = b + b'##, or a "small" matrix ##A'## such that ##(A + A')x' = b##.
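    The backward-error question is almost trivial to ask in code, which is the whole point. A sketch with made-up data (x' here is simply whatever your solver returned):

[code]
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 100))
b = rng.standard_normal(100)
x_approx = np.linalg.solve(A, b)             # the "approximate" answer x'

# The forward error ||x - x'|| needs the exact answer, which we don't have.
# The backward error only needs the residual: A x' = b + b' with b' = A x' - b.
b_prime = A @ x_approx - b
print(np.linalg.norm(b_prime) / np.linalg.norm(b))   # tiny => x' exactly solves a nearby problem
[/code]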
     
  6. Apr 28, 2012 #5
    But then how do you explain that it does have a major effect on my numerical computations? Without the procedure described in post #3 I get ridiculous results, whereas with it, it works smoothly and effectively...
     
  7. Apr 28, 2012 #6

    AlephZero

    Science Advisor
    Homework Helper

    Are you using an eigensolution routine from a "professional" library, or your own code? Are you using the most appropriate numerical method?

    Seeing the problem is probably a necessary (but not sufficient!) condition for getting an explanation.
     
  8. Apr 28, 2012 #7
    My own code. I don't want to bother you with code, but the essence of what is happening is pretty simple.

    If you like, you can take a look at the page I attached. I'm basically using that iterative formula to calculate the determinant of a tridiagonal matrix. In my case the elements a, b, c are pretty big (and if it matters: the eventual eigenvalues, at least the smallest ones, are "normal-sized", i.e. order 1). If I simply evaluate that determinant as it is, I get wrong/NaN results, but if I first replace a, b, c by [itex]\alpha a, \alpha b, \alpha c[/itex] (with [itex]\alpha[/itex] small) and then solve for lambda, dividing lambda by alpha at the end, I get good results.

    Maybe the specific formula (i.e. the one on the attached page) induces catastrophic cancellations? However, if that is the case, then I'm not sure why I have no troubles at all after rescaling, since then my lambda is typically really small, so relative to lambda, a and b are still huge, such that operations like in (5.122) should behave funnily(?)

    EDIT: By the way, thanks for following up
     

    Attached Files:

  9. Apr 28, 2012 #8

    AlephZero

    Science Advisor
    Homework Helper

    OK. If you are trying to find the eigenvalues by determinant search, you don't really need the value of the determinant, only its sign. But the determinant is likely to be of the order of the product of all the diagonal terms of the matrix, so you need to keep track of a scale factor so that it stays within the range of your floating point arithmetic. You could either scale the matrix, or represent the D's by an integer power of 2 (or 10, if you prefer) times a floating point number fairly close to 1. That would not overflow until the determinant had on the order of ##10^8## decimal digits, which should be big enough for most situations!
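    A sketch of the scale-factor version of the recurrence (math.frexp splits a float into a mantissa in [0.5, 1) and a power of 2; the matrix data below is just a stand-in for your a, b):

[code]
import math

def det_tridiag_scaled(diag, off, lam):
    # Returns (m, e) with det(T - lam*I) = m * 2**e, so nothing ever overflows;
    # the sign of the determinant is just the sign of m.
    d_prev, d, e = 1.0, diag[0] - lam, 0
    for k in range(1, len(diag)):
        d_prev, d = d, (diag[k] - lam) * d - off[k - 1] ** 2 * d_prev
        m, p = math.frexp(d)                        # d = m * 2**p, 0.5 <= |m| < 1
        d, d_prev, e = m, math.ldexp(d_prev, -p), e + p
    return d, e

diag = [2.0e80] * 200                               # huge entries, as in your problem
off  = [1.0e80] * 199
m, e = det_tridiag_scaled(diag, off, 1.0)
print(m, e)                                         # finite mantissa plus a big exponent
[/code]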

    Alternatively, you could do an LU factorization of matrix 5.121, which is about the same amount of work as evaluating the determinant but will not overflow, and count the negative pivots along the diagonal of U; by Sylvester's law of inertia that tells you how many eigenvalues lie below the given value of ##\lambda## (the same information a Sturm sequence gives you).
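    For a symmetric tridiagonal matrix the pivots can be run up without ever forming the factors, and only their signs are used, so the size of the entries never matters. A minimal sketch (with a crude guard against an exact-zero pivot):

[code]
import numpy as np

def count_eigs_below(diag, off, lam, pivmin=1.0e-300):
    # Pivots of the (implicit) LDL^T factorization of T - lam*I:
    # by Sylvester's law of inertia, #negative pivots = #eigenvalues < lam.
    count, d = 0, 1.0
    for k in range(len(diag)):
        d = (diag[k] - lam) - (off[k - 1] ** 2 / d if k > 0 else 0.0)
        if abs(d) < pivmin:
            d = -pivmin                  # crude guard against a zero pivot
        if d < 0.0:
            count += 1
    return count

diag = np.full(50, 2.0e80)               # stand-in data with huge entries
off  = np.full(49, 1.0e80)
print(count_eigs_below(diag, off, 0.5e80))   # eigenvalues below 0.5e80
print(count_eigs_below(diag, off, 5.0e80))   # all 50 lie below 5e80
[/code]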
     
    Last edited: Apr 28, 2012
  10. Apr 29, 2012 #9

    AlephZero

    Science Advisor
    Homework Helper

    Just to add: this has nothing to do with numerical conditioning; it is just the fact that the determinant is too big to represent as a floating point number. You would get the same problem running the code even if the matrix were diagonal and there were no cancellation at all.

    For a large matrix, the matrix elements don't even have to be "big". For example, if the elements of a matrix of order 500 were about 10, the determinant would typically be around ##10^{500}##, but IEEE 64-bit floating point can only represent numbers up to about ##10^{308}##. Yet calculating all the eigenpairs of an arbitrary matrix of order 500 is not usually a big deal numerically in 64-bit arithmetic.

    FWIW, the old LINPACK numerical library included determinant calculation routines, and returned the value as a power of 10 times a number between 1 and 10 in magnitude. The newer LAPACK library doesn't contain any determinant routines at all, though you can get the determinant easily from other routines in the library if you really really want it.
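    For example, numpy will give you the logarithm of the determinant even when the determinant itself overflows (the order 500 and entry size of about 10 are just the numbers from the previous paragraph):

[code]
import numpy as np

rng = np.random.default_rng(2)
A = 10.0 * rng.standard_normal((500, 500))   # order 500, entries around 10

print(np.linalg.det(A))                      # overflows: prints inf (or -inf)
sign, logdet = np.linalg.slogdet(A)          # sign and natural log of |det|
print(sign, logdet / np.log(10.0))           # i.e. det = sign * 10**(this many digits)
[/code]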
     
  11. Apr 29, 2012 #10
    Thank you, your posts were enlightening.
     