# Is this problem ill-conditioned or not?

1. Apr 28, 2012

### nonequilibrium

Say we want to know the zero of a function f, then of course we can use the iterative Newton-Raphson method, given by $x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$.

But say f typically takes really big values, and say you can actually, without much trouble, rewrite your problem as finding the zero of a function g which takes much smaller values. Is it then "better" to use $x_{n+1} = x_n - \frac{g(x_n)}{g'(x_n)}$ instead of $x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$? Or does it make no difference? After all, the "bigness" of f doesn't seem to influence the relative error on x: if [.] denotes the computed value of a quantity, then $[f(x_n)] = (1+\epsilon_r^{(1)})f(x_n)$ and $[f'(x_n)] = (1+\epsilon_r^{(2)})f'(x_n)$, so that $\frac{[f(x_n)]}{[f'(x_n)]} \approx \frac{f(x_n)}{f'(x_n)}(1+\epsilon_r^{(1)}-\epsilon_r^{(2)})$, a relative error of at most about $2\epsilon_r$, so no problem there.

On the other hand, even if the method with the big f is well conditioned, it might still be better to use g, since you'd avoid an overflow?

2. Apr 28, 2012

### Office_Shredder

Staff Emeritus
Simply scaling your function (for example using g(x) = f(x)/100000) does nothing to help your problem. There's no reason, from an analytic standpoint, to switch functions unless g(x) has a shape which is better suited for solving the problem, and that in itself has nothing to do with the relative magnitudes of f(x) and g(x). Besides, 99.9% of the time, given a function f(x), it would be impossible to construct a wholly different function g(x) that you know has the same zeros as f(x) does.

Picking a function with smaller values might help if f(x) takes on values which are obscenely large, but if g(x) ends up taking values that are obscenely small you'll have the same problem in the other direction.
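The claim that pure scaling changes nothing can be checked directly: a constant factor cancels in the Newton step $f(x_n)/f'(x_n)$, so the iterates are (up to rounding) identical. A minimal Python sketch; the test function $x^2 - 2$ and the scale factor are made up for illustration:

```python
def newton(f, df, x0, steps=20):
    """Plain Newton-Raphson iteration: x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(steps):
        x = x - f(x) / df(x)
    return x

# f(x) = x^2 - 2 with root sqrt(2); g is f scaled by a huge constant.
f  = lambda x: x * x - 2.0
df = lambda x: 2.0 * x
C  = 1.0e8
g  = lambda x: C * (x * x - 2.0)
dg = lambda x: C * 2.0 * x

r_f = newton(f, df, 1.0)
r_g = newton(g, dg, 1.0)
# The constant C cancels in g(x)/g'(x), so both runs converge to sqrt(2).
```

The only thing scaling can change is whether intermediate values of $g$ and $g'$ over- or underflow, which is exactly the point made above.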

3. Apr 28, 2012

### nonequilibrium

Thanks for replying. Maybe I'll make my situation a bit more concrete:

I need to find the eigenvalues of a certain matrix, i.e. solve $|A - \lambda I| = 0$, but it turns out that the elements of the matrix A are multiplied by a large constant, i.e. $A = \alpha B$ with B having 1's and 2's as elements and $\alpha$ being pretty big. So I was wondering if it would be beneficial to solve instead $|B - \tilde \lambda I| = 0$ and then at the end set $\lambda = \alpha \tilde \lambda$.

Intuitively I expected it to make a difference, but thinking about it more closely I can't really justify why it would. (NB: to evaluate the determinant I'm using a recursive formula, exploiting the fact that A is tridiagonal.) Perhaps it does make a difference, but not in the Newton-Raphson iteration itself, rather in the evaluation of the determinant, by avoiding catastrophic cancellation(?)
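The recursive determinant evaluation mentioned here can be sketched as the standard three-term recurrence for tridiagonal matrices. The names a, b, c for the main, sub- and superdiagonals are assumptions, since the attached formula isn't reproduced in the thread:

```python
def tridiag_charpoly(a, b, c, lam):
    """Evaluate det(A - lam*I) for a tridiagonal matrix A with main
    diagonal a (length n), subdiagonal b and superdiagonal c (length n-1),
    via the three-term recurrence
        D_k = (a_k - lam) * D_{k-1} - b_{k-1} * c_{k-1} * D_{k-2},
    with D_0 = 1 and D_1 = a_1 - lam."""
    n = len(a)
    D_prev, D = 1.0, a[0] - lam
    for k in range(1, n):
        D_prev, D = D, (a[k] - lam) * D - b[k - 1] * c[k - 1] * D_prev
    return D

# Example: A = [[2, 1], [1, 2]] has eigenvalues 1 and 3, so the
# characteristic polynomial is zero there and 3 at lam = 0.
```

Note that each step multiplies by a term of size roughly $\alpha$, so for large matrix entries $D_n$ grows like $\alpha^n$, which is where the overflow trouble comes from.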

Last edited: Apr 28, 2012
4. Apr 28, 2012

### AlephZero

The amount of numerical ill conditioning when doing operations on a matrix usually depends on the "condition number", which is the ratio of the largest to the smallest eigenvalue. A bigger condition number means the matrix is more nearly "numerically singular". But unless $\log_{10}$ of the condition number is similar to the number of significant digits used in the calculation (16, for most serious computer calculations), this is not usually an issue. There are various ways to estimate the condition number without actually finding the eigenvalues. One way is to start by doing an LU factorization of the matrix, which is always a well behaved process, even if the L and U matrices it produces are very badly conditioned.

Scaling all the terms of a matrix doesn't change its condition number. For some numerical methods of finding eigenvalues, there is an advantage in solving $|(A + sI) - (\lambda+s)I| = 0$, in other words adding or subtracting a constant $s$ to the original eigenvalues, and making the matrix more or less diagonally dominant.
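Both claims, the scale invariance of the condition number and the effect of a shift $s$ on the spectrum, are easy to check numerically. This sketch assumes NumPy is available; the $2\times 2$ symmetric matrix is made up for illustration:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
alpha = 1.0e6

# The condition number (ratio of extreme singular values) is unchanged
# when every entry of the matrix is multiplied by the same constant:
k_A  = np.linalg.cond(A)
k_aA = np.linalg.cond(alpha * A)

# A shift A + s*I moves every eigenvalue to lambda + s without
# changing the eigenvectors, so it can change the condition number:
s = 10.0
orig_eigs    = np.linalg.eigvalsh(A)
shifted_eigs = np.linalg.eigvalsh(A + s * np.eye(2))
```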

There was a significant breakthrough in the general subject of error analysis in the 1960s, when Wilkinson realized that instead of asking "how wrong is the numerical answer compared with the exact answer", which is usually mathematically intractable, there is an easier question that is just as useful in practice: "if the answers I got are the exact answers to the wrong problem, how wrong is the wrong problem?"

For example if you are solving the matrix equations $Ax = b$ and you get an approximate answer $x'$, trying to estimate $x-x'$ is hard. But in a practical physics or engineering situation where $A$ and/or $b$ are not known exactly, it can be more useful to investigate if there is a "small" vector $b'$ such that $Ax'= b+b'$, or if there is a "small" matrix $A'$ such that $(A+A')x' = b$.
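A small sketch of this backward-error viewpoint, assuming NumPy; the $2\times 2$ system is an arbitrary example:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 1.0])

x_approx = np.linalg.solve(A, b)     # the computed solution x'
residual = A @ x_approx - b          # the vector b' with A x' = b + b'

# Wilkinson's view: x' is the EXACT solution of the perturbed problem
# A x = b + b'.  If ||b'|| is tiny compared with ||b||, the computed
# answer is the exact answer to a problem indistinguishable from ours.
rel_backward_err = np.linalg.norm(residual) / np.linalg.norm(b)
```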

5. Apr 28, 2012

### nonequilibrium

But then how do you explain that it does have a major effect on my numerical computations? Without the procedure described in post #3 I get ridiculous results, whereas with it, it works smoothly and effectively...

6. Apr 28, 2012

### AlephZero

Are you using an eigensolution routine from a "professional" library, or your own code? Are you using the most appropriate numerical method?

Seeing the problem is probably a necessary (but not sufficient!) condition for getting an explanation.

7. Apr 28, 2012

### nonequilibrium

My own code. I don't want to bother you with code, but the essence of what is happening is pretty simple.

If you like, you can take a look at the page I attached. I'm basically using that iterative formula to calculate the determinant of a tridiagonal matrix. In my case, the elements a,b,c are pretty big (and if it matters: the eventual eigenvalues, at least the smallest ones, are "normal-sized", i.e. order 1). If I simply evaluate that determinant, as it is, I get wrong/NaN results, but if I first replace a,b,c by $\alpha a, \alpha b, \alpha c$ (with alpha small), and then solve for lambda, in the end dividing lambda by alpha, I get good results.

Maybe the specific formula (i.e. the one on the attached page) induces catastrophic cancellations? However, if that is the case, then I'm not sure why I have no troubles at all after rescaling, since then my lambda is typically really small, so relative to lambda, a and b are still huge, such that operations like in (5.122) should behave funnily(?)

EDIT: By the way, thanks for following up

#### Attached Files:

• devries.gif (28.6 KB)
8. Apr 28, 2012

### AlephZero

OK. If you are trying to find the eigenvalues by determinant search, you don't really need the value of the determinant, only its sign. But the determinant is likely to be of the order of the product of all the diagonal terms of the matrix, so you need to keep track of a scale factor so that it stays within the range of your floating point arithmetic. You could either scale the matrix, or represent the D's as an integer power of 2 (or 10 if you prefer) times a floating point number fairly close to 1. That would not overflow until the determinant had on the order of $10^8$ decimal digits, which should be big enough for most situations!
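The "power of 2 times a number close to 1" representation can be sketched with Python's `math.frexp`/`math.ldexp`, applied to the same three-term recurrence. This is a hypothetical implementation, not the code from the thread; since the recurrence is linear in $(D_{k-1}, D_{k-2})$, both running values can be rescaled by the same power of 2 at every step:

```python
import math

def tridiag_det_scaled(a, b, c, lam):
    """Determinant of (tridiagonal A - lam*I), returned as a pair
    (mantissa, exponent) with det = mantissa * 2**exponent, so the
    recurrence never overflows however big the matrix entries are."""
    n = len(a)
    D_prev, D, e = 1.0, a[0] - lam, 0
    for k in range(1, n):
        D_prev, D = D, (a[k] - lam) * D - b[k - 1] * c[k - 1] * D_prev
        m, exp = math.frexp(D)           # D = m * 2**exp, 0.5 <= |m| < 1
        if exp:
            D = m                        # rescale BOTH running values so
            D_prev = math.ldexp(D_prev, -exp)  # the recurrence is unchanged
            e += exp
    return D, e
```

The sign of the determinant is just the sign of the mantissa, which is all a determinant search needs.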

Alternatively, you could do an LU factorization of matrix 5.121, which is about the same amount of work as finding the determinant but will not overflow, count the number of negative terms along the diagonal of U, and use that as a Sturm sequence to tell you the number of eigenvalues below a given value of $\lambda$.
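For the symmetric tridiagonal case the Sturm-sequence count can be sketched without forming L and U explicitly, since the pivots satisfy their own scalar recurrence. Symmetry (equal sub- and superdiagonals b) is an assumption here, as is the zero-pivot guard value:

```python
def count_eigs_below(a, b, lam):
    """For a symmetric tridiagonal matrix (diagonal a, off-diagonal b),
    count the eigenvalues strictly less than lam.  The pivots of the
    LDL^T factorization of (A - lam*I) satisfy
        d_1 = a_1 - lam,   d_k = (a_k - lam) - b_{k-1}**2 / d_{k-1},
    and the number of negative pivots equals the number of eigenvalues
    below lam (Sylvester's law of inertia)."""
    count = 0
    d = a[0] - lam
    if d < 0:
        count += 1
    for k in range(1, len(a)):
        if d == 0.0:
            d = 1e-300   # nudge an exactly-zero pivot (hypothetical guard)
        d = (a[k] - lam) - b[k - 1] ** 2 / d
        if d < 0:
            count += 1
    return count
```

Bisecting on $\lambda$ with this count brackets each eigenvalue without ever evaluating the (huge) determinant itself.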

Last edited: Apr 28, 2012
9. Apr 29, 2012

### AlephZero

Just to add: this has nothing to do with numerical conditioning, it is just the fact that the determinant is too big to represent as a floating point number. You would get the same problem running the code even if the matrix were diagonal and there was no cancellation at all.

For a large matrix, the matrix elements don't even have to be "big". For example if the elements of a matrix of order 500 were about 10, the determinant would usually be about $10^{500}$, but IEEE 64-bit floating point can only represent numbers up to about $10^{308}$. Yet calculating all the eigenpairs of an arbitrary matrix of order 500 is not usually a big deal numerically in 64-bit arithmetic.
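This overflow-without-ill-conditioning effect is easy to reproduce. The sketch assumes NumPy; `np.linalg.slogdet` returns the determinant as a sign and a log-magnitude, which never overflows, much like the LINPACK convention mentioned below:

```python
import numpy as np

n = 500
A = 10.0 * np.eye(n)      # diagonal matrix, entries ~10, det = 10**500

# The log-determinant is perfectly representable...
sign, logdet = np.linalg.slogdet(A)

# ...but the determinant itself overflows IEEE double (max ~1.8e308):
naive = np.prod(np.diag(A))          # overflows to inf

# while the eigenvalues are completely tame:
eigs = np.linalg.eigvalsh(A)
```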

FWIW the old LINPACK numerical library included determinant calculation routines, and returned the value as a power of 10 times a number between 1 and 10. The newer LAPACK libraries don't contain any determinant calculation routines at all, though you can get the determinant easily after calling other routines in the library if you really want it.

10. Apr 29, 2012

### nonequilibrium

Thank you, your posts were enlightening.
