What Iterative Method is This?

Iseous · Feb 22, 2013

I haven’t really done much related to systems of linear equations for a couple years, so my memory is a bit faded. But, I was wondering if anyone could tell me what iterative method this is. I made it up a while ago to see if it would work, and have recently been coding it to see that it does, and seemingly quite well. I do not recall ever learning about it in any classes, though I’m sure it already exists. The procedure is as follows:

A*x = b

A = n by n Matrix
x = solution to our system of linear equations
b = right-hand-side of the system of linear equations

1) Normalize A with respect to each row, making sure each row of b is normalized with respect to the corresponding row in A.
2) Guess a solution to the system. Any row of the matrix or all zeros work fine.
3) Choose the first row of A, Row1.
4) We wish to determine what multiple of this row to add to our guess in order to achieve a better solution. Since Row1 multiplied by our guess will give us the first value in b, we want to determine what multiple(alpha) of Row1 to add to our guess to achieve that value. Thus,
Row1*(Guess + alpha*Row1) = b1.
5) Solving for alpha:
alpha = (b1 - Row1*Guess)/(Row1*Row1)
6) Since the dot product of a vector with itself is its magnitude squared, which was normalized, it reduces to 1. This will be the same for all rows.
7) Thus, alpha = b1 - Row1*Guess
8) We can then add this multiple of the first row to our guess. NewGuess = Guess + alpha*Row1
9) Repeat steps 3-8 for all n rows of matrix, A.
10) Calculate the residual, R = norm(A*Guess - b)
11) Repeat steps 9-10 to cycle through the matrix as many times as needed to reach a desired residual.

This method can be used for most matrices, although the more linearly independent the rows are, the faster it converges. For an orthogonal matrix, it will converge in 1 iteration. The method can also be modified in many ways. For instance, rather than adding only 1 row to our guess at a time, we can add multiple rows, and then use other methods to solve for the coefficients. For instance, we could solve for 2 or 3 multiples simultaneously, since these can be calculated relatively easily algebraically. That is, if we took the first three rows, we would result in the following system of equations:

1) Row1*(Guess + alpha1*Row1 + alpha2*Row2 + alpha3*Row3) = b1
2) Row2*(Guess + alpha1*Row1 + alpha2*Row2 + alpha3*Row3) = b2
3) Row3*(Guess + alpha1*Row1 + alpha2*Row2 + alpha3*Row3) = b3

We could then solve for alpha1, alpha2, and alpha3 quite simply algebraically. The method can even be used on itself for much larger matrices. For instance, if dealing with a very large matrix such as n = 1,000,000, we could split it. We could take 10,000 rows at a time to solve for 10,000 different multiples, and then use the method itself again to solve each of those 10,000 splits by 3 at a time, as shown above. For sparse matrices, it is easy to only take the parts of the rows that are nonzero to reduce the number of calculations necessary.

Simon Bridge · Feb 22, 2013

Welcome to PF;
It reads to me like a massive complication on row reduction... what have you gained by this approach?

Iseous · Feb 22, 2013

It’s as complicated as you want to make it. If you only take one row at a time, it’s like 2 lines of code. It seems to be more effective taking 3 at a time, but I have not tried using the method on itself as I described. Although, you’re right, it does look similar to row reduction, but is that really what it is?

As far as what I have gained, I do not remember how well other methods worked, and I do not feel like coding another method at the moment. Relative to using inverses to solve a system, it was several hundred times faster with n = 10,000, with residuals ranging from 10^-2 to 10^-10. I cannot compare any values greater than that for inverses since it takes too much memory. Although, I modified the code for a tri-diagonal system to see the number of iterations it would take. Even when using n = 10 million, it is less than 100. I do not remember the number of iterations or time required when I used other methods for classes a few years ago, so I do not know if that is good or bad. Thank you for your time.

Simon Bridge · Feb 23, 2013

The usual approach is Gauss-Jordan with partial pivoting. It is usually possible to code it recursively.

I'll try and attract a mathematician more used to looking at this sort of thing.

Iseous · Feb 23, 2013

I vaguely remember using that. I also remember using multi-grid methods, though that was kind of a pain to do. Thank you for your time.

Simon Bridge · Feb 23, 2013

Lets try:$$\begin{bmatrix}
0.001 & 1\\1 & 0
\end{bmatrix}
\begin{bmatrix}
x\\y
\end{bmatrix}=
\begin{bmatrix}
1\\1
\end{bmatrix}$$... by your method: step 1 is to normalize the matrix rows

0.001+1=1.001 - so the normalization constant is 1/1.001 = 0.999000999000... an irrational number. Doesn't that introduce an error in each iteration?
Maybe it converges anyway - leaving it to the computer, the computer will just round off. If it rounded to
4dp, that would make the first row 0.0010 and 0.9990 ...
5dp, would make it 0.00100 and 0.99900
6dp would make it 0.000999 and 0.999001 - that normalizes!
Presumably computers are more accurate than that but I have to be able to type the numbers in.
So I'll be simulating an innaccurate computer.

$$\begin{bmatrix}
0.000999 & 0.999001\\1 & 0
\end{bmatrix}
\begin{bmatrix}
x\\y
\end{bmatrix}=
\begin{bmatrix}
0.999001\\1
\end{bmatrix}$$

2) Guess a solution to the system. Any row of the matrix or all zeros work fine.

first guess ##(0,0)^t##

3) Choose the first row of A, Row1.
4) We wish to determine what multiple of this row to add to our guess in order to achieve a better solution. Since Row1 multiplied by our guess will give us the first value in b, we want to determine what multiple(alpha) of Row1 to add to our guess to achieve that value. Thus,
Row1*(Guess + alpha*Row1) = b1.

i.e.
$$\begin{bmatrix}0.000999 & 0.999001\end{bmatrix}(
\begin{bmatrix}0 \\ 0\end{bmatrix}+\alpha\begin{bmatrix}0.000999 & 0.999001\end{bmatrix} ) = 1$$... ah, there's another problem - but perhaps you mean:
$$\begin{bmatrix}0.000999 & 0.999001\end{bmatrix}(
\begin{bmatrix}0 \\ 0\end{bmatrix}+\alpha\begin{bmatrix}0.000999 \\ 0.999001\end{bmatrix} ) = 1\\
\Rightarrow \alpha \begin{bmatrix}0.000999 & 0.999001\end{bmatrix}\begin{bmatrix}0.000999 \\ 0.999001\end{bmatrix}=1\\
\Rightarrow \alpha (0.000999^2 + 0.999001^2)=1\\
\Rightarrow \alpha = \frac{1}{0.998004}=1.002000$$

Iseous · Feb 23, 2013

True, I suppose you don't really have to normalize it. Although the point of normalizing it was so that the equation to find alpha would be reduced so that you would not have to calculate the magnitude over and over. But that would still introduce the error. Although, wouldn't that be something that could affect pretty much any method? I'm sure most methods would introduce errors from irrational numbers.

Simon Bridge · Feb 23, 2013

I wondered if the approach may be robust enough to converge correctly anyway ... see last post.

You get this problem, not just with irrational numbers but anytime one or more elements are very small compared with the others.

"Partial pivoting" helps with the problem when you use Gauss-Jordan.

For your method - perhaps you want to either do away with the normalization step, or just go right to step 7 and assume the rounding is perfect (ignoring the difference)? Possibly that's what you'd program... but I wasn't sure so I had to stop the calculation.

Iseous · Feb 23, 2013

Yes, you're right, you do need to translate the row if you use vector multiplication. Although as a dot product, the element by element multiplication would be the same as long as they have the same number of elements, which they do.

As far as eliminating the normalization, would that be more effective? Would the errors make you worse off than having to recalculate the dot products over and over of the rows?

Simon Bridge · Feb 23, 2013

In Gauss-Jordan we'd be reducing:
$$\begin{bmatrix}0.001 & 1 & 1\\ 1 & 0 & 1
\end{bmatrix}$$
step 1 - partial pivot: ##r1 \leftrightarrow r2##
$$\begin{bmatrix}1 & 0 & 1 \\ 0.001 & 1 & 1
\end{bmatrix}$$
step 2 - row reduce: ##r2 \leftarrow r2-(0.001)r1##
$$\begin{bmatrix}1 & 0 & 1 \\ 0 & 1 & 0.999
\end{bmatrix}$$... end in 2 steps.

so x=1 and y=0.999 which we can check by solving the original series algebraically.

If I don't pivot, then the first step is to multiply row 1 by 1000 - then row-reduce to echelon form. It's also two steps - three if you insist on reduced echelon.

Your method, I think, suffers more keenly from accumulated rounding errors, and involved many more calculations just to get part-way through a single iteration.

Iseous · Feb 23, 2013

Isn't G-J a direct solver? So you're not really iterating on a guess or anything. For a 2X2 matrix it is easy to solve algebraically; I thought iterative methods were more efficient than a direct solver for larger matrices. For larger matrices, would the number of steps increase linearly?

Simon Bridge · Feb 23, 2013

Well... I wanted a small example so I could use your method by hand so see how it worked in detail. I picked the simplest awkward example I could find. It's a point of comparison ... in particular, G-J does not have the rounding problem quite so bad so it is not something that is so inevitable right? I can come up with arbitrarily large obnoxious systems if you want but I figured you can do that too.

To test your algorithm empirically, you need something you can solve by another means.

The speed of guessing methods can depend on how good the initial guess is ... for an arbitrary first guess you are sort-of hoping the phase-space has only one simple attractor in it - and that attractor is the correct solution. You have indicated you can deal with degenerate systems with this approach.

How would you work out how your algorithm scales?

Simon Bridge · Feb 23, 2013

We haven't got closer to answering the question though ... I have tried waving my arms and shouting. I don't recognize the method in question but a full-time mathematician may.

Maybe if you posted a semi-trolling thread along the lines of "innovative method found, shall I patent it?" or something.
It's early days yet - we'll wait and see.
If you do - I will disavow any knowledge ... this message will self destruct in 3... 2... 1... <click>

Iseous · Feb 23, 2013

Well as far as using small matrices, I wouldn't use an iterative method anyway. Even in the method itself when solving for 3 variables at a time, I just solve it algebraically since you can easily derive expressions to solve for them directly.

I am not very good at looking at this sort of thing empirically and dealing with error propagation, so I don't really want to make a lame attempt at it.

As for the initial guess, I do not see how using a zero guess or a row vector could possibly be wrong. In Ax = b, x is merely telling you the linear combination of vectors in your A matrix to give you b. And seeing as how this method is directly using linear combinations of the row vectors to alter your guess, how would it screw up unless the matrix were singular? It seems the only way it would be wrong is if you started with an initial guess that wasn't a linear combination of vectors in A, and so you wouldn't be able to eliminate it from your guess through iterations.

I suppose I could make a thread like that, or maybe just edit the top of my post to say that... I'm not going to get banned for that or anything I hope?

micromass · Feb 23, 2013

Iseous said:

I suppose I could make a thread like that, or maybe just edit the top of my post to say that... I'm not going to get banned for that or anything I hope?

Please don't.

Anyway, I think your method is very much like http://en.wikipedia.org/wiki/Iterative_refinement
In fact, you say

This method can be used for most matrices, although the more linearly independent the rows are, the faster it converges. For an orthogonal matrix, it will converge in 1 iteration. The method can also be modified in many ways. For instance, rather than adding only 1 row to our guess at a time, we can add multiple rows, and then use other methods to solve for the coefficients. For instance, we could solve for 2 or 3 multiples simultaneously, since these can be calculated relatively easily algebraically. That is, if we took the first three rows, we would result in the following system of equations:

1) Row1*(Guess + alpha1*Row1 + alpha2*Row2 + alpha3*Row3) = b1
2) Row2*(Guess + alpha1*Row1 + alpha2*Row2 + alpha3*Row3) = b2
3) Row3*(Guess + alpha1*Row1 + alpha2*Row2 + alpha3*Row3) = b3

We could then solve for alpha1, alpha2, and alpha3 quite simply algebraically.

But calculating [itex]\alpha_1,\alpha_2,\alpha_3[/itex] is equivalent to calculating the vector [itex]\alpha_1\text{Row}_1+\alpha_2\text{Row}_2 + \alpha_3\text{Row}_3[/itex]. but this is exactly the vector [itex]d_m[/itex] in the wikipedia.
The only difference seems to be that you calculate the vector [itex]d_m[/itex] in a specific basis generated by the rows. I don't know if this changes much or if it is really necessary.

Iseous · Feb 24, 2013

It does look similar. Thank you for finding that.

What Iterative Method is This?

Discussion Overview

Discussion Character

Main Points Raised

Areas of Agreement / Disagreement

Contextual Notes

Similar threads

Undergrad The vector to which a dual vector corresponds

Graduate Confusion about the Moyal-Weyl twist

Undergrad 2 interpretations of bra-ket expression: equal, & isomorphic, but...

Undergrad Spinor calculus

Undergrad Matrix representation of rank-2 spinors

Insights Revisiting the Velocity-Time Function

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect