What Iterative Method is This?

In summary: the method works for most matrices, converges faster the closer the rows are to orthogonal, and can be modified to trade work per step against convergence speed.
  • #1
Iseous
I haven’t done much with systems of linear equations for a couple of years, so my memory is a bit faded. But I was wondering if anyone could tell me what iterative method this is. I made it up a while ago to see if it would work, and I have recently been coding it to check that it does; it seems to work quite well. I do not recall learning about it in any class, though I’m sure it already exists. The procedure is as follows:

A*x = b

A = n by n Matrix
x = solution to our system of linear equations
b = right-hand-side of the system of linear equations

1) Normalize each row of A to unit (Euclidean) length, scaling the corresponding entry of b by the same factor.
2) Guess a solution to the system. Any row of the matrix or the all-zeros vector works fine.
3) Choose the first row of A, Row1.
4) We wish to determine what multiple of this row to add to our guess in order to achieve a better solution. Since Row1 dotted with the true solution gives the first entry of b, we want to determine the multiple (alpha) of Row1 to add to our guess so that the updated guess satisfies that equation:
Row1*(Guess + alpha*Row1) = b1.
5) Solving for alpha:
alpha = (b1 - Row1*Guess)/(Row1*Row1)
6) The dot product of a vector with itself is its magnitude squared, and each row was normalized to unit length in step 1, so Row1*Row1 = 1. The same holds for every row.
7) Thus, alpha = b1 - Row1*Guess
8) We can then add this multiple of the first row to our guess. NewGuess = Guess + alpha*Row1
9) Repeat steps 3-8 for all n rows of the matrix A.
10) Calculate the residual, R = norm(A*Guess - b)
11) Repeat steps 9-10, cycling through the matrix as many times as needed to reach the desired residual (a short code sketch of the whole loop follows).
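
In code, one pass of this really is only a few lines. Here is a minimal NumPy sketch of steps 1-11 (the function name, sweep limit, and tolerance are just my own placeholders):

[code]
import numpy as np

def solve_by_rows(A, b, sweeps=100, tol=1e-10):
    # Step 1: normalize each row of A to unit Euclidean length,
    # scaling the matching entry of b by the same factor.
    norms = np.linalg.norm(A, axis=1)
    A = A / norms[:, None]
    b = b / norms
    x = np.zeros(len(b))                      # step 2: all-zeros guess
    for _ in range(sweeps):                   # step 11: repeat full sweeps
        for i in range(len(b)):               # steps 3-9: one row at a time
            alpha = b[i] - A[i] @ x           # step 7 (Row_i . Row_i = 1)
            x = x + alpha * A[i]              # step 8: add alpha*Row_i
        if np.linalg.norm(A @ x - b) < tol:   # step 10: residual check
            break
    return x
[/code]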

This method can be used for most matrices, although the closer the rows are to mutually orthogonal, the faster it converges; for an orthogonal matrix it converges in one pass. The method can also be modified in many ways. Rather than adding only one row to our guess at a time, we can add multiple rows and solve for the coefficients together. For example, we could solve for 2 or 3 multiples simultaneously, since these can be calculated algebraically with little effort. Taking the first three rows, we get the following system of equations:

1) Row1*(Guess + alpha1*Row1 + alpha2*Row2 + alpha3*Row3) = b1
2) Row2*(Guess + alpha1*Row1 + alpha2*Row2 + alpha3*Row3) = b2
3) Row3*(Guess + alpha1*Row1 + alpha2*Row2 + alpha3*Row3) = b3

We could then solve for alpha1, alpha2, and alpha3 quite simply algebraically (a sketch of this block update follows). The method can even be used on itself for much larger matrices. For instance, with a very large matrix, say n = 1,000,000, we could split it: take 10,000 rows at a time to solve for 10,000 different multiples, then apply the method again within each split, solving 3 multiples at a time as shown above. For sparse matrices, it is easy to use only the nonzero parts of each row, reducing the number of calculations needed.
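
Here is how the block update might look in NumPy (the name block_update and the Gram-matrix arrangement are mine; the small solve plays the role of the algebra above):

[code]
import numpy as np

def block_update(A, b, x, rows):
    # One block step: solve for the alphas of a few rows at once.
    # Writing Row_j . (x + sum_k alpha_k Row_k) = b_j for each chosen j
    # gives the small system  G @ alpha = b[rows] - A[rows] @ x,
    # where G[j, k] = Row_j . Row_k (the Gram matrix of the rows).
    R = A[rows]                               # the chosen rows
    G = R @ R.T                               # small k-by-k Gram matrix
    alpha = np.linalg.solve(G, b[rows] - R @ x)
    return x + alpha @ R                      # add the weighted rows
[/code]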
 
  • #2
Welcome to PF;
It reads to me like a massive complication of row reduction... what have you gained by this approach?
 
  • #3
It’s as complicated as you want to make it. If you only take one row at a time, it’s about 2 lines of code. It seems to be more effective taking 3 rows at a time, but I have not tried using the method on itself as I described. You’re right, though, it does look similar to row reduction; but is that really what it is?

As far as what I have gained: I do not remember how well other methods worked, and I do not feel like coding another method at the moment. Relative to using inverses to solve a system, it was several hundred times faster at n = 10,000, with residuals ranging from 10^-2 to 10^-10. I cannot compare anything larger than that against inverses, since they take too much memory. I did, however, modify the code for a tridiagonal system to see how many iterations it would take: even at n = 10 million, it is fewer than 100. I do not remember the iteration counts or run times from the methods I used in classes a few years ago, so I do not know whether that is good or bad. Thank you for your time.
 
  • #4
The usual approach is Gauss-Jordan elimination with partial pivoting. It can often be coded recursively.

I'll try and attract a mathematician more used to looking at this sort of thing.
 
  • #5
I vaguely remember using that. I also remember using multi-grid methods, though that was kind of a pain to do. Thank you for your time.
 
  • #6
Let's try:$$\begin{bmatrix}
0.001 & 1\\1 & 0
\end{bmatrix}
\begin{bmatrix}
x\\y
\end{bmatrix}=
\begin{bmatrix}
1\\1
\end{bmatrix}$$... by your method: step 1 is to normalize the matrix rows

0.001 + 1 = 1.001, so the normalization constant is 1/1.001 = 0.999000999000..., a non-terminating decimal. (Strictly, step 1 calls for the Euclidean norm, ##\sqrt{0.001^2+1^2}\approx 1.0000005##, but normalizing by the row sum keeps the hand arithmetic simple and raises the same rounding issue.) Doesn't that introduce an error in each iteration?
Maybe it converges anyway - leaving it to the computer, the computer will just round off. If it rounded to
4 dp, that would make the first row 0.0010 and 0.9990 ...
5 dp would make it 0.00100 and 0.99900
6 dp would make it 0.000999 and 0.999001 - that normalizes!
Presumably computers are more accurate than that, but I have to be able to type the numbers in.
So I'll be simulating an inaccurate computer.

$$\begin{bmatrix}
0.000999 & 0.999001\\1 & 0
\end{bmatrix}
\begin{bmatrix}
x\\y
\end{bmatrix}=
\begin{bmatrix}
0.999001\\1
\end{bmatrix}$$

2) Guess a solution to the system. Any row of the matrix or the all-zeros vector works fine.

first guess ##(0,0)^t##

3) Choose the first row of A, Row1.
4) We wish to determine what multiple of this row to add to our guess in order to achieve a better solution. Since Row1 dotted with the true solution gives the first entry of b, we want to determine the multiple (alpha) of Row1 to add to our guess so that the updated guess satisfies that equation:
Row1*(Guess + alpha*Row1) = b1.


i.e.
$$\begin{bmatrix}0.000999 & 0.999001\end{bmatrix}(
\begin{bmatrix}0 \\ 0\end{bmatrix}+\alpha\begin{bmatrix}0.000999 & 0.999001\end{bmatrix} ) = 0.999001$$... ah, there's another problem - but perhaps you mean:
$$\begin{bmatrix}0.000999 & 0.999001\end{bmatrix}(
\begin{bmatrix}0 \\ 0\end{bmatrix}+\alpha\begin{bmatrix}0.000999 \\ 0.999001\end{bmatrix} ) = 0.999001\\
\Rightarrow \alpha \begin{bmatrix}0.000999 & 0.999001\end{bmatrix}\begin{bmatrix}0.000999 \\ 0.999001\end{bmatrix}=0.999001\\
\Rightarrow \alpha (0.000999^2 + 0.999001^2)=0.999001\\
\Rightarrow \alpha = \frac{0.999001}{0.998004}=1.000999$$
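
The same arithmetic in code, if anyone wants to reproduce it (NumPy here; the rounded row and normalized b1 are the ones above):

[code]
import numpy as np

# First row rounded to 6 dp; it no longer has unit Euclidean length,
# so Row1 . Row1 = 0.998004 rather than 1, and the step-7 shortcut
# disagrees with the full step-5 formula.
row1 = np.array([0.000999, 0.999001])
b1 = 0.999001                                      # normalized RHS entry
guess = np.zeros(2)

alpha_full = (b1 - row1 @ guess) / (row1 @ row1)   # step 5 formula
alpha_short = b1 - row1 @ guess                    # step 7 shortcut
print(alpha_full, alpha_short)                     # ~1.000999 vs 0.999001
[/code]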
 
  • #7
True, I suppose you don't really have to normalize. The point of normalizing was that the equation for alpha simplifies, so you do not have to recalculate the row magnitudes over and over. But that would still introduce the error. Then again, wouldn't that affect pretty much any method? I'm sure most methods pick up errors from non-terminating decimals.
 
  • #8
I wondered if the approach may be robust enough to converge correctly anyway ... see last post.

You get this problem not just with non-terminating decimals but any time one or more elements are very small compared with the others.

"Partial pivoting" helps with the problem when you use Gauss-Jordan.

For your method - perhaps you want to either do away with the normalization step, or just go straight to step 7 and assume the rounding is perfect (ignoring the difference)? Possibly that's what you'd program... but I wasn't sure, so I had to stop the calculation.
 
  • #9
Yes, you're right, you do need to transpose the row if you use vector multiplication. As a dot product, though, the element-by-element multiplication is the same as long as the two vectors have the same number of elements, which they do.

As for eliminating the normalization: would that be more effective? Or would the errors leave you worse off than having to recalculate the rows' dot products over and over?
 
  • #10
In Gauss-Jordan we'd be reducing:
$$\begin{bmatrix}0.001 & 1 & 1\\ 1 & 0 & 1
\end{bmatrix}$$
step 1 - partial pivot: ##r1 \leftrightarrow r2##
$$\begin{bmatrix}1 & 0 & 1 \\ 0.001 & 1 & 1
\end{bmatrix}$$
step 2 - row reduce: ##r2 \leftarrow r2-(0.001)r1##
$$\begin{bmatrix}1 & 0 & 1 \\ 0 & 1 & 0.999
\end{bmatrix}$$... end in 2 steps.

so x = 1 and y = 0.999, which we can check by solving the original system algebraically.

If I don't pivot, then the first step is to multiply row 1 by 1000 - then row-reduce to echelon form. It's also two steps - three if you insist on reduced echelon.
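
For reference, the same 2×2 system checked numerically (np.linalg.solve does LU with partial pivoting under the hood, which is the safeguard being discussed):

[code]
import numpy as np

A = np.array([[0.001, 1.0],
              [1.0,   0.0]])
b = np.array([1.0, 1.0])
print(np.linalg.solve(A, b))   # -> [1.    0.999]
[/code]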

Your method, I think, suffers more keenly from accumulated rounding errors, and it involves many more calculations just to get part-way through a single iteration.
 
  • #11
Isn't G-J a direct solver? So you're not really iterating on a guess or anything. A 2×2 matrix is easy to solve algebraically; I thought iterative methods were more efficient than direct solvers for larger matrices. For larger matrices, does the number of steps increase linearly?
 
  • #12
Well... I wanted a small example so I could work your method by hand and see how it behaves in detail, so I picked the simplest awkward example I could find. It's a point of comparison ... in particular, G-J does not suffer from the rounding problem quite so badly, so the problem is not so inevitable, right? I can come up with arbitrarily large obnoxious systems if you want, but I figured you can do that too.

To test your algorithm empirically, you need something you can solve by another means.

The speed of guessing methods can depend on how good the initial guess is ... for an arbitrary first guess you are sort-of hoping the phase-space has only one simple attractor in it - and that attractor is the correct solution. You have indicated you can deal with degenerate systems with this approach.

How would you work out how your algorithm scales?
 
  • #13
We haven't gotten any closer to answering the original question, though ... I have tried waving my arms and shouting. I don't recognize the method, but a full-time mathematician may.

Maybe if you posted a semi-trolling thread along the lines of "innovative method found, shall I patent it?" or something.
It's early days yet - we'll wait and see.
If you do - I will disavow any knowledge ... this message will self destruct in 3... 2... 1... <click>
 
  • #14
Well, as far as small matrices go, I wouldn't use an iterative method anyway. Even within the method itself, when solving for 3 variables at a time, I just solve algebraically, since you can easily derive expressions that give them directly.

I am not very good at analyzing this sort of thing empirically or dealing with error propagation, so I don't really want to make a lame attempt at it.

As for the initial guess, I do not see how using a zero guess or a row vector could possibly be wrong. In Ax = b, x is merely telling you which linear combination of the vectors in A gives you b. And seeing as this method directly uses linear combinations of the row vectors to alter the guess, how would it go wrong unless the matrix were singular? It seems the only way it could fail is if you started with an initial guess containing a component that is not a linear combination of the rows of A, which the iterations would never be able to eliminate.

I suppose I could make a thread like that, or maybe just edit the top of my post to say so... I'm not going to get banned for that or anything, I hope?
 
  • #15
Iseous said:
I suppose I could make a thread like that, or maybe just edit the top of my post to say so... I'm not going to get banned for that or anything, I hope?

Please don't.

Anyway, I think your method is very much like http://en.wikipedia.org/wiki/Iterative_refinement
In fact, you say

This method can be used for most matrices, although the closer the rows are to mutually orthogonal, the faster it converges; for an orthogonal matrix it converges in one pass. The method can also be modified in many ways. Rather than adding only one row to our guess at a time, we can add multiple rows and solve for the coefficients together. For example, we could solve for 2 or 3 multiples simultaneously, since these can be calculated algebraically with little effort. Taking the first three rows, we get the following system of equations:

1) Row1*(Guess + alpha1*Row1 + alpha2*Row2 + alpha3*Row3) = b1
2) Row2*(Guess + alpha1*Row1 + alpha2*Row2 + alpha3*Row3) = b2
3) Row3*(Guess + alpha1*Row1 + alpha2*Row2 + alpha3*Row3) = b3

We could then solve for alpha1, alpha2, and alpha3 quite simply algebraically.

But calculating [itex]\alpha_1,\alpha_2,\alpha_3[/itex] is equivalent to calculating the vector [itex]\alpha_1\text{Row}_1+\alpha_2\text{Row}_2 + \alpha_3\text{Row}_3[/itex], and this is exactly the vector [itex]d_m[/itex] in the Wikipedia article.
The only difference seems to be that you calculate the vector [itex]d_m[/itex] in a specific basis generated by the rows. I don't know if this changes much or whether it is really necessary.
 
  • #16
It does look similar. Thank you for finding that.
 

1. What is an iterative method?

An iterative method is a mathematical algorithm that repeatedly performs a calculation or set of calculations using an initial estimate or guess, in order to approach a more accurate solution. It is often used to solve complex problems in fields such as engineering, physics, and computer science.

2. How does an iterative method work?

An iterative method typically involves performing a set of calculations and using the result to update the initial estimate. This process is repeated until a desired level of accuracy is achieved. The method may also involve using a convergence criterion to determine when the solution is close enough to the actual solution.

3. What types of problems can be solved using iterative methods?

Iterative methods can be used to solve a wide range of problems, including linear and nonlinear equations, optimization problems, differential equations, and eigenvalue problems. They are particularly useful for problems that are difficult or impossible to solve using analytical methods.

4. What are some examples of iterative methods?

Some commonly used iterative methods include the Newton-Raphson method, Gauss-Seidel method, Jacobi method, conjugate gradient method, and gradient descent method. These methods differ in their approach and efficiency, and are often chosen based on the specific problem being solved.
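
As an illustration of the pattern, here is a minimal Jacobi iteration in Python (a sketch only; it assumes A has a nonzero diagonal, and convergence is guaranteed only for suitable matrices, e.g. strictly diagonally dominant ones):

[code]
import numpy as np

def jacobi(A, b, iters=100, tol=1e-10):
    # Split A into its diagonal D and off-diagonal remainder R,
    # then iterate x <- D^{-1} (b - R x) from an all-zeros guess.
    D = np.diag(A)                            # 1-D array of diagonal entries
    R = A - np.diagflat(D)                    # off-diagonal part of A
    x = np.zeros_like(b, dtype=float)
    for _ in range(iters):
        x_new = (b - R @ x) / D
        if np.linalg.norm(x_new - x) < tol:   # convergence criterion
            return x_new
        x = x_new
    return x
[/code]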

5. What are the advantages of using an iterative method?

Iterative methods offer several advantages over other methods of problem solving. They can be used for a wide range of problems, are often more efficient than other methods, and can often be implemented using simple computational techniques. Additionally, they can provide insights into the behavior of a system and can be easily modified to accommodate changes in the problem or solution approach.
