# Gaussian elimination: Proof of correctness

## Main Question or Discussion Point

I have been looking for a proof of correctness of Gaussian elimination, but alas, without much success. Most online resources explain how to apply the algorithm rather than proving its correctness. Specifically, I have been looking for a proof of the following theorem, which is stated in Friedberg's *Linear Algebra*:

Gaussian elimination transforms a matrix into its row echelon form.

I would appreciate any help/links to a proof of this theorem. I am almost certain it involves mathematical induction on the number of rows in the matrix, but am having trouble proving it myself.

Thanks!

BiP

## Answers and Replies

SteamKing
Staff Emeritus
Science Advisor
Homework Helper
Gaussian elimination is an algorithm, a collection of mathematical operations which, when performed in a certain order, gives a desired result. I'm not sure what you are looking for here, unless you think that the algebra itself is somehow suspect. I'm not sure the Friedberg quote can be called a 'theorem' so much as a description of the purpose of this algorithm.

The component matrix operations of elimination are described here:
http://en.wikipedia.org/wiki/Elementary_matrix

If you are looking for a formal 'proof of correctness', then you would start by showing that the described operations themselves are proven, IMO.

HallsofIvy
Science Advisor
Homework Helper
Gaussian elimination transforms a matrix to row echelon form through row operations. The crucial point is that every row operation corresponds to multiplication by a specific "elementary matrix".

An "elementary matrix", corresponding to a given row operation, is the matrix we get by applying that row operation to the identity matrix. For example, the row operation "add twice the first row to the third row", applied to the 3 by 3 identity matrix, gives
$$\begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\2 & 0 & 1\end{bmatrix}$$
and it is easy to see that any 3 by 3 matrix, multiplied on the left by this matrix, will in fact have twice its first row added to its third:
$$\begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\2 & 0 & 1\end{bmatrix}\begin{bmatrix}a & b & c \\ d & e & f\\ g & h & i\end{bmatrix}= \begin{bmatrix}a & b & c \\ d & e & f \\ 2a+ g & 2b+ h & 2c+ i\end{bmatrix}$$
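This is easy to check numerically. Here is a quick sketch in pure Python (the matrix `M` below is a made-up example; the elementary matrix `E` is the one from the post above):

```python
def matmul(A, B):
    # naive matrix product, fine for small dense matrices
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

E = [[1, 0, 0],
     [0, 1, 0],
     [2, 0, 1]]          # elementary matrix: "add twice row 1 to row 3"
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]          # arbitrary test matrix

print(matmul(E, M))      # third row becomes [2*1+7, 2*2+8, 2*3+9]
```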

Now, suppose we have the matrix equation Ax= B and apply a sequence of row operations to A (applying those same operations to B), reducing A to the identity matrix. If we write the corresponding elementary matrices as "$R_1$", "$R_2$", "$R_3$", then we can say that $R_3R_2R_1A= I$. That, of course, is the same as saying $(R_3R_2R_1)A= I$, which is the same as saying that the single matrix $R= R_3R_2R_1$, the product of the three elementary matrices, is the inverse matrix of A, $A^{-1}$. Applying those row operations to B is the same as multiplying B by the elementary matrices, which is the same as multiplying B by $A^{-1}$.

That is, applying the row operations to A and B is the same as going from Ax= B to $A^{-1}Ax= A^{-1}B$ or $x= A^{-1}B$.
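A minimal sketch of this idea in Python: row-reduce the augmented matrix $[A \mid B]$ to $[I \mid x]$, so the accumulated row operations play the role of $A^{-1}$. (Function name and test system are mine; this assumes A is square and invertible.)

```python
def solve_gauss_jordan(A, b):
    """Row-reduce the augmented matrix [A | b] to [I | x]."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]        # augmented matrix
    for col in range(n):
        # swap in the row with the largest entry in this column (partial pivot)
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]      # scale pivot row to 1
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]

# example system: 2x + y = 3, x + 3y = 5, with solution x = 0.8, y = 1.4
x = solve_gauss_jordan([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
```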

Stephen Tashi
Science Advisor
I am almost certain it involves mathematical induction on the number of rows in the matrix, but am having trouble proving it myself.
The explanation of HallsofIvy can be used to prove that the algorithm accomplishes a given goal if it terminates successfully. To formally prove that the algorithm can always proceed to the next step would need an inductive argument.

HallsofIvy
Science Advisor
Homework Helper
Of course, if A does NOT HAVE an inverse, then you are going to have a problem!

Of course, if A does NOT HAVE an inverse, then you are going to have a problem!
Why does the matrix being singular pose a problem? Gaussian elimination even works on rectangular matrices.

BiP

HallsofIvy
Science Advisor
Homework Helper
To solve what problem? I was thinking of using Gaussian elimination to solve Ax= B.

Of course, if you just want to "row-reduce" a general matrix then that is not a problem. I am puzzled by Stephen Tashi's "if it terminates successfully". If A is an m by n matrix, it will require only a finite number of row operations to row-reduce it, so the question of "termination" does not arise.

Stephen Tashi
Science Advisor
To "prove" Gaussian elimination, you have to state some complete sentence that makes a claim about Gaussian elimination!

You also have to define Gaussian elimination rigorously as a procedure. Algorithms are usually defined in terms of how to start and how to proceed to the next step. You have to define what conditions cause the algorithm to terminate. (A simple way to do that would be to say it terminates when it is impossible to proceed to the next step.) You have to prove that the algorithm does terminate. Then you should prove that, upon the termination of the algorithm, the claim you made about the algorithm is true.

That's a big nuisance and I'm not volunteering to do any of it. I only offer to make vague comments indicating that termination and success need to be defined and proven by anyone undertaking the job.

There is a book called A First Course in Abstract Algebra by Joseph J. Rotman. In the third edition of the book, on pages 350 and 351, there is a proof of correctness of the Gaussian elimination algorithm. You can try to find a copy at your local library or from another source.

mathwonk
Science Advisor
Homework Helper
i agree with the OP that induction on the number of rows is one way to go. however the description of the algorithm is also the proof.
case n=1: a one row matrix is already in echelon form.
case n> 1: begin with the first column starting from the left, in which a non zero element a appears, and use row interchange to put this element in the first row. (if all entries in the matrix are zero, the matrix is in echelon form.)

now, if any other row below the first has a non zero entry c in the same column with a, subtract c/a times the first row from that later row to make the entry zero. the proof that this is possible is the fact that if a ≠ 0, then c - (c/a)a = 0.
now we have a matrix in which the first row has a non zero entry a, and all entries directly below and also to the left of this entry are zero.

then we may by induction reduce the matrix composed of the remaining n-1 rows to echelon form, and we are done.
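The inductive argument above translates almost line for line into a recursive program. A sketch in Python (my own function names; exact fractions are used so the arithmetic claims hold literally):

```python
from fractions import Fraction

def echelon(M):
    """Row echelon form, structured exactly like the inductive argument:
    fix up the first row, then recurse on the remaining n - 1 rows."""
    def reduce(rows, col):
        if not rows or col >= len(rows[0]):
            return rows                              # base case: nothing left
        piv = next((i for i, r in enumerate(rows) if r[col] != 0), None)
        if piv is None:
            return reduce(rows, col + 1)             # column all zero: move right
        rows[0], rows[piv] = rows[piv], rows[0]      # row interchange
        a = rows[0][col]
        for r in rows[1:]:
            c = r[col]
            if c != 0:                               # c - (c/a)*a = 0
                for j in range(col, len(r)):
                    r[j] -= (c / a) * rows[0][j]
        return [rows[0]] + reduce(rows[1:], col + 1) # induction on n - 1 rows
    return reduce([[Fraction(x) for x in row] for row in M], 0)

result = echelon([[0, 2, 1], [1, 1, 1], [2, 2, 3]])
```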

mathwonk
Science Advisor
Homework Helper
the more interesting theorem is not the existence, but the uniqueness of the reduced echelon form, wherein one also eliminates all elements above the first non zero element in each row, and also changes the first non zero element to 1.
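For concreteness, the extra work that produces the *reduced* echelon form can be sketched like this (a toy implementation, names mine, over exact fractions):

```python
from fractions import Fraction

def rref(M):
    """Reduced row echelon form: each leading entry is scaled to 1,
    and entries both above and below each pivot are cleared."""
    M = [[Fraction(x) for x in row] for row in M]
    lead = 0
    for col in range(len(M[0])):
        piv = next((r for r in range(lead, len(M)) if M[r][col] != 0), None)
        if piv is None:
            continue                                  # no pivot in this column
        M[lead], M[piv] = M[piv], M[lead]
        a = M[lead][col]
        M[lead] = [x / a for x in M[lead]]            # leading entry -> 1
        for r in range(len(M)):
            if r != lead and M[r][col] != 0:          # clear above AND below
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[lead])]
        lead += 1
        if lead == len(M):
            break
    return M

R = rref([[1, 2, 3], [4, 5, 6]])
```

Uniqueness says this output does not depend on the order of the row operations used to reach it.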

Svein
Science Advisor
The algorithm that is most often used is Gaussian elimination with partial pivoting: among the rows remaining to be processed, choose the one with the largest element (in absolute value) in the current column as the next pivot row.

mathwonk
Science Advisor
Homework Helper
interesting - i would have thought one would prefer the pivot to be small, to be more likely to divide other elements of that column. e.g. if working over the integers, one would try to get it to be the gcd of the lower entries.

e.g. see pages 11-13 of the notes for my class 845-1 (where in that setting column operations are also allowed):

http://www.math.uga.edu/%7Eroy/845-1.pdf [Broken]
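a toy illustration of the gcd idea in python (function name mine, not from the notes): over the integers one cannot divide, so instead one repeatedly subtracts multiples of the smallest nonzero entry in the column — euclid's algorithm carried out by row operations — until the surviving entry is the gcd of the original column.

```python
def clear_column_over_Z(rows, col=0):
    """Integer-only row reduction of one column: subtract multiples of the
    smallest nonzero entry until one nonzero entry remains; that entry is
    the gcd of the original column entries."""
    while True:
        nz = [i for i, r in enumerate(rows) if r[col] != 0]
        if len(nz) <= 1:
            return rows
        p = min(nz, key=lambda i: abs(rows[i][col]))  # smallest pivot wins
        for i in nz:
            if i != p:
                q = rows[i][col] // rows[p][col]      # integer quotient
                rows[i] = [x - q * y for x, y in zip(rows[i], rows[p])]

# column entries 6 and 4 reduce to gcd(6, 4) = 2 and 0
rows = clear_column_over_Z([[6, 1], [4, 0]])
```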

Svein
Science Advisor
interesting - i would have thought one would prefer the pivot to be small, to be more likely to divide other elements of that column. e.g. if working over the integers, one would try to get it to be the gcd of the lower entries.
Two reasons:

First reason: Expressed in matrix terms, Gaussian elimination is an algorithm for transforming the coefficient matrix into an upper triangular one. The usual algorithm for row n is: "Divide row n by element $a_{n,n}$ to get a 1 on the diagonal, then subtract a multiple of row n from the remaining rows to get zeros below the diagonal". Now this cannot be done if element $a_{n,n} = 0$. Instead of giving up, just look for a row with $a_{m,n} \neq 0$ and swap the rows.
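A small demonstration of this first reason (sketch in Python, names mine): on a matrix whose top-left entry is zero, elimination without the swap divides by zero, while the swap rescues it.

```python
def forward_eliminate(A, pivot=True):
    """Reduce square A to upper triangular form. With pivot=True, the row
    with the largest entry in the current column is swapped up first."""
    A = [row[:] for row in A]
    n = len(A)
    for k in range(n):
        if pivot:
            m = max(range(k, n), key=lambda r: abs(A[r][k]))
            A[k], A[m] = A[m], A[k]
        for r in range(k + 1, n):
            f = A[r][k] / A[k][k]   # ZeroDivisionError if A[k][k] == 0
            A[r] = [x - f * y for x, y in zip(A[r], A[k])]
    return A

A = [[0.0, 1.0], [1.0, 1.0]]        # a_{1,1} = 0: no pivoting fails here
U = forward_eliminate(A, pivot=True)        # succeeds after the row swap
```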

Second reason: Even if $a_{n,n} \neq 0$, it may be the result of a subtraction. Normally this is no problem, but then there is the problem of subtracting very large numbers. If you have limited precision (which you always have), subtracting 1 000 000 000.056 from 1 000 000 000.057 can end up as 0 just as easily as 0.001. And how do we get such large numbers in the first place? By dividing something by a very small number...
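This is easy to see with Python's built-in floats (IEEE 754 doubles, roughly 16 significant decimal digits): the numbers above just squeak by, but one more order of magnitude and the small difference is wiped out entirely.

```python
# the subtraction from the example above: close to 0.001, but not exact
small_loss = 1_000_000_000.057 - 1_000_000_000.056
print(small_loss)

# push the magnitude to 1e16 and the 0.001 falls below the spacing
# between adjacent doubles, so it vanishes completely:
total_loss = (1e16 + 0.001) - 1e16
print(total_loss)   # 0.0
```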