# Gaussian elimination: Proof of correctness

## Main Question or Discussion Point

I have been looking for a proof of correctness of Gaussian elimination, but alas, without much success. Most online resources explain how to apply the algorithm rather than proving its correctness. Specifically, I have been looking for a proof of the following theorem, which is stated in Friedberg's *Linear Algebra*:

Gaussian elimination transforms a matrix into its row echelon form.

I would appreciate any help/links to a proof of this theorem. I am almost certain it involves mathematical induction on the number of rows in the matrix, but am having trouble proving it myself.

Thanks!

BiP

## Answers and Replies

SteamKing
Staff Emeritus
Science Advisor
Homework Helper
Gaussian elimination is an algorithm, a collection of mathematical operations which, when performed in a certain order, gives a desired result. I'm not sure what you are looking for here, unless you think that the algebra itself is somehow suspect. I'm not sure the Friedberg quote can be called a 'theorem' so much as a description of the purpose of this algorithm.

The component matrix operations of elimination are described here:
http://en.wikipedia.org/wiki/Elementary_matrix

If you are looking for a formal 'proof of correctness', then you would start by showing that the described operations themselves are proven, IMO.

HallsofIvy
Science Advisor
Homework Helper
Gaussian elimination transforms a matrix to row echelon form through row operations. The crucial point is that every row operation corresponds to multiplication by a specific "elementary matrix".

An "elementary matrix", corresponding to a given row operation, is the matrix we get by applying that row operation to the identity matrix. For example, the row operation "add twice the first row to the third row", applied to the 3 by 3 identity matrix, gives
$$\begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\2 & 0 & 1\end{bmatrix}$$
and it is easy to see that any 3 by 3 matrix, multiplied on the left by this matrix, will in fact have twice its first row added to its third:
$$\begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\2 & 0 & 1\end{bmatrix}\begin{bmatrix}a & b & c \\ d & e & f\\ g & h & i\end{bmatrix}= \begin{bmatrix}a & b & c \\ d & e & f \\ 2a+ g & 2b+ h & 2c+ i\end{bmatrix}$$
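This is easy to check numerically. Here is a quick sketch in pure Python (the matrix `M` below is a made-up example; the elementary matrix `E` is the one from the post above):

```python
def matmul(A, B):
    # naive matrix product, fine for small dense matrices
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

E = [[1, 0, 0],
     [0, 1, 0],
     [2, 0, 1]]          # elementary matrix: "add twice row 1 to row 3"
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]          # arbitrary test matrix

print(matmul(E, M))      # third row becomes [2*1+7, 2*2+8, 2*3+9]
```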

Now, suppose we have the matrix equation Ax= B and apply a sequence of row operations to A (applying those same operations to B), reducing A to the identity matrix. If we write the corresponding elementary matrices as "$R_1$", "$R_2$", "$R_3$", then we can say that $R_3R_2R_1A= I$. That, of course, is the same as saying $(R_3R_2R_1)A= I$, which is the same as saying that the single matrix $R= R_3R_2R_1$, the product of the three elementary matrices, is the inverse matrix of A, $A^{-1}$. Applying those row operations to B is the same as multiplying B by the elementary matrices, which is the same as multiplying B by $A^{-1}$.

That is, applying the row operations to A and B is the same as going from Ax= B to $A^{-1}Ax= A^{-1}B$ or $x= A^{-1}B$.
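A minimal sketch of this idea in Python: row-reduce the augmented matrix $[A \mid B]$ to $[I \mid x]$, so the accumulated row operations play the role of $A^{-1}$. (Function name and test system are mine; this assumes A is square and invertible.)

```python
def solve_gauss_jordan(A, b):
    """Row-reduce the augmented matrix [A | b] to [I | x]."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]        # augmented matrix
    for col in range(n):
        # swap in the row with the largest entry in this column (partial pivot)
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]      # scale pivot row to 1
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]

# example system: 2x + y = 3, x + 3y = 5, with solution x = 0.8, y = 1.4
x = solve_gauss_jordan([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
```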

Stephen Tashi
Science Advisor
I am almost certain it involves mathematical induction on the number of rows in the matrix, but am having trouble proving it myself.
The explanation of HallsofIvy can be used to prove that the algorithm accomplishes a given goal if it terminates successfully. To formally prove that the algorithm can always proceed to the next step would need an inductive argument.

HallsofIvy
Science Advisor
Homework Helper
Of course, if A does NOT HAVE an inverse, then you are going to have a problem!

Of course, if A does NOT HAVE an inverse, then you are going to have a problem!
Why does the matrix being singular pose a problem? Gaussian elimination even works on rectangular matrices.

BiP

HallsofIvy
Science Advisor
Homework Helper
To solve what problem? I was thinking of using Gaussian elimination to solve Ax= B.

Of course, if you just want to "row-reduce" a general matrix then that is not a problem. I am puzzled by Stephen Tashi's "if it terminates successfully". If A is an m by n matrix, it will require only a finite number of row operations to row-reduce it, so the question of "termination" does not arise.

Stephen Tashi
Science Advisor
To "prove" Gaussian elimination, you have to state some complete sentence that makes a claim about Gaussian elimination!

You also have to define Gaussian elimination rigorously as a procedure. Algorithms are usually defined in terms of how to start and how to proceed to the next step. You have to define what conditions cause the algorithm to terminate. (A simple way to do that would be to say it terminates when it is impossible to proceed to the next step.) You have to prove that the algorithm does terminate. Then you should prove that, upon the termination of the algorithm, the claim you made about the algorithm is true.

That's a big nuisance and I'm not volunteering to do any of it. I only offer to make vague comments indicating that termination and success need to be defined and proven by anyone undertaking the job.

There is a book called A First Course in Abstract Algebra by Joseph J. Rotman. In the third edition of the book, on pages 350 and 351, there is a proof of correctness of the Gaussian elimination algorithm. You can try to find a copy at your local library or from another source.

mathwonk
Science Advisor
Homework Helper
i agree with the OP that induction on the number of rows is one way to go. however the description of the algorithm is also the proof.
case n=1: a one row matrix is already in echelon form.
case n> 1: begin with the first column starting from the left, in which a non zero element a appears, and use row interchange to put this element in the first row. (if all entries in the matrix are zero, the matrix is in echelon form.)

now, if any other row below the first has a non zero entry c in the same column with a, subtract c/a times the first row from that later row to make the entry zero. the proof that this is possible is the fact that if a ≠ 0, then c - (c/a)a = 0.
now we have a matrix in which the first row has a non zero entry a, and all entries directly below and also to the left of this entry are zero.

then we may by induction reduce the matrix composed of the remaining n-1 rows to echelon form, and we are done.
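The inductive argument above translates almost line for line into a recursive program. A sketch in Python (my own function names; exact fractions are used so the arithmetic claims hold literally):

```python
from fractions import Fraction

def echelon(M):
    """Row echelon form, structured exactly like the inductive argument:
    fix up the first row, then recurse on the remaining n - 1 rows."""
    def reduce(rows, col):
        if not rows or col >= len(rows[0]):
            return rows                              # base case: nothing left
        piv = next((i for i, r in enumerate(rows) if r[col] != 0), None)
        if piv is None:
            return reduce(rows, col + 1)             # column all zero: move right
        rows[0], rows[piv] = rows[piv], rows[0]      # row interchange
        a = rows[0][col]
        for r in rows[1:]:
            c = r[col]
            if c != 0:                               # c - (c/a)*a = 0
                for j in range(col, len(r)):
                    r[j] -= (c / a) * rows[0][j]
        return [rows[0]] + reduce(rows[1:], col + 1) # induction on n - 1 rows
    return reduce([[Fraction(x) for x in row] for row in M], 0)

result = echelon([[0, 2, 1], [1, 1, 1], [2, 2, 3]])
```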

mathwonk
Science Advisor
Homework Helper
the more interesting theorem is not the existence, but the uniqueness of the reduced echelon form, wherein one also eliminates all elements above the first non zero element in each row, and also changes the first non zero element to 1.
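For concreteness, the extra work that produces the *reduced* echelon form can be sketched like this (a toy implementation, names mine, over exact fractions):

```python
from fractions import Fraction

def rref(M):
    """Reduced row echelon form: each leading entry is scaled to 1,
    and entries both above and below each pivot are cleared."""
    M = [[Fraction(x) for x in row] for row in M]
    lead = 0
    for col in range(len(M[0])):
        piv = next((r for r in range(lead, len(M)) if M[r][col] != 0), None)
        if piv is None:
            continue                                  # no pivot in this column
        M[lead], M[piv] = M[piv], M[lead]
        a = M[lead][col]
        M[lead] = [x / a for x in M[lead]]            # leading entry -> 1
        for r in range(len(M)):
            if r != lead and M[r][col] != 0:          # clear above AND below
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[lead])]
        lead += 1
        if lead == len(M):
            break
    return M

R = rref([[1, 2, 3], [4, 5, 6]])
```

Uniqueness says this output does not depend on the order of the row operations used to reach it.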

Svein
Science Advisor
The algorithm that is most often used is Gaussian elimination with partial pivoting: among the rows remaining to be processed, choose the one with the largest element (in absolute value) in the current column as the next pivot row.

mathwonk
Science Advisor
Homework Helper
interesting - i would have thought one would prefer the pivot to be small, to be more likely to divide other elements of that column. e.g. if working over the integers, one would try to get it to be the gcd of the lower entries.

e.g. see pages 11-13 of the notes for my class 845-1 (where in that setting column operations are also allowed):

http://www.math.uga.edu/%7Eroy/845-1.pdf [Broken]
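a toy illustration of the gcd idea in python (function name mine, not from the notes): over the integers one cannot divide, so instead one repeatedly subtracts multiples of the smallest nonzero entry in the column — euclid's algorithm carried out by row operations — until the surviving entry is the gcd of the original column.

```python
def clear_column_over_Z(rows, col=0):
    """Integer-only row reduction of one column: subtract multiples of the
    smallest nonzero entry until one nonzero entry remains; that entry is
    the gcd of the original column entries."""
    while True:
        nz = [i for i, r in enumerate(rows) if r[col] != 0]
        if len(nz) <= 1:
            return rows
        p = min(nz, key=lambda i: abs(rows[i][col]))  # smallest pivot wins
        for i in nz:
            if i != p:
                q = rows[i][col] // rows[p][col]      # integer quotient
                rows[i] = [x - q * y for x, y in zip(rows[i], rows[p])]

# column entries 6 and 4 reduce to gcd(6, 4) = 2 and 0
rows = clear_column_over_Z([[6, 1], [4, 0]])
```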

Svein
Science Advisor
interesting - i would have thought one would prefer the pivot to be small, to be more likely to divide other elements of that column. e.g. if working over the integers, one would try to get it to be the gcd of the lower entries.
Two reasons:

First reason: Expressed in matrix terms, Gaussian elimination is an algorithm for transforming the coefficient matrix into an upper triangular one. The usual algorithm for row n is: "Divide row n by element $a_{n,n}$ to get a 1 on the diagonal, then subtract a multiple of row n from the remaining rows to get zeros below the diagonal". Now this cannot be done if element $a_{n,n} = 0$. Instead of giving up, just look for a row with $a_{m,n} \neq 0$ and swap the rows.
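A small demonstration of this first reason (sketch in Python, names mine): on a matrix whose top-left entry is zero, elimination without the swap divides by zero, while the swap rescues it.

```python
def forward_eliminate(A, pivot=True):
    """Reduce square A to upper triangular form. With pivot=True, the row
    with the largest entry in the current column is swapped up first."""
    A = [row[:] for row in A]
    n = len(A)
    for k in range(n):
        if pivot:
            m = max(range(k, n), key=lambda r: abs(A[r][k]))
            A[k], A[m] = A[m], A[k]
        for r in range(k + 1, n):
            f = A[r][k] / A[k][k]   # ZeroDivisionError if A[k][k] == 0
            A[r] = [x - f * y for x, y in zip(A[r], A[k])]
    return A

A = [[0.0, 1.0], [1.0, 1.0]]        # a_{1,1} = 0: no pivoting fails here
U = forward_eliminate(A, pivot=True)        # succeeds after the row swap
```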

Second reason: Even if $a_{n,n} \neq 0$, it may be the result of a subtraction. Normally this is no problem, but then there is the problem of subtracting very large numbers. If you have limited precision (which you always have), subtracting 1 000 000 000.056 from 1 000 000 000.057 can end up as 0 just as easily as 0.001. And how do we get such large numbers in the first place? By dividing something by a very small number...
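This is easy to see with Python's built-in floats (IEEE 754 doubles, roughly 16 significant decimal digits): the numbers above just squeak by, but one more order of magnitude and the small difference is wiped out entirely.

```python
# the subtraction from the example above: close to 0.001, but not exact
small_loss = 1_000_000_000.057 - 1_000_000_000.056
print(small_loss)

# push the magnitude to 1e16 and the 0.001 falls below the spacing
# between adjacent doubles, so it vanishes completely:
total_loss = (1e16 + 0.001) - 1e16
print(total_loss)   # 0.0
```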