Uniqueness of reduced row echelon form (proof verification)

psie
TL;DR Summary
I'm working on an exercise where I'm supposed to prove that the reduced row echelon form (rref) of a matrix is unique, as a corollary to a certain theorem. I will say that I have modified some of the arguments below from other sources. However, I'm a bit unsure whether I'm ultimately doing it right, since I have not seen some aspects of the proof elsewhere.
The theorem reads as follows:

Theorem. Let ##A## be an ##m\times n## matrix of rank ##r##, where ##r>0##, and let ##B## be the rref of ##A##. Then
(a) The number of nonzero rows of ##B## is ##r##.
(b) For each ##i=1,2,\ldots,r##, there is a column ##b_{j_i}## of ##B## such that ##b_{j_i}=e_i##.
(c) The columns of ##A## numbered ##j_1,j_2,\ldots,j_r## are linearly independent.
(d) For each ##k=1,2,\ldots,n##, if column ##k## of ##B## is ##d_1e_1+d_2e_2+\cdots+d_re_r##, then column ##k## of ##A## is ##d_1a_{j_1}+d_2a_{j_2}+\cdots+d_ra_{j_r}##, where ##a_{j_i}## is the column of ##A## corresponding to ##b_{j_i}## of ##B##.

Corollary. The rref of a matrix is unique.

Proof. The proof is by induction on the number of columns of an ##m\times n## matrix ##A##. The base case is clear (the rref of a single column is either the ##0## vector or ##e_1##). Suppose now that the rref of a matrix with ##k## columns is unique and consider ##A## with ##k+1## columns, say ##u_1,u_2,\ldots,u_{k+1}##. Let ##A'## be the matrix obtained by deleting the final column of ##A##, that is, ##A=(A'\mid u_{k+1})##. If ##(B'\mid b)## is a rref of ##A##, we know that ##B'## is a rref of ##A'##, and ##B'## is unique by the induction hypothesis. Now I'm a bit unsure of what to show next, but I proceed by showing ##b## is unique. The set of pivotal columns ##P'## of ##B'## is also unique and corresponds to a maximal linearly independent subset of ##\{u_1,u_2,\ldots,u_k\}## by Theorem 3.16(c).

If ##P'\cup\{b\}## is linearly independent, this means $$\operatorname{rank}(A)=\operatorname{rank}(A')+1.$$Say this number equals ##r##. According to Theorem 3.16(b), ##b## is the only column that can equal ##e_r## (for otherwise ##\operatorname{rank}(A')\geq r##, contradicting ##\operatorname{rank}(A')=r-1##). In this case we conclude that ##(B'\mid b)## is unique, since ##b## cannot be anything other than ##e_r##.

If ##P'\cup\{b\}## is linearly dependent, then ##b## is not a pivotal column of ##(B'\mid b)##; otherwise ##P'\cup\{b\}## would be the set of pivotal columns of ##(B'\mid b)## and hence linearly independent, a contradiction. This means ##b## has a unique representation with respect to the set ##P'## (since these columns form a basis of the column space by Theorem 3.16(b)), say ##b=d_1e_1+d_2e_2+\cdots+d_re_r##, where ##r=\operatorname{rank}(B'\mid b)##. So there cannot be any other representation of ##b##, and we conclude ##(B'\mid b)## is unique. This concludes the proof. ##\blacksquare##
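As a sanity check (not part of the proof), here is a quick sympy experiment for the step that deleting the final column commutes with taking the rref; the matrix is just an arbitrary example:

```python
# Quick sanity check, not part of the proof: deleting the final column
# commutes with taking the rref (sympy's rref() is used as ground truth).
from sympy import Matrix

A = Matrix([[1, 2, 1, 3],
            [2, 4, 0, 2],
            [1, 2, 2, 4]])
B = A.rref()[0]                        # the rref of A, viewed as (B' | b)
Aprime = A[:, :-1]                     # A' = A with the final column deleted
print(B[:, :-1] == Aprime.rref()[0])   # True: B' is the rref of A'
```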

It's important to me that I'm staying true to the exercise, i.e. proving it as a corollary to the theorem. But I'm doubting what I'm doing in the induction step; I'm not really sure what I'm showing with ##b##. I have seen other proofs that assume there are two rrefs of ##A## and then show that these rrefs are equal. I feel like I accomplish something similar, though I'm not using any contradiction, which I've seen in the other proofs I've looked at.

I appreciate any feedback. Can you follow the proof? Is it correct?
 
It sounds right to me, but I think there are simpler ways of describing it. Don't use the term "pivotal column", since the theorem doesn't call them that:

If rank(B)=rank(B'), then B' already contains all the columns mentioned in point (b) of the theorem for B, and then (d) uniquely describes how b must be constructed. If not, b must be ##e_r##. Either way, you know exactly what b is once you know B'. I think this is the same proof in 5% of the space.
 
I suspect it is impossible to deduce uniqueness from this (to me) somewhat confusing theorem without assuming more about RREF, such as that j1 < j2 < ... < jr. I.e. if A has columns a1 = e2 and a2 = e1, then both B with b1 = e1, b2 = e2, and B with b1 = e2, b2 = e1, seem to satisfy the theorem. (Thus uniqueness is apparently not a corollary of properties (a),...,(d) of this theorem.)

[Reconsideration:
Well, perhaps I am being too hard on the theorem's statement. The only problem it has is giving properties that do not quite force the matrix B to be in reduced echelon form. So although there do exist matrices B satisfying properties (a),...,(d) of the theorem without being in RREF, if we assume that B not only satisfies (a),...,(d) but is also in RREF, then it does follow that B is unique, which is surely what they meant. Strictly speaking, however, I think the fact that the ji in part (c) are in their natural order should be stated there explicitly.]
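To make the example above concrete, here is a small sympy sketch; the helper satisfies_theorem is hypothetical and deliberately reads (a)-(d) literally, with no ordering imposed on the ji. Both candidate matrices pass, but only one of them is in RREF:

```python
from sympy import Matrix, eye

def satisfies_theorem(A, B):
    """Check properties (a)-(d) only -- deliberately NOT checking that B is in RREF."""
    m, n = A.shape
    r = A.rank()
    E = eye(m)
    # (a) B has exactly r nonzero rows
    if sum(1 for i in range(m) if any(B[i, j] != 0 for j in range(n))) != r:
        return False
    # (b) for each i = 1..r, some column of B equals e_i (no order imposed!)
    j = []
    for i in range(r):
        cols = [k for k in range(n) if B[:, k] == E[:, i]]
        if not cols:
            return False
        j.append(cols[0])
    # (c) columns j_1..j_r of A are linearly independent
    if Matrix.hstack(*[A[:, k] for k in j]).rank() != r:
        return False
    # (d) if column k of B is d_1 e_1 + ... + d_r e_r, then column k of A
    #     is d_1 a_{j_1} + ... + d_r a_{j_r}  (reading d_i = B[i, k], which
    #     assumes the nonzero rows of B are the first r rows, true here)
    for k in range(n):
        combo = sum((B[i, k] * A[:, j[i]] for i in range(r)), Matrix.zeros(m, 1))
        if combo != A[:, k]:
            return False
    return True

A  = Matrix([[0, 1], [1, 0]])   # columns a1 = e2, a2 = e1
B1 = eye(2)                     # the actual rref of A
B2 = Matrix([[0, 1], [1, 0]])   # rows out of echelon order
print(satisfies_theorem(A, B1), satisfies_theorem(A, B2))  # True True
print(B2 == B2.rref()[0])       # False: B2 satisfies (a)-(d) but is not in RREF
```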

Here is my favorite proof (conceptual) of uniqueness: since A has rank r, the row space of A has dimension r. Then there is a unique subspace S of R^n spanned by standard basis vectors ej1,...,ejr, with the sequence of indices j1 < ... < jr as small as possible in lexicographical order, such that projection of the row space onto the subspace S is an isomorphism. Then the rows of B are exactly the pullback of the standard basis ej1,...,ejr by this isomorphism. QED.

I.e.
To prove existence and uniqueness of the reduced echelon form of an mxn matrix, we are trying to produce a nice basis for the row space in R^n. Assume the row space, say of rank r, projects isomorphically onto the subspace of R^n spanned by the first r vectors e1,...,er. Then just take the r vectors in the row space that map to the r standard basis vectors under this projection. These are the rows of the reduced echelon form.

If the row space does not surject onto the first copy of R^r in R^n, project it onto each of the first r subspaces, and as soon as it does not surject, omit that index and go to the next one. E.g. if the row space surjects onto R^1 but not R^2, skip the second index and see whether the row space surjects onto the space spanned by e1 and e3. If so, include e3 and continue. Eventually we find a sequence of r standard basis vectors ej1,...,ejr, with j1 < ... < jr, such that the row space surjects (isomorphically) onto the subspace they span. Then use the preimages of those standard basis vectors as the rows of the RREF.

The key point in a proof of uniqueness is that the pivot columns are uniquely determined by A. A column aj is a pivot column (of A) if and only if it is not a linear combination of the previous columns of A. Then projection of the row space is an isomorphism onto the subspace of R^n spanned by the standard basis vectors ej1,...,ejr, where aj1,...,ajr are the pivot columns of A (and the ji are in their natural order). Pulling back these basis vectors by that isomorphism gives the row vectors of B.
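In code, this intrinsic description of the pivot columns can be checked against row reduction; a minimal sympy sketch (the example matrix is arbitrary):

```python
from sympy import Matrix

def pivot_columns(A):
    """Column j is a pivot column iff it is not a linear combination of the
    columns before it, i.e. iff appending it increases the rank so far."""
    pivots, r = [], 0
    for j in range(A.shape[1]):
        if A[:, : j + 1].rank() > r:
            pivots.append(j)
            r += 1
    return pivots

A = Matrix([[1, 2, 0, 3],
            [2, 4, 1, 1],
            [3, 6, 1, 4]])
print(pivot_columns(A))   # [0, 2]: determined by A alone, no row reduction
print(A.rref()[1])        # (0, 2): the pivots sympy's rref reports
```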

Another nice, more explicit proof is to note that row equivalent matrices have the same null space. But it is pretty easy to show that different row-reduced matrices do not have the same null spaces. E.g. if the first n-1 columns are the pivot columns, then the null space has dimension one, and the unique element of the null space with last entry 1 consists of (minus) the entries of the only non-pivot column of the RREF, followed by that 1. In general, the interesting entries of the usual basis for the null space are (minus) the interesting (non-pivot) entries of the RREF.
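A small sympy illustration of this remark (as it happens, nullspace() returns exactly this standard basis):

```python
from sympy import Matrix

A = Matrix([[1, 0, 2],
            [0, 1, 5],
            [1, 1, 7]])      # rank 2; pivot columns 0 and 1, column 2 free
B = A.rref()[0]
v = A.nullspace()[0]         # the null space is one-dimensional here
print(B[:, 2].T)             # [2, 5, 0]: the non-pivot column of the RREF
print(v.T)                   # [-2, -5, 1]: minus those entries, then a 1
```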

This is what the theorem above is trying to say: namely, elements of the null space tell you how to express the non-pivot columns of A in terms of its pivot columns, and this is the same (Theorem part (d)) as expressing the non-pivot columns of B in terms of its pivot columns; and since the pivot columns of B are standard basis vectors, this actually gives you the entries of the non-pivot columns of B.

Briefly, the theorem and corollary are roughly equivalent to saying that B is the only RREF matrix having the same "solutions" as A, which is correct.
 
Here is another nice way to see uniqueness. The key point as usual is to first determine the r pivot columns. For simplicity again assume they are the first r columns. In any case, knowing them determines a decomposition of R^n as R^r x R^(n-r), i.e. allows us to view R^n as a product of the pivot subspace and the non-pivot subspace of R^n.

Then the row space of A is an r dimensional subspace of R^r x R^(n-r) that projects isomorphically onto the first factor R^r. This is therefore exactly the graph of a unique linear transformation T: R^r --> R^(n-r). I.e. the elements of the row space all have the form (v, T(v)) for v in R^r. Then the RREF of A consists of exactly those r vectors in the row space of the form (e1, T(e1)), ..., (er, T(er)).

Thus to obtain the RREF of A, first determine the pivot space, then look at the row space R(A) of A as the graph of a unique linear transformation T defined on that pivot space, then select the standard basis ej1,...,ejr of that pivot space, and take the vectors corresponding to (ej1, T(ej1)), ..., (ejr, T(ejr)) (under the isomorphism of R^n with the product of the pivot and non-pivot spaces) as the rows of the RREF of A.

I.e. subspaces R(A) of R^n that project isomorphically onto R^r correspond uniquely to linear transformations from R^r to R^(n-r), and these correspond uniquely to sequences of r vectors in R^(n-r), namely to the non-pivot entries of the rows of the RREF.
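Here is a sympy sketch of that correspondence in the simple case where the pivot columns come first; the columns of V are a basis of the row space, and T is read off from the last n-r coordinates of the pulled-back vectors:

```python
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [2, 3, 8]])      # rank 2; pivot columns come first, as assumed above
V = A.T                      # columns of V form a basis of the row space
P = V[:2, :]                 # projection of that basis onto the pivot coordinates
rows = (V * P.inv()).T       # pull e1, e2 back through the projection
print(rows)                  # [[1, 0, 7], [0, 1, -2]]: so T(e1) = 7, T(e2) = -2
print(A.rref()[0] == rows)   # True
```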
 
Forgive me, I keep trying to make this uniqueness look natural and elementary. [see edit below for link with post #1]
So here is another statement of the characterization of RREF(A) = B, of an mxn matrix A of rank r, essentially just from the definition of the RREF.
First, as usual, determine the r pivot columns of A, namely every column of A which is not a linear combination of earlier columns. (It follows immediately that these pivot columns are a basis for the column space of A.) Then B = the RREF of A has only its first r rows nonzero, has the standard basis vectors e1,…,er as its pivot columns (in that order), and has the same row space as A, i.e. B is obtained from A by row operations.

Proof of uniqueness of B: First note that row operations may be obtained by left multiplication by invertible matrices. Thus there is an invertible matrix M such that MA = B, and since M acts on the columns of A, it restricts to an isomorphism S from the column space of A to the column space of B. Since M takes linear combinations to linear combinations, it takes the pivot columns of A to the pivot columns of B, and also takes the non-pivot columns of A to the non-pivot columns of B. (In particular the pivot columns of A and B are in the same positions.)
Now since the pivot columns of A are a basis for the column space of A, the isomorphism S is determined by what it does to those columns. Since by definition of B its pivot columns are e1,…,er, this means the behavior of S on the pivot columns of A is determined, hence its behavior on the non-pivot columns of A is also determined.

Thus B is entirely determined by knowing its pivot columns are e1,...,er (in order), and that the columns of B are obtained by applying M to the columns of A. (Since e1,...,er are a basis for the column space of B, it follows that B has only r nonzero rows.) I.e. B is entirely determined by knowing it is in RREF and is row equivalent to A.

Briefly, there is a unique isomorphism S from the column space of A to the subspace of R^m spanned by the standard basis vectors e1,…,er which takes the pivot columns of A to the ej. The columns of B are obtained by applying this unique S to the columns of A. Hence B is uniquely determined by A.
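A concrete way to produce such an M is the standard trick of row reducing the augmented block (A | I); a brief sympy sketch with an arbitrary example:

```python
from sympy import Matrix, eye

A = Matrix([[1, 2, 0],
            [2, 4, 1],
            [3, 6, 1]])
m, n = A.shape
aug = Matrix.hstack(A, eye(m)).rref()[0]   # row reduce (A | I) to (B | M)
B, M = aug[:, :n], aug[:, n:]
print(B == A.rref()[0])    # True: the left block is the RREF of A
print(M * A == B)          # True: M records the row operations
print(M.det() != 0)        # True: M is invertible
```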

EDIT: At last! I understand conceptually what that theorem was trying to say! Theorem part (d) is equivalent to saying that the function sending the kth column of B to the kth column of A extends to a linear isomorphism from the column space of B to that of A (which in particular preserves the positions of pivot columns). It follows that the inverse isomorphism takes the kth column of A to that of B, and since the pivot columns of B are determined as the standard basis vectors (and are in the same positions as those of A), the columns of B are all determined by the columns of A.
(Of course phrased this way, induction is unnecessary.)

I also just noticed that the argument above implies that for an invertible matrix M, MA=B is in RREF if and only if the first r columns of M^(-1) are the pivot columns of A. In particular, M is unique if and only if all columns of A are pivot columns, if and only if the columns of A are independent.
This makes sense, because the operations used to place a matrix in RREF depend only on the entries of the pivot columns, and not on what position those columns are in.
 