Intuition behind elementary operations on matrices

Mr Real · May 20, 2017

To find the inverse of a matrix A, we start from the expression A = I A (where I is the identity matrix) and, by applying elementary row or column operations, transform it until we get I = B A (where B is the inverse of A). But why do these operations work? Why does replacing the elements of a complete row using another row work, while, for example, we can't add or subtract the same number on both sides of the equality?
Is there some proof of these operations? What is the principle/intuition behind these operations? Can they be explained by any first principles?

Stephen Tashi · May 20, 2017

Mr Real said:

Why does changing elements of a complete row by another corresponding row work but e.g. we can't add/subtract the same number in the equality?

You have to be clear about what you mean by "work".

If the only requirement for a procedure to "work" is that it transforms an equation into another equation then we can add and subtract to both sides of a matrix equation. Each "side" of a matrix equation , when completely simplified, is a matrix. For example, suppose we have the equation ##C = D## where ##C## and ##D## are 2x3 matrices. If we add 7 to the entry in the 2nd row 3rd column on both sides of the equation, we get another equation. So adding 7 to the corresponding entry of both matrices does "work" in that sense.

Suppose we have the equation ##C = (F)(G)## where ##C,F,G## are 3x3 matrices. It is not true that we always transform this to an equivalent equation if we add 7 to the 2nd row 3rd column of ##C## and do the same only to the matrix ##F##. That isn't surprising. By analogy, if you have the equation in real variables ##0 = (x-2)(x+1)##, we don't preserve the solution set if we add 7 to the left hand side and add 7 only to the factor ##(x-2)## on the right hand side.

With an equation in real variables, you can multiply both sides by the same nonzero number. For example, we can transform the equation ##0 = (x/7 - 1/7)(x-2)## to an equivalent equation by multiplying both sides by ##7##. When we multiply the right side by 7, we are permitted to multiply only the factor ##(x/7 - 1/7)## by 7. By analogy, if we have the matrix equation ##C = (F)(G)## we can multiply both sides of the equation on the left by an invertible matrix ##E##, obtaining an equivalent equation ##(E)(C) = (E)(F)(G)##. In evaluating the right hand side, we can compute it as ##(EF)(G)##.

The reason an elementary row operation "works" is that it is the same as multiplying both sides of a matrix equation of the form ##C = (F)(G)## by a matrix ##E##. Have you studied how to find the "elementary matrix" that performs a given elementary row operation?

Mr Real · May 20, 2017

Okay, I understood everything you said above. But I still have some doubts. Consider C = (F)(G), now why does doing operations like replacing a complete row by another row of the same matrix or changing R2 to R1 - R3 (R refers to row) give an equivalent equation? What's the logic behind this?
Also while doing a row operation and changing C and G by the same amount (e.g. by replacing a row), it wouldn't give an equivalent equation but changing C and F by the same amount would. Why is it so?

jedishrfu · May 20, 2017

You might get some insight from a series of youtube videos by 3blue1brown on his channel.

https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw

Stephen Tashi · May 20, 2017

Mr Real said:

Consider C = (F)(G), now why does doing operations like replacing a complete row by another row of the same matrix or changing R2 to R1 - R3 (R refers to row) give an equivalent equation?

The operations you described may not give an equivalent equation, depending on which things in the equation are the variables. Not all manipulations with rows are "elementary row operations". An example of an elementary row operation is to replace R2 by R1 + R2.

That elementary row operation changes the matrix ## A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}## to ##\begin{pmatrix} a & b\\ a+c & b+d \end{pmatrix}##.

The same effect can be accomplished by the multiplication ##\begin{pmatrix} 1 &0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c& d \end{pmatrix}##. That's why I asked you if you have studied the "elementary matrices" that correspond to "elementary row operations".

When you perform an elementary row operation on a matrix equation ##C = (F)(G)## you are implementing a short-cut for multiplying both sides of the equation by an invertible matrix ##E##.
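The point above can be checked numerically. The following is a minimal Python sketch using plain lists; the helper `matmul` and the sample matrices `A`, `F` and `G` are made up for illustration and are not from the thread. It shows that the row operation "replace R2 by R1 + R2" agrees with left multiplication by ##E##, that applying it to ##C## and ##F## keeps ##C = (F)(G)## true, and that applying it to ##C## and ##G## instead generally does not.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

# Elementary matrix for "replace R2 by R1 + R2" on matrices with 2 rows.
E = [[1, 0],
     [1, 1]]

A = [[2, 3],
     [5, 7]]

# Doing the row operation by hand ...
by_hand = [A[0], [A[0][j] + A[1][j] for j in range(2)]]
# ... gives the same matrix as multiplying by E on the left.
assert matmul(E, A) == by_hand

# Apply the operation to both sides of C = (F)(G).
F = [[1, 2], [3, 4]]
G = [[0, 1], [1, 1]]
C = matmul(F, G)

lhs = matmul(E, C)               # row operation applied to C
rhs = matmul(matmul(E, F), G)    # row operation applied to F only, then times G
assert lhs == rhs                # E(FG) = (EF)G by associativity

# Applying the operation to G instead of F breaks the equality here.
wrong_rhs = matmul(F, matmul(E, G))
assert lhs != wrong_rhs
```

The last assertion is specific to these sample matrices; it illustrates why the row operation has to act on the leftmost factor of the product.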

Mr Real · May 21, 2017

Stephen Tashi said:

The operations you described may not give an equivalent equation

Can you provide an example of an equation of the form: C = (F)(G) where if we change R2 to R1 - R3, it doesn't give an equivalent equation.
And no, I didn't know about elementary matrices. Just read a bit about it and have understood these operations a little better. Will continue doing so.
Thanks for your response.
Mr R

Mark44 · May 21, 2017

Mr Real said:

Can you provide an example of an equation of the form: C = (F)(G) where if we change R2 to R1 - R3, it doesn't give an equivalent equation.

Without giving it very much thought, I don't believe that replacing R2 by R1 - R3 is a valid row operation.
There are three row operations. Each of these row operations can be represented by an elementary matrix.
1. Replace a row by a nonzero multiple of itself. ##R_i <-- kR_i##
2. Switch two rows. ##R_i <--> R_j##
3. Replace a row by itself plus a constant multiple of another row. ##R_i <-- R_i + kR_j##

Each row operation results in a changed matrix, but one that is equivalent to the one you started with. In other words, the three row operations don't change the solution set of the system of equations that the matrix represents.
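As a rough illustration of how each of the three operations corresponds to an elementary matrix, here is a small Python sketch. The function names (`scale`, `swap`, `add_multiple`) and the sample matrix are hypothetical, and rows are indexed from 0 as is usual in Python. Each elementary matrix is obtained by applying the operation to the identity matrix.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def scale(M, i, k):
    """R_i <-- k R_i, with k nonzero."""
    if k == 0:
        raise ValueError("k must be nonzero")
    return [[k * x for x in row] if r == i else list(row)
            for r, row in enumerate(M)]

def swap(M, i, j):
    """R_i <--> R_j."""
    out = [list(row) for row in M]
    out[i], out[j] = out[j], out[i]
    return out

def add_multiple(M, i, j, k):
    """R_i <-- R_i + k R_j, with i != j."""
    out = [list(row) for row in M]
    out[i] = [a + k * b for a, b in zip(M[i], M[j])]
    return out

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]
I3 = identity(3)

# Elementary matrices: the operation applied to the identity.
E_scale = scale(I3, 1, 5)
E_swap = swap(I3, 0, 2)
E_add = add_multiple(I3, 2, 0, -7)

# Left multiplication by each one performs the same operation on A.
assert matmul(E_scale, A) == scale(A, 1, 5)
assert matmul(E_swap, A) == swap(A, 0, 2)
assert matmul(E_add, A) == add_multiple(A, 2, 0, -7)
```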

Mr Real · May 21, 2017

Mark44 said:

Without giving it very much thought, I don't believe that replacing R2 by R1 - R3 is a valid row operation.

Okay, so one of the terms needs to be the row that is to be changed.

Mark44 said:

3. Replace a row by itself plus a constant multiple of another row. ##R_i <-- R_i + kR_j##

So wouldn't replacing R2 by R1 + 7R2 be fine?
Just a set of some other queries regarding this: so we can't replace, let's say, R1 by R1 - R2 + R3 (as it has 3 terms), or by R1 - R1/R2?
Thanks for your response.
Mr R

Mark44 · May 21, 2017

Mr Real said:

So wouldn't replacing R2 by R1 + 7R2 be fine?

This isn't an elementary row operation, but it could be done by two of these operations.
1. ##R_2 <-- 7R_2##
2. ##R_2 <-- R_2 + R_1##
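The two steps above can be sketched as a product of elementary matrices. This is only an illustrative check in Python; the 3x3 size, the helper `matmul`, and the sample matrix `A` are assumptions made for the example.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

E1 = [[1, 0, 0],
      [0, 7, 0],
      [0, 0, 1]]   # step 1: R2 <-- 7 R2

E2 = [[1, 0, 0],
      [1, 1, 0],
      [0, 0, 1]]   # step 2: R2 <-- R2 + R1

# E1 is applied first, so it sits on the right of the product.
combined = matmul(E2, E1)
assert combined == [[1, 0, 0],
                    [1, 7, 0],
                    [0, 0, 1]]

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]
result = matmul(combined, A)

# The new second row is R1 + 7 R2; the other rows are untouched.
assert result[1] == [A[0][j] + 7 * A[1][j] for j in range(3)]
assert result[0] == A[0] and result[2] == A[2]
```

Since both `E1` and `E2` are invertible, so is their product, which is why the combined replacement is acceptable even though it is not a single elementary operation.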

Mr Real said:

Just a set of some other queries regarding this: so we can't replace, let's say, R1 by R1 - R2 + R3 (as it has 3 terms), or by R1 - R1/R2?

For the first, this could be done in two steps, similar to what I did above.
For your second, dividing one row by another isn't a valid operation. The elementary row operations involve only the addition of two rows, multiplication of a row by a nonzero constant, and swapping two rows. They don't involve multiplying one row by another, dividing one row by another, taking the square root of a row, or any other such operation.

Stephen Tashi · May 21, 2017

Mr Real said:

Can you provide an example of an equation of the form: C = (F)(G) where if we change R2 to R1 - R3, it doesn't give an equivalent equation.

An "equation" is a statement involving variables. By contrast, an "equality" is simply a statement that two constant expressions are equal. For two equations to be "equivalent" they must have the same set of solutions.

The matrix equation ##\begin{pmatrix}x&0&0 \\0&y&0 \\0&0&z \end{pmatrix} = \begin{pmatrix} 1&0&0 \\0&1&0 \\0&0&1 \end{pmatrix} \begin{pmatrix} 3&0&0 \\0&4&0 \\0&0&5 \end{pmatrix} ## has solution ##x = 3, y =4, z = 5##.

The matrix equation ##\begin{pmatrix} x&0&0 \\x&0&-z \\0&0&z \end{pmatrix} = \begin{pmatrix} 1&0&0 \\1&0&-1 \\0&0&1 \end{pmatrix} \begin{pmatrix} 3&0&0 \\0&4&0 \\0&0&5 \end{pmatrix} ## has additional solutions such as ##x = 3, y = 13, z = 5## because the equation allows ##y## to be arbitrary.

It is correct that the row operation you mention does change an "equality" into another equality.

##\begin{pmatrix}3&0&0 \\0&4&0 \\0&0&5 \end{pmatrix} = \begin{pmatrix} 1&0&0 \\0&1&0 \\0&0&1 \end{pmatrix} \begin{pmatrix} 3&0&0 \\0&4&0 \\0&0&5 \end{pmatrix} ##

##\begin{pmatrix}3&0&0 \\3&0&-5 \\0&0&5 \end{pmatrix} = \begin{pmatrix} 1&0&0 \\1&0&-1 \\0&0&1 \end{pmatrix}\begin{pmatrix} 3&0&0 \\0&4&0 \\0&0&5 \end{pmatrix} ##

However changing from one equality to another is not sufficient to prove that such a series of steps reaches a desired goal. For example one could begin with a matrix equality and apply the operation "Change all rows to rows of zeroes". This would produce another equality, but serve no useful purpose.

The goal we are pursuing in manipulating the equality ##A = (I)(A)## is to reach another equality of the form ##I = (B)(A)##. For this goal to be reached, we must avoid producing an equality like
##\begin{pmatrix}3&0&0 \\3&0&-5 \\0&0&5 \end{pmatrix} = \begin{pmatrix} 1&0&0 \\1&0&-1 \\0&0&1 \end{pmatrix}\begin{pmatrix} 3&0&0 \\0&4&0 \\0&0&5 \end{pmatrix} ## because this leaves us no way to proceed that will change the left hand side to the identity matrix. The fact that the second column of entries in the left hand side is all zeroes prevents us from getting the second row ##0,1,0## of the identity matrix.
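For readers who want to verify the equality above by machine, here is a minimal Python sketch. The helpers `matmul` and `det3` are written for this example only. It also uses a standard fact that is not stated in the thread: a square matrix with determinant zero has no inverse, which is another way of seeing that "replace R2 by R1 - R3" cannot be undone.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Result of applying "R2 <-- R1 - R3" to the identity matrix.
M = [[1, 0, 0],
     [1, 0, -1],
     [0, 0, 1]]

A = [[3, 0, 0],
     [0, 4, 0],
     [0, 0, 5]]

product = matmul(M, A)
assert product == [[3, 0, 0],
                   [3, 0, -5],
                   [0, 0, 5]]   # the equality does hold

# M is not invertible, so the operation cannot be reversed,
# and the product cannot be row reduced to the identity.
assert det3(M) == 0
assert det3(product) == 0
```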

mathwonk · May 21, 2017

matrices represent linear transformations between two vector spaces. there are several natural equivalence relations in these terms. one is left equivalence, where two transformations, between the same two spaces, are "left - equivalent" iff they become equal after composing one of them with an isomorphism of the target space. by convention of writing compositions, this means composing on the left with an isomorphism. this turns out to be the same as saying the two transformations have the same row space, also iff they have the same kernel.

in terms of the matrices of the linear transformations, they are left equivalent iff multiplying one of them from the left by an invertible matrix changes it into the other. since every isomorphism can be written as a composition of a sequence of "elementary matrices", this is the same as saying one of the matrices can be changed into the other by a sequence of elementary row operations. I.e. left multiplying by an elementary matrix just performs an elementary row operation.

since any matrix can be changed into any other matrix with the same kernel by a sequence of row operations, then any square matrix with zero kernel, can be changed into the identity matrix by such a sequence. Necessarily such a sequence will have as product the inverse of the matrix. This is why we can compute an inverse by these operations.
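The procedure described in this thread (start from ##A = (I)(A)## and row reduce until the left side is ##I##) can be sketched in Python as follows. This is an illustrative implementation, not a reference one: the function name `inverse_by_row_ops` is hypothetical, and `fractions.Fraction` is used so the arithmetic stays exact. The list `left` plays the role of the left hand side, and `right` starts as ##I## and accumulates the product of the elementary matrices.

```python
from fractions import Fraction

def inverse_by_row_ops(A):
    """Row reduce A to the identity while applying the same operations to I."""
    n = len(A)
    left = [[Fraction(x) for x in row] for row in A]
    right = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if left[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")

        # Operation type 2: swap rows.
        left[col], left[pivot] = left[pivot], left[col]
        right[col], right[pivot] = right[pivot], right[col]

        # Operation type 1: scale the pivot row so the pivot becomes 1.
        k = left[col][col]
        left[col] = [x / k for x in left[col]]
        right[col] = [x / k for x in right[col]]

        # Operation type 3: subtract multiples of the pivot row from the others.
        for r in range(n):
            if r != col and left[r][col] != 0:
                m = left[r][col]
                left[r] = [a - m * b for a, b in zip(left[r], left[col])]
                right[r] = [a - m * b for a, b in zip(right[r], right[col])]

    return right   # left is now the identity, so right is the inverse

A = [[2, 1],
     [5, 3]]
B = inverse_by_row_ops(A)
assert B == [[3, -1], [-5, 2]]
```

A matrix such as `[[1, 1], [1, 1]]` makes the function raise `ValueError`, because a column with no usable pivot appears during the reduction.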

another equivalence relation is right equivalence, where two matrices are R-equivalent iff one can be changed into the other by right multiplication by an invertible matrix, or equivalently by a sequence of column operations. this happens iff the two matrices have the same image, i.e. same column space.

the 3rd equivalence relation for arbitrary matrices makes two matrices equivalent iff one can be changed into the other by using both row and column operations. for square matrices, the key equivalence relation is similarity, where we change one into the other by conjugating it by an invertible matrix.

there are normal forms for matrices under all these equivalences. the normal form for row equivalence is the row reduced echelon form; for column equivalence it is the column reduced form, i.e. two matrices are right equivalent iff their transposes have the same row reduced echelon form. the row reduced echelon form of every invertible matrix, for example, is the identity matrix.

for 2 sided equivalence, the normal form is diagonal, with only 1's and zeroes on the diagonal, so two matrices are 2-sided equivalent iff they have the same rank. finally two square matrices are similar iff they have the same rational canonical form, iff their characteristic matrices are 2 sided equivalent over the polynomial ring, i.e. iff their characteristic matrices have the same diagonal form, (when the diagonal entries are made to successively divide one another.) these forms can be computed by the euclidean algorithm.

there is another normal form for similarity, the jordan form, but it cannot be computed by hand unless one is lucky enough to be able to factor the characteristic polynomial, which in general is unfeasible, but in cooked examples found in books, it can be done by the rational root theorem, since in those examples the characteristic polynomial is rigged to have all its irreducible factors linear polynomials.

WWGD · May 21, 2017

I don't know if this is what you are looking for, but, formally, the set of all elementary matrices generates the group of all invertible matrices under the operation of multiplication. This means that if you start with the identity matrix (for convenience; you can start with any other invertible matrix) and apply a suitable finite sequence of elementary transformations, you can reach any invertible matrix.

Mr Real · May 21, 2017

Stephen Tashi said:

An "equation" is a statement involving variables. By contrast, an "equality" is simply a statement that two constant expressions are equal.

But I have studied that an "expression" is a statement involving variables and an equation is an expression that has an equality sign.

Stephen Tashi said:

It is correct that the row operation you mention does change an "equality" into another equality.

How is the second equation an equality? The product of the two matrices in the RHS does not give the matrix in the LHS (the second row of the matrix in the LHS should be [0 0 0] instead of [3 0 -5]).

Stephen Tashi said:

this leaves us no way to proceed that will change the left hand side to the identity matrix. The fact that the second column of entries in the left hand side is all zeroes prevents us from getting the second row 0,1,0 of the identity matrix.

How did you come to know just by seeing the equality that the second column elements are all zero that the matrix on the left cannot be converted into an identity matrix? How does second column elements all being zero stop us from converting it to an identity matrix? Is it some property related to matrices?
Thanks
Mr R

Stephen Tashi · May 21, 2017

Mr Real said:

But I have studied that an "expression" is a statement involving variables and an equation is an expression that has an equality sign.

An equation has an equality sign, but an equation implies there are variables, for which we wish to find solutions. For example, "3 = 2 + 1" does not say anything about a variable being involved. You could pose a problem by saying "Find x such that 3 = 2 + 1" and force the reader to consider a variable. In that case the solution set for the "equation" 3 = 2 + 1 would be all numbers.

How is the second equation an equality? The product of the two matrices in the RHS does not give the matrix in the LHS (the second row of the matrix in the LHS should be [0 0 0] instead of [3 0 -5]).

Second "equation" or second "equality" ? In either case, check your multiplication.
For example the computation for the first element in the second row is ##(1)(3) + (0)(0)+ (-1)(0) = 3##.

How did you come to know just by seeing the equality that the second column elements are all zero that the matrix on the left cannot be converted into an identity matrix? How does second column elements all being zero stop us from converting it to an identity matrix? Is it some property related to matrices?

In performing elementary row operations, once you get a column of zeroes, no amount of adding rows together or multiplying rows by numbers will get rid of the zeroes, because every new entry in that column is a sum of multiples of zeroes. It's just something you'll learn from experience.

mathwonk · May 21, 2017

i apologize if my answer was too long, but i was trying to be complete. WWGD's nice answer for example is my second sentence, second paragraph. post 11.

maybe this point indeed is what the OP wants to grasp. i.e. we are trying to transform a matrix into a simple form by multiplying on the left by a suitable invertible matrix. Now elementary matrices are chosen because they are basic building blocks for all invertible matrices. Thus we can achieve the same result of left multiplying by any invertible matrix, in stages, by performing a sequence of elementary row operations. So elementary row operations are a tool for accomplishing the left multiplication of a matrix by any invertible matrix. In particular, if starting from an invertible matrix, we can reach the identity only by left multiplying by its left inverse. hence if we can find any sequence of operations that results in the identity, then those must have as their product, the inverse matrix.

as to why some of your operations are not acceptable, all operations must be invertible, i.e. it must be possible to reverse their effect by some other operation. but some of your operations lose information which cannot be recovered by another operation, as Stephen Tashi is explaining to you.

Mr Real · May 22, 2017

Stephen Tashi said:

Second "equation" or second "equality" ? In either case, check your multiplication.
For example the computation for the first element in the second row is ##(1)(3) + (0)(0)+ (-1)(0) = 3##.

Yeah, you are right, it was my mistake.

Mr Real · May 22, 2017

mathwonk said:

i apologize if my answer was too long, but i was trying to be complete.

Actually, I've just started to learn about matrices, so many of the terms used in your previous answer were alien to me but I understood this answer.

mathwonk said:

as to why some of your operations are not acceptable, all operations must be invertible, i.e. it must be possible to reverse their effect by some other operation. but some of your operations lose information which cannot be recovered by another operation

But how is an operation like converting R2 to R1 - R3 not invertible but converting it to R2 - R3 is invertible? Can you explain this with an example?
Thanks
Mr R

jedishrfu · May 22, 2017

How would you get back R2 in your matrix? You can't add or subtract any rows now; you've lost R2 permanently.

However the R2-R3 replacement for R2 means you can revert back to the old R2 just by adding R3 to it.
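The difference between the two replacements can be seen in a short Python sketch. The function names and the matrices `P` and `Q` are made up for illustration. Replacing R2 by R2 - R3 can be undone, while replacing R2 by R1 - R3 sends two different matrices to the same result, so the old R2 cannot be recovered.

```python
def r2_minus_r3(M):
    """R2 <-- R2 - R3 (an elementary row operation)."""
    r1, r2, r3 = M
    return [list(r1), [a - b for a, b in zip(r2, r3)], list(r3)]

def r2_plus_r3(M):
    """R2 <-- R2 + R3 (undoes the operation above)."""
    r1, r2, r3 = M
    return [list(r1), [a + b for a, b in zip(r2, r3)], list(r3)]

def r1_minus_r3_into_r2(M):
    """R2 <-- R1 - R3 (not elementary: the old R2 is discarded)."""
    r1, r2, r3 = M
    return [list(r1), [a - b for a, b in zip(r1, r3)], list(r3)]

P = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]

Q = [[1, 2, 3],
     [0, 0, 0],      # same as P except for the second row
     [7, 8, 10]]

# Invertible: adding R3 back recovers the original matrix.
assert r2_plus_r3(r2_minus_r3(P)) == P

# Not invertible: P and Q differ, yet they give the same result,
# so no operation can tell which one we started from.
assert P != Q
assert r1_minus_r3_into_r2(P) == r1_minus_r3_into_r2(Q)
```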

Mr Real · May 22, 2017

jedishrfu said:

How would you get back R2 in your matrix? You can't add or subtract any rows now; you've lost R2 permanently.

Yeah, I get it now. So, would this work for all elementary row/column operations and for any matrix?
Thanks
Mr R

PeroK · May 22, 2017

Mr Real said:

Yeah, I get it now. So, would this work for all elementary row/column operations and for any matrix?
Thanks
Mr R

In this context matrices are just a convenient way of representing a set of equations. As row operations involve at most two rows, you can explain it all using just two equations. These can have any number of variables, but let's have three in this example:

##a_1x + b_1y + c_1z = d_1##
##a_2x + b_2y + c_2z = d_2##

Now, let's suppose we have found ##(x, y, z)## that solve these equations. It's clear that if:

##a_1x + b_1y + c_1z = d_1##

Then:

##k(a_1x + b_1y + c_1z) = k(d_1)##

Where ##k## is some non-zero constant. And, likewise, if this second equation holds (with ##k## in it) then so does the first without the ##k##.

That's why you can multiply a row by a constant.

It's even clearer that if:

##a_1x + b_1y + c_1z = d_1##
##a_2x + b_2y + c_2z = d_2##

Then:

##a_2x + b_2y + c_2z = d_2##
##a_1x + b_1y + c_1z = d_1##

That's just the same equations written in a different order, so must have the same solutions.

Finally, if

##a_1x + b_1y + c_1z = d_1##
##a_2x + b_2y + c_2z = d_2##

Then adding these together gives:

##(a_1+a_2)x + (b_1+b_2)y + (c_1+c_2)z = d_1+d_2##

And, if we put this together with either of the original equations we have:

##a_1x + b_1y + c_1z = d_1##
##(a_1+a_2)x + (b_1+b_2)y + (c_1+c_2)z = d_1+d_2##

This has exactly the same solutions as the original equations. That's why you can add one row to another. Adding a multiple of one row to another isn't really a new operation, just a combination of these ones.
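The argument above can be spot-checked with a small Python sketch. The two sample equations, the helper names, and the integer grid are all assumptions chosen for illustration; checking a finite grid of points is only a sanity check, not a proof. The last lines also show why the constant ##k## has to be non-zero.

```python
from itertools import product

def satisfies(system, point):
    """True if the point solves every equation (coeffs, rhs) in the system."""
    return all(sum(c * v for c, v in zip(coeffs, point)) == rhs
               for coeffs, rhs in system)

def scale(eq, k):
    """Multiply both sides of one equation by k."""
    coeffs, rhs = eq
    return tuple(k * c for c in coeffs), k * rhs

def add(eq1, eq2):
    """Add two equations side by side."""
    (c1, d1), (c2, d2) = eq1, eq2
    return tuple(a + b for a, b in zip(c1, c2)), d1 + d2

eq1 = ((1, 2, -1), 3)    # x + 2y - z = 3
eq2 = ((2, -1, 1), 4)    # 2x - y + z = 4
original = [eq1, eq2]

variants = {
    "scaled": [scale(eq1, 5), eq2],
    "swapped": [eq2, eq1],
    "added": [eq1, add(eq1, eq2)],
}

# Compare solutions on a grid of integer points (x, y, z).
grid = list(product(range(-5, 6), repeat=3))
solutions = [p for p in grid if satisfies(original, p)]
for name, system in variants.items():
    assert [p for p in grid if satisfies(system, p)] == solutions

# Scaling by zero is not allowed: it turns eq1 into 0 = 0 and admits new points.
zeroed = [scale(eq1, 0), eq2]
assert satisfies(zeroed, (0, 0, 4)) and not satisfies(original, (0, 0, 4))
```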

Mr Real · May 22, 2017

PeroK said:

This has exactly the same solutions as the original equations. That's why you can add one row to another. Adding a multiple of one row to another isn't really a new operation, just a combination of these ones.

So, with regard to what jedishrfu said that we can always get R2 back if we apply any elementary operation, does it hold for every matrix?

PeroK · May 22, 2017

Mr Real said:

So, with regard to what jedishrfu said that we can always get R2 back if we apply any elementary operation, does it hold for every matrix?

You can see for yourself. If you start with:

R1
R2

If you then add R1 to R2, you have an equivalent set of equations:

R1
R2 + R1

Which has exactly the same solutions.

If you want R2 back, you simply subtract R1 from the second row and you are back to:

R1
R2

The key point is that these operations change the set of equations and change the matrix, but they do not change the solutions. The new set of equations always has the same set of solutions.

Mr Real · May 22, 2017

Thank you everyone for giving such brilliant answers to my doubts, especially Stephen Tashi. Thanks to all of you, matrices have become clearer to me.
p.s. Thanks for suggesting the awesome playlist @jedishrfu. Just watched most of them!

Mr R

Mark44 · May 22, 2017

PeroK said:

In this context matrices are just a convenient way of representing a set of equations.

This is an important point, and one that many students overlook.

Mr Real · May 24, 2017

Hey, I have come to know a little more about matrices so had doubts about this particular statement.

PeroK said:

In this context matrices are just a convenient way of representing a set of equations.

Well, I know that matrices can be used to represent a set of equations (and then we can find solutions for x, y, z, etc.), but isn't that just one of the many uses of matrices, not what matrices themselves are? Matrices can also be used for other things, like describing linear transformations in 2D, 3D, etc., can't they?

PeroK · May 24, 2017

Mr Real said:

Hey, I have come to know a little more about matrices so had doubts about this particular statement.
Well, I know that matrices can be used to represent a set of equations (and then we can find solutions for x, y, z, etc.), but isn't that just one of the many uses of matrices, not what matrices themselves are? Matrices can also be used for other things, like describing linear transformations in 2D, 3D, etc., can't they?

Yes, but if they stand for Linear Transformations you can't do elementary row operations on them and still get the same transformation. Which was more or less my point. The matrices change, but in this context (the context of sets of linear equations) the change is not relevant.

In other contexts you cannot apply row operations without fundamentally changing what the matrix represents.

Mr Real · May 24, 2017

PeroK said:

Yes, but if they stand for Linear Transformations you can't do elementary row operations on them and still get the same transformation. Which was more or less my point.

But if we have an equality like the one we were discussing; C = (F)(G),
even if we considered these matrices as linear transformations, and apply elementary operations on both sides wouldn't we get the same transformation on both sides?

PeroK · May 24, 2017

Mr Real said:

But if we have an equality like the one we were discussing; C = (F)(G),
even if we considered these matrices as linear transformations, and apply elementary operations on both sides wouldn't we get the same transformation on both sides?

No. For example:

##\begin{pmatrix} 1 &1 \\ 1 & 1 \end{pmatrix}##

Is not the same linear transformation as:

##\begin{pmatrix} 1 &1 \\ 0 & 0 \end{pmatrix}##

Mr Real · May 24, 2017

PeroK said:

No. For example:

##\begin{pmatrix} 1 &1 \\ 1 & 1 \end{pmatrix}##

Is not the same linear transformation as:

##\begin{pmatrix} 1 &1 \\ 0 & 0 \end{pmatrix}##

I know they are not the same, but that's not what I asked. What I asked is: if we have an equality C = (F)(G) and we apply an elementary row operation on both sides, then the new LHS will be different from C and the new RHS will be different from (F)(G), but the RHS will still be equal to the LHS; that is, we would get the same transformation on both sides even after applying an elementary operation.

PeroK · May 24, 2017

Mr Real said:

I know they are not the same, but that's not what I asked. What I asked is: if we have an equality C = (F)(G) and we apply an elementary row operation on both sides, then the new LHS will be different from C and the new RHS will be different from (F)(G), but the RHS will still be equal to the LHS; that is, we would get the same transformation on both sides even after applying an elementary operation.

I'll let you find your own counterexample to that. Hint: think about the identity matrix.

But, if you think for a moment about matrix multiplication you will see that there is no way that row operations would support that sort of result.
