Elaboration on matrix operations.

Shawn Garsed · Jul 21, 2010

Hi everybody,

I'm currently studying matrices on an algebra 2 level. The problem is that right now the book is just teaching what to do and how to do it, but it's not telling why it has to be done that way (for instance: matrix multiplication, determinants and matrix inversion). Basically, it's not teaching the theory behind it.
My question is:
Is it possible to learn the theory behind matrices at an algebra 2 level?
If so, is there a book or a website you can refer me to?

Thanks in advance,

Shawn Garsed

arildno · Jul 21, 2010

"...but it's not telling why it has to be done that way (for instance: matrix multiplication, determinants and matrix inversion)."

Here, you must not confuse two levels:

1. THE LEVEL OF DEFINITIONS
The operations you could "do" with a matrix or two are unlimited:
You may add them termwise, you might switch a few terms between them, or take any other creative measure.

Which operations we choose to define is largely arbitrary.

2. ON BASIS OF DEFINITIONS, WHAT CAN THE OPERATION "DO"?
Now, for example, with matrix multiplication, as defined in 1, we may PROVE a variety of properties.
For example, if A, B, C are matrices we generally have (as long as their sizes allow the products to be formed):
A*(B*C)=(A*B)*C
That is, matrix "multiplication" is an associative operation.

Furthermore, given a matrix sum operation defined as elementwise addition (only possible if the matrices are of equal size!), we also have distributivity of matrix "multiplication":
A*(B+C)=A*B+A*C

Because we can PROVE that these properties hold, given our definition of matrix "multiplication", it is indeed defensible to attach the label "multiplication" to that operation, since those properties are, generally, present for "normal" multiplication of single numbers.

Note, however, that in general, matrix multiplication is NOT commutative, i.e., we generally do NOT have the identity A*B=B*A
In fact, one of the sides might not even be a permitted operation!

Once you understand the difference between definitions, which are largely arbitrary and therefore have no deeper "why"-answer, you can focus on those "why"-answers that ARE deep, namely the proofs of the various properties the operations in question have.
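As an illustration of the three claims above, here is a minimal sketch in plain Python that checks associativity, distributivity and the failure of commutativity on one set of 2x2 matrices. The helper names `matmul` and `matadd` and the sample matrices are chosen here for illustration only, and a numerical check on one example is of course not a proof.

```python
def matmul(X, Y):
    """Row-by-column product of two matrices given as lists of rows."""
    assert len(X[0]) == len(Y), "inner dimensions must match"
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def matadd(X, Y):
    """Elementwise sum of two matrices of equal size."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

# Arbitrary sample matrices (assumed for this example only).
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]

assoc = matmul(A, matmul(B, C)) == matmul(matmul(A, B), C)
distrib = matmul(A, matadd(B, C)) == matadd(matmul(A, B), matmul(A, C))
commut = matmul(A, B) == matmul(B, A)

print(assoc, distrib, commut)  # True True False
```

Here A*B is [[2, 1], [4, 3]] while B*A is [[3, 4], [1, 2]], so the two orders really do differ.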

Shawn Garsed · Jul 21, 2010

Let me give you an example regarding matrix inversion from the book.

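It says, in order to get the inverse of a matrix, you have to do this:

The inverse of the matrix
[tex]\left[ \begin{array} {cc} a & b \\ c & d\end{array}\right][/tex]
is
[tex]\frac{1}{ad-bc}\left[ \begin{array} {cc} d & -b \\ -c & a\end{array}\right][/tex]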

It just gives me that formula and that's it, it doesn't say anything about how it's derived.

Hope that clears up my question.

Mark44 · Jul 21, 2010

Here's a start to the "why" for this matrix inverse formula.
Given a matrix A that is invertible (has an inverse A^-1) it must be true that AA^-1 = I, where I is the identity matrix of the same order as A. (It must also be true that A^-1A = I.)

Using your 2 x 2 matrix, and assuming the constants a, b, c, and d are known, form the matrix equation below.
[tex]\left[ \begin{array} {cc} a & b \\ c & d\end{array}\right]\left[ \begin{array} {cc} e & f \\ g & h\end{array}\right] = \left[ \begin{array} {cc} 1 & 0 \\ 0 & 1\end{array}\right][/tex]
Expanding the matrix product on the left yields four equations in the four unknowns e, f, g, and h. Solving for those unknowns gives the coefficients of the inverse matrix.
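As a sketch of how that calculation might go (this working is added for illustration, using the same unknowns e, f, g, h), expanding the product gives the four equations

[tex]ae + bg = 1, \quad af + bh = 0, \quad ce + dg = 0, \quad cf + dh = 1[/tex]

Multiplying the first equation by d and the third by b and subtracting eliminates g. Multiplying the first by c and the third by a and subtracting eliminates e. This leaves

[tex](ad - bc)e = d, \quad (ad - bc)g = -c[/tex]

Treating the second and fourth equations the same way gives

[tex](ad - bc)f = -b, \quad (ad - bc)h = a[/tex]

Provided ad - bc is not zero, we can divide by it and obtain

[tex]\left[ \begin{array} {cc} e & f \\ g & h\end{array}\right] = \frac{1}{ad-bc}\left[ \begin{array} {cc} d & -b \\ -c & a\end{array}\right][/tex]

which is the formula quoted from the book. It also shows why the quantity ad - bc matters: if it is zero, the division is impossible and no inverse exists.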

arildno · Jul 21, 2010

Shawn Garsed said:

Let me give you an example regarding matrix inversion from the book.

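It says, in order to get the inverse of a matrix, you have to do this:

The inverse of the matrix
[tex]\left[ \begin{array} {cc} a & b \\ c & d\end{array}\right][/tex]
is
[tex]\frac{1}{ad-bc}\left[ \begin{array} {cc} d & -b \\ -c & a\end{array}\right][/tex]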

It just gives me that formula and that's it, it doesn't say anything about how it's derived.

Hope that clears up my question.

Okay!

Well, think of it this way:

You are to find a 2*2 matrix which is the general inverse of an arbitrary matrix, with elements (a,b,c,d).

Thus, you have 4 unknown quantities to find, namely those of the inverse, call them (x,y,z,w).

Now:
Can you construct a system of 4 equations that MUST hold for the 4 unknowns (x,y,z,w), knowing that as elements in a matrix, it is to be the multiplicative inverse of a matrix with elements (a,b,c,d)?

Shawn Garsed · Jul 21, 2010

I understand how to find the inverse matrix the way you describe it, but how do you go from that to the formula I described in my previous post?

It's hard to explain what I mean by saying I want to know the theory behind matrices. And what the book gives me is just not sufficient. For instance, I know how to calculate the determinant of a matrix, but what does it actually refer to? It's not like someone just started cross-multiplying and subtracting the elements of a 2x2 matrix for the fun of it and thought, hey, this could be useful. Same thing goes for multiplying matrices. There must be a reason why multiplying matrices isn't done in the same way as adding or subtracting matrices, multiplying each element in one matrix with the corresponding element in the other.

I hope you get the gist of what I'm saying cause it's difficult to explain.

Mark44 · Jul 21, 2010

Shawn Garsed said:

I understand how to find the inverse matrix the way you describe it, but how do you go from that to the formula I described in my previous post?

Both arildno and I laid out how you would arrive at the formula you showed.

Shawn Garsed said:

It's hard to explain what I mean by saying I want to know the theory behind matrices. And what the book gives me is just not sufficient. For instance, I know how to calculate the determinant of a matrix, but what does it actually refer to?

That's a somewhat vague question, but one thing that it "determines" is whether a square matrix is invertible (det != 0) or noninvertible (det = 0). It might help to look at this Wiki article, http://en.wikipedia.org/wiki/Determinant, especially the History section near the bottom of the page.

Shawn Garsed said:

It's not like someone just started cross-multiplying and subtracting the elements of a 2x2 matrix for the fun of it and thought, hey, this could be useful. Same thing goes for multiplying matrices. There must be a reason why multiplying matrices isn't done in the same way as adding or subtracting matrices, multiplying each element in one matrix with the corresponding element in the other.

I don't know the history of matrix multiplication, but to understand how matrix multiplication works, you need to know something about vector multiplication, particularly the dot, or inner product. Each entry in a product matrix is the result of the dot product of a row vector in the first matrix with a column vector in the second matrix. For example, the (2, 3) element in a product matrix is the result of dotting the 2nd row of the first matrix with the 3rd column of the second matrix.
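The row-times-column description above can be sketched in a few lines of plain Python. The function names `dot` and `entry` and the sample matrices are made up for this illustration. Indices are 1-based to match the "(2, 3) element" wording.

```python
def dot(u, v):
    """Dot (inner) product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))

def entry(X, Y, i, j):
    """(i, j) entry of the product X*Y, with 1-based i and j."""
    row = X[i - 1]                  # i-th row of the first matrix
    col = [r[j - 1] for r in Y]     # j-th column of the second matrix
    return dot(row, col)

# Sample matrices (assumed for this example): X is 3 x 2, Y is 2 x 3.
X = [[1, 2], [3, 4], [5, 6]]
Y = [[1, 0, 2], [0, 1, 3]]

e23 = entry(X, Y, 2, 3)
print(e23)  # 3*2 + 4*3 = 18
```

The 2nd row of X is (3, 4) and the 3rd column of Y is (2, 3), so the (2, 3) entry of the product is 18.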

Shawn Garsed said:

I hope you get the gist of what I'm saying cause it's difficult to explain.

arildno · Jul 21, 2010

Shawn Garsed said:

For instance, I know how to calculate the determinant of a matrix, but what does it actually refer to?

It refers to itself.
Historically, it was seen that the "determinant" of a system of linear equations was a number that regularly cropped up in the solution to that system.
This is elegantly expressed by Cramer's rule, for example.
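To make the remark about Cramer's rule concrete, here is a minimal sketch for a 2x2 system a*x + b*y = p, c*x + d*y = q. The function names `det2` and `cramer2` and the sample system are assumptions made for this illustration. Exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

def det2(a, b, c, d):
    """Determinant of the 2x2 matrix with rows (a, b) and (c, d)."""
    return a * d - b * c

def cramer2(a, b, c, d, p, q):
    """Solve a*x + b*y = p, c*x + d*y = q by Cramer's rule."""
    D = det2(a, b, c, d)
    if D == 0:
        raise ValueError("determinant is zero: no unique solution")
    # Replace one column of the coefficient matrix by the right-hand side.
    x = Fraction(det2(p, b, q, d), D)
    y = Fraction(det2(a, p, c, q), D)
    return x, y

# Sample system (assumed): 2x + y = 5, x + 3y = 10.
x, y = cramer2(2, 1, 1, 3, 5, 10)
print(x, y)  # 1 3
```

The same number ad - bc appears in the denominator of both unknowns, which is the sense in which the determinant "crops up" in the solution.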

Same thing goes for multiplying matrices. There must be a reason why multiplying matrices isn't done in the same way as adding or subtracting matrices, multiplying each element in one matrix with the corresponding element in the other.

I have already told you that by defining matrix multiplication in that particular way, some salient properties of standard multiplication will also be present in matrix multiplication (associativity and distributivity), and that this presence is provable.

trambolin · Jul 21, 2010

Agreeing with the rest of the posts, let me give you another view.

You have to set a threshold level for yourself for what you accept. Otherwise you have to get into more complicated mathematical structures, e.g. groups, rings, etc., before you master the details of simpler concepts such as the matrix inverse, the multiplication operation defined on vector spaces, etc.

Hence you have to take for granted some, not all, of the operations defined on mathematical objects. You might want to look at the equation Ax = b, with A a matrix and x, b vectors. There are already many questions to be asked, but set a level for yourself such that the symbols you have written are believable. Then you should start building on it. For example, matrices are somewhat natural objects, as I have pointed out (http://acharya.iitm.ac.in/mirrors/vv/vidya/emathist.html) many times on this forum. Hence we might agree on writing simultaneous equations in matrix form. You should also notice that there are some historical conventions, accepted worldwide, for writing a matrix equation.

The next thing to wonder about might be what we should do when we have the solution at hand, say x = c. If we insist on the matrix form, then we have to find a matrix A that makes it fit. Then we might end up with the identity matrix, i.e. Ix = c. As the name implies, this is the identity element for our operation, namely matrix multiplication.

Again persisting with the idea and banging our head on the wall, we might come up with the idea of having a left inverse, etc. Some elements do not have an inverse. Why should that be? And so on...

Of course historically, things were not as hygienic as I put them here, but can you see a pattern? We need some operations and we need other objects that we will frequently use in these operations. Hence you might wish to set a level for your understanding and force yourself to stay on that level unless you really feel the presence of a more general concept (in this case I would say abstract algebra or group theory) waiting in line.

Also have a look at
https://www.physicsforums.com/showthread.php?t=416051

Hope it helps.

Tobias Funke · Jul 21, 2010

Matrix multiplication is defined the way it is so that it corresponds to the composition of linear transformations. For simplicity, take a vector v=(x y) and 2x2 matrices A and B. Calculate Bv and then A(Bv). Now calculate (AB)v. That's why the seemingly arbitrary definition is used, and it's not hard to see that it works for the general case.
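The calculation suggested above can be tried numerically. This is a minimal sketch in plain Python. The helpers `matvec` and `matmul` and the particular A, B and v are chosen here only for illustration.

```python
def matvec(M, v):
    """Apply matrix M (list of rows) to vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(X, Y):
    """Standard row-by-column matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

# Sample data (assumed for this example).
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -1]]
v = [2, 7]

step_by_step = matvec(A, matvec(B, v))   # first B, then A
all_at_once = matvec(matmul(A, B), v)    # the single matrix AB

print(step_by_step, all_at_once)  # [13, 33] [13, 33]
```

Applying B and then A gives the same vector as applying the single matrix AB, which is what "corresponds to composition" means here.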

Some of the responses seem to suggest that you shouldn't think about why a certain definition is used, just use it. I don't agree and I think you should continue to ask this kind of question.

arildno · Jul 22, 2010

Tobias Funke said:

Matrix multiplication is defined the way it is so that it corresponds to the composition of linear transformations. For simplicity, take a vector v=(x y) and 2x2 matrices A and B. Calculate Bv and then A(Bv). Now calculate (AB)v. That's why the seemingly arbitrary definition is used, and it's not hard to see that it works for the general case.

This is based on the property of associativity, which is a much more fundamental mathematical property than "composition of linear transformations".

Tobias Funke · Jul 22, 2010

Sure it's more fundamental, but the "obvious" definition to use is entrywise multiplication (apparently called the Hadamard product according to Wikipedia) and that's associative, distributive, commutative, and has an identity element. There must be another reason besides associativity to adopt the standard definition.

I guess it could be justified by saying that it leads to deep results later on, but at least there's some initial motivation behind it.
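To see the difference between the two candidate products, here is a small sketch in plain Python. The names `hadamard`, `matmul` and `matvec` and the sample data are assumptions for this illustration. It checks that the entrywise product is commutative on one example, but that it does not reproduce "apply B, then apply A".

```python
def hadamard(X, Y):
    """Entrywise (Hadamard) product of two matrices of equal size."""
    return [[x * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def matmul(X, Y):
    """Standard row-by-column matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def matvec(M, v):
    """Apply matrix M (list of rows) to vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Sample data (assumed for this example).
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -1]]
v = [2, 7]

entrywise_commutes = hadamard(A, B) == hadamard(B, A)
composed = matvec(A, matvec(B, v))
via_standard = matvec(matmul(A, B), v)
via_hadamard = matvec(hadamard(A, B), v)

print(entrywise_commutes)                     # True
print(composed, via_standard, via_hadamard)   # [13, 33] [13, 33] [14, 2]
```

On this example only the standard product agrees with the composed transformation.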

Office_Shredder · Jul 22, 2010

arildno said:

This is based on the property of associativity, which is a much more fundamental mathematical property than "composition of linear transformations".

The fact that composing linear transformations is associative is why multiplying matrices is associative.

All matrix multiplication/addition rules are based on the fact that matrices are supposed to be linear transformations. Furthermore, a matrix that has columns [tex]v_1,...,v_m[/tex] sends the canonical basis elements [tex]e_1,...,e_m[/tex] to [tex]v_1,...,v_m[/tex] respectively ([tex]e_i[/tex] has a 1 in the [tex]i[/tex]th position and 0s everywhere else).

Then you can use linearity to identify the value of any other vector when the matrix is applied. For example, the vector (3,2) can be written as 3*(1,0)+2*(0,1)=3*e₁+2*e₂. So if we applied a matrix with columns v₁=(1,4,2) and v₂=(5,2,-1) to (3,2), we know that the result should be 3*(1,4,2)+2*(5,2,-1). Matrix multiplication is just the rule on the entries of matrices and vectors that makes sure this happens.
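The example with v₁=(1,4,2) and v₂=(5,2,-1) can be checked directly. The sketch below is in plain Python, and the helper name `matvec` is chosen here for illustration. It compares the linear combination 3*v₁+2*v₂ with the usual matrix-times-vector computation.

```python
def matvec(M, v):
    """Apply matrix M (list of rows) to vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

v1 = [1, 4, 2]
v2 = [5, 2, -1]

# The 3 x 2 matrix whose columns are v1 and v2.
M = [[1, 5],
     [4, 2],
     [2, -1]]

combo = [3 * a + 2 * b for a, b in zip(v1, v2)]   # 3*v1 + 2*v2
product = matvec(M, [3, 2])                       # M applied to (3, 2)

print(combo, product)  # [13, 16, 4] [13, 16, 4]
```

Both routes give (13, 16, 4), so the row-by-column rule is just bookkeeping for "take this combination of the columns".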

Tobias Funke · Jul 22, 2010

Another justification for the standard product is pretty simple. If you've defined matrix-vector multiplication already (maybe when studying systems of equations), then the entrywise product usually won't work in this setting since both matrices have to be the same size.

Then defining AB=(Ab_1, Ab_2, ... , Ab_n), where the b_i's are the columns of B seems pretty natural. Of course it eventually all ties in with composition, but a priori it has nothing to do with it.
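The column-by-column definition above can be sketched as follows, in plain Python. The names `matvec`, `columns` and `product_by_columns` and the sample matrices are made up for this illustration. It builds AB purely from matrix-vector products and compares the result with the row-by-column formula.

```python
def matvec(M, v):
    """Apply matrix M (list of rows) to vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def columns(M):
    """Columns of M as a list of vectors."""
    return [list(col) for col in zip(*M)]

def product_by_columns(A, B):
    """AB defined as the matrix with columns A*b_1, ..., A*b_n."""
    cols = [matvec(A, b) for b in columns(B)]
    return [list(row) for row in zip(*cols)]   # back to a list of rows

def matmul(X, Y):
    """Standard row-by-column matrix product, for comparison."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

# Sample matrices (assumed for this example).
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -1]]

by_cols = product_by_columns(A, B)
print(by_cols, by_cols == matmul(A, B))  # [[10, -1], [20, -1]] True
```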

trambolin · Jul 22, 2010

Tobias Funke said:

Sure it's more fundamental, but the "obvious" definition to use is entrywise multiplication (apparently called the Hadamard product according to Wikipedia) and that's associative, distributive, commutative, and has an identity element. There must be another reason besides associativity to adopt the standard definition.

I guess it could be justified by saying that it leads to deep results later on, but at least there's some initial motivation behind it.

That's why I said you have to set a level to handle the problem. I did not mean that you should not bother about these details. Without understanding the seemingly trivial, you cannot deduce the more general structures. Because you simply don't know the connection between special cases.

Moreover, many definitions narrow things down to a specific collection, so a definition is only meaningful to understand if it filters something out. In this particular case, you can work out why other definitions of multiplication do not do the job.
