Elaboration on matrix operations.

Shawn Garsed
Hi everybody,

I'm currently studying matrices at an algebra 2 level. The problem is that right now the book just teaches what to do and how to do it, but it doesn't tell you why it has to be done that way (for instance: matrix multiplication, determinants and matrix inversion). Basically, it's not teaching the theory behind it.
My question is:
Is it possible to learn the theory behind matrices at an algebra 2 level already?
If so, is there a book or a website you can refer me to?

Thanks in advance,

Shawn Garsed
 
Shawn Garsed said:
but it's not telling why it has to be done that way (for instance: matrix multiplication, determinants and matrix inversion).

Here, you must not confuse two levels:

1. THE LEVEL OF DEFINITIONS
The operations you can "do" with one or two matrices are endless:
you may add them termwise, swap a few entries between them, or apply any other creative procedure.

That we choose to define some operations is, largely, arbitrary.

2. ON BASIS OF DEFINITIONS, WHAT CAN THE OPERATION "DO"?
Now, for example, with matrix multiplication, as defined in 1, we may PROVE a variety of properties.
For example, if A, B, C are matrices we generally have (as long as we can multiply two of them):
A*(B*C)=(A*B)*C
That is, matrix "multiplication" is an associative operation.

Furthermore, given a matrix sum operation defined as elementwise addition (only possible if the matrices are of equal size!), we also have distributivity of matrix "multiplication":
A*(B+C)=A*B+A*C

On the basis that we can PROVE these properties to hold, given our definition of matrix "multiplication", it is indeed defensible to attach the label "multiplication" to that operation, since those properties are generally present for "normal" multiplication of single numbers.

Note, however, that in general matrix multiplication is NOT commutative, i.e., we generally do NOT have the identity A*B=B*A.
In fact, one of the sides might not even be a permitted operation!
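These three claims are easy to check on concrete matrices. A minimal Python sketch (not from the thread; the helper names matmul and matadd, and the example matrices, are mine):

```python
def matmul(A, B):
    # Standard matrix product: entry (i, j) is the dot product of
    # row i of A with column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def matadd(A, B):
    # Elementwise (termwise) sum; only defined for equal-sized matrices.
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, 2]]
C = [[2, 0], [1, 3]]

print(matmul(A, matmul(B, C)) == matmul(matmul(A, B), C))            # associativity: True
print(matmul(A, matadd(B, C)) == matadd(matmul(A, B), matmul(A, C)))  # distributivity: True
print(matmul(A, B) == matmul(B, A))                                   # commutativity fails here: False
```

The first two equalities hold for any matrices of compatible sizes (that is what the proofs show); the last one fails for almost any pair you pick.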


Once you understand the difference between definitions, which are largely arbitrary and hence admit no deeper "why" answer, you can focus on those "why" answers that ARE deep, namely the proofs of the various properties that the operations in question generate.
 
Let me give you an example regarding matrix inversion from the book.

It says, in order to get the inverse from a matrix, you have to do this:

The inverse of the matrix
[a b]
[c d]
is
1/(ad-bc) * [ d -b]
            [-c  a]

It just gives me that formula and that's it, it doesn't say anything about how it's derived.

Hope that clears up my question.
 
Here's a start to the "why" for this matrix inverse formula.
Given a matrix A that is invertible (has an inverse A^(-1)), it must be true that AA^(-1) = I, where I is the identity matrix of the same order as A. (It must also be true that A^(-1)A = I.)

Using your 2 x 2 matrix, and assuming the constants a, b, c, and d are known, form the matrix equation below.
\left[ \begin{array} {cc} a & b \\ c & d\end{array}\right]\left[ \begin{array} {cc} e & f \\ g & h\end{array}\right] = \left[ \begin{array} {cc} 1 & 0 \\ 0 & 1\end{array}\right]
Expanding the matrix product on the left yields four equations in the four unknowns e, f, g, and h. Solving for those unknowns gives the coefficients of the inverse matrix.
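Carried out explicitly (using the same unknowns e, f, g, h as in the equation above), the expansion gives the system

```latex
\begin{aligned}
ae + bg &= 1, \qquad & af + bh &= 0,\\
ce + dg &= 0, \qquad & cf + dh &= 1,
\end{aligned}
```

and solving it, provided ad - bc ≠ 0, yields e = d/(ad-bc), f = -b/(ad-bc), g = -c/(ad-bc), h = a/(ad-bc), which is exactly the formula quoted from the book.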
 
Shawn Garsed said:
Let me give you an example regarding matrix inversion from the book.

It says, in order to get the inverse from a matrix, you have to do this:

The inverse of the matrix
[a b]
[c d]
is
1/(ad-bc) * [ d -b]
            [-c  a]

It just gives me that formula and that's it, it doesn't say anything about how it's derived.

Hope that clears up my question.
Okay!

Well, think of it this way:

You are to find a 2*2 matrix which is the general inverse of an arbitrary matrix, with elements (a,b,c,d).

Thus, you have 4 unknown quantities to find, namely those of the inverse, call them (x,y,z,w).

Now:
Can you construct a system of 4 equations that MUST hold for the 4 unknowns (x,y,z,w), knowing that, as the elements of a matrix, they are to form the multiplicative inverse of the matrix with elements (a,b,c,d)?
 
I understand how to find the inverse matrix the way you describe it, but how do you go from that to the formula I gave in my previous post?

It's hard to explain what I mean when I say I want to know the theory behind matrices, and what the book gives me is just not sufficient. For instance, I know how to calculate the determinant of a matrix, but what does it actually refer to? It's not like someone just started cross-multiplying and subtracting the elements of a 2x2 matrix for the fun of it and thought, hey, this could be useful. The same goes for multiplying matrices. There must be a reason why multiplying matrices isn't done in the same way as adding or subtracting them, i.e., multiplying each element in one matrix by the corresponding element in the other.

I hope you get the gist of what I'm saying, because it's difficult to explain.
 
Shawn Garsed said:
I understand how to find the inverse matrix the way you describe it, but how do you go from that to the formula I gave in my previous post?
Both arildno and I laid out how you would arrive at the formula you showed.
Shawn Garsed said:
It's hard to explain what I mean when I say I want to know the theory behind matrices, and what the book gives me is just not sufficient. For instance, I know how to calculate the determinant of a matrix, but what does it actually refer to?
That's a somewhat vague question, but one thing that it "determines" is whether a square matrix is invertible (det != 0) or noninvertible (det = 0). It might help to look at this Wiki article, http://en.wikipedia.org/wiki/Determinant, especially the History section near the bottom of the page.
Shawn Garsed said:
It's not like someone just started cross-multiplying and subtracting the elements of a 2x2 matrix for the fun of it and thought, hey, this could be useful. The same goes for multiplying matrices. There must be a reason why multiplying matrices isn't done in the same way as adding or subtracting them, i.e., multiplying each element in one matrix by the corresponding element in the other.
I don't know the history of matrix multiplication, but to understand how matrix multiplication works, you need to know something about vector multiplication, particularly the dot, or inner product. Each entry in a product matrix is the result of the dot product of a row vector in the first matrix with a column vector in the second matrix. For example, the (2, 3) element in a product matrix is the result of dotting the 2nd row of the first matrix with the 3rd column of the second matrix.
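That row-dot-column rule is short enough to check in code. A Python sketch (the matrices are invented for illustration; `dot` is my helper, not anything from the thread):

```python
def dot(u, v):
    # Dot (inner) product of two vectors of equal length.
    return sum(x * y for x, y in zip(u, v))

A = [[1, 2],
     [3, 4]]          # 2 x 2
B = [[5, 6, 7],
     [8, 9, 10]]      # 2 x 3

# Entry (2, 3) of A*B (1-based indices, as in the post):
# row 2 of A dotted with column 3 of B.
row2 = A[1]
col3 = [B[0][2], B[1][2]]
print(dot(row2, col3))  # 3*7 + 4*10 = 61

# The full product, assembled entry by entry from dot products.
AB = [[dot(A[i], [B[k][j] for k in range(2)]) for j in range(3)]
      for i in range(2)]
print(AB)  # [[21, 24, 27], [47, 54, 61]]
```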
Shawn Garsed said:
I hope you get the gist of what I'm saying cause it's difficult to explain.
 
Shawn Garsed said:
For instance, I know how to calculate the determinant of a matrix, but what does it actually refer to?
It refers to itself.
Historically, it was seen that the "determinant" of a system of linear equations was a number that regularly cropped up in the solution to that system.
This is elegantly expressed by Cramer's rule, for example.
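As a concrete illustration of the determinant "cropping up" in the solution, here is Cramer's rule for a 2x2 system in Python (a sketch; the function name and the sample coefficients are mine):

```python
def solve_2x2(a, b, c, d, p, q):
    # Solves the system  a*x + b*y = p,  c*x + d*y = q  by Cramer's rule.
    det = a * d - b * c          # determinant of the coefficient matrix
    if det == 0:
        raise ValueError("singular system: determinant is zero")
    x = (p * d - b * q) / det    # the numerators are determinants too
    y = (a * q - p * c) / det
    return x, y

# 2x + y = 5, x + 3y = 10  has solution x = 1, y = 3.
print(solve_2x2(2, 1, 1, 3, 5, 10))  # (1.0, 3.0)
```

The same number ad - bc divides every unknown, and the system is solvable by this formula exactly when it is nonzero, which is what "determinant" determines.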

Shawn Garsed said:
Same thing goes for multiplying matrices. There must be a reason why multiplying matrices isn't done in the same way as adding or subtracting matrices, multiplying each element in one matrix with the corresponding element in the other.
I have already told you that by defining matrix multiplication in that particular way, some salient properties of standard multiplication will also be present in matrix multiplication (associativity and distributivity), and that this presence is provable.
 
Agreeing with the rest of the posts, let me give you another view

You have to set a threshold level for yourself on accepting things. Otherwise you have to get into more complicated mathematical structures, e.g. groups, rings, etc., before you master the details of simpler concepts such as the matrix inverse or the multiplication operation defined on vector spaces.

Hence you have to take some, though not all, of the operations defined on these mathematical objects for granted. You might want to look at the equation Ax = b, where A is a matrix and x, b are vectors. There are already many questions to be asked, but set a level for yourself such that the symbols you have written are believable. Then you should start building on it. For example, matrices are somewhat natural objects, as I have pointed out many times on this forum (see http://acharya.iitm.ac.in/mirrors/vv/vidya/emathist.html ). Hence we might agree on writing simultaneous equations in matrix form. You should also notice that there are some historical conventions, accepted worldwide, for writing a matrix equation.

Now the next thing might be wondering what we should do when we have the solution at hand, say x = c. If we insist on the matrix form, then we have to find a matrix A that keeps the equation compatible with our understanding. Then we might end up with the identity matrix, i.e. Ix = c. As the name implies, this is the identity element for our operation, namely matrix multiplication.


Again, persisting with the idea and banging our heads against the wall, we might come up with the idea of a left inverse, and so on. Some elements do not have an inverse. Why should that be? And so on...

Of course, historically things were not as hygienic as I put them here, but can you see a pattern? We need some operations, and we need other objects that we will frequently use in these operations. Hence you might wish to set a level for your understanding and force yourself to stay at that level, unless you really feel the presence of a more general concept, in this case abstract algebra or group theory, waiting in line.

Also have a look at
https://www.physicsforums.com/showthread.php?t=416051

Hope it helps.
 
  • #10
Matrix multiplication is defined the way it is so that it corresponds to the composition of linear transformations. For simplicity, take a vector v=(x y) and 2x2 matrices A and B. Calculate Bv and then A(Bv). Now calculate (AB)v. That's why the seemingly arbitrary definition is used, and it's not hard to see that it works in the general case.
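The suggested exercise can be checked numerically. A small Python sketch (the matrices, the vector, and the helpers are mine, chosen arbitrarily):

```python
def matvec(M, v):
    # Apply matrix M to vector v.
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def matmul(A, B):
    # Standard matrix product.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
v = [5, 7]

print(matvec(A, matvec(B, v)))   # A(Bv):  [31, 69]
print(matvec(matmul(A, B), v))   # (AB)v:  [31, 69], the same vector
```

Applying B and then A always lands on the same vector as applying the single matrix AB; that is exactly the correspondence with composition.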

Some of the responses seem to suggest that you shouldn't think about why a certain definition is used, just use it. I don't agree and I think you should continue to ask this kind of question.
 
  • #11
Tobias Funke said:
Matrix multiplication is defined the way it is so that it corresponds to the composition of linear transformations. For simplicity, take a vector v=(x y) and 2x2 matrices A and B. Calculate Bv and then A(Bv). Now calculate (AB)v. That's why the seemingly arbitrary definition is used, and it's not hard to see that it works in the general case.
This is based on the property of associativity, which is a much more fundamental mathematical property than "composition of linear transformations".
 
  • #12
Sure it's more fundamental, but the "obvious" definition to use is entrywise multiplication (apparently called the Hadamard product according to Wikipedia) and that's associative, distributive, commutative, and has an identity element. There must be another reason besides associativity to adopt the standard definition.

I guess it could be justified by saying that it leads to deep results later on, but at least there's some initial motivation behind it.
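For comparison, the entrywise (Hadamard) product mentioned above is also easy to play with in Python (a sketch; the names and example matrices are mine):

```python
def hadamard(A, B):
    # Entrywise (Hadamard) product; requires equal-sized matrices.
    return [[A[i][j] * B[i][j] for j in range(len(A[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
J = [[1, 1], [1, 1]]   # all-ones matrix: the identity for this product

print(hadamard(A, B) == hadamard(B, A))  # commutative: True
print(hadamard(A, J) == A)               # identity element: True
```

So the Hadamard product does have pleasant algebraic properties; what it lacks is the connection to composing linear maps, which is the extra reason for the standard definition.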
 
  • #13
arildno said:
This is based on the property of associativity, which is a much more fundamental mathematical property than "composition of linear transformations".

The fact that composing linear transformations is associative is why multiplying matrices is associative.

All matrix multiplication/addition rules are based on the fact that matrices are supposed to be linear transformations. Furthermore, a matrix that has columns v_1,...,v_m sends the canonical basis elements e_1,...,e_m to v_1,...,v_m respectively (e_i has a 1 in the ith position and 0s everywhere else).

Then you can use linearity to find the value of any other vector when the matrix is applied. For example, the vector (3,2) can be written as 3*(1,0)+2*(0,1)=3*e1+2*e2. So if we applied a matrix with columns v1=(1,4,2) and v2=(5,2,-1) to (3,2), we know that the result should be 3*(1,4,2)+2*(5,2,-1). Matrix multiplication is just a way of encoding the rules, in terms of the entries of matrices and vectors, that make sure this happens.
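That column picture can be checked directly, using the vectors v1=(1,4,2) and v2=(5,2,-1) from the post (the `matvec` helper is mine):

```python
v1 = [1, 4, 2]
v2 = [5, 2, -1]
# 3x2 matrix whose columns are v1 and v2.
M = [[v1[i], v2[i]] for i in range(3)]

def matvec(M, v):
    # Apply matrix M to vector v.
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

x = [3, 2]
print(matvec(M, x))                             # [13, 16, 4]
print([3 * a + 2 * b for a, b in zip(v1, v2)])  # 3*v1 + 2*v2: also [13, 16, 4]
```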
 
  • #14
Another justification for the standard product is pretty simple. If you've defined matrix-vector multiplication already (maybe when studying systems of equations), then the entrywise product usually won't work in this setting since both matrices have to be the same size.

Then defining AB=(Ab_1, Ab_2, ... , Ab_n), where the b_i's are the columns of B seems pretty natural. Of course it eventually all ties in with composition, but a priori it has nothing to do with it.
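That column-by-column definition is easy to verify against the usual row-times-column rule. A Python sketch (helper names and example matrices are mine):

```python
def matvec(A, v):
    # Apply matrix A to vector v.
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

def matmul_by_columns(A, B):
    # AB defined as the matrix with columns A*b_1, ..., A*b_n,
    # where b_j is the jth column of B.
    n = len(B[0])
    cols = [matvec(A, [B[k][j] for k in range(len(B))]) for j in range(n)]
    # Reassemble the columns into a matrix.
    return [[cols[j][i] for j in range(n)] for i in range(len(A))]

def matmul(A, B):
    # The usual row-times-column definition, for comparison.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_by_columns(A, B) == matmul(A, B))  # True
```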
 
  • #15
Tobias Funke said:
Sure it's more fundamental, but the "obvious" definition to use is entrywise multiplication (apparently called the Hadamard product according to Wikipedia) and that's associative, distributive, commutative, and has an identity element. There must be another reason besides associativity to adopt the standard definition.

I guess it could be justified by saying that it leads to deep results later on, but at least there's some initial motivation behind it.

That's why I said you have to set a level at which to handle the problem. I did not mean that you should not bother with these details. Without understanding the seemingly trivial, you cannot deduce the more general structures, because you simply don't know the connections between the special cases.

Moreover, many definitions narrow down a specific collection; hence a definition is only meaningful if it filters something out. In this particular case, you can work out why other definitions of multiplication do not do the job.
 
