# Concept of Matrix Multiplication

1. Sep 13, 2010

### abiyo

Here is my question. (It is not a homework problem.)

What does it mean to multiply two matrices? I know how to do the operation, but what is the concept behind it? I understand dot products and vectors. Matrix multiplication can be seen as taking the dot product of each row vector of matrix A with each column vector of matrix B. But why did we pick such an arrangement?

Thanks a lot. Could anyone recommend a good algebra/linear algebra book that has a conceptual flavor as opposed to crunching numbers?

Peace
Abiyo

2. Sep 13, 2010

### Matthollyw00d

If you're asking where the operation comes from, I believe it was needed to make $\mathbb{M}^{m \times n}$ a vector space, and to easily represent the operations of linear transformations. There are other ways to define it, though I believe we use this one by convention.

An amazing text for linear algebra would be Finite-Dimensional Vector Spaces by Paul R. Halmos.

3. Sep 13, 2010

### Tac-Tics

There are many, many ways of interpreting matrices. One way is to start off with linear functions.

A linear function f has the properties that a*f(u) + b*f(v) = f(au + bv) for all real numbers a, b and vectors u, v.

Linear functions play very nicely with bases. Let e1, e2, ..., en be a basis for your vector space. Then every vector v can be written in terms of that basis: v = v1 e1 + v2 e2 + ... + vn en, where the vi are the coordinates of v in that basis.

Now, taking that linear function, f(v) = f(v1 e1 + v2 e2 + ... + vn en) = v1 f(e1) + v2 f(e2) + ... + vn f(en). So, if we know what f does to each vector in the basis, we know what f does to every vector in the entire vector space.

So, to this point, we haven't introduced any matrices, yet, but we're really close. Given a basis, we can perfectly summarize any linear function with just n vectors: f(e1), f(e2), ..., f(en). These will correspond to the columns of a matrix.
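The idea that a linear function is pinned down by what it does to a basis can be checked numerically. The following is only a minimal sketch: the function `f`, the helper `apply_columns`, and the numbers are made up for illustration and are not from the thread. It builds the "columns" f(e1), f(e2) for the standard basis of R^2 and confirms that combining them with the coordinates of v reproduces f(v).

```python
def f(v):
    """A hypothetical linear function on R^2, chosen only for illustration."""
    x, y = v
    return [2 * x + y, x - 3 * y]

# Standard basis of R^2.
e1, e2 = [1, 0], [0, 1]

# The images of the basis vectors; these play the role of the matrix columns.
columns = [f(e1), f(e2)]

def apply_columns(columns, v):
    """Return v1*f(e1) + v2*f(e2) + ..., using only the basis images."""
    n_rows = len(columns[0])
    return [sum(v[j] * columns[j][i] for j in range(len(columns)))
            for i in range(n_rows)]

v = [4, 5]
direct = f(v)                            # computed from the formula for f
from_basis = apply_columns(columns, v)   # computed from f(e1) and f(e2) only
print(direct, from_basis)
```

Both results agree, which is exactly the statement f(v) = v1 f(e1) + v2 f(e2).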

From there, I don't really have the time to spell it out, but it's very closely related to the fact that the coordinates of a vector are determined by the inner product. In fact, with an orthonormal basis, v = (v . e1) e1 + (v . e2) e2 + ... + (v . en) en. Then, matrix multiplication is just "plumbing" to make sure you are taking the inner products of the right vectors and adding them appropriately.

4. Sep 13, 2010

### Landau

I think the only clean explanation is using linear maps. Given are three finite-dimensional vector spaces V,W,X of dimension n,m,p, respectively, and two linear maps T:V->W and S:W->X. The linear maps can of course be composed to give the linear map ST:V->X.

Now take bases for V,W,X. Then we have two matrices A and B representing S and T, respectively, w.r.t. the corresponding bases: A is p-by-m, B is m-by-n.

The matrix product AB is the p-by-n matrix representing the linear map ST w.r.t. the chosen bases. Using the description by Tac-Tics you can work out that this indeed agrees with usual matrix-multiplication.

Now we have the immediate corollary that matrix multiplication is associative (because function composition is), and similar properties.
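As a rough illustration of the composition view, here is a small sketch in plain Python. The matrices `A`, `B`, `C`, the vector `v`, and the helpers `matvec` and `matmul` are hypothetical choices for this example, not taken from the posts above. It checks that applying B and then A to a vector gives the same result as applying the single matrix AB, and that the product is associative on this example.

```python
def matvec(M, v):
    """Apply the matrix M (a list of rows) to the vector v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def matmul(P, Q):
    """Entry (i, j) of PQ is the dot product of row i of P with column j of Q."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

# Hypothetical matrices: B represents T: R^3 -> R^2, A represents S: R^2 -> R^2.
A = [[1, 2],
     [0, 1]]
B = [[1, 0, 2],
     [3, 1, 0]]
C = [[1],
     [0],
     [2]]

v = [1, 2, 3]

# Applying T and then S ...
step_by_step = matvec(A, matvec(B, v))
# ... agrees with applying the single matrix AB.
composed = matvec(matmul(A, B), v)
print(step_by_step, composed)

# Associativity on this example: (AB)C equals A(BC).
left = matmul(matmul(A, B), C)
right = matmul(A, matmul(B, C))
print(left, right)
```

A single example does not prove associativity, of course; the proof is the one-line argument that function composition is associative.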

5. Sep 13, 2010

### Fredrik

See this post for an explanation of the relationship between linear operators and matrices.

Suppose that A and B are linear operators on a vector space V, and that {ei} is a basis for V. We want to show that the ij component of $A\circ B$ is given by matrix multiplication. In the notation used in the post I linked to above (with a sum over the repeated index k implied), the proof is

$$(A\circ B)_{ij}=((A\circ B)e_j)_i=(A(Be_j))_i=(A((Be_j)_k e_k))_i=(A(B_{kj}e_k))_i=(Ae_k)_i B_{kj}=A_{ik}B_{kj}$$
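
The last expression leaves the sum over the repeated index $k$ implicit. As a sketch with the sum written out, assuming $V$ has dimension $n$ (a symbol not used in the post itself):

$$(A\circ B)_{ij}=\sum_{k=1}^{n} A_{ik}B_{kj}, \qquad \text{e.g. for } n=2:\quad (A\circ B)_{12}=A_{11}B_{12}+A_{12}B_{22}.$$

This is the familiar "row i of A times column j of B" rule.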

My favorite linear algebra book is "Linear Algebra Done Right" by Sheldon Axler.

Last edited: Sep 13, 2010

6. Sep 14, 2010

### Mårten

Hi abiyo!

That's a really interesting question! I've been thinking about it for quite a while as well. Why is matrix multiplication done the way it is? I mean, if we have matrices A and B and multiply AB, why do we multiply the rows of A with the columns of B? Why not the rows of A with the rows of B? Or why not just multiply A and B element by element and skip the adding step you do when you multiply two matrices?

I cannot give you a comprehensive answer, but it seems that when we define multiplication of matrices the way we do, there are many applications out there that actually could make use of such a definition.

I'll try with an example from economics. Suppose you have the following matrix,

$$A = \begin{bmatrix} 3 & 1 \\ 5 & 2 \end{bmatrix}.$$

This matrix shall be interpreted as follows: The first column describes how many units of copper (first row) and plastics (second row) are needed to produce one unit of memory chip. The second column describes how many units of copper and plastics are needed to produce one unit of cpu chip. So, you need 3 units of copper and 5 units of plastics to produce one unit of memory chip, and so on. (I just made these figures up, don't take them too seriously.)

Now, we have a second matrix,

$$B = \begin{bmatrix} 9 & 8 \\ 7 & 6 \end{bmatrix}.$$

The first column in this matrix describes how many units of memory chips (first row) and cpu chips (second row) you need to produce a computer. The second column describes how many units of memory and cpu chips you need to produce a radio. So, you need 9 units of memory chips and 7 units of cpu chips to produce a computer, and so on.

So, let us now buy some computers and radios, let's say we need 2 computers and 3 radios. We describe this with the vector

$$X = \begin{bmatrix} 2 \\ 3 \end{bmatrix}.$$

So how many memory chips and cpu chips must the chip factory produce to fulfill your needs of 2 computers and 3 radios? Easy, it must be BX, since

$$BX = \begin{bmatrix} 9*2 + 8*3 \\ 7*2 + 6*3 \end{bmatrix} = \begin{bmatrix} 42 \\ 32 \end{bmatrix}.$$

(Remember that the left column in B was the amount of memory and cpu you needed to produce 1 computer, and since you now need 2 computers, you need twice as much memory and cpu.)

In the same manner, to calculate how much raw material (copper and plastics) is needed for the amount of memory and cpu chips (described by BX) that your computers and radios need, you take A times BX. That's because BX is a column vector that describes how many memory and cpu chips you need, and A describes how much copper and plastics are needed per memory chip and per cpu chip. So all in all you get the raw material needed for your computers and radios as

$$ABX = \begin{bmatrix} 3*42 + 1*32 \\ 5*42 + 2*32 \end{bmatrix} = \begin{bmatrix} 158 \\ 274 \end{bmatrix},$$

where the first row describes the amount of copper needed, and the second row the amount of plastics needed.

To conclude, and perhaps finally answer your question, we could as well look at AB directly, and see that it describes the raw material of copper and plastics needed per computer (left column) or per radio (right column). You see that if you do the multiplication:

$$AB = \begin{bmatrix} 3*9 + 1*7 & 3*8 + 1*6 \\ 5*9 + 2*7 & 5*8 + 2*6\end{bmatrix} = \begin{bmatrix} 34 & 30 \\ 59 & 52\end{bmatrix}.$$

So, in a way, you can say that you go from A (raw materials) -> B (intermediate materials) -> X (finished products). The matrix AB produced by the matrix multiplication sort of puts together the first two steps in that chain.
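The figures in this example can be double-checked with a few lines of plain Python. This is just a sketch: the helper `matmul` is written here for illustration, and the column vector X is stored as a 2x1 matrix. It confirms that A(BX) and (AB)X agree, and, as a contrast, that multiplying A and B element by element does not give the raw-material totals.

```python
def matmul(P, Q):
    """Entry (i, j) of PQ is the dot product of row i of P with column j of Q."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

A = [[3, 1],
     [5, 2]]   # raw materials per chip
B = [[9, 8],
     [7, 6]]   # chips per product
X = [[2],
     [3]]      # 2 computers and 3 radios, as a 2x1 column

BX = matmul(B, X)          # chips needed
A_BX = matmul(A, BX)       # raw materials, computed in two steps
AB = matmul(A, B)          # raw materials per product
AB_X = matmul(AB, X)       # raw materials, computed via AB
print(BX, AB, A_BX, AB_X)

# For contrast: the element-by-element product of A and B.
elementwise = [[A[i][j] * B[i][j] for j in range(2)] for i in range(2)]
print(matmul(elementwise, X))   # differs from A_BX
```

The two-step and one-step routes give the same totals, while the element-by-element product gives different numbers, so it does not describe the chain from raw materials to finished products.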

I hope I got all the figures right, and that this will help you somewhat in your understanding.

This example was inspired by an example from a rather good textbook in linear algebra I had, but it's in Swedish, so I guess you don't have any use for it...

EDIT: One more thing: the matrix multiplication done here is actually just a generalization of ordinary multiplication from one to several dimensions (in this case to two dimensions, with 2x2 matrices). Imagine instead that we have a computer that only needs cpu chips and nothing more, and that cpu chips in turn only need copper to be produced. Then, if you need 1 unit of copper for every cpu chip (A), and 7 cpu chips for every computer (B), and you'd like 2 computers (X), the total amount of copper needed for these two computers is A*B*X = 1*7*2 = 14. So in this 1-dimensional case, it is very clear that it is multiplication we have to do. And for dimensions over 1, it seems that matrix multiplication gives us just the information we want, as my 2-dimensional example above shows.

Last edited: Sep 14, 2010

7. Sep 14, 2010

### abiyo

Thanks everyone for being so understanding and clarifying things. I will look through and try to understand it. If I fail, I will ask back.

Remember that for matrices, $AB \neq BA$ most of the time...
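
As a quick check of this reminder, the sketch below multiplies the two 2x2 matrices from the copper/plastics example in both orders. The helper `matmul` is written here only for illustration.

```python
def matmul(P, Q):
    """Entry (i, j) of PQ is the dot product of row i of P with column j of Q."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

A = [[3, 1],
     [5, 2]]
B = [[9, 8],
     [7, 6]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB)
print(BA)
print(AB == BA)
```

The two products differ, which fits the composition picture: doing B first and then A is generally not the same as doing A first and then B.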