Associativity of Matrix multiplication

SUMMARY

The discussion centers on the associativity of matrix multiplication and when the order of summation in a double sum can be interchanged. It establishes that if all terms in a series are non-negative and have a finite sum, the order of summation can be switched without affecting the result, and that for terms of mixed sign the rearrangement is safe under absolute convergence. It also recalls the definition of a linear map and clarifies that matrix multiplication corresponds to the composition of linear maps, from which the associative property follows.

PREREQUISITES
  • Understanding of matrix multiplication and its properties
  • Familiarity with linear maps and their definitions
  • Knowledge of summation and convergence concepts in mathematics
  • Basic understanding of vector spaces and their operations
NEXT STEPS
  • Study the concept of absolute convergence in series and its implications
  • Explore the properties of linear maps and their applications in linear algebra
  • Learn about the interchangeability of integrals and sums in analysis
  • Investigate the relationship between matrix multiplication and linear transformations
USEFUL FOR

Mathematicians, students of linear algebra, and anyone interested in the theoretical foundations of matrix operations and summation techniques.

Buffu
##\begin{align}[A(BC)]_{ij} &= \sum_r A_{ir}(BC)_{rj} \\ &= \sum_r A_{ir} \sum_s B_{rs}C_{sj}\\ &= \sum_r\sum_s A_{ir}B_{rs}C_{sj}\\ &= \sum_{s} (\sum_{r} A_{ir} B_{rs}) C_{sj} \\ &= [(AB) C]_{ij}\end{align}##

How did it go from line ##2## to line ##3##? In general, is there a proof that sums can be switched like this?
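The identity being asked about can be spot-checked numerically before unpacking the steps. A minimal plain-Python sketch (the concrete matrices are arbitrary examples, and `matmul` is a helper written just for this check):

```python
# Spot check that A(BC) = (AB)C for small concrete matrices.
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][r] * Y[r][j] for r in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[1, 2, 3],
     [4, 5, 6]]
B = [[1, 0],
     [2, 1],
     [0, 3]]
C = [[2, 1],
     [1, 2]]

left = matmul(A, matmul(B, C))   # A(BC)
right = matmul(matmul(A, B), C)  # (AB)C
print(left == right)  # True
```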
 
You can look at the subscripts and see that ##A_{ir}## is a constant multiplier in ##A_{ir}B_{rs}C_{sj}##. So it can be distributed to multiply the individual terms in the summation.
 
FactChecker said:
You can look at the subscripts and see that ##A_{ir}## is a constant multiplier in ##A_{ir}B_{rs}C_{sj}##. So it can be distributed to multiply the individual terms in the summation.

And what about switching the sums, from step ##3## to step ##4##?
 
Consider ##\sum_r\sum_s##. You can individually list the terms that are summed so that they form a two-dimensional matrix with a separate row for ##r=1, 2, 3,\dots## and a column in each row for ##s=1, 2, 3, \dots##. Then clearly you can change the order of summation so that you sum the column ##s=1## over all the rows ##r=1, 2, 3,\dots##, then sum the column ##s=2## over all rows ##r=1, 2, 3, \dots##. That is just the summation in the order ##\sum_s\sum_r##.
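The row/column picture above is easy to check for finite sums. A short sketch (the term `f(r, s)` and the index ranges are arbitrary illustrative choices):

```python
# Finite double sums: summing over r first or over s first gives the same total.
f = lambda r, s: (r + 1) * (s + 2)  # an arbitrary example term f(r, s)

rows_first = sum(sum(f(r, s) for s in range(5)) for r in range(4))  # sum_r sum_s
cols_first = sum(sum(f(r, s) for r in range(4)) for s in range(5))  # sum_s sum_r
print(rows_first == cols_first)  # True
```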
 
All your questions seem to me to lend themselves more smoothly to abstract approaches rather than computational ones. Indeed these (to me) tedious computations are a good argument in favor of doing things the abstract way. You seem to be impressively skilled with them, however, and you may just prefer these hands-on concrete calculations.
 
FactChecker said:
Consider ##\sum_r\sum_s##. You can individually list the terms that are summed so that they form a two-dimensional matrix with a separate row for ##r=1, 2, 3,\dots## and a column in each row for ##s=1, 2, 3, \dots##. Then clearly you can change the order of summation so that you sum the column ##s=1## over all the rows ##r=1, 2, 3,\dots##, then sum the column ##s=2## over all rows ##r=1, 2, 3, \dots##. That is just the summation in the order ##\sum_s\sum_r##.
mathwonk said:
All your questions seem to me to lend themselves more smoothly to abstract approaches rather than computational ones. Indeed these (to me) tedious computations are a good argument in favor of doing things the abstract way. You seem to be impressively skilled with them, however, and you may just prefer these hands-on concrete calculations.
Yes, I want to know whether in general it is true that I can switch the order of sums without changing the answer: is ##\sum_r \sum_s f(r, s) = \sum_s\sum_r f(r,s)##? Is this also true for infinite sums, like integration?
 
Buffu said:
Is this also true for infinite sums, like integration?
If the terms are all positive and they have a finite sum, then you can. Furthermore, if the terms are not all positive, but the sum of their absolute values is finite, then you can. That is called "absolute convergence".

On the other hand, if the sum of the absolute values is not finite, then you cannot be sure. It may converge when the summation is in a particular order, but not in a different order. That is called "conditional convergence".
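The classic illustration of conditional convergence is the alternating harmonic series: summed in the standard order it converges to ##\ln 2##, but rearranged to take one positive term followed by two negative terms it converges to ##\tfrac{1}{2}\ln 2##. A numerical sketch (the partial-sum length `N` is an arbitrary choice):

```python
# Conditional convergence: rearranging the alternating harmonic series
# changes its sum.
import math

N = 200_000  # number of terms in each partial sum

# Standard order: 1 - 1/2 + 1/3 - 1/4 + ...  ->  ln 2
standard = sum((-1) ** (k + 1) / k for k in range(1, N + 1))

# Rearranged: 1 - 1/2 - 1/4 + 1/3 - 1/6 - 1/8 + ...  ->  (ln 2)/2
rearranged = sum(1 / (2 * m - 1) - 1 / (4 * m - 2) - 1 / (4 * m)
                 for m in range(1, N // 2 + 1))

print(round(standard, 3), round(math.log(2), 3))        # both ~0.693
print(round(rearranged, 3), round(math.log(2) / 2, 3))  # both ~0.347
```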
 
FactChecker said:
If the terms are all positive and they have a finite sum, then you can.

I will go out on a limb and amend this to say: if all terms in your series are real valued and non-negative, you can always interchange the "summations". If the series converges before the interchange, it does so after. And if the series does not converge before the interchange, it does not converge afterward.

The thing you don't want to do in standard analysis is re-arrange and partition some infinite sum and get that half of it is ##\infty##, the other half is ##-\infty##, then add them together and say it is equal to ##0##. You are supposed to say things do not converge, and move on.

More generally, you can interchange any linear operators -- things like ##\int##, ##\sum##, ##E[\,\cdot\,]##, ##\operatorname{trace}(\cdot)## (where ##E## is the expectation operator) -- so long as you have first dealt with convergence issues.
 
I would just check that multiplication of matrices corresponds to composition of linear maps, and thus the operation is associative, since that fact is easy for maps.
 
mathwonk said:
I would just check that multiplication of matrices corresponds to composition of linear maps, and thus the operation is associative, since that fact is easy for maps.

Since everyone is speaking of linear maps: is a linear map just a function from one field to another?
 
Buffu said:
Since everyone is speaking of linear maps: is a linear map just a function from one field to another?
You would also need the linearity properties of the map: ##M(aX+bY) = aM(X)+bM(Y)##.
 
A map ##f## is linear if ##f(av+bw) = af(v) + bf(w)## for scalars ##a,b## and vectors ##v,w##. Since matrix multiplication obeys ##M(av+bw) = aMv + bMw##, it is a linear map. But composition is associative for all maps, linear or not. The point is you only need to show associativity for multiplication by vectors, i.e. for matrices ##M,N## and vectors ##v##, that ##(MN)v = M(Nv)##. It then follows that ##(MN)P = M(NP)## for all matrices ##M,N,P##.
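This reduction can be sketched in a few lines: checking ##(MN)v = M(Nv)## on the standard basis vectors is exactly checking the product ##MN## column by column. A small concrete check (the matrices are arbitrary examples, and `matvec` is a helper written just for this sketch):

```python
# If (MN)v == M(Nv) for every basis vector v, then MN = M·N column by column,
# and associativity for matrix products follows.
def matvec(M, v):
    """Apply matrix M (list of rows) to column vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

M = [[1, 2], [3, 4]]
N = [[0, 1], [1, 1]]
MN = [[2, 3], [4, 7]]  # the product M·N, computed by hand

for v in ([1, 0], [0, 1]):  # checking on the standard basis suffices
    print(matvec(MN, v) == matvec(M, matvec(N, v)))  # True, True
```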
 
Buffu said:
Since everyone is speaking of linear maps: is a linear map just a function from one field to another?
You ultimately need a space in which adding vectors and scaling them makes sense, which is not the case in an arbitrary space. Linear maps are usually defined on vector spaces, where the vectors form a group under addition, but they can also be defined over rings, etc.
 
Multiplication by an ##n \times m## matrix of scalars from a field ##k## sends column vectors of length ##m## to column vectors of length ##n##, hence defines a function ##k^m \to k^n##. Since addition of such column vectors is defined, as is multiplication of them by scalars, the requirements spelled out by WWGD in the previous post are satisfied, so linearity makes sense. And since matrix multiplication obeys the laws mentioned above, this function is indeed a linear function, or linear "map", to distinguish it from usual functions whose inputs and outputs are numbers.
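A short sketch of this viewpoint: the matrix is the data, applying it to a column vector is the function ##k^m \to k^n##, and that function satisfies the linearity law quoted earlier in the thread (the matrix, vectors, and scalars below are arbitrary examples, and `apply` is a helper written just for this sketch):

```python
# An n×m matrix as a function from R^3 to R^2, with a check of
# M(a·v + b·w) == a·M(v) + b·M(w) on concrete inputs.
def apply(M, v):
    """The linear map v -> M v, with M a list of rows."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

M = [[1, 0, 2],
     [0, 3, 1]]  # a 2×3 matrix: maps length-3 vectors to length-2 vectors
v, w = [1, 2, 3], [0, 1, 1]
a, b = 2, -1

lhs = apply(M, [a * x + b * y for x, y in zip(v, w)])            # M(av + bw)
rhs = [a * p + b * q for p, q in zip(apply(M, v), apply(M, w))]  # aMv + bMw
print(lhs == rhs)  # True
```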
 
mathwonk said:
Multiplication by an ##n \times m## matrix of scalars from a field ##k## sends column vectors of length ##m## to column vectors of length ##n##, hence defines a function ##k^m \to k^n##. Since addition of such column vectors is defined, as is multiplication of them by scalars, and since matrix multiplication obeys the laws mentioned above, this function is called a linear function, or linear "map", to distinguish it from usual functions whose inputs and outputs are numbers.
Yes, but I meant that those vectors must belong to a space where their sum is defined. You cannot, e.g., define a linear map on a discrete topological space without an additional definition of sum and scaling, at least not AFAIK.
 
Yes, I was trying to address your requirement in my second sentence (addressed to Buffu): "Since addition of such column vectors is defined and also multiplication of them by scalars". I have edited my post to try to make this clearer. Thanks!
 
