Associativity of Matrix Multiplication


Discussion Overview

The discussion centers on the associativity of matrix multiplication, exploring the mathematical properties of sums and the nature of linear maps. Participants examine the steps in a proof of associativity, the conditions under which sums can be interchanged, and the implications for linear transformations.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • Some participants present a proof of associativity using summation notation and question the validity of switching the order of sums.
  • Others argue that constants can be factored out of summations, allowing for the interchange of summation order.
  • A participant suggests that abstract approaches may be more effective than computational ones in understanding these concepts.
  • There is a discussion about the conditions under which sums can be interchanged, particularly regarding absolute and conditional convergence in infinite series.
  • Some participants assert that matrix multiplication corresponds to the composition of linear maps, which is inherently associative.
  • Questions arise regarding the definition of linear maps and the necessary properties for a function to be considered linear.
  • Clarifications are made about the requirements for defining linear maps in various mathematical contexts, including vector spaces and fields.

Areas of Agreement / Disagreement

Participants express differing views on the conditions for interchanging sums and the nature of linear maps. While some points are clarified, no consensus is reached on the broader implications of these mathematical properties.

Contextual Notes

Limitations include the dependence on specific definitions of convergence and linearity, as well as the unresolved nature of some mathematical steps discussed in the thread.

Buffu
##\begin{align}[A(BC)]_{ij} &= \sum_r A_{ir}(BC)_{rj} \\ &= \sum_r A_{ir} \sum_s B_{rs}C_{sj}\\ &= \sum_r\sum_s A_{ir}B_{rs}C_{sj}\\ &= \sum_{s} (\sum_{r} A_{ir} B_{rs}) C_{sj} \\ &= [(AB) C]_{ij}\end{align}##

How did it go from step ##2## to step ##3##? In general, is there a proof that sums can be switched like this?
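For a concrete sanity check (not a proof, just a sketch with arbitrary small random matrices), the two sides of the derivation above can be compared entry by entry using the plain summation definition of matrix multiplication:

```python
import random

# Hypothetical 3x3 random matrices, purely for a numerical spot check.
n = 3
A = [[random.random() for _ in range(n)] for _ in range(n)]
B = [[random.random() for _ in range(n)] for _ in range(n)]
C = [[random.random() for _ in range(n)] for _ in range(n)]

def matmul(X, Y):
    """Plain definition: [XY]_{ij} = sum_r X_{ir} Y_{rj}."""
    return [[sum(X[i][r] * Y[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

left = matmul(A, matmul(B, C))    # [A(BC)]_{ij}
right = matmul(matmul(A, B), C)   # [(AB)C]_{ij}

# The two triple sums agree entry by entry (up to floating-point rounding).
assert all(abs(left[i][j] - right[i][j]) < 1e-9
           for i in range(n) for j in range(n))
```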
 
You can look at the subscripts and see that ##A_{ir}## is a constant multiplier in ##A_{ir}\sum_s B_{rs}C_{sj}##. So it can be distributed to multiply the individual terms in the summation.
 
FactChecker said:
You can look at the subscripts and see that ##A_{ir}## is a constant multiplier in ##A_{ir}\sum_s B_{rs}C_{sj}##. So it can be distributed to multiply the individual terms in the summation.

And what about switching sums ? Step 3 to 4 ?
 
Consider ##\sum_r\sum_s##. You can list the individual terms being summed in a two-dimensional array, with a row for each ##r=1, 2, 3,\dots## and a column for each ##s=1, 2, 3,\dots##. Then clearly you can change the order of summation: sum column ##s=1## over all rows ##r=1, 2, 3,\dots##, then sum column ##s=2## over all rows, and so on. That is just the summation in the order ##\sum_s\sum_r##.
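This row-versus-column picture is easy to try out numerically. A minimal sketch, with an arbitrary choice of terms ##f(r,s)## and grid size (both hypothetical, just for illustration):

```python
# Finite grid: tabulate the terms f(r, s) and sum by rows or by columns.
f = lambda r, s: (r + 1) * (s + 2)  # arbitrary example terms
R, S = 4, 5

by_rows = sum(sum(f(r, s) for s in range(S)) for r in range(R))  # Σ_r Σ_s
by_cols = sum(sum(f(r, s) for r in range(R)) for s in range(S))  # Σ_s Σ_r

assert by_rows == by_cols  # finitely many terms: the order never matters
print(by_rows)  # 200
```

With finitely many terms this always works; the subtlety discussed below only appears for infinite sums.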
 
all your questions seem to me to lend themselves more smoothly to abstract approaches rather than computational ones. indeed these (to me) tedious computations are a good argument in favor of doing things the abstract way. you seem to be impressively skilled with them however, and you may just prefer these hands on concrete calculations.
 
FactChecker said:
Consider ##\sum_r\sum_s##. You can list the individual terms being summed in a two-dimensional array, with a row for each ##r=1, 2, 3,\dots## and a column for each ##s=1, 2, 3,\dots##. Then clearly you can change the order of summation: sum column ##s=1## over all rows ##r=1, 2, 3,\dots##, then sum column ##s=2## over all rows, and so on. That is just the summation in the order ##\sum_s\sum_r##.
mathwonk said:
all your questions seem to me to lend themselves more smoothly to abstract approaches rather than computational ones. indeed these (to me) tedious computations are a good argument in favor of doing things the abstract way. you seem to be impressively skilled with them however, and you may just prefer these hands on concrete calculations.
Yes, I want to know whether it is true in general that I can switch the order of sums without changing the answer: ##\sum_r \sum_s f(r, s) = \sum_s\sum_r f(r,s)##? Is this also true for infinite sums, and for integrals?
 
Buffu said:
Is this also true for infinite sums, and for integrals?
If the terms are all positive and they have a finite sum, then you can. Furthermore, if the terms are not all positive, but the sum of their absolute values is finite, then you can. That is called "absolute convergence".

On the other hand, if the sum of the absolute values is not finite, then you cannot be sure. The series may converge when summed in one particular order but not in a different order. That is called "conditional convergence".
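A standard counterexample (not from the thread) shows how badly the interchange can fail without absolute convergence. Take ##a(r,s) = 1## if ##s=r##, ##-1## if ##s=r+1##, and ##0## otherwise; each row and each column has only finitely many nonzero terms, so the inner sums below are exact:

```python
# Classic counterexample: a(r, s) = 1 if s == r, -1 if s == r + 1, else 0.
def a(r, s):
    if s == r:
        return 1
    if s == r + 1:
        return -1
    return 0

N = 100  # outer sums truncated; the pattern is the same for any N

# Each row r contains exactly 1 and -1, so every row sums to 0.
row_sums = [a(r, r) + a(r, r + 1) for r in range(N)]
# Column 0 contains only a(0,0) = 1; every later column contains 1 and -1.
col_sums = [sum(a(r, s) for r in range(s + 1)) for s in range(N)]

print(sum(row_sums))  # 0  ->  Σ_r Σ_s a(r,s) = 0
print(sum(col_sums))  # 1  ->  Σ_s Σ_r a(r,s) = 1
```

The sum of the absolute values here diverges, so absolute convergence fails and the two orders genuinely disagree.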
 
FactChecker said:
If the terms are all positive and they have a finite sum, then you can.

I will go out on a limb and amend this to say: if all terms in your series are real valued and non-negative, you can always interchange the "summations". If the series converges before the interchange, it does so after. And if the series does not converge before the interchange, it does not converge afterward.

The thing you don't want to do in standard analysis is re-arrange and partition some infinite sum and get that half of it is ##\infty##, the other half is ##-\infty##, then add them together and say it is equal to ##0##. You are supposed to say things do not converge, and move on.

More generally, you can interchange any linear operators -- things like ##\int, \sum, E[], trace()## (where E is expectation operator), so long as you have first dealt with convergence issues.
 
i would just check that multiplication of matrices corresponds to composition of linear maps and thus the operation is associative since that fact is easy for maps.
 
  • #10
mathwonk said:
i would just check that multiplication of matrices corresponds to composition of linear maps and thus the operation is associative since that fact is easy for maps.

Since everyone is speaking of linear maps: is a linear map just a function from one field to another?
 
  • #11
Buffu said:
Since everyone is speaking of linear maps: is a linear map just a function from one field to another?
You would also need the linear properties of the map: ##M(aX+bY) = aM(X)+bM(Y)##.
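That defining property can be spot-checked numerically. A minimal sketch, using a hypothetical 2×3 matrix and arbitrary scalars and vectors:

```python
import random

# Hypothetical 2x3 matrix acting on vectors in k^3, checked against
# the linearity property M(aX + bY) = a M(X) + b M(Y) at random inputs.
M = [[1, 2, 3], [4, 5, 6]]

def apply(M, v):
    """Matrix-vector product: (Mv)_i = sum_j M_{ij} v_j."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

a, b = 2.5, -1.0
X = [random.random() for _ in range(3)]
Y = [random.random() for _ in range(3)]

lhs = apply(M, [a * x + b * y for x, y in zip(X, Y)])        # M(aX + bY)
rhs = [a * u + b * w for u, w in zip(apply(M, X), apply(M, Y))]  # aM(X) + bM(Y)
assert all(abs(p - q) < 1e-9 for p, q in zip(lhs, rhs))
```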
 
  • #12
a map f is linear if f (av+bw) = a f(v) + b f(w) for scalars a,b and vectors v,w. Since matrix multiplication obeys M(av+bw) = aMv + bMw, it is a linear map. but composition is associative for all maps, linear or not. The point is you only need to show associativity for multiplication by vectors, i.e. for matrices M,N and vectors v, that (M.N).v = M.(N.v). it then follows that (MN)P = M(NP) for all matrices M,N,P.
 
Last edited:
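The reduction described above, that associativity for matrix-matrix products follows from ##(M\cdot N)\cdot v = M\cdot(N\cdot v)## on vectors, can be sketched numerically with hypothetical random inputs:

```python
import random

n = 3
rand_mat = lambda: [[random.random() for _ in range(n)] for _ in range(n)]

def apply(M, v):
    """Matrix acting on a column vector: (Mv)_i = sum_j M_{ij} v_j."""
    return [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]

def matmul(M, N):
    """[MN]_{ij} = sum_r M_{ir} N_{rj}."""
    return [[sum(M[i][r] * N[r][j] for j2 in [j] for r in range(n)) for j in range(n)]
            for i in range(n)]

M, N, v = rand_mat(), rand_mat(), [random.random() for _ in range(n)]

# (M.N).v == M.(N.v): multiplying by MN is the same as applying N, then M,
# i.e. matrix multiplication represents composition of the two maps.
lhs = apply(matmul(M, N), v)
rhs = apply(M, apply(N, v))
assert all(abs(p - q) < 1e-9 for p, q in zip(lhs, rhs))
```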
  • #13
Buffu said:
Since everyone is speaking of linear maps: is a linear map just a function from one field to another?
You ultimately need a space in which adding vectors and scaling them makes sense, which an arbitrary space does not provide. Linear maps are usually defined on vector spaces, whose vectors form a group under addition, but they are also defined over rings, etc.
 
  • #14
multiplication by an n by m matrix of scalars from a field k sends column vectors of length m to column vectors of length n, hence defines a function ##k^m \to k^n##. Since addition of such column vectors is defined, and also multiplication of them by scalars, the requirements spelled out by WWGD in the previous post are satisfied, hence linearity makes sense. and since matrix multiplication obeys the laws mentioned above, this function is indeed a linear function, or linear "map", to distinguish it from usual functions whose inputs and outputs are numbers.
 
Last edited:
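The "matrix as a function ##k^m \to k^n##" point of view can be made literal in code. A small sketch with a hypothetical 2×3 example matrix:

```python
# An n-by-m matrix sends length-m column vectors to length-n vectors,
# i.e. it defines a function k^m -> k^n.  A hypothetical 2x3 example:
M = [[1, 0, 2],
     [0, 3, 1]]          # n = 2 rows, m = 3 columns

def as_map(M):
    """Wrap a matrix as the function v |-> Mv."""
    n, m = len(M), len(M[0])
    def f(v):
        assert len(v) == m            # the input lives in k^m
        return [sum(M[i][j] * v[j] for j in range(m)) for i in range(n)]
    return f

f = as_map(M)
print(f([1, 1, 1]))  # [3, 4] -- an element of k^2
```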
  • #15
mathwonk said:
multiplication by an n by m matrix of scalars from a field k, sends column vectors of length m to column vectors of length n, hence defines a function ##k^m \to k^n##. Since addition of such column vectors is defined and also multiplication of them by scalars, and since matrix multiplication obeys the laws mentioned above, this function is called a linear function, or linear "map", to distinguish it from usual functions whose inputs and outputs are numbers.
Yes, but I meant that those vectors must belong to a space where their sum is defined. You cannot, e.g., define a linear map on a discrete topological space without an additional definition of sum and scaling, at least not AFAIK.
 
  • #16
yes, i was trying to address your requirement in my second sentence (addressed to Buffu): "Since addition of such column vectors is defined and also multiplication of them by scalars". I have edited my post to try to make this more clear. thanks!
 
Last edited:
