Linear operators & mappings between vector spaces


Discussion Overview

The discussion revolves around the definition and properties of linear mappings between two vector spaces, specifically focusing on the expression of transformations in terms of their matrix representations. Participants explore the implications of different notations and conventions used in defining linear operators.

Discussion Character

  • Technical explanation
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • One participant questions why the transformation is expressed as T(v_j) = ∑_{i=1}^{n} T_{ij}w_{i} instead of ∑_{j=1}^{n} T_{ij}w_{j}, suggesting a need for deeper understanding or derivation of this expression.
  • Another participant explains that the left-hand side T(v_j) contains a j, and summing over j on the right side would lead to a loss of this index, complicating interpretation.
  • Some participants note that the traditional formulation is linked to matrix representations, where linear transformations are associated with matrices, and changing the order of indices would alter the interpretation of the transformation.
  • There is a discussion about the implications of using different indices and how it affects the understanding of the transformation and its matrix representation.
  • One participant reflects on their reasoning regarding the relationship between basis vectors and their representation in matrix form, questioning the generality of their approach.
  • Another participant acknowledges the calculation presented by a peer but points out inaccuracies and potential confusion in notation, suggesting a clearer distinction between basis vectors and arbitrary vectors.

Areas of Agreement / Disagreement

Participants express differing views on the notation and conventions used in defining linear transformations, but broadly agree that the placement of the indices is a convention tied to the column-vector matrix representation rather than a mathematical necessity. Some emphasise the traditional aspects of matrix representation, while others question the clarity and implications of the notation.

Contextual Notes

Participants highlight potential ambiguities in notation and the need for careful interpretation of indices in mathematical expressions. There are also mentions of unresolved aspects regarding the derivation of certain expressions and the implications of different conventions.

"Don't panic!"
Hi,

I'm having a bit of difficulty with the following definition of a linear mapping between two vector spaces:

Suppose we have two n-dimensional vector spaces ##V## and ##W## and a set of linearly independent vectors ##\mathcal{S} = \lbrace \mathbf{v}_{i}\rbrace_{i=1,\ldots,n}## which forms a basis for ##V##. We define the linear operator ##T:V \rightarrow W## which maps the basis vectors ##\mathbf{v}_{j}## to their representations in ##W## by ##T\left(\mathbf{v}_{j}\right) = \sum_{i=1}^{n} T_{ij}\mathbf{w}_{i}##, where ##\mathcal{B} = \lbrace\mathbf{w}_{i}\rbrace_{i=1,\ldots,n}## is a basis for ##W##.

What I'm struggling with is: why is the transformation expressed as ##\sum_{i=1}^{n} T_{ij}\mathbf{w}_{i}## and not ##\sum_{j=1}^{n} T_{ij}\mathbf{w}_{j}##? Is it purely definition, or is there some deeper meaning behind it? (And if so, is there any way of deriving this expression?)

Sorry to ask a probably very trivial question, but it's been bugging me, and I can't seem to find a satisfactory answer from trawling the internet.
 
"Don't panic!" said:
What I'm struggling with is: why is the transformation expressed as ##\sum_{i=1}^{n} T_{ij}\mathbf{w}_{i}## and not ##\sum_{j=1}^{n} T_{ij}\mathbf{w}_{j}##? Is it purely definition, or is there some deeper meaning behind it? (And if so, is there any way of deriving this expression?)

The left-hand side ##T(\mathbf v_j)## has a ##j## in it. If you summed over ##j## on the right-hand side, you wouldn't have any ##j## remaining on the right-hand side. So the equation would make a claim that ##T(\mathbf v_j)## is a sum involving subscripts that are constant except for the index ##i##. How would we interpret that?

Mathematics is traditionally written in a sloppy way from the viewpoint of logic. An equation with unknowns like ##i## and ##j## is used with the preceding quantifiers "for each i and for each j" omitted. So the statement ##T(\mathbf v_j) = T_{i1}\mathbf w_1 + T_{i2}\mathbf w_2 + \cdots + T_{in}\mathbf w_n## would be interpreted as a condition that is true for each pair of indices ##i, j##. However, this wouldn't make sense as a definition unless terms like ##T_{i1}\mathbf w_1## had the same value for all indices ##i##. We don't want that restriction in the definition of a linear transformation.

You could also ask why we don't define the transformation by ##T(\mathbf v_j) = \sum_{i=1}^n T_{ji}\mathbf w_i##.

It's just a matter of tradition. Linear transformations are associated with matrices. The traditional way to think of this is to think of a column vector ##\mathbf v## being transformed by being left-multiplied by a matrix ##T##. If you wanted to think of a transformation as a row vector ##\mathbf v## being right-multiplied by a matrix ##T##, you'd change the order of ##T##'s indices ##i,j## in the definition.
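
To make this concrete, here is the ##2\times 2## case of the definition, with entries chosen arbitrarily for illustration. Take ##T_{11}=1##, ##T_{21}=3##, ##T_{12}=2##, ##T_{22}=4##. The definition then says
$$T(\mathbf v_1) = 1\,\mathbf w_1 + 3\,\mathbf w_2, \qquad T(\mathbf v_2) = 2\,\mathbf w_1 + 4\,\mathbf w_2,$$
so the coefficients of ##T(\mathbf v_j)## fill the ##j##th column of
$$[T] = \left(\begin{matrix} 1 & 2 \\ 3 & 4 \end{matrix}\right),$$
which is exactly the arrangement needed for left multiplication of a column vector.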
 
"Don't panic!" said:
What I'm struggling with is: why is the transformation expressed as ##\sum_{i=1}^{n} T_{ij}\mathbf{w}_{i}## and not ##\sum_{j=1}^{n} T_{ij}\mathbf{w}_{j}##? Is it purely definition, or is there some deeper meaning behind it? (And if so, is there any way of deriving this expression?)
That first definition ensures that ##(Tv_j)_i## (the ith component of ##Tv_j## in the ##\mathcal B## basis) is ##T_{ij}##.

Recall that when we're given a linear operator T, we define the matrix [T] corresponding to it by ##[T]_{ij}=(Tv_j)_i##. (See the FAQ post I linked to in your other thread). The reason that we use this convention rather than ##[T]_{ij}=(Tv_i)_j## is that it's nice to have the matrix equation that corresponds to y=Tx be [y]=[T][x], rather than ##[y]^T=[x]^T[T]##.

The convention you're asking about ensures that ##[T]_{ij}=T_{ij}##.
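
To spell out why this choice gives ##[y]=[T][x]## (a short sketch using only the definitions above): if ##\mathbf y=T\mathbf x## and ##\mathbf x=\sum_j x_j\mathbf v_j##, then by linearity
$$[\mathbf y]_i = (T\mathbf x)_i = \sum_j x_j (T\mathbf v_j)_i = \sum_j [T]_{ij}[\mathbf x]_j,$$
which is exactly the ##i##th entry of the matrix product ##[T][\mathbf x]##.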
 
Stephen Tashi said:
The left-hand side ##T(\mathbf v_j)## has a ##j## in it. If you summed over ##j## on the right-hand side, you wouldn't have any ##j## remaining on the right-hand side. So the equation would make a claim that ##T(\mathbf v_j)## is a sum involving subscripts that are constant except for the index ##i##. How would we interpret that?

You could also ask why we don't define the transformation by ##T(\mathbf v_j) = \sum_{i=1}^n T_{ji}\mathbf w_i##.

Thanks. Sorry, when I wrote it with the sum over the other index, I had meant it in the form that you've put above.

I guess I was just trying to make sense of it really, since usually one expresses a vector as a column matrix with respect to a given ordered basis, for example
$$\left[\mathbf{v}\right]_{\mathcal{B}} = \left(\begin{matrix} a_{1} \\ \vdots \\ a_{n} \end{matrix}\right)$$
where ##\mathcal{B}## is a basis for the ##n##-dimensional space ##V## and ##\mathbf{v} \in V##. The components ##a_{i}## of the column matrix are the components of the vector ##\mathbf{v}## with respect to the basis vectors ##\mathbf{v}_{i}##. (##\left[\,\cdot\,\right]_{\mathcal{B}}## defines an isomorphism between ##V## and ##\mathbb{R}^{n}##.)

Then, for some linear operator ##\mathcal{A}##, we have that, with respect to the basis ##\mathcal{B}##:
$$\left[\mathcal{A}\left(\mathbf{v}\right)\right]_{\mathcal{B}} = \left(\sum_{j=1}^{n} A_{ij} a_{j}\right)$$
where the ##a_{j}## are the components of ##\mathbf{v}## with respect to the basis vectors ##\mathbf{v}_{i} \in \mathcal{B}## (the right-hand side of the equation denotes a column matrix whose components are ##\sum_{j=1}^{n} A_{ij} a_{j}##).
In my mind, I rationalised it by using the fact that one can treat the basis vectors as the standard basis with respect to the basis that they themselves define, i.e.
$$\left[\mathbf{v}_{i}\right]_{\mathcal{B}} = \left(\begin{matrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{matrix}\right)$$
where ##\mathbf{v}_{i} \in \mathcal{B}##. And in that sense, for the ##j##th basis vector ##\mathbf{v}_{j} \in \mathcal{B}##, the vector "picks off" the elements of the ##j##th column of the matrix representation of ##\mathcal{A}## in the basis ##\mathcal{B}##, such that
$$\left[\mathcal{A}\left(\mathbf{v}_{j}\right)\right]_{\mathcal{B}} = \left(\begin{matrix} A_{1j} \\ \vdots \\ A_{nj} \end{matrix}\right) = A_{1j}\left(\begin{matrix} 1 \\ 0 \\ \vdots \\ 0 \end{matrix}\right) + \cdots + A_{nj}\left(\begin{matrix} 0 \\ \vdots \\ 0 \\ 1 \end{matrix}\right) = A_{1j}\left[\mathbf{v}_{1}\right]_{\mathcal{B}} + \cdots + A_{nj}\left[\mathbf{v}_{n}\right]_{\mathcal{B}} = \left[A_{1j}\mathbf{v}_{1} + \cdots + A_{nj}\mathbf{v}_{n}\right]_{\mathcal{B}} = \left[\sum_{i=1}^{n} A_{ij}\mathbf{v}_{i}\right]_{\mathcal{B}}$$
from which one could infer that ##\mathcal{A}\left(\mathbf{v}_{j}\right) = \sum_{i=1}^{n} A_{ij}\mathbf{v}_{i}##; however, I'm not convinced that I've got this right, and it didn't seem to be the most general way of doing it?!
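
For what it's worth, the ##2\times 2## case seems to bear this reasoning out:
$$\left(\begin{matrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{matrix}\right)\left(\begin{matrix} 0 \\ 1 \end{matrix}\right) = \left(\begin{matrix} A_{12} \\ A_{22} \end{matrix}\right) = A_{12}\left[\mathbf{v}_{1}\right]_{\mathcal{B}} + A_{22}\left[\mathbf{v}_{2}\right]_{\mathcal{B}},$$
i.e. ##\mathcal{A}\left(\mathbf{v}_{2}\right) = A_{12}\mathbf{v}_{1} + A_{22}\mathbf{v}_{2}##, consistent with the general expression above.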
 
Cheers Fredrik. My question actually arose after reading your FAQ, I was just a bit unsure. So, is the way that an operator acts on the basis vectors defined that way so that one recovers the matrix equation ##\left[\mathbf{y}\right] = \left[\mathcal{T}\right]\left[\mathbf{x}\right]## instead of ##\left[\mathbf{y}\right]^{T} = \left[\mathbf{x}\right]^{T}\left[\mathcal{T}\right]##?
Also, is it correct to deduce it the way I did in the above post?
 
"Don't panic!" said:
Then, for some linear operator ##\mathcal{A}##, we have that, with respect to the basis ##\mathcal{B}##: $$\left[\mathcal{A}\left(\mathbf{v}\right)\right]_{\mathcal{B}} = \left(\sum_{j=1}^{n} A_{ij} a_{j}\right)$$
Your calculation looks OK, but there are some inaccuracies. I see nothing in the calculation that indicates a misunderstanding, but the formula above is weird, because it says that a matrix is equal to a number. The right-hand side is the ith component of the left-hand side.

It's also a little confusing to use the same letter of the alphabet (v) for the basis vectors and the arbitrary vector. I will write ##\mathbf x=\sum x_i\mathbf v_i## instead.

If you want to prove that ##A\mathbf v_j=\sum_{i=1}^n A_{ij}\mathbf v_i##, where ##A_{ij}## is the row i, column j component of the matrix representation of A, you really only have to realize the left-hand side is a vector, and can therefore be expressed as a linear combination of the basis vectors: ##A\mathbf v_j = \sum_{i=1}^n (A\mathbf v_j)_i \mathbf v_i##. Then you recall that ##A_{ij}## is defined by ##A_{ij}=(A\mathbf v_j)_i##.

Then you can use this result to evaluate ##A\mathbf x##, where ##\mathbf x## is an arbitrary vector.
$$A\mathbf x =A\bigg(\sum_j x_j \mathbf v_j\bigg) =\sum_j x_j A\mathbf v_j =\sum_j x_j\bigg(\sum_i A_{ij}\mathbf v_i\bigg) =\sum_i\bigg(\sum_j A_{ij}x_j\bigg)\mathbf v_i.$$ This implies that the ith component of ##A\mathbf x## is
$$(A\mathbf x)_i =\sum_j A_{ij} x_j.$$
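As a concrete check of that last formula (with numbers chosen arbitrarily): in two dimensions, take
$$[A] = \left(\begin{matrix} 1 & 2 \\ 3 & 4 \end{matrix}\right), \qquad [\mathbf x] = \left(\begin{matrix} 5 \\ 6 \end{matrix}\right).$$
The component formula gives ##(A\mathbf x)_1 = 1\cdot 5 + 2\cdot 6 = 17## and ##(A\mathbf x)_2 = 3\cdot 5 + 4\cdot 6 = 39##, which matches the matrix product ##[A][\mathbf x]##.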
 
Cheers Fredrik, that's most helpful!

Yes, the formula ##\left[\mathcal{A}\left(\mathbf{v}\right)\right]_{\mathcal{B}} = \left(\sum_{j=1}^{n} A_{ij} a_{j}\right)## was just laziness on my part - I didn't want to write out all the components of the column vector, so I implied it, albeit a bit ambiguously, through the parentheses around the sum. Apologies for the confusion, and thank you very much for your help once again.
 
Either way works; the definition is just picking one of the two conventions, and this choice makes the notation simpler.
 
The question features basis vectors, whereas the routine operation is the calculation of components; one rarely refers to the basis directly. For components we have, in standard notation, an intuitive order of indices: ##(T\mathbf v)^k = T^k{}_l v^l##, where ##k## is the index of a row and ##l## enumerates both the columns of the matrix ##T^k{}_l## and the components ##v^l##.

The original poster wants to see how the images of the basis vectors of the first space are expressed in terms of the basis vectors of the second space. Yes, there is a reversal, because bases are a concept dual to components, namely ##(\mathbf e_i)^k = \delta^k_i##.
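
To see where ##(\mathbf e_i)^k = \delta^k_i## comes from (a one-line sketch): expand a basis vector in its own basis, ##\mathbf e_i = \sum_j (\mathbf e_i)^j\,\mathbf e_j##; since the expansion of ##\mathbf e_i## is just ##\mathbf e_i## itself, the coefficient ##(\mathbf e_i)^j## must equal ##1## for ##j=i## and ##0## otherwise.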
 
