
Matrix Representations of Linear Transformations


Let X and Y be finite-dimensional vector spaces. Let ##T:X\to Y## be a linear transformation. Let ##A=(e_1,\dots,e_n)## and ##B=(f_1,\dots,f_m)## be ordered bases for X and Y respectively. (An ordered basis for an n-dimensional vector space is just an n-tuple whose components are the elements of a basis). For each ##y\in Y##, there’s a unique m-tuple of scalars ##y_1,\dots,y_m## such that ##y=\sum_{i=1}^m y_i f_i##. These scalars are called the components of y with respect to B.
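
For example, if ##Y=\mathbb R^2## and ##B=((1,1),(1,-1))##, then the vector ##(3,1)## has components ##2## and ##1## with respect to B, since ##(3,1)=2(1,1)+1(1,-1)##.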

Let ##i\in\{1,\dots,m\}## and ##j\in\{1,\dots,n\}## be arbitrary. Since ##T## maps ##X## into ##Y##, ##Te_j## is an element of Y. Since we use the notation ##y_i## for the ith component of an arbitrary ##y\in Y## with respect to B, it’s natural to use the notation ##(Te_j)_i## for the ith component of ##Te_j## with respect to B. The mn (m times n) scalars ##(Te_j)_i## with ##i\in\{1,\dots,m\}## and ##j\in\{1,\dots,n\}## are called the components, or matrix elements, of T with respect to (A,B). The m×n matrix
$$\begin{pmatrix}(Te_1)_1 & \cdots & (Te_n)_1\\ \vdots & \ddots & \vdots\\ (Te_1)_m & \cdots & (Te_n)_m\end{pmatrix}$$ is called the matrix representation of T with respect to (A,B). It is often denoted by the same symbol as the linear transformation, in this case T. In situations where you would prefer to use different notations for the linear transformation and its matrix representation, a notation like ##[T]## or ##[T]_{B,A}## can be used for the latter.

The standard notation for the scalar on row i, column j of a matrix ##T## is ##T_{ij}##. In this notation, we have ##T_{ij}=(Te_j)_i##. This is the formula you need to remember. An alternative notation for the scalar on row i, column j is ##T^i_j##. In this notation, we have ##T^i_j=(Te_j)^i##. You may find it easier to remember this version of the formula, since the index that’s upstairs on the left is upstairs on the right.

Given an m×n matrix M, there’s a simple way to define a linear transformation ##T:X\to Y## such that the matrix representation of T with respect to (A,B) is M. We define T to be the unique linear map ##T:X\to Y## such that ##Te_j=\sum_{i=1}^m M_{ij}f_i## for all ##j\in\{1,\dots,n\}##. In the alternative notation, we would write ##Te_j=\sum_{i=1}^m M^i_j f_i##. You may find it easier to remember this version of the formula, since the summation is over an index that appears once upstairs and once downstairs.
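
This construction is easy to demonstrate in code. The following is a minimal NumPy sketch (an illustration only; the function name `map_from_matrix` is made up) for the case ##X=\mathbb R^n##, ##Y=\mathbb R^m## with arbitrary ordered bases:

```python
import numpy as np

def map_from_matrix(M, A, B):
    """Return the linear map T: R^n -> R^m whose matrix
    representation with respect to (A, B) is M.

    M : (m, n) array of scalars M_ij
    A : list of n basis vectors e_1, ..., e_n of R^n
    B : list of m basis vectors f_1, ..., f_m of R^m
    """
    E = np.column_stack(A)  # columns e_1, ..., e_n; invertible since A is a basis
    F = np.column_stack(B)  # columns f_1, ..., f_m
    def T(x):
        c = np.linalg.solve(E, x)  # components of x with respect to A
        return F @ (M @ c)         # Tx = sum_i (Mc)_i f_i
    return T
```

In particular ##Te_j=\sum_{i=1}^m M_{ij}f_i##, because the components of ##e_j## with respect to A form the jth standard basis vector, so ##Mc## is just the jth column of M. With the standard bases, E and F are identity matrices and T reduces to ##x\mapsto Mx##.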

The following observation provides some motivation for the definitions. Let ##x\in X## be arbitrary. Define ##y\in Y## by ##y=Tx##. Let ##x_1,\dots,x_n## be the components of x with respect to A, and let ##y_1,\dots,y_m## be the components of y with respect to B. We have
\begin{align*}
y &=\sum_{i=1}^m y_i f_i\\
Tx &=T\left(\sum_{j=1}^n x_j e_j\right) =\sum_{j=1}^n x_jTe_j =\sum_{j=1}^n x_j \left(\sum_{i=1}^m(Te_j)_i f_i\right) =\sum_{j=1}^n \sum_{i=1}^m x_j (Te_j)_i f_i\\
&=\sum_{i=1}^m \left(\sum_{j=1}^n x_j (Te_j)_i\right) f_i.
\end{align*}
Since ##\{f_1,\dots,f_m\}## is linearly independent, these results and the equality ##y=Tx## imply that
$$\sum_{j=1}^n x_j (Te_j)_i= y_i$$ for all ##i\in\{1,\dots,m\}##. If you recall that the definition of matrix multiplication is ##(AB)_{ij}=\sum_k A_{ik}B_{kj}##, you can recognize the above as the ith row of the matrix equation
$$\begin{pmatrix}(Te_1)_1 & \cdots & (Te_n)_1\\ \vdots & \ddots & \vdots\\ (Te_1)_m & \cdots & (Te_n)_m\end{pmatrix} \begin{pmatrix}x_1\\ \vdots\\ x_n\end{pmatrix} =\begin{pmatrix}y_1\\ \vdots\\ y_m\end{pmatrix}.$$

The following is a simple example of how to find a matrix representation of a linear transformation. Define ##S:\mathbb R^3\to\mathbb R^2## by ##S(x,y,z)=(3z-x,2y)##. This S is linear. Let ##C=(g_1,g_2,g_3)## and ##D=(h_1,h_2)## be the standard ordered bases for ##\mathbb R^3## and ##\mathbb R^2## respectively. We will denote the matrix representation of S with respect to (C,D) by ##[S]##.
\begin{align*}Sg_1 &=S(1,0,0)=(-1,0) =-1h_1+0h_2\\
Sg_2 &=S(0,1,0) =(0,2)=0h_1+2h_2\\
Sg_3 &=S(0,0,1) =(3,0)=3h_1+0h_2\\
[S] &=\begin{pmatrix}(Sg_1)_1 & (Sg_2)_1 & (Sg_3)_1\\ (Sg_1)_2 & (Sg_2)_2 & (Sg_3)_2\end{pmatrix} =\begin{pmatrix}-1 & 0 & 3\\ 0 & 2 & 0 \end{pmatrix}.
\end{align*} Note that for all ##x,y,z\in\mathbb R##,
$$[S]\begin{pmatrix}x\\ y\\ z\end{pmatrix} = \begin{pmatrix}-1 & 0 & 3\\ 0 & 2 & 0 \end{pmatrix} \begin{pmatrix}x\\ y\\ z\end{pmatrix} =\begin{pmatrix}-x+3z\\ 2y\end{pmatrix}.$$
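
If you want to check this numerically, a quick NumPy sketch (with `S_matrix` as a made-up name for ##[S]##) confirms that multiplication by ##[S]## reproduces S:

```python
import numpy as np

S_matrix = np.array([[-1, 0, 3],
                     [ 0, 2, 0]])

def S(v):
    # S(x, y, z) = (3z - x, 2y), as defined above
    x, y, z = v
    return np.array([3*z - x, 2*y])

v = np.array([1.0, 2.0, 3.0])           # an arbitrary test vector
print(S_matrix @ v, S(v))               # both print [8. 4.]
assert np.allclose(S_matrix @ v, S(v))
```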

If Y is an inner product space, and B is an orthonormal ordered basis, the easiest way to find the matrix elements is often to use the inner product. We will use the physicist’s convention for inner products. This means that the term “inner product” is defined so that when we’re dealing with a vector space over ℂ (i.e. when the set of scalars is ℂ rather than ℝ), the map ##v\mapsto\langle u,v\rangle## is linear and the map ##u\mapsto\langle u,v\rangle## is antilinear (i.e. conjugate linear). Let ##i\in\{1,\dots,m\}## and ##y\in Y## be arbitrary. Let ##y_1,\dots,y_m## be the components of y with respect to B.
$$\left\langle f_i,y\right\rangle =\left\langle f_i,\sum_{j=1}^m y_j f_j \right\rangle =\sum_{j=1}^m y_j \left\langle f_i,f_j\right\rangle =\sum_{j=1}^m y_j \delta_{ij} =y_i.$$ Since i and y are arbitrary, this implies that for all ##i\in\{1,\dots,m\}## and all ##j\in\{1,\dots,n\}##,
$$\left\langle f_i,Te_j \right\rangle=(Te_j)_i .$$ If X=Y, it’s convenient to choose B=A, and to speak of the matrix representation of T with respect to A instead of with respect to (A,A), or (A,B). The formula for ##T_{ij}## can now be written as
$$T_{ij}=(Te_j)_i=\left\langle e_i,Te_j \right\rangle.$$ One final comment for those of you who have studied quantum mechanics. (If you haven’t, just ignore this). In bra-ket notation, we would usually write the ith basis vector as ##\left|i\right\rangle##. This turns the last formula above into
$$T_{ij} =\left\langle i\right|T\left|j\right\rangle.$$
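
For those who want to check the inner product formula numerically: NumPy’s `np.vdot` conjugates its first argument, which matches the physicist’s convention, so a minimal sketch like the following (with a made-up operator on ##\mathbb C^2## and the standard orthonormal basis) recovers ##T_{ij}=\langle e_i,Te_j\rangle##:

```python
import numpy as np

# A made-up operator on C^2, specified by its matrix with
# respect to the standard orthonormal basis (e_1, e_2).
T = np.array([[ 1.0,  2.0j],
              [-2.0j, 3.0 ]])

e = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

for i in range(2):
    for j in range(2):
        # np.vdot(u, v) = sum_k conj(u_k) v_k, the physicist's <u, v>
        assert np.isclose(np.vdot(e[i], T @ e[j]), T[i, j])
```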

Forum Comments

  1. Krylov says:

    Thank you for the nice article!

    I hope it will help beginning students to avoid the kind of confusion that I used to experience. (I think that part of this confusion is due to the fact that in physics literature, one usually doesn’t distinguish between an operator and its matrix representation and, often, one also omits the specification of the underlying bases. For trained readers this is usually not a problem, but for students just coming from an LA course and looking to apply the theory in physics problems, I believe this can cause unnecessary difficulties.)

    One typo:

    In the line starting with: “We just define T to be the unique linear ##T:X→Y## such that (…)” you probably meant to write
    $$
    T e_j = \sum_{i=1}^m T_{ij} f_i
    $$
    since at this point in the text you have not yet assumed that ##X = Y##, etc.

    Two suggestions:

    1. I think it would make the article even better if you would also discuss a second example, this time of an operator acting on an abstract (but still finite-dimensional) vector space, such as a space of polynomials or so. (A quick sketch of what I mean follows below this list.) This way, it becomes clear that a vector / matrix and its representation w.r.t. a basis are really two different things, and it also showcases the power of matrix representations when doing computations with abstract operators.

    2. Perhaps, alluding to your remark on QM at the end, it would be nice if you would write a follow-up on how this generalizes quite easily to bounded linear operators on separable Hilbert spaces. Then, you could also comment on what happens when you replace a bounded operator with an unbounded (differential) operator, which is typically the case physicists encounter when studying QM.
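
    As a quick sketch of the kind of example I mean in point 1 (all of it easy to check by hand): take the space of polynomials of degree at most 2, with ordered basis ##A=(1,x,x^2)##, and let ##T## be differentiation. Then ##T1=0##, ##Tx=1## and ##Tx^2=2x##, so
    $$
    [T]_A=\begin{pmatrix}0 & 1 & 0\\ 0 & 0 & 2\\ 0 & 0 & 0\end{pmatrix},
    $$
    and the component triple ##(a,b,c)## of ##p(x)=a+bx+cx^2## is mapped to ##(b,2c,0)##, the component triple of ##p'(x)=b+2cx##.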
    Hopefully you do not consider these comments an interference, but rather an expression of my enthusiasm for the subject and the attention that it has recently received on PF.

  2. lavinia says:

    As pointed out by Krylov

    ”Given an m×n matrix M, there’s a simple way to define a linear transformation T:X→Y such that the matrix representation of T with respect to (A,B) is M. We just define T to be the unique linear ##T:X\to Y## such that ##Te_j=\sum_{i=1}^n T_{ij}e_i## for all ##j\in\{1,\dots,n\}##.”

    only works if ##X=Y##. A matrix determines a linear transformation for each choice of basis for ##X## and ##Y##. Without a choice of bases, the matrix does not determine a linear transformation.

  3. fresh_42 says:

    A matrix determines a linear transformation for each choice of basis for ##X## and ##Y##. Without a choice of bases, the matrix does not determine a linear transformation.

    But given a matrix, the unity vectors in both spaces always define a natural basis with respect to which the matrix is a linear transformation.

  4. Fredrik says:

    One typo:

    In the line starting with: “We just define T to be the unique linear ##T:X→Y## such that (…)” you probably meant to write
    $$
    T e_j = \sum_{i=1}^m T_{ij} f_i
    $$

    Good catch. That line isn’t present in the last draft that I discussed with other people (in February 2013) before I turned it into a FAQ post (in June 2013…I’m pretty slow apparently), so I must have put it in later and not proofread it well enough.

    Two suggestions:

    1. I think it would make the article even better if you would also discuss a second example, this time of an operator acting on an abstract (but still finite-dimensional) vector space, such as a space of polynomials or so. This way, it becomes clear that a vector / matrix and its representation w.r.t. a basis are really two different things, and it also showcases the power of matrix representations when doing computations with abstract operators.

    2. Perhaps, alluding to your remark on QM at the end, it would be nice if you would write a follow-up on how this generalizes quite easily to bounded linear operators on separable Hilbert spaces. Then, you could also comment on what happens when you replace a bounded operator with an unbounded (differential) operator, which is typically the case physicists encounter when studying QM.
    Hopefully you do not consider these comments an interference, but rather an expression of my enthusiasm for the subject and the attention that it has recently received on PF.

    Your comments are welcome, and I like your suggestions. Unfortunately I don’t have a lot of time to improve this post right now. If you would like to do it, I’m more than OK with that.

    The LaTeX can be improved. When I wrote this in 2013, LaTeX behaved differently here. There was no automatic numbering of equations, for example. I would like to make sure that only those equations that should be numbered are numbered. Removing all the numbers is also an option. Also, the equation that begins with Tx= wasn’t split over two lines before. It needs an explicit line break followed by an alignment symbol. (I could edit the post when it was a normal FAQ post. I don’t think I can now that it’s an Insights post).

  5. WWGD says:

    I also think the bases selected in each of ##X,Y## both have to be ordered bases for there to be an isomorphism between ##L(X,Y)##, the linear maps between ##X,Y##, and ##M_{n\times m}(R)##, where ##R## is the ring; ##M_{n\times m}(R)## is the space of matrices with coefficients in the ring, and ##X,Y## are (free, of course) ##R##-modules (both right- or left-, I think). I think this is the most general scope of the isomorphism.

  6. lavinia says:

    But given a matrix, the unity vectors in both spaces always define a natural basis with respect to which the matrix is a linear transformation.

    Yes, if one already has two bases, then the matrix defines a linear map. But there is no natural given basis for a vector space. You need to select one. Not sure what you mean by the unity vectors.

  7. fresh_42 says:

    Yes, if one already has two bases, then the matrix defines a linear map. But there is no natural given basis for a vector space. You need to select one. Not sure what you mean by the unity vectors.

    Physicists probably write them ##e_i = (\delta_{ij})_j##. I learned unit vectors. OK, it’s not the i-th basis vector but the coordinate representation of the i-th basis vector. But that is hair-splitting. To give the impression that a matrix isn’t a linear transformation is negligent. There is always a basis with respect to which the matrix is a linear transformation. And in the finite-dimensional case even without the use of the axiom of choice. I just wanted to avoid someone saying: “But I’ve read on the internet that a matrix isn’t a linear transformation.” The discussion distinguishing between the vectors themselves and their coordinate representation is in my opinion something for specialists and logicians.

