# Linear Transformation Using Two Bases

Hi,

I'm having trouble understanding the purpose of using two bases in a linear transformation. My lecturer explained that it was a way to find a linear transformation that satisfied either dimension, but I'm having trouble understanding how that relates to the method for finding this transformation.

An example being,

##T(A_0+A_1x+A_2x^2)=(A_0+A_2)+A_0x##

Using the standard bases
##B=(1,x,x^2)##
##C=(1,x)##

$$[T]_{CB}=\begin{pmatrix}1 & 0 & 1\\ 1 & 0 & 0\end{pmatrix}$$ What does this matrix indicate?

## Answers and Replies

Fredrik
Staff Emeritus
Science Advisor
Gold Member
This looks like a good opportunity to test if a post I wrote a few months ago is understandable. Let me know if something isn't clear. I've been meaning to turn this into a FAQ post, but I haven't gotten around to adding the additional material I think should be included.

I moved your post to linear & abstract algebra. Questions about concepts, definitions and theorems do not belong in the homework forums, which are mainly for questions about textbook-style problems.

Let X and Y be finite-dimensional vector spaces. Let ##T:X\to Y## be a linear transformation. Let ##A=(e_1,\dots,e_n)## and ##B=(f_1,\dots,f_m)## be ordered bases for X and Y respectively. For each ##y\in Y##, there's a unique m-tuple of scalars ##y_1,\dots,y_m## such that ##y=\sum_{i=1}^m y_i f_i##. These scalars are called the components of y with respect to B.

Let ##i\in\{1,\dots,m\}## and ##j\in\{1,\dots,n\}## be arbitrary. ##Te_j## is clearly a member of Y. Since we use the notation ##y_i## for the ith component of an arbitrary ##y\in Y## with respect to B, it's natural to use the notation ##(Te_j)_i## for the ith component of ##Te_j## with respect to B. The mn (m times n) scalars ##(Te_j)_i## with ##i\in\{1,\dots,m\}## and ##j\in\{1,\dots,n\}## are called the components, or matrix elements, of T with respect to (A,B). The m×n matrix
$$\begin{pmatrix}(Te_1)_1 & \cdots & (Te_n)_1\\ \vdots & \ddots & \vdots\\ (Te_1)_m & \dots & (Te_n)_m\end{pmatrix}$$ is called the matrix representation of T, or just the matrix of T, with respect to (A,B). It is often denoted by the same symbol as the linear transformation, in this case T. In situations where you would prefer to use different notations for the linear transformation and its matrix representation, a notation like ##[T]## or ##[T]^{A,B}## can be used for the latter.

The standard notation for the scalar on row i, column j of a matrix ##T## is ##T_{ij}##. In this notation, we have ##T_{ij}=(Te_j)_i##. This is the formula you need to remember. An alternative notation for the scalar on row i, column j is ##T^i_j##. In this notation, we have ##T^i_j=(Te_j)^i##. If you commit this version of the formula to memory, there's no chance that you will forget the correct order of the indices.

The following observation provides some motivation for the definitions. Let ##x\in X## be arbitrary. Define ##y\in Y## by ##y=Tx##. Let ##x_1,\dots,x_n## be the components of x with respect to A, and let ##y_1,\dots,y_m## be the components of y with respect to B.
\begin{align}
y &=\sum_{i=1}^m y_i f_i\\
Tx &=T\left(\sum_{j=1}^n x_j e_j\right) =\sum_{j=1}^n x_jTe_j =\sum_{j=1}^n x_j \left(\sum_{i=1}^m(Te_j)_i f_i\right) =\sum_{j=1}^n \sum_{i=1}^m x_j (Te_j)_i f_i =\sum_{i=1}^m \left(\sum_{j=1}^n x_j (Te_j)_i\right) f_i
\end{align}
Since ##\{f_1,\dots,f_m\}## is linearly independent, these results and the equality y=Tx imply that
$$\sum_{j=1}^n x_j (Te_j)_i= y_i$$ for all ##i\in\{1,\dots,m\}##. If you recall that the definition of matrix multiplication is ##(AB)_{ij}=\sum_k A_{ik}B_{kj}##, you can easily recognize the above as the ith row of the matrix equation
$$\begin{pmatrix}(Te_1)_1 & \cdots & (Te_n)_1\\ \vdots & \ddots & \vdots\\ (Te_1)_m & \cdots & (Te_n)_m\end{pmatrix} \begin{pmatrix}x_1\\ \vdots\\ x_n\end{pmatrix} =\begin{pmatrix}y_1\\ \vdots\\ y_m\end{pmatrix}.$$

The following is a simple example of how to find a matrix representation of a linear transformation. Define ##S:\mathbb R^3\to\mathbb R^2## by ##S(x,y,z)=(3z-x,2y)##. This S is clearly linear. Let ##C=(e_1,e_2,e_3)## and ##D=(f_1,f_2)## be the standard ordered bases for ##\mathbb R^3## and ##\mathbb R^2## respectively. We will denote the matrix of S with respect to (C,D) by ##[S]##.
\begin{align}Se_1 &=S(1,0,0)=(-1,0) =-1f_1+0f_2\\
Se_2 &=S(0,1,0) =(0,2)=0f_1+2f_2\\
Se_3 &=S(0,0,1) =(3,0)=3f_1+0f_2\\
{[S]} &=\begin{pmatrix}(Se_1)_1 & (Se_2)_1 & (Se_3)_1\\ (Se_1)_2 & (Se_2)_2 & (Se_3)_2\end{pmatrix} =\begin{pmatrix}-1 & 0 & 3\\ 0 & 2 & 0 \end{pmatrix}.
\end{align} Note that for all ##x,y,z\in\mathbb R##,
$$[S]\begin{pmatrix}x\\ y\\ z\end{pmatrix} = \begin{pmatrix}-1 & 0 & 3\\ 0 & 2 & 0 \end{pmatrix} \begin{pmatrix}x\\ y\\ z\end{pmatrix} =\begin{pmatrix}-x+3z\\ 2y\end{pmatrix}.$$

If Y is an inner product space and B is orthonormal, the easiest way to find the matrix elements is often to use the inner product. We will use the physicist's convention for inner products. This means that the term "inner product" is defined so that when we're dealing with a vector space over ℂ (i.e. when the set of scalars is ℂ rather than ℝ), the map ##v\mapsto\langle u,v\rangle## is linear and the map ##u\mapsto\langle u,v\rangle## is antilinear (i.e. conjugate linear). Let ##i\in\{1,\dots,m\}## and ##y\in Y## be arbitrary. Let ##y_1,\dots,y_m## be the components of y with respect to B.
$$\left\langle f_i,y\right\rangle =\left\langle f_i,\sum_{j=1}^m y_j f_j \right\rangle =\sum_{j=1}^m y_j \left\langle f_i,f_j\right\rangle =\sum_{j=1}^m y_j \delta_{ij} =y_i.$$ Since i and y are arbitrary, this implies that for all ##i\in\{1,\dots,m\}## and all ##j\in\{1,\dots,n\}##,
$$(Te_j)_i =\left\langle f_i,Te_j \right\rangle.$$ If X=Y, it's convenient to choose B=A, and to speak of the matrix representation of T with respect to A instead of with respect to (A,A), or (A,B). The formula for ##T_{ij}## can now be written as
$$T_{ij}=\left\langle e_i,Te_j \right\rangle.$$ In bra-ket notation (which is used in quantum mechanics), we would usually write the ith basis vector as ##\left|i\right\rangle##. This turns the formula into
$$T_{ij} =\left\langle i\right|T\left|j\right\rangle.$$
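
As a quick numerical check of the formula ##(Te_j)_i=\langle f_i,Te_j\rangle##, here is a minimal Python sketch applied to the map S from the example above. It assumes the real case with the standard (orthonormal) bases, so the inner product is the ordinary dot product and no complex conjugation is needed. The helper names `dot`, `standard_basis`, `matrix_of` and `apply_matrix` are made up for this illustration.

```python
def S(v):
    """The map S(x, y, z) = (3z - x, 2y) from the example above."""
    x, y, z = v
    return (3 * z - x, 2 * y)


def dot(u, v):
    """Standard inner product on R^n (real scalars only)."""
    return sum(a * b for a, b in zip(u, v))


def standard_basis(n):
    """The standard ordered basis of R^n as a list of tuples."""
    return [tuple(1 if k == j else 0 for k in range(n)) for j in range(n)]


def matrix_of(T, domain_basis, codomain_basis):
    """Entry (i, j) is <f_i, T e_j>.

    This shortcut is only valid when the codomain basis is orthonormal.
    """
    return [[dot(f, T(e)) for e in domain_basis] for f in codomain_basis]


def apply_matrix(M, x):
    """Matrix-vector product: component i is the dot product of row i with x."""
    return tuple(dot(row, x) for row in M)


C = standard_basis(3)
D = standard_basis(2)
S_matrix = matrix_of(S, C, D)

print(S_matrix)                           # [[-1, 0, 3], [0, 2, 0]]
print(apply_matrix(S_matrix, (1, 2, 3)))  # (8, 4)
print(S((1, 2, 3)))                       # (8, 4)
```

The last two lines agree, which is the matrix equation ##[S]x=y## for one particular x. With a codomain basis that is not orthonormal, the inner-product shortcut no longer applies and the components have to be found by solving a linear system instead.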

Fredrik
It's hard to tell if you're quiet because that was exactly what you needed, or because you didn't understand a word of it. Some feedback would be appreciated.

Sorry about not replying, it's quite a bit to chew on.
It actually makes quite a bit of sense. I just found the heavy use of equations daunting because I'm still grappling with the abstract theorems. The bit near the end, about quantum mechanics, went over my head a bit; it's something we haven't covered yet.

I think I'm understanding it. Is it alright to assume, by analogy, that the matrix representation of T is just a way to say: these are directions to either B using C systems?

Fredrik
Thanks for the reply. The statement about bra-ket notation (used in quantum mechanics) is there because this is a draft of a FAQ post, and half the people who ask about these things are physics students who are studying the concept of "spin" in quantum mechanics. I should probably add a comment about that. You can safely ignore those last few lines.

Feel free to ask if there's something in that post that you don't understand.

I don't understand the last sentence in your post, so I don't know what you're asking.

One of the main points of this is that given a pair of ordered bases, we can "translate" equations involving vectors, linear operators and composition of functions, into equations involving matrices and matrix multiplication. The original equation holds if and only if the translated equation holds.
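
To make the "translation" concrete for composition, here is a minimal Python sketch. It reuses the map ##S(x,y,z)=(3z-x,2y)## from the earlier post and introduces a second map ##R(u,v)=(u+v,v)##, which is a hypothetical choice made only for this illustration. Everything is taken with respect to the standard bases, and the helper names `matrix_of` and `matmul` are assumptions of the sketch. It checks that the matrix of the composition ##R\circ S## equals the product of the matrices of R and S.

```python
def S(v):
    """S(x, y, z) = (3z - x, 2y), the map from the earlier post."""
    x, y, z = v
    return (3 * z - x, 2 * y)


def R(w):
    """Hypothetical second map R(u, v) = (u + v, v), for illustration only."""
    u, v = w
    return (u + v, v)


def standard_basis(n):
    """The standard ordered basis of R^n as a list of tuples."""
    return [tuple(1 if k == j else 0 for k in range(n)) for j in range(n)]


def matrix_of(T, n):
    """Matrix of T with respect to the standard bases: column j is T(e_j)."""
    columns = [T(e) for e in standard_basis(n)]
    return [list(row) for row in zip(*columns)]  # turn columns into rows


def matmul(P, Q):
    """Matrix product, (PQ)_ij = sum over k of P_ik * Q_kj."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]


def R_after_S(v):
    """The composition R o S, a map from R^3 to R^2."""
    return R(S(v))


lhs = matrix_of(R_after_S, 3)                    # matrix of the composition
rhs = matmul(matrix_of(R, 2), matrix_of(S, 3))   # product of the two matrices

print(lhs)         # [[-1, 2, 3], [0, 2, 0]]
print(lhs == rhs)  # True
```

One example does not prove the general statement, but it shows what the equation between functions looks like once it is rewritten as an equation between matrices.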

HallsofIvy
Science Advisor
Homework Helper
The most important observation is this:
$$\begin{bmatrix}a & b & c \\ d & e & f \\ g & h & i\end{bmatrix}\begin{bmatrix}1 \\ 0 \\ 0 \end{bmatrix}= \begin{bmatrix}a \\ d \\ g\end{bmatrix}$$
$$\begin{bmatrix}a & b & c \\ d & e & f \\ g & h & i\end{bmatrix}\begin{bmatrix}0 \\ 1 \\ 0 \end{bmatrix}= \begin{bmatrix}b \\ e \\ h\end{bmatrix}$$
and
$$\begin{bmatrix}a & b & c \\ d & e & f \\ g & h & i\end{bmatrix}\begin{bmatrix}0 \\ 0 \\ 1 \end{bmatrix}= \begin{bmatrix}c \\ f \\ i\end{bmatrix}$$

That is, when you multiply a matrix by each of the standard basis vectors you get the columns of the matrix.
When you are given two bases, one for the "domain" space and the other for the "range" space, apply the linear transformation to each basis vector of the domain in turn (in coordinates, those are (1, 0, 0), (0, 1, 0), and (0, 0, 1)), and write each result as a linear combination of the basis vectors for the range space (the coefficients form the column vectors on the right above). Those coefficients are the columns of the matrix representation.
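
Applied to the transformation in the original question, the recipe can be sketched in Python as follows. This is only an illustration under stated assumptions: a polynomial ##A_0+A_1x+A_2x^2## is stored as the tuple of its coefficients, the domain basis is taken to be ##(1,x,x^2)##, and the second range basis ##(1,1+x)## is a hypothetical choice added to show that the matrix changes when the basis does. The helper names `coords_standard`, `coords_alt` and `matrix_of` are made up for the sketch.

```python
def T(p):
    """T(A0 + A1*x + A2*x^2) = (A0 + A2) + A0*x, acting on coefficient tuples."""
    a0, a1, a2 = p
    return (a0 + a2, a0)


# Coefficient tuples of the domain basis vectors 1, x, x^2.
B = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]


def coords_standard(q):
    """Components of c0 + c1*x with respect to the range basis (1, x)."""
    return q


def coords_alt(q):
    """Components with respect to the hypothetical range basis (1, 1 + x).

    Uses the identity c0 + c1*x = (c0 - c1)*1 + c1*(1 + x).
    """
    c0, c1 = q
    return (c0 - c1, c1)


def matrix_of(T, domain_basis, coords):
    """Column j holds the components of T(e_j) in the chosen range basis."""
    columns = [coords(T(e)) for e in domain_basis]
    return [list(row) for row in zip(*columns)]  # turn columns into rows


print(matrix_of(T, B, coords_standard))  # [[1, 0, 1], [1, 0, 0]]
print(matrix_of(T, B, coords_alt))       # [[0, 0, 1], [1, 0, 0]]
```

The first result is the matrix from the original question. The second describes the same transformation T, only with its outputs expressed in a different basis for the range space, which is why a matrix representation is always stated with respect to a pair of bases.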