Linear Transformation Using Two Bases

Discussion Overview

The discussion revolves around the concept of linear transformations using two bases in linear algebra. Participants explore the implications of representing linear transformations with respect to different bases, including the relationship between the transformation and its matrix representation. The conversation includes theoretical aspects, examples, and connections to quantum mechanics.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • One participant expresses confusion about the purpose of using two bases in linear transformations and how it relates to finding the transformation.
  • Another participant provides a detailed explanation of linear transformations, bases, and matrix representations, including the notation used for components and matrix elements.
  • There is mention of the relationship between linear transformations and matrix multiplication, emphasizing that the original equations hold if and only if the translated equations hold.
  • A later reply indicates that the heavy use of equations was daunting for some, and seeks clarification on the analogy of matrix representation as directions between two systems.
  • Participants discuss the relevance of bra-ket notation from quantum mechanics, with one noting it may not be necessary for all readers.
  • Another participant highlights the importance of understanding how to translate equations involving vectors and linear operators into matrix equations.

Areas of Agreement / Disagreement

Participants generally agree on the importance of understanding linear transformations and their matrix representations, but there is no consensus on the clarity of the explanations provided, as some find the material challenging and others seek further clarification.

Contextual Notes

Some participants express difficulty with the abstract nature of the concepts and the mathematical notation used, indicating a potential gap in foundational understanding. The discussion includes references to quantum mechanics, which may not be familiar to all participants.

Who May Find This Useful

This discussion may be useful for students and individuals interested in linear algebra, particularly those grappling with the concepts of linear transformations, matrix representations, and their applications in various fields, including physics.

Offlinedoctor
Hi,

I'm having trouble understanding the purpose of using two bases in a linear transformation. My lecturer explained that it's a way to find a linear transformation between spaces of different dimensions, but I'm having trouble seeing how that relates to the method for finding this transformation.

An example being,

$$T(a_0+a_1x+a_2x^2)=(a_0+a_2)+a_0x$$

Using the standard bases
B = (1, x, x²)
C = (1, x),

$$[T]_{C,B} = \begin{pmatrix}1 & 0 & 1\\ 1 & 0 & 0\end{pmatrix}$$
What does this indicate?
 
This looks like a good opportunity to test if a post I wrote a few months ago is understandable. Let me know if something isn't clear. I've been meaning to turn this into a FAQ post, but I haven't gotten around to adding the additional material I think should be included.

I moved your post to linear & abstract algebra; questions about concepts, definitions, and theorems don't belong in the homework forums, which are mainly for questions about textbook-style problems.
Let X and Y be finite-dimensional vector spaces. Let ##T:X\to Y## be a linear transformation. Let ##A=(e_1,\dots,e_n)## and ##B=(f_1,\dots,f_m)## be ordered bases for X and Y respectively. For each ##y\in Y##, there's a unique m-tuple of scalars ##y_1,\dots,y_m## such that ##y=\sum_{i=1}^m y_i f_i##. These scalars are called the components of y with respect to B.

Let ##i\in\{1,\dots,m\}## and ##j\in\{1,\dots,n\}## be arbitrary. ##Te_j## is clearly a member of Y. Since we use the notation ##y_i## for the ith component of an arbitrary ##y\in Y## with respect to B, it's natural to use the notation ##(Te_j)_i## for the ith component of ##Te_j## with respect to B. The mn (m times n) scalars ##(Te_j)_i## with ##i\in\{1,\dots,m\}## and ##j\in\{1,\dots,n\}## are called the components, or matrix elements, of T with respect to (A,B). The m×n matrix
$$\begin{pmatrix}(Te_1)_1 & \cdots & (Te_n)_1\\ \vdots & \ddots & \vdots\\ (Te_1)_m & \dots & (Te_n)_m\end{pmatrix}$$ is called the matrix representation of T, or just the matrix of T, with respect to (A,B). It is often denoted by the same symbol as the linear transformation, in this case T. In situations where you would prefer to use different notations for the linear transformation and its matrix representation, a notation like ##[T]## or ##[T]^{A,B}## can be used for the latter.

The standard notation for the scalar on row i, column j of a matrix ##T## is ##T_{ij}##. In this notation, we have ##T_{ij}=(Te_j)_i##. This is the formula you need to remember. An alternative notation for the scalar on row i, column j is ##T^i_j##. In this notation, we have ##T^i_j=(Te_j)^i##. If you commit this version of the formula to memory, there's no chance that you will forget the correct order of the indices.

The following observation provides some motivation for the definitions. Let ##x\in X## be arbitrary. Define ##y\in Y## by ##y=Tx##. Let ##x_1,\dots,x_n## be the components of x with respect to A, and let ##y_1,\dots,y_m## be the components of y with respect to B.
\begin{align}
y &=\sum_{i=1}^m y_i f_i\\
Tx &=T\left(\sum_{j=1}^n x_j e_j\right) =\sum_{j=1}^n x_jTe_j =\sum_{j=1}^n x_j \left(\sum_{i=1}^m(Te_j)_i f_i\right) =\sum_{j=1}^n \sum_{i=1}^m x_j (Te_j)_i f_i =\sum_{i=1}^m \left(\sum_{j=1}^n x_j (Te_j)_i\right) f_i
\end{align}
Since ##\{f_1,\dots,f_m\}## is linearly independent, these results and the equality y=Tx imply that
$$\sum_{j=1}^n x_j (Te_j)_i= y_i$$ for all ##i\in\{1,\dots,m\}##. If you recall that the definition of matrix multiplication is ##(AB)_{ij}=\sum_k A_{ik}B_{kj}##, you can easily recognize the above as the ith row of the matrix equation
$$\begin{pmatrix}(Te_1)_1 & \cdots & (Te_n)_1\\ \vdots & \ddots & \vdots\\ (Te_1)_m & \cdots & (Te_n)_m\end{pmatrix} \begin{pmatrix}x_1\\ \vdots\\ x_n\end{pmatrix} =\begin{pmatrix}y_1\\ \vdots\\ y_m\end{pmatrix}.$$ The following is a simple example of how to find a matrix representation of a linear transformation. Define ##S:\mathbb R^3\to\mathbb R^2## by ##S(x,y,z)=(3z-x,2y)##. This S is clearly linear. Let ##C=(e_1,e_2,e_3)## and ##D=(f_1,f_2)## be the standard ordered bases for ##\mathbb R^3## and ##\mathbb R^2## respectively. We will denote the matrix of S with respect to (C,D) by ##[S]##.
\begin{align}Se_1 &=S(1,0,0)=(-1,0) =-1f_1+0f_2\\
Se_2 &=S(0,1,0) =(0,2)=0f_1+2f_2\\
Se_3 &=S(0,0,1) =(3,0)=3f_1+0f_2\\
[S] &=\begin{pmatrix}(Se_1)_1 & (Se_2)_1 & (Se_3)_1\\ (Se_1)_2 & (Se_2)_2 & (Se_3)_2\end{pmatrix} =\begin{pmatrix}-1 & 0 & 3\\ 0 & 2 & 0 \end{pmatrix}.
\end{align} Note that for all ##x,y,z\in\mathbb R##,
$$[S]\begin{pmatrix}x\\ y\\ z\end{pmatrix} = \begin{pmatrix}-1 & 0 & 3\\ 0 & 2 & 0 \end{pmatrix} \begin{pmatrix}x\\ y\\ z\end{pmatrix} =\begin{pmatrix}-x+3z\\ 2y\end{pmatrix}.$$ If Y is an inner product space and B is orthonormal, the easiest way to find the matrix elements is often to use the inner product. We will use the physicist's convention for inner products. This means that the term "inner product" is defined so that when we're dealing with a vector space over ℂ (i.e. when the set of scalars is ℂ rather than ℝ), the map ##v\mapsto\langle u,v\rangle## is linear and the map ##u\mapsto\langle u,v\rangle## is antilinear (i.e. conjugate linear). Let ##i\in\{1,\dots,m\}## and ##y\in Y## be arbitrary. Let ##y_1,\dots,y_m## be the components of y with respect to B.
$$\left\langle f_i,y\right\rangle =\left\langle f_i,\sum_{j=1}^m y_j f_j \right\rangle =\sum_{j=1}^m y_j \left\langle f_i,f_j\right\rangle =\sum_{j=1}^m y_j \delta_{ij} =y_i.$$ Since i and y are arbitrary, this implies that for all ##i\in\{1,\dots,m\}## and all ##j\in\{1,\dots n\}##,
$$(Te_j)_i =\left\langle f_i,Te_j \right\rangle.$$ If X=Y, it's convenient to choose B=A, and to speak of the matrix representation of T with respect to A instead of with respect to (A,A), or (A,B). The formula for ##T_{ij}## can now be written as
$$T_{ij}=\left\langle e_i,Te_j \right\rangle.$$ In bra-ket notation (which is used in quantum mechanics), we would usually write the ith basis vector as ##\left|i\right\rangle##. This turns the formula into
$$T_{ij} =\left\langle i\right|T\left|j\right\rangle.$$
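The ##S:\mathbb R^3\to\mathbb R^2## example above can also be checked numerically. The following is a minimal sketch (using numpy, which is my own choice of tool, not something from this thread) that builds the matrix of S column by column by applying S to the standard basis vectors, then verifies both the matrix-vector formula and the inner-product formula ##(Se_j)_i=\langle f_i,Se_j\rangle##:

```python
import numpy as np

# The linear map S : R^3 -> R^2 from the example, S(x, y, z) = (3z - x, 2y).
def S(v):
    x, y, z = v
    return np.array([3*z - x, 2*y])

# Columns of the matrix representation are S applied to the standard basis vectors.
E = np.eye(3)                                  # columns are e_1, e_2, e_3
M = np.column_stack([S(E[:, j]) for j in range(3)])
# M is [[-1, 0, 3], [0, 2, 0]], matching the matrix derived in the post.

# Check: M @ v reproduces S(v) for an arbitrary vector.
v = np.array([1.0, 2.0, 3.0])
assert np.allclose(M @ v, S(v))

# With an orthonormal basis, M[i, j] = <f_i, S e_j> (real inner product here).
F = np.eye(2)                                  # columns are f_1, f_2
for i in range(2):
    for j in range(3):
        assert np.isclose(M[i, j], F[:, i] @ S(E[:, j]))
```

The `column_stack` call is the whole recipe in one line: each column of the matrix is the image of a basis vector, written in components.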
 
It's hard to tell if you're quiet because that was exactly what you needed, or because you didn't understand a word of it. Some feedback would be appreciated.
 
Sorry about not replying; it's quite a bit to chew on.
It actually makes quite a bit of sense. I just found the heavy use of equations daunting because I'm still grappling with the abstract theorems. The bit near the end, about quantum mechanics, went over my head a bit; it's something we haven't covered yet.

I'm understanding it. Is it alright to assume, by analogy, that the matrix representation T is just a way to say "these are directions to either B using C systems"?
 
Thanks for the reply. The statement about bra-ket notation (used in quantum mechanics) is there because this is a draft of a FAQ post, and half the people who ask about these things are physics students who are studying the concept of "spin" in quantum mechanics. I should probably add a comment about that. You can safely ignore those last few lines.

Feel free to ask if there's something in that post that you don't understand.

I don't understand the last sentence in your post, so I don't know what you're asking. :smile:

One of the main points of this is that given a pair of ordered bases, we can "translate" equations involving vectors, linear operators and composition of functions, into equations involving matrices and matrix multiplication. The original equation holds if and only if the translated equation holds.
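That "translation" claim can be illustrated concretely. Here is a small sketch (Python with numpy; the map T below is a made-up example, not one from this thread) showing that composing linear maps corresponds to multiplying their matrix representations:

```python
import numpy as np

# Two linear maps: a hypothetical T : R^2 -> R^3 and the S from the earlier example.
def T(v):
    x, y = v
    return np.array([x, y, x + y])

def S(v):
    x, y, z = v
    return np.array([3*z - x, 2*y])

def matrix_of(f, n):
    """Matrix of a linear map f on R^n: apply f to each standard basis vector."""
    return np.column_stack([f(np.eye(n)[:, j]) for j in range(n)])

MT = matrix_of(T, 2)                    # 3x2 matrix of T
MS = matrix_of(S, 3)                    # 2x3 matrix of S
MST = matrix_of(lambda v: S(T(v)), 2)   # 2x2 matrix of the composition S∘T

# The translation respects composition: [S∘T] = [S][T].
assert np.allclose(MST, MS @ MT)
```

So an equation like ##S\circ T=U## holds if and only if the matrix equation ##[S][T]=[U]## holds, which is exactly the "iff" in the paragraph above.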
 
The most important observation is this:
$$\begin{bmatrix}a & b & c \\ d & e & f \\ g & h & i\end{bmatrix}\begin{bmatrix}1 \\ 0 \\ 0 \end{bmatrix}= \begin{bmatrix}a \\ d \\ g\end{bmatrix},\qquad \begin{bmatrix}a & b & c \\ d & e & f \\ g & h & i\end{bmatrix}\begin{bmatrix}0 \\ 1 \\ 0 \end{bmatrix}= \begin{bmatrix}b \\ e \\ h\end{bmatrix},$$
and
$$\begin{bmatrix}a & b & c \\ d & e & f \\ g & h & i\end{bmatrix}\begin{bmatrix}0 \\ 0 \\ 1 \end{bmatrix}= \begin{bmatrix}c \\ f \\ i\end{bmatrix}.$$

That is, when you multiply a matrix by each of the standard basis vectors you get the columns of the matrix.
When you are given two bases, one for the "domain" space and the other for the "range" space, apply the linear transformation to each domain basis vector in turn (in the example above, those are (1, 0, 0), (0, 1, 0), and (0, 0, 1)), and write the result as a linear combination of the basis vectors for the range space (those are the vectors on the right-hand sides above). The coefficients in each linear combination form the corresponding column of the matrix representation.
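Applied to the polynomial example from the first post, that recipe looks like this (a sketch using numpy, with a polynomial ##a_0+a_1x+a_2x^2## represented by its coefficient vector with respect to the bases (1, x, x²) and (1, x)):

```python
import numpy as np

# The map from the first post, written on coefficient tuples:
# T(a0 + a1 x + a2 x^2) = (a0 + a2) + a0 x.
def T(coeffs):
    a0, a1, a2 = coeffs
    return np.array([a0 + a2, a0])   # coefficients of the result in the basis (1, x)

# Apply T to each basis polynomial 1, x, x^2 (coefficient vectors e_1, e_2, e_3);
# the resulting coefficient vectors are the columns of [T]_{C,B}.
M = np.column_stack([T(np.eye(3)[:, j]) for j in range(3)])
# M is [[1, 0, 1], [1, 0, 0]]: column 1 is T(1) = 1 + x, column 2 is T(x) = 0,
# column 3 is T(x^2) = 1, each written in the basis (1, x).
```

This reproduces exactly the matrix in the original question: the columns record where each basis polynomial is sent.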
 
