# Matrix Elements as images of basis vectors

1. Oct 5, 2015

I'm trying to understand the maths of QM from Shankar's book, Principles of Quantum Mechanics. On page 21 of that book, there is a general derivation that if we have a relation:

|v'> = Ω|v>

where Ω is an operator acting on |v> and transforming it into |v'>, then the matrix entries of the operator can be expressed as:

Ωij = <i|Ω|j>, where |i> and |j> are basis vectors. What the book says is essentially that the jth column of the matrix Ω can be viewed as the image of the jth basis vector under Ω, expressed in the same basis. More explicitly, it means:

Ω =
<1|Ω|1> <1|Ω|2> ... <1|Ω|n>
<2|Ω|1> <2|Ω|2> ... <2|Ω|n>
<3|Ω|1> <3|Ω|2> ... <3|Ω|n>
...
<n|Ω|1> <n|Ω|2> ... <n|Ω|n>

where |1>, |2>, ... form the basis set. I tried to verify this, but it turned out not to be correct (or so I found). For example, consider the 2x2 matrix:

Ω =
2 3
4 5

If I choose the basis vectors |1> = [1 0] and |2> = [0 1] (please read them as column vectors), then I can verify that Ωij = <i|Ω|j>. However, if I choose a non-standard but orthonormal basis such as |1> = [1 0] and |2> = [0 -1] (again, please read them as column vectors), then using Ωij = <i|Ω|j>, I get:

Ω =
2 -3
-4 5

Why is this happening? Why am I not getting the same Ω back? What am I missing here? The proof in the book seems totally logical. Thanks in advance.

2. Oct 5, 2015

### Fredrik

Staff Emeritus
Let $T:X\to Y$ be a linear transformation. There isn't just one matrix associated with T. There's one for each pair (A,B) such that A is an ordered basis for X, and B is an ordered basis for Y. The one associated with a specific pair (A,B) is often denoted by $[T]_{B,A}$. Its components are given by the formula $([T]_{B,A})_{ij}= (Ta_j)_i$, where $a_j$ denotes the $j$th element of the ordered basis $A$, and $(Ta_j)_i$ denotes the $i$th component of $Ta_j$ with respect to the ordered basis $B$. If there's an inner product on $Y$, and $B$ is orthonormal, then we have $(Ta_j)_i=\langle b_i,Ta_j\rangle$. If B isn't orthonormal, then this last formula doesn't hold.

Suppose that $T:X\to X$ is linear, and that A and B are two different orthonormal ordered bases for X. What you have found is that $[T]_{B,B}$ may be different from $[T]_{A,A}$. They are however similar; there's always a matrix S such that $[T]_{B,B}=S[T]_{A,A}S^{-1}$.
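
A small numerical check of this, as a sketch in plain Python using the 2x2 example from post #1. The helper names (`matmul`, `transpose`, `inner`, `apply`, `bracket_matrix`) are made up for this illustration, and the choice of S as the matrix whose rows are the B vectors is an assumption that relies on B being orthonormal:

```python
def matmul(P, Q):
    """Matrix product (PQ)_ij = sum_k P_ik Q_kj, with matrices as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    """Swap rows and columns."""
    return [list(col) for col in zip(*P)]

def inner(u, v):
    """Real inner product <u, v>."""
    return sum(x * y for x, y in zip(u, v))

def apply(T, v):
    """Matrix T acting on the column vector v."""
    return [inner(row, v) for row in T]

def bracket_matrix(T, basis):
    """Entries <b_i, T b_j>, the components of [T] in an orthonormal basis."""
    return [[inner(b_i, apply(T, b_j)) for b_j in basis] for b_i in basis]

T = [[2, 3], [4, 5]]        # the operator, written in the standard basis
A = [[1, 0], [0, 1]]        # standard basis
B = [[1, 0], [0, -1]]       # the other orthonormal basis from post #1

T_A = bracket_matrix(T, A)
T_B = bracket_matrix(T, B)

# Assumption: S is taken to be the matrix whose rows are the B vectors.
# Since B is orthonormal, S^{-1} is then just the transpose of S.
S = B
S_inv = transpose(S)
similar = matmul(matmul(S, T_A), S_inv)

print(T_A)                  # [[2, 3], [4, 5]]
print(T_B)                  # [[2, -3], [-4, 5]]
print(similar == T_B)       # True
```

So the two matrices differ entry by entry, but they are related by a similarity transformation and describe the same operator.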

3. Oct 5, 2015

Firstly, thanks for a great reply. It helps me understand the original text better.

So for my case, the second situation applies, where T:X→X and I have two different orthonormal basis sets: A being the column vectors [1 0], [0 1] and B being the column vectors [1 0] and [0 -1]. Now if I use the general idea that $(Ta_j)_i = \langle a_i,Ta_j\rangle$, then this seems to work out for the basis set A. However, the same does not work out for the basis set B, when expressing $(Tb_j)_i = \langle b_i,Tb_j\rangle$. If my understanding is correct, what you are saying is that there is an equivalent of T, let's call it T', which is also defined as T':X→X. This T' will reproduce itself when we express all basis vectors in the basis set B, it will have the same end effect on the vectors as T, and it will be related to the original T by $T' = STS^{-1}$.

That is very exciting for me, because my earlier expression (with T≡Ω), Ωij = <i|Ω|j>, seemed to hold only for the standard kind (the A kind) of basis set. Please correct me if I am wrong. It would also be great if you could tell me what this topic is called, and whether there is a source (book, pages, PDF) where I can learn to rigorously understand the meaning of S and how to construct it.

Thanks again.

4. Oct 5, 2015

For the relation T|V> = |V'>, the matrix equation v'i = Σ Tij vj holds perfectly well (that's how a matrix is multiplied with a vector). Then, just from looking at how vectors behave under transformations, it follows that Tij = <i|T|j>. What, then, is the inherent assumption I am making that forces |i> and |j> to be the standard basis vectors? I still cannot get my head around it.

Here is the derivation:
T|V> = |V'>

This implies:
v'i = <i|V'> = <i|T|V> = <i|T(∑ vj|j>) = ∑ vj <i|T|j>

And simply by looking at how matrix multiplication is done, we have:
v'i=ΣTijvj

Therefore:
Tij = <i|T|j>.

But I cannot reproduce T using non-standard |i> and |j>. Strange! I still don't see where the assumptions are, or what I am doing wrong, because this derivation is independent of the choice of basis set.

5. Oct 5, 2015

### Fredrik

Staff Emeritus
We're dealing with only one linear operator $T:X\to X$, but two different matrices associated with it, $[T]_{A,A}$ and $[T]_{B,B}$. One matrix for each ordered basis. I will simplify these notations to $[T]_A$ and $[T]_B$ here. (Yes, another option is to denote them by $T$ and $T'$). I will also use the notations $A=(a_1,\dots,a_n)$ and $B=(b_1,\dots,b_n)$. (Let's not worry about infinite-dimensional vector spaces here). And finally, I will use the following notation for components of vectors: $x=\sum_{i=1}^n x_i a_i=\sum_{i=1}^n x_i' b_i$. Actually, I will drop the summation sigmas from the notation as well, since the sum is always over the index that appears twice. So I will write $x=x_ia_i=x_i'b_i$.

Let $M$ be the unique linear operator such that $b_i=Ma_i$ for all $i$. We have
$$b_i=Ma_i =(Ma_i)_j a_j = ([M]_A)_{ji} a_j.$$ Let's simplify the notation $[M]_A$ to just $M$, so we can write $b_i=M_{ji}a_j$. The relationship between $[T]_B$ and $[T]_A$ is
$$[T]_B=M^{-1}[T]_A M.$$
This isn't too hard to prove. You need to know that in my notation, the definition of matrix multiplication is simply $(PQ)_{ij}=P_{ik}Q_{kj}$. We have
\begin{align}
&([T]_B)_{ij}b_i =([T]_B)_{ij} M_{ki}a_k =(M[T]_B)_{kj} a_k,\\
&([T]_B)_{ij}b_i =(Tb_j)_i' b_i =Tb_j =T(M_{ij}a_i)=M_{ij}Ta_i =M_{ij}(Ta_i)_k a_k =M_{ij}([T]_A)_{ki} a_k =([T]_AM)_{kj} a_k.
\end{align}
Since the left-hand sides are equal, the right-hand sides are too. Since $\{a_1,\dots,a_n\}$ is a linearly independent set, this implies that $(M[T]_B)_{kj}=([T]_AM)_{kj}$ for all k,j. This implies that $M[T]_B=[T]_AM$. Now you just multiply both sides by $M^{-1}$ from the left.
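
To see the formula at work on numbers, here is a minimal sketch in plain Python with exact fractions. The basis vectors in `B_vectors` are a hypothetical, deliberately non-orthonormal choice, A is assumed to be the standard basis, and the helper names are made up for this example. The check does not use inner products at all; it only tests the defining property $Tb_j=([T]_B)_{ij}b_i$:

```python
from fractions import Fraction

def matmul(P, Q):
    """Matrix product (PQ)_ij = sum_k P_ik Q_kj, with matrices as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def inverse_2x2(P):
    """Exact inverse of a 2x2 matrix."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

T_A = [[2, 3], [4, 5]]            # [T]_A, taking A to be the standard basis
B_vectors = [[1, 1], [0, 2]]      # hypothetical b_1, b_2 in A-components (not orthonormal)

# b_i = M_{ji} a_j, so the b_i are the columns of M.
M = [[B_vectors[i][j] for i in range(2)] for j in range(2)]

T_B = matmul(matmul(inverse_2x2(M), T_A), M)

# Defining property of [T]_B: T b_j = sum_i ([T]_B)_{ij} b_i, compared in A-components.
checks = []
for j in range(2):
    T_bj = [sum(T_A[k][l] * B_vectors[j][l] for l in range(2)) for k in range(2)]
    expansion = [sum(T_B[i][j] * B_vectors[i][k] for i in range(2)) for k in range(2)]
    checks.append(T_bj == expansion)

print(checks)                     # [True, True]
print(T_B == [[5, 6], [2, 2]])    # True
```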

You need to distinguish between $[\Omega]_A$ and $[\Omega]_B$. If we denote the $i$th element of A by $|i\rangle$ and the $i$th element of B by $|i'\rangle$, we have $([\Omega]_A)_{ij}=\langle i|\Omega|j\rangle$ and $([\Omega]_B)_{ij}=\langle i'|\Omega|j'\rangle$. If you denote both $[\Omega]_A$ and $[\Omega]_B$ by $\Omega$, things will certainly get confusing.

We're talking about the relationship between linear transformations and matrices. It's explained in books on linear algebra. In all of them, I think. But some books (Treil, Axler) introduce this topic as early as possible, and some books (Anton) delay it for as long as possible. The former kind of book is much better for someone who's studying quantum mechanics. They also work with complex vector spaces from the start.

It would actually make sense to introduce this stuff before matrix multiplication, because one of the results obtained from these definitions and methods can be viewed as the reason why matrix multiplication is defined the way it is: Suppose that $T:X\to Y$ and $S:Y\to Z$ are linear, and that A,B,C are ordered bases for X,Y,Z respectively. We have
$$([S\circ T]_{C,A})_{ij} =([ S]_{C,B})_{ik} ([T]_{B,A})_{kj}.$$ It would be a good exercise for you to prove this. (Before you try that, you should work through the other proof above, until you can do it without looking at my calculations). The definition of matrix multiplication allows us to rewrite the above as
$$[S\circ T]_{C,A}=[ S]_{C,B}[T]_{B,A}.$$
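
The composition rule can also be spot-checked numerically. The following is only a sketch under simplifying assumptions: all three spaces are taken to be two-dimensional, the maps `S` and `T` and the bases `A`, `B`, `C` are hypothetical choices given in standard coordinates, and `matrix_of` is a made-up helper that builds $[L]_{\text{codomain},\text{domain}}$ column by column from its definition:

```python
from fractions import Fraction

def matmul(P, Q):
    """Matrix product (PQ)_ij = sum_k P_ik Q_kj, with matrices as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def inverse_2x2(P):
    """Exact inverse of a 2x2 matrix."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def columns(vectors):
    """Matrix whose columns are the given vectors."""
    return [list(col) for col in zip(*vectors)]

def matrix_of(L, domain_basis, codomain_basis):
    """[L]_{codomain,domain}: column j holds the components of L(domain_basis[j])
    with respect to codomain_basis. L is a matrix in standard coordinates."""
    return matmul(inverse_2x2(columns(codomain_basis)),
                  matmul(L, columns(domain_basis)))

T = [[2, 3], [4, 5]]       # hypothetical T: X -> Y
S = [[1, -1], [0, 2]]      # hypothetical S: Y -> Z
A = [[1, 1], [0, 1]]       # ordered basis for X
B = [[2, 0], [1, 1]]       # ordered basis for Y
C = [[1, 2], [3, 1]]       # ordered basis for Z

lhs = matrix_of(matmul(S, T), A, C)                     # [S o T]_{C,A}
rhs = matmul(matrix_of(S, B, C), matrix_of(T, A, B))    # [S]_{C,B} [T]_{B,A}

print(lhs == rhs)          # True
```

A numerical check like this is no substitute for the index proof suggested above, but it is a handy way to catch a wrong index order.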

Last edited: Oct 6, 2015
6. Oct 5, 2015

### Fredrik

Staff Emeritus
This has to be justified by the methods discussed in my previous posts. Since
\begin{align}
&Tv=T(v_j a_j)=v_j Ta_j = v_j (Ta_j)_i a_i =v_j ([T]_A)_{ij} a_i,\\
&Tv=(Tv)_i a_i,
\end{align} we have $(Tv)_i =([T]_A)_{ij} v_j$. I assume that the left-hand side is what you denote by $v_i'$. So we have $v_i' =([T]_A)_{ij} v_j$. This means that your $T_{ij}$ is the $ij$ component of $[T]_A$.
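
The same statement holds for any basis, as long as the matrix and both sets of components refer to that basis. Here is a small sketch in plain Python using the orthonormal basis B from post #1. The test vector `v` and the helper names are hypothetical choices for this illustration:

```python
def inner(u, v):
    """Real inner product <u, v>."""
    return sum(x * y for x, y in zip(u, v))

def apply(T, v):
    """Matrix T acting on the column vector v."""
    return [inner(row, v) for row in T]

T = [[2, 3], [4, 5]]          # the operator in the standard basis
B = [[1, 0], [0, -1]]         # orthonormal basis from post #1
v = [1, 2]                    # hypothetical test vector, standard components

# Matrix of T and components of v and Tv, all with respect to B.
T_B = [[inner(b_i, apply(T, b_j)) for b_j in B] for b_i in B]
v_B = [inner(b_i, v) for b_i in B]
Tv_B = [inner(b_i, apply(T, v)) for b_i in B]

print(apply(T_B, v_B) == Tv_B)    # True: [T]_B acts correctly on B-components
print(apply(T, v_B) == Tv_B)      # False: the standard-basis matrix does not
```

The second line fails because it mixes the standard-basis matrix with B-components, which is the mix-up behind the apparent contradiction in post #1.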

Last edited: Oct 6, 2015
7. Oct 7, 2015

Thanks for your reply. Basically, the M you introduced above is a kind of change-of-basis matrix such that $Ma_i = b_i$. I guess the step I am failing to understand is the one where you take $(Ma_i)_j = ([M]_A)_{ji}$. I tried to derive it and ran into a major source of confusion, because the M that you have is the inverse of the M that I have used in my proof. Please look at it below.

Let's say there is a vector v, expressed in the standard basis set E $(e_1,\dots,e_k,\dots,e_n)$, where $e_k$ is the column with 1 in the kth position and 0 in all other positions. I now take a new basis set A $(a_1,\dots,a_k,\dots,a_n)$ and express this v in the new basis set as $v_A$. I have an n by n matrix called the change-of-basis matrix, which I will call simply $M_{AE}$, and the columns of this matrix are nothing but the vectors $a_1,\dots,a_k,\dots,a_n$, such that:

$M_{AE} = [a_1,\dots,a_k,\dots,a_n]$

I can now see that the relation $M_{AE}v_A=v$ holds true. One very important reason (or so I think) that this works so simply is that all these new basis vectors, the $a_k$, are being expressed as columns whose entries are coefficients of the original $e_k$. Somehow to me this stands out as a very important fact and makes the standard set special. It means that the entries of the columns of $M_{AE}$ are the components of the basis vectors of set A, expressed in the standard set E and not in the new set A. So the standard set still hasn't left us completely.

I wanted to get rid of the standard set and arrive at a completely general idea, from any basis to any basis. So I first note that the same can be said about a new basis set B with its own basis vectors $(b_1,\dots,b_k,\dots,b_n)$, its change-of-basis matrix $M_{BE}$, and the corresponding expression of the same old vector as $v_B$ in this new basis. Precisely: $M_{BE}v_B=v$.

Let us say we now want the change-of-basis matrix from A to B such that $M_{BA}v_B=v_A$. What should be the constituent elements of $M_{BA}$?

=> $M_{BA}v_B=v_A$
=> $M_{BA}M_{BE}^{-1}v=M_{AE}^{-1}v$
(skipping some steps: both $M_{AE}$ and $M_{BE}$ are invertible because their columns are linearly independent, so we get)
=> $M_{BA} = M_{AE}^{-1}M_{BE}$.

This gives us a way to obtain the change-of-basis matrix from A to B, $M_{BA}$, in terms of the change-of-basis matrices from the standard basis, $M_{BE}$ and $M_{AE}$. However, does the old interpretation still hold, so that the columns of $M_{BA}$ are the basis vectors of B expressed in terms of the basis vectors of A? I have been able to verify that by taking examples, but I guess what you have done is show it more rigorously using indices. Anyway, once we agree on what $M_{BA}$ is, I find it easier to apply these ideas to transformations T:V→W such that $T_{AA}v_A=w_A$, and determine a new $T_{BB}$ from a given $T_{AA}$. Using notation similar to what I have described here:

=> $T_{BB}v_B=w_B$
=> $T_{BB}M_{BA}^{-1}v_A=M_{BA}^{-1}w_A$
=> $T_{BB}M_{BA}^{-1}v_A=M_{BA}^{-1}T_{AA}v_A$
=> $T_{BB}=M_{BA}^{-1}T_{AA}M_{BA}$

I guess that is enough as a proof, but I have assumed several things, such as the very important fact that V and W have the same dimension. If that weren't true, the method could still remain the same, but the change-of-basis matrices in V and W would cease to make any sense across the transformation. One cannot have a change-of-basis matrix from 3 dimensions to 2 dimensions, because it isn't the same vector space anymore.
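
The relations in this post can be checked on a concrete example. The following is only a sketch in plain Python with exact fractions: the bases `A` and `B` and the matrix `T` are hypothetical choices given in standard components, and the names `M_AE`, `M_BE`, `M_BA`, `T_AA`, `T_BB_*` simply mirror the notation used above:

```python
from fractions import Fraction

def matmul(P, Q):
    """Matrix product (PQ)_ij = sum_k P_ik Q_kj, with matrices as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def inverse_2x2(P):
    """Exact inverse of a 2x2 matrix."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def columns(vectors):
    """Matrix whose columns are the given vectors."""
    return [list(col) for col in zip(*vectors)]

A = [[1, 1], [0, 1]]       # hypothetical a_1, a_2 in standard components
B = [[2, 1], [1, 1]]       # hypothetical b_1, b_2 in standard components
T = [[2, 3], [4, 5]]       # the operator in the standard basis E

M_AE = columns(A)
M_BE = columns(B)
M_BA = matmul(inverse_2x2(M_AE), M_BE)

# Column j of M_BA should hold the A-components of b_j: b_j = sum_i (M_BA)_ij a_i.
rebuilt_B = [[sum(M_BA[i][j] * A[i][k] for i in range(2)) for k in range(2)]
             for j in range(2)]

T_AA = matmul(matmul(inverse_2x2(M_AE), T), M_AE)
T_BB_direct = matmul(matmul(inverse_2x2(M_BE), T), M_BE)
T_BB_via_A = matmul(matmul(inverse_2x2(M_BA), T_AA), M_BA)

print(rebuilt_B == B)                  # True
print(T_BB_via_A == T_BB_direct)       # True
```

In this example the columns of `M_BA` do reproduce the B vectors as combinations of the A vectors, and going from `T_AA` to `T_BB` through `M_BA` agrees with going there directly from the standard basis.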

So the last remaining doubt that I have is why the M that you describe and the $M_{BA}$ that I describe are different. They seem to be inverses of each other at the start, even though our final conclusions are the same, namely $T_{BB}=M_{BA}^{-1}T_{AA}M_{BA}$.

8. Oct 7, 2015

There are a lot of mistakes in my previous post. However, I found a big mistake that I had been making in the proofs using indices, and that resolves my problem. As you had mentioned, I needed to distinguish between $\Omega_{A}$ and $\Omega_{B}$. Anyway, ditching my notation and taking up yours from the very first post, let's say we have a map $T: X \rightarrow Y$ such that $T\vec{x}=\vec{y}$, where $A = (a_{1},\dots,a_{m})$ is an ordered basis set for $X$ and $B = (b_{1},\dots,b_{n})$ is an ordered basis set for $Y$. The point that I have ascertained here is that the matrix of $T$ is itself always defined with respect to some pair of basis sets. Let's say $T\vec{x} = T_{[D,C]}\vec{x_{C}} = \vec{y} = \vec{y_{D}}$, where $C$ is an ordered basis set for $X$ and $D$ for $Y$. This gives us the original definition of $T$ (I'm doing all of this to get rid of the notion that the standard basis set is special, when it actually is not).

Now, one can show easily that $(T_{[B,A]})_{ij} = \langle b_{i}|T|a_{j} \rangle$, where $T_{[B,A]}$ and $T=T_{[D,C]}$ are related but different matrices. Here is the proof -
We know, $T_{[D,C]}\vec{x_{C}} = \vec{y_{D}}$ and $T_{[B,A]}\vec{x_{A}} = \vec{y_{B}}$. There are also the change of basis matrices as defined before where $M_{[A,C]}\vec{x_{A}} = \vec{x_{C}}$ and $M_{[B,D]}\vec{y_{B}} = \vec{y_{D}}$.

Starting with:
$T_{[B,A]}\vec{x_{A}} = \vec{y_{B}}$

Using change of basis:
$T_{[B,A]}M^{-1}_{[A,C]}\vec{x_{C}} = M^{-1}_{[B,D]}\vec{y_{D}}$

Using the original $T_{[D,C]}\vec{x_{C}} = \vec{y_{D}}$, we get
$T_{[B,A]}M^{-1}_{[A,C]}\vec{x_{C}} = M^{-1}_{[B,D]}T_{[D,C]}\vec{x_{C}}$

Now the cool thing is that, provided all the bases involved are orthonormal, both $M_{[A,C]}$ and $M_{[B,D]}$ are matrices with orthonormal column vectors, so they (and their inverses) are unitary matrices, and their inverse is their conjugate transpose (for the real matrices used here, that is just the transpose $M^{T}$). This quickly gives us:
$T_{[B,A]} = M^{-1}_{[B,D]}T_{[D,C]}M_{[A,C]} = M^{T}_{[B,D]}T_{[D,C]}M_{[A,C]}$

Therefore:
$T_{[B,A]} = M^{T}_{[B,D]}T_{[D,C]}M_{[A,C]}$

Now we can play around with $A, B, C, D$ all we want and get all the results mentioned above. For example, if both $C$ and $D$ are the standard basis sets called $E$, then $T=T_{[D,C]}$ because naturally $M_{[E,E]} = I$ which results in:
$T_{[B,A]} = M^{T}_{[B,E]}TM_{[A,E]}$

When this is viewed in bracket notation, it directly translates to:
$(T_{[B,A]})_{ij} = \langle b_{i}|T|a_{j} \rangle$

Wow, matrices are cool!! The important thing is not to confuse transformations $T$ with change-of-basis matrices $M$. While transformations can take vectors between vector spaces of different dimensions, the change-of-basis matrices only give them a new look, not a different space.
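
The last two formulas can be tried out on a map between spaces of different dimensions. This is only a sketch in plain Python with exact fractions. It assumes real orthonormal bases, and the matrix `T` (from three dimensions to two) and the bases `A` and `B` are hypothetical choices made for the example:

```python
from fractions import Fraction as F

def matmul(P, Q):
    """Matrix product (PQ)_ij = sum_k P_ik Q_kj, with matrices as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    """Swap rows and columns."""
    return [list(col) for col in zip(*P)]

def inner(u, v):
    """Real inner product <u, v>."""
    return sum(x * y for x, y in zip(u, v))

def apply(T, v):
    """Matrix T acting on the column vector v."""
    return [inner(row, v) for row in T]

T = [[1, 2, 3], [4, 5, 6]]                       # hypothetical map from 3 to 2 dimensions
A = [[0, 1, 0], [0, 0, -1], [1, 0, 0]]           # orthonormal basis of the domain
B = [[F(3, 5), F(4, 5)], [F(-4, 5), F(3, 5)]]    # orthonormal basis of the codomain

M_AE = transpose(A)      # columns are the a_j
M_BE = transpose(B)      # columns are the b_i

# T_[B,A] = M_[B,E]^T T M_[A,E], valid here because B is orthonormal.
T_BA = matmul(matmul(transpose(M_BE), T), M_AE)

# The same matrix built entry by entry as <b_i| T |a_j>.
brackets = [[inner(b_i, apply(T, a_j)) for a_j in A] for b_i in B]

print(T_BA == brackets)      # True
```

Note that `T_BA` is a 2x3 matrix, like `T` itself: the two change-of-basis matrices are square, one for each space, and only `T` connects the two spaces.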

Last edited: Oct 7, 2015
9. Oct 7, 2015

### Fredrik

Staff Emeritus
There's a simpler proof of the equality $([T]_{B,A})_{ij}=\langle b_i,Ta_j\rangle$. I'll use the following notation for components of vectors: $x=x_i a_i =x_i'b_i$. For all vectors $x$, we have $x=x_i'b_i$, and therefore
$$\langle b_i,x\rangle =\langle b_i,x_j'b_j\rangle =x_j'\langle b_i,b_j\rangle =x_j'\delta_{ij} =x_i'.$$ In particular, we have
$$([T]_{B,A})_{ij}= (Ta_j)_i' =\langle b_i,Ta_j\rangle.$$ That's it. The first equality on the line above is just the definition of the $[T]_{B,A}$ notation.
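
The step $\langle b_i,b_j\rangle=\delta_{ij}$ is where orthonormality enters. A short sketch in plain Python makes this concrete by recovering components with inner products, once in an orthonormal basis and once in a basis that is not orthonormal. Both bases and the component list are hypothetical choices for this illustration:

```python
from fractions import Fraction as F

def inner(u, v):
    """Real inner product <u, v>."""
    return sum(x * y for x, y in zip(u, v))

def combine(coeffs, basis):
    """Return the vector sum_i coeffs[i] * basis[i]."""
    return [sum(c * b[k] for c, b in zip(coeffs, basis))
            for k in range(len(basis[0]))]

components = [2, -1]                                       # the x_i' we start from
orthonormal = [[F(3, 5), F(4, 5)], [F(-4, 5), F(3, 5)]]    # <b_i, b_j> = delta_ij
skew = [[1, 1], [0, 2]]                                    # a basis, but not orthonormal

# Build x from the same components in each basis, then try to read the
# components back with inner products <b_i, x>.
x_on = combine(components, orthonormal)
x_skew = combine(components, skew)
recovered_orthonormal = [inner(b_i, x_on) for b_i in orthonormal]
recovered_skew = [inner(b_i, x_skew) for b_i in skew]

print(recovered_orthonormal == components)    # True
print(recovered_skew == components)           # False
```

For the non-orthonormal basis the inner products give `[2, 0]` instead of `[2, -1]`, so the bracket formula for matrix elements cannot be used there.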

In post #5, when I wrote $[M]_A$, that was an abbreviated notation for $[M]_{A,A}$. So the formula $([M]_A)_{ij}=(Ma_j)_i$ follows immediately from the definition of the notation $[M]_{B,A}$.

Even though I had already simplified the notation from $[M]_{A,A}$ to $[M]_A$, I chose to simplify it further, and just write $M$. That may have caused some confusion.

Here's another cool way to rewrite $[M]_{A,A}$: Let $I$ denote the identity map. Since
$$([ I]_{A,B})_{ij} =(Ib_j)_i =(b_j)_i =(Ma_j)_i =([M]_{A,A})_{ij},$$ we have $[M]_{A,A}=[ I]_{A,B}$. So we can write
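$$[T]_{B,B} =([ I]_{A,B})^{-1} [T]_{A,A} [ I]_{A,B}.$$ Note that $[ I]_{B,A}$ is the inverse of $[ I]_{A,B}$, since the composition rule from post #5 gives $[ I]_{B,A}[ I]_{A,B}=[ I\circ I]_{B,B}$, which is the identity matrix. In terms of $M$, we have $[ I]_{B,A}=[M^{-1}]_{B,B}$ and $([ I]_{A,B})^{-1}=[M^{-1}]_{A,A}$, so these two matrices are equal.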

By the way, as you may have noticed, it's a bit of a pain to LaTeX [I] and [S]. The problem is that they're interpreted as BBcodes by the forum software. I'm inserting an extra space after the left bracket to take care of that.

10. Oct 7, 2015