Matrix Elements as images of basis vectors

1. Oct 5, 2015

I'm trying to understand the maths of QM from Shankar's book, Principles of Quantum Mechanics. On page 21 of that book, there is a general derivation: if we have a relation

|v'> = Ω|v>

where Ω is an operator transforming |v> into |v'>, then the matrix entries of the operator can be expressed as:

Ωij = <i|Ω|j>, where |i> and |j> are basis vectors. What the book says, essentially, is that the jth column of the matrix Ω can be viewed as the image of the jth basis vector under Ω, expressed in the same basis. More explicitly, it means:

Ω =
<1|Ω|1> <1|Ω|2> ..... <1|Ω|n>
<2|Ω|1> <2|Ω|2> ..... <2|Ω|n>
<3|Ω|1> <3|Ω|2> ..... <3|Ω|n>
..............................................
..............................................
<n|Ω|1> <n|Ω|2> ..... <n|Ω|n>

Where |1>, |2>, ... form the basis set. I tried to verify this, but it turned out not to hold (or so I found). For example, consider the 2x2 matrix:

Ω =
2 3
4 5

If I choose the basis vectors as |1> = [1 0] and |2> = [0 1] (please read them as column vectors), then I can verify that Ωij = <i|Ω|j>. However, if I choose a non-standard but orthonormal basis such as |1> = [1 0] and |2> = [0 -1] (again, please read them as column vectors), then using Ωij = <i|Ω|j>, I get:

Ω =
2 -3
-4 5

Why is this happening? Why am I not getting the same Ω back? What am I missing here? The proof in the book seems totally logical. Thanks in advance.
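In case it helps, here is a minimal plain-Python sketch of the computation I did (the helper functions and the name `omega_in_basis` are my own, just for this check):

```python
def matvec(M, v):
    """Apply matrix M (list of rows) to column vector v."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def dot(u, v):
    """Real inner product <u, v>."""
    return sum(a * b for a, b in zip(u, v))

Omega = [[2, 3], [4, 5]]

def omega_in_basis(basis):
    """Entries <i|Omega|j> for an orthonormal basis given as a list of vectors."""
    return [[dot(bi, matvec(Omega, bj)) for bj in basis] for bi in basis]

standard = [[1, 0], [0, 1]]
flipped  = [[1, 0], [0, -1]]   # orthonormal, but not the standard basis

print(omega_in_basis(standard))  # [[2, 3], [4, 5]]   -- Omega itself
print(omega_in_basis(flipped))   # [[2, -3], [-4, 5]] -- not Omega!
```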

2. Oct 5, 2015

Fredrik

Staff Emeritus
Let $T:X\to Y$ be a linear transformation. There isn't just one matrix associated with T. There's one for each pair (A,B) such that A is an ordered basis for X, and B is an ordered basis for Y. The one associated with a specific pair (A,B) is often denoted by $[T]_{B,A}$. Its components are given by the formula $([T]_{B,A})_{ij}= (Ta_j)_i$, where $a_j$ denotes the $j$th element of the ordered basis $A$, and $(Ta_j)_i$ denotes the $i$th component of $Ta_j$ with respect to the ordered basis $B$. If there's an inner product on $Y$, and $B$ is orthonormal, then we have $(Ta_j)_i=\langle b_i,Ta_j\rangle$. If B isn't orthonormal, then this last formula doesn't hold.

Suppose that $T:X\to X$ is linear, and that A and B are two different orthonormal ordered bases for X. What you have found is that $[T]_{B,B}$ may be different from $[T]_{A,A}$. They are however similar; there's always a matrix S such that $[T]_{B,B}=S[T]_{A,A}S^{-1}$.
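For the 2x2 example in post #1, this is easy to check numerically. Here is a minimal sketch in plain Python (the helper functions and the 2x2 inverse are my own); I write the similarity as $[T]_{B,B}=M^{-1}[T]_{A,A}M$, i.e. with $S=M^{-1}$, where $M$ has the B basis vectors as its columns:

```python
def matmul(P, Q):
    """Matrix product (PQ)_ij = sum_k P_ik Q_kj."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def inv2(M):
    """Inverse of a 2x2 matrix (assumed invertible)."""
    a, b = M[0]
    c, d = M[1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

T_A = [[2, 3], [4, 5]]   # matrix of T in the standard basis A
M = [[1, 0], [0, -1]]    # columns: the B basis vectors in A-coordinates

# [T]_B = M^{-1} [T]_A M  -- similar to [T]_A, but not equal to it
T_B = matmul(inv2(M), matmul(T_A, M))
print(T_B)  # [[2.0, -3.0], [-4.0, 5.0]]
```

This is exactly the matrix found in post #1, so the two computations agree.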

3. Oct 5, 2015

Firstly, thanks for a great reply. It helps me understand the original text better.

So for my case, the second situation applies, where T:X→X and I have two different orthonormal basis sets: A being the column vectors [1 0], [0 1], and B being the column vectors [1 0] and [0 -1]. Now if I use the general idea that (Taj)i = <ai,Taj>, this works out for the basis set A. However, the same does not work out for the basis set B when expressing (Tbj)i = <bi,Tbj>. If my understanding is correct, what you are saying is that there is an equivalent of T, let's call it T', also defined as T':X→X, which is the matrix that applies when all vectors are expressed in the basis set B. This T' has the same end effect on the vectors as T, and is related to the original T by T' = STS-1.

That is very exciting for me, because my earlier expression (with T ≡ Ω), Ωij = <i|Ω|j>, seemed to hold only for the standard kind (the A kind) of basis set. Please correct me if I am wrong. It would also be great if you could tell me what this topic/theme is called, and point me to a source/book/pdf where I can rigorously learn the meaning of S and how to construct it.

Thanks again.

4. Oct 5, 2015

For the relation T|V> = |V'>, the matrix equation v'i = Σj Tij vj holds perfectly fine (that's how matrices are multiplied with vectors). Then, just from looking at how vectors behave under transformations, it follows that Tij = <i|T|j>. What, then, is the inherent assumption I am making that forces |i> and |j> to be the standard basis vectors? I still cannot get my head around it.

Here is the derivation:
T|V> = |V'>

This implies:
v'i = <i|V'> = <i|T|V> = <i|T|(∑j vj|j>) = ∑j vj <i|T|j>

And simply by looking at how matrix multiplication is done, we have:
v'i=ΣTijvj

Therefore:
Tij = <i|T|j>.

But I cannot reproduce T using non-standard |i> and |j>, strange! I still don't see where the assumption enters, or what I am doing wrong, because this derivation looks independent of the choice of basis set.

5. Oct 5, 2015

Fredrik

Staff Emeritus
We're dealing with only one linear operator $T:X\to X$, but two different matrices associated with it, $[T]_{A,A}$ and $[T]_{B,B}$. One matrix for each ordered basis. I will simplify these notations to $[T]_A$ and $[T]_B$ here. (Yes, another option is to denote them by $T$ and $T'$). I will also use the notations $A=(a_1,\dots,a_n)$ and $B=(b_1,\dots,b_n)$. (Let's not worry about infinite-dimensional vector spaces here). And finally, I will use the following notation for components of vectors: $x=\sum_{i=1}^n x_i a_i=\sum_{i=1}^n x_i' b_i$. Actually, I will drop the summation sigmas from the notation as well, since the sum is always over the index that appears twice. So I will write $x=x_ia_i=x_i'b_i$.

Let $M$ be the unique linear operator such that $b_i=Ma_i$ for all $i$. We have
$$b_i=Ma_i =(Ma_i)_j a_j = ([M]_A)_{ji} a_j.$$ Let's simplify the notation $[M]_A$ to just $M$, so we can write $b_i=M_{ji}a_j$. The relationship between $[T]_B$ and $[T]_A$ is
$$[T]_B=M^{-1}[T]_A M.$$
This isn't too hard to prove. You need to know that in my notation, the definition of matrix multiplication is simply $(PQ)_{ij}=P_{ik}Q_{kj}$. We have
\begin{align}
&([T]_B)_{ij}b_i =([T]_B)_{ij} M_{ki}a_k =(M[T]_B)_{kj} a_k,\\
&([T]_B)_{ij}b_i =(Tb_j)_i' b_i =Tb_j =T(M_{ij}a_i)=M_{ij}Ta_i =M_{ij}(Ta_i)_k a_k =M_{ij}([T]_A)_{ki} a_k =([T]_AM)_{kj} a_k.
\end{align}
Since the left-hand sides are equal, the right-hand sides are too. Since $\{a_1,\dots,a_n\}$ is a linearly independent set, this implies that $(M[T]_B)_{kj}=([T]_AM)_{kj}$ for all k,j. This implies that $M[T]_B=[T]_AM$. Now you just multiply both sides by $M^{-1}$ from the left.

You need to distinguish between $[\Omega]_A$ and $[\Omega]_B$. If we denote the $i$th element of A by $|i\rangle$ and the $i$th element of B by $|i'\rangle$, we have $([\Omega]_A)_{ij}=\langle i|\Omega|j\rangle$ and $([\Omega]_B)_{ij}=\langle i'|\Omega|j'\rangle$. If you denote both $[\Omega]_A$ and $[\Omega]_B$ by $\Omega$, things will certainly get confusing.

We're talking about the relationship between linear transformations and matrices. It's explained in books on linear algebra. In all of them, I think. But some books (Treil, Axler) introduce this topic as early as possible, and some books (Anton) delay it for as long as possible. The former kind of book is much better for someone who's studying quantum mechanics. They also work with complex vector spaces from the start.

It would actually make sense to introduce this stuff before matrix multiplication, because one of the results obtained from these definitions and methods can be viewed as the reason why matrix multiplication is defined the way it is: Suppose that $T:X\to Y$ and $S:Y\to Z$ are linear, and that A,B,C are ordered bases for X,Y,Z respectively. We have
$$([S\circ T]_{C,A})_{ij} =([ S]_{C,B})_{ik} ([T]_{B,A})_{kj}.$$ It would be a good exercise for you to prove this. (Before you try that, you should work through the other proof above, until you can do it without looking at my calculations). The definition of matrix multiplication allows us to rewrite the above as
$$[S\circ T]_{C,A}=[ S]_{C,B}[T]_{B,A}.$$
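The composition rule can also be verified numerically in a small example. Here is a sketch in plain Python (the two maps, the bases, and the helper `rep` are my own); all three spaces are taken to be R^2 with orthonormal bases, so matrix entries are the inner products $\langle c_i, S b_j\rangle$ etc.:

```python
def matvec(M, v):
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def rep(op, dst, src):
    """Matrix of the linear map `op` w.r.t. orthonormal ordered bases:
    entry (i, j) is <dst_i, op(src_j)>."""
    return [[dot(di, op(sj)) for sj in src] for di in dst]

# Two linear maps on R^2, defined by their action in the standard basis.
def T(v): return matvec([[2, 3], [4, 5]], v)
def S(v): return matvec([[0, 1], [1, 1]], v)

A = [[1, 0], [0, 1]]    # ordered basis for X
B = [[1, 0], [0, -1]]   # ordered basis for Y
C = [[0, 1], [1, 0]]    # ordered basis for Z

lhs = rep(lambda v: S(T(v)), C, A)        # [S o T]_{C,A}
rhs = matmul(rep(S, C, B), rep(T, B, A))  # [S]_{C,B} [T]_{B,A}
print(lhs == rhs)  # True
```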

Last edited: Oct 6, 2015
6. Oct 5, 2015

Fredrik

Staff Emeritus
This has to be justified by the methods discussed in my previous posts. Since
\begin{align}
&Tv=T(v_j a_j)=v_j Ta_j = v_j (Ta_j)_i a_i =v_j ([T]_A)_{ij} a_i,\\
&Tv=(Tv)_i a_i,
\end{align} we have $(Tv)_i =([T]_A)_{ij} v_j$. I assume that the left-hand side is what you denote by $v_i'$. So we have $v_i' =([T]_A)_{ij} v_j$. This means that your $T_{ij}$ is the $ij$ component of $[T]_A$.

Last edited: Oct 6, 2015
7. Oct 7, 2015

Thanks for your reply. Basically, the M you describe above is a kind of change-of-basis matrix such that Mai = bi. I guess the step I am failing to understand is the one where you take (Mai)j = ([M]A)ji. I tried to derive it, and a major source of confusion is that the M you have seems to be the inverse of the M I have used in my proof. Please see below.

Let's say there is a vector v. This v is expressed simply in the standard basis set E (e1,.....,ek,.....,en), where each ek is a column with a 1 in the kth position and 0s elsewhere. I now take a new basis set A (a1,.....,ak,.....,an) and try to express this v in the new basis set as vA. I have an n by n matrix called the change of basis matrix, which I will simply call MAE, whose columns are nothing but the vectors a1,.....,ak,.....,an, such that:

MAE = [a1,.....,ak,.....,an ]

I can now see that the relation MAEvA=v holds true. One very important ingredient (or so I think) of this working so simply is that all the new basis vectors, the aks, are expressed as columns whose entries are coefficients of the original eks. To me this stands out as a very important fact and makes the standard set special. It means that the entries of the columns of MAE are the projections of the basis vectors of set A, expressed in the standard set E and not in the new set A. So the standard set still hasn't left us completely.

I wanted to get rid of the standard set and arrive at a completely general idea, from any basis to any basis. So I first note that the same can be said about a new basis set B with its own basis vectors (b1,.....,bk,.....,bn), its change of basis matrix MBE, and the corresponding expression of the same old vector as vB in this new basis. Exactly speaking: MBEvB=v.

Let us say we now want the change of basis matrix from A to B such that MBAvB=vA. What should be the constituent elements of MBA?

=> MBAvB=vA
=> MBAMBE-1v=MAE-1v
(skipping some steps; both MAE and MBE are invertible because their columns are linearly independent, so we get)
=> MBA = MAE-1MBE.

This gives us a way to reach the new change of basis matrix from A to B, MBA, in terms of the change of basis matrices from the standard basis, MBE and MAE. However, does the old interpretation still hold, so that we can now say the columns of MBA are the basis vectors of B expressed in terms of the basis vectors of A? I have been able to verify this by taking examples, and I guess what you have done is show it more rigorously using indices. Anyway, once we agree on what MBA is, I find it easier to apply these ideas to transformations T:V→W such that TAAvA=wA, and determine a new TBB from a given TAA. Using notation similar to what I have described here:

=> TBBvB=wB
=> TBBMBA-1vA=MBA-1wA
=> TBBMBA-1vA=MBA-1TAAvA
=> TBB=MBA-1TAAMBA

I guess that is enough as a proof, but I have assumed several things, like the very important fact that V and W have the same dimension. If that were not true, the method could still remain the same, but the change of basis matrices in V and W would cease to make sense across the transformation: one cannot have a change of basis matrix from 3 dimensions to 2 dimensions, because it isn't the same vector space anymore.

So the last remaining doubt I have is why the M you describe and the MBA I describe are different. They seem to be inverses of each other, even though our final conclusions are the same, that is, TBB=MBA-1TAAMBA.
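Here is a small plain-Python check of the construction above (the example bases and helper functions are my own): it builds MBA = MAE-1MBE and confirms that its columns are the B vectors expressed in A-coordinates, and that MBAvB = vA.

```python
def matvec(M, v):
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def inv2(M):
    a, b = M[0]
    c, d = M[1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Change of basis matrices: columns are the basis vectors in E-coordinates.
M_AE = [[1, 1], [0, 1]]   # a1 = [1, 0], a2 = [1, 1]
M_BE = [[2, 0], [1, 1]]   # b1 = [2, 1], b2 = [0, 1]

M_BA = matmul(inv2(M_AE), M_BE)
print(M_BA)  # [[1.0, -1.0], [1.0, 1.0]]
# Columns check out: b1 = 1*a1 + 1*a2 and b2 = -1*a1 + 1*a2.

# It also converts coordinates: v = [3, 2] has v_B = [1.5, 0.5] and v_A = [1, 2].
v_B = [1.5, 0.5]
print(matvec(M_BA, v_B))  # [1.0, 2.0]
```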

8. Oct 7, 2015

There are a lot of mistakes in my previous post. However, I found a big mistake that I had been making in the proofs using indices! That resolves my problem. As you had mentioned, I needed to distinguish between $\Omega_{A}$ and $\Omega_{B}$. Anyway, ditching my notation and taking up yours from the very first post, let's say we have a map $T: X \rightarrow Y$, such that $T\vec{x}=\vec{y}$, where $A (a_{1},..,..,a_{m})$ is an ordered basis set for $X$ and $B (b_{1},..,..,b_{n})$ is an ordered basis set for $Y$. The point I have ascertained here is that $T$ by itself is actually defined with respect to some basis sets. Let's say $T\vec{x} = T_{[D,C]}\vec{x_{C}} = \vec{y} = \vec{y_{D}}$, where $C$ is an ordered basis set for $X$ and $D$ for $Y$. This gives us the original definition of $T$. (I'm doing all of this to get rid of the notion that the standard basis set is special, when it's actually not.)

Now, one can easily show that $(T_{[B,A]})_{ij} = \langle b_{i}|T|a_{j} \rangle$, where $T_{[B,A]}$ and $T=T_{[D,C]}$ are related but different matrices. Here is the proof:
We know, $T_{[D,C]}\vec{x_{C}} = \vec{y_{D}}$ and $T_{[B,A]}\vec{x_{A}} = \vec{y_{B}}$. There are also the change of basis matrices as defined before where $M_{[A,C]}\vec{x_{A}} = \vec{x_{C}}$ and $M_{[B,D]}\vec{y_{B}} = \vec{y_{D}}$.

Starting with:
$T_{[B,A]}\vec{x_{A}} = \vec{y_{B}}$

Using change of basis:
$T_{[B,A]}M^{-1}_{[A,C]}\vec{x_{C}} = M^{-1}_{[B,D]}\vec{y_{D}}$

Using the original $T_{[D,C]}\vec{x_{C}} = \vec{y_{D}}$, we get
$T_{[B,A]}M^{-1}_{[A,C]}\vec{x_{C}} = M^{-1}_{[B,D]}T_{[D,C]}\vec{x_{C}}$

Cancelling the arbitrary $\vec{x_{C}}$ and multiplying by $M_{[A,C]}$ on the right gives $T_{[B,A]} = M^{-1}_{[B,D]}T_{[D,C]}M_{[A,C]}$. Now the cool thing is that both $M_{[A,C]}$ and $M_{[B,D]}$ have orthonormal columns, provided all the basis sets involved are orthonormal (that's how change of basis matrices between orthonormal bases are constructed), so they are unitary matrices, and their inverse is their conjugate transpose (for real matrices, simply the transpose). This quickly gives us:
$M^{-1}_{[B,D]}T_{[D,C]}M_{[A,C]} = M^{T}_{[B,D]}T_{[D,C]}M_{[A,C]}$

Therefore:
$T_{[B,A]} = M^{T}_{[B,D]}T_{[D,C]}M_{[A,C]}$

Now we can play around with $A, B, C, D$ all we want and get all the results mentioned above. For example, if both $C$ and $D$ are the standard basis sets called $E$, then $T=T_{[D,C]}$ because naturally $M_{[E,E]} = I$ which results in:
$T_{[B,A]} = M^{T}_{[B,E]}TM_{[A,E]}$

When this is viewed in bracket notation, it directly translates to:
$(T_{[B,A]})_{ij} = \langle b_{i}|T|a_{j} \rangle$

Wow, matrices are cool!! The important thing is not to confuse transformations $T$ with change of basis matrices $M$. While transformations can make vectors jump between vector spaces of different dimensions, change of basis matrices only give them a new look, not a different space.
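As a quick numerical sanity check of the final formula, here is a plain-Python sketch (the example $T$ and the orthonormal bases $A$, $B$ are my own choices): it compares $T_{[B,A]} = M^{T}_{[B,E]}TM_{[A,E]}$ with the bracket expression $\langle b_{i}|T|a_{j} \rangle$.

```python
def matvec(M, v):
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(M):
    return [list(col) for col in zip(*M)]

T = [[2, 3], [4, 5]]    # T in the standard basis E (i.e. T_{[D,C]} with C = D = E)
A = [[0, 1], [1, 0]]    # orthonormal basis for the domain
B = [[1, 0], [0, -1]]   # orthonormal basis for the codomain

M_AE = transpose(A)     # columns are the A vectors
M_BE = transpose(B)     # columns are the B vectors

via_formula = matmul(transpose(M_BE), matmul(T, M_AE))             # M^T_{[B,E]} T M_{[A,E]}
via_brackets = [[dot(bi, matvec(T, aj)) for aj in A] for bi in B]  # <b_i|T|a_j>
print(via_formula == via_brackets)  # True
```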

Last edited: Oct 7, 2015
9. Oct 7, 2015

Fredrik

Staff Emeritus
There's a simpler proof of the equality $([T]_{B,A})_{ij}=\langle b_i,Ta_j\rangle$. I'll use the following notation for components of vectors: $x=x_i a_i =x_i'b_i$. For all vectors $x$, we have $x=x_i'b_i$, and therefore
$$\langle b_i,x\rangle =\langle b_i,x_j'b_j\rangle =x_j'\langle b_i,b_j\rangle =x_j'\delta_{ij} =x_i'.$$ In particular, we have
$$([T]_{B,A})_{ij}= (Ta_j)_i' =\langle b_i,Ta_j\rangle.$$ That's it. The first equality on the line above is just the definition of the $[T]_{B,A}$ notation.

In post #5, when I wrote $[M]_A$, that was an abbreviated notation for $[M]_{A,A}$. So the formula $([M]_A)_{ij}=(Ma_j)_i$ follows immediately from the definition of the notation $[M]_{B,A}$.

Even though I had already simplified the notation from $[M]_{A,A}$ to $[M]_A$, I chose to simplify it further, and just write $M$. That may have caused some confusion.

Here's another cool way to rewrite $[M]_{A,A}$: Let $I$ denote the identity map. Since
$$([ I]_{A,B})_{ij} =(Ib_j)_i =(b_j)_i =(Ma_j)_i =([M]_{A,A})_{ij},$$ we have $[M]_{A,A}=[ I]_{A,B}$. So we can write
$$[T]_{B,B} =([ I]_{A,B})^{-1} [T]_{A,A} [ I]_{A,B}.$$ Note that $[ I]_{B,A}$ is precisely the inverse of $[ I]_{A,B}$: we have $[ I]_{B,A}=[M^{-1}]_{B,B}$ and $([ I]_{A,B})^{-1}=[M^{-1}]_{A,A}$, and these two matrices are equal, because $M^{-1}$ commutes with $M$ and therefore has the same matrix with respect to both bases.
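Here is a small numerical illustration of that last formula, sketched in plain Python with a non-orthonormal basis B of my own choosing (b1 = [1,1], b2 = [0,1], with A the standard basis):

```python
def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def inv2(M):
    a, b = M[0]
    c, d = M[1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

T_A = [[2, 3], [4, 5]]   # [T]_{A,A}, with A the standard basis
I_AB = [[1, 0], [1, 1]]  # [I]_{A,B}: columns are b1 = [1, 1], b2 = [0, 1] in A-coords

# [T]_{B,B} = ([I]_{A,B})^{-1} [T]_{A,A} [I]_{A,B}
T_B = matmul(inv2(I_AB), matmul(T_A, I_AB))
print(T_B)  # [[5.0, 3.0], [4.0, 2.0]]
# Direct check: T b1 = [5, 9] = 5 b1 + 4 b2 and T b2 = [3, 5] = 3 b1 + 2 b2,
# so the columns of [T]_{B,B} are indeed [5, 4] and [3, 2].
```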

By the way, as you may have noticed, it's a bit of a pain to LaTeX [I] and [S]. The problem is that they're interpreted as BBcodes by the forum software. I'm inserting an extra space after the left bracket to take care of that.
