# Change of Basis Matrix vs Transformation matrix in the same basis...

fog37
Summary: Change of basis matrix and transformation matrix
Hello,

Let's consider a vector ##X## in 2D with its two components ##(x_1 , x_2)_A## expressed in the basis ##A##. A basis is a set of two linearly independent (unit or not) vectors. Any vector in the 2D space can be expressed as a linear combination of the two basis vectors of the chosen basis. There are infinitely many possible bases to choose from, and each basis generally expresses the same vector as a different pair of component values.

That said, a particular vector ##X## can have its components changed (while its direction and magnitude remain the same) from a basis ##A## to a different basis ##B## using a change-of-basis matrix ##M_{AB}##. Vector ##X=(x_1 , x_2)_B##, in basis ##B##, has the same direction and magnitude as vector ##X=(x_1 , x_2)_A## in basis ##A##, i.e. it remains the same object.

However, there are matrices that can transform a vector ##X## into a different (magnitude and/or direction) vector ##Y## in the same basis ##A##.

What is the difference between a change-of-basis matrix ##M_{AB}##, which leaves the vector unchanged, and a matrix ##F## that actually changes a vector to a different vector in the same basis ##A##? I know that the columns of matrix ##M## are the basis vectors of basis ##A## expressed in basis ##B##. But to accomplish that transformation we need a matrix first.
Matrix ##F##, on the other hand, must change the vector to a different vector but does not affect the basis vectors themselves...
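To make the distinction concrete, here is a small numerical sketch (using numpy; the particular basis ##B## and the rotation ##F## below are just hypothetical choices): a change-of-basis matrix only re-expresses the same geometric vector, while a transformation matrix produces a genuinely different vector.

```python
import numpy as np

# Hypothetical basis B, written in the components of basis A
b1, b2 = np.array([1.0, 1.0]), np.array([-1.0, 1.0])
P = np.column_stack([b1, b2])     # columns: B's basis vectors in A-components
M_AB = np.linalg.inv(P)           # change-of-basis matrix: A-components -> B-components

x_A = np.array([3.0, 1.0])        # components of X in basis A
x_B = M_AB @ x_A                  # components of the SAME vector X in basis B

# Reconstructing X from its B-components recovers the same geometric vector
assert np.allclose(x_B[0] * b1 + x_B[1] * b2, x_A)

# By contrast, a transformation matrix F (here a 90-degree rotation)
# produces a genuinely different vector in the same basis A
F = np.array([[0.0, -1.0], [1.0, 0.0]])
y_A = F @ x_A
assert not np.allclose(y_A, x_A)
```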

Thank you for any clarification...

What is the difference between a change-of-basis matrix ##M_{AB}##, which leaves the vector unchanged, and a matrix ##F## that actually changes a vector to a different vector in the same basis ##A##?

The mathematical definition of a function ##f(x)## does not specify a physical interpretation of the process that maps ##x## to ##f(x)##. This holds true when the function is a "linear transformation" on a vector space. So a vector-valued function defined by multiplying a vector by a matrix could have either of the interpretations you mentioned - or perhaps some third interpretation, depending on how the mathematics is being applied.

Instead of asking about the difference between a change-of-basis matrix and a change-of-direction matrix, it is more useful to ask how those two matrices are related in the context of the two different applications of math that you mentioned.

fog37
Thanks Stephen Tashi.

I see how a linear transformation (let's call it type 1) is a map and is implemented through a multiplication by a matrix ##A##. It changes a vector into a different vector while the basis remains the same.

A change-of-basis transformation (let's call it type 2) is also a linear transformation, but it changes the components of a vector and does not transform the vector itself. The basis changes.

As you point out, the two types of linear transformations are related... But in what way? They seem to accomplish a very different task.

It seems that in the case of type 1, the transformation is not touching the basis vectors but only the vector ##X## under consideration...

Regarding type 2, the basis vectors change (they point in a different direction and have a different magnitude than the original basis vectors) while the vector ##X## remains the same in terms of length and direction...

Thanks!

Staff Emeritus
Gold Member
2021 Award
I don't really get the point of this thread. If you take a number and you multiply it by itself, that's useful for calculating the area of a square. It's also a useful calculation for finding the distance that a thing traveled while undergoing uniform acceleration. It's also a thing you can do to compute air resistance as a function of velocity. It's also something you can do to the speed of light to compute the conversion between mass and energy.

Deep down, do all of these examples somehow boil down to computing the area of a square? I don't know, maybe. But squaring a number is mostly just a useful way to write down equations that can mean all sorts of different things, so we do. Similarly, it's useful to write down a matrix for different reasons, and it might not be that useful to get *too* worked up about the secret link between all of them.

If you really care though, you can make this whole thing parsimonious by just observing that a tuple of numbers is a representation of a vector once a basis has been picked, and every matrix is a representation of a linear map between vector spaces whose bases have been picked in advance (otherwise you could just pick bases so that the matrix has 0s off the diagonal and 1s on some of the diagonal entries, and save yourself a lot of effort learning how to do matrix computations).

Some matrices represent a map from a vector space to itself, where the prechosen bases may or may not be the same. Some matrices represent a map between two different vector spaces. This is not particularly insightful, and won't really open up much new understanding into the field I think.

fog37
Some progress:

A linear transformation ##T## has a matrix ##A_{old}## with respect to basis ##E##.

In a different basis ##B##, the matrix for ##T## will be different and equal to ##A_{new}##. The new matrix ##A_{new} ## is given by $$A_{new} = P^{-1} A_{old} P$$
where the matrix ##P## is the change of basis matrix between basis ##E## and ##B##.
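As a quick sanity check, here is a sketch of this formula in numpy (the specific ##T## and basis ##B## below are hypothetical; the columns of ##P## are taken to be the ##B## basis vectors written in ##E##-coordinates, so ##P## converts ##B##-coordinates to ##E##-coordinates):

```python
import numpy as np

# Hypothetical transformation T: reflection across the x-axis, written in basis E
A_old = np.array([[1.0, 0.0], [0.0, -1.0]])

# Columns of P are the B basis vectors written in E-coordinates,
# so P converts B-coordinates to E-coordinates (a hypothetical basis B)
P = np.array([[1.0, 1.0], [0.0, 2.0]])

A_new = np.linalg.inv(P) @ A_old @ P   # matrix of the same T in basis B

# Check: converting to B-coordinates and applying A_new agrees with
# applying A_old in E and then converting the result to B-coordinates
x_E = np.array([2.0, 3.0])
x_B = np.linalg.solve(P, x_E)
assert np.allclose(A_new @ x_B, np.linalg.solve(P, A_old @ x_E))
```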

In the same basis ##E##, why does the matrix ##A_{old}## change vectors into other vectors but keep the basis vectors of basis ##E## the same? If they changed, the basis would change... I am afraid I am wrong somewhere here...

Thanks

As you point out, the two types of linear transformations are related... But in what way?
The general topic is applying matrix multiplication to define a (linear) transformation of coordinates. The coordinates of a vector depend on what basis the coordinates refer to. (I'm going to use the terminology "coordinates" rather than the terminology "components". Some people understand a "component" of a vector to be a scalar coordinate, but others (especially in physics) understand a component of a vector to be a vector. What I'm talking about are n-tuples of numbers - i.e. scalars.)

For the matrix function ##Y=MX## (before we even get to any interpretation of what it means) we have to be aware of alternative notation conventions:

1) The coordinates of ##Y## and ##X## are expressed with respect to the same basis.
2) The coordinates of ##Y## and ##X## are expressed with respect to different bases.

Then we have alternatives of interpretation

1) ##Y = MX## represents the vector ##X## being moved to a new location ##Y##

2) ##Y = MX## represents how to find the coordinates ##Y## of a vector ##V## expressed in one basis when we are given the coordinates ##X## of ##V## expressed in a different basis.

Theoretically, we could employ notation convention 2) with interpretation 1), but the most usual situation is that we use notation convention 1) with interpretation 1) and we use notation convention 2) with interpretation 2).

A linear transformation ##T## has a matrix ##A_{old}## with respect to basis ##E##.

In a different basis ##B##, the matrix for ##T## will be different and equal to ##A_{new}##. The new matrix is given by ##A_{new}=P^{-1}A_{old}P##

This depends on how you define ##P##. To see that, consider ##Y=MX## with notation convention 1) and interpretation 1). There is some matrix ##P## that can be used to change the coordinates of vectors with respect to the basis being used for writing ##Y=MX## to their coordinates with respect to a different basis.

Let ##PY## be the coordinates of ##Y## in the different basis and ##PX## be the coordinates of ##X## with respect to that different basis. (In other words, assume there is a mapping ##X \to PX## that uses notation convention 2) and interpretation 2).) To express the linear transformation ##Y=MX## in the different basis we need some matrix ##A## such that ##(PY)=(A)(PX)## when using the different coordinate system, and such that this equation defines the same linear transformation as ##Y=MX## in the original coordinate system.

Multiply ##PY=APX## on both sides on the left by ##P^{-1}## and we obtain ##Y=P^{-1}APX##, which now refers to coordinates with respect to the original basis. We want ##Y=(P^{-1}AP)(X)## to be the same transformation as ##Y=MX##, so we must have ##M=P^{-1}AP##. Solving for ##A## we get ##A=PMP^{-1}##.
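A short numerical sketch of this derivation (numpy; the matrices ##M## and ##P## below are arbitrary hypothetical choices, with ##P## invertible):

```python
import numpy as np

# Hypothetical choices: M defines Y = M X in the original coordinates,
# P is an invertible coordinate change X -> P X
M = np.array([[2.0, 1.0], [0.0, 3.0]])
P = np.array([[1.0, 1.0], [-1.0, 2.0]])

A = P @ M @ np.linalg.inv(P)   # the matrix of the same map in the new coordinates

X = np.array([1.0, 4.0])
Y = M @ X
# In the new coordinates, P Y must equal A (P X)
assert np.allclose(P @ Y, A @ (P @ X))
```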

If we define ##P## differently, by saying the coordinate transformation is given by ##P^{-1}Y## and ##P^{-1}X##, we get a different expression for ##A##. (A further consideration is that some people work with row vectors instead of column vectors and represent linear transformations as a row vector multiplied on the right side by a matrix.)

fog37
I think things might have gotten unnecessarily complicated. Say you have some linear transformation ##T : V \longrightarrow W##, where ##V## and ##W## are vector spaces. You can choose a basis ##A = (v_1, \dots, v_n)## of ##V## and a basis ##B = (w_1, \dots, w_m)## of ##W##. A bit of notation; for ##v \in V##, define ##[v]_{A}## to be the coordinates of ##v## in the basis ##A##, etc.

Now you can define a matrix ##\mathcal{M}##, the "matrix of the linear transformation ##T## with respect to ##A## and ##B##", like this$$\mathcal{M} = \left( \, [T(v_1)]_{B}, \, \dots, \, [T(v_n)]_{B} \, \right)$$where each ##[T(v_i)]_{B}## is a column formed by the ##B##-coordinates of ##T(v_i)##. This matrix has the property that, if ##y = Tx## for ##x \in V## and ##y \in W##, then ##[y]_{B} = \mathcal{M} [x]_{A}##, i.e. you express the initial vector ##x## in ##A##-coordinates, and it spits out the image vector ##y## in ##B##-coordinates.

For a linear transformation ##T: V \longrightarrow V##, i.e. from ##V## to itself, it's pretty common to want to work only in one basis, in which case you just have ##A = B## and the matrix ##\mathcal{M}## is just the bog-standard thing you're used to dealing with, e.g.\begin{align*} y &= T x \\ [y]_{A} &= \mathcal{M} [x]_{A} \end{align*}For a change-of-basis, the goal is to convert from the ##A##-coordinates of some vector ##v \in V## to the ##B##-coordinates of that same vector ##v##. You can do this by considering the identity map ##I: V \longrightarrow V## which just takes ##v \mapsto v##, and now let ##\mathcal{M}## be the matrix of ##I## with respect to ##A## and ##B##,\begin{align*} v &= I v \\ [v]_{B} &= \mathcal{M} [v]_{A} \end{align*}To summarise, a "transformation matrix in the same basis" and a "change of basis matrix" are just two specific cases of that general concept of a matrix of a linear transformation with respect to two specific bases. Anyway, hope that helps.
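If it helps, the column-by-column construction of ##\mathcal{M}## can be sketched numerically (numpy; the map ##T## and the two bases below are hypothetical, all written in standard coordinates of ##\mathbf{R}^2##):

```python
import numpy as np

# Hypothetical map T (a shear) and bases A, B of R^2, all in standard coordinates;
# columns of V are A's basis vectors, columns of W are B's basis vectors
T = np.array([[1.0, 2.0], [0.0, 1.0]])
V = np.column_stack([[1.0, 0.0], [0.0, 2.0]])
W = np.column_stack([[2.0, 1.0], [2.0, 0.0]])

# Build M column by column: column i is [T(v_i)]_B, found by solving W c = T v_i
M = np.column_stack([np.linalg.solve(W, T @ V[:, i]) for i in range(2)])

# Check the defining property [y]_B = M [x]_A
x_A = np.array([3.0, -1.0])            # A-coordinates of some x
y = T @ (V @ x_A)                      # y = T x in standard coordinates
y_B = np.linalg.solve(W, y)            # B-coordinates of y
assert np.allclose(y_B, M @ x_A)
```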

fog37
Homework Helper
If they changed, the basis would change...
Why do you say this? The choice of basis has nothing to do with the transformation itself. The choice of basis simply determines what the entries of the transformation's matrix will be. The matrix will have different entries in a different basis. The choice of basis defines a particular matrix representation of the transformation.
Obviously there are a host of interesting results that obtain.

fog37

Thanks etotheipi.

So ##I## is a linear map that takes a vector to itself. Like every linear transformation, it has an associated matrix ##M## whose elements vary depending on the chosen basis.

The matrix ##M## is then called a change-of-basis matrix because its elements are designed to change the components of vector ##x## while keeping the magnitude and direction of vector ##x## the same.

What properties does a matrix ##M## that represents an identity map ##I## have compared to matrices that don't do that? What is special about its elements to be able to keep the vector the same but just change its components?

etotheipi
The key is to realise that the "abstract" vector ##x \in V## and its so-called "coordinate-vector" with respect to a basis ##[x]_A## are different.

First, let's just re-cap what we mean by a "coordinate-vector w.r.t. ##A##", or "the components of ##v## in a basis ##A##". Given any ##v \in V##, if ##A = (a_1, \dots, a_n)## is a basis of ##V## then you can express ##v = c_1 a_1 + \dots + c_n a_n##. Then, ##[v]_A := \begin{pmatrix} c_1 \\ \dots \\ c_n \end{pmatrix}## are called the ##A##-coordinates of ##v##. Of course, given a different basis ##B = (b_1, \dots, b_n)## the coefficients will be different, i.e. ##v = d_1 b_1 + \dots + d_n b_n##, and then the ##B##-coordinates of ##v## are instead ##[v]_B := \begin{pmatrix} d_1 \\ \dots \\ d_n \end{pmatrix}##. The whole point of a change-of-basis is to figure out how to get from the ##A##-coordinates of ##v## to the ##B##-coordinates of ##v##!

Let's take a concrete example of a change-of-basis. Consider the vector space ##V = \mathbf{R}^2##, and let's take two bases, say ##A = \{ \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \begin{pmatrix} 0 \\ 2 \end{pmatrix} \}## and ##B = \{ \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 0 \end{pmatrix} \}##. Now, let's work out what the matrix ##\mathcal{M} = \left( \, [I\begin{pmatrix} 1 \\ 0 \end{pmatrix}]_{B}, [I\begin{pmatrix} 0 \\ 2 \end{pmatrix}]_{B} \, \right)## of the identity map ##I : \mathbf{R}^2 \longrightarrow \mathbf{R}^2, \, v \mapsto v## is with respect to these two bases [i.e. ##\mathcal{M}## is the change-of-basis matrix]. Firstly,$$[I\begin{pmatrix} 1 \\ 0 \end{pmatrix}]_B = [\begin{pmatrix} 1 \\ 0 \end{pmatrix}]_B = \begin{pmatrix} 0 \\ 1/2 \end{pmatrix}$$because ##\begin{pmatrix} 1 \\ 0 \end{pmatrix} = 0 \times \begin{pmatrix} 2 \\ 1 \end{pmatrix} + 1/2 \times \begin{pmatrix} 2 \\ 0 \end{pmatrix}##, and similarly$$[I\begin{pmatrix} 0 \\ 2 \end{pmatrix}]_B = [\begin{pmatrix} 0 \\ 2 \end{pmatrix}]_B = \begin{pmatrix} 2 \\ -2 \end{pmatrix}$$because ##\begin{pmatrix} 0 \\ 2 \end{pmatrix} = 2 \times \begin{pmatrix} 2 \\ 1 \end{pmatrix} -2 \times \begin{pmatrix} 2 \\ 0 \end{pmatrix}##. Hence the change-of-basis matrix from ##A## to ##B## coordinates is$$\mathcal{M} = \begin{pmatrix} 0 & 2\\ 1/2 & -2 \end{pmatrix}$$Let's check that it works. Consider some random vector ##\begin{pmatrix} 5 \\ 6 \end{pmatrix}## in ##\mathbf{R}^2##. Since ##\begin{pmatrix} 5 \\ 6 \end{pmatrix} = 5 \times \begin{pmatrix} 1 \\ 0 \end{pmatrix} + 3 \times \begin{pmatrix} 0 \\ 2 \end{pmatrix}##, its ##A##-coordinates are ##[\begin{pmatrix} 5 \\ 6 \end{pmatrix}]_A = \begin{pmatrix} 5 \\ 3 \end{pmatrix}##.

And since ##\begin{pmatrix} 5 \\ 6 \end{pmatrix} = 6 \times \begin{pmatrix} 2 \\ 1 \end{pmatrix} - 7/2 \times \begin{pmatrix} 2 \\ 0 \end{pmatrix}##, its ##B##-coordinates are ##[\begin{pmatrix} 5 \\ 6 \end{pmatrix}]_B = \begin{pmatrix} 6 \\ -7/2 \end{pmatrix}##.

And indeed, you can check that this is what we get when we apply the change of basis matrix,$$\begin{pmatrix} 6 \\ -7/2 \end{pmatrix} = \begin{pmatrix} 0 & 2\\ 1/2 & -2 \end{pmatrix} \begin{pmatrix} 5 \\ 3 \end{pmatrix}$$The thing that can be difficult to grasp is that ##[\begin{pmatrix} 5 \\ 6 \end{pmatrix}]_A = \begin{pmatrix} 5 \\ 3 \end{pmatrix}## and ##[\begin{pmatrix} 5 \\ 6 \end{pmatrix}]_B = \begin{pmatrix} 6 \\ -7/2 \end{pmatrix}## are just representations of the vector ##\begin{pmatrix} 5 \\ 6 \end{pmatrix}## with respect to those two bases. You must not confuse the representations with the original vector itself!
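For anyone who wants to check the arithmetic, the example above can be reproduced with a few lines of numpy:

```python
import numpy as np

# Columns are the basis vectors from the example above (standard coordinates)
A = np.column_stack([[1.0, 0.0], [0.0, 2.0]])
B = np.column_stack([[2.0, 1.0], [2.0, 0.0]])

# Change-of-basis matrix: column i holds the B-coordinates of A's i-th basis vector
M = np.column_stack([np.linalg.solve(B, A[:, i]) for i in range(2)])
assert np.allclose(M, [[0.0, 2.0], [0.5, -2.0]])

v = np.array([5.0, 6.0])
v_A = np.linalg.solve(A, v)        # A-coordinates (5, 3)
v_B = np.linalg.solve(B, v)        # B-coordinates (6, -7/2)
assert np.allclose(v_B, M @ v_A)   # M converts A-coordinates to B-coordinates
```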

I would suggest, if you want to learn more, that you consult a linear algebra textbook, or alternatively have a read through some articles like these:
- https://en.wikipedia.org/wiki/Transformation_matrix
- https://en.wikipedia.org/wiki/Change_of_basis
- https://en.wikipedia.org/wiki/Coordinate_vector
to get a better understanding. Hope it helps!

fog37
It might help for the underlying space to be something like linear polynomials, so the representations and the vector itself are more obviously separate.

etotheipi
What properties does a matrix ##M## that represents an identity map ##I## have compared to matrices that don't do that?

We can list properties that ##M## must have, but some matrices that do not represent a change-of-basis can have the same properties. For example, ##M## must have an inverse. If the bases used have special properties (orthogonality, normality) then ##M## must have other special properties.

Are you looking for a list of mathematical properties of ##M## that tell you whether to interpret the equation ##y = Mx## as a change of basis versus a motion of vectors? There is no such list. Even if an author has used notation like "##x_A##" to indicate the basis used to represent vectors (and I've rarely seen this done in scientific articles) you must read the text of an article to determine what is meant.

Furthermore, if you are familiar with the idea of "reference frames" in physics, you know that it is often unimportant to make a distinction between motion of some object relative to an external reference frame versus the view that the object is at rest in its own reference frame and the external world is moving.

There is a relation between the matrices used in these differing points of view.

Let's say that ##y_b = M v_b## represents a change of position of a vector ##v_b## when ##b## is the basis of the space. (e.g. in 2-D, think of the vector (2,1) being rotated counterclockwise by 10 degrees relative to the origin)

Then ##v_c = M^{-1} v_b## can represent computing the coordinates of ##v## with respect to a new set of basis vectors ##c## when we take the view that ##v## remained at rest and the coordinate system was transformed (e.g. the vector (2,1) remained where it was and the x and y axes were themselves rotated 10 degrees counterclockwise).

Of course, in pure mathematics, we should not talk about a set of basis vectors moving. We only "transform" basis vectors in the sense that we turn our attention from one fixed set of basis vectors to a different fixed set of basis vectors. However, in physics, one often thinks of a set of basis vectors as an object that can move to new places.
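The inverse relation between the two points of view can be sketched numerically (numpy; a 10-degree rotation, matching the example above): transforming the basis vectors by ##M## makes the coordinates of a fixed vector transform by ##M^{-1}##.

```python
import numpy as np

def rot(deg):
    """Matrix of a counterclockwise rotation by deg degrees, standard basis."""
    t = np.radians(deg)
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

M = rot(10)
v = np.array([2.0, 1.0])

# Active view: the vector itself is rotated, the basis stays fixed
y_active = M @ v

# Passive view: v stays put, and we switch attention to basis vectors
# that are M times the old ones; the coordinates of v in that basis are M^{-1} v
new_basis = M @ np.eye(2)                  # columns: transformed basis vectors
v_passive = np.linalg.solve(new_basis, v)
assert np.allclose(v_passive, np.linalg.inv(M) @ v)
```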

fog37
Hello. Thanks. I think the example drives things home: the same vector ##\begin{pmatrix} 5 \\ 6 \end{pmatrix}## has different coordinate representations in the two different bases. Is the vector ##\begin{pmatrix} 5 \\ 6 \end{pmatrix}##, which you call random, a vector expressed in some other unnamed basis where 5 and 6 are the components? I think so.