Applying change of basis to vector b_1 doesn't give b'_1?

  • #1
giraffe714
TL;DR Summary
The formula for finding change of basis transformations is ##b'_j = a_{1j} b_1 + \cdots + a_{nj} b_n##, but when the coefficients ##a_{ij}## are put into a matrix and we multiply that matrix by ##b_1##, it doesn't give back ##b'_1##?
The formula my textbook provides for finding change of basis matrices is:
$$b'_j = a_{1j} b_1 + \cdots + a_{nj} b_n$$
I assume, since that's the convention and it's also how Wikipedia uses this formula, that the first index of the ##a##'s is the row and the second is the column. So, given some basis B and some basis B', we should be able to construct the matrix P which is a change of basis between the two. Let's also assume that B and B' each have 2 vectors, for simplicity.
Hence, we can construct the matrix
$$ P = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} $$
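Spelled out for n = 2, the formula says that the columns of P hold the coefficients of the new basis vectors:
$$b'_1 = a_{11} b_1 + a_{21} b_2, \qquad b'_2 = a_{12} b_1 + a_{22} b_2$$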
All of this is standard. My confusion comes when I try to directly multiply this P by some b_1.

$$ \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} \\ b_{21} \end{pmatrix} = \begin{pmatrix} a_{11} b_{11} + a_{12} b_{21} \\ a_{21} b_{11} + a_{22} b_{21} \end{pmatrix} $$
Which isn't the formula ##b'_j = a_{1j} b_1 + \cdots + a_{nj} b_n##. Or at least, it's not the component formulas ##b'_{j1} = a_{1j} b_{11} + \cdots + a_{nj} b_{n1}## and ##b'_{j2} = a_{1j} b_{12} + \cdots + a_{nj} b_{n2}##, which are surely just the formula above separated out by components, which should be allowed? Which step here is flawed? I can't find any explanation for this discrepancy online, and the formula works when I use concrete bases. Is this level of abstraction simply not allowed in linear algebra? I understand that I'm making a mistake somewhere in this reasoning, but I can't find where.

The weird thing is that the formula does work when I use concrete bases. For example, with
B = (1, 0), (0, 1) and B' = (1, 0), (1, 1)
everything checks out. I just don't know why.
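Spelling the example out: since ##b'_1 = 1\cdot b_1 + 0\cdot b_2## and ##b'_2 = 1\cdot b_1 + 1\cdot b_2##, the matrix is
$$P = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad P\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = b'_1,$$
so here multiplying ##P## by ##b_1## really does give back ##b'_1##.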
 
  • #2
There are two things here. First, there are two sets of basis vectors. Second, each vector can be expressed as a linear combination of either set of basis vectors. We need to be very careful about notation.

In your last example, the vectors in both B and B' are expressed in the B basis. The vectors in B' expressed in the B' basis would be ##(1,0)', (0, 1)'## - where I've used the prime to denote that the components involve the B' basis vectors.

If you have the change of basis matrix, A, then in this case we should have (I'll leave the vectors in row form for ease of typing):
$$A(1, 0) = (1, 0)'; \ A(1,1) = (0, 1)'$$Does that explain things?
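To make that concrete with your bases (a quick check, taking ##A## to be the inverse of the matrix whose columns are the ##B##-coordinates of the primed vectors):
$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}, \quad A\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}', \quad A\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}'.$$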
 
  • #3
Yes, it does, from a concrete point of view at least. When we write ##A(1,0) = (1, 0)'##, we're working with the coefficient tuples ##(1, 0)## and ##(1, 0)'##, correct? But then, for any basis B, the basis vectors expressed in that basis always "look like" the standard basis. So when we change basis, we first express the basis vectors of the new basis B' in terms of the current basis B, and then... what? If the basis vectors of any basis look like the standard basis in that basis, wouldn't changing the basis back be redundant, given that we already have to know B' in terms of B to even know what to multiply?

But I'm still confused about why directly multiplying P by some b_1 doesn't give what I thought it should according to the very formula P is derived from. Could you please elaborate on that part?
 
  • #4
Sorry I can't answer this now. I'll get back to you later today if no one else has in the meantime.
 
  • #5
giraffe714 said:
Yes, it does, from a concrete point of view at least. When we write ##A(1,0) = (1, 0)'##, we're working with the coefficient tuples ##(1, 0)## and ##(1, 0)'##, correct?

Depends on what you mean by ##(1,0)'.## This is a very confusing notation which you should avoid. What we have is
$$
A\cdot b_1=A\cdot \begin{pmatrix}1\\0 \end{pmatrix}=\begin{pmatrix}a_{11}\\a_{21}\end{pmatrix}=a_{11}\cdot \begin{pmatrix}1\\0\end{pmatrix}+a_{21}\cdot\begin{pmatrix}0\\1\end{pmatrix}=a_{11}b_1+a_{21}b_2=b_1'
$$
which expresses the new basis vector ##b_1'## as a linear combination of the previous basis vectors ##\{b_1,b_2\}.## If you use coordinates like ##(1,0)##, you must be aware of whether you mean ##1\cdot b_1+ 0\cdot b_2## or ##1\cdot b'_1+ 0\cdot b'_2\,,## i.e. carefully keep track of your choice of basis. Of course, we always have ##b'_1= 1\cdot b'_1+ 0\cdot b'_2\,,## too, but in general ##b_1\neq 1\cdot b'_1+ 0\cdot b'_2\,.## Writing both as ##(1,0)## without saying which basis is meant produces confusion.

I suggest calculating with the bases ##B=\{(1,0),(0,1)\}## and ##B'=\{(1,-1),(-2,3)\}## where
\begin{align*}
b'_1&=(1,-1)=1\cdot b_1-1\cdot b_2\\
b'_2&=(-2,3)=-2\cdot b_1+3\cdot b_2\\
\end{align*}

##b'_1=1\cdot b_1'+0\cdot b'_2=(1,0)_{B'}## but ##b'_1=(1,-1)_{B}.## Coordinates (or components) of vectors only make sense if it is clear which basis is meant.
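Carrying this example through with the textbook convention ##b'_j = a_{1j}b_1 + \cdots + a_{nj}b_n## gives, as a quick check,
$$
P=\begin{pmatrix}1 & -2\\ -1 & 3\end{pmatrix}\, ,\quad P\cdot \begin{pmatrix}1\\0\end{pmatrix}=\begin{pmatrix}1\\-1\end{pmatrix}=(b'_1)_B\, ,\quad P\cdot \begin{pmatrix}0\\1\end{pmatrix}=\begin{pmatrix}-2\\3\end{pmatrix}=(b'_2)_B\,,
$$
so the columns of ##P## are exactly the ##B##-coordinates of ##b'_1## and ##b'_2\,.##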
 
  • #6
Here's a provisional answer. Suppose we rotate the standard basis vectors in ##\mathbb R^2## by ##\theta##. The x-axis is mapped to an axis at ##+\theta## above the x-axis and, likewise, the y-axis is rotated by ##\theta##. We can describe this using the rotation matrix ##R(\theta)##, where the coordinates of the new x-axis are given by:
$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix}= \begin{pmatrix} \cos \theta \\ \sin \theta \end{pmatrix}$$And the coordinates of the new y-axis are given by:
$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix}= \begin{pmatrix} -\sin \theta \\ \cos \theta \end{pmatrix}$$Note that these are the coordinates (components or coefficients) of the new axes, expressed in the original basis.

When we transform a vector's components into the new basis, we must use the inverse rotation matrix:
$$R(-\theta) = \begin{pmatrix} \cos \theta & \sin \theta \\ -\sin \theta & \cos \theta \end{pmatrix}$$And the components of a vector expressed in the new basis are given by:
$$\begin{pmatrix} x \\ y \end{pmatrix}' = \begin{pmatrix} \cos \theta & \sin \theta \\ -\sin \theta & \cos \theta \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}$$Now, we can find the components of the new x-axis in the new basis:
$$\begin{pmatrix} x \\ y \end{pmatrix}' = \begin{pmatrix} \cos \theta & \sin \theta \\ -\sin \theta & \cos \theta \end{pmatrix}\begin{pmatrix} \cos \theta \\ \sin \theta \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}'$$As expected. In other words, if we rotate the basis vectors anti-clockwise, then effectively the vectors rotate clockwise, in terms of their position relative to the new basis.

This is a common theme, where you have to be careful about the difference between

a) Mapping one set of basis vectors to a new set of basis vectors.

b) Describing the components of a vector in the new basis, given their components in the old basis.

These are generally inverse transformations of each other.
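If it helps to see this numerically, here is a minimal NumPy sketch of the rotation example above (my own sanity check; the angle value is arbitrary):

```python
import numpy as np

theta = 0.7  # arbitrary angle

# R(theta): its columns are the rotated basis vectors, written in the old basis
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

new_x = R @ np.array([1.0, 0.0])  # old-basis coordinates of the new x-axis

# R(-theta) = R.T converts a vector's components from the old basis to the new basis
print(R.T @ new_x)  # [1. 0.] -- the new x-axis has components (1, 0)' in the new basis
```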
 
  • #7
Ohh, alright. That makes sense, thank you. But once again, what happens when we multiply the abstract P by some abstract b_1? Why don't we get b'_1? Because the inverse of some 2x2 ##P = \begin{pmatrix} a & b \\ c & d \end{pmatrix}## (I would assume it has to be invertible) is
$$ P^{-1} = \frac{1}{ad-bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} $$
And if we multiply this by ##b_1## we still don't get ##b'_1## by the transformation ##b'_j = a_{1j} b_1 + \cdots + a_{nj} b_n##. I fully understand your response; the only thing I don't understand is this specific discrepancy in the general case. While the rotation example is interesting and does show many things, I can't figure out how to get the answer to this question from it (perhaps it's in there somewhere, I just can't see where).
 
  • #8
Can you say precisely what ##b'_j## is?
 
  • #9
Let me write this out. We have bases: ##B = \{\vec b_1, \dots \vec b_n \}## and ##B' = \{\vec b'_1, \dots \vec b'_n \}##.

First, we have a set of vector equations that express each of the vectors in one basis as a linear combination of vectors in the other basis:
$$\vec b'_i = \sum_{j = 1}^{n} c_{ij}\vec b_j = c_{i1}\vec b_1 + c_{i2}\vec b_2 + \dots + c_{in}\vec b_n$$Now, we take a typical vector ##\vec v## that can be expressed in both bases:
$$\vec v = \sum_{i = 1}^{n}a_i \vec b_i = \sum_{i = 1}^{n}a'_i \vec b'_i$$We then want to find the relationship between the coefficients ##a'_i## and ##a_i##. We see that:
$$\sum_{i = 1}^{n}a'_i \vec b'_i = \sum_{i = 1}^{n}a'_i\sum_{j = 1}^{n} c_{ij}\vec b_j = \sum_{j = 1}^{n}\bigg (\sum_{i = 1}^{n}c_{ij}a'_i\bigg )\vec b_j$$This gives us the required equation:
$$\vec v = \sum_{j = 1}^{n}a_j \vec b_j = \sum_{j = 1}^{n}\bigg (\sum_{i = 1}^{n}c_{ij}a'_i\bigg )\vec b_j$$And, as ##B## is a basis, we have a unique expansion for ##\vec v##, hence:
$$\forall j: a_j = \sum_{i = 1}^{n}c_{ij}a'_i$$Now, because of how we have defined things, the coefficients ##c_{ij}## are the wrong way round for matrix multiplication of the components. So, if we define the matrix ##C## by ##C_{ij} = c_{ij}##, then it's the transpose of ##C## that we want in order to transform the components of the vector ##\vec v## from the ##B'## basis to the ##B## basis.

Now we have the notational issue that has already been mentioned in this thread. The tuples ##(a_1, a_2, \dots a_n)## and ##(a'_1, a'_2 \dots a'_n)## are not strictly speaking vectors in this context, but lists of vector components in the corresponding basis, which are related by a matrix equation:
$$\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} c_{11} & c_{21} & \dots & c_{n1} \\ c_{12} & c_{22} & \dots & c_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ c_{1n} & c_{2n} & \dots & c_{nn} \end{pmatrix} \begin{pmatrix} a'_1 \\ a'_2 \\ \vdots \\ a'_n \end{pmatrix}$$I'm not sure what you are trying to do. Perhaps you are assuming that the matrix ##C## or ##C^T## can be used in a different way? In any case, the matrix generated from the relationship between the basis vectors can be used to transform the components of vectors from one basis to the other.
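As a numerical sanity check of that last matrix equation, here is a short NumPy sketch (my own check, reusing the basis from post #5; the test components are arbitrary):

```python
import numpy as np

# B is the standard basis and B' = {(1,-1), (-2,3)}, written in B, as in post #5.
# Row i of C holds the B-components of b'_i, i.e. C[i, j] = c_ij.
C = np.array([[ 1.0, -1.0],
              [-2.0,  3.0]])

a_prime = np.array([0.5, 2.0])  # components a'_i of some vector v in the B' basis
a = C.T @ a_prime               # the claimed B-components of the same vector

# Cross-check by assembling v directly from the B' basis vectors
v = a_prime[0] * C[0] + a_prime[1] * C[1]
print(np.allclose(a, v))  # True
```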
 
