Matrices of Linear Transformations .... Example 2.6.4 - McInerney

In summary, the conversation is about constructing linear transformations and finding their matrix representations. Specifically, it concerns an example of constructing a linear transformation using matrix multiplication and solving for the values of the transformation on basis vectors; row-reducing an augmented matrix yields $[Tv]_{B'}$, which is how we get the matrix for $T$ relative to the basis $B'$. The conversation also covers the use of row operations and Gaussian elimination in finding the matrix representation. The book by Andrew McInerney is referenced as a source for further explanation and clarification.
  • #1
Math Amateur
I am reading Andrew McInerney's book: First Steps in Differential Geometry: Riemannian, Contact, Symplectic ...

I am currently focussed on Chapter 2: Linear Algebra Essentials ... and in particular I am studying Section 2.6 Constructing Linear Transformations ...

I need help with a basic aspect of Example 2.6.4 ...

Example 2.6.4 reads as follows:

https://www.physicsforums.com/attachments/5259
View attachment 5260

I do not follow the method outlined for finding the matrix representation of the linear transformation T involved ... that is, I do not follow the setting up of the matrix for ... and the solving of ... the system for the values of \(\displaystyle T(e_1), T(e_2)\) and \(\displaystyle T(e_3)\) ... I do not follow any of this process, including the setting up of the matrix for Gaussian elimination ... I am confused regarding the whole process ... I understand the basics of employing row operations to reduce a matrix to echelon form and then reading off a solution to n equations in n unknowns (or at least I thought I did ... ... :confused: ... ...) but cannot seem to follow what McInerney is doing ...

Can someone carefully explain the process here ... what exactly is going on ...

Hope someone can help ...

Peter

===================================================

So that MHB readers can understand the context and notation of the above post, I am providing the first few pages of McInerney's Section 2.6 Constructing Linear Transformations ... as follows:

https://www.physicsforums.com/attachments/5261
https://www.physicsforums.com/attachments/5262
https://www.physicsforums.com/attachments/5263
https://www.physicsforums.com/attachments/5264
 
  • #2
What we want is the matrix that takes $[v]_B$ and via matrix multiplication, spits out $[Tv]_{B'}$.

Let me clarify with a one-dimensional example:

Suppose I say the point $x$ is "5 from the start". Your immediate question should be...5 what? Miles? Kilometers? Feet? City Blocks?

Assigning a *number* to a distance depends on the unit of measurement. This is a choice of basis. Now, normally, we measure a real number "by ones", that is, relative to the unit vector $1$. But any non-zero real number generates the same "real number line", and we have a *conversion factor*.

For example, if $c \neq 0 \in \Bbb R$, then:

$\Bbb R = \{ac: a \in \Bbb R\}$, since given any real number $r = r1$, we can find $r$ in terms of $c$ like so:

$r = r1 = r\left(\dfrac{c}{c}\right) = \dfrac{r}{c}c$ (and $a = \dfrac{r}{c}$ is certainly a real number, since $c \neq 0$ and $\Bbb R$ is a field).

In higher dimensions, similar things can happen-our "units" could be stretched or shrunken, or our rectangular grid (in two dimensions, for example) could be replaced by a diamond-shaped grid at an angle.

Now, one fundamental fact you should take to heart is this:

(Left-)multiplication of a column vector (array) by an appropriately sized matrix is, in fact, a linear transformation, that is:

$A(x + y) = Ax + Ay$, whenever either of the two sides is defined, and:

$A(cx) = c(Ax)$ in a similar fashion.
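As a quick numerical sanity check of these two properties (a minimal NumPy sketch; the particular matrix and vectors are arbitrary choices, not anything from the book):

```python
import numpy as np

# An arbitrary 2x3 matrix and two arbitrary vectors in R^3
A = np.array([[2.0, 1.0, -1.0],
              [1.0, 0.0,  3.0]])
x = np.array([1.0, 2.0, 3.0])
y = np.array([-1.0, 0.5, 4.0])
c = 2.5

# A(x + y) = Ax + Ay
print(np.allclose(A @ (x + y), A @ x + A @ y))   # True

# A(cx) = c(Ax)
print(np.allclose(A @ (c * x), c * (A @ x)))     # True
```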

Often, this linear transformation is ALSO called "$A$" (although this can be confusing). Some properties of such a linear transformation defined by a matrix:

If $A$ is an $m \times n$ matrix, then:

$A$ (the linear transformation) is surjective if and only if $\text{rank}(A) = m$ (here $A$ refers to the matrix, but it turns out we can define the rank of a linear transformation by the rank of any matrix for it relative to any two bases, more on that later).

$A$ is injective if and only if its null space contains only the $0$-vector.

If $A$ is surjective, then $A$ maps a basis to a spanning set.

If $A$ is injective, then $A$ maps a linearly independent set to a linearly independent set.

Interestingly enough, if $A$ is a SQUARE matrix ($m = n$), then surjectivity is equivalent to injectivity.

That is, a square matrix which, considered as a linear transformation, is either injective or surjective is thus bijective, and represents a linear automorphism.

Since invertible matrices are bijective transformations (when we left-multiply by them), they must map a basis to a basis.

The REVERSE is also true-any linear mapping of a basis to a basis can be represented by an invertible matrix.

***********************

So let's talk about the linear isomorphism $P:\Bbb R^2 \to \Bbb R^2$ that takes the basis $B' = \{(-1,1),(2,1)\}$ to the basis $C = \{(1,0),(0,1)\}$. If we write $w = (w_1,w_2) = [w]_C$, then by $[w]_{B'}$ we mean the same $w$ expressed in the basis $B'$.

So if $w = c_1e'_1 + c_2e'_2 = c_1(-1,1) + c_2(2,1)$, then $[w]_{B'} = (c_1,c_2)$.

So we want to first discover how we can write $P$ as a matrix (it will be a 2x2 matrix). We want:

$P([w]_{B'}) = [w]_C$.

Since a linear transformation is COMPLETELY DETERMINED by its action on a basis, it makes sense to examine:

$P([e'_1]_{B'})$ and $P([e'_2]_{B'})$.

Clearly, $e'_1 = 1e'_1 + 0e'_2$, so $[e'_1]_{B'} = (1,0)$. Similarly, $[e'_2]_{B'} = (0,1)$.

So if $P = \begin{bmatrix}a&b\\c&d\end{bmatrix}$, since we know $[e'_1]_C = (-1,1)$ and $[e'_2]_C = (2,1)$, we have:

$\begin{bmatrix}a&b\\c&d\end{bmatrix}\begin{bmatrix}1\\0\end{bmatrix} = \begin{bmatrix}-1\\1\end{bmatrix}$.

Thus $a = -1$ and $c = 1$.

In the same fashion we have:

$\begin{bmatrix}a&b\\c&d\end{bmatrix}\begin{bmatrix}0\\1\end{bmatrix} = \begin{bmatrix}2\\1\end{bmatrix}$,

so $b = 2$ and $d = 1$.

This is how we get the "left part" of the augmented matrix we're going to row-reduce.
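A small sketch of this construction (assuming NumPy; this is not McInerney's code, it just mirrors the reasoning above): the columns of $P$ are the images of the coordinate vectors $(1,0)$ and $(0,1)$, so we can assemble $P$ by stacking those images as columns.

```python
import numpy as np

# Images of [e'_1]_{B'} = (1,0) and [e'_2]_{B'} = (0,1) in the standard basis C
image_of_e1_prime = np.array([-1.0, 1.0])
image_of_e2_prime = np.array([ 2.0, 1.0])

# Stack the images as columns to obtain P
P = np.column_stack([image_of_e1_prime, image_of_e2_prime])
print(P)
# [[-1.  2.]
#  [ 1.  1.]]

# Check: P sends the coordinate vectors (1,0) and (0,1) to the intended images
print(P @ np.array([1.0, 0.0]))   # [-1.  1.]
print(P @ np.array([0.0, 1.0]))   # [ 2.  1.]
```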

Now row-reduction is multiplication on the left by a series of invertible matrices (each elementary row-operation corresponds to left-multiplication by an invertible matrix). For example, if $E_{R1+3R2}$ is the row-operation which adds three times row 2 to row one, this is the same as multiplying on the left by the matrix:

$B = \begin{bmatrix}1&3\\0&1\end{bmatrix}$

If we row-reduce a 2x2 matrix $P$ to the 2x2 identity matrix, the cumulative effect of the row-operations is thus to left-multiply by $P^{-1}$.

Now the "usual" matrix for $T$ is the matrix that takes $[v]_B$ to $[Tv]_C$. This is what is on the right side of the augmented matrix. Since $P$ takes $[Tv]_{B'}$ to $[Tv]_{C}$, it is clear that $P^{-1}$ does the reverse. But we know how to apply $P^{-1}$ to the matrix on the right side-this is what our row-reduction operations do.

So the net effect is to do this:

$[v]_B \stackrel{T}{\to} [Tv]_C \stackrel{P^{-1}}{\to} [Tv]_{B'}$
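Here is a minimal numerical sketch of that net effect (NumPy; the block $M$ below stands for the right-hand block of the augmented matrix, whose entries only appear later in the thread). Row-reducing $[P \mid M]$ until the left block becomes the identity leaves $P^{-1}M$ on the right, which is the same thing as solving $PX = M$:

```python
import numpy as np

P = np.array([[-1.0, 2.0],
              [ 1.0, 1.0]])

# M stands for the right-hand block of the augmented matrix (values from later in the thread)
M = np.array([[2.0, 3.0, 2.0],
              [1.0, 1.0, 4.0]])

# What full row reduction of [P | M] produces on the right:
X = np.linalg.solve(P, M)          # X = P^{-1} M
print(X)

# Same result, computed explicitly:
print(np.linalg.inv(P) @ M)
```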
 
  • #3
Deveno said:
[... Deveno's post #2 quoted in full; see above ...]
Thanks for the substantial help, Deveno

I appreciate it ...

Just working through your post carefully ... and reflecting on what you have said ...

Thanks again,

Peter
 
  • #4
Deveno said:
[... Deveno's post #2 quoted in full; see above ...]
Hi Deveno

... thanks again for your help ... BUT ... although I followed most of your post I really need some further help ...

I followed your post down to, and including, determining \(\displaystyle P\) as:

\(\displaystyle P = \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} -1 & 2 \\ 1 & 1 \end{bmatrix} \)

and I recognise (just by checking) that it is the same as the matrix on the left of McInerney's augmented matrix in Example 2.6.4 ... but why exactly the matrix P should be there in the augmented matrix, I am not at all sure ... I also recognise the matrix on the right of the augmented matrix as the usual matrix for T in Example 2.6.4 ... but exactly how/why, after reducing the matrix, we can read off the values of \(\displaystyle T(e_1), T(e_2)\) and \(\displaystyle T(e_3)\) as the columns of the reduced matrix, I am not sure ... what is happening exactly ... ?

So ... generally ... I am having trouble connecting the theory and working of your post to McInerney's Example 2.6.4 ... for example, how exactly can converting one basis in \(\displaystyle \mathbb{R}^2\) to another in \(\displaystyle \mathbb{R}^2\) be relevant to an example involving a linear transformation from \(\displaystyle \mathbb{R}^3\) to \(\displaystyle \mathbb{R}^2\) ... ? ... well ... I can vaguely see some possible connections/links ... but generally I am very unsure and somewhat confused ...

Sorry to be slow ... BUT ... can you explain how the theory and computations in your post relate to the solution of McInerney's Example 2.6.4 ...

Note that I am also unsure regarding the nature of the links that McInerney seems to be talking about between Example 2.6.4 and Example 2.6.3 (see below for the text of Example 2.6.3)

Are you able to help ... ... ?

Peter

=====================================================

The above post refers to Example 2.6.3 ... the text of Example 2.6.3 reads as follows:

https://www.physicsforums.com/attachments/5266
https://www.physicsforums.com/attachments/5267

For convenience of readers of this post I am providing (again) Section 2.6 up to and including Example 2.6.4 ... as follows:

https://www.physicsforums.com/attachments/5268
View attachment 5269
View attachment 5270
View attachment 5271
 
  • #5
Let's do a couple things, here. Let's verify that the matrix $P$ really does change $B'$-coordinates, to their "standard form" (in what I am calling the basis $C$).

First, we'll figure this out "the old-fashioned way".

If $w = (w_1,w_2) = c_1(-1,1) + c_2(2,1)$, then expanding the RHS we get:

$w = (2c_2-c_1,c_1+c_2)$, that is:

$w_1 = 2c_2-c_1$
$w_2 = c_1+c_2$.

For example, if we have $w = 2(-1,1) + 3(2,1)$, then:

$w_1 = 6-2 = 4$
$w_2 = 2+3 = 5$

So the standard form of $[(2,3)]_{B'}$ is: $(4,5)$.

Now let's check via matrix multiplication:

$\begin{bmatrix}-1&2\\1&1\end{bmatrix}\begin{bmatrix}c_1\\c_2\end{bmatrix} = \begin{bmatrix}-c_1+2c_2\\c_1+c_2\end{bmatrix}$

as expected.
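The same check in NumPy, as a sketch (just the numbers already used above):

```python
import numpy as np

P = np.array([[-1.0, 2.0],
              [ 1.0, 1.0]])

# [w]_{B'} = (2, 3) means w = 2(-1,1) + 3(2,1)
w_Bprime = np.array([2.0, 3.0])

print(P @ w_Bprime)                                            # [4. 5.]
print(2 * np.array([-1.0, 1.0]) + 3 * np.array([2.0, 1.0]))    # [4. 5.], the same vector
```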

If $P$ is invertible, it should be clear that (left) multiplying by $P^{-1}$ reverses this process. Of course, we don't actually know that $P$ is invertible a priori, but we can verify this by row-reduction. I will show this in laborious detail, so you can see what happens.

The first row-operation I will do is to swap rows one and two. This is the same as left-multiplying by the matrix:

$E_1 = \begin{bmatrix}0&1\\1&0\end{bmatrix}$ (verify this!).

Now we have the matrix $E_1P = \begin{bmatrix}1&1\\-1&2\end{bmatrix}$.

Next, I will add row one to row two, leaving row one unchanged. This is the same as left-multiplying by the matrix:

$E_2 = \begin{bmatrix}1&0\\1&1\end{bmatrix}$, so now we have:

$E_2E_1P = \begin{bmatrix}1&1\\0&3\end{bmatrix}$.

Continuing, I will multiply row 2 by $\frac{1}{3}$, which is the same as left-multiplying by the matrix:

$E_3 = \begin{bmatrix}1&0\\0&\frac{1}{3}\end{bmatrix}$, leaving us with:

$E_3E_2E_1P = \begin{bmatrix}1&1\\0&1\end{bmatrix}$.

Finally, I will subtract row two from row one, leaving row two unchanged. This is the same as left-multiplying by the matrix:

$E_4 = \begin{bmatrix}1&-1\\0&1\end{bmatrix}$, and we have reached:

$E_4E_3E_2E_1P = \begin{bmatrix}1&0\\0&1\end{bmatrix} = I$.

It stands to reason, then, that $E_4E_3E_2E_1 = P^{-1}$.

We can verify this, by computing the four-fold product long-hand:

$E_4E_3 = \begin{bmatrix}1&-1\\0&1\end{bmatrix}\begin{bmatrix}1&0\\0&\frac{1}{3}\end{bmatrix} = \begin{bmatrix}1&-\frac{1}{3}\\0&\frac{1}{3}\end{bmatrix}$

$E_4E_3E_2 = \begin{bmatrix}1&-\frac{1}{3}\\0&\frac{1}{3}\end{bmatrix}\begin{bmatrix}1&0\\1&1\end{bmatrix} = \begin{bmatrix}\frac{2}{3}&-\frac{1}{3}\\ \frac{1}{3}&\frac{1}{3}\end{bmatrix}$

$E_4E_3E_2E_1 = \begin{bmatrix}\frac{2}{3}&-\frac{1}{3}\\ \frac{1}{3}&\frac{1}{3}\end{bmatrix} \begin{bmatrix}0&1\\1&0\end{bmatrix} = \begin{bmatrix}-\frac{1}{3}&\frac{2}{3}\\ \frac{1}{3}&\frac{1}{3}\end{bmatrix}$.

We already saw that $E_4E_3E_2E_1$ is a left-inverse for $P$, so let's show it's a right-inverse also:

$PE_4E_3E_2E_1 = \begin{bmatrix}-1&2\\1&1\end{bmatrix}\begin{bmatrix}-\frac{1}{3}&\frac{2}{3}\\ \frac{1}{3}&\frac{1}{3}\end{bmatrix} = \begin{bmatrix}1&0\\0&1\end{bmatrix}$.

So, $P$ is, in fact, invertible. Thus $P^{-1}$ "undoes what $P$ does". For example, we saw above that $P$ takes $(2,3)$ to $(4,5)$. Thus (if we've done our arithmetic correctly), $P^{-1}$ should take $(4,5)$ to $(2,3)$.

$\begin{bmatrix}-\frac{1}{3}&\frac{2}{3}\\ \frac{1}{3}&\frac{1}{3}\end{bmatrix}\begin{bmatrix}4\\5\end{bmatrix} = \begin{bmatrix}-\frac{4}{3}+\frac{10}{3}\\ \frac{4}{3}+\frac{5}{3}\end{bmatrix} = \begin{bmatrix}2\\3\end{bmatrix}$
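The same bookkeeping in NumPy, as a sketch: build the four elementary matrices, check that their product reduces $P$ to the identity on both sides, and check that this product (which is $P^{-1}$) sends $(4,5)$ back to $(2,3)$.

```python
import numpy as np

P = np.array([[-1.0, 2.0],
              [ 1.0, 1.0]])

E1 = np.array([[0.0, 1.0], [1.0, 0.0]])         # swap rows 1 and 2
E2 = np.array([[1.0, 0.0], [1.0, 1.0]])         # add row 1 to row 2
E3 = np.array([[1.0, 0.0], [0.0, 1.0/3.0]])     # multiply row 2 by 1/3
E4 = np.array([[1.0, -1.0], [0.0, 1.0]])        # subtract row 2 from row 1

E = E4 @ E3 @ E2 @ E1
print(np.allclose(E @ P, np.eye(2)))            # True: left inverse
print(np.allclose(P @ E, np.eye(2)))            # True: right inverse
print(np.allclose(E, np.linalg.inv(P)))         # True
print(E @ np.array([4.0, 5.0]))                 # [2. 3.]
```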

Hopefully, this convinces you that applying $P$ changes $B'$-coordinates to standard coordinates, and applying $P^{-1}$ does the opposite.

Now what the "standard matrix" for $T$ does, is take $B$-coordinates (which ARE the standard coordinates for $\Bbb R^3$) of a vector $v$, and spits out (after left-multiplication) $Tv$ in standard coordinates (what I named the basis $C$).

Since we want our final answer expressed in $B'$-coordinates, we need to apply $P^{-1}$ to $Tv$ in standard coordinates; we want:

$P^{-1}(Tv)$.

Row-reducing both sides of the augmented matrix simultaneously does exactly that. The only reason we start with $P$ on the left is so that when we row-reduce IT to the identity, we know we're done. (Row-reduction is a very computationally efficient way to find an inverse; well, not so much for a 2x2 matrix, but certainly for anything 4x4 and larger. With 3x3 matrices it sort of depends on how many 0 entries the matrix has; sometimes using the adjugate matrix is faster.)

As to your question of how a 2x2 matrix is involved with a 2x3 matrix, our original matrix does this:

(2x3)(3x1) = 2x1

Multiplying on the left by a 2x2 matrix is still kosher:

[(2x2)(2x3)](3x1) = 2x1

In fact, this example sort of "over-simplifies" the situation, by starting with $B$ as the standard basis:

If $T:\Bbb R^3 \to \Bbb R^2$ is the linear transformation we started with, and we want to take vectors in $\Bbb R^3$ expressed in a non-standard basis (say $C'$) to our non-standard basis $B'$ in $\Bbb R^2$, there is ANOTHER matrix (a 3x3 invertible matrix), say $Q$, which changes $C'$-coordinates to $B$-coordinates, and we have:

$[T]_{C',B'} = P^{-1}A_TQ$, where $A_T$ is the "standard matrix" for $T$.

This trick (of changing bases to get our matrix in a simple, desirable form) lies at the heart of what is often the pinnacle of many first-year linear algebra courses, the Singular Value Decomposition Theorem.
 
  • #6
Deveno said:
[... Deveno's post #5 quoted in full; see above ...]
Thanks for the substantial help, Deveno ... Appreciate it ...

Will be working through your post in detail shortly ...

Thanks again,

Peter
 
  • #7
Deveno said:
[... Deveno's post #5 quoted in full; see above ...]
Thanks again for the help, Deveno ...

You write:

" ... ... Hopefully, this convinces you that applying $P$ changes $B'$-coordinates to standard coordinates, and applying $P^{-1}$ does the opposite. ... ... "

I followed your post easily (because it was so clear ... ) down to this point but began having some troubles ...

Just a point of clarification ... ...

You write:

" ... ... Now what the "standard matrix" for $T$ does, is take $B$-coordinates (which ARE the standard coordinates for $\Bbb R^3$) of a vector $v$, and spits out (after left-multiplication) $Tv$ in standard coordinates (what I named the basis $C$). ... ... "

Are you referring to T and B in McInerney's Example 2.6.4? I suspect you are ... BUT ... McInerney's B does not seem to me to be the standard coordinates (are they "coordinates" or basis members, by the way ...) \(\displaystyle (1,0,0), (0,1,0), (0,0,1)\) ... ... in fact they are \(\displaystyle e_1 = (1,0,0), e_2 = (1,1,0), \text{ and } e_3 = (1,1,1)\) ... ...

Am I misunderstanding something? Can you clarify ...

Still reflecting on the final part of your post ... particularly on the nature of the augmented matrix and its transformation ...

Thanks again for all your help ...

Peter
 
  • #8
Peter said:
[... Peter's post #7 quoted in full; see above ...]

My apologies. I mis-read the original problem.

Let's go back to the original $T$:

$T(x,y,z) = (2x+y-z,x+3z)$

This has the matrix:

$A_T = \begin{bmatrix}2&1&-1\\1&0&3\end{bmatrix}$

As you have graciously pointed out, this is NOT the 2x3 matrix on the RHS of the augmented matrix. As I indicated previously, the RHS ought to be:

$A_TQ$, where $Q$ is a change-of-basis matrix that takes $B$-coordinates to standard coordinates. That is, we should have:

$Qe_1 = Q(1e_1 + 0e_2 + 0e_3) = Q(e_1) = (1,0,0)$,
$Qe_2 = Q(0e_1 + 1e_2 + 0e_3) = Q(e_2) = (1,1,0)$,
$Qe_3 = Q(0e_1 + 0e_2 + 1e_3) = Q(e_3) = (1,1,1)$, or, in matrix form:

$Q\begin{bmatrix}1\\0\\0\end{bmatrix} = \begin{bmatrix}1\\0\\0\end{bmatrix}$

$Q\begin{bmatrix}0\\1\\0\end{bmatrix} = \begin{bmatrix}1\\1\\0\end{bmatrix}$

$Q\begin{bmatrix}0\\0\\1\end{bmatrix} = \begin{bmatrix}1\\1\\1\end{bmatrix}$

It's not hard to see that $Q$ must be the matrix:

$Q = \begin{bmatrix}1&1&1\\0&1&1\\0&0&1\end{bmatrix}$, and thus we have:

$A_TQ = \begin{bmatrix}2&1&-1\\1&0&3\end{bmatrix}\begin{bmatrix}1&1&1\\0&1&1\\0&0&1\end{bmatrix} = \begin{bmatrix}2&3&2\\1&1&4\end{bmatrix}$

and that IS the RHS of the augmented matrix.

McInerney takes a simpler approach: instead of converting to standard coordinates via $Q$ like this:

$[v]_B \to [v]_{\text{std}} \to [Tv]_{\text{std}}$

he just calculates $Tv$, in standard coordinates, for each $B$-basis vector $v$ written in standard coordinates (we just focus on basis elements because we can extend by linearity):

$T(e_1) = T(1,0,0) = (2,1)$
$T(e_2) = T(1,1,0) = (3,1)$
$T(e_3) = T(1,1,1) = (2,4)$

thus computing $A_TQ$ in one step.
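Putting the whole of Example 2.6.4 together numerically (a NumPy sketch; the matrices are exactly the ones above, and solving $PX = A_TQ$ reproduces what row-reducing the augmented matrix $[P \mid A_TQ]$ does):

```python
import numpy as np

# Standard matrix of T(x,y,z) = (2x + y - z, x + 3z)
A_T = np.array([[2.0, 1.0, -1.0],
                [1.0, 0.0,  3.0]])

# Q: B-coordinates -> standard coordinates; its columns are the B-basis vectors
Q = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

# P: B'-coordinates -> standard coordinates; its columns are the B'-basis vectors
P = np.array([[-1.0, 2.0],
              [ 1.0, 1.0]])

RHS = A_T @ Q
print(RHS)     # [[2. 3. 2.], [1. 1. 4.]] -- the right block of the augmented matrix

# Row-reducing [P | A_T Q] to [I | X] leaves X = P^{-1} A_T Q on the right,
# whose columns are [T(e_1)]_{B'}, [T(e_2)]_{B'}, [T(e_3)]_{B'}
X = np.linalg.solve(P, RHS)
print(X)
```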
 
  • #9
Deveno said:
[... Deveno's post #8 quoted in full; see above ...]
Hi Deveno,

Thanks again for all your help on this problem ... I have learned a lot ... BUT ... I do have some further questions ... ...

You write:

" ... ... Let's go back to the original $T$:

$T(x,y,z) = (2x+y-z,x+3z)$

This has the matrix:

$A_T = \begin{bmatrix}2&1&-1\\1&0&3\end{bmatrix}$

... ... ... ... "

Further on in your post (I think) you imply that the matrix \(\displaystyle A_T\) that can be, as it were, "read off" the statement of the transformation \(\displaystyle T\) is in terms of the basis \(\displaystyle B\) ... that is, its columns are \(\displaystyle B\)-coordinates (oh my God ... maybe they are \(\displaystyle B'\)-coordinates!) ... so that the matrix \(\displaystyle A_T\) is NOT in terms of the standard basis ... ?

Can you please help me with the nature of \(\displaystyle A_T\) in general ... that is, the matrix we "read off" the transformation, so to speak ... what basis are the numbers relative to? ... :confused: ...
[ ... ... ... I may be just deluded ... but ... I thought you said in a previous post that when dealing with linear transformations from \(\displaystyle \mathbb{R}^n\) to \(\displaystyle \mathbb{R}^m\) that we (by convention) state the transformation in numbers that are relative to the standard basis of \(\displaystyle \mathbb{R}^m\) ... have I understood you correctly ... ? ... maybe I did not follow you correctly ... ... ... ]
Can you please clarify the issues above ... I really need some help here ...

Peter
 
  • #10
Peter said:
[... Peter's post #9 quoted in full; see above ...]

$(x,y,z) = x(1,0,0) + y(0,1,0) + z(0,0,1)$

$T(x,y,z) = (2x+y-z,x+3z) = (2x+y-z)(1,0) + (x+3z)(0,1)$.

By linearity:

$T(x,y,z) = xT(1,0,0) + yT(0,1,0) + zT(0,0,1)$.

By definition (of $T$):

$T(1,0,0) = (2+0-0,1+0) = (2,1) = 2(1,0) + 1(0,1)$
$T(0,1,0) = (0+1-0,0+0) = (1,0) = 1(1,0) + 0(0,1)$
$T(0,0,1) = (0+0-1,0+3) = (-1,3) = -1(1,0) + 3(0,1)$

Now, an interesting fact about matrices is:

$Ae_j$ where $e_j$ is the $j$-th standard basis vector in the standard basis essentially picks out the $j$-th column of $A$.

So to "read off" the matrix of $T$ in the *standard* basis, we simply form the matrix whose columns are the images of the standard basis vectors (in the standard basis).

So the first column of $A_T$ is $\begin{bmatrix}2\\1\end{bmatrix}$, the second column is $\begin{bmatrix}1\\0\end{bmatrix}$, and the third column is $\begin{bmatrix}-1\\3\end{bmatrix}$.

In the notation of my earlier posts, this is $[T]_{C',C}$ (here $C'$ denotes the standard basis of $\Bbb R^3$ and $C$ the standard basis of $\Bbb R^2$).

As we have seen, $[T]_{B,B'} = P^{-1}[T]_{C',C}Q$

The augmented matrix we started with was:

$P|[T]_{C',C}Q$.

Often, in these types of problems, the linear transformation is from $\Bbb R^n \to \Bbb R^n$, and the "alternate basis" is the same for domain and co-domain, so that $P = Q$, and we have something like:

$[T]_C = P^{-1}[T]_BP$.
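A small sketch of that last identity (NumPy; the operator $S$ below is a made-up example on $\Bbb R^2$, not from the book, while the alternate basis is the $B'$ of this thread): with $P$'s columns being the alternate basis vectors, the matrix of $S$ in that basis is $P^{-1}[S]_{\text{std}}P$.

```python
import numpy as np

# A hypothetical operator S on R^2, given by its matrix in the standard basis
S_std = np.array([[1.0, 2.0],
                  [0.0, 3.0]])

# P: columns are the alternate basis vectors B' = {(-1,1), (2,1)}
P = np.array([[-1.0, 2.0],
              [ 1.0, 1.0]])

# Matrix of S relative to B' (same basis used for domain and codomain)
S_Bprime = np.linalg.inv(P) @ S_std @ P
print(S_Bprime)

# Consistency check on some B'-coordinate vector c:
c = np.array([2.0, 3.0])
v = P @ c                                              # the same vector in standard coordinates
print(np.allclose(P @ (S_Bprime @ c), S_std @ v))      # True
```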
 
  • #11
Deveno said:
[... Deveno's post #10 quoted in full; see above ...]
Thanks once again, Deveno ... just to restate what I think you have said ... to be certain about what is going on here ...

So you are stating the convention that in the statement of \(\displaystyle T: \mathbb{R}^3 \longrightarrow \mathbb{R}^2\) as

\(\displaystyle T(x,y,z) = (2x + y - z, x + 3z) \)

we have that the \(\displaystyle (x,y,z)\) is relative to the standard basis \(\displaystyle S_1 = \{ (1, 0, 0), (0, 1, 0), (0, 0, 1) \}\) and the \(\displaystyle (2x + y - z, x + 3z)\) is stated relative to the standard basis \(\displaystyle S_2 = \{ (1, 0), (0, 1) \}\) ... ... and not relative to the bases B and B' ...

Is that correct? That is ... it is the convention of representation in standard bases ...?

It seems to me that it would be more logical to have \(\displaystyle (x, y, z)\) be expressed relative to B ... ... that is, that \(\displaystyle (x, y, z)\) represent coordinates relative to B ... so ... we would have ...

\(\displaystyle (x, y, z) = x(1, 0, 0) + y(1,1,0) + z(1, 1, 1)\)

and also to have \(\displaystyle (2x + y - z, x + 3z)\) representing coordinates in the declared basis of the co-domain of T, namely B' ... so ... we would have ...

\(\displaystyle T(x,y,z) = (2x + y - z, x + 3z) = (2x + y - z)(-1, 1) + (x + 3z)(2,1)\)

... ...

It seems more sensible for the transformation to be declared in terms of coordinates of the declared bases, namely \(\displaystyle B\) and \(\displaystyle B'\) ... do you know why we do not have this convention ...

Can you comment on the above ...

Peter

*** EDIT ***

Just a "supplementary" question ... does the convention regarding the statement/declaration of a transformation apply to transformations that are not from \(\displaystyle \mathbb{R}^n\) to \(\displaystyle \mathbb{R}^m\) ... ... say from the space of nth degree polynomials to the space of mth degree polynomials ... ?

Peter
 
  • #12
Deveno said:
[... Deveno's post #8 quoted in full; see above ...]
Hi Deveno,

Thanks so much for the help ... ...

Just some points of clarification regarding your post above ...

You write:

" ... ... $A_TQ$, where $Q$ is a change-of-basis matrix that takes $B$-coordinates to standard coordinates. ... ... "

Question 1

Are you able to explain why \(\displaystyle A_T Q\) is what is needed on the right side of the augmented matrix ... ?

Question 2

You mention that \(\displaystyle Q\) is a matrix that takes \(\displaystyle B\)-coordinates to standard coordinates ... but then (slightly further on) when you write:

" ... ...

$Q\begin{bmatrix}1\\0\\0\end{bmatrix} = \begin{bmatrix}1\\0\\0\end{bmatrix}$

$Q\begin{bmatrix}0\\1\\0\end{bmatrix} = \begin{bmatrix}1\\1\\0\end{bmatrix}$

$Q\begin{bmatrix}0\\0\\1\end{bmatrix} = \begin{bmatrix}1\\1\\1\end{bmatrix}$

... ... "... ... it looks as if Q is taking standard coordinates and transforming them to standard coordinates ... that is taking standard coordinates to B-coordinates ...

Can you comment? Am I misunderstanding something?

Question 3

You write:

" ... ... It's not hard to see that $Q$ must be the matrix:

$Q = \begin{bmatrix}1&1&1\\0&1&1\\0&0&1\end{bmatrix}$ ... ... "

Sorry Deveno ... I cannot see exactly why/how \(\displaystyle Q\) must be of that form ... can you help?

Hope you can help with the above issues/questions ...

Peter
 
  • #13
Peter said:
Hi Deveno,

Thanks so much for the help ... ...

Just some points of clarification regarding your post above ...

You write:

" ... ... $A_TQ$, where $Q$ is a change-of-basis matrix that takes $B$-coordinates to standard coordinates. ... ... "

Question 1

Are you able to explain why \(\displaystyle A_T Q\) is what is needed on the right side of the augmented matrix ... ?

The goal is to get a matrix that inputs $[v]_B$ and outputs (after left-multiplication by our matrix) $[Tv]_{B'}$. If we *have* a matrix that takes $[v]_{\text{std}}$ and outputs $[Tv]_{\text{std}}$, then to USE *that* matrix, we need to convert $B$-coordinates to standard coordinates (that is what $Q$ does), and then hit it with $A_T$, which leaves us with $Tv$ in standard coordinates. The row-reduction process in the exercise then effectively converts what we have so far into $B'$-coordinates.
Question 2

You mention that \(\displaystyle Q\) is a matrix that takes \(\displaystyle B\)-coordinates to standard coordinates ... but then (slightly further on) when you write:

" ... ...

$Q\begin{bmatrix}1\\0\\0\end{bmatrix} = \begin{bmatrix}1\\0\\0\end{bmatrix}$

$Q\begin{bmatrix}0\\1\\0\end{bmatrix} = \begin{bmatrix}1\\1\\0\end{bmatrix}$

$Q\begin{bmatrix}0\\0\\1\end{bmatrix} = \begin{bmatrix}1\\1\\1\end{bmatrix}$

... ... "... ... it looks as if Q is taking standard coordinates and transforming them to standard coordinates ... that is taking standard coordinates to B-coordinates ...

Can you comment? Am I misunderstanding something?

Yes, it does appear that way, doesn't it? That is why these sorts of things confuse students of linear algebra. Matrices don't actually "change bases". They just (via matrix multiplication) change $n$-tuples into $m$-tuples. However, if these tuples REPRESENT the coordinates (the scalar coefficients) of a vector in a certain basis, then an invertible matrix can change a coordinate representation in one basis, to one in another.

In other words, the $(1,0,0)$ column vector $Q$ "hits" is NOT representing $1(1,0,0) + 0(0,1,0) + 0(0,0,1)$, but rather:
$1(1,0,0) + 0(1,1,0) + 0(1,1,1)$.

An $n$-tuple can be used to represent (in an obvious way) a linear combination. However, you must always keep asking yourself-"linear combination of WHAT?". In the basis $\{v_1,v_2,v_3\}$, the triple $(1,0,0)$ means $v_1$. In the basis $\{v_1+v_2,v_1-v_2,v_3\}$ the triple $(1,0,0)$ means $v_1+v_2$ which is not the same vector.

Equal coordinates do not mean equal vectors unless the bases are the same.
Question 3

You write:

" ... ... It's not hard to see that $Q$ must be the matrix:

$Q = \begin{bmatrix}1&1&1\\0&1&1\\0&0&1\end{bmatrix}$ ... ... "

Sorry Deveno ... I cannot see exactly why/how \(\displaystyle Q\) must be of that form ... can you help?

Hope you can help with the above issues/questions ...

Peter

It's "read off" from the values of $Q$ on the triples $(1,0,0), (0,1,0), (0,0,1)$ much the same as we "read off" the matrix $A_T$ in the standard basis:

The matrix product of an $m \times n$ matrix and an $n \times 1$ column matrix that is all 0's except for a 1 in the $j$-th row (=place, since all rows are just a single entry long), picks out the $j$-th column of the matrix. Try it and see!
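A one-line check of this fact (NumPy sketch; the matrix is arbitrary):

```python
import numpy as np

A = np.array([[2.0, 1.0, -1.0],
              [1.0, 0.0,  3.0]])

e2 = np.array([0.0, 1.0, 0.0])          # all 0's except a 1 in the 2nd place
print(A @ e2)                           # [1. 0.] -- the 2nd column of A
print(np.allclose(A @ e2, A[:, 1]))     # True
```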
 

1. What are matrices of linear transformations?

Matrices of linear transformations are mathematical representations of linear transformations between vector spaces. They are used to describe how a vector is transformed by a linear transformation, and can be thought of as a set of instructions for transforming a vector from one space to another.

2. What is the purpose of using matrices of linear transformations?

The main purpose of using matrices of linear transformations is to simplify and generalize the process of applying linear transformations. By representing a linear transformation as a matrix, we can easily perform operations such as composition and inverse transformations.
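For instance (a minimal NumPy sketch with made-up matrices, just to illustrate the point): composing two linear transformations corresponds to multiplying their matrices, and inverting an invertible transformation corresponds to inverting its matrix.

```python
import numpy as np

# Hypothetical linear maps: T: R^3 -> R^2 and S: R^2 -> R^2, given by matrices
T = np.array([[2.0, 1.0, -1.0],
              [1.0, 0.0,  3.0]])
S = np.array([[0.0, -1.0],
              [1.0,  0.0]])             # rotation by 90 degrees

v = np.array([1.0, 2.0, 3.0])

# The composition S o T is represented by the matrix product S @ T
print(np.allclose(S @ (T @ v), (S @ T) @ v))     # True

# The inverse transformation of S is represented by the matrix inverse
u = np.array([1.0, 2.0])
print(np.linalg.inv(S) @ (S @ u))                # [1. 2.] -- recovers u
```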

3. How are matrices of linear transformations related to vector spaces?

Matrices of linear transformations are closely related to vector spaces, as they represent the transformations between different vector spaces. They can also be used to define and manipulate vector spaces, as well as to solve systems of linear equations.

4. Can matrices of linear transformations only be used for two-dimensional transformations?

No, matrices of linear transformations can be used for transformations in any dimension. They can also be used for non-geometric transformations, such as those involving functions or data sets.

5. How are matrices of linear transformations used in real-world applications?

Matrices of linear transformations have a wide range of applications in fields such as engineering, computer graphics, and physics. They are used to model and analyze systems that involve linear transformations, such as transformations of coordinates or physical forces.
