MHB All bases go to standard basis?

ognik
I am sure this is very 'basic' :-) and I've not read it anywhere, but it has just struck me that every basis of $R^n$ can be row reduced to the standard basis for $R^n$, $(e_1, e_2, \ldots, e_n)$.

Row reducing is the way I know to test whether a given matrix of column vectors forms a basis. Row reducing also doesn't change the space... I hope this is all true?

That has left me uncertain about the relationship between some arbitrary basis of $R^n$ and the standard basis for $R^n$: is there one? I think not for a subspace of dimension lower than $n$, but what about a subspace of the same dimension?
 
Hi ognik,

I'm not sure I have really understood your question, but I think this could at least shed some light on it.

There is a well known theorem in linear algebra that can be stated as follows:

Theorem
Given two vector spaces $V, W$, a basis $\{\alpha_{1},\ldots ,\alpha_{n}\}$ of $V$, and a set of $n$ vectors $\{\beta_{1}, \ldots , \beta_{n}\}\subset W$, there exists a unique linear transformation $T:V\longrightarrow W$ such that $T(\alpha_{i})=\beta_{i}$ for each $i$.

In your case, both $V$ and $W$ are $\Bbb{R}^{n}$ and the set of vectors in $W$ is also a basis, so yes, any two bases of $\Bbb{R}^{n}$ are related to each other. More specifically, any basis is related to the standard basis by a linear transformation.

As you probably know, linear transformations can be represented as matrix-vector products, and in this special case those matrices are called change-of-basis matrices.
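To make the matrix-vector picture concrete, here is a minimal numpy sketch (the vectors below are made up for illustration): the matrix representing a linear transformation $T$ simply has the images of the standard basis vectors as its columns.

import numpy as np

# Made-up images of the standard basis vectors e1, e2 under some linear map T
T_e1 = np.array([2.0, 1.0])
T_e2 = np.array([1.0, 1.0])

# The matrix representing T has those images as its columns
T = np.column_stack([T_e1, T_e2])

v = np.array([3.0, 1.0])   # a vector written in standard coordinates
print(T @ v)               # T(v) = 3*T(e1) + 1*T(e2) = [7. 4.]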

So, for example, given a basis $\mathcal{V}=\{v_{1},\ldots ,v_{n}\}$ of $\Bbb{R}^{n}$, can you figure out what the change-of-basis matrix is, starting with the standard basis and ending with $\mathcal{V}$?
 
Some other things that are true:

A square matrix of full rank has linearly independent columns.

A square matrix that can be row-reduced to the identity matrix has full rank.

The matrix for taking the standard basis to the standard basis is the identity matrix.

Row-reduction operations are reversible, and can be represented by invertible matrices. For example, in the $3 \times 3$ case, the row-operation that adds twice the third row to the second is left-multiplication by the matrix:

$\begin{bmatrix}1&0&0\\0&1&2\\0&0&1\end{bmatrix}$

whose inverse subtracts twice the third row from the second:

$\begin{bmatrix}1&0&0\\0&1&-2\\0&0&1\end{bmatrix}$
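As a quick numerical check (a numpy sketch; the matrix $A$ below is made up), left-multiplying by these matrices really does perform and undo the row operation:

import numpy as np

E     = np.array([[1, 0, 0],
                  [0, 1, 2],
                  [0, 0, 1]])      # adds twice row 3 to row 2
E_inv = np.array([[1, 0, 0],
                  [0, 1, -2],
                  [0, 0, 1]])      # subtracts twice row 3 from row 2

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])          # any 3x3 matrix will do

print(E @ A)                        # row 2 becomes [18, 21, 24]
print(E_inv @ (E @ A))              # back to the original A
print(E @ E_inv)                    # the identity matrix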

The net effect of reducing to row-reduced echelon form is to left-multiply the matrix $A$ by an invertible matrix $P$.

Now if $PA = I$, multiplying by $P^{-1}$ on the left gives us:

$P^{-1}(PA) = P^{-1}I$
$(P^{-1}P)A = P^{-1}$
$IA = P^{-1}$
$A = P^{-1}$

Hence $A^{-1} = (P^{-1})^{-1} = P$
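One way to see this concretely (a sympy sketch with a made-up matrix) is to row-reduce the augmented matrix $[A \mid I]$: the same row operations that turn $A$ into $I$ turn $I$ into $P$, and indeed $P = A^{-1}$:

from sympy import Matrix, eye

A = Matrix([[2, 1],
            [1, 1]])               # invertible, so it row-reduces to I

# Row-reduce [A | I]; the right-hand block records the row operations,
# i.e. the invertible matrix P with P*A = I.
R, _ = A.row_join(eye(2)).rref()
P = R[:, 2:]

print(P)                            # Matrix([[1, -1], [-1, 2]])
print(P * A == eye(2))              # True: P*A = I
print(P == A.inv())                 # True: P is A's inverse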

So, if $A$ can be row-reduced to the identity, then $A$ is invertible, which means in particular that its range (as the function which takes a vector $v$ to $Av$) is all of $\Bbb R^n$ (that is, $A$ is surjective, or onto, as a function). This means that its columns (which span its range) must be linearly independent, since we have a spanning set of $n$ vectors and the dimension of $\Bbb R^n$ is $n$.

On the other hand, suppose the columns of an $n \times n$ matrix $A$ are linearly independent. Then these form a basis for $\Bbb R^n$, since $A$ has $n$ columns.

If we label these columns as $u_1,\dots,u_n$, linear independence means that:

$c_1u_1 +\cdots + c_nu_n = 0 \implies c_1 = \dots = c_n = 0$.

Since any vector $Av$ is in the column space spanned by the $u_i$, this means the ONLY solution to $Av = 0$ is $v = 0$.

Recall that the "general solution" of $Av = b$ is:

$v = x_0 + x$, where $x_0$ is one PARTICULAR solution, and $x$ is ANY solution of $Av = 0$.

For our particular $A$, we know there is only one solution of $Av = 0$, namely $v = 0$. This means there is only one particular solution $x_0$ (for if there were two, say $x_0,x_1$, with $x_0 \neq x_1$, we would have:

$Ax_0 = b$ and $Ax_1 = b$ so that $Ax_0 - Ax_1 = b - b = 0$, and thus $A(x_0 - x_1) = 0$, meaning $x_0 - x_1 = 0$, by the uniqueness of this solution. But that means $x_0 = x_1$, contradicting that we have two solutions).

Now since the columns of $A$ span $\Bbb R^n$ (being a LI set of cardinality $n$), we always have at least ONE solution to $Av = b$, and now we see we have EXACTLY one.
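To see the "particular plus homogeneous" structure numerically, here is a small sympy sketch with made-up matrices: when the columns are dependent there is a whole family of solutions, and when they are independent the homogeneous part collapses to zero, leaving exactly one.

from sympy import Matrix

# Dependent columns: Av = b has a whole line of solutions
A = Matrix([[1, 2],
            [2, 4]])
b = Matrix([3, 6])
x0 = Matrix([3, 0])                 # one particular solution: A*x0 = b
n  = A.nullspace()[0]               # spans the solutions of Av = 0
print(A * x0 == b)                  # True
print(A * (x0 + 5 * n) == b)        # True: particular + homogeneous is again a solution

# Independent columns: the null space is trivial, so the solution is unique
B = Matrix([[3, -2],
            [1, 1]])
print(B.nullspace())                # [] -- only the zero vector solves Bv = 0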

Now let's look at what this means for row-reduction: in particular for the leading 1's of our rows. These always move "to the right" in successive rows.

Claim 1: each leading 1 is exactly "one right" of the row above. For if not, as we go down, we would run out of room "to go right" before we reached the last row, leading to at least one zero row at the bottom. Such a zero row would lead to:

$PAx = \begin{bmatrix}1&\ast&\dots&\ast\\0&\ast&\dots&\ast\\ \vdots&\vdots&\ddots&\vdots\\0&0&\dots&0\end{bmatrix}\begin{bmatrix}x_1\\x_2\\ \vdots\\x_n\end{bmatrix} = \begin{bmatrix}y_1\\y_2\\ \vdots\\0\end{bmatrix}$

In particular, there is NO $x \in \Bbb R^n$ with $PAx = e_n$.

Why is this a problem? Well, since $P$ is invertible, there is no $x \in \Bbb R^n$, with $Ax = P^{-1}e_n$ (for if there were, then $PAx = P(P^{-1}e_n) = Ie_n = e_n$, contradiction).

But EVERY vector in $\Bbb R^n$ is $Av$ for some $v$ (because the column space spans $\Bbb R^n$ since the columns are a basis), including whatever $P^{-1}e_n$ is. So such a zero row cannot happen.

Claim 1 means our row-reduced echelon form is:

$PA = \begin{bmatrix}1&\ast&\dots&\ast\\0&1&\dots&\ast\\ \vdots&\vdots&\ddots&\vdots\\0&0&\dots&1\end{bmatrix}$

Since we are in row-reduced echelon form, all the $\ast$ entries above each leading 1 are zeros:

$PA = I$.

In other words, $A$ is an $n \times n$ matrix of basis vectors if and only if the row-reduced echelon form of $A$ is the identity matrix, as you indeed conjectured.
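So the test you describe can be packaged directly; here is a small sympy sketch (the example vectors are arbitrary):

from sympy import Matrix, eye

def is_basis(vectors):
    # Columns of R^n form a basis iff the matrix they build row-reduces to I
    A = Matrix.hstack(*[Matrix(v) for v in vectors])
    if not A.is_square:
        return False
    rref_form, _ = A.rref()
    return rref_form == eye(A.rows)

print(is_basis([[3, 1], [-2, 1]]))        # True: the two columns span R^2
print(is_basis([[1, 2], [2, 4]]))         # False: second column is twice the first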

As an aside, you remark that "row-reduction doesn't change the space". It doesn't change the SOLUTION space; it can change the COLUMN space. The first is the elements that $A$ acts on, the second is the RESULTS. What NEVER changes is the DIMENSION of these spaces, and if $A$ is ONTO (if it indeed has columns that form a basis), then the column space doesn't change since "all of $\Bbb R^n$" is still all of $\Bbb R^n$ no matter how you are slicing and dicing it.

If $A$ isn't onto (if it row-reduces to something other than the identity), then to get the column space of $A$ you have to reference the "pivot columns" of the row-reduced matrix in the original $A$, not the columns of the row-reduced matrix.
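For instance (a sympy sketch with a made-up matrix), the pivot information from the rref tells you which columns of the original $A$ to keep:

from sympy import Matrix

A = Matrix([[1, 2, 1],
            [2, 4, 0],
            [3, 6, 1]])             # column 2 is twice column 1

_, pivot_cols = A.rref()
print(pivot_cols)                    # (0, 2): keep columns 1 and 3 of the ORIGINAL A
basis = [A.col(j) for j in pivot_cols]
print(basis)                         # a basis for the column space of A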

Aside number 2: linear transformations are ALGEBRA, matrices are ARITHMETIC. We can "mirror" the algebraic structure of linear transformations in the arithmetic of matrices (which use actual numbers we can add, and multiply, etc.), but to do so, we have to "pick" a basis, and this choice can be done in "more than one way". Mathematicians prefer, in general, proofs that don't require such a choice of basis, because these show the underlying structure inherent in just the "vectorness of vectors", whereas numerical or algorithmic arguments depend on our basis coordinates, which are somewhat arbitrarily chosen (Imagine modelling vector forces in physical space. The choice of $x,y,z$ axes and the origin's location aren't part of the physical problem, they are something we IMPOSE upon the situation to make life easier for us. Sometimes, halfway through this process, we realize a different choice makes for easier calculations-the vectors (the underlying algebra) remain the same, but the coordinates (the numbers we calculate with) can be totally different).
 
Thanks Fallen Angel & Deveno.

I knew I hadn't phrased the question very well, and reading your posts has helped, but in the meantime another thought struck me: the vectors of the standard basis are all mutually orthonormal, but other bases of the same space will not ALL be mutually orthonormal, unless they are scalar multiples of the standard basis?

Mention of particular solutions might have helped me clarify my original question: could a solution set for a subspace $V$ of $R^n$ be made up of a combination of the standard basis plus a combination of the basis for $V$? Sort of like in ODEs: particular + general solution?
----------
So it seems the change-of-basis matrix (following Fallen Angel's suggestion) is the matrix $P$ whose columns are the vectors of the relevant $V$. This is a different relationship from the one I was trying to express, but I'm very glad you raised it.

Here I'm a little fuzzy (this stuff is new), so I will try this with an example in $R^2$:

We have E=$ \left[ \begin{array}{cc}1 & 0 \\0 & 1 \end{array} \right]$, let's say $ V= \left\{ \left[ \begin{array}{c}3 \\1 \end{array}\right], \left[ \begin{array}{c}-2 \\1 \end{array}\right] \right\}$, so $ P= \left[ \begin{array}{cc}3 & -2 \\1 & 1 \end{array}\right] $ ?

Now we look at co-ordinates: let's have a vector $ \vec{a} $ whose co-ords are $ [a]_V = \left[ \begin{array}{c}2 \\1 \end{array}\right] $. Then $ [a]_E = P [a]_V $ ?

And going the other way, we would invert P?
 
Yes,

$\begin{bmatrix}3&-2\\1&1\end{bmatrix}\begin{bmatrix}2\\1\end{bmatrix} = \begin{bmatrix}4\\3\end{bmatrix}$

and $2(3,1) + 1(-2,1) = (4,3) = 4(1,0) + 3(0,1)$.

It's far easier to write down explicitly the transformation which takes a non-standard basis to the standard basis: its columns are simply the coordinates of the non-standard basis vectors in the standard basis.

The reverse transformation is indeed $P^{-1}$, but one can also derive this by calculating $e_1$ and $e_2$ in the new basis:

$(1,0) = a(3,1) + b(-2,1)$ leads to:

$3a - 2b = 1$
$a + b = 0$

so $b = -a$, and thus $3a + 2a = 5a = 1$, so

$(1,0) = \frac{1}{5}(3,1) - \frac{1}{5}(-2,1)$

Similarly, we find that:

$(0,1) = \frac{2}{5}(3,1) + \frac{3}{5}(-2,1)$, so that:

$P^{-1} = \begin{bmatrix}\frac{1}{5}&\frac{2}{5}\\ -\frac{1}{5}&\frac{3}{5}\end{bmatrix}$

the columns of which are the standard basis vectors' coordinates in the non-standard basis.
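If you want to check this numerically, here is a quick numpy sketch of the same computation:

import numpy as np

P = np.array([[3.0, -2.0],
              [1.0,  1.0]])          # columns: the new basis in standard coordinates

a_V = np.array([2.0, 1.0])           # coordinates of a in the basis V
a_E = P @ a_V
print(a_E)                            # [4. 3.], as computed above

P_inv = np.linalg.inv(P)
print(P_inv)                          # [[ 0.2  0.4] [-0.2  0.6]], matching the matrix above
print(P_inv @ a_E)                    # back to [2. 1.], the V-coordinates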

EDIT: It is *not* true that any mutually orthogonal basis is a scalar multiple of the standard basis. Take, for example, the basis:

$B = \{(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}),(\frac{-\sqrt{2}}{2},\frac{\sqrt{2}}{2})\}$ of $\Bbb R^2$.

This is an orthonormal basis under the usual "dot product" (it is orthogonal, and every basis vector has length 1), but is clearly not composed of scalar multiples of the standard basis.
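A quick numpy sketch confirms the orthonormality:

import numpy as np

b1 = np.array([np.sqrt(2)/2,  np.sqrt(2)/2])
b2 = np.array([-np.sqrt(2)/2, np.sqrt(2)/2])

print(np.dot(b1, b2))                            # ~0: the vectors are orthogonal
print(np.linalg.norm(b1), np.linalg.norm(b2))    # both ~1: unit length

B = np.column_stack([b1, b2])
print(B.T @ B)                                   # ~identity, the hallmark of an orthonormal basis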

EDIT #2: It is EXACTLY like the situation in ODEs, for this reason:

The transformation $f \mapsto f'$ is a LINEAR transformation on the space of differentiable functions, using the vector operations:

$f+g$ defined by $(f+g)(x) = f(x) + g(x)$ (point-wise addition)
$cf$ defined by $(cf)(x) = c(f(x))$, for a scalar $c$.

because $(f+g)' = f' + g'$ and $(cf)' = c(f')$.

One often writes this transformation as $Df$ (that is, $D(f) = f'$), and polynomial expressions in $D$ (where "multiplication" is repeated composition, that is, $D^2(f) = D(D(f)) = (f')' = f''$) are likewise linear transformations.
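For instance (a small sympy sketch), differentiation respects the vector operations above, and a polynomial in $D$ such as $D^2 + 3D$ applied to a polynomial is again computed linearly:

from sympy import symbols, diff, expand

x, c = symbols('x c')
f = x**3 + x
g = 2*x**2 - 5

# D is linear: D(f + g) = Df + Dg and D(c f) = c Df
print(expand(diff(f + g, x) - (diff(f, x) + diff(g, x))))   # 0
print(expand(diff(c * f, x) - c * diff(f, x)))              # 0

# A polynomial in D, e.g. (D^2 + 3D)f = f'' + 3f'
print(diff(f, x, 2) + 3 * diff(f, x))                       # 9*x**2 + 6*x + 3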
 