Show that the Gram-Schmidt process gives the shortest length

  • #1
274
1

Homework Statement


Let ##v_{1},...,v_{m}## be an orthonormal set of vectors in ##V##. Let ##v\not\in S(v_{1},...,v_{m})##. Show that the vector ##v'=v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}## given by the Gram-Schmidt process has the shortest length among all vectors of the form ##v-x## for ##x\in Sv_{1},...,v_{n})##.


The Attempt at a Solution


Since the problem wants me to deal with the length of vectors I thought that I would probably have to use something along the lines of
$$
||v'||\leq ||v-x||\Leftrightarrow \left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|\leq \left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|
$$
Since ##x\in S(v_{1},...,v_{n})## ##x## must be a linear combination of ##v_{1},...,v_{n}## so ##x=\sum\limits_{i=1}^{n}\lambda_{i}v_{i}##.
I then tried to show that
$$
0\leq \left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|.
$$
To get rid of the square roots I squared both sides to get
$$
0\leq \left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^{2}-2\left(\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|\right) \left(\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|\right)+\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|^{2}
$$
and this is equivalent to
$$
0\leq (v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i},v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i})-2 \left(\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|\right)
\left(\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|\right)+(v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i},v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}).
$$
I am not sure how to proceed from here.
 
  • #2
I then tried to show that
$$
0\leq \left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|.
$$
To get rid of the square roots I squared both sides to get
$$
0\leq \left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^{2}-2\left(\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|\right) \left(\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|\right)+\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|^{2}
$$

It might seem like nitpicking, but you should never write a step of a proof that includes the equality that you're trying to show. It's not only bad form, but it can lead to confusion for both the reader and author. Be clear about what is to be shown and then the rest of the proof should contain statements that you can show to be true.

Instead, try something like:

"If we can show that

$$
0\leq \left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|,
$$

then the lemma follows. Therefore, we expand

$$
\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right| = \left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^{2}-2\left(\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|\right) \left(\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|\right)+\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|^{2}.
$$

etc."

and this is equivalent to
$$
0\leq (v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i},v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i})-2 \left(\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|\right)
\left(\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|\right)+(v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i},v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}).
$$
I am not sure how to proceed from here.

You've either got lots of typos or you made mistakes in expanding the expressions. In any case, the cross terms here make the expression very complicated. I would suggest computing

$$
\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^2-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|^2.
$$

If this is positive semidefinite, then the lemma follows. If you expand carefully, you should be able to show that this expression can be written as a sum of squares.
 
  • #3
I can see how the expression

$$
\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^2-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|^2.
$$

becomes the difference between two inner products:

$$
(v-x,v-x)-(v',v')
$$

and writing that out I do get a sum of squares but I don't see how I can show that this sum is greater than 0. In particular, if we let ##(v-x)_{i}## be the ##i##th component of ##v-x##, I don't see how I can show

$$
(v-x)_{1}^{2}+\cdots +(v-x)_{n}^{2}-v_{1}'^{2}-\cdots -v_{n}'^{2}\geq 0.
$$
 
  • #4
I don't see how I can show

$$
(v-x)_{1}^{2}+\cdots +(v-x)_{n}^{2}-v_{1}'^{2}-\cdots -v_{n}'^{2}\geq 0.
$$

You have to express this in terms of ##v##, ##v_i## and ##\lambda_i##. After some cancellations, you should be able to write the remaining terms as a sum of squares with positive coefficients.
 
  • #5
You have to express this in terms of ##v##, ##v_i## and ##\lambda_i##. After some cancellations, you should be able to write the remaining terms as a sum of squares with positive coefficients.

I tried to expand this further into
$$
(v_{1}-\lambda_{1}v_{1})^{2}+\cdots+(v_{n}-\lambda_{n}v_{n})-(v_{1}-(v,v_{1})v_{1})^{2}-\cdots-(v_{n}-(v,v_{n})v_{n})^{2}
$$
which is equal to
$$
v_{1}^{2}+\cdots+v_{n}^{2}+\lambda_{1}^{2}v_{1}^{2}+\cdots+\lambda_{n}^{2}v_{n}^{2}-2\lambda_{1}v_{1}^{2}-\cdots - 2\lambda_{n}v_{n}^{2} - v_{1}^{2}-\cdots -v_{n}^{2} -(v,v_{1})^{2}v_{1}^{2}-\cdots - (v,v_{n})^{2}v_{n}^{2} + 2(v,v_{1})v_{1}^{2} +\cdots +2(v,v_{n})v_{n}^{2}
$$
which after canceling out becomes
$$
\lambda_{1}^{2}v_{1}^{2}+\cdots+\lambda_{n}^{2}v_{n}^{2}-2\lambda_{1}v_{1}^{2}-\cdots - 2\lambda_{n}v_{n}^{2}-(v,v_{1})^{2}v_{1}^{2}-\cdots - (v,v_{n})^{2}v_{n}^{2} + 2(v,v_{1})v_{1}^{2} +\cdots +2(v,v_{n})v_{n}^{2}.
$$

The only way I found to expand ##(v,v_{i})## further was ##(v,v_{i})=(||v||)(||v_{i}||)\cos\theta##. For ##i=1,...,m## we have ##||v_{i}||=1## but that's as far as I can go.

So factoring out all of the ##v_{i}'s## I am left with
$$
v_{1}^{2}(\lambda_{1}^{2}+2||v||\cos\theta-2\lambda_{1}-||v||^{2}\cos^{2}\theta)+\cdots+v_{n}^{2}(\lambda_{n}^{2}+2||v||\cos \theta-2\lambda_{n}-(||v||)^{2}(||v_{n}||)^{2}\cos^{2}\theta).
$$

I think it might be that I am getting confused by how to denote each component of our vector.
 
Last edited:
  • #6
I tried to expand this further into
$$
(v_{1}-\lambda_{1}v_{1})^{2}+\cdots+(v_{n}-\lambda_{n}v_{n})-(v_{1}-(v,v_{1})v_{1})^{2}-\cdots-(v_{n}-(v,v_{n})v_{n})^{2}
$$
which is equal to
$$
v_{1}^{2}+\cdots+v_{n}^{2}+\lambda_{1}^{2}v_{1}^{2}+\cdots+\lambda_{n}^{2}v_{n}^{2}-2\lambda_{1}v_{1}^{2}-\cdots - 2\lambda_{n}v_{n}^{2} - v_{1}^{2}-\cdots -v_{n}^{2} -(v,v_{1})^{2}v_{1}^{2}-\cdots - (v,v_{n})^{2}v_{n}^{2} + 2(v,v_{1})v_{1}^{2} +\cdots +2(v,v_{n})v_{n}^{2}
$$
which after canceling out becomes
$$
\lambda_{1}^{2}v_{1}^{2}+\cdots+\lambda_{n}^{2}v_{n}^{2}-2\lambda_{1}v_{1}^{2}-\cdots - 2\lambda_{n}v_{n}^{2}-(v,v_{1})^{2}v_{1}^{2}-\cdots - (v,v_{n})^{2}v_{n}^{2} + 2(v,v_{1})v_{1}^{2} +\cdots +2(v,v_{n})v_{n}^{2}.
$$

The only way I found to expand ##(v,v_{i})## further was ##(v,v_{i})=(||v||)(||v_{i}||)\cos\theta##. For ##i=1,...,m## we have ##||v_{i}||=1## but that's as far as I can go.

So factoring out all of the ##v_{i}'s## I am left with
$$
v_{1}^{2}(\lambda_{1}^{2}+2||v||\cos\theta-2\lambda_{1}-||v||^{2}\cos^{2}\theta)+\cdots+v_{n}^{2}(\lambda_{n}^{2}+2||v||\cos \theta-2\lambda_{n}-(||v||)^{2}(||v_{n}||)^{2}\cos^{2}\theta).
$$

I think it might be that I am getting confused by how to denote each component of our vector.

You are making this MUCH too complicated with all the formulas. Any vector of the form v-x (where x is in the span of the v's) can also be expressed as v'-x' (where x' is a possibly different vector in the span of the v's). v' is orthogonal to any vector in the span of the v's. Since v' and x' are orthogonal, the Pythagorean theorem can tell you about the length of v'-x'.
 
  • #7
I tried to expand this further into
$$
(v_{1}-\lambda_{1}v_{1})^{2}+\cdots+(v_{n}-\lambda_{n}v_{n})-(v_{1}-(v,v_{1})v_{1})^{2}-\cdots-(v_{n}-(v,v_{n})v_{n})^{2}
$$

This expression is not equal to

$$
\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^2-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|^2.
$$

For instance, the first term in the sum is

$$ \left((v)_1 - \lambda_{1}(v_{1})_1\right)^2 = \left((v)_1 - \lambda_{1}\right)^2. $$

The original notation is somewhat unfortunate. Since we used ##v_1## to stand for an element of an orthonormal basis, if we want to denote the 1st component of the vector ##v##, we should write it in some other way like ##(v)_1## in order not to confuse it with ##v_1##.

Anyway, there's no reason to expand into components. Use inner products to compute the length of the vectors:

$$
\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^2 = (v,v) - 2\sum\limits_{i=1}^{n}\lambda_{i}(v,v_{i}) + \sum\limits_{i,j=1}^{n}\lambda_{i}\lambda_{j}(v_{i},v_{j}),
$$

with a similar expression for the other term.
 
Last edited:
  • #8
This expression is not equal to

$$
\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^2-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|^2.
$$

For instance, the first term in the sum is

$$ \left((v)_1 - \lambda_{1}(v_{1})_1\right)^2 = \left((v)_1 - \lambda_{1}\right)^2. $$

The original notation is somewhat unfortunate. Since we used ##v_1## to stand for an element of an orthonormal basis, if we want to denote the 1st component of the vector ##v##, we should write it in some other way like ##(v)_1## in order not to confuse it with ##v_1##.

Anyway, there's no reason to expand into components. Use inner products to compute the length of the vectors:

$$
\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^2 = (v,v) - \sum\limits_{i=1}^{n}\lambda_{i}(v,v_{i}) + \sum\limits_{i,j=1}^{n}\lambda_{i}\lambda_{j}(v_{i},v_{j}),
$$

with a similar expression for the other term.

Ahh! Yes. That does it. I got too bogged down with my notation before. Thanks.
 

Suggested for: Show that the Gram-Schmidt process gives the shortest length

Replies
16
Views
552
Replies
1
Views
518
Replies
10
Views
597
Replies
9
Views
287
Replies
8
Views
442
Replies
37
Views
2K
Back
Top