Show that the Gram-Schmidt process gives the shortest length

DeadOriginal · Sep 15, 2013

Homework Statement

Let ##v_{1},...,v_{m}## be an orthonormal set of vectors in ##V##. Let ##v\not\in S(v_{1},...,v_{m})##. Show that the vector ##v'=v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}## given by the Gram-Schmidt process has the shortest length among all vectors of the form ##v-x## for ##x\in Sv_{1},...,v_{n})##.

The Attempt at a Solution

Since the problem wants me to deal with the length of vectors I thought that I would probably have to use something along the lines of
$$
||v'||\leq ||v-x||\Leftrightarrow \left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|\leq \left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|
$$
Since ##x\in S(v_{1},...,v_{n})## ##x## must be a linear combination of ##v_{1},...,v_{n}## so ##x=\sum\limits_{i=1}^{n}\lambda_{i}v_{i}##.
I then tried to show that
$$
0\leq \left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|.
$$
To get rid of the square roots I squared both sides to get
$$
0\leq \left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^{2}-2\left(\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|\right) \left(\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|\right)+\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|^{2}
$$
and this is equivalent to
$$
0\leq (v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i},v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i})-2 \left(\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|\right)
\left(\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|\right)+(v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i},v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}).
$$
I am not sure how to proceed from here.

fzero · Sep 15, 2013

DeadOriginal said:

I then tried to show that
$$
0\leq \left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|.
$$
To get rid of the square roots I squared both sides to get
$$
0\leq \left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^{2}-2\left(\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|\right) \left(\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|\right)+\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|^{2}
$$

It might seem like nitpicking, but you should never write a step of a proof that includes the equality that you're trying to show. It's not only bad form, but it can lead to confusion for both the reader and author. Be clear about what is to be shown and then the rest of the proof should contain statements that you can show to be true.

Instead, try something like:

"If we can show that

$$
0\leq \left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|,
$$

then the lemma follows. Therefore, we expand

$$
\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right| = \left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^{2}-2\left(\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|\right) \left(\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|\right)+\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|^{2}.
$$

etc."

and this is equivalent to
$$
0\leq (v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i},v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i})-2 \left(\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|\right)
\left(\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|\right)+(v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i},v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}).
$$
I am not sure how to proceed from here.

You've either got lots of typos or you made mistakes in expanding the expressions. In any case, the cross terms here make the expression very complicated. I would suggest computing

$$
\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^2-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|^2.
$$

If this is positive semidefinite, then the lemma follows. If you expand carefully, you should be able to show that this expression can be written as a sum of squares.

DeadOriginal · Sep 15, 2013

I can see how the expression

$$
\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^2-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|^2.
$$

becomes the difference between two inner products:

$$
(v-x,v-x)-(v',v')
$$

and writing that out I do get a sum of squares but I don't see how I can show that this sum is greater than 0. In particular, if we let ##(v-x)_{i}## be the ##i##th component of ##v-x##, I don't see how I can show

$$
(v-x)_{1}^{2}+\cdots +(v-x)_{n}^{2}-v_{1}'^{2}-\cdots -v_{n}'^{2}\geq 0.
$$

fzero · Sep 15, 2013

DeadOriginal said:

I don't see how I can show

$$
(v-x)_{1}^{2}+\cdots +(v-x)_{n}^{2}-v_{1}'^{2}-\cdots -v_{n}'^{2}\geq 0.
$$

You have to express this in terms of ##v##, ##v_i## and ##\lambda_i##. After some cancellations, you should be able to write the remaining terms as a sum of squares with positive coefficients.

DeadOriginal · Sep 16, 2013

fzero said:

You have to express this in terms of ##v##, ##v_i## and ##\lambda_i##. After some cancellations, you should be able to write the remaining terms as a sum of squares with positive coefficients.

I tried to expand this further into
$$
(v_{1}-\lambda_{1}v_{1})^{2}+\cdots+(v_{n}-\lambda_{n}v_{n})-(v_{1}-(v,v_{1})v_{1})^{2}-\cdots-(v_{n}-(v,v_{n})v_{n})^{2}
$$
which is equal to
$$
v_{1}^{2}+\cdots+v_{n}^{2}+\lambda_{1}^{2}v_{1}^{2}+\cdots+\lambda_{n}^{2}v_{n}^{2}-2\lambda_{1}v_{1}^{2}-\cdots - 2\lambda_{n}v_{n}^{2} - v_{1}^{2}-\cdots -v_{n}^{2} -(v,v_{1})^{2}v_{1}^{2}-\cdots - (v,v_{n})^{2}v_{n}^{2} + 2(v,v_{1})v_{1}^{2} +\cdots +2(v,v_{n})v_{n}^{2}
$$
which after canceling out becomes
$$
\lambda_{1}^{2}v_{1}^{2}+\cdots+\lambda_{n}^{2}v_{n}^{2}-2\lambda_{1}v_{1}^{2}-\cdots - 2\lambda_{n}v_{n}^{2}-(v,v_{1})^{2}v_{1}^{2}-\cdots - (v,v_{n})^{2}v_{n}^{2} + 2(v,v_{1})v_{1}^{2} +\cdots +2(v,v_{n})v_{n}^{2}.
$$

The only way I found to expand ##(v,v_{i})## further was ##(v,v_{i})=(||v||)(||v_{i}||)\cos\theta##. For ##i=1,...,m## we have ##||v_{i}||=1## but that's as far as I can go.

So factoring out all of the ##v_{i}'s## I am left with
$$
v_{1}^{2}(\lambda_{1}^{2}+2||v||\cos\theta-2\lambda_{1}-||v||^{2}\cos^{2}\theta)+\cdots+v_{n}^{2}(\lambda_{n}^{2}+2||v||\cos \theta-2\lambda_{n}-(||v||)^{2}(||v_{n}||)^{2}\cos^{2}\theta).
$$

I think it might be that I am getting confused by how to denote each component of our vector.

Dick · Sep 16, 2013

DeadOriginal said:

I tried to expand this further into
$$
(v_{1}-\lambda_{1}v_{1})^{2}+\cdots+(v_{n}-\lambda_{n}v_{n})-(v_{1}-(v,v_{1})v_{1})^{2}-\cdots-(v_{n}-(v,v_{n})v_{n})^{2}
$$
which is equal to
$$
v_{1}^{2}+\cdots+v_{n}^{2}+\lambda_{1}^{2}v_{1}^{2}+\cdots+\lambda_{n}^{2}v_{n}^{2}-2\lambda_{1}v_{1}^{2}-\cdots - 2\lambda_{n}v_{n}^{2} - v_{1}^{2}-\cdots -v_{n}^{2} -(v,v_{1})^{2}v_{1}^{2}-\cdots - (v,v_{n})^{2}v_{n}^{2} + 2(v,v_{1})v_{1}^{2} +\cdots +2(v,v_{n})v_{n}^{2}
$$
which after canceling out becomes
$$
\lambda_{1}^{2}v_{1}^{2}+\cdots+\lambda_{n}^{2}v_{n}^{2}-2\lambda_{1}v_{1}^{2}-\cdots - 2\lambda_{n}v_{n}^{2}-(v,v_{1})^{2}v_{1}^{2}-\cdots - (v,v_{n})^{2}v_{n}^{2} + 2(v,v_{1})v_{1}^{2} +\cdots +2(v,v_{n})v_{n}^{2}.
$$

The only way I found to expand ##(v,v_{i})## further was ##(v,v_{i})=(||v||)(||v_{i}||)\cos\theta##. For ##i=1,...,m## we have ##||v_{i}||=1## but that's as far as I can go.

So factoring out all of the ##v_{i}'s## I am left with
$$
v_{1}^{2}(\lambda_{1}^{2}+2||v||\cos\theta-2\lambda_{1}-||v||^{2}\cos^{2}\theta)+\cdots+v_{n}^{2}(\lambda_{n}^{2}+2||v||\cos \theta-2\lambda_{n}-(||v||)^{2}(||v_{n}||)^{2}\cos^{2}\theta).
$$

I think it might be that I am getting confused by how to denote each component of our vector.

You are making this MUCH too complicated with all the formulas. Any vector of the form v-x (where x is in the span of the v's) can also be expressed as v'-x' (where x' is a possibly different vector in the span of the v's). v' is orthogonal to any vector in the span of the v's. Since v' and x' are orthogonal, the Pythagorean theorem can tell you about the length of v'-x'.

fzero · Sep 16, 2013

DeadOriginal said:

I tried to expand this further into
$$
(v_{1}-\lambda_{1}v_{1})^{2}+\cdots+(v_{n}-\lambda_{n}v_{n})-(v_{1}-(v,v_{1})v_{1})^{2}-\cdots-(v_{n}-(v,v_{n})v_{n})^{2}
$$

This expression is not equal to

$$
\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^2-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|^2.
$$

For instance, the first term in the sum is

$$ \left((v)_1 - \lambda_{1}(v_{1})_1\right)^2 = \left((v)_1 - \lambda_{1}\right)^2. $$

The original notation is somewhat unfortunate. Since we used ##v_1## to stand for an element of an orthonormal basis, if we want to denote the 1st component of the vector ##v##, we should write it in some other way like ##(v)_1## in order not to confuse it with ##v_1##.

Anyway, there's no reason to expand into components. Use inner products to compute the length of the vectors:

$$
\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^2 = (v,v) - 2\sum\limits_{i=1}^{n}\lambda_{i}(v,v_{i}) + \sum\limits_{i,j=1}^{n}\lambda_{i}\lambda_{j}(v_{i},v_{j}),
$$

with a similar expression for the other term.

DeadOriginal · Sep 16, 2013

fzero said:

This expression is not equal to

$$
\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^2-\left|\left|v-\sum\limits_{i=1}^{n}(v,v_{i})v_{i}\right|\right|^2.
$$

For instance, the first term in the sum is

$$ \left((v)_1 - \lambda_{1}(v_{1})_1\right)^2 = \left((v)_1 - \lambda_{1}\right)^2. $$

The original notation is somewhat unfortunate. Since we used ##v_1## to stand for an element of an orthonormal basis, if we want to denote the 1st component of the vector ##v##, we should write it in some other way like ##(v)_1## in order not to confuse it with ##v_1##.

Anyway, there's no reason to expand into components. Use inner products to compute the length of the vectors:

$$
\left|\left|v-\sum\limits_{i=1}^{n}\lambda_{i}v_{i}\right|\right|^2 = (v,v) - \sum\limits_{i=1}^{n}\lambda_{i}(v,v_{i}) + \sum\limits_{i,j=1}^{n}\lambda_{i}\lambda_{j}(v_{i},v_{j}),
$$

with a similar expression for the other term.

Ahh! Yes. That does it. I got too bogged down with my notation before. Thanks.

Show that the Gram-Schmidt process gives the shortest length

Homework Statement

The Attempt at a Solution

What is the Gram-Schmidt process?

What is the significance of orthonormalization in the Gram-Schmidt process?

How does the Gram-Schmidt process ensure the shortest length?

Can the Gram-Schmidt process be used for any set of vectors?

Are there any limitations to the Gram-Schmidt process?

Similar threads

Hot Threads

Recent Insights