Matrix and Projection: Gram-Schmidt Process and Projection Matrix

Denver Dang · May 21, 2009

Homework Statement

Let A be a m x n matrix of rank n and let [tex]\textbf{b} \in R^{m}[/tex]. If Q and R are the matrices derived from applying the Gram-Schmidt process to the column vectors of A and

p = c₁q₁ + c₂q₂ + ... + c_nq_n

is the projection of b onto R(A), then show that:

a) c = Q^Tb
b) p = QQ^Tb
c) QQ^T = A(A^TA)^-1A^T

Homework Equations

I'm not quite sure about the (a).
But in (b) I have, from my book:
Let S be a nonzero subspace of R^m and let [tex]\textbf{b} \in R^{m}[/tex]. If {u₁, u₂,..., u_k} is an orthonormal basis for S and U = (u₁, u₂,..., u_k), then the projection p of b onto S is given by

p = UU^Tb

In (c) I have (Some place in my book) that:

[tex]\textbf{p} = A\hat{x} = A\left(A^{T}A\right)^{-1}A^{T}\textbf{}[/tex] which is called the projection matrix.

The Attempt at a Solution

As I said, I don't know about (a), but in (b) my thought was that the projection p of b is given by:

p=c₁u₁ + ... + c_ku_k = Uc, which is pretty much the same as given in the problem, and

c = (c₁, c₂,...,c_k) = (u₁^Tb, u₂^Tb,..., u_k^Tb) = U^Tb

and then I get: p = UU^Tb, which is what I needed to find (except for some other letters used in this problem).

In (c) we have the p from (b) which were p = UU^Tb. So by applying p on p's place in the equation given in "Rel. eq" I get

UU^Tb = [tex]A\hat{x} = A\left(A^{T}A\right)^{-1}A^{T}\textbf{}[/tex] which is pretty much the thing I needed to show, except for the two b's on each side.
But I don't know if I just can remove them, just if they were numbers? Because that would give me the correct equation.

Hope I haven't made it to confusing. At least I hope for some hints for (a), and maybe a hint about (b) and (c)'s correctness ? :)

Regards

Denver Dang · May 22, 2009

No one ? :S

jbunniii · May 22, 2009

For part (a), you have

c_j = <b, q_j> = b^Tq_j = q_j^Tb for each j. Stack these into a column vector and you're done.

You can use (a) in a similar way to show (b). For (c), try writing A in terms of Q and R and substituting into the right-hand side.

Denver Dang · May 22, 2009

jbunniii said:

For part (a), you have

c_j = <b, q_j> = b^Tq_j = q_j^Tb for each j. Stack these into a column vector and you're done.

You can use (a) in a similar way to show (b). For (c), try writing A in terms of Q and R and substituting into the right-hand side.

Thank you :)

I will try that.

Denver Dang · May 24, 2009

Hi again.

I'm just thinking. In c, why bother writing A it in terms of Q and R ?
I've almost proved it, except the b's. So don't I just need an argument for removing the b's on each side ?

jbunniii · May 24, 2009

Denver Dang said:

Hi again.

I'm just thinking. In c, why bother writing A it in terms of Q and R ?
I've almost proved it, except the b's. So don't I just need an argument for removing the b's on each side ?

Well, it's true that if S and T are matrices, and Sb = Tb for every vector b in R^m, then S = T. (i.e., if S and T map every b the same way, then by definition they are the same map).

But what's your justification for writing

[tex]p = A(A^TA)^{-1}A^T[/tex]

in the first place? Correct me if I'm wrong, but seems to me that you are trying to prove the formula is true by saying "I found it in the book, so it must be true." The point of writing A in terms of Q and R is that it will allow you to prove part (c) directly.

Denver Dang · May 25, 2009

You have a point :)

So you want me to write:

A = QR

QQ^T = A(A^TA)^-1A^T
QQ^T = (QR)((QR)^T(QR))^-1(QR)^T
QQ^T = (QR)(R^TQ^TQR)^-1R^TQ^T
QQ^T = (QR)(1)R^TQ^T
QQ^T = (QR)R^TQ^T
QQ^T = Q(RR^T)Q^T
QQ^T = Q(1)Q^T
QQ^T = QQ^T

And then I have it ?

jbunniii · May 25, 2009

Denver Dang said:

You have a point :)

So you want me to write:

A = QR

QQ^T = A(A^TA)^-1A^T
QQ^T = (QR)((QR)^T(QR))^-1(QR)^T
QQ^T = (QR)(R^TQ^TQR)^-1R^TQ^T
QQ^T = (QR)(1)R^TQ^T
QQ^T = (QR)R^TQ^T
QQ^T = Q(RR^T)Q^T
QQ^T = Q(1)Q^T
QQ^T = QQ^T

And then I have it ?

I don't think this is quite right. You're assuming that

[tex]R^T R = R R^T = I[/tex]

which means that R is an orthogonal matrix. But is that true? The Gram-Schmidt procedure gives you a factorization

[tex]A = QR[/tex]

where Q is orthogonal, but what kind of matrix is R? It's not orthogonal in general.

Denver Dang · May 25, 2009

jbunniii said:

I don't think this is quite right. You're assuming that

[tex]R^T R = R R^T = I[/tex]

which means that R is an orthogonal matrix. But is that true? The Gram-Schmidt procedure gives you a factorization

[tex]A = QR[/tex]

where Q is orthogonal, but what kind of matrix is R? It's not orthogonal in general.

Hmmm, then R is an upper triangular matrix as far as I can read in my book.
But what is the last step, if RR^T=R^TR isn't not equal 1 ?
An upper triangular matrix * a transposed upper triangular matrix don't give anything I can use, does it ? :S

Denver Dang · May 25, 2009

I think I maybe figured it out.
Since the middle brackets have ^-1, then when the first QQ^T becomes 1, since they are orthogonal, then you have R^TR left, which will be R^-1(R^T)^-1. And then we have RR^-1 and (R^T)^-1R^-1 which should be one, since the inverse matrix has nothing to do with ortogonality.

Right ?

jbunniii · May 25, 2009

Denver Dang said:

I think I maybe figured it out.
Since the middle brackets have ^-1, then when the first QQ^T becomes 1, since they are orthogonal, then you have R^TR left, which will be R^-1(R^T)^-1. And then we have RR^-1 and (R^T)^-1R^-1 which should be one, since the inverse matrix has nothing to do with ortogonality.

Right ?

[EDIT]: Sorry, my original response assumed that Q is m x m and R is m x n (the standard QR factorization). But the more natural factorization associated with Gram-Schmidt is

A = QR

where Q is m x n and R is n x n. This assumes n <= m, which is implied by the fact that rank(A) = n. (Why?)

Looking more carefully at your problem statement, in particular 1(a), it's clear that Q in your case is m x n, and therefore R must be n x n. Since A has rank n, so also must R have rank n. (Why?) Therefore your assumption that R is invertible is valid, and what you have written is correct, assuming you meant:

[tex]\begin{align*}
A(A^T A)^{-1} A^T &= QR((QR)^T QR)^{-1}(QR)^T \\
&= QR(R^T Q^T Q R)^{-1} R^T Q^T \\
&= QR(R^T R)^{-1} R^T Q^T \\
&= QR R^{-1} (R^T)^{-1} R^T Q^T \\
&= Q(R R^{-1})((R^T)^{-1}R^T)Q^T \\
&= Q I I Q^T \\
&= QQ^T
\end{align*}[/tex]

If instead the "other" factorization was used, A = QR where Q is m x m and R is m x n, then R is no longer necessarily even square, let alone invertible. In that case, 1(c) is still true, and you still obtain it by substituting A = QR, but you have to argue more delicately in that case because R itself isn't invertible. That was the content of my original response, but now that I see it isn't the case for your problem, I deleted that response. Sorry if you read it before I removed it, and for any confusion it might have caused.

Matrix and Projection: Gram-Schmidt Process and Projection Matrix

Homework Statement

Homework Equations

The Attempt at a Solution

1. What is a matrix in the context of projection?

2. How is a projection matrix different from a regular matrix?

3. What is the purpose of using a projection matrix?

4. How is a projection matrix calculated?

5. Can a projection matrix be used for non-linear projections?

Similar threads

Hot Threads

Recent Insights