# Singular Value Decomposition

## Homework Statement

Quick question, bros.

## Homework Equations

Tell me if I'm right.

## The Attempt at a Solution

A = U∑VT

where ∑ contains on its diagonal the singular values of ATA
U contains the corresponding eigenvectors for those singular values
V contains the eigenvectors corresponding to the singular values of AAT

jbunniii
Homework Helper
Gold Member
Close but not exactly right.

∑ contains on its diagonal the singular values of A. These in turn are the square roots of the eigenvalues of ATA. You can take square roots because ATA is positive semidefinite, hence has nonnegative eigenvalues.

The columns of U are the eigenvectors of AAT.

The columns of V are the eigenvectors of ATA.

If A is complex, you have to use Hermitian transposes above.

In addition to what you wrote, U and V are orthogonal matrices. (Or unitary matrices, if A is complex.)

The singular value decomposition has a nice geometric interpretation if A is real and square. It says that A can be decomposed into the following steps: rotate by some angle, then stretch or shrink along the standard axes, then rotate by some other angle. (The "rotation" can also include reflection if U or V has determinant -1 instead of 1.)

Close but not exactly right.

∑ contains on its diagonal the singular values of A. These in turn are the square roots of the eigenvalues of ATA. You can take square roots because ATA is positive semidefinite, hence has nonnegative eigenvalues.

The columns of U are the eigenvectors of AAT.

The columns of V are the eigenvectors of ATA.

If A is complex, you have to use Hermitian transposes above.

In addition to what you wrote, U and V are orthogonal matrices. (Or unitary matrices, if A is complex.)

The singular value decomposition has a nice geometric interpretation if A is real and square. It says that A can be decomposed into the following steps: rotate by some angle, then stretch or shrink along the standard axes, then rotate by some other angle. (The "rotation" can also include reflection if U or V has determinant -1 instead of 1.)

Thanks!

The columns of U are the eigenvectors of AAT.

How do I know which order to arrange them in?

jbunniii
Homework Helper
Gold Member
How do I know which order to arrange them in?

The standard (canonical) arrangement is to sort the singular values from high to low, so the upper left corner of ∑ has the largest singular value, and the lower right corner (assuming a square matrix) is the smallest singular value.

Then arrange the rows of VT and the columns of U in the same order. i.e. the left column of U goes with the largest singular value, the right column goes with the smallest. The top row of VT goes with the largest singular value, the bottom row with the smallest.

The standard (canonical) arrangement is to sort the singular values from high to low, so the upper left corner of ∑ has the largest singular value, and the lower right corner (assuming a square matrix) is the smallest singular value.

Then arrange the rows of VT and the columns of U in the same order. i.e. the left column of U goes with the largest singular value, the right column goes with the smallest. The top row of VT goes with the largest singular value, the bottom row with the smallest.

Will AAT and ATA have the same eigenvalues?

jbunniii
Homework Helper
Gold Member
Will AAT and ATA have the same eigenvalues?

Yes. You can actually prove this using the SVD.

$$AA^T = (U \Sigma V^T)(U \Sigma V^T)^T = U \Sigma V^T V \Sigma^T U^T = U \Sigma \Sigma^T U^T = UDU^T$$

where D is a diagonal matrix whose diagonal elements are the squares of the diagonal elements of ∑. Similarly, you can show that

$$A^TA = V D V^T$$

These two equations tell you that the diagonal elements of D are the eigenvalues of both $AA^T$ and $A^T A$, and that the eigenvectors of these matrices are the columns of U and V, respectively.

jbunniii
Homework Helper
Gold Member
P.S. The above assumes that A is square. If A is not square, then AAT and ATA have the same nonzero eigenvalues, but they can have different multiplicities of 0 as an eigenvalue.

Thanks for the info, brah. I have a final tomorrow. Maybe you answer me a few more questions, since you seem to like this stuff.

(1) My professor says "We study three kinds of approximation: Interpolation, Data fitting, and Global approximation." The definitions of data fitting and global approximation he then gives are eerily similar; in fact I cannot tell the difference. Perhaps you could explain that.

(2) Markov chains! I'm trying to use my basic knowledge of statistics to understand these. We have a problem that says the probability of rain or sun tomorrow depends on the weather today: P(R-->R) = .2, P(R-->S) = .8, P(S-->R) = .5, P(S-->S) = .5. The Markov chain looks like [.2 .5; .8 .5] * [x1; x2] = [.2x1 + .5x2; .8x1 + .5x2].

I'm trying to get a grasp of this. It says the probability of rain tomorrow is .2x1 + .5x2. I think in terms of P(rain tomorrow) = P(rain tomorrow | rain today) + P(rain tomorrow | sun today) .... Am I right?

Oh, wait .... is it P(rain tomorrow) = P(rain tomorrow & rain today) + P(rain tomorrow & sun today) = P(rain tomorrow | sun today) * P(sun today) + P(rain tomorrow | rain today) * P(rain today) = .5x2 + .2x1

jbunniii
Homework Helper
Gold Member
Thanks for the info, brah. I have a final tomorrow. Maybe you answer me a few more questions, since you seem to like this stuff.

(1) My professor says "We study three kinds of approximation: Interpolation, Data fitting, and Global approximation." The definitions of data fitting and global approximation he then gives are eerily similar; in fact I cannot tell the difference. Perhaps you could explain that.

(2) Markov chains! I'm trying to use my basic knowledge of statistics to understand these. We have a problem that says the probability of rain or sun tomorrow depends on the weather today: P(R-->R) = .2, P(R-->S) = .8, P(S-->R) = .5, P(S-->S) = .5. The Markov chain looks like [.2 .5; .8 .5] * [x1; x2] = [.2x1 + .5x2; .8x1 + .5x2].

I'm trying to get a grasp of this. It says the probability of rain tomorrow is .2x1 + .5x2. I think in terms of P(rain tomorrow) = P(rain tomorrow | rain today) + P(rain tomorrow | sun today) .... Am I right?

Oh, wait .... is it P(rain tomorrow) = P(rain tomorrow & rain today) + P(rain tomorrow & sun today) = P(rain tomorrow | sun today) * P(sun today) + P(rain tomorrow | rain today) * P(rain today) = .5x2 + .2x1

Offhand I'm not sure about these - we have now jumped from linear algebra/matrix theory to some sort of statistics. I studied Markov chains but it was many moons ago and I don't remember most of the details anymore. You should start new threads for each question, with appropriate titles, so others with expertise in this area will have a better chance to see the questions.