Singular value decomposition

1. Aug 13, 2014

joshmccraney

Hello again PF!

I had a question on singular value decomposition. My question relates to a pepper I have been reading, and I'm wondering if any of you would be able to glance over the paper and let me know your thoughts?

The link is here: http://www.ling.ohio-state.edu/~kbaker/pubs/Singular_Value_Decomposition_Tutorial.pdf [Broken]

My question is on page 22. Specifically, how does the author interpret matrix U with respect to the words car, doctor, hospital, wheel, and nurse?

i'm trying to get a good understanding of SVD practically. any help is greatly appreciated!

josh

Last edited by a moderator: May 6, 2017
2. Aug 13, 2014

D H

Staff Emeritus
The very top of the page is marked "ROUGH DRAFT - BEWARE suggestions <email address elided>". That certainly is true with regard to pages 21 and 22. The row and column labels are missing on matrix A on page 21, and what he said about the eigenvalues on page 22 is inconsistent.

Let me try to reconstruct that. This is a guess. A "ROUGH DRAFT", if you will. The matrix A is an occurrence matrix. Each element indicates how many times a given word occurs in a given document. With labels attached, here's my guess as to that matrix:
$$\begin{array}{lrrrrr} & \text{#1} & \text{#2} & \text{#3} & \text{#4} & \text{#5} \\ \text{doctor} & 2 & 0 & 8 & 6 & 0 \\ \text{car} & 1 & 6 & 0 & 1 & 7 \\ \text{hospital} & 5 & 0 & 7 & 4 & 0 \\ \text{nurse} & 7 & 0 & 8 & 5 & 0 \\ \text{wheel} & 0 & 10 & 0 & 0 & 7 \end{array}$$
The rows represent different words whose occurrence we want to study; the columns represent documents in which the words appear. I've made a guess as to which word corresponds to which row. I've just labeled the columns #1, #2, etc. The names of those documents isn't relevant here. (Obviously it is in an analysis of those documents.)

Some obvious things that jump out, just looking at the matrix:
• Doctors, nurses, and hospitals are deeply connected with one another. You can see this in documents #1, #3, and #4.
• Cars and wheels are deeply connected with one another. Two of the documents, #2 and #5, only mention cars and wheels.
• A bit weaker, nurses and cars are connected when doctors are absent. Documents #1 and #4 mention cars, document #3 doesn't. Document #1 barely mentions doctors but does mention cars. Document #3 mentions doctors a lot but doesn't mention cars at all.

The following link computes an SVD of that array, courtesy WolframAlpha.

Note that except for some sign changes, the U matrix in that document and the U matrix from WolframAlpha are the same. Negate an eigenvector and you still have an eigenvector.

The author of the paper completely messed up the interpretation of the matrix. The first column vector corresponds to the largest singular value, so it's the most significant eingenvector. That column says that doctors, nurses, and hospitals go hand in hand. A document that contains one of those words is most likely to contain all three. Cars and wheels are a bit irrelevant in this first eigenvector. The second eigenvector says that some documents contain the words car and wheel and none of the other words. The third eigenvector represents documents that don't contain the words doctor or wheel but do contain the words nurse, and to a lesser extent contain the words car and hospital. The fourth and fifth eigenvectors? Now we're in the noise. Those eigenvectors are small, particularly for a 5x5 matrix.

3. Aug 13, 2014

SteamKing

Staff Emeritus
You should not try to read peppers or other vegetables.

4. Aug 13, 2014

joshmccraney

what's wrong if i "red peppers" bahahhaha? do they hurt your eyes?

5. Aug 13, 2014

joshmccraney

Thanks a ton DH! a few following question for you if it's okay?

first, the matrix U. is it still categorized as $words \times documents number$? it seems like the document number is no longer preserved. is this correct?

also, what does $V^T$ give us? would it be $document number \times words$? or are words omitted?

thanks a ton!

6. Aug 13, 2014

AlephZero

If $A = U\Sigma V^T$, then $A^T = V\Sigma U^T$. So you can interpret the $V$ matrix that way.

Instead of showing you clusters of words that occur in individual documents, it is showing you the "clusters of documents" that contain individual words. That probably isn't so interesting, because in a real life sized example you would have many more words than documents. If you want to know which documents contain a particular word, you know that already from the A matrix.

7. Aug 14, 2014

D H

Staff Emeritus
Suppose A is an NxM matrix, where in this case N is the number of words and M is the number of documents. Then AAT will be an NxN matrix, which in this case is number of words by number of words. AAT shows how words are related by their mutual appearance (or lack thereof) across all of the documents. The U matrix contains the eigenvectors of AAT.

The flip side of this is the matrix ATA. This is an MxM matrix, i.e., number of documents by number of documents. ATA shows how documents are related by using common words (or using disjoint words) across all of the words. The V matrix contains the eigenvectors of ATA.

8. Aug 16, 2014

joshmccraney

perfect, i get it now (i think)! thanks so much. what would happen if we examined the row vectors of U, rather than the columns? would it be anything significant?

how about $V^T$? do we still look at columns?

thanks!

9. Aug 18, 2014

D H

Staff Emeritus
You look at the columns of V, not VT. That's one of the reasons the matrix is returned as V rather that VT.

10. Aug 18, 2014

joshmccraney

thanks!

Suppose the SVD for an $n \times n$ matrix $A=U \Sigma V^T$ where each matrix of the decomposition is an $n \times n$ matrix. Let's say we are going to try to truncate for data reduction so we take out the last $n-2$ diagonals of $\Sigma$ and leave only the first two. Let's call this truncated matrix $\Sigma '$. How will this help data reduction? Generally won't matrix $U \Sigma ' V^T$ still be an $n \times n$ matrix and not necessarily have any columns that are zeros?

11. Aug 18, 2014

AlephZero

If you set an entry in $\Sigma$ to zero, The corresponding row of $U$ and column of $V^T$ are multiplied by 0 in the product $U\Sigma V^T$, so you can delete them from $U$ and $V^T$.

If you only keep 2 non-zero entries in $\Sigma$, you only need the corresponding 2 columns of $U$, and the corresponding 2 rows of $V^T$ (or the 2 columns of $V$).

You are right that $U\Sigma V^T$ is still a fully populated matrix, but you don't need to store it explicitly. When you need to use it in a calculation, you can use the truncated versions of $U$, $\Sigma$ and $V$ instead.

This is analogous to not calculating the inverse of a matrix explicitly. If you want to solve some linear equations $Ax = b$, the matrix $A^{-1}$ is usually fully populated even when $A$ is almost all zeros (e.g. $A$ is a tridiagonal matrix). You don't form the matrix $A^{-1}$ and then multiply it by $b$. You do some form of Gaussian elimination that preserves the sparseness of $A$.

Last edited: Aug 19, 2014