Singular Value Decomposition: How Can We Use SVD Effectively in Practice?

In summary, this thread discusses a question about singular value decomposition and how to interpret the matrix U in relation to particular words. A link to a tutorial paper is provided, and the discussion works through the matrix and its components, points out errors in the paper's interpretation, and reconstructs what the correct interpretation should be.
  • #1
member 428835
Hello again PF!

I had a question on singular value decomposition. My question relates to a pepper I have been reading, and I'm wondering if any of you would be able to glance over the paper and let me know your thoughts?

The link is here: http://www.ling.ohio-state.edu/~kbaker/pubs/Singular_Value_Decomposition_Tutorial.pdf

My question is on page 22. Specifically, how does the author interpret matrix U with respect to the words car, doctor, hospital, wheel, and nurse?

I'm trying to get a good practical understanding of SVD. Any help is greatly appreciated!

josh
 
  • #2
The very top of the page is marked "ROUGH DRAFT - BEWARE suggestions <email address elided>". That certainly is true with regard to pages 21 and 22. The row and column labels are missing on matrix A on page 21, and what he said about the eigenvalues on page 22 is inconsistent.

Let me try to reconstruct that. This is a guess. A "ROUGH DRAFT", if you will. The matrix A is an occurrence matrix. Each element indicates how many times a given word occurs in a given document. With labels attached, here's my guess as to that matrix:
[tex]
\begin{array}{lrrrrr}
& \text{\#1} & \text{\#2} & \text{\#3} & \text{\#4} & \text{\#5} \\
\text{doctor} & 2 & 0 & 8 & 6 & 0 \\
\text{car} & 1 & 6 & 0 & 1 & 7 \\
\text{hospital} & 5 & 0 & 7 & 4 & 0 \\
\text{nurse} & 7 & 0 & 8 & 5 & 0 \\
\text{wheel} & 0 & 10 & 0 & 0 & 7
\end{array}[/tex]
The rows represent different words whose occurrence we want to study; the columns represent documents in which the words appear. I've made a guess as to which word corresponds to which row. I've just labeled the columns #1, #2, etc. The names of those documents aren't relevant here. (Obviously they are relevant in an analysis of those documents.)

Some obvious things that jump out, just looking at the matrix:
  • Doctors, nurses, and hospitals are deeply connected with one another. You can see this in documents #1, #3, and #4.
  • Cars and wheels are deeply connected with one another. Two of the documents, #2 and #5, only mention cars and wheels.
  • A bit weaker: nurses and cars are connected when doctors are scarce. Documents #1 and #4 mention cars, while document #3 doesn't; document #1 barely mentions doctors yet does mention cars, whereas document #3 mentions doctors a lot but doesn't mention cars at all.

The following link computes an SVD of that array, courtesy WolframAlpha.

Note that except for some sign changes, the U matrix in that document and the U matrix from WolframAlpha are the same. Negate an eigenvector and you still have an eigenvector.

The author of the paper completely messed up the interpretation of the matrix. The first column vector corresponds to the largest singular value, so it's the most significant eigenvector. That column says that doctors, nurses, and hospitals go hand in hand: a document that contains one of those words is most likely to contain all three. Cars and wheels are a bit irrelevant in this first eigenvector. The second eigenvector says that some documents contain the words car and wheel and none of the other words. The third eigenvector represents documents that don't contain the words doctor or wheel but do contain the word nurse, and to a lesser extent the words car and hospital. The fourth and fifth eigenvectors? Now we're in the noise. The corresponding singular values are small, particularly for a 5x5 matrix.
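
If you want to reproduce this yourself without WolframAlpha, here's a minimal numpy sketch using my guessed matrix above (the column signs may come out flipped relative to the paper, which is fine, since negating a singular vector gives another valid singular vector):

[code=python]
import numpy as np

# Occurrence matrix guessed above: rows = words, columns = documents #1..#5.
words = ["doctor", "car", "hospital", "nurse", "wheel"]
A = np.array([
    [2, 0, 8, 6, 0],   # doctor
    [1, 6, 0, 1, 7],   # car
    [5, 0, 7, 4, 0],   # hospital
    [7, 0, 8, 5, 0],   # nurse
    [0, 10, 0, 0, 7],  # wheel
], dtype=float)

# Full SVD: A = U @ diag(s) @ Vt, singular values sorted largest first.
U, s, Vt = np.linalg.svd(A)

print("singular values:", np.round(s, 2))
for j in range(U.shape[1]):
    print(f"column {j + 1} of U:")
    for word, u in zip(words, U[:, j]):
        print(f"  {word:9s} {u: .3f}")
[/code]

Reading the first couple of columns of U this way should reproduce the interpretation above: the first column groups doctor/hospital/nurse, the second groups car/wheel.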
 
  • #3
joshmccraney said:
Hello again PF!

I had a question on singular value decomposition. My question relates to a pepper I have been reading, and I'm wondering if any of you would be able to glance over the paper and let me know your thoughts?

josh

You should not try to read peppers or other vegetables. :biggrin:
 
  • #4
SteamKing said:
You should not try to read peppers or other vegetables. :biggrin:

What's wrong if I "red peppers" bahahaha? Do they hurt your eyes?
 
  • #5
Thanks a ton DH! A few follow-up questions for you, if that's okay?

First, the matrix ##U##: is it still sized as ##\text{words} \times \text{documents}##? It seems like the document number is no longer preserved. Is this correct?

Also, what does ##V^T## give us? Would it be ##\text{documents} \times \text{words}##? Or are words omitted?

Thanks a ton!
 
  • #6
If ##A = U\Sigma V^T##, then ##A^T = V\Sigma U^T##. So you can interpret the ##V## matrix that way.

Instead of showing you clusters of words that occur in individual documents, it is showing you the "clusters of documents" that contain individual words. That probably isn't so interesting, because in a real-life-sized example you would have many more words than documents. If you want to know which documents contain a particular word, you already know that from the ##A## matrix.
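
A one-line numerical check of that identity, as a numpy sketch with an arbitrary matrix standing in for the occurrence matrix:

[code=python]
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))  # any words-by-documents matrix will do

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# If A = U Sigma V^T, then A^T = V Sigma U^T: same factors, with U and V swapping roles.
print(np.allclose(A.T, Vt.T @ np.diag(s) @ U.T))  # True
[/code]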
 
  • #7
joshmccraney said:
Thanks a ton DH! A few follow-up questions for you, if that's okay?

First, the matrix ##U##: is it still sized as ##\text{words} \times \text{documents}##? It seems like the document number is no longer preserved. Is this correct?

Also, what does ##V^T## give us? Would it be ##\text{documents} \times \text{words}##? Or are words omitted?
Suppose ##A## is an ##N \times M## matrix, where in this case ##N## is the number of words and ##M## is the number of documents. Then ##AA^T## will be an ##N \times N## matrix, which in this case is number of words by number of words. ##AA^T## shows how words are related by their mutual appearance (or lack thereof) across all of the documents. The ##U## matrix contains the eigenvectors of ##AA^T##.

The flip side of this is the matrix ##A^TA##. This is an ##M \times M## matrix, i.e., number of documents by number of documents. ##A^TA## shows how documents are related by using common words (or using disjoint words) across all of the words. The ##V## matrix contains the eigenvectors of ##A^TA##.
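
A small numpy sketch checking both relationships (random data standing in for an occurrence matrix; with the reduced SVD these are the eigenvectors belonging to the nonzero eigenvalues):

[code=python]
import numpy as np

rng = np.random.default_rng(2)
N, M = 5, 3                      # N words, M documents
A = rng.standard_normal((N, M))  # stand-in occurrence matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Columns of U are eigenvectors of A A^T (N x N), with eigenvalues s**2 ...
print(np.allclose(A @ A.T @ U, U * s**2))        # True
# ... and columns of V (the rows of Vt) are eigenvectors of A^T A (M x M).
print(np.allclose(A.T @ A @ Vt.T, Vt.T * s**2))  # True
[/code]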
 
  • #8
D H said:
Suppose ##A## is an ##N \times M## matrix, where in this case ##N## is the number of words and ##M## is the number of documents. Then ##AA^T## will be an ##N \times N## matrix, which in this case is number of words by number of words. ##AA^T## shows how words are related by their mutual appearance (or lack thereof) across all of the documents. The ##U## matrix contains the eigenvectors of ##AA^T##.

The flip side of this is the matrix ##A^TA##. This is an ##M \times M## matrix, i.e., number of documents by number of documents. ##A^TA## shows how documents are related by using common words (or using disjoint words) across all of the words. The ##V## matrix contains the eigenvectors of ##A^TA##.

Perfect, I get it now (I think)! Thanks so much. What would happen if we examined the row vectors of ##U## rather than the columns? Would they be anything significant?

How about ##V^T##? Do we still look at columns?

Thanks!
 
  • #9
You look at the columns of ##V##, not ##V^T##. That's one of the reasons the matrix is returned as ##V## rather than ##V^T##.
 
  • #10
Thanks!

Suppose we have the SVD of an ##n \times n## matrix, ##A = U \Sigma V^T##, where each factor in the decomposition is an ##n \times n## matrix. Let's say we are going to truncate for data reduction, so we zero out the last ##n-2## diagonal entries of ##\Sigma## and leave only the first two. Let's call this truncated matrix ##\Sigma '##. How will this help with data reduction? Generally, won't the matrix ##U \Sigma ' V^T## still be an ##n \times n## matrix that doesn't necessarily have any columns of zeros?
 
  • #11
If you set an entry of ##\Sigma## to zero, the corresponding column of ##U## and row of ##V^T## are multiplied by 0 in the product ##U\Sigma V^T##, so you can delete them from ##U## and ##V^T##.

If you only keep 2 non-zero entries in ##\Sigma##, you only need the corresponding 2 columns of ##U##, and the corresponding 2 rows of ##V^T## (or the 2 columns of ##V##).

You are right that ##U\Sigma V^T## is still a fully populated matrix, but you don't need to store it explicitly. When you need to use it in a calculation, you can use the truncated versions of ##U##, ##\Sigma## and ##V## instead.

This is analogous to not calculating the inverse of a matrix explicitly. If you want to solve some linear equations ##Ax = b##, the matrix ##A^{-1}## is usually fully populated even when ##A## is almost all zeros (e.g. ##A## is a tridiagonal matrix). You don't form the matrix ##A^{-1}## and then multiply it by ##b##. You do some form of Gaussian elimination that preserves the sparseness of ##A##.
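
Here is a rough numpy sketch of that idea, with a random matrix standing in for ##A##: keep the two largest singular values, then apply the rank-2 approximation through its truncated factors instead of ever forming the dense ##n \times n## product.

[code=python]
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 2

A = rng.standard_normal((n, n))   # stand-in for whatever A is in practice

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the k largest singular values and the matching columns/rows.
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]

# The rank-k approximation is still a dense n x n matrix if you form it explicitly...
Ak = Uk @ np.diag(sk) @ Vtk

# ...but you never have to form it.  To apply it to a vector, use the factors,
# which need only 2*n*k + k numbers of storage instead of n*n.
x = rng.standard_normal(n)
y = Uk @ (sk * (Vtk @ x))
print(np.allclose(y, Ak @ x))  # True
[/code]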
 

What is Singular Value Decomposition?

Singular Value Decomposition (SVD) is a mathematical technique used to decompose a matrix into three matrices, namely U, Σ, and V. It is commonly used for dimensionality reduction, data compression, and solving linear systems of equations.

Why is Singular Value Decomposition important?

Singular Value Decomposition is important because it allows us to find the underlying structure of a matrix and extract the most important features. It is also used in various applications such as image processing, data mining, and signal processing.

What are the applications of Singular Value Decomposition?

Singular Value Decomposition has a wide range of applications in various fields such as data compression, image processing, recommendation systems, and natural language processing. It is also used in solving linear systems of equations and in matrix factorization methods.

How does Singular Value Decomposition work?

Singular Value Decomposition works by decomposing a matrix A into three matrices, U, Σ, and V. The matrix U contains the left singular vectors, Σ is a diagonal matrix containing the singular values, and V contains the right singular vectors. The singular values represent the importance of each feature in the original matrix.
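
For example, here is a minimal numpy sketch (for illustration only) showing the shapes of the three factors and that they reconstruct the original matrix exactly:

[code=python]
import numpy as np

# Any rectangular matrix works; here a small 4 x 3 example.
A = np.arange(12, dtype=float).reshape(4, 3)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

print(U.shape, s.shape, Vt.shape)           # (4, 3) (3,) (3, 3)
print(s)                                    # singular values, largest first
print(np.allclose(A, U @ np.diag(s) @ Vt))  # True: A is recovered exactly
[/code]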

What are the advantages of using Singular Value Decomposition?

Some of the advantages of using Singular Value Decomposition include its ability to handle large datasets, its efficiency in capturing the most important features, and its usefulness in solving linear systems of equations. It also helps in reducing the noise and redundancy in a dataset, making it easier to interpret and analyze the data.
