Singular Value Decomposition: How Can We Efficiently Use SVD Practically?

  • Context: Graduate 
  • Thread starter Thread starter member 428835
  • Start date Start date
  • Tags Tags
    Decomposition Value
Click For Summary

Discussion Overview

The discussion revolves around the practical applications and interpretations of Singular Value Decomposition (SVD), particularly in relation to a specific paper. Participants explore how SVD can be used to analyze word occurrence matrices and the implications of the matrices U and V in this context.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant questions the interpretation of matrix U in the context of specific words and their relationships as presented in the paper.
  • Another participant critiques the paper for missing labels and inconsistencies, attempting to reconstruct the occurrence matrix based on their understanding.
  • Some participants discuss the significance of the first few eigenvectors of matrix U, suggesting that they represent meaningful relationships among the words.
  • Questions arise about the structure of matrices U and V, including whether they maintain the original document numbers and how they relate to the original occurrence matrix.
  • There is a discussion on the implications of truncating the diagonal matrix Σ for data reduction, with participants exploring how this affects the resulting matrices.
  • Clarifications are made regarding the interpretation of the matrices AAT and ATA, and how they relate to word and document relationships.
  • One participant expresses confusion about examining row vectors of U and the significance of columns in V, seeking further clarification.

Areas of Agreement / Disagreement

Participants express differing views on the interpretation of the matrices and the implications of SVD, indicating that multiple competing perspectives remain. Some points are clarified, but no consensus is reached on the overall interpretation of the paper or the practical applications of SVD.

Contextual Notes

Limitations include potential misunderstandings of the paper's content, missing assumptions regarding the structure of the matrices, and unresolved questions about the implications of truncating the Σ matrix for data reduction.

member 428835
Hello again PF!

I had a question on singular value decomposition. My question relates to a pepper I have been reading, and I'm wondering if any of you would be able to glance over the paper and let me know your thoughts?

The link is here: http://www.ling.ohio-state.edu/~kbaker/pubs/Singular_Value_Decomposition_Tutorial.pdf

My question is on page 22. Specifically, how does the author interpret matrix U with respect to the words car, doctor, hospital, wheel, and nurse?

i'm trying to get a good understanding of SVD practically. any help is greatly appreciated!

josh
 
Last edited by a moderator:
Physics news on Phys.org
The very top of the page is marked "ROUGH DRAFT - BEWARE suggestions <email address elided>". That certainly is true with regard to pages 21 and 22. The row and column labels are missing on matrix A on page 21, and what he said about the eigenvalues on page 22 is inconsistent.

Let me try to reconstruct that. This is a guess. A "ROUGH DRAFT", if you will. The matrix A is an occurrence matrix. Each element indicates how many times a given word occurs in a given document. With labels attached, here's my guess as to that matrix:
<br /> \begin{array}{lrrrrr}<br /> &amp; \text{#1} &amp; \text{#2} &amp; \text{#3} &amp; \text{#4} &amp; \text{#5} \\<br /> \text{doctor} &amp; 2 &amp; 0 &amp; 8 &amp; 6 &amp; 0 \\<br /> \text{car} &amp; 1 &amp; 6 &amp; 0 &amp; 1 &amp; 7 \\<br /> \text{hospital} &amp; 5 &amp; 0 &amp; 7 &amp; 4 &amp; 0 \\<br /> \text{nurse} &amp; 7 &amp; 0 &amp; 8 &amp; 5 &amp; 0 \\<br /> \text{wheel} &amp; 0 &amp; 10 &amp; 0 &amp; 0 &amp; 7<br /> \end{array}
The rows represent different words whose occurrence we want to study; the columns represent documents in which the words appear. I've made a guess as to which word corresponds to which row. I've just labeled the columns #1, #2, etc. The names of those documents isn't relevant here. (Obviously it is in an analysis of those documents.)

Some obvious things that jump out, just looking at the matrix:
  • Doctors, nurses, and hospitals are deeply connected with one another. You can see this in documents #1, #3, and #4.
  • Cars and wheels are deeply connected with one another. Two of the documents, #2 and #5, only mention cars and wheels.
  • A bit weaker, nurses and cars are connected when doctors are absent. Documents #1 and #4 mention cars, document #3 doesn't. Document #1 barely mentions doctors but does mention cars. Document #3 mentions doctors a lot but doesn't mention cars at all.

The following link computes an SVD of that array, courtesy WolframAlpha.

Note that except for some sign changes, the U matrix in that document and the U matrix from WolframAlpha are the same. Negate an eigenvector and you still have an eigenvector.

The author of the paper completely messed up the interpretation of the matrix. The first column vector corresponds to the largest singular value, so it's the most significant eingenvector. That column says that doctors, nurses, and hospitals go hand in hand. A document that contains one of those words is most likely to contain all three. Cars and wheels are a bit irrelevant in this first eigenvector. The second eigenvector says that some documents contain the words car and wheel and none of the other words. The third eigenvector represents documents that don't contain the words doctor or wheel but do contain the words nurse, and to a lesser extent contain the words car and hospital. The fourth and fifth eigenvectors? Now we're in the noise. Those eigenvectors are small, particularly for a 5x5 matrix.
 
  • Like
Likes   Reactions: 1 person
joshmccraney said:
Hello again PF!

I had a question on singular value decomposition. My question relates to a pepper I have been reading, and I'm wondering if any of you would be able to glance over the paper and let me know your thoughts?

josh

You should not try to read peppers or other vegetables. :biggrin:
 
SteamKing said:
You should not try to read peppers or other vegetables. :biggrin:

what's wrong if i "red peppers" bahahhaha? do they hurt your eyes?
 
Thanks a ton DH! a few following question for you if it's okay?

first, the matrix U. is it still categorized as ##words \times documents number##? it seems like the document number is no longer preserved. is this correct?

also, what does ##V^T## give us? would it be ##document number \times words##? or are words omitted?

thanks a ton!
 
If ##A = U\Sigma V^T##, then ##A^T = V\Sigma U^T##. So you can interpret the ##V## matrix that way.

Instead of showing you clusters of words that occur in individual documents, it is showing you the "clusters of documents" that contain individual words. That probably isn't so interesting, because in a real life sized example you would have many more words than documents. If you want to know which documents contain a particular word, you know that already from the A matrix.
 
  • Like
Likes   Reactions: 1 person
joshmccraney said:
Thanks a ton DH! a few following question for you if it's okay?

first, the matrix U. is it still categorized as ##words \times documents number##? it seems like the document number is no longer preserved. is this correct?

also, what does ##V^T## give us? would it be ##document number \times words##? or are words omitted?
Suppose A is an NxM matrix, where in this case N is the number of words and M is the number of documents. Then AAT will be an NxN matrix, which in this case is number of words by number of words. AAT shows how words are related by their mutual appearance (or lack thereof) across all of the documents. The U matrix contains the eigenvectors of AAT.

The flip side of this is the matrix ATA. This is an MxM matrix, i.e., number of documents by number of documents. ATA shows how documents are related by using common words (or using disjoint words) across all of the words. The V matrix contains the eigenvectors of ATA.
 
D H said:
Suppose A is an NxM matrix, where in this case N is the number of words and M is the number of documents. Then AAT will be an NxN matrix, which in this case is number of words by number of words. AAT shows how words are related by their mutual appearance (or lack thereof) across all of the documents. The U matrix contains the eigenvectors of AAT.

The flip side of this is the matrix ATA. This is an MxM matrix, i.e., number of documents by number of documents. ATA shows how documents are related by using common words (or using disjoint words) across all of the words. The V matrix contains the eigenvectors of ATA.

perfect, i get it now (i think)! thanks so much. what would happen if we examined the row vectors of U, rather than the columns? would it be anything significant?

how about ##V^T##? do we still look at columns?

thanks!
 
You look at the columns of V, not VT. That's one of the reasons the matrix is returned as V rather that VT.
 
  • #10
thanks!

Suppose the SVD for an ##n \times n## matrix ##A=U \Sigma V^T## where each matrix of the decomposition is an ##n \times n## matrix. Let's say we are going to try to truncate for data reduction so we take out the last ##n-2## diagonals of ##\Sigma## and leave only the first two. Let's call this truncated matrix ##\Sigma '##. How will this help data reduction? Generally won't matrix ##U \Sigma ' V^T## still be an ##n \times n## matrix and not necessarily have any columns that are zeros?
 
  • #11
If you set an entry in ##\Sigma## to zero, The corresponding row of ##U## and column of ##V^T## are multiplied by 0 in the product ##U\Sigma V^T##, so you can delete them from ##U## and ##V^T##.

If you only keep 2 non-zero entries in ##\Sigma##, you only need the corresponding 2 columns of ##U##, and the corresponding 2 rows of ##V^T## (or the 2 columns of ##V##).

You are right that ##U\Sigma V^T## is still a fully populated matrix, but you don't need to store it explicitly. When you need to use it in a calculation, you can use the truncated versions of ##U##, ##\Sigma## and ##V## instead.

This is analogous to not calculating the inverse of a matrix explicitly. If you want to solve some linear equations ##Ax = b##, the matrix ##A^{-1}## is usually fully populated even when ##A## is almost all zeros (e.g. ##A## is a tridiagonal matrix). You don't form the matrix ##A^{-1}## and then multiply it by ##b##. You do some form of Gaussian elimination that preserves the sparseness of ##A##.
 
Last edited:
  • Like
Likes   Reactions: 1 person

Similar threads

Replies
2
Views
2K
  • · Replies 1 ·
Replies
1
Views
10K