Understanding Low Rank Approximation with SVD: A Comprehensive Guide

TheOldHag
I'm studying low rank approximation by way of SVD and I'm having trouble understanding how the resulting matrix has lower rank. For instance, in the link below, the calculation performed on page 11 results in the so-called low rank approximation on page 12. The matrix on page 12 doesn't appear to me to be of rank 2, as one would expect. I would expect only 2 linearly independent row vectors, but when I put it back into a calculator it is still rank 9.

http://lsa.colorado.edu/papers/dp1.LSAintro.pdf
 
This is hard for me to word.

Basically, this explanation of low-rank approximation says the new m x n matrix should have rank r, but that is not something I'm seeing in the Latent Semantic Indexing examples out there. They come up with new matrices that, to me, look like they have higher rank than r.

http://en.wikipedia.org/wiki/Low-rank_approximation
 
The matrix on page 12 is only printed to 2 significant figures, so if you copied and pasted those numbers into a calculator, most likely the matrix would be "rank 9" because of the rounding errors introduced.

The matrix on page 12 is actually the product W (12 x 2) . S (2 x 2) . P' (2 x 9), where the ( ) give the dimensions of each matrix and the relevant parts of the full W, S, and P matrices are shaded on page 11.

Of course the full W and P matrices on page 11 are only shown to two significant figures as well. If you check whether W and P are orthonormal using those rounded values, you will probably find the "zero" terms are of order 0.0001 or worse.

If you want to do a numerical check for orthogonality, you really need to calculate the SVD yourself to 16 significant figures in double precision arithmetic, and work with that.

FWIW it should be obvious that if you took only one SVD component, the resulting matrix
W (12 x 1) . S (1 x 1) . P' (1 x 9)
would be of rank 1.
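
To see both effects numerically, here is a minimal sketch (the matrix below is a random stand-in, since the paper's actual 12 x 9 entries aren't retyped here): the exact rank-2 truncation reports rank 2, but the same matrix rounded the way the paper prints it reports full rank.

```python
import numpy as np

# Stand-in for the 12 x 9 term-document count matrix in the paper
# (any full-rank matrix shows the same effect).
rng = np.random.default_rng(0)
A = rng.integers(0, 3, size=(12, 9)).astype(float)

# Full SVD: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the 2 largest singular values -> exact rank-2 approximation
k = 2
A2 = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print(np.linalg.matrix_rank(A2))               # 2, as expected
print(np.linalg.matrix_rank(np.round(A2, 2)))  # typically 9: rounding breaks
                                               # the exact linear dependence
                                               # among the rows
```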
 
I think your last point helps me understand this better, and the rounding error makes perfect sense.

As far as feature extraction or latent semantic indexing in general goes, I'm not sure how this reduced dimensionality lowers computational cost down the road, since you are still left with a term-document matrix that is no longer sparse but has fewer dimensions. Are there further steps to this feature extraction process beyond this point? For instance, determine the basis and then map things in terms of that basis. There seems to be a step missing in these papers, since they just end with this new matrix, which doesn't appear much simpler on the surface.
 
I don't know anything about semantic feature extraction specifically, but the math in your link looks quite similar to methods that are used in other areas to reduce "large" models to manageable size. Numerical methods based on the SVD don't seem to feature much in first courses in numerical methods, but they are very useful in practice.

I assume those examples show the matrices in full just to demonstrate what is going on. In practice, it is unlikely you would compute the full matrices. For example, if you only want a subset of the singular values (say 1000 SVs of a matrix of dimension maybe 10,000,000 x 20,000, for a large text database with a realistically sized vocabulary), you don't need to calculate all the SVs plus the full orthonormal matrices and then throw most of that data away.
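
I don't know exactly what those LSI papers do after this point, but the usual recipe (this is my own sketch, not something taken from the linked paper) is: compute only the k largest singular triplets, represent each document by its k-dimensional coordinates in the reduced basis, and "fold in" new queries with the same factors. Roughly, using scipy's truncated SVD, with matrix sizes, k, and the query made up for illustration:

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

# Hypothetical sparse term-document count matrix (terms x documents);
# a real corpus would be much larger, as described above.
A = sparse_random(2000, 500, density=0.01, format="csr", random_state=0)

# Truncated SVD: only the k largest singular triplets are computed,
# never the full orthonormal factors.
k = 50
U, s, Vt = svds(A, k=k)          # s comes back in ascending order
order = np.argsort(s)[::-1]      # reorder so s[0] is the largest
U, s, Vt = U[:, order], s[order], Vt[order, :]

# In the reduced space each document is a k-vector (a column of
# diag(s) @ Vt); these dense k-vectors replace the long sparse
# columns of A in later similarity computations.
doc_vecs = (np.diag(s) @ Vt).T                  # shape (500, k)

# A new query (a term-count vector) is "folded in" to the same space:
# q_k = S^{-1} U^T q
q = np.zeros(2000)
q[[3, 17, 250]] = 1.0                           # made-up query terms
q_k = np.diag(1.0 / s) @ U.T @ q                # shape (k,)

# Cosine similarity between the query and every document, in k dimensions
sims = doc_vecs @ q_k / (np.linalg.norm(doc_vecs, axis=1)
                         * np.linalg.norm(q_k) + 1e-12)
```

The point is that after the one-off cost of the truncated SVD, all later comparisons work with dense k-vectors instead of very long sparse term vectors, which is where the computational saving shows up.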
 
I appreciate your help. I'm past that bit of confusion now.
 