Can anyone give a common-sense explanation of singular values?

mikeph · Dec 10, 2012

I've been reading a few mathematical definitions and I'm still not sure I grasp their significance. If Ax = b, then what do the singular values of A represent in terms of x and b? Why are they important?

I'm not looking for a formal definition, just a conceptual description or even some rambling about what they're used for, so I can get an intuitive idea of their purpose/significance.

Thanks for any help

AlephZero · Dec 10, 2012

The simplest description of "what singular values are" is "the eigenvalues of ##A^TA##'' (plus some zeros, if A is not a square matrix).

That suggests one practical use for them, which is solving over- or under-determined sets of equations in a similar way to finding a least squares approximation. In other words, by solving ##A^T Ax = A^Tb## instead of ##Ax = b##.

The singular values are useful in numerical work because they can decompose the problem into parts that are more or less numerically "important". For example if the ##A^TA## has less than full rank, a numerical calculation is unlikely to show that it is exactly singular, but it will have some very small singular values, and they can be ignored by setting them to zero. This leads to ideas like the "pseudo inverse" of a matrix. If the singular values of the original matrix were ##s_i##, the pseudo inverse has singular values ##1/s_i## when ##s_i > 0## and ##0## when ##s_i = 0##.

Apart from numerical applications, the SVD is very useful theoretically, because the decompositon ##A = U\Sigma V^T## exists for any matrix A (unlike an eigenvalue decomopsition, which doesn't exist for rectangular matrices and doesn't even exist for some square matrices), and the square matrices ##U## and ##V## matrices satisfy ##U^T U = V^T V = I##. For example ##U## and ##V## are related to choosing basis vectors for interpreting ##A## as a transformation between two vector spaces.

algebrat · Dec 10, 2012

Just passing through, not checking definitions, but maybe we could say that finding the singular values helps find the eigenvectors, and if we can decompose b in terms of the eigenbasis, it could be seen as a method for finding x. Not sure if this is practical. But you might be able to rephrase finding singular values as finding eigenvalues. The term singular values might be too general a term for where you're at in your studies. So if we rephrase to what is the point of eigenvalues, um... Anyways, singular values sounds like a bit of an advanced perspective, appearing in for instance functional analysis or quantum physics. Not really sure though (there's my disclaimer).

AlephZero · Dec 11, 2012

"Singular values" and "eigenvalues" are two different concepts (though they are related).

Both terms are well defined and commonly used, but mixing them up is NOT a good idea.

mikeph · Dec 11, 2012

I think I'm understanding it slightly. My problem is mostly concerned with numerical solutions to under-determined problems, and I think I do have the issue that if A is rank-deficient, then rounding errors would appear to make A'*A "singular to working precision" as MATLAB phrases it.

But why do small singular values matter? What's the link between the eigenvalues of A'*A and the invertibility? What makes a small singular value a bad thing? What's to stop me dividing my entire matrix by 10, then surely the singular values are all divided by ten, why should this affect the fundamental maths of inversion?

D H · Dec 11, 2012

MikeyW said:

But why do small singular values matter? What's the link between the eigenvalues of A'*A and the invertibility? What makes a small singular value a bad thing? What's to stop me dividing my entire matrix by 10, then surely the singular values are all divided by ten, why should this affect the fundamental maths of inversion?

I'll answer your last question first. A singular value is deemed "small" if the ratio of that singular value to the largest singular value is less than some ε. Multiplying the entire matrix by some scalar doesn't change that ratio.

What makes a small singular value a "bad thing" is that unless you were using infinite precision arithmetic, that small singular value is probably garbage. Things work much better if you treat those small values as if they were zero -- except when you invert your matrix, that is. Then you use the apparently kludgy operation 1/0->0.

One way to look at it is that the you are designating the first n diagonal elements and the first n rows of your U and W matrices (I look at SVD as A=U*V*W^T) as the only ones that count. These are the "principal components" of your system. In fact, SVD and principal component analysis are very closely allied.

mikeph · Dec 12, 2012

Can you elaborate on why "that small singular value is probably garbage"? Is the singular value decomposition an approximation?

D H · Dec 12, 2012

MikeyW said:

Can you elaborate on why "that small singular value is probably garbage"? Is the singular value decomposition an approximation?

Singular value decomposition is limited by the tools at hand. The problem is that the standard representations of floating point numbers (e.g., float, double, long double in C-based languages, real, real*8, real*16 in Fortran) are but approximations of the real numbers. In the reals, addition and multiplication are associative and distributive. This is not the case for finite precision numbers such as IEEE floats and doubles. Example: -1+(1+1e-16) is zero, (-1+1)+1e-16 is 1e-16.

Suppose you get a singular value that is equal to 10^-16 times the largest singular value. This value is almost certainly pure garbage, a number with zero bits of precision. You can't even trust the sign to be correct. There's a whole lot of repetitive addition and multiplication in an SVD implementation. Those small singular values result because of cancelations of positive and negative values during summations.

jasonRF · Dec 12, 2012

MikeyW said:

I think I'm understanding it slightly. My problem is mostly concerned with numerical solutions to under-determined problems, and I think I do have the issue that if A is rank-deficient, then rounding errors would appear to make A'*A "singular to working precision" as MATLAB phrases it.

If you are solving an underdetermined linear problem [itex]Ax=b[/itex], then by definition your [itex]A[/itex] matrix is rank deficient. If [itex]rank(A)=r[/itex] then you only have [itex]r[/itex] nonzero singular values. Now, as indicated by D H, when you use a numerical method to determine the singular values, you may not find any of them that are exactly zero due to the finite precision arrithmatic that is used in the solution - instead you will find [itex]r[/itex] "large" ones, and the rest will be "small."

I'm assuming you are doing rank-deficient least-squares? If so then the SVD is one useful tool. I think of it this way, if [itex]U^\dagger A V = \Sigma[/itex] is the SVD of a matrix [itex]A[/itex] with [itex]rank(A)=r[/itex], then you can write
[tex] A = \sum_{i=1}^{r} \sigma_i u_i v_i^\dagger,[/tex]
where [itex]\sigma_i[/itex] is the [itex]i^{th}[/itex] singular value, [itex]u_i[/itex] is the [itex]i^{th}[/itex] left singular vector, and[itex]v_i[/itex] is the [itex]i^{th}[/itex] right singular vector. Since your rank deficient problem, you can add any vector in the nullspace of [itex]A[/itex] to any least-squares solution [itex]x_{LS}[/itex] and maintain the same residual - that is, the rank-deficient problem has an infinite number of solutions. Sometimes you want the solution that has the smallest 2-norm. In these cases the SVD can used to give it to you and the resulting solution is:
[tex] x_{LS} = \sum_{i=1}^{r} \sigma_i^{-1} \left( u_i^\dagger b\right) v_i.[/tex]
This is related to the so-called pseudo-inverse (http://en.wikipedia.org/wiki/Moore–Penrose_pseudoinverse). Note that other "complete orthogonal factorizations" can also give you the minimum 2-norm solution. I wrote my own QR based code that does this (see Golub and Van-Loan).

Since you have access to matlab, here is some simple code that solves an underdetermined system via least-squares:
A = randn(15,20);
b = ones(15,1);
x_mldivide = A\b;
x_pinv = pinv(A)*b;

you will find that x_mldivide has 5 zeros, as it provides the solution with the fewest number of non-zero elements (often this is called a "basic" solution), and that x_pinv is actually the minimum norm solution that the svd would give you. A more explicit computation is,
x_minnorm = 0;
[U,S,V] = svd(A);
for ii=1:rank(A)
x_minnorm = x_minnorm + (U(:,ii)'*b)*V(:,ii)/S(ii,ii);
end

where I have used the formula above. You should find that x_minnorm=x_pinv. Note that the command rank(A) is likely using an SVD to estimate rank - I think it looks to find the number of singular values that are above some numerical error tolerance.

enjoy!

jason

mikeph · Dec 13, 2012

Oh wow, I had no idea MATLAB actually computed pinv. I'll have a go at the computations as soon as my PC is free!

Can I ask one more thing that is probably really basic but still confusing me a little- you say an underdetermined problem is by definition rank deficient, but wikipedia has a seemingly different definition:

"The rank of an m × n matrix cannot be greater than m nor n. A matrix that has a rank as large as possible is said to have full rank; otherwise, the matrix is rank deficient."

What if in our underdetermined system m < n, but rank(A) = m? Surely by the above definition A is full-rank, since the rank of A is as large as it could possibly be?

Can anyone give a common-sense explanation of singular values?

Discussion Overview

Discussion Character

Main Points Raised

Areas of Agreement / Disagreement

Contextual Notes

Similar threads

Undergrad The vector to which a dual vector corresponds

Graduate Confusion about the Moyal-Weyl twist

Undergrad 2 interpretations of bra-ket expression: equal, & isomorphic, but...

Undergrad Spinor calculus

Undergrad Matrix representation of rank-2 spinors

Insights Revisiting the Velocity-Time Function

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect