Nearly Singular Matrix Inversion

  • Thread starter: EngWiPy
  • Tags: Inversion, Matrix
AI Thread Summary
The discussion revolves around inverting a nearly singular matrix using singular value decomposition (SVD) and the challenges of setting an appropriate threshold for singular values. It emphasizes the importance of choosing a threshold relative to the largest singular value to avoid precision loss, with suggested ranges typically between 10^-12 and 10^-3. The conversation also highlights the need to consider the physical significance of singular values and the representation of the matrix, particularly in the context of communication channels. Participants suggest plotting normalized singular values to identify a threshold for filtering out small values, which can vary based on measurement noise. Ultimately, the accuracy of the matrix inversion is linked to the performance of detection algorithms, raising questions about how to evaluate inversion quality without a known accurate reference.
EngWiPy
Hi,

I have a matrix that is nearly singular, so I am using the singular value decomposition (SVD) approach for inversion. In this case an N-by-N matrix can be written as:

\mathbf{H}=\mathbf{U}\Sigma\mathbf{V}^H

Then the inverse can be written as:

\mathbf{H}^{-1}=\mathbf{V}\Sigma^{-1}\mathbf{U}^H

where

\Sigma=\text{diag}(\sigma_1,\ldots,\sigma_N)

The problem with singular matrices is that some of the singular values are zero, and hence the inverse of \Sigma will have undefined entries. To overcome this problem, if a singular value is less than a threshold, its inverse is forced to 0.
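
For concreteness, here is a minimal NumPy sketch of the thresholded SVD inverse described above (the function name and the absolute threshold are just placeholders, not from the thread):

```python
import numpy as np

def svd_inverse(H, threshold):
    """Invert H via SVD, forcing the reciprocal of small singular values to 0."""
    U, s, Vh = np.linalg.svd(H)                 # H = U @ diag(s) @ Vh, with Vh = V^H
    s_inv = np.zeros_like(s)
    keep = s > threshold                        # singular values below the threshold are dropped
    s_inv[keep] = 1.0 / s[keep]
    return Vh.conj().T @ np.diag(s_inv) @ U.conj().T   # V @ Sigma^{-1} @ U^H
```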

The question is: how should the threshold be chosen so that the matrix is inverted with high accuracy?

Thanks
 
As a starter, you should make your threshold relative to the largest singular value. How low should you go? It depends in part on how you are representing the reals. For example, you are almost guaranteed to get a mess if you use double precision and set the threshold to less than 10^-16 times the largest singular value. Even a value of 10^-14 is a bit dubious. You are still bound to run into precision loss problems. A more reasonable scale factor is typically somewhere between 10^-12 and 10^-3.

Another factor is: what does your matrix H represent? Those small singular values say that you don't really have N degrees of freedom. How many do you think you have? If you know, or think you know, that the answer is n (rather than N), then another approach is to take only the n largest singular values as having any real physical meaning and discard the rest.
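
As an illustration of both suggestions (a cutoff relative to sigma_max, or keeping only the n largest singular values), here is a sketch; it is not part of the original post, the parameter names are assumptions, and NumPy's built-in np.linalg.pinv(H, rcond=...) implements the same relative-cutoff idea:

```python
import numpy as np

def svd_inverse_truncated(H, rtol=1e-6, rank=None):
    """Pseudo-inverse of H, keeping singular values above rtol*sigma_max,
    or, if `rank` is given, only the `rank` largest ones."""
    U, s, Vh = np.linalg.svd(H)                  # s is sorted in descending order
    if rank is not None:
        keep = np.arange(s.size) < rank          # keep only the first n singular values
    else:
        keep = s > rtol * s[0]                   # cutoff relative to the largest one
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    return Vh.conj().T @ np.diag(s_inv) @ U.conj().T
```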
 
D H said:
As a starter, you should make your threshold relative to the largest singular value. [...]

Again, what do "small" singular values mean? Something like < 10^-6, for example? The matrix represents an equivalent communication channel in which the components are correlated due to the type of processing at the receiver.

I am using the first approach you mentioned: I set the threshold as th*sigma_max, but for some matrices th = 10^-4 works fine, while for others it doesn't. For some matrices I tried th = 0.1, 0.01, 0.001, ..., 1e-6, and none of them works.

How can I solve this issue? Is there another way to invert such matrices that is more stable and accurate?

Thanks
 
There is no clear-cut answer, and you are asking for what often are conflicting goals: stability and accuracy.

By way of analogy, suppose you are given 2001 measurements of some scalar quantity taken at 2001 different points in time. Your job: develop a polynomial model that describes how the quantity varies with time. You could develop a 2000th-degree polynomial and hit each of those sampled values exactly. That ridiculous polynomial might be accurate (you can't get any more accurate), but it's absolutely worthless for interpolation and worth even less for extrapolation. Accuracy is not the be-all and end-all.

What you should be looking for instead is the smallest polynomial that still does a reasonable job of modeling your data. If you know how good those measurements are, you can use this as one measure of when to stop adding coefficients: stop when the residuals look like measurement noise. If you don't know the uncertainties in the measurements, another approach is to reserve some of those sampled values as evaluation points. You don't use those reserved points in the regression; you instead use them to evaluate the quality of the generated model.

This is a fairly standard approach not just in curve fitting but throughout machine learning. (Curve fitting is just a rather simple form of machine learning.) Machine learning is oftentimes plagued by not knowing the quality of the measured values. It's much better to take those uncertainties into account in your modeling if you do know something about measurement errors. It's also a good idea to keep some data in reserve for evaluation.
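
For instance, here is a rough sketch (not from the original post) of the hold-out idea above applied to the polynomial example; the data, the degrees tried, and the split size are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 2001)
y = np.sin(2 * np.pi * t) + 0.05 * rng.standard_normal(t.size)   # 2001 noisy measurements

# Reserve some points for evaluation and fit on the rest.
hold = rng.choice(t.size, size=400, replace=False)
train = np.ones(t.size, dtype=bool)
train[hold] = False

for degree in range(1, 11):
    coeffs = np.polyfit(t[train], y[train], degree)     # fit using the training points only
    resid = y[hold] - np.polyval(coeffs, t[hold])        # residuals on the reserved points
    print(degree, np.sqrt(np.mean(resid ** 2)))          # stop increasing the degree when this levels off
```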
 
D H said:
There is no clear-cut answer, and you are asking for what often are conflicting goals: stability and accuracy. [...]

I need to invert the matrix to detect transmitted symbols. In particular, I want to find this matrix:

\mathbf{G}=\mathbf{H}^H(\mathbf{H}\mathbf{H}^H+N_0\mathbf{H})^{-1}

where N0 H is the covariance matrix of the noise. Here H is nearly singular. G is crucial for data detection: if it is not accurate enough, the detection process will be erroneous. I agree the inverse need not be 100% accurate, but it should be accurate enough that detection works as well as it would with a perfect inversion.
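
Assuming the thresholded SVD inverse from earlier in the thread is used for the inner inverse, a sketch of computing G might look like this (not from the thread; the function name and the relative threshold are illustrative, and N0*H is used as the noise covariance exactly as stated above):

```python
import numpy as np

def mmse_matrix(H, N0, rtol=1e-5):
    """G = H^H (H H^H + N0 H)^{-1}, with the inverse taken via a thresholded SVD."""
    A = H @ H.conj().T + N0 * H                  # N0*H: the noise covariance as stated in the post
    U, s, Vh = np.linalg.svd(A)
    s_inv = np.zeros_like(s)
    keep = s > rtol * s[0]                       # relative threshold on the singular values of A
    s_inv[keep] = 1.0 / s[keep]
    A_inv = Vh.conj().T @ np.diag(s_inv) @ U.conj().T
    return H.conj().T @ A_inv
```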
 
S_David said:
I am using the first approach you mentioned: I set the threshold as th*sigma_max, but for some matrices th = 10^-4 works fine, while for others it doesn't. [...]

A practical approach is to make a plot of the normalized singular values, sorted in descending order, on a log scale.

Often, the plot has two distinct parts. The largest SV's decrease fairly quickly, and the smaller ones all have about the same value. The plot will show a clear "kink" between the two parts.

Set the threshold to filter out all the "small" SVs. If your matrix represents measured data, the numerical value of "small" will depend very much on the accuracy and resolution of the measurements. If you have some source of "noise" in the measurement system which has different amplitudes for different measurements, either you need to choose the filter value for each set of data, or (better) find a way to eliminate or at least stabilize the noise level in the measurement system.
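
A quick way to make such a plot and look for that kink, assuming NumPy/Matplotlib and an illustrative nearly singular H (this example matrix is not from the thread):

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative nearly singular matrix: rank 3 plus a small perturbation.
rng = np.random.default_rng(1)
H = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 8)) + 1e-6 * rng.standard_normal((8, 8))

s = np.linalg.svd(H, compute_uv=False)        # singular values, sorted in descending order
plt.semilogy(s / s[0], "o-")                  # normalized to the largest singular value
plt.xlabel("index")
plt.ylabel("sigma_i / sigma_max")
plt.show()
```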
 
AlephZero said:
A practical approach is to make a plot of the normalized singular values, sorted in descending order, on a log scale. [...]

My channel is changing (random), so how can I know which values are "small" enough to discard each time?

I just discovered that reducing th improved the performance, but the results were not the same as those in a journal paper I am trying to reproduce. When I set th = 1e-5, the results are almost the same as the paper's.

A question comes to mind here: how can we know whether an inversion approximation is accurate enough, since we don't know the exact inverse? Basically, I am transmitting signals over a wireless channel, and these signals interfere at the receiver. So, I am using a minimum mean squared error (MMSE) equalizer to reduce the effect of interference and noise, as follows:

MSE=E\left[\|\mathbf{d}-\mathbf{G}\mathbf{y}\|^2\right]

where d is the vector of transmitted symbols and y is the vector of received signals. A related question: does improved detection performance necessarily mean that the inversion is accurate enough?
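
An illustrative check (not from the thread): since the exact inverse is unknown, one can evaluate G directly on known pilot/training symbols by tracking the empirical MSE; a poor inversion shows up as a higher MSE and symbol error rate. A minimal sketch, with all names hypothetical:

```python
import numpy as np

def empirical_mse(G, Y, D):
    """Average ||d - G y||^2 over columns of known pilot symbols D and received signals Y."""
    E = D - G @ Y
    return np.mean(np.sum(np.abs(E) ** 2, axis=0))
```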
 