Nearly Singular Matrix Inversion

  • Thread starter: EngWiPy
  • Tags: Inversion, Matrix

SUMMARY

The discussion centers on the challenges of inverting nearly singular matrices using Singular Value Decomposition (SVD). The matrix inversion is defined as H^{-1} = VΣ^{-1}U^H, where Σ contains singular values. A critical issue arises when singular values approach zero, necessitating a threshold to determine which values to consider for inversion. Participants recommend setting the threshold relative to the largest singular value, with effective ranges typically between 10^{-12} and 10^{-3}. The accuracy of the inversion is paramount for applications such as data detection in communication systems, where even slight inaccuracies can lead to erroneous results.

PREREQUISITES
  • Understanding of Singular Value Decomposition (SVD)
  • Knowledge of matrix inversion techniques
  • Familiarity with numerical precision issues in floating-point arithmetic
  • Basic concepts of Minimum Mean Squared Error (MMSE) equalization
NEXT STEPS
  • Research methods for determining optimal thresholds in SVD for matrix inversion
  • Explore numerical stability techniques in matrix computations
  • Learn about the implications of measurement noise on matrix inversion accuracy
  • Investigate advanced SVD applications in machine learning and data modeling
USEFUL FOR

Data scientists, engineers in signal processing, and researchers involved in communication systems who require accurate matrix inversion techniques for effective data detection and modeling.

EngWiPy
Hi,

I have a matrix that is nearly singular, so I am using the singular value decomposition (SVD) approach for inversion. In this case, an N-by-N matrix can be written as:

\mathbf{H}=\mathbf{U}\Sigma\mathbf{V}^H

Then the inverse can be written as:

\mathbf{H}^{-1}=\mathbf{V}\Sigma^{-1}\mathbf{U}^H

where

\Sigma=\text{diag}(\sigma_1,\ldots,\sigma_N)

The problem with singular matrices is that some of the singular values are zero, and hence the inverse of \Sigma will have some undefined entries. To overcome this problem, if a singular value is less than a threshold, its inverse is forced to 0.
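
For concreteness, here is a minimal NumPy sketch of the thresholded inversion just described; the function name and the default threshold are illustrative placeholders, not recommendations:

```python
import numpy as np

def svd_inverse(H, threshold=1e-12):
    """Invert H via SVD, forcing the reciprocal of any singular
    value below `threshold` to zero, as described above."""
    U, s, Vh = np.linalg.svd(H)          # H = U @ diag(s) @ Vh
    s_inv = np.zeros_like(s)
    keep = s > threshold
    s_inv[keep] = 1.0 / s[keep]
    # H^{-1} ~= V @ diag(s_inv) @ U^H
    return Vh.conj().T @ np.diag(s_inv) @ U.conj().T
```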

The question is: how do I choose the threshold such that the matrix is inverted with high accuracy?

Thanks
 
As a starter, you should make your threshold relative to the largest singular value. How low should you go? It depends in part on how you are representing the reals. For example, you are almost guaranteed to get a mess if you use double precision and set the threshold to less than 10^{-16} times the largest singular value. Even a value of 10^{-14} is a bit dubious. You are still bound to run into precision loss problems. A more reasonable scale factor is typically somewhere between 10^{-12} and 10^{-3}.

Another factor is what your matrix H represents. Those small singular values say that you don't really have N degrees of freedom. How many do you think you have? If you know, or think you know, that the answer is n (rather than N), then another approach is to take just the n largest singular values as having any real physical meaning and discard the rest.
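
A minimal sketch of both suggestions, assuming NumPy; `rel_tol` and the function name are illustrative placeholders:

```python
import numpy as np

def svd_inverse_relative(H, rel_tol=1e-6, rank=None):
    """Thresholded SVD inverse with the cutoff taken relative to
    the largest singular value, or with an explicit rank cutoff
    when the number of physical degrees of freedom n is known."""
    U, s, Vh = np.linalg.svd(H)          # s is sorted in descending order
    if rank is not None:
        keep = np.arange(s.size) < rank  # keep only the n largest SVs
    else:
        keep = s > rel_tol * s[0]        # cutoff relative to sigma_max
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    return Vh.conj().T @ np.diag(s_inv) @ U.conj().T
```

For what it's worth, NumPy's built-in np.linalg.pinv applies the same kind of relative cutoff through its rcond argument.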
 
D H said:
As a starter, you should make your threshold relative to the largest singular value. …

Again, what does "small" singular values mean? Less than 10^{-6}, for example? The matrix represents an equivalent communication channel where the components are correlated due to the type of processing at the receiver.

I am using the first approach you mentioned: I set the threshold as th*sigma_max, but for some matrices th=10^{-4} works fine, while for others it doesn't. For some matrices I tried th=0.1, 0.01, 0.001, ..., 1e-6, but none of them works.

How can I solve this issue? Is there another way to invert such matrices that is more stable and accurate?

Thanks
 
There is no clear-cut answer, and you are asking for what are often conflicting goals: stability and accuracy.

By way of analogy, suppose you are given 2001 measurements of some scalar quantity taken at 2001 different points in time. Your job: develop some polynomial model that describes how the quantity varies with time. You could develop a 2000th-degree polynomial and hit each of those sampled values exactly. That ridiculous polynomial might be accurate (you can't get any more accurate), but it's absolutely worthless for interpolation and worth even less for extrapolation. Accuracy is not the be-all and end-all.

What you should be looking for instead is the lowest-degree polynomial that still does a reasonable job of modeling your data. If you know how good those measurements are, you can use this as one measure of when to stop adding coefficients: stop when the residuals look like measurement noise. If you don't know the uncertainties in the measurements, another approach is to reserve some of those sampled values as evaluation points. You don't use those reserved points in the regression; you instead use them to evaluate the quality of the generated model.
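
As an illustration of the hold-out idea, here is a hypothetical sketch with synthetic data; the signal, the noise level, and the split are all made up for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 2001 noisy samples of a low-degree signal.
t = np.linspace(0.0, 1.0, 2001)
y = 1.0 - 2.0 * t + 0.5 * t**3 + 0.05 * rng.standard_normal(t.size)

# Reserve every fifth sample for evaluation; fit on the rest.
holdout = np.arange(t.size) % 5 == 0
t_fit, y_fit = t[~holdout], y[~holdout]
t_val, y_val = t[holdout], y[holdout]

# Grow the degree until the held-out error stops improving.
for degree in range(1, 8):
    coeffs = np.polyfit(t_fit, y_fit, degree)
    rms = np.sqrt(np.mean((np.polyval(coeffs, t_val) - y_val) ** 2))
    print(f"degree {degree}: holdout RMS error {rms:.4f}")
```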

This is a fairly standard approach not just in curve fitting but throughout machine learning. (Curve fitting is just a rather simple form of machine learning.) Machine learning is oftentimes plagued by not knowing the quality of the measured values. It's much better to take those uncertainties into account in your modeling if you do know something about measurement errors. It's also a good idea to keep some data in reserve for evaluation.
 
D H said:
There is no clear-cut answer, and you are asking for what are often conflicting goals: stability and accuracy. …

I need to invert the matrix to detect transmitted symbols. In particular, I want to find this matrix:

\mathbf{G}=\mathbf{H}^H(\mathbf{H}\mathbf{H}^H+N_0\mathbf{H})^{-1}

where N_0 H is the covariance matrix of the noise. Here H is nearly singular. G is crucial for data detection: if it is not accurate enough, the detection process will be erroneous. I agree the inverse need not be 100% accurate, but it must be accurate enough that detection works as it would with a perfect inversion.
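
A sketch of how G could be formed with the thresholded SVD inverse discussed above; the relative tolerance is an arbitrary placeholder, and the N_0 H term simply follows the formula as stated:

```python
import numpy as np

def mmse_matrix(H, N0, rel_tol=1e-6):
    """Form G = H^H (H H^H + N0 H)^{-1}, with the inner inverse
    computed by the thresholded SVD approach."""
    A = H @ H.conj().T + N0 * H          # N0*H: noise covariance as stated above
    U, s, Vh = np.linalg.svd(A)
    keep = s > rel_tol * s[0]
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    A_inv = Vh.conj().T @ np.diag(s_inv) @ U.conj().T
    return H.conj().T @ A_inv
```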
 
S_David said:
I am using the first approach you mentioned: I set the threshold as th*sigma_max, but for some matrices th=10^{-4} works fine, while for others it doesn't. …

A practical approach is to make a plot of the normalized singular values, sorted in descending order, on a log scale.

Often, the plot has two distinct parts. The largest SVs decrease fairly quickly, and the smaller ones all have about the same value. The plot will show a clear "kink" between the two parts.

Set the threshold to filter out all the "small" SVs. If your matrix represents measured data, the numerical value of "small" will depend very much on the accuracy and resolution of the measurements. If you have some source of "noise" in the measurement system which has different amplitudes for different measurements, either you need to choose the filter value for each set of data, or (better) find a way to eliminate or at least stabilize the noise level in the measurement system.
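
A sketch of this diagnostic, assuming NumPy and Matplotlib; the automatic cutoff at the largest log-scale drop is only a heuristic (not a rule from this thread) and assumes no singular value is exactly zero:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_sv_spectrum(H):
    """Plot the normalized singular values on a log scale to look
    for the 'kink' between the physical and noise-level SVs."""
    s = np.linalg.svd(H, compute_uv=False)      # sorted descending
    plt.semilogy(np.arange(1, s.size + 1), s / s[0], "o-")
    plt.xlabel("singular value index")
    plt.ylabel("normalized singular value")
    plt.grid(True)
    plt.show()
    # Crude automatic cutoff: the largest drop between consecutive
    # values on the log scale.
    drops = np.diff(np.log10(s / s[0]))
    return np.argmin(drops) + 1                 # index of the first "small" SV
```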
 
AlephZero said:
A practical approach is to make a plot of the normalized singular values, sorted in descending order, on a log scale. …

My channel is random (it changes), so how do I know which values are "small" enough to discard each time?

I just discovered that reducing th to 0.1 enhanced the performance, but the results were not the same as those of a journal paper I am trying to simulate. When I set th=1e-5, the results are almost the same as the paper's.

A question comes to my mind here: how can we know whether an approximate inversion is accurate enough, given that we don't know the exact inverse? Basically, I am transmitting signals over a wireless channel, and these signals interfere at the receiver. So, I am using a minimum mean squared error (MMSE) equalizer to reduce the effect of interference and noise as follows:

MSE=E\left[\|\mathbf{d}-\mathbf{G}\mathbf{y}\|^2\right]

where d is the vector of transmitted signals and y is the vector of received signals. A related question: does improved detection performance necessarily mean that the inversion is accurate enough?
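
Two sanity checks one could use, sketched under illustrative assumptions: the pseudo-inverse identity A A^+ A = A, and a Monte-Carlo estimate of the MSE in which white noise and BPSK symbols are assumed purely for the example:

```python
import numpy as np

def residual_check(A, A_pinv):
    """For an exact (pseudo-)inverse, A @ A_pinv @ A equals A,
    so this relative residual should be close to zero."""
    return np.linalg.norm(A @ A_pinv @ A - A) / np.linalg.norm(A)

def empirical_mse(G, H, N0, trials=10_000, rng=None):
    """Monte-Carlo estimate of E[||d - G y||^2]; white noise is
    assumed here only for illustration."""
    rng = rng or np.random.default_rng()
    N = H.shape[0]
    total = 0.0
    for _ in range(trials):
        d = rng.choice([-1.0, 1.0], size=N)             # e.g. BPSK symbols
        y = H @ d + np.sqrt(N0) * rng.standard_normal(N)
        total += np.sum(np.abs(d - G @ y) ** 2)
    return total / trials
```

A low empirical MSE says the equalizer works for detection, which is the practical goal here, but it does not by itself prove the inversion is accurate in a matrix-norm sense; the residual check addresses that separately.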
 
