Pseudo inverse very inaccurate

  • Thread starter: divB
  • Tags: Inverse

Discussion Overview

The discussion revolves around the numerical challenges encountered when using the pseudoinverse in MATLAB for a matrix with nearly linearly dependent columns. Participants explore the implications of high condition numbers on least squares solutions and seek methods to stabilize the system.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant describes a numerical problem involving a matrix Hf and two vectors yl1 and yl3, noting that the results from solving Hf \ yl1 and Hf \ yl3 are significantly different.
  • Another participant suggests that the columns of Hf may be too close, indicating potential linear dependence that could lead to numerical instability in calculations involving the pseudoinverse.
  • A participant inquires about methods to stabilize the system, considering modifications to the vectors and matrix.
  • Another participant mentions the use of numeric iterative schemes for systems with high condition numbers as a potential approach.
  • One participant questions why MATLAB does not implement more robust algorithms for such cases and seeks clarification on the role of linear dependency in both rows and columns of the matrix.
  • Another participant highlights that a condition number of 183 is quite high and suggests using the pseudoinverse function (pinv) instead of the backslash operator for potentially more consistent results.
  • It is noted that the backslash operator may not account for the non-uniqueness of solutions due to the presence of a nonzero vector v that satisfies Hf*v = 0.

Areas of Agreement / Disagreement

Participants express varying views on the implications of linear dependence and the effectiveness of different numerical methods. There is no consensus on a definitive solution or approach to the problem presented.

Contextual Notes

Participants discuss the condition number's impact on the stability of least squares solutions, but there is no clear threshold established for when instability occurs. The discussion also touches on the complexity of linear dependencies in both rows and columns, which remains unresolved.

divB
Hi,

I'm stuck on what seems like a very simple numerical problem. In MATLAB, the matrix Hf is:

Code:
>> Hf
Hf =

  1.0e+003 *

    1.6443    1.6516    1.6583
    4.8373    4.8349    4.8334
    4.6385    4.6418    4.6445
   -9.6014   -9.6084   -9.6154

And the following vectors which are very close:

Code:
>> [yl1 , yl3 , yl1 - yl3 ]
ans =

  1.0e+006 *

    0.2966    0.2972   -0.0006
    0.8705    0.8703    0.0002
    0.8352    0.8355   -0.0003
   -1.7288   -1.7295    0.0006

yl1 is my result as it should be:

Code:
>> Hf \ yl1
ans =

  100.0000
   75.0000
    5.0000

yl3 is obtained in a different way, but it is very close to the original. Still:

Code:
>> Hf \ yl3
ans =

   56.0412
   72.5578
   51.4007

The result is not just slightly off; it is terrible, unusable!

I have a lot of redundancy in the data, so I can add many more rows to the matrix Hf. However, no matter how many I add, the result is always the same ... unusable.

Can anyone explain why least squares performs so badly in this case? I'm a bit confused, because least squares should be pretty robust ...

Thanks,
divB
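For readers without MATLAB at hand, the situation can be reproduced from the printed values with NumPy. This is a Python sketch; the numbers are transcribed from the rounded output above, so the exact figures will differ from the MATLAB session, but the instability is the same:

```python
import numpy as np

# Hf, yl1, yl3 transcribed from the rounded MATLAB output above
Hf = 1.0e3 * np.array([
    [ 1.6443,  1.6516,  1.6583],
    [ 4.8373,  4.8349,  4.8334],
    [ 4.6385,  4.6418,  4.6445],
    [-9.6014, -9.6084, -9.6154],
])
yl1 = 1.0e6 * np.array([0.2966, 0.8705, 0.8352, -1.7288])
yl3 = 1.0e6 * np.array([0.2972, 0.8703, 0.8355, -1.7295])

# Least-squares solutions (the NumPy equivalent of MATLAB's Hf \ yl)
h1 = np.linalg.lstsq(Hf, yl1, rcond=None)[0]
h3 = np.linalg.lstsq(Hf, yl3, rcond=None)[0]

print(np.linalg.cond(Hf))        # huge: the columns are nearly dependent
print(np.linalg.norm(h1 - h3))   # the two "solutions" are far apart
```

The condition number of this matrix is enormous, so the tiny difference between yl1 and yl3 is amplified into a large difference between the solutions.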
 
Hmm, sorry I think I know: The columns of Hf are too close, right?
 
divB said:
Hmm, sorry I think I know: The columns of Hf are too close, right?

Likely, yes. The columns are very nearly linearly dependent, which can make calculations like the pseudoinverse numerically unstable.
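The near-dependence is easy to check numerically: the second difference of the columns, c1 - 2*c2 + c3, is tiny compared with the columns themselves. A NumPy sketch using the rounded values printed earlier in the thread:

```python
import numpy as np

# Columns of Hf, transcribed (rounded) from the output earlier in the thread
Hf = 1.0e3 * np.array([
    [ 1.6443,  1.6516,  1.6583],
    [ 4.8373,  4.8349,  4.8334],
    [ 4.6385,  4.6418,  4.6445],
    [-9.6014, -9.6084, -9.6154],
])
c1, c2, c3 = Hf[:, 0], Hf[:, 1], Hf[:, 2]

# c1 - 2*c2 + c3 is tiny relative to the columns themselves,
# i.e. the columns very nearly satisfy an exact linear relation
combo = c1 - 2 * c2 + c3
print(np.linalg.norm(combo) / np.linalg.norm(c1))  # ~1e-4
```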
 
Hmm, thanks. Is there a good way to stabilize such a system? E.g., I have y = Hf*h; I want to find h, and I may modify both y and Hf.
 
It's been a while since I did this myself, but have you tried looking at the various numeric iterative schemes to solve systems with a high condition number?
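Not an iterative scheme, but one standard way to stabilize such a system is Tikhonov (ridge) regularization: penalize the norm of h so that the nearly-dependent direction stops blowing up. Below is a NumPy sketch using the rounded values from earlier in the thread; the value of lam is illustrative only, and the regularized solutions are biased (pulled toward smaller norms) in exchange for much lower sensitivity to perturbations in y:

```python
import numpy as np

# Hf, yl1, yl3 transcribed (rounded) from the MATLAB output earlier in the thread
Hf = 1.0e3 * np.array([
    [ 1.6443,  1.6516,  1.6583],
    [ 4.8373,  4.8349,  4.8334],
    [ 4.6385,  4.6418,  4.6445],
    [-9.6014, -9.6084, -9.6154],
])
yl1 = 1.0e6 * np.array([0.2966, 0.8705, 0.8352, -1.7288])
yl3 = 1.0e6 * np.array([0.2972, 0.8703, 0.8355, -1.7295])

def ridge(H, y, lam):
    """Solve min ||H*h - y||^2 + lam*||h||^2 via the regularized normal equations."""
    n = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(n), H.T @ y)

lam = 1e4  # illustrative value; in practice choose by cross-validation or an L-curve
h1r = ridge(Hf, yl1, lam)
h3r = ridge(Hf, yl3, lam)

# Plain least squares for comparison
h1 = np.linalg.lstsq(Hf, yl1, rcond=None)[0]
h3 = np.linalg.lstsq(Hf, yl3, rcond=None)[0]

# The regularized solutions agree with each other far better than
# the plain least-squares ones do
print(np.linalg.norm(h1r - h3r), np.linalg.norm(h1 - h3))
```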
 
Hmm, I wonder why MATLAB does not implement these itself?
In any case, I get similar results even when the condition number is not so high, e.g. cond(Hf) = 183. Above which condition number can one say that least squares becomes unstable?

I am currently trying to reformulate the problem to avoid these similar columns. In my current setup, I sum over a series where each column differs only by one sample of the series. Clearly, this yields very similar numbers in each column per row.

However, I am currently confused about the linear dependency of the rows/columns in such a linear system. Can you clarify that? I thought that for solvability, only the linear dependency of the rows (rather than the columns) plays a role?

E.g., when there are fewer linearly independent rows than unknowns, the system cannot be solved.

So does this mean that the rows AND columns must be linearly independent?

I found no clear explanation.
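On the rows-vs-columns question, a small made-up example may help: for an overdetermined system, it is the column rank that decides whether the least-squares solution is unique. The matrix below is constructed so that its second column is exactly twice the first:

```python
import numpy as np

# A tall (overdetermined) matrix whose second column is exactly 2x the first,
# so the columns are linearly dependent even though there are "enough" rows
A = np.array([
    [1.0, 2.0, 2.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 5.0],
    [1.0, 2.0, 0.0],
])
print(np.linalg.matrix_rank(A))  # 2, not 3: the LS solution is not unique

# v spans the null space of the columns: A @ v = 0, so if h minimizes
# ||A h - y||, then h + t*v minimizes it equally well for any t
v = np.array([2.0, -1.0, 0.0])
print(np.linalg.norm(A @ v))     # 0.0
```

So a tall matrix can have plenty of independent rows and still give a non-unique (or, if the dependence is only approximate, a wildly unstable) least-squares solution.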
 
That is an extremely high condition number: 183 is very big.
 
Instead of Hf \ yl1, try pinv(Hf)*yl1. That may give you more "consistent" results.

Judging by http://www.mathworks.co.uk/help/matlab/ref/arithmeticoperators.html, \ uses a fairly simple-minded (but quick) algorithm to solve badly conditioned problems. The bottom line is that the solution isn't unique, because there is a nonzero vector v such that Hf*v = 0, and you can add any multiple of v to the "solution" to get another solution. The algorithm that \ uses doesn't care what that multiple is, but the algorithm for pinv() does: it also tries to make the components of the "answer" as small as possible.
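The minimum-norm behaviour of pinv() can be seen on a small made-up example with exactly dependent columns (NumPy's np.linalg.pinv standing in for MATLAB's pinv):

```python
import numpy as np

# Made-up matrix whose third column is exactly the sum of the first two,
# so A @ v = 0 for v = (1, 1, -1): the least-squares solution is not unique
A = np.array([
    [1.0, 1.0, 2.0],
    [2.0, 0.0, 2.0],
    [0.0, 3.0, 3.0],
    [1.0, 1.0, 2.0],
])
v = np.array([1.0, 1.0, -1.0])
y = np.array([1.0, 2.0, 3.0, 4.0])

h = np.linalg.pinv(A) @ y  # the minimum-norm least-squares solution

# h + t*v fits y exactly as well for any t, but with a larger norm than h
r0 = np.linalg.norm(A @ h - y)
r1 = np.linalg.norm(A @ (h + 3.0 * v) - y)
print(abs(r0 - r1))                                     # ~0: same residual
print(np.linalg.norm(h), np.linalg.norm(h + 3.0 * v))   # h is the smaller one
```

Among all the equally good fits h + t*v, pinv() picks the one with the smallest norm, which is why it gives reproducible answers where \ does not.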