[Numerical] Why is it better to solve Ax=b instead of calculating inv(A)

  • Thread starter: nonequilibrium
  • Tags: Numerical
AI Thread Summary
Solving the equation Ax=b directly is preferred over calculating the inverse of A due to several factors, including stability, accuracy, and computational efficiency. When reusing the matrix A in iterative methods, techniques like LU decomposition offer a more efficient solution than both Gaussian elimination and matrix inversion. LU decomposition maintains sparsity, which is crucial when dealing with sparse matrices, as the inverse is often fully populated and computationally expensive. In general, explicitly inverting a matrix is discouraged unless there are compelling reasons to do so. Overall, direct equation solving methods are typically more advantageous in numerical applications.
nonequilibrium
Hello,

So I'm using a numerical method where I iteratively have to solve Ax_{i+1}=x_i, which can of course be done by calculating the inverse matrix, but something in the back of my mind is telling me that it is better to solve the equation using Gaussian elimination and substitution.

However, what is the reason for this? And is the difference relevant, or is it merely a difference "in principle"?
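(For concreteness, here is a minimal sketch of the two approaches being compared, assuming NumPy; the small matrix, starting vector, and three-step iteration are made up for illustration.)

import numpy as np

# Made-up 3x3 system and starting vector, purely for illustration.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
x0 = np.array([1.0, 0.0, 0.0])

# Approach 1: solve A x_{i+1} = x_i at each step (Gaussian elimination under the hood).
x = x0.copy()
for _ in range(3):
    x = np.linalg.solve(A, x)

# Approach 2: form inv(A) once and multiply at each step.
A_inv = np.linalg.inv(A)
y = x0.copy()
for _ in range(3):
    y = A_inv @ y

print(x, y)  # the two agree to roundoff for this small, well-conditioned A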
 
There are lots of reasons not to use the inverse: stability, accuracy, time. However, some of those considerations apply mainly to a one-time solution of Ax=b. Here you are apparently reusing the A matrix, iteratively solving Ax_{i+1} = x_i. If you reuse A, timing considerations suggest you should compute the inverse of A instead of repeating Gaussian elimination.

There's an even better approach than either of the techniques you mentioned: compute the LU decomposition. Computing the LU decomposition once is cheaper than repeating Gaussian elimination for every right-hand side, and much cheaper than forming the matrix inverse. LU back-substitution is fast (comparable in cost to a matrix-vector multiply), and the same stability and accuracy advantages that apply to Gaussian elimination carry over to LU decomposition.
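(A minimal sketch of that reuse pattern, assuming SciPy; scipy.linalg.lu_factor / lu_solve and the toy matrix are my choices for illustration, not something specified in the thread.)

import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
x = np.array([1.0, 0.0, 0.0])

# Factor A = P L U once: this is the O(n^3) part.
lu, piv = lu_factor(A)

# Each iteration then costs only two triangular solves, O(n^2).
for _ in range(3):
    x = lu_solve((lu, piv), x)

print(x)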
 
The iteration usually takes only three steps, though, so maybe it is not evident that the inverse comes out ahead of solving by Gaussian elimination?

Interesting what you say about the LU decomposition...
 
Calculating the inverse efficiently (i.e. by doing an LU decomposition and then solving N right-hand sides, each containing a single 1 with the rest of the entries 0) carries an overhead of about N extra solves. So if you are only doing 3 solves anyway, you will lose unless N is very small.
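(A sketch of that cost accounting, assuming SciPy; the matrix size and right-hand sides are made up. Forming the inverse means LU-solving against all N columns of the identity, while the original problem only ever needs its 3 solves.)

import numpy as np
from scipy.linalg import lu_factor, lu_solve

n = 200
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n)) + n * np.eye(n)  # made-up, diagonally dominant test matrix

lu, piv = lu_factor(A)                 # one factorization either way

# Explicit inverse: n extra triangular solves, one per column of the identity.
A_inv = lu_solve((lu, piv), np.eye(n))

# The actual problem: only 3 solves are ever needed.
x = rng.standard_normal(n)
for _ in range(3):
    x = lu_solve((lu, piv), x)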

In addition to D H's points, A is often sparse, and the LU decomposition retains a similar amount of sparsity, but the inverse matrix is almost always fully populated. In that situation, multiplying by the inverse can be orders of magnitude more expensive than solving with the sparse factors, quite apart from the numerical conditioning issues and the amount of storage that may be required for the inverse matrix.
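(A small check of the fill-in point, assuming SciPy's sparse LU, scipy.sparse.linalg.splu; the tridiagonal test matrix is made up for illustration.)

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

# Made-up example: a tridiagonal (1-D Laplacian-style) matrix, very sparse.
n = 1000
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")

lu = splu(A)                           # sparse LU: the factors stay sparse
nnz_factors = lu.L.nnz + lu.U.nnz      # roughly O(n) nonzeros here

A_inv = np.linalg.inv(A.toarray())     # the inverse is essentially fully dense
nnz_inverse = np.count_nonzero(A_inv)  # roughly n^2 nonzeros

print(nnz_factors, nnz_inverse)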

My "rule of thumb" here is that explicitly inverting a matrix numerically is NEVER the right thing to do, unless you can make a very good argument in favor of it. Even when it appears to be "no worse" than equation solving, straightforward equation solving (based on LU decomoposition for example) is usually not the only alternative.
 
Thank you both for the good advice.
 