Optimizing Matrix Inversions: Tips for Faster Calculations

  • Context: Graduate
  • Thread starter: daudaudaudau
  • Tags: Inverse Matrix Sum

Discussion Overview

The discussion revolves around optimizing the calculation of matrix inversions of the form \((\mathbf I-\mathbf H-\mathbf S)^{-1}\), where \(\mathbf I\) is the identity matrix, \(\mathbf H\) is a constant matrix, and \(\mathbf S\) varies with each calculation. Participants explore various methods for improving computational efficiency, including factorization techniques and the implications of the iterative nature of \(\mathbf S\).

Discussion Character

  • Exploratory
  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant suggests that without specific knowledge about \(\mathbf S\), optimization options are limited, noting that matrix inversion is computationally intensive.
  • Another participant references the Sherman-Morrison formula and the Woodbury matrix identity as potential tools for optimization, especially if \(\mathbf S\) can be decomposed.
  • A participant mentions that if \(\mathbf S\) converges during iterations, this could provide opportunities for optimization, particularly if the differences between successive \(\mathbf S\) matrices have specific forms.
  • There is a suggestion that rewriting the expression using a constant matrix \(\mathbf A\) could allow for different computational strategies, including power series expansions.
  • A later reply introduces a lemma and theorem from a mathematical article regarding the inverse of the sum of matrices, which may provide insights into handling the inversion problem, although its applicability remains uncertain.

Areas of Agreement / Disagreement

Participants express various viewpoints on optimization techniques, with no consensus reached on a definitive method. The discussion includes both supportive and critical perspectives on the proposed approaches.

Contextual Notes

Limitations include the dependence on the specific structure of \(\mathbf S\) and the assumptions about its convergence. The complexity of the matrix inversion remains a significant factor in the discussion.

daudaudaudau
Hi. For a calculation I'm doing I need to do many inversions of the form
[tex] \left(\mathbf I-\mathbf H-\mathbf S\right)^{-1}[/tex]

Where I is the identity matrix, H is a known constant matrix, and S is a matrix that changes with each calculation. Is there any way to optimize this calculation, by any sort of factorization or anything like that? I.e. make it faster because the inverse of H is known in advance.
 
daudaudaudau said:
Hi. For a calculation I'm doing I need to do many inversions of the form
[tex] \left(\mathbf I-\mathbf H-\mathbf S\right)^{-1}[/tex]

Where I is the identity matrix, H is a known constant matrix, and S is a matrix that changes with each calculation. Is there any way to optimize this calculation, by any sort of factorization or anything like that? I.e. make it faster because the inverse of H is known in advance.

Without more knowledge about S, I think there aren't many options for optimisation.
Note that a matrix inverse requires O(n³) multiplications.
The subtractions in your matrix will be negligible compared to that.

Now if you know that S contains, for instance, a number of zeroes, or that two successive S matrices are related somehow...
 
Depending on the form of S, these might be useful:

http://en.wikipedia.org/wiki/Sherman–Morrison_formula
http://en.wikipedia.org/wiki/Woodbury_matrix_identity

Note that you may be able to split S into several parts and apply the Sherman–Morrison formula several times, which is still faster than recomputing the inverse from scratch.

But 99 times out of 100 in numerical methods, you don't actually need to calculate an inverse matrix at all, because what you are really doing is solving equations not inverting matrices.
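To illustrate that last point, here is a minimal NumPy sketch (with random, well-conditioned stand-ins for H and S, which are not specified in the thread): if what you ultimately need is [itex](\mathbf I-\mathbf H-\mathbf S)^{-1}\mathbf b[/itex] for some vector b, solving the linear system gives the same answer without forming the inverse, and a factorization of the matrix can be reused for many right-hand sides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
I = np.eye(n)
H = rng.standard_normal((n, n)) / n  # stand-in for the constant matrix
S = rng.standard_normal((n, n)) / n  # stand-in for the varying matrix

M = I - H - S
b = rng.standard_normal(n)

# Forming the explicit inverse and multiplying: works, but wasteful.
x_inv = np.linalg.inv(M) @ b

# Solving the system directly: same asymptotic cost for one right-hand
# side, better numerical behaviour, and the underlying LU factorization
# can be reused (e.g. via scipy.linalg.lu_factor / lu_solve) when there
# are many right-hand sides for the same M.
x_solve = np.linalg.solve(M, b)

print(np.allclose(x_inv, x_solve))
```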
 
The calculation is part of an iteration, so S will converge to S*, i.e. at some point S will not change much from iteration to iteration. Is this helpful?
 
daudaudaudau said:
The calculation is part of an iteration, so S will converge to S*, i.e. at some point S will not change much from iteration to iteration. Is this helpful?

The only relevant thing I can find is that if the difference of two successive S matrices is of the form uvᵀ (u and v vectors), then the aforementioned Sherman–Morrison formula can be used to reduce the complexity of updating the matrix inverse from O(n³) to O(n²). I have an algorithm for that based on the so-called QR decomposition.
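As a concrete sketch of that rank-1 update (the matrices here are random stand-ins, chosen well-conditioned so the Sherman–Morrison denominator stays away from zero): once the inverse of the current matrix is known, the inverse after a rank-1 change uvᵀ costs only O(n²).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
A = np.eye(n) + rng.standard_normal((n, n)) / n  # current I - H - S
A_inv = np.linalg.inv(A)                         # computed once, O(n^3)

# Suppose the next iterate differs from A by a rank-1 term u v^T.
u = rng.standard_normal(n) / n
v = rng.standard_normal(n)

# Sherman-Morrison:
# (A + u v^T)^{-1} = A^{-1} - (A^{-1} u)(v^T A^{-1}) / (1 + v^T A^{-1} u)
Ainv_u = A_inv @ u            # O(n^2)
vT_Ainv = v @ A_inv           # O(n^2)
denom = 1.0 + v @ Ainv_u
A_new_inv = A_inv - np.outer(Ainv_u, vT_Ainv) / denom  # O(n^2) total

# Check against a fresh O(n^3) inverse.
direct = np.linalg.inv(A + np.outer(u, v))
print(np.allclose(A_new_inv, direct))
```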
 
Just a daydream on the topic:

Since [tex]\mathbf{H}[/tex] is constant you can rewrite [tex](\mathbf{I} - \mathbf{H} - \mathbf{S} )^{-1}[/tex] as [tex](\mathbf{I} - \mathbf{A} - \mathbf{S_0} )^{-1}[/tex] where [tex]\mathbf{A}[/tex] is a constant matrix that can be customized to have convenient properties.

For example if I daydream about computing [tex](\mathbf{I} - \mathbf{A} - \mathbf{S_0} )^{-1}[/tex] as a power series expansion, I might want [tex]\mathbf{A}^2 = 0[/tex].
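A tiny numerical check of why [tex]\mathbf{A}^2 = 0[/tex] would be convenient: for a nilpotent A with A² = 0, the Neumann (geometric) series for the inverse terminates after one term, so [itex](\mathbf I - \mathbf A)^{-1} = \mathbf I + \mathbf A[/itex] exactly. (The 2×2 matrix below is just the simplest example of such an A, not anything from the thread.)

```python
import numpy as np

# Simplest nilpotent example: strictly upper-triangular 2x2 matrix.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
assert np.allclose(A @ A, 0)  # A is nilpotent: A^2 = 0

# Neumann series (I - A)^{-1} = I + A + A^2 + ... terminates at I + A.
I = np.eye(2)
print(np.allclose(np.linalg.inv(I - A), I + A))
```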
 
Several web pages reference the article "On the Inverse of the Sum of Matrices" by Kenneth S. Miller, Mathematics Magazine, Vol. 54, No. 2, March 1981, p. 67.

I didn't find a place where I could view the whole article but a poster on mathstackexchange quoted the results from it.

http://math.stackexchange.com/questions/17776/inverse-of-the-sum-of-matrices

The typography on that page is better than my attempt to edit a cut-and-paste of it, which is:

He proves the following:

Lemma. If [itex]A[/itex] and [itex]A+B[/itex] are invertible, and [itex]B[/itex] has rank 1, then let [itex]g=\operatorname{trace}(BA^{-1})[/itex]. Then [itex]g\neq -1[/itex] and
[tex](A+B)^{-1}=A^{-1}-\frac{1}{1+g}\,A^{-1}BA^{-1}.[/tex]

From this lemma, we can take a general [itex]A+B[/itex] that is invertible and write it as [itex]A+B=A+B_1+B_2+\cdots+B_r[/itex], where the [itex]B_i[/itex] each have rank 1 and such that each [itex]A+B_1+\cdots+B_k[/itex] is invertible (such a decomposition always exists if [itex]A+B[/itex] is invertible and [itex]\operatorname{rank}(B)=r[/itex]). Then you get:

Theorem. Let [itex]A[/itex] and [itex]A+B[/itex] be nonsingular matrices, and let [itex]B[/itex] have rank [itex]r>0[/itex]. Let [itex]B=B_1+\cdots+B_r[/itex], where each [itex]B_i[/itex] has rank 1, and each [itex]C_{k+1}=A+B_1+\cdots+B_k[/itex] is nonsingular. Setting [itex]C_1=A[/itex], then
[tex]C_{k+1}^{-1}=C_k^{-1}-g_k\,C_k^{-1}B_kC_k^{-1},[/tex]
where [itex]g_k=\frac{1}{1+\operatorname{trace}(C_k^{-1}B_k)}[/itex]. In particular,
[tex](A+B)^{-1}=C_r^{-1}-g_r\,C_r^{-1}B_rC_r^{-1}.[/tex]

(If the rank of B is 0, then B = 0, so [itex](A+B)^{-1}=A^{-1}[/itex].)

I haven't digested whether this is of any help.
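One way to get a feel for the theorem is to run it numerically. The sketch below (my own illustration, not from Miller's article) splits a random B into rank-1 pieces via its SVD and applies the recursion; with B small relative to A, each intermediate [itex]C_k[/itex] stays nonsingular and each [itex]g_k[/itex] denominator stays near 1.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
A = np.eye(n) + rng.standard_normal((n, n)) / n
B = rng.standard_normal((n, n)) / n

# Split B into rank-1 terms via its SVD: B = sum_i s_i u_i v_i^T.
U, s, Vt = np.linalg.svd(B)

C_inv = np.linalg.inv(A)  # C_1 = A
for i in range(len(s)):
    u, v = U[:, i], Vt[i]
    # B_i = s_i u v^T, so trace(C_k^{-1} B_i) = s_i * v^T C_k^{-1} u.
    ci_u = C_inv @ u              # O(n^2)
    v_ci = v @ C_inv              # O(n^2)
    g = 1.0 / (1.0 + s[i] * (v @ ci_u))
    # C_{k+1}^{-1} = C_k^{-1} - g_k C_k^{-1} B_i C_k^{-1}, done as a
    # rank-1 outer-product update in O(n^2).
    C_inv -= g * s[i] * np.outer(ci_u, v_ci)

# After all r rank-1 pieces, C_inv should equal (A + B)^{-1}.
print(np.allclose(C_inv, np.linalg.inv(A + B)))
```

Note that this does r rank-1 updates at O(n²) each, so it only beats a direct O(n³) inverse when the rank of B is small, which matches the earlier Sherman–Morrison discussion.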

In the original post, since [itex]\mathbf{H}[/itex] is constant you can write [itex](\mathbf{I} - \mathbf{H} - \mathbf{S} )[/itex] as
[itex]( \mathbf{A} + \mathbf{S_0} )[/itex] where [itex]\mathbf{A}[/itex] is a constant matrix with any properties that you want it to have. You just won't have any choice about what properties [itex]\mathbf{S_0} = \mathbf{I} - \mathbf{H} - \mathbf{S} - \mathbf{A}[/itex] has.
 
