Preconditioning for linear equations

In summary: the conversation discusses the importance of finding a good preconditioner for iterative methods in order to improve convergence rates. This is achieved by choosing a preconditioner P that approximates the coefficient matrix A, resulting in a smaller condition number for the transformed matrix AP^{-1}. The common formulation is to minimize the Frobenius norm of the difference between AP^{-1} and the identity matrix, since this clusters the spectrum near 1 and decouples into independent per-column problems. Alternative approaches, such as minimizing the norm of the difference between AP^{-1} and an orthogonal matrix Q, have been explored but have not gained as much popularity, since a condition number near 1 does not by itself guarantee fast Krylov convergence and the optimal Q is expensive to compute.
  • #1
Tusike
I understand that for many iterative methods, convergence rates can be shown to depend on the condition number of the coefficient matrix A in the linear equation
$$Ax=y.$$
Therefore, if a preconditioner satisfies
$$P \approx A,$$
then by solving the transformed linear equation
$$(AP^{-1})(Px)=y,$$
the new coefficient matrix $AP^{-1}$ will have more favorable spectral properties, and hence better convergence can be achieved.

One of the main properties a good preconditioner should satisfy, besides the above condition, is that its inverse should be cheap to apply. Thus, preconditioners are usually sought within a restricted class of structured matrices. Typical examples are the incomplete Cholesky and incomplete LU factorizations of the matrix A.
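For concreteness, here is a minimal sketch (assuming SciPy is available; the matrix size, test problem, and drop tolerance are illustrative choices, not anything from the thread) of using an incomplete LU factorization as a preconditioner for GMRES:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 500
# 1-D Poisson matrix: sparse and increasingly ill-conditioned as n grows.
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

# Incomplete LU: a cheap, sparse P ~ A; applying P^{-1} costs only
# two sparse triangular solves.
ilu = spla.spilu(A, drop_tol=1e-4)
M = spla.LinearOperator((n, n), matvec=ilu.solve)

x, info = spla.gmres(A, b, M=M)
print("converged" if info == 0 else f"gmres info = {info}")
```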

My question is: why do we want to have P approximate A, or, in a more direct approach, why do we formulate finding preconditioners as:
$$
\min_{P} \left\| AP^{-1} - I \right\|_F,
$$
where $\|\cdot\|_F$ denotes the Frobenius norm? The identity matrix isn't the only matrix with a condition number of 1; would it not be better to formulate the problem as:
$$
\min_{P,Q} \left\| AP^{-1} - Q \right\|_F,
$$
with Q having to be orthogonal? Given a certain structure restriction on P, I imagine this could lead to better preconditioning than in the previous case. Yet, I have not run across any such examples.
 
  • #2

Thank you for your insightful question regarding the formulation of finding preconditioners. Let me start by explaining why we want P to approximate A in the first place.

As you mentioned, the convergence rate of iterative methods is heavily dependent on the condition number of the coefficient matrix A. A high condition number indicates that the matrix is ill-conditioned, meaning small changes in the input can result in large changes in the output. For the conjugate gradient method, for instance, the standard bound says the iteration count grows roughly like the square root of the condition number, so ill-conditioning translates directly into slow convergence.

By finding a good preconditioner P, we aim to transform the original linear equation into a new one with a better-conditioned coefficient matrix. This is achieved by choosing P close to A, so that the transformed matrix AP^{-1} is close to the identity and therefore has a condition number close to 1. This results in faster convergence of iterative methods, as you correctly pointed out.

Now, let's address why preconditioner construction is often formulated as minimizing the Frobenius norm of $AP^{-1} - I$. The Frobenius norm is not chosen because it is the most natural measure of distance, but because it is computationally convenient: $\|AM - I\|_F^2 = \sum_{j} \|Am_j - e_j\|_2^2$, where $m_j$ is the j-th column of $M = P^{-1}$ and $e_j$ is the j-th unit vector, so the minimization decouples into n independent least-squares problems that can be solved in parallel. This is the basis of sparse approximate inverse (SPAI) preconditioners. Moreover, if $\|AP^{-1} - I\|$ is small in any norm, the eigenvalues of $AP^{-1}$ are clustered around 1, which is exactly the spectral property Krylov methods exploit.
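As an illustration of that column decoupling, here is a dense toy sketch (the random test matrix and sizes are made up; a real SPAI method additionally restricts each column to a prescribed sparsity pattern, which is dropped here for clarity):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))

# ||A M - I||_F^2 = sum_j ||A m_j - e_j||^2, so each column of M
# can be fit by its own independent least-squares problem.
M = np.zeros((n, n))
for j in range(n):
    e_j = np.zeros(n)
    e_j[j] = 1.0
    M[:, j], *_ = np.linalg.lstsq(A, e_j, rcond=None)

print(np.linalg.norm(A @ M - np.eye(n), "fro"))  # small residual
```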

As for your suggestion of minimizing $\|AP^{-1} - Q\|_F$ with Q orthogonal: for a fixed P, the minimizing Q is actually known in closed form. It is the orthogonal polar factor of $AP^{-1}$ (the classic orthogonal Procrustes solution, $Q = UV^T$ from the SVD $AP^{-1} = U\Sigma V^T$), but computing it requires an SVD, which is generally as expensive as solving the original system. The deeper issue is that a condition number of 1 is not, by itself, what makes Krylov methods fast. An orthogonal matrix has $\kappa = 1$, yet its eigenvalues can be spread over the entire unit circle; the cyclic shift (permutation) matrix is the standard example on which GMRES makes essentially no progress until the n-th iteration despite perfect conditioning. Driving $AP^{-1}$ toward the identity instead clusters the eigenvalues near 1, which is what actually shortens Krylov iterations.
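To make the Procrustes point concrete, a small sketch (the random matrix simply stands in for $AP^{-1}$): the orthogonal Q closest to a given matrix in the Frobenius norm comes directly from its SVD.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
AM = rng.standard_normal((n, n))  # stands in for A @ P^{-1}

U, s, Vt = np.linalg.svd(AM)
Q = U @ Vt  # nearest orthogonal matrix in the Frobenius norm

# No other orthogonal matrix is closer to AM than Q:
print(np.linalg.norm(AM - Q, "fro"))
print(np.linalg.norm(AM - np.eye(n), "fro"))  # usually larger here
```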

In conclusion, the formulation of finding preconditioners as minimizing the Frobenius norm of the difference between AP^{-1} and the identity matrix has proven to be effective in practice. However, as with any scientific problem, there are always alternative approaches that can be explored. I hope this answer has provided you with a better understanding of the rationale behind the formulation of finding preconditioners.
 

1. What is preconditioning for linear equations?

Preconditioning for linear equations is a technique used to improve the convergence rate of iterative methods for solving linear systems. It involves transforming the original system into an equivalent one that is easier to solve, typically by reducing the condition number of the coefficient matrix.

2. How does preconditioning work?

Preconditioning works by transforming the original linear system into a new system that is easier to solve. This is typically done by applying a matrix (or operator) that approximates the inverse of the coefficient matrix, either on the left ($P^{-1}Ax = P^{-1}y$), on the right ($AP^{-1}(Px) = y$), or split between both sides. This reduces the condition number of the operator the iterative method actually sees and improves its convergence rate.
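A minimal sketch of this effect with the simplest preconditioner of all, P = diag(A) (Jacobi preconditioning); the badly scaled test matrix below is made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
# Columns scaled over six orders of magnitude: a badly conditioned A.
scales = 10.0 ** rng.uniform(-3, 3, size=n)
A = (np.eye(n) + 0.01 * rng.standard_normal((n, n))) * scales  # scales column j

P_inv = np.diag(1.0 / np.diag(A))  # Jacobi: P = diag(A), trivially invertible
print(f"cond(A)      = {np.linalg.cond(A):.2e}")           # large, from the scaling
print(f"cond(A P^-1) = {np.linalg.cond(A @ P_inv):.2e}")   # close to 1
```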

3. What are the benefits of preconditioning?

The main benefit of preconditioning is that it can significantly improve the convergence rate of iterative methods for solving linear systems. This can lead to faster and more accurate solutions, especially for large and ill-conditioned systems. Preconditioning can also help to reduce the number of iterations needed to reach a solution, which can save computational time and resources.
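As a rough illustration of the iteration savings (assuming SciPy; the tridiagonal test problem and drop tolerance are illustrative, and the incomplete LU stands in for the incomplete Cholesky one would normally use on a symmetric positive definite matrix), one can count conjugate-gradient iterations with and without a preconditioner:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 1000
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

def cg_iterations(M=None):
    count = [0]
    def cb(xk):          # called once per CG iteration
        count[0] += 1
    spla.cg(A, b, M=M, callback=cb)
    return count[0]

ilu = spla.spilu(A, drop_tol=1e-3)
M = spla.LinearOperator((n, n), matvec=ilu.solve)

print("unpreconditioned:  ", cg_iterations())
print("ILU-preconditioned:", cg_iterations(M))  # far fewer iterations
```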

4. What are some common preconditioners used for linear equations?

There are several common preconditioners used for linear equations, including diagonal preconditioning, incomplete LU factorization, and multigrid methods. These preconditioners vary in complexity and effectiveness, and the choice of which one to use depends on the specific characteristics of the linear system being solved.

5. When should preconditioning be used?

Preconditioning should be used when solving large and/or ill-conditioned linear systems using iterative methods. These systems can be difficult to solve without preconditioning, as they may require a large number of iterations to reach a solution. In these cases, preconditioning can greatly improve the convergence rate and lead to more efficient and accurate solutions.
