A pseudo-orthogonalization process

  • Thread starter sunjin09
In summary, we discussed the tridiagonalization of a Hermitian complex matrix via Gram–Schmidt orthonormalization, and then explored the potential issues with applying the same procedure to a complex symmetric matrix using a bilinear form that is not a true inner product. It was noted that this can fail to preserve the geometry of the operator and may lead to numerical breakdown when the pseudo-norm of a vector (nearly) vanishes. The question was raised whether there are constraints on the matrix that prevent these issues; it was suggested that further analysis or experimentation would be needed to determine when the method is reliable.
  • #1
sunjin09
It is well known that a Hermitian complex matrix A, [itex]A^{\dagger}=A[/itex], can be brought into tridiagonal form: [itex]A=V^{\dagger}HV[/itex], where [itex]^{\dagger}[/itex] denotes the Hermitian conjugate, [itex]H[/itex] is the tridiagonal Hessenberg matrix, and [itex]V^{\dagger}V=VV^{\dagger}=I[/itex]. This decomposition is realized using Gram–Schmidt orthonormalization, where the entries of H are nothing but the coefficients generated by the orthonormalization.

Now suppose A is only symmetric, i.e., [itex]A^T=A[/itex] where T denotes transpose without conjugation, and proceed formally through Gram–Schmidt orthonormalization, with the corresponding "inner product" taken as [itex]\langle x,y\rangle=\sum_i x_i y_i[/itex] without conjugation and the pseudo-norm as [itex]\sqrt{\langle x,x\rangle}[/itex]; this is not a true inner product, but it can nevertheless be defined. The result would be that A is tridiagonalized into the form [itex]A=VHV^T[/itex] (if V has full rank), where [itex]V^TV=VV^T=I[/itex], enabling the solution of [itex]Ax=b[/itex] by solving [itex]Hy=V^Tb[/itex] and then forming [itex]x=Vy[/itex], where the tridiagonal H system is much easier to solve. Now my question is: where can this method go wrong? And if it can, what is the condition on A under which it does work?
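For concreteness, here is a minimal NumPy sketch of the recurrence just described: a Lanczos-type tridiagonalization driven by the bilinear form [itex]\langle x,y\rangle=\sum_i x_i y_i[/itex] instead of the Hermitian inner product. The function name and breakdown tolerance are illustrative choices, not code from this thread; in exact arithmetic, and barring a (near-)zero pseudo-norm, it returns V and a tridiagonal H with [itex]A=VHV^T[/itex] and [itex]V^TV=I[/itex].

[code=python]
import numpy as np

def bilinear_lanczos(A, b, tol=1e-12):
    """Tridiagonalize a complex symmetric A (A = A^T) using the bilinear
    form <x, y> = sum_i x_i y_i (no conjugation).  Barring breakdown,
    returns V and tridiagonal H with A = V H V^T and V^T V = I."""
    n = len(b)
    b = np.asarray(b, dtype=complex)
    V = np.zeros((n, n), dtype=complex)
    alpha = np.zeros(n, dtype=complex)              # diagonal of H
    beta = np.zeros(max(n - 1, 0), dtype=complex)   # off-diagonal of H

    v_prev = np.zeros(n, dtype=complex)
    v = b / np.sqrt(b @ b)                          # pseudo-normalized start vector
    for j in range(n):
        V[:, j] = v
        w = A @ v - (beta[j - 1] * v_prev if j > 0 else 0.0)
        alpha[j] = v @ w                            # bilinear projection coefficient
        w = w - alpha[j] * v
        if j < n - 1:
            beta[j] = np.sqrt(w @ w)                # complex square root: the pseudo-norm
            if abs(beta[j]) < tol:                  # <w, w> ~ 0: the process breaks down
                raise RuntimeError("near-zero pseudo-norm, cannot continue")
            v_prev, v = v, w / beta[j]

    H = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return V, H
[/code]

With the full n-step factorization, [itex]Ax=b[/itex] would then be solved as y = np.linalg.solve(H, V.T @ b) followed by x = V @ y.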
 
  • #2
Hey sunjin09.

The thing I worry about with not using a proper inner product is that the geometry of your operator is not preserved.

Remember that the inner product is what defines the geometry of your space, and for the normal inner product in R^n it is exactly that structure which is preserved.

By putting in a square root, what you have done in terms of the transformation is essentially change the geometry with respect to the new "inner product". In terms of orthonormalization, the orthogonality relations are still reflected in this inner product (the term under the square-root sign being zero still gives a zero answer), but the actual projections themselves will be really screwy.

The reason I am saying this is that if you want to calculate the projections of a vector (or an operator) onto a given orthonormal basis, that inner-product definition will screw things up really badly, and if you use it in your decomposition and don't transform things back somehow, the result won't make any sense.

Also, just out of curiosity, are there any constraints on your matrix besides it being complex symmetric (i.e. any further constraints)?
 
  • #3
Hi chiro, thanks for answering. Let me first clarify a few points:

1. The Gram–Schmidt process I mentioned is actually the Arnoldi process, where you start with an (arbitrary) initial vector v and try to find an orthogonal basis for the Krylov space K=span{v,Av,A^2v,...}.

2. The system A I consider is actually the EM Green's function G(r,r'), which is known to be transpose symmetric but NOT Hermitian, i.e., G(r',r)=G(r,r') but G(r',r)≠G*(r,r'). G is used to form the integral equation of an (arbitrary) EM scattering problem, i.e.,
E=E0+G(σE), where E0 is the known incident E field, σ is the known conductivity, and E is the unknown total field; that is, the total field equals the incident field plus the scattered (induced) field, where the scattered field is generated by the induced current σE. Quite a straightforward formulation.

The system is typically very large, and an approximate solution is sought in the Krylov space K=span{E0,G(σE0),G(σG(σE0)),...} (fixed-point iteration, or the Born series). If an orthogonal decomposition of K can be formulated as in my OP, I would avoid the biconjugate gradient method, which works for nonsymmetric systems, and instead exploit the non-conjugate symmetry to cut the cost by half. Since the problem is a well-defined physics problem, the Born series is guaranteed to converge to a unique solution. I guess that places some constraint on the system, i.e., it is well posed with a unique solution, or equivalently the map E ↦ E0+G(σE) is a contraction.
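As a rough illustration of this fixed-point (Born series) iteration, and not actual code from this problem, here is a sketch in which G, sigma, and E0 are hypothetical placeholders for a discretized Green's-function matrix, a per-point conductivity, and the incident field; each iterate is a partial sum of the Born series and lies in the Krylov space K.

[code=python]
import numpy as np

def born_series(G, sigma, E0, max_iter=200, tol=1e-10):
    """Fixed-point iteration for E = E0 + G (sigma * E).
    G     : (n, n) discretized Green's-function matrix (complex symmetric)
    sigma : (n,) conductivity at each grid point
    E0    : (n,) incident field
    The k-th iterate is the k-th partial sum of the Born series and lies
    in the Krylov space span{E0, G sigma E0, (G sigma)^2 E0, ...}."""
    E = np.asarray(E0, dtype=complex)
    for k in range(max_iter):
        E_next = E0 + G @ (sigma * E)
        if np.linalg.norm(E_next - E) <= tol * np.linalg.norm(E0):
            return E_next, k + 1        # converged
        E = E_next
    return E, max_iter                  # not converged within max_iter
[/code]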

I tested with a small arbitrary A and the illegal inner product I mentioned, and it worked. I wonder whether it works for my EM problem in general. The thing that worries me about the illegal inner product is indeed what you mentioned: <x,y>=0 does not guarantee that x and y are orthogonal, and <x,x>=0 does not even guarantee x=0, so my "unitary" matrix [itex]V, V^TV=VV^T=I[/itex], may be poorly conditioned or even singular.
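Along the lines of that small test, here is a minimal self-contained check (an arbitrary random matrix, not the EM operator; the size and seed are arbitrary): bilinear Gram–Schmidt on the Krylov vectors of a random complex symmetric A, reporting how far [itex]V^TV[/itex] is from the identity and how well conditioned V is.

[code=python]
import numpy as np

rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = M + M.T                        # random complex symmetric (not Hermitian) matrix
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Bilinear Gram-Schmidt on the Krylov vectors {b, Ab, A^2 b, ...}
basis = []
w = b.copy()
for k in range(n):
    for v in basis:
        w = w - (v @ w) * v        # "projection" with <x, y> = sum_i x_i y_i
    w = w / np.sqrt(w @ w)         # complex pseudo-norm; can be tiny or zero
    basis.append(w)
    w = A @ w                      # next Krylov vector
V = np.array(basis).T

print("|| V^T V - I || =", np.linalg.norm(V.T @ V - np.eye(n)))
print("cond(V)         =", np.linalg.cond(V))
[/code]

Note that [itex]V^TV=I[/itex] does not make V unitary in the usual sense, so cond(V) can still be large; that is exactly the worry about this construction.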
 
  • #4
Maybe I'm misunderstanding your notation, but don't you have the basic problem that your definition of < x , x > can be zero when x is nonzero? For example, if x is the vector
(1 + i, 1 - i), then < x , x > = (1 + i)^2 + (1 - i)^2 = 2i - 2i = 0.

If that is so, it shouldn't be too hard to invent a matrix where the method fails completely, or runs into severe numerical problems losing precision through cancelling nearly equal and opposite terms.

That can't happen in the Hermitian case, because the computation of < x, x > is always well conditioned.
 
  • #5
AlephZero said:
Maybe I'm misunderstanding your notation, but don't you have the basic problem that your definition of < x , x > can be zero when x is nonzero? For example if x is the vector
(1 + i), (1 - i) ?

If that is so, it shouldn't be too hard to invent a matrix where the method fails completely, or runs into severe numerical problems losing precision through cancelling nearly equal and opposite terms.

That can't happen in the Hermitian case, because the computation of < x, x > is always well conditioned.

You are totally right. That's what I worry about. I was just hoping that maybe the solution is still valid given the particular problem I have (with further constraints on the system). Remember that the solution is an iterative one. My previous post has a lot more detail; could you take a look at it? Thank you so much!
 
  • #6
I think you are "living dangerously" with this. We agree the method fails in general if you create a vector that has pseudo-norm-length 0. So you really have to prove that is impossible because of the special properties of your problem, or find a way to restart if it happens.

If you get a small pseudo-norm instead of an exact zero, the corresponding "unit length" vector will probably be scaled to have very large components, leading to numerical problems later on.
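One common guard against this (a sketch of the idea, not a full look-ahead scheme; the function name and tolerance are illustrative) is to compare the pseudo-norm against the true Euclidean norm before dividing: if |w^T w| is tiny relative to the sum of |w_i|^2, nearly everything has cancelled and normalizing would blow the vector up.

[code=python]
import numpy as np

def safe_pseudo_normalize(w, rel_tol=1e-8):
    """Guarded normalization by the pseudo-norm sqrt(w^T w).
    Compares |w^T w| against the true norm ||w||^2: if the ratio is tiny,
    the pseudo-norm has cancelled almost completely and dividing by it
    would produce a huge "unit" vector.  Raise instead, so the caller can
    restart the recurrence (e.g. with a new start vector)."""
    w = np.asarray(w, dtype=complex)
    true_norm_sq = np.vdot(w, w).real       # sum_i |w_i|^2, always well conditioned
    pseudo_sq = w @ w                       # sum_i w_i^2, can cancel to ~0
    if true_norm_sq == 0.0 or abs(pseudo_sq) < rel_tol * true_norm_sq:
        raise FloatingPointError("near-zero pseudo-norm: restart needed")
    return w / np.sqrt(pseudo_sq)
[/code]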

I don't think arbitrary complex symmetric matrices have many "nice" numerical properties. You can see that by reformulating the problem as a double-sized real matrix. If ##A = P + iQ## where ##P## and ##Q## are real, then you can replace A by the real matrix
$$A' = \begin{bmatrix} P & Q \\ -Q & P \end{bmatrix}$$
If A is Hermitian, P is symmetric, Q is antisymmetric, and A' is symmetric. If A is symmetric, P and Q are both symmetric and A' is (almost) an arbitrary real matrix.
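A quick numerical illustration of this point (random matrices, purely illustrative): build A' from P and Q as above and check which structure survives.

[code=python]
import numpy as np

def doubled(A):
    """The real embedding A' = [[P, Q], [-Q, P]] for A = P + iQ."""
    P, Q = A.real, A.imag
    return np.block([[P, Q], [-Q, P]])

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

A_herm = M + M.conj().T    # Hermitian: P symmetric, Q antisymmetric
A_symm = M + M.T           # complex symmetric: P and Q both symmetric

for name, A in [("Hermitian", A_herm), ("complex symmetric", A_symm)]:
    Ap = doubled(A)
    print(name, "-> A' symmetric?", np.allclose(Ap, Ap.T))
[/code]

For a Hermitian A the embedding comes out symmetric; for a merely complex symmetric A it generally does not, which is the sense in which A' is "almost" an arbitrary real matrix.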

If you want to explore this, I would try using a "pseudo norm" that really is a norm - e.g. Norm(x + iy) = |x| + |y| or something similar.
 
  • #7
AlephZero said:
I think you are "living dangerously" with this. We agree the method fails in general if you create a vector that has pseudo-norm-length 0. So you really have to prove that is impossible because of the special properties of your problem, or find a way to restart if it happens.

If you get a small pseudo-norm instead of an exact zero, the corresponding "unit length" vector will probably be scaled to have very large components, leading to numerical problems later on.

I don't think arbitrary complex symmetric matrices have many "nice" numerical properties. You can see that by reformulating the problem as a double-sized real matrix. If ##A = P + iQ## where ##P## and ##Q## are real, then you can replace A by the real matrix
$$A' = \begin{bmatrix} P & Q \\ -Q & P \end{bmatrix}$$
If A is Hermitian, P is symmetric, Q is antisymmetric, and A' is symmetric. If A is symmetric, P and Q are both symmetric and A' is (almost) an arbitrary real matrix.

If you want to explore this, I would try using a "pseudo norm" that really is a norm - e.g. Norm(x + iy) = |x| + |y| or something similar.

Writing A in terms of its real and imaginary parts really is illuminating. It is simply too bothersome not to be able to make use of this kind of symmetry on a problem of such general interest. As you mentioned, maybe there is a way to restart the iteration when failure happens; however, it is probably not possible to prove general convergence even with restarted iterations. Anyway, it's good to know I didn't miss anything obvious.
 

1. What is a pseudo-orthogonalization process?

A pseudo-orthogonalization process is a mathematical method used in linear algebra to produce, from a given set of vectors, a new set that is approximately orthogonal (perpendicular). It involves iteratively adjusting the vectors to make them more orthogonal until a desired level of accuracy is reached.

2. How does a pseudo-orthogonalization process work?

The process starts with a set of vectors that are not perfectly orthogonal. It then calculates a new set of vectors that are closer to being orthogonal by subtracting the components of each vector that are parallel to the others. This process is repeated until the vectors are sufficiently orthogonal.
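The recipe described here is essentially (modified) Gram–Schmidt with an ordinary inner product; a minimal NumPy sketch (the function name is illustrative, and the columns are assumed linearly independent):

[code=python]
import numpy as np

def gram_schmidt(X):
    """Modified Gram-Schmidt with the ordinary (conjugating) inner product.
    X holds the input vectors as columns; each vector has its components
    along the previously accepted vectors subtracted, then is normalized."""
    Q = np.array(X, dtype=complex)
    for j in range(Q.shape[1]):
        for i in range(j):
            Q[:, j] -= np.vdot(Q[:, i], Q[:, j]) * Q[:, i]   # remove parallel component
        Q[:, j] /= np.linalg.norm(Q[:, j])                   # unit length
    return Q
[/code]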

3. What applications is a pseudo-orthogonalization process commonly used for?

A pseudo-orthogonalization process is commonly used in scientific and engineering fields where linear algebra is applied, such as in signal processing, computer graphics, and machine learning. It is also used in numerical methods for solving differential equations and optimization problems.

4. What is the difference between pseudo-orthogonalization and true orthogonalization?

True orthogonalization involves finding a set of vectors that are perfectly orthogonal to each other, while pseudo-orthogonalization seeks to approximate this orthogonality. Pseudo-orthogonalization is often used when true orthogonalization is not feasible or when an approximate solution is sufficient.

5. Are there any disadvantages to using a pseudo-orthogonalization process?

One potential disadvantage is that the process may not converge to a perfectly orthogonal solution, resulting in less accurate results. Additionally, the process can be computationally intensive, especially for large sets of vectors. It is important to carefully consider the trade-offs between accuracy and computational complexity when using a pseudo-orthogonalization process.
