A pseudo-orthogonalization process

  • Thread starter sunjin09
In summary, we discussed the tridiagonalization of a Hermitian complex matrix via Gram–Schmidt orthonormalization, and then explored the potential issues with applying the same procedure to a complex symmetric matrix using a bilinear form that is not a true inner product. It was noted that this can fail to preserve the geometry of the operator and may lead to numerical breakdown when the pseudo-norm of a vector (nearly) vanishes. The question was raised whether there are constraints on the matrix that prevent these issues; it was suggested that further analysis or experimentation would be needed to determine when the method is reliable.
  • #1
sunjin09
It is well known that a Hermitian complex matrix A, [itex]A^{\dagger}=A[/itex], can be brought into tridiagonal form: [itex]A=V^{\dagger}HV[/itex], where [itex]^{\dagger}[/itex] denotes the Hermitian conjugate, [itex]H[/itex] is the tridiagonal Hessenberg matrix, and [itex]V^{\dagger}V=VV^{\dagger}=I[/itex]. This decomposition is realized using Gram–Schmidt orthonormalization, where the entries of H are nothing but the coefficients generated by the orthonormalization.

Now suppose A is only symmetric, i.e., [itex]A^T=A[/itex] where T denotes transpose without conjugation, and proceed formally through Gram–Schmidt orthonormalization, with the corresponding "inner product" taken as [itex]\langle x,y\rangle=\sum_i x_i y_i[/itex] without conjugation and the pseudo-norm as [itex]\sqrt{\langle x,x\rangle}[/itex]; this is not a true inner product, but it can nevertheless be defined. The result would be that A is tridiagonalized into the form [itex]A=VHV^T[/itex] (if V has full rank), where [itex]V^TV=VV^T=I[/itex], enabling the solution of [itex]Ax=b[/itex] by solving [itex]Hy=V^Tb[/itex] and then forming [itex]x=Vy[/itex], where the tridiagonal H system is much easier to solve. Now my question is: where can this method go wrong? And if it can, what is the condition on A under which it does work?
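For concreteness, here is a minimal NumPy sketch of the recurrence just described: a Lanczos-type tridiagonalization driven by the bilinear form [itex]\langle x,y\rangle=\sum_i x_i y_i[/itex] instead of the Hermitian inner product. The function name and breakdown tolerance are illustrative choices, not code from this thread; in exact arithmetic, and barring a (near-)zero pseudo-norm, it returns V and a tridiagonal H with [itex]A=VHV^T[/itex] and [itex]V^TV=I[/itex].

[code=python]
import numpy as np

def bilinear_lanczos(A, b, tol=1e-12):
    """Tridiagonalize a complex symmetric A (A = A^T) using the bilinear
    form <x, y> = sum_i x_i y_i (no conjugation).  Barring breakdown,
    returns V and tridiagonal H with A = V H V^T and V^T V = I."""
    n = len(b)
    b = np.asarray(b, dtype=complex)
    V = np.zeros((n, n), dtype=complex)
    alpha = np.zeros(n, dtype=complex)              # diagonal of H
    beta = np.zeros(max(n - 1, 0), dtype=complex)   # off-diagonal of H

    v_prev = np.zeros(n, dtype=complex)
    v = b / np.sqrt(b @ b)                          # pseudo-normalized start vector
    for j in range(n):
        V[:, j] = v
        w = A @ v - (beta[j - 1] * v_prev if j > 0 else 0.0)
        alpha[j] = v @ w                            # bilinear projection coefficient
        w = w - alpha[j] * v
        if j < n - 1:
            beta[j] = np.sqrt(w @ w)                # complex square root: the pseudo-norm
            if abs(beta[j]) < tol:                  # <w, w> ~ 0: the process breaks down
                raise RuntimeError("near-zero pseudo-norm, cannot continue")
            v_prev, v = v, w / beta[j]

    H = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return V, H
[/code]

With the full n-step factorization, [itex]Ax=b[/itex] would then be solved as y = np.linalg.solve(H, V.T @ b) followed by x = V @ y.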
 
  • #2
Hey sunjin09.

The thing I worry about with not using a proper inner product is that the geometry of your operator is not preserved.

Remember that the inner product is what defines the geometry of your space, and for the normal inner product in R^n it is exactly that structure which is preserved.

By putting in a square root, what you have done in terms of the transformation is essentially change the geometry with respect to the new "inner product". In terms of orthonormalization, the orthogonality relations are still reflected in this inner product (the term under the square-root sign being zero still gives a zero answer), but the actual projections themselves will be really screwy.

The reason I am saying this is that if you want to calculate the projections of a vector (or an operator) onto a given orthonormal basis, that inner-product definition will screw things up really badly, and if you use it in your decomposition and don't transform things back somehow, the result won't make any sense.

Also, just out of curiosity, are there any constraints on your matrix besides it being complex symmetric (i.e. any further constraints)?
 
  • #3
Hi chiro, thanks for answering. Let me first clarify a few points:

1. The Gram–Schmidt process I mentioned is actually the Arnoldi process, where you start with an (arbitrary) initial vector v and try to find an orthogonal basis for the Krylov space K=span{v,Av,A^2v,...}.

2. The system A I consider is actually the EM Green's function G(r,r'), which is known to be transpose symmetric but NOT Hermitian, i.e., G(r',r)=G(r,r') but G(r',r)≠G*(r,r'). G is used to form the integral equation of an (arbitrary) EM scattering problem, i.e.,
E=E0+G(σE), where E0 is the known incident E field, σ is the known conductivity, and E is the unknown total field; that is, the total field equals the incident field plus the scattered (induced) field, where the scattered field is generated by the induced current σE. Quite a straightforward formulation.

The system is typically very large, and an approximate solution is sought in the Krylov space K=span{E0,G(σE0),G(σG(σE0)),...} (fixed-point iteration, or the Born series). If an orthogonal decomposition of K can be formulated as in my OP, I would avoid the biconjugate gradient method, which works for nonsymmetric systems, and instead exploit the non-conjugate symmetry to cut the cost by half. Since the problem is a well-defined physics problem, the Born series is guaranteed to converge to a unique solution. I guess that places some constraint on the system, i.e., it is well posed with a unique solution, or equivalently the map E ↦ E0+G(σE) is a contraction.
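As a rough illustration of this fixed-point (Born series) iteration, and not actual code from this problem, here is a sketch in which G, sigma, and E0 are hypothetical placeholders for a discretized Green's-function matrix, a per-point conductivity, and the incident field; each iterate is a partial sum of the Born series and lies in the Krylov space K.

[code=python]
import numpy as np

def born_series(G, sigma, E0, max_iter=200, tol=1e-10):
    """Fixed-point iteration for E = E0 + G (sigma * E).
    G     : (n, n) discretized Green's-function matrix (complex symmetric)
    sigma : (n,) conductivity at each grid point
    E0    : (n,) incident field
    The k-th iterate is the k-th partial sum of the Born series and lies
    in the Krylov space span{E0, G sigma E0, (G sigma)^2 E0, ...}."""
    E = np.asarray(E0, dtype=complex)
    for k in range(max_iter):
        E_next = E0 + G @ (sigma * E)
        if np.linalg.norm(E_next - E) <= tol * np.linalg.norm(E0):
            return E_next, k + 1        # converged
        E = E_next
    return E, max_iter                  # not converged within max_iter
[/code]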

I tested with a small arbitrary A and the illegal inner product I mentioned, and it worked. I wonder whether it works for my EM problem in general. The thing that worries me about the illegal inner product is indeed what you mentioned: <x,y>=0 does not guarantee that x and y are orthogonal, and <x,x>=0 does not even guarantee x=0, so my "unitary" matrix [itex]V, V^TV=VV^T=I[/itex], may be poorly conditioned or even singular.
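Along the lines of that small test, here is a minimal self-contained check (an arbitrary random matrix, not the EM operator; the size and seed are arbitrary): bilinear Gram–Schmidt on the Krylov vectors of a random complex symmetric A, reporting how far [itex]V^TV[/itex] is from the identity and how well conditioned V is.

[code=python]
import numpy as np

rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = M + M.T                        # random complex symmetric (not Hermitian) matrix
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Bilinear Gram-Schmidt on the Krylov vectors {b, Ab, A^2 b, ...}
basis = []
w = b.copy()
for k in range(n):
    for v in basis:
        w = w - (v @ w) * v        # "projection" with <x, y> = sum_i x_i y_i
    w = w / np.sqrt(w @ w)         # complex pseudo-norm; can be tiny or zero
    basis.append(w)
    w = A @ w                      # next Krylov vector
V = np.array(basis).T

print("|| V^T V - I || =", np.linalg.norm(V.T @ V - np.eye(n)))
print("cond(V)         =", np.linalg.cond(V))
[/code]

Note that [itex]V^TV=I[/itex] does not make V unitary in the usual sense, so cond(V) can still be large; that is exactly the worry about this construction.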
 
  • #4
Maybe I'm misunderstanding your notation, but don't you have the basic problem that your definition of < x , x > can be zero when x is nonzero? For example, if x is the vector
(1 + i, 1 - i), then < x , x > = (1 + i)^2 + (1 - i)^2 = 2i - 2i = 0.

If that is so, it shouldn't be too hard to invent a matrix where the method fails completely, or runs into severe numerical problems losing precision through cancelling nearly equal and opposite terms.

That can't happen in the Hermitian case, because the computation of < x, x > is always well conditioned.
 
  • #5
AlephZero said:
Maybe I'm misunderstanding your notation, but don't you have the basic problem that your definition of < x , x > can be zero when x is nonzero? For example if x is the vector
(1 + i), (1 - i) ?

If that is so, it shouldn't be too hard to invent a matrix where the method fails completely, or runs into severe numerical problems losing precision through cancelling nearly equal and opposite terms.

That can't happen in the Hermitian case, because the computation of < x, x > is always well conditioned.

You are totally right. That's what I worry about. I was just hoping that maybe the solution is still valid given the particular problem I have (with further constraints on the system). Remember that the solution is an iterative one. My previous post has a lot more detail; could you take a look at it? Thank you so much!
 
  • #6
I think you are "living dangerously" with this. We agree the method fails in general if you create a vector that has pseudo-norm-length 0. So you really have to prove that is impossible because of the special properties of your problem, or find a way to restart if it happens.

If you get a small pseudo-norm instead of an exact zero, the corresponding "unit length" vector will probably be scaled to have very large components, leading to numerical problems later on.
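One common guard against this (a sketch of the idea, not a full look-ahead scheme; the function name and tolerance are illustrative) is to compare the pseudo-norm against the true Euclidean norm before dividing: if |w^T w| is tiny relative to the sum of |w_i|^2, nearly everything has cancelled and normalizing would blow the vector up.

[code=python]
import numpy as np

def safe_pseudo_normalize(w, rel_tol=1e-8):
    """Guarded normalization by the pseudo-norm sqrt(w^T w).
    Compares |w^T w| against the true norm ||w||^2: if the ratio is tiny,
    the pseudo-norm has cancelled almost completely and dividing by it
    would produce a huge "unit" vector.  Raise instead, so the caller can
    restart the recurrence (e.g. with a new start vector)."""
    w = np.asarray(w, dtype=complex)
    true_norm_sq = np.vdot(w, w).real       # sum_i |w_i|^2, always well conditioned
    pseudo_sq = w @ w                       # sum_i w_i^2, can cancel to ~0
    if true_norm_sq == 0.0 or abs(pseudo_sq) < rel_tol * true_norm_sq:
        raise FloatingPointError("near-zero pseudo-norm: restart needed")
    return w / np.sqrt(pseudo_sq)
[/code]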

I don't think arbitrary complex symmetric matrices have many "nice" numerical properties. You can see that by reformulating the problem as a double-sized real matrix. If ##A = P + iQ## where ##P## and ##Q## are real, then you can replace A by the real matrix
$$A' = \begin{bmatrix} P & Q \\ -Q & P \end{bmatrix}$$
If A is Hermitian, P is symmetric, Q is antisymmetric, and A' is symmetric. If A is symmetric, P and Q are both symmetric and A' is (almost) an arbitrary real matrix.
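A quick numerical illustration of this point (random matrices, purely illustrative): build A' from P and Q as above and check which structure survives.

[code=python]
import numpy as np

def doubled(A):
    """The real embedding A' = [[P, Q], [-Q, P]] for A = P + iQ."""
    P, Q = A.real, A.imag
    return np.block([[P, Q], [-Q, P]])

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

A_herm = M + M.conj().T    # Hermitian: P symmetric, Q antisymmetric
A_symm = M + M.T           # complex symmetric: P and Q both symmetric

for name, A in [("Hermitian", A_herm), ("complex symmetric", A_symm)]:
    Ap = doubled(A)
    print(name, "-> A' symmetric?", np.allclose(Ap, Ap.T))
[/code]

For a Hermitian A the embedding comes out symmetric; for a merely complex symmetric A it generally does not, which is the sense in which A' is "almost" an arbitrary real matrix.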

If you want to explore this, I would try using a "pseudo norm" that really is a norm - e.g. Norm(x + iy) = |x| + |y| or something similar.
 
  • #7
AlephZero said:
I think you are "living dangerously" with this. We agree the method fails in general if you create a vector that has pseudo-norm-length 0. So you really have to prove that is impossible because of the special properties of your problem, or find a way to restart if it happens.

If you get a small pseudo-norm instead of an exact zero, the corresponding "unit length" vector will probably be scaled to have very large components, leading to numerical problems later on.

I don't think arbitrary complex symmetric matrices have many "nice" numerical properties. You can see that by reformulating the problem as a double-sized real matrix. If ##A = P + iQ## where ##P## and ##Q## are real, then you can replace A by the real matrix
$$A' = \begin{bmatrix} P & Q \\ -Q & P \end{bmatrix}$$
If A is Hermitian, P is symmetric, Q is antisymmetric, and A' is symmetric. If A is symmetric, P and Q are both symmetric and A' is (almost) an arbitrary real matrix.

If you want to explore this, I would try using a "pseudo norm" that really is a norm - e.g. Norm(x + iy) = |x| + |y| or something similar.

Writing A in terms of its real and imaginary parts really is illuminating. It is simply too bothersome not to be able to make use of this kind of symmetry on a problem of such general interest. As you mentioned, maybe there is a way to restart the iteration when failure happens; however, it is probably not possible to prove general convergence even with restarted iterations. Anyway, it's good to know I didn't miss anything obvious.
 

1. What is a pseudo-orthogonalization process?

A pseudo-orthogonalization process is a mathematical method used in linear algebra to produce, from a given set of vectors, a new set that is approximately orthogonal (perpendicular). It involves iteratively adjusting the vectors to make them more orthogonal until a desired level of accuracy is reached.

2. How does a pseudo-orthogonalization process work?

The process starts with a set of vectors that are not perfectly orthogonal. It then calculates a new set of vectors that are closer to being orthogonal by subtracting the components of each vector that are parallel to the others. This process is repeated until the vectors are sufficiently orthogonal.
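The recipe described here is essentially (modified) Gram–Schmidt with an ordinary inner product; a minimal NumPy sketch (the function name is illustrative, and the columns are assumed linearly independent):

[code=python]
import numpy as np

def gram_schmidt(X):
    """Modified Gram-Schmidt with the ordinary (conjugating) inner product.
    X holds the input vectors as columns; each vector has its components
    along the previously accepted vectors subtracted, then is normalized."""
    Q = np.array(X, dtype=complex)
    for j in range(Q.shape[1]):
        for i in range(j):
            Q[:, j] -= np.vdot(Q[:, i], Q[:, j]) * Q[:, i]   # remove parallel component
        Q[:, j] /= np.linalg.norm(Q[:, j])                   # unit length
    return Q
[/code]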

3. What applications is a pseudo-orthogonalization process commonly used for?

A pseudo-orthogonalization process is commonly used in scientific and engineering fields where linear algebra is applied, such as in signal processing, computer graphics, and machine learning. It is also used in numerical methods for solving differential equations and optimization problems.

4. What is the difference between pseudo-orthogonalization and true orthogonalization?

True orthogonalization involves finding a set of vectors that are perfectly orthogonal to each other, while pseudo-orthogonalization seeks to approximate this orthogonality. Pseudo-orthogonalization is often used when true orthogonalization is not feasible or when an approximate solution is sufficient.

5. Are there any disadvantages to using a pseudo-orthogonalization process?

One potential disadvantage is that the process may not converge to a perfectly orthogonal solution, resulting in less accurate results. Additionally, the process can be computationally intensive, especially for large sets of vectors. It is important to carefully consider the trade-offs between accuracy and computational complexity when using a pseudo-orthogonalization process.
