Least-squares optimization of a complex function

In summary, the conversation discusses a least-squares optimization problem with a complex residual function and a cost function based on the Euclidean norm. The central question is whether to use the conjugated or un-conjugated inner product when applying gradient descent to solve the problem. The conversation also touches on the difference between minimizing the residuals and minimizing the functional, and on the need for a gradient operator with respect to complex quantities.
  • #1
elgen
Dear all,

I have a least square optimization problem stated as below

[tex]\xi(z_1, z_2) = \sum_{i=1}^{M} ||r_i(z_1, z_2)||^2[/tex]

where [tex]\xi[/tex] denotes the cost function and [tex]r_i[/tex] denotes the [tex]i[/tex]-th residual, which is a complex function of [tex]z_1, z_2[/tex].

My question is about [tex]||\cdot||[/tex]. Many textbooks deal only with real functions and say that this is the Euclidean norm, defined via the conjugated inner product of the residual, i.e. [tex]||r||^2 = conj(r)*r[/tex].

My question is: when I apply the gradient descent method to solve this problem, how do I calculate [tex]\nabla \xi[/tex]? In particular, since [tex]\xi[/tex] involves [tex]conj(r)[/tex], we cannot simply take the derivative with respect to [tex]z_1, z_2[/tex], because [tex]conj(r)[/tex] is not an analytic function.

Should I use the un-conjugated inner product for the definition of the norm for this LS optimization with a complex residual function?

Any feedback is welcome. Thank you.


elgen
 
  • #2
Your function is a real function of four real parameters, the real and imaginary parts of z1 and z2. Recall that, if f(z1, z2) = a(z1, z2) + i b(z1, z2), where a and b are real functions, then || f ||^2 = a^2 + b^2. Hope this helps.
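
To make this concrete, here is a minimal sketch of that suggestion in Python: the two complex unknowns are unpacked into four real parameters, and the stacked real and imaginary parts of the residuals are handed to an off-the-shelf iterative least-squares solver. The exponential residual model, the data, and all names are my own assumptions for illustration, not something from the thread.

```python
import numpy as np
from scipy.optimize import least_squares

# Toy residual model, assumed for illustration only (not the thread's actual f):
# r_i(z1, z2) = y_i - z1 * exp(-z2 * t_i), with complex unknowns z1 and z2.
t = np.linspace(0.0, 1.0, 20)
z1_true, z2_true = 2.0 + 1.0j, 0.5 - 0.3j
y = z1_true * np.exp(-z2_true * t)

def real_residuals(p):
    # p = [Re z1, Im z1, Re z2, Im z2]: the complex problem as four real unknowns.
    z1 = p[0] + 1j * p[1]
    z2 = p[2] + 1j * p[3]
    r = y - z1 * np.exp(-z2 * t)
    # Stacking real and imaginary parts makes sum(returned**2) equal sum(|r_i|^2),
    # i.e. a^2 + b^2 for each residual, as described above.
    return np.concatenate([r.real, r.imag])

sol = least_squares(real_residuals, x0=[1.0, 0.0, 0.0, 0.0])
z1_est = sol.x[0] + 1j * sol.x[1]
z2_est = sol.x[2] + 1j * sol.x[3]
print(z1_est, z2_est, sol.cost)   # sol.cost is 0.5 * sum of squared residuals
```

Splitting each residual into its real and imaginary parts is exactly the ||f||^2 = a^2 + b^2 identity above, so no derivative of a conjugated quantity ever has to be taken.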
 
  • #3
It took me some time to figure it out. The functional involves four real variables; I applied iterative methods to solve the non-linear least-squares problem and obtained the correct answer. Your feedback definitely helped. Thx a lot.

To my own curiosity, I defined the functional being simply the product of two complex functions (no conjugation). It becomes

[tex]\xi(z_1,z_2)=\sum_{i=1}^M r_i(z_1, z_2)r_i(z_1,z_2) [/tex].

By treating z_1 and z_2 as two complex variables (not treating the real and imaginary parts separately), I was also able to get the right answer.

This leads to my hypothesis: if the residual [tex]r_i(z_1,z_2)[/tex] is an analytic function of the complex variables, we can treat these variables just like real numbers and apply the iterative methods.

I am also curious: is there any difference between these two functionals? When should the conjugated functional be used over the un-conjugated one, and vice versa?

Thx for the feedback again.
 
  • #4
On second thought, minimizing the residuals [tex]r_i(z_1,z_2)[/tex] is not the same as minimizing the functional

[tex]\xi(z_1,z_2)=\sum_{i=1}^{M}r_i(z_1,z_2) r_i(z_1,z_2)[/tex]

If [tex]r_1=3[/tex] and [tex]r_2=3i[/tex], these residuals are not zero. However, [tex]\xi = 3^2 + (3i)^2 = 0[/tex].

The functional defined using the conjugated product has the property that it is minimized only when each residual is minimized.
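
A quick numerical check of this example (nothing here beyond the numbers already given above):

```python
import numpy as np

r = np.array([3.0, 3.0j])                 # the two residuals from the example

unconjugated = np.sum(r * r)              # 3^2 + (3i)^2 = 9 - 9 = 0
conjugated = np.sum(np.conj(r) * r)       # |3|^2 + |3i|^2 = 18

print(unconjugated)   # 0j      -> "minimized" even though the residuals are not zero
print(conjugated)     # (18+0j) -> zero only when every residual is zero
```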
 
  • #5
If the residual is defined as [tex]r_i=f_i^{obs} -f_i(z_1,z_2)[/tex], I am still not sure how to take the gradient of the cost function if I don't have an analytic expression for [tex]f_i[/tex]. I mean, let
[tex]\xi = \sum_{i=1}^{M}\left[ \Re\{ f_i^{obs} - f_i(z_1,z_2) \}^2 + \Im\{ f_i^{obs} - f_i(z_1,z_2) \}^2 \right]
[/tex].
Should I proceed as
[tex]
\frac{\partial \xi}{\partial z_1} = \sum_{i=1}^{M}\left[ -2 \Re\{ f_i^{obs}-f_i(z_1,z_2) \} \Re\{ \frac{\partial f_i}{\partial z_1} \}
- 2 \Im\{ f_i^{obs}-f_i(z_1,z_2) \} \Im\{ \frac{\partial f_i}{\partial z_1} \} \right]
[/tex]
[tex]
\frac{\partial \xi}{\partial z_2} = \sum_{i=1}^{M}\left[ -2 \Re\{ f_i^{obs}-f_i(z_1,z_2) \} \Re\{ \frac{\partial f_i}{\partial z_2} \}
- 2 \Im\{ f_i^{obs}-f_i(z_1,z_2) \} \Im\{ \frac{\partial f_i}{\partial z_2} \} \right]
[/tex]
and take the second derivative as
[tex]
\frac{\partial^2 \xi}{\partial z_1^2} = \sum_{i=1}^{M}\left[ 2 \Re\{ \frac{\partial f_i}{\partial z_1} \}^2 - 2\Re\{ f_i^{obs}-f_i(z_1,z_2) \}\Re\{\frac{\partial^2 f_i}{\partial z_1^2}\}
+ 2 \Im\{ \frac{\partial f_i}{\partial z_1} \}^2 - 2\Im\{ f_i^{obs}-f_i(z_1,z_2) \} \Im\{ \frac{\partial^2 f_i}{\partial z_1^2} \} \right]
[/tex]
[tex]
\frac{\partial^2 \xi}{\partial z_2^2} = \sum_{i=1}^{M}\left[ 2 \Re\{ \frac{\partial f_i}{\partial z_2} \}^2 - 2\Re\{ f_i^{obs}-f_i(z_1,z_2) \}\Re\{\frac{\partial^2 f_i}{\partial z_2^2}\}
+ 2 \Im\{ \frac{\partial f_i}{\partial z_2} \}^2 - 2\Im\{ f_i^{obs}-f_i(z_1,z_2) \} \Im\{ \frac{\partial^2 f_i}{\partial z_2^2} \} \right]
[/tex] ?

Thx.
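
Here is a small numerical sanity check of the first-derivative formula above, reading [tex]\partial/\partial z_1[/tex] as the derivative along the real axis of [tex]z_1[/tex] and approximating [tex]\partial f_i/\partial z_1[/tex] by finite differences, so no analytic expression for [tex]f_i[/tex] is needed. The toy holomorphic model and the evaluation point are my own assumptions for illustration.

```python
import numpy as np

# Toy holomorphic model f_i(z1, z2) = z1 * exp(-z2 * t_i): an assumption for
# illustration only; the actual f_i in the thread is not specified.
t = np.linspace(0.0, 1.0, 10)
f_obs = (2.0 + 1.0j) * np.exp(-(0.5 - 0.3j) * t)

def f(z1, z2):
    return z1 * np.exp(-z2 * t)

def cost(z1, z2):
    r = f_obs - f(z1, z2)
    return np.sum(r.real**2 + r.imag**2)

z1, z2 = 1.0 + 0.5j, 0.2 + 0.1j   # an arbitrary evaluation point
h = 1e-6

# df_i/dz1 via a real-step central difference; for a holomorphic f this equals
# the complex derivative, so no analytic expression is required.
dfdz1 = (f(z1 + h, z2) - f(z1 - h, z2)) / (2 * h)
r = f_obs - f(z1, z2)

# The formula from this post, with d/dz1 read as the derivative along Re(z1).
grad_formula = -2 * np.sum(r.real * dfdz1.real + r.imag * dfdz1.imag)

# Direct finite difference of the cost with respect to Re(z1).
grad_direct = (cost(z1 + h, z2) - cost(z1 - h, z2)) / (2 * h)

print(grad_formula, grad_direct)  # the two values agree to finite-difference accuracy
```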
 
  • #6
The key is to define a gradient operator, with respect to the complex quantities, of a real-valued scalar functional. See:

D. H. Brandwood, "A complex gradient operator and its application in adaptive array theory," IEE Proceedings H: Microwaves, Optics and Antennas, 1983.
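
Following that reference, for a real-valued cost [tex]\xi = \sum_i |r_i|^2[/tex] with a holomorphic model [tex]f_i[/tex], the gradient with respect to the conjugate variables is [tex]\partial\xi/\partial \bar{z}_k = -\sum_i r_i \, \overline{\partial f_i/\partial z_k}[/tex], and steepest descent steps against it. Below is a minimal sketch under the same toy model as before; the model, step size, and iteration count are assumptions for illustration only.

```python
import numpy as np

# Same toy holomorphic model as above (an assumption for illustration only).
t = np.linspace(0.0, 1.0, 20)
z1_true, z2_true = 2.0 + 1.0j, 0.5 - 0.3j
f_obs = z1_true * np.exp(-z2_true * t)

def model(z1, z2):
    return z1 * np.exp(-z2 * t)

def conj_gradient(z1, z2):
    # d(xi)/d(conj z_k) = -sum_i r_i * conj(df_i/dz_k) for a holomorphic model.
    r = f_obs - model(z1, z2)
    dfdz1 = np.exp(-z2 * t)             # df_i/dz1
    dfdz2 = -z1 * t * np.exp(-z2 * t)   # df_i/dz2
    return -np.sum(r * np.conj(dfdz1)), -np.sum(r * np.conj(dfdz2))

z1, z2 = 1.0 + 0.0j, 0.0 + 0.0j   # initial guess
mu = 0.02                         # hand-picked step size for this toy problem
for _ in range(20000):
    g1, g2 = conj_gradient(z1, z2)
    z1, z2 = z1 - mu * g1, z2 - mu * g2

print(z1, z2)                                      # should approach 2+1j and 0.5-0.3j
print(np.sum(np.abs(f_obs - model(z1, z2)) ** 2))  # residual cost, should be near zero
```

Updating the complex unknowns directly this way is equivalent to gradient descent on the four real parameters, which is why both viewpoints gave the right answer earlier in the thread.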
 

1. What is least-squares optimization and why is it important in scientific research?

Least-squares optimization is a mathematical method for finding the best fit to a set of data points by minimizing the sum of squared errors between the data and a model function. It is important in scientific research because it lets us quantify relationships between variables and make predictions from the data.
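
As a generic illustration (a simple sketch not tied to the thread's complex-valued problem), fitting a straight line by minimizing the sum of squared errors:

```python
import numpy as np

# Fit y ~ a*x + b by minimizing sum_i (y_i - (a*x_i + b))^2.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

A = np.column_stack([x, np.ones_like(x)])          # design matrix [x, 1]
(a, b), sse, *_ = np.linalg.lstsq(A, y, rcond=None)
print(a, b)     # slope and intercept of the best-fit line
print(sse)      # array containing the sum of squared errors at the optimum
```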

2. Can least-squares optimization be applied to complex functions?

Yes. Least-squares optimization can be applied to complex-valued, nonlinear, or multivariate functions. A complex-valued residual is typically handled by splitting it into real and imaginary parts, or by using a complex gradient operator, as discussed above. Nonlinear problems may require more computation and may not have a unique solution.

3. What are the limitations of least-squares optimization?

One limitation is that ordinary least squares implicitly assumes the errors are independent and identically distributed, which may not hold in real-world scenarios (for example, with correlated noise or outliers). In such cases it may not give the most accurate or precise fit to the data.

4. How is least-squares optimization different from other optimization methods?

Least squares is best thought of as a choice of objective, the sum of squared errors, rather than as a particular algorithm; methods such as gradient descent, Gauss-Newton, or genetic algorithms can then be used to minimize that objective. It is also deterministic: for a given data set and starting point it will always produce the same result.

5. What are some common applications of least-squares optimization in scientific research?

Least-squares optimization is commonly used in fields such as statistics, physics, engineering, and economics. Typical applications include curve fitting, parameter estimation, data smoothing, and signal processing.
