Least-squares optimization of a complex function

Dear all,

I have a least-squares optimization problem, stated below:

[tex]\xi(z_1, z_2) = \sum_{i=1}^{M} ||r_i(z_1, z_2)||^2[/tex]

where [tex]\xi[/tex] denotes the cost function and [tex]r_i[/tex] denotes the [tex]i[/tex]-th residual, which is a complex-valued function of [tex]z_1, z_2[/tex].

My question is about [tex]||\cdot||[/tex]. Many textbooks deal only with real functions and say that this is the Euclidean norm, defined through the conjugated inner product of the residual, i.e. [tex]||r||^2 = conj(r)*r[/tex].

My question is: when I apply the gradient descent method to solve this problem, how do I calculate [tex]\nabla \xi[/tex]? In particular, since [tex]\xi[/tex] involves [tex]conj(r)[/tex], we cannot take the derivative with respect to [tex]z_1, z_2[/tex] directly, because [tex]conj(r)[/tex] is not an analytic function.

Should I instead use the un-conjugated inner product to define the norm in this LS optimization with a complex residual?

Any feedback is welcome. Thank you.


elgen
 
Your function is a real function of four real parameters, the real and imaginary parts of z1 and z2. Recall that, if f(z1, z2) = a(z1, z2) + i b(z1, z2), where a and b are real functions, then || f ||^2 = a^2 + b^2. Hope this helps.
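
For instance, here is a minimal sketch of that repacking in Python (the residual model [tex]r_i = f_i^{obs} - z_1 e^{z_2 t_i}[/tex] below is invented purely for illustration; any complex residual works the same way), handing the stacked real and imaginary parts to scipy.optimize.least_squares:

[code]
import numpy as np
from scipy.optimize import least_squares

# Invented analytic model: r_i = f_obs_i - z1 * exp(z2 * t_i).
t = np.linspace(0.0, 1.0, 20)
z1_true, z2_true = 1.5 - 0.5j, -0.3 + 2.0j
f_obs = z1_true * np.exp(z2_true * t)

def residuals_real(x):
    # Unpack the four real parameters into two complex variables.
    z1 = x[0] + 1j * x[1]
    z2 = x[2] + 1j * x[3]
    r = f_obs - z1 * np.exp(z2 * t)          # complex residuals r_i
    # Stacking Re and Im gives sum(Re^2 + Im^2) = sum |r_i|^2 = xi.
    return np.concatenate([r.real, r.imag])

x0 = np.array([1.0, 0.0, 0.0, 1.0])          # Re z1, Im z1, Re z2, Im z2
sol = least_squares(residuals_real, x0)
print(sol.x[0] + 1j * sol.x[1], sol.x[2] + 1j * sol.x[3])
[/code]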
 
Took me some time to figure it out. The functional involves four real variables; I applied iterative methods to solve the non-linear least-squares problem and obtained the correct answer. Your feedback definitely helped. Thx a lot.

Out of curiosity, I also defined the functional as simply the product of the complex residual with itself (no conjugation). It becomes

[tex]\xi(z_1,z_2)=\sum_{i=1}^M r_i(z_1, z_2)r_i(z_1,z_2) [/tex].

By treating [tex]z_1[/tex] and [tex]z_2[/tex] as two complex variables (not treating the real and imaginary parts separately), I was able to get the right answer as well.

This leads to my hypothesis: if the residual [tex]r(z_1,z_2)[/tex] is an analytic function of the complex variables, we can treat these variables just like real numbers and apply the iterative methods.
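
For what it's worth, here is a minimal sketch of that hypothesis in Python (the two analytic residuals below are invented for illustration, with M = 2 so the residuals can be driven to zero directly): Newton's method runs verbatim in complex arithmetic, because analyticity makes the usual derivative rules valid.

[code]
import numpy as np

# Two invented analytic residuals in two complex unknowns (M = 2):
#   r1 = z1^2 + z2 - (3 + 1j),   r2 = z1*z2 - 2j
def r(z):
    z1, z2 = z
    return np.array([z1**2 + z2 - (3 + 1j), z1 * z2 - 2j])

def jac(z):
    # Complex Jacobian dr_i/dz_j -- well-defined because r is analytic.
    z1, z2 = z
    return np.array([[2 * z1, 1.0],
                     [z2, z1]], dtype=complex)

# Plain Newton iteration, exactly as for real variables.
z = np.array([1.0 + 1.0j, 1.0 + 0.0j])   # starting guess; convergence is only local
for _ in range(50):
    z = z - np.linalg.solve(jac(z), r(z))
    if np.max(np.abs(r(z))) < 1e-12:
        break
print(z, np.abs(r(z)))
[/code]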

I am also curious: is there any difference between these two functionals? When should the conjugated functional be used over the un-conjugated one, and vice versa?

Thx for the feedback again.
 
On second thought, minimizing the residuals [tex]r_i(z_1,z_2)[/tex] is not the same as minimizing the functional

[tex]\xi(z_1,z_2)=\sum_{i=1}^{M}r_i(z_1,z_2) r_i(z_1,z_2)[/tex]

If [tex]r_1=3[/tex] and [tex]r_2=3i[/tex], these residuals are not zero. However, [tex]\xi=0[/tex].

The functional defined using the conjugated product has the property that it is minimized exactly when every residual is minimized.
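
A quick numerical check of that counterexample:

[code]
import numpy as np

r = np.array([3.0 + 0.0j, 0.0 + 3.0j])   # r1 = 3, r2 = 3i
print(np.sum(r * r))                     # un-conjugated: 0, despite nonzero residuals
print(np.sum(np.conj(r) * r).real)       # conjugated: 18, zero only if every r_i = 0
[/code]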
 
If the residual is defined as [tex]r_i=f_i^{obs} - f_i(z_1,z_2)[/tex], I am still not sure how to apply the gradient method to the cost function if I don't have an analytic expression for [tex]f_i[/tex]. I mean, let
[tex]\xi = \sum_{i=1}^{M} \left[ \Re\{ f_i^{obs} - f_i(z_1,z_2) \}^2 + \Im\{ f_i^{obs} - f_i(z_1,z_2) \}^2 \right][/tex].
Should I proceed as
[tex]
\frac{\partial \xi}{\partial z_1} = -2 \sum_{i=1}^{M} \left[ \Re\{ f_i^{obs}-f_i(z_1,z_2) \} \Re\left\{ \frac{\partial f_i}{\partial z_1} \right\}
+ \Im\{ f_i^{obs}-f_i(z_1,z_2) \} \Im\left\{ \frac{\partial f_i}{\partial z_1} \right\} \right]
[/tex]
[tex]
\frac{\partial \xi}{\partial z_2} = -2 \sum_{i=1}^{M} \left[ \Re\{ f_i^{obs}-f_i(z_1,z_2) \} \Re\left\{ \frac{\partial f_i}{\partial z_2} \right\}
+ \Im\{ f_i^{obs}-f_i(z_1,z_2) \} \Im\left\{ \frac{\partial f_i}{\partial z_2} \right\} \right]
[/tex]
and take the second derivative as
[tex]
\frac{\partial^2 \xi}{\partial z_1^2} = 2 \sum_{i=1}^{M} \left[ \Re\left\{ \frac{\partial f_i}{\partial z_1} \right\}^2 - \Re\{ f_i^{obs}-f_i(z_1,z_2) \}\Re\left\{\frac{\partial^2 f_i}{\partial z_1^2}\right\}
+ \Im\left\{ \frac{\partial f_i}{\partial z_1} \right\}^2 - \Im\{ f_i^{obs}-f_i(z_1,z_2) \} \Im\left\{ \frac{\partial^2 f_i}{\partial z_1^2} \right\} \right]
[/tex]
[tex]
\frac{\partial^2 \xi}{\partial z_2^2} = 2 \sum_{i=1}^{M} \left[ \Re\left\{ \frac{\partial f_i}{\partial z_2} \right\}^2 - \Re\{ f_i^{obs}-f_i(z_1,z_2) \}\Re\left\{\frac{\partial^2 f_i}{\partial z_2^2}\right\}
+ \Im\left\{ \frac{\partial f_i}{\partial z_2} \right\}^2 - \Im\{ f_i^{obs}-f_i(z_1,z_2) \} \Im\left\{ \frac{\partial^2 f_i}{\partial z_2^2} \right\} \right]
[/tex] ?
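
As a sanity check of the first expression, here is a sketch with an invented model [tex]f_i = z_1 e^{z_2 t_i}[/tex] (hypothetical, for illustration only), reading [tex]\partial/\partial z_1[/tex] as the derivative with respect to [tex]\Re\{z_1\}[/tex], which is what the [tex]\Re/\Im[/tex] expansion computes when [tex]f_i[/tex] is analytic:

[code]
import numpy as np

t = np.linspace(0.0, 1.0, 10)
f_obs = (1.5 - 0.5j) * np.exp((-0.3 + 2.0j) * t)   # synthetic observations

def f(z1, z2):
    return z1 * np.exp(z2 * t)                     # invented model f_i

def xi(z1, z2):
    r = f_obs - f(z1, z2)
    return np.sum(r.real**2 + r.imag**2)

z1, z2 = 1.0 + 0.2j, -0.1 + 1.8j                   # arbitrary evaluation point
r = f_obs - f(z1, z2)
d = np.exp(z2 * t)                                 # analytic derivative df_i/dz1

# Formula above: dxi/dRe(z1) = -2 sum_i [Re{r_i}Re{f_i'} + Im{r_i}Im{f_i'}]
grad_formula = -2.0 * np.sum(r.real * d.real + r.imag * d.imag)

# Central finite difference in the real part of z1 should agree closely.
h = 1e-6
grad_fd = (xi(z1 + h, z2) - xi(z1 - h, z2)) / (2 * h)
print(grad_formula, grad_fd)
[/code]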

Thx.
 
The key is to define a gradient operator with respect to the complex quantities of a scalar, real-valued functional; see:

Brandwood, D. H., "A complex gradient operator and its application in adaptive array theory," IEE Proceedings H: Microwaves, Optics and Antennas, vol. 130, no. 1, pp. 11-16, 1983.
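
In that framework (Wirtinger calculus), [tex]z_k[/tex] and [tex]conj(z_k)[/tex] are treated as independent variables; for [tex]\xi = \sum_i conj(r_i) r_i[/tex] with analytic [tex]f_i[/tex], the complex gradient is [tex]\partial \xi / \partial conj(z_k) = -\sum_i conj(\partial f_i/\partial z_k)\, r_i[/tex], and steepest descent steps against it. A minimal sketch, reusing the invented model from the earlier posts:

[code]
import numpy as np

# Same invented model as earlier: f_i(z1, z2) = z1 * exp(z2 * t_i).
t = np.linspace(0.0, 1.0, 10)
f_obs = (1.5 - 0.5j) * np.exp((-0.3 + 2.0j) * t)

z1, z2 = 1.0 + 0.0j, 0.0 + 1.0j
mu = 0.02                          # step size chosen by hand; too large diverges
for _ in range(20000):
    e = np.exp(z2 * t)
    r = f_obs - z1 * e             # residuals r_i
    d1, d2 = e, z1 * t * e         # df_i/dz1, df_i/dz2 (ordinary complex derivatives)
    # Wirtinger gradients: d(xi)/d(conj z_k) = -sum_i conj(df_i/dz_k) * r_i
    g1 = -np.sum(np.conj(d1) * r)
    g2 = -np.sum(np.conj(d2) * r)
    z1, z2 = z1 - mu * g1, z2 - mu * g2
print(z1, z2)   # should approach 1.5-0.5j and -0.3+2.0j from this start
[/code]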
 
