Least-squares optimization of a complex function

In summary, the conversation discusses a least-squares optimization problem with a complex residual function and a cost function based on the Euclidean norm. The central question is whether to use the conjugated or un-conjugated inner product when applying gradient descent to solve the problem. The conversation also touches on the difference between minimizing the residuals and minimizing the functional, and on the need for a gradient operator with respect to complex quantities.
  • #1
elgen
Dear all,

I have a least square optimization problem stated as below

[tex]\xi(z_1, z_2) = \sum_{i=1}^{M} ||r_i(z_1, z_2)||^2[/tex]

where [tex]\xi[/tex] denotes the cost function and [tex]r_i[/tex] denotes the [tex]i[/tex]-th residual, which is a complex function of [tex]z_1, z_2[/tex].

My question is about [tex]||\cdot||[/tex]. Many textbooks deal only with real functions and say that this is the Euclidean norm, defined via the conjugated inner product of the residual, i.e. [tex]||r||^2 = conj(r)*r[/tex].

My question is: when I apply the gradient descent method to solve this problem, how do I calculate [tex]\nabla \xi[/tex]? In particular, since [tex]\xi[/tex] involves [tex]conj(r)[/tex], we cannot simply take the derivative with respect to [tex]z_1, z_2[/tex], because [tex]conj(r)[/tex] is not an analytic function.

Should I use the un-conjugated inner product for the definition of the norm for this LS optimization with a complex residual function?

Any feedback is welcome. Thank you.


elgen
 
  • #2
Your function is a real function of four real parameters, the real and imaginary parts of z1 and z2. Recall that, if f(z1, z2) = a(z1, z2) + i b(z1, z2), where a and b are real functions, then || f ||^2 = a^2 + b^2. Hope this helps.
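
To make this concrete, here is a minimal sketch of that suggestion in Python: the two complex unknowns are unpacked into four real parameters, and the stacked real and imaginary parts of the residuals are handed to an off-the-shelf iterative least-squares solver. The exponential residual model, the data, and all names are my own assumptions for illustration, not something from the thread.

```python
import numpy as np
from scipy.optimize import least_squares

# Toy residual model, assumed for illustration only (not the thread's actual f):
# r_i(z1, z2) = y_i - z1 * exp(-z2 * t_i), with complex unknowns z1 and z2.
t = np.linspace(0.0, 1.0, 20)
z1_true, z2_true = 2.0 + 1.0j, 0.5 - 0.3j
y = z1_true * np.exp(-z2_true * t)

def real_residuals(p):
    # p = [Re z1, Im z1, Re z2, Im z2]: the complex problem as four real unknowns.
    z1 = p[0] + 1j * p[1]
    z2 = p[2] + 1j * p[3]
    r = y - z1 * np.exp(-z2 * t)
    # Stacking real and imaginary parts makes sum(returned**2) equal sum(|r_i|^2),
    # i.e. a^2 + b^2 for each residual, as described above.
    return np.concatenate([r.real, r.imag])

sol = least_squares(real_residuals, x0=[1.0, 0.0, 0.0, 0.0])
z1_est = sol.x[0] + 1j * sol.x[1]
z2_est = sol.x[2] + 1j * sol.x[3]
print(z1_est, z2_est, sol.cost)   # sol.cost is 0.5 * sum of squared residuals
```

Splitting each residual into its real and imaginary parts is exactly the ||f||^2 = a^2 + b^2 identity above, so no derivative of a conjugated quantity ever has to be taken.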
 
  • #3
It took me some time to figure it out. The functional involves four real variables; I applied iterative methods to solve the non-linear least-squares problem and obtained the correct answer. Your feedback definitely helped. Thx a lot.

To my own curiosity, I defined the functional being simply the product of two complex functions (no conjugation). It becomes

[tex]\xi(z_1,z_2)=\sum_{i=1}^M r_i(z_1, z_2)r_i(z_1,z_2) [/tex].

By treating z_1 and z_2 as two complex variables (not treating the real and imaginary parts separately), I was also able to get the right answer.

This leads to my hypothesis: if the residual [tex]r_i(z_1,z_2)[/tex] is an analytic function of the complex variables, we can treat these variables just like real numbers and apply the iterative methods.

I am also curious: is there any difference between these two functionals? When should the conjugated functional be used over the un-conjugated one, and vice versa?

Thx for the feedback again.
 
  • #4
On second thought, minimizing the residuals [tex]r_i(z_1,z_2)[/tex] is not the same as minimizing the functional

[tex]\xi(z_1,z_2)=\sum_{i=1}^{M}r_i(z_1,z_2) r_i(z_1,z_2)[/tex]

If [tex]r_1=3[/tex] and [tex]r_2=3i[/tex], these residuals are not zero. However, [tex]\xi = 3^2 + (3i)^2 = 0[/tex].

The functional defined using the conjugated product has the property that it is minimized only when each residual is minimized.
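
A quick numerical check of this example (nothing here beyond the numbers already given above):

```python
import numpy as np

r = np.array([3.0, 3.0j])                 # the two residuals from the example

unconjugated = np.sum(r * r)              # 3^2 + (3i)^2 = 9 - 9 = 0
conjugated = np.sum(np.conj(r) * r)       # |3|^2 + |3i|^2 = 18

print(unconjugated)   # 0j      -> "minimized" even though the residuals are not zero
print(conjugated)     # (18+0j) -> zero only when every residual is zero
```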
 
  • #5
If the residual is defined as [tex]r_i=f_i^{obs} -f_i(z_1,z_2)[/tex], I am still not sure how to take the gradient of the cost function if I don't have an analytic expression for [tex]f_i[/tex]. I mean, let
[tex]\xi = \sum_{i=1}^{M}\left[ \Re\{ f_i^{obs} - f_i(z_1,z_2) \}^2 + \Im\{ f_i^{obs} - f_i(z_1,z_2) \}^2 \right]
[/tex].
Should I proceed as
[tex]
\frac{\partial \xi}{\partial z_1} = \sum_{i=1}^{M}\left[ -2 \Re\{ f_i^{obs}-f_i(z_1,z_2) \} \Re\{ \frac{\partial f_i}{\partial z_1} \}
- 2 \Im\{ f_i^{obs}-f_i(z_1,z_2) \} \Im\{ \frac{\partial f_i}{\partial z_1} \} \right]
[/tex]
[tex]
\frac{\partial \xi}{\partial z_2} = \sum_{i=1}^{M}\left[ -2 \Re\{ f_i^{obs}-f_i(z_1,z_2) \} \Re\{ \frac{\partial f_i}{\partial z_2} \}
- 2 \Im\{ f_i^{obs}-f_i(z_1,z_2) \} \Im\{ \frac{\partial f_i}{\partial z_2} \} \right]
[/tex]
and take the second derivative as
[tex]
\frac{\partial^2 \xi}{\partial z_1^2} = \sum_{i=1}^{M}\left[ 2 \Re\{ \frac{\partial f_i}{\partial z_1} \}^2 - 2\Re\{ f_i^{obs}-f_i(z_1,z_2) \}\Re\{\frac{\partial^2 f_i}{\partial z_1^2}\}
+ 2 \Im\{ \frac{\partial f_i}{\partial z_1} \}^2 - 2\Im\{ f_i^{obs}-f_i(z_1,z_2) \} \Im\{ \frac{\partial^2 f_i}{\partial z_1^2} \} \right]
[/tex]
[tex]
\frac{\partial^2 \xi}{\partial z_2^2} = \sum_{i=1}^{M}\left[ 2 \Re\{ \frac{\partial f_i}{\partial z_2} \}^2 - 2\Re\{ f_i^{obs}-f_i(z_1,z_2) \}\Re\{\frac{\partial^2 f_i}{\partial z_2^2}\}
+ 2 \Im\{ \frac{\partial f_i}{\partial z_2} \}^2 - 2\Im\{ f_i^{obs}-f_i(z_1,z_2) \} \Im\{ \frac{\partial^2 f_i}{\partial z_2^2} \} \right]
[/tex] ?

Thx.
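
Here is a small numerical sanity check of the first-derivative formula above, reading [tex]\partial/\partial z_1[/tex] as the derivative along the real axis of [tex]z_1[/tex] and approximating [tex]\partial f_i/\partial z_1[/tex] by finite differences, so no analytic expression for [tex]f_i[/tex] is needed. The toy holomorphic model and the evaluation point are my own assumptions for illustration.

```python
import numpy as np

# Toy holomorphic model f_i(z1, z2) = z1 * exp(-z2 * t_i): an assumption for
# illustration only; the actual f_i in the thread is not specified.
t = np.linspace(0.0, 1.0, 10)
f_obs = (2.0 + 1.0j) * np.exp(-(0.5 - 0.3j) * t)

def f(z1, z2):
    return z1 * np.exp(-z2 * t)

def cost(z1, z2):
    r = f_obs - f(z1, z2)
    return np.sum(r.real**2 + r.imag**2)

z1, z2 = 1.0 + 0.5j, 0.2 + 0.1j   # an arbitrary evaluation point
h = 1e-6

# df_i/dz1 via a real-step central difference; for a holomorphic f this equals
# the complex derivative, so no analytic expression is required.
dfdz1 = (f(z1 + h, z2) - f(z1 - h, z2)) / (2 * h)
r = f_obs - f(z1, z2)

# The formula from this post, with d/dz1 read as the derivative along Re(z1).
grad_formula = -2 * np.sum(r.real * dfdz1.real + r.imag * dfdz1.imag)

# Direct finite difference of the cost with respect to Re(z1).
grad_direct = (cost(z1 + h, z2) - cost(z1 - h, z2)) / (2 * h)

print(grad_formula, grad_direct)  # the two values agree to finite-difference accuracy
```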
 
  • #6
The key is to define a gradient operator, with respect to the complex quantities, of a real-valued scalar functional. See:

D. H. Brandwood, "A complex gradient operator and its application in adaptive array theory," IEE Proceedings H: Microwaves, Optics and Antennas, 1983.
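
Following that reference, for a real-valued cost [tex]\xi = \sum_i |r_i|^2[/tex] with a holomorphic model [tex]f_i[/tex], the gradient with respect to the conjugate variables is [tex]\partial\xi/\partial \bar{z}_k = -\sum_i r_i \, \overline{\partial f_i/\partial z_k}[/tex], and steepest descent steps against it. Below is a minimal sketch under the same toy model as before; the model, step size, and iteration count are assumptions for illustration only.

```python
import numpy as np

# Same toy holomorphic model as above (an assumption for illustration only).
t = np.linspace(0.0, 1.0, 20)
z1_true, z2_true = 2.0 + 1.0j, 0.5 - 0.3j
f_obs = z1_true * np.exp(-z2_true * t)

def model(z1, z2):
    return z1 * np.exp(-z2 * t)

def conj_gradient(z1, z2):
    # d(xi)/d(conj z_k) = -sum_i r_i * conj(df_i/dz_k) for a holomorphic model.
    r = f_obs - model(z1, z2)
    dfdz1 = np.exp(-z2 * t)             # df_i/dz1
    dfdz2 = -z1 * t * np.exp(-z2 * t)   # df_i/dz2
    return -np.sum(r * np.conj(dfdz1)), -np.sum(r * np.conj(dfdz2))

z1, z2 = 1.0 + 0.0j, 0.0 + 0.0j   # initial guess
mu = 0.02                         # hand-picked step size for this toy problem
for _ in range(20000):
    g1, g2 = conj_gradient(z1, z2)
    z1, z2 = z1 - mu * g1, z2 - mu * g2

print(z1, z2)                                      # should approach 2+1j and 0.5-0.3j
print(np.sum(np.abs(f_obs - model(z1, z2)) ** 2))  # residual cost, should be near zero
```

Updating the complex unknowns directly this way is equivalent to gradient descent on the four real parameters, which is why both viewpoints gave the right answer earlier in the thread.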
 

1. What is least-squares optimization and why is it important in scientific research?

Least-squares optimization is a mathematical method for finding the best fit to a set of data points by minimizing the sum of squared errors between the data and a model function. It is important in scientific research because it lets us quantify relationships between variables and make predictions from the data.
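
As a generic illustration (a simple sketch not tied to the thread's complex-valued problem), fitting a straight line by minimizing the sum of squared errors:

```python
import numpy as np

# Fit y ~ a*x + b by minimizing sum_i (y_i - (a*x_i + b))^2.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

A = np.column_stack([x, np.ones_like(x)])          # design matrix [x, 1]
(a, b), sse, *_ = np.linalg.lstsq(A, y, rcond=None)
print(a, b)     # slope and intercept of the best-fit line
print(sse)      # array containing the sum of squared errors at the optimum
```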

2. Can least-squares optimization be applied to complex functions?

Yes. Least-squares optimization can be applied to complex-valued, nonlinear, or multivariate functions. A complex-valued residual is typically handled by splitting it into real and imaginary parts, or by using a complex gradient operator, as discussed above. Nonlinear problems may require more computation and may not have a unique solution.

3. What are the limitations of least-squares optimization?

One limitation is that ordinary least squares implicitly assumes the errors are independent and identically distributed, which may not hold in real-world scenarios (for example, with correlated noise or outliers). In such cases it may not give the most accurate or precise fit to the data.

4. How is least-squares optimization different from other optimization methods?

Least squares is best thought of as a choice of objective, the sum of squared errors, rather than as a particular algorithm; methods such as gradient descent, Gauss-Newton, or genetic algorithms can then be used to minimize that objective. It is also deterministic: for a given data set and starting point it will always produce the same result.

5. What are some common applications of least-squares optimization in scientific research?

Least-squares optimization is commonly used in fields such as statistics, physics, engineering, and economics. Typical applications include curve fitting, parameter estimation, data smoothing, and signal processing.
