Derivative of norm of function w.r.t real-part of function

In summary, the conversation discusses how to show that the partial derivative of h(u) = ½‖Au-b‖² with respect to the real part of u can be written as Re(A^†(Au-b)). The lecturer's method introduces a "zero padding" operator and interprets Au as a function. The conversation also covers a simpler approach using matrix calculus, although the calculation as posted appears to mix up row and column vectors.
  • #1
SchroedingersLion
TL;DR Summary
Trouble understanding an "exotic" method of taking the derivative of a norm of a complex-valued function with respect to the real part of the function.
Greetings,

suppose we have ##h(u)=\frac{1}{2} \left\|Au-b \right\|_{2}^2## with ##A## a complex matrix and ##b,u## complex vectors of suitable dimensions. Write ##u=u_1 + iu_2## with ##u_1## and ##u_2## as the real and imaginary part of ##u##, respectively.

Show that ##\frac {\partial h} {\partial u_1} = Re \left[ A^{\dagger}(Au-b) \right]##.

Now there is a straightforward way to show this: just rewrite ##h(u)## in terms of the matrix/vector components, take the derivative, and show that the required result is obtained. That, however, takes two pages of index- and dagger-laden calculations.
Our lecturer presented a simpler method, but I was unable to follow his logic (there might be errors in there):

[Image 1: IMG_20201005_183950004.jpg]

[Image 2: IMG_20201005_184010224.jpg]

He seems to interpret ##Au## as a function, and he introduces a "zero padding" operator ##\zeta## (looks like a 2), such that ##\zeta u=u_1 + i0## (?). Though I don't see why that would lead to ##A\zeta## being a function from ##\mathbb{R}^n \rightarrow \mathbb{C}^n##. Also I feel like something is wrong with the indices of ##u## here.

Note that in the case of real-valued ##A,b,u##, the derivative becomes ##A^{\dagger}(Au-b)##, which he seems to have used at the start of image 2.

Can anyone make sense of the notes?

SL
 
  • #2
I don't understand what he is doing, but ##h(u)=\frac{1}{2} \left\|Au-b \right\|_{2}^2 = \frac 1 2 (Au-b)^T(Au-b)^*## and therefore
$$\frac {\partial h} {\partial u_1}
= \frac 1 2 \left(\frac {\partial Au} {\partial u_1}\right)^T(Au-b)^* + \frac 1 2 (Au-b)^T\left(\frac {\partial Au} {\partial u_1}\right)^* $$
$$= \frac 1 2 \left(\frac {\partial u}{\partial u_1}\right)^T A^T (Au-b)^*+ \frac 1 2 (Au-b)^T A^* \left(\frac {\partial u} {\partial u_1}\right)^*$$
Now ##\frac {\partial u} {\partial u_1}## and its complex conjugate are simply 1:
$$\frac {\partial h} {\partial u_1}
=\frac 1 2 A^T (Au-b)^*+ \frac 1 2 (Au-b)^T A^* $$
$$=\frac 1 2 \left( A^\dagger (Au-b)\right)^* + \frac 1 2 (A^\dagger(Au-b))^T
\stackrel{?}{=} Re(A^\dagger (Au-b))$$
Hmm... something went wrong mixing up row and column vectors. Probably from assuming ##\frac {\partial u} {\partial u_1} = 1##. But it certainly looks like this approach works.
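
One way to tidy up the row/column mismatch, sketched here under the assumption that the gradient is collected as a column vector with entries ##\partial h / \partial u_{1,k}## (the shorthand ##r = Au-b## and the index notation are introduced only for this note): ##\frac {\partial u} {\partial u_1}## is the identity matrix rather than the scalar 1, and component by component
$$\frac {\partial h} {\partial u_{1,k}}
= \frac 1 2 \sum_j \left( A_{jk}\, r_j^* + r_j\, A_{jk}^* \right)
= \operatorname{Re} \sum_j A_{jk}^*\, r_j
= \operatorname{Re}\left[ \left( A^\dagger r \right)_k \right]$$
In matrix form, the first term is the column vector ##\frac 1 2 (A^\dagger r)^*## and the second is the row vector ##\frac 1 2 (A^\dagger r)^T##. Transposing the second into a column gives ##\frac 1 2 (A^\dagger r)^* + \frac 1 2 A^\dagger r = \operatorname{Re}(A^\dagger r)##.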
 
  • #3
mfb said:
I don't understand what he is doing, but ##h(u)=\frac{1}{2} \left\|Au-b \right\|_{2}^2 = \frac 1 2 (Au-b)^T(Au-b)^*## and therefore

Shouldn't your first line be

##h(u)=\frac{1}{2} \left\|Au-b \right\|_{2}^2 = \frac 1 2 (Au-b)^{*T}(Au-b)## ?
 
  • #4
Doesn't matter where you put the complex conjugation if you know the product is real. ##ab^* = (a^*b)^* = a^*b##.
 
  • #5
But you end up with a little work to do at the end, yes? Your result is correct I think.
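
A quick numerical sanity check (not a proof) can support the target identity. The sketch below compares ##Re[A^\dagger(Au-b)]## with central finite differences of ##h## in the components of ##u_1##, holding ##u_2## fixed. The helper names, the random test data, and the step size `eps` are assumptions made for illustration.

```python
import numpy as np

def h(A, b, u):
    """h(u) = 1/2 * ||A u - b||_2^2 for complex A, b, u."""
    r = A @ u - b
    return 0.5 * np.vdot(r, r).real

def grad_real_part(A, b, u):
    """Claimed gradient with respect to u_1 = Re(u): Re[A^dagger (A u - b)]."""
    return (A.conj().T @ (A @ u - b)).real

def fd_grad_real_part(A, b, u, eps=1e-6):
    """Central finite differences in each component of u_1, with u_2 held fixed."""
    n = u.shape[0]
    g = np.zeros(n)
    for k in range(n):
        step = np.zeros(n, dtype=complex)
        step[k] = eps  # purely real perturbation of component k
        g[k] = (h(A, b, u + step) - h(A, b, u - step)) / (2 * eps)
    return g

# Hypothetical random test problem (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
b = rng.standard_normal(m) + 1j * rng.standard_normal(m)
u = rng.standard_normal(n) + 1j * rng.standard_normal(n)

analytic = grad_real_part(A, b, u)
numeric = fd_grad_real_part(A, b, u)
max_abs_diff = np.max(np.abs(analytic - numeric))
print(max_abs_diff)  # expected to be tiny, limited by rounding
```

Because ##h## is quadratic in ##u_1##, the central difference has no truncation error, so any remaining gap comes from floating-point rounding.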
 
  • #6
Thanks mfb. I did not try to solve it using matrix calculus like this. I shall try to spot the misstep in the calculation so that it works out in the end. Originally, I started similarly, but I rewrote the scalar product ##(Au-b)^{\dagger}(Au-b) ## in terms of a sum over the components which became a very tedious calculation...

However, I would still like to understand that magic that the lecturer performed...
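
One possible reading of the zero-padding argument, reconstructed only from the description in post #1 (the notes themselves are not reproduced here, so the vector ##c## and the adjoint convention below are assumptions): take ##\zeta: \mathbb{R}^n \rightarrow \mathbb{C}^n## with ##\zeta u_1 = u_1 + i0## and hold ##u_2## fixed, so that ##Au-b = A\zeta u_1 - c## with ##c = b - iAu_2##. Then ##h## is a least-squares function of the real vector ##u_1## with the real-linear map ##A\zeta##, and the formula from the real-valued case applies if the adjoint is taken with respect to the real inner product ##\operatorname{Re}(x^\dagger y)##:
$$\frac {\partial h} {\partial u_1}
= (A\zeta)^\dagger (A\zeta u_1 - c)
= \zeta^\dagger A^\dagger (Au-b)
= \operatorname{Re}\left[ A^\dagger (Au-b) \right]$$
The last step uses ##\zeta^\dagger = \operatorname{Re}##, which follows from ##\operatorname{Re}\left[ (\zeta x)^\dagger y \right] = x^T \operatorname{Re}(y)## for real ##x##.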
 

1. What is the derivative of the norm of a function with respect to the real part of the function?

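The derivative of the norm of a function with respect to the real part of the function is the vector of partial derivatives of the norm with respect to the components of the real part, with the imaginary part held fixed. It can be calculated using the chain rule and the definition of the norm. In the thread's example, ##h(u)=\frac{1}{2} \left\|Au-b \right\|_{2}^2##, the target result is ##Re \left[ A^{\dagger}(Au-b) \right]##.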

2. Why is it important to calculate the derivative of the norm of a function with respect to the real part of the function?

Knowing the derivative of the norm of a function with respect to the real part is important in optimization problems, as it allows us to find the minimum or maximum values of a function with respect to the real part. It is also used in various engineering and scientific applications, such as signal processing and image processing.
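
As a minimal sketch of the optimization use case, assuming the thread's ##h(u)=\frac{1}{2} \left\|Au-b \right\|_{2}^2##: the same kind of calculation gives ##Im[A^{\dagger}(Au-b)]## for the derivative with respect to ##u_2##, so a gradient-descent step on both parts combines into ##u \leftarrow u - t\,A^{\dagger}(Au-b)##. The function name, the step-size rule, the iteration count, and the random data are assumptions for illustration, not something taken from the thread.

```python
import numpy as np

def least_squares_gradient_descent(A, b, num_steps=500):
    """Minimize h(u) = 1/2 * ||A u - b||_2^2 over complex u by gradient descent."""
    u = np.zeros(A.shape[1], dtype=complex)
    # Assumed step size: 1 / sigma_max(A)^2, which keeps the iteration stable.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    for _ in range(num_steps):
        g = A.conj().T @ (A @ u - b)
        # Re(g) is the gradient w.r.t. u_1 and Im(g) the gradient w.r.t. u_2,
        # so updating both parts at once amounts to u <- u - step * g.
        u = u - step * g
    return u

# Hypothetical well-conditioned test problem (tall random matrix).
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 3)) + 1j * rng.standard_normal((20, 3))
b = rng.standard_normal(20) + 1j * rng.standard_normal(20)

u_gd = least_squares_gradient_descent(A, b)
u_ref = np.linalg.lstsq(A, b, rcond=None)[0]  # reference least-squares solution
gap = np.max(np.abs(u_gd - u_ref))
print(gap)
```

The comparison against `np.linalg.lstsq` is only a consistency check; for a poorly conditioned ##A## this plain iteration would need many more steps.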

3. How is the derivative of the norm of a function with respect to the real part of the function related to the gradient?

The gradient of a real-valued function is a vector that points in the direction of steepest increase. Collecting the partial derivatives of the norm with respect to the components of the real part gives the gradient with respect to the real part. Its direction is therefore the direction in which the norm increases fastest when only the real part is varied.

4. Can the derivative of the norm of a function with respect to the real part of the function be negative?

Yes, the components of the derivative of the norm with respect to the real part can be negative. A negative component indicates that the norm decreases as the corresponding component of the real part increases. The sign of the derivative therefore provides information about the behavior of the function and can be used in optimization techniques.

5. Are there any special cases where the derivative of the norm of a function with respect to the real part of the function is equal to zero?

Yes, there are cases where the derivative of the norm of a function with respect to the real part is equal to zero. This occurs when the function is constant with respect to the real part, meaning that the norm does not change as the real part varies. It also happens at critical points of the function, where the derivative is zero in all directions. In the thread's example, the derivative with respect to the real part vanishes wherever ##Re \left[ A^{\dagger}(Au-b) \right] = 0##.
