# Constraining the element-wise magnitudes of a matrix

I have a complex matrix, $\textbf{A}$, and I want to left-multiply it by a unitary matrix, $\textbf{U}$ (i.e. $\textbf{U}$ is square and $\textbf{U}^H\textbf{U}=\textbf{I}$).

The goal is to find the $\textbf{U}$ which yields the optimal solution $\textbf{B}_{opt} \triangleq \textbf{U}\textbf{A}$, where $\textbf{B}_{opt}$ is optimal in the sense that its element-wise magnitudes are all simultaneously as close as possible to unity.

That is, I would like to minimise something like $\underline{1}^T \left|\left(\left|\textbf{B}_{opt}\right|^2 - \underline{1}\underline{1}^T\right)\right|^2 \underline{1}$, i.e. $\sum_{i,j}\left(\left|b_{ij}\right|^2 - 1\right)^2$ with $b_{ij}$ the entries of $\textbf{B}_{opt}$, subject to $\textbf{U}^H\textbf{U}=\textbf{I}$. Here $\left|\textbf{B}_{opt}\right|$ denotes the element-by-element absolute value of $\textbf{B}_{opt}$, the squares are also taken element by element, and each $\underline{1}$ is a column vector of ones of the appropriate length.
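For concreteness, the objective can be evaluated numerically. The following is a minimal NumPy sketch; the function name `unit_modulus_cost` and the test matrix are illustrative assumptions, not part of the problem statement.

```python
import numpy as np

def unit_modulus_cost(U, A):
    """Sum over all entries of (|b_ij|^2 - 1)^2, where B = U A."""
    B = U @ A
    deviation = np.abs(B) ** 2 - 1.0   # element-wise |b_ij|^2 - 1
    return float(np.sum(deviation ** 2))

rng = np.random.default_rng(0)

# Hypothetical test matrix whose entries all have unit magnitude
A = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(3, 4)))
U = np.eye(3, dtype=complex)

cost_unit = unit_modulus_cost(U, A)          # entries already on the unit circle
cost_scaled = unit_modulus_cost(U, 2.0 * A)  # every entry has magnitude 2
```

With every entry on the unit circle the cost vanishes (up to rounding); scaling the entries to magnitude 2 gives a deviation of 3 per entry, so the cost is 9 times the number of entries.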

How can I approach this problem? I have tried to find a solution using Lagrange multipliers, but I can't seem to gain any insight into how to design $\textbf{U}$. What other sorts of methods are available for this type of problem?

Any advice is greatly appreciated!

## Answers and Replies

Ben Niehoff
The sum of the "element-by-element magnitude squared" can be written

$$\mathrm{tr} \, (A^\dagger A)$$

This might be useful.

Ah yes, I have been using that expression (the sum of the element-by-element squared magnitudes is the squared Frobenius norm). It is particularly useful because matrix traces tend to be easy to differentiate.
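The trace identity, and the fact that left-multiplication by a unitary matrix leaves it unchanged, can be checked numerically. This is only a sketch: the matrix sizes, the random seed and the variable names are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))

# Random unitary matrix taken from the QR decomposition of a complex Gaussian matrix
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
B = Q @ A

trace_form = np.trace(A.conj().T @ A).real   # tr(A^H A)
sum_form = np.sum(np.abs(A) ** 2)            # sum of |a_ij|^2
frob_sq = np.linalg.norm(A, "fro") ** 2      # squared Frobenius norm of A
frob_sq_B = np.linalg.norm(B, "fro") ** 2    # squared Frobenius norm of B = Q A

# Expanding (|b|^2 - 1)^2 = |b|^4 - 2|b|^2 + 1 and summing over all entries
cost_direct = np.sum((np.abs(B) ** 2 - 1.0) ** 2)
cost_quartic = np.sum(np.abs(B) ** 4) - 2.0 * frob_sq + A.size
```

Because the sum of the squared magnitudes is the same for $\textbf{A}$ and $\textbf{U}\textbf{A}$, the objective differs from the sum of the fourth powers of the magnitudes only by a constant that does not depend on $\textbf{U}$.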

However, I run into trouble with the $\left|\textbf{B}_{opt}\right|^2$ term, which is not summed. I wrote it as $\left( \textbf{B}_{opt} \odot \textbf{B}_{opt}^* \right)$, where $\odot$ is the Hadamard (element-by-element) product. When differentiated (w.r.t. $\textbf{U}^*$) this produces an ugly expression with several Hadamard products that I find difficult to work with.
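For reference, here is a sketch of what the Hadamard-product form and its derivative could look like in NumPy. It treats $\textbf{U}$ as unconstrained (the unitarity constraint is ignored here), and it assumes the Wirtinger convention in which $\textbf{U}$ and $\textbf{U}^*$ are independent variables. The function names are hypothetical; the finite-difference comparison is only a sanity check of the algebra.

```python
import numpy as np

def cost(U, A):
    B = U @ A
    return float(np.sum((np.abs(B) ** 2 - 1.0) ** 2))

def grad_conj_U(U, A):
    """Derivative of the cost with respect to conj(U), with U unconstrained."""
    B = U @ A
    B_sq = (B * B.conj()).real                 # Hadamard product of B with its conjugate
    return (2.0 * (B_sq - 1.0) * B) @ A.conj().T

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
U = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
E = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))  # perturbation direction

# Central finite difference of the cost along E
eps = 1e-6
numeric = (cost(U + eps * E, A) - cost(U - eps * E, A)) / (2.0 * eps)

# For a real-valued cost, the directional derivative is 2 Re <grad, E>
analytic = 2.0 * np.real(np.vdot(grad_conj_U(U, A), E))
```

In matrix form the derivative used above is $2\left[\left(\textbf{B}\odot\textbf{B}^* - \underline{1}\underline{1}^T\right)\odot\textbf{B}\right]\textbf{A}^H$ with $\textbf{B}=\textbf{U}\textbf{A}$, so only two Hadamard products remain.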

Furthermore, differentiating the constraint term in the Lagrange function seems to give strange results (or perhaps I have made an error)... so I'm looking for any fresh ideas or insights!
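One way to sidestep differentiating the constraint term is to keep $\textbf{U}$ unitary by construction rather than through a multiplier. The sketch below is not a method proposed in this thread: it is a plain gradient descent that projects the gradient onto the tangent space of the unitary group and maps each step back to a unitary matrix with a polar decomposition. The function names, step size and iteration count are arbitrary assumptions, and it will in general only find a local minimum.

```python
import numpy as np

def cost(U, A):
    B = U @ A
    return float(np.sum((np.abs(B) ** 2 - 1.0) ** 2))

def euclidean_grad(U, A):
    """Derivative of the cost with respect to conj(U), ignoring the constraint."""
    B = U @ A
    return (2.0 * (np.abs(B) ** 2 - 1.0) * B) @ A.conj().T

def polar_unitary(M):
    """Unitary factor of the polar decomposition of M, computed via the SVD."""
    W, _, Vh = np.linalg.svd(M)
    return W @ Vh

def descend(A, U0, steps=200, step_size=0.1):
    U, mu = U0, step_size
    for _ in range(steps):
        M = U.conj().T @ euclidean_grad(U, A)
        # Tangent directions at U have the form U @ S with S skew-Hermitian
        direction = U @ (M - M.conj().T) / 2.0
        candidate = polar_unitary(U - mu * direction)
        if cost(candidate, A) < cost(U, A):
            U = candidate      # accept the step
        else:
            mu *= 0.5          # backtrack: try a smaller step next time
    return U

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
U0 = np.eye(3, dtype=complex)
U_final = descend(A, U0)
```

Because a step is accepted only when it lowers the cost, the final cost is never larger than the starting one, and every iterate satisfies $\textbf{U}^H\textbf{U}=\textbf{I}$ up to rounding. Trying several random unitary starting points would be a natural way to probe for better local minima.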