# Help understanding proof for Capon Method

1. Oct 9, 2013

### Milliarde

Hey all, I'm trying to understand the Capon Method for some reading into MIMO radar [1] and went to the original source [2] trying to wrap my head around the linear algebra involved. From [2], the Capon method is derived in Section 5.4.1, which refers to Result R35 in Appendix A. It is about finding a unique solution to the following minimization problem:

$min_{X} X^*AX$ subject to $X^*B=C$

It's been a while since I took linear algebra, but I remember most of the basics. However, I can't wrap my head around the line after Proof's equation A.9.6:
$$\Delta^*AX_0 = \Delta^*B(B^*A^{-1}B)^{-1}C^*=0$$
It seems to come out of nowhere to me, and I'm sure it would require more context which is why I provided the source so it wouldn't be as hard to read this post.

My attempts to work it out on scratch paper haven't been fruitful, mainly because I've tried starting with the equation $X^*B=C$ and doing stuff like this:
$$X_0^*B=C \Rightarrow B^*X_0=C^* \Rightarrow X_0 = B^{*-1}C^*$$
However, since $B$ is not square, I don't think I can have $B^{*-1}$. Without the ability to do things like that, I don't know how to systematically get to the equation above (line after A.9.6). It feels like you'd have to jump multiple steps to get the part of the equation that's inverted and inside parenthesis, for example.

Any advice would be much appreciated, I'd really like to understand how this works.

Thank you!

References:
[1]: MIMO Radar Signal Processing by Jian Li and Petre Stoica.
[2]: Spectral Analysis of Signals by Petre Stoica and Randolph Moses.

2. Oct 9, 2013

### SteamKing

Staff Emeritus
I think it would be helpful for you to review the techniques used for regression analyses. In these analyses, the system of equations to be solved are overdetermined, i.e., there are more equations than unknowns. There are several techniques which are used to remedy this: formation of normal equations (which makes a consistent system of equations), the QR method, and the singular value decomposition. It seems that you must be familiar with this material in order to understand the derivation carried out in App. A of your link.

3. Oct 10, 2013

### jasonRF

Finding necessary conditions for the solutions of these kinds of problems is straightforward: you can use Lagrange multipliers and set a derivative equal to zero, just like you did in calculus class. The only difference is that your parameters are complex: you can either split into real and imaginary parts or use a short-cut like Wirtinger calculus (https://www.physicsforums.com/showthread.php?t=708264). This should immediately yield Capon's solution (C=1 here)
$$\hat{X} = \frac{A^{-1} B}{B^* A^{-1} B}$$
Here $\hat{X}$ denotes the value of $X$ that makes the derivative zero.

To show that the solution is indeed the minimum you can note the following (I am trying to keep your notation and I put C=1; you should work through this yourself):
$$Q = X^* A X = (X - \hat{X})^* A (X - \hat{X}) + \hat{X}^* A \hat{X}$$

Since the second term in Q does not depend upon $X$, we can see that Q is minimized when the first term is zero, or $X = \hat{X}$. Indeed, noting that the original quadratic form can be written this is all you really need to solve the entire problem, but I think that the first step I indicated helps you see why you might think of re-writing Q this way in the first place.
jason

EDIT: I did not attempt to open or read the PDF you linked (I know that book is available for purchase on Amazon - is the pdf linked legal?), so I hope my explanation is what you are looking for!

4. Oct 10, 2013

### jasonRF

Over lunch I looked at the link - it is from the author's web site so presumably it is perfectly legal. What a nice free resource!

5. Oct 10, 2013

### marcusl

Your [2] is a very long way from "the original source." Capon published his method for array direction finding in 1967, and applied it to spectral estimation in a very widely cited paper from 1969.

6. Oct 10, 2013

### Milliarde

@jasonRF, thanks a bunch for the help you've provided so far, as well as the links. They've been very informative. I've never used Lagrangian multipliers or Wirtinger calculus before, so I'm slowly learning it.

However, I'm still sort of confused how to apply these techniques to the matrix calculus example I'm trying to figure out. I followed the steps in the thread you linked and reworked mathwonk's solution, but it still feels like it's different since the other example was a continuous function $f(x)=zz^*$. When this is the case, it seems simpler to me to use algebra to reärrange terms than when matrices (especially non-invertible) are present.

When you say the difference is the complex part and that it could be split into real and imaginary, is that only helpful for determining the partial derivatives for the Lagrangian multiplier method? Or does it serve some purpose after the derivatives have been taken?

When I've tried the Lagrangian method this is what I get so far. I'm basing this off of looking through some of the example problems on Wikipedia's page for Lagrangian multipliers, notably example 2, although I'm sure I'm doing something incorrectly.

Also I changed some notation since things got confusing between sources: $H$ = Hermitian (conjugate transpose), $T$ = transpose, $C$ = conjugate.

$f(X) = X^HAX, g(X) = X^HB = C$

$\Lambda(X, \lambda) = f(X) + \lambda(g(X)-C) = X^HAX+\lambda(X^HB-C)$

$\frac{\partial \Lambda}{\partial X} = A^TX^C = 0 = X^HA, (??)$ (this should come from Wirtinger derivative)

$\frac{\partial \Lambda}{\partial \lambda} = X^H B - C = 0$

I'm not sure what the next step would be, though, and I don't even feel confident in the results I got above. Sorry for all the questions, the way you've written it makes it sound so simple so I feel like I should understand it.

@marcusl, you are right, I should have simply written "source" instead, I was referring to the reference given in the original textbook where it went over Capon's method.

7. Oct 10, 2013

### jasonRF

If you have never used Lagrange multipliers before you should do a simple problem or two to help you get the mechanics down (Wikipedia or your calculus book is likely a great place to learn it).

Anyway, you wrote your constraint in terms of $X^H$, so if you take the derivative with respect to $X$ it is the same as using no lagrange multiplier. For Wirtinger I always take the derivative with respect to the conjugate anyway, which yields
$$\frac{\partial \Lambda}{\partial X^H} = A X + \lambda B.$$

Alternatively, if you write your constraint $B^H X = C^*$ instead then your derivative with respect to $X$ should work out for you.

jason

EDIT: by the way I am using the asterix to represent conjugation here.