Homework Help: Derivation of completeness relation from Jackson's Classical Electrody

1. Aug 1, 2014

physicsjn

1. The problem statement, all variables and given/known data
Greetings! I am reading section 2.8 of Jackson and trying to understand how the completeness relation is derived.

It starts with the orthonormality condition:
$\int_a ^b U_n ^*(ε) U_m(ε) dε =δ_{nm}$

We can represent a function as a sum of orthonormal functions if N is finite:
$f(ε) ⇔ \sum_{n=1}^N a_n U_n (ε)$

$M_N = \int_a ^b |f(ε) - \sum_{n=1}^N a_n U_n(ε)|^2 dε$

Jackson says that "it is easy to show that the coefficients are given by
$a_n=\int_a ^b U_n ^*(ε) f(ε) dε$."

Question: How do I show this?

2. Relevant equations
Same as above.

3. The attempt at a solution
I guess minimizing $M_N$ means setting it to zero, but I'm not sure what to do after that. How do I extract the buried coefficients? I am very sorry.

Last edited: Aug 1, 2014
2. Aug 1, 2014

AlephZero

No, it means that $\displaystyle \frac {\partial M_N }{\partial a_n} = 0$ for each $n$.

So you get $\displaystyle \int_a^b \frac{\partial}{\partial a_n} | f(\epsilon) - a_n U_n(\epsilon)|^2 d\epsilon = 0$ for each $n$.

3. Aug 1, 2014

physicsjn

@AlephZero Thank you very much!

So we'll have
(1) $\int_a ^b \frac{\partial}{\partial a_n} \bigg( \sqrt{(f(ε) - a_n U_n (ε))^2} \bigg) ^2 dε= 0$
since $|x| = \sqrt {x^2}$

(2) $\int_a ^b \frac{\partial}{\partial a_n}(f(ε) - a_n U_n (ε))^2 dε= 0$

(3) $2\int_a ^b (f(ε) - a_n U_n (ε)) U_n^*(ε) dε= 0$

(4) $\int_a ^b f(ε)U_n^*(ε) dε - \int_a ^b a_n U_n (ε) U_n^*(ε) dε= 0$

(5) $\int_a ^b f(ε)U_n^*(ε) dε = \int_a ^b a_n U_n (ε) U_n^*(ε) dε$
via orthonormality condition $0 = \int_a ^bU_n (ε) U_n^*(ε) dε$

(6) $\int_a ^b f(ε)U_n^*(ε) dε = a_n$

I see. Thanks a lot. But in step 2 going to 3, when I applied chain rule, U* came out instead of U. I did that so that I can arrive at step 6. However, I don't know why U* should come out instead of U. Is there a valid mathematical explanation for this? Thanks again.

4. Aug 1, 2014

Fredrik

Staff Emeritus
I haven't looked at the relevant section of the book, and I haven't worked this out to see if this leads to anything good, but $|x|=\sqrt{x^2}$ is only true when x is real. The $U_n$ functions appear to be complex valued, so it looks like you should be using $|x|^2=x^*x$. Maybe that's where the missing * comes from. I would try starting with
$$0=\frac{\partial M_N}{\partial a_n} = \frac{\partial}{\partial a_n}\int _a^b\bigg|f(\varepsilon)-\sum_m a_m U_m(\varepsilon)\bigg|^2 \mathrm d\varepsilon = \frac{\partial}{\partial a_n}\int _a^b\bigg(f(\varepsilon)-\sum_m a_m U_m(\varepsilon)\bigg)^*\bigg(f(\varepsilon)-\sum_k a_k U_k(\varepsilon)\bigg) \mathrm d\varepsilon=\cdots$$

Last edited: Aug 1, 2014
5. Aug 1, 2014

AlephZero

Agreed. And of course this also works when $U$ is real-valued and $U^* = U$.

6. Aug 2, 2014

physicsjn

Thanks AlephZero and Fredrik for your replies. I'm also not sure if this will lead to anything, but we had a lecture about Chapter 4 and the professor asked us to fill in the missing steps, and the derivation seems to go all the way back to Chapter 2. Anyway, if
$|x|^2=x^*x$, then

$0=\frac{\partial}{\partial a_n}\int_a^b \big(f(ε) - \sum\limits_{m}a_mU_m(ε) \big)^* \big(f(ε) - \sum\limits_{k}a_kU_k(ε) \big) dε$

$0=\frac{\partial}{\partial a_n}\int_a^b \big( |f(ε)|^2 - \sum\limits_{m} a_m U_m ^* f(ε) - \sum\limits_{k} a_k U_k f^*(ε) + \sum\limits_{m}a_mU_m^*(ε) \sum\limits_{k}a_kU_k(ε) \big) dε$

Differentiating, constants and summation terms with indices m and k not equal to n will be killed. Therefore,
$0=\int_a^b -a_n U_n^* f(ε) - a_n U_n f^*(ε) + \frac{\partial}{\partial a_n} \big( \sum\limits_{m}a_mU_m^*(ε) \sum\limits_{k}a_kU_k(ε) \big) dε$

Applying product rule to the last term,
$\frac{\partial}{\partial a_n} \big( \sum\limits_{m}a_mU_m^*(ε) \sum\limits_{k}a_kU_k(ε) \big)= \sum\limits_{m}a_mU_m^*(ε) \frac{\partial}{\partial a_n}\big( \sum\limits_{k}a_kU_k(ε) \big) + \frac{\partial}{\partial a_n}\big(\sum\limits_{m}a_mU_m^*(ε) \big) \sum\limits_{k}a_kU_k(ε)$

Again, the non-n terms are killed during differentiation with respect to $a_n$,
$\frac{\partial}{\partial a_n} \big( \sum\limits_{m}a_mU_m^*(ε) \sum\limits_{k}a_kU_k(ε) \big)=\sum\limits_m a_m U_m^* (a_n U_n) + \sum\limits_k (a_k U_k) a_n U_n^*$

Going back to the equation,
$0=\int_a^b \bigg( -a_n U_n^* f(ε) - a_n U_n f^*(ε) +\sum\limits_m a_m U_m^* (a_n U_n) + \sum\limits_k (a_k U_k) a_n U_n^* \bigg) dε$

$0=-\int_a^b a_n U_n^* f(ε) dε -\int_a^b a_n U_n f^*(ε) dε + \int_a^b a_n^2 U_n^*U_n dε + \int_a^b a_n^2 U_n U_n^* dε$

$0=-\int_a^b a_n U_n^* f(ε) dε -\int_a^b a_n U_n f^*(ε) dε + a_n ^2 +a_n ^2$

$0=-\int_a^b a_n U_n^* f(ε) dε -\int_a^b a_n U_n f^*(ε) dε + 2a_n ^2$

Hmm. Alright, I will assume that f(ε) is real to make things less complicated.
$0=-\int_a^b a_n U_n^* f(ε) dε -\int_a^b a_n U_n f(ε) dε + 2a_n ^2$

$2a_n ^2=\int_a^b a_n U_n^* f(ε) dε + \int_a^b a_n U_n f(ε) dε$

Okay, I'm almost there. There's an extra term and factor 2. Did I do something wrong?

Thanks again.

PS. In my previous post, step (5), I made a mistake. The orthonormality condition should be
$1=\int_a^b U_n(ε) U_n^*(ε) dε$ not $0=\int_a^b U_n(ε) U_n^*(ε) dε$

Last edited: Aug 2, 2014
7. Aug 2, 2014

Fredrik

Staff Emeritus
You seem to have evaluated $\frac{\partial}{\partial a_n}a_n$ to $a_n$ instead of 1. So you get too many $a_n$ in the end. You may be able to simplify things a bit by using the formula $z+z^*=2\operatorname{Re} z$ when you find that one of your terms is the complex conjugate of another. But I think there's a more serious problem with this approach. We can't assume that $a_n$ is real. Doing so will likely give us the wrong result. This makes terms that include $a_k^*$ a problem. We can't just write
$$\frac{\partial}{\partial a_n}a_k^*= \left(\frac{\partial}{\partial a_n}a_k\right)^* =\delta_{nk}^* =\delta_{nk}$$
because the complex conjugation function isn't (complex) differentiable, and that makes the first step invalid.

The straightforward way to convert this to an optimization problem in (real) calculus is to write $a_k=b_k+ic_k$ with $b_k$ and $c_k$ both real, and then see what we get from $\frac{\partial M_N}{\partial b_n}=0$ and $\frac{\partial M_N}{\partial c_n}=0$. Another option is to simply treat $a_n$ and $a_n^*$ as independent variables, i.e. to see what we get from $\frac{\partial M_N}{\partial a_n}=0$ and $\frac{\partial M_N}{\partial a_n^*}=0$ when we use $\frac{\partial}{\partial a_n}a_k^*=0$ and $\frac{\partial}{\partial a_n^*}a_k=0$ for all n and k (including when n=k). It's not at all obvious that this makes sense, but it does, and physics books use this trick all the time without any explanation, so I think it should be OK to use it. When we use this trick, the calculation gets much easier.

Another thing that struck me is that there's an easier way to obtain the formula we want. It follows almost immediately from $f=\sum_n a_n U_n$. It's a one-line calculation. So I don't really see why we're solving an optimization problem here. Maybe it's just to verify that the formula we have to use anyway is the solution to an optimization problem.
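
For reference, a sketch of that one-line calculation, assuming the expansion $f=\sum_m a_m U_m$ holds and that the sum may be integrated term by term (both are assumptions here, not something the approximation problem gives us):
$$\int_a^b U_n^*(\varepsilon)f(\varepsilon)\,\mathrm d\varepsilon =\sum_m a_m\int_a^b U_n^*(\varepsilon)U_m(\varepsilon)\,\mathrm d\varepsilon =\sum_m a_m\delta_{nm}=a_n$$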

8. Aug 2, 2014

AlephZero

Another way round that problem is to see that the definition of the functions $U_n$ is arbitrary, in the sense that we can replace $U_n$ by $e^{i\phi_n}U_n$ for any real number $\phi_n$. (Physically this corresponds to "rotating" the $U_n$ into a preferred orientation, in some sense.)

For a suitable choice of the $\phi_n$, the coefficients $a_n$ are real numbers, and the calculus problems go away.

9. Aug 2, 2014

Oxvillian

I think the idea is that we're not assuming that the $U_n$ comprise a complete set of functions. Then the smallest error possible will in general be nonzero.

Last edited: Aug 2, 2014
10. Aug 2, 2014

Oxvillian

I disagree because redefining the $U_n$ in this way doesn't suddenly reduce the extremization problem from one involving $2N$ free parameters to one involving $N$ free parameters. One still has to show that the minimum can't be improved by making one of the $a_n$'s a little bit complex.

Also note that your redefinition of the $U_n$'s is specific to one particular $f$. For example, the expansion of $if$ in terms of the same set of functions would involve purely imaginary $a_n$ coefficients.
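
To make the $2N$ real parameters explicit, here is a sketch of the extremization written out in real variables. The shorthand $F_n=\int_a^b U_n^*(ε)f(ε)dε$ and the split $a_n=b_n+ic_n$ (with $b_n$ and $c_n$ real) are notation introduced for this illustration. Expanding $|\cdot|^2$ as $(\cdot)^*(\cdot)$ and using orthonormality,
\begin{align}
M_N&=\int_a^b|f(ε)|^2dε-\sum_{n=1}^N\left(a_n^*F_n+a_nF_n^*\right)+\sum_{n=1}^N|a_n|^2\\
&=\int_a^b|f(ε)|^2dε+\sum_{n=1}^N\left(b_n^2+c_n^2-2b_n\operatorname{Re}F_n-2c_n\operatorname{Im}F_n\right)
\end{align}
so that
\begin{align}
&\frac{\partial M_N}{\partial b_n}=2\left(b_n-\operatorname{Re}F_n\right)=0\\
&\frac{\partial M_N}{\partial c_n}=2\left(c_n-\operatorname{Im}F_n\right)=0
\end{align}
Both conditions together give $a_n=b_n+ic_n=F_n$. Since $M_N$ is an upward-opening parabola in each $b_n$ and each $c_n$ separately, this stationary point is the minimum, so no small real or imaginary change to any $a_n$ can improve it.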

11. Aug 2, 2014

Fredrik

Staff Emeritus
I see another issue with the redefinition. If we write
$$f=\sum_n a_n U_n=\sum_n \left(a_n e^{-i\phi_n}\right)\left(e^{i\phi_n}U_n\right) =\sum_n a_n' U_n'$$ then the $a_n'$ will have to be functions, not numbers.
$$f(\varepsilon) =\sum_n \left(a_n e^{-i\phi_n(\varepsilon)}\right) \left(e^{i\phi_n(\varepsilon)} U_n(\varepsilon)\right)$$

12. Aug 2, 2014

Fredrik

Staff Emeritus
The method I would recommend is to treat $a_n$ and $a_n^*$ as independent variables. The reason why this works is this: (Thanks Avodyne.) Suppose that $R:\mathbb R^2\to\mathbb R$ and $C:\mathbb C^2\to\mathbb R$ are such that $R(x,y)=C(x+iy,x-iy)$ for all $x,y\in\mathbb R$. I will use the notation $D_i$ for the operator that takes a function defined on $\mathbb R^2$ to its ith partial derivative, and $\bar D_i$ for the operator that takes a function defined on $\mathbb C^2$ to its ith partial derivative.

Then we have
\begin{align}
&D_1R=\bar D_1C+\bar D_2C\\
&D_2R=i(\bar D_1C-\bar D_2C)
\end{align} Here we can see that if $\bar D_1C=\bar D_2C=0$, then $D_1 R=D_2R=0$. To see that the converse is also true, note that the above implies that
\begin{align}
&D_1R-iD_2R=2\bar D_1C\\
&D_1R+iD_2R=2\bar D_2C
\end{align}
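
As a sketch of how this applies to the problem in this thread, write $F_n=\int_a^b U_n^*(\varepsilon)f(\varepsilon)\mathrm d\varepsilon$ (a shorthand assumed here for illustration). Orthonormality gives
$$M_N=\int_a^b|f(\varepsilon)|^2\mathrm d\varepsilon-\sum_m a_m^*F_m-\sum_m a_mF_m^*+\sum_m a_m^*a_m$$
and treating $a_n$ and $a_n^*$ as independent variables,
\begin{align}
&\frac{\partial M_N}{\partial a_n^*}=-F_n+a_n=0\\
&\frac{\partial M_N}{\partial a_n}=-F_n^*+a_n^*=0
\end{align}
The two conditions are complex conjugates of each other, and both say $a_n=\int_a^b U_n^*(\varepsilon)f(\varepsilon)\mathrm d\varepsilon$.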

13. Aug 4, 2014

physicsjn

Thank you very much Fredrik, AlephZero, and Oxvillian for your kind replies. Sorry I only replied today; I had no internet access yesterday.

Yeah, I see that I made that mistake. I'll work on that first. Thanks Fredrik!

Hm. Actually, I'm just following Jackson's approach in section 2.8. @Fredrik In the book, the equation $f=\sum\limits_{n} a_nU_n$ comes after the equation $a_n = \int_a ^b U_n^*(ε) f(ε) dε$ that we are trying to show, so I don't think I can use that.

I'm kinda confused. Do I really have to assume that $a_n$ and $f(ε)$ are complex numbers and functions respectively?

Thanks again everyone. I think I would need some time to digest the other replies that I did not comment on. (My background in complex numbers is not very solid. Sorry.)

Last edited: Aug 4, 2014
14. Aug 4, 2014

Fredrik

Staff Emeritus
I finally opened my Jackson to see what he says. The presentation is pretty strange. He starts by saying that "An arbitrary function $f(ε)$, square integrable on the interval $(a,b)$, can be expanded in a series of the orthonormal functions $U_n(ε)$." To say this is to say that there's a sequence $\langle a_n\rangle_{n=0}^\infty$ in $\mathbb C$ such that $f=\sum_{n=0}^\infty a_n U_n$. This and the orthonormality condition together imply that $\int_a^b U_n^*(\varepsilon)f(\varepsilon)\mathrm d\varepsilon =a_n$.

Then he says "If the number of terms is finite..." This should mean that he's now considering the case where all but a finite number of the $a_n$ are zero. But he's not. He's ignoring that he has already told us that f can be expanded in a series like that, and is just trying to state the following problem: Suppose that $\{U_n\}_{n=0}^\infty$ is an orthonormal set of complex-valued functions. Suppose that we want to approximate f by a linear combination of the $U_n$, i.e. $f\approx\sum_{n=0}^N a_nU_n$. Then what is the best choice of $a_n$? First we have to define what we mean by "best", so he does that, and then tells us that the definition leads to the formula $\int_a^b U_n^*(\varepsilon)f(\varepsilon)\mathrm d\varepsilon =a_n$.

Since students at this level are likely to be familiar with vector spaces and bases for vector spaces, I think maybe it would have been better to say that a given orthonormal set $\{U_n\}$ of complex-valued functions may or may not be an orthonormal basis for the vector space of square-integrable complex-valued functions with domain (a,b) (note that f is an element of that vector space even if it's real-valued), and if it is, then we have $f=\sum_n a_n U_n$, which implies that $\int_a^b U_n^*(\varepsilon)f(\varepsilon)\mathrm d\varepsilon =a_n$. There are some subtleties in this approach too, and we would probably ignore them, but at least we would be ignoring the same things as in an introductory QM course.

If the $U_n$ are complex-valued functions, then the $a_n$ will be complex numbers, even if the specific f that you want to expand in a series happens to be a real-valued function.
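
A concrete illustration, with the set and the interval chosen here as assumptions rather than taken from Jackson: let $U_n(\varepsilon)=e^{in\varepsilon}/\sqrt{2\pi}$ on $(-\pi,\pi)$, with $n$ running over all integers, and take the real-valued function $f(\varepsilon)=\sin\varepsilon$. Then
\begin{align}
&a_{1}=\frac{1}{\sqrt{2\pi}}\int_{-\pi}^{\pi}e^{-i\varepsilon}\sin\varepsilon\,\mathrm d\varepsilon=-i\sqrt{\frac{\pi}{2}}\\
&a_{-1}=\frac{1}{\sqrt{2\pi}}\int_{-\pi}^{\pi}e^{i\varepsilon}\sin\varepsilon\,\mathrm d\varepsilon=+i\sqrt{\frac{\pi}{2}}
\end{align}
and all other $a_n$ vanish. The coefficients are purely imaginary, yet $a_1U_1+a_{-1}U_{-1}=\frac{1}{2i}\left(e^{i\varepsilon}-e^{-i\varepsilon}\right)=\sin\varepsilon$ is real.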

Last edited: Aug 4, 2014
15. Aug 4, 2014

Oxvillian

Fredrik - the moral of Jackson's story is that if we have an incomplete set of orthonormal functions, the "expansion of best fit" to a given function $f$ is still obtained by using the familiar rule
$$a_n=\int_a ^b U_n ^*(ε) f(ε) dε.$$
It's like saying that if you want to get as close as you can to an airplane but are constrained to move around on the (flat) surface of the earth, the best you can do is get directly under the airplane. That is, you still use the $x$ and $y$ coordinates of the airplane.
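
A sketch of why this is the closest point, using the shorthand $F_n=\int_a^b U_n^*(ε)f(ε)dε$ (notation assumed here for illustration). Expanding the square and using orthonormality, the error can be rearranged as
$$M_N=\int_a^b|f(ε)|^2dε-\sum_{n=1}^N|F_n|^2+\sum_{n=1}^N|a_n-F_n|^2$$
Only the last sum depends on the $a_n$, and it is never negative, so $M_N$ is smallest when $a_n=F_n$ for every $n$. The leftover error $\int_a^b|f(ε)|^2dε-\sum_{n=1}^N|F_n|^2$ plays the role of the height of the airplane: it vanishes only when $f$ itself is (essentially) a linear combination of $U_1,\dots,U_N$.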

Last edited: Aug 4, 2014
16. Aug 4, 2014

Fredrik

Staff Emeritus
Isn't that what I said?

Edit: I see that I wrote $U^*$ instead of $U_n^*$ in the formula for $a_n$. I have edited the post I just quoted to correct that mistake there.