Derivation of completeness relation from Jackson's Classical Electrodynamics

schrodingerscat11

Homework Statement


Greetings! I am reading section 2.8 of Jackson and trying to understand how the completeness relation is derived.

It starts with the orthonormality condition:
$$\int_a ^b U_n ^*(ε)\, U_m(ε)\, dε = δ_{nm}$$

We can approximate a function by a finite sum of ##N## orthonormal functions:
$$f(ε) \approx \sum_{n=1}^N a_n U_n (ε)$$

We can get the best coefficients ##a_n## by minimizing the mean square error ##M_N##:
$$M_N = \int_a ^b \Big|f(ε) - \sum_{n=1}^N a_n U_n(ε)\Big|^2 dε$$

Jackson says that "it is easy to show that the coefficients are given by
$$a_n=\int_a ^b U_n ^*(ε) f(ε)\, dε$$."

Question: How do I show this?

Homework Equations


Same as above.


The Attempt at a Solution


I guess minimizing ##M_N## means setting it to zero. And then I'm not sure what to do next. How do I extract the buried coefficients? I am very sorry. :confused:
 
physicsjn said:
I guess minimizing ##M_N## means setting it to zero.

No, it means that ##\displaystyle \frac {\partial M_N }{\partial a_n} = 0## for each ##n##.

So you get ##\displaystyle \int_a^b \frac{\partial}{\partial a_n} | f(\epsilon) - a_n U_n(\epsilon)|^2 d\epsilon = 0## for each ##n##.
 
@AlephZero Thank you very much! :biggrin:

So we'll have
(1) $$\int_a ^b \frac{\partial}{\partial a_n} \bigg( \sqrt{(f(ε) - a_n U_n (ε))^2} \bigg)^2 dε= 0$$
since ##|x| = \sqrt {x^2}##

(2) $$\int_a ^b \frac{\partial}{\partial a_n}(f(ε) - a_n U_n (ε))^2 dε= 0$$

(3) $$-2\int_a ^b (f(ε) - a_n U_n (ε))\, U_n^*(ε)\, dε= 0$$

(4) $$\int_a ^b f(ε)U_n^*(ε)\, dε - \int_a ^b a_n U_n (ε) U_n^*(ε)\, dε= 0$$

(5) $$\int_a ^b f(ε)U_n^*(ε)\, dε = \int_a ^b a_n U_n (ε) U_n^*(ε)\, dε$$
via the orthonormality condition ##0 = \int_a ^b U_n (ε) U_n^*(ε)\, dε##

(6) $$\int_a ^b f(ε)U_n^*(ε)\, dε = a_n$$

I see. Thanks a lot. But in going from step 2 to step 3, when I applied the chain rule, I wrote ##U^*## instead of ##U## so that I could arrive at step 6. However, I don't know why ##U^*## should come out rather than ##U##. Is there a valid mathematical justification for this? Thanks again.
 
I haven't looked at the relevant section of the book, and I haven't worked this out to see if it leads to anything good, but ##x=\sqrt{x^2}## is only true when ##x## is real and nonnegative. The ##U_n## functions appear to be complex-valued, so it looks like you should be using ##|x|^2=x^*x##. Maybe that's where the missing * comes from. I would try starting with
$$0=\frac{\partial M_N}{\partial a_n} = \frac{\partial}{\partial a_n}\int _a^b\bigg|f(\varepsilon)-\sum_m a_m U_m(\varepsilon)\bigg|^2 \mathrm d\varepsilon = \frac{\partial}{\partial a_n}\int _a^b\bigg(f(\varepsilon)-\sum_m a_m U_m(\varepsilon)\bigg)^*\bigg(f(\varepsilon)-\sum_k a_k U_k(\varepsilon)\bigg) \mathrm d\varepsilon=\cdots$$
 
Fredrik said:
The ##U_n## functions appear to be complex valued, so it looks like you should be using ##|x|^2=x^*x##.
Agreed. And of course this also works when ##U## is real-valued and ##U^* = U##.
 
Thanks AlephZero and Fredrik for your replies. I'm also not sure if this will lead to anything, but we had a lecture about Chapter 4 and the professor asked us to fill in the missing steps. And the derivation seems to go all the way back to Chapter 2. Anyway, if ##|x|^2=x^*x##,

$$0=\frac{\partial}{\partial a_n}\int_a^b \big(f(ε) - \sum\limits_{m}a_mU_m(ε) \big)^* \big(f(ε) - \sum\limits_{k}a_kU_k(ε) \big) dε$$

$$0=\frac{\partial}{\partial a_n}\int_a^b \big( |f(ε)|^2 - \sum\limits_{m} a_m U_m^*(ε) f(ε) - \sum\limits_{k} a_k U_k(ε) f^*(ε) + \sum\limits_{m}a_mU_m^*(ε) \sum\limits_{k}a_kU_k(ε) \big) dε$$

Differentiating, the constant term and the summation terms with indices m and k not equal to n are killed. Therefore,
$$0=\int_a^b \bigg( -a_n U_n^*(ε) f(ε) - a_n U_n(ε) f^*(ε) + \frac{\partial}{\partial a_n} \big( \sum\limits_{m}a_mU_m^*(ε) \sum\limits_{k}a_kU_k(ε) \big) \bigg) dε$$

Applying the product rule to the last term,
$$\frac{\partial}{\partial a_n} \big( \sum\limits_{m}a_mU_m^*(ε) \sum\limits_{k}a_kU_k(ε) \big)= \sum\limits_{m}a_mU_m^*(ε) \frac{\partial}{\partial a_n}\big( \sum\limits_{k}a_kU_k(ε) \big) + \frac{\partial}{\partial a_n}\big(\sum\limits_{m}a_mU_m^*(ε) \big) \sum\limits_{k}a_kU_k(ε)$$

Again, the non-n terms are killed during differentiation with respect to ##a_n##,
$$\frac{\partial}{\partial a_n} \big( \sum\limits_{m}a_mU_m^*(ε) \sum\limits_{k}a_kU_k(ε) \big)=\sum\limits_m a_m U_m^*(ε) (a_n U_n(ε)) + \sum\limits_k (a_k U_k(ε)) a_n U_n^*(ε)$$

Going back to the equation,
$$0=\int_a^b \bigg( -a_n U_n^*(ε) f(ε) - a_n U_n(ε) f^*(ε) +\sum\limits_m a_m U_m^*(ε) (a_n U_n(ε)) + \sum\limits_k (a_k U_k(ε)) a_n U_n^*(ε) \bigg) dε$$

$$0=-\int_a^b a_n U_n^*(ε) f(ε) dε -\int_a^b a_n U_n(ε) f^*(ε) dε + \int_a^b a_n^2 U_n^*(ε)U_n(ε) dε + \int_a^b a_n^2 U_n(ε) U_n^*(ε) dε$$

$$0=-\int_a^b a_n U_n^*(ε) f(ε) dε -\int_a^b a_n U_n(ε) f^*(ε) dε + a_n^2 +a_n^2$$

$$0=-\int_a^b a_n U_n^*(ε) f(ε) dε -\int_a^b a_n U_n(ε) f^*(ε) dε + 2a_n^2$$

Hmm, :frown:. Alright, I will assume that f(ε) is real to make things less complicated.
$$0=-\int_a^b a_n U_n^*(ε) f(ε) dε -\int_a^b a_n U_n(ε) f(ε) dε + 2a_n^2$$

$$2a_n^2=\int_a^b a_n U_n^*(ε) f(ε) dε + \int_a^b a_n U_n(ε) f(ε) dε$$

Okay, I'm almost there. There's an extra term and a factor of 2. Did I do something wrong? :confused:

Thanks again.

PS. In my previous post, step (5), I made a mistake. The orthonormality condition should be
##1=\int_a^b U_n(ε) U_n^*(ε) dε##, not ##0=\int_a^b U_n(ε) U_n^*(ε) dε##.
 
physicsjn said:
$$2a_n^2=\int_a^b a_n U_n^*(ε) f(ε) dε + \int_a^b a_n U_n(ε) f(ε) dε$$

Okay, I'm almost there. There's an extra term and a factor of 2. Did I do something wrong? :confused:
You seem to have evaluated ##\frac{\partial}{\partial a_n}a_n## to ##a_n## instead of 1. So you get too many ##a_n## in the end. You may be able to simplify things a bit by using the formula ##z+z^*=2\operatorname{Re} z## when you find that one of your terms is the complex conjugate of another. But I think there's a more serious problem with this approach. We can't assume that ##a_n## is real. Doing so will likely give us the wrong result. This makes terms that include ##a_k^*## a problem. We can't just write
$$\frac{\partial}{\partial a_n}a_k^*= \left(\frac{\partial}{\partial a_n}a_k\right)^* =\delta_{nk}^* =\delta_{nk}$$
because the complex conjugation function isn't (complex) differentiable, and that makes the first step invalid.

The straightforward way to convert this to an optimization problem in (real) calculus is to write ##a_k=b_k+ic_k## with ##b_k## and ##c_k## both real, and then see what we get from ##\frac{\partial M_N}{\partial b_n}=0## and ##\frac{\partial M_N}{\partial c_n}=0##. Another option is to simply treat ##a_n## and ##a_n^*## as independent variables, i.e. to see what we get from ##\frac{\partial M_N}{\partial a_n}=0## and ##\frac{\partial M_N}{\partial a_n^*}=0## when we use ##\frac{\partial}{\partial a_n}a_k^*=0## and ##\frac{\partial}{\partial a_n^*}a_k=0## for all n and k (including when n=k). It's not at all obvious that this makes sense, but it does, and physics books use this trick all the time without any explanation, so I think it should be OK to use it. When we use this trick, the calculation gets much easier.
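For instance, here's a sketch of that easier calculation (writing ##g_n=\int_a^b U_n^*(\varepsilon)f(\varepsilon)\,\mathrm d\varepsilon## as a shorthand of mine): expanding the square and using orthonormality to collapse the double sum,
$$M_N=\int_a^b|f(\varepsilon)|^2\,\mathrm d\varepsilon-\sum_m a_m^* g_m-\sum_m a_m g_m^*+\sum_m a_m^* a_m,$$ so
$$\frac{\partial M_N}{\partial a_n^*}=-g_n+a_n=0 \quad\Rightarrow\quad a_n=g_n=\int_a^b U_n^*(\varepsilon)f(\varepsilon)\,\mathrm d\varepsilon,$$ which is exactly the formula we wanted.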

Another thing that struck me is that there's an easier way to obtain the formula we want. It follows almost immediately from ##f=\sum_n a_n U_n##. It's a one-line calculation. So I don't really see why we're solving an optimization problem here. Maybe it's just to verify that the formula we have to use anyway is the solution to an optimization problem.
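(Explicitly, multiplying ##f=\sum_n a_n U_n## by ##U_m^*## and integrating term by term, assuming the interchange of sum and integral is allowed:
$$\int_a^b U_m^*(\varepsilon)f(\varepsilon)\,\mathrm d\varepsilon =\sum_n a_n\int_a^b U_m^*(\varepsilon)U_n(\varepsilon)\,\mathrm d\varepsilon =\sum_n a_n\delta_{mn}=a_m.$$)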
 
Fredrik said:
But I think there's a more serious problem with this approach. We can't assume that ##a_n## is real.

Another way round that problem is to see that the definition of the functions ##U_n## is arbitrary, in the sense that we can replace ##U_n## by ##e^{i\phi_n}U_n## for any real number ##\phi_n##. (Physically this corresponds to "rotating" the ##U_n## into a preferred orientation, in some sense.)

For a suitable choice of the ##\phi_n##, the coefficients ##a_n## are real numbers, and the calculus problems go away.
 
Fredrik said:
Another thing that struck me is that there's an easier way to obtain the formula we want. It follows almost immediately from ##f=\sum_n a_n U_n##. It's a one-line calculation. So I don't really see why we're solving an optimization problem here. Maybe it's just to verify that the formula we have to use anyway is the solution to an optimization problem.

I think the idea is that we're not assuming that the U_n comprise a complete set of functions. Then the smallest error possible will in general be nonzero.
 
  • #10
AlephZero said:
Another way round that problem is to see that the definition of the functions ##U_n## is arbitrary, in the sense that we can replace ##U_n## by ##e^{i\phi_n}U_n## for any real number ##\phi_n##. (Physically this corresponds to "rotating" the ##U_n## into a preferred orientation, in some sense.)

For a suitable choice of the ##\phi_n##, the coefficients ##a_n## are real numbers, and the calculus problems go away.

I disagree because redefining the U_n in this way doesn't suddenly reduce the extremization problem from one involving 2N free parameters to one involving N free parameters. One still has to show that the minimum can't be improved by making one of the a_n's a little bit complex.

Also note that your redefinition of the U_n's is specific to one particular f. For example, the expansion of ##if## (i times f) in terms of the same set of functions would involve purely imaginary a_n coefficients.
 
  • #11
I see another issue with the redefinition. If we write
$$f=\sum_n a_n U_n=\sum_n \left(a_n e^{-i\phi_n}\right)\left(e^{i\phi_n}U_n\right) =\sum_n a_n' U_n'$$ then the ##a_n'## will have to be functions, not numbers.
$$f(\varepsilon) =\sum_n \left(a_n e^{-i\phi_n(\varepsilon)}\right) \left(e^{i\phi_n(\varepsilon)} U_n(\varepsilon)\right)$$
 
  • #12
The method I would recommend is to treat ##a_n## and ##a_n^*## as independent variables. The reason why this works is this: (Thanks Avodyne.) Suppose that ##R:\mathbb R^2\to\mathbb R## and ##C:\mathbb C^2\to\mathbb R## are such that ##R(x,y)=C(x+iy,x-iy)## for all ##x,y\in\mathbb R##. I will use the notation ##D_i## for the operator that takes a function defined on ##\mathbb R^2## to its ith partial derivative, and ##\bar D_i## for the operator that takes a function defined on ##\mathbb C^2## to its ith partial derivative.

Then we have
\begin{align}
&D_1R=\bar D_1C+\bar D_2C\\
&D_2R=i(\bar D_1C-\bar D_2C)
\end{align} Here we can see that if ##\bar D_1C=\bar D_2C=0##, then ##D_1 R=D_2R=0##. To see that the converse is also true, note that the above implies that
\begin{align}
&D_1R-iD_2R=2\bar D_1C\\
&D_1R+iD_2R=2\bar D_2C
\end{align}
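In the problem at hand, take ##a_n=b_n+ic_n## (holding the other coefficients fixed), let ##R## be ##M_N## as a function of ##(b_n,c_n)##, and let ##C## be ##M_N## as a function of ##(a_n,a_n^*)##. The two displays above then show that ##\bar D_1C=\bar D_2C=0## holds if and only if ##D_1R=D_2R=0##, i.e. setting ##\frac{\partial M_N}{\partial a_n}=\frac{\partial M_N}{\partial a_n^*}=0## really is equivalent to the honest real-calculus conditions ##\frac{\partial M_N}{\partial b_n}=\frac{\partial M_N}{\partial c_n}=0##.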
 
  • #13
Thank you very much Fredrik, AlephZero, and Oxvillian for your kind replies. :shy: Sorry I only replied today; I had no internet access yesterday.

Fredrik said:
You seem to have evaluated ##\frac{\partial}{\partial a_n}a_n## to ##a_n## instead of 1. So you get too many ##a_n## in the end.
Yeah, I see that I made that mistake. I'll work on that first. Thanks Fredrik! :smile:

Fredrik said:
Another thing that struck me is that there's an easier way to obtain the formula we want. It follows almost immediately from ##f=\sum_n a_n U_n##. It's a one-line calculation. So I don't really see why we're solving an optimization problem here. Maybe it's just to verify that the formula we have to use anyway is the solution to an optimization problem.
Oxvillian said:
I think the idea is that we're not assuming that the U_n comprise a complete set of functions. Then the smallest error possible will in general be nonzero.

Hm. Actually, I'm just following Jackson's approach in section 2.8. @Fredrik In the book, the equation ##f=\sum\limits_{n} a_nU_n## comes after the equation ##a_n = \int_a ^b U_n^*(ε) f(ε) dε## that we are trying to show, so I don't think I can use it.

Fredrik said:
We can't assume that ##a_n## is real. Doing so will likely give us the wrong result. This makes terms that include ##a_k^*## a problem. We can't just write
$$\frac{\partial}{\partial a_n}a_k^*= \left(\frac{\partial}{\partial a_n}a_k\right)^* =\delta_{nm}^* =\delta_{nm}$$
because the complex conjugation function isn't (complex) differentiable, and that makes the first step invalid.

AlephZero said:
For a suitable choice of the ##\phi_n##, the coefficients ##a_n## are real numbers, and the calculus problems go away.

Oxvillian said:
Also note that your redefinition of the U_n's is specific to one particular f. For example, the expansion of if in terms of the same set of functions would involve purely imaginary a_n coefficients.

I'm kinda confused. Do I really have to assume that ##a_n## and ##f(ε)## are complex numbers and complex functions, respectively?

Thanks again everyone.:biggrin: I think I would need some time to digest the other replies that I did not comment on. (My background in complex numbers is not very solid. :frown: Sorry.)
 
  • #14
physicsjn said:
Hm. Actually, I'm just following Jackson's approach in section 2.8. @Fredrik In the book, the equation ##f=\sum\limits_{n} a_nU_n## comes after the equation ##a_n = \int_a ^b U_n^*(ε) f(ε) dε## that we are trying to show, so I don't think I can use it.
I finally opened my Jackson to see what he says. The presentation is pretty strange. He starts by saying that "An arbitrary function f(ε), square integrable on the interval (a,b), can be expanded in a series of the orthonormal functions ##U_n(ε)##." To say this is to say that there's a sequence ##\langle a_n\rangle_{n=0}^\infty## in ##\mathbb C## such that ##f=\sum_{n=0}^\infty a_n U_n##. This and the orthonormality condition together imply that ##\int_a^b U_n^*(\varepsilon)f(\varepsilon)\mathrm d\varepsilon =a_n##.

Then he says "If the number of terms is finite..." This should mean that he's now considering the case where all but a finite number of the ##a_n## are zero. But he's not. He's ignoring that he has already told us that f can be expanded in a series like that, and is just trying to state the following problem: Suppose that ##\{U_n\}_{n=0}^\infty## is an orthonormal set of complex-valued functions. Suppose that we want to approximate f by a linear combination of the ##U_n##, i.e. ##f\approx\sum_{n=0}^N a_nU_n##. Then what is the best choice of ##a_n##? First we have to define what we mean by "best", so he does that, and then tells us that the definition leads to the formula ##\int_a^b U_n^*(\varepsilon)f(\varepsilon)\mathrm d\varepsilon =a_n##.

Since students at this level are likely to be familiar with vector spaces and bases for vector spaces, I think maybe it would have been better to say that a given orthonormal set ##\{U_n\}## of complex-valued functions may or may not be an orthonormal basis for the vector space of square-integrable complex-valued functions with domain (a,b) (note that f is an element of that vector space even if it's real-valued), and if it is, then we have ##f=\sum_n a_n U_n##, which implies that ##\int_a^b U_n^*(\varepsilon)f(\varepsilon)\mathrm d\varepsilon =a_n##. There are some subtleties in this approach too, and we would probably ignore them, but at least we would be ignoring the same things as in an introductory QM course. :smile:
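(Incidentally, this is also where the completeness relation of the thread title comes from: if the set is complete, substituting the coefficient formula back into ##f=\sum_n a_n U_n## gives, assuming we may interchange the sum and the integral,
$$f(\varepsilon)=\sum_n\bigg(\int_a^b U_n^*(\varepsilon')f(\varepsilon')\,\mathrm d\varepsilon'\bigg)U_n(\varepsilon) =\int_a^b\bigg[\sum_n U_n^*(\varepsilon')U_n(\varepsilon)\bigg]f(\varepsilon')\,\mathrm d\varepsilon',$$ and this holds for all square-integrable ##f## precisely when ##\sum_n U_n^*(\varepsilon')U_n(\varepsilon)=\delta(\varepsilon-\varepsilon')##, which is Jackson's completeness relation.)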

physicsjn said:
I'm kinda confused. Do I really have to assume that a_nand f(ε) are complex numbers and functions respectively?
If the ##U_n## are complex-valued functions, then the ##a_n## will be complex numbers, even if the specific f that you want to expand in a series happens to be a real-valued function.
 
  • #15
Fredrik - the moral of Jackson's story is that if we have an incomplete set of orthonormal functions, the "expansion of best fit" to a given function f is still obtained by using the familiar rule
$$a_n=\int_a ^b U_n ^*(ε) f(ε)\, dε.$$
It's like saying that if you want to get as close as you can to an airplane but are constrained to move around on the (flat) surface of the earth, the best you can do is get directly under the airplane. That is, you still use the x and y coordinates of the airplane.
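A quick numerical illustration of this (a sketch of mine, not from Jackson: an incomplete orthonormal sine basis on ##(0,\pi)## with ##f(\varepsilon)=\varepsilon##; the formula's coefficients give the smallest error, which stays nonzero because the set is incomplete):

```python
import numpy as np

# Incomplete orthonormal set on (0, pi): U_n(x) = sqrt(2/pi) * sin(n*x), n = 1..3
N = 3
x = np.linspace(0.0, np.pi, 20001)
dx = x[1] - x[0]
f = x                                         # the function to approximate
U = [np.sqrt(2 / np.pi) * np.sin(n * x) for n in range(1, N + 1)]

def integral(y):
    """Trapezoidal rule on the fixed grid x."""
    return dx * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])

def M(coeffs):
    """Mean square error M_N of the truncated expansion."""
    approx = sum(c * u for c, u in zip(coeffs, U))
    return integral(np.abs(f - approx) ** 2)

# Coefficients from a_n = \int U_n^* f  (the U_n are real here, so no conjugate needed)
a_best = np.array([integral(u * f) for u in U])

print(M(a_best))                              # minimal, but nonzero: the set is incomplete
print(M(a_best + np.array([0.1, 0.0, 0.0])))  # larger
print(M(a_best + np.array([0, 0, 0.05j])))    # larger even for a complex perturbation
```

Perturbing any ##a_n## away from the integral formula, in either the real or the imaginary direction, only increases ##M_N##, which is the "directly under the airplane" point.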
 
  • #16
Oxvillian said:
Fredrik - the moral of Jackson's story is that if we have an incomplete set of orthonormal functions, the "expansion of best fit" to a given function f is still obtained by using the familiar rule
$$a_n=\int_a ^b U_n ^*(ε) f(ε)\, dε.$$
Isn't that what I said?

Fredrik said:
He's ignoring that he has already told us that f can be expanded in a series like that, and is just trying to state the following problem: Suppose that ##\{U_n\}_{n=0}^\infty## is an orthonormal set of complex-valued functions. Suppose that we want to approximate f by a linear combination of the ##U_n##, i.e. ##f\approx\sum_{n=0}^N a_nU_n##. Then what is the best choice of ##a_n##? First we have to define what we mean by "best", so he does that, and then tells us that the definition leads to the formula ##\int_a^b U^*(\varepsilon)f(\varepsilon)\mathrm d\varepsilon =a_n##.
Edit: I see that I wrote ##U^*## instead of ##U_n^*## in the formula for ##a_n##. I have edited the post I just quoted to correct that mistake there.
 