Derivation of completeness relation from Jackson's Classical Electrodynamics

In summary, the conversation discusses the derivation of the completeness relation in section 2.8 of Jackson. It starts with the orthonormality condition and the approximation of a function by a finite sum of N orthonormal functions. The best coefficients are obtained by minimizing the mean-square error M_N. The original poster asks how to show this, and the attempted solution uses the chain rule and product rule; the replies sort out how to handle complex conjugates and complex coefficients correctly, and along the way a mistake in the orthonormality condition is corrected.
  • #1
schrodingerscat11

Homework Statement


Greetings! I am reading section 2.8 of Jackson and trying to understand how the completeness relation was derived.

It starts with the orthonormality condition:
[itex]\int_a ^b U_n ^*(ε) U_m(ε) dε =δ_{nm}[/itex]

We can approximate a function by a finite sum of N orthonormal functions:
[itex] f(ε) ⇔ \sum_{n=1}^N a_n U_n (ε) [/itex]

We can get the best coefficients [itex]a_n[/itex] by minimizing the error [itex]M_N[/itex]:
[itex]M_N = \int_a ^b |f(ε) - \sum_{n=1}^N a_n U_n(ε)|^2 dε[/itex]

Jackson says that "it is easy to show that the coefficients are given by
[itex]a_n=\int_a ^b U_n ^*(ε) f(ε) dε [/itex]."

Question: How do I show this?

Homework Equations


Same as above.


The Attempt at a Solution


I guess minimizing [itex]M_N[/itex] means setting it to zero. And then I'm not sure what to do after that. How do I extract the buried coefficients? I am very sorry. :confused:
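As a numerical sanity check of the claimed formula (the basis ##U_n(x)=\sqrt{2}\sin(n\pi x)## on ##(0,1)## and the test function below are illustrative assumptions, not from Jackson), one can verify that the projection coefficients do minimize ##M_N##:

```python
import math

# Illustrative check (basis and f chosen for this sketch, not from
# Jackson): on (a, b) = (0, 1), the functions U_n(x) = sqrt(2) sin(n*pi*x)
# are orthonormal.  Verify that a_n = integral of U_n * f minimizes M_N.

M = 2000                          # midpoint-rule grid
xs = [(k + 0.5) / M for k in range(M)]
dx = 1.0 / M

def U(n, x):
    return math.sqrt(2.0) * math.sin(n * math.pi * x)

def f(x):
    return x * (1.0 - x)

N = 5
# a_n = integral over (0, 1) of U_n(x) f(x) dx  (everything real here)
a = [sum(U(n, x) * f(x) for x in xs) * dx for n in range(1, N + 1)]

def error(coeffs):
    """M_N = integral of (f - sum_n c_n U_n)^2, midpoint rule."""
    return sum((f(x) - sum(c * U(n + 1, x) for n, c in enumerate(coeffs))) ** 2
               for x in xs) * dx

best = error(a)
# Nudging any coefficient away from the projection value raises the error.
for i in range(N):
    for eps in (0.01, -0.01):
        trial = list(a)
        trial[i] += eps
        assert error(trial) > best
```

The check exploits nothing special about this basis; any orthonormal set behaves the same way.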
 
  • #2
physicsjn said:
I guess minimizing ##M_N## means setting it to zero.

No, it means that ##\displaystyle \frac {\partial M_N }{\partial a_n} = 0## for each ##n##.

So you get ##\displaystyle \int_a^b \frac{\partial}{\partial a_n} | f(\epsilon) - a_n U_n(\epsilon)|^2 d\epsilon = 0## for each ##n##.
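For a concrete picture of this condition (again with an assumed orthonormal basis and test function, not from the thread), one can check by central finite differences that ##M_N## is stationary in each ##a_n## at the projection coefficients:

```python
import math

# Check numerically that M_N is stationary in each a_n at the projection
# coefficients.  Basis U_n(x) = sqrt(2) sin(n*pi*x) on (0, 1) and
# f(x) = exp(-x) are illustrative choices.

M = 2000
xs = [(k + 0.5) / M for k in range(M)]
dx = 1.0 / M
U = lambda n, x: math.sqrt(2.0) * math.sin(n * math.pi * x)
f = lambda x: math.exp(-x)

N = 4
a = [sum(U(n, x) * f(x) for x in xs) * dx for n in range(1, N + 1)]

def M_N(coeffs):
    return sum((f(x) - sum(c * U(n + 1, x) for n, c in enumerate(coeffs))) ** 2
               for x in xs) * dx

# Central finite difference of M_N with respect to each coefficient.
h = 1e-5
for i in range(N):
    up, dn = list(a), list(a)
    up[i] += h
    dn[i] -= h
    deriv = (M_N(up) - M_N(dn)) / (2 * h)
    assert abs(deriv) < 1e-6   # dM_N/da_n = 0 at the projection values
```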
 
  • #3
@AlephZero Thank you very much! :biggrin:

So we'll have
(1) [itex] \int_a ^b \frac{\partial}{\partial a_n} \bigg( \sqrt{(f(ε) - a_n U_n (ε))^2} \bigg) ^2 dε= 0[/itex]
since [itex] |x| = \sqrt {x^2} [/itex]

(2) [itex] \int_a ^b \frac{\partial}{\partial a_n}(f(ε) - a_n U_n (ε))^2 dε= 0[/itex]

(3) [itex]2\int_a ^b (f(ε) - a_n U_n (ε)) U_n^*(ε) dε= 0[/itex]

(4) [itex]\int_a ^b f(ε)U_n^*(ε) dε - \int_a ^b a_n U_n (ε) U_n^*(ε) dε= 0[/itex]

(5) [itex]\int_a ^b f(ε)U_n^*(ε) dε = \int_a ^b a_n U_n (ε) U_n^*(ε) dε[/itex]
via orthonormality condition [itex]0 = \int_a ^bU_n (ε) U_n^*(ε) dε[/itex]

(6) [itex]\int_a ^b f(ε)U_n^*(ε) dε = a_n[/itex]

I see. Thanks a lot. But in step 2 going to 3, when I applied chain rule, U* came out instead of U. I did that so that I can arrive at step 6. However, I don't know why U* should come out instead of U. Is there a valid mathematical explanation for this? Thanks again.
 
  • #4
I haven't looked at the relevant section of the book, and I haven't worked this out to see if this leads to anything good, but ##x=\sqrt{x^2}## is only true when x is real and positive. The ##U_n## functions appear to be complex valued, so it looks like you should be using ##|x|^2=x^*x##. Maybe that's where the missing * comes from. I would try starting with
$$0=\frac{\partial M_N}{\partial a_n} = \frac{\partial}{\partial a_n}\int _a^b\bigg|f(\varepsilon)-\sum_m a_m U_m(\varepsilon)\bigg|^2 \mathrm d\varepsilon = \frac{\partial}{\partial a_n}\int _a^b\bigg(f(\varepsilon)-\sum_m a_m U_m(\varepsilon)\bigg)^*\bigg(f(\varepsilon)-\sum_k a_k U_k(\varepsilon)\bigg) \mathrm d\varepsilon=\cdots$$
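Carrying that starting point through to the answer (a sketch: it uses the trick, discussed later in the thread, of treating ##a_n## and ##a_n^*## as independent variables, and abbreviates ##c_n=\int_a^b U_n^*(\varepsilon)f(\varepsilon)\,\mathrm d\varepsilon##):

```latex
\begin{aligned}
M_N &= \int_a^b |f|^2\,\mathrm d\varepsilon
   - \sum_m a_m^* \underbrace{\int_a^b U_m^* f\,\mathrm d\varepsilon}_{c_m}
   - \sum_k a_k \underbrace{\int_a^b U_k f^*\,\mathrm d\varepsilon}_{c_k^*}
   + \sum_{m,k} a_m^* a_k
     \underbrace{\int_a^b U_m^* U_k\,\mathrm d\varepsilon}_{\delta_{mk}} \\
    &= \int_a^b |f|^2\,\mathrm d\varepsilon
   - \sum_m a_m^* c_m - \sum_k a_k c_k^* + \sum_n |a_n|^2, \\
0 &= \frac{\partial M_N}{\partial a_n^*} = -c_n + a_n
  \quad\Longrightarrow\quad
  a_n = \int_a^b U_n^*(\varepsilon)\, f(\varepsilon)\,\mathrm d\varepsilon .
\end{aligned}
```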
 
  • #5
Fredrik said:
The ##U_n## functions appear to be complex valued, so it looks like you should be using ##|x|^2=x^*x##.
Agreed. And of course this also works when ##U## is real-valued and ##U^* = U##.
 
  • #6
Thanks AlephZero and Fredrik for your replies. I'm also not sure if this will lead to anything, but we had a lecture about Chapter 4 and the professor asked us to fill in the missing steps. And the derivation seems to go all the way back to Chapter 2. Anyway, if
[itex]|x|^2=x^*x[/itex]

[itex]0=\frac{\partial}{\partial a_n}\int_a^b \big(f(ε) - \sum\limits_{m}a_mU_m(ε) \big)^* \big(f(ε) - \sum\limits_{k}a_kU_k(ε) \big) dε [/itex]

[itex]0=\frac{\partial}{\partial a_n}\int_a^b \big( |f(ε)|^2 - \sum\limits_{m} a_m U_m ^* f(ε) - \sum\limits_{k} a_k U_k f^*(ε) + \sum\limits_{m}a_mU_m^*(ε) \sum\limits_{k}a_kU_k(ε) \big) dε [/itex]

Differentiating, constants and summation terms with indices m and k not equal to n will be killed. Therefore,
[itex]0=\int_a^b -a_n U_n^* f(ε) - a_n U_n f^*(ε) + \frac{\partial}{\partial a_n} \big( \sum\limits_{m}a_mU_m^*(ε) \sum\limits_{k}a_kU_k(ε) \big) dε [/itex]

Applying product rule to the last term,
[itex]\frac{\partial}{\partial a_n} \big( \sum\limits_{m}a_mU_m^*(ε) \sum\limits_{k}a_kU_k(ε) \big)= \sum\limits_{m}a_mU_m^*(ε) \frac{\partial}{\partial a_n}\big( \sum\limits_{k}a_kU_k(ε) \big) + \frac{\partial}{\partial a_n}\big(\sum\limits_{m}a_mU_m^*(ε) \big) \sum\limits_{k}a_kU_k(ε) [/itex]

Again, the non-n terms are killed during differentiation with an,
[itex]\frac{\partial}{\partial a_n} \big( \sum\limits_{m}a_mU_m^*(ε) \sum\limits_{k}a_kU_k(ε) \big)=\sum\limits_m a_m U_m^* (a_n U_n) + \sum\limits_k (a_k U_k) a_n U_n^* [/itex]

Going back to the equation,
[itex]0=\int_a^b \bigg( -a_n U_n^* f(ε) - a_n U_n f^*(ε) +\sum\limits_m a_m U_m^* (a_n U_n) + \sum\limits_k (a_k U_k) a_n U_n^* \bigg) dε [/itex]

[itex]0=-\int_a^b a_n U_n^* f(ε) dε -\int_a^b a_n U_n f^*(ε) dε + \int_a^b a_n^2 U_n^*U_n dε + \int_a^b a_n^2 U_n U_n^* dε [/itex]

[itex]0=-\int_a^b a_n U_n^* f(ε) dε -\int_a^b a_n U_n f^*(ε) dε + a_n ^2 +a_n ^2 [/itex]

[itex]0=-\int_a^b a_n U_n^* f(ε) dε -\int_a^b a_n U_n f^*(ε) dε + 2a_n ^2[/itex]

Hmm, :frown:. Alright, I will assume that f(ε) is real to make things less complicated.
[itex]0=-\int_a^b a_n U_n^* f(ε) dε -\int_a^b a_n U_n f(ε) dε + 2a_n ^2[/itex]

[itex]2a_n ^2=\int_a^b a_n U_n^* f(ε) dε + \int_a^b a_n U_n f(ε) dε [/itex]

Okay, I'm almost there. There's an extra term and factor 2. Did I do something wrong? :confused:

Thanks again.

PS. In my previous post, step (5), I made a mistake. The orthonormality condition should be
[itex] 1=\int_a^b U_n(ε) U_n^*(ε) dε[/itex] not [itex] 0=\int_a^b U_n(ε) U_n^*(ε) dε[/itex]
 
  • #7
physicsjn said:
[itex]2a_n ^2=\int_a^b a_n U_n^* f(ε) dε + \int_a^b a_n U_n f(ε) dε [/itex]

Okay, I'm almost there. There's an extra term and factor 2. Did I do something wrong? :confused:
You seem to have evaluated ##\frac{\partial}{\partial a_n}a_n## to ##a_n## instead of 1. So you get too many ##a_n## in the end. You may be able to simplify things a bit by using the formula ##z+z^*=2\operatorname{Re} z## when you find that one of your terms is the complex conjugate of another. But I think there's a more serious problem with this approach. We can't assume that ##a_n## is real. Doing so will likely give us the wrong result. This makes terms that include ##a_k^*## a problem. We can't just write
$$\frac{\partial}{\partial a_n}a_k^*= \left(\frac{\partial}{\partial a_n}a_k\right)^* =\delta_{nk}^* =\delta_{nk}$$
because the complex conjugation function isn't (complex) differentiable, and that makes the first step invalid.

The straightforward way to convert this to an optimization problem in (real) calculus is to write ##a_k=b_k+ic_k## with ##b_k## and ##c_k## both real, and then see what we get from ##\frac{\partial M_N}{\partial b_n}=0## and ##\frac{\partial M_N}{\partial c_n}=0##. Another option is to simply treat ##a_n## and ##a_n^*## as independent variables, i.e. to see what we get from ##\frac{\partial M_N}{\partial a_n}=0## and ##\frac{\partial M_N}{\partial a_n^*}=0## when we use ##\frac{\partial}{\partial a_n}a_k^*=0## and ##\frac{\partial}{\partial a_n^*}a_k=0## for all n and k (including when n=k). It's not at all obvious that this makes sense, but it does, and physics books use this trick all the time without any explanation, so I think it should be OK to use it. When we use this trick, the calculation gets much easier.
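This trick can be sanity-checked numerically with genuinely complex coefficients (the basis ##U_n(x)=e^{2\pi inx}## on ##(0,1)## and the test function are illustrative assumptions): the projection coefficients, complex in general, minimize ##M_N##, which is what the two conditions ##\partial M_N/\partial a_n=0## and ##\partial M_N/\partial a_n^*=0## together encode.

```python
import cmath

# Sketch: complex orthonormal basis U_n(x) = exp(2*pi*i*n*x) on (0, 1),
# complex-valued f.  The coefficients a_n = integral of U_n^* f are
# complex, and perturbing their real OR imaginary parts raises M_N.

M = 1024
xs = [(k + 0.5) / M for k in range(M)]
dx = 1.0 / M
ns = [-2, -1, 0, 1, 2]
U = lambda n, x: cmath.exp(2j * cmath.pi * n * x)
f = lambda x: x + 1j * x * x       # illustrative complex test function

a = {n: sum(U(n, x).conjugate() * f(x) for x in xs) * dx for n in ns}

def M_N(coeffs):
    return sum(abs(f(x) - sum(coeffs[m] * U(m, x) for m in ns)) ** 2
               for x in xs) * dx

best = M_N(a)
# Perturbing either the real or the imaginary part of any a_n raises M_N,
# so the complex projection coefficients are the true minimizers.
for n in ns:
    for step in (0.01, -0.01, 0.01j, -0.01j):
        p = dict(a)
        p[n] += step
        assert M_N(p) > best
```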

Another thing that struck me is that there's an easier way to obtain the formula we want. It follows almost immediately from ##f=\sum_n a_n U_n##. It's a one-line calculation. So I don't really see why we're solving an optimization problem here. Maybe it's just to verify that the formula we have to use anyway is the solution to an optimization problem.
 
  • #8
Fredrik said:
But I think there's a more serious problem with this approach. We can't assume that ##a_n## is real.

Another way round that problem is to see that the definition of the functions ##U_n## is arbitrary, in the sense that we can replace ##U_n## by ##e^{i\phi_n}U_n## for any real number ##\phi_n##. (Physically this corresponds to "rotating" the ##U_n## into a preferred orientation, in some sense.)

For a suitable choice of the ##\phi_n##, the coefficients ##a_n## are real numbers, and the calculus problems go away.
 
  • #9
Fredrik said:
Another thing that struck me is that there's an easier way to obtain the formula we want. It follows almost immediately from ##f=\sum_n a_n U_n##. It's a one-line calculation. So I don't really see why we're solving an optimization problem here. Maybe it's just to verify that the formula we have to use anyway is the solution to an optimization problem.

I think the idea is that we're not assuming that the [itex]U_n[/itex] comprise a complete set of functions. Then the smallest error possible will in general be nonzero.
 
  • #10
AlephZero said:
Another way round that problem is to see that the definition of the functions ##U_n## is arbitrary, in the sense that we can replace ##U_n## by ##e^{i\phi_n}U_n## for any real number ##\phi_n##. (Physically this corresponds to "rotating" the ##U_n## into a preferred orientation, in some sense.)

For a suitable choice of the ##\phi_n##, the coefficients ##a_n## are real numbers, and the calculus problems go away.

I disagree because redefining the [itex]U_n[/itex] in this way doesn't suddenly reduce the extremization problem from one involving [itex]2N[/itex] free parameters to one involving [itex]N[/itex] free parameters. One still has to show that the minimum can't be improved by making one of the [itex]a_n[/itex]'s a little bit complex.

Also note that your redefinition of the [itex]U_n[/itex]'s is specific to one particular [itex]f[/itex]. For example, the expansion of [itex]if[/itex] in terms of the same set of functions would involve purely imaginary [itex]a_n[/itex] coefficients.
 
  • #11
I see another issue with the redefinition. If we write
$$f=\sum_n a_n U_n=\sum_n \left(a_n e^{-i\phi_n}\right)\left(e^{i\phi_n}U_n\right) =\sum_n a_n' U_n'$$ then the ##a_n'## will have to be functions, not numbers.
$$f(\varepsilon) =\sum_n \left(a_n e^{-i\phi_n(\varepsilon)}\right) \left(e^{i\phi_n(\varepsilon)} U_n(\varepsilon)\right)$$
 
  • #12
The method I would recommend is to treat ##a_n## and ##a_n^*## as independent variables. The reason why this works is this: (Thanks Avodyne.) Suppose that ##R:\mathbb R^2\to\mathbb R## and ##C:\mathbb C^2\to\mathbb R## are such that ##R(x,y)=C(x+iy,x-iy)## for all ##x,y\in\mathbb R##. I will use the notation ##D_i## for the operator that takes a function defined on ##\mathbb R^2## to its ith partial derivative, and ##\bar D_i## for the operator that takes a function defined on ##\mathbb C^2## to its ith partial derivative.

Then we have
\begin{align}
&D_1R=\bar D_1C+\bar D_2C\\
&D_2R=i(\bar D_1C-\bar D_2C)
\end{align} Here we can see that if ##\bar D_1C=\bar D_2C=0##, then ##D_1 R=D_2R=0##. To see that the converse is also true, note that the above implies that
\begin{align}
&D_1R-iD_2R=2\bar D_1C\\
&D_1R+iD_2R=2\bar D_2C
\end{align}
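A quick concrete check of these relations (an illustrative example, not in the original post): take ##C(z,w)=zw##, so that ##R(x,y)=(x+iy)(x-iy)=x^2+y^2##. Then

```latex
\begin{aligned}
&D_1R = 2x, \qquad D_2R = 2y, \qquad
 \bar D_1C = w = x-iy, \qquad \bar D_2C = z = x+iy,\\
&\bar D_1C + \bar D_2C = 2x = D_1R, \qquad
 i\left(\bar D_1C - \bar D_2C\right) = i(-2iy) = 2y = D_2R,
\end{aligned}
```

in agreement with the two formulas above.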
 
  • #13
Thank you very much Fredrik, AlephZero, and Oxvillian for your kind replies. :shy: Sorry I just replied today. I had no access to internet yesterday.

Fredrik said:
You seem to have evaluated ##\frac{\partial}{\partial a_n}a_n## to ##a_n## instead of 1. So you get too many ##a_n## in the end.
Yeah, I see that I did that mistake. I'll work on that first. Thanks Fredrik! :smile:

Fredrik said:
Another thing that struck me is that there's an easier way to obtain the formula we want. It follows almost immediately from ##f=\sum_n a_n U_n##. It's a one-line calculation. So I don't really see why we're solving an optimization problem here. Maybe it's just to verify that the formula we have to use anyway is the solution to an optimization problem.
Oxvillian said:
I think the idea is that we're not assuming that the [itex]U_n[/itex] comprise a complete set of functions. Then the smallest error possible will in general be nonzero.

Hm. Actually, I'm just following Jackson's approach in section 2.8. @Fredrik In the book, the equation [itex]f=\sum\limits_{n} a_nU_n[/itex] comes after the equation [itex] a_n = \int_a ^b U_n^*(ε) f(ε) dε [/itex] that we are trying to show, so I don't think I can use that.

Fredrik said:
We can't assume that ##a_n## is real. Doing so will likely give us the wrong result. This makes terms that include ##a_k^*## a problem. We can't just write
$$\frac{\partial}{\partial a_n}a_k^*= \left(\frac{\partial}{\partial a_n}a_k\right)^* =\delta_{nk}^* =\delta_{nk}$$
because the complex conjugation function isn't (complex) differentiable, and that makes the first step invalid.

AlephZero said:
For a suitable choice of the ##\phi_n##, the coefficients ##a_n## are real numbers, and the calculus problems go away.

Oxvillian said:
Also note that your redefinition of the [itex]U_n[/itex]'s is specific to one particular [itex]f[/itex]. For example, the expansion of [itex]if[/itex] in terms of the same set of functions would involve purely imaginary [itex]a_n[/itex] coefficients.

I'm kinda confused. Do I really have to assume that [itex]a_n[/itex] and [itex]f(ε)[/itex] are complex numbers and complex-valued functions, respectively?

Thanks again everyone.:biggrin: I think I would need some time to digest the other replies that I did not comment on. (My background in complex numbers is not very solid. :frown: Sorry.)
 
  • #14
physicsjn said:
Hm. Actually, I'm just following Jackson's approach at section 2.8. @Frederik In the book, the equation [itex]f=\sum\limits_{n} a_nU_n[/itex] follows after this equation [itex] a_n = \int_a ^b U_n^*(ε) f(ε) dε [/itex] we are trying to show, so I don't think I can use that.
I finally opened my Jackson to see what he says. The presentation is pretty strange. He starts by saying that "An arbitrary function f(ε), square integrable on the interval (a,b), can be expanded in a series of the orthonormal functions Un(ε)." To say this is to say that there's a sequence ##\langle a_n\rangle_{n=0}^\infty## in ##\mathbb C## such that ##f=\sum_{n=0}^\infty a_n U_n##. This and the orthonormality condition together imply that ##\int_a^b U_n^*(\varepsilon)f(\varepsilon)\mathrm d\varepsilon =a_n##.

Then he says "If the number of terms is finite..." This should mean that he's now considering the case where all but a finite number of the ##a_n## are zero. But he's not. He's ignoring that he has already told us that f can be expanded in a series like that, and is just trying to state the following problem: Suppose that ##\{U_n\}_{n=0}^\infty## is an orthonormal set of complex-valued functions. Suppose that we want to approximate f by a linear combination of the ##U_n##, i.e. ##f\approx\sum_{n=0}^N a_nU_n##. Then what is the best choice of ##a_n##? First we have to define what we mean by "best", so he does that, and then tells us that the definition leads to the formula ##\int_a^b U_n^*(\varepsilon)f(\varepsilon)\mathrm d\varepsilon =a_n##.

Since students at this level are likely to be familiar with vector spaces and bases for vector spaces, I think maybe it would have been better to say that a given orthonormal set ##\{U_n\}## of complex-valued functions may or may not be an orthonormal basis for the vector space of square-integrable complex-valued functions with domain (a,b) (note that f is an element of that vector space even if it's real-valued), and if it is, then we have ##f=\sum_n a_n U_n##, which implies that ##\int_a^b U_n^*(\varepsilon)f(\varepsilon)\mathrm d\varepsilon =a_n##. There are some subtleties in this approach too, and we would probably ignore them, but at least we would be ignoring the same things as in an introductory QM course. :smile:

physicsjn said:
I'm kinda confused. Do I really have to assume that [itex] a_n [/itex]and [itex]f(ε)[/itex] are complex numbers and functions respectively?
If the ##U_n## are complex-valued functions, then the ##a_n## will be complex numbers, even if the specific f that you want to expand in a series happens to be a real-valued function.
 
  • #15
Fredrik - the moral of Jackson's story is that if we have an incomplete set of orthonormal functions, the "expansion of best fit" to a given function [itex]f[/itex] is still obtained by using the familiar rule
[tex]a_n=\int_a ^b U_n ^*(ε) f(ε) dε.[/tex]
It's like saying that if you want to get as close as you can to an airplane but are constrained to move around on the (flat) surface of the earth, the best you can do is get directly under the airplane. That is, you still use the [itex]x[/itex] and [itex]y[/itex] coordinates of the airplane.
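The analogy can be made concrete numerically (the basis and the function ##f## below are illustrative choices, not from the thread): with too few ##U_n## to represent ##f##, the best-fit residual is orthogonal to every retained ##U_n##, i.e. the approximation sits "directly under" ##f##.

```python
import math

# Numerical version of the "airplane" picture (basis and f are
# illustrative choices): with an incomplete orthonormal set, the
# best-fit residual f - sum_n a_n U_n is orthogonal to every U_n.

M = 2000
xs = [(k + 0.5) / M for k in range(M)]
dx = 1.0 / M
U = lambda n, x: math.sqrt(2.0) * math.sin(n * math.pi * x)
f = lambda x: math.cos(3.0 * x)

N = 3  # deliberately too few functions to represent f exactly
a = [sum(U(n, x) * f(x) for x in xs) * dx for n in range(1, N + 1)]

def residual(x):
    return f(x) - sum(a[n - 1] * U(n, x) for n in range(1, N + 1))

# The set is incomplete, so some mean-square error is left over ...
leftover = sum(residual(x) ** 2 for x in xs) * dx
assert leftover > 1e-4

# ... but the residual has no component along any retained U_n:
# we are "directly under the airplane".
for n in range(1, N + 1):
    overlap = sum(residual(x) * U(n, x) for x in xs) * dx
    assert abs(overlap) < 1e-10
```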
 
  • #16
Oxvillian said:
Fredrik - the moral of Jackson's story is that if we have an incomplete set of orthonormal functions, the "expansion of best fit" to a given function [itex]f[/itex] is still obtained by using the familiar rule
[tex]a_n=\int_a ^b U_n ^*(ε) f(ε) dε.[/tex]
Isn't that what I said?

Fredrik said:
He's ignoring that he has already told us that f can be expanded in a series like that, and is just trying to state the following problem: Suppose that ##\{U_n\}_{n=0}^\infty## is an orthonormal set of complex-valued functions. Suppose that we want to approximate f by a linear combination of the ##U_n##, i.e. ##f\approx\sum_{n=0}^N a_nU_n##. Then what is the best choice of ##a_n##? First we have to define what we mean by "best", so he does that, and then tells us that the definition leads to the formula ##\int_a^b U^*(\varepsilon)f(\varepsilon)\mathrm d\varepsilon =a_n##.
Edit: I see that I wrote ##U^*## instead of ##U_n^*## in the formula for ##a_n##. I have edited the post I just quoted to correct that mistake there.
 

1. What is Jackson's Classical Electrodynamics?

Jackson's Classical Electrodynamics is a textbook written by John David Jackson that provides a comprehensive overview of classical electrodynamics, including topics such as electrostatics, magnetostatics, electromagnetic waves, and relativity.

2. What is the completeness relation in Jackson's Classical Electrodynamics?

In section 2.8, the completeness (or closure) relation states that for a complete orthonormal set of functions [itex]U_n(ε)[/itex] on the interval (a,b), [itex]\sum_{n=1}^\infty U_n^*(ε')U_n(ε) = δ(ε-ε')[/itex]. It expresses that the set is rich enough to expand any square-integrable function on the interval.

3. How is the completeness relation derived from Jackson's Classical Electrodynamics?

Substituting the coefficients [itex]a_n = \int_a^b U_n^*(ε')f(ε')dε'[/itex] back into the expansion [itex]f(ε) = \sum_n a_n U_n(ε)[/itex] and interchanging the sum and the integral gives [itex]f(ε) = \int_a^b \big[\sum_n U_n^*(ε')U_n(ε)\big] f(ε')dε'[/itex]. Since this must hold for an arbitrary f, the bracketed sum must act as the Dirac delta function [itex]δ(ε-ε')[/itex].

4. What is the significance of the completeness relation in classical electrodynamics?

Completeness is what justifies expanding potentials and fields in Fourier series, Fourier-Bessel series, spherical harmonics, and other orthogonal function sets, which is the standard technique for solving boundary-value problems in electrostatics and beyond.

5. Are there any limitations to the completeness relation in Jackson's Classical Electrodynamics?

It holds only for a complete orthonormal set. With a finite or otherwise incomplete set, the expansion is only a best mean-square approximation, and the minimum error [itex]M_N[/itex] is in general nonzero. The equality in the expansion is also to be understood in the mean-square sense rather than pointwise.
