Explaining Distribution Confusion: Delta Functions & Eigenstates of Momentum

In summary, the conversation discusses delta functions and how they can be manipulated rigorously: how sequences of test functions converge to a delta function and how normalization enters, what the "inner product" of two delta functions means, the integral representation [itex]\int e^{iyx} \, dy = 2\pi\delta(x)[/itex], and the formal meaning of the notation used to denote distributions.
  • #1
Hurkyl
I'm happy enough with the basic idea of a distribution -- e.g. delta functions, and eigenstates of the momentum operator, and stuff like that.

But I've seen people use some manipulations that I cannot figure out how to rigorously explain -- and was wondering if anyone out there is able to do so.


The first concerns the delta function. In particular, the "inner product" of two delta functions. For example, in bra-ket notation, one might see [itex]\langle x' | x \rangle[/itex] where x and x' are eigenstates of the position operator. I suppose this is "supposed" to be equal to [itex]\delta(x - x')[/itex], but I don't know how to arrive at that.

I suppose since the test functions are dense in the space of distributions, we should be able to think of these two deltas as limits of sequences of test functions. But this doesn't seem to be well defined... I can see how [itex]\langle x' | x \rangle = 0[/itex] when [itex]x' \neq x[/itex], but as for [itex]\langle x | x \rangle[/itex]...

I'm going to cheat and use discontinuous "test functions" for simplicity -- I can't imagine it would be any different with a real test function. We can represent a delta function centered at x as the limit of the sequence of functions:

[tex]
\delta_{n,x} (y) := \begin{cases}
n & y \in (x -\frac{1}{2n}, x + \frac{1}{2n}) \\
0 & y \notin (x -\frac{1}{2n}, x + \frac{1}{2n})
\end{cases}
[/tex]

and if we take the inner product...

[tex]
\langle \delta_{n,x} | \delta_{n,y} \rangle = \begin{cases}
0 & y \leq x - \frac{1}{n} \\
n - n^2 (x - y) & x - \frac{1}{n} \leq y \leq x \\
n - n^2 (y - x) & x \leq y \leq x + \frac{1}{n} \\
0 & x + \frac{1}{n} \leq y
\end{cases}
[/tex]

As a function of y, this is a triangular spike of height n and unit area, so it converges to [itex]\delta(x - y)[/itex] in the distributional sense -- even though at y = x it diverges like n, which at least squares with [itex]\langle x | x \rangle = \delta(0)[/itex].

And, of course, there's no reason that the indices should vary together -- decoupling them, [itex]\langle \delta_{m,x} | \delta_{n,y} \rangle[/itex] is a trapezoidal spike that still has unit area and still concentrates at y = x, so it too converges to [itex]\delta(x - y)[/itex]. But this is only a heuristic: it computes a limit of inner products of approximations, and says nothing about what the "inner product" of the limiting distributions is supposed to mean!
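Here is a quick numerical sanity check of both claims -- my own sketch in Python, not part of the original discussion; all names are made up:

[code]
# Check the triangle formula <delta_{n,x} | delta_{n,y}> = max(0, n - n^2 |x-y|)
# and its weak convergence to delta(x - y).  A sketch only.
import numpy as np

def delta_n(n, c, t):
    """Box approximation: height n on (c - 1/(2n), c + 1/(2n))."""
    return np.where(np.abs(t - c) < 1.0 / (2 * n), float(n), 0.0)

ts = np.linspace(-2.0, 2.0, 400001)              # integration grid, spacing 1e-5
n, x = 10, 0.3
for y in (0.3, 0.33, 0.42):
    numeric = np.trapz(delta_n(n, x, ts) * delta_n(n, y, ts), ts)
    exact = max(0.0, n - n**2 * abs(x - y))      # the triangle formula
    print(y, numeric, exact)

# Smearing the triangle against a test function f(y) reproduces f(x)
# better and better as n grows -- no extra 1/n factor needed.
f = lambda y: np.exp(-(y - 1.0)**2)
for n in (10, 100, 1000):
    tri = np.maximum(0.0, n - n**2 * np.abs(x - ts))
    print(n, np.trapz(tri * f(ts), ts), f(x))
[/code]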


Another conundrum is that I've seen it written:

[tex]
\int_{-\infty}^{+\infty} e^{iyx} \, dy = \delta(x)
[/tex]

which is another thing I don't understand. I may be missing a normalization constant, but that's not the point... I have no idea how to give this any rigorous sense. :frown:
 
  • #2
Hurkyl said:
I'm going to cheat and use discontinuous "test functions" for simplicity -- I can't imagine it would be any different with a real test function. We can represent a delta function centered at x as the limit of the sequence of functions:

[tex]
\delta_{n,x} (y) := \begin{cases}
n & y \in (x -\frac{1}{2n}, x + \frac{1}{2n}) \\
0 & y \notin (x -\frac{1}{2n}, x + \frac{1}{2n})
\end{cases}
[/tex]

One point I see is that the test function should be normalized according to the quadratic Hilbert norm, while this function isn't... but this is begging the question in the first place because you need to solve your problem in order to even be able to find its normalization.
Now, you're the mathematician, not I...
 
  • #3
Now, you're the mathematician, not I...
Yes, and I would say taking the inner product of delta functions (or of any two distributions in general) is nonsensical! Yet people seem to do it anyways to good effect. :frown:

One point I see is that the test function should be normalized according to the quadratic Hilbert norm, while this function isn't...
That can't be right -- the test functions are supposed to form a vector space of their own! (So that you can take its dual space to get the space of distributions)

And to converge to a delta function, you have to have integral one... and I don't think you can have both integral one and squared integral one while being positive and concentrated near a single point!
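(To spell out the obstruction: if f is nonnegative with integral one and supported on an interval of width w, then Cauchy-Schwarz gives

[tex]
1 = \left( \int f \right)^2 \leq w \int f^2
[/tex]

so the squared integral is at least 1/w, and must blow up as the bump narrows.)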
 
  • #4
Sorry, I have a problem with inserting TeX, so please see the attached file.
 

Attachments

  • delta.doc
    24.5 KB
  • #5
Transcribing the .doc: (quote it to see how it's done)
akhmeteli said:
I’ll use different letters from those used in your post.
Actually, the eigenfunction of the operator of coordinate may be written as follows:

[tex]| y \rangle = \delta(y - x)[/tex]

and the inner product of two such eigenfunctions is

[tex]
\langle z | y \rangle
= \int_{-\infty}^{+\infty} \langle z | x \rangle \langle x | y \rangle \, dx
= \int_{-\infty}^{+\infty} \delta(z-x) \delta(y-x) \, dx
= \delta(z-y)
[/tex]

Thus, we obtain the required formula.
Indeed, eigenfunctions of coordinate are not normalizable, so one says that they are “normalized to a delta function”, which means exactly what this formula says.
Now, let us “prove” the formula

[tex]
\int_{-\infty}^{+\infty} e^{i y x} \, dy = 2 \pi \delta(x)
[/tex]

To this end, one must prove that

[tex]\int_{-\infty}^{+\infty} f(x) \, dx \int_{-\infty}^{+\infty} e^{i y x} \, dy = 2 \pi f(0)[/tex]

In fact,

[tex]\int_{-\infty}^{+\infty} f(x) \, dx \int_{-\infty}^{+\infty} e^{i y x} \, dy
= \lim_{a \rightarrow \infty} \int_{-\infty}^{+\infty} f(x) \, dx \int_{-a}^{+a} e^{i y x} \, dy
= \lim_{a \rightarrow \infty} \int_{-\infty}^{+\infty} f(x) \, dx \frac{1}{ix} (e^{iax} - e^{-iax}) =
[/tex]
[tex]
= \lim_{a \rightarrow \infty} \int_{-\infty}^{+\infty} f(x) \frac{2 \sin ax}{x} \, dx
= 2 f(0) \lim_{a \rightarrow \infty} \int_{-\infty}^{+\infty} \frac{\sin ax}{x} \, dx
= 2 \pi f(0)
[/tex]

(pulling f(0) out of the integral, since for large [itex]a[/itex] the kernel [itex]\sin(ax)/x[/itex] concentrates its weight near x = 0), where the well-known integral

[tex]
\int_{-\infty}^{+\infty} \frac{\sin ax}{x} \, dx = \pi
[/tex]

(for [itex]a > 0[/itex]) may be obtained, for example, using contour integration and Jordan's lemma.
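A numerical check of this limit (my own sketch, not from the attachment; for the Gaussian below the exact value is [itex]2\pi \, \mathrm{erf}(a/2)[/itex], which approaches [itex]2\pi[/itex] very quickly):

[code]
# Check that  int f(x) * 2 sin(ax)/x dx  ->  2*pi*f(0)  as a grows.
import numpy as np

f = lambda x: np.exp(-x**2)                      # test function, f(0) = 1
xs = np.linspace(-40.0, 40.0, 2_000_001)
for a in (2.0, 5.0, 10.0):
    # np.sinc(z) = sin(pi z)/(pi z), so this kernel equals 2 sin(ax)/x
    kernel = 2.0 * a * np.sinc(a * xs / np.pi)
    print(a, np.trapz(f(xs) * kernel, xs), 2.0 * np.pi)
[/code]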
 
  • #6
[tex]

\langle z | y \rangle

= \int_{-\infty}^{+\infty} \langle z | x \rangle \langle x | y \rangle \, dx

= \int_{-\infty}^{+\infty} \delta(z-x) \delta(y-x) \, dx

= \delta(z-y)

[/tex]
This manipulation is actually one of the ones that bothers me.

For a number z, [itex]\delta(z-x)[/itex] is a distribution -- its job in life is to eat a test function f(x) and spit out the number f(z). We, of course, notate the action by [itex]\int_{-\infty}^{+\infty} \delta(z-x) f(x) \, dx = f(z)[/itex].

But this, of course, isn't as ordinary as it appears. I know of only two ways of giving it formal meaning:

(1) The notation

[tex]\int_{-\infty}^{+\infty} \delta(z-x) \_\_\_ \, dx[/tex]

is simply the function symbol that we use to denote applying the distribution [itex]\delta(z-x)[/itex].

(2) We use the distribution [itex]\delta(z-x)[/itex] to form a measure. In such a case, [itex]\delta(z - x) \, dx[/itex] becomes a single formal symbol used to denote this measure in an ordinary integral.


Either way, the expression

[tex]\int_{-\infty}^{+\infty} \delta(z-x) \delta(y-x) \, dx[/tex]

doesn't make sense -- in case (1), we are trying to apply it to something that isn't a test function, and in case (2), this expression is grammatically incorrect!


But I think this suggests what this really means...

Suppose we take [itex]\int_{-\infty}^{+\infty} \delta(z-x) \_\_\_ \, dx[/itex] to be an operator that takes a test function of two arguments (x and z) and returns a test function of one argument (z).

Then, we take [itex]\int_{-\infty}^{+\infty} \delta(y-x) \_\_\_ \, dy[/itex] to be an operator that takes a test function of three arguments (x, y, and z) and returns a test function of two arguments (x and z).

Then, this expression makes sense:

[tex]
\int_{-\infty}^{+\infty} \delta(z - x)
\left(
\int_{-\infty}^{+\infty} \delta(y - x) f(x, y, z) \, dy
\right)
\, dx
= f(z, z, z)
[/tex]

(Sorry -- but it mentally hurts me to put that dx anywhere else in the equation)

So, with a test function that is a function only of y and z, and some abuse of notation, we can write

[tex]
\int_{-\infty}^{+\infty} f(y, z)
\int_{-\infty}^{+\infty} \delta(y - x) \delta(z - x) \, dx
\, dy
= f(z, z)
[/tex]

which gives us [tex]
\int_{-\infty}^{+\infty} \delta(y - x) \delta(z - x) \, dx = \delta(z - y)
[/tex] as desired.

But this required some messy abuse of notation -- I'll have to think a bit to see if I can get rid of it.



I'll have to think about the second part -- is it necessary to use the Cauchy Principal Value, though?


(P.S. what's the LaTeX for a long underline?)
 
  • #7
As for the first part, my understanding is as follows. It is true that in general you cannot define a decent product of two distributions, but you can typically regard a convolution of two distributions as a distribution (and this is just the case that we are discussing). Unfortunately, my source is in Russian (V.S. Vladimirov, Equations of mathematical physics, Moscow, Nauka, 1976). I don't know if there is an English translation, but I guess this subject should be covered in any book on distributions.
As for the second part, I don't know if the equality is valid only if the integral is regarded as Cauchy's principal value.
 
  • #8
I played with it some more, and it seems that it's actually fairly trivial.

[itex]\delta(y - x)[/itex] is a linear transformation that maps test functions of a variable y to test functions of a variable x.

[itex]\delta(z - x)[/itex] is a linear transformation that maps test functions of a variable x to test functions of a variable z.

Their composition, then, is the linear transformation [itex]\delta(z - y)[/itex] that maps test functions of a variable y to test functions of a variable z.

Or, in general, if I have distributions [itex]\omega(y, x)[/itex] and [itex]\zeta(x, z)[/itex] that act as linear operators on the space [itex]\Phi[/itex] of test functions, then we can set [itex]\int_{-\infty}^{+\infty} \omega(y, x) \zeta(x, z) \, dx[/itex] to be their composition!

Which, I suppose, is exactly what you are saying about convolving distributions to produce new distributions.

As a nice bonus, if we take two distributions that are simply duals of test functions, composition is nothing more than the dual of their inner product, thus justifying the ambiguity of notation.

Excellent, this makes me much happier.
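A toy discretization of this picture (my own sketch; Gaussian smoothing kernels stand in for generic distributions, and the integral over x literally becomes a matrix product):

[code]
# Sampled kernels compose like matrices:  int K1(y,x) K2(x,z) dx  ~  K1 @ K2 * dx.
import numpy as np

N = 400
xs = np.linspace(-5.0, 5.0, N)
dx = xs[1] - xs[0]

gauss = lambda u, s: np.exp(-u**2 / (2 * s**2)) / (np.sqrt(2 * np.pi) * s)
K1 = gauss(xs[:, None] - xs[None, :], 0.3)       # K1(y, x)
K2 = gauss(xs[:, None] - xs[None, :], 0.4)       # K2(x, z)

composed = K1 @ K2 * dx                          # the "integral over x"

# Applying the composed kernel agrees with applying the two in sequence:
phi = np.sin(xs) * np.exp(-xs**2 / 4.0)
lhs = composed @ phi * dx
rhs = K1 @ (K2 @ phi * dx) * dx
print(np.max(np.abs(lhs - rhs)))                 # ~1e-15, pure round-off
[/code]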
 
  • #9
The CPV doesn't seem necessary. E.g., the expression:

[tex]
\frac{1}{2\pi} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} e^{is(x-y)} f(y) \, dy \, ds
[/tex]

reduces to, if f is sufficiently well-behaved:

[tex]
\frac{1}{\pi} \lim_{\substack{a \rightarrow +\infty \\ b \rightarrow +\infty}}
\int_{-\infty}^{+\infty} f(x - \frac{2v}{a+b})
\frac{\sin v}{v}
\exp(i v \frac{b-a}{b+a}) \, dv
[/tex]

and I imagine the product of f and that complex exponential would also be well-behaved, so that it may be wrapped up with the work you presented.
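Numerically, the reduced expression does head to f(x) even with a and b decoupled (my own sketch; truncating the v-range leans on the decay of f):

[code]
# Check  (1/pi) int f(x - 2v/(a+b)) (sin v / v) exp(i v (b-a)/(b+a)) dv  ->  f(x).
import numpy as np

f = lambda x: np.exp(-x**2)
x = 0.5
vs = np.linspace(-20000.0, 20000.0, 2_000_001)
for a, b in ((50.0, 50.0), (50.0, 200.0), (500.0, 2000.0)):
    integrand = (f(x - 2.0 * vs / (a + b))
                 * np.sinc(vs / np.pi)           # = sin(v)/v
                 * np.exp(1j * vs * (b - a) / (b + a)))
    print(a, b, (np.trapz(integrand, vs) / np.pi).real, f(x))
[/code]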


I guess the notation

[tex]\int_{-\infty}^{+\infty} e^{isx} \, ds[/tex]

is supposed to denote the operator that eats a test function f(x) and spits out

[tex]\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} e^{isx} f(x) \, dx \, ds[/tex]

Bleh, this stuff is hard to get used to. :frown:


I guess it's all right though, since this notation seems to behave well with test functions. (Which are now real-valued, since I don't want to worry about getting the conjugates right!)

The notation [itex]\int_{-\infty}^{+\infty} g(x, y) \, dx[/itex] is "supposed" to denote the Lebesgue integral of your test function g. (Let's call the result h(y).) But I could interpret it as an operator just like the above, that, when applied to f(y), gives:

[tex]
\begin{equation*}
\begin{split}
\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(x, y) f(y) \, dy \, dx
&= \int_{-\infty}^{+\infty} f(y) \left( \int_{-\infty}^{+\infty} g(x, y) \, dx \right) \, dy \\
&= \int_{-\infty}^{+\infty} f(y) h(y) \, dy \\
&= \langle f | h \rangle
\end{split}
\end{equation*}
[/tex]

and so again, if we reinterpret the ordinary notation in this new way, we get the same results, thus justifying it.

I still think "bleh!". :smile:
 
  • #10
Hurkyl said:
I guess the notation

[tex]\int_{-\infty}^{+\infty} e^{isx} \, ds[/tex]

is supposed to denote the operator that eats a test function f(x) and spits out

[tex]\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} e^{isx} f(x) \, dx \, ds[/tex]

Bleh, this stuff is hard to get used to. :frown:

Yes.

This is a special case of the Fourier transform of a distribution. Let [itex]\mathcal{F}[/itex] be the operator that effects Fourier transforms. If [itex]F[/itex] is a tempered distribution, then [itex]\mathcal{F}F[/itex], defined by [itex]\left( \mathcal{F}F\right) \left( g\right) =F\left( \mathcal{F}g\right)[/itex] for all test functions [itex]g[/itex], is also a tempered distribution. As a function, the constant function [itex]f\left( x\right) =1/\sqrt{2\pi}[/itex] does not have a Fourier transform, but, as a distribution, it does.

A large class (larger than the set of test functions) of 'reasonable' (locally integrable?) functions can be put into natural bijective correspondence with a subspace of the space of tempered distributions. Let [itex]h[/itex] be a reasonable function, and define a distribution [itex]H[/itex] by

[tex]H(g)=\int h\left( x\right) g\left( x\right) dx[/tex]

for every test function [itex]g[/itex]. Now, some reasonable functions [itex]h[/itex] do not have Fourier transforms that are functions, but every tempered distribution, including [itex]H[/itex], has a Fourier transform. Physicists write

[tex]\frac{1}{\sqrt{2\pi}}\int h\left( x\right) e^{isx}dx[/tex]

even when [itex]h[/itex] doesn't have a classical Fourier transform; this is just symbolism for the distributional Fourier transform defined above. As a distribution, the Fourier transform of [itex]f\left( x\right) =1/\sqrt{2\pi}[/itex] is the Dirac delta distribution.
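To see that last claim concretely, here is a regularization sketch of my own: damp the constant with a Gaussian cutoff [itex]e^{-\epsilon x^2}[/itex]. Its classical transform is then a unit-mass Gaussian in s that sharpens into [itex]\delta(s)[/itex] as [itex]\epsilon \rightarrow 0[/itex]:

[code]
# With the convention  (F h)(s) = (1/sqrt(2 pi)) int h(x) e^{isx} dx,  the
# transform of  h(x) = e^{-eps x^2}/sqrt(2 pi)  is  e^{-s^2/(4 eps)}/(2 sqrt(pi eps)).
import numpy as np

g = lambda s: np.cos(s) * np.exp(-s**2 / 8.0)    # arbitrary test function, g(0) = 1
ss = np.linspace(-30.0, 30.0, 600001)
for eps in (1.0, 1e-2, 1e-4):
    ft = np.exp(-ss**2 / (4.0 * eps)) / (2.0 * np.sqrt(np.pi * eps))
    print(eps, np.trapz(ft * g(ss), ss), g(0.0))  # pairing tends to g(0)
[/code]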

Regards,
George
 
  • #11
Hurkyl -- Everything, or almost everything, you need to know can be found in Wiener's book, The Fourier Integral and Certain of its Applications (Dover), and the classic, Lighthill's Fourier Analysis and Generalised Functions -- and many others as well.

Your issue with position eigenstates (the same issue arises for momentum) has long been solved -- see Kemble's Quantum Mechanics, which elucidates the tortuous logic used in the 1930s to deal with continuous spectra. But, for practical purposes, think Gaussian: a delta function is the appropriate generalized limit of a narrower and narrower Gaussian. For a more modern approach -- by physicists for physicists -- with a very detailed discussion of the wave packets used to help make scattering theory more mathematically sound, see Goldberger and Watson's Collision Theory.

I'd say that most everything physicists do with delta functions can be, and has been, made rigorous. Further, I suspect that a physicist's intuition will ferret out many of the no-nos of delta-function practice. Note that complex variables give a very rigorous representation of a delta function -- Cauchy's famed integral around a pole.

Regards,
Reilly Atkinson
 
  • #12
I'd say that most everything physicists do with delta functions can be, and has been, made rigorous.
It's not that I'm worried if it could be made rigorous... it's that I'm annoyed I don't know all the details! :biggrin: I'm actually a mathematician, not a physicist. (I try to teach myself physics for fun)

The problem I tend to face when reading physics for physicists is that I feel like things aren't explained very thoroughly, and I have no idea what everything is. But when I know what's "really" going on the background mathematically, I can follow along much better.

I also don't mind working the details out myself, once I know where things are going. :smile:


This latest drive of mine actually stemmed from wanting to understand how we can take the divergence of a field and get a delta distribution... this reminded me that I wanted to learn more about distributions anyways, and I felt like I had nearly enough information to plow forward! The two questions in my OP were the ones that had been giving me the most grief.

But now, I feel like I've answered my question -- I ought to be looking at everything as if it's a distribution, instead of just thinking of them being inserted here and there as useful.

For example, when I see the operator d/dx, I realize now that I shouldn't be thinking of it as the operator that takes a function f(x) and produces a new function defined by:

[tex]
\frac{df}{dx}(a) := \lim_{h \rightarrow 0} \frac{f(a+h) - f(a)}{h}
[/tex]

but I should instead be thinking of it as the operator that takes a distribution f(x) and produces a new distribution defined by:

[tex]
\int \frac{df(x)}{dx} \phi(x) \, dx
:= -\int f(x) \frac{d\phi(x)}{dx} \, dx
[/tex]
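For instance (a numerical sketch of my own), f(x) = |x| has no classical derivative at 0, but the definition above hands you sign(x) as its distributional derivative:

[code]
# Distributional derivative:  -int |x| phi'(x) dx  ==  int sign(x) phi(x) dx.
import numpy as np

xs = np.linspace(-10.0, 10.0, 200001)
phi = np.exp(-(xs - 1.0)**2)                  # off-center so the answer isn't 0
dphi = -2.0 * (xs - 1.0) * phi                # phi'(x)
lhs = -np.trapz(np.abs(xs) * dphi, xs)        # the defining right-hand side
rhs = np.trapz(np.sign(xs) * phi, xs)         # pairing with the a.e. derivative
print(lhs, rhs)                               # both ~ sqrt(pi)*erf(1) ~ 1.4936
[/code]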

The higher dimensional case answers my original question -- it gives the rigorous reason that we can get a delta function when we take the divergence of a field. (And as a side benefit, answers another question that bothered me: why the delta function has a derivative!)


I have enough to keep me busy for a while -- there are details I want to work out! :smile:
 
  • #13
Hurkyl said:
It's not that I'm worried if it could be made rigorous... it's that I'm annoyed I don't know all the details! I'm actually a mathematician, not a physicist. (I try to teach myself physics for fun)

Oh my Landsmann! If you can talk them into giving you just a precis on all the manipulations and assumptions they use in fiddling with distributions and deltas, you are a better man than I am. I have been asking that for years, literally, and they just go "Huh? Read the textbooks!" If that took care of my problem would I keep asking?
 
  • #14
Hurkyl -- Check out older books -- pre-World War II, pre-delta function -- on differential equations, ordinary and partial, particularly on Green's functions. The delta function replaced a great deal of clever and tedious algebraic manipulation. Check the Dover catalogue for Bateman and other older authors. Also, see Jackson for the basic formal solutions to the PDEs of E&M, done without delta functions. (A delta function from a divergence happens because of Gauss's theorem.)

Regards,

Reilly Atkinson
 
  • #15
There's also a "Physics for mathematicians" book you can find by digging through the links section of this site. Its quantum mechanics section specifically avoids delta functions, et cetera.

But I'm not trying to avoid delta functions, nor to find ways of doing the same things without them -- I want to be able to use them, and be armed with the knowledge that I'm using them in a mathematically rigorous manner!

Surprisingly, I've found Wikipedia a good launching point for trying to work out the details myself. Specifically, its entries on rigged Hilbert space and distributions.

(A delta function from a divergence happens because of Gauss's theorem.)

I certainly see that as a reason why you would want a delta function to arise -- but it makes me wonder how we can justify it! It would be a horrible abomination of mathematics to pretend that, for example, the traditional definition of divergence yields a delta function.

So I was curious how things were defined to make this happen. For example, I was wondering if the definition

[tex]
\nabla \cdot \vec{f} :=
\lim_{r \rightarrow 0} \frac{3}{4 \pi r^3} \oint_{S_r} \vec{f} \cdot d \vec{A}
[/tex]

where [itex]S_r[/itex] is the sphere of radius r, would yield a delta function if the limit were taken in the space of distributions. And even if it did, this approach is awfully narrow -- I'd have to go and prove another whole set of theorems for the gradient, and for the curl, and then what about some other interesting differential operator? Would I have to handle that specially as well?

But now, I realize that we can define, for the distribution L:

[tex]
(\nabla \cdot L) [ \varphi ] := -L [ \nabla \varphi ]
[/tex]

Or, in the integral notation, for the measure [itex]\vec{\omega}(x) \, dx[/itex]:

[tex]
\iiint (\nabla \cdot \vec{\omega})(x) \varphi(x) \, dx
:= -\iiint \vec{\omega}(x) \cdot (\nabla \varphi)(x) \, dx
[/tex]

If we interpret an ordinary function as being a distribution (when possible), we can apply this definition of divergence and get a delta function.
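As a sanity check of my own: take the Coulomb field [itex]\vec{E} = \hat{r}/r^2[/itex], whose classical divergence vanishes for r > 0. For a radial test function the defining integral collapses to one dimension, because the r² in the volume element exactly cancels the 1/r² of the field, and Gauss's [itex]4\pi\varphi(0)[/itex] drops out:

[code]
# -int E . grad(phi) dV  =  -4 pi int_0^inf phi'(r) dr  =  4 pi phi(0).
import numpy as np

rs = np.linspace(1e-6, 30.0, 300001)
dphi = -2.0 * rs * np.exp(-rs**2)             # phi'(r) for phi(r) = exp(-r^2)
lhs = -4.0 * np.pi * np.trapz(dphi, rs)       # after doing the angular integral
print(lhs, 4.0 * np.pi)                       # both ~ 12.566 = 4 pi phi(0)
[/code]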



selfAdjoint: My current suspicion is that all this stuff with distributions is simply a form of (infinite-dimensional) tensor algebra! But instead of indices you have variables, and all those integrals are really just contractions. (Or, you can use the Riesz representation theorem to turn a distribution into a measure, so that you eventually get an honest-to-goodness integral.)
 
  • #16
Along the same lines, here's a problem in distribution theory that I must solve, and I ain't got a clue.

One knows that [itex] \nabla\times\vec{E} = 0 [/itex] in electrostatics. Suppose one has two media with different permittivities [itex] \epsilon_{1} [/itex] and [itex] \epsilon_{2} [/itex]. Prove that the tangential component of [itex] \vec{E} [/itex] is continuous across the interface.

One normally does that in classical ED using an approximation based on Stokes' theorem.

However, I'm supposed to do it using distributional calculus...


Daniel.
 
  • #17
This is very simple minded, but it seems to me Stokes is still the man. That is, the usual proof requires that the legs perpendicular to the interface give no contribution to the integral. If there were any singularities in the integral due to singularities in the field, then the field would no longer be curl-free. Any distributional contribution must be a null one.
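In symbols, the shrinking-rectangle estimate being described (a sketch, my wording): for a loop of length [itex]\ell[/itex] along the interface and height h straddling it,

[tex]
0 = \oint \vec{E} \cdot d\vec{l} = \left( E_{1t} - E_{2t} \right) \ell + O(h),
[/tex]

so letting [itex]h \rightarrow 0[/itex] forces [itex]E_{1t} = E_{2t}[/itex], provided the field has no singular (distributional) part concentrated on the interface.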
regards,
Reilly
 

1. What is a delta function?

A delta function, also known as the Dirac delta function, is not an ordinary function but a distribution: an infinitely narrow, infinitely tall spike at a single point, with total integral one. It is commonly used in physics and engineering to model point particles or impulse responses in systems.

2. How is a delta function related to momentum distributions?

In quantum mechanics, the momentum distribution of a particle is encoded in its momentum-space wave function, whose squared magnitude gives the probability density for finding the particle at a given momentum. A delta function arises when the particle has a perfectly well-defined momentum, i.e. a single momentum value with certainty; such a state is represented by a delta function in the momentum-space wave function.

3. What are eigenstates of momentum?

Eigenstates of momentum are quantum states that have a definite and well-defined value of momentum. In other words, they are states in which the momentum of a particle is known with 100% certainty. These states are represented by a delta function in the momentum wave function.

4. How do delta functions and eigenstates of momentum relate to each other?

Delta functions and eigenstates of momentum are closely related. A delta function in the momentum-space wave function represents an eigenstate of momentum, and its Fourier transform -- a plane wave in position space -- represents the very same eigenstate.

5. Why are delta functions and eigenstates of momentum important in explaining distribution confusion?

The "distribution confusion" in the title is about how objects such as the delta function can be manipulated rigorously. Delta functions and momentum (or position) eigenstates are the standard examples: they are not ordinary square-integrable functions, so expressions such as the inner product of two eigenstates, or the integral representation of the delta function, only make sense once they are interpreted distributionally -- which is exactly what the thread works out.
