Fun with Deltas and ##L^2##: Physicists vs Mathematicians

Staff Emeritus

@stevendaryl Why would you take your test space ##\mathcal{H}## as the square integrable functions? That doesn't work at all. The Dirac delta function would be ill-defined. The distributions you get by taking ##\mathcal{H} = L^2## are just the elements of ##L^2## again.

Why do you say that? The function that takes a function $f(x)$ and returns $f(0)$ is a well-defined distribution. Maybe what you mean is that if we identify two functions $f \sim g$ if $\int |f(x) - g(x)|^2 dx = 0$, then the delta-function distribution is ill-defined on the equivalence class. But it's perfectly well-defined on square-integrable functions, if we don't take equivalence classes.


Staff Emeritus
Thanks Steven, I did read the wiki definitions before I asked the question. I think where I am having difficulty is at x = 0. By definition, f(x) is infinity at zero. I see that as a discontinuity. There are no limits of integration. Where is the Δx? The limit of a function view implies there is a Δx→ 0. (Or the W parameter in the example). I might be being pig-headed about this.

$\delta(x)$ is not really a function, so $\delta(0)$ is not defined. The only thing that is defined about the delta function is integrals involving it:

$\int_{a}^b \delta(x) f(x) dx = f(0)$ if the interval contains $x=0$, and is zero otherwise.

Statements involving $\delta(x)$ that don't explicitly involve integrals can only be made sense of by understanding them in terms of integration.

Why do you say that? The function that takes a function $f(x)$ and returns $f(0)$ is a well-defined distribution. Maybe what you mean is that if we identify two functions $f \sim g$ if $\int |f(x) - g(x)|^2 dx = 0$, then the delta-function distribution is ill-defined on the equivalence class. But it's perfectly well-defined on square-integrable functions, if we don't take equivalence classes.
It's not continuous, since ##\lVert f - g\rVert = 0##, but ##\left| F(f)-F(g)\right| \neq 0##. So it would be an element of the algebraic dual, but not the topological dual space.

Staff Emeritus
Why do you say that? The function that takes a function $f(x)$ and returns $f(0)$ is a well-defined distribution. Maybe what you mean is that if we identify two functions $f \sim g$ if $\int |f(x) - g(x)|^2 dx = 0$, then the delta-function distribution is ill-defined on the equivalence class. But it's perfectly well-defined on square-integrable functions, if we don't take equivalence classes.

It's a little confusing. When people talk about $L^2$, they do mean functions modulo equivalence. But it seems to me that we can certainly talk about the set of square-integrable functions without taking equivalence relations.

Staff Emeritus
It's not continuous, since ##\lVert f - g\rVert = 0##, but ##\left| F(f)-F(g)\right| \neq 0##. So it would be an element of the algebraic dual, but not the topological dual space.

Viewed as functions (as opposed to equivalence classes of functions), $\int |f(x) - g(x)|^2 dx = 0$ does not imply that $f=g$, so I don't see what's the problem with having $F(f) \neq F(g)$.

It's a little confusing. When people talk about $L^2$, they do mean functions modulo equivalence. But it seems to me that we can certainly talk about the set of square-integrable functions without taking equivalence relations.
The space of square-integrable functions is usually denoted by ##\mathcal L^2## rather than ##L^2##, but it is not a Hilbert space, so it's not suitable for QM.

Viewed as functions (as opposed to equivalence classes of functions), $\int |f(x) - g(x)|^2 dx = 0$ does not imply that $f=g$, so I don't see what's the problem with having $F(f) \neq F(g)$.
Your ##F## is well-defined on ##\mathcal L^2##. It's just not continuous and distributions are usually defined to be continuous. It's okay to use discontinuous distributions, but you must be aware that many things might not work anymore as expected.

Staff Emeritus
Homework Helper
Why do you say that? The function that takes a function $f(x)$ and returns $f(0)$ is a well-defined distribution. Maybe what you mean is that if we identify two functions $f \sim g$ if $\int |f(x) - g(x)|^2 dx = 0$, then the delta-function distribution is ill-defined on the equivalence class. But it's perfectly well-defined on square-integrable functions, if we don't take equivalence classes.

That's the thing isn't it. If ##f = g## a.e., why would we expect the integrals ##\int f(x)\delta(x)dx ## and ##\int g(x)\delta(x)dx## to be different.
Furthermore, what will be the derivative of ##\delta## in your formalism?
And why did you not put any continuity requirements on the definition of your distribution?

I mean: ok, sure, your definition is consistent, but I don't see it as very useful.

Staff Emeritus
That's the thing isn't it. If ##f = g## a.e., why would we expect the integrals ##\int f(x)\delta(x)dx ## and ##\int g(x)\delta(x)dx## to be different.

That's the whole point--$\int f(x) \delta(x) dx$ is not actually an integral. It's just notation. That notation is defined to be $f(0)$.

Furthermore, what will be the derivative of ##\delta## in your formalism?

I don't understand the question. $\delta(x)$ is not a function. It doesn't have a derivative. It only makes sense in the expression $\int \delta(x) f(x) dx$.

A distribution is not in general an integral. Certain distributions can be defined via an integral, but the delta function is one that can't be.

And why did you not put any continuity requirements on the definition of your distribution?

There are no continuity requirements: $F(f) \equiv f(0)$. That's well-defined for any function that has a value at $x=0$

Staff Emeritus
Homework Helper
That's the whole point--$\int f(x) \delta(x) dx$ is not actually an integral. It's just notation. That notation is defined to be $f(0)$.

I don't understand the question. $\delta(x)$ is not a function. It doesn't have a derivative. It only makes sense in the expression $\int \delta(x) f(x) dx$.

A distribution is not in general an integral. Certain distributions can be defined via an integral, but the delta function is one that can't be.

There are no continuity requirements: $F(f) \equiv f(0)$. That's well-defined for any function that has a value at $x=0$

You do realize that no math book adopts the definition you put down in this thread, right?

Staff Emeritus
Your ##F## is well-defined on ##\mathcal L^2##. It's just not continuous and distributions are usually defined to be continuous. It's okay to use discontinuous distributions, but you must be aware that many things might not work anymore as expected.

Well, the context of this discussion is the delta function, and there is no way to make sense of a delta function in terms of continuous distributions.

Staff Emeritus
Homework Helper
Well, the context of this discussion is the delta function, and there is no way to make sense of a delta function in terms of continuous distributions.

Why would you say that? Of course there is...

Staff Emeritus
Homework Helper
The definition in terms of distributions is completely standard:
https://en.wikipedia.org/wiki/Dirac_delta_function#As_a_distribution

You really don't see the difference between what you wrote and what's in the wiki link? For one, the space of test functions in the wiki link is not ##\mathcal{H}##, and second they require continuity.

Staff Emeritus
Why would you say that? Of course there is...

Could you elaborate? Because I don't know what you are talking about.

I'm just giving the same definition as Wikipedia, which is this:

As a distribution, the Dirac delta is a linear functional on the space of test functions and is defined by

$\delta[\varphi] = \varphi(0)$

for every test function φ.

Well, the context of this discussion is the delta function, and there is no way to make sense of a delta function in terms of continuous distributions.
The delta distribution is continuous on the usual test function spaces like ##C^\infty_c## or ##\mathcal S##. You can embed them into ##L^2## by sending each ##f## to its equivalence class. Of course, delta isn't continuous in the Hilbert space norm, but it's continuous in the topologies of the usual test function spaces.
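
One way to see the failure of continuity in the Hilbert space norm, as a sketch (the Gaussian family ##f_n## is an illustrative choice, not taken from the thread): take ##f_n(x) = e^{-n x^2}##, which is smooth and square integrable. Then

$$\lVert f_n \rVert^2 = \int_{-\infty}^{\infty} e^{-2 n x^2}\,dx = \sqrt{\frac{\pi}{2n}} \to 0 \quad (n \to \infty), \qquad \text{while} \qquad \delta(f_n) = f_n(0) = 1 \text{ for every } n.$$

So no bound of the form ##|\delta(f)| \le C \lVert f \rVert## can hold, even when ##f## is restricted to smooth functions.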

(Edit: Since micromass is active, I'm going to let him continue the argument.)

Staff Emeritus
Homework Helper
Could you elaborate? Because I don't know what you are talking about.

I'm just giving the same definition as Wikipedia, which is this:

As a distribution, the Dirac delta is a linear functional on the space of test functions and is defined by

$\delta[\varphi] = \varphi(0)$

for every test function φ.

OK, and what is their space of test functions? Also, read the following paragraph, which deals with continuity.

Staff Emeritus
OK, and what is their space of test functions? Also, read the following paragraph, which deals with continuity.

Look, the functional $F$ defined by $F(f) = f(0)$ is defined on the set of all functions from $R$ to $C$, regardless of whether they are continuous, or square-integrable, or whatever. You might be right that this function is only considered a "distribution" when it is restricted to a suitable collection of functions.

Mentor
There are no limits of integration.

The limits of integration are any range that includes the point at which δ(x) ≠ 0.

Heh,... gotta love these physicist-vs-mathematician encounters... [SCNR]

(To be fair, the "usual topology on the space of test functions" is far from trivial, imho.)

Homework Helper
Heh,... gotta love these physicist-vs-mathematician encounters... [SCNR]

(To be fair, the "usual topology on the space of test functions" is far from trivial, imho.)

As a student, I could get really angry when someone dared to mention the delta function.

As a student, I could get really angry when someone dared to mention the delta function.
Well? It's a generalized function, isn't it? Don't mathematicians like to generalize?

[Edit: @Kevin McHugh: Sorry if this little side-exchange is a distraction from your main topic. Let us know if your original question is still inadequately answered.]

Staff Emeritus
Homework Helper

As a student, I could get really angry when someone dared to mention the delta function.

The delta function is a function though, just not one ##\mathbb{R}\rightarrow \mathbb{R}##

Gold Member
2021 Award
Why do you say that? The function that takes a function $f(x)$ and returns $f(0)$ is a well-defined distribution. Maybe what you mean is that if we identify two functions $f \sim g$ if $\int |f(x) - g(x)|^2 dx = 0$, then the delta-function distribution is ill-defined on the equivalence class. But it's perfectly well-defined on square-integrable functions, if we don't take equivalence classes.
The Dirac ##\delta## distribution cannot be a dual vector in ##L^2##, because these are one-to-one mapped to the square integrable functions themselves, because the dual of the separable Hilbert space is equivalent to the Hilbert space itself. The ##\delta## distribution is defined on a dense subspace (like the space of rapidly decreasing ##C^{\infty}## functions). Its dual is larger than the Hilbert space. In the context of quantum theory it's a functional defined on the domain of the position operator as an essentially self-adjoint operator.

Mentor
Staff Emeritus
The Dirac ##\delta## distribution cannot be a dual vector in ##L^2##, because these are one-to-one mapped to the square integrable functions themselves, because the dual of the separable Hilbert space is equivalent to the Hilbert space itself.

But isn't it the case that in the $L^2$ Hilbert space, two functions $f$ and $g$ are considered equal if $\int |f(x) - g(x)|^2 dx = 0$? So the Hilbert space is actually a set of equivalence classes of functions, rather than a set of functions. So for a distribution $F$ to be well defined on that Hilbert space, it must be that $F(f) = F(g)$ whenever $\int |f(x) - g(x)|^2 dx = 0$.

Staff Emeritus
Homework Helper
But isn't it the case that in the $L^2$ Hilbert space, two functions $f$ and $g$ are considered equal if $\int |f(x) - g(x)|^2 dx = 0$? So the Hilbert space is actually a set of equivalence classes of functions, rather than a set of functions. So for a distribution $F$ to be well defined on that Hilbert space, it must be that $F(f) = F(g)$ whenever $\int |f(x) - g(x)|^2 dx = 0$.

Correct.

Gold Member
2021 Award
Sure, and also this makes it very clear that the ##\delta## distribution cannot be uniquely defined on Hilbert space, because the definition, valid for appropriate test functions that are well-defined everywhere is ##\delta(f)=f(0)##, but ##f(0)## is just a value at a single point, which is not defined at all for ##f \in L^2##, you can change the value at ##x=0## to anything you like without changing the function as member of ##L^2##.
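
Spelled out as a short sketch (the constant ##c## and the function ##g## are introduced here only for illustration): for any square-integrable ##f## and any ##c \neq 0##, define ##g(x) = f(x)## for ##x \neq 0## and ##g(0) = f(0) + c##. Then

$$\int |f(x) - g(x)|^2\,dx = 0, \qquad \text{but} \qquad g(0) - f(0) = c \neq 0,$$

so ##f## and ##g## are the same element of ##L^2##, while the rule "evaluate at ##x=0##" assigns them different numbers.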

bhobba
secur
Well, I can tell you how the Dirac function was defined by Dirac, and by mathematicians in Functional Analysis, 40 years ago.

--------------- DIRAC (from Principles of Quantum Mechanics)

First, it's not a function at all. Dirac called it an "improper function", and defined it as 0 everywhere but x=0, where it's undefined; you can call it infinity if you like. The defining characteristic is: the integral (from -inf to +inf, or generally any limits that include 0) is 1. I'll call it dirac(x).

Its most important property is that when integrated against a function f the integral is f(0).

Dirac made the concept (more or less) rigorous by imagining a function which is defined in a small area (bounded by epsilon) around 0, and whose integral is 1. It must have no "unnecessarily wild variations". Parametrize this function by epsilon to get a family of such functions, let epsilon go to 0. The limit is dirac(x). Previous posters have given examples of such families. To make it (more or less) rigorous we integrate a function f against this parametrized family and take the limit as epsilon goes to 0. However we almost never bother, since the answer will be simply f(0).

Dirac also noted it can be considered the differential coefficient of the Heaviside step function.

By the way the derivative of dirac(x) itself is a dipole or "unit doublet".
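
In the integrand sense used here, that derivative can be worked out by a formal integration by parts. A sketch, writing ##\delta## for dirac(x) and assuming ##f## is differentiable and vanishes at infinity:

$$\int \delta'(x) f(x)\,dx = \Big[\delta(x) f(x)\Big]_{-\infty}^{\infty} - \int \delta(x) f'(x)\,dx = -f'(0).$$

The same manipulation applied to the Heaviside step function ##\theta## gives ##\int \theta'(x) f(x)\,dx = -\int_0^\infty f'(x)\,dx = f(0)##, consistent with ##\theta' = \delta##.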

Dirac emphasized that this improper function only made sense as a kernel. dirac(x) itself is only a shorthand notation meaning: perform the appropriate integration. But, as long as we're careful, we can often treat it like a function:

"In quantum theory whenever an improper function appears, it will be something to be used ultimately in an integrand."
"The use of improper functions does not involve any lack of rigor in the theory, but is merely a convenient notation ..."
"We can often use an improper function as though it were an ordinary continuous function ..."

He gives various identities such as x dirac(x) = 0, always emphasizing that it means: if each expression is used in an integration it will give the same answer.

He introduced dirac(x) to deal with continuous ranges of eigenvalues, like position. In the discrete case dirac(x) is simply the Kronecker delta, about which there's no problem. But we need to generalize to the continuous case. For discrete eigenvectors of course the norm is 1; for different eigenvectors, the inner product is 0. But for continuous the norm can't be made 1. However he wanted it as similar as possible, so using dirac(x) he's able to make the inner product of two different (continuous-range) eigenvectors 0; and the norm of one of them, a finite number "c" which depends on the specific eigenvector. Having introduced it for this purpose, it turns out to be useful in many other contexts as well.

By including dirac(x) as an eigenvector (with infinite length) he's no longer working in L2 space; today we call it a "rigged Hilbert space" (nuclear is one of these); but Dirac gave it no name.

By the way somebody said square-integrable functions is not a Hilbert Space?? Sure it is.

Dirac blew by some potential problems by noting that in reality physicists never deal with an exact value of x, it's always an imprecise number covering a small range. So in practice dirac(x) never actually arises. When you actually compute numbers based on continuous eigenvalues you must deal with a (small) range of them.

He also mentions the interesting point that although other vectors can be expressed as an integral over the continuous range (using expanded Identity) the basis vectors themselves can't, being atomic (nor a finite sum of them). So to express any possible vector he uses an integral over the basis vectors, plus a summation.

He uses dirac(x) to deal with generalized matrices, generalized diagonal matrices, relative probability amplitudes, and so forth. He uses d log x / dx = 1/x - i pi dirac (x) in scattering theory.

That summarizes Dirac's work with dirac(x). His attitude was, that's good enough for physicists.

--------------- MATHEMATICIANS

Unfortunately it's not good enough for mathematicians. General comment, there are uncountably many ways to formalize Dirac's brilliant ideas; but for physicists, as far as I know, Dirac's approach is good enough. If you insist on formality, just pick the simplest formalization and leave it at that. 99% of the elaboration of measure theory has absolutely no relevance to Quantum Mechanics.

Measure:

Dirac is very clear about what the (improper) function is: an atomic, or discrete, measure. On Wikipedia I see they actually have something called the "Dirac measure" which is precisely for this purpose. It's a type of Radon measure. There are many other formal definitions one could use.

A measure, of course, only makes sense as a kernel to integrate a function against (see "quadratic forms").

Distribution:

The result of that integration is called a distribution associated with the generalized function, in our case dirac(x).

Now a big problem arises - abuse of notation. People use the same symbol for the distribution as for the kernel (or measure, or generalized function). Thus I see the statement "dirac(f) = f(0)". After all once you start talking about the distribution you don't need to refer to the real dirac(x) function often. But don't get confused! The distribution is not, in fact, the actual dirac function.

At this point I'm almost done because I see at least two of you are on top of this topic (although making some curious errors?) and everyone can go read Wikipedia. But I don't understand, why do physicists care? Since you're dealing only with physical functions, you don't have to prove existence, smoothness, integrability, etc - the proof is physically right in front of you. Dirac's approach seems perfect as it is. Obviously, I've got a lot to learn about modern QM!

Anyway a distribution, dirac(f), is defined on a space of Test Functions. These are C-infinity with compact support. Basically we make them very well-behaved so as to have no problems integrating against generalized functions.

dirac(f) is a linear functional that lives in the continuous dual space, NOT the regular algebraic dual space.

To define its value we use a family of Test Functions. They're like Dirac's families mentioned above, but only the smoothest are used. Test functions converge using a supremum norm, so that dirac(f) is continuous (on the space of Test Functions of course).

dirac(f) can be defined on any L2 function but not directly. The Test functions are dense so you can converge to any L2 function; in fact you can always find one Test function which is close enough to the wave function (or ket). However you have to chop it; it can't go to infinity, even very quickly.

The reason not to define dirac(f) directly on L2 is, it wouldn't be continuous. One can't help, really, considering dirac(f) to be defined on L2 but be careful applying any distribution theorems - they'll usually depend on continuity, other nice properties.

Note there can be a ket f with dirac(f) NOT f(0). That's because L2 could have a "bad" point there which will be smoothed out when you approximate it with Test functions. Another way to see that (as mentioned by posters) is the L2 equivalence classes lose any set of "bad" points of Lebesgue measure zero (a countable set, for example), since they don't affect the square-integrable norm. I can't imagine why a physicist would care about such non-physical functions, though.

I hope this helps clear up confusion.

Homework Helper
By the way somebody said square-integrable functions is not a Hilbert Space?? Sure it is.
Equivalence classes of square-integrable functions form a Hilbert space.

secur
That's also true, and doesn't disagree with my statement.

Homework Helper
That's also true, and doesn't disagree with my statement.
How do you define the positive definite inner product on the space of square-integrable functions (without taking equivalence classes), making the space a Hilbert space?
If we take the Lebesgue square-integrable functions on ##\mathbb R##, what is the norm of the function ##f: \mathbb R \to \mathbb R## defined by ##\forall x\in \mathbb R \setminus \{0\}: f(x)=0##, ##f(0)=1##?

vanhees71
Staff Emeritus
How do you define the positive definite inner product on the space of square-integrable functions (without taking equivalence classes), making the space a Hilbert space?

It isn't a Hilbert space without taking equivalence classes, because one of the axioms of Hilbert space is that $\langle \psi|\psi \rangle = 0$ implies $|\psi\rangle = 0$.

vanhees71
Homework Helper
It isn't a Hilbert space without taking equivalence classes, because one of the axioms of Hilbert space is that $\langle \psi|\psi \rangle = 0$ implies $|\psi\rangle = 0$.
Exactly.

(At least for the usual spaces of Lebesgue square-integrable functions.)

secur
You're pointing out that the space of square-integrable functions can have a non-trivial kernel.

It depends what functions you allow. In QM, being a physical theory, we assume a trivial kernel - at least, we used to. All functions must be at least piece-wise smooth. Actually you can assume C-infinity if it's convenient, and approximate a non-physical idealization like a square well arbitrarily closely. They can be combined with atomic, or discrete, points. None of that gives a non-trivial kernel. For that you must have individual points in the support which differ between two kets both of which have zero norm.

Consider the dirac function itself. It's 0 everywhere but the origin; there, it can be considered infinite or undefined. So it's not a function at all. Since it has atomic (or whatever term you like to use) weight at zero, its norm is not 0. If you include it as a ket you no longer have a Hilbert Space.

Now suppose you changed dirac(x) by defining the value at 0 to be 1. Then at least it's a function. Its integral is 0, and the kernel would be non-trivial. So this is the simplest possible type of function that would make it necessary to take the quotient.

But such a function isn't allowed in a physical theory. It's meaningless to talk about one specific value in a continuous range (unless it's atomic). Especially with Uncertainty Principle, but even in classical physics that's true. It's tantamount to measuring with infinite precision.

So no physical function space (continuous - it's obvious for discrete) can have a non-trivial kernel. No doubt that fact has a name, which I don't remember, but we more-or-less took it for granted when dealing with applied mathematics.

Of course in pure math we can have non-trivial kernels. Normally you factor it out (use equivalence classes) because you want a unique additive identity.

But this is physics! I'm sure I never met an applied mathematician (scientist) who thought of two functions with one non-atomic point different from the other. Until now. Such a thing could only arise as an artifact. For instance when defining a square well, you could say it's 1 up to L, then zero. And you could include the point L in either set; in other words, have closure on either side. But obviously you can just define it one way or the other - you wouldn't use both ways in the same problem! Furthermore a perfect square well is not physical. In reality it's an (arbitrarily close) approximation to the real physical energy.

Maybe I'm wrong? Please give me an example of a physically meaningful function with zero (square-integrable) norm, which isn't 0.

Of course this is a trivial (pun intended) point, but since I said someone else was wrong I certainly can't complain about being corrected.

[edit - I hadn't noticed this question before]
If we take the Lebesgue square-integrable functions on ##\mathbb R##, what is the norm of the function ##f: \mathbb R \to \mathbb R## defined by ##\forall x\in \mathbb R \setminus \{0\}: f(x)=0##, ##f(0)=1##?

Funny, exactly the example I mentioned! Please tell me the physical situation where this function has meaning.

Mentor
To make it (more or less) rigorous we integrate a function f against this parametrized family and take the limit as epsilon goes to 0.
To emphasize a point: first evaluate the integral, using the parametrized "not-yet-delta", then take the limit. I remember doing some exercises which carried out this procedure explicitly, with actual example functions, when I first learned about the Dirac delta many years ago.
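
A minimal numerical sketch of that procedure, assuming a Gaussian family for the "not-yet-delta" (the names `nascent_delta` and `smeared_value`, the Gaussian shape, and the test function cos are illustrative choices, not taken from the thread). For each fixed epsilon the integral is evaluated first; only afterwards is epsilon made smaller:

```python
import math

def nascent_delta(x, eps):
    """Gaussian of width eps with unit integral (one possible choice of family)."""
    return math.exp(-0.5 * (x / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))

def smeared_value(f, eps, half_width=10.0, n=20001):
    """Trapezoid-rule estimate of the integral of f(x) * nascent_delta(x, eps)
    over [-half_width*eps, +half_width*eps]; the Gaussian tails outside are negligible."""
    a = -half_width * eps
    h = 2.0 * half_width * eps / (n - 1)
    total = 0.0
    for i in range(n):
        x = a + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * f(x) * nascent_delta(x, eps)
    return total * h

# Step 1: integrate at fixed eps.  Step 2: repeat with smaller eps.
for eps in (1.0, 0.1, 0.01):
    print(eps, smeared_value(math.cos, eps))
```

For this particular pair the integral is known in closed form, ##e^{-\epsilon^2/2}##, so the printed values should approach ##\cos(0) = 1## as epsilon shrinks.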
